EP0674272B1 - Queue-based predictive flow control mechanism - Google Patents

Queue-based predictive flow control mechanism

Info

Publication number
EP0674272B1
EP0674272B1 EP94113319A EP94113319A EP0674272B1 EP 0674272 B1 EP0674272 B1 EP 0674272B1 EP 94113319 A EP94113319 A EP 94113319A EP 94113319 A EP94113319 A EP 94113319A EP 0674272 B1 EP0674272 B1 EP 0674272B1
Authority
EP
European Patent Office
Prior art keywords
bus
module
modules
queue
transactions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP94113319A
Other languages
German (de)
English (en)
Other versions
EP0674272A1 (fr)
Inventor
Michael L. Ziegler
Robert J. Brooks
William R. Bryg
Craig R. Frink
Thomas R. Hotchkiss
Robert D. Odineal
James B. Williams
John L. Wood
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HP Inc
Original Assignee
Hewlett Packard Co

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/36 Handling requests for interconnection or transfer for access to common bus or bus system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs

Definitions

  • the present invention relates to computer systems that have a shared bus, and more particularly to controlling transactions issued on a shared bus.
  • Computer systems commonly have a plurality of components, such as processors, memory, and input/output devices, and a shared bus for transferring information among two or more of the components.
  • the components are coupled to the bus in the form of component modules, each of which may contain one or more processors, memory, and/or input/output devices.
  • Information is transmitted on the bus among component modules during bus cycles, each bus cycle being a period of time during which a selected module is permitted to transfer, or drive, a limited quantity of information on the bus.
  • Modules commonly send transactions on the bus to other modules to perform operations such as reading and writing data.
  • One class of computer system has two or more main processor modules for executing software running on the system (or one or more processor modules and one or more coherent input/output modules) and a shared main memory that is used by all of the processors and coherent input/output modules in the system.
  • the main memory is generally coupled to the bus through a main memory controller.
  • One or more of the processors also has a cache memory, which stores recently used data values for quick access by the processor.
  • A cache memory stores both the frequently used data and the addresses where these data items are stored in main memory.
  • When the processor seeks data from an address in memory, it requests that data from the cache memory using the address associated with the data.
  • The cache memory checks to see whether it holds data associated with that address. If so, the cache memory returns the requested data directly to the processor. If the cache memory does not contain the desired information (i.e., a "cache miss" occurs), the cache requests the data from main memory and stalls the processor while it is waiting for the data. Since cache memory is faster than main RAM memory, this strategy results in improved system performance.
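The hit/miss behavior described above can be illustrated with a minimal sketch. The class name `SimpleCache`, the use of a dictionary as a stand-in for main memory, and the `misses` counter are assumptions made for illustration only; they are not part of the patent text.

```python
from typing import Dict


class SimpleCache:
    """Toy address-indexed cache in front of a dict standing in for main memory.

    Hypothetical illustration: real caches hold fixed-size lines and evict entries.
    """

    def __init__(self, main_memory: Dict[int, int]) -> None:
        self.main_memory = main_memory
        self.lines: Dict[int, int] = {}  # address -> cached data value
        self.misses = 0                  # number of cache misses observed

    def read(self, address: int) -> int:
        if address in self.lines:
            # Cache hit: return the data directly to the processor.
            return self.lines[address]
        # Cache miss: fetch from main memory (the processor would stall here).
        self.misses += 1
        value = self.main_memory[address]
        self.lines[address] = value
        return value
```

Reading the same address twice through such a cache touches main memory only on the first access.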
  • Each module having cache memory performs a "coherency check" of its cache memory to determine whether it has data associated with the requested address and reports the results of its coherency check.
  • Each module also generally reports the status of the data stored in its cache memory in relation to the data associated with the same address stored in main memory and other cache memories. For example, a module may report that its data is "private" (i.e., the data value is only usable by this module) or that the data is "shared" (i.e., the data may reside in more than one cache memory at the same time).
  • A module may also report whether its data is "clean" (i.e., the same as the data associated with the same address stored in main memory) or "dirty" (i.e., the data has been changed after it was obtained).
  • A "coherent transaction" is any transaction that requires a check of other caches to see whether data associated with a memory address is stored in the other caches, or to verify that data is current.
  • Most reads and some writes to memory are coherent transactions. Those skilled in the art are familiar with many types of coherent transactions, such as a conventional read private, and non-coherent transactions, such as a conventional write-back.
  • input/output devices often operate at a much slower speed than microprocessors and, thus, modules connecting input/output devices to the bus may be slow to respond.
  • main memory accesses are relatively slow, and it is possible for the processor modules to request data faster than it can be read from the main memory.
  • Cache coherency checks may also be slow because the coherency checking processors in a module may be busy with other operations. Thus, it is often necessary to either slow down initiation of new transactions by modules or to handle the overflow of transactions when too many transactions are initiated in too short a time for them to be adequately processed or for coherency checks to be performed.
  • EP-A-0 497 054 discloses a typical prior art method for dealing with transaction overflow, which uses a "busy-abort" mechanism to handle the situation in which too many transactions of some type are initiated too quickly.
  • When the responding module for a transaction sees a new transaction request that it cannot respond to immediately, it sends back a "busy-abort" signal indicating that the transaction cannot be serviced at that time (e.g., an input/output module is occupied or a processor module having a cache memory cannot perform a coherency check fast enough).
  • The requesting module then aborts its request and tries again at a later time.
  • An alternative approach is to require handshaking between modules after each transaction to confirm whether a transaction can be processed by the responding module. This approach also results in processing delays and unnecessary design complexity.
  • the present invention is a shared bus system having a bus and a plurality of client modules coupled to the bus.
  • Each of the client modules is capable of transmitting transactions on the bus to the other client modules and receiving transactions on the bus from the other client modules.
  • Each module further has a queue for storing information specifying the transactions received by the module for processing by that module.
  • the bus system also has a bus controller that has means for limiting the types of transactions sent on the bus. When a queue in one of the modules has less than a predetermined amount of free space, the bus controller limits transactions that may be sent on the bus so as to prevent transactions requiring space in that queue from being issued.
  • Each client module preferably has a cache memory, means for detecting coherent transactions transmitted on the bus and performing a coherency check of its cache memory for the transaction, and a coherency bus for reporting results of the coherency checks.
  • Each client module preferably has a coherency queue for storing coherent transactions detected on the bus until a coherency check is performed for the coherent transactions.
  • the queues are large enough to accommodate typical transaction issue rates without the need to abort transactions.
  • the shared bus system preferably also has a main memory controller coupled to the bus.
  • the main memory controller is coupled to each of the coherency lines for receiving the results of the coherency checks reported by the client modules.
  • the main memory controller has a client option line for sending client option signals to each of the client modules to inform the client modules of what types of transactions are enabled to be transmitted on the bus during each cycle.
  • The main memory controller tracks the number of coherent transactions stored in each of the coherency queues and sends client option signals that prevent transactions from being transmitted on the bus that would cause one of the coherency queues to overflow.
  • the present invention encompasses a predictive flow control mechanism that prevents transactions from being issued by component modules of a shared bus system when the transactions cannot be handled at that time.
  • the present invention eliminates the need to abort such transactions after they have been issued. This is accomplished by sending signals to each module indicating what types of transactions are allowed on the bus during a given cycle, and disallowing any transactions that cannot be processed.
  • the present invention first distributes the cache coherency checking load in a manner that reduces the amount of communication required between the memory system and the individual modules on the bus.
  • Each processing module on the bus that must participate in coherency testing includes circuitry that monitors the bus (i.e., it "snoops" or "eavesdrops" on the bus) and detects coherent transaction requests that require coherency checking by the module.
  • the central memory processor is relieved of the task of sending cache coherency checking requests to the various modules. This also reduces the number of connections between the central memory processing system and the various modules.
  • each of the modules on the bus that must participate in coherency checking includes a queue for storing cache coherency checking tasks that have not yet been completed. This buffering allows the cache coherency checking system to operate at a higher effective bandwidth. In addition, the queues assure that cache coherency checking transactions are not lost without requiring a handshaking or busy-abort protocol or hardware.
  • the central memory processing system monitors the state of the various queues and provides signals that restrict the types of transactions that are placed on the bus to assure that queue overflows do not occur. This transaction restriction system may also be used to assure that other types of transactions are not lost.
  • the predictive flow control mechanism utilizes three main sets of transaction queues.
  • each module that has a cache memory has a cache coherency queue for holding coherent transactions that have been issued on the bus until a cache coherency check can be performed.
  • each input/output module has an input/output queue for holding input/output transactions until they can be processed.
  • Input/output transactions include any transaction that requires reading data from or writing data to an input/output device.
  • certain memory addresses designate input/output locations, and transactions sent to these addresses are therefore known to be input/output transactions.
  • the main memory controller has a memory queue for holding main memory read and write transactions until coherency checking is completed and they can be processed. All of the queues are designed to handle typical transaction issue rates without overflowing.
  • the main memory controller acts as a central location for receiving and processing information on the status of each of the queues.
  • the main memory controller ensures that the queues do not overflow by sending "client option" signals to the modules indicating what types of transactions may be initiated on the bus. For example, if the input/output queues are full, the main memory controller will send a client option signal indicating that no input/output transactions are allowed. If a coherency queue is full, the main memory controller will disallow further transactions requiring coherency checks.
  • each module having a cache memory monitors the bus for transactions that have been issued and stores coherent transactions in its coherency queue for coherency checks in a first-in first-out order. The results of coherency checks are reported to the main memory controller.
  • the main memory controller also monitors the bus for and keeps track of coherent transactions, and also receives the results of the coherency checks performed by each module.
  • the main memory controller can therefore know how full each module's cache coherency queue is by comparing the number of coherent transactions issued to the number of coherency check responses received from a given module.
  • an input/output module sends a signal to the main memory controller when its input/output queue is critically full.
  • the main memory controller may keep track of its own memory queue in any conventional manner.
  • the flow control mechanism is efficient in terms of hardware because the queues are used to handle typical transaction issue rates in any event, and are not added simply to support the flow control mechanism. Bus bandwidth is also preserved since transactions are only issued once, rather than multiple times, since a transaction is guaranteed to be accepted.
  • The predictive flow control mechanism is described in connection with a computer system 10 shown in FIG. 1. Although the basic operation of computer system 10 is not central to the present invention, it is described in some detail first, before the operation of the predictive flow control mechanism is discussed further.
  • Computer system 10 is a multiprocessor computer having a bus 12 and a plurality of components coupled to bus 12.
  • the components include a main memory controller 14, input/output modules 16 and 18, and processor modules 20, 22, 24 and 26.
  • the components send transactions to one another on bus 12.
  • Main memory controller 14 may be considered the "host" module and the remaining components may be considered "client modules."
  • the main memory controller/host module sends client option signals to each client module specifying the types of transactions, if any, permitted on the bus during a given cycle.
  • the bus owner during a given cycle can only initiate transactions of a type permitted by the client option signal governing that cycle.
  • the bus owner during the next available cycle is also determined by arbitration based on the client option signals, along with arbitration signals from each of the client modules, and a signal sent by the current bus owner indicating whether it needs to return control of the bus.
  • Processor modules 20, 22, 24 and 26 are the main processors for computer system 10, and software for the system executes simultaneously on all processors. Processor modules 20, 22, 24 and 26 control arbitration signal transmission (i.e., ARB) lines 28, 30, 32 and 34, respectively, which couple each module to the remaining processor modules.
  • Input/output modules 16 and 18 serve as interfaces between computer system 10 and input/output devices (not shown). Input/output modules 16 and 18 each contain an input/output adaptor. Input/output modules 16 and 18 control ARB lines 36 and 38, respectively. When an input/output module wants to use bus 12, it sends a predetermined signal to the remaining client modules on its ARB line, which is used for arbitration.
  • Main memory controller 14 is responsible for reading information from the main memory (not shown) and storing information in the main memory in a conventional manner.
  • Main memory controller 14 interfaces with memory either directly or through a conventional bus.
  • main memory controller 14 preferably also serves as the host module for purposes of bus control.
  • Main memory controller 14 controls a CLIENT_OP line 40, which is coupled directly to each client module.
  • Main memory controller 14 sends signals to each client module on CLIENT_OP line 40 to indicate what types of transactions may be placed on bus 12 during the next available bus cycle.
  • Bus 12 is a high performance processor-memory-I/O interconnect bus.
  • Bus 12 is a split transaction bus. For example, after a READ transaction is issued on bus 12, the module that issued the READ relinquishes the bus allowing other modules to use the bus for other transactions. When the requested data is available, the responding module for the READ arbitrates for the bus, and then transmits the data. WRITE transactions are not split, so the master transmits the WRITE data immediately following the address cycle.
  • Bus 12 preferably includes at least three buses that are primarily related to data transmission: an ADDR_DATA bus, a MASTER_ID bus, and a TRANS_ID bus. Bus 12 also includes a LONG_TRANS bus, which is related to arbitration for control of bus 12.
  • The ADDR_DATA bus is used for transmission of address information and data. Cycles where the ADDR_DATA bus carries address-related information are referred to as address cycles, and cycles where the ADDR_DATA bus carries data are referred to as data cycles.
  • Write transactions, for example, generally have a single address cycle followed immediately by one or more data cycles. The bus owner initiates a write transaction indicating the address to which it desires to write data and sends data during the succeeding cycles. Read transactions generally have a single address cycle used by the bus owner to indicate the address sought to be read. This address cycle is followed at some later time by one or more data cycles in which data is sent to the requesting module by the module responding to the request. Idle cycles may also occur in which no address-related information or data is sent.
  • the MASTER_ID and TRANS_ID buses are used together so that return data for a split transaction can be uniquely associated with the original transaction.
  • Each split transaction is identified by a MASTER_ID signal on the MASTER_ID bus and a TRANS_ID signal on the TRANS_ID bus that, respectively, identify the module issuing the transaction and distinguish the transaction from other transactions sent by that module. For example, a split transaction "read" is sent with a unique combination of a MASTER_ID signal and a TRANS_ID signal.
  • the MASTER_ID and TRANS_ID then accompany the return of the requested data, so that the returned data is received by the requesting module and correlated with the appropriate transaction.
  • This mechanism allows transaction returns to come back in an order other than the order in which they were issued, because the transaction order is not critical to identification of transactions. To allow unique identification, only one transaction with a given transaction ID may be outstanding from a module at a given time. The same transaction ID may, however, be used by two or more separate modules simultaneously, since the transaction can be differentiated by the MASTER_
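The tagging scheme for split transactions can be sketched as a small table of outstanding reads keyed by the pair of identifiers. The class `OutstandingReads` and its method names are hypothetical; the sketch only illustrates that returns may arrive out of order and that a transaction ID may be reused by different modules but not twice by the same module while outstanding.

```python
from typing import Dict, Tuple


class OutstandingReads:
    """Tracks split reads by (master_id, trans_id). Hypothetical illustration."""

    def __init__(self) -> None:
        # (master_id, trans_id) -> address requested by that transaction
        self._pending: Dict[Tuple[int, int], int] = {}

    def issue(self, master_id: int, trans_id: int, address: int) -> None:
        key = (master_id, trans_id)
        if key in self._pending:
            # Only one transaction with a given ID may be outstanding per module.
            raise ValueError("transaction ID already outstanding for this module")
        self._pending[key] = address

    def complete(self, master_id: int, trans_id: int) -> int:
        # Returned data carries both IDs, so order of return does not matter.
        return self._pending.pop((master_id, trans_id))
```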
  • LONG_TRANS is used by the current bus owner to retain control of bus 12 until a long transaction is completed. For example, a module may need to write a large amount of data during a series of cycles. When LONG_TRANS is asserted, other transactions cannot be inserted into the middle of the data by higher priority clients or the host, as explained further below.
  • The CLIENT_OP bus supports the signals shown in Table 1.

    Table 1
    Name          Value  Meaning
    SHAR_RTN      000    Host controls bus 12 for shared return during relevant cycle.
    HOST_CONTROL  001    Host controls bus 12 during relevant cycle.
    NONE_ALLOWED  010    No trans allowed during relevant cycle, but clients still control bus 12.
    ONE_CYCLE     011    One cycle trans allowed during relevant cycle.
    RET_ONLY      100    Return or response trans allowed during relevant cycle.
    NO_IO         101    Any except I/O trans allowed during relevant cycle.
    ATOMIC        110    Client who is "atomic owner" can issue any transaction, other clients can issue only responses, during relevant cycle.
    ANY_TRANS     111    Any transaction allowed during relevant cycle.
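The encodings in Table 1 map naturally onto a three-bit enumeration. The following sketch is illustrative only; the names `ClientOp` and `decode_client_op` are assumptions, while the signal names and bit values are taken from Table 1.

```python
from enum import IntEnum


class ClientOp(IntEnum):
    """Three-bit CLIENT_OP encodings, values as listed in Table 1."""
    SHAR_RTN = 0b000
    HOST_CONTROL = 0b001
    NONE_ALLOWED = 0b010
    ONE_CYCLE = 0b011
    RET_ONLY = 0b100
    NO_IO = 0b101
    ATOMIC = 0b110
    ANY_TRANS = 0b111


def decode_client_op(bits: str) -> ClientOp:
    """Decode a bit string such as "101" into its CLIENT_OP name (hypothetical helper)."""
    return ClientOp(int(bits, 2))
```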
  • a CLIENT_OP of ANY_TRANS indicates that any transaction is allowed during the relevant cycle.
  • a CLIENT_OP of HOST_CONTROL indicates that the host seeks control of the bus during the relevant cycle.
  • The ONE_CYCLE client option signal indicates that only one-cycle transactions are allowed.
  • the NONE_ALLOWED client option signal is used to indicate that no transactions are allowed.
  • the RET_ONLY client option signal indicates that only returns (write-backs) of previously held private-dirty cache lines, or responses to previous transactions are allowed. For example, if processor 24 issues a coherent read of a cache line that is private-dirty in processor 20's cache, processor 20 can supply that cache line in a cache-to-cache copy. That cache-to-cache copy transaction can be initiated under the influence of a RET_ONLY client option signal, since the cache-to-cache copy is a response to the coherent read. Similarly, I/O module 16 can return data from an earlier I/O read transaction under the influence of a RET_ONLY client option signal, since the data return is a response to the I/O read transaction.
  • the NO_IO and ATOMIC client option signals relate to input/output modules 16 and 18.
  • input/output modules 16 and 18 preferably control STOP_IO lines 58 and 60, respectively, for sending signals to memory controller 14 indicating that the modules cannot accept any more input/output transactions.
  • Input/output modules 16 and 18 also preferably control STOP_MOST lines 62 and 64, respectively, for sending signals to memory controller 14 and to each other to take effective control of the memory system.
  • When the host receives a STOP_IO signal, it asserts a NO_IO client option signal. If the CLIENT_OP is NO_IO, all transactions except I/O transactions are allowed.
  • the ATOMIC CLIENT_OP is generated in direct response to a client asserting STOP_MOST, assuming flow control would normally allow ANY_TRANS.
  • the ATOMIC CLIENT_OP allows the client asserting STOP_MOST to perform several consecutive transactions on bus 12. All other clients are only allowed to respond to earlier sent transactions, or write back previously held private-dirty cache lines, if they obtain the bus during any cycle in which ATOMIC is asserted.
  • the host may also ordinarily limit all clients to response-type transactions using the RET_ONLY client option signal.
  • the effective client option signal for the atomic owner is ANY_TRANS and the effective client option signal for all other clients is RET_ONLY. It will be appreciated that the ATOMIC client option signal is not necessary to the present invention.
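The relationship between ATOMIC and the other encodings can be expressed as a small mapping from the driven signal to the option each client effectively sees. The function name `effective_client_op` and the boolean parameter are hypothetical, introduced only for this sketch.

```python
def effective_client_op(client_op: str, is_atomic_owner: bool) -> str:
    """Return the option a given client effectively sees (illustrative sketch).

    Under ATOMIC, the atomic owner behaves as if ANY_TRANS were driven and
    every other client behaves as if RET_ONLY were driven. All other
    encodings apply to every client unchanged.
    """
    if client_op != "ATOMIC":
        return client_op
    return "ANY_TRANS" if is_atomic_owner else "RET_ONLY"
```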
  • the SHAR_RTN client option signal is used in some embodiments in relation to coherency schemes for systems where each module has a cache memory.
  • Each client module (both processor and input/output) has a cache memory and controls at least one coherent transaction signal transmission line (i.e., a COH line) for sending signals directly to memory controller 14 that allow memory controller 14 to coordinate coherent transactions involving reads or writes of data that may be stored in one or more cache memories, so that the most current data is used by the processors.
  • Processor modules 20, 22, 24 and 26 control COH lines 42, 44, 46 and 48, respectively.
  • Input/output module 16 controls COH lines 50 and 52.
  • Input/output module 18 controls COH lines 54 and 56.
  • the SHAR_RTN signal indicates that the main memory controller will be returning data having a shared status.
  • Main memory controller 14 monitors the full/empty status of each of the queues and issues client option signals that prevent the queues from overflowing.
  • the three types of queues used in computer system 10 are described below, and then the means by which main memory controller 14 keeps track of the full/empty status of the queues. Finally, use of this information to generate appropriate client option signals will be explained.
  • each input/output (“I/O”) module has an input/output queue, which holds transactions directed from bus 12 to the input/output module for transmission to an I/O device or an I/O bus.
  • Processor reads and writes directed to I/O devices will wait in the I/O queue until the transaction can be processed on the I/O bus and/or I/O device.
  • Such queues are commonly necessary to handle the rate at which transactions can be transmitted on bus 12.
  • Bus 12 will have a frequency of 60-120 MHz, while an I/O bus will have a frequency of less than 20 MHz.
  • Transactions can therefore be delivered to I/O modules much faster than they can be processed by the I/O bus or I/O device.
  • main memory controller 14 has one or more memory queues for holding main memory read and write transactions. These memory-related transactions are stored in a memory queue until the read or write is performed in memory. Preferably, separate queues are used for reads and writes. A coherent read or write cannot be performed until coherency checking is completed.
  • each module that has a cache memory has a cache coherency queue for storing coherent transactions in a first-in first-out (“FIFO") order.
  • a coherent transaction is any transaction (such as a read) that results in the need to check other caches to see whether the requested data is in the other cache, or to verify that the cache is up-to-date. Such transactions are indicated by signals sent during the address cycle for the transactions initiated on bus 12.
  • Each module having a cache memory monitors the bus and loads coherent transactions into its cache coherency queue, referred to herein as a CCC queue. The coherent transactions wait in the CCC queue of a particular module until that module checks its cache and reports the results of that coherency check to main memory controller 14.
  • main memory controller 14 begins reading the main memory as soon as the read transaction has been issued. Main memory controller 14 waits until the results of the coherency checks are reported by all of the modules, and then responds to the coherent transaction. If no client module has a private-dirty copy of the data, main memory controller 14 will supply the data from main memory. Otherwise, the client module that has a private-dirty copy will supply the data and main memory controller 14 will update main memory with the new data value. In a preferred implementation, coherency responses are received by main memory controller 14 quickly enough so that there is no appreciable delay in responding to the transaction.
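The decision about who supplies the data for a coherent read can be sketched as follows. The function name, the module labels, and the status strings other than "private-dirty" (for example "miss" and "shared") are assumptions for illustration; the rule itself, that main memory supplies the data unless some client holds a private-dirty copy, follows the description above.

```python
from typing import Dict, Iterable, Optional


def resolve_coherent_read(
    responses: Dict[str, str], expected_modules: Iterable[str]
) -> Optional[str]:
    """Return the data supplier, or None while coherency responses are missing.

    `responses` maps a module name to its reported coherency status.
    """
    if set(expected_modules) - set(responses):
        # Not every module has reported yet: keep waiting.
        return None
    for module, status in responses.items():
        if status == "private-dirty":
            # That client supplies the data; main memory is then updated.
            return module
    return "main_memory"
```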
  • Main memory controller 14 serves as a central location for receiving and processing information on the current full/empty status of all queues: the memory queues, the CCC queues, and the I/O queues. Different procedures are used to track each type of queue, as explained further below.
  • main memory controller 14 internally keeps track of how full its memory queues are. This can be done in any conventional manner.
  • each I/O module reports the status of its I/O queue to main memory controller 14.
  • the I/O modules monitor their own I/O queues, and assert a dedicated STOP_IO signal to main memory controller 14 when their I/O queues are critically full.
  • a queue is critically full if all remaining entries in the queue can be filled by new transactions, targeted for that queue and issued at the maximum allowed issue rate, in approximately the time required to notify all modules to stop issuing that type of transaction.
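The definition of "critically full" can be read as a threshold on free entries. The sketch below assumes the issue rate is expressed in transactions per bus cycle and the notification time in bus cycles; the function and parameter names, and the example numbers, are hypothetical.

```python
import math


def is_critically_full(
    capacity: int,
    occupied: int,
    max_issue_rate: float,
    notify_latency_cycles: int,
) -> bool:
    """True if the free entries could all be consumed before a stop takes effect.

    max_issue_rate is in transactions per cycle (assumed unit);
    notify_latency_cycles is the time needed to tell all modules to stop.
    """
    free_entries = capacity - occupied
    # Transactions that may still arrive while the notification propagates.
    in_flight = math.ceil(max_issue_rate * notify_latency_cycles)
    return free_entries <= in_flight
```

For instance, with a 16-entry queue, one new transaction every two cycles, and a six-cycle notification time, up to three more transactions can arrive, so the queue counts as critically full once three or fewer entries remain free.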
  • main memory controller 14 detects the number of coherent transactions issued on the bus and keeps track of how many coherent transactions each module has responded to, thereby indirectly monitoring the fullness of each module's CCC queue. More specifically, main memory controller 14 receives all coherent transactions as they are issued. As explained above, each module having a cache also receives each coherent transaction and sends the results of its cache coherency check for coherent transactions it has received to main memory controller 14. The responses are sent to main memory controller 14 on COH lines 42-52, which are dedicated buses from each module to main memory controller 14. Thus, main memory controller 14 can determine the number of coherent transactions remaining in a module's CCC queue by comparing cache coherency responses received from that module against the number of coherent transactions issued.
  • The process can be viewed as occurring on a "scoreboard."
  • Coherent transactions are placed on the board when issued, indicating that the transaction is in each module's CCC queue.
  • the main memory controller monitors the bus for such transactions.
  • When main memory controller 14 receives the coherency response from each module on the COH lines, main memory controller 14 records the module's response, moves a pointer to the next CCC request to be processed by that module, and reduces by one the number of transactions listed as being in that module's CCC queue.
  • Main memory controller 14 also knows when it has received all coherency responses for a given coherent transaction, so that it knows when and how to respond to the coherent transaction.
  • each module could assert a dedicated signal to main memory controller 14, similar to STOP_IO, but indicating that a CCC queue is critically full.
  • the scoreboard approach is more efficient in terms of hardware, since it utilizes coherency responses already being sent for purposes of the coherency scheme.
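A minimal sketch of the scoreboard bookkeeping follows. It assumes a per-module count of coherent transactions issued but not yet answered, a uniform queue depth, and a fixed margin of free entries below which a queue is treated as critically full; the class name, method names, and parameters are hypothetical.

```python
from typing import Dict, Iterable


class CoherencyScoreboard:
    """Host-side estimate of each module's CCC queue occupancy (illustrative)."""

    def __init__(self, modules: Iterable[str], queue_depth: int, critical_margin: int) -> None:
        self.pending: Dict[str, int] = {m: 0 for m in modules}
        self.queue_depth = queue_depth          # assumed equal for all modules
        self.critical_margin = critical_margin  # free entries treated as critical

    def transaction_issued(self) -> None:
        # A coherent transaction seen on the bus enters every module's CCC queue.
        for module in self.pending:
            self.pending[module] += 1

    def response_received(self, module: str) -> None:
        # A coherency response on that module's COH lines frees one entry.
        if self.pending[module] == 0:
            raise ValueError("response received with no pending transaction")
        self.pending[module] -= 1

    def any_queue_critical(self) -> bool:
        return any(
            self.queue_depth - count <= self.critical_margin
            for count in self.pending.values()
        )
```

No extra signal line is needed in this scheme, because the counts are derived entirely from traffic the host already observes.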
  • main memory controller 14 uses the CLIENT_OP bus to prevent issuance of any transaction that would overload a queue.
  • main memory controller 14 acting as host module, sends signals to all other modules on the CLIENT_OP bus indicating what types of transactions can be safely initiated.
  • When a module wins arbitration for the bus, it checks what encoding was driven on the CLIENT_OP bus during the arbitration state to see what transactions (or returns) the arbitration winner can start.
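The check performed by an arbitration winner can be sketched as a lookup from the driven encoding to the kinds of transaction it permits. Only the four flow-control encodings are modeled; the transaction kind labels ("memory_request", "io_request", "response", "write_back") and the function name `may_start` are assumptions made for this illustration.

```python
# Kinds of transaction each flow-control encoding permits (illustrative labels).
FLOW_CONTROL_RULES = {
    "ANY_TRANS": {"memory_request", "io_request", "response", "write_back"},
    "NO_IO": {"memory_request", "response", "write_back"},
    "RET_ONLY": {"response", "write_back"},
    "NONE_ALLOWED": set(),
}


def may_start(client_op: str, kind: str) -> bool:
    """True if the arbitration winner may start a transaction of this kind."""
    if client_op not in FLOW_CONTROL_RULES:
        # HOST_CONTROL, SHAR_RTN, ONE_CYCLE and ATOMIC are not modeled here.
        raise ValueError(f"encoding not modeled in this sketch: {client_op}")
    return kind in FLOW_CONTROL_RULES[client_op]
```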
  • the possible CLIENT_OP signals are summarized in Table 1, above.
  • The CLIENT_OP signals directly related to flow control are ANY_TRANS, NO_IO, RET_ONLY, and NONE_ALLOWED. If all queues have sufficient room, and main memory controller 14 is not trying to gain control of the bus, main memory controller 14 will drive the ANY_TRANS encoding, indicating that any type of transaction may be issued. If any I/O module is asserting its STOP_IO signal, main memory controller 14 will know that at least one I/O queue is critically full, and main memory controller 14 will drive the NO_IO encoding, indicating that any transaction except I/O transactions may be issued.
  • main memory controller 14 If main memory controller 14 detects that one or more CCC queues are critically full, or that its own memory queues cannot handle new read transactions, main memory controller 14 will drive the RET_ONLY encoding, indicating that the arbitration winner is only permitted to issue responses to earlier transactions or perform write backs of private dirty cache lines. In addition, new I/O transactions are disallowed.
  • main memory controller 14 If main memory controller 14 detects that its own memory queue cannot handle any new write transactions, it drives NONE_ALLOWED to prohibit starting new transactions. Since no new transactions are allowed, all queues are protected from overflowing. Internal memory processing will eventually relieve the memory queues, and cache coherency checking will eventually relieve the CCC queues, so a more permissive CLIENT_OP encoding can be issued.
  • FIG. 2 shows key elements of a computer system 100, which elements correspond functionally to elements described in connection with computer system 10 and FIG. 1.
  • Computer system 100 comprises a bus 112, a main memory controller 114 coupled to main memory 115, an input/output module 116, a processor module 120, a CLIENT_OP line 140, coherency "COH" lines 142 and 152, and STOP_IO line 158.
  • These elements correspond, respectively, to bus 12, main memory controller 14, input/output module 16, processor module 20, CLIENT_OP line 40, COH lines 42 and 52, and STOP_IO line 58, which were described in connection with FIG. 1.
  • The aspects of these elements and their interrelationship that were described in connection with FIG. 1 will not be repeated here.
  • FIG. 2 shows only one processor module and one input/output module. It is to be understood that, in a preferred implementation, additional processor modules identical to module 120 and additional input/output modules identical to module 116 are coupled to bus 112 in the manner shown in FIG. 1.
  • Computer system 100 includes an input/output bus 160 coupled to input/output module 116 in a conventional manner.
  • Input/output module 116 also includes an input/output queue 162, a CCC queue 164, and a memory cache 166.
  • Processor module 120 additionally includes a CCC queue 168 and a memory cache 170.
  • Main memory controller 114 includes an arbitration processor 172, a memory read queue 174, a memory write queue 176, and a scoreboard 178. It is understood that the processor and input/output modules not shown each contain elements identical to those of processor module 120 and input/output module 116, respectively.
  • Coherent transactions issued by an input/output module or processor module are transmitted on bus 112.
  • The coherent transaction is detected by each module and placed in the CCC queue of each client module and on scoreboard 178.
  • Coherent transactions stored in CCC queues 164 and 168 are checked against memory caches 166 and 170, respectively, and the results are reported to main memory controller 114 on lines 152 and 142, respectively.
  • The results are stored on the scoreboard until all modules have reported for the transaction in question.
  • Main memory controller 114 compares the number of coherent transactions responded to on lines 152 and 142 against the number of coherent transactions listed in scoreboard 178 to determine the full/empty status of CCC queues 164 and 168.
  • For example, a coherent memory read issued on bus 112 will be detected by modules 116 and 120 and placed in their respective CCC queues for a coherency check.
  • The results of the coherency checks will be reported to main memory controller 114, in this example indicating that neither module has a private dirty copy of the data.
  • Main memory controller 114 then provides the requesting module with the data, indicates on its scoreboard that each module has responded to that coherent transaction, and marks this line of the scoreboard as being free for use by an incoming transaction.
  • Input/output transactions such as a write to an input/output device are funneled through input/output queue 162 to input/output bus 160.
  • Input/output module 116 monitors the status of input/output queue 162 and, when input/output queue 162 is critically full, input/output module 116 reports this information to main memory controller 114 on line 158. For example, if a processor module is busy writing data to input/output module 116, transactions may fill up queue 162, causing issuance of a STOP_IO signal. Main memory controller 114 will then issue a NO_IO client option signal.
  • Main memory controller 114 also monitors the status of its own memory queues, queue 174 and queue 176, which are preferably a memory read queue and a memory write queue. Thus, main memory controller 114 has information concerning the full/empty status of all queues within computer system 100 that could otherwise overflow. If it detects that its memory queue is critically full, it issues a NONE_ALLOWED client option signal. As the previously-issued memory transactions are processed, the memory queue will begin to empty and a more permissive client option signal can be issued.
  • Processor 172 within main memory controller 114 determines what types of transactions can be issued in the next available cycle without any of the queues overflowing. As explained above, processor 172 selects the CLIENT_OP signal so that only transactions that will not cause any of the queues to overflow are permitted during the next available bus cycle, and the winner of the arbitration will only issue transactions that are permitted by the CLIENT_OP signal. Thus, there is never a need to abort any transactions and there is no need for handshaking among modules.
  • Consider, for example, a case in which input/output queue 162 is close to becoming critically full while input/output module 116 is busy receiving data. Another write to an input/output device is sent on bus 112 to input/output module 116 and placed in input/output queue 162. Detecting that queue 162 is critically full, input/output module 116 sends a STOP_IO signal to main memory controller 114. Input/output queue 162 continues to receive transactions for several cycles until main memory controller 114 drives a NO_IO client option signal in response to the STOP_IO signal. Based on the NO_IO client option signal, the next bus owner will not drive any transactions to input/output devices.
  • Similarly, main memory controller 114 may detect (using its scoreboard) that one or more coherency queues are becoming critically full. Main memory controller 114 will then drive a RET_ONLY client option signal. The bus owner will not drive any further coherent transactions. However, data returns and coherency check responses will be allowed. Thus, the CCC queues will eventually begin emptying, and a more permissive client option signal will be issued.
  • The terms “bus(es)” and “line(s)” have both been used in this detailed description to denote various sets of one or more electrical paths that are more fully described above. It will be appreciated by those skilled in the art that the terms “bus” and “line” are not intended to be mutually exclusive or otherwise limiting in themselves.
  • Although the term “LONG_TRANS bus” has been used, it is clear that the LONG_TRANS bus may consist of a conventional shared line; that is, a single electrical path along which signals can be sent by more than one module.
  • The terms “CLIENT_OP bus” and “CLIENT_OP lines” have been used interchangeably to denote a set of hardware lines driven only by the host, as described more fully above.
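
The flow-control decision described above can be illustrated with a short software model. This is only a sketch under stated assumptions, not the patented hardware: the names `Scoreboard`, `choose_client_op`, `ccc_depth`, and `margin` are hypothetical, the "critically full" threshold is modeled as a simple depth-minus-margin test, and the priority order (most restrictive condition wins) is inferred from the statements that RET_ONLY also disallows new I/O and that NONE_ALLOWED prohibits all new transactions. CLIENT_OP encodings in Table 1 that are not directly related to flow control are omitted.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClientOp(Enum):
    """The four CLIENT_OP encodings the description ties to flow control."""
    ANY_TRANS = "ANY_TRANS"
    NO_IO = "NO_IO"
    RET_ONLY = "RET_ONLY"
    NONE_ALLOWED = "NONE_ALLOWED"


@dataclass
class Scoreboard:
    """Counts, per module, coherent transactions still awaiting a COH response."""
    ccc_depth: int   # hypothetical capacity of each module's CCC queue
    margin: int      # hypothetical slack for transactions already in flight
    pending: dict = field(default_factory=dict)

    def coherent_issued(self, modules):
        # A coherent transaction on the bus lands in every module's CCC queue.
        for module in modules:
            self.pending[module] = self.pending.get(module, 0) + 1

    def response_received(self, module):
        # A COH response means that module finished one queued coherency check.
        self.pending[module] -= 1

    def any_critically_full(self):
        limit = self.ccc_depth - self.margin
        return any(count >= limit for count in self.pending.values())


def choose_client_op(scoreboard, stop_io_asserted, read_queue_full, write_queue_full):
    """Pick the most restrictive encoding that the current queue state requires."""
    if write_queue_full:
        return ClientOp.NONE_ALLOWED
    if scoreboard.any_critically_full() or read_queue_full:
        return ClientOp.RET_ONLY
    if stop_io_asserted:
        return ClientOp.NO_IO
    return ClientOp.ANY_TRANS


if __name__ == "__main__":
    sb = Scoreboard(ccc_depth=4, margin=1)
    for _ in range(3):
        sb.coherent_issued(["io_116", "cpu_120"])
    print(choose_client_op(sb, stop_io_asserted=False,
                           read_queue_full=False, write_queue_full=False))
```

In this model the controller never queries a module for its CCC queue depth. It infers occupancy from the coherent transactions it has seen on the bus and the COH responses it has received, which mirrors the hardware-efficiency argument made for the scoreboard approach.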

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Multi Processors (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Systems (AREA)
  • Computer And Data Communications (AREA)
  • Bus Control (AREA)

Claims (4)

  1. A data processing system (100) comprising:
    a bus (112) including a plurality of signal conductors for transmitting information between physically separate locations;
    a plurality of modules (116, 118) coupled to said bus, each of said modules comprising means for transmitting and receiving information that specifies a transaction to be performed by another module or by said module, respectively, each of said modules further comprising a queue (162, 164, 168, 174, 176, 178) for storing information specifying said transaction received by said module for processing by said module;
    a bus controller (114) that includes means for limiting said transactions sent on said bus (112); and
    means, separate from said modules, for determining that a queue in one of said modules has less than a predetermined amount of free space,
    characterized by
    a second bus (40) on which the bus controller (114) transmits to at least a first one of said modules, in response to said determining means, a signal indicating that at least a second one of said modules has in its respective queue an amount of space less than a predetermined value, so as to prevent said at least first one of said modules from issuing transactions requiring space in said queue of said at least second one of said modules, while allowing said at least first one of said modules to issue at least one transaction that requires no space in said queue.
  2. The data processing system (100) according to claim 1, wherein at least one of said modules further comprises:
    a memory (166, 170, 115);
    means for detecting a coherent transaction on said bus (112), said coherent transaction requiring a check of said memory for the existence of a specified word and the status of said word, and for causing information specifying said check to be stored in said queue included in said module;
    means for checking said memory for the presence and status of said data word specified in said coherent transaction; and
    means (145, 152) for transmitting signals indicating the result of the last said check performed by said checking means;
    wherein said determining means further comprises means for receiving said signals transmitted from each of said modules that perform said checks.
  3. The data processing system (100) according to claim 1, wherein at least one of said modules comprises:
    means for determining the amount of space present in said queue in said module; and
    means (142, 152) for generating and transmitting, to said bus controller (114), a signal indicating that said determined amount of space is less than a predetermined amount.
  4. The data processing system (100) according to claim 1, further comprising:
    a main memory (115), said main memory including a queue for storing instructions that require responses by said main memory (174, 176);
    wherein said checking means further comprises a scoreboard (178) that includes, for each transaction stored in each of said queues (164, 168), a line that stores coherent transaction information.
EP94113319A 1994-02-24 1994-08-25 Queue-based predictive flow control mechanism Expired - Lifetime EP0674272B1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/201,185 US6182176B1 (en) 1994-02-24 1994-02-24 Queue-based predictive flow control mechanism
US201185 1994-02-24

Publications (2)

Publication Number Publication Date
EP0674272A1 EP0674272A1 (fr) 1995-09-27
EP0674272B1 EP0674272B1 (fr) 2000-05-03

Family

ID=22744823

Family Applications (1)

Application Number Title Priority Date Filing Date
EP94113319A Expired - Lifetime EP0674272B1 (fr) 1994-02-24 1994-08-25 Queue-based predictive flow control mechanism

Country Status (6)

Country Link
US (2) US6182176B1 (fr)
EP (1) EP0674272B1 (fr)
JP (1) JPH07306810A (fr)
KR (1) KR100371844B1 (fr)
DE (1) DE69424272T2 (fr)
TW (1) TW253947B (fr)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5528766A (en) * 1994-03-24 1996-06-18 Hewlett-Packard Company Multiple arbitration scheme
US5761444A (en) * 1995-09-05 1998-06-02 Intel Corporation Method and apparatus for dynamically deferring transactions
US6240479B1 (en) * 1998-07-31 2001-05-29 Motorola, Inc. Method and apparatus for transferring data on a split bus in a data processing system
US6836829B2 (en) * 1998-11-20 2004-12-28 Via Technologies, Inc. Peripheral device interface chip cache and data synchronization method
US6732208B1 (en) * 1999-02-25 2004-05-04 Mips Technologies, Inc. Low latency system bus interface for multi-master processing environments
US6460133B1 (en) * 1999-05-20 2002-10-01 International Business Machines Corporation Queue resource tracking in a multiprocessor system
US6728253B1 (en) * 1999-09-24 2004-04-27 International Business Machines Corporation Mixed queue scheduler
US6529990B1 (en) * 1999-11-08 2003-03-04 International Business Machines Corporation Method and apparatus to eliminate failed snoops of transactions caused by bus timing conflicts in a distributed symmetric multiprocessor system
US6651124B1 (en) * 2000-04-28 2003-11-18 Hewlett-Packard Development Company, L.P. Method and apparatus for preventing deadlock in a distributed shared memory system
US6748505B1 (en) * 2000-07-11 2004-06-08 Intel Corporation Efficient system bus architecture for memory and register transfers
US6804736B2 (en) * 2000-11-30 2004-10-12 Hewlett-Packard Development Company, L.P. Bus access arbitration based on workload
US6631440B2 (en) * 2000-11-30 2003-10-07 Hewlett-Packard Development Company Method and apparatus for scheduling memory calibrations based on transactions
US20030095447A1 (en) * 2001-11-20 2003-05-22 Koninklijke Philips Electronics N.V. Shared memory controller for display processor
US6862646B2 (en) * 2001-12-28 2005-03-01 Thomas J. Bonola Method and apparatus for eliminating the software generated ready-signal to hardware devices that are not part of the memory coherency domain
US7546399B2 (en) * 2002-03-25 2009-06-09 Intel Corporation Store and forward device utilizing cache to store status information for active queues
US7024499B2 (en) * 2003-01-21 2006-04-04 Red Hat, Inc. Cache only queue option for cache controller
WO2004107180A1 (fr) * 2003-05-30 2004-12-09 Fujitsu Limited Multiprocessor system
US7165131B2 (en) * 2004-04-27 2007-01-16 Intel Corporation Separating transactions into different virtual channels
US7723249B2 (en) * 2005-10-07 2010-05-25 Sulzer Metco (Us), Inc. Ceramic material for high temperature service
US7596647B1 (en) * 2006-09-18 2009-09-29 Nvidia Corporation Urgency based arbiter
US7949813B2 (en) * 2007-02-06 2011-05-24 Broadcom Corporation Method and system for processing status blocks in a CPU based on index values and interrupt mapping
US8019910B2 (en) * 2007-07-31 2011-09-13 Hewlett-Packard Development Company, L.P. Transaction flow control in PCI express fabric
JP5104139B2 (ja) * 2007-09-07 2012-12-19 Fujitsu Limited Cache system
US8275902B2 (en) * 2008-09-22 2012-09-25 Oracle America, Inc. Method and system for heuristic throttling for distributed file systems
US20130054896A1 (en) * 2011-08-25 2013-02-28 STMicroelectronica Inc. System memory controller having a cache
JP2016173798A (ja) * 2015-03-18 2016-09-29 Renesas Electronics Corporation Semiconductor device
US10423358B1 (en) 2017-05-31 2019-09-24 FMAD Engineering GK High-speed data packet capture and storage with playback capabilities

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5257374A (en) * 1987-11-18 1993-10-26 International Business Machines Corporation Bus flow control mechanism
GB8814633D0 (en) * 1987-11-18 1988-07-27 Ibm Bus flow control mechanism
US5204954A (en) * 1987-11-18 1993-04-20 International Business Machines Corporation Remote storage management mechanism and method
US5265235A (en) * 1990-11-30 1993-11-23 Xerox Corporation Consistency protocols for shared memory multiprocessors
US5195089A (en) * 1990-12-31 1993-03-16 Sun Microsystems, Inc. Apparatus and method for a synchronous, high speed, packet-switched bus

Also Published As

Publication number Publication date
JPH07306810A (ja) 1995-11-21
US6304932B1 (en) 2001-10-16
US6182176B1 (en) 2001-01-30
EP0674272A1 (fr) 1995-09-27
KR100371844B1 (ko) 2003-04-07
TW253947B (en) 1995-08-11
DE69424272T2 (de) 2000-11-30
KR950033892A (ko) 1995-12-26
DE69424272D1 (de) 2000-06-08

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB

17P Request for examination filed

Effective date: 19960229

17Q First examination report despatched

Effective date: 19981117

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB

REF Corresponds to:

Ref document number: 69424272

Country of ref document: DE

Date of ref document: 20000608

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

RAP2 Party data changed (patent owner data changed or rights of a patent transferred)

Owner name: HEWLETT-PACKARD COMPANY, A DELAWARE CORPORATION

26N No opposition filed
REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20070830

Year of fee payment: 14

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20070817

Year of fee payment: 14

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20080825

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20090430

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080901

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080825

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20090827

Year of fee payment: 16

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 69424272

Country of ref document: DE

Effective date: 20110301

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110301