CN100375080C - Input / output group throttling method in large scale distributed shared systems - Google Patents

Info

Publication number
CN100375080C
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2005100314495A
Other languages
Chinese (zh)
Other versions
CN1667602A (en
Inventor
郭御风
李琼
刘衡竹
刘路
胡军
刘涛
黄克勋
尹佳斌
郭敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology

Landscapes

  • Information Transfer Systems (AREA)
  • Multi Processors (AREA)

Abstract

The present invention relates to an input/output group throttling method for large-scale distributed shared systems. It aims to solve the system congestion caused by unbalanced I/O device speeds and bursty I/O transaction flows. The technical scheme is to design input/output group throttling control logic for a large-scale parallel distributed system. The first group throttling control logic, comprising PIO transaction sending control logic, a PIO transaction FIFO buffer and a PIO transaction retry state machine, is located inside the processor interface (PI) of the NC. The second group throttling control logic, composed of PIO transaction processing logic, PIO transaction sending logic, PIO transaction response receiving logic, PIO transaction response generation logic, a plurality of control state machines, PIO transaction buffers and a plurality of credit counters, is located in the IOC. The two logic blocks cooperate to complete group throttling control of the PIO transaction flow. The invention resolves the system congestion caused by slow I/O devices or bursty I/O transaction flows and greatly improves system performance.

Description

Input/output group throttling method in large-scale distributed shared systems
Technical field
The present invention relates to input/output (I/O) methods in the computer field, and in particular to a method for handling I/O transaction flows in large-scale distributed shared parallel processing systems.
Background technology
Because processor technology and I/O technology have developed unevenly, I/O remains one of the main bottlenecks of massively parallel computer systems. Designing a high-performance I/O system that provides high-bandwidth, low-latency, highly reliable I/O access, so that the computing, communication and I/O performance of the system scale in a balanced way, is one of the key methods for improving the performance of massively parallel systems. On the other hand, I/O speeds are uneven, and fast and slow I/O devices usually coexist in a system. How to make full use of I/O performance in a large-scale distributed shared parallel system, and avoid the performance degradation caused by system congestion, has become one of the urgent problems in massively parallel systems.
Massively parallel systems at present mostly adopt a shared I/O structure, in which the I/O resources are shared by all compute nodes and access to and control of the I/O resources is mostly carried out by hardware. Shared I/O structures are further divided into two kinds: centralized shared and distributed shared. In centralized shared I/O, the I/O resources are not attached to any single compute node; they are connected to the compute nodes through a high-speed interconnection network and shared by all of them. In distributed shared I/O, the I/O resources are attached to different compute nodes; access to I/O resources attached to another compute node must be forwarded by the local compute node and requires the participation of the remote compute node where the I/O resource resides. I/O transactions initiated by the local compute node are called local I/O transactions, and those initiated by a remote compute node are called remote I/O transactions. In the distributed shared structure, each node mainly consists of an NC (Node Controller), an IOC (IO Controller) and the system interconnection network. The NC implements the processor interface, memory access control and the interconnection network interface; the IOC manages the I/O devices and controls I/O accesses. I/O transactions are divided into PIO transactions (Process IO, I/O transactions initiated by the processor) and DMA transactions (Direct Memory Access). A PIO transaction is an operation, initiated by the CPU, that accesses and controls an I/O device; a DMA operation is an operation, initiated by an I/O device, that reads or writes system memory.
The IOC usually controls the PIO transaction flow through PIO credit management: each time the CPU sends a PIO transaction, the PIO credit is decremented by 1, and each time an I/O device completes a PIO transaction, the PIO credit is incremented by 1. Once the PIO credit is exhausted (the PIO credit counter is 0), no new PIO transaction can be sent; a PIO transaction can be issued only when the PIO credit counter is greater than 0. In a large-scale distributed shared system, slow I/O devices process I/O transactions slowly. When the PIO transactions of a slow I/O device are late in completing, or a burst of PIO transactions arrives within some period, the PIO credit is exhausted and the IOC can no longer issue new PIO transactions to the I/O devices. Subsequent PIO transactions then block the IOC interface of the NC and may also block the interconnection network interface of the NC, so that even operations that do not need I/O cannot complete, and system performance drops sharply.
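The conventional credit scheme described above can be illustrated with a minimal Python sketch. It is a software model, not the hardware design: the class name `PioCreditCounter`, its methods and the credit value of 2 are assumptions made for illustration. It shows how a single shared counter lets a slow device starve a fast one.

```python
class PioCreditCounter:
    """Hypothetical model of one shared PIO credit pool (conventional scheme)."""

    def __init__(self, credits: int) -> None:
        self.credits = credits

    def try_send(self) -> bool:
        """Consume one credit for a new PIO transaction; False if exhausted."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True

    def complete(self) -> None:
        """An I/O device finished one PIO transaction: return its credit."""
        self.credits += 1


pool = PioCreditCounter(credits=2)
# Two transactions to a slow device take every credit in the shared pool.
sent_slow = [pool.try_send(), pool.try_send()]
# A transaction to a fast device is now blocked as well.
sent_fast = pool.try_send()
# Only after a slow transaction completes can anything be sent again.
pool.complete()
sent_after = pool.try_send()
```

Because every device draws on the same counter, the blocked `sent_fast` request is exactly the congestion case the background section describes.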
Summary of the invention
The technical problem to be solved by the present invention is that, in the large-scale distributed shared parallel system described above, unbalanced I/O device speeds and bursty I/O transaction flows cause system congestion and thereby greatly reduce system performance. The invention proposes an input/output group throttling method for large-scale distributed shared systems (Distributed Shared Input Output Grouping Throttling, DSIOGT) that effectively resolves the congestion problem. On the one hand, it allows I/O performance to be fully exploited under the distributed shared I/O structure, so that I/O resources can scale in a balanced way; on the other hand, it keeps system performance from degrading when the I/O resource configuration or the PIO transaction flow changes.
The technical scheme of the present invention is to design input/output group throttling control logic for a large-scale parallel distributed system. This control logic is divided into two parts, implemented in the NC and the IOC respectively. The group throttling control logic in the NC, called the first group throttling control logic, is located inside the processor interface (PI) of the NC and comprises PIO transaction sending control logic, a PIO transaction FIFO buffer and a PIO transaction retry state machine. The PIO transaction sending control logic is connected to the PIO transaction retry state machine, which controls it. The PIO transaction sending control logic is also connected to a FIFO buffer that holds PIO transactions from the CPU, and controls the in-order issue of those transactions. PIO transactions pass through the FIFO and are sent to the IOC by the message sending logic in the NC; PIO response messages from the IOC are returned to the NC through the message receiving logic in the NC.
The PIO transaction retry state machine has two states: the idle state (Idle) and the retry state (Retry). Its initial state, that is, the state at system power-up and reset, is the idle state. The state machine changes state according to the type of PIO transaction response received by the message receiving logic: if a NACK response is received, the state machine enters the retry state; if an ACK response is received, it remains in the idle state. Once in the retry state, the state machine returns to the idle state only after the PIO transaction sending control logic has completed all PIO transactions in the FIFO that need to be retried. While the state machine is in the retry state, the WRRDY signal on the system bus is deasserted, preventing the CPU from issuing further PIO requests; while it is in the idle state, WRRDY is asserted, allowing the CPU to issue new PIO transactions. After the state machine enters the retry state but before WRRDY is deasserted, the CPU may already have issued several PIO transactions; these are held in the FIFO buffer and all of them need to be retried. All PIO transactions that need retrying enter this FIFO in first-in, first-out order. Until a retried request is actually serviced, no subsequent request can complete, and the requests are retried in order. To preserve the order among these PIO transactions, the PIO transaction sending control logic marks the first PIO transaction that received a NACK response with a head mark, distinguishing it from later PIO transactions that receive NACK responses. When the head-marked request is retried, the command sent to the IOC is a retry-head command (a retry of the earliest issued PIO transaction); when a subsequent request is retried, the command sent to the IOC is a plain retry command. The IOC can thus determine which PIO transaction is the earliest. When PIO credit becomes available, it services this request first and returns an ACK-HEAD response (acknowledge-HEAD, the completion response for the earliest issued PIO transaction) to notify the PI that the first retried transaction has been serviced. The PI then places the head mark on the next retried transaction, which ensures that all PIO transactions complete in order. After all retried PIO transactions have completed, the state machine returns to the idle state.
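The behaviour of the retry state machine can be sketched in Python as follows. This is a simplified software model under stated assumptions: the class `PioRetryStateMachine`, its method names and the string encodings of responses and commands (`"NACK"`, `"ACK-HEAD"`, `"RETRY-HEAD"`, `"RETRY"`) are illustrative and do not come from the patent, which describes hardware logic.

```python
from collections import deque
from enum import Enum


class RetryState(Enum):
    IDLE = "Idle"
    RETRY = "Retry"


class PioRetryStateMachine:
    """Hypothetical model of the NC-side retry state machine and its FIFO."""

    def __init__(self) -> None:
        self.state = RetryState.IDLE  # state after power-up reset
        self.fifo = deque()           # transactions awaiting retry, oldest first

    @property
    def wrrdy(self) -> bool:
        """WRRDY is asserted only while the state machine is idle."""
        return self.state is RetryState.IDLE

    def on_response(self, txn: str, response: str) -> None:
        if response == "NACK":
            # Any NACK moves the machine to Retry and queues the transaction.
            self.state = RetryState.RETRY
            if txn not in self.fifo:
                self.fifo.append(txn)
        elif response == "ACK-HEAD":
            # The head transaction was serviced; the next one becomes the head.
            if self.fifo and self.fifo[0] == txn:
                self.fifo.popleft()
            if not self.fifo:
                self.state = RetryState.IDLE
        # A plain ACK leaves the state unchanged.

    def retry_commands(self) -> list:
        """Oldest transaction is retried as head, the rest as plain retries."""
        return [("RETRY-HEAD" if i == 0 else "RETRY", txn)
                for i, txn in enumerate(self.fifo)]


sm = PioRetryStateMachine()
sm.on_response("t1", "NACK")
sm.on_response("t2", "NACK")
first_round = sm.retry_commands()
sm.on_response("t1", "ACK-HEAD")
second_round = sm.retry_commands()
sm.on_response("t2", "ACK-HEAD")
```

Keeping the pending transactions in a FIFO and marking only its first entry as the head is what preserves issue order across retries in this model.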
The group throttling control logic in the IOC, called the second group throttling control logic, consists of PIO transaction processing logic, PIO transaction sending logic, PIO transaction response receiving logic, PIO transaction response generation logic, a plurality of control state machines, PIO transaction buffers and a plurality of credit counters. The PIO transaction processing logic, the PIO transaction response generation logic and the credit counters are all connected to the control state machines. The PIO transaction processing logic is connected to the PIO transaction sending logic, the PIO transaction buffers and the credit counters, and also to the message sending logic of the NC. The PIO transaction response receiving logic is connected to the PIO transaction buffers and the credit counters as well as to the PIO transaction response generation logic, and is responsible for receiving the PIO response messages returned from the I/O devices. The number of control state machines equals the number of credit counters and is determined by the kinds of I/O devices the system supports: each class of I/O devices (those attached to the same I/O bridge) corresponds to one state machine and one credit counter. The I/O devices attached to different I/O bridges are thus divided into groups by device type, and each group is allocated its own PIO transaction credit (a corresponding number of PIO transaction buffers). The devices of a group can use only the PIO transaction buffers allocated to that group and cannot occupy those of another group, which achieves grouped control of the PIO transaction flows to different I/O devices. The grouping policy gives slow I/O devices fewer PIO transaction buffers and fast devices more. Because a slow I/O device occupies only the PIO transaction buffers allocated to it, exhausting those buffers does not take buffers from the fast I/O devices; the PIO transaction flow of the fast devices is unaffected, and the system as a whole does not become blocked. Each credit counter consists of a PIO credit counter, a NACKA counter and a NACKB counter. The PIO credit counter indicates how many PIO transaction buffers are available to its I/O bridge device. When the PIO credit counter of a device is 0, the IOC returns a NACK response to any subsequent PIO transaction request. After the message receiving logic in the NC receives this NACK response, the PIO transaction retry state machine enters the retry state and the WRRDY signal on the system bus is deasserted, preventing the CPU from issuing further PIO requests and thereby throttling the PIO transactions. When the PIO credit counter is not 0, each PIO request issued to an I/O device decrements the corresponding PIO credit counter by 1, and each response received from that device increments it by 1. The NACKA counter records the number of NACK responses returned to PIO transactions received while the second group throttling control logic in the IOC is in the normal working state; the NACKB counter records the number of NACK responses returned to PIO transactions received while it is in a throttling working mode.
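The grouping idea can be sketched as one credit counter per I/O bridge. The Python below is an illustrative model only: the class `GroupedCredits`, the group names, the credit allocation and the `"FORWARDED"` result are assumptions, and the separate NACKA and NACKB counters are collapsed into a single per-group NACK count for brevity.

```python
class GroupedCredits:
    """Hypothetical model of per-bridge PIO credit counters in the IOC."""

    def __init__(self, allocation: dict) -> None:
        self.credits = dict(allocation)               # PIO buffers left per group
        self.nacks = {group: 0 for group in allocation}  # simplified NACK count

    def request(self, group: str) -> str:
        """Handle a PIO request aimed at a device in the given group."""
        if self.credits[group] == 0:
            self.nacks[group] += 1
            return "NACK"           # the NC must retry this transaction later
        self.credits[group] -= 1
        return "FORWARDED"          # buffer allocated, sent on to the device

    def response(self, group: str) -> None:
        """A device in the group answered: release its buffer."""
        self.credits[group] += 1


# Slow devices get fewer buffers than fast ones, as the grouping policy states.
ioc = GroupedCredits({"slow_bridge": 1, "fast_bridge": 3})
results = [
    ioc.request("slow_bridge"),   # takes the only slow credit
    ioc.request("slow_bridge"),   # slow group exhausted
    ioc.request("fast_bridge"),   # fast group is unaffected
]
ioc.response("slow_bridge")
after_release = ioc.request("slow_bridge")
```

In contrast to a single shared counter, exhausting the slow group's credit leaves the fast group's buffers untouched.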
The control state machine changes state according to the PIO transaction types received by the PIO transaction processing logic and the PIO transaction responses received by the PIO transaction response receiving logic, and at the same time updates the current value of the corresponding credit counter. It has four working modes in total: Normal, Collect-A, Service-A and Service-B. Normal is the normal working state, and the other three are throttling working modes:
1. Let T1 be the moment at which the WRRDY signal is deasserted because no credit is available. All PIO requests outstanding at this moment are A-phase requests, and the IOC enters the Collect-A mode.
2. Let T2 be the moment at which credit is available, the retried PIO head request has been serviced, the NACKA counter is 0 and the WRRDY signal is asserted. Let T3 be the moment at which WRRDY is deasserted again because unserviced A-phase requests are found. New PIO transactions sent by the CPU between T2 and T3 are B-phase transactions. At T2, if NACKA is not 0, the PIO transaction processing logic of the IOC enters the Service-A mode; if NACKA is 0, it enters the Normal mode.
3. Let T4 be the moment at which all A-phase requests have been processed and WRRDY is asserted again. Let T5 be the moment at which WRRDY is deasserted once more because uncompleted B-phase requests are found. Let T6 be the moment at which, after all B-phase requests have completed, WRRDY is asserted again. New PIO requests sent by the CPU between T4 and T5 are A-phase requests. At T4, if the NACKB counter is not 0, the PIO transaction processing logic in the IOC enters the Service-B mode; if the NACKB counter is 0, it enters the Normal mode. At T6, if the NACKA counter is not 0, the PIO transaction processing logic in the IOC enters the Service-A mode; if the NACKA counter is 0, it enters the Normal mode.
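The mode decisions taken at the moments T1, T2, T4 and T6 can be written down as a small lookup. The Python sketch below encodes only the rules stated in the three items above; the function name `mode_at` and the string labels for the moments are assumptions for illustration.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "Normal"
    COLLECT_A = "Collect-A"
    SERVICE_A = "Service-A"
    SERVICE_B = "Service-B"


def mode_at(moment: str, nacka: int, nackb: int) -> Mode:
    """Return the mode the IOC enters at a given moment (hypothetical helper).

    nacka and nackb are the current NACKA and NACKB counter values.
    """
    if moment == "T1":
        # WRRDY was deasserted for lack of credit.
        return Mode.COLLECT_A
    if moment in ("T2", "T6"):
        # Pending NACKA responses mean A-phase work remains.
        return Mode.SERVICE_A if nacka != 0 else Mode.NORMAL
    if moment == "T4":
        # Pending NACKB responses mean B-phase work remains.
        return Mode.SERVICE_B if nackb != 0 else Mode.NORMAL
    raise ValueError(f"no mode decision is defined at {moment}")
```

For example, `mode_at("T4", nacka=0, nackb=2)` yields `Mode.SERVICE_B`, while the same call with `nackb=0` yields `Mode.NORMAL`.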
The first group throttling control logic in the NC and the second group throttling control logic in the IOC work together to complete group throttling control of the PIO transaction flow. The detailed process is as follows:
1. The CPU checks the WRRDY signal on the system bus. If WRRDY is asserted, the CPU issues PIO transactions; if WRRDY is deasserted, the CPU cannot issue PIO transactions.
2. PIO transactions issued by the CPU pass through the FIFO buffer of the PI in the NC. The PIO transaction sending control logic checks the PIO transaction retry state machine. If the state machine is in the idle state, the PIO transaction is sent to the IOC by the message sending logic. If it is in the retry state, the first PIO transaction in the FIFO is given the head mark and sent to the IOC, and the WRRDY signal is deasserted. After all retried transactions in the FIFO have completed, the state machine returns to the idle state and WRRDY is asserted. Limiting the PIO transactions that the CPU can issue continuously over a period of time in this way throttles the PIO transaction flow during I/O bursts and reduces the impact on system performance.
3. The PIO transaction processing logic in the IOC receives the PIO transaction, analyzes its type and target I/O device, and checks the control state machine corresponding to that device. If the state machine is in the normal working state, a PIO transaction buffer is allocated to the transaction, the necessary transaction information is recorded, the PIO transaction sending logic sends the transaction to the target I/O device, the PIO credit count of the device is decremented by 1, and the state machine makes the corresponding state transition. If the state machine is in a throttling working mode, no PIO transaction buffer is allocated to the transaction; the PIO transaction response generation logic generates a NACK or ACK-HEAD message and returns it to the NC, the NACKA or NACKB count of the device (selected according to the state machine state) is incremented by 1, and the state machine makes the corresponding state transition.
4. The I/O device receives the PIO transaction over the I/O bus, processes it, and returns the response over the I/O bus to the PIO transaction response receiving logic in the IOC. The PIO transaction response receiving logic checks and analyzes the response and matches it against the transaction information recorded in the PIO transaction buffer. The PIO transaction response generation logic then generates the corresponding response message and returns it to the NC. At the same time, the corresponding PIO transaction buffer is released, the PIO credit count is incremented by 1, and the control state machine makes the corresponding state transition.
5. After the NC receives the response, the PIO transaction retry state machine makes any necessary state transition and controls the WRRDY signal, and the response is returned to the CPU. This completes the processing of the PIO transaction.
The state transition flow of the control state machine is:
1. When all PIO transactions can complete and every PIO credit is greater than 0, PIO transaction processing is in the Normal mode and ACK is returned for all PIO transactions.
2. When the PIO credit corresponding to an I/O bridge equals 0, the Collect-A mode is entered. The IOC returns NACK to new PIO transactions, and the PIO transaction sending control logic of the NC retries the uncompleted PIO transactions in order.
3. In the Collect-A mode, when only the retry-head request remains and credit is available, the Normal state is entered after the request has been processed; the processor interface can send new PIO transactions.
4. In the Collect-A mode, if the condition of item 3 is not met, the group throttling control logic of the IOC remains in the Collect-A mode and returns NACKA to retried PIO transactions. The PI of the NC deasserts WRRDY so that the processor cannot issue new PIO transactions; PIO transactions issued before WRRDY could be deasserted are returned NACKB.
5. In the Collect-A mode, when new credit becomes available, the retry operation is handled correctly. If A-phase PIO transactions still remain at this point, the Service-A state is entered and the CPU is still not allowed to issue new PIO transactions.
6. In the Service-A mode, NACK is returned to new PIO transactions and to retried PIO requests. When credit is available and A-phase PIO transactions still remain, the Service-A mode is kept.
7. In the Service-A mode, when only the A-phase retry-head request remains and credit is available, the Normal state is entered; the processor interface can send new PIO transactions.
8. In the Service-A mode, when credit is available, the retry operation is handled correctly. If only B-phase PIO requests remain at this point, the Service-B state is entered. (The B phase runs from the moment at which credit is available, the retried PIO head request has been serviced, the NACKA counter is 0 and WRRDY is asserted, to the moment at which unserviced A-phase requests are found and WRRDY is deasserted again.) If WRRDY is asserted, the CPU is allowed to issue new PIO transactions; otherwise it is not.
9. In the Service-B mode, when credit is available, the retry operation is handled correctly. If only B-phase PIO requests remain at this point, the Service-A state is entered. The throttling control logic in the NC still forbids the CPU from issuing new PIO transactions.
10. In the Service-B mode, NACK is returned to new PIO transactions and to retried PIO transactions. When credit is available and A-phase PIO requests still remain, the Service-B mode is kept. If WRRDY is asserted, the CPU is allowed to issue new PIO transactions; otherwise it is not.
11. In the Service-B mode, when only the A-phase retry-head request remains and credit is available, the Normal mode is entered after normal processing; the processor interface can send new PIO transactions.
Adopting the present invention achieves the following technical effects:
1. Because the PIO transaction buffers are used in groups, the problem of slow I/O devices blocking a large-scale distributed shared I/O system and degrading system performance is solved.
2. Because the PIO transaction flow is throttled, the problem of bursty I/O transaction flows blocking a large-scale distributed shared I/O system and degrading system performance is solved.
3. Because the above problems are solved, I/O resources of differing performance can be configured flexibly in the system, and the I/O system can scale in a balanced way.
Description of drawings
Fig. 1 is a diagram of the existing large-scale distributed shared I/O architecture;
Fig. 2 is a block diagram of the group throttling control logic of the present invention;
Fig. 3 is a timing diagram of the four working modes of the IOC throttling control logic of the present invention.
Embodiment
Fig. 1 shows the existing distributed shared I/O architecture. All nodes are connected to the system interconnection network through the interconnection network interface of the NC. The I/O controller IOC is attached to the node controller NC. Several kinds of I/O bridges, such as a PCI-X bridge, an InfiniBand bridge and a legacy I/O bridge, are attached under the IOC, allowing a variety of I/O devices to be configured. The processor (CPU) accesses and controls all I/O devices in the system through PIO transactions, and I/O devices access global memory through DMA transactions. The system interconnection network interconnects all nodes in the system.
The present invention designs group throttling control logic that is divided into two parts, implemented in the NC and the IOC respectively: the first group throttling control logic in the NC and the second group throttling control logic in the IOC. The two cooperate to control input and output.
Fig. 2 is a block diagram of the group throttling control logic of the present invention. The first group throttling control logic is located inside the processor interface (PI) of the NC and comprises the PIO transaction sending control logic, the PIO transaction retry state machine and the PIO transaction FIFO buffer. The PIO transaction sending control logic is connected on one side to the PIO transaction retry state machine, which controls it, and on the other side to the FIFO buffer that holds PIO transactions from the CPU, where it controls the in-order issue of the buffered transactions. PIO transactions are forwarded over the system bus to the FIFO buffer and then sent to the IOC by the message sending logic in the NC; PIO response messages from the IOC are returned to the NC through the message receiving logic in the NC.
The PIO transaction retry state machine has two states: the idle state (Idle) and the retry state (Retry). Its initial state is the idle state. The state machine changes state according to the type of PIO transaction response received by the message receiving logic: if a NACK response is received, the state machine enters the retry state; if an ACK response is received, it remains in the idle state. Once in the retry state, the state machine returns to the idle state only after the PIO transaction sending control logic has completed all PIO transactions in the FIFO that need to be retried. While the state machine is in the retry state, the WRRDY signal on the system bus is deasserted, preventing the CPU from issuing further PIO transactions to the FIFO; while it is in the idle state, WRRDY is asserted, allowing the CPU to issue new PIO transactions. After the state machine enters the retry state but before WRRDY is deasserted, the CPU may already have issued several PIO transactions; these are held in the FIFO buffer and all of them need to be retried. All PIO transactions that need retrying enter this FIFO in first-in, first-out order. Until a retried request is actually serviced, no subsequent request can complete, and the requests are retried in order. To preserve the order among these PIO transactions, the PIO transaction sending control logic marks the first PIO transaction that received a NACK response with a head mark, distinguishing it from later PIO transactions that receive NACK responses. When the head-marked request is retried, the command sent to the IOC is a retry-head command (a retry of the earliest issued PIO transaction); when a subsequent request is retried, the command sent to the IOC is a plain retry command. The IOC can thus determine which PIO transaction is the earliest. When PIO credit becomes available, it services this request first and returns an ACK-HEAD response (acknowledge-HEAD, the completion response for the earliest issued PIO transaction) to notify the PIO transaction sending control logic that the first retried transaction has been serviced. The PIO transaction sending control logic then places the head mark on the next retried transaction, which ensures that all PIO transactions complete in order. After all retried PIO transactions have completed, the state machine returns to the idle state.
The second group throttling control logic, in the IOC, consists of PIO transaction processing logic, PIO transaction sending logic, PIO transaction response receiving logic, PIO transaction response generation logic, a group of control state machines, PIO transaction buffers and a group of credit counters. The PIO transaction processing logic, the PIO transaction response generation logic and the credit counters are all connected to the control state machines. Each I/O bridge device has its own control state machine. The PIO transaction processing logic is connected to the PIO transaction sending logic, the PIO transaction buffers and the credit counters. The PIO transaction response receiving logic is connected to the PIO transaction response generation logic and is responsible for receiving the PIO response messages returned from the I/O devices. The IOC maintains a PIO credit counter for each I/O bridge device under it, indicating how many PIO transaction buffers are available to that device. When a PIO credit counter is 0, the IOC returns a NACK response to any subsequent PIO transaction request, which throttles the PIO transactions. When the PIO credit counter is not 0, each PIO request issued to an I/O device decrements the corresponding credit counter by 1, and each response received from that device increments it by 1. A separate credit counter is provided for each I/O bridge device, which achieves grouped control of the PIO transactions to different I/O devices. The credit counter of each I/O bridge device consists of three counters: a PIO credit counter, a NACKA counter (NACK responses returned to PIO transactions received while the IOC throttling control logic is in the normal working state) and a NACKB counter (NACK responses returned to PIO transactions received while the IOC throttling control logic is in a throttling working mode). The control state machine enters different states according to the current values of the credit counters. It has four working modes in total: Normal, Collect-A, Service-A and Service-B. Normal is the normal working mode, and the other three are throttling working modes. The PIO transaction sending logic decides, according to the state of the control state machine, whether a PIO transaction is sent to its destination I/O device: if the PIO credit counter of a device is 0, the PIO transaction sending logic sends no further requests to that device, every new PIO transaction is answered with NACK, and any PIO request that receives a NACK response is retried by the first group throttling control logic of the NC that issued it. Once the PIO credit counter is greater than 0, the retried PIO request is sent to the corresponding I/O device. The PIO transaction response generation logic returns a response to each received PIO transaction according to the state of the control state machine and the contents of the PIO transaction buffers. When the control state machine is in the Normal mode, ACK (acknowledge, request completed) is returned. When it is in one of the other three modes, NACK HEAD is returned to the retry-head request and NACK to other requests; for a PIO write request from the processor, the write data is returned together with the NACK.
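The response rule at the end of the paragraph above can be illustrated with a short Python sketch. It follows that paragraph literally (ACK in the Normal mode; NACK HEAD for the retry-head request and NACK for all others in the three throttling modes); the function name `pio_response` and the string encodings of modes and responses are assumptions, not part of the patent.

```python
THROTTLING_MODES = {"Collect-A", "Service-A", "Service-B"}


def pio_response(mode: str, is_retry_head: bool) -> str:
    """Pick the response type for a received PIO transaction (hypothetical)."""
    if mode == "Normal":
        return "ACK"                      # request completed
    if mode in THROTTLING_MODES:
        # Only the oldest retried transaction gets the head variant.
        return "NACK HEAD" if is_retry_head else "NACK"
    raise ValueError(f"unknown working mode: {mode}")


responses = [
    pio_response("Normal", is_retry_head=False),
    pio_response("Collect-A", is_retry_head=True),
    pio_response("Service-B", is_retry_head=False),
]
```

The sketch leaves out the write data that accompanies a NACK for PIO write requests, since the text does not specify its format.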
The state transitions of the control state machine in the IOC throttling control logic are as follows:
1. When all PIO transactions can complete and every PIO credit is greater than 0, PIO transaction processing is in the Normal working mode, and ACK is returned for all PIO transactions.
2. When the PIO credit of an IO bridge equals 0, the Collect-A working mode is entered; the IOC returns NACK to new PIO transactions, and the throttling control logic of the NC retries the uncompleted PIO transactions in order.
3. In the Collect-A working mode, when only the head retry request remains and credit is available, the Normal state is entered after that request has been processed; the processor interface can then send new PIO transactions.
4. In the Collect-A working mode, if the condition of item 3 is not met, the working state of the IOC throttling control logic is unchanged. NACKA is returned to retried PIO transactions; the throttling control logic of the NC deasserts WRRDY so that the processor may not issue new PIO transactions; NACKB is returned to PIO transactions that were issued before WRRDY could be deasserted.
5. In the Collect-A working mode, when new credit becomes available, the retry operation is processed normally; if A-phase PIO transactions still remain at that time (the A phase begins at the moment WRRDY is deasserted for lack of credit), the Service-A state is entered, and the CPU is still not allowed to issue new PIO transactions.
6. In the Service-A working mode, NACK is returned to new PIO transactions and to retried PIO requests; while credit is available and A-phase PIO transactions remain, the Service-A mode is kept.
7. In the Service-A working mode, when only the A-phase head retry request remains and credit is available, the Normal state is entered; the processor interface can then send new PIO transactions.
8. In the Service-A working mode, when credit is available, the retry operation is processed normally; if only B-phase PIO requests remain at that time, the Service-B state is entered. (The B phase runs from the moment WRRDY is asserted, when credit is available, the retried PIO head request has been serviced, and the NACKA counter is 0, to the moment WRRDY is deasserted again because unserviced A-phase requests are found.) If WRRDY is asserted, the CPU is allowed to issue new PIO transactions; otherwise the CPU is forbidden to issue new PIO transactions.
9. In the Service-B working mode, when credit is available, the retry operation is processed normally; if only B-phase PIO requests remain at that time, the Service-A state is entered. The throttling control logic in the NC still forbids the CPU to issue new PIO transactions.
10. In the Service-B working mode, NACK is returned to new PIO transactions and to retried PIO transactions; while credit is available and A-phase PIO requests remain, the Service-B mode is kept. If WRRDY is asserted, the CPU is allowed to issue new PIO transactions; otherwise the CPU is forbidden to issue new PIO transactions.
11. In the Service-B working mode, when only the A-phase head retry request remains and credit is available, the Normal mode is entered after normal processing; the processor interface can then send new PIO transactions.
Fig. 3 is a timing diagram of the four working modes of the IOC throttling control logic of the present invention. It marks the moments at which WRRDY is asserted and deasserted.
1. T1 is the moment at which the WRRDY signal is deasserted because no credit is available. All PIO requests outstanding at this moment are A-phase requests, and the IOC enters the Collect-A working mode.
2. T2 is the moment at which the WRRDY signal is asserted, when credit is available, the retried PIO head request has been serviced, and the NACKA counter is 0. T3 is the moment at which the WRRDY signal is deasserted again because unserviced A-phase requests are found. New PIO transactions issued by the CPU between T2 and T3 are B-phase transactions. At T2, if NACKA is not 0, the PIO transaction processing logic of the IOC enters the Service-A working mode; if NACKA is 0, it enters the Normal working mode.
3. T4 is the moment at which WRRDY is asserted again once all A-phase requests have been processed. T5 is the moment at which the WRRDY signal is deasserted once more because uncompleted B-phase requests are found. T6 is the moment at which the WRRDY signal is asserted again after all B-phase requests have completed. New PIO requests issued by the CPU between T4 and T5 are A-phase requests. At T4, if the NACKB counter is not 0, the PIO transaction processing logic in the IOC enters the Service-B working mode; if the NACKB counter is 0, it enters the Normal working mode. At T6, if the NACKA counter is not 0, the PIO transaction processing logic in the IOC enters the Service-A working mode; if the NACKA counter is 0, it enters the Normal working mode.
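The mode decisions taken at the marked moments can be summarized in a short Python sketch. It encodes only the rules stated for T1, T2, T4, and T6; the function name `mode_at`, the `Mode` enumeration, and the string labels for the moments are illustrative assumptions, and the sketch does not model how the counters themselves are updated.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "Normal"
    COLLECT_A = "Collect-A"
    SERVICE_A = "Service-A"
    SERVICE_B = "Service-B"


def mode_at(moment: str, nacka: int, nackb: int) -> Mode:
    """Return the working mode entered at a moment of Fig. 3 (hypothetical helper).

    moment: one of "T1", "T2", "T4", "T6".
    nacka, nackb: current values of the NACKA and NACKB counters.
    """
    if moment == "T1":
        # Credit exhausted, WRRDY deasserted: start collecting A-phase requests.
        return Mode.COLLECT_A
    if moment in ("T2", "T6"):
        # WRRDY asserted again: serve A-phase requests if any were NACKed.
        return Mode.SERVICE_A if nacka != 0 else Mode.NORMAL
    if moment == "T4":
        # A-phase requests done: serve B-phase requests if any were NACKed.
        return Mode.SERVICE_B if nackb != 0 else Mode.NORMAL
    raise ValueError(f"no mode decision is described for moment {moment!r}")
```

For example, `mode_at("T4", nacka=0, nackb=2)` yields `Mode.SERVICE_B`, while the same call with `nackb=0` yields `Mode.NORMAL`.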
The present invention addresses the loss of system performance in large-scale distributed shared IO systems that is caused by unbalanced IO device speeds and bursty IO transactions. It adopts a group throttling method that controls the transactions of different IO bridges in groups, which resolves the system congestion that might otherwise arise, allows the IO performance of the system to be fully exploited, and enables flexible configuration of IO resources. The present invention has been implemented on a high-performance computer developed independently by the National University of Defense Technology. Evaluation shows that it controls PIO transactions well in a large-scale distributed shared IO system and achieves the expected results.

Claims (4)

1. An input/output group throttling method in a large-scale distributed shared system, in which a node controller NC implements the processor interface, memory access control, and the interconnection network interface, an IO controller IOC manages the IO devices and controls IO accesses, and the IOC controls the PIO transaction flow by a processor-initiated IO (PIO) credit management method, characterized in that group throttling control logic for input and output is provided in the large-scale parallel distributed system, this control logic being divided into two parts implemented in the NC and in the IOC respectively:
The group throttling control logic in the NC is the first group throttling control logic; it is located inside the processor interface PI of the NC and comprises PIO transaction send control logic, a PIO transaction FIFO buffer, and a PIO transaction retry state machine; the PIO transaction send control logic is connected to the PIO retry state machine, which controls it; the PIO transaction send control logic is also connected to a FIFO buffer that holds the PIO transactions from the CPU, and controls the in-order issue of PIO transactions; PIO transactions pass through the FIFO and are sent to the IOC by the message send logic in the NC; PIO response messages from the IOC are returned to the NC through the message receive logic in the NC;
The group throttling control logic in the IOC is the second group throttling control logic; it consists of PIO transaction processing logic, PIO transaction send logic, PIO transaction response receive logic, PIO transaction response generation logic, a plurality of control state machines, a PIO transaction buffer, and a plurality of credit counters; the PIO transaction processing logic, the PIO transaction response generation logic, and the credit counters are all connected to the control state machines; the PIO transaction processing logic is connected to the PIO transaction send logic, the PIO transaction buffer, and the credit counters, and also to the message send logic of the NC; the PIO transaction response receive logic is connected both to the PIO transaction buffer and the credit counters and to the PIO transaction response generation logic, and is responsible for receiving the PIO response messages returned from the IO devices; the number of control state machines equals the number of credit counters and is determined by the types of IO device supported by the system, each class of IO bridge device corresponding to one state machine and one credit counter; the IO devices attached beneath different IO bridges are thus divided into groups according to IO device type, and corresponding PIO transaction credits are allocated to these groups, so that the devices of each group can use only the PIO transaction buffers occupied by that group and cannot occupy the PIO transaction buffers of other groups, thereby realizing group control of the PIO transaction flows destined for different IO devices;
The first group throttling control logic and the second group throttling control logic cooperate to accomplish group throttling control of the PIO transaction flow.
2. The input/output group throttling method in a large-scale distributed shared system as claimed in claim 1, characterized in that the PIO transaction retry state machine comprises two states, an idle state Idle and a retry state Retry; the initial state of the state machine, that is, its state at system power-on reset, is the idle state; the state machine changes state according to the type of PIO transaction response received by the message receive logic: if a NACK response is received, the state machine enters the retry state, and if an ACK response is received, it remains in the idle state; once the state machine has entered the retry state, it returns to the idle state only after the PIO transaction send control logic has completed all PIO transactions in the FIFO that need to be retried; while the state machine is in the retry state, the WRRDY signal on the system bus is deasserted to stop the CPU from issuing further PIO requests; while the state machine is in the idle state, the WRRDY signal is asserted to allow the CPU to issue new PIO transactions; PIO transactions that the CPU has already issued after the state machine enters the retry state but before the WRRDY signal is deasserted are held in the FIFO buffer; these are all transactions that need to be retried, and all PIO transactions that need to be retried enter the FIFO in first-in first-out order; until the retried request has actually been answered, no subsequent request can complete, and these requests are retried in order; to preserve the order among these PIO transactions, the PIO transaction send control logic marks the first PIO transaction that receives a NACK response as the head, to distinguish it from the PIO transactions that subsequently receive NACK responses; when the head-marked request is retried, the command sent to the IOC is a retry-head command, that is, a command to retry the earliest-issued PIO transaction, and when a subsequent request is retried, the command sent to the IOC is a simple retry command, so that the IOC can determine which PIO transaction is the earliest, process that request first when PIO credit becomes available, and return an ACK-HEAD response to notify the PI that the first retried transaction has been processed; the PI then marks the next retried transaction as the head, ensuring that all PIO transactions complete in order; after all retried PIO transactions have completed, the state machine returns to the idle state.
3. The input/output group throttling method in a large-scale distributed shared system as claimed in claim 1, characterized in that each credit counter consists of a PIO credit counter, a NACKA counter, and a NACKB counter; the PIO credit counter indicates how many PIO transaction buffers are available for its corresponding IO bridge device; when the PIO credit counter of a device is 0, the IOC returns a NACK response to any subsequent PIO transaction request; after the message receive logic in the NC receives this NACK response, the PIO transaction retry state machine enters the retry state and at the same time the WRRDY signal on the system bus is deasserted, stopping the CPU from issuing further PIO requests and thereby throttling PIO transactions; when the PIO credit counter is not 0, each PIO request issued to an IO device decrements the corresponding PIO credit counter by 1, and each response received from that device increments the corresponding PIO credit counter by 1; the NACKA counter records the number of NACK responses that the second group throttling control logic in the IOC returns to received PIO transactions while in the normal working state; the NACKB counter records the number of NACK responses that the second group throttling control logic in the IOC returns to received PIO transactions while in a throttling working mode state.
4. The input/output group throttling method in a large-scale distributed shared system as claimed in claim 1, characterized in that the control state machine changes state according to the type of PIO transaction received by the PIO transaction processing logic and the PIO transaction response received by the PIO transaction response receive logic, and at the same time updates the current value of the corresponding credit counter; there are four working mode states in total, namely the Normal mode state, the Collect-A mode state, the Service-A mode state, and the Service-B mode state; the Normal mode is the normal working state, and the other three modes are throttling working mode states:
4.1. Let T1 be the moment at which the WRRDY signal is deasserted because no credit is available; all PIO requests outstanding at this moment are A-phase requests, and the IOC enters the Collect-A working mode;
4.2. Let T2 be the moment at which the WRRDY signal is asserted, when credit is available, the retried PIO head request has been serviced, and the NACKA counter is 0; let T3 be the moment at which the WRRDY signal is deasserted again because unserviced A-phase requests are found; new PIO transactions issued by the CPU between T2 and T3 are B-phase transactions; at T2, if NACKA is not 0, the PIO transaction processing logic of the IOC enters the Service-A working mode, and if NACKA is 0, it enters the Normal working mode;
4.3. Let T4 be the moment at which WRRDY is asserted again once all A-phase requests have been processed, T5 the moment at which the WRRDY signal is deasserted once more because uncompleted B-phase requests are found, and T6 the moment at which the WRRDY signal is asserted again after all B-phase requests have completed; new PIO requests issued by the CPU between T4 and T5 are A-phase requests; at T4, if the NACKB counter is not 0, the PIO transaction processing logic in the IOC enters the Service-B working mode, and if the NACKB counter is 0, it enters the Normal working mode; at T6, if the NACKA counter is not 0, the PIO transaction processing logic in the IOC enters the Service-A working mode, and if the NACKA counter is 0, it enters the Normal working mode.
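As an illustration of the NC-side retry behaviour recited in claim 2, the following Python sketch models the Idle/Retry state machine, the FIFO of transactions awaiting retry, and the head marking. It is a simplified reading of the claim, not the claimed hardware: the class name `PioRetryStateMachine`, the method names, and the command strings `RETRY_HEAD` and `RETRY` are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class PioRetryStateMachine:
    """Toy model of the PIO transaction retry state machine in the NC."""

    def __init__(self) -> None:
        self.state = "Idle"                 # state at power-on reset
        self.pending: Deque[str] = deque()  # NACKed transactions, oldest first

    @property
    def wrrdy(self) -> bool:
        # WRRDY is asserted only in Idle, so the CPU may issue new PIOs.
        return self.state == "Idle"

    def on_nack(self, txn: str) -> None:
        """A NACK arrived for `txn`: queue it for retry and enter Retry."""
        if txn not in self.pending:
            self.pending.append(txn)
        self.state = "Retry"

    def retry_commands(self) -> List[Tuple[str, str]]:
        """Commands sent to the IOC; only the oldest transaction is the head."""
        return [("RETRY_HEAD" if i == 0 else "RETRY", txn)
                for i, txn in enumerate(self.pending)]

    def on_ack_head(self) -> str:
        """ACK-HEAD arrived: the head is done and the next one becomes head.

        Raises IndexError if no transaction is pending.
        """
        done = self.pending.popleft()
        if not self.pending:
            self.state = "Idle"             # all retries completed
        return done
```

In this model the head marking is implicit in the FIFO position, so removing the completed head automatically promotes the next queued transaction, which keeps completions in issue order.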
CNB2005100314495A 2005-04-15 2005-04-15 Input / output group throttling method in large scale distributed shared systems Expired - Fee Related CN100375080C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100314495A CN100375080C (en) 2005-04-15 2005-04-15 Input / output group throttling method in large scale distributed shared systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100314495A CN100375080C (en) 2005-04-15 2005-04-15 Input / output group throttling method in large scale distributed shared systems

Publications (2)

Publication Number Publication Date
CN1667602A CN1667602A (en) 2005-09-14
CN100375080C true CN100375080C (en) 2008-03-12

Family

ID=35038704

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100314495A Expired - Fee Related CN100375080C (en) 2005-04-15 2005-04-15 Input / output group throttling method in large scale distributed shared systems

Country Status (1)

Country Link
CN (1) CN100375080C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160224479A1 (en) * 2013-11-28 2016-08-04 Hitachi, Ltd. Computer system, and computer system control method
US9477631B2 (en) * 2014-06-26 2016-10-25 Intel Corporation Optimized credit return mechanism for packet sends

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6434636B1 (en) * 1999-10-31 2002-08-13 Hewlett-Packard Company Method and apparatus for performing high bandwidth low latency programmed I/O writes by passing tokens
US20030007457A1 (en) * 2001-06-29 2003-01-09 Farrell Jeremy J. Hardware mechanism to improve performance in a multi-node computer system
JP2004102642A (en) * 2002-09-10 2004-04-02 Mitsubishi Electric Corp Input/output bus converting unit, plant simulation device, plant control device and update method thereof
WO2004040858A1 (en) * 2002-11-01 2004-05-13 Nokia Corporation Dynamic load distribution using local state information
CN1547119A (en) * 2003-12-04 2004-11-17 中国科学院计算技术研究所 Method for constructing large-scale high-availability cluster operating system
JP2005004394A (en) * 2003-06-11 2005-01-06 Mitsubishi Electric Corp Distributed pio system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
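Research and Design of a Parallel I/O Simulator in Cluster Computing Systems. Zeng Biqing, Chen Zhigang, Deng Huimin, Liu Wei. Computing Technology and Automation, Vol. 23, No. 3. 2004 *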

Also Published As

Publication number Publication date
CN1667602A (en) 2005-09-14

Similar Documents

Publication Publication Date Title
CN101878475B (en) Delegating network processor operations to star topology serial bus interfaces
EP2406723B1 (en) Scalable interface for connecting multiple computer systems which performs parallel mpi header matching
CN100531125C (en) Arbitrating virtual channel transmit queues in a switched fabric network
JP5376371B2 (en) Network interface card used for parallel computing systems
CN100524252C (en) Embedded system chip and data read-write processing method
US7069361B2 (en) System and method of maintaining coherency in a distributed communication system
US6490630B1 (en) System and method for avoiding deadlock in multi-node network
CN100357922C (en) A general input/output architecture, protocol and related methods to implement flow control
CN101814060B (en) Method and apparatus to facilitate system to system protocol exchange in back to back non-transparent bridges
CN1608255B (en) Communicating transaction types between agents in a computer system using packet headers including an extended type/extended length field
CN102185751B (en) One-cycle router on chip based on quick path technology
CN102984123A (en) Communicating message request transaction types between agents in a computer system using multiple message groups
CN103959261A (en) Multi-core interconnect in a network processor
CN107949837A (en) Register file for I/O data packet compressings
US12081365B2 (en) Distributed system with fault tolerance and self-maintenance
US7346725B2 (en) Method and apparatus for generating traffic in an electronic bridge via a local controller
CN111858413A (en) Data scheduling method and device for PCIE (peripheral component interface express) exchange chip port
CN100375080C (en) Input / output group throttling method in large scale distributed shared systems
US7719964B2 (en) Data credit pooling for point-to-point links
US11593281B2 (en) Device supporting ordered and unordered transaction classes
US9665518B2 (en) Methods and systems for controlling ordered write transactions to multiple devices using switch point networks
US7047284B1 (en) Transfer request bus node for transfer controller with hub and ports
CN100470509C (en) Causality-based memory access ordering in a multiprocessing environment
CN100390771C (en) Processing system and method for transmitting data
JP2001320386A (en) Electronic system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080312

Termination date: 20150415

EXPY Termination of patent right or utility model