CN100375080C - Input / output group throttling method in large scale distributed shared systems - Google Patents
- Publication number
- CN100375080C (application CN200510031449A, CNB2005100314495A)
- Authority
- CN
- China
- Prior art keywords
- pio
- affairs
- logic
- retry
- ioc
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Transfer Systems (AREA)
- Multi Processors (AREA)
Abstract
The present invention relates to an input/output group throttling method for large-scale distributed shared systems. Its purpose is to solve the system congestion caused by unbalanced I/O device speeds and bursty I/O transaction flows. The technical scheme provides input/output group throttling control logic in a large-scale parallel distributed system. Group throttling control logic 1, comprising PIO transaction send control logic, a PIO transaction FIFO buffer, and a PIO transaction retry state machine, is located inside the processor interface PI of the NC. Group throttling control logic 2, composed of PIO transaction processing logic, PIO transaction send logic, PIO transaction response receive logic, PIO transaction response generation logic, a plurality of control state machines, PIO transaction buffers, and a plurality of credit counters, is located in the IOC. Throttling control logic 1 and throttling control logic 2 work cooperatively to complete group throttling control of the PIO transaction flow. The invention solves the system congestion caused by slow I/O devices or bursty I/O transaction flows and greatly improves system performance.
Description
Technical field
The present invention relates to input/output (IO) methods in the computer field, and in particular to a method for handling IO transaction flows in large-scale distributed shared parallel processing systems.
Background technology
Because processor technology and IO technology have developed unevenly, IO remains one of the main bottlenecks of massively parallel computer systems. Designing a high-performance IO system that provides high-bandwidth, low-latency, highly reliable IO access, so that the computing, communication, and IO performance of the system scale in balance, is one of the key ways to improve the performance of massively parallel systems. On the other hand, IO speeds are uneven, and fast and slow IO devices usually coexist in one system. How to exploit IO performance fully in a large-scale distributed shared parallel system, and avoid the performance loss caused by system congestion, has become one of the pressing problems in massively parallel systems.
Massively parallel systems today mostly adopt a shared IO structure, in which the IO resources are shared by all compute nodes and access to and control of the IO resources are carried out by hardware. Shared IO structures fall into two kinds: centralized shared and distributed shared. In centralized shared IO, the IO resources are not attached to any single compute node; they are connected to the compute nodes through a high-speed interconnection network and shared by all of them. In distributed shared IO, the IO resources are attached to different compute nodes; an access to an IO resource attached to another compute node must be forwarded by the local compute node and requires the participation of the remote compute node where the IO resource resides. IO transactions initiated by the local compute node are called local IO transactions, and IO transactions initiated by a remote compute node are called remote IO transactions. In the distributed shared structure, each node consists mainly of an NC (Node Controller), an IOC (IO Controller), and the system interconnection network. The NC implements the processor interface, memory access control, and the interconnection network interface; the IOC manages the IO devices and controls IO accesses. IO transactions are divided into PIO transactions (Process IO, IO transactions initiated by the processor) and DMA transactions (Direct Memory Access). A PIO transaction is an operation initiated by the CPU to access and control an IO device; a DMA transaction is an operation initiated by an IO device to read or write system memory.
The IOC usually controls the PIO transaction flow by PIO credit management: each time the CPU sends a PIO transaction, the PIO credit is decremented by 1; each time an IO device completes a PIO transaction, the PIO credit is incremented by 1. Once the PIO credit is exhausted (the PIO credit counter is 0), no new PIO transaction can be sent; PIO transactions can be initiated only when the PIO credit counter is greater than 0. In a large-scale distributed shared system, slow IO devices process IO transactions slowly. When the PIO transactions of a slow IO device take a long time to complete, or a burst of PIO transactions arrives within some period, the PIO credit is exhausted and the IOC can no longer issue new PIO transactions to the IO devices. Subsequently issued PIO transactions then block the IOC interface of the NC and may also block the interconnection network interface of the NC, so that even operations that need no IO cannot complete, and system performance drops sharply.
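The conventional credit scheme described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a single credit pool shared by all devices; the class name `PioCreditCounter` and its methods are hypothetical and do not come from the patent text.

```python
class PioCreditCounter:
    """Hypothetical model of one shared PIO credit pool (conventional scheme)."""

    def __init__(self, credits):
        self.credits = credits

    def try_send(self):
        """Consume one credit for an outgoing PIO transaction, if any is left."""
        if self.credits == 0:
            return False          # credit exhausted: no new PIO transaction may be sent
        self.credits -= 1
        return True

    def complete(self):
        """An IO device finished one PIO transaction: return its credit."""
        self.credits += 1


pool = PioCreditCounter(credits=2)
sent = [pool.try_send() for _ in range(3)]   # the third attempt finds no credit
pool.complete()                              # a device finally completes one transaction
sent_after = pool.try_send()                 # sending is possible again
```

With a single pool, one slow device that holds every credit stalls transactions bound for all other devices, which is the congestion the invention sets out to avoid.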
Summary of the invention
The technical problem addressed by the present invention is that, in the large-scale distributed shared parallel system described above, unbalanced IO device speeds and bursty IO transaction flows cause system congestion and thereby greatly reduce system performance. The invention proposes an input/output group throttling method for large-scale distributed shared systems (Distributed Shared Input Output Grouping Throttling, DSIOGT) that effectively resolves the congestion problem. On the one hand, it allows IO performance to be fully exploited under the distributed shared IO structure and lets IO resources scale in a balanced way; on the other hand, it keeps system performance from degrading when the IO resource configuration or the PIO transaction flow changes.
The technical solution of the present invention is to design group throttling control logic for input/output in a large-scale parallel distributed system. This control logic is divided into two parts, implemented in the NC and the IOC respectively. The group throttling control logic in the NC is the first group throttling control logic; it is located inside the processor interface (PI) of the NC and comprises PIO transaction send control logic, a PIO transaction FIFO buffer, and a PIO transaction retry state machine. The PIO transaction send control logic is connected to the PIO transaction retry state machine, which controls it; it is also connected to a FIFO buffer that holds PIO transactions from the CPU, and it controls the in-order issue of those transactions. PIO transactions pass through the FIFO and are sent to the IOC by the message send logic in the NC; PIO response messages from the IOC are returned to the NC through the message receive logic in the NC.
The PIO transaction retry state machine has two states: the idle state (Idle) and the retry state (Retry). Its initial state, that is, its state at system power-up or reset, is Idle. The state machine changes state according to the type of PIO transaction response received by the message receive logic: if a NACK response is received, the state machine enters Retry; if an ACK response is received, it stays in Idle. Once in Retry, the state machine returns to Idle only after the PIO transaction send control logic has completed all the PIO transactions in the FIFO that need to be retried. While the state machine is in Retry, the WRRDY signal on the system bus is deasserted, preventing the CPU from issuing further PIO requests; while it is in Idle, WRRDY is asserted, allowing the CPU to issue new PIO transactions. By the time the state machine has entered Retry and WRRDY has been deasserted, the CPU may already have issued several PIO transactions. These transactions are held in the FIFO buffer, and all of them need to be retried. All PIO transactions that need to be retried enter the FIFO in first-in, first-out order; until the retried request is actually serviced, no subsequent request can complete, and the requests are retried in order. To preserve the order among these PIO transactions, the PIO transaction send control logic marks the first PIO transaction that received a NACK response with a head mark, distinguishing it from later PIO transactions that received NACK responses. When the head-marked request is retried, the command sent to the IOC is a retry-head command (a retry of the earliest issued PIO transaction); when a subsequent request is retried, the command sent to the IOC is a plain retry command. The IOC can thus determine which PIO transaction is the earliest. When PIO credit becomes available, it processes this request first and returns an ACK-HEAD response (acknowledge-HEAD, the completion response for the earliest issued PIO transaction) to notify the PI that the first retried transaction has been processed. The PI then puts the head mark on the next retried transaction, which ensures that all PIO transactions complete in order. After all retried PIO transactions have completed, the state machine returns to Idle.
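The retry behaviour of the processor interface described above can be sketched as a small Python model. It is a simplified illustration, not the patented hardware: the names `RetryState`, `PiRetryLogic`, and the command strings `RETRY-HEAD` and `RETRY` are hypothetical, and transactions are represented by plain strings.

```python
from collections import deque
from enum import Enum


class RetryState(Enum):
    IDLE = "Idle"
    RETRY = "Retry"


class PiRetryLogic:
    """Hypothetical model of the PIO transaction retry state machine in the PI."""

    def __init__(self):
        self.state = RetryState.IDLE   # state at power-up or reset
        self.retry_fifo = deque()      # transactions awaiting retry, oldest first

    @property
    def wrrdy(self):
        """WRRDY is asserted only in Idle, so the CPU may issue only then."""
        return self.state is RetryState.IDLE

    def on_response(self, txn, response):
        """Update the state from a response ('ACK', 'ACK-HEAD' or 'NACK') to txn."""
        if response == "NACK":
            self.state = RetryState.RETRY
            if txn not in self.retry_fifo:
                self.retry_fifo.append(txn)      # keep first-in, first-out order
        else:
            if self.retry_fifo and self.retry_fifo[0] == txn:
                self.retry_fifo.popleft()        # head done; next entry becomes head
            if not self.retry_fifo:
                self.state = RetryState.IDLE     # nothing left to retry

    def retry_commands(self):
        """Commands to resend: RETRY-HEAD for the oldest entry, RETRY for the rest."""
        return [(txn, "RETRY-HEAD" if i == 0 else "RETRY")
                for i, txn in enumerate(self.retry_fifo)]


pi = PiRetryLogic()
pi.on_response("t1", "NACK")
pi.on_response("t2", "NACK")
commands = pi.retry_commands()
pi.on_response("t1", "ACK-HEAD")
head_after_first = pi.retry_commands()
pi.on_response("t2", "ACK-HEAD")
```

Keeping the head mark on the front of the FIFO is what lets the IOC service the oldest transaction first, so completion order matches issue order.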
The group throttling control logic in the IOC is the second group throttling control logic. It consists of PIO transaction processing logic, PIO transaction send logic, PIO transaction response receive logic, PIO transaction response generation logic, multiple control state machines, PIO transaction buffers, and multiple credit counters. The PIO transaction processing logic, the PIO transaction response generation logic, and the credit counters are all connected to the control state machines. The PIO transaction processing logic is connected to the PIO transaction send logic, the PIO transaction buffers, and the credit counters, and also to the message send logic of the NC. The PIO transaction response receive logic is connected to the PIO transaction buffers and the credit counters as well as to the PIO transaction response generation logic; it receives the PIO response messages returned from the IO devices. The number of control state machines equals the number of credit counters and is determined by the kinds of IO devices the system supports: each class of IO devices (those attached to the same IO bridge) corresponds to one state machine and one credit counter. The IO devices attached to different IO bridges are thus divided into groups by IO device type, and each group is allocated its own PIO transaction credit (corresponding to the same number of PIO transaction buffers). The devices of a group can use only the PIO transaction buffers allocated to that group and cannot occupy those of other groups, which achieves grouped control of the PIO transaction flows to different IO devices. The grouping policy gives slow IO devices fewer PIO transaction buffers and fast devices more. Because a slow IO device occupies only the buffers allocated to it, exhausting the buffers of the slow devices does not consume the buffers of the fast IO devices; the PIO transaction flow of the fast devices is unaffected, and the system as a whole does not become blocked. Each credit counter consists of a PIO credit counter, a NACKA counter, and a NACKB counter. The PIO credit counter indicates how many PIO transaction buffers are available to the corresponding IO bridge device. When the PIO credit counter of a device is 0, the IOC returns a NACK response to any subsequent PIO transaction request. When the message receive logic in the NC receives this NACK response, the PIO transaction retry state machine enters Retry and deasserts WRRDY on the system bus, preventing the CPU from issuing further PIO requests and thereby throttling the PIO transactions. When the PIO credit counter is not 0, each PIO request sent to an IO device decrements the corresponding PIO credit counter by 1, and each response received from that device increments it by 1. The NACKA counter records the number of NACK responses returned to PIO transactions received while the second group throttling control logic in the IOC is in the normal working state; the NACKB counter records the number of NACK responses returned to PIO transactions received while it is in a throttling working mode.
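The effect of grouping can be illustrated with a short Python sketch that keeps one credit count per IO bridge. It is a simplified model under stated assumptions: the class `GroupedPioCredits`, the bridge names, the credit values, and the `FORWARDED` result string are hypothetical, and the NACKA and NACKB counters are omitted.

```python
class GroupedPioCredits:
    """Hypothetical model of per-bridge PIO credit counters in the IOC."""

    def __init__(self, credits_per_bridge):
        # One independent credit count (PIO transaction buffers) per IO bridge group.
        self.credits = dict(credits_per_bridge)

    def request(self, bridge):
        """Handle a PIO request for a device behind the given bridge."""
        if self.credits[bridge] == 0:
            return "NACK"              # this group is exhausted; other groups are untouched
        self.credits[bridge] -= 1
        return "FORWARDED"             # a buffer is allocated and the request goes to the device

    def device_response(self, bridge):
        """A device behind the bridge answered: release its buffer."""
        self.credits[bridge] += 1


# Assumed allocation: fewer buffers for the slow group, more for the fast group.
ioc = GroupedPioCredits({"slow_bridge": 1, "fast_bridge": 3})
slow = [ioc.request("slow_bridge") for _ in range(2)]
fast = [ioc.request("fast_bridge") for _ in range(3)]
ioc.device_response("slow_bridge")
slow_again = ioc.request("slow_bridge")
```

Here the second request to the slow group is refused, yet all three requests to the fast group still go through, because the slow group cannot consume the fast group's buffers.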
The control state machine changes state according to the PIO transaction type received by the PIO transaction processing logic and the PIO transaction response received by the PIO transaction response receive logic, and updates the corresponding credit counter at the same time. It has four working modes in total: Normal, Collect-A, Service-A, and Service-B. Normal is the normal working state; the other three are throttling working modes:
1. Let T1 be the moment at which WRRDY is deasserted because no credit is available. All PIO requests outstanding at this moment are A-phase requests, and the IOC enters the Collect-A working mode;
2. Let T2 be the moment at which WRRDY is asserted once credit is available, the retried PIO head request has been serviced, and the NACKA counter is 0. Let T3 be the moment at which WRRDY is deasserted again because unserviced A-phase requests are found. New PIO transactions sent by the CPU between T2 and T3 are B-phase transactions. At T2, if NACKA is not 0, the PIO transaction processing logic of the IOC enters the Service-A working mode; if NACKA is 0, it enters the Normal working mode;
3. Let T4 be the moment at which WRRDY is asserted again once all A-phase requests have been processed. Let T5 be the moment at which WRRDY is deasserted again because outstanding B-phase requests are found. Let T6 be the moment at which WRRDY is asserted again after all B-phase requests have completed. New PIO requests sent by the CPU between T4 and T5 are A-phase requests. At T4, if the NACKB counter is not 0, the PIO transaction processing logic in the IOC enters the Service-B working mode; if the NACKB counter is 0, it enters the Normal working mode. At T6, if the NACKA counter is not 0, the PIO transaction processing logic in the IOC enters the Service-A working mode; if the NACKA counter is 0, it enters the Normal working mode.
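The mode decisions at T1, T2, T4, and T6 can be condensed into a single transition function. The Python sketch below is one simplified reading of the three items above, not a complete description of the state machine: it assumes that the NACKA and NACKB values passed in stand for the number of A-phase and B-phase requests still waiting, and the names `Mode` and `next_mode` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "Normal"
    COLLECT_A = "Collect-A"
    SERVICE_A = "Service-A"
    SERVICE_B = "Service-B"


def next_mode(mode, credits, nacka, nackb):
    """Return the next working mode (simplified; counters read as pending requests)."""
    if mode is Mode.NORMAL:
        # T1: credit runs out, throttling starts.
        return Mode.COLLECT_A if credits == 0 else Mode.NORMAL
    if credits == 0:
        return mode                      # nothing can be serviced without credit
    if mode is Mode.COLLECT_A:
        # T2: head retry serviced; continue with A-phase requests if any remain.
        return Mode.SERVICE_A if nacka != 0 else Mode.NORMAL
    if mode is Mode.SERVICE_A:
        if nacka != 0:
            return Mode.SERVICE_A        # A-phase requests still pending
        # T4: all A-phase requests done.
        return Mode.SERVICE_B if nackb != 0 else Mode.NORMAL
    # Service-B
    if nackb != 0:
        return Mode.SERVICE_B            # B-phase requests still pending
    # T6: all B-phase requests done.
    return Mode.SERVICE_A if nacka != 0 else Mode.NORMAL


trace = [
    next_mode(Mode.NORMAL, credits=0, nacka=0, nackb=0),
    next_mode(Mode.COLLECT_A, credits=1, nacka=2, nackb=0),
    next_mode(Mode.SERVICE_A, credits=1, nacka=0, nackb=3),
    next_mode(Mode.SERVICE_B, credits=1, nacka=0, nackb=0),
]
```

The trace walks one full cycle, from Normal through the three throttling modes and back to Normal.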
The first group throttling control logic in the NC and the second group throttling control logic in the IOC cooperate to carry out group throttling control of the PIO transaction flow. The detailed process is as follows:
1. The CPU checks the WRRDY signal on the system bus. If WRRDY is asserted, the CPU issues PIO transactions; if WRRDY is deasserted, the CPU cannot issue PIO transactions.
2. PIO transactions issued by the CPU are buffered in the FIFO of the PI unit in the NC. The PIO transaction send control logic checks the PIO transaction retry state machine. If the state machine is in Idle, the PIO transactions are sent to the IOC by the message send logic. If the state machine is in Retry, the first PIO transaction in the FIFO is given the head mark and sent to the IOC, and WRRDY is deasserted. After all retried transactions in the FIFO have completed, the state machine returns to Idle and WRRDY is asserted. In this way, limiting the PIO transactions that the CPU issues in rapid succession over a period of time throttles the PIO transaction flow during bursts of IO transactions and reduces the impact on system performance.
3. The PIO transaction processing logic in the IOC receives the PIO transaction, analyzes its type and target IO device, and checks the control state machine of that device. If the state machine is in the normal working state, a PIO transaction buffer is allocated to the transaction, the necessary transaction information is recorded, and the PIO transaction send logic sends the transaction to the target IO device; at the same time the PIO credit count of that IO device is decremented by 1 and the state machine makes the corresponding state transition. If the state machine is in a throttling working mode, no PIO transaction buffer is allocated to the transaction; the PIO transaction response generation logic generates a NACK or ACK-HEAD message and returns it to the NC, the NACKA or NACKB count of the IO device (selected by the state of the state machine) is incremented by 1, and the state machine makes the corresponding state transition.
4. The IO device receives the PIO transaction over the IO bus, processes it, and returns the response over the IO bus to the PIO transaction response receive logic in the IOC. The PIO transaction response receive logic checks and analyzes the response and matches it against the transaction information recorded in the PIO transaction buffer. The PIO transaction response generation logic then generates the corresponding response message and returns it to the NC. At the same time the PIO transaction buffer is released, the PIO credit count is incremented by 1, and the control state machine makes the corresponding state transition.
5. After the NC receives the response, the PIO retry state machine makes the necessary state transition and controls the WRRDY signal, and the response is returned to the CPU. This completes the processing of the PIO transaction.
The state transition flow of the control state machine is:
1. When all PIO transactions can complete normally and every PIO credit is greater than 0, PIO transaction processing is in the Normal working mode, and ACK is returned for all PIO transactions.
2. When the PIO credit of an IO bridge equals 0, the Collect-A working mode is entered; the IOC returns NACK to new PIO transactions, and the PIO transaction send control logic of the NC retries the outstanding PIO transactions in order.
3. In the Collect-A working mode, when only the head retry request remains and credit is available, the Normal state is entered after the request has been processed; the processor interface can then send new PIO transactions.
4. In the Collect-A working mode, if the condition of step 3 is not met, the group throttling control logic of the IOC stays in the Collect-A working mode and returns NACKA to retried PIO transactions. The PI of the NC deasserts WRRDY so that the processor cannot issue new PIO transactions; PIO transactions issued before WRRDY could be deasserted are answered with NACKB.
5. In the Collect-A working mode, when new credit becomes available, the correct retry operation is processed; if A-phase PIO transactions still remain at that point, the Service-A state is entered, and the CPU is still not allowed to issue new PIO transactions.
6. In the Service-A working mode, NACK is returned to new PIO transactions and to retried PIO requests; when credit is available and A-phase PIO transactions remain, the Service-A mode is kept.
7. In the Service-A working mode, when only the A-phase head retry request remains and credit is available, the Normal state is entered; the processor interface can then send new PIO transactions.
8. In the Service-A working mode, when credit is available, the correct retry operation is processed; if only B-phase PIO requests remain at that point, the Service-B state is entered. (The B phase runs from the moment WRRDY is asserted, once credit is available, the retried PIO head request has been serviced, and the NACKA counter is 0, to the moment WRRDY is deasserted again because unserviced A-phase requests are found.) If WRRDY is asserted, the CPU is allowed to issue new PIO transactions; otherwise the CPU is forbidden to issue new PIO transactions.
9. In the Service-B working mode, when credit is available, the correct retry operation is processed; if only B-phase PIO requests remain at that point, the Service-A state is entered. The throttling control logic in the NC still forbids the CPU from issuing new PIO transactions.
10. In the Service-B working mode, NACK is returned to new PIO transactions and to retried PIO transactions; when credit is available and A-phase PIO requests remain, the Service-B mode is kept. If WRRDY is asserted, the CPU is allowed to issue new PIO transactions; otherwise the CPU is forbidden to issue new PIO transactions.
11. In the Service-B working mode, when only the A-phase head retry request remains and credit is available, the Normal mode is entered after normal processing, and the processor interface can send new PIO transactions.
The present invention achieves the following technical effects:
1. Because the PIO transaction buffers are used in groups, it solves the problem that slow IO devices block the system and degrade performance in large-scale distributed shared IO systems.
2. Because the PIO transaction flow is throttled, it solves the problem that bursty IO transaction flows block the system and degrade performance in large-scale distributed shared IO systems.
3. Because these problems are solved, IO resources of differing performance can be configured flexibly within a system, and the IO system can scale in a balanced way.
Description of drawings
Fig. 1 is an architecture diagram of an existing large-scale distributed shared IO system;
Fig. 2 is a block diagram of the group throttling control logic of the present invention;
Fig. 3 is a timing diagram of the four working modes of the IOC throttling control logic of the present invention.
Embodiment
Fig. 1 is an architecture diagram of an existing distributed shared IO system. All nodes are connected to the system interconnection network through the interconnection network interface of the NC. The IO controller IOC is attached to the node controller NC. Multiple kinds of IO bridges, such as PCI-X bridges, InfiniBand bridges, and legacy IO bridges, are attached under the IOC, allowing many kinds of IO devices to be configured. The processor (CPU) accesses and controls all IO devices in the system through PIO transactions, and IO devices access global memory through DMA transactions. The system interconnection network interconnects all nodes in the system.
The present invention designs group throttling control logic, divided into two parts implemented in the NC and the IOC respectively: the first group throttling control logic in the NC and the second group throttling control logic in the IOC. The two cooperate to control input and output.
Fig. 2 is a block diagram of the group throttling control logic of the present invention. The first group throttling control logic is located inside the processor interface PI of the NC and comprises the PIO transaction send control logic, the PIO transaction retry state machine, and the PIO transaction FIFO buffer. The PIO transaction send control logic is connected on one side to the PIO transaction retry state machine, which controls it, and on the other side to the FIFO buffer that holds PIO transactions from the CPU, controlling the in-order issue of the PIO transactions in the FIFO buffer. PIO transactions are forwarded over the system bus to the FIFO buffer and then sent to the IOC by the message send logic in the NC; PIO response messages from the IOC are returned to the NC through the message receive logic in the NC.
The PIO transaction retry state machine has two states: the idle state (Idle) and the retry state (Retry). Its initial state is Idle. The state machine changes state according to the type of PIO transaction response received by the message receive logic: if a NACK response is received, the state machine enters Retry; if an ACK response is received, it stays in Idle. Once in Retry, the state machine returns to Idle only after the PIO transaction send control logic has completed all the PIO transactions in the FIFO that need to be retried. While the state machine is in Retry, the WRRDY signal on the system bus is deasserted, preventing the CPU from issuing further PIO transactions to the FIFO; while it is in Idle, WRRDY is asserted, allowing the CPU to issue new PIO transactions. By the time the state machine has entered Retry and WRRDY has been deasserted, the CPU may already have issued several PIO transactions. These transactions are held in the FIFO buffer, and all of them need to be retried. All PIO transactions that need to be retried enter the FIFO in first-in, first-out order; until the retried request is actually serviced, no subsequent request can complete, and the requests are retried in order. To preserve the order among these PIO transactions, the PIO transaction send control logic marks the first PIO transaction that received a NACK response with a head mark, distinguishing it from later PIO transactions that received NACK responses. When the head-marked request is retried, the command sent to the IOC is a retry-head command (a retry of the earliest issued PIO transaction); when a subsequent request is retried, the command sent to the IOC is a plain retry command. The IOC can thus determine which PIO transaction is the earliest. When PIO credit becomes available, it processes this request first and returns an ACK-HEAD response (acknowledge-HEAD, the completion response for the earliest issued PIO transaction) to notify the PIO transaction send control logic that the first retried transaction has been processed. The PIO transaction send control logic then puts the head mark on the next retried transaction, which ensures that all PIO transactions complete in order. After all retried PIO transactions have completed, the state machine returns to Idle.
The second group throttling control logic in the IOC consists of PIO transaction processing logic, PIO transaction send logic, PIO transaction response receive logic, PIO transaction response generation logic, a set of control state machines, PIO transaction buffers, and a set of credit counters. The PIO transaction processing logic, the PIO transaction response generation logic, and the credit counters are all connected to the control state machines. Each IO bridge device has its own control state machine. The PIO transaction processing logic is connected to the PIO transaction send logic, the PIO transaction buffers, and the credit counters. The PIO transaction response receive logic is connected to the PIO response generation logic and receives the PIO response messages returned from the IO devices. The IOC maintains a PIO credit counter for each IO bridge device under it, indicating how many PIO transaction buffers are available to that device. When a PIO credit counter is 0, the IOC returns a NACK response to any subsequent PIO transaction request, thereby throttling PIO transactions. When the PIO credit counter is not 0, each PIO request sent to an IO device decrements the corresponding credit counter by 1, and each response received from that device increments it by 1. A separate credit counter is provided for each IO bridge device, which achieves grouped control of the PIO transactions to different IO devices. The credit counter of each IO bridge device consists of three counters: the PIO credit counter, the NACKA counter (NACK responses returned to PIO transactions received while the IOC throttling control is in the normal working state), and the NACKB counter (NACK responses returned to PIO transactions received while the IOC throttling control logic is in a throttling mode). The control state machine enters different states according to the current values of the credit counters. It has four working modes in total: Normal, Collect-A, Service-A, and Service-B; Normal is the normal working mode, and the other three are throttling working modes. The PIO transaction send logic decides, according to the state of the control state machine, whether a PIO transaction is sent to its destination IO device: if the PIO credit counter of a device is 0, the PIO transaction send logic sends no further requests to that device, every new PIO transaction is answered with NACK, and any PIO request that receives a NACK response is retried by the first group throttling control logic of the NC that issued it. As soon as the PIO credit counter is greater than 0, the retried PIO request is sent to the corresponding IO device. The PIO transaction response generation logic returns responses to received PIO transactions according to the state of the control state machine and the contents of the PIO transaction buffers. When the control state machine is in the Normal mode, ACK (acknowledge, the request completion response) is returned. When it is in one of the other three modes, NACK HEAD is returned to the retry-head request and NACK to other requests; for a PIO write request from the processor, the write data is returned together with the NACK.
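The response rule at the end of the paragraph above can be written as a small Python function. This is an illustrative sketch only: the function name `generate_response`, the dictionary layout of a response, and the spelling `NACK-HEAD` for the NACK HEAD response are assumptions made for the example.

```python
THROTTLING_MODES = {"Collect-A", "Service-A", "Service-B"}


def generate_response(mode, is_retry_head, write_data=None):
    """Build the response for a received PIO transaction (hypothetical layout)."""
    if mode == "Normal":
        return {"type": "ACK"}                       # request completed normally
    if mode not in THROTTLING_MODES:
        raise ValueError("unknown working mode: " + mode)
    # Throttling modes: the retry-head request is distinguished from all others.
    response = {"type": "NACK-HEAD" if is_retry_head else "NACK"}
    if write_data is not None:
        response["data"] = write_data                # write data travels back with the NACK
    return response


normal = generate_response("Normal", is_retry_head=False)
head = generate_response("Service-A", is_retry_head=True)
write = generate_response("Collect-A", is_retry_head=False, write_data=b"\x01\x02")
```

Returning the write data with the NACK means the sender still holds the payload when it retries the write.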
The state transitions of the control state machine in the IOC throttling control logic are as follows:
1. all can effectively finish when all PIO affairs, and PIO credit is all greater than 0, the PIO issued transaction is a normal mode of operation, and all PIO affairs are returned ACK.
2. the PIO credit as IO bridge correspondence equals 0, enters the Collect-A mode of operation, and IOC returns NACK to new PIO affairs, and the throttling steering logic of NC is carried out the order retry to uncompleted PIO affairs.
3. under the Collect-A mode of operation,, enter the Normal state after finishing dealing with when only surplus nose heave examination request and when also creditable; Processor interface can send new PIO affairs.
4. under the Collect-A mode of operation, if do not satisfy 3 condition, the throttling steering logic duty of IOC is constant, PIO affairs to retry are returned NACKA, the throttling steering logic inefficacy WRRDY of NC, do not allow processor to flow out new PIO affairs, the PIO affairs that the WRRDY that has little time to lose efficacy flows out are returned NACKB.
5. under the Collect-A mode of operation, when new credit, correct retry operation is handled, if also have the A stage this moment (owing to have no credit, the moment that the WRRDY signal was lost efficacy) the PIO affairs enter the Service-A state, still do not allow CPU to flow out new PIO affairs.
6. under the Service-A mode of operation, NACK is returned in the PIO request of new PIO affairs and retry; When creditable and also have A stage PIO affairs to keep the Service-A pattern.
7. under the Service-A mode of operation,, enter the Normal state when the only surplus nose heave examination request of A and when creditable; Processor interface can send new PIO affairs.
8. under the Service-A mode of operation, when creditable, correct retry operation is handled, if have only this moment the B stage (be meant when the serviced and NACKA counter of PIO head request creditable, retry be 0, the WRRDY signal is by effectively constantly, to the A phase requests that find to also have not service and the moment of the WRRDY signal that lost efficacy again), the PIO request then enters the Service-B state.If WRRDY is effective, then allows CPU to flow out new PIO affairs, otherwise forbid that CPU flows out new PIO affairs.
9. under the Service-B mode of operation, when creditable, correct retry operation is handled, if having only B stage PIO request then enter the Service-A state this moment.The throttling steering logic forbids that still CPU flows out new PIO affairs among the NC.
10. under the Service-B mode of operation, the PIO affairs of new PIO affairs and retry are returned NACK; When creditable and also have A stage PIO request to keep the Service-B pattern.Effective as WRRDY, then allow CPU to flow out new PIO affairs, otherwise forbid that CPU flows out new PIO affairs.
11. under the Service-B mode of operation, when the surplus nose heave examination request of A only and when creditable, enter the Normal pattern after the normal process, processor interface can send new PIO affairs.
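The transition list above can be condensed into a small Python sketch. This is one simplified reading, not the definitive state machine: the function `next_mode` and its arguments are hypothetical, the pending-request counts are an assumed way of expressing "only the head retry request remains", and the Service-B branch is assumed to mirror Service-A (stay while B-phase work remains, otherwise go back to serving the A phase), which is an interpretation rather than a literal rendering of items 9 and 10.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "Normal"
    COLLECT_A = "Collect-A"
    SERVICE_A = "Service-A"
    SERVICE_B = "Service-B"


def next_mode(mode, credit, a_pending, b_pending):
    """Return the next mode of one group's control state machine (assumed model).

    credit     -- current PIO credit of the group
    a_pending  -- unfinished A-phase requests, including the head retry request
    b_pending  -- unfinished B-phase requests
    """
    if mode is Mode.NORMAL:
        # Item 2: running out of credit starts throttling.
        return Mode.COLLECT_A if credit == 0 else Mode.NORMAL
    if credit == 0:
        # Without credit nothing can be serviced, so the throttling mode is kept.
        return mode
    if a_pending <= 1 and b_pending == 0:
        # Items 3, 7 and 11: only the head retry request is left and credit exists.
        return Mode.NORMAL
    if mode is Mode.COLLECT_A:
        # Item 5: credit arrived while more requests are still waiting.
        return Mode.SERVICE_A
    if mode is Mode.SERVICE_A:
        # Items 6 and 8: stay while A-phase work remains, otherwise serve the B phase.
        return Mode.SERVICE_A if a_pending > 0 else Mode.SERVICE_B
    # Service-B, assumed symmetric to Service-A.
    return Mode.SERVICE_B if b_pending > 0 else Mode.SERVICE_A


print(next_mode(Mode.NORMAL, credit=0, a_pending=0, b_pending=0).value)
print(next_mode(Mode.COLLECT_A, credit=1, a_pending=3, b_pending=0).value)
```

The WRRDY handling on the NC side is deliberately left out, so the sketch only tracks which mode the IOC side would be in.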
Fig. 3 is a timing diagram of the four operating modes of the IOC throttling control logic of the present invention. It marks the moments at which WRRDY is asserted and deasserted.
1. T1 is the moment at which the WRRDY signal is deasserted because no credit is left. All PIO requests that are uncompleted at this moment are A-phase requests, and the IOC enters Collect-A mode.
2. T2 is the moment at which, with credit available, the retried PIO head request has been serviced, the NACKA counter is 0 and the WRRDY signal is asserted. T3 is the moment at which unserviced A-phase requests are found and WRRDY is deasserted again. New PIO transactions issued by the CPU between T2 and T3 are B-phase transactions. At T2, if NACKA is not 0, the PIO transaction processing logic of the IOC enters Service-A mode; if NACKA is 0, it enters Normal mode.
3. T4 is the moment at which all A-phase requests have been processed and WRRDY is asserted again. T5 is the moment at which uncompleted B-phase requests are found and WRRDY is deasserted once more. T6 is the moment at which, after all B-phase requests have completed, WRRDY is asserted again. New PIO requests issued by the CPU between T4 and T5 are A-phase requests. At T4, if the NACKB counter is not 0, the PIO transaction processing logic in the IOC enters Service-B mode; if the NACKB counter is 0, it enters Normal mode. At T6, if the NACKA counter is not 0, the PIO transaction processing logic in the IOC enters Service-A mode; if the NACKA counter is 0, it enters Normal mode.
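The mode decisions taken at T2, T4 and T6 depend only on the NACKA and NACKB counters, which the following Python sketch makes explicit. The function name `mode_when_wrrdy_asserted` and the use of strings for instants and modes are assumptions made for illustration.

```python
def mode_when_wrrdy_asserted(instant, nacka, nackb):
    """Mode entered by the IOC PIO transaction logic at T2, T4 or T6 (assumed model)."""
    if instant in ("T2", "T6"):
        # A nonzero NACKA counter means rejected A-phase requests still need service.
        return "Service-A" if nacka != 0 else "Normal"
    if instant == "T4":
        # A nonzero NACKB counter means rejected B-phase requests still need service.
        return "Service-B" if nackb != 0 else "Normal"
    raise ValueError(f"no mode decision is described for {instant}")


for instant in ("T2", "T4", "T6"):
    print(instant, mode_when_wrrdy_asserted(instant, nacka=2, nackb=0))
```

T1, T3 and T5 are moments at which WRRDY is deasserted rather than decision points, so the sketch rejects them.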
The present invention addresses the loss of system performance in large-scale distributed shared IO systems that is caused by unbalanced IO device speeds and bursty IO transactions. It adopts a group throttling method that controls the transactions of different IO bridges in separate groups, which resolves the system congestion that may otherwise arise, allows the IO performance of the system to be fully exploited, and permits flexible configuration of IO resources. The invention has been implemented on a high-performance computer developed by the National University of Defense Technology; in evaluation it controlled PIO transactions well in a large-scale distributed shared IO system and achieved the expected results.
Claims (4)
1. An input/output group throttling method in a large-scale distributed shared system, in which a node controller NC implements the processor interface, memory access control and the interconnection network interface, an IO controller IOC manages the IO devices and controls IO access, and the IOC controls the PIO transaction flow through a credit management method for processor-initiated IO, i.e. PIO, characterized in that group throttling control logic for input/output in a large-scale parallel distributed system is designed, the control logic being divided into two parts implemented in the NC and the IOC respectively:
the group throttling control logic in the NC is the first group throttling control logic, located inside the processor interface PI of the NC, and comprises PIO transaction sending control logic, a PIO transaction FIFO buffer and a PIO transaction retry state machine; the PIO transaction sending control logic is connected to the PIO retry state machine, which controls it; the PIO transaction sending control logic is also connected to a FIFO buffer that holds PIO transactions from the CPU, and controls the in-order issue of PIO transactions; PIO transactions pass through the FIFO and are sent to the IOC by the message sending logic in the NC; PIO response messages from the IOC are returned to the NC through the message receiving logic in the NC;
the group throttling control logic in the IOC is the second group throttling control logic, and consists of PIO transaction processing logic, PIO transaction sending logic, PIO transaction response receiving logic, PIO transaction response generation logic, a plurality of control state machines, a PIO transaction buffer and a plurality of credit counters; the PIO transaction processing logic, the PIO transaction response generation logic and the credit counters are all connected to the control state machines; the PIO transaction processing logic is connected to the PIO transaction sending logic, the PIO transaction buffer and the credit counters, and also to the message sending logic of the NC; the PIO transaction response receiving logic is connected to the PIO transaction buffer and the credit counters as well as to the PIO transaction response generation logic, and is responsible for receiving the PIO response messages returned from the IO devices; the number of control state machines equals the number of credit counters and is determined by the kinds of IO device the system supports, each class of IO bridge device corresponding to one state machine and one credit counter; the IO devices attached under different IO bridges are thus divided into groups by IO device type, and each group is allocated its own PIO transaction credit, so that the devices of a group can use only the PIO transaction buffers allotted to that group and cannot occupy those of other groups, thereby achieving grouped control of the PIO transaction flows to different IO devices;
the first group throttling control logic and the second group throttling control logic cooperate to complete the group throttling control of the PIO transaction flow.
2. The input/output group throttling method in a large-scale distributed shared system as claimed in claim 1, characterized in that the PIO transaction retry state machine has two states, an idle state Idle and a retry state Retry; its initial state, i.e. the state at system power-on reset, is the idle state; the state machine changes state according to the type of PIO transaction response received by the message receiving logic: if a NACK response is received it enters the retry state, and if an ACK response is received it remains in the idle state; once in the retry state, it returns to the idle state only after the PIO transaction sending control logic has completed all PIO transactions in the FIFO that need retrying; while the state machine is in the retry state, the WRRDY signal on the system bus is deasserted to stop the CPU from issuing further PIO requests; while it is in the idle state, WRRDY is asserted to allow the CPU to issue new PIO transactions; if, after the state machine has entered the retry state but before WRRDY is deasserted, the CPU has already issued several PIO transactions, those transactions are stored in the FIFO buffer and all of them need retrying; all PIO transactions that need retrying enter this FIFO in first-in-first-out order; until the retried request has actually been answered, no subsequent request can complete, and these requests are retried in order; to preserve the order among these PIO transactions, the PIO transaction sending control logic marks the first PIO transaction that received a NACK response as the head, distinguishing it from PIO transactions that receive NACK responses later; when the head-marked request is retried, the command sent to the IOC is a retry-head command, i.e. a retry of the earliest-issued PIO transaction, and when a subsequent request is retried, the command sent to the IOC is a plain retry command, so that the IOC can tell which PIO transaction is the earliest, process that request first when PIO credit becomes available, and return an ACK-HEAD response to notify the PI that the first retried transaction has been processed; the PI then marks the next retried transaction as the head, so that all PIO transactions complete in order; after all retried PIO transactions have completed, the state machine returns to the idle state.
3. The input/output group throttling method in a large-scale distributed shared system as claimed in claim 1, characterized in that each credit counter consists of a PIO credit counter, a NACKA counter and a NACKB counter; the PIO credit counter indicates how many PIO transaction buffers are available to its corresponding IO bridge device; when the PIO credit counter of a device is 0, the IOC returns a NACK response to any subsequent PIO transaction request, and after the message receiving logic in the NC receives this NACK response, the PIO transaction retry state machine enters the retry state and at the same time deasserts the WRRDY signal on the system bus, stopping the CPU from issuing further PIO requests and thereby throttling the PIO transactions; when the PIO credit counter is not 0, each time a PIO request is sent to an IO device the corresponding PIO credit counter is decremented by 1, and each time a response is received from that device the corresponding PIO credit counter is incremented by 1; the NACKA counter records the number of NACK responses returned to received PIO transactions while the second group throttling control logic in the IOC is in the normal operating state; the NACKB counter records the number of NACK responses returned to received PIO transactions while the second group throttling control logic in the IOC is in a throttling operating mode.
4. The input/output group throttling method in a large-scale distributed shared system as claimed in claim 1, characterized in that the control state machine changes state according to the PIO transaction type received by the PIO transaction processing logic and the PIO transaction response received by the PIO transaction response receiving logic, and at the same time updates the current value of the corresponding credit counter; there are four operating modes in total, Normal, Collect-A, Service-A and Service-B, of which Normal is the normal operating state and the other three are throttling operating modes:
4.1. let T1 be the moment at which the WRRDY signal is deasserted because no credit is left; all PIO requests that are uncompleted at this moment are A-phase requests, and the IOC enters Collect-A mode;
4.2. let T2 be the moment at which, with credit available, the retried PIO head request has been serviced, the NACKA counter is 0 and the WRRDY signal is asserted; let T3 be the moment at which unserviced A-phase requests are found and the WRRDY signal is deasserted again; new PIO transactions issued by the CPU between T2 and T3 are B-phase transactions; at T2, if NACKA is not 0, the PIO transaction processing logic of the IOC enters Service-A mode, and if NACKA is 0 it enters Normal mode;
4.3. let T4 be the moment at which all A-phase requests have been processed and WRRDY is asserted again, T5 the moment at which uncompleted B-phase requests are found and the WRRDY signal is deasserted once more, and T6 the moment at which, after all B-phase requests have completed, the WRRDY signal is asserted again; new PIO requests issued by the CPU between T4 and T5 are A-phase requests; at T4, if the NACKB counter is not 0, the PIO transaction processing logic in the IOC enters Service-B mode, and if the NACKB counter is 0 it enters Normal mode; at T6, if the NACKA counter is not 0, the PIO transaction processing logic in the IOC enters Service-A mode, and if the NACKA counter is 0 it enters Normal mode.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2005100314495A CN100375080C (en) | 2005-04-15 | 2005-04-15 | Input / output group throttling method in large scale distributed shared systems |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1667602A (en) | 2005-09-14 |
CN100375080C (en) | 2008-03-12 |
Family
ID=35038704
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2005100314495A Expired - Fee Related CN100375080C (en) | 2005-04-15 | 2005-04-15 | Input / output group throttling method in large scale distributed shared systems |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100375080C (en) |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20080312; Termination date: 20150415 |
EXPY | Termination of patent right or utility model | |