Summary of the invention
In view of this, it is necessary to provide a task scheduling method for a supercomputer that can improve operating efficiency.
The task scheduling method for a supercomputer comprises: A. obtaining the remaining computing capability of each computing node and the distance between the computing node and the node holding the task's data to be processed; B. sorting the computing nodes from near to far by the distance between each computing node and the node holding the task's data to be processed; C. scheduling the task to a computing node according to the sorting result obtained in step B. Step C comprises:
Searching the computing nodes in order from near to far according to the sorting result and, when the remaining capability of the first computing node does not satisfy the task's node capability requirement, comparing the data transmission volume produced by the following two scheduling modes: (1) after the parallel parts of the task that the first computing node can handle are scheduled to the first computing node, migrating part of the tasks running on the first computing node to the node nearest to the first computing node, until the remaining computing capability of the first computing node satisfies the node capability requirement of the parallel parts of the task that remain, and scheduling those remaining parallel parts to the first computing node; (2) scheduling the parallel parts of the task that the remaining computing capability of the first computing node can handle to the first computing node, finding a second computing node from near to far according to the sorting result, and scheduling the remaining parallel parts of the task to the second computing node. The scheduling mode is set to the one that produces the smaller data transmission volume. That is, step C selects the scheduling mode with the smaller amount of transmitted data and the faster data transmission rate, and uses it to schedule to the corresponding node the parallel parts of the task that remain after the parallel parts the first computing node can handle have been scheduled to it.
Preferably, step C comprises: searching the computing nodes in order from near to far according to the sorting result and, when the first computing node satisfies the task's node capability requirement, scheduling the task to that computing node.
Preferably, step C may comprise: searching the computing nodes in order from near to far according to the sorting result and, when the remaining computing capability of the first computing node does not satisfy the task's node capability requirement, migrating tasks running on the first computing node until the remaining computing capability of the first computing node satisfies the task's node capability requirement, and then scheduling the task to the first computing node.
Step C may also comprise: searching the computing nodes in order from near to far according to the sorting result and, when the remaining computing capability of the first computing node does not satisfy the task's node capability requirement but is greater than the task's minimum thread requirement, scheduling to the first computing node the parallel parts of the task that its remaining computing capability can handle.
Further preferably, after the parallel parts of the task are scheduled to the first computing node, the method also comprises: migrating tasks running on the first computing node until the remaining computing capability of the first computing node can handle the remaining parallel parts of the task, and scheduling the remaining parallel parts of the task to the first computing node.
Further preferably, the step of migrating the tasks running on the first computing node is: migrating the tasks running on the first computing node to a third computing node, namely the computing node nearest to the first computing node.
Preferably, after the parallel parts of the task are scheduled to the first computing node, the method also comprises: searching for a second computing node in order from near to far according to the sorting result, and scheduling to the second computing node the parallel parts of the task that remain after scheduling to the first computing node.
In the above task scheduling method for a supercomputer, the distance between each computing node and the node holding the task's data to be processed is obtained, and the task is scheduled to run on a computing node as close as possible to that data node. This effectively reduces data migration and thereby improves the operating efficiency of the supercomputer.
Embodiment
Fig. 1 shows the flow of the task scheduling method for a supercomputer in one embodiment. The detailed process is as follows:
In step S101, the remaining computing capability of each computing node and the distance between the computing node and the node holding the task's data to be processed are obtained.
In step S102, the computing nodes are sorted from near to far by the distance between each computing node and the node holding the task's data to be processed.
In step S103, the task is scheduled to a computing node according to the sorting result.
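Steps S101 and S102 can be illustrated with a minimal Python sketch. The `Node` class, its field names, the distance values, and the use of a single number for remaining capability are assumptions made for illustration; the text does not prescribe any data structure or distance unit.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    distance: float         # distance to the data node N0 (hypothetical unit)
    remaining_flops: float  # remaining computing capability (simplified to one number)


def sort_nodes_by_distance(nodes):
    """Step S102: order nodes from nearest to farthest from the data node."""
    return sorted(nodes, key=lambda n: n.distance)


# Step S101: in a real system these values would be collected from the nodes.
nodes = [
    Node("N2", distance=3.0, remaining_flops=50.0),
    Node("N1", distance=1.0, remaining_flops=20.0),
    Node("N3", distance=7.0, remaining_flops=200.0),
]

ranked = sort_nodes_by_distance(nodes)
print([n.name for n in ranked])  # ['N1', 'N2', 'N3']
```

Because `sorted` is stable, nodes at equal distance keep their original relative order; any other tie-breaking rule would be an additional design choice not covered by the text.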
The remaining computing capability of a computing node refers to the capability that each computing node in the supercomputer still has at its disposal, for example its remaining computing power (which may be expressed in flops), remaining storage capacity (which may be memory size), remaining network bandwidth (which may be the number of bytes transmitted per second), and temperature. Whether a node is able to run a task can then be judged from its remaining computing capability.
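The capability check described above might be sketched as follows. All class and field names (`Resources`, `NodeStatus`, `TaskRequirement`, `can_run`) and the sample numbers are hypothetical; the text only lists the kinds of quantities involved and does not say how they are combined, so requiring every quantity to be satisfied is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    flops: float                  # computing power
    memory_bytes: float           # storage capacity
    bandwidth_bytes_per_s: float  # network bandwidth


@dataclass
class NodeStatus:
    remaining: Resources
    temperature_c: float


@dataclass
class TaskRequirement:
    needed: Resources
    max_node_temperature_c: float  # temperature limit for nodes running the task


def can_run(node: NodeStatus, req: TaskRequirement) -> bool:
    """Assumed rule: every requirement must be met for the node to run the task."""
    r, n = node.remaining, req.needed
    return (
        r.flops >= n.flops
        and r.memory_bytes >= n.memory_bytes
        and r.bandwidth_bytes_per_s >= n.bandwidth_bytes_per_s
        and node.temperature_c <= req.max_node_temperature_c
    )


cool_node = NodeStatus(Resources(200.0, 64e9, 10e9), temperature_c=55.0)
hot_node = NodeStatus(Resources(200.0, 64e9, 10e9), temperature_c=90.0)
req = TaskRequirement(Resources(100.0, 32e9, 1e9), max_node_temperature_c=80.0)

print(can_run(cool_node, req), can_run(hot_node, req))  # True False
```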
The distance between a computing node and the node holding the task's data to be processed refers to the data transmission distance between the two nodes: the greater the distance, the slower the data transmission, and the shorter the distance, the faster the data transmission. In one embodiment, the data to be processed by task T (for example, a data search task) is stored on a certain node (node N0), and the other computing nodes in the supercomputer, ordered from near to far by their distance to node N0, are N1, N2, and so on. Data transmission between node N1 and node N0 is the fastest, followed by the other nodes in order.
Fig. 2 shows the flow of scheduling a task to a computing node according to the sorting result in one embodiment. The detailed process is as follows:
In step S201, the computing nodes are searched in order from near to far according to the sorting result. As described above, assume that the nodes, ordered from near to far by their distance to node N0 holding the task's data to be processed, are N1, N2, and so on.
In step S202, it is determined whether the remaining computing capability of node Ni (i = 1, 2, ...) satisfies the task's node capability requirement. If so, the flow proceeds to step S203; otherwise, the next node is looked up and the flow returns to step S202. In one embodiment, the computing nodes Ni are searched from near to far according to the sorting result. It is first determined whether the remaining computing capability of node N1, the node nearest to node N0, satisfies the task's node capability requirement. If so, the task can be scheduled directly to node N1; otherwise, i = i + 1 is performed, that is, the next computing node N2 is looked up and its remaining computing capability is checked against the task's node capability requirement, and so on, until a node whose remaining computing capability satisfies the requirement is found, and the task is scheduled to that node. The task's node capability requirement here may be one or more of the following parameters: the memory size required to run the task, the flops required to run the task, the bandwidth required to run the task, and the temperature limit on a node running the task.
In step S203, the task is scheduled to the computing node whose remaining computing capability satisfies the task's node capability requirement.
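The loop of steps S201 to S203 amounts to picking the nearest node that satisfies the requirement. A minimal sketch follows, assuming the nodes are already sorted from near to far and that capability is reduced to a single flops figure; the function name, the tuple layout, and returning `None` when no node qualifies are illustrative assumptions, since the text does not say what happens when the list is exhausted.

```python
def schedule_nearest_fit(ranked_nodes, required_flops):
    """Return the name of the nearest node that can run the whole task.

    ranked_nodes: list of (name, remaining_flops), sorted from near to far.
    Returns None if no node qualifies (behaviour not specified in the text).
    """
    for name, remaining_flops in ranked_nodes:   # S201: walk from near to far
        if remaining_flops >= required_flops:    # S202: capability check
            return name                          # S203: schedule here
    return None


ranked = [("N1", 20), ("N2", 50), ("N3", 200)]
print(schedule_nearest_fit(ranked, 100))  # N3
print(schedule_nearest_fit(ranked, 10))   # N1
```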
Fig. 3 shows the flow of scheduling a task to a computing node according to the sorting result in another embodiment. The detailed process is as follows:
In step S301, the computing nodes are searched in order from near to far according to the sorting result. As described above, assume that the nodes, ordered from near to far by their distance to node N0 holding the task's data to be processed, are N1, N2, and so on.
In step S302, it is determined whether the remaining computing capability of node Ni (i = 1, 2, ...) satisfies the task's node capability requirement. If so, the flow proceeds to step S303; otherwise, it proceeds to step S304. In one embodiment, it is first determined whether the remaining computing capability of node N1, the node nearest to node N0, satisfies the task's node capability requirement.
In step S303, the task is scheduled to node Ni. In one embodiment, when the remaining computing capability of the node Ni that has been found satisfies the task's node capability requirement, the task is scheduled directly to that node. For example, when the remaining computing capability of node N1, the node nearest to node N0, satisfies the task's node capability requirement, the task is scheduled to node N1. Because node N1 is the shortest distance from node N0 holding the task's data to be processed, data transmission is also the fastest, which improves the operating efficiency of the supercomputer.
In step S304, it is determined whether the remaining computing capability of node Ni is greater than the task's minimum thread requirement. If so, the flow proceeds to step S305; otherwise, the next node is looked up and the flow returns to step S302. In one embodiment, the task to be scheduled is a parallel program comprising a plurality of parallel parts. When the remaining computing capability of node Ni cannot satisfy the task's node capability requirement and is also less than the task's minimum thread requirement, node Ni cannot run the task; i = i + 1 is then performed, that is, the next node is looked up and checked against the task's node capability requirement.
In step S305, the parallel parts of the task that node Ni can handle are scheduled to node Ni. In one embodiment, when the remaining computing capability of node Ni cannot satisfy the task's node capability requirement but is greater than the task's minimum thread requirement, the parallel parts of the task that node Ni can handle are scheduled to node Ni. For example, if running task T requires 100 flops, its minimum thread requires 10 flops, and the remaining computing capability of node Ni is 20 flops, then the parallel parts amounting to 2 threads can be scheduled to node Ni.
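The arithmetic of the example can be checked with a short Python sketch. It assumes, beyond what the text states, that the task consists of equal-sized threads each needing the minimum-thread capability; the function and variable names are hypothetical.

```python
def threads_that_fit(remaining_flops, flops_per_thread):
    """Number of whole threads the node's remaining capability can host.

    Mirrors the strict "greater than" test of step S304: a node whose
    remaining capability does not exceed one thread's need hosts nothing.
    """
    if remaining_flops <= flops_per_thread:
        return 0
    return int(remaining_flops // flops_per_thread)


task_flops = 100        # capability needed by the whole task T
min_thread_flops = 10   # capability needed by the minimum thread
node_remaining = 20     # remaining capability of node Ni

placed = threads_that_fit(node_remaining, min_thread_flops)
total_threads = task_flops // min_thread_flops  # assumption: equal-sized threads
print(placed, total_threads - placed)  # 2 8
```

Under that assumption, 2 threads go to node Ni and 8 threads remain to be placed by one of the follow-up strategies.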
In one embodiment, after the parallel parts of the task that node Ni can handle are scheduled to node Ni, tasks running on node Ni can be migrated. Preferably, part of the tasks running on node Ni are migrated to the node nearest to node Ni, so that the remaining computing capability of node Ni can satisfy the capability requirement of the parallel parts of the task that remain after scheduling to node Ni, and those remaining parallel parts are then scheduled to node Ni.
In another embodiment, after the parallel parts of the task that node Ni can handle are scheduled to node Ni, the nodes are searched in order from near to far according to the sorting result to find the next nearest node (for example, node Nj), and the parallel parts of the task that remain after scheduling to node Ni can be scheduled to node Nj. It should be noted that, when the remaining parallel parts of the task are scheduled to node Nj, the method described above can be followed: it is further determined whether the remaining computing capability of node Nj satisfies the node capability requirement of the remaining parallel parts of the task. If so, the remaining parallel parts are scheduled directly to node Nj; otherwise, it is further determined whether the remaining computing capability of node Nj is greater than the task's minimum thread requirement, and so on, until all parallel parts of the task have been scheduled. Because the nodes selected to run the task take full account of the distance to the node holding the task's data to be processed, data transmission between nodes is effectively reduced, which improves the operating efficiency of the supercomputer.
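This spill-over strategy can be sketched as a single pass over the sorted nodes. The sketch assumes equal-sized threads and a single flops figure per node, neither of which is stated in the text; `schedule_parallel` and its return format are hypothetical.

```python
def schedule_parallel(ranked_nodes, total_threads, flops_per_thread):
    """Place threads on the nearest nodes first.

    ranked_nodes: list of (name, remaining_flops), sorted from near to far.
    Returns (placement, unplaced) where placement maps node name to threads.
    """
    placement = {}
    left = total_threads
    for name, remaining in ranked_nodes:
        if left == 0:
            break
        if remaining >= left * flops_per_thread:
            # S302/S303: the node can take everything that is still unplaced.
            placement[name] = left
            left = 0
        elif remaining > flops_per_thread:
            # S304/S305: the node takes only the threads it can handle.
            n = int(remaining // flops_per_thread)
            placement[name] = n
            left -= n
        # Otherwise the node cannot host even one thread; try the next node.
    return placement, left


ranked = [("N1", 20), ("N2", 5), ("N3", 50), ("N4", 100)]
placement, unplaced = schedule_parallel(ranked, total_threads=10, flops_per_thread=10)
print(placement, unplaced)  # {'N1': 2, 'N3': 5, 'N4': 3} 0
```

Node N2 is skipped because its remaining capability is below one thread's need, which corresponds to looking up the next node in step S304.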
Before the remaining parallel parts of the task are scheduled to a computing node, the data transmission volumes produced by the two embodiments above can be compared. The data transmission volume here refers to both the amount of data transmitted and the speed of data transmission; the embodiment with the smallest amount of transmitted data and the fastest data transmission rate is selected to schedule the remaining parallel parts of the task to the corresponding node.
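One possible way to make this comparison concrete is sketched below. The text names two criteria (amount of data and transmission rate) but does not say how they are combined; folding them into an estimated transfer time (amount divided by rate) is purely an assumption of this sketch, as are the class name, mode labels, and sample figures.

```python
from dataclasses import dataclass


@dataclass
class TransferEstimate:
    mode: str
    bytes_moved: float        # amount of data the mode would transmit
    rate_bytes_per_s: float   # transmission rate on the path it would use

    @property
    def seconds(self) -> float:
        # Assumed combined metric: estimated transfer time.
        return self.bytes_moved / self.rate_bytes_per_s


def pick_mode(a: TransferEstimate, b: TransferEstimate) -> TransferEstimate:
    """Choose the mode with the lower estimated transfer time (ties go to a)."""
    return a if a.seconds <= b.seconds else b


# Hypothetical figures: migrating running tasks off Ni versus sending the
# remaining parallel parts to the next node Nj.
migrate = TransferEstimate("migrate-running-tasks", bytes_moved=4e9, rate_bytes_per_s=2e9)
spill = TransferEstimate("spill-to-next-node", bytes_moved=3e9, rate_bytes_per_s=1e9)

print(pick_mode(migrate, spill).mode)  # migrate-running-tasks
```

Here the migration mode moves more data but over a faster path, so its estimated time (2 s) is lower than that of the spill-over mode (3 s).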
It should be noted that the above task scheduling method for a supercomputer can also be used for task scheduling in data centers, information centers, and game centers, so the present invention should not be limited to any particular system.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.