CN103812949B - Task scheduling and resource allocation method and system for a real-time cloud platform


Info

Publication number
CN103812949B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410080647.XA
Other languages
Chinese (zh)
Other versions
CN103812949A (en)
Inventor
张闯
陈蒙蒙
李钊
徐克付
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS

Abstract

The present invention relates to a task scheduling and resource allocation method and system for a real-time cloud platform. A global state storage module obtains the running status of the cloud platform and reports it to a global state monitoring module. The global state monitoring module, according to the running status, formulates a corresponding scheduling strategy using a task allocation matrix, a task adjacency matrix and a mask matrix. Node-driven and/or task-driven task scheduling and resource allocation are then carried out in the real-time cloud platform according to the scheduling strategy. When allocating tasks, the invention takes the relationships between tasks fully into account, which reduces the traffic between nodes and the pressure on bandwidth and thereby improves platform performance. It adapts well to the various situations of dynamic scheduling in a cloud platform, so that the platform maintains high computing performance and resource utilization throughout operation. Its time complexity is low, which makes it suitable for deployment in cloud environments with large numbers of nodes and tasks.

Description

A task scheduling and resource allocation method and system for a real-time cloud platform
Technical field
The present invention relates to the field of real-time cloud computing, and in particular to a task scheduling and resource allocation method and system for a real-time cloud platform.
Background art
The volume of data in society grows daily, and data increasingly arrives in the form of large-scale, continuous streams. Because the value of data decreases over time, data must be processed as soon as possible after it appears, rather than being cached and processed in batches. For example, a search engine handles thousands of queries per second and each page contains multiple advertisements; processing user feedback in time requires a low-latency, scalable and highly reliable processing engine. Traditional DBMSs and Map/Reduce-based approaches to real-time stream processing both struggle to meet these application demands.
For this reason, many stream computing platforms have appeared at home and abroad, such as Yahoo!'s open-source stream computing platform S4 (Simple Scalable Streaming System), Storm developed by Twitter, the commercial platform StreamBase, and Facebook's stream processing system Puma. There are also many similar systems in China, including Baidu's next-generation data stream system DStream and Taobao's real-time streaming data analysis platform Beatles. These distributed systems can significantly improve data processing capacity and reduce processing latency.
The new demand for low-latency processing of massive data streams brings new challenges to scheduling and resource allocation between tasks and nodes. Current mainstream real-time cloud platforms have the following problems:
1. Existing real-time cloud platforms, such as Twitter's Storm, allocate tasks as independent units and do not consider the relationships between tasks, whereas from the standpoint of platform efficiency, interrelated tasks should be assigned to the same or adjacent nodes;
2. Existing real-time cloud platforms consider only the CPU and memory usage of tasks, and do not consider the traffic between tasks or the upstream-downstream relationships of tasks;
3. Existing real-time cloud platforms consider only the initial or static allocation problem and ignore the key characteristic that the platform is open and that tasks and nodes change dynamically; the allocation strategy used while the platform is running therefore becomes a key factor limiting its efficiency;
4. Classical multi-core task allocation algorithms have high complexity and are advantageous only when the number of cores and the number of tasks are small; the data volume, task volume and node scale of a cloud platform all exceed what such traditional algorithms can handle, so an allocation algorithm designed for real-time cloud platforms is urgently needed.
In summary, a task scheduling and resource allocation algorithm is needed that has low time complexity, supports dynamic computation on a real-time cloud platform and adapts to situations such as dynamic changes in the cloud environment, so as to improve the task allocation efficiency and resource utilization of the cloud platform.
Summary of the invention
The technical problem to be solved by the present invention, in view of the deficiencies of the prior art, is to provide a task scheduling and resource allocation method and system for a real-time cloud platform that has low time complexity, supports dynamic computation on a real-time cloud platform, adapts to situations such as dynamic changes in the cloud environment, and effectively improves the task allocation efficiency and resource utilization of the cloud platform.
The technical scheme by which the present invention solves the above technical problem is as follows: a task scheduling and resource allocation method for a real-time cloud platform, comprising the following steps:
Step 1: a global state storage module obtains the running status of the cloud platform and reports the running status to a global state monitoring module;
Step 2: the global state monitoring module, according to the running status, formulates a corresponding scheduling strategy using a task allocation matrix ST, a task adjacency matrix TT and a mask matrix TTM;
Step 3: node-driven and/or task-driven task scheduling and resource allocation are carried out in the real-time cloud platform according to the scheduling strategy.
The beneficial effects of the invention are:
1. The relationships between tasks are fully considered when allocating tasks, which reduces the traffic between nodes and the pressure on bandwidth, and thereby improves platform performance;
2. The method adapts well to the various situations of dynamic scheduling in a cloud platform, so that the platform maintains high computing performance and resource utilization throughout operation;
3. The computational complexity is low, making the method suitable for deployment in cloud environments with large numbers of nodes and tasks.
On the basis of the above technical scheme, the present invention can also be improved as follows.
Further, the task allocation matrix ST is a matrix of n rows and m columns, in which rows represent nodes and columns represent tasks;
the task adjacency matrix TT is a matrix of m rows and m columns that represents the connections between tasks;
the mask matrix TTM is a matrix of m rows and m columns that represents the inner connections between tasks within a node; multiplying it with the task adjacency matrix TT yields a result that represents the outer connections between tasks.
Further, in step 3, the node-driven task scheduling and resource allocation cases include adding a node, node overload, node crash, and planned removal of a node;
a1. For the case of adding a node, a new row is added to the task allocation matrix ST and its elements are set to zero;
a2. For the case of node overload, a destination node is selected and the task selected for migration on the overloaded node is migrated to the destination node, and the task allocation matrix ST and the mask matrix TTM are modified accordingly,
wherein the destination node must satisfy the conditions that it is not overloaded and that the number of connections between the overloaded node and the destination node is the largest;
the task to be migrated on the overloaded node must satisfy the condition that the number of inner connections that become outer connections because of its migration, minus the number of outer connections that become inner connections because of its migration, is the smallest;
a3. For the case of node crash, a destination node is selected for each task on the crashed node, the tasks on the crashed node are migrated in turn to their corresponding destination nodes, and the task allocation matrix ST and the mask matrix TTM are modified accordingly;
wherein the destination node must satisfy the condition that the number of outer connections between the task to be migrated and the destination node is the largest;
a4. For the case of planned removal of a node, the task allocation flag of the node to be removed is set to a state in which no new tasks can be allocated; after all tasks on the node have finished running, the node is removed, at which point all elements of the row corresponding to this node in the task allocation matrix ST are 0, and this row is removed.
Further, for the case of node overload, the specific condition for selecting the destination node is as follows:
$$A^T \times M_{sd} \times A + A^T \times M_{ds} \times A \ge A^T \times M_{sk} \times A + A^T \times M_{ks} \times A$$
$$\forall k \in [1, n], \quad A = [1 \dots 1]^T$$
wherein $M_{sd}$ represents the connections sent by the overloaded node $n_s$ to the destination node $n_d$, $M_{ds}$ represents the connections sent by the destination node $n_d$ to the overloaded node $n_s$, $M_{sk}$ represents the connections sent by the overloaded node $n_s$ to node $n_k$, and $M_{ks}$ represents the connections sent by node $n_k$ to node $n_s$; nodes $n_d$ and $n_k$ are both non-overloaded nodes.
Further, for the case of node overload, the specific condition for selecting the task to be migrated on the overloaded node $n_s$ is as follows:
$$\begin{aligned} & M_{ss}(p,:) \times A + A^T \times M_{ss}(:,p) - M_{ds}(:,p) \times A - A^T \times M_{sd}(p,:) \\ \le{} & M_{ss}(k,:) \times A + A^T \times M_{ss}(:,k) - M_{ds}(:,k) \times A - A^T \times M_{sd}(k,:) \end{aligned}$$
$$\forall k : t_k \in n_s, \quad A = [1 \dots 1]^T$$
wherein $M_{ss}(p,:)$ represents the inner connections sent by task $t_p$ to the other tasks on the overloaded node $n_s$, $M_{ss}(:,p)$ represents the inner connections sent by the other tasks on node $n_s$ to task $t_p$, $M_{sd}(p,:)$ represents the outer connections sent by task $t_p$ to the tasks on node $n_d$, and $M_{ds}(:,p)$ represents the outer connections sent by the tasks on node $n_d$ to task $t_p$ on the overloaded node;
$M_{ss}(p,:) \times A + A^T \times M_{ss}(:,p)$ represents the number of inner connections that become outer connections because task $t_p$ is migrated, and $M_{ds}(:,p) \times A + A^T \times M_{sd}(p,:)$ represents the number of outer connections that become inner connections because task $t_p$ is migrated;
likewise, the right-hand side of the inequality represents, for the migration of task $t_k$, the difference between the number of inner connections that become outer connections and the number of outer connections that become inner connections.
Further, for the case of node crash, the specific condition for selecting the destination node $n_d$ for each task on the crashed node $n_s$ is:
$$M_{sd}(p,:) \times A + A^T \times M_{ds}(:,p) \ge M_{si}(p,:) \times A + A^T \times M_{is}(:,p)$$
$$\forall i \in [1, n], \quad A = [1 \dots 1]^T$$
wherein $M_{sd}(p,:)$ represents the outer connections sent by the task to be migrated $t_p$ to the tasks on the destination node $n_d$; $M_{ds}(:,p)$ represents the outer connections sent by the tasks on the destination node $n_d$ to the task to be migrated $t_p$; $M_{sd}(p,:) \times A + A^T \times M_{ds}(:,p)$ is the total number of outer connections between task $t_p$ and the tasks on the destination node $n_d$; $M_{si}(p,:)$ represents the outer connections sent by the task to be migrated $t_p$ to the tasks on node $n_i$; $M_{is}(:,p)$ represents the outer connections sent by the tasks on node $n_i$ to the task to be migrated $t_p$; and $M_{si}(p,:) \times A + A^T \times M_{is}(:,p)$ is the total number of outer connections between task $t_p$ and the tasks on node $n_i$;
the node with the largest number of connections to the task to be migrated $t_p$ is selected as the destination node of $t_p$, task $t_p$ is migrated to the destination node $n_d$, and the column vector corresponding to $t_p$ is removed from the allocation matrix ST to form a new matrix.
Further, in step 3, the task-driven task scheduling and resource allocation cases include adding a task, normal termination of a task, abnormal interruption of a task, and active migration of a task;
b1. For the case of adding a task, the task adjacency matrix TT is used to compute the node with the largest total number of connections to the new task as the destination node, the new task is allocated to the computed destination node, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly;
b2. For the case of normal termination of a task, the column corresponding to the terminated task is removed from the task allocation matrix ST, the corresponding row and column are removed from the task adjacency matrix TT, and the corresponding elements of the mask matrix TTM are modified;
b3. For the case of abnormal interruption of a task, the task is first re-executed; if it is still interrupted abnormally, it is placed in the task queue to wait for reallocation, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly;
b4. For the case of active migration of a task, the task to be migrated is migrated directly to the destination node specified by the user, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly.
Further, for the case of adding a task, the specific condition for selecting the destination node $n_d$ for the new task $t_{new}$ is:
$$A^T \times (M^d_{send})^T + A^T \times M^d_{recv} \ge A^T \times (M^i_{send})^T + A^T \times M^i_{recv}$$
$$\forall i \in [1, n], \quad A = [1 \dots 1]^T$$
wherein $M^d_{send}$ represents the connections sent by the new task $t_{new}$ to the tasks on the destination node $n_d$, and $M^d_{recv}$ represents the connections sent by the tasks on the destination node $n_d$ to the new task $t_{new}$; the sum of the two terms is the total number of connections between the new task $t_{new}$ and the destination node $n_d$. The right-hand side of the inequality represents the total number of connections between the new task $t_{new}$ and any other node $n_i$. The node with the largest number of connections to the new task $t_{new}$ is selected as the destination node of $t_{new}$.
Another technical scheme by which the present invention solves the above technical problem is as follows: a task scheduling and resource allocation system for a real-time cloud platform, comprising a client, a global state monitoring module, a global state storage module and several working nodes;
the client is used to submit tasks to the corresponding path of the global state storage module, from which each working node obtains its corresponding tasks;
the global state storage module is used to obtain the running status of each working node and report the running status to the global state monitoring module;
the global state monitoring module is used to formulate a corresponding scheduling strategy according to the reported running status, using the task allocation matrix, the task adjacency matrix and the mask matrix, and to carry out node-driven and task-driven task scheduling and resource allocation according to the scheduling strategy;
the working node is used to obtain its corresponding tasks and execute them.
On the basis of the above technical scheme, the present invention can also be improved as follows.
Further, the global state monitoring module includes a task allocation matrix unit, a task adjacency matrix unit and a mask matrix unit;
the task allocation matrix unit is used to build and modify the task allocation matrix, which represents the correspondence between tasks and working nodes;
the task adjacency matrix unit is used to build and modify the task adjacency matrix, which represents the connection relationships between tasks;
the mask matrix unit is used to build and modify the mask matrix, which represents the inner connection relationships between tasks on a single node.
Description of the drawings
Fig. 1 is a block diagram of the task scheduling and resource allocation system for a real-time cloud platform according to the present invention;
Fig. 2 is a structural block diagram of the global state monitoring module according to the present invention;
Fig. 3 is a flowchart of the task scheduling and resource allocation method for a real-time cloud platform according to the present invention;
Fig. 4 is a schematic diagram of the structure of the task allocation matrix ST after a node is added in Embodiment 1 of the present invention;
Fig. 5a is a schematic diagram of the structure of the task adjacency matrix between the overloaded node $n_s$ and the other nodes in Embodiment 2 of the present invention;
Fig. 5b is a schematic diagram of the structure of the task adjacency matrix between the overloaded node $n_s$ and the destination node $n_d$ in Embodiment 2 of the present invention;
Fig. 6 is a schematic diagram of the structure of the task adjacency matrix between the crashed node $n_s$ and the other nodes in Embodiment 3 of the present invention;
Fig. 7 is a schematic diagram of the structure of the task adjacency matrix between the new task $t_{new}$ and the other nodes after a task is added in Embodiment 4 of the present invention;
Fig. 8 is a schematic diagram of the structure of the task adjacency matrix after task $t_e$ terminates normally in Embodiment 5 of the present invention;
Fig. 9 is a schematic diagram of the structure of the task allocation matrix after task $t_p$ is actively migrated from the source node $n_s$ to the destination node $n_d$ in Embodiment 6 of the present invention;
Figs. 10a-10g show the results of running task scheduling and resource allocation in the embodiments of the present invention.
In the drawings, the parts denoted by the reference numerals are as follows:
100, client; 200, global state monitoring module; 300, global state storage module; 400, working node; 201, task allocation matrix unit; 202, task adjacency matrix unit; 203, mask matrix unit.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.
Some concepts used in the present invention are introduced first.
Node: a physical machine or a virtual machine;
Connection: the transmission of a data stream between tasks;
Inner connection: a connection between tasks on the same node;
Outer connection: a connection between nodes, including both sent and received connections;
Task allocation matrix: the allocation relationship between tasks and nodes, in which rows represent nodes and columns represent tasks; an element value of 1 indicates that the task corresponding to the column is allocated to the node corresponding to the row;
Task adjacency matrix: the connection relationships between tasks, in which both rows and columns represent tasks; an element value of 1 indicates that a connection exists between the tasks corresponding to its row and column and that it is sent by the task of the row to the task of the column; otherwise no connection exists in that direction;
Overload threshold: indicates whether a node is overloaded; if the CPU or memory usage of a node exceeds this value the node is overloaded, otherwise it is in the normal state.
Fig. 1 shows the topology of the present invention. One server is used as the client, responsible for issuing commands to the cluster and for submitting jobs, executable programs and the like. Three servers are used as the global state storage module (Zookeeper nodes), responsible for storing the global state and for communicating with the other modules. Two servers are used as the global state monitoring module (Master nodes): one monitors the working state of the whole cluster and provides fault recovery and task migration, and the other serves as a hot standby. Five servers are used as Supervisor working nodes, responsible for monitoring and controlling the Worker processes. A switch and PCI-Express are used to provide cluster network communication.
The task scheduling and resource allocation system for a real-time cloud platform according to the present invention includes a client 100, a global state monitoring module 200, a global state storage module 300 and several working nodes 400;
the client 100 is used to submit tasks to the corresponding path of the global state storage module, from which each working node obtains its corresponding tasks;
the global state storage module 300 is used to obtain the running status of each working node and report the running status to the global state monitoring module;
the global state monitoring module 200 is used to formulate a corresponding scheduling strategy according to the reported running status, using the task allocation matrix, the task adjacency matrix and the mask matrix, and to carry out node-driven and task-driven task scheduling and resource allocation according to the scheduling strategy;
the working node 400 is used to obtain its corresponding tasks and execute them.
As shown in Fig. 2, the global state monitoring module 200 includes a task allocation matrix unit 201, a task adjacency matrix unit 202 and a mask matrix unit 203;
the task allocation matrix unit 201 is used to build and modify the task allocation matrix, which represents the correspondence between tasks and working nodes;
the task adjacency matrix unit 202 is used to build and modify the task adjacency matrix, which represents the connection relationships between tasks;
the mask matrix unit 203 is used to build and modify the mask matrix, which represents the inner connection relationships between tasks on a single node.
Based on the above system, the task scheduling and resource allocation method for a real-time cloud platform according to the present invention is as follows.
As shown in Fig. 3, a task scheduling and resource allocation method for a real-time cloud platform comprises the following steps:
Step 1: the global state storage module obtains the running status of the cloud platform and reports the running status to the global state monitoring module;
Step 2: the global state monitoring module, according to the running status, formulates a corresponding scheduling strategy using the task allocation matrix, the task adjacency matrix and the mask matrix;
Step 3: node-driven and task-driven task scheduling and resource allocation are carried out in the real-time cloud platform according to the scheduling strategy.
The task allocation matrix ST is a matrix of n rows and m columns, in which rows represent nodes and columns represent tasks;
the task adjacency matrix TT is a matrix of m rows and m columns that represents the connections between tasks;
the mask matrix TTM is a matrix of m rows and m columns that represents the inner connections between tasks on a node; multiplying it with the task adjacency matrix TT sets the element values corresponding to inner connections to 0.
In the task adjacency matrix TT, an element equal to 1 indicates that a connection exists between the two tasks corresponding to its row and column; this may be an outer connection or an inner connection. The two tasks of an inner connection are located on the same node, so their traffic does not pass through the switch and can be ignored during optimization. The mask therefore sets the element values corresponding to inner connections to 0, leaving only the outer connections.
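The relationship between the three matrices can be illustrated with a small Python sketch. It is not taken from the patent: it assumes that the "multiplication" by the mask is element-wise, that TTM holds 0 for every pair of tasks on the same node and 1 otherwise, and the helper names (`node_of_task`, `build_mask`, `outer_connections`) are hypothetical.

```python
def node_of_task(ST):
    """Map each task (column of ST) to the node (row of ST) that holds it."""
    n, m = len(ST), len(ST[0])
    owner = [None] * m
    for i in range(n):
        for j in range(m):
            if ST[i][j] == 1:
                owner[j] = i
    return owner


def build_mask(ST):
    """Assumed convention: TTM[a][b] is 0 if tasks a and b share a node, else 1."""
    owner = node_of_task(ST)
    m = len(owner)
    return [[0 if owner[a] is not None and owner[a] == owner[b] else 1
             for b in range(m)] for a in range(m)]


def outer_connections(TT, TTM):
    """Element-wise product of TT and TTM: only cross-node connections survive."""
    m = len(TT)
    return [[TT[a][b] * TTM[a][b] for b in range(m)] for a in range(m)]


# Two nodes, three tasks: t0 and t1 run on node 0, t2 runs on node 1.
ST = [[1, 1, 0],
      [0, 0, 1]]
# t0 -> t1 is an inner connection, t1 -> t2 is an outer connection.
TT = [[0, 1, 0],
      [0, 0, 1],
      [0, 0, 0]]
TTM = build_mask(ST)
outer = outer_connections(TT, TTM)
```

In this example only the t1 -> t2 connection remains in `outer`, because t0 and t1 share node 0 and their connection is masked out.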
The present invention mainly addresses the dynamic scheduling problem in a cloud platform. Dynamic scheduling is divided into two classes: node-driven and task-driven.
In the node-driven case, the task allocation matrix ST must be updated, because ST represents the distribution of tasks over the nodes. No matter how the nodes change, however, the connection relationships between tasks remain unchanged, since these relationships are logical; the task adjacency matrix TT is therefore not modified. When the task allocation matrix ST changes, the mask matrix TTM must change as well, because the element values of TTM are determined by the element values of ST.
A. Node-driven:
1. Adding a node
When a node is added to the cloud platform, the number of nodes changes from n to n+1 and the size of the task allocation matrix ST changes from n × m to (n+1) × m. Each row of ST represents the task allocation of the corresponding node, so a new node appears in ST as an additional row. Since no tasks have yet been allocated to this node, all of its elements are 0. The updated allocation matrix ST is as follows:
As shown in Fig. 4, one node is added in Embodiment 1 of the present invention. The first n rows are the task allocation matrix ST before the node is added, of size n × m. After the node is added, the size of ST becomes (n+1) × m; the last row holds the elements corresponding to the new node, and since no tasks have yet been allocated to it, its element values are all 0.
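As a minimal illustration of this update, the following Python sketch appends an all-zero row to ST. The function name `add_node` and the choice to return a copy rather than modify ST in place are assumptions made for the example.

```python
def add_node(ST):
    """Return a copy of ST with one all-zero row appended for the new node."""
    m = len(ST[0]) if ST else 0
    return [row[:] for row in ST] + [[0] * m]


# n = 2 nodes and m = 3 tasks before the new node joins.
ST = [[1, 0, 1],
      [0, 1, 0]]
ST_new = add_node(ST)  # size becomes (n + 1) x m
```

TT is left untouched here, consistent with the statement that the logical connections between tasks do not depend on the set of nodes.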
2. Node overload
Node overload means that the CPU or memory usage of a node exceeds the overload threshold; some of the tasks on the overloaded node must then be migrated to other nodes so that its load returns to normal. Handling node overload consists of selecting the destination node and selecting the task to be migrated:
a. Selecting the destination node
Assume that node $n_s$ is overloaded and that some task on $n_s$ must be migrated to a destination node $n_d$.
The selection of the destination node $n_d$ must satisfy two conditions:
1) $n_d$ is not overloaded;
2) the number of connections between node $n_s$ and the destination node $n_d$ is the largest, i.e.
$$A^T \times M_{sd} \times A + A^T \times M_{ds} \times A \ge A^T \times M_{sk} \times A + A^T \times M_{ks} \times A$$
$$\forall k \in [1, n], \quad A = [1 \dots 1]^T$$
wherein $M_{sd}$ is the block of the task adjacency matrix relating the tasks on node $n_s$ to the tasks on node $n_d$ and represents the connections sent by node $n_s$ to node $n_d$: $M_{sd}(i, j) = 1$ indicates that task i on node $n_s$ sends a connection to task j on node $n_d$, otherwise no such connection exists; $M_{ds}$ is understood in the same way.
$A^T \times M_{sd} \times A$ represents the total number of connections sent by the tasks on node $n_s$ to the tasks on node $n_d$, and $A^T \times M_{ds} \times A$ represents the total number of connections sent by the tasks on node $n_d$ to the tasks on node $n_s$; their sum is the number of connections between nodes $n_s$ and $n_d$.
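The two selection conditions can be sketched in Python as follows. This is an illustration under assumptions, not the patent's implementation: the set of overloaded nodes is passed in by the caller, ties are broken by lowest node index, and the names `tasks_on`, `connections_between` and `pick_destination` are hypothetical. Summing the entries of a block plays the role of $A^T \times M \times A$.

```python
def tasks_on(ST, node):
    """Indices of the tasks (columns) assigned to the given node (row)."""
    return [j for j, v in enumerate(ST[node]) if v == 1]


def connections_between(ST, TT, a, b):
    """Connections in both directions between the tasks of node a and node b."""
    ta, tb = tasks_on(ST, a), tasks_on(ST, b)
    sent = sum(TT[i][j] for i in ta for j in tb)      # sum over block M_ab
    received = sum(TT[j][i] for i in ta for j in tb)  # sum over block M_ba
    return sent + received


def pick_destination(ST, TT, s, overloaded):
    """Non-overloaded node with the most connections to the overloaded node s."""
    candidates = [k for k in range(len(ST)) if k != s and k not in overloaded]
    if not candidates:
        return None  # no node can take over any load
    return max(candidates, key=lambda k: connections_between(ST, TT, s, k))


# Node 0 holds t0 and t1, node 1 holds t2, node 2 holds t3.
ST = [[1, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]
# t0 -> t2, t1 -> t2 and t3 -> t0.
TT = [[0, 0, 1, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 0],
      [1, 0, 0, 0]]
dest = pick_destination(ST, TT, 0, {0})         # node 1 has 2 connections
fallback = pick_destination(ST, TT, 0, {0, 1})  # node 1 excluded as overloaded
```

Node 1 is chosen in the first call because it shares two connections with node 0, against one for node 2; when node 1 is itself overloaded, node 2 is the only remaining candidate.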
b. Selecting the task to be migrated
The task selected for migration, $t_p$, must satisfy:
$$\begin{aligned} & M_{ss}(p,:) \times A + A^T \times M_{ss}(:,p) - M_{ds}(:,p) \times A - A^T \times M_{sd}(p,:) \\ \le{} & M_{ss}(k,:) \times A + A^T \times M_{ss}(:,k) - M_{ds}(:,k) \times A - A^T \times M_{sd}(k,:) \end{aligned}$$
$$\forall k : t_k \in n_s, \quad A = [1 \dots 1]^T$$
That is, on the source node $n_s$, the task with few inner connections and many outer connections to the destination node $n_d$ is selected and migrated to the destination node $n_d$.
Here $M_{ss}(p,:)$ is the part of the task adjacency matrix relating task $t_p$ to the other tasks on its node $n_s$; it is a row vector representing the connections sent by task $t_p$ to the other tasks on node $n_s$, in which an element equal to 1 indicates that task $t_p$ sends a connection to the corresponding task on node $n_s$, and otherwise no such connection exists. Likewise, $M_{ss}(:,p)$ is the analogous column vector. $M_{sd}(p,:)$ represents the outer connections sent by task $t_p$ to the tasks on node $n_d$, and $M_{ds}(:,p)$ represents the outer connections sent by the tasks on node $n_d$ to task $t_p$ on the overloaded node.
$M_{ss}(p,:) \times A + A^T \times M_{ss}(:,p)$ represents the number of inner connections that become outer connections because task $t_p$ is migrated, and $M_{ds}(:,p) \times A + A^T \times M_{sd}(p,:)$ represents the number of outer connections that become inner connections because task $t_p$ is migrated.
Fig. 5a shows the task adjacency matrix. The rows and columns corresponding to $n_s$ represent the connections of the tasks on the overloaded node $n_s$. Assume that the destination node is $n_d$; the two matrix blocks where the rows and columns of $n_s$ and $n_d$ intersect are the connections between the tasks on these two nodes. The block enclosed by the dashed circle represents the connections sent by $n_s$ to $n_d$, and the block enclosed by the solid circle represents the connections sent by $n_d$ to $n_s$; the sum of the elements of the two blocks is the total number of outer connections between the two nodes. $n_d$ is the non-overloaded node whose total number of outer connections with $n_s$ is the largest among all nodes.
Fig. 5b shows the task adjacency matrix between the overloaded node $n_s$ and the destination node $n_d$, where $n_s$ is the overloaded node, $n_d$ is the destination node, and $t_p$ is the task to be migrated on the overloaded node $n_s$. The two matrix blocks enclosed and marked with a minus sign "-" represent the outer connections between task $t_p$ and node $n_d$ before migration; after migration, task $t_p$ runs on node $n_d$, so these outer connections become inner connections and the total number of outer connections decreases. The two matrix blocks enclosed and marked with a plus sign "+" represent the connections between task $t_p$ and the other tasks on node $n_s$, which are inner connections; after migration, task $t_p$ no longer belongs to node $n_s$, so these inner connections become outer connections and the total number of outer connections increases.
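The selection of the task to migrate can be sketched in Python as below. The sketch computes, for each task on the overloaded node, the count marked "+" in Fig. 5b minus the count marked "-", and picks the task with the smallest value. The names `migration_cost`, `pick_task` and `migrate` are hypothetical, ties are broken by lowest task index, and only the ST update is shown (the corresponding TTM update is omitted).

```python
def tasks_on(ST, node):
    """Indices of the tasks (columns) assigned to the given node (row)."""
    return [j for j, v in enumerate(ST[node]) if v == 1]


def migration_cost(ST, TT, s, d, p):
    """Inner connections turned outer minus outer connections turned inner."""
    others = [j for j in tasks_on(ST, s) if j != p]
    dest = tasks_on(ST, d)
    inner_lost = sum(TT[p][j] + TT[j][p] for j in others)  # '+' blocks in Fig. 5b
    outer_saved = sum(TT[p][j] + TT[j][p] for j in dest)   # '-' blocks in Fig. 5b
    return inner_lost - outer_saved


def pick_task(ST, TT, s, d):
    """Task on node s whose migration to node d adds the fewest outer connections."""
    return min(tasks_on(ST, s), key=lambda p: migration_cost(ST, TT, s, d, p))


def migrate(ST, p, s, d):
    """Return a copy of ST with task p moved from node s to node d."""
    new_ST = [row[:] for row in ST]
    new_ST[s][p] = 0
    new_ST[d][p] = 1
    return new_ST


# Node 0 (overloaded) holds t0, t1, t2; node 1 (destination) holds t3.
ST = [[1, 1, 1, 0],
      [0, 0, 0, 1]]
# t0 -> t1 and t1 -> t2 are inner connections, t2 -> t3 is an outer connection.
TT = [[0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1],
      [0, 0, 0, 0]]
p = pick_task(ST, TT, 0, 1)
ST_after = migrate(ST, p, 0, 1)
```

Here t2 is chosen: moving it breaks one inner connection (t1 -> t2) but turns one outer connection (t2 -> t3) into an inner one, for a net change of 0, whereas moving t0 or t1 would add 1 or 2 outer connections.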
3. node is delayed machine
Causing the delay main cause of machine of node is that node overload but fails to migrate task thereon in time and makes it Load keeps normal, now need by delay on machine node all task immigrations to other nodes, it is important to Select suitable destination node nd
Machine node of assuming to delay is ns, needing migrating of task is tp, need to meet:
M sd ( p , : ) + M ds ( : , p ) ≥ M si ( p , : ) + M is ( : , p ) , ∀ i ∈ [ 1 , n ]
That is, the destination node n_d is the node with the most external connections to task t_p.
Here, M_sd(p,:) is the task adjacency matrix block between the task to be migrated t_p and the tasks on destination node n_d, and is a row vector; an element equal to 1 indicates that task t_p sends a connection to some task on n_d, and otherwise no such connection exists. Similarly, M_ds(:,p) is the task adjacency matrix block between the tasks on n_d and task t_p, and is a column vector; an element equal to 1 indicates that some task on n_d sends a connection to t_p, and otherwise no such connection exists. The sum of the elements of M_sd(p,:) and M_ds(:,p) is the total number of connections between t_p and the tasks on n_d.
Migrate task t_p to the destination node n_d and remove the column vector corresponding to t_p from the allocation matrix ST to form a new matrix.
Repeat the above process until no tasks remain on node n_s.
As shown in Fig. 6, n_s is the crashed node, the part circled with a dashed line is an arbitrary task t_k on it, and n_i is any node other than n_s. The two matrix blocks where the row and column of t_k intersect the rows and columns of n_i represent the external connections between t_k and node n_i, and the sum of the elements of the two blocks is the total number of external connections between t_k and n_i. The non-overloaded node with the largest total is the required destination node. This process is repeated to find a destination node for each task on n_s and migrate the task to it, until all tasks have been migrated.
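The evacuation of a crashed node described above might look like the following Python sketch. It is illustrative only: the function name `evacuate`, the nested-list storage of ST and TT, and the `is_overloaded` list are assumptions, and the sketch updates ST in place (as in the worked example of Fig. 10d) rather than removing the column of t_p.

```python
def evacuate(st, tt, crashed, is_overloaded):
    """Move every task off the crashed node; return {task index: destination}."""
    moves = {}
    # Snapshot the task list first, because st is modified inside the loop.
    for p in [j for j, flag in enumerate(st[crashed]) if flag]:

        def links(i):
            # Elements of M_si(p,:) plus M_is(:,p): connections between
            # task p and every task currently placed on node i.
            return sum(tt[p][q] + tt[q][p] for q, flag in enumerate(st[i]) if flag)

        candidates = [i for i in range(len(st))
                      if i != crashed and not is_overloaded[i]]
        dest = max(candidates, key=links)
        st[crashed][p] = 0
        st[dest][p] = 1
        moves[p] = dest
    return moves

# Hypothetical example: node index 1 has crashed and holds tasks 1 and 2.
st = [[1, 0, 0, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 1]]
tt = [[0] * 4 for _ in range(4)]
for a, b in [(0, 1), (2, 3), (3, 2)]:
    tt[a][b] = 1  # task a sends a connection to task b

moves = evacuate(st, tt, crashed=1, is_overloaded=[False, False, False])
```

In this example task 1 follows its only neighbour (task 0) to node index 0, and task 2 moves to node index 2, where task 3 exchanges connections with it in both directions.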
4. Planned node removal
Planned node removal means that no new tasks are assigned to the node, and the node is removed after all tasks on it have finished executing. In practice, each node is given a flag bit. The flag bit of the node to be removed is rewritten to indicate that new tasks may not be assigned to it. Once all tasks on the node have finished running, the elements of the corresponding row of the task allocation matrix ST are all 0, and this row is removed.
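A possible way to model the flag bit and the deferred row removal is sketched below. The class name `Cluster`, its attributes and its method names are hypothetical and chosen only for illustration; the text does not specify how the flag is stored.

```python
class Cluster:
    """Hypothetical holder for ST plus one 'accepts new tasks' flag per node."""

    def __init__(self, st):
        self.st = st                          # n rows (nodes) x m columns (tasks)
        self.accepts_new = [True] * len(st)   # flag bit per node

    def plan_removal(self, node):
        """Mark the node so that the scheduler stops assigning new tasks to it."""
        self.accepts_new[node] = False

    def assignable_nodes(self):
        """Nodes that may still receive new tasks."""
        return [i for i, ok in enumerate(self.accepts_new) if ok]

    def try_remove(self, node):
        """Drop the node's row once it is flagged and its row in ST is all 0."""
        if self.accepts_new[node] or any(self.st[node]):
            return False
        del self.st[node]
        del self.accepts_new[node]
        return True


cluster = Cluster([[1, 0], [0, 1]])
cluster.plan_removal(1)
still_busy = cluster.try_remove(1)   # task 1 is still running on the node
cluster.st[1][1] = 0                 # task 1 finishes
removed = cluster.try_remove(1)
```

Separating `plan_removal` from `try_remove` mirrors the two phases in the text: the flag takes effect immediately, while the row of ST is only deleted after the node has drained.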
B. Task-driven type:
1. New task
After a new task t_new is added to the cloud platform, a suitable destination node n_d must be selected for it, satisfying:
$$A^{T} \times (M_{send}^{d})^{T} + A^{T} \times M_{recv}^{d} \ge A^{T} \times (M_{send}^{i})^{T} + A^{T} \times M_{recv}^{i}, \quad \forall i \in [1,n], \quad A = [1 \; \dots \; 1]^{T}$$
This means that the destination node n_d has the most connections with the new task t_new, so assigning the task to this node effectively reduces the number of external connections. Here, M_send^d is the task adjacency matrix block between the new task t_new and the tasks on destination node n_d, and is a row vector; an element equal to 1 indicates that t_new sends a connection to some task on n_d, and otherwise no such connection exists.
A^T × (M_send^d)^T is the total number of connections sent by the new task t_new to destination node n_d, and A^T × M_recv^d is the total number of connections sent by the tasks on n_d to t_new. Their sum is the total number of connections between t_new and n_d.
As shown in Fig. 7, t_new is the newly assigned task and n_d is the destination node to which t_new is to be assigned. The task adjacency matrix blocks between t_new and the other tasks are placed in the last row and last column of the original task adjacency matrix. The matrix block where the row of t_new intersects the columns of n_d represents the connections sent by t_new to n_d, and the matrix block where the column of t_new intersects the rows of n_d represents the connections sent by n_d to t_new. Their sum is the total number of external connections between t_new and n_d. The node with the largest total is the destination node of t_new, and t_new is assigned to this node.
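The placement of a new task can be sketched as follows. This is a hedged illustration rather than the patented implementation: the function name `place_new_task` and the `send`/`recv` vectors (connections from and to t_new, indexed by existing task) are assumptions, as is the nested-list storage of ST and TT.

```python
def place_new_task(st, tt, send, recv):
    """Assign a new task to the node it shares the most connections with.

    send[q] == 1 means t_new sends a connection to existing task q;
    recv[q] == 1 means existing task q sends a connection to t_new.
    ST and TT are extended in place; the destination node index is returned.
    """
    n, m = len(st), len(tt)

    def links(i):
        # A^T x (M_send^i)^T + A^T x M_recv^i for node i
        return sum(send[q] + recv[q] for q in range(m) if st[i][q])

    dest = max(range(n), key=links)

    for i in range(n):                 # new column in ST
        st[i].append(1 if i == dest else 0)
    for q in range(m):                 # new last column in TT
        tt[q].append(recv[q])
    tt.append(list(send) + [0])        # new last row in TT
    return dest

# Hypothetical example: three nodes with one task each.
st = [[1, 0, 0],
      [0, 1, 0],
      [0, 0, 1]]
tt = [[0, 1, 0],
      [0, 0, 0],
      [0, 0, 0]]
dest = place_new_task(st, tt, send=[0, 0, 1], recv=[0, 0, 0])
```

Here the new task only talks to task 2, so it lands on node index 2 and ST and TT each grow by one column (TT also by one row).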
2. Task termination
When a task terminates normally, the column corresponding to the terminated task must be removed from the allocation matrix ST, the corresponding row and column must be removed from the task adjacency matrix TT, and the corresponding elements of the mask matrix TTM must be modified.
As shown in Fig. 8, task t_e has finished executing and terminates normally. The part inside the dashed line shows the connections of t_e in the task adjacency matrix TT, and this row and column are removed. Likewise, the corresponding column is removed from the task allocation matrix ST, and the elements related to t_e are deleted from the mask matrix TTM.
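The bookkeeping for a normally terminated task can be sketched as below. The function name `remove_task` is hypothetical, and treating "the elements related to t_e" in TTM as the row and column of t_e is an assumption made for this sketch.

```python
def remove_task(st, tt, ttm, e):
    """Delete task e from ST (its column) and from TT and TTM (row and column)."""
    for row in st:
        del row[e]
    for matrix in (tt, ttm):
        del matrix[e]
        for row in matrix:
            del row[e]

# Hypothetical example: tasks 0 and 2 run on node 0, task 1 runs on node 1.
st = [[1, 0, 1],
      [0, 1, 0]]
tt = [[0, 1, 0],
      [0, 0, 1],
      [1, 0, 0]]
ttm = [[0, 1, 0],   # assumed encoding: 0 marks a pair on the same node
       [1, 0, 1],
       [0, 1, 0]]
remove_task(st, tt, ttm, e=2)
```

After the call all three matrices describe the two remaining tasks only, so later index-based lookups stay consistent across ST, TT and TTM.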
3. Abnormal task interruption
The task is first re-executed. If the abnormal interruption occurs again, the task is not suitable for running on this node; it is put back into the task queue to wait for reallocation.
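The retry-then-requeue rule can be illustrated with a short sketch. The names `handle_abnormal_interrupt` and `run_on_node` are hypothetical; `run_on_node` stands in for whatever mechanism re-executes the task on the same node and reports success.

```python
from collections import deque

def handle_abnormal_interrupt(task, run_on_node, queue):
    """Re-run the task once on the same node; requeue it if it fails again."""
    if run_on_node(task):
        return "completed"
    queue.append(task)   # wait for the scheduler to pick another node
    return "requeued"

# Hypothetical usage: the second attempt also fails, so the task is requeued.
pending = deque()
outcome = handle_abnormal_interrupt("t3", lambda task: False, pending)
```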
4. Active task migration
Active task migration means that the user decides to migrate a certain task to a certain node. In this case no algorithm is involved and the task is migrated directly.
As shown in Fig. 9, task t_p is migrated from node n_s to node n_d. In the allocation matrix ST, the element corresponding to node n_s changes from 1 to 0, and the element corresponding to node n_d changes from 0 to 1.
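The ST update for an active migration amounts to flipping two elements, as in this minimal sketch. The function name `migrate` and the zero-based indices are assumptions of the sketch; the text itself counts rows and columns from 1.

```python
def migrate(st, task, src, dst):
    """Move `task` from node `src` to node `dst` by flipping two ST elements."""
    if st[src][task] != 1:
        raise ValueError("task is not running on the source node")
    st[src][task] = 0
    st[dst][task] = 1

# Hypothetical example: task 0 moves from node index 1 to node index 0.
st = [[0, 0],
      [1, 0],
      [0, 1]]
migrate(st, task=0, src=1, dst=0)
```

The guard against a wrong source node is an addition for safety and is not described in the text.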
Running results:
Fig. 10a shows the initial state of the cloud platform nodes and tasks, with 4 nodes (nodes 1, 2, 3 and 4) and 4 tasks (t_1, t_2, t_3 and t_4). The arrows represent the connections between tasks; a connection is sent from the task at the start of the arrow to the task at its end. On the right are the corresponding task allocation matrix ST and task adjacency matrix TT, in that order.
Fig. 10b shows the result of adding a node. Node 5 is added to the cloud platform, and a row is appended to the end of the task allocation matrix ST. Since no task has been assigned to it yet, all elements of this row are 0. TT does not change.
Starting from Fig. 10b, assume that node 3 is overloaded and that none of the other nodes is overloaded. Following the algorithm in the detailed description, some of the tasks on the overloaded node 3 must be migrated. First, a destination node is selected for the task to be migrated. Nodes 2 and 4 have the most external connections with node 3, namely 1 each, while nodes 1 and 5 have 0 connections with node 3. Other conditions show that the load of node 4 is lower than that of node 2, so node 4 is selected as the destination node. Second, the task to be migrated is selected. If t_2 were migrated, its number of external connections after the migration would be 2; if t_3 were migrated, its number of external connections after the migration would be 1, which is less than 2. Therefore t_3 is selected and migrated to node 4. In the corresponding task allocation matrix ST, the element in row 3, column 3 is rewritten to 0 and the element in row 4, column 3 is rewritten to 1; the other elements are unchanged. The result of the migration is shown in Fig. 10c. TT does not change.
Starting from Fig. 10c, assume that node 2 crashes. Fig. 10d shows the result of the node crash, where the dashed box around node 2 marks the crashed node. Following the algorithm in the detailed description, t_1 must be migrated to another node. If t_1 were migrated to node 1, 4 or 5, its number of external connections would be 1; if it were migrated to node 3, its number of external connections would be 0, which is less than 1. Therefore node 3 is selected as the destination node and task t_1 on node 2 is migrated to node 3. In the corresponding task allocation matrix ST, the element in row 2, column 1 is rewritten to 0 and the element in row 3, column 1 is rewritten to 1. The task adjacency matrix TT does not change.
Fig. 10e shows the addition of task t_5 starting from the state of Fig. 10c. Following the algorithm in the detailed description, the numbers of connections between t_5 and nodes 1, 2, 3, 4 and 5 are 0, 0, 1, 0 and 0 respectively. The number of connections with node 3 is the largest, so node 3 is selected as the destination node of t_5. A column for t_5 is added to the corresponding task allocation matrix ST with initial element values of 0, and the element in row 3, column 5 is rewritten to 1. A row and a column are added to the task adjacency matrix TT with initial element values of 0, and then the element in row 3, column 5 of TT is rewritten to 1; the other element values are unchanged.
In Fig. 10f, task t_5 has finished executing, which is indicated by the dashed ellipse. In the corresponding task allocation matrix ST, the element in row 3, column 5 is rewritten to 0 and column 5 is removed. In the corresponding task adjacency matrix TT, row 5 and column 5 are removed.
In Fig. 10g, task t_1 is actively migrated to node 1. In the corresponding task allocation matrix ST, the element in row 2, column 1 is rewritten to 0 and the element in row 1, column 1 is rewritten to 1. The task adjacency matrix TT does not change.
The optimization effect is evaluated mainly by the following metrics:
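Task distribution balance AVG: the difference between the number of tasks assigned to the node with the most tasks and the number assigned to the node with the fewest tasks, where st_ij is the element of ST in row i, column j and the maximum and minimum are taken over the nodes i:
$$AVG = \max \sum_{j=1}^{m} st_{ij} - \min \sum_{j=1}^{m} st_{ij}$$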
Switch traffic COMM: the number of external connections that pass through the switch:
$$A = [1 \; \dots \; 1]^{T}$$
$$COMM = A^{T} \times (TT * TTM) \times A$$
Here, TT is the task adjacency matrix, which represents the connections between tasks (the total of internal and external connections), and TTM is the mask matrix (which represents the internal connections between the tasks within a node). Multiplying TT by TTM masks out the internal connections, and the result is the external connections between tasks.
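Both metrics can be computed directly from the matrices, as in the following sketch. Several points are assumptions rather than statements from the text: the product TT * TTM is read as an element-wise product, the mask is encoded with 0 for task pairs on the same node and 1 otherwise (so that the product keeps only external connections), and the function names `avg_metric`, `outer_mask` and `comm_metric` are hypothetical.

```python
def avg_metric(st):
    """AVG: task count on the fullest node minus task count on the emptiest."""
    loads = [sum(row) for row in st]
    return max(loads) - min(loads)

def outer_mask(st):
    """Build a mask with 0 for task pairs on the same node and 1 otherwise.

    Assumes every task is assigned to exactly one node.
    """
    m = len(st[0])
    node_of = [next(i for i, row in enumerate(st) if row[j]) for j in range(m)]
    return [[0 if node_of[a] == node_of[b] else 1 for b in range(m)]
            for a in range(m)]

def comm_metric(tt, ttm):
    """COMM = A^T x (TT * TTM) x A: the sum of all unmasked connections."""
    m = len(tt)
    return sum(tt[a][b] * ttm[a][b] for a in range(m) for b in range(m))

# Hypothetical example: node 0 runs tasks 0 and 1, nodes 1 and 2 run one task each.
st = [[1, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]
tt = [[0] * 4 for _ in range(4)]
for a, b in [(0, 1), (1, 2), (2, 3), (3, 2)]:
    tt[a][b] = 1  # task a sends a connection to task b

avg = avg_metric(st)
comm = comm_metric(tt, outer_mask(st))
```

In this example AVG is 1 (loads of 2, 1 and 1), and COMM is 3 because the connection from task 0 to task 1 stays inside node 0 and is masked out, while the other three connections cross node boundaries.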
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (9)

1. A task scheduling and resource allocation method for a real-time cloud platform, characterized by comprising the following steps:
Step 1: a global state storage module obtains the running status of the cloud platform and reports the running status to a global state monitoring module;
Step 2: the global state monitoring module, according to the running status, uses the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM to formulate a corresponding scheduling strategy;
Step 3: node-driven and/or task-driven task scheduling and resource allocation are carried out in the real-time cloud platform according to the scheduling strategy;
the task allocation matrix ST is a matrix with n rows and m columns, in which rows represent nodes and columns represent tasks,
the task adjacency matrix TT is a matrix with m rows and m columns, which represents the connections between tasks,
the mask matrix TTM is a matrix with m rows and m columns, which represents the internal connections between tasks within a node; multiplying it by the task adjacency matrix TT gives a result that represents the external connections between tasks.
2. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, in step 3, the node-driven task scheduling and resource allocation covers the cases of a newly added node, node overload, node crash and planned node removal;
a1. for a newly added node, a row is added to the task allocation matrix ST and the corresponding elements are set to zero;
a2. for node overload, a destination node is selected, the task selected for migration on the overloaded node is migrated to the destination node, and the task allocation matrix ST and the mask matrix TTM are modified accordingly,
wherein the destination node must satisfy the conditions that it is not overloaded and that the number of connections between the overloaded node and the destination node is the largest;
the task to be migrated on the overloaded node is the task for which the number of internal connections that become external connections because of its migration, minus the number of external connections that become internal connections because of its migration, is the smallest;
a3. for a node crash, a destination node is selected for each task on the crashed node, the tasks on the crashed node are migrated in turn to their corresponding destination nodes, and the task allocation matrix ST and the mask matrix TTM are modified accordingly;
wherein the destination node must satisfy the condition that the number of external connections between the task to be migrated and the corresponding destination node is the largest;
a4. for planned node removal, the task allocation flag bit of the node to be removed is changed to the state in which new tasks may not be assigned, and once all tasks on the node have finished running, the node is removed; the elements of the corresponding row of the task allocation matrix ST are then all 0, and this row is removed.
3. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for node overload, the specific condition for selecting the destination node is as follows,
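$$A^{T} \times M_{sd} \times A + A^{T} \times M_{ds} \times A \ge A^{T} \times M_{sk} \times A + A^{T} \times M_{ks} \times A, \quad k \in [1,n], \quad A = [1 \; \dots \; 1]^{T}$$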
wherein M_sd represents the connections sent from the overloaded node n_s to the destination node n_d, M_ds represents the connections sent from n_d to n_s, M_sk represents the connections sent from n_s to node n_k, M_ks represents the connections sent from n_k to n_s, and nodes n_d and n_k are both non-overloaded nodes.
4. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for node overload, the specific condition for selecting the task to be migrated on the overloaded node n_s is as follows:
$$M_{ss}(p,:) \times A + A^{T} \times M_{ss}(:,p) - M_{ds}(:,p) \times A - A^{T} \times M_{sd}(p,:) \le M_{ss}(k,:) \times A + A^{T} \times M_{ss}(:,k) - M_{ds}(:,k) \times A - A^{T} \times M_{sd}(k,:), \quad \forall k, \; t_k \in n_s, \quad A = [1 \; \dots \; 1]^{T}$$
wherein M_ss(p,:) represents the internal connections sent by task t_p to the other tasks on the overloaded node n_s, M_ss(:,p) represents the internal connections sent by the other tasks on n_s to t_p, M_sd(p,:) represents the external connections sent by t_p to the tasks on node n_d, and M_ds(:,p) represents the external connections sent by the tasks on n_d to task t_p on the overloaded node;
M_ss(p,:) × A + A^T × M_ss(:,p) is the number of internal connections that become external connections because of the migration of t_p, and M_ds(:,p) × A + A^T × M_sd(p,:) is the number of external connections that become internal connections because of the migration of t_p;
similarly, the right-hand side of the inequality is the difference, for the migration of task t_k, between the number of internal connections that become external connections and the number of external connections that become internal connections.
5. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for a node crash, the specific condition for selecting a destination node n_d for each task on the crashed node n_s is,
$$M_{sd}(p,:) \times A + A^{T} \times M_{ds}(:,p) \ge M_{si}(p,:) \times A + A^{T} \times M_{is}(:,p), \quad \forall i \in [1,n], \quad A = [1 \; \dots \; 1]^{T}$$
wherein M_sd(p,:) represents the external connections sent by the task to be migrated t_p to the tasks on the destination node n_d; M_ds(:,p) represents the external connections sent by the tasks on n_d to t_p; M_sd(p,:) × A + A^T × M_ds(:,p) is the total number of external connections between t_p and the tasks on n_d; M_si(p,:) represents the external connections sent by t_p to the tasks on node n_i; M_is(:,p) represents the external connections sent by the tasks on n_i to t_p; and M_si(p,:) × A + A^T × M_is(:,p) is the total number of external connections between t_p and the tasks on n_i;
the node with the most connections to the task to be migrated t_p is selected as the destination node of t_p, t_p is migrated to the destination node n_d, and the column vector corresponding to t_p is removed from the allocation matrix ST to form a new matrix.
6. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, in step 3, the task-driven task scheduling and resource allocation covers the cases of a new task, normal task termination, abnormal task interruption and active task migration;
b1. for a new task, the task adjacency matrix TT is used to find the node with the largest total number of connections to the new task, this node is taken as the destination node, the new task is assigned to it, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly;
b2. for normal task termination, the column corresponding to the terminated task is removed from the task allocation matrix ST, the corresponding row and column are removed from the task adjacency matrix TT, and the corresponding elements of the mask matrix TTM are modified;
b3. for abnormal task interruption, the task is first re-executed; if the abnormal interruption still occurs, the task is put into the task queue to wait for reallocation, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly;
b4. for active task migration, the task to be migrated is migrated directly to the destination node specified by the user, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly.
7. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for a new task, the specific condition for selecting the destination node n_d for the new task t_new is,
$$A^{T} \times (M_{send}^{d})^{T} + A^{T} \times M_{recv}^{d} \ge A^{T} \times (M_{send}^{i})^{T} + A^{T} \times M_{recv}^{i}, \quad \forall i \in [1,n], \quad A = [1 \; \dots \; 1]^{T}$$
wherein M_send^d represents the connections sent by the new task t_new to the tasks on the destination node n_d; M_recv^d represents the connections sent by the tasks on n_d to the new task t_new, and the sum of the two terms is the total number of connections between t_new and n_d; the right-hand side of the inequality is the total number of connections between t_new and any other node n_i; the node with the most connections to t_new is selected as the destination node of t_new.
8. A system implementing the task scheduling and resource allocation method for a real-time cloud platform according to any one of claims 1-7, characterized by comprising a client, a global state monitoring module, a global state storage module and several working nodes;
the client is used to submit tasks to the corresponding path of the global state storage module, from which each working node obtains its corresponding tasks;
the global state storage module is used to obtain the running status of each working node and report the running status to the global state monitoring module;
the global state monitoring module is used to formulate a corresponding scheduling strategy from the reported running status by using the task allocation matrix, the task adjacency matrix and the mask matrix, and to carry out node-driven and task-driven task scheduling and resource allocation according to the scheduling strategy;
the working node is used to obtain its corresponding tasks and execute them.
9. The task scheduling and resource allocation system for a real-time cloud platform, characterized in that the global state monitoring module comprises a task allocation matrix unit, a task adjacency matrix unit and a mask matrix unit;
the task allocation matrix unit is used to create and modify the task allocation matrix, which represents the correspondence between tasks and working nodes;
the task adjacency matrix unit is used to create and modify the task adjacency matrix, which represents the connection relationships between tasks;
the mask matrix unit is used to create and modify the mask matrix, which represents the internal connection relationships between the tasks on a single node.
CN201410080647.XA 2014-03-06 2014-03-06 A kind of task scheduling towards real-time cloud platform and resource allocation methods and system Expired - Fee Related CN103812949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410080647.XA CN103812949B (en) 2014-03-06 2014-03-06 A kind of task scheduling towards real-time cloud platform and resource allocation methods and system

Publications (2)

Publication Number Publication Date
CN103812949A CN103812949A (en) 2014-05-21
CN103812949B true CN103812949B (en) 2016-09-07

Family

ID=50709142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410080647.XA Expired - Fee Related CN103812949B (en) 2014-03-06 2014-03-06 A kind of task scheduling towards real-time cloud platform and resource allocation methods and system

Country Status (1)

Country Link
CN (1) CN103812949B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104270421B (en) * 2014-09-12 2017-12-19 北京理工大学 A kind of multi-tenant cloud platform method for scheduling task for supporting Bandwidth guaranteed
CN105589756B (en) * 2014-12-03 2019-02-15 中国银联股份有限公司 Batch processing group system and method
CN104636204B (en) * 2014-12-04 2018-06-01 中国联合网络通信集团有限公司 A kind of method for scheduling task and device
CN104917825A (en) * 2015-05-20 2015-09-16 中国科学院信息工程研究所 Load balancing method for real time stream computing platform
CN105447187B (en) * 2015-12-15 2017-09-22 广州神马移动信息科技有限公司 Web search method and system
SG11201803928UA (en) * 2015-12-17 2018-06-28 Ab Initio Technology Llc Processing data using dynamic partitioning
CN106375419A (en) * 2016-08-31 2017-02-01 东软集团股份有限公司 Deployment method and device of distributed cluster
CN107450855B (en) * 2017-08-08 2020-06-19 浪潮云信息技术有限公司 Model-variable data distribution method and system for distributed storage
CN109726004B (en) * 2017-10-27 2021-12-03 中移(苏州)软件技术有限公司 Data processing method and device
CN108234668A (en) * 2018-01-17 2018-06-29 北京网信云服信息科技有限公司 The dispatching method and system of a kind of consumer queue
CN109358954B (en) * 2018-09-21 2021-11-02 成都理工大学 Preemptive scheduling method of overload real-time system based on MaxSAT optimal solution
CN109815019B (en) * 2019-02-03 2021-06-15 普信恒业科技发展(北京)有限公司 Task scheduling method and device, electronic equipment and readable storage medium
CN111352712B (en) * 2020-02-25 2020-12-22 国网江苏省电力有限公司信息通信分公司 Cloud computing task tracking processing method and device, cloud computing system and server
WO2024020897A1 (en) * 2022-07-27 2024-02-01 西门子股份公司 Method and apparatus for allocating computing task between computing devices, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102232282A (en) * 2010-10-29 2011-11-02 华为技术有限公司 Method and apparatus for realizing load balance of resources in data center
CN102508714A (en) * 2011-11-03 2012-06-20 南京邮电大学 Green-computer-based virtual machine scheduling method for cloud computing
CN102681899A (en) * 2011-03-14 2012-09-19 金剑 Virtual computing resource dynamic management system of cloud computing service platform
CN103095599A (en) * 2013-01-18 2013-05-08 浪潮电子信息产业股份有限公司 Dynamic feedback weighted integration load scheduling method of cloud computing operating system


Also Published As

Publication number Publication date
CN103812949A (en) 2014-05-21

Similar Documents

Publication Publication Date Title
CN103812949B (en) A kind of task scheduling towards real-time cloud platform and resource allocation methods and system
CN103870340B (en) Data processing method, control node and stream calculation system in stream calculation system
CN103309738B (en) User job dispatching method and device
CN103078941B (en) A kind of method for scheduling task of distributed computing system
CN107220123A (en) One kind solves Spark data skew method and system
CN104468353A (en) SDN based data center network flow management method
CN105471954A (en) SDN based distributed control system and user flow optimization method
CN104683488A (en) Flow-type calculation system as well as dispatching method and dispatching device of flow-type calculation system
CN103516744A (en) A data processing method, an application server and an application server cluster
CN103729257A (en) Distributed parallel computing method and system
CN103927231A (en) Data-oriented processing energy consumption optimization dataset distribution method
CN103825838A (en) Method for flow dispatch for removing bandwidth fragmentization from data center
CN104767778A (en) Task processing method and device
CN110113761A (en) Dispositions method and device in edge calculations network are applied in a kind of processing of flow data
CN105704054A (en) Data center network flow migration method and system thereof
CN105391651A (en) Virtual optical network multilayer resource convergence method and system
CN104468390A (en) Multi-controller load balancing method and system based on distributed-centralized type architecture model in software defined networking
CN106681815A (en) Concurrent migration method of virtual machines
CN102394903A (en) Active reconstruction calculating system constructing system
CN103149839A (en) Operational control method for electrical equipment based on Kuhn-Munkres algorithm
CN105786447A (en) Method and apparatus for processing data by server and server
CN106059940A (en) Flow control method and device
CN102811152A (en) Method for realizing real-time transaction and data exchange of multiple main bus network communication
CN105207856A (en) Load balancing system and method based on SDN virtual switch
CN102420797A (en) Topology mapping method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160907
