CN103812949B - Task scheduling and resource allocation method and system for a real-time cloud platform - Google Patents
- Publication number: CN103812949B
- Application number: CN201410080647.XA
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The present invention relates to a task scheduling and resource allocation method and system for a real-time cloud platform. A global state storage module obtains the operating status of the cloud platform and reports it to a global state monitoring module; the global state monitoring module formulates a corresponding scheduling strategy from the operating status, using a task allocation matrix, a task adjacency matrix and a mask matrix; node-driven and/or task-driven task scheduling and resource allocation is then carried out on the real-time cloud platform according to the scheduling strategy. The invention takes the relationships between tasks fully into account when allocating tasks, which reduces the traffic between nodes and the bandwidth pressure and thereby improves platform performance. It adapts well to the various situations that arise in dynamic cloud platform scheduling, so that the platform maintains high computing performance and resource utilization at all times during operation. Its time complexity is low, which makes it suitable for deployment in cloud environments with large numbers of nodes and tasks.
Description
Technical field
The present invention relates to the field of real-time cloud computing, and in particular to a task scheduling and resource allocation method and system for a real-time cloud platform.
Background art
The volume of data in society grows daily, and more and more data appears in the form of large-scale, continuous streams. The value of data decreases over time, so data must be processed as soon as possible after it appears, rather than cached and processed in batches. For example, a search engine handles thousands of queries per second and each page contains several advertisements; processing user feedback in a timely manner requires a low-latency, scalable and highly reliable processing engine. Neither a traditional DBMS nor real-time stream processing built on Map/Reduce can readily meet such application demands.
For this reason, many stream computing platforms have appeared in China and abroad, such as Yahoo!'s open-source stream computing platform S4 (Simple Scalable Streaming System), Storm developed by Twitter, the commercial platform StreamBase, and Facebook's stream processing system Puma. There are also many similar systems in China, including Baidu's next-generation data stream system DStream and Taobao's real-time streaming data analysis platform Beatles. These distributed systems can significantly increase data processing capacity and reduce data processing latency.
The new demand for low-latency processing of massive data streams brings new challenges to scheduling and resource allocation between tasks and nodes. Current mainstream real-time cloud platforms have the following problems:
1. Existing real-time cloud platforms, such as Twitter's Storm, allocate tasks as independent units and do not consider the relationships between tasks; in practice, to improve platform efficiency, interrelated tasks should be assigned to the same node or to adjacent nodes;
2. Existing real-time cloud platforms consider only the CPU and memory usage of tasks, and consider neither the traffic between tasks nor their upstream and downstream relationships;
3. Existing real-time cloud platforms consider only the initial or static allocation problem and ignore a key characteristic of the platform: it is open, and its tasks and nodes change dynamically. The allocation strategy applied while the platform is running therefore becomes a key factor limiting its efficiency;
4. Classical multi-core task allocation algorithms have high complexity. They work well when the number of cores and the number of tasks are small, but the data volume, task volume and node scale of a cloud platform all exceed the range that traditional algorithms can handle, so an allocation algorithm designed for real-time cloud platforms is urgently needed.
In summary, what is needed is a task scheduling and resource allocation algorithm that has low time complexity, supports the dynamic computation of a real-time cloud platform, and copes with situations such as dynamic changes in the cloud environment, so as to improve the task allocation efficiency and resource utilization of the cloud platform.
Summary of the invention
The technical problem to be solved by the present invention is to address the deficiencies of the prior art by providing a task scheduling and resource allocation method and system for a real-time cloud platform that has low time complexity, supports the dynamic computation of a real-time cloud platform, copes with situations such as dynamic changes in the cloud environment, and effectively improves the task allocation efficiency and resource utilization of the cloud platform.
The technical solution of the present invention is a task scheduling and resource allocation method for a real-time cloud platform, comprising the following steps:
Step 1: The global state storage module obtains the operating status of the cloud platform and reports it to the global state monitoring module;
Step 2: The global state monitoring module formulates a corresponding scheduling strategy from the operating status, using the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM;
Step 3: Node-driven and/or task-driven task scheduling and resource allocation is carried out on the real-time cloud platform according to the scheduling strategy.
The beneficial effects of the invention are:
1. The relationships between tasks are taken fully into account, so allocating tasks reduces the traffic between nodes and the bandwidth pressure, which improves platform performance;
2. The method adapts well to the various situations that arise in dynamic cloud platform scheduling, so that the platform maintains high computing performance and resource utilization at all times during operation;
3. The computational complexity is low, which makes the method suitable for deployment in cloud environments with large numbers of nodes and tasks.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the task allocation matrix ST is a matrix of n rows and m columns, in which rows represent nodes and columns represent tasks;
the task adjacency matrix TT is a matrix of m rows and m columns that represents the connections between tasks;
the mask matrix TTM is a matrix of m rows and m columns that represents the inner connections between tasks within a node; multiplying it with the task adjacency matrix TT gives a result that represents the outer connections between tasks.
Further, in step 3, node-driven task scheduling and resource allocation covers four situations: a node is added, a node is overloaded, a node crashes, and a node is removed as planned.
a1. When a node is added, a row is appended to the task allocation matrix ST and its elements are set to zero;
a2. When a node is overloaded, a destination node is selected and a task selected on the overloaded node is migrated to it, and the task allocation matrix ST and the mask matrix TTM are modified accordingly.
The destination node must satisfy two conditions: it is not overloaded, and the number of connections between the overloaded node and the destination node is the largest.
The task to be migrated from the overloaded node must minimize the following value: the number of inner connections that become outer connections because of the migration, minus the number of outer connections that become inner connections because of the migration;
a3. When a node crashes, a destination node is selected for each task on the crashed node and the tasks are migrated one by one to their destination nodes, and the task allocation matrix ST and the mask matrix TTM are modified accordingly.
The destination node must be the node that has the most outer connections with the task to be migrated;
a4. When a node is removed as planned, the task allocation flag of the node to be removed is set to the "no new tasks" state. Once all tasks on the node have finished running, the node is removed: the elements of its row in the task allocation matrix ST are then all 0, and the row is deleted.
Further, when a node is overloaded, the destination node is selected according to the following condition:
$$A^T M_{sd} A + A^T M_{ds} A \ge A^T M_{sk} A + A^T M_{ks} A, \qquad k \in [1, n], \quad A = [1 \cdots 1]^T$$
where $M_{sd}$ represents the connections sent from the overloaded node $n_s$ to the destination node $n_d$, $M_{ds}$ represents the connections sent from the destination node $n_d$ to the overloaded node $n_s$, $M_{sk}$ represents the connections sent from the overloaded node $n_s$ to node $n_k$, and $M_{ks}$ represents the connections sent from node $n_k$ to node $n_s$; nodes $n_d$ and $n_k$ are both non-overloaded nodes.
Further, when a node is overloaded, the task to be migrated from the overloaded node $n_s$ is selected according to the following condition:
$$M_{ss}(p,:)\,A + A^T M_{ss}(:,p) - A^T M_{ds}(:,p) - M_{sd}(p,:)\,A \;\le\; M_{ss}(k,:)\,A + A^T M_{ss}(:,k) - A^T M_{ds}(:,k) - M_{sd}(k,:)\,A$$
where $M_{ss}(p,:)$ represents the inner connections sent by task $t_p$ to the other tasks on the overloaded node $n_s$, $M_{ss}(:,p)$ represents the inner connections sent by the other tasks on $n_s$ to task $t_p$, $M_{sd}(p,:)$ represents the outer connections sent by task $t_p$ to the tasks on node $n_d$, and $M_{ds}(:,p)$ represents the outer connections sent by the tasks on node $n_d$ to task $t_p$ on the overloaded node.
$M_{ss}(p,:)\,A + A^T M_{ss}(:,p)$ is the number of inner connections that become outer connections when task $t_p$ is migrated, and $A^T M_{ds}(:,p) + M_{sd}(p,:)\,A$ is the number of outer connections that become inner connections when task $t_p$ is migrated.
Likewise, the right-hand side of the inequality is the difference, for migrating task $t_k$, between the inner connections that become outer connections and the outer connections that become inner connections.
Further, when a node crashes, the destination node $n_d$ for each task on the crashed node $n_s$ is selected according to the following condition:
$$M_{sd}(p,:)\,A + A^T M_{ds}(:,p) \;\ge\; M_{si}(p,:)\,A + A^T M_{is}(:,p)$$
where $M_{sd}(p,:)$ represents the outer connections sent by the task $t_p$ to be migrated to the tasks on the destination node $n_d$, and $M_{ds}(:,p)$ represents the outer connections sent by the tasks on the destination node $n_d$ to the task $t_p$ to be migrated; $M_{sd}(p,:)\,A + A^T M_{ds}(:,p)$ is the total number of outer connections between task $t_p$ and the tasks on destination node $n_d$. $M_{si}(p,:)$ represents the outer connections sent by the task $t_p$ to be migrated to the tasks on node $n_i$, and $M_{is}(:,p)$ represents the outer connections sent by the tasks on node $n_i$ to the task $t_p$ to be migrated; $M_{si}(p,:)\,A + A^T M_{is}(:,p)$ is the total number of outer connections between task $t_p$ and the tasks on node $n_i$.
The node with the most connections to the task $t_p$ to be migrated is selected as the destination node of $t_p$; task $t_p$ is migrated to the destination node $n_d$, and the column vector corresponding to $t_p$ is removed from the allocation matrix ST to form a new matrix.
Further, in step 3, task-driven task scheduling and resource allocation covers four situations: a task is added, a task ends normally, a task is interrupted abnormally, and a task is actively migrated.
b1. When a task is added, the task adjacency matrix TT is used to find the node with the largest total number of connections to the new task; that node becomes the destination node and the new task is allocated to it, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly;
b2. When a task ends normally, the column corresponding to the ended task is removed from the task allocation matrix ST, the corresponding row and column are removed from the task adjacency matrix TT, and the corresponding elements of the mask matrix TTM are modified;
b3. When a task is interrupted abnormally, the task is first re-executed; if the abnormal interruption occurs again, the task is placed in the task queue to wait for reallocation, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly;
b4. When a task is actively migrated, the task is migrated directly to the destination node specified by the user, and the task allocation matrix ST, the task adjacency matrix TT and the mask matrix TTM are modified accordingly.
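The bookkeeping for case b2 (normal task termination) can be sketched as follows. This is a minimal illustration, not the patented implementation: the matrices are assumed to be plain Python lists of lists, and the function name `remove_task` is hypothetical. The mask matrix is not updated here; it would have to be rebuilt or trimmed in the same way.

```python
def remove_task(st, tt, e):
    """Drop task e: its column in ST, and its row and column in TT.

    st: n x m allocation matrix (rows = nodes, columns = tasks)
    tt: m x m adjacency matrix (row task sends to column task)
    Returns new matrices; the inputs are left untouched.
    """
    new_st = [row[:e] + row[e + 1:] for row in st]
    new_tt = [row[:e] + row[e + 1:] for i, row in enumerate(tt) if i != e]
    return new_st, new_tt


# Two nodes, three tasks: t0 and t1 on node 0, t2 on node 1; t0 -> t1 -> t2.
ST = [[1, 1, 0],
      [0, 0, 1]]
TT = [[0, 1, 0],
      [0, 0, 1],
      [0, 0, 0]]
ST_AFTER, TT_AFTER = remove_task(ST, TT, 1)  # t1 ends normally
```

After the call, ST has one column fewer and TT is one row and one column smaller, so task indices above `e` shift down by one.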
Further, when a task is added, the destination node $n_d$ for the new task $t_{new}$ is selected such that the total number of connections between $t_{new}$ and $n_d$ is no smaller than the total number of connections between $t_{new}$ and any other node $n_i$.
In this condition, one term represents the connections sent by the new task $t_{new}$ to the tasks on the destination node $n_d$, and the other represents the connections sent by the tasks on the destination node $n_d$ to the new task $t_{new}$; their sum is the total number of connections between $t_{new}$ and $n_d$. The other side of the inequality represents the total number of connections between $t_{new}$ and another node $n_i$. The node with the most connections to the new task $t_{new}$ is selected as its destination node.
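The placement rule for a new task can be sketched as below. This is only an illustration under stated assumptions: matrices are lists of lists, the new task's links are passed in as two sets of existing task indices (`out_links`, `in_links`), and the function name `place_new_task` is hypothetical. Ties are broken by taking the lowest node index, which the text does not specify.

```python
def place_new_task(st, tt, out_links, in_links):
    """Place a new task on the node it shares the most connections with.

    out_links: existing tasks the new task sends connections to
    in_links:  existing tasks that send connections to the new task
    Returns (new_st, new_tt, dest) with the new task as the last column/row.
    """
    m = len(tt)
    # Grow TT by one column (connections into the new task) and one row (out of it).
    new_tt = [row[:] + [1 if j in in_links else 0] for j, row in enumerate(tt)]
    new_tt.append([1 if j in out_links else 0 for j in range(m)] + [0])

    def links(i):
        # Total connections, in both directions, between the new task and node i.
        return sum((j in out_links) + (j in in_links)
                   for j, v in enumerate(st[i]) if v == 1)

    dest = max(range(len(st)), key=links)
    new_st = [row[:] + [1 if i == dest else 0] for i, row in enumerate(st)]
    return new_st, new_tt, dest


ST = [[1, 0],
      [0, 1]]          # t0 on node 0, t1 on node 1
TT = [[0, 1],
      [0, 0]]          # t0 -> t1
NEW_ST, NEW_TT, DEST = place_new_task(ST, TT, out_links={1}, in_links={1})
```

Here the new task exchanges connections only with t1, so it lands on node 1, next to t1.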
Another technical solution of the present invention to the above technical problem is a task scheduling and resource allocation system for a real-time cloud platform, comprising a client, a global state monitoring module, a global state storage module and several working nodes;
the client submits tasks to the corresponding path of the global state storage module, from which each working node obtains its tasks;
the global state storage module obtains the operating status of each working node and reports it to the global state monitoring module;
the global state monitoring module formulates a corresponding scheduling strategy from the reported operating status, using the task allocation matrix, the task adjacency matrix and the mask matrix, and carries out node-driven and task-driven task scheduling and resource allocation according to the scheduling strategy;
the working nodes obtain their corresponding tasks and execute them.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the global state monitoring module comprises a task allocation matrix unit, a task adjacency matrix unit and a mask matrix unit;
the task allocation matrix unit creates and modifies the task allocation matrix, which represents the correspondence between tasks and working nodes;
the task adjacency matrix unit creates and modifies the task adjacency matrix, which represents the connections between tasks;
the mask matrix unit creates and modifies the mask matrix, which represents the inner connections between tasks on a single node.
Brief description of the drawings
Fig. 1 is a block diagram of the task scheduling and resource allocation system for a real-time cloud platform according to the present invention;
Fig. 2 is a structural block diagram of the global state monitoring module of the present invention;
Fig. 3 is a flowchart of the task scheduling and resource allocation method for a real-time cloud platform according to the present invention;
Fig. 4 is a schematic diagram of the task allocation matrix ST after a node is added in embodiment 1 of the present invention;
Fig. 5a is a schematic diagram of the task adjacency matrix between the overloaded node $n_s$ and the other nodes in embodiment 2 of the present invention;
Fig. 5b is a schematic diagram of the task adjacency matrix between the overloaded node $n_s$ and the destination node $n_d$ in embodiment 2 of the present invention;
Fig. 6 is a schematic diagram of the task adjacency matrix between the crashed node $n_s$ and the other nodes in embodiment 3 of the present invention;
Fig. 7 is a schematic diagram of the task adjacency matrix between the new task $t_{new}$ and the other nodes after a task is added in embodiment 4 of the present invention;
Fig. 8 is a schematic diagram of the task adjacency matrix after task $t_e$ ends normally in embodiment 5 of the present invention;
Fig. 9 is a schematic diagram of the task allocation matrix after task $t_p$ is actively migrated from the source node $n_s$ to the destination node $n_d$ in embodiment 6 of the present invention;
Figs. 10a-10g show the results of task scheduling and resource allocation in the embodiments of the present invention.
In the drawings, the parts denoted by the reference numerals are as follows:
100, client; 200, global state monitoring module; 300, global state storage module; 400, working node; 201, task allocation matrix unit; 202, task adjacency matrix unit; 203, mask matrix unit.
Detailed description of the invention
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.
Some concepts used in the present invention are explained first.
Node: a physical machine or a virtual machine;
Connection: the transmission of a data stream between tasks;
Inner connection: a connection between tasks on the same node;
Outer connection: a connection between nodes, that is, between tasks on different nodes, covering both sent and received connections;
Task allocation matrix: the allocation of tasks to nodes. Rows represent nodes and columns represent tasks; an element value of 1 means that the task of that column is allocated to the node of that row;
Task adjacency matrix: the connections between tasks. Rows and columns both represent tasks; an element value of 1 means that a connection exists between the tasks of that row and that column, sent by the task of the row to the task of the column; otherwise no connection exists in that direction;
Overload threshold: determines whether a node is overloaded. If the CPU or memory usage of a node exceeds this value, the node is overloaded; otherwise it is in the normal state.
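The overload definition above amounts to a simple threshold test. A minimal sketch follows; the function name `is_overloaded` and the default threshold of 0.8 are illustrative assumptions, since the text does not state a concrete value.

```python
def is_overloaded(cpu_usage, memory_usage, threshold=0.8):
    """Return True if CPU or memory usage exceeds the overload threshold.

    Usages and threshold are fractions in [0, 1]. The 0.8 default is a
    hypothetical value chosen only for this example.
    """
    return cpu_usage > threshold or memory_usage > threshold


HOT = is_overloaded(0.93, 0.40)     # CPU above the threshold
NORMAL = is_overloaded(0.55, 0.62)  # both below the threshold
```

A node sitting exactly at the threshold counts as normal here, because the definition speaks of usage that exceeds the value.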
Fig. 1 shows the topology of the present invention. One server is used as the Client, responsible for issuing commands to the cluster and submitting jobs, executable programs and the like. Three servers are used as the global state storage module (Zookeeper nodes), responsible for storing the global state and for communicating with the other modules. Two servers are used as the global state monitoring module (Master nodes): one monitors the working state of the whole cluster and provides fault recovery and task migration, and the other serves as a hot standby. Five servers are used as Supervisor working nodes, responsible for monitoring and controlling the Worker processes. A switch with PCI-Express provides cluster network communication.
The task scheduling and resource allocation system for a real-time cloud platform according to the present invention comprises a client 100, a global state monitoring module 200, a global state storage module 300 and several working nodes 400;
the client 100 submits tasks to the corresponding path of the global state storage module, from which each working node obtains its tasks;
the global state storage module 300 obtains the operating status of each working node and reports it to the global state monitoring module;
the global state monitoring module 200 formulates a corresponding scheduling strategy from the reported operating status, using the task allocation matrix, the task adjacency matrix and the mask matrix, and carries out node-driven and task-driven task scheduling and resource allocation according to the scheduling strategy;
the working nodes 400 obtain their corresponding tasks and execute them.
As shown in Fig. 2, the global state monitoring module 200 comprises a task allocation matrix unit 201, a task adjacency matrix unit 202 and a mask matrix unit 203;
the task allocation matrix unit 201 creates and modifies the task allocation matrix, which represents the correspondence between tasks and working nodes;
the task adjacency matrix unit 202 creates and modifies the task adjacency matrix, which represents the connections between tasks;
the mask matrix unit 203 creates and modifies the mask matrix, which represents the inner connections between tasks on a single node.
Based on the above system, the task scheduling and resource allocation method for a real-time cloud platform according to the present invention is as follows.
As shown in Fig. 3, a task scheduling and resource allocation method for a real-time cloud platform comprises the following steps:
Step 1: The global state storage module obtains the operating status of the cloud platform and reports it to the global state monitoring module;
Step 2: The global state monitoring module formulates a corresponding scheduling strategy from the operating status, using the task allocation matrix, the task adjacency matrix and the mask matrix;
Step 3: Node-driven and task-driven task scheduling and resource allocation is carried out on the real-time cloud platform according to the scheduling strategy.
The task allocation matrix ST is a matrix of n rows and m columns, in which rows represent nodes and columns represent tasks.
The task adjacency matrix TT is a matrix of m rows and m columns that represents the connections between tasks.
The mask matrix TTM is a matrix of m rows and m columns that represents the inner connections between tasks on a node; multiplying it with the task adjacency matrix TT sets the elements corresponding to inner connections to 0.
In the task adjacency matrix TT, an element of 1 indicates that a connection exists between the two tasks corresponding to the element's row and column. The connection may be an outer connection or an inner connection. The two tasks of an inner connection are located on the same node, so their traffic does not pass through the switch and can be ignored during optimization; the mask therefore sets the elements corresponding to inner connections to 0, leaving only the outer connections.
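The relationship between the three matrices can be illustrated with a small sketch. It rests on two assumptions that the text implies but does not spell out: the mask holds 0 where two tasks share a node and 1 elsewhere, and the "multiplication" with TT is element-wise. The function names `build_mask` and `outer_connections` are hypothetical.

```python
def build_mask(st):
    """Derive the m x m mask from ST: 0 if two tasks share a node, else 1."""
    m = len(st[0])
    # node_of[j] is the row (node) hosting task j, or None if unassigned.
    node_of = [next((i for i, row in enumerate(st) if row[j] == 1), None)
               for j in range(m)]
    return [[0 if node_of[a] is not None and node_of[a] == node_of[b] else 1
             for b in range(m)] for a in range(m)]


def outer_connections(tt, ttm):
    """Element-wise product of TT and TTM: keeps only cross-node connections."""
    return [[x * y for x, y in zip(row_t, row_m)]
            for row_t, row_m in zip(tt, ttm)]


# Two nodes, three tasks: t0 and t1 on node 0, t2 on node 1.
ST = [[1, 1, 0],
      [0, 0, 1]]
# t0 -> t1 (inner connection) and t1 -> t2 (outer connection).
TT = [[0, 1, 0],
      [0, 0, 1],
      [0, 0, 0]]
TTM = build_mask(ST)
OUTER = outer_connections(TT, TTM)
```

In this example the inner connection t0 -> t1 is masked out and only t1 -> t2 survives, which is the traffic that actually crosses the switch.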
The present invention mainly addresses the dynamic scheduling problem in a cloud platform. Dynamic scheduling falls into two classes: node-driven and task-driven.
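The two classes and the eight situations the method distinguishes can be written down as a small lookup, which is how a monitoring module might decide which handler to run. This is a sketch only; the `Event` enum, its member names and `scheduling_class` are hypothetical and are not taken from the patent.

```python
from enum import Enum


class Event(Enum):
    NODE_ADDED = "node added"
    NODE_OVERLOADED = "node overloaded"
    NODE_CRASHED = "node crashed"
    NODE_PLANNED_REMOVAL = "planned node removal"
    TASK_ADDED = "task added"
    TASK_FINISHED = "task ended normally"
    TASK_ABORTED = "task interrupted abnormally"
    TASK_ACTIVE_MIGRATION = "task actively migrated"


NODE_DRIVEN = {
    Event.NODE_ADDED,
    Event.NODE_OVERLOADED,
    Event.NODE_CRASHED,
    Event.NODE_PLANNED_REMOVAL,
}


def scheduling_class(event):
    """Classify an event as node-driven or task-driven scheduling."""
    return "node-driven" if event in NODE_DRIVEN else "task-driven"


KIND = scheduling_class(Event.NODE_CRASHED)
```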
In the node-driven case, the task allocation matrix ST is always updated, because ST represents the allocation of tasks to nodes. However the nodes change, the connections between tasks do not, since these connections are logical; the task adjacency matrix TT is therefore not modified. When the task allocation matrix ST changes, the mask matrix TTM must change as well, because the element values of TTM are determined by the element values of ST.
A. Node-driven:
1. Adding a node
When a node is added to the cloud platform, the number of nodes changes from n to n+1 and the size of the task allocation matrix ST changes from n × m to (n+1) × m. Each row of ST represents the task allocation of the corresponding node, so a new node appears in ST as an additional row; since no tasks have yet been allocated to this node, all elements of that row are 0.
As shown in Fig. 4, a node is added in embodiment 1 of the present invention. The first n rows are the task allocation matrix ST before the node was added, of size n × m. After the node is added, the size of ST becomes (n+1) × m; the last row holds the elements of the new node, which are all 0 because no tasks have been allocated to it yet.
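Adding a node therefore reduces to appending one all-zero row. A minimal sketch, assuming ST is stored as a list of rows; the function name `add_node` is hypothetical.

```python
def add_node(st):
    """Return a copy of ST with an all-zero row appended for the new node."""
    m = len(st[0]) if st else 0
    return [row[:] for row in st] + [[0] * m]


ST = [[1, 0, 1],
      [0, 1, 0]]            # n = 2 nodes, m = 3 tasks
ST_GROWN = add_node(ST)     # now (n + 1) x m
```

TT is untouched, and since the new node hosts no tasks, no pair of tasks changes from outer to inner, so the mask stays the same as well.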
2. Node overload
Node overload means that the CPU or memory usage of a node exceeds the overload threshold; some of the tasks on the overloaded node must be migrated to other nodes to bring its load back to normal. Handling node overload involves selecting a destination node and selecting the task to migrate:
a. Selecting the destination node
Assume node $n_s$ is overloaded and some of the tasks on $n_s$ must be migrated to a destination node $n_d$.
The choice of the destination node $n_d$ must satisfy two conditions:
1) $n_d$ is not overloaded;
2) the number of connections between node $n_s$ and the destination node $n_d$ is the largest, i.e.
$$A^T M_{sd} A + A^T M_{ds} A \ge A^T M_{sk} A + A^T M_{ks} A, \qquad k \in [1, n], \quad A = [1 \cdots 1]^T$$
where $M_{sd}$ is the task adjacency matrix block between the tasks on node $n_s$ and the tasks on node $n_d$, representing the connections sent from $n_s$ to $n_d$. An element of 1 in row $i$ and column $j$ of this block indicates that task $i$ on node $n_s$ sends a connection to task $j$ on node $n_d$; otherwise no such connection exists. $M_{ds}$ is defined in the same way.
$A^T M_{sd} A$ is the total number of connections sent by the tasks on node $n_s$ to the tasks on node $n_d$, and $A^T M_{ds} A$ is the total number of connections sent by the tasks on node $n_d$ to the tasks on node $n_s$; their sum is the number of connections between nodes $n_s$ and $n_d$.
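Because $A$ is a vector of ones, $A^T M A$ is simply the sum of all elements of the block $M$. The destination choice can therefore be sketched as a sum over task pairs. The function names, the `overloaded` set and the lowest-index tie-break below are illustrative assumptions.

```python
def tasks_on(st, node):
    """Indices of the tasks allocated to a node (columns holding 1)."""
    return [j for j, v in enumerate(st[node]) if v == 1]


def connections_between(tt, tasks_a, tasks_b):
    """Sum of both adjacency blocks, i.e. A^T M_ab A + A^T M_ba A."""
    return sum(tt[a][b] + tt[b][a] for a in tasks_a for b in tasks_b)


def pick_destination(st, tt, src, overloaded):
    """Pick the non-overloaded node with the most connections to src."""
    candidates = [k for k in range(len(st)) if k != src and k not in overloaded]
    if not candidates:
        return None
    src_tasks = tasks_on(st, src)
    return max(candidates,
               key=lambda k: connections_between(tt, src_tasks, tasks_on(st, k)))


# Three nodes, four tasks: t0, t1 on node 0; t2 on node 1; t3 on node 2.
ST = [[1, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]
# t0 -> t2, t1 -> t2, t3 -> t0
TT = [[0, 0, 1, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 0],
      [1, 0, 0, 0]]
DEST = pick_destination(ST, TT, src=0, overloaded={0})
```

Node 1 shares two connections with the overloaded node 0 and node 2 shares only one, so node 1 is chosen.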
b. Selecting the task to migrate
The task selected for migration is $t_p$, which must satisfy:
$$M_{ss}(p,:)\,A + A^T M_{ss}(:,p) - A^T M_{ds}(:,p) - M_{sd}(p,:)\,A \;\le\; M_{ss}(k,:)\,A + A^T M_{ss}(:,k) - A^T M_{ds}(:,k) - M_{sd}(k,:)\,A$$
That is, on the source node $n_s$, the task that has few inner connections and many outer connections with the destination node $n_d$ is selected and migrated to $n_d$.
Here $M_{ss}(p,:)$ is the task adjacency matrix block between task $t_p$ and the other tasks on its node $n_s$; it is a row vector representing the connections sent by $t_p$ to the other tasks on $n_s$, in which an element of 1 indicates a connection sent by $t_p$ to another task on $n_s$ and any other value indicates that no such connection exists. $M_{ss}(:,p)$ is the analogous column vector. $M_{sd}(p,:)$ represents the outer connections sent by task $t_p$ to the tasks on node $n_d$, and $M_{ds}(:,p)$ represents the outer connections sent by the tasks on node $n_d$ to task $t_p$ on the overloaded node.
$M_{ss}(p,:)\,A + A^T M_{ss}(:,p)$ is the number of inner connections that become outer connections when task $t_p$ is migrated, and $A^T M_{ds}(:,p) + M_{sd}(p,:)\,A$ is the number of outer connections that become inner connections when task $t_p$ is migrated.
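The selection criterion can be read as a per-task migration cost: inner connections lost minus outer connections gained. A minimal sketch under the assumptions that tasks have no self-connections and that ties go to the first task found; `migration_cost` and `pick_task` are hypothetical names.

```python
def migration_cost(tt, p, src_tasks, dst_tasks):
    """Inner connections of t_p on the source node that would become outer,
    minus outer connections to the destination that would become inner."""
    inner = sum(tt[p][q] + tt[q][p] for q in src_tasks if q != p)
    outer = sum(tt[p][q] + tt[q][p] for q in dst_tasks)
    return inner - outer


def pick_task(tt, src_tasks, dst_tasks):
    """Choose the task on the source node with the smallest migration cost."""
    return min(src_tasks, key=lambda p: migration_cost(tt, p, src_tasks, dst_tasks))


# t0, t1, t2 run on the overloaded node; t3 runs on the destination node.
# Connections: t0 -> t1, t1 -> t2, t2 -> t3.
TT = [[0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1],
      [0, 0, 0, 0]]
SRC_TASKS, DST_TASKS = [0, 1, 2], [3]
COSTS = [migration_cost(TT, p, SRC_TASKS, DST_TASKS) for p in SRC_TASKS]
CHOSEN = pick_task(TT, SRC_TASKS, DST_TASKS)
```

Moving t2 breaks one inner connection (with t1) but turns its connection to t3 into an inner one, for a net cost of 0, the lowest of the three.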
Fig. 5a shows the task adjacency matrix. The rows and columns corresponding to $n_s$ represent the connections of the tasks on the overloaded node $n_s$. Assuming the destination node is $n_d$, the two matrix blocks where the rows and columns of $n_s$ and $n_d$ intersect represent the connections between the tasks on these two nodes: the block circled with a dashed line represents the connections sent from $n_s$ to $n_d$, and the block circled with a solid line represents the connections sent from $n_d$ to $n_s$. The sum of the elements of the two blocks is the total number of outer connections between the two nodes; $n_d$ is the non-overloaded node that has the largest total number of outer connections with $n_s$.
Fig. 5b shows the task adjacency matrix between the overloaded node $n_s$ and the destination node $n_d$, where $t_p$ is the task on $n_s$ to be migrated. The two matrix blocks marked with a minus sign "-" represent the outer connections between task $t_p$ and node $n_d$ before migration; after migration, $t_p$ runs on node $n_d$, so these outer connections become inner connections and the total number of outer connections decreases. The two matrix blocks marked with a plus sign "+" represent the connections between task $t_p$ and the other tasks on node $n_s$, which are inner connections; after migration, $t_p$ no longer belongs to node $n_s$, so these inner connections become outer connections and the total number of outer connections increases.
3. node is delayed machine
Causing the delay main cause of machine of node is that node overload but fails to migrate task thereon in time and makes it
Load keeps normal, now need by delay on machine node all task immigrations to other nodes, it is important to
Select suitable destination node nd;
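Assume the crashed node is $n_s$ and the task to be migrated is $t_p$. For every node $n_i$ other than $n_s$, the destination node $n_d$ must satisfy:
$$M_{sd}(p,:) \times A + A^T \times M_{ds}(:,p) \ge M_{si}(p,:) \times A + A^T \times M_{is}(:,p)$$
That is, task $t_p$ has the most outer connections with the destination node $n_d$.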
Here $M_{sd}(p,:)$ is the task adjacency matrix block from the task $t_p$ to be migrated to the tasks on the destination node $n_d$. It is a row vector: an element equal to 1 means that $t_p$ sends a connection to the corresponding task on $n_d$, and 0 means there is no such connection. Similarly, $M_{ds}(:,p)$ is the task adjacency matrix block from the tasks on $n_d$ to $t_p$. It is a column vector: an element equal to 1 means that the corresponding task on $n_d$ sends a connection to $t_p$, and 0 means there is no such connection. The sum of the elements of $M_{sd}(p,:)$ and $M_{ds}(:,p)$ is the total number of connections between $t_p$ and the tasks on $n_d$.
Task $t_p$ is migrated to the destination node $n_d$, and the column vector corresponding to $t_p$ in the allocation matrix ST is removed, forming a new matrix.
This process is repeated until no task remains on node $n_s$.
As shown in Fig. 6, $n_s$ is the crashed node, the part circled with a dashed line is any task $t_k$ on it, and $n_i$ is any node other than $n_s$. The two matrix blocks where the row and column of $t_k$ intersect the rows and columns of $n_i$ are the outer connections between $t_k$ and node $n_i$, and the sum of the elements of the two blocks is the total number of outer connections between $t_k$ and $n_i$. The non-overloaded node with the largest total is the required destination node. The process is repeated to find a destination node for each task on $n_s$ and migrate it there, until all tasks have been migrated.
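The evacuation of a crashed node can be sketched as a loop over its tasks. This is a hedged illustration only: `evacuate` and the sample matrices are hypothetical, the matrices are plain lists, and the sketch assumes that the column of $t_p$ in ST is rewritten in place (0 for the crashed node, 1 for the destination) rather than removed and re-added.

```python
def tasks_on(ST, node):
    """Indices of the tasks (columns of ST) assigned to `node` (a row of ST)."""
    return [j for j, v in enumerate(ST[node]) if v == 1]


def evacuate(TT, ST, s, overloaded=()):
    """Move each task off crashed node s to the node it shares most connections with."""
    moves = {}
    for p in tasks_on(ST, s):
        candidates = [i for i in range(len(ST)) if i != s and i not in overloaded]
        # Total connections between task p and the tasks currently on node i.
        d = max(candidates,
                key=lambda i: sum(TT[p][q] + TT[q][p] for q in tasks_on(ST, i)))
        ST[s][p], ST[d][p] = 0, 1   # rewrite column p of ST in place
        moves[p] = d
    return moves


# Hypothetical data: crashed node 0 holds t0 and t1, node 1 holds t2, node 2 holds t3.
ST = [[1, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]
# Connections: t0 -> t2 and t3 -> t1.
TT = [[0, 0, 1, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 0],
      [0, 1, 0, 0]]
moves = evacuate(TT, ST, s=0)
```

Because each move updates ST immediately, tasks handled later in the loop see the tasks that were already relocated.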
4. Planned node removal
Planned node removal means that no new tasks are assigned to the node, and the node is removed after all tasks on it have finished executing. It is implemented by giving each node a flag bit. The flag bit of the node to be removed is set to "cannot accept new tasks", and the scheduler then waits until all tasks on the node have finished. At that point every element of the node's row in the task allocation matrix ST is 0, and the row is removed.
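A minimal sketch of this drain-then-remove behaviour is shown below. The `accepting` list stands in for the per-node flag bits and is an assumption of this example, as are the function names. For brevity the finished task is modelled by zeroing its entry in ST rather than removing its column.

```python
def mark_for_removal(accepting, node):
    """Set the node's flag so that the scheduler stops assigning it new tasks."""
    accepting[node] = False


def try_remove(ST, accepting, node):
    """Remove the node's row from ST once it is flagged and its row is all zeros."""
    if accepting[node] or any(ST[node]):
        return False
    del ST[node]
    del accepting[node]
    return True


# Hypothetical data: two nodes, two tasks, one task per node.
ST = [[1, 0],
      [0, 1]]
accepting = [True, True]

mark_for_removal(accepting, 1)
removed_early = try_remove(ST, accepting, 1)   # task t1 is still running
ST[1][1] = 0                                   # t1 finishes
removed_late = try_remove(ST, accepting, 1)
```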
B. Task-driven type:
1. New task
After a new task $t_{new}$ is added to the cloud platform, a suitable destination node $n_d$ must be selected for it. The selection condition is that the destination node $n_d$ has the most connections with the new task $t_{new}$; assigning the task to this node effectively reduces the number of outer connections. The task adjacency matrix block from $t_{new}$ to the tasks on $n_d$ is a row vector: an element equal to 1 means that $t_{new}$ sends a connection to the corresponding task on $n_d$, and 0 means there is no such connection.
The total number of connections sent by $t_{new}$ to $n_d$ plus the total number of connections sent by the tasks on $n_d$ to $t_{new}$ is the total number of connections between the new task $t_{new}$ and the destination node $n_d$.
As shown in Fig. 7, $t_{new}$ is the newly added task and $n_d$ is the destination node to which it will be assigned. The adjacency entries between $t_{new}$ and the other tasks are placed in the last row and last column of the original task adjacency matrix. The matrix block where the row of $t_{new}$ intersects the columns of $n_d$ holds the connections sent from $t_{new}$ to $n_d$, and the block where the column of $t_{new}$ intersects the rows of $n_d$ holds the connections sent from $n_d$ to $t_{new}$. The sum of the two is the total number of outer connections between $t_{new}$ and $n_d$. The node with the largest total is the destination node of $t_{new}$, and $t_{new}$ is assigned to it.
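Placing a new task can be sketched in three steps: grow TT by one row and one column, count the connections between the new task and each node, then append a column to ST. The function `add_task`, its parameters `sends_to` and `receives_from`, and the sample data are assumptions made for illustration; the matrices are plain lists.

```python
def add_task(TT, ST, sends_to, receives_from):
    """Append a new task and assign it to the node it shares most connections with."""
    m = len(TT)                       # index of the new task
    for row in TT:                    # new last column of TT
        row.append(0)
    TT.append([0] * (m + 1))          # new last row of TT
    for q in sends_to:
        TT[m][q] = 1
    for q in receives_from:
        TT[q][m] = 1

    def links(i):
        """Connections in both directions between the new task and node i."""
        return sum(TT[m][q] + TT[q][m] for q, v in enumerate(ST[i]) if v == 1)

    d = max(range(len(ST)), key=links)
    for i, row in enumerate(ST):      # new last column of ST
        row.append(1 if i == d else 0)
    return d


# Hypothetical data: node 0 holds t0; node 1 holds t1 and t2; no connections yet.
ST = [[1, 0, 0],
      [0, 1, 1]]
TT = [[0, 0, 0],
      [0, 0, 0],
      [0, 0, 0]]
# The new task t3 sends to t1 and receives from t2.
dest = add_task(TT, ST, sends_to=[1], receives_from=[2])
```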
2. Task termination
When a task terminates normally, the column corresponding to the terminated task is removed from the allocation matrix ST, the corresponding row and column are removed from the task adjacency matrix TT, and the corresponding elements of the mask matrix TTM are modified.
As shown in Fig. 8, task $t_e$ has finished executing and terminates normally. The part inside the dashed line is the connections of $t_e$ in the task adjacency matrix TT, and this row and column are removed. Likewise, the corresponding column is removed from the task allocation matrix ST, and the elements related to $t_e$ are deleted from the mask matrix TTM.
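The bookkeeping for a finished task can be sketched as follows. Two assumptions are made here and are not stated in the text: the mask matrix TTM is rebuilt from ST instead of being edited element by element, and TTM holds 0 where two tasks share a node and 1 otherwise, so that an element-wise product with TT keeps only outer connections. The function names are illustrative.

```python
def remove_task(TT, ST, e):
    """Drop task e: its column in ST and its row and column in TT."""
    for row in ST:
        del row[e]
    del TT[e]
    for row in TT:
        del row[e]


def build_mask(ST):
    """Rebuild TTM from ST, assuming every task is assigned to exactly one node."""
    node_of = {j: i for i, row in enumerate(ST) for j, v in enumerate(row) if v == 1}
    m = len(ST[0])
    # Assumed encoding: 0 marks an inner pair (same node), 1 marks an outer pair.
    return [[0 if node_of[a] == node_of[b] else 1 for b in range(m)]
            for a in range(m)]


# Hypothetical data: node 0 holds t0 and t1, node 1 holds t2.
ST = [[1, 1, 0],
      [0, 0, 1]]
TT = [[0, 1, 0],
      [0, 0, 1],
      [0, 0, 0]]
remove_task(TT, ST, 1)    # t1 terminates normally
TTM = build_mask(ST)
```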
3. Abnormal task interruption
The task is first re-executed. If it is interrupted abnormally again, the task is not suited to running on this node, so it is put back into the task queue to wait for reallocation.
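The retry-then-requeue rule is small enough to sketch directly. The `run` callable, which is assumed to return `True` on success and `False` on another abnormal interruption, and the task names are hypothetical stand-ins for whatever the platform uses to execute a task.

```python
from collections import deque


def handle_interruption(task, run, queue):
    """Re-execute the task once; if it is interrupted again, requeue it."""
    if run(task):
        return "completed"
    queue.append(task)    # wait for reallocation to another node
    return "requeued"


pending = deque()
first = handle_interruption("t7", lambda task: False, pending)   # fails again
second = handle_interruption("t8", lambda task: True, pending)   # retry succeeds
```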
4. Active task migration
Active task migration means that the user decides to migrate a certain task to a certain node. In this case the migration is performed directly, without intervention by the algorithm.
As shown in Fig. 9, task $t_p$ is moved from node $n_s$ to node $n_d$. In the column of $t_p$ in the allocation matrix ST, the element for node $n_s$ changes from 1 to 0 and the element for node $n_d$ changes from 0 to 1.
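In matrix terms an active migration is a two-element update of ST. The sketch below uses zero-based indices and a hypothetical `migrate` helper; the check that the task really is on the source node is an addition of this example.

```python
def migrate(ST, p, s, d):
    """Move task p (a column of ST) from node s to node d (rows of ST)."""
    if ST[s][p] != 1:
        raise ValueError("task is not on the source node")
    ST[s][p] = 0
    ST[d][p] = 1


# Hypothetical data: three nodes, two tasks; t0 runs on node 1.
ST = [[0, 1],
      [1, 0],
      [0, 0]]
migrate(ST, p=0, s=1, d=0)   # the user moves t0 to node 0
```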
Operation results:
Fig. 10a shows the initial state of the cloud platform's nodes and tasks: 4 nodes (nodes 1, 2, 3, 4) and 4 tasks ($t_1$, $t_2$, $t_3$, $t_4$). Arrows represent connections between tasks; a connection is sent from the task at the tail of the arrow to the task at its head. On the right are the corresponding task allocation matrix ST and task adjacency matrix TT.
Fig. 10b shows the result of adding a node. The cloud platform adds node 5, and a new row is appended to the task allocation matrix ST. Because no task has been assigned to the node yet, every element of this row is 0. TT is unchanged.
Starting from Fig. 10b, assume that node 3 is overloaded and that none of the other nodes is overloaded, so some of the tasks on node 3 must be migrated using the algorithm of the detailed description. First, a destination node is selected for the task to be migrated. Nodes 2 and 4 have the most outer connections with node 3, 1 each, while nodes 1 and 5 have 0 connections with node 3. Other conditions show that node 4 is less loaded than node 2, so node 4 is selected as the destination node. Second, the task to be migrated is selected. If $t_2$ were migrated, its number of outer connections after migration would be 2; if $t_3$ were migrated, its number of outer connections after migration would be 1, which is less than 2, so $t_3$ is selected. $t_3$ is migrated to node 4: in the task allocation matrix ST the element in row 3, column 3 is rewritten as 0 and the element in row 4, column 3 is rewritten as 1, and the remaining elements are unchanged. The result of the migration is shown in Fig. 10c. TT is unchanged.
Starting from Fig. 10c, assume that node 2 crashes. Fig. 10d shows the result; node 2, drawn with a dashed frame, is the crashed node. According to the algorithm of the detailed description, $t_1$ must be moved to another node. If $t_1$ moved to node 1, 4, or 5, its number of outer connections would be 1; if it moved to node 3, its number of outer connections would be 0, which is less than 1, so node 3 is selected as the destination node. Task $t_1$ on node 2 is migrated to node 3: in the task allocation matrix ST the element in row 2, column 1 is rewritten as 0 and the element in row 3, column 1 is rewritten as 1. The task adjacency matrix TT is unchanged.
Fig. 10e adds task $t_5$ to the state of Fig. 10c. According to the algorithm of the detailed description, the numbers of connections between $t_5$ and nodes 1, 2, 3, 4, 5 are 0, 0, 1, 0, 0 respectively. The number of connections with node 3 is the largest, so node 3 is selected as the destination node of $t_5$. A column for $t_5$ is added to the task allocation matrix ST with its elements initialized to 0, and the element in row 3, column 5 is rewritten as 1. A row and a column are added to the task adjacency matrix TT with their elements initialized to 0, and then the element in row 3, column 5 of TT is rewritten as 1; the remaining elements are unchanged.
In Fig. 10f, task $t_5$ has finished executing and is drawn as a dashed ellipse. In the task allocation matrix ST the element in row 3, column 5 is rewritten as 0 and column 5 is removed; in the task adjacency matrix TT row 5 and column 5 are removed.
In Fig. 10g, task $t_1$ is actively migrated to node 1. In the task allocation matrix ST the element in row 2, column 1 is rewritten as 0 and the element in row 1, column 1 is rewritten as 1. The task adjacency matrix TT is unchanged.
The effect of the optimization is evaluated mainly by the following indices:
Task distribution evenness AVG: the difference between the number of tasks assigned to the node with the most tasks and the number assigned to the node with the fewest tasks.
Switch traffic COMM: the number of outer connections that pass through the switch:
$$A = [1 \; \cdots \; 1]^T$$
$$COMM = A^T \times (TT * TTM) \times A$$
Here TT is the task adjacency matrix, which represents all connections between tasks (both inner and outer), and TTM is the mask matrix, which represents the inner connections between tasks on the same node. Multiplying TT by TTM masks out the inner connections, so the result contains only the outer connections between tasks.
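Both indices reduce to short sums over the matrices. The sketch below assumes that `TT * TTM` is an element-wise product and that TTM holds 0 for task pairs on the same node and 1 for pairs on different nodes; the text does not spell out this encoding, so it is an assumption of the example. The function names and sample data are likewise illustrative.

```python
def avg_index(ST):
    """AVG: tasks on the fullest node minus tasks on the emptiest node."""
    counts = [sum(row) for row in ST]
    return max(counts) - min(counts)


def comm_index(TT, TTM):
    """COMM = A^T x (TT * TTM) x A, i.e. the sum of the masked adjacency matrix."""
    m = len(TT)
    return sum(TT[i][j] * TTM[i][j] for i in range(m) for j in range(m))


# Hypothetical data: node 0 holds t0 and t1, node 1 holds t2.
ST = [[1, 1, 0],
      [0, 0, 1]]
# Connections: t0 -> t1 (inner), t0 -> t2 and t1 -> t2 (outer).
TT = [[0, 1, 1],
      [0, 0, 1],
      [0, 0, 0]]
# Assumed mask: 0 for pairs on the same node, 1 otherwise.
TTM = [[0, 0, 1],
       [0, 0, 1],
       [1, 1, 0]]
avg = avg_index(ST)
comm = comm_index(TT, TTM)
```

With this data the inner connection t0 to t1 is masked out, leaving two outer connections that cross the switch.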
The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (9)
1. A task scheduling and resource allocation method for a real-time cloud platform, characterized by comprising the following steps:
Step 1: a global state storage module obtains the operating status of the cloud platform and reports it to a global state monitoring module;
Step 2: the global state monitoring module, according to the operating status, formulates a corresponding scheduling strategy using the task allocation matrix ST, the task adjacency matrix TT, and the mask matrix TTM;
Step 3: node-driven and/or task-driven task scheduling and resource allocation are carried out in the real-time cloud platform according to the scheduling strategy;
the task allocation matrix ST is a matrix of n rows and m columns, in which rows represent nodes and columns represent tasks,
the task adjacency matrix TT is a matrix of m rows and m columns that represents the connections between tasks,
the mask matrix TTM is a matrix of m rows and m columns that represents the inner connections between tasks within a node; multiplying it by the task adjacency matrix TT gives the outer connections between tasks.
2. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, in step 3, the node-driven task scheduling and resource allocation covers the cases of a newly added node, node overload, node crash, and planned node removal;
a1. for a newly added node, a row is added to the task allocation matrix ST with its elements set to zero;
a2. for node overload, a destination node is selected and a task to be migrated, selected on the overloaded node, is migrated to the destination node, and the task allocation matrix ST and the mask matrix TTM are modified accordingly,
wherein the destination node must satisfy the conditions that it is not overloaded and that the number of connections between the overloaded node and the destination node is the largest;
and the task to be migrated on the overloaded node must satisfy the condition that the number of inner connections that become outer connections because of its migration, minus the number of outer connections that become inner connections because of its migration, is the smallest;
a3. for a node crash, a destination node is selected for each task on the crashed node, the tasks on the crashed node are migrated in turn to their corresponding destination nodes, and the task allocation matrix ST and the mask matrix TTM are modified accordingly;
wherein the destination node must satisfy the condition that the number of outer connections between the task to be migrated and the destination node is the largest;
a4. for planned node removal, the task allocation flag bit of the node to be removed is set to the state in which no new tasks can be assigned, all tasks running on the node are allowed to finish, and the node is then removed; at that point every element of the node's row in the task allocation matrix ST is 0, and the row is removed.
3. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for node overload, the specific condition for selecting the destination node is as follows:
$$A^T \times M_{sd} \times A + A^T \times M_{ds} \times A \ge A^T \times M_{sk} \times A + A^T \times M_{ks} \times A$$
$$k \in [1, n], \quad A = [1 \; \cdots \; 1]^T$$
wherein $M_{sd}$ represents the connections sent from the overloaded node $n_s$ to the destination node $n_d$, $M_{ds}$ represents the connections sent from the destination node $n_d$ to the overloaded node $n_s$, $M_{sk}$ represents the connections sent from the overloaded node $n_s$ to node $n_k$, $M_{ks}$ represents the connections sent from node $n_k$ to node $n_s$, and nodes $n_d$ and $n_k$ are both non-overloaded nodes.
4. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for node overload, the specific condition for selecting the task to be migrated on the overloaded node $n_s$ is as follows:
$$\left[M_{ss}(p,:) \times A + A^T \times M_{ss}(:,p)\right] - \left[M_{sd}(p,:) \times A + A^T \times M_{ds}(:,p)\right] \le \left[M_{ss}(k,:) \times A + A^T \times M_{ss}(:,k)\right] - \left[M_{sd}(k,:) \times A + A^T \times M_{ds}(:,k)\right]$$
wherein $M_{ss}(p,:)$ represents the inner connections sent by task $t_p$ to the other tasks on the overloaded node $n_s$, $M_{ss}(:,p)$ represents the inner connections sent by the other tasks on node $n_s$ to task $t_p$, $M_{sd}(p,:)$ represents the outer connections sent by task $t_p$ to the tasks on node $n_d$, and $M_{ds}(:,p)$ represents the outer connections sent by the tasks on node $n_d$ to task $t_p$ on the overloaded node,
$M_{ss}(p,:) \times A + A^T \times M_{ss}(:,p)$ is the number of inner connections that become outer connections when task $t_p$ migrates, and $M_{sd}(p,:) \times A + A^T \times M_{ds}(:,p)$ is the number of outer connections that become inner connections when task $t_p$ migrates;
similarly, the right-hand side of the inequality is the difference between the number of inner connections that become outer connections and the number of outer connections that become inner connections when task $t_k$ migrates.
5. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for a node crash, the specific condition that the destination node $n_d$ selected for each task on the crashed node $n_s$ must satisfy is:
$$M_{sd}(p,:) \times A + A^T \times M_{ds}(:,p) \ge M_{si}(p,:) \times A + A^T \times M_{is}(:,p)$$
wherein $M_{sd}(p,:)$ represents the outer connections sent by the task $t_p$ to be migrated to the tasks on the destination node $n_d$; $M_{ds}(:,p)$ represents the outer connections sent by the tasks on the destination node $n_d$ to the task $t_p$ to be migrated; $M_{sd}(p,:) \times A + A^T \times M_{ds}(:,p)$ is the total number of outer connections between task $t_p$ and the tasks on the destination node $n_d$; $M_{si}(p,:)$ represents the outer connections sent by the task $t_p$ to be migrated to the tasks on node $n_i$; $M_{is}(:,p)$ represents the outer connections sent by the tasks on node $n_i$ to the task $t_p$ to be migrated; and $M_{si}(p,:) \times A + A^T \times M_{is}(:,p)$ is the total number of outer connections between task $t_p$ and the tasks on node $n_i$;
the node with the most connections with the task $t_p$ to be migrated is selected as the destination node of $t_p$, task $t_p$ is migrated to the destination node $n_d$, and the column vector corresponding to $t_p$ in the allocation matrix ST is removed, forming a new matrix.
6. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, in step 3, the task-driven task scheduling and resource allocation covers the cases of a new task, normal task termination, abnormal task interruption, and active task migration;
b1. for a new task, the task adjacency matrix TT is used to find the node with the largest total number of connections with the new task, that node is taken as the destination node, the new task is assigned to it, and the task allocation matrix ST, the task adjacency matrix TT, and the mask matrix TTM are modified accordingly;
b2. for normal task termination, the column corresponding to the terminated task is removed from the task allocation matrix ST, the corresponding row and column are removed from the task adjacency matrix TT, and the corresponding elements of the mask matrix TTM are modified;
b3. for abnormal task interruption, the task is first re-executed; if it is interrupted abnormally again, the task is put into the task queue to wait for reallocation, and the task allocation matrix ST, the task adjacency matrix TT, and the mask matrix TTM are modified accordingly;
b4. for active task migration, the task to be migrated is migrated directly to the destination node specified by the user, and the task allocation matrix ST, the task adjacency matrix TT, and the mask matrix TTM are modified accordingly.
7. The task scheduling and resource allocation method for a real-time cloud platform, characterized in that, for a new task, the specific condition for selecting the destination node $n_d$ for the new task $t_{new}$ is that the total number of connections between $t_{new}$ and $n_d$ is no less than the total number of connections between $t_{new}$ and any other node $n_i$,
wherein the total number of connections between the new task $t_{new}$ and the destination node $n_d$ is the sum of the connections sent by $t_{new}$ to the tasks on $n_d$ and the connections sent by the tasks on $n_d$ to $t_{new}$; the other side of the inequality is the total number of connections between the new task $t_{new}$ and another node $n_i$; and the node with the most connections with the new task $t_{new}$ is selected as the destination node of $t_{new}$.
8. A system implementing the task scheduling and resource allocation method for a real-time cloud platform according to any one of claims 1-7, characterized by comprising a client, a global state monitoring module, a global state storage module, and several working nodes;
the client is used to submit tasks under the corresponding path of the global state storage module, from which each working node obtains its corresponding tasks;
the global state storage module is used to obtain the operating status of each working node and report it to the global state monitoring module;
the global state monitoring module is used to formulate a corresponding scheduling strategy from the reported operating status using the task allocation matrix, the task adjacency matrix, and the mask matrix, and to carry out node-driven and task-driven task scheduling and resource allocation according to the scheduling strategy;
the working node is used to obtain its corresponding tasks and execute them.
9. The task scheduling and resource allocation system for a real-time cloud platform, characterized in that the global state monitoring module comprises a task allocation matrix unit, a task adjacency matrix unit, and a mask matrix unit;
the task allocation matrix unit is used to create and modify the task allocation matrix, which represents the correspondence between tasks and working nodes;
the task adjacency matrix unit is used to create and modify the task adjacency matrix, which represents the connection relationships between tasks;
the mask matrix unit is used to create and modify the mask matrix, which represents the inner connection relationships between tasks on a single node.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410080647.XA CN103812949B (en) | 2014-03-06 | 2014-03-06 | A kind of task scheduling towards real-time cloud platform and resource allocation methods and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103812949A CN103812949A (en) | 2014-05-21 |
CN103812949B true CN103812949B (en) | 2016-09-07 |
Family
ID=50709142
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410080647.XA Expired - Fee Related CN103812949B (en) | 2014-03-06 | 2014-03-06 | A kind of task scheduling towards real-time cloud platform and resource allocation methods and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103812949B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104270421B (en) * | 2014-09-12 | 2017-12-19 | 北京理工大学 | A kind of multi-tenant cloud platform method for scheduling task for supporting Bandwidth guaranteed |
CN105589756B (en) * | 2014-12-03 | 2019-02-15 | 中国银联股份有限公司 | Batch processing group system and method |
CN104636204B (en) * | 2014-12-04 | 2018-06-01 | 中国联合网络通信集团有限公司 | A kind of method for scheduling task and device |
CN104917825A (en) * | 2015-05-20 | 2015-09-16 | 中国科学院信息工程研究所 | Load balancing method for real time stream computing platform |
CN105447187B (en) * | 2015-12-15 | 2017-09-22 | 广州神马移动信息科技有限公司 | Web search method and system |
SG11201803928UA (en) * | 2015-12-17 | 2018-06-28 | Ab Initio Technology Llc | Processing data using dynamic partitioning |
CN106375419A (en) * | 2016-08-31 | 2017-02-01 | 东软集团股份有限公司 | Deployment method and device of distributed cluster |
CN107450855B (en) * | 2017-08-08 | 2020-06-19 | 浪潮云信息技术有限公司 | Model-variable data distribution method and system for distributed storage |
CN109726004B (en) * | 2017-10-27 | 2021-12-03 | 中移(苏州)软件技术有限公司 | Data processing method and device |
CN108234668A (en) * | 2018-01-17 | 2018-06-29 | 北京网信云服信息科技有限公司 | The dispatching method and system of a kind of consumer queue |
CN109358954B (en) * | 2018-09-21 | 2021-11-02 | 成都理工大学 | Preemptive scheduling method of overload real-time system based on MaxSAT optimal solution |
CN109815019B (en) * | 2019-02-03 | 2021-06-15 | 普信恒业科技发展(北京)有限公司 | Task scheduling method and device, electronic equipment and readable storage medium |
CN111352712B (en) * | 2020-02-25 | 2020-12-22 | 国网江苏省电力有限公司信息通信分公司 | Cloud computing task tracking processing method and device, cloud computing system and server |
WO2024020897A1 (en) * | 2022-07-27 | 2024-02-01 | 西门子股份公司 | Method and apparatus for allocating computing task between computing devices, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN103812949A (en) | 2014-05-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20160907 |