Summary of the Invention
The technical problem to be solved by the present invention is to provide a cloud data center task scheduling method based on an improved ant colony algorithm that allocates virtual machines reasonably and schedules tasks efficiently.
The technical solution adopted by the present invention to solve this technical problem is a cloud data center task scheduling method based on an improved ant colony algorithm, comprising the following steps:
Step 1: Input the set of workflow tasks to be scheduled, as submitted by the user, and the set of virtual machines leased by the user;
Step 2: Express the problem of assigning tasks to virtual machines for execution as a standard minimization problem;
Step 3: Solve the virtual machine task scheduling problem in the cloud computing environment with an ant colony algorithm based on pheromone updating.
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, step 2 expresses the problem of assigning tasks to virtual machines for execution as a standard minimization problem, in which the optimization objective is to minimize the total overhead of executing all tasks under the scheduling strategy, that is, to minimize the time taken for all tasks to finish executing;
The constraints are as follows: the number of tasks is greater than the number of leased virtual machines; every task in the task set is a unit task, that is, it cannot be split into smaller subtasks; any leased virtual machine may compute any task, but each virtual machine can process only one task at a time; and a task may not be interrupted before its computation completes.
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, solving the virtual machine task scheduling problem in step 3 with the ant colony algorithm based on pheromone updating is an iterative process comprising the following 8 sub-steps:
Step 3.1: Initialization. This step initializes the basic parameters of the algorithm, including the information heuristic factor α, the expectation heuristic factor β, the pheromone evaporation factor ρ, the number of ants m, the maximum number of iterations $NC_{max}$, the pheromone $\tau_{i,j}$, and the transfer expectation $\eta_{i,j}$;
Step 3.2: The iteration begins. If the iteration count NC is less than the maximum number of iterations $NC_{max}$, set NC = NC + 1 and proceed to the next step; when the iteration count is greater than or equal to the maximum number of iterations, the iteration ends;
Step 3.3: Each ant computes, for each task, the probability of each virtual machine being selected according to the state transition formula;
The state transition formula is: [formula not reproduced]
where $\tau_{i,j}$ and $\eta_{i,j}$ denote, respectively, the pheromone and the expectation when task $T_i$ is assigned to $VM_j$, $P_{i,j}$ denotes the probability of assigning task $T_i$ to virtual machine $VM_j$, and n is the number of virtual machines leased by the user;
Step 3.4: Select a virtual machine by the roulette wheel algorithm. The roulette wheel algorithm turns the transition probabilities into the ant's next move: when an ant starts to select a virtual machine for a task, the wheel is spun, and the virtual machine corresponding to the region the pointer indicates when the wheel stops is the compute node that the k-th ant selects for the task. The larger the transition probability of a candidate virtual machine, the larger the area it occupies on the wheel and, accordingly, the greater the probability that it is selected to compute the task;
Step 3.5: When all tasks in the same layer of the workflow model have selected the same virtual machine, return to step 3.3 to reassign virtual machines to the tasks; otherwise proceed to the next step;
Step 3.6: Local pheromone update: after an ant completes the assignment of all tasks, update the pheromone of all virtual machines in that ant's scheduling scheme;
Step 3.7: Global pheromone update: after all ants have completed one traversal, find the best scheduling scheme of the current iteration, update the pheromone of all virtual machines in that scheme, and then return to step 3.2;
Step 3.8: Find the optimal allocation scheme and bind the virtual machines in the scheme to the corresponding workflow tasks.
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, the pheromone $\tau_{i,j}$ and the transfer expectation $\eta_{i,j}$ are both represented by the computing capability of the compute node:
$$\tau_{i,j} = \eta_{i,j} = MIPS_j / N$$
where $MIPS_j$ denotes the processing speed of the virtual machine $VM_j$ that processes task $T_i$, and N is a constant.
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, the local pheromone update in step 3.6 specifically comprises the following:
A. The residual pheromone is updated using the following formula:
$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t)$$
where $\tau_{ij}(t+1)$ denotes the amount of pheromone for task $T_i$ selecting virtual machine $VM_j$ at iteration t+1; $1-\rho$ denotes the pheromone residual factor; to prevent the unbounded accumulation of pheromone, the value range of ρ is: [range not reproduced]; and $\Delta\tau_{ij}(t)$ denotes the amount of pheromone left on virtual machine $VM_j$ when task $T_i$ selects $VM_j$ for execution;
B. The pheromone of all virtual machines in this ant's scheduling scheme is updated:
$$\Delta\tau_{ij}(t) = D / clock_{ij}$$
where D is a constant and $clock_{ij}$ denotes the execution time on the virtual machine $VM_j$ that the ant selected for task $T_i$ in the current cycle;
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, the global pheromone update in step 3.7 updates the pheromone of all virtual machines in the scheme according to the formula $\Delta\tau_{ij}(t) = D / bestclock_{ij}$, where $bestclock_{ij}$ denotes the completion time of task $T_i$ when virtual machine $VM_j$ is selected for task $T_i$ in the optimal allocation scheme.
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, step 3.6 further comprises defining a pheromone adjustment factor PC and adjusting the pheromone according to how the tasks are distributed across the virtual machines.
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, the pheromone adjustment factor PC is computed as: [formula not reproduced]
where $E_j$ is the time taken by virtual machine $VM_j$ to execute all of its assigned tasks in the current round of iteration.
Further, in the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention, adjusting the pheromone according to how the tasks are distributed across the virtual machines is specifically as follows:
After the local pheromone update and the global pheromone update, the updated pheromone is further adjusted according to the following formula:
$$\tau_{ij}(t+1) = \big((1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t)\big) \cdot PC$$
The other virtual machines, to which task $T_i$ was not assigned, have their pheromone adjusted according to the following formula:
$$\tau_{ix}(t+1) = \tau_{ix}(t) \cdot PC$$
where $\tau_{ix}(t+1)$ denotes the amount of pheromone, at iteration t+1, for task $T_i$ selecting a virtual machine $VM_x$ to which the task was not assigned.
Compared with the prior art, the technical solution adopted by the present invention has the following technical effects:
The cloud data center task scheduling method based on an improved ant colony algorithm provided by the present invention optimizes the basic ant colony algorithm. It not only shortens the time overhead of task scheduling but also takes the load state of each virtual machine into account, which prevents virtual machines from crashing or running under light load, avoids problems such as wasted resources, and improves resource utilization.
Detailed Description of the Embodiments
To help those skilled in the art better understand the technical problem, technical solution, and technical effects of the present application, the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
The present invention proposes a load-balancing task scheduling algorithm for cloud data centers based on an improved ant colony algorithm (Load balancing task scheduling algorithm based on ant colony algorithm for cloud datacenters, LACO). The LACO algorithm not only shortens the time overhead of task execution but also keeps the leased virtual machines in a relatively balanced load state during task scheduling. Further, most researchers to date have focused on scheduling independent tasks and have overlooked that users may submit workflow models whose tasks are interrelated and subject to precedence constraints. The present invention therefore uses a DAG (Directed Acyclic Graph) to study workflow scheduling: by considering the temporal or causal constraints between tasks, it selects an optimal resource for each task and coordinates the execution of the tasks to obtain the final execution result.
The present invention uses CloudSim as the simulation platform, runs simulation experiments on the LACO algorithm with it, and compares the results with the FIFO (First In First Out) scheduling strategy and with ACO (Ant colony algorithm, the basic ant colony scheduling algorithm) to verify the superiority of the LACO algorithm.
The cloud data center task scheduling algorithm based on an improved ant colony algorithm proposed by the present invention comprises the following steps, with the flow shown in Fig. 1:
Step 1: Input the set of tasks to be scheduled, as submitted by the user, and the set of virtual machines leased by the user;
Step 2: Express the problem of assigning tasks to resources for execution as a standard minimization problem;
Some of the tasks submitted by a user may depend on one another, so the present invention takes workflows as its object of study in order to solve the scheduling of related tasks in a cloud data center. A workflow can generally be described as a directed acyclic graph $G = (T, E)$, where T is the set of nodes in the DAG and represents the n tasks in the workflow, $T = \{T_1, T_2, T_3, \ldots, T_n\}$, and E is the set of directed edges in the workflow model, $E = \{(T_i, T_j) \mid T_i, T_j \in T\}$, representing the constraint relationship between two tasks. If there is a directed edge from task $T_i$ to task $T_j$, then $T_i$ is called the parent task of $T_j$ and $T_j$ is called the child task of $T_i$; in this case $T_j$ can execute only after $T_i$ has completed. Fig. 2 shows the basic framework of one workflow, which contains ten workflow tasks to be processed, labeled $T_0$ to $T_9$, whose lengths differ. In Fig. 2, $T = \{T_0, T_1, T_2, T_3, T_4, T_5, T_6, T_7, T_8, T_9\}$ and $E = \{(T_0,T_1), (T_0,T_2), (T_1,T_3), (T_1,T_4), (T_2,T_5), (T_2,T_6), (T_3,T_7), (T_4,T_7), (T_5,T_8), (T_6,T_8), (T_7,T_9), (T_8,T_9)\}$.
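The Fig. 2 workflow can be written down directly as an edge list. The following is a minimal sketch, assuming that a task's "layer" is its depth in the DAG (one more than the depth of its deepest parent); the text does not define layers formally, and the function names are illustrative only.

```python
from collections import defaultdict

# Edge list E of the Fig. 2 workflow; integer k stands for task T_k.
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6),
         (3, 7), (4, 7), (5, 8), (6, 8), (7, 9), (8, 9)]
NUM_TASKS = 10


def parents_of(edges):
    """Map each task to the list of its parent tasks."""
    parents = defaultdict(list)
    for parent, child in edges:
        parents[child].append(parent)
    return parents


def layers(num_tasks, edges):
    """Group tasks by depth (assumed meaning of 'layer'); entry tasks have depth 0."""
    parents = parents_of(edges)
    depth = {}

    def depth_of(task):
        if task not in depth:
            depth[task] = 1 + max((depth_of(p) for p in parents[task]), default=-1)
        return depth[task]

    grouped = defaultdict(list)
    for task in range(num_tasks):
        grouped[depth_of(task)].append(task)
    return [grouped[d] for d in sorted(grouped)]


print(layers(NUM_TASKS, EDGES))  # [[0], [1, 2], [3, 4, 5, 6], [7, 8], [9]]
```

Under this reading, $T_7$ may start only after both $T_3$ and $T_4$ have completed, and tasks within one layer have no edges between them.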
The present invention uses VM to denote the virtual machines leased by the user and m to denote the number of virtual machines, $VM = \{VM_1, VM_2, \ldots, VM_m\}$. The processing speed of $VM_i$ is expressed as $MIPS_i$, where MIPS stands for millions of machine-language instructions processed per second.
The present invention defines an n × m communication matrix $com = \{c_{i,j} \mid c_{i,j} \ge 0,\ 1 \le i \le n,\ 1 \le j \le m\}$, where n is the number of tasks, m is the number of virtual machines leased by the user, and $c_{i,j}$ (formula (1)) is the communication time required when task $T_i$ is assigned to virtual machine $VM_j$ for execution. It also defines an n × m computation matrix $exe = \{e_{i,j} \mid e_{i,j} \ge 0,\ 1 \le i \le n,\ 1 \le j \le m\}$, where $e_{i,j}$ (formula (2)) is the computation time of task $T_i$ on virtual machine $VM_j$.
$$c_{ij} = outputsize_i / bandwidth \qquad (1)$$
$$e_{ij} = Length_i / Mips_j \qquad (2)$$
where $outputsize_i$ is the size of the output file of task $T_i$; bandwidth is the bandwidth of the communication link between virtual machines; $Length_i$ is the size of task $T_i$; and $Mips_j$ is the processing speed of the virtual machine $VM_j$ that processes task $T_i$. If two consecutive tasks both execute on the same virtual machine, there is no data transfer cost.
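Formulas (1) and (2) can be evaluated in a few lines. The sketch below is illustrative only: the MIPS values and bandwidth are those given for the four virtual machines later in this embodiment, while the task lengths and output sizes are made-up values, since the task parameters are not available here.

```python
def communication_matrix(output_sizes, bandwidth, num_vms):
    """c[i][j] = outputsize_i / bandwidth, formula (1); identical for every VM j."""
    return [[size / bandwidth] * num_vms for size in output_sizes]


def computation_matrix(lengths, mips):
    """e[i][j] = Length_i / Mips_j, formula (2)."""
    return [[length / speed for speed in mips] for length in lengths]


mips = [420, 350, 508, 634]   # processing speeds of VM0..VM3 from the embodiment
bandwidth = 1000              # equal for all links, as the embodiment assumes
lengths = [4200, 7000]        # hypothetical task lengths
output_sizes = [500, 2000]    # hypothetical output file sizes

exe = computation_matrix(lengths, mips)
com = communication_matrix(output_sizes, bandwidth, len(mips))
print(exe[0][0], exe[1][1], com[1][0])  # 10.0 20.0 2.0
```

Because the bandwidth is the same on every link, each row of `com` is constant; the matrix form only matters if links differ.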
The time overhead of all tasks processed by virtual machine $VM_j$ can be expressed as $E_j$ (formula (3)), and the total time $E_{total}$ of the whole workflow is the completion time of the last task in the workflow.
[formula (3) not reproduced]
where $Task_j$ denotes all tasks executed on virtual machine $VM_j$, and Ftask denotes the parent tasks of the tasks executed on $VM_j$ (those parent tasks that are not themselves executed on $VM_j$).
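One way to evaluate a candidate schedule under the constraints above is a simple forward simulation. This is a sketch under stated assumptions, not the patent's formula (3): it assumes tasks are indexed in a valid topological order, that a child waits for each parent's finish time plus the parent's transfer time when the parent ran on a different virtual machine, and that the per-VM time is the sum of compute times plus inbound transfers. All names are hypothetical.

```python
def evaluate_schedule(assignment, exe, com, edges):
    """Return (finish_times, vm_busy, makespan) for a task-to-VM assignment."""
    num_tasks, num_vms = len(exe), len(exe[0])
    parents = {t: [] for t in range(num_tasks)}
    for parent, child in edges:
        parents[child].append(parent)

    finish = [0.0] * num_tasks
    vm_free = [0.0] * num_vms   # time at which each VM becomes idle
    vm_busy = [0.0] * num_vms   # assumed E_j: compute time plus inbound transfers
    for task in range(num_tasks):
        vm = assignment[task]
        ready, transfer = 0.0, 0.0
        for p in parents[task]:
            # Transfer is charged only when the parent ran on another VM.
            cost = com[p][vm] if assignment[p] != vm else 0.0
            ready = max(ready, finish[p] + cost)
            transfer += cost
        start = max(ready, vm_free[vm])   # a VM runs one task at a time
        finish[task] = start + exe[task][vm]
        vm_free[vm] = finish[task]
        vm_busy[vm] += exe[task][vm] + transfer
    return finish, vm_busy, max(finish)


# Hypothetical 3-task example: T0 -> T1 and T0 -> T2 on two VMs.
exe = [[2.0, 4.0], [3.0, 6.0], [3.0, 6.0]]
com = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
finish, vm_busy, makespan = evaluate_schedule([0, 0, 1], exe, com, [(0, 1), (0, 2)])
print(finish, vm_busy, makespan)  # [2.0, 5.0, 9.0] [5.0, 7.0] 9.0
```

Here the makespan corresponds to $E_{total}$, the completion time of the last task.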
Step 3: Solve the virtual machine task scheduling problem in the cloud computing environment with the task scheduling algorithm of the improved ant colony algorithm based on pheromone updating, as described in detail below:
Step 3.1: Initialize the cloud data center load-balancing task scheduling algorithm based on the improved ant colony algorithm;
This step initializes the information heuristic factor α, the expectation heuristic factor β, the pheromone evaporation factor ρ, the number of ants m, the maximum number of iterations, the pheromone, and the transfer expectation.
When the ant colony algorithm is applied to certain basic problems, the pheromone and the transfer expectation between two nodes are usually related to attributes such as distance. Because of the particular nature of the cloud computing environment, however, the present invention represents both the pheromone $\tau_{i,j}$ and the expectation $\eta_{i,j}$ of a node by the computing capability of the compute node.
$$\tau_{i,j} = \eta_{i,j} = MIPS_j / N \qquad (4)$$
In formula (4), $\tau_{i,j}$ and $\eta_{i,j}$ denote, respectively, the pheromone and the expectation when task $T_i$ is assigned to $VM_j$, and N is a constant (used as a coordination index).
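Formula (4) gives every task the same initial row, since the value depends only on the virtual machine. A minimal sketch follows; the value N = 100 is an arbitrary illustrative choice, as the text only states that N is a constant.

```python
def init_pheromone(num_tasks, mips, n_const):
    """tau[i][j] = eta[i][j] = MIPS_j / N, formula (4)."""
    row = [speed / n_const for speed in mips]
    # Independent copies, so that later pheromone updates do not alter eta.
    tau = [row[:] for _ in range(num_tasks)]
    eta = [row[:] for _ in range(num_tasks)]
    return tau, eta


tau, eta = init_pheromone(num_tasks=10, mips=[420, 350, 508, 634], n_const=100)
print(tau[0])
```

Faster virtual machines start with more pheromone and a higher expectation, so early iterations already favor them.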
Step 3.2: The iteration begins. If the iteration count NC is less than $NC_{max}$, set NC = NC + 1 and proceed to the next step; when the iteration count is greater than or equal to the maximum number of iterations, the iteration ends.
Step 3.3: Each ant computes, for each task, the probability of each virtual machine being selected according to the state transition formula (5).
[formula (5) not reproduced]
Formula (5) gives the probability of assigning task $T_i$ to $VM_j$, where n is the number of virtual machines leased by the user.
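Formula (5) itself is not available here. The sketch below therefore uses the standard ant colony transition rule, in which the weight of a candidate is its pheromone raised to α times its expectation raised to β, normalized over all candidates. Treat this as an assumption about the form of formula (5), not as a reproduction of it.

```python
def transition_probabilities(tau_row, eta_row, alpha, beta):
    """Standard ACO rule (assumed): P_j = tau_j^alpha * eta_j^beta / sum over all VMs."""
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau_row, eta_row)]
    total = sum(weights)
    return [w / total for w in weights]


# Initial pheromone and expectation from formula (4) with a hypothetical N = 100,
# and alpha = beta = 0.7 as in the embodiment's parameter settings.
row = [4.2, 3.5, 5.08, 6.34]
probs = transition_probabilities(row, row, alpha=0.7, beta=0.7)
print([round(p, 3) for p in probs])
```

With equal pheromone and expectation rows, the ordering of the probabilities follows the ordering of the MIPS values, so the fastest virtual machine is the most likely choice.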
Step 3.4: Select a virtual machine by the roulette wheel algorithm;
The present invention uses the roulette wheel algorithm to turn the transition probabilities into the ant's next move. The roulette wheel algorithm simulates the spinning of a roulette wheel: assume a circular wheel divided into m sectors of differing area, where the m sectors represent, for ant k, the probabilities of assigning a task in the task set to each of the virtual machines. As shown in Fig. 3, assume there are four candidate virtual machines, $VM_1$, $VM_2$, $VM_3$, and $VM_4$, with corresponding probabilities of 23%, 52%, 6%, and 19%, respectively.
When an ant starts to select a virtual machine for a task, the wheel is spun, and the virtual machine corresponding to the region that the pointer indicates when the wheel stops is the compute node that ant k selects for the task. The larger the transition probability of a candidate virtual machine, the larger the area it occupies on the wheel and, accordingly, the greater the probability that it is selected to execute the task. The implementation procedure of this algorithm is shown in Fig. 4.
For each task in the workflow, once the selection probability of every virtual machine has been determined, a number is generated at random in the interval [0, 1] and the selection probability of the first virtual machine is subtracted from it. If the difference is less than zero, that virtual machine is selected; otherwise the selection probability of the next virtual machine is subtracted in turn, until the result is less than or equal to 0. The virtual machine whose probability was subtracted last is the virtual machine selected for the task.
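The subtraction procedure just described can be sketched as follows, using the four probabilities from the Fig. 3 example. The function name and the optional `rng` parameter are illustrative; the final `return` is a guard added here for floating-point rounding and is not part of the description.

```python
import random


def roulette_select(probabilities, rng=random):
    """Pick an index by subtracting probabilities in turn from a random draw."""
    remainder = rng.random()  # uniform in [0, 1)
    for index, p in enumerate(probabilities):
        remainder -= p
        if remainder <= 0:
            return index
    return len(probabilities) - 1  # guard against floating-point rounding


rng = random.Random(42)
counts = [0, 0, 0, 0]
for _ in range(10_000):
    counts[roulette_select([0.23, 0.52, 0.06, 0.19], rng)] += 1
print(counts)  # roughly proportional to 23%, 52%, 6%, and 19%
```

For example, a draw of 0.5 gives 0.5 − 0.23 = 0.27 > 0 and then 0.27 − 0.52 < 0, so the second virtual machine is selected.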
Step 3.5: When all tasks in the same layer of the workflow model have selected the same virtual machine, return to step 3.3 to reassign virtual machines to the tasks; otherwise proceed to the next step;
Because of the particular nature of the workflow model, if the workflow tasks in the same layer are all assigned to the same virtual machine, then while one task executes, the other tasks in that layer face a long wait, and the virtual machines that have finished their tasks sit idle, which wastes resources. To solve this problem, as soon as the algorithm detects that the workflow tasks in the same layer have all selected the same virtual machine and are waiting, it reselects for them, according to formula (5), other virtual machines that are idle.
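The check in step 3.5 amounts to testing whether a layer's tasks map to a single virtual machine. A minimal sketch, assuming layers are given as lists of task indices and that single-task layers are ignored (the text does not address that case):

```python
def layer_collapsed(layer_tasks, assignment):
    """True when a layer with several tasks has placed all of them on one VM."""
    vms = {assignment[t] for t in layer_tasks}
    return len(layer_tasks) > 1 and len(vms) == 1


def collapsed_layers(layers, assignment):
    """Layers whose tasks need to be reassigned under step 3.5."""
    return [layer for layer in layers if layer_collapsed(layer, assignment)]


# Layers of the Fig. 2 workflow by DAG depth, and a hypothetical assignment
# in which assignment[k] is the VM index chosen for task T_k.
layers = [[0], [1, 2], [3, 4, 5, 6], [7, 8], [9]]
assignment = [3, 3, 3, 0, 1, 2, 3, 2, 3, 3]
print(collapsed_layers(layers, assignment))  # [[1, 2]]
```

In this example $T_1$ and $T_2$ both chose the same virtual machine, so one would wait on the other while the remaining machines sit idle; those two tasks would go back to step 3.3.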
Step 3.6: Local pheromone update: after an ant completes the assignment of all tasks, update the pheromone of all virtual machines in that ant's scheduling scheme;
To prevent the residual pheromone from overwhelming the heuristic information, the residual pheromone must be updated by formula (6) after each ant completes all of its scheduling.
$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t) \qquad (6)$$
where $\tau_{ij}(t+1)$ denotes the amount of pheromone for task $T_i$ selecting virtual machine $VM_j$ at iteration t+1, and $1-\rho$ denotes the pheromone residual factor. To prevent the unbounded accumulation of pheromone, the value range of ρ is: [range not reproduced]
The present invention uses the Ant-Cycle model among the basic ant colony algorithm models proposed by M. Dorigo; this model uses global information. $\Delta\tau_{ij}(t)$ denotes the amount of pheromone left on virtual machine $VM_j$ when task $T_i$ selects $VM_j$ for execution, with $\Delta\tau_{ij}(0) = 0$ at the initial moment. After an ant completes all task scheduling, the pheromone of all virtual machines in that ant's scheduling scheme is updated according to formula (7).
$$\Delta\tau_{ij}(t) = D / clock_{ij} \qquad (7)$$
where D is a constant and $clock_{ij}$ denotes the execution time on the virtual machine $VM_j$ that the ant selected for task $T_i$ in the current cycle.
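Formulas (6) and (7) together can be sketched as a single pass over one ant's scheduling scheme. The values below are illustrative only (ρ = 0.5 and D = 10 are chosen so the arithmetic is easy to follow), and `clock[i]` is assumed to hold the execution time of task $T_i$ on the virtual machine the ant chose for it.

```python
def local_update(tau, assignment, clock, rho, d_const):
    """Apply formulas (6) and (7) to the (task, VM) pairs chosen by one ant."""
    for task, vm in enumerate(assignment):
        delta = d_const / clock[task]                       # formula (7)
        tau[task][vm] = (1 - rho) * tau[task][vm] + delta   # formula (6)
    return tau


tau = [[4.0, 6.0], [4.0, 6.0]]
local_update(tau, assignment=[1, 0], clock=[2.0, 5.0], rho=0.5, d_const=10.0)
print(tau)  # [[4.0, 8.0], [4.0, 6.0]]
```

A shorter execution time yields a larger deposit: the first task ran in 2.0 time units and adds 5.0, while the second ran in 5.0 and adds only 2.0.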
The present invention defines a pheromone adjustment factor PC and adjusts the pheromone according to how the tasks are distributed across the virtual machines.
For tasks in different layers of the workflow, to prevent them from all selecting the virtual machine with the best computing capability and thereby overloading it, the pheromone adjustment factor PC is defined as shown in formula (8), where $E_j$ is the time taken by virtual machine $VM_j$ to execute all of its assigned tasks in the current round of iteration. After the local pheromone update and the global pheromone update, the updated pheromone is further adjusted according to formula (9), and the other virtual machines, to which task $T_i$ was not assigned, have their pheromone adjusted according to formula (10).
[formula (8) not reproduced]
$$\tau_{ij}(t+1) = \big((1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t)\big) \cdot PC \qquad (9)$$
$$\tau_{ix}(t+1) = \tau_{ix}(t) \cdot PC \qquad (10)$$
If virtual machine $VM_j$ was overloaded in the previous iteration, $E_j$ is relatively large, so the pheromone adjustment factor PC of that virtual machine is relatively small, and the probability of selecting $VM_j$ for task $T_i$ in the next iteration is lower. After many iterations, the load on the virtual machines becomes relatively balanced, which improves the execution efficiency of the system.
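Formulas (9) and (10) can be sketched without knowing formula (8) by taking the adjustment factors as an input. The sketch assumes one factor per virtual machine, passed as `pc[j]`, which matches the description that an overloaded machine has a smaller factor; how those factors are derived from $E_j$ is not reproduced here and is left to the caller.

```python
def adjust_with_pc(tau, assignment, delta, pc, rho):
    """Apply formula (9) to each chosen (task, VM) pair and formula (10) to the rest."""
    for task, chosen in enumerate(assignment):
        for vm in range(len(tau[task])):
            if vm == chosen:
                # Formula (9): evaporate, deposit, then scale by the VM's factor.
                tau[task][vm] = ((1 - rho) * tau[task][vm] + delta[task]) * pc[vm]
            else:
                # Formula (10): VMs not chosen for this task are only scaled.
                tau[task][vm] = tau[task][vm] * pc[vm]
    return tau


# Hypothetical values: VM1 was overloaded, so its factor is below 1.
tau = [[4.0, 6.0]]
adjust_with_pc(tau, assignment=[1], delta=[5.0], pc=[1.0, 0.5], rho=0.5)
print(tau)  # [[4.0, 4.0]]
```

In this example the deposit would have raised the pheromone of the chosen machine to 8.0, but its factor of 0.5 brings it back level with the unloaded machine.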
Step 3.7: After all ants have completed one traversal, find the best scheduling scheme of the current iteration and update the pheromone of all virtual machines in that scheme according to formula (11).
$$\Delta\tau_{ij}(t) = D / bestclock_{ij} \qquad (11)$$
where D is a constant and $bestclock_{ij}$ denotes the completion time of task $T_i$ when virtual machine $VM_j$ is selected for task $T_i$ in the optimal allocation scheme.
Step 3.8: Find the optimal allocation scheme and bind the virtual machines in the scheme to the corresponding workflow tasks.
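The global update of step 3.7 can be sketched as follows. It assumes that the increment from formula (11) is combined with the existing pheromone through the same evaporation rule as formula (6), and that the best scheme is the one with the smallest overall completion time; the dictionary layout of a scheme is hypothetical.

```python
def best_scheme(schemes):
    """Pick the scheme with the smallest makespan among this iteration's ants."""
    return min(schemes, key=lambda s: s["makespan"])


def global_update(tau, scheme, rho, d_const):
    """Reinforce the (task, VM) pairs of the best scheme using formula (11)."""
    for task, vm in enumerate(scheme["assignment"]):
        delta = d_const / scheme["finish"][task]            # formula (11)
        tau[task][vm] = (1 - rho) * tau[task][vm] + delta   # assumed: same rule as (6)
    return tau


# Two hypothetical schemes produced by two ants in one iteration.
schemes = [
    {"assignment": [0, 1], "finish": [4.0, 10.0], "makespan": 10.0},
    {"assignment": [1, 0], "finish": [2.0, 5.0], "makespan": 5.0},
]
tau = [[4.0, 6.0], [4.0, 6.0]]
best = best_scheme(schemes)
global_update(tau, best, rho=0.5, d_const=10.0)
print(best["assignment"], tau)  # [1, 0] [[4.0, 8.0], [4.0, 6.0]]
```

Only the pairs in the best scheme are reinforced, which biases the next iteration toward assignments that already finished early.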
To obtain the cloud data center task scheduling result with the heuristic load-balancing algorithm proposed by the present invention, the task scheduling problem is first converted into a standard minimization problem; next, the method is initialized by inputting the set of tasks to be scheduled, as submitted by the user, and the set of virtual machines leased by the user; then the cloud data center task scheduling algorithm based on the improved ant colony algorithm carries out the scheduling process of steps 3.1 to 3.8, finally yielding the optimal solution.
To test whether this algorithm offers better scheduling performance and load-balancing capability than the FIFO scheduling strategy, the basic ant colony scheduling algorithm (ACO), and the combined ant colony and roulette wheel algorithm (RACO), the present invention uses the cloud computing simulation tool CloudSim to simulate a cloud computing data center and rewrites classes such as DatacenterBroker and Cloudlet, thereby implementing simulations of the four task scheduling algorithms above.
In addition, the present invention designs a workflow model for verifying the effectiveness of the LACO algorithm. The parameter values of the ten tasks in this workflow model are: [table not reproduced]
The LACO algorithm parameters are set as follows: α = 0.7, β = 0.7, ρ = 0.3, 100 ants, and 50 iterations.
The present invention simulates a data center in CloudSim and defines four virtual machines in it, numbered $VM_0$, $VM_1$, $VM_2$, and $VM_3$. The bandwidth of the communication links between all virtual machines is assumed to be equal. The parameter values set for the four virtual machines are:
| Virtual machine ID | MIPS | Bandwidth | Memory capacity |
|---|---|---|---|
| 0 | 420 | 1000 | 50 GB |
| 1 | 350 | 1000 | 120 GB |
| 2 | 508 | 1000 | 235 GB |
| 3 | 634 | 1000 | 450 GB |
The task allocation results of the four algorithms and the corresponding completion time of each task are: [table not reproduced]
For the ACO, RACO, and LACO algorithms, the data are those of the globally optimal solution. It can be seen that the FIFO scheduling strategy simply assigns tasks one by one in virtual machine order. This method is very inefficient: because it does not consider the processing performance of each virtual machine, virtual machines with low processing capability also carry a large load, so the whole scheduling process takes a very long time to complete. With the ACO algorithm, after many iterations all tasks select the virtual machine with the strongest computing capability, $VM_3$, so $VM_3$ is overloaded and the time overhead is also excessive. With the RACO algorithm, because $VM_2$ and $VM_3$ have higher processing performance than the other two virtual machines, a large number of tasks are assigned to these two virtual machines, while $VM_1$ has no task queue to execute and sits idle, which wastes data center resources. Finally, under LACO scheduling, each virtual machine receives tasks commensurate with its execution performance, no resources are wasted, and the execution efficiency of the system improves.
Fig. 5 shows the completion time of each task in the workflow under the four algorithms. As the figure shows, FIFO and ACO take longer to process each task than RACO and LACO. For the first several tasks, the completion times under FIFO are higher than under ACO, but after $T_4$ has executed, the completion times under ACO gradually exceed those under FIFO. RACO and LACO take very similar times to execute each task, but for the last three tasks the processing time under LACO is significantly shorter than under RACO.
Fig. 6 shows, for each of the four algorithms, the share of the summed completion time of all virtual machines that each virtual machine in the data center ran for during the experiment. As the figure shows, under the FIFO scheduling strategy every virtual machine is assigned tasks, but because assignment is in order, $VM_0$ and $VM_1$ carry a large number of tasks while the more capable $VM_2$ and $VM_3$ account for a smaller percentage of execution time, so the whole process takes too long. With the ACO algorithm, after many iterations the ants select the most capable virtual machine, $VM_3$, for all tasks, so $VM_3$ is overloaded and the other virtual machines all sit idle. With the RACO algorithm, $VM_0$ and $VM_1$ are not assigned any tasks; even though $VM_2$ and $VM_3$ have strong computing capability, the excessive load on them still lengthens the system running time. Finally, with the LACO algorithm every virtual machine executes tasks, and the percentage of execution time accounted for by each virtual machine is proportional to its computing capability: more capable virtual machines execute more tasks, which balances the load on the virtual machines while maintaining efficiency.
Obviously, those skilled in the art will understand that various improvements can be made to the cloud data center task scheduling method based on an improved ant colony algorithm disclosed above without departing from the content of the present invention. The scope of protection of the present invention should therefore be determined by the content of the appended claims.