Summary of the Invention
The technical problem to be solved by the present invention is to provide a cloud data center task scheduling method based on an improved ant colony algorithm that allocates virtual machines reasonably and schedules tasks efficiently.
The technical solution adopted by the present invention to solve this technical problem is a cloud data center task scheduling method based on an improved ant colony algorithm, comprising the following steps:
Step 1: input the set of workflow tasks to be scheduled that the user has submitted and the set of virtual machines that the user has leased;
Step 2: express the problem of scheduling tasks onto virtual machines for execution as a standard minimization problem;
Step 3: solve the virtual machine task scheduling problem in the cloud computing environment using an ant colony algorithm improved on the basis of pheromone updating.
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, step 2 expresses the problem of scheduling tasks onto virtual machines for execution as a standard minimization problem, in which the optimization objective is to minimize the cost of completing all tasks under the scheduling strategy, that is, to minimize the time taken to complete all tasks;
The constraints are as follows: the number of tasks is greater than the number of leased virtual machines; every task in the task set is an atomic task (meta-task), that is, no task can be split into smaller subtasks; each task may be computed on any leased virtual machine, but each virtual machine can process only one task at a time; and a task may not be interrupted before its computation completes.
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, solving the virtual machine task scheduling problem in step 3 with the ant colony algorithm improved on the basis of pheromone updating is an iterative process comprising the following eight sub-steps:
Step 3.1: initialization. This step initializes the basic parameters of the algorithm, including the information heuristic factor α, the expectation heuristic factor β, the pheromone evaporation factor ρ, the number of ants m, the maximum number of iterations $NC_{max}$, the pheromone $\tau_{i,j}$, and the transfer expectation $\eta_{i,j}$;
Step 3.2: the iteration starts. If the iteration count NC is less than the maximum number of iterations $NC_{max}$, set NC = NC + 1 and go to the next step; when the iteration count is greater than or equal to the maximum number of iterations, the iteration ends;
Step 3.3: according to the state transition formula, each ant computes, for each task, the probability that each virtual machine is selected;
The state transition formula is:
where $\tau_{i,j}$ and $\eta_{i,j}$ denote, respectively, the pheromone and the expectation when task $T_i$ is assigned to $VM_j$, $P_{i,j}$ denotes the probability that task $T_i$ is assigned to virtual machine $VM_j$, and n is the number of virtual machines leased by the user;
Step 3.4: select a virtual machine by the roulette algorithm. The roulette algorithm resolves the ant's transition-probability choice: when an ant starts to select a virtual machine for a task, the wheel is spun, and when the wheel stops, the virtual machine corresponding to the region indicated by the pointer is the computing node that ant k selects for the task. The larger the transition probability of a candidate virtual machine, the larger the area it occupies on the wheel and, accordingly, the more likely it is to be selected to compute the task;
Step 3.5: when all tasks in the same layer of the workflow model have selected the same virtual machine, return to step 3.3 and reassign virtual machines to the tasks; otherwise go to the next step;
Step 3.6: local pheromone update: after an ant has completed the assignment of all tasks, update the pheromone of all virtual machines in that ant's scheduling scheme;
Step 3.7: global pheromone update: after all ants have completed one traversal, find the optimal scheduling scheme of this iteration, update the pheromone of all virtual machines in that scheme, and then return to step 3.2;
Step 3.8: find the optimal allocation scheme and bind the virtual machines in the scheme to the corresponding workflow tasks.
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, the pheromone $\tau_{i,j}$ and the transfer expectation $\eta_{i,j}$ are both represented by the computing power of the computing node:
$\tau_{i,j} = \eta_{i,j} = MIPS_j / N$
where $MIPS_j$ denotes the processing speed of the virtual machine $VM_j$ that processes task $T_i$, and N is a constant.
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, the local pheromone update of step 3.6 specifically comprises the following:
A. The residual pheromone is updated using the following formula:
$\tau_{ij}(t+1) = (1-\rho) \cdot \tau_{ij}(t) + \Delta\tau_{ij}(t)$
where $\tau_{ij}(t+1)$ denotes the amount of pheromone for task $T_i$ selecting virtual machine $VM_j$ at iteration t+1, and $1-\rho$ denotes the pheromone residual factor; to prevent unlimited accumulation of pheromone, the value range of ρ is:
$\Delta\tau_{ij}(t)$ denotes the amount of pheromone left on virtual machine $VM_j$ when task $T_i$ selects $VM_j$ for execution;
B. The pheromone of all virtual machines in the ant's scheduling scheme is updated:
$\Delta\tau_{ij}(t) = D / clock_{ij}$
where D is a constant and $clock_{ij}$ denotes the execution time of task $T_i$ on the virtual machine $VM_j$ that the ant selected for it in this cycle;
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, the global pheromone update of step 3.7 updates the pheromone of all virtual machines in the optimal scheme according to the formula $\Delta\tau_{ij}(t) = D / bestclock_{ij}$, where $bestclock_{ij}$ denotes the completion time of task $T_i$ when it selects virtual machine $VM_j$ in the optimal allocation scheme.
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, step 3.6 also comprises defining a pheromone adjustment factor PC and adjusting the pheromone according to how the tasks are distributed across the virtual machines.
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, the pheromone adjustment factor PC is calculated by the formula:
where $E_j$ is the time that virtual machine $VM_j$ takes to execute all of its tasks in the current iteration.
Further, in the cloud data center task scheduling method based on the improved ant colony algorithm of the present invention, adjusting the pheromone according to how the tasks are distributed across the virtual machines is specifically:
After the local pheromone update and the global pheromone update, the updated pheromone is adjusted again according to the following formula:
$\tau_{ij}(t+1) = ((1-\rho) \cdot \tau_{ij}(t) + \Delta\tau_{ij}(t)) \cdot PC$;
The pheromone of the other virtual machines, to which task $T_i$ was not assigned, is adjusted according to the following formula:
$\tau_{ix}(t+1) = \tau_{ix}(t) \cdot PC$,
where $\tau_{ix}(t+1)$ denotes the amount of pheromone, at iteration t+1, for task $T_i$ selecting a virtual machine $VM_x$ to which the task was not assigned.
Compared with the prior art, the technical solution adopted by the present invention has the following technical effects:
The cloud data center task scheduling method based on an improved ant colony algorithm provided by the present invention is an optimization of the basic ant colony algorithm. It not only shortens the time overhead of task scheduling but also takes the load state of each virtual machine into account, which prevents virtual machines from crashing or sitting idle, avoids problems such as wasted resources, and improves resource utilization.
Embodiments
To help those skilled in the art better understand the technical problem, technical solution, and technical effects of the present application, the cloud data center task scheduling method based on an improved ant colony algorithm of the present invention is described in further detail below with reference to the drawings and specific embodiments.
The present invention proposes a cloud data center load balancing task scheduling algorithm based on an improved ant colony algorithm (Load balancing task scheduling algorithm based on ant colony algorithm for cloud data centers, LACO). The LACO algorithm not only shortens the time overhead of task execution but also keeps the load on the leased virtual machines relatively balanced during task scheduling. In addition, most researchers focus on the scheduling of independent tasks and overlook the fact that users may submit interrelated workflow models with precedence constraints; the present invention therefore uses a DAG (Directed Acyclic Graph) to study workflow scheduling. By considering the temporal or causal constraints between tasks, the best resource is selected for each task, and the execution of the tasks is coordinated to obtain the final execution result.
The present invention uses CloudSim as the simulation platform, runs simulation experiments on the LACO algorithm with it, and compares LACO with the FIFO (First In First Out) scheduling strategy and with ACO (Ant colony algorithm, the basic ant colony scheduling algorithm) to verify the superiority of the LACO algorithm.
The cloud data center task scheduling algorithm based on an improved ant colony algorithm proposed by the present invention comprises the following steps, with the flow shown in Fig. 1:
Step 1: input the set of tasks to be scheduled that the user has submitted and the set of virtual machines that the user has leased;
Step 2: express the problem of scheduling tasks onto resources for execution as a standard minimization problem;
Some of the tasks submitted by a user may depend on one another, so the present invention takes the workflow as its research object in order to solve the scheduling of interdependent tasks in a cloud data center. A workflow can usually be described as a directed acyclic graph G = (T, E), where T is the set of nodes in the DAG and represents the n tasks in the workflow, $T = \{T_1, T_2, T_3, \ldots, T_n\}$, and E is the set of directed edges in the workflow model, $E = \{(T_i, T_j) \mid T_i, T_j \in T\}$, which represents the constraint relation between two tasks. If there is a directed edge from task $T_i$ to task $T_j$, then $T_i$ is called the parent task of $T_j$ and $T_j$ is called the child task of $T_i$; in this case $T_j$ can execute only after $T_i$ has completed. Fig. 2 shows the basic framework of one workflow, which contains ten workflow tasks to be processed, labeled $T_0$ to $T_9$; the lengths of these tasks differ. In Fig. 2, $T = \{T_0, T_1, T_2, T_3, T_4, T_5, T_6, T_7, T_8, T_9\}$ and $E = \{(T_0, T_1), (T_0, T_2), (T_1, T_3), (T_1, T_4), (T_2, T_5), (T_2, T_6), (T_3, T_7), (T_4, T_7), (T_5, T_8), (T_6, T_8), (T_7, T_9), (T_8, T_9)\}$.
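The layer structure implied by the edge set of Fig. 2 can be sketched as follows. This is only an illustration: the function name `workflow_layers` and the rule that a task's layer is one more than its deepest parent's layer are assumptions made for the example, not definitions taken from the text.

```python
from collections import defaultdict

# Edges of the Fig. 2 workflow as listed in the text: (parent, child).
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6),
         (3, 7), (4, 7), (5, 8), (6, 8), (7, 9), (8, 9)]

def workflow_layers(num_tasks, edges):
    """Group tasks into layers (assumed rule: 1 + the deepest parent's layer)."""
    children = defaultdict(list)
    indegree = [0] * num_tasks
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    depth = [0] * num_tasks
    ready = [t for t in range(num_tasks) if indegree[t] == 0]
    while ready:
        task = ready.pop()
        for child in children[task]:
            depth[child] = max(depth[child], depth[task] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:   # all parents processed
                ready.append(child)
    layers = defaultdict(list)
    for task, d in enumerate(depth):
        layers[d].append(task)
    return [layers[d] for d in sorted(layers)]

LAYERS = workflow_layers(10, EDGES)
```

Under this assumed rule the ten tasks fall into five layers, which is the grouping that the same-layer check of step 3.5 would operate on.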
The present invention uses VM to denote the virtual machines leased by the user and m to denote the number of virtual machines, $VM = \{VM_1, VM_2, \ldots, VM_m\}$. The processing speed of $VM_i$ is denoted $MIPS_i$, where MIPS stands for millions of machine instructions processed per second.
The present invention defines an n × m communication matrix com, $com = \{c_{i,j} \mid c_{i,j} \ge 0, 1 \le i \le n, 1 \le j \le m\}$, where n is the number of tasks, m is the number of virtual machines leased by the user, and $c_{i,j}$ (given by formula (1)) is the communication time required when task $T_i$ is assigned to virtual machine $VM_j$ for execution. It also defines an n × m computation matrix exe, $exe = \{e_{i,j} \mid e_{i,j} \ge 0, 1 \le i \le n, 1 \le j \le m\}$, where $e_{i,j}$ (given by formula (2)) is the computation time of task $T_i$ when executed on virtual machine $VM_j$.
$c_{ij} = outputsize_i / bandwidth$ (1)
$e_{ij} = Length_i / Mips_j$ (2)
where $outputsize_i$ denotes the size of the output file of task $T_i$, and bandwidth denotes the bandwidth of the communication link between virtual machines; $Length_i$ denotes the size of task $T_i$, and $Mips_j$ denotes the processing speed of the virtual machine $VM_j$ that processes task $T_i$. If two consecutive tasks both execute on the same virtual machine, there is no data transfer cost.
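A minimal sketch of how the two matrices of formulas (1) and (2) could be built. The virtual machine speeds and the bandwidth are the values used in the experiment described later in this document; the task lengths and output sizes are hypothetical illustration values, and the function names are assumptions.

```python
# VM speeds and bandwidth come from the experiment table in this document;
# task lengths and output sizes are hypothetical, not the patent's data.
VM_MIPS = [420, 350, 508, 634]
BANDWIDTH = 1000
TASK_LENGTH = [4200, 7000, 5080]        # hypothetical
TASK_OUTPUT_SIZE = [2000, 500, 1000]    # hypothetical

def communication_matrix(output_sizes, bandwidth, num_vms):
    """Formula (1): c_ij = outputsize_i / bandwidth, identical for every VM j."""
    return [[size / bandwidth] * num_vms for size in output_sizes]

def compute_matrix(lengths, vm_mips):
    """Formula (2): e_ij = Length_i / Mips_j."""
    return [[length / mips for mips in vm_mips] for length in lengths]

com = communication_matrix(TASK_OUTPUT_SIZE, BANDWIDTH, len(VM_MIPS))
exe = compute_matrix(TASK_LENGTH, VM_MIPS)
```

Because the bandwidth is assumed equal between all virtual machines, every row of `com` is constant; only `exe` varies with the chosen virtual machine.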
The time overhead of all tasks processed by virtual machine $VM_j$ can be denoted $E_j$ (given by formula (3)), and the total time $E_{total}$ spent by the whole workflow is the completion time of the last task in the workflow.
where $Task_j$ denotes all tasks executed on virtual machine $VM_j$, and Ftask denotes the parent tasks of all tasks executed on virtual machine $VM_j$ (counting only parent tasks that are not themselves executed on $VM_j$).
Step 3: solve the virtual machine task scheduling problem in the cloud computing environment using the task scheduling algorithm of the ant colony algorithm improved on the basis of pheromone updating, described in detail as follows:
Step 3.1: initialize the cloud data center load balancing task scheduling algorithm based on the improved ant colony algorithm;
This step initializes the information heuristic factor α, the expectation heuristic factor β, the pheromone evaporation factor ρ, the number of ants m, the maximum number of iterations, the pheromone, and the transfer expectation.
When the ant colony algorithm is used to solve basic problems, the pheromone and the transfer expectation between two nodes are usually related to attributes such as distance. Because of the particular nature of the cloud computing environment, however, the present invention represents both the pheromone $\tau_{i,j}$ and the expectation $\eta_{i,j}$ of a node by the computing power of the computing node.
$\tau_{i,j} = \eta_{i,j} = MIPS_j / N$ (4)
In formula (4), $\tau_{i,j}$ and $\eta_{i,j}$ denote, respectively, the pheromone and the expectation when task $T_i$ is assigned to $VM_j$, and N is a constant (serving as a coordination coefficient).
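The initialization of formula (4) can be sketched as below. The function name `init_pheromone` is an assumption, and so is the choice to keep the expectation in a separate copy so that later pheromone updates leave it untouched; the text does not say whether the expectation is ever updated.

```python
def init_pheromone(num_tasks, vm_mips, n_const):
    """Formula (4): tau_ij = eta_ij = MIPS_j / N for every task i."""
    tau = [[mips / n_const for mips in vm_mips] for _ in range(num_tasks)]
    # Assumption: eta is stored separately so pheromone updates do not alter it.
    eta = [row[:] for row in tau]
    return tau, eta
```

For example, `init_pheromone(10, [420, 350, 508, 634], 100)` gives every task the same initial row, in which a faster virtual machine starts with more pheromone; the value 100 for N is hypothetical.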
Step 3.2: the iteration starts. If the iteration count NC is less than $NC_{max}$, set NC = NC + 1 and go to the next step; when the iteration count is greater than or equal to the maximum number of iterations, the iteration ends.
Step 3.3: according to the state transition formula (5), each ant computes, for each task, the probability that each virtual machine is selected.
Formula (5) gives the probability that task $T_i$ is assigned to $VM_j$, where n is the number of virtual machines leased by the user.
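Formula (5) itself is not reproduced in this text. The sketch below therefore assumes the standard ant colony transition rule, in which the probability is proportional to the pheromone raised to α times the expectation raised to β, normalized over all candidate virtual machines. That form is consistent with the symbols defined above, but it is an assumption, as is the function name.

```python
def transition_probabilities(tau_i, eta_i, alpha, beta):
    """Assumed standard ACO rule for one task i:
    P_ij = tau_ij^alpha * eta_ij^beta / sum_k(tau_ik^alpha * eta_ik^beta)."""
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau_i, eta_i)]
    total = sum(weights)
    return [w / total for w in weights]
```

Here `tau_i` and `eta_i` are the rows of pheromone and expectation for a single task, so the returned list has one probability per candidate virtual machine and sums to 1.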
Step 3.4: select a virtual machine by the roulette algorithm;
The present invention uses the roulette algorithm to resolve the ant's transition-probability choice. The roulette algorithm simulates a roulette wheel: suppose there is a circular wheel divided into m sectors of different areas; for ant k, these m sectors represent the probabilities of assigning a task in the task set to each of the virtual machines. As shown in Fig. 3, suppose there are four candidate virtual machines, $VM_1$, $VM_2$, $VM_3$, and $VM_4$, with corresponding probabilities of 23%, 52%, 6%, and 19%.
When an ant starts to select a virtual machine for a task, the wheel is spun; when the wheel stops, the virtual machine corresponding to the region indicated by the pointer is the computing node that ant k selects for the task. The larger the transition probability of a candidate virtual machine, the larger the area it occupies on the wheel and, accordingly, the more likely it is to be selected to execute the task. The implementation of this algorithm is shown in Fig. 4.
For each task in the workflow, once the selection probability of every virtual machine has been determined, a number is generated at random in the interval [0, 1] and the selection probability of the first virtual machine is subtracted from it. If the difference is less than zero, that virtual machine is selected; otherwise the selection probability of the next virtual machine is subtracted in turn, until the result is less than or equal to 0. The virtual machine whose probability was subtracted last is the one selected for the task.
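The successive-subtraction procedure just described can be sketched as follows. The function name, the optional `r` argument (added so the draw can be fixed for illustration), and the final fallback for floating-point round-off are assumptions; the stopping test uses "less than or equal to 0", the second of the two conditions given in the text.

```python
import random

def roulette_select(probabilities, r=None):
    """Return the index of the VM chosen by successive subtraction."""
    if r is None:
        r = random.random()          # uniform draw in [0, 1)
    for index, p in enumerate(probabilities):
        r -= p                       # subtract this VM's selection probability
        if r <= 0:
            return index
    return len(probabilities) - 1    # guard against floating-point round-off

# Probabilities of the Fig. 3 example for VM_1 .. VM_4.
FIG3_PROBS = [0.23, 0.52, 0.06, 0.19]
choice = roulette_select(FIG3_PROBS, r=0.50)   # falls in the second sector
```

With the Fig. 3 probabilities, a draw of 0.50 exceeds the first sector (0.23) but not the first two combined (0.75), so the second virtual machine is selected.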
Step 3.5: when all tasks in the same layer of the workflow model have selected the same virtual machine, return to step 3.3 and reassign virtual machines to the tasks; otherwise go to the next step;
Because of the particular nature of the workflow model, if the workflow tasks of the same layer are all assigned to the same virtual machine, then while one task executes, the other tasks of that layer enter a long waiting period. The other virtual machines, which have no task to process, are left idle, and resources are wasted. To solve this problem, once the algorithm detects that the workflow tasks of one layer have all selected the same virtual machine and are stuck waiting, it reselects other idle virtual machines for them according to formula (5).
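A minimal sketch of the same-layer check. The function names are hypothetical, and treating a layer with a single task as never conflicting is an assumption (such a layer trivially uses one virtual machine, and no other task is kept waiting).

```python
def layer_on_single_vm(assignment, layer):
    """True if a layer with several tasks maps every task to the same VM."""
    vms = {assignment[task] for task in layer}
    return len(layer) > 1 and len(vms) == 1

def first_conflicting_layer(assignment, layers):
    """Return the first layer whose tasks all share one VM, else None."""
    for layer in layers:
        if layer_on_single_vm(assignment, layer):
            return layer
    return None
```

Here `assignment` maps each task index to the index of its chosen virtual machine and `layers` is a list of task-index lists, one per workflow layer. When a layer is returned, the scheduler would go back to step 3.3 and reselect virtual machines for its tasks.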
Step 3.6: local pheromone update: after an ant has completed the assignment of all tasks, update the pheromone of all virtual machines in that ant's scheduling scheme;
To prevent an excess of residual pheromone from overwhelming the heuristic information, the residual pheromone is updated by formula (6) after each ant has completed all of its scheduling.
$\tau_{ij}(t+1) = (1-\rho) \cdot \tau_{ij}(t) + \Delta\tau_{ij}(t)$ (6)
where $\tau_{ij}(t+1)$ denotes the amount of pheromone for task $T_i$ selecting virtual machine $VM_j$ at iteration t+1, and $1-\rho$ denotes the pheromone residual factor; to prevent unlimited accumulation of pheromone, the value range of ρ is:
The present invention uses the Ant-Cycle model, one of the basic ant colony algorithm models proposed by M. Dorigo; this model uses global information. $\Delta\tau_{ij}(t)$ denotes the amount of pheromone left on virtual machine $VM_j$ when task $T_i$ selects $VM_j$ for execution, and initially $\Delta\tau_{ij}(0) = 0$. After an ant has completed the scheduling of all tasks, the pheromone of all virtual machines in that ant's scheduling scheme is updated according to formula (7).
$\Delta\tau_{ij}(t) = D / clock_{ij}$ (7)
where D is a constant and $clock_{ij}$ denotes the execution time of task $T_i$ on the virtual machine $VM_j$ that the ant selected for it in this cycle.
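Formulas (6) and (7) together can be sketched as one update over the task-to-virtual-machine pairs that a single ant used. The function and parameter names are hypothetical, and applying evaporation only to the pairs in the ant's scheme (rather than to the whole matrix) is an assumption based on the wording of step 3.6.

```python
def local_update(tau, assignment, clock, rho, d_const):
    """Apply formulas (6) and (7) to the (task, VM) pairs one ant used.

    assignment[i] is the VM chosen for task i; clock[i] is that task's
    execution time on the chosen VM.
    """
    for task, vm in enumerate(assignment):
        delta = d_const / clock[task]                        # formula (7)
        tau[task][vm] = (1 - rho) * tau[task][vm] + delta    # formula (6)
    return tau
```

A shorter execution time yields a larger deposit, so assignments that finish quickly become more attractive to later ants.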
The present invention defines a pheromone adjustment factor PC and uses it to adjust the pheromone according to how the tasks are distributed across the virtual machines.
Tasks in different layers of the workflow might otherwise all select the virtual machine with the best computing power and overload it. To prevent this, the pheromone adjustment factor PC is defined as shown in formula (8), where $E_j$ is the time that virtual machine $VM_j$ takes to execute all of its tasks in the current iteration. After the local pheromone update and the global pheromone update, the updated pheromone is adjusted again according to formula (9), and the pheromone of the other virtual machines, to which task $T_i$ was not assigned, is adjusted according to formula (10).
$\tau_{ij}(t+1) = ((1-\rho) \cdot \tau_{ij}(t) + \Delta\tau_{ij}(t)) \cdot PC$ (9)
$\tau_{ix}(t+1) = \tau_{ix}(t) \cdot PC$ (10)
If virtual machine $VM_j$ was overloaded in the previous iteration, so that $E_j$ is relatively large, then the pheromone adjustment factor PC of that virtual machine is relatively small, and the probability of selecting virtual machine $VM_j$ for task $T_i$ in the next iteration is correspondingly low. After many iterations, the load on the virtual machines is kept relatively balanced, which improves the execution efficiency of the system.
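The adjustment of formulas (9) and (10) can be sketched as below. Formula (8) for PC is not reproduced in this text, so the sketch takes the factors as input instead of computing them; using one factor per virtual machine (`pc[j]`), like the function name itself, is an assumption.

```python
def adjust_pheromone(tau, assignment, delta, rho, pc):
    """Formula (9) for the VM assigned to each task, formula (10) for the rest.

    pc[j] is the adjustment factor of VM j. Its defining formula (8) is not
    available here, so the caller must supply the values (assumption).
    """
    for task, row in enumerate(tau):
        for vm in range(len(row)):
            if vm == assignment[task]:
                row[vm] = ((1 - rho) * row[vm] + delta[task]) * pc[vm]  # (9)
            else:
                row[vm] = row[vm] * pc[vm]                              # (10)
    return tau
```

A heavily loaded virtual machine would be given a small factor, which scales its pheromone down for every task and makes it less likely to be chosen in the next iteration.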
Step 3.7: after all ants have completed one traversal, find the optimal scheduling scheme of this iteration and update the pheromone of all virtual machines in that scheme according to formula (11).
$\Delta\tau_{ij}(t) = D / bestclock_{ij}$ (11)
where D is a constant and $bestclock_{ij}$ denotes the completion time of task $T_i$ when it selects virtual machine $VM_j$ in the optimal allocation scheme.
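A sketch of the global update, under two labeled assumptions: the best scheme of an iteration is taken to be the one whose last task finishes earliest, and the deposit of formula (11) is fed into the update rule of formula (6). The function name and the `(assignment, completion_times)` tuple layout are hypothetical.

```python
def global_update(tau, schemes, rho, d_const):
    """Reinforce the best scheme of one iteration using formula (11).

    Each scheme is (assignment, completion_times). The best scheme is assumed
    to be the one whose last task finishes earliest.
    """
    best_assignment, best_clock = min(schemes, key=lambda s: max(s[1]))
    for task, vm in enumerate(best_assignment):
        delta = d_const / best_clock[task]                   # formula (11)
        tau[task][vm] = (1 - rho) * tau[task][vm] + delta    # formula (6), assumed
    return best_assignment
```

Only the pairs of the best scheme are reinforced, so over the iterations the pheromone concentrates on assignments that gave short completion times.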
Step 3.8: find the optimal allocation scheme and bind the virtual machines in the scheme to the corresponding workflow tasks.
The result of cloud data center task scheduling is obtained by applying the heuristic load balancing algorithm proposed by the present invention as follows. First, the task scheduling problem is converted into a standard minimization problem. Next, the method is initialized by inputting the set of tasks to be scheduled that the user has submitted and the set of virtual machines that the user has leased. Then the scheduling process of steps 3.1 to 3.8 is carried out by the cloud data center task scheduling algorithm based on the improved ant colony algorithm, and the optimal solution is finally obtained.
To check whether this algorithm offers better scheduling performance and load balancing than the FIFO scheduling strategy, the basic ant colony scheduling algorithm (ACO), and the combined ant colony and roulette algorithm (RACO), the present invention simulates a cloud computing data center with the cloud computing simulation tool CloudSim and rewrites classes in it such as DatacenterBroker and Cloudlet, thereby implementing a simulation of the above four task scheduling algorithms.
The present invention also designs a workflow model for verifying the effectiveness of the LACO algorithm. The parameter values of the ten tasks in this workflow model are:
The LACO algorithm parameters are set to α = 0.7, β = 0.7, ρ = 0.3, 100 ants, and 50 iterations.
The present invention simulates one data center in CloudSim and defines four virtual machines in it, numbered $VM_0$, $VM_1$, $VM_2$, and $VM_3$. The bandwidth of the communication links between all virtual machines is assumed to be equal. The parameter values of the four virtual machines are:
Virtual machine ID | MIPS | Bandwidth | Memory capacity
0 | 420 | 1000 | 50 GB
1 | 350 | 1000 | 120 GB
2 | 508 | 1000 | 235 GB
3 | 634 | 1000 | 450 GB
The task allocation results of the four algorithms and the corresponding completion time of each task are:
For the ACO, RACO, and LACO algorithms, the data shown are those of the globally optimal solution. It can be seen that the FIFO scheduling strategy simply assigns tasks one by one in the order of the virtual machines. This method is very inefficient: because it does not consider the processing performance of each virtual machine, virtual machines with low processing power also carry a heavy load, so the whole task scheduling process needs a very long completion time. With the ACO algorithm, after many iterations all tasks select the virtual machine with the strongest computing power, $VM_3$, so that $VM_3$ is overloaded and the time overhead is also excessive. With the RACO algorithm, because virtual machines $VM_2$ and $VM_3$ have higher processing performance than the other two, a large number of tasks are assigned to these two virtual machines, while virtual machine $VM_1$ receives no task queue to execute and sits idle, which wastes data center resources. Finally, under LACO scheduling, each virtual machine receives tasks in proportion to its execution performance, no resources are wasted, and the execution efficiency of the system improves.
Fig. 5 shows the completion time of each task in the workflow under the four algorithms. It can be seen from the figure that FIFO and ACO take longer to process each task than RACO and LACO. For the first several tasks, the completion time under FIFO is higher than under ACO, but after $T_4$ has executed, the completion time under ACO gradually exceeds that under FIFO. RACO and LACO take very similar times for each task, but for the last three tasks the processing time of LACO is clearly shorter than that of RACO.
Fig. 6 shows, for each virtual machine in the experimental data center, the ratio of its running time to the sum of the completion times of all virtual machines after each of the four algorithms has finished scheduling. It can be seen from the figure that under the FIFO scheduling strategy all virtual machines are assigned tasks, but because tasks are assigned strictly in order, a large number of tasks end up on virtual machines $VM_0$ and $VM_1$, while the comparatively more powerful virtual machines $VM_2$ and $VM_3$ account for a smaller share of the execution time, so the whole task processing consumes too much time. With the ACO algorithm, after many iterations the ants select the most powerful virtual machine, $VM_3$, for all tasks, so that $VM_3$ is overloaded and the other virtual machines are all idle. With the RACO algorithm, virtual machines $VM_0$ and $VM_1$ are assigned no tasks; even though virtual machines $VM_2$ and $VM_3$ have strong computing power, the excessive load on them lengthens the overall running time. With the LACO algorithm, all virtual machines execute tasks, and the share of execution time of each virtual machine is proportional to its computing power: virtual machines with strong computing power execute more tasks, so the load on the virtual machines is balanced while efficiency is maintained.
Obviously, those skilled in the art will appreciate that various improvements can be made to the cloud data center task scheduling method based on an improved ant colony algorithm disclosed above without departing from the content of the present invention. The scope of protection of the present invention should therefore be determined by the content of the appended claims.