CN110109753A

CN110109753A - Resource scheduling method and system based on a multi-dimensional constrained genetic algorithm

Info

Publication number: CN110109753A
Application number: CN201910340000.9A
Authority: CN
Inventors: 张路桥; 滕彩峰; 李飞; 王娟; 韩斌
Original assignee: Chengdu University of Information Technology
Current assignee: Chengdu University of Information Technology
Priority date: 2019-04-25
Filing date: 2019-04-25
Publication date: 2019-08-09

Abstract

The invention belongs to the technical field of data processing and discloses a resource scheduling method and system based on a multi-dimensional constrained genetic algorithm. After the prediction model, the task matrix, and the node matrix are initialized, a dual fitness function is constructed; a selection-replication operator, a crossover operator, and a mutation operator are defined; and after multiple iterations the globally optimal resource allocation is obtained. To find a better resource allocation scheme, the invention proposes a Hadoop resource scheduling algorithm based on a multi-dimensional constrained genetic algorithm and implements a Hadoop resource scheduler with it. The algorithm can effectively improve cluster resource allocation efficiency, shortening the overall execution time of cluster tasks by about 20%.

Description

Resource scheduling method and system based on a multi-dimensional constrained genetic algorithm

Technical field

The invention belongs to the technical field of data processing and relates to a resource scheduling method and system based on a multi-dimensional constrained genetic algorithm, specifically a Hadoop resource scheduling method and system based on a multi-dimensional constrained genetic algorithm.

Background art

The closest prior art at present is as follows.

Resource scheduling is a combinatorial optimization problem whose ultimate goal is to assign every task in the cluster to the most suitable node so that overall cluster performance is optimal. Hadoop YARN provides three built-in resource scheduling algorithms and implements the corresponding resource schedulers: the FIFO, Capacity, and Fair schedulers. However, as application scenarios keep expanding (for example iterative computing, interactive computing, and real-time computing), these schedulers cannot adequately meet users' needs to allocate resources reasonably and to reduce task execution time.

In conclusion problem of the existing technology is:

(1) in the prior art, allocation efficiency of resource is low, and it is long that cluster task is performed integrally the time.

(2) in different application scene, the scheduler of the prior art is not well positioned to meet user's reasonable distribution resource Reduce the demand of task execution time.

The difficulty of solving the above technical problems:

Resource scheduling is a combinatorial optimization problem. The difficulty lies in assigning all tasks in the cluster to the most reasonable nodes while taking both cluster load and task execution time into account, so that overall cluster performance is optimal. In addition, when the multi-dimensional constrained genetic algorithm is applied to Hadoop resource scheduling, many experiments are needed to find the optimal parameter settings of the algorithm.

The significance of solving the above technical problems:

Studying existing resource scheduling schemes and designing a better resource scheduling algorithm helps advance big data and cloud computing technology, and is of great significance for improving overall system resource utilization and the overall performance of the Hadoop platform.

Summary of the invention

In view of the problems of the prior art, the invention provides a resource scheduling method and system based on a multi-dimensional constrained genetic algorithm.

The invention is realized as a Hadoop resource scheduling method, which comprises:

initializing the prediction model, the task matrix, and the node matrix, and then constructing a dual fitness function;

defining a selection-replication operator, a crossover operator, and a mutation operator;

performing multiple iterations to obtain the globally optimal resource allocation.

Further, the Hadoop resource scheduling method comprises:

Step 1: after the user submits jobs to the cluster, initialize the prediction model, construct the task matrix and the node matrix, and save the information and the encoding result to a file;

Step 2, initial population construction: randomly generate the initial feasible-solution chromosomes from the task matrix and the node matrix; the population size is denoted Scale;

Step 3, fitness calculation: compute the fitness value of each chromosome in the population with the fitness function;

Step 4, termination check: before entering the next round of iterative evolution, check whether the termination condition is met, that is, whether the iteration limit has been reached; if so, select the chromosome with the highest fitness in the current population as the optimal solution; otherwise enter a new round of iteration;

Step 5, fitness probability calculation: from the fitness values, compute the probability that each chromosome is selected in the next evolution, then proceed to the selection-replication, crossover, and mutation operations to generate new chromosomes;

Step 6: in the replication operator, copy the Scale*cp chromosomes with the highest fitness in the population of size Scale into the next iteration, where cp is the replication ratio;

Step 7: repeatedly execute the selection operator to pick two chromosomes as parent chromosomes, and apply the crossover operation to generate the remaining Scale*(1-cp) chromosomes;

Step 8: apply the mutation operation to the Scale*(1-cp) generated chromosomes, and pass the mutated chromosomes to the next iteration;

Step 9: return to step 3 for a new round of iterative evolution.

Further, in step 1 the user submits jobs to the cluster. The cluster environment model is denoted G = {NodeSet, JobSet}, where NodeSet = {node₁, node₂, node₃, ..., node_n} is the set of node resources and JobSet = {Job₁, Job₂, Job₃, ..., Job_n} is the set of jobs. Each Job_i = {task₁, task₂, task₃, ..., task_n} (0 ≤ i < n), where the tasks include map tasks and reduce tasks. The tasks in JobSet are assigned to nodes in NodeSet for execution, so that the Jobs as a whole are run;

The method of initializing the prediction model comprises:

For Map and Reduce tasks, a TS_(map/reduce) model is constructed, using the following data format:

<FileSize, SplitSize, SplitNum, MapTime, ReduceTime>

Here FileSize is the size of the current job, SplitSize is the split size of the job, SplitNum is the number of splits, and MapTime and ReduceTime are the execution times of the job's Map phase and Reduce phase respectively; the execution time of a new task is then predicted by evaluating the attribute information of historical tasks;

The constructed TS_(map/reduce) model is stored in the ResourceManager, and each NodeManager periodically sends node attribute information to the ResourceManager through the heartbeat mechanism.

Further, in step 1,

The method of initializing the task matrix comprises:

The job set is represented as JobSet = {Job₁, Job₂, Job₃, ..., Job_n}, where each Job_i = {JobSize, SplitSize, SplitNum} (0 ≤ i < n); JobSize is the job size, SplitSize is the size of each split, and SplitNum is the number of splits;

The method of initializing the node matrix comprises: for the node set NodeSet = {node₁, node₂, node₃, ..., node_n}, node_i = {cpuSpeed_i, AllR_i, UsedR_i, Cnum_i, Load_i} (0 ≤ i < n), where cpuSpeed_i is the CPU floating-point capability of node i, AllR_i is the total resources of the node server, UsedR_i is the resources in use on the node server, and Cnum_i is the number of CPU cores of the node server.

Further, in step 2, the method of generating the chromosome matrix comprises: the chromosome matrix is an n × t matrix whose n rows are indexed by task number and whose t columns are indexed by node number, denoted chromosomeMatrix = (chromosomeMatrix[i][j])_n×t;

Here each matrix element chromosomeMatrix[i][j] ∈ {0, 1} (0 ≤ i < n, 0 ≤ j < t). An element chromosomeMatrix[i][j] = 1 means that this chromosome assigns task i to node j for execution; an element value of 0 means that this chromosome does not assign the task to that node. Each task is assigned to exactly one node, so every row must satisfy the condition

$$\sum_{j=0}^{t-1} \mathit{chromosomeMatrix}[i][j] = 1, \quad 0 \le i < n$$
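The encoding above can be illustrated with a minimal Python sketch that builds a random feasible chromosome matrix and checks the one-node-per-task condition. The function names, the uniform random choice of node, and the `seed` parameter are assumptions made for illustration; the text only requires that the initial chromosomes be generated randomly and be feasible.

```python
import random


def random_chromosome(n_tasks, n_nodes, rng=random):
    """Return an n_tasks x n_nodes 0/1 matrix with exactly one 1 per row."""
    matrix = [[0] * n_nodes for _ in range(n_tasks)]
    for i in range(n_tasks):
        j = rng.randrange(n_nodes)  # assumption: node drawn uniformly at random
        matrix[i][j] = 1            # task i runs on node j
    return matrix


def is_feasible(matrix):
    """True if every entry is 0 or 1 and every task sits on exactly one node."""
    return all(set(row) <= {0, 1} and sum(row) == 1 for row in matrix)


def initial_population(scale, n_tasks, n_nodes, seed=None):
    """Build `scale` random feasible chromosomes (the initial population)."""
    rng = random.Random(seed)
    return [random_chromosome(n_tasks, n_nodes, rng) for _ in range(scale)]
```

Because each row is filled by picking a single column, feasibility holds by construction; `is_feasible` is only useful later, for checking chromosomes produced by crossover or mutation.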

Further, in step 3, a dual fitness function based on the optimal time span and on load balancing is used, so that during evolution the population moves toward the shortest task execution time while keeping the load of the cluster nodes balanced;

The fitness function based on time span comprises: for a Job, the execution time is calculated as follows,

T_job is the execution time of one job. The whole cluster runs multiple jobs at the same time, and the job that finishes last determines the optimal time span of the chromosome's allocation scheme;

The fitness function based on the optimal time span is expressed as:

Here F_time(c) is the optimal time span of the c-th chromosome in the population, N is the number of jobs,

and ChromosomeScale is the population size. The set of optimal time spans of all chromosomes in one round of iteration is expressed as:

AllF_time = {F_time(1), F_time(2), F_time(3), ..., F_time(c)}

Here AllF_time is the set of time spans of all chromosomes in the current round of iteration; the index within the set is the chromosome number, and the element value is the time span of that chromosome.

Further, the method of calculating the execution time of Map tasks and Reduce tasks comprises:

A) The execution time of each Map task of a Job is calculated as follows:

Here T_map(i, j, k) is the time for node j to execute the k-th split of job i, Split(i, k) (0 ≤ k < splitNum) is the size of the k-th split of job i, and cpuSpeed_j is the CPU speed of node j. If the split size and the file block size differ, the task may need to transfer file blocks over the network to assemble a split; Block(i, k) is the size of the data blocks that the split's task must fetch from other nodes, and node(i, j) is the network transfer speed between the node storing task i and the executing node j. A large Job is divided into several splits, one Map task per split, executed in parallel; the Map task that finishes last determines the execution time of the whole Map phase, so for a Job_i, T_map = Max(T_map(i, j, k));

B) For Reduce tasks, the execution time is predicted from the historical information <FileSize, SplitSize, SplitNum, MapTime, ReduceTime> built by the TS_(map/reduce) model;

The fitness function based on load balancing comprises:

During resource allocation, the more balanced the node load of an allocation strategy, the higher the cluster resource utilization. The invention designs a fitness function based on load balancing. After population initialization, the number of tasks in the task set JobSet is JobSet.length and the number of nodes in the node set NodeSet is NodeSet.length, so the average number of tasks assigned per node is:

$$\mathit{AvgTask} = \frac{\mathit{JobSet.length}}{\mathit{NodeSet.length}}$$

The dispersion of a group of data relative to its mean is measured by the standard deviation: the smaller the standard deviation, the closer the values are to the mean and the more balanced the cluster load. The load-balancing fitness function is expressed as:

Here F_load(c) is the standard deviation of the number of tasks assigned to each node by this chromosome, that is, its load-balancing fitness value; TaskNum(c, j) is the number of tasks assigned to the j-th node by the c-th chromosome; N is the total number of nodes; and AvgTask is the average number of tasks per node in this chromosome's allocation scheme. The set of load-balancing fitness values of all chromosomes in one round of iteration is expressed as:

AllF_load = {F_load(1), F_load(2), F_load(3), ..., F_load(i)}

Here AllF_load is the set of load-balancing fitness values of all chromosomes in the current round of iteration; the index within the set is the chromosome number, and the element value is the load-balancing fitness value of that chromosome;
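A minimal Python sketch of the load-balancing measure described above follows. It assumes the population standard deviation (division by the node count N) and represents a chromosome as a list in which `assignment[i]` is the zero-based node index of task i, which is equivalent to the position of the 1 in row i of the chromosome matrix. Both choices, and the function names, are assumptions for illustration.

```python
import math


def tasks_per_node(assignment, n_nodes):
    """Count how many tasks each node receives (TaskNum(c, j) for j = 0..N-1)."""
    counts = [0] * n_nodes
    for node in assignment:
        counts[node] += 1
    return counts


def load_fitness(assignment, n_nodes):
    """Standard deviation of per-node task counts; smaller means better balance."""
    counts = tasks_per_node(assignment, n_nodes)
    avg_task = len(assignment) / n_nodes  # AvgTask = JobSet.length / NodeSet.length
    variance = sum((c - avg_task) ** 2 for c in counts) / n_nodes
    return math.sqrt(variance)
```

For example, six tasks spread evenly over three nodes give a value of 0, while four tasks all placed on one of two nodes give counts of 4 and 0 around a mean of 2, hence a value of 2.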

Normalization: the following min-max (deviation) normalization formula is used:

$$x^{*} = \frac{x - \min}{\max - \min}$$

After normalization, the fitness function based on time span is expressed as:

$$F_{time}(k)^{*} = \frac{F_{time}(k) - \min(AllF_{time})}{\max(AllF_{time}) - \min(AllF_{time})}$$

After normalization, the fitness function based on load balancing is expressed as:

$$F_{load}(k)^{*} = \frac{F_{load}(k) - \min(AllF_{load})}{\max(AllF_{load}) - \min(AllF_{load})}$$

For a chromosome, F_time(k)* represents the execution time: the larger the value, the longer the execution time and the smaller the fitness should be. Likewise, F_load(k)* represents the dispersion of the number of tasks assigned to each node relative to the average: the larger the value, the less balanced the cluster load and the smaller the fitness should be. For each chromosome, the fitness function based on both the optimal time span and load balancing is expressed as:

Further, the method of predicting from the constructed historical information <FileSize, SplitSize, SplitNum, MapTime, ReduceTime> comprises:

Step 1: let the job to be predicted be NewJob. First, in the TS_(map/reduce) model, find the set of jobs whose size (FileSize) is close to that of NewJob: JobSet₁ = {Job₁, Job₂, Job₃, ..., Job_k};

Step 2: then, within JobSet₁, find the set of jobs whose split size (SplitSize) and number of splits (SplitNum) are the same as those of NewJob: JobSet₂ = {Job₁, Job₂, Job₃, ..., Job_k};

Step 3: then, within JobSet₂, find the set of jobs whose Map task execution time is close to that of NewJob: JobSet₃ = {Job₁, Job₂, Job₃, ..., Job_k};

Step 4: finally, calculate the execution time of the Reduce phase of NewJob according to the following formula:

Here T_reduce is the overall execution time of the Reduce phase of the current Job, T_map is the overall execution time of the Map phase of the current Job, AvgT_map is the average Map-phase execution time of all jobs in JobSet₃, and AvgT_reduce is the average Reduce-phase execution time of all jobs in JobSet₃. Because node load changes constantly, the same task on the same node can have different execution times at different moments, so a load adjustment parameter ω is added to balance the completion time of tasks under different loads; ω is the ratio of the load at the current time to the historical average load.
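The four prediction steps can be sketched in Python as below. Several details are assumptions for illustration only: "close" is taken to mean within a relative tolerance (`size_tol`, `time_tol`), and the final estimate scales the historical average Reduce time by the ratio of the job's Map time to the historical average Map time, multiplied by ω. The record type `HistoryRecord` and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HistoryRecord:
    """One <FileSize, SplitSize, SplitNum, MapTime, ReduceTime> entry."""
    file_size: float
    split_size: float
    split_num: int
    map_time: float
    reduce_time: float


def predict_reduce_time(history, file_size, split_size, split_num, map_time,
                        omega=1.0, size_tol=0.1, time_tol=0.1):
    """Estimate the Reduce-phase time of a new job from similar past jobs."""
    # Step 1: jobs of similar size (assumed: within size_tol, relative).
    job_set1 = [r for r in history
                if abs(r.file_size - file_size) <= size_tol * file_size]
    # Step 2: same split size and same number of splits.
    job_set2 = [r for r in job_set1
                if r.split_size == split_size and r.split_num == split_num]
    # Step 3: similar Map-phase time (assumed: within time_tol, relative).
    job_set3 = [r for r in job_set2
                if abs(r.map_time - map_time) <= time_tol * map_time]
    if not job_set3:
        return None  # no comparable history to predict from
    avg_map = sum(r.map_time for r in job_set3) / len(job_set3)
    avg_reduce = sum(r.reduce_time for r in job_set3) / len(job_set3)
    # Step 4 (assumed form): scale by the Map-time ratio and the load factor.
    return omega * map_time * avg_reduce / avg_map
```

Returning `None` when no comparable job exists leaves the fallback policy to the caller, since the text does not say what happens in that case.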

Further, in step 5, the fitness probability matrix is calculated as follows:

Here F_prob(i) is the probability that chromosome i is selected in the next round of iteration. The fitness probabilities are calculated after each round of iteration ends and before the new round begins, and the values are mapped to a one-dimensional matrix with the following structure:

SelectionProbability = {F_prob(1), F_prob(2), F_prob(3), ..., F_prob(i)}

Here SelectionProbability is the set of fitness probabilities of all chromosomes in the previous round of iteration; the index within the set is the chromosome number, and each value is the fitness probability of that chromosome, that is, the probability that it is selected in the next round of iteration. SelectionProbability must satisfy

$$\sum_{i=1}^{\mathit{ChromosomeScale}} F_{prob}(i) = 1$$
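As an illustration, fitness-proportional (roulette wheel) selection probabilities can be computed as in the following Python sketch. Dividing each fitness value by the population total is the standard roulette form and is assumed here; the uniform fallback for a non-positive total and the function names are also assumptions.

```python
import random


def selection_probabilities(fitness_values):
    """Map fitness values to probabilities that sum to 1."""
    total = sum(fitness_values)
    if total <= 0:
        # Assumed fallback: treat all chromosomes as equally likely.
        n = len(fitness_values)
        return [1.0 / n] * n
    return [f / total for f in fitness_values]


def roulette_pick(probabilities, rng=random):
    """Return the index of one chromosome drawn according to `probabilities`."""
    r = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if r < cumulative:
            return index
    return len(probabilities) - 1  # guard against floating-point round-off
```

A chromosome with three times the fitness of another is three times as likely to be drawn, which matches the requirement that higher fitness means a higher selection probability.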

The selection-replication operator satisfies the following formula:

Here CrossoverNum is the number of chromosomes chosen by roulette wheel selection for the crossover operation, CopyNum is the number of chromosomes copied directly, and cp is the replication ratio;

When the crossover operator performs crossover, the two-dimensional matrix is first decoded into a one-dimensional form:

chromosomeMatrix = [2, 3, 1, 4, 5, 7, 2, ..., 9];

Here chromosomeMatrix represents one chromosome; the index is the task number and the element value is the node number. For example, chromosomeMatrix[1] = 3 means that task 1 is assigned to node 3 for execution. Parent chromosomes are crossed using a random complementary method: first, two parent chromosomes with high fitness are selected by the selection operator; then a position common to both is chosen at random and its index is recorded as flag; the father chromosome contributes the segment chromosomeMatrix[0, flag], the mother chromosome contributes the segment chromosomeMatrix[flag, end], and the two segments are recombined to form the child chromosome;
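The decoding and the complementary crossover can be sketched in Python as follows. The sketch uses zero-based node indices rather than the numbering in the example above, and it draws the cut point `flag` uniformly from 1 to length-1 so that both parents contribute at least one gene; both are assumptions for illustration, as are the function names.

```python
import random


def decode(matrix):
    """Turn a 0/1 chromosome matrix into a list: entry i is the node of task i."""
    return [row.index(1) for row in matrix]


def encode(assignment, n_nodes):
    """Inverse of decode: rebuild the 0/1 matrix from the one-dimensional form."""
    return [[1 if j == node else 0 for j in range(n_nodes)]
            for node in assignment]


def crossover(father, mother, rng=random):
    """Child takes genes [0, flag) from the father and [flag, end) from the mother."""
    flag = rng.randrange(1, len(father))  # requires at least two tasks
    return father[:flag] + mother[flag:]
```

Because every gene in the one-dimensional form names exactly one node, any child produced this way still assigns each task to exactly one node, so no repair step is needed after crossover.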

The adaptive mutation rate of the mutation operator is calculated as follows:

Here P_var(k) is the mutation probability of chromosome k, F_Adapt(max) is the maximum fitness value of the chromosomes in the population, F_Adapt(avg) is their average fitness value, F_Adapt(k) is the fitness value of chromosome k, and λ_min and λ_max are mutation factors that set the lower and upper bounds of the mutation rate.
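For orientation only, the following Python sketch shows one commonly used adaptive scheme that is consistent with the quantities named above. It is an assumption, not the formula of the invention: chromosomes at or below the average fitness mutate at the upper bound λ_max, and the probability falls linearly to λ_min as fitness rises from the average to the maximum. The function and parameter names are hypothetical.

```python
def mutation_probability(f_k, f_max, f_avg, lam_min, lam_max):
    """Assumed adaptive rule: fitter chromosomes mutate less often."""
    if f_k < f_avg or f_max == f_avg:
        # Below-average chromosomes (or a uniform population) use the upper bound.
        return lam_max
    # Linear interpolation between lam_max (at f_avg) and lam_min (at f_max).
    return lam_max - (lam_max - lam_min) * (f_k - f_avg) / (f_max - f_avg)
```

The design intent of such a rule is to protect good chromosomes from disruption while still exploring aggressively with poor ones.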

Another object of the invention is to provide a Hadoop resource scheduling system that implements the Hadoop resource scheduling method.

In summary, the advantages and positive effects of the invention are:

To find a better resource allocation scheme, a Hadoop resource scheduling algorithm based on a multi-dimensional constrained genetic algorithm (Hadoop Resource Scheduler Based on Multi-dimensional Constrained Genetic Algorithm, MCGA) is proposed, and a Hadoop resource scheduler is implemented with this algorithm.

The invention first initializes the prediction model, the task matrix, and the node matrix and constructs a dual fitness function; it then defines the selection-replication, crossover, and mutation operators; finally, after multiple iterations, it finds the globally optimal resource allocation scheme. The data in Table 3 and the experimental results in Fig. 13, Fig. 14, and Fig. 15 show that the algorithm can effectively improve cluster resource allocation efficiency, shortening the overall execution time of cluster tasks by about 20%.

Brief description of the drawings

Fig. 1 is a flowchart of the Hadoop resource scheduling method based on a multi-dimensional constrained genetic algorithm provided by an embodiment of the invention.

Fig. 2 is a data acquisition flowchart of the TS prediction model provided by an embodiment of the invention.

Fig. 3 is a schematic diagram of the chromosome crossover operation provided by an embodiment of the invention.

Fig. 4 is a schematic diagram of the chromosome mutation operation provided by an embodiment of the invention.

Fig. 5 is a topology diagram of the cluster nodes provided by an embodiment of the invention.

Fig. 6 shows ant colony algorithm experiment No. 2.1 provided by an embodiment of the invention.

Fig. 7 shows ant colony algorithm experiment No. 4.1 provided by an embodiment of the invention.

Fig. 8 shows genetic algorithm experiment No. 1.1 provided by an embodiment of the invention.

Fig. 9 shows genetic algorithm experiment No. 2.1 provided by an embodiment of the invention.

Fig. 10 shows genetic algorithm experiment No. 3.1 provided by an embodiment of the invention.

Fig. 11 shows genetic algorithm experiment No. 4.1 provided by an embodiment of the invention.

Fig. 12 shows genetic algorithm experiment No. 5.1 provided by an embodiment of the invention.

Fig. 13 shows genetic algorithm experiment No. 6.1 provided by an embodiment of the invention.

Fig. 14 shows the average task completion time of the three schedulers under four task sets, provided by an embodiment of the invention.

Fig. 15 shows the variation over 20 runs of the second job set, provided by an embodiment of the invention.

Fig. 16 shows the variation over 20 runs of the third job set, provided by an embodiment of the invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.

In the prior art, resource allocation efficiency is low and the overall execution time of cluster tasks is long. Across different application scenarios, prior-art schedulers cannot adequately meet users' needs to allocate resources reasonably and to reduce task execution time.

To solve the above problems, the invention is described in detail below with reference to a concrete scheme.

As shown in Fig. 1, the Hadoop resource scheduling method designed on the basis of a multi-dimensional constrained genetic algorithm, provided by an embodiment of the invention, comprises:

1) After the user submits jobs to the cluster, the prediction model is initialized, the task matrix and the node matrix are constructed, and the information and the encoding result are saved to a file;

2) Initial population construction: a group of feasible solutions is randomly generated from the task matrix and the node matrix as the initial chromosomes; the population size is denoted Scale;

3) Fitness calculation: the fitness value of each chromosome in the population is computed with the fitness function;

4) Termination check: before entering the next round of iterative evolution, check whether the termination condition is met, that is, whether the iteration limit has been reached; if so, select the chromosome with the highest fitness in the current population as the optimal solution; otherwise enter a new round of iteration;

5) Fitness probability calculation: from the fitness values, compute the probability that each chromosome is selected in the next evolution, then proceed to the selection-replication, crossover, and mutation operations to generate new chromosomes;

6) In the replication operator, copy the Scale*cp chromosomes with the highest fitness in the population of size Scale into the next iteration, where cp is the replication ratio;

7) Repeatedly execute the selection operator to pick two chromosomes as parent chromosomes, and apply the crossover operation to generate the remaining Scale*(1-cp) chromosomes;

8) Apply the mutation operation to the Scale*(1-cp) generated chromosomes, and pass the mutated chromosomes to the next iteration;

9) This completes one round of iterative evolution; the process then returns to step 3) for a new round.
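The nine steps above can be outlined as a generic loop. The Python sketch below is an illustration under stated assumptions, not the scheduler's implementation: the fitness, crossover, and mutation operators are passed in as functions, all fitness values are assumed to be positive (they are used directly as roulette weights), and the names `evolve` and `max_iterations` are hypothetical.

```python
import random


def evolve(population, fitness, crossover, mutate,
           cp=0.2, max_iterations=100, rng=random):
    """Elitist genetic loop: copy the best Scale*cp, breed the remainder."""
    scale = len(population)
    copy_num = int(scale * cp)
    for _ in range(max_iterations):                      # step 4: iteration limit
        scores = [fitness(c) for c in population]        # step 3: fitness values
        ranked = sorted(zip(scores, population),
                        key=lambda pair: pair[0], reverse=True)
        next_gen = [c for _, c in ranked[:copy_num]]     # step 6: replication
        while len(next_gen) < scale:                     # steps 7 and 8
            # Step 5: fitness-proportional (roulette) choice of two parents.
            father, mother = rng.choices(population, weights=scores, k=2)
            next_gen.append(mutate(crossover(father, mother)))
        population = next_gen                            # step 9: next round
    scores = [fitness(c) for c in population]
    return max(zip(scores, population), key=lambda pair: pair[0])[1]
```

Since the highest-ranked chromosomes are copied unchanged whenever `copy_num` is at least 1, the best fitness in the population never decreases from one round to the next.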

The invention is further described below with reference to the parameter settings of the Hadoop resource scheduling algorithm based on a multi-dimensional constrained genetic algorithm.

The cluster environment model is denoted G = {NodeSet, JobSet}, where NodeSet = {node₁, node₂, node₃, ..., node_n} is the set of node resources and JobSet = {Job₁, Job₂, Job₃, ..., Job_n} is the set of jobs. Each Job_i = {task₁, task₂, task₃, ..., task_n} (0 ≤ i < n), where the tasks include both map tasks and reduce tasks. The scheduling algorithm ultimately assigns the tasks in JobSet to nodes in NodeSet for execution so that the overall completion time of the Jobs is as short as possible.

The genetic algorithm parameters and the Hadoop resource scheduling model parameters are further described below.

1) Initializing the prediction model:

<FileSize, SplitSize, SplitNum, MapTime, ReduceTime>

Here FileSize is the size of the current job, SplitSize is the split size of the job, SplitNum is the number of splits, and MapTime and ReduceTime are the execution times of the job's Map phase and Reduce phase respectively. The execution time of a new task is then predicted by evaluating the attribute information of historical tasks.

The constructed TS_(map/reduce) model is stored in the ResourceManager for use when the scheduling algorithm starts, and each NodeManager periodically sends node attribute information to the ResourceManager through the heartbeat mechanism.

The data transfer model of the prediction model is shown in Fig. 2, the TS prediction model data acquisition diagram.

2) Task matrix:

The job set is represented as JobSet = {Job₁, Job₂, Job₃, ..., Job_n}, where each Job_i = {JobSize, SplitSize, SplitNum} (0 ≤ i < n); JobSize is the job size, SplitSize is the size of each split, and SplitNum is the number of splits.

3) Node matrix:

For the node set NodeSet = {node₁, node₂, node₃, ..., node_n}, node_i = {cpuSpeed_i, AllR_i, UsedR_i, Cnum_i, Load_i} (0 ≤ i < n), where cpuSpeed_i is the CPU floating-point capability of node i, AllR_i is the total resources of the node server, UsedR_i is the resources in use on the node server, and Cnum_i is the number of CPU cores of the node server.

4) Chromosome matrix:

Each evolution of the population produces several chromosomes. Every chromosome is a feasible solution of the current problem; a feasible solution contains multiple elements, and each element is called a gene of the chromosome. The chromosome matrix is an n × t matrix whose n rows are indexed by task number and whose t columns are indexed by node number, denoted chromosomeMatrix = (chromosomeMatrix[i][j])_n×t.

Here each matrix element chromosomeMatrix[i][j] ∈ {0, 1} (0 ≤ i < n, 0 ≤ j < t). An element chromosomeMatrix[i][j] = 1 means that this chromosome assigns task i to node j for execution; an element value of 0 means that this chromosome does not assign the task to that node. In addition, each task can be assigned to only one node, so every row must satisfy the condition

$$\sum_{j=0}^{t-1} \mathit{chromosomeMatrix}[i][j] = 1, \quad 0 \le i < n$$

5) Dual fitness function:

The fitness function controls the direction in which the population evolves. The invention adopts a dual fitness function based on the optimal time span and on load balancing, so that during evolution the population moves toward the shortest task execution time while keeping the load of the cluster nodes balanced.

In an embodiment of the invention, the dual fitness function specifically comprises a fitness function based on time span and a fitness function based on load balancing.

In an embodiment of the invention, the fitness function based on time span comprises:

For a Job, the completion time is jointly determined by the completion times of its Map and Reduce tasks. For the whole cluster, one chromosome is one resource allocation scheme, and its optimal task time span is determined by the execution time of the Job that finishes last under that scheme. For a Job, the execution time is calculated as follows:

T_job = T_map + T_reduce (formula 2)

In an embodiment of the invention, the method of calculating the execution time of Map tasks and Reduce tasks comprises:

A) For Map tasks, the invention takes into account that the node running a task and the node storing its file may differ within the cluster. The running time of a Map task is therefore determined by the task processing time and the resource transfer time. The task processing time depends mainly on the CPU computing capability of the node running the task, and the resource transfer time depends on the network transfer speed between the node storing the task's data and the node running the task. The execution time of each Map task of a Job is therefore calculated as follows:

Here T_map(i, j, k) is the time for node j to execute the k-th split of job i, Split(i, k) (0 ≤ k < splitNum) is the size of the k-th split of job i, and cpuSpeed_j is the CPU speed of node j. If the split size and the file block size differ, the task may need to transfer file blocks over the network to assemble a split; Block(i, k) is the size of the data blocks that the split's task must fetch from other nodes, and node(i, j) is the network transfer speed between the node storing task i and the executing node j. A large Job may be divided into several splits, one Map task per split, executed in parallel; the Map task that finishes last determines the execution time of the whole Map phase, so for a Job_i, T_map = Max(T_map(i, j, k)).
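The description of the Map task time suggests a processing term plus a transfer term. The Python sketch below assumes the additive form Split(i, k) / cpuSpeed_j + Block(i, k) / node(i, j), with the transfer term dropped when no remote data is needed; this form, the consistent units, and the function names are assumptions for illustration.

```python
def map_task_time(split_size, cpu_speed, remote_block_size=0.0, net_speed=None):
    """Assumed Map task time: processing time plus remote data transfer time."""
    processing = split_size / cpu_speed
    if remote_block_size == 0:
        return processing  # all data is local, nothing to transfer
    return processing + remote_block_size / net_speed


def job_map_time(task_times):
    """Map tasks run in parallel, so the slowest one sets the Map-phase time."""
    return max(task_times)
```

For instance, a split of size 128 on a node of speed 64 takes 2.0 time units locally and 2.5 if a block of size 50 must first be fetched at a network speed of 100; the Map phase of a job with those two tasks then takes 2.5.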

B) For Reduce tasks, the execution time is predicted from the historical information <FileSize, SplitSize, SplitNum, MapTime, ReduceTime> built by the TS_(map/reduce) model. The specific steps are as follows:

Here T_reduce is the overall execution time of the Reduce phase of the current Job, T_map is the overall execution time of the Map phase of the current Job, AvgT_map is the average Map-phase execution time of all jobs in JobSet₃, and AvgT_reduce is the average Reduce-phase execution time of all jobs in JobSet₃. Because node load changes constantly, the same task on the same node can have different execution times at different moments, so a load adjustment parameter ω is added to balance the completion time of tasks under different loads; ω is the ratio of the load at the current time to the historical average load.

According to (formula 3) and (formula 4), (formula 2) can be converted into the following computation model:

T_job is the execution time of one job. The whole cluster runs multiple jobs at the same time, and the job that finishes last determines the optimal time span of the chromosome's allocation scheme.

The fitness function based on the optimal time span is therefore expressed as:

Here F_time(c) is the optimal time span of the c-th chromosome in the population, N is the number of jobs, and ChromosomeScale is the population size. The optimal time spans of all chromosomes in one round of iteration are expressed as:

AllF_time = {F_time(1), F_time(2), F_time(3), ..., F_time(c)}

In an embodiment of the invention, the fitness function based on load balancing comprises:

The invention uses the standard deviation to measure the dispersion of a group of data relative to its mean: the smaller the standard deviation, the closer the values are to the mean and the more balanced the cluster load. The load-balancing fitness function is therefore expressed as:

AllF_load = {F_load(1), F_load(2), F_load(3), ..., F_load(i)}

Here AllF_load is the set of load-balancing fitness values of all chromosomes in the current round of iteration; the index within the set is the chromosome number, and the element value is the load-balancing fitness value of that chromosome.

C) Normalization:

The optimal time span fitness function and the load-balancing fitness function are different evaluation indicators with different dimensions and units. To eliminate the effect of the different dimensions, the data must be normalized. The invention draws on min-max (deviation) normalization and uses the following normalization formula:

$$x^{*} = \frac{x - \min}{\max - \min}$$

For a chromosome, F_time(k)* represents the execution time: the larger the value, the longer the execution time and the smaller the fitness should be. Likewise, F_load(k)* represents the dispersion of the number of tasks assigned to each node relative to the average: the larger the value, the less balanced the cluster load and the smaller the fitness should be. For each chromosome, the fitness function based on both the optimal time span and load balancing is therefore expressed as:
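As an illustration of min-max normalization, and of a fitness value that shrinks as either normalized indicator grows, a minimal Python sketch follows. The normalization is the standard min-max form. The combination in `combined_fitness`, including the weight parameter and its default of 0.5, is purely an assumption chosen to satisfy the monotonicity described above and is not the formula of the invention.

```python
def min_max_normalize(values):
    """Scale values into [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def combined_fitness(time_norm, load_norm, weight=0.5):
    """Hypothetical combination: larger normalized time or load lowers fitness."""
    return 1.0 - (weight * time_norm + (1.0 - weight) * load_norm)
```

Normalizing both indicators to the same range first is what makes them comparable, since time spans and task-count deviations are measured in different units.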

6) fitness probability matrix:

The fitness probability matrix gives, for each chromosome, the probability of being selected in the next iteration round, computed from its fitness; the larger the fitness, the larger the selection probability. It is calculated as follows:

SelectionProbability={ F_prob(1), F_prob(2), F_prob(3) ..., F_prob(i)}

where SelectionProbability denotes the set of fitness probabilities of all chromosomes in the previous iteration round; the subscript is the chromosome number and each value in the matrix is the fitness probability of that chromosome, that is, its probability of being selected in the next iteration round. SelectionProbability must therefore satisfy:
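A fitness probability vector of this kind can be sketched as below. The sketch assumes fitness-proportional probabilities, F_prob(i) = F(i) / ΣF, which is the usual choice for roulette selection and under which the probabilities sum to 1; the function name selection_probabilities and the sample fitness values are hypothetical.

```python
def selection_probabilities(fitness):
    """Return one selection probability per chromosome, proportional to fitness."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("fitness values must have a positive sum")
    return [f / total for f in fitness]


fitness = [1.0, 3.0, 4.0, 2.0]  # hypothetical fitness values of four chromosomes
probs = selection_probabilities(fitness)
print(probs)
```

The chromosome with the largest fitness receives the largest probability, matching the description above.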

7) selection-duplication operator:

Once the chromosome fitness values have been calculated, the algorithm enters the iterative loop. First, the selection operator chooses two chromosomes with high fitness to enter the crossover operation. The selection operator in the present invention uses the roulette wheel selection (RWS) method, with each individual's selection probability taken from the fitness probability matrix. To ensure that outstanding chromosomes are passed on to the next generation, and to prevent crossover and mutation operations from destroying them and causing a vicious iteration, the present invention adds a duplication operator to the selection operator: a reproduction ratio is set in each iteration, and the several chromosomes with the highest fitness in the previous generation are copied intact into the new generation. Setting the reproduction ratio keeps the population evolving in a favorable direction and ensures the stability of the algorithm.

The selection-duplication operator must satisfy the following formula:

where CrossoverNum denotes the number of chromosomes chosen by the roulette method for the crossover operation, CopyNum denotes the number of chromosomes copied directly, and cp is the reproduction ratio. The reproduction ratio should not be set too large, because an excessive ratio makes it hard for the algorithm to converge. Experiments show that cp = 0.2 gives good results.
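The selection-duplication step can be sketched as follows, assuming CopyNum = Scale × cp (rounded down) and CrossoverNum = Scale − CopyNum, where Scale is the population size. The function name select_and_copy, the rounding rule, and drawing the two parents with replacement are assumptions of this sketch; roulette selection is done here with the standard-library call random.choices.

```python
import random


def select_and_copy(population, fitness, cp=0.2, rng=random):
    """Copy the fittest chromosomes and draw parent pairs by roulette wheel."""
    scale = len(population)
    copy_num = int(scale * cp)        # assumption: round down
    crossover_num = scale - copy_num  # children still to be produced
    # Duplication: the copy_num chromosomes with the highest fitness pass unchanged.
    ranked = sorted(range(scale), key=lambda i: fitness[i], reverse=True)
    elites = [list(population[i]) for i in ranked[:copy_num]]
    # Selection: each pair is drawn with probability proportional to fitness
    # (with replacement, so a pair may contain the same chromosome twice).
    parent_pairs = [
        tuple(rng.choices(population, weights=fitness, k=2))
        for _ in range(crossover_num)
    ]
    return elites, parent_pairs


# Hypothetical population of 10 chromosomes over 5 tasks; fitness rises with index.
population = [[i % 3] * 5 for i in range(10)]
fitness = [float(i + 1) for i in range(10)]
elites, parent_pairs = select_and_copy(population, fitness, cp=0.2,
                                       rng=random.Random(42))
print(len(elites), len(parent_pairs))
```

With 10 chromosomes and cp = 0.2, two chromosomes are copied directly and eight parent pairs are drawn for crossover.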

8) crossover operator:

The crossover operation is the main way the population generates new individuals. The present invention encodes chromosomes as two-dimensional matrices. For the crossover operation, the two-dimensional matrix is first decoded into one-dimensional form:

chromosomeMatrix = [2, 3, 1, 4, 5, 7, 2, ..., 9];

where chromosomeMatrix denotes one chromosome, the index denotes the task number, and the element value denotes the node number; for example, chromosomeMatrix[1] = 3 means that task 1 is assigned to node 3 for execution. The present invention crosses the parent chromosomes by a random complementary method: the selection operator first chooses two parent chromosomes with high fitness, and a common cut position is chosen at random and recorded as the index flag. The father chromosome contributes the segment chromosomeMatrix[0, flag] and the mother chromosome contributes chromosomeMatrix[flag, end]; the two segments are then recombined to form the child chromosome. The crossover process is shown in Fig. 3, the schematic diagram of the chromosome crossover operation.
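The random complementary crossover described above amounts to a single-point crossover on the one-dimensional encoding and can be sketched as follows. The function name crossover, the half-open treatment of the cut (genes before flag from the father, genes from flag onward from the mother), and the restriction of flag to 1..length−1 so that both parents contribute are assumptions of this sketch.

```python
import random


def crossover(father, mother, rng=random):
    """Join father[0:flag] and mother[flag:end] at a random common cut position."""
    if len(father) != len(mother):
        raise ValueError("parent chromosomes must have the same length")
    # Assumption: the cut lies strictly inside the chromosome.
    flag = rng.randrange(1, len(father))
    child = father[:flag] + mother[flag:]
    return child, flag


father = [1] * 8  # hypothetical parent: every task on node 1
mother = [2] * 8  # hypothetical parent: every task on node 2
child, flag = crossover(father, mother, rng=random.Random(7))
print(flag, child)
```

Because both parents use the same task indexing, the child remains a complete assignment of every task to exactly one node.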

9) mutation operator:

The present invention uses an adaptive mutation probability: the mutation probability is adjusted linearly with the chromosome's fitness between the population mean and the population maximum, which speeds up convergence to the global optimum as the algorithm approaches the optimal solution. The adaptive mutation probability is calculated as follows:

where P_var(k) denotes the mutation probability of chromosome k, F_Adapt(max) denotes the maximum fitness value among the chromosomes of the population, F_Adapt(avg) denotes their average fitness value, and F_Adapt(k) denotes the fitness value of chromosome k. λ_min and λ_max are mutation factors that set the lower and upper bounds of the mutation rate (λ_min, λ_max ∈ (0,1)). The formula is explained as follows:

1) If a chromosome's fitness is high, at or above the average, its mutation probability should be reduced so that the outstanding genes on the chromosome are not destroyed. That is, when F_Adapt(k) ≥ F_Adapt(avg), the mutation probability is computed dynamically by the method in formula (15), so that the higher the fitness, the lower the mutation rate. Repeated experiments show that a mutation factor of λ_min = 0.005 works well.

2) If a chromosome's fitness is low, below the average, the chromosome is given a larger mutation probability to strengthen the population's global search ability. When F_Adapt(k) < F_Adapt(avg), the maximum mutation probability λ_max is assigned. Repeated experiments show that a mutation factor of λ_max = 0.05 works well.
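The two cases above can be sketched as follows. The linear form used for the first case, P_var(k) = λ_max − (λ_max − λ_min) × (F(k) − F_avg) / (F_max − F_avg), is an assumption: it is one common way to interpolate linearly between the population mean and maximum and is consistent with the description, but it is not confirmed to be formula (15). The function name and the handling of the case F_max = F_avg are likewise assumptions.

```python
def mutation_probability(f_k, f_avg, f_max, lam_min=0.005, lam_max=0.05):
    """Adaptive mutation probability of one chromosome (assumed linear form)."""
    # Case 2: below-average chromosomes receive the maximum probability.
    # Assumption: the same value is used when all fitness values are equal,
    # which also avoids dividing by zero.
    if f_k < f_avg or f_max == f_avg:
        return lam_max
    # Case 1: fall linearly from lam_max at the mean to lam_min at the maximum.
    return lam_max - (lam_max - lam_min) * (f_k - f_avg) / (f_max - f_avg)


for f in (0.2, 0.5, 0.75, 1.0):  # hypothetical fitness values, mean 0.5, max 1.0
    print(f, mutation_probability(f, f_avg=0.5, f_max=1.0))
```

Under this assumed form the fittest chromosome mutates with probability λ_min and every chromosome at or below the mean mutates with probability λ_max.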

To reduce the amount of computation, the mutation operation mutates a run of consecutive gene positions. The mutation position is determined first; the number of genes to mutate is then calculated and the mutation is carried out.

The mutation operation proceeds as follows:

1) Mutation decision: a random number P_rand(k) in [0,1] is generated and compared with the mutation probability of the chromosome; if P_rand(k) < P_var(k), the mutation operation is executed on this chromosome.

2) Number of mutated genes: to prevent too many genes of one chromosome from mutating, which would make it hard for the algorithm to converge, the number of mutated genes must satisfy the following constraint:

0<VarNum≤P_var(k)×chromosomeMatrix.length

where VarNum denotes the number of consecutive genes allowed to mutate and chromosomeMatrix.length denotes the number of genes in the chromosome. A random integer satisfying the VarNum constraint is used as the number of genes to mutate.

3) Mutation position: a number P_index is generated at random within the chromosome length chromosomeMatrix.length to denote the mutation position; counting the number of mutated genes backward from this position determines the mutated segment. If the segment runs past the end of the chromosome, the remaining genes are mutated starting from the 1st gene.

4) Executing the mutation: once the mutated segment is determined, the gene values in it are changed at random to strengthen the population's global search ability. For example, suppose the cluster has 100 tasks and 10 nodes, chromosome k satisfies P_rand(k) < P_var(k), and the calculation gives VarNum = 3 and P_index = 5; then three genes are mutated in succession starting from index 5 of the chromosome.

The mutation process is shown in Fig. 4, the schematic diagram of the chromosome mutation operation.
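The four mutation steps can be sketched together as follows. The function name mutate, the 0-based gene and node numbering, rounding the VarNum upper bound down, and skipping the mutation when that bound is below 1 are assumptions of this sketch; a randomly drawn gene value may also happen to equal the old one.

```python
import random


def mutate(chromosome, p_var, node_count, rng=random):
    """Mutate a run of consecutive genes, wrapping around at the end."""
    length = len(chromosome)
    max_var = int(p_var * length)  # upper bound from 0 < VarNum <= P_var * length
    # Step 1: mutation decision (assumption: no mutation if no VarNum >= 1 fits).
    if max_var < 1 or rng.random() >= p_var:
        return list(chromosome), []
    # Step 2: number of consecutive genes to mutate.
    var_num = rng.randint(1, max_var)
    # Step 3: start position; indices past the end wrap to the first gene.
    p_index = rng.randrange(length)
    positions = [(p_index + offset) % length for offset in range(var_num)]
    # Step 4: assign a random node to every gene in the segment.
    mutated = list(chromosome)
    for pos in positions:
        mutated[pos] = rng.randrange(node_count)
    return mutated, positions


original = [3] * 100  # hypothetical chromosome: 100 tasks, all on node 3
mutated, positions = mutate(original, p_var=0.05, node_count=10,
                            rng=random.Random(1))
print(positions)
```

With 100 tasks and P_var(k) = 0.05, at most five consecutive genes change in one mutation, which keeps each step small.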

The invention is further described below with reference to experiments.

The present invention verifies the running efficiency and the effectiveness of the algorithm with a two-part experiment.

The first part uses simulation experiments to verify the algorithm's advantage in running efficiency over the ant colony optimization algorithm, and also to obtain the optimal parameter settings for the algorithm;

The second part builds a Hadoop cluster environment and uses the HiBench performance benchmarking tool to verify the advantage of the MCGA scheduler of the present invention in overall task completion time over AntScheduler, a resource scheduler implemented with the ant colony algorithm, and over the Capacity scheduler shipped with Hadoop, thereby demonstrating the correctness and effectiveness of the algorithm.

First part:

1) Experimental environment

A) Environment for evaluating algorithm running efficiency

The simulation program is implemented in JavaScript, runs on the Chrome V8 engine, and uses ECharts to generate the visual charts.

B) Environment for evaluating algorithm effectiveness

To verify the effectiveness and correctness of the algorithm, it must be tested on a Hadoop cluster. The experimental environment is a Hadoop cluster built from 5 servers, each with 2 cores and 6 GB of memory, for a total of 10 cores and 30 GB of memory. The cluster has one NameNode, one ResourceManager, 4 DataNodes, and 4 NodeManagers. The cluster topology is shown in Fig. 5, the cluster node topology diagram.

2) Evaluation of algorithm running efficiency

To keep the test environment stable, both algorithms use the same task and node settings in the simulation experiments: 100 tasks of fixed size and 10 nodes of fixed execution capability. The running efficiency of the algorithms under different conditions is compared by repeatedly adjusting the algorithm parameters, and the optimal parameter settings are finally obtained.

A) The parameter settings and experimental results for the ant colony algorithm are recorded as follows:

Table 1 Ant colony algorithm experiment parameter record

Experiments No. 2.1 and No. 4.1, which have a shorter task completion time and a shorter algorithm execution time respectively, are shown in Fig. 6 (ant colony algorithm, experiment No. 2.1) and Fig. 7 (ant colony algorithm, experiment No. 4.1).

B) The parameter settings and experimental results for the genetic algorithm of the present invention are recorded as follows:

Table 2 Genetic algorithm experiment parameter record

The experimental results under different parameter settings are shown in Fig. 8 to Fig. 13.

The experimental results for the AntScheduler algorithm and the MCGA algorithm are analyzed as follows:

AntScheduler algorithm: the experiments show that the algorithm converges well but easily mistakes a local optimum for the global optimum, and its execution time is relatively long. Its stability is also poor: even with the same set of tasks and the same set of nodes, the overall task execution time of the final allocation plan differs from run to run.

MCGA algorithm: the experiments show that the MCGA algorithm has a shorter running time and higher execution efficiency. It is also more stable than the AntScheduler algorithm: with the same set of tasks and the same set of nodes, the overall task execution time of the final allocation plan is roughly the same. The experimental data show that the more iterations and the more chromosomes there are, the more likely the global optimum is found, whereas a larger reproduction ratio makes the algorithm harder to converge and the global optimum harder to obtain. The data also show that experiments No. 1.1 to 1.4 have a clear advantage over the other experiments in algorithm execution time, convergence, and task execution time. The optimal parameter settings for the MCGA algorithm are therefore 100 iterations, a population size (number of chromosomes) of 100, and a reproduction ratio of 0.2.

It can be seen that the MCGA algorithm of the present invention meets the requirements for running efficiency.

3) Evaluation of algorithm effectiveness

The task execution times of the MCGA scheduler of the present invention, the AntScheduler resource scheduler implemented with the ant colony algorithm, and the default Hadoop Capacity scheduler are compared and analyzed below on four groups of job sets. To avoid data error, each task execution time is the average of 5 runs, as shown in the following table.

Table 3 Comparison of task completion times under different job sets

The results are plotted as a histogram in Fig. 14, the average task completion time of the three schedulers under the four groups of job sets.

According to the table above, the advantage of the algorithm is not obvious for small jobs, because a small job set is divided into fewer map and reduce tasks and cluster resources are relatively sufficient. In a large job set, relatively more map and reduce tasks are produced, competition for resources becomes increasingly fierce, and the performance advantage of the algorithm emerges.

Second part:

To verify the stability of the algorithm, the MCGA, AntScheduler, and Capacity schedulers are each used to run the second and third groups of job sets 20 times, and the variation in task execution time is observed.

As Fig. 15 and Fig. 16 show, the task execution times of the Capacity and AntScheduler schedulers fluctuate widely for the same group of jobs. For the Capacity scheduler, experiments 6, 11, 14, and 17 on the second group of job sets and experiments 4, 8, 11, 13, and 18 on the third group are singular points: at these points the execution time of the Capacity scheduler is relatively long and differs sharply from that of the neighboring points. This is determined by its first-come, first-served resource allocation: if tasks are assigned to nodes with poor performance, the cluster load becomes unbalanced and the overall task execution time varies considerably. The MCGA scheduler of the present invention allocates resources with an intelligent optimization algorithm, tuning dynamically and allocating intelligently according to the job set and the state of node resources, so its overall task completion time fluctuates less and its performance is more stable.

The above experimental analysis shows that the present invention is robust: whether handling large or small job sets, its performance is better than that of the resource scheduling algorithm implemented with the ant colony algorithm and the default Hadoop YARN scheduling algorithm, and it is an effective resource allocation method.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A Hadoop resource scheduling method, characterized in that the Hadoop resource scheduling method comprises:

after a user submits a job to the cluster, initializing the prediction model, the task matrix, and the node matrix, and then constructing a double fitness function；

2. The Hadoop resource scheduling method according to claim 1, characterized in that the Hadoop resource scheduling method comprises:

Step 1: after the user submits a job to the cluster, initializing the prediction model, constructing the task matrix and the node matrix, and saving the information and the encoding result to a file；

Step 2: constructing the initial population: generating initial feasible-solution chromosomes at random from the task matrix and the node matrix, the population size being denoted Scale；

Step 4: termination condition check: before entering the next round of iterative evolution, first checking whether the termination condition is met, that is, whether the iteration limit has been reached; if it is met, selecting the chromosome with the highest fitness in the current population as the optimal solution; otherwise entering a new iteration round；

Step 5: fitness probability calculation: calculating from the fitness values the probability that each chromosome is selected in the next evolution, then entering the selection-duplication, crossover, and mutation operations to generate new chromosomes；

Step 6: in the duplication operator, copying the Scale*cp chromosomes with the highest fitness in the population of size Scale into the next iteration according to the reproduction ratio cp；

Step 7: repeatedly executing the selection operator to choose two chromosomes as parent chromosomes, which enter the crossover operation to generate the remaining Scale*(1-cp) chromosomes；

Step 8: executing the mutation operation on the generated Scale*(1-cp) chromosomes, the chromosomes whose mutation is complete entering the next iteration；

Step 9: returning to step 3 to carry out a new round of iterative evolution.

3. The Hadoop resource scheduling method according to claim 2, characterized in that, in step 1, when the user submits a job to the cluster, the cluster environment model is denoted G = {NodeSet, JobSet}, where NodeSet = {node₁, node₂, node₃, ..., node_n} denotes the set of node resources and JobSet = {Job₁, Job₂, Job₃, ..., Job_n} denotes the set of jobs, each Job_i = {task₁, task₂, task₃, ..., task_n} (0 ≤ i < n), where the tasks include map tasks and reduce tasks；the tasks in the set JobSet are assigned to the nodes in NodeSet for execution, so that the job tasks run as a whole；

The method of initializing the prediction model comprises:

<FileSize, SplitSize, SplitNum, MapTime, ReduceTime>

where FileSize denotes the size of the current job, SplitSize denotes the split size of the job, SplitNum denotes the number of splits of the job, and MapTime and ReduceTime denote the execution times of the Map stage and the Reduce stage of the job respectively；the execution time of a new task is then predicted by evaluating the attribute information of historical tasks；

The constructed TS_(map/reduce) model is stored in the ResourceManager, and each NodeManager periodically passes its node attribute information to the ResourceManager through the heartbeat mechanism.

4. The Hadoop resource scheduling method according to claim 2, characterized in that, in step 1,

the method of initializing the task matrix comprises:

the job set is represented as JobSet = {Job₁, Job₂, Job₃, ..., Job_n}, each Job_i = {JobSize, SplitSize, SplitNum} (0 ≤ i < n), where JobSize denotes the job size, SplitSize denotes the size of each split, and SplitNum denotes the number of splits；the method of initializing the node matrix comprises: for the node set NodeSet = {node₁, node₂, node₃, ..., node_n}, node_i = {cpuSpeed_i, AllR_i, UsedR_i, Cnum_i, Load_i} (0 ≤ i < n), where cpuSpeed_i denotes the CPU floating-point capability of node i, AllR_i denotes the total resources of the node server, UsedR_i denotes the resources already used on the node server, and Cnum_i denotes the number of CPU cores of the node server.

5. The Hadoop resource scheduling method according to claim 2, characterized in that, in step 2, the method of generating the chromosome matrix comprises: the chromosome matrix is an n × t matrix in which the n rows denote task numbers and the t columns denote node numbers, written ChromosomeMatrix = (chromosomeMatrix[i][j])_n×t；

6. The Hadoop resource scheduling method according to claim 2, characterized in that, in step 3, a double fitness function based on the optimal time span and on load balancing is used, so that population evolution advances toward finding the shortest task execution time while keeping the load of every cluster node balanced；

the fitness function based on the time span comprises: for the execution time of a Job,

T_job is the execution time of one job; the cluster as a whole runs multiple jobs at the same time, and the completion time of the job that finishes last is the optimal time span of this chromosome's allocation plan；

the fitness function based on the optimal time span is expressed as:

where F_time(c) denotes the optimal time span of the c-th chromosome in the population, N denotes the number of jobs, and ChromosomeScale denotes the population size；the optimal time spans of all chromosomes in one iteration round are expressed as:

AllF_time={ F_time(1), F_time(2), F_time(3) ... ..., F_time(c)}

where AllF_time denotes the set of time spans of all chromosomes in the current iteration round; the subscript is the chromosome number and each element is the time span value of that chromosome.

7. The Hadoop resource scheduling method according to claim 6, characterized in that the method of calculating the execution times of Map tasks and Reduce tasks comprises:

where T_map(i, j, k) denotes the execution time of the k-th split of job i when assigned to node j, Split(i, k) (0 ≤ k < SplitNum) denotes the size of the k-th split of job i, and cpuSpeed_j denotes the CPU speed of node j；if the split size differs from the file block size, the task may need file blocks transferred over the network to assemble a split, in which case Block(i, k) denotes the size of the data blocks that the split task needs transferred from other nodes and node(i, j) denotes the network transfer speed between the node storing task i and the executing node j；a large Job is divided into several splits, one Map task per split, which execute in parallel, and the execution time of the Map task that finishes last is the execution time of the whole Map stage; for one Job_i, T_map = Max(T_map(i, j, k))；

B) For Reduce tasks, the task execution time is predicted from the historical information <FileSize, SplitSize, SplitNum, MapTime, ReduceTime> built by the TS_(map/reduce) model；

the fitness function based on load balancing comprises:

in the resource allocation process, allocation strategies that balance node load better give higher cluster resource utilization；for the load-balancing fitness function, after population initialization the number of tasks in the task set JobSet is denoted JobSet.length and the number of nodes in the node set NodeSet is denoted NodeSet.length, so the average number of tasks assigned to each node is:

the standard deviation measures how far a set of data is dispersed from its mean: the smaller the standard deviation, the closer the values are to the average and the more balanced the cluster load；the load-balancing fitness function is expressed as:

where F_load(c) denotes the standard deviation of the number of tasks assigned to the nodes of this chromosome, that is, its load-balancing fitness value, TaskNum(c, j) denotes the number of tasks assigned to the j-th node of the c-th chromosome, N denotes the total number of nodes, and AvgTask denotes the average number of tasks assigned to each node in this chromosome's allocation plan；the set of load-balancing fitness values of all chromosomes in one iteration round is expressed as:

AllF_load={ F_load(1), F_load(2), F_load(3) ... ..., F_load(i)}

where AllF_load denotes the set of load-balancing fitness values of all chromosomes in the current iteration round; the subscript is the chromosome number and each element is the load-balancing fitness value of that chromosome；

Normalization: the following standardization formula is used:

For a chromosome, F_time(k)* represents the execution time: the larger the value, the longer the execution time and the smaller the fitness should be；likewise, F_load(k)* represents the dispersion of the number of tasks assigned to each node relative to the average number assigned: the larger the value, the more unbalanced the cluster load and the smaller the fitness should be；the fitness function of each chromosome, based on both the optimal time span and load balancing, is expressed as:

8. The Hadoop resource scheduling method according to claim 7, characterized in that

the method of predicting from the constructed historical information <FileSize, SplitSize, SplitNum, MapTime, ReduceTime> comprises:

Step 1: let the job to be predicted be NewJob; first search the TS_(map/reduce) model for the job set JobSet₁ = {Job₁, Job₂, Job₃, ..., Job_k} whose size (FileSize) is close to that of the current job NewJob；

Step 2: then find in JobSet₁ the job set JobSet₂ = {Job₁, Job₂, Job₃, ..., Job_k} whose split size (SplitSize) and number of splits (SplitNum) are the same；

Step 3: then find in JobSet₂ the job set JobSet₃ = {Job₁, Job₂, Job₃, ..., Job_k} whose Map task execution time is close to that of the current job NewJob；

where T_reduce denotes the overall execution time of the Reduce stage of the current Job, T_map denotes the overall execution time of its Map stage, AvgT_map denotes the average Map-stage execution time of all jobs in the set JobSet₃, and AvgT_reduce denotes their average Reduce-stage execution time；because node load changes constantly, the same task on the same node can have different execution times at different moments, so a load adjustment parameter ω is added to balance task completion times under different loads, ω being the ratio of the current load to the historical average load.

9. The Hadoop resource scheduling method according to claim 2, characterized in that, in step 5, the fitness probability matrix is calculated as follows:

where F_prob(i) denotes the probability that chromosome i is selected in the next iteration round；the fitness probabilities are calculated after each iteration round ends and before the next begins, and the values are mapped to a one-dimensional matrix with the following structure:

SelectionProbability={ F_prob(1), F_prob(2), F_prob(3) ..., F_prob(i)}

where SelectionProbability denotes the set of fitness probabilities of all chromosomes in the previous iteration round; the subscript is the chromosome number and each value in the matrix is the fitness probability of that chromosome, that is, its probability of being selected in the next iteration round；SelectionProbability must satisfy:

the selection-duplication operator satisfies the following formula:

chromosomeMatrix = [2, 3, 1, 4, 5, 7, 2, ..., 9]；

where chromosomeMatrix denotes one chromosome, the index denotes the task number, and the element value denotes the node number; for example, chromosomeMatrix[1] = 3 means that task 1 is assigned to node 3 for execution；the parent chromosomes are crossed by a random complementary method: the selection operator first chooses two parent chromosomes with high fitness, and a common cut position is chosen at random and recorded as the index flag; the father chromosome contributes the segment chromosomeMatrix[0, flag] and the mother chromosome contributes chromosomeMatrix[flag, end], and the two segments are then recombined to form the child chromosome；

where P_var(k) denotes the mutation probability of chromosome k, F_Adapt(max) denotes the maximum fitness value among the chromosomes of the population, and F_Adapt(avg) denotes their average fitness value；F_Adapt(k) denotes the fitness value of chromosome k, and λ_min and λ_max are mutation factors that set the lower and upper bounds of the mutation rate.

10. A Hadoop resource scheduling system implementing the Hadoop resource scheduling method according to claim 1.