CN109684070A - Scheduling method in cloud computing parallel operation - Google Patents

Scheduling method in cloud computing parallel operation

Info

Publication number
CN109684070A
CN109684070A (application CN201810997296.7A)
Authority
CN
China
Prior art keywords
task
virtual machine
time
starting
platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810997296.7A
Other languages
Chinese (zh)
Other versions
CN109684070B (en)
Inventor
马建峰
张兆一
李辉
张世哲
李金库
姚青松
宁建斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201810997296.7A priority Critical patent/CN109684070B/en
Publication of CN109684070A publication Critical patent/CN109684070A/en
Application granted granted Critical
Publication of CN109684070B publication Critical patent/CN109684070B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 Indexing scheme relating to G06F9/00
    • G06F 2209/50 Indexing scheme relating to G06F9/50
    • G06F 2209/5011 Pool
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 Indexing scheme relating to G06F9/00
    • G06F 2209/50 Indexing scheme relating to G06F9/50
    • G06F 2209/5017 Task decomposition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a scheduling method for parallel operations in cloud computing. After the virtual machine configuration is determined, 1, 2, 3, …, n virtual machines are started simultaneously, and the average startup time t1, t2, t3, …, tn of each virtual machine in each test is recorded; the maximum-likelihood values of a and b are obtained from the formula ti = a·i + b. The task execution time τ is then measured on a single virtual machine, and starting m virtual machines to execute the task in parallel is the optimal scheme. Next, the task window divides the task into n approximately equal parts (n >> m), which form a task set placed in a task pool. The task scheduler creates m virtual machines in turn and starts the parallel task-processing procedure; whenever a new virtual machine starts or a virtual machine finishes its task, the scheduler assigns it a task, until all virtual machines have completed all tasks and returned the task results. The optimal scheduling strategy of the invention is designed around the characteristics of virtual machine startup time; it achieves a near-optimal speed-up ratio, ensures that the tasks on all virtual machines finish at almost the same time, greatly reduces total task execution time, improves task efficiency, and reduces the waste of system resources.

Description

Scheduling method in cloud computing parallel operation
Technical field
The invention belongs to the field of computer technology, and more particularly to a scheduling method for parallel operations in cloud computing, one of the cloud computing technologies. The invention can be used to perform computation-intensive tasks in a cloud computing environment.
Background technique
Thanks to its convenient storage services and flexible billing methods, cloud computing has led more and more enterprises and individuals to move local data onto cloud servers, reducing the local storage burden and administrative overhead, and in recent years it has gained a considerable reputation among a large number of enterprises and individuals. Users no longer need to maintain their own hardware and software resources; they can send computation requests to the cloud at any time. When executing large-scale computation or transmission tasks, virtual machines (VMs) on the cloud platform can be started to perform parallel computation or parallel transmission. For example, to protect data privacy, a large amount of plaintext may be encrypted into ciphertext and the ciphertext transmitted from the server to the client, which takes considerable time to complete. In general, a task can be decomposed into multiple subtasks and a large number of virtual machines started to execute the subtasks, so as to shorten the completion time.
However, many problems remain to be solved. Because starting a virtual machine itself takes time, it is not the case that more virtual machines always mean a shorter total completion time. How many virtual machines should be started, and how should tasks be assigned to them? Determining an optimal scheduling scheme that lets the cloud platform map tasks onto virtual machines in the best way and further shortens the total task completion time has therefore become a hot issue in the cloud computing field.
Neusoft Group Co., Ltd. applied for a patent, "A parallel computing method and system" (application number 201310591160.3, publication number 103617086B), which discloses a parallel computing method comprising: monitoring each compute node to obtain node monitoring data; calculating the load capacity of the node from the monitoring data; and allocating tasks according to the load capacity of the nodes. The deficiency of this method is that it schedules only platform resources and does not optimize the task scheduling method itself.
Shandong University applied for a patent, "Dynamic load balancing method for a Linux-based parallel computing platform" (application number 201310341592.9, publication number 103399800B), which discloses a dynamic load balancing method for a Linux-based parallel computing platform. The method is as follows: during parallel computation, the total computing task is divided into multiple phases of equal execution time. Using the system's regular job-scheduling technique, before the parallel computation of each phase begins, the current resource utilization of each node is read and combined with the computing performance and computational complexity of each node, and the computing tasks of the nodes are dynamically distributed, ensuring that the computing time of each node in each phase is roughly equal and reducing the system's synchronization waiting delay. Through this dynamic adjustment strategy, the total computing task can be completed with higher resource utilization. The deficiency of this method is that it is limited to the Linux platform and does not consider the influence of system startup time on execution efficiency.
The main strategies widely studied and practiced at present, such as dynamically adjusting the CPU frequency, shutting down idle machines or putting them into a sleep state, and migrating and consolidating virtual machines, in essence all maximize energy utilization through dynamic scheduling and consolidation of data-center resources. However, there is still no optimal acceleration strategy that minimizes the total task time when the task amount and system resources are fixed.
Therefore, when system resources are fixed, how to shorten the total task completion time as much as possible during parallel execution, and how to find the appropriate number of virtual machines to maximize the speed-up ratio, has become an urgent problem to be solved.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a scheduling method for parallel operations in cloud computing, which determines, before task execution, the number of virtual machines that achieves optimal acceleration according to the system resources and task amount, then splits the task and starts the virtual machines in turn to execute it.
The present invention is achieved through the following technical solution:
A scheduling method in cloud computing parallel operation comprises the following steps:
S1. Calculate the cloud platform parameters:
After the virtual machine configuration is determined, start 1, 2, 3, …, n virtual machines simultaneously, record the average startup time t1, t2, t3, …, tn of each virtual machine in each test, substitute the data into ti = a·i + b, compute the likelihood values of a and b, and record them;
where ti is the average time to start one virtual machine when i virtual machines are started simultaneously, and a and b are system characteristic parameters;
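Assuming the startup-time measurements carry independent Gaussian noise, the maximum-likelihood estimates of a and b reduce to an ordinary least-squares line fit. A minimal sketch in Python (the function name and interface are illustrative, not part of the patent):

```python
def fit_startup_model(times):
    """Fit t_i = a*i + b by least squares, the maximum-likelihood
    estimate under Gaussian noise. times[0] is t_1 (one VM started),
    times[n-1] is t_n (n VMs started simultaneously)."""
    n = len(times)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_t = sum(times) / n
    # Closed-form simple linear regression: a = S_xt / S_xx
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    a = sxt / sxx
    b = mean_t - a * mean_x
    return a, b
```

For noise-free data generated from ti = 2·i + 5 the fit recovers a = 2 and b = 5 exactly.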
S2. Determine the optimal acceleration strategy:
Execute the task on a single virtual machine and measure the task time τ; then starting m virtual machines to execute the task in parallel is the optimal scheme;
S3. Split the task:
When a task arrives at the task window, the task window divides it into n approximately equal parts (n >> m), which form a task set placed in the task pool;
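The division into n approximately equal parts can be sketched as follows, treating the task as a list of work items (a simplifying assumption for illustration; the patent does not fix a particular task representation):

```python
def split_task(items, n):
    """Divide a task (here a list of work items) into n approximately
    equal parts, to be placed in the task pool."""
    k, r = divmod(len(items), n)
    parts, start = [], 0
    for i in range(n):
        size = k + (1 if i < r else 0)  # first r parts get one extra item
        parts.append(items[start:start + size])
        start += size
    return parts
```

For example, 10 items split into 3 parts yields parts of sizes 4, 3, and 3, with no item lost or duplicated.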
S4. Create the virtual machines:
The task scheduler creates m virtual machines in turn and starts the parallel task-processing procedure;
S5. Distribute the tasks:
Whenever a new virtual machine starts or a task is completed on a virtual machine, the task scheduler assigns that virtual machine a task;
S6. Complete the task:
All tasks are completed and the task results are returned.
Preferably, in step S5 the specific method by which the task scheduler assigns tasks is: the task scheduler takes one part of the task from the task pool and maps it onto the virtual machine; the virtual machine begins executing the task; after the task on the virtual machine finishes, the task pool is checked; if the task pool is not empty, tasks continue to be assigned to the virtual machine; if the task pool is empty, the virtual machine is restored to its original state.
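The preferred assignment loop can be sketched with threads standing in for virtual machines and a queue standing in for the task pool (the names and the thread-based model are illustrative assumptions, not the patent's implementation):

```python
import queue
import threading

def run_schedule(parts, m, execute):
    """Sketch of steps S4-S6: m 'virtual machines' (threads here) each
    pull one part of the task from the pool, execute it, and check the
    pool again; a VM stops (is 'restored') once the pool is empty."""
    pool = queue.Queue()
    for part in parts:              # the n parts produced in step S3
        pool.put(part)
    results = queue.Queue()

    def vm_worker():
        while True:
            try:
                part = pool.get_nowait()   # scheduler assigns a task
            except queue.Empty:
                return                     # pool empty: restore the VM
            results.put(execute(part))

    workers = [threading.Thread(target=vm_worker) for _ in range(m)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()                    # all VMs finish at nearly the same time
    return [results.get() for _ in range(results.qsize())]
```

Because every worker returns only when the pool is empty, the workers drain the pool cooperatively and finish within one task of each other, which is the near-simultaneous completion the method relies on.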
Compared with the prior art, the invention has the following beneficial technical effects:
1) The optimal scheduling strategy in the invention is designed around the characteristics of the virtual machine startup time. It achieves a near-optimal speed-up ratio, ensures that the tasks on all virtual machines finish at almost the same time, greatly reduces the task execution time, improves task efficiency, and reduces the waste of system resources.
2) The optimal acceleration strategy in the invention is realized by controlling the number of virtual machines and partitioning the task. Once a computing task is given, the near-optimal speed-up system can calculate the optimal number of virtual machines and partition the task automatically. It has a wide range of application and can be used together with existing dynamic resource scheduling schemes, greatly improving system efficiency.
Detailed description of the invention
Fig. 1 is a flow diagram of the scheduling method;
Fig. 2 is a graph of the total startup time of multiple virtual machines;
Fig. 3 is a schematic diagram of the determination of the task time τ;
Fig. 4 is a schematic diagram of task distribution.
Specific embodiment
The present invention will be further described below with reference to the accompanying drawings and a specific embodiment.
As shown in Fig. 1, which is a schematic diagram of the steps of the optimal scheme in cloud computing parallel operation according to the present invention, the specific flow of the system is described as follows:
Step 1: Calculate the cloud platform parameters: after the virtual machine configuration is determined, start 1, 2, 3, …, n virtual machines simultaneously, record the average startup time t1, t2, t3, …, tn of each virtual machine in each test, substitute the data into ti = a·i + b (i = 1, 2, …, n), compute the likelihood values of a and b, and record them;
Step 2: Determine the optimal acceleration strategy: execute the task on a single virtual machine and measure the task time τ; then starting m = √(τ/a) virtual machines to execute the task in parallel is the optimal scheme;
Step 3: Split the task: when a task arrives at the task window, the task window divides it into n approximately equal parts (n >> m), which form a task set placed in the task pool;
Step 4: Create the virtual machines: the task scheduler creates m virtual machines in turn and starts the parallel task-processing procedure;
Step 5: Distribute the tasks: whenever a new virtual machine starts or a task is completed on a virtual machine, the task scheduler assigns that virtual machine a task;
Step 6: Complete the task: all tasks are completed and the task results are returned.
In step 1, the average startup time of each virtual machine when multiple virtual machines are started simultaneously follows T = A·N + B (where N is the number of virtual machines and T is the average startup time); this rule was obtained through repeated experiments. The experiment is as follows: start m virtual machines simultaneously on an OpenStack platform, record the startup time Tmin of the first virtual machine and the total startup time Tmax, and compute the average startup time Tavg of each virtual machine. The data obtained after many experiments are shown in Fig. 2; it can be seen that the startup time of multiple virtual machines approximately follows the curve T = A·N + B.
In step 2, the determination of the task time τ is shown schematically in Fig. 3. When the started virtual machines continuously take tasks from the task pool and execute them, all virtual machines can be regarded as finishing approximately simultaneously, so the moment the total task completes equals the task finish moment of each virtual machine, which in turn equals the sum of the average startup time of each virtual machine and the average task execution time, i.e. Tall = Tavg + τ/m = a·m + b + τ/m. Observation shows that when m = √(τ/a), the total time reaches its minimum.
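Because m must be an integer number of virtual machines, a practical implementation would evaluate Tall at the integer neighbours of the continuous minimizer √(τ/a). A small sketch under that assumption (the constant b does not affect the choice of m, so it is omitted from the comparison):

```python
import math

def optimal_vm_count(tau, a):
    """Minimize T_all(m) = a*m + b + tau/m over integer m >= 1.
    The continuous minimizer is m* = sqrt(tau/a); check its two
    integer neighbours since m must be a whole number of VMs."""
    m_star = math.sqrt(tau / a)
    candidates = {max(1, math.floor(m_star)), math.ceil(m_star)}
    total = lambda m: a * m + tau / m   # b is constant, so it is ignored
    return min(candidates, key=total)
```

For example, with τ = 100 and a = 1 the continuous minimizer is exactly 10, so 10 virtual machines are started.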
In step 5, Fig. 4 is a schematic diagram of task distribution, which specifically includes the following steps:
Step 1: when a virtual machine is started, take one part of the task out of the task pool, map it onto the virtual machine, and begin executing the task;
Step 2: after the task on a virtual machine finishes, check the task pool; if the task pool is not empty, repeat step 1; if the task pool is empty, restore the virtual machine to its original state.
The above description is only an example of the present invention and does not constitute any limitation of the invention. Obviously, after understanding the content and principle of the invention, professionals in this field may modify and improve the algorithm without departing from the principle and structure of the invention, but such modifications and improvements based on the algorithm of the invention remain within the protection scope of the claims of the invention.

Claims (3)

1. A scheduling method in cloud computing parallel operation, characterized by comprising the following steps:
S1. Calculate the cloud platform parameters:
After the virtual machine configuration is determined, start 1, 2, 3, …, n virtual machines simultaneously and record the average startup time t1, t2, t3, …, tn of each virtual machine in each test; substitute the n recorded data points into ti = a·i + b and compute the maximum-likelihood values of a and b;
where ti is the average time to start one virtual machine when i virtual machines are started simultaneously, and a and b are system characteristic parameters;
S2. Determine the optimal acceleration strategy:
Execute the task on a single virtual machine and measure the task time τ; then starting m virtual machines to execute the task in parallel is the optimal scheme;
S3. Split the task:
When a task arrives at the task window, the task window divides it into n approximately equal parts (n >> m), which form a task set placed in the task pool;
S4. Create the virtual machines:
The task scheduler creates m virtual machines in turn and starts the parallel task-processing procedure;
S5. Distribute the tasks:
Whenever a new virtual machine starts or a task is completed on a virtual machine, the task scheduler assigns that virtual machine a task;
S6. Complete the task:
All virtual machines complete all tasks and return the task results.
2. The scheduling method in cloud computing parallel operation according to claim 1, characterized in that the specific method of determining the optimal scheme in step S2 is:
when the started virtual machines continuously take tasks from the task pool and execute them, all virtual machines finish approximately simultaneously, so the moment the total task completes equals the task finish moment of each virtual machine, which equals the sum of the average startup time of each virtual machine and the average task execution time; the formula is as follows:
Tall = Tavg + τ/m = a·m + b + τ/m
When m = √(τ/a), the total time reaches its minimum;
where Tavg is the average startup time of each virtual machine and Tall is the total task completion time.
3. The scheduling method in cloud computing parallel operation according to claim 1, characterized in that in step S5 the specific method by which the task scheduler assigns tasks is: the task scheduler takes one part of the task from the task pool and maps it onto the virtual machine; the virtual machine begins executing the task; after the task on the virtual machine finishes, the task pool is checked; if the task pool is not empty, tasks continue to be assigned to the virtual machine; if the task pool is empty, the virtual machine is restored to its original state.
CN201810997296.7A 2018-08-29 2018-08-29 Scheduling method in cloud computing parallel operation Active CN109684070B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810997296.7A CN109684070B (en) 2018-08-29 2018-08-29 Scheduling method in cloud computing parallel operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810997296.7A CN109684070B (en) 2018-08-29 2018-08-29 Scheduling method in cloud computing parallel operation

Publications (2)

Publication Number Publication Date
CN109684070A true CN109684070A (en) 2019-04-26
CN109684070B CN109684070B (en) 2022-12-13

Family

ID=66184433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810997296.7A Active CN109684070B (en) 2018-08-29 2018-08-29 Scheduling method in cloud computing parallel operation

Country Status (1)

Country Link
CN (1) CN109684070B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012104898A1 (en) * 2011-01-31 2012-08-09 トヨタ自動車株式会社 Safety control device and safety control method
JP2013152513A (en) * 2012-01-24 2013-08-08 Nippon Telegr & Teleph Corp <Ntt> Task management system, task management server, task management method and task management program
CN104536806A (en) * 2014-12-26 2015-04-22 东南大学 Workflow application flexible resource supplying method in cloud environment
CN105159752A (en) * 2015-09-22 2015-12-16 中国人民解放军国防科学技术大学 Real-time task and resource scheduling method with function of machine startup time perception in virtualized clouds
CN105912406A (en) * 2016-05-05 2016-08-31 中国人民解放军国防科学技术大学 Low-energy independent task scheduling and resource configuration method
KR101707601B1 (en) * 2015-12-31 2017-02-16 숭실대학교산학협력단 Virtual machine monitor and schduling method of virtual machine monitor
CN107357641A (en) * 2017-06-21 2017-11-17 西安电子科技大学 Method for scheduling task in a kind of cloud computing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YICHUAN WANG, SANTOSH CHANDRASEKHAR,MUKESH SINGHAL,JIANFENG MA: "A limited-trust capacity model for mitigating threats of internal malicious services in cloud computing", 《SPRINGERLINK》 *
LAI Junfan et al.: "Divisible-load scheduling optimization model considering processor time windows", Journal of Xi'an Jiaotong University *
CHEN Huangke et al.: "Resource-delay-aware real-time task scheduling method in cloud computing", Journal of Computer Research and Development *

Also Published As

Publication number Publication date
CN109684070B (en) 2022-12-13

Similar Documents

Publication Publication Date Title
Singh et al. Task scheduling in cloud computing
Mattess et al. Scaling mapreduce applications across hybrid clouds to meet soft deadlines
CN102681889B (en) Scheduling method of cloud computing open platform
Mittal et al. An optimized task scheduling algorithm in cloud computing
CN106502792A (en) A kind of multi-tenant priority scheduling of resource method towards dissimilar load
CN104657220A (en) Model and method for scheduling for mixed cloud based on deadline and cost constraints
Liu et al. Resource preprocessing and optimal task scheduling in cloud computing environments
CN106406987A (en) Task execution method and apparatus in cluster
Domanal et al. Load balancing in cloud environment using a novel hybrid scheduling algorithm
CN106919449A (en) The dispatch control method and electronic equipment of a kind of calculating task
CN107122233A (en) A kind of adaptive real-time scheduling methods of many VCPU towards TSN business
Gawali et al. Standard deviation based modified cuckoo optimization algorithm for task scheduling to efficient resource allocation in cloud computing
Zhou et al. Concurrent workflow budget-and deadline-constrained scheduling in heterogeneous distributed environments
CN104112049A (en) P2P (peer-to-peer) architecture based cross-data-center MapReduce task scheduling system and P2P architecture based cross-data-center MapReduce task scheduling method
CN104035819A (en) Scientific workflow scheduling method and device
Dubey et al. QoS driven task scheduling in cloud computing
Natarajan Parallel queue scheduling in dynamic cloud environment using backfilling algorithm
Koneru et al. Resource allocation method using scheduling methods for parallel data processing in cloud
CN103678000A (en) Computational grid balance task scheduling method based on reliability and cooperative game
CN109684070A (en) A kind of dispatching method in cloud computing parallel work-flow
CN110262896A (en) A kind of data processing accelerated method towards Spark system
Kaur et al. Challenges to task and workflow scheduling in cloud environment
Salmani et al. A fuzzy-based multi-criteria scheduler for uniform multiprocessor real-time systems
Jiang et al. An optimized resource scheduling strategy for Hadoop speculative execution based on non-cooperative game schemes
Nzanywayingoma et al. Task scheduling and virtual resource optimising in Hadoop YARN-based cloud computing environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant