CN115080236A - Workflow deployment method based on graph segmentation - Google Patents

Workflow deployment method based on graph segmentation

Info

Publication number
CN115080236A
Authority
CN
China
Prior art date
Legal status
Granted
Application number
CN202210730454.9A
Other languages
Chinese (zh)
Other versions
CN115080236B (en)
Inventor
马英红
吝李婉
焦毅
李红艳
刘伟
刘勤
张琰
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University
Priority to CN202210730454.9A
Publication of CN115080236A
Application granted
Publication of CN115080236B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 - Allocation of resources to service a request
    • G06F 9/5027 - Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5038 - Allocation of resources considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/901 - Indexing; Data structures therefor; Storage structures
    • G06F 16/9024 - Graphs; Linked lists
    • G06F 9/44 - Arrangements for executing specific programs
    • G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 - Hypervisors; Virtual machine monitors
    • G06F 9/45558 - Hypervisor-specific management and integration aspects
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a workflow deployment method based on graph partitioning, which mainly addresses the problem that existing clustering-based workflow deployment algorithms minimize communication overhead at the cost of the parallel execution efficiency of the tasks in the workflow, so that tasks execute with low parallelism. The implementation scheme is as follows: 1) establish a directed acyclic graph (DAG) model G of the workflow; 2) calculate the execution time of each task and the data transmission time between tasks in the workflow; 3) merge the serial structures in the workflow model G to obtain a new workflow model graph G'; 4) partition the new workflow model graph G' to obtain the optimal task partitioning; 5) map the optimal task partitions to virtual machines according to the minimum execution time to complete the deployment of the workflow. The invention reduces the completion time of the workflow, improves its execution efficiency, and can be used for the joint optimization of data overhead and task parallel execution efficiency during workflow execution.

Description

Workflow deployment method based on graph segmentation
Technical Field
The invention belongs to the technical field of cloud computing, and particularly relates to a workflow deployment method which can be used for joint optimization of data overhead and task parallel execution efficiency in a workflow execution process.
Background
In a cloud computing environment, a workflow is an application consisting of a set of interdependent tasks and is typically described by a directed acyclic graph (DAG). Workflows are much larger in scale and more complex in structure than independent tasks. Deploying a workflow requires considering not only the resource allocation of each task but also the data transmission and execution order among tasks, which greatly increases the complexity of task deployment. How to deploy workflows scientifically and reasonably in a distributed, heterogeneous environment remains a research hotspot in academia.
At present, cluster computing frameworks such as MapReduce and Spark are widely applied in data center networks to analyze and process ever-growing computing and network tasks: complex large-scale jobs are decomposed into simpler tasks, modeled as workflows, and delivered to cloud data centers with powerful parallel processing capability. A workflow comprises multiple interdependent tasks connected according to a certain precedence order, so workflow deployment must account for the data transmission among tasks. Research shows that in MapReduce applications the time spent on intermediate data transmission accounts for more than 30% of the total workflow completion time; for some large-scale commercial data centers, such as Yahoo's data center clusters, intermediate data transmission of workflows is the largest component of network traffic, occupying nearly 60% of the total completion time. Intermediate data transmission is also a key cause of network congestion. Optimized workflow deployment therefore reduces inter-process communication overhead, relieves the traffic pressure on the data center, and shortens task completion time.
Representative heuristic workflow deployment algorithms mainly include list-based, clustering-based, and replication-based deployment algorithms. Clustering-based workflow deployment algorithms aim to reduce the communication overhead among workflow tasks: tasks are first mapped to different clusters, and each cluster is then mapped as a whole to the same computing node. The core idea is to place tasks connected by edges (i.e., with data dependencies) into the same cluster, thereby saving the communication overhead among the tasks within a cluster.
For example, Ahmad SG et al., in the paper "Data-intensive workflow optimization based on application task graph partitioning in heterogeneous computing systems" (IEEE Fourth International Conference on Big Data and Cloud Computing, IEEE, 2014: 129-), proposed the partition-based data-intensive workflow deployment algorithm PDWA. In this algorithm, the workflow is divided into task partitions of a specified size and number so as to minimize the data transfer overhead between partitions. PDWA limits the maximum number of tasks allowed in each partition, a value computed by multiplying the total number of tasks in the workflow by a coefficient smaller than 1, and then maps each task partition to the computing node that minimizes its execution time. The drawback of this method is that the workflow is clustered only according to the data dependencies among the workflow tasks, so certain tasks that have strong data dependencies but could be executed in parallel may be placed in the same cluster; the parallel execution performance of the tasks therefore suffers, which ultimately increases the workflow completion time.
Disclosure of Invention
The invention aims to provide a workflow deployment method based on graph partitioning that overcomes the above defects of the prior art, so as to balance communication-overhead minimization against parallelism maximization during workflow clustering and to improve workflow execution efficiency.
The technical idea of the invention is as follows: from the perspective of graph theory, the dependency and parallelism among tasks in the workflow are fully exploited, and a classic graph partitioning algorithm, the community discovery algorithm, is improved to realize the joint optimization of data overhead and task parallelism during workflow task partitioning.
According to the above idea, the technical scheme of the invention comprises the following steps:
(1) According to the task set T, the data dependency and timing relation E among tasks, the task complexity set L, and the data transmission amount set D in the workflow, establish the workflow directed acyclic graph (DAG) model G = {T, E, L, D};
(2) Assign a set of virtual machines S = {s_k | k = 1, 2, ..., q} to the workflow; compute the execution time w_{i,k} of each task on the different virtual machines, the data transmission time c_{i,j} between tasks with a data dependency, the average execution time w̄_i of each task over all virtual machines, and the average data transmission time c̄_{i,j} between tasks;
(3) Determine and merge pairs of tasks that form a serial structure in the workflow model G:
in the workflow model graph, if a task has exactly one subtask and that subtask has exactly one parent task, the subtask and the parent task form a serial structure;
for tasks t_i and t_{i+1} in a serial structure, cancel the data transmission between them, add their complexities, and merge them into a new task t'_i;
(4) Partition the workflow model graph:
(4a) divide the workflow model graph G' into n sub-graphs, each containing a single vertex;
(4b) search the pairs of sub-graphs connected by an edge, tentatively merge each pair in turn, and compute the modularity increment ΔQ after each merge according to the tasks contained in each sub-graph and the connections among them:
if the two sub-graphs contain tasks of the same layer, calculate the sum (denoted sum) of the average execution times of the same-layer tasks in the merged sub-graph, and compare it with the maximum value maxW of the average execution times of all tasks on that layer:
if sum > maxW·α, then ΔQ = -(e_{i,j} + e_{j,i} - 2·a_i·a_j) = -2·(e_{i,j} - a_i·a_j);
otherwise, ΔQ = e_{i,j} + e_{j,i} - 2·a_i·a_j = 2·(e_{i,j} - a_i·a_j), where α is a comparison coefficient taking a value smaller than 1, e_{i,j} is the proportion of the weight of the edges between the i-th sub-graph and the j-th sub-graph to the total edge weight of graph G', and a_i is the proportion of the total weight of the edges attached to the tasks of the i-th sub-graph to the total edge weight of graph G';
if the two sub-graphs contain no tasks of the same layer, ΔQ = e_{i,j} + e_{j,i} - 2·a_i·a_j = 2·(e_{i,j} - a_i·a_j);
(4c) merge the two sub-graphs with the largest ΔQ and update the modularity Q = Q + max ΔQ;
(4d) repeat (4b) and (4c) until the whole graph G' is merged into a single sub-graph, then take the graph division corresponding to the maximum modularity value as the optimal task partitioning P = {p_1, p_2, ..., p_x, ..., p_h}, where p_x denotes the x-th task partition and h the number of partitions;
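To make the same-layer constraint in step (4b) concrete, the following Python sketch (our illustration, not part of the patent; the function name, data structures, and the default value of α are assumptions, and reading maxW as a per-layer maximum over the whole workflow is our interpretation) checks whether a tentative merge of two sub-graphs concentrates too much parallelizable same-layer work in one partition:

```python
from collections import defaultdict

def violates_parallelism(tasks_a, tasks_b, layer, avg_w, layer_max, alpha=0.8):
    """Return True if the merged sub-graph holds same-layer tasks whose summed
    average execution time exceeds maxW * alpha for some layer, i.e. the case
    in which the modularity gain is negated in step (4b)."""
    total = defaultdict(float)   # layer -> sum of average execution times
    count = defaultdict(int)     # layer -> number of tasks from that layer
    for t in list(tasks_a) + list(tasks_b):
        total[layer[t]] += avg_w[t]
        count[layer[t]] += 1
    # Only layers contributing at least two tasks to the merged sub-graph
    # contain parallelizable work that the constraint cares about:
    return any(count[lv] >= 2 and s > layer_max[lv] * alpha
               for lv, s in total.items())
```

A merge flagged by this check gets ΔQ = -2·(e_{i,j} - a_i·a_j) instead of the positive gain, so the greedy merging of step (4c) steers away from it.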
(5) Deploy the optimal task partitions to the virtual machines:
(5a) according to the average execution time w̄_i of each task and the average data transmission time c̄_{i,j} between tasks, compute the priority of each task:
rank(t_i) = w̄_i + max_{t_j ∈ succ(t_i)} (c̄_{i,j} + rank(t_j)),
where succ(t_i) denotes the set of subtasks of task t_i;
(5b) according to the priority rank(t_i) of each task, compute the priority of each task partition:
rank(p_x) = max(rank(t_i), t_i ∈ p_x);
(5c) sort all partitions in descending order of rank(p_x); each time, select the partition with the largest rank(p_x), traverse all virtual machines, compute the sum of the total execution time of the task partition on the current virtual machine and the execution time of the tasks already deployed on that virtual machine, and find the virtual machine s_k minimizing this value;
(5d) deploy all tasks of the task partition as a whole to virtual machine s_k; the tasks within the partition are sorted in descending order of rank(t_i), and the virtual machine executes them in that order.
Compared with the prior art, the invention has the following advantages:
1. When partitioning the workflow model graph, the invention constrains the task-partitioning process by comparing the sum of the average execution times of same-layer tasks within a partition with the maximum average execution time maxW of all tasks on that layer. This effectively avoids the situation where a single task partition contains too many parallelizable same-layer tasks, which would lower parallel execution efficiency, and it balances the execution time of same-layer tasks across different partitions. It thus solves the problem of existing clustering-based workflow deployment algorithms, which minimize communication overhead at the cost of the parallel execution efficiency of the tasks in the workflow, so the invention reduces data transmission overhead while effectively improving parallel execution efficiency and achieving a smaller workflow completion time.
2. When deploying the optimal task partitions onto the virtual machines, the invention computes and sorts a priority for each task partition and then selects, for each partition, the virtual machine giving it the minimum execution time; this further reduces the completion time of the workflow and improves its execution efficiency.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is an exemplary diagram of a workflow DAG model in the present invention;
FIG. 3 is a merging diagram of DAG serial structure in the present invention;
FIG. 4 is a schematic diagram of a workflow model graph partitioning according to the present invention;
FIG. 5 is a simulation comparison of the performance variation of the present invention and existing workflow deployment algorithms as the workflow scale increases;
FIG. 6 is a comparison graph of performance change simulation of the present invention and existing workflow deployment algorithms as the communication computation ratio increases;
FIG. 7 is a comparison graph of performance change simulation of the present invention and existing workflow deployment algorithms as the number of available virtual machines increases.
Detailed Description
Embodiments and effects of the present invention will be described in further detail below with reference to the accompanying drawings.
Referring to FIG. 1, the implementation steps of the example are as follows:
step 1, establishing a DAG model of a workflow directed acyclic graph.
The set of tasks T in the workflow is represented as T = {t_i | i = 1, 2, ..., n}, where t_i denotes the i-th task and n is the number of tasks contained in the workflow;
the data dependency and timing relation E among tasks is represented as E = {e_{i,j} | t_i, t_j ∈ T}, where e_{i,j} takes the value 0 or 1: e_{i,j} = 0 means that there is no dependency (no edge) between tasks t_i and t_j, and e_{i,j} = 1 means that there is a dependency (an edge) between them;
the task complexity set L is represented as L = {l_i | t_i ∈ T}, where l_i denotes the computational complexity of task t_i;
the data transmission amount set D is represented as D = {d_{i,j} | t_i, t_j ∈ T}, where d_{i,j} denotes the amount of data transmitted between tasks t_i and t_j;
combining these four elements yields the workflow directed acyclic graph model
G = {T, E, L, D},
where the task set T is the vertex set of graph G, the dependency and timing relation E is the edge set, the task complexity set L is the vertex weight set, and the data transmission amount set D is the edge weight set.
The workflow directed acyclic graph model is shown in FIG. 2. If a directed edge e_{i,j} connects two tasks t_i and t_j, then t_i is the parent task of t_j and t_j is a subtask of t_i. A task without any parent task is called an entry task t_enter, and a task without any subtasks is called an exit task t_exit. The layer number of each task is determined by the maximum distance between that node and the entry task; the entry task t_1 is located on the first layer.
Without loss of generality, this embodiment assumes that a workflow has exactly one entry task and one exit task. When a workflow has several entry or exit tasks, a virtual entry or exit task vertex is added so that it has exactly one of each; the complexity of a virtual task is set to zero and its data transmission amount to other vertices is also zero, so adding virtual task vertices does not affect the deployment result of the workflow. In FIG. 2, t_10 is such a virtual exit task.
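As a concrete illustration of step 1, the following Python sketch (ours, with illustrative names; not the patent's reference code) builds the model G = {T, E, L, D}, computes task layers as the maximum distance from the entry task, and adds a zero-cost virtual exit task when several exit tasks exist:

```python
from collections import defaultdict

class WorkflowDAG:
    """Sketch of the model G = {T, E, L, D}: tasks, dependency edges carrying
    data volumes, and per-task computational complexity."""

    def __init__(self):
        self.tasks = set()            # T: task identifiers
        self.edges = {}               # E and D: (i, j) -> data volume d_ij
        self.complexity = {}          # L: task -> computational complexity l_i
        self.succ = defaultdict(set)  # subtasks of each task
        self.pred = defaultdict(set)  # parent tasks of each task

    def add_task(self, t, l):
        self.tasks.add(t)
        self.complexity[t] = l

    def add_edge(self, i, j, d):
        self.edges[(i, j)] = d
        self.succ[i].add(j)
        self.pred[j].add(i)

    def layer(self, t):
        """Layer number = maximum distance from the entry task; an entry
        task (no parents) is on layer 1."""
        if not self.pred[t]:
            return 1
        return 1 + max(self.layer(p) for p in self.pred[t])

    def add_virtual_exit(self):
        """When several exit tasks exist, attach a zero-complexity virtual
        exit task with zero data transmission, as described above."""
        exits = [t for t in self.tasks if not self.succ[t]]
        if len(exits) > 1:
            self.add_task("t_exit", 0)
            for t in exits:
                self.add_edge(t, "t_exit", 0)
```

Because the virtual task has zero complexity and zero transmission volume, it changes neither the timing computations of step 2 nor the final deployment.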
Step 2: calculate the execution time of each task and the data transmission time between tasks in the workflow.
2.1) Assign a set of virtual machines S = {s_k | k = 1, 2, ..., q} to the workflow, where q denotes the number of virtual machines and each virtual machine resides on a different physical machine;
2.2) Compute the working parameters of the workflow tasks on the different virtual machines:
2.2.1) compute the execution time w_{i,k} of task t_i on each virtual machine and its average execution time w̄_i over all virtual machines:
w_{i,k} = l_i / v_k,
w̄_i = (1/q) · Σ_{k=1}^{q} w_{i,k},
where l_i denotes the computational complexity of task t_i, v_k denotes the computing speed of virtual machine s_k, and q denotes the number of available virtual machines;
2.2.2) compute the data transmission time c_{i,j} between two tasks t_i and t_j with a data dependency, and the average data transmission time c̄_{i,j} between tasks:
c_{i,j} = d_{i,j} / r_{k1,k2},
c̄_{i,j} = d_{i,j} / r̄,
where tasks t_i and t_j are deployed on virtual machines s_{k1} and s_{k2} respectively, d_{i,j} denotes the amount of data transmitted between t_i and t_j, r_{k1,k2} denotes the data transmission rate between virtual machines s_{k1} and s_{k2}, and r̄ denotes the average data transmission rate between all virtual machines. When two tasks with a precedence dependency are placed on the same virtual machine, the data transmission overhead between them is negligible, i.e., c_{i,j} = 0.
As shown in FIG. 2, when tasks t_1 and t_2 are placed on the same virtual machine, no extra network resources are needed for data transmission, so the transmission overhead is recorded as 0; when t_1 and t_2 are placed on two different virtual machines, the data requires additional network resources for transmission, and assuming the transmission rate between the two virtual machines is r_{k1,k2} = 1, the required data transmission time between t_1 and t_2 is 8.
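The timing formulas of step 2 can be sketched as follows; this is a minimal Python illustration (function names are ours) of w_{i,k} = l_i / v_k, the average over all virtual machines, and c_{i,j} = d_{i,j} / r_{k1,k2} with the same-machine special case:

```python
def exec_time(l_i, v_k):
    """w_ik = l_i / v_k: execution time of a task of complexity l_i on a
    virtual machine of computing speed v_k."""
    return l_i / v_k

def avg_exec_time(l_i, speeds):
    """Average execution time of the task over all q virtual machines."""
    return sum(l_i / v for v in speeds) / len(speeds)

def transfer_time(d_ij, vm_i, vm_j, rate):
    """c_ij = d_ij / r_k1k2, or 0 when both tasks share a virtual machine."""
    return 0.0 if vm_i == vm_j else d_ij / rate
```

With the FIG. 2 example above, a data volume of 8 and a transmission rate of 1 give a transfer time of 8 across different virtual machines, and 0 on the same one.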
Step 3: merge the serial structures of the workflow model graph.
In the workflow model graph, if a task has exactly one subtask and that subtask has exactly one parent task, the subtask and the parent task form a serial structure.
For tasks t_i and t_{i+1} in a serial structure, their complexities are added and the two tasks are merged into a new task t'_i; the computational complexity and the workflow model graph are then updated, forming the new model graph G'.
As shown in FIG. 3, the two serially structured tasks t_i and t_{i+1} are merged into a new task t'_i, and the computational complexity and the workflow model graph are updated as follows:
w'_i = w_i + w_{i+1},
pre(t'_i) = pre(t_i),
succ(t'_i) = succ(t_{i+1}),
where w'_i denotes the execution time of the new task t'_i, w_i and w_{i+1} denote the execution times of tasks t_i and t_{i+1}, pre(t'_i) and succ(t'_i) denote the parent task set and subtask set of the new task t'_i, pre(t_i) denotes the parent task set of task t_i, and succ(t_{i+1}) denotes the subtask set of task t_{i+1}.
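The serial-structure merge of step 3 can be sketched as follows (our illustrative Python, assuming succ/pred adjacency dictionaries and an execution-time map; not the patent's code). It repeatedly collapses any task with a single subtask whose subtask has a single parent, summing their costs and rewiring pre/succ exactly as in the update rules above:

```python
def merge_serial(succ, pred, w):
    """Merge serial structures: while some task t_i has exactly one subtask
    t_j and t_j has exactly one parent, fold t_j into t_i with
    w'_i = w_i + w_j, pre(t'_i) = pre(t_i), succ(t'_i) = succ(t_j)."""
    changed = True
    while changed:
        changed = False
        for ti in list(succ):
            if len(succ.get(ti, ())) != 1:
                continue
            (tj,) = succ[ti]
            if len(pred.get(tj, ())) != 1:
                continue
            w[ti] += w.pop(tj)                 # add the two complexities
            succ[ti] = succ.pop(tj, set())     # succ(t'_i) = succ(t_j)
            for tk in succ[ti]:                # rewire grandchildren's parents
                pred[tk].discard(tj)
                pred[tk].add(ti)
            pred.pop(tj, None)
            changed = True
            break
    return succ, pred, w
```

Applied to a chain of three tasks, the sketch collapses it into a single task whose cost is the sum of the three, matching FIG. 3.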
Step 4: partition the new workflow model graph G' formed after merging the serial structures.
4.1) Divide the workflow model graph G' into n sub-graphs, each containing a single vertex.
4.2) Search the pairs of sub-graphs connected by an edge, tentatively merge each pair in turn, and compute the modularity increment ΔQ after each merge according to the tasks contained in each sub-graph and the connections among them:
if the two sub-graphs contain tasks of the same layer, calculate the sum (denoted sum) of the average execution times of the same-layer tasks in the merged sub-graph and compare it with the maximum value maxW of the average execution times of all tasks on that layer:
if sum > maxW·α, then ΔQ = -(e_{i,j} + e_{j,i} - 2·a_i·a_j) = -2·(e_{i,j} - a_i·a_j);
otherwise, ΔQ = e_{i,j} + e_{j,i} - 2·a_i·a_j = 2·(e_{i,j} - a_i·a_j), where α is a comparison coefficient taking a value smaller than 1;
if the two sub-graphs contain no tasks of the same layer, ΔQ = e_{i,j} + e_{j,i} - 2·a_i·a_j = 2·(e_{i,j} - a_i·a_j).
Here e_{i,j} denotes the proportion of the weight of the edges between the i-th sub-graph and the j-th sub-graph to the total edge weight of graph G', and a_i denotes the proportion of the total weight of the edges attached to the tasks of the i-th sub-graph to the total edge weight of graph G'; they are computed as
e_{i,j} = A_{i,j} / (2m),
a_i = k_i / (2m),
where A_{i,j} is the sum of the weights of the edges between the i-th and j-th sub-graphs, m is the sum of all edge weights in graph G', and k_i is the sum of the weights of all edges connected to the tasks of the i-th sub-graph.
4.3) Merge the pair of sub-graphs with the largest ΔQ, update the modularity Q = Q + max ΔQ, and repeat 4.2) until the whole graph G' is merged into a single community; then find the community structure corresponding to the maximum modularity value, i.e., the optimal partitioning scheme P = {p_1, p_2, ..., p_h}.
FIG. 4 shows an example of workflow task partitioning for a workflow consisting of five tasks, where t_2, t_3 and t_4 belong to the same layer and can be executed in parallel, with execution times of 15, 10 and 25 respectively. If the three tasks are divided into the same partition during sub-graph division, they are executed sequentially, which obviously reduces the parallelism of workflow execution and prolongs the completion time. Another possible result is to divide t_3 and t_4 into the same partition and t_2 into another; then the execution times of same-layer tasks are unbalanced between the two partitions, which must wait for each other during actual execution, again prolonging the workflow completion time. On the contrary, if t_2 and t_3 are divided into the same partition and t_4 into another, the execution times of the same-layer tasks are evenly distributed between the two partitions, which avoids adding redundant waiting time.
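The constrained modularity gain of step 4.2) can be sketched as follows. This is an illustrative Python fragment (ours, not the patent's code) taking Newman-style inputs e_{i,j} = A_{i,j}/(2m) and a_i = k_i/(2m) as already computed; the default value of α is our assumption:

```python
def delta_q(e_ij, a_i, a_j, same_layer_sum=None, max_w=None, alpha=0.8):
    """Modularity increment for tentatively merging sub-graphs i and j.
    The gain 2*(e_ij - a_i*a_j) is negated when the merged sub-graph's
    same-layer execution-time sum exceeds maxW * alpha, discouraging merges
    that pile parallelizable same-layer tasks into one partition."""
    gain = 2.0 * (e_ij - a_i * a_j)
    if same_layer_sum is not None and max_w is not None \
            and same_layer_sum > max_w * alpha:
        return -gain
    return gain
```

The greedy loop of step 4.3) then simply merges the pair with the largest returned value and accumulates Q.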
Step 5: map the optimal task partitions to the virtual machines to complete the deployment of the workflow.
5.1) According to the average execution time w̄_i of each task and the average data transmission time c̄_{i,j} between tasks, compute the priority of each task:
rank(t_i) = w̄_i + max_{t_j ∈ succ(t_i)} (c̄_{i,j} + rank(t_j)),
where succ(t_i) denotes the set of subtasks of task t_i;
5.2) According to the priority rank(t_i) of each task, compute the priority of each task partition:
rank(p_x) = max(rank(t_i), t_i ∈ p_x);
5.3) Sort all partitions in descending order of rank(p_x); each time, select the partition with the largest rank(p_x), traverse all virtual machines, compute the sum of the total execution time of the task partition on the current virtual machine and the execution time of the tasks already deployed on that virtual machine, and find the virtual machine s_k minimizing this value;
5.4) Deploy all tasks of the task partition as a whole to virtual machine s_k; the tasks within the partition are sorted in descending order of rank(t_i), and the virtual machine executes them in that order, completing the deployment of the workflow.
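Steps 5.1) to 5.4) can be sketched end to end in Python (an illustration under our own data-structure assumptions: succ adjacency dict, average-time maps, and a per-VM execution-time callback; not the patent's reference code):

```python
def task_rank(t, succ, avg_w, avg_c, memo=None):
    """rank(t_i) = avg_w_i + max over subtasks of (avg_c_ij + rank(t_j));
    an exit task's rank is just its average execution time."""
    memo = {} if memo is None else memo
    if t not in memo:
        tail = max((avg_c[(t, s)] + task_rank(s, succ, avg_w, avg_c, memo)
                    for s in succ.get(t, ())), default=0.0)
        memo[t] = avg_w[t] + tail
    return memo[t]

def deploy(partitions, succ, avg_w, avg_c, exec_time, num_vms):
    """Sort partitions by descending rank(p_x) = max task rank, then map each
    whole partition to the VM minimizing deployed load + partition time."""
    ranks = {}
    for p in partitions:
        for t in p:
            task_rank(t, succ, avg_w, avg_c, ranks)
    order = sorted(partitions, key=lambda p: max(ranks[t] for t in p),
                   reverse=True)
    load = [0.0] * num_vms          # execution time already deployed per VM
    placement = {}
    for p in order:
        cost = [load[k] + sum(exec_time(t, k) for t in p)
                for k in range(num_vms)]
        k = min(range(num_vms), key=cost.__getitem__)
        load[k] = cost[k]
        for t in sorted(p, key=ranks.__getitem__, reverse=True):
            placement[t] = k        # tasks run on s_k in descending rank order
    return placement
```

For a two-task chain split into two partitions, the sketch ranks the upstream partition higher and spreads the partitions over the two virtual machines, since the second VM offers the smaller summed cost.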
The effect of the present invention is further explained below in combination with simulation experiments.
1. Simulation parameter settings:
A workflow directed acyclic graph DAG is randomly generated, with simulation parameters set as shown in Table 1:
Table 1. Workflow deployment simulation parameter settings
Parameter: fixed value (varied values)
Workflow size N: 50 (20, 40, 60, 80, 100)
Communication-computation ratio CCR: 0.5 (0.4, 0.8, 1.2, 1.6, 2)
Number of available virtual machines K: 5 (1, 2, 3, 4, 5, 6)
In Table 1, the workflow scale refers to the number of tasks contained in the workflow. The communication-computation ratio CCR is the ratio of the sum of the average data transmission times between tasks to the sum of the average execution times of all tasks; a higher CCR means the workflow is a communication-intensive application, while a lower CCR means it is computation-intensive.
2. Simulation content and result analysis:
the method comprises the steps of selecting two workflow deployment algorithms, namely an existing list-based deployment algorithm HEFT and a clustering-based deployment algorithm PDWA, from three aspects of influence of workflow scale on algorithm performance, influence of communication calculation ratio on algorithm performance and influence of virtual machine number on algorithm performance, and evaluating and comparing deployment results of the method.
Simulation 1: the number of virtual machines is set to 5, the communication-computation ratio CCR is set to 0.5, and the number of workflow task nodes is increased from 20 to 100; the performance changes of the three methods as the workflow scale grows are compared, with the results shown in FIG. 5. Wherein:
FIG. 5(a) compares the change of the scheduling length ratio SLR of the three methods as the workflow scale increases; the abscissa is the number of tasks in the workflow, and the ordinate is the SLR, which represents the workflow completion time: the larger the SLR, the longer the completion time and the worse the performance. As can be seen from FIG. 5(a), compared with the clustering-based deployment algorithm PDWA, the workflow completion time of the invention is reduced by 75% on average, because PDWA fixes the size, and consequently the number, of the workflow partitions in advance and does not consider the parallelism of task execution, which prolongs the workflow completion time.
Fig. 5(b) is a comparison graph of the speed change of the three methods along with the increase of the workflow scale, the abscissa of the comparison graph is the number of tasks in the workflow, the ordinate of the comparison graph is the speed representing the parallel execution efficiency of the workflow, and the larger the speed value is, the higher the parallel execution efficiency of the deployment method is, the better the performance is. As can be seen from fig. 5(b), with the gradual increase of the workflow scale, the speedup value of the method of the present invention is always larger compared with the cluster-based deployment algorithm PDWA. The method fully considers the dependency and the parallelism of the tasks in the workflow when the tasks are partitioned, reduces the data transmission overhead among communities, simultaneously considers the parallel execution efficiency of the tasks in the workflow, and enables the tasks at the same layer to be efficiently and parallelly executed.
Fig. 5(c) is a comparison graph of the communication overhead variation of the three methods with the increase of the workflow scale, wherein the abscissa is the number of tasks in the workflow and the ordinate is the communication overhead. As can be seen from fig. 5(c), as the size of the workflow gradually increases, the communication overhead tends to increase, and the method and the cluster-based deployment algorithm PDWA always have lower communication overhead compared with the list-based algorithm het, which, in conjunction with fig. 5(a) and 5(b), illustrates that the method can achieve reduction of the completion time of the workflow and significant improvement of the parallel execution efficiency while maintaining smaller communication overhead.
Simulation 2: the number of virtual machines is set to 5, the number of workflow task nodes to 50, and the communication-to-computation ratio (CCR) is increased from 0.4 to 2, to compare how the performance of the three methods changes as CCR grows. The results are shown in Fig. 6, where:
Fig. 6(a) compares how the scheduling length ratio SLR of the three methods changes as CCR increases; the abscissa is the CCR and the ordinate is the SLR. As can be seen from Fig. 6(a), as CCR keeps increasing, the workflow completion time of the present method stays below that of the clustering-based deployment algorithm PDWA. The performance of the present method and of the list-based algorithm HEFT is relatively stable across CCR values, whereas the SLR of PDWA rises noticeably as CCR grows: as data transmission takes up a growing share of the total time, PDWA's strategy of merely minimizing communication overhead on top of a fixed partition size gradually loses its advantage.
Fig. 6(b) compares how the speedup of the three methods changes as CCR increases; the abscissa is the CCR and the ordinate is the speedup. As can be seen from Fig. 6(b), as CCR keeps increasing, the speedup of the present method holds a clear advantage over the clustering-based deployment algorithm PDWA and adapts well to the changing ratio, because the parallelism of the workflow tasks is fully taken into account when the workflow is partitioned.
Fig. 6(c) compares how the communication overhead of the three methods changes as CCR increases; the abscissa is the CCR and the ordinate is the communication overhead. As can be seen from Fig. 6(c), the present method and the clustering-based deployment algorithm PDWA incur lower communication overhead than the list-based algorithm HEFT. Together with Figs. 6(a) and 6(b), this verifies that the method effectively improves workflow execution efficiency while keeping the communication overhead small.
Simulation 3: the number of workflow tasks is set to 50, the communication-to-computation ratio CCR to 0.5, and the number of virtual machines is increased from 1 to 6, to compare how the performance of the three methods changes as the number of available virtual machines grows. The results are shown in Fig. 7, where:
Fig. 7(a) compares how the scheduling length ratio SLR of the three methods changes as the number of available virtual machines increases; the abscissa is the number of available virtual machines and the ordinate is the SLR. As can be seen from Fig. 7(a), the workflow SLR falls as more virtual machines become available, and the advantage of the present method becomes progressively clearer. With more virtual machines, more tasks can be processed in parallel on different machines during workflow execution; since the method weighs task parallelism and execution-time balance against inter-partition data transmission overhead when partitioning, it avoids excessive mutual waiting between partitions running in parallel on different virtual machines and so effectively reduces the workflow completion time.
Fig. 7(b) compares how the speedup of the three methods changes as the number of available virtual machines increases; the abscissa is the number of available virtual machines and the ordinate is the speedup. As can be seen from Fig. 7(b), the speedup grows with the number of available virtual machines, and the advantage of the present method becomes progressively clearer. This is because the partitioning not only minimizes the communication overhead between partitions but also considers the parallel execution efficiency of the tasks during workflow execution; as more virtual machines become available, more partitions can be processed in parallel on different machines, so the speedup rises accordingly.
Fig. 7(c) compares how the communication overhead of the three methods changes as the number of available virtual machines increases; the abscissa is the number of available virtual machines and the ordinate is the communication overhead. As can be seen from Fig. 7(c), the present method and the clustering-based deployment algorithm PDWA incur lower communication overhead than the list-based algorithm HEFT. Together with Figs. 7(a) and 7(b), this shows that the method effectively improves parallel execution efficiency and reduces the workflow completion time while keeping the communication overhead low.

Claims (7)

1. A workflow deployment method based on graph partitioning is characterized by comprising the following steps:
(1) According to the task set T, the data dependency and timing relation E between tasks, the task complexity set L, and the data transmission amount set D in the workflow, establish the workflow directed acyclic graph (DAG) model G = {T, E, L, D};
(2) Assign a set of virtual machines S = {s_k | k = 1, 2, 3, ..., q} to the workflow. Compute the execution time w_i,k of each task in the workflow on the different virtual machines, the data transfer time c_i,j between tasks that have a data dependency, the average execution time w̄_i of each task over all virtual machines, and the average data transfer time c̄_i,j between tasks;
(3) Identify pairs of tasks forming a serial structure in the workflow model G and merge them to obtain a new workflow model graph G':
In the workflow model graph, if a task has exactly one subtask and that subtask has exactly one parent task, the subtask and its parent task form a serial structure;
Merge the two tasks t_i and t_i+1 of a serial structure: cancel the data transmission between the two tasks and combine them, adding their complexities, into a single new task t'_i;
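The serial-structure merge of step (3) can be sketched as follows. This is an illustrative Python reading of the claim, not the patented implementation: the successor-map encoding, the function name `merge_serial_pairs`, and the rule that the merged task keeps the parent's name with summed complexity are assumptions of this sketch. Calling it repeatedly until nothing changes collapses whole serial chains.

```python
def merge_serial_pairs(succ, complexity):
    """Collapse one serial pair (t_i -> t_j) where t_i has exactly one
    child and t_j has exactly one parent, as in step (3).
    succ: dict task -> list of child tasks; complexity: dict task -> l_i."""
    parents = {}
    for u, vs in succ.items():
        for v in vs:
            parents.setdefault(v, []).append(u)
    for u, vs in succ.items():
        if len(vs) == 1:
            v = vs[0]
            if len(parents.get(v, [])) == 1:
                # merge v into u: u inherits v's children, complexities add,
                # and the data transmission between u and v disappears
                merged = dict(succ)
                merged[u] = succ.get(v, [])
                merged.pop(v, None)
                comp = dict(complexity)
                comp[u] = complexity[u] + complexity[v]
                comp.pop(v)
                return merged, comp
    return succ, complexity  # no serial pair left
```

A single call on a chain t1 -> t2 with t2 branching merges t2 into t1 and leaves the branch targets untouched.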
(4) Partition the new workflow model graph G' formed after the serial structures are merged:
(4a) Divide the workflow model graph G' into n sub-graphs, each containing a single vertex;
(4b) In turn, search for and tentatively merge each pair of sub-graphs connected by an edge, and compute the modularity increment ΔQ of each merge from the tasks contained in each sub-graph and the connection relations among them:
If the two sub-graphs contain tasks of the same layer, compute the sum `sum` of the average execution times of the same-layer tasks in the merged sub-graph and compare it with the maximum average execution time maxW of all tasks on that layer:
if sum > maxW·α, then ΔQ = -(e_i,j + e_j,i - 2·a_i·a_j) = -2(e_i,j - a_i·a_j);
otherwise, ΔQ = e_i,j + e_j,i - 2·a_i·a_j = 2(e_i,j - a_i·a_j). Here α is a comparison coefficient whose value is a number less than 1, e_i,j denotes the proportion of the connecting-edge weight between the i-th sub-graph and the j-th sub-graph to the total connecting-edge weight in graph G', and a_i denotes the proportion of the sum of the connecting-edge weights of all tasks in the i-th sub-graph to the total connecting-edge weight in graph G'.
If the two sub-graphs contain no tasks of the same layer, ΔQ = e_i,j + e_j,i - 2·a_i·a_j = 2(e_i,j - a_i·a_j);
(4c) Merge the two sub-graphs with the largest ΔQ and update the modularity: Q = Q + maxΔQ;
(4d) Repeat steps (4b) and (4c) until the whole graph G' has been merged into a single sub-graph, and take the graph division corresponding to the maximum modularity value as the optimal task partition P = {p_1, p_2, ..., p_x, ..., p_h}, where p_x denotes the x-th task partition and h denotes the number of partitions;
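Steps (4a)-(4d) amount to greedy agglomerative modularity optimization with a same-layer penalty. The sketch below is one illustrative Python reading of the claim, assuming the inter-task data transfer times serve as undirected edge weights; the function name, the input encodings, and the default α = 0.8 are assumptions of this example, not part of the patent.

```python
import itertools

def partition_by_modularity(edges, avg_w, layer, alpha=0.8):
    """edges: {(u, v): weight}, each undirected edge listed once;
    avg_w: task -> average execution time; layer: task -> layer index."""
    tasks = sorted({t for e in edges for t in e})
    m = sum(edges.values())                     # total edge weight in G'
    deg = {t: 0.0 for t in tasks}
    for (u, v), w in edges.items():
        deg[u] += w
        deg[v] += w
    comms = {i: {t} for i, t in enumerate(tasks)}   # (4a): one vertex each

    def between(ci, cj):                        # A_ij: weight between communities
        return sum(w for (u, v), w in edges.items()
                   if (u in comms[ci] and v in comms[cj])
                   or (u in comms[cj] and v in comms[ci]))

    def delta_q(ci, cj):                        # (4b): modularity increment
        w = between(ci, cj)
        if w == 0:
            return None                         # only edge-connected pairs merge
        gain = 2 * (w / (2 * m)
                    - (sum(deg[t] for t in comms[ci]) / (2 * m))
                    * (sum(deg[t] for t in comms[cj]) / (2 * m)))
        shared = {layer[t] for t in comms[ci]} & {layer[t] for t in comms[cj]}
        for lay in shared:                      # same-layer parallelism check
            s = sum(avg_w[t] for t in comms[ci] | comms[cj] if layer[t] == lay)
            max_w = max(avg_w[t] for t in tasks if layer[t] == lay)
            if s > max_w * alpha:
                return -gain                    # penalise serialising one layer
        return gain

    q, best_q, best = 0.0, float('-inf'), None
    while len(comms) > 1:                       # (4d): merge until one sub-graph
        cand = [(dq, ci, cj) for ci, cj in itertools.combinations(comms, 2)
                if (dq := delta_q(ci, cj)) is not None]
        if not cand:
            break
        dq, ci, cj = max(cand)                  # (4c): merge the best pair
        comms[ci] |= comms.pop(cj)
        q += dq                                 # Q <- Q + max dQ
        if q > best_q:                          # remember the best division
            best_q = q
            best = [sorted(c) for c in comms.values()]
    return best
```

On a graph with two tightly coupled pairs joined by one light edge, the returned division keeps each pair together, as modularity maximization suggests.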
(5) Map the optimal task partitions to virtual machines to complete the deployment of the workflow:
(5a) From the average execution time w̄_i of each task and the average data transfer time c̄_i,j between tasks, compute the priority rank(t_i) of each task:
rank(t_i) = w̄_i + max_{t_j ∈ succ(t_i)} (c̄_i,j + rank(t_j))
where succ(t_i) denotes the set of subtasks of task t_i (for a task with no subtasks, rank(t_i) = w̄_i);
(5b) From the priority rank(t_i) of each task, compute the priority of each task partition:
rank(p_x) = max(rank(t_i), t_i ∈ p_x)
(5c) Sort all partitions in descending order of rank(p_x); each time, select the partition with the largest rank(p_x), traverse all virtual machines, and compute the sum of the total execution time of the task partition on the current virtual machine and the execution time of the tasks already deployed on that virtual machine, so as to find the virtual machine s_k for which this value is minimal;
(5d) Deploy all tasks of the task partition together, as a whole, onto virtual machine s_k; the tasks within the partition are sorted in descending order of rank(t_i), and the virtual machine executes them in that order.
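Steps (5a)-(5d) combine an upward-rank priority (as in HEFT-style list scheduling) with partition-level placement. A minimal Python sketch under the input encodings shown here; the function names and the simple additive load model are assumptions of this example.

```python
def upward_rank(succ, avg_w, avg_c):
    """rank(t_i) = w_bar_i + max over succ(t_i) of (c_bar_ij + rank(t_j)).
    succ: task -> children; avg_w: task -> mean exec time;
    avg_c: (parent, child) -> mean transfer time."""
    memo = {}
    def rank(t):
        if t not in memo:
            memo[t] = avg_w[t] + max(
                (avg_c[(t, c)] + rank(c) for c in succ.get(t, [])),
                default=0.0)                    # exit task: rank = w_bar
        return memo[t]
    for t in avg_w:
        rank(t)
    return memo

def map_partitions(partitions, ranks, exec_time):
    """Steps (5b)-(5d): order partitions by their max task rank, then place
    each whole partition on the VM minimising its resulting total load.
    exec_time: (task, vm) -> time; returns vm -> ordered task list."""
    vms = sorted({k for (_, k) in exec_time})
    load = {k: 0.0 for k in vms}
    plan = {k: [] for k in vms}
    for part in sorted(partitions,
                       key=lambda p: max(ranks[t] for t in p), reverse=True):
        best = min(vms,
                   key=lambda k: load[k] + sum(exec_time[(t, k)] for t in part))
        load[best] += sum(exec_time[(t, best)] for t in part)
        # within a partition, tasks run in descending rank order
        plan[best].extend(sorted(part, key=lambda t: ranks[t], reverse=True))
    return plan
```

With a two-VM example the higher-priority partition lands on the VM where it runs fastest, and the next partition goes to the now-lighter machine.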
2. The method of claim 1, wherein: in step (1), the workflow directed acyclic graph model G is established as follows:
(1a) The set of tasks T in the workflow is represented as T = {t_i | i = 1, 2, ..., n}, where t_i denotes the i-th task and n is the number of tasks contained in the workflow;
(1b) The data dependency and timing relation E between tasks is represented as E = {e_i,j | t_i, t_j ∈ T}, where e_i,j takes the value 0 or 1: e_i,j = 0 means there is no dependency (no edge) between task t_i and task t_j, while e_i,j = 1 means there is a dependency (an edge) between them, i.e. a directed edge e_i,j connects the two tasks t_i and t_j, making t_i the parent task of t_j and t_j a subtask of t_i; a task without any parent task is called an entry task;
(1c) The task complexity set L is represented as L = {l_i | t_i ∈ T}, where l_i denotes the computational complexity of task t_i;
(1d) The set of data transmission amounts D is represented as D = {d_i,j | t_i, t_j ∈ T}, where d_i,j denotes the amount of data transferred between task t_i and task t_j;
(1e) Combining the four elements yields the workflow directed acyclic graph model G = {T, E, L, D}.
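The four sets of claim 2 can be assembled with a few dictionaries. The sketch below is illustrative only; `build_dag`, its input encoding, and the `entry` field are assumptions of this example, not the patent's data structures.

```python
def build_dag(n, deps, complexity, data):
    """Assemble G = {T, E, L, D} per claim 2.
    deps: set of (i, j) pairs with t_i a parent of t_j;
    complexity: i -> l_i; data: (i, j) -> d_ij."""
    T = [f"t{i}" for i in range(1, n + 1)]
    E = {(i, j): 1 if (i, j) in deps else 0
         for i in range(1, n + 1) for j in range(1, n + 1)}
    L = {i: complexity[i] for i in range(1, n + 1)}
    D = {(i, j): data.get((i, j), 0) for (i, j) in deps}
    # entry tasks have no parent, i.e. no incoming edge with e_ij = 1
    entries = [j for j in range(1, n + 1)
               if not any(E[(i, j)] for i in range(1, n + 1))]
    return {"T": T, "E": E, "L": L, "D": D, "entry": entries}
```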
3. The method of claim 1, wherein: in step (2), the execution time w_i,k of a task in the workflow on the different virtual machines and the average execution time w̄_i of the task over all virtual machines are computed by the formulas:
w_i,k = l_i / v_k
w̄_i = (1/q) · Σ_{k=1}^{q} w_i,k
where w_i,k denotes the execution time of task t_i on virtual machine s_k, w̄_i denotes the average execution time of task t_i over all virtual machines, l_i denotes the computational complexity of task t_i, v_k denotes the computing speed of virtual machine s_k, and q denotes the number of available virtual machines.
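Assuming the execution-time model w_i,k = l_i / v_k implied by the symbols of claim 3 (complexity divided by machine speed), the two quantities reduce to a few lines; `exec_times` is this sketch's own name.

```python
def exec_times(l_i, speeds):
    """Per-VM times w_ik = l_i / v_k and the average w_bar_i = (1/q) * sum_k w_ik.
    l_i: computational complexity; speeds: list of VM speeds v_k."""
    w = [l_i / v for v in speeds]
    return w, sum(w) / len(w)
```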
4. The method of claim 1, wherein: in step (2), the data transfer time c_i,j between tasks having a data dependency and the average data transfer time c̄_i,j between tasks are computed by the formulas:
c_i,j = d_i,j / r_k1,k2
c̄_i,j = d_i,j / r̄
where c_i,j denotes the data transfer time between task t_i and task t_j when they are deployed for execution on virtual machine s_k1 and virtual machine s_k2 respectively, d_i,j denotes the amount of data transferred between task t_i and task t_j, r_k1,k2 denotes the data transmission rate between virtual machine s_k1 and virtual machine s_k2, c̄_i,j denotes the average data transfer time between the i-th task t_i and the j-th task t_j, and r̄ denotes the average data transmission rate between all virtual machines. When two tasks with a sequential dependency are placed on the same virtual machine, the data transmission overhead between them is negligible, i.e. c_i,j = 0.
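The transfer-time model of claim 4 can be sketched analogously, with the co-location rule (c_i,j = 0 on the same VM) made explicit; `transfer_times` and its `same_vm` flag are assumptions of this example.

```python
def transfer_times(d_ij, rate, avg_rate, same_vm=False):
    """c_ij = d_ij / r_k1,k2 (zero when both tasks share a VM) and
    the average c_bar_ij = d_ij / r_bar, per claim 4."""
    c = 0.0 if same_vm else d_ij / rate
    return c, d_ij / avg_rate
```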
5. The method of claim 1, wherein: in step (4b), the layer in which each task is located is determined by the maximum distance between that task's node and the entry task.
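The layer rule of claim 5 (layer = longest path from an entry task) can be computed by memoized recursion over parents; a small illustrative sketch, with `task_layers` being this example's own name.

```python
def task_layers(succ):
    """Layer of each task = length of the longest path from an entry task.
    succ: task -> list of children in the DAG."""
    parents = {t: [] for t in succ}
    for u, vs in succ.items():
        for v in vs:
            parents[v].append(u)
    memo = {}
    def layer(t):
        if t not in memo:
            # entry tasks (no parents) sit on layer 0
            memo[t] = 0 if not parents[t] else 1 + max(layer(p) for p in parents[t])
        return memo[t]
    return {t: layer(t) for t in succ}
```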
6. The method of claim 1, wherein: in step (4b), the proportion e_i,j, appearing in the modularity increment ΔQ formula, of the connecting-edge weight between the i-th sub-graph and the j-th sub-graph to the total connecting-edge weight in graph G' is determined as follows:
e_i,j = A_i,j / (2m)
where A_i,j denotes the sum of the weights of the edges connecting the i-th sub-graph and the j-th sub-graph, and m is the sum of all connecting-edge weights in graph G'.
7. The method of claim 1, wherein: in step (4b), the proportion a_i, appearing in the modularity increment ΔQ formula, of the sum of the connecting-edge weights of all tasks in the i-th sub-graph to the total connecting-edge weight in graph G' is determined as follows:
a_i = k_i / (2m)
where k_i is the sum of the weights of all connecting edges incident to the tasks of the i-th sub-graph, and m is the sum of all connecting-edge weights in graph G'.
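Claims 6 and 7 define the two fractions entering ΔQ. A small sketch computing both from a weighted edge list; the encoding (each undirected edge listed once) and the function name are assumptions of this example.

```python
def modularity_terms(edges, comm_i, comm_j):
    """e_ij = A_ij / (2m) and a_i = k_i / (2m), per claims 6 and 7.
    edges: {(u, v): weight}; k_i sums the degrees of the tasks in comm_i,
    so an edge internal to comm_i contributes twice."""
    m = sum(edges.values())
    a_ij = sum(w for (u, v), w in edges.items()
               if (u in comm_i and v in comm_j) or (u in comm_j and v in comm_i))
    k_i = sum(w for (u, v), w in edges.items() if u in comm_i) \
        + sum(w for (u, v), w in edges.items() if v in comm_i)
    return a_ij / (2 * m), k_i / (2 * m)
```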
CN202210730454.9A 2022-06-24 2022-06-24 Workflow deployment method based on graph segmentation Active CN115080236B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210730454.9A CN115080236B (en) 2022-06-24 2022-06-24 Workflow deployment method based on graph segmentation


Publications (2)

Publication Number Publication Date
CN115080236A true CN115080236A (en) 2022-09-20
CN115080236B CN115080236B (en) 2024-04-16

Family

ID=83255512


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193658A (en) * 2017-05-25 2017-09-22 重庆工程学院 Cloud computing resource scheduling method based on game theory
US20170371709A1 (en) * 2016-06-23 2017-12-28 International Business Machines Corporation Optimizing simultaneous startup or modification of inter-dependent machines with specified priorities
CN108108225A (en) * 2017-12-14 2018-06-01 长春工程学院 A kind of method for scheduling task towards cloud computing platform
CN109298930A (en) * 2017-07-24 2019-02-01 西安电子科技大学昆山创新研究院 A kind of cloud workflow schedule method and device based on multiple-objection optimization
CN109634742A (en) * 2018-11-15 2019-04-16 华南理工大学 A kind of time-constrain scientific workflow optimization method based on ant group algorithm
CN110008026A (en) * 2019-04-09 2019-07-12 中国科学院上海高等研究院 Job scheduling method, device, terminal and the medium divided equally based on additional budget
WO2021056787A1 (en) * 2019-09-23 2021-04-01 苏州大学 Hybrid cloud service process scheduling method
CN113032155A (en) * 2021-05-25 2021-06-25 深圳大学 Cloud edge resource cooperative scheduling method driven by time-space data visualization task


Non-Patent Citations (4)

Title
GO学堂: "Go实战 | 基于有向无环图的并发执行流的实现" [Go in Practice: Implementing Concurrent Execution Flows Based on a Directed Acyclic Graph], retrieved from the Internet: https://segmentfault.com/a/1190000041267350 *
HEBA SALEH: "IPSO Task Scheduling Algorithm for Large Scale Data in Cloud Computing Environment", IEEE Access, vol. 7, 28 December 2018, pages 5412-5420, XP011705567, DOI: 10.1109/ACCESS.2018.2890067 *
曹书锦 [Cao Shujin]: "云环境下截止期约束的科学工作流优化调度算法研究" [Research on Deadline-Constrained Scientific Workflow Scheduling Optimization in Cloud Environments], China Master's Theses Full-text Database, Information Science and Technology, no. 2022, 15 February 2022, pages 138-3 *
马英红 [Ma Yinghong]: "通信计算联合优化的图分割工作流部署方法" [A Graph-Partitioning Workflow Deployment Method with Joint Communication-Computation Optimization], Journal of Xidian University (online first), 4 January 2024, pages 1-16 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant