CN115080236B - Workflow deployment method based on graph segmentation - Google Patents


Info

Publication number
CN115080236B
CN115080236B (application number CN202210730454.9A)
Authority
CN
China
Prior art keywords
task
workflow
tasks
graph
sum
Prior art date
Legal status
Active
Application number
CN202210730454.9A
Other languages
Chinese (zh)
Other versions
CN115080236A (en)
Inventor
马英红
吝李婉
焦毅
李红艳
刘伟
刘勤
张琰
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University
Priority claimed from application CN202210730454.9A
Publication of application CN115080236A
Application granted
Publication of granted patent CN115080236B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources to service a request
    • G06F9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5038: Allocation of resources considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9024: Graphs; Linked lists
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533: Hypervisors; Virtual machine monitors
    • G06F9/45558: Hypervisor-specific management and integration aspects
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a workflow deployment method based on graph segmentation, which mainly addresses a problem of existing clustering-based workflow deployment algorithms: communication overhead is minimized at the expense of the parallel execution of tasks in the workflow, resulting in low parallel execution efficiency. The implementation scheme is as follows: 1) establish a workflow directed acyclic graph (DAG) model G; 2) calculate the execution time of each task and the data transmission time between tasks in the workflow; 3) merge the serial structures in the workflow model G to obtain a new workflow model graph G'; 4) partition the new workflow model graph G' to obtain the optimal task partition; 5) map the optimal task partition to virtual machines according to the minimum execution time to complete the deployment of the workflow. The invention reduces the completion time of the workflow, improves its execution efficiency, and can be used for the joint optimization of data overhead and task parallel execution efficiency during workflow execution.

Description

Workflow deployment method based on graph segmentation
Technical Field
The invention belongs to the technical field of cloud computing, and particularly relates to a workflow deployment method which can be used for joint optimization of data overhead and task parallel execution efficiency in a workflow execution process.
Background
In a cloud computing environment, a workflow is a set of interdependent tasks, typically described by a directed acyclic graph (DAG). Compared with independent tasks, a workflow is larger in scale and more complex in structure. Deploying a workflow requires considering not only the resource allocation of each task but also the data transmission and execution order among tasks, which greatly increases the complexity of task deployment. How to deploy workflows more scientifically and reasonably in distributed, heterogeneous environments remains a research hotspot in academia.
Currently, cluster computing frameworks such as MapReduce and Spark are widely used in data center networks to analyze and process ever-growing computing and networking tasks: complex large-scale tasks are decomposed into simpler tasks, modeled as workflows, and delivered to cloud data centers with powerful parallel processing capabilities. A workflow comprises multiple interdependent tasks connected in a given order of precedence, so workflow deployment must take the data transfer between tasks into account. Studies have shown that in MapReduce applications, intermediate data transmission takes more than 30% of the overall workflow completion time; in some large-scale commercial data centers, such as the Yahoo data center cluster, the workflow's in-process data transmission is the largest component of network traffic, taking up approximately 60% of the total workflow completion time. Such in-process data transmission is also a key cause of network congestion. Optimally deploying the workflow to reduce in-process communication overhead is therefore important for relieving data center traffic pressure and shortening task completion time.
Representative heuristic workflow deployment algorithms mainly include: list-based deployment algorithms, cluster-based deployment algorithms, replication-based deployment algorithms. The workflow deployment algorithm based on clustering is mainly aimed at reducing communication overhead among workflow tasks, wherein tasks in the workflow are mapped into different clusters firstly, and then each cluster is mapped onto the same computing node as a whole. The core idea is to divide a plurality of tasks with edges connected (with data dependency relationship) into the same cluster, so as to save the communication overhead among the tasks in the cluster.
For example, ahmadsg et al, in its published paper "Data-intensive workflow optimization based on application task graph partitioning in heteroge-neous computing systems"(IEEE Fourth International Conference on Big Data and Cloud Computing.IEEE,2014:129-136), propose a partition-based, data-intensive workflow optimization algorithm PDWA for heterogeneous computing systems. In this algorithm, the workflow is divided into a specified size and number of task partitions to minimize the data transfer overhead between the partitions. PDWA defines the maximum number of tasks allowed to be included in each task partition, calculated by multiplying the total number of tasks in the workflow by a coefficient less than 1, and mapping each task partition to the computing node that minimizes the partition execution time. The method has the defects that the workflow is clustered according to the data dependency relationship among the workflow tasks, so that some tasks which have strong data dependency relationship and can be executed in parallel can be divided into the same cluster, the parallel execution performance of the tasks is poor, and finally the completion time of the workflow is influenced.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a workflow deployment method based on graph segmentation, so as to realize the balance between communication overhead minimization and parallelism maximization in the workflow clustering process and improve the execution efficiency of the workflow.
The technical idea of the invention is as follows: from the viewpoint of graph theory, the dependency and parallelism among tasks in the workflow are fully exploited, and a classical graph partitioning technique, the community discovery algorithm, is improved to realize the joint optimization of data overhead and task parallelism during workflow task partitioning.
According to the above thought, the technical scheme of the invention comprises the following steps:
(1) According to the task set T, the data dependency and timing relation E between tasks, the task complexity set L, and the data transmission volume set D in a workflow, establish the workflow directed acyclic graph (DAG) model G = {T, E, L, D};
(2) Assign a set of virtual machines S = {s_k | k = 1, 2, 3, ..., q}, where q is the number of virtual machines and each virtual machine resides on a different physical machine; calculate the execution time w_{i,k} of each task on the different virtual machines, the data transmission time c_{i,j} between tasks with data dependencies, the average execution time of each task over all virtual machines, and the average data transmission time between tasks;
(3) Two tasks with serial structure in the workflow model G are determined and combined:
in the workflow model diagram, if only one subtask exists in one task and only one father task exists in the subtask, the subtask and the father task form a serial structure;
Cancel the data transmission between tasks t_i and t_{i+1} that form a serial structure, and merge them into a new task t'_i whose complexity is the sum of the complexities of the two tasks;
(4) Partitioning a workflow model diagram:
(4a) Divide the workflow model graph G into n sub-graphs, each containing one vertex;
(4b) Search in turn for two sub-graphs connected by an edge and attempt to merge them, calculating the modularity increment ΔQ after each merge from the tasks in each sub-graph and the connections between them:
If tasks of the same layer exist in the two sub-graphs, calculate the sum, denoted sum, of the average execution times of the same-layer tasks in the sub-graph formed by the merge, and compare it with the maximum value maxW of the average execution times of all tasks of that layer:
If sum > α·maxW, then ΔQ = -(e_{i,j} + e_{j,i} - 2a_i·a_j) = -2(e_{i,j} - a_i·a_j);
otherwise, ΔQ = e_{i,j} + e_{j,i} - 2a_i·a_j = 2(e_{i,j} - a_i·a_j); where α is a comparison coefficient whose value is smaller than 1, e_{i,j} is the proportion of the weight of edges between the ith and jth sub-graphs to the total edge weight in graph G, and a_i is the proportion of the weight of edges incident to all tasks in the ith sub-graph to the total edge weight in graph G.
If no same-layer tasks exist in the two sub-graphs, ΔQ = e_{i,j} + e_{j,i} - 2a_i·a_j = 2(e_{i,j} - a_i·a_j);
(4c) Merge the two sub-graphs with the maximum ΔQ value and update the modularity Q = Q + maxΔQ;
(4d) Repeat steps (4b) to (4c) until the whole graph G is merged into one sub-graph, then find the graph partitioning result corresponding to the maximum modularity value, i.e. the optimal task partition P = {p_1, p_2, ..., p_x, ..., p_h}, where p_x denotes the xth task partition and h the number of partitions;
(5) Deploying the best task partition onto the virtual machine:
(5a) Calculate the priority rank(t_i) of each task from the task's average execution time w̄_i and the average data transmission time c̄_{i,j} between tasks:
rank(t_i) = w̄_i + max_{t_j ∈ succ(t_i)} (c̄_{i,j} + rank(t_j)), with rank(t_i) = w̄_i for the exit task;
Wherein succ (t i) represents a subtask set of task t i;
(5b) The priority of the task partition is calculated according to the priority rank (t i) of each task:
rank(p_x) = max{ rank(t_i) | t_i ∈ p_x }
(5c) Arrange all partitions in descending order of rank(p_x); each time, select the undeployed task partition with the largest rank(p_x) value, traverse all virtual machines, calculate the sum of the partition's total execution time on the current virtual machine and the execution time of the tasks already deployed on that virtual machine, and find the virtual machine s_k for which this sum is smallest;
(5d) Deploy all tasks in the task partition together as a whole onto the virtual machine s_k; the tasks within the partition are ordered by descending rank(t_i) value, and the virtual machine executes them sequentially in that order.
Compared with the prior art, the invention has the following advantages:
1. When segmenting the workflow model graph, the sum of the average execution times of same-layer tasks within a partition is compared with the maximum value maxW of the average execution times of all tasks of that layer, which constrains the partitioning process. This effectively prevents a single task partition from containing too many same-layer tasks that could run in parallel, and balances the execution time of same-layer tasks across partitions. It thereby solves the problem of existing clustering-based workflow deployment algorithms, which minimize communication overhead at the expense of the parallel execution of tasks in the workflow, resulting in low task parallel execution efficiency.
2. When deploying the optimal task partitions onto virtual machines, a priority is calculated for each task partition and the partitions are sorted; a suitable virtual machine is then selected for each partition on the basis of minimum execution time. This further reduces the completion time of the workflow and improves its execution efficiency.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is an exemplary diagram of a workflow DAG model in accordance with the present invention;
FIG. 3 is a schematic diagram showing the merging of DAG serial structures according to the present invention;
FIG. 4 is a schematic diagram of the segmentation of a workflow model diagram in accordance with the present invention;
FIG. 5 is a simulated comparison of performance changes of the present invention and existing workflow deployment algorithms as workflow scale increases;
FIG. 6 is a comparison graph of performance variation simulations of the present invention and existing workflow deployment algorithms with increasing communication computation ratio;
FIG. 7 is a comparison graph of performance variation simulations of the present invention and existing workflow deployment algorithms as the number of available virtual machines increases.
Detailed Description
Embodiments and effects of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, the implementation steps of this example are as follows:
And step 1, establishing a workflow directed acyclic graph DAG model.
The task set T in the workflow is expressed as T = {t_i | i = 1, 2, ..., n}, where t_i denotes the ith task and n is the number of tasks the workflow contains;
The data dependency and timing relation E between tasks is expressed as E = {e_{i,j} | t_i, t_j ∈ T}, where e_{i,j} takes the value 0 or 1: when e_{i,j} is 0, no dependency/edge exists between task t_i and task t_j, and when e_{i,j} is 1, a dependency/edge exists between them;
The task complexity set L is expressed as L = {l_i | t_i ∈ T}, where l_i denotes the computational complexity of task t_i;
The data transmission volume set D is expressed as D = {d_{i,j} | t_i, t_j ∈ T}, where d_{i,j} denotes the data transmission volume between task t_i and task t_j;
Combining the above four elements yields the workflow directed acyclic graph model:
G={T,E,L,D},
The task set T is a vertex set of the graph G, the data dependency and time sequence relation E between tasks is an edge set of the graph G, the task complexity set L is a vertex weight set of the graph G, and the data transmission quantity set D is an edge weight set of the graph G.
The workflow directed acyclic graph model is shown in fig. 2, wherein a directed edge e i,j connects two tasks t i and t j, t i is called a parent task of t j, t j is called a child task of t i, a task without any parent task is called an entrance task t enter, a task without any child task is called an exit task t exit, the number of layers where each task is located is determined by the maximum distance between a node and an entrance task, and the entrance task t 1 is located at the first layer.
Without loss of generality, this example assumes that a workflow has exactly one entry task and one exit task. When a workflow has multiple entry or exit tasks, a virtual entry or exit task vertex is added so that the workflow has a single entry task and a single exit task; the complexity of a virtual task is set to zero and its data transmission volume to other vertices is also zero, so adding a virtual task vertex does not affect the deployment result of the workflow. Task t_10 in Fig. 2 is a virtual exit task.
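The DAG model of step 1 can be sketched in code. This is a minimal illustration under stated assumptions, not the patent's implementation; the class name and the small example graph are invented for demonstration:

```python
class WorkflowDAG:
    """Minimal sketch of the workflow DAG model G = {T, E, L, D}."""

    def __init__(self):
        self.tasks = []          # T: task identifiers
        self.complexity = {}     # L: task -> computational complexity l_i
        self.edges = {}          # E: (i, j) -> 1 when t_i precedes t_j
        self.data = {}           # D: (i, j) -> data volume d_{i,j}

    def add_task(self, tid, l_i):
        self.tasks.append(tid)
        self.complexity[tid] = l_i

    def add_edge(self, i, j, d_ij):
        self.edges[(i, j)] = 1
        self.data[(i, j)] = d_ij

    def parents(self, j):
        return [u for (u, v) in self.edges if v == j]

    def children(self, i):
        return [v for (u, v) in self.edges if u == i]

    def layer(self, tid):
        # Layer = longest distance from the entry task; entry is layer 1,
        # matching the layer rule stated in the text above.
        ps = self.parents(tid)
        return 1 if not ps else 1 + max(self.layer(p) for p in ps)
```

An entry task (one with no parents) sits at layer 1; a task reachable by two paths takes the longer one, as the maximum-distance rule requires.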
And 2, calculating the execution time of each task and the data transmission time among the tasks in the workflow.
2.1) Assign a set of virtual machines S = {s_k | k = 1, 2, 3, ..., q}, where q is the number of virtual machines and each virtual machine resides on a different physical machine;
2.2 Computing the working parameters of tasks on different virtual machines in the workflow:
2.2.1) Calculate the execution time w_{i,k} of task t_i on the different virtual machines and the average execution time w̄_i of task t_i over all virtual machines:
w_{i,k} = l_i / v_k
w̄_i = (1/q) · Σ_{k=1}^{q} w_{i,k}
where l_i is the computational complexity of task t_i, v_k is the processing power of virtual machine s_k, and q is the number of available virtual machines;
2.2.2) Calculate the data transmission time c_{i,j} between task t_i and task t_j with a data dependency, and the average data transmission time c̄_{i,j} between them:
c_{i,j} = d_{i,j} / r_{k1,k2}
c̄_{i,j} = d_{i,j} / r̄
where task t_i and task t_j are deployed on virtual machines s_k1 and s_k2 respectively, d_{i,j} is the data transmission volume between tasks t_i and t_j, r_{k1,k2} is the data transmission rate between virtual machines s_k1 and s_k2, and r̄ is the average data transmission rate between all virtual machines. When two tasks with a sequential dependency are placed on the same virtual machine, the data transmission overhead between them is negligible, i.e. c_{i,j} is 0.
When the task t 1 and the task t 2 are placed in the same virtual machine as in fig. 2, no additional resources are needed for data transmission, so the data transmission overhead can be recorded as 0; when the task t 1 and the task t 2 are respectively placed in two different virtual machines, the data needs additional network resources to be transmitted, and the transmission rate r k1,k2 =1 between the two virtual machines is assumed, so that the required data transmission time between the task t 1 and the task t 2 is 8.
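The timing formulas of this step can be sketched as below. The function names are illustrative assumptions; the d = 8, r = 1 figures echo the Fig. 2 example just described:

```python
def exec_time(l_i, v_k):
    """w_{i,k} = l_i / v_k: execution time of task t_i on VM s_k."""
    return l_i / v_k

def avg_exec_time(l_i, speeds):
    """Average execution time of t_i over all q virtual machines."""
    return sum(l_i / v_k for v_k in speeds) / len(speeds)

def transfer_time(d_ij, r, same_vm=False):
    """c_{i,j} = d_{i,j} / r between two VMs; 0 when both tasks
    share the same VM, as stated in the text."""
    return 0.0 if same_vm else d_ij / r
```

With d_{1,2} = 8 and r = 1 the transmission time is 8, and it drops to 0 when t_1 and t_2 are co-located, matching the example above.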
And step 3, merging the serial structures of the workflow model diagrams.
In the workflow model diagram, if only one subtask exists in one task and only one father task exists in the subtask, the subtask and the father task form a serial structure;
Cancel the data transmission between tasks t_i and t_{i+1} that form a serial structure, merge the two tasks into a new task t'_i, and update the computational complexity and the workflow model graph to form the new model graph G'.
As shown in fig. 3, the task t i and the task t i+1 are two tasks with a serial structure, and after the two tasks are combined, a new task t' i is formed, and the new task computational complexity and the workflow model diagram are updated as follows:
w'_i = w_i + w_{i+1}
pre(t'_i) = pre(t_i)
succ(t'_i) = succ(t_{i+1})
Wherein w 'i represents the execution time of the new task t' i, w i represents the execution time of the task t i, w i+1 represents the execution time of the task t i+1, pre (t 'i) represents the parent task set of the new task t' i, succ (t 'i) represents the child task set of the new task t' i, pre (t i) represents the parent task set of the task t i, succ (t i+1) represents the child task set of the task t i+1.
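The serial-structure merge of step 3 can be sketched as follows; a simplified illustration in which the merged task keeps the parent's id (an assumption of this sketch, not the patent's notation):

```python
def merge_serial_chain(succ, w):
    """Merge serial structures: whenever a task has exactly one child and
    that child has exactly one parent, fuse them into one task whose
    execution time is the sum (w'_i = w_i + w_{i+1}); the merged task
    keeps the parent's id and inherits the child's successors."""
    changed = True
    while changed:
        changed = False
        # Rebuild the parent lists from the successor lists.
        pred = {}
        for u, cs in succ.items():
            for c in cs:
                pred.setdefault(c, []).append(u)
        for u in list(succ):
            cs = succ.get(u, [])
            if len(cs) == 1 and len(pred.get(cs[0], [])) == 1:
                v = cs[0]
                w[u] = w[u] + w.pop(v)      # w'_i = w_i + w_{i+1}
                succ[u] = succ.pop(v, [])   # succ(t'_i) = succ(t_{i+1})
                changed = True
                break
    return succ, w
```

A chain t_1 -> t_2 -> t_3 collapses to a single task whose time is the sum of the three, as the update rules above prescribe.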
And 4, dividing the new workflow model graph G' formed after the serial structures are combined.
4.1 Dividing the workflow model graph G' into n sub-graphs, wherein each sub-graph comprises a vertex;
4.2) Search in turn for two sub-graphs connected by an edge and attempt to merge them, calculating the modularity increment ΔQ after each merge from the tasks in each sub-graph and the connections between them:
If tasks of the same layer exist in the two sub-graphs, calculate the sum, denoted sum, of the average execution times of the same-layer tasks in the sub-graph formed by the merge, and compare it with the maximum value maxW of the average execution times of all tasks of that layer:
If sum > α·maxW, then ΔQ = -(e_{i,j} + e_{j,i} - 2a_i·a_j) = -2(e_{i,j} - a_i·a_j);
otherwise, ΔQ = e_{i,j} + e_{j,i} - 2a_i·a_j = 2(e_{i,j} - a_i·a_j); where α is a comparison coefficient whose value is smaller than 1, e_{i,j} is the proportion of the weight of edges between the ith and jth sub-graphs to the total edge weight in graph G', and a_i is the proportion of the weight of edges incident to all tasks in the ith sub-graph to the total edge weight in graph G';
If no same-layer tasks exist in the two sub-graphs, ΔQ = e_{i,j} + e_{j,i} - 2a_i·a_j = 2(e_{i,j} - a_i·a_j);
where e_{i,j} is the proportion of the weight of edges between the ith and jth sub-graphs to the total edge weight in graph G', and a_i is the proportion of the weight of edges incident to all tasks in the ith sub-graph to the total edge weight in graph G', calculated as follows:
e_{i,j} = (Σ_{t_u ∈ p_i, t_v ∈ p_j} A_{u,v}) / (2m)
a_i = k_i / (2m)
where A_{u,v} is the edge weight between task t_u and task t_v, m is the sum of all edge weights in graph G', and k_i is the sum of the weights of all edges connected to the tasks in the ith sub-graph.
4.3) Repeat 4.2) until the whole graph G' merges into one community, then find the community structure corresponding to the maximum modularity value, i.e. the optimal partition scheme P = {p_1, p_2, ..., p_h}.
Fig. 4 shows an example of workflow task partitioning for a workflow of five tasks, where t_2, t_3 and t_4 belong to the same layer and can be executed in parallel, with execution times 15, 10 and 25 respectively. During partitioning, these three tasks might all be placed in the same partition and executed sequentially, which obviously reduces the parallelism of workflow execution and prolongs the completion time. Another possible result is that t_3 and t_4 are placed in one partition and t_2 in another; the execution times of the same-layer tasks are then unbalanced between the two partitions, which must wait for each other during actual execution, again prolonging the completion time of the workflow. In contrast, if t_2 and t_3 are placed in one partition and t_4 in another, the execution time of the same-layer tasks is evenly distributed across the two partitions, and redundant waiting time is avoided.
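The ΔQ rule of step 4.2) can be sketched as a small function. This is an illustrative reading of the rule; the argument names and the default α = 0.8 are assumptions (the patent only requires α < 1):

```python
def delta_q(e, a, i, j, same_layer_sum=0.0, max_w=float("inf"), alpha=0.8):
    """Modularity increment for merging sub-graphs i and j.

    e[i][j]: fraction of the total edge weight lying between sub-graphs
             i and j (e is symmetric, so e_ij + e_ji = 2*e[i][j]);
    a[i]:    fraction of the total edge weight incident to sub-graph i.
    When the merged sub-graph's same-layer average execution time
    `same_layer_sum` exceeds alpha * max_w, the sign of the increment is
    flipped to discourage the merge (the same-layer parallelism constraint)."""
    dq = 2 * (e[i][j] - a[i] * a[j])  # e_ij + e_ji - 2*a_i*a_j
    return -dq if same_layer_sum > alpha * max_w else dq
```

The default max_w of infinity makes the constraint inactive unless same-layer timing data is supplied, so the function degenerates to the ordinary Newman-style ΔQ.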
And 5, mapping the optimal task partition to the virtual machine to complete the deployment of the workflow.
5.1) Calculate the priority rank(t_i) of each task from the task's average execution time w̄_i and the average data transmission time c̄_{i,j} between tasks:
rank(t_i) = w̄_i + max_{t_j ∈ succ(t_i)} (c̄_{i,j} + rank(t_j)), with rank(t_i) = w̄_i for the exit task;
Wherein succ (t i) represents a subtask set of task t i;
5.2 Calculating the priority of the task partition according to the priority rank (t i) of each task:
rank(p_x) = max{ rank(t_i) | t_i ∈ p_x }
5.3) Arrange all partitions in descending order of the partition priority rank(p_x); each time, select the undeployed task partition with the largest rank(p_x) value, traverse all virtual machines, calculate the sum of the partition's total execution time on the current virtual machine and the execution time of the tasks already deployed on that virtual machine, and find the virtual machine s_k for which this sum is smallest;
5.4) Deploy all tasks in the task partition together as a whole onto the virtual machine s_k; the tasks within the partition are ordered by descending rank(t_i) value, and the virtual machine executes them sequentially in that order, completing the deployment of the workflow.
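Steps 5.1) to 5.4) can be sketched as below. The upward-rank recursion follows the formula given in 5.1); `part_time_on_vm` is an assumed callback supplying a partition's execution time on a given VM, since the patent text does not fix that interface:

```python
def task_rank(t, succ, w_avg, c_avg, memo=None):
    """rank(t) = w_avg[t] + max over children (c_avg[(t, j)] + rank(j));
    an exit task (no children) simply gets rank = w_avg[t]."""
    if memo is None:
        memo = {}
    if t in memo:
        return memo[t]
    r = w_avg[t]
    children = succ.get(t, [])
    if children:
        r += max(c_avg[(t, j)] + task_rank(j, succ, w_avg, c_avg, memo)
                 for j in children)
    memo[t] = r
    return r

def deploy(partitions, succ, w_avg, c_avg, part_time_on_vm, num_vms):
    """Sort partitions by their maximum task rank (descending), then map
    each, in turn, to the VM minimizing accumulated load plus the
    partition's execution time there (steps 5.3 and 5.4)."""
    ranks = {id(p): max(task_rank(t, succ, w_avg, c_avg) for t in p)
             for p in partitions}
    order = sorted(partitions, key=lambda p: -ranks[id(p)])
    load = [0.0] * num_vms
    placement = {}
    for p in order:
        best = min(range(num_vms),
                   key=lambda k: load[k] + part_time_on_vm(p, k))
        load[best] += part_time_on_vm(p, best)
        placement[tuple(sorted(p))] = best
    return placement
```

Within a placed partition the tasks would then be ordered by descending rank(t_i) and run sequentially, as 5.4) states; that ordering is omitted here for brevity.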
The effects of the present invention are further described below in conjunction with simulation experiments:
1. Simulation parameter settings:
The workflow directed acyclic graph DAG is randomly generated, with simulation parameters set as shown in Table 1:
Table 1 Workflow deployment simulation parameter settings

Parameter                               Fixed value (variation values)
Workflow scale N                        50 (20, 40, 60, 80, 100)
Communication-computation ratio CCR     0.5 (0.4, 0.8, 1.2, 1.6, 2)
Number of available virtual machines K  5 (1, 2, 3, 4, 5, 6)
In Table 1, the workflow scale refers to the number of tasks in the workflow; the communication-computation ratio CCR is the ratio of the sum of the average data transmission times between tasks to the sum of the average execution times of all tasks. A higher CCR indicates a communication-intensive workflow, a lower CCR a computation-intensive one.
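The CCR definition above reduces to a one-line computation; a trivial sketch with assumed argument names:

```python
def ccr(avg_transfer_times, avg_exec_times):
    """Communication-computation ratio: sum of the average inter-task
    data transmission times divided by the sum of the average task
    execution times. CCR > 1 suggests a communication-intensive
    workflow, CCR < 1 a computation-intensive one."""
    return sum(avg_transfer_times) / sum(avg_exec_times)
```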
2. Simulation content and result analysis:
Two existing workflow deployment algorithms, the list-based HEFT and the clustering-based PDWA, are selected for comparison with the deployment results of the present method in three respects: the influence of the workflow scale on algorithm performance, the influence of the communication-computation ratio on algorithm performance, and the influence of the number of virtual machines on algorithm performance.
Simulation 1: with the number of virtual machines set to 5 and the communication-computation ratio CCR set to 0.5, the number of workflow task nodes is increased from 20 to 100, and the performance of the three methods with growing workflow scale is compared; the results are shown in Fig. 5, wherein:
Fig. 5 (a) is a graph showing the change of the scheduling length ratio SLR of three methods according to the increase of the workflow scale, wherein the abscissa is the number of tasks in the workflow, and the ordinate is the scheduling length ratio SLR representing the workflow completion time, and the larger the SLR value, the larger the workflow completion time, the worse the performance. As can be seen from fig. 5 (a), compared with the deployment algorithm PDWA based on clustering, the workflow completion time of the present invention is reduced by 75% on average, because PDWA artificially designates the size of the workflow partition, the number of partitions is relatively fixed, and the parallelism of task execution is not considered, resulting in the extension of the workflow completion time, while the method of the present invention considers the communication overhead and the parallelism of task execution when performing the workflow task partition, and avoids excessive waiting time caused by the execution time that can be executed in parallel or is unbalanced, thereby effectively reducing the workflow completion time.
Fig. 5(b) compares the speedup of the three methods as the workflow scale increases, with the number of tasks in the workflow on the abscissa and speedup on the ordinate; speedup represents the parallel execution efficiency of the workflow, and a larger value indicates higher parallel execution efficiency and better performance of the deployment method. As can be seen from Fig. 5(b), as the workflow scale increases, the speedup of the proposed method is consistently larger than that of the clustering-based deployment algorithm PDWA. The method fully considers the dependency and parallelism of the tasks in the workflow when partitioning, reducing the data transmission overhead between communities while also accounting for the parallel execution efficiency of the tasks, so that same-layer tasks can be executed in parallel efficiently.
Fig. 5 (c) is a graph of the communication overhead change with increasing workflow size for three methods, with the abscissa representing the number of tasks in the workflow and the ordinate representing the communication overhead. As can be seen from fig. 5 (c), with the gradual increase of the workflow size, the communication overhead is in an upward trend, and compared with the list-based algorithm HEFT, the method and the cluster-based deployment algorithm PDWA of the present invention have lower communication overhead all the time, and with reference to fig. 5 (a) and 5 (b), it is illustrated that the method of the present invention can achieve a reduction in the workflow completion time and a significant improvement in the parallel execution efficiency while maintaining the smaller communication overhead.
Simulation 2: the number of virtual machines is set to 5, the number of workflow task nodes to 50, and the communication-to-computation ratio CCR is increased from 0.4 to 2 to compare how the performance of the three methods changes with CCR. The results are shown in fig. 6, wherein:
Fig. 6 (a) shows how the scheduling length ratio SLR of the three methods changes as CCR increases; the abscissa is the communication-to-computation ratio CCR and the ordinate is the SLR. As can be seen from fig. 6 (a), as CCR keeps increasing, the workflow completion time of the method of the present invention remains smaller than that of the clustering-based deployment algorithm PDWA, and the performance of the method of the present invention and of the list-based algorithm HEFT is more stable as CCR varies, while the SLR of PDWA rises significantly with CCR. This is because PDWA, which only minimizes the communication overhead on the basis of a prescribed partition size, gradually loses its advantage as the proportion of data transmission time grows.
Fig. 6 (b) compares the speedup of the three methods as CCR increases; the abscissa is the CCR and the ordinate is the speedup. As can be seen from fig. 6 (b), the speedup of the method of the present invention holds a clear advantage over the clustering-based deployment algorithm PDWA as CCR increases, and the method adapts better, because the present invention fully considers the parallelism of the workflow tasks when partitioning the workflow.
Fig. 6 (c) plots the communication overhead of the three methods as CCR increases; the abscissa is the CCR and the ordinate is the communication overhead. As can be seen from fig. 6 (c), the method of the present invention and the clustering-based deployment algorithm PDWA incur lower communication overhead than the list-based algorithm HEFT. Together with fig. 6 (a) and fig. 6 (b), this verifies that the method of the present invention effectively improves the execution efficiency of the workflow while maintaining a small communication overhead.
Simulation 3: the number of workflow tasks is set to 50, the communication-to-computation ratio CCR to 0.5, and the number of virtual machines is increased from 1 to 6 to compare how the performance of the three methods changes with the number of available virtual machines. The results are shown in fig. 7, wherein:
FIG. 7 (a) shows how the scheduling length ratio SLR of the three methods varies with the number of available virtual machines; the abscissa is the number of available virtual machines and the ordinate is the SLR. As can be seen from fig. 7 (a), as the number of available virtual machines increases, the SLR of the workflow keeps decreasing and the advantage of the method of the present invention gradually becomes apparent. With more virtual machines, more tasks can be processed in parallel on different virtual machines during workflow execution; because the method of the present invention balances task parallelism and execution time and accounts for the data transmission overhead between partitions when partitioning tasks, it avoids excessive waiting between different partitions when multiple virtual machines execute in parallel and effectively reduces the completion time of the workflow.
FIG. 7 (b) compares the speedup of the three methods as the number of available virtual machines increases; the abscissa is the number of available virtual machines and the ordinate is the speedup. As can be seen from fig. 7 (b), the speedup keeps rising with the number of available virtual machines, and the advantage of the method of the present invention becomes apparent. The method not only minimizes the communication overhead between partitions when partitioning but also preserves the parallel execution efficiency of tasks during workflow execution; as the number of available virtual machines grows, more partitions can be processed in parallel on different virtual machines, so the speedup increases accordingly.
Fig. 7 (c) plots the communication overhead of the three methods as the number of available virtual machines increases; the abscissa is the number of available virtual machines and the ordinate is the communication overhead. As can be seen from fig. 7 (c), the method of the present invention and the clustering-based deployment algorithm PDWA incur lower communication overhead than the list-based algorithm HEFT. Together with fig. 7 (a) and fig. 7 (b), this shows that the method of the present invention effectively improves the parallel execution efficiency and reduces the workflow completion time while maintaining low communication overhead.

Claims (7)

1. A workflow deployment method based on graph segmentation, characterized by comprising the following steps:
(1) According to the task set T in the workflow, the data dependency and timing relation E between tasks, the task complexity set L, and the data transmission amount set D, establish a workflow directed acyclic graph DAG model: G = {T, E, L, D};
(2) Assign a set of virtual machines S = {s_k | k = 1, 2, 3, ..., q}, where q denotes the number of virtual machines and each virtual machine resides on a different physical machine; calculate the execution time w_i,k of each task on the different virtual machines, the data transmission time c_i,j between tasks with a data dependency in the workflow, the average execution time w̄_i of each task over all virtual machines, and the average data transmission time c̄_i,j between tasks;
(3) Identify pairs of tasks with a serial structure in the workflow model G and merge them to obtain a new workflow model graph G':
In the workflow model graph, if a task has only one subtask and that subtask has only one parent task, the subtask and its parent task form a serial structure;
Cancel the data transmission between the serially structured tasks t_i and t_i+1, and merge the two tasks into a new task t'_i;
(4) Split the new workflow model graph G' formed after merging the serial structures:
(4a) Divide the workflow model graph G' into n subgraphs, each containing a single vertex;
(4b) Sequentially search for and tentatively merge pairs of subgraphs connected by an edge, and compute the modularity increment ΔQ after each merge from the tasks contained in each subgraph and the connections between them:
If the two subgraphs contain tasks in the same layer, compute the sum (denoted sum) of the average execution times of the same-layer tasks in the merged subgraph and compare it with maxW·α, where maxW is the maximum of the average execution times of all tasks in that layer:
If sum > maxW·α, then ΔQ = -(e_i,j + e_j,i - 2a_i·a_j) = -2(e_i,j - a_i·a_j);
Otherwise, ΔQ = e_i,j + e_j,i - 2a_i·a_j = 2(e_i,j - a_i·a_j); where α is a comparison coefficient whose value is a number smaller than 1, e_i,j denotes the ratio of the weight of the edges connecting the i-th subgraph and the j-th subgraph to the total edge weight of the graph G', and a_i denotes the ratio of the sum of the edge weights of all tasks in the i-th subgraph to the total edge weight of the graph G';
If the two subgraphs contain no tasks in the same layer, ΔQ = e_i,j + e_j,i - 2a_i·a_j = 2(e_i,j - a_i·a_j);
(4c) Merge the two subgraphs with the largest ΔQ value and update the modularity: Q = Q + max ΔQ;
(4d) Repeat steps (4b) and (4c) until the whole graph G' has been merged into a single subgraph, then take the graph-partition result corresponding to the maximum modularity value as the optimal task partition P = {p_1, p_2, ..., p_x, ..., p_h}, where p_x denotes the x-th task partition and h the number of partitions;
(5) Mapping the optimal task partition to a virtual machine to complete the deployment of the workflow:
(5a) Calculate the priority rank(t_i) of each task from the task's average execution time and the average data transmission time between tasks:
rank(t_i) = w̄_i + max_{t_j ∈ succ(t_i)} (c̄_i,j + rank(t_j))
where succ(t_i) denotes the set of subtasks of task t_i;
(5b) Calculate the priority of each task partition from the task priorities rank(t_i):
rank(p_x) = max{rank(t_i) | t_i ∈ p_x}
(5c) Sort all partitions in descending order of rank(p_x); each time select the undeployed task partition with the largest rank(p_x), traverse all virtual machines, compute for each the sum of the total execution time of this task partition on that virtual machine and the execution time of the tasks already deployed on it, and find the virtual machine s_k for which this sum is smallest;
(5d) Deploy all tasks in the task partition together, as a whole, onto the virtual machine s_k; the tasks within the partition are arranged in descending order of their rank(t_i) values and the virtual machine executes them sequentially.
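The mapping procedure of step (5) can be sketched as follows. This is an illustrative Python reconstruction, not the patented implementation; the function name `deploy` and all inputs are hypothetical, and the within-partition reordering by rank(t_i) of step (5d) is left out for brevity.

```python
# Sketch of step (5): map task partitions to virtual machines in
# descending order of partition rank, each time picking the VM that
# minimizes (partition execution time + load already deployed there).

def deploy(partitions, rank_p, exec_time, q):
    """partitions: {px: [tasks]}; rank_p: {px: rank(px)};
    exec_time: {(task, vm): w_i,k}; q: number of virtual machines.
    Returns {vm: [tasks in deployment order]}."""
    load = [0.0] * q                       # time already scheduled per VM
    placement = {k: [] for k in range(q)}
    # (5c) descending order of partition priority rank(px)
    for px in sorted(partitions, key=lambda p: -rank_p[p]):
        tasks = partitions[px]
        # choose the VM minimizing partition time plus current load
        best = min(range(q),
                   key=lambda k: sum(exec_time[(t, k)] for t in tasks) + load[k])
        placement[best].extend(tasks)      # (5d) deploy partition as a whole
        load[best] += sum(exec_time[(t, best)] for t in tasks)
    return placement
```

With two equally fast VMs and two partitions, the higher-ranked partition lands on the first VM and the next partition moves to the now-less-loaded VM.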
2. The method according to claim 1, characterized in that the workflow directed acyclic graph model G in step (1) is established as follows:
(1a) The task set T in the workflow is expressed as T = {t_i | i = 1, 2, ..., n}, where t_i denotes the i-th task and n is the number of tasks contained in the workflow;
(1b) The data dependency and timing relation E between tasks is expressed as E = {e_i,j | t_i, t_j ∈ T}, where e_i,j takes the value 0 or 1: e_i,j = 0 indicates that there is no dependency (no edge) between task t_i and task t_j, while e_i,j = 1 indicates that there is a dependency (an edge) between them; the directed edge e_i,j connects the two tasks t_i and t_j, t_i being called the parent task of t_j and t_j the child task of t_i; a task without any parent task is called the entry task;
(1c) The task complexity set L is expressed as L = {l_i | t_i ∈ T}, where l_i denotes the computational complexity of task t_i;
(1d) The data transmission amount set D is expressed as D = {d_i,j | t_i, t_j ∈ T}, where d_i,j denotes the amount of data transferred between task t_i and task t_j;
(1e) The four elements are combined to obtain the workflow directed acyclic graph model G = {T, E, L, D}.
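As an illustration of the model of claim 2, the four elements can be held in plain dictionaries. The helper `build_workflow_model` and the toy three-task values below are hypothetical, not from the patent.

```python
# Minimal sketch of the workflow DAG model G = {T, E, L, D} from claim 2.
# The three-task example values are illustrative only.

def build_workflow_model(n, edges, complexity, data):
    """Return G = {T, E, L, D} as plain dictionaries.

    edges: set of (i, j) pairs meaning task t_i is a parent of t_j.
    complexity: {i: l_i}; data: {(i, j): d_ij}.
    """
    T = list(range(n))                          # T = {t_i}
    E = {(i, j): 1 if (i, j) in edges else 0    # e_ij in {0, 1}
         for i in T for j in T if i != j}
    L = dict(complexity)                        # l_i
    D = dict(data)                              # d_ij
    return {"T": T, "E": E, "L": L, "D": D}

# Toy workflow: t0 -> t1, t0 -> t2
G = build_workflow_model(
    n=3,
    edges={(0, 1), (0, 2)},
    complexity={0: 4.0, 1: 2.0, 2: 3.0},
    data={(0, 1): 1.5, (0, 2): 0.5},
)
# (1b): a task with no parent is the entry task
entry_tasks = [t for t in G["T"]
               if not any(G["E"][(p, t)] for p in G["T"] if p != t)]
```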
3. The method according to claim 1, characterized in that the execution time w_i,k of a task on the different virtual machines and the average execution time w̄_i of a task over all virtual machines in step (2) are calculated as follows:
w_i,k = l_i / v_k,   w̄_i = (1/q) · Σ_{k=1}^{q} w_i,k
where w_i,k denotes the execution time of task t_i on virtual machine s_k, w̄_i denotes the average execution time of task t_i over all virtual machines, l_i denotes the computational complexity of task t_i, v_k denotes the processing capacity of virtual machine s_k, and q denotes the number of available virtual machines.
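The execution-time formulas of claim 3 reduce to a one-line computation; the sketch below is an assumed reconstruction (the formula itself is garbled in the source text and is rebuilt from the symbol definitions), and the helper name is hypothetical.

```python
# Sketch of claim 3 (assumed reconstruction):
#   w_ik = l_i / v_k,   w_bar_i = (1/q) * sum_k w_ik

def execution_times(l_i, speeds):
    """l_i: computational complexity; speeds: [v_1, ..., v_q].
    Returns (per-VM execution times, average execution time)."""
    w = [l_i / v_k for v_k in speeds]   # w_ik = l_i / v_k
    w_bar = sum(w) / len(speeds)        # average over the q VMs
    return w, w_bar
```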
4. The method according to claim 1, characterized in that the data transmission time c_i,j and the average data transmission time c̄_i,j between tasks with a data dependency in step (2) are calculated as follows:
c_i,j = d_i,j / r_k1,k2,   c̄_i,j = d_i,j / r̄
where c_i,j denotes the data transmission time between task t_i and task t_j when they are deployed on virtual machine s_k1 and virtual machine s_k2 respectively, d_i,j denotes the amount of data transferred between task t_i and task t_j, r_k1,k2 denotes the data transmission rate between virtual machine s_k1 and virtual machine s_k2, c̄_i,j denotes the average data transmission time between the i-th task t_i and the j-th task t_j, and r̄ denotes the average data transmission rate between all virtual machines; when two tasks with a sequential dependency are placed on the same virtual machine, the data transmission overhead between them is negligible, i.e., c_i,j = 0.
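The transfer-time rule of claim 4, including the same-VM zero-cost case, might be sketched as follows; this is an assumed reconstruction (the formulas are garbled in the source) with hypothetical helper names.

```python
# Sketch of claim 4 (assumed reconstruction):
#   c_ij = d_ij / r_k1,k2 when the tasks sit on different VMs, else 0;
#   c_bar_ij = d_ij / r_bar, r_bar being the mean inter-VM rate.

def transfer_time(d_ij, vm_i, vm_j, rate):
    """Transfer time between two placed tasks; zero on the same VM."""
    return 0.0 if vm_i == vm_j else d_ij / rate

def avg_transfer_time(d_ij, rates):
    """c_bar_ij using the average data rate over all VM pairs."""
    r_bar = sum(rates) / len(rates)
    return d_ij / r_bar
```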
5. The method according to claim 1, characterized in that in step (4b) the layer number of each task is determined by the maximum distance between that task node and the entry task.
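Claim 5 defines a task's layer as the maximum distance from the entry task; on a DAG this is a longest-path computation, which might be sketched as follows (the helper `task_layers` is hypothetical):

```python
# Sketch of claim 5: layer(t) = maximum number of hops from the entry
# task to t, computed by memoized longest-path recursion over the DAG.

def task_layers(n, edges):
    """edges: set of (parent, child) pairs; returns {task: layer}."""
    parents = {t: [p for (p, c) in edges if c == t] for t in range(n)}
    layer = {}

    def depth(t):
        if t not in layer:
            ps = parents[t]
            # entry tasks (no parents) sit at layer 0
            layer[t] = 0 if not ps else 1 + max(depth(p) for p in ps)
        return layer[t]

    for t in range(n):
        depth(t)
    return layer
```

On the diamond DAG t0 -> {t1, t2} -> t3, the two middle tasks share layer 1 and can run in parallel, which is exactly the same-layer set examined in step (4b).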
6. The method according to claim 1, characterized in that the ratio e_i,j of the edge weight between the i-th subgraph and the j-th subgraph to the total edge weight of the graph G', involved in the modularity increment ΔQ formula of step (4b), is determined as follows:
e_i,j = A_i,j / (2m)
where A_i,j is the edge weight between the i-th task and the j-th task, and m is the sum of all edge weights in the graph G'.
7. The method according to claim 1, characterized in that the ratio a_i of the sum of the edge weights of all tasks in the i-th subgraph to the total edge weight of the graph G', involved in the modularity increment ΔQ formula of step (4b), is determined as follows:
a_i = k_i / (2m)
where k_i is the sum of the weights of all edges connected to the i-th task, and m is the sum of all edge weights in the graph G'.
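Combining claims 6 and 7 with the case analysis of step (4b), the modularity increment might be computed as below. The function `delta_q` and its default α = 0.9 are illustrative assumptions (the patent only requires α < 1), and the e_i,j normalization by 2m is reconstructed to be consistent with a_i = k_i/(2m).

```python
# Sketch of the modularity increment of step (4b), using claims 6-7:
#   e_ij = A_ij / (2m),  a_i = k_i / (2m),
#   dQ = 2*(e_ij - a_i*a_j), sign-flipped when the merged subgraph's
#   same-layer execution-time sum exceeds maxW * alpha.

def delta_q(A_ij, k_i, k_j, m, sum_layer=0.0, max_w=float("inf"), alpha=0.9):
    e_ij = A_ij / (2 * m)                       # claim 6
    a_i, a_j = k_i / (2 * m), k_j / (2 * m)     # claim 7
    dq = 2 * (e_ij - a_i * a_j)
    # parallelism penalty: discourage merges that overload one layer
    # (alpha < 1 is the comparison coefficient of step (4b))
    return -dq if sum_layer > max_w * alpha else dq
```

With A_ij = 4, k_i = 5, k_j = 4 and m = 10, the unpenalized increment is 2·(0.2 − 0.25·0.2) = 0.3; the same merge yields −0.3 once the layer-load check fails.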
CN202210730454.9A 2022-06-24 2022-06-24 Workflow deployment method based on graph segmentation Active CN115080236B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210730454.9A CN115080236B (en) 2022-06-24 2022-06-24 Workflow deployment method based on graph segmentation

Publications (2)

Publication Number Publication Date
CN115080236A CN115080236A (en) 2022-09-20
CN115080236B true CN115080236B (en) 2024-04-16

Family

ID=83255512


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193658A (en) * 2017-05-25 2017-09-22 重庆工程学院 Cloud computing resource scheduling method based on game theory
CN108108225A (en) * 2017-12-14 2018-06-01 长春工程学院 A kind of method for scheduling task towards cloud computing platform
CN109298930A (en) * 2017-07-24 2019-02-01 西安电子科技大学昆山创新研究院 A kind of cloud workflow schedule method and device based on multiple-objection optimization
CN109634742A (en) * 2018-11-15 2019-04-16 华南理工大学 A kind of time-constrain scientific workflow optimization method based on ant group algorithm
CN110008026A (en) * 2019-04-09 2019-07-12 中国科学院上海高等研究院 Job scheduling method, device, terminal and the medium divided equally based on additional budget
WO2021056787A1 (en) * 2019-09-23 2021-04-01 苏州大学 Hybrid cloud service process scheduling method
CN113032155A (en) * 2021-05-25 2021-06-25 深圳大学 Cloud edge resource cooperative scheduling method driven by time-space data visualization task

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10268512B2 (en) * 2016-06-23 2019-04-23 International Business Machines Corporation Optimizing simultaneous startup or modification of inter-dependent machines with specified priorities


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Heba Saleh, "IPSO Task Scheduling Algorithm for Large Scale Data in Cloud Computing Environment", IEEE Access, vol. 7, 2018-12-28, pp. 5412-5420 *
曹书锦, "Research on Optimized Scheduling Algorithms for Deadline-Constrained Scientific Workflows in Cloud Environments" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, No. 02, 2022-02-15, p. I138-3 *
马英红, "Graph-Partition Workflow Deployment Method with Joint Optimization of Communication and Computation" (in Chinese), Journal of Xidian University (online first), 2024-01-04, pp. 1-16 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant