CN115185651A - Workflow optimization scheduling algorithm based on cloud computing

Workflow optimization scheduling algorithm based on cloud computing

Info

Publication number
CN115185651A
Authority
CN
China
Prior art keywords
workflow
algorithm
scheduling
cloud computing
pollination
Prior art date
Legal status
Pending
Application number
CN202210478776.9A
Other languages
Chinese (zh)
Inventor
许海峰 (Xu Haifeng)
谭善鑫 (Tan Shanxin)
刘心彤 (Liu Xintong)
Current Assignee
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Harbin University of Science and Technology filed Critical Harbin University of Science and Technology
Priority to CN202210478776.9A
Publication of CN115185651A
Legal status: Pending

Classifications

    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5072 Grid computing
    • G06N 3/004 Artificial life, i.e. computing arrangements simulating life
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a workflow scheduling optimization method based on cloud computing: an improved differential pollination workflow scheduling algorithm aimed at the structure and characteristics of the cloud computing environment and at the optimized scheduling of workflows within it. The algorithm first analyses the structure and characteristics of the workflow scheduling model and the cloud computing resource model, identifying the two main functions a workflow system in cloud computing must provide: offering resources for tasks to select, and allocating suitable virtual machines to execute the corresponding tasks. On this basis, a three-layer workflow scheduling model based on cloud computing is proposed, the global pollination and local pollination steps of the flower pollination algorithm are improved accordingly, and a differential pollination workflow scheduling algorithm for the cloud computing environment is finally obtained. Compared with traditional single-objective optimization algorithms, the method achieves a clear optimization effect under the same virtual machine constraints, demonstrating a certain feasibility and effectiveness.

Description

Workflow optimization scheduling algorithm based on cloud computing
Technical Field
The invention relates to a workflow-based scheduling optimization method for manufacturing processes and belongs to the field of intelligent computing and scheduling optimization.
Background
The data storage and computation requirements of large enterprises are growing daily; faced with so many data-processing demands, optimizing the workflows inside the enterprise becomes very important. Cloud computing offers low cost, large storage scale, and high computing power, and is therefore being widely applied in many fields.
Cloud computing services, cloud services for short, uniformly manage and schedule computing resources and then provide services to users on demand. A workflow is a collection of tasks in which a task may depend on the execution of one or more other tasks. The set of directed edges between tasks represents these dependencies: whether a task may execute depends on whether both its sequential dependency and its data dependency are satisfied. The sequential dependency requires that all predecessor tasks of the current task have finished executing, and the data dependency requires that the data the task needs has been received after its predecessors complete.
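To make these dependency notions concrete, the following is a minimal sketch, not taken from the patent, of a task DAG with sequential and data dependencies; all class and function names are illustrative assumptions.

```python
# Minimal illustrative sketch: a workflow task DAG with sequential and data
# dependencies, and a readiness check for a task.
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    instructions: int                              # computation weight, in instructions
    parents: list = field(default_factory=list)    # direct predecessors
    children: list = field(default_factory=list)   # direct successors

def add_edge(src: Task, dst: Task, bits: int, edge_weights: dict) -> None:
    """Directed edge src -> dst with a transmission weight measured in bits."""
    src.children.append(dst)
    dst.parents.append(src)
    edge_weights[(src.name, dst.name)] = bits

def is_ready(task: Task, finished: set, received: set) -> bool:
    """Sequential dependency: all predecessors have finished.
    Data dependency: all input data from predecessors has been received."""
    return all(p.name in finished for p in task.parents) and \
           all((p.name, task.name) in received for p in task.parents)
```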
Compared with a traditional distributed system, cloud computing is complex and changeable, so more factors influence the evaluation criteria; if these criteria are not well optimized and balanced, costs rise and the user's use and experience suffer. The invention aims to study an execution-time-optimizing scheduling algorithm for cloud workflows that fully exploits gaps in the task schedule, saves execution time, meets user requirements, and reduces both user cost and provider cost, in pursuit of sustainable development.
Disclosure of Invention
In order to achieve the purpose, the invention provides a differential pollination workflow optimization scheduling algorithm based on cloud computing, which comprises the following steps:
(1) Establish a cloud computing workflow scheduling model according to the workflow execution relations.
(2) Calculate the consumption of workflow execution and write an algorithm that preprocesses the workflow tasks according to priority.
(3) Realize the global and local improvements of the algorithm through the crossover operation, swap mutation, and inverse mutation of the task sequence.
(4) Propose the differential pollination workflow optimization algorithm based on these improvements.
The step (1) is specifically as follows:
(1.1) Create a set of directed edges between tasks according to the task dependencies; the workflow can be represented as a quadruple P = (K, E, Z, X).
(1.2) K denotes the set of tasks, where r_i represents the i-th task, 0 <= i <= N; N is the total number of tasks and E is the set of edges.
(1.3) A directed edge e_ij connects task r_i to task r_j, with 0 <= i, j <= N and i ≠ j, and indicates the dependency between the tasks. The set of all direct predecessor nodes of r_i is parent(r_i), and the set of all its direct successor nodes is child(r_i).
(1.4) The computation weight Z is measured by the number of instructions, and the transmission weight X is measured by the number of bits.
(1.5) Add a virtual start task node r_start and a virtual end task node r_end so that the workflow can be executed while satisfying both the sequential dependency and the data dependency.
(1.6) Create V_m to represent the set of virtual machines; vm_r denotes the r-th virtual machine in V_m.
(1.7) The computing power of a single virtual machine vm_r is denoted PC(vm_r).
(1.8) Establish a three-layer workflow scheduling model based on cloud computing that meets these requirements, consisting of a resource layer, a scheduling layer, and a user layer. A minimal illustrative sketch of this model is given below.
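For illustration, the model of steps (1.1)-(1.8) can be written down as data structures. The sketch below is only an assumption of how such a model might look; the class and field names (Workflow, VirtualMachine, pc, unit_cost) are not taken from the patent.

```python
# Illustrative sketch of the quadruple P = (K, E, Z, X) and the virtual machine set V_m.
# All names and the execution-time definition are assumptions made for illustration.
from dataclasses import dataclass

@dataclass
class Workflow:
    K: list   # task identifiers, including the virtual r_start and r_end
    E: set    # directed edges (r_i, r_j), i != j
    Z: dict   # computation weight per task, in instructions
    X: dict   # transmission weight per edge, in bits

@dataclass
class VirtualMachine:
    name: str
    pc: float          # computing power PC(vm_r), e.g. instructions per second
    unit_cost: float   # execution cost per unit time UC(vm_r)

def execution_time(task: str, wf: Workflow, vm: VirtualMachine) -> float:
    """Illustrative ET(t_i) on vm_r: computation weight divided by computing power."""
    return wf.Z[task] / vm.pc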
The step (2) is specifically as follows:
(2.1) The optimization objective, in keeping with the purpose of the workflow, can be expressed as formula (2-1):
Makespan = minimize(t_end)   (2-1)
(2.2) Calculate the workflow scheduling cost, which consists mainly of the computation overhead of the cloud computing virtual machines; the calculation formula is given by formula (2-2):
[Formula (2-2): total scheduling cost summed over the rented virtual machines; rendered as an image in the original]
(2.3) where UC(vm_r) represents the per-unit-time execution cost of virtual machine vm_r. The energy consumption of a modern processor's integrated circuits comes mainly from the dynamic energy consumption E_dynamic; the model adopts DVFS (Dynamic Voltage and Frequency Scaling) to reduce the processor's dynamic energy consumption, and the dynamic power can be obtained from formula (2-3):
P_dynamic = AC·v²·f   (2-3)
(2.4) where AC is a constant that depends on the device, v denotes the supply voltage, and f denotes the clock frequency. The energy consumption in the cloud computing environment is therefore the sum of the energy consumption of all rented virtual machines and can be calculated by formula (2-4):
[Formula (2-4): total energy consumption summed over all rented virtual machines; rendered as an image in the original]
(2.5) To maximize profit, the cloud service provider also needs to consider resource utilization as a performance parameter, which can be calculated by formula (2-5):
[Formula (2-5): resource utilization, the ratio of task execution time to the total leased time of the virtual machines; rendered as an image in the original]
(2.6) where et_k represents the time period during which a task executes on a virtual machine, and tt_k represents the whole period for which the virtual machine is leased. The higher the utilization, the smaller the idle time of the resource; conversely, more idle time means more wasted resources.
(2.7) To apply the flower pollination algorithm to workflow scheduling on cloud computing, the algorithm must be improved: before scheduling, the priorities of the workflow tasks are computed and the tasks are layered according to priority.
(2.8) Define the upward priority of a task as the longest distance between task t_i and the end task t_end. It is computed from back to front, i.e., recursively from t_end toward the start of the workflow DAG; the upward priority of a task is given by formula (2-6):
β_up(t_i) = max_{t_j ∈ child(t_i)} ( TT(t_i, t_j) + ET(t_i) + β_up(t_j) )   (2-6)
(2.9) Algorithm (1) gives the pseudocode for the priority layering operation of the tasks; an illustrative sketch follows the placeholder below.
[Algorithm (1): priority layering pseudocode; rendered as an image in the original]
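Algorithm (1) appears only as an image in the original. The following hedged Python sketch shows one way the upward-priority computation of formula (2-6) and a priority ordering could be implemented; the base case for the end task, the function names, and the simple sort-based layering are assumptions.

```python
# Hedged sketch of the upward-priority computation (formula (2-6)) and a simple
# priority ordering. ET(t) gives the execution time of t, TT(t, c) the transfer
# time on edge (t, c), and children(t) the direct successors of t.
def upward_priorities(tasks, children, ET, TT):
    beta = {}

    def beta_up(t):
        if t in beta:
            return beta[t]
        succ = children(t)
        if not succ:
            beta[t] = ET(t)   # assumed base case for the end task t_end
        else:
            beta[t] = max(TT(t, c) + ET(t) + beta_up(c) for c in succ)
        return beta[t]

    for t in tasks:
        beta_up(t)
    return beta

def priority_layering(tasks, children, ET, TT):
    """Order tasks by decreasing upward priority; ties keep input order.
    With positive execution times, a parent's upward priority is always larger
    than any of its children's, so this ordering respects the dependencies."""
    beta = upward_priorities(tasks, children, ET, TT)
    return sorted(tasks, key=lambda t: -beta[t])
```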
The step (3) is specifically as follows:
(3.1) Global pollination in the flower pollination algorithm can be represented by the following mathematical formula:
[Global pollination update formula (a Lévy-flight step toward the current best solution); rendered as an image in the original]
(3.2) Because global pollination in nature takes place over long distances, a crossover operation on the task sequence is adopted as the crossover global pollination operation on the flower individuals.
(3.3) Based on the task sequence, the crossover operation is performed on flowers A and B, and the pollen positions 2, 3 and 6 are selected at random.
(3.4) The pollen t_2, t_3 and t_6 selected from flower A corresponds to the pollen t_2, t_3 and t_6 in flower B, which occupy positions 1, 3 and 7 of flower B respectively. The crossover operation rearranges these pollen in flower B according to their order in flower A to obtain a new flower B; the same crossover operation is applied to flower A to obtain a new flower A, giving the crossed versions of flowers A and B.
(3.5) This yields algorithm (2), the crossover global pollination operation; its pseudocode placeholder and an illustrative sketch are given below.
[Algorithm (2): crossover global pollination pseudocode; rendered as images in the original]
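Algorithm (2) likewise appears only as an image. The sketch below implements a crossover on task sequences consistent with steps (3.2)-(3.5); the number of selected positions k and the symmetric treatment of both flowers are assumptions rather than details taken from the patent.

```python
import random

def crossover_global_pollination(flower_a, flower_b, k=3, rng=random):
    """Hedged sketch of crossover global pollination on two task sequences
    (permutations of the same task set). k positions are selected at random in
    flower A; the tasks found there are re-ordered inside flower B so that they
    follow A's order, and symmetrically for A."""
    def reorder(target, donor, selected_tasks):
        # Take the selected tasks in the donor's order and write them back into
        # the target at the positions where the target currently holds them.
        donor_order = [t for t in donor if t in selected_tasks]
        child = list(target)
        slots = [i for i, t in enumerate(target) if t in selected_tasks]
        for i, t in zip(slots, donor_order):
            child[i] = t
        return child

    positions = rng.sample(range(len(flower_a)), k)
    selected = {flower_a[i] for i in positions}
    new_b = reorder(flower_b, flower_a, selected)
    new_a = reorder(flower_a, flower_b, selected)
    return new_a, new_b
```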
(3.6) After the priority layering operation, the scheduling order of the tasks can be changed in small steps or in large steps. For a small step, swap mutation is used: two randomly selected task positions are exchanged (e.g., swap mutation on A and B). For a larger step, inverse mutation is used: two tasks are selected at random and all tasks between them are placed in reverse order (e.g., inverse mutation on A and B).
(3.7) From these two mutation operations, the mutation local pollination operation algorithm is written; an illustrative sketch follows the placeholder below.
[Algorithm (3): mutation local pollination pseudocode; rendered as an image in the original]
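As a hedged illustration of the two mutation operations (the pseudocode is an image in the original), swap mutation and inverse mutation on a task sequence can be sketched as follows. A full implementation would additionally restrict the chosen positions so that the priority-layering constraints of steps (2.7)-(2.9) remain satisfied; that restriction is assumed rather than shown here.

```python
import random

def swap_mutation(seq, rng=random):
    """Small step: exchange two randomly chosen task positions."""
    child = list(seq)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def inverse_mutation(seq, rng=random):
    """Larger step: reverse the whole sub-sequence between two random positions."""
    child = list(seq)
    i, j = sorted(rng.sample(range(len(child)), 2))
    child[i:j + 1] = reversed(child[i:j + 1])
    return child
```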
The step (4) is specifically as follows:
Using the above improvement strategies, the DMFPA algorithm is proposed as follows; an illustrative end-to-end sketch follows the placeholder below.
[DMFPA algorithm pseudocode; rendered as images in the original]
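Because the DMFPA pseudocode is provided only as images, the following is a hedged sketch of the main loop described in the specification (prioritize and layer the workflow, then choose global or local pollination by comparing a random number with the transition probability). It reuses the crossover and mutation sketches above; the parameter defaults and the greedy acceptance rule are assumptions, not taken from the patent.

```python
import random

def dmfpa(population, fitness, transition_prob=0.8, max_iters=200, rng=random):
    """Sketch of the DMFPA main loop. `population` is a list of task sequences
    (assumed already priority-layered); `fitness` is minimized, e.g. makespan,
    scheduling cost, or a weighted combination of formulas (2-1) to (2-5)."""
    pop = [list(ind) for ind in population]
    best = min(pop, key=fitness)
    for _ in range(max_iters):
        for idx, flower in enumerate(pop):
            if rng.random() < transition_prob:
                # Global pollination: crossover with the current best solution.
                child, _ = crossover_global_pollination(flower, best, rng=rng)
            else:
                # Local pollination: swap or inverse mutation.
                mutate = swap_mutation if rng.random() < 0.5 else inverse_mutation
                child = mutate(flower, rng=rng)
            if fitness(child) < fitness(flower):   # greedy acceptance (assumed)
                pop[idx] = child
        best = min(pop, key=fitness)
    return best
```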
drawings
FIG. 1 cloud computing workflow scheduling model
FIG. 2 Crossover operation, swap mutation and inverse mutation
FIG. 3 is a flow chart of the DMFPA algorithm
FIG. 4 Execution time difference ratio relative to GA-Budget
FIG. 5 Scheduling cost difference ratio relative to GA-Deadline
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Step 1: As shown in FIG. 1, the actual workflow relationships are built according to the three-layer workflow model, which performs the following operations. Resource layer: provides resources for the upper-layer applications to select. Scheduling layer: compares and selects different resources for the scheduling of the tasks. User layer: the user sets services according to his or her own requirements and submits the workflow tasks to the scheduling layer.
Step 2: As shown in FIG. 2, the crossover operation, swap mutation, and inverse mutation are performed on the actual input parameters A and B; they constitute the improvement of the global algorithm and of the local algorithm respectively, and aim to improve the fitness value and the quality of the solution so as to obtain the optimal solution.
Step 3: As shown in FIG. 3, the flow chart of the DMFPA algorithm: the workflow is first prioritized and layered; then the global algorithm or the local algorithm is selected for execution by comparing a decision random number with the transition probability; finally, when the number of iterations is reached, the relevant operations are executed and the final scheduling scheme is output.
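Purely for illustration, the sketches above could be exercised as follows; the toy task set, the fitness function, and all numerical values are invented and are not taken from the patent or its experiments.

```python
# Toy usage of the dmfpa() sketch above; everything here is made up for illustration.
import random

tasks = ["t1", "t2", "t3", "t4", "t5", "t6"]
rng = random.Random(42)

def toy_fitness(order):
    # Pretend later positions are more expensive; a real fitness would be the
    # makespan, scheduling cost, or energy from formulas (2-1) to (2-5).
    weights = {"t1": 3, "t2": 5, "t3": 2, "t4": 4, "t5": 1, "t6": 6}
    return sum(weights[t] * (i + 1) for i, t in enumerate(order))

population = [rng.sample(tasks, len(tasks)) for _ in range(10)]
best = dmfpa(population, toy_fitness, transition_prob=0.8, max_iters=50, rng=rng)
print(best, toy_fitness(best))
```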
Different workflow combinations and modeling methods are used in workflow scheduling studies; the workflows used here contain between 10 and 100 tasks, and each workflow has the same selection probability. NodeWeight represents the computational weight of a workflow task and EdgeWeight represents the communication weight between two tasks of the workflow. NodeToEdgeWeight represents the computation-to-communication ratio. The comparison algorithms adopted in the experiment are GA-Deadline, GA-Budget, GA-EC and MOPSO.
The execution time difference ratio of each algorithm under different types of workflows is shown in FIG. 4. As can be seen from the figure, the execution time difference of GA-Budget against itself is 0, which indirectly indicates that GA-Budget has the longest execution time: because GA-Budget tries to select the cheapest computing devices to minimize scheduling cost, its execution time increases significantly. The execution time of DMFPA improves by about 10%-20% compared with GA-EC and GA-Budget respectively. Compared with GA-Budget, the MOPSO algorithm reduces execution time by 3%-18% depending on the workflow structure. In all of these cases, however, the proposed DMFPA algorithm achieves a smaller execution time than MOPSO.
The scheduling cost difference ratio of each algorithm under different types of workflows is shown in FIG. 5. As can be seen from the figure, the scheduling cost difference ratio of GA-Deadline is 0, which indirectly indicates that GA-Deadline has the shortest execution time: GA-Deadline tries to select the highest-priced, best-performing computing devices to minimize execution time, which makes its scheduling cost rise significantly. Compared with GA-EC and GA-Deadline, the DMFPA algorithm saves 20%-30% of the scheduling cost respectively. In the LNodeHEdge case the scheduling cost difference ratios of the proposed DMFPA and of MOPSO are the same; in the LNodeHEdge and HNodeHEdge cases the proposed DMFPA is superior to MOPSO; and in the RNodeHEdge and HNodeHEdge cases MOPSO is superior to the proposed DMFPA.
After the algorithms are executed, Table 1 shows that, under the deadline and budget limits proposed by the user, the proposed DMFPA algorithm obtains relatively smaller execution time and scheduling cost than the GA-based algorithms in most cases. DMFPA can schedule 90% of the workflows to completion within the specified budget and 84% within the specified deadline; these data show that DMFPA has a certain advantage over MOPSO under deadline and budget constraints.
[Table 1: execution time and scheduling cost of each algorithm under the user's deadline and budget constraints; rendered as an image in the original]
As can be seen from Table 2, the resource utilization rate after each algorithm is executed lies between 40.2% and 46%, which is a good level of utilization when execution time, scheduling cost, and energy consumption are considered together. The complexity of the proposed DMFPA algorithm is made up of the following parts: population initialization, fitness evaluation, transition probability, and global and local pollination. Among these, the fitness evaluation is the most important part of the algorithm. To generate the initial population at random, let N be the number of tasks, M the number of virtual machines, and P the population size; the complexity of mapping tasks to virtual machines is O(N + P + 1), the complexity of the execution-time fitness value is O(NM + NM² + M²), the complexity of the scheduling-cost fitness value is O(NM), the complexity of the energy-consumption fitness value is O(NMh), and the complexity of global and local pollination is O(P). For the overall DMFPA algorithm, the time complexity is therefore O(N + P + NM² + M² + NMh).
[Table 2: resource utilization of each algorithm; rendered as an image in the original]
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single technical solution; this manner of description is adopted only for clarity, and those skilled in the art should treat the specification as a whole, with the embodiments combinable as appropriate to form other embodiments that those skilled in the art can understand.

Claims (4)

1. A workflow optimization scheduling algorithm based on cloud computing, comprising the following specific steps:
(1) Establishing a cloud computing workflow scheduling model according to the workflow execution relations.
(2) Preprocessing the tasks according to their priorities prior to workflow execution.
(3) Improving the global algorithm and the local algorithm of the flower pollination algorithm.
2. The workflow optimization scheduling algorithm based on cloud computing of claim 1, wherein step (1) is specifically: the workflow system in the cloud environment is decomposed into two stages, a resource providing stage and a resource scheduling stage, which correspond respectively to the resource layer and the scheduling layer of the three-layer model and whose functions are, respectively, to provide the various computing resources and to compare and select different resources for task scheduling. The uppermost user layer is added to the three-layer model, and the workflow tasks are submitted to the scheduling layer according to the QoS required by the user.
3. The workflow optimization scheduling algorithm based on cloud computing of claim 2, wherein step (2) is specifically: first, the execution time of the tasks, the cost and energy consumption of the computing tasks, and the resource utilization rate are considered, and the workflow to be executed is preprocessed according to these parameters without destroying the interdependence among the tasks.
4. The workflow optimization scheduling algorithm based on cloud computing of claim 3, wherein step (3) is specifically: a crossover operation is performed on the task sequence of global pollination in the flower pollination algorithm to obtain a better new solution and fitness function value. Swap mutation or inverse mutation is proposed for local pollination in the flower pollination algorithm; because the task priorities have been layered, the constraint conditions of the tasks are still satisfied when swap mutation and inverse mutation are executed, so a better solution can be obtained.
CN202210478776.9A 2022-04-29 2022-04-29 Workflow optimization scheduling algorithm based on cloud computing Pending CN115185651A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210478776.9A CN115185651A (en) 2022-04-29 2022-04-29 Workflow optimization scheduling algorithm based on cloud computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210478776.9A CN115185651A (en) 2022-04-29 2022-04-29 Workflow optimization scheduling algorithm based on cloud computing

Publications (1)

Publication Number Publication Date
CN115185651A true CN115185651A (en) 2022-10-14

Family

ID=83512404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210478776.9A Pending CN115185651A (en) 2022-04-29 2022-04-29 Workflow optimization scheduling algorithm based on cloud computing

Country Status (1)

Country Link
CN (1) CN115185651A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116050760A (en) * 2022-12-31 2023-05-02 上海交通大学 Multi-energy-source junction collaborative planning method and equipment based on internal structure layering


Similar Documents

Publication Publication Date Title
Chen et al. Efficient task scheduling for budget constrained parallel applications on heterogeneous cloud computing systems
Tan et al. A trust service-oriented scheduling model for workflow applications in cloud computing
Abrishami et al. Cost-driven scheduling of grid workflows using partial critical paths
Fard et al. A multi-objective approach for workflow scheduling in heterogeneous environments
CN101237469B (en) Method for optimizing multi-QoS grid workflow based on ant group algorithm
Zuo et al. A multi-objective hybrid cloud resource scheduling method based on deadline and cost constraints
CN109800071A (en) A kind of cloud computing method for scheduling task based on improved adaptive GA-IAGA
Wang et al. Makespan-driven workflow scheduling in clouds using immune-based PSO algorithm
CN106055395A (en) Method for constraining workflow scheduling in cloud environment based on ant colony optimization algorithm through deadline
CN110347504B (en) Many-core computing resource scheduling method and device
Wu et al. A multi-model estimation of distribution algorithm for energy efficient scheduling under cloud computing system
Chakravarthi et al. TOPSIS inspired budget and deadline aware multi-workflow scheduling for cloud computing
Arabnejad et al. Multi-QoS constrained and profit-aware scheduling approach for concurrent workflows on heterogeneous systems
CN103279818A (en) Method for cloud workflow scheduling based on heuristic genetic algorithm
Li et al. Fast and energy-aware resource provisioning and task scheduling for cloud systems
Zhou et al. Concurrent workflow budget-and deadline-constrained scheduling in heterogeneous distributed environments
Subramoney et al. Multi-swarm PSO algorithm for static workflow scheduling in cloud-fog environments
CN109710372A (en) A kind of computation-intensive cloud workflow schedule method based on cat owl searching algorithm
Wang et al. Dynamic multiworkflow deadline and budget constrained scheduling in heterogeneous distributed systems
CN115185651A (en) Workflow optimization scheduling algorithm based on cloud computing
Fard et al. Budget-constrained resource provisioning for scientific applications in clouds
CN110519386A (en) Elastic resource supply method and device based on data clustering in cloud environment
CN112231081B (en) PSO-AHP-based monotonic rate resource scheduling method and system in cloud environment
CN106802822A (en) A kind of cloud data center cognitive resources dispatching method based on moth algorithm
Ranjan et al. SLA-based coordinated superscheduling scheme for computational Grids

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination