CN115185651A - Workflow optimization scheduling algorithm based on cloud computing - Google Patents
- Publication number
- CN115185651A (application CN202210478776.9A)
- Authority
- CN
- China
- Prior art keywords
- workflow
- algorithm
- scheduling
- cloud computing
- pollination
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a workflow scheduling optimization method based on cloud computing: an improved differential pollination workflow scheduling algorithm aimed at the structure and characteristics of the cloud computing environment and at workflow optimization scheduling within it. The algorithm first analyzes the structures and characteristics of the workflow scheduling model and the cloud computing resource model, and identifies the two main functions that a workflow system in cloud computing provides: offering resources for task selection, and allocating suitable virtual machines to execute the corresponding tasks. On this basis, a three-layer workflow scheduling model based on cloud computing is proposed, the global pollination and local pollination of the flower pollination algorithm are correspondingly improved, and finally a differential pollination workflow scheduling algorithm for the cloud computing environment is presented. Compared with traditional single-objective optimization algorithms, the proposed method achieves a clear optimization effect under the same virtual machine constraints. The invention therefore has certain feasibility and effectiveness.
Description
Technical Field
The invention relates to a workflow-based manufacturing process scheduling optimization method, and belongs to the fields of intelligent computing and scheduling optimization.
Background
The data storage and computation requirements of large enterprises grow day by day, and in the face of so many data processing demands, workflow optimization within the enterprise becomes very important. Cloud computing offers low cost, large storage scale and high computing power, and is therefore widely applied across many fields.
Cloud computing services, cloud services for short, uniformly manage and schedule computing resources and then provide services to users on demand. A workflow is a collection of tasks in which a task may depend on the execution of one or more other tasks. The set of directed edges between tasks in the workflow represents these dependencies. Whether a task may execute depends on whether its dependencies satisfy both sequential dependency and data dependency: sequential dependency requires that all predecessor tasks of the current task have finished executing, and data dependency requires that the data a task needs for execution has been received once its predecessors finish.
Compared with traditional distributed systems, cloud computing is complex and changeable, so the factors influencing the evaluation criteria increase; if these criteria cannot be optimized and balanced well, costs rise and the user's experience suffers. The invention studies a cloud workflow execution-time optimization scheduling algorithm that fully exploits gaps in task scheduling time, saves execution time, meets user requirements, and reduces both user cost and provider cost, achieving sustainable development.
Disclosure of Invention
In order to achieve this purpose, the invention provides a differential pollination workflow optimization scheduling algorithm based on cloud computing, which comprises the following steps:
(1) Establish a cloud computing workflow scheduling model according to the workflow execution relations.
(2) Calculate the consumption of workflow execution, and write an algorithm that preprocesses the workflow tasks according to priority.
(3) Realize the global and local improvements of the algorithm through crossover of the task sequence, swap mutation and inverse mutation.
(4) Propose the differential pollination workflow optimization algorithm based on these improvements.
The step (1) is specifically as follows:
(1.1) Create the set of directed edges between tasks according to the task dependencies; the workflow can be represented as a quadruple P = (K, E, Z, X).
(1.2) Tasks are represented by r_i, 0 <= i <= N, where N is the total number of tasks and E the total number of edges.
(1.3) A directed edge e_ij connects task r_i and task r_j, with 0 <= i, j <= N and i ≠ j, indicating a dependency between the tasks. The set of all direct predecessor nodes of k_i is parent(k_i), and the set of all direct successor nodes is child(k_i).
(1.4) The computation weight Z is measured by instruction count, and the transmission weight X is measured in bits.
(1.5) Add a virtual start task node r_start and a virtual end task node r_end so that the workflow executes while satisfying both sequential dependency and data dependency.
(1.6) V_m denotes the set of virtual machines, and vm_r denotes the r-th virtual machine in V_m.
(1.7) The computing power of a single virtual machine vm_r is denoted PC(vm_r).
(1.8) Establish a cloud computing-based three-layer workflow scheduling model meeting the above requirements, comprising a resource layer, a scheduling layer and a user layer.
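The quadruple model of steps (1.1)-(1.7) can be sketched as a small Python class; all names here (WorkflowDAG, add_virtual_endpoints, and so on) are illustrative, not taken from the patent text:

```python
from collections import defaultdict

class WorkflowDAG:
    """Minimal sketch of the quadruple P = (K, E, Z, X): tasks K, directed
    edges E, computation weights Z (instruction counts) and transmission
    weights X (bits)."""

    def __init__(self):
        self.tasks = set()
        self.children = defaultdict(set)   # child(k_i)
        self.parents = defaultdict(set)    # parent(k_i)
        self.comp_weight = {}              # Z: instructions per task
        self.trans_weight = {}             # X: bits on edge (i, j)

    def add_task(self, t, comp_weight):
        self.tasks.add(t)
        self.comp_weight[t] = comp_weight

    def add_edge(self, src, dst, bits):
        self.children[src].add(dst)
        self.parents[dst].add(src)
        self.trans_weight[(src, dst)] = bits

    def add_virtual_endpoints(self):
        """Add r_start before all entry tasks and r_end after all exit
        tasks, both with zero computation and transmission weight."""
        entries = [t for t in self.tasks if not self.parents[t]]
        exits = [t for t in self.tasks if not self.children[t]]
        self.add_task("r_start", 0)
        self.add_task("r_end", 0)
        for t in entries:
            self.add_edge("r_start", t, 0)
        for t in exits:
            self.add_edge(t, "r_end", 0)
```

The virtual endpoints ensure that scheduling always starts from a single entry node and finishes at a single exit node, matching step (1.5).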
The step (2) is specifically as follows:
(2.1) The optimization objective of the workflow can be expressed as formula (2-1):
Makespan = minimize(t_end)   (2-1)
(2.2) Calculate the workflow scheduling cost, which mainly consists of the computation overhead of the cloud virtual machines; the calculation formula is given by formula (2-2):
(2.3) where UC(vm_r) represents the unit execution cost of virtual machine vm_r. The energy consumption of a modern processor's integrated circuits comes mainly from the dynamic energy consumption E_dynamic; the design model adopts DVFS (Dynamic Voltage and Frequency Scaling) technology to reduce the processor's dynamic energy consumption, whose power can be obtained from formula (2-3):
P_dynamic = ACv^2 f   (2-3)
(2.4) where AC is a device-dependent constant, v denotes the supply voltage, and f denotes the clock frequency. The energy consumption in the cloud computing environment is therefore the sum of the energy consumption of all leased virtual machines, and can be calculated by formula (2-4):
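Equation (2-3) and the summation described in (2.4) can be illustrated as follows. Since formulas (2-2) and (2-4) are not reproduced in the text, the per-VM energy term (dynamic power multiplied by busy time) is an assumption, and all names are illustrative:

```python
def dynamic_power(ac, voltage, freq):
    """P_dynamic = A*C*v^2*f (equation 2-3); `ac` folds the
    device-dependent constant AC into a single parameter."""
    return ac * voltage ** 2 * freq

def total_energy(vms):
    """Sum of per-VM dynamic energy over each VM's busy time -- an
    assumed reading of formula (2-4), which the text does not show.
    Each `vm` is a dict with keys ac, v, f and busy_time."""
    return sum(dynamic_power(vm["ac"], vm["v"], vm["f"]) * vm["busy_time"]
               for vm in vms)
```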
(2.5) In order to maximize profit, the cloud service provider also needs to consider the resource utilization performance parameter, which can be calculated by formula (2-5):
(2.6) where et_k represents the time period during which a task executes on a virtual machine, and tt_k represents the whole period for which the virtual machine is leased. The higher the ratio, the less idle time the resource has; conversely, more idle time means more wasted resources.
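Since formula (2-5) itself is not reproduced, the following sketch assumes the natural reading of step (2.6): utilization is the sum of task execution periods et_k divided by the sum of leased periods tt_k.

```python
def resource_utilization(task_periods, lease_periods):
    """Assumed reading of formula (2-5): total task execution time
    (sum of et_k) over total leased virtual machine time (sum of tt_k).
    A value near 1 means little idle, wasted resource time."""
    return sum(task_periods) / sum(lease_periods)
```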
(2.7) To apply the flower pollination algorithm to workflow scheduling on cloud computing, the algorithm needs to be improved: before scheduling, priorities are calculated for the tasks of the workflow, and the tasks are layered according to priority.
(2.8) Define the upward priority of a task as the longest distance between task t_i and the end task t_end. It is computed from back to front, i.e., starting from t_end and recursing forward through the workflow DAG until the start task is reached; the upward priority of a task can then be calculated by formula (2-6):
β_up(t_i) = max(TT(t_i, t_j) + ET(t_i) + β_up(t_j)), t_j ∈ child(t_i)   (2-6)
(2.9) Write algorithm (1): pseudo code for the priority layering operation of the tasks.
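Equation (2-6) lends itself to a memoized recursion from t_end backwards. A sketch, assuming β_up(t_end) = 0 (the base case is not stated explicitly in the text) and with illustrative names:

```python
def upward_priority(children, exec_time, trans_time, t_end="t_end"):
    """Compute beta_up for every task (equation 2-6):
    beta_up(t_i) = max over t_j in child(t_i) of
                   TT(t_i, t_j) + ET(t_i) + beta_up(t_j),
    working backwards from t_end. `children` maps task -> set of
    successors, `exec_time` is ET, `trans_time` is TT keyed by edge."""
    memo = {t_end: 0.0}  # assumed base case

    def beta(t):
        if t not in memo:
            memo[t] = max(trans_time[(t, c)] + exec_time[t] + beta(c)
                          for c in children[t])
        return memo[t]

    for t in children:
        beta(t)
    return memo
```

Layering then amounts to sorting tasks by descending β_up, which preserves the precedence constraints when mutation operators are later applied.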
The step (3) is specifically as follows:
(3.1) Global pollination in the flower pollination algorithm can be represented by the mathematical formula (2-1):
(3.2) Because the distance covered by biological global pollination is long, a crossover operation on the task sequence is adopted to perform cross global pollination on the flower individuals.
(3.3) Based on the task sequence, the crossover operation is performed on flowers A and B, and pollen positions 2, 3 and 6 are randomly selected.
(3.4) The pollens t_2, t_3 and t_6 selected in flower A are located at positions 1, 3 and 7 of flower B; flower B is rearranged so that these pollens follow the same order as in flower A, yielding a new flower B. The same crossover operation is applied to flower A in the same way to obtain a new flower A, giving the crossed versions of both flowers.
(3.5) This yields algorithm (2): pseudo code for the cross global pollination operation.
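The crossover of steps (3.2)-(3.4) can be sketched for permutation-encoded task sequences. The symmetric treatment of both flowers and the helper names are assumptions, since the patent only shows the operation by example:

```python
import random

def cross_pollinate(a, b, positions=None, rng=random):
    """Cross global pollination on two task sequences: pick random
    positions in flower `a`, take the tasks (pollens) found there, and
    rewrite those tasks' positions in the other flower so they appear
    in the same relative order as in the pattern flower. Both outputs
    remain valid permutations of the tasks."""
    if positions is None:
        positions = rng.sample(range(len(a)), 3)
    chosen = {a[i] for i in positions}

    def transfer(pattern, target):
        order = [t for t in pattern if t in chosen]           # pattern's order
        slots = [i for i, t in enumerate(target) if t in chosen]
        out = list(target)
        for slot, task in zip(slots, order):
            out[slot] = task
        return out

    return transfer(b, a), transfer(a, b)  # (new A, new B)
```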
(3.6) After the priority layering of the tasks, the scheduling order can be changed either in small steps or in large steps. For a small step, swap mutation is used: two randomly selected task positions exchange their tasks (e.g., swap mutation of A and B). For a large step, inverse mutation is used: two tasks are randomly selected and all tasks between them are placed in reverse order (e.g., inverse mutation of A and B).
(3.7) Based on these two mutation operations, write the mutation local pollination operation algorithm.
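The two mutation operators of step (3.6) can be sketched as follows; the function names and the optional index parameters are illustrative:

```python
import random

def swap_mutation(seq, i=None, j=None, rng=random):
    """Small-step local pollination: swap the tasks at two randomly
    selected positions."""
    if i is None or j is None:
        i, j = rng.sample(range(len(seq)), 2)
    out = list(seq)
    out[i], out[j] = out[j], out[i]
    return out

def inverse_mutation(seq, i=None, j=None, rng=random):
    """Large-step local pollination: reverse the whole sub-sequence
    between two randomly selected positions, inclusive."""
    if i is None or j is None:
        i, j = sorted(rng.sample(range(len(seq)), 2))
    out = list(seq)
    out[i:j + 1] = reversed(out[i:j + 1])
    return out
```

Because tasks were layered by priority beforehand, both operators can be restricted to positions within one layer so the precedence constraints stay satisfied, as the claims note.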
The step (4) is specifically as follows:
With the above improvement strategies, the DMFPA algorithm is proposed as follows:
drawings
FIG. 1 cloud computing workflow scheduling model
FIG. 2 Crossover operation, swap mutation and inverse mutation
FIG. 3 is a flow chart of the DMFPA algorithm
FIG. 4 GA-Budget based execution time difference ratio
FIG. 5 GA-Deadline-based scheduling cost difference ratio
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Step 1: As shown in fig. 1, the actual workflow relationship is created according to the three-layer workflow model, whose layers perform the following operations. Resource layer: provides resources for the upper application layers to select. Scheduling layer: carries out the comparison and selection of different resources for task scheduling. User layer: the user configures services according to his own requirements and submits the workflow tasks to the scheduling layer.
Step 2: As shown in fig. 2, the crossover operation, swap mutation and inverse mutation are performed on the actual input parameters A and B; these constitute the improvement of the global algorithm and of the local algorithm respectively, and aim to improve the fitness value and the quality of the solution so as to obtain the optimal solution.
Step 3: Fig. 3 shows the flow chart of the DMFPA algorithm. The workflow is first prioritized and layered; then either the global algorithm or the local algorithm is selected for execution according to a comparison between a decision random number and the transition probability; finally, once the iteration count is satisfied, the relevant operations are executed and the final scheduling scheme is output.
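The loop just described can be sketched as a skeleton. The greedy acceptance rule and all parameter names are assumptions; `global_op` and `local_op` stand for the crossover global pollination and mutation local pollination operators from the improvement step:

```python
import random

def dmfpa(population, fitness, iterations=100, switch_prob=0.8,
          global_op=None, local_op=None, rng=random):
    """Skeleton of the DMFPA flow: for each flower, draw a random
    number and compare it with the transition probability to choose
    crossover global pollination (toward the current best) or mutation
    local pollination, keep the better of old and new, and return the
    best schedule found after the iteration count is satisfied."""
    best = min(population, key=fitness)
    for _ in range(iterations):
        new_pop = []
        for flower in population:
            if rng.random() < switch_prob:
                candidate = global_op(flower, best)   # global pollination
            else:
                candidate = local_op(flower)          # local pollination
            # greedy acceptance: keep whichever has the better fitness
            new_pop.append(min(flower, candidate, key=fitness))
        population = new_pop
        best = min(population, key=fitness)
    return best
```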
Different workflow combinations and modeling methods are used in workflow scheduling research. The workflows used here contain between 10 and 100 tasks, and each workflow has the same selection probability. NodeWeight represents the computational weight of a workflow task, EdgeWeight represents the communication weight between two tasks of the workflow, and NodeToEdgeWeight represents the computation-to-communication ratio. The comparison algorithms adopted in the experiment are GA-Deadline, GA-Budget, GA-EC and MOPSO.
Fig. 4 shows the execution time difference ratio of each algorithm under different types of workflows. As can be seen from the figure, the execution time difference ratio of GA-Budget is 0, which indirectly indicates that GA-Budget has the longest execution time: GA-Budget tries to select the cheapest computing devices to execute tasks in order to minimize scheduling cost, with the result that execution time increases significantly. Compared with GA-EC and GA-Budget respectively, DMFPA improves execution time by about 10%-20%. Compared with GA-Budget, the MOPSO algorithm reduces execution time by 3%-18% depending on the workflow structure. In all of these cases, however, the proposed DMFPA algorithm executes in less time than the MOPSO algorithm.
Fig. 5 shows the scheduling cost difference ratio of each algorithm under different types of workflows. As can be seen from the figure, the scheduling cost difference ratio of GA-Deadline is 0, which indirectly indicates that GA-Deadline has the shortest execution time: GA-Deadline tries to select the highest-priced, highest-performance computing devices to execute tasks in order to minimize execution time, which causes the scheduling cost to rise significantly. Compared with GA-EC and GA-Deadline respectively, the DMFPA algorithm saves 20%-30% of the scheduling cost. In the LNodeHEdge case the scheduling cost difference ratio of the proposed DMFPA and the MOPSO algorithm is the same; in the LNodeHEdge and HNodeHEdge cases the proposed DMFPA is superior to MOPSO, while in the HNodeLEdge and HNodeHEdge cases MOPSO is superior to the proposed DMFPA.
After each algorithm execution, it can be seen from table 1 that under the limitation of deadline and budget proposed by the user, the proposed DMFPA algorithm can obtain relatively less execution time and scheduling cost compared with the GA-based algorithm in most cases. The algorithm DMFPA can schedule 90% of the workflow to be completed within the specified budget, and can schedule 84% of the workflow to be completed within the specified deadline, and the data shows that the DMFPA algorithm has certain advantages over the algorithm MOPSO under the limit conditions of the deadline and the budget.
As can be seen from table 2, the resource utilization rate of the proposed DMFPA algorithm after execution is between 40.2% and 46%, a good utilization rate given that execution time, scheduling cost and energy consumption are optimized simultaneously. The complexity of the proposed DMFPA algorithm is determined by the following parts: population initialization, fitness-function evaluation, transition probability, and global and local pollination. Among these, the fitness-function evaluation is the most important part of the algorithm. With N tasks, M virtual machines and a population of size P, randomly generating the initial population and mapping tasks to virtual machines has complexity O(N + P + 1); the fitness evaluation of execution time has complexity O(NM + NM^2 + M^2), that of scheduling cost O(NM), and that of energy consumption O(NMh); global and local pollination have complexity O(P). For the overall DMFPA algorithm the time complexity is therefore O(N + P + NM^2 + M^2 + NMh).
Furthermore, it should be understood that although the specification describes embodiments, not every embodiment includes only a single embodiment, and such description is for clarity purposes only, and it will be understood by those skilled in the art that the specification as a whole and the embodiments may be combined as appropriate to form other embodiments as would be understood by those skilled in the art.
Claims (4)
1. A workflow optimization scheduling algorithm based on cloud computing comprises the following specific steps:
(1) Establish a cloud computing workflow scheduling model according to the workflow execution relations.
(2) Preprocess the tasks according to their priorities prior to workflow execution.
(3) Improve the global algorithm and the local algorithm of the flower pollination algorithm.
2. The workflow optimization scheduling algorithm based on cloud computing of claim 1, wherein: the step (1) is specifically as follows: the workflow system under the cloud environment is decomposed into two stages: the resource providing stage and the resource scheduling stage are respectively corresponding to a resource layer and a scheduling layer in the three-layer model, and the functions of the resource providing stage and the resource scheduling stage are respectively the processes of providing various computing resources and performing comparison and selection of different resources for task scheduling. And adding the uppermost user layer in the three-layer model, and submitting the workflow task to the scheduling layer according to the QoS required by the user.
3. The workflow optimization scheduling algorithm based on cloud computing according to claim 2, wherein the step (2) is specifically as follows: first, the execution time of the tasks, the cost and energy consumption of executing them, and the resource utilization rate are considered; the workflow is then preprocessed according to these parameters and the workflow to be executed, without destroying the interdependence among the tasks.
4. The workflow optimization scheduling algorithm based on cloud computing according to claim 3, wherein the step (3) is specifically as follows: a crossover operation is performed on the task sequence of global pollination in the flower pollination algorithm to obtain a better new solution and fitness-function value. Swap mutation or inverse mutation is applied to local pollination in the flower pollination algorithm; because the tasks have been layered by priority, the constraint conditions of the tasks remain satisfied when swap mutation and inverse mutation are executed, so that a better solution can be obtained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210478776.9A CN115185651A (en) | 2022-04-29 | 2022-04-29 | Workflow optimization scheduling algorithm based on cloud computing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210478776.9A CN115185651A (en) | 2022-04-29 | 2022-04-29 | Workflow optimization scheduling algorithm based on cloud computing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115185651A true CN115185651A (en) | 2022-10-14 |
Family
ID=83512404
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210478776.9A Pending CN115185651A (en) | 2022-04-29 | 2022-04-29 | Workflow optimization scheduling algorithm based on cloud computing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115185651A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116050760A (en) * | 2022-12-31 | 2023-05-02 | 上海交通大学 | Multi-energy-source junction collaborative planning method and equipment based on internal structure layering |
-
2022
- 2022-04-29 CN CN202210478776.9A patent/CN115185651A/en active Pending
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116050760A (en) * | 2022-12-31 | 2023-05-02 | 上海交通大学 | Multi-energy-source junction collaborative planning method and equipment based on internal structure layering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | Efficient task scheduling for budget constrained parallel applications on heterogeneous cloud computing systems | |
Tan et al. | A trust service-oriented scheduling model for workflow applications in cloud computing | |
Abrishami et al. | Cost-driven scheduling of grid workflows using partial critical paths | |
Fard et al. | A multi-objective approach for workflow scheduling in heterogeneous environments | |
CN101237469B (en) | Method for optimizing multi-QoS grid workflow based on ant group algorithm | |
Zuo et al. | A multi-objective hybrid cloud resource scheduling method based on deadline and cost constraints | |
CN109800071A (en) | A kind of cloud computing method for scheduling task based on improved adaptive GA-IAGA | |
Wang et al. | Makespan-driven workflow scheduling in clouds using immune-based PSO algorithm | |
CN106055395A (en) | Method for constraining workflow scheduling in cloud environment based on ant colony optimization algorithm through deadline | |
CN110347504B (en) | Many-core computing resource scheduling method and device | |
Wu et al. | A multi-model estimation of distribution algorithm for energy efficient scheduling under cloud computing system | |
Chakravarthi et al. | TOPSIS inspired budget and deadline aware multi-workflow scheduling for cloud computing | |
Arabnejad et al. | Multi-QoS constrained and profit-aware scheduling approach for concurrent workflows on heterogeneous systems | |
CN103279818A (en) | Method for cloud workflow scheduling based on heuristic genetic algorithm | |
Li et al. | Fast and energy-aware resource provisioning and task scheduling for cloud systems | |
Zhou et al. | Concurrent workflow budget-and deadline-constrained scheduling in heterogeneous distributed environments | |
Subramoney et al. | Multi-swarm PSO algorithm for static workflow scheduling in cloud-fog environments | |
CN109710372A (en) | A kind of computation-intensive cloud workflow schedule method based on cat owl searching algorithm | |
Wang et al. | Dynamic multiworkflow deadline and budget constrained scheduling in heterogeneous distributed systems | |
CN115185651A (en) | Workflow optimization scheduling algorithm based on cloud computing | |
Fard et al. | Budget-constrained resource provisioning for scientific applications in clouds | |
CN110519386A (en) | Elastic resource supply method and device based on data clustering in cloud environment | |
CN112231081B (en) | PSO-AHP-based monotonic rate resource scheduling method and system in cloud environment | |
CN106802822A (en) | A kind of cloud data center cognitive resources dispatching method based on moth algorithm | |
Ranjan et al. | SLA-based coordinated superscheduling scheme for computational Grids |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |