CN114925935A - Multi-workflow scheduling method for time delay constraint in cloud edge environment - Google Patents
- Publication number
- CN114925935A CN114925935A CN202210702160.5A CN202210702160A CN114925935A CN 114925935 A CN114925935 A CN 114925935A CN 202210702160 A CN202210702160 A CN 202210702160A CN 114925935 A CN114925935 A CN 114925935A
- Authority
- CN
- China
- Prior art keywords
- server
- task
- workflow
- time
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0633—Workflow analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a deadline-constrained multi-workflow scheduling method for cloud-edge environments that minimizes the execution cost of multiple workflows with a differential evolution algorithm while satisfying the deadline of every workflow. To improve the rationality and diversity of the population during evolution, individuals are encoded as two-dimensional discrete particles, and the differential evolution algorithm is optimized with a selection operator that acts on the whole population, so that the fitness of the population improves faster and the algorithm searches the solution space more quickly while avoiding premature convergence.
Description
Technical Field
The invention belongs to the technical field of cloud computing and edge computing, and particularly relates to a time delay constrained multi-workflow scheduling method in a cloud edge environment.
Background
With the rapid development of technologies such as 5G and artificial intelligence, the market has grown steadily and the volume of heterogeneous data to be processed has exploded, while users increasingly demand "instant" interactive services. Traditional cloud computing cannot meet these growing demands: because all data must be uploaded to a cloud data center for processing, the geographic distribution of data centers causes high transmission delay, and the lack of data preprocessing puts enormous pressure on network bandwidth. To address these pain points of traditional cloud services, edge computing, characterized by low transmission delay and low network bandwidth pressure, has emerged rapidly. Edge computing is a new computing paradigm that processes data using information service resources at the edge of the core network, sinking computing resources and services toward the edge and therefore closer to the user. In computing capability, edge resources and services are weaker than the cloud but clearly stronger than terminals, which effectively mitigates the limited computing power of terminal devices. In transmission delay, because edge service nodes are close to users, data that does not require heavy computing resources can be processed quickly at an edge node, which greatly reduces the volume of data sent to the cloud, relieves congestion in the data transmission network, and significantly cuts the energy consumption of network-edge terminals.
To combine the respective advantages of cloud and edge platforms, the computing community has proposed a new mode built on both, namely cloud-edge collaborative computing: processing of data with low computing-resource requirements is pushed to the edge of the Internet, near the data source, while data with high computing-resource requirements is handed to the central cloud. This raises computing density, reduces delay as much as possible, and effectively improves the availability and scalability of application systems. As an important research problem in the cloud-edge environment, task scheduling optimization directly affects the service efficiency of cloud and edge platform resources and the user's service experience, which poses new requirements and new challenges for scheduling research.
In a cloud-edge environment, the structural complexity of workflows and the data dependencies among subtasks make it challenging to complete a workflow within a reasonable deadline even on high-performance infrastructure. Workflow execution also requires a large amount of cross-server data transmission, which conflicts sharply with the limited network bandwidth between servers and leads to serious transmission delay and high execution cost. Therefore, reasonably scheduling workflows under an appropriate deadline constraint in the cloud-edge environment can reduce workflow completion time, improve resource utilization, and effectively cut the cost users pay to execute their workflows.
Disclosure of Invention
To fill the gaps and remedy the shortcomings of the prior art, the invention provides a deadline-constrained multi-workflow scheduling method for cloud-edge environments. Subject to the deadline of every workflow, the execution cost of the multi-workflow application is minimized with a differential evolution algorithm. To improve the rationality and diversity of population evolution, individuals are encoded as two-dimensional discrete particles, and the basic differential evolution algorithm is optimized with a selection operator over the whole population, so that the fitness of the population improves faster and the solution space is searched more quickly while premature convergence is avoided. Multiple groups of comparative simulation experiments show that the differential-evolution-based multi-workflow scheduling algorithm outperforms other scheduling algorithms across deadlines and multi-workflow scales, and effectively reduces the execution cost of multiple workflows in a cloud-edge environment.
The invention specifically adopts the following technical scheme:
a deadline-constrained multi-workflow scheduling method in a cloud-edge environment, characterized in that: subject to the deadline of every workflow, the execution cost of the multiple workflows is minimized with a differential evolution algorithm; to improve the rationality and diversity of population evolution, individuals are encoded as two-dimensional discrete particles, and the differential evolution algorithm is optimized with a selection operator over the whole population, so that the fitness of the population improves faster and the solution space is searched more quickly while premature convergence is avoided.
Further, the multi-workflow deadline-constrained optimization problem is expressed as:

min c_e, s.t. t_i^f ≤ d_i for every w_i ∈ W

wherein c_e is the execution cost of the multi-workflow application and t_i^f is the element of the multi-workflow completion-time set T_f belonging to workflow w_i.
Further, the construction process of the multi-workflow deadline constraint representation is as follows:
assuming that the time intervals at which users submit different workflows to the cloud-edge environment, i.e. the arrival times of different workflows, approximately obey a Poisson distribution P(λ), where λ is the arrival rate of workflows, these workflows are represented by an infinite set:

W = {w_1, w_2, …}   formula (1)

wherein each workflow is represented by a triplet:

w_i = (α_i, d_i, G_i)   formula (2)

whose elements are, in order, the arrival time α_i, the deadline d_i, and the structure G_i;

the structure of a workflow is represented by a directed acyclic graph:

G_i = {T_i, E_i}   formula (3)

wherein

T_i = {t_{i,1}, t_{i,2}, …, t_{i,N}}   formula (4)

is the task set and N is the number of tasks; t_{i,j} is the j-th task in the i-th workflow; E_i is the set of edges between tasks; a directed edge (t_{i,p}, t_{i,j}) ∈ E_i denotes data transmission between t_{i,p} and t_{i,j}, i.e. t_{i,p} is a predecessor task of t_{i,j} and t_{i,j} is a successor task of t_{i,p};

pre(t_{i,j}) = {t_{i,p} | (t_{i,p}, t_{i,j}) ∈ E_i}   formula (5)

is the predecessor task set of t_{i,j};

suc(t_{i,j}) = {t_{i,q} | (t_{i,j}, t_{i,q}) ∈ E_i}   formula (6)

is the successor task set of t_{i,j};
because of the data dependencies within a workflow, a task can be dispatched to a server for execution only after all of its predecessor nodes have finished executing and all data generated by those predecessors has been transmitted;
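The workflow model above (arrival time, deadline, DAG structure, predecessor/successor sets, and the rule that a task becomes schedulable only once all its predecessors have finished) can be sketched as follows; the class and field names are illustrative, not from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class Workflow:
    """Sketch of w_i = (arrival, deadline, DAG) with edge set E_i."""
    arrival: float                            # alpha_i, arrival time
    deadline: float                           # d_i, deadline
    tasks: list                               # task ids t_{i,1}..t_{i,N}
    edges: set = field(default_factory=set)   # directed edges (t_ip, t_ij)

    def pre(self, t):
        """Predecessor task set of t (formula (5))."""
        return {p for (p, q) in self.edges if q == t}

    def suc(self, t):
        """Successor task set of t (formula (6))."""
        return {q for (p, q) in self.edges if p == t}

    def ready(self, done):
        """Tasks whose predecessors have all finished, i.e. dispatchable now."""
        return [t for t in self.tasks if t not in done and self.pre(t) <= done]

# a tiny fork-shaped workflow: t1 feeds both t2 and t3
w = Workflow(arrival=0.0, deadline=50.0, tasks=["t1", "t2", "t3"],
             edges={("t1", "t2"), ("t1", "t3")})
```

With this structure, `ready(set())` yields only the entry task, and after `t1` completes both successors become schedulable, mirroring the dependency rule stated above.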
in the scheduling process of the multi-workflow application, the cloud side environment provides computing resources and data transmission services for users;
the cloud-edge environment:

S = {S_cloud, S_edge}   formula (7)

consists of a cloud part and an edge part, wherein the cloud contains m cloud servers:

S_cloud = {s_1, s_2, …, s_m}   formula (8)

and the edge contains n edge servers:

S_edge = {s_{m+1}, s_{m+2}, …, s_{m+n}}   formula (9)

In the resource model, any type of server can be leased or released at any time, provided the number of servers is sufficient; a server s_k is expressed as:

s_k = (p_k, u_k, c_k^u, f_k)   formula (10)

wherein p_k is the computing performance of server s_k; u_k is the billing time unit at which server s_k prices its service; c_k^u is the unit computation cost of server s_k per billing unit u_k, approximately proportional to its computing performance; f_k ∈ {0, 1} is the platform type of server s_k: when f_k = 0, s_k belongs to the cloud platform and has strong computing performance; when f_k = 1, s_k belongs to the edge platform and has ordinary computing performance. According to the platform types of the servers, the bandwidth β_{r,t} between servers s_r and s_t in the cloud-edge environment is expressed as:

β_{r,t} = (b_{r,t}, c_{r,t}^d)   formula (11)

wherein b_{r,t} is the bandwidth value of β_{r,t} and c_{r,t}^d is the data transmission cost incurred by transmitting 1 GB of data from server s_r to server s_t;
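The resource model can be sketched as below. The field names and the concrete bandwidth values are illustrative assumptions; the patent only requires that bandwidth depend on the platform types of the two endpoint servers:

```python
from dataclasses import dataclass

@dataclass
class Server:
    """Sketch of s_k = (p_k, u_k, c_k^u, f_k) from formula (10)."""
    perf: float       # p_k, computing performance
    unit_time: float  # u_k, billing time unit
    unit_cost: float  # c_k^u, cost per billing unit (roughly proportional to perf)
    is_edge: bool     # f_k: False = cloud platform, True = edge platform

def bandwidth(r: Server, t: Server,
              cloud_cloud=1.0, cross=0.1, edge_edge=0.5):
    """Illustrative b_{r,t} from formula (11): intra-cloud links fastest,
    cloud-edge links slowest (numbers are assumed, in GB/s)."""
    if not r.is_edge and not t.is_edge:
        return cloud_cloud
    if r.is_edge and t.is_edge:
        return edge_edge
    return cross

cloud = Server(perf=4.0, unit_time=1.0, unit_cost=0.4, is_edge=False)
edge = Server(perf=1.0, unit_time=1.0, unit_cost=0.1, is_edge=True)
```

Note how the cloud server is both faster and more expensive per unit time than the edge server, matching the proportionality assumption in the text.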
in a cloud-edge environment, a multi-workflow scheduling scheme assigns the task nodes of the multiple workflows to specific servers, embodying the correspondence between each task of the multi-workflow application and a server;
the multi-workflow scheduling scheme is represented as:

Γ = (W, S, M, c_e, T_f)   formula (12)

wherein M, formula (13), is the mapping of the multi-workflow application W onto the cloud-edge environment S; c_e is the execution cost of the multi-workflow application W in the cloud-edge environment S; and

T_f = {t_1^f, t_2^f, …, t_{|W|}^f}   formula (14)

is the set of completion times of the multi-workflow application;

for the two kinds of elements in the mapping M, (v_{i,j}, s_k) denotes that task v_{i,j} executes on server s_k, and (e_{p,j}^i, s_r → s_t) denotes that data edge e_{p,j}^i is transmitted from server s_r to server s_t; once the task submapping of M:

M_task = {(v_{i,j}, s_k)}   formula (15)

is determined, the data submapping:

M_data = {(e_{p,j}^i, s_r → s_t)}   formula (16)

is determined accordingly; thus the mapping M is equivalent to M_task, formula (17);
In the cloud-edge environment, the deadline is selected as the constraint condition for studying the cost-minimization problem under delay constraints; the scheduler is cost-driven, and its goal is to minimize the execution cost through reasonable scheduling. The problem the cost scheduler solves under the delay constraint is to minimize the execution cost of the multiple workflows on the premise that the deadline of every workflow is met. Each server is assumed to have sufficient storage space for the data generated or transmitted during execution. The task computation time t_tc measures the computing power of a server and the data transmission time t_dt measures the data transmission capacity between servers; they are calculated as:

t_tc(v_{i,j}, s_k) = size(v_{i,j}) / p_k   formula (18)

t_dt(e_{p,j}^i, s_r, s_t) = data(e_{p,j}^i) / b_{r,t}   formula (19)

wherein formula (18) is the computation time of task v_{i,j} on server s_k (size(v_{i,j}) denoting the task workload) and formula (19) is the transmission time of data edge e_{p,j}^i from server s_r to server s_t (data(e) denoting the transmitted data volume); when a data edge connects a server to itself, the data transmission time is 0;
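Formulas (18) and (19) translate directly into two helper functions; the workload and data-volume arguments are illustrative parameter names:

```python
def t_tc(task_size: float, perf: float) -> float:
    """Task computation time, formula (18): workload divided by the
    server's computing performance p_k."""
    return task_size / perf

def t_dt(data_gb: float, r: int, t: int, b) -> float:
    """Data transmission time, formula (19): data volume over bandwidth
    b_{r,t}; zero when source and destination are the same server."""
    if r == t:
        return 0.0
    return data_gb / b[r][t]

# symmetric 2-server bandwidth table (GB/s), values assumed
b = {0: {1: 0.5}, 1: {0: 0.5}}
```

For example, an 8-unit task on a 4-unit/s server takes 2 s, and shipping 1 GB over a 0.5 GB/s link takes 2 s, while a transfer within one server is free.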
In the latency-constrained cost scheduler, for a scheduling scheme Γ, once its mapping M is determined, the boot time t_boot(s_k) of every server s_k is determined as well; to compute the execution cost c_e and completion times T_f of the multi-workflow application from the mapping M of W onto the cloud-edge environment S, the relevant variables are defined as follows:
t_start(v_{i,j}, s_k): the start time of task v_{i,j} on server s_k, determined by the current idle time of server s_k and the completion times of all predecessor tasks of v_{i,j}, as in formula (20):

t_start(v_{i,j}, s_k) = max( idle(s_k), max_{v_{i,p} ∈ pre(v_{i,j})} [ t_end(v_{i,p}, s_r) + t_dt(e_{p,j}^i, s_r, s_k) ] )   formula (20)

(idle(s_k) denoting the time at which server s_k becomes idle);

t_end(v_{i,j}, s_k): the completion time of task v_{i,j} on server s_k, equal to its start time plus its computation time on s_k, as in formula (21):

t_end(v_{i,j}, s_k) = t_start(v_{i,j}, s_k) + t_tc(v_{i,j}, s_k), (v_{i,j}, s_k) ∈ M   formula (21)

t_shut(s_k): the shutdown time of server s_k, equal to the completion time of the last task executed on the server, as in formula (22):

t_shut(s_k) = max_{(v_{i,j}, s_k) ∈ M} t_end(v_{i,j}, s_k)   formula (22)

c_com(s_k): the task computation cost of server s_k in the cloud-edge environment, determined by the running time of the server, as in formula (23):

c_com(s_k) = c_k^u · ⌈( t_shut(s_k) - t_boot(s_k) ) / u_k⌉   formula (23)

c_tran(w_i): the data transmission cost of workflow application w_i under a given scheduling scheme Γ, as in formula (24):

c_tran(w_i) = Σ_{(e_{p,j}^i, s_r → s_t) ∈ M, r ≠ t} c_{r,t}^d · data(e_{p,j}^i)   formula (24)

d_i: the deadline constraint of workflow application w_i under a given scheduling scheme Γ, as in formula (25):

d_i = α_i + baseline · |W| · HEFT(w_i)   formula (25)

wherein HEFT(w_i) is the execution time required to schedule workflow w_i with the HEFT algorithm; the parameter baseline is defined by the equation group numbered (26);

based on the above definitions, the execution cost c_e and completion-time set T_f of the multi-workflow application are obtained as in formulas (27) and (28):

c_e = Σ_{s_k ∈ S} c_com(s_k) + Σ_{w_i ∈ W} c_tran(w_i)   formula (27)

T_f = {t_i^f}, where t_i^f is the completion time of the last task of workflow w_i   formula (28)
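The cost and deadline definitions above can be sketched as small functions. Billing per started billing unit (the ceiling in `server_cost`) is an assumption; the patent only states that computation cost is determined by server running time:

```python
import math

def server_cost(t_boot, t_shut, unit_time, unit_cost):
    """c_com(s_k), formula (23): the server is billed per started billing
    unit u_k for its running time (ceiling billing is an assumption)."""
    return unit_cost * math.ceil((t_shut - t_boot) / unit_time)

def total_cost(com_costs, tran_costs):
    """c_e, formula (27): computation cost of all leased servers plus the
    data transmission cost of all workflows."""
    return sum(com_costs) + sum(tran_costs)

def deadline(arrival, baseline, n_workflows, heft_time):
    """d_i, formula (25): d_i = alpha_i + baseline * |W| * HEFT(w_i)."""
    return arrival + baseline * n_workflows * heft_time
```

A server that runs for 2.5 time units under a 1.0-unit billing period is therefore billed for 3 units, and a workflow's deadline grows with both the HEFT makespan and the number of workflows in the batch.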
Further, encoding individuals with two-dimensional discrete particles is specifically as follows:

a particle consists of task priorities and server numbers; one individual in the population corresponds to one potential scheduling scheme of the multiple workflows in the cloud-edge environment; in the G-th generation of evolution, the k-th individual in the population is represented as in formula (30):

X_k^G = (μ_k^G, s_k^G), k = 1, 2, …, NP   formula (30)

wherein NP is the population size, and μ_{i,j}^{G,k} and s_{i,j}^{G,k} are, respectively, the priority code and the server code of the j-th task v_{i,j} in the i-th workflow application; during initialization, the 0-th generation individuals are generated as in formula (33):

μ_{i,j}^{0,k} = rand(0, 1), s_{i,j}^{0,k} = randint(1, m + n)   formula (33)

wherein

i = 1, 2, …, |W|, j = 1, 2, …, |V_i|, k = 1, 2, …, NP   formula (34)

rand() randomly selects a decimal in the given interval, and randint() randomly selects an integer in the given interval;

in the pair (μ_{i,j}, s_{i,j}), μ_{i,j} is a real number representing the priority code of the task in the multi-workflow application, and s_{i,j} is an integer representing its server code; for an element μ_{i,j} of the priority code, the value indicates the scheduling priority of the corresponding task in the scheduling scheme, and if two tasks have the same value, the task received earlier by the platform has the higher priority; for an element s_{i,j} of the server code, the value is the number of the server that executes the task.
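The initialization in formulas (30), (33), and (34) can be sketched as follows; for simplicity the sketch numbers servers from 0 rather than 1, and the tie-breaking rule in `schedule_order` follows the text's "earlier-received task wins" convention:

```python
import random

def init_particle(workflows, n_servers, seed=None):
    """One individual X = (mu, s): mu holds a real priority code in (0, 1)
    for every task, s holds a random server number (formula (33)).
    `workflows` is a list of task counts, one per workflow."""
    rng = random.Random(seed)
    mu, srv = {}, {}
    for i, n_tasks in enumerate(workflows):
        for j in range(n_tasks):
            mu[(i, j)] = rng.random()               # priority code, rand(0, 1)
            srv[(i, j)] = rng.randrange(n_servers)  # server code (0-based here)
    return mu, srv

def schedule_order(mu):
    """Decode priorities into a dispatch order: larger mu runs first; ties
    are broken in favor of the task with the smaller (i, j) index, standing
    in for 'received earlier by the platform'."""
    return sorted(mu, key=lambda t: (-mu[t], t))

# two workflows with 2 and 1 tasks, three servers
mu, srv = init_particle([2, 1], n_servers=3, seed=42)
```

Each key `(i, j)` identifies task v_{i,j}, so one particle fixes both the dispatch order and the task-to-server assignment, i.e. one candidate scheduling scheme.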
Further, optimizing the differential evolution algorithm with a selection operator over the whole population specifically includes:

first, the N offspring generated by the N parent individuals are all preserved; that is, when a parent produces one offspring, the algorithm does not immediately perform one-to-one elimination selection, but marks every new individual produced by mutation and crossover, i ∈ {1, 2, …, N}, and retains it temporarily, so that there are N offspring alongside the original N parents; thus, after one round of evolution, 2N individuals are temporarily retained; next, the fitness function values of the 2N individuals in the current pool are calculated, and the 2N individuals are sorted by fitness value from large to small; finally, the first N individuals in the sorted queue are selected as the evolution result of the current generation and used as the parents for the next generation of evolution.
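The population-wide selection step described above (pool parents and offspring, rank all 2N, keep the best N) can be sketched as:

```python
def select_next_generation(parents, offspring, fitness):
    """(mu + lambda)-style selection: pool the N parents with the N
    offspring, sort the 2N individuals by fitness value from large to
    small, and keep the first N as the next generation."""
    pool = parents + offspring
    pool.sort(key=fitness, reverse=True)  # larger fitness value is better
    return pool[:len(parents)]
```

Unlike one-to-one elimination, a strong offspring can displace any weak parent, not just its own, which is how the operator speeds up the rise of the population's overall fitness.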
Further, the fitness function compares two candidate solutions and is defined as follows:

(1) if both individuals are feasible solutions, the individual with the lower c_e is selected; the fitness function for this case is defined as in formula (35);

if at least one of the two individuals is an infeasible solution, the fitness value is determined by the number of workflows in each solution that satisfy the constraint condition, defined as follows:

(2.1) if the two individuals satisfy the constraint condition for the same number of workflows, the individual with the lower c_e is selected;

(2.2) if the two individuals satisfy the constraint condition for different numbers of workflows, the individual satisfying the constraint for more workflows is selected;

wherein an event function expresses the result of the constraint t_i^f ≤ d_i: it takes the value 1 when the constraint is satisfied and 0 otherwise.
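The comparison rules above can be sketched as a single pairwise function. The exact fitness formulas (35) and following are not reproduced here; each individual is summarized as a pair `(deadlines_met, cost)`, which is an illustrative simplification:

```python
def better(a, b):
    """Pairwise comparison following rules (1), (2.1), (2.2): the
    individual satisfying more workflow deadlines wins; among equally
    feasible individuals, the one with the lower execution cost c_e wins.
    `a` and `b` are (deadlines_met, cost) pairs."""
    met_a, cost_a = a
    met_b, cost_b = b
    if met_a != met_b:                    # rule (2.2): feasibility dominates
        return a if met_a > met_b else b
    return a if cost_a <= cost_b else b   # rules (1)/(2.1): lower c_e wins
```

This is the classic constraint-handling ordering: feasibility count first, objective value second, so infeasible individuals are pushed toward feasibility before cost is optimized.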
Further, the specific implementation process of the differential evolution algorithm is as follows:
step S1: determine the control parameters of the differential evolution algorithm and the fitness function; the control parameters of the differential evolution algorithm comprise the population size NP, the scaling factor F, and the crossover probability CR;
step S2: randomly generating an initial population;
step S3: evaluating an initial population and calculating the fitness value of each individual in the initial population;
step S4: judge whether the termination condition is reached or the generation counter has reached its maximum; if so, terminate the evolution and output the best individual obtained as the optimal solution; if not, continue;
step S5: carrying out mutation and cross operation to obtain an intermediate population;
step S6: selecting individuals from the original population and the intermediate population to obtain a new generation of population;
step S7: set the generation counter g = g + 1 and go to step S4;
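Steps S1 to S7 trace the classic differential evolution loop. The sketch below assumes DE/rand/1 mutation and binomial crossover on a real-valued vector (only the priority half of the patent's encoding) and, for brevity, uses the textbook one-to-one selection in step S6 rather than the patent's population-wide operator:

```python
import random

def differential_evolution(fitness, dim, np_=20, f=0.5, cr=0.9,
                           max_gen=100, seed=0):
    """Sketch of steps S1-S7; fitness is maximized. NP, F and CR are the
    control parameters named in step S1."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(np_)]  # S2
    for g in range(max_gen):                                        # S4/S7
        for k in range(np_):
            # pick three distinct individuals other than pop[k]
            a, b, c = rng.sample([p for i, p in enumerate(pop) if i != k], 3)
            trial = list(pop[k])
            jr = rng.randrange(dim)          # force at least one mutated gene
            for j in range(dim):             # S5: DE/rand/1 + binomial cross
                if rng.random() < cr or j == jr:
                    trial[j] = a[j] + f * (b[j] - c[j])
            if fitness(trial) >= fitness(pop[k]):                   # S6
                pop[k] = trial
    return max(pop, key=fitness)

# demo: maximize -sum(x^2), whose optimum is the zero vector
best = differential_evolution(lambda x: -sum(v * v for v in x), dim=3)
```

In the invention, step S6 would be replaced by the population-wide selection over all 2N parents and offspring described earlier.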
the mapping of the individuals of the population to the multi-workflow scheduling scheme is realized by the following algorithm:
the input of Algorithm 1 comprises the multi-workflow application W, the cloud-edge environment S, and an encoded particle X, and the output is the scheduling scheme Γ = (W, S, M, c_e, T_f) corresponding to the encoded particle X. First, the mapping M is initialized to the empty set, the to-be-executed queues Q = (Q_1, Q_2, …, Q_{|S|}) are initialized to empty queues, and the data transfer cost c_tran is initialized to 0; the scheduling of the multi-workflow application W then starts, in two steps:
(1) calling an algorithm 2 to monitor the arrival of the multi-workflow application W in real time and perform task allocation of the multi-workflow application;
(2) calling an algorithm 3 to execute the multi-workflow application on queues to be executed of all servers;
after scheduling finishes, all opened servers are shut down, and the execution cost c_e and completion times T_f are calculated according to formula (27) and formula (28); after the calculation, if the completion time of any workflow application exceeds its deadline, the scheduling scheme does not satisfy the deadline constraint, and the encoded particle X is marked as an infeasible solution; finally, the scheduling scheme Γ = (W, S, M, c_e, T_f) of the workflows is returned;
In the execution process of the algorithm 1, the arrival of the multi-workflow application W needs to be monitored in real time, and the task allocation of the multi-workflow application is carried out, wherein the process is shown as an algorithm 2, and input parameters comprise the multi-workflow application W, a cloud edge environment S and a coded particle X; during the operation of the algorithm, if the workflow applies w i If so, calculating the task calculation time t according to the formula (18) and the formula (19) respectively tc [|V i |×|S|]And data transmission time t dt [|E i |,|S|×|S|]And recording its arrival time alpha i (ii) a Traversing workflow applications w i All tasks in, if task v i,j For entering a task, i.e. the task does not have a predecessor task, then the value s is determined according to the server code i,j V. task i,j Put into the server s i,j The queue to be executed; otherwise, the task v i,j Put into the server s i,j The task waiting pool of (1); otherwise, waiting for the arrival of a certain workflow application; until all workflow applications have arrived, the algorithm ends;
during the execution of Algorithm 1, the multi-workflow application must also be dispatched from the to-be-executed queues of the servers; this process is Algorithm 3, whose input comprises a server s_k, the to-be-executed queue Q_k of server s_k, the mapping M, and the data transmission cost c_tran. While the algorithm runs, if server s_k is in the off state, server s_k is booted and its boot time t_boot(s_k) is set to the current time; if the to-be-executed queue Q_k of server s_k is not empty, then, according to the priority codes μ, the task v_{i,j} with the highest priority in Q_k is dispatched to server s_k, the corresponding mapping relation (v_{i,j}, s_k) is added to the mapping M, and Algorithm 4 is called to perform the task computation and data transmission processes; otherwise, the algorithm waits for Q_k to become non-empty; the algorithm ends once all workflow applications have been executed;
during the execution of Algorithm 3, the task computation and data transmission of the workflow application are simulated by Algorithm 4, whose input comprises a task v_{i,j} and a server s_k and whose output is the transmission cost of the currently generated data. First, this transmission cost is initialized to 0; second, the start time t_start(v_{i,j}, s_k) of task v_{i,j} is recorded, and its completion time t_end(v_{i,j}, s_k) is calculated according to t_end(v_{i,j}, s_k) = t_start(v_{i,j}, s_k) + t_tc(v_{i,j}, s_k); finally, the successor tasks of v_{i,j} are traversed, and, according to the server code s_{i,s} of each successor task v_{i,s}, the data are transmitted to the server s_{i,s} that executes v_{i,s}, and the correspondingly generated data transmission cost is calculated; at this point, if a successor task v_{i,s} has completed the reception of all its predecessor-task data, task v_{i,s} is moved from the task waiting pool of server s_{i,s} into the to-be-executed queue.
Further, the maximum number of evolution generations is set to k = 1000 and used as the termination condition of the differential evolution algorithm; that is, the algorithm ends when the 1000th generation of evolution completes.
For the multi-workflow scheduling problem, the invention and its preferred schemes provide a differential-evolution-based multi-workflow scheduling method under deadline constraints, minimizing the execution cost of the multiple workflows with a differential evolution algorithm while meeting the deadline of every workflow. To improve the rationality and diversity of population evolution, individuals are encoded as two-dimensional discrete particles, and the basic differential evolution algorithm is optimized with a selection operator over the whole population, so that the fitness of the population improves faster and the solution space is searched more quickly while premature convergence is avoided. Multiple groups of comparative simulation experiments show that the differential-evolution-based multi-workflow scheduling algorithm outperforms other scheduling algorithms across deadlines and multi-workflow scales, and effectively reduces the execution cost of multiple workflows in a cloud-edge environment.
Drawings
Fig. 1 is a diagram of an example of coding applied to multi-workflow scheduling according to an embodiment of the present invention.
Fig. 2 is a flowchart of a basic differential evolution algorithm according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a scheduling result of a small-sized multi-workflow under different deadlines and different optimization algorithms according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a scheduling result of a multi-workflow in different deadlines and different optimization algorithms according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of a scheduling result of a large-scale multi-workflow under different deadlines and different optimization algorithms according to an embodiment of the present invention.
Detailed Description
In order to make the features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail as follows:
it should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
1 model construction
1.1 Multi-workflow model
In the cloud-edge environment, in order to simulate an actual interaction scenario, it is assumed that the time intervals between the workflows a user submits to the cloud-edge environment (that is, the arrival times of the different workflows) approximately follow a Poisson distribution P(λ) (where λ denotes the workflow arrival rate), and the workflows can be represented by an infinite set:
W = {w_1, w_2, …}    Formula (1)
Each workflow can be represented by a triplet:
w_i = (α_i, d_i, G_i)    Formula (2)
where the elements denote, in order, the arrival time, the deadline, and the structure.
The structure of a workflow is typically represented by a DAG (Directed Acyclic Graph):
G_i = {T_i, E_i}    Formula (3)
where T_i = {t_{i,1}, t_{i,2}, …, t_{i,N}} is the set of tasks, N represents the number of tasks, and t_{i,j} represents the j-th task in the i-th workflow. E_i is the set of edges between tasks. A directed edge e(t_{i,p}, t_{i,j}) represents data transmission between t_{i,p} and t_{i,j}: t_{i,p} is a predecessor task of t_{i,j}, and t_{i,j} is a successor task of t_{i,p}.
The predecessor task set of t_{i,j} is the set of all tasks with a directed edge into t_{i,j}.
The successor task set of t_{i,j} is the set of all tasks with a directed edge out of t_{i,j}.
Owing to the data-flow nature of a workflow, a task can be allocated to a server for execution only after all of its predecessor tasks have finished and all data generated by those predecessors have been transmitted.
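As an illustration of this readiness rule, the following minimal Python sketch (task identifiers and data structures are hypothetical, not part of the invention) checks whether a task may enter a server's execution queue:

```python
# Toy DAG: t2 depends on t1 (hypothetical task ids for illustration only).
PRED = {"t2": {"t1"}}

def is_ready(task, finished, received):
    """A task is ready only when every predecessor has finished AND its
    output data has been received (the rule stated in Section 1.1)."""
    preds = PRED.get(task, set())
    return preds <= finished and preds <= received.get(task, set())
```

An entry task (no predecessors) is ready immediately; a task with a pending predecessor, or whose predecessor data has not yet arrived, must wait in the task waiting pool.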
1.2 cloud edge Environment
In the scheduling process of the multi-workflow application, the cloud-edge environment provides computing resources and data transmission services to users. The service model of this embodiment is Infrastructure as a Service (IaaS); it is assumed that the environment provides users with an elastic cloud computing service similar to Amazon EC2 and an elastic block storage service similar to Amazon EBS.
The cloud-edge environment is:
S = {S_cloud, S_edge}    Formula (7)
It consists of a cloud and an edge, where the cloud comprises m cloud servers:
S_cloud = {s_1, s_2, …, s_m}    Formula (8)
and the edge comprises n edge servers:
S_edge = {s_{m+1}, s_{m+2}, …, s_{m+n}}    Formula (9)
Different types of servers have different hardware configurations, such as the CPU, chipset, memory, and disk system, corresponding to performance parameters of different specifications; the better a server's performance, the higher its rent. In the resource model assumed in this embodiment, any type of server can be leased or released at any time, and the number of servers is assumed to be unlimited. A server s_k can be expressed as the tuple:
s_k = (p_k, u_k, c_k, f_k)    Formula (10)
where p_k represents the computing performance of server s_k; u_k represents the billing time unit at which server s_k prices its service; c_k represents the computation cost of server s_k per unit time u_k, which is approximately proportional to its computing performance (the resource model is priced pay-as-you-go, i.e., by the number of billing units of the leased virtual machine; in general, an incomplete unit of use is still charged as a full unit of time); and f_k ∈ {0, 1} represents the platform type of server s_k: when f_k = 0, s_k belongs to the cloud platform and has strong computing performance; when f_k = 1, s_k belongs to the edge platform and has ordinary computing performance. According to the platform types of the servers, the bandwidth β_{r,t} between servers s_r and s_t in the cloud-edge environment can be expressed as:
β_{r,t} = (b_{r,t}, c_{r,t})    Formula (11)
where b_{r,t} represents the bandwidth value of β_{r,t}, and c_{r,t} represents the data transmission cost of transferring 1 GB of data from server s_r to server s_t.
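The server tuple and per-link pricing above can be sketched as follows; the field names and the `transfer_cost` helper are illustrative assumptions, not the patent's notation:

```python
from dataclasses import dataclass

@dataclass
class Server:
    perf: float       # p_k, computing performance
    unit_time: float  # u_k, billing time unit (seconds)
    unit_cost: float  # c_k, cost per billing unit (roughly proportional to perf)
    is_edge: int      # f_k: 0 = cloud platform, 1 = edge platform

def transfer_cost(data_gb, unit_price_per_gb, same_server=False):
    # Transfers within a single server generate no transmission cost.
    return 0.0 if same_server else data_gb * unit_price_per_gb

# Example: a hypothetical high-performance cloud server.
s5 = Server(perf=10.0, unit_time=60.0, unit_cost=1.0, is_edge=0)
```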
1.3 Multi-workflow scheduling scheme
In a cloud-edge environment, the scheduling scheme of the multiple workflows solves the problem of allocating the task nodes of the multiple workflows to specific servers, and embodies the correspondence between each task in the multi-workflow application and a server.
Therefore, the multi-workflow scheduling scheme can be expressed as:
Γ = (W, S, M, c_e, T_f)    Formula (12)
where M represents the mapping of the multi-workflow application W onto the cloud-edge environment S, c_e represents the execution cost of the multi-workflow application W in the cloud-edge environment S, and T_f represents the completion time of the multi-workflow application.
The mapping M contains two types of elements: (v_{i,j}, s_k) indicates that task v_{i,j} is executed on server s_k, and an edge element indicates that the data of a directed edge are transmitted from server s_r to server s_t. Observe that once the task-to-server sub-mapping of M is determined, the edge-transmission sub-mapping is determined accordingly. Thus, the mapping M is equivalent to its task-to-server sub-mapping.
1.4 Cost scheduler under time delay constraint
In a cloud-edge environment, the delay is mainly divided into transmission delay and deadline delay; the deadline delay is selected as the constraint condition to study the delay-minimization problem. The objective of a cost scheduler, i.e., a cost-driven scheduler, is to minimize the optimization objective, the execution cost, through reasonable scheduling according to the scheduling scheme. Therefore, the problem to be solved by the cost scheduler under the delay constraint in this embodiment is to minimize the execution cost of the multiple workflows on the premise of satisfying the deadlines of all workflows. The main research objects are the task computation time generated by server computation and the data transmission delay generated by data transmission; it is assumed that each server has enough storage space to store the data generated or transmitted during execution. This embodiment uses the task computation time t_tc to measure the computing capability of a server and the data transmission time t_dt to measure the data transmission capability between servers; the specific calculation is as follows:
where Formula (18) represents the computation time of task v_{i,j} on server s_k, and Formula (19) represents the transmission time generated when a data edge is transferred from server s_r to server s_t. In particular, when both endpoints of a data transfer edge map to the same server, the data transmission time is 0.
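Under the definitions of Formulas (18) and (19), a hedged sketch of the two time measures might look like this (the workload and bandwidth units are assumptions for illustration, since the formulas themselves are not reproduced above):

```python
def task_compute_time(workload, perf):
    # t_tc: the task's computation amount divided by the server's
    # computing performance p_k (units are illustrative).
    return workload / perf

def data_transfer_time(data_size, bandwidth, same_server=False):
    # t_dt: 0 when both endpoints of the data edge map to one server,
    # otherwise the edge's data volume divided by the link bandwidth b_{r,t}.
    return 0.0 if same_server else data_size / bandwidth
```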
In the latency-constrained cost scheduler, for a scheduling scheme Γ, once its mapping M is determined, the boot time t_boot(s_k) of each server s_k is determined accordingly. To compute the execution cost c_e and the completion time T_f of the multi-workflow application, according to the mapping M of the multi-workflow application W onto the cloud-edge environment S, the related variables are defined as follows:
t_start(v_{i,j}, s_k): the start time of task v_{i,j} on server s_k, determined by the current idle time of server s_k and the completion times of all predecessor tasks of v_{i,j}, as shown in Formula (20).
t_end(v_{i,j}, s_k): the completion time of task v_{i,j} on server s_k, equal to the start time of task v_{i,j} on server s_k plus its computation time there, as shown in Formula (21):
t_end(v_{i,j}, s_k) = t_start(v_{i,j}, s_k) + t_tc(v_{i,j}, s_k), (v_{i,j}, s_k) ∈ M    Formula (21)
t_shut(s_k): the shutdown time of server s_k, equal to the completion time of the task executed latest on the server, as shown in Formula (22).
c_com(s_k): the computation cost of the tasks on server s_k in the cloud-edge environment, determined by the running time of the server; the calculation is shown in Formula (23).
c_tran(w_i): the data transmission cost of workflow application w_i under scheduling scheme Γ, calculated as shown in Formula (24).
d_i: the deadline constraint of workflow application w_i under scheduling scheme Γ, calculated as shown in Formula (25):
d_i = α_i + baseline · |W| · HEFT(w_i)    Formula (25)
where HEFT(w_i) denotes the execution time required to schedule workflow w_i with the HEFT algorithm, and the parameter baseline is defined by Formula (26):
based on the above definition, the execution cost c of the multi-workflow application can be obtained e And completion timeAs shown in equations (27) and (28).
In actual workflow operation there are also operating costs generated by services such as data storage and security verification; compared with the execution cost generated by the computing and data transmission services, their influence on the overall cost is negligible, so this embodiment only considers the execution cost generated by using the computing service and the data transmission service.
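As a rough illustration of the pay-as-you-go accounting described above, the following sketch assumes (since Formulas (23), (27), and (28) are not reproduced here) that partial billing units are charged in full and that the total cost sums per-server computation costs and per-workflow transmission costs:

```python
import math

def server_compute_cost(t_boot, t_shut, unit_time, unit_cost):
    # Pay-as-you-go: an incomplete billing unit is charged as a full unit
    # (rounding rule assumed, consistent with the pricing note above).
    return math.ceil((t_shut - t_boot) / unit_time) * unit_cost

def total_execution_cost(compute_costs, transfer_costs):
    # c_e: computation costs of all rented servers plus the data
    # transmission costs of all workflows (aggregation form assumed).
    return sum(compute_costs) + sum(transfer_costs)
```

For example, a server running 90 s with a 60 s billing unit is charged for 2 units.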
2 problem definition
In summary, the deadline-constrained scheduling problem of multiple workflows studied in this embodiment can be abstracted as:
3 Algorithm design
3.1 population initialization
To better fit the multi-workflow scheduling problem in a real-world environment, this embodiment encodes the workflow using two-dimensional discrete particles, each consisting of a task priority and a server number. One individual in the population corresponds to one potential scheduling scheme of the multiple workflows in the cloud-edge environment. For the G-th evolution, the k-th individual in the population is represented by Formula (30).
where NP represents the size of the population, and the two components represent, respectively, the priority code and the server code of the j-th task v_{i,j} in the i-th workflow application. In the initialization process, the encoding of the 0th-generation individuals is initialized as shown in Formula (33).
where i = 1, 2, …, |W|, j = 1, 2, …, |V_i|, k = 1, 2, …, NP    Formula (34)
rand() represents a random decimal selected from a given interval, and randint() represents a random integer selected from a given interval.
In the binary pair, the first component is a real number representing the priority code of the multi-workflow application, and the second is an integer representing its server code. For an element of the priority code, the value indicates the scheduling priority of the corresponding task in the scheduling scheme; if two tasks have the same priority value, the task that arrived at the platform earlier has the higher priority. For an element of the server code, the value denotes the number of the server that executes the task.
FIG. 1 illustrates the encoding method for a multi-workflow application with 2 workflows in a cloud-edge environment. The multi-workflow consists of two workflows w_1 and w_2, each containing 4 tasks; the cloud-edge environment consists of 1 cloud server s_1 and 2 edge servers s_2 and s_3. At some point during execution, if tasks v_{1,3} and v_{2,1} are both in the queue to be executed of server s_3, then according to task priority, server s_3 will preferentially execute task v_{2,1}; after task v_{2,1} finishes, the next task is executed according to the priorities of the tasks in the queue to be executed.
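The initialization of Formula (33) can be sketched as below; function and variable names are illustrative, with rand() and randint() standing in for the operators named above:

```python
import random

def init_individual(task_counts, num_servers, rng=random.Random(0)):
    """Build one two-dimensional discrete particle: per task, a real-valued
    priority code (rand()) and an integer server code (randint())."""
    mu, srv = [], []
    for n_tasks in task_counts:  # one entry per workflow
        mu.append([rng.random() for _ in range(n_tasks)])
        srv.append([rng.randint(1, num_servers) for _ in range(n_tasks)])
    return mu, srv

# Two workflows of 4 tasks each, 3 servers, as in the FIG. 1 example.
mu, srv = init_individual([4, 4], num_servers=3)
```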
3.2 fitness function
The research objective of this embodiment is to optimize the scheduling policy of the workflows in the cloud-edge environment so as to reduce their operating cost. Therefore, for an individual, the fitness function value is the operating cost of the scheduling scheme corresponding to that individual, and the lower the operating cost, the better the individual. However, the encoding strategy proposed in the previous section may produce infeasible solutions that do not meet the deadline constraints. Thus, the fitness function for comparing two candidate solutions is defined as follows:
(1) When both individuals are feasible solutions, the individual with the lower c_e is selected; the fitness function is defined as shown in Formula (35).
(2) When at least one of the two individuals is an infeasible solution, the fitness function value is updated according to the number of workflows in each individual that satisfy the constraint conditions, defined as follows:
(2.1) if the number of workflows meeting the constraint condition in the two individuals is the same, then:
(2.2) if the numbers of workflows meeting the constraint condition in the two individuals are different:
where the event (indicator) function of the constraint takes the value 1 when the constraint condition is satisfied, and 0 otherwise.
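A hedged sketch of this pairwise comparison, representing each candidate as a (cost, satisfied-deadline count, feasible) triple; this is an assumed simplification of Formulas (35) to (37), not the patent's exact fitness expressions:

```python
def better(a, b):
    """Return the preferred candidate. Each candidate is
    (cost, met, feasible): execution cost c_e, the number of workflows
    meeting their deadline, and overall feasibility."""
    cost_a, met_a, feas_a = a
    cost_b, met_b, feas_b = b
    if feas_a and feas_b:              # (1) both feasible: lower cost wins
        return a if cost_a <= cost_b else b
    if met_a != met_b:                 # (2.2) prefer more satisfied deadlines
        return a if met_a > met_b else b
    return a if cost_a <= cost_b else b  # (2.1) same count: fall back to cost
```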
3.3 update strategy for populations
3.3.1 mutation operator
The mutation operation of the differential evolution algorithm is realized through a difference strategy: two distinct individuals are randomly selected from the population, their vector difference is scaled, and the result is combined with the individual to be mutated by vector addition, as shown in Formula (38).
where F, called the scaling factor, is a fixed constant. Although F ∈ (0, 2) is theoretically acceptable, long-term practice in the literature has shown that F ∈ (0, 1) is more effective in applications.
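A minimal sketch of the difference strategy of Formula (38), using the individual to be mutated as the base vector (vector names are illustrative):

```python
def mutate(x, r1, r2, F=0.5):
    # v = x + F * (r1 - r2): scale the difference of two randomly chosen
    # distinct individuals and add it to the individual to be mutated.
    return [xi + F * (a - b) for xi, a, b in zip(x, r1, r2)]
```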
3.3.2 crossover operator
The purpose of the crossover operation is to recombine components between the parent and the mutant individual, controlled by the crossover parameter CR ∈ [0, 1]. The crossover operation can be performed in two ways: the binomial method and the exponential method. The binomial method first generates, for each of the d components, a uniformly distributed random number r_i ∈ [0, 1]; comparing r_i with the crossover probability CR determines whether the new individual takes that component from the mutant, so X_i may be expressed as:
in this way, it can be randomly decided whether to exchange a certain component with the variant individual.
In the exponential method, the algorithm selects a contiguous segment of the mutant individual's genes; the segment starts at a random integer k, has a random length L, and may cover several components. Mathematically, k ∈ [0, d−1] and L ∈ [1, d] are selected at random, so X_i may be expressed as:
in this embodiment, a binomial method is selected to implement the crossover operator.
3.3.3 selection operator
To further improve the optimization effect of the differential evolution algorithm, this embodiment introduces a new selection mechanism. First, the algorithm keeps all N offspring generated by the N parent individuals: when a parent generates one offspring, the algorithm does not immediately perform one-to-one elimination selection; instead, all new individuals produced by mutation and crossover, i ∈ {1, 2, …, N}, are marked and temporarily retained, giving N offspring alongside the original N parents. Thus, after one round of evolution, 2N individuals are temporarily retained. Next, the algorithm calculates the fitness function values of the 2N individuals in the current pool and sorts them by fitness from best to worst. The first N individuals in the sorted queue are then selected as the final evolution result of the current generation and serve as the parents of the next generation. Compared with the one-to-one elimination mechanism of the traditional differential evolution selection operator, this selection method exhibits a stronger and more comprehensive evolutionary capability. Selection over the whole population makes the evolution process more reasonable and diversified, improves the overall fitness of the population more quickly, and accelerates convergence in searching the solution space.
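This whole-population (2N-pool) selection can be sketched as follows; individuals and the fitness callable are placeholders (here, lower fitness is better, matching the cost-based fitness of Section 3.2):

```python
def select_population(parents, offspring, fitness):
    """Pool the N parents with their N offspring, rank the 2N individuals
    by fitness (lower is better), and keep the best N as the next generation."""
    pool = parents + offspring
    pool.sort(key=fitness)
    return pool[:len(parents)]
```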
3.4 mapping of population individuals to a Multi-workflow scheduling scheme
In the cloud-edge environment, the mapping from an encoded particle to a multi-workflow application scheduling scheme is shown as Algorithm 1. For simplicity of presentation, the superscripts and subscripts of the encoded particle are omitted in this section, i.e., the particle is written simply as X = (μ, π).
The input of the algorithm comprises the multi-workflow application W, the cloud-edge environment S, and an encoded particle X; the output is the scheduling scheme Γ = (W, S, M, c_e, T_f) corresponding to the encoded particle X. First, the mapping M is initialized to the empty set, the queues to be executed Q = (Q_1, Q_2, …, Q_{|S|}) are initialized to empty queues, and the data transmission cost c_tran is initialized to 0 (line 1). Then the scheduling of the multi-workflow application W is started (lines 2-9); the process is divided into two steps:
(1) call Algorithm 2, Workflow_Applications_Installing(W, S, X), to monitor the arrival of the multi-workflow application W in real time and perform task allocation of the multi-workflow application (line 5);
(2) on the queues to be executed of all servers, call Algorithm 3, Workflow_Applications_Execution(s_k, Q_k, M, c_tran), to perform the execution of the multi-workflow application (lines 6-9).
After scheduling ends, all opened servers are shut down, and the execution cost c_e and the completion time T_f are calculated according to Formulas (27) and (28) (lines 11, 12). After the calculation, if the completion time of some workflow application exceeds its deadline, the scheduling scheme does not satisfy the deadline constraint and the encoded particle X is marked as an infeasible solution (lines 14-16). Finally, the scheduling scheme of the workflows, Γ = (W, S, M, c_e, T_f), is returned (line 18).
During the execution of Algorithm 1, the arrival of the multi-workflow application W must be monitored in real time and the tasks of the multi-workflow application allocated; this process is shown as Algorithm 2. The input parameters of the algorithm include the multi-workflow application W, the cloud-edge environment S, and the encoded particle X. During operation, if a workflow application w_i arrives, the task computation times t_tc[|V_i| × |S|] and data transmission times t_dt[|E_i|, |S| × |S|] are calculated according to Formulas (18) and (19), and its arrival time α_i is recorded (lines 3-4). All tasks in workflow application w_i are then traversed: if task v_{i,j} is an entry task, i.e., it has no predecessor task, then according to the server code s_{i,j}, task v_{i,j} is put into the queue to be executed of server s_{i,j}; otherwise, task v_{i,j} is put into the task waiting pool of server s_{i,j} (lines 5-11). Otherwise, the algorithm waits for the arrival of a workflow application (lines 14-15). The algorithm ends when all workflow applications have arrived.
During the execution of Algorithm 1, the multi-workflow application must also be executed on each server's queue to be executed; this process is shown as Algorithm 3. The input of the algorithm is a server s_k, the queue to be executed Q_k of server s_k, the mapping M, and the data transmission cost c_tran. During operation, if server s_k is in the off state, server s_k is booted and its boot time t_boot(s_k) is set to the current time (lines 2-4). If the queue to be executed Q_k of server s_k is not empty, then according to the priority code μ, the task v_{i,j} with the highest priority in Q_k is dispatched to server s_k, the corresponding mapping relation (v_{i,j}, s_k) is added to the mapping M, and Algorithm 4 is called to perform the task computation and data transmission processes (lines 10-12); otherwise, the algorithm waits for Q_k to become non-empty (lines 13-14). The algorithm ends when all workflow applications have been executed.
During the execution of Algorithm 3, the task computation and data transmission process of the simulated workflow application is shown as Algorithm 4. The inputs of the algorithm include a task v_{i,j} and a server s_k; the output is the transmission cost of the currently generated data, which is first initialized to 0 (line 1). Second, the start time t_start(v_{i,j}, s_k) of task v_{i,j} is recorded, and the completion time t_end(v_{i,j}, s_k) of task v_{i,j} is calculated according to t_end(v_{i,j}, s_k) = t_start(v_{i,j}, s_k) + t_tc(v_{i,j}, s_k) (lines 2-3). Finally, the successor tasks of v_{i,j} are traversed: according to the server code s_{i,s}, the data are transmitted to the server s_{i,s} executing the successor task v_{i,s}, and the correspondingly generated data transmission cost is calculated; at this point, if task v_{i,s} has received all of its predecessor task data, task v_{i,s} is moved from the task waiting pool of server s_{i,s} into the queue to be executed (lines 4-10).
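A hedged sketch of the completion-and-notification step of Algorithm 4 (data structures are assumptions, and transmission costs are omitted for brevity):

```python
def finish_task(task, start, compute_time, succs, preds, received, finished):
    """Record t_end = t_start + t_tc for `task`, hand its output data to
    each successor, and return the successors that became ready, i.e.,
    those that have now received data from all of their predecessors."""
    end = start + compute_time
    finished.add(task)
    ready = []
    for s in succs.get(task, []):
        received.setdefault(s, set()).add(task)
        if preds.get(s, set()) <= received[s]:  # all predecessor data arrived
            ready.append(s)
    return end, ready

# Toy DAG: task "c" depends on both "a" and "b" (hypothetical ids).
succs = {"a": ["c"], "b": ["c"]}
preds = {"c": {"a", "b"}}
received, finished = {}, set()
end_a, ready_a = finish_task("a", 0.0, 2.0, succs, preds, received, finished)
end_b, ready_b = finish_task("b", 0.0, 3.0, succs, preds, received, finished)
```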
3.5 end conditions
There are generally two types of termination conditions for the differential evolution algorithm: one limits the maximum number of evolution generations; the other terminates the algorithm when the objective function value falls below a certain threshold, usually chosen as 10^-6 in general research. In this embodiment, the maximum number of evolution iterations k = 1000 is used as the termination condition; that is, the algorithm ends when the 1000th generation of evolution is completed.
3.6 Algorithm flow-chart
The algorithm flow is shown in fig. 2:
(1) determine the control parameters of the differential evolution algorithm and the fitness function. The control parameters of the differential evolution algorithm include the population size NP, the scaling factor F, and the crossover probability CR;
(2) randomly generating an initial population;
(3) evaluating an initial population and calculating the fitness value of each individual in the initial population;
(4) judge whether the termination condition is reached or the evolution generation has reached its maximum. If so, terminate the evolution and output the best individual obtained as the optimal solution; if not, continue;
(5) carrying out variation and cross operation to obtain an intermediate population;
(6) selecting individuals from the original population and the intermediate population to obtain a new generation of population;
(7) set the evolution generation g = g + 1 and go to step (4).
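Steps (1) to (7) can be combined into a minimal driver loop, shown here on a toy continuous test function rather than the scheduling problem itself; all names are illustrative, and the selection step uses the 2N-pool mechanism of Section 3.3.3:

```python
import random

def differential_evolution(fitness, dim, NP=10, F=0.5, CR=0.5,
                           max_gen=100, rng=random.Random(42)):
    # (1)-(2): parameters fixed above; random initial population.
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(NP)]
    for _ in range(max_gen):                      # (4): generation limit
        trials = []
        for i, x in enumerate(pop):               # (5): mutation + crossover
            r1, r2 = rng.sample([p for j, p in enumerate(pop) if j != i], 2)
            v = [xi + F * (a - b) for xi, a, b in zip(x, r1, r2)]
            u = [vj if rng.random() < CR else xj for xj, vj in zip(x, v)]
            trials.append(u)
        pool = pop + trials                       # (6): 2N-pool selection
        pool.sort(key=fitness)
        pop = pool[:NP]                           # (7): next generation
    return min(pop, key=fitness)

# Toy fitness: the sphere function, minimized at the origin.
best = differential_evolution(lambda x: sum(xi * xi for xi in x), dim=2)
```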
4 Algorithm evaluation
4.1 design of the experiment
All experiments were run on a Windows 10 system with 8 GB memory and a 2.60 GHz Intel Core i7-6700HQ CPU, in a Python 3.10 environment.
4.1.1 workflow example
The workflows used for testing come from the 5 scientific workflows of Bharathi et al., covering 5 intensively studied scientific fields: CyberShake in seismology, Epigenomics in biogenetics, LIGO in gravitational physics, Montage in astronomy, and SIPHT in bioinformatics. Each workflow has different attributes such as structure and number of tasks, and related information such as computation requirements and data transmission volumes is stored in corresponding XML files. For each scientific workflow, this example selects 3 scales: micro (about 10 tasks), mini (about 30 tasks), and mid (about 50 tasks); the scale of a workflow application submitted by a user is one of these 3 scales. For the multi-workflows, 3 scales were likewise chosen: small (about 20 workflows), medium (about 30 workflows), and large (about 50 workflows).
4.1.2 resource instances
Currently mainstream commercial cloud services generally bill the priced time p_i in units of 60 seconds or 1 hour. In this experiment, payment in units of 60 seconds was selected.
The bandwidth and unit data transmission cost between two servers s_i and s_j are set according to the environments to which the two belong (determined by the platform types f_i and f_j), as in Table 1.
Table 1. Bandwidth and unit data transmission cost between s_i and s_j
4.1.3 Experimental parameter settings
In Section 3.1 it is assumed that the workflow inter-arrival times obey a Poisson distribution P(λ); here it is set that the user submits a workflow application to the cloud-edge environment every 2.5 s on average, i.e., λ = 2.5, so the arrival rate of the workflows is 1/λ = 0.4. Thus, for workflow w_i, its arrival time α_i is as shown in Formula (41).
Where rand (exp (λ)) is used to generate a poisson-distributed random number with parameter λ.
The cloud-edge environment consists of 5 cloud servers (s_1, s_2, …, s_5) and 5 edge servers (s_6, s_7, …, s_10). The computing capacities of cloud servers s_1 to s_5 are 2.5, 3.5, 5.0, 7.5, and 10.0 Mbps, respectively, and those of edge servers s_6 to s_10 are 2.5, 2.6, 2.2, 2.3, and 2.7 Mbps, respectively. Cloud server s_5 has the highest computing capacity, with a lease cost per unit time of 5/24 $/min (12.5 $/h). Taking the unit-time computation cost of cloud server s_5 as the benchmark, the lease costs per unit time of the remaining servers are proportional to their computing capacities.
For the control parameters of the differential evolution algorithm, the scaling factor F is set to 0.5, the crossover probability CR is set to 0.5, and the population size NP is set to 10.
4.2 Experimental results and analysis
To test the workflow-scheduling performance of the improved differential evolution algorithm in the cloud-edge environment, 10 groups of experiments were carried out on multi-workflows with different numbers of workflows; after infeasible solutions were eliminated, the average was taken as the algorithm's result at the current scale, and the cost-optimization advantage of the differential evolution algorithm in workflow scheduling was analyzed. For multi-workflows of different scales, Figs. 3, 4, and 5 intuitively show the optimal workflow execution costs (unit: $) of the three scales of multi-workflow under the scheduling strategies of the different algorithms.
The scheduling results of the small multi-workflow under different deadlines and different optimization algorithms are shown in Fig. 3. For small multi-workflows, differential evolution outperforms sequential scheduling by 43.2% on average. This is because this embodiment improves the selection operator of the traditional differential evolution algorithm, avoiding entrapment in locally optimal solutions and obtaining a better scheduling strategy. In addition, the solution space of the multi-workflow scheduling problem is generally exponential in size, while a purely random strategy is inefficient: with a limited population size and a limited number of searches, it is difficult to find a high-quality solution, or even a feasible one.
The scheduling results of the medium multi-workflow under different deadlines and different optimization algorithms are shown in Fig. 4. DE yields the optimal solution at all deadlines. Furthermore, the average cost of DE is up to 44.9% better than that of sequential scheduling. Notably, the medium multi-workflow includes a large number of data-intensive and computation-intensive tasks and has a complex structure; that is, DE performs better for scheduling composite workflows.
The scheduling results of the large multi-workflow under different deadlines and different optimization algorithms are shown in Fig. 5. As with the small multi-workflow, DE achieves the best average cost at all deadlines and is on average better than the sequential algorithm. Therefore, the DE algorithm obtains better scheduling performance on multi-workflows with larger task sizes and shows better robustness.
Combining Figs. 3, 4, and 5, the execution costs of DE and Sequence decrease as the deadline constraint is relaxed. This is because relaxing the deadline of a workflow gives each task in the workflow a more relaxed execution window, since tasks only need to finish within the workflow deadline. Tasks can therefore be allocated to cheaper servers, and more tasks can be allocated to the same server, reducing the number of rented virtual machines and thus the server rental cost. Meanwhile, the resource utilization of the four algorithms increases as the deadline increases: with a longer deadline, more parallel tasks can share the same virtual machine, compressing and reducing the idle time on these virtual machines.
THE ADVANTAGES OF THE PRESENT INVENTION
For the scheduling problem of multiple workflows, the invention provides a differential-evolution-based multi-workflow scheduling algorithm under the deadline constraint: on the premise of satisfying the deadline constraints of the multiple workflows, the execution cost of the multiple workflows is minimized by the differential evolution algorithm. In order to improve the rationality and diversity of the population evolution process, two-dimensional discrete particles are introduced to encode individuals, and the basic differential evolution algorithm is optimized with a selection operator based on the whole population, so that the fitness of the whole population improves more quickly while premature convergence is avoided, and the speed at which the algorithm searches the solution space increases. Multiple groups of simulation comparison experiments show that the performance of the differential-evolution-based multi-workflow scheduling algorithm is superior to that of other scheduling algorithms across deadlines and multi-workflow scales, and that it can effectively reduce the execution cost of multiple workflows in the cloud-edge environment.
While the invention has been described above with reference to preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. Any simple modification, equivalent change, or variation of the above embodiments according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.
The present invention is not limited to the above-mentioned preferred embodiments, and any other various types of multi-workflow scheduling methods with time delay constraint under the cloud-edge environment can be derived from the teaching of the present invention.
Claims (8)
1. A time delay constrained multi-workflow scheduling method under a cloud edge environment is characterized by comprising the following steps: on the premise of meeting the multiple workflow deadline constraint, minimizing the execution cost of the multiple workflows by using a differential evolution algorithm; in order to improve the rationality and diversity of the population evolution process, two-dimensional discrete particles are introduced to encode individuals, and the differential evolution algorithm is optimized by using a selection operator based on the whole population, so that the fitness value of the whole population is improved more quickly and the speed of searching a solution space by the algorithm is increased on the premise of avoiding premature convergence.
2. The method for scheduling multiple workflows with time delay constraints in a cloud-edge environment according to claim 1, wherein the multi-workflow deadline constraint is expressed as: the completion time of each workflow application w_i must not exceed its deadline d_i, i.e., T_f(w_i) ≤ d_i for every workflow w_i ∈ W.
3. The method for scheduling multiple workflows with time delay constraint in cloud-edge environment according to claim 2, wherein the method comprises the following steps: the construction process of the multi-workflow deadline constraint representation is as follows:
assuming that the time intervals at which users submit different workflows to the cloud-edge environment, i.e., the arrival times of the different workflows, approximately obey a Poisson distribution P(λ), where λ denotes the arrival rate of the workflows, these workflows are represented by an infinite set:
W = {w_1, w_2, …}  equation (1)
wherein each workflow is represented by a triple:
w_i = (α_i, d_i, G_i)  equation (2)
whose elements denote, in order, the arrival time, the deadline, and the structure;
the structure of a workflow is represented by a directed acyclic graph:
G_i = {T_i, E_i}  equation (3)
wherein T_i = {t_i1, t_i2, …, t_iN} (equation (4)) is the set of tasks and N denotes the number of tasks; t_ij denotes the j-th task in the i-th workflow; E_i is the set of edges between tasks; a directed edge e = (t_ip, t_ij) indicates that data are transmitted between t_ip and t_ij, t_ip being a predecessor task of t_ij and t_ij a successor task of t_ip;
pred(t_ij) (equation (5)) is the predecessor-task set of t_ij;
succ(t_ij) (equation (6)) is the successor-task set of t_ij;
owing to the flow (precedence) constraints of the workflow, a task can be assigned to a server for execution only after all of its predecessor tasks have finished executing and all data generated by those predecessors have been transmitted;
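The workflow model of equations (1)-(6) and the predecessor/successor relations above can be sketched as follows; the class and method names are illustrative, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class Workflow:
    arrival: float   # alpha_i, arrival time
    deadline: float  # d_i, deadline
    edges: list      # E_i: list of (pred, succ) task-index pairs
    n_tasks: int     # |T_i|, number of tasks N

    def pred(self, j):
        """Predecessor-task set pred(t_ij)."""
        return {p for (p, q) in self.edges if q == j}

    def succ(self, j):
        """Successor-task set succ(t_ij)."""
        return {q for (p, q) in self.edges if p == j}

    def ready(self, j, finished):
        """A task may start only after all its predecessors have finished."""
        return self.pred(j) <= set(finished)

# Diamond-shaped DAG: 0 -> {1, 2} -> 3
w = Workflow(arrival=0.0, deadline=10.0,
             edges=[(0, 1), (0, 2), (1, 3), (2, 3)], n_tasks=4)
```

Here `w.ready(3, {0, 1})` is false because task 2 has not yet finished, matching the precedence rule stated in the claim.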
in the scheduling process of the multi-workflow application, the cloud side environment provides computing resources and data transmission services for users;
cloud-edge environment:
S = {S_cloud, S_edge}  equation (7)
which consists of a cloud and an edge, wherein the cloud comprises m cloud servers:
S_cloud = {s_1, s_2, …, s_m}  equation (8)
and the edge comprises n edge servers:
S_edge = {s_{m+1}, s_{m+2}, …, s_{m+n}}  equation (9)
in the resource model, any type of server can be leased or released at any time, on the assumption that the number of servers is sufficient; a server s_k is expressed as:
s_k = (p_k, u_k, c_k^u, f_k)  equation (10)
wherein p_k denotes the computing performance of server s_k; u_k denotes the billing unit time set by server s_k for providing its service; c_k^u denotes the unit computation cost of server s_k per unit time u_k, which is approximately proportional to its computing performance; f_k ∈ {0, 1} denotes the platform type of server s_k: when f_k = 0, s_k belongs to the cloud platform and has strong computing performance; when f_k = 1, s_k belongs to the edge platform and has ordinary computing performance; according to the platform types of the servers, the bandwidth β_{r,t} between servers s_r and s_t in the cloud-edge environment is expressed as:
β_{r,t} = (b_{r,t}, c_{r,t}^{tran})  equation (11)
wherein b_{r,t} denotes the value of the bandwidth β_{r,t}, and c_{r,t}^{tran} denotes the data transmission cost incurred by transmitting 1 GB of data from server s_r to server s_t;
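The server tuple of equation (10) and the platform-dependent bandwidth of equation (11) can be sketched as below; the field names and the concrete bandwidth/cost values are placeholders chosen only to show the structure, not values from the patent:

```python
from dataclasses import dataclass

@dataclass
class Server:
    p: float  # p_k: computing performance
    u: float  # u_k: billing unit-time price
    f: int    # f_k: 0 = cloud platform, 1 = edge platform

def bandwidth(r: Server, t: Server) -> tuple:
    """beta_{r,t} = (b_{r,t}, transfer cost per GB); values are illustrative."""
    if r is t:
        return (float("inf"), 0.0)  # same server: no transmission needed
    if r.f == 0 and t.f == 0:
        return (1.0, 0.09)          # cloud <-> cloud link (placeholder values)
    return (0.1, 0.0)               # links involving an edge server (placeholder)

cloud = Server(p=3.0, u=0.5, f=0)
edge = Server(p=1.5, u=0.2, f=1)
b, cost_per_gb = bandwidth(cloud, edge)
```

The point of the tuple return is that, as in equation (11), a link carries both a bandwidth value and a per-GB transfer cost that depend on the platform types of the two endpoints.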
in the cloud-edge environment, a scheduling scheme for the multiple workflows determines the assignment of the task nodes of the multiple workflows to specific servers, i.e., the correspondence between each task of the multi-workflow application and a server;
the multi-workflow scheduling scheme is represented as:
Γ = (W, S, M, c_e, T_f)  equation (12)
wherein M denotes the mapping of the multi-workflow application W onto the cloud-edge environment S, c_e denotes the execution cost of the multi-workflow application W in the cloud-edge environment S, and T_f denotes the completion time of the multi-workflow application; for the two types of elements in the mapping M, (v_{i,j}, s_k) denotes that task v_{i,j} is executed on server s_k, and (e, s_r → s_t) denotes that the data edge e is transmitted from server s_r to server s_t; once the task-to-server sub-mapping of M is determined, the edge-transmission sub-mapping is also determined accordingly; the mapping M is therefore equivalent to its task-to-server sub-mapping;
in the cloud-edge environment, the deadline (delay) is selected as the constraint condition, and the cost-minimization problem under this constraint is studied; the scheduler is cost-driven and aims to minimize the execution cost of the optimization target through a reasonable scheduling scheme; the problem to be solved by the cost scheduler under the delay constraint is to minimize the execution cost of the multiple workflows on the premise that the deadlines of all workflows are met; each server is assumed to have sufficient storage capacity to store the data generated or transmitted during execution; the computing power of a server is measured by the task computation time t_tc, and the data transmission capability between servers is measured by the data transfer time t_dt, which are calculated as follows:
wherein equation (18) gives the computation time of task v_{i,j} on server s_k, and equation (19) gives the transmission time incurred by sending a data edge from server s_r to server s_t; when both endpoints of a data transmission edge are mapped to the same server, the data transmission time is 0;
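A minimal sketch of equations (18) and (19), assuming the usual forms (computation time as workload over server performance, transmission time as data volume over link bandwidth, zero when both endpoints share a server); the parameter names are assumptions:

```python
def t_tc(workload: float, p_k: float) -> float:
    """Equation (18): computation time of a task on server s_k
    with computing performance p_k."""
    return workload / p_k

def t_dt(data_gb: float, bandwidth_gbps: float, same_server: bool) -> float:
    """Equation (19): transfer time of a data edge from s_r to s_t;
    zero when both tasks are mapped to the same server."""
    if same_server:
        return 0.0
    return data_gb / bandwidth_gbps
```

For example, a 6-unit workload on a server with performance 3 takes 2 time units, and a 1 GB edge over a 0.5 GB/s link takes 2 time units unless both tasks share a server.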
in the delay-constrained cost scheduler, for a scheduling scheme Γ, once its mapping M is determined, the boot time t_boot(s_k) of each server s_k is determined accordingly; to calculate the execution cost c_e and the completion time T_f of the multi-workflow application, the relevant variables are defined as follows, according to the mapping M of the multi-workflow application W onto the cloud-edge environment S:
t_start(v_{i,j}, s_k): the start time of task v_{i,j} on server s_k, determined by the current idle time of server s_k and the completion times of all predecessor tasks of v_{i,j}, as shown in equation (20);
t_end(v_{i,j}, s_k): the finish time of task v_{i,j} on server s_k, equal to the sum of the start time of task v_{i,j} and its computation time on server s_k, as shown in equation (21);
t_end(v_{i,j}, s_k) = t_start(v_{i,j}, s_k) + t_tc(v_{i,j}, s_k), (v_{i,j}, s_k) ∈ M  equation (21)
t_shut(s_k): the shutdown time of server s_k, equal to the completion time of the last task executed on the server, as shown in equation (22);
c_com(s_k): the task computation cost of server s_k in the cloud-edge environment, determined by the running time of the server and calculated as shown in equation (23);
c_tran(w_i): the data transmission cost of workflow application w_i under scheduling scheme Γ, calculated as shown in equation (24);
d_i: the deadline constraint of workflow application w_i under scheduling scheme Γ, calculated as shown in equation (25);
d_i = α_i + baseline × |W| × HEFT(w_i)  equation (25)
wherein HEFT(w_i) denotes the execution time required to schedule workflow w_i with the HEFT algorithm; the parameter baseline is defined by equation (26):
based on the above definitions, the execution cost c_e and the completion time T_f of the multi-workflow application are obtained as shown in equation (27) and equation (28);
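The timing and cost recurrences of equations (20)-(23) can be sketched as follows, assuming a task starts at the later of the server's idle time and its latest predecessor-data-ready time, and assuming billing by whole intervals of length u_k's unit time; the helper names are illustrative:

```python
import math

def t_start(server_idle: float, pred_ready: float) -> float:
    """Equation (20): start time on s_k is the later of the server's
    current idle time and the readiness of all predecessor data."""
    return max(server_idle, pred_ready)

def t_end(start: float, compute_time: float) -> float:
    """Equation (21): finish time = start time + t_tc."""
    return start + compute_time

def c_com(t_boot: float, t_shut: float, u_k_price: float,
          unit: float = 1.0) -> float:
    """Equation (23) under the assumption of whole-interval billing:
    rental cost = ceil(running span / billing unit) * unit price."""
    return math.ceil((t_shut - t_boot) / unit) * u_k_price
```

E.g., a server booted at 0 and shut down at 2.5 with a unit price of 0.5 per interval is billed for 3 intervals, costing 1.5 — illustrating why packing tasks to shorten the running span reduces rental cost.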
4. The method according to claim 3, wherein encoding individuals with two-dimensional discrete particles specifically comprises:
a particle consists of task priorities and server numbers; one individual in the population corresponds to one potential scheduling scheme of the multiple workflows in the cloud-edge environment; for the G-th generation of evolution, the k-th individual X_k^G in the population is represented by equation (30);
wherein NP denotes the size of the population, and μ_{i,j} and s_{i,j} respectively denote the priority code and the server code of the j-th task v_{i,j} in the i-th workflow application; during initialization, the 0-th generation individual X_k^0 is generated as shown in equation (33):
wherein
i = 1, 2, …, |W|; j = 1, 2, …, |V_i|; k = 1, 2, …, NP  equation (34)
rand() denotes randomly selecting a real number in a given interval, and randint() denotes randomly selecting an integer in a given interval;
in the binary tuple (μ_{i,j}, s_{i,j}), μ_{i,j} is a real number representing a priority code of the multi-workflow application and s_{i,j} is an integer representing a server code of the multi-workflow application; for an element μ_{i,j} of the priority code, its value indicates the scheduling priority of the corresponding task in the scheduling scheme, and if the values of two tasks are equal, the task that arrived at the platform earlier has the higher priority; for an element s_{i,j} of the server code, its value denotes the number of the server that executes the task.
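The initialization of equations (33)-(34) — a real-valued priority code drawn with rand() and an integer server code drawn with randint() per task — can be sketched as below; the interval bounds are assumptions:

```python
import random

def init_individual(n_tasks: int, n_servers: int, seed=None):
    """0th-generation individual: one (priority, server) pair per task.
    Priorities are reals in [0, 1); server codes are integers in
    [0, n_servers - 1]. Bounds are illustrative assumptions."""
    rng = random.Random(seed)
    priority = [rng.random() for _ in range(n_tasks)]                  # mu_{i,j}
    servers = [rng.randint(0, n_servers - 1) for _ in range(n_tasks)]  # s_{i,j}
    return priority, servers

mu, s = init_individual(n_tasks=5, n_servers=3, seed=1)
```

Each individual is thus a two-row (two-dimensional) discrete particle: row one orders the tasks, row two places them on servers.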
5. The method for scheduling multiple workflows with time delay constraints in a cloud-edge environment according to claim 4, wherein the method comprises the following steps: the optimization of the differential evolution algorithm by using a selection operator based on the whole population specifically comprises the following steps:
first, the N offspring generated by the N parent individuals are all preserved; that is, when a parent individual generates one offspring, the algorithm does not immediately perform one-to-one elimination selection, but temporarily retains every new individual generated by mutation and crossover, so that the N offspring coexist with the original N parents; thus, after one round of evolution, 2N individuals are temporarily retained; then the fitness function values of the 2N individuals in the current individual pool are calculated, and the 2N individuals are sorted by fitness value in descending order; finally, the first N individuals in the sorted queue are selected as the final evolution result of the current generation and used as the parents of the next generation of evolution.
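The population-wide selection described above is essentially (μ + λ) truncation selection: pool the N parents with the N offspring, rank all 2N by fitness, and keep the best N. A minimal sketch, assuming a higher fitness value is better as in the claim's descending sort:

```python
def select_next_generation(parents, offspring, fitness):
    """Population-wide selection: rank the 2N temporarily retained
    individuals by fitness (descending) and keep the best N as the
    parents of the next generation."""
    pool = parents + offspring          # 2N individuals after one round
    pool.sort(key=fitness, reverse=True)
    return pool[:len(parents)]          # first N of the sorted queue

survivors = select_next_generation([4.0, 2.0, 9.0], [1.0, 7.0, 3.0],
                                   fitness=lambda x: x)
```

Unlike classic one-to-one DE selection, a strong offspring can here displace a weak parent other than its own, which is what speeds up the rise of the whole population's fitness.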
6. The method for scheduling multiple workflows with time delay constraints in a cloud-edge environment according to claim 5, wherein the fitness function compares two candidate solutions and is defined as follows:
(1) if both individuals are feasible solutions, the individual with the lower execution cost c_e is selected, and the fitness function is defined as shown in equation (35);
(2) if at least one of the two individuals is an infeasible solution, the fitness value is updated according to the number of workflows in each solution that satisfy the constraint condition, defined as follows:
(2.1) if the numbers of workflows satisfying the constraint condition in the two individuals are the same, then:
(2.2) if the numbers of workflows satisfying the constraint condition in the two individuals are different:
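The pairwise comparison implied by equation (35) and cases (2.1)-(2.2) can be sketched as one function; the tuple encoding of a candidate is an assumption made for illustration:

```python
def better(a, b):
    """Compare two candidates a, b, each encoded as
    (execution_cost, n_workflows_meeting_deadline, n_workflows_total).
    Returns the preferred candidate."""
    cost_a, ok_a, total_a = a
    cost_b, ok_b, total_b = b
    feasible_a = ok_a == total_a
    feasible_b = ok_b == total_b
    if feasible_a and feasible_b:
        return a if cost_a <= cost_b else b  # (1)/(35): cheaper feasible wins
    if ok_a != ok_b:
        return a if ok_a > ok_b else b       # (2.2): more satisfied deadlines wins
    return a if cost_a <= cost_b else b      # (2.1): same count, fall back to cost
```

This gives infeasible individuals a gradient toward feasibility (satisfy more deadlines first) before cost is compared, which is the usual purpose of such constraint-handling rules.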
7. The method of claim 6, wherein the method comprises:
the specific implementation process of the differential evolution algorithm is as follows:
step S1: determining the control parameters of the differential evolution algorithm and the fitness function; the control parameters of the differential evolution algorithm comprise the population size NP, the scaling factor F, and the crossover probability CR;
step S2: randomly generating an initial population;
step S3: evaluating an initial population and calculating the fitness value of each individual in the initial population;
step S4: judging whether a termination condition is reached or an evolution algebra reaches a maximum value; if so, terminating the evolution, and outputting the obtained optimal individual as an optimal solution; if not, continuing;
step S5: carrying out variation and cross operation to obtain an intermediate population;
step S6: selecting individuals from the original population and the intermediate population to obtain a new generation population;
step S7: setting the evolution generation g = g + 1 and returning to step S4;
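Steps S1-S7 can be sketched as a generic differential-evolution loop. For brevity this sketch is DE/rand/1/bin on a real vector with classical one-to-one selection at step S6; the patent instead uses the two-dimensional discrete encoding of claim 4 and the population-wide selection of claim 5:

```python
import random

def de_minimise(fitness, dim, bounds, NP=20, F=0.5, CR=0.9,
                max_gen=100, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # S1 parameters are the arguments; S2: random initial population
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(NP)]
    fit = [fitness(x) for x in pop]                     # S3: evaluate
    for _ in range(max_gen):                            # S4/S7: generation loop
        for i in range(NP):
            a, b, c = rng.sample([j for j in range(NP) if j != i], 3)
            # S5: mutation + binomial crossover, clipped to the bounds
            trial = [pop[i][d] if rng.random() > CR
                     else min(hi, max(lo, pop[a][d] + F * (pop[b][d] - pop[c][d])))
                     for d in range(dim)]
            f_trial = fitness(trial)
            if f_trial < fit[i]:                        # S6: selection (one-to-one here)
                pop[i], fit[i] = trial, f_trial
    best = min(range(NP), key=lambda i: fit[i])
    return pop[best], fit[best]

x_best, f_best = de_minimise(lambda v: sum(t * t for t in v),
                             dim=3, bounds=(-5.0, 5.0))
```

On the 3-dimensional sphere function the loop converges toward the zero vector, illustrating how the S4-S7 cycle drives the fitness down generation by generation.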
the mapping of the individuals of the population to the multi-workflow scheduling scheme is realized by the following algorithm:
the input of the algorithm 1 comprises a multi-workflow application W, a cloud edge environment S and a coded particle X, and the output is a coded particle X [2 ]]The corresponding scheduling scheme Γ ═ W, S, M, c e ,T f ) (ii) a First, map M is initialized to empty set null, and queue to be executed Q ═ Q (Q) 1 ,Q 2 ,...,Q S ) Initialized to empty queue null, data transfer cost c tran Initialization is 0; the scheduling of the multi-workflow application W then starts, the process being divided into two steps:
(1) calling Algorithm 2 to monitor the arrival of the multi-workflow application W in real time and to allocate the tasks of the multi-workflow application;
(2) calling Algorithm 3 to execute the multi-workflow application on the to-be-executed queues of all servers;
after the scheduling ends, all opened servers are shut down, and the execution cost c_e and the completion time T_f are calculated according to equation (27) and equation (28); after the calculation, if the completion time of some workflow application exceeds its deadline, the scheduling scheme does not satisfy the deadline constraint and the coded particle X is marked as an infeasible solution; finally, the scheduling scheme Γ = (W, S, M, c_e, T_f) of the workflows is returned;
During the execution of Algorithm 1, the arrival of the multi-workflow application W is monitored in real time and tasks are allocated, as shown in Algorithm 2, whose input parameters comprise the multi-workflow application W, the cloud-edge environment S, and the coded particle X; during the operation of the algorithm, if a workflow application w_i arrives, the task computation times t_tc[|V_i| × |S|] and the data transmission times t_dt[|E_i|, |S| × |S|] are calculated according to equation (18) and equation (19) respectively, and its arrival time α_i is recorded; all tasks in workflow application w_i are then traversed: if task v_{i,j} is an entry task, i.e., it has no predecessor task, it is put into the to-be-executed queue of server s_{i,j} according to its server code s_{i,j}; otherwise, task v_{i,j} is put into the task waiting pool of server s_{i,j}; if no workflow application has arrived, the algorithm waits; the algorithm ends when all workflow applications have arrived;
during the execution of Algorithm 1, the multi-workflow application must also be executed on the to-be-executed queue of each server, as shown in Algorithm 3, whose input comprises a server s_k, the to-be-executed queue Q_k of server s_k, the mapping M, and the data transmission cost c_tran; during the operation of the algorithm, if server s_k is in the shut-down state, server s_k is booted and its boot time t_boot(s_k) is set to the current time; if the to-be-executed queue Q_k of server s_k is not empty, the task v_{i,j} with the highest priority in Q_k (according to the priority code μ) is dispatched to server s_k, the corresponding mapping relation (v_{i,j}, s_k) is added to the mapping M, and Algorithm 4 is called to execute the task computation process and the data transmission process; otherwise, the algorithm waits until Q_k is non-empty; the algorithm ends when all workflow applications have been executed;
during the execution of Algorithm 3, the task computation and data transmission process of the workflow application is simulated as shown in Algorithm 4, whose input comprises a task v_{i,j} and a server s_k, and whose output is the currently generated data transmission cost; first, this cost is initialized to 0; second, the start time t_start(v_{i,j}, s_k) of task v_{i,j} is recorded, and its finish time t_end(v_{i,j}, s_k) is computed according to t_end(v_{i,j}, s_k) = t_start(v_{i,j}, s_k) + t_tc(v_{i,j}, s_k); finally, the successor tasks of v_{i,j} are traversed, the data are transmitted to the server s_{i,s} that executes each successor task v_{i,s} according to its server code s_{i,s}, and the correspondingly generated data transmission cost is calculated; at this point, if a task v_{i,s} has finished receiving all the data of its predecessor tasks, task v_{i,s} is moved from the task waiting pool of server s_{i,s} into the to-be-executed queue.
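The interplay of Algorithms 2-4 — entry tasks go straight to the encoded server's ready queue, each server pops its highest-priority task, and a finished task releases successors whose predecessors are all done — can be compressed into one sketch. The data structures are deliberate simplifications of the patent's queues and waiting pools, ignoring timing and cost:

```python
import heapq

def simulate(tasks, edges, server_of, priority_of):
    """tasks: iterable of task ids; edges: (pred, succ) pairs;
    server_of / priority_of: dicts giving the particle's server code
    and priority code per task. Returns the execution order."""
    pending = {t: {p for p, q in edges if q == t} for t in tasks}
    queues = {s: [] for s in set(server_of.values())}
    done, order = set(), []
    for t in tasks:  # Algorithm 2: entry tasks enter their server's queue
        if not pending[t]:
            heapq.heappush(queues[server_of[t]], (-priority_of[t], t))
    while any(queues.values()):  # Algorithm 3: each server pops by priority
        for s in list(queues):
            if queues[s]:
                _, t = heapq.heappop(queues[s])
                done.add(t)
                order.append(t)
                for p, succ in edges:  # Algorithm 4: release ready successors
                    if p == t and pending[succ] <= done:
                        heapq.heappush(queues[server_of[succ]],
                                       (-priority_of[succ], succ))
    return order

# Diamond DAG 0 -> {1, 2} -> 3, one server; priorities make task 2 run before 1
order = simulate([0, 1, 2, 3], [(0, 1), (0, 2), (1, 3), (2, 3)],
                 {0: 0, 1: 0, 2: 0, 3: 0}, {0: 4, 1: 1, 2: 3, 3: 2})
```

Note how the priority code only breaks ties among *ready* tasks: task 3 cannot run until both 1 and 2 have finished, regardless of its priority.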
8. The method according to claim 7, wherein the maximum number of evolution iterations k is 1000, i.e., the algorithm terminates when the 1000th evolution is completed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210702160.5A CN114925935A (en) | 2022-06-21 | 2022-06-21 | Multi-workflow scheduling method for time delay constraint in cloud edge environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114925935A true CN114925935A (en) | 2022-08-19 |
Family
ID=82814883
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210702160.5A Pending CN114925935A (en) | 2022-06-21 | 2022-06-21 | Multi-workflow scheduling method for time delay constraint in cloud edge environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114925935A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107656799A (en) * | 2017-11-06 | 2018-02-02 | 福建师范大学 | The workflow schedule method of communication and calculation cost is considered under a kind of more cloud environments |
CN108133260A (en) * | 2018-01-17 | 2018-06-08 | 浙江理工大学 | The workflow schedule method of multi-objective particle swarm optimization based on real-time status monitoring |
CN109597682A (en) * | 2018-11-26 | 2019-04-09 | 华南理工大学 | A kind of cloud computing workflow schedule method using heuristic coding strategy |
JPWO2020235649A1 (en) * | 2019-05-21 | 2020-11-26 |
Non-Patent Citations (1)
Title |
---|
LIN Chaowei et al., "Research on Fuzzy-Theory-Based Scientific Workflow Scheduling in Edge Environments", Computer Science, vol. 49, no. 2, 28 February 2022 (2022-02-28), pages 312-320 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||