CN110837413B - Computing migration scheduling method for deep neural network application in edge environment

Computing migration scheduling method for deep neural network application in edge environment

Info

Publication number
CN110837413B
Authority
CN
China
Prior art keywords
time
node
subtask
task
average response
Prior art date
Legal status
Active
Application number
CN201911143030.7A
Other languages
Chinese (zh)
Other versions
CN110837413A (en)
Inventor
陈星
胡俊钦
张佳俊
黄引豪
陈佳晴
Current Assignee
Fuzhou University
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN201911143030.7A
Publication of CN110837413A
Application granted
Publication of CN110837413B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/126Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Abstract

The invention relates to a computation migration scheduling method for deep neural network applications in an edge environment. An optimal scheduling scheme is searched for based on an evaluation algorithm and a genetic algorithm, and the optimal scheduling scheme is used to schedule n tasks so that their average response time is minimized: the n tasks are divided by layer into n×m subtasks, each subtask corresponds to a gene locus, the gene at each locus represents the execution node of the subtask corresponding to that locus, and each individual is a feasible solution. The evaluation algorithm is used to calculate the average response time of each individual and find the best individual of the current generation; the average response time of the population is calculated, and selection, crossover and mutation operations are performed in turn to obtain the offspring population. By iterating continuously, the best individual, i.e., the optimal scheduling scheme, is found in the continuously updated offspring populations, and the average response time of the optimal scheduling scheme is obtained. The method helps reduce the computation migration scheduling time.

Description

Computing migration scheduling method for deep neural network application in edge environment
Technical Field
The invention relates to the technical field of computational migration, in particular to a computational migration scheduling method for deep neural network application in an edge environment.
Background
With the continuous development of deep learning technology, more and more deep neural network (DNN) applications have come into public view and become an integral part of people's daily lives, such as personalized recommendation systems, face recognition systems, and license plate recognition systems.
DNN applications offer rich functionality and high intelligence by relying on large-scale, structurally complex deep neural networks. Executing these applications demands high device performance, but the computing resources of mobile devices are limited and cannot complete such workloads on their own. At present, the main solution is to migrate some computationally intensive neural network layers to the resource-rich remote cloud for execution through computation migration technology, and then return the results to the mobile device.
Computation migration introduces time delays: data transmission between computing nodes causes transmission delay, and after a task is migrated to the target node it may have to queue because of the node's limited concurrency, causing a waiting delay. If the total migration delay is too long, the average response time of the tasks increases significantly, affecting the user experience. Different DNN tasks require different amounts of data transmission, and the network connection conditions and data transmission rates among the computing resources also differ. Thus, different task schedules produce different time delays, and finding a task schedule with a low average response time is a difficult problem.
The traditional migration approach places all tasks on the mobile edge or on the cloud for execution and then returns the results, but this causes a huge data transmission delay. Therefore, a good scheduling scheme is necessary, especially when multiple tasks are executed concurrently.
Disclosure of Invention
The invention aims to provide a computation migration scheduling method for deep neural network application in an edge environment, which is beneficial to reducing computation migration scheduling time.
To achieve this purpose, the invention adopts the following technical scheme: a computation migration scheduling method for deep neural network applications in an edge environment, in which an optimal scheduling scheme is found based on an evaluation algorithm and a genetic algorithm, n tasks are scheduled through the optimal scheduling scheme, and the average response time of the n tasks is minimized. Specifically:
The n tasks are divided by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m gene loci in total. The gene at each locus represents the execution node of the corresponding subtask; there are k genes in total, corresponding to k nodes, and each individual u_i is a feasible solution to the scheduling problem. The evaluation algorithm is used to calculate the average response time u_i.time of each individual, i.e., of each scheduling scheme, and the best individual of the current generation is found. The average response time of the population is calculated; selection and crossover operations are performed in turn to select superior individuals from the parent population and generate offspring individuals, and a mutation operation is then performed to obtain the offspring population. Iterating in this way, the best individual, i.e., the optimal scheduling scheme, is found in the continuously updated offspring populations, giving the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
Further, the method comprises the steps of:
1) Inputting: mobile device, cloud and edge node set S = { S = 1 ,s 2 ,...,s k In which s is i Represents the ith node; set of tasks T = { T = { T } 1 ,T 2 ,...,T n Where T is i Representing the ith task, each task T i And is also denoted as T i ={t i,1 ,t i,2 ,...,t i,m },t i,j A jth layer sub-task representing an ith task;
2) Dividing n tasks into n x m subtasks according to layers based on a genetic algorithm, wherein each subtask corresponds to a gene locus, the total number of the gene loci is n x m, and the gene on each gene locus represents the subtask t corresponding to the gene locus i,j Is executed node s x Total k genes corresponding to k nodes, each of u i Is a feasible solution of the scheduling problem; initializing initial generation population position, randomly initializing global variable best to represent optimal individual, and recording postTime is the average response time of the optimal individual;
3) Calculating the average response time u of each individual by adopting an evaluation algorithm i .time;
4) Comparing the average response time of all the individuals with best time, and if the individual time is less than best time, updating best and best time;
5) Calculating the average response time of the population of this generation
Figure BDA0002281472250000021
Wherein size is the size of the population;
6) And (3) carrying out selection operation: selecting individuals having an average response time less than the population average response time, i.e., u i Time < averTime, retained as a population of progeny;
7) Expanding the child population by crossover operations: selecting two individuals u from reserved individuals by roulette selection 1 And u 2 Taking them as parent to make single-point crossing to produce two filial individuals u 1 ' and u 2 ' repeating the process to generate new filial individuals until the size of the filial population reaches size;
8) Carrying out mutation operation: setting a variation rate mu, then randomly generating a random number, and if the generated random number is less than the variation rate mu, performing variation operation; when carrying out mutation operation, randomly generating the number num of genes to be mutated, then randomly generating hum gene loci, and randomly changing genes on the num gene loci into another gene locus; thereby generating a final population of progeny;
9) Continuously repeating the steps 3-8 to iterate until an iteration termination condition is met, wherein the best record is the optimal scheduling scheme, and the average response time is best.
10 Average response time of the output optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
Further, in step 1, the input data further includes: the number of concurrent lanes p_x of node s_x, where x ∈ [1, k]; the data transmission rates between nodes {v_{i,j} | i, j ∈ [1, k]}, where v_{i,j} represents the data transmission rate between node i and node j; the round-trip times between nodes {r_{i,j} | i, j ∈ [1, k]}, where r_{i,j} represents the round-trip time between node i and node j; the set of execution times of the subtasks on the different nodes {time_{i,j} | i ∈ [1, m], j ∈ [1, k]}, where time_{i,j} represents the execution time of the i-th layer subtask t_{x,i} of a task T_x on node s_j; and the set of data transmission amounts between subtasks D = {d_1, d_2, ..., d_m}, where d_j represents the amount of data transferred from subtask t_{i,j} to subtask t_{i,j+1}. In step 3, the evaluation algorithm is used to calculate the average response time u_i.time of each individual u_i, i.e., of each scheduling scheme, through the following steps:
301) Set a variable currTime representing the current time and initialize it to 0; the input scheduling scheme is represented by a two-dimensional array scheme, which has k rows corresponding to the k nodes, each row storing, in execution order, the subtasks executed on that node; s_x.empty denotes the number of remaining idle lanes of node s_x and is initialized to p_x; for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time, and which are initialized before the simulation starts (the initialization formula appears as an image in the original);
302) Set a time-slice variable slice and initialize it to infinity;
303) Fill the lanes: for each node s_x in the node set S, place its subtasks t_{i,j} into lanes in the order given by the scheduling scheme; each time a subtask is placed, decrement s_x.empty by 1 and remove that subtask from the scheduling scheme, until the node has no idle lane, i.e., s_x.empty = 0; a subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask has already reached node s_x;
304) Find the minimum time slice: for each node s_x in the node set S, traverse all subtasks in its lanes and find the subtask t_{i,j} with the minimum remaining execution time; take its remaining time t_{i,j}.time as slice, then add slice to the current time currTime to obtain the new current time, representing that a time of length slice has elapsed;
305) Compute the remaining time of the subtasks: for each node s_x in the node set S, traverse all subtasks in its lanes and subtract slice from their remaining time t_{i,j}.time to obtain the new remaining time, representing that each subtask has executed for a further time of length slice; for a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of remaining idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed; meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; find the execution node s_y of subtask t_{i,j+1} and calculate the transmission time g_{i,j+1}(x, y) of subtask t_{i,j+1} from node s_x to node s_y, then:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y)
where the transmission time is g_{i,j+1}(x, y) = d_j / v_{x,y} + r_{x,y};
Step 306) For all subtasks in the scheduling scheme, repeat steps 302-305 until the arrival time and completion time of every subtask have been determined;
Step 307) Calculate the average response time f_ave(T) of the scheduling scheme, i.e., the average response time u_i.time of individual u_i, according to the following formula:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival)
Compared with the prior art, the invention has the following beneficial effects: the method can generate a scheduling scheme with the minimum average response time for a given edge environment and task set, effectively reduces the computation migration scheduling time of DNN applications in the edge environment, and has strong practicability and broad application prospects.
Drawings
FIG. 1 is a flow chart of a method implementation of an embodiment of the present invention.
Fig. 2 is a node map provided in an embodiment of the present invention.
Fig. 3 is a timing diagram of task arrival in an embodiment of the invention.
FIG. 4 is a graph comparing average response times in the examples of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and specific embodiments.
The invention provides a computation migration scheduling method for deep neural network applications in an edge environment, which searches for an optimal scheduling scheme based on an evaluation algorithm and a genetic algorithm and schedules n tasks through that scheme so that their average response time is minimized. Specifically: the n tasks are divided by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m loci in total; the gene at each locus represents the execution node of the corresponding subtask; there are k genes in total, corresponding to k nodes, and each individual u_i is a feasible solution to the scheduling problem. The evaluation algorithm is used to calculate the average response time u_i.time of each individual, i.e., of each scheduling scheme, and the best individual of the current generation is found. The average response time of the population is calculated; selection and crossover operations are performed in turn to select superior individuals from the parent population and generate offspring individuals, and a mutation operation is performed to obtain the offspring population. Iterating in this way, the best individual, i.e., the optimal scheduling scheme, is found in the continuously updated offspring populations, giving the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node. The implementation flow of the method is shown in FIG. 1.
In this embodiment, the method includes the steps of:
1) Inputting: mobile device, cloud and edge node set S = { S = 1 ,s 2 ,...,s k In which s i Representing the ith node (the set of nodes S includes three parts: the set of mobile devices M = { M =:) 1 ,m 2 ,...,m a In which m is i Represents the ith mobile device; edge device set E = { E = { E } 1 ,e 2 ,...,e b In which e i Represents the ith edge device; a cloud computing center c); set of tasks T = { T = { T } 1 ,T 2 ,...,T n In which T is i Representing the ith task, each task T i And is also denoted as T i ={t i,1 ,t i,2 ,...,t i,m },t i,j A jth layer sub-task representing an ith task; node s x Number of concurrent strokes p x Where x ∈ [1, k ]](ii) a Data transmission rate between nodes
Figure BDA0002281472250000051
Wherein v is i,j Representing nodesThe data transmission rate between i and node j; round trip time between nodes->
Figure BDA0002281472250000052
Wherein r is i,j Represents the round trip time between node i and node j; task arrival rate λ on mobile device i i Where i ∈ [1, a ]](ii) a Set of execution times of subtasks on different nodes
Figure BDA0002281472250000053
Wherein time i,j Representing a task T x Ith sub-task t x,i At node s j The execution time of (1); set of data transmission amounts D = { D ] between subtasks 1 ,d 2 ,...,d m In which d is j Representing a subtask t i,j To subtask t i,j+1 The amount of data transfer therebetween.
2) Dividing n tasks into n x m subtasks according to layers based on a genetic algorithm, wherein each subtask corresponds to a gene locus, the total number of the gene loci is n x m, and the gene on each gene locus represents the subtask t corresponding to the gene locus i,j Is executed node s x Total k genes corresponding to k nodes, each of u i Is a feasible solution of the scheduling problem; initializing initial generation population position, randomly initializing a global variable best to represent an optimal individual, and recording the average response time of the optimal individual as best.
3) Calculating each individual u by adopting an evaluation algorithm i I.e. average response time u per scheduling scheme i .time。
The method specifically comprises the following steps:
301) Set a variable currTime representing the current time and initialize it to 0; the input scheduling scheme is represented by a two-dimensional array scheme, which has k rows corresponding to the k nodes, each row storing, in execution order, the subtasks executed on that node; s_x.empty denotes the number of remaining idle lanes of node s_x and is initialized to p_x; for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time, and which are initialized before the simulation starts (the initialization formula appears as an image in the original).
302) Set a time-slice variable slice and initialize it to ∞.
303) Fill the lanes: for each node s_x in the node set S, place its subtasks t_{i,j} into lanes in the order given by the scheduling scheme; each time a subtask is placed, decrement s_x.empty by 1 and remove that subtask from the scheduling scheme, until the node has no idle lane, i.e., s_x.empty = 0; a subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask has already reached node s_x.
pool_x denotes the set of subtasks placed into the lanes of node s_x; filling the lanes specifically comprises the following steps:
3031) For each subtask t_{i,j} of each node s_x in the node set S: when s_x.empty > 0, perform step 3032; otherwise, stop filling the lanes;
3032) If t_{i,j}.arrival ≤ currTime, i.e., subtask t_{i,j} has arrived at node s_x, place subtask t_{i,j} into a lane, remove it from the scheduling scheme, and decrement s_x.empty by 1;
3033) Repeat steps 3031-3032 until the node has no idle lane.
304) Find the minimum time slice: for each node s_x in the node set S, traverse all subtasks in its lanes and find the subtask t_{i,j} with the minimum remaining execution time; take its remaining time t_{i,j}.time as slice, then add slice to the current time currTime to obtain the new current time, representing that a time of length slice has elapsed.
305) Compute the remaining time of the subtasks: for each node s_x in the node set S, traverse all subtasks in its lanes and subtract slice from their remaining time t_{i,j}.time to obtain the new remaining time, representing that each subtask has executed for a further time of length slice. For a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of remaining idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed. Meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; find the execution node s_y of subtask t_{i,j+1} and calculate the transmission time g_{i,j+1}(x, y) of subtask t_{i,j+1} from node s_x to node s_y, then:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y)
where the transmission time is g_{i,j+1}(x, y) = d_j / v_{x,y} + r_{x,y}.
Step 306) For all subtasks in the scheduling scheme, repeat steps 302-305 until the arrival time and completion time of every subtask have been determined.
Step 307) Calculate the average response time f_ave(T) of the scheduling scheme, i.e., the average response time u_i.time of individual u_i, according to the following formula:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival)
4) Compare the average response time of each individual with best.time; if u_i.time < best.time, update best and best.time.
5) Calculate the average response time of the current generation of the population:
averTime = (1/size) Σ_{i=1}^{size} u_i.time
where size is the size of the population.
6) Perform the selection operation: individuals whose average response time is less than the population average response time, i.e., u_i.time < averTime, are selected and retained as the offspring population.
7) Expand the offspring population by the crossover operation: select two individuals u_1 and u_2 from the retained individuals by roulette-wheel selection, use them as parents to perform single-point crossover and produce two offspring individuals u_1' and u_2', and repeat this process to generate new offspring individuals until the offspring population reaches size.
8) Perform the mutation operation: set a mutation rate μ, then randomly generate a random number in the range [0, 0.1]; if the generated random number is less than the mutation rate μ, perform mutation. When mutating, randomly generate the number num of genes to be mutated, then randomly select num gene loci and randomly change the gene at each of these loci to another gene, thereby generating the final offspring population.
9) Repeat steps 3 to 8 to iterate until the iteration termination condition is met (in this embodiment, the termination condition is a set number of iterations); the individual recorded in best is the optimal scheduling scheme, and its average response time is best.time.
10) Output the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
The related technical content of the scheduling problem and scheduling scheme of the present invention is further described below.
1 Overview of the method
The invention discloses a method for automatically generating a DNN task scheduling scheme in an edge environment. The genetic algorithm first generates initial solutions and then iterates continuously, guided by the evaluation algorithm, to produce better solutions.
The detailed technical scheme is as follows.
2 Technical scheme
2.1 Definitions
2.1.1 Symbol definitions
Definition 1: set of mobile device, cloud and edge node as S = { S = } 1 ,s 2 ,...,s k In which s is i Represents the ith node;
definition 2: set of mobile devices is M = { M = { [ M ] 1 ,m 2 ,...,m a In which m is i Represents the ith mobile device;
definition 3: set of edge devices E = { E = { E = } 1 ,e 2 ,...,e b In which e i Represents the ith edge device;
definition 4: the cloud computing center is denoted as c;
definition 5: node s i The number of concurrent swimming channels is p i ,i∈[1,k];
Definition 6: data transmission rate between nodes is
Figure BDA0002281472250000081
Wherein v is i,j Representing the data transmission rate between node i and node j;
definition 7: round trip time between nodes of
Figure BDA0002281472250000091
Wherein r is i,j Represents the round trip time between node i and node j;
definition 8: set of tasks is T = { T = } 1 ,T 2 ,...,T n Where T is i Indicating the ith task. Each task T i And may be represented as T i ={t i,1 ,t i,2 ,...,t i,m },t i,j A jth layer sub-task representing an ith task;
definition 9: mobile deviceThe task arrival rate at i is λ i ,i∈[1,a];
Definition 10: the execution time of the subtasks on different nodes is set as
Figure BDA0002281472250000092
Wherein time i,j Representing a task T x Ith subtask t of (1) x,i At node s j The execution time of (1);
definition 11: the data transmission amount among the subtasks is D = { D = { (D) } 1 ,d 2 ,...,d m In which d is j Representing a subtask t i,j To subtask t i,j+1 The amount of data transfer therebetween.
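To make the symbol definitions above concrete, the following is a minimal Python sketch of one way the inputs could be represented. The container and field names (EdgeEnvironment, rate, rtt, exec_time) and the example values are illustrative assumptions, not part of the patent; only the symbols S, M, E, c, p_i, v_{i,j}, r_{i,j}, λ_i, time_{i,j} and D come from the definitions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EdgeEnvironment:
    """Illustrative container for Definitions 1-11 (names are assumptions)."""
    nodes: List[str]              # S = {s_1, ..., s_k}: mobile devices + edge devices + cloud c
    lanes: List[int]              # p_i: number of concurrent lanes of node s_i
    rate: List[List[float]]       # v_{i,j}: data transmission rate between node i and node j
    rtt: List[List[float]]        # r_{i,j}: round-trip time between node i and node j
    arrival_rate: List[float]     # lambda_i: task arrival rate on mobile device i
    exec_time: List[List[float]]  # time_{i,j}: execution time of layer-i subtasks on node s_j
    data: List[float]             # D = {d_1, ..., d_m}: data volume from t_{i,j} to t_{i,j+1}

# A toy environment with k = 3 nodes (one mobile device, one edge device, one cloud)
# and m = 2 DNN layers; all numbers are made up purely for illustration.
env = EdgeEnvironment(
    nodes=["m1", "e1", "c"],
    lanes=[1, 2, 4],
    rate=[[0.0, 10.0, 2.0], [10.0, 0.0, 5.0], [2.0, 5.0, 0.0]],   # assumed unit: Mb/ms
    rtt=[[0.0, 1.0, 20.0], [1.0, 0.0, 15.0], [20.0, 15.0, 0.0]],  # assumed unit: ms
    arrival_rate=[4.0],                                           # tasks per second on m1
    exec_time=[[50.0, 20.0, 5.0], [80.0, 30.0, 8.0]],             # ms per layer per node
    data=[1.2, 0.3],                                              # Mb between consecutive layers
)
```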
2.1.2 Problem definition
t_{i,j}.arrival represents the time at which subtask t_{i,j} arrives at the node s_y that executes it; the time at which the subtask starts to execute is denoted t_{i,j}.begin.
w_{i,j}(y) represents the queue waiting time of subtask t_{i,j} on node s_y:
w_{i,j}(y) = t_{i,j}.begin − t_{i,j}.arrival (2-1-1)
The transmission time of subtask t_{i,j} from node s_x to node s_y is defined as:
g_{i,j}(x, y) = d_{j-1} / v_{x,y} + r_{x,y} (2-1-2)
The response time of task T_i is the time elapsed from the generation of the task on the mobile device to the completion of its execution, and equals the sum of the response times of all of its subtasks. The response time f_{i,j}(y) of subtask t_{i,j} consists of three parts: execution time, transmission time and waiting time:
f_{i,j}(y) = time_{j,y} + g_{i,j}(x, y) + w_{i,j}(y) (2-1-3)
The response time of task T_i is then calculated as:
f_resp(T_i) = Σ_{j=1}^{m} f_{i,j}(y) (2-1-4)
where s_y is the node that executes t_{i,j}. The average response time of all n tasks is defined as:
f_ave(T) = (1/n) Σ_{i=1}^{n} f_resp(T_i) (2-1-5)
The goal is to find a scheduling scheme that schedules the n tasks such that their average response time f_ave(T) is minimized. The scheduling scheme specifies the execution node of each subtask, as well as the execution order on that node.
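As a hedged illustration of formulas (2-1-1) through (2-1-5), the short sketch below computes the waiting time, transmission time, subtask response time, task response time and average response time from given timestamps. The concrete form of the transmission time (data volume divided by link rate, plus the round-trip time) is an assumption consistent with Definitions 6, 7 and 11; all function names and numbers are illustrative.

```python
from typing import List

def waiting_time(begin: float, arrival: float) -> float:
    # (2-1-1): w_{i,j}(y) = t_{i,j}.begin - t_{i,j}.arrival
    return begin - arrival

def transmission_time(d_prev: float, v_xy: float, r_xy: float, same_node: bool) -> float:
    # (2-1-2), assumed form: g_{i,j}(x,y) = d_{j-1}/v_{x,y} + r_{x,y};
    # zero when the predecessor ran on the same node (assumption, not stated explicitly).
    return 0.0 if same_node else d_prev / v_xy + r_xy

def subtask_response(exec_t: float, trans_t: float, wait_t: float) -> float:
    # (2-1-3): f_{i,j}(y) = time_{j,y} + g_{i,j}(x,y) + w_{i,j}(y)
    return exec_t + trans_t + wait_t

def task_response(layer_responses: List[float]) -> float:
    # (2-1-4): f_resp(T_i) = sum over the m layer subtasks
    return sum(layer_responses)

def average_response(task_responses: List[float]) -> float:
    # (2-1-5): f_ave(T) = (1/n) * sum of the task response times
    return sum(task_responses) / len(task_responses)

# Tiny worked example with made-up numbers (milliseconds):
w = waiting_time(begin=12.0, arrival=10.0)                   # 2.0
g = transmission_time(d_prev=1.2, v_xy=10.0, r_xy=1.0,
                      same_node=False)                       # 1.12
f = subtask_response(exec_t=20.0, trans_t=g, wait_t=w)       # 23.12
print(average_response([task_response([f, 30.0])]))          # 53.12
```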
2.2 Evaluation algorithm
2.2.1 Description of the algorithm
The evaluation algorithm is used to evaluate a scheduling scheme: given a scheduling scheme, it calculates the scheme's average response time. The smaller the average response time, the better the scheme. The idea of the evaluation algorithm is to simulate the execution of the tasks according to the scheduling scheme and to compute the start and completion time of each subtask, from which the average response time of the scheduling scheme is obtained.
A variable currTime is set to indicate the current time and initialized to 0. The input scheduling scheme is represented by a two-dimensional array scheme; the array has k rows corresponding to the k nodes, and each row stores, in execution order, the subtasks executed on that node. In addition, for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time.
They are initialized as follows (the initialization formula appears as an image in the original).
For a task, the response time is the time from its generation to its completion, so it can be calculated once the task's generation time and completion time are known. The generation time of task T_i is the arrival time t_{i,1}.arrival of its first-layer subtask t_{i,1}, and its completion time is the completion time t_{i,m}.end of its last-layer subtask t_{i,m}. Thus, the response time of task T_i is:
f_resp(T_i) = t_{i,m}.end − t_{i,1}.arrival (2-2-1)
The average response time is then calculated according to formula (2-1-5) as:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival) (2-2-2)
Therefore, it is only necessary to determine the arrival time and completion time of each subtask t_{i,j}; the average response time of the scheme can then be calculated according to formula (2-2-2).
2.2.2 Algorithm process
First, fill the lanes. According to the scheduling scheme, the subtasks of each node are placed into its lanes in order until the node has no idle lane; s_x.empty denotes the number of remaining idle lanes of node s_x, and the subtasks that have been placed into lanes are removed from the scheme. A subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask must already have reached node s_x. pool_x denotes the set of subtasks placed into the lanes of node s_x.
Second, find the minimum time slice. Traverse all subtasks in every lane to find the subtask t_{i,j} with the minimum remaining execution time, and take its remaining time t_{i,j}.time as the time slice slice. The current time currTime is then advanced by the time slice, representing that a time of length slice has elapsed.
Third, calculate the remaining time of the subtasks. For every subtask in the lanes, subtract the time slice from its remaining time, indicating that the subtask has executed for a further time of length slice. For a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed. Meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; the execution node s_y of subtask t_{i,j+1} is found and its transmission time calculated, so that:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y) (2-2-3)
The above steps are repeated until the arrival time and completion time of all subtasks have been determined. Finally, the average response time of the scheme is calculated according to formula (2-2-2).
2.2.3 Algorithm implementation
Algorithm 1: Evaluation algorithm (the pseudocode is given as images in the original document). A hedged sketch of the simulation it describes follows.
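The following Python sketch is one possible reading of the lane-based simulation described in sections 2.2.1-2.2.2; since the original pseudocode is only available as images, it is an interpretation under stated assumptions rather than the patent's own implementation. In particular, it assumes that a subtask's remaining time starts as its execution time on the assigned node, that first-layer arrival times are the task arrival times on the mobile devices, and that the transmission time has the form d/v + r; all identifiers are illustrative.

```python
import math
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (task index i, layer index j), 1-based as in the text

def evaluate(scheme: Dict[Key, int],        # subtask -> node index (decoded individual u_i)
             order: List[List[Key]],        # per node: its subtasks in execution order
             lanes: List[int],              # p_x: number of concurrent lanes per node
             exec_time: List[List[float]],  # exec_time[j-1][x]: layer j on node x
             v: List[List[float]],          # data transmission rates between nodes
             r: List[List[float]],          # round-trip times between nodes
             d: List[float],                # d[j-1]: data volume from layer j to layer j+1
             task_arrival: List[float],     # arrival time of t_{i,1} on its mobile device
             m: int) -> float:
    n = len(task_arrival)
    arrival = {(i, 1): task_arrival[i - 1] for i in range(1, n + 1)}  # t_{i,j}.arrival
    end: Dict[Key, float] = {}                                        # t_{i,j}.end
    remaining: Dict[Key, float] = {}                                  # t_{i,j}.time
    pending = [list(row) for row in order]            # mutable copy of the scheme
    running: List[List[Key]] = [[] for _ in order]    # pool_x: subtasks in the lanes
    empty = list(lanes)                               # s_x.empty
    curr = 0.0                                        # currTime

    while len(end) < n * m:
        # Step 303: fill lanes with subtasks that have already arrived.
        for x, queue in enumerate(pending):
            while empty[x] > 0 and queue and arrival.get(queue[0], math.inf) <= curr:
                i, j = queue.pop(0)
                remaining[(i, j)] = exec_time[j - 1][x]
                running[x].append((i, j))
                empty[x] -= 1
        # Step 304: the minimum remaining time among all running subtasks is the slice.
        slice_ = min((remaining[t] for lane in running for t in lane), default=math.inf)
        if math.isinf(slice_):    # nothing is running: jump to the next known arrival
            curr = min(arrival.get(q[0], math.inf) for q in pending if q)
            continue
        curr += slice_
        # Step 305: advance every running subtask by slice_ and handle completions.
        for x, lane in enumerate(running):
            for t in list(lane):
                remaining[t] -= slice_
                if remaining[t] <= 1e-9:               # t_{i,j}.time <= 0: finished
                    i, j = t
                    end[t] = curr                       # t_{i,j}.end = currTime
                    lane.remove(t)
                    empty[x] += 1
                    if j != m:                          # generate successor t_{i,j+1}
                        y = scheme[(i, j + 1)]
                        g = 0.0 if y == x else d[j - 1] / v[x][y] + r[x][y]  # assumed form
                        arrival[(i, j + 1)] = curr + g
    # Formula (2-2-2): average of t_{i,m}.end - t_{i,1}.arrival over the n tasks.
    return sum(end[(i, m)] - task_arrival[i - 1] for i in range(1, n + 1)) / n
```

The value returned by this sketch is what the genetic algorithm below would use as the fitness u_i.time of an individual.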
2.3 Genetic algorithm
2.3.1 Description of the algorithm
The n tasks are divided by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m loci in total, numbered from 0. The i-th gene locus corresponds to the subtask t_{⌊i/m⌋+1, i%m+1}; for example, the 0th locus corresponds to subtask t_{1,1}, the 1st locus corresponds to subtask t_{1,2}, the m-th locus corresponds to subtask t_{2,1}, and so on.
The gene at each locus represents the execution node s_x of the corresponding subtask t_{i,j}; for example, if subtask t_{1,1} executes on node s_1, the gene at the 0th locus is 1. There are k genes in total, corresponding to the k nodes.
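A small sketch of this chromosome encoding, assuming 0-based loci and 1-based task, layer and node numbering as in the text; the helper names are illustrative, not the patent's.

```python
from typing import Tuple

def locus_to_subtask(locus: int, m: int) -> Tuple[int, int]:
    """Gene locus -> subtask t_{i,j}: i = locus // m + 1, j = locus % m + 1."""
    return locus // m + 1, locus % m + 1

def subtask_to_locus(i: int, j: int, m: int) -> int:
    """Inverse mapping: subtask t_{i,j} -> gene locus."""
    return (i - 1) * m + (j - 1)

# An individual u_i is simply a list of n*m genes, one execution node per locus.
# With m = 7 layers, locus 0 is t_{1,1}, locus 7 is t_{2,1}, and so on.
m = 7
individual = [1] * (2 * m)        # toy: two tasks, every subtask assigned to node s_1
print(locus_to_subtask(7, m))     # (2, 1)
print(subtask_to_locus(1, 2, m))  # 1
```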
Each individual u_i is a feasible solution to the problem, and for each feasible solution the average response time, denoted u_i.time, can be calculated by the evaluation algorithm. The population size is denoted size. In each generation, the average time averTime of the population is calculated as
averTime = (1/size) Σ_{i=1}^{size} u_i.time (2-3-1)
Then the individuals whose average response time is less than the population average, i.e., u_i.time < averTime, are selected and retained for the next generation. Because the selection operator shrinks the population, the offspring population must be expanded by the crossover operator to keep the population size at size. Two individuals u_1 and u_2 are selected from the retained individuals by roulette-wheel selection and used as parents for single-point crossover, producing two offspring individuals u_1' and u_2'; this process is repeated until the offspring population size reaches size.
The mutation operation specifies a mutation rate μ and then randomly generates a random number in the range [0, 0.1]; if the generated random number is smaller than the mutation rate μ, the mutation operation is performed. When mutating, the number num of genes to be mutated is randomly generated, num gene loci are randomly selected, and the gene at each of these loci is randomly changed to another gene. In this way, the genetic mutation of organisms in nature is simulated.
2.3.2 Algorithm process
First, calculate the average response time of each individual according to Algorithm 1.
Second, find and record the best individual best.
Third, calculate the average time of the population according to formula (2-3-1).
Fourth, perform the selection and crossover operations in turn.
Fifth, perform the mutation operation.
2.3.3 Algorithm implementation
Algorithm 2: Genetic algorithm (the pseudocode is given as images in the original document). A hedged sketch of the loop it describes follows.
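The sketch below is one way to realize the genetic loop of sections 2.3.1-2.3.2 in Python. Because the original pseudocode is only available as images, this is an interpretation under stated assumptions: the roulette weights are taken as 1/u_i.time (the patent does not spell out the weighting), node indices are 0-based, and the evaluation function is passed in as a parameter, standing in for the evaluation-algorithm sketch of section 2.2.3. All identifiers are illustrative.

```python
import random
from typing import Callable, List

Individual = List[int]  # gene list: one execution node index per locus (section 2.3.1)

def genetic_search(evaluate_fn: Callable[[Individual], float],
                   num_loci: int, num_nodes: int,
                   size: int = 50, generations: int = 100,
                   mu: float = 0.05) -> Individual:
    # Initial population: a random execution node for every subtask.
    population = [[random.randrange(num_nodes) for _ in range(num_loci)]
                  for _ in range(size)]
    best, best_time = population[0][:], float("inf")

    for _ in range(generations):
        times = [evaluate_fn(u) for u in population]            # u_i.time via Algorithm 1
        for u, t in zip(population, times):
            if t < best_time:                                   # keep the global best
                best, best_time = u[:], t
        aver_time = sum(times) / size                           # formula (2-3-1)
        # Selection: keep individuals faster than the population average.
        kept_pairs = [(u, t) for u, t in zip(population, times) if t < aver_time]
        if not kept_pairs:                                      # guard: fall back to the best
            kept_pairs = [(best[:], best_time)]
        kept = [u for u, _ in kept_pairs]
        weights = [1.0 / t for _, t in kept_pairs]              # assumed roulette weights
        # Crossover: expand the offspring population back to `size`.
        offspring = [u[:] for u in kept]
        while len(offspring) < size:
            u1, u2 = random.choices(kept, weights=weights, k=2)
            point = random.randrange(1, num_loci)               # single-point crossover
            offspring.append(u1[:point] + u2[point:])
            offspring.append(u2[:point] + u1[point:])
        offspring = offspring[:size]
        # Mutation: test a random number in [0, 0.1] against the rate mu, as in the text.
        for u in offspring:
            if random.uniform(0.0, 0.1) < mu:
                num = random.randrange(1, num_loci + 1)         # number of genes to mutate
                for locus in random.sample(range(num_loci), num):
                    u[locus] = random.randrange(num_nodes)      # change to another gene
        population = offspring
    return best

# Toy usage with a dummy fitness (so the sketch runs standalone); in the method itself
# evaluate_fn would be the evaluation algorithm applied to the decoded scheduling scheme.
result = genetic_search(evaluate_fn=lambda u: 1.0 + sum(u), num_loci=14, num_nodes=3)
print(len(result))  # 14 genes: one execution node per subtask of two 7-layer tasks
```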
3 Method evaluation
3.1 Evaluation settings
As shown in FIG. 2, 7 nodes are provided: 4 mobile devices (m1, m2, m3, m4), 2 edge devices (e1, e2) and 1 cloud computing center (c). The performance parameters of each device are listed in Table 3-1.
Table 3-1 Performance parameters of the devices (the table is given as an image in the original)
Meanwhile, experiments were conducted using a 7-layer DNN application, with the total number of tasks set to 12, i.e., 84 subtasks in total. The 12 tasks are numbered from 1: tasks 1, 5, 9 and 12 are generated on mobile device 1, tasks 2, 6 and 10 on mobile device 2, tasks 3, 7 and 11 on mobile device 3, and tasks 4 and 8 on mobile device 4. The tasks on each mobile device arrive at a constant rate within 1 second; for example, since 4 tasks are generated on mobile device 1, its task arrival rate is λ_1 = 4 tasks per second, i.e., one task every 0.25 seconds. FIG. 3 shows the tasks generated by each mobile device and the arrival time of each task.
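As a small illustration of this constant-rate arrival pattern (the exact offsets are an assumption; the arrival times actually used in the experiment are those plotted in FIG. 3):

```python
def arrival_times(num_tasks: int, window: float = 1.0, start: float = 0.0):
    """Evenly spaced arrivals within the window; placing the first task at `start` is assumed."""
    step = window / num_tasks
    return [start + i * step for i in range(num_tasks)]

print(arrival_times(4))  # mobile device 1: [0.0, 0.25, 0.5, 0.75] seconds
print(arrival_times(3))  # mobile devices 2 and 3: one task every 1/3 second
```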
The data transmission amounts (unit: Mb) between the layers of the DNN used are:
D = {1.2, 0.3, 0.8, 0.2, 0.4, 0.1, 0.05}
The running time (unit: ms) of each layer on each node is given as an image in the original.
In all five scenarios the above settings remain unchanged; only the network connection conditions of the edge environment and the concurrency numbers of the edge devices differ. The following tables describe the five scenarios in detail.
Table 3-2 Exemplary scenario (the table is given as an image in the original)
Table 3-3 Edge starvation scenario (the table is given as an image in the original)
Table 3-4 Edge margin scenario (the table is given as an image in the original)
Table 3-5 Selectable edge scenario (the table is given as an image in the original)
Table 3-6 Edge interconnection scenario (the table is given as an image in the original)
3.2 Evaluation results
Table 3-7 compares the average response time of the scheduling scheme produced by the present invention with those of the scheduling scheme based on the greedy algorithm and of the optimal scheme.
Table 3-7 Comparison of average response times (the table is given as an image in the original)
In the five scenarios, the average response times of the scheduling schemes produced by the proposed method and by the greedy algorithm were measured and compared with two traditional migration schemes: migrating everything to the nearby edge and migrating everything to the cloud. The results are shown in FIG. 4.
The evaluation results show that, in every scenario, the average response time of the scheduling scheme produced by the proposed method is significantly shorter than that of the traditional migration schemes and close to that of the optimal scheme. This shows that the method of the present invention can effectively reduce the computation migration scheduling time of DNN applications in an edge environment.
The above are preferred embodiments of the present invention; all changes made according to the technical scheme of the present invention that produce equivalent functional effects without exceeding the scope of the technical scheme of the present invention fall within the protection scope of the present invention.

Claims (2)

1. A computation migration scheduling method for deep neural network applications in an edge environment, characterized in that an optimal scheduling scheme is found based on an evaluation algorithm and a genetic algorithm, n tasks are scheduled through the optimal scheduling scheme, and the average response time of the n tasks is minimized, the method specifically comprising the following steps:
1) Input: the set of mobile devices, cloud and edge nodes S = {s_1, s_2, ..., s_k}, where s_i represents the i-th node; the set of tasks T = {T_1, T_2, ..., T_n}, where T_i represents the i-th task, and each task T_i is also denoted T_i = {t_{i,1}, t_{i,2}, ..., t_{i,m}}, where t_{i,j} represents the j-th layer subtask of the i-th task;
2) Based on the genetic algorithm, divide the n tasks by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m loci in total, and the gene at each locus represents the execution node s_x of the corresponding subtask t_{i,j}; there are k genes in total, corresponding to the k nodes, and each individual u_i is a feasible solution to the scheduling problem; initialize the first-generation population, randomly initialize a global variable best representing the best individual, and record the average response time of the best individual as best.time;
3) Use the evaluation algorithm to calculate the average response time u_i.time of each individual;
4) Compare the average response time of each individual with best.time; if u_i.time < best.time, update best and best.time;
5) Calculate the average response time of the current generation of the population:
averTime = (1/size) Σ_{i=1}^{size} u_i.time
where size is the size of the population;
6) Perform the selection operation: individuals whose average response time is less than the population average response time, i.e., u_i.time < averTime, are selected and retained as the offspring population;
7) Expand the offspring population by the crossover operation: select two individuals u_1 and u_2 from the retained individuals by roulette-wheel selection, use them as parents to perform single-point crossover and produce two offspring individuals u_1' and u_2', and repeat this process to generate new offspring individuals until the offspring population reaches size;
8) Perform the mutation operation: set a mutation rate μ, then randomly generate a random number; if the generated random number is less than the mutation rate μ, perform mutation; when mutating, randomly generate the number num of genes to be mutated, then randomly select num gene loci and randomly change the gene at each of these loci to another gene, thereby generating the final offspring population;
9) Repeat steps 3-8 to iterate until the iteration termination condition is met; the individual recorded in best is the optimal scheduling scheme, and its average response time is best.time;
10) Output the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
2. The computation migration scheduling method for deep neural network applications in an edge environment according to claim 1, characterized in that, in step 1, the input data further includes: the number of concurrent lanes p_x of node s_x, where x ∈ [1, k]; the data transmission rates between nodes {v_{i,j} | i, j ∈ [1, k]}, where v_{i,j} represents the data transmission rate between node i and node j; the round-trip times between nodes {r_{i,j} | i, j ∈ [1, k]}, where r_{i,j} represents the round-trip time between node i and node j; the set of execution times of the subtasks on the different nodes {time_{i,j} | i ∈ [1, m], j ∈ [1, k]}, where time_{i,j} represents the execution time of the i-th layer subtask t_{x,i} of a task T_x on node s_j; and the set of data transmission amounts between subtasks D = {d_1, d_2, ..., d_m}, where d_j represents the amount of data transferred from subtask t_{i,j} to subtask t_{i,j+1}; and in step 3, the evaluation algorithm is used to calculate the average response time u_i.time of each individual u_i, i.e., of each scheduling scheme, through the following steps:
301) Set a variable currTime representing the current time and initialize it to 0; the input scheduling scheme is represented by a two-dimensional array scheme, which has k rows corresponding to the k nodes, each row storing, in execution order, the subtasks executed on that node; s_x.empty denotes the number of remaining idle lanes of node s_x and is initialized to p_x; for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time, and which are initialized before the simulation starts (the initialization formula appears as an image in the original);
302) Set a time-slice variable slice and initialize it to infinity;
303) Fill the lanes: for each node s_x in the node set S, place its subtasks t_{i,j} into lanes in the order given by the scheduling scheme; each time a subtask is placed, decrement s_x.empty by 1 and remove that subtask from the scheduling scheme, until the node has no idle lane, i.e., s_x.empty = 0; a subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask has already reached node s_x;
304) Find the minimum time slice: for each node s_x in the node set S, traverse all subtasks in its lanes and find the subtask t_{i,j} with the minimum remaining execution time; take its remaining time t_{i,j}.time as slice, then add slice to the current time currTime to obtain the new current time, representing that a time of length slice has elapsed;
305) Compute the remaining time of the subtasks: for each node s_x in the node set S, traverse all subtasks in its lanes and subtract slice from their remaining time t_{i,j}.time to obtain the new remaining time, representing that each subtask has executed for a further time of length slice; for a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of remaining idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed; meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; find the execution node s_y of subtask t_{i,j+1} and calculate the transmission time g_{i,j+1}(x, y) of subtask t_{i,j+1} from node s_x to node s_y, then:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y)
where the transmission time is g_{i,j+1}(x, y) = d_j / v_{x,y} + r_{x,y};
Step 306) For all subtasks in the scheduling scheme, repeat steps 302-305 until the arrival time and completion time of every subtask have been determined;
Step 307) Calculate the average response time f_ave(T) of the scheduling scheme, i.e., the average response time u_i.time of individual u_i, according to the following formula:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival)
CN201911143030.7A 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment Active CN110837413B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911143030.7A CN110837413B (en) 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911143030.7A CN110837413B (en) 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment

Publications (2)

Publication Number Publication Date
CN110837413A CN110837413A (en) 2020-02-25
CN110837413B true CN110837413B (en) 2023-03-24

Family

ID=69576883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911143030.7A Active CN110837413B (en) 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment

Country Status (1)

Country Link
CN (1) CN110837413B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114579512A (en) * 2022-03-16 2022-06-03 中国人民解放军总医院 Hierarchical storage method and device of image data, electronic equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800071A (en) * 2019-01-03 2019-05-24 华南理工大学 A kind of cloud computing method for scheduling task based on improved adaptive GA-IAGA
CN109840154A (en) * 2019-01-08 2019-06-04 南京邮电大学 A kind of computation migration method that task based access control relies under mobile cloud environment
CN110073301A (en) * 2017-08-02 2019-07-30 强力物联网投资组合2016有限公司 The detection method and system under data collection environment in industrial Internet of Things with large data sets

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130173510A1 (en) * 2012-01-03 2013-07-04 James Joseph Schmid, JR. Methods and systems for use in reducing solution convergence time using genetic algorithms

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110073301A (en) * 2017-08-02 2019-07-30 强力物联网投资组合2016有限公司 The detection method and system under data collection environment in industrial Internet of Things with large data sets
CN109800071A (en) * 2019-01-03 2019-05-24 华南理工大学 A kind of cloud computing method for scheduling task based on improved adaptive GA-IAGA
CN109840154A (en) * 2019-01-08 2019-06-04 南京邮电大学 A kind of computation migration method that task based access control relies under mobile cloud environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"A Novel Fault-Tolerant Task Scheduling Algorithm for Cloud Computing" (《一种新颖的云计算容错任务调度算法》); Chen Xing et al.; Journal of Chinese Computer Systems (《小型微型计算机系统》); 2017-04-26 (No. 10); 2194-2198 *
"A Session Scheduling Strategy for Streaming Media Edge Cloud Based on Deep Reinforcement Learning" (《基于深度强化学习的流媒体边缘云会话调度策略》); Xu Xijian et al.; Computer Engineering (《计算机工程》); 2019-07-31; Vol. 45, No. 5; 237-242 *

Also Published As

Publication number Publication date
CN110837413A (en) 2020-02-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant