CN110837413B - Computing migration scheduling method for deep neural network application in edge environment

Computing migration scheduling method for deep neural network application in edge environment

Info

Publication number
CN110837413B
Authority
CN
China
Prior art keywords
time
node
subtask
task
average response
Prior art date
Legal status
Active
Application number
CN201911143030.7A
Other languages
Chinese (zh)
Other versions
CN110837413A (en)
Inventor
陈星
胡俊钦
张佳俊
黄引豪
陈佳晴
Current Assignee
Fuzhou University
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN201911143030.7A
Publication of CN110837413A
Application granted
Publication of CN110837413B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/126Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Abstract

The invention relates to a computation migration scheduling method for deep neural network applications in an edge environment. An optimal scheduling scheme is searched for based on an evaluation algorithm and a genetic algorithm, and the optimal scheduling scheme is used to schedule n tasks so that their average response time is minimized: the n tasks are divided by layer into n×m subtasks, each subtask corresponds to a gene locus, the gene at each locus represents the execution node of the subtask corresponding to that locus, and each individual is a feasible solution. The evaluation algorithm is used to calculate the average response time of each individual and find the best individual of the current generation; the average response time of the population is calculated, and selection, crossover and mutation operations are performed in turn to obtain the offspring population. By iterating continuously, the best individual, i.e., the optimal scheduling scheme, is found in the continuously updated offspring populations, and the average response time of the optimal scheduling scheme is obtained. The method helps reduce the computation migration scheduling time.

Description

Computing migration scheduling method for deep neural network application in edge environment
Technical Field
The invention relates to the technical field of computational migration, in particular to a computational migration scheduling method for deep neural network application in an edge environment.
Background
With the continuous development of deep learning technology, more and more deep neural network (DNN) applications have come into public view and become an integral part of people's daily lives, such as personalized recommendation systems, face recognition systems, and license plate recognition systems.
DNN applications offer rich functionality and high intelligence by relying on large-scale, structurally complex deep neural networks. Executing these applications demands high device performance, but the computing resources of mobile devices are limited and cannot complete such workloads on their own. At present, the main solution is to migrate some computationally intensive neural network layers to the resource-rich remote cloud for execution through computation migration technology, and then return the results to the mobile device.
Computation migration introduces time delays: data transmission between computing nodes causes transmission delay, and after a task is migrated to the target node it may have to queue because of the node's limited concurrency, causing a waiting delay. If the total migration delay is too long, the average response time of the tasks increases significantly, affecting the user experience. Different DNN tasks require different amounts of data transmission, and the network connection conditions and data transmission rates among the computing resources also differ. Thus, different task schedules produce different time delays, and finding a task schedule with a low average response time is a difficult problem.
The traditional migration approach places all tasks on the mobile edge or on the cloud for execution and then returns the results, but this causes a huge data transmission delay. Therefore, a good scheduling scheme is necessary, especially when multiple tasks are executed concurrently.
Disclosure of Invention
The invention aims to provide a computation migration scheduling method for deep neural network application in an edge environment, which is beneficial to reducing computation migration scheduling time.
To achieve this purpose, the invention adopts the following technical scheme: a computation migration scheduling method for deep neural network applications in an edge environment, in which an optimal scheduling scheme is found based on an evaluation algorithm and a genetic algorithm, n tasks are scheduled through the optimal scheduling scheme, and the average response time of the n tasks is minimized. Specifically:
The n tasks are divided by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m gene loci in total. The gene at each locus represents the execution node of the corresponding subtask; there are k genes in total, corresponding to k nodes, and each individual u_i is a feasible solution to the scheduling problem. The evaluation algorithm is used to calculate the average response time u_i.time of each individual, i.e., of each scheduling scheme, and the best individual of the current generation is found. The average response time of the population is calculated; selection and crossover operations are performed in turn to select superior individuals from the parent population and generate offspring individuals, and a mutation operation is then performed to obtain the offspring population. Iterating in this way, the best individual, i.e., the optimal scheduling scheme, is found in the continuously updated offspring populations, giving the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
Further, the method comprises the steps of:
1) Inputting: mobile device, cloud and edge node set S = { S = 1 ,s 2 ,...,s k In which s is i Represents the ith node; set of tasks T = { T = { T } 1 ,T 2 ,...,T n Where T is i Representing the ith task, each task T i And is also denoted as T i ={t i,1 ,t i,2 ,...,t i,m },t i,j A jth layer sub-task representing an ith task;
2) Dividing n tasks into n x m subtasks according to layers based on a genetic algorithm, wherein each subtask corresponds to a gene locus, the total number of the gene loci is n x m, and the gene on each gene locus represents the subtask t corresponding to the gene locus i,j Is executed node s x Total k genes corresponding to k nodes, each of u i Is a feasible solution of the scheduling problem; initializing initial generation population position, randomly initializing global variable best to represent optimal individual, and recording postTime is the average response time of the optimal individual;
3) Calculating the average response time u of each individual by adopting an evaluation algorithm i .time;
4) Comparing the average response time of all the individuals with best time, and if the individual time is less than best time, updating best and best time;
5) Calculating the average response time of the population of this generation
Figure BDA0002281472250000021
Wherein size is the size of the population;
6) And (3) carrying out selection operation: selecting individuals having an average response time less than the population average response time, i.e., u i Time < averTime, retained as a population of progeny;
7) Expanding the child population by crossover operations: selecting two individuals u from reserved individuals by roulette selection 1 And u 2 Taking them as parent to make single-point crossing to produce two filial individuals u 1 ' and u 2 ' repeating the process to generate new filial individuals until the size of the filial population reaches size;
8) Carrying out mutation operation: setting a variation rate mu, then randomly generating a random number, and if the generated random number is less than the variation rate mu, performing variation operation; when carrying out mutation operation, randomly generating the number num of genes to be mutated, then randomly generating hum gene loci, and randomly changing genes on the num gene loci into another gene locus; thereby generating a final population of progeny;
9) Continuously repeating the steps 3-8 to iterate until an iteration termination condition is met, wherein the best record is the optimal scheduling scheme, and the average response time is best.
10 Average response time of the output optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
Further, in step 1, the input data further includes: the number of concurrent lanes p_x of node s_x, where x ∈ [1, k]; the data transmission rates between nodes {v_{i,j} | i, j ∈ [1, k]}, where v_{i,j} represents the data transmission rate between node i and node j; the round-trip times between nodes {r_{i,j} | i, j ∈ [1, k]}, where r_{i,j} represents the round-trip time between node i and node j; the set of execution times of the subtasks on the different nodes {time_{i,j} | i ∈ [1, m], j ∈ [1, k]}, where time_{i,j} represents the execution time of the i-th layer subtask t_{x,i} of a task T_x on node s_j; and the set of data transmission amounts between subtasks D = {d_1, d_2, ..., d_m}, where d_j represents the amount of data transferred from subtask t_{i,j} to subtask t_{i,j+1}. In step 3, the evaluation algorithm is used to calculate the average response time u_i.time of each individual u_i, i.e., of each scheduling scheme, through the following steps:
301) Set a variable currTime representing the current time and initialize it to 0; the input scheduling scheme is represented by a two-dimensional array scheme, which has k rows corresponding to the k nodes, each row storing, in execution order, the subtasks executed on that node; s_x.empty denotes the number of remaining idle lanes of node s_x and is initialized to p_x; for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time, and which are initialized before the simulation starts (the initialization formula appears as an image in the original);
302) Set a time-slice variable slice and initialize it to infinity;
303) Fill the lanes: for each node s_x in the node set S, place its subtasks t_{i,j} into lanes in the order given by the scheduling scheme; each time a subtask is placed, decrement s_x.empty by 1 and remove that subtask from the scheduling scheme, until the node has no idle lane, i.e., s_x.empty = 0; a subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask has already reached node s_x;
304) Find the minimum time slice: for each node s_x in the node set S, traverse all subtasks in its lanes and find the subtask t_{i,j} with the minimum remaining execution time; take its remaining time t_{i,j}.time as slice, then add slice to the current time currTime to obtain the new current time, representing that a time of length slice has elapsed;
305) Compute the remaining time of the subtasks: for each node s_x in the node set S, traverse all subtasks in its lanes and subtract slice from their remaining time t_{i,j}.time to obtain the new remaining time, representing that each subtask has executed for a further time of length slice; for a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of remaining idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed; meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; find the execution node s_y of subtask t_{i,j+1} and calculate the transmission time g_{i,j+1}(x, y) of subtask t_{i,j+1} from node s_x to node s_y, then:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y)
where the transmission time is g_{i,j+1}(x, y) = d_j / v_{x,y} + r_{x,y};
Step 306) For all subtasks in the scheduling scheme, repeat steps 302-305 until the arrival time and completion time of every subtask have been determined;
Step 307) Calculate the average response time f_ave(T) of the scheduling scheme, i.e., the average response time u_i.time of individual u_i, according to the following formula:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival)
Compared with the prior art, the invention has the following beneficial effects: the method can generate a scheduling scheme with the minimum average response time for a given edge environment and task set, effectively reduces the computation migration scheduling time of DNN applications in the edge environment, and has strong practicability and broad application prospects.
Drawings
FIG. 1 is a flow chart of a method implementation of an embodiment of the present invention.
Fig. 2 is a node map provided in an embodiment of the present invention.
Fig. 3 is a timing diagram of task arrival in an embodiment of the invention.
FIG. 4 is a graph comparing average response times in the examples of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and specific embodiments.
The invention provides a computation migration scheduling method for deep neural network applications in an edge environment, which searches for an optimal scheduling scheme based on an evaluation algorithm and a genetic algorithm and schedules n tasks through that scheme so that their average response time is minimized. Specifically: the n tasks are divided by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m loci in total; the gene at each locus represents the execution node of the corresponding subtask; there are k genes in total, corresponding to k nodes, and each individual u_i is a feasible solution to the scheduling problem. The evaluation algorithm is used to calculate the average response time u_i.time of each individual, i.e., of each scheduling scheme, and the best individual of the current generation is found. The average response time of the population is calculated; selection and crossover operations are performed in turn to select superior individuals from the parent population and generate offspring individuals, and a mutation operation is performed to obtain the offspring population. Iterating in this way, the best individual, i.e., the optimal scheduling scheme, is found in the continuously updated offspring populations, giving the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node. The implementation flow of the method is shown in FIG. 1.
In this embodiment, the method includes the steps of:
1) Inputting: mobile device, cloud and edge node set S = { S = 1 ,s 2 ,...,s k In which s i Representing the ith node (the set of nodes S includes three parts: the set of mobile devices M = { M =:) 1 ,m 2 ,...,m a In which m is i Represents the ith mobile device; edge device set E = { E = { E } 1 ,e 2 ,...,e b In which e i Represents the ith edge device; a cloud computing center c); set of tasks T = { T = { T } 1 ,T 2 ,...,T n In which T is i Representing the ith task, each task T i And is also denoted as T i ={t i,1 ,t i,2 ,...,t i,m },t i,j A jth layer sub-task representing an ith task; node s x Number of concurrent strokes p x Where x ∈ [1, k ]](ii) a Data transmission rate between nodes
Figure BDA0002281472250000051
Wherein v is i,j Representing nodesThe data transmission rate between i and node j; round trip time between nodes->
Figure BDA0002281472250000052
Wherein r is i,j Represents the round trip time between node i and node j; task arrival rate λ on mobile device i i Where i ∈ [1, a ]](ii) a Set of execution times of subtasks on different nodes
Figure BDA0002281472250000053
Wherein time i,j Representing a task T x Ith sub-task t x,i At node s j The execution time of (1); set of data transmission amounts D = { D ] between subtasks 1 ,d 2 ,...,d m In which d is j Representing a subtask t i,j To subtask t i,j+1 The amount of data transfer therebetween.
2) Dividing n tasks into n x m subtasks according to layers based on a genetic algorithm, wherein each subtask corresponds to a gene locus, the total number of the gene loci is n x m, and the gene on each gene locus represents the subtask t corresponding to the gene locus i,j Is executed node s x Total k genes corresponding to k nodes, each of u i Is a feasible solution of the scheduling problem; initializing initial generation population position, randomly initializing a global variable best to represent an optimal individual, and recording the average response time of the optimal individual as best.
3) Calculating each individual u by adopting an evaluation algorithm i I.e. average response time u per scheduling scheme i .time。
The method specifically comprises the following steps:
301) Set a variable currTime representing the current time and initialize it to 0; the input scheduling scheme is represented by a two-dimensional array scheme, which has k rows corresponding to the k nodes, each row storing, in execution order, the subtasks executed on that node; s_x.empty denotes the number of remaining idle lanes of node s_x and is initialized to p_x; for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time, and which are initialized before the simulation starts (the initialization formula appears as an image in the original).
302) Set a time-slice variable slice and initialize it to ∞.
303) Fill the lanes: for each node s_x in the node set S, place its subtasks t_{i,j} into lanes in the order given by the scheduling scheme; each time a subtask is placed, decrement s_x.empty by 1 and remove that subtask from the scheduling scheme, until the node has no idle lane, i.e., s_x.empty = 0; a subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask has already reached node s_x.
pool_x denotes the set of subtasks placed into the lanes of node s_x; filling the lanes specifically comprises the following steps:
3031) For each subtask t_{i,j} of each node s_x in the node set S: when s_x.empty > 0, perform step 3032; otherwise, stop filling the lanes;
3032) If t_{i,j}.arrival ≤ currTime, i.e., subtask t_{i,j} has arrived at node s_x, place subtask t_{i,j} into a lane, remove it from the scheduling scheme, and decrement s_x.empty by 1;
3033) Repeat steps 3031-3032 until the node has no idle lane.
304) Find the minimum time slice: for each node s_x in the node set S, traverse all subtasks in its lanes and find the subtask t_{i,j} with the minimum remaining execution time; take its remaining time t_{i,j}.time as slice, then add slice to the current time currTime to obtain the new current time, representing that a time of length slice has elapsed.
305) Compute the remaining time of the subtasks: for each node s_x in the node set S, traverse all subtasks in its lanes and subtract slice from their remaining time t_{i,j}.time to obtain the new remaining time, representing that each subtask has executed for a further time of length slice. For a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of remaining idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed. Meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; find the execution node s_y of subtask t_{i,j+1} and calculate the transmission time g_{i,j+1}(x, y) of subtask t_{i,j+1} from node s_x to node s_y, then:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y)
where the transmission time is g_{i,j+1}(x, y) = d_j / v_{x,y} + r_{x,y}.
Step 306) For all subtasks in the scheduling scheme, repeat steps 302-305 until the arrival time and completion time of every subtask have been determined.
Step 307) Calculate the average response time f_ave(T) of the scheduling scheme, i.e., the average response time u_i.time of individual u_i, according to the following formula:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival)
4) Compare the average response time of each individual with best.time; if u_i.time < best.time, update best and best.time.
5) Calculate the average response time of the current generation of the population:
averTime = (1/size) Σ_{i=1}^{size} u_i.time
where size is the size of the population.
6) Perform the selection operation: individuals whose average response time is less than the population average response time, i.e., u_i.time < averTime, are selected and retained as the offspring population.
7) Expand the offspring population by the crossover operation: select two individuals u_1 and u_2 from the retained individuals by roulette-wheel selection, use them as parents to perform single-point crossover and produce two offspring individuals u_1' and u_2', and repeat this process to generate new offspring individuals until the offspring population reaches size.
8) Perform the mutation operation: set a mutation rate μ, then randomly generate a random number in the range [0, 0.1]; if the generated random number is less than the mutation rate μ, perform mutation. When mutating, randomly generate the number num of genes to be mutated, then randomly select num gene loci and randomly change the gene at each of these loci to another gene, thereby generating the final offspring population.
9) Repeat steps 3 to 8 to iterate until the iteration termination condition is met (in this embodiment, the termination condition is a set number of iterations); the individual recorded in best is the optimal scheduling scheme, and its average response time is best.time.
10) Output the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
The related technical content of the scheduling problem and scheduling scheme of the present invention is further described below.
1 Overview of the method
The invention discloses a method for automatically generating a DNN task scheduling scheme in an edge environment. The genetic algorithm first generates initial solutions and then iterates continuously, guided by the evaluation algorithm, to produce better solutions.
The detailed technical scheme is as follows.
2 Technical scheme
2.1 Definitions
2.1.1 Symbol definitions
Definition 1: set of mobile device, cloud and edge node as S = { S = } 1 ,s 2 ,...,s k In which s is i Represents the ith node;
definition 2: set of mobile devices is M = { M = { [ M ] 1 ,m 2 ,...,m a In which m is i Represents the ith mobile device;
definition 3: set of edge devices E = { E = { E = } 1 ,e 2 ,...,e b In which e i Represents the ith edge device;
definition 4: the cloud computing center is denoted as c;
definition 5: node s i The number of concurrent swimming channels is p i ,i∈[1,k];
Definition 6: data transmission rate between nodes is
Figure BDA0002281472250000081
Wherein v is i,j Representing the data transmission rate between node i and node j;
definition 7: round trip time between nodes of
Figure BDA0002281472250000091
Wherein r is i,j Represents the round trip time between node i and node j;
definition 8: set of tasks is T = { T = } 1 ,T 2 ,...,T n Where T is i Indicating the ith task. Each task T i And may be represented as T i ={t i,1 ,t i,2 ,...,t i,m },t i,j A jth layer sub-task representing an ith task;
definition 9: mobile deviceThe task arrival rate at i is λ i ,i∈[1,a];
Definition 10: the execution time of the subtasks on different nodes is set as
Figure BDA0002281472250000092
Wherein time i,j Representing a task T x Ith subtask t of (1) x,i At node s j The execution time of (1);
definition 11: the data transmission amount among the subtasks is D = { D = { (D) } 1 ,d 2 ,...,d m In which d is j Representing a subtask t i,j To subtask t i,j+1 The amount of data transfer therebetween.
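To make the symbol definitions above concrete, the following is a minimal Python sketch of one way the inputs could be represented. The container and field names (EdgeEnvironment, rate, rtt, exec_time) and the example values are illustrative assumptions, not part of the patent; only the symbols S, M, E, c, p_i, v_{i,j}, r_{i,j}, λ_i, time_{i,j} and D come from the definitions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EdgeEnvironment:
    """Illustrative container for Definitions 1-11 (names are assumptions)."""
    nodes: List[str]              # S = {s_1, ..., s_k}: mobile devices + edge devices + cloud c
    lanes: List[int]              # p_i: number of concurrent lanes of node s_i
    rate: List[List[float]]       # v_{i,j}: data transmission rate between node i and node j
    rtt: List[List[float]]        # r_{i,j}: round-trip time between node i and node j
    arrival_rate: List[float]     # lambda_i: task arrival rate on mobile device i
    exec_time: List[List[float]]  # time_{i,j}: execution time of layer-i subtasks on node s_j
    data: List[float]             # D = {d_1, ..., d_m}: data volume from t_{i,j} to t_{i,j+1}

# A toy environment with k = 3 nodes (one mobile device, one edge device, one cloud)
# and m = 2 DNN layers; all numbers are made up purely for illustration.
env = EdgeEnvironment(
    nodes=["m1", "e1", "c"],
    lanes=[1, 2, 4],
    rate=[[0.0, 10.0, 2.0], [10.0, 0.0, 5.0], [2.0, 5.0, 0.0]],   # assumed unit: Mb/ms
    rtt=[[0.0, 1.0, 20.0], [1.0, 0.0, 15.0], [20.0, 15.0, 0.0]],  # assumed unit: ms
    arrival_rate=[4.0],                                           # tasks per second on m1
    exec_time=[[50.0, 20.0, 5.0], [80.0, 30.0, 8.0]],             # ms per layer per node
    data=[1.2, 0.3],                                              # Mb between consecutive layers
)
```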
2.1.2 Problem definition
t_{i,j}.arrival represents the time at which subtask t_{i,j} arrives at the node s_y that executes it; the time at which the subtask starts to execute is denoted t_{i,j}.begin.
w_{i,j}(y) represents the queue waiting time of subtask t_{i,j} on node s_y:
w_{i,j}(y) = t_{i,j}.begin − t_{i,j}.arrival (2-1-1)
The transmission time of subtask t_{i,j} from node s_x to node s_y is defined as:
g_{i,j}(x, y) = d_{j-1} / v_{x,y} + r_{x,y} (2-1-2)
The response time of task T_i is the time elapsed from the generation of the task on the mobile device to the completion of its execution, and equals the sum of the response times of all of its subtasks. The response time f_{i,j}(y) of subtask t_{i,j} consists of three parts: execution time, transmission time and waiting time:
f_{i,j}(y) = time_{j,y} + g_{i,j}(x, y) + w_{i,j}(y) (2-1-3)
The response time of task T_i is then calculated as:
f_resp(T_i) = Σ_{j=1}^{m} f_{i,j}(y) (2-1-4)
where s_y is the node that executes t_{i,j}. The average response time of all n tasks is defined as:
f_ave(T) = (1/n) Σ_{i=1}^{n} f_resp(T_i) (2-1-5)
The goal is to find a scheduling scheme that schedules the n tasks such that their average response time f_ave(T) is minimized. The scheduling scheme specifies the execution node of each subtask, as well as the execution order on that node.
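As a hedged illustration of formulas (2-1-1) through (2-1-5), the short sketch below computes the waiting time, transmission time, subtask response time, task response time and average response time from given timestamps. The concrete form of the transmission time (data volume divided by link rate, plus the round-trip time) is an assumption consistent with Definitions 6, 7 and 11; all function names and numbers are illustrative.

```python
from typing import List

def waiting_time(begin: float, arrival: float) -> float:
    # (2-1-1): w_{i,j}(y) = t_{i,j}.begin - t_{i,j}.arrival
    return begin - arrival

def transmission_time(d_prev: float, v_xy: float, r_xy: float, same_node: bool) -> float:
    # (2-1-2), assumed form: g_{i,j}(x,y) = d_{j-1}/v_{x,y} + r_{x,y};
    # zero when the predecessor ran on the same node (assumption, not stated explicitly).
    return 0.0 if same_node else d_prev / v_xy + r_xy

def subtask_response(exec_t: float, trans_t: float, wait_t: float) -> float:
    # (2-1-3): f_{i,j}(y) = time_{j,y} + g_{i,j}(x,y) + w_{i,j}(y)
    return exec_t + trans_t + wait_t

def task_response(layer_responses: List[float]) -> float:
    # (2-1-4): f_resp(T_i) = sum over the m layer subtasks
    return sum(layer_responses)

def average_response(task_responses: List[float]) -> float:
    # (2-1-5): f_ave(T) = (1/n) * sum of the task response times
    return sum(task_responses) / len(task_responses)

# Tiny worked example with made-up numbers (milliseconds):
w = waiting_time(begin=12.0, arrival=10.0)                   # 2.0
g = transmission_time(d_prev=1.2, v_xy=10.0, r_xy=1.0,
                      same_node=False)                       # 1.12
f = subtask_response(exec_t=20.0, trans_t=g, wait_t=w)       # 23.12
print(average_response([task_response([f, 30.0])]))          # 53.12
```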
2.2 Evaluation algorithm
2.2.1 Description of the algorithm
The evaluation algorithm is used to evaluate a scheduling scheme: given a scheduling scheme, it calculates the scheme's average response time. The smaller the average response time, the better the scheme. The idea of the evaluation algorithm is to simulate the execution of the tasks according to the scheduling scheme and to compute the start and completion time of each subtask, from which the average response time of the scheduling scheme is obtained.
A variable currTime is set to indicate the current time and initialized to 0. The input scheduling scheme is represented by a two-dimensional array scheme; the array has k rows corresponding to the k nodes, and each row stores, in execution order, the subtasks executed on that node. In addition, for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time.
They are initialized as follows (the initialization formula appears as an image in the original).
For a task, the response time is the time from its generation to its completion, so it can be calculated once the task's generation time and completion time are known. The generation time of task T_i is the arrival time t_{i,1}.arrival of its first-layer subtask t_{i,1}, and its completion time is the completion time t_{i,m}.end of its last-layer subtask t_{i,m}. Thus, the response time of task T_i is:
f_resp(T_i) = t_{i,m}.end − t_{i,1}.arrival (2-2-1)
The average response time is then calculated according to formula (2-1-5) as:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival) (2-2-2)
Therefore, it is only necessary to determine the arrival time and completion time of each subtask t_{i,j}; the average response time of the scheme can then be calculated according to formula (2-2-2).
2.2.2 Algorithm process
First, fill the lanes. According to the scheduling scheme, the subtasks of each node are placed into its lanes in order until the node has no idle lane; s_x.empty denotes the number of remaining idle lanes of node s_x, and the subtasks that have been placed into lanes are removed from the scheme. A subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask must already have reached node s_x. pool_x denotes the set of subtasks placed into the lanes of node s_x.
Second, find the minimum time slice. Traverse all subtasks in every lane to find the subtask t_{i,j} with the minimum remaining execution time, and take its remaining time t_{i,j}.time as the time slice slice. The current time currTime is then advanced by the time slice, representing that a time of length slice has elapsed.
Third, calculate the remaining time of the subtasks. For every subtask in the lanes, subtract the time slice from its remaining time, indicating that the subtask has executed for a further time of length slice. For a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed. Meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; the execution node s_y of subtask t_{i,j+1} is found and its transmission time calculated, so that:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y) (2-2-3)
The above steps are repeated until the arrival time and completion time of all subtasks have been determined. Finally, the average response time of the scheme is calculated according to formula (2-2-2).
2.2.3 Algorithm implementation
Algorithm 1: Evaluation algorithm (the pseudocode is given as images in the original document). A hedged sketch of the simulation it describes follows.
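The following Python sketch is one possible reading of the lane-based simulation described in sections 2.2.1-2.2.2; since the original pseudocode is only available as images, it is an interpretation under stated assumptions rather than the patent's own implementation. In particular, it assumes that a subtask's remaining time starts as its execution time on the assigned node, that first-layer arrival times are the task arrival times on the mobile devices, and that the transmission time has the form d/v + r; all identifiers are illustrative.

```python
import math
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (task index i, layer index j), 1-based as in the text

def evaluate(scheme: Dict[Key, int],        # subtask -> node index (decoded individual u_i)
             order: List[List[Key]],        # per node: its subtasks in execution order
             lanes: List[int],              # p_x: number of concurrent lanes per node
             exec_time: List[List[float]],  # exec_time[j-1][x]: layer j on node x
             v: List[List[float]],          # data transmission rates between nodes
             r: List[List[float]],          # round-trip times between nodes
             d: List[float],                # d[j-1]: data volume from layer j to layer j+1
             task_arrival: List[float],     # arrival time of t_{i,1} on its mobile device
             m: int) -> float:
    n = len(task_arrival)
    arrival = {(i, 1): task_arrival[i - 1] for i in range(1, n + 1)}  # t_{i,j}.arrival
    end: Dict[Key, float] = {}                                        # t_{i,j}.end
    remaining: Dict[Key, float] = {}                                  # t_{i,j}.time
    pending = [list(row) for row in order]            # mutable copy of the scheme
    running: List[List[Key]] = [[] for _ in order]    # pool_x: subtasks in the lanes
    empty = list(lanes)                               # s_x.empty
    curr = 0.0                                        # currTime

    while len(end) < n * m:
        # Step 303: fill lanes with subtasks that have already arrived.
        for x, queue in enumerate(pending):
            while empty[x] > 0 and queue and arrival.get(queue[0], math.inf) <= curr:
                i, j = queue.pop(0)
                remaining[(i, j)] = exec_time[j - 1][x]
                running[x].append((i, j))
                empty[x] -= 1
        # Step 304: the minimum remaining time among all running subtasks is the slice.
        slice_ = min((remaining[t] for lane in running for t in lane), default=math.inf)
        if math.isinf(slice_):    # nothing is running: jump to the next known arrival
            curr = min(arrival.get(q[0], math.inf) for q in pending if q)
            continue
        curr += slice_
        # Step 305: advance every running subtask by slice_ and handle completions.
        for x, lane in enumerate(running):
            for t in list(lane):
                remaining[t] -= slice_
                if remaining[t] <= 1e-9:               # t_{i,j}.time <= 0: finished
                    i, j = t
                    end[t] = curr                       # t_{i,j}.end = currTime
                    lane.remove(t)
                    empty[x] += 1
                    if j != m:                          # generate successor t_{i,j+1}
                        y = scheme[(i, j + 1)]
                        g = 0.0 if y == x else d[j - 1] / v[x][y] + r[x][y]  # assumed form
                        arrival[(i, j + 1)] = curr + g
    # Formula (2-2-2): average of t_{i,m}.end - t_{i,1}.arrival over the n tasks.
    return sum(end[(i, m)] - task_arrival[i - 1] for i in range(1, n + 1)) / n
```

The value returned by this sketch is what the genetic algorithm below would use as the fitness u_i.time of an individual.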
2.3 Genetic algorithm
2.3.1 Description of the algorithm
The n tasks are divided by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m loci in total, numbered from 0. The i-th gene locus corresponds to the subtask t_{⌊i/m⌋+1, i%m+1}; for example, the 0th locus corresponds to subtask t_{1,1}, the 1st locus corresponds to subtask t_{1,2}, the m-th locus corresponds to subtask t_{2,1}, and so on.
The gene at each locus represents the execution node s_x of the corresponding subtask t_{i,j}; for example, if subtask t_{1,1} executes on node s_1, the gene at the 0th locus is 1. There are k genes in total, corresponding to the k nodes.
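A small sketch of this chromosome encoding, assuming 0-based loci and 1-based task, layer and node numbering as in the text; the helper names are illustrative, not the patent's.

```python
from typing import Tuple

def locus_to_subtask(locus: int, m: int) -> Tuple[int, int]:
    """Gene locus -> subtask t_{i,j}: i = locus // m + 1, j = locus % m + 1."""
    return locus // m + 1, locus % m + 1

def subtask_to_locus(i: int, j: int, m: int) -> int:
    """Inverse mapping: subtask t_{i,j} -> gene locus."""
    return (i - 1) * m + (j - 1)

# An individual u_i is simply a list of n*m genes, one execution node per locus.
# With m = 7 layers, locus 0 is t_{1,1}, locus 7 is t_{2,1}, and so on.
m = 7
individual = [1] * (2 * m)        # toy: two tasks, every subtask assigned to node s_1
print(locus_to_subtask(7, m))     # (2, 1)
print(subtask_to_locus(1, 2, m))  # 1
```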
Each individual u_i is a feasible solution to the problem, and for each feasible solution the average response time, denoted u_i.time, can be calculated by the evaluation algorithm. The population size is denoted size. In each generation, the average time averTime of the population is calculated as
averTime = (1/size) Σ_{i=1}^{size} u_i.time (2-3-1)
Then the individuals whose average response time is less than the population average, i.e., u_i.time < averTime, are selected and retained for the next generation. Because the selection operator shrinks the population, the offspring population must be expanded by the crossover operator to keep the population size at size. Two individuals u_1 and u_2 are selected from the retained individuals by roulette-wheel selection and used as parents for single-point crossover, producing two offspring individuals u_1' and u_2'; this process is repeated until the offspring population size reaches size.
The mutation operation specifies a mutation rate μ and then randomly generates a random number in the range [0, 0.1]; if the generated random number is smaller than the mutation rate μ, the mutation operation is performed. When mutating, the number num of genes to be mutated is randomly generated, num gene loci are randomly selected, and the gene at each of these loci is randomly changed to another gene. In this way, the genetic mutation of organisms in nature is simulated.
2.3.2 Algorithm process
First, calculate the average response time of each individual according to Algorithm 1.
Second, find and record the best individual best.
Third, calculate the average time of the population according to formula (2-3-1).
Fourth, perform the selection and crossover operations in turn.
Fifth, perform the mutation operation.
2.3.3 Algorithm implementation
Algorithm 2: Genetic algorithm (the pseudocode is given as images in the original document). A hedged sketch of the loop it describes follows.
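The sketch below is one way to realize the genetic loop of sections 2.3.1-2.3.2 in Python. Because the original pseudocode is only available as images, this is an interpretation under stated assumptions: the roulette weights are taken as 1/u_i.time (the patent does not spell out the weighting), node indices are 0-based, and the evaluation function is passed in as a parameter, standing in for the evaluation-algorithm sketch of section 2.2.3. All identifiers are illustrative.

```python
import random
from typing import Callable, List

Individual = List[int]  # gene list: one execution node index per locus (section 2.3.1)

def genetic_search(evaluate_fn: Callable[[Individual], float],
                   num_loci: int, num_nodes: int,
                   size: int = 50, generations: int = 100,
                   mu: float = 0.05) -> Individual:
    # Initial population: a random execution node for every subtask.
    population = [[random.randrange(num_nodes) for _ in range(num_loci)]
                  for _ in range(size)]
    best, best_time = population[0][:], float("inf")

    for _ in range(generations):
        times = [evaluate_fn(u) for u in population]            # u_i.time via Algorithm 1
        for u, t in zip(population, times):
            if t < best_time:                                   # keep the global best
                best, best_time = u[:], t
        aver_time = sum(times) / size                           # formula (2-3-1)
        # Selection: keep individuals faster than the population average.
        kept_pairs = [(u, t) for u, t in zip(population, times) if t < aver_time]
        if not kept_pairs:                                      # guard: fall back to the best
            kept_pairs = [(best[:], best_time)]
        kept = [u for u, _ in kept_pairs]
        weights = [1.0 / t for _, t in kept_pairs]              # assumed roulette weights
        # Crossover: expand the offspring population back to `size`.
        offspring = [u[:] for u in kept]
        while len(offspring) < size:
            u1, u2 = random.choices(kept, weights=weights, k=2)
            point = random.randrange(1, num_loci)               # single-point crossover
            offspring.append(u1[:point] + u2[point:])
            offspring.append(u2[:point] + u1[point:])
        offspring = offspring[:size]
        # Mutation: test a random number in [0, 0.1] against the rate mu, as in the text.
        for u in offspring:
            if random.uniform(0.0, 0.1) < mu:
                num = random.randrange(1, num_loci + 1)         # number of genes to mutate
                for locus in random.sample(range(num_loci), num):
                    u[locus] = random.randrange(num_nodes)      # change to another gene
        population = offspring
    return best

# Toy usage with a dummy fitness (so the sketch runs standalone); in the method itself
# evaluate_fn would be the evaluation algorithm applied to the decoded scheduling scheme.
result = genetic_search(evaluate_fn=lambda u: 1.0 + sum(u), num_loci=14, num_nodes=3)
print(len(result))  # 14 genes: one execution node per subtask of two 7-layer tasks
```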
3 Method evaluation
3.1 Evaluation settings
As shown in FIG. 2, 7 nodes are provided: 4 mobile devices (m1, m2, m3, m4), 2 edge devices (e1, e2) and 1 cloud computing center (c). The performance parameters of each device are listed in Table 3-1.
Table 3-1 Performance parameters of the devices (the table is given as an image in the original)
Meanwhile, experiments were conducted using a 7-layer DNN application, with the total number of tasks set to 12, i.e., 84 subtasks in total. The 12 tasks are numbered from 1: tasks 1, 5, 9 and 12 are generated on mobile device 1, tasks 2, 6 and 10 on mobile device 2, tasks 3, 7 and 11 on mobile device 3, and tasks 4 and 8 on mobile device 4. The tasks on each mobile device arrive at a constant rate within 1 second; for example, since 4 tasks are generated on mobile device 1, its task arrival rate is λ_1 = 4 tasks per second, i.e., one task every 0.25 seconds. FIG. 3 shows the tasks generated by each mobile device and the arrival time of each task.
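As a small illustration of this constant-rate arrival pattern (the exact offsets are an assumption; the arrival times actually used in the experiment are those plotted in FIG. 3):

```python
def arrival_times(num_tasks: int, window: float = 1.0, start: float = 0.0):
    """Evenly spaced arrivals within the window; placing the first task at `start` is assumed."""
    step = window / num_tasks
    return [start + i * step for i in range(num_tasks)]

print(arrival_times(4))  # mobile device 1: [0.0, 0.25, 0.5, 0.75] seconds
print(arrival_times(3))  # mobile devices 2 and 3: one task every 1/3 second
```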
The data transmission amounts (unit: Mb) between the layers of the DNN used are:
D = {1.2, 0.3, 0.8, 0.2, 0.4, 0.1, 0.05}
The running time (unit: ms) of each layer on each node is given as an image in the original.
In all five scenarios the above settings remain unchanged; only the network connection conditions of the edge environment and the concurrency numbers of the edge devices differ. The following tables describe the five scenarios in detail.
Table 3-2 Exemplary scenario (the table is given as an image in the original)
Table 3-3 Edge starvation scenario (the table is given as an image in the original)
Table 3-4 Edge margin scenario (the table is given as an image in the original)
Table 3-5 Selectable edge scenario (the table is given as an image in the original)
Table 3-6 Edge interconnection scenario (the table is given as an image in the original)
3.2 Evaluation results
Table 3-7 compares the average response time of the scheduling scheme produced by the present invention with those of the scheduling scheme based on the greedy algorithm and of the optimal scheme.
Table 3-7 Comparison of average response times (the table is given as an image in the original)
In the five scenarios, the average response times of the scheduling schemes produced by the proposed method and by the greedy algorithm were measured and compared with two traditional migration schemes: migrating everything to the nearby edge and migrating everything to the cloud. The results are shown in FIG. 4.
The evaluation results show that, in every scenario, the average response time of the scheduling scheme produced by the proposed method is significantly shorter than that of the traditional migration schemes and close to that of the optimal scheme. This shows that the method of the present invention can effectively reduce the computation migration scheduling time of DNN applications in an edge environment.
The above are preferred embodiments of the present invention; all changes made according to the technical scheme of the present invention that produce equivalent functional effects without exceeding the scope of the technical scheme of the present invention fall within the protection scope of the present invention.

Claims (2)

1. A computation migration scheduling method for deep neural network applications in an edge environment, characterized in that an optimal scheduling scheme is found based on an evaluation algorithm and a genetic algorithm, n tasks are scheduled through the optimal scheduling scheme, and the average response time of the n tasks is minimized, the method specifically comprising the following steps:
1) Input: the set of mobile devices, cloud and edge nodes S = {s_1, s_2, ..., s_k}, where s_i represents the i-th node; the set of tasks T = {T_1, T_2, ..., T_n}, where T_i represents the i-th task, and each task T_i is also denoted T_i = {t_{i,1}, t_{i,2}, ..., t_{i,m}}, where t_{i,j} represents the j-th layer subtask of the i-th task;
2) Based on the genetic algorithm, divide the n tasks by layer into n×m subtasks; each subtask corresponds to a gene locus, giving n×m loci in total, and the gene at each locus represents the execution node s_x of the corresponding subtask t_{i,j}; there are k genes in total, corresponding to the k nodes, and each individual u_i is a feasible solution to the scheduling problem; initialize the first-generation population, randomly initialize a global variable best representing the best individual, and record the average response time of the best individual as best.time;
3) Use the evaluation algorithm to calculate the average response time u_i.time of each individual;
4) Compare the average response time of each individual with best.time; if u_i.time < best.time, update best and best.time;
5) Calculate the average response time of the current generation of the population:
averTime = (1/size) Σ_{i=1}^{size} u_i.time
where size is the size of the population;
6) Perform the selection operation: individuals whose average response time is less than the population average response time, i.e., u_i.time < averTime, are selected and retained as the offspring population;
7) Expand the offspring population by the crossover operation: select two individuals u_1 and u_2 from the retained individuals by roulette-wheel selection, use them as parents to perform single-point crossover and produce two offspring individuals u_1' and u_2', and repeat this process to generate new offspring individuals until the offspring population reaches size;
8) Perform the mutation operation: set a mutation rate μ, then randomly generate a random number; if the generated random number is less than the mutation rate μ, perform mutation; when mutating, randomly generate the number num of genes to be mutated, then randomly select num gene loci and randomly change the gene at each of these loci to another gene, thereby generating the final offspring population;
9) Repeat steps 3-8 to iterate until the iteration termination condition is met; the individual recorded in best is the optimal scheduling scheme, and its average response time is best.time;
10) Output the average response time of the optimal scheduling scheme, the execution node of each subtask, and the execution order on that node.
2. The computation migration scheduling method for deep neural network applications in an edge environment according to claim 1, characterized in that, in step 1, the input data further includes: the number of concurrent lanes p_x of node s_x, where x ∈ [1, k]; the data transmission rates between nodes {v_{i,j} | i, j ∈ [1, k]}, where v_{i,j} represents the data transmission rate between node i and node j; the round-trip times between nodes {r_{i,j} | i, j ∈ [1, k]}, where r_{i,j} represents the round-trip time between node i and node j; the set of execution times of the subtasks on the different nodes {time_{i,j} | i ∈ [1, m], j ∈ [1, k]}, where time_{i,j} represents the execution time of the i-th layer subtask t_{x,i} of a task T_x on node s_j; and the set of data transmission amounts between subtasks D = {d_1, d_2, ..., d_m}, where d_j represents the amount of data transferred from subtask t_{i,j} to subtask t_{i,j+1}; and in step 3, the evaluation algorithm is used to calculate the average response time u_i.time of each individual u_i, i.e., of each scheduling scheme, through the following steps:
301) Set a variable currTime representing the current time and initialize it to 0; the input scheduling scheme is represented by a two-dimensional array scheme, which has k rows corresponding to the k nodes, each row storing, in execution order, the subtasks executed on that node; s_x.empty denotes the number of remaining idle lanes of node s_x and is initialized to p_x; for each subtask t_{i,j}, three attributes are defined: t_{i,j}.arrival, t_{i,j}.end and t_{i,j}.time, which respectively represent the time at which the subtask arrives at its execution node, the time at which its execution completes, and its remaining execution time, and which are initialized before the simulation starts (the initialization formula appears as an image in the original);
302) Set a time-slice variable slice and initialize it to infinity;
303) Fill the lanes: for each node s_x in the node set S, place its subtasks t_{i,j} into lanes in the order given by the scheduling scheme; each time a subtask is placed, decrement s_x.empty by 1 and remove that subtask from the scheduling scheme, until the node has no idle lane, i.e., s_x.empty = 0; a subtask placed into a lane must satisfy t_{i,j}.arrival ≤ currTime, i.e., the subtask has already reached node s_x;
304) Find the minimum time slice: for each node s_x in the node set S, traverse all subtasks in its lanes and find the subtask t_{i,j} with the minimum remaining execution time; take its remaining time t_{i,j}.time as slice, then add slice to the current time currTime to obtain the new current time, representing that a time of length slice has elapsed;
305) Compute the remaining time of the subtasks: for each node s_x in the node set S, traverse all subtasks in its lanes and subtract slice from their remaining time t_{i,j}.time to obtain the new remaining time, representing that each subtask has executed for a further time of length slice; for a subtask t_{i,j} on node s_x, if t_{i,j}.time ≤ 0, the subtask has completed; record its completion time t_{i,j}.end = currTime, then remove it from the lane and increment the number of remaining idle lanes s_x.empty by 1, indicating that the task in that lane has finished and a lane has been freed; meanwhile, if subtask t_{i,j} is not a last-layer subtask, i.e., j ≠ m, its successor subtask t_{i,j+1} is generated; find the execution node s_y of subtask t_{i,j+1} and calculate the transmission time g_{i,j+1}(x, y) of subtask t_{i,j+1} from node s_x to node s_y, then:
t_{i,j+1}.arrival = currTime + g_{i,j+1}(x, y)
where the transmission time is g_{i,j+1}(x, y) = d_j / v_{x,y} + r_{x,y};
Step 306) For all subtasks in the scheduling scheme, repeat steps 302-305 until the arrival time and completion time of every subtask have been determined;
Step 307) Calculate the average response time f_ave(T) of the scheduling scheme, i.e., the average response time u_i.time of individual u_i, according to the following formula:
f_ave(T) = (1/n) Σ_{i=1}^{n} (t_{i,m}.end − t_{i,1}.arrival)
CN201911143030.7A 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment Active CN110837413B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911143030.7A CN110837413B (en) 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911143030.7A CN110837413B (en) 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment

Publications (2)

Publication Number Publication Date
CN110837413A CN110837413A (en) 2020-02-25
CN110837413B true CN110837413B (en) 2023-03-24

Family

ID=69576883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911143030.7A Active CN110837413B (en) 2019-11-20 2019-11-20 Computing migration scheduling method for deep neural network application in edge environment

Country Status (1)

Country Link
CN (1) CN110837413B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114579512A (en) * 2022-03-16 2022-06-03 中国人民解放军总医院 Hierarchical storage method and device of image data, electronic equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800071A (en) * 2019-01-03 2019-05-24 华南理工大学 A kind of cloud computing method for scheduling task based on improved adaptive GA-IAGA
CN109840154A (en) * 2019-01-08 2019-06-04 南京邮电大学 A kind of computation migration method that task based access control relies under mobile cloud environment
CN110073301A (en) * 2017-08-02 2019-07-30 强力物联网投资组合2016有限公司 The detection method and system under data collection environment in industrial Internet of Things with large data sets

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130173510A1 (en) * 2012-01-03 2013-07-04 James Joseph Schmid, JR. Methods and systems for use in reducing solution convergence time using genetic algorithms

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110073301A (en) * 2017-08-02 2019-07-30 强力物联网投资组合2016有限公司 The detection method and system under data collection environment in industrial Internet of Things with large data sets
CN109800071A (en) * 2019-01-03 2019-05-24 华南理工大学 A kind of cloud computing method for scheduling task based on improved adaptive GA-IAGA
CN109840154A (en) * 2019-01-08 2019-06-04 南京邮电大学 A kind of computation migration method that task based access control relies under mobile cloud environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"A Novel Fault-Tolerant Task Scheduling Algorithm for Cloud Computing" (《一种新颖的云计算容错任务调度算法》); Chen Xing et al.; Journal of Chinese Computer Systems (《小型微型计算机系统》); 2017-04-26 (No. 10); 2194-2198 *
"A Session Scheduling Strategy for Streaming Media Edge Cloud Based on Deep Reinforcement Learning" (《基于深度强化学习的流媒体边缘云会话调度策略》); Xu Xijian et al.; Computer Engineering (《计算机工程》); 2019-07-31; Vol. 45, No. 5; 237-242 *

Also Published As

Publication number Publication date
CN110837413A (en) 2020-02-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant