CN112788605B - Edge computing resource scheduling method and system based on double-delay depth certainty strategy - Google Patents


Info

Publication number
CN112788605B
CN112788605B CN202011560881.4A CN202011560881A
Authority
CN
China
Prior art keywords
edge
server
distribution
network
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011560881.4A
Other languages
Chinese (zh)
Other versions
CN112788605A (en)
Inventor
李林峰
肖林松
范律
陈永
余伟峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Willfar Information Technology Co Ltd
Original Assignee
Willfar Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Willfar Information Technology Co Ltd filed Critical Willfar Information Technology Co Ltd
Priority to CN202011560881.4A priority Critical patent/CN112788605B/en
Publication of CN112788605A publication Critical patent/CN112788605A/en
Application granted granted Critical
Publication of CN112788605B publication Critical patent/CN112788605B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W52/00Power management, e.g. TPC [Transmission Power Control], power saving or power classes
    • H04W52/02Power saving arrangements
    • H04W52/0209Power saving arrangements in terminal devices
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W16/00Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/02Resource partitioning among network components, e.g. reuse partitioning
    • H04W16/04Traffic adaptive resource partitioning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W72/00Local resource management
    • H04W72/50Allocation or scheduling criteria for wireless resources
    • H04W72/53Allocation or scheduling criteria for wireless resources based on regulatory allocation policies
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method and a system for scheduling edge computing resources based on a double-delay deep deterministic policy. In the edge computing resource scheduling method, the edge computing system comprises an edge server and a plurality of edge gateways communicatively connected to the edge server, and the method comprises the following steps: the edge server acquires the independent task information sets of all the edge gateways; based on the independent task information sets, the edge distribution network outputs the corresponding optimal server allocation frequency and optimal scheduling sequence for each edge gateway using a double-delay deep deterministic policy gradient algorithm; and the optimal server allocation frequency and the optimal scheduling sequence are sent to the edge gateways to perform scheduling. When system resources are limited and tight, both energy consumption and delay can be greatly reduced, thereby improving the user experience and the utilization of energy and network resources.

Description

Edge computing resource scheduling method and system based on double-delay depth certainty strategy
Technical Field
The invention relates to the field of edge computing, in particular to a method and a system for scheduling edge computing resources based on a double-delay depth certainty strategy.
Background
Fifth generation mobile communication technology (5G) faces new challenges of explosive data traffic growth and large-scale device connectivity. New 5G network services such as virtual reality, augmented reality, unmanned vehicles and smart grids place higher demands on delay, while computation-intensive applications consume a large amount of energy; these problems cannot be solved by the user equipment alone, so edge computing has emerged to address them. Edge computing deploys computing and storage resources at the edge of the mobile network to meet the stringent latency requirements of some applications. The edge gateway can offload its computation tasks wholly or partially to the MEC server through a wireless channel for computation, thereby reducing delay and energy consumption and obtaining a good user experience. Traditional optimization algorithms are feasible for solving the MEC computation offloading and resource allocation problem, but they are not well suited to MEC systems with high real-time requirements. Reinforcement learning algorithms are well suited to solving resource allocation problems, such as MEC server resource allocation.
The problem of minimizing system consumption in edge computing can be solved by finding the optimal offload decision and the resource allocation for the offloaded computation. However, the offload decision vector X is drawn from a feasible set of binary variables and the objective function is non-convex. In addition, as the number of tasks increases, the difficulty of solving the system consumption minimization problem grows exponentially; it is thus a non-convex problem that generalizes the knapsack problem and is NP-hard.
Thus, the existing edge computing system field has shortcomings and needs to be improved and enhanced.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide an edge computing resource scheduling method and system based on a double-delay deep deterministic policy that solve the problems of delay and energy optimization in a 5G heterogeneous network, improve the utilization of computing resources, and reduce task delay through effective offload resource scheduling and a server resource allocation method.
In order to achieve the purpose, the invention adopts the following technical scheme:
an edge computing resource scheduling method based on a double-delay deep deterministic strategy, wherein an edge computing system comprises an edge server and a plurality of edge gateways in communication connection with the edge server, an edge distribution network and an edge distribution target network are constructed in the edge server, and the method comprises the following steps:
the edge server acquires independent task information sets of all the edge gateways;
based on the independent task information set, the edge distribution network respectively outputs corresponding optimal server distribution frequency and optimal scheduling sequence for all the edge gateways by using a double-delay depth deterministic strategy gradient algorithm;
sending the optimal server distribution frequency and the optimal scheduling sequence to the edge gateway to execute scheduling;
the edge distribution target network carries out real-time training based on the acquired independent task information set; and updating the network parameters of the edge distribution network in sections according to the target network parameters of the edge distribution target network.
Preferably, the resource scheduling method based on the dual-delay depth deterministic policy edge computing is characterized in that the independent task information set comprises a plurality of independent task information, and the independent task information at least comprises data volume and CPU (central processing unit) cycle volume required for processing the task; the required CPU cycle amount comprises a cycle amount required by a server and a cycle amount required by an edge gateway;
the implementation step of the edge distribution network using a double-delay depth deterministic policy gradient algorithm to respectively output the corresponding optimal server distribution frequency and optimal scheduling sequence for all the edge gateways is the same as the real-time training step of the edge distribution target network, and specifically comprises the following steps:
s31, based on the independent task information set, solving the pre-distribution frequency distributed by the edge server for each edge gateway through pre-classification;
s32, the edge distribution target network classifies all independent tasks based on the required period quantity of the server and the required period quantity of the edge gateway of each independent task, and stores the tasks into an unloading task set and a local task set respectively;
s33, the edge distribution target network uses a double-delay depth certainty strategy gradient algorithm to carry out server distribution frequency on independent tasks in an unloading task set;
and S34, performing one iteration every time steps S32-S33 are executed, outputting the optimal server allocation frequency after a preset number of iterations, and determining the target network parameters of the edge allocation target network.
Preferably, in the method for scheduling resource based on edge computing of dual-delay depth deterministic policy, the output criteria of the optimal server allocation frequency are: when the iterative computation finally has a convergence result, outputting the server distribution frequency of the iterative convergence as the optimal server distribution frequency; otherwise, outputting the pre-distribution frequency as the optimal server distribution frequency.
Preferably, in the method for scheduling edge computing resources based on the double-delay depth deterministic policy, the network parameters of the edge distribution network are updated in segments according to the target network parameters of the edge distribution target network, and specifically, the method includes: when the edge distribution target network is trained in real time, in step S34, every iteration of the set number of times, the current target network parameter is divided into update sections according to a predetermined step length based on the network parameter before training to obtain update parameters, and the update parameters are used as the network parameters of the edge distribution network for updating.
Preferably, in the edge computing resource scheduling method based on the double-delay deep deterministic policy, the set number of times is 20 to 80.
Preferably, in the edge computing resource scheduling method based on the double-delay deep deterministic policy, in step S31, the specific steps for obtaining the pre-allocation frequency include:
s311, respectively calculating the dominant frequency proportion of the equipment CPU frequency of each edge gateway to the sum of the equipment CPU frequencies of all the edge gateways; distributing CPU frequency for the edge gateway in the edge server according to the main frequency proportion;
s312, calculating local execution time delay of each independent task according to the independent task information, and respectively calculating a relative time delay ratio of the local execution time delay of each edge gateway to the sum of the local execution time delays of all the edge gateways;
s313, respectively calculating the distribution weight of each edge gateway according to the main frequency proportion and the relative time delay proportion;
and S314, respectively calculating the pre-distribution frequency of each edge gateway in the edge server according to the distribution weight and the service CPU frequency.
Preferably, in the edge computing resource scheduling method based on the double-delay deep deterministic policy, step S32 specifically includes:
s321, classifying the independent task information of all the edge gateways according to unloading time and server execution time, adding the independent task information of which the unloading time is less than the server execution time to a first array, and arranging all the independent task information in the first array according to the ascending order of the unloading time; adding the independent task information with the unloading time larger than or equal to the execution time of the server to a second array, and arranging all the independent task information in the second array in a descending order according to the execution time of the server;
s322, obtaining the server execution time and the unloading time of each independent task information in the first array to obtain the server processing time of each independent task information; acquiring the local execution time of each independent task information in the second array;
s323, obtaining a time difference value between the total server processing time of all the independent task information in the first array and the total local execution time of all the independent task information in the second array;
s324, determining all independent task information listed in an array with longer time according to the time difference value to form a third array; taking the processed first array as an unloading task set, and taking the processed second array as a local task set;
s325, respectively calculating server processing time and local execution time of each independent task information in the third array, putting the independent task information of which the server processing time is greater than the local execution time into the local task pre-allocation set, and putting the independent task information of which the server processing time is less than or equal to the local execution time into the unloading task pre-allocation set;
s326, after the independent task information in the third array is distributed, an unloading task set and a local task set are obtained, and an unloading decision vector is obtained according to the final unloading task set.
Preferably, in the edge computing resource scheduling method based on the double-delay deep deterministic policy, in step S34, the predetermined number of times is 100 to 200.
Preferably, in the edge computing resource scheduling method based on the double-delay deep deterministic policy, the edge distribution network consists of a value network and an action network; the edge distribution target network consists of a value target network and an action target network.
An edge computing system comprises an edge server and a plurality of edge gateways which are in communication connection with the edge server, wherein the edge server and the edge gateways work by using the edge computing resource scheduling method based on the double-delay deep deterministic strategy.
Compared with the prior art, the edge computing resource scheduling method and system based on the double-delay depth certainty strategy provided by the invention have the following beneficial effects:
the edge computing resource scheduling method provided by the invention can firstly fix the server frequency distributed to the edge computing gateway when the system resource is limited and tense, then solve the task unloading sequence and the unloading decision which can reach the minimum completion time, finally obtain the optimal server distribution frequency and the optimal scheduling sequence, greatly reduce the energy consumption and delay, and further improve the user experience and the utilization rate of energy and network resources.
Drawings
FIG. 1 is a flow chart of a resource scheduling method provided by the present invention;
FIG. 2 is a block diagram of an edge computing system provided by the present invention;
FIG. 3 is a flow chart of a real-time training and specific output method provided by the present invention;
FIG. 4 is a flow chart of a real-time training and specific output method implementation provided by the present invention;
FIG. 5 is a flow chart of the steps of the pre-sorting server allocating frequencies provided by the present invention;
FIG. 6 is a flow chart of an embodiment of the pre-sorting server frequency allocation step provided by the present invention;
FIG. 7 is a flowchart of the offload task set and offload decision vector acquisition steps provided by the present invention;
FIG. 8 is a flowchart illustrating an exemplary offloading task set and offloading decision vector obtaining procedure provided by the present invention;
FIG. 9 is a flow chart of the network parameter update procedure provided by the present invention;
FIG. 10 is a schematic diagram of a value network architecture provided by the present invention;
FIG. 11 is a schematic diagram of an action network architecture provided by the present invention;
fig. 12 is a graph of iterative rewards for optimizing network parameters provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer and clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
It is to be understood by those skilled in the art that the foregoing general description and the following detailed description are exemplary and explanatory specific embodiments of the invention, and are not intended to limit the invention.
The terms "comprises," "comprising," or any other variation thereof, herein are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps, but may include other steps not expressly listed or inherent to such process or method. Also, without further limitation, one or more devices or subsystems, elements, structures or components recited as "comprising" an item do not exclude the presence of other such devices, subsystems, elements, structures or components. The appearances of the phrases "in one embodiment," "in another embodiment," and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
Referring to fig. 1-2, the present invention provides an edge computing resource scheduling method based on a dual-delay deep deterministic policy, an edge computing system includes an edge server and a plurality of edge gateways communicatively connected to the edge server, an edge distribution network and an edge distribution target network are constructed in the edge server, including the steps of:
the edge server acquires independent task information sets of all the edge gateways; the independent task information at least comprises the data volume and the CPU cycle volume required to process the task. Preferably, in a specific implementation, all edge gateways are first written as the set U = {U_1, U_2, ..., U_K}, and each edge gateway U_i is abstracted into a task set containing two features, G_i = {T_{i,j} | 1 <= j <= N, 1 <= i <= K}, with T_{i,j} = (D_{i,j}, C_{i,j}), where D_{i,j} is the data size of the task of edge gateway U_i in bits, and C_{i,j} is the number of CPU cycles edge gateway U_i requires to process each unit of data, in cycles/bit. The CPU frequency of edge gateway U_i is f_{i,user} in Hz, the CPU frequency the edge server allocates to edge gateway U_i is f_{i,ser} in Hz, the transmission power of edge gateway U_i is p, and the target value is initialized as tc_best = 100.
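As an illustration only (the class names and numeric values below are ours, not the patent's), the task model above can be set up as follows: each gateway U_i holds a set of independent tasks T_{i,j} = (D_{i,j}, C_{i,j}) together with its device CPU frequency and transmission power.

```python
# Illustrative sketch of the task model described above; names are our own.
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    data_bits: float        # D_ij, task data size in bits
    cycles_per_bit: float   # C_ij, CPU cycles needed per bit of data


@dataclass
class EdgeGateway:
    f_user: float           # device CPU frequency f_{i,user} in Hz
    p_tx: float             # transmission power p in W
    tasks: List[Task]


def make_system(num_gateways: int, tasks_per_gw: int) -> List[EdgeGateway]:
    """Build a toy system U = {U_1, ..., U_K} with identical placeholder tasks."""
    return [
        EdgeGateway(
            f_user=1e9,   # 1 GHz device CPU (illustrative value)
            p_tx=0.1,     # 100 mW transmit power (illustrative value)
            tasks=[Task(data_bits=1e6, cycles_per_bit=100.0)
                   for _ in range(tasks_per_gw)],
        )
        for _ in range(num_gateways)
    ]
```
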
Based on the independent task information set, the edge distribution network respectively outputs corresponding optimal server distribution frequency and optimal scheduling sequence for all the edge gateways by using a double-delay depth deterministic strategy gradient algorithm;
sending the optimal server distribution frequency and the optimal scheduling sequence to the edge gateway to execute scheduling;
the edge distribution target network carries out real-time training based on the acquired independent task information set; and updating the network parameters of the edge distribution network in sections according to the target network parameters of the edge distribution target network. Further, the real-time training process of the edge distribution target network and the network parameter segmentation updating process are not limited to be executed in the last step, and may be executed after all the independent task information sets of the edge gateways are received from the edge server, or may be executed at any time after the edge distribution network outputs the optimal server distribution frequency and the optimal scheduling sequence.
Specifically, referring to fig. 3, in the implementation of the edge scheduling method provided by the present invention, the execution principle is as follows:
1. Generate the edge gateway description set U = {U_1, U_2, ..., U_K} and the task description set G = {T_{i,j} | 1 <= j <= N, 1 <= i <= K}, T_{i,j} = (D_{i,j}, C_{i,j}), where D_{i,j} represents the data size of the jth task of edge gateway U_i in bits, and C_{i,j} indicates the number of CPU cycles required to process the task, in cycles/bit.
2. Initialize the current value networks Q_1, Q_2 and the weight parameter w of the current action network P, and synchronize the parameter w' of the target value networks Q_1', Q_2' and the target action network P' as w' = w; initialize the default data structure for experience replay (SumTree), with the priority p_V of the V leaf nodes of the SumTree set to 0, step = 0, epoch = 0.
3. Solve the server allocation frequencies by pre-classification: f_{ser,base} = f_{ser,best} = {f_{1,ser}, ..., f_{K,ser}}.
4. Solve the offload decision vector X with the offload scheduling method based on the number of CPU cycles required to process each task; classify all tasks according to the offload decision vector X, put the offloaded tasks and the locally executed tasks into the sets S and L respectively, and compute the initial state s = (tc, ac).
5. For the server allocation frequencies f_{ser,best} = {f_{1,ser}, ..., f_{K,ser}} of all edge gateways in set U, solve using the TD3 (Twin Delayed Deep Deterministic policy gradient) algorithm.
6. Repeat steps 4-5 M times and output f_{ser,best} = {f_{1,ser}, ..., f_{K,ser}} and the optimal target value Val_best.
The invention greatly reduces task execution delay and energy consumption when task resources in the edge computing network are limited, and improves the utilization of the server resources of the edge computing system and the battery life of the edge gateways.
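The six-step execution principle above can be sketched as the following outer loop; `preclassify`, `offload_schedule`, and `td3_allocate` are placeholder callables standing in for steps 3, 4, and 5, and are our own names, not the patent's code.

```python
# Hedged skeleton of the outer loop (steps 3-6); the three callables are placeholders.
def optimize(preclassify, offload_schedule, td3_allocate, M=100):
    """Outer loop of the scheduling method; returns best frequencies and target value."""
    f_ser = preclassify()                    # step 3: pre-classified allocation frequencies
    best_f, best_val = f_ser, float("inf")   # initialized target value (tc_best in the text)
    for _ in range(M):                       # step 6: repeat steps 4-5 M times
        X, val = offload_schedule(f_ser)     # step 4: offload decision vector + target value
        f_ser = td3_allocate(f_ser, X)       # step 5: TD3 refines the allocation frequencies
        if val < best_val:                   # keep the best target value seen so far
            best_val, best_f = val, f_ser
    return best_f, best_val
```

A trivial usage with dummy callables: `optimize(lambda: [1.0, 2.0], lambda f: (None, sum(f)), lambda f, X: f, M=3)` runs three iterations and returns the unchanged frequencies with their cost.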
As a preferred solution, in this embodiment, the step of updating the network parameter of the edge distribution network in segments according to the target network parameter of the edge distribution target network specifically includes: when the edge distribution target network is trained in real time, in step S34, every iteration of the set number of times, the current target network parameter is divided into update sections according to a predetermined step length based on the network parameter before training to obtain update parameters, and the update parameters are used as the network parameters of the edge distribution network for updating. For example, one of the current network parameters of the edge distribution network is 10, and the corresponding parameter value of the edge distribution target network in the trained network parameters is 11, at this time, the network parameters of the edge distribution network are optimized, but instead of directly changing the corresponding parameter value to 11, an update interval is defined between 10 and 11 according to a predetermined step length, for example, 0.2 is a predetermined length, and is increased by 0.2 each time the parameter value is updated. Preferably, the set number of times is 20 to 80. More preferably, the set number of times is 50 times. The predetermined step size is preferably 5-20% of the difference between the previous and subsequent values of the network parameter to be updated. Preferably 10%. By using the segmented updating method provided by the invention to update the parameters of the network, the time for configuring the optimal network parameters can be shortened, and the time can be shortened by half in the actual operation.
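A minimal sketch of the segmented update described above, using the 10-to-11 example with a fixed step of 0.2: the parameter is moved toward the trained target value one fixed segment at a time instead of being overwritten. The function and its name are ours, not the patent's.

```python
# Segmented (stepwise) network-parameter update: move one fixed segment toward
# the target value per update, never overshooting the target.
def segmented_update(current: float, target: float, step: float) -> float:
    """Move `current` one segment of size `step` toward `target`."""
    if abs(target - current) <= step:
        return target                 # final segment: land exactly on the target
    return current + step if target > current else current - step
```

For example, with `current=10`, `target=11`, and `step=0.2` (10% of the gap, matching the preferred 5-20% range), five successive updates walk the parameter from 10 up to 11.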
As a preferred solution, in this embodiment, referring to fig. 3 to 4, the independent task information set has a plurality of independent task information, where the independent task information at least includes a data amount and a CPU cycle amount required for processing the task; the required CPU cycle amount comprises a cycle amount required by a server and a cycle amount required by an edge gateway;
the implementation step of the edge distribution network using a double-delay depth deterministic policy gradient algorithm to respectively output the corresponding optimal server distribution frequency and optimal scheduling sequence for all the edge gateways is the same as the real-time training step of the edge distribution target network, and specifically comprises the following steps:
s31, based on the independent task information set, solving the pre-distribution frequency distributed by the edge server for each edge gateway through pre-classification;
further, referring to fig. 5-6, in an implementation, in the step S31, the step of specifically obtaining the pre-allocated frequency includes:
s311, respectively calculating a main frequency proportion of the equipment CPU frequency of each edge gateway to the sum of the equipment CPU frequencies of all the edge gateways; distributing CPU frequency for the edge gateway in the edge server according to the main frequency proportion;
s312, calculating local execution time delay of each independent task according to the independent task information, and respectively calculating a relative time delay ratio of the local execution time delay of each edge gateway to the sum of the local execution time delays of all the edge gateways;
s313, respectively calculating the distribution weight of each edge gateway according to the main frequency proportion and the relative time delay proportion;
and S314, respectively calculating the pre-distribution frequency of each edge gateway in the edge server according to the distribution weight and the service CPU frequency.
The operation principle of step S31 is specifically as follows. The execution time of task T_{i,j} of edge gateway U_i at the edge server is expressed as

t_{(i,j),ser} = D_{i,j} * C_{i,j} / f_{i,ser}  (1)

The local execution time of task T_{i,j} is expressed as

t_{(i,j),L} = D_{i,j} * C_{i,j} / f_{i,user}  (2)

The offload transmission rate of task T_{i,j} is:

r_i = w * log2(1 + p * g_0 * (L_0 / L_i)^theta / (N_0 * w))  (3)

where w is the transmission bandwidth, g_0 is a path loss constant, L_0 is a relative distance, L_i is the actual distance between the edge gateway and the edge server, theta is the path loss exponent, N_0 is the noise power spectral density, and p denotes the transmission power with which the edge gateway offloads task T_{i,j} to the edge server.

The offload transfer time of task T_{i,j} is

t_{(i,j),S} = D_{i,j} / r_i  (4)

The offload transmission energy consumption of task T_{i,j} is e_{(i,j),S}:

e_{(i,j),S} = p * t_{(i,j),S} = p * D_{i,j} / r_i  (5)

The local execution energy consumption of task T_{i,j} is e_{(i,j),L}:

e_{(i,j),L} = delta_L * C_{i,j}  (6)

where delta_L is the energy the edge gateway consumes per CPU cycle, in joules per cycle.
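As an illustrative aid (not part of the patent), the per-task delay and energy model of equations (1)-(6) can be written out in Python. Parameter names mirror the text; the rate in equation (3) is taken in its usual Shannon-capacity form under our reading of the garbled source, and all numeric inputs in the usage example are our own.

```python
import math

# Per-task delay/energy model following equations (1)-(6) of the description.
def server_exec_time(D, C, f_ser):
    """Eq (1): execution time at the edge server (D bits * C cycles/bit / Hz)."""
    return D * C / f_ser

def local_exec_time(D, C, f_user):
    """Eq (2): local execution time on the edge gateway."""
    return D * C / f_user

def offload_rate(w, p, g0, L0, Li, theta, N0):
    """Eq (3): offload transmission rate (assumed Shannon-rate form)."""
    return w * math.log2(1 + p * g0 * (L0 / Li) ** theta / (N0 * w))

def offload_time(D, r):
    """Eq (4): offload transfer time, data size over rate."""
    return D / r

def offload_energy(p, D, r):
    """Eq (5): offload transmission energy, transmit power times transfer time."""
    return p * D / r

def local_energy(delta_L, C):
    """Eq (6): local execution energy, as given in the text."""
    return delta_L * C
```

For instance, a 1 Mbit task at 100 cycles/bit runs in 0.01 s on a 10 GHz server share versus 0.1 s on a 1 GHz gateway CPU.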
Please refer to fig. 6, which shows the specific steps of solving the server allocation frequencies by pre-classification:
1) Compute the ratio f_{i,ratio} of the resources of edge gateway U_i to the total resources of the system:

f_{i,ratio} = f_{i,user} / (sum over k of f_{k,user})  (7)

2) Compute the relative proportion t_{i,ratio} of the local execution delay of edge gateway U_i to the total system delay:

t_{i,ratio} = t_{i,L} / (sum over k of t_{k,L})  (8)

where t_{i,L} is the total local execution delay of the tasks of edge gateway U_i.

3) Compute the frequency assignment weight eta_i of edge gateway U_i from the main frequency proportion and the relative delay proportion:

eta_i = (f_{i,ratio} + t_{i,ratio}) / 2  (9)

4) Compute the allocation frequency f_{i,base} of edge gateway U_i:

f_{i,base} = eta_i * F  (10)

where F is the total CPU frequency of the edge server.
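A minimal Python sketch of the pre-classification allocation in equations (7)-(10). The exact combination used for the weight eta_i in equation (9) is our assumption (the average of the two ratios), chosen so that the weights sum to one and the allocated frequencies sum to the total server frequency F.

```python
def preallocate_frequencies(f_user, t_local, F):
    """Pre-classification frequency allocation, equations (7)-(10).

    f_user  : list of device CPU frequencies f_{i,user} per gateway
    t_local : list of total local execution delays t_{i,L} per gateway
    F       : total server CPU frequency to split among gateways
    The eta_i form is an assumption; it averages the two ratios so allocations sum to F.
    """
    f_sum = sum(f_user)
    t_sum = sum(t_local)
    f_ratio = [f / f_sum for f in f_user]    # eq (7): main-frequency proportion
    t_ratio = [t / t_sum for t in t_local]   # eq (8): relative delay proportion
    eta = [(fr + tr) / 2                     # eq (9): assumed averaged weight
           for fr, tr in zip(f_ratio, t_ratio)]
    return [e * F for e in eta]              # eq (10): f_{i,base} = eta_i * F
```

With two equal-frequency gateways whose local delays are 1 and 3, a 10 Hz budget splits as 3.75 and 6.25, favoring the slower gateway.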
S32, the edge distribution target network classifies all independent tasks based on the required period quantity of the server and the required period quantity of the edge gateway of each independent task, and stores the tasks into an unloading task set and a local task set respectively;
further, referring to fig. 7-9, in an implementation, the step S32 specifically includes:
s321, classifying the independent task information of all the edge gateways according to unloading time and server execution time, adding the independent task information of which the unloading time is less than the server execution time to a first array, and arranging all the independent task information in the first array according to the ascending order of the unloading time; adding the independent task information with the unloading time larger than or equal to the execution time of the server to a second array, and arranging all the independent task information in the second array in a descending order according to the execution time of the server;
s322, obtaining the server execution time and the unloading time of each independent task information in the first array, and obtaining the server processing time of each independent task information; acquiring the local execution time of each independent task information in the second array;
s323, obtaining a time difference value between the total server processing time of all the independent task information in the first array and the total local execution time of all the independent task information in the second array;
s324, determining all independent task information listed in the array with longer time according to the time difference value to form a third array; taking the processed first array as an unloading task set, and taking the processed second array as a local task set;
s325, respectively calculating server processing time and local execution time of each independent task information in the third array, putting the independent task information of which the server processing time is greater than the local execution time into the local task pre-allocation set, and putting the independent task information of which the server processing time is less than or equal to the local execution time into the unloading task pre-allocation set;
and S326, after the independent task information in the third array is distributed, obtaining an unloading task set and a local task set, and obtaining an unloading decision vector according to the final unloading task set.
In practical application, the operation principle is as follows: the unloading decision vector is solved by an unloading scheduling method based on the number of CPU cycles required to process each task. The steps are as follows:
Input: the task set G_i of edge gateway U_i, the CPU frequency f_i,user of edge gateway U_i, and the CPU frequency f_i,ser allocated by the server to edge gateway U_i.
Output: the unloading task set S_i = {S_i,1, S_i,2, ..., S_i,Ns}, the local task set L_i = {L_i,1, L_i,2, ..., L_i,Nl}, and the unloading decision vector X_i = {x_i,1, x_i,2, ..., x_i,K}.
1) Arrange all tasks in G_i in descending order of the number of CPU cycles required to process each task, obtaining a new task order.
2) Set up the local and unloading arrays and let the initial index value h = 1. According to equations (11) and (12), separately calculate the completion time of the h-th task in the new order when it is put into the local set L_i and when it is put into the unloading set S_i.
3) If the completion time when put into the local set is the smaller one, the task is put into the local set L_i, its unloading decision variable x_i,h is set to 0, h = h + 1, k0 = k0 + 1, and step i) is repeatedly executed until it exits into step 4). Otherwise, the task is put into the unloading set S_i, its unloading decision variable x_i,h is set to 1, h = h + 1, and step ii) is repeatedly executed until it exits into step 4).
i) Repeatedly execute this step until it exits into step 4): compare the completion times of the current task when put into the local set L_i and into the unloading set S_i. Calculate the completion time of task L_i,k0 according to equation (13) and the completion time of task S_i,k1 according to equation (14). If the local completion time is the smaller one, the task's unloading decision variable x_i,h is set to 0, the task is put into the local set L_i, and h = h + 1; otherwise, the task is put into the unloading set S_i, h = h + 1, and step 4) is performed.
ii) Repeatedly execute this step until it exits into step 4): compare the completion times of the current task when put into the local set L_i and into the unloading set S_i. Calculate the completion time of task L_i,k0 according to equation (13) and the completion time of task S_i,k1 according to equation (14). If the unloading completion time is the smaller one, the task's unloading decision variable x_i,h is set to 1, the task is put into the unloading set S_i, and h = h + 1; otherwise, the task is put into the unloading set S_i, h = h + 1, and step 4) is performed.
Equations (13) and (14), together with the definitions accompanying them, give the two completion times compared above.
4) If the completion time when put into the local set is the smaller one, the task's unloading decision variable x_i,h is set to 0 and the task is put into the local set L_i; otherwise the unloading decision variable x_i,h is set to 1 and the task is put into the unloading set S_i. Then h = h + 1, and this step is repeated until h reaches N.
Classify all tasks in the unloading set S_i by comparing the unloading transmission time with the edge server execution time. Tasks whose unloading transmission time is less than the edge server execution time are added to array P_i, and P_i is arranged in ascending order of the unloading transmission time of its tasks. Tasks whose unloading transmission time is greater than or equal to the edge server execution time are added to array Q_i, and Q_i is arranged in descending order of the edge server execution time of its tasks. Appending array Q_i to array P_i yields the new task order σ_i = [P_i Q_i].
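The P/Q ordering of the unloading set described above (transmission-bound tasks first in ascending transmission time, then server-bound tasks in descending server execution time) can be sketched as follows; the task tuples are illustrative, not the patent's data layout.

```python
def order_offload_set(tasks):
    # Each task is (name, transmit_time, server_exec_time); illustrative only.
    p = sorted((t for t in tasks if t[1] < t[2]), key=lambda t: t[1])
    q = sorted((t for t in tasks if t[1] >= t[2]),
               key=lambda t: t[2], reverse=True)
    return p + q  # sigma_i = [P_i Q_i]
```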
S33, the edge distribution target network solves the server allocation frequency for the independent tasks in the unloading task set by using the double-delay deep deterministic policy gradient algorithm;
and S34, performing one iteration every time steps S32-S33 are executed, outputting the optimal server allocation frequency after a preset number of iterations, and determining the target network parameters of the edge allocation target network. Preferably, the predetermined number of times is 100-200, and more preferably 150. As a preferred solution, in this embodiment, the output standard of the optimal server allocation frequency is: when the iterative computation finally has a convergence result, outputting the server distribution frequency of the iterative convergence as the optimal server distribution frequency; otherwise, outputting the pre-distribution frequency as the optimal server distribution frequency.
Specifically, in the implementation of steps S33 and S34, the operation principle is as follows: according to the unloading task set and the unloading decision vector obtained in step S32, the server resource allocation f_ser,best = {f_1,ser, ..., f_K,ser} of all edge gateways U = {U_1, U_2, ..., U_K} is solved by using the TD3 algorithm. The solving steps are as follows:
Input: iteration step T, maximum cycle number M, soft update step τ_c, soft update weight ratio τ_ratio, sample sampling weight coefficient β, attenuation factor γ, exploration rate ε, current value networks Q_1 and Q_2, target value networks Q_1' and Q_2', current action network P and target action network P', the number m of samples for batch gradient descent, and the number V of SumTree leaf nodes.
Output: server resource allocation f_ser,best = {f_1,ser, ..., f_K,ser}.
1) The objective of the joint task scheduling and server resource allocation problem is to minimize the energy consumption and completion time of all tasks. The mathematical model of the optimization problem is represented by equations (16) to (21) and is denoted as the original problem P1, where equation (16) is the objective function and equations (17) to (21) are the constraints.
In the model, Ns represents the number of unloading tasks, Nl represents the number of locally executed tasks, the first term of the objective is the completion time of all sorted unloading tasks, and the energy term is the total power consumption of the edge server executing all tasks. The completion time of the j-th sorted unloading task consists of the server processing time of the j-th unloading task in set S_i and the transmission time of the 1st to j-th unloading tasks in S_i, the latter calculated by equation (15).
2) Initialize and normalize a state s = (tc, ac), where tc is the system consumption of the whole system in the current state, obtained from equation (16), and ac is the available computing capacity of the MEC server. The system state s is normalized using tc_μ and tc_σ, the mean and variance of the system consumption, and ac_μ and ac_σ, the mean and variance of the server's remaining frequency.
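The normalization formulas are rendered as images in the source; assuming they encode the usual standardization (value minus mean, divided by the spread), a minimal sketch:

```python
def normalize_state(tc, ac, tc_mu, tc_sigma, ac_mu, ac_sigma):
    # z-score style normalization of the state s = (tc, ac); this is an
    # assumed reading of the formula images, not the patent's exact form.
    return ((tc - tc_mu) / tc_sigma, (ac - ac_mu) / ac_sigma)
```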
3) With probability ε, generate a random action a = {(f_1,ser, ..., f_i,ser, ..., f_K,ser) | 0 ≤ f_i,ser ≤ 2f_i,base, 0 ≤ i ≤ K} and obtain a new system state s' = (tc, ac); or, with probability 1 - ε, input the state s = (tc, ac) into the target action network P' to obtain the predicted action a = {(f_1,ser, ..., f_i,ser, ..., f_K,ser) | 0 ≤ f_i,ser ≤ 2f_i,base, 0 ≤ i ≤ K}. Then step = step + 1.
Here ε_max is the convergence probability, ε_min is the minimum random probability, ε_const is the random rate constant, and step is the number of predictions made by the neural network.
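Step 3) is a standard ε-greedy choice between a random allocation and the target actor's prediction. A sketch follows; the exponential decay of ε from ε_max toward ε_min is an assumption, since the decay formula is an image in the source.

```python
import math, random

def epsilon(step, eps_max, eps_min, eps_const):
    # Assumed decay schedule: start near eps_max, settle at eps_min.
    return eps_min + (eps_max - eps_min) * math.exp(-step / eps_const)

def select_action(state, actor, step, K, f_base,
                  eps_max=0.9, eps_min=0.05, eps_const=200.0):
    if random.random() < epsilon(step, eps_max, eps_min, eps_const):
        # random allocation in [0, 2 * f_base[i]] for each gateway
        return [random.uniform(0.0, 2.0 * f_base[i]) for i in range(K)]
    return actor(state)  # predicted action from the target actor P'
```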
The predicted action a is calculated from the target action network P' as follows:
i) Add Gaussian white noise with mean 0 and variance σ to the output layer of the target action network P' and clip the result to [0, 1], as in equation (24), where q_k is the output value of the k-th neuron of the output layer of the target action network P'.
ii) Normalize (q_1, ..., q_k, ..., q_K) according to equation (25).
iii) Adjust the output values q_k to the proper action interval to obtain the action a:
a = (q_1, ..., q_k, ..., q_K) * 2f_i,base, 1 ≤ k ≤ K, k ∈ N* (26)
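Steps i) to iii) above can be sketched as follows. The concrete normalization of equation (25) is an image in the source, so dividing by the vector sum is one assumed reading; the noise parameter is treated as a standard deviation here although the text calls σ a variance.

```python
import random

def shape_action(q, f_base, sigma=0.1):
    # i) add zero-mean Gaussian noise and clip each component to [0, 1]
    noisy = [min(1.0, max(0.0, v + random.gauss(0.0, sigma))) for v in q]
    # ii) normalize -- assumed reading of eq. (25): divide by the sum
    total = sum(noisy) or 1.0
    norm = [v / total for v in noisy]
    # iii) scale to the action interval [0, 2 * f_base[k]]
    return [v * 2.0 * f for v, f in zip(norm, f_base)]
```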
4) Calculate the next state s' = (tc, ac) from the action a. If ac < 0, the flag bit end = True; otherwise end = False. Calculate the reward r according to equation (29), store Sam = (s, s', r, q, end) into the SumTree in sequence, and perform the state iteration s = s'. The cumulative reward of the current round (epoch) is also calculated.
5) If tc < tc_best, then tc_best = tc and f_ser,best = a.
6) Judge whether step > V holds, i.e., whether the experience pool is full; if yes, enter the next step; if not, return to step 2).
7) Extracting m samples from SumTree to train the neural network in the following way:
i) Let i = 1 and j = 1. Sum all leaf nodes in the SumTree to obtain the priority of the root node, whose value is L_1,1. The SumTree has Floor = 1 + log2(V) layers in total.
ii) Divide the root node priority L_1,1 into equal intervals, randomly select one number from each interval, and obtain t = [t_1, ..., t_i, ..., t_y].
iii) According to t_i, start the search from the topmost root node.
iv) Let left be the priority of the left child node and right the priority of the right child node. If left > t_i, enter the left child node; otherwise enter the right child node and set t_i = t_i - left. Then j = j + 1. Repeat this step until j > Floor; at that point the sample stored in the leaf node that t_i has reached is Sam_i.
v) Repeat the above steps until Sam = [Sam_1, ..., Sam_m], m samples in total, have been selected.
vi) Update the priority of each sample; the sample priority p_y is updated as:
p_y = loss_m + 0.0001, y ∈ V (30)
where loss_m is the value loss of sample m, and the constant 0.0001 prevents L_1,1 = 0 after summation.
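A minimal sum-tree matching the stratified sampling and priority update described above; the array layout and the power-of-two leaf count are implementation assumptions, not the patent's exact data structure.

```python
import random

class SumTree:
    def __init__(self, V):
        # V leaf nodes (assumed a power of two); tree[1] is the root L_1,1
        self.V = V
        self.tree = [0.0] * (2 * V)
        self.data = [None] * V
        self.write = 0

    def add(self, priority, sample):
        idx = self.write % self.V  # overwrite the oldest entry when full
        self.data[idx] = sample
        self.update(idx, priority)
        self.write += 1

    def update(self, idx, priority):
        i = idx + self.V
        self.tree[i] = priority
        i //= 2
        while i >= 1:  # propagate the new sum up to the root
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
            i //= 2

    def sample(self, m):
        # Split the root priority into m intervals, draw one number per
        # interval, and descend: go left if the left child's priority
        # exceeds t, else subtract it and go right (steps i-v above).
        total, out = self.tree[1], []
        for j in range(m):
            t = random.uniform(j * total / m, (j + 1) * total / m)
            i = 1
            while i < self.V:
                left = 2 * i
                if self.tree[left] > t:
                    i = left
                else:
                    t -= self.tree[left]
                    i = left + 1
            out.append(self.data[i - self.V])
        return out
```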
8) Using the m extracted samples, update by back propagation all parameters of the current value networks Q_1 and Q_2, the target value networks Q_1' and Q_2', the current action network P, and the target action network P'. The network updating method is as follows:
i) Input the sample Sam = (s, s', r, q, end) into the target action network P' to obtain the output layer vector q = (q_1, ..., q_k, ..., q_K); add Gaussian white noise with mean 0 and variance σ and clip the result, the adding and clipping being as shown in equation (24). Then normalize and adjust to the proper interval to obtain the predicted action a, the normalization and adjustment being as shown in equations (25) and (26).
ii) Input the system state (s', a) into the target value network Q_1' and the target value network Q_2' to obtain the target value vector vq = (vq_1, vq_2), and substitute it to obtain the expected value vq_exp. Here γ is the attenuation coefficient, where γ_max is the maximum value of the attenuation coefficient, γ_min is the minimum value of the attenuation coefficient, and γ_const is the attenuation coefficient constant.
iii) Input the system state (s, a) into the current value networks Q_1 and Q_2 and, combined with the expected value vq_exp, calculate the value loss; then perform gradient descent and back propagation to update the weight coefficients of the value networks Q_1 and Q_2. In the value loss, vq_l,j is the value obtained by inputting the system state (s, a) into the value networks Q_1 and Q_2, and ws_j is the sample weight corresponding to the j-th sample, in whose calculation m is the number of extracted samples, p_j is the priority of the j-th sample, and β is the sample sampling weight coefficient; β_start is the initial value of the sampling weight coefficient and β_const is the sample weight coefficient constant.
iv) Calculate the average value loss over the m samples, and update the sample priority p_y by equation (30).
9) Judge whether step % τ_c = 0 holds; if yes, enter step 10), otherwise enter step 11).
10) Input the system state s into the current action network P to obtain the predicted action a, substitute (s, a) into the current value network Q_1 to obtain the action loss loss_a, back-propagate loss_a, and update the current value network Q_1 and the current action network P. The weights of the target networks are updated from the corresponding current networks in a soft updating mode:
w' = w'(1 - τ_ratio) + w * τ_ratio (37)
where w' is the weight of the target network and w is the weight of the current network.
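Equation (37) is the familiar Polyak (soft) target update; a one-line sketch over flat weight lists:

```python
def soft_update(target_w, current_w, tau_ratio):
    # w' = w' * (1 - tau_ratio) + w * tau_ratio, applied element-wise
    return [wt * (1.0 - tau_ratio) + w * tau_ratio
            for wt, w in zip(target_w, current_w)]
```

With a small τ_ratio the target network trails the current network slowly, which stabilizes the TD targets.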
11) Judge whether end = True or step % T = 0 holds; if yes, enter the next step, otherwise return to step 2).
Judge whether epoch < M holds; if yes, return to step S32, otherwise output tc_best and f_ser,best. The specific reward behavior can be seen in fig. 12: the cumulative reward keeps increasing, and the reward obtained by each predicted action improves overall.
As shown in fig. 10 and 11, in the present embodiment, the edge distribution network preferably includes a value network and an action network, and the edge distribution target network is composed of a value target network and an action target network. Further, there are preferably two value networks and one action network, which together form one edge distribution network; correspondingly, there are preferably two value target networks and one action target network, which form the edge distribution target network. The edge distribution target network is jointly trained in real time, and when the network parameters of the real-time edge distribution network are updated, the three networks (two value networks and one action network) are updated synchronously. Specifically, the value networks are mainly used for supervising the operation of the action network, and the action network is mainly used for outputting the optimal server allocation frequency and the optimal scheduling sequence. The action network contains a layer of K neurons respectively connected to the edge gateways, used for receiving the independent task information sets uploaded by the edge gateways and for sending the optimal scheduling sequence to the corresponding edge gateways; in this embodiment, this layer has as many neurons as the edge computing system has edge gateways.
Specifically, the following description takes an edge computing system as an example. Fig. 2 is a schematic diagram of the edge computing scenario model, which includes an edge server, K mobile edge gateways (K = 2), and 7 independent tasks (N = 7). Let the set of computing tasks be given; each task T_i,j has an amount of data D_i,j to be processed, each task T_i,j requires C_i,j CPU cycles per unit of data, the maximum transmission power of each task is p_max = 100 mW, and the transmission distances from the edge gateways to the edge server are L = {L_1, L_2}.
S1-1 initializes the task set; D_i,j and C_i,j of each task T_i,j are shown in Table 1. To solve for the optimal solution, assume that the transmission powers corresponding to the two edge gateways are p = (64.248, 59.039) mW, the energy consumption per CPU cycle of the edge gateway is δ_L = 1.6541×10^-9 W/Hz, the CPU frequencies of the edge gateways are f_user = (0.5, 1) GHz, the distances from the edge gateways U = {U_1, U_2} to the edge server are L = (154.881, 171.518) m, the CPU frequency of the edge server is f_ser = 2 GHz, and each edge gateway has a transmission bandwidth of 5 MHz.
TABLE 1 Parameter table for the individual tasks
The system parameters are shown in table 2.
TABLE 2 Execution time and energy consumption list of tasks
S1-2 initializing value network Q 1 ,Q 2 ,Q 1 ',Q' 2 And a weight parameter of the action network P, P'. Initializing a default data structure for empirical playback of SumTree, the priority p of the V (V64) leaf nodes of SumTree V 1, epoch is 0. The neural network structure is shown in fig. 6 and fig. 7.
S1-3, solving the server pre-classification distribution frequency:
Calculate for each task in G = (G_1, G_2) the local execution time, the task transmission time, the task transmission energy consumption e_(i,j),S and the local execution energy consumption e_(i,j),L; the calculation results are shown in Table 3.
TABLE 3 Execution time and energy consumption of tasks
From equation (7), the relative proportion of the local resources of edge gateways U = {U_1, U_2} to the total resources of the system is f_i,ratio = (0.016, 0.327).
From equation (8), the relative proportion of the local execution time delay to the total time delay of the system is t_i,ratio = (0.063, 0.936).
From equation (9), the frequency assignment weights of the edge gateways U = {U_1, U_2} are η_i = (0.576, 0.424).
From equation (10), the distributed frequencies of the edge gateways are f_i,base = (1.15×10^9, 8.49×10^8).
S1-4, solving the unloading decision vector by the unloading scheduling method based on the number of CPU cycles required to process each task:
s2-1 according to G i The CPU period number required by the middle task to process the task is arranged in a descending order for all the tasks to obtain a new task order
Figure BDA0002859343270000192
TABLE 4 ordering table for CPU period number needed by task
G 1 T 1,6 T 1,4 T 1,5 T 1,1 T 1,2 T 1,3 T 1,7
G 2 T 2,7 T 2,2 T 2,5 T 2,6 T 2,4 T 2,1 T 2,3
S2-2 Set up the local and unloading arrays and let the initial index value h = 1. According to equations (1), (2) and f_i,base, separately compute the completion time of the h-th task in the new order when it is put into the local set L_i and when it is put into the unloading set S_i.
S2-3 If the completion time when put into the local set is the smaller one, the task is put into the local set L_i, its unloading decision variable x_i,h is set to 0, h = h + 1, k0 = k0 + 1, and step S3-1 is repeatedly executed until it exits into step S2-4. Otherwise, the task is put into the unloading set S_i, its unloading decision variable x_i,h is set to 1, h = h + 1, and step S3-2 is repeatedly executed until it exits into step S2-4.
S3-1 Repeatedly execute this step until it exits into step S2-4: compare the completion times of the current task when put into the local set L_i and into the unloading set S_i. Calculate the completion time of task L_i,k0 according to equation (13) and the completion time of task S_i,k1 according to equation (14). If the local completion time is the smaller one, the task's unloading decision variable x_i,h is set to 0, the task is put into the local set L_i, and h = h + 1; otherwise, the task is put into the unloading set S_i, h = h + 1, and step S2-4 is performed.
S3-2 Repeatedly execute this step until it exits into step S2-4: compare the completion times of the current task when put into the local set L_i and into the unloading set S_i. Calculate the completion time of task L_i,k0 according to equation (13) and the completion time of task S_i,k1 according to equation (14). If the unloading completion time is the smaller one, the task's unloading decision variable x_i,h is set to 1, the task is put into the unloading set S_i, and h = h + 1; otherwise, the task is put into the unloading set S_i, h = h + 1, and step S2-4 is performed.
S2-4 If the completion time when put into the local set is the smaller one, the task's unloading decision variable x_i,h is set to 0 and the task is put into the local set L_i; otherwise the unloading decision variable x_i,h is set to 1 and the task is put into the unloading set S_i. Then h = h + 1, and this step is repeated until h reaches N.
At this time, the task distribution in set S_i and set L_i is shown in Table 5.
TABLE 5 Distribution of tasks in set S and set L
S2-5 Classify all tasks in the unloading set S_i by comparing the unloading transmission time with the edge server execution time. Tasks whose unloading transmission time is less than the edge server execution time are added to array P_i, and P_i is arranged in ascending order of the unloading transmission time of its tasks. Tasks whose unloading transmission time is greater than or equal to the edge server execution time are added to array Q_i, and Q_i is arranged in descending order of the edge server execution time of its tasks. Appending array Q_i to array P_i yields the new task order σ_i = [P_i Q_i]. At this time, the task distribution in set P_i and set Q_i is shown in Table 6:
TABLE 6 Task distribution in sets P_i and Q_i
P_1: T_1,1, T_1,3, T_1,2
Q_1: T_1,5, T_1,4, T_1,6
P_2: T_2,5, T_2,6, T_2,4, T_2,1
Q_2: T_2,2
S1-5, according to the unloading task set and the unloading decision vector obtained in step S1-4, solve the server resource allocation f_ser,best = {f_1,ser, ..., f_K,ser} of all edge gateways U = {U_1, U_2, ..., U_K} by using the TD3 algorithm:
S4-1 constructs an optimization problem P1.
S4-2 randomly generates a number ε_0 in (0, 1). If ε_0 < ε, a random action is generated; otherwise the state s is input into the target action network P' to obtain the predicted value q, which is normalized and adjusted to the proper interval to obtain the action a. step = step + 1.
At this time ε_0 = 0.15 and ε = 0.2, so ε_0 < ε, and the generated random action is a = (1.046×10^9, 9.5308×10^8).
S4-3 calculates the next normalized state s' = (0.0071, 0) according to action a; end = False; the reward r = 0.31; (s, s', r, q, end) is stored into the SumTree; the state iteration s = s' is performed; the target value tc = 0.0286.
S4-4 judges whether tc < tc_best holds; if yes, tc_best = tc and f_ser,best = f_ser. If not, the process proceeds directly to S4-5.
S4-5 judges whether step > V is satisfied, if not, returns to step S4-2, and if so, proceeds to step S4-6.
S4-6 extracts m samples from the SumTree to train the neural networks Q_1, Q_2 and P, and updates the priority of each sample.
S4-7 determines whether step % C = 0 holds; if yes, soft-update the neural networks Q_1', Q_2' and P'; if not, proceed directly to S4-8.
S4-8 determines whether end = True or step % T = 0 holds; if yes, epoch = epoch + 1. If not, the process returns to step S4-2.
S1-6 judges whether epoch < M holds; if yes, the process returns to step S1-4; otherwise tc_best and f_ser,best are output.
The final optimization results are shown in the following table:
System execution delay: 0.011825
System consumption: 0.021417
Server allocation frequencies: [1.03294728e+09, 9.67052723e+08]
In summary, unlike conventional optimization algorithms, reinforcement learning can create its own learning experience through a trial-and-feedback mechanism to accomplish the optimization objective, and the deep learning component can learn the characteristics of historical data, so that efficiency after training is greatly improved over conventional optimization algorithms. The joint task scheduling and resource allocation method is an unloading iterative algorithm combining scheduling optimization with reinforcement learning: 1. Fix the server frequency allocated to each edge gateway, then solve the task unloading sequence and unloading decision that reach the minimum completion time. 2. With the unloading sequence obtained in the previous step held fixed, solve the optimal server distribution frequency corresponding to each unloading task in the sequence. The two steps are iterated repeatedly to finally obtain the optimal server distribution frequency and the optimal scheduling sequence.
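The two-step iteration summarized above is an alternating (block-coordinate) optimization; a schematic skeleton, with both solvers as placeholder callables rather than the patent's concrete algorithms:

```python
def joint_optimize(solve_schedule, solve_frequency, f_init, max_iters=150):
    # Step 1: with the server frequency fixed, solve the unloading
    #         sequence and decision (scheduling optimization).
    # Step 2: with the sequence fixed, solve the server allocation
    #         frequency (TD3 in the patent; a placeholder here).
    f = f_init
    schedule = None
    for _ in range(max_iters):
        schedule = solve_schedule(f)
        f = solve_frequency(schedule)
    return schedule, f
```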
The invention also provides an edge computing system, which comprises an edge server and a plurality of edge gateways in communication connection with the edge server, wherein the edge server and the edge gateways work by using the edge computing resource scheduling method based on the double-delay depth certainty strategy. The joint task scheduling and resource allocation method is an unloading iterative algorithm combining scheduling optimization and reinforcement learning: 1. the server frequency allocated to the edge gateway is fixed, and then the task unloading sequence and the unloading decision which can reach the minimum completion time are solved. 2. And solving the optimal server distribution frequency corresponding to each unloading task in the unloading sequence under the condition that the unloading sequence obtained in the last step is fixed and unchanged. And repeating the two steps of iteration to finally obtain the optimal server distribution frequency and the optimal scheduling sequence. The result of obtaining the optimal server distribution frequency is faster, and the method can be suitable for more complex systems.
It should be understood that equivalents and modifications of the technical solution and inventive concept thereof may occur to those skilled in the art, and all such modifications and alterations should fall within the scope of the appended claims.

Claims (10)

1. An edge computing resource scheduling method based on a double-delay deep deterministic strategy is characterized in that an edge computing system comprises an edge server and a plurality of edge gateways in communication connection with the edge server, an edge distribution network and an edge distribution target network are built in the edge server, and the method comprises the following steps:
the edge server acquires independent task information sets of all the edge gateways;
based on the independent task information set, the edge distribution network respectively outputs corresponding optimal server distribution frequency and optimal scheduling sequence for all the edge gateways by using a double-delay depth deterministic strategy gradient algorithm;
sending the optimal server distribution frequency and the optimal scheduling sequence to the edge gateway to execute scheduling;
the edge distribution target network carries out real-time training based on the acquired independent task information set; and updating the network parameters of the edge distribution network in sections according to the target network parameters of the edge distribution target network.
2. The method for scheduling resources based on the edge computing of the double-delay depth deterministic strategy according to claim 1, wherein the independent task information set comprises a plurality of independent task information, and the independent task information at least comprises data volume and CPU cycle volume required for processing the task; the required CPU cycle amount comprises a cycle amount required by a server and a cycle amount required by an edge gateway;
the edge distribution network uses a double-delay depth deterministic policy gradient algorithm to respectively output the corresponding optimal server distribution frequency and optimal scheduling sequence for all the edge gateways, and the execution steps are the same as the real-time training steps of the edge distribution target network, and specifically include:
s31, based on the independent task information set, solving the pre-distribution frequency distributed by the edge server for each edge gateway through pre-classification;
s32, the edge distribution target network classifies all the independent tasks based on the period quantity required by the server and the period quantity required by the edge gateway of each independent task, and stores the independent tasks into an unloading task set and a local task set respectively;
s33, the edge distribution target network solves the server allocation frequency for the independent tasks in the unloading task set by using the double-delay deep deterministic policy gradient algorithm;
and S34, performing one iteration every time steps S32-S33 are performed, outputting the optimal server distribution frequency after a preset number of iterations, and determining the target network parameters of the edge distribution target network.
3. The method for scheduling resource based on edge computing of double-delay deep deterministic policy according to claim 2, wherein the output criteria of the optimal server allocation frequency are: when the iterative computation finally has a convergence result, outputting the server distribution frequency of the iterative convergence as the optimal server distribution frequency; otherwise, outputting the pre-distribution frequency as the optimal server distribution frequency.
4. The method for scheduling edge computing resources based on the dual-delay deep deterministic policy according to claim 2, wherein the network parameters of the edge distribution network are updated in segments according to the target network parameters of the edge distribution target network, specifically: when the edge distribution target network is trained in real time, in step S34, every iteration of the set number of times, the current target network parameter is divided into update sections according to a predetermined step length based on the network parameter before training to obtain update parameters, and the update parameters are used as the network parameters of the edge distribution network for updating.
5. The method according to claim 4, wherein the set number of times is 20-80.
6. The dual-delay deep deterministic policy based edge computing resource scheduling method as claimed in claim 2, wherein in step S31, the pre-allocated frequency is obtained by the following specific steps:
S311, for each edge gateway, calculating the dominant-frequency ratio of that gateway's device CPU frequency to the sum of the device CPU frequencies of all edge gateways, and allocating CPU frequency on the edge server to the gateway according to the dominant-frequency ratio;
S312, calculating the local execution delay of each independent task from the independent task information, and calculating, for each edge gateway, the relative-delay ratio of its local execution delay to the sum of the local execution delays of all edge gateways;
S313, calculating the allocation weight of each edge gateway from its dominant-frequency ratio and relative-delay ratio;
and S314, calculating the pre-allocated frequency of each edge gateway on the edge server from its allocation weight and the server CPU frequency.
7. The dual-delay deep deterministic policy based edge computing resource scheduling method according to claim 2, wherein step S32 specifically comprises:
S321, classifying the independent task information of all edge gateways by offloading time and server execution time: adding task information whose offloading time is less than its server execution time to a first array, and sorting the first array in ascending order of offloading time; adding task information whose offloading time is greater than or equal to its server execution time to a second array, and sorting the second array in descending order of server execution time;
S322, obtaining the server execution time and offloading time of each piece of task information in the first array to obtain its server processing time, and obtaining the local execution time of each piece of task information in the second array;
S323, obtaining the time difference between the total server processing time of all task information in the first array and the total local execution time of all task information in the second array;
S324, according to the time difference, taking all task information in the array with the longer total time to form a third array, the processed first array serving as the offloading task set and the processed second array as the local task set;
S325, for each piece of task information in the third array, calculating its server processing time and local execution time, placing task information whose server processing time is greater than its local execution time into the local task pre-allocation set, and placing task information whose server processing time is less than or equal to its local execution time into the offloading task pre-allocation set;
and S326, after all task information in the third array has been assigned, obtaining the final offloading task set and local task set, and deriving the offloading decision vector from the final offloading task set.
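The S321-S326 classification above can be sketched as follows. The task field names (`offload_t`, `server_t`, `local_t`) are hypothetical, server processing time is taken as offloading time plus server execution time per S322, and the reading that the *entire* longer-total-time array becomes the third array (S324) is an interpretation of the claim text, not a statement of the patented implementation.

```python
def classify(tasks):
    """Split tasks into an offloading set and a local set per S321-S326.
    tasks: list of dicts with keys offload_t, server_t, local_t."""
    # S321: provisional split, then sort each array as claimed.
    first = sorted([t for t in tasks if t["offload_t"] < t["server_t"]],
                   key=lambda t: t["offload_t"])
    second = sorted([t for t in tasks if t["offload_t"] >= t["server_t"]],
                    key=lambda t: t["server_t"], reverse=True)
    # S322-S323: total server processing time vs total local execution time.
    server_total = sum(t["offload_t"] + t["server_t"] for t in first)
    local_total = sum(t["local_t"] for t in second)
    # S324: the longer side becomes the third array to be re-examined.
    if server_total > local_total:
        third, first = first, []
    else:
        third, second = second, []
    offload, local = list(first), list(second)
    # S325: reassign each third-array task by comparing its two costs.
    for t in third:
        server_proc = t["offload_t"] + t["server_t"]
        (local if server_proc > t["local_t"] else offload).append(t)
    # S326: offloading decision vector (1 = offload, 0 = execute locally).
    decision = [1 if t in offload else 0 for t in tasks]
    return offload, local, decision
```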
8. The method as claimed in claim 2, wherein in step S34, the preset number of iterations is 100 to 200.
9. The dual-delay deep deterministic policy based edge computing resource scheduling method of claim 1, wherein the edge distribution network consists of a value network and an action network; the edge distribution target network is composed of a value target network and an action target network.
10. An edge computing system comprising an edge server and a plurality of edge gateways communicatively coupled to the edge server, wherein the edge server and the plurality of edge gateways operate using the dual-delay deep deterministic policy based edge computing resource scheduling method of any one of claims 1-9.
CN202011560881.4A 2020-12-25 2020-12-25 Edge computing resource scheduling method and system based on double-delay depth certainty strategy Active CN112788605B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011560881.4A CN112788605B (en) 2020-12-25 2020-12-25 Edge computing resource scheduling method and system based on double-delay depth certainty strategy


Publications (2)

Publication Number Publication Date
CN112788605A CN112788605A (en) 2021-05-11
CN112788605B true CN112788605B (en) 2022-07-26

Family

ID=75752423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011560881.4A Active CN112788605B (en) 2020-12-25 2020-12-25 Edge computing resource scheduling method and system based on double-delay depth certainty strategy

Country Status (1)

Country Link
CN (1) CN112788605B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326126B (en) * 2021-05-28 2024-04-05 湘潭大学 Task processing method, task scheduling method, device and computer equipment
CN113747554B (en) * 2021-08-11 2022-08-19 中标慧安信息技术股份有限公司 Method and device for task scheduling and resource allocation of edge computing network
CN114444240B (en) * 2022-01-28 2022-09-09 暨南大学 Delay and service life optimization method for cyber-physical system
CN115190033B (en) * 2022-05-22 2024-02-20 重庆科技学院 Cloud edge fusion network task unloading method based on reinforcement learning
CN115421930B (en) * 2022-11-07 2023-03-24 山东海量信息技术研究院 Task processing method, system, device, equipment and computer readable storage medium
CN117539647B (en) * 2024-01-09 2024-04-12 四川华鲲振宇智能科技有限责任公司 Task scheduling planning method and system based on edge computing gateway node attribute

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107995660A (en) * 2017-12-18 2018-05-04 重庆邮电大学 Support Joint Task scheduling and the resource allocation methods of D2D- Edge Servers unloading
CN108920280A (en) * 2018-07-13 2018-11-30 哈尔滨工业大学 A kind of mobile edge calculations task discharging method under single user scene
CN109240818A (en) * 2018-09-04 2019-01-18 中南大学 Task discharging method based on user experience in a kind of edge calculations network
CN109302709A (en) * 2018-09-14 2019-02-01 重庆邮电大学 The unloading of car networking task and resource allocation policy towards mobile edge calculations
CN109614817A (en) * 2018-11-20 2019-04-12 南京邮电大学 Distributed cryptograph index slice search method under a kind of cloud environment
CN109767117A (en) * 2019-01-11 2019-05-17 中南林业科技大学 The power distribution method of Joint Task scheduling in mobile edge calculations
CN109951821A (en) * 2019-02-26 2019-06-28 重庆邮电大学 Minimum energy consumption of vehicles task based on mobile edge calculations unloads scheme
CN110543336A (en) * 2019-08-30 2019-12-06 北京邮电大学 Edge calculation task unloading method and device based on non-orthogonal multiple access technology
CN111556089A (en) * 2020-03-16 2020-08-18 西安电子科技大学 Resource joint optimization method based on enabling block chain mobile edge computing system
CN112039950A (en) * 2020-08-03 2020-12-04 威胜信息技术股份有限公司 Edge computing network task scheduling and resource allocation method and edge computing system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11249978B2 (en) * 2018-11-29 2022-02-15 Kyndryl, Inc. Multiple parameter based composite rule data validation


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Improved edge detection algorithm based on the Canny operator; Wang Wenhao et al.; China Sciencepaper; 2017-04-23 (Issue 08); full text *

Also Published As

Publication number Publication date
CN112788605A (en) 2021-05-11

Similar Documents

Publication Publication Date Title
CN112788605B (en) Edge computing resource scheduling method and system based on double-delay depth certainty strategy
CN112039950B (en) Edge computing network task scheduling and resource allocation method and edge computing system
CN112367353B (en) Mobile edge computing unloading method based on multi-agent reinforcement learning
CN111800828B (en) Mobile edge computing resource allocation method for ultra-dense network
CN112512056B (en) Multi-objective optimization calculation unloading method in mobile edge calculation network
CN111901862B (en) User clustering and power distribution method, device and medium based on deep Q network
CN108920280B (en) Mobile edge computing task unloading method under single-user scene
CN113543176B (en) Unloading decision method of mobile edge computing system based on intelligent reflecting surface assistance
CN113296845B (en) Multi-cell task unloading algorithm based on deep reinforcement learning in edge computing environment
CN111628855B (en) Industrial 5G dynamic multi-priority multi-access method based on deep reinforcement learning
CN112118287B (en) Network resource optimization scheduling decision method based on alternative direction multiplier algorithm and mobile edge calculation
CN114662661B (en) Method for accelerating multi-outlet DNN reasoning of heterogeneous processor under edge computing
CN110968426B (en) Edge cloud collaborative k-means clustering model optimization method based on online learning
CN114567895A (en) Method for realizing intelligent cooperation strategy of MEC server cluster
CN114585006B (en) Edge computing task unloading and resource allocation method based on deep learning
CN114650228B (en) Federal learning scheduling method based on calculation unloading in heterogeneous network
CN112287990A (en) Model optimization method of edge cloud collaborative support vector machine based on online learning
CN113590279A (en) Task scheduling and resource allocation method for multi-core edge computing server
CN113573363A (en) MEC calculation unloading and resource allocation method based on deep reinforcement learning
CN116321293A (en) Edge computing unloading and resource allocation method based on multi-agent reinforcement learning
CN114936708A (en) Fault diagnosis optimization method based on edge cloud collaborative task unloading and electronic equipment
CN117459112A (en) Mobile edge caching method and equipment in LEO satellite network based on graph rolling network
CN114449536B (en) 5G ultra-dense network multi-user access selection method based on deep reinforcement learning
CN113157344B (en) DRL-based energy consumption perception task unloading method in mobile edge computing environment
CN110768827A (en) Task unloading method based on group intelligent algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant