CN114698125A - Method, device and system for optimizing computation offload of mobile edge computing network - Google Patents

Method, device and system for optimizing computation offload of mobile edge computing network Download PDF

Info

Publication number
CN114698125A
CN114698125A
Authority
CN
China
Prior art keywords
model
mobile
reward
determining
edge computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210619336.0A
Other languages
Chinese (zh)
Inventor
魏楚元
何航
任涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Civil Engineering and Architecture
Original Assignee
Beijing University of Civil Engineering and Architecture
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Civil Engineering and Architecture filed Critical Beijing University of Civil Engineering and Architecture
Priority to CN202210619336.0A priority Critical patent/CN114698125A/en
Publication of CN114698125A publication Critical patent/CN114698125A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 72/00 Local resource management
    • H04W 72/50 Allocation or scheduling criteria for wireless resources
    • H04W 72/53 Allocation or scheduling criteria for wireless resources based on regulatory allocation policies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/445 Program loading or initiating
    • G06F 9/44594 Unloading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W 4/30 Services specially adapted for particular environments, situations or purposes
    • H04W 4/40 Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides a computation offloading optimization method, apparatus and system for a mobile edge computing network. Based on a distributed execution-centralized training framework built on deep reinforcement learning, the method reduces the computational time complexity of solving the original target optimization problem and avoids the curse of dimensionality that traditional numerical optimization algorithms may face in large-scale heterogeneous mobile edge computing networks. By defining a loss function and an advantage function and applying a multi-agent reinforcement learning algorithm, the data sampling efficiency and model training speed are improved, the average system cost in the network is reduced, and the quality of service of computation-intensive applications is improved.

Description

Method, device and system for optimizing computation offload of mobile edge computing network
Technical Field
The invention relates to the technical field of edge computing, and in particular to a computation offloading optimization method, apparatus and system for a mobile edge computing network.
Background
With the explosion of computation-intensive mobile applications such as online gaming, autonomous driving and virtual reality, it is increasingly imperative that mobile devices provide low-latency services for these applications. However, mobile devices typically have very limited computational resources and energy reserves, which poses significant challenges to meeting the latency and computation requirements of such applications. Thanks to the advent of 5G technology, mobile edge computing is considered a promising technology to address these challenges by offloading computation-intensive, delay-sensitive tasks to nearby edge nodes. However, in a conventional MEC (Mobile Edge Computing) system, edge servers are usually deployed on ground base stations at fixed locations, which is costly to deploy and lacks flexibility, and is ill-suited to scenarios with dynamically changing demands such as event relay, traffic management and emergency rescue. Therefore, heterogeneous mobile edge computing networks assisted by ground vehicles and unmanned aerial vehicles are receiving increasing attention from both academia and industry.
Owing to the high mobility of ground vehicles and unmanned aerial vehicles and their ease of deployment, a heterogeneous mobile edge computing network can adapt to a rapidly changing network environment and provide services on demand for hot spots or emergency rescue activities. However, the requirements of high mobility and dynamic change also bring difficult problems to the heterogeneous mobile edge computing network, such as real-time decision making, large-scale user association, and resource allocation under strict scheduling constraints.
Among existing research and inventions, some methods are based on traditional numerical optimization, for example, convex optimization and heuristic search algorithms for solving task offloading and resource allocation in multi-server mobile edge computing networks, or coordinate-descent-based maximization of the computation rate of wirelessly powered mobile edge computing networks; others are based on deep learning, such as online incremental learning with deep neural networks for solving the computation offloading and resource management problems of dynamic heterogeneous mobile edge computing networks.
Although approximate solutions can be obtained with traditional numerical optimization methods, a large number of iterations is usually required to reach a reasonably good local optimum, the computational complexity of solving the problem is high, and such methods are not suitable for dynamically changing environments. Most deep learning-based methods suffer from low data sampling efficiency and slow model convergence.
Disclosure of Invention
In order to solve the above problems, an embodiment of the present invention provides a computation offloading optimization method for a mobile edge computing network, where the mobile edge computing network includes ground vehicles and unmanned aerial vehicles, and the method includes: constructing a system model of the mobile edge computing network, and determining an optimization objective function of the model based on average system cost minimization; converting the optimization objective function based on average system cost minimization into an optimization objective function based on average reward maximization according to the state, action and reward elements of a Markov decision model; determining a distributed execution and centralized training framework of multi-agent deep reinforcement learning, and determining the loss function and advantage function for training; and performing training of the system model according to a multi-agent reinforcement learning algorithm.
Optionally, constructing the system model of the mobile edge computing network includes: establishing a network model comprising a plurality of ground vehicles, unmanned aerial vehicles and mobile devices; establishing a communication model according to the network model, where the communication model includes a mobile device-ground vehicle channel model and a mobile device-unmanned aerial vehicle channel model; and establishing a computation model according to the communication model, where the computation model includes the computation of the local computing cost, the ground vehicle edge computing cost and the unmanned aerial vehicle edge computing cost.
Optionally, determining the optimization objective function of the model based on average system cost minimization includes: determining the average system cost of all mobile devices over a plurality of time slices according to the local computing cost, the ground vehicle edge computing cost and the unmanned aerial vehicle edge computing cost; and jointly optimizing the offloading decision variables of the mobile devices so as to minimize the average system cost, thereby obtaining the optimization objective function.
Optionally, converting the optimization objective function based on average system cost minimization into the optimization objective function based on average reward maximization according to the state, action and reward elements of the Markov decision model includes: determining the trajectory of each mobile device over a plurality of time slices according to the state, action and reward elements of the Markov decision model, and calculating the probability of the trajectory occurring and the total reward, where the state includes the task information, channel state and battery information of the mobile device, and the action includes the offloading indicator, transmit power and allocated computing capacity of the mobile device; and calculating the average reward according to the probability of the trajectory occurring and the total reward, and determining the optimization objective function based on average reward maximization.
Optionally, determining the distributed execution and centralized training framework of multi-agent deep reinforcement learning, and determining the loss function and advantage function for training, includes: constructing a distributed execution and centralized training framework of multi-agent deep reinforcement learning based on the Actor-Critic algorithm; and determining the advantage function using generalized advantage estimation in place of the total reward, and determining the loss function using an off-policy formulation in place of the on-policy one.
Optionally, performing the training of the system model according to the multi-agent reinforcement learning algorithm includes: each mobile device interacting with the mobile edge computing network based on its observed local state to generate batches of learning experience; training a shared policy based on the batch learning experience according to generalized advantage estimation and importance sampling; and each mobile device using the shared policy to interact with the mobile edge computing network.
An embodiment of the present invention provides a computation offloading optimization apparatus for a mobile edge computing network, where the mobile edge computing network includes ground vehicles and unmanned aerial vehicles, and the apparatus includes: a model construction module for constructing a system model of the mobile edge computing network and determining an optimization objective function of the model based on average system cost minimization; a Markov decision conversion module for converting the optimization objective function based on average system cost minimization into an optimization objective function based on average reward maximization according to the state, action and reward elements of a Markov decision model; a determining module for determining a distributed execution and centralized training framework of multi-agent deep reinforcement learning and determining the loss function and advantage function for training; and a training module for performing training of the system model according to a multi-agent reinforcement learning algorithm.
An embodiment of the present invention provides a computation offloading optimization system for a mobile edge computing network, which is configured to perform the above computation offloading optimization method for a mobile edge computing network.
The embodiment of the present invention, based on a distributed execution-centralized training framework of deep reinforcement learning, reduces the computational time complexity of solving the original target optimization problem and avoids the curse of dimensionality that traditional numerical optimization algorithms may face in large-scale heterogeneous mobile edge computing networks; by defining a loss function, an advantage function and a multi-agent reinforcement learning algorithm, the data sampling efficiency and model training speed are improved, the average system cost in the network is reduced, and the quality of service of computation-intensive applications is improved.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a schematic flowchart of a computation offload optimization method for a mobile edge computing network according to an embodiment of the present invention;
FIG. 2 is a system model diagram of a heterogeneous mobile edge computing network according to an embodiment of the present invention;
FIG. 3 is a diagram of a distributed execution-centralized training framework provided by an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a computation offload optimization apparatus of a mobile edge computing network according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to solve the task offloading and resource allocation problems of ground vehicle and unmanned aerial vehicle assisted heterogeneous mobile edge computing networks, the embodiment of the present invention provides a computation offloading optimization algorithm based on deep reinforcement learning for a multi-user, multi-edge-node scenario.
Referring to fig. 1, a flow diagram of a computation offload optimization method for a mobile edge computing network is shown, where the mobile edge computing network includes a ground vehicle and an unmanned aerial vehicle, and the method includes the following steps:
S102, constructing a system model of the mobile edge computing network, and determining an optimization objective function of the model based on average system cost minimization.
The mobile edge computing network comprises a plurality of ground vehicles, unmanned aerial vehicles and mobile devices.
Illustratively, the system model of the mobile edge computing network may be constructed as follows: first, a network model comprising a plurality of ground vehicles, unmanned aerial vehicles and mobile devices is established; second, a communication model is established according to the network model, where the communication model may include a mobile device-ground vehicle channel model and a mobile device-unmanned aerial vehicle channel model, each channel model comprising the channel gain and the offloading transmission rate; then, a computation model is established according to the communication model, where the computation model may include the computation of the local computing cost, the ground vehicle edge computing cost and the unmanned aerial vehicle edge computing cost, each computing cost comprising the delay and energy consumption of executing the task.
Illustratively, the optimization objective function of the above model based on average system cost minimization may be determined as follows: first, the average system cost of all mobile devices over a plurality of time slices is determined according to the local computing cost, the ground vehicle edge computing cost and the unmanned aerial vehicle edge computing cost; second, the offloading decision variables of the mobile devices are jointly optimized so as to minimize the average system cost, thereby obtaining the optimization objective function. The offloading decision variables include: the variable deciding where the task is executed (locally, on a ground vehicle, or on an unmanned aerial vehicle), the transmit power, and the computing resources of the mobile device, the ground vehicle and the unmanned aerial vehicle.
S104, converting the optimization objective function based on average system cost minimization into an optimization objective function based on average reward maximization according to the state, action and reward elements of a Markov decision model.
For the optimization problem of the above step, three elements based on the Markov decision model, namely state, action and reward, are defined, and the problem is converted into an optimization objective function based on average reward maximization.
Due to the high coupling between the offloading decision variables and several limiting factors in the system, the optimization problem is NP-hard (non-deterministic polynomial-time hard), so traditional numerical optimization methods usually face high computational time complexity and the curse of dimensionality.
To avoid these problems, the embodiment of the present invention defines the three elements of the Markov decision model as follows: state, action, reward. Specifically, the state of each mobile device includes its task information, channel state and battery information; the action of each mobile device includes the offloading indicator, transmit power and allocated computing capacity; and the average system cost is minimized by optimizing the offloading decision and resource allocation.
Based on this, the above step may include: determining the trajectory of each mobile device over a plurality of time slices according to the state, action and reward elements of the Markov decision model, and calculating the probability of the trajectory occurring and the total reward; then, calculating the average reward according to the probability of the trajectory occurring and the total reward, and determining an optimization objective function based on average reward maximization, as illustrated by the sketch below.
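As a concrete illustration of these three elements, the following Python sketch shows one possible encoding of a device's state and action and of the reward as the negative weighted cost; the field names, units and the integer encoding of the offloading target are illustrative assumptions, not notation taken from the patent.

```python
from dataclasses import dataclass

# Illustrative per-device MDP elements (names and units are assumptions).
@dataclass
class DeviceState:
    data_bits: float        # task input data size
    cycles_per_bit: float   # CPU cycles required per bit
    max_delay_s: float      # maximum tolerable delay
    battery_j: float        # remaining battery energy
    gain_vehicle: float     # channel gain to the candidate ground vehicle
    gain_uav: float         # channel gain to the candidate UAV

@dataclass
class DeviceAction:
    offload_target: int     # 0 = local, 1 = ground vehicle, 2 = UAV
    tx_power_w: float       # transmit power
    cpu_hz: float           # computing capacity allocated to the task

def reward(weighted_cost: float) -> float:
    """Negative weighted system cost: maximizing the average reward is then
    equivalent to minimizing the average system cost."""
    return -weighted_cost
```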
S106, determining a distributed execution and centralized training framework of multi-agent deep reinforcement learning, and determining the loss function and advantage function for training.
In the embodiment of the present invention, a distributed execution and centralized training framework based on deep reinforcement learning is designed, and the loss function and advantage function for training are determined. For the optimization problem based on maximizing the average reward, deep reinforcement learning is adopted to train the network model. In view of the needs of large-scale user association and real-time decision making, the embodiment of the present invention may adopt an Actor-Critic based algorithm to build a distributed execution and centralized training framework for computation offloading and resource allocation scheduling.
The embodiment of the present invention uses a deep reinforcement learning algorithm suitable for multiple agents, and greatly reduces the computational complexity of problem solving through distributed execution and centralized training.
Optionally, a distributed execution and centralized training framework of multi-agent deep reinforcement learning may be built based on the Actor-Critic algorithm; then, the advantage function is determined using generalized advantage estimation in place of the total reward, and the loss function is determined using an off-policy formulation in place of the on-policy one. The Actor is responsible for generating actions and interacting with the environment according to its policy function; the Critic is responsible for evaluating the Actor's performance and guiding the Actor's actions in the next stage.
S108, performing training of the system model according to a multi-agent reinforcement learning algorithm.
Based on the above steps, the embodiment of the present invention may be implemented with a reinforcement learning method suitable for multiple agents, such as Shared Multi-Agent Proximal Policy Optimization (SMAPPO), the Multi-Agent Deep Deterministic Policy Gradient algorithm (MADDPG), the QMIX algorithm, and the like.
Illustratively, the embodiment of the present invention employs SMAPPO based on a centralized training and distributed execution framework, and the whole framework can be divided into three parts: distributed execution, data collection, and centralized training.
Based on the centralized training and distributed execution framework described above, the training process may be performed as follows: each mobile device interacts with the mobile edge computing network based on its observed local state to generate batches of learning experience; a shared policy is trained on the batch learning experience according to generalized advantage estimation and importance sampling; and each mobile device uses the shared policy to interact with the mobile edge computing network. The use of a proximal policy optimization algorithm that introduces an advantage function and importance sampling further improves the utilization of experience and the convergence speed of the model.
The computation offloading optimization method for a mobile edge computing network provided by the embodiment of the present invention, based on a distributed execution-centralized training framework of deep reinforcement learning, reduces the computational time complexity of solving the original target optimization problem and avoids the curse of dimensionality that traditional numerical optimization algorithms may face in large-scale heterogeneous mobile edge computing networks; by defining a loss function, an advantage function and a multi-agent reinforcement learning algorithm, the data sampling efficiency and model training speed are improved, the average system cost in the network is reduced, and the quality of service of computation-intensive applications is improved.
Furthermore, the method can solve the computation offloading and resource allocation problems in heterogeneous mobile edge computing networks assisted by ground vehicles and unmanned aerial vehicles. Through the deep reinforcement learning based distributed execution-centralized training framework of steps S104 and S106, the computational time complexity of solving the original target optimization problem is reduced, and the curse of dimensionality that traditional numerical optimization algorithms may face in large-scale heterogeneous mobile edge computing networks is avoided; by means of the loss function and advantage function defined in step S106 and the importance sampling introduced in step S108, the data sampling efficiency and model training speed are improved, the average system cost in the network is greatly reduced, and the quality of service of computation-intensive applications is improved.
Exemplary processes of the above steps are described in detail below.
Step 1: construct a system model of the ground vehicle and unmanned aerial vehicle assisted heterogeneous mobile edge computing network and give an optimization objective function based on average system cost minimization.
The system model of the ground vehicle and unmanned aerial vehicle assisted heterogeneous mobile edge computing network is constructed through the following steps:
1. establishing a network model
Referring to fig. 2, a system model diagram of the heterogeneous mobile edge computing network is shown. The ground vehicle and unmanned aerial vehicle assisted mobile edge computing network contains M mobile devices, V ground vehicles and U unmanned aerial vehicles. The ground vehicles and the drones are represented by the sets $\mathcal{V} = \{1, 2, \ldots, V\}$ and $\mathcal{U} = \{1, 2, \ldots, U\}$, respectively. The mobile devices are randomly distributed over the ground and are represented by the set $\mathcal{M} = \{1, 2, \ldots, M\}$. The overall system time is divided equally into N time slices, represented by the set $\mathcal{N} = \{1, 2, \ldots, N\}$. Mobile device i randomly generates a task in time slice n, expressed as
$\Lambda_i^n = \left(d_i^n,\, c_i^n,\, \tau_i^n\right)$,
where $d_i^n$ represents the input data size, $c_i^n$ represents the number of clock cycles required to process one bit of the task, and $\tau_i^n$ represents the maximum tolerable delay for completing task $\Lambda_i^n$.
In this embodiment, a full offloading strategy is adopted, i.e. a generated task is either executed locally on the mobile device or offloaded in its entirety to an edge node (i.e. a ground vehicle or a drone) for remote execution. The offloading decision of mobile device i in time slice n is denoted by the variable $x_i^n$, defined as follows: $x_i^n = 0$ represents local computation; $x_i^n = j \in \mathcal{V}$ represents ground vehicle edge computation; and $x_i^n = k \in \mathcal{U}$ represents unmanned aerial vehicle edge computation.
2. Establishing a communication model
1) Mobile device-ground vehicle channel model
The channel gain between mobile device i and ground vehicle j in time slice n, denoted $h_{i,j}^n$, follows a distance-dependent path-loss model in which the distance between mobile device i and ground vehicle j in time slice n is
$d_{i,j}^n = \left\lVert \mathbf{u}_i^n - \mathbf{v}_j^n \right\rVert$,
where $\mathbf{u}_i^n$ denotes the position (coordinates) of mobile device i in time slice n and $\mathbf{v}_j^n$ denotes the position (coordinates) of ground vehicle j in time slice n.
According to the Shannon formula, the offloading transmission rate between mobile device i and ground vehicle j in time slice n can be expressed as
$r_{i,j}^n = W_j \log_2\!\left(1 + \frac{p_i^n h_{i,j}^n}{\sigma^2}\right)$,
where $p_i^n$ represents the transmit power of mobile device i in time slice n, $W_j$ represents the channel bandwidth between mobile device i and ground vehicle j, and $\sigma^2$ represents the noise power of the channel.
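A minimal Python sketch of this rate computation follows; the numeric values in the example are illustrative assumptions, not parameters from the patent.

```python
import math

def offload_rate_bps(bandwidth_hz: float, tx_power_w: float,
                     channel_gain: float, noise_power_w: float) -> float:
    """Shannon-capacity offloading rate r = W * log2(1 + p * h / sigma^2)."""
    return bandwidth_hz * math.log2(1.0 + tx_power_w * channel_gain / noise_power_w)

# Example: 1 MHz bandwidth, 0.1 W transmit power, channel gain 1e-6,
# noise power 1e-10 W, giving a rate on the order of 10 Mbit/s.
rate = offload_rate_bps(1e6, 0.1, 1e-6, 1e-10)
```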
2) Mobile device-unmanned aerial vehicle channel model
The channel gain between mobile device i and drone k in time slice n, denoted $h_{i,k}^n$, follows an air-to-ground path-loss model in which $\zeta_{\mathrm{LoS}}$ and $\zeta_{\mathrm{NLoS}}$ represent the excess losses of the line-of-sight and non-line-of-sight links, respectively, and the distance between mobile device i and drone k in time slice n is calculated as
$d_{i,k}^n = \left\lVert \mathbf{u}_i^n - \mathbf{w}_k^n \right\rVert$,
where $\mathbf{w}_k^n$ denotes the position of drone k in time slice n.
According to the Shannon formula, the offloading transmission rate between mobile device i and drone k in time slice n can be expressed as
$r_{i,k}^n = W_k \log_2\!\left(1 + \frac{p_i^n h_{i,k}^n}{\sigma^2}\right)$,
where $W_k$ represents the channel bandwidth between mobile device i and drone k.
3. Building a computation model
1) Local computation: when $x_i^n = 0$, the task is executed locally. The delay of executing the task locally is expressed as
$T_i^{l,n} = \frac{d_i^n c_i^n}{f_i^{l,n}}$,
where $f_i^{l,n}$ is the local computing resource of mobile device i in time slice n. The delay should satisfy the condition $T_i^{l,n} \le \tau_i^n$.
Accordingly, the energy consumption of local computation may be expressed as
$E_i^{l,n} = \kappa \left(f_i^{l,n}\right)^{\zeta - 1} d_i^n c_i^n$,
where $\kappa$ is the effective switched capacitance depending on the chip architecture and $\zeta$ represents the energy consumption exponent, which is empirically taken as $\zeta = 3$. In summary, the weighted cost of local computation can be expressed as
$C_i^{l,n} = w_T^{l}\, T_i^{l,n} + w_E^{l}\, E_i^{l,n}$,
where $w_T^{l}$ and $w_E^{l}$ represent the delay weight and energy consumption weight of local computation, respectively, with $w_T^{l}, w_E^{l} \in [0, 1]$ and $w_T^{l} + w_E^{l} = 1$.
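The local delay, energy and weighted cost can be computed as in the following sketch; the capacitance value κ and the weights used in the example are illustrative assumptions.

```python
def local_cost(data_bits: float, cycles_per_bit: float, cpu_hz: float,
               kappa: float = 1e-27, zeta: int = 3,
               w_delay: float = 0.5, w_energy: float = 0.5) -> float:
    """Weighted local-execution cost with delay T = d*c/f and
    energy E = kappa * f**(zeta - 1) * d * c (zeta = 3 by convention)."""
    cycles = data_bits * cycles_per_bit
    delay = cycles / cpu_hz
    energy = kappa * cpu_hz ** (zeta - 1) * cycles
    return w_delay * delay + w_energy * energy

# Example: a 1 Mbit task at 1000 cycles/bit on a 1 GHz CPU.
cost = local_cost(1e6, 1e3, 1e9)
```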
2) Ground vehicle edge computation: when $x_i^n = j \in \mathcal{V}$, the task is offloaded to ground vehicle j for execution. The transmission delay of offloading the task to ground vehicle j may be expressed as
$T_{i,j}^{t,n} = \frac{d_i^n}{r_{i,j}^n}$,
and the corresponding transmission energy consumption can be expressed as
$E_{i,j}^{t,n} = p_i^n\, T_{i,j}^{t,n}$.
The computation delay of the task on ground vehicle j can be expressed as
$T_{i,j}^{v,n} = \frac{d_i^n c_i^n}{f_{i,j}^{v,n}}$,
and the corresponding computation energy consumption can be expressed as
$E_{i,j}^{v,n} = P_j\, T_{i,j}^{v,n}$,
where $f_{i,j}^{v,n}$ is the computing resource allocated by ground vehicle j to mobile device i in time slice n and $P_j$ indicates the operating power of ground vehicle j. In summary, the weighted cost of ground vehicle edge computation can be expressed as
$C_{i,j}^{v,n} = w_T^{v}\left(T_{i,j}^{t,n} + T_{i,j}^{v,n}\right) + w_E^{v}\left(E_{i,j}^{t,n} + E_{i,j}^{v,n}\right)$,
where $w_T^{v}$ and $w_E^{v}$ represent the delay weight and energy consumption weight of ground vehicle edge computation, respectively, with $w_T^{v}, w_E^{v} \in [0, 1]$ and $w_T^{v} + w_E^{v} = 1$.
3) Unmanned aerial vehicle edge computation: when $x_i^n = k \in \mathcal{U}$, the task is offloaded to drone k for execution. The transmission delay of offloading the task to drone k can be expressed as
$T_{i,k}^{t,n} = \frac{d_i^n}{r_{i,k}^n}$,
and the corresponding transmission energy consumption can be expressed as
$E_{i,k}^{t,n} = p_i^n\, T_{i,k}^{t,n}$.
The computation delay of the task on drone k can be expressed as
$T_{i,k}^{u,n} = \frac{d_i^n c_i^n}{f_{i,k}^{u,n}}$,
and the corresponding computation energy consumption can be expressed as
$E_{i,k}^{u,n} = P_k\, T_{i,k}^{u,n}$,
where $f_{i,k}^{u,n}$ is the computing resource allocated by drone k to mobile device i in time slice n and $P_k$ indicates the operating power of drone k. To sum up, the weighted cost of unmanned aerial vehicle edge computation may be expressed as
$C_{i,k}^{u,n} = w_T^{u}\left(T_{i,k}^{t,n} + T_{i,k}^{u,n}\right) + w_E^{u}\left(E_{i,k}^{t,n} + E_{i,k}^{u,n}\right)$,
where $w_T^{u}$ and $w_E^{u}$ represent the delay weight and energy consumption weight of unmanned aerial vehicle edge computation, respectively, with $w_T^{u}, w_E^{u} \in [0, 1]$ and $w_T^{u} + w_E^{u} = 1$.
From the above, the system cost of mobile device i in time slice n may be expressed as
$C_i^n = \mathbf{1}\{x_i^n = 0\}\, C_i^{l,n} + \mathbf{1}\{x_i^n = j\}\, C_{i,j}^{v,n} + \mathbf{1}\{x_i^n = k\}\, C_{i,k}^{u,n}$,
where $\mathbf{1}\{\cdot\}$ is the indicator function. Thus, the average system cost of all mobile devices over the N time slices can be expressed as
$\bar{C} = \frac{1}{M N}\sum_{n=1}^{N}\sum_{i=1}^{M} C_i^n$.
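Read as code, the per-slot system cost selects exactly one of the three weighted costs according to the offloading decision, and the average system cost averages over devices and time slices; a minimal sketch (the 0/1/2 encoding of the decision is an assumption) is:

```python
def system_cost(offload_target: int, cost_local: float,
                cost_vehicle: float, cost_uav: float) -> float:
    """Per-slot system cost: the weighted cost of the chosen execution location
    (0 = local, 1 = ground vehicle, 2 = UAV)."""
    return (cost_local, cost_vehicle, cost_uav)[offload_target]

def average_system_cost(costs) -> float:
    """Average over all devices and time slices; costs[i][n] is the system cost
    of mobile device i in time slice n."""
    total = sum(sum(row) for row in costs)
    slots = sum(len(row) for row in costs)
    return total / slots
```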
Based on the established system model, the offloading decision variables of all mobile devices, $\left\{x_i^n,\, p_i^n,\, f_i^{l,n},\, f_{i,j}^{v,n},\, f_{i,k}^{u,n}\right\}$, are jointly optimized to minimize the average system cost of the ground vehicle and drone assisted mobile edge computing network. The optimization objective function P is therefore
$P:\ \min_{\left\{x_i^n,\, p_i^n,\, f_i^{l,n},\, f_{i,j}^{v,n},\, f_{i,k}^{u,n}\right\}} \bar{C} \quad \text{subject to } C1\text{ to }C11,$
where C1 is the offloading indicator constraint; C2 is the transmit power constraint; C3, C4 and C5 represent the allocated computing capacity constraints of the mobile device, the ground vehicle and the drone, respectively; C6, C7 and C8 mean that the delay to complete a task should not be greater than its maximum tolerable delay; C9 denotes that the total energy consumption of a mobile device from the start time to the current time should be less than the maximum available energy budget of the mobile device; C10 indicates that the total energy consumption of a ground vehicle from the start time to the current time should be within the maximum available energy budget of the ground vehicle; and C11 means that the total energy consumption of a drone from the start time to the current time should not be greater than the maximum available energy budget of the drone.
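The constraint set can be read as a per-decision feasibility test; a minimal sketch covering the power, computing-capacity, delay and cumulative-energy constraints (parameter names are assumptions) is:

```python
def feasible(tx_power: float, max_tx_power: float,
             cpu_hz: float, max_cpu_hz: float,
             delay: float, max_delay: float,
             energy_used: float, energy_budget: float) -> bool:
    """Checks a candidate decision against the constraints described above:
    transmit power (C2), allocated computing capacity (C3-C5), task delay
    (C6-C8), and cumulative energy budget (C9-C11)."""
    return (tx_power <= max_tx_power
            and cpu_hz <= max_cpu_hz
            and delay <= max_delay
            and energy_used <= energy_budget)
```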
Step 2: define the three elements based on the Markov decision model, namely state, action and reward, for the optimization problem of Step 1, and convert it into an optimization objective function based on average reward maximization.
Due to the high coupling between the offloading decision variables and several limiting factors in the system, the optimization problem is an NP-hard problem; therefore, traditional numerical optimization methods usually face problems of high computational time complexity and the curse of dimensionality. In order to avoid the above problems, the present invention defines the three elements based on the Markov decision model as follows:
State. The state of each mobile device includes its task information, channel state and remaining battery information. Thus, the state of mobile device i in time slice n can be expressed as
$s_i^n = \left(d_i^n,\, c_i^n,\, \tau_i^n,\, b_i^n,\, h_{i,j}^n,\, h_{i,k}^n\right)$,
where $b_i^n$ represents the current remaining battery energy of mobile device i in time slice n.
Action. The actions of each mobile device include the offloading indicator, the transmit power and the allocated computing capacity. The action of mobile device i in time slice n may be expressed as
$a_i^n = \left(x_i^n,\, p_i^n,\, f_i^{l,n},\, f_{i,j}^{v,n},\, f_{i,k}^{u,n}\right)$.
Reward. The average system cost is minimized by optimizing the offloading decisions and resource allocation. Thus, the reward of mobile device i in time slice n may be expressed as
$R_i^n = -\,C_i^n$,
where $C_i^n$ represents the weighted cost of mobile device i in time slice n.
Based on the definition of the three elements in the Markov decision model, the trajectory of mobile device i over the N time slices may be represented as
$\mathcal{T}_i = \left(s_i^1, a_i^1, s_i^2, a_i^2, \ldots, s_i^N, a_i^N\right)$.
Accordingly, the probability of the trajectory occurring and the total reward may be expressed as
$p_\theta\!\left(\mathcal{T}_i\right) = p\!\left(s_i^1\right)\prod_{n=1}^{N}\pi_\theta\!\left(a_i^n \mid s_i^n\right)p\!\left(s_i^{n+1} \mid s_i^n, a_i^n\right)$,
$R\!\left(\mathcal{T}_i\right) = \sum_{n=1}^{N} R_i^n$,
where $\theta$ is the network parameter of the Actor and $p\!\left(s_i^1\right)$ indicates the probability of the initial state $s_i^1$ occurring.
The average reward may be expressed as
$\bar{R}_\theta = \mathbb{E}_{\mathcal{T} \sim p_\theta(\mathcal{T})}\!\left[R(\mathcal{T})\right]$.
Thus, the original optimization problem can be translated into an optimization objective function based on maximizing the average reward, as follows:
$P1:\ \max_{\theta}\ \bar{R}_\theta$.
and 3, designing a distributed execution and centralized training framework based on deep reinforcement learning aiming at the Markov decision problem in the step 2, and determining a loss function and an advantage function of training.
For problem P1, the present embodiment employs deep reinforcement learning to train the network model. In view of the needs of large-scale user association and real-time decision, a distributed execution and centralized training framework is built for computation unloading and resource allocation scheduling by adopting an Actor-Critic-based algorithm. For the above optimization problem, the gradient of the objective function can be expressed as:
$\nabla \bar{R}_\theta = \mathbb{E}_{\mathcal{T} \sim p_\theta(\mathcal{T})}\!\left[R(\mathcal{T})\,\nabla \log p_\theta(\mathcal{T})\right] \approx \frac{1}{B}\sum_{b=1}^{B}\sum_{n=1}^{N} R\!\left(\mathcal{T}^{(b)}\right)\nabla \log \pi_\theta\!\left(a^{(b),n} \mid s^{(b),n}\right),$
where B is the mini-batch size of each sample. To add a baseline and an appropriate confidence level, the present embodiment introduces a generalized advantage estimate in place of the total reward. The advantage function is defined as follows:
$\hat{A}^{(b),n} = \sum_{l \ge 0}(\gamma\lambda)^{l}\,\delta^{(b),n+l}, \qquad \delta^{(b),n} = R^{(b),n} + \gamma\, V\!\left(s^{(b),n+1}\right) - V\!\left(s^{(b),n}\right),$
where $V\!\left(s^{(b),n}\right)$ indicates the expected reward in state $s^{(b),n}$, $\gamma$ represents the discount factor of future rewards, $\lambda$ is the generalized advantage estimation parameter, and $R^{(b),n}$ is the reward of mobile device i in time slice n. Thus, the gradient $\nabla \bar{R}_\theta$ becomes
$\nabla \bar{R}_\theta \approx \frac{1}{B}\sum_{b=1}^{B}\sum_{n=1}^{N} \hat{A}^{(b),n}\,\nabla \log \pi_\theta\!\left(a^{(b),n} \mid s^{(b),n}\right).$
In order to improve the efficiency of data sampling, an off-policy formulation is used in place of the on-policy one, and the loss function of the Actor can be expressed as
$L^{\mathrm{Actor}}(\theta) = -\frac{1}{B}\sum_{b=1}^{B}\sum_{n=1}^{N}\min\!\left(\rho^{(b),n}\hat{A}^{(b),n},\ \mathrm{clip}\!\left(\rho^{(b),n},\, 1-\varepsilon,\, 1+\varepsilon\right)\hat{A}^{(b),n}\right),$
where $\theta'$ is the Actor network parameter on each mobile device, $\theta$ is the Actor network parameter to be trained, $\varepsilon$ represents the clipping factor (a fraction between 0 and 1), and clip represents the clipping function, defined as follows:
$\mathrm{clip}\!\left(\rho,\, 1-\varepsilon,\, 1+\varepsilon\right) = \min\!\left(\max\!\left(\rho,\, 1-\varepsilon\right),\, 1+\varepsilon\right).$
The importance sampling ratio $\rho^{(b),n}$ is calculated as follows:
$\rho^{(b),n} = \frac{\pi_\theta\!\left(a^{(b),n} \mid s^{(b),n}\right)}{\pi_{\theta'}\!\left(a^{(b),n} \mid s^{(b),n}\right)}.$
Furthermore, the loss function of Critic can be expressed as
$L^{\mathrm{Critic}}(\Phi) = \frac{1}{B}\sum_{b=1}^{B}\sum_{n=1}^{N}\left(V_\Phi\!\left(s^{(b),n}\right) - \hat{R}^{(b),n}\right)^{2},$
where $\Phi$ is the Critic network parameter and $\hat{R}^{(b),n}$ is the corresponding return target.
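A compact PyTorch sketch of these quantities follows; the λ parameter of generalized advantage estimation and the use of log-probabilities for the importance ratio are standard implementation choices assumed here rather than details stated in the text.

```python
import torch

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory; `values` carries
    one extra bootstrap entry, so len(values) == len(rewards) + 1."""
    advantages = torch.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]   # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

def actor_loss(logp_new, logp_old, advantages, eps=0.2):
    """Clipped surrogate loss with importance ratio pi_theta / pi_theta'."""
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()

def critic_loss(values_pred, returns):
    """Mean-squared error between predicted values and return targets."""
    return ((values_pred - returns) ** 2).mean()
```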
and 4, defining a shared multi-agent near-end strategy optimization algorithm and an execution process aiming at the distributed execution and centralized training framework in the step 3.
On the basis of step 3, a shared multi-agent near-end optimization algorithm based on centralized training and distributed execution frameworks is provided.
Referring to the schematic diagram of the distributed execution-centralized training framework shown in fig. 3, the entire framework can be divided into three parts from bottom to top: distributed execution, data collection, and centralized training.
(1) Firstly, each user equipment interacts with a heterogeneous mobile edge computing network based on local states observed by the user equipment, and batch learning experience is generated.
(2) These learning experiences are then used to train a shared strategy and value function by employing generalized dominance estimates and importance sampling.
(3) Finally, each mobile device shares the trained policy continuation and context interactions.
Illustratively, the shared multi-agent proximal policy optimization is performed as follows:
1: Initialize Actor π and Critic V with parameters θ′ ← θ and Φ′ ← Φ; initialize the experience pool.
2: for episode e = 1 to E do
3:   for time slice n = 1 to N do
4:     for mobile device i = 1 to M do
5:       Interact with the environment and store the experience tuple $\left(s_i^n, a_i^n, R_i^n, s_i^{n+1}\right)$ in the experience pool
6:     end for
7:   end for
8:   for update step t = 1 to T do
9:     for sampling step s = 1 to S do
10:      Randomly select B experience tuples
11:      Compute the advantage function, the Actor loss and the Critic loss
12:      Compute the gradients ∇θ and ∇Φ by gradient descent with the Adam optimizer
13:      Update Actor π and Critic V with parameters θ′ ← θ and Φ′ ← Φ
14:    end for
15:  end for
16: end for
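Under the same assumptions, the collect-then-train procedure can be sketched in Python as follows. The sketch reuses the gae(), actor_loss() and critic_loss() functions from the previous listing; env.rollout(), actor and critic are hypothetical placeholders (env.rollout(actor) is assumed to return one trajectory per device as a list of (state, log_prob, reward) tuples, actor(s) to return (action, log_prob), and critic(s) to return a scalar value estimate), not interfaces defined by the patent.

```python
import random
import torch

def train_smappo(env, actor, critic, actor_opt, critic_opt,
                 episodes=100, updates=10, batch_size=64, gamma=0.99):
    """Skeleton of the distributed-execution / centralized-training loop."""
    for _ in range(episodes):
        pool = []                                      # shared experience pool
        for traj in env.rollout(actor):                # distributed execution
            states, logps, rewards = map(list, zip(*traj))
            with torch.no_grad():                      # bootstrap value of 0 at the end
                values = torch.stack([critic(s) for s in states] + [torch.tensor(0.0)])
            adv = gae(torch.tensor(rewards), values, gamma=gamma)
            returns = adv + values[:-1]
            pool.extend(zip(states, logps, adv, returns))
        for _ in range(updates):                       # centralized training
            batch = random.sample(pool, min(batch_size, len(pool)))
            s, logp_old, adv_b, ret_b = zip(*batch)
            logp_new = torch.stack([actor(x)[1] for x in s])
            a_loss = actor_loss(logp_new, torch.stack(logp_old).detach(),
                                torch.stack(adv_b))
            c_loss = critic_loss(torch.stack([critic(x) for x in s]),
                                 torch.stack(ret_b))
            for opt, loss in ((actor_opt, a_loss), (critic_opt, c_loss)):
                opt.zero_grad()
                loss.backward()
                opt.step()
    return actor, critic
```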
The embodiment of the present invention can solve the computation offloading and resource allocation problems in a ground vehicle and unmanned aerial vehicle assisted heterogeneous mobile edge computing network. Through the deep reinforcement learning based distributed execution-centralized training framework of Step 2 and Step 3, the computational time complexity of solving the original target optimization problem is reduced, and the curse of dimensionality that traditional numerical optimization algorithms may face in large-scale heterogeneous mobile edge computing networks is avoided. By means of the loss function and advantage function defined in Step 3 and the importance sampling introduced in Step 4, the data sampling efficiency and model training speed are improved, the average system cost in the network is greatly reduced, and the quality of service of computation-intensive applications is improved.
Fig. 4 is a schematic structural diagram of a computation offload optimization apparatus of a mobile edge computing network in an embodiment of the present invention, where the mobile edge computing network includes a ground vehicle and an unmanned aerial vehicle, and the apparatus includes:
a model building module 401, configured to build a system model of the moving edge computing network, and determine an optimization objective function of the model based on an average system cost minimization;
a markov decision transformation module 402, configured to transform the optimization objective function based on average system cost reduction into an optimization objective function based on average reward maximization according to the state, action and reward elements of a markov decision model;
a determining module 403, configured to determine a distributed execution and centralized training framework of multi-agent deep reinforcement learning, and determine a loss function and an advantage function of training;
a training module 404 for performing training of the system model according to a multi-agent reinforcement learning algorithm.
The embodiment of the invention is based on a distributed execution-centralized training framework of deep reinforcement learning, which reduces the computational time complexity of solving the original target optimization problem and avoids the curse of dimensionality that traditional numerical optimization algorithms may face in large-scale heterogeneous mobile edge computing networks; by defining a loss function, an advantage function and a multi-agent reinforcement learning algorithm, the data sampling efficiency and model training speed are improved, the average system cost in the network is reduced, and the quality of service of computation-intensive applications is improved.
The embodiment of the invention provides a computing unloading optimization system of a mobile edge computing network, which is used for executing the computing unloading optimization method of the mobile edge computing network.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, or the like.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for optimizing computation offload of a mobile edge computing network, wherein the mobile edge computing network comprises a ground vehicle and an unmanned aerial vehicle, and the method comprises the following steps:
constructing a system model of the moving edge computing network, and determining an optimization objective function of the model based on minimizing an average system cost;
converting the optimization objective function based on the average system cost minimization into an optimization objective function based on the average reward maximization according to the state, action and reward elements of a Markov decision model;
determining a distributed execution and centralized training framework of multi-agent deep reinforcement learning, and determining a loss function and an advantage function of training;
performing training of the system model according to a multi-agent reinforcement learning algorithm.
2. The method of claim 1, wherein constructing the system model of the mobile edge computing network comprises:
establishing a network model comprising a plurality of ground vehicles, unmanned aerial vehicles and mobile equipment;
establishing a communication model according to the network model, wherein the communication model comprises a mobile device-ground vehicle channel model and a mobile device-unmanned aerial vehicle channel model;
and establishing a calculation model according to the communication model, wherein the calculation model comprises the calculation of local calculation cost, ground vehicle edge calculation cost and unmanned aerial vehicle edge calculation cost.
3. The method of claim 2, wherein determining an optimization objective function for the model based on an average system cost minimization comprises:
determining an average system cost of all mobile devices in a plurality of time slices according to the local calculated cost, the ground vehicle edge calculated cost and the unmanned aerial vehicle edge calculated cost;
and jointly optimizing offloading decision variables of the mobile devices so as to minimize the average system cost, thereby obtaining the optimization objective function.
4. The method of claim 1, wherein converting the optimization objective function based on average system cost minimization into an optimization objective function based on average reward maximization according to the state, action and reward elements of a Markov decision model comprises:
determining the trajectory of the mobile device over a plurality of time slices according to the state, action and reward elements of the Markov decision model, and calculating the probability of the trajectory occurring and the total reward; the state comprises task information, channel state and battery information of the mobile device, and the action comprises an offloading indicator, transmission power and allocated computing capacity of the mobile device;
calculating an average reward according to the probability of the trajectory occurring and the total reward, and determining an optimization objective function based on average reward maximization.
5. The method of any of claims 1-4, wherein determining a distributed execution and centralized training framework for multi-agent deep reinforcement learning, and determining a loss function and a merit function for training, comprises:
constructing a distributed execution and centralized training framework of multi-agent deep reinforcement learning based on an Actor-Critic algorithm;
determining an advantage function using generalized advantage estimation in place of the total reward, and determining a loss function using an off-policy formulation in place of the on-policy one.
6. The method of any one of claims 1-4, wherein said performing training of said system model according to a multi-agent reinforcement learning algorithm comprises:
each mobile device interacts with the mobile edge computing network based on the observed local state to generate batch learning experience;
training a shared policy based on the batch learning experience according to generalized advantage estimation and importance sampling;
and each mobile device using the shared policy to interact with the mobile edge computing network.
7. The method of claim 4, wherein the state of mobile device i in time slice n is expressed as
$s_i^n = \left(d_i^n,\, c_i^n,\, \tau_i^n,\, b_i^n,\, h_{i,j}^n,\, h_{i,k}^n\right)$,
wherein $d_i^n$ indicates the input data size, $c_i^n$ indicates the number of clock cycles required to process one bit of the task, $\tau_i^n$ indicates the maximum tolerable delay for completing the task, $b_i^n$ represents the current remaining battery energy of mobile device i in time slice n, $h_{i,j}^n$ represents the channel gain between mobile device i and ground vehicle j in time slice n, and $h_{i,k}^n$ represents the channel gain between mobile device i and unmanned aerial vehicle k in time slice n;
the action of mobile device i in time slice n is represented as
$a_i^n = \left(x_i^n,\, p_i^n,\, f_i^{l,n},\, f_{i,j}^{v,n},\, f_{i,k}^{u,n}\right)$,
wherein $x_i^n$ represents the offloading decision variable of mobile device i in time slice n, $p_i^n$ represents the transmit power of mobile device i in time slice n, $f_i^{l,n}$ represents the local computing resource of mobile device i in time slice n, $f_{i,j}^{v,n}$ represents the ground vehicle computing resource of mobile device i in time slice n, and $f_{i,k}^{u,n}$ represents the unmanned aerial vehicle computing resource of mobile device i in time slice n;
the reward of mobile device i in time slice n is expressed as
$R_i^n = -\,C_i^n$,
wherein $C_i^n$ is the system cost of mobile device i in time slice n;
the trajectory of mobile device i over the N time slices is represented as
$\mathcal{T}_i = \left(s_i^1, a_i^1, s_i^2, a_i^2, \ldots, s_i^N, a_i^N\right)$;
the probability of the trajectory occurring and the total reward are expressed as
$p_\theta\!\left(\mathcal{T}_i\right) = p\!\left(s_i^1\right)\prod_{n=1}^{N}\pi_\theta\!\left(a_i^n \mid s_i^n\right)p\!\left(s_i^{n+1} \mid s_i^n, a_i^n\right)$,
$R\!\left(\mathcal{T}_i\right) = \sum_{n=1}^{N} R_i^n$,
wherein $p\!\left(s_i^1\right)$ indicates the probability of the initial state occurring and $\theta$ represents the network parameter of the Actor;
the average reward is expressed as
$\bar{R}_\theta = \mathbb{E}_{\mathcal{T} \sim p_\theta(\mathcal{T})}\!\left[R(\mathcal{T})\right]$,
wherein $\mathbb{E}$ indicates expectation;
and the optimization objective function based on maximizing the average reward is expressed as
$P1:\ \max_{\theta}\ \bar{R}_\theta$.
8. The method of claim 7, wherein the advantage function is expressed as
$\hat{A}_i^n = \sum_{l \ge 0}(\gamma\lambda)^{l}\,\delta_i^{n+l}, \qquad \delta_i^n = R_i^n + \gamma\, V\!\left(s_i^{n+1}\right) - V\!\left(s_i^n\right)$,
wherein $V\!\left(s_i^n\right)$ indicates the expected reward in state $s_i^n$, $\gamma$ represents the discount factor of future rewards, $\lambda$ is the generalized advantage estimation parameter, and $R_i^n$ is the reward of mobile device i in time slice n;
the loss function of the Actor is expressed as
$L^{\mathrm{Actor}}(\theta) = -\frac{1}{B}\sum_{b=1}^{B}\sum_{n=1}^{N}\min\!\left(\rho^{(b),n}\hat{A}^{(b),n},\ \mathrm{clip}\!\left(\rho^{(b),n},\, 1-\varepsilon,\, 1+\varepsilon\right)\hat{A}^{(b),n}\right)$,
wherein $\theta'$ indicates the Actor network parameter on each mobile device, $\theta$ indicates the Actor network parameter to be trained, $\varepsilon$ is the clipping factor, B is the mini-batch size, and the importance sampling ratio $\rho^{(b),n}$ is calculated as follows:
$\rho^{(b),n} = \frac{\pi_\theta\!\left(a^{(b),n} \mid s^{(b),n}\right)}{\pi_{\theta'}\!\left(a^{(b),n} \mid s^{(b),n}\right)}$;
the loss function of Critic is expressed as
$L^{\mathrm{Critic}}(\Phi) = \frac{1}{B}\sum_{b=1}^{B}\sum_{n=1}^{N}\left(V_\Phi\!\left(s^{(b),n}\right) - \hat{R}^{(b),n}\right)^{2}$,
wherein $\Phi$ is the Critic network parameter and $\hat{R}^{(b),n}$ is the corresponding return target.
9. a computational offload optimization apparatus for a mobile edge computing network, the mobile edge computing network comprising a ground vehicle, a drone, the apparatus comprising:
a model construction module for constructing a system model of the moving edge computing network and determining an optimization objective function of the model based on an average system cost minimization;
the Markov decision conversion module is used for converting the optimization objective function based on average system cost minimization into an optimization objective function based on average reward maximization according to the state, action and reward elements of the Markov decision model;
a determining module, used for determining a distributed execution and centralized training framework of multi-agent deep reinforcement learning and determining a loss function and an advantage function of training;
a training module for performing training of the system model according to a multi-agent reinforcement learning algorithm.
10. A computation offload optimization system of a mobile edge computing network, the system being configured to perform the computation offload optimization method of the mobile edge computing network according to any one of claims 1 to 8.
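Purely as an illustration of how the four modules recited in claim 9 might be wired together (all class and method names below are hypothetical), a minimal skeleton could look like this:

class OffloadOptimizer:
    """Hypothetical wiring of the four modules recited in claim 9."""

    def __init__(self, model_builder, mdp_converter, framework_determiner, trainer):
        self.model_builder = model_builder                  # builds the system model and cost objective
        self.mdp_converter = mdp_converter                  # cost minimization -> reward maximization MDP
        self.framework_determiner = framework_determiner    # CTDE framework, loss and advantage functions
        self.trainer = trainer                              # multi-agent deep RL training loop

    def optimize(self, num_episodes):
        system_model = self.model_builder.build()
        mdp = self.mdp_converter.convert(system_model)
        training_spec = self.framework_determiner.determine(mdp)
        return self.trainer.train(mdp, training_spec, num_episodes)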
CN202210619336.0A 2022-06-02 2022-06-02 Method, device and system for optimizing computation offload of mobile edge computing network Pending CN114698125A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210619336.0A CN114698125A (en) 2022-06-02 2022-06-02 Method, device and system for optimizing computation offload of mobile edge computing network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210619336.0A CN114698125A (en) 2022-06-02 2022-06-02 Method, device and system for optimizing computation offload of mobile edge computing network

Publications (1)

Publication Number Publication Date
CN114698125A true CN114698125A (en) 2022-07-01

Family

ID=82131080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210619336.0A Pending CN114698125A (en) 2022-06-02 2022-06-02 Method, device and system for optimizing computation offload of mobile edge computing network

Country Status (1)

Country Link
CN (1) CN114698125A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200186964A1 (en) * 2018-12-07 2020-06-11 T-Mobile Usa, Inc. Uav supported vehicle-to-vehicle communication
CN112118601A (en) * 2020-08-18 2020-12-22 西北工业大学 Method for reducing task unloading delay of 6G digital twin edge computing network
CN112929849A (en) * 2021-01-27 2021-06-08 南京航空航天大学 Reliable vehicle-mounted edge calculation unloading method based on reinforcement learning
CN113346944A (en) * 2021-06-28 2021-09-03 上海交通大学 Time delay minimization calculation task unloading method and system in air-space-ground integrated network
CN114116061A (en) * 2021-11-26 2022-03-01 内蒙古大学 Workflow task unloading method and system in mobile edge computing environment
CN114169234A (en) * 2021-11-30 2022-03-11 广东工业大学 Scheduling optimization method and system for unmanned aerial vehicle-assisted mobile edge calculation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JINNA CHEN et al.: "UAV-Assisted Vehicular Edge Computing for the 6G Internet of Vehicles: Architecture, Intelligence, and Challenges", IEEE Communications Standards Magazine *
王云鹏 (Wang Yunpeng): "Research on Resource Optimization Methods for Mobile Edge Computing Based on Deep Reinforcement Learning", China Excellent Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115499440A (en) * 2022-09-14 2022-12-20 广西大学 Server-free edge task unloading method based on experience sharing deep reinforcement learning

Similar Documents

Publication Publication Date Title
CN113543176B (en) Unloading decision method of mobile edge computing system based on intelligent reflecting surface assistance
Chen et al. Efficiency and fairness oriented dynamic task offloading in internet of vehicles
Faraci et al. Fog in the clouds: UAVs to provide edge computing to IoT devices
US11831708B2 (en) Distributed computation offloading method based on computation-network collaboration in stochastic network
CN115640131A (en) Unmanned aerial vehicle auxiliary computing migration method based on depth certainty strategy gradient
EP4024212B1 (en) Method for scheduling inference workloads on edge network resources
CN115175217A (en) Resource allocation and task unloading optimization method based on multiple intelligent agents
CN113645637B (en) Method and device for unloading tasks of ultra-dense network, computer equipment and storage medium
Hajiakhondi-Meybodi et al. Deep reinforcement learning for trustworthy and time-varying connection scheduling in a coupled UAV-based femtocaching architecture
CN113573363A (en) MEC calculation unloading and resource allocation method based on deep reinforcement learning
CN117499867A (en) Method for realizing high-energy-efficiency calculation and unloading through strategy gradient algorithm in multi-unmanned plane auxiliary movement edge calculation
CN113946423B (en) Multi-task edge computing, scheduling and optimizing method based on graph attention network
CN116489708A (en) Meta universe oriented cloud edge end collaborative mobile edge computing task unloading method
CN113821346B (en) Edge computing unloading and resource management method based on deep reinforcement learning
CN114698125A (en) Method, device and system for optimizing computation offload of mobile edge computing network
Henna et al. Distributed and collaborative high-speed inference deep learning for mobile edge with topological dependencies
CN115514769B (en) Satellite elastic Internet resource scheduling method, system, computer equipment and medium
CN116455903A (en) Method for optimizing dependency task unloading in Internet of vehicles by deep reinforcement learning
CN116204319A (en) Yun Bianduan collaborative unloading method and system based on SAC algorithm and task dependency relationship
CN114980160A (en) Unmanned aerial vehicle-assisted terahertz communication network joint optimization method and device
CN114217881A (en) Task unloading method and related device
CN111813539B (en) Priority and collaboration-based edge computing resource allocation method
CN117891532B (en) Terminal energy efficiency optimization unloading method based on attention multi-index sorting
Zhang et al. Cooperative optimisation strategy of computation offloading in multi‐UAVs‐assisted edge computing networks
Hevesli et al. Task Offloading Optimization in Digital Twin Assisted MEC-Enabled Air-Ground IIoT 6 G Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220701