CN113873022A - Mobile edge network intelligent resource allocation method capable of dividing tasks - Google Patents

Mobile edge network intelligent resource allocation method capable of dividing tasks

Info

Publication number
CN113873022A
Authority
CN
China
Prior art keywords
unloading
subtask
task
terminal
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111112170.5A
Other languages
Chinese (zh)
Inventor
沈斐
唐亮
卜智勇
赵宇
Other inventors have requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Microsystem and Information Technology of CAS
Original Assignee
Shanghai Institute of Microsystem and Information Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Microsystem and Information Technology of CAS filed Critical Shanghai Institute of Microsystem and Information Technology of CAS
Priority to CN202111112170.5A priority Critical patent/CN113873022A/en
Publication of CN113873022A publication Critical patent/CN113873022A/en
Pending legal-status Critical Current


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/14 Network analysis or design
    • H04L 41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/104 Peer-to-peer [P2P] networks
    • H04L 67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/104 Peer-to-peer [P2P] networks
    • H04L 67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H04L 67/1078 Resource delivery mechanisms
    • H04L 67/1082 Resource delivery mechanisms involving incentive schemes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention relates to an intelligent resource allocation method for a mobile edge network with divisible tasks, comprising the following steps: dividing a serial task generated by a terminal into a plurality of subtasks and establishing an offloading task model; establishing a delay model and an energy consumption model for the subtasks under the two execution modes, local execution or offloading, and defining a joint offloading objective optimization function based on the multi-user serial dependent tasks; in a multi-server scenario, establishing a Markov game model according to the cooperative and competitive relationship of multiple users over wireless communication and computing resources, and optimizing the joint offloading objective optimization function; in a time-varying environment, each terminal acting as an independent agent that executes a reinforcement learning algorithm on partial system state information to solve the Markov game model, determining the offloading strategy, subchannel selection, transmit power and resource allocation. The invention helps allocate server resources reasonably and make full use of fragmented resources, guarantees the end-user experience, and improves the stability of network operation.

Description

Mobile edge network intelligent resource allocation method capable of dividing tasks
Technical Field
The invention relates to the technical field of edge computing and artificial intelligence, and in particular to an intelligent resource allocation method for a mobile edge network with divisible tasks.
Background
With the continuous development of communication technology, a great number of emerging interactive Internet applications have appeared. These applications place ever-increasing demands on data transmission, the computing power of mobile devices and latency, making them unsuitable for execution on smart devices with weak computing power and limited battery capacity. In addition, a cloud-only architecture requires long-distance data transmission and can hardly meet the terminal-side requirements for low latency and large bandwidth in the ultra-dense wireless networks of next-generation communication frameworks. Mobile edge computing, as a concrete realization of edge computing, is therefore an important solution to the above problems. Mobile Edge Computing (MEC) sinks part of the cloud's service capability to edge nodes near the user and provides users with resource services such as computing and caching. A user can offload part of its computation-intensive tasks to the server of an edge node for execution, thereby reducing the delay generated during data transmission, relieving the transmission pressure on the backbone network, and ensuring effective task execution.
Because MEC servers have limited resources, multiple devices performing edge-computing task offloading compete for computing and communication resources. There is already considerable research on task offloading and resource allocation; for example, the patent document with application number 202010171454.0 discloses a task offloading method for a mobile edge computing scenario that determines an optimization objective equation minimizing system overhead from the task information to be processed and real-time system parameters, decomposes the optimization objective equation into two sub-problems, a task offloading and channel allocation sub-problem and a transmission power and edge-server resource allocation sub-problem, and solves the sub-problems to obtain the final task offloading scheme, minimizing the overall system overhead. However, that method targets a single-server offloading scenario, its problem-solving dimensionality is high, it cannot meet multi-terminal demands in a dense network, and the algorithm scales poorly.
The defects of the prior art are mainly reflected in four aspects. First, the scenarios are too simple: most research targets single-/multi-terminal single-server scenarios, considering the competition for computing and communication resources between devices while ignoring offloading server selection, load balancing between servers, and resource scheduling and allocation. Second, the offloaded task is treated as indivisible: existing research is limited to 0-1 offloading of atomic tasks, ignoring the potential parallelism among divisible tasks, so the fragmented resources of servers cannot be used effectively. Third, the optimization target is too narrow: only delay and energy consumption are considered, other factors affecting system performance are ignored, and tasks of different urgency need to be handled differently. Finally, centralized offloading policies adapt poorly to dynamic environments: a unified decision is made from collected global information, and the central control node must bear enormous computation and traffic pressure, likely becoming the bottleneck of the whole system.
Disclosure of Invention
The invention aims to solve the technical problem of providing an intelligent resource allocation method for a mobile edge network with divisible tasks, which helps allocate server resources reasonably and make full use of fragmented resources, improves task offloading execution performance, guarantees the end-user experience, and improves the stability of network operation.
The invention considers and solves the following technical problems:
1) a multi-terminal multi-server task offloading scenario involves competition among users for computing and communication resources, offloading server selection, load balancing between servers, and resource scheduling and allocation, and has higher complexity than single-/multi-terminal single-server scenarios;
2) serial tasks have a strict constraint relationship and must be executed in order; the execution sequence cannot be disturbed. A suitable subchannel, transmit power and amount of computing resources must be determined for each subtask whose strategy is to offload;
3) the design of the optimization objective function must meet the delay requirements and urgency of different tasks. In an environment with a time-varying system state, the multi-terminal offloading problem is solved in a distributed, self-organizing manner, reducing the instability of the multi-terminal environment while guaranteeing the long-term reward of each end user.
The technical solution adopted by the invention to solve the above technical problems is as follows: a method for intelligent resource allocation in a mobile edge network with divisible tasks, comprising the following steps:
(1) dividing a serial task generated by a terminal into a plurality of subtasks and establishing an offloading task model;
(2) establishing a delay model and an energy consumption model for the subtasks under the two execution modes, local execution or offloading, and defining a joint offloading objective optimization function based on the multi-user serial dependent tasks;
(3) in a multi-server scenario, establishing a Markov game model according to the cooperative and competitive relationship of multiple users over wireless communication and computing resources, and optimizing the joint offloading objective optimization function;
(4) in a time-varying environment, each terminal acting as an independent agent that executes a reinforcement learning algorithm on partial system state information to solve the Markov game model, determining the offloading strategy, subchannel selection, transmit power and resource allocation.
The plurality of subtasks in step (1) have interdependencies, and data interaction exists among them.
When the offloading task model is established in step (1), each subtask may be offloaded to only one MEC server for execution, but different subtasks of one application can be offloaded to different MEC servers; when adjacent subtasks are offloaded to the same or different MEC servers, the output data of the previous subtask is transferred to the MEC server of the next subtask through a wired connection.
The joint offloading objective optimization function P in step (2) is:
P: min Σ_i δ_i · (χ_1 · T_i + χ_2 · E_i)
where T_i denotes the delay to complete the ith subtask, E_i denotes the terminal energy consumption to complete the ith subtask, δ_i denotes the priority of the ith subtask, and χ_1, χ_2 denote the weights of delay and energy consumption, with χ_1, χ_2 ∈ [0, 1] and χ_1 + χ_2 = 1. The joint offloading objective optimization function satisfies the following constraints: constraint 1, the execution location of an application subtask is the local device or an edge server; constraint 2, the entry and exit subtasks of a task can only be executed locally; constraint 3, a subtask can start executing only after its predecessor subtask has finished; constraint 4, each subtask can select only one subchannel frequency for transmitting data to the server; constraint 5, the total amount of computing resources allocated to all subtasks that choose to offload to an edge server must not exceed its maximum resource ownership; constraint 6, the transmit power of a terminal device when uploading data to an edge server must not exceed its maximum transmit power.
Step (3) is specifically: determining the known state space, action space and reward function; modeling the task offloading and resource allocation decision process of the multiple terminals as a Markov decision process, i.e., in each time slot a terminal observes its local environment state and then takes an action independently according to the strategy it adopts; according to the task execution result, each agent obtains a reward fed back by the environment and transitions to a new state according to the actions of all related agents; the decision processes of all the coupled terminals are modeled as a Markov game, i.e., in any time slot, each terminal aims to take the best action while maximizing its long-term reward.
Step (4) is specifically: each terminal acts as an independent agent, and everything outside the terminal is treated as the environment; each terminal independently runs an Actor-Critic reinforcement learning framework; all terminals are trained on current partial environment data and select the optimal offloading and resource allocation strategy through the reinforcement learning algorithm until reaching convergence; each terminal then distributes its subtasks to the server nodes specified by the offloading strategy and obtains the appropriate amount of resources based on the resource allocation strategy.
The Actor-Critic reinforcement learning framework comprises a Critic network and an Actor network. The Critic network is trained in a value-based manner; its input comprises the current state, the selected action and the next state. The Critic network adopts temporal-difference updating, i.e., its parameters are updated after every step within an episode; it estimates the value of each state-action pair and feeds the temporal-difference error back to the Actor network. The loss function of the Critic network is defined as the square of the temporal-difference error, and this loss function guides the parameter update process. The Actor network is trained in a policy-based manner and outputs an action, or action probabilities, for an input state; the Actor network adopts Monte-Carlo updating, i.e., it is updated once after each episode finishes. The loss function of the Actor network is designed based on the temporal-difference error computed by the Critic network.
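A minimal single-state Actor-Critic loop can illustrate the division of labor described above: the critic forms a temporal-difference error and the actor takes a policy-gradient step scaled by that error. This is a toy sketch, not the patent's networks: the one-state environment, the reward (offloading always better than local execution) and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 2               # e.g. {0: local execution, 1: offload} for one subtask
theta = np.zeros(n_actions) # actor parameters: action preferences
v = 0.0                     # critic estimate: value of the single state
alpha_actor, alpha_critic = 0.1, 0.1

def softmax(z):
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()

for _ in range(2000):
    pi = softmax(theta)
    a = rng.choice(n_actions, p=pi)
    r = 1.0 if a == 1 else 0.0   # toy reward: offloading yields higher utility
    td = r - v                   # one-step episode, so the TD target is just r
    v += alpha_critic * td       # critic: temporal-difference update
    grad = -pi                   # gradient of log pi(a) w.r.t. theta ...
    grad[a] += 1.0               # ... is e_a - pi
    theta += alpha_actor * td * grad  # actor: policy-gradient step scaled by TD error

assert softmax(theta)[1] > 0.9   # the learned policy prefers offloading
```

The same pattern, with neural networks in place of the table and the local observation as the state, is what each terminal-agent runs independently.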
Advantageous effects
Thanks to the adoption of the above technical solution, compared with the prior art the invention has the following advantages and positive effects. The invention considers the competition among multiple terminals simultaneously submitting offloading requests and the interplay between task offloading and server resource allocation, takes task priority, average application completion time and average mobile-terminal energy consumption as evaluation indices, and formulates the joint task offloading and resource allocation mechanism among several selfish, coupled users in an edge network as a stochastic game. Each user learns the optimal offloading decision distributively by observing its local network environment, with the goal of improving task execution performance by selecting subchannels, transmit power and the amount of allocated computing resources, without needing to know all state information. A multi-agent reinforcement learning framework is designed to solve the stochastic game. The strategy follows a first-come-first-served principle and allocates edge-server resources reasonably, reducing the waiting time of tasks on the server, so that users obtain better task offloading results and both user experience and application performance improve.
Drawings
FIG. 1 is a diagram of a task offloading scenario for a multi-terminal multi-server oriented ultra-dense network in an embodiment of the present invention;
FIG. 2 is a flow chart of an embodiment of the present invention;
FIG. 3 is a model diagram of serial task offloading in a multi-terminal multi-server scenario, according to an embodiment of the invention;
FIG. 4 is an Actor-Critic architecture diagram of the multi-agent reinforcement learning algorithm in an embodiment of the present invention.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The embodiment of the invention relates to an intelligent resource allocation method for a mobile edge network with divisible tasks, comprising the following steps: dividing a serial task generated by a terminal into a plurality of subtasks and establishing an offloading task model; establishing a delay model and an energy consumption model for the subtasks under the two execution modes, local execution or offloading, and defining a joint offloading objective optimization function based on the multi-user serial dependent tasks; in a multi-server scenario, establishing a Markov game model according to the cooperative and competitive relationship of multiple users over wireless communication and computing resources, and optimizing the joint offloading objective optimization function; in a time-varying environment, each terminal acting as an independent agent that executes a reinforcement learning algorithm on partial system state information to solve the Markov game model, determining the offloading strategy, subchannel selection, transmit power and resource allocation.
On the premise of satisfying the dependencies among subtasks, the invention schedules the subtasks reasonably, makes full use of the fragmented resources of the local user and the servers, improves application performance and end-user experience, and solves the multi-terminal task offloading problem when the communication and computing resources of the edge servers are limited. The task offloading mechanism of several selfish users in the network is formulated as a stochastic game, and a multi-agent reinforcement learning framework is designed to solve it. Each user learns the optimal decision between local and edge computation by observing its local network environment, with the goal of determining the optimal offloading strategy and resource allocation scheme by selecting the subchannel and transmit power, without knowing all state information, thereby reducing the average task completion delay and the average terminal energy consumption.
This process is described in detail below in conjunction with fig. 2.
S1. Establishing the division and offloading model for the serial tasks generated by the mobile terminal
An application can be automatically divided into multiple subtasks with interdependencies and data interaction between them; serial mobile applications with dependency constraints are the research object. The set of mobile terminals is denoted MDs = {1, 2, …, N}, where N denotes the number of submitted offload requests. Assume the application task generated by each end user can be uniformly divided into n subtasks, with task set Task = {1, 2, …, n}. Subtask 0 and subtask n+1 are virtual subtasks representing data input and result output, respectively, and are fixed to local execution. Each application is represented by a quadruple <MD_i, d_i, c_i, δ_i>, i ∈ MDs, where MD_i indicates the application is generated by terminal i, d_i = {d_{i,1}, d_{i,2}, …, d_{i,n}} denotes the input-data size of each subtask of terminal i, c_i = {c_{i,1}, c_{i,2}, …, c_{i,n}} denotes the CPU cycles required to compute each subtask, and δ_i denotes the task priority of the terminal generating the application. The invention adopts a linear linked list L = {V, ED} to represent the dependencies between subtasks: each node j ∈ V represents a subtask of the mobile application, and each directed edge e(j-1, j) ∈ ED represents the dependency between subtask j-1 and subtask j. The jth subtask can start execution only after enough computing, storage and network resources are allocated and its predecessor subtask j-1 has finished executing. The subtask offloading model is shown in FIG. 3.
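The quadruple <MD_i, d_i, c_i, δ_i> and the linear dependency list L = {V, ED} can be sketched as a small data structure; the field names below are assumptions chosen for illustration, not from the patent.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Application:
    """One serial application <MD_i, d_i, c_i, delta_i> of a terminal."""
    terminal: int        # MD_i: terminal that generated the application
    data: List[float]    # d_i: input-data size of each subtask (bits)
    cycles: List[float]  # c_i: CPU cycles needed by each subtask
    priority: float      # delta_i: task priority of the terminal

    def edges(self):
        """Dependency edges e(j-1, j) of the linear chain L = {V, ED}."""
        return [(j - 1, j) for j in range(1, len(self.data))]

# a three-subtask example with illustrative sizes
app = Application(terminal=1, data=[4e5, 2e5, 1e5],
                  cycles=[1e8, 3e8, 2e8], priority=2.0)
```

The `edges` list makes the serial constraint explicit: subtask j may start only after subtask j-1, its unique predecessor in the chain, has finished.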
The time range is divided into a number of time slots, assuming the system operates on a slotted structure. In each time slot k, each end user uses its local observation information to select task execution decisions in a distributed manner.
The edge servers are deployed at the network edge close to the mobile terminals and provide computing, network and storage services for task offloading. Considering the multi-edge-server scenario deployed at different locations of the ultra-dense network shown in FIG. 1, the server set is denoted S = {1, 2, …}. Each server is represented as a triple <s, F_s, B_s>, s ∈ S, where s denotes the number of the server; F_s denotes the maximum computing capability of server s in instructions executed per second; B_s denotes the network bandwidth for communication between the mobile terminals and the edge server at the current moment. The uplink channel resources are equally divided into K_s subchannels, and an offloaded subtask selects the kth subchannel to upload its data according to the strategy. The processing and transmission capabilities of all servers are assumed to be identical and do not change as the task load increases.
The terminals continuously collect data and run computation-intensive tasks. For each application run by a terminal, the offloading policy can be expressed as an n-dimensional vector X_i = {x_{i,1}, x_{i,2}, …, x_{i,n}}, where x_{i,j} = 0 denotes that subtask j of application i is executed locally, and x_{i,j} = s ∈ S denotes that the subtask is offloaded to edge server s for execution.
For each application run by the terminal, the channel resource allocation strategy can be expressed as an n-dimensional vector K_i = {k_{i,1}, k_{i,2}, …, k_{i,n}}, where k_{i,j} ∈ {0, 1, …, K_s} indicates through which subchannel subtask j of application i transmits its offload data to edge server x_{i,j}. When x_{i,j} = 0, the task is executed locally, and then k_{i,j} = 0.
For each application run by the terminal, the computing resource allocation strategy can be expressed as an n-dimensional vector Ψ_i = {ψ_{i,1}, ψ_{i,2}, …, ψ_{i,n}}, where ψ_{i,j} ∈ [0, 1] and x_{i,j} = s means that edge server s assigns ψ_{i,j} · F_s computing resources to subtask j of application i, F_s being the maximum computing resource ownership of edge server s. When x_{i,j} = 0, ψ_{i,j} = 0.
Each subtask may be offloaded to only one MEC server for execution, but different subtasks of one application can be offloaded to different MEC servers. When adjacent subtasks are offloaded to the same or different MEC servers, the output data of the previous subtask is transmitted to the MEC server of the next subtask through a wired connection, and this transmission consumes no terminal energy.
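The three decision vectors of one application, X_i (execution location), the subchannel choices and the compute shares Ψ_i, together with the rules above, can be checked with a sketch like the following. The encoding (0 for local execution, a server id otherwise) follows the text; the function name and the per-application scope of the capacity check are illustrative assumptions.

```python
def valid_decision(x, k, psi, servers, K_s):
    """Check one application's offload vector x, subchannel vector k and
    compute-share vector psi against the model's rules:
    a local subtask uses no subchannel and no server compute share, while an
    offloaded subtask needs one subchannel in [1, K_s] and a share in (0, 1]."""
    for xj, kj, pj in zip(x, k, psi):
        if xj == 0:                       # local execution
            if kj != 0 or pj != 0:
                return False
        else:                             # offloaded to server xj
            if xj not in servers or not (1 <= kj <= K_s) or not (0 < pj <= 1):
                return False
    # shares granted by any one server must not exceed its whole capacity F_s
    for s in servers:
        if sum(p for xj, p in zip(x, psi) if xj == s) > 1:
            return False
    return True
```

In the full model the capacity check runs over all users sharing a server, not a single application; this sketch shows the per-application part only.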
S2. Establishing delay and energy consumption models for the subtasks under the local and offloading execution modes, and defining the joint objective optimization function
S21. Establishing the local execution time model
Local execution means that subtask (i, j) is executed on the mobile terminal. ST_{i,j} and FT_{i,j} denote the start time and finish time of subtask (i, j), respectively, where ST_{i,j} is expressed as:
ST_{i,j} = FT_{i,j-1} + T_{i,j-1,j}
where T_{i,j-1,j} represents the data transmission time between subtasks (i, j-1) and (i, j):
T_{i,j-1,j} = d_{i,j-1} / r_{i,j-1,s} when the two subtasks execute at different locations, and 0 otherwise,
where d_{i,j-1} denotes the size of the data transferred between subtasks (i, j-1) and (i, j).
Orthogonal frequency-division multiple access is adopted as the uplink access scheme. For a server s, its working band B_s is divided into K_s equal subbands. To guarantee orthogonality of uplink transmissions between user applications associated with the same server, each user is assigned one subband for transmitting data to the edge server, so server s can serve at most K_s users simultaneously. Each user and each server has one antenna for uplink transmission. Let g_{i,s}^k denote the uplink channel gain between user i and server s on subband k, k ∈ [1, K_s], capturing the effects of path loss, shadowing and antenna gain. p_{i,j} denotes the wireless transmit power when user i uploads the request of subtask j to the server, with 0 ≤ p_{i,j} ≤ p_max, and p_{i,j} = 0 when x_{i,j} ≠ s. Since users transmitting to the same server use different subbands, uplink intra-cell interference can be ignored, but these users are still affected by inter-cell interference. The signal-to-noise ratio from user i to server s on subband k is then expressed as:
γ_{i,s}^k = p_{i,j} · g_{i,s}^k / (σ² + I_k)
where σ² is the background noise variance and I_k denotes the cumulative interference on subband k from all users associated with other servers. Since each subtask j of user i transmits data only on a single subband, the rate at which subtask j of user i transmits data to server s is:
r_{i,j,s} = B_{i,j,s} · log₂(1 + γ_{i,s}^k)
where B_{i,j,s} represents the actual communication bandwidth after attenuation by environmental interference and user collisions.
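The SINR and Shannon-rate expressions can be evaluated directly. A sketch, with symbol names following the text (power p, gain g, noise variance σ², interference I, bandwidth B):

```python
import math

def uplink_rate(p, gain, bandwidth, noise_var, interference=0.0):
    """Achievable uplink rate r = B * log2(1 + p*g / (sigma^2 + I))
    on the single subband chosen for the subtask."""
    sinr = p * gain / (noise_var + interference)
    return bandwidth * math.log2(1.0 + sinr)
```

With p·g = 3 and σ² = 1 (no inter-cell interference), the SINR is 3 and the rate is exactly 2 bits per second per hertz of bandwidth, which makes the formula easy to sanity-check.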
When a user executes its task locally, it is assumed the user can devote all of its computing resources to subtask execution. f_i^l denotes the total computing capability of end user i, and t_{i,j}^l denotes the time for subtask (i, j) to execute locally; then:
t_{i,j}^l = c_{i,j} / f_i^l
where c_{i,j} denotes the CPU cycles required by the jth subtask of application i. Thus, the time at which subtask (i, j) is completed locally at the user is:
FT_{i,j} = ST_{i,j} + t_{i,j}^l
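When consecutive subtasks all run in the same place, the transfer term T_{i,j-1,j} vanishes and the local finish times simply accumulate; a sketch (illustrative names, fully local chain assumed):

```python
def local_finish_times(cycles, f_local):
    """Finish times FT_{i,j} = ST_{i,j} + c_{i,j}/f_i^l of a fully local chain,
    where ST_{i,j} = FT_{i,j-1} (no inter-subtask transfer cost)."""
    ft, prev = [], 0.0
    for c in cycles:
        prev = prev + c / f_local   # FT_j = FT_{j-1} + c_j / f^l
        ft.append(prev)
    return ft
```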
S22. Establishing the offloading execution time model
Offloading execution comprises three phases: transmitting the request to the MEC server over the uplink, executing the task on the MEC server, and returning the task execution result from the MEC server to the user over the downlink. Since the size of the result is usually much smaller than that of the request, while the downlink data rate is much higher than the uplink data rate, the result transmission delay is omitted here.
The start time of subtask (i, j) on the edge server is likewise denoted ST_{i,j}. Each MEC server can provide computation offloading services for multiple subtasks simultaneously. The computing resource that each MEC server provides to an associated subtask is quantified as f_{i,j}^s = ψ_{i,j} · F_s. A feasible computing resource allocation strategy must satisfy the computing resource constraint:
Σ_{(i,j): x_{i,j}=s} ψ_{i,j} ≤ 1
by using
Figure BDA0003274258310000083
Represents the time at which the subtask (i, j) executes at the edge server s:
Figure BDA0003274258310000084
thus, the subtask (i, j) offload execution time is:
Figure BDA0003274258310000085
for the entire task generated by user i, its completion time can be expressed as:
Ti=FTi,n+1-STi,0
where 0, n +1 represent the entry and exit subtasks of the task, respectively.
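Putting the local and offloaded cases together, the finish time of a serial chain can be sketched as follows. This is a simplified illustration: a single shared uplink rate is assumed, server-to-server wired transfer and result return are treated as free (as the text allows), and all names are illustrative.

```python
def chain_completion_time(x, cycles, data, rate, f_local, f_alloc):
    """Finish time of one serial chain under offload decisions x
    (x[j] = 0: local, x[j] = s: offloaded to server s).
    Each offloaded subtask pays its uplink time d_j/r plus its execution
    time c_j / (psi_j * F_s); each local subtask pays c_j / f^l."""
    ft = 0.0
    for j, xj in enumerate(x):
        if xj == 0:
            ft += cycles[j] / f_local            # local execution
        else:
            ft += data[j] / rate                  # uplink transmission
            ft += cycles[j] / f_alloc[j]          # execution on the edge server
    return ft
```

Comparing the all-local value with a mixed decision vector shows directly when offloading a heavy subtask pays off despite the upload delay.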
S23. Establishing the computation energy consumption model
The modeling of the invention considers only the energy consumption of edge users, because end-user devices are usually powered by batteries with limited energy and are therefore sensitive to energy consumption, whereas an edge server is usually colocated with an edge gateway such as a base station and powered by grid alternating current, so its computation and communication energy requirements are relaxed.
During the entire serial-task edge offloading process, the energy consumption of the end-user device comes from two parts, computation energy and wireless communication energy, and can be expressed as:
E_i = E_i^c + E_i^t
where E_i is the total energy consumption of user i during edge offloading, E_i^c is the energy consumed by local computation of subtasks, and E_i^t represents the energy consumed by the user's wireless communication with the edge server.
The computation energy is represented by the per-cycle energy model e = τ · f², where τ is an energy coefficient depending on the chip architecture and f denotes the current CPU frequency. Thus, the computation energy e_{i,j}^l of application i when executing subtask j locally is calculated as:
e_{i,j}^l = τ · (f_i^l)² · c_{i,j}
The total computation energy for mobile user i to complete the whole task then equals the sum of the computation energies of all locally executed subtasks, i.e.:
E_i^c = Σ_{j: x_{i,j}=0} e_{i,j}^l
s24, establishing a transmission energy consumption model
The transmission energy consumption is mainly generated by data transmission between the mobile end user and the edge server. For an application generated by a user $i$, when two adjacent subtasks are executed in the same place (both on the mobile device or both on the edge server), the transmission energy consumption is zero; data-transfer energy is consumed only when two adjacent subtasks are executed at different locations. The energy consumed when mobile end user $i$ transmits the data of subtask $j$ to the edge server is expressed as:

$$E_{i,j}^{tran} = p_i \cdot \frac{d_{i,j}}{r_i}$$

where $p_i$ is the transmit power of user $i$, $d_{i,j}$ the amount of data to be uploaded for subtask $j$, and $r_i$ the achievable uplink rate.
Thus, the total transmission energy consumption for mobile end user $i$ to complete the entire task is expressed as:

$$E_i^{tran} = \sum_{j} \mathbb{1}\left(x_{i,j} \neq x_{i,j-1}\right) E_{i,j}^{tran}$$
Thus, the total energy consumption of all end users in the system can be expressed as:

$$E = \sum_{i=1}^{N} E_i$$
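The transmission-energy model above (power times transmission time, counted only when adjacent subtasks execute at different locations) can be sketched as follows; the power, data sizes and rate are invented example values:

```python
def transmit_energy(p, data_bits, rate):
    """Transmission energy = transmit power * (data volume / uplink rate)."""
    return p * data_bits / rate

def total_transmit_energy(p, data_per_subtask, rate, crosses_boundary):
    """Only sub-task outputs that cross the device/server boundary cost energy."""
    return sum(transmit_energy(p, d, rate)
               for d, b in zip(data_per_subtask, crosses_boundary) if b)

# p = 0.1 W, 1 Mbit crossing the boundary at 1 Mbit/s -> 0.1 J (example values).
e_tran = total_transmit_energy(0.1, [1e6, 2e6], 1e6, [True, False])
```
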
S25, combining the task execution time model and the energy consumption model to define the offloading objective optimization function based on multi-terminal serial tasks
Latency and power consumption are the two key costs of task execution. If an end user chooses to offload its computational tasks, it must request spectrum and computing resources from the gateway, reducing the resources available to other users. A larger transmit power means a higher transmission rate and lower transmission delay, but also more interference to other end users. The time model and the energy consumption model established in the serial-task offloading scenario are both affected by the offloading strategy and cannot reach their minima simultaneously if optimized independently. The invention therefore designs a joint computation offloading scheme and provides an efficient resource allocation solution among end users.
The execution delay, the energy consumption constraint and a newly introduced task priority δ are jointly quantified and unified into a system utility that evaluates offloading performance and simultaneously serves as the reward fed back to train the neural network. According to the analysis of the above computation model and communication model, and considering the offloading policy, channel selection policy, transmit power and computational resource allocation, the offloading optimization objective function is defined as:

$$P:\ \min \sum_{i=1}^{N} \delta_i \left( \chi_1 T_i + \chi_2 E_i \right)$$
the system cost function comprises time delay cost and energy consumption cost for executing all tasks at a certain moment; chi shape1,χ2Respectively representing the weight occupied by the task completion time delay and the terminal energy consumption, and having x12∈[0,1],χ1+χ 21. The orientation of the sub-utilities is determined by adjusting this parameter during the training process, e.g. with more focus on execution latency in latency sensitive scenarios and with more focus on energy consumption in energy limited devices. The objective optimization function P satisfies the following constraints:
C1: $x_{i,j} \in \{0, s\},\ \forall j \in Task_i$

C2: $x_{i,0} = 0,\ x_{i,n+1} = 0$

C3: $T_{i,j}^{start} \geq T_{i,j-1}^{finish},\ \forall j$

C4: $\sum_{k} ch_{i,j}^{k} \leq 1,\ \forall i, j$

C5: $\sum_{(i,j):\,x_{i,j}=s} f_{i,j} \leq F_s^{max},\ \forall s$

C6: $0 < p_i \leq p_i^{max},\ \forall i$
Constraint C1 indicates that the execution position of an application subtask is either 0 (local) or a server s; C2 indicates that the entry and exit subtasks of a task can only be executed locally; constraint C3 ensures that a subtask (i, j) can begin execution only after its predecessor subtask (i, j−1) completes; constraint C4 limits each subtask to select at most one sub-channel for transmitting data to the server; constraint C5 requires that the total amount of computing resources allocated to all subtasks offloaded to edge server s must not exceed its maximum resource capacity; constraint C6 requires that the transmit power at which a terminal device uploads data to the edge server must not exceed its maximum transmit power.
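A minimal sketch of checking these constraints for one application follows; the function, argument names and encoding are hypothetical illustrations, not the patent's formulation, and the precedence constraint C3 is assumed to be enforced by the scheduler rather than checked here:

```python
def check_constraints(x, ch, f_alloc, p, f_max, p_max):
    """Simplified sketch of constraints C1-C6 for one application.
    x: execution position per subtask, entry/exit included (0 = local, else server id);
    ch: per-subtask sub-channel indicator rows; f_alloc: server CPU granted to the
    offloaded subtasks; p: terminal transmit power. C1 is implicit in the encoding
    of x; C3 (precedence) is enforced by the scheduler and omitted here."""
    if x[0] != 0 or x[-1] != 0:            # C2: entry/exit subtasks run locally
        return False
    if any(sum(row) > 1 for row in ch):    # C4: at most one sub-channel per subtask
        return False
    if sum(f_alloc) > f_max:               # C5: server computing capacity
        return False
    return 0 <= p <= p_max                 # C6: transmit power cap
```
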
In the complex scenario of multiple users and multiple servers, the optimization problem must consider not only each device's offloading decision but also the edge servers' resource allocation to the subtasks; the two are mutually coupled and mutually influencing. Together with the dependency constraints of the tasks themselves, this makes the offloading problem very difficult.
S3, under the scene of multiple servers, establishing a Markov game model according to the cooperative competition relationship of multiple terminals to wireless communication and computing resources
According to the objective optimization function defined in step S2, the invention aims to solve for the optimal offloading policy, sub-channel selection policy, transmit power and computational resource allocation policy that minimize the system cost during task execution. Each end user can only observe local information and learns channel state information through server feedback, so a multi-agent Markov game, also called a stochastic game, is formed.
The stochastic game framework is well suited to the multi-terminal multi-server edge offloading scenario. Multiple self-interested terminals select offloading strategies in a distributed manner without sharing information. After a terminal performs its action, it obtains the reward value fed back by the environment and enters the next state, which depends on the joint action of all terminals. In the time-varying environment this process repeats continuously and is expected to converge to a Nash equilibrium, in which no terminal can obtain higher revenue by unilaterally changing its strategy, and the network parameters and the system's long-term discounted reward are optimized. In the considered multi-terminal scenario, when multiple terminals autonomously select offloading behaviors according to their policies, they compete for limited channel and server resources, each striving to maximize its own revenue. By definition, the decisions among the terminals in this scenario form a non-cooperative game. Each terminal treats all changes other than its own as part of the environment, regardless of the benefit of the other terminals. In the non-cooperative game, the offloading behaviors of all terminals mutually constrain and influence each other.
The task offloading and resource allocation decision process of each end user is modeled as a Markov Decision Process (MDP). At each time slot θ, the end user observes its local environment state $st_i(\theta) \in ST_i$ and then independently takes an action $a_i(\theta) \in A_i$ according to the strategy adopted by the algorithm. Each agent receives a reward $r_i(\theta) = r_i(st_i(\theta), a_1(\theta), \dots, a_N(\theta))$ from the environment according to the task execution situation and, based on the actions of all relevant agents, transitions to a new state $st_i(\theta+1) \in ST_i$. The future state in an MDP depends only on the current state and is independent of historical states. At any time slot θ, the goal of each end user is to take the best action that maximizes its long-term reward.
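The per-slot interaction just described (observe the state, act, receive a reward, transition) can be sketched generically; the toy environment, its cost values and the policy below are invented for illustration only:

```python
def rollout(step_fn, policy, init_state, slots):
    """Generic MDP loop: observe state, choose action, collect reward, transition."""
    st, total = init_state, 0.0
    for _ in range(slots):
        a = policy(st)             # action from the current (local) observation
        st, r = step_fn(st, a)     # environment feedback: next state and reward
        total += r
    return st, total

# Toy environment (hypothetical): state = number of pending sub-tasks; action 1
# "offloads" one sub-task at cost 1, action 0 waits at cost 2.
step = lambda st, a: (max(st - a, 0), -1.0 if a == 1 else -2.0)
final_state, total_reward = rollout(step, lambda st: 1 if st > 0 else 0,
                                    init_state=3, slots=5)
```
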
The exact definitions of the state space, action space and reward function in the Markov game are given below:
1) State space: the state space of user i is defined as $st_i(\theta)$, comprising the state information of user i, the other users and the MEC servers, such as remaining channel resources and computing resources. Thus, the state space of the system is defined as:

$$ST(\theta) = \{st_1(\theta), \dots, st_i(\theta), \dots, st_N(\theta)\}$$

where $st_i(\theta) = \{st_{i,1}(\theta), \dots, st_{i,j}(\theta), \dots, st_{i,n}(\theta)\},\ i \in MDs,\ j \in Task_i$.
2) Action space: for user i, the action $a_{i,j}(\theta)$ includes the offloading decision for subtask j, the transmit power, the uplink channel allocated by the MEC server, and the computational resources allocated by the MEC server. The action space of the system is thus defined as:

$$A(\theta) = \{a_1(\theta), \dots, a_i(\theta), \dots, a_N(\theta)\}$$

where $a_i(\theta) = \{a_{i,1}(\theta), \dots, a_{i,j}(\theta), \dots, a_{i,n}(\theta)\},\ i \in MDs,\ j \in Task_i$.
In the multi-terminal IoT edge computing network under consideration, each end user i is treated as an agent whose action at each time slot θ comprises the offloading decision $X_i$, the sub-channel selection $CH_i$, the transmit power level $P_i$ and the allocated computational resource $F_i$, i.e. $a_i(\theta) \in A_i = X_i \times CH_i \times P_i \times F_i$. Therefore, the action space of the computation offloading game is:

$$A = A_1 \times \dots \times A_N$$
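The Cartesian-product structure of one agent's action space can be made concrete with a small discretisation; the particular values of each dimension below are assumptions for illustration, not taken from the patent:

```python
from itertools import product

# Hypothetical discretisation of each decision dimension (example values):
X_i  = [0, 1]           # offloading decision: 0 = local, 1 = offload
CH_i = [0, 1, 2]        # sub-channel index
P_i  = [0.1, 0.2]       # transmit power levels (W)
F_i  = [1e9, 2e9]       # allocated server CPU cycles per second

# A_i = X_i x CH_i x P_i x F_i, as in the text above
A_i = list(product(X_i, CH_i, P_i, F_i))
```

With these sizes the agent faces |A_i| = 2 × 3 × 2 × 2 = 24 joint actions, which illustrates why the joint space over N users grows combinatorially.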
3) Reward function: the reward is the environment's feedback to an agent after the agent takes an action. The design of the reward function $r_i(\theta)$ directly guides the learning process. The invention aims to minimize the task execution cost of each user terminal subject to the servers' resource limits and the task execution delay threshold. Specifically, the system cost is treated as a negative reward in this problem, so the long-term cost must be minimized here. Rewards are set according to the constraints and goals of the tasks, including priority of task completion, delay constraints and energy consumption; the algorithm ensures that the allocated resources let higher-priority terminal applications finish execution earlier, and the lower the task delay and energy consumption, the higher the reward.
Next, by selecting an appropriate action at each time slot, we consider minimizing the long-term reward $v_i(\theta)$:

$$v_i(\theta) = \sum_{\tau=0}^{\infty} \lambda^{\tau} r_i(\theta + \tau)$$

where $\lambda \in [0,1]$ is the discount factor, $v_i(\theta)$ is the sum of long-term discounted rewards, which can be used to measure the actions taken by end user i, and τ is the slot index counted from slot θ.
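The discounted sum above is straightforward to compute for a finite reward trace; this small sketch uses made-up reward values:

```python
def discounted_return(rewards, lam):
    """v(theta) = sum over tau of lam**tau * r(theta + tau)."""
    return sum((lam ** tau) * r for tau, r in enumerate(rewards))

v = discounted_return([1.0, 1.0, 1.0], 0.5)   # 1 + 0.5 + 0.25
```
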
The optimal computation offload problem for end user i is then expressed as:
Figure BDA0003274258310000122
the design of the computation offload scheme for multi-terminal edge computing networks contains the above-mentioned N sub-problems, which correspond to all the sub-tasks of the N end users. Each end-user does not have the status and off-loading information of other end-users, so the present invention first models the optimization problem using non-cooperative random gaming, and then proposes a multi-agent reinforcement learning framework to solve the problem.
S4, based on known partial system state information, each end user independently executes a reinforcement learning algorithm to determine its task offloading strategy and resource allocation amount, solving the game problem
The multi-agent deep deterministic policy gradient (MADDPG) method is used to find the best policy for the MDP. The core of MADDPG is the Actor-Critic architecture, as shown in FIG. 4. The Critic of each agent can access the action information of all other agents: a Critic with global observation is introduced during training to guide Actor training, while during testing actions are taken using only an Actor with local observation. In short, training is centralized and offline; execution is decentralized and online.
Critic network: the Critic network is based on a Value-based function, i.e., a Q function. The inputs to the Critic network include the current state, the selected action, and the next state. Critic is a multi-layer fully-connected neural network structure. The Critic network adopts a Temporal-Difference update mode, namely, after a new round of training is started, parameters are updated after waiting for the end of the round. Critic estimates the value of each state-action and feeds back the time difference value to the Actor. Calculation taking into account the time difference: td _ error ═ r + λ × Q (st', a) -Q (st, a). The loss function of the Critic network is defined as the square value of the time difference, and the loss function guides the updating process of the parameters.
Actor network: the training of the Actor network is policy-based; it outputs an action, or a probability over actions, according to the input state. The Actor is also a multi-layer fully connected neural network. The network adopts a Monte-Carlo-style update mode, performed after each executed action without waiting for the episode to end. The loss function of the Actor is designed from the TD error calculated by the Critic. The Actor selects actions according to the probabilities output by the softmax function, updates its parameters according to the Critic's score, and modifies the action selection probabilities.
The Actor selects and executes an action according to the current state. The Critic scores the Actor's performance according to the current state and the environmental reward produced by the action. In the initial stage of learning, the Actor selects actions randomly and the Critic scores them randomly. Thanks to the environmental feedback, i.e., the reward function, the Critic's scoring becomes more accurate and the Actor performs better and better. In the parameter update stage, the Actor updates its action strategy, namely the Actor network parameters, according to the Critic's score; the Critic adjusts its own scoring strategy and network parameters by computing the Q value against the reward given by the system. Actor-Critic thus involves two neural networks that interact and iterate, continuously updating their parameters and improving network performance.
In the invention, each end user runs an independent Actor-Critic algorithm to learn its own optimal strategy. Specifically, the selection of the optimal action depends on the Q function, defined as the optimal expected value of taking action $a_i$ in state $st_i$. Since the state transition probabilities are difficult to obtain in practice, the average return over many samples can approximately represent the expected cumulative reward; this is the Monte-Carlo learning approach, which samples the same Q function under different strategies. However, Monte-Carlo learning becomes complex because it must sample complete interaction episodes to compute the mean return. Temporal-Difference learning is therefore used to recursively update the Q-value function, learning each estimate on the basis of other estimates, expressed as:
$$Q(st_i, a_i) \leftarrow Q(st_i, a_i) + \alpha \left[ r_i + \lambda \max_{a'} Q(st_i', a') - Q(st_i, a_i) \right]$$
where $\max_{a'} Q(st_i', a')$ denotes the best cumulative benefit at the next time slot and α is the learning rate. To ensure the convergence of Q-learning, the learning rate $\alpha_k$ is set as:
$$\alpha_k = \alpha_{ini} - \frac{(\alpha_{ini} - \alpha_{end}) \, k}{\epsilon}$$
where $\alpha_{ini}$ and $\alpha_{end}$ are respectively the initial and final values of α, and $\epsilon$ is the maximum number of iterations of the learning algorithm.
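A decay of this kind can be sketched as below; the linear form and the clamping after the final iteration are assumptions for illustration, and the numeric endpoints are example values:

```python
def alpha_k(k, alpha_ini, alpha_end, max_iter):
    """Learning rate decayed from alpha_ini to alpha_end over max_iter iterations
    (linear form assumed)."""
    frac = min(k / max_iter, 1.0)       # clamp once the final iteration is reached
    return alpha_ini - (alpha_ini - alpha_end) * frac
```
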
To avoid model degradation caused by vanishing or exploding gradients, the invention adopts an experience replay strategy. Experience data obtained while the agent explores the environment are stored in an experience pool, and network parameters are updated by random sampling during subsequent deep neural network training. The experience pool of user i may be written $M_i = \{m_i - M + 1, \dots, m_i\}$, where M is the size of the pool; each stored experience tuple is represented as:

$$\left( st_i(\theta),\ a_i(\theta),\ r_i(\theta),\ st_i(\theta+1) \right)$$
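A fixed-size experience pool holding such (state, action, reward, next-state) tuples can be sketched as follows; the class name, capacity and stored values are illustrative assumptions:

```python
from collections import deque
import random

class ReplayBuffer:
    """Fixed-size experience pool; the oldest tuples are evicted when full."""
    def __init__(self, capacity, seed=0):
        self.buf = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, st, a, r, st_next):
        self.buf.append((st, a, r, st_next))

    def sample(self, batch_size):
        """Uniform random mini-batch, which decorrelates training updates."""
        return self.rng.sample(list(self.buf), batch_size)

rb = ReplayBuffer(capacity=2)
for t in range(3):               # the third store evicts the oldest tuple
    rb.store(t, 0, -1.0, t + 1)
```
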
An ε-greedy method is adopted as the action selection strategy, mainly to balance exploration and exploitation in reinforcement learning: the agent selects the optimal action corresponding to the maximum Q function with probability 1 − ε, and selects a random action with probability ε ∈ [0, 1].
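The ε-greedy rule amounts to a one-line branch; the Q-values in the example are arbitrary:

```python
import random

def epsilon_greedy(q_values, eps, rng=random):
    """With probability eps explore (random action); otherwise exploit (argmax Q)."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```
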
In summary, the invention adopts a distributed intelligent reinforcement learning algorithm to dynamically determine the offloading strategy, sub-channel selection, transmit power and multi-server computing resource allocation for multi-terminal divisible serial tasks, thereby optimizing task execution delay and terminal energy consumption and improving system efficiency.
The invention is oriented to a multi-terminal multi-server scene, fully considers the competition relationship of the multi-terminal and the coupling relationship of task unloading and resource allocation decision, solves the problem of multi-terminal task unloading when the edge server communication and the computing resources are limited, and aims to reduce the average task completion time and the average terminal energy consumption by establishing a computing model and an energy consumption model.
The invention establishes a divisible serial task model and designs an intelligent unloading strategy. On the premise of meeting the dependency relationship among the subtasks, the strategy reasonably schedules the subtasks, makes full use of fragmented resources of the local user and the server, and improves performance and user experience.
The method defines task priorities to represent tasks' different degrees of time urgency, builds the system cost from three factors (task priority, task execution delay and task execution energy consumption), normalizes the multi-objective optimization into a single-objective optimization by linear weighting, and models it as a mixed-integer nonlinear programming problem.
The invention designs a distributed resource allocation algorithm based on multi-agent reinforcement learning, which aims to minimize system cost and explore an optimal unloading strategy, so that a terminal user can achieve a balance state in a self-organizing manner in a time-varying environment. Each user learns and adapts to the environmental data as a separate agent, treating other users as part of the environment.
Experiments show that, compared with the traditional atomic-task 0-1 offloading strategy and the traditional centralized task offloading algorithm, the method achieves lower task execution cost, i.e., effectively reduces delay and energy consumption, under scenarios with different numbers of user tasks and different numbers of edge servers. In addition, when different priorities and maximum tolerable delays are set for the tasks, it can be observed that tasks with higher priority are scheduled to execute earlier, and under a given time constraint the task completion rate of the algorithm is the highest.

Claims (7)

1. A mobile edge network intelligent resource allocation method capable of dividing tasks is characterized by comprising the following steps:
(1) dividing a serial task generated by a terminal to obtain a plurality of subtasks, and establishing an unloading task model;
(2) respectively establishing a time delay model and an energy consumption model for the subtasks according to two execution modes of local execution or unloading, and defining an unloading joint target optimization function based on the multi-user serial dependent task;
(3) under a multi-server scene, establishing a Markov game model according to the cooperative competition relationship of multiple users to wireless communication and computing resources, and optimizing the unloading combined target optimization function;
(4) in a time-varying environment, each terminal is used as an independent agent to execute a reinforcement learning algorithm to solve the Markov game model based on part of system state information, and an unloading strategy, sub-channel selection, transmitting power and resource allocation amount are determined.
2. The method for intelligent resource allocation of a mobile edge network capable of dividing tasks according to claim 1, wherein there is an interdependence relationship between the plurality of sub-tasks in the step (1), and there is data interaction between the plurality of sub-tasks.
3. The method for intelligent resource allocation of a mobile edge network capable of dividing tasks according to claim 1, wherein when the unloading task model is established in step (1), it is specified that each subtask can only be unloaded to a certain MEC server for execution, but different subtasks in an application can be unloaded to different MEC servers; when an adjacent subtask is offloaded to the same or a different MEC server, the output data of the previous subtask is transferred to the MEC server to which the next subtask is offloaded through a wired connection.
4. The method of claim 1, wherein the offloading joint objective optimization function in step (2) is P:

$$P:\ \min \sum_{i} \delta_i \left( \chi_1 T_i + \chi_2 E_i \right)$$

wherein $T_i$ denotes the delay for completing the i-th task, $E_i$ the terminal energy consumption for completing the i-th task, $\delta_i$ the priority of the i-th task, and $\chi_1, \chi_2$ the weights of delay and energy consumption, with $\chi_1, \chi_2 \in [0,1]$ and $\chi_1 + \chi_2 = 1$; the offloading joint objective optimization function satisfies the following constraint conditions: constraint condition 1, the execution position of an application subtask is the local device or an edge server; constraint 2, the entry subtask and the exit subtask of a task can only be executed locally; constraint 3, a subtask can start executing only when its predecessor subtask has finished; constraint 4, each subtask can select only one sub-channel to transmit data to the server; constraint 5, the total amount of computing resources allocated to all subtasks offloaded to an edge server must not exceed its maximum resource capacity; constraint 6, the transmit power of the terminal device when uploading data to the edge server must not exceed its maximum transmit power.
5. The method for allocating intelligent resources of a mobile edge network capable of dividing tasks according to claim 1, wherein the step (3) is specifically as follows: determining a known state space, an action space and a reward function; modeling a task unloading and resource allocation decision process of the multiple terminals into a Markov decision process, namely, in each time slot, the terminals observe the local environment state of the terminals and then independently take action according to different strategies adopted by the local environment state; according to the task execution condition, each agent can obtain the reward of environment feedback, and the agent is transferred to a new state according to the actions of all related agents; the decision process for all coupled terminals is modeled as a markov game process, i.e. at any time slot, each terminal is targeted to take the best action while maximizing the long-term prize.
6. The method for allocating intelligent resources of a mobile edge network capable of dividing tasks according to claim 1, wherein the step (4) is specifically as follows: each terminal is used as an independent intelligent agent, and all changes except the terminal is used as an environment; each terminal independently operates an Actor-Critic reinforcement learning framework; all terminals are trained based on current partial environment data, and an optimal unloading and resource allocation strategy is selected through a reinforcement learning algorithm, so that a convergence state is achieved; and the terminal distributes the subtasks to the server nodes specified by the unloading strategy according to the unloading strategy and obtains the appropriate resource amount based on the resource distribution strategy.
7. The intelligent resource allocation method for a task-partitionable mobile edge network according to claim 6, wherein the Actor-Critic reinforcement learning framework comprises a Critic network and an Actor network; the training of the Critic network is value-based, and the inputs of the Critic network comprise the current state, the selected action and the next state; the Critic network adopts a Temporal-Difference update mode, i.e., parameters are updated at each time step without waiting for the end of the episode; the Critic network estimates the value of each state-action pair and feeds the TD error back to the Actor network; the loss function of the Critic network is defined as the square of the TD error, and this loss function guides the parameter update process; the training of the Actor network is policy-based, outputting an action or the probability of actions according to the input state, wherein the Actor network is updated once after each action is executed; the loss function of the Actor network is designed based on the TD error calculated by the Critic network.
CN202111112170.5A 2021-09-23 2021-09-23 Mobile edge network intelligent resource allocation method capable of dividing tasks Pending CN113873022A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111112170.5A CN113873022A (en) 2021-09-23 2021-09-23 Mobile edge network intelligent resource allocation method capable of dividing tasks

Publications (1)

Publication Number Publication Date
CN113873022A true CN113873022A (en) 2021-12-31

Family

ID=78993284

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111112170.5A Pending CN113873022A (en) 2021-09-23 2021-09-23 Mobile edge network intelligent resource allocation method capable of dividing tasks

Country Status (1)

Country Link
CN (1) CN113873022A (en)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114116050A (en) * 2021-11-16 2022-03-01 天津市英贝特航天科技有限公司 Selective unloading method and system for edge calculation
CN114217881A (en) * 2022-02-23 2022-03-22 北京航空航天大学杭州创新研究院 Task unloading method and related device
CN114363338A (en) * 2022-01-07 2022-04-15 山东大学 Optimization method of multi-access edge computing network task unloading strategy based on competitive cooperation mean field game
CN114390057A (en) * 2022-01-13 2022-04-22 南昌工程学院 Multi-interface self-adaptive data unloading method based on reinforcement learning under MEC environment
CN114490057A (en) * 2022-01-24 2022-05-13 电子科技大学 MEC unloaded task resource allocation method based on deep reinforcement learning
CN114599041A (en) * 2022-01-13 2022-06-07 浙江大学 Method for integrating calculation and communication
CN114615705A (en) * 2022-03-11 2022-06-10 广东技术师范大学 Single user resource allocation strategy method based on 5G network
CN114637608A (en) * 2022-05-17 2022-06-17 之江实验室 Calculation task allocation and updating method, terminal and network equipment
CN114745396A (en) * 2022-04-12 2022-07-12 广东技术师范大学 Multi-agent-based end edge cloud 3C resource joint optimization method
CN114745386A (en) * 2022-04-13 2022-07-12 浙江工业大学 Neural network segmentation and unloading method under multi-user edge intelligent scene
CN114785782A (en) * 2022-03-29 2022-07-22 南京工业大学 Heterogeneous cloud-edge computing-oriented general task unloading method
CN114884949A (en) * 2022-05-07 2022-08-09 重庆邮电大学 Low-orbit satellite Internet of things task unloading method based on MADDPG algorithm
CN114900518A (en) * 2022-04-02 2022-08-12 中国光大银行股份有限公司 Task allocation method, device, medium and electronic equipment for directed distributed network
CN115002123A (en) * 2022-05-25 2022-09-02 西南交通大学 Fast adaptive task unloading system and method based on mobile edge calculation
CN115002409A (en) * 2022-05-20 2022-09-02 天津大学 Dynamic task scheduling method for video detection and tracking
CN115037749A (en) * 2022-06-08 2022-09-09 山东省计算中心(国家超级计算济南中心) Performance-aware intelligent multi-resource cooperative scheduling method and system for large-scale micro-service
CN115190033A (en) * 2022-05-22 2022-10-14 重庆科技学院 Cloud edge fusion network task unloading method based on reinforcement learning
CN115190135A (en) * 2022-06-30 2022-10-14 华中科技大学 Distributed storage system and copy selection method thereof
CN115225496A (en) * 2022-06-28 2022-10-21 重庆锦禹云能源科技有限公司 Mobile sensing service unloading fault-tolerant method based on edge computing environment
CN115243217A (en) * 2022-07-07 2022-10-25 中山大学 DDQN-based end edge cloud collaborative scheduling method and system in Internet of vehicles edge environment
CN115841590A (en) * 2022-11-16 2023-03-24 中国烟草总公司湖南省公司 Neural network reasoning optimization method, device, equipment and readable storage medium
CN115955685A (en) * 2023-03-10 2023-04-11 鹏城实验室 Multi-agent cooperative routing method, equipment and computer storage medium
CN116346921A (en) * 2023-03-29 2023-06-27 华能澜沧江水电股份有限公司 Multi-server collaborative cache updating method and device for security management and control of river basin dam
CN117255126A (en) * 2023-08-16 2023-12-19 广东工业大学 Data-intensive task edge service combination method based on multi-objective reinforcement learning
WO2024060571A1 (en) * 2022-09-21 2024-03-28 之江实验室 Heterogeneous computing power-oriented multi-policy intelligent scheduling method and apparatus
CN117806806A (en) * 2024-02-28 2024-04-02 湖南科技大学 Task part unloading scheduling method, terminal equipment and storage medium
WO2024065903A1 (en) * 2022-09-29 2024-04-04 福州大学 Joint optimization system and method for computation offloading and resource allocation in multi-constraint-edge environment
CN117939505A (en) * 2024-03-22 2024-04-26 南京邮电大学 Edge collaborative caching method and system based on excitation mechanism in vehicle edge network
CN117931461A (en) * 2024-03-25 2024-04-26 荣耀终端有限公司 Scheduling method of computing resources, training method of strategy network and device
CN116346921B (en) * 2023-03-29 2024-06-11 华能澜沧江水电股份有限公司 Multi-server collaborative cache updating method and device for security management and control of river basin dam

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102812450A (en) * 2009-10-30 2012-12-05 时代华纳有线公司 Methods And Apparatus For Packetized Content Delivery Over A Content Delivery Network
US20180357552A1 (en) * 2016-01-27 2018-12-13 Bonsai AI, Inc. Artificial Intelligence Engine Having Various Algorithms to Build Different Concepts Contained Within a Same AI Model
CN110535700A (en) * 2019-08-30 2019-12-03 哈尔滨工程大学 A kind of calculating discharging method under multi-user's multiple edge server scene
CN110928691A (en) * 2019-12-26 2020-03-27 广东工业大学 Traffic data-oriented edge collaborative computing unloading method
CN111918339A (en) * 2020-07-17 2020-11-10 西安交通大学 AR task unloading and resource allocation method based on reinforcement learning in mobile edge network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Li Chen, "Research on Task Offloading Mechanisms in Edge Computing", China Masters' Theses Full-text Database, Information Science and Technology Series *
Lu Jing et al., "Task Partitioning and Optimal Offloading Algorithm Design for Mobile Edge Computing", Chinese Journal on Internet of Things *

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114116050A (en) * 2021-11-16 2022-03-01 天津市英贝特航天科技有限公司 Selective offloading method and system for edge computing
CN114363338A (en) * 2022-01-07 2022-04-15 山东大学 Optimization method for multi-access edge computing network task offloading strategy based on a competitive-cooperative mean-field game
CN114363338B (en) * 2022-01-07 2023-01-31 山东大学 Optimization method for multi-access edge computing network task offloading strategy based on a competitive-cooperative mean-field game
CN114390057B (en) * 2022-01-13 2024-04-05 南昌工程学院 Multi-interface adaptive data offloading method based on reinforcement learning in an MEC environment
CN114390057A (en) * 2022-01-13 2022-04-22 南昌工程学院 Multi-interface adaptive data offloading method based on reinforcement learning in an MEC environment
CN114599041A (en) * 2022-01-13 2022-06-07 浙江大学 Method for integrating computation and communication
CN114599041B (en) * 2022-01-13 2023-12-05 浙江大学 Method for integrating computation and communication
CN114490057B (en) * 2022-01-24 2023-04-25 电子科技大学 MEC offloaded task resource allocation method based on deep reinforcement learning
CN114490057A (en) * 2022-01-24 2022-05-13 电子科技大学 MEC offloaded task resource allocation method based on deep reinforcement learning
CN114217881A (en) * 2022-02-23 2022-03-22 北京航空航天大学杭州创新研究院 Task offloading method and related device
CN114615705A (en) * 2022-03-11 2022-06-10 广东技术师范大学 Single-user resource allocation strategy method based on 5G network
CN114615705B (en) * 2022-03-11 2022-12-20 广东技术师范大学 Single-user resource allocation strategy method based on 5G network
CN114785782A (en) * 2022-03-29 2022-07-22 南京工业大学 General task offloading method for heterogeneous cloud-edge computing
CN114900518A (en) * 2022-04-02 2022-08-12 中国光大银行股份有限公司 Task allocation method, device, medium and electronic equipment for directed distributed network
CN114745396A (en) * 2022-04-12 2022-07-12 广东技术师范大学 Multi-agent-based end-edge-cloud 3C resource joint optimization method
CN114745396B (en) * 2022-04-12 2024-03-08 广东技术师范大学 Multi-agent-based end-edge-cloud 3C resource joint optimization method
CN114745386B (en) * 2022-04-13 2024-05-03 浙江工业大学 Neural network partitioning and offloading method for multi-user edge intelligence scenarios
CN114745386A (en) * 2022-04-13 2022-07-12 浙江工业大学 Neural network partitioning and offloading method for multi-user edge intelligence scenarios
CN114884949B (en) * 2022-05-07 2024-03-26 深圳泓越信息科技有限公司 Task offloading method for low-earth-orbit satellite Internet of Things based on the MADDPG algorithm
CN114884949A (en) * 2022-05-07 2022-08-09 重庆邮电大学 Task offloading method for low-earth-orbit satellite Internet of Things based on the MADDPG algorithm
WO2023221353A1 (en) * 2022-05-17 2023-11-23 之江实验室 Computing task assignment method, computing task updating method, terminal and network device
CN114637608A (en) * 2022-05-17 2022-06-17 之江实验室 Computing task allocation and updating method, terminal and network device
CN115002409B (en) * 2022-05-20 2023-07-28 天津大学 Dynamic task scheduling method for video detection and tracking
CN115002409A (en) * 2022-05-20 2022-09-02 天津大学 Dynamic task scheduling method for video detection and tracking
CN115190033A (en) * 2022-05-22 2022-10-14 重庆科技学院 Cloud-edge fusion network task offloading method based on reinforcement learning
CN115190033B (en) * 2022-05-22 2024-02-20 重庆科技学院 Cloud-edge fusion network task offloading method based on reinforcement learning
CN115002123A (en) * 2022-05-25 2022-09-02 西南交通大学 Fast adaptive task offloading system and method based on mobile edge computing
CN115002123B (en) * 2022-05-25 2023-05-05 西南交通大学 Fast adaptive task offloading system and method based on mobile edge computing
CN115037749B (en) * 2022-06-08 2023-07-28 山东省计算中心(国家超级计算济南中心) Large-scale micro-service intelligent multi-resource collaborative scheduling method and system
CN115037749A (en) * 2022-06-08 2022-09-09 山东省计算中心(国家超级计算济南中心) Performance-aware intelligent multi-resource cooperative scheduling method and system for large-scale micro-service
CN115225496A (en) * 2022-06-28 2022-10-21 重庆锦禹云能源科技有限公司 Fault-tolerant mobile sensing service offloading method for edge computing environments
CN115190135B (en) * 2022-06-30 2024-05-14 华中科技大学 Distributed storage system and copy selection method thereof
CN115190135A (en) * 2022-06-30 2022-10-14 华中科技大学 Distributed storage system and copy selection method thereof
CN115243217B (en) * 2022-07-07 2023-07-18 中山大学 DDQN-based end-edge-cloud collaborative scheduling method and system in Internet of Vehicles edge environment
CN115243217A (en) * 2022-07-07 2022-10-25 中山大学 DDQN-based end-edge-cloud collaborative scheduling method and system in Internet of Vehicles edge environment
WO2024060571A1 (en) * 2022-09-21 2024-03-28 之江实验室 Heterogeneous computing power-oriented multi-policy intelligent scheduling method and apparatus
WO2024065903A1 (en) * 2022-09-29 2024-04-04 福州大学 Joint optimization system and method for computation offloading and resource allocation in multi-constraint-edge environment
CN115841590B (en) * 2022-11-16 2023-10-03 中国烟草总公司湖南省公司 Neural network inference optimization method, device, equipment and readable storage medium
CN115841590A (en) * 2022-11-16 2023-03-24 中国烟草总公司湖南省公司 Neural network inference optimization method, device, equipment and readable storage medium
CN115955685A (en) * 2023-03-10 2023-04-11 鹏城实验室 Multi-agent cooperative routing method, equipment and computer storage medium
CN115955685B (en) * 2023-03-10 2023-06-20 鹏城实验室 Multi-agent cooperative routing method, equipment and computer storage medium
CN116346921A (en) * 2023-03-29 2023-06-27 华能澜沧江水电股份有限公司 Multi-server collaborative cache updating method and device for security management and control of river basin dam
CN116346921B (en) * 2023-03-29 2024-06-11 华能澜沧江水电股份有限公司 Multi-server collaborative cache updating method and device for security management and control of river basin dam
CN117255126A (en) * 2023-08-16 2023-12-19 广东工业大学 Data-intensive task edge service combination method based on multi-objective reinforcement learning
CN117806806A (en) * 2024-02-28 2024-04-02 湖南科技大学 Partial task offloading scheduling method, terminal equipment and storage medium
CN117806806B (en) * 2024-02-28 2024-05-17 湖南科技大学 Partial task offloading scheduling method, terminal equipment and storage medium
CN117939505B (en) * 2024-03-22 2024-05-24 南京邮电大学 Incentive-mechanism-based edge collaborative caching method and system in vehicular edge networks
CN117939505A (en) * 2024-03-22 2024-04-26 南京邮电大学 Incentive-mechanism-based edge collaborative caching method and system in vehicular edge networks
CN117931461A (en) * 2024-03-25 2024-04-26 荣耀终端有限公司 Scheduling method for computing resources, and training method and device for a policy network

Similar Documents

Publication Publication Date Title
CN113873022A (en) Mobile edge network intelligent resource allocation method capable of dividing tasks
Wei et al. Joint optimization of caching, computing, and radio resources for fog-enabled IoT using natural actor–critic deep reinforcement learning
CN112367353B (en) Mobile edge computing unloading method based on multi-agent reinforcement learning
CN111800828B (en) Mobile edge computing resource allocation method for ultra-dense network
CN113573324B (en) Cooperative task unloading and resource allocation combined optimization method in industrial Internet of things
CN114285853B (en) Task unloading method based on end edge cloud cooperation in equipment-intensive industrial Internet of things
CN113543176A (en) Offloading decision method for an intelligent-reflecting-surface-assisted mobile edge computing system
Chen et al. Cache-assisted collaborative task offloading and resource allocation strategy: A metareinforcement learning approach
Wang et al. Optimization for computational offloading in multi-access edge computing: A deep reinforcement learning scheme
CN114641076A (en) Edge computing offloading method based on dynamic user satisfaction in ultra-dense networks
CN113590279A (en) Task scheduling and resource allocation method for multi-core edge computing server
Fan et al. Joint task offloading and resource allocation for accuracy-aware machine-learning-based IIoT applications
Zhang et al. A deep reinforcement learning approach for online computation offloading in mobile edge computing
Shang et al. Computation offloading and resource allocation in NOMA-MEC: A deep reinforcement learning approach
Zhu et al. Learn and pick right nodes to offload
CN113821346B (en) Edge computing offloading and resource management method based on deep reinforcement learning
Li et al. Computation offloading strategy for improved particle swarm optimization in mobile edge computing
CN112445617B (en) Load strategy selection method and system based on mobile edge computing
Jiang et al. A collaborative optimization strategy for computing offloading and resource allocation based on multi-agent deep reinforcement learning
CN117195728A (en) Complex mobile task deployment method based on graph-to-sequence reinforcement learning
Li et al. Graph Tasks Offloading and Resource Allocation in Multi-Access Edge Computing: A DRL-and-Optimization-Aided Approach
Fang et al. Dependency-Aware Dynamic Task Offloading Based on Deep Reinforcement Learning in Mobile Edge Computing
Liu et al. Computation offloading optimization in mobile edge computing based on HIBSA
Chen et al. Efficient Task Scheduling and Resource Allocation for AI Training Services in Native AI Wireless Networks
CN117539640B (en) Edge-end collaborative system and resource allocation method for heterogeneous inference tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211231