CN111800828B - Mobile edge computing resource allocation method for ultra-dense network - Google Patents


Info

Publication number
CN111800828B
CN111800828B (application CN202010597779.5A)
Authority
CN
China
Prior art date
Legal status: Active
Application number
CN202010597779.5A
Other languages
Chinese (zh)
Other versions
CN111800828A (en)
Inventor
李立欣
程倩倩
张敬敏
王大伟
李旭
梁微
林文晟
李煊
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202010597779.5A
Publication of CN111800828A
Application granted
Publication of CN111800828B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 28/00: Network traffic management; Network resource management
    • H04W 28/16: Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 72/00: Local resource management
    • H04W 72/20: Control channels or signalling for resource management
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00: Reducing energy consumption in communication networks
    • Y02D 30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a mobile edge computing resource allocation method for an ultra-dense network. A NOMA-MEC communication system in the ultra-dense network comprises a set M = {1, 2, …, M} of small base stations, each of which is equipped with an MEC server to execute the computing tasks offloaded by users. Assuming that the set of users served by each small base station is N = {1, 2, …, N}, the N users are divided into Y = {1, 2, …, Y} groups with K = {1, 2, …, K} users in each group. The method solves the problem that the prior art struggles to handle mutual interference among users, which degrades the users' computing performance.

Description

Mobile edge computing resource allocation method for ultra-dense network
[ field of technology ]
The invention belongs to the technical field of wireless communication, and particularly relates to a mobile edge computing resource allocation method of an ultra-dense network.
[ background Art ]
With the rapid development of fifth-generation (5G) mobile communication technology, the deployment of ultra-dense networks (UDNs) has become a major architecture for future systems. A UDN can effectively improve system capacity and data transmission rate to guarantee user quality of service. However, because users have limited computing power, handling computation-intensive tasks in UDNs is a significant challenge. Mobile edge computing (MEC) has been proposed as an emerging technology to alleviate the computational pressure on users in UDNs. Specifically, MEC offloads computation-intensive tasks to the network edge to reduce the user's energy consumption and task delay.
In MEC systems, improving the utilization of spectrum resources among users is a significant challenge, as it directly affects energy consumption and task delay. As an emerging multiple-access method, non-orthogonal multiple access (NOMA) can effectively improve the spectral efficiency of a system by allocating the same resources to multiple users. Thus, in recent work NOMA has been applied to MEC systems to reduce energy consumption and task delay.
The mean-field game (MFG) is a tool suited to scenarios with a large number of game players, and it can model the relationship between individuals and the population in a UDN. Specifically, in a UDN the MFG averages the influence among members, simplifying an otherwise complex model.
The authors of Reference 1, "Learning deep mean field games for modeling large population behavior" [International Conference on Learning Representations, Vancouver, Canada, Apr. 2018], demonstrate an equilibrium solution of a mean-field game with a Markov decision process (MDP) to predict the evolution of a population's behavior over time.
Reference 2, "Collaborative Artificial Intelligence (AI) for User-Cell Association in Ultra-Dense Cellular Systems" [IEEE International Conference on Communications Workshops (ICC Workshops), Kansas City, MO, May 2018], proposes a neural Q-learning algorithm to solve the user-association problem in ultra-dense network systems.
Unlike the prior art, the present invention models a NOMA-MEC system in a UDN scenario, where each small base station (SBS) is equipped with an MEC server. When a user cannot handle a large number of computing tasks, some tasks are offloaded to the MEC server. First, a user clustering matching algorithm (UCMA) based on channel-gain differences is proposed to cluster the users and thereby improve their data rates. Then, taking the NOMA-MEC system as the model, an MFG theoretical framework is established, and the deep deterministic policy gradient (DDPG) algorithm from reinforcement learning is used to solve for the equilibrium of the MFG, so as to reduce the users' energy consumption and task delay.
[ invention ]
The invention aims to provide a mobile edge computing resource allocation method for an ultra-dense network, so as to solve the problem that the prior art struggles to handle mutual interference among users, which degrades the users' computing performance.
The technical scheme adopted by the invention is a method for allocating mobile edge computing resources of an ultra-dense network. The method is based on an ultra-dense network whose NOMA-MEC communication system comprises M = {1, 2, …, M} small base stations, each equipped with an MEC server to execute computing tasks offloaded by users; assuming that the set of users served by each small base station is N = {1, 2, …, N}, the N users are divided into Y = {1, 2, …, Y} groups with K = {1, 2, …, K} users in each group;
the resource allocation method is implemented according to the following steps:
step one, constructing an uplink NOMA-MEC communication system, wherein each SBS is provided with an MEC server to serve a plurality of users;
step two, clustering is carried out on all users in the NOMA-MEC communication system according to the difference of channel gains; the users in the clusters adopt a NOMA transmission mode, and the clusters adopt a TDMA transmission mode;
step three, calculating the calculation cost of the user, namely the time delay and the energy consumption when the user processes the task; wherein the computing costs include a local computing cost of the user and an offload computing cost;
modeling a NOMA-MEC communication system as an MFG framework; the SINR and the channel gain of the user are expressed as a state space, and the transmitting power, the unloading decision factor and the resource allocation factor of the user are expressed as an action space; constructing a reward function of the user according to the calculation cost of the user;
and fifthly, acquiring an equilibrium solution of the average field game by using a reinforcement learning method based on DDPG, namely an optimal resource allocation scheme in the mobile edge computing system.
Further, the specific method of the second step is as follows:
in the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the users with the M largest channel gains are selected in turn as the first user of each of the M NOMA clusters;
then, for each remaining user, the NOMA cluster with the largest sum of channel-gain differences to that user is selected according to a greedy matching method;
when the number of users cannot be uniformly allocated to each cluster, the surplus users are randomly allocated to different clusters, and the channel gain of each user within a cluster is distinct.
Further, the specific mode of the third step is as follows:
3.1 Cost of local computation for the user:
let x_mk denote the offloading variable of the kth user in the mth group; for the local computing model, i.e., the user completes the computing task locally without offloading it to the MEC server, assume f_mk^l > 0 represents the local computing capability of the kth user in the mth group; when the user performs the task locally, the time is:
t_mk^l = c_mk / f_mk^l (5),
where c_mk is the number of CPU cycles the task requires;
when computing the energy consumption of local computing, the commonly used computation-energy model ε = κf² per CPU cycle is adopted, where κ is an energy coefficient that depends on the chip architecture; the local energy consumption of the kth user in the mth group can then be expressed as:
E_mk^l = κ c_mk (f_mk^l)² (6);
according to formulas (5) and (6), the local computation cost of the kth user in the mth group can be expressed as:
C_mk^l = λ_mk^t t_mk^l + λ_mk^e E_mk^l (7),
where λ_mk^t and λ_mk^e are the weight coefficients of delay and energy consumption, respectively, and λ_mk^t + λ_mk^e = 1;
3.2 Offloading computational cost for the user:
the process of offloading to the MEC server for computation comprises two parts, transmission and computation at the MEC server, whose transmission time and execution time are, respectively:
t_mk^tr = d_mk / R_mk (9), t_mk^exe = c_mk / f^s (10),
where d_mk is the input data of the task and f^s is the computing capability of the MEC server;
the total time of the offloading process is:
t_mk^off = t_mk^tr + t_mk^exe;
the energy consumption of the offloading process also has two parts, namely the energy consumed during transmission and the energy consumed executing the computing task at the MEC server, respectively:
E_mk^tr = p_mk t_mk^tr (11), E_mk^exe = κ_s c_mk (f^s)² (12),
where p_mk is the transmit power of the kth user in the mth group and κ_s is the energy coefficient of the server chip;
according to equations (11) and (12), the total energy consumption of the offloading process is expressed as:
E_mk^off = E_mk^tr + E_mk^exe (13);
thus, the offloading computation cost function of the kth user in the mth group is expressed as:
C_mk^off = λ_mk^t t_mk^off + λ_mk^e E_mk^off (14);
3.3 Total computation cost for the user):
according to 3.1 and 3.2, the user's local computation cost and offloading computation cost are obtained, and the overall computation cost function for the user to complete the computing task can be expressed as:
C_mk = (1 − x_mk) C_mk^l + x_mk C_mk^off (15);
further, the specific steps of the fourth step are as follows:
in the NOMA-MEC system of the ultra-dense network, the SINR and the channel gain of the kth user in the mth group form the state space, which is expressed as:
s_mk(t) = {τ_mk(t), h_mk(t)} (16),
each user selects an action a_mk(t) from the action space A based on its current state s_mk(t); the action of the kth user in the mth group consists of its transmit power, offloading variable, and weight coefficient, and action a_mk(t) ∈ A is expressed as:
a_mk(t) = {p_mk(t), x_mk, λ_mk} (17),
where λ_mk denotes the weight coefficient of delay and energy consumption;
according to the analysis of the user's computation cost in step three, the cost function of the user is expressed as:
C_mk(t) = (1 − x_mk) C_mk^l(t) + x_mk C_mk^off(t) (18);
therefore, the reward function of the kth user in the mth group is expressed as:
r_mk(t) = −C_mk(t) (19);
in the mean-field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the overall system model;
when the kth user in the mth group in state s_mk(t) selects action a_mk(t), its FPK equation can be expressed as:
π_mk(t+1) = π_mk(t) P_mk(p_mk, x_mk, λ_mk) (20),
where π_mk(t+1) is the state distribution of the kth user in the mth group at time (t+1), and P_mk(p_mk, x_mk, λ_mk) is the probability that the kth user in the mth group transitions from its state at time t to its state at time (t+1), which is mainly determined by the user's action;
according to the definition of the reward function, the value function of state s_mk(t) at time t (i.e., the HJB equation) is expressed as:
V(s_mk(t)) = E[ Σ_{i≥0} γ^i r_mk(t+i) | s_mk(t) ] (21);
and the Nash equilibrium solution of the MFG is solved based on the FPK and HJB equations.
Further, the specific mode of the fifth step is as follows:
the DDPG algorithm is adopted to solve for the equilibrium of the MFG, and the objective function of the DDPG algorithm is defined as:
J(θ^μ) = E[Q(s, μ(s|θ^μ))] (22),
where θ^μ is the parameter of the policy network that generates deterministic actions, and θ^μ is updated through the policy gradient;
the Actor part mainly comprises two networks, the online policy network and the target policy network; the deterministic policy μ directly yields a deterministic action a_t = μ(s_t|θ^μ) at each time step; like the Actor part, the Critic part also has two networks, the online Q network and the target Q network; the Q function (i.e., the action-value function) defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and a Q network is used to fit the Q function, namely:
Q_μ(s_t, a_t) = E[R + γ Q(s_{t+1}, μ(s_{t+1}))] (23),
where Q_μ(s_t, a_t) represents the expected return obtained by selecting action a_t under the deterministic policy μ in state s_t; to measure the performance of the policy, the performance objective is defined as follows:
J_β(μ) = E_{s∼ρ^β}[Q_μ(s, μ(s))] (24),
where β denotes the behavior policy and ρ^β is the probability density function of the state space; in the Critic part, the mean squared error is used as the loss function, namely:
L(θ^Q) = E[(R + γ Q′(s_{t+1}, μ′(s_{t+1}|θ^{μ′})|θ^{Q′}) − Q(s_t, a_t|θ^Q))²] (25);
thus, the gradient of the loss function L(θ^Q) with respect to θ^Q can be obtained from the standard backpropagation algorithm;
the gradient is updated in real time so that the objective function converges, finally yielding the optimal strategy, i.e., the optimal resource allocation scheme in the mobile edge computing system.
Compared with the prior art, the invention has the beneficial effects that:
1. the NOMA-MEC system is constructed as an MFG theoretical framework, and the equilibrium solution of the MFG is solved through reinforcement learning, so that the calculation cost of a user, including energy consumption and time delay, is minimized.
2. The invention constructs an uplink NOMA-MEC system in an ultra-dense network, and each SBS is provided with one MEC server to serve a plurality of users. In the system, all users of each SBS service are divided into different clusters according to a user clustering algorithm to increase the data rate of the users.
3. The NOMA-MEC system under ultra-dense networks is modeled as an MFG framework. The equilibrium solution of the MFG is then solved using the DDPG method, and the user's energy consumption and task delay are reduced by learning a dynamic resource allocation strategy.
4. Experiments show that the method can effectively learn the optimal resource allocation strategy and, compared with other methods, can more effectively reduce the user's computation delay and energy consumption.
[ description of the drawings ]
FIG. 1 is a block diagram of a system for mobile edge computation for ultra dense networks in accordance with the present invention;
FIG. 2 is a schematic diagram of the relationship between the average field gaming and reinforcement learning algorithms of the present invention;
FIG. 3 is a schematic diagram of the present invention employing a reinforcement learning algorithm to optimize resource allocation in a NOMA-MEC system;
FIG. 4 is a graph showing the relationship between the energy consumption and the maximum transmission power under different algorithm comparisons according to the present invention;
FIG. 5 is a schematic diagram showing the relationship between the computation delay and the maximum transmit power under different algorithms according to the present invention.
[ detailed description ] of the invention
The invention will be described in detail below with reference to the drawings and the detailed description.
Unlike the existing literature, the invention studies resource optimization in the uplink NOMA-MEC system of an ultra-dense network, from the perspectives of relieving the pressure on network resources and overcoming the limitations of mobile devices. The invention combines a deep reinforcement learning algorithm to minimize system delay and energy consumption by optimizing the power and offloading strategies.
Step one, constructing a system model:
an uplink NOMA-MEC system is constructed, with one MEC server per SBS to serve multiple users.
The concrete construction mode is as follows:
As shown in fig. 1, the present invention considers a NOMA-MEC communication system in an ultra-dense network with M = {1, 2, …, M} small base stations, each equipped with an MEC server to perform the computing tasks offloaded by users. Assuming that the set of users served by each small base station is N = {1, 2, …, N}, the users need to be grouped in order to reduce inter-user interference. In the invention, the N users are divided into Y = {1, 2, …, Y} groups, with K = {1, 2, …, K} users in each group.
During information transmission, the bandwidth B of the whole system is divided into Y sub-channels, with the bandwidth of each sub-channel expressed as B_sc = B/Y, and the users in each group transmit information simultaneously on their sub-channel.
And step two, clustering all users in the system through a user clustering algorithm to improve the users' data transmission rate. Users within a cluster adopt the NOMA transmission mode, while clusters adopt the time-division multiple access (TDMA) transmission mode.
The specific mode of the second step is as follows:
In the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the users with the M largest channel gains are selected in turn as the first user of each of the M NOMA clusters. Next, for each remaining user, the NOMA cluster with the largest sum of channel-gain differences to that user is selected according to a greedy matching method. In addition, when the number of users cannot be uniformly allocated across the clusters, the surplus users are randomly allocated to different clusters, and the channel gain of every user within a cluster is distinct.
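A minimal sketch of this clustering step, assuming each user is described only by a scalar channel gain; the function name `ucma_cluster` and the equal-size cap are illustrative choices, not taken verbatim from the patent:

```python
def ucma_cluster(gains, num_clusters):
    """Greedy channel-gain-difference clustering (UCMA sketch).

    gains: list of channel gains, one per user.
    Returns a list of clusters, each a list of user indices.
    """
    # Sort users by descending channel gain.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    # The num_clusters strongest users seed the clusters.
    clusters = [[u] for u in order[:num_clusters]]
    cap = -(-len(gains) // num_clusters)  # ceiling division: max cluster size
    # Greedily place each remaining user in the open cluster where the sum of
    # channel-gain differences to the current members is largest.
    for u in order[num_clusters:]:
        open_clusters = [c for c in clusters if len(c) < cap]
        best = max(open_clusters,
                   key=lambda c: sum(abs(gains[u] - gains[v]) for v in c))
        best.append(u)
    return clusters
```

In the sketch, the strongest users seed the clusters and each remaining user joins the cluster maximizing the sum of gain differences, mirroring the greedy matching described above.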
And step three, calculating the calculation cost of the user, namely the time delay and the energy consumption when the user processes the task. Including the local and offload computation costs of the user.
The specific mode of the third step is as follows:
The users complete clustering according to the clustering algorithm in step two. Because users within a cluster transmit using NOMA while clusters are separated by TDMA, any user transmitting information may suffer interference both from users in the same cluster and from users served by other SBSs in the same time slot.
For users within the NOMA cluster, users with greater channel gain will experience interference from users with less channel gain. The user with the smallest channel gain is not interfered by other users. Thus, the interference experienced by users within a NOMA cluster can be expressed as:
wherein p is mf Representing the transmission power of the f-th user in the m-th NOMA cluster, h mf Representing the channel gain of the f-th user in the m groups.
Secondly, in an ultra-dense network, users served by different small base stations will interfere when transmitting tasks in the same time slot, which can be expressed as:
wherein p is jk Representing the transmission power of the kth user in the j group, h jk Representing the channel gain of the kth user in the j group.
The SINR of the kth user in the mth group is expressed as:
τ_mk = p_mk h_mk / (σ² + I_mk^NOMA + I_mk^ICI) (3),
where σ² is the power of the additive white Gaussian noise. The data rate of the kth user in the mth group is therefore expressed as:
R_mk = W_sc log(1 + τ_mk) (4),
where W_sc = W_total / M and W_total is the system bandwidth.
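Equations (1), (3), and (4) can be checked numerically with the sketch below. All parameter values are made up, users are assumed to be indexed in order of decreasing channel gain within the cluster, and the rate logarithm is taken as base 2 (Shannon rate), which is an assumption:

```python
import math

def sinr(k, powers, gains, noise, inter_cell=0.0):
    """SINR of user k in a NOMA cluster, eq. (3) sketch.

    powers/gains are ordered by decreasing channel gain, so user k is
    interfered only by users f > k (smaller gain), per eq. (1).
    """
    intra = sum(powers[f] * gains[f] for f in range(k + 1, len(powers)))
    return powers[k] * gains[k] / (noise + intra + inter_cell)

def rate(k, powers, gains, noise, bandwidth, inter_cell=0.0):
    """Achievable rate of user k, eq. (4): R = W_sc * log2(1 + SINR)."""
    return bandwidth * math.log2(1.0 + sinr(k, powers, gains, noise, inter_cell))
```

With this ordering, the last user (smallest gain) sees no intra-cluster interference, matching the statement above that the weakest user is not interfered by others.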
The computing task of the kth user in the mth group may be defined as Φ_mk = {d_mk, c_mk, T_mk^max}, where d_mk represents the input data required by the kth user in the mth group to complete the computing task, c_mk represents the number of CPU cycles the kth user in the mth group needs to compute d_mk, and T_mk^max represents the latest time by which the kth user in the mth group must complete the computing task.
Let x be mk Unloading variables representing the kth user in the mth group, for the local computational model, assume thatRepresenting the local computing capacity of the kth user in the mth group, when the user performs the task locally, its time is:
when computing the energy consumption of local computing, a commonly used model of computing energy consumption is adopted, namely epsilon=κf 2 . Where κ is an energy coefficient depending on the chip structure, so the local energy consumption of the kth user in the mth group can be expressed as:
According to formulas (5) and (6), the computation cost of the kth user in the mth group under local computing can be expressed as:
C_mk^l = λ_mk^t t_mk^l + λ_mk^e E_mk^l (7),
where λ_mk^t and λ_mk^e are the weight coefficients of delay and energy consumption, respectively, and λ_mk^t + λ_mk^e = 1. When λ_mk^t > λ_mk^e, the user is delay-sensitive and pays more attention to computation time; otherwise, the user is energy-constrained and pays more attention to the energy consumed by the computing task.
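A numerical sketch of the local-computation cost in equations (5)-(7); κ and the task parameters used in testing are illustrative values only, not from the patent:

```python
def local_cost(c_cycles, f_local, kappa, w_time, w_energy):
    """Local delay (5), energy (6), and weighted cost (7) of one task."""
    assert abs(w_time + w_energy - 1.0) < 1e-9, "weights must sum to 1"
    t_local = c_cycles / f_local              # eq. (5): t = c / f^l
    e_local = kappa * c_cycles * f_local**2   # eq. (6): E = kappa * c * (f^l)^2
    return w_time * t_local + w_energy * e_local  # eq. (7)
```

Note the tradeoff the weights encode: raising `f_local` reduces delay linearly but raises energy quadratically.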
In the process of unloading to the MEC server for calculation, the method comprises two parts of transmission and calculation at the MEC server, wherein the transmission time and the execution time are respectively as follows:
wherein f s Is the computational power of the MEC server. The total time for this unloading process is:
Similarly, the energy consumption of the offloading process also has two parts, namely the energy consumed during transmission and the energy consumed executing the computing task at the MEC server, respectively:
E_mk^tr = p_mk t_mk^tr (11), E_mk^exe = κ_s c_mk (f^s)² (12),
where κ_s is the energy coefficient of the server chip. According to equations (11) and (12), the total energy consumption of the offloading process can be expressed as:
E_mk^off = E_mk^tr + E_mk^exe (13),
Thus, the cost function of the kth user in the mth group during offloading can be expressed as:
C_mk^off = λ_mk^t t_mk^off + λ_mk^e E_mk^off (14);
Further, the cost function of the kth user in the mth group to complete the computing task can be expressed as:
C_mk = (1 − x_mk) C_mk^l + x_mk C_mk^off (15).
Step four, establishing a cost function:
modeling NOMA-MEC as an MFG framework, wherein SINR and channel gain of a user are represented as a state space, and transmit power, offloading decision factors, and resource allocation factors of the user are represented as an action space; and constructing a reward function of the user according to the calculation cost of the user.
The specific steps of the fourth step are as follows:
Interference can become very severe when many users compute tasks simultaneously. This severely reduces users' data transmission rates, thereby increasing the delay and energy consumption of offloading computing tasks. Since each user is an independent individual, in an ultra-dense scenario each user considers only its own interest. Therefore, the present invention expresses this model as an MFG theoretical framework.
The state of each user comes only from its own local observation. In the NOMA-MEC system of the ultra-dense network, the SINR and the channel gain of the kth user in the mth group form the state space, which is expressed as:
s_mk(t) = {τ_mk(t), h_mk(t)} (16),
Each user selects an action a_mk(t) from the action space A based on its current state s_mk(t); the action of the kth user in the mth group consists of its transmit power, offloading variable, and weight coefficient, and action a_mk(t) ∈ A is expressed as:
a_mk(t) = {p_mk(t), x_mk, λ_mk} (17),
where λ_mk denotes the weight coefficient of delay and energy consumption.
It is an object of the invention to minimize the computation cost of each user subject to the maximum delay constraint. From the analysis of the user's computation cost in step three, the user's cost function can be expressed as:
C_mk(t) = (1 − x_mk) C_mk^l(t) + x_mk C_mk^off(t) (18);
Therefore, the reward function of the kth user in the mth group can be expressed as:
r_mk(t) = −C_mk(t) (19).
In the mean-field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the overall system model. When the kth user in the mth group in state s_mk(t) selects action a_mk(t), its FPK equation can be expressed as:
π_mk(t+1) = π_mk(t) P_mk(p_mk, x_mk, λ_mk) (20),
where π_mk(t+1) is the state distribution of the kth user in the mth group at time (t+1), and P_mk(p_mk, x_mk, λ_mk) is the probability that the kth user in the mth group transitions from its state at time t to its state at time (t+1), which is mainly determined by the user's action.
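Equation (20) is a discrete-time update of the state distribution; the sketch below iterates it with a made-up 2-state transition matrix, held fixed here (i.e., for a fixed action):

```python
def fpk_step(pi, P):
    """One FPK update, eq. (20): pi(t+1) = pi(t) * P (row vector times matrix)."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def fpk_fixed_point(pi, P, steps=100):
    """Iterate eq. (20); for an ergodic chain this converges to the
    stationary (mean-field) distribution."""
    for _ in range(steps):
        pi = fpk_step(pi, P)
    return pi
```

The stationary distribution this iteration reaches is the mean field against which each individual user best-responds in the MFG.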
According to the definition of the reward function, the state s at time t mk The value function (i.e., the HJB equation) of (t) is expressed as:
the Nash equilibrium solution for MFG can be solved based on FPK and HJB equations.
And fifthly, acquiring an equilibrium solution of the average field game by using a reinforcement learning method based on DDPG.
The specific mode of the fifth step is as follows:
The DDPG algorithm, which can handle continuous action spaces, is adopted to solve for the equilibrium of the MFG; the relationship between the MFG and reinforcement learning is shown in fig. 2. The DDPG algorithm can be applied to resource optimization problems in many communication scenarios.
A schematic diagram of optimizing resource allocation in the NOMA-MEC system using the DDPG algorithm is shown in fig. 3. The DDPG algorithm uses an Actor-Critic framework, so the process of the DDPG algorithm is explained in terms of the Actor and the Critic. Given an input state s, the Actor part outputs a specific action a through the deterministic policy μ by maximizing the action value Q(s, a); given an input state s and a specific action a, the Critic part outputs Q(s, a), which is updated by the Bellman equation. Thus, the objective function of the DDPG algorithm can be defined as:
J(θ^μ) = E[Q(s, μ(s|θ^μ))] (22),
where θ^μ is the parameter of the policy network that generates deterministic actions, and θ^μ is updated through the policy gradient.
In the Actor part there are mainly two networks, the online policy network and the target policy network. The deterministic policy μ directly yields a deterministic action a_t = μ(s_t|θ^μ) at each time step. Like the Actor part, the Critic part also has two networks, the online Q network and the target Q network. The Q function (i.e., the action-value function) defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and a Q network is used to fit the Q function, namely:
Q_μ(s_t, a_t) = E[R + γ Q(s_{t+1}, μ(s_{t+1}))] (23),
where Q_μ(s_t, a_t) represents the expected return obtained by selecting action a_t under the deterministic policy μ in state s_t. To measure the performance of the policy, the performance objective is defined as follows:
J_β(μ) = E_{s∼ρ^β}[Q_μ(s, μ(s))] (24),
where β denotes the behavior policy and ρ^β is the probability density function of the state space. The purpose of training is to maximize the performance objective J_β while minimizing the loss of the Q network. In the Critic part, the mean squared error is used as the loss function, namely:
L(θ^Q) = E[(R + γ Q′(s_{t+1}, μ′(s_{t+1}|θ^{μ′})|θ^{Q′}) − Q(s_t, a_t|θ^Q))²] (25),
Thus, the gradient of the loss function L(θ^Q) with respect to θ^Q can be obtained from the standard backpropagation algorithm, namely:
∇_{θ^Q} L(θ^Q) = E[(R + γ Q′(s_{t+1}, μ′(s_{t+1}|θ^{μ′})|θ^{Q′}) − Q(s_t, a_t|θ^Q)) ∇_{θ^Q} Q(s_t, a_t|θ^Q)] (26).
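The Critic quantities in equations (23) and (25) can be illustrated without a neural-network library; the soft target-network update and the value τ = 0.001 below are standard DDPG practice assumed here, not quoted from the patent:

```python
def td_target(reward, q_next, gamma=0.99, done=False):
    """Bootstrapped target R + gamma * Q'(s', mu'(s')) from eqs. (23)/(25)."""
    return reward if done else reward + gamma * q_next

def mse_loss(targets, predictions):
    """Critic loss, eq. (25): mean squared TD error over a batch."""
    return sum((t - q) ** 2 for t, q in zip(targets, predictions)) / len(targets)

def soft_update(target_params, online_params, tau=0.001):
    """theta' <- tau * theta + (1 - tau) * theta' (standard DDPG practice)."""
    return [tau * w + (1 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

Minimizing `mse_loss` against `td_target` values computed by the slowly moving target networks is what keeps the bootstrapped regression in eq. (25) stable.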
examples:
The illustrations and the specific parameter values used in the following example mainly serve to illustrate the basic idea of the invention and to verify it by simulation; in a specific application environment, they can be adjusted appropriately to the actual scenario and requirements.
The invention studies a NOMA-MEC system in an ultra-dense network in which 60 small base stations are randomly distributed within a 10 km × 10 km area, the coverage radius of each small base station is 20 m, and 64 users are randomly distributed near the small base stations.
To implement the DDPG algorithm, the Actor and Critic networks use fully connected neural networks with three hidden layers, each containing 300 neurons. For the Actor network, the final output layer uses a Sigmoid activation function to ensure that the output action probability lies between 0 and 1. For the Critic network, a ReLU activation function is used in each layer. The learning rates of the Actor and Critic networks are set to 0.0001 and 0.001, respectively.
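A minimal forward pass consistent with the described architecture (hidden layers followed by a final Sigmoid for the Actor); the tiny weights used in testing are arbitrary, and ReLU hidden layers for the Actor are an assumption, since the text only pins down its output activation:

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, act):
    """One fully connected layer: act(W x + b)."""
    return [act(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def actor_forward(state, layers):
    """ReLU hidden layers, then a Sigmoid output in (0, 1), matching the
    Actor network described above; layers is a list of (W, b) pairs."""
    h = state
    for i, (W, b) in enumerate(layers):
        act = sigmoid if i == len(layers) - 1 else relu
        h = dense(h, W, b, act)
    return h
```

The Sigmoid on the last layer is what keeps every output component in (0, 1), as the experiment description requires for the action probabilities.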
Fig. 4 and fig. 5 show the effect of the maximum transmit power for different algorithms and different multiple-access modes. In fig. 4 it can be observed that the energy consumption of the system gradually increases as the maximum transmit power increases. For a fixed maximum transmit power, the NOMA scheme achieves lower energy consumption, because users in a NOMA cluster can use the full spectrum resources simultaneously to send information, which reduces the system's energy consumption. As can be seen from fig. 5, the computation delay decreases as the maximum transmit power increases: when the maximum transmit power is large, both the user's computation speed and data transmission rate increase, reducing the computation delay.

Claims (5)

1. A method for allocating mobile edge computing resources of an ultra-dense network, characterized in that
the resource allocation method is based on an ultra-dense network whose NOMA-MEC communication system comprises M = {1, 2, …, M} small base stations, each equipped with an MEC server to execute computing tasks offloaded by users; assume that the set of users served by each small base station is N = {1, 2, …, N}; the N users are divided into Y = {1, 2, …, Y} groups, with K = {1, 2, …, K} users in each group;
the resource allocation method is implemented according to the following steps:
step one, constructing an uplink NOMA-MEC communication system, wherein each small base station SBS is provided with an MEC server to serve a plurality of users;
step two, clustering is carried out on all users in the NOMA-MEC communication system according to the difference of channel gains; the users in the clusters adopt a NOMA transmission mode, and the clusters adopt a TDMA transmission mode;
step three, calculating the computation cost of the user, namely the delay and energy consumption when the user processes a task; the computation cost comprises the user's local computation cost and offloading computation cost;
step four, modeling the NOMA-MEC communication system as a mean field game (MFG) framework; the SINR and the channel gain of the user form the state space, and the transmit power, offloading decision factor, and resource allocation factor of the user form the action space; a reward function of the user is constructed from the user's computation cost;
step five, obtaining the equilibrium solution of the mean field game, i.e., the optimal resource allocation scheme in the mobile edge computing system, by a DDPG-based reinforcement learning method.
2. The method for allocating mobile edge computing resources of an ultra-dense network according to claim 1, wherein step two is specifically as follows:
in the NOMA-MEC communication system model established in step one, all users served by each SBS are sorted by channel gain, and the M users with the largest channel gains are then selected in turn as the first user of each of the M NOMA clusters;
from the remaining users, a greedy matching method selects for each NOMA cluster the user that maximizes the sum of channel-gain differences within that cluster;
when the number of users cannot be evenly allocated to the clusters, the surplus users are randomly allocated to different clusters, and the channel gains of the users within a cluster are all different.
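The clustering procedure of this claim can be sketched as follows; the function name `noma_cluster` is hypothetical, and surplus users are placed by the same greedy rule rather than randomly, so that the sketch is deterministic:

```python
import numpy as np

def noma_cluster(gains, M):
    """Greedy NOMA clustering sketch: the M users with the largest channel
    gains seed the M clusters; each remaining user joins the non-full cluster
    that maximizes the sum of channel-gain differences with its current
    members (surplus users when N is not a multiple of M are placed by the
    same rule rather than randomly, for reproducibility)."""
    order = np.argsort(gains)[::-1]           # users sorted by descending gain
    clusters = [[int(u)] for u in order[:M]]  # cluster heads: top-M gains
    K = int(np.ceil(len(gains) / M))          # target cluster size
    for u in order[M:]:
        best, best_gap = None, -1.0
        for c in clusters:
            if len(c) >= K:
                continue
            gap = sum(abs(gains[u] - gains[v]) for v in c)
            if gap > best_gap:
                best, best_gap = c, gap
        best.append(int(u))
    return clusters

clusters = noma_cluster(np.array([0.1, 0.9, 0.5, 0.3, 0.8, 0.2]), 2)
```

With the six example gains above and M = 2, users 1 (gain 0.9) and 4 (gain 0.8) become cluster heads and the rest are matched greedily.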
3. The method for allocating mobile edge computing resources of an ultra-dense network according to claim 1 or 2, wherein step three is specifically as follows:
3.1) Local computation cost of the user:
let x_mk denote the offloading variable of the kth user in the mth group; in the local computing model, the user completes the computation task locally without offloading it to the MEC server; assume that f_mk represents the local computing capacity of the kth user in the mth group, and c_mk represents the number of CPU cycles required by the kth user in the mth group for local computation; when the user executes the task locally, the required time is:

t_mk^l = c_mk / f_mk (5),
when computing the energy consumption of local computing, the commonly used computation energy model ε = κf^2 is adopted, where ε represents the local computation energy consumption and κ is an energy coefficient depending on the chip architecture; the local energy consumption of the kth user in the mth group can then be expressed as:

ε_mk^l = κ f_mk^2 c_mk (6),
according to formulas (5) and (6), the local computation cost of the kth user in the mth group can be expressed as:

C_mk^l = λ_mk^t t_mk^l + λ_mk^e ε_mk^l (7),

wherein λ_mk^t and λ_mk^e represent the weight coefficients of delay and energy consumption, respectively, and λ_mk^t + λ_mk^e = 1;
3.2) Offloading computation cost of the user:
the process of offloading to the MEC server for computation comprises two parts, transmission and execution at the MEC server; the transmission time and the execution time are respectively:

t_mk^tr = d_mk / R_mk (8), t_mk^ex = c_mk / f_s (9),

wherein f_s is the computing capacity of the MEC server, R_mk represents the data transmission rate of the kth user in the mth group, and d_mk represents the size of the offloaded task data;

the total time of the offloading process is:

t_mk^o = t_mk^tr + t_mk^ex (10);
the energy consumption of the offloading process likewise has two parts, namely the energy consumed during transmission and the energy consumed executing the computation task at the MEC server, respectively:

ε_mk^tr = p_mk t_mk^tr (11), ε_mk^ex = κ_s f_s^2 c_mk (12),

wherein p_mk represents the transmit power of the user and κ_s is the energy coefficient of the MEC server chip;

according to equations (11) and (12), the total energy consumption of the offloading process is expressed as:

ε_mk^o = ε_mk^tr + ε_mk^ex (13);
thus, the offloading computation cost function of the kth user in the mth group is expressed as:

C_mk^o = λ_mk^t t_mk^o + λ_mk^e ε_mk^o (14);
3.3) Total computation cost of the user:
according to 3.1) and 3.2), the user's local computation cost and offloading computation cost are obtained; the overall computation cost function for the user to complete the computation task can be expressed as:

C_mk = (1 - x_mk) C_mk^l + x_mk C_mk^o (15).
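The cost model of this claim can be sketched numerically as follows; the symbols `d` (offloaded data size) and `kappa_s` (server-side energy coefficient), and the server energy model, are assumptions introduced for illustration and are not fixed by the claim text:

```python
# Hypothetical per-user cost model following the claim's symbols:
# c = CPU cycles, f_l = local CPU frequency, kappa = chip energy coefficient,
# d = task bits, R = transmission rate, f_s = server frequency, p = transmit
# power, lam_t / lam_e = delay / energy weight coefficients (summing to 1).

def local_cost(c, f_l, kappa, lam_t, lam_e):
    t_l = c / f_l              # local execution time, eq. (5)
    e_l = kappa * f_l**2 * c   # local energy, eq. (6)
    return lam_t * t_l + lam_e * e_l

def offload_cost(d, R, c, f_s, p, kappa_s, lam_t, lam_e):
    t_tr, t_ex = d / R, c / f_s    # transmission and server execution times
    e_tr = p * t_tr                # transmission energy
    e_ex = kappa_s * f_s**2 * c    # assumed server-side execution energy
    return lam_t * (t_tr + t_ex) + lam_e * (e_tr + e_ex)

def total_cost(x, C_l, C_o):
    """x is the binary offloading variable: 0 = compute locally, 1 = offload."""
    return (1 - x) * C_l + x * C_o

C_l = local_cost(c=1e6, f_l=1e8, kappa=1e-27, lam_t=0.5, lam_e=0.5)
C_o = offload_cost(d=1e5, R=1e6, c=1e6, f_s=1e9, p=0.1,
                   kappa_s=1e-27, lam_t=0.5, lam_e=0.5)
```

The numeric parameters are arbitrary illustration values, not figures from the patent's simulations.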
4. The method for allocating mobile edge computing resources of an ultra-dense network according to claim 1 or 2, wherein step four specifically comprises the following:
in the NOMA-MEC system of the ultra-dense network, the SINR and the channel gain of the kth user in the mth group form the state space, which is expressed as:
s_mk(t) = {τ_mk(t), h_mk(t)} (16),

wherein τ_mk(t) represents the signal-to-interference-plus-noise ratio of the user, and h_mk(t) represents the channel gain of the user;
each user selects an action a_mk(t) from the action space based on its current state s_mk(t); the action of the kth user in the mth group consists of its power, offloading variable, and weight coefficients, and is expressed as:

a_mk(t) = {p_mk(t), x_mk, λ_mk} (17),

wherein λ_mk represents the weight coefficients of delay and energy consumption, p_mk(t) represents the data transmission power of the user, and x_mk represents the offloading variable of the user;
according to the analysis of the user computation cost in step three, the cost function of the user is expressed as:

C_mk(t) = (1 - x_mk) C_mk^l + x_mk C_mk^o (18);

therefore, the reward function of the kth user in the mth group is expressed as:

R_mk(t) = -[(1 - x_mk) C_mk^l + x_mk C_mk^o] (19),

wherein C_mk^l represents the local computation cost of the kth user in the mth group, and C_mk^o represents the offloading computation cost of the kth user in the mth group;
in the mean field game, the Hamilton-Jacobi-Bellman (HJB) equation and the Fokker-Planck-Kolmogorov (FPK) equation describe the overall system model;
when the kth user in the mth group selects action a_mk(t) in state s_mk(t), its FPK equation can be expressed as:

π_mk(t+1) = π_mk(t) P_mk(p_mk, x_mk, λ_mk) (20),

wherein π_mk(t+1) is the state of the kth user in the mth group at time (t+1), and P_mk(p_mk, x_mk, λ_mk) is the probability that the kth user in the mth group transitions from its state at time t to its state at time (t+1), which is mainly determined by the action of the user;
according to the definition of the reward function, the value function of the state s_mk(t) at time t is expressed as:

V_t^μ(s_mk) = R(p_mk, x_mk, λ_mk | s_mk) + γ V_(t+1)^μ(s_mk(t+1)) (21),

wherein V_t^μ(s_mk) represents the value function of selecting strategy μ at time t, and R(p_mk, x_mk, λ_mk | s_mk) represents the reward function; the Nash equilibrium solution of the MFG is then solved based on the FPK and HJB equations.
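A finite-state sketch of the value function in the style of the claim's Bellman recursion, with a fixed transition matrix P standing in for the FPK dynamics and the reward folded into a per-state vector R; all names are illustrative:

```python
import numpy as np

def evaluate_policy(P, R, gamma, tol=1e-10):
    """Iterate V <- R + gamma * P @ V to its fixed point: the value of a
    state under a fixed deterministic strategy is its reward plus the
    discounted value of the successor states weighted by the transition
    probabilities P[s, s'] induced by that strategy."""
    V = np.zeros(len(R))
    while True:
        V_new = R + gamma * P @ V
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

# Two-state toy chain: state 0 pays reward 1 and moves to state 1, and
# vice versa; with gamma = 0.5 the fixed point is V = [4/3, 2/3].
V = evaluate_policy(np.array([[0.0, 1.0], [1.0, 0.0]]),
                    np.array([1.0, 0.0]), 0.5)
```

This only illustrates the recursion; the patent's actual state space is continuous (SINR and channel gain) and is handled by the DDPG approximation of step five.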
5. The method for allocating mobile edge computing resources of an ultra-dense network according to claim 1 or 2, wherein step five is specifically as follows:
the DDPG algorithm is adopted to solve the equilibrium solution of the MFG, and the objective function of the DDPG algorithm is defined as:

J(θ^μ) = E[Σ_i γ^(i-1) R_i] (22),

wherein θ^μ is the parameter of the policy network that generates deterministic actions and is updated by means of the policy gradient, E represents the expectation of the function, γ represents the discount (weighting) factor of the reward function, and R_i represents the reward value;
the Actor part mainly comprises two networks, namely an online policy network and a target policy network; the deterministic strategy μ directly gives the action at each moment as a determined value a_t = μ(s_t | θ^μ); like the Actor part, the Critic part also comprises two networks, namely an online Q network and a target Q network; the Q function defined by the Bellman equation is the expected reward of selecting an action under the deterministic policy, and the Q function is fitted using a Q network, namely:

Q^μ(s_t, a_t) = E[R(s_t, a_t) + γ Q^μ(s_(t+1), μ(s_(t+1)))] (23),

wherein Q^μ(s_t, a_t) represents the expected value obtained by selecting action a_t in state s_t under the deterministic strategy μ; in order to measure the performance of the policy, the performance objective is defined as follows:

J_β(μ) = E_(s~ρ^β)[Q^μ(s, μ(s))] (24),
wherein s represents the state of the user, drawn from the user state set and obeying the probability density function ρ^β, E represents the expectation of the function, β represents the behavior strategy, and ρ^β is the probability density function over the state space; in the Critic part, the mean square error is used as the loss function, namely:

L(θ^Q) = E[(R + γ Q'(s_(t+1), μ'(s_(t+1) | θ^μ') | θ^Q') - Q(s_t, a_t | θ^Q))^2] (25),

wherein E represents the expectation of the function, R represents the reward value, γ represents the discount (weighting) factor of the reward function, μ' represents the target deterministic strategy, Q' represents the expected value obtained with the deterministic strategy μ', θ^Q represents the parameters of the Q network that generates the expected value under strategy μ, θ^μ' represents the parameters of the policy network μ' that generates deterministic actions, and θ^Q' represents the parameters of the Q network that generates the expected value under strategy μ';
thus, the gradient of the loss function L with respect to θ^Q can be obtained from the standard back-propagation algorithm, namely:

∇_(θ^Q) L = E[2 (Q(s_t, a_t | θ^Q) - R - γ Q'(s_(t+1), μ'(s_(t+1) | θ^μ') | θ^Q')) ∇_(θ^Q) Q(s_t, a_t | θ^Q)];
the gradient is updated in real time until the objective function converges, and finally the optimal strategy is obtained, i.e., the optimal resource allocation scheme in the mobile edge computing system.
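The Critic-side update of the DDPG procedure can be sketched with a linear Q function standing in for the Q network; the TD target, mean-squared-error loss, its gradient, and the soft target-network update are the generic DDPG forms, and all function names are illustrative:

```python
import numpy as np

def critic_targets(R, s_next, a_next, thQ_target, gamma):
    """TD target y = R + gamma * Q'(s', mu'(s') | theta^Q'), with Q modeled
    as a linear function of the concatenated (state, action) features."""
    feats = np.concatenate([s_next, a_next], axis=1)
    return R + gamma * feats @ thQ_target

def critic_loss_grad(s, a, y, thQ):
    """Mean-squared-error loss over the batch and its gradient w.r.t. theta^Q."""
    feats = np.concatenate([s, a], axis=1)
    err = feats @ thQ - y
    loss = np.mean(err**2)
    grad = 2.0 * feats.T @ err / len(y)
    return loss, grad

def soft_update(target, online, tau=0.005):
    """Slowly track the online parameters: theta' <- tau*theta + (1-tau)*theta'."""
    return tau * online + (1.0 - tau) * target
```

One gradient step on the loss, followed by a soft update of the target parameters, is the per-iteration Critic update; the Actor would be updated with the deterministic policy gradient, which a linear toy model does not capture and is therefore omitted.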
CN202010597779.5A 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network Active CN111800828B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010597779.5A CN111800828B (en) 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010597779.5A CN111800828B (en) 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network

Publications (2)

Publication Number Publication Date
CN111800828A CN111800828A (en) 2020-10-20
CN111800828B true CN111800828B (en) 2023-07-18

Family

ID=72803807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010597779.5A Active CN111800828B (en) 2020-06-28 2020-06-28 Mobile edge computing resource allocation method for ultra-dense network

Country Status (1)

Country Link
CN (1) CN111800828B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112468568B (en) * 2020-11-23 2024-04-23 南京信息工程大学滨江学院 Task relay unloading method for mobile edge computing network
CN112492691B (en) * 2020-11-26 2024-03-26 辽宁工程技术大学 Downlink NOMA power distribution method of depth deterministic strategy gradient
CN112601256B (en) * 2020-12-07 2022-07-15 广西师范大学 MEC-SBS clustering-based load scheduling method in ultra-dense network
CN112654081B (en) * 2020-12-14 2023-02-07 西安邮电大学 User clustering and resource allocation optimization method, system, medium, device and application
CN112738822A (en) * 2020-12-25 2021-04-30 中国石油大学(华东) NOMA-based security offload and resource allocation method in mobile edge computing environment
CN113055854A (en) * 2021-03-16 2021-06-29 西安邮电大学 NOMA-based vehicle edge computing network optimization method, system, medium and application
CN113517920A (en) * 2021-04-20 2021-10-19 东方红卫星移动通信有限公司 Calculation unloading method and system for simulation load of Internet of things in ultra-dense low-orbit constellation
CN113543342B (en) * 2021-07-05 2024-03-29 南京信息工程大学滨江学院 NOMA-MEC-based reinforcement learning resource allocation and task unloading method
CN113938997B (en) * 2021-09-30 2024-04-30 中国人民解放军陆军工程大学 Resource allocation method of secure MEC system in NOMA (non-volatile memory access) Internet of things
CN114827191B (en) * 2022-03-15 2023-11-03 华南理工大学 Dynamic task unloading method for fusing NOMA in vehicle-road cooperative system
CN114727423A (en) * 2022-04-02 2022-07-08 北京邮电大学 Personalized access method in GF-NOMA system
CN115022937B (en) * 2022-07-14 2022-11-11 合肥工业大学 Topological feature extraction method and multi-edge cooperative scheduling method considering topological features
CN115460080B (en) * 2022-08-22 2024-04-05 昆明理工大学 Blockchain-assisted time-varying average field game edge calculation unloading optimization method
CN117857559B (en) * 2024-03-07 2024-07-12 北京邮电大学 Metropolitan area optical network task unloading method based on average field game and edge server
CN118509823B (en) * 2024-07-19 2024-10-18 山东科技大学 Distributed multidimensional network resource slicing method based on strategy gradient algorithm and game

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107819840A (en) * 2017-10-31 2018-03-20 北京邮电大学 Distributed mobile edge calculations discharging method in the super-intensive network architecture
CN109548013A (en) * 2018-12-07 2019-03-29 南京邮电大学 A kind of mobile edge calculations system constituting method of the NOMA with anti-eavesdropping ability
CN109951897A (en) * 2019-03-08 2019-06-28 东华大学 A kind of MEC discharging method under energy consumption and deferred constraint
CN110798849A (en) * 2019-10-10 2020-02-14 西北工业大学 Computing resource allocation and task unloading method for ultra-dense network edge computing
CN111245539A (en) * 2020-01-07 2020-06-05 南京邮电大学 NOMA-based efficient resource allocation method for mobile edge computing network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3072851B1 (en) * 2017-10-23 2019-11-15 Commissariat A L'energie Atomique Et Aux Energies Alternatives REALIZING LEARNING TRANSMISSION RESOURCE ALLOCATION METHOD


Also Published As

Publication number Publication date
CN111800828A (en) 2020-10-20

Similar Documents

Publication Publication Date Title
CN111800828B (en) Mobile edge computing resource allocation method for ultra-dense network
CN109729528B (en) D2D resource allocation method based on multi-agent deep reinforcement learning
Chen et al. Multiuser computation offloading and resource allocation for cloud–edge heterogeneous network
Li et al. Downlink transmit power control in ultra-dense UAV network based on mean field game and deep reinforcement learning
CN113873022A (en) Mobile edge network intelligent resource allocation method capable of dividing tasks
CN111405569A (en) Calculation unloading and resource allocation method and device based on deep reinforcement learning
CN109947545A (en) A kind of decision-making technique of task unloading and migration based on user mobility
CN112788605B (en) Edge computing resource scheduling method and system based on double-delay depth certainty strategy
CN110856259A (en) Resource allocation and offloading method for adaptive data block size in mobile edge computing environment
CN116456493A (en) D2D user resource allocation method and storage medium based on deep reinforcement learning algorithm
CN113490219B (en) Dynamic resource allocation method for ultra-dense networking
Cheng et al. Efficient resource allocation for NOMA-MEC system in ultra-dense network: A mean field game approach
CN113590279A (en) Task scheduling and resource allocation method for multi-core edge computing server
CN114828018A (en) Multi-user mobile edge computing unloading method based on depth certainty strategy gradient
CN113573363A (en) MEC calculation unloading and resource allocation method based on deep reinforcement learning
CN114980039A (en) Random task scheduling and resource allocation method in MEC system of D2D cooperative computing
Zhou et al. Joint multi-objective optimization for radio access network slicing using multi-agent deep reinforcement learning
Bhandari et al. Optimal Cache Resource Allocation Based on Deep Neural Networks for Fog Radio Access Networks
CN116321293A (en) Edge computing unloading and resource allocation method based on multi-agent reinforcement learning
Ma et al. On-demand resource management for 6G wireless networks using knowledge-assisted dynamic neural networks
Gao et al. Multi-armed bandits scheme for tasks offloading in MEC-enabled maritime communication networks
CN113821346B (en) Edge computing unloading and resource management method based on deep reinforcement learning
Geng et al. Deep reinforcement learning-based computation offloading in vehicular networks
Han et al. Multi-step reinforcement learning-based offloading for vehicle edge computing
CN114219074A (en) Wireless communication network resource allocation algorithm dynamically adjusted according to requirements

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant