CN111726811B - Slice resource allocation method and system for cognitive wireless network - Google Patents
- Publication number
- CN111726811B (application CN202010457568.1A)
- Authority
- CN
- China
- Prior art keywords
- user
- resource allocation
- ultra
- slice
- wireless network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/02—Resource partitioning among network components, e.g. reuse partitioning
- H04W16/04—Traffic adaptive resource partitioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/22—Traffic simulation tools or models
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/309—Measuring or estimating channel quality parameters
- H04B17/336—Signal-to-interference ratio [SIR] or carrier-to-interference ratio [CIR]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/382—Monitoring; Testing of propagation channels for resource allocation, admission control or handover
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The embodiment of the invention provides a slice resource allocation method and system for a cognitive wireless network. The method comprises: establishing a cognitive wireless network slice resource allocation model based on an enhanced mobile broadband slice and an ultra-high reliable ultra-low time delay communication slice; and performing deep reinforcement learning on the model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation. The Actor-Critic deep reinforcement learning algorithm defines a user state and the action taken from the current moment to the next moment, and constructs a system reward function from the state and the action. By combining slicing technology with the Actor-Critic deep reinforcement learning algorithm in cognitive network resource allocation, the embodiment allocates resources optimally under limited spectrum resources and limited transmit power, so that system throughput is maximized.
Description
Technical Field
The present invention relates to the field of wireless communications technologies, and in particular, to a slice resource allocation method and system for a cognitive wireless network.
Background
With the rapid development of wireless communication technology, the use of wireless devices (e.g., vehicles, mobile phones, tablet computers, and various wireless sensors) has increased rapidly, driving the development of fifth generation (5G) wireless communication. In 5G wireless networks, the data rate is expected to be 10 times the current rate, and strong connectivity and 100% coverage are expected to provide better quality of service and user experience. In practice, however, spectrum resources are limited, and spectrum usage is regulated for safety and stability reasons. Spectrum access rights are typically granted to licensed users, and unlicensed users are not allowed to transmit or receive data over licensed regions of the spectrum. A conflict therefore arises between limited spectrum resources and the growing number of users, and how to allocate resources intelligently in cognitive wireless networks has become a research hotspot.
In a cognitive wireless network, an unlicensed user (secondary user) is allowed to communicate within a licensed region of the spectrum as long as that portion of the spectrum is not being used by the licensed user (primary user). Network slicing is one of the key features of 5G networks. In essence, it divides an operator's physical network into multiple virtual networks, each tailored to different service requirements such as delay, security, bandwidth, and reliability. To cope with different application scenarios, 5G networks define three application fields: enhanced mobile broadband, ultra-high reliability ultra-low time delay communication, and large-scale Internet of Things, which together cover different communication characteristics and requirements.
In addition, reinforcement learning is a branch of artificial intelligence. It refers to a class of problems in which an agent learns continuously from interaction with an environment, and to the methods for solving such problems. A reinforcement learning problem can be described as an agent constantly learning from interactions with the environment to accomplish a particular goal (e.g., to obtain the maximum reward). At present, reinforcement learning algorithms are widely applied in fields such as games, communication, and medicine. The commonly used reinforcement learning methods fall into two main types, model-based and model-free. Because communication scenarios are complex, model-free methods are generally adopted, and they are further subdivided as follows: policy learning methods based on a value function, which include dynamic programming, the Monte Carlo method, temporal-difference learning, Q-learning, and deep Q-learning; and learning methods based on a policy function, which include the REINFORCE algorithm and the REINFORCE algorithm with a baseline. In general, value-function-based methods such as Q-learning can overestimate values during policy updates, which affects convergence, and they are best suited to discrete problems. Policy-function-based methods update the policy more smoothly, but because the solution space of the policy function is large they are difficult to sample fully, which leads to larger variance and a tendency to converge to a locally optimal solution.
Disclosure of Invention
The embodiment of the invention provides a slice resource allocation method and a slice resource allocation system for a cognitive wireless network, which are used for solving the problems existing in the prior art or at least partially solving the existing problems.
In a first aspect, an embodiment of the present invention provides a method for allocating slice resources for a cognitive wireless network, including:
based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice, establishing a cognitive wireless network slice resource allocation model;
performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises a user state and actions from the current moment to the next moment, and a system rewarding function is constructed by the user state and the actions.
Further, before performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on the Actor-Critic deep reinforcement learning algorithm to obtain the optimal solution of slice resource allocation, the method further comprises:
and obtaining a fully-connected neural network model to construct an Actor-Critic deep reinforcement learning algorithm network.
Further, establishing the cognitive wireless network slice resource allocation model based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice specifically comprises:
defining the primary user throughput of the enhanced mobile broadband and the primary user outage probability of the ultra-high reliability ultra-low time delay communication;
and based on the primary user throughput and the primary user outage probability, taking maximum system throughput as the goal, defining a system optimization target and system constraint conditions, and constructing the cognitive wireless network slice resource allocation model.
Further, defining the primary user throughput of the enhanced mobile broadband and the primary user outage probability of the ultra-high reliability ultra-low time delay communication further comprises:
the primary user throughput is obtained from any enhanced mobile broadband primary user bandwidth and the signal-to-interference-plus-noise ratio of any enhanced mobile broadband primary user on any channel, wherein the signal-to-interference-plus-noise ratio is obtained from the channel gain of any enhanced mobile broadband primary user transmitter to primary user receiver, the channel gain of any enhanced mobile broadband primary user transmitter to secondary user receiver, the transmit power of any enhanced mobile broadband primary user transmitter on any channel, and the transmit power of any enhanced mobile broadband secondary user transmitter on any channel;
the primary user outage probability is obtained from the delay time of any ultra-high reliable ultra-low delay communication user, the maximum delay time of that user, and the maximum data arrival rate.
Further, defining the system optimization target and the system constraint conditions based on the primary user throughput and the primary user outage probability, with maximum system throughput as the goal, and constructing the cognitive wireless network slice resource allocation model specifically comprises:
taking the maximum sum of the throughput of all secondary users in the system as the system optimization target;
defining any enhanced mobile broadband user rate not lower than a first preset value;
defining that the probability that any communication user with ultra-high reliability and ultra-low time delay does not meet the low time delay is smaller than a second preset value;
defining that a secondary user can only occupy one channel;
the secondary user transmitter power is defined not to exceed a third preset value.
Further, performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on the Actor-Critic deep reinforcement learning algorithm to obtain the optimal solution of slice resource allocation specifically comprises:
defining all secondary users as an agent, and defining the signal-to-interference-plus-noise ratios of all primary users at any moment as the state function;
based on the signal-to-interference-and-noise ratio state function, obtaining an action function of the intelligent agent from the current time to the next time, wherein the action function comprises a sub-carrier state representation occupied by a user at any time and a power state representation of the user at any time;
setting the rewarding function of the intelligent agent as the sum of throughput of all secondary users, and obtaining the result of the rewarding function according to whether the enhanced mobile broadband user meets the rate constraint condition and whether the ultra-high reliability ultra-low time delay communication user meets the power constraint condition.
Further, the obtaining the fully connected neural network model to construct an Actor-Critic deep reinforcement learning algorithm network specifically comprises the following steps:
obtaining a three-layer linear neural network, wherein the number of neurons of an input layer is a first preset parameter, the number of neurons of a middle hidden layer is a second preset parameter, the input layer and the middle hidden layer adopt a ReLU as an activation function, the number of neurons of an output layer is a third preset parameter, and the output layer adopts a sigmoid and a softmax as an activation function;
and respectively constructing an Actor network and a Critic network based on the three-layer linear neural network.
In a second aspect, an embodiment of the present invention provides a slice resource allocation system for a cognitive wireless network, including:
the construction module is used for establishing a cognitive wireless network slice resource allocation model based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice;
the solution module is used for performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution for slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises a user state and actions from the current moment to the next moment, and a system rewarding function is constructed by the user state and the actions.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any one of the slice resource allocation methods for the cognitive wireless network.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the slice resource allocation methods for a cognitive wireless network.
According to the slice resource allocation method and system for the cognitive wireless network, in the cognitive network resource allocation, the slice technology and the Actor-Critic deep reinforcement learning algorithm are combined, and under the condition of limited frequency spectrum resources and limited transmitting power, the resources are optimally allocated, so that the throughput of the system is maximum.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a slice resource allocation method for a cognitive wireless network according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an Actor and Critic network according to an embodiment of the present invention;
Fig. 3 is a block diagram of a slice resource allocation system for a cognitive wireless network according to an embodiment of the present invention;
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In view of the defects in the prior art, the embodiment of the invention provides a slice resource allocation method for a cognitive wireless network that jointly allocates power and channels for the secondary users while guaranteeing the service of the primary users, so that the total throughput of all secondary users in the system is maximized.
Fig. 1 is a flowchart of a slice resource allocation method for a cognitive wireless network according to an embodiment of the present invention, where, as shown in fig. 1, the method includes:
S1, establishing a cognitive wireless network slice resource allocation model based on an enhanced mobile broadband slice and an ultra-high reliable ultra-low time delay communication slice;
S2, performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises a user state and actions from the current moment to the next moment, and a system rewarding function is constructed by the user state and the actions.
Specifically, the method considers resource allocation for two application scenarios in a 5G network, namely enhanced mobile broadband and ultra-high reliability ultra-low time delay communication. Network resources are sliced, and the corresponding resource allocation problem is mapped onto a general reinforcement learning model. A cognitive wireless network slice resource allocation model is established, comprising the optimization target and the constraint conditions of the system. An Actor-Critic based deep reinforcement learning resource allocation method, the CNAC algorithm, is then proposed together with a reward function setting mechanism that places both the constraint conditions and the optimization target of the model into the reward function. Solving the model yields the optimal resource allocation for the cognitive wireless system.
According to the embodiment of the invention, in the cognitive network resource allocation, the slicing technology and the Actor-Critic deep reinforcement learning algorithm are combined, and under the condition of limited spectrum resources and limited transmitting power, the resources are optimally allocated, so that the throughput of the system is maximum.
Based on the above embodiment, the method further includes, before step S2:
and obtaining a fully-connected neural network model to construct an Actor-Critic deep reinforcement learning algorithm network.
Specifically, the Actor-Critic based deep reinforcement learning algorithm adopts a neural network structure, and comprises two networks, namely an Actor network and a Critic network, which adopt the same network structure.
Based on any of the above embodiments, step S1 in the method specifically includes:
defining the primary user throughput of the enhanced mobile broadband and the primary user outage probability of the ultra-high reliability ultra-low time delay communication;
and based on the primary user throughput and the primary user outage probability, taking maximum system throughput as the goal, defining a system optimization target and system constraint conditions, and constructing the cognitive wireless network slice resource allocation model.
Wherein defining the primary user throughput of the enhanced mobile broadband and the primary user outage probability of the ultra-high reliability ultra-low time delay communication further comprises:
the primary user throughput is obtained from any enhanced mobile broadband primary user bandwidth and the signal-to-interference-plus-noise ratio of any enhanced mobile broadband primary user on any channel, wherein the signal-to-interference-plus-noise ratio is obtained from the channel gain of any enhanced mobile broadband primary user transmitter to primary user receiver, the channel gain of any enhanced mobile broadband primary user transmitter to secondary user receiver, the transmit power of any enhanced mobile broadband primary user transmitter on any channel, and the transmit power of any enhanced mobile broadband secondary user transmitter on any channel;
the primary user outage probability is obtained from the delay time of any ultra-high reliable ultra-low delay communication user, the maximum delay time of that user, and the maximum data arrival rate.
Defining the system optimization target and the system constraint conditions based on the primary user throughput and the primary user outage probability, with maximum system throughput as the goal, and constructing the cognitive wireless network slice resource allocation model specifically includes:
taking the maximum sum of the throughput of all secondary users in the system as the system optimization target;
defining any enhanced mobile broadband user rate not lower than a first preset value;
defining that the probability that any communication user with ultra-high reliability and ultra-low time delay does not meet the low time delay is smaller than a second preset value;
defining that a secondary user can only occupy one channel;
the secondary user transmitter power is defined not to exceed a third preset value.
Specifically, a cognitive wireless network slice resource allocation model is first established, in which enhanced mobile broadband (eMBB) slice users and ultra-high reliability ultra-low latency communication (URLLC) slice users are considered.
The throughput of primary user $m$ of the eMBB slice is defined to satisfy:
$c_{m,k}(t) \ge \mu_0, \quad m \in M_1$
wherein $c_{m,k}(t)$ represents the data transmission rate of the $m$-th user on channel $k$; $g_{m,k}$ and $g_{nm,k}$ represent, respectively, the channel gain from the transmitter of primary user $m$ to the receiver of primary user $m$ and the channel gain from the transmitter of primary user $m$ to the receiver of secondary user $n$; $p_m(k)$ represents the transmit power of the primary user transmitter on the $k$-th channel; $p_{n,k}(t)$ represents the transmit power of the secondary user transmitter on the $k$-th channel; $B_m$ represents the bandwidth of user $m$; $B$ represents the bandwidth of the entire cognitive system; and $\mu_0$ represents the lowest throughput requirement of user $m$.
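As a hedged illustration of the quantities defined above, the following sketch computes the SINR and rate of an eMBB primary user on one channel. The rate expression itself is not reproduced in this text, so the Shannon form $c = B_m \log_2(1 + \mathrm{SINR})$, the additive noise power `noise`, the way interference is summed, and all function and parameter names are assumptions rather than the patent's formula.

```python
import math

def primary_sinr(p_m, g_mk, secondary_powers, cross_gains, noise=1e-9):
    """SINR of primary user m on channel k (assumed form).

    p_m              -- primary transmit power on channel k, p_m(k)
    g_mk             -- direct-link channel gain, g_{m,k}
    secondary_powers -- p_{n,k}(t) for each secondary user n on channel k
    cross_gains      -- g_{nm,k} for each secondary user n
    noise            -- assumed additive noise power (not given in the text)
    """
    interference = sum(p * g for p, g in zip(secondary_powers, cross_gains))
    return p_m * g_mk / (interference + noise)

def primary_rate(bandwidth_hz, sinr):
    """Assumed Shannon-capacity rate c_{m,k}(t) = B_m * log2(1 + SINR)."""
    return bandwidth_hz * math.log2(1.0 + sinr)

def meets_rate_requirement(rate, mu0):
    """Rate requirement c_{m,k}(t) >= mu_0 for an eMBB user."""
    return rate >= mu0
```

For example, `primary_rate(1e6, primary_sinr(2.0, 0.5, [1.0], [0.5], noise=0.5))` evaluates an SINR of 1 over a 1 MHz band.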
For URLLC slice users (slice 2), it is assumed that the arrival process of user data packets can be represented by an M/M/1/∞ queuing system and that the data packet length follows an exponential distribution. The outage probability of primary user $m$, $m \in M_2$, is expressed in terms of the following quantities: $d_m$ represents the delay time of user $m$, $d_{m,\beta}$ represents the maximum delay time of user $m$, and $r_m$ indicates the maximum data arrival rate.
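The patent's own outage expression is not reproduced in this text. As a sketch only, the textbook result for an M/M/1 queue can stand in for it: with arrival rate $\lambda$ and service rate $\mu > \lambda$, the time a packet spends in the system is exponentially distributed with rate $\mu - \lambda$, so the probability that the delay exceeds a bound $d$ is $e^{-(\mu - \lambda) d}$. The `service_rate` parameter and the function name are assumptions.

```python
import math

def mm1_outage_probability(arrival_rate, service_rate, max_delay):
    """P(delay > max_delay) for a stable M/M/1 queue (textbook sojourn-time tail).

    arrival_rate -- packets per second (plays the role of r_m)
    service_rate -- packets per second the link can serve (assumed input)
    max_delay    -- delay bound in seconds (plays the role of d_{m,beta})
    """
    if service_rate <= arrival_rate:
        # Unstable queue: delay grows without bound, so the bound is always missed.
        return 1.0
    return math.exp(-(service_rate - arrival_rate) * max_delay)
```

Under this model, raising the service rate (for instance by interfering less with the URLLC user) lowers the outage probability exponentially.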
Further, the allocation method takes maximum system throughput as its goal, and the optimization target is subject to constraints $C_1$ to $C_4$. Here, constraint $C_1$ indicates that the user rate of the eMBB slice is not lower than $\mu_0$; $C_2$ indicates that the probability that a URLLC slice user does not meet the low delay requirement is less than the threshold $\tau$; $C_3$ means that a channel can be occupied by at most one secondary user; and $C_4$ is the transmit power constraint of the secondary user transmitter.
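The four constraints can be checked mechanically for a candidate allocation. The sketch below is a minimal illustration; the function name, the argument layout, and the use of `None` for a secondary user that is not transmitting are assumptions.

```python
def check_constraints(embb_rates, urllc_outages, su_channels, su_powers,
                      mu0, tau, p_max):
    """Evaluate constraints C1 to C4 for one candidate allocation.

    embb_rates    -- rate of each eMBB primary user
    urllc_outages -- outage probability of each URLLC primary user
    su_channels   -- channel index chosen by each secondary user (None = silent)
    su_powers     -- transmit power of each secondary user
    """
    used = [k for k in su_channels if k is not None]
    return {
        "C1": all(rate >= mu0 for rate in embb_rates),     # eMBB rate floor
        "C2": all(p_out < tau for p_out in urllc_outages),  # URLLC delay outage
        "C3": len(used) == len(set(used)),                  # one SU per channel
        "C4": all(0.0 <= p <= p_max for p in su_powers),    # SU power cap
    }
```

Returning one flag per constraint, rather than a single boolean, makes it easy to see which slice requirement a given allocation violates.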
Based on any of the above embodiments, step S2 in the method specifically includes:
defining all secondary users as an agent, and defining the signal-to-interference-plus-noise ratios of all primary users at any moment as the state function;
based on the signal-to-interference-and-noise ratio state function, obtaining an action function of the intelligent agent from the current time to the next time, wherein the action function comprises a sub-carrier state representation occupied by a user at any time and a power state representation of the user at any time;
setting the rewarding function of the intelligent agent as the sum of throughput of all secondary users, and obtaining the result of the rewarding function according to whether the enhanced mobile broadband user meets the rate constraint condition and whether the ultra-high reliability ultra-low time delay communication user meets the power constraint condition.
Specifically, based on the foregoing embodiment, the CNAC algorithm, a deep reinforcement learning resource allocation method based on Actor-Critic, is proposed. All secondary users are regarded as one agent, and the SINRs of all primary users at time $t$ are regarded as the state, denoted $s_t$:
$s_t = \{\mathrm{SINR}_1(t), \mathrm{SINR}_2(t), \ldots, \mathrm{SINR}_M(t)\}$
The action taken by the agent from $s_t$ to $s_{t+1}$ has two components: one indicating which sub-carriers are occupied by secondary users at time $t$, and one representing the secondary user power.
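One possible in-memory representation of the state and the two-part action is sketched below. The original symbols for the action components are not recoverable from this text, so the `Action` type, its field names, and the 0/1 occupancy matrix are illustrative assumptions.

```python
from typing import List, NamedTuple, Optional, Tuple

class Action(NamedTuple):
    """Joint action of the agent (all secondary users) at time t."""
    channels: Tuple[Optional[int], ...]  # sub-carrier index per secondary user, None = silent
    powers: Tuple[float, ...]            # transmit power per secondary user

def build_state(primary_sinrs: List[float]) -> Tuple[float, ...]:
    """State s_t: the SINRs of all M primary users at time t."""
    return tuple(primary_sinrs)

def occupancy_matrix(action: Action, num_channels: int) -> List[List[int]]:
    """Row n, column k is 1 when secondary user n occupies sub-carrier k."""
    rows = []
    for ch in action.channels:
        row = [0] * num_channels
        if ch is not None:
            row[ch] = 1
        rows.append(row)
    return rows
```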
Since the objective of the problem is to maximize the throughput of the cognitive system, and the service requirements of users in the two slices differ (i.e., their constraints differ), the reward function $r(s_t, a_t)$ is set, following Lagrangian duality, to the sum of the throughput of all secondary users.
If the eMBB user meets the rate constraint and the URLLC user meets the power constraint, the reward is set to the sum of the throughput of the secondary users; when the eMBB user does not meet the rate constraint, the reward is set to 0; when the URLLC user does not meet the rate requirement, the reward is set to 0.
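The reward rule described above reduces to a gated sum. A minimal sketch follows, in which the two boolean flags stand for whether the eMBB and URLLC slice requirements are met; the function and parameter names are assumptions.

```python
def reward(su_throughputs, embb_ok, urllc_ok):
    """Reward r(s_t, a_t): total secondary-user throughput when both slice
    requirements hold, otherwise 0."""
    if embb_ok and urllc_ok:
        return float(sum(su_throughputs))
    return 0.0
```

Folding the constraints into the reward this way means the agent never receives credit for throughput gained by violating a primary user's requirement.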
Based on any one of the above embodiments, the obtaining the fully connected neural network model to construct an Actor-Critic reinforcement learning algorithm network specifically includes:
obtaining a three-layer linear neural network, wherein the number of neurons of an input layer is a first preset parameter, the number of neurons of a middle hidden layer is a second preset parameter, the input layer and the middle hidden layer adopt a ReLU as an activation function, the number of neurons of an output layer is a third preset parameter, and the output layer adopts a sigmoid and a softmax as an activation function;
and respectively constructing an Actor network and a Critic network based on the three-layer linear neural network.
Specifically, the Actor-Critic deep reinforcement learning algorithm network comprises two networks, an Actor network and a Critic network, as shown in Fig. 2. The Actor is based on a policy algorithm and its function is to make decisions. The Critic evaluates the Actor's decisions, generates a TD error from the state, the action, and the reward, and then guides the Actor's subsequent decisions.
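The TD error that the Critic passes to the Actor is commonly the one-step temporal-difference error $\delta = r + \gamma V(s_{t+1}) - V(s_t)$. The sketch below uses that standard form with a tabular value estimate for brevity; the discount factor `gamma`, the learning rate `lr`, and the dictionary-based Critic are assumptions, since the text gives no values and uses a neural network for the Critic.

```python
def td_error(reward, value_s, value_next, gamma=0.9, done=False):
    """One-step TD error: delta = r + gamma * V(s') - V(s)."""
    target = reward + (0.0 if done else gamma * value_next)
    return target - value_s

def critic_update(values, state, delta, lr=0.1):
    """Move the Critic's estimate V(state) toward the TD target."""
    values[state] = values.get(state, 0.0) + lr * delta
    return values[state]
```

A positive `delta` indicates the action turned out better than the Critic expected, so the Actor would raise the probability of that action; a negative `delta` lowers it.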
The Actor network and the Critic network adopt the same neural network structure. The main body of the CNAC neural network is a three-layer linear neural network: the input layer has 16 neurons with ReLU as the activation function, the middle hidden layer has 30 neurons with ReLU as the activation function, and the output layer has 12 neurons and uses two activation functions, sigmoid and softmax.
The neural network adopts dropout, which improves the generalization capability of the network while reducing its variance and preventing overfitting. To speed up training, the Adam optimizer is used during back propagation.
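A forward pass through a network of the stated sizes (16, 30, and 12 neurons, ReLU on the first two layers) can be sketched in plain Python. Several details are assumptions not given in the text: the input dimension `state_dim`, the weight initialization, and the way the 12 outputs are split between softmax (here treated as a channel-selection distribution) and sigmoid (here treated as normalized power levels). Dropout and the Adam optimizer are omitted to keep the sketch short.

```python
import math
import random

def linear(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, a) for a in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Small uniform random weights and zero biases (assumed initialization)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_network(state_dim, rng):
    """Layers of 16, 30, and 12 neurons, as stated in the text."""
    sizes = [state_dim, 16, 30, 12]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def forward(net, state, n_softmax=6):
    """Return (softmax head, sigmoid head); the 6/6 split is an assumption."""
    h = relu(linear(state, *net[0]))
    h = relu(linear(h, *net[1]))
    out = linear(h, *net[2])
    return softmax(out[:n_softmax]), sigmoid(out[n_softmax:])
```

Calling `forward(build_network(4, random.Random(0)), [1.0, 2.0, 3.0, 4.0])` returns a probability vector of length 6 and six values in the interval (0, 1).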
On the basis of any of the above embodiments, a simulation experiment was carried out, with the DQN (Deep Q Network) algorithm used for comparison. The experimental results show that the CNAC algorithm converges more quickly and achieves better stability and outage rate.
Fig. 3 is a block diagram of a slice resource allocation system for a cognitive wireless network according to an embodiment of the present invention. As shown in Fig. 3, the system includes a construction module 31 and a solving module 32, wherein:
the construction module 31 is used for establishing a cognitive wireless network slice resource allocation model based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice; the solving module 32 is used for performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises defining a user state and actions from the current moment to the next moment, and constructing a system reward function from the user state and the actions.
The system provided by the embodiment of the present invention is used for executing the corresponding method described above. Its specific implementation is consistent with that of the method, and the related algorithm flow is the same as that of the corresponding method, so the details are not repeated here.
According to the embodiment of the present invention, slicing technology and the Actor-Critic reinforcement learning algorithm are combined for cognitive network resource allocation, so that resources are allocated optimally under limited spectrum resources and limited transmit power, and the throughput of the system is maximized.
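The optimization summarized above can be illustrated with a short Python sketch of a constrained reward: the sum of secondary-user throughputs, granted only when the enhanced mobile broadband rate constraint and the ultra-high reliable ultra-low time delay outage constraint stated elsewhere in this document are met. The Shannon-capacity throughput formula, the noise term in the signal-to-interference-plus-noise ratio, the `penalty` value, and all function and parameter names are assumptions for illustration, not the patent's own formulas.

```python
import math


def sinr(signal_gain, tx_power, interference_gain, interferer_power,
         noise_power):
    """Signal-to-interference-plus-noise ratio on one channel
    (assumed form: one interferer plus additive noise)."""
    return (signal_gain * tx_power
            / (interference_gain * interferer_power + noise_power))


def throughput(bandwidth_hz, sinr_value):
    """Assumed Shannon-capacity throughput in bit/s."""
    return bandwidth_hz * math.log2(1.0 + sinr_value)


def reward(secondary_rates, embb_rates, min_embb_rate,
           urllc_outage_probs, max_outage, penalty=0.0):
    """Sum of secondary-user throughputs if every constraint holds,
    otherwise an assumed fixed penalty."""
    rate_ok = all(r >= min_embb_rate for r in embb_rates)
    outage_ok = all(p < max_outage for p in urllc_outage_probs)
    return sum(secondary_rates) if rate_ok and outage_ok else penalty
```

With a gain of 1.0, a transmit power of 2.0, an interference gain of 0.5, an interferer power of 2.0 and a noise power of 1.0, `sinr` returns 1.0, and a 1 MHz channel then yields a throughput of 1,000,000 bit/s.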
Fig. 4 is a schematic diagram of the physical structure of an electronic device. As shown in Fig. 4, the electronic device may include a processor 410, a communications interface 420, a memory 430, and a communication bus 440, wherein the processor 410, the communications interface 420, and the memory 430 communicate with each other via the communication bus 440. The processor 410 may call logic instructions in the memory 430 to perform the following method: based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice, establishing a cognitive wireless network slice resource allocation model; performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises defining a user state and actions from the current moment to the next moment, and constructing a system reward function from the user state and the actions.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, embodiments of the present invention further provide a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method provided in the above embodiments, for example including: based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice, establishing a cognitive wireless network slice resource allocation model; performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises defining a user state and actions from the current moment to the next moment, and constructing a system reward function from the user state and the actions.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution in essence, or the part of it that contributes to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (9)
1. A slice resource allocation method for a cognitive wireless network, characterized by comprising the following steps:
based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice, establishing a cognitive wireless network slice resource allocation model;
performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises defining a user state and actions from the current moment to the next moment, and constructing a system reward function from the user state and the actions;
wherein establishing the cognitive wireless network slice resource allocation model based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice specifically comprises:
defining the primary user throughput of the enhanced mobile broadband and the primary user outage probability of the ultra-high reliable ultra-low time delay communication;
and, based on the primary user throughput and the primary user outage probability, taking maximum system throughput as the target, defining a system optimization target and system constraint conditions, and constructing the cognitive wireless network slice resource allocation model.
2. The slice resource allocation method for a cognitive wireless network according to claim 1, wherein the step of performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on the Actor-Critic deep reinforcement learning algorithm to obtain the optimal solution of slice resource allocation, the Actor-Critic deep reinforcement learning algorithm comprising defining a user state and actions from the current moment to the next moment and constructing a system reward function from the user state and the actions, further comprises:
and obtaining a fully-connected neural network model to construct an Actor-Critic deep reinforcement learning algorithm network.
3. The slice resource allocation method for a cognitive wireless network according to claim 1, wherein defining the primary user throughput of the enhanced mobile broadband and the primary user outage probability of the ultra-high reliable ultra-low time delay communication further comprises:
the primary user throughput is obtained from any enhanced mobile broadband primary user bandwidth and the signal-to-interference-plus-noise ratio of any enhanced mobile broadband primary user on any channel, wherein the signal-to-interference-plus-noise ratio is obtained from the channel gain of any enhanced mobile broadband primary user transmitter to primary user receiver, the channel gain of any enhanced mobile broadband primary user transmitter to secondary user receiver, the transmit power of any enhanced mobile broadband primary user transmitter on any channel, and the transmit power of any enhanced mobile broadband secondary user transmitter on any channel;
the primary user outage probability is obtained from the delay of any ultra-high reliable ultra-low time delay communication user, the maximum delay of any ultra-high reliable ultra-low time delay communication user, and the maximum data arrival rate.
4. The slice resource allocation method for a cognitive wireless network according to claim 1, wherein defining the system optimization target and the system constraint conditions and constructing the cognitive wireless network slice resource allocation model, based on the primary user throughput and the primary user outage probability and taking maximum system throughput as the target, specifically comprises:
taking the maximization of the sum of the throughputs of all secondary users in the system as the system optimization target;
defining that the rate of any enhanced mobile broadband user is not lower than a first preset value;
defining that the probability that any ultra-high reliable ultra-low time delay communication user fails to meet the low-delay requirement is smaller than a second preset value;
defining that a secondary user can only occupy one channel;
defining that the secondary user transmitter power does not exceed a third preset value.
5. The slice resource allocation method for a cognitive wireless network according to claim 1, wherein performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on the Actor-Critic deep reinforcement learning algorithm to obtain the optimal solution of slice resource allocation, the Actor-Critic deep reinforcement learning algorithm comprising defining a user state and actions from the current moment to the next moment and constructing a system reward function from the user state and the actions, specifically comprises:
defining all secondary users as agents, and defining the signal-to-interference-plus-noise ratios of all primary users at any moment as the state function;
based on the signal-to-interference-plus-noise ratio state function, obtaining an action function of the agent from the current moment to the next moment, wherein the action function comprises a representation of the subcarrier occupied by a user at any moment and a representation of the power of the user at any moment;
setting the reward function of the agent as the sum of the throughputs of all secondary users, and obtaining the result of the reward function according to whether the enhanced mobile broadband users meet the rate constraint condition and whether the ultra-high reliable ultra-low time delay communication users meet the power constraint condition.
6. The slice resource allocation method for a cognitive wireless network according to claim 2, wherein obtaining the fully connected neural network model to construct the Actor-Critic deep reinforcement learning algorithm network specifically comprises:
obtaining a three-layer linear neural network, wherein the number of neurons of the input layer is a first preset parameter, the number of neurons of the middle hidden layer is a second preset parameter, the input layer and the middle hidden layer use ReLU as the activation function, the number of neurons of the output layer is a third preset parameter, and the output layer uses sigmoid and softmax as activation functions;
and respectively constructing an Actor network and a Critic network based on the three-layer linear neural network.
7. A slice resource allocation system for a cognitive wireless network, comprising:
the construction module is used for establishing a cognitive wireless network slice resource allocation model based on the enhanced mobile broadband slice and the ultra-high reliable ultra-low time delay communication slice;
the solving module is used for performing deep reinforcement learning on the cognitive wireless network slice resource allocation model based on an Actor-Critic deep reinforcement learning algorithm to obtain an optimal solution of slice resource allocation; the Actor-Critic deep reinforcement learning algorithm comprises defining a user state and actions from the current moment to the next moment, and constructing a system reward function from the user state and the actions;
wherein the construction module is specifically used for: defining the primary user throughput of the enhanced mobile broadband and the primary user outage probability of the ultra-high reliable ultra-low time delay communication;
and, based on the primary user throughput and the primary user outage probability, taking maximum system throughput as the target, defining a system optimization target and system constraint conditions, and constructing the cognitive wireless network slice resource allocation model.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor performs the steps of the method for slice resource allocation for a cognitive wireless network as claimed in any one of claims 1 to 6.
9. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the slice resource allocation method for a cognitive wireless network according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010457568.1A CN111726811B (en) | 2020-05-26 | 2020-05-26 | Slice resource allocation method and system for cognitive wireless network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111726811A (en) | 2020-09-29 |
CN111726811B (en) | 2023-11-14 |
Family
ID=72565084
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010457568.1A Active CN111726811B (en) | 2020-05-26 | 2020-05-26 | Slice resource allocation method and system for cognitive wireless network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111726811B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114374608B (en) * | 2020-10-15 | 2023-08-15 | 中国移动通信集团浙江有限公司 | Slice instance backup task scheduling method and device and electronic equipment |
CN112272410B (en) * | 2020-10-22 | 2022-04-19 | 北京邮电大学 | Model training method for user association and resource allocation in NOMA (non-orthogonal multiple Access) network |
CN112367628B (en) * | 2020-11-12 | 2024-01-23 | 广东电网有限责任公司 | Intelligent network slice instantiation method and system of electric power Internet of things |
CN112991384B (en) * | 2021-01-27 | 2023-04-18 | 西安电子科技大学 | DDPG-based intelligent cognitive management method for emission resources |
CN112911715B (en) * | 2021-02-03 | 2024-02-13 | 南京南瑞信息通信科技有限公司 | Method and device for distributing power with maximized throughput in virtual wireless network |
CN113163451B (en) * | 2021-04-23 | 2022-08-02 | 中山大学 | D2D communication network slice distribution method based on deep reinforcement learning |
CN113395757B (en) * | 2021-06-10 | 2023-06-30 | 中国人民解放军空军通信士官学校 | Deep reinforcement learning cognitive network power control method based on improved return function |
CN113438723B (en) * | 2021-06-23 | 2023-04-28 | 广东工业大学 | Competition depth Q network power control method with high rewarding punishment |
CN114501644A (en) * | 2021-12-17 | 2022-05-13 | 北京邮电大学 | Time domain resource configuration method and device, electronic equipment and storage medium |
CN114520772B (en) * | 2022-01-19 | 2023-11-14 | 广州杰赛科技股份有限公司 | 5G slice resource scheduling method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109862610A (en) * | 2019-01-08 | 2019-06-07 | 华中科技大学 | D2D user resource allocation method based on deep reinforcement learning DDPG algorithm |
US10405193B1 (en) * | 2018-06-28 | 2019-09-03 | At&T Intellectual Property I, L.P. | Dynamic radio access network and intelligent service delivery for multi-carrier access for 5G or other next generation network |
CN110381541A (en) * | 2019-05-28 | 2019-10-25 | 中国电力科学研究院有限公司 | Smart grid slice allocation method and device based on reinforcement learning |
CN110519783A (en) * | 2019-09-26 | 2019-11-29 | 东华大学 | 5G network slice resource allocation method based on reinforcement learning |
WO2020049181A1 (en) * | 2018-09-07 | 2020-03-12 | NEC Laboratories Europe GmbH | System and method for network automation in slice-based network using reinforcement learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11153229B2 (en) * | 2018-01-19 | 2021-10-19 | Ciena Corporation | Autonomic resource partitions for adaptive networks |
KR102030128B1 (en) * | 2018-02-28 | 2019-11-08 | 한국과학기술원 | Resource allocating method for wireless backhaul network and apparatus based on machine learning |
Non-Patent Citations (1)
Title |
---|
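Resource Allocation Based on Economic Theory in Next-Generation Wireless Networks; Sun Sanshan (孙三山); China Doctoral Dissertations Full-text Database, Information Science and Technology (No. 01); pp. 88-109 * |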
Also Published As
Publication number | Publication date |
---|---|
CN111726811A (en) | 2020-09-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111726811B (en) | Slice resource allocation method and system for cognitive wireless network | |
CN111901392B (en) | Mobile edge computing-oriented content deployment and distribution method and system | |
CN109729528B (en) | D2D resource allocation method based on multi-agent deep reinforcement learning | |
CN110267338B (en) | Joint resource allocation and power control method in D2D communication | |
CN112995951B (en) | 5G Internet of vehicles V2V resource allocation method adopting depth certainty strategy gradient algorithm | |
Li | Multi-agent Q-learning of channel selection in multi-user cognitive radio systems: A two by two case | |
Zhang et al. | Team learning-based resource allocation for open radio access network (O-RAN) | |
Lee et al. | Resource allocation in wireless networks with deep reinforcement learning: A circumstance-independent approach | |
CN113411826B (en) | Edge network equipment caching method based on attention mechanism reinforcement learning | |
CN116541106B (en) | Computing task unloading method, computing device and storage medium | |
CN116600324B (en) | Channel allocation method for channel-bonded WiFi network | |
CN110996365B (en) | Heterogeneous network vertical switching algorithm and system based on multi-objective optimization model | |
CN116582860A (en) | Link resource allocation method based on information age constraint | |
CN114095940A (en) | Slice resource allocation method and equipment for hybrid access cognitive wireless network | |
CN108282888B (en) | D2D resource allocation method based on improved fuzzy clustering | |
CN114885422A (en) | Dynamic edge computing unloading method based on hybrid access mode in ultra-dense network | |
CN114599115A (en) | Unmanned aerial vehicle self-organizing network channel access method | |
CN111741050A (en) | Data distribution method and system based on roadside unit | |
CN114938543A (en) | Honeycomb heterogeneous network resource allocation method based on deep reinforcement learning | |
CN115866559B (en) | Non-orthogonal multiple access auxiliary Internet of vehicles low-energy-consumption safe unloading method | |
CN108768602A (en) | Independently exempt from the method that licensed band cell mobile communication systems selection authorized user feeds back CSI | |
CN115802465B (en) | D2D edge cache network energy consumption management method based on reinforcement learning framework | |
CN116828607A (en) | Radio access network slice resource allocation method adapting to different channel characteristics | |
CN116566522A (en) | Channel intersection method, system, equipment and medium | |
US20230403685A1 (en) | Communication method, base station, user equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||