CN108880709B - Distributed multi-user dynamic spectrum access method in a cognitive radio network - Google Patents
Distributed multi-user dynamic spectrum access method in a cognitive radio network
- Publication number
- CN108880709B
- Authority
- CN
- China
- Prior art keywords
- user
- channel
- value
- cognitive user
- cognitive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/02—Resource partitioning among network components, e.g. reuse partitioning
- H04W16/10—Dynamic resource partitioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/309—Measuring or estimating channel quality parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/382—Monitoring; Testing of propagation channels for resource allocation, admission control or handover
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Electromagnetism (AREA)
- Quality & Reliability (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a distributed multi-user dynamic spectrum access method in a cognitive radio network. The method is as follows. Step 1: build the system model, in which n authorized users and k cognitive users in a cell share b channels, where n, k and b are nonzero natural numbers, n = b, and k > n. Step 2: use the DQN algorithm to perform spectrum selection and access for the cognitive users. Specifically, an initial Q function is set and each cognitive user is treated as an agent; each agent selects an action according to the DQN algorithm, that is, it selects one of the b channels to transmit on; the average utility of the system is computed and the reward value is set using evolutionary game theory; a neural network, used as the function approximator, is then trained to obtain the updated Q function. Step 3: repeat Step 2 until the resulting system capacity converges to a constant value. The method achieves high spectrum utilization under high spectrum demand.
Description
Technical field
The invention belongs to the field of wireless communication technology, and in particular relates to a distributed multi-user dynamic spectrum access method in a cognitive radio network.
Background art
With the rapid development of wireless communication technology, the spread of wireless devices, and the continuous growth of new services, spectrum resources are becoming increasingly scarce. The fixed spectrum available can no longer meet users' communication requirements, and the shortage is aggravated by low spectrum utilization, so that spectrum has become a constraint on wireless communication systems as they drive economic and social development. Cognitive radio is currently a key technology for addressing low spectrum utilization. Its main idea is to first detect which spectrum bands are idle and then intelligently select and access those idle bands, which greatly improves spectrum utilization.
To improve users' quality of experience and relieve the pressure on spectrum, a good deal of work on dynamic spectrum management in cognitive radio networks has been carried out recently, and it has considerably improved spectrum utilization. These approaches nevertheless have limitations. Some can reach a Nash equilibrium or an evolutionarily stable equilibrium, but they place high demands on the environment, the users do not go through a learning process, and convergence is slow. Others study military networks and rest on the premise that each secondary radio node in the network can be allocated spectrum according to its priority, which limits their applicability. Still others do not consider balance and coordination among users, so the resulting system is rather unstable.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the above shortcomings of the prior art, to provide a distributed multi-user dynamic spectrum access method in a cognitive radio network that achieves high spectrum utilization under high spectrum demand.
To solve the above technical problem, the technical solution adopted by the present invention is a distributed multi-user dynamic spectrum access method in a cognitive radio network, as follows:
Step 1: build the system model, in which n authorized users and k cognitive users in a cell share b channels, where n, k and b are nonzero natural numbers, n = b, and k > n;
Step 2: use the DQN algorithm to perform spectrum selection and access for the cognitive users, specifically:
set an initial Q function and treat each cognitive user as an agent; each agent selects an action according to the DQN algorithm, that is, it selects one of the b channels to transmit on; compute the average utility of the system and set the reward value using evolutionary game theory; then train a neural network, used as the function approximator, to obtain the updated Q function;
Step 3: repeat Step 2 until the resulting system capacity converges to a constant value.
Further, the average utility of the system is $\bar{u}$, calculated as follows:
First determine the transmission utility $u_i$ of each cognitive user, $1 \le i \le k$, taking the signal-to-noise ratio as the utility:
$$u_i = SNR_i = \begin{cases} 0, & y_p = 1 \\ \dfrac{S_i}{N_p + S_{i-}}, & y_p = 0 \text{ and other cognitive users also select channel } p \\ \dfrac{S_i}{N_p}, & y_p = 0 \text{ and the } i\text{-th cognitive user monopolizes channel } p \end{cases} \tag{1}$$
Then:
$$\bar{u} = \frac{1}{k} \sum_{i=1}^{k} u_i$$
where:
$SNR_i$ is the signal-to-noise ratio obtained by the i-th cognitive user;
$y_p$ is the state of the authorized user on channel p; a value of 1 means the channel is occupied by the authorized user, and a value of 0 means the channel is not occupied by the authorized user;
$S_i$ is the signal power sent by the i-th cognitive user;
$N_p$ is the noise power of channel p;
$S_{i-}$ is the sum of the signal power sent by the other users that select the same channel as the i-th cognitive user;
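The utility calculation above can be illustrated with a short Python sketch that computes each cognitive user's utility and the system average for a single time slot. The function name, the argument layout and the sample numbers are assumptions made for illustration and are not taken from the patent. The sketch also assumes that a cognitive user gets zero utility on a channel occupied by its authorized user.

```python
from collections import defaultdict


def transmission_utilities(choices, powers, noise, occupied):
    """Return (per-user utilities u_i, average utility u_bar) for one slot.

    choices[i]  : channel index selected by cognitive user i
    powers[i]   : transmit power S_i of cognitive user i
    noise[p]    : noise power N_p of channel p
    occupied[p] : True when the authorized user occupies channel p (y_p = 1)
    """
    # Total cognitive-user power on each channel.
    load = defaultdict(float)
    for ch, s in zip(choices, powers):
        load[ch] += s

    utilities = []
    for ch, s in zip(choices, powers):
        if occupied[ch]:
            # Assumption: no utility on a channel held by its authorized user.
            utilities.append(0.0)
        else:
            # Power of the other users on the same channel (0 if exclusive).
            interference = load[ch] - s
            utilities.append(s / (noise[ch] + interference))

    u_bar = sum(utilities) / len(utilities)
    return utilities, u_bar
```

For example, `transmission_utilities([0, 0, 1, 2], [1.0, 1.0, 2.0, 1.0], [0.5, 0.5, 0.5], [False, False, True])` gives the two users sharing channel 0 a lower utility than the user who has channel 1 to itself, and zero utility to the user on the occupied channel 2.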
Further, the reward value is determined using evolutionary game theory as follows:
The reward function is set as:
$$r = \begin{cases} +1, & \dot{x}_i \ge 0 \\ -1, & \dot{x}_i < 0 \end{cases}$$
where:
r is the reward value;
$\dot{x}_i$ is the rate of change of the proportion of cognitive users, among all cognitive users, that select the same channel as the i-th user;
if the rate of change is greater than or equal to 0, the reward value is +1; if the rate of change is less than 0, the reward value is −1.
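Further, the neural network is trained as follows:
Set the error function:
$$L_i(\theta_i) = E\left[\left(r + \beta \max_{a'} \hat{Q}(s', a'; \theta_i^{-}) - \hat{Q}(s, a; \theta_i)\right)^{2}\right]$$
Train the neural network and update the network parameters θ so that the network approaches the Q function value;
where:
θ denotes the network parameters;
$\theta_i$ is the parameter value of one of the networks (the network being trained);
$\theta_i^{-}$ is the parameter value of the other network (the network used to form the target);
E denotes taking the mean;
$\max_{a'}$ denotes taking the maximum of that network's output over the next action a';
r is the reward value and β is the discount factor;
s denotes the state;
a denotes the action, that is, which channel is selected;
s' denotes the state at the next time step;
a' denotes which channel is selected at the next time step.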
The distributed multi-user dynamic spectrum access method in a cognitive radio network of the present invention has the following advantages: 1. By combining deep reinforcement learning with evolutionary game theory, it provides a new method for distributed multi-user dynamic spectrum access in cognitive radio networks.
2. Dynamic spectrum access uses the DQN algorithm as its main framework: each user, as an independent agent, runs the DQN algorithm to select channels and learn, so that system capacity keeps increasing while the collision rate between users falls.
3. Evolutionary game theory is introduced, and the reward function of the reinforcement learning algorithm is set using the replicator dynamics model, in order to balance the independent learning of the distributed users.
Brief description of the drawings
Fig. 1: structure of the cognitive radio network;
Fig. 2: structure of the system spectrum environment;
Fig. 3: schematic diagram of reinforcement learning;
Fig. 4: flow chart of the DQN algorithm;
Fig. 5: simulated system capacity with and without the DQN-RD method;
Fig. 6: simulated user collision rate with and without the DQN-RD method;
Fig. 7: simulated system capacity of the DQN-RD method when only the number of channels changes.
Detailed description of the embodiments
In the distributed multi-user dynamic spectrum access method in a cognitive radio network of the present invention, as shown in Fig. 1, Step 1 builds the system model: n authorized users and k cognitive users in a cell share b channels, where n, k and b are nonzero natural numbers, n = b, and k > n. The n authorized users are each licensed to use one of these channels, and n and k are assumed constant. The spectrum environment of the system is shown in Fig. 2: the n authorized users are licensed to use the b channels, while the k unauthorized users (the cognitive users) can only transmit opportunistically, exploiting the spectrum opportunities that arise while an authorized user is idle and not transmitting. Time is divided into slots of equal length, authorized and unauthorized users keep their slots synchronized, and data is split into packets that can each be sent within one slot. Every unauthorized user always has packets to send and is capable of independent learning and decision making; each one uses an independent learning algorithm to select the best channel and attempt access.
Because dynamic spectrum selection and access can be formulated as a discrete-time Markov decision process with continuous state and action spaces, and because in a mobile environment the state transition probabilities and the expected rewards of all states are usually unknown, the problem is formulated as a Markov process. In general, a Markov decision process is represented by a four-tuple, M = <S, A, P, R>, where S is the state set, A is the action set, P is the state transition probability, and R is the reward function.
The DQN algorithm is used to perform spectrum selection and access for the cognitive users, specifically:
An initial Q function is set and each cognitive user is treated as an agent; each agent selects an action according to the DQN algorithm, that is, it selects one of the b channels to transmit on; the average utility of the system is computed and the reward value is set using evolutionary game theory; a neural network, used as the function approximator, is then trained to obtain the updated Q function;
The above steps are repeated until the resulting system capacity converges to a constant value.
In the system model studied by the present invention, each cognitive user acts as an agent and independently runs the DQN algorithm to select and access a channel. The action set available to each agent at time t is $A = \{a_1, a_2, \ldots, a_b\}$, where the actions correspond to the b channels that each cognitive user can select at time t. The state set is denoted $S = \{s_1, s_2, \ldots, s_b\}$. The state at time t consists of two items: the channel p selected by the agent ($1 \le p \le b$) and the utility $u_i$ ($1 \le i \le k$) obtained after transmitting on channel p. The reward function R is set using evolutionary game theory.
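The patent does not spell out how an agent turns its Q values into a channel choice. A common choice in DQN implementations is epsilon-greedy selection, sketched below as an assumption rather than as the patent's rule. The function name and the `explore_prob` parameter are hypothetical, and `q_values` stands for the b outputs of the agent's Q network for the current state.

```python
import random


def select_channel(q_values, explore_prob, rng=random):
    """Pick a channel index from per-channel Q values (epsilon-greedy).

    Assumption: with probability explore_prob the agent explores a random
    channel; otherwise it exploits the channel with the largest Q value.
    """
    if rng.random() < explore_prob:
        return rng.randrange(len(q_values))
    # Greedy choice; ties resolve to the lowest channel index.
    return q_values.index(max(q_values))
```

With `explore_prob` set to 0 the choice is purely greedy, which is convenient for evaluation after training; a small positive value keeps the agent trying channels it currently rates poorly.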
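The signal-to-noise ratio is introduced as the utility of the system, as follows:
$$u_i = SNR_i = \begin{cases} 0, & y_p = 1 \\ \dfrac{S_i}{N_p + S_{i-}}, & y_p = 0 \text{ and other cognitive users also select channel } p \\ \dfrac{S_i}{N_p}, & y_p = 0 \text{ and the } i\text{-th cognitive user monopolizes channel } p \end{cases}$$
where: $y_p$ is the state of the authorized user on channel p; $S_i$ is the signal power sent by the i-th cognitive user; $N_p$ is the noise power of channel p; $S_{i-}$ is the sum of the signal power sent by the other users that select the same channel as the i-th cognitive user.
Then:
$$\bar{u} = \frac{1}{k} \sum_{i=1}^{k} u_i$$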
The reward value is set using the replicator dynamics equation:
$$\dot{x}_i = \varepsilon \, x_i \left(U - \bar{U}\right)$$
where: ε is the factor that controls the speed of evolution;
$x_i$ is the proportion of cognitive users, among all cognitive users, that select the same channel as the i-th user;
U is the expected utility obtained by the individuals in the population that choose to access the channel, the population being the set of all k cognitive users;
$\bar{U}$ is the average expected utility of the population;
The reward function is set as:
$$r = \begin{cases} +1, & \dot{x}_i \ge 0 \\ -1, & \dot{x}_i < 0 \end{cases}$$
where:
r is the reward value;
$\dot{x}_i$ is the rate of change of the proportion of cognitive users, among all cognitive users, that select the same channel as the i-th user, that is, that adopt the channel selected by the i-th user; if the rate of change is greater than or equal to 0 the reward value is +1, and if the rate of change is less than 0 the reward value is −1.
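The reward rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the function name is hypothetical, the expected utility U of a channel is estimated as the mean realized utility of the users currently on that channel, and the population average is the mean over all k users.

```python
def replicator_rewards(choices, utilities, epsilon=0.1):
    """Return (change rates x_dot_i, rewards r_i) for one time slot.

    choices[i]   : channel selected by cognitive user i
    utilities[i] : transmission utility u_i obtained by user i
    epsilon      : factor controlling the speed of evolution
    """
    k = len(choices)
    u_bar = sum(utilities) / k  # population average utility

    # Group the utilities by selected channel.
    by_channel = {}
    for ch, u in zip(choices, utilities):
        by_channel.setdefault(ch, []).append(u)

    rates, rewards = [], []
    for ch in choices:
        x = len(by_channel[ch]) / k                      # share on this channel
        u_ch = sum(by_channel[ch]) / len(by_channel[ch])  # assumed estimate of U
        x_dot = epsilon * x * (u_ch - u_bar)             # replicator dynamics
        rates.append(x_dot)
        rewards.append(1 if x_dot >= 0 else -1)
    return rates, rewards
```

Because ε and the share are both positive for any channel that has at least one user, the reward reduces to the sign of the gap between the channel's utility and the population average: users on a better-than-average channel are rewarded and users on a crowded, worse-than-average channel are penalized.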
The DQN algorithm used by the present invention combines Q-learning with a neural network. In DQN, a neural network serves as the function approximator for the Q function. The basic idea of training is to fit the network parameters by minimizing a cost function, which yields the optimal network parameters.
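Therefore, in the Q network, the error function is set as:
$$L_i(\theta_i) = E\left[\left(r + \beta \max_{a'} \hat{Q}(s', a'; \theta_i^{-}) - \hat{Q}(s, a; \theta_i)\right)^{2}\right]$$
The neural network is trained and the network parameters θ are updated so that the network approaches the Q function value;
Once the gradient of the error function with respect to the parameters θ has been computed, the neural network can be trained with methods such as stochastic gradient descent, and the parameters are updated to obtain the optimal Q value. To distinguish it from the Q function, formulas (5) and (6) use a separate symbol (written $\hat{Q}$ here), which is distinct from the Q value of the present invention.
where:
θ denotes the network parameters;
$\theta_i$ is the parameter value of one of the networks (the network being trained);
$\theta_i^{-}$ is the parameter value of the other network (the network used to form the target);
E denotes taking the mean;
$\max_{a'}$ denotes taking the maximum of that network's output over the next action a';
r is the reward value and β is the discount factor;
s denotes the state;
a denotes the action, that is, which channel is selected;
s' denotes the state at the next time step;
a' denotes which channel is selected at the next time step.
Formulas (5) and (6) exist in the prior art and are applied to the model of the present invention.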
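The error function can be made concrete with a small Python sketch of the standard DQN loss, which is the mean squared difference between a target built from a second network and the current network's estimate. The names `td_loss`, `q_online` and `q_target` are assumptions for illustration; each is any callable that maps a state to a list of Q values, one per channel. No particular deep learning library is assumed, and the gradient step itself is left out.

```python
def td_loss(batch, q_online, q_target, beta):
    """Mean squared temporal-difference error over a batch of transitions.

    batch    : iterable of (s, a, r, s_next) tuples
    q_online : callable, state -> list of Q values (network being trained)
    q_target : callable, state -> list of Q values (network forming the target)
    beta     : discount factor
    """
    total = 0.0
    count = 0
    for s, a, r, s_next in batch:
        # Target: reward plus discounted best next-state value.
        target = r + beta * max(q_target(s_next))
        # Squared gap between the target and the current estimate.
        total += (target - q_online(s)[a]) ** 2
        count += 1
    return total / count
```

In a full implementation this scalar would be minimized with stochastic gradient descent on the parameters of `q_online`, while `q_target` is held fixed between periodic copies; that schedule is the usual DQN practice and is not specified in the text above.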
Q-learning is one of the most commonly used reinforcement learning algorithms. It represents the value of a state-action pair by a Q value, described by the Q function Q(s, a), which is the reward obtained by choosing action a in state s plus the expected reward obtained by following the policy thereafter. The update rule of the Q function is:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \beta \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]$$
where α ∈ (0,1] is the learning rate, β ∈ (0,1] is the discount factor, and $r_t$ is the reward function.
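For a small discrete problem the Q-learning update can be written directly against a table. The sketch below is a generic illustration; the function name and the dictionary-of-lists table layout are assumptions, and the sample numbers in the usage note are made up.

```python
def q_update(q, s, a, r, s_next, alpha, beta):
    """Apply one tabular Q-learning update in place and return the new value.

    q      : dict mapping state -> list of Q values, one per action
    s, a   : current state and the action taken
    r      : reward received
    s_next : state reached after the action
    alpha  : learning rate in (0, 1]
    beta   : discount factor in (0, 1]
    """
    best_next = max(q[s_next])
    # Move the estimate toward the target by a fraction alpha.
    q[s][a] += alpha * (r + beta * best_next - q[s][a])
    return q[s][a]
```

Starting from `q = {0: [0.0, 0.0], 1: [1.0, 2.0]}`, the call `q_update(q, 0, 1, 1, 1, 0.5, 0.9)` moves `q[0][1]` halfway from 0 toward the target 1 + 0.9 × 2.0 = 2.8.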
In the system model studied by the present invention, each cognitive user acts as an agent and independently runs the DQN algorithm to select and access a channel. The action set available to the i-th agent at time t is $A = \{a_1, a_2, \ldots, a_b\}$, that is, the b channels the agent can select at time t. The state set is denoted $S = \{s_1, s_2, \ldots, s_b\}$; the state at time t consists of two items: the channel p selected by the agent ($1 \le p \le b$) and the utility $u_i$ ($1 \le i \le k$) obtained after transmitting on channel p.
As shown in Figs. 3 and 4, in each time slot every cognitive user, as an independent agent, selects an action according to the DQN algorithm, choosing one of the b channels to transmit on. After transmission, the transmission utility $u_i$ of each cognitive user is obtained, and the channel capacity of each cognitive user and the average capacity of the system are calculated. The average utility $\bar{u}$ is then calculated from the transmission utilities $u_i$, and the strategy evolution rate $\dot{x}_i$ of each cognitive user is calculated from the replicator dynamics equation of evolutionary game theory. The reward value is obtained from the sign of $\dot{x}_i$ according to the reward function rule. Finally, the neural network is trained, the Q value is updated, and the action for the next time slot is selected, until the resulting system capacity converges to a constant value.
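The per-slot procedure can be sketched end to end in Python. To keep the example short and dependency-free, a stateless per-agent Q table stands in for the DQN; it is a deliberate simplification and not the patent's network. Other assumptions: all users transmit with equal power, authorized users are idle, action selection is epsilon-greedy, and the function and parameter names are hypothetical.

```python
import random


def run_slots(k, b, slots, alpha=0.1, explore=0.1,
              noise=1.0, power=1.0, seed=0):
    """Simulate `slots` time slots; return the average utility per slot."""
    rng = random.Random(seed)
    # One Q value per (agent, channel): a stand-in for each agent's DQN.
    q = [[0.0] * b for _ in range(k)]
    avg_utility = []
    for _ in range(slots):
        # 1. Every agent selects a channel.
        choices = []
        for i in range(k):
            if rng.random() < explore:
                choices.append(rng.randrange(b))
            else:
                choices.append(max(range(b), key=lambda c: q[i][c]))
        # 2. Transmission utilities from signal, noise and co-channel power.
        load = [0] * b
        for c in choices:
            load[c] += 1
        utils = [power / (noise + power * (load[c] - 1)) for c in choices]
        u_bar = sum(utils) / k
        # 3. Reward is the sign of the replicator rate. With equal powers,
        #    users on one channel share the same utility, so the sign is
        #    that of (u_i - u_bar).
        for i, c in enumerate(choices):
            reward = 1 if utils[i] >= u_bar else -1
            # 4. Update the agent's estimate for the chosen channel.
            q[i][c] += alpha * (reward - q[i][c])
        avg_utility.append(u_bar)
    return avg_utility
```

A call such as `run_slots(k=6, b=3, slots=50)` returns one average-utility value per slot, which can be plotted against the slot index in the same way as the capacity curves discussed for Fig. 5. The sketch makes no claim about reproducing those curves.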
The present invention carried out a simulation analysis of the proposed scheme. As shown in Fig. 1, the simulation considers k = 300 secondary users (cognitive users) and n = 100 channels, with 100 primary users (authorized users) each licensed to use one of these 100 channels.
Fig. 5 compares the channel access schemes in terms of system capacity. The line with diamonds represents the DQN-RD method and the line with circles represents random access. As the figure shows, the system capacity under the DQN-RD method increases as the number of time slots grows and is essentially stable after 550 time slots. The system capacity under random access fluctuates and shows no increasing trend. Although the blue line with circles may be higher than the cyan line with diamonds at some points early in learning, it shows no upward trend.
Fig. 6 compares the two access schemes in terms of user collision rate. The collision rate is the probability that different secondary users select the same channel. From Fig. 6 we can conclude that with the DQN-RD algorithm the user collision rate gradually decreases, so channel utilization increases and the system gradually stabilizes. This is because the utility function accounts for interference among secondary users that access the same channel, so learning can effectively reduce such conflicts. By comparison, the random access scheme produces randomly fluctuating results.
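The text defines the collision rate only in words. One way to measure it per time slot, offered here as an assumption, is the fraction of secondary users whose chosen channel was also chosen by at least one other secondary user. The function name is hypothetical.

```python
from collections import Counter


def collision_rate(choices):
    """Fraction of users sharing their selected channel with another user.

    choices[i] is the channel index selected by secondary user i in one slot.
    Assumption: a user "collides" when any other user picked the same channel.
    """
    counts = Counter(choices)
    collided = sum(1 for c in choices if counts[c] > 1)
    return collided / len(choices)
```

Averaging this value over a window of slots gives a curve that can be compared between access schemes in the manner of Fig. 6.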
Fig. 7 shows how the system capacity varies with the number of channels when the DQN-RD algorithm is used, with all other parameters held constant and only the number of channels changed. As can be seen, for the same number of users, the algorithm effectively improves system capacity regardless of how many available channels are added.
Claims (1)
1. A distributed multi-user dynamic spectrum access method in a cognitive radio network, characterized in that the method is as follows:
Step 1: build the system model, in which n authorized users and k cognitive users in a cell share b channels, where n, k and b are nonzero natural numbers, n = b, and k > n;
Step 2: use the DQN algorithm to perform spectrum selection and access for the cognitive users, specifically:
set an initial Q function and treat each cognitive user as an agent; each agent selects an action according to the DQN algorithm, that is, it selects one of the b channels to transmit on; compute the average utility of the system and set the reward value using evolutionary game theory; then train a neural network, used as the function approximator, to obtain the updated Q function;
Step 3: repeat Step 2 until the resulting system capacity converges to a constant value;
the average utility of the system is $\bar{u}$, calculated as follows:
first determine the transmission utility $u_i$ of each cognitive user, $1 \le i \le k$, taking the signal-to-noise ratio as the utility:
$$u_i = SNR_i = \begin{cases} 0, & y_p = 1 \\ \dfrac{S_i}{N_p + S_{i-}}, & y_p = 0 \text{ and other cognitive users also select channel } p \\ \dfrac{S_i}{N_p}, & y_p = 0 \text{ and the } i\text{-th cognitive user monopolizes channel } p \end{cases}$$
then:
$$\bar{u} = \frac{1}{k} \sum_{i=1}^{k} u_i$$
where: $SNR_i$ is the signal-to-noise ratio obtained by the i-th cognitive user;
$y_p$ is the state of the authorized user on channel p; a value of 1 means the channel is occupied by the authorized user, and a value of 0 means the channel is not occupied by the authorized user;
$S_i$ is the signal power sent by the i-th cognitive user;
$N_p$ is the noise power of channel p;
$S_{i-}$ is the sum of the signal power sent by the other users that select the same channel as the i-th cognitive user;
the process of determining the reward value using evolutionary game theory is as follows:
the reward value is set using the replicator dynamics equation:
$$\dot{x}_i = \varepsilon \, x_i \left(U - \bar{U}\right)$$
where: ε is the factor that controls the speed of evolution;
$x_i$ is the proportion of cognitive users, among all cognitive users, that select the same channel as the i-th user;
U is the expected utility obtained by the individuals in the population that choose to access the channel, the population being the set of all k cognitive users;
$\bar{U}$ is the average expected utility of the population;
the reward function is set as:
$$r = \begin{cases} +1, & \dot{x}_i \ge 0 \\ -1, & \dot{x}_i < 0 \end{cases}$$
where:
r is the reward value;
$\dot{x}_i$ is the rate of change of the proportion of cognitive users, among all cognitive users, that select the same channel as the i-th user;
when the rate of change is greater than or equal to 0, the reward value is +1; when the rate of change is less than 0, the reward value is −1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810737835.3A CN108880709B (en) | 2018-07-06 | 2018-07-06 | Distributed multi-user dynamic spectrum access method in a kind of cognition wireless network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810737835.3A CN108880709B (en) | 2018-07-06 | 2018-07-06 | Distributed multi-user dynamic spectrum access method in a kind of cognition wireless network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108880709A (en) | 2018-11-23 |
CN108880709B (en) | 2019-05-07 |
Family
ID=64299586
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810737835.3A Active CN108880709B (en) | 2018-07-06 | 2018-07-06 | Distributed multi-user dynamic spectrum access method in a kind of cognition wireless network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108880709B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109905165B (en) * | 2019-03-24 | 2020-12-08 | 西安电子科技大学 | Asynchronous random access method for satellite internet of things based on Q learning algorithm |
CN111327376B (en) * | 2020-03-04 | 2021-10-01 | 南通大学 | Cognitive radio-based frequency spectrum access method for emergency communication network |
CN111431646B (en) * | 2020-03-31 | 2021-06-15 | 北京邮电大学 | Dynamic resource allocation method in millimeter wave system |
CN114928549A (en) * | 2022-04-20 | 2022-08-19 | 清华大学 | Communication resource allocation method and device of unauthorized frequency band based on reinforcement learning |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101711032B (en) * | 2009-11-23 | 2012-07-25 | 哈尔滨工业大学 | Cognitive radio electric dynamic smart frequency spectrum access method for unknown environmental model characteristics |
CN101854640B (en) * | 2010-05-13 | 2013-10-23 | 北京邮电大学 | Dynamic spectrum access method and system applied to cognitive radio networks |
CN102238555A (en) * | 2011-07-18 | 2011-11-09 | 南京邮电大学 | Collaborative learning based method for multi-user dynamic spectrum access in cognitive radio |
CN106685552B (en) * | 2017-02-20 | 2020-07-07 | 南京邮电大学 | Cooperative detection method among cognitive users based on evolutionary game theory under uncertain noise |
CN108076467B (en) * | 2017-12-29 | 2020-04-10 | 中国人民解放军陆军工程大学 | Generalized perception model and distributed Q learning access method under limitation of frequency spectrum resources |
-
2018
- 2018-07-06 CN CN201810737835.3A patent/CN108880709B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN108880709A (en) | 2018-11-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108880709B (en) | Distributed multi-user dynamic spectrum access method in a kind of cognition wireless network | |
Guo et al. | Mobile-edge computation offloading for ultradense IoT networks | |
Zhao et al. | Energy-saving offloading by jointly allocating radio and computational resources for mobile edge computing | |
Amiri et al. | Blind federated edge learning | |
Gu et al. | Matching theory for future wireless networks: Fundamentals and applications | |
Fayaz et al. | Transmit power pool design for grant-free NOMA-IoT networks via deep reinforcement learning | |
CN104936186B (en) | Cognitive radio network spectrum allocation method based on cuckoo searching algorithm | |
Yang et al. | Dynamic spectrum access in cognitive radio networks using deep reinforcement learning and evolutionary game | |
CN109600178A (en) | The optimization method of energy consumption and time delay and minimum in a kind of edge calculations | |
CN101980470A (en) | Chaotic particle swarm optimization-based OFDM system resource allocation algorithm | |
CN102186174A (en) | Cooperative spectrum sharing game method for cognitive radio system | |
CN113543342B (en) | NOMA-MEC-based reinforcement learning resource allocation and task unloading method | |
WO2023179010A1 (en) | User packet and resource allocation method and apparatus in noma-mec system | |
CN106358300B (en) | A kind of distributed resource allocation method in microcellulor network | |
CN106230528B (en) | A kind of cognition wireless network frequency spectrum distributing method and system | |
CN112492686A (en) | Cellular network power distribution method based on deep double-Q network | |
CN114698128A (en) | Anti-interference channel selection method and system for cognitive satellite-ground network | |
CN105792218A (en) | Optimization method of cognitive radio network with radio frequency energy harvesting capability | |
Kim | Biform game based cognitive radio scheme for smart grid communications | |
Zu et al. | Multi-user cognitive radio network resource allocation based on the adaptive niche immune genetic algorithm | |
Benamor et al. | Mean field game-theoretic framework for distributed power control in hybrid noma | |
CN114615744A (en) | Knowledge migration reinforcement learning network slice general-purpose sensing calculation resource collaborative optimization method | |
Tang et al. | Nonconvex dynamic spectrum allocation for cognitive radio networks via particle swarm optimization and simulated annealing | |
CN110392377A (en) | A kind of 5G super-intensive networking resources distribution method and device | |
CN103582094B (en) | For many from the multi channel energy-conservation dynamic spectrum access strategy process of user |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||