CN108880709B - Distributed multi-user dynamic spectrum access method in a cognitive radio network

Distributed multi-user dynamic spectrum access method in a cognitive radio network

Info

Publication number
CN108880709B
CN108880709B CN201810737835.3A CN201810737835A CN108880709B CN 108880709 B CN108880709 B CN 108880709B CN 201810737835 A CN201810737835 A CN 201810737835A CN 108880709 B CN108880709 B CN 108880709B
Authority
CN
China
Prior art keywords
user
channel
value
cognitive user
cognitive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810737835.3A
Other languages
Chinese (zh)
Other versions
CN108880709A (en
Inventor
李立欣
杨佩彤
张会生
高昂
梁微
李旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN201810737835.3A
Publication of CN108880709A
Application granted
Publication of CN108880709B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W16/00: Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/02: Resource partitioning among network components, e.g. reuse partitioning
    • H04W16/10: Dynamic resource partitioning
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04B: TRANSMISSION
    • H04B17/00: Monitoring; Testing
    • H04B17/30: Monitoring; Testing of propagation channels
    • H04B17/309: Measuring or estimating channel quality parameters
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04B: TRANSMISSION
    • H04B17/00: Monitoring; Testing
    • H04B17/30: Monitoring; Testing of propagation channels
    • H04B17/382: Monitoring; Testing of propagation channels for resource allocation, admission control or handover
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Quality & Reliability (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a distributed multi-user dynamic spectrum access method in a cognitive radio network. The method is as follows. Step 1: construct a system model in which n authorized users and k cognitive users in a cell share b channels, where n, k and b are non-zero natural numbers, n and b are equal, and k > n. Step 2: use the DQN algorithm to perform spectrum selection and access for the cognitive users. Specifically, an initial Q function is set and each cognitive user is treated as an agent; each agent selects an action according to the DQN algorithm, that is, it selects one of the b channels for transmission; the average utility of the system is calculated and the reward value is set using evolutionary game theory; a neural network is then trained and used as a function approximator to obtain the updated Q function. Step 3: repeat Step 2 until the resulting system capacity converges to a stable value. Under high spectrum demand, the method achieves high spectrum utilization.

Description

Distributed multi-user dynamic spectrum access method in a cognitive radio network
Technical field
The invention belongs to the field of wireless communication technology, and in particular relates to a distributed multi-user dynamic spectrum access method in a cognitive radio network.
Background art
With the rapid development and spread of wireless communication technology and wireless devices and the continuous growth of new services, spectrum resources are becoming increasingly scarce. Fixed spectrum allocations can no longer meet users' communication requirements, and the resource shortage caused by low spectrum utilization is becoming more serious, so that spectrum resources constrain wireless communication systems in driving economic and social development. Cognitive radio has become a key technology for addressing low spectrum utilization. Its main idea is to first detect which spectrum bands are idle and then intelligently select and access these idle bands, which greatly improves spectrum utilization.
To improve users' quality of experience and relieve spectrum pressure, much work on dynamic spectrum management in cognitive radio networks has been completed recently, and it has greatly improved spectrum utilization. This work nevertheless has limitations. Some methods can reach a Nash equilibrium or an evolutionarily stable equilibrium, but they place high demands on the environment, and because the users do not go through a learning process, convergence is slow. Others study military networks and rest on the premise that each secondary radio node in the military network can be allocated spectrum resources according to its priority, which limits their applicability. Still others do not consider balance and coordination among users, so the system is relatively unstable.
Summary of the invention
The technical problem to be solved by the invention is, in view of the above shortcomings of the prior art, to provide a distributed multi-user dynamic spectrum access method in a cognitive radio network that achieves high spectrum utilization under high spectrum demand.
To solve the above technical problem, the technical solution adopted by the invention is a distributed multi-user dynamic spectrum access method in a cognitive radio network, as follows:
Step 1: construct a system model in which n authorized users and k cognitive users in a cell share b channels, where n, k and b are non-zero natural numbers, n and b are equal, and k > n;
Step 2: use the DQN algorithm to perform spectrum selection and access for the cognitive users, specifically:
Set an initial Q function and treat each cognitive user as an agent. Each agent selects an action according to the DQN algorithm, that is, it selects one of the b channels for transmission. Calculate the average utility of the system and set the reward value using evolutionary game theory. Then train a neural network, using the neural network as a function approximator, to obtain the updated Q function;
Step 3: repeat Step 2 until the resulting system capacity converges to a stable value.
Further, the average utility of the system is $\bar{u}$, calculated as follows:
First, the transmission utility $u_i$ of each cognitive user is determined, $1 \le i \le k$. Specifically, the signal-to-noise ratio is used as the utility value according to formula (1), which treats separately the case in which the $i$-th cognitive user has exclusive use of channel $p$.
The average utility is then obtained from the utilities $u_i$,
where: $SNR_i$ denotes the signal-to-noise ratio obtained by the $i$-th cognitive user;
$y_p$ is the state of the authorized user on channel $p$: a value of 1 indicates that the channel is occupied by the authorized user, and a value of 0 indicates that it is not;
$S_i$ is the signal power transmitted by the $i$-th cognitive user;
$N_p$ is the noise power of channel $p$;
$S_{i-}$ is the total signal power transmitted by the other users that select the same channel as the $i$-th cognitive user.
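The utility calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's formula (1): the function names, the list-based inputs, and the choice to return zero utility on a channel occupied by its authorized user are all assumptions made for the example.

```python
def utility(i, choices, y, S, N):
    """SINR-style utility u_i of cognitive user i in one slot.

    choices[j] : channel index selected by cognitive user j
    y[p]       : 1 if the authorized user occupies channel p, else 0
    S[j]       : transmit signal power of cognitive user j
    N[p]       : noise power on channel p
    """
    p = choices[i]
    if y[p] == 1:
        # Assumption: no utility is gained on a channel the authorized user occupies.
        return 0.0
    # S_{i-}: total power of the other cognitive users that chose the same channel.
    s_others = sum(S[j] for j, c in enumerate(choices) if c == p and j != i)
    # With exclusive use of the channel, s_others is 0 and this reduces to S_i / N_p.
    return S[i] / (N[p] + s_others)


def average_utility(choices, y, S, N):
    """Mean of the per-user utilities over all k cognitive users."""
    k = len(choices)
    return sum(utility(i, choices, y, S, N) for i in range(k)) / k


choices = [0, 0, 1]   # users 0 and 1 share channel 0, user 2 is alone on channel 1
y = [0, 0]            # neither channel is occupied by its authorized user
S = [1.0, 1.0, 1.0]
N = [0.5, 0.5]
u = [utility(i, choices, y, S, N) for i in range(3)]
```

In this toy case the user with exclusive use of its channel obtains a higher utility than the two users that interfere with each other, which is the effect the utility definition is meant to capture.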
Further, the reward value is determined using evolutionary game theory as follows:
The reward function is set by the sign of a rate of change,
where:
$r$ is the reward value;
$\dot{x}_i$ denotes the rate of change of the proportion of all cognitive users that select the same channel as the $i$-th user;
if the rate of change is greater than or equal to 0, the reward value is +1; if the rate of change is less than 0, the reward value is -1.
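Written as a piecewise function, the reward rule just stated reads as follows. The symbol $\dot{x}_i$ for the rate of change is a notational assumption; the two cases are taken directly from the text.

$$
r =
\begin{cases}
+1, & \dot{x}_i \ge 0 \\
-1, & \dot{x}_i < 0
\end{cases}
$$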
Further, the neural network is trained as follows:
An error function is set,
and the neural network is trained to update the network parameter θ so as to approximate the Q function value,
where:
θ is the network parameter;
the error function uses two parameter values, one for each of two networks;
E denotes the expectation;
max denotes taking the maximum value of the network output;
s denotes the state;
a denotes the action, that is, which channel is selected;
s' is the state at the next time step;
a' denotes which channel is selected at the next time step.
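For orientation, the error function commonly used for DQN in the literature has the form below and is consistent with the symbols listed above. It is offered as a sketch, not as the patent's exact expression: the names $\theta$ and $\theta^{-}$ for the parameters of the two networks, and the use of $r$ for the reward and $\beta$ for the discount factor, are assumptions.

$$
L(\theta) = E\left[\left(r + \beta \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\right)^{2}\right]
$$

Minimizing $L(\theta)$ moves the value predicted for the chosen action toward the reward plus the discounted best value of the next state.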
The distributed multi-user dynamic spectrum access method in a cognitive radio network of the invention has the following advantages: 1. By combining deep reinforcement learning with evolutionary game theory, it proposes a new method for distributed multi-user dynamic spectrum access in cognitive radio networks.
2. Dynamic spectrum access uses the DQN algorithm as its main framework: each user runs the DQN algorithm as an independent agent to select channels and learn, so that system capacity increases steadily while the collision rate among users decreases.
3. Evolutionary game theory is introduced, and the reward function of the reinforcement learning algorithm is set using the replicator dynamics model, so as to balance the independent learning of the distributed users.
Brief description of the drawings
Fig. 1: structure of the cognitive radio network;
Fig. 2: structure of the system spectrum environment;
Fig. 3: schematic diagram of reinforcement learning;
Fig. 4: flow chart of the DQN algorithm;
Fig. 5: simulated comparison of system capacity with and without the DQN-RD method;
Fig. 6: simulated comparison of user collision rate with and without the DQN-RD method;
Fig. 7: simulated system capacity using the DQN-RD method when only the number of channels changes.
Detailed description of the embodiments
In the distributed multi-user dynamic spectrum access method in a cognitive radio network of the invention, as shown in Fig. 1, Step 1 constructs a system model in which n authorized users and k cognitive users in a cell share b channels, where n, k and b are non-zero natural numbers, n and b are equal, and k > n. The n authorized users are each authorized to use one of these channels, and n and k are assumed to be constant. The spectrum environment of the system is shown in Fig. 2: the n authorized users are authorized to use the b channels, while the k unauthorized users can only look for opportunities, exploiting the spectrum opportunities that arise in the idle periods when the authorized users are not transmitting. Time is divided into slots of equal length, the authorized and unauthorized users keep their slots synchronized, and data packets are divided into lengths that can be sent within one slot. Every unauthorized user always has packets to send and is capable of independent learning and decision-making; each unauthorized user uses an independent learning algorithm to select the best channel and attempts to access it.
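The constraints of the system model can be captured in a small Python sketch. The class name `SystemModel`, the helper `sample_occupancy`, and the per-slot activity probability `busy_prob` are hypothetical and serve only to illustrate the slotted model; the patent does not specify how authorized-user activity is generated.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class SystemModel:
    n: int  # number of authorized (primary) users
    k: int  # number of cognitive (unauthorized) users
    b: int  # number of channels

    def __post_init__(self):
        # Constraints stated in Step 1: non-zero naturals, n == b, k > n.
        if min(self.n, self.k, self.b) < 1:
            raise ValueError("n, k and b must be non-zero natural numbers")
        if self.n != self.b:
            raise ValueError("n and b must be equal")
        if self.k <= self.n:
            raise ValueError("k must be greater than n")


def sample_occupancy(model, busy_prob, rng):
    """Return y_p for every channel in one slot: 1 if the authorized user transmits.

    busy_prob is a hypothetical activity probability, not taken from the patent.
    """
    return [1 if rng.random() < busy_prob else 0 for _ in range(model.b)]


model = SystemModel(n=100, k=300, b=100)  # sizes used in the patent's simulation
y = sample_occupancy(model, busy_prob=0.5, rng=random.Random(0))
```

Validating the constraints at construction time keeps later code from silently running with a configuration the method does not cover, such as k not exceeding n.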
Because dynamic spectrum selection and access can be expressed as a discrete-time Markov decision process with continuous state and action spaces, and because in a mobile environment the state transition probabilities and the expected rewards of all states are often unknown, the allocation problem is expressed as a Markov process. In general, a Markov decision process is represented by a four-tuple, M = <S, A, P, R>.
The DQN algorithm is used to perform spectrum selection and access for the cognitive users, specifically:
Set an initial Q function and treat each cognitive user as an agent. Each agent selects an action according to the DQN algorithm, that is, it selects one of the b channels for transmission. Calculate the average utility of the system and set the reward value using evolutionary game theory. Then train a neural network, using the neural network as a function approximator, to obtain the updated Q function. Repeat the above steps until the resulting system capacity converges to a stable value.
In the system model studied by the invention, each cognitive user acts as an agent and independently runs the DQN algorithm to select and access a channel. At time t each agent has the selectable action set $A = \{a_1, a_2, \ldots, a_b\}$, the b channels that each cognitive user can select at time t. The state set is $S = \{s_1, s_2, \ldots, s_b\}$, where $s_b$ denotes the state at time t and comprises two items: the channel $p$ selected by the agent ($1 \le p \le b$) and the utility $u_i$ obtained after transmitting on channel $p$ ($1 \le i \le k$). The reward function R is set using evolutionary game theory.
The signal-to-noise ratio is introduced as the utility value of the system, using the following quantities:
$y_p$ is the state of the authorized user on channel $p$; $S_i$ is the signal power transmitted by the $i$-th cognitive user; $N_p$ is the noise power of channel $p$; $S_{i-}$ is the total signal power transmitted by the other users that select the same channel as the $i$-th cognitive user.
The average utility of the system is then calculated from these utility values.
The reward value is set using the replicator dynamics equation,
where: $\varepsilon$ is a factor that affects the speed of evolution;
$x_i$ denotes the proportion of all cognitive users that select the same channel as the $i$-th user;
$u$ denotes the expected utility obtained by an individual in the population that selects the access channel, the population being the set of all k cognitive users;
$\bar{u}$ is the average expected utility of the population.
The reward function is set by the sign of the rate of change,
where:
$r$ is the reward value;
$\dot{x}_i$ denotes the rate of change of the proportion of all cognitive users that select the same channel as the $i$-th user,
that is, the channel selected by the $i$-th user; if the rate of change is greater than or equal to 0 the reward value is +1, and if it is less than 0 the reward value is -1.
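The replicator dynamics step and the reward rule can be sketched together in Python. The sketch assumes the standard replicator form, rate of change = ε · x · (u − ū), which matches the quantities defined above; the function name `replicator_rewards` and the use of the per-channel mean utility as a stand-in for the expected utility are assumptions.

```python
from collections import Counter


def replicator_rewards(choices, utilities, eps=1.0):
    """Return (x_dot per channel, reward per user) for one slot.

    choices[i]   : channel selected by cognitive user i
    utilities[i] : utility u_i obtained by user i in this slot
    eps          : factor controlling the speed of evolution
    """
    k = len(choices)
    counts = Counter(choices)
    # x[p]: proportion of all cognitive users that selected channel p.
    x = {p: c / k for p, c in counts.items()}
    # u[p]: mean utility of the users on channel p (stand-in for expected utility).
    u = {p: sum(ui for ui, c in zip(utilities, choices) if c == p) / counts[p]
         for p in counts}
    # Population average utility.
    u_bar = sum(utilities) / k
    # Replicator dynamics: x_dot = eps * x * (u - u_bar).
    x_dot = {p: eps * x[p] * (u[p] - u_bar) for p in counts}
    # Reward rule: +1 if the rate of change is >= 0, otherwise -1.
    rewards = [1 if x_dot[c] >= 0 else -1 for c in choices]
    return x_dot, rewards


choices = [0, 0, 1]
utilities = [2 / 3, 2 / 3, 2.0]
x_dot, rewards = replicator_rewards(choices, utilities)
```

Users on a channel whose utility is below the population average see a negative rate of change and are penalized, which pushes them away from crowded channels.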
The DQN algorithm used by the invention combines Q-learning with a neural network. In the DQN algorithm a neural network serves as a function approximator for the Q function. The basic idea of training the neural network is to train its parameters by minimizing a cost function, thereby obtaining the optimal network parameters.
Therefore, in the Q network, an error function is set,
and the neural network is trained to update the network parameter θ so as to approximate the Q function value.
Once the gradient of the error function with respect to the parameter θ has been found, the neural network can be trained with methods such as stochastic gradient descent, updating the parameters to obtain the optimal Q value. Formulas (5) and (6) use a separate symbol for the Q function in order to distinguish it from the Q value of the invention.
Where:
θ is the network parameter;
the error function uses two parameter values, one for each of two networks;
E denotes the expectation;
max denotes taking the maximum value of the network output;
s denotes the state;
a denotes the action, that is, which channel is selected;
s' is the state at the next time step;
a' denotes which channel is selected at the next time step.
Formulas (5) and (6) are known from the prior art and are applied to the model of the invention.
Q-learning is one of the most commonly used reinforcement learning algorithms. It represents the value of a state-action pair by a Q value; the Q function is written Q(s, a) and denotes the expected reward obtained by choosing action a in state s and following the policy thereafter. The update rule of the Q function uses the following parameters,
where $\alpha \in (0,1]$ is the learning rate, $\beta \in (0,1]$ is the discount factor, and $r_t$ is the reward.
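The standard tabular Q-learning update, $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r_t + \beta \max_{a'} Q(s',a') - Q(s,a)]$, uses exactly these parameters. The Python sketch below applies one such update; the dictionary layout of the table and the function name `q_update` are assumptions made for the example.

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, beta=0.9):
    """Apply one tabular Q-learning update in place and return the new Q(s, a).

    Q is a dict mapping each state to a list of action values.
    alpha is the learning rate and beta the discount factor, both in (0, 1].
    """
    # Target: immediate reward plus the discounted best value of the next state.
    td_target = r + beta * max(Q[s_next])
    # Move the current estimate a fraction alpha of the way toward the target.
    Q[s][a] += alpha * (td_target - Q[s][a])
    return Q[s][a]


Q = {"s0": [0.0, 0.0], "s1": [1.0, 0.5]}
new_value = q_update(Q, "s0", a=1, r=1, s_next="s1")
```

DQN replaces the table with a neural network, so the same target is used to train the network parameters instead of overwriting a table entry.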
In the system model studied by the invention, each cognitive user acts as an agent and independently runs the DQN algorithm to select and access a channel. At time t the $i$-th agent has the selectable action set $A = \{a_1, a_2, \ldots, a_b\}$, the b channels that the agent can select at time t. The state set is $S = \{s_1, s_2, \ldots, s_b\}$, and the state $s_b$ at time t comprises two items: the channel $p$ selected by the agent ($1 \le p \le b$) and the utility $u_i$ obtained after transmitting on channel $p$ ($1 \le i \le k$).
As shown in Figs. 3 and 4, in each slot every cognitive user, as an independent agent, selects an action according to the DQN algorithm, choosing one of the b channels for transmission. After transmission, the transmission utility $u_i$ of each cognitive user is obtained, and the channel capacity of each cognitive user and the average capacity of the system are calculated. The average utility $\bar{u}$ is then calculated from the transmission utilities $u_i$. The strategy evolution rate $\dot{x}_i$ of each cognitive user is calculated from the replicator dynamics equation of evolutionary game theory, and the reward value is obtained from the value of $\dot{x}_i$ according to the reward function rule. Finally, the neural network is trained, the Q value is updated, and the action for the next slot is selected, until the resulting system capacity converges to a stable value.
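The order of operations within a slot can be illustrated with a toy Python loop. This is a deliberately simplified sketch, not the patented method: it replaces the neural network with a per-user value table, adds epsilon-greedy exploration, fixes signal and noise power to 1, sets ε = 1, and ignores authorized-user activity. All of these choices, and the function name `run`, are assumptions.

```python
import random


def run(k=6, b=3, slots=200, alpha=0.1, explore=0.1, seed=0):
    """Toy per-slot loop: select, transmit, score, reward, update.

    Returns the average utility of the system in every slot.
    """
    rng = random.Random(seed)
    Q = [[0.0] * b for _ in range(k)]   # one value table per cognitive user
    history = []
    for _ in range(slots):
        # 1. Each user independently selects one of the b channels.
        choices = [
            rng.randrange(b) if rng.random() < explore
            else max(range(b), key=lambda p: Q[i][p])
            for i in range(k)
        ]
        # 2. Utility u_i = S_i / (N_p + S_{i-}) with S = N = 1, i.e. 1 / load.
        load = [choices.count(p) for p in range(b)]
        u = [1.0 / (1.0 + (load[c] - 1)) for c in choices]
        u_bar = sum(u) / k
        history.append(u_bar)
        # 3. Replicator dynamics for the chosen channel, then the +1 / -1 reward.
        rewards = []
        for i, c in enumerate(choices):
            x = load[c] / k
            x_dot = x * (u[i] - u_bar)  # users on one channel share the same u
            rewards.append(1 if x_dot >= 0 else -1)
        # 4. Update each user's value estimate for the channel it chose.
        for i, c in enumerate(choices):
            Q[i][c] += alpha * (rewards[i] - Q[i][c])
    return history


history = run()
```

The returned list can be plotted against the slot index to inspect how the average utility evolves; no particular convergence behavior is claimed for this simplified version.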
The invention carried out a simulation analysis of the proposed scheme. As shown in Fig. 1, the simulation considers k = 300 secondary users and 100 channels (n = b = 100), with 100 primary users each authorized to use one of the 100 channels.
Fig. 5 compares the different channel access schemes in terms of system capacity. The line with diamonds represents the DQN-RD method and the line with circles represents random access. The figure shows that with the DQN-RD method the system capacity increases as the number of slots increases and is essentially stable after 550 slots. The system capacity under random access fluctuates and shows no increasing trend. Although the blue line with circles may be higher than the cyan line with diamonds at some points early in learning, it shows no rising trend.
Fig. 6 compares the two access schemes in terms of user collision rate. The collision rate is the probability that different secondary users select the same channel. From Fig. 6 it can be concluded that with the DQN-RD algorithm the user collision rate gradually decreases, so channel utilization increases and the system gradually stabilizes. This is because the utility function takes into account the interference among secondary users that access the same channel, so learning can effectively reduce such conflicts. By comparison, the random access scheme produces randomly fluctuating results.
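One way to measure the collision rate in a single slot is the fraction of secondary users that share their channel with at least one other secondary user. This per-slot estimator and the function name `collision_rate` are assumptions; the patent defines the collision rate only as the probability that different secondary users select the same channel.

```python
from collections import Counter


def collision_rate(choices):
    """Fraction of users that share their chosen channel with at least one other user."""
    counts = Counter(choices)
    collided = sum(1 for c in choices if counts[c] > 1)
    return collided / len(choices)


rate = collision_rate([0, 0, 1, 2, 2, 2])
```

Averaging this quantity over a window of slots gives a curve comparable in spirit to the one described for Fig. 6.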
Fig. 7 shows how system capacity varies with the number of channels when the DQN-RD algorithm is used, with the other parameters held constant and only the number of channels changing. It can be seen that, for the same number of users, the algorithm effectively improves system capacity regardless of how many channels are available.

Claims (1)

1. A distributed multi-user dynamic spectrum access method in a cognitive radio network, characterized in that the method is as follows:
Step 1: construct a system model in which n authorized users and k cognitive users in a cell share b channels, where n, k and b are non-zero natural numbers, n and b are equal, and k > n;
Step 2: use the DQN algorithm to perform spectrum selection and access for the cognitive users, specifically:
set an initial Q function and treat each cognitive user as an agent; each agent selects an action according to the DQN algorithm, that is, selects one of the b channels for transmission; calculate the average utility of the system and set the reward value using evolutionary game theory; then train a neural network, using the neural network as a function approximator, to obtain the updated Q function;
Step 3: repeat Step 2 until the resulting system capacity converges to a stable value;
the average utility of the system is $\bar{u}$, calculated as follows:
first determine the transmission utility $u_i$ of each cognitive user, $1 \le i \le k$, specifically using the signal-to-noise ratio as the utility value, with a separate case in which the $i$-th cognitive user has exclusive use of channel $p$;
the average utility is then obtained from the utilities $u_i$,
where: $SNR_i$ denotes the signal-to-noise ratio obtained by the $i$-th cognitive user;
$y_p$ is the state of the authorized user on channel $p$: a value of 1 indicates that the channel is occupied by the authorized user, and a value of 0 indicates that it is not;
$S_i$ is the signal power transmitted by the $i$-th cognitive user;
$N_p$ is the noise power of channel $p$;
$S_{i-}$ is the total signal power transmitted by the other users that select the same channel as the $i$-th cognitive user;
the reward value is determined using evolutionary game theory as follows:
the reward value is set using the replicator dynamics equation,
where: $\varepsilon$ is a factor that affects the speed of evolution;
$x_i$ denotes the proportion of all cognitive users that select the same channel as the $i$-th user;
$u$ denotes the expected utility obtained by an individual in the population that selects the access channel, the population being the set of all k cognitive users;
$\bar{u}$ is the average expected utility of the population;
the reward function is set by the sign of the rate of change,
where:
$r$ is the reward value;
$\dot{x}_i$ denotes the rate of change of the proportion of all cognitive users that select the same channel as the $i$-th user;
when the rate of change is greater than or equal to 0, the reward value is +1; when the rate of change is less than 0, the reward value is -1.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810737835.3A | 2018-07-06 | 2018-07-06 | Distributed multi-user dynamic spectrum access method in a cognitive radio network (CN108880709B (en))

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810737835.3A | 2018-07-06 | 2018-07-06 | Distributed multi-user dynamic spectrum access method in a cognitive radio network (CN108880709B (en))

Publications (2)

Publication Number | Publication Date
CN108880709A | 2018-11-23
CN108880709B | 2019-05-07

Family

ID=64299586

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201810737835.3A (Active, CN108880709B (en)) | Distributed multi-user dynamic spectrum access method in a cognitive radio network | 2018-07-06 | 2018-07-06

Country Status (1)

Country | Link
CN (1) | CN108880709B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109905165B (en) * | 2019-03-24 | 2020-12-08 | 西安电子科技大学 | Asynchronous random access method for satellite internet of things based on Q learning algorithm
CN111327376B (en) * | 2020-03-04 | 2021-10-01 | 南通大学 | Cognitive radio-based frequency spectrum access method for emergency communication network
CN111431646B (en) * | 2020-03-31 | 2021-06-15 | 北京邮电大学 | Dynamic resource allocation method in millimeter wave system
CN114928549A (en) * | 2022-04-20 | 2022-08-19 | 清华大学 | Communication resource allocation method and device of unauthorized frequency band based on reinforcement learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101711032B (en) * | 2009-11-23 | 2012-07-25 | 哈尔滨工业大学 | Cognitive radio electric dynamic smart frequency spectrum access method for unknown environmental model characteristics
CN101854640B (en) * | 2010-05-13 | 2013-10-23 | 北京邮电大学 | Dynamic spectrum access method and system applied to cognitive radio networks
CN102238555A (en) * | 2011-07-18 | 2011-11-09 | 南京邮电大学 | Collaborative learning based method for multi-user dynamic spectrum access in cognitive radio
CN106685552B (en) * | 2017-02-20 | 2020-07-07 | 南京邮电大学 | Cooperative detection method among cognitive users based on evolutionary game theory under uncertain noise
CN108076467B (en) * | 2017-12-29 | 2020-04-10 | 中国人民解放军陆军工程大学 | Generalized perception model and distributed Q learning access method under limitation of frequency spectrum resources

Legal Events

Date Code Title Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant