CN109039505B - Channel state transition probability prediction method in cognitive radio network

Channel state transition probability prediction method in cognitive radio network

Info

Publication number
CN109039505B
CN109039505B
Authority
CN
China
Prior art keywords
state
channel
channel state
probability
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810696652.1A
Other languages
Chinese (zh)
Other versions
CN109039505A (en)
Inventor
韩光洁
李傲寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Campus of Hohai University filed Critical Changzhou Campus of Hohai University
Priority to CN201810696652.1A priority Critical patent/CN109039505B/en
Publication of CN109039505A publication Critical patent/CN109039505A/en
Application granted granted Critical
Publication of CN109039505B publication Critical patent/CN109039505B/en


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B 17/00 Monitoring; Testing
    • H04B 17/30 Monitoring; Testing of propagation channels
    • H04B 17/391 Modelling the propagation channel
    • H04B 17/3913 Predictive models, e.g. based on neural network models

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides a channel state transition probability prediction method for cognitive radio networks based on two-stage Q learning. The method comprises two learning processes: a channel state transition probability learning process and a channel state learning process. Both processes are based on Q learning; the channel state learning process learns the state of a channel from the channel state transition probabilities produced by the channel state transition probability learning process and outputs the learned channel state. The method can predict a continuously changing authorized channel state in real time, does not require any probability distribution to be assumed in advance for authorized users or jammers, and explicitly accounts for the possibility that the channel state samples used to predict the transition probabilities are incorrect.

Description

Channel state transition probability prediction method in cognitive radio network
Technical Field
The invention relates to a method for predicting channel state transition probability in a cognitive radio network, and belongs to the technical field of radio networks.
Background
Over the past decades, wireless networks have had to support ever-increasing data rates to meet consumer demand for fast, secure, and intelligent wireless services. However, current wireless systems face a spectrum-resource bottleneck that makes it difficult to improve performance within the limited available frequency bands, so new communication paradigms are needed. Industry forecasts indicate that wireless networks will require more spectrum resources to serve more high-speed users, and that current spectrum resources must be used more efficiently. Cognitive radio technology emerged to make more efficient use of limited spectrum: in a cognitive radio network, a cognitive radio user can dynamically utilize licensed spectrum without interfering with the normal communications of licensed (authorized) users. The design of cognitive radio systems, however, faces many security challenges. Cognitive radio users may be subject to jamming attacks: a jammer equipped with cognitive capabilities can sense the channels occupied by unlicensed users and attack them, disrupting normal communication between cognitive radio users. Therefore, to ensure normal communication of cognitive radio users and to avoid interference to authorized users, prediction of channel state information is very important.
In a cognitive radio network, a cognitive radio user can generally sense the state of a licensed channel through spectrum sensing. However, due to hardware and energy limitations, a cognitive radio user may not be able to sense all licensed spectrum simultaneously; even if it could, sensing the entire licensed spectrum would consume a large amount of time and introduce significant communication delay. Therefore, to obtain a more accurate spectrum state within limited time, more information about the channel is needed, the most important being the channel state transition probability. Many existing works assume that the channel state is determined by the state of an authorized user and that the cognitive radio user knows the authorized user's state transition information. In a real cognitive radio network, however, this information is difficult for a cognitive radio user to obtain, so the cognitive radio user needs to predict the state information of the authorized channel itself. To date, only a small fraction of the related research has addressed prediction of the authorized channel's state transition information in cognitive radio networks.
The prior research literature relevant to channel state transition probability prediction in cognitive radio networks is as follows:
Liu et al., in "Prediction of Exponentially Distributed Primary User Traffic for Dynamic Spectrum Access" (IEEE Globecom 2012), proposed an authorized-user activity prediction method based on maximum likelihood. The method assumes that the activities of authorized users follow an exponential distribution and that sensing is perfect. The probability that an authorized user occupies a channel is first calculated from samples; the time the authorized user occupies the channel is then evaluated from this probability; next, the transition probability of the channel occupation seen by the unlicensed user is evaluated from the estimated occupation probability and occupation time; finally, the channel state at the next moment is estimated from the perfectly sensed channel state and the estimated transition probability. However, this maximum-likelihood method assumes a perfect channel sensing result, whereas in a real cognitive radio network the sensing result may be imperfect due to limits on sensing time and sensing method. It also requires the authorized user's channel occupation intervals to follow an exponential distribution, and a long sampling process is needed before the next state can be predicted, bringing long delays to the cognitive radio user's communication.
Song et al. proposed a Markov-based channel prediction method in "Underlying the Prediction of Wireless Spectrum: A Large-scale Empirical Study" (IEEE ICC 2010). The method predicts a slot's channel state from the K channel states preceding the slot to be predicted, counting how many of those K states are occupied and how many are not. However, the approach encounters different error rates in different situations and in some cases faces a large prediction error rate. Furthermore, the channel states in the sequence used for prediction are themselves subject to sensing errors.
F. H. Panahi et al. proposed a channel transition probability prediction algorithm based on the Baum-Welch algorithm in "Optimal Channel-Sensing Scheme for Cognitive Radio Systems Based on Fuzzy Q-Learning" (IEICE Transactions on Communications, 2014). However, this method requires the channel transition probability to be estimated over a sizable sample space before it can be used to predict the channel. Moreover, if the transition probability changes, the method must spend considerable time re-sampling to re-estimate it, and the timing of re-sampling cannot be predicted. The algorithm therefore cannot predict the channel state in real time, which increases the communication delay of the cognitive radio user.
S. Filippi et al. proposed a channel transition probability prediction method based on maximum likelihood estimation in "An Estimation Algorithm of Channel State Transition Probability for Cognitive Radio Systems" (IEEE CrownCom 2008), which can estimate the channel transition probability relatively accurately. The estimate, however, rests on the assumption that the cognitive radio user can obtain the correct channel state, whereas in an actual cognitive radio network, limits on sensing time, sensing method, and the surrounding communication environment mean the user may not obtain the real channel state information. In addition, the method requires the cognitive radio user to collect channel state information over a long period to evaluate the transition probability, which increases the communication delay of the cognitive radio user.
Akbulut et al. proposed a channel transition probability prediction method based on particle swarm optimization in "Estimation of Time-Varying Channel State Transition Probabilities for Cognitive Radio Systems by Means of Particle Swarm Optimization" (Radioengineering, 2012). The method can track varying channel transition probabilities, but it too faces the problem that the channel state information it uses may differ from the actual channel state.
Summarizing these studies, the following major problems exist in the design of current channel state prediction schemes for cognitive radio networks:
1. many articles default to the same state of the observed channel state sequence as the actual channel state. However, due to the spectrum sensing time of the unauthorized user, the spectrum sensing method, and the limitation of the communication environment, the observed channel state may deviate from the actual channel state to some extent.
2. Most of the articles need to predict the channel state transition probability in advance, which will bring some delay to the communication of the cognitive radio user. Furthermore, most articles cannot predict the channel state transition probability in real time. Therefore, the channel state transition probability in most of the articles is only applicable to the cognitive radio network in which the channel state transition probability is always unchanged. However, in a real cognitive radio network, the channel state transition probability may be time varying.
3. The channel state transition probability prediction methods in some articles need to meet certain probability distribution and cannot be generally used in cognitive radio networks.
Disclosure of Invention
The technical problem is as follows: the invention designs a channel state transition probability prediction method for cognitive radio networks based on two-stage Q learning. The prediction method comprises two Q-learning-based processes: a channel state transition probability learning process and a channel state learning process. The input of the channel state transition probability learning process is the channel state learned by the channel state learning process, and its output is the predicted channel state transition probability; the input of the channel state learning process is the output of the channel state transition probability learning process, and its output is the learned channel state. Through continuous learning of the channel, the cognitive radio user can infer the channel's state transition probabilities and thus correctly deduce the channel state at the next moment, avoiding both jamming attacks on the cognitive radio user and interference to authorized users. The method can predict a continuously changing authorized channel state in real time, does not require any probability distribution to be assumed in advance for authorized users or jammers, and explicitly accounts for the possibility that the channel state samples used to predict the transition probabilities are incorrect.
The technical scheme of the invention is as follows:
the invention relates to a channel state transition probability prediction method in a cognitive radio network, which comprises two learning processes, namely a channel state transition probability learning process and a channel state learning process;
the channel state transition probability learning process and the channel state learning process are both based on a Q learning method, and the channel state learning process learns the state of a channel from the channel state transition probabilities learned by the channel state transition probability learning process and outputs the learned channel state.
The basic Q-learning elements of the channel state transition probability learning process are defined as follows:
State: the state of the authorized channel, 0 or 1; state 0 means the authorized channel is idle, and state 1 means it is occupied by an authorized user, attacked by a jammer, or both;
Action: the predicted authorized channel state at the next moment, i.e. 0 or 1;
Reward: determined by whether the predicted authorized channel state at the next moment matches the channel state learned by the channel state learning process;
the Q learning process of the channel state transition probability learning process is as follows:
(2a) initializing parameters in a Q-learning method
The parameters initialized in the Q learning method are the Q values, one for each pair of current channel state and next-moment channel state to be predicted; at initialization, every such Q value is set to 0;
(2b) action decision process
An action is selected according to the Q values of the next states under the current state, i.e. the channel state at the next moment is predicted;
actions are selected by a ξ-greedy method: with probability ξ the cognitive radio user selects the next-moment state with the maximum Q value, and with probability 1-ξ it selects the next-moment state with a non-maximum Q value;
The selection rule is:

$$S_{t+1} = \begin{cases} \arg\max_{s \in \{0,1\}} Q(S_t, s), & \text{with probability } \xi \\ \text{the non-maximum state}, & \text{with probability } 1-\xi \end{cases}$$

where $S_t$ is the channel state at time t; $S_{t+1}$ is the channel state at time t+1; and $Q(\cdot)$ is the Q value of each candidate next-moment channel state under state $S_t$;
(2c) updating the Q value
The Q value is updated according to the channel state learned by the channel state learning process, as follows:

$$Q(S_t, S_{t+1}) \leftarrow (1-\alpha)\, Q(S_t, S_{t+1}) + \alpha \left[ R + \gamma \max_{s \in \{0,1\}} Q(S_{t+1}, s) \right]$$

where $Q(S_t, S_{t+1})$ is the Q value of the next state selected under the current state; $\max_{s} Q(S_{t+1}, s)$ is the maximum Q value associated with the current channel state; α is the learning rate; γ is the discount factor; and R is the reward earned for performing the selected action;
when the action selected by the channel state transition probability learning process is the same as the channel state learned by the channel state learning process, the reward is recorded as 1; if the selected action differs from the learned channel state, the reward is recorded as -1;
(2d) computing channel state transition probabilities
Calculating the channel state transition probability according to the Q value of the next state corresponding to the current state;
The calculation is as follows:

$$P_{0,j} = \frac{q_{0,j}}{q_{0,0} + q_{0,1}}, \qquad P_{1,j} = \frac{q_{1,j}}{q_{1,0} + q_{1,1}}, \qquad j \in \{0,1\}$$

where $P_{0,j}$ is the probability of the channel transitioning from state 0 to state j; $P_{1,j}$ is the probability of the channel transitioning from state 1 to state j; $q_{0,j}$ is the Q value of next state j under current state 0, with $q_{0,0}$ and $q_{0,1}$ the Q values of next states 0 and 1 under current state 0; and $q_{1,j}$ is the Q value of next state j under current state 1, with $q_{1,0}$ and $q_{1,1}$ the Q values of next states 0 and 1 under current state 1.
The channel state learning process comprises the following steps.
The basic Q-learning elements of the channel state learning process are defined as follows:
State: the state probability of the authorized channel, i.e. the probability that the authorized channel state is 0 and the probability that it is 1;
Action: transmitting data or not transmitting data;
Reward: determined by whether data is transmitted and whether a collision occurs;
the Q learning process of the channel state learning process is as follows:
(3a) predicting grant channel state probability
The cognitive radio calculates the probability of each state of the authorized channel according to the spectrum sensing result and the channel state transition probability output in the channel state transition probability learning process,
the calculation is as follows:

$$P_{t+1}^{\,i} = \frac{y_i \sum_{j \in \{0,1\}} P_t^{\,j}\, P_{j,i}}{\sum_{k \in \{0,1\}} y_k \sum_{j \in \{0,1\}} P_t^{\,j}\, P_{j,k}}, \qquad i = 0 \text{ or } 1$$

where $P_t^{\,i}$ is the probability that the authorized channel state is i at time t, i = 0 or 1; $P_{t+1}^{\,i}$ is the probability that the authorized channel state is i at time t+1; $y_i$ is the probability that the channel state observed from the sensing result is i; and $P_{i,j}$ is the probability of the authorized channel transitioning from state i to state j, j = 0 or 1;
(3b) selecting corresponding action
A corresponding action is selected according to the predicted state probability of the authorized channel. The specific selection process is as follows: if the probability that the authorized channel is idle is greater than the probability that it is occupied, i.e.

$$P_{t+1}^{\,0} > P_{t+1}^{\,1},$$

the authorized channel is considered idle, and data is transmitted with probability η and withheld with probability 1-η; otherwise, the authorized channel is considered occupied, and data is withheld with probability η and transmitted with probability 1-η;
(3c) calculating rewards
If the selected action is data transmission and no collision with an authorized user or jammer occurs, the reward is 1; if the selected action is data transmission and a collision with an authorized user or jammer occurs, the reward is -1; if the selected action is not to transmit data, the reward is 0;
(3d) updating the Q value
The Q value is updated as follows:

$$Q(s, a) \leftarrow (1-\beta)\, Q(s, a) + \beta \left[ R_a + \chi \max_{a'} Q(s, a') \right]$$

where $Q(s, a)$ is the Q value of the selected action under the current state; $\max_{a'} Q(s, a')$ is the maximum action Q value under the current state; β is the learning rate; χ is the discount factor; and $R_a$ is the reward earned for performing the selected action;
(3e) outputting grant channel state
If the selected action is data transmission and no collision with the authorized user or jammer occurs, the authorized channel state is output as idle; otherwise, the output channel state is occupied.
The invention achieves the following beneficial effects:
(1) The invention takes into account whether the perceived channel state is correct. Through the channel state learning process, the cognitive radio user continuously learns from the perceived channel state and thereby obtains the correct channel state, avoiding both interference by the cognitive radio user to authorized users and malicious attacks on the cognitive radio user by jammers.
(2) The invention can estimate the channel state transition probability in real time, including a changing transition probability, thereby avoiding estimation delay and transition-probability prediction errors.
(3) The channel state transition probability prediction method does not require the activities of authorized users or jammers to be assumed in advance to follow any probability distribution, so the prediction method applies more generally to cognitive radio networks.
Drawings
FIG. 1 is an overall block diagram of the channel state transition probability prediction method of the present invention;
FIG. 2 is a flow chart of a channel state transition probability prediction method of the present invention;
FIG. 3 is a flowchart of the channel state prediction method of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
As shown in FIG. 1, the present invention relates to a method for predicting channel state transition probability in a cognitive radio network, which comprises two learning processes: a channel state transition probability learning process and a channel state learning process;
both processes are based on a Q learning method, and the channel state learning process learns the state of a channel from the channel state transition probabilities learned by the channel state transition probability learning process and outputs the learned channel state.
The basic Q-learning elements of the channel state transition probability learning process are defined as follows:
State: the state of the authorized channel, 0 or 1; state 0 means the authorized channel is idle, and state 1 means it is occupied by an authorized user, attacked by a jammer, or both;
Action: the predicted authorized channel state at the next moment, i.e. 0 or 1;
Reward: determined by whether the predicted authorized channel state at the next moment matches the channel state learned by the channel state learning process;
As shown in FIG. 2, the Q learning process of the channel state transition probability learning process is as follows:
(2a) initializing parameters in a Q-learning method
The parameters initialized in the Q learning method are the Q values, one for each pair of current channel state and next-moment channel state to be predicted; at initialization, every such Q value is set to 0;
(2b) action decision process
An action is selected according to the Q values of the next states under the current state, i.e. the channel state at the next moment is predicted;
actions are selected by a ξ-greedy method: with probability ξ the cognitive radio user selects the next-moment state with the maximum Q value, and with probability 1-ξ it selects the next-moment state with a non-maximum Q value;
The selection rule is:

$$S_{t+1} = \begin{cases} \arg\max_{s \in \{0,1\}} Q(S_t, s), & \text{with probability } \xi \\ \text{the non-maximum state}, & \text{with probability } 1-\xi \end{cases}$$

where $S_t$ is the channel state at time t; $S_{t+1}$ is the channel state at time t+1; and $Q(\cdot)$ is the Q value of each candidate next-moment channel state under state $S_t$;
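As an illustration only, the ξ-greedy selection of step (2b) can be sketched in Python as follows; the table `q` and the names used here are illustrative, not part of the claimed method:

```python
import random

def select_next_state(q, s_t, xi=0.9):
    """xi-greedy prediction of the next channel state (step (2b)).

    q is a 2x2 table: q[s][s_next] is the Q value of predicting next
    state s_next when the current channel state is s (states are 0/1).
    With probability xi the state with the larger Q value is chosen;
    with probability 1 - xi the other (non-maximum) state is chosen.
    """
    greedy = max((0, 1), key=lambda s_next: q[s_t][s_next])
    if random.random() < xi:
        return greedy
    return 1 - greedy  # the non-maximum state in a binary state space
```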
(2c) updating the Q value
The Q value is updated according to the channel state learned by the channel state learning process, as follows:

$$Q(S_t, S_{t+1}) \leftarrow (1-\alpha)\, Q(S_t, S_{t+1}) + \alpha \left[ R + \gamma \max_{s \in \{0,1\}} Q(S_{t+1}, s) \right]$$

where $Q(S_t, S_{t+1})$ is the Q value of the next state selected under the current state; $\max_{s} Q(S_{t+1}, s)$ is the maximum Q value associated with the current channel state; α is the learning rate; γ is the discount factor; and R is the reward earned for performing the selected action;
when the action selected by the channel state transition probability learning process is the same as the channel state learned by the channel state learning process, the reward is recorded as 1; if the selected action differs from the learned channel state, the reward is recorded as -1;
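A minimal sketch of the update in step (2c), under the assumption that the bootstrap term maximizes over the Q row of the learned next state (function and parameter names are illustrative):

```python
def update_transition_q(q, s_t, s_pred, s_learned, alpha=0.1, gamma=0.8):
    """One Q update of the transition-probability learner (step (2c)).

    s_pred is the action, i.e. the predicted next state; s_learned is
    the channel state output by the channel state learning process.
    The reward is 1 when the prediction matches the learned state and
    -1 otherwise, as the text specifies.
    """
    r = 1 if s_pred == s_learned else -1
    best_next = max(q[s_learned][0], q[s_learned][1])
    q[s_t][s_pred] = (1 - alpha) * q[s_t][s_pred] + alpha * (r + gamma * best_next)
    return s_learned  # becomes the current state for the next round
```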
(2d) computing channel state transition probabilities
Calculating the channel state transition probability according to the Q value of the next state corresponding to the current state;
The calculation is as follows:

$$P_{0,j} = \frac{q_{0,j}}{q_{0,0} + q_{0,1}}, \qquad P_{1,j} = \frac{q_{1,j}}{q_{1,0} + q_{1,1}}, \qquad j \in \{0,1\}$$

where $P_{0,j}$ is the probability of the channel transitioning from state 0 to state j; $P_{1,j}$ is the probability of the channel transitioning from state 1 to state j; $q_{0,j}$ is the Q value of next state j under current state 0, with $q_{0,0}$ and $q_{0,1}$ the Q values of next states 0 and 1 under current state 0; and $q_{1,j}$ is the Q value of next state j under current state 1, with $q_{1,0}$ and $q_{1,1}$ the Q values of next states 0 and 1 under current state 1.
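Step (2d) normalizes each row of Q values into a probability row. A sketch under the assumption that the Q values are first clipped to be non-negative (otherwise the ratio is not a probability):

```python
def transition_probabilities(q):
    """Turn the 2x2 Q table into transition probabilities (step (2d)).

    Each row is normalized by its sum, so P[i][j] = q[i][j] / (q[i][0]
    + q[i][1]). A uniform row is returned when a row sums to zero,
    e.g. right after initialization.
    """
    p = [[0.5, 0.5], [0.5, 0.5]]
    for i in (0, 1):
        row = [max(q[i][0], 0.0), max(q[i][1], 0.0)]  # clip, see lead-in
        total = row[0] + row[1]
        if total > 0:
            p[i] = [row[0] / total, row[1] / total]
    return p
```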
The channel state learning process comprises the following steps.
The basic Q-learning elements of the channel state learning process are defined as follows:
State: the state probability of the authorized channel, i.e. the probability that the authorized channel state is 0 and the probability that it is 1;
Action: transmitting data or not transmitting data;
Reward: determined by whether data is transmitted and whether a collision occurs;
As shown in FIG. 3, the Q learning process of the channel state learning process is as follows:
(3a) predicting grant channel state probability
The cognitive radio calculates the probability of each state of the authorized channel according to the spectrum sensing result and the channel state transition probability output in the channel state transition probability learning process,
the calculation is as follows:

$$P_{t+1}^{\,i} = \frac{y_i \sum_{j \in \{0,1\}} P_t^{\,j}\, P_{j,i}}{\sum_{k \in \{0,1\}} y_k \sum_{j \in \{0,1\}} P_t^{\,j}\, P_{j,k}}, \qquad i = 0 \text{ or } 1$$

where $P_t^{\,i}$ is the probability that the authorized channel state is i at time t, i = 0 or 1; $P_{t+1}^{\,i}$ is the probability that the authorized channel state is i at time t+1; $y_i$ is the probability that the channel state observed from the sensing result is i; and $P_{i,j}$ is the probability of the authorized channel transitioning from state i to state j, j = 0 or 1;
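The belief update of step (3a) combines the learned transition probabilities with the sensing likelihoods. A sketch of one Bayes-style reading of the formula; the normalization shown is an assumption reconstructed from the symbol list above:

```python
def update_state_probability(p_t, p_trans, y):
    """Update the authorized-channel state belief (step (3a)).

    p_t[i] is the probability the state is i at time t, p_trans[j][i]
    the learned probability of moving from state j to state i, and
    y[i] the probability that the sensing result indicates state i.
    """
    # Predict: propagate the previous belief through the transitions.
    predicted = [sum(p_t[j] * p_trans[j][i] for j in (0, 1)) for i in (0, 1)]
    # Correct: weight each predicted state by its sensing likelihood.
    weighted = [y[i] * predicted[i] for i in (0, 1)]
    norm = weighted[0] + weighted[1]
    return [w / norm for w in weighted] if norm > 0 else [0.5, 0.5]
```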
(3b) selecting corresponding action
A corresponding action is selected according to the predicted state probability of the authorized channel. The specific selection process is as follows: if the probability that the authorized channel is idle is greater than the probability that it is occupied, i.e.

$$P_{t+1}^{\,0} > P_{t+1}^{\,1},$$

the authorized channel is considered idle, and data is transmitted with probability η and withheld with probability 1-η; otherwise, the authorized channel is considered occupied, and data is withheld with probability η and transmitted with probability 1-η;
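Step (3b) as a sketch; the default value of η is illustrative:

```python
import random

def choose_action(p_state, eta=0.9):
    """Pick 'transmit' or 'wait' from the state belief (step (3b)).

    If the idle probability p_state[0] exceeds the occupied
    probability p_state[1], transmit with probability eta and stay
    silent with probability 1 - eta; otherwise the roles swap.
    """
    if p_state[0] > p_state[1]:
        return "transmit" if random.random() < eta else "wait"
    return "wait" if random.random() < eta else "transmit"
```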
(3c) calculating rewards
If the selected action is data transmission and no collision with an authorized user or jammer occurs, the reward is 1; if the selected action is data transmission and a collision with an authorized user or jammer occurs, the reward is -1; if the selected action is not to transmit data, the reward is 0;
(3d) updating the Q value
The Q value is updated as follows:

$$Q(s, a) \leftarrow (1-\beta)\, Q(s, a) + \beta \left[ R_a + \chi \max_{a'} Q(s, a') \right]$$

where $Q(s, a)$ is the Q value of the selected action under the current state; $\max_{a'} Q(s, a')$ is the maximum action Q value under the current state; β is the learning rate; χ is the discount factor; and $R_a$ is the reward earned for performing the selected action;
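A sketch of steps (3c)-(3d) together. The patent's state here is the pair of state probabilities; this sketch discretizes it to the more likely state, which is a simplifying assumption:

```python
def update_action_q(q_act, p_state, action, collided, beta=0.1, chi=0.8):
    """Reward the chosen action and update its Q value (steps (3c)-(3d)).

    q_act[s][a] holds action Q values: s is 0 when idle is more
    likely, 1 otherwise; a is 0 for 'transmit' and 1 for 'wait'.
    """
    s = 0 if p_state[0] > p_state[1] else 1
    a = 0 if action == "transmit" else 1
    # Step (3c): reward 1 for a clean transmission, -1 for a
    # collision, 0 when no data was sent.
    r = 0 if action == "wait" else (-1 if collided else 1)
    best = max(q_act[s][0], q_act[s][1])
    q_act[s][a] = (1 - beta) * q_act[s][a] + beta * (r + chi * best)
    return r
```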
(3e) outputting grant channel state
If the selected action is data transmission and no collision with the authorized user or jammer occurs, the authorized channel state is output as idle; otherwise, the output channel state is occupied.
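Finally, a sketch of how the two processes could be coupled per time slot, reusing the helper functions sketched above; `sense()` and `collision(action)` stand in for the spectrum sensing front end and the collision feedback, and are assumptions of this sketch:

```python
def run_two_stage_prediction(sense, collision, steps=1000):
    """Couple the two Q-learning processes (cf. FIG. 1): the transition
    learner feeds probabilities to the state learner, and the state
    learner's output channel state trains the transition learner."""
    q_trans = [[0.0, 0.0], [0.0, 0.0]]  # process 1: transition Q table
    q_act = [[0.0, 0.0], [0.0, 0.0]]    # process 2: action Q table
    belief, s_t = [0.5, 0.5], 0
    for _ in range(steps):
        s_pred = select_next_state(q_trans, s_t)               # (2b)
        p = transition_probabilities(q_trans)                  # (2d)
        belief = update_state_probability(belief, p, sense())  # (3a)
        action = choose_action(belief)                         # (3b)
        collided = collision(action)
        update_action_q(q_act, belief, action, collided)       # (3c)-(3d)
        s_learned = 0 if action == "transmit" and not collided else 1  # (3e)
        s_t = update_transition_q(q_trans, s_t, s_pred, s_learned)     # (2c)
    return transition_probabilities(q_trans)
```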
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and such modifications and variations also fall within the protection scope of the present invention.

Claims (2)

1. A channel state transition probability prediction method in a cognitive radio network is characterized by comprising two learning processes, namely a channel state transition probability learning process and a channel state learning process;
the channel state transition probability learning process and the channel state learning process are based on a Q learning method, and the channel state learning process learns the state of a channel from the channel state transition probabilities learned by the channel state transition probability learning process and outputs the learned channel state;
the basic Q-learning elements of the channel state transition probability learning process are defined as follows:
State: the state of the authorized channel, 0 or 1; state 0 means the authorized channel is idle, and state 1 means it is occupied by an authorized user, attacked by a jammer, or both;
Action: the predicted authorized channel state at the next moment, i.e. 0 or 1;
Reward: determined by whether the predicted authorized channel state at the next moment matches the channel state learned by the channel state learning process;
the Q learning process of the channel state transition probability learning process is as follows:
(2a) initializing parameters in a Q-learning method
The parameters initialized in the Q learning method are the Q values, one for each pair of current channel state and next-moment channel state to be predicted; at initialization, every such Q value is set to 0;
(2b) action decision process
An action is selected according to the Q values of the next states under the current state, i.e. the channel state at the next moment is predicted;
actions are selected by a ξ-greedy method: with probability ξ the cognitive radio user selects the next-moment state with the maximum Q value, and with probability 1-ξ it selects the next-moment state with a non-maximum Q value;
The selection rule is:

$$S_{t+1} = \begin{cases} \arg\max_{s \in \{0,1\}} Q(S_t, s), & \text{with probability } \xi \\ \text{the non-maximum state}, & \text{with probability } 1-\xi \end{cases}$$

where $S_t$ is the channel state at time t; $S_{t+1}$ is the channel state at time t+1; and $Q(\cdot)$ is the Q value of each candidate next-moment channel state under state $S_t$;
(2c) updating the Q value
The Q value is updated according to the channel state learned by the channel state learning process, as follows:

$$Q(S_t, S_{t+1}) \leftarrow (1-\alpha)\, Q(S_t, S_{t+1}) + \alpha \left[ R + \gamma \max_{s \in \{0,1\}} Q(S_{t+1}, s) \right]$$

where $Q(S_t, S_{t+1})$ is the Q value of the next state selected under the current state; $\max_{s} Q(S_{t+1}, s)$ is the maximum Q value associated with the current channel state; α is the learning rate; γ is the discount factor; and R is the reward earned for performing the selected action;
when the action selected by the channel state transition probability learning process is the same as the channel state learned by the channel state learning process, the reward is recorded as 1; if the selected action differs from the learned channel state, the reward is recorded as -1;
(2d) computing channel state transition probabilities
Calculating the channel state transition probability according to the Q value of the next state corresponding to the current state;
The calculation is as follows:

$$P_{0,j} = \frac{q_{0,j}}{q_{0,0} + q_{0,1}}, \qquad P_{1,j} = \frac{q_{1,j}}{q_{1,0} + q_{1,1}}, \qquad j \in \{0,1\}$$

where $P_{0,j}$ is the probability of the channel transitioning from state 0 to state j; $P_{1,j}$ is the probability of the channel transitioning from state 1 to state j; $q_{0,j}$ is the Q value of next state j under current state 0, with $q_{0,0}$ and $q_{0,1}$ the Q values of next states 0 and 1 under current state 0; and $q_{1,j}$ is the Q value of next state j under current state 1, with $q_{1,0}$ and $q_{1,1}$ the Q values of next states 0 and 1 under current state 1.
2. The method for predicting channel state transition probability in a cognitive radio network according to claim 1, wherein the channel state learning process comprises the following steps;
the basic Q-learning elements of the channel state learning process are defined as follows:
State: the state probability of the authorized channel, i.e. the probability that the authorized channel state is 0 and the probability that it is 1;
Action: transmitting data or not transmitting data;
Reward: determined by whether data is transmitted and whether a collision occurs;
the Q learning process of the channel state learning process is as follows:
(3a) predicting grant channel state probability
The cognitive radio calculates the probability of each state of the authorized channel according to the spectrum sensing result and the channel state transition probability output in the channel state transition probability learning process,
the calculation is as follows:

$$P_{t+1}^{\,i} = \frac{y_i \sum_{j \in \{0,1\}} P_t^{\,j}\, P_{j,i}}{\sum_{k \in \{0,1\}} y_k \sum_{j \in \{0,1\}} P_t^{\,j}\, P_{j,k}}, \qquad i = 0 \text{ or } 1$$

where $P_t^{\,i}$ is the probability that the authorized channel state is i at time t, i = 0 or 1; $P_{t+1}^{\,i}$ is the probability that the authorized channel state is i at time t+1; $y_i$ is the probability that the channel state observed from the sensing result is i; and $P_{i,j}$ is the probability of the authorized channel transitioning from state i to state j, j = 0 or 1;
(3b) selecting corresponding action
A corresponding action is selected according to the predicted state probability of the authorized channel. The specific selection process is as follows: if the probability that the authorized channel is idle is greater than the probability that it is occupied, i.e.

$$P_{t+1}^{\,0} > P_{t+1}^{\,1},$$

the authorized channel is considered idle, and data is transmitted with probability η and withheld with probability 1-η; otherwise, the authorized channel is considered occupied, and data is withheld with probability η and transmitted with probability 1-η;
(3c) calculating rewards
If the selected action is data transmission and no collision with an authorized user or jammer occurs, the reward is 1; if the selected action is data transmission and a collision with an authorized user or jammer occurs, the reward is -1; if the selected action is not to transmit data, the reward is 0;
(3d) updating the Q value
The Q value is updated as follows:

$$Q(s, a) \leftarrow (1-\beta)\, Q(s, a) + \beta \left[ R_a + \chi \max_{a'} Q(s, a') \right]$$

where $Q(s, a)$ is the Q value of the selected action under the current state; $\max_{a'} Q(s, a')$ is the maximum action Q value under the current state; β is the learning rate; χ is the discount factor; and $R_a$ is the reward earned for performing the selected action;
(3e) outputting grant channel state
If the selected action is data transmission and no collision with the authorized user or jammer occurs, the authorized channel state is output as idle; otherwise, the output channel state is occupied.
CN201810696652.1A 2018-06-29 2018-06-29 Channel state transition probability prediction method in cognitive radio network Expired - Fee Related CN109039505B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810696652.1A CN109039505B (en) 2018-06-29 2018-06-29 Channel state transition probability prediction method in cognitive radio network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810696652.1A CN109039505B (en) 2018-06-29 2018-06-29 Channel state transition probability prediction method in cognitive radio network

Publications (2)

Publication Number Publication Date
CN109039505A CN109039505A (en) 2018-12-18
CN109039505B true CN109039505B (en) 2021-02-09

Family

ID=65521930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810696652.1A Expired - Fee Related CN109039505B (en) 2018-06-29 2018-06-29 Channel state transition probability prediction method in cognitive radio network

Country Status (1)

Country Link
CN (1) CN109039505B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110380802A (en) * 2019-06-14 2019-10-25 中国人民解放军陆军工程大学 Single user dynamic spectrum jamproof system and method based on Software Radio platform
CN110601826B (en) * 2019-09-06 2021-10-08 北京邮电大学 Self-adaptive channel distribution method in dynamic DWDM-QKD network based on machine learning
CN111181669B (en) * 2020-01-03 2022-02-11 中国科学院上海高等研究院 Self-adaptive spectrum sensing method, system, medium and terminal based on pre-evaluation processing
CN111211831A (en) * 2020-01-13 2020-05-29 东方红卫星移动通信有限公司 Multi-beam low-orbit satellite intelligent dynamic channel resource allocation method


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10091785B2 (en) * 2014-06-11 2018-10-02 The Board Of Trustees Of The University Of Alabama System and method for managing wireless frequency usage

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102665219A (en) * 2012-04-20 2012-09-12 南京邮电大学 Dynamic frequency spectrum allocation method of home base station system based on OFDMA
CN105120468A (en) * 2015-07-13 2015-12-02 华中科技大学 Dynamic wireless network selection method based on evolutionary game theory
CN105357158A (en) * 2015-10-26 2016-02-24 天津大学 Method for node to access multiple channels exactly and efficiently in underwater cognitive network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Koushik A. M. et al., "Intelligent Spectrum Management Based on Transfer Actor-Critic Learning for Rateless Transmissions in Cognitive Radio Networks," IEEE Transactions on Mobile Computing, vol. 17, no. 5, pp. 1204-1215, May 1, 2018. *
Zhu Jiang et al., "Transmission scheduling scheme based on deep Q-learning in wireless networks," Journal on Communications, vol. 39, no. 4, pp. 36-43, April 30, 2018. *

Also Published As

Publication number Publication date
CN109039505A (en) 2018-12-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210209