CN109039505B - Channel state transition probability prediction method in cognitive radio network - Google Patents
Channel state transition probability prediction method in cognitive radio network
- Publication number: CN109039505B
- Application number: CN201810696652.1A
- Authority
- CN
- China
- Prior art keywords
- state
- channel
- channel state
- probability
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/391—Modelling the propagation channel
- H04B17/3913—Predictive models, e.g. based on neural network models
Abstract
The invention provides a channel state transition probability prediction method for cognitive radio networks based on a two-stage Q learning method. The method comprises two learning processes: a channel state transition probability learning process and a channel state learning process. Both are based on Q learning, and the channel state learning process learns the state of a channel according to the channel state transition probability produced by the channel state transition probability learning process, then outputs the learned channel state. The method can predict a continuously changing authorized channel state in real time, requires no probability distribution to be assumed in advance for authorized users or interferers, and explicitly accounts for the possibility that the sampled channel states used for predicting the transition probability are incorrect.
Description
Technical Field
The invention relates to a method for predicting channel state transition probability in a cognitive radio network, and belongs to the technical field of radio networks.
Background
Over the past decades, wireless networks have supported ever-increasing data rates to meet consumer demand for fast, secure, and intelligent wireless services. However, current wireless systems face a spectrum-resource bottleneck that makes it difficult to improve performance within the limited available frequency band, so new communication paradigms are needed to further enhance wireless network performance. The industry predicts that wireless networks will require more spectrum resources to serve more high-rate users, and that current spectrum resources must be utilized more efficiently. Cognitive radio technology emerged to make better use of the limited spectrum: in a cognitive radio network, a cognitive radio user can dynamically utilize licensed spectrum without interfering with the normal communications of authorized users. However, the design of cognitive radio systems faces many security challenges. In particular, cognitive radio users may be attacked by jammers. A jammer equipped with cognitive functions can sense the channels occupied by unlicensed users and attack those channels, disrupting normal communication between cognitive radio users. Therefore, to ensure normal communication of cognitive radio users and to avoid interference to authorized users, prediction of channel state information is very important.
In a cognitive radio network, a cognitive radio user can generally sense the state of a licensed channel through spectrum sensing. However, due to hardware and energy limitations, a cognitive radio user may not be able to sense all licensed spectrum at the same time; even if it could, sensing the entire licensed spectrum would consume a large amount of time and introduce substantial delay into the user's communication. Therefore, to obtain a more accurate spectrum state within a limited time, more information about the channel is needed, the most important being the channel state transition probability. Many existing articles assume that the channel state is determined by the state of an authorized user and that the cognitive radio user knows the authorized user's state transition information. In a real cognitive radio network, however, this information is difficult for a cognitive radio user to obtain, so the cognitive radio user needs to predict the state information of the authorized channel. To date, only a small fraction of the related research has predicted the state transition information of authorized channels in cognitive radio networks.
Prior research on channel state transition probability prediction in cognitive radio networks includes the following:
Liu et al., in "Prediction of Exponentially Distributed Primary User Traffic for Dynamic Spectrum Access" (IEEE GLOBECOM 2012), proposed an authorized-user activity prediction method based on maximum likelihood. This method assumes that the activities of authorized users follow an exponential distribution and that sensing is perfect. The probability that an authorized user occupies a channel is first estimated from samples; the channel occupancy time is then estimated from this probability; the transition probability of the channel occupancy is estimated from the estimated occupancy probability and occupancy time; and finally the channel state at the next moment is estimated from the perfectly sensed channel state and the estimated transition probability. However, in a real cognitive radio network, the sensing result may be imperfect due to limits on sensing time and sensing method; the occupancy intervals must follow an exponential distribution; and a long sampling process is required before each prediction. The method therefore introduces long delays into the cognitive radio user's communication.
Song et al., in "Understanding the Predictability of Wireless Spectrum: A Large-scale Empirical Study" (IEEE ICC 2010), proposed a Markov-based channel prediction method. The method predicts the channel state of a slot from the first K channel states preceding it, by counting how many of those K states are occupied and how many are unoccupied. However, this approach encounters different error rates in different situations and in some cases faces a large prediction error rate. Furthermore, the channel states in the sequence used for prediction themselves suffer from sensing errors.
F. H. Panahi et al., in "Optimal Channel-Sensing Scheme for Cognitive Radio Systems Based on Fuzzy Q-Learning" (IEICE Transactions on Communications, 2014), proposed a channel transition probability prediction algorithm based on the Baum-Welch algorithm. However, this method must estimate the transition probability over a sizeable sample space before the probability can be used to predict the channel, and if the transition probability changes, a large amount of time must be spent resampling and re-estimating it. The algorithm therefore cannot predict the channel state in real time, which increases the communication delay of the cognitive radio user; moreover, the moment at which resampling becomes necessary cannot be predicted.
S. Filippi et al., in "An Estimation Algorithm of Channel State Transition Probability for Cognitive Radio Systems" (IEEE CrownCom 2008), proposed a maximum likelihood estimator that can estimate the channel transition probability relatively accurately. The method assumes, however, that the cognitive radio user observes the correct channel state; in practice, limits on sensing time, sensing method, and the surrounding communication environment mean that the true channel state may not be available. In addition, the cognitive radio user must collect channel state information over a long period to estimate the transition probability, which again increases communication delay.
Akbulut et al., in "Estimation of Time-Varying Channel State Transition Probabilities for Cognitive Radio Systems by Means of Particle Swarm Optimization" (Radioengineering, 2012), proposed a particle-swarm-based transition probability prediction method that can track varying transition probabilities. However, this method also faces the problem that the channel state information it uses may differ from the actual channel state.
Summarizing these studies, the following major problems exist in current channel state prediction methods for cognitive radio networks:
1. many articles default to the same state of the observed channel state sequence as the actual channel state. However, due to the spectrum sensing time of the unauthorized user, the spectrum sensing method, and the limitation of the communication environment, the observed channel state may deviate from the actual channel state to some extent.
2. Most articles must predict the channel state transition probability in advance, which delays the cognitive radio user's communication, and most cannot predict it in real time. Their transition probabilities therefore apply only to networks in which the transition probability never changes, whereas in a real cognitive radio network it may be time varying.
3. The prediction methods in some articles require the traffic to follow a particular probability distribution and therefore cannot be used in general cognitive radio networks.
Disclosure of Invention
The technical problem is as follows: the invention designs a channel state transition probability prediction method in a cognitive radio network based on a two-stage Q learning method. The prediction method comprises two Q-learning-based processes: a channel state transition probability learning process and a channel state learning process. The input of the channel state transition probability learning process is the channel state learned by the channel state learning process, and its output is the predicted channel state transition probability; the input of the channel state learning process is the output of the channel state transition probability learning process, and its output is the learned channel state. Through continuous learning of the channel, the cognitive radio user deduces the state transition probability and can therefore correctly infer the channel state at the next moment, avoiding both attacks by an interferer and interference to authorized users. The method can predict a continuously changing authorized channel state in real time, requires no probability distribution to be assumed in advance for authorized users or interferers, and explicitly accounts for the possibility that the sampled channel states used for predicting the transition probability are incorrect.
The technical scheme of the invention is as follows:
the invention relates to a channel state transition probability prediction method in a cognitive radio network, which comprises two learning processes, namely a channel state transition probability learning process and a channel state learning process;
the channel state transition probability learning process and the channel state learning process are both based on a Q learning method, and the channel state learning process can learn the state of a channel according to the channel state transition probability learned by the channel state transition probability learning process and output the learned channel state.
The basic elements of the Q learning method underlying the channel state transition probability learning process are as follows:
the state is as follows: the authorized channel states are 0 and 1; when the state is 0, the authorized channel is idle; when the state is 1, the authorized channel is occupied by an authorized user, attacked by an interferer, or both;
the actions are as follows: the predicted authorized channel state at the next moment, i.e., 0 or 1;
reward: determined by whether the predicted next-moment authorized channel state matches the channel state learned by the channel state learning process;
the Q learning process of the channel state transition probability learning process is as follows:
(2a) initializing parameters in a Q-learning method
The parameters initialized in the Q learning method are the Q values associating the current channel state with each candidate next-moment channel state; during initialization, all of these Q values are set to 0;
(2b) action decision process
An action is selected according to the Q values of the next states corresponding to the current state, i.e., the state of the channel at the next moment is predicted;
actions are selected by a ξ-greedy method: with probability ξ the cognitive radio user selects the next-moment state corresponding to the maximum Q value, and with probability 1-ξ selects a next-moment state whose Q value is not the maximum;
the selection rule is:

$S_{t+1} = \arg\max_{s'} Q(S_t, s')$ with probability $\xi$; otherwise $S_{t+1}$ is a state with a non-maximum Q value, selected with probability $1-\xi$;

where $S_t$ is the state of the channel at time $t$; $S_{t+1}$ is the state of the channel at time $t+1$; and $Q(\cdot)$ denotes the Q value associating channel state $S_t$ with each candidate next-moment channel state;
(2c) updating the Q value
The Q value is updated according to the channel state learned in the channel state learning process:

$Q(S_t, S_{t+1}) \leftarrow Q(S_t, S_{t+1}) + \alpha\big[R + \gamma \max_{s'} Q(S_t, s') - Q(S_t, S_{t+1})\big]$

where $Q(S_t, S_{t+1})$ is the Q value corresponding to the next state selected in the current state; $\max_{s'} Q(S_t, s')$ is the maximum Q value corresponding to the current channel state; $\alpha$ is the learning rate; $\gamma$ is the discount factor; and $R$ is the reward earned for performing the selected action;
when the action selected in the channel state transition probability learning process is the same as the channel state learned by the channel state learning process, the reward is recorded as 1; when it differs, the reward is recorded as -1;
(2d) computing channel state transition probabilities
The channel state transition probability is calculated from the Q values of the next states corresponding to each current state:

$P_{0,j} = \dfrac{q_{0,j}}{q_{0,0} + q_{0,1}}, \qquad P_{1,j} = \dfrac{q_{1,j}}{q_{1,0} + q_{1,1}}, \qquad j \in \{0, 1\}$

where $P_{i,j}$ is the probability of the channel transitioning from state $i$ to state $j$, and $q_{i,j}$ is the Q value of predicting next state $j$ from current state $i$.
A channel state learning process comprising the steps of:
the basic elements of the channel state learning process based on the Q learning method are respectively as follows:
the state is as follows: the state probability of the authorized channel, i.e., the probability that the state of the authorized channel is 0 and the probability that it is 1;
the actions are as follows: transmitting data or not transmitting data;
reward: determined by whether data is transmitted and whether a collision occurs;
the Q learning process of the channel state learning process is as follows:
(3a) predicting grant channel state probability
The cognitive radio user calculates the probability of each state of the authorized channel from the spectrum sensing result and the channel state transition probabilities output by the channel state transition probability learning process:

$P_i^{t+1} = \dfrac{y_i \sum_{j} P_j^{t} P_{j,i}}{\sum_{k} y_k \sum_{j} P_j^{t} P_{j,k}}, \qquad i, j, k \in \{0, 1\}$

where $P_i^{t}$ is the probability that the authorized channel is in state $i$ at time $t$; $P_i^{t+1}$ is that probability at time $t+1$; $y_i$ is the probability that the channel state observed from the sensing result is $i$; and $P_{i,j}$ is the probability of the authorized channel transitioning from state $i$ to state $j$;
(3b) selecting corresponding action
Selecting a corresponding action according to the predicted state probability of the authorized channel,
the specific selection process is as follows: if the probability that the authorized channel is idle is greater than the probability that it is occupied, i.e. $P_0^{t+1} > P_1^{t+1}$, the authorized channel is considered idle, data transmission is performed with probability η, and no data is transmitted with probability 1-η; otherwise, the authorized channel is considered occupied, no data is transmitted with probability η, and data transmission is performed with probability 1-η;
(3c) calculating rewards
If the selected action is data transmission and no collision with an authorized user or interferer occurs, the reward is 1; if the selected action is data transmission and a collision with an authorized user or interferer occurs, the reward is -1; if the selected action is not to transmit data, the reward is 0;
(3d) updating the Q value
The Q value updating method is:

$Q(s, a) \leftarrow Q(s, a) + \beta\big[R_a + \chi \max_{a'} Q(s, a') - Q(s, a)\big]$

where $Q(s, a)$ is the Q value corresponding to the action selected in the current state; $\max_{a'} Q(s, a')$ is the maximum Q value over the actions available in the current state; $\beta$ is the learning rate; $\chi$ is the discount factor; and $R_a$ is the reward earned for performing the selected action;
(3e) outputting grant channel state
If the selected action is data transmission and does not conflict with the authorized user and the interferers, outputting the authorized channel state as idle; otherwise, the output channel state is occupied.
The invention achieves the following beneficial effects:
(1) The invention takes into account whether the perceived channel state is correct. Through the channel state learning process, the cognitive radio user continuously refines the perceived channel state, obtains a correct channel state, and thereby avoids both interfering with authorized users and being attacked maliciously by interferers.
(2) The invention can evaluate the channel state transition probability in real time, including time-varying transition probabilities, avoiding evaluation delay and reducing the transition probability prediction error rate.
(3) The prediction method does not require assuming in advance that the activities of authorized users or interfering users follow any probability distribution, so it applies more generally to cognitive radio networks.
Drawings
FIG. 1 is an overall block diagram of the channel state transition probability prediction method of the present invention;
FIG. 2 is a flow chart of a channel state transition probability prediction method of the present invention;
fig. 3 is a flowchart of a channel state prediction method according to the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
As shown in fig. 1, the present invention relates to a method for predicting channel state transition probability in a cognitive radio network, which includes two learning processes, namely a channel state transition probability learning process and a channel state learning process;
the channel state transition probability learning process and the channel state learning process are both based on a Q learning method, and the channel state learning process can learn the state of a channel according to the channel state transition probability learned by the channel state transition probability learning process and output the learned channel state.
The basic elements of the Q learning method underlying the channel state transition probability learning process are as follows:
the state is as follows: the authorized channel states are 0 and 1; when the state is 0, the authorized channel is idle; when the state is 1, the authorized channel is occupied by an authorized user, attacked by an interferer, or both;
the actions are as follows: the predicted authorized channel state at the next moment, i.e., 0 or 1;
reward: determined by whether the predicted next-moment authorized channel state matches the channel state learned by the channel state learning process;
as shown in fig. 2, the Q learning process of the channel state transition probability learning process is as follows:
(2a) initializing parameters in a Q-learning method
The parameters initialized in the Q learning method are the Q values associating the current channel state with each candidate next-moment channel state; during initialization, all of these Q values are set to 0;
(2b) action decision process
An action is selected according to the Q values of the next states corresponding to the current state, i.e., the state of the channel at the next moment is predicted;
actions are selected by a ξ-greedy method: with probability ξ the cognitive radio user selects the next-moment state corresponding to the maximum Q value, and with probability 1-ξ selects a next-moment state whose Q value is not the maximum;
the selection rule is:

$S_{t+1} = \arg\max_{s'} Q(S_t, s')$ with probability $\xi$; otherwise $S_{t+1}$ is a state with a non-maximum Q value, selected with probability $1-\xi$;

where $S_t$ is the state of the channel at time $t$; $S_{t+1}$ is the state of the channel at time $t+1$; and $Q(\cdot)$ denotes the Q value associating channel state $S_t$ with each candidate next-moment channel state;
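As a concrete illustration, the ξ-greedy selection of step (2b) can be sketched in Python for the two-state channel; the function name, table layout, and `rng` parameter are illustrative assumptions, not part of the patent:

```python
import random

def xi_greedy_next_state(q_row, xi, rng=random.random):
    """Select the predicted next channel state for one current state.

    q_row: [Q(s, next=0), Q(s, next=1)] -- Q values of the two candidate
           next-moment channel states (states are 0 and 1).
    xi:    probability of choosing the maximum-Q next state.
    rng:   zero-argument callable returning a uniform draw in [0, 1).
    """
    greedy = 0 if q_row[0] >= q_row[1] else 1   # state with the maximum Q value
    if rng() < xi:
        return greedy                           # exploit with probability xi
    return 1 - greedy                           # otherwise take the non-maximum state
```

Passing a deterministic `rng` makes the rule easy to verify in isolation.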
(2c) updating the Q value
The Q value is updated according to the channel state learned in the channel state learning process:

$Q(S_t, S_{t+1}) \leftarrow Q(S_t, S_{t+1}) + \alpha\big[R + \gamma \max_{s'} Q(S_t, s') - Q(S_t, S_{t+1})\big]$

where $Q(S_t, S_{t+1})$ is the Q value corresponding to the next state selected in the current state; $\max_{s'} Q(S_t, s')$ is the maximum Q value corresponding to the current channel state; $\alpha$ is the learning rate; $\gamma$ is the discount factor; and $R$ is the reward earned for performing the selected action;
when the action selected in the channel state transition probability learning process is the same as the channel state learned by the channel state learning process, the reward is recorded as 1; when it differs, the reward is recorded as -1;
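The update of step (2c), together with its reward rule, amounts to one Q-learning step. The following sketch assumes an illustrative 2x2 Q table and function name (not taken from the patent), with the maximum taken over the current state's row as the description states:

```python
def update_transition_q(q, s_t, s_pred, s_learned, alpha=0.1, gamma=0.9):
    """One update of step (2c) for the transition-probability learner.

    q:         2x2 table, q[s][s_next] = Q value of predicting s_next in state s.
    s_t:       current channel state (0 or 1).
    s_pred:    next state selected by the xi-greedy decision (the action).
    s_learned: channel state reported by the channel state learning process.
    Returns the reward that was applied.
    """
    r = 1 if s_pred == s_learned else -1   # reward rule of step (2c)
    best = max(q[s_t])                     # maximum Q value in the current state
    q[s_t][s_pred] += alpha * (r + gamma * best - q[s_t][s_pred])
    return r
```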
(2d) computing channel state transition probabilities
The channel state transition probability is calculated from the Q values of the next states corresponding to each current state:

$P_{0,j} = \dfrac{q_{0,j}}{q_{0,0} + q_{0,1}}, \qquad P_{1,j} = \dfrac{q_{1,j}}{q_{1,0} + q_{1,1}}, \qquad j \in \{0, 1\}$

where $P_{i,j}$ is the probability of the channel transitioning from state $i$ to state $j$, and $q_{i,j}$ is the Q value of predicting next state $j$ from current state $i$.
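Step (2d) can then be realized as a row-wise normalization of the Q table. A minimal sketch, assuming nonnegative Q values (in practice the values would be shifted or clipped to keep the normalization well defined):

```python
def transition_probs(q):
    """Step (2d): normalize each row of the Q table into transition
    probabilities, P[i][j] = q[i][j] / (q[i][0] + q[i][1]).

    Assumes each row has a positive sum.
    """
    return [[row[0] / (row[0] + row[1]), row[1] / (row[0] + row[1])]
            for row in q]
```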
A channel state learning process comprising the steps of:
the basic elements of the channel state learning process based on the Q learning method are respectively as follows:
the state is as follows: the state probability of the authorized channel, i.e., the probability that the state of the authorized channel is 0 and the probability that it is 1;
the actions are as follows: transmitting data or not transmitting data;
reward: determined by whether data is transmitted and whether a collision occurs;
as shown in fig. 3, the Q learning process of the channel state learning process is as follows:
(3a) predicting grant channel state probability
The cognitive radio user calculates the probability of each state of the authorized channel from the spectrum sensing result and the channel state transition probabilities output by the channel state transition probability learning process:

$P_i^{t+1} = \dfrac{y_i \sum_{j} P_j^{t} P_{j,i}}{\sum_{k} y_k \sum_{j} P_j^{t} P_{j,k}}, \qquad i, j, k \in \{0, 1\}$

where $P_i^{t}$ is the probability that the authorized channel is in state $i$ at time $t$; $P_i^{t+1}$ is that probability at time $t+1$; $y_i$ is the probability that the channel state observed from the sensing result is $i$; and $P_{i,j}$ is the probability of the authorized channel transitioning from state $i$ to state $j$;
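One way to realize step (3a) is the prediction-correction step of a two-state hidden Markov filter: propagate the state probabilities through the transition matrix, weight by the sensing observation, and renormalize. This is a sketch under that interpretation; the function name and exact normalization are assumptions:

```python
def predict_state_probs(p_t, y, P):
    """Step (3a): combine the transition-based prediction with the
    spectrum sensing observation and renormalize.

    p_t: [P0, P1] channel state probabilities at time t.
    y:   [y0, y1] probabilities that the sensed state is 0 or 1.
    P:   2x2 matrix, P[i][j] = probability of moving from state i to j.
    """
    prior = [sum(p_t[j] * P[j][i] for j in range(2)) for i in range(2)]
    post = [y[i] * prior[i] for i in range(2)]
    z = post[0] + post[1]
    return [post[0] / z, post[1] / z]
```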
(3b) selecting corresponding action
Selecting a corresponding action according to the predicted state probability of the authorized channel,
the specific selection process is as follows: if the probability that the authorized channel is idle is greater than the probability that it is occupied, i.e. $P_0^{t+1} > P_1^{t+1}$, the authorized channel is considered idle, data transmission is performed with probability η, and no data is transmitted with probability 1-η; otherwise, the authorized channel is considered occupied, no data is transmitted with probability η, and data transmission is performed with probability 1-η;
(3c) calculating rewards
If the selected action is data transmission and no collision with an authorized user or interferer occurs, the reward is 1; if the selected action is data transmission and a collision with an authorized user or interferer occurs, the reward is -1; if the selected action is not to transmit data, the reward is 0;
(3d) updating the Q value
The Q value updating method is:

$Q(s, a) \leftarrow Q(s, a) + \beta\big[R_a + \chi \max_{a'} Q(s, a') - Q(s, a)\big]$

where $Q(s, a)$ is the Q value corresponding to the action selected in the current state; $\max_{a'} Q(s, a')$ is the maximum Q value over the actions available in the current state; $\beta$ is the learning rate; $\chi$ is the discount factor; and $R_a$ is the reward earned for performing the selected action;
(3e) outputting grant channel state
If the selected action is data transmission and does not conflict with the authorized user and the interferers, outputting the authorized channel state as idle; otherwise, the output channel state is occupied.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.
Claims (2)
1. A channel state transition probability prediction method in a cognitive radio network is characterized by comprising two learning processes, namely a channel state transition probability learning process and a channel state learning process;
the channel state transition probability learning process and the channel state learning process are based on a Q learning method, and the channel state learning process can learn the state of a channel according to the channel state transition probability learned by the channel state transition probability learning process and output the learned channel state;
the basic elements of the Q learning method on which the channel state transition probability learning process is based are as follows:
the state is as follows: the states of the authorized channel are 0 and 1; when the state of the authorized channel is 0, the authorized channel is idle, and when the state of the authorized channel is 1, the authorized channel is occupied by an authorized user, attacked by an interferer, or both;
the actions are as follows: the predicted grant channel state at the next time instant, i.e., 0 or 1;
reward: determined according to whether the predicted authorized channel state at the next moment differs from the channel state learned by the channel state learning process;
the Q learning process of the channel state transition probability learning process is as follows:
(2a) initializing parameters in a Q-learning method
The parameter in the Q learning method is the Q value corresponding to the channel state to be predicted at the next moment for the current channel state; during initialization, this Q value is set to 0;
(2b) action decision process
Selecting action according to the Q value of the next state corresponding to the current state, namely predicting the state of the channel at the next moment;
in the action decision process, actions are selected by the ξ-greedy method: with probability ξ the cognitive radio user selects the next-time state corresponding to the maximum Q value, and with probability 1−ξ it selects a next-time state corresponding to a non-maximum Q value;
the selection rule is
S_{t+1} = argmax_S Q(S_t, S) with probability ξ, and a non-maximum state with probability 1−ξ,
wherein S_t is the state of the channel at time t; S_{t+1} is the state of the channel at time t+1; Q(·) denotes the Q values of channel state S_t for the different next-time channel states;
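For the binary channel of claim 1, the ξ-greedy decision reduces to choosing between the greedy state and its complement. A minimal sketch, assuming states 0/1 and an injectable random source (both illustrative):

```python
import random

def xi_greedy_next_state(q, current, xi=0.9, rng=random.random):
    """Predict the next channel state from Q(current, .) with xi-greedy.

    With probability xi pick the next state with the maximum Q value;
    otherwise pick the other (non-maximum) state.
    """
    greedy = max((0, 1), key=lambda s: q[(current, s)])
    if rng() < xi:
        return greedy
    return 1 - greedy
```

With only two states the "non-maximum" choice is simply `1 - greedy`; with more states it would be a draw over the remaining states.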
(2c) updating the Q value
The Q value is updated according to the channel state learned in the channel state learning process:
Q(S_t, S_{t+1}) ← Q(S_t, S_{t+1}) + α[R + γ·max_S Q(S_t, S) − Q(S_t, S_{t+1})],
wherein Q(S_t, S_{t+1}) is the Q value corresponding to the next state selected in the current state; max_S Q(S_t, S) is the maximum Q value corresponding to the current channel state; α is the learning rate; γ is the discount factor; R is the reward obtained for performing the selected action;
when the action selected in the channel state transition probability learning process is the same as the channel state learned by the channel state learning process, the reward is recorded as 1; if the selected action differs from the learned channel state, the reward is recorded as −1;
(2d) computing channel state transition probabilities
The channel state transition probability is calculated from the Q values of the next states corresponding to the current state:
P_{0,j} = q_{0,j} / (q_{0,0} + q_{0,1}), P_{1,j} = q_{1,j} / (q_{1,0} + q_{1,1}),
wherein P_{0,j} is the probability of the channel transitioning from state 0 to state j; P_{1,j} is the probability of the channel transitioning from state 1 to state j; q_{0,j} is the Q value of next state j corresponding to current state 0; q_{0,0} is the Q value of next state 0 corresponding to current state 0; q_{0,1} is the Q value of next state 1 corresponding to current state 0; q_{1,j} is the Q value of next state j corresponding to current state 1; q_{1,0} is the Q value of next state 0 corresponding to current state 1; q_{1,1} is the Q value of next state 1 corresponding to current state 1.
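The computation of step (2d) appears as an image in the source; assuming the natural per-row normalization over the listed q values (an assumption; a softmax would be an alternative if Q values can go negative), the step can be sketched as:

```python
def transition_probs(q):
    """Normalize the per-state Q values into a 2x2 transition matrix.

    q: dict (i, j) -> Q value of predicting next state j from current state i.
    Assumes non-negative Q values with a non-zero row sum (an assumption,
    not stated in the source).
    """
    P = {}
    for i in (0, 1):
        row = q[(i, 0)] + q[(i, 1)]
        for j in (0, 1):
            P[(i, j)] = q[(i, j)] / row
    return P
```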
2. The channel state transition probability prediction method in a cognitive radio network according to claim 1, wherein the channel state learning process comprises the following steps:
the basic elements of the Q learning method on which the channel state learning process is based are as follows:
the state is as follows: the state probability of the grant channel, that is, the probability that the state of the grant channel is 0 and the probability that the state of the grant channel is 1;
the actions are as follows: transmitting data or not transmitting data;
reward: determined according to whether data is transmitted and whether a collision occurs;
the Q learning process of the channel state learning process is as follows:
(3a) predicting grant channel state probability
The cognitive radio calculates the probability of each state of the authorized channel according to the spectrum sensing result and the channel state transition probability output by the channel state transition probability learning process,
wherein p_i^t is the probability that the authorized channel state is i at time t, i = 0 or 1; p_i^{t+1} is the probability that the authorized channel state is i at time t+1, i = 0 or 1; Y_i is the probability that the authorized channel state observed from the sensing result is i, i = 0 or 1; P_{i,j} is the probability of the authorized channel transitioning from state i to state j, j = 0 or 1;
(3b) selecting corresponding action
Selecting the corresponding action according to the predicted state probability of the authorized channel,
the specific selection process is as follows: if the probability that the authorized channel is idle is greater than the probability that it is occupied, i.e. p_0^{t+1} > p_1^{t+1}, the authorized channel is considered to be in the idle state; data transmission is then performed with probability η and withheld with probability 1−η; otherwise the authorized channel is considered to be in the occupied state, and data transmission is withheld with probability η and performed with probability 1−η;
(3c) calculating rewards
If the selected action is data transmission and it does not collide with authorized users or interferers, the reward is 1; if the selected action is data transmission and it collides with an authorized user or an interferer, the reward is −1; if the selected action is not to transmit data, the reward is 0;
(3d) updating the Q value
The Q value is updated as
Q(s, a) ← Q(s, a) + β[R_α + χ·max_{a'} Q(s, a') − Q(s, a)],
wherein Q(s, a) is the Q value corresponding to the action a selected in the current state s; max_{a'} Q(s, a') is the maximum Q value over the actions in the current state; β is the learning rate; χ is the discount factor; R_α is the reward obtained for performing the selected action;
(3e) outputting grant channel state
If the selected action is data transmission and does not conflict with the authorized user and the interferers, outputting the authorized channel state as idle; otherwise, the output channel state is occupied.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810696652.1A CN109039505B (en) | 2018-06-29 | 2018-06-29 | Channel state transition probability prediction method in cognitive radio network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109039505A CN109039505A (en) | 2018-12-18 |
CN109039505B true CN109039505B (en) | 2021-02-09 |
Family
ID=65521930
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810696652.1A Expired - Fee Related CN109039505B (en) | 2018-06-29 | 2018-06-29 | Channel state transition probability prediction method in cognitive radio network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109039505B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110380802A (en) * | 2019-06-14 | 2019-10-25 | 中国人民解放军陆军工程大学 | Single-user dynamic spectrum anti-interference system and method based on software radio platform |
CN110601826B (en) * | 2019-09-06 | 2021-10-08 | 北京邮电大学 | Self-adaptive channel distribution method in dynamic DWDM-QKD network based on machine learning |
CN111181669B (en) * | 2020-01-03 | 2022-02-11 | 中国科学院上海高等研究院 | Self-adaptive spectrum sensing method, system, medium and terminal based on pre-evaluation processing |
CN111211831A (en) * | 2020-01-13 | 2020-05-29 | 东方红卫星移动通信有限公司 | Multi-beam low-orbit satellite intelligent dynamic channel resource allocation method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102665219A (en) * | 2012-04-20 | 2012-09-12 | 南京邮电大学 | Dynamic frequency spectrum allocation method of home base station system based on OFDMA |
CN105120468A (en) * | 2015-07-13 | 2015-12-02 | 华中科技大学 | Dynamic wireless network selection method based on evolutionary game theory |
CN105357158A (en) * | 2015-10-26 | 2016-02-24 | 天津大学 | Method for node to access multiple channels exactly and efficiently in underwater cognitive network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10091785B2 (en) * | 2014-06-11 | 2018-10-02 | The Board Of Trustees Of The University Of Alabama | System and method for managing wireless frequency usage |
Non-Patent Citations (2)
Title |
---|
Intelligent Spectrum Management Based on Transfer Actor-Critic Learning for Rateless Transmissions in Cognitive Radio Networks; Koushik A.M. et al.; IEEE Transactions on Mobile Computing; 2018-05-01; vol. 17, no. 5; pp. 1204-1215 *
Transmission scheduling scheme based on deep Q learning in wireless networks; Zhu Jiang et al.; Journal on Communications (通信学报); 2018-04; vol. 39, no. 4; pp. 36-43 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20210209 |