CN113573323A - Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle - Google Patents

Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle

Info

Publication number
CN113573323A
Authority
CN
China
Prior art keywords
channel
belief
knowledge
gradient
aerial vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110681497.8A
Other languages
Chinese (zh)
Inventor
杜丰
林艳
李骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202110681497.8A
Publication of CN113573323A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W16/00 Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cell structures
    • H04W16/22 Traffic simulation tools or models
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/309 Measuring or estimating channel quality parameters
    • H04B17/336 Signal-to-interference ratio [SIR] or carrier-to-interference ratio [CIR]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/309 Measuring or estimating channel quality parameters
    • H04B17/345 Interference values
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/391 Modelling the propagation channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W36/00 Hand-off or reselection arrangements
    • H04W36/24 Reselection being triggered by specific parameters
    • H04W36/30 Reselection being triggered by specific parameters by measured or perceived connection quality data

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Quality & Reliability (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a knowledge gradient-based method for rapidly selecting the optimal channel of an unmanned aerial vehicle. The method comprises the following steps: modeling the channel capacity of all channels as a lookup table belief model based on Bayes' theorem; initializing the belief model according to the past communication task experience of the unmanned aerial vehicle; calculating the knowledge gradient values of all channels from the belief state about the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as the channel for the current moment; communicating on the selected channel while monitoring the transmission rate, and updating the belief state of the channel capacity according to the monitored transmission rate; and repeating the above process until the time limit for channel selection, i.e. the budget, is used up. The method is suitable for rapid channel selection in highly dynamic unmanned aerial vehicle networks and effectively improves the speed of optimal channel selection.

Description

Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle communication, and particularly relates to an unmanned aerial vehicle optimal channel rapid selection method based on knowledge gradient.
Background
With the large-scale development and application of unmanned aerial vehicle technology, the anti-interference problem in unmanned aerial vehicle communication has become increasingly severe. Interference during communication comes not only from background noise but possibly also from jammers. A jammer transmits a signal of a certain interference strength on a channel to occupy channel resources; the unmanned aerial vehicle can respond with a strategy that selects channels that are not jammed or are less jammed, and thereby obtain a higher transmission rate on those channels.
At present, among the many anti-interference methods for unmanned aerial vehicles, frequency hopping is a common and easy-to-implement approach. However, few related algorithms have been proposed, mainly because of the particular adversarial environment in which the unmanned aerial vehicle operates. Based on the cognitive radio hypothesis, the channel selection process of the unmanned aerial vehicle is modeled as a Markov decision process (MDP), and the benefit of the unmanned aerial vehicle is maximized through a reinforcement learning algorithm (such as Q-learning), so that a channel with small interference and high benefit is selected in each time slot (Zhanyu. unmanned aerial vehicle network anti-interference method research [D]. Beijing University of Posts and Telecommunications. 2019). Alternatively, the channel selection process is likewise modeled as an MDP, and the best channel is selected by minimizing the perceived interference power through Q-learning (huang bang. Reinforcement learning copes well with jammers that follow a fixed strategy, but when the interference power and the jammed channels are highly random, the reinforcement learning algorithm converges with difficulty and takes a long time, so it struggles in the environment the unmanned aerial vehicle actually faces. In addition, there are methods that model unmanned aerial vehicle channel selection as a multi-armed bandit and estimate the best channel with a greedy algorithm or the UCB algorithm. Although these statistics-based methods handle randomness better, they face the same problem: a large amount of training is needed before they converge.
Disclosure of Invention
The invention aims to provide a method for quickly selecting an optimal channel of an unmanned aerial vehicle based on knowledge gradient, so that the speed of selecting the optimal channel of the unmanned aerial vehicle under the scene of random dynamic change of interference power is increased.
The technical solution for realizing the purpose of the invention is as follows: an unmanned aerial vehicle optimal channel rapid selection method based on knowledge gradient comprises the following steps:
step 1, modeling the channel capacity of all channels into a lookup table belief model based on Bayesian theorem;
step 2, initializing a lookup table belief model according to the unmanned aerial vehicle communication task experience;
step 3, calculating to obtain knowledge gradient values of all channels according to the belief state of the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as a communication channel at the current moment;
step 4, the unmanned aerial vehicle communicates on the selected channel, simultaneously monitors the transmission rate, and updates the belief state of the channel capacity according to the monitored transmission rate.
Further, the lookup table belief model in step 1 is based on bayes theorem, and specifically includes the following steps:
the lookup table belief model is used for modeling the channel capacity and consists of a belief mean and a belief variance of the channel capacity, which together are called the belief state; the belief state updated at the previous moment is the posterior distribution of the channel capacity and serves as the prior distribution in the computation at the current moment.
Further, in initializing the lookup table belief model in step 2, the initial values of the belief state are aggregated from past experience and include the belief mean of the channel capacity.
Further, in step 3, according to the belief state of the current time about the channel capacity, a knowledge gradient value about each channel is obtained by calculation, and a channel with the largest knowledge gradient is selected as the communication channel of the current time, specifically:
using a knowledge gradient algorithm based on a lookup table model, taking a belief state about channel capacity as prior distribution of the iteration, wherein the prior distribution obeys Gaussian distribution and is a two-dimensional table consisting of a belief mean and a belief variance;
and calculating to obtain knowledge gradient values of all channels in the current belief state according to a calculation formula of the knowledge gradient, selecting the channel with the maximum knowledge gradient value as a communication channel at the current moment, and monitoring the selected channel.
Further, a knowledge gradient algorithm based on a lookup table belief model is used for selecting a channel with minimum interference, and for a scene of variable power interference of a single jammer or multiple jammers on a communication channel, the method specifically comprises the following steps:
(1) lookup table belief model
Define S^t as the belief state over the channels n ∈ {1, 2, ..., N}; the action is to select one of the N channels,
Figure BDA0003122791110000021
wherein
Figure BDA0003122791110000022
is the channel capacity estimate for channel n at time t, and
Figure BDA0003122791110000023
is the belief standard deviation for channel n at time t. The belief state records the belief about the actual value r_n of the channel capacity, i.e. about EF(n, W), where W represents the observation of channel n, assuming
Figure BDA0003122791110000024
the knowledge gradient of channel n at time t is defined as:
Figure BDA0003122791110000025
Figure BDA0003122791110000026
is the update of
Figure BDA0003122791110000027
after channel n^t is selected at time t;
after the action is selected at each moment, the following reward, i.e. the observed value, is obtained:
Figure BDA0003122791110000028
r_n is the actual observed value of the channel capacity of channel n;
define the belief precision
Figure BDA0003122791110000031
and the noise precision
Figure BDA0003122791110000032
as follows:
Figure BDA0003122791110000033
Figure BDA0003122791110000034
wherein
Figure BDA0003122791110000035
is the variance of the channel capacity of channel n, obtained from experience and historical statistics;
Figure BDA0003122791110000036
is the belief variance, which is the square of the belief standard deviation;
based on the above definitions and formulas, the channel capacity estimate and the belief precision of the channel n selected at time t are updated as follows:
Figure BDA0003122791110000037
Figure BDA0003122791110000038
the remaining channels keep their belief state from the previous moment;
(2) knowledge gradient algorithm
The knowledge gradient of each channel's capacity under multi-channel, variable-power jamming is calculated with a knowledge gradient algorithm based on the lookup table belief model; the magnitude of the knowledge gradient represents the amount of information that can be obtained by selecting the corresponding channel; based on Bayesian theory and the assumption that the belief state follows a Gaussian distribution, the algorithm calculates the amount of information obtainable from each channel, and the larger this amount, the greater the improvement in decision making after the belief state is updated;
denote the variance of the change in the belief mean caused by selecting channel n at time t as
Figure BDA0003122791110000039
Figure BDA00031227911100000310
Figure BDA00031227911100000311
denotes the observation error;
then calculate
Figure BDA00031227911100000312
which is called the normalized influence of action n; it expresses, in units of standard deviations, how far the channel capacity estimate of the current action n lies from the best estimate among the other channels;
then calculate
f(ζ)=ζΦ(ζ)+φ(ζ)
wherein Φ(ζ) and φ(ζ) represent the cumulative standard normal distribution function and the standard normal density function, respectively;
in summary, the knowledge gradient corresponding to channel n at time t is written as:
Figure BDA0003122791110000041
considering the influence of the knowledge gradient at the current moment on the remaining moments, and balancing exploration against exploitation, the online knowledge gradient is finally used as the basis for channel selection:
Figure BDA0003122791110000042
given the current belief state S^t as input, define
Figure BDA0003122791110000043
and from it calculate the normalized influence
Figure BDA0003122791110000044
and the corresponding f(ζ); finally, obtain the online knowledge gradients of all channels at time t
Figure BDA0003122791110000045
The action is selected as
Figure BDA0003122791110000046
The actual communication rate of the selected channel is then observed, the belief state is updated according to the update equations of the lookup table belief model in (1), and channel selection for the next moment begins; each moment consumes one unit of budget, and the algorithm stops iterating when the budget is used up.
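The equation images referenced above are not reproduced in this text. As a reading aid only, the following block writes out the quantities as they appear in the standard knowledge-gradient formulation for independent normal beliefs (the formulation of the Powell reference cited in the Detailed Description). All symbols (θ, σ, β, σ̃, ν, T) are assumptions chosen to match the verbal definitions; the patent's own images may use different notation or indexing.

```latex
\begin{align}
S^t &= \bigl(\theta^t_n,\ \sigma^t_n\bigr)_{n=1}^{N},
  \qquad r_n \sim \mathcal{N}\bigl(\theta^t_n,\ (\sigma^t_n)^2\bigr) \\
\nu^{KG,t}_n &= \mathbb{E}\Bigl[\max_{n'} \theta^{t+1}_{n'} - \max_{n'} \theta^{t}_{n'}
  \,\Big|\, S^t,\ n^t = n\Bigr] \\
W^{t+1}_n &= r_n + \varepsilon^{t+1} \\
\beta^t_n &= \frac{1}{(\sigma^t_n)^2}, \qquad \beta^W_n = \frac{1}{(\sigma^W_n)^2} \\
\theta^{t+1}_n &= \frac{\beta^t_n \theta^t_n + \beta^W_n W^{t+1}_n}{\beta^t_n + \beta^W_n},
  \qquad \beta^{t+1}_n = \beta^t_n + \beta^W_n \\
\tilde{\sigma}^{2,t}_n &= (\sigma^t_n)^2 - (\sigma^{t+1}_n)^2
  = \frac{(\sigma^t_n)^2}{1 + (\sigma^W_n)^2 / (\sigma^t_n)^2} \\
\zeta^t_n &= -\left| \frac{\theta^t_n - \max_{n' \neq n} \theta^t_{n'}}{\tilde{\sigma}^t_n} \right| \\
\nu^{KG,t}_n &= \tilde{\sigma}^t_n \, f(\zeta^t_n), \qquad f(\zeta) = \zeta\Phi(\zeta) + \varphi(\zeta) \\
\nu^{OLKG,t}_n &= \theta^t_n + (T - t)\,\nu^{KG,t}_n,
  \qquad n^t = \arg\max_n \nu^{OLKG,t}_n
\end{align}
```

Here T stands for the total budget, so the factor (T − t) weights the value of information by the number of selections that can still benefit from it.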
Compared with the prior art, the invention has the following remarkable advantages: (1) by adopting a decision-making approach based on Bayesian theory, the value of the channel information, and whether that information needs to be collected again, is judged on a more principled basis: unlike ordinary decision methods, which either fully trust or fully distrust a calculated result, the method measures the data through the belief variance and thereby quantifies the degree of confidence; (2) using the knowledge gradient concept built on Bayesian theory, the value of the information obtainable after a channel is selected is quantified, and the online knowledge gradient, which balances the mean channel capacity against the value of the channel information, is used as the criterion for channel selection, which gives a higher convergence speed.
The present invention is described in further detail below with reference to the attached drawing figures.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of the unmanned aerial vehicle communication anti-interference transmission model.
Fig. 3 is a graph of the accumulated interference power on the 5 channels as a function of time.
Fig. 4 is a schematic diagram of the drone's channel selection in each time slot.
Fig. 5 is a statistical diagram of the number of times each channel is selected by the drone.
Detailed Description
Taking the uncertainty of the jammers into account, the invention applies the knowledge gradient (Powell, W. B., "The Knowledge Gradient for Optimal Learning," Encyclopedia of Operations Research and Management Science, 2011, John Wiley and Sons) to the field of drone communication, greatly reducing the convergence time. The invention provides a knowledge gradient-based method for rapidly selecting the optimal channel of an unmanned aerial vehicle. Using the update rule of a lookup table belief model built on the Bayesian assumption and a knowledge gradient formula built on the Gaussian assumption, the method quickly learns the information of all channels and, after only a few iterations, selects the channel with the minimum accumulated interference, i.e. the channel with the maximum accumulated channel capacity. The method specifically comprises the following steps:
step 1, modeling the channel capacity of all channels into a lookup table belief model based on Bayesian theorem;
step 2, initializing a belief model according to the communication task experience of the unmanned aerial vehicle;
step 3, calculating to obtain knowledge gradient values of all channels according to the belief state of the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as a communication channel at the current moment;
step 4, the unmanned aerial vehicle communicates on the selected channel, simultaneously monitors the transmission rate, and updates the belief state of the channel capacity according to the monitored transmission rate.
Further, the lookup table belief model in step 1 is based on bayes theorem, which is specifically as follows:
the lookup table belief model is used for modeling the channel capacity and consists of a belief mean and a belief variance of the channel capacity, which together are called the belief state; the belief state updated at the previous moment is the posterior distribution of the channel capacity and can serve as the prior distribution in the computation at the current moment.
Further, the initial value of the belief state in step 2 is aggregated from past experience and mainly includes the belief mean of the channel capacity.
Further, the belief state about the channel capacity in step 3 is used as the prior distribution of the iteration, and then the knowledge gradient values of all channels in the current belief state are calculated, specifically:
using a knowledge gradient algorithm based on a lookup table model, taking the belief state obtained in the step 2 or the step 4 as prior distribution, wherein the distribution obeys Gaussian distribution and is a two-dimensional table consisting of a belief mean and a belief variance;
and calculating to obtain the knowledge gradient value of each channel according to a calculation formula of the knowledge gradient, selecting the channel with the maximum value as the communication channel at the current moment, and monitoring the selected channel.
Further, aiming at the variable power interference scene of a single jammer or a plurality of jammers on a communication channel, a knowledge gradient algorithm based on a lookup table belief model is used for selecting the channel with the minimum interference, and the method specifically comprises the following steps:
(1) lookup table belief model
Define S^t as the belief state over the channels n ∈ {1, 2, ..., N}; the action is to select one of the N channels,
Figure BDA0003122791110000051
wherein
Figure BDA0003122791110000052
is the channel capacity estimate for channel n at time t, and
Figure BDA0003122791110000053
is the belief variance for channel n at time t. The belief state records the belief about the actual value r_n of the channel capacity, i.e. about EF(n, W), where W denotes the observation of channel n, assuming
Figure BDA0003122791110000061
the knowledge gradient of channel n at time t is defined as:
Figure BDA0003122791110000062
Figure BDA0003122791110000063
is the update of
Figure BDA0003122791110000064
after channel n^t is selected at time t.
After the action is selected at each moment, the following reward, i.e. the observed value, is obtained:
Figure BDA0003122791110000065
r_n is the actual observed value of the channel capacity of channel n.
Define the belief precision and the noise precision:
Figure BDA0003122791110000066
Figure BDA0003122791110000067
wherein
Figure BDA0003122791110000068
is the variance of the channel capacity of channel n, obtained from experience and historical statistics, and
Figure BDA0003122791110000069
is the belief variance, which is the square of the belief standard deviation.
Based on the above definitions and formulas, the channel capacity estimate and the belief precision of the channel n selected at time t are updated as follows:
Figure BDA00031227911100000610
Figure BDA00031227911100000611
the remaining channels keep their belief state from the previous moment.
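The update equations themselves are only available as images. As a minimal sketch, assuming the standard conjugate normal update in precision form (the posterior precision is the sum of the belief precision and the noise precision, and the posterior mean is their precision-weighted average), the update for the selected channel could look like this. The function and argument names are illustrative and do not come from the patent.

```python
def update_belief(theta, beta, beta_w, observation):
    """Return the posterior (mean, precision) of the selected channel.

    theta, beta : prior belief mean and belief precision (1 / belief variance)
    beta_w      : noise precision (1 / observation-noise variance)
    observation : transmission rate monitored on the selected channel
    """
    new_beta = beta + beta_w  # precisions add under the Gaussian assumption
    new_theta = (beta * theta + beta_w * observation) / new_beta
    return new_theta, new_beta


# Hypothetical numbers: prior mean 2.0 with precision 1.0, one observation of 4.0
# with noise precision 1.0. Equal precisions place the posterior mean halfway.
posterior_mean, posterior_precision = update_belief(2.0, 1.0, 1.0, 4.0)
```

Because the precision only ever grows, a channel that has been measured often moves little in response to a new observation, while a rarely measured channel moves a lot.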
(2) Knowledge gradient algorithm
The knowledge gradient of each channel's capacity under multi-channel, variable-power jamming is calculated with a knowledge gradient algorithm based on the lookup table belief model; the magnitude of the knowledge gradient represents the amount of information that can be obtained by selecting the corresponding channel. Based on Bayesian theory and the assumption that the belief state follows a Gaussian distribution, the algorithm calculates the amount of information obtainable from each channel; the larger this amount, the greater the improvement in decision making after the belief state is updated.
First, define the variance of the change in the belief mean caused by selecting channel n at time t:
Figure BDA00031227911100000612
Figure BDA0003122791110000071
denotes the observation error.
Then calculate
Figure BDA0003122791110000072
which is called the normalized influence of action n; it expresses, in units of standard deviations, how far the channel capacity estimate of the current action n lies from the best estimate among the other channels.
Then calculate
f(ζ)=ζΦ(ζ)+φ(ζ)
where Φ(ζ) and φ(ζ) represent the cumulative standard normal distribution function and the standard normal density function, respectively.
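The function f can be evaluated with the Python standard library alone. The sketch below is illustrative (the helper names are not from the patent) and relies on the identity Φ(z) = (1 + erf(z/√2))/2. A useful way to read f is that f(ζ) equals the expected value of max(0, Z + ζ) for a standard normal Z, which is why it is never negative and grows with ζ.

```python
import math


def std_normal_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Cumulative standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_f(zeta):
    """f(zeta) = zeta * Phi(zeta) + phi(zeta)."""
    return zeta * std_normal_cdf(zeta) + std_normal_pdf(zeta)


# The normalized influence is never positive, so only zeta <= 0 matters here.
values = [kg_f(z) for z in (-3.0, -2.0, -1.0, 0.0)]
```

At ζ = 0 (the channel is tied with the best alternative) f reaches its largest relevant value, 1/√(2π); as ζ becomes more negative, f shrinks toward zero, so a channel whose estimate is many standard deviations away from the best alternative carries little information value.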
In summary, the knowledge gradient corresponding to channel n at time t can be written as:
Figure BDA0003122791110000073
considering the influence of the knowledge gradient at the current moment on the remaining moments, and balancing exploration against exploitation, the online knowledge gradient is finally used as the basis for channel selection:
Figure BDA0003122791110000074
given the current belief state S^t as input, define
Figure BDA0003122791110000075
and from it calculate the normalized influence
Figure BDA0003122791110000076
and the corresponding f(ζ); finally, obtain the online knowledge gradients of all channels at time t
Figure BDA0003122791110000077
The action is selected as
Figure BDA0003122791110000078
The actual communication rate of the selected channel is then observed, the belief state is updated according to the update equations of the lookup table belief model in (1), and channel selection for the next moment begins; each moment consumes one unit of budget, and the algorithm stops iterating when the budget is used up.
Examples
One embodiment of the invention is described in detail below; the simulation is programmed in Python, and the parameter settings do not affect generality. The embodiment verifies the effectiveness and convergence of the proposed model and algorithm. The scenario is as follows:
Since the algorithm assumes an instantaneous task, there is only a single drone user in the scene; the positions of the drone and the ground base station are fixed relative to each other during communication, and 5 channels are available. In each time slot a communication link is established between the airborne drone and the ground base station; the flying height of the drone is 80 m, the distance between the drone and the base station is 100 m, and the transmitting power of the drone is 0.4 W. Five malicious jammers simultaneously interfere with the user's normal transmission; their positions are also fixed, and each is 100 m from the base station. The interference pattern of every jammer is multi-tone random interference, i.e. a single jammer interferes on all 5 channels, and the jammers differ in the distribution of their interference power. In addition, the interference powers of the jammers are independent of each other and follow different distributions: some jammers use interference power that follows a Gaussian distribution, and others use a uniform distribution. The specific interference power settings are shown in Table 1:
TABLE 1
Jammer serial number     Interference power mean / lower bound     Interference power variance / upper bound
Jammer 1 (Gaussian)      [0.4, 0.6, 0.8, 0.8, 1.0]                 [0.3, 0.3, 0.3, 0.3, 0.3]
Jammer 2 (Gaussian)      [0.5, 0.5, 0.8, 0.7, 0.9]                 [0.2, 0.2, 0.2, 0.2, 0.2]
Jammer 3 (Uniform)       [0.1, 0.2, 0.3, 0.4, 0.5]                 [0.8, 0.9, 0.9, 1.0, 1.2]
Jammer 4 (Uniform)       [0.3, 0.2, 0.3, 0.6, 0.8]                 [1.0, 1.2, 1.0, 1.4, 1.5]
Jammer 5 (Uniform)       [0.2, 0.2, 0.2, 0.2, 0.2]                 [0.9, 0.9, 0.9, 0.9, 0.9]
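As a quick arithmetic check of Table 1, the expected total interference power on each channel can be computed directly. The sketch assumes the table is read as (mean, variance) for the Gaussian jammers and (lower bound, upper bound) for the uniform jammers, and it ignores any clipping of negative Gaussian samples; the variable and function names are illustrative.

```python
# Gaussian jammers: only the per-channel means matter for the expectation.
gaussian_means = [
    [0.4, 0.6, 0.8, 0.8, 1.0],  # jammer 1
    [0.5, 0.5, 0.8, 0.7, 0.9],  # jammer 2
]
# Uniform jammers: (lower bounds, upper bounds) per channel.
uniform_bounds = [
    ([0.1, 0.2, 0.3, 0.4, 0.5], [0.8, 0.9, 0.9, 1.0, 1.2]),  # jammer 3
    ([0.3, 0.2, 0.3, 0.6, 0.8], [1.0, 1.2, 1.0, 1.4, 1.5]),  # jammer 4
    ([0.2, 0.2, 0.2, 0.2, 0.2], [0.9, 0.9, 0.9, 0.9, 0.9]),  # jammer 5
]


def expected_total_interference(channel):
    """Sum of the expected interference powers of all five jammers on one channel."""
    total = sum(means[channel] for means in gaussian_means)
    # The mean of a uniform distribution is the midpoint of its bounds.
    total += sum((low[channel] + high[channel]) / 2.0 for low, high in uniform_bounds)
    return total


expected = [expected_total_interference(c) for c in range(5)]
quietest = min(range(5), key=lambda c: expected[c])
```

Under these assumptions the expected totals are approximately 2.55, 2.9, 3.4, 3.75 and 4.45 for channels 0 to 4, so the channel with index 0 has the lowest expected interference.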
The knowledge gradient-based unmanned aerial vehicle channel selection algorithm provided by the invention, described with reference to Figs. 1-2, proceeds as follows:
Step 1: initialization: set the iteration time t = 0; initialize the belief model by estimating the mean channel capacity from past experience and setting the same belief variance for every channel; initialize the other fixed system parameters.
Step 2: iterate repeatedly, with the following substeps:
1) calculate and store, in turn, the online knowledge gradients of all channels; the knowledge gradient is calculated as follows:
① calculate the variance of the change in the belief mean
Figure BDA0003122791110000081
② calculate the normalized influence of channel n
Figure BDA0003122791110000082
③ calculate f(ζ) = ζΦ(ζ) + φ(ζ), where Φ(ζ) and φ(ζ) represent the cumulative standard normal distribution function and the standard normal density function, respectively;
④ calculate the knowledge gradient
Figure BDA0003122791110000083
⑤ calculate the online knowledge gradient
Figure BDA0003122791110000084
2) select the communication channel according to the online knowledge gradient of each channel at the current time t;
3) after the channel is selected, the unmanned aerial vehicle monitors the actual transmission rate of the channel and updates the belief state by the following formulas
Figure BDA0003122791110000085
Figure BDA0003122791110000086
Figure BDA0003122791110000087
where the belief precision and the noise precision are:
Figure BDA0003122791110000088
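The iteration above can be sketched end to end as follows. This is not the patent's simulation code: the formulas follow the standard knowledge-gradient policy for independent normal beliefs, the environment is a toy stand-in with made-up true rates rather than the jamming scenario of Table 1, and every name (`online_kg_scores`, `run_kg`, `observe`) is hypothetical.

```python
import math
import random


def kg_factor(zeta):
    """f(zeta) = zeta * Phi(zeta) + phi(zeta) for the standard normal."""
    cdf = 0.5 * (1.0 + math.erf(zeta / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * zeta * zeta) / math.sqrt(2.0 * math.pi)
    return zeta * cdf + pdf


def online_kg_scores(theta, beta, beta_w, remaining):
    """Online knowledge gradient of every channel (needs at least two channels)."""
    scores = []
    for n in range(len(theta)):
        # Variance of the change in the belief mean: prior minus posterior variance.
        sigma_tilde = math.sqrt(1.0 / beta[n] - 1.0 / (beta[n] + beta_w))
        best_other = max(theta[m] for m in range(len(theta)) if m != n)
        zeta = -abs(theta[n] - best_other) / sigma_tilde  # normalized influence
        kg = sigma_tilde * kg_factor(zeta)                # knowledge gradient
        scores.append(theta[n] + remaining * kg)          # online knowledge gradient
    return scores


def run_kg(prior_mean, prior_var, noise_var, budget, observe):
    """Select a channel `budget` times; observe(n) returns the measured rate."""
    theta = list(prior_mean)
    beta = [1.0 / v for v in prior_var]  # belief precisions
    beta_w = 1.0 / noise_var             # noise precision (assumed equal for all channels)
    counts = [0] * len(theta)
    for t in range(budget):
        scores = online_kg_scores(theta, beta, beta_w, budget - t)
        n = max(range(len(theta)), key=lambda i: scores[i])
        rate = observe(n)
        # Bayesian update of the selected channel only; the others are unchanged.
        theta[n] = (beta[n] * theta[n] + beta_w * rate) / (beta[n] + beta_w)
        beta[n] += beta_w
        counts[n] += 1
    return theta, beta, counts


# Toy environment (hypothetical numbers): noisy rates around fixed true means.
rng = random.Random(0)
true_rates = [5.0, 4.0, 3.0, 2.0, 1.0]
theta, beta, counts = run_kg(
    prior_mean=[3.0] * 5, prior_var=[4.0] * 5, noise_var=1.0, budget=100,
    observe=lambda n: rng.gauss(true_rates[n], 1.0),
)
```

`counts` records how often each channel was selected, which corresponds to the kind of statistic Fig. 5 reports. Weighting the knowledge gradient by the remaining budget is one common choice for the online variant; with a short remaining budget the score reduces to the belief mean and the policy simply exploits.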
As can be seen from Fig. 3, the accumulated interference power on each channel increases with time. Channel C0 is the channel with the least accumulated interference and, conversely, the largest channel capacity. Fig. 4 shows whether the drone selects the channel with the smallest accumulated interference in each time slot. From Figs. 3 and 4 it can be seen that, after the drone has run the loop a number of times, its channel selection stabilizes and convergence is reached. Convergence is reached within 100 iterations, which is faster than algorithms based on historical statistics such as UCB or ε-greedy. As can be seen from Fig. 5, in the process of learning the channel capacity and selecting a communication channel, the drone tends to select the channel with the minimum actual interference mean, which verifies the validity of the algorithm.
In conclusion, the knowledge gradient-based rapid selection method for the optimal channel of the unmanned aerial vehicle makes full use of the experience about channel capacity gained in historical communication. By applying Bayesian theory it converges quickly, learns which channel has the maximum mean channel capacity, and keeps selecting that channel as the actual communication channel. The method is therefore well suited to the channel selection task of an unmanned aerial vehicle in a time-limited environment.

Claims (5)

1. A method for quickly selecting an optimal channel of an unmanned aerial vehicle based on knowledge gradient is characterized by comprising the following steps:
step 1, modeling the channel capacity of all channels into a lookup table belief model based on Bayesian theorem;
step 2, initializing a lookup table belief model according to the unmanned aerial vehicle communication task experience;
step 3, calculating to obtain knowledge gradient values of all channels according to the belief state of the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as a communication channel at the current moment;
step 4, the unmanned aerial vehicle communicates on the selected channel, simultaneously monitors the transmission rate, and updates the belief state of the channel capacity according to the monitored transmission rate.
2. The method for quickly selecting the optimal channel of the unmanned aerial vehicle based on the knowledge gradient according to claim 1, wherein the lookup table belief model in the step 1 is based on Bayesian theorem and specifically comprises the following steps:
the lookup table belief model is used for modeling the channel capacity and consists of a belief mean and a belief variance of the channel capacity, which together are called the belief state; the belief state updated at the previous moment is the posterior distribution of the channel capacity and serves as the prior distribution in the computation at the current moment.
3. The knowledge gradient-based unmanned aerial vehicle optimal channel rapid selection method according to claim 1, wherein, in initializing the lookup table belief model in step 2, the initial values of the belief state are aggregated from past experience and include the belief mean of the channel capacity.
4. The method for quickly selecting the optimal channel of the unmanned aerial vehicle based on the knowledge gradient as claimed in claim 1,2 or 3, wherein in step 3, the knowledge gradient value of each channel is calculated according to the belief state about the channel capacity at the current moment, and the channel with the largest knowledge gradient is selected as the communication channel at the current moment, specifically:
using a knowledge gradient algorithm based on a lookup table model, taking a belief state about channel capacity as prior distribution of the iteration, wherein the prior distribution obeys Gaussian distribution and is a two-dimensional table consisting of a belief mean and a belief variance;
and calculating to obtain knowledge gradient values of all channels in the current belief state according to a calculation formula of the knowledge gradient, selecting the channel with the maximum knowledge gradient value as a communication channel at the current moment, and monitoring the selected channel.
5. The method for quickly selecting the optimal channel of the unmanned aerial vehicle based on the knowledge gradient as claimed in claim 4, wherein a knowledge gradient algorithm based on a lookup table belief model is used to select the channel with the minimum interference, and for a scene of variable power interference of a single jammer or multiple jammers on a communication channel, the method specifically comprises the following steps:
(1) lookup table belief model
Define S^t as the belief state over the channels n ∈ {1, 2, ..., N}; the action is to select one of the N channels,
Figure FDA0003122791100000011
where
Figure FDA0003122791100000012
is the channel capacity estimate for channel n at time t, and
Figure FDA0003122791100000013
is the belief standard deviation for channel n at time t; the belief state records the belief about the true channel capacity r_n = E F(n, W), where W denotes the observation of channel n; assuming
Figure FDA0003122791100000021
the gradient of knowledge about the channel n at time t is defined as:
Figure FDA0003122791100000022
Figure FDA0003122791100000023
is the result, after channel n_t is selected at time t, of updating
Figure FDA0003122791100000024
;
after the action is selected at each moment, the following reward, namely the observed value, is obtained:
Figure FDA0003122791100000025
where r_n is the actual observation of the channel capacity of channel n;
define the belief precision
Figure FDA0003122791100000026
and the noise precision
Figure FDA0003122791100000027
Figure FDA0003122791100000028
Figure FDA0003122791100000029
where
Figure FDA00031227911000000210
is the variance of the channel capacity of channel n, obtained from experience and historical statistics;
Figure FDA00031227911000000211
is the belief variance, which is the square of the belief standard deviation;
based on the above definitions and formulas, the channel capacity estimate and the belief precision of the channel n selected at time t are updated as follows:
Figure FDA00031227911000000212
Figure FDA00031227911000000213
the remaining channels keep their belief state from the previous moment;
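The update equations themselves are only present as figure placeholders in this text. As a minimal sketch, assuming they are the standard conjugate Gaussian (precision-weighted) update used in the knowledge gradient literature, the lookup table update could look as follows. The names `Belief`, `update_belief` and `step` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    mean: float       # belief mean of the channel capacity
    precision: float  # belief precision, i.e. 1 / belief variance


def update_belief(belief, observation, noise_precision):
    """Assumed standard update: precisions add, means are precision-weighted."""
    new_precision = belief.precision + noise_precision
    new_mean = (belief.precision * belief.mean
                + noise_precision * observation) / new_precision
    return Belief(new_mean, new_precision)


def step(table, chosen, observation, noise_precisions):
    """Return a new lookup table in which only the chosen channel is updated."""
    new_table = list(table)
    new_table[chosen] = update_belief(table[chosen], observation,
                                      noise_precisions[chosen])
    return new_table


# Two channels, each with prior mean and variance 4 (precision 0.25).
table = [Belief(10.0, 0.25), Belief(8.0, 0.25)]
noise_precisions = [1.0, 1.0]  # assumed noise variance of 1 per channel
updated = step(table, 0, 12.0, noise_precisions)
```

Under these assumptions, each observation increases the precision of the selected channel by the noise precision, so its belief variance shrinks with every measurement while the other entries of the table are carried over unchanged.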
(2) knowledge gradient algorithm
The knowledge gradient algorithm based on the lookup table belief model is used to compute the knowledge gradient of each channel's capacity in the scenario in which jammers interfere with multiple channels at variable power, wherein the magnitude of the knowledge gradient represents the amount of information that can be obtained by selecting the corresponding channel; the knowledge gradient algorithm computes the amount of information obtainable from each channel based on Bayesian theory and on the assumption that the belief state is Gaussian, and the larger the amount of information, the more the decision improves after the belief state is updated;
denote the variance of the change in the belief mean caused by selecting channel n at time t as
Figure FDA00031227911000000214
Figure FDA0003122791100000031
Figure FDA0003122791100000032
denotes the observation error;
then compute
Figure FDA0003122791100000033
Figure FDA0003122791100000034
is called the normalized influence of action n; it expresses, in standard deviations, how far the current capacity estimate of the channel chosen by action n lies from the best estimate among the other channels;
next compute
f(ζ) = ζΦ(ζ) + φ(ζ)
where Φ(ζ) and φ(ζ) denote the standard normal cumulative distribution function and the standard normal density function, respectively;
in summary, the knowledge gradient corresponding to the channel n at the time t is written as:
Figure FDA0003122791100000035
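The formula images for this step are missing from the text, so the following is a sketch under stated assumptions rather than the patent's exact expressions. It assumes the standard knowledge gradient for independent Gaussian beliefs: the variance of the change in the belief mean equals the prior variance minus the posterior variance, the normalized influence is ζ = -|mean_n - max of the other means| / σ̃_n, and the knowledge gradient is σ̃_n f(ζ). The function and parameter names are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def f(zeta):
    """f(zeta) = zeta * Phi(zeta) + phi(zeta), as stated in the claim."""
    return zeta * norm_cdf(zeta) + norm_pdf(zeta)


def knowledge_gradients(means, variances, noise_variances):
    """Knowledge gradient of every channel (needs at least two channels)."""
    n = len(means)
    kg = []
    for i in range(n):
        # Assumed: sigma_tilde^2 = prior variance - posterior variance
        #                        = variance^2 / (variance + noise variance).
        sigma_tilde = variances[i] / math.sqrt(variances[i] + noise_variances[i])
        if sigma_tilde == 0.0:
            kg.append(0.0)  # nothing left to learn about this channel
            continue
        best_other = max(means[j] for j in range(n) if j != i)
        zeta = -abs(means[i] - best_other) / sigma_tilde
        kg.append(sigma_tilde * f(zeta))
    return kg


# Three channels with equal means; the more uncertain a channel is,
# the more information a measurement of it is expected to yield.
kg = knowledge_gradients([10.0, 10.0, 10.0], [1.0, 4.0, 9.0], [1.0, 1.0, 1.0])
```

With equal means, ζ is zero for every channel, so the ranking in this example is decided purely by σ̃: the channel with the largest belief variance receives the largest knowledge gradient.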
considering the influence of the knowledge gradient at the current moment on the remaining moments, and balancing exploration against exploitation of the data, the online knowledge gradient is finally used as the basis for channel selection:
Figure FDA0003122791100000036
taking the current belief state S^t as input, define
Figure FDA0003122791100000037
and compute from it the normalized influence
Figure FDA0003122791100000038
and the corresponding f(ζ); finally, obtain the online knowledge gradients of all channels at time t
Figure FDA0003122791100000039
The selected action is
Figure FDA00031227911000000310
The actual communication rate of the selected channel is then observed, the belief state is updated according to the update equations of the lookup table belief model in step (1), and channel selection for the next moment begins; each moment consumes one unit of the budget, and the algorithm stops iterating when the budget is used up.
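The whole procedure of step (2) can be sketched as a loop. The online knowledge gradient formula is not recoverable from the figure placeholders, so this sketch assumes the form commonly used in the literature: score = belief mean + (number of selections remaining after the current one) × knowledge gradient. That weighting, the function names, and the `observe` callback are all assumptions made for illustration, and all variances are assumed to be strictly positive.

```python
import math


def f(zeta):
    """f(zeta) = zeta * Phi(zeta) + phi(zeta)."""
    cdf = 0.5 * (1.0 + math.erf(zeta / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * zeta * zeta) / math.sqrt(2.0 * math.pi)
    return zeta * cdf + pdf


def knowledge_gradient(i, means, variances, noise_variances):
    """Assumed standard knowledge gradient of channel i."""
    sigma_tilde = variances[i] / math.sqrt(variances[i] + noise_variances[i])
    best_other = max(m for j, m in enumerate(means) if j != i)
    return sigma_tilde * f(-abs(means[i] - best_other) / sigma_tilde)


def online_kg_select(means, variances, noise_variances, observe, budget):
    """Run `budget` selections; return final means, variances and choices."""
    means, variances = list(means), list(variances)  # do not mutate the inputs
    n = len(means)
    history = []
    for t in range(budget):
        remaining = budget - t - 1  # assumed weighting of future benefit
        scores = [means[i] + remaining
                  * knowledge_gradient(i, means, variances, noise_variances)
                  for i in range(n)]
        choice = max(range(n), key=scores.__getitem__)
        w = observe(choice)  # observed communication rate of the chosen channel
        beta = 1.0 / variances[choice]          # belief precision
        beta_w = 1.0 / noise_variances[choice]  # noise precision
        means[choice] = (beta * means[choice] + beta_w * w) / (beta + beta_w)
        variances[choice] = 1.0 / (beta + beta_w)
        history.append(choice)
    return means, variances, history


# Deterministic toy run: the observation is simply the true capacity.
true_capacity = [5.0, 9.0, 7.0]
final_means, final_vars, history = online_kg_select(
    [6.0, 6.0, 6.0], [4.0, 4.0, 4.0], [1.0, 1.0, 1.0],
    lambda ch: true_capacity[ch], budget=20)
best = max(range(3), key=final_means.__getitem__)
```

In a real deployment `observe` would return a noisy rate measurement; the noiseless callback is used here only to keep the example reproducible. Because the weight on the knowledge gradient shrinks as the budget runs out, the rule shifts from exploration toward exploitation over time.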

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110681497.8A CN113573323A (en) 2021-06-18 2021-06-18 Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle

Publications (1)

Publication Number Publication Date
CN113573323A true CN113573323A (en) 2021-10-29

Family

ID=78162313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110681497.8A Pending CN113573323A (en) 2021-06-18 2021-06-18 Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle

Country Status (1)

Country Link
CN (1) CN113573323A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108092729A (en) * 2017-12-29 2018-05-29 Army Engineering University of PLA Anti-interference model and Stackelberg game subgradient algorithm in UAV communication
CN108919640A (en) * 2018-04-20 2018-11-30 Northwestern Polytechnical University Implementation method for adaptive multi-target tracking of unmanned aerial vehicle
US20190230671A1 (en) * 2018-01-23 2019-07-25 Electronics And Telecommunications Research Institute Method and apparatus for selecting unmanned aerial vehicle control and non-payload communication channel on basis of channel interference analysis
CN111367317A (en) * 2020-03-27 2020-07-03 National University of Defense Technology Unmanned aerial vehicle cluster online task planning method based on Bayesian learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Liu Yao; Peng Yi: "A UAV communication channel selection method based on vector regression", Software Guide (软件导刊), no. 01 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Li Jun

Inventor after: Lin Yan

Inventor after: Du Feng

Inventor before: Du Feng

Inventor before: Lin Yan

Inventor before: Li Jun