CN113573323A - Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle - Google Patents
- Publication number
- CN113573323A (application CN202110681497.8A)
- Authority
- CN
- China
- Prior art keywords
- channel
- belief
- knowledge
- gradient
- aerial vehicle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/22—Traffic simulation tools or models
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/309—Measuring or estimating channel quality parameters
- H04B17/336—Signal-to-interference ratio [SIR] or carrier-to-interference ratio [CIR]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/309—Measuring or estimating channel quality parameters
- H04B17/345—Interference values
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/391—Modelling the propagation channel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W36/00—Hand-off or reselection arrangements
- H04W36/24—Reselection being triggered by specific parameters
- H04W36/30—Reselection being triggered by specific parameters by measured or perceived connection quality data
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Electromagnetism (AREA)
- Quality & Reliability (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses an unmanned aerial vehicle optimal channel rapid selection method based on knowledge gradient. The method comprises the following steps: modeling the channel capacity of all channels into a lookup table belief model based on Bayesian theorem; initializing a belief model according to the past communication task experience of the unmanned aerial vehicle; calculating to obtain knowledge gradient values of all channels according to the belief state about the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as the current-moment channel; the unmanned aerial vehicle communicates on the selected channel, monitors the transmission rate at the same time, and updates the belief state of the channel capacity according to the monitored transmission rate; the above process is repeated until the time limit, i.e. budget, for each channel selection is exceeded. The method is suitable for quick channel selection of the high-dynamic unmanned aerial vehicle network, and effectively improves the speed of optimal channel selection.
Description
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle communication, and particularly relates to an unmanned aerial vehicle optimal channel rapid selection method based on knowledge gradient.
Background
With the large-scale development and application of unmanned aerial vehicle technology, the anti-interference problem in unmanned aerial vehicle communication has become increasingly severe. Interference during unmanned aerial vehicle communication comes not only from background noise but possibly also from jammers. A jammer transmits a signal of a certain interference strength on a channel to occupy channel resources; the unmanned aerial vehicle can respond with a strategy that selects channels with no or little interference, so as to obtain a higher transmission rate on those channels.
At present, among the many anti-interference methods for unmanned aerial vehicles, frequency hopping is a common and easy-to-implement approach. However, few related algorithms have been proposed, and they are limited mainly by the particular adversarial environment the unmanned aerial vehicle faces. Under the cognitive radio assumption, one approach models the channel selection process of the unmanned aerial vehicle as a Markov decision process (MDP) and maximizes the benefit of the unmanned aerial vehicle with a reinforcement learning algorithm (such as Q-learning), so that a channel with little interference and high benefit is selected in each time slot (Zhanyu. Research on anti-interference methods for unmanned aerial vehicle networks [D]. Beijing University of Posts and Telecommunications, 2019). Another approach likewise models the channel selection process as an MDP and selects the best channel by minimizing the perceived interference power through Q-learning (huang bang.). Reinforcement learning copes well with jammers that follow a fixed strategy, but when the interference power and the interfered channels of the jammers are highly random, the reinforcement learning algorithm converges with difficulty and takes a long time, so it cannot cope with the adversarial environment the unmanned aerial vehicle faces. In addition, some methods model unmanned aerial vehicle channel selection as a multi-armed bandit and estimate the best channel with a greedy algorithm or the UCB algorithm. Although these statistics-based methods handle randomness better, they likewise need a large amount of training to converge.
Disclosure of Invention
The invention aims to provide a method for quickly selecting an optimal channel of an unmanned aerial vehicle based on knowledge gradient, so that the speed of selecting the optimal channel of the unmanned aerial vehicle under the scene of random dynamic change of interference power is increased.
The technical solution for realizing the purpose of the invention is as follows: an unmanned aerial vehicle optimal channel rapid selection method based on knowledge gradient comprises the following steps:
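step 1, modeling the channel capacity of all channels into a lookup table belief model based on Bayesian theorem;
step 2, initializing the lookup table belief model according to the unmanned aerial vehicle communication task experience;
step 3, calculating to obtain knowledge gradient values of all channels according to the belief state of the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as the communication channel at the current moment;
step 4, the unmanned aerial vehicle communicates on the selected channel, simultaneously monitors the transmission rate, and updates the belief state of the channel capacity according to the transmission rate.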
Further, the lookup table belief model in step 1 is based on bayes theorem, and specifically includes the following steps:
the lookup table belief model is used for modeling channel capacity; it consists of a belief mean and a belief variance of the channel capacity, which together are called the belief state; the belief state updated at the previous moment is the posterior distribution of the channel capacity and serves as the prior distribution in the calculation at the current moment.
Further, in step 2, the lookup table belief model is initialized, wherein initial values of the belief states are empirically aggregated, and comprise a belief mean of the channel capacity.
Further, in step 3, according to the belief state of the current time about the channel capacity, a knowledge gradient value about each channel is obtained by calculation, and a channel with the largest knowledge gradient is selected as the communication channel of the current time, specifically:
using a knowledge gradient algorithm based on a lookup table model, taking a belief state about channel capacity as prior distribution of the iteration, wherein the prior distribution obeys Gaussian distribution and is a two-dimensional table consisting of a belief mean and a belief variance;
and calculating to obtain knowledge gradient values of all channels in the current belief state according to a calculation formula of the knowledge gradient, selecting the channel with the maximum knowledge gradient value as a communication channel at the current moment, and monitoring the selected channel.
Further, a knowledge gradient algorithm based on a lookup table belief model is used for selecting a channel with minimum interference, and for a scene of variable power interference of a single jammer or multiple jammers on a communication channel, the method specifically comprises the following steps:
(1) lookup table belief model
Define $S^t$ as the belief state over the channels $n \in \{1, 2, \ldots, N\}$; the action is to select one of the $N$ channels:

$$S^t = \left(\bar{\theta}^t_n,\ \sigma^t_n\right)_{n=1}^{N}$$

where $\bar{\theta}^t_n$ is the channel capacity estimate (belief mean) for channel $n$ at time $t$, and $\sigma^t_n$ is the belief standard deviation for channel $n$ at time $t$. The belief state records the belief about the actual value of the channel capacity, $r_n = \mathbb{E}\,F(n, W)$, where $W$ denotes the observation of channel $n$, assuming

$$r_n \sim \mathcal{N}\!\left(\bar{\theta}^t_n,\ (\sigma^t_n)^2\right)$$

the knowledge gradient of channel $n$ at time $t$ is defined as:

$$\nu^{KG,t}_n = \mathbb{E}\!\left[\max_{n'} \bar{\theta}^{t+1}_{n'} - \max_{n'} \bar{\theta}^{t}_{n'} \,\middle|\, S^t,\ x^t = n\right]$$

where $x^t$ is the channel selected at time $t$;
after the action is selected at each moment, the following return, namely the observed value, is obtained:

$$W^{t+1}_n = r_n + \varepsilon^{t+1}_n, \qquad \varepsilon^{t+1}_n \sim \mathcal{N}\!\left(0,\ (\sigma^W_n)^2\right)$$

where $r_n$ is the actual channel capacity of channel $n$ being observed;
define the belief precision and the noise precision:

$$\beta^t_n = \frac{1}{(\sigma^t_n)^2}, \qquad \beta^W_n = \frac{1}{(\sigma^W_n)^2}$$

where $(\sigma^W_n)^2$ is the variance of the channel capacity of channel $n$, obtained from empirical and historical statistics; $(\sigma^t_n)^2$ is the belief variance, which is the square of the belief standard deviation;
based on the above definitions and formulas, the channel capacity estimate and the belief precision of the channel $n$ selected at time $t$ are updated as follows:

$$\bar{\theta}^{t+1}_n = \frac{\beta^t_n\,\bar{\theta}^t_n + \beta^W_n\,W^{t+1}_n}{\beta^t_n + \beta^W_n}, \qquad \beta^{t+1}_n = \beta^t_n + \beta^W_n$$

the remaining channels keep the belief state of the previous moment;
(2) knowledge gradient algorithm
The knowledge gradient of each channel capacity in the multi-channel variable-power jamming scene is calculated by a knowledge gradient algorithm based on the lookup table belief model, wherein the size of the knowledge gradient represents the amount of information that can be obtained after the corresponding channel is selected; based on Bayesian theory and the assumption that the belief state obeys a Gaussian distribution, the knowledge gradient algorithm calculates the amount of information obtainable from each channel, and the larger this amount, the greater the improvement in decision making after the belief state is updated;
denote the variance of the change in the belief mean caused by selecting channel $n$ at time $t$ as

$$(\tilde{\sigma}^t_n)^2 = (\sigma^t_n)^2 - (\sigma^{t+1}_n)^2 = \frac{(\sigma^t_n)^2}{1 + (\sigma^W_n)^2 / (\sigma^t_n)^2}$$

then calculate

$$\zeta^t_n = -\left|\frac{\bar{\theta}^t_n - \max_{n' \neq n} \bar{\theta}^t_{n'}}{\tilde{\sigma}^t_n}\right|$$

which is called the normalized influence of action $n$; it gives the distance, measured in standard deviations, between the current capacity estimate of channel $n$ and the best estimate among the other channels;
then calculate

$$f(\zeta) = \zeta\,\Phi(\zeta) + \phi(\zeta)$$

where $\Phi(\zeta)$ and $\phi(\zeta)$ represent the cumulative standard normal distribution function and the standard normal density function, respectively;
in summary, the knowledge gradient corresponding to channel $n$ at time $t$ is written as:

$$\nu^{KG,t}_n = \tilde{\sigma}^t_n\, f(\zeta^t_n)$$

considering the influence of the knowledge gradient of the current moment on the remaining moments, and weighing exploration against exploitation, the online knowledge gradient is finally used as the basis for channel selection:

$$\nu^{OLKG,t}_n = \bar{\theta}^t_n + (T - t)\,\nu^{KG,t}_n$$

where $T$ is the total budget of time slots;
the current belief state $S^t$ is taken as input, $\tilde{\sigma}^t_n$ is defined, the normalized influence $\zeta^t_n$ and the corresponding $f(\zeta^t_n)$ are calculated from it, and finally the online knowledge gradients $\nu^{OLKG,t}_n$ of all channels at time $t$ are obtained; the action is selected as

$$x^t = \arg\max_{n} \nu^{OLKG,t}_n$$

The actual communication rate of the selected channel is then observed, the belief state is updated according to the update equation of the lookup table belief model in (1), and channel selection for the next moment begins; each moment consumes one unit of budget, and the iteration of the algorithm stops when the budget is used up.
Compared with the prior art, the invention has the following notable advantages: (1) by adopting a decision-making approach based on Bayesian theory, it judges more rigorously the value of unmanned aerial vehicle channel information and whether information needs to be collected again; unlike ordinary decision rules, which either fully trust or fully distrust a calculated result, it measures the data through the belief variance and thereby quantifies the degree of confidence; (2) using the concept of the knowledge gradient, constructed on Bayesian theory, it quantifies the value of the information obtainable after a channel is selected, and it uses the online knowledge gradient, which balances the channel capacity mean against the value of channel information, as the criterion for channel selection, which gives a higher convergence speed.
The present invention is described in further detail below with reference to the attached drawing figures.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of the unmanned aerial vehicle communication anti-interference transmission model.
Fig. 3 is a graphical representation of the accumulated interference power over 5 channels as a function of time.
Fig. 4 is a schematic diagram of the drone's channel selection in each time slot.
Fig. 5 is a statistical diagram of the number of times each channel is selected by the drone.
Detailed Description
Taking the uncertainty of jammers into account, the invention applies the knowledge gradient (Powell, W. B. "The Knowledge Gradient for Optimal Learning," Encyclopedia for Operations Research and Management Science, 2011 (c) John Wiley and Sons.) to the field of drone communication, greatly reducing the convergence time. The invention provides an unmanned aerial vehicle optimal channel rapid selection method based on knowledge gradient. It rapidly learns information about all channels through the update method of a lookup table belief model built on the Bayesian assumption and a knowledge gradient calculation formula based on the Gaussian assumption, and after few iterations it selects the channel with the minimum accumulated interference, namely the channel with the maximum accumulated channel capacity. The method specifically comprises the following steps:
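step 1, modeling the channel capacity of all channels into a lookup table belief model based on Bayesian theorem;
step 2, initializing the lookup table belief model according to the unmanned aerial vehicle communication task experience;
step 3, calculating to obtain knowledge gradient values of all channels according to the belief state of the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as the communication channel at the current moment;
step 4, the unmanned aerial vehicle communicates on the selected channel, simultaneously monitors the transmission rate, and updates the belief state of the channel capacity according to the transmission rate.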
Further, the lookup table belief model in step 1 is based on bayes theorem, which is specifically as follows:
the lookup table belief model is used for modeling channel capacity; it consists of a belief mean and a belief variance of the channel capacity, which together are called the belief state; the belief state updated at the previous moment is the posterior distribution of the channel capacity and serves as the prior distribution in the calculation at the current moment.
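The prior-to-posterior use of the belief state can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `BeliefTable`, `update`, and `noise_var` are hypothetical, and the update shown is the standard conjugate Gaussian (precision-weighted) rule, assumed here because the description specifies Gaussian beliefs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BeliefTable:
    """Lookup-table belief: one (mean, variance) pair per channel."""
    mean: List[float]  # belief mean of each channel's capacity
    var: List[float]   # belief variance of each channel's capacity

    def update(self, n: int, observed_rate: float, noise_var: float) -> None:
        """Fold one observation of channel n into the belief (prior -> posterior)."""
        prior_prec = 1.0 / self.var[n]   # belief precision
        noise_prec = 1.0 / noise_var     # observation-noise precision (assumed known)
        post_prec = prior_prec + noise_prec
        self.mean[n] = (prior_prec * self.mean[n] + noise_prec * observed_rate) / post_prec
        self.var[n] = 1.0 / post_prec    # this posterior is the prior of the next slot


# Hypothetical numbers: three channels with identical priors, one observation on channel 1.
belief = BeliefTable(mean=[1.0, 1.0, 1.0], var=[4.0, 4.0, 4.0])
belief.update(n=1, observed_rate=3.0, noise_var=4.0)
print(belief.mean, belief.var)
```

Only the observed channel changes; the other entries of the table carry over unchanged, which matches the statement that unselected channels keep the belief state of the previous moment.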
Further, the initial value of the belief state in step 2 is obtained according to the past experience aggregation, and mainly includes the belief mean value of the channel capacity.
Further, the belief state about the channel capacity in step 3 is used as the prior distribution of the iteration, and then the knowledge gradient values of all channels in the current belief state are calculated, specifically:
using a knowledge gradient algorithm based on a lookup table model, taking the belief state obtained in the step 2 or the step 4 as prior distribution, wherein the distribution obeys Gaussian distribution and is a two-dimensional table consisting of a belief mean and a belief variance;
and calculating to obtain the knowledge gradient value of each channel according to a calculation formula of the knowledge gradient, selecting the channel with the maximum value as the communication channel at the current moment, and monitoring the selected channel.
Further, aiming at the variable power interference scene of a single jammer or a plurality of jammers on a communication channel, a knowledge gradient algorithm based on a lookup table belief model is used for selecting the channel with the minimum interference, and the method specifically comprises the following steps:
(1) lookup table belief model
Define $S^t$ as the belief state over the channels $n \in \{1, 2, \ldots, N\}$; the action is to select one of the $N$ channels:

$$S^t = \left(\bar{\theta}^t_n,\ \sigma^t_n\right)_{n=1}^{N}$$

where $\bar{\theta}^t_n$ is the channel capacity estimate (belief mean) for channel $n$ at time $t$, and $\sigma^t_n$ is the belief standard deviation for channel $n$ at time $t$, whose square is the belief variance. The belief state records the belief about the actual value of the channel capacity, $r_n = \mathbb{E}\,F(n, W)$, where $W$ denotes the observation of channel $n$, assuming

$$r_n \sim \mathcal{N}\!\left(\bar{\theta}^t_n,\ (\sigma^t_n)^2\right)$$

The knowledge gradient of channel $n$ at time $t$ is defined as:

$$\nu^{KG,t}_n = \mathbb{E}\!\left[\max_{n'} \bar{\theta}^{t+1}_{n'} - \max_{n'} \bar{\theta}^{t}_{n'} \,\middle|\, S^t,\ x^t = n\right]$$

where $x^t$ is the channel selected at time $t$.
After the action is selected at each moment, the following return, namely the observed value, is obtained:

$$W^{t+1}_n = r_n + \varepsilon^{t+1}_n, \qquad \varepsilon^{t+1}_n \sim \mathcal{N}\!\left(0,\ (\sigma^W_n)^2\right)$$

where $r_n$ is the actual channel capacity of channel $n$ being observed.
Define the belief precision and the noise precision:

$$\beta^t_n = \frac{1}{(\sigma^t_n)^2}, \qquad \beta^W_n = \frac{1}{(\sigma^W_n)^2}$$

where $(\sigma^W_n)^2$ is the variance of the channel capacity of channel $n$, obtained from empirical and historical statistics, and $(\sigma^t_n)^2$ is the belief variance, which is the square of the belief standard deviation.
Based on the above definitions and formulas, the channel capacity estimate and the belief precision of the channel $n$ selected at time $t$ are updated as follows:

$$\bar{\theta}^{t+1}_n = \frac{\beta^t_n\,\bar{\theta}^t_n + \beta^W_n\,W^{t+1}_n}{\beta^t_n + \beta^W_n}, \qquad \beta^{t+1}_n = \beta^t_n + \beta^W_n$$

The remaining channels keep the belief state of the previous moment.
(2) Knowledge gradient algorithm
The knowledge gradient of each channel capacity in the multi-channel variable-power jamming scene is calculated by a knowledge gradient algorithm based on the lookup table belief model, wherein the size of the knowledge gradient represents the amount of information that can be obtained after the corresponding channel is selected. Based on Bayesian theory and the assumption that the belief state obeys a Gaussian distribution, the knowledge gradient algorithm calculates the amount of information obtainable from each channel; the larger this amount, the greater the improvement in decision making after the belief state is updated.
First, define the variance of the change in the belief mean caused by selecting channel $n$ at time $t$:

$$(\tilde{\sigma}^t_n)^2 = (\sigma^t_n)^2 - (\sigma^{t+1}_n)^2 = \frac{(\sigma^t_n)^2}{1 + (\sigma^W_n)^2 / (\sigma^t_n)^2}$$

Then calculate

$$\zeta^t_n = -\left|\frac{\bar{\theta}^t_n - \max_{n' \neq n} \bar{\theta}^t_{n'}}{\tilde{\sigma}^t_n}\right|$$

which is called the normalized influence of action $n$; it gives the distance, measured in standard deviations, between the current capacity estimate of channel $n$ and the best estimate among the other channels.
Then calculate

$$f(\zeta) = \zeta\,\Phi(\zeta) + \phi(\zeta)$$

where $\Phi(\zeta)$ and $\phi(\zeta)$ represent the cumulative standard normal distribution function and the standard normal density function, respectively.
In summary, the knowledge gradient corresponding to channel $n$ at time $t$ can be written as:

$$\nu^{KG,t}_n = \tilde{\sigma}^t_n\, f(\zeta^t_n)$$

Considering the influence of the knowledge gradient of the current moment on the remaining moments, and weighing exploration against exploitation, the online knowledge gradient is finally used as the basis for channel selection:

$$\nu^{OLKG,t}_n = \bar{\theta}^t_n + (T - t)\,\nu^{KG,t}_n$$

where $T$ is the total budget of time slots.
The current belief state $S^t$ is taken as input, $\tilde{\sigma}^t_n$ is defined, the normalized influence $\zeta^t_n$ and the corresponding $f(\zeta^t_n)$ are calculated from it, and finally the online knowledge gradients $\nu^{OLKG,t}_n$ of all channels at time $t$ are obtained. The action is selected as

$$x^t = \arg\max_{n} \nu^{OLKG,t}_n$$

The actual communication rate of the selected channel is then observed, the belief state is updated according to the update equation of the lookup table belief model in (1), and channel selection for the next moment begins. Each moment consumes one unit of budget, and the iteration of the algorithm stops when the budget is used up.
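The knowledge gradient calculation described in (2) can be sketched in Python as follows. This is an illustrative sketch rather than the patent's code: the function names, the `noise_var` argument (the observation-noise variance of each channel), and the `budget - t` horizon weight in the online score are assumptions based on the standard knowledge gradient formulas for independent Gaussian beliefs. It needs at least two channels and strictly positive variances.

```python
import math
from typing import List


def std_normal_pdf(z: float) -> float:
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Cumulative standard normal distribution, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_values(mean: List[float], var: List[float], noise_var: List[float]) -> List[float]:
    """Knowledge gradient of each channel under independent Gaussian beliefs."""
    kg = []
    for n in range(len(mean)):
        # Standard deviation of the change in the belief mean if channel n is measured.
        sigma_tilde = var[n] / math.sqrt(var[n] + noise_var[n])
        best_other = max(m for k, m in enumerate(mean) if k != n)
        # Normalized influence: gap to the best competing estimate, in standard deviations.
        zeta = -abs(mean[n] - best_other) / sigma_tilde
        f = zeta * std_normal_cdf(zeta) + std_normal_pdf(zeta)
        kg.append(sigma_tilde * f)
    return kg


def online_kg_choice(mean: List[float], var: List[float], noise_var: List[float],
                     t: int, budget: int) -> int:
    """Index of the channel with the largest online knowledge gradient score."""
    kg = kg_values(mean, var, noise_var)
    # Assumed online form: current estimate plus remaining horizon times the KG value.
    scores = [m + (budget - t) * g for m, g in zip(mean, kg)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical beliefs: equal means, but channel 1 is much more uncertain.
print(online_kg_choice([1.0, 1.0], [1.0, 4.0], [1.0, 1.0], t=0, budget=10))
```

With equal belief means the more uncertain channel receives the larger knowledge gradient, so early in the budget the score favors exploration; once the budget is exhausted the horizon weight is zero and the choice reduces to the largest belief mean.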
Examples
One embodiment of the invention is described in detail below. The simulation is programmed in Python, and the parameter settings do not affect generality. The embodiment verifies the effectiveness and convergence of the proposed model and algorithm, and the scenario is as follows:
Because the algorithm assumes an instantaneous task, the scene contains only a single drone user, whose position relative to the ground base station is fixed while it communicates; 5 channels are available. In each time slot, a communication link is established between the airborne drone and the ground base station; the flying height of the drone is 80 m, the distance between the drone and the base station is 100 m, and the transmitting power of the drone is 0.4 W. 5 malicious jammers simultaneously interfere with the normal transmission of the user; the positions of the jammers are also fixed, each at a distance of 100 m from the base station. The interference pattern of every jammer is multi-tone random interference, i.e. each single jammer interferes on all 5 channels, and the jammers differ only in the distribution of their interference power. The interference powers of the jammers are mutually independent and follow different distributions: some jammers use interference power that follows a Gaussian distribution, and others use a uniform distribution. The specific interference power settings are shown in Table 1:
TABLE 1
Jammer (distribution) | Interference power per channel: mean (Gaussian) / lower bound (uniform) | Interference power per channel: variance (Gaussian) / upper bound (uniform) |
---|---|---|
Jammer 1 (Gaussian) | [0.4,0.6,0.8,0.8,1.0] | [0.3,0.3,0.3,0.3,0.3] |
Jammer 2 (Gaussian) | [0.5,0.5,0.8,0.7,0.9] | [0.2,0.2,0.2,0.2,0.2] |
Jammer 3 (Uniform) | [0.1,0.2,0.3,0.4,0.5] | [0.8,0.9,0.9,1.0,1.2] |
Jammer 4 (Uniform) | [0.3,0.2,0.3,0.6,0.8] | [1.0,1.2,1.0,1.4,1.5] |
Jammer 5 (Uniform) | [0.2,0.2,0.2,0.2,0.2] | [0.9,0.9,0.9,0.9,0.9] |
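The interference settings of Table 1 can be encoded as a small Python sketch like the one below. Several points are assumptions made only for illustration and are not stated in the text: the powers of the five jammers are summed per channel, negative Gaussian draws are clipped to zero, and the names `sample_interference` and `nominal_mean_interference` are hypothetical.

```python
import math
import random

NUM_CHANNELS = 5

# Table 1. Gaussian rows: (mean, variance) per channel.
GAUSS_JAMMERS = [
    ([0.4, 0.6, 0.8, 0.8, 1.0], [0.3] * 5),
    ([0.5, 0.5, 0.8, 0.7, 0.9], [0.2] * 5),
]
# Table 1. Uniform rows: (lower bound, upper bound) per channel.
UNIFORM_JAMMERS = [
    ([0.1, 0.2, 0.3, 0.4, 0.5], [0.8, 0.9, 0.9, 1.0, 1.2]),
    ([0.3, 0.2, 0.3, 0.6, 0.8], [1.0, 1.2, 1.0, 1.4, 1.5]),
    ([0.2] * 5, [0.9] * 5),
]


def sample_interference(rng: random.Random) -> list:
    """Draw the total interference power on each channel for one time slot."""
    total = [0.0] * NUM_CHANNELS
    for means, variances in GAUSS_JAMMERS:
        for c in range(NUM_CHANNELS):
            # Assumption: a negative draw is clipped to zero power.
            total[c] += max(0.0, rng.gauss(means[c], math.sqrt(variances[c])))
    for lows, highs in UNIFORM_JAMMERS:
        for c in range(NUM_CHANNELS):
            total[c] += rng.uniform(lows[c], highs[c])
    return total


def nominal_mean_interference() -> list:
    """Per-channel mean interference, ignoring the clipping above."""
    total = [0.0] * NUM_CHANNELS
    for means, _ in GAUSS_JAMMERS:
        for c in range(NUM_CHANNELS):
            total[c] += means[c]
    for lows, highs in UNIFORM_JAMMERS:
        for c in range(NUM_CHANNELS):
            total[c] += 0.5 * (lows[c] + highs[c])
    return total


print(nominal_mean_interference())
print(sample_interference(random.Random(0)))
```

Under these assumptions the nominal mean interference is smallest on the first channel (about 2.55) and largest on the last (about 4.45), so the first channel is the one a channel selection policy should settle on.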
With reference to Figs. 1-2, the knowledge gradient-based unmanned aerial vehicle channel selection algorithm provided by the invention proceeds as follows:
step 1: initialization: initializing the iteration time t as 0; initializing a belief model, estimating a channel capacity mean value according to past experience, and setting the same belief variance for each channel; other system fixed parameters are initialized.
Step 2: repeating iteration, which is specifically divided into the following substeps:
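1) sequentially calculating and storing the online knowledge gradients of all the channels. With $\bar{\theta}^t_n$ the belief mean and $\sigma^t_n$ the belief standard deviation of channel $n$ at time $t$, $(\sigma^W_n)^2$ the observation noise variance, and $T$ the budget, the knowledge gradients are calculated as follows:

$$(\tilde{\sigma}^t_n)^2 = \frac{(\sigma^t_n)^2}{1 + (\sigma^W_n)^2 / (\sigma^t_n)^2}, \qquad \zeta^t_n = -\left|\frac{\bar{\theta}^t_n - \max_{n' \neq n} \bar{\theta}^t_{n'}}{\tilde{\sigma}^t_n}\right|$$

Calculating $f(\zeta) = \zeta\,\Phi(\zeta) + \phi(\zeta)$, where $\Phi(\zeta)$ and $\phi(\zeta)$ represent the cumulative standard normal distribution function and the standard normal density function, respectively; then

$$\nu^{KG,t}_n = \tilde{\sigma}^t_n\, f(\zeta^t_n), \qquad \nu^{OLKG,t}_n = \bar{\theta}^t_n + (T - t)\,\nu^{KG,t}_n$$

2) Selecting a communication channel according to the online knowledge gradient of each channel corresponding to the current time t:

$$x^t = \arg\max_{n} \nu^{OLKG,t}_n$$

3) after the channel is selected, the unmanned aerial vehicle monitors the actual transmission rate $W^{t+1}_n$ of the channel, and the belief state of the selected channel $n$ is updated by the following formula, with $\beta^t_n = 1/(\sigma^t_n)^2$ and $\beta^W_n = 1/(\sigma^W_n)^2$:

$$\bar{\theta}^{t+1}_n = \frac{\beta^t_n\,\bar{\theta}^t_n + \beta^W_n\,W^{t+1}_n}{\beta^t_n + \beta^W_n}, \qquad \beta^{t+1}_n = \beta^t_n + \beta^W_n$$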
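The three sub-steps of the iteration can be put together in a short Python loop such as the one below. It is a sketch under stated assumptions, not the simulation used in the embodiment: the true capacities, the noise variance, the priors, and the function name `run_online_kg` are hypothetical, the observed rate is drawn from a Gaussian around the true capacity, and the horizon weight `budget - t` is the assumed form of the online score.

```python
import math
import random


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def run_online_kg(true_capacity, noise_var, prior_mean, prior_var, budget, seed=0):
    """Run the select / observe / update loop for `budget` time slots."""
    rng = random.Random(seed)
    n_ch = len(true_capacity)
    mean, var = list(prior_mean), list(prior_var)
    counts = [0] * n_ch
    for t in range(budget):
        # Sub-step 1: online knowledge gradient score of every channel.
        scores = []
        for n in range(n_ch):
            sigma_tilde = var[n] / math.sqrt(var[n] + noise_var)
            best_other = max(mean[k] for k in range(n_ch) if k != n)
            zeta = -abs(mean[n] - best_other) / sigma_tilde
            kg = sigma_tilde * (zeta * _Phi(zeta) + _phi(zeta))
            scores.append(mean[n] + (budget - t) * kg)
        # Sub-step 2: select the channel with the largest score.
        choice = max(range(n_ch), key=scores.__getitem__)
        counts[choice] += 1
        # Sub-step 3: observe a noisy rate and update the belief of that channel only.
        rate = rng.gauss(true_capacity[choice], math.sqrt(noise_var))
        post_prec = 1.0 / var[choice] + 1.0 / noise_var
        mean[choice] = (mean[choice] / var[choice] + rate / noise_var) / post_prec
        var[choice] = 1.0 / post_prec
    return mean, var, counts


final_mean, final_var, counts = run_online_kg(
    true_capacity=[2.0, 1.6, 1.2, 1.0, 0.7],  # hypothetical values
    noise_var=0.25,
    prior_mean=[1.0] * 5,
    prior_var=[1.0] * 5,
    budget=100,
)
print(counts)
```

The `counts` list plays the role of the per-channel selection statistics, and the returned belief means show how far the estimates have moved from the common prior within the budget.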
As can be seen from Fig. 3, the accumulated interference power on each channel increases with time. Channel C0 is the channel with the least accumulated interference and therefore the channel with the largest channel capacity. Fig. 4 shows, for each time slot, whether the drone selects the channel with the smallest accumulated interference; Figs. 3 and 4 together show that after a number of iterations the channel selection of the drone becomes stable, i.e. the algorithm converges. Convergence is reached within 100 iterations, which is faster than algorithms based on historical statistics such as UCB or epsilon-greedy. As can be seen from Fig. 5, while learning the channel capacities and selecting communication channels, the drone tends to select the channel with the smallest actual mean interference, which verifies the validity of the algorithm.
In conclusion, the knowledge gradient-based rapid selection method for the optimal channel of the unmanned aerial vehicle fully considers the experience about the channel capacity in historical communication, and by utilizing the Bayesian theory, the rapid convergence speed can be achieved, the channel with the maximum average value of the channel capacity can be learned, and the channel can be continuously selected as the actual communication channel. Therefore, the method is very suitable for the channel selection task of the unmanned aerial vehicle in the time-limited environment.
Claims (5)
1. A method for quickly selecting an optimal channel of an unmanned aerial vehicle based on knowledge gradient is characterized by comprising the following steps:
step 1, modeling the channel capacity of all channels into a lookup table belief model based on Bayesian theorem;
step 2, initializing a lookup table belief model according to the unmanned aerial vehicle communication task experience;
step 3, calculating to obtain knowledge gradient values of all channels according to the belief state of the channel capacity at the current moment, and selecting the channel with the maximum knowledge gradient as a communication channel at the current moment;
and 4, the unmanned aerial vehicle communicates on the selected channel, simultaneously monitors the transmission rate, and updates the belief state of the channel capacity according to the transmission rate.
2. The method for quickly selecting the optimal channel of the unmanned aerial vehicle based on the knowledge gradient according to claim 1, wherein the lookup table belief model in the step 1 is based on Bayesian theorem and specifically comprises the following steps:
the lookup table belief model is used for modeling channel capacity; it consists of a belief mean and a belief variance of the channel capacity, which together are called the belief state; the belief state updated at the previous moment is the posterior distribution of the channel capacity and serves as the prior distribution in the calculation at the current moment.
3. The knowledge gradient-based unmanned aerial vehicle optimal channel rapid selection method according to claim 1, wherein the initialization lookup table belief model in step 2 is obtained by empirically aggregating initial values of belief states, including belief mean values of channel capacity.
4. The method for quickly selecting the optimal channel of the unmanned aerial vehicle based on the knowledge gradient as claimed in claim 1,2 or 3, wherein in step 3, the knowledge gradient value of each channel is calculated according to the belief state about the channel capacity at the current moment, and the channel with the largest knowledge gradient is selected as the communication channel at the current moment, specifically:
using a knowledge gradient algorithm based on a lookup table model, taking a belief state about channel capacity as prior distribution of the iteration, wherein the prior distribution obeys Gaussian distribution and is a two-dimensional table consisting of a belief mean and a belief variance;
and calculating to obtain knowledge gradient values of all channels in the current belief state according to a calculation formula of the knowledge gradient, selecting the channel with the maximum knowledge gradient value as a communication channel at the current moment, and monitoring the selected channel.
5. The method for quickly selecting the optimal channel of the unmanned aerial vehicle based on the knowledge gradient as claimed in claim 4, wherein a knowledge gradient algorithm based on a lookup table belief model is used to select the channel with the minimum interference, and for a scene of variable power interference of a single jammer or multiple jammers on a communication channel, the method specifically comprises the following steps:
(1) lookup table belief model
Definition of StFor a belief state for channel N e {1, 2., N }, the action is to select one of the N channels,whereinThe channel capacity estimate for channel n for time t,then it is the belief standard deviation for the channel n at time t and the belief state records the actual value r for the channel capacitynLet us say the belief of EF (n, W), W representing the observation of channel n, assuming
the gradient of knowledge about the channel n at time t is defined as:
after the action selection is done at each moment, the following reports, namely the observed values, are obtained:
rnis an actual observation about the channel capacity of channel n;
WhereinIs based on empirical and historical statistics about the variance of the channel capacity of channel n;is the belief variance, which is the square of the belief standard deviation;
based on the above definitions and formulas, the following updates are made to the channel capacity and belief accuracy, for the selected channel n at time t:
the rest channels use the belief state of the last moment;
(2) knowledge gradient algorithm
Calculating the knowledge gradient of each channel capacity in the interference scene of the multi-channel variable power of the interference machine by using a knowledge gradient algorithm based on a lookup table belief model, wherein the size of the knowledge gradient represents the amount of information which can be obtained after the corresponding channel is selected; the knowledge gradient algorithm calculates the information quantity which can be obtained by each channel based on Bayesian theory and hypothesis that the belief state obeys Gaussian distribution, and the larger the information quantity is, the larger the decision making progress is after the belief state is updated;
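denote by $(\tilde{\sigma}_n^t)^2$ the variance of the change in the belief mean caused by selecting channel n at time t:
$$(\tilde{\sigma}_n^t)^2 = \mathrm{Var}\left[\theta_n^{t+1} - \theta_n^t \mid S^t\right] = (\sigma_n^t)^2 - (\sigma_n^{t+1})^2 = \frac{(\sigma_n^t)^2}{1 + \lambda_n^2 / (\sigma_n^t)^2}$$
wherein $\theta_n^t$ is the belief mean (the channel capacity estimate), $\sigma_n^t$ is the belief standard deviation, and $\lambda_n^2$ is the observation variance of channel n;
then calculate
$$\zeta_n^t = -\left| \frac{\theta_n^t - \max_{n' \neq n} \theta_{n'}^t}{\tilde{\sigma}_n^t} \right|$$
wherein $\zeta_n^t$ is called the normalized influence of action n; it gives the number of standard deviations between the current capacity estimate of channel n and the best estimate among the other channels;
then calculate
$$f(\zeta) = \zeta \Phi(\zeta) + \varphi(\zeta)$$
wherein $\Phi(\zeta)$ and $\varphi(\zeta)$ represent the cumulative standard normal distribution function and the standard normal density function, respectively;
in summary, the knowledge gradient corresponding to the channel n at the time t is written as:
$$\nu_n^{KG,t} = \tilde{\sigma}_n^t \, f(\zeta_n^t)$$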
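The calculation above can be sketched in a few lines, assuming the standard knowledge gradient form for independent Gaussian beliefs (belief means, belief standard deviations and per-channel observation variances as inputs). The function names are hypothetical, and the sketch requires at least two channels and strictly positive standard deviations.

```python
import math


def normal_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def f(zeta):
    """f(zeta) = zeta * Phi(zeta) + phi(zeta)."""
    return zeta * normal_cdf(zeta) + normal_pdf(zeta)


def knowledge_gradient(means, stds, noise_vars):
    """Knowledge gradient of every channel under the current beliefs."""
    kg = []
    for n in range(len(means)):
        var = stds[n] ** 2
        # Standard deviation of the change in the belief mean if n is observed.
        sigma_tilde = math.sqrt(var / (1.0 + noise_vars[n] / var))
        # Best estimate among the other channels.
        best_other = max(m for i, m in enumerate(means) if i != n)
        # Normalized influence of selecting channel n (never positive).
        zeta = -abs(means[n] - best_other) / sigma_tilde
        kg.append(sigma_tilde * f(zeta))
    return kg
```

With equal belief means, the channel with the larger belief standard deviation receives the larger knowledge gradient, which matches the reading of the knowledge gradient as the amount of information obtainable from a channel.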
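considering the influence of the knowledge gradient of the current moment on the remaining moments, and simultaneously weighing exploration against exploitation of the data, the online knowledge gradient is finally used as the basis for channel selection:
$$\nu_n^{OLKG,t} = \theta_n^t + (T - t)\, \nu_n^{KG,t}$$
wherein $\theta_n^t$ is the channel capacity estimate for channel n at time t, $\nu_n^{KG,t}$ is the knowledge gradient of channel n at time t, and T is the total budget of moments;
the current belief state $S^t$ is input, the standard deviation $\tilde{\sigma}_n^t$ of the change in the belief mean is computed, the normalized influence $\zeta_n^t$ and the corresponding $f(\zeta_n^t)$ are calculated therefrom, and finally the online knowledge gradients $\nu_n^{OLKG,t}$ corresponding to all channels at time t are given. The action is selected as
$$x^t = \arg\max_{n} \nu_n^{OLKG,t}$$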
The actual communication rate of the selected channel is then observed, the belief state is updated according to the update equations of the lookup table belief model in step (1), and channel selection for the next moment begins. Each moment consumes one unit of budget, and the algorithm stops iterating when the budget is used up.
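The complete selection procedure can be sketched as the loop below. This is a minimal illustration under stated assumptions, not the definitive implementation: it assumes the finite-horizon online knowledge gradient score "estimate plus remaining budget times knowledge gradient", independent Gaussian beliefs, and a caller-supplied `observe(n)` callback that returns the measured communication rate of channel n. All function and parameter names are hypothetical.

```python
import math


def _f(zeta):
    """f(zeta) = zeta * Phi(zeta) + phi(zeta) for the standard normal."""
    cdf = 0.5 * (1.0 + math.erf(zeta / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * zeta * zeta) / math.sqrt(2.0 * math.pi)
    return zeta * cdf + pdf


def _kg(means, stds, noise_vars, n):
    """Knowledge gradient of channel n under the current beliefs."""
    var = stds[n] ** 2
    sigma_tilde = math.sqrt(var / (1.0 + noise_vars[n] / var))
    best_other = max(m for i, m in enumerate(means) if i != n)
    zeta = -abs(means[n] - best_other) / sigma_tilde
    return sigma_tilde * _f(zeta)


def select_channels(means, stds, noise_vars, observe, budget):
    """Run online knowledge gradient channel selection for `budget` moments.

    Returns the list of chosen channels and the final belief means and
    standard deviations. The input lists are not modified.
    """
    means, stds = list(means), list(stds)
    choices = []
    for t in range(budget):
        horizon = budget - t  # moments left, including the current one
        scores = [
            means[n] + horizon * _kg(means, stds, noise_vars, n)
            for n in range(len(means))
        ]
        # Select the channel with the largest online knowledge gradient.
        n = max(range(len(means)), key=scores.__getitem__)
        w = observe(n)  # measured communication rate of the chosen channel
        # Precision-weighted belief update for the chosen channel only.
        beta, beta_w = 1.0 / stds[n] ** 2, 1.0 / noise_vars[n]
        means[n] = (beta * means[n] + beta_w * w) / (beta + beta_w)
        stds[n] = (beta + beta_w) ** -0.5
        choices.append(n)
    return choices, means, stds
```

Because the knowledge gradient term is multiplied by the remaining budget, early moments favour uncertain channels (exploration), while late moments favour the channel with the highest capacity estimate (exploitation).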
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110681497.8A CN113573323A (en) | 2021-06-18 | 2021-06-18 | Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113573323A (en) | 2021-10-29 |
Family
ID=78162313
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110681497.8A Pending CN113573323A (en) | 2021-06-18 | 2021-06-18 | Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113573323A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108092729A (en) * | 2017-12-29 | 2018-05-29 | 中国人民解放军陆军工程大学 | Anti-interference model and Stackelberg game Subgradient Algorithm in UAV Communication |
CN108919640A (en) * | 2018-04-20 | 2018-11-30 | 西北工业大学 | The implementation method of the adaptive multiple target tracking of unmanned plane |
US20190230671A1 (en) * | 2018-01-23 | 2019-07-25 | Electronics And Telecommunications Research Institute | Method and apparatus for selecting unmanned aerial vehicle control and non-payload communication channel on basis of channel interference analysis |
CN111367317A (en) * | 2020-03-27 | 2020-07-03 | 中国人民解放军国防科技大学 | Unmanned aerial vehicle cluster online task planning method based on Bayesian learning |
Non-Patent Citations (1)
Title |
---|
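Liu Yao; Peng Yi: "A UAV communication channel selection method based on vector regression" (一种基于向量回归的无人机通信信道选择方法), Software Guide (软件导刊), no. 01 * |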
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111065103B (en) | Multi-objective optimization wireless sensor network node deployment method | |
Li et al. | Minimizing packet expiration loss with path planning in UAV-assisted data sensing | |
Shi et al. | Drone-cell trajectory planning and resource allocation for highly mobile networks: A hierarchical DRL approach | |
CN111628855B (en) | Industrial 5G dynamic multi-priority multi-access method based on deep reinforcement learning | |
Masazade et al. | Dynamic bit allocation for object tracking in wireless sensor networks | |
CN112598150B (en) | Method for improving fire detection effect based on federal learning in intelligent power plant | |
CN111524034B (en) | High-reliability low-time-delay low-energy-consumption power inspection system and inspection method | |
CN113312177B (en) | Wireless edge computing system and optimizing method based on federal learning | |
Chen et al. | Swarm intelligence application to UAV aided IoT data acquisition deployment optimization | |
CN113919485B (en) | Multi-agent reinforcement learning method and system based on dynamic hierarchical communication network | |
Swenson et al. | Distributed inertial best-response dynamics | |
CN109391511B (en) | Intelligent communication resource allocation strategy based on expandable training network | |
CN110300417B (en) | Energy efficiency optimization method and device for unmanned aerial vehicle communication network | |
CN113469325A (en) | Layered federated learning method, computer equipment and storage medium for edge aggregation interval adaptive control | |
CN109784604A (en) | A kind of flexible job shop manufacturing recourses distribution method based on whale algorithm | |
CN111865474B (en) | Wireless communication anti-interference decision method and system based on edge calculation | |
CN114363911A (en) | Wireless communication system for deploying layered federated learning and resource optimization method | |
CN112437131A (en) | Data dynamic acquisition and transmission method considering data correlation in Internet of things | |
CN115310360A (en) | Digital twin auxiliary industrial Internet of things reliability optimization method based on federal learning | |
CN115766089A (en) | Energy acquisition cognitive Internet of things anti-interference optimal transmission method | |
CN117055619A (en) | Unmanned aerial vehicle scheduling method based on multi-agent reinforcement learning | |
Gao et al. | Quantum-inspired bacterial foraging algorithm for parameter adjustment in green cognitive radio | |
CN113573323A (en) | Knowledge gradient-based rapid selection method for optimal channel of unmanned aerial vehicle | |
CN116774584A (en) | Unmanned aerial vehicle differentiated service track optimization method based on multi-agent deep reinforcement learning | |
CN115099419B (en) | User cooperative transmission method for wireless federal learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | Inventors after: Li Jun, Lin Yan, Du Feng. Inventors before: Du Feng, Lin Yan, Li Jun |