CN113613337B - User cooperation anti-interference method for beam forming communication - Google Patents
- Publication number
- CN113613337B (CN202110896542.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/50—Allocation or scheduling criteria for wireless resources
- H04W72/54—Allocation or scheduling criteria for wireless resources based on quality criteria
- H04W72/541—Allocation or scheduling criteria for wireless resources based on quality criteria using the level of interference
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a user cooperation anti-interference method for beamforming communication. The adversarial relationship between multiple users and a jammer is modeled with the jammer as the leader and the users as followers; the jammer continuously adjusts its interference strategy to maximize its utility, and the cooperative anti-interference behavior among users is modeled as a potential game. First, the strategies of the users and the jammer are initialized, that is, a communication or interference channel is selected at random, and the flag bit of each user is set to 0. All users then simultaneously perform a channel sounding or channel update operation, compute the corresponding utility, exchange quality-of-experience (QoE) satisfaction with one another, and update their flag bits according to the selected strategy. This is iterated until the anti-interference strategies of all users converge. The jammer then updates its Q table and adjusts its strategy, and the procedure repeats until the interference strategy converges. The invention improves the convergence rate by setting different learning parameters for different users, and improves the anti-interference efficiency of the network through information-level cooperation among users.
Description
Technical Field
The invention belongs to the technical field of wireless communication, and particularly relates to a user cooperation anti-interference method for beam forming communication.
Background
With the development of wireless technology, global communication traffic is growing exponentially, and users in hot-spot areas are usually ultra-densely distributed, which makes it very difficult for users to coordinate their frequency use and to resist malicious interference attacks. To address this problem, earlier work proposed avoiding interference attacks by frequency hopping (F. Yao and L. Jia, "A Collaborative Multi-Agent Reinforcement Learning Anti-Jamming Algorithm in Wireless Networks," IEEE Wireless Communications Letters, vol. 8, no. 4, pp. 1024-1027, Aug. 2019). However, most previous studies take only the maximization of whole-network throughput as the optimization target; they neither consider the actual service requirements of users nor include those requirements in the decision-making loop. With such methods the optimization objective often fails to match the user requirements, and resources are wasted.
In addition, existing anti-interference algorithms have the following two problems: (1) The lack of a cooperation mechanism among users means that each user tends to resist interference independently, so the collective advantage of the users is not exploited. (2) Asynchronous update algorithms are prevalent, that is, only one user updates its strategy per iteration, which slows the convergence of the algorithm.
Disclosure of Invention
The invention aims to provide a collaborative anti-interference model and a corresponding anti-interference learning algorithm, which can improve user quality of experience (QoE) and reduce interference influence.
The technical solution for achieving this purpose is as follows. It is considered that a malicious user can adaptively adjust its interference strategy according to the frequency usage of the communication users so as to maximize the interference utility. First, the adversarial relationship between the users and the jammer is modeled as a Stackelberg game. For the relationship among users, considering that mutual interference between users is asymmetric under space division multiple access, a non-cooperative game model is proposed in which each user also takes the benefit of its neighbors into account. Second, to overcome the waste of resources caused by blindly improving throughput, a user quality-of-experience model based on the mean opinion score (MOS) is proposed, and user utility is quantified by QoE level. Then, the local cooperation game among users is proved to be an exact potential game, and the network-wide optimal strategy of the users is proved to be a pure-strategy Nash equilibrium of the game. Finally, a user cooperation anti-interference algorithm is designed that achieves network-wide optimization using only local information.
An anti-jamming algorithm comprising the steps of:
Step 1, the cooperative anti-interference problem in a multi-user single-jammer scenario is modeled as a single-leader multi-follower Stackelberg game, in which the game participants are all the users and the jammer in the system.
Step 2, the jammer randomly selects one channel to jam, and the utility function of the jammer is defined as the sum of the interference power that the jammer applies to all co-channel users. The users select anti-interference channels according to the interference strategy. To reduce inter-user interference in this process, a local cooperation model is considered: the cooperation among users is analyzed within the potential game framework, and each user takes the benefit of its neighbor users into account. The utility function of a user is therefore defined as the sum of the QoE satisfaction of the user itself and of its neighbor users.
Step 3, all users adjust their anti-interference strategies simultaneously, and each user selects its channel according to its current flag bit and its strategies and utilities in the two preceding time slots. According to the different degrees to which users influence the network, the invention sets a different learning parameter for each user so as to improve the convergence rate of the algorithm.
Step 4, step 3 is repeated, with the users selecting strategies through exploration and learning, until the interference strategy and the anti-interference strategies of all users converge or the set number of iterations is reached.
Step 5, the jammer evaluates its utility $u_j(k)$ and updates its Q table.
Step 6, the jammer updates its strategy, and the procedure loops back to step 3 until the maximum number of cycles is reached.
Further, the cooperative anti-interference problem in the multi-user single-jammer scenario described in step 1 is modeled as a single-leader multi-follower Stackelberg game, expressed by a formula whose elements are the user set, the malicious jammer $j$, the strategy sets of the users and of the jammer, and the utility functions $u_n$ and $u_j$ of user $n$ and the jammer, respectively.
Further, the inter-user local cooperation model described in step 2 is modeled as an exact potential game, specifically as follows. The potential function among users is defined by a formula in which $a_n$ is the channel access strategy of user $n$ and $c_j$ is the channel selected by the jammer, and which involves the set of users interfered by user $n$ and the set of users that cause interference to user $n$. The formula represents the sum of the QoE satisfaction of all users in the whole network.
That the game is a potential game is proved as follows:
If any user $n$ unilaterally changes its strategy from $a_n$ to another strategy, the change in the utility function of that user is computed first.
In addition, the change in the potential function caused by the unilateral strategy change of user $n$ is computed,
using the set of users interfered by user $n$, the set of users that cause interference to user $n$, and the set difference obtained by deleting one of these sets from the other.
It then follows that the change in the utility function equals the change in the potential function;
the local cooperation model between users is therefore a potential game.
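The argument above can be sketched in symbols. This is a reconstruction under stated assumptions, not the formulas of the original: $q_k$ denotes the QoE satisfaction of user $k$, $\mathcal{N}$ the user set, $\mathcal{J}_n$ the set of users interfered by user $n$, $a_n'$ the changed strategy, and $\Delta q_k$ the change in $q_k$ when only user $n$ changes its channel. The set and difference symbols are shorthand chosen here for illustration.

```latex
\begin{aligned}
u_n &= q_n + \sum_{k \in \mathcal{J}_n} q_k ,
\qquad
\Phi = \sum_{k \in \mathcal{N}} q_k , \\[4pt]
\Phi(a_n', a_{-n}) - \Phi(a_n, a_{-n})
  &= \Delta q_n
   + \sum_{k \in \mathcal{J}_n} \Delta q_k
   + \sum_{k \notin \mathcal{J}_n \cup \{n\}} \Delta q_k \\
  &= \Delta q_n + \sum_{k \in \mathcal{J}_n} \Delta q_k \\
  &= u_n(a_n', a_{-n}) - u_n(a_n, a_{-n}) .
\end{aligned}
```

The third sum vanishes because a user that is not interfered by user $n$ sees the same interference before and after the change, so its satisfaction is unchanged. Equality of the two differences for every unilateral deviation is the defining property of an exact potential game.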
Further, in step 3 all users adjust their anti-interference strategies simultaneously, and each user selects its channel according to its current flag bit and its strategies and utilities in the two preceding time slots. The specific operation is as follows:
If the flag bit $Y_n(t-1) = 0$, user $n$ updates its channel according to a channel sounding rule
in which $M$ is the number of channels available to the user and a user-specific quantity serves as the learning parameter of user $n$. If $a_n(t) = a_n(t-1)$, the flag bit $Y_n(t)$ is set to 0; otherwise it is set to 1.
If the flag bit $Y_n(t-1) = 1$, user $n$ updates its channel according to a channel update rule
in which $\beta$ is the learning parameter, and $u_n(t-1)$ and $u_n(t-2)$ are the utilities of user $n$ in time slots $t-1$ and $t-2$, respectively. After the update, the flag bit is set to $Y_n(t) = 0$.
Further, the learning parameter of each user is set by a formula in $x_n$. When $x_n$ is sufficiently large, the user cooperation anti-interference algorithm gradually converges to the network-wide optimum. Different learning parameters are set for different users mainly to accelerate convergence, specifically:
$x_n(t) = \Gamma_n \cdot \varepsilon(t)$
where $\varepsilon(t) = \varepsilon(0) + t\,\Delta\varepsilon$ is a quantity that varies with time, $\varepsilon(0)$ is the initial value, $\Delta\varepsilon$ is the step size, and $t$ is the iteration number. $\Gamma_n$ indicates how much user $n$ affects the network.
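As a small illustration of the schedule above, the sketch below evaluates $x_n(t) = \Gamma_n \cdot \varepsilon(t)$ with $\varepsilon(t) = \varepsilon(0) + t\,\Delta\varepsilon$. The function names and the numeric values are hypothetical; how $\Gamma_n$ is computed is not given in this text, so it is passed in as an input.

```python
def epsilon(t, eps0, delta_eps):
    """Linear schedule eps(t) = eps(0) + t * delta_eps."""
    return eps0 + t * delta_eps


def learning_exponent(gamma_n, t, eps0, delta_eps):
    """x_n(t) = Gamma_n * eps(t), where gamma_n is the influence weight of user n.

    gamma_n is treated as a given number (assumption: its formula is not
    reproduced here).
    """
    return gamma_n * epsilon(t, eps0, delta_eps)


if __name__ == "__main__":
    # Hypothetical values: a user with larger gamma_n gets a larger x_n(t)
    # at the same iteration, i.e. a different learning parameter.
    for gamma_n in (1.0, 2.0):
        print(gamma_n, [learning_exponent(gamma_n, t, 0.5, 0.1) for t in (0, 10, 20)])
```

Because $x_n(t)$ grows linearly in $t$ and is scaled per user, users with a larger influence weight reach a large $x_n$ sooner under this schedule.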
Further, in step 5 the jammer evaluates its utility $u_j(k)$ and updates its Q table, as follows:
The jammer evaluates its current utility $u_j$, the sum of the interference power it applies to all co-channel users, by a formula
in which $p_j$ is the interference power; $f_{c_j}$ is the interference frequency; $d_{jn}$ is the distance between the jammer and user $n$; and $h(f_{c_j}, d_{jn})$ is the channel gain, which depends on the interference frequency and the interference distance;
The Q table is updated as:
$Q_{k+1}(c_j(k)) = (1-\lambda)\,Q_k(c_j(k)) + \lambda\,u_j(k),$
where $Q_{k+1}$ is the Q value of the jammer in period $k+1$; $c_j(k)$ is the interference channel selected by the jammer in period $k$; $Q_k$ is the Q value of the jammer in period $k$; $u_j(k)$ is the utility of the jammer in period $k$; and $\lambda \in (0, 1)$ is the learning rate, which controls the convergence speed of the Q-learning.
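The update above can be written as a few lines of Python. This is a minimal sketch: the function name `update_q` and the list representation of the Q table (one entry per channel) are assumptions made for illustration.

```python
def update_q(q_table, channel, utility, lam):
    """Return a new Q table after Q_{k+1}(c) = (1 - lam) * Q_k(c) + lam * u.

    Only the entry of the channel that was actually jammed in period k is
    updated; all other entries are carried over unchanged.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("learning rate lam must lie in (0, 1)")
    new_q = list(q_table)
    new_q[channel] = (1.0 - lam) * q_table[channel] + lam * utility
    return new_q


if __name__ == "__main__":
    q = [0.0, 0.0, 0.0, 0.0]          # M = 4 channels, all Q values start at 0
    q = update_q(q, channel=2, utility=10.0, lam=0.1)
    print(q)
```

A small $\lambda$ makes the Q value a slowly moving average of past utilities, while a $\lambda$ close to 1 makes it track the most recent utility.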
Further, the interference strategy update in step 6 is as follows:
The jammer updates its own channel selection strategy using the Boltzmann function,
in which $\tau$ is the temperature coefficient, which sets the trade-off between exploration and exploitation, and whose value is the probability that the jammer selects channel $c_j(k)$ in period $k$.
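For illustration, the sketch below uses the standard Boltzmann (softmax) form, in which the probability of a channel is proportional to $\exp(Q/\tau)$. That this is exactly the form used in the original formula is an assumption; the function name is hypothetical.

```python
import math


def boltzmann_probabilities(q_values, tau):
    """Channel selection probabilities proportional to exp(Q / tau).

    Assumed standard Boltzmann form. The maximum Q value is subtracted
    before exponentiating, which leaves the probabilities unchanged and
    avoids overflow for small tau.
    """
    if tau <= 0.0:
        raise ValueError("temperature tau must be positive")
    top = max(q_values)
    weights = [math.exp((q - top) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    q = [0.0, 1.0, 2.0, 0.5]
    print(boltzmann_probabilities(q, tau=5.0))   # high tau: close to uniform
    print(boltzmann_probabilities(q, tau=0.1))   # low tau: nearly greedy
```

A high temperature spreads the probability almost evenly over the channels (exploration), while a low temperature concentrates it on the channel with the largest Q value (exploitation).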
Compared with the prior art, the invention has the following notable advantages: (1) For the multi-user anti-interference problem, it provides a framework that models both the strongly adversarial relationship between the users and the jammer and the cooperative relationship among users. (2) It takes the diverse service requirements of users into account; to avoid the resource waste caused by users blindly pursuing higher throughput, it provides a MOS-based QoE model and an optimization mechanism centered on user requirements, quantifies user utility by QoE level, and exploits the diversity of user requirements to improve system performance. (3) Using the finite improvement property of potential games, it designs a multi-user synchronous anti-interference algorithm, and it improves the convergence rate of the algorithm by setting different learning parameters for different users, exploiting the fact that each user influences the whole network to a different degree.
Drawings
Fig. 1 is a schematic diagram of a multi-user single interference network in a hierarchical anti-interference model for heterogeneous service requirements according to the present invention.
FIG. 2 is a graph comparing the convergence of the algorithm of the present invention with the prior art asynchronous learning algorithm.
Fig. 3 is a schematic diagram of the anti-interference effect of the algorithm of the present invention when the interference power is changed.
Detailed Description
With reference to fig. 1, in the hierarchical anti-interference model for multi-user service requirements of the invention, the system has two millimeter-wave picocell base stations 50 m apart, and the users are randomly distributed within a circle of radius 100 m centered on the base station. The jammer is located about 100-200 m from the two base stations. In addition, the number of available channels is set to $M = 4$, the channel bandwidth to $B = 1$ MHz, and the noise power spectral density to $N_0 = -130$ dB/Hz.
The invention is directed to a hierarchical anti-interference model for multi-user service requirements, in which the jammer is modeled as the leader and the users as followers. The adversarial relationship between the jammer and the users is modeled as a Stackelberg game in order to find a way of avoiding the interference. The cooperative relationship among users is modeled as a potential game in order to find a way of eliminating co-channel interference. In addition, the cooperation among users provided by the invention is information-level cooperation, meaning that neighboring users exchange their QoE satisfaction.
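As a quick arithmetic check of these settings, the noise power within one channel follows from the spectral density and the bandwidth. The sketch assumes a flat noise spectrum across the channel and keeps the dB reference exactly as stated in the text; the function name is hypothetical.

```python
import math


def noise_power_db(n0_db_per_hz, bandwidth_hz):
    """Total noise power in dB over a channel: N0 [dB/Hz] + 10*log10(B [Hz])."""
    return n0_db_per_hz + 10.0 * math.log10(bandwidth_hz)


if __name__ == "__main__":
    # N0 = -130 dB/Hz and B = 1 MHz, as in the settings above.
    print(noise_power_db(-130.0, 1e6))
```

With $B = 1$ MHz the bandwidth term contributes 60 dB, so the in-channel noise power is 60 dB above the per-hertz density.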
Based on the relation between the QoE satisfaction degree of the whole network and the user strategy, the invention accurately maps the user behavior to the system performance by proving the existence of Nash equilibrium and Stackelberg equilibrium, and provides theoretical guidance for further providing a corresponding anti-interference algorithm.
The user cooperation anti-interference algorithm of the invention, for the hierarchical anti-interference model oriented to heterogeneous service requirements, comprises the following steps:
Step 1, the cooperative anti-interference problem in a multi-user single-jammer scenario is modeled as a single-leader multi-follower Stackelberg game, in which the game participants are all the users and the jammer in the system.
Step 2, the jammer randomly selects one channel to jam, and the utility function of the jammer is defined as the sum of the interference power that the jammer applies to all co-channel users. The users select anti-interference channels according to the interference strategy. To reduce inter-user interference in this process, a local cooperation model is considered: the cooperation among users is analyzed within the potential game framework, and each user takes the benefit of its neighbor users into account. The utility function of a user is therefore defined as the sum of the QoE satisfaction of the user itself and of its neighbor users.
Step 3, all users adjust their anti-interference strategies simultaneously, and each user selects its channel according to its current flag bit and its strategies and utilities in the two preceding time slots. According to the different degrees to which users influence the whole network, a different learning parameter is set for each user so as to improve the convergence rate of the algorithm.
Step 4, step 3 is repeated, with the users selecting strategies through exploration and learning, until the interference strategy and the anti-interference strategies of all users converge or the set number of iterations is reached.
Step 5, the jammer evaluates its utility $u_j(k)$ and updates its Q table.
Step 6, the jammer updates its strategy, and the procedure loops back to step 3 until the maximum number of cycles is reached.
The specific embodiments of the present invention are as follows:
1. The adversarial relationship between the multiple users and the jammer is modeled as a Stackelberg game, expressed by a formula whose elements are the user set, the malicious jammer $j$, the strategy sets of the users and of the jammer, and the utility functions $u_n$ and $u_j$ of user $n$ and the jammer, respectively.
2. Because users run a variety of services, their throughput requirements differ. In other words, the same throughput may correspond to different QoE satisfaction under different services. The QoE satisfaction is calculated as follows:
User $n$ can access only one base station at a time; the base station accessed by user $n$ is denoted $S_n$. From the distance between base station $S_n$ and user $n$ and the direction angle from $S_n$ to user $n$, the directional gain of $S_n$ in the direction of user $m$ while it serves user $n$ by beamforming is obtained by a formula
in which $\theta_n$ is the main-lobe width of the beam when base station $S_n$ serves user $n$.
The beam coverage area for serving user $n$ is defined in terms of the same main-lobe width $\theta_n$.
Further, the set of potential users that are interfered by user $n$ and the set of potential users that cause interference to user $n$ are defined
in terms of the coverage area of the beam serving user $m$; the directional gain $g_{mn}$ of $S_m$ in the direction of user $n$ while it serves user $m$ by beamforming; the beam gain threshold $g_0$, taken as 0.01; and the set of all users other than user $n$.
Thus, the sum of the external malicious interference and the inter-user interference suffered by user $n$ is expressed by a formula
in which $f_{c_j}$ is the interference frequency; the frequency of channel $a_m$ also appears; $a_m$, $a_n$ and $c_j$ are the channels selected by user $m$, user $n$ and the jammer, respectively; $g_{mn}$ is the directional gain of $S_m$ in the direction of user $n$ while it serves user $m$ by beamforming; the channel gains are those of the channel on which user $m$ is located and of the channel on which the jammer is located; $p_m$ is the transmit power of user $m$; $d_{jn}$ is the distance from the jammer to user $n$; $p_j$ is the interference power; and $\delta(x, y)$ is an indicator function that equals 1 when $x = y$ and 0 otherwise.
Therefore, the communication rate of user $n$ is expressed by a formula
in which $B$ is the channel bandwidth; $p_n$ is the transmit power of user $n$; the distance from base station $S_n$ to user $n$ enters through the channel gain of the channel on which user $n$ is located; $N_0$ is the noise power spectral density; and $D_n$ is the sum of the external malicious interference and the mutual interference suffered by user $n$;
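A rate of this kind can be sketched with a Shannon-type expression $R = B \log_2(1 + \mathrm{SINR})$. The Shannon form, the function name and the way the channel gain is passed in as a single number are assumptions made here; the original formula may, for example, also include beam gains.

```python
import math


def communication_rate(bandwidth_hz, tx_power, channel_gain, noise_psd, interference):
    """Assumed Shannon-type rate in bit/s; all power terms in linear units.

    SINR = p_n * h / (N0 * B + D_n), where D_n is the total interference
    (jammer plus co-channel users) received by user n.
    """
    sinr = tx_power * channel_gain / (noise_psd * bandwidth_hz + interference)
    return bandwidth_hz * math.log2(1.0 + sinr)


if __name__ == "__main__":
    # Hypothetical linear-scale values: raising the interference D_n lowers the rate.
    for d_n in (0.0, 1e-9, 1e-7):
        print(d_n, communication_rate(1e6, 0.1, 1e-7, 1e-16, d_n))
```

The rate falls as $D_n$ grows, which is why a user benefits from leaving a channel that is occupied by the jammer or by strongly interfering neighbors.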
The MOS function is defined as:
$\mathrm{MOS} = \varepsilon \log_{10}(R/\gamma),$
where $R$ is the throughput of the user; $\varepsilon$ and $\gamma$ are constants determined by the maximum and minimum throughput requirements of the user, so their values differ between users with different service requirements. The mapping between MOS values and the five levels is shown in Table 1.
Table 1: mean Opinion Score (MOS)
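The MOS function can be evaluated as in the sketch below. The helper `fit_mos_constants` illustrates one possible way of choosing $\varepsilon$ and $\gamma$ from a minimum and maximum throughput requirement, by pinning the lowest and highest MOS values to those two throughputs; that calibration, the endpoint values 1 and 5, and both function names are assumptions, since the text only states that the constants depend on the throughput requirements.

```python
import math


def mos(throughput, eps, gamma):
    """MOS = eps * log10(R / gamma), as defined in the text."""
    return eps * math.log10(throughput / gamma)


def fit_mos_constants(r_min, r_max, mos_min=1.0, mos_max=5.0):
    """Hypothetical calibration: MOS(r_min) = mos_min and MOS(r_max) = mos_max."""
    eps = (mos_max - mos_min) / math.log10(r_max / r_min)
    gamma = r_min / 10.0 ** (mos_min / eps)
    return eps, gamma


if __name__ == "__main__":
    # Two hypothetical services with different throughput requirements (bit/s).
    for r_min, r_max in ((1e5, 1e7), (1e4, 1e5)):
        eps, gamma = fit_mos_constants(r_min, r_max)
        print(r_min, r_max, [round(mos(r, eps, gamma), 2) for r in (r_min, r_max)])
```

Under this calibration the same throughput maps to different MOS values for services with different requirements, which matches the statement that identical throughput can yield different QoE satisfaction.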
Further, a function is used to quantify the different experience levels of the user, giving the satisfaction of user $n$ at the different QoE levels.
Based on the above analysis, the optimization objective is to maximize the QoE return of the whole network, that is, the sum of user satisfaction.
The utility function of user $n$ is defined accordingly,
and the optimization problem of user $n$ is to select the channel that maximizes this utility.
Further, all users together constitute the lower-level sub-game.
For the jammer, the objective is to maximize the cumulative interference to all users, and its utility function is defined accordingly.
The decision optimization problem of the jammer is to select the channel that maximizes this utility,
and the jammer constitutes the upper-level sub-game.
3. The channel selection procedure for each user is as follows:
(1) Initialization: each user selects a channel $a_n(0)$ from its set of available channels with equal probability and sets its flag bit $Y_n(0) = 0$.
(2) Channel sounding: if $Y_n(t-1) = 0$, user $n$ updates its channel according to a channel sounding rule
in which $M$ is the number of channels available to the user and a user-specific quantity can be regarded as the learning rate of user $n$. If $a_n(t) = a_n(t-1)$, the flag bit $Y_n(t)$ is set to 0; otherwise it is set to 1.
(3) Channel update: if $Y_n(t-1) = 1$, user $n$ updates its channel according to a channel update rule
in which $\beta$ is a learning parameter, and $u_n(t-1)$ and $u_n(t-2)$ are the utilities of user $n$ in time slots $t-1$ and $t-2$, respectively. After the update, the flag bit is set to $Y_n(t) = 0$.
4. Steps 1 to 3 are repeated, with all users simultaneously performing exploration learning and channel access, until the channel access selections of all users converge or the set number of iterations is reached.
The local cooperation model can be proved to be a potential game, so at least one Nash equilibrium solution exists, and the corresponding anti-interference algorithm can be designed using the finite improvement property of potential games.
5. The jammer evaluates its utility $u_j(k)$ and updates its Q value as follows:
$Q_{k+1}(c_j(k)) = (1-\lambda)\,Q_k(c_j(k)) + \lambda\,u_j(k),$ (6-25)
where $\lambda \in (0, 1)$ is the learning rate, which controls the convergence speed of the Q-learning.
Similar to the users, the jammer also updates its own channel selection strategy using the Boltzmann function,
where $\tau$ is the temperature coefficient, which sets the trade-off between exploration and exploitation.
6. The procedure loops back to step 3 until the maximum number of cycles is reached.
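The flag-based channel selection of item 3 above can be sketched as two small functions. The probabilistic forms used here are assumptions made for illustration, since the update formulas themselves are not reproduced in this text: in the sounding phase the user tries a different channel, chosen uniformly among the other $M-1$ channels, with a probability `explore_prob` that stands in for the per-user learning parameter; in the update phase it keeps the trial channel with a logistic probability in $\beta\,(u_n(t-1) - u_n(t-2))$ and otherwise reverts. All names are hypothetical.

```python
import math
import random


def sound_channel(prev_channel, num_channels, explore_prob, rng):
    """Case Y_n(t-1) = 0: try another channel with probability explore_prob.

    Returns (a_n(t), Y_n(t)); the flag is 0 if the channel is unchanged, else 1.
    """
    if rng.random() < explore_prob:
        others = [c for c in range(num_channels) if c != prev_channel]
        channel = rng.choice(others)
    else:
        channel = prev_channel
    return channel, (0 if channel == prev_channel else 1)


def update_channel(chan_t1, chan_t2, util_t1, util_t2, beta, rng):
    """Case Y_n(t-1) = 1: keep the trial channel a_n(t-1) or revert to a_n(t-2).

    Assumed rule: keep with probability 1 / (1 + exp(-beta * (u(t-1) - u(t-2)))).
    The flag is always reset to 0 afterwards.
    """
    z = beta * (util_t1 - util_t2)
    # Numerically stable logistic function.
    keep_prob = 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))
    channel = chan_t1 if rng.random() < keep_prob else chan_t2
    return channel, 0


if __name__ == "__main__":
    rng = random.Random(1)
    print(sound_channel(prev_channel=0, num_channels=4, explore_prob=0.3, rng=rng))
    print(update_channel(chan_t1=2, chan_t2=0, util_t1=4.0, util_t2=3.0, beta=2.0, rng=rng))
```

Because every user runs these two functions in every slot, all users can update at the same time, and the flag ensures that a trial move is always followed by an evaluation of whether the move improved the utility.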
Example 1
One embodiment of the invention is described below. MATLAB software is used for system simulation, and the parameter settings do not affect generality. The system has two millimeter-wave picocell base stations 50 m apart, and the users are randomly distributed within a circle of radius 100 m centered on the base station. The jammer is located about 100-200 m from the two base stations. In addition, the number of available channels is set to $M = 4$, the channel bandwidth to $B = 1$ MHz, the noise power spectral density to $N_0 = -130$ dB/Hz, and the learning parameter to $\beta = t/2500$. The learning rate of the jammer is $\lambda = 0.1$, and the temperature coefficient is set as a function of $K$ and $k$, where $K$ is the total number of simulation periods and $k$ is the current simulation period.
The user cooperation anti-interference algorithm of the invention proceeds as follows:
Step 2: in the $k$-th period, the jammer selects a channel $c_j(k)$ according to its channel selection probability; every user selects a channel $a_n(0)$ from its set of available channels with equal probability and sets its flag bit $Y_n(0) = 0$.
During this period, all users simultaneously perform the following process:
Loop for $t = 1, 2, \ldots$:
Channel sounding:
If $Y_n(t-1) = 0$, user $n$ updates its channel according to the channel sounding rule,
in which $M$ is the number of channels available to the user and a user-specific quantity can be regarded as the learning rate of user $n$. If $a_n(t) = a_n(t-1)$, the flag bit $Y_n(t)$ is set to 0; otherwise it is set to 1.
Channel update:
If $Y_n(t-1) = 1$, user $n$ updates its channel according to the channel update rule,
in which $\beta$ is a learning parameter, and $u_n(t-1)$ and $u_n(t-2)$ are the utilities of user $n$ in time slots $t-1$ and $t-2$, respectively. After the update, the flag bit is set to $Y_n(t) = 0$.
Step 3: the jammer obtains its utility $u_j(k)$;
Step 4: the jammer updates its Q value according to:
$Q_{k+1}(c_j(k)) = (1-\lambda)\,Q_k(c_j(k)) + \lambda\,u_j(k),$
where $\lambda \in (0, 1)$ is the learning rate, which controls the convergence speed of the Q-learning.
Similar to the users, the jammer also updates its own channel selection strategy using the Boltzmann function,
where $\tau$ is the temperature coefficient, which sets the trade-off between exploration and exploitation.
Step 5: update $k = k + 1$ and go to step 2, until the maximum number of cycles is reached.
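The leader-follower structure of steps 2 to 5 can be sketched as the toy loop below. Only the nesting of the loops, the Q update and the Boltzmann draw follow the text; everything else is a placeholder. In particular, `toy_user_step` simply moves a user off the jammed channel instead of applying the flag-based learning rule, the jammer utility is counted as the number of users still on the jammed channel rather than as received interference power, and the temperature is held constant. All names and default values are hypothetical.

```python
import math
import random


def boltzmann(q_values, tau):
    """Assumed standard Boltzmann (softmax) probabilities over the Q values."""
    top = max(q_values)
    weights = [math.exp((q - top) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def toy_user_step(channel, jammed, num_channels, rng):
    """Placeholder follower rule: leave the jammed channel, otherwise stay."""
    if channel != jammed:
        return channel
    return rng.choice([c for c in range(num_channels) if c != jammed])


def run(num_channels=4, num_users=6, periods=50, slots=5, lam=0.1, tau=0.5, seed=0):
    rng = random.Random(seed)
    q = [0.0] * num_channels
    users = [rng.randrange(num_channels) for _ in range(num_users)]
    history = []
    for _ in range(periods):
        # Step 2: the jammer (leader) draws its channel c_j(k).
        c_j = rng.choices(range(num_channels), weights=boltzmann(q, tau))[0]
        # Followers: all users update simultaneously in every slot.
        for _ in range(slots):
            users = [toy_user_step(a, c_j, num_channels, rng) for a in users]
        # Step 3: toy jammer utility = users still on the jammed channel.
        u_j = sum(1 for a in users if a == c_j)
        # Step 4: Q update for the channel that was jammed in this period.
        q[c_j] = (1.0 - lam) * q[c_j] + lam * u_j
        history.append((c_j, u_j))
    return q, users, history


if __name__ == "__main__":
    q, users, history = run()
    print(q, users, history[-1])
```

In this toy setting the followers always escape the jammed channel within a period, so the jammer's utility stays at zero; with the learning rules and the power-based utility of the invention, the inner loop would instead run until the users' strategies converge.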
With reference to fig. 2, which shows the convergence of the cooperative anti-interference algorithm, the comparison algorithm is an asynchronous learning algorithm in which only one user updates its strategy per iteration. The figure shows that the synchronous learning algorithm of the invention significantly improves the learning speed.
With reference to fig. 3, which shows the impact of interference power on the network satisfaction rate for different numbers of users, the network satisfaction rate remains essentially unchanged as the interference power increases. The method of the invention therefore helps users avoid the interference channel and has a good anti-interference effect.
In summary, the hierarchical anti-interference model and the user cooperation anti-interference algorithm for multi-user service requirements provided by the invention take into account that a malicious user can adaptively adjust its interference strategy according to the frequency usage of the communication users so as to maximize the interference utility. The adversarial relationship between the users and the jammer is modeled as a Stackelberg game. In addition, by considering the asymmetric mutual interference among users under space division multiple access, a user cooperation anti-interference algorithm is provided that effectively improves the network satisfaction rate. Comparison with an asynchronous learning algorithm demonstrates a marked improvement in the convergence rate of the proposed algorithm, and performance comparison under different interference powers demonstrates the effectiveness of the proposed anti-interference algorithm.
Claims (1)
1. The user cooperation anti-interference method for the beam forming communication is characterized in that interference is modeled as a leader, a user is modeled as a follower, and the interference always aims at causing maximum interference to the user; the user needs to combine the self business requirement and utilize the anti-interference algorithm to maximize the user satisfaction degree of the whole network, namely the network satisfaction rate; the method comprises the following steps:
step 1, modeling the cooperative anti-interference problem in a multi-user single-jammer scenario as a single-leader multi-follower Stackelberg game model, wherein the game participants are all the users and the jammer in the system;
the cooperative anti-interference problem in the multi-user single-jammer scenario is modeled as a single-leader multi-follower Stackelberg game model, which is expressed by a formula
whose elements are the user set, the malicious jammer $j$, the strategy sets of the users and of the jammer, and the utility functions $u_n$ and $u_j$ of user $n$ and the jammer, respectively;
step 2, the jammer randomly selects one channel to jam, and the utility function of the jammer is defined as the sum of the interference power applied by the jammer to all users of the same channel; the users select anti-interference channels according to the interference strategy, the cooperation among the users is analyzed within the potential game framework, and each user takes the benefit of its neighbor users into account; the utility function of a user is defined as the sum of the QoE satisfaction of the user itself and of its neighbor users;
the utility function $u_n$ of a user in the local cooperation model is defined as the sum of the QoE satisfaction of the user itself and of its neighbor users, expressed by a formula
in which $a_n$ is the channel access strategy of user $n$ and $c_j$ is the channel selected by the jammer; the formula further involves the set of users interfered by user $n$, the set of users that cause interference to user $n$, the set of users that cause interference to user $k$, and the channel selections of all users in these sets; $q_n$ is the QoE satisfaction of user $n$; $q_k$ is the QoE satisfaction of user $k$;
the QoE satisfaction is a function of the user throughput and the specific service requirement, and the mapping can be represented by the MOS function;
the MOS function is defined as:
$\mathrm{MOS} = \varepsilon \log_{10}(R/\gamma),$ (6-3)
where $R$ is the throughput of the user; $\varepsilon$ and $\gamma$ are constants determined by the maximum and minimum throughput requirements of the user, so their values differ between users with different service requirements;
satisfaction of user n at different QoE levels is expressed as:
the local cooperation model is proven to be an exact potential game; the proof is as follows:
the potential energy function is expressed as:
for an arbitrary user n that unilaterally changes its strategy from $a_n$ to a new strategy, the resulting change in satisfaction matches the change in the potential function, namely:
where $a_n$ is the original channel access strategy of user n and the new strategy is its changed channel access strategy; the channel selections of all users in the relevant user set are taken after user n has changed its strategy; $a_{-n}$ is the channel access of the remaining users; $c_j$ is the channel selected by the jammer; the two user sets are the set of users interfered with by user n and the set of users that cause interference to user n; and the set-difference notation denotes deleting the elements of one set from another;
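The exact potential property can be checked numerically on a small example. The sketch below is a toy instance under stated assumptions: the symmetric neighbor graph, the form of the satisfaction function, and the use of total satisfaction as the potential are all hypothetical choices made for illustration, not the patent's own definitions. It enumerates every unilateral deviation and confirms that the change in a user's utility equals the change in the potential.

```python
import itertools

# Hypothetical toy network: symmetric interference graph, 4 users, 3 channels.
NEIGHBORS = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
CHANNELS = (0, 1, 2)


def satisfaction(n, profile, jammed):
    """Toy satisfaction: 0 if on the jammed channel,
    otherwise 1 / (1 + number of neighbors on the same channel)."""
    if profile[n] == jammed:
        return 0.0
    clashes = sum(1 for k in NEIGHBORS[n] if profile[k] == profile[n])
    return 1.0 / (1 + clashes)


def utility(n, profile, jammed):
    """Own satisfaction plus the satisfaction of all neighbors."""
    return satisfaction(n, profile, jammed) + sum(
        satisfaction(k, profile, jammed) for k in NEIGHBORS[n])


def potential(profile, jammed):
    """Assumed potential function: total satisfaction of all users."""
    return sum(satisfaction(n, profile, jammed) for n in NEIGHBORS)


def is_exact_potential(jammed):
    """Check delta-utility == delta-potential for every unilateral deviation."""
    for profile in itertools.product(CHANNELS, repeat=len(NEIGHBORS)):
        for n in NEIGHBORS:
            for new_channel in CHANNELS:
                deviated = profile[:n] + (new_channel,) + profile[n + 1:]
                du = utility(n, deviated, jammed) - utility(n, profile, jammed)
                dphi = potential(deviated, jammed) - potential(profile, jammed)
                if abs(du - dphi) > 1e-9:
                    return False
    return True
```

The check passes in this toy case because a deviation by user n only alters the satisfaction of n and of its neighbors, which are exactly the terms in n's utility. With asymmetric interference sets, as the claim's separate "interfered with by" and "causing interference to" sets allow, the potential function would need to be adapted accordingly.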
Step 3, all users adjust their anti-interference strategies simultaneously, and each user selects its channel according to the current flag bit and the strategies and payoffs of the previous two time slots; different learning parameters are set for the users according to each user's degree of influence on the whole network, which improves the convergence speed of the algorithm;
different learning parameters are set for each user according to the user's degree of influence on the network, specifically:
where $x_n(t) = \Gamma_n \times \varepsilon(t)$; $\Gamma_n$ indicates the degree of influence of user n on the network; $\varepsilon(t) = \varepsilon(0) + t\,\Delta\varepsilon$, where $\varepsilon(0)$ is the initial value, $\Delta\varepsilon$ is the step size, and t is the iteration number;
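The per-user learning parameter can be sketched as follows. Because the expression that defines $\Gamma_n$ is not reproduced in this text, the sketch treats the influence degree as an input; the numeric values in the example are hypothetical.

```python
def epsilon_schedule(t: int, eps0: float, delta_eps: float) -> float:
    """epsilon(t) = epsilon(0) + t * delta_epsilon (linear in the iteration count)."""
    return eps0 + t * delta_eps


def learning_parameter(gamma_n: float, t: int,
                       eps0: float, delta_eps: float) -> float:
    """x_n(t) = Gamma_n * epsilon(t); gamma_n is supplied by the caller."""
    return gamma_n * epsilon_schedule(t, eps0, delta_eps)


# Example with made-up influence degrees for three users at iteration 10.
influence = {"user_a": 1.0, "user_b": 2.0, "user_c": 0.5}
params = {name: learning_parameter(g, t=10, eps0=0.1, delta_eps=0.05)
          for name, g in influence.items()}
```

A user with a larger influence degree gets a proportionally larger parameter at every iteration, while the shared schedule grows linearly with t.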
Step 4, steps 1 to 3 are repeated, with the users selecting strategies through exploration and learning, until the jammer's strategy and the anti-interference strategies of all users converge or the set number of iterations is reached;
Step 5, the jammer evaluates its utility $u_j(k)$ and updates its Q table, specifically:
the jammer evaluates its current utility $u_j$:
where $p_j$ is the interference power; $f_{c_j}$ is the interference frequency; $d_{jn}$ is the distance between the jammer and user n; and $h(f_{c_j}, d_{jn})$ is the channel gain, which depends on the interference frequency and the interference distance;
updating the Q table:
$Q_{k+1}(c_j(k)) = (1-\lambda)\,Q_k(c_j(k)) + \lambda\,u_j(k)$,
where $Q_{k+1}$ is the jammer's Q value in period k+1; $c_j(k)$ is the channel the jammer selects to interfere with in period k; $Q_k$ is the jammer's Q value in period k; $u_j(k)$ is the jammer's utility in period k; and $\lambda \in (0, 1)$ is the learning rate, which controls the convergence speed of the Q-learning;
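The Q-table update of step 5 can be written as a short Python sketch. Representing the Q table as a list indexed by channel is an assumption made for illustration; only the entry of the channel selected in period k is changed.

```python
def update_q(q_table: list, channel: int, utility: float, lam: float) -> list:
    """Apply Q_{k+1}(c) = (1 - lam) * Q_k(c) + lam * u_j(k) to the chosen channel.
    Entries of all other channels are left unchanged."""
    if not 0.0 < lam < 1.0:
        raise ValueError("learning rate must lie in (0, 1)")
    q_table[channel] = (1.0 - lam) * q_table[channel] + lam * utility
    return q_table


# Example: three channels, the jammer picks channel 1 twice with utility 4.0.
q = [0.0, 0.0, 0.0]
update_q(q, channel=1, utility=4.0, lam=0.25)   # q[1] becomes 1.0
update_q(q, channel=1, utility=4.0, lam=0.25)   # q[1] becomes 1.75
```

With a constant utility, the entry moves toward that utility by a fraction $\lambda$ of the remaining gap in each period, which is how the learning rate governs convergence speed.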
Step 6, the jammer updates its strategy, and the procedure loops back to step 3 until the maximum number of iterations is reached;
the jammer's strategy is updated as follows:
the jammer updates its own channel selection strategy using the Boltzmann function:
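Since the exact expression is not reproduced in this text, the following is only a minimal sketch of the commonly used Boltzmann (softmax) selection rule, in which the probability of choosing a channel is proportional to $\exp(Q(c)/\tau)$. The temperature parameter `temperature` and both function names are assumptions, not taken from the patent.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q values: P(c) proportional to exp(Q(c) / temperature).
    The maximum is subtracted first for numerical stability."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_channel(q_values, temperature, rng=random):
    """Draw one channel index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Example: a higher Q value gives a higher selection probability.
probs = boltzmann_probabilities([1.0, 2.0, 3.0], temperature=1.0)
```

A high temperature makes the selection close to uniform (more exploration), while a low temperature concentrates the probability on the channel with the largest Q value.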
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110896542.1A CN113613337B (en) | 2021-08-05 | 2021-08-05 | User cooperation anti-interference method for beam forming communication |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113613337A (en) | 2021-11-05 |
CN113613337B (en) | 2023-06-20 |
Family
ID=78307112
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110896542.1A Active CN113613337B (en) | 2021-08-05 | 2021-08-05 | User cooperation anti-interference method for beam forming communication |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113613337B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114698128B (en) * | 2022-05-17 | 2022-09-13 | 中国人民解放军战略支援部队航天工程大学 | Anti-interference channel selection method and system for cognitive satellite-ground network |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108616916A (en) * | 2018-04-28 | 2018-10-02 | 中国人民解放军陆军工程大学 | A kind of anti-interference layering betting model of cooperation and anti-interference learning algorithm |
CN112188504A (en) * | 2020-09-30 | 2021-01-05 | 中国人民解放军陆军工程大学 | Multi-user cooperative anti-interference system and dynamic spectrum cooperative anti-interference method |
Non-Patent Citations (1)
Title |
---|
An Anti-Jamming Hierarchical Optimization Approach in Relay Communication System via Stackelberg Game; Zhibin Feng, et al.; 《MDPI》; full text *
Also Published As
Publication number | Publication date |
---|---|
CN113613337A (en) | 2021-11-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||