CN113613337B - User cooperation anti-interference method for beam forming communication - Google Patents

Publication number
CN113613337B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110896542.1A
Other languages
Chinese (zh)
Other versions
CN113613337A (en
Inventor
任国春
徐煜华
张云鹏
徐逸凡
方贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Army Engineering University of PLA
Original Assignee
Army Engineering University of PLA
Application filed by Army Engineering University of PLA
Priority to CN202110896542.1A
Publication of CN113613337A
Application granted
Publication of CN113613337B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W72/00: Local resource management
    • H04W72/50: Allocation or scheduling criteria for wireless resources
    • H04W72/54: Allocation or scheduling criteria for wireless resources based on quality criteria
    • H04W72/541: Allocation or scheduling criteria for wireless resources based on quality criteria using the level of interference
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a user cooperation anti-interference method for beamforming communication. The adversarial relationship between multiple users and a jammer is modeled with the jammer as the leader and the users as followers: the jammer continuously adjusts its jamming strategy to maximize its utility, and the cooperative anti-interference behavior among users is modeled as a potential game. First, the strategies of the users and the jammer are initialized, i.e. each randomly selects a communication or jamming channel, and every user's flag bit is set to 0. All users then simultaneously perform a channel sounding or channel update operation, compute the resulting utility, exchange their quality-of-experience (QoE) satisfaction with one another, and update their flag bits according to the selected strategy. This loop iterates until the anti-interference strategies of all users converge. The jammer then updates its Q table and adjusts its strategy, and the whole process repeats until the jamming strategy converges. The invention improves the convergence rate by setting different learning parameters for different users, and improves the anti-interference efficiency of the network through information-level cooperation among users.

Description

User cooperation anti-interference method for beam forming communication
Technical Field
The invention belongs to the technical field of wireless communication, and particularly relates to a user cooperation anti-interference method for beam forming communication.
Background
With the development of wireless technology, global communication traffic is growing exponentially, and in hot-spot areas users are typically ultra-densely distributed. This makes it very difficult for users to coordinate their frequency use and to resist malicious jamming attacks. To address this problem, earlier work proposed avoiding jamming attacks by frequency hopping (F. Yao and L. Jia, A Collaborative Multi-Agent Reinforcement Learning Anti-Jamming Algorithm in Wireless Networks, IEEE Wireless Communications Letters, vol. 8, no. 4, pp. 1024-1027, Aug. 2019). However, most previous studies take only the maximization of whole-network throughput as the optimization target; they consider neither the actual service requirements of users nor the user's needs within the decision loop. With such methods the optimization objective often fails to match the user requirements, and resources are wasted.
In addition, existing anti-interference algorithms have the following two problems: (1) They lack a cooperation mechanism among users, so users tend to counter jamming independently and the benefit of collective intelligence is not exploited. (2) Asynchronous update algorithms are prevalent, i.e. only one user updates its strategy in each iteration, which slows convergence.
Disclosure of Invention
The invention aims to provide a cooperative anti-interference model and a corresponding anti-interference learning algorithm that improve the users' quality of experience (QoE) and reduce the impact of jamming.
The technical solution for realizing the purpose of the invention is as follows. It is assumed that a malicious user can adaptively adjust its jamming strategy according to the spectrum usage of the communication users so as to maximize the jamming utility. First, the adversarial relationship between the users and the jammer is modeled as a Stackelberg game. For the relationship among users, the asymmetric mutual interference that arises under space-division multiple access is taken into account, and a non-cooperative game model with local altruistic behavior is proposed. Second, to avoid the waste of resources caused by blindly increasing throughput, a user QoE model based on the Mean Opinion Score (MOS) is proposed, and the user utility is quantified by QoE level. Then, the local altruistic game among users is proved to be an exact potential game, and the network-wide optimal strategy of the users is further proved to be a pure-strategy Nash equilibrium of the game. Finally, a user cooperation anti-interference algorithm is designed that achieves network-wide optimization using only local information.
The anti-interference algorithm comprises the following steps:
Step 1: model the cooperative anti-interference problem in a multi-user, single-jammer scenario as a single-leader, multi-follower Stackelberg game, in which the players are all users and the jammer in the system.
Step 2: the jammer randomly selects one channel to jam, and its utility function is defined as the sum of the interference power it applies to all users on the same channel. The users select anti-interference channels in response to the jamming strategy. To reduce inter-user interference in this process, a local cooperation model is adopted: cooperation among users is analyzed within the potential game framework, and each user takes the payoffs of its neighboring users into account. The user's utility function is therefore defined as the sum of the QoE satisfaction of the user itself and of its neighboring users.
Step 3: all users adjust their anti-interference strategies simultaneously, and each user selects a channel according to its current flag bit and its strategies and payoffs in the previous two time slots. Because users affect the network to different degrees, the invention sets a different learning parameter for each user to improve the convergence rate of the algorithm.
Step 4: return to step 3; the users select strategies through exploration and learning until the anti-interference strategies of all users converge or the set number of iterations is reached.
Step 5: the jammer evaluates its utility u_j(k) and updates its Q table.
Step 6: the jammer updates its strategy, and the procedure returns to step 3 until the maximum number of cycles is reached.
Further, the cooperative anti-interference problem in the multi-user, single-jammer scenario described in step 1 is modeled as a single-leader, multi-follower Stackelberg game, expressed as:
Figure BDA0003198077420000021
where (Figure BDA0003198077420000022) is the user set, j is the malicious jammer, (Figure BDA0003198077420000023) and (Figure BDA0003198077420000024) are the strategy sets of the users and the jammer, respectively, and u_n and u_j are the utility functions of user n and the jammer, respectively.
Further, the inter-user local cooperation model described in step 2 is modeled as an exact potential game, as follows. The potential function among users is defined as:
Figure BDA0003198077420000025
where a_n is the channel access strategy of user n, c_j is the channel selected by the jammer, (Figure BDA0003198077420000026) is the set of users interfered by user n, and (Figure BDA0003198077420000027) is the set of users that cause interference to user n. The formula represents the sum of the QoE satisfaction of all users in the whole network.
The proof that this is a potential game proceeds as follows.
If any user n unilaterally changes its strategy from a_n to
Figure BDA0003198077420000028
the change in the user's utility function is:
Figure BDA0003198077420000031
In addition, the unilateral change of strategy by user n changes the potential function by:
Figure BDA0003198077420000032
where (Figure BDA0003198077420000033) is the set of users interfered by user n, (Figure BDA0003198077420000034) is the set of users that cause interference to user n, and (Figure BDA0003198077420000035) denotes the set (Figure BDA0003198077420000036) with the elements of the set (Figure BDA0003198077420000037) removed. It then follows that:
Figure BDA0003198077420000038
That is, a unilateral deviation changes the user's utility and the potential function by the same amount, so the local cooperation model between users is a potential game.
Further, in step 3 all users adjust their anti-interference strategies simultaneously, and each user selects a channel according to its current flag bit and its strategies and payoffs in the previous two time slots. The specific operation is as follows:
If the flag bit Y_n(t-1) = 0, user n updates its channel according to the following rule:
Figure BDA0003198077420000039
where M is the number of channels available to the user and (Figure BDA00031980774200000310) is the learning parameter of user n. If a_n(t) = a_n(t-1), the flag bit Y_n(t) is set to 0; otherwise it is set to 1.
If the flag bit Y_n(t-1) = 1, user n updates its channel according to the following rule:
Figure BDA00031980774200000311
Figure BDA00031980774200000312
where β is the learning rate, and u_n(t-1) and u_n(t-2) are the utilities of user n in time slots t-1 and t-2, respectively. After the update, the flag bit is set to Y_n(t) = 0.
Further, the learning parameter of each user is set as
Figure BDA00031980774200000313
When x_n is large enough, the user cooperation anti-interference algorithm gradually converges to the network-wide optimum. Different learning parameters are set for different users mainly to accelerate convergence, specifically:
x_n(t) = Γ_n · ε(t)
where ε(t) = ε(0) + tΔε is a quantity that varies with time, ε(0) is its initial value, Δε is the step size, and t is the iteration number.
Figure BDA0003198077420000041
indicates how strongly user n affects the network.
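As a minimal sketch of the schedule x_n(t) = Γ_n · ε(t) with ε(t) = ε(0) + tΔε, the function below evaluates the per-user learning parameter. The influence factor Γ_n is passed in as a plain number because its defining expression is not recoverable from the text; the function and argument names are illustrative assumptions.

```python
def learning_parameter(gamma_n, t, eps0, delta_eps):
    """Return x_n(t) = Gamma_n * (eps(0) + t * delta_eps).

    gamma_n   -- influence factor of user n on the network (assumed given)
    t         -- iteration number
    eps0      -- initial value eps(0)
    delta_eps -- step size added per iteration
    """
    return gamma_n * (eps0 + t * delta_eps)


# Two users at the same iteration: the more influential user gets the larger parameter.
x_low = learning_parameter(gamma_n=1.0, t=10, eps0=1.0, delta_eps=0.5)
x_high = learning_parameter(gamma_n=2.0, t=10, eps0=1.0, delta_eps=0.5)
```

Because ε(t) grows linearly with t, every user's parameter increases over the iterations, while Γ_n scales how quickly it does so for each user.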
Further, in step 5 the jammer evaluates its utility u_j(k) and updates its Q table, as follows.
The jammer evaluates its current utility u_j:
Figure BDA0003198077420000042
where p_j is the jamming power, (Figure BDA0003198077420000043) is the jamming frequency, d_jn is the distance between the jammer and user n, and (Figure BDA0003198077420000044) is the channel gain, which depends on the jamming frequency and the jammer-to-user distance.
The Q table is updated as:
Q_{k+1}(c_j(k)) = (1 - λ) Q_k(c_j(k)) + λ u_j(k),
where Q_{k+1} is the jammer's Q value in period k+1, c_j(k) is the channel the jammer selects in period k, Q_k is the jammer's Q value in period k, u_j(k) is the jammer's utility in period k, and λ ∈ (0, 1) is the learning rate, which controls the convergence rate of Q-learning.
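The Q-table update above is a stateless exponential average over the utilities observed on each channel. A minimal sketch follows, assuming the Q table is a list indexed by channel; the function name and the example values are illustrative, not from the patent.

```python
def update_q(q_table, channel, utility, lam=0.1):
    """Return a new Q table with
    Q_{k+1}(c) = (1 - lam) * Q_k(c) + lam * u_j(k)
    applied to the jammed channel c only; other entries are unchanged."""
    if not 0.0 < lam < 1.0:
        raise ValueError("learning rate must lie in (0, 1)")
    updated = list(q_table)
    updated[channel] = (1.0 - lam) * updated[channel] + lam * utility
    return updated


# One update for M = 4 channels: the jammer used channel 2 and observed utility 5.0.
q_next = update_q([0.0, 0.0, 0.0, 0.0], channel=2, utility=5.0, lam=0.1)
```

Only the entry of the channel actually jammed in period k changes, so channels that are never tried keep their initial value until the selection rule explores them.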
Further, the jamming strategy in step 6 is updated as follows.
The jammer updates its own channel selection strategy using the Boltzmann function:
Figure BDA0003198077420000045
where τ is a temperature coefficient that sets the trade-off between exploration and exploitation, and (Figure BDA0003198077420000046) is the probability that the jammer selects channel c_j(k) in period k.
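The exact expression in the figure is not recoverable from the text, so the sketch below assumes the standard Boltzmann (softmax) form, in which the probability of a channel is proportional to exp(Q/τ). Function and variable names are illustrative.

```python
import math


def boltzmann_policy(q_values, tau):
    """Map Q values to selection probabilities, p(c) proportional to exp(Q(c) / tau).

    The maximum Q value is subtracted before exponentiating; this leaves the
    probabilities unchanged and avoids overflow for small tau.
    """
    if tau <= 0:
        raise ValueError("temperature must be positive")
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


# High temperature: close to uniform (exploration).
probs_hot = boltzmann_policy([1.0, 2.0, 3.0, 4.0], tau=100.0)
# Low temperature: mass concentrates on the best channel (exploitation).
probs_cold = boltzmann_policy([1.0, 2.0, 3.0, 4.0], tau=0.1)
```

Lowering τ over the simulation periods shifts the jammer from exploring all channels toward repeatedly choosing the channel with the highest Q value.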
Compared with the prior art, the invention has the following notable advantages: (1) For the multi-user anti-interference problem, it provides a framework that models both the strongly adversarial relationship between the users and the jammer and the cooperative relationship among the users. (2) It takes the users' diverse service demands into account. To avoid the waste of resources caused by users blindly increasing throughput, it proposes a MOS-based QoE model and an optimization mechanism centered on user demand; the user utility is quantified by QoE level, and the diversity of user demands is exploited to improve system performance. (3) Using the finite improvement property of potential games, it designs a multi-user synchronous anti-interference algorithm, and it improves the convergence rate of the algorithm by setting different learning parameters for different users, exploiting the fact that users affect the whole network to different degrees.
Drawings
Fig. 1 is a schematic diagram of a multi-user single interference network in a hierarchical anti-interference model for heterogeneous service requirements according to the present invention.
FIG. 2 is a graph comparing the convergence of the algorithm of the present invention with the prior art asynchronous learning algorithm.
Fig. 3 is a schematic diagram of the anti-interference effect of the algorithm of the present invention when the interference power is changed.
Detailed Description
With reference to Fig. 1, in the hierarchical anti-interference model for multi-user service requirements of the present invention, the system contains two millimeter-wave picocell base stations 50 m apart, and the users are randomly distributed within a circle of radius 100 m centered on the base station. The jammer is located about 100-200 m from the two base stations. In addition, the number of available channels is set to M = 4, the channel bandwidth to B = 1 MHz, and the noise power spectral density to N_0 = -130 dB/Hz.
The invention addresses a hierarchical anti-interference model for multi-user service requirements, in which the jammer is modeled as the leader and the users as followers. The adversarial relationship between the jammer and the users is modeled as a Stackelberg game in order to find a way of avoiding the jamming. The cooperative relationship among users is modeled as a potential game in order to find a way of eliminating co-channel interference. In addition, the cooperation among users proposed by the invention is information-level cooperation, meaning that adjacent users exchange their QoE satisfaction.
Based on the relationship between the whole-network QoE satisfaction and the user strategies, the invention maps user behavior precisely onto system performance by proving the existence of a Nash equilibrium and a Stackelberg equilibrium, which provides theoretical guidance for designing the corresponding anti-interference algorithm.
The user cooperation anti-interference algorithm of the hierarchical anti-interference model for heterogeneous service demands comprises the following steps:
Step 1: model the cooperative anti-interference problem in a multi-user, single-jammer scenario as a single-leader, multi-follower Stackelberg game, in which the players are all users and the jammer in the system.
Step 2: the jammer randomly selects one channel to jam, and its utility function is defined as the sum of the interference power it applies to all users on the same channel. The users select anti-interference channels in response to the jamming strategy. To reduce inter-user interference in this process, a local cooperation model is adopted: cooperation among users is analyzed within the potential game framework, and each user takes the payoffs of its neighboring users into account. The user's utility function is therefore defined as the sum of the QoE satisfaction of the user itself and of its neighboring users.
Step 3: all users adjust their anti-interference strategies simultaneously, and each user selects a channel according to its current flag bit and its strategies and payoffs in the previous two time slots. Because users affect the whole network to different degrees, a different learning parameter is set for each user to improve the convergence rate of the algorithm.
Step 4: return to step 3; the users select strategies through exploration and learning until the anti-interference strategies of all users converge or the set number of iterations is reached.
Step 5: the jammer evaluates its utility u_j(k) and updates its Q table.
Step 6: the jammer updates its strategy, and the procedure returns to step 3 until the maximum number of cycles is reached.
The specific embodiments of the present invention are as follows:
1. The adversarial relationship between the multiple users and the jammer is modeled as a Stackelberg game, expressed as
Figure BDA0003198077420000061
where (Figure BDA0003198077420000062) is the user set, j is the malicious jammer, (Figure BDA0003198077420000063) and (Figure BDA0003198077420000064) are the strategy sets of the users and the jammer, respectively, and u_n and u_j are the utility functions of user n and the jammer, respectively.
2. Users run different services, so their throughput requirements also differ. In other words, the same throughput may correspond to different QoE satisfaction under different services. The QoE satisfaction is calculated as follows.
User n can access only one base station at a time; the base station accessed by user n is denoted S_n. The distance between base station S_n and user n is denoted (Figure BDA0003198077420000065), and the direction angle from base station S_n to user n is denoted (Figure BDA0003198077420000066). When base station S_n serves user n using beamforming, its directional gain in the direction of user m is:
Figure BDA0003198077420000067
where θ_n is the main-lobe width of the beam with which base station S_n serves user n.
The coverage area of the beam serving user n is defined as:
Figure BDA0003198077420000068
where θ_n is the main-lobe width of the beam with which base station S_n serves user n.
Further, the set of potential users interfered by user n is defined as:
Figure BDA0003198077420000069
where (Figure BDA00031980774200000610) is the coverage area of the beam serving user n.
The set of potential users that cause interference to user n is defined as:
Figure BDA0003198077420000071
where (Figure BDA0003198077420000072) is the coverage area of the beam serving user m; g_mn is the gain of S_m in the direction of user n when S_m serves user m using beamforming; and g_0 is the beam gain threshold, taken as 0.01. (Figure BDA0003198077420000073) denotes the set of all users except user n.
Thus, the sum of the external malicious interference and the inter-user interference experienced by user n is expressed as:
Figure BDA0003198077420000074
where (Figure BDA0003198077420000075) is the jamming frequency; (Figure BDA0003198077420000076) is the frequency of channel a_m; a_m, a_n and c_j are the channels selected by user m, user n and the jammer, respectively; g_mn is the directional gain of S_m in the direction of user n when S_m serves user m using beamforming; (Figure BDA0003198077420000077) is the channel gain of the channel occupied by user m; and (Figure BDA0003198077420000078) is the channel gain of the channel occupied by the jammer. p_m is the transmit power of user m, d_jn is the distance from the jammer to user n, and p_j is the jamming power. δ(x, y) is an indicator function defined as
Figure BDA0003198077420000079
Therefore, the communication rate of user n is expressed as:
Figure BDA00031980774200000710
where B is the channel bandwidth; p_n is the transmit power of user n; (Figure BDA00031980774200000711) is the distance from base station S_n to user n; N_0 is the noise power spectral density; D_n is the sum of the external malicious interference and the mutual interference experienced by user n; and (Figure BDA00031980774200000712) is the channel gain of the channel occupied by user n.
The MOS function is defined as:
MOS = ε log_10(R/γ),
where R is the throughput of the user, and ε and γ are constants whose values are set according to the user's maximum and minimum throughput requirements; they differ between users because service requirements differ. The mapping between the MOS value and the five QoE levels is shown in Table 1.
Table 1: Mean Opinion Score (MOS)
Figure BDA00031980774200000713
Figure BDA0003198077420000081
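As a sketch of how ε and γ could be fixed from a service's throughput requirements, the code below assumes that the minimum required throughput should map to MOS = 1 and the maximum to MOS = 5, the end points of the usual five-level MOS scale. That calibration rule, the function names and the example throughputs are assumptions for illustration; the text states only that the constants depend on the maximum and minimum throughput requirements.

```python
import math


def mos(throughput, eps, gamma):
    """MOS = eps * log10(R / gamma), as defined in the text."""
    return eps * math.log10(throughput / gamma)


def calibrate(r_min, r_max, mos_min=1.0, mos_max=5.0):
    """Solve eps * log10(r_min / gamma) = mos_min and
    eps * log10(r_max / gamma) = mos_max for (eps, gamma).
    The end-point rule is an assumption, not taken from the patent."""
    eps = (mos_max - mos_min) / math.log10(r_max / r_min)
    gamma = r_min / 10 ** (mos_min / eps)
    return eps, gamma


# Hypothetical service needing between 100 kbit/s and 1 Mbit/s.
eps, gamma = calibrate(r_min=1e5, r_max=1e6)
score_mid = mos(3e5, eps, gamma)  # a throughput between the two requirements
```

A second service with a different throughput range would get its own (ε, γ) pair, so the same throughput R can yield different MOS values for different services.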
Further, the function
Figure BDA0003198077420000082
is used to quantify the user's different experience levels, and the satisfaction of user n at the different QoE levels is expressed as:
Figure BDA0003198077420000083
Based on the above analysis, the optimization objective is to maximize the QoE return of the whole network (i.e. the sum of user satisfaction), namely:
Figure BDA0003198077420000084
Based on the above analysis, the utility function of user n is expressed as:
Figure BDA0003198077420000085
The optimization problem of user n can be expressed as:
Figure BDA0003198077420000086
Further, all users form the lower-level sub-game, denoted:
Figure BDA0003198077420000087
The jammer's objective is to maximize the cumulative interference to all users, and its utility function is defined as:
Figure BDA0003198077420000088
The decision optimization problem of the jammer is expressed as:
Figure BDA0003198077420000089
The upper-level sub-game is expressed as:
Figure BDA00031980774200000810
3. The channel selection procedure of each user is as follows:
(1) Initialization: each user (Figure BDA00031980774200000811) selects a channel a_n(0) uniformly at random from its set of available channels (Figure BDA00031980774200000812) and sets its flag bit Y_n(0) = 0.
(2) Channel sounding: if Y_n(t-1) = 0, user n updates its channel according to the following rule:
Figure BDA0003198077420000091
where M is the number of channels available to the user and (Figure BDA0003198077420000092) can be regarded as the learning rate of user n. If a_n(t) = a_n(t-1), the flag bit Y_n(t) is set to 0; otherwise it is set to 1.
(3) Channel update: if Y_n(t-1) = 1, user n updates its channel according to the following rule:
Figure BDA0003198077420000093
Figure BDA0003198077420000094
where β is the learning parameter, and u_n(t-1) and u_n(t-2) are the utilities of user n in time slots t-1 and t-2, respectively. After the update, the flag bit is set to Y_n(t) = 0.
4. Repeat steps 1 to 3, with all users performing exploration, learning and channel access simultaneously, until the channel access selections of all users converge or the set number of iterations is reached.
The local cooperation model can be proved to be a potential game, so at least one Nash equilibrium exists. The finite improvement property of potential games can then be used to design the corresponding anti-interference algorithm.
5. The jammer evaluates its utility u_j(k) and updates its Q value as follows:
Q_{k+1}(c_j(k)) = (1 - λ) Q_k(c_j(k)) + λ u_j(k), (6-25)
where λ ∈ (0, 1) is the learning rate, which controls the convergence rate of Q-learning.
Similar to the users, the jammer also updates its own channel selection strategy using the Boltzmann function:
Figure BDA0003198077420000095
where τ is a temperature coefficient that sets the trade-off between exploration and exploitation.
6. Return to step 3 until the maximum number of cycles is reached.
Example 1
One embodiment of the invention is described below. The system is simulated in Matlab, and the parameter settings do not affect generality. The system contains two millimeter-wave picocell base stations 50 m apart, and the users are randomly distributed within a circle of radius 100 m centered on the base station. The jammer is located about 100-200 m from the two base stations. In addition, the number of available channels is set to M = 4, the channel bandwidth to B = 1 MHz, the noise power spectral density to N_0 = -130 dB/Hz, and the learning parameter to β = t/2500. The jammer's learning rate is λ = 0.1, and its temperature coefficient is
Figure BDA0003198077420000101
where K is the total number of simulation periods and k is the current simulation period.
The user cooperation anti-interference algorithm of the invention proceeds as follows:
Step 1: set t = 0 and k = 0, and initialize the mixed strategy of the jammer
Figure BDA0003198077420000102
Step 2: in period k, the jammer selects a channel c_j(k) according to the probability (Figure BDA0003198077420000103); every user (Figure BDA0003198077420000104) selects a channel a_n(0) uniformly at random from its set of available channels (Figure BDA0003198077420000105) and sets its flag bit Y_n(0) = 0.
During this period, all users simultaneously perform the following process:
Loop over t = 1, 2, …:
Channel sounding:
If Y_n(t-1) = 0, user n updates its channel according to the following rule:
Figure BDA0003198077420000106
where M is the number of channels available to the user and (Figure BDA0003198077420000107) can be regarded as the learning rate of user n. If a_n(t) = a_n(t-1), the flag bit Y_n(t) is set to 0; otherwise it is set to 1.
Channel update:
If Y_n(t-1) = 1, user n updates its channel according to the following rule:
Figure BDA0003198077420000108
Figure BDA0003198077420000109
where β is the learning parameter, and u_n(t-1) and u_n(t-2) are the utilities of user n in time slots t-1 and t-2, respectively. After the update, the flag bit is set to Y_n(t) = 0.
Step 3: the jammer obtains its utility u_j(k).
Step 4: the jammer updates its Q value according to:
Q_{k+1}(c_j(k)) = (1 - λ) Q_k(c_j(k)) + λ u_j(k),
where λ ∈ (0, 1) is the learning rate, which controls the convergence rate of Q-learning.
Similar to the users, the jammer also updates its own channel selection strategy using the Boltzmann function:
Figure BDA0003198077420000111
where τ is a temperature coefficient that sets the trade-off between exploration and exploitation.
Step 5: update k = k + 1 and return to step 2, until the maximum number of cycles is reached.
Fig. 2 shows the convergence of the cooperative anti-interference algorithm. The comparison algorithm is an asynchronous learning algorithm, in which only one user updates its strategy in each iteration. The figure shows that the synchronous learning algorithm proposed by the invention markedly improves the learning speed.
Fig. 3 shows the impact of the interference power on the network satisfaction rate for different numbers of users. The network satisfaction rate remains essentially unchanged as the interference power increases, which indicates that the proposed method helps the users avoid the jammed channel successfully and has a good anti-interference effect.
In summary, the hierarchical anti-interference model and the user cooperation anti-interference algorithm for multi-user service requirements proposed by the invention assume that a malicious user can adaptively adjust its jamming strategy according to the spectrum usage of the communication users so as to maximize the jamming utility, and model the adversarial relationship between the users and the jammer as a Stackelberg game. In addition, by taking into account the asymmetric mutual interference among users under space-division multiple access, a user cooperation anti-interference algorithm is proposed that effectively improves the network satisfaction rate. Comparison with an asynchronous learning algorithm shows a marked improvement in the convergence rate of the proposed algorithm, and the performance comparison under different interference powers demonstrates the effectiveness of the proposed anti-interference algorithm.

Claims (1)

1. The user cooperation anti-interference method for the beam forming communication is characterized in that interference is modeled as a leader, a user is modeled as a follower, and the interference always aims at causing maximum interference to the user; the user needs to combine the self business requirement and utilize the anti-interference algorithm to maximize the user satisfaction degree of the whole network, namely the network satisfaction rate; the method comprises the following steps:
step 1, modeling a cooperative anti-interference problem in a multi-user single-interference scene as a single-leader multi-follower Stackelberg game model, wherein game participants are all users and interferences in a system;
the cooperative anti-interference problem in the multi-user single-interference scene is modeled as a single-leader multi-follower Stackelberg game model, which is expressed as:
Figure QLYQS_1
wherein [Figure QLYQS_2] is the user set, j is the malicious jammer, [Figure QLYQS_3] and [Figure QLYQS_4] represent the strategy sets of the users and the interference, respectively, and $u_n$ and $u_j$ represent the utility functions of user n and the interference, respectively;
step 2, the interference randomly selects one channel to interfere with, and the utility function of the interference is defined as the sum of the interference power applied by the jammer to all users on the same channel; the users select anti-interference channels according to the interference strategy, the cooperation among the users is analyzed within the potential game framework, and each user needs to consider the benefits of its neighbor users; the utility function of a user is defined as the sum of the QoE satisfaction of the user itself and of its neighbor users;
the utility function $u_n$ of a user in the local cooperation model is defined as the sum of the QoE satisfaction of the user itself and of its neighbor users, expressed as:
Figure QLYQS_5
wherein $a_n$ is the channel access strategy of user n, and $c_j$ is the channel selected by the interference; [Figure QLYQS_6] is the set of users interfered by user n; [Figure QLYQS_7] is the set of users that cause interference to user n; [Figure QLYQS_8] is the set of users that cause interference to user k; [Figure QLYQS_9] is the channel selection of all users in the user set [Figure QLYQS_10]; [Figure QLYQS_11] is the channel selection of all users in the user set [Figure QLYQS_12]; $q_n$ is the QoE satisfaction of user n; $q_k$ is the QoE satisfaction of user k;
wherein [Figure QLYQS_13] is a function related to user throughput and specific service requirements, and the mapping relation can be represented by the MOS function;
the MOS function is defined as:
$\mathrm{MOS} = \varepsilon \log_{10}(R/\gamma)$, (6-3)
where R is the throughput of the user; $\varepsilon$ and $\gamma$ are constants whose values are determined by the maximum and minimum throughput requirements of the user, and therefore differ between users with different service requirements;
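As an illustration of Eq. (6-3), the mapping from throughput to MOS can be sketched as follows. This is a minimal sketch and not part of the claimed method: the function name `mos` and the numeric values in the usage line are assumptions chosen for illustration only.

```python
import math


def mos(throughput: float, eps: float, gamma: float) -> float:
    """Evaluate MOS = eps * log10(throughput / gamma), as in Eq. (6-3).

    eps and gamma are the service-dependent constants of the text;
    their values here are supplied by the caller (illustrative only).
    """
    if throughput <= 0 or gamma <= 0:
        raise ValueError("throughput and gamma must be positive")
    return eps * math.log10(throughput / gamma)


# Illustrative values: R = 1000, eps = 2, gamma = 10 gives 2 * log10(100).
example_score = mos(1000.0, 2.0, 10.0)
```

With these assumed values the logarithm argument is 100, so the score is twice its base-10 logarithm.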
satisfaction of user n at different QoE levels is expressed as:
Figure QLYQS_14
the local cooperation model can be shown to be an exact potential game; the proof is as follows:
the potential function is expressed as:
Figure QLYQS_15
when any user n unilaterally changes its strategy from $a_n$ to [Figure QLYQS_16], the resulting change in satisfaction is equal to the change in the potential function, namely:
Figure QLYQS_17
wherein $a_n$ is the original channel access strategy of user n; [Figure QLYQS_20] is the changed channel access strategy of user n; [Figure QLYQS_22] is the channel selection of all users in the user set [Figure QLYQS_19] after the strategy of user n changes to [Figure QLYQS_24]; $a_{-n}$ is the channel access of the remaining users; $c_j$ is the channel selected by the interference; [Figure QLYQS_23] is the set of users interfered by user n; [Figure QLYQS_25] is the set of users that cause interference to user n; and [Figure QLYQS_26] denotes the set obtained by deleting the set [Figure QLYQS_21] from the set [Figure QLYQS_18];
step 3, all users adjust their anti-interference strategies simultaneously, and each user selects a channel according to its current flag bit and its strategies and payoffs in the previous two time slots; different learning parameters are set for each user according to the user's degree of influence on the whole network, which improves the convergence speed of the algorithm;
according to different influence degrees of users on the network, different learning parameters are set for each user, and the method specifically comprises the following steps:
the learning parameters are set as
Figure QLYQS_27
wherein $x_n(t) = \Gamma_n \times \varepsilon(t)$; [Figure QLYQS_28] indicates the degree of influence of user n on the network; $\varepsilon(t) = \varepsilon(0) + t\Delta\varepsilon$, where $\varepsilon(0)$ is the initial value, $\Delta\varepsilon$ is the step size, and t is the iteration number;
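The two relations given in text, $\varepsilon(t) = \varepsilon(0) + t\Delta\varepsilon$ and $x_n(t) = \Gamma_n \times \varepsilon(t)$, can be sketched as below. The final form of the learning parameter (Figure QLYQS_27) and the definition of $\Gamma_n$ (Figure QLYQS_28) are not reproduced here, so this sketch only computes $x_n(t)$ and treats $\Gamma_n$ as a given input; the function names and numeric values are assumptions.

```python
def epsilon(t: int, eps0: float, delta_eps: float) -> float:
    """Linear schedule eps(t) = eps(0) + t * delta_eps."""
    return eps0 + t * delta_eps


def learning_input(gamma_n: float, t: int, eps0: float, delta_eps: float) -> float:
    """x_n(t) = Gamma_n * eps(t); Gamma_n is taken as a given input here."""
    return gamma_n * epsilon(t, eps0, delta_eps)


# Illustrative values: two users with different influence degrees.
x_low = learning_input(0.5, 10, 1.0, 0.1)
x_high = learning_input(2.0, 10, 1.0, 0.1)
```

Under these assumed values a user with a larger $\Gamma_n$ receives a proportionally larger $x_n(t)$ at the same iteration, which is how per-user learning parameters come to differ.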
step 4, repeating steps 1 to 3, with the users selecting strategies through exploration and learning, until the interference strategy and the anti-interference strategies of all users converge or the set number of iterations is reached;
step 5, the interference evaluates its utility $u_j(k)$ and updates the Q table, specifically:
the interference evaluates its current utility $u_j$:
Figure QLYQS_29
wherein $p_j$ is the interference power; $f_{c_j}$ is the interference frequency; $d_{jn}$ is the distance between the jammer and user n; $h(f_{c_j}, d_{jn})$ is the channel gain, which depends on the interference frequency and the interference distance;
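The utility evaluation can be sketched from the verbal definition in step 2 (the sum of the interference power applied to all users on the jammed channel). The exact expression is in Figure QLYQS_29 and is not reproduced here, so the summation below is an assumed reading; the free-space gain model in `channel_gain` is purely hypothetical and stands in for $h(f_{c_j}, d_{jn})$.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def channel_gain(freq_hz: float, dist_m: float) -> float:
    """Hypothetical free-space gain (c / (4*pi*f*d))**2; an assumption."""
    return (SPEED_OF_LIGHT / (4.0 * math.pi * freq_hz * dist_m)) ** 2


def jammer_utility(p_j, f_cj, jam_channel, user_channels, distances,
                   gain=channel_gain):
    """Sum p_j * h(f_cj, d_jn) over users n whose channel a_n equals c_j.

    user_channels[n] is a_n and distances[n] is d_jn for user n.
    """
    return sum(
        p_j * gain(f_cj, d_jn)
        for a_n, d_jn in zip(user_channels, distances)
        if a_n == jam_channel
    )


# Illustrative scenario: users 0 and 2 share channel 1 with the jammer.
u_j = jammer_utility(2.0, 2.4e9, 1, [1, 2, 1], [100.0, 50.0, 200.0])
```

Passing a different `gain` callable lets the same summation be used with any channel model.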
updating the Q table:
$Q_{k+1}(c_j(k)) = (1-\lambda)\,Q_k(c_j(k)) + \lambda\,u_j(k)$,
wherein $Q_{k+1}$ is the Q value of the jammer in period k+1; $c_j(k)$ is the interference channel selected by the jammer in period k; $Q_k$ is the Q value of the jammer in period k; $u_j(k)$ is the utility of the jammer in period k; $\lambda \in (0, 1)$ is the learning rate, which controls the convergence speed of Q-learning;
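The Q-table update above can be sketched as follows. This is a minimal sketch: the Q table is represented as a plain list indexed by channel, and the function name `update_q` and the numeric values are assumptions for illustration.

```python
def update_q(q_table, channel, utility, lam):
    """Apply Q_{k+1}(c) = (1 - lam) * Q_k(c) + lam * u to one channel.

    Only the entry of the channel jammed in period k is updated;
    the entries of all other channels are left unchanged.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("learning rate must lie in (0, 1)")
    q_table[channel] = (1.0 - lam) * q_table[channel] + lam * utility
    return q_table


# Illustrative run: three channels, channel 1 jammed twice with utility 10.
q = [0.0, 0.0, 0.0]
update_q(q, 1, 10.0, 0.2)
update_q(q, 1, 10.0, 0.2)
```

With a constant utility the updated entry moves toward that utility by a fraction $\lambda$ of the remaining gap in each period, so a larger $\lambda$ converges faster but tracks noisy utilities more closely.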
step 6, the interference updates its strategy, and the method returns to step 3 until the maximum number of cycles is reached;
the interference strategy is updated as follows:
the jammer updates its own channel selection strategy using the Boltzmann function:
Figure QLYQS_30
wherein $\tau$ is the temperature coefficient, which represents the trade-off between exploration and exploitation, and [Figure QLYQS_31] is the probability that the jammer selects channel $c_j(k)$ in period k.
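A Boltzmann channel selection over the Q table can be sketched as below. The exact expression is in Figure QLYQS_30 and is not reproduced here; the sketch assumes the common form in which the probability of a channel is proportional to exp(Q/τ). The function names are assumptions.

```python
import math
import random


def boltzmann_probs(q_values, tau):
    """Return selection probabilities proportional to exp(Q / tau).

    The maximum Q value is subtracted before exponentiation for
    numerical stability; this does not change the probabilities.
    """
    if tau <= 0:
        raise ValueError("temperature must be positive")
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_channel(q_values, tau, rng=random):
    """Draw one channel index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, tau)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Illustrative Q values for three channels.
probs = boltzmann_probs([1.0, 2.0, 3.0], tau=0.5)
choice = sample_channel([1.0, 2.0, 3.0], tau=0.5, rng=random.Random(0))
```

A large τ flattens the distribution toward uniform exploration, while a small τ concentrates probability on the channel with the highest Q value.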

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110896542.1A CN113613337B (en) 2021-08-05 2021-08-05 User cooperation anti-interference method for beam forming communication

Publications (2)

Publication Number Publication Date
CN113613337A CN113613337A (en) 2021-11-05
CN113613337B true CN113613337B (en) 2023-06-20

Family

ID=78307112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110896542.1A Active CN113613337B (en) 2021-08-05 2021-08-05 User cooperation anti-interference method for beam forming communication

Country Status (1)

Country Link
CN (1) CN113613337B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114698128B (en) * 2022-05-17 2022-09-13 中国人民解放军战略支援部队航天工程大学 Anti-interference channel selection method and system for cognitive satellite-ground network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108616916A (en) * 2018-04-28 2018-10-02 中国人民解放军陆军工程大学 A kind of anti-interference layering betting model of cooperation and anti-interference learning algorithm
CN112188504A (en) * 2020-09-30 2021-01-05 中国人民解放军陆军工程大学 Multi-user cooperative anti-interference system and dynamic spectrum cooperative anti-interference method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
An Anti-Jamming Hierarchical Optimization Approach in Relay Communication System via Stackelberg Game; Zhibin Feng et al.; MDPI; entire document *

Also Published As

Publication number Publication date
CN113613337A (en) 2021-11-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant