CN115150963A - Multi-user scene-oriented distributed interference avoidance method and device - Google Patents
Multi-user scene-oriented distributed interference avoidance method and device
- Publication number
- CN115150963A (application CN202211076124.9A)
- Authority
- CN
- China
- Prior art keywords
- utility function
- time slot
- local
- data transmission
- legal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/309—Measuring or estimating channel quality parameters
- H04B17/345—Interference values
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B17/00—Monitoring; Testing
- H04B17/30—Monitoring; Testing of propagation channels
- H04B17/382—Monitoring; Testing of propagation channels for resource allocation, admission control or handover
Abstract
The application relates to a distributed interference avoidance method and device for a multi-user scenario. The method comprises the following steps: each legitimate user obtains local observation information of the current time slot by sensing the electromagnetic environment, and inputs the local observation information of the current time slot and the action of the previous time slot into an intelligent agent network for calculation to obtain a local utility function of each legitimate user; during training, a central controller with a hybrid network fuses all the local utility functions to obtain a joint utility function, and strategies are continuously adjusted and evaluated under this centrally assisted training to finally obtain an optimal joint strategy and, from it, the optimal interference avoidance strategy of each legitimate user; each legitimate user then independently makes its communication decision according to its optimal interference avoidance strategy to carry out data transmission. With this method, legitimate users can simultaneously cope with malicious interference from external jammers and mutual interference among internal users without relying on centralized deployment, and normal communication is achieved.
Description
Technical Field
The application relates to the technical field of intelligent wireless communication, in particular to a distributed interference avoidance method and device for a multi-user scene.
Background
With the continuous development of communication technology, the communication of users in a network is susceptible to external electromagnetic interference; meanwhile, mutual interference easily occurs when multiple users in the network communicate without negotiation, which affects their normal communication. It is therefore imperative to enable users deployed in a distributed manner to autonomously identify and avoid electromagnetic interference, that is, to give users an intelligent interference avoidance capability.
Traditional solutions address the electromagnetic interference problem in multi-user scenarios through centralized allocation or negotiation between users. On one hand, the centralized allocation scheme has the following disadvantages: in an actual scenario, there is no guarantee that a base station or similar facility exists to act as the central controller; even if such a facility exists, the overall scheduling by the base station is quite complex, and information transmission between the base station and the users incurs large communication and computation costs, which is unsuitable for a wireless communication network with limited resources; a large processing delay is also inevitable, which is unfavorable for coping with a time-varying and harsh electromagnetic interference environment; in addition, once the center is destroyed, the users' use of wireless resources becomes disordered, causing interference and communication failure. On the other hand, the scheme of negotiation between users has the disadvantage that negotiation is often limited to adjacent users within communication range, which causes an overall processing delay in the network; owing to the complexity of the network environment, the communication strategy obtained by this scheme may not be optimal and cannot completely solve the interference problem during user communication; the potential security problems in user negotiation also need to be taken into account.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a distributed interference avoidance method and apparatus for a multi-user scenario that enable multiple legitimate users to communicate normally in a harsh and dynamically changing communication environment.
A distributed interference avoidance method for a multi-user scenario, the method comprising:
a legal user obtains local observation information of the current time slot by sensing an electromagnetic environment; the local observation information comprises the state information of a legal user and the channel use information in the electromagnetic environment;
each legal user inputs the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain a local utility function of each legal user; the action input of the previous time slot is obtained by selecting an access channel for data transmission according to the local observation information of the previous time slot and making a decision according to a feedback result received by the data transmission;
all legal users input the local utility function into a hybrid network in a central controller for fusion to obtain a joint utility function, and the joint utility function is trained and updated according to a utility function approximation algorithm to obtain an optimal joint utility function; wherein the optimal joint utility function comprises an optimal joint strategy;
and the central controller distributes the optimal joint strategy to each legal user to obtain an optimal interference avoidance strategy corresponding to each legal user, and the legal users autonomously decide a communication channel according to the optimal interference avoidance strategy to carry out data transmission.
In one embodiment, the step of inputting, by each legitimate user, the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain the local utility function of each legitimate user includes:
and each legal user inputs the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain a local utility function of each legal user, wherein the local utility function comprises a historical observation action set of the intelligent agent network, the action of the current time slot and the intelligent agent network strategy.
In one embodiment, the action input of the previous time slot is obtained by selecting an access channel for data transmission according to local observation information of the previous time slot and making a decision according to a feedback result received by the data transmission, and the action input of the previous time slot includes:
selecting an access channel for data transmission according to the local observation information of the previous time slot to obtain a data transmission result;
analyzing the communication condition of the access channel according to the data transmission result to obtain a feedback result;
and making a decision according to the feedback result to obtain the action of the last time slot.
In one embodiment, the data transmission result comprises a data transmission success result and a data transmission failure result;
analyzing the communication condition of the access channel according to the data transmission result to obtain a feedback result, wherein the feedback result comprises the following steps:
analyzing the communication condition of the access channel according to the successful data transmission result to obtain a positive feedback result;
and analyzing the communication condition of the access channel according to the data transmission failure result to obtain a negative feedback result and a non-feedback result.
In one embodiment, analyzing the communication condition of the access channel according to the data transmission failure result to obtain a negative feedback result and a non-feedback result, includes:
analyzing the communication condition of the access channel according to the data transmission failure result, and obtaining a negative feedback result when the communication condition is interfered by other legal users; when the communication condition is interfered by the jammer, a non-feedback result is obtained.
In one embodiment, the making a decision according to the feedback result to obtain the action of the previous time slot includes:
making a decision according to the positive feedback result to obtain the action of the last time slot as selecting an access channel for communication;
and making a decision according to the negative feedback and non-feedback results to obtain the action of the previous time slot, namely selecting other channels for communication.
In one embodiment, the step of inputting the local utility function into a hybrid network in the central controller for fusion by all legal users to obtain a joint utility function includes:
all legal users input the local utility function into a hybrid network in a central controller for fusion to obtain a joint utility function; the joint utility function comprises historical observation action sets of all the intelligent agent networks, an action set of the current time slot and a joint strategy;
and coordinating the relationship between the joint utility function and the local utility function according to the hypernetwork in the hybrid network, so that the monotonicity between the joint utility function and the local utility function of each legal user is met.
In one embodiment, coordinating a relationship between a joint utility function and a local utility function according to a hypernetwork in a hybrid network so that the joint utility function and the local utility function of each legitimate user satisfy monotonicity, includes:
inputting the global environment state into the hypernetwork in the hybrid network for calculation to obtain a bias parameter and a non-negative weight parameter of the hybrid network;
and coordinating the relationship between the joint utility function and the local utility function according to the bias parameter and the non-negative weight parameter, so that the monotonicity between the joint utility function and the local utility function of each legal user is met.
In one embodiment, training and updating the joint utility function according to a utility function approximation algorithm to obtain an optimal joint utility function includes:
and inputting the joint utility function and the target network utility function into a pre-constructed loss function for calculation, and training and updating the joint utility function by minimizing the loss function to obtain an optimal joint utility function, wherein the optimal joint utility function comprises an optimal joint strategy.
A distributed interference avoidance apparatus facing a multi-user scene comprises:
the perception module is used for a legal user to obtain local observation information of the current time slot by perceiving the electromagnetic environment; the local observation information comprises the state information of a legal user and the channel use information in the electromagnetic environment;
the training module is used for inputting the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation by each legal user to obtain a local utility function of each legal user; the action input of the previous time slot is obtained by selecting an access channel for data transmission according to the local observation information of the previous time slot and making a decision according to a feedback result received by the data transmission; all legal users input the local utility function into a hybrid network in the central controller for fusion to obtain a joint utility function, and the joint utility function is trained and updated according to a utility function approximation algorithm to obtain an optimal joint utility function; wherein the optimal joint utility function comprises an optimal joint strategy;
and the interference avoidance module is used for distributing the optimal joint strategy to each legal user by the central controller to obtain the optimal interference avoidance strategy corresponding to each legal user, and the legal users autonomously decide a communication channel according to the optimal interference avoidance strategy to perform data transmission.
According to the distributed interference avoidance method and device for the multi-user scenario, a central controller with a hybrid network is used during training. Each legitimate user feeds its perceived local observation information and actions into an intelligent agent network to obtain a local utility function; all legitimate users input their local utility functions into the central controller, where they are fused into a joint utility function; under this centrally assisted training, strategies are continuously adjusted and evaluated to finally obtain the optimal joint strategy and, from it, the optimal interference avoidance strategy of each legitimate user. In actual execution, a legitimate user no longer relies on the central controller and can independently make its communication decision and carry out data transmission according to the currently perceived local observation information and its optimal interference avoidance strategy. By adopting this distributed interference avoidance method with centralized training and decentralized execution, legitimate users in a harsh and dynamically changing wireless communication environment can simultaneously cope with malicious interference from an external jammer and mutual interference among internal users without relying on centralized deployment, and normal communication is achieved.
Drawings
FIG. 1 is an application scenario diagram of a distributed interference avoidance method for a multi-user scenario in one embodiment;
FIG. 2 is a flowchart illustrating a distributed interference avoidance method for a multi-user scenario in one embodiment;
FIG. 3 is a schematic diagram of the time slot structure in which a legitimate user and a jammer operate in one embodiment;
FIG. 4 is a block diagram of the method of the present invention in one embodiment;
FIG. 5 is a graph comparing the interference avoidance performance of the proposed method with Q-learning and the ideal solution in one embodiment;
FIG. 6 is a graph of the interference avoidance performance of the proposed method for different network sizes in one embodiment;
FIG. 7 is a time-frequency diagram of users that have not been trained with the method of the present invention in one embodiment;
FIG. 8 is a time-frequency diagram of users after training and learning with the method of the present invention in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The distributed interference avoidance method for the multi-user scenario can be applied to the communication network scenario shown in fig. 1. The communication network contains a jammer and N pairs of legitimate users, each pair comprising a transmitter and a receiver. There are K available channels in space, and the set of interference channels of the jammer is consistent with the set of communication channels of the legitimate users. While the legitimate users communicate, the jammer works continuously and switches between channels in an attempt to interfere with their data transmission. Specifically, the invention considers that the jammer interferes with only one channel per time slot and that its interference mode is sweep-frequency interference, which is unknown to the legitimate users; meanwhile, because the communication network contains multiple pairs of legitimate users, mutual interference among users also exists during data transmission.
In an embodiment, as shown in fig. 2, a distributed interference avoidance method for a multi-user scenario is provided, and is described by taking an application scenario in fig. 1 as an example, the method includes the following steps:
It can be understood that a legitimate user mainly senses frequency-domain information in the electromagnetic environment in order to determine whether a channel is free of jammer interference and is not being used by other users; in particular, owing to the limitations of user hardware and the complexity of the environment, a user can only perceive local observation information and cannot acquire global information.
It can be understood that the intelligent agent network adopts a deep recurrent Q-network (DRQN) structure, which can use the entire observation-action history to represent the current state and thus effectively handles the local observation problem of each legitimate user. The intelligent agent networks correspond one-to-one to the legitimate users; the input of each network is the local observation information of the current time slot and the action of the previous time slot of its legitimate user, and its output is the local utility function of that legitimate user.
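The input/output relationship described above can be illustrated with a minimal sketch. It is not the network of the application: a single tanh recurrent layer stands in for the recurrent part of a DRQN, the weights are random and untrained, and the class name `RecurrentAgent`, its parameters and the encoding of the observation as one value per channel are all assumptions made for illustration.

```python
import math
import random


class RecurrentAgent:
    """Toy recurrent utility network for one user (hypothetical names throughout)."""

    def __init__(self, num_channels, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.num_channels = num_channels
        # Assumed input layout: one observation value per channel,
        # followed by a one-hot encoding of the previous action.
        input_size = 2 * num_channels

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = mat(hidden_size, input_size)
        self.w_rec = mat(hidden_size, hidden_size)
        self.w_out = mat(num_channels, hidden_size)
        # The hidden state summarizes the observation-action history so far.
        self.hidden = [0.0] * hidden_size

    def step(self, observation, prev_action):
        """Consume one slot's input and return one utility value per channel."""
        one_hot = [1.0 if c == prev_action else 0.0 for c in range(self.num_channels)]
        x = list(observation) + one_hot
        new_hidden = []
        for row_in, row_rec in zip(self.w_in, self.w_rec):
            pre = sum(w * v for w, v in zip(row_in, x))
            pre += sum(w * h for w, h in zip(row_rec, self.hidden))
            new_hidden.append(math.tanh(pre))
        self.hidden = new_hidden
        return [sum(w * h for w, h in zip(row, self.hidden)) for row in self.w_out]


agent = RecurrentAgent(num_channels=4)
q_first = agent.step(observation=[0, 1, 0, 0], prev_action=2)
q_second = agent.step(observation=[0, 1, 0, 0], prev_action=2)
```

Feeding the same input twice yields different utility values because the hidden state carries the history forward, which is the property that lets a recurrent network cope with purely local observations.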
The hybrid network provided by the invention is a nonlinear network. The local utility functions of all legitimate users are fused by the hybrid network to obtain a joint utility function, which is used to evaluate the quality of the actions of all legitimate users; the joint utility function is trained and updated according to a utility function approximation algorithm to obtain an optimal joint utility function, which ensures that the multi-user interference avoidance performance is maximized.
Step 208: the central controller distributes the optimal joint strategy to each legitimate user to obtain the optimal interference avoidance strategy corresponding to each legitimate user, and the legitimate users autonomously decide on a communication channel according to the optimal interference avoidance strategy to perform data transmission.
The distributed interference avoidance method for the multi-user scenario is based on multi-agent reinforcement learning (QMIX). The central controller distributes the optimal joint strategy to each legitimate user to obtain the optimal interference avoidance strategy corresponding to each legitimate user; according to these strategies, the legitimate users can autonomously capture the interference mode of the jammer without a center and without mutual negotiation, and can avoid mutual interference among themselves, so that normal communication is achieved.
In the distributed interference avoidance method for the multi-user scenario, a central controller with a hybrid network is deployed during training. Each legitimate user feeds its perceived local observation information and actions into an intelligent agent network to obtain a local utility function; all legitimate users input their local utility functions into the central controller, where they are fused into a joint utility function; strategies are continuously adjusted and evaluated under this centrally assisted training, and the optimal interference avoidance strategy of each legitimate user is finally obtained. In actual execution, a legitimate user no longer relies on the central controller and can independently make its communication decision and carry out data transmission according to the currently perceived local observation information and its optimal interference avoidance strategy. By adopting this distributed interference avoidance method with centralized training and decentralized execution, legitimate users in a harsh and dynamically changing wireless communication environment can simultaneously cope with malicious interference from an external jammer and mutual interference among internal users without relying on centralized deployment, and normal communication is achieved.
In one embodiment, the step of inputting, by each legal user, the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain the local utility function of each legal user includes:
for any agent, each legal user inputs the local observation information of the current time slot and the action of the previous time slot into the corresponding intelligent agent network for calculation to obtain the local utility function of each legal user, wherein the local utility function comprises the historical observation action set of the agent network, the action at the current time slot t, and the agent network policy.
It can be understood that the local utility function is used for evaluating the quality of the actions of the legal users under the current policy, and when the maximum value of each local utility function is obtained, the optimal distributed policy of each legal user is obtained.
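As a small illustration of how a maximum of each local utility function translates into a per-user decision, the sketch below picks, for every user, the channel with the largest local utility value. The function name `greedy_actions` and the sample numbers are hypothetical and serve only to show the selection step.

```python
def greedy_actions(local_utilities):
    """Each user picks the channel index with the largest local utility value."""
    return [max(range(len(q)), key=q.__getitem__) for q in local_utilities]


local_utilities = [
    [0.1, 0.7, 0.2],  # user 0 prefers channel 1
    [0.9, 0.3, 0.4],  # user 1 prefers channel 0
]
print(greedy_actions(local_utilities))  # prints [1, 0]
```

Each user needs only its own utility values for this step, which is what allows execution without the central controller.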
In one embodiment, the action input of the previous time slot is obtained by selecting an access channel for data transmission according to local observation information of the previous time slot and making a decision according to a feedback result received by the data transmission, and the action input of the previous time slot includes:
selecting an access channel for data transmission according to the local observation information of the last time slot to obtain a data transmission result;
analyzing the communication condition of the access channel according to the data transmission result to obtain a feedback result;
and making a decision according to the feedback result to obtain the action of the last time slot.
It can be understood that, as shown in fig. 3, the jammer works continuously while the legitimate users communicate. The jammer performs sweep-frequency interference on the channels in space slot by slot: it selects one channel at the start of each time slot and keeps the interference channel unchanged throughout that slot. In particular, if the channel interfered with by the jammer is the communication channel of a legitimate user, the interference succeeds; otherwise, the interference is ineffective.
When the jammer switches between channels to interfere with the data transmission of the legitimate users, a user does not know the state of channels it has not accessed and can learn the real interference situation of a channel only after transmitting data on it; the decisions of the other users in the network are also unknown to the user. The user therefore needs to obtain its channel access action by going through sensing, data transmission, learning analysis and decision in sequence, so as to avoid accessing a channel that the jammer is interfering with.
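The slot-by-slot behavior of the jammer can be sketched as follows. The text states only that the jammer sweeps and jams one channel per time slot; the sequential order with a step of one channel, the starting channel and the class name `SweepJammer` are assumptions made for this illustration.

```python
class SweepJammer:
    """Jams exactly one channel per slot and moves to the next channel each slot."""

    def __init__(self, num_channels, start=0):
        self.num_channels = num_channels
        self.channel = start % num_channels

    def next_slot(self):
        """Return the channel jammed in this slot, then advance the sweep."""
        jammed = self.channel
        # Assumed sweep order: consecutive channels, wrapping around at the end.
        self.channel = (self.channel + 1) % self.num_channels
        return jammed


jammer = SweepJammer(num_channels=4)
pattern = [jammer.next_slot() for _ in range(6)]
print(pattern)  # prints [0, 1, 2, 3, 0, 1]
```

A legitimate user never sees this pattern directly; it can infer it only from the outcomes of its own transmissions.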
In one embodiment, the data transmission result comprises a data transmission success result and a data transmission failure result;
analyzing the communication condition of the access channel according to the data transmission result to obtain a feedback result, wherein the feedback result comprises the following steps:
analyzing the communication condition of the access channel according to the successful data transmission result to obtain a positive feedback result;
and analyzing the communication condition of the access channel according to the data transmission failure result to obtain a negative feedback result and a non-feedback result.
In one embodiment, analyzing the communication condition of the access channel according to the data transmission failure result to obtain a negative feedback result and a non-feedback result, includes:
analyzing the communication condition of the access channel according to the data transmission failure result, and obtaining a negative feedback result when the communication condition is interfered by other legal users; when the communication situation is interfered by the jammer, a feedback-free result is obtained.
It can be understood that data transmission either succeeds or fails; failure mainly arises when channel sensing and access go wrong, so that the transmission is interfered with by the jammer or collides with the data transmission of other legitimate users.
It can be understood that three feedback results can be obtained from the data transmission result: when the data transmission succeeds, the user receives positive ACK feedback, indicating that it was not interfered with; when the user receives negative NACK feedback, it was interfered with by other users during data transmission; when the user receives no feedback, it was maliciously interfered with by the jammer. The latter two cases both indicate a data transmission failure.
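The three feedback cases can be expressed as a small classification routine. This is a sketch under stated assumptions: the names `Feedback` and `feedback_for` are hypothetical, and the case in which a channel is both jammed and shared with another user is not covered by the text, so the sketch arbitrarily lets the jammer take precedence.

```python
from enum import Enum


class Feedback(Enum):
    ACK = "positive"      # transmission succeeded
    NACK = "negative"     # collision with another legitimate user
    NONE = "no feedback"  # channel was jammed


def feedback_for(user, actions, jammed_channel):
    """Classify the feedback one user receives in a slot.

    actions holds the channel chosen by each user, indexed by user.
    """
    channel = actions[user]
    if channel == jammed_channel:
        # Assumption: jamming masks any simultaneous collision.
        return Feedback.NONE
    collided = any(a == channel for u, a in enumerate(actions) if u != user)
    return Feedback.NACK if collided else Feedback.ACK


actions = [0, 1, 1]  # user 0 on channel 0, users 1 and 2 both on channel 1
print([feedback_for(u, actions, jammed_channel=0).name for u in range(3)])
# prints ['NONE', 'NACK', 'NACK']
```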
In one embodiment, the making a decision according to the feedback result to obtain the action of the previous time slot includes:
making a decision according to the positive feedback result to obtain the action of the previous time slot, namely selecting the access channel for communication;
and making a decision according to the negative feedback and non-feedback results to obtain the action of the previous time slot, namely selecting other channels for communication.
It can be understood that the user identifies the network environment situation according to the received feedback result, and decides how to adjust the interference avoidance strategy of the next time slot.
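The decision rule of this embodiment can be sketched as below. The function name `next_channel` and the string labels for the feedback are hypothetical, and the uniformly random choice among the other channels is only a placeholder: in the method itself the choice would come from the learned interference avoidance strategy.

```python
import random


def next_channel(current, feedback, num_channels, rng=random):
    """Keep the channel after positive feedback; otherwise move to a different one."""
    if feedback == "ACK":
        return current
    # Negative feedback ("NACK") or no feedback ("NONE"): leave the current channel.
    others = [c for c in range(num_channels) if c != current]
    return rng.choice(others)  # placeholder for the learned strategy


print(next_channel(current=2, feedback="ACK", num_channels=4))  # prints 2
```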
In one embodiment, the step of inputting the local utility function into a hybrid network in the central controller for fusion by all legal users to obtain a joint utility function includes:
The network framework composed of the intelligent agent networks deployed in the legitimate users and the hybrid network and hypernetwork deployed in the central controller is shown in FIG. 4. All legitimate users input their local utility functions into the hybrid network in the central controller for fusion to obtain a joint utility function, wherein the joint utility function comprises the historical observation action sets of all intelligent agent networks, the action set of the current time slot, and the joint strategy;
the relationship between the joint utility function and the local utility functions is coordinated by the hypernetwork in the hybrid network, so that the joint utility function is monotonic in the local utility function of each legitimate user.
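In the QMIX literature, on which the method is based, the monotonicity condition is commonly written as follows. The symbols are assumed notation borrowed from that literature rather than confirmed by the application: $Q_{tot}$ is the joint utility function, $Q_n$ the local utility function of user $n$, $\tau^n$ and $a^n$ that user's observation-action history and action, and bold symbols the corresponding joint quantities.

```latex
\frac{\partial Q_{tot}(\boldsymbol{\tau}, \boldsymbol{a})}{\partial Q_n(\tau^n, a^n)} \ge 0,
\qquad \forall n \in \{1, \dots, N\}
```

Under this condition, maximizing each local utility function separately also maximizes the joint utility function, so the per-user greedy choices together form the greedy joint action.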
In one embodiment, coordinating a relationship between a joint utility function and a local utility function according to a super network in a hybrid network so that the joint utility function and the local utility function of each legitimate user satisfy monotonicity, includes:
inputting the global environment state into the hyper-network in the hybrid network for calculation to obtain the bias parameter b and the non-negative weight parameter w of the hybrid network;
and coordinating the relationship between the joint utility function and the local utility function according to the bias parameter b and the non-negative weight parameter w, so that the monotonicity between the joint utility function and the local utility function of each legal user is met.
It can be appreciated that the hyper-network is deployed within the hybrid network, and that coordinating the relationship between the joint utility function and the local utility function improves the convergence speed of the hybrid network.
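The role of the bias parameter b and the non-negative weight parameter w can be illustrated with a toy sketch. It assumes a single linear mixing layer and fixed random parameters in place of trained ones; the function names `hyper_network` and `mix` are hypothetical, and the actual network in the method may be deeper.

```python
import random
from typing import List, Tuple


def hyper_network(state: List[float], n_agents: int,
                  seed: int = 0) -> Tuple[List[float], float]:
    """Map the global state to mixing weights w and a bias b (toy linear layers)."""
    rng = random.Random(seed)  # fixed random parameters stand in for trained ones
    w_params = [[rng.uniform(-1, 1) for _ in state] for _ in range(n_agents)]
    b_params = [rng.uniform(-1, 1) for _ in state]
    # Taking the absolute value keeps every weight non-negative.
    w = [abs(sum(p * s for p, s in zip(row, state))) for row in w_params]
    b = sum(p * s for p, s in zip(b_params, state))
    return w, b


def mix(local_q: List[float], state: List[float]) -> float:
    """Joint utility: non-negative weighted sum of local utilities plus a bias."""
    w, b = hyper_network(state, len(local_q))
    return sum(wi * qi for wi, qi in zip(w, local_q)) + b
```

Because every weight is non-negative, raising any single local utility can never lower the joint value, which is the monotonicity property the hyper-network is meant to guarantee.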
In one embodiment, training and updating the joint utility function according to a utility function approximation algorithm to obtain an optimal joint utility function includes:
inputting the joint utility function and the target network utility function into a pre-constructed loss function for calculation,
wherein the pre-constructed loss function is computed over a training batch: the batch size is the number of samples drawn during training; the global reward of the i-th sampled batch is the sum of the instant rewards of all intelligent agent networks; and the target network utility function is interpreted as the utility function value obtained by evaluating, in the user's state, the action performed according to the policy, combined with the historical observation-action information, which helps keep the algorithm stable during training;
training and updating the joint utility function by minimizing the loss function to obtain an optimal joint utility function, wherein the optimal joint utility function comprises an optimal joint strategy;
during execution, the central controller distributes the optimal joint strategy to each legal user to obtain the optimal interference avoidance strategy corresponding to each legal user, and each legal user autonomously decides a communication channel according to the optimal interference avoidance strategy to carry out data transmission.
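A loss of this kind commonly takes the following temporal-difference form in value-decomposition methods that use a target network. This is a sketch in assumed notation, not the patent's own formula: $b$ is the batch size, $r_i$ the global reward of the $i$-th sample, $\gamma$ a discount factor, $Q_{tot}$ the joint utility function with parameters $\theta$, $\boldsymbol{\tau}$ the historical observation-action information, $\mathbf{a}$ the joint action, $s$ the global state, and $\theta^{-}$ the parameters of the target network.

```latex
L(\theta) = \sum_{i=1}^{b} \left( y_i^{tot} - Q_{tot}(\boldsymbol{\tau}_i, \mathbf{a}_i, s_i; \theta) \right)^2,
\qquad
y_i^{tot} = r_i + \gamma \max_{\mathbf{a}'} Q_{tot}(\boldsymbol{\tau}_i', \mathbf{a}', s_i'; \theta^{-})
```

Holding $\theta^{-}$ fixed between periodic updates is what gives the target its stabilizing effect during training.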
It can be understood that the invention adopts the idea of centralized training and decentralized execution, and comprises two stages: off-line training and on-line execution. In the off-line training stage, the central controller collects the observation, action and reward information of all agents, trains the optimal interference avoidance strategy, and issues it to the users. The on-line execution stage does not require the participation of the central controller: each legal user inputs its own perception information locally and, following the learned optimal interference avoidance strategy, autonomously outputs and executes its decision, which ensures the intelligent anti-interference capability of the legal users.
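The on-line execution stage can be pictured with a short sketch in which each user acts greedily on its own local utility values, with no central controller involved. The names `act_locally` and `toy_utility` are hypothetical, and `toy_utility` is only a hand-written stand-in for a trained agent network.

```python
from typing import Callable, List, Sequence

# A local utility maps (observation, last action) to one value per channel.
LocalUtility = Callable[[Sequence[float], int], List[float]]


def act_locally(utility: LocalUtility, observation: Sequence[float],
                last_action: int) -> int:
    """Pick the channel with the highest local utility value."""
    values = utility(observation, last_action)
    return max(range(len(values)), key=values.__getitem__)


def toy_utility(observation: Sequence[float], last_action: int) -> List[float]:
    # Hypothetical stand-in for a trained agent network: prefer channels
    # sensed as idle (low occupancy), with a small bonus for staying put.
    return [(1.0 - occ) + (0.1 if ch == last_action else 0.0)
            for ch, occ in enumerate(observation)]
```

Here the observation is assumed to be a per-channel occupancy reading between 0 and 1; the actual local observation also includes the user's own state information.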
Furthermore, experimental verification compares the interference avoidance performance of the method provided by the invention with that of Q learning and of an ideal scheme. The experiment considers two pairs of legal users, one jammer and six channels; each iteration comprises 100 rounds of training, and each round comprises 60 time slots of interactive learning. The performance comparison is shown in fig. 5, where the abscissa is the number of iterations and the ordinate is the normalized reward value. As fig. 5 shows, the performance of the method of the invention rises steadily as training proceeds, then gradually stabilizes and converges after about 220 training iterations. The resulting performance is significantly better than that of the centralized Q learning method, and the converged value is highly consistent with the performance value of the ideal scheme. The result shows that the invention achieves effective interference avoidance and enables users to cope autonomously with both malicious interference from the jammer and mutual interference among users.
The interference avoidance performance of the method provided by the invention is also verified experimentally for different network scales. The number of channels is again fixed at six, and the number of legal users is two, three and four pairs respectively; the more legal users there are, the larger the network scale and the more complex the communication environment. The verification result is shown in fig. 6, where the abscissa is the number of iterations and the ordinate is the average global reward value per time slot. As fig. 6 shows, the performance value converges within a limited number of iterations in all three scenarios, which indicates that the interference avoidance of the invention is effective and applicable across different network scales and communication environment complexities.
In addition, the time-frequency result of users that have not been trained with the method of the invention is compared with that of users trained with it. The time-frequency result before training is shown in fig. 7, which depicts three pairs of legal users and one jammer operating over six channels; the abscissa is the test time slot, the ordinate is the channel ID, and the different colored blocks of the grid indicate which channels the legal users and the jammer occupy. It can be seen that the jammer carries out sweep-frequency interference in real time and that most legal users cannot communicate normally, being interfered with by the jammer, by other users, or by both at the same time. This indicates that, before training with the method of the invention, users are frequently interfered with and have very poor interference avoidance capability. The time-frequency result after training is shown in fig. 8. It can be seen that, even though the network environment is complex and changeable and the users do not negotiate when accessing channels, with the method provided by the invention a legal user can avoid the jammer's interference completely and autonomously; only occasional mutual interference remains, and normal communication is achieved in most cases. This result further confirms the effectiveness of the invention.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described and may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, a distributed interference avoidance apparatus for a multi-user scenario is provided, including a perception module, a training module and an interference avoidance module, wherein:
the perception module is used for a legal user to obtain local observation information of the current time slot by perceiving the electromagnetic environment; the local observation information comprises the state information of the legal user and the channel use information in the electromagnetic environment;
the training module is used for each legal user to input the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain a local utility function of each legal user; the action input of the previous time slot is obtained by selecting an access channel for data transmission according to the local observation information of the previous time slot and making a decision according to a feedback result received for the data transmission; all legal users input the local utility function into a hybrid network in the central controller for fusion to obtain a joint utility function, and the joint utility function is trained and updated according to a utility function approximation algorithm to obtain an optimal joint utility function, wherein the optimal joint utility function comprises an optimal joint strategy;
and the interference avoidance module is used for the central controller to distribute the optimal joint strategy to each legal user to obtain the optimal interference avoidance strategy corresponding to each legal user, and for the legal users to autonomously decide a communication channel according to the optimal interference avoidance strategy to perform data transmission.
For specific limitations of the distributed interference avoidance apparatus for a multi-user scenario, refer to the above limitations of the distributed interference avoidance method for a multi-user scenario; details are not repeated here. Each module in the apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
The technical features of the above embodiments can be arbitrarily combined. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A distributed interference avoidance method facing a multi-user scene is characterized by comprising the following steps:
a legal user obtains local observation information of the current time slot by sensing an electromagnetic environment; the local observation information comprises the state information of a legal user and the channel use information in the electromagnetic environment;
each legal user inputs the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain a local utility function of each legal user; the action input of the previous time slot is obtained by selecting an access channel for data transmission according to the local observation information of the previous time slot and making a decision according to a feedback result received for the data transmission;
all legal users input the local utility function into a hybrid network in a central controller for fusion to obtain a joint utility function, and the joint utility function is trained and updated according to a utility function approximation algorithm to obtain an optimal joint utility function; wherein the optimal joint utility function comprises an optimal joint strategy;
and the central controller distributes the optimal joint strategy to each legal user to obtain an optimal interference avoidance strategy corresponding to each legal user, and the legal users autonomously decide a communication channel according to the optimal interference avoidance strategy to carry out data transmission.
2. The method of claim 1, wherein each legal user inputting the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain a local utility function of each legal user comprises:
and each legal user inputs the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain a local utility function of each legal user, wherein the local utility function comprises a historical observation action set of the intelligent agent network, the action of the current time slot and the intelligent agent network strategy.
3. The method of claim 1, wherein the action input of the previous time slot is obtained by selecting an access channel for data transmission according to local observation information of the previous time slot and making a decision according to a feedback result received from the data transmission, and the method comprises:
selecting an access channel for data transmission according to the local observation information of the last time slot to obtain a data transmission result;
analyzing the communication condition of the access channel according to the data transmission result to obtain a feedback result;
and making a decision according to the feedback result to obtain the action of the last time slot.
4. The method of claim 3, wherein the data transmission result comprises a data transmission success result and a data transmission failure result;
analyzing the communication condition of the access channel according to the data transmission result to obtain a feedback result, wherein the feedback result comprises the following steps:
analyzing the communication condition of the access channel according to the successful data transmission result to obtain a positive feedback result;
and analyzing the communication condition of the access channel according to the data transmission failure result to obtain a negative feedback result and a non-feedback result.
5. The method of claim 4, wherein analyzing the communication condition of the access channel according to the data transmission failure result to obtain a negative feedback result and a non-feedback result comprises:
analyzing the communication condition of the access channel according to the data transmission failure result, and obtaining a negative feedback result when the communication condition is interfered by other legal users; when the communication condition is interfered by the jammer, a non-feedback result is obtained.
6. The method of claim 5, wherein making a decision according to the feedback result to obtain the action of the last time slot comprises:
making a decision according to the positive feedback result to obtain the action of the last time slot, namely selecting the access channel for communication;
and making a decision according to the negative feedback and non-feedback results to obtain the action of the last time slot, namely selecting other channels for communication.
7. The method of claim 1, wherein all legal users inputting the local utility function into a hybrid network in a central controller for fusion to obtain a joint utility function comprises:
all legal users input the local utility function into a mixed network in a central controller for fusion to obtain a combined utility function; the joint utility function comprises historical observation action sets of all intelligent agent networks, action sets of current time slots and joint strategies;
and coordinating the relationship between the joint utility function and the local utility function according to the hyper-network in the hybrid network, so that the monotonicity between the joint utility function and the local utility function of each legal user is met.
8. The method of claim 7, wherein coordinating the relationship between the joint utility function and the local utility function according to a hyper-network in the hybrid network such that the joint utility function and the local utility function of each legitimate user satisfy monotonicity, comprises:
inputting a global environment state into a hyper-network in the hybrid network for calculation to obtain a bias parameter and a nonnegative weight parameter of the hybrid network;
and coordinating the relationship between the joint utility function and the local utility function according to the bias parameter and the non-negative weight parameter, so that the monotonicity between the joint utility function and the local utility function of each legal user is met.
9. The method of claim 1, wherein training and updating the joint utility function according to a utility function approximation algorithm to obtain an optimal joint utility function comprises:
inputting the joint utility function and a target network utility function into a pre-constructed loss function for calculation, and training and updating the joint utility function by minimizing the loss function to obtain an optimal joint utility function, wherein the optimal joint utility function comprises an optimal joint strategy.
10. A distributed interference avoidance apparatus for a multi-user scenario, the apparatus comprising:
the perception module is used for a legal user to obtain local observation information of the current time slot by perceiving the electromagnetic environment; the local observation information comprises the state information of a legal user and the channel use information in the electromagnetic environment;
the training module is used for each legal user to input the local observation information of the current time slot and the action of the previous time slot into the intelligent agent network for calculation to obtain a local utility function of each legal user; the action input of the previous time slot is obtained by selecting an access channel for data transmission according to the local observation information of the previous time slot and making a decision according to a feedback result received for the data transmission; all legal users input the local utility function into a hybrid network in a central controller for fusion to obtain a joint utility function, and the joint utility function is trained and updated according to a utility function approximation algorithm to obtain an optimal joint utility function; wherein the optimal joint utility function comprises an optimal joint strategy;
and the interference avoidance module is used for the central controller to distribute the optimal joint strategy to each legal user to obtain the optimal interference avoidance strategy corresponding to each legal user, and for the legal users to autonomously decide a communication channel according to the optimal interference avoidance strategy to carry out data transmission.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211076124.9A CN115150963B (en) | 2022-09-05 | 2022-09-05 | Multi-user scene-oriented distributed interference avoidance method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115150963A true CN115150963A (en) | 2022-10-04 |
CN115150963B CN115150963B (en) | 2022-11-04 |
Family
ID=83415990
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211076124.9A Active CN115150963B (en) | 2022-09-05 | 2022-09-05 | Multi-user scene-oriented distributed interference avoidance method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115150963B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN115150963B (en) | 2022-11-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||