CN115296705A - Active monitoring method in MIMO communication system
- Publication number: CN115296705A (application number CN202210470392.2A)
- Authority: CN (China)
- Prior art keywords: antenna, transmitter, listener, parameter, strategy
- Legal status: Granted
Classifications
- H04B7/0413: MIMO systems (radio transmission; diversity or multi-antenna systems using two or more spaced independent antennas)
- H04W24/06: Testing, supervising or monitoring using simulated traffic
- H04W72/0473: Wireless resource allocation based on the type of the allocated resource, the resource being transmission power
- Y02D30/70: Reducing energy consumption in wireless communication networks
Abstract
The invention discloses an active monitoring method in a MIMO communication system involving a suspicious transmitter A, a suspicious receiver B and a legitimate monitor E. The transmitter A sends information to the receiver B; the monitor E makes decisions according to partially known channel information to improve its monitoring performance, and A makes corresponding decisions to evade the monitoring, which produces a monitoring and anti-monitoring game between the source node A and the monitor E. The invention designs a reinforcement learning algorithm to optimize the transmit power strategies of the monitor E and the transmitter A and to obtain the Nash equilibrium of the monitoring and anti-monitoring game between them.
Description
Technical Field
The invention belongs to the field of wireless communication, and particularly relates to an active monitoring method in a multiple-input multiple-output (MIMO) system, and more particularly to an optimization method for a power allocation strategy based on a multi-agent reinforcement learning algorithm (FSP-SAC).
Background
In recent decades, wireless communication has played an increasingly important role in people's daily life by providing efficient and convenient connections between people.
Currently, physical-layer monitoring can be divided into two categories: passive monitoring and active monitoring. Passive monitoring simply receives the leaked radio signal through a silent receiver. However, with the deployment of high-frequency channels and MIMO in fifth-generation (5G) networks, the beams carrying the signals become increasingly narrow and directional, so passive monitoring will have difficulty capturing valid information. According to information theory, information can be decoded in the physical-layer sense as long as the channel capacity of the communication channel is smaller than the channel capacity of the listening channel. Therefore, in order to improve monitoring efficiency, active monitoring methods that reduce the capacity of the communication channel by means of interference signals are beginning to be widely used.
Currently, common active monitoring methods only consider a static monitored target. With the development of anti-monitoring measures, however, more and more illegal information transmitters have begun to intelligently adjust their transmission power and to reduce the monitoring channel capacity with noise signals. This creates a game between anti-monitoring and monitoring and causes great difficulty for legitimate active monitoring. It is therefore important to construct a method that seeks the Nash equilibrium solution of the anti-monitoring and monitoring game.
Disclosure of Invention
The purpose of the invention is as follows: in view of the problems and deficiencies of the prior art, an object of the present invention is to provide an active monitoring method in a MIMO communication system, which optimizes the transmit power strategy of the listener and the transmit power strategy of the suspicious source node so that the two strategies reach a Nash equilibrium.
The technical scheme is as follows: in order to achieve the above object, the present invention adopts the following technical solution, an active monitoring method in a MIMO communication system, comprising the steps of:
(1) In each time slot t, a multi-antenna transmitter (transmitter for short) A transmits an information signal x_s(t) to a multi-antenna receiver (receiver for short) B and transmits an interference signal x_n(t) to a multi-antenna listener (listener for short) E, so as to reduce the channel capacity from A to E and thus prevent listening by E. The action of the transmitter A is expressed as a_A(t) = {Q_s(t), Q_n(t)}, where Q_s(t) and Q_n(t) are respectively the covariance matrices of x_s(t) and x_n(t). Based on the local channel information o_A(t), the transmitter A uses a strategy π_A to select actions: the transmitter A samples from the conditional probability distribution π_A(a_A(t) | o_A(t)) based on o_A(t), and the sampled value is the selected action a_A(t).
(2) In each time slot t, the listener E transmits an interference signal x_E(t) to the receiver B, so as to reduce the channel capacity between the transmitter A and the receiver B and thereby increase the listening success rate. The action of the listener E is expressed as a_E(t) = Q_E(t), where Q_E(t) is the covariance matrix of x_E(t). Based on the local channel information o_E(t), the listener E uses a strategy π_E to select actions: the listener E samples from the conditional probability distribution π_E(a_E(t) | o_E(t)) based on o_E(t), and the sampled value is the selected action a_E(t).
(3) In each time slot t, after both the transmitter A and the listener E have performed their actions, they respectively obtain the rewards r_A(s_t, a_t) and r_E(s_t, a_t). Let π = {π_A, π_E}. The average reward function J_A(π) of the transmitter A is defined as J_A(π) = E_π[r_A(s_t, a_t)], where E_π[·] represents the mathematical expectation over the time slots t taken under the condition π; the average reward function J_E(π) of the listener E is defined as J_E(π) = E_π[r_E(s_t, a_t)]. The strategy π_A is optimized to maximize J_A(π), and the strategy π_E is optimized to maximize J_E(π), so as to achieve the Nash equilibrium of the monitoring and anti-monitoring game.
Further, the step (3) further comprises the following steps:
1) For any device n, where n ∈ {A, E}, initialize a policy π_{θ_n} parameterized by θ_n, a policy π_{ψ_n} parameterized by ψ_n, a value function Q_{ω_n} parameterized by ω_n, and a value function V_{φ_n} parameterized by φ_n; the parameter φ_n is assigned to the parameter φ'_n of the target value function V_{φ'_n}.
2) In each time slot t, device n uses the policy π_{θ_n} with a probability of 0.1 to choose an action and uses the policy π_{ψ_n} with a probability of 0.9 to choose an action. The collected data (o_n(t), a_n(t), r_n(t), o_n(t+1)) are stored in a first storage area M_n^RL, where when n = A the data are (o_A(t), a_A(t), r_A(t), o_A(t+1)) and when n = E the data are (o_E(t), a_E(t), r_E(t), o_E(t+1)); if the action is selected by the policy π_{θ_n}, the data (o_n(t), a_n(t)) are additionally stored in a second storage area M_n^SL.
3) From M_n^RL, randomly select a sample batch τ_RL of length L and calculate the gradient Δθ_n of the objective of the policy π_{θ_n}, where ∇_x means that the gradient is taken over the variable x, the actions used in this gradient are sampled by the policy π_{θ_n} rather than taken from τ_RL, and the temperature parameter α ∈ [0,1]; calculate the gradient Δω_n of the objective of the value function Q_{ω_n}; calculate the gradient Δφ_n of the objective of the value function V_{φ_n}, where the discount factor γ ∈ (0,1). From M_n^SL, randomly select a sample batch τ_SL of length L and calculate the gradient Δψ_n of the objective of the policy π_{ψ_n}. Then update the parameters θ_n ← θ_n + ηΔθ_n, ω_n ← ω_n + ηΔω_n, φ_n ← φ_n + ηΔφ_n, ψ_n ← ψ_n + ηΔψ_n, and φ'_n ← νφ_n + (1 − ν)φ'_n, where η is the learning rate with value range (0,1), ν is the moving-average parameter with value range (0,1), and the symbol ← indicates that the value on the right side of the arrow is assigned to the left side. Then return to step 2) until the strategy parameter θ_n no longer changes.
Has the advantages that: by designing the FSP-SAC algorithm and introducing deep reinforcement learning, the invention overcomes the curse of dimensionality faced in high-dimensional games, and by combining fictitious self-play with deep reinforcement learning it solves the problem that common single-agent reinforcement learning algorithms are difficult to converge in game problems, so that the algorithm gradually converges to a Nash equilibrium.
Drawings
FIG. 1 is a diagram of the system model of the present invention;
FIG. 2 is a graph comparing the learning performance of the transmitter A under the method of the present invention with that of other methods;
FIG. 3 is a graph comparing the learning performance of the listener E under the method of the present invention with that of other methods.
Detailed Description
The present invention is further illustrated below with reference to the accompanying drawings and specific embodiments. It should be understood that these embodiments are illustrative only and are not intended to limit the scope of the invention; after reading the present specification, modifications in various equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of the present application.
As shown in fig. 1, the communication system considered consists of a multi-antenna transmitter (transmitter) A, a multi-antenna receiver (receiver) B, and a multi-antenna listener (listener) E. Let the number of transmitting antennas of the transmitter A be N_A and the number of receiving antennas of the receiver B be N_B. The listener E has two groups of antennas: one group for transmitting the interference signals, whose number is N_E^t, and another group for listening to the signals from A, whose number is N_E^r. At time slot t, the channel matrices between transmitter A and receiver B, between transmitter A and listener E, and between listener E and receiver B are denoted H_AB(t) ∈ C^(N_B×N_A), H_AE(t) ∈ C^(N_E^r×N_A), and H_EB(t) ∈ C^(N_B×N_E^t), respectively, where C^(i×j) represents a complex field space of size i × j.
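For concreteness, the following sketch draws one random realization of the three channel matrices with the dimensions defined above; the i.i.d. complex Gaussian (Rayleigh) fading model and the default antenna counts are illustrative assumptions, since the text only fixes the matrix dimensions, not the fading distribution.

```python
import numpy as np

def random_channels(N_A=4, N_B=2, N_E_t=2, N_E_r=2, rng=None):
    """Draw one i.i.d. Rayleigh-fading realization of H_AB, H_AE, H_EB (sketch)."""
    rng = np.random.default_rng() if rng is None else rng

    def cgauss(rows, cols):
        # CN(0, 1) entries: unit-variance circularly symmetric complex Gaussian.
        return (rng.standard_normal((rows, cols))
                + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

    H_AB = cgauss(N_B, N_A)      # transmitter A -> receiver B
    H_AE = cgauss(N_E_r, N_A)    # transmitter A -> listener E (listening antennas)
    H_EB = cgauss(N_B, N_E_t)    # listener E (jamming antennas) -> receiver B
    return H_AB, H_AE, H_EB
```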
In each time slot t, the signal transmitted by the transmitter A is composed of an information signal x_s(t) and an artificial noise signal x_n(t). The information signal is expressed as x_s(t) = V_1(t)u_s(t), where u_s(t) is the vector of N_B information streams and V_1(t) is the precoding matrix; the artificial noise is expressed as x_n(t) = V_0(t)u_n(t), where u_n(t) is the vector of N_A − N_B artificial noise streams and V_0(t) is its precoding matrix. In order that the artificial noise does not interfere with the information signal, V_1(t) is composed of the N_B right singular vectors of H_AB(t) corresponding to non-zero singular values, and V_0(t) is composed of the remaining N_A − N_B right singular vectors corresponding to singular values of 0, so that H_AB(t)V_0(t) = 0. The total signal transmitted by the transmitter A in time slot t is denoted as x_A(t) = x_s(t) + x_n(t). The signal received by the receiver B is:

y_B(t) = H_AB(t)x_A(t) + H_EB(t)x_E(t) + n_B,   (1)

where x_E(t) is the interference signal transmitted by the listener E to B, and n_B is Gaussian white noise. The signal received by the listener E is:

y_E(t) = H_AE(t)x_A(t) + n_E,   (2)

where n_E is Gaussian white noise at the listener E.
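The null-space artificial-noise construction just described can be sketched as follows: the right singular vectors of H_AB(t) are split into a signal subspace V_1 and a null-space part V_0 so that H_AB(t)V_0 is numerically zero. The function and variable names are illustrative.

```python
import numpy as np

def an_precoding_matrices(H_AB):
    """Split the right singular vectors of H_AB into V_1 (signal) and V_0 (null space).

    Sketch of the null-space artificial-noise precoder: information streams are
    sent along V_1, artificial noise along V_0, so the noise does not reach
    receiver B (H_AB @ V_0 ~ 0).
    """
    N_B, N_A = H_AB.shape
    # Full SVD: the first N_B right singular vectors span the row space of H_AB,
    # the remaining N_A - N_B span its null space.
    _, _, Vh = np.linalg.svd(H_AB, full_matrices=True)
    V = Vh.conj().T
    V_1 = V[:, :N_B]          # precoder for the information signal x_s(t)
    V_0 = V[:, N_B:]          # precoder for the artificial noise x_n(t)
    return V_1, V_0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H_AB = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
    V_1, V_0 = an_precoding_matrices(H_AB)
    print(np.linalg.norm(H_AB @ V_0))   # ~1e-16, i.e. numerically zero
```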
from equation (1), the covariance matrix of the received signal B is:
whereinSuperscript (.) H Representing the conjugate transpose of a matrix or vector. The covariance matrix of the received interference is:
whereinIs n B Of covariance matrix, σ 2 Is the noise coefficient, I x An identity matrix of size x is shown. According to equations (3) and (4), the channel capacity between the transmitter A and the receiver B is
Where the function det represents the determinant of the matrix, the superscript (.) -1 Representing the inverse of the matrix.
According to equation (2), the covariance matrix of the signal at the listener E is:

W_E(t) = H_AE(t)Q_s(t)H_AE(t)^H.   (6)

The covariance matrix of the interference is:

R_E(t) = H_AE(t)Q_n(t)H_AE(t)^H + σ²I_{N_E^r},   (7)

where Q_n(t) is the covariance matrix of the artificial noise signal x_n(t). From equations (6) and (7), the channel capacity between the transmitter A and the listener E is:

C_E(t) = log det(I_{N_E^r} + R_E(t)^(-1) W_E(t)).   (8)
in time slot t, both transmitter a and listener E can only obtain partial channel information, which we call local observation information (or local channel information). In time slot t, the local observation information of transmitter A is defined asWhereinA space of local observation information of a. The local observation information of the listener E is defined as The local observation information space of E. And global state is defined asWhereinIs a global state space.
At each time slot t, the transmitter A decides the power allocation of the transmitted signals x_s(t) and x_n(t), i.e., the covariance matrices Q_s(t) and Q_n(t). The power of the information signal is P_s(t) = tr(Q_s(t)), where tr is the trace of a matrix, and the power of the artificial noise is P_n(t) = tr(Q_n(t)). If the transmitter A does not know the channel H_AE(t), the noise power can be assumed to be equally distributed over each artificial noise stream. If each stream of the signal is assumed to be uncorrelated with the others, then Q_s(t) and Q_n(t) are both positive semidefinite symmetric matrices. The signal power and the artificial noise power need to satisfy the total power constraint tr(Q_s(t)) + tr(Q_n(t)) ≤ P_A^max, where P_A^max is the maximum transmit power of the transmitter A. The action of the transmitter A in time slot t is defined as a_A(t) = {Q_s(t), Q_n(t)}. The covariance matrix of the interference signal x_E(t) of the listener E is Q_E(t); assuming that each stream of the signal is uncorrelated with the others, Q_E(t) is a positive semidefinite symmetric matrix. The power of the noise signal needs to satisfy the total power constraint tr(Q_E(t)) ≤ P_E^max, where P_E^max is the maximum transmit power of the listener E. The action of E in time slot t is defined as a_E(t) = Q_E(t), and the joint action is defined as a(t) = {a_A(t), a_E(t)}.
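Since the actions are trace-constrained positive semidefinite covariance matrices, a learning agent needs some way to map raw outputs onto the feasible set. The sketch below shows one simple projection (symmetrize, clip negative eigenvalues, rescale the trace); this particular parameterization is an illustrative assumption, not the one mandated by the invention.

```python
import numpy as np

def project_to_feasible_covariance(raw, P_max):
    """Turn an arbitrary complex square matrix into a PSD covariance with tr(Q) <= P_max (sketch)."""
    A = 0.5 * (raw + raw.conj().T)                # take the Hermitian part
    eigval, eigvec = np.linalg.eigh(A)
    eigval = np.clip(eigval, 0.0, None)           # enforce positive semidefiniteness
    Q = (eigvec * eigval) @ eigvec.conj().T
    total = np.real(np.trace(Q))
    if total > P_max and total > 0:
        Q *= P_max / total                        # enforce the total power budget
    return Q
```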
According to information theory, if C_E(t) is greater than or equal to C_B(t), the listener can decode, with arbitrarily small error in the physical-layer sense, the information that the transmitter A sends to the receiver B. Therefore the reward function of the listener E is defined as:

r_E(s_t, a_t) = 1{C_E(t) ≥ C_B(t)} · 2^(C_B(t)),   (9)

where 1{·} represents a Boolean indicator function that outputs 1 when the input is true and 0 otherwise, and the exponential form is used because it increases the amplitude of the variation of the reward function with the action exponentially. If the information transmitted by the transmitter A is eavesdropped, a penalty is imposed in proportion to the amount of eavesdropped data. The transmitter A aims to reduce the amount of eavesdropped information while maximizing its transmission rate, so the reward function of the transmitter A is defined as:

r_A(s_t, a_t) = C_B(t) − ζ·r_E(s_t, a_t),   (10)

where ζ > 0 is a coefficient that balances the transmission rate and the information-leakage penalty. We abbreviate r_E(s_t, a_t) and r_A(s_t, a_t) as r_E(t) and r_A(t), respectively.
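The reward pair can then be computed from the two capacities as below; the indicator-times-exponential form of r_E and the rate-minus-penalty form of r_A follow the description above, but the exact expressions are assumptions made for illustration.

```python
def rewards(C_B, C_E, zeta=2.0):
    """Sketch of the reward pair (r_E, r_A) built from equations (9)-(10).

    The precise functional forms are illustrative assumptions; only the
    indicator structure and the leakage-penalty coefficient zeta are
    stated in the description above.
    """
    eavesdrop_ok = 1.0 if C_E >= C_B else 0.0     # Boolean indicator function
    r_E = eavesdrop_ok * (2.0 ** C_B)             # exponential amplification of the reward
    r_A = C_B - zeta * r_E                        # transmission rate minus leakage penalty
    return r_E, r_A
```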
the strategy for defining the emitter A is pi A The selection of the action is carried out,is based onThe observed information isWhen A chooses an action using the probability distributionThe policy for definition of listener E is π E The selection of the action is carried out,is based onThe observed information isThe transmitter A uses the probability distribution to select an actionThe joint strategy of the two is expressed as pi = (pi) A ,π E ). The objective function of the transmitter A isMeaning that under the conditions of the strategy pi,mathematical expectation in the time dimension, i.e. the average prize value. Likewise, the listener E has an objective function of
The optimization objective of the transmitter A is:

max_{π_A} J_A(π_A, π_E).   (11)

The optimization objective of the listener E is:

max_{π_E} J_E(π_A, π_E).   (12)
to solve the problems (11) and (12), a common reinforcement learning algorithm can be applied to the transmitter a and the listener E to solve the problems, but the learning results are difficult to converge due to the fact that the strategies of the two parties are changed. Therefore, a multi-agent reinforcement learning algorithm FSP-SAC is designed to learn respectively optimal strategies pi for the transmitter A and the monitor E A And pi E . The method stabilizes the learning process by training an average strategy and an optimal response strategy.
We now present the process of solving problems (11) and (12) using FSP-SAC. Since both problems (11) and (12) are solved with FSP-SAC, for simplicity of description we use n to denote either the transmitter A or the listener E, i.e., n ∈ {A, E}. The algorithm proceeds as follows:
1) For any n ∈ {A, E}, initialize a best-response policy π_{θ_n} parameterized by θ_n, an average policy π_{ψ_n} parameterized by ψ_n, a value function Q_{ω_n} parameterized by ω_n, and a value function V_{φ_n} parameterized by φ_n. Initialize a target value function V_{φ'_n}; to stabilize the learning process, the parameter φ_n is assigned to the parameter φ'_n of V_{φ'_n}.
2) Data collection: the notation x ~ p(x) denotes that x obeys the probability distribution p(x). In each time slot t, with local observation information o_n(t), device n uses the best-response policy with a probability of 0.1 to select an action, a_n(t) ~ π_{θ_n}(· | o_n(t)), and uses the average policy with a probability of 0.9 to select an action, a_n(t) ~ π_{ψ_n}(· | o_n(t)). We call this probabilistic selection a hybrid strategy; the mixture of π_{θ_n} and π_{ψ_n} is denoted π_n. After both the transmitter A and the listener E have performed their actions, the system transitions to the next state, and n obtains the reward r_n(t) and observes the local observation information o_n(t+1) of the next time slot. The collected data (o_n(t), a_n(t), r_n(t), o_n(t+1)) are stored in the storage area M_n^RL; if the action was selected by the policy π_{θ_n}, the data (o_n(t), a_n(t)) are additionally stored in another storage area M_n^SL. Assuming the data collection phase lasts T steps, when t = T the data collection ends and the optimization learning phase is entered.
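A single data-collection step of the hybrid strategy might look as follows; the buffer layout, the helper names and the environment interface env_step are assumptions made for illustration.

```python
import random

def collect_step(o_t, best_response, average_policy, env_step, M_RL, M_SL, eta_br=0.1):
    """One data-collection step of the hybrid (mixed) strategy pi_n (sketch).

    best_response(o) and average_policy(o) return an action for observation o;
    env_step(a) returns (reward, next_observation).
    """
    use_best_response = random.random() < eta_br   # probability 0.1: best-response policy
    a_t = best_response(o_t) if use_best_response else average_policy(o_t)

    r_t, o_next = env_step(a_t)
    M_RL.append((o_t, a_t, r_t, o_next))           # reinforcement-learning buffer
    if use_best_response:
        M_SL.append((o_t, a_t))                    # supervised-learning buffer
    return o_next
```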
3) Reinforcement learning stage: the data in M_n^RL are used to update π_{θ_n}, Q_{ω_n}, and V_{φ_n}. From M_n^RL, a sample batch τ_RL of length L is randomly selected, and the gradient Δθ_n of the soft actor-critic policy objective is calculated with respect to θ_n, where ∇_x means that the gradient is taken over the variable x, the actions appearing in this gradient are sampled by the policy π_{θ_n} rather than taken from the sample batch τ_RL, and the temperature parameter α ∈ [0,1]. The gradient Δω_n of the objective of the value function Q_{ω_n} and the gradient Δφ_n of the objective of the value function V_{φ_n} are then calculated, where the discount factor γ ∈ (0,1). Then the parameters are updated: θ_n ← θ_n + ηΔθ_n, ω_n ← ω_n + ηΔω_n, φ_n ← φ_n + ηΔφ_n, and the target parameter φ'_n ← νφ_n + (1 − ν)φ'_n, where η is the learning rate with value range (0,1), ν is the moving-average parameter with value range (0,1), and the symbol ← indicates that the value on the right side of the arrow is assigned to the left side.
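The gradients of the reinforcement learning stage can be sketched with standard soft actor-critic losses, as below; the network interfaces (a policy returning a sampled action and its log-probability, Q and V networks returning per-sample scalars) and the exact loss forms are assumptions, since the text only names the quantities θ_n, ω_n, φ_n, α, γ and ν.

```python
import torch
import torch.nn.functional as F

def sac_losses(batch, policy, q_fn, v_fn, v_target, alpha=0.05, gamma=0.99):
    """Soft actor-critic losses for the reinforcement-learning stage (sketch).

    batch = (obs, act, rew, next_obs) tensors sampled from the RL buffer.
    policy(obs) is assumed to return (sampled_action, log_prob); q_fn(obs, act)
    and v_fn(obs) return one scalar per sample.
    """
    obs, act, rew, next_obs = batch

    # Value-function target: Q of a freshly sampled action minus the entropy term.
    with torch.no_grad():
        a_new, logp_new = policy(obs)
        v_backup = q_fn(obs, a_new) - alpha * logp_new
    v_loss = F.mse_loss(v_fn(obs), v_backup)

    # Soft Q target: reward plus discounted target value of the next observation.
    with torch.no_grad():
        q_backup = rew + gamma * v_target(next_obs)
    q_loss = F.mse_loss(q_fn(obs, act), q_backup)

    # Policy loss: entropy-regularized objective with actions sampled from the policy.
    a_pi, logp_pi = policy(obs)
    policy_loss = (alpha * logp_pi - q_fn(obs, a_pi)).mean()

    return policy_loss, q_loss, v_loss

def soft_update(v_target, v_fn, nu=0.005):
    """Moving-average update of the target value network parameters (sketch)."""
    with torch.no_grad():
        for p_t, p in zip(v_target.parameters(), v_fn.parameters()):
            p_t.mul_(1.0 - nu).add_(nu * p)
```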
4) Supervised learning stage: the data in M_n^SL are used to update π_{ψ_n}. From M_n^SL, a sample batch of length B is randomly selected, and the gradient Δψ_n of the supervised objective of the average policy is calculated. Then the parameter is updated: ψ_n ← ψ_n + ηΔψ_n. Then return to step 2) until the policy π_{θ_n} converges to a steady state.
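The supervised learning stage amounts to behavior cloning of the best-response actions stored in M_n^SL; a minimal sketch, assuming the average policy exposes a log_prob method:

```python
def average_policy_loss(batch_sl, average_policy):
    """Supervised-learning loss for the average policy (sketch).

    batch_sl = (obs, act) pairs from the SL buffer; average_policy.log_prob(obs, act)
    is assumed to return the log-likelihood of each stored action, so minimizing
    the negative log-likelihood clones the best-response behavior, as in
    fictitious self-play.
    """
    obs, act = batch_sl
    return -average_policy.log_prob(obs, act).mean()
```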
Finally, we simulate the system. The simulation parameters are set as follows: σ² = 10^(-8) mW and N_A = 4. The distances among the transmitter A, the receiver B and the listener E are all 200 m, and the path-loss exponent is 3.48. The coefficient ζ = 2 in equation (10). The strategy and value functions are parameterized by multilayer perceptrons (a type of artificial neural network) with the ReLU (Rectified Linear Unit) activation function, each having two hidden layers of 128 neurons. η = 0.0003, α = 0.05, ν = 0.005, γ = 0.99, T = 1000, L = 128.
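The two-layer, 128-neuron ReLU multilayer perceptron mentioned above could be built as follows; the input and output dimensions depend on how observations and covariance-matrix actions are flattened, which is not specified here, so they are left as arguments.

```python
import torch.nn as nn

def make_mlp(in_dim, out_dim, hidden=128):
    """Two hidden layers of 128 ReLU units, matching the simulation setup (sketch)."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )
```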
In FIG. 2 and FIG. 3 we compare with several other methods, where the SAC (Soft Actor-Critic) method is from "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" and the WoLF-PPO method is from "Win or Learn Fast Proximal Policy Optimization". FIG. 2 and FIG. 3 are the learning curves of the transmitter A and the listener E, respectively. It can be seen that the result curve of the multi-agent reinforcement learning algorithm FSP-SAC progressively converges to a steady-state value, while the other learning algorithms used for comparison suffer from severe fluctuation. Therefore the present invention solves the problem that other reinforcement learning methods are difficult to converge in the game problem; moreover, according to the relationship between convergence and rationality in games, since FSP-SAC inherits rationality from SAC, it can be concluded that the result of the FSP-SAC method converges to a Nash equilibrium.
Claims (2)
1. An active monitoring method in a MIMO communication system comprises the following steps:
(1) In each time slot t, the multi-antenna transmitter A transmits an information signal x_s(t) to the multi-antenna receiver B and transmits an interference signal x_n(t) to the multi-antenna listener E, so as to reduce the channel capacity from the multi-antenna transmitter A to the multi-antenna listener E and thus prevent listening by the multi-antenna listener E; the action of the multi-antenna transmitter A is expressed as a_A(t) = {Q_s(t), Q_n(t)}, where Q_s(t) and Q_n(t) are respectively the covariance matrices of x_s(t) and x_n(t); based on the local channel information o_A(t), the multi-antenna transmitter A uses a strategy π_A to select actions: the multi-antenna transmitter A samples from the conditional probability distribution π_A(a_A(t) | o_A(t)) based on o_A(t), and the sampled value is the selected action a_A(t);
(2) In each time slot t, the multi-antenna listener E transmits an interference signal x_E(t) to the multi-antenna receiver B, so as to reduce the channel capacity between the multi-antenna transmitter A and the multi-antenna receiver B and thereby increase the listening success rate; the action of the multi-antenna listener E is expressed as a_E(t) = Q_E(t), where Q_E(t) is the covariance matrix of x_E(t); based on the local channel information o_E(t), the multi-antenna listener E uses a strategy π_E to select actions: the multi-antenna listener E samples from the conditional probability distribution π_E(a_E(t) | o_E(t)) based on o_E(t), and the sampled value is the selected action a_E(t);
(3) In each time slot t, after both the multi-antenna transmitter A and the multi-antenna listener E have performed their actions, they respectively obtain the rewards r_A(s_t, a_t) and r_E(s_t, a_t); let π = {π_A, π_E}; the average reward function J_A(π) of the multi-antenna transmitter A is defined as J_A(π) = E_π[r_A(s_t, a_t)], where E_π[·] represents the mathematical expectation over the time slots t taken under the condition π; the average reward function J_E(π) of the multi-antenna listener E is defined as J_E(π) = E_π[r_E(s_t, a_t)]; the strategy π_A is optimized to maximize J_A(π), and the strategy π_E is optimized to maximize J_E(π), so as to achieve the Nash equilibrium of the monitoring and anti-monitoring game.
2. The active listening method in a MIMO communication system according to claim 1, wherein said step (3) further comprises the steps of:
1) For any device n, where n ∈ {A, E}, initialize a policy π_{θ_n} parameterized by θ_n, a policy π_{ψ_n} parameterized by ψ_n, a value function Q_{ω_n} parameterized by ω_n, and a value function V_{φ_n} parameterized by φ_n; the parameter φ_n is assigned to the parameter φ'_n of the target value function V_{φ'_n};
2) In each time slot t, device n uses the policy π_{θ_n} with a probability of 0.1 to choose an action and uses the policy π_{ψ_n} with a probability of 0.9 to choose an action; the collected data (o_n(t), a_n(t), r_n(t), o_n(t+1)) are stored in a first storage area M_n^RL, where when n = A the data are (o_A(t), a_A(t), r_A(t), o_A(t+1)) and when n = E the data are (o_E(t), a_E(t), r_E(t), o_E(t+1)); if the action is selected by the policy π_{θ_n}, the data (o_n(t), a_n(t)) are additionally stored in a second storage area M_n^SL;
3) From M_n^RL, randomly select a sample batch τ_RL of length L and calculate the gradient Δθ_n of the objective of the policy π_{θ_n}, where ∇_x means that the gradient is taken over the variable x, the actions used in this gradient are sampled by the policy π_{θ_n} rather than taken from τ_RL, and the temperature parameter α ∈ [0,1]; calculate the gradient Δω_n of the objective of the value function Q_{ω_n}; calculate the gradient Δφ_n of the objective of the value function V_{φ_n}, where the discount factor γ ∈ (0,1); from M_n^SL, randomly select a sample batch τ_SL of length L and calculate the gradient Δψ_n of the objective of the policy π_{ψ_n}; then update the parameters θ_n ← θ_n + ηΔθ_n, ω_n ← ω_n + ηΔω_n, φ_n ← φ_n + ηΔφ_n, ψ_n ← ψ_n + ηΔψ_n, and φ'_n ← νφ_n + (1 − ν)φ'_n, where η is the learning rate with value range (0,1), ν is the moving-average parameter with value range (0,1), and the symbol ← indicates that the value on the right side of the arrow is assigned to the left side; then return to step 2) until the strategy parameter θ_n no longer changes.
Priority Applications
- CN202210470392.2A, filed 2022-04-28, priority date 2022-04-28: "Active monitoring method in MIMO communication system" (granted as CN115296705B)
Publications
- CN115296705A (application), published 2022-11-04
- CN115296705B (grant), published 2023-11-21
Family
- ID: 83819503
- Country status: CN: CN115296705B (active)
Patent Citations (7)
- US20090088176A1 (priority 2007-09-27, published 2009-04-02, Koon Hoo Teo): Method for Reducing Inter-Cell Interference in Wireless OFDMA Networks
- US10069592B1 (priority 2015-10-27, published 2018-09-04, Arizona Board of Regents on Behalf of the University of Arizona): Systems and methods for securing wireless communications
- CN112840600A (priority 2018-08-20, published 2021-05-25, Telefonaktiebolaget LM Ericsson): Immune system for improving sites using generative adversarial networks and reinforcement learning
- WO2021136070A1 (priority 2019-12-30, published 2021-07-08, Sunwave Communications): Resource allocation method for simultaneous wireless information and power transfer, device, and computer
- CN111726845A (priority 2020-07-01, published 2020-09-29, Nanjing University): Base station switching selection and power distribution method in multi-user heterogeneous network system
- CN112087749A (priority 2020-08-27, published 2020-12-15, North China Electric Power University (Baoding)): Cooperative active eavesdropping method for realizing multiple listeners based on reinforcement learning
- CN114363908A (priority 2022-01-13, published 2022-04-15, Chongqing University of Posts and Telecommunications): A2C-based unlicensed spectrum resource sharing method
Non-Patent Citations (4)
- Delin Guo: "A Proactive Eavesdropping Game in MIMO Systems Based on Multiagent Deep Reinforcement Learning", IEEE Transactions on Wireless Communications
- Delin Guo: "Eavesdropping Game Based on Multi-Agent Deep Reinforcement Learning", 2022 IEEE 23rd International Workshop on Signal Processing Advances in Wireless Communication (SPAWC)
- Wu Wei, Hu Bing, Hu Feng: "Design of a legitimate communication rate maximization method in a full-duplex-based active monitoring system", Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), no. 02
- Li Yinan: "Game-theory-based intrusion detection model for mobile Ad hoc networks", Journal of Electronics & Information Technology
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant