CN114553274A - Secure self-precoder optimization method based on adversarial learning - Google Patents


Publication number
CN114553274A
CN114553274A (application CN202210112026.XA); granted as CN114553274B
Authority
CN
China
Prior art keywords
training
eve
parameters
self-precoding
Prior art date
Legal status
Granted
Application number
CN202210112026.XA
Other languages
Chinese (zh)
Other versions
CN114553274B (en)
Inventor
郑重 (Zheng Zhong)
王新尧 (Wang Xinyao)
费泽松 (Fei Zesong)
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology (BIT)
Priority to CN202210112026.XA
Publication of CN114553274A
Application granted
Publication of CN114553274B
Legal status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B7/00Radio transmission systems, i.e. using radiation field
    • H04B7/02Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas
    • H04B7/04Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas
    • H04B7/0413MIMO systems
    • H04B7/0456Selection of precoding matrices or codebooks, e.g. using matrices antenna weighting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B17/00Monitoring; Testing
    • H04B17/30Monitoring; Testing of propagation channels
    • H04B17/391Modelling the propagation channel
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a secure self-precoder optimization method based on adversarial learning, belonging to the technical field of wireless-communication physical-layer security. Aimed at the non-convex secrecy-rate optimization problem and the high complexity of large-scale multi-antenna systems, the invention designs an autoencoder-based secure transceiver training framework that supports multi-user, multi-antenna, multi-data-stream transmission, and jointly optimizes the signal modulation and the spatial precoder at the transmitter, so that the receiver of the legitimate user demodulates at an extremely low symbol error rate and recovers the correct secret information, while the receiver of the eavesdropping user cannot demodulate correctly and obtains only erroneous secret information. The trained secure transmitter improves the communication reliability of the legitimate user while significantly reducing that of the eavesdropper, thereby realizing secure transmission. In addition, by introducing the adversarial learning strategy, the invention shortens the model convergence time, reduces the spatial signal-processing complexity, and improves the secure transmission efficiency.

Description

Secure self-precoder optimization method based on adversarial learning
Technical Field
The invention relates to a secure self-precoder optimization method based on adversarial learning, belonging to the technical field of wireless-communication physical-layer security.
Background
Massive Multiple-Input Multiple-Output (MIMO) is a key enabling technology of next-generation (B5G/6G) mobile communication. It provides higher physical-layer data rates and, by enlarging the antenna array, markedly increases the spatial degrees of freedom available to signal processing, offering greater potential for precoding-based physical-layer security design and reliability gains. However, a massive antenna array typically integrates dozens to hundreds of antenna elements on a panel, and, especially in spatial multi-stream transmission scenarios, the number of radio-frequency chains grows with the number of antennas, further increasing the transmitter's signal-processing complexity and hardware overhead such as power amplification. Meanwhile, deep-learning-based end-to-end transceiver design has attracted wide attention in academia and industry in recent years: the transmitter and receiver networks are trained jointly using measured channel-environment data and prior knowledge. This differs from conventional communication-system design, which is realized by cascading independently optimized functional modules, a structure under which the overall system performance is not optimal. An end-to-end design based on a deep-learning autoencoder (AE), by contrast, can assign a unified objective function to multiple cascaded modules, optimize them jointly, and achieve global optimization.
In addition, deep learning can offload the computational cost of large-scale array signal processing to an offline training stage and guide model training in a data-driven manner, thereby shortening the online signal-computation time; this advantage over conventional algorithms is especially pronounced for high-order signal modulation and systems with large numbers of antennas.
Currently, deep autoencoders are beginning to be applied to physical-layer secure communication, and existing work offers partial solutions. X. L. Zhang adopted a supervised-learning-based secure precoding design that first jointly optimizes the precoding direction vectors and the power-allocation vector with an iterative optimization and water-filling algorithm, solving for a suboptimal signal covariance matrix over the MIMO channel. However, although such supervised security designs reduce computational complexity compared with conventional schemes, their achievable performance is always upper-bounded by the conventional scheme that generates the labels. C. H. Lin therefore studied an end-to-end physical-layer security scheme based on a variational autoencoder, in which the objective function guiding model updates is the sum of three terms that respectively drive the communication rate, the security performance, and the noise adaptability; the security term is realized by minimizing the mutual information between the secret information and the precoded symbols, with the mutual information characterized by a correlation function. Li studied a Mutual Information Neural Estimation (MINE) network model that can approximate the mutual information between the input and output distributions of a neural network, opening the door to information-theoretic physical-layer secure communication designed with deep learning. In addition, R. Fritschek studied an end-to-end secure autoencoder scheme based on user error rate, which designs a neural-network-based secure transmitter through an objective function that maximizes the eavesdropper's error rate while minimizing the legitimate user's error rate.
Most physical-layer security techniques designed for MIMO channels in these schemes are based on supervised learning and therefore struggle to overcome the limitation that the achievable secrecy rate is bounded by the conventional method. The unsupervised secure-autoencoder schemes target secure channel coding or higher-layer symmetric encryption rather than large-scale array signal processing such as secure constellation and secure precoding design, and most of their system simulations are conducted with small antenna arrays or single-antenna systems. Moreover, as artificial intelligence becomes tightly integrated with mobile communication, an illegitimate eavesdropping user can also acquire prior knowledge of the transmitter through blind modulation recognition, transmitter fingerprinting, and similar techniques, improving its ability to receive and crack the secret information and further increasing the risk that the legitimate system is eavesdropped.
Disclosure of Invention
Aimed at the non-convex secrecy-rate optimization problem and the high complexity of future large-scale multi-antenna systems, the invention provides a secure self-precoder optimization method based on adversarial learning. By designing an autoencoder-based secure transceiver training framework that simultaneously supports multi-user, multi-antenna, multi-data-stream transmission, the signal modulation and the spatial precoder are jointly optimized at the transmitter, so that the legitimate user's receiver demodulates at an extremely low symbol error rate and recovers the correct secret information, while the eavesdropper's receiver cannot demodulate correctly and obtains only erroneous secret information. The trained secure transmitter improves the communication reliability of the legitimate user while greatly reducing that of the eavesdropper, thereby realizing secure transmission.
The purpose of the invention is achieved through the following technical scheme:
Aimed at the non-convex secrecy-rate optimization problem and the high complexity of large-scale MIMO systems, the secure self-precoder is trained based on adversarial learning. The cascaded modulation module and spatial precoding module are designed through joint optimization; a secure transmitter constellation and fully digital beamforming vectors are designed; and a Secure Autoprecoder (SAP) is obtained through training. Meanwhile, an iterative adversarial-learning training framework is introduced: an eavesdropping receiver with stronger symbol-detection capability is evolved under the condition that the secure transmitter's parameters are known, and, against this evolved eavesdropping receiver, an Adversarial Secure Autoprecoder (ASAP) with higher information reliability for the legitimate user is trained adversarially. The trained secure transmitter improves the communication reliability of the legitimate user while greatly reducing that of the eavesdropper, thereby realizing secure transmission.
The secure self-precoder optimization method based on adversarial learning disclosed by the invention comprises the following steps:
step one, setting system parameters of an MIMO communication system based on a self-encoder frame, wherein the system parameters comprise: the number of antennas M, N of the transmitter Alice, the legal user Bob and the eavesdropping user EveBAnd NEBit information R of each symbol, length J of symbol sequence, channel multipath quantity L and channel parameter alphal,θl
Figure BDA0003492490380000021
Distribution of (d), transmit power constraint p, signal-to-noise ratio SNR; setting a neural network model structure, training/testing dataset parameters, and training hyper-parameters, the training hyper-parameters comprising: selected optimizer, training round Epoch, sample length per round Batch Size.
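For concreteness, the parameter set of step one can be collected into a single configuration object. The antenna and signal values below are taken from the embodiment later in this document (M = 64, N_B = N_E = 2, R = 4, J = 1, L = 1); the power, SNR, optimizer, epoch, and batch-size values are illustrative placeholders, not figures stated in the patent.

```python
# Hypothetical system/training configuration for the autoencoder-based MIMO system.
# Antenna counts and signal parameters follow the embodiment; the remaining
# hyper-parameters are placeholder assumptions for illustration only.
config = {
    "M": 64,         # transmit antennas at Alice
    "N_B": 2,        # receive antennas at legitimate user Bob
    "N_E": 2,        # receive antennas at eavesdropper Eve
    "R": 4,          # information bits per symbol -> |M| = 2**R messages
    "J": 1,          # symbol sequence length (parallel data streams)
    "L": 1,          # number of channel multipaths
    "p": 1.0,        # maximum transmit power constraint (assumed)
    "snr_db": 10.0,  # operating signal-to-noise ratio (assumed)
    "optimizer": "adam",  # optimizer named in step 3.2.4
    "epochs": 100,        # training rounds (Epoch), assumed
    "batch_size": 256,    # samples per batch (Batch Size), assumed
}

num_messages = 2 ** config["R"]  # size of the secret-information set
```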
Step two: based on a deep-autoencoder training framework, build a MIMO communication system supporting multi-user, large-scale multi-antenna, multi-stream data transmission; design the multi-user average SER as the loss function; and train the transmitter network model formed by cascading the modulation module and the precoding module. Through spatial beamforming, the trained model achieves reliable multi-user transmission in the system under the limited transmit power.
Step 2.1: and designing a transmitting terminal neural network, wherein the network structure comprises a signal modulation module and a space pre-coding module.
The transmitting symbol corresponding to the jth secret information after the antenna mapping of the transmitting end is Xj
Figure BDA0003492490380000031
Wherein the content of the first and second substances,
Figure BDA0003492490380000032
and
Figure BDA0003492490380000033
respectively representing a modulation module and a precoding module; m isjIs a secret information to be transmitted, from a predetermined limited set of secret information
Figure BDA0003492490380000034
Is obtained in (1). The modulation symbol is output after passing through the modulation neural network module
Figure BDA0003492490380000035
sjEstimating channel parameters with the transmitting end
Figure BDA0003492490380000036
Combining and inputting precoding neural network module together
Figure BDA0003492490380000037
Carrying out precoding operation on the modulation symbols to obtain a precoded signal Xj. Likewise, J XjThe method can process and send in parallel to realize multi-stream signal transmission of the MIMO system. All parameters in the training and testing data sets, including channel parameters and signal parameters, are expressed in a manner that real parts and imaginary parts are separated, that is, all channels and signals in the system are characterized as a real matrix.
All transmitting end networks adopt a Fully-Connected Neural Network (FCNN) and a modulation module Neural Network
Figure BDA0003492490380000038
The calculation process for the secret information sequence m is represented as follows:
Figure BDA0003492490380000039
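A fully-connected forward pass of the form in equation (2) can be sketched in NumPy as follows. The two-layer architecture, ReLU activation, and random weights are illustrative assumptions and not the patent's actual network configuration; the output pair is read as the real and imaginary parts of one modulation symbol.

```python
import numpy as np

def fcnn_forward(x, layers):
    """Eq. (2)-style FCNN: x -> sigma_G(W_G ... sigma_1(W_1 x + b_1) ... + b_G)."""
    for W, b, sigma in layers:
        x = sigma(W @ x + b)
    return x

relu = lambda z: np.maximum(z, 0.0)
identity = lambda z: z

rng = np.random.default_rng(1)
num_messages = 16             # |M| = 2**R with R = 4 bits per symbol
m = np.eye(num_messages)[3]   # one-hot encoding of secret message m_j = 3

# Illustrative 2-layer modulation network f_M: one-hot message -> 2 real outputs
# (interpreted as the real and imaginary parts of the modulation symbol s_j).
layers = [
    (rng.standard_normal((32, num_messages)), np.zeros(32), relu),
    (rng.standard_normal((2, 32)), np.zeros(2), identity),
]
s_j = fcnn_forward(m, layers)
```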
where σ_g(·), W_g, and b_g denote the activation function, weight matrix, and bias vector of the g-th layer of the modulation network f_M(·), respectively. All modulated symbol sequences s are then combined with the channel parameters Ĥ estimated at the transmitter to form new training samples U = [s, Ĥ], which serve as the input of the spatial precoding network f_P(·). The computation of f_P(·) on the matrix U combining the modulation symbols and the channel is expressed as:

X = f_P(U) = σ_T(W_T σ_{T−1}(··· σ_1(W_1 U + b_1) ···) + b_T)   (3)

where σ_t(·), W_t, and b_t denote the activation function, weight matrix, and bias vector of the t-th layer of the precoding network f_P(·). To limit the transmit signal power to ‖X‖_F² ≤ p, the T-th layer of f_P(·) is designed as a power-constraint layer with a custom activation function σ_T(·):

σ_T(X) = √p · X / ‖X‖_F   (4)

where ‖X‖_F denotes the Frobenius norm of the matrix X and p denotes the maximum transmit power. Step 2.1 thus yields a power-normalized transmit signal mapped to the antenna ports.
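The custom power-constraint activation of equation (4) amounts to rescaling the precoded matrix so its Frobenius norm matches the power budget. A minimal NumPy sketch (assuming, for simplicity, that the normalization is always applied, so the output power is exactly p):

```python
import numpy as np

def power_constraint(X, p):
    """Eq. (4)-style activation: rescale X so that ||X||_F^2 equals the budget p."""
    fro = np.linalg.norm(X, "fro")
    return np.sqrt(p) * X / fro

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 1))        # precoded signal for M = 64 antennas
X_norm = power_constraint(X, p=1.0)
power = np.linalg.norm(X_norm, "fro") ** 2   # equals p up to rounding
```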
Step 2.2: and designing a receiving end neural network, which comprises a received signal detection module and a probability mapping module.
The legal user Bob and the eavesdropping user Eve are regarded as two legal users in the step, and two receiver models with the same network structure are built. J ththSignal X with normalized powerjRespectively reaches a receiving end through respective MIMO channels of Bob and Eve, and the j-th channel received by Bob and EvethA signal YB,jAnd YE,jRespectively, as follows:
YB,j=HBXj+nB (5)
YE,j=HEXj+nE (6)
wherein the content of the first and second substances,
Figure BDA0003492490380000047
and
Figure BDA0003492490380000048
representing additive white gaussian noise.
At the receiving end, the receiver networks of Bob and Eve adopt the same network structure, which is respectively expressed as:
Figure BDA0003492490380000049
and
Figure BDA00034924903800000410
j-th of receiver recovery for Bob and EvethThe secret information is represented as:
Figure BDA00034924903800000411
Figure BDA00034924903800000412
wherein alpha isBB,
Figure BDA00034924903800000413
Individual watchShowing channel parameters estimated by a Bob receiving end; alpha is alphaEE,
Figure BDA00034924903800000414
Respectively representing the channel parameters estimated by an Eve receiving end;
the last layer of the receiving end neural network adopts a Softmax activation function to respectively output prediction probability vectors P of Bob and EveBAnd PE(ii) a The probability vector represents
Figure BDA00034924903800000415
And
Figure BDA00034924903800000416
the predicted secret information is a set of secret information
Figure BDA00034924903800000417
The probability corresponding to a certain secret information.
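The forward pass through equation (5) and the softmax output layer can be sketched as follows. The noise level is an assumption, and a single random linear layer stands in for Bob's receiver network g_B; the real/imaginary split follows the representation convention of step 2.1.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
M, N_B, num_messages = 64, 2, 16

X_j = rng.standard_normal((M, 1)) + 1j * rng.standard_normal((M, 1))
H_B = rng.standard_normal((N_B, M)) + 1j * rng.standard_normal((N_B, M))
n_B = 0.1 * (rng.standard_normal((N_B, 1)) + 1j * rng.standard_normal((N_B, 1)))

Y_B = H_B @ X_j + n_B   # eq. (5): received signal at Bob

# Real-valued feature vector with real/imaginary parts separated (step 2.1).
y = np.concatenate([Y_B.real.ravel(), Y_B.imag.ravel()])

W = rng.standard_normal((num_messages, y.size))  # stand-in for receiver g_B
P_B = softmax(W @ y)                             # prediction probability vector
```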
Step 2.3: based on the categorical cross-entropy loss, design a multi-user average cross-entropy loss function L_CE and update the self-precoder model parameters with a back-propagated gradient-descent strategy.

The average cross-entropy loss L_CE of the legitimate user Bob and the eavesdropper Eve is designed as:

L_CE = E_Ω[ −(1/(2J)) Σ_{j=1}^{J} Σ_{m∈M} P_{j,m} (log P̂^B_{j,m} + log P̂^E_{j,m}) ]   (9)

where P_{j,m} denotes the element in row j, column m of the one-hot encoding matrix P of the transmitted secret information sequence m; P̂^B_{j,m} and P̂^E_{j,m} denote the elements in row j, column m of the probability prediction matrices P̂^B and P̂^E of the receiver networks of Bob and Eve, respectively; and E_Ω[·] denotes the average of the loss over a Batch-Size number of data samples drawn under the discrete parameter set Ω, where the Batch Size is the number of training samples fed into the neural network per batch.

By minimizing the above average cross-entropy loss L_CE with an optimizer of the TensorFlow deep-learning framework, the secret information recovered at the receivers of Bob and Eve is processed by the end-to-end neural network in an unsupervised training procedure, so that the signals recovered at the receiving ends agree with those at the transmitter and reliable transmission of the secret information is achieved. Both Eve and Bob thereby obtain optimal receivers for this channel scenario; meanwhile, the trained Eve is regarded as an optimal autoencoder-trained eavesdropper, and the subsequent security design is carried out against this Eve receiver.
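The multi-user average cross-entropy of equation (9) — one-hot labels scored against both receivers' softmax outputs and averaged over the batch — can be sketched as follows; the toy batch shapes and the numerical-stability epsilon are assumptions.

```python
import numpy as np

def avg_cross_entropy(P, P_hat_B, P_hat_E, eps=1e-12):
    """Eq. (9)-style loss: mean over the batch of the averaged Bob/Eve
    categorical cross-entropies against the one-hot label matrix P."""
    ce_B = -np.sum(P * np.log(P_hat_B + eps), axis=-1)  # per-symbol CE for Bob
    ce_E = -np.sum(P * np.log(P_hat_E + eps), axis=-1)  # per-symbol CE for Eve
    return np.mean(0.5 * (ce_B + ce_E))

# Toy batch: 4 symbols, |M| = 4 classes, one-hot labels.
P = np.eye(4)
perfect = np.eye(4) * (1 - 3e-12) + 1e-12   # near-perfect predictions
uniform = np.full((4, 4), 0.25)             # blind-guess predictions

loss_good = avg_cross_entropy(P, perfect, perfect)  # near 0
loss_bad = avg_cross_entropy(P, uniform, uniform)   # near log(4)
```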
Step three: based on the multi-user, multi-stream MIMO self-precoder model built in step two, design a new security loss function L_S by introducing an ambiguity matrix P̄ aimed at the eavesdropper; guide the model training with this new security loss function to endow the self-precoder with security properties and generate a new secure constellation, ensuring that the receiver of the legitimate user Bob can complete symbol detection while the receiver of the eavesdropper Eve cannot do so correctly.

Step 3.1: analogously to step two, to realize secure transmission of physical-layer signals, the new security loss function L_S built on the ambiguity matrix P̄ for the eavesdropper is designed as:

L_S = E_Ω[ −(1/(2J)) Σ_{j=1}^{J} Σ_{m∈M} (P_{j,m} log P̂^B_{j,m} + P̄_{j,m} log P̂^E_{j,m}) ]   (10)

where P_{j,m}, P̂^B_{j,m}, and P̂^E_{j,m} have the same meaning as in equation (9). The ambiguity matrix P̄ is introduced to confuse the eavesdropper's receiver; P̄_{j,m}, the element in row j, column m of P̄, is written as:

P̄_{j,m} = 1/|M|, for all j and m   (11)

By the principle of the cross-entropy loss, as training proceeds the prediction probability matrix of the eavesdropper Eve's receiver is driven closer to the ambiguity matrix P̄, so that Eve judges the received symbol to belong to every class with equal probability and cannot perform symbol detection, while Bob can still detect the received symbols correctly; secure transmission of the secret information is thereby realized.
Step 3.2: using the new security loss function L_S designed in step 3.1, perform security training with the eavesdropper's receiver parameters held fixed, and train to obtain the SAP.

Step 3.2.1: first determine the total number of training iterations N and initialize n = 1; read the pre-trained autoencoder model parameters from step two, including the initial network parameters Φ_A of the transmitter Alice and the initial receiver network parameters Φ_B and Φ_E of Bob and Eve.

Step 3.2.2: initialize the channel parameters of the training samples and the corresponding one-hot encoded labels, and read the training dataset.

Step 3.2.3: determine the training hyper-parameters: the optimizer learning rate, the number of training rounds, the number of samples per batch, and the split ratio between the training and validation datasets.

Step 3.2.4: start training; based on the loss function (10), update the network model parameters Φ′_A and Φ′_B with the Adam optimizer while Eve's receiver parameters remain fixed.

Step 3.2.5: set n = n + 1; end the training when n = N.
Step four: introduce an adversarial learning mechanism and, combining it with the security loss function of step 3.1, design a target loss function L_E for the eavesdropper. Divide the whole self-precoder into two links: the legitimate link (Main Chain), comprising Alice's transmitter and Bob's receiver network, and the eavesdropping link (Eve Chain), comprising Eve's receiver network. Based on the pre-trained model of step two, design a two-part iterative adversarial training algorithm to obtain the adversarial secure autoprecoder model ASAP.

Step 4.1: introduce the adversarial learning mechanism and, combining it with the security loss function of step 3.1, design the target loss function L_E for the eavesdropper:

L_E = E_Ω[ −(1/J) Σ_{j=1}^{J} Σ_{m∈M} P_{j,m} log P̂^E_{j,m} ]   (12)

where P_{j,m} and P̂^E_{j,m} have the same meaning as in equation (10). The purpose of this loss function is, after the secure-transmitter training of step 3.2 is completed, to continue optimizing the virtual eavesdropping receiver against the secure transmitter so that it attains a lower SER.

Step 4.2: divide the whole self-precoder into the legitimate link (Main Chain), comprising Alice's transmitter and Bob's receiver network, and the eavesdropping link (Eve Chain), comprising Eve's receiver network; based on the pre-trained model of step two, design the two-part iterative adversarial training algorithm to obtain the adversarial secure autoprecoder ASAP.
Step 4.2.1: first determine the total number of iterations N; determine the training hyper-parameters: the optimizer learning rate, the number of training rounds, the number of samples per batch, and the split ratio between the training and validation datasets.

Step 4.2.2: check the iteration round n. If n = 1, read the pre-trained self-precoder model parameters from step two, including the initial network parameters Φ_A of the transmitter Alice and the initial receiver network parameters Φ_B and Φ_E of Bob and Eve; if n ≠ 1, read the model parameters updated in round n − 1: Φ_A = Φ′_A, Φ_B = Φ′_B, Φ_E = Φ′_E.

Step 4.2.3: initialize the channel parameters H of the training samples and read the training dataset with the corresponding one-hot encoded labels P.

Step 4.2.4: set the training count epoch_1, freeze the network parameters of the Eve Chain, train the Main Chain network model with the loss function (10), and update the parameters Φ_A = Φ′_A, Φ_B = Φ′_B.

Step 4.2.5: set the training count epoch_2, freeze the network parameters of the Main Chain, train the Eve Chain network model with the loss function (12), and update the parameters Φ_E = Φ′_E.

Step 4.2.6: set n = n + 1 and return to step 4.2.2; steps 4.2.3 to 4.2.5 are executed repeatedly until n = N, at which point the training ends.
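The control flow of steps 4.2.1-4.2.6 alternates updates of the Main Chain and the Eve Chain. It can be sketched with placeholder train functions; in the real algorithm these would be Adam steps on losses (10) and (12) with the other chain's parameters frozen, and the counter-style parameters below are purely illustrative.

```python
def adversarial_training(n_rounds, epoch_1, epoch_2, train_main, train_eve, params):
    """Iterative adversarial schedule of steps 4.2.2-4.2.6: in each round, first
    train the Main Chain with the Eve Chain frozen (loss (10)), then train the
    Eve Chain with the Main Chain frozen (loss (12))."""
    for _ in range(n_rounds):          # step 4.2.6: repeat until n = N
        for _ in range(epoch_1):       # step 4.2.4: Eve Chain frozen
            params = train_main(params)
        for _ in range(epoch_2):       # step 4.2.5: Main Chain frozen
            params = train_eve(params)
    return params

# Placeholder updates that only record which chain was trained and how often.
log = []

def train_main(p):
    log.append("main")
    return {**p, "phi_A": p["phi_A"] + 1, "phi_B": p["phi_B"] + 1}

def train_eve(p):
    log.append("eve")
    return {**p, "phi_E": p["phi_E"] + 1}

params = adversarial_training(2, 3, 2, train_main, train_eve,
                              {"phi_A": 0, "phi_B": 0, "phi_E": 0})
```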
Step five: with the adversarial secure autoprecoder ASAP obtained by training in step four, in a new secure-transmission scenario, collect a small number of channel samples and continue executing step four to fine-tune the model; then use the parameter-updated secure self-precoder model to perform joint modulation-and-precoding optimization of the secret information, obtaining a transmit signal that is confidential with respect to the target eavesdropping user. The legitimate user Bob thereby enjoys high reliability while the eavesdropper Eve attains only blind-guess-level symbol-detection performance, realizing secure transmission.
The whole procedure of the adversarial-learning-based secure self-precoder optimization method is completed through steps one to five.
Advantageous effects:
1. The secure self-precoder optimization method based on adversarial learning disclosed by the invention can jointly optimize the signal modulation module and the spatial precoding module of the transmitter under the maximum-transmit-power constraint and produce a brand-new secure receive constellation; it significantly reduces the information reliability of the eavesdropper's receiver while improving the information-decoding reliability of the legitimate user's receiver, realizing secure transmission of secret information under any transmit-receive antenna configuration.
2. By introducing an adversarial learning strategy on top of the secure self-precoder, the disclosed method alternately and iteratively trains the legitimate-link and eavesdropping-link networks, continuously improving the reliability and security of the legitimate user's secret-information transmission even when the eavesdropper has active learning capability; compared with the secure self-precoder optimization without adversarial learning, it also reduces the model convergence time and the spatial signal-processing complexity while improving the secure transmission efficiency.
3. While guaranteeing information security and reliability, the disclosed method effectively improves the communication throughput under limited transmit power by designing a reasonable receive-constellation deployment, achieving a balance between communication security and effectiveness.
Drawings
FIG. 1 is a flowchart of the overall training framework of the adversarial secure autoprecoder in the adversarial-learning-based secure self-precoder optimization method and embodiment of the present invention;
FIG. 2 is a schematic diagram of the joint modulation constellation based on the deep autoencoder in the method and embodiment of the present invention;
FIG. 3 is a schematic diagram of the adversarial-learning-based iterative training algorithm of the secure transmitter and the eavesdropping receiver in the method and embodiment of the present invention;
FIG. 4 compares the SER-versus-SNR performance of the legitimate user Bob and the eavesdropper Eve under the secure autoprecoder SAP and the adversarial secure autoprecoder ASAP training frameworks in the method and embodiment of the present invention;
FIG. 5 compares the received-signal constellations of the legitimate user Bob and the eavesdropper Eve under the two secure training frameworks in the method and embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and examples, together with the technical problems solved and the advantages of the technical solutions. It should be noted that the described embodiments are intended only to facilitate understanding of the present invention and have no limiting effect.
This embodiment details the steps of the adversarial-learning-based secure self-precoder optimization method under a specific system parameter configuration, autoencoder network parameter configuration, and training hyper-parameter configuration.
The embodiment considers a three-node MIMO wiretap-channel scenario, with the whole communication system built on the deep-autoencoder architecture. In this system, Alice denotes the secure transmitter trained on the neural-network model, and Bob denotes the legitimate-user receiver obtained by end-to-end training; Eve denotes the eavesdropping-user receiver, likewise trained end to end. After model pre-training is finished, Eve can be trained into a receiver whose reception and demodulation performance is essentially identical to Bob's.
The channel follows the 5G millimeter-wave channel model below (the invention places no constraint on the channel model itself; it is used here only to illustrate the algorithm implementation):

H = √(M·N/L) · Σ_{l=1}^{L} α_l · a_r(φ_l) · a_t(θ_l)^H

where L denotes the number of scattering paths; α_l denotes the complex channel gain of the l-th path; a_r(·) denotes the receive antenna array response vector; a_t(·) denotes the transmit antenna array response vector; and φ_l and θ_l denote the angle of arrival and angle of departure, respectively. For a K-element uniform linear array (ULA), the antenna array response vector is expressed as:

a(θ) = (1/√K) · [1, e^{j2π(d/λ)sinθ}, …, e^{j2π(K−1)(d/λ)sinθ}]^T

where K equals the number of transmit antennas T at the transmitter, d denotes the spacing between adjacent antennas on the antenna panel, and λ is the electromagnetic wavelength. Meanwhile, the perfect channel information H_B and H_E of Bob and Eve is assumed known to Alice, and the same holds at the receiving ends.
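The channel generation described above can be sketched in NumPy as follows. This is an illustrative sketch only: α_l is assumed to follow CN(0, 1) and the angles to be uniform on [0, 2π) (the exact distributions appear only as an image in the original text), and the function names are not from the patent.

```python
import numpy as np

def ula_response(theta, K, d_over_lambda=0.5):
    """Array response of a K-element uniform linear array (ULA)."""
    k = np.arange(K)
    return np.exp(1j * 2 * np.pi * d_over_lambda * k * np.sin(theta)) / np.sqrt(K)

def mmwave_channel(M, N, L, rng):
    """Narrowband multipath mmWave channel H (N x M): sum of L scaled outer products."""
    H = np.zeros((N, M), dtype=complex)
    for _ in range(L):
        alpha = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)  # assumed CN(0,1) gain
        phi = rng.uniform(0, 2 * np.pi)    # angle of arrival (assumed uniform)
        theta = rng.uniform(0, 2 * np.pi)  # angle of departure (assumed uniform)
        H += alpha * np.outer(ula_response(phi, N), ula_response(theta, M).conj())
    return np.sqrt(M * N / L) * H

rng = np.random.default_rng(0)
H = mmwave_channel(M=64, N=2, L=1, rng=rng)
print(H.shape)  # (2, 64)
```

With the embodiment's configuration (M = 64, N_B = N_E = 2, L = 1) each realization is a rank-one 2 × 64 matrix.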
As shown in fig. 1, the adversarial-learning-based secure self-precoding machine optimization method disclosed in this embodiment includes the following specific implementation steps:
step one, setting the system parameters of the self-encoder-framework-based MIMO communication system, wherein the system parameters comprise: the numbers of antennas M, N_B and N_E of the transmitter Alice, the legitimate user Bob and the eavesdropping user Eve; the number of information bits R per symbol; the symbol sequence length J; the number of channel multipaths L; the distributions of the channel parameters α_l, θ_l and φ_l; the transmit power constraint p; and the signal-to-noise ratio SNR; setting the neural network model structure, the training/testing data set parameters, and the training hyper-parameters, the training hyper-parameters comprising: the selected optimizer, the number of training rounds (Epoch), and the number of samples per batch (Batch Size).

Regarding the system configuration parameters: the number of transmitter antennas is set to M = 64; the number of antennas of the legitimate user Bob is N_B = 2; the number of antennas of the eavesdropping user Eve is N_E = 2; the transmitted symbol sequence length is J = 1, with each symbol carrying R = 4 bits of information; the number of channel multipaths is L = 1. The relative positions of Alice, Bob and Eve vary randomly with the discrete channel parameters, whose distribution is: α_l follows the standard complex Gaussian distribution, and the angle parameters θ_l and φ_l follow the uniform distribution. The maximum transmit power is p = 1. Meanwhile, to adapt to multiple signal-to-noise-ratio scenarios, a data augmentation method is adopted: the SNR values in the training data set are set to follow a uniform distribution over the simulated SNR range.

Regarding the model configuration hyper-parameters: all neural network modules adopt a Fully-Connected Neural Network (FCNN). The modulation module uses a 5-layer FCNN with [512, 256, 128, 32, 2] neurons per layer; the precoding module uses a 5-layer FCNN with [512, 512, 256, 256, 128] neurons per layer; the receiver modules of Bob and Eve each use a 4-layer FCNN with [128, 64, 16] neurons per layer as listed. In addition, the activation function of each layer is the Rectified Linear Unit (ReLU), and model parameters are updated with the Adam optimizer. Furthermore, 10^6 data samples are generated from the system configuration parameters above, each comprising the discrete channel parameters (α_l, θ_l, φ_l), the secret information sequence m = {m_1, m_2, …, m_J} sent under each channel realization, and the one-hot encoder (OE) matrix corresponding to each secret information sequence; this matrix is used as the label of each sample at the receiving end. The number of iterative training rounds is set to 50, the number of model training epochs per iteration to 30, and the number of samples per batch (Batch Size) fed into the network to 512.
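The data-set construction described above — random secret sequences plus their one-hot label matrices — can be sketched as follows. The sample count is reduced from the embodiment's 10^6 for illustration, and the array layout is an assumption, not the patent's actual data pipeline.

```python
import numpy as np

R, J = 4, 1                  # bits per symbol, symbols per sequence
num_classes = 2 ** R         # size of the secret-information set (16)
n_samples = 1000             # 10**6 in the embodiment; reduced here

rng = np.random.default_rng(1)
messages = rng.integers(0, num_classes, size=(n_samples, J))  # secret sequences m

# One-hot encoding matrix: row j of each sample marks which message m_j is.
labels = np.zeros((n_samples, J, num_classes))
idx_sample = np.arange(n_samples)[:, None]   # (n_samples, 1)
idx_pos = np.arange(J)[None, :]              # (1, J)
labels[idx_sample, idx_pos, messages] = 1.0

print(labels.shape)  # (1000, 1, 16)
```

Each label row sums to one, which is what makes it usable directly as the target of a Softmax output layer.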
Step two: based on the model architecture and training method of the deep autoencoder, build a MIMO communication system supporting multi-user, massive multi-antenna, multi-stream data transmission. Design a transmitter network model in which a signal modulation module and a spatial precoding module are cascaded, design a multi-user average-SER loss function, and achieve reliable multi-user transmission in the system through spatial beamforming.
Step 2.1: design the transmitting-end neural network, comprising a cascaded modulation module and spatial precoding module.

The transmitted symbol corresponding to the j-th piece of secret information after antenna mapping at the transmitting end is X_j:

X_j = f_P(f_M(m_j), α̂, θ̂, φ̂)

where f_M(·) and f_P(·) respectively denote the modulation module and the precoding module; m_j is the secret information to be sent, and the modulation neural network module outputs the modulation symbol s_j = f_M(m_j); s_j is then combined with the discrete channel parameters (α̂, θ̂, φ̂) and input to the precoding neural network module f_P. Multiple X_j can be processed and transmitted in parallel to achieve multi-stream signaling for the MIMO system. It should be noted that mainstream deep learning frameworks based on TensorFlow cannot directly represent complex numbers; therefore, all parameters in the data set, including channel parameters and signal parameters, are represented with real and imaginary parts separated, i.e. all signals and channels are characterized as real matrices.
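One common way to implement the real/imaginary separation just described is sketched below; the exact tensor layout used by the embodiment is not specified, so the stacking order here is an assumption.

```python
import numpy as np

def complex_to_real(z):
    """Stack real and imaginary parts along the last axis (framework-friendly)."""
    return np.concatenate([z.real, z.imag], axis=-1)

def real_to_complex(x):
    """Inverse mapping: split the last axis back into real and imaginary halves."""
    half = x.shape[-1] // 2
    return x[..., :half] + 1j * x[..., half:]

H = np.array([[1 + 2j, 3 - 1j]])
X = complex_to_real(H)
print(X)  # [[ 1.  3.  2. -1.]]
```

The mapping is lossless, so channel matrices and signals can be fed to real-valued FCNN layers and converted back after processing.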
All network structures at the transmitting and receiving ends adopt FCNN. The computation of the modulation-module neural network f_M on the secret information sequence m is represented as follows:

s = σ_G(W_G σ_{G−1}(… σ_1(W_1 m + b_1) …) + b_G)

where σ_g, W_g and b_g respectively denote the activation function, weight matrix and bias vector of the g-th layer of the modulation-module neural network f_M. All modulated symbol sequences s are then combined with the estimated discrete channel parameters (α̂, θ̂, φ̂) into a new training sample u = [s, α̂, θ̂, φ̂], which serves as the input of the spatial-precoding-module neural network f_P. Thus, aided by the channel prior information, the modulation symbol sequence is mapped to the antennas in the precoding module, and the signal matrix to be transmitted is obtained as follows:

X = σ_T(W_T σ_{T−1}(… σ_1(W_1 u + b_1) …) + b_T)

where σ_t, W_t and b_t respectively denote the activation function, weight matrix and bias vector of the t-th layer of the precoding-module neural network f_P. In particular, to limit the transmit signal power ‖X‖² ≤ p, the T-th layer of f_P is designed as a power constraint layer adopting the custom activation function σ_T as follows:

σ_T(X) = √p · X/‖X‖

where ‖X‖ denotes the F-norm of the matrix X and p denotes the maximum transmit power constraint.
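A minimal NumPy sketch of a power-constraint layer of the kind just described, assuming the activation simply rescales X to Frobenius norm √p (the exact form of the custom activation appears only as an image in the original text).

```python
import numpy as np

def power_constraint(X, p=1.0):
    """Scale X so its Frobenius norm is sqrt(p), i.e. transmit power ||X||^2 == p."""
    return np.sqrt(p) * X / np.linalg.norm(X)

X = np.random.default_rng(2).standard_normal((64, 2))  # real-valued (re/im separated)
Xn = power_constraint(X, p=1.0)
print(np.linalg.norm(Xn) ** 2)  # ~1.0
```

In a TensorFlow model this would be the final non-trainable layer of the precoding network, so the power budget is enforced by construction rather than by a penalty term in the loss.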
Step 2.2: design the receiving-end neural network, comprising a received-signal detection module and a probability mapping module.

After power normalization, the j-th signal X_j reaches the receiving ends through the respective MIMO channels of Bob and Eve; the channel process is represented as follows:

Y_B,j = H_B X_j + n_B (19)

Y_E,j = H_E X_j + n_E (20)

where n_B and n_E represent additive white Gaussian noise.
At the receiving end, the receiver networks of Bob and Eve adopt the same network structure, denoted f_B and f_E respectively. The j-th piece of secret information recovered by the receivers of Bob and Eve is represented as:

m̂_B,j = f_B(Y_B,j, α̂_B, θ̂_B, φ̂_B) (21)

m̂_E,j = f_E(Y_E,j, α̂_E, θ̂_E, φ̂_E) (22)

where α̂_B, θ̂_B and φ̂_B respectively denote the channel parameters estimated at Bob's receiving end, and α̂_E, θ̂_E and φ̂_E those estimated at Eve's receiving end. As shown in fig. 2, the last layer of the receiving-end neural network adopts the Softmax activation function and outputs the prediction probability vectors P_B and P_E of Bob and Eve respectively; the probability vector gives, for the predicted secret information m̂_B,j (resp. m̂_E,j), the probability of each element of the secret information set M.
Step 2.3: design the multi-user average cross-entropy loss function L_CE from the categorical cross-entropy loss function to guide the training of the self-precoding machine network.

The average cross-entropy loss function L_CE of Bob and Eve is designed as follows:

L_CE = −E_{(α,θ,φ)}[(1/2) Σ_{j=1}^{J} Σ_{m∈M} P_{j,m}(log P̂_{B,j,m} + log P̂_{E,j,m})] (23)

where P_{j,m} denotes the element in row j, column m of the one-hot encoding matrix P of the transmitted secret information sequence m; P̂_{B,j,m} and P̂_{E,j,m} denote the elements in row j, column m of the probability prediction matrices P̂_B and P̂_E of the receiver networks of Bob and Eve respectively; and E_{(α,θ,φ)}[·] denotes the average of the loss function computed over Batch Size data samples under the discrete parameter set (α, θ, φ), where Batch Size is the number of training samples fed into the neural network per batch.

The Adam optimizer provided with the TensorFlow deep learning framework is adopted to minimize the above average cross-entropy loss function L_CE, realizing the unsupervised training process.
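The two-user average cross-entropy loss of step 2.3 can be sketched in NumPy as follows. Averaging the two users' losses with a factor 1/2 is an assumption, since the exact formula appears only as an image in the original; the embodiment minimizes this quantity with TensorFlow's Adam optimizer rather than computing it by hand.

```python
import numpy as np

def avg_cross_entropy(P, P_hat_B, P_hat_E, eps=1e-12):
    """Two-user average categorical cross-entropy.

    P:          one-hot labels, shape (batch, J, 2**R)
    P_hat_B/E:  receiver softmax outputs, same shape
    """
    ce_B = -np.sum(P * np.log(P_hat_B + eps), axis=(1, 2))
    ce_E = -np.sum(P * np.log(P_hat_E + eps), axis=(1, 2))
    return np.mean(0.5 * (ce_B + ce_E))  # batch average of the two-user mean

# Toy check: perfect predictions give (near-)zero loss.
P = np.zeros((2, 1, 4)); P[0, 0, 1] = 1; P[1, 0, 3] = 1
print(avg_cross_entropy(P, P, P))  # ~0
```

When both receivers output the uniform distribution instead, the loss equals log(2^R), the blind-guess level.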
Step three: based on the multi-user, multi-stream MIMO self-precoding machine model built in step two, design a new security loss function L_S by introducing an obfuscation matrix targeted at the eavesdropping user. Guide model training with the new security loss function, endow the self-precoding machine with a security attribute, and generate a new secure constellation, ensuring that the receiving end of the legitimate user Bob can complete symbol detection while the receiving end of the eavesdropping user Eve cannot do so correctly.

Step 3.1: similarly to step two, to realize secure transmission of the physical-layer signal, the new security loss function L_S, designed with an obfuscation matrix P̄ targeted at the eavesdropping user, is represented as follows:

L_S = −E_{(α,θ,φ)}[(1/2) Σ_{j=1}^{J} Σ_{m∈M} (P_{j,m} log P̂_{B,j,m} + P̄_{j,m} log P̂_{E,j,m})] (24)

where P_{j,m}, P̂_{B,j,m} and P̂_{E,j,m} have the same meaning as in formula (23). The obfuscation matrix P̄ (written with a bar here to distinguish it from the one-hot label matrix P) is introduced to confuse the eavesdropping receiver; P̄_{j,m} denotes the element in row j, column m of P̄, which is written in the form:

P̄_{j,m} = 1/2^R, for all j = 1, …, J and m ∈ M (25)

According to the principle of the cross-entropy loss function, as training proceeds, the prediction probability matrix of the eavesdropping receiver Eve is driven toward the obfuscation matrix P̄, so that Eve judges a received symbol to belong to every class with equal probability and thus cannot perform symbol detection, while Bob can still detect correctly, achieving secure transmission of the confidential information.
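A sketch of the security loss of step 3.1, assuming the obfuscation matrix is uniform over the 2^R classes — the reading suggested by the text's statement that Eve should judge all classes with consistent probability. The function name is illustrative, not from the patent.

```python
import numpy as np

def security_loss(P, P_hat_B, P_hat_E, eps=1e-12):
    """Security loss: Bob trained toward the true labels P,
    Eve pushed toward a uniform obfuscation matrix P_bar."""
    num_classes = P.shape[-1]
    P_bar = np.full_like(P, 1.0 / num_classes)  # uniform obfuscation matrix
    term_B = -np.sum(P * np.log(P_hat_B + eps), axis=(1, 2))
    term_E = -np.sum(P_bar * np.log(P_hat_E + eps), axis=(1, 2))
    return np.mean(0.5 * (term_B + term_E))

P = np.zeros((1, 1, 4)); P[0, 0, 2] = 1
uniform = np.full((1, 1, 4), 0.25)
# Eve's term is minimized when Eve's output equals the uniform matrix,
# i.e. Eve is reduced to blind guessing among the 2**R classes.
print(security_loss(P, P, uniform))
```

Because the cross-entropy to a uniform target is minimized exactly at the uniform distribution, minimizing this loss pulls Eve's predictions toward blind guessing while keeping Bob's term at the ordinary supervised objective.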
Step 3.2: based on the new security loss function L_S designed in step 3.1, carry out security training with the parameters of the eavesdropping receiver kept fixed, obtaining the secure self-precoding machine SAP.

Step 3.2.1: first determine the total number of training rounds N and initialize n = 1; read the model parameters of the pre-trained self-precoding machine from step two, including the initialized network parameters Φ_A of the transmitter Alice and the initialized receiver network parameters Φ_B and Φ_E of Bob and Eve.

Step 3.2.2: initialize the channel parameters of the training samples and the corresponding one-hot encoded labels, and read the training data set.

Step 3.2.3: determine the training hyper-parameters, including the learning rate of the optimizer, the number of training rounds, the number of samples per batch, the split ratio between the training and validation data sets, etc.

Step 3.2.4: start training; based on the loss function (24), update the network model parameters Φ′_A and Φ′_B with the Adam optimizer, the parameters of the eavesdropping receiver Eve remaining fixed.

Step 3.2.5: n = n + 1; repeat until n = N, then end the training.
Step four: introduce an adversarial learning mechanism and, combined with the security loss function of step 3.1, design a target loss function L_E for the eavesdropping user. Divide the whole self-precoding machine into two sub-links, the legitimate link Main Chain and the eavesdropping link Eve Chain, where the Main Chain comprises the Alice transmitter and Bob's receiver network and the Eve Chain comprises Eve's receiver network; based on the pre-trained model of step two, design a two-part iterative adversarial training algorithm to obtain the adversarial secure self-precoding machine model ASAP.

Step 4.1: introduce the adversarial learning mechanism and, combined with the security loss function of step 3.1, design the target loss function L_E for the eavesdropping user:

L_E = −E_{(α,θ,φ)}[Σ_{j=1}^{J} Σ_{m∈M} P_{j,m} log P̂_{E,j,m}] (26)

where P_{j,m} and P̂_{E,j,m} have the same meaning as in formula (24). The purpose of this loss function is to continue optimizing the virtual eavesdropping receiver against the secure transmitter, so that it regains lower SER values after the secure-transmitter training of step 3.2 is completed.
Step 4.2: according to the iterative security training algorithm block diagram shown in fig. 3, divide the whole self-precoding machine model into the legitimate link Main Chain and the eavesdropping link Eve Chain, where the Main Chain comprises the Alice transmitter and Bob's receiver network and the Eve Chain comprises Eve's receiver network; based on the pre-trained model of step two, design the two-part iterative adversarial training algorithm to obtain the adversarial secure self-precoding machine ASAP.

Step 4.2.1: first determine the total number of iterations N; determine the training hyper-parameters, including the learning rate of the optimizer, the number of training rounds, the number of samples per batch, the split ratio between the training and validation data sets, etc.

Step 4.2.2: check the iteration round n. If n = 1, read the model parameters of the pre-trained self-precoding machine from step two, including the initialized network parameters Φ_A of the transmitter Alice and the initialized receiver network parameters Φ_B and Φ_E of Bob and Eve; if n ≠ 1, read the model parameters updated in round n − 1: Φ_A = Φ′_A, Φ_B = Φ′_B, Φ_E = Φ′_E.

Step 4.2.3: initialize the channel parameters (α, θ, φ) of the training samples and the corresponding one-hot encoded labels P, and read the training data set.

Step 4.2.4: set the training epochs epoch_1, freeze the network parameters of the Eve Chain, train the network model of the Main Chain according to the loss function (24), and update the parameters Φ′_A and Φ′_B.

Step 4.2.5: set the training epochs epoch_2, freeze the network parameters of the Main Chain, train the network model of the Eve Chain according to the loss function (26), and update the parameters Φ′_E.

Step 4.2.6: n = n + 1; return to step 4.2.2 and continue executing steps 4.2.3 to 4.2.5 until n = N, at which point the training ends.
Step five: with the adversarial secure self-precoding machine ASAP obtained in step four, in a new secure transmission scenario, collect a small number of channel samples and continue executing step four to fine-tune the model; then use the parameter-updated secure self-precoding machine model to perform joint optimization of modulation and precoding on the confidential information, obtaining transmit signals that are confidential with respect to the target eavesdropping user. The legitimate user Bob retains high reliability while the eavesdropping user Eve obtains only blind-guess-level symbol detection performance, thereby achieving secure transmission.
FIG. 4 shows SER-versus-SNR simulation results of the legitimate user Bob and the eavesdropping user Eve under the SAP and ASAP training frameworks in the adversarial-learning-based secure self-precoding machine optimization method and embodiment of the invention.
In fig. 4, the abscissa is the SNR over the range 0 to 30 dB, and the ordinate is the symbol error rate SER. The simulation experiment compares four cases: the single-user self-precoding machine S-AP, the two-user self-precoding machine M-AP, the secure self-precoding machine SAP, and the adversarial secure self-precoding machine ASAP. It can be seen that S-AP and M-AP give the performance upper and lower bounds, respectively. From the perspective of beamforming precision, the beam alignment of the S-AP is more accurate and its received signal power is high, whereas M-AP beamforming is a trade-off between users, so each user's performance degrades. Compared with the M-AP, the SAP performance curve benefits from beamforming that is explicitly biased toward the legitimate user, the other user being treated as an eavesdropper, so its performance improves slightly over the M-AP; meanwhile, Eve's SER in this scenario is almost equal to the probability of blind-guess detection, so security is well guaranteed. For the ASAP performance curves, both Bob and Eve gain somewhat from adversarial learning compared with SAP, but Bob's improvement is the more meaningful one: above 10 dB it can be an order of magnitude over SAP and approaches the best SER performance the self-precoding machine can provide, i.e. the S-AP curve.
FIG. 5 shows received-signal constellation simulation results of the legitimate user Bob and the eavesdropping user Eve under the two secure training frameworks in the adversarial-learning-based secure self-precoding machine optimization method and embodiment of the invention.
In fig. 5, the abscissa is the real part and the ordinate the imaginary part of the received constellation. The simulation experiment compares four received constellations: "Bob w AL" denotes Bob's received constellation under the ASAP framework; "Eve w AL" denotes Eve's under ASAP; "Bob wo AL" denotes Bob's under SAP; and "Eve wo AL" denotes Eve's under SAP. It can be seen that the received constellation before adversarial learning resembles traditional PSK constellation modulation: it exploits only the phase of the signal, the average distance between symbol cluster centers is small, and the symbol detection error probability is therefore higher. The constellation after adversarial learning resembles traditional QAM modulation: under the same power constraint it exploits both the amplitude and the phase of the signal, occupying the two-dimensional space more fully to control inter-symbol interference. Meanwhile, Eve's received constellation is relatively chaotic in both the SAP and ASAP cases; in particular, for Eve without adversarial learning the received constellation collapses into a cluster of noise points within a very small region, yielding the worst SER performance. Adversarial learning is therefore important for designing self-precoding machines with higher security and more reasonable secure modulation constellations.
The above detailed description is intended to illustrate the objects, aspects and advantages of the present invention, and it should be understood that the above detailed description is only exemplary of the present invention and is not intended to limit the scope of the present invention, and any modifications, equivalents, improvements and the like within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (6)

1. The adversarial-learning-based secure self-precoding machine optimization method, characterized in that it comprises the following steps:
step one, setting the system parameters of the self-encoder-framework-based MIMO communication system, wherein the system parameters comprise: the numbers of antennas M, N_B and N_E of the transmitter Alice, the legitimate user Bob and the eavesdropping user Eve; the number of information bits R per symbol; the symbol sequence length J; the number of channel multipaths L; the distributions of the channel parameters α_l, θ_l and φ_l; the transmit power constraint p; and the signal-to-noise ratio SNR; setting the neural network model structure, the training/testing data set parameters, and the training hyper-parameters, the training hyper-parameters comprising: the selected optimizer, the number of training rounds (Epoch), and the number of samples per batch (Batch Size);
step two, building a MIMO communication system supporting multi-user, massive multi-antenna, multi-stream data transmission based on the training framework of the deep autoencoder, designing the multi-user average SER as the loss function, and training the transmitter network model formed by the cascaded modulation module and precoding module; through spatial beamforming, the trained model achieves reliable multi-user transmission in the system under limited transmit power;
step three, based on the multi-user, multi-stream MIMO self-precoding machine model built in step two, designing a new security loss function L_S by introducing an obfuscation matrix targeted at the eavesdropping user, guiding model training with the new security loss function, endowing the self-precoding machine with a security attribute and generating a new secure constellation, ensuring that the receiving end of the legitimate user Bob can complete symbol detection while the receiving end of the eavesdropping user Eve cannot do so correctly;
step four, introducing an adversarial learning mechanism and, combined with the security loss function of step 3.1, designing a target loss function L_E for the eavesdropping user; dividing the whole self-precoding machine into two sub-links, the legitimate link Main Chain and the eavesdropping link Eve Chain, where the Main Chain comprises the Alice transmitter and Bob's receiver network and the Eve Chain comprises Eve's receiver network; and designing a two-part iterative adversarial training algorithm based on the pre-trained model of step two to obtain the adversarial secure self-precoding machine model ASAP;
step five, with the adversarial secure self-precoding machine ASAP obtained in step four, in a new secure transmission scenario, collecting a small number of channel samples and continuing to execute step four to fine-tune the model, then using the parameter-updated secure self-precoding machine model to perform joint optimization of modulation and precoding on the confidential information, obtaining transmit signals that are confidential with respect to the target eavesdropping user; the legitimate user Bob retains high reliability while the eavesdropping user Eve obtains only blind-guess-level symbol detection performance, thereby achieving secure transmission.
2. The adversarial-learning-based secure self-precoding machine optimization method of claim 1, characterized in that step one is implemented as follows:
setting the system parameters of the self-encoder-framework-based MIMO communication system, wherein the system parameters comprise: the numbers of antennas M, N_B and N_E of the transmitter Alice, the legitimate user Bob and the eavesdropping user Eve; the number of information bits R per symbol; the symbol sequence length J; the number of channel multipaths L; the distributions of the channel parameters α_l, θ_l and φ_l; the transmit power constraint p; and the signal-to-noise ratio SNR; setting the neural network model structure, the training/testing data set parameters, and the training hyper-parameters, the training hyper-parameters comprising: the selected optimizer, the number of training rounds (Epoch), and the number of samples per batch (Batch Size).
3. The method of secure self-precoding machine optimization based on antagonistic learning of claim 1, characterized by: the implementation method of the second step is that,
step 2.1: designing a transmitting terminal neural network, wherein the transmitting terminal neural network comprises a network structure of a signal modulation module and a space pre-coding module;
the transmitting symbol corresponding to the jth secret information after the antenna mapping of the transmitting end is Xj
Figure FDA0003492490370000021
Wherein the content of the first and second substances,
Figure FDA0003492490370000022
and
Figure FDA0003492490370000023
respectively representing a modulation module and a precoding module; m isjIs secret information to be transmitted from a predetermined limited set of secret information
Figure FDA0003492490370000024
Obtaining; the modulation symbol is output after passing through the modulation neural network module
Figure FDA0003492490370000025
sjEstimating channel parameters with the transmitting end
Figure FDA0003492490370000026
Combining and inputting precoding neural network module together
Figure FDA0003492490370000027
Carrying out precoding operation on the modulation symbols to obtain a precoded signal Xj(ii) a In the same way as above, the first and second,
Figure FDA0003492490370000028
xjThe method can process and send in parallel to realize multi-stream signal transmission of the MIMO system; training and testing all parameters in the data set, including channel parameters and signal parameters, by adopting a real part and imaginary part separated mode to represent, namely, all channels and signals in the system are characterized as a real matrix;
all transmitting end networks adopt a fully-connected neural network FCNN and a modulation module neural network
Figure FDA00034924903700000223
The calculation process for the secret information sequence m is represented as follows:
Figure FDA0003492490370000029
wherein the content of the first and second substances,
Figure FDA00034924903700000210
and
Figure FDA00034924903700000211
separately representing modulation module neural networks
Figure FDA00034924903700000212
The activation function, the weight vector and the offset vector of the g-th network; then all symbol sequences s after modulation and the estimated channel parameters estimated by the transmitting end
Figure FDA00034924903700000213
Combining to obtain new training sample
Figure FDA00034924903700000214
Neural network as spatial precoding module
Figure FDA00034924903700000215
The input of (1); spatial precoding module neural network
Figure FDA00034924903700000216
The calculation process of the matrix U after combining the modulation symbols and the channel is expressed as follows:
Figure FDA00034924903700000217
wherein the content of the first and second substances,
Figure FDA00034924903700000218
and
Figure FDA00034924903700000219
separately representing modulation module neural networks
Figure FDA00034924903700000220
T ofthAn activation function, a weight vector and an offset vector of a layer network; to limit the transmit signal power | X | ≦ p,
Figure FDA00034924903700000221
t of (A)thThe layer is designed as a power constraint layer and adopts a self-defined activation function
Figure FDA00034924903700000222
The following were used:
Figure FDA0003492490370000031
wherein | X | represents the F-norm of matrix X, and p represents the maximum transmit power; therefore, a normalized signal to be transmitted mapped to the antenna port is obtained through the step 2.1;
step 2.2: designing a receiving end neural network, which comprises a received signal detection module and a probability mapping module;
the legal user Bob and the eavesdropping user Eve are regarded as two legal users in the step, and two receiver models with the same network structure are built; j ththSignal X with normalized powerjRespectively reaches a receiving end through respective MIMO channels of Bob and Eve, and the j-th channel received by Bob and EvethA signal YB,jAnd YE,jRespectively, as follows:
YB,j=HBXj+nB (5)
YE,j=HEXj+nE (6)
wherein the content of the first and second substances,
Figure FDA0003492490370000032
and
Figure FDA0003492490370000033
representing additive white gaussian noise;
at the receiving end, the receiver networks of Bob and Eve adopt the same network structure, which is respectively expressed as:
Figure FDA0003492490370000034
and
Figure FDA0003492490370000035
j-th of receiver recovery for Bob and EvethThe secret information is represented as:
Figure FDA0003492490370000036
Figure FDA0003492490370000037
wherein alpha isBB,
Figure FDA0003492490370000038
Respectively representing the channel parameters estimated by the Bob receiving end; alpha is alphaEE,
Figure FDA0003492490370000039
Respectively representing the channel parameters estimated by an Eve receiving end;
the last layer of the receiving end neural network adopts a Softmax activation function to respectively output prediction probability vectors P of Bob and EveBAnd PE(ii) a The probability vector represents
Figure FDA00034924903700000310
And
Figure FDA00034924903700000311
the predicted secret is a set of secrets
Figure FDA00034924903700000312
The probability corresponding to a certain secret information;
step 2.3: designing an average cross-entropy loss function for multiple users according to the classified cross-entropy loss function
Figure FDA00034924903700000313
Updating the parameters of the self-precoding machine model by adopting a reverse gradient descent strategy;
average cross entropy loss function of legal user Bob and eavesdropping user Eve
Figure FDA00034924903700000314
The design is as follows:
LCE(Φ) = -(1/(2·batch_size)) Σj Σm Pj,m [log P̂Bj,m + log P̂Ej,m] (9)

wherein Pj,m represents the element in row j, column m of the one-hot encoding matrix P of the transmitted secret information sequence m; P̂Bj,m and P̂Ej,m represent the elements in row j, column m of the probability prediction matrices P̂B and P̂E of the receiver networks of Bob and Eve, respectively; the average of the loss function is computed, under the discrete network parameter set Φ = {ΦA, ΦB, ΦE}, over batch_size data samples, where the batch size represents the number of training samples sent into the neural network in each batch;
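The averaged cross-entropy of formula (9) can be sketched in numpy; the equal weighting of Bob's and Eve's terms and the array shapes below are illustrative assumptions, since the exact formula image is not reproduced in the text:

```python
import numpy as np

def avg_cross_entropy(P, P_hat_B, P_hat_E, eps=1e-12):
    """Average of Bob's and Eve's categorical cross-entropies over a batch.

    P        : (batch, M) one-hot matrix of the transmitted secret information
    P_hat_B  : (batch, M) Bob's prediction probability matrix
    P_hat_E  : (batch, M) Eve's prediction probability matrix
    """
    ce_b = -np.sum(P * np.log(P_hat_B + eps), axis=1)   # per-sample cross-entropy, Bob
    ce_e = -np.sum(P * np.log(P_hat_E + eps), axis=1)   # per-sample cross-entropy, Eve
    return float(np.mean((ce_b + ce_e) / 2))

batch, M = 4, 8
P = np.eye(M)[[0, 1, 2, 3]]                 # one-hot labels for 4 samples
perfect = P.copy()                          # receivers predict exactly right
uniform = np.full((batch, M), 1 / M)        # receivers guess blindly

# Perfect recovery drives the loss toward zero; blind guessing gives log(M)
assert avg_cross_entropy(P, perfect, perfect) < avg_cross_entropy(P, uniform, uniform)
```

Minimizing this quantity over the transmitter and both receivers is what makes the recovered signals consistent with the transmitted ones in the pre-training stage.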
an optimizer of the TensorFlow deep-learning framework is used to minimize the average cross-entropy loss function LCE; through the end-to-end neural network, the secret information recovered at the receiver sides of Bob and Eve is trained in an unsupervised manner so that the signals recovered by the receiving end are consistent with the signals of the sending end, realizing reliable transmission of the secret information; both Eve and Bob thus obtain an optimal receiver for this channel scenario; meanwhile, the trained Eve is regarded as an optimal eavesdropper under autoencoder-based training, and the subsequent security design is carried out against this Eve receiver.
4. The secure self-precoding machine optimization method based on adversarial learning of claim 1, characterized in that the third step is implemented as follows:
step 3.1: similarly to step two, in order to realize secure transmission of the physical-layer signal, a new security loss function Lsec is designed around an ambiguity matrix P̃ for the eavesdropping user, expressed as follows:

Lsec(Φ) = -(1/batch_size) Σj Σm [Pj,m log P̂Bj,m + P̃j,m log P̂Ej,m] (10)

wherein Pj,m, P̂Bj,m and P̂Ej,m have the same meaning as in formula (9); the ambiguity matrix P̃ is introduced to confuse the receiver of the eavesdropping user, and its element P̃j,m in row j, column m is written as follows:

P̃j,m = 1/M for all j, m (11)

where M is the size of the secret-information set;
according to the principle of the cross-entropy loss function, as the training process proceeds, the prediction probability matrix of the eavesdropping user Eve's receiver is driven ever closer to the ambiguity matrix, so that the probabilities with which Eve judges a received symbol to belong to each class become equal; Eve therefore cannot perform symbol detection, while Bob can still detect correctly, realizing secure transmission of the secret information;
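The role of the ambiguity matrix can be checked numerically: by Gibbs' inequality, the cross-entropy against a fixed target is minimized exactly when the prediction equals that target. A sketch, assuming a uniform ambiguity matrix P̃ with entries 1/M (as implied by the requirement that Eve's class probabilities become equal):

```python
import numpy as np

def eve_ambiguity_loss(P_tilde, P_hat_E, eps=1e-12):
    """Cross-entropy between the ambiguity matrix and Eve's predictions (Eve term of loss (10))."""
    return float(-np.mean(np.sum(P_tilde * np.log(P_hat_E + eps), axis=1)))

batch, M = 4, 8
P_tilde = np.full((batch, M), 1 / M)                       # uniform ambiguity matrix

uniform_pred = np.full((batch, M), 1 / M)                  # Eve is fully confused
peaked_pred = np.eye(M)[[0, 1, 2, 3]] * 0.99 + 0.01 / M    # Eve is confident (rows still sum to 1)

# The loss is minimized when Eve's predictions match the uniform target,
# i.e. when her symbol detection degrades to blind guessing
assert eve_ambiguity_loss(P_tilde, uniform_pred) < eve_ambiguity_loss(P_tilde, peaked_pred)
```

Driving Eve's term of loss (10) down therefore pushes her receiver toward uniform, blind-guess outputs while Bob's term keeps his detection accurate.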
step 3.2: based on the new security loss function Lsec designed in step 3.1, security training is carried out with the parameters of the eavesdropping user's receiver kept fixed, and the SAP is obtained by training;
step 3.2.1: first, determine the total number of training iterations N and initialize n = 1; read the parameters of the pre-trained self-encoder model from step two, including the initialization network parameters ΦA of the transmitter Alice and the receiver initialization network parameters ΦB and ΦE of Bob and Eve;
step 3.2.2: initializing channel parameters of training samples and corresponding one-hot coded labels, and reading a training data set;
step 3.2.3: determine the training hyperparameters, including the learning rate of the optimizer, the number of training epochs, the batch size, and the split ratio between the training data set and the validation data set;
step 3.2.4: start training, and update the network model parameters ΦA′ and ΦB′ with the Adam optimizer based on the loss function (10), keeping the eavesdropper receiver parameters ΦE frozen as specified in step 3.2;
step 3.2.5: set n = n + 1; the training ends when n = N.
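Steps 3.2.1 through 3.2.5 amount to a training loop in which the eavesdropper's receiver parameters are excluded from the update. A framework-agnostic sketch; the parameter shapes and the `grad` placeholder are illustrative stand-ins, not the patent's networks:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in parameter sets: Alice's transmitter and the receivers of Bob and Eve
params = {"phi_A": rng.standard_normal(3),
          "phi_B": rng.standard_normal(3),
          "phi_E": rng.standard_normal(3)}
frozen = {"phi_E"}                       # step 3.2: the eavesdropper's receiver is frozen

def grad(value):
    """Placeholder for the gradient of security loss (10) w.r.t. one parameter set."""
    return 0.1 * value                   # illustrative only: shrinks parameters toward zero

phi_A_before = params["phi_A"].copy()
phi_E_before = params["phi_E"].copy()

N, lr = 50, 0.5                          # total iterations and learning rate (steps 3.2.1, 3.2.3)
for n in range(1, N + 1):                # steps 3.2.4-3.2.5
    for name, value in params.items():
        if name in frozen:
            continue                     # frozen parameters receive no gradient update
        params[name] = value - lr * grad(value)
```

In a real TensorFlow implementation the same effect is obtained by marking Eve's receiver non-trainable before compiling, so the Adam optimizer only touches the Alice and Bob parameters.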
5. The secure self-precoding machine optimization method based on adversarial learning of claim 1, characterized in that the fourth step is implemented as follows:
step 4.1: an adversarial learning mechanism is introduced and, combined with the security loss function of step 3.1, a target loss function LEve for the eavesdropping user is designed:

LEve(ΦE) = -(1/batch_size) Σj Σm Pj,m log P̂Ej,m (12)

wherein Pj,m and P̂Ej,m have the same meaning as in formula (10); the purpose of this loss function is to continue optimizing the virtual eavesdropping receiver against the secure transmitter after the secure-transmitter training of step 3.2 is completed, so as to obtain a lower SER;
step 4.2: the whole self-precoder is divided into two parts, the legitimate link (Main Chain) and the eavesdropping link (Eve Chain); the Main Chain comprises the transmitter of Alice and the receiver network of Bob, and the Eve Chain comprises the receiver network of Eve; based on the pre-trained model of step two, an iterative adversarial training algorithm over the two parts is designed to obtain the adversarial secure self-precoder ASAP;
step 4.2.1: first, determine the total number of iterations N, and determine the training hyperparameters, including the learning rate of the optimizer, the number of training epochs, the batch size, and the split ratio between the training data set and the validation data set;
step 4.2.2: check the iteration round n: if n = 1, read the model parameters of the pre-trained self-precoder from step two, including the initialized network parameters ΦA of the transmitter Alice and the receiver initialization network parameters ΦB and ΦE of Bob and Eve; if n ≠ 1, read the model parameters obtained from the (n-1)-th update;
step 4.2.3: initialize the channel parameters HB and HE of the training samples, and read the training data set with the corresponding one-hot coded labels P;
step 4.2.4: set the number of training epochs epoch_1, freeze the network parameters of the Eve Chain, train the network model of the Main Chain according to the loss function (10), and update the parameters ΦA and ΦB;
step 4.2.5: set the number of training epochs epoch_2, freeze the network parameters of the Main Chain, train the network model of the Eve Chain according to the loss function (12), and update the parameters ΦE;
step 4.2.6: set n = n + 1; return to step 4.2.2 and continue executing steps 4.2.3, 4.2.4 and 4.2.5 until n = N, at which point the training ends.
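The iterative schedule of steps 4.2.2 through 4.2.6 alternately freezes one chain while training the other. A toy sketch of this alternation; `train` is an illustrative stand-in for minimizing losses (10) and (12), not the patent's networks:

```python
import numpy as np

rng = np.random.default_rng(2)
params = {"phi_A": rng.standard_normal(3),   # Main Chain: Alice's transmitter
          "phi_B": rng.standard_normal(3),   # Main Chain: Bob's receiver
          "phi_E": rng.standard_normal(3)}   # Eve Chain: Eve's receiver

def train(names, epochs, lr=0.5):
    """Placeholder update of the listed parameter sets; stands in for minimizing a loss."""
    for _ in range(epochs):
        for name in names:
            params[name] -= lr * 0.1 * params[name]

N, epoch_1, epoch_2 = 10, 5, 5               # step 4.2.1: iteration count and per-phase epochs
history = []
for n in range(1, N + 1):                    # steps 4.2.2-4.2.6
    train(["phi_A", "phi_B"], epoch_1)       # step 4.2.4: Eve Chain frozen, loss (10)
    train(["phi_E"], epoch_2)                # step 4.2.5: Main Chain frozen, loss (12)
    history.append(np.linalg.norm(params["phi_E"]))
```

Each outer iteration plays one round of the adversarial game: the Main Chain adapts the transmission against the current eavesdropper, then the eavesdropper re-optimizes against the new transmitter.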
6. The secure self-precoding machine optimization method based on adversarial learning of claim 1, characterized in that the fifth step is implemented as follows:
according to the adversarial secure self-precoder ASAP obtained by training in step four, in a new secure-transmission scenario the model is fine-tuned by acquiring a small number of channel samples and continuing to execute step four; the parameter-updated secure self-precoder model then performs joint modulation and precoding of the secret information to obtain a transmit signal that is confidential with respect to the target eavesdropping user; the legitimate user Bob enjoys high reliability while the eavesdropping user Eve can only obtain blind-guess-level symbol-detection performance, thereby realizing secure transmission.
CN202210112026.XA 2022-01-27 2022-01-27 Security self-precoding machine optimization method based on counterstudy Active CN114553274B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210112026.XA CN114553274B (en) 2022-01-27 2022-01-27 Security self-precoding machine optimization method based on counterstudy


Publications (2)

Publication Number Publication Date
CN114553274A true CN114553274A (en) 2022-05-27
CN114553274B CN114553274B (en) 2023-03-31

Family

ID=81674271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210112026.XA Active CN114553274B (en) 2022-01-27 2022-01-27 Security self-precoding machine optimization method based on counterstudy

Country Status (1)

Country Link
CN (1) CN114553274B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117835246A (en) * 2023-12-29 2024-04-05 山东大学 Task-oriented privacy semantic communication method
WO2024077597A1 (en) * 2022-10-14 2024-04-18 华为技术有限公司 Wireless physical layer secure communication method, and communication apparatus

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200160170A1 (en) * 2018-11-20 2020-05-21 Bank Of America Corporation System and method for incremental learning through state-based real-time adaptations in neural networks
US20200175371A1 (en) * 2018-12-04 2020-06-04 Bank Of America Corporation System and method for self constructing deep neural network design through adversarial learning
CN112468258A (en) * 2019-09-06 2021-03-09 河海大学常州校区 Full-duplex end-to-end automatic encoder communication system and anti-eavesdropping method thereof
CN112924749A (en) * 2021-02-04 2021-06-08 西安电子科技大学 Unsupervised counterstudy electromagnetic spectrum abnormal signal detection method
CN113591905A (en) * 2021-06-17 2021-11-02 中山大学 Deep learning time sequence clustering method based on double-layer attention mechanism and counterstudy
CN113972939A (en) * 2021-09-09 2022-01-25 浙江大学 Antenna system precoding method and device based on double time scales and deep learning


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHAD A. COLE; CHOWDHURY SHAHRIAR; T. CHARLES CLANCY: "An Anti-jam Communications Technique via Spatial Hiding Precoding", 2014 IEEE Military Communications Conference *
LIU Guangcan, CAO Yu, XU Jiaming, XU Bo: "Natural language inference based on adversarial regularization", Acta Automatica Sinica *


Also Published As

Publication number Publication date
CN114553274B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
O'Shea et al. Physical layer deep learning of encodings for the MIMO fading channel
CN106059972B (en) A kind of Modulation Identification method under MIMO correlated channels based on machine learning algorithm
Wang et al. Pilot-assisted channel estimation and signal detection in uplink multi-user MIMO systems with deep learning
CN110557177A (en) DenseNet-based hybrid precoding method in millimeter wave large-scale MIMO system
CN109194378B (en) Physical layer safety wave beam shaping method based on linear neural network
Younis et al. Information-theoretic treatment of space modulation MIMO systems
CN110880950A (en) Safe transmission method for artificial noise auxiliary vector disturbance precoding in MIMO system
Shamasundar et al. A DNN architecture for the detection of generalized spatial modulation signals
Yun et al. Deep artificial noise: Deep learning-based precoding optimization for artificial noise scheme
CN114553274B (en) Security self-precoding machine optimization method based on counterstudy
Wang et al. Deep learning for joint MIMO detection and channel decoding
Hua et al. Signal detection in uplink pilot-assisted multi-user MIMO systems with deep learning
Ye et al. Bilinear convolutional auto-encoder based pilot-free end-to-end communication systems
CN109936399A (en) A kind of insincere junction network antenna selecting method based on deep neural network
Zhu et al. Index modulation for fluid antenna-assisted MIMO communications: System design and performance analysis
Xue et al. End-to-end learning for uplink MU-SIMO joint transmitter and non-coherent receiver design in fading channels
Saikia et al. Signal detection in GSM-based in-band full-duplex communication using DNN
CN109462429B (en) Beam domain modulation device and method of large-scale multiple-input multiple-output millimeter wave system
Tato et al. Neural network aided computation of generalized spatial modulation capacity
CN111431567A (en) Millimeter wave large-scale beam space MIMO system
Huang et al. Deep learning based parallel detector for MIMO systems
Wang et al. Learning to modulate for non-coherent MIMO
Vía et al. Blind decoding of MISO-OSTBC systems based on principal component analysis
Mohamed et al. Supervised learning classifier based transmit antenna selection for SM-MIMO system
Aghdam et al. Low complexity precoding for MIMOME wiretap channels based on cut-off rate

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant