CN115795548A - Covert communication detection method based on federal learning - Google Patents

Covert communication detection method based on federal learning

Info

Publication number
CN115795548A
Authority
CN
China
Prior art keywords
model
global model
local
central server
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211590792.3A
Other languages
Chinese (zh)
Inventor
陈立文
钱玉文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN202211590792.3A priority Critical patent/CN115795548A/en
Publication of CN115795548A publication Critical patent/CN115795548A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00: Reducing energy consumption in communication networks
    • Y02D 30/70: Reducing energy consumption in wireless communication networks

Abstract

The invention discloses a covert communication detection method based on federal learning, which comprises the following steps: storing the sample data transmitted by multiple time-slot antennas in matrix form; determining a global model, which the central server initializes; the central server sending the initialized global model to the users; the participating nodes downloading the global model from the central server, using it as their local model, and training and updating the local model with local data; the participating nodes uploading their local models to the central server, which, after receiving all local models, performs model aggregation and update to form a new global model; the central server issuing the new global model to all training nodes for iterative training; and the users detecting local data with the finally updated global model, judging whether the physical layer signal contains covert communication data, and completing the classification detection task. The invention saves communication resources and protects data security.

Description

Covert communication detection method based on federal learning
Technical Field
The invention belongs to the technical field of wireless communication, and particularly relates to a covert communication detection method based on federal learning.
Background
Covert communication, also known as low probability of detection (LPD) and low probability of interception (LPI) communication, is often used for secure communication. In the field of privacy protection, covert communication can serve as a means of leaking private and confidential information, so detecting possible covert communication is necessary to protect the information security of individuals and organizations. In the field of military countermeasures, covert communication is one form of secure communication: its low-detection, low-interception characteristics make it suitable for transmitting information of high value, while for the opposing side, detecting and breaking the enemy's secure communication has great strategic value.
As a technique developed specifically to defeat detection, covert communication is difficult for conventional security-domain detection methods to identify effectively. The steady progress of machine learning has made it possible to detect covert communication with learned models, and advances in electronic hardware have removed the data-volume bottleneck that long hampered machine learning. Combined with big data, machine learning now draws on psychology, biology, neurophysiology, mathematics, automation, and computer science, and has become a widely applied discipline. In the communication field, the time-frequency domain characteristics of signals, data packets, characteristic frame structures, and other data provide rich label information for machine learning.
Detection of network-layer covert channels essentially performs pattern matching on the flag bits of abnormal network-layer message segments. One detection approach first applies Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) to the segment information of a quantized network timing covert channel to reduce and reconstruct the sample dimensions, then performs multi-classification with supervised learners such as a Support Vector Machine (SVM) and K-Nearest Neighbors (KNN), and finally achieves screening of suspicious message sequences and network-layer detection. Another work proposes a network timing covert channel detection method based on random forests: to address the difficulty of balancing detection recall against detection complexity, it expands the statistical feature indexes and ultimately reduces the overall algorithm complexity while improving classifier performance.
However, existing machine learning detection methods for covert communication all adopt a centralized training mode in which a large amount of user data must be collected before training, creating risks of data leakage and misuse of user data; these drawbacks are especially prominent in highly adversarial environments. Meanwhile, the original covert communication data sets face the practical problem of data imbalance.
Disclosure of Invention
In order to solve the technical defects in the prior art, the invention provides a covert communication detection method based on federal learning.
The technical scheme for realizing the purpose of the invention is as follows: a covert communication detection method based on federal learning comprises the following steps:
(10) User data processing: storing the physical layer bit stream signal sample data which are transmitted by a plurality of time slot antennas and carry hidden information in a matrix form;
(20) Initializing training parameters: determining a global model, and initializing the global model by a central server;
(30) Initial model release: the central server sends the initialized global model to the user;
(40) Local training of a model: the participating nodes download the global model from the central server, use the downloaded global model as a local model, and train and update the local model by using local data;
(50) Global model aggregation: the participating nodes upload their local models to the central server, and after receiving all the local models, the central server performs model aggregation and update to form a new global model;
(60) Global model issuing: the central server sends the new global model to all the training nodes and judges whether the number of training updates of the global model has reached the set number of rounds T; if so, proceed to step (70), otherwise return to step (40);
(70) Covert communication detection: the user detects local data with the finally updated global model, judges whether the physical layer signal contains covert communication data, and completes the classification detection task.
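Steps (10) through (70) can be sketched as a minimal federated training loop. The helper names `local_train` and `aggregate`, and the simple linear local update, are illustrative assumptions; the patent's actual local model is the FNN described later:

```python
import numpy as np

def local_train(global_w, data, lr=0.01, epochs=5):
    # Step (40): each node copies the downloaded global model and updates it
    # with its own local data (here a toy tanh regression, not the patent's FNN).
    w = global_w.copy()
    X, y = data
    for _ in range(epochs):
        pred = np.tanh(X @ w)
        grad = X.T @ (pred - y) / len(y)
        w -= lr * grad
    return w

def aggregate(local_models, sizes):
    # Step (50): weighted average of local models by local data share (FedAvg).
    total = sum(sizes)
    return sum((n / total) * w for w, n in zip(local_models, sizes))

rng = np.random.default_rng(0)
# Three participating nodes, each holding 100 samples of 16-antenna data (step (10)).
datasets = [(rng.normal(size=(100, 16)), rng.integers(0, 2, 100) * 2.0 - 1.0)
            for _ in range(3)]
global_w = np.zeros(16)                      # step (20): initialize the global model
for t in range(10):                          # step (60): iterate for T global rounds
    local_models = [local_train(global_w, d) for d in datasets]   # steps (30)-(40)
    global_w = aggregate(local_models, [len(d[1]) for d in datasets])  # step (50)
```

Note that only model parameters travel between nodes and server; the raw `datasets` never leave the nodes, which is the communication-saving and privacy point claimed below.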
Preferably, the global model employs a binary-classification feedforward neural network (FNN).
Preferably, the activation function of the binary-classification FNN adopts the Tanh activation function:
y = tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})
where x is the output of a neuron in the network and y represents the final decision result.
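A small numeric check of the Tanh formula, and of how its (-1, 1) output can be thresholded into the binary decision; the zero threshold and the label convention are assumptions, not stated in the text:

```python
import numpy as np

def tanh(x):
    # y = (e^x - e^{-x}) / (e^x + e^{-x})
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

z = np.array([-2.0, 0.0, 2.0])       # raw neuron outputs
y = tanh(z)                          # squashed into (-1, 1)
decision = (y > 0).astype(int)       # → array([0, 0, 1]); 1 = covert data present (assumed convention)
```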
Preferably, with the global model issued by the central server used as the local model, the specific method for training and updating the local model for a set number of rounds using local data is as follows:
(41) Training samples {(x^(1), y^(1)), ..., (x^(n), y^(n))} using the data stored in step (10);
where x is sample data and y is the classification label corresponding to the sample data;
(42) Forward pass to train network weights:
z^l = w^l a^{l-1} + b^l,  a^l = σ(z^l)
where l and l-1 denote the layers in which the neurons are located, a^{l-1} is the input connected to the neuron, w^l is the weight of the corresponding input, z^l is the neuron's output, b^l is the corresponding bias, σ(z) is the activation function, and a^l is the input that the neurons of the current layer pass to the neurons of the next layer.
(43) Calculating the error of the model output layer:
δ^L = ∇_a J ⊙ σ′(z^L)
where J is the loss function of the model, ∇_a J is the gradient of the loss function, ⊙ is the Hadamard product of matrices, and σ′(z^L) is the matrix formed by arranging the activation-function derivatives of the output-layer neurons;
(44) Calculating the error of each layer of neurons using the back-propagation algorithm:
δ^l = ((w^{l+1})^T δ^{l+1}) ⊙ σ′(z^l)
where l and l+1 denote the layers in which the neurons are located, σ′(z^l) is the matrix formed by arranging the activation-function derivatives of the layer-l neurons, δ^{l+1} is the error of the layer-(l+1) neurons, (w^{l+1})^T is the transpose of the layer-(l+1) input weight matrix, and a^l is the final activation value of the layer-l neurons;
(45) Gradient descent on the neural network weight parameters:
w^l ← w^l - (η/m) Σ_x δ^{x,l} (a^{x,l-1})^T
b^l ← b^l - (η/m) Σ_x δ^{x,l}
where η is the learning rate and m is the amount of the user's local data;
Judge whether the number of iterations reaches the set total E; if so, proceed to step (50), otherwise return to step (41).
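Steps (41) through (45) can be sketched in NumPy for a small FNN with Tanh activations (so σ′(z) = 1 - tanh²(z)). The mean-squared-error loss and the toy layer widths are assumptions; the patent does not name J or require these sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
sizes = [16, 8, 1]                      # toy layer widths, not the embodiment's 16-32-64-32-1
W = [rng.normal(0, 0.1, (n, m)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [np.zeros((n, 1)) for n in sizes[1:]]

def mse(x, y):
    # Assumed loss J: mean squared error over the batch.
    a = x
    for wl, bl in zip(W, b):
        a = np.tanh(wl @ a + bl)
    return float(np.mean((a - y) ** 2))

def train_step(x, y, eta=0.05):
    # (42) forward pass: z^l = w^l a^{l-1} + b^l, a^l = tanh(z^l)
    a = [x]
    for wl, bl in zip(W, b):
        a.append(np.tanh(wl @ a[-1] + bl))
    # (43) output-layer error: delta^L = grad_a J ⊙ sigma'(z^L), with sigma' = 1 - tanh^2
    deltas = [(a[-1] - y) * (1 - a[-1] ** 2)]
    # (44) back-propagation: delta^l = ((w^{l+1})^T delta^{l+1}) ⊙ sigma'(z^l)
    for l in range(len(W) - 2, -1, -1):
        deltas.insert(0, (W[l + 1].T @ deltas[0]) * (1 - a[l + 1] ** 2))
    # (45) gradient descent, averaged over the m samples in the batch
    m = x.shape[1]
    for l in range(len(W)):
        W[l] -= eta / m * deltas[l] @ a[l].T
        b[l] -= eta / m * deltas[l].sum(axis=1, keepdims=True)

x = rng.normal(size=(16, 32))                 # 32 samples of 16-antenna data
y = np.sign(x.sum(axis=0, keepdims=True))     # toy labels in {-1, +1}
loss_before = mse(x, y)
for _ in range(200):                          # E local training iterations
    train_step(x, y)
loss_after = mse(x, y)
```

Each call to `train_step` performs one full forward-backward-update cycle, so running it E times corresponds to the local update loop that ends with the check against E above.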
Preferably, the specific formula for global model aggregation is:
w_{t+1} = Σ_{k=1}^{K} (n_k / n) w_{t+1}^k
where K is the number of users, w_{t+1}^k is the local model of user k, n_k / n is the weight of the k-th user's local data set in the total data set of all users, and w_{t+1} is the new global model obtained by the weighted averaging.
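This aggregation rule is the standard FedAvg weighted average; a direct sketch (the function name `fedavg` is a label of mine, not the patent's):

```python
import numpy as np

def fedavg(local_weights, data_sizes):
    """w_{t+1} = sum_k (n_k / n) * w_{t+1}^k : average local models weighted by data share."""
    n = sum(data_sizes)
    return sum((n_k / n) * w_k for w_k, n_k in zip(local_weights, data_sizes))

w1 = np.array([1.0, 2.0])
w2 = np.array([3.0, 6.0])
new_global = fedavg([w1, w2], [100, 300])  # user 2 holds 3x the data
# → array([2.5, 5. ])  (0.25 * w1 + 0.75 * w2)
```

Weighting by n_k / n makes users with more local data pull the global model harder, which matters given the data-imbalance problem noted in the background.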
Compared with the prior art, the invention has the following remarkable advantages:
1. Saving communication resources: the nodes participating in training do not need to transmit large amounts of raw data to the central server, which saves substantial communication transmission resources, an advantage that stands out in environments with limited communication resources.
2. Protecting data security: the participating nodes train the model locally with local data and do not need to upload their data to the central node, avoiding the risks of data leakage and data abuse.
3. Better adaptability: the trained model can detect physical layer covert communication without prior information.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, wherein like reference numerals are used to designate like parts throughout.
FIG. 1 is a system model for federated learning-based covert communication detection in accordance with the present invention.
FIG. 2 is a flow chart of the covert communication detection method based on federated learning.
FIG. 3 is a flow chart of a method for local training of nodes participating in training.
FIG. 4 is a flow diagram of a method for federated learning global model updates.
FIG. 5 is a comparison graph comparing the detection accuracy of the FNN network structure and other network structures under the Federal learning framework.
Detailed Description
It is easily understood that those skilled in the art can conceive various embodiments of the present invention according to its technical solution without departing from its essential spirit. Therefore, the following detailed description and the accompanying drawings merely illustrate the technical aspects of the present invention and should not be construed as the whole of the invention or as limitations on its technical scope. Rather, these embodiments are provided so that this disclosure will be thorough and complete. The preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof and which, together with the embodiments, serve to explain the innovative concepts of the invention.
As shown in fig. 1, the covert communication detection method based on federal learning of the present invention includes the following steps:
(10) Obtaining communication parameters during user communication: the nodes participating in training prepare the signal sample data of the physical layer bit stream carrying the hidden information transmitted by a plurality of time slot antennas and store the sample data in a matrix form.
(11) Training parameter confirmation: determining the number of iterations T in each round of the training process for each training node and the number m of samples fed into training in each batch.
(20) Initializing a training model: and the central server issues the initial training model to each participating node.
(21) Determining the number i of model input layer neurons, the number k of hidden layers, and the numbers of hidden layer neurons n_1, n_2, ..., n_k; the output layer is a single neuron, and the model is a binary-classification FNN network.
(22) The activation function adopts the Tanh activation function:
y = tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})
(23) Determining the initialization model weight parameters w_0, b_0;
(24) Setting the number of global model training rounds T, the number of local user training update rounds E, the total number of users M, the number N of users randomly participating in aggregation each round, the number of iterations K, and the learning rate η.
(30) Initial model release: the central server issues the initial model w_0 to all users.
(40) Each selected user i ∈ N locally performs the following training and updating process:
(41) Training samples {(x^(1), y^(1)), ..., (x^(n), y^(n))} using the data stored in step (10);
where x is sample data and y is the classification label corresponding to the sample data;
(42) Forward pass to train network weights:
z^l = w^l a^{l-1} + b^l,  a^l = σ(z^l)
where l and l-1 denote the layers in which the neurons are located, a^{l-1} is the input connected to the neuron, w^l is the weight of the corresponding input, z^l is the neuron's output, b^l is the corresponding bias, σ(z) is the activation function, and a^l is the input that the neurons of the current layer pass to the neurons of the next layer.
(43) Calculating the error of the model output layer:
δ^L = ∇_a J ⊙ σ′(z^L)
where J is the loss function of the model, ∇_a J is the gradient of the loss function, ⊙ is the Hadamard product of matrices, and σ′(z^L) is the matrix formed by arranging the activation-function derivatives of the output-layer neurons;
(44) Calculating the error of each layer of neurons using the back-propagation algorithm:
δ^l = ((w^{l+1})^T δ^{l+1}) ⊙ σ′(z^l)
where l and l+1 denote the layers in which the neurons are located, σ′(z^l) is the matrix formed by arranging the activation-function derivatives of the layer-l neurons, δ^{l+1} is the error of the layer-(l+1) neurons, (w^{l+1})^T is the transpose of the layer-(l+1) input weight matrix, and a^l is the final activation value of the layer-l neurons;
(45) Gradient descent on the neural network weight parameters:
w^l ← w^l - (η/m) Σ_x δ^{x,l} (a^{x,l-1})^T
b^l ← b^l - (η/m) Σ_x δ^{x,l}
where η is the learning rate and m is the amount of the user's local data;
Judge whether the number of iterations reaches the set total E; if so, proceed to step (50), otherwise return to step (41).
(50) Global model aggregation: the nodes participating in the training send their updated model parameters to the central server, and after collecting the model parameters the central server updates the global model:
w_{t+1} = Σ_{k=1}^{K} (n_k / n) w_{t+1}^k
(60) Global model release: the central server issues the updated global model w_{t+1} to all users.
Judge whether the number of iterations reaches the set total number of rounds T; if so, proceed to step (70), otherwise return to step (40).
(70) Covert communication detection: the central server sends the trained model to all nodes, and the nodes participating in training use the model locally to classify the data to be detected, completing the physical layer covert communication detection task.
The invention allows users to keep their data local, performs joint training to obtain a global model without exposing user data, and shares the training results among all participants, thereby effectively addressing data privacy and security protection.
The following is described in detail with reference to the examples:
examples
First, the training samples input to the model are prepared: 800 frames are selected for the training set, each frame containing 300 packets, and the transmitting end sends wireless signals carrying hidden information over 16 transmit antennas. The wireless signals experience independent slow-fading channel gains with mean 0 and variance 1; the channel matrix remains constant during the transmission of one frame of signals and changes independently from frame to frame. The transmitted signal is modulated with QAM as cover, the covert time slots account for 20% of the total time slots, and within the slots carrying hidden information, the transmit antennas carrying the modulation-constellation deviation account for 25% of the total number of antennas.
In the proposed model, the FNN input layer has 16 neurons; there are 3 hidden layers in total, with 32, 64, and 32 neurons respectively; the output layer is a single neuron; and the model is a binary-classification FNN network.
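The embodiment's 16-32-64-32-1 architecture can be written out directly. The Gaussian weight initialization below is an assumption, since the patent does not specify an initializer:

```python
import numpy as np

layer_sizes = [16, 32, 64, 32, 1]   # input layer, three hidden layers, single output neuron
rng = np.random.default_rng(0)
weights = [rng.normal(0, 0.1, (n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros((n, 1)) for n in layer_sizes[1:]]

def forward(x):
    # Forward pass with Tanh at every layer, ending in a single score in (-1, 1).
    a = x
    for w, b in zip(weights, biases):
        a = np.tanh(w @ a + b)
    return a

score = forward(rng.normal(size=(16, 1)))   # one 16-antenna sample
```

The single Tanh output neuron is what makes the model a binary classifier: the sign of `score` can serve as the covert/non-covert decision.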
The simulation sets the total number of users M =50, and 30 users are randomly selected to upload the quantized compression model to the parameter server for aggregation in each communication round. Meanwhile, the local updating times are set to be 5 times, and the learning rate is 0.01.
Using this data setup, the cross-validation precision, recall, and accuracy of the FNN structure are compared against other network structures. Precision (also called the precision rate) reflects, for this problem, the proportion of detected antenna-array information that truly carries hidden information; recall (also called the recall rate) is defined as the proportion of the hidden information that the detection network finds; and accuracy, reflecting the proportion of the total antenna-array sample information correctly classified by the trained model, is the key metric of the model. The final experimental results are shown in FIG. 5.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.
It should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes described in a single embodiment or with reference to a single figure, for the purpose of streamlining the disclosure and aiding in the understanding of various aspects of the invention by those skilled in the art. However, the present invention should not be construed such that the features included in the exemplary embodiments are all the essential technical features of the patent claims.
It should be understood that the modules, units, components, and the like included in the device of one embodiment of the present invention may be adaptively changed to be provided in a device different from that of the embodiment. The different modules, units or components comprised by the apparatus of an embodiment may be combined into one module, unit or component or may be divided into a plurality of sub-modules, sub-units or sub-components.

Claims (5)

1. A covert communication detection method based on federal learning is characterized by comprising the following steps:
(10) User data processing: storing the physical layer bit stream signal sample data which are transmitted by a plurality of time slot antennas and carry hidden information in a matrix form;
(20) Initializing training parameters: determining a global model, and initializing the global model by a central server;
(30) Initial model release: the central server issues the initialized global model to the user;
(40) Local training of a model: the participating nodes download the global model from the central server, use the downloaded global model as a local model, and train and update the local model by using local data;
(50) Global model aggregation: the participating nodes upload their local models to the central server, and after receiving all the local models, the central server performs model aggregation and update to form a new global model;
(60) Global model issuing: the central server sends the new global model to all the training nodes and judges whether the number of training updates of the global model has reached the set number of rounds T; if so, proceed to step (70), otherwise return to step (40);
(70) Covert communication detection: the user detects local data with the finally updated global model, judges whether the physical layer signal contains covert communication data, and completes the classification detection task.
2. The covert communication detection method based on federal learning according to claim 1, wherein the global model employs a binary-classification FNN network.
3. The covert communication detection method based on federal learning according to claim 2, wherein the activation function of the binary-classification FNN network adopts the Tanh activation function:
y = tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})
where x is the output of a neuron in the network and y represents the final decision result.
4. The covert communication detection method based on federal learning according to claim 1, wherein, with the global model issued by the central server used as the local model, the specific method for training and updating the local model for a set number of rounds using local data comprises:
(41) Training samples {(x^(1), y^(1)), ..., (x^(n), y^(n))} using the data stored in step (10);
where x is sample data and y is the classification label corresponding to the sample data;
(42) Forward pass to train network weights:
z^l = w^l a^{l-1} + b^l,  a^l = σ(z^l)
where l and l-1 denote the layers in which the neurons are located, a^{l-1} is the input connected to the neuron, w^l is the weight of the corresponding input, z^l is the neuron's output, b^l is the corresponding bias, σ(z) is the activation function, and a^l is the input that the neurons of the current layer pass to the neurons of the next layer.
(43) Calculating the error of the model output layer:
δ^L = ∇_a J ⊙ σ′(z^L)
where J is the loss function of the model, ∇_a J is the gradient of the loss function, ⊙ is the Hadamard product of matrices, and σ′(z^L) is the matrix formed by arranging the activation-function derivatives of the output-layer neurons;
(44) Calculating the error of each layer of neurons using the back-propagation algorithm:
δ^l = ((w^{l+1})^T δ^{l+1}) ⊙ σ′(z^l)
where l and l+1 denote the layers in which the neurons are located, σ′(z^l) is the matrix formed by arranging the activation-function derivatives of the layer-l neurons, δ^{l+1} is the error of the layer-(l+1) neurons, (w^{l+1})^T is the transpose of the layer-(l+1) input weight matrix, and a^l is the final activation value of the layer-l neurons;
(45) Gradient descent on the neural network weight parameters:
w^l ← w^l - (η/m) Σ_x δ^{x,l} (a^{x,l-1})^T
b^l ← b^l - (η/m) Σ_x δ^{x,l}
where η is the learning rate and m is the amount of the user's local data;
Judge whether the number of iterations reaches the set total E; if so, proceed to step (50), otherwise return to step (41).
5. The covert communication detection method based on federal learning according to claim 1, wherein the specific formula for global model aggregation is:
w_{t+1} = Σ_{k=1}^{K} (n_k / n) w_{t+1}^k
where K is the number of users, w_{t+1}^k is the local model of user k, n_k / n is the weight of the k-th user's local data set in the total data set of all users, and w_{t+1} is the new global model obtained by the weighted averaging.
CN202211590792.3A 2022-12-12 2022-12-12 Covert communication detection method based on federal learning Pending CN115795548A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211590792.3A CN115795548A (en) 2022-12-12 2022-12-12 Covert communication detection method based on federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211590792.3A CN115795548A (en) 2022-12-12 2022-12-12 Covert communication detection method based on federal learning

Publications (1)

Publication Number Publication Date
CN115795548A true CN115795548A (en) 2023-03-14

Family

ID=85418747

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211590792.3A Pending CN115795548A (en) 2022-12-12 2022-12-12 Covert communication detection method based on federal learning

Country Status (1)

Country Link
CN (1) CN115795548A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116797829A * 2023-06-13 2023-09-22 北京百度网讯科技有限公司 Model generation method, image classification method, device, equipment and medium
CN116506072A * 2023-06-19 2023-07-28 华中师范大学 Signal detection method of MIMO-NOMA system based on multitasking federal learning
CN116506072B * 2023-06-19 2023-09-12 华中师范大学 Signal detection method of MIMO-NOMA system based on multitasking federal learning

Similar Documents

Publication Publication Date Title
Zhang et al. Deep learning based attack detection for cyber-physical system cybersecurity: A survey
Gu et al. Blind channel identification aided generalized automatic modulation recognition based on deep learning
Kabir et al. A novel statistical technique for intrusion detection systems
Jagannath et al. A comprehensive survey on radio frequency (RF) fingerprinting: Traditional approaches, deep learning, and open challenges
CN115795548A (en) Covert communication detection method based on federal learning
CN109104441A (en) A kind of detection system and method for the encryption malicious traffic stream based on deep learning
CN110113288B (en) Design and demodulation method of OFDM demodulator based on machine learning
CN113645197B (en) Decentralized federal learning method, device and system
Liao et al. A novel physical layer authentication method with convolutional neural network
CN104036780A (en) Man-machine identification method and system
Usama et al. Examining machine learning for 5G and beyond through an adversarial lens
CN114363043B (en) Asynchronous federal learning method based on verifiable aggregation and differential privacy in peer-to-peer network
Elsaeidy et al. A smart city cyber security platform for narrowband networks
Germain et al. Physical-layer authentication using channel state information and machine learning
Wu et al. DSLN: Securing Internet of Things through RF fingerprint recognition in low-SNR settings
KR20220046408A (en) Self-supervised learning based in-vehicle network anomaly detection system using pseudo normal data
Wang et al. MsmcNet: A modular few-shot learning framework for signal modulation classification
Huang et al. Design of learning engine based on support vector machine in cognitive radio
Zhang et al. Adaptive RF fingerprints fusion via dual attention convolutions
Azar et al. Deep learning based hybrid intrusion detection systems to protect satellite networks
Shi et al. Fedrfid: Federated learning for radio frequency fingerprint identification of wifi signals
Fadul et al. Improving rf-dna fingerprinting performance in an indoor multipath environment using semi-supervised learning
Zhang et al. Spectrum focused frequency adversarial attacks for automatic modulation classification
CN113037668B (en) Millimeter wave point-to-point communication channel equalization method
Aminanto et al. Multi-class intrusion detection using two-channel color mapping in ieee 802.11 wireless network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination