CN115146670A - Radio frequency fingerprint identification method and system based on data enhancement and comparison learning - Google Patents

Publication number
CN115146670A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210600110.6A
Other languages
Chinese (zh)
Inventor
任品毅
任占义
张田田
鲁磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/60 Context-dependent security
    • H04W12/69 Identity-dependent
    • H04W12/79 Radio fingerprint

Abstract

A radio frequency fingerprint identification method and system based on data enhancement and contrastive learning. The method comprises the following steps: acquiring radio frequency signals and building a training data set from them; selecting a specified number of samples from the training data set as a batch training data set; passing all samples in the batch through a convolutional neural network (CNN) to obtain the network outputs; calculating the average loss function value for each training batch; and calculating the gradient of the loss function and updating the network parameters by gradient descent. These steps are repeated until the network parameters stabilize and training stops early, or until the preset number of epochs is completed, which ends the contrastive learning pre-training process based on the original data set and the enhanced data set. By jointly optimizing sample consistency and label consistency in the training stage, the method makes effective use of the information inside each sample and of the correlation between an original sample and its enhanced counterpart, and achieves higher individual identification accuracy for communication radiation sources than low signal-to-noise-ratio radio frequency fingerprint identification methods that rely on data enhancement alone.

Description

Radio frequency fingerprint identification method and system based on data enhancement and comparison learning
Technical Field
The invention belongs to the technical field of radio frequency fingerprint identification, and in particular relates to a radio frequency fingerprint identification method and system based on data enhancement and contrastive learning.
Background
In recent years, the rapid development of wireless communication technology, and in particular the wide deployment of 5G, has driven a rapid increase in the number of Internet of Things (IoT) devices. With a massive number of IoT devices now connected to wireless networks, those networks face unprecedented security challenges, and the resulting security requirements are driving the development of intelligent and reliable authentication systems for legitimate devices. Traditional authentication systems based on communication protocols and key authentication suffer from keys that are easily stolen or copied, and cannot meet the security requirements of large-scale IoT systems. By contrast, radio frequency fingerprint identification based on physical-layer signal characteristics can identify individual communication radiation sources without relying on a key, and offers high identification accuracy and strong security.
To date, researchers have done a great deal of work on radio frequency fingerprint identification. Traditional machine learning methods based on manual feature extraction require elaborate feature engineering, a cumbersome process that depends on expert knowledge. Moreover, because real communication environments are complex, the radio frequency fingerprint cannot be modeled simply, so methods based on manual feature extraction perform poorly in practice. Deep learning methods built around convolutional neural networks avoid manual feature extraction and directly extract high-level abstract features for individual identification of communication radiation sources; they offer high identification accuracy, strong generalization capability, and easy system expansion. However, most current deep-learning-based radio frequency fingerprint identification techniques are built on a specific training data set, so identification accuracy is low when the test data and the training data are acquired in different signal environments. In addition, studies show that identification accuracy depends heavily on the signal-to-noise ratio (SNR) of the acquired signal: at low SNR, accuracy drops sharply compared with the high-SNR case. Because high-SNR conditions are often hard to obtain in real wireless environments, research on high-accuracy radio frequency fingerprint identification under low-SNR conditions is of great significance.
In summary, existing methods have the following drawbacks:
1. Existing radio frequency fingerprint identification methods generalize poorly: they achieve high-accuracy individual identification of communication radiation sources only when the training data and the test data are acquired in the same environment.
2. Existing deep-learning-based radio frequency fingerprint identification methods have low individual identification accuracy at low SNR and poor robustness to SNR, and cannot identify individual communication radiation sources in a dynamic-SNR environment.
3. Existing data-enhancement-based methods for low-SNR radio frequency fingerprint identification use noise modeling to increase the variety of SNRs in the training set, but the traditional supervised learning they rely on depends excessively on sample labels and does not make effective use of the rich information inside each sample, which leads to problems such as poor generalization. In addition, these methods do not use the correlation between the original sample and the enhanced sample of the same data sample during training, so there is room for improvement.
Disclosure of Invention
The invention aims to provide a radio frequency fingerprint identification method and system based on data enhancement and contrastive learning, in order to address the poor generalization, low individual identification accuracy for communication radiation sources, and poor SNR robustness of existing radio frequency fingerprint identification methods.
To achieve this purpose, the invention adopts the following technical scheme:
The radio frequency fingerprint identification method based on data enhancement and contrastive learning comprises the following steps:
collecting radio frequency signals and preprocessing the collected signals;
performing data enhancement and signal characterization on the preprocessed signals to obtain a two-dimensional time-frequency square matrix;
performing max-min normalization on the time-frequency square matrix, mapping its data to a specified interval, and taking the resulting time-frequency grayscale images as the training data set;
selecting a specified number of samples from the training data set as a batch training data set;
passing all samples through a convolutional neural network (CNN) to obtain outputs comprising a projection vector output and a probability vector output;
calculating the average loss function value for each training batch, where label consistency is defined by the cross-entropy loss function value and sample consistency is defined by the contrastive loss function value;
calculating the gradient of the loss function and updating the network parameters by gradient descent; repeating the above steps until the network parameters stabilize and training stops early, or until the preset number of epochs is completed, which ends the contrastive learning pre-training process based on the original data set and the enhanced data set; then fine-tuning the network parameters on the enhanced data set and its label information, training with a standard supervised learning algorithm until the best training result is obtained.
Further, collecting the radio frequency signals and preprocessing them specifically comprises:
collecting radio frequency signals at the receiving end, segmenting them into signal frames, and applying mean-variance normalization to each signal frame so that it follows the Gaussian distribution $N(0, 1)$;
the data enhancement specifically comprises:
performing noise modeling on the normalized signal frame $x(n)$ to construct the corresponding enhanced data frame $s(n)$; assuming the signal frame has data length $l$, the enhanced data frame at SNR = $a$ dB is expressed as:
Figure BDA0003669580170000031
where $w(n)$ is the corresponding white noise signal, whose mean is $\mu_w = 0$ and whose variance is
Figure BDA0003669580170000032
Further, the signal characterization comprises: performing time-frequency characterization of each signal frame with the short-time Fourier transform (STFT), adjusting the STFT parameters so that the one-dimensional time-domain signal is converted into a two-dimensional time-frequency square matrix whose elements are the corresponding power spectral density (PSD) values;
performing max-min normalization on the time-frequency square matrix, multiplying by 255 and rounding, so that the matrix data are mapped to the interval [0, 255]; the matrix elements are then the pixel values of the corresponding time-frequency grayscale image.
Further, the training data set comprises the original time-frequency grayscale image data set $\{X, Y\}$ and the corresponding enhanced time-frequency grayscale image data sets at different SNRs; the enhanced data set at SNR = $snr$ dB is denoted $\{X^{snr}, Y^{snr}\}$.
Further, for each batch training process, a specified number of samples is selected from the original and enhanced time-frequency grayscale image data sets according to the batch size, giving the batch original data set
Figure BDA0003669580170000041
and the corresponding enhanced data set
Figure BDA0003669580170000042
where $B$ is the batch size and the samples in the original and enhanced data sets correspond one-to-one, and
Figure BDA0003669580170000043
is a label-smoothed one-hot encoded matrix.
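The one-to-one pairing of original and enhanced samples within a batch can be illustrated with a minimal Python sketch. It assumes the images and labels are held in NumPy arrays with matching row order; the function name `sample_paired_batch` and the toy arrays are hypothetical and not part of the disclosed method.

```python
import numpy as np

def sample_paired_batch(x_orig, x_enh, y, batch_size, rng):
    """Draw one batch in which row i of the original and enhanced sets match."""
    # A single shared index vector preserves the one-to-one correspondence.
    idx = rng.choice(len(x_orig), size=batch_size, replace=False)
    return x_orig[idx], x_enh[idx], y[idx]

rng = np.random.default_rng(0)
x_orig = np.arange(10, dtype=float).reshape(10, 1)  # toy stand-ins for images
x_enh = x_orig + 0.5                                # toy "enhanced" copies
y = np.eye(10)                                      # toy label matrix
xb, xb_enh, yb = sample_paired_batch(x_orig, x_enh, y, batch_size=4, rng=rng)
```

Because the same indices select rows from all three arrays, each enhanced sample in the batch shares its label with the original sample at the same position.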
Further, all samples are passed through the CNN; in each batch training process, every sample yields a projection vector $p$ and a probability vector $z$ after passing through the network, and the outputs for the original data set and the enhanced data set are, respectively,
Figure BDA0003669580170000044
and
Figure BDA0003669580170000045
Further, label consistency is defined by the cross-entropy loss function value between the probability vectors of the original data set and the one-hot encoded label vectors of the samples; the cross-entropy loss function value for each training batch is defined as:
Figure BDA0003669580170000046
Sample consistency is defined by the contrastive loss function value between the projection vectors of the original data set and those of the enhanced data set. For a pair of projection vectors from the original and enhanced data sets
Figure BDA0003669580170000047
taking
Figure BDA0003669580170000048
as the positive sample pair and the remaining $2B-2$ samples as negative samples, the contrastive loss function value of the sample pair
Figure BDA0003669580170000049
is defined as:
Figure BDA00036695801700000410
where $\tau > 0$ is the temperature parameter and $S(u, v)$ is the cosine similarity:
$$S(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$$
Further, the final contrastive loss function value covers the contrastive loss function values of all positive sample pairs, including both $l_i(p, p^{snr})$ and $l_i(p^{snr}, p)$. For the projection vector pairs
Figure BDA0003669580170000051
the contrastive loss function value is calculated as:
Figure BDA0003669580170000052
The average loss function value for each training batch is:
$$L_{avg} = L_{cce}(z, y) + L_{cl}(p, p^{snr})$$
Further, the network parameters are updated by gradient descent:
$$W \leftarrow W - \eta \frac{\partial L_{avg}}{\partial W}$$
where $W$ denotes the network parameters and $\eta$ is the learning rate.
Further, the radio frequency fingerprint identification system based on data enhancement and contrastive learning comprises:
an acquisition module, used for collecting radio frequency signals and preprocessing the collected signals;
a signal characterization module, used for performing data enhancement and signal characterization on the preprocessed signals to obtain a two-dimensional time-frequency square matrix;
a training data set acquisition module, used for performing max-min normalization on the time-frequency square matrix, mapping its data to a specified interval, and taking the resulting time-frequency grayscale images as the training data set;
a batch training data set acquisition module, used for selecting a specified number of samples from the training data set as a batch training data set;
a sample output module, used for passing all samples through the convolutional neural network (CNN) to obtain outputs comprising a projection vector output and a probability vector output;
a consistency definition module, used for calculating the average loss function value for each training batch, where label consistency is defined by the cross-entropy loss function value and sample consistency by the contrastive loss function value;
and a network update iteration module, used for calculating the gradient of the loss function and updating the network parameters by gradient descent, repeating the above steps until the network parameters stabilize and training stops early, or until the preset number of epochs is completed, which ends the contrastive learning pre-training process based on the original data set and the enhanced data set, and then fine-tuning the network parameters on the enhanced data set and its label information, training with a standard supervised learning algorithm until the best training result is obtained.
Compared with the prior art, the invention has the following technical effects:
The invention provides a low-SNR radio frequency fingerprint identification method based on data enhancement and contrastive learning. First, data enhancement increases the variety of SNRs in the training set, so that the model performs individual identification mainly from the radio frequency fingerprint information inside the signal band and is insensitive to fine features outside the band. In addition, the invention provides a novel contrastive learning framework that jointly optimizes sample consistency and label consistency in the training stage; it makes effective use of the information inside each sample and of the correlation between the original and enhanced samples, and achieves higher individual identification accuracy for communication radiation sources than low-SNR methods that rely on data enhancement alone.
In application, the method increases the variety of SNRs in the training set through data enhancement, which improves generalization and yields higher individual identification accuracy for communication radiation sources across different test sets.
The invention uses a time-frequency transform for signal characterization and uses contrastive learning to optimize the consistency of the CNN representations of the original data and the enhanced data. As a result, the model extracts mainly the in-band radio frequency fingerprint information for individual identification of communication radiation sources, is insensitive to low-energy features outside the band, and effectively improves identification accuracy at low SNR.
The proposed contrastive learning framework jointly optimizes sample consistency and label consistency. During training it makes effective use of the information inside each sample and of the correlation between the original and enhanced samples, and avoids excessive dependence on label information. Compared with other methods it achieves higher individual identification accuracy for communication radiation sources, and it can be applied to individual identification in dynamic-SNR environments.
In conclusion, the method offers strong generalization capability, high SNR robustness, and high individual identification accuracy for communication radiation sources.
Drawings
FIG. 1 is a schematic diagram of the system architecture and training set construction of the present invention;
FIG. 2 is a schematic flow chart of the contrastive learning algorithm of the present invention;
FIG. 3 is a schematic diagram of the network architecture employed by the present invention;
FIG. 4 is a schematic diagram of the data collection platform used for performance verification of the present invention;
FIG. 5 shows the average test accuracy of different algorithms;
FIG. 6 shows the test accuracy of different algorithms on test sets with different SNRs;
FIG. 7 shows the confusion matrix results of different algorithms.
Table 1 shows the average test accuracy for different algorithms.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention adopts the following scheme:
Step 1: Radio frequency signal acquisition and preprocessing. Radio frequency signals are collected at the receiving end and segmented into signal frames, and mean-variance normalization is then applied so that each signal frame follows the Gaussian distribution $N(0, 1)$.
Step 2: Data enhancement. Noise modeling is performed on the normalized signal frame $x(n)$ to construct the corresponding enhanced data frame $s(n)$. Here we consider an additive white Gaussian noise (AWGN) environment; assuming the signal frame has data length $l$, the enhanced data frame at SNR = $a$ dB can be expressed as:
Figure BDA0003669580170000071
where $w(n)$ is the corresponding white noise signal, whose mean is $\mu_w = 0$ and whose variance is
Figure BDA0003669580170000072
Step 3: Signal characterization. To support the subsequent radio frequency fingerprint feature extraction, the short-time Fourier transform (STFT) is used for time-frequency characterization of each signal frame; by adjusting the STFT parameters, the one-dimensional time-domain signal is transformed into a two-dimensional time-frequency square matrix whose elements are the corresponding power spectral density (PSD) values.
Step 4: Pixel mapping. Max-min normalization is applied to the time-frequency square matrix, the result is multiplied by 255 and rounded, and the matrix data are thereby mapped to the interval [0, 255]; the matrix elements are the pixel values of the corresponding time-frequency grayscale image.
Step 5: Acquiring the data set. Steps 1-4 yield the training data set, which comprises the original time-frequency grayscale image data set $\{X, Y\}$ and the corresponding enhanced time-frequency grayscale image data sets at different SNRs; the enhanced data set at SNR = $snr$ dB is denoted $\{X^{snr}, Y^{snr}\}$.
Step 6: Constructing the batch training data set. For each batch training process, a specified number of samples is selected from the original and enhanced time-frequency grayscale image data sets according to the batch size, giving the batch original data set
Figure BDA0003669580170000081
and the corresponding enhanced data set
Figure BDA0003669580170000082
where $B$ is the batch size and the samples in the original and enhanced data sets correspond one-to-one, that is,
Figure BDA0003669580170000083
is a label-smoothed one-hot encoded matrix.
Step 7: Convolutional neural network (CNN) feature extraction. All samples are passed through the CNN to obtain outputs comprising a projection vector output and a probability vector output. In each batch training process, every sample yields a projection vector $p$ and a probability vector $z$ after passing through the network. The outputs for the original data set and the enhanced data set are then, respectively,
Figure BDA0003669580170000084
and
Figure BDA0003669580170000085
Step 8: Calculating the label consistency. Label consistency is defined as the cross-entropy loss function value between the probability vectors of the original data set and the one-hot encoded label vectors of the samples; the cross-entropy loss function value for each training batch is defined as:
Figure BDA0003669580170000086
Step 9: Calculating the sample consistency. Sample consistency is defined as the contrastive loss function value between the projection vectors of the original data set and those of the enhanced data set. For a pair of projection vectors from the original and enhanced data sets
Figure BDA0003669580170000087
we take
Figure BDA0003669580170000088
as the positive sample pair and the remaining $2B-2$ samples as negative samples; the contrastive loss function value of the sample pair
Figure BDA0003669580170000089
is then defined as:
Figure BDA0003669580170000091
where $\tau > 0$ is the temperature parameter and $S(u, v)$ is the cosine similarity:
$$S(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$$
Step 10: The final contrastive loss function value covers the contrastive loss function values of all positive sample pairs, including both $l_i(p, p^{snr})$ and $l_i(p^{snr}, p)$. For the projection vector pairs
Figure BDA0003669580170000093
the contrastive loss function value is therefore calculated as:
Figure BDA0003669580170000094
Step 11: The average loss function value for each batch training process is then:
$$L_{avg} = L_{cce}(z, y) + L_{cl}(p, p^{snr})$$
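As a rough illustration of how the two loss terms combine, the following NumPy sketch pairs a mean cross-entropy with a SimCLR-style normalized-temperature contrastive loss. The method's exact expressions are contained in the equation figures, which are not reproduced in this text, so the forms below (batch averaging, a softmax over the other 2B-1 samples, τ = 0.5) are assumptions, and all function names are hypothetical.

```python
import numpy as np

def cross_entropy(z, y, eps=1e-12):
    """Mean cross-entropy between probability rows z and (smoothed) one-hot rows y."""
    return float(-np.mean(np.sum(y * np.log(z + eps), axis=1)))

def nt_xent(p, p_snr, tau=0.5):
    """Symmetric contrastive loss over the B positive pairs (p[i], p_snr[i])."""
    B = p.shape[0]
    v = np.concatenate([p, p_snr], axis=0)             # 2B projection vectors
    v = v / np.linalg.norm(v, axis=1, keepdims=True)   # unit norm, so dot = cosine
    sim = (v @ v.T) / tau                              # S(u, v) / tau for all pairs
    np.fill_diagonal(sim, -np.inf)                     # a sample is never paired with itself
    pos = np.concatenate([np.arange(B, 2 * B), np.arange(B)])  # index of each positive
    # Log-sum-exp over the other 2B-1 samples: 1 positive and 2B-2 negatives.
    m = sim.max(axis=1, keepdims=True)
    log_den = m[:, 0] + np.log(np.exp(sim - m).sum(axis=1))
    losses = log_den - sim[np.arange(2 * B), pos]
    return float(losses.mean())                        # averages both pair directions

def total_loss(z, y, p, p_snr, tau=0.5):
    """L_avg = L_cce(z, y) + L_cl(p, p_snr)."""
    return cross_entropy(z, y) + nt_xent(p, p_snr, tau)

rng = np.random.default_rng(0)
p = rng.normal(size=(8, 128))
aligned = nt_xent(p, p + 0.01 * rng.normal(size=p.shape))   # enhanced views close to originals
shuffled = nt_xent(p, rng.normal(size=p.shape))             # unrelated "enhanced" views
```

In this toy comparison the aligned pairs give a smaller contrastive loss than the unrelated pairs, which is the behaviour the sample-consistency term is meant to reward.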
Step 12: Updating the network parameters. The gradient of the loss function is calculated, and the network parameters are updated by gradient descent:
$$W \leftarrow W - \eta \frac{\partial L_{avg}}{\partial W}$$
where $W$ denotes the network parameters and $\eta$ is the learning rate.
Step 13: Steps 6-12 are repeated until the network parameters stabilize and training stops early, or until the preset number of epochs is completed. This completes the contrastive learning pre-training process based on the original data set and the enhanced data set.
Step 14: The network parameters are then fine-tuned on the enhanced data set and its label information, training with a standard supervised learning algorithm until the best training result is obtained.
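The stopping rule and the two-stage schedule can be sketched as follows. This is a toy illustration only: the helper `run_stage`, the plateau test on the loss (`patience`, `tol`), and the scalar `Toy` model standing in for the CNN are all assumptions. The source states only that training ends early once the parameters are stable or when the preset epoch count is reached.

```python
def run_stage(step_fn, max_epochs, patience=5, tol=1e-4):
    """Call step_fn once per epoch; stop early once the loss stops improving.

    step_fn() performs one epoch of updates and returns the mean loss.
    Returns the list of per-epoch losses.
    """
    history, best, stale = [], float("inf"), 0
    for _ in range(max_epochs):
        loss = step_fn()
        history.append(loss)
        if best - loss > tol:          # meaningful improvement: reset the counter
            best, stale = loss, 0
        else:                          # plateau: count epochs without improvement
            stale += 1
            if stale >= patience:
                break
    return history

class Toy:
    """Stand-in for the CNN: one scalar weight and a quadratic loss."""
    def __init__(self, w=5.0, lr=0.1):
        self.w, self.lr = w, lr

    def epoch(self, target):
        grad = 2.0 * (self.w - target)   # dL/dW for L = (W - target)^2
        self.w -= self.lr * grad         # W <- W - eta * dL/dW
        return (self.w - target) ** 2

model = Toy()
pretrain = run_stage(lambda: model.epoch(1.0), max_epochs=200)   # stage 1 stand-in
finetune = run_stage(lambda: model.epoch(1.2), max_epochs=200)   # stage 2 stand-in
```

Both stages reuse the same weights, mirroring the idea that fine-tuning starts from the pre-trained parameters rather than from scratch.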
FIG. 1 is a schematic diagram of the system architecture and training set construction of the present invention. The invention considers a typical deep-learning-based radio frequency fingerprint identification system. As shown in FIG. 1(a), the original real signal is collected at the receiving end before demodulation. The collected signal then goes through a series of necessary preprocessing steps, including normalization and characterization, after which a CNN performs feature extraction and individual identification. FIG. 1(b) shows the training procedure of the invention: the radio frequency signal collected by the radio frequency receiver at the receiving end is first divided into signal frames, and mean-variance normalization is applied so that each signal frame follows the Gaussian distribution $N(0, 1)$; the normalized signal frames are referred to as the original data.
Next, data enhancement is applied to the original data: enhanced data samples are obtained by noise modeling at different SNRs, which increases the variety of SNRs in the training set. Specifically, the original data sample is first energy-normalized, an AWGN signal is then generated according to the target SNR, and the original data sample and the white noise signal are added to obtain the enhanced data sample. Assuming the original data sample $x(n)$ has length $l$, the enhanced data sample $s(n)$ at SNR = $a$ dB can be expressed as:
Figure BDA0003669580170000101
where $w(n)$ is the corresponding white noise signal, whose mean is $\mu_w = 0$ and whose variance is
Figure BDA0003669580170000102
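A minimal NumPy sketch of this enhancement step follows. It assumes that energy normalization means scaling the frame to unit average power, so that the noise variance for SNR = a dB is 10^(-a/10); the exact expression used by the method appears only in the equation figure, and the function name `add_awgn` is hypothetical.

```python
import numpy as np

def add_awgn(x, snr_db, rng):
    """Return (s, x_norm, w) with s(n) = x_norm(n) + w(n) at the requested SNR in dB."""
    x = np.asarray(x, dtype=float)
    x_norm = x / np.sqrt(np.mean(x ** 2))        # energy normalization: unit average power
    noise_var = 10.0 ** (-snr_db / 10.0)         # assumed sigma_w^2 for unit signal power
    w = rng.normal(0.0, np.sqrt(noise_var), size=x.shape)  # zero-mean white Gaussian noise
    return x_norm + w, x_norm, w

rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 0.05 * np.arange(4096))   # toy real-valued signal frame
s, x_norm, w = add_awgn(frame, snr_db=5.0, rng=rng)
measured_snr_db = 10 * np.log10(np.mean(x_norm ** 2) / np.mean(w ** 2))
```

Calling `add_awgn` several times with different `snr_db` values on the same frame yields the set of enhanced samples at different SNRs described above.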
Time-frequency characterization is then performed with the STFT to obtain the time-frequency grayscale image corresponding to each one-dimensional data sample. Specifically, mean-variance normalization is first applied to the enhanced data samples so that they follow the Gaussian distribution $N(0, 1)$; the original data samples, which are already normalized, are not normalized again. The STFT is then applied, with its parameters adjusted so that the one-dimensional time-domain signal is converted into a two-dimensional time-frequency square matrix of size $W \times W$ whose elements are the corresponding power spectral density values. Pixel mapping follows: max-min normalization is applied to the time-frequency square matrix, the result is multiplied by 255 and rounded, and the matrix is thereby mapped to the interval [0, 255]; its elements are the pixel values of the time-frequency grayscale image. The mean-variance normalization and max-min normalization used above are, respectively:
$$\hat{x}(n) = \frac{x(n) - x(n)_{mean}}{x(n)_{std}}$$
$$\hat{x}(n) = \frac{x(n) - x(n)_{min}}{x(n)_{max} - x(n)_{min}}$$
where $x(n)$ is the data sample and $x(n)_{mean}$, $x(n)_{std}$, $x(n)_{min}$, $x(n)_{max}$ are its mean, standard deviation, minimum, and maximum, respectively.
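The characterization and pixel-mapping chain can be sketched in NumPy as below. The STFT parameters are not given in the text, so the sketch assumes a Hann window of length W, a two-sided FFT of length W, and a hop chosen to yield exactly W frames; it also uses the squared STFT magnitude as an unscaled stand-in for the PSD. All function names are hypothetical.

```python
import numpy as np

def zscore(x):
    """Mean-variance normalization."""
    return (x - x.mean()) / x.std()

def stft_psd_square(x, size):
    """size x size matrix of |STFT|^2 values (frequency bins x time frames)."""
    hop = (len(x) - size) // (size - 1)      # spacing that yields `size` frames
    if hop < 1:
        raise ValueError("frame too short for the requested image size")
    window = np.hanning(size)
    frames = np.stack([x[i * hop : i * hop + size] * window for i in range(size)])
    spectrum = np.fft.fft(frames, axis=1)    # two-sided FFT gives `size` frequency bins
    return (np.abs(spectrum) ** 2).T

def to_gray(m):
    """Max-min normalization, scale by 255, round to integer pixel values."""
    scaled = (m - m.min()) / (m.max() - m.min())
    return np.round(scaled * 255).astype(np.uint8)

rng = np.random.default_rng(0)
frame = zscore(rng.normal(size=8192))        # toy signal frame
image = to_gray(stft_psd_square(frame, size=64))
```

The result is a 64 x 64 array of 8-bit pixel values that can be fed to an image CNN as a single-channel grayscale input.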
Through the processing, a training set can be obtained, and comprises an original time-frequency gray-scale image data set { X, Y } and enhanced time-frequency gray-scale image data sets with different corresponding SNRs, wherein the enhanced time-frequency gray-scale image data set when the SNR = SNR dB is expressed as { X } snr ,Y snr }。
FIG. 2 is a schematic diagram of the contrastive learning algorithm of the present invention. The invention provides a novel contrastive learning framework that jointly optimizes sample consistency and label consistency in the training phase. First, we use the original sample as the supervision information for the enhanced sample, and use the contrastive loss function (CL) value to define the feature-space similarity between the original sample and the enhanced sample of the same data sample. In addition, we use the categorical cross-entropy loss function (CCE) value to make effective use of the label information of the original samples. As shown in FIG. 2, we describe the specific algorithm flow for one batch training process, where the training data set consists of the original time-frequency grayscale image data set and the enhanced time-frequency grayscale image data sets generated above. It is assumed that the enhanced time-frequency grayscale image data set contains N = 3 different SNRs, namely r, s and t dB. The specific algorithm flow comprises the following steps:
Step 1: Generate the batch training data set. In each batch training process, we first select a specified number of training samples according to the batch size, comprising the original data set
Figure BDA0003669580170000111
and the corresponding enhanced data set
Figure BDA0003669580170000112
where B is the batch size and the samples in the original and enhanced data sets correspond one-to-one; that is,
Figure BDA0003669580170000113
is a label-smoothed one-hot encoded matrix, where x_i ∈ R^{1×(W×W×1)}, y_i ∈ R^{1×K}, and K is the number of sample classes in the training set.
Suppose sample x_i belongs to class k; then:
Figure BDA0003669580170000114
where T is the label smoothing parameter.
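As an illustration of label smoothing, the sketch below uses one common form, (1 − T)·one-hot + T/K. This is an assumption: the patent's exact expression is the one in the formula above and may distribute the smoothing mass differently (for example T/(K − 1) over the other classes). The function name `smooth_one_hot` is hypothetical.

```python
import numpy as np

def smooth_one_hot(labels, num_classes, T=0.1):
    """Label-smoothed one-hot rows: (1 - T) * one_hot + T / num_classes (assumed form)."""
    one_hot = np.eye(num_classes)[np.asarray(labels)]
    return (1.0 - T) * one_hot + T / num_classes

# Two samples of classes 0 and 2 out of K = 4 classes
Y = smooth_one_hot([0, 2], num_classes=4, T=0.1)
```

Each row of `Y` still sums to 1, and the largest entry remains at the true class index.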
Step 2: Convolutional neural network (CNN) feature extraction. As shown in FIG. 3, the CNN used in the present invention comprises three modules, Conv Block, MLP Block and Linear Classifier, denoted by f(θ_1), h(θ_2) and g(θ_3), respectively. The Conv Block consists of three convolutional layers and computes the characterization vector of a grayscale image sample. The MLP Block comprises three Dense layers and projects the characterization vector into the space in which the contrastive loss function value is computed, giving the projection vector output p ∈ R^{1×128}. The Linear Classifier reduces the dimension of the projection vector and uses the softmax activation function to obtain the corresponding probability vector output z ∈ R^{1×K}. For each sample in the batch training data set, we can thus obtain the projection vector output
Figure BDA0003669580170000121
and the probability vector output
Figure BDA0003669580170000122
Step 3: Calculate the label consistency. We use the probability vectors of the original data set
Figure BDA0003669580170000123
and the one-hot encoded vectors of the samples
Figure BDA0003669580170000124
to define the label consistency. The cross-entropy loss function value of each batch training process is then defined as:
Figure BDA0003669580170000125
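A minimal sketch of a batch cross-entropy value is shown below, assuming the standard form averaged over the batch, −(1/B) Σ_i Σ_k y_{i,k} log z_{i,k}. The normalization by B and the clipping constant `eps` are assumptions, and the patent's definition is the one in the formula above.

```python
import numpy as np

def batch_cross_entropy(z, y, eps=1e-12):
    """Mean over the batch of -sum_k y[i, k] * log z[i, k] (assumed form)."""
    z = np.clip(z, eps, 1.0)     # avoid log(0)
    return float(-np.mean(np.sum(y * np.log(z), axis=1)))

# Two samples, K = 3 classes: softmax outputs z and (unsmoothed) one-hot labels y
z = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
loss = batch_cross_entropy(z, y)
```

In the method described here, `y` would be the label-smoothed matrix and `z` the probability vectors of the original data set only.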
Step 4: Calculate the sample consistency. We use the projection vectors of the original data set
Figure BDA0003669580170000126
and the projection vectors of the enhanced data set
Figure BDA0003669580170000127
and define the sample consistency by their contrastive loss function value. For a pair of original-data-set and enhanced-data-set projection vectors
Figure BDA0003669580170000128
we take
Figure BDA0003669580170000129
as the positive sample pair and the remaining 2B-2 samples as negative samples. The contrastive loss function value of the sample pair
Figure BDA00036695801700001210
is then defined as:
Figure BDA00036695801700001211
where S(u, v) is the cosine similarity, and τ is the temperature parameter with τ > 0.
Figure BDA00036695801700001212
Clearly, the final contrastive loss function value includes the contrastive loss function values of all positive sample pairs, including l_i(p, p_snr) and l_i(p_snr, p). Thus, for the projection vector pair
Figure BDA00036695801700001213
the contrastive loss function value is calculated as:
Figure BDA0003669580170000131
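The description in Step 4 (one positive pair, the remaining 2B-2 samples as negatives, cosine similarity, temperature τ, both pair orders) matches the general shape of an NT-Xent-style loss. The sketch below implements that standard form as an assumption; the patent's exact expressions are those in the formulas above, and the function name `contrastive_loss` is hypothetical.

```python
import numpy as np

def contrastive_loss(p, p_snr, tau=0.2):
    """Symmetric NT-Xent-style loss over B positive pairs (p[i], p_snr[i])."""
    B = p.shape[0]
    v = np.concatenate([p, p_snr], axis=0)              # (2B, d)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)    # unit-length rows
    sim = v @ v.T / tau                                 # cosine similarity / tau
    np.fill_diagonal(sim, -np.inf)                      # exclude self-similarity
    # Row i is paired with row i + B, and row i + B with row i
    pos = np.concatenate([np.arange(B, 2 * B), np.arange(B)])
    # Numerically stable log-sum-exp over the other 2B - 1 samples
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob = sim[np.arange(2 * B), pos] - log_denominator
    return float(-log_prob.mean())                      # average over both pair orders

rng = np.random.default_rng(0)
p = rng.standard_normal((8, 128))
loss_aligned = contrastive_loss(p, p)                              # identical projections
loss_random = contrastive_loss(p, rng.standard_normal((8, 128)))   # unrelated projections
```

As expected for this kind of loss, projections that agree within each pair give a lower value than unrelated projections.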
Step 5: Calculate the average loss function value of each batch training process. In each batch training process, the contrastive learning loss function value between each enhanced data set and the original data set must be calculated. To balance the influence of the cross-entropy loss function value and the contrastive learning loss function values on the training process, all contrastive learning loss function values are weighted and averaged. The average loss function value of each batch training process is therefore:
Figure BDA0003669580170000132
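As a hedged illustration of the weighted averaging described in Step 5, and assuming equal weights 1/N over the N = 3 enhanced data sets (the actual weights are those in the formula above and may differ), the combined loss could be written as:

```latex
L_{avg} = L_{cce}(z, y) + \frac{1}{N} \sum_{snr \in \{r, s, t\}} L_{cl}(p, p_{snr}), \qquad N = 3
```

For a single enhanced data set (N = 1) this reduces to the form stated in claim 8, L_avg = L_cce(z, y) + L_cl(p, p_snr).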
Step 6: Update the network parameters. The gradient of the loss function is calculated, and the network parameters are updated by gradient descent:
$$W \leftarrow W - \eta \frac{\partial L_{avg}}{\partial W}$$
where W = {θ_1, θ_2, θ_3} denotes the network parameters and η denotes the learning rate.
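A minimal sketch of one plain gradient descent update on a dictionary of parameters follows. The toy loss (sum of squared entries) and the parameter names are purely illustrative; in practice the gradients would come from automatic differentiation of the batch loss.

```python
import numpy as np

def sgd_step(params, grads, lr=0.01):
    """Return W - lr * dL/dW for each named parameter group."""
    return {name: value - lr * grads[name] for name, value in params.items()}

# Toy loss L(W) = sum of squared entries, so dL/dW = 2 * W (illustrative only)
params = {"theta1": np.array([1.0, -2.0]), "theta2": np.array([0.5])}
grads = {name: 2.0 * value for name, value in params.items()}
updated = sgd_step(params, grads, lr=0.01)
```

Each call moves every parameter a small step against its gradient, scaled by the learning rate η.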
Steps 1 to 6 are repeated until training terminates early because the network parameters have stabilized, or until the preset number of epochs has been completed. This completes the contrastive learning pre-training process based on the original data set and the enhanced data sets. Note that the label information of the enhanced data sets is not effectively used in this pre-training process. We therefore further fine-tune the network using the enhanced data sets from the pre-training process, with standard supervised learning as the learning algorithm.
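The stopping rule can be sketched as a patience-based early-stopping loop. Interpreting the early-termination setting as a patience (the number of consecutive epochs without improvement in a monitored loss) is an assumption, and `run_training`, `epoch_losses` and `min_delta` are hypothetical names; the defaults mirror the maximum of 100 epochs and early-termination value of 20 reported in the experiments.

```python
def run_training(epoch_losses, max_epochs=100, patience=20, min_delta=0.0):
    """Return the number of epochs run before stopping.

    epoch_losses: iterable of per-epoch monitored loss values (hypothetical;
    in practice each value would come from one pass over Steps 1 to 6).
    """
    best = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for loss in epoch_losses:
        if epochs_run >= max_epochs:
            break                              # preset number of epochs reached
        epochs_run += 1
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                          # loss has stabilized: stop early
    return epochs_run

# Loss improves for 10 epochs, then stays flat
losses = [1.0 / (e + 1) for e in range(10)] + [0.1] * 200
stopped_at = run_training(losses)
```

Here training stops 20 epochs after the last improvement; a loss that keeps improving would instead run for the full 100 epochs.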
Having introduced the system structure model, the data set construction process, the network model structure and the network training process of the present invention, we now verify in practice the communication radiation source individual identification performance of the proposed method, taking smartphone devices as an example. FIG. 4(a) is a schematic diagram of the data acquisition platform, which comprises a shielding box, omnidirectional antennas, a base station and a data processing platform. To eliminate the influence of the channel environment on the radio frequency fingerprint, two omnidirectional antennas collect the radio frequency signals inside the shielding box and exchange data with the base station through radio frequency cables. The base station is configured in the LTE frequency division duplex (FDD) mode, with an uplink frequency of 1.725 GHz and a downlink frequency of 1.82 GHz. The acquired signals are real LTE signals, with a signal frame length of 8192 and a sampling frequency of 122.88 MHz. The acquired signals are transmitted from the base station to the data processing platform over a network cable. Mean-variance normalization and data enhancement at different SNRs are applied to the acquired signal frames, and time-frequency characterization is then performed by STFT to obtain time-frequency grayscale images of size 128 × 128 × 1. Non-overlapping samples are randomly selected from the time-frequency grayscale image data set as the training set and the test set, with 10000 samples/device/SNR and 4000 samples/device/SNR, respectively.
We performed all experiments on a GPU server with six NVIDIA RTX 3060 GPUs and implemented the network model and learning procedure in TensorFlow. We chose {-10, 0, 10, 20} dB as the data enhancement SNRs for the training set and validated the communication radiation source individual identification performance of the model on the test set of [ -10. For each training run, we randomly selected a certain number of samples from the training data set as training samples, and all samples in the test set were used for performance evaluation. To reduce the randomness of the experimental results, we repeated the experiment 4 times for each training condition and computed the mean and variance of the results. Specifically, the label smoothing parameter is T = 0.1 and the cosine similarity temperature parameter is τ = 0.2. We chose SGD as the optimizer with the learning rate set to 0.01, and used a cosine learning rate schedule with linear learning rate warm-up over the first 20 epochs, with the warm-up learning rate set to 0.005. We set the batch size to B = 512 and the maximum number of training epochs to 100, with the early-termination epoch value set to 20. For network fine-tuning, the number of epochs is set to 30 and the learning rate to 0.01.
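One way to read the learning rate settings above is sketched below: a linear ramp from the warm-up learning rate 0.005 to the base learning rate 0.01 over the first 20 epochs, followed by cosine decay towards zero at epoch 100. The ramp start value and the decay end point are assumptions, since the text does not specify them, and `learning_rate` is a hypothetical helper.

```python
import math

def learning_rate(epoch, base_lr=0.01, warmup_lr=0.005,
                  warmup_epochs=20, total_epochs=100):
    """Per-epoch learning rate: linear warm-up, then cosine decay (assumed shape)."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr up to base_lr
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

schedule = [learning_rate(e) for e in range(100)]
```

The schedule peaks at the base learning rate when warm-up ends and decreases monotonically afterwards.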
The average test accuracy of several different individual identification algorithms in the dB test set of [ -10. It can be seen from the results that NA-RFF performs poorly on the test sets with different SNRs, because its CNN is trained on the original data set while the test sets consist of data at different SNRs, that is, the training set and the test set are completely different. In addition, analysis of FIG. 4(b) shows that, when trained on the original data set, the CNN does not attend only to the in-band radio frequency fingerprint of the time-frequency sample: out-of-band low-energy features also take part in individual identification. In the enhanced data sets, the added noise submerges these out-of-band low-energy features, so the features extracted by a CNN trained on the original data set cannot be used to classify and identify low-SNR samples. In contrast, we find that artificially introducing noisy samples into the training set through data enhancement significantly improves the CNN's communication radiation source individual identification accuracy at low SNR. Furthermore, as can be seen from FIG. 5, the scheme proposed by the present invention achieves a significant performance improvement over the scheme with data enhancement only, and the improvement grows as the number of training samples increases. The contrastive learning framework provided by the invention therefore makes the CNN attend more to the radio frequency fingerprint information within the frequency band of the time-frequency sample, so that its identification performance on low-SNR data sets is significantly improved.
TABLE 1
Figure BDA0003669580170000151
FIG. 6 compares the SNR robustness of the different methods, where TTOS is the method in which the training data are acquired in the same environment as the test data. As can be seen from the figure, the SNR has a large influence on the individual identification accuracy: for DACL-RFF it is 94.69% when SNR = 20 dB but only 28.26% when SNR = -10 dB. For NA-RFF, the individual identification accuracy is only slightly more than 6.84% when SNR = -10 dB. FIG. 7 shows the confusion matrix results for NA-RFF and DACL-RFF. In addition, FIG. 6 shows that TTOS achieves considerably higher individual identification performance than NA-RFF, which indicates that, although the out-of-band features of the enhanced data samples are affected by noise, an identical data acquisition environment still ensures that the training data and the test data share more common features, giving higher individual identification accuracy. However, in a practical system, training different models for training sets at different SNRs requires more computation and storage resources, and an accurate SNR estimate of the signal is needed to select a suitable model for communication radiation source individual identification, so TTOS cannot be used in a practical time-varying SNR communication scenario. In contrast, the training set of the method provided by the invention only contains enhanced data at {-10, 0, 10, 20} dB, but has significant performance improvement on all test data sets of [ -10. In addition, the low-SNR individual identification performance of DACL-RFF is higher than that of DA-RFF. In conclusion, among the above methods, the method provided by the invention exhibits the highest SNR robustness and has better application prospects in time-varying SNR scenarios.
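For reference, a confusion matrix and the corresponding identification accuracy for one test set can be computed as in the sketch below. The labels are made-up toy values, not results from the patent, and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[i, j] counts samples of true device i predicted as device j."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def accuracy(cm):
    """Fraction of samples on the diagonal (correctly identified)."""
    return np.trace(cm) / cm.sum()

# Toy example with 3 devices and 6 test samples (illustrative values only)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
```

Evaluating `accuracy` separately on the test set of each SNR would give an accuracy-versus-SNR curve of the kind compared in FIG. 6.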
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. The radio frequency fingerprint identification method based on data enhancement and contrast learning is characterized by comprising the following steps of:
collecting radio frequency signals, and preprocessing the collected radio frequency signals;
performing data enhancement and signal characterization on the preprocessed signals to obtain a two-dimensional time-frequency square matrix;
carrying out maximum-minimum normalization on the obtained time-frequency square matrix, mapping the time-frequency square matrix data to a specified interval, and obtaining a corresponding time-frequency gray image as a training data set;
selecting a specified number of samples as a batch training data set according to the training data set;
computing the outputs obtained by passing all samples through a convolutional neural network (CNN), including projection vector outputs and probability vector outputs;
calculating the average loss function value of each batch training process, wherein label consistency is defined by the cross-entropy loss function value and sample consistency is defined by the contrastive loss function value;
calculating the gradient of the loss function and updating the network parameters by gradient descent; repeating the above steps until training terminates early because the network parameters have stabilized or the preset number of epochs has been completed, thereby completing the contrastive learning pre-training process based on the original data set and the enhanced data set; and fine-tuning the network parameters using the enhanced data set and its corresponding label information, training with a standard supervised learning algorithm until an optimal training result is obtained.
2. The radio frequency fingerprint identification method based on data enhancement and contrast learning according to claim 1, wherein the collecting radio frequency signals and the preprocessing the collected radio frequency signals are specifically:
collecting radio frequency signals at a receiving end, segmenting the radio frequency signals according to signal frames, and then carrying out mean-variance normalization on the signal frames to make each signal frame obey Gaussian distribution N (0, 1);
the data enhancement is specifically as follows:
carrying out noise modeling on the signal frame x (n) subjected to normalization processing to construct a corresponding enhanced data frame s (n); assuming a signal frame data length of l, the enhanced data frame at SNR = a dB is represented as:
Figure FDA0003669580160000011
where w(n) is the corresponding white noise signal, whose mean and variance are, respectively, μ_w = 0 and
Figure FDA0003669580160000021
3. The radio frequency fingerprint identification method based on data enhancement and contrast learning of claim 1, wherein the signal characterization is: performing time-frequency characterization on a signal frame by using short-time Fourier transform (STFT), and converting a one-dimensional time domain signal into a two-dimensional time-frequency square matrix by adjusting a conversion parameter of the STFT, wherein a square matrix element is a corresponding Power Spectral Density (PSD);
performing maximum-minimum normalization on the time-frequency square matrix, then multiplying by 255 and rounding, so that the time-frequency square matrix data are mapped to the interval [0, 255], wherein the square matrix elements are the corresponding pixel values, to obtain the corresponding time-frequency grayscale image.
4. The data-enhancement and contrast-learning-based radio frequency fingerprint identification method according to claim 1, wherein the training data set comprises an original time-frequency grayscale image data set {X, Y} and corresponding enhanced time-frequency grayscale image data sets of different SNRs, the enhanced time-frequency grayscale image data set when SNR = snr dB being expressed as {X_snr, Y_snr}.
5. The data enhancement and contrast learning-based radio frequency fingerprint identification method according to claim 1, wherein for each batch training process, a specified number of samples are selected from the original time-frequency grayscale image data set and the enhanced time-frequency grayscale image data set as batch training original data sets according to batch size
Figure FDA0003669580160000022
and the corresponding enhanced data set
Figure FDA0003669580160000023
where B is the batch size and the samples in the original and enhanced data sets correspond one-to-one,
Figure FDA0003669580160000024
Figure FDA0003669580160000025
is a label-smoothed one-hot encoded matrix.
6. The method of claim 1, wherein, in computing the outputs of all samples through the CNN, for each batch training process each sample yields a projection vector p and a probability vector z after passing through the network, so that the outputs of the original data set and the enhanced data set are respectively
Figure FDA0003669580160000026
and
Figure FDA0003669580160000027
7. The method of claim 1, wherein label consistency is defined by the cross-entropy loss function value of the original data set probability vectors and the sample one-hot encoded vectors, and the cross-entropy loss function value of each batch training process is defined as:
Figure FDA0003669580160000031
sample consistency is defined by the contrastive loss function value of the projection vectors of the original data set and the projection vectors of the enhanced data set; for a pair of original-data-set and enhanced-data-set projection vectors
Figure FDA0003669580160000032
taking
Figure FDA0003669580160000033
as the positive sample pair and the remaining 2B-2 samples as negative samples, the contrastive loss function value of the sample pair
Figure FDA0003669580160000034
is defined as:
Figure FDA0003669580160000035
wherein S(u, v) is the cosine similarity, τ is the temperature parameter, and τ > 0;
Figure FDA0003669580160000036
8. The method of claim 1, wherein the final contrastive loss function value includes the contrastive loss function values of all positive sample pairs, including l_i(p, p_snr) and l_i(p_snr, p); for the projection vector pair
Figure FDA0003669580160000037
the contrastive loss function value is calculated as:
Figure FDA0003669580160000038
the average loss function value for each batch of training sessions is:
$$L_{avg} = L_{cce}(z, y) + L_{cl}(p, p_{snr}).$$
9. The method of claim 1, wherein the network parameters are updated according to a gradient descent principle:
$$W \leftarrow W - \eta \frac{\partial L_{avg}}{\partial W}$$
where W is the network parameter and η represents the learning rate.
10. A radio frequency fingerprint identification system based on data enhancement and contrast learning, characterized by comprising:
the acquisition module is used for acquiring radio frequency signals and preprocessing the acquired radio frequency signals;
the signal characterization module is used for performing data enhancement and signal characterization on the preprocessed signals to obtain a two-dimensional time-frequency square matrix;
the training data set acquisition module is used for carrying out maximum-minimum normalization on the obtained time-frequency square matrix, mapping the time-frequency square matrix data to a specified interval, and obtaining a corresponding time-frequency gray image as a training data set;
the batch training data set acquisition module is used for selecting a specified number of samples as a batch training data set according to the training data set;
the sample output module is used for computing the outputs obtained by passing all samples through a convolutional neural network (CNN), including projection vector outputs and probability vector outputs;
the consistency definition module is used for calculating the average loss function value of each batch training process, defining label consistency by the cross-entropy loss function value and defining sample consistency by the contrastive loss function value;
and the network updating iteration module is used for calculating the gradient of the loss function and updating the network parameters by gradient descent, repeating the above steps until training terminates early because the network parameters have stabilized or the preset number of epochs has been completed, thereby completing the contrastive learning pre-training process based on the original data set and the enhanced data set, and for fine-tuning the network parameters using the enhanced data set and its corresponding label information, training with a standard supervised learning algorithm until an optimal training result is obtained.
CN202210600110.6A 2022-05-30 2022-05-30 Radio frequency fingerprint identification method and system based on data enhancement and comparison learning Pending CN115146670A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210600110.6A CN115146670A (en) 2022-05-30 2022-05-30 Radio frequency fingerprint identification method and system based on data enhancement and comparison learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210600110.6A CN115146670A (en) 2022-05-30 2022-05-30 Radio frequency fingerprint identification method and system based on data enhancement and comparison learning

Publications (1)

Publication Number Publication Date
CN115146670A true CN115146670A (en) 2022-10-04

Family

ID=83406102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210600110.6A Pending CN115146670A (en) 2022-05-30 2022-05-30 Radio frequency fingerprint identification method and system based on data enhancement and comparison learning

Country Status (1)

Country Link
CN (1) CN115146670A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116112932A (en) * 2023-02-20 2023-05-12 南京航空航天大学 Data knowledge dual-drive radio frequency fingerprint identification method and system
CN116340849A (en) * 2023-05-17 2023-06-27 南京邮电大学 Non-contact type cross-domain human activity recognition method based on metric learning
CN116361859A (en) * 2023-06-02 2023-06-30 之江实验室 Cross-mechanism patient record linking method and system based on depth privacy encoder
CN116527461A (en) * 2023-04-28 2023-08-01 哈尔滨工程大学 Electromagnetic signal time domain enhancement method based on shielding analysis
CN116573508A (en) * 2023-07-13 2023-08-11 深圳市万物云科技有限公司 High-resolution elevator fault identification method, device and related medium
CN117113061A (en) * 2023-09-14 2023-11-24 中国人民解放军军事科学院系统工程研究院 Cross-receiver radiation source fingerprint identification method and device

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116112932A (en) * 2023-02-20 2023-05-12 南京航空航天大学 Data knowledge dual-drive radio frequency fingerprint identification method and system
CN116112932B (en) * 2023-02-20 2023-11-10 南京航空航天大学 Data knowledge dual-drive radio frequency fingerprint identification method and system
CN116527461A (en) * 2023-04-28 2023-08-01 哈尔滨工程大学 Electromagnetic signal time domain enhancement method based on shielding analysis
CN116527461B (en) * 2023-04-28 2024-05-24 哈尔滨工程大学 Electromagnetic signal time domain enhancement method based on shielding analysis
CN116340849A (en) * 2023-05-17 2023-06-27 南京邮电大学 Non-contact type cross-domain human activity recognition method based on metric learning
CN116340849B (en) * 2023-05-17 2023-08-15 南京邮电大学 Non-contact type cross-domain human activity recognition method based on metric learning
CN116361859A (en) * 2023-06-02 2023-06-30 之江实验室 Cross-mechanism patient record linking method and system based on depth privacy encoder
CN116361859B (en) * 2023-06-02 2023-08-25 之江实验室 Cross-mechanism patient record linking method and system based on depth privacy encoder
CN116573508A (en) * 2023-07-13 2023-08-11 深圳市万物云科技有限公司 High-resolution elevator fault identification method, device and related medium
CN116573508B (en) * 2023-07-13 2023-10-10 深圳市万物云科技有限公司 High-resolution elevator fault identification method, device and related medium
CN117113061A (en) * 2023-09-14 2023-11-24 中国人民解放军军事科学院系统工程研究院 Cross-receiver radiation source fingerprint identification method and device
CN117113061B (en) * 2023-09-14 2024-02-23 中国人民解放军军事科学院系统工程研究院 Cross-receiver radiation source fingerprint identification method and device

Similar Documents

Publication Publication Date Title
CN115146670A (en) Radio frequency fingerprint identification method and system based on data enhancement and comparison learning
CN113887502B (en) Communication radiation source time-frequency characteristic extraction and individual identification method and system
CN113259288A (en) Underwater acoustic communication modulation mode identification method based on feature fusion and lightweight hybrid neural network
CN112749633B (en) Separate and reconstructed individual radiation source identification method
CN111010356A (en) Underwater acoustic communication signal modulation mode identification method based on support vector machine
CN111967358B (en) Neural network gait recognition method based on attention mechanism
CN112347910A (en) Signal fingerprint identification method based on multi-mode deep learning
CN114022914B (en) Palmprint recognition method based on fusion depth network
Ren et al. Deep RF device fingerprinting by semi-supervised learning with meta pseudo time-frequency labels
CN114757224A (en) Specific radiation source identification method based on continuous learning and combined feature extraction
Sun et al. Radar emitter individual identification based on convolutional neural network learning
CN111551893A (en) Deep learning and neural network integrated indoor positioning method
CN113347175B (en) Method and system for fingerprint feature extraction and equipment identity identification of optical communication equipment
Ren et al. Noise-Tolerant Radio Frequency Fingerprinting With Data Augmentation and Contrastive Learning
Punyani et al. Iris recognition system using morphology and sequential addition based grouping
Chen et al. Edge Detection and Deep Learning Based SETI Signal Classification Method
CN117577117B (en) Training method and device for orthogonalization low-rank adaptive matrix voice detection model
CN115424383B (en) Intelligent access control management system and method
CN116647376B (en) Voiceprint information-based underwater acoustic network node identity authentication method
CN113487519B (en) Image rain removing method based on artificial intelligence
CN117830083B (en) Method and device for generating face sketch-to-face photo
CN116862803B (en) Reverse image reconstruction method, device, equipment and readable storage medium
CN117744130A (en) Label-only model reverse attack method based on conditional diffusion model
CN116471598A (en) Radio frequency fingerprint identification method based on image features
CN116167953A (en) Self-supervision-based trans-equipment magnetic resonance data harmonious method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination