CN112702294B - Modulation recognition method for multi-level feature extraction based on deep learning - Google Patents


Info

Publication number: CN112702294B
Application number: CN202110311297.3A
Authority: CN (China)
Prior art keywords: time sequence, feature, characteristic, information, extraction module
Legal status: Expired - Fee Related
Other languages: Chinese (zh)
Other versions: CN112702294A (application publication)
Inventors: 张江, 张航, 雒瑞森
Assignee (current and original): Sichuan University
Application filed by Sichuan University; published as application CN112702294A and granted as CN112702294B

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 27/00 - Modulated-carrier systems
    • H04L 27/0012 - Modulated-carrier systems: arrangements for identifying the type of modulation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/25 - Fusion techniques
    • G06F 18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/049 - Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs


Abstract

The invention discloses a modulation recognition method with multi-level feature extraction based on deep learning, comprising the following steps. S1: explore the hidden feature information between the components of the IQ signal using one-dimensional convolution. S2: with a two-dimensional convolution block, further extract features from the hidden feature information obtained in step S1 together with the original IQ signal to obtain high-order spatial fusion features; send the obtained spatial fusion features to the long short-term memory network of a timing feature extraction module, into which an attention mechanism is introduced to extract the important timing features. S3: integrate the spatial fusion feature extraction module and the timing feature extraction module. S4: make the spatial fusion feature information and the timing feature information complement each other to complete the identification of various modulation signals. By emphasizing the mutual influence between the components of the I/Q signal, the hidden information between them is fully mined and information loss is effectively avoided.

Description

Modulation recognition method for multi-level feature extraction based on deep learning
Technical Field
The invention relates to the field of radio modulation identification, in particular to a modulation identification method for multi-level feature extraction based on deep learning.
Background
Automatic modulation identification is an important technology in the field of wireless communication, with wide application in both the civil and military domains. When the transmitted data and the parameters of the transmitting end are unknown, blind identification of the modulation mode of a radio signal is a great challenge. Methods for modulation identification generally fall into two categories: likelihood-ratio-based methods and feature-based methods. Likelihood-ratio-based modulation identification requires strong prior knowledge: the likelihood ratio obtained from the likelihood function of the signal is compared with a set threshold to identify the modulated signal. This approach has high computational complexity, depends heavily on the threshold setting, is sensitive to parameter deviation, and has poor model robustness. Feature-based identification mainly comprises three steps: preprocessing, feature extraction and classifier selection. Traditional modulation-signal feature extraction includes characteristic parameters based on high-order cumulants, methods based on cyclic spectrum analysis, features based on wavelet transforms, constellation-diagram analysis and other methods; modulation modes are then identified by applying classifiers such as decision trees or support vector machines to the manually extracted expert features. Such features belong to expert knowledge, require strong domain expertise, have a limited application range, and are therefore of limited use.
Existing neural-network-based modulation recognition methods mainly use convolutional or recurrent neural networks with carefully designed network models to complete modulation recognition. These methods mostly ignore the mutual information that exists between the components of the I/Q signal, lose information during network computation, and are affected by complex environmental factors such as frequency offset and multipath fading in actual transmission, which increases the difficulty of identifying certain modulation classes. How to fully mine the hidden information between I/Q components and attend to the mutual influence between the I component and the Q component has therefore become an urgent problem in the field of radio modulation identification.
Disclosure of Invention
The invention aims to overcome the defect that mutual information existing among components of an I/Q signal is ignored to cause information loss in the prior art, and provides a modulation identification method for multi-level feature extraction based on deep learning.
The purpose of the invention is mainly realized by the following technical scheme:
a modulation identification method for multi-level feature extraction based on deep learning comprises the following steps:
s1: exploring hidden characteristic information among components of the IQ signal by utilizing one-dimensional convolution;
s2: combining the two-dimensional convolution block, further extracting the characteristics of the hidden characteristic information and the original IQ signal extracted in the step S1 by using a spatial fusion characteristic extraction module to obtain a high-order spatial fusion characteristic, sending the obtained high-order spatial fusion characteristic to a long-short term memory network of a time sequence characteristic extraction module, introducing an attention mechanism into the time sequence characteristic extraction module, and extracting important time sequence characteristics;
s3: the integrated space fusion feature extraction module and the time sequence feature extraction module;
s4: and complementing the spatial fusion characteristic information and the time sequence characteristic information to finish the identification of various modulation signals.
In the communication field, an I/Q signal comprises an in-phase component I and a quadrature component Q, which combine into a complete sample point, making it convenient to determine various characteristics of the signal such as instantaneous amplitude, phase and power. High identification accuracy for various modulation modes can be achieved by extracting the instantaneous amplitude and phase features of the IQ signal and using a long short-term memory network, and important correlations exist between the IQ signal components. Because IQ signals of different modulation types follow corresponding rules, and because complex influences such as multipath fading, frequency offset and path loss exist in actual channel transmission, the invention proposes a multi-level feature fusion algorithm in order to improve the robustness of modulation identification and overcome these environmental influences. The method first explores the hidden feature information between the components of the IQ signal using one-dimensional convolution. The hidden features refer to the mutual information between the I-channel and Q-channel components; for example, instantaneous characteristics of the modulated signal such as instantaneous amplitude and instantaneous phase can be calculated from the I and Q components, and one-dimensional convolution is used to explore this important information shared by the two components. Then, exploiting the advantage of two-dimensional convolution in extracting spatial features, the high-order spatial fusion feature information extracted by the network is sent to a long short-term memory network for temporal feature extraction, and an attention mechanism is introduced to fully mine the important temporal features.
By integrating the convolutional neural network and the long short-term memory network, the invention realizes the complementation between the spatial features and the temporal feature information of the IQ signal, and completes high-precision identification of various modulation signals.
Further, the extraction of the spatial fusion features comprises the following steps:
S11: performing batch normalization on the IQ signals;
S12: sending the normalized IQ signals into a one-dimensional convolutional layer and extracting mutual information among the IQ component signals to obtain a first array;
S13: splicing and fusing the original IQ signal and the first array through a Concatenate1 operation to obtain a second array;
S14: sending the second array through a two-dimensional convolutional layer into two attention residual blocks, and extracting feature information;
S15: sending the feature information extracted in step S14 to a batch normalization layer and a two-dimensional convolutional layer to realize cross-channel interaction and information integration, obtaining spatial features;
S16: obtaining the spatial fusion features through a Concatenate2 operation, i.e. by splicing and combining the original IQ signal and the spatial features.
In the extraction of the spatial fusion features, the batch normalization layer is used to accelerate network training; it eases the training of complex models and prevents vanishing gradients and drift of the internal data distribution. The IQ signals are normalized and then sent to the one-dimensional convolutional layer in order to extract the mutual information between the IQ component signals and mine their hidden transient features, yielding a corresponding array. Through the Concatenate1 operation, the original IQ signal and the extracted feature values are spliced and fused into a further array that contains not only the original IQ sample points but also the implicit features extracted by the one-dimensional convolutional layer. This avoids the loss of original useful information and effectively combines the extracted signal features, expanding each IQ sample point from the two dimensions of the I and Q components to more dimensions and enriching the feature information. The fused IQ feature information is then sent into the two attention residual blocks, exploiting the spatial feature extraction capability of the two-dimensional convolutional layer, and finally the original IQ signal and the spatial features extracted by the convolutional neural network are spliced and combined through the Concatenate2 operation, multiplexing the original signal features to obtain the spatial fusion features. Concatenate1 and Concatenate2 are both the existing Concatenate function, which splices the given arrays along a certain axis. Through the normalization of the IQ signal followed by the one-dimensional convolutional layer, the Concatenate1 operation, the two-dimensional convolutional layer, the two attention residual blocks and the Concatenate2 operation, the spatial fusion features are extracted completely and effectively.
Further, the attention residual block consists of a standard residual network with three convolutional layers in total; a batch normalization layer is added between the convolutional layers, and a ReLU activation function is used to improve the nonlinear mapping capability of the convolutional layers. The array output by the attention residual blocks is sent to a batch normalization layer and a two-dimensional convolutional layer, thereby achieving cross-channel interaction and information integration.
Further, an attention mechanism is introduced into both attention residual blocks of step S14, and the attention mechanism comprises the following steps:
s141: extracting a feature mapping U;
s142: compressing the extracted feature mapping U to a channel descriptor by a global average pooling technique
Figure 56484DEST_PATH_IMAGE001
Wherein the Cth element of z is calculated by:
Figure 770362DEST_PATH_IMAGE002
each element of z has a global receptive field for a feature map U, where H represents the height of each channel feature map and W represents the width of each channel feature map;
s143: and substituting the channel descriptor z into an excitation part comprising two fully-connected layers, learning the weight of each channel through the dimension transformation of the two fully-connected layers, multiplying the obtained weight by the feature map of the corresponding channel to recalibrate the feature map, and adding the recalibrated feature map and the input second array to obtain feature information.
To further improve the utilization of useful information and the feature extraction capability of the model, the invention also introduces a channel attention mechanism, realized mainly in two parts: compression and excitation. The compression part compresses the feature map U into the channel descriptor z in R^C by global average pooling. The excitation part comprises two fully-connected layers whose dimensions are controlled by a hyperparameter r. The weight of each channel is learned through the dimensional transformation of the two fully-connected layers, and the obtained weights are multiplied by the feature maps of the corresponding channels to recalibrate them, similar to a gating mechanism applied to each channel separately. This is a lightweight gating mechanism that improves the representational capability of the network by modeling channel-by-channel relationships. Finally, the recalibrated feature map is added to the previously input array, achieving residual learning and avoiding network degradation.
Furthermore, the timing feature extraction module consists of two layers of bidirectional long short-term memory networks and a fully-connected layer. To extract the intrinsic characteristics of the modulated signal from multiple directions, the invention inputs the obtained spatial fusion features into a network consisting of two bidirectional long short-term memory layers and a fully-connected layer, with the aim of extracting the timing features contained in the modulated signal; the bidirectional long short-term memory network has the advantage of extracting past and future correlated information from the input sequence data and capturing more complete timing features.
Further, the extracting step of the time sequence feature in the time sequence feature extracting module includes:
s21: inputting the spatial fusion features into a time sequence feature extraction module;
s22: after passing through two layers of bidirectional long and short term memory networks, extracting a vector positioned in the middle of the output of the last layer of bidirectional long and short term memory network, projecting the vector by using a full connection layer as a query vector, performing dot product operation on the vector and the output of the bidirectional long and short term memory network, and normalizing by using a softmax function to obtain an attention probability distribution value;
s23: and performing dot product operation on the attention probability distribution value and the output of the last layer of bidirectional long and short term memory network to obtain a time sequence feature vector.
Because a large amount of information is stored in the bidirectional long short-term memory network, an attention mechanism is added to the final output of the two bidirectional long short-term memory layers in order to filter out irrelevant feature information and let the model attend to the more important timing features. The method extracts the vector located in the middle of the output of the last bidirectional long short-term memory layer, projects it with a fully-connected layer to serve as the query vector, performs a dot product between this query and the outputs of the bidirectional long short-term memory network, and normalizes with a softmax function to obtain the final attention probability distribution. This attention distribution, itself a vector, is then dot-multiplied with the output of the last bidirectional long short-term memory layer to obtain the final feature vector.
In conclusion, compared with the prior art, the invention has the following beneficial effects:
(1) By integrating the convolutional neural network and the long short-term memory network, the invention realizes the complementation between the spatial features and the timing feature information of the IQ signal and completes high-precision identification of various modulation signals; by emphasizing the mutual influence between the components of the I/Q signal, the hidden information between them is fully mined and information loss is effectively avoided.
(2) Through normalization of the IQ signals followed by the one-dimensional convolutional layer, the Concatenate1 operation, the two-dimensional convolutional layer, the two attention residual blocks and the Concatenate2 operation, the spatial fusion features are extracted completely and effectively.
(3) The invention inputs the acquired space fusion characteristics into a network consisting of two layers of bidirectional long and short term memory networks and a full connection layer, and is expected to extract the timing sequence characteristics contained in the modulation signals.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a schematic diagram of the present invention;
FIG. 2 is a schematic diagram of an attention residual block of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
Example 1:
as shown in fig. 1, the present embodiment relates to a modulation identification method for multi-level feature extraction based on deep learning, which includes the following steps:
s1: exploring hidden characteristic information among components of the IQ signal by utilizing one-dimensional convolution;
s2: combining the two-dimensional convolution block, further extracting the characteristics of the hidden characteristic information and the original IQ signal extracted in the step S1 by using a spatial fusion characteristic extraction module to obtain a high-order spatial fusion characteristic, sending the obtained high-order spatial fusion characteristic to a long-short term memory network of a time sequence characteristic extraction module, introducing an attention mechanism into the time sequence characteristic extraction module, and extracting important time sequence characteristics;
s3: the integrated space fusion feature extraction module and the time sequence feature extraction module;
s4: and complementing the spatial fusion characteristic information and the time sequence characteristic information to finish the identification of various modulation signals.
In the embodiment, the hidden feature information among components of the IQ signal is firstly explored by utilizing one-dimensional convolution, the hidden feature in the invention refers to mutual information between an I channel component and a Q channel component, and then the high-order spatial fusion feature information extracted by the network is sent to a long-term and short-term memory network for time feature extraction by combining the advantage of extracting the spatial feature by the two-dimensional convolution, and an attention mechanism is introduced to fully mine important time features. By integrating the convolutional neural network and the long-term and short-term memory network, the complementation between the IQ signal space characteristic information and the time characteristic information is realized, and the high-precision identification of various modulation signals is completed.
Since IQ signal modulation identification is inherently a multi-class classification task, the labels of the various modulation signal types are typically one-hot encoded and a cross-entropy loss function is used. Because the inputs of some modulation types are quite similar between categories, and noise of different degrees is added, overfitting occurs easily: the model becomes over-confident in its classification results, leading to misclassification of modulation types that are difficult to recognize. This embodiment therefore introduces a label smoothing technique to process the label of each modulation type and prevent overfitting. Taking a training sample x with corresponding label y as an example, the one-hot label distribution q(k|x) = delta(k, y) is replaced using the formula:

q'(k|x) = (1 - epsilon) * delta(k, y) + epsilon * u(k)

where epsilon is the smoothing parameter, set to 0.2, and u(k) is the assumed label distribution. Since the number of samples of each modulation type is the same in the experiment, this embodiment uses the uniform distribution, so u(k) = 1/K, where K equals the number of label categories.
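The label smoothing formula can be written directly in a few lines of Python, using the values from this embodiment (epsilon = 0.2, K = 11 modulation classes):

```python
def smooth_labels(y_index, num_classes, eps=0.2):
    """Label smoothing as described in the embodiment:
    q'(k) = (1 - eps) * [k == y] + eps * u(k), with uniform u(k) = 1/K."""
    u = eps / num_classes                # eps * u(k) for every class
    q = [u] * num_classes
    q[y_index] += 1.0 - eps              # add the (1 - eps) mass on the true class
    return q

q = smooth_labels(y_index=3, num_classes=11, eps=0.2)
```

The smoothed target still sums to one, but the true class keeps only 1 - eps + eps/K of the probability mass, which discourages over-confident predictions.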
Example 2:
as shown in fig. 1, in this embodiment, based on embodiment 1, the extracting of the spatial fusion feature includes the following steps:
s11: performing batch normalization on the IQ signals;
s12: sending the regular IQ signals into a one-dimensional convolutional layer and extracting mutual information among IQ component signals to obtain a first array;
s13: splicing and fusing the original IQ signal and the first array through a configure 1 operation to obtain a second array;
s14: sending the second array into two attention residual blocks by using a two-dimensional convolution layer, and extracting characteristic information;
s15: sending the feature information extracted in the step S14 to a batch normalization layer and a two-dimensional convolution layer to realize cross-channel interaction and information integration to obtain spatial features;
s16: the spatial fusion feature is obtained by a splice 2 operation, i.e. by splicing and combining the original IQ signal and the spatial feature.
In this embodiment, the IQ signal is normalized; the normalized IQ signal has size 2x128. It is then sent to a one-dimensional convolutional layer with kernel size 5 and 25 convolution kernels in order to extract the mutual information between the IQ component signals and mine their hidden transient features, yielding an array of size 25x128. Through the Concatenate1 operation, the original IQ signal and the extracted feature values are spliced and fused into a 27x128 array, where the 27 rows contain both the original IQ sample points and the implicit features extracted by the one-dimensional convolutional layer, and 128 is the number of sample points per frame. The fused IQ feature information is then sent into the two attention residual blocks, exploiting the spatial feature extraction capability of the two-dimensional convolutional layer; finally the original IQ signal and the spatial features extracted by the convolutional neural network are spliced and combined through the Concatenate2 operation, multiplexing the original signal features to obtain the spatial fusion features.
Example 3:
as shown in fig. 1-2, in this embodiment, based on embodiment 2, the attention residual block consists of a standard residual network with three convolutional layers in total; a batch normalization layer is added between the convolutional layers, and a ReLU activation function is used.
Example 4:
as shown in fig. 1-2, in this embodiment, based on embodiment 2 or 3, an attention mechanism is introduced into both attention residual blocks of step S14, where the attention mechanism comprises the following steps:
s141: extracting a feature mapping U;
s142: is to be extractedThe feature map U is compressed to channel descriptors by a global average pooling technique
Figure 720159DEST_PATH_IMAGE001
Wherein the Cth element of z is calculated by the following formula:
Figure 534663DEST_PATH_IMAGE005
each element of z has a global receptive field for a feature map U, where H represents the height of each channel feature map and W represents the width of each channel feature map;
s143: and substituting the channel descriptor z into an excitation part comprising two fully-connected layers, learning the weight of each channel through the dimension transformation of the two fully-connected layers, multiplying the obtained weight by the feature map of the corresponding channel to recalibrate the feature map, and adding the recalibrated feature map and the input second array to obtain feature information.
Example 5:
as shown in fig. 1 to 2, in this embodiment, on the basis of any one of embodiments 1 to 4, the step of extracting the timing characteristics in the timing characteristic extraction module includes:
s21: inputting the spatial fusion features into a time sequence feature extraction module;
s22: after passing through two layers of bidirectional long and short term memory networks, extracting a vector positioned in the middle of the output of the last layer of bidirectional long and short term memory network, projecting the vector by using a full connection layer as a query vector, performing dot product operation on the vector and the output of the bidirectional long and short term memory network, and normalizing by using a softmax function to obtain an attention probability distribution value;
s23: and performing dot product operation on the attention probability distribution value and the output of the last layer of bidirectional long and short term memory network to obtain a time sequence feature vector.
In this embodiment, the vector located in the middle of the output of the last bidirectional long short-term memory layer is extracted and projected with a fully-connected layer to serve as the query vector; a dot product is then taken between this query and the outputs of the bidirectional long short-term memory network, and a softmax function is used for normalization to obtain the final attention probability distribution. Next, this attention distribution, itself a vector, is dot-multiplied with the output of the last bidirectional long short-term memory layer to obtain the final feature vector. To map the extracted feature vector into a more easily separable space, this embodiment uses two fully-connected layers with the 'selu' activation function and a dropout strategy to prevent overfitting; the result is finally fed into a fully-connected layer whose number of units equals the number of modulation modes to be distinguished, and a softmax function is applied to obtain the confidence of the IQ signal for each type of modulation mode.
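The classification head described above can be sketched as follows; the hidden widths (32) are placeholders not stated in the embodiment, dropout is omitted because it is only active during training, and the weights are random stand-ins for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

def selu(x):
    # Standard SELU constants (alpha, scale) from the original SELU definition.
    alpha, scale = 1.6732632423543772, 1.0507009873554805
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def head(feat, w1, w2, w_out):
    """Two 'selu' dense layers followed by a softmax output layer with one
    unit per modulation class, as in the embodiment's classification head."""
    h = selu(feat @ w1)
    h = selu(h @ w2)
    return softmax(h @ w_out)            # per-class confidence for the IQ frame

D, K = 64, 11                             # feature size (placeholder), 11 classes
conf = head(rng.standard_normal(D),
            rng.standard_normal((D, 32)),
            rng.standard_normal((32, 32)),
            rng.standard_normal((32, K)))
```

The softmax output is a valid probability vector, so each entry can be read directly as the model's confidence for the corresponding modulation mode.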
Example 6:
As shown in FIGS. 1 to 2, the experiments in this example were designed based on Examples 1 to 5.
In practice, the I/Q signal y (t) received by the receiver can be expressed as
y(t) = s(t) ⊗ h(t) + n(t)
where ⊗ denotes convolution, s(t) represents the modulated signal, h(t) represents the channel impulse response, and n(t) represents additive white Gaussian noise.
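A minimal discrete-time sketch of this signal model: the received samples are the convolution of s with the channel taps h, plus Gaussian noise. The tap values and function name are illustrative assumptions:

```python
import random

def received_signal(s, h, noise_std, seed=0):
    """Discrete sketch of y = s (*) h + n: convolve the modulated
    samples s with the channel impulse response h and add white
    Gaussian noise of standard deviation noise_std."""
    rng = random.Random(seed)
    n_out = len(s) + len(h) - 1
    y = [0.0] * n_out
    for i, si in enumerate(s):
        for j, hj in enumerate(h):
            y[i + j] += si * hj          # convolution term s(t) (*) h(t)
    return [v + rng.gauss(0.0, noise_std) for v in y]  # + n(t)

# Toy example: a 4-sample signal through a 2-tap channel, noiseless
# so the convolution result can be checked directly.
s = [1.0, -1.0, 1.0, 1.0]
h = [1.0, 0.5]
y = received_signal(s, h, noise_std=0.0)
```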
This embodiment uses the publicly available radio dataset RML2016.10A. It contains 11 common modulation types (BPSK, QPSK, 8PSK, 16QAM, 64QAM, BFSK, CPFSK, WB-FM, AM-SSB, AM-DSB and PAM4). Each modulation type covers signal-to-noise ratios from -20 dB to 18 dB in 2 dB steps, with 1000 samples per modulation type at each signal-to-noise ratio. The dataset contains 220000 samples in total, and each IQ signal sample has a size of 2 × 128. During signal generation, in addition to noise, factors such as center frequency offset, sampling rate offset and multipath fading were considered in order to approximate real transmission conditions.
First, the parameters were set. The neural network framework used in this experiment is Keras with TensorFlow as the back end, and training was performed on an Nvidia GeForce RTX 2080. During training, the batch size is set to 64 and the initial learning rate to 0.001, with a step-wise decay strategy that decays the learning rate every 10 epochs. For each modulation type at each signal-to-noise ratio, the data are randomly divided into training, validation and test sets in a 7:1:2 ratio. An early-stopping mechanism is also introduced: if the validation-set accuracy does not improve within 10 epochs, training is stopped. Adam is selected as the optimizer.
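The step-wise decay schedule described above might look like the following sketch; the drop factor of 0.5 is an assumption, since the embodiment only states that the learning rate decays every 10 epochs:

```python
def step_decay_lr(epoch, initial_lr=0.001, drop=0.5, epochs_per_drop=10):
    """Step-wise decay sketch: start at initial_lr and multiply by
    `drop` once every `epochs_per_drop` epochs. The drop factor 0.5
    is an assumed value, not stated in the embodiment."""
    return initial_lr * (drop ** (epoch // epochs_per_drop))

# Learning rate at the start of epochs 0, 10 and 20.
lrs = [step_decay_lr(e) for e in (0, 10, 20)]
```

In Keras such a function is typically passed to the `LearningRateScheduler` callback alongside `EarlyStopping(patience=10)` for the early-stopping mechanism.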
Second, for comparative evaluation, the proposed model was compared with previously popular algorithms named VTCNN2, LSTM2, CLDNN and GRU. VTCNN2, CLDNN and GRU take the raw IQ signal directly as input and perform modulated-signal recognition through carefully designed models, while LSTM2 preprocesses the IQ signal: it extracts the instantaneous amplitude and phase, L2-normalizes the amplitude vector, normalizes the phase vector to the range -1 to 1, and then feeds the result into a long short-term memory network for training. This embodiment also visualizes the attention mechanism introduced into the long short-term memory network of the proposed model, compares the effect of adding attention versus omitting it, and compares results with and without the label smoothing technique. The strengths and weaknesses of the algorithms are then discussed in terms of model complexity and training time.
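The label smoothing technique mentioned here can be sketched as follows; the smoothing factor eps = 0.1 is an assumed value, as the text does not state the one actually used:

```python
def smooth_labels(one_hot, eps=0.1):
    """Label smoothing sketch: replace a one-hot target with
    (1 - eps) * one_hot + eps / K, where K is the number of classes
    (11 modulation types in this embodiment). eps = 0.1 is an
    assumed smoothing factor."""
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]

# One-hot target for class 0 among the 11 modulation types.
targets = smooth_labels([1.0] + [0.0] * 10)
```

Softening the hard 0/1 targets this way discourages the network from becoming over-confident, which is the usual motivation for comparing runs with and without it.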
Finally, by comparing the five algorithms, the experiment leads to the following conclusions: when the signal-to-noise ratio is greater than or equal to -6 dB, the recognition accuracy of the proposed algorithm on modulated signals is clearly superior to that of the other algorithms. Above 4 dB SNR, the recognition accuracy stabilizes at about 92%, reaching a maximum of 92.68% at 16 dB, which demonstrates the stability and reliability of the model. Over the 0 to 18 dB range, the average recognition rate is 91.3%, an improvement in accuracy of roughly 1% to 17% over the other models.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (3)

1. A modulation identification method for multi-level feature extraction based on deep learning is characterized by comprising the following steps:
s1: exploring hidden characteristic information among components of the IQ signal by utilizing one-dimensional convolution;
s2: combining two-dimensional convolution blocks, using the spatial fusion feature extraction module to further extract features from the hidden feature information extracted in step S1 together with the original IQ signal, obtaining high-order spatial fusion features; sending the obtained high-order spatial fusion features to the long short-term memory network of the time sequence feature extraction module, into which an attention mechanism is introduced, and extracting important time sequence features;
s3: integrating the spatial fusion feature extraction module and the time sequence feature extraction module;
s4: complementing the space fusion characteristic information and the time sequence characteristic information to complete the identification of various modulation signals;
the extraction of the spatial fusion features comprises the following steps:
s11: performing batch normalization on the IQ signals;
s12: sending the normalized IQ signals into a one-dimensional convolutional layer and extracting mutual information among the IQ component signals to obtain a first array;
s13: splicing and fusing the original IQ signal and the first array through a Concatenate1 operation to obtain a second array;
s14: sending the second array into two attention residual blocks by using a two-dimensional convolution layer, and extracting characteristic information;
s15: sending the feature information extracted in the step S14 to a batch normalization layer and a two-dimensional convolution layer to realize cross-channel interaction and information integration to obtain spatial features;
s16: performing a splice combination of the original IQ signals and the spatial features through a Concatenate2 operation to obtain the spatial fusion features;
the time sequence feature extraction module consists of two layers of bidirectional long short-term memory networks and a fully connected layer;
the extraction step of the time sequence feature in the time sequence feature extraction module comprises the following steps:
s21: inputting the spatial fusion features into a time sequence feature extraction module;
s22: after passing through the two layers of bidirectional long short-term memory networks, extracting the vector located in the middle of the output of the last bidirectional long short-term memory layer, projecting it with a fully connected layer to serve as the query vector, performing a dot-product operation between this query vector and the outputs of the bidirectional long short-term memory network, and normalizing with a softmax function to obtain the attention probability distribution;
s23: performing a dot-product operation between the attention probability distribution and the output of the last bidirectional long short-term memory layer to obtain the time sequence feature vector.
2. The modulation recognition method for multi-level feature extraction based on deep learning of claim 1, wherein the attention residual block is composed of a standard residual network with a total of three convolutional layers, a batch normalization layer being added between the convolutional layers, and a ReLU activation function being used.
3. The modulation recognition method for multi-level feature extraction based on deep learning of claim 1, wherein an attention mechanism is introduced in both attention residual blocks of step S14, the attention mechanism comprising the following steps:
s141: extracting a feature mapping U;
s142: compressing the extracted feature mapping U into a channel descriptor z ∈ R^C by a global average pooling technique, wherein the c-th element of z is calculated by the following formula:
z_c = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)
each element of z has a global receptive field for a feature map U, where H represents the height of each channel feature map and W represents the width of each channel feature map;
s143: and substituting the channel descriptor z into an excitation part comprising two fully-connected layers, learning the weight of each channel through the dimension transformation of the two fully-connected layers, multiplying the obtained weight by the feature map of the corresponding channel to recalibrate the feature map, and adding the recalibrated feature map and the input second array to obtain feature information.
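Steps S141 to S143 follow the squeeze-and-excitation pattern and can be sketched in plain Python. The toy channel count, weight matrices and function names below are illustrative assumptions; the sketch stops at recalibration and omits the final residual addition to the second array:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_recalibrate(feature_maps, w1, w2):
    """Squeeze-and-excitation sketch: feature_maps is a list of C
    channels, each an H x W grid; w1 (C' x C) and w2 (C x C') are the
    two fully connected layers (toy weights here; the real block
    learns them during training)."""
    # Squeeze: global average pooling -> channel descriptor z in R^C.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_maps]
    # Excitation: two FC layers, ReLU between them, sigmoid gating.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    gates = [sigmoid(sum(w * hi for w, hi in zip(row, hidden))) for row in w2]
    # Recalibrate: scale each channel's feature map by its gate weight.
    return [[[g * v for v in row] for row in ch]
            for g, ch in zip(gates, feature_maps)]

# Toy example: 2 channels of 2 x 2 features, one hidden unit.
fm = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, 1.0]]
w2 = [[2.0], [0.0]]
out = se_recalibrate(fm, w1, w2)
```

Each element of z indeed has a global receptive field over its channel, since the average runs over all H × W positions.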
CN202110311297.3A 2021-03-24 2021-03-24 Modulation recognition method for multi-level feature extraction based on deep learning Expired - Fee Related CN112702294B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110311297.3A CN112702294B (en) 2021-03-24 2021-03-24 Modulation recognition method for multi-level feature extraction based on deep learning

Publications (2)

Publication Number Publication Date
CN112702294A CN112702294A (en) 2021-04-23
CN112702294B true CN112702294B (en) 2021-06-22

Family

ID=75515555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110311297.3A Expired - Fee Related CN112702294B (en) 2021-03-24 2021-03-24 Modulation recognition method for multi-level feature extraction based on deep learning

Country Status (1)

Country Link
CN (1) CN112702294B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343801B (en) * 2021-05-26 2022-09-30 郑州大学 Automatic wireless signal modulation and identification method based on lightweight convolutional neural network
CN113392731B (en) * 2021-05-31 2023-06-23 浙江工业大学 Modulation signal classification method and system based on graph neural network
CN113298031B (en) * 2021-06-16 2023-06-02 中国人民解放军国防科技大学 Signal modulation identification method and system considering signal physical and time sequence characteristics
CN113938290B (en) * 2021-09-03 2022-11-11 华中科技大学 Website de-anonymization method and system for user side flow data analysis
CN114051273B (en) * 2021-11-08 2023-10-13 南京大学 Large-scale network dynamic self-adaptive path planning method based on deep learning
CN114465855B (en) * 2022-01-17 2023-09-01 武汉理工大学 Automatic modulation recognition method based on attention mechanism and multi-feature fusion
CN115604061B (en) * 2022-08-30 2024-04-09 电子科技大学 Radio frequency signal modulation mode identification method based on external attention mechanism
CN116488974B (en) * 2023-03-20 2023-10-20 中国人民解放军战略支援部队航天工程大学 Light modulation identification method and system combined with attention mechanism
CN117131416B (en) * 2023-08-21 2024-06-04 四川轻化工大学 Small sample modulation identification method, system, electronic equipment and storage medium
CN117281528A (en) * 2023-11-27 2023-12-26 山东锋士信息技术有限公司 Multi-lead pulse signal intelligent identification method and system based on deep learning

Citations (4)

Publication number Priority date Publication date Assignee Title
CN109389091A (en) * 2018-10-22 2019-02-26 重庆邮电大学 The character identification system and method combined based on neural network and attention mechanism
CN112069883A (en) * 2020-07-28 2020-12-11 浙江工业大学 Deep learning signal classification method fusing one-dimensional and two-dimensional convolutional neural network
CN112241724A (en) * 2020-10-30 2021-01-19 南京信息工程大学滨江学院 Automatic identification method and system based on double-path convolution long-term and short-term neural network
CN112308133A (en) * 2020-10-29 2021-02-02 成都明杰科技有限公司 Modulation identification method based on convolutional neural network

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US10423874B2 (en) * 2015-10-02 2019-09-24 Baidu Usa Llc Intelligent image captioning
US10944440B2 (en) * 2018-04-11 2021-03-09 Booz Allen Hamilton Inc. System and method of processing a radio frequency signal with a neural network
CN110598677B (en) * 2019-10-08 2021-01-26 电子科技大学 Space-time multi-channel deep learning system for automatic modulation recognition
CN111832417B (en) * 2020-06-16 2023-09-15 杭州电子科技大学 Signal modulation pattern recognition method based on CNN-LSTM model and transfer learning
CN111985327A (en) * 2020-07-16 2020-11-24 浙江工业大学 Signal deep learning classification method based on sliding trainable operator
CN112511477A (en) * 2020-11-16 2021-03-16 南京融星智联信息技术有限公司 Hybrid satellite communication modulation identification method and system based on constellation diagram and deep learning


Also Published As

Publication number Publication date
CN112702294A (en) 2021-04-23

Similar Documents

Publication Publication Date Title
CN112702294B (en) Modulation recognition method for multi-level feature extraction based on deep learning
CN108234370B (en) Communication signal modulation mode identification method based on convolutional neural network
CN110855591B (en) QAM and PSK signal intra-class modulation classification method based on convolutional neural network structure
CN107038421A (en) Modulation Types recognition methods based on sparse storehouse own coding
CN110598530A (en) Small sample radio signal enhanced identification method based on ACGAN
CN113141325B (en) Training method, identification method and device for optical OFDM signal subcarrier modulation format identification model
CN113014524B (en) Digital signal modulation identification method based on deep learning
CN114422311B (en) Signal modulation recognition method and system combining deep neural network and expert priori features
CN112115821B (en) Multi-signal intelligent modulation mode identification method based on wavelet approximate coefficient entropy
CN110166387A (en) A kind of method and system based on convolutional neural networks identification signal modulation system
CN113723556B (en) Modulation mode identification method based on entropy weighting-multi-mode domain antagonistic neural network
CN113378644B (en) Method for defending signal modulation type recognition attack based on generation type countermeasure network
CN114896887A (en) Frequency-using equipment radio frequency fingerprint identification method based on deep learning
WO2021088465A1 (en) Fast modulation recognition method using multilayer perceptron, and employing data fusion of multiple distribution tests
CN116257750A (en) Radio frequency fingerprint identification method based on sample enhancement and deep learning
CN114980122A (en) Small sample radio frequency fingerprint intelligent identification system and method
CN116628566A (en) Communication signal modulation classification method based on aggregated residual transformation network
CN114548201B (en) Automatic modulation identification method and device for wireless signal, storage medium and equipment
CN114615118A (en) Modulation identification method based on multi-terminal convolution neural network
CN113259289A (en) Single-channel aliasing signal modulation mode identification method based on residual error neural network
CN116150603A (en) Complex modulation mode identification method based on multi-scale feature fusion
CN116680608A (en) Signal modulation identification method based on complex graph convolutional neural network
CN115409056A (en) Automatic modulation identification method for large dynamic signal-to-noise ratio
CN115955375A (en) Modulated signal identification method and system based on CNN-GRU and CA-VGG feature fusion
CN115834310A (en) Communication signal modulation identification method based on LGTransformer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210622