CN112866156A - Radio signal clustering method and system based on deep learning - Google Patents
- Publication number
- CN112866156A (application number CN202110053844.2A)
- Authority
- CN
- China
- Prior art keywords
- sample
- training
- matrix
- data set
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L27/00—Modulated-carrier systems
- H04L27/0012—Modulated-carrier systems arrangements for identifying the type of modulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
A radio signal clustering method based on deep learning, comprising: S1, adjusting the sample data size of the second data set to be consistent with that of the first data set, then dividing the sample data of both data sets into batches containing the same number of samples; S2, constructing a depth model and inputting the adjusted batches of the second data set into it for pre-training until training is stable; S3, clustering training; and S4, outputting the clustering result. The invention also comprises a radio signal clustering system based on deep learning. The invention builds a neural network combining one-dimensional and two-dimensional convolution and pre-trains the depth model on a modulation signal data set, so that the model tends to extract features related to the modulation type and, in subsequent clustering, is guided to cluster samples by modulation type.
Description
Technical Field
The invention relates to the field of unsupervised learning of radio signals, and in particular to a radio signal clustering method and system based on deep learning.
Background
Radio refers to electromagnetic waves propagating in free space. Current radio communication technology converts information such as voice, text, data, and images into electric signals, loads these signals onto radio waves through modulation, transmits the waves through space, and recovers the carried signals at a receiving end. Transmitting information by radio signal remains the mainstream mode of modern information transmission. After a radio signal is received, its specific modulation type must be identified before the corresponding demodulation can be carried out to extract the information the signal carries. People often spend a great deal of time and energy collecting data samples of known classes and training a classification algorithm on them to form a clear decision boundary for modulation recognition. The typical challenge in signal identification is that, in the various real environments, labeled data is difficult to obtain: the sample set is limited, a reliable database cannot be established, and when neither class information nor prior knowledge of the samples is available, modulation types cannot be distinguished. Clustering algorithms, by contrast, require no class information, statistical distribution, or prior information about the samples before cluster analysis; even the number of classes may be unknown. If radio signals can be clustered by modulation type, and a small number of samples from each cluster then labeled manually, the workload will be greatly reduced and work efficiency improved.
The paper "Deep Adaptive Image Clustering", published at the IEEE International Conference on Computer Vision in 2017, proposes a deep adaptive clustering method (DAC) based on binary classification of image data. The method pairs samples and converts the multi-class clustering task into a binary problem by judging whether the two samples of a pair are similar. It first uses an initialized neural network to extract sample features, represents the pairwise similarity with the cosine distance, then selects the pairs with the highest and lowest similarities as relatively reliable samples and trains the depth model on them. This iterative training is repeated, with the upper and lower similarity thresholds gradually approaching each other so that all samples are eventually judged. However, the iterative training of deep adaptive clustering presupposes that the depth model can make a preliminary judgment on the samples from the start and preliminarily capture the differences between classes; only then are useful reliable samples selected in the first iteration, guiding the model to train in the correct direction so that samples of the same class gradually draw together and samples of different classes gradually move apart. In the radio field, the initial inter-class differences between signals of different modulation types are very small, and the two-dimensional convolutional network used by this method cannot make a preliminary judgment on radio signal samples, so the model cannot iterate correctly.
In binary-classification experiments, because the depth model has no judgment capability on the samples, the feature expressions of all samples are eventually guided to become the same, i.e. all samples are judged to belong to a single class.
There are papers: the technical scheme disclosed in the patent with the application number of CN201810884693.3 is a method for classifying various radar and communication signals based on clustering analysis, and the method extracts the characteristics of the mean square error, the instantaneous autocorrelation characteristic, the instantaneous amplitude mean square error, the phase mean square error and the like of the maximum frequency of a sample according to the characteristics of radio signals, and then clusters different modulation signals by using a K-means algorithm according to the characteristics. In the design of the method, the characteristics of different modulation modes are fully utilized, and the characteristics which can distinguish the modulation modes are extracted. Therefore, this method is only suitable for distinguishing single-frequency signals SF and BPSK, QPSK, 16QAM signals, and cannot distinguish more types of radio signals.
Disclosure of Invention
The invention aims to provide a radio signal clustering method and system based on deep learning that solve the problems in the prior art, so that more radio signal types can be distinguished and radio signals are clustered according to modulation type.
In order to achieve the purpose, the invention provides the following scheme:
the invention provides a radio signal clustering method based on deep learning, which comprises the following steps:
assuming that the target signal data set to be clustered is the first data set and that an existing labeled signal data set with a similar target is the second data set, the two data sets being otherwise unrelated;
s1, adjusting the sizes of the sample data of the second data set and the first data set to be consistent, and then dividing the sample data of the two adjusted data sets into a plurality of batches, wherein each batch contains the same amount of sample data;
S2, constructing a depth model CNN and inputting the adjusted batches of sample data from the second data set into it for pre-training until model training is stable; the specific steps are as follows:
s2.1, inputting a batch of sample data in the adjusted second data set into a randomly initialized depth model CNN, and extracting a sample characteristic matrix;
s2.2, calculating a cosine similarity matrix among a batch of sample data of the adjusted second data set according to the sample feature matrix;
s2.3, calculating a loss function according to the real label and the cosine similarity matrix, training a model by using the loss function, and updating parameters of the model;
s2.4, repeating S2.1-S2.3 until the model training is stable, stopping training, saving the parameters of the model, and saving the depth model;
S3, inputting the adjusted sample data of the first data set in batches into the depth model saved in S2.4 for clustering, selecting the samples with higher confidence as reliable samples, training the depth model with them, and iterating multiple times until the model is stable;
and S4, outputting the clustering result of the target signals to be clustered.
Further, the adjustment in S1 is as follows: if the sample data length of the second data set is smaller than that of the first data set, the sample data of the second data set is extended by whole-copy tiling; if the sample data length of the second data set is larger than that of the first data set, the sample data of the second data set is shortened by extracting points at regular intervals.
Furthermore, the depth model CNN consists, in order, of a one-dimensional convolutional layer, a two-dimensional convolutional layer, two further one-dimensional convolutional layers, a flatten operation that converts the feature maps to a vector, and two fully connected layers.
Further, the S2.1 includes:
1) extracting features from the I and Q signal paths separately using the one-dimensional convolutional layer to obtain the preliminary features of the two paths;
2) integrating the preliminary features of the two paths with the two-dimensional convolutional layer to extract preliminary features of the signal;
3) further processing the preliminary signal features with two one-dimensional convolutional layers and two fully connected layers to extract the deep features of the signal.
Further, S2.3 includes: converting the real labels of the samples into one-hot vectors and calculating the homogeneous label matrix and the heterogeneous label matrix from the real binary pairwise judgment matrix of the samples.
Further, the loss function guiding the pre-training phase in S2.3 is:

L_B = -λ Σ_{i,j} P_Bp[i][j] log S_B[i][j] - Σ_{i,j} P_Bn[i][j] log(1 - S_B[i][j])

In the pre-training phase λ < 1, so the model tends to judge sample pairs as heterogeneous, separating the samples and searching for the feature differences between modulation types. Here λ is a balance parameter that coordinates the proportions of the homogeneous and heterogeneous losses and adjusts the training tendency of the model; its value in the pre-training phase is 0.1. L_B is the loss function of the pre-training phase, P_Bp the homogeneous label matrix, P_Bn the heterogeneous label matrix, and S_B the cosine similarity matrix.
Further, the specific step of S3 is:
s3.1, inputting a batch of sample data in the adjusted first data set into the depth model stored in S2.4, and extracting a sample characteristic matrix;
s3.2, calculating a sample similarity matrix according to the sample feature matrix, setting an upper threshold and a lower threshold, selecting reliable samples according to the upper threshold, the lower threshold and the sample similarity matrix, calculating a loss function according to the reliable samples, training a depth model, and updating parameters of the depth model;
and S3.3, repeating S3.1-S3.2 until the CNN training of the depth model is stable, stopping training, storing parameters of the CNN depth model, and storing the depth model.
Further, said S3.2 comprises:
S3.2.1, comparing the similarity matrix with the upper threshold, selecting the sample pairs with higher similarity, preliminarily judging them as homogeneous, and obtaining the soft-assigned homogeneous label matrix;
S3.2.2, comparing the similarity matrix with the lower threshold, selecting the sample pairs with lower similarity, preliminarily judging them as heterogeneous, and obtaining the soft-assigned heterogeneous label matrix;
S3.2.3, the higher-similarity and lower-similarity pairs together form the reliable samples; the depth model CNN is retrained on these reliable samples, with the soft-assigned homogeneous and heterogeneous label matrices replacing the real sample labels in the loss function, and the parameters of the depth model are updated according to that loss.
Further, the loss function of the clustering training phase is:

L_O = -λ Σ_{i,j} P_Op[i][j] log S_O[i][j] - Σ_{i,j} P_On[i][j] log(1 - S_O[i][j])

In the clustering phase λ > 1, so the model tends to judge sample pairs as homogeneous, aggregating the samples so that samples with similar modulation-type features can gather together; the balance parameter λ takes the value 100 in this phase. L_O is the loss function of the clustering training phase, P_Op the soft-assigned homogeneous label matrix, P_On the soft-assigned heterogeneous label matrix, and S_O the sample similarity matrix.
A deep learning based radio signal clustering system, comprising: the device comprises a preprocessing module, a pre-training module, a clustering module and an output module;
the preprocessing module inputs a first data set and a second data set, adjusts the size of sample data of the second data set to be consistent with that of the sample data of the first data set, and then divides the sample data of the two adjusted data sets into a plurality of batches, wherein each batch contains the same amount of sample data;
the pre-training module inputs sample data of the adjusted second data set batch in the pre-processing module into the pre-training module, pre-trains the depth model, and stores the depth model after multiple iterations until the depth model is stably trained;
the clustering module loads the depth model stored by the pre-training module, sets an upper limit threshold and a lower limit threshold, inputs a batch of sample data in the adjusted first data set into the stored depth model for clustering training, performs multiple iterations until the depth model is stably trained, and stores the depth model;
the output module is used for inputting the target data set into the depth model obtained by the clustering module and outputting the clustering result of the target signal to be clustered;
the preprocessing module, the pre-training module, the clustering module and the output module are connected in sequence.
The invention has the following positive effects:
1. the invention does not need to use various prior knowledge of signal data, and has better universality on signal clustering.
2. The depth model is pre-trained with supervision on a data set from the same domain. Supervised pre-training on an existing modulation signal data set gives the depth model a certain feature-extraction capability on signal data, so that it can make a preliminary judgment on whether the two samples of a pair belong to the same class. Because the depth model already has this preliminary judgment capability, the similarities it computes at the start of clustering are reliable, ensuring that truly similar samples are gathered together and that the model is guided to train in a broadly correct direction. Pre-training on a modulation signal data set also inclines the depth model toward extracting features related to the modulation type, so that in subsequent clustering the model is guided to cluster samples by modulation type.
3. The loss function consists of two parts: the homogeneous loss and the heterogeneous loss, i.e. the loss borne by two samples judged to be of the same class and the loss borne by two samples judged to be of different classes. The balance parameter λ coordinates the weight of the two parts and adjusts the training tendency of the model: during pre-training it makes the model tend to separate samples and search for the feature differences between modulation types; during clustering learning it makes the model tend to aggregate samples, so that samples with similar modulation-type features gather together. Together, the two loss functions train the depth model effectively.
4. The invention builds a neural network combining one-dimensional convolution and two-dimensional convolution, respectively extracts the characteristics of two paths of signal data by using the one-dimensional convolution, integrates the information of the two paths of signals by using the two-dimensional convolution, further integrates the information by using the one-dimensional convolution and a full connection layer, and finally outputs the probability distribution of samples on various types through a softmax layer.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
FIG. 1 is a flow chart of a clustering method;
FIG. 2 is a diagram of a depth model CNN structure;
FIG. 3 is a flow chart of a pre-training process;
fig. 4 is a flow chart of a clustering process.
Detailed Description
Reference will now be made in detail to various exemplary embodiments of the invention, the detailed description should not be construed as limiting the invention but as a more detailed description of certain aspects, features and embodiments of the invention.
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. Further, for numerical ranges in this disclosure, it is understood that each intervening value, between the upper and lower limit of that range, is also specifically disclosed. Every smaller range between any stated value or intervening value in a stated range and any other stated or intervening value in a stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included or excluded in the range.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although only preferred methods and materials are described herein, any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention. All documents mentioned in this specification are incorporated by reference herein for the purpose of disclosing and describing the methods and/or materials associated with the documents. In case of conflict with any incorporated document, the present specification will control.
It will be apparent to those skilled in the art that various modifications and variations can be made in the specific embodiments of the present disclosure without departing from the scope or spirit of the disclosure. Other embodiments will be apparent to those skilled in the art from consideration of the specification. The specification and examples are exemplary only.
As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended terms that mean including, but not limited to.
The "parts" in the present invention are all parts by mass unless otherwise specified.
Existing signal clustering methods lack universality and impose strict requirements on information about the target samples, while the many deep clustering methods from the image field often fail to achieve satisfactory results on signal data because a freshly initialized neural network has poor capability for preliminary judgment of signals. The method therefore provides a novel deep clustering framework combining pre-training with deep clustering; the framework flow is shown in figure 1. After the data is preliminarily processed, a pre-training mechanism gives the neural network a certain feature-extraction capability on signal data; deep clustering training is then performed on the signal data, and the sample labels are finally output. Because no prior knowledge of the signal data is required, the method has good universality for signal clustering.
A radio signal clustering method based on deep learning specifically comprises the following steps:
Assume the first data set is O and the second data set is B. The target data set O to be clustered contains k classes; data set B is an existing labeled signal data set, also of k classes. Apart from this, O has no relation to B. The modulation signal data used are stored as two-way IQ data.
S1, adjusting the sizes of the sample data of the data set B and the data set O to be consistent, and then randomly dividing the sample data of the two data sets into different batches, wherein each batch contains m samples;
s2, constructing a depth model CNN, inputting the batch sample data of the data set B into the depth model CNN for pre-training, wherein the specific flow is shown in FIG. 3, and the specific steps are as follows:
S2.1, inputting a batch of sample data of data set B into the randomly initialized depth model CNN and extracting the sample feature matrix F_B = [f_1, f_2, ..., f_m], where f_i denotes the feature of the i-th sample in the batch and has dimension k;
S2.2, calculating the cosine similarity matrix S_B between the sample data according to the sample features;
S2.3, calculating a loss function according to the real label and the cosine similarity matrix, and updating model parameters;
s2.4, repeating S2.1-S2.3 until the model training is stable, stopping training, saving model parameters and saving a depth model;
S3, clustering training, the specific flow of which is shown in figure 4;
S3.1, inputting the sample data of the target signal data set O to be clustered into the depth model saved in S2.4 in batches, and extracting the sample feature matrix F_O;
S3.2, calculating cosine similarity among samples, setting an upper threshold and a lower threshold, selecting reliable samples according to the upper threshold, the lower threshold and a sample similarity matrix, calculating a loss function according to the reliable samples, training a depth model, and updating parameters of the model;
and S3.3, repeating S3.1-S3.2 until the model training is stable, stopping training, storing the parameters of the model, and storing the depth model after the parameters are updated.
S4, outputting the clustering result
In step S1, when the length of the sample data in B is smaller than the sample length in O, the samples in B are extended by whole-copy tiling; when it is larger, the samples in B are shortened by extracting points at regular intervals. This preserves the structural characteristics of the samples as far as possible. Since combining all samples pairwise would involve an enormous workload, the samples are divided into batches, and the m samples of a batch are input into the depth model simultaneously.
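The length-matching rule above (whole-copy extension when a sample is too short, evenly spaced extraction when it is too long) can be sketched as follows; the function name and the (2, L) IQ array shape are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def match_length(sample: np.ndarray, target_len: int) -> np.ndarray:
    """Adjust one IQ sample of shape (2, L) to target_len along the time axis.

    Shorter samples are extended by tiling whole copies; longer samples are
    shortened by taking evenly spaced points, so the overall waveform
    structure is disturbed as little as possible.
    """
    L = sample.shape[1]
    if L < target_len:
        reps = -(-target_len // L)                      # ceiling division
        return np.tile(sample, reps)[:, :target_len]    # tile, then trim
    if L > target_len:
        idx = np.linspace(0, L - 1, target_len).round().astype(int)
        return sample[:, idx]                           # evenly spaced picks
    return sample
```

Tiling keeps the original waveform as an unbroken prefix, while interval extraction keeps samples spread over the whole duration rather than truncating the tail.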
In step S2.1, the depth model used is a convolutional neural network (CNN) composed of convolutional layers and fully connected layers; its specific structure is shown in figure 2. A one-dimensional convolutional layer first performs preliminary feature extraction on the I and Q paths separately; a two-dimensional convolutional layer then integrates the information of the two paths; two further one-dimensional convolutional layers and two fully connected layers process the preliminarily integrated information to extract deep features. The m samples of a batch are input into the depth model simultaneously, and the output is the sample feature matrix F_B = [f_1, f_2, ..., f_m].
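The layer order above (one 1-D conv applied to each path, a 2-D conv merging the paths, two further 1-D convs, a flatten, two fully connected layers, softmax output) can be sketched in PyTorch. The kernel sizes, channel counts, and hidden width below are illustrative assumptions, since the text specifies only the layer order:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SignalCNN(nn.Module):
    """Sketch of the depth model: IQ input of shape (batch, 2, L),
    k-dimensional softmax feature output. Layer sizes are assumptions."""

    def __init__(self, length: int = 128, k: int = 5):
        super().__init__()
        self.conv_path = nn.Conv1d(1, 16, kernel_size=7, padding=3)   # per-path 1-D conv
        self.conv2d = nn.Conv2d(16, 32, kernel_size=(2, 5), padding=(0, 2))  # merges I and Q
        self.conv1d_b = nn.Conv1d(32, 32, kernel_size=5, padding=2)
        self.conv1d_c = nn.Conv1d(32, 32, kernel_size=5, padding=2)
        self.fc1 = nn.Linear(32 * length, 128)
        self.fc2 = nn.Linear(128, k)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        i = F.relu(self.conv_path(x[:, 0:1, :]))   # I path -> (batch, 16, L)
        q = F.relu(self.conv_path(x[:, 1:2, :]))   # Q path -> (batch, 16, L)
        z = torch.stack([i, q], dim=2)             # (batch, 16, 2, L)
        z = F.relu(self.conv2d(z)).squeeze(2)      # 2-D conv collapses the pair -> (batch, 32, L)
        z = F.relu(self.conv1d_b(z))
        z = F.relu(self.conv1d_c(z))
        z = F.relu(self.fc1(z.flatten(1)))         # flatten, then fully connected
        return F.softmax(self.fc2(z), dim=1)       # probability-like k-dim feature
```

The softmax output keeps every feature element in (0, 1), matching the later observation that the trained features approach one-hot vectors.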
In step S2.2, the cosine similarity between the i-th and j-th samples of the same batch is calculated from the sample features obtained in step S2.1:

sim(x_i, x_j) = (f_i · f_j) / (‖f_i‖ ‖f_j‖)

To simplify the calculation, the modular length of the features output by S2.1 is uniformly normalized to 1, so the sample similarity formula reduces to:

sim(x_i, x_j) = f_i · f_j (4)

The sample similarity matrix S_B therefore has, in row i and column j, the similarity of the i-th and j-th samples.
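With the features normalized to unit modulus, the whole similarity matrix of Eq. (4) is a single matrix product; a minimal sketch (function name and shapes are illustrative):

```python
import numpy as np

def similarity_matrix(feats: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity for a feature matrix of shape (m, k).

    After each row is normalized to unit length, cosine similarity
    reduces to the inner product sim(x_i, x_j) = f_i . f_j, so the
    full matrix is simply F_hat @ F_hat.T.
    """
    fh = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return fh @ fh.T
```

The diagonal is always 1 (each sample is identical to itself), and orthogonal features yield similarity 0.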
In step S2.3, when data set B is used to pre-train the model, the real labels of the samples are used during training to ensure efficiency. The sample labels are converted into one-hot vectors, i.e. y_i ∈ {0, 1}^k, and the real binary pairwise judgment matrix of the samples is calculated as P_B = Y Y^T, where Y stacks the one-hot label vectors row by row.
P_B is a Boolean matrix whose entry p_Bij indicates whether the i-th and j-th samples belong to the same class: 1 for the same class, 0 for different classes. Counting the pairs judged homogeneous and the pairs judged heterogeneous gives the homogeneous label matrix P_Bp = P_B (1 for the same class, 0 for different classes) and the heterogeneous label matrix P_Bn = 1 - P_B (0 for the same class, 1 for different classes). The loss function guiding model updates in the pre-training phase is then calculated:

L_B = -λ Σ_{i,j} P_Bp[i][j] log S_B[i][j] - Σ_{i,j} P_Bn[i][j] log(1 - S_B[i][j])

where λ is a balance parameter that coordinates the proportions of the homogeneous and heterogeneous losses and adjusts the training tendency of the model. In the pre-training phase λ < 1: the model tends to judge sample pairs as heterogeneous, separating the samples and searching for the feature differences between modulation types. Experiments show the method is not very sensitive to λ, so values of different orders of magnitude, such as 0.01, 0.1, and 1, can be tried and a suitable value selected; here 0.1 is chosen.
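The label matrices and the pre-training loss can be computed compactly. Because the source does not print the exact loss formula, the standard pairwise cross-entropy form (λ-weighted homogeneous term plus heterogeneous term) is assumed here:

```python
import numpy as np

def pretrain_loss(labels: np.ndarray, S: np.ndarray, lam: float = 0.1) -> float:
    """Pairwise pre-training loss (a sketch under the assumed form).

    P_p marks same-class pairs (derived from the real labels), and
    P_n = 1 - P_p marks different-class pairs; the loss rewards high
    similarity for same-class pairs and low similarity otherwise.
    """
    P_p = (labels[:, None] == labels[None, :]).astype(float)  # homogeneous marks
    P_n = 1.0 - P_p                                           # heterogeneous marks
    S = np.clip(S, 1e-7, 1 - 1e-7)                            # numerical safety
    return float(-(lam * P_p * np.log(S) + P_n * np.log(1.0 - S)).mean())
```

A similarity matrix that already separates the classes should score lower than an uninformative one, which gives a quick sanity check on the implementation.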
In step S2.4, steps S2.1-S2.3 are repeated; after repeated iterative training the model stabilizes, training stops, the parameters of the CNN are saved, and the updated depth model is stored.
In step S3.1, the data set O to be clustered is divided into batches of m samples each; the samples are input into the CNN batch by batch, and the sample features F_O are output.
In step S3.2, the sample similarity matrix S_O is calculated from the sample features F_O. The similarity matrix is compared with an upper threshold u: sample pairs with higher similarity are selected and preliminarily judged as homogeneous, yielding the soft-assigned homogeneous label matrix P_Op.
Similarly, the similarity matrix is compared with a lower threshold l: sample pairs with lower similarity are selected and preliminarily judged as heterogeneous, yielding the soft-assigned heterogeneous label matrix P_On.
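The threshold-based selection of reliable pairs can be sketched directly from the similarity matrix; names and the returned mask are illustrative:

```python
import numpy as np

def select_reliable_pairs(S: np.ndarray, upper: float, lower: float):
    """Select reliable sample pairs from a similarity matrix (a sketch).

    Pairs above the upper threshold u receive a soft homogeneous mark;
    pairs below the lower threshold l receive a soft heterogeneous mark;
    pairs between the thresholds are left out of this iteration's loss.
    """
    pos = (S > upper).astype(float)   # soft-assigned homogeneous marks P_Op
    neg = (S < lower).astype(float)   # soft-assigned heterogeneous marks P_On
    mask = pos + neg                  # which pairs count as reliable
    return pos, neg, mask
```

As training progresses, u is lowered and l is raised, so the undecided band shrinks and eventually every pair is judged.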
The CNN is then retrained on the selected reliable samples, training the depth model and updating its parameters. With P_Op and P_On replacing the real sample labels, the loss function of the clustering training phase is:

L_O = -λ Σ_{i,j} P_Op[i][j] log S_O[i][j] - Σ_{i,j} P_On[i][j] log(1 - S_O[i][j])

In the clustering phase λ > 1: the model tends to judge sample pairs as homogeneous, aggregating the samples so that samples with similar modulation-type features gather together; here λ = 100 is chosen. L_O is the loss function of the clustering training phase, P_Op the soft-assigned homogeneous label matrix, P_On the soft-assigned heterogeneous label matrix, and S_O the sample similarity matrix.
And S3.3, repeating S3.1-S3.2 until the model training is stable, stopping training, storing the parameters of the model, and storing the depth model after the parameters are updated.
In step S4, since the similarity between samples is sim(x_i, x_j) = f_i · f_j, the feature vectors of samples from different classes tend toward mutual orthogonality during training. The feature vector has dimension k, equal to the number of classes, and after normalization its elements are limited between 0 and 1. In the k-dimensional space, only the standard basis vectors satisfy these conditions, so as training progresses the output features tend toward one-hot vectors. The output sample features in effect also represent the probability distribution of the sample over the classes, so the index of the maximum component of the feature vector can be used directly as the sample's cluster label.
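Reading the cluster labels off the trained features is then a single argmax per row; a minimal sketch:

```python
import numpy as np

def cluster_labels(feats: np.ndarray) -> np.ndarray:
    """Assign each sample the index of its largest feature component.

    Because training drives the features of different clusters toward
    mutually orthogonal, one-hot-like vectors, the argmax over the
    k-dimensional feature serves directly as the cluster label.
    """
    return feats.argmax(axis=1)
```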
A deep learning based radio signal clustering system, comprising: the device comprises a preprocessing module, a pre-training module, a clustering module and an output module;
the preprocessing module inputs a first data set and a second data set, adjusts the size of sample data of the second data set to be consistent with that of the sample data of the first data set, and then divides the sample data of the two adjusted data sets into a plurality of batches, wherein each batch contains the same amount of sample data;
the pre-training module inputs the batched sample data of the adjusted second data set from the preprocessing module, pre-trains the depth model, and saves the depth model after multiple iterations once training is stable; the depth model CNN consists, in order, of a one-dimensional convolution layer, a two-dimensional convolution layer, two further one-dimensional convolution layers, a flattening (vector-reshaping) operation, and two fully connected layers; the method specifically comprises the following steps:
s2.1, inputting a batch of sample data in the adjusted second data set into a randomly initialized depth model CNN, and extracting a sample characteristic matrix; the method specifically comprises the following steps:
1) extracting features from the I and Q signal paths separately with the one-dimensional convolution layer to obtain the primary features of the I and Q paths;
2) integrating the primary I and Q features with the two-dimensional convolution layer to extract the preliminary features of the signal;
3) further processing the preliminary signal features with two one-dimensional convolution layers and two fully connected layers to extract the deep features of the signal;
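The layer ordering above (a 1-D convolution per I/Q path, a 2-D convolution merging the paths, two further 1-D convolutions, a flattening step, and two fully connected layers) can be traced with a toy single-channel NumPy forward pass. All kernel sizes, channel counts, and the output dimension k = 10 are assumptions, since the patent fixes only the layer order:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w):
    """'Valid' 1-D convolution of a single-channel signal x with kernel w."""
    k = len(w)
    return np.array([x[i:i + k] @ w for i in range(len(x) - k + 1)])

def relu(x):
    return np.maximum(x, 0.0)

i_path, q_path = rng.normal(size=(2, 64))   # one IQ sample, length 64 (assumed)
w1 = rng.normal(size=5)                     # shared 1-D kernel (assumed size)

fi = relu(conv1d(i_path, w1))               # primary I features
fq = relu(conv1d(q_path, w1))               # primary Q features

w2 = rng.normal(size=(2, 3))                # 2-D kernel spanning both paths
merged = relu(conv1d(fi, w2[0]) + conv1d(fq, w2[1]))  # integrate the two paths

# two further 1-D convolutions extract deep features
deep = relu(conv1d(relu(conv1d(merged, rng.normal(size=3))), rng.normal(size=3)))

flat = deep.reshape(-1)                     # flatten ("turn to vector")
W_fc1 = rng.normal(size=(32, flat.size))    # two fully connected layers
W_fc2 = rng.normal(size=(10, 32))           # k = 10 cluster outputs (assumed)
feature = W_fc2 @ relu(W_fc1 @ flat)
```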
s2.2, calculating a cosine similarity matrix among a batch of sample data of the adjusted second data set according to the sample feature matrix;
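The cosine similarity matrix of s2.2 can be computed by L2-normalizing the rows of the feature matrix and taking their inner products (a minimal NumPy sketch with illustrative feature values):

```python
import numpy as np

def cosine_similarity_matrix(F, eps=1e-12):
    """Pairwise cosine similarity between the rows (sample features) of F."""
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    Fn = F / np.maximum(norms, eps)   # unit-normalize each feature vector
    return Fn @ Fn.T                  # dot products of unit vectors

F = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 0.0]])
S = cosine_similarity_matrix(F)
```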
s2.3, calculating a loss function according to the real label and the cosine similarity matrix, training a model by using the loss function, and updating parameters of the model; the method specifically comprises the following steps:
converting the true label of each sample into a one-hot vector, and computing the homogeneous label matrix and the heterogeneous label matrix from the samples' true binary pair-judgment matrix;
the loss function of the guided pre-training stage is:

L_B = −Σ_{i,j} [ λ · PBp(i,j) · log S_B(i,j) + PBn(i,j) · log(1 − S_B(i,j)) ]

In the pre-training stage λ < 1, so the model tends to judge two samples as heterogeneous and pushes samples apart, searching out the feature differences between modulation types. Here λ is a balance parameter that coordinates the proportions of the homogeneous and heterogeneous losses and thereby adjusts the training tendency of the model; in the pre-training stage λ takes the value 0.1. L_B is the loss function of the pre-training stage, PBp is the homogeneous label matrix, PBn is the heterogeneous label matrix, and S_B is the cosine similarity matrix;
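The label matrices used by this loss can be built from the true labels as s2.3 describes: one-hot encode the labels, then the pairwise product gives the homogeneous matrix and its complement the heterogeneous matrix (a NumPy sketch; the function name is illustrative):

```python
import numpy as np

def label_matrices(labels, num_classes):
    """Build homogeneous / heterogeneous pair-label matrices from true
    labels via one-hot vectors: P_p[i, j] = 1 iff the labels match."""
    one_hot = np.eye(num_classes)[labels]
    P_p = one_hot @ one_hot.T     # 1 where the pair shares a class
    P_n = 1.0 - P_p               # complementary heterogeneous matrix
    return P_p, P_n

labels = np.array([0, 1, 0])
P_p, P_n = label_matrices(labels, num_classes=2)
```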
s2.4, repeating S2.1-S2.3 until the model training is stable, stopping training, saving the parameters of the model, and saving the depth model;
the clustering module loads the depth model stored by the pre-training module, sets an upper limit threshold and a lower limit threshold, inputs a batch of sample data in the adjusted first data set into the stored depth model for clustering training, performs multiple iterations until the depth model is stably trained, and stores the depth model; the method specifically comprises the following steps:
s3.1, inputting a batch of sample data in the adjusted first data set into the depth model stored in S2.4, and extracting a sample characteristic matrix;
s3.2, calculating a sample similarity matrix according to the sample feature matrix, setting an upper threshold and a lower threshold, selecting reliable samples according to the upper threshold, the lower threshold and the sample similarity matrix, calculating a loss function according to the reliable samples, training a depth model, and updating parameters of the depth model;
the loss function of the cluster training stage is:

L_O = −Σ_{i,j} [ λ · POp(i,j) · log S_O(i,j) + POn(i,j) · log(1 − S_O(i,j)) ]

In the clustering stage λ > 1, so the model tends to judge two samples as homogeneous and aggregates them, so that samples with similar modulation-type characteristics are gathered together; the balance parameter λ takes the value 100 in the clustering stage. L_O is the loss function of the cluster training stage, POp is the soft-assigned homogeneous label matrix, POn is the soft-assigned heterogeneous label matrix, and S_O is the sample similarity matrix;
the method specifically comprises the following steps:
s3.2.1, comparing the similarity matrix with the upper threshold, selecting the sample pairs with higher similarity, preliminarily judging them to be homogeneous pairs, and obtaining the soft-assigned homogeneous label matrix;
s3.2.2, comparing the similarity matrix with the lower threshold, selecting the sample pairs with lower similarity, preliminarily judging them to be heterogeneous pairs, and obtaining the soft-assigned heterogeneous label matrix;
s3.2.3, the higher-similarity pairs and the lower-similarity pairs together form the reliable samples; the depth model CNN is retrained on these reliable samples, with the soft-assigned homogeneous and heterogeneous label matrices replacing the true sample labels in the loss function, and the parameters of the depth model are updated accordingly;
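The reliable-pair selection of s3.2.1-s3.2.3 can be sketched with simple threshold comparisons on the similarity matrix (a NumPy illustration; the threshold values and names are assumptions):

```python
import numpy as np

def select_reliable_pairs(S, upper, lower):
    """Soft-assign pair labels from the similarity matrix: pairs above
    the upper threshold are judged homogeneous, pairs below the lower
    threshold heterogeneous; pairs in between are left out as unreliable."""
    P_p = (S > upper).astype(float)   # soft-assigned homogeneous labels
    P_n = (S < lower).astype(float)   # soft-assigned heterogeneous labels
    reliable = P_p + P_n              # mask of pairs used for retraining
    return P_p, P_n, reliable

S = np.array([[1.0, 0.95, 0.3],
              [0.95, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
P_p, P_n, mask = select_reliable_pairs(S, upper=0.9, lower=0.4)
```

The pair (1, 2), with similarity 0.5 between the thresholds, is excluded from retraining.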
s3.3, repeating S3.1-S3.2 until the CNN training of the depth model is stable, stopping training, storing parameters of the CNN depth model, and storing the depth model;
the output module is used for inputting the target data set into the depth model obtained by the clustering module and outputting the clustering result of the target signal to be clustered;
the preprocessing module, the pre-training module, the clustering module and the output module are connected in sequence.
The above-described embodiments merely illustrate preferred implementations of the present invention and do not limit its scope; various modifications and improvements may be made to the technical solutions of the present invention by those skilled in the art without departing from its spirit, and such modified solutions fall within the scope of the present invention as defined by the claims.
Claims (10)
1. A radio signal clustering method based on deep learning is characterized in that: the method comprises the following steps:
s1, adjusting the sizes of the sample data of the second data set and the first data set to be consistent, and then dividing the sample data of the two adjusted data sets into a plurality of batches, wherein each batch contains the same amount of sample data; assuming that a target signal data set to be clustered is a first data set, an existing signal data set with a similar target is a second data set, and the first data set and the second data set have no association;
s2, constructing a depth model CNN, inputting sample data of the adjusted second data set batch into the depth model CNN for pre-training until the model training is stable, and the specific steps are as follows:
s2.1, inputting a batch of sample data in the adjusted second data set into a randomly initialized depth model CNN, and extracting a sample characteristic matrix;
s2.2, calculating a cosine similarity matrix among a batch of sample data of the adjusted second data set according to the sample feature matrix;
s2.3, calculating a loss function according to the real label and the cosine similarity matrix, training a model by using the loss function, and updating parameters of the model;
s2.4, repeating S2.1-S2.3 until the model training is stable, stopping training, saving the parameters of the model, and saving the depth model;
s3, inputting the sample data of the adjusted first data set into the depth model stored in S2.4 in batches for clustering, selecting the sample with higher confidence coefficient as a reliable sample, training the depth model, and iterating for multiple times until the model is stable;
and S4, outputting the clustering result of the target signals to be clustered.
2. The deep learning based radio signal clustering method according to claim 1, wherein: the adjusting method in step S1 comprises: if the sample data length of the second data set is smaller than that of the first data set, expanding the sample data of the second data set by copying it whole; and if the sample data length of the second data set is larger than that of the first data set, shortening the sample data of the second data set by extracting elements at equal intervals.
3. The deep learning based radio signal clustering method according to claim 1, wherein: the depth model CNN described in step S2 consists, in order, of a one-dimensional convolution layer, a two-dimensional convolution layer, two further one-dimensional convolution layers, a flattening (vector-reshaping) operation, and two fully connected layers.
4. The deep learning based radio signal clustering method according to claim 1, wherein: step S2.1 comprises:
1) respectively extracting characteristics of the IQ two paths of signals by using the one-dimensional convolutional layer to obtain primary characteristics of the IQ two paths;
2) integrating the primary characteristics of the IQ two paths based on the two-dimensional convolutional layer, and extracting the primary characteristics of signals;
3) and further processing the preliminary features of the signal by using two one-dimensional convolutional layers and two full-connection layers, and extracting deep features of the signal.
5. The deep learning based radio signal clustering method according to claim 1, wherein: step S2.3 comprises: converting the true label of each sample into a one-hot vector, and computing the homogeneous label matrix and the heterogeneous label matrix from the samples' true binary pair-judgment matrix.
6. The deep learning based radio signal clustering method according to claim 1, wherein: the loss function of the guided pre-training stage in step S2.3 is:

L_B = −Σ_{i,j} [ λ · PBp(i,j) · log S_B(i,j) + PBn(i,j) · log(1 − S_B(i,j)) ]

in the pre-training stage λ < 1, so the model tends to judge two samples as heterogeneous and pushes samples apart, searching out the feature differences between modulation types; λ is a balance parameter that coordinates the proportions of the homogeneous and heterogeneous losses and thereby adjusts the training tendency of the model, and takes the value 0.1 in the pre-training stage; L_B is the loss function of the pre-training stage, PBp is the homogeneous label matrix, PBn is the heterogeneous label matrix, and S_B is the cosine similarity matrix.
7. The deep learning based radio signal clustering method according to claim 1, wherein: the specific steps of step S3 are:
s3.1, inputting a batch of sample data in the adjusted first data set into the depth model stored in S2.4, and extracting a sample characteristic matrix;
s3.2, calculating a sample similarity matrix according to the sample feature matrix, setting an upper threshold and a lower threshold, selecting reliable samples according to the upper threshold, the lower threshold and the sample similarity matrix, calculating a loss function according to the reliable samples, training a depth model, and updating parameters of the depth model;
and S3.3, repeating S3.1-S3.2 until the CNN training of the depth model is stable, stopping training, storing parameters of the CNN depth model, and storing the depth model.
8. The deep learning based radio signal clustering method according to claim 7, wherein: step S3.2 includes:
s3.2.1, comparing the similarity matrix with the upper threshold, selecting the sample pairs with higher similarity, preliminarily judging them to be homogeneous pairs, and obtaining the soft-assigned homogeneous label matrix;
s3.2.2, comparing the similarity matrix with the lower threshold, selecting the sample pairs with lower similarity, preliminarily judging them to be heterogeneous pairs, and obtaining the soft-assigned heterogeneous label matrix;
s3.2.3, the higher-similarity pairs and the lower-similarity pairs together form the reliable samples; the depth model CNN is retrained on these reliable samples, with the soft-assigned homogeneous and heterogeneous label matrices replacing the true sample labels in the loss function, and the parameters of the depth model are updated accordingly.
9. The deep learning based radio signal clustering method according to claim 7, wherein: the loss function of the cluster training stage is:

L_O = −Σ_{i,j} [ λ · POp(i,j) · log S_O(i,j) + POn(i,j) · log(1 − S_O(i,j)) ]

in the clustering stage λ > 1, so the model tends to judge two samples as homogeneous and aggregates them, so that samples with similar modulation-type characteristics are gathered together; the balance parameter λ takes the value 100 in the clustering stage; L_O is the loss function of the cluster training stage, POp is the soft-assigned homogeneous label matrix, POn is the soft-assigned heterogeneous label matrix, and S_O is the sample similarity matrix.
10. A deep learning based radio signal clustering system, characterized by comprising: a preprocessing module, a pre-training module, a clustering module and an output module which are connected in sequence;
the preprocessing module inputs a first data set and a second data set, adjusts the size of sample data of the second data set to be consistent with that of the sample data of the first data set, and then divides the sample data of the two adjusted data sets into a plurality of batches, wherein each batch contains the same amount of sample data;
the pre-training module inputs the batched sample data of the adjusted second data set from the preprocessing module, pre-trains the depth model, and saves the depth model after multiple iterations once training is stable; the depth model CNN consists, in order, of a one-dimensional convolution layer, a two-dimensional convolution layer, two further one-dimensional convolution layers, a flattening (vector-reshaping) operation, and two fully connected layers; the method specifically comprises the following steps:
s2.1, inputting a batch of sample data in the adjusted second data set into a randomly initialized depth model CNN, and extracting a sample characteristic matrix; the method specifically comprises the following steps:
1) respectively extracting characteristics of the IQ two paths of signals by using the one-dimensional convolutional layer to obtain primary characteristics of the IQ two paths;
2) integrating the primary characteristics of the IQ two paths based on the two-dimensional convolutional layer, and extracting the primary characteristics of signals;
3) further processing the preliminary features of the signal by using two one-dimensional convolutional layers and two full-connection layers, and extracting deep features of the signal;
s2.2, calculating a cosine similarity matrix among a batch of sample data of the adjusted second data set according to the sample feature matrix;
s2.3, calculating a loss function according to the real label and the cosine similarity matrix, training a model by using the loss function, and updating parameters of the model; the method specifically comprises the following steps:
converting the true label of each sample into a one-hot vector, and computing the homogeneous label matrix and the heterogeneous label matrix from the samples' true binary pair-judgment matrix;
the loss function of the guided pre-training stage is:

L_B = −Σ_{i,j} [ λ · PBp(i,j) · log S_B(i,j) + PBn(i,j) · log(1 − S_B(i,j)) ]

in the pre-training stage λ < 1, so the model tends to judge two samples as heterogeneous and pushes samples apart, searching out the feature differences between modulation types; λ is a balance parameter that coordinates the proportions of the homogeneous and heterogeneous losses and thereby adjusts the training tendency of the model, and takes the value 0.1 in the pre-training stage; L_B is the loss function of the pre-training stage, PBp is the homogeneous label matrix, PBn is the heterogeneous label matrix, and S_B is the cosine similarity matrix;
s2.4, repeating S2.1-S2.3 until the model training is stable, stopping training, saving the parameters of the model, and saving the depth model;
the clustering module loads the depth model stored by the pre-training module, sets an upper limit threshold and a lower limit threshold, inputs a batch of sample data in the adjusted first data set into the stored depth model for clustering training, performs multiple iterations until the depth model is stably trained, and stores the depth model; the method specifically comprises the following steps:
s3.1, inputting a batch of sample data in the adjusted first data set into the depth model stored in S2.4, and extracting a sample characteristic matrix;
s3.2, calculating a sample similarity matrix according to the sample feature matrix, setting an upper threshold and a lower threshold, selecting reliable samples according to the upper threshold, the lower threshold and the sample similarity matrix, calculating a loss function according to the reliable samples, training a depth model, and updating parameters of the depth model;
the loss function of the cluster training stage is:

L_O = −Σ_{i,j} [ λ · POp(i,j) · log S_O(i,j) + POn(i,j) · log(1 − S_O(i,j)) ]

in the clustering stage λ > 1, so the model tends to judge two samples as homogeneous and aggregates them, so that samples with similar modulation-type characteristics are gathered together; the balance parameter λ takes the value 100 in the clustering stage; L_O is the loss function of the cluster training stage, POp is the soft-assigned homogeneous label matrix, POn is the soft-assigned heterogeneous label matrix, and S_O is the sample similarity matrix;
the method specifically comprises the following steps:
s3.2.1, comparing the similarity matrix with the upper threshold, selecting the sample pairs with higher similarity, preliminarily judging them to be homogeneous pairs, and obtaining the soft-assigned homogeneous label matrix;
s3.2.2, comparing the similarity matrix with the lower threshold, selecting the sample pairs with lower similarity, preliminarily judging them to be heterogeneous pairs, and obtaining the soft-assigned heterogeneous label matrix;
s3.2.3, the higher-similarity pairs and the lower-similarity pairs together form the reliable samples; the depth model CNN is retrained on these reliable samples, with the soft-assigned homogeneous and heterogeneous label matrices replacing the true sample labels in the loss function, and the parameters of the depth model are updated accordingly;
s3.3, repeating S3.1-S3.2 until the CNN training of the depth model is stable, stopping training, storing parameters of the CNN depth model, and storing the depth model;
and the output module is used for inputting the target data set into the depth model obtained by the clustering module and outputting the clustering result of the target signal to be clustered.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110053844.2A CN112866156B (en) | 2021-01-15 | 2021-01-15 | Radio signal clustering method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112866156A true CN112866156A (en) | 2021-05-28 |
CN112866156B CN112866156B (en) | 2022-06-17 |
Family
ID=76006582
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110053844.2A Active CN112866156B (en) | 2021-01-15 | 2021-01-15 | Radio signal clustering method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112866156B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150350001A1 (en) * | 2014-04-07 | 2015-12-03 | University Of Utah Research Foundation | Blind phase-shift keying (psk) and quadrature amplitude modulation (qam) identification |
WO2018176889A1 (en) * | 2017-03-27 | 2018-10-04 | 华南理工大学 | Method for automatically identifying modulation mode for digital communication signal |
JP2019079536A (en) * | 2017-10-23 | 2019-05-23 | 大国創新智能科技(東莞)有限公司 | Data identification method based on associative clustering deep learning neutral network |
CN110071885A (en) * | 2019-04-17 | 2019-07-30 | 成都华日通讯技术有限公司 | A kind of deep learning method of discrimination of PSK digital signal subclass Modulation Identification |
CN110414587A (en) * | 2019-07-23 | 2019-11-05 | 南京邮电大学 | Depth convolutional neural networks training method and system based on progressive learning |
US20200027002A1 (en) * | 2018-07-20 | 2020-01-23 | Google Llc | Category learning neural networks |
CN110855591A (en) * | 2019-12-09 | 2020-02-28 | 山东大学 | QAM and PSK signal intra-class modulation classification method based on convolutional neural network structure |
WO2020221278A1 (en) * | 2019-04-29 | 2020-11-05 | 北京金山云网络技术有限公司 | Video classification method and model training method and apparatus thereof, and electronic device |
CN112069883A (en) * | 2020-07-28 | 2020-12-11 | 浙江工业大学 | Deep learning signal classification method fusing one-dimensional and two-dimensional convolutional neural network |
Non-Patent Citations (2)
Title |
---|
XUAN QI et al.: "Boosting Face in Video Recognition via CNN based Key Frame Extraction", 《IEEE》 *
CHEN JINYIN et al.: "Signal Modulation Type Recognition Based on SNR Grading" (基于信噪比分级的信号调制类型识别), 《Computer Science》 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113556194A (en) * | 2021-07-20 | 2021-10-26 | 电信科学技术第五研究所有限公司 | Wireless signal region strength detection method based on deep learning |
CN113556194B (en) * | 2021-07-20 | 2022-11-29 | 电信科学技术第五研究所有限公司 | Wireless signal region strength detection method based on deep learning |
CN114124536A (en) * | 2021-11-24 | 2022-03-01 | 四川九洲电器集团有限责任公司 | Multi-station detection signal tracing method |
Also Published As
Publication number | Publication date |
---|---|
CN112866156B (en) | 2022-06-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113204952B (en) | Multi-intention and semantic slot joint identification method based on cluster pre-analysis | |
CN110135579A (en) | Unsupervised field adaptive method, system and medium based on confrontation study | |
CN112765352A (en) | Graph convolution neural network text classification method based on self-attention mechanism | |
CN112866156B (en) | Radio signal clustering method and system based on deep learning | |
CN110297888B (en) | Domain classification method based on prefix tree and cyclic neural network | |
CN114881092A (en) | Signal modulation identification method based on feature fusion | |
CN113541834B (en) | Abnormal signal semi-supervised classification method and system and data processing terminal | |
US11558810B2 (en) | Artificial intelligence radio classifier and identifier | |
US20240185582A1 (en) | Annotation-efficient image anomaly detection | |
CN110110724A (en) | The text authentication code recognition methods of function drive capsule neural network is squeezed based on exponential type | |
CN100416599C (en) | Not supervised classification process of artificial immunity in remote sensing images | |
CN113222072A (en) | Lung X-ray image classification method based on K-means clustering and GAN | |
Alalyan et al. | Model-based hierarchical clustering for categorical data | |
CN115344693A (en) | Clustering method based on fusion of traditional algorithm and neural network algorithm | |
CA3002100A1 (en) | Unsupervised domain adaptation with similarity learning for images | |
CN115062727A (en) | Graph node classification method and system based on multi-order hypergraph convolutional network | |
CN110674845B (en) | Dish identification method combining multi-receptive-field attention and characteristic recalibration | |
CN117034030A (en) | Electroencephalo-gram data alignment algorithm based on positive and negative two-way information fusion | |
CN115269855B (en) | Paper fine-grained multi-label labeling method and device based on pre-training encoder | |
Gai et al. | Spectrum sensing method based on residual cellular network | |
CN113537339B (en) | Method and system for identifying symbiotic or associated minerals based on multi-label image classification | |
Raviv et al. | Hinge-minimax learner for the ensemble of hyperplanes | |
CN115116115A (en) | Face recognition and model training method and device thereof | |
Li et al. | A combinational clustering method based on artificial immune system and support vector machine | |
CN109165610B (en) | Handwritten digit recognition and detection method based on simplex evolution |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |