CN113484837A

CN113484837A - Electromagnetic big data-oriented intelligent identification method for unknown radar radiation source

Info

Publication number: CN113484837A
Application number: CN202110727881.7A
Authority: CN
Inventors: 冯蕴天; 王国良; 陈翔; 汪亚; 许雄; 韩慧; 邰宁; 吴若无; 冯润明
Original assignee: UNIT 63892 OF PLA
Current assignee: Unit 63893 Of Pla
Priority date: 2021-06-29
Filing date: 2021-06-29
Publication date: 2021-10-08
Anticipated expiration: 2041-06-29
Also published as: CN113484837B

Abstract

The invention discloses an electromagnetic big data-oriented intelligent identification method for an unknown radar radiation source, which comprises the following steps: S1, measuring and calculating the conventional characteristic parameters of all pulses of a newly appeared radar radiation source; S2, performing feature representation and extraction on the conventional characteristic parameters of all pulses of the radar radiation source; S3, constructing a distributed storage system for massive background signal data; S4, completing fast comparison, retrieval and identification of the unknown radar radiation source; and S5, realizing global data sharing. By relying on massive data and the strong computing power of a big data computing cluster, the invention converts the classification problem into a retrieval problem, thereby solving the problem of unknown radar radiation source identification in a complex electromagnetic environment.

Description

Electromagnetic big data-oriented intelligent identification method for unknown radar radiation source

Technical Field

The invention relates to the technical field of radar signal processing, in particular to an electromagnetic big data oriented unknown radar radiation source intelligent identification method.

Background

Radar Emitter Recognition (RER) is a key link in radar countermeasure reconnaissance. On the basis of signal sorting, it extracts the characteristic parameters and working parameters of radar emitter signals. From these parameters, information such as the system, application, model and carrier platform of the target radiation source can be obtained, and the battlefield situation, threat level, activity pattern and tactical intention can then be inferred, providing important intelligence support for one's own decision making.

At present, artificial intelligence methods are widely applied in the field of radar radiation source identification and achieve good results. However, with the development of electronic information technology, more and more unknown radiation sources appear whose feature distribution and category are unknown. Without prior knowledge, an artificial intelligence model is difficult to train sufficiently, so most existing methods cannot identify unknown radar radiation sources well. The identification of unknown radar radiation sources is therefore the most difficult and urgent problem in the field of radar radiation source identification at present.

To address the problem of unknown radar radiation source identification, researchers have mainly adopted methods such as transfer learning and online learning. Luxinwei et al. studied domain-matching transfer learning theory in depth and designed a neural network classifier improved on the basis of transfer component analysis; the classifier is trained with the aid of data transferred from different domains, which reduces the recognition error of the system. Lemont et al. introduced transfer learning theory into the recognition system and proposed a radar radiation source recognition method based on transfer component analysis. To meet the requirement that a radar radiation source identification system process data in real time, various online learning algorithms have also been studied.

However, the above solutions are all modifications of the traditional machine learning paradigm. They achieve a certain effect but do not fundamentally and directly solve the problem of unknown radar radiation source identification.

Aiming at the defects of the traditional machine learning paradigm, the invention attempts to convert the classification problem into the retrieval problem by means of massive data and strong calculation power of a big data calculation cluster, so as to solve the problem of unknown radar radiation source identification in a complex electromagnetic environment.

Disclosure of Invention

In order to solve the above problems, the present invention aims to provide an electromagnetic big data oriented intelligent identification method for an unknown radar radiation source, which uses "retrieval" instead of "classification" to complete the identification task of the unknown radar radiation source.

In order to achieve the purpose, the invention adopts the following technical scheme:

An electromagnetic big data oriented intelligent identification method for an unknown radar radiation source comprises the following steps:

S1, measuring and calculating the conventional characteristic parameters of all pulses of a newly appeared radar radiation source;

S2, performing feature representation and extraction on the conventional characteristic parameters of all pulses of the radar radiation source;

S3, constructing a distributed storage system for massive background signal data;

S4, completing the fast comparison, retrieval and identification of the unknown radar radiation source, wherein the operation method comprises the following steps:

step 4.1, in order to accurately calculate the similarity between radar radiation sources and to capture the fluctuation characteristics in the radiation source signals, a mutual-information-based radar radiation source similarity calculation method is adopted, in which mutual information from information theory replaces the traditional Euclidean distance; for the deep feature vectors of two n-dimensional radar radiation source signals, X = (x₁,x₂,…,x_n) and Y = (y₁,y₂,…,y_n), the similarity XSD(X, Y) between the two radiation sources is calculated;

step 4.2, by combining mutual information (MI) with the K-nearest-neighbor (KNN) algorithm, the MI-KNN algorithm is proposed to realize retrieval and identification of the unknown radar radiation source and, at the same time, to judge the threat level of the radiation source; given the known massive radiation source signal deep feature data set K = (K₀,K₁,…,K_m) and its corresponding threat level labels T = (T₀,T₁,…,T_m), the specific category of the unknown deep feature data X of the radiation source signal to be classified and its threat level T' can be obtained;

step 4.3, parallelizing the MI-KNN algorithm, and completing the identification of the unknown radar radiation source with the Flink-based fast comparison, retrieval and identification algorithm for unknown radar radiation sources;

and S5, realizing global data sharing.

Further, in the step S1, the specific operation method includes:

1.1, receiving radar radiation source signals by using a superheterodyne reconnaissance receiver;

step 1.2, parameter measurement and sorting are carried out on radar radiation source signals;

and step 1.3, obtaining pulse carrier frequency CF, pulse width PW, pulse repetition interval PRI, pulse amplitude PA and arrival angle AOA of all newly-appeared pulses of the radar radiation source.
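The five conventional parameters obtained in step 1.3 can be pictured as one record per pulse, with a pulse group forming a small table. The following is a minimal sketch only: the class name, field names, units and the min-max normalization are assumptions made for illustration and are not specified by the method.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass(frozen=True)
class PulseDescriptor:
    """Conventional parameters of one pulse (names and units are assumed)."""
    cf_mhz: float   # pulse carrier frequency (CF)
    pw_us: float    # pulse width (PW)
    pri_us: float   # pulse repetition interval (PRI)
    pa_db: float    # pulse amplitude (PA)
    aoa_deg: float  # angle of arrival (AOA)


def pulse_group_to_rows(pulses: List[PulseDescriptor]) -> List[List[float]]:
    """Turn a pulse group into rows of [CF, PW, PRI, PA, AOA]."""
    return [list(astuple(p)) for p in pulses]


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every column to [0, 1]; a constant column maps to 0.0.

    Normalization is an assumption here, added so that parameters with
    very different ranges can be fed to a later feature extractor.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]
```

A pulse group of 40 to 200 such records would then be the input handed to the feature extraction of step S2.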

Further, in the step S2, the specific operation method includes:

step 2.1, adding edge computing terminals on all the reconnaissance receivers, wherein the edge computing terminals mainly comprise an embedded GPU;

2.2, training an unsupervised deep learning model Autoencoder on historical collected data in the server, and deploying the trained model to the embedded GPU;

and 2.3, the deep learning model Autoencoder receives conventional characteristic parameters of all newly-appeared pulses of the radar radiation source as input, performs characteristic representation and extraction, and outputs deep characteristics of a radiation source signal, namely a high-dimensional real-value characteristic vector, wherein the vector represents core characteristics of the radiation source which are irrelevant to a channel environment and is used for retrieval of the radiation source.
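To make the role of the Autoencoder in steps 2.2 and 2.3 concrete, the sketch below trains a very small linear autoencoder in plain Python and uses its encoder output as the "deep feature". It is an illustration under assumptions, not the model used by the method: the architecture, the dimensions (5 inputs, 2 code units), the learning rate and the class name `TinyLinearAutoencoder` are all hypothetical, and the text does not state how the real network is built.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class TinyLinearAutoencoder:
    """Linear encoder/decoder trained by full-batch gradient descent."""

    def __init__(self, n_in: int, n_code: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                    for _ in range(n_code)]
        self.dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_code)]
                    for _ in range(n_in)]

    def encode(self, x: Vector) -> Vector:
        """Return the code vector, used here as the deep feature."""
        return matvec(self.enc, x)

    def reconstruct(self, x: Vector) -> Vector:
        return matvec(self.dec, self.encode(x))

    def loss(self, data: List[Vector]) -> float:
        """Mean squared reconstruction error over the data set."""
        total = sum(
            sum((r - v) ** 2 for r, v in zip(self.reconstruct(x), x))
            for x in data
        )
        return total / len(data)

    def train(self, data: List[Vector], epochs: int = 300,
              lr: float = 0.05) -> None:
        n_code, n_in = len(self.enc), len(self.dec)
        for _ in range(epochs):
            g_enc = [[0.0] * n_in for _ in range(n_code)]
            g_dec = [[0.0] * n_code for _ in range(n_in)]
            for x in data:
                z = self.encode(x)
                r = matvec(self.dec, z)
                # Gradient of the mean squared error w.r.t. the output.
                err = [2.0 * (ri - xi) / len(data) for ri, xi in zip(r, x)]
                # Back-propagate the error through the decoder.
                dz = [sum(self.dec[i][k] * err[i] for i in range(n_in))
                      for k in range(n_code)]
                for i in range(n_in):
                    for k in range(n_code):
                        g_dec[i][k] += err[i] * z[k]
                for k in range(n_code):
                    for j in range(n_in):
                        g_enc[k][j] += dz[k] * x[j]
            for i in range(n_in):
                for k in range(n_code):
                    self.dec[i][k] -= lr * g_dec[i][k]
            for k in range(n_code):
                for j in range(n_in):
                    self.enc[k][j] -= lr * g_enc[k][j]
```

Training needs no labels, which matches the unsupervised setting described above; only the encoder half would need to run on the edge computing terminal after training.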

Further, in the step S3, the specific operation method includes:

step 3.1, collecting the calculation results of all edge calculation terminals, namely the deep characteristic data of unknown radar radiation source signals, as messages through a flash message transmission middleware;

3.2, transmitting the deep characteristic data to a large data computing cluster based on Flink by the Kafka message processing middleware;

and 3.3, performing data persistence on all deep characteristic data and attribute information thereof in the HBase-based distributed storage system.

Further, in step S4, the mutual information-based method for calculating similarity of radar radiation sources in step 4.1 includes the steps of:

step 4.1.1, for the deep feature vectors of two n-dimensional radar radiation source signals, X = (x₁,x₂,…,x_n) and Y = (y₁,y₂,…,y_n), solving the mutual information value I(X; Y);

where p(x_i) is the probability density function of X and p(y_j) is the probability density function of Y;

step 4.1.2, solving the symmetric uncertainty value SU(X; Y) from the mutual information value I(X; Y);

step 4.1.3, calculating to obtain the similarity XSD (X, Y) between the two radiation sources;

XSD(X, Y) = 1 - SU(X; Y).
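The text names I(X; Y) and SU(X; Y) without writing them out. A sketch of the standard information-theoretic definitions is given here on the assumption that these are the ones intended; the joint distribution p(x_i, y_j) and the entropies H(X) and H(Y) are symbols introduced for this sketch and do not appear in the text.

```latex
\begin{aligned}
I(X;Y)  &= \sum_{i}\sum_{j} p(x_i, y_j)\,\log\frac{p(x_i, y_j)}{p(x_i)\,p(y_j)} \\
H(X)    &= -\sum_{i} p(x_i)\,\log p(x_i), \qquad
H(Y)     = -\sum_{j} p(y_j)\,\log p(y_j) \\
SU(X;Y) &= \frac{2\,I(X;Y)}{H(X) + H(Y)} \\
XSD(X,Y) &= 1 - SU(X;Y)
\end{aligned}
```

Under these definitions SU(X; Y) lies in [0, 1], so XSD(X, Y) does as well.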

Further, in step S4, the MI-KNN algorithm in step 4.2 realizes the retrieval and identification of the unknown radar radiation source and can determine the threat level of the radiation source, with the following steps:

step 4.2.1, calculating the symmetry uncertainty-similarity of the deep characteristic data X of the radiation source signal to be classified and all sample data in the data set K;

step 4.2.2, sorting all samples according to the calculated similarity; if any sample exceeds the set similarity threshold, the retrieval and identification of the unknown radar radiation source is completed; if no sample exceeds the set similarity threshold, X is added directly to the deep feature data set K;

and 4.2.3, after the retrieval and identification are completed, taking out k samples which are most similar to the X, calculating the threat level proportion of the k samples, and taking the threat level with the highest proportion as the threat level judgment result of the X.
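Steps 4.2.1 to 4.2.3 can be sketched as a single retrieval function. The version below is illustrative and rests on several assumptions: the similarity measure is passed in as a parameter with larger values meaning more similar, `toy_similarity` is a simple stand-in rather than the mutual-information measure, the value k = 3 is arbitrary, and a newly added sample receives the placeholder label `None` because the text does not say how its threat level is initialized.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def toy_similarity(a: Vector, b: Vector) -> float:
    """Stand-in similarity in (0, 1]; NOT the mutual-information measure."""
    return 1.0 / (1.0 + sum(abs(u - v) for u, v in zip(a, b)))


def mi_knn_retrieve(
    x: Vector,
    dataset: List[List[float]],
    threat_levels: List[Optional[str]],
    similarity: Callable[[Vector, Vector], float],
    k: int = 3,
    threshold: float = 0.9,
) -> Tuple[List[int], Optional[str]]:
    """Return (indices of matching samples, voted threat level)."""
    # Step 4.2.1: similarity between X and every sample in K.
    scored = sorted(
        ((similarity(x, sample), idx) for idx, sample in enumerate(dataset)),
        reverse=True,
    )
    # Step 4.2.2: samples exceeding the threshold are the retrieval result.
    matches = [idx for score, idx in scored if score > threshold]
    if not matches:
        # No sample exceeds the threshold: add X to K as a new entry.
        dataset.append(list(x))
        threat_levels.append(None)  # assumed placeholder label
        return [], None
    # Step 4.2.3: majority vote over the k most similar samples.
    top_k = [idx for _, idx in scored[:k]]
    votes = Counter(
        threat_levels[i] for i in top_k if threat_levels[i] is not None
    )
    level = votes.most_common(1)[0][0] if votes else None
    return matches, level
```

Passing the similarity as a parameter keeps the retrieval logic independent of how the similarity is computed, so the same function works with any measure that returns larger values for more similar samples.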

Further, in the above step S4, the whole parallelization strategy in step 4.3 is divided into three phases: firstly, a deep characteristic data set is segmented, then the similarity calculation process is parallelized, and finally the calculation results are aggregated.

Further, in the step 4.3, the specific steps are as follows:

4.3.1, dividing the deep characteristic data set into a plurality of subsets to be used as a data source for subsequent parallelization calculation;

step 4.3.2, the parallelization strategy is as follows: after the data set is custom-partitioned using HDFS, the similarity calculation in the MI-KNN algorithm is parallelized at the task level to achieve acceleration; task parallelization in the Flink computing architecture adopts a master-slave model, in which the JobManager is the master node and the TaskManagers are the slave nodes; the main task of the JobManager is the division and distribution of data, namely completing the division of the data set and broadcasting the division result to all TaskManagers; the task of each TaskManager is parallel computation and feedback, namely calculating the similarity between the deep feature data of the radiation source signal to be classified and the other samples, and feeding the calculation results back to the JobManager;

and step 4.3.3, over all the divided data subsets, aggregating the similarities obtained by the parallel computation with their corresponding samples, using a key as the index, and selecting the samples exceeding the set similarity threshold as the final retrieval and identification result.

Further, in the step 4.3.1, the deep feature data set is divided into a plurality of subsets, and the specific segmentation steps are as follows:

step 4a, dividing a deep feature data set K into p parts by a data set partitioning method, wherein each part is distributed to a processor;

and 4b, broadcasting the k value and the similarity threshold value selected in the MI-KNN algorithm to each processor by the main node.
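The three phases of step 4.3 (partition, parallel similarity computation, aggregation) can be imitated on a single machine. The sketch below uses Python's `concurrent.futures.ThreadPoolExecutor` purely as a local stand-in for the Flink JobManager and TaskManagers; it is not Flink code. The round-robin partitioning, the function names and `toy_similarity` are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Indexed = Tuple[int, Vector]


def toy_similarity(a: Vector, b: Vector) -> float:
    """Stand-in similarity in (0, 1]; NOT the mutual-information measure."""
    return 1.0 / (1.0 + sum(abs(u - v) for u, v in zip(a, b)))


def partition(dataset: List[Vector], p: int) -> List[List[Indexed]]:
    """Step 4a: split K into p parts, keeping each sample's index as key."""
    parts: List[List[Indexed]] = [[] for _ in range(p)]
    for idx, sample in enumerate(dataset):
        parts[idx % p].append((idx, sample))
    return parts


def score_partition(
    x: Vector, part: List[Indexed],
    similarity: Callable[[Vector, Vector], float],
) -> List[Tuple[int, float]]:
    """Worker task: similarity of X to every sample in one partition."""
    return [(idx, similarity(x, sample)) for idx, sample in part]


def parallel_retrieve(
    x: Vector,
    dataset: List[Vector],
    similarity: Callable[[Vector, Vector], float],
    p: int = 4,
    threshold: float = 0.9,
) -> Tuple[List[int], Dict[int, float]]:
    """Return (indices above threshold, best first) and all scores by key."""
    parts = partition(dataset, p)
    # The threshold plays the role of the value broadcast in step 4b.
    with ThreadPoolExecutor(max_workers=p) as pool:
        partials = list(
            pool.map(lambda part: score_partition(x, part, similarity), parts)
        )
    # Aggregation: merge the partial results using the sample index as key.
    scores: Dict[int, float] = {}
    for partial in partials:
        for idx, score in partial:
            scores[idx] = score
    hits = sorted(
        (idx for idx, s in scores.items() if s > threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    return hits, scores
```

Because every sample keeps its original index through the partitioning, the aggregated result is the same as a serial scan over the whole data set, whatever value of p is chosen.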

Further, in the step S5, the specific operation method includes:

step 5.1, connecting the cloud storage server with the big data computing cluster through a network cable;

and 5.2, uploading the unknown radar radiation source identification result information to a cloud storage server at regular time.

Due to the adoption of the technical scheme, the invention has the following advantages:

The feature distribution and category of an unknown radar radiation source are unknown, so it cannot be labeled, no usable training data set can be constructed, and a traditional classification model cannot be trained. Conventional solutions are all modifications of the traditional machine learning paradigm; they achieve a certain effect but do not fundamentally and directly solve the problem of unknown radar radiation source identification.

Starting from the viewpoint of electromagnetic big data and addressing the shortcomings of the traditional machine learning paradigm, the method of the invention converts the classification problem into a retrieval problem by relying on massive data and the strong computing power of a big data computing cluster, thereby solving the problem of unknown radar radiation source identification in a complex electromagnetic environment. Massive unlabeled radar radiation source signal feature data can be stored directly in a large database for retrieval. When a new radiation source signal appears and can be retrieved in the database, its information is merged into the retrieved radiation source, so that attribute information such as the times and directions of its past appearances is obtained and the behavior of the radiation source can be analyzed further.

Drawings

FIG. 1 is a flow chart of an electromagnetic big data oriented unknown radar radiation source intelligent identification method of the invention;

FIG. 2 is a flow chart of an electromagnetic big data solution in the present invention;

FIG. 3 is a diagram of a parallelization of similarity computation in the present invention;

FIG. 4 is a histogram comparing KNN and MI-KNN algorithm performance in the present invention;

FIG. 5 is a time-consuming statistic proportion diagram of each stage of the MI-KNN parallelization algorithm in the invention;

FIG. 6 is a bar graph of the time consumption of the T (Source) stage under different parallelism in the present invention;

FIG. 7 is a time-consuming histogram of the entire MI-KNN parallelization algorithm under different parallelism in the present invention.

Detailed Description

The technical solution of the present invention will be further described in detail with reference to the accompanying drawings and examples.

As shown in fig. 1 and 2, the method for intelligently identifying an unknown radar radiation source facing electromagnetic big data comprises the following steps:

S1, measuring and calculating the conventional characteristic parameters of all pulses of the newly appeared radar radiation source, wherein the specific operation method comprises the following steps:

step 1.3, obtaining pulse carrier frequency CF, pulse width PW, pulse repetition interval PRI, pulse amplitude PA and arrival angle AOA of all newly-appeared pulses of the radar radiation source;

S2, performing feature representation and extraction on the conventional characteristic parameters of all pulses of the radar radiation source, wherein the specific operation method comprises the following steps:

step 2.1, adding edge computing terminals on all the reconnaissance receivers, wherein the edge computing terminals mainly comprise an embedded GPU, and the number of CUDA cores is more than 256;

step 2.3, the deep learning model Autoencoder receives conventional characteristic parameters of all newly-appeared pulses of the radar radiation source as input, performs characteristic representation and extraction, and outputs deep characteristics of a radiation source signal, namely a high-dimensional real-value characteristic vector, wherein the vector represents core characteristics of the radiation source which are irrelevant to a channel environment and is used for retrieval of the radiation source;

S3, constructing a distributed storage system for massive background signal data, wherein the specific operation method comprises the following steps:

step 3.3, performing data persistence on all deep characteristic data and attribute information thereof in the HBase-based distributed storage system;

step 4.1, in order to accurately calculate the similarity between radar radiation sources and to capture the fluctuation characteristics in the radiation source signals, a mutual-information-based radar radiation source similarity calculation method is adopted, in which mutual information from information theory replaces the traditional Euclidean distance; for the deep feature vectors of two n-dimensional radar radiation source signals, X = (x₁,x₂,…,x_n) and Y = (y₁,y₂,…,y_n), the similarity XSD(X, Y) between the two radiation sources is calculated; the specific steps are as follows:

XSD(X, Y) = 1 - SU(X; Y);

step 4.2, combining mutual information (MI) with the K-nearest-neighbor (KNN) algorithm, the MI-KNN algorithm is proposed to realize retrieval and identification of the unknown radar radiation source and, at the same time, to judge the threat level of the radiation source; given the known massive radiation source signal deep feature data set K = (K₀,K₁,…,K_m) and its corresponding threat level labels T = (T₀,T₁,…,T_m), the specific category of the unknown deep feature data X of the radiation source signal to be classified and its threat level T' can be obtained; the specific steps are as follows:

step 4.2.2, sorting all samples according to the calculated similarity; if any sample exceeds the set similarity threshold, the retrieval and identification of the unknown radar radiation source is completed; if no sample exceeds the set similarity threshold, X is added directly to the deep feature data set K; the similarity threshold can be preset according to the actual situation and is usually set to 0.85 to 0.95; preferably, it is empirically chosen as 0.9;

4.2.3, after the retrieval and identification are completed, taking out k samples which are most similar to the X, calculating the threat level proportion of the k samples, and taking the threat level with the highest proportion as the threat level judgment result of the X;

the whole parallelization strategy is divided into three stages: firstly, segmenting a deep characteristic data set, then parallelizing a similarity calculation process, and finally aggregating calculation results; the method comprises the following specific steps:

4.3.1, dividing the deep characteristic data set into a plurality of subsets to be used as a data source for subsequent parallelization calculation; the specific segmentation steps are as follows:

step 4b, the main node broadcasts the k value and the similarity threshold value selected in the MI-KNN algorithm to each processor;

step 4.3.2, the parallelization strategy is as follows: after the data set is custom-partitioned using HDFS, the similarity calculation in the MI-KNN algorithm is parallelized at the task level to achieve acceleration, as shown in FIG. 3; task parallelization in the Flink computing architecture adopts a master-slave model, in which the JobManager is the master node and the TaskManagers are the slave nodes; the main task of the JobManager is the division and distribution of data, namely completing the division of the data set and broadcasting the division result to all TaskManagers; the task of each TaskManager is parallel computation and feedback, namely calculating the similarity between the deep feature data of the radiation source signal to be classified and the other samples, and feeding the calculation results back to the JobManager;

4.3.3, on all the divided data subsets, aggregating the similarity obtained by the parallelization calculation and each sample by taking a key value as an index, and selecting the sample exceeding a set similarity threshold value as a final retrieval identification result;

S5, realizing global data sharing, wherein the specific operation method comprises the following steps:

The experiment established a Flink big data computing cluster consisting of 7 high-performance workstations: 1 serves as the master node (JobManager), 4 as slave nodes (TaskManagers), 1 as the Kafka intermediate data source acquisition node, and 1 as the HDFS storage node. The Flink computing cluster runs Flink version 1.10.0, and the master and slave nodes each have 16 GB of memory.

To verify the performance of the Flink-based fast comparison, retrieval and identification algorithm for unknown radar radiation sources, a massive radiation source signal data set was generated by simulation, with the simulation parameters set as shown in Table 1. The data set contains 150000 radar radiation source modes; each mode is one pulse group, giving 150000 pulse groups, and a mode of a radiation source is usually represented by 40 to 200 pulses. Each pulse group consists of the conventional characteristics of its pulses, and the conventional characteristic parameters of unknown radar radiation source pulse signals can change rapidly.

TABLE 1 simulation parameter settings for massive radiation source signal data sets

After the massive radiation source signal data set is processed by the deep network, each radiation source signal yields a high-dimensional real-valued feature vector through feature representation and extraction. The resulting massive radiation source signal deep feature data set is used to evaluate the Flink-based fast comparison, retrieval and identification algorithm for unknown radar radiation sources.

Experiments on the above data set verified the performance of the KNN and MI-KNN algorithms. As shown in fig. 4, compared with the conventional KNN algorithm, the identification accuracy of the MI-KNN algorithm of the invention for unknown radar radiation sources is improved by 3.1% and reaches 87.2%. The main reason is that no Euclidean spatial structure exists among radar pulses, so computing similarity directly with the Euclidean distance, as the traditional KNN algorithm does, degrades the overall performance of the algorithm, whereas the mutual-information-based radar radiation source similarity calculation method can well represent the temporal correlation and the information interaction among radar pulses.
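The mutual-information measure compared with the Euclidean distance here can be sketched in a few lines. The version below is one possible estimate under stated assumptions: the components (x_i, y_i) of the two vectors are treated as paired samples, values are discretized into equal-width bins (the bin count is arbitrary), and the symmetric uncertainty is taken as 2·I(X;Y)/(H(X)+H(Y)). None of these choices is specified in the text; only XSD = 1 - SU is.

```python
import math
from collections import Counter
from typing import List, Sequence


def discretize(values: Sequence[float], bins: int = 4) -> List[int]:
    """Map real values to equal-width bin indices (assumed estimator)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy in bits of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(a: Sequence[int], b: Sequence[int]) -> float:
    """I(A;B) in bits from the empirical joint distribution of pairs."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log2((c / n) / ((pa[i] / n) * (pb[j] / n)))
        for (i, j), c in pab.items()
    )


def symmetric_uncertainty(x: Sequence[float], y: Sequence[float],
                          bins: int = 4) -> float:
    """SU = 2 I / (H(X) + H(Y)); returns 0.0 if both entropies are zero."""
    a, b = discretize(x, bins), discretize(y, bins)
    denom = entropy(a) + entropy(b)
    if denom == 0.0:
        return 0.0
    return 2.0 * mutual_information(a, b) / denom


def xsd(x: Sequence[float], y: Sequence[float], bins: int = 4) -> float:
    """XSD(X, Y) = 1 - SU(X; Y), as given in step 4.1.3."""
    return 1.0 - symmetric_uncertainty(x, y, bins)
```

Unlike the Euclidean distance, this quantity depends only on how the binned values of the two vectors co-occur, not on their absolute magnitudes.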

To verify the acceleration achieved by the Flink-based MI-KNN parallelization, the experiment measured every stage from the submission of a parallelized task to its completion, and three groups of comparison experiments were set up. A Flink-based parallelized task is divided into 4 stages: the time for acquiring data is T (Source), the time for calculating the similarity is T (MI-KNN), the time for exchanging data between TaskManagers is T (Process), and the time for exchanging data between slots is T (thread).

On the massive radiation source signal deep feature data set, when the parallelism is set to 10, the time consumption of T (Source), T (MI-KNN), T (Process) and T (thread) in the MI-KNN parallelization algorithm is counted, and the experimental result is shown in FIG. 5.

The statistical results show that, during execution of the MI-KNN parallelization algorithm, the T (Source) stage accounts for 43% of the time: because the whole parallelization process runs on the segmented data subsets, early operations such as data segmentation and data acquisition consume a large amount of time. The T (MI-KNN) stage consumes the most time, 49%; it is the stage that executes the operations of the parallelized algorithm, including Map and Reduce. T (Process) and T (thread) take less time, accounting for 3% and 5%, respectively.

Compared with the traditional serial execution process, due to the use of a large number of distributed computing nodes, the time consumption of the T (Source) and T (MI-KNN) phases of the MI-KNN parallelization algorithm is remarkably reduced, but a large amount of data exchange and communication between distributed processes and threads can cause T (Process) and T (thread) to be increased to some extent. By taking the above factors into consideration, the overall time consumption of the Flink-based MI-KNN parallelization is obviously reduced compared with the serial execution process.

During execution of the MI-KNN parallelization algorithm, the T (Source) stage takes nearly half of the time, almost as much as executing the algorithm operations themselves, so it clearly leaves considerable room for optimization. The experiment therefore measured the time consumed by the T (Source) stage at parallelism 2, 4, 6, 8 and 10; the results are shown in fig. 6.

The statistical results show that the T (Source) stage takes the least time, only 2.0 s, when the parallelism is set to 6, which indicates that this parallelism is a reasonable choice. Setting the parallelism too small or too large increases the time of the T (Source) stage, which reaches its maximum of 2.8 s at a parallelism of 10.

Finally, on a massive radiation source signal deep feature data set, the time consumption of the whole MI-KNN parallelization algorithm under different parallelism degrees is counted, and an experimental result is shown in fig. 7.

The statistical results show that, as the parallelism increases, the time consumed by the whole MI-KNN parallelization algorithm first decreases steadily and then slowly increases. When the parallelism is too small, the time for executing the algorithm operations increases markedly, so the overall time is very high; when the parallelism is set to 6, the whole MI-KNN parallelization algorithm takes the least time, only 4.7 s; when the parallelism is too large, the time for executing the algorithm operations decreases, but the time for data segmentation, data acquisition and data exchange on the huge data set increases, so the overall time is again high. Selecting an appropriate parallelism is therefore very important for the whole MI-KNN parallelization algorithm.
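The reported timings can be cross-checked with a little arithmetic. The sketch below assumes that the stage shares of fig. 5 and the T (Source) timings of fig. 6 come from the same runs, which the text does not state; the total time at parallelism 10 is therefore an estimate derived here, not a reported result.

```python
# Values reported in the text.
t_source_p6 = 2.0        # s, T (Source) at parallelism 6
t_total_p6 = 4.7         # s, whole algorithm at parallelism 6
t_source_p10 = 2.8       # s, T (Source) at parallelism 10
share_source_p10 = 0.43  # T (Source) share of total time at parallelism 10

# Share of T (Source) at parallelism 6, derived from the two timings.
share_source_p6 = t_source_p6 / t_total_p6

# Estimated total time at parallelism 10 (assumes the 43% share and the
# 2.8 s timing describe the same run; this figure is not in the text).
est_total_p10 = t_source_p10 / share_source_p10

print(f"T (Source) share at parallelism 6: {share_source_p6:.1%}")
print(f"Estimated total at parallelism 10: {est_total_p10:.1f} s")
```

On this reading, data acquisition takes a similar share of the run at both settings (roughly 43%), and the estimated total at parallelism 10 is higher than the 4.7 s measured at parallelism 6, in line with the trend described for fig. 7.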

The above description is only a preferred embodiment of the present invention, and not intended to limit the present invention, and all equivalent changes and modifications made within the scope of the claims of the present invention should fall within the protection scope of the present invention.

Claims

1. An electromagnetic big data-oriented intelligent identification method for an unknown radar radiation source, characterized by comprising the following steps:

s3, constructing a mass background signal data distributed storage system;

and S5, realizing global data sharing.

2. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 1, wherein: in step S1, the specific operation method is as follows:

3. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 1, wherein: in step S2, the specific operation method is as follows:

4. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 1, wherein: in step S3, the specific operation method is as follows:

5. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 1, wherein: in step S4, the mutual information-based method for calculating similarity of radar radiation sources in step 4.1 includes the steps of:

XSD(X, Y) = 1 - SU(X; Y).

6. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 1, wherein: in step S4, the MI-KNN algorithm in step 4.2 realizes retrieval and identification of the unknown radar radiation source and can determine the threat level of the radiation source, with the following steps:

7. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 1, wherein: in step S4, the whole parallelization strategy in step 4.3 is divided into three phases: firstly, a deep characteristic data set is segmented, then the similarity calculation process is parallelized, and finally the calculation results are aggregated.

8. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 7, wherein: in step 4.3, the concrete steps are as follows:

9. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 8, wherein: in step 4.3.1, the deep feature data set is divided into a plurality of subsets, and the specific segmentation steps are as follows:

10. The intelligent identification method for the unknown radar radiation source facing the electromagnetic big data as claimed in claim 1, wherein: in step S5, the specific operation method is as follows: