CN113239880A - Radar radiation source identification method based on improved random forest - Google Patents

Radar radiation source identification method based on improved random forest

Info

Publication number
CN113239880A
Authority
CN
China
Prior art keywords
signal
similarity
random forest
training
signals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110613814.2A
Other languages
Chinese (zh)
Inventor
武斌
黄静
李鹏
张葵
王钊
武佳玥
荆泽寰
袁士博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202110613814.2A
Publication of CN113239880A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G06F2218/04 Denoising
    • G06F2218/06 Denoising by applying a scale-space analysis, e.g. using wavelet analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention discloses a radar radiation source identification method based on an improved random forest classifier (IRFC), which mainly addresses the low speed and redundant voting of the traditional random forest algorithm. The implementation scheme is as follows: generating a radar signal data set by simulation with commercial software; extracting features from the data set and dividing it into a training sample set and a test sample set; improving the traditional random forest classifier by eliminating decision trees with low classification accuracy and decision trees prone to redundant voting, which yields the improved random forest IRFC; training the IRFC with the training set; and feeding the test set signals to the trained IRFC, which outputs the predicted radar signal classes. The invention speeds up random forest classification, extracts radar signal features thoroughly, raises the signal identification rate, and can be used for radar signal identification in complex electromagnetic environments.

Description

Radar radiation source identification method based on improved random forest
Technical Field
The invention belongs to the technical field of signal processing, and particularly relates to a radar signal identification method which can be used for electronic information reconnaissance, electronic support and threat warning systems.
Background
With the development of the electronic information field, electronic countermeasure plays an important role in electronic information reconnaissance, electronic support and threat alarm systems, and radar radiation source signal identification is an important link in electronic countermeasure.
However, the increasingly complex electromagnetic environment and the growing variety of new radar types pose new challenges for electronic countermeasures. How to use intercepted signals effectively to identify and confirm an individual radar, so that the radiation source can be located and tracked, places new demands on research into individual radar radiation source identification and is essential to subsequent accurate identification.
In 2016, Wu Gaojie et al. published a random-forest-based method for individual radar radiation source identification, which constructs a multidimensional fine-feature vector for each pulse, reduces its dimensionality, and feeds it to a random forest classifier to obtain the identification result. Liu Song et al., in "Random forest based radar signal intra-pulse modulation recognition" (Telecommunication Sciences, 2016, No. 5), fused the shape and texture features of the radar signal time-frequency image and fed them to a random forest classifier for signal recognition. As the most typical and most widely used ensemble classifier, the random forest has also been applied in many other fields, for example Liujian's random-forest-based model for predicting solar radiation and Liuqian's random-forest-based simulation and prediction of fruit quality in greenhouse netted melon, both of which use the random forest classifier directly in the classification process. In the current big data environment, however, applying the random forest algorithm directly leads to slow recognition and excessively long training times.
Disclosure of Invention
The invention aims to provide a radar radiation source identification method based on an improved random forest that addresses the shortcomings of the traditional random forest classification (RFC) algorithm in a big data environment, so as to improve identification speed and accuracy.
The technical idea of the invention is as follows: decision trees with low classification accuracy and decision trees prone to redundant voting are eliminated from the original RFC to construct an improved random forest (IRFC) with higher classification speed and higher classification accuracy, which is then applied to radar radiation source signal identification to improve identification accuracy.
Following this idea, the implementation scheme of the invention comprises the following steps:
1) using MATLAB software to simulate a radar signal data set comprising 9 classes of LFM signals that differ in phase noise, pulse width, bandwidth, and carrier frequency, wherein 1000 signals of each class are generated at each signal-to-noise ratio from 0 dB to 20 dB in 4 dB steps to serve as the experimental data set;
2) outputting the data set signals generated in step 1) in sequence form and extracting the following features:
performing a Morlet wavelet transform on the signal sequence to obtain its envelope component, and extracting three output features of the envelope: rising edge time, pulse width, and top drop;
extracting the bispectrum feature of the phase noise from the signal sequence by contour integration, and computing from it three output features: the waveform entropy $E_b$, the energy entropy $E_n$, and the singular value entropy $E_{svd}$;
performing VDM decomposition on the signal sequence to obtain three output features: signal sequence bandwidth, center frequency, and Lagrange multiplier;
3) combining the output features into a nine-dimensional feature vector matrix, taking the feature vector matrix of each signal sequence together with its signal class as one data record to form a new data set, and, for each class of signal at each signal-to-noise ratio, randomly drawing 800 samples as the training set and using the remaining 200 samples as the test set;
4) improving the random forest classifier RFC:
4a) training the original RFC model with the training set, inputting the test set into the trained RFC, setting a precision threshold q, evaluating the classification accuracy of each decision tree, and eliminating the decision trees whose classification accuracy is below q to obtain a sub-forest of w decision trees;
4b) traversing the w decision trees of the sub-forest to obtain all path information of the sub-forest;
4c) computing the similarity $S_{ab}$ between every pair of decision trees in the sub-forest from the path information and constructing a similarity matrix M;
4d) setting a similarity threshold c, comparing the similarities in each row of the similarity matrix M with c, and grouping the decision trees accordingly into m categories;
4e) selecting the decision tree with the highest classification accuracy from each category and combining the selected trees into the improved random forest IRFC model;
5) sampling the training set by bootstrap sampling to obtain a training subset and inputting it into the improved random forest IRFC model for training; after each round of training, evaluating the classification accuracy of the resulting model on the training-set data that were not sampled, and stopping when the classification accuracy reaches the expected value, to obtain the trained IRFC;
6) inputting the test set data into the trained IRFC and outputting the predicted LFM signal class $C_i$ of each test sample to obtain the radiation source signal identification result.
The invention has the following advantages:
1) because the random forest classification method is improved by eliminating decision trees prone to redundant voting, the redundant-voting problem of the traditional random forest algorithm in a big data environment is resolved, and the speed of radar radiation source signal identification is effectively increased;
2) by eliminating decision trees with low classification accuracy, the low accuracy of traditional random forest recognition in a big data environment is addressed, and the accuracy of radar radiation source signal identification is effectively increased.
Drawings
FIG. 1 is a flow chart of the overall implementation of the present invention.
FIG. 2 is a graph of simulation results of the recognition accuracy of the present invention.
Detailed Description
Embodiments and effects of the present invention will be described in detail below with reference to the accompanying drawings.
Referring to fig. 1, the implementation of this embodiment includes the following steps:
step 1: a radar signal data set is generated.
The data set comprises LFM pulse signals containing 9 different phase noises, pulse widths, bandwidths and carrier frequencies, and the simulation steps are as follows:
1.1) setting the pulse width of an ideal emission signal of a radiation source to be 10us, the bandwidth to be 10MHz, the carrier frequency to be 10GHz and the sampling frequency to be 100 MHz;
1.2) adding three different phase noises S1, S2, S3 to the ideal transmission signal, the phase coefficients of which are set as shown in table 1, resulting in LFM signals carrying different phase noises.
TABLE 1 Phase modulation coefficient settings for the three phase noises
Frequency offset/Hz     1K     10K    100K    500K    1000K
S1 phase coefficient    1      0.1    0.01    0.001   0.0001
S2 phase coefficient    0.5    0.08   0.008   0.002   0.0001
S3 phase coefficient    0.8    0.2    0.07    0.004   0.006
1.3) respectively passing a first LFM signal containing phase noise S1 through a first Butterworth filter F1, passing a second LFM signal containing phase noise S2 through a second Butterworth filter F2, and passing a third LFM signal containing phase noise S3 through a third Butterworth filter F3 to obtain three LFM signals containing phase noise passing through the Butterworth filters, wherein the three LFM signals are marked as E1, E2 and E3;
wherein the parameter settings of three different butterworth filters are shown in table 2:
TABLE 2 Butterworth filter parameter settings
Parameter               F1      F2      F3
Sampling frequency/Hz   20000   30000   30000
Cut-off frequency/Hz    200     150     200
1.4) adding noise at signal-to-noise ratios of 0 dB, 4 dB, 8 dB, 12 dB, 16 dB, and 20 dB to the three filtered LFM signals E1, E2, and E3 containing phase noise, so that each of them yields 1000 samples at each signal-to-noise ratio point, for a total of 18000 samples;
1.5) changing the pulse width to 7 μs, the bandwidth to 5 MHz, and the carrier frequency to 8 GHz, and repeating 1.2) to 1.4) to obtain 18000 samples of three new signals E4, E5, and E6 at the different signal-to-noise ratios;
1.6) changing the pulse width to 15 μs, the bandwidth to 20 MHz, and the carrier frequency to 12 GHz, and repeating 1.2) to 1.4) to obtain 18000 samples of three new signals E7, E8, and E9 at the different signal-to-noise ratios;
1.7) combining the samples obtained in 1.4), 1.5), and 1.6) into one data set, giving a total of 54000 simulated signal samples for 9 classes of signals at 6 signal-to-noise ratios.
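The noise-adding part of Step 1 can be sketched as follows. This is a minimal illustration in Python with NumPy, not the MATLAB simulation described above: it assumes a complex baseband chirp (the 10 GHz carrier cannot be represented at a 100 MHz sampling rate), it omits the phase noise and Butterworth filtering of 1.2) and 1.3), and the function names `lfm_pulse` and `add_awgn` are hypothetical.

```python
import numpy as np

def lfm_pulse(pulse_width=10e-6, bandwidth=10e6, fs=100e6):
    """Complex baseband LFM (chirp) pulse; the carrier is assumed removed."""
    n = int(round(pulse_width * fs))
    t = np.arange(n) / fs
    chirp_rate = bandwidth / pulse_width          # Hz per second
    # Instantaneous frequency sweeps from -B/2 to +B/2 over the pulse.
    phase = 2 * np.pi * (-bandwidth / 2 * t + 0.5 * chirp_rate * t ** 2)
    return np.exp(1j * phase)

def add_awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise scaled to the requested SNR in dB."""
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10)
    noise = np.sqrt(p_noise / 2) * (rng.standard_normal(signal.shape)
                                    + 1j * rng.standard_normal(signal.shape))
    return signal + noise

rng = np.random.default_rng(0)
clean = lfm_pulse()
snr_points_db = [0, 4, 8, 12, 16, 20]
# Five noisy copies per SNR point for illustration; the text uses 1000.
dataset = {snr: np.stack([add_awgn(clean, snr, rng) for _ in range(5)])
           for snr in snr_points_db}
```

Repeating the same loop for each of the nine signal classes with 1000 copies per SNR point would give the 54000 samples counted in 1.7).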
Step 2: features of the data set signals are extracted.
2.1) outputting the generated data set signals in sequence form and performing a Morlet wavelet transform on each signal sequence to obtain its envelope component:
$$WT_{\psi}s(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} s(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt$$
where $a$ is the scale factor, $b$ is the translation factor, $t$ denotes time, $s(t)$ denotes the input signal sequence, $\psi^{*}$ denotes the complex conjugate of the Morlet wavelet function, and $WT_{\psi}s(a,b)$ is the result of the Morlet wavelet transform of the signal sequence;
2.2) extracting three output features of the envelope component: rising edge time, pulse width, and top drop:
2.2.1) taking the time at which the rising edge of the envelope component reaches 10% of the maximum envelope amplitude as the rising edge start time $t_1$, and the time at which the rising edge reaches 90% of the maximum envelope amplitude as the rising edge end time $t_2$, which gives the rising edge time $t_r = t_2 - t_1$;
2.2.2) taking the time at which the rising edge of the envelope component reaches 50% of the maximum envelope amplitude as the pulse width measurement start time $t_3$, and the time at which the falling edge falls to 50% of the maximum envelope amplitude as the pulse width measurement end time $t_4$, which gives the pulse width $\tau = t_4 - t_3$;
2.2.3) denoting the time at which the envelope component first reaches 90% of its maximum amplitude as $t_5$ and the time at which it last reaches 90% of its maximum amplitude as $t_6$, and computing the variance of the envelope top amplitude between $t_5$ and $t_6$ to obtain the envelope top drop TD;
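The three envelope measurements of 2.2.1) to 2.2.3) can be sketched in Python as below. The sketch assumes the envelope has already been obtained (for example as the magnitude of the wavelet transform) and is passed in as a real-valued array; the function name `envelope_features` and the sample-index threshold search are illustrative choices, not taken from the text.

```python
import numpy as np

def envelope_features(env, fs):
    """Return (rise time, pulse width, top drop) of a sampled pulse envelope."""
    env = np.asarray(env, dtype=float)
    peak = env.max()

    def first_at_or_above(level):
        # Index of the first sample reaching level * peak.
        return int(np.argmax(env >= level * peak))

    def last_at_or_above(level):
        # Index of the last sample still at or above level * peak.
        return int(len(env) - 1 - np.argmax(env[::-1] >= level * peak))

    t1, t2 = first_at_or_above(0.1), first_at_or_above(0.9)   # 10% and 90% points
    t3, t4 = first_at_or_above(0.5), last_at_or_above(0.5)    # 50% points
    t5, t6 = first_at_or_above(0.9), last_at_or_above(0.9)    # pulse top
    rise_time = (t2 - t1) / fs
    pulse_width = (t4 - t3) / fs
    top_drop = float(np.var(env[t5:t6 + 1]))   # variance of the top amplitude
    return rise_time, pulse_width, top_drop
```

For an ideal rectangular envelope the rise time and top drop are both zero, which makes a convenient sanity check before applying the function to noisy envelopes.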
2.3) extracting the bispectrum feature of the phase noise from the signal sequence by contour integration, and computing from it three output features, the waveform entropy $E_b$, the energy entropy $E_n$, and the singular value entropy $E_{svd}$:
$$E_b = -\sum_{i=1}^{L} p_i \log p_i, \qquad p_i = |r_i| / \|R\|$$
$$E_n = -\sum_{i=1}^{I}\sum_{j=1}^{J} p_{ij} \log p_{ij}, \qquad p_{ij} = |b(i,j)| / \|B\|$$
$$E_{svd} = -\sum_{i=1}^{N} \frac{\beta_i}{\sum_{j=1}^{N}\beta_j} \log \frac{\beta_i}{\sum_{j=1}^{N}\beta_j}$$
where $\|R\| = \sum_{i=1}^{L} |r_i|$ is the sum of the bispectrum estimates over all integration paths, $L$ is the number of integration paths, and $|r_i|$ is the sum of the bispectrum estimates along the $i$-th integration path;
$\|B\| = \sum_{i=1}^{I}\sum_{j=1}^{J} |b(i,j)|$ is the sum over all elements of the bispectrum matrix, $b(i,j)$ is the bispectrum value in row $i$ and column $j$ of the bispectrum matrix on the integration path, and $I$ and $J$ are the numbers of rows and columns of the bispectrum matrix;
$\beta_i$ ($i = 1, 2, \ldots, N$) is the $i$-th singular value of the bispectrum estimate, and $N$ is the number of singular values;
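The three entropy features share one pattern: normalize a set of magnitudes so they sum to one, then take the entropy of the resulting distribution. The Python sketch below illustrates this under stated assumptions: the Shannon form with the natural logarithm is assumed, the bispectrum path sums and bispectrum matrix are taken as already computed inputs, and the function names are hypothetical.

```python
import numpy as np

def shannon_entropy(values):
    """Entropy of the magnitudes of `values` after normalizing them to sum to 1."""
    mags = np.abs(np.asarray(values, dtype=complex)).ravel()
    p = mags / mags.sum()
    p = p[p > 0]                      # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log(p)))

def bispectrum_entropies(path_sums, bispec_matrix):
    """Waveform, energy and singular value entropy from bispectrum estimates."""
    e_b = shannon_entropy(path_sums)        # over the L integration-path sums
    e_n = shannon_entropy(bispec_matrix)    # over all I x J matrix elements
    singular_values = np.linalg.svd(np.asarray(bispec_matrix), compute_uv=False)
    e_svd = shannon_entropy(singular_values)
    return e_b, e_n, e_svd
```

A uniform distribution over k values gives an entropy of log k, the maximum for k outcomes, so lower entropies indicate bispectrum energy concentrated on a few paths, elements, or singular values.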
2.4) performing VDM decomposition on the signal sequence to obtain $V$ intrinsic mode function components;
2.5) obtaining three output features, the signal sequence bandwidth $\hat{u}_k(\omega)$, the center frequency $\omega_k$, and the Lagrange multiplier $\hat{\lambda}(\omega)$:
2.5.1) initializing the time-domain bandwidth $u_k^{1}(t)$, the center frequency $\omega_k^{1}$, and the time-domain Lagrange multiplier $\lambda^{1}(t)$, and setting the quadratic penalty factor $\alpha$, the noise tolerance $\zeta$, the maximum number of iterations $N$, and the precision $\varepsilon$;
2.5.2) updating the frequency-domain bandwidth $\hat{u}_k(\omega)$ and the center frequency $\omega_k$ with the following formulas:
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha\,(\omega - \omega_k^{n})^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega\, |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}$$
where $u_k^{n+1}(t)$ is the time-domain signal bandwidth of the $k$-th intrinsic mode function component at iteration $n+1$, $u_i(t)$ is the $i$-th intrinsic mode function component after VDM decomposition, $f(t)$ is the input signal sequence, and $\lambda(t)$ is the time-domain Lagrange multiplier; $\hat{u}_k^{n+1}(\omega)$, $\hat{u}_i(\omega)$, $\hat{f}(\omega)$, and $\hat{\lambda}(\omega)$ are the Fourier transforms of $u_k^{n+1}(t)$, $u_i(t)$, $f(t)$, and $\lambda(t)$, respectively;
2.5.3) updating the frequency-domain Lagrange multiplier $\hat{\lambda}(\omega)$ with the following formula:
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \zeta \left( \hat{f}(\omega) - \sum_{k=1}^{V} \hat{u}_k^{n+1}(\omega) \right)$$
where $\hat{\lambda}^{n+1}(\omega)$ is the frequency-domain Lagrange multiplier at iteration $n+1$ and $\hat{\lambda}^{n}(\omega)$ is the frequency-domain Lagrange multiplier at iteration $n$;
2.5.4) for each intrinsic mode function component, computing the norm of the difference between the frequency-domain bandwidth $\hat{u}_k^{n+1}(\omega)$ at iteration $n+1$ and the frequency-domain bandwidth $\hat{u}_k^{n}(\omega)$ at iteration $n$, and summing the $V$ norms to obtain $Y$, i.e.
$$Y = \sum_{k=1}^{V} \left\| \hat{u}_k^{n+1}(\omega) - \hat{u}_k^{n}(\omega) \right\|$$
comparing $Y$ with the precision $\varepsilon$:
if $Y > \varepsilon$ and $n < N$, repeating 2.5.2) to 2.5.4);
if $Y \le \varepsilon$, ending the iteration to obtain three features: the frequency-domain bandwidth $\hat{u}_k(\omega)$, the center frequency $\omega_k$, and the frequency-domain Lagrange multiplier $\hat{\lambda}(\omega)$.
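The iteration in 2.5.1) to 2.5.4) can be illustrated with the compact Python sketch below. It assumes that "VDM" denotes variational mode decomposition (commonly abbreviated VMD) and uses the standard frequency-domain update rules of that method; the function name `vmd_sketch`, the default parameter values, the uniform initialization of the center frequencies, and the use of normalized frequency (cycles per sample) are all assumptions for illustration.

```python
import numpy as np

def vmd_sketch(f, n_modes, alpha=2000.0, zeta=0.0, eps=1e-9, max_iter=500):
    """Simplified variational mode decomposition on the one-sided spectrum."""
    f = np.asarray(f, dtype=float)
    n = len(f)
    freqs = np.fft.fftfreq(n)                  # normalized frequency per FFT bin
    positive = freqs > 0
    f_hat = np.fft.fft(f)
    f_hat[~positive] = 0.0                     # keep positive frequencies only
    u_hat = np.zeros((n_modes, n), dtype=complex)   # mode spectra
    omega = 0.5 * np.arange(n_modes) / n_modes      # initial center frequencies
    lam_hat = np.zeros(n, dtype=complex)            # Lagrange multiplier spectrum
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(n_modes):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Wiener-filter-like update of mode k around its center frequency.
            u_hat[k] = (f_hat - others + lam_hat / 2) / (
                1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Center frequency = power-weighted mean frequency of mode k.
            power = np.abs(u_hat[k, positive]) ** 2
            omega[k] = np.sum(freqs[positive] * power) / np.sum(power)
        # Dual ascent on the reconstruction constraint (no-op when zeta is 0).
        lam_hat = lam_hat + zeta * (f_hat - u_hat.sum(axis=0))
        y = np.sum(np.linalg.norm(u_hat - u_prev, axis=1))
        if y <= eps:
            break
    return u_hat, omega, lam_hat
```

For a test signal made of two sinusoids, the returned center frequencies should settle near the two tone frequencies; full implementations usually also mirror-extend the signal to reduce boundary effects, which this sketch omits.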
And step 3: a training set and a test set are obtained.
Synthesizing the features output in 2.2), 2.3) and 2.5) into a nine-dimensional feature vector matrix, and synthesizing a new data set by using the feature vector matrix of each signal sequence and the signal category to which the feature vector matrix belongs as a piece of data;
800 samples are randomly drawn from the data set at each signal-to-noise ratio for each type of signal as a training set, leaving 200 samples as a test set.
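The per-class, per-SNR split of Step 3 can be sketched in Python as follows. The function name `split_per_group` and the representation of the data set as two parallel label arrays are assumptions for illustration.

```python
import numpy as np

def split_per_group(labels, snrs, n_train=800, seed=0):
    """Split sample indices so that each (class, SNR) group gives n_train training samples."""
    labels = np.asarray(labels)
    snrs = np.asarray(snrs)
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        for snr in np.unique(snrs):
            group = np.flatnonzero((labels == cls) & (snrs == snr))
            group = rng.permutation(group)      # random draw without replacement
            train_idx.extend(group[:n_train])
            test_idx.extend(group[n_train:])    # the remainder is the test set
    return np.array(train_idx), np.array(test_idx)
```

With 9 classes, 6 signal-to-noise ratios, and 1000 samples per group, this gives 43200 training samples and 10800 test samples.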
Step 4: The random forest classifier RFC is improved.
4.1) performing bootstrap sampling r times on the training set to obtain r training subsets, and forming a feature subset for each training subset by randomly selecting features;
4.2) starting from the root node, recursively performing the following operations on each node according to the training subset and the feature subset to generate decision trees that form a random forest:
4.2.1) for the training subset $D$ of the current node $O$, dividing $D$ at each split point $a_i$ of each feature $A$ in the feature subset into two subsets $D_1$ and $D_2$;
computing the Gini coefficient at every split point $a_i$ of each feature $A$:
$$Gini(D, A = a_i) = \frac{|D_1|}{|D|}\, Gini(D_1) + \frac{|D_2|}{|D|}\, Gini(D_2)$$
where $a_i$ is a split point of feature $A$; $Gini(D_1) = 2p_1(1-p_1)$ is the Gini coefficient of the first training subset $D_1$, with $p_1$ the probability that a sample in $D_1$ belongs to signal class $C_i$; and $Gini(D_2) = 2p_2(1-p_2)$ is the Gini coefficient of the second training subset $D_2$, with $p_2$ the probability that a sample in $D_2$ belongs to signal class $C_i$;
4.2.2) among all features $A$ and all of their possible split points $a_i$, taking the split point with the smallest Gini coefficient as the optimal split point and the feature it belongs to as the optimal feature, and dividing the current node $O$ into two child nodes according to the optimal feature and optimal split point;
4.2.3) repeating 4.2.1) and 4.2.2) on the two resulting child nodes to divide them further;
4.2.4) repeating 4.2.1) to 4.2.3) until all nodes are leaf nodes, which completes the construction of the decision tree;
4.2.5) performing 4.2.1) to 4.2.4) on each of the r training subsets to obtain r decision trees, and combining the r decision trees into a random forest RFC model;
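The search for the optimal cut point in 4.2.1) and 4.2.2) can be sketched in Python as below. The sketch uses the general multi-class Gini impurity $1 - \sum_c p_c^2$, which reduces to $2p(1-p)$ for two classes; taking midpoints between consecutive distinct feature values as candidate cut points is an assumption, as are the function names.

```python
import numpy as np

def gini(labels):
    """Gini impurity 1 - sum(p_c^2); equals 2p(1-p) when there are two classes."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

def best_split(X, y):
    """Return (feature index, cut point, weighted Gini) of the best binary split."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    best = (None, None, np.inf)
    for feature in range(X.shape[1]):
        values = np.unique(X[:, feature])
        # Candidate cut points: midpoints between consecutive distinct values.
        for cut in (values[:-1] + values[1:]) / 2:
            left = y[X[:, feature] <= cut]
            right = y[X[:, feature] > cut]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[2]:
                best = (feature, float(cut), score)
    return best
```

Applying `best_split` recursively to the two resulting subsets, with `X` restricted to the randomly chosen feature subset, grows one tree of the forest.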
4.3) training the obtained random forest RFC model:
4.3.1) sampling the training set by bootstrap sampling to obtain a new training subset $b_i$;
4.3.2) inputting the new training subset $b_i$ into the random forest RFC model for model training to obtain the currently trained random forest RFC model;
4.3.3) evaluating the classification accuracy of the trained random forest RFC model on the training-set data that were not sampled:
when the classification accuracy has not reached the expected value, returning to 4.3.1);
when the classification accuracy reaches the expected value, stopping training to obtain the trained random forest RFC model.
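The bootstrap-and-evaluate loop of 4.3.1) to 4.3.3) can be sketched in Python as follows. The callables `fit` and `score` stand in for whatever model training and accuracy evaluation are used and are hypothetical, as is the `max_rounds` cap, which is added only so that the loop always terminates.

```python
import numpy as np

def bootstrap_with_oob(n_samples, rng):
    """Draw a bootstrap sample and return (in-bag indices, out-of-bag indices)."""
    in_bag = rng.integers(0, n_samples, size=n_samples)   # sampling with replacement
    oob_mask = np.ones(n_samples, dtype=bool)
    oob_mask[in_bag] = False          # samples never drawn stay out-of-bag
    return in_bag, np.flatnonzero(oob_mask)

def train_until_accurate(fit, score, n_samples, target, max_rounds=50, seed=0):
    """Retrain on fresh bootstrap samples until out-of-bag accuracy reaches target."""
    rng = np.random.default_rng(seed)
    model, acc = None, 0.0
    for _ in range(max_rounds):
        in_bag, oob = bootstrap_with_oob(n_samples, rng)
        model = fit(in_bag)           # train on the bootstrap subset
        acc = score(model, oob)       # evaluate on the unsampled data
        if acc >= target:
            break
    return model, acc
```

On average about 36.8% of the training samples are left out of each bootstrap draw, so the out-of-bag set is large enough to serve as a validation set without touching the test set.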
4.4) inputting the test set into the trained RFC model, setting a precision threshold q, evaluating the classification accuracy of each decision tree, and eliminating the decision trees whose classification accuracy is below q to obtain a sub-forest of w decision trees;
4.5) traversing the w decision trees of the sub-forest to obtain all path information of the sub-forest;
4.6) computing the similarity $S_{ab}$ between every pair of decision trees in the sub-forest from the path information and from whether the two trees have the same root node:
if the root nodes of decision tree $DT_a$ and decision tree $DT_b$ are different, the similarity of the two trees is 0;
if the root nodes of decision tree $DT_a$ and decision tree $DT_b$ are the same, the similarity $S_{ab}$ is computed by the following formula:
$$S_{ab} = \frac{1}{l} \sum_{i=1}^{l} MaxSim_i, \qquad MaxSim_i = \max(L_i)$$
where $S_{ab}$ is the similarity between decision tree $DT_a$ and decision tree $DT_b$, $L_i$ is the set of cosine similarities between the $i$-th path of $DT_a$ and each path of $DT_b$, $l$ is the total number of paths of $DT_a$, and $MaxSim_i$ is the maximum similarity between the $i$-th path of $DT_a$ and the paths of $DT_b$;
4.7) constructing the similarity matrix M from the similarities as follows:
$$M = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1w} \\ S_{21} & S_{22} & \cdots & S_{2w} \\ \vdots & \vdots & \ddots & \vdots \\ S_{w1} & S_{w2} & \cdots & S_{ww} \end{bmatrix}$$
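The tree similarity of 4.6) and the matrix of 4.7) can be sketched in Python as below. The text does not specify how a path is turned into a vector, so the sketch assumes each path is already encoded as a fixed-length numeric vector (for example, a count of how often each feature is tested along the path); the function names are hypothetical.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity of two path vectors; 0 if either vector is all zeros."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def tree_similarity(root_a, paths_a, root_b, paths_b):
    """Mean over the paths of tree a of the best cosine match among the paths of tree b."""
    if root_a != root_b:
        return 0.0                    # different root nodes: similarity is 0
    max_sims = [max(cosine(p, q) for q in paths_b) for p in paths_a]
    return sum(max_sims) / len(paths_a)

def similarity_matrix(roots, paths):
    """w x w matrix whose entry (a, b) is the similarity of tree a to tree b."""
    w = len(roots)
    M = np.zeros((w, w))
    for a in range(w):
        for b in range(w):
            M[a, b] = tree_similarity(roots[a], paths[a], roots[b], paths[b])
    return M
```

Because the average runs over the paths of the first tree only, the entry for (a, b) need not equal the entry for (b, a), so the matrix is in general not symmetric.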
4.8) setting a similarity threshold c, comparing the similarities in each row of the similarity matrix M with c, and grouping the decision trees of the similarity matrix M:
first, the decision trees whose similarity in the first row of M exceeds the threshold c are grouped into one category; then, for each row i with 2 ≤ i ≤ w, it is checked whether decision tree $DT_i$ has already been assigned to a category: if so, the row is skipped; otherwise, the decision trees whose similarity in that row exceeds the threshold c are grouped into a new category;
after the similarity comparison of all w rows of M is complete, the decision trees are divided into m categories;
4.9) selecting the decision tree with the highest classification accuracy from each category and combining the selected trees into the improved random forest IRFC model.
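The pruning, grouping, and selection of 4.4), 4.8), and 4.9) can be sketched in Python as follows. The text does not say whether a tree that already belongs to a category can be claimed again by a later row, so the sketch assumes it cannot; the function names are hypothetical.

```python
import numpy as np

def prune_by_accuracy(accuracies, q):
    """Indices of the trees whose accuracy is not below the threshold q."""
    return [i for i, acc in enumerate(accuracies) if acc >= q]

def group_by_similarity(M, c):
    """Row-by-row grouping of trees whose similarity exceeds the threshold c."""
    M = np.asarray(M)
    w = M.shape[0]
    assigned = np.zeros(w, dtype=bool)
    groups = []
    for i in range(w):
        if assigned[i]:
            continue                  # tree i already belongs to a category
        members = [j for j in range(w)
                   if not assigned[j] and (j == i or M[i, j] > c)]
        for j in members:
            assigned[j] = True
        groups.append(members)
    return groups

def select_best_per_group(groups, accuracies):
    """Keep the most accurate tree of each category."""
    return [max(group, key=lambda j: accuracies[j]) for group in groups]
```

The forest size drops from r trees to w after pruning and then to m after grouping, which is where the speed-up comes from: prediction cost is proportional to the number of trees that vote.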
Step 5: A new training subset is obtained to train the improved random forest IRFC model.
5.1) sampling the training set by bootstrap sampling to obtain a final training subset $e_i$;
5.2) inputting the final training subset $e_i$ into the improved random forest IRFC model for model training to obtain the currently trained improved random forest IRFC model;
5.3) evaluating the classification accuracy of the trained improved random forest IRFC model on the training-set data that were not sampled:
when the classification accuracy has not reached the expected value, returning to 5.1);
when the classification accuracy reaches the expected value, stopping training to obtain the trained improved random forest IRFC.
Step 6: The test set data are input into the trained improved random forest IRFC, and the predicted LFM radiation source signal class $C_i$ of each test sample x is output to obtain the radiation source signal identification result.
For each test sample x, the output is determined jointly by all decision trees, expressed as follows:
$$C(x) = \arg\max_{C_i,\; 1 \le i \le N} \sum_{j=1}^{m} I\big(h_j(x) = C_i\big)$$
where $C_i$ denotes an output class value, $h_j(x) = C_i$ indicates that the prediction of the $j$-th decision tree is $C_i$, $I(\cdot)$ is the indicator function, $N$ is the total number of classes, and $m$ is the number of decision trees.
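The joint decision of Step 6 is a majority vote over the trees, sketched below in Python. Representing each decision tree as a callable that maps a sample to a class label is an assumption made for illustration, as are the function names.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

def forest_predict(trees, x):
    """Collect one prediction per tree for sample x and take the majority."""
    return majority_vote([h(x) for h in trees])
```

For example, if four trees predict E1, E3, E1, and E2 for a sample, the forest outputs E1.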
The effects of the present invention can be further illustrated by the following simulations.
1. Simulation conditions are as follows:
Hardware: a commercial computer with an Intel Core i5-6500 processor at 3.20 GHz, 8 GB of memory, and a 1 TB hard disk; operating system: Windows 7; development tools: Matlab 2014a and Spyder 3.3.6.
2. Simulation content:
The nine-dimensional feature vector matrix from Step 3 is input to the improved random forest IRFC classifier of the invention and to the conventional KNN classifier, support vector machine (SVM) classifier, decision tree (DT) classifier, and random forest (RFC) classifier, to obtain each classifier's identification accuracy for the radiation source signal classes of Step 1 at different signal-to-noise ratios, as shown in FIG. 2.
As can be seen from FIG. 2, the radiation source signal identification accuracy of the invention is clearly higher than that of the other classifiers and increases with the signal-to-noise ratio, so a good identification effect is achieved.
The foregoing description is only an example of the present invention and is not intended to limit the present invention, and it will be apparent to those skilled in the art that modifications and variations in form and detail may be made without departing from the spirit and structure of the invention, but these modifications and variations are within the scope of the invention as defined in the appended claims.

Claims (9)

1. The radar radiation source identification method based on the improved random forest is characterized by comprising the following steps:
1) using MATLAB software to simulate and generate a data set of radar signals, wherein the data set comprises LFM signals containing 9 types of different phase noises, pulse widths, bandwidths and carrier frequencies, and each type of signals respectively generates 1000 signals from 0-20dB every 4dB of signal-to-noise ratio to serve as the data set for experiment;
2) outputting the data set signals generated in the step 1) in a sequence form, and extracting the following characteristics of the data set signals:
performing Morlet wavelet transform on the signal sequence to obtain an envelope component of the signal sequence, and extracting three output characteristics of rising edge time, pulse width and top drop of the envelope;
extracting the bispectrum characteristic of the phase noise by using contour integral on the signal sequence, and calculating the waveform entropy E of the signal sequence according to the bispectrum characteristicbEntropy of energy EnAnd singular value entropy EsvdThese three output characteristics;
performing VDM decomposition on the signal sequence to obtain three output characteristics of signal sequence bandwidth, center frequency and Lagrange multiplier;
3) synthesizing the output characteristics into a nine-dimensional characteristic vector matrix, taking the characteristic vector matrix of each signal sequence and the signal category of each signal sequence as one piece of data, synthesizing a new data set, randomly extracting 800 samples from each signal-to-noise ratio of each signal in the data set as a training set, and taking the remaining 200 samples as a test set;
4) the RFC of the random forest is improved:
4a) training an original random forest RFC model by using a training set, inputting a test set into the trained random forest RFC, setting a precision threshold q, evaluating the classification precision of each decision tree, eliminating the decision trees with the classification precision lower than the set precision threshold q, and obtaining a sub-forest comprising w decision trees;
4b) traversing w decision trees of the sub-forest to obtain all path information of the sub-forest;
4c) calculating similarity S between every two decision trees in the son forest through path informationabConstructing a similarity matrix M;
4d) setting a similarity threshold c, comparing the similarity S of each row in the similarity matrix M with the similarity threshold c, and classifying the decision trees in the similarity matrix M to obtain M categories of the decision trees;
4e) selecting a decision tree with the highest classification precision from each category and combining to obtain an improved random forest IRFC model;
5) sampling the training set by a bootstrap sampling method to obtain a final training subset, and inputting the training subset into an improved random forest IRFC model for model training; after each round of training, carrying out classification accuracy evaluation on the obtained model by using data which are not sampled in the training set, and stopping training when the classification accuracy reaches an expected value to obtain a trained IRFC;
6) inputting the test set data into the trained IRFC, outputting the predicted LFM signal class C_i of each test datum, and obtaining the identification result of the radiation source signal.
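Step 5) relies on bootstrap sampling and on evaluating each round with the training samples that were not drawn. A minimal sketch of that split is given below, assuming sampling with replacement at the size of the training set; the function name `bootstrap_split` and the sample count are illustrative and not taken from the patent.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in-bag, out-of-bag) indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    # Samples never drawn form the held-out set used to score the round.
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag

rng = random.Random(0)                      # fixed seed only for repeatability
in_bag, out_of_bag = bootstrap_split(1000, rng)
print(len(in_bag), len(out_of_bag))
```

The in-bag indices would select the training subset, and the out-of-bag indices the data used for the per-round accuracy check; the stopping threshold ("expected value") is left to the caller because the claim does not fix it.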
2. The method according to claim 1, wherein the LFM signals with 9 different combinations of phase noise, pulse width, bandwidth and carrier frequency in 1) are set as follows:
1a) setting the pulse width of the ideal transmitted signal of the radiation source to 10 μs, the bandwidth to 10 MHz, the carrier frequency to 10 GHz and the sampling frequency to 100 MHz;
1b) adding three different phase noises S1, S2 and S3 to an ideal transmitting signal to obtain an LFM signal carrying the different phase noises;
1c) passing the LFM signal with phase noise S1 through a first Butterworth filter F1, the LFM signal with phase noise S2 through a second Butterworth filter F2, and the LFM signal with phase noise S3 through a third Butterworth filter F3, to obtain three filtered LFM signals containing phase noise, denoted E1, E2 and E3;
1d) adding noise at signal-to-noise ratios of 0 dB, 4 dB, 8 dB, 12 dB, 16 dB and 20 dB to the three filtered LFM signals E1, E2 and E3, so that each filtered LFM signal containing phase noise yields 1000 samples at each signal-to-noise ratio point, for 18000 samples in total;
1e) changing the pulse width to 7 μs, the bandwidth to 5 MHz and the carrier frequency to 8 GHz, and repeating 1b) to 1d) to obtain 18000 samples of three new signals E4, E5 and E6 at the different signal-to-noise ratios;
1f) changing the pulse width to 15 μs, the bandwidth to 20 MHz and the carrier frequency to 12 GHz, and repeating 1b) to 1d) to obtain 18000 samples of three new signals E7, E8 and E9 at the different signal-to-noise ratios;
1g) combining the samples obtained in 1d), 1e) and 1f) into one data set, giving 54000 simulated signal samples of 9 different signal types at 6 different signal-to-noise ratios.
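The signal generation in 1a) and 1d) can be sketched as follows. This is an illustration under stated assumptions, not the patent's simulation: the pulse is modelled at complex baseband (a 10 GHz carrier cannot be represented directly at a 100 MHz sampling rate), the added noise is assumed to be white Gaussian, and the phase noise of 1b) and the Butterworth filtering of 1c) are omitted. The names `lfm_pulse` and `add_awgn` are hypothetical.

```python
import cmath
import math
import random

def lfm_pulse(pulse_width, bandwidth, fs):
    """Complex baseband LFM (chirp) pulse sweeping -B/2..+B/2 over the pulse."""
    n = int(round(pulse_width * fs))
    k = bandwidth / pulse_width              # chirp rate in Hz/s
    samples = []
    for i in range(n):
        t = i / fs - pulse_width / 2         # time axis centred on the pulse
        samples.append(cmath.exp(1j * math.pi * k * t * t))
    return samples

def add_awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise scaled to the requested SNR in dB."""
    power = sum(abs(x) ** 2 for x in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)       # per real/imaginary component
    return [x + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for x in signal]

rng = random.Random(0)
pulse = lfm_pulse(10e-6, 10e6, 100e6)        # parameter values from step 1a)
noisy = {snr: add_awgn(pulse, snr, rng) for snr in (0, 4, 8, 12, 16, 20)}

# Data set size from 1g): signal types x SNR points x samples per point.
total = 9 * 6 * 1000
print(len(pulse), total)
```

With a 10 μs pulse sampled at 100 MHz, each sample sequence holds 1000 points, and the count in 1g) follows from 9 × 6 × 1000 = 54000.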
3. The method as claimed in claim 2, wherein the phase modulation coefficients of the three different phase noises S1, S2 and S3 in 1b) are set as follows:
the phase modulation coefficients of the first phase noise S1 are 1, 0.1, 0.01, 0.001 and 0.0001 at frequency offsets of 1 kHz, 10 kHz, 100 kHz, 500 kHz and 1000 kHz from the ideal transmitted signal, respectively;
the phase modulation coefficients of the second phase noise S2 are 0.5, 0.08, 0.008, 0.002 and 0.0001 at frequency offsets of 1 kHz, 10 kHz, 100 kHz, 500 kHz and 1000 kHz from the ideal transmitted signal, respectively;
the phase modulation coefficients of the third phase noise S3 are 0.8, 0.2, 0.07, 0.004 and 0.006 at frequency offsets of 1 kHz, 10 kHz, 100 kHz, 500 kHz and 1000 kHz from the ideal transmitted signal, respectively.
4. The method according to claim 2, characterized in that the parameters of the three different Butterworth filters F1, F2 and F3 in 1c) are set as follows:
the sampling frequency of the first Butterworth filter F1 is 20000 Hz and the cut-off frequency is 200 Hz;
the sampling frequency of the second Butterworth filter F2 is 30000 Hz and the cut-off frequency is 150 Hz;
the sampling frequency of the third Butterworth filter F3 is 30000 Hz and the cut-off frequency is 200 Hz.
5. The method of claim 1, wherein the Morlet wavelet transform performed on the signal sequence in 2) is given by the following formula:
Figure FDA0003097165230000031
where a is a scale factor, b is a translation factor, t denotes time, s(t) denotes the input signal sequence, ψ* denotes the complex conjugate of the Morlet wavelet function, and WT_ψ s(a, b) is the result of the Morlet wavelet transform of the signal sequence.
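The formula itself survives here only as a figure reference. For orientation, the standard continuous wavelet transform written with the symbols defined above has the form below; whether the patent's figure uses exactly this normalization is an assumption.

```latex
WT_{\psi}s(a,b) \;=\; \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} s(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt, \qquad a > 0
```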
6. The method according to claim 1, wherein the three output features calculated from the bispectral features of the signal sequence in 2), namely waveform entropy E_b, energy entropy E_n and singular value entropy E_svd, are expressed as follows:
Figure FDA0003097165230000032
Figure FDA0003097165230000033
Figure FDA0003097165230000034
wherein:
Figure FDA0003097165230000035
||R|| represents the sum of the bispectrum estimates over all integration paths, L represents the number of integration paths, and ||R_i|| represents the sum of the bispectrum estimates over the i-th integration path;
Figure FDA0003097165230000036
b(i, j) represents the bispectrum matrix of the bispectrum values on the integration path, and I and J are the numbers of rows and columns of the bispectrum matrix, respectively;
β_i (i = 1, 2, ..., N) is the singular value of the i-th bispectrum estimation result, and N is the number of bispectrum singular value estimates.
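The three entropy formulas are present only as figure references, so their exact form cannot be confirmed from this text. Entropy features of this kind are commonly computed as the Shannon entropy of normalized contributions, for example p_i = ||R_i|| / ||R|| for the waveform entropy. The sketch below illustrates that common form under this assumption; the function name and the input values are hypothetical.

```python
import math

def normalized_entropy(values):
    """Shannon entropy of non-negative values after normalizing them to sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    # Zero entries contribute nothing to the entropy and are skipped.
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log(p) for p in probs)

path_sums = [4.0, 2.0, 1.0, 1.0]     # hypothetical ||R_i|| for L = 4 integration paths
singular_values = [3.0, 1.0]         # hypothetical beta_i for N = 2
e_b = normalized_entropy(path_sums)
e_svd = normalized_entropy(singular_values)
print(e_b, e_svd)
```

An evenly spread distribution gives the largest value (log of the number of entries), and a single dominant entry gives a value near zero, which is what makes such entropies usable as compact shape descriptors.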
7. The method of claim 1, wherein the similarity S_ab between every two decision trees in the sub-forest in 4c) is calculated by the following formula:
Figure FDA0003097165230000041
wherein S_ab is the similarity between decision tree DT_a and decision tree DT_b, L_i is the cosine similarity between the i-th path of decision tree DT_a and each path of decision tree DT_b, l is the total number of paths of decision tree DT_a, and MaxSim_i is the maximum similarity between the i-th path of DT_a and the paths of DT_b.
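Read together, these definitions suggest that S_ab averages MaxSim_i over the l paths of DT_a, that is S_ab = (1/l) Σ MaxSim_i. The sketch below follows that reading, which is an inference from the definitions since the formula is only a figure reference. It also assumes each path has already been encoded as a numeric vector (here a made-up feature-usage vector); the patent text does not state the encoding.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def tree_similarity(paths_a, paths_b):
    """Average, over the paths of tree a, of the best cosine match among tree b's paths."""
    max_sims = [max(cosine(p, q) for q in paths_b) for p in paths_a]
    return sum(max_sims) / len(paths_a)

def similarity_matrix(forest_paths):
    """w x w matrix whose entry [a][b] is the similarity of tree a to tree b."""
    w = len(forest_paths)
    return [[tree_similarity(forest_paths[a], forest_paths[b]) for b in range(w)]
            for a in range(w)]

# Hypothetical path encodings for a sub-forest of w = 3 trees.
forest = [
    [[1, 0, 1], [0, 1, 1]],
    [[1, 0, 1], [0, 1, 1]],
    [[0, 1, 0]],
]
M = similarity_matrix(forest)
```

Under this reading the measure is not symmetric in general, because the average runs over the paths of DT_a only; in the example, the entry for trees 0 and 2 differs from the entry for trees 2 and 0.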
8. The method according to claim 1, wherein the decision trees in the similarity matrix M in 4d) are classified as follows:
grouping the decision trees whose similarity in the first row of the similarity matrix M exceeds the threshold c into one category;
determining, for each subsequent row i with 2 ≤ i ≤ w, whether the decision tree DT_i of that row has already been assigned to a category: if so, skipping the row; otherwise, grouping the decision trees whose similarity in that row exceeds the threshold c into one category;
after the similarity comparison of all w rows of the similarity matrix M is completed, the decision trees are divided into M categories.
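The row-by-row grouping above, followed by the per-category selection of step 4e), can be sketched as below. The claim does not say what happens to a tree that exceeds the threshold in a later row but already belongs to an earlier category; this sketch assumes it stays where it is. The function names, similarity values and accuracies are illustrative.

```python
def group_trees(sim, c):
    """Greedy row-by-row grouping: each unassigned tree seeds a category that
    collects the still-unassigned trees whose similarity to it exceeds c."""
    w = len(sim)
    label = [None] * w
    groups = []
    for i in range(w):
        if label[i] is not None:
            continue  # tree i already belongs to a category: skip its row
        members = [j for j in range(w)
                   if label[j] is None and (j == i or sim[i][j] > c)]
        for j in members:
            label[j] = len(groups)
        groups.append(members)
    return groups

def pick_best(groups, accuracy):
    """Step 4e): keep the most accurate tree of each category."""
    return [max(g, key=lambda j: accuracy[j]) for g in groups]

sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
groups = group_trees(sim, c=0.7)
kept = pick_best(groups, accuracy=[0.91, 0.94, 0.88, 0.90])
print(groups, kept)
```

For this toy matrix the trees fall into two categories, {0, 1} and {2, 3}, and the more accurate member of each (trees 1 and 3) would form the reduced forest.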
9. The method of claim 1, wherein 6) outputs the predicted LFM signal class for each test datum, which is determined by all decision trees in the IRFC model, as follows:
Figure FDA0003097165230000042
wherein C_i denotes the predicted LFM signal class, h(x) = C_i indicates that the prediction result is C_i, M is the number of decision trees in the IRFC model, and N represents the total number of signal categories.
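A decision made jointly by all trees is typically a majority vote: the class C_i that the largest number of the M trees predict. The sketch below assumes that reading, since the formula is only a figure reference; the vote list is made up.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]

votes = ["E3", "E1", "E3", "E7", "E3"]   # hypothetical outputs of M = 5 trees
predicted = majority_vote(votes)
print(predicted)
```

How ties are broken is not specified in the claim; in this sketch `Counter.most_common` returns the tied class that was seen first.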
CN202110613814.2A 2021-06-02 2021-06-02 Radar radiation source identification method based on improved random forest Pending CN113239880A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110613814.2A CN113239880A (en) 2021-06-02 2021-06-02 Radar radiation source identification method based on improved random forest

Publications (1)

Publication Number Publication Date
CN113239880A true CN113239880A (en) 2021-08-10

Family

ID=77136389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110613814.2A Pending CN113239880A (en) 2021-06-02 2021-06-02 Radar radiation source identification method based on improved random forest

Country Status (1)

Country Link
CN (1) CN113239880A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113936748A (en) * 2021-11-17 2022-01-14 西安电子科技大学 Molecular recognition characteristic function prediction method based on ensemble learning
CN115951315A (en) * 2023-03-02 2023-04-11 中国人民解放军空军预警学院 Radar deception jamming identification method and system based on improved wavelet packet energy spectrum

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009086137A (en) * 2007-09-28 2009-04-23 Kddi Corp Device, method and program for creating decision tree for speech synthesis,
CN104462502A (en) * 2014-12-19 2015-03-25 中国科学院深圳先进技术研究院 Image retrieval method based on feature fusion
CN111680737A (en) * 2020-06-03 2020-09-18 西安电子科技大学 Radar radiation source individual identification method under differential signal-to-noise ratio condition
CN112085335A (en) * 2020-08-10 2020-12-15 国网上海市电力公司 Improved random forest algorithm for power distribution network fault prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Risheng (王日升): "Research on an improved random forest algorithm based on Spark" (基于Spark的一种改进的随机森林算法研究), CNKI (《中国知网》) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210810
