Disclosure of Invention
Object of the invention: the invention provides a weak and small target detection method based on a feature mapping neural network, which solves the problem that existing weak and small target detection suffers from low detection precision under the influence of noise and interference.
The technical scheme adopted by the invention is as follows:
a weak and small target detection method based on a feature mapping neural network comprises the following steps:
step 1: constructing and training a spindle-type deep neural network;
step 2: inputting the collected images of the weak and small targets into a trained spindle-type deep neural network to obtain an amplitude map of target enhancement and background suppression;
step 3: completing the detection of the weak and small targets on the amplitude map by adopting a constant false alarm rate method.
Preferably, the step 1 comprises the steps of:
step 1.1: constructing a structure of a spindle-type deep neural network comprising an input layer, a decoding layer, an encoding layer and a softmax output layer;
step 1.2: determining the hyper-parameters of the spindle-type deep neural network by adopting a cross validation method to obtain the spindle-type deep neural network;
step 1.3: constructing a training data set;
step 1.4: inputting the training data set into the spindle-type deep neural network, and training in an unsupervised mode to obtain the weight of the initialized network to finish training.
Preferably, the decoding layer and the encoding layer are trained and calculated as follows:
$$h_k = \sigma(W_k X + b_k)$$
wherein $W_k$ represents a weight matrix, $b_k$ represents a bias vector, $\sigma$ represents an activation function, $X = \{x_1, x_2, \ldots, x_m\}$ represents the input of the current layer, and $h_k$ represents the output of the current layer.
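By way of illustration only, the layer calculation $h_k = \sigma(W_k X + b_k)$ can be sketched in Python as follows. The logistic activation, the function names `sigmoid` and `layer_forward`, and the numerical values are assumptions chosen for the example; the method itself does not fix them.

```python
import math


def sigmoid(z):
    """Logistic function, used here as one possible choice of sigma (an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(W, b, x):
    """Compute h_k = sigma(W_k x + b_k) for a single layer.

    W is a list of rows (one row per output unit), b is the bias vector
    and x is the input vector of the current layer.
    """
    return [
        sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]


# Hypothetical 2-input, 3-output layer: the output is wider than the input,
# as in the dimension-raising front end of the spindle-type network.
W = [[0.5, -0.5], [1.0, 1.0], [0.0, 0.0]]
b = [0.0, -1.0, 0.0]
h = layer_forward(W, b, [1.0, 1.0])
```

Stacking several such layers, first with increasing and then with decreasing width, gives the spindle shape described in step 1.1.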
Preferably, the step 2 comprises the steps of:
step 2.1: inputting the collected small target image into a trained spindle-type deep neural network, setting a small target sample label as 1 and setting a background sample label as 0;
step 2.2: judging the weak and small target image by adopting a sliding window method to obtain the probability value of the weak and small target in the image;
step 2.3: taking the logistic regression result of the output layer, namely the plurality of probability values, as the response amplitudes at the window coordinate points, thereby obtaining the target-enhanced and background-suppressed amplitude map corresponding to the plurality of weak and small targets.
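As a non-limiting sketch of steps 2.2 and 2.3, the following Python code slides a window over an image and writes a score for each window at the window's center coordinate. The function `toy_score` is a hypothetical stand-in for the trained network's output probability; `amplitude_map`, the odd window size and the zero-valued border are likewise assumptions made for the example.

```python
import math


def amplitude_map(image, window, score_fn):
    """Slide a window x window patch over the image (window assumed odd).

    The score of each patch becomes the response amplitude at the patch
    center; border pixels without a full window are left at 0.0.
    """
    rows, cols = len(image), len(image[0])
    half = window // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            patch = [
                image[i][j]
                for i in range(r - half, r + half + 1)
                for j in range(c - half, c + half + 1)
            ]
            out[r][c] = score_fn(patch)
    return out


def toy_score(patch):
    """Hypothetical scorer: contrast of the center pixel against the patch mean,
    squashed to (0, 1). A trained network would take its place."""
    center = patch[len(patch) // 2]
    mean = sum(patch) / len(patch)
    return 1.0 / (1.0 + math.exp(-10.0 * (center - mean)))


# 5 x 5 test image with a single bright pixel in the middle.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
amp = amplitude_map(image, 3, toy_score)
```

In this toy case the amplitude peaks at the bright pixel and is suppressed around it, which is the target-enhancement and background-suppression behavior the amplitude map is meant to exhibit.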
Preferably, the step 3 comprises the steps of:
step 3.1: extracting, by a sliding window, a sub-block around each value in the amplitude map for detection, the input being the amplitude values obtained from the spindle-type deep neural network;
step 3.2: counting each detected amplitude value by adopting a constant false alarm rate to obtain a false alarm probability;
wherein $T$ represents the threshold value of likelihood ratio detection, the mean term represents the mean of the sliding-window sub-block, $P$ represents the number of points within a window sub-block, $P_{fa}$ represents the false alarm probability set for constant false alarm rate detection, $\tau_{CFAR}$ represents the detection threshold, and $F_{1,P-1}(\tau_{CFAR})$ represents the cumulative distribution function of a central F random variable;
step 3.3: calculating the total number of candidate targets according to the false alarm probability, ranking the candidates by false alarm probability from high to low, and determining the top-ranked targets from the plurality of amplitude values.
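As a non-limiting illustration of step 3, the following Python sketch selects candidates at a fixed false alarm rate and ranks them from high to low. It deliberately substitutes an empirical-quantile threshold for the F-distribution threshold $\tau_{CFAR}$, whose exact expression is not used here; the function name `cfar_candidates` and the parameters `p_fa` and `top_k` are hypothetical.

```python
def cfar_candidates(amp, p_fa, top_k):
    """Pick candidate targets from an amplitude map at a fixed false alarm rate.

    Assumption: the threshold is chosen empirically so that at most
    p_fa * (number of cells) amplitudes exceed it. This replaces the
    F-distribution threshold of the described method.
    """
    cells = [(v, r, c) for r, row in enumerate(amp) for c, v in enumerate(row)]
    n_allowed = min(int(p_fa * len(cells)), len(cells) - 1)
    if n_allowed == 0:
        return []
    # Rank all cells from the highest amplitude to the lowest.
    ordered = sorted(cells, key=lambda t: t[0], reverse=True)
    threshold = ordered[n_allowed][0]
    candidates = [t for t in ordered if t[0] > threshold]
    return candidates[:top_k]


# Toy 5 x 5 amplitude map: flat background with three raised responses.
amp = [[0.1] * 5 for _ in range(5)]
amp[2][2] = 0.9
amp[0][4] = 0.7
amp[4][0] = 0.3
detections = cfar_candidates(amp, p_fa=0.1, top_k=2)
```

With 25 cells and `p_fa = 0.1`, at most two cells may exceed the threshold, so the two strongest responses are returned in descending order and the weaker one at (4, 0) is rejected.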
Preferably, the constructing of the training data set in step 1.3 comprises the following steps:
step 1.3.1: randomly generating coordinate points, as positions of simulated targets, in images that contain no weak and small targets, and extracting an N × N window area at each point as a background sample;
step 1.3.2: adding a simulated target to the background sample by using a two-dimensional Gaussian intensity model to obtain a target sample, wherein the two-dimensional Gaussian model is as follows:
$$s(i,j) = s_E \exp\left\{-\frac{1}{2}\left[\frac{(i-x_0)^2}{\sigma_x^2} + \frac{(j-y_0)^2}{\sigma_y^2}\right]\right\}$$
wherein $(x_0, y_0)$ represents the center position of the target image, $s(i,j)$ represents the pixel value of the target image at position $(i,j)$, $s_E$ represents the intensity of the generated target and is a random number in $(0, 1]$, and $\sigma_x$ and $\sigma_y$ represent the horizontal and vertical dispersion parameters, respectively, with values in $[0, 2]$;
step 1.3.3: adjusting the parameters of the target sample to generate weak targets with different signal-to-noise ratios, thereby completing the construction of the training data set.
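The construction of a simulated target sample in steps 1.3.1 to 1.3.3 can be sketched as follows. The sketch assumes the usual two-dimensional Gaussian form $s(i,j) = s_E \exp\{-\tfrac{1}{2}[(i-x_0)^2/\sigma_x^2 + (j-y_0)^2/\sigma_y^2]\}$ and assumes that the target is added to the background pixel by pixel. The function names are hypothetical, and the lower bound of 0.5 on the dispersion parameters is an assumption introduced only to avoid division by zero.

```python
import math
import random


def gaussian_target(size, x0, y0, s_e, sigma_x, sigma_y):
    """Return a size x size image of a 2-D Gaussian target centered at (x0, y0)."""
    return [
        [
            s_e * math.exp(-0.5 * (((i - x0) / sigma_x) ** 2
                                   + ((j - y0) / sigma_y) ** 2))
            for j in range(size)
        ]
        for i in range(size)
    ]


def make_target_sample(background, rng):
    """Add a randomly parameterized Gaussian target to a background window."""
    size = len(background)
    s_e = 1.0 - rng.random()         # intensity in (0, 1]
    sigma_x = rng.uniform(0.5, 2.0)  # assumed lower bound 0.5, see lead-in
    sigma_y = rng.uniform(0.5, 2.0)
    center = size // 2
    target = gaussian_target(size, center, center, s_e, sigma_x, sigma_y)
    return [
        [background[i][j] + target[i][j] for j in range(size)]
        for i in range(size)
    ]


t = gaussian_target(9, 4, 4, 0.8, 1.0, 1.5)
sample = make_target_sample([[0.0] * 9 for _ in range(9)], random.Random(0))
```

Varying `s_e`, `sigma_x` and `sigma_y` changes the contrast between the target and its background, which is how weak targets of different signal-to-noise ratios are obtained.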
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1. the method adopts a spindle-shaped network structure: the low-dimensional features of a weak and small target block are first mapped to a high-dimensional space, highly discriminative features are then extracted by an encoding neural network to discriminate between background and target, a background-suppressed and target-enhanced image is obtained from the strength of the network's discrimination output, and the weak and small targets are finally detected by a detection method based on the constant false alarm rate; this solves the problem that existing weak and small target detection has low precision under the influence of noise and interference, strengthens the representation capability of the network, and improves the detection precision of weak and small targets in a high-noise environment;
2. the method detects targets in millimeter-wave and infrared images of different scenes; a decoding operation is performed at the front end of the network so that the network has stronger expression capability: the pixel features of a weak and small target area are first mapped to a high-dimensional feature space, and the high-dimensional features are then mapped by encoding to a low-dimensional feature space in which they are easy to distinguish, thereby achieving a lower false alarm rate, higher detection precision and stronger robustness;
3. the network is pre-trained by unsupervised learning so that a deeper structure can be constructed; the initialized network weights obtained by unsupervised learning improve the stability of the network and avoid the local-minimum problem encountered when a deep neural network is trained directly; meanwhile, unsupervised learning captures the intrinsic features of a series of related data sets and removes the redundant components of the input data, so that highly discriminative features are obtained and the discrimination precision of the network is further improved;
4. the constant false alarm rate method performs detection with a sliding block; on the assumption, based on the actual situation, that any two targets are separated by a certain distance, constant false alarm rate detection is carried out block by block, which increases the detection speed on the one hand and improves the detection precision on the other.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The technical problem is as follows: existing weak and small target detection suffers from low detection precision under the influence of noise and interference.
The technical means is as follows:
a weak and small target detection method based on a feature mapping neural network comprises the following steps:
step 1: constructing and training a spindle-type deep neural network;
step 2: inputting the collected images of the weak and small targets into a trained spindle-type deep neural network to obtain an amplitude map of target enhancement and background suppression;
step 3: completing the detection of the weak and small targets on the amplitude map by adopting a constant false alarm rate method.
The step 1 comprises the following steps:
step 1.1: constructing a structure of a spindle-type deep neural network comprising an input layer, a decoding layer, an encoding layer and a softmax output layer;
step 1.2: determining the hyper-parameters of the spindle-type deep neural network by adopting a cross validation method to obtain the spindle-type deep neural network;
step 1.3: constructing a training data set;
step 1.4: inputting the training data set into the spindle-type deep neural network, and training in an unsupervised mode to obtain the weight of the initialized network to finish training.
The decoding layer and the coding layer are trained and calculated as follows:
$$h_k = \sigma(W_k X + b_k)$$
wherein $W_k$ represents a weight matrix, $b_k$ represents a bias vector, $\sigma$ represents an activation function, $X = \{x_1, x_2, \ldots, x_m\}$ represents the input of the current layer, and $h_k$ represents the output of the current layer.
The step 2 comprises the following steps:
step 2.1: inputting the collected small target image into a trained spindle-type deep neural network, setting a small target sample label as 1 and setting a background sample label as 0;
step 2.2: judging the weak and small target image by adopting a sliding window method to obtain the probability value of the weak and small target in the image;
step 2.3: taking the logistic regression result of the output layer, namely the plurality of probability values, as the response amplitudes at the window coordinate points, thereby obtaining the target-enhanced and background-suppressed amplitude map corresponding to the plurality of weak and small targets.
The step 3 comprises the following steps:
step 3.1: extracting, by a sliding window, a sub-block around each value in the amplitude map for detection, the input being the amplitude values obtained from the spindle-type deep neural network;
step 3.2: counting each detected amplitude value by adopting a constant false alarm rate to obtain a false alarm probability;
wherein $T$ represents the threshold value of likelihood ratio detection, the mean term represents the mean of the sliding-window sub-block, $P$ represents the number of points within a window sub-block, $P_{fa}$ represents the false alarm probability set for constant false alarm rate detection, $\tau_{CFAR}$ represents the detection threshold, and $F_{1,P-1}(\tau_{CFAR})$ represents the cumulative distribution function of a central F random variable;
step 3.3: calculating the total number of candidate targets according to the false alarm probability, ranking the candidates by false alarm probability from high to low, and determining the top-ranked targets from the plurality of amplitude values.
The construction of the training data set in step 1.3 comprises the following steps:
step 1.3.1: randomly generating coordinate points, as positions of simulated targets, in images that contain no weak and small targets, and extracting an N × N window area at each point as a background sample;
step 1.3.2: adding a simulated target to the background sample by using a two-dimensional Gaussian intensity model to obtain a target sample, wherein the two-dimensional Gaussian model is as follows:
$$s(i,j) = s_E \exp\left\{-\frac{1}{2}\left[\frac{(i-x_0)^2}{\sigma_x^2} + \frac{(j-y_0)^2}{\sigma_y^2}\right]\right\}$$
wherein $(x_0, y_0)$ represents the center position of the target image, $s(i,j)$ represents the pixel value of the target image at position $(i,j)$, $s_E$ represents the intensity of the generated target and is a random number in $(0, 1]$, and $\sigma_x$ and $\sigma_y$ represent the horizontal and vertical dispersion parameters, respectively, with values in $[0, 2]$;
step 1.3.3: adjusting the parameters of the target sample to generate weak targets with different signal-to-noise ratios, thereby completing the construction of the training data set.
The technical effects are as follows:
the method adopts a spindle-shaped network structure: the low-dimensional features of a weak and small target block are first mapped to a high-dimensional space, highly discriminative features are then extracted by an encoding neural network to discriminate between background and target, a background-suppressed and target-enhanced image is obtained from the strength of the network's discrimination output, and the weak and small targets are finally detected by a detection method based on the constant false alarm rate. This solves the problem that existing weak and small target detection has low precision under the influence of noise and interference, strengthens the representation capability of the network, and improves the detection precision of weak and small targets in a high-noise environment.
The features and properties of the present invention are described in further detail below with reference to examples.
Example 1
The training data of the network is shown in fig. 3, in which small circles represent samples of one class and × points represent samples of the other class; the training samples are linearly inseparable and therefore difficult to discriminate. After the deep network is trained on the training samples, the features output by the second and third layers are as shown in fig. 4: once the linearly inseparable two-dimensional features are mapped to three dimensions by the neural network, they become linearly discriminable in the three-dimensional feature space. After the three-dimensional features are re-encoded to two dimensions, as shown in fig. 5, the originally linearly inseparable low-dimensional features have been mapped to a linearly discriminable feature space, and the blue × samples are distributed in a compact area.
As shown in fig. 2, the whole network contains 6 layers, which can be divided into 2 parts: the input layer to layer 2 are decoding layers, which raise the dimension of the features and map them to a high-dimensional space; layers 2 to 5 form a typical sparse autoencoder, which performs the encoding extraction of abstract high-level features; and the last layer is a softmax output layer. The network model provided by the invention differs from a typical deep neural network model mainly in the decoding layers from the first layer to the second layer: a traditional deep neural network mainly processes high-dimensional data and extracts abstract dimension-reduced features, whereas here, because the detection window of a weak and small target is small and its dimensionality is low, a decoding operation is first performed at the front end of the network, so that the network has stronger representation capability and achieves higher detection precision for weak and small targets in a high-noise environment.
The spindle-type deep neural network is constructed and trained as follows: step 1.1: constructing a structure of a spindle-type deep neural network comprising an input layer, a decoding layer, an encoding layer and a softmax output layer; step 1.2: determining the hyper-parameters of the spindle-type deep neural network by adopting a cross validation method; step 1.3: constructing a training data set; step 1.4: inputting the training data set into the spindle-type deep neural network, and training in an unsupervised manner to obtain the initialized network weights, thereby completing the training. The sizes of the layers of the trained network model are [81, 512, 256, 121, 81, 1], wherein the input layer $\{I_1, I_2, \ldots, I_N\}$ is a linear arrangement of the pixels of the weak and small target detection window in the image. For feature transformation, a sparse autoencoder is adopted, and the calculation formula is as follows:
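$$\hat{\rho}_j = \frac{1}{m}\sum_{i=1}^{m} a_j\left(x^{(i)}\right)$$
wherein $a_j(x^{(i)})$ represents the degree of activation of hidden neuron $j$ of the autoencoder given the input $x^{(i)}$, $m$ represents the number of training inputs, and $\hat{\rho}_j$ represents the average activation of hidden neuron $j$; $\rho$ represents the sparsity parameter and $\beta$ represents a hyperparameter.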
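For illustration, the role of the sparsity parameter $\rho$ and the hyperparameter $\beta$ can be sketched with the Kullback-Leibler sparsity penalty that is standard for sparse autoencoders, $\beta \sum_j \mathrm{KL}(\rho \,\|\, \hat{\rho}_j)$. That this is the penalty actually used is an assumption; the function names and the activation values are hypothetical, while $\rho = 0.5$ and $\beta = 3$ are the values adopted in this example.

```python
import math


def average_activation(activations):
    """activations[i][j] is the activation of hidden neuron j for input i.

    Returns the average activation of each hidden neuron over all inputs.
    """
    m = len(activations)
    return [sum(column) / m for column in zip(*activations)]


def kl_sparsity_penalty(rho, rho_hat, beta):
    """beta * sum_j KL(rho || rho_hat_j) for Bernoulli distributions.

    Assumed standard form of the sparse-autoencoder penalty; every
    rho_hat_j must lie strictly between 0 and 1.
    """
    return beta * sum(
        rho * math.log(rho / r) + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
        for r in rho_hat
    )


# Two training inputs, two hidden neurons (hypothetical activations).
rho_hat = average_activation([[0.5, 0.2], [0.5, 0.4]])
penalty = kl_sparsity_penalty(0.5, rho_hat, 3.0)
```

The penalty is zero when every average activation equals $\rho$ and grows as the averages drift away from it, which is what drives the hidden layer toward the desired activation level during training.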
$\rho$ is set to 0.5 and $\beta$ is set to 3; the network parameters are learned by gradient descent with a learning rate of 0.01. During training, the label of a weak and small target sample is set to 1 and the label of a background sample is set to 0. All areas of an image are input to the network and judged by a sliding window method, and the logistic regression result of the output layer is taken as the response amplitude at the window coordinate point: the larger the value, the higher the probability that the detection window contains a weak and small target, and the smaller the value, the lower that probability. The target-enhanced and background-suppressed amplitude map obtained by the deep neural network is denoted $I_o$, and the detection of the weak and small targets is completed on the amplitude map by a constant false alarm rate method.
The training data set is acquired by simulation: weak and small targets are artificially added to 220 images that contain no weak and small targets to construct the training set. Coordinate points are randomly generated in each image and a 9 × 9 region is extracted at each point as a background sample; a simulated target is added to the background sample by using a two-dimensional Gaussian intensity model to obtain a target sample, wherein the two-dimensional Gaussian model is as follows:
$$s(i,j) = s_E \exp\left\{-\frac{1}{2}\left[\frac{(i-x_0)^2}{\sigma_x^2} + \frac{(j-y_0)^2}{\sigma_y^2}\right]\right\}$$
wherein $(x_0, y_0)$ represents the center position of the target image, $s(i,j)$ represents the pixel value of the target image at position $(i,j)$, $s_E$ represents the intensity of the generated target and is a random number in $(0, 1]$, and $\sigma_x$ and $\sigma_y$ represent the horizontal and vertical dispersion parameters, respectively, with values in $[0, 2]$.
Weak targets with different signal-to-noise ratios are generated by adjusting these parameters. The signal-to-noise ratios of the weak targets generated here lie in [0, 120], and the samples are approximately uniformly distributed over this interval; the training set contains 26400 positive samples and 26400 negative samples. Some of the generated training sample images are shown in fig. 6, and part of the simulation test data set is shown in fig. 7.
The test set comprises a simulation test set and a real-data test set; the image data comprises millimeter-wave images and infrared images, the infrared images being drawn from several data sets. The simulation test data set is generated by the same weak and small target simulation method as the training data: weak and small targets with different signal-to-noise ratios are added at random to background images, 1920 weak and small targets being added to 32 images in total, with signal-to-noise ratios approximately uniformly distributed over [0, 120] dB. The test set data is input into the trained spindle-type neural network to complete the test: the low-dimensional features of each weak and small target block are first mapped to a high-dimensional space by the spindle-type neural network, highly discriminative features are then extracted by the encoding neural network to discriminate between background and target, a background-suppressed and target-enhanced image is obtained from the strength of the network's discrimination output, and the weak and small targets are finally detected by a detection method based on the constant false alarm rate, thereby effectively improving the detection precision of the weak and small targets.
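By way of example only, the spindle shape of the network in this embodiment can be sketched as a forward pass through layers of sizes [81, 512, 256, 121, 81, 1]. The random weights stand in for trained weights, the logistic activation in every layer is an assumption, and the names `init_network` and `forward` are hypothetical.

```python
import math
import random

# Layer sizes from this embodiment: an 81-pixel (9 x 9) window is first
# expanded to 512 units and then progressively encoded down to one output.
LAYER_SIZES = [81, 512, 256, 121, 81, 1]


def init_network(sizes, rng):
    """Create (weights, biases) per layer with small random weights.

    The random values are placeholders for the trained network weights.
    """
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(network, x):
    """Propagate x through every layer and return all layer activations."""
    activations = [x]
    for weights, biases in network:
        x = [
            1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
            for row, b in zip(weights, biases)
        ]
        activations.append(x)
    return activations


rng = random.Random(0)
net = init_network(LAYER_SIZES, rng)
patch = [rng.random() for _ in range(81)]  # flattened 9 x 9 detection window
acts = forward(net, patch)
probability = acts[-1][0]                  # response amplitude for this window
```

The widths of the successive activations first grow and then shrink, which is the "spindle" profile; the single output value plays the role of the probability written into the amplitude map.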
Example 2
The step 1 comprises the following steps:
step 1.1: constructing a structure of a spindle-type deep neural network comprising an input layer, a decoding layer, an encoding layer and a softmax output layer;
step 1.2: determining the hyper-parameters of the spindle-type deep neural network by adopting a cross validation method to obtain the spindle-type deep neural network;
step 1.3: constructing a training data set;
step 1.4: inputting the training data set into the spindle-type deep neural network, and training in an unsupervised mode to obtain the weight of the initialized network to finish training.
The decoding layer and the coding layer are trained and calculated as follows:
$$h_k = \sigma(W_k X + b_k)$$
wherein $W_k$ represents a weight matrix, $b_k$ represents a bias vector, $\sigma$ represents an activation function, $X = \{x_1, x_2, \ldots, x_m\}$ represents the input of the current layer, and $h_k$ represents the output of the current layer.
The step 2 comprises the following steps:
step 2.1: inputting the acquired weak and small target image into the trained spindle-type deep neural network, wherein the label of a weak and small target sample is set to 1 and the label of a background sample is set to 0, and the image is modeled as follows:
$$s = s_t + s_b + n$$
wherein $s$ represents the image acquired by the sensor, $s_t$ represents the target signal, $s_b$ represents the background signal, and $n$ represents noise.
Step 2.2: judging the weak and small target image by adopting a sliding window method to obtain the probability value of the weak and small target in the image;
step 2.3: taking the logistic regression result of the output layer, namely the plurality of probability values, as the response amplitudes at the window coordinate points, thereby obtaining the target-enhanced and background-suppressed amplitude map corresponding to the plurality of weak and small targets.
The step 3 comprises the following steps:
step 3.1: extracting, by a sliding window, a sub-block around each value in the amplitude map for detection, the input being the amplitude values obtained from the network;
step 3.2: counting each detected amplitude value by adopting a constant false alarm rate to obtain a false alarm probability;
wherein $T$ represents the threshold value of likelihood ratio detection, the mean term represents the mean of the sliding-window sub-block, $P$ represents the number of points within a window sub-block, $P_{fa}$ represents the false alarm probability set for constant false alarm rate detection, $\tau_{CFAR}$ represents the detection threshold, and $F_{1,P-1}(\tau_{CFAR})$ represents the cumulative distribution function of a central F random variable;
step 3.3: calculating the total number of candidate targets according to the false alarm probability, ranking the candidates by false alarm probability from high to low, and determining the top-ranked targets from the plurality of amplitude values.
The construction of the training data set in step 1.3 comprises the following steps:
step 1.3.1: randomly generating coordinate points, as positions of simulated targets, in images that contain no weak and small targets, and extracting an N × N window area at each point as a background sample;
step 1.3.2: adding a simulated target to the background sample by using a two-dimensional Gaussian intensity model to obtain a target sample, wherein the two-dimensional Gaussian model is as follows:
$$s(i,j) = s_E \exp\left\{-\frac{1}{2}\left[\frac{(i-x_0)^2}{\sigma_x^2} + \frac{(j-y_0)^2}{\sigma_y^2}\right]\right\}$$
wherein $(x_0, y_0)$ represents the center position of the target image, $s(i,j)$ represents the pixel value of the target image at position $(i,j)$, $s_E$ represents the intensity of the generated target and is a random number in $(0, 1]$, and $\sigma_x$ and $\sigma_y$ represent the horizontal and vertical dispersion parameters, respectively, with values in $[0, 2]$;
step 1.3.3: adjusting the parameters of the target sample to generate weak targets with different signal-to-noise ratios, thereby completing the construction of the training data set.
The precision of the method is demonstrated by comparing its detection results with those of other mainstream algorithms. Two types of curves are adopted as evaluation indexes. The first type is the ROC curve, which reflects the relationship between the detection probability $P_d$ and the false alarm rate $P_{fa}$ in target detection; the larger the area under the ROC curve, the better the detection performance. $P_d$ and $P_{fa}$ are calculated as follows:
$$P_d = \frac{N_t}{N_a}, \qquad P_{fa} = \frac{N_f}{N}$$
wherein $N_t$ represents the number of correctly detected targets, $N_a$ represents the total number of targets, $N_f$ represents the number of falsely detected targets, and $N$ represents the number of all pixels in the image.
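The two evaluation quantities can be computed as in the following sketch, which assumes $P_d = N_t / N_a$ and $P_{fa} = N_f / N$ as suggested by the symbol definitions, and which approximates the area under the ROC curve by the trapezoidal rule. The function names and all numerical values are hypothetical and are not results of the method.

```python
def detection_metrics(n_t, n_a, n_f, n_pixels):
    """Return (P_d, P_fa) from detection counts.

    n_t: correctly detected targets, n_a: all targets,
    n_f: false detections, n_pixels: all pixels in the image.
    """
    return n_t / n_a, n_f / n_pixels


def roc_area(points):
    """Trapezoidal area under a ROC curve given (P_fa, P_d) points."""
    pts = sorted(points)
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )


# Hypothetical counts for one 256 x 256 test image.
p_d, p_fa = detection_metrics(n_t=48, n_a=60, n_f=13, n_pixels=256 * 256)
# Hypothetical three-point ROC curve.
area = roc_area([(0.0, 0.0), (0.5, 0.8), (1.0, 1.0)])
```

Sweeping the detection threshold and recording one (P_fa, P_d) pair per threshold yields the points from which an ROC curve of this kind is drawn.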
The second type of curve shows the detection probability $P_d$ as a function of the signal-to-noise ratio (SNR): as the SNR increases, $P_d$ gradually increases and finally approaches 1. The SNR calculation formula adopted is:
wherein $g_t$ represents the mean of the pixels of the target local area, and $g_b$ and $\sigma_b$ represent the mean and standard deviation of the pixels of the background local area, respectively.
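A minimal sketch of a local signal-to-noise computation is given below. It assumes the commonly used contrast definition $|g_t - g_b| / \sigma_b$, which is an assumption built from the three symbols defined above and may differ from the formula actually adopted; any conversion to decibels is omitted. The function name `local_snr` and the pixel values are hypothetical.

```python
import statistics


def local_snr(target_pixels, background_pixels):
    """Contrast-type SNR: |mean(target) - mean(background)| / std(background).

    Assumed definition; the population standard deviation is used.
    """
    g_t = statistics.fmean(target_pixels)
    g_b = statistics.fmean(background_pixels)
    sigma_b = statistics.pstdev(background_pixels)
    if sigma_b == 0.0:
        raise ValueError("background has zero variance; SNR is undefined")
    return abs(g_t - g_b) / sigma_b


# Hypothetical target and background pixel values.
snr = local_snr([0.9, 1.1], [0.1, 0.3, 0.1, 0.3])
```

Here the target mean is 1.0, the background mean is 0.2 and the background standard deviation is 0.1, so the sketch returns a ratio of about 8.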
The mainstream algorithms compared are ACSDM, CSCD, SR, ISTCR and ISTCSR-CSCD; the method of the present application is denoted the DL algorithm. Fig. 8 shows, from left to right, the results of the ACSDM, CSCD, SR, ISTCR, ISTCSR-CSCD and DL algorithms. The real position of a target is indicated by a green solid-line frame and the output of the detector by a red frame; if the output position of the detector coincides with the real position of the target, the green frame is covered by the red frame, so that a correctly detected target shows only a red frame, whereas a target showing only a green frame indicates that the detector missed it at the specified false alarm rate. Compared with the other algorithms, the DL algorithm detects the largest number of real targets and misses the fewest;
the results of the quantitative analysis of the 6 algorithms are shown in FIGS. 9-12, which depict the relationship between the detection probability $P_d$ and the false alarm rate $P_{fa}$ in four signal-to-noise ratio intervals (SNR < 10, 10 < SNR < 20, 20 < SNR < 30 and 30 < SNR < 40). The solid line marked with stars is the result of the DL algorithm proposed herein, and it can be seen that the algorithm of the present application is generally superior to the other 5 algorithms in the 4 SNR intervals. For SNR < 10, the ISTCR algorithm outperforms the present method and achieves the best result at $P_{fa} = 1 \times 10^{-4}$, while in the other cases the DL method has better detection precision; for 10 < SNR < 20, the detection rate of the present algorithm is much higher than that of the other methods at every false alarm rate, DL being about 20% higher than the second-ranked method, CSCD; for 20 < SNR < 30, it is about 12% higher than the best of the other methods; and for 30 < SNR < 40, it is also clearly superior to the other methods, by about 8%;
FIGS. 13-16 show $P_d$ as a function of SNR at a fixed $P_{fa}$; the solid line marked with stars represents the DL algorithm proposed in this application. It can be seen that the DL algorithm achieves the best results at the 4 different constant false alarm rates; in particular, for $P_{fa} > 10 \times 10^{-4}$, the present method is on average 20% higher than the best of the other methods. At $P_{fa} = 1 \times 10^{-4}$, the performance of the DL, ISTCR and ISTCR-CSCD algorithms is relatively close, but clearly superior to that of the other three algorithms. Fig. 17 is an enlarged schematic view of the detection results of the DL algorithm; because colors are used to distinguish the real-position frames from the detection output frames, the effect may be less apparent after the black-and-white processing required by patent law, and a color image can be provided if necessary.
The method learns the characteristics of weak and small targets under a complex background by constructing a spindle-type neural network structure; the deep neural network judges each image block and outputs a target probability, which is higher in target regions and lower in background regions; a target intensity map is constructed from these probabilities, and the targets are detected and located by a constant false alarm rate (CFAR) method.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.