CN111814875A - Method for expanding ship samples in infrared image based on pattern generation countermeasure network - Google Patents
- Publication number
- CN111814875A CN111814875A CN202010650897.8A CN202010650897A CN111814875A CN 111814875 A CN111814875 A CN 111814875A CN 202010650897 A CN202010650897 A CN 202010650897A CN 111814875 A CN111814875 A CN 111814875A
- Authority
- CN
- China
- Prior art keywords
- layer
- convolution
- network
- setting
- pattern
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides a method for expanding ship samples in infrared images based on a pattern generation countermeasure network. The method mainly addresses the problems of the prior art that the complex simulation modeling used to generate infrared images yields poor realism, that paired visible-light/infrared photoelectric-conversion training samples are difficult to acquire, and that the resulting small training sets leave the expanded infrared images lacking diversity. It comprises the following steps: (1) selecting real-shot infrared images to form a training set; (2) constructing a generator network; (3) constructing a discriminator network; (4) constructing the pattern generation countermeasure network; (5) training the discriminator network; (6) training the generator network; (7) training the pattern generation countermeasure network; (8) outputting infrared image samples with the trained generator network to complete the expansion of the infrared image samples. The invention can generate a large number of infrared ship samples and effectively improves the realism and diversity of the expanded samples.
Description
Technical Field
The invention belongs to the technical field of image processing, and further relates to a method, in the field of deep learning, for expanding ship samples in infrared images based on a style-based Generative Adversarial Network (StyleGAN), rendered throughout this document as a "pattern generation countermeasure network". The invention can expand the ship samples in infrared images so as to provide a rich data set for training infrared target detection and recognition algorithms.
Background
Infrared imaging technology is commonly used for target detection, identification and tracking owing to its strong target detection and anti-interference capabilities. Because the infrared characteristics of a target are complex and vary markedly with temperature, infrared target detection and identification are difficult; to improve this capability, a large number of infrared images is generally needed to train detection and identification algorithms. However, infrared images are usually obtained by photographing a target scene with a thermal imager, so the means of acquiring infrared images of certain specific targets are limited, causing a serious shortage of infrared samples. Researchers have therefore proposed a sample expansion method based on infrared simulation, which generates simulated infrared images by modeling the target-area scene and performing radiation calculations; however, because the target information required for modeling may be limited, the generated infrared images have low realism and are limited in number.
A target sample generation method based on infrared simulation is proposed in the patent document "Target sample generation method based on infrared simulation" (2018105060882, CN110162812A) filed by the Beijing Electromechanical Engineering Research Institute. The method models the target-area scene, completes region division and material assignment of the target scene, and generates a material image of the target-area scene; constructs a three-dimensional scene from the material image, the model of the target-area scene, and the target-area data; then sets the external environmental conditions of the target-area scene and calculates its temperature field, infrared radiation, and atmospheric transmittance to generate an infrared simulation scene of the target area under zero line of sight; and finally generates infrared simulation target samples from the line-of-sight and imaging-system parameters on the basis of that simulated scene. The method has the following defect: radiance calculations are required for the environment outside the target-area scene. Although radiation calculation can solve the problem of generating infrared simulation images, in a complex external environment many factors must be considered and the modeling process is intricate, so the simulation results have poor realism and infrared samples of the required target scene cannot be obtained accurately.
Chen et al., in the paper "Infrared image data enhancement based on generative adversarial networks" (Computer Applications, 2020: 1-7), propose a sample expansion method that generates infrared images from visible-light images via a photoelectric image-conversion model. The method first constructs a paired data set from existing visible-light and infrared image data, then builds the generator and discriminator of a generative adversarial network from convolutional neural networks. The network is trained on the paired data set until the generator and discriminator reach equilibrium. Finally, the trained generator transforms visible-light images from the visible-light domain to the infrared domain, completing the sample expansion of the infrared images. The method has the following defects: paired visible-light images and corresponding infrared images must be collected as a data set to train the network, such paired visible-light/infrared data are difficult to acquire, the training set is small, and the generated infrared samples lack diversity; moreover, when expanding the infrared samples, a large number of additional visible-light images must still be collected as photoelectric-conversion input, so the expansion procedure is cumbersome and the number of expanded samples is limited.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a ship sample expansion method for infrared images based on a pattern generation countermeasure network, so as to solve the problems that the simulation process for expanding infrared image samples is complex, the expanded infrared samples have poor realism, the training set for visible-light/infrared photoelectric-conversion methods is difficult to acquire, and the expanded infrared samples lack diversity and are limited in number.
The idea of the invention for realizing the above purpose is as follows: construct a pattern generation countermeasure network that takes random feature vectors as input, use real-shot infrared ship samples as the training set, and update the network parameters by alternately training the discriminator and the generator. When the updated generator network loss is less than 20 and the mean discriminator network loss is less than 10, store the weight parameters of each layer of the trained generator network. Finally, input randomly generated feature vectors into the trained generator network to obtain generated infrared images, and add them to the training set to complete the expansion of the infrared ship samples.
The method comprises the following specific steps:
(1) acquiring a training set:
selecting at least 2000 infrared images, each containing a ship target; scaling and cropping each image to 256 × 256 to form a training set;
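The scale-and-crop preprocessing of step (1) can be sketched as follows. This is a minimal numpy illustration (nearest-neighbour scaling of the shorter side, then a centre crop); the function name and the exact resampling method are our own assumptions, not the patent's pipeline.

```python
import numpy as np

def scale_and_crop(img: np.ndarray, size: int = 256) -> np.ndarray:
    """Scale the shorter side of a 2-D image to `size` (nearest neighbour),
    then centre-crop to size x size, as in step (1)."""
    h, w = img.shape
    s = size / min(h, w)                      # scale factor for the shorter side
    nh, nw = max(size, round(h * s)), max(size, round(w * s))
    rows = (np.arange(nh) / s).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / s).astype(int).clip(0, w - 1)
    scaled = img[np.ix_(rows, cols)]          # nearest-neighbour resample
    top, left = (nh - size) // 2, (nw - size) // 2
    return scaled[top:top + size, left:left + size]

# A random array stands in for a real-shot 300x400 infrared frame.
ir = np.random.default_rng(0).integers(0, 255, (300, 400)).astype(np.float32)
out = scale_and_crop(ir)
print(out.shape)  # (256, 256)
```

In practice each image of the training set would be passed through such a function once before training begins.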
(2) constructing a generator network:
(2a) a generator network is built, with the following structure in sequence: constant matrix layer → 1st noise modulation layer → 1st adaptive pattern modulation layer → 1st deconvolution layer → 2nd noise modulation layer → activation function layer → 2nd adaptive pattern modulation layer → pattern convolution block combination → 2nd deconvolution layer → output layer;
the adaptive pattern modulation layer structure comprises, in sequence: feature vector input layer → normalization layer → 1st fully-connected layer → 2nd fully-connected layer → 3rd fully-connected layer → 4th fully-connected layer → 5th fully-connected layer → 6th fully-connected layer → 7th fully-connected layer → 8th fully-connected layer → scaling-translation transform layer → output layer;
the pattern convolution block combination is formed by cascading 6 pattern convolution blocks with the same structure, and the structure of each pattern convolution block is, in sequence: 1st deconvolution layer → 1st noise modulation layer → 1st activation function layer → 1st adaptive pattern modulation layer → 2nd deconvolution layer → 2nd noise modulation layer → 2nd activation function layer → 2nd adaptive pattern modulation layer;
the normalization layer is realized with an instance normalization function; the activation function layers are all realized with the Leaky ReLU function;
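The two primitives named in step (2a) can be sketched in numpy: instance normalization normalizes each feature map of each sample independently, and Leaky ReLU uses the 0.2 slope set in step (2b). A minimal illustration under those settings, not the patent's implementation.

```python
import numpy as np

def instance_norm(x: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Instance normalization: zero mean / unit variance per sample, per
    channel. x has shape (batch, channels, height, width)."""
    mu = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def leaky_relu(x: np.ndarray, slope: float = 0.2) -> np.ndarray:
    """Leaky ReLU with the 0.2 negative slope used throughout the network."""
    return np.where(x >= 0, x, slope * x)

feat = np.random.default_rng(1).normal(2.0, 3.0, (2, 4, 8, 8))
normed = instance_norm(feat)
print(np.allclose(normed.mean(axis=(2, 3)), 0.0, atol=1e-6))  # True
print(leaky_relu(np.array([-1.0, 2.0])))                      # [-0.2  2. ]
```

Instance (rather than batch) normalization matters here because each generated image carries its own pattern statistics, which the adaptive pattern modulation layer then rescales.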
(2b) setting per-layer parameters of the generator network:
setting the size of each convolution kernel of the 1st and 2nd deconvolution layers to 3 × 3, setting the numbers of convolution kernels to 512 and 3 respectively, setting the convolution step size to 1, and initializing the weights to random values satisfying a normal distribution with a standard deviation of 0.02;
setting the slope of each Leaky ReLU activation function to 0.2;
setting the size of each convolution kernel of the 1st and 2nd deconvolution layers of the 1st to 3rd pattern convolution blocks in the pattern convolution block combination to 3 × 3, the number of convolution kernels to 512, and the convolution step size to 1; for the 4th pattern convolution block, setting the kernel size of the 1st and 2nd deconvolution layers to 3 × 3, the number of kernels to 256, and the step size to 1; for the 5th pattern convolution block, the kernel size to 3 × 3, the number to 128, and the step size to 1; for the 6th pattern convolution block, the kernel size to 3 × 3, the number to 64, and the step size to 1; initializing the weight of each convolution kernel of each deconvolution layer in the 1st to 6th pattern convolution blocks to a random value satisfying a normal distribution with a standard deviation of 0.02; setting the slope of each Leaky ReLU activation function in the 1st to 6th pattern convolution blocks to 0.2;
setting the number of neurons in each of the 1st to 8th fully-connected layers of the adaptive pattern modulation layer to 512, and initializing the weights to random values satisfying a normal distribution with a standard deviation of 0.02;
(3) constructing a discriminator network:
(3a) a discriminator network is built, and the structure of the discriminator network is as follows in sequence: input layer → convolutional layer → activation function layer → convolutional block combination → fully connected layer → output layer;
the convolution block combination is formed by cascading 6 convolution blocks with the same structure, and the structure of each convolution block is, in sequence: 1st convolution layer → 1st activation function layer → 2nd convolution layer → 1st pooling layer → 2nd activation function layer;
the activation function layers are all realized with the Leaky ReLU function, and the pooling layers are realized with global average pooling;
(3b) setting parameters of each layer of the discriminator network:
setting the size of each convolution kernel of the convolution layer to 3 × 3, the number of convolution kernels to 512, and the convolution step size to 1, and initializing the weights to random values satisfying a normal distribution with a standard deviation of 0.02;
setting the slope of each Leaky ReLU activation function to 0.2;
setting the size of each convolution kernel of the 1st and 2nd convolution layers in the 1st and 2nd convolution blocks of the convolution block combination to 3 × 3, the number of convolution kernels to 512, and the convolution step size to 1; for the 3rd convolution block, setting the kernel size to 3 × 3, the number to 256, and the step size to 1; for the 4th convolution block, the kernel size to 3 × 3, the number to 128, and the step size to 1; for the 5th convolution block, the kernel size to 3 × 3, the number to 64, and the step size to 1; for the 6th convolution block, the kernel size to 3 × 3, the number to 32, and the step size to 1; initializing each convolution kernel weight of each convolution layer in the 1st to 6th convolution blocks to a random value satisfying a normal distribution with a standard deviation of 0.02; setting the slope of the Leaky ReLU function of each activation function layer in the 1st to 6th convolution blocks to 0.2;
setting the number of neurons in the fully-connected layer to 1, and initializing the weights to random values satisfying a normal distribution with a standard deviation of 0.02;
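Steps (3a)-(3b) end the discriminator with global average pooling and a single-neuron fully-connected layer, so each image maps to one evaluation score. A minimal numpy sketch under those settings; the function and variable names are ours.

```python
import numpy as np

def discriminator_head(feat_maps, w_fc, b_fc=0.0):
    """feat_maps: (batch, channels, H, W) final convolutional features.
    Global average pooling collapses each map to one value, then a
    single-neuron fully-connected layer produces one score per image."""
    pooled = feat_maps.mean(axis=(2, 3))   # global average pooling
    return pooled @ w_fc + b_fc            # (batch,) evaluation scores

rng = np.random.default_rng(3)
feat = rng.normal(size=(16, 32, 4, 4))     # 32 kernels in the 6th block
w_fc = rng.normal(scale=0.02, size=32)     # std-0.02 initialization
scores = discriminator_head(feat, w_fc)
print(scores.shape)  # (16,)
```

A batch of 16 is used here to match the 16 generated and 16 real-shot images that appear in the loss computation of step (5).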
(4) constructing a pattern generation countermeasure network:
cascading the generator network and the discriminator network to form a pattern to generate a countermeasure network;
(5) training the arbiter network:
fixing the current weight parameters of the generator network, inputting randomly generated feature vectors into the generator network, outputting random infrared images, respectively inputting the generated random infrared images and infrared images in a training set into a discriminator network, respectively outputting corresponding evaluation scores by the discriminator network after evaluating the sequentially input infrared images, and calculating the loss value of the discriminator network by using the evaluation scores of the discriminator network and the loss function of the discriminator network;
calculating the gradient of each convolution kernel of each convolution layer of the discriminator network and the gradient of the full-connection layer by using the loss value and the gradient descent method of the discriminator network;
updating the weight of each convolution kernel of each convolution layer of the discriminator network and the weight of the fully-connected layer, according to their gradients, using an Adam optimizer with a learning rate of 0.005;
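The Adam update used in steps (5) and (6) (learning rate 0.005) can be sketched for a single parameter tensor. The β and ε values below are the common Adam defaults, an assumption since the patent does not state them.

```python
import numpy as np

def adam_step(w, grad, state, lr=0.005, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. `state` holds the moment estimates and step count."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad        # first moment
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2   # second moment
    m_hat = state["m"] / (1 - b1 ** state["t"])           # bias correction
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

w = np.array([1.0, -1.0])
state = {"t": 0, "m": np.zeros(2), "v": np.zeros(2)}
w = adam_step(w, grad=np.array([0.5, -0.5]), state=state)
print(w)  # the first step moves each weight by ~lr against the gradient sign
```

In the patent's scheme a separate optimizer state of this kind is kept for every convolution kernel and fully-connected weight of whichever network is currently being trained.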
(6) training the generator network:
fixing the current weight parameters of the discriminator network, inputting randomly generated feature vectors into the generator network, outputting infrared images, inputting the generated random infrared images into the discriminator network, evaluating the input generated images by the discriminator network, outputting evaluation scores, and calculating a generator network loss value by using the evaluation scores of the discriminator and a loss function of the generator network;
calculating the gradient of each convolution kernel of each deconvolution layer of the generator network, the gradient of a full-connection layer, the gradient of a modulation noise layer and the gradient of a self-adaptive pattern modulation layer by using a loss value and gradient descent method of the generator network;
iteratively updating the weight of each convolution kernel of each deconvolution layer of the generator network, the weight of the full-connection layer, the weight of the modulation noise layer and the weight of the adaptive pattern modulation layer by using an Adam optimizer with a learning rate of 0.005;
(7) training the pattern generation countermeasure network:
repeating steps (5) and (6), alternately training the discriminator network and the generator network, until the generator network loss obtained in the current iteration is less than 20 and the mean discriminator network loss is less than 10; then obtaining the trained generator network weights and storing, for the trained pattern generation countermeasure network, the weight of each convolution kernel of each deconvolution layer of the generator network, the weights of the fully-connected layers, the weights of the noise modulation layers, and the weights of the adaptive pattern modulation layers;
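The alternating schedule and stopping rule of step (7) can be sketched as a plain-Python loop. `train_discriminator` and `train_generator` are hypothetical stand-ins that return loss values (here a toy geometrically decaying sequence), not the real training steps of steps (5) and (6).

```python
def train_pattern_gan(train_discriminator, train_generator,
                      g_threshold=20.0, d_threshold=10.0, max_iters=10_000):
    """Alternate the discriminator and generator updates until the generator
    loss falls below 20 and the mean discriminator loss falls below 10,
    as specified in step (7)."""
    d_losses = []
    for it in range(max_iters):
        d_losses.append(train_discriminator())   # step (5)
        g_loss = train_generator()               # step (6)
        d_mean = sum(d_losses) / len(d_losses)
        if g_loss < g_threshold and d_mean < d_threshold:
            return it + 1, g_loss, d_mean        # weights would be saved here
    raise RuntimeError("did not reach the stopping criterion")

# Toy stand-ins: both losses halve each iteration, starting from 100.
state = {"d": 100.0, "g": 100.0}
def fake_d():
    state["d"] *= 0.5
    return state["d"]
def fake_g():
    state["g"] *= 0.5
    return state["g"]

iters, g_loss, d_mean = train_pattern_gan(fake_d, fake_g)
print(g_loss < 20.0 and d_mean < 10.0)  # True
```

Note that the criterion tests the running mean of the discriminator loss, not its latest value, so a few noisy iterations do not end training prematurely.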
(8) expanding an infrared image sample:
inputting randomly generated feature vectors into the trained generator network for calculation, outputting infrared images containing ship samples, and adding these infrared images to the training set to complete the expansion of the infrared ship samples.
Compared with the prior art, the invention has the following advantages:
firstly, the method for modulating the convolution layer result by adopting the pattern convolution block and using the random characteristic vector as the input of the pattern convolution block generates infrared image ship samples with various ship target postures and sea surface fluctuation, overcomes the problem of poor infrared image diversity generated in the prior art, and improves the diversity of the infrared image ship samples when the infrared image ship samples are expanded.
Second, because the invention adopts a pattern generation countermeasure network, uses real-shot infrared ship samples as the training set, and trains the network by alternating discriminator and generator updates, the infrared characteristics of the output ship samples closely approximate real-shot infrared images, overcoming the poor realism of simulation-generated infrared images in the prior art and improving the realism of the expanded infrared samples.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of a generator network according to the present invention;
FIG. 3 is a schematic diagram of a discriminator network according to the present invention;
FIG. 4 is a schematic diagram of a pattern generation countermeasure network according to the present invention;
FIG. 5 is a simulation of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings.
The implementation steps of the present invention are described in further detail with reference to fig. 1.
Step 1, a training set is obtained.
Selecting at least 2000 infrared images, wherein each image comprises a ship target; the size of each image is scaled and cropped to 256 × 256, which constitutes a training set.
And 2, constructing a generator network.
And constructing a generator network, setting parameters of each layer of the network, and generating the generator network of the countermeasure network by taking the parameters as patterns.
The structure of the generator network in the pattern generation countermeasure network constructed by the present invention will be described in further detail with reference to fig. 2.
The structure of the generator network is, in sequence: constant matrix layer → 1st noise modulation layer → 1st adaptive pattern modulation layer → 1st deconvolution layer → 2nd noise modulation layer → activation function layer → 2nd adaptive pattern modulation layer → pattern convolution block combination → 2nd deconvolution layer → output layer.
The adaptive pattern modulation layer structure comprises, in sequence: feature vector input layer → normalization layer → fully-connected layers → scaling-translation transform layer → output layer.
The pattern convolution block combination is formed by cascading 6 pattern convolution blocks with the same structure, and the structure of each pattern convolution block is, in sequence: 1st deconvolution layer → 1st noise modulation layer → 1st activation function layer → 1st adaptive pattern modulation layer → 2nd deconvolution layer → 2nd noise modulation layer → 2nd activation function layer → 2nd adaptive pattern modulation layer.
The normalization layer is implemented with an instance normalization function. The activation function layers are all implemented with the Leaky ReLU function.
The ellipses in fig. 2 represent the 2 nd to 6 th pattern convolution blocks, ten deconvolution layers, ten noise modulation layers, ten activation function layers, and ten scale-shift transform layers. The random noise in fig. 2 represents the input to the noise modulation layer. The single-directional arrows in fig. 2 indicate the characteristic connections.
The per-layer parameter settings of the generator network are as follows:
the size of each convolution kernel of the 1 st deconvolution layer and the 2 nd deconvolution layer is set to be 3 multiplied by 3, the number of the convolution kernels is set to be 512 and 3 respectively, the convolution step length is set to be 1, and the weight is initialized to be a random value which meets the normal distribution with the standard deviation of 0.02.
The slopes of the leak ReLU functions of the activation function layers are each set to 0.2.
The size of each convolution kernel of the 1 st and 2 nd deconvolution layers of the 1 st to 3 rd pattern convolution blocks in the pattern convolution block combination is set to be 3 x 3, the number of the convolution kernels is set to be 512, and the convolution step size is set to be 1. The size of each convolution kernel of the 1 st and 2 nd deconvolution layers in the 4 th pattern convolution block is set to be 3 x 3, the number of the convolution kernels is set to be 256, and the convolution step size is set to be 1. The size of each convolution kernel of the 1 st and 2 nd deconvolution layers in the 5 th pattern convolution block is set to be 3 x 3, the number of the convolution kernels is set to be 128, and the convolution step size is set to be 1. The size of each convolution kernel of the 1 st deconvolution layer and the 2 nd deconvolution layer in the 6 th pattern convolution block is set to be 3 multiplied by 3, the number of the convolution kernels is set to be 64, and the convolution step size is set to be 1. The weight of each convolution kernel of each deconvolution layer in the pattern 1 to 6 volume blocks is initialized to a random value satisfying a normal distribution with a standard deviation of 0.02. The slope of each leakage ReLU function of the activation function layers in the pattern 1 to pattern 6 volume blocks is set to 0.2.
The fully-connected layer weights in the adaptive pattern modulation layer are initialized to random values that satisfy a normal distribution with a standard deviation of 0.02.
And 3, constructing a discriminator network.
And constructing a discriminator network, setting parameters of each layer of the network, and using the parameters as a pattern to generate the discriminator network of the countermeasure network.
The structure of the arbiter network in the pattern generation countermeasure network constructed by the present invention will be described in further detail with reference to fig. 3.
The network structure of the discriminator is as follows in sequence: input layer → convolutional layer → activation function layer → combination of convolutional blocks → fully connected layer → output layer.
The convolution block combination is formed by cascading 6 convolution blocks with the same structure, and the structure of each convolution block is as follows in sequence: convolution layer 1 → activation function layer 1 → convolution layer 2 → pooling layer 1 → activation function layer 2.
The activation function layers are all realized by adopting a Leaky ReLU function, and the pooling layer is realized by adopting global average pooling.
The ellipses in fig. 3 represent the 2 nd to 6 th convolution blocks, ten convolution layers in total, ten activation function layers, and five pooling layers, with the one-way arrows in fig. 3 representing the characteristic connection relationships.
The parameters of each layer of the arbiter network are set as follows:
The size of each convolution kernel of the convolution layer is set to 3 × 3, the number of convolution kernels is set to 512, the convolution step size is set to 1, and the weights are initialized to random values satisfying a normal distribution with a standard deviation of 0.02.
The slope of each Leaky ReLU function of the activation function layer is set to 0.2.
The size of each convolution kernel of the 1st and 2nd convolution layers in the 1st and 2nd convolution blocks of the convolution block combination is set to 3 × 3, with 512 kernels and a step size of 1. In the 3rd convolution block the kernel size is 3 × 3, with 256 kernels and a step size of 1; in the 4th, 3 × 3 with 128 kernels and a step size of 1; in the 5th, 3 × 3 with 64 kernels and a step size of 1; in the 6th, 3 × 3 with 32 kernels and a step size of 1. Each convolution kernel weight of each convolution layer in the 1st to 6th convolution blocks is initialized to a random value satisfying a normal distribution with a standard deviation of 0.02. The slope of the Leaky ReLU function for each activation function layer in the 1st to 6th convolution blocks is set to 0.2.
The weights of the fully-connected layer are initialized to random values satisfying a normal distribution with a standard deviation of 0.02.
And 4, constructing a pattern to generate a countermeasure network.
The structure of the pattern generation countermeasure network constructed by the present invention will be described in further detail with reference to fig. 4.
The pattern generation countermeasure network is formed by cascading the generator network and the discriminator network.
The generator in fig. 4 represents the generator network described in step 2, the arbiter in fig. 4 represents the arbiter network described in step 3, and the single-headed arrow in fig. 4 represents the signature connection relationship.
And 5, training a discriminator network.
Fix the current weight parameters of the generator network; input randomly generated feature vectors into the generator network and output random infrared images; input the generated random infrared images and the infrared images of the training set into the discriminator network respectively; the discriminator network evaluates the infrared images input in sequence and outputs corresponding evaluation scores; and calculate the discriminator network loss from these evaluation scores using the discriminator network loss function, whose formula is as follows:

L_D = E[D(G(z))] - E[D(x)] + γ·E[D(x)²] + λ·E[(||∇D(x̂)||₂ - 1)²]

where L_D represents the loss function of the discriminator network, E[·] represents the expectation operation, D(·) represents the output of the discriminator network in the pattern generation countermeasure network, G(·) represents the output of the generator network in the pattern generation countermeasure network, z represents a randomly generated feature vector, x represents the 16 real-shot infrared images taken from the training set, γ represents the coefficient of the square term, λ represents the coefficient of the constraint term, x̂ represents the constraint-term infrared images obtained by fusing the 16 infrared images output by the generator with the 16 real-shot infrared images of the training set in random proportions, ||·||₂ represents the 2-norm operation, and ∇ represents the derivative operation.
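Given the evaluation scores, the discriminator loss described above can be sketched in numpy. The gradient norm of D at the fused images x̂ is passed in as a precomputed array, since computing it requires automatic differentiation; the γ and λ values are illustrative assumptions, as the patent does not state them.

```python
import numpy as np

def discriminator_loss(d_fake, d_real, grad_norms, gamma=0.001, lam=10.0):
    """L_D = E[D(G(z))] - E[D(x)] + gamma*E[D(x)^2]
           + lam*E[(||grad D(x_hat)||_2 - 1)^2].
    d_fake/d_real: scores for 16 generated / 16 real-shot images;
    grad_norms: 2-norms of the discriminator gradient at the fused x_hat."""
    return (d_fake.mean() - d_real.mean()
            + gamma * (d_real ** 2).mean()
            + lam * ((grad_norms - 1.0) ** 2).mean())

rng = np.random.default_rng(4)
d_fake = rng.normal(0.0, 1.0, 16)
d_real = rng.normal(1.0, 1.0, 16)
x_hat_norms = rng.uniform(0.8, 1.2, 16)   # stand-in gradient norms at x_hat
loss = discriminator_loss(d_fake, d_real, x_hat_norms)
print(np.isfinite(loss))  # True
```

The constraint term pushes the gradient norm at the fused images toward 1, and the square term keeps the real-image scores from drifting, which together stabilize the alternating training of steps 5 and 6.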
And calculating the gradient of each convolution kernel of each convolution layer of the discriminator network and the gradient of the full-connection layer by using the loss value and the gradient descent method of the discriminator network.
The weights of each convolution kernel of each convolution layer of the discriminator network and the weights of the fully connected layer are updated by using the gradient of each convolution kernel of each convolution layer and the gradient of the fully connected layer of the discriminator network, with an Adam optimizer with a learning rate of 0.005.
And 6, training a generator network.
Fixing the current weight parameters of the discriminator network, inputting randomly generated feature vectors into the generator network and outputting infrared images; inputting the generated random infrared images into the discriminator network; the discriminator network evaluates the input generated images and outputs evaluation scores; and calculating the loss value of the generator network by using the evaluation scores of the discriminator and the loss function of the generator network, wherein the loss function of the generator network is calculated as follows:
LG=-E[D(G(z))]
wherein LG represents the loss function of the generator network, E[·] represents the expectation operation, D(·) represents the output of the discriminator network in the pattern generation countermeasure network, G(·) represents the output of the generator network in the pattern generation countermeasure network, and z represents the randomly generated feature vector.
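The generator loss LG = -E[D(G(z))] reduces to the negative mean of the discriminator scores on generated images, as in this minimal sketch (the function name is illustrative):

```python
import numpy as np

def generator_loss(d_fake_scores):
    """LG = -E[D(G(z))]: the generator is rewarded when the discriminator
    assigns high evaluation scores to its generated infrared images."""
    return -float(np.mean(d_fake_scores))
```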
And calculating the gradient of each convolution kernel of each deconvolution layer of the generator network, the gradient of the fully connected layers, the gradient of the noise modulation layers and the gradient of the adaptive pattern modulation layers by using the loss value of the generator network and the gradient descent method.
And iteratively updating the weight of each convolution kernel of each deconvolution layer of the generator network, the weights of the fully connected layers, the weights of the noise modulation layers and the weights of the adaptive pattern modulation layers by using an Adam optimizer with a learning rate of 0.005.
And 7, training the pattern to generate a countermeasure network.
And repeating step 5 and step 6, alternately training the discriminator network and the generator network until the loss value of the generator network obtained in the current iteration is less than 20 and the mean loss value of the discriminator network is less than 10, so as to obtain the trained generator network weights; and storing the weights of each convolution kernel of each deconvolution layer, the weights of the fully connected layers, the weights of the noise modulation layers and the weights of the adaptive pattern modulation layers of the generator network in the trained pattern generation countermeasure network.
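The alternating schedule of steps 5-7 can be sketched as a loop over two update callables. The `d_step`/`g_step` callables are hypothetical stand-ins for one discriminator update and one generator update; only the stopping criterion (generator loss below 20, mean discriminator loss below 10) is taken from the text above.

```python
def train_gan(d_step, g_step, max_iters=10000):
    """Skeleton of the alternating training: each iteration runs one
    discriminator update (generator frozen) then one generator update
    (discriminator frozen), stopping when the thresholds are met."""
    d_losses = []
    g_loss, d_mean = float("inf"), float("inf")
    for it in range(max_iters):
        d_losses.append(d_step())              # step 5: update discriminator
        g_loss = g_step()                      # step 6: update generator
        d_mean = sum(d_losses) / len(d_losses) # mean discriminator loss so far
        if g_loss < 20 and d_mean < 10:        # step 7 stopping criterion
            return it + 1, g_loss, d_mean
    return max_iters, g_loss, d_mean
```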
And 8, expanding the infrared image sample.
And inputting the randomly generated feature vectors into a trained generator network for calculation, outputting infrared images containing ship samples, adding the infrared images into a training set, and completing the expansion of the infrared image ship samples.
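The expansion step above amounts to sampling random feature vectors and passing each through the trained generator. In this sketch the generator is any callable and the 512-dimensional feature vector is an assumption matching the 512-unit fully connected layers of the pattern modulation path.

```python
import numpy as np

def expand_samples(generator, n, z_dim=512, seed=0):
    """Step 8 sketch: draw n randomly generated feature vectors and map
    each through the trained generator to a new infrared ship sample."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, z_dim))  # randomly generated feature vectors
    return [generator(zi) for zi in z]   # samples to be added to the training set
```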
The effect of the present invention will be further explained with a simulation experiment.
1. Simulation conditions are as follows:
the simulation experiment of the invention is carried out in a hardware environment with a single NVIDIA RTX 2080 GPU and 128 GB of memory, and a PyTorch 1.1.0 software environment.
2. Simulation content and result analysis:
the simulation experiment of the invention is to respectively expand the actually shot 2000 infrared image ship samples by using the method of the invention and a method in the prior art.
The prior art is as follows: the patent document "target sample generation method based on infrared simulation" (application No. 2018105060882, publication No. CN110162812A) filed by the Beijing Electromechanical Engineering Research Institute proposes a target sample generation method based on infrared simulation.
Fig. 5 is a simulation diagram of the present invention, wherein fig. 5(a) shows 1 real-shot infrared image ship sample randomly selected from the training set used in the simulation experiment of the present invention, fig. 5(b) shows 1 infrared image ship expansion sample generated by the prior art, and fig. 5(c) shows 9 infrared image ship expansion samples generated by the method of the present invention.
Comparing fig. 5(a), fig. 5(b) and fig. 5(c), it can be seen that the infrared image ship sample generated by the prior art in fig. 5(b) differs significantly from the real-shot infrared image ship sample in fig. 5(a), while the infrared image ship samples generated by the method of the present invention in fig. 5(c) are highly consistent with the real-shot infrared image ship sample in fig. 5(a) in terms of infrared texture detail characteristics, target radiation characteristics and the like.
Comparing the 9 infrared image ship expansion samples generated by the method of the present invention in fig. 5(c) with one another, it can be seen that they contain ship targets of various sizes and angles and sea-surface bright bands of various forms, which shows that the infrared image ship samples generated by the method of the present invention have strong diversity.
Therefore, the method of the invention overcomes the problems in the prior art, improves the sense of reality of the infrared image ship expansion sample, and increases the diversity of the infrared image ship sample.
Claims (3)
1. A method for expanding ship samples in an infrared image based on a pattern generation countermeasure network, characterized in that a pattern generation countermeasure network alternately trained by a discriminator and a generator is adopted, and random feature vectors are used to respectively modulate each layer of the generator, so that the random feature vectors output infrared image ship samples through the generator for expansion, the method specifically comprising the following steps:
(1) acquiring a training set:
selecting at least 2000 infrared images, wherein each image comprises a ship target; scaling and cutting the size of each image into 256 multiplied by 256 to form a training set;
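The scaling-and-cropping of step (1) can be sketched as follows: the shorter side is scaled to 256 (nearest-neighbour here, purely for illustration; the patent does not specify the interpolation) and the result is centre-cropped to 256 x 256. The function name is hypothetical.

```python
import numpy as np

def scale_and_crop(img, size=256):
    """Scale the shorter side of an image array (H, W[, C]) to `size`
    with nearest-neighbour index mapping, then centre-crop to size x size."""
    h, w = img.shape[:2]
    s = size / min(h, w)
    nh, nw = int(round(h * s)), int(round(w * s))
    rows = np.minimum((np.arange(nh) / s).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) / s).astype(int), w - 1)
    scaled = img[rows][:, cols]            # nearest-neighbour scaling
    top, left = (nh - size) // 2, (nw - size) // 2
    return scaled[top:top + size, left:left + size]
```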
(2) constructing a generator network:
(2a) a generator network is built, and the structure of the generator network is as follows in sequence: constant matrix layer → 1 st noise modulation layer → 1 st adaptive pattern modulation layer → 1 st deconvolution layer → 2 nd noise modulation layer → activation function layer → 2 nd adaptive pattern modulation layer → pattern convolution block combination → 2 nd deconvolution layer → output layer;
the adaptive pattern modulation layer structure sequentially comprises: the feature vector input layer → the normalization layer → the 1 st fully-connected layer → the 2 nd fully-connected layer → the 3 rd fully-connected layer → the 4 th fully-connected layer → the 5 th fully-connected layer → the 6 th fully-connected layer → the 7 th fully-connected layer → the 8 th fully-connected layer → the scaling translation transform layer → the output layer;
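The scaling-translation transform at the end of the adaptive pattern modulation layer can be sketched as an AdaIN-style operation: instance-normalize each channel of the feature map, then scale and shift it with per-channel coefficients ys, yb derived (by the fully connected layers) from the feature vector. The function name and the assumption that the fully connected output splits into (ys, yb) are illustrative.

```python
import numpy as np

def adaptive_pattern_modulation(feat, ys, yb, eps=1e-5):
    """Instance-normalize a feature map of shape (C, H, W), then apply the
    per-channel scaling (ys) and translation (yb) pattern coefficients."""
    mu = feat.mean(axis=(1, 2), keepdims=True)
    sigma = feat.std(axis=(1, 2), keepdims=True)
    normed = (feat - mu) / (sigma + eps)  # instance normalization
    return ys[:, None, None] * normed + yb[:, None, None]
```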
the pattern convolution block combination is formed by cascading 6 pattern convolution blocks with the same structure, and the structure of each pattern convolution block is as follows in sequence: 1 st deconvolution layer → 1 st noise modulation layer → 1 st activation function layer → 1 st adaptive pattern modulation layer → 2 nd deconvolution layer → 2 nd noise modulation layer → 2 nd activation function layer → 2 nd adaptive pattern modulation layer;
the normalization layer is realized by adopting an instance normalization function; the activation function layers are all realized by adopting the Leaky ReLU function;
(2b) setting per-layer parameters of the generator network:
setting the size of each convolution kernel of the 1 st deconvolution layer and the 2 nd deconvolution layer to be 3 multiplied by 3, respectively setting the number of the convolution kernels to be 512 and 3, setting the convolution step length to be 1, and initializing the weight to be a random value which meets the normal distribution with the standard deviation of 0.02;
setting the slopes of the Leaky ReLU functions of the activation function layers to be 0.2;
setting the sizes of convolution kernels of 1 st and 2 nd deconvolution layers of 1 st to 3 rd pattern convolution blocks in the pattern convolution block combination to be 3 multiplied by 3, setting the number of the convolution kernels to be 512, and setting convolution step length to be 1; setting the size of each convolution kernel of the 1 st deconvolution layer and the 2 nd deconvolution layer in the 4 th pattern convolution block to be 3 multiplied by 3, setting the number of the convolution kernels to be 256, and setting the convolution step length to be 1; setting the size of each convolution kernel of the 1 st deconvolution layer and the 2 nd deconvolution layer in the 5 th pattern convolution block to be 3 multiplied by 3, setting the number of the convolution kernels to be 128, and setting the convolution step length to be 1; setting the size of each convolution kernel of the 1 st deconvolution layer and the 2 nd deconvolution layer in the 6 th pattern convolution block to be 3 multiplied by 3, setting the number of the convolution kernels to be 64, and setting the convolution step length to be 1; initializing the weight of each convolution kernel of each deconvolution layer in the 1 st to 6 th pattern convolution blocks to be a random value satisfying a normal distribution with a standard deviation of 0.02; setting the slope of each Leaky ReLU function of the activation function layers in the 1 st to 6 th pattern convolution blocks to 0.2;
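Two settings recur throughout the parameter lists above: weights initialized from a normal distribution with standard deviation 0.02, and Leaky ReLU activations with slope 0.2. A minimal sketch of both (function names are illustrative):

```python
import numpy as np

def init_conv_weight(out_ch, in_ch, k=3, std=0.02, seed=0):
    """Initialize a bank of k x k convolution kernels of shape
    (out_ch, in_ch, k, k) with N(0, 0.02) random values, as specified
    for every convolution/deconvolution layer in the networks."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, std, size=(out_ch, in_ch, k, k))

def leaky_relu(x, slope=0.2):
    """Leaky ReLU with the slope 0.2 used by every activation function layer."""
    return np.where(x >= 0, x, slope * x)
```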
the number of neurons of the 1 st to 8 th fully-connected layers in the self-adaptive pattern modulation layer is set to be 512, and weights are initialized to be random values meeting the normal distribution with the standard deviation of 0.02;
(3) constructing a discriminator network:
(3a) a discriminator network is built, and the structure of the discriminator network is as follows in sequence: input layer → convolutional layer → activation function layer → convolutional block combination → fully connected layer → output layer;
the convolution block combination is formed by cascading 6 convolution blocks with the same structure, and the structure of each convolution block is as follows in sequence: 1 st convolution layer → 1 st activation function layer → 2 nd convolution layer → 1 st pooling layer → 2 nd activation function layer;
the activation function layers are all realized by adopting a Leaky ReLU function, and the pooling layer is realized by adopting global average pooling;
(3b) setting parameters of each layer of the discriminator network:
setting the size of each convolution kernel of the convolution layer to be 3 multiplied by 3, setting the number of the convolution kernels to be 512, setting the convolution step length to be 1, and initializing the weights to be random values satisfying a normal distribution with a standard deviation of 0.02;
setting the slope of each Leaky ReLU function of the activation function layer to be 0.2;
setting the size of each convolution kernel of the 1 st convolution layer and the 2 nd convolution layer in the 1 st convolution block and the 2 nd convolution block in the convolution block combination to be 3 multiplied by 3, setting the number of the convolution kernels to be 512, and setting the convolution step length to be 1; setting the size of each convolution kernel of the 1 st convolution layer and the 2 nd convolution layer in the 3 rd convolution block to be 3 multiplied by 3, setting the number of the convolution kernels to be 256, and setting the convolution step length to be 1; setting the size of each convolution kernel of the 1 st convolution layer and the 2 nd convolution layer in the 4 th convolution block to be 3 multiplied by 3, setting the number of the convolution kernels to be 128, and setting the convolution step length to be 1; setting the size of each convolution kernel of the 1 st convolution layer and the 2 nd convolution layer in the 5 th convolution block to be 3 multiplied by 3, setting the number of the convolution kernels to be 64, and setting the convolution step length to be 1; setting the size of each convolution kernel of the 1 st convolution layer and the 2 nd convolution layer in the 6 th convolution block to be 3 multiplied by 3, setting the number of the convolution kernels to be 32, and setting the convolution step length to be 1; initializing each convolution kernel weight of each convolution layer in the 1 st to 6 th convolution blocks to a random value satisfying a normal distribution with a standard deviation of 0.02; setting the slope of the Leaky ReLU function of each activation function layer in the 1 st to 6 th convolution blocks to 0.2;
setting the number of the neurons of the full connection layer as 1, and initializing the weight into a random value which meets the normal distribution with the standard deviation of 0.02;
(4) constructing a pattern generation countermeasure network:
cascading the generator network and the discriminator network to form a pattern to generate a countermeasure network;
(5) training the discriminator network:
fixing the current weight parameters of the generator network, inputting randomly generated feature vectors into the generator network, outputting random infrared images, respectively inputting the generated random infrared images and infrared images in a training set into a discriminator network, respectively outputting corresponding evaluation scores by the discriminator network after evaluating the sequentially input infrared images, and calculating the loss value of the discriminator network by using the evaluation scores of the discriminator network and the loss function of the discriminator network;
calculating the gradient of each convolution kernel of each convolution layer of the discriminator network and the gradient of the full-connection layer by using the loss value and the gradient descent method of the discriminator network;
updating the weights of each convolution kernel of each convolution layer of the discriminator network and the weights of the fully connected layer by using an Adam optimizer with a learning rate of 0.005 according to the gradient of each convolution kernel of each convolution layer of the discriminator network and the gradient of the fully connected layer;
(6) training the generator network:
fixing the current weight parameters of the discriminator network, inputting randomly generated feature vectors into the generator network, outputting infrared images, inputting the generated random infrared images into the discriminator network, evaluating the input generated images by the discriminator network, outputting evaluation scores, and calculating a generator network loss value by using the evaluation scores of the discriminator and a loss function of the generator network;
calculating the gradient of each convolution kernel of each deconvolution layer of the generator network, the gradient of the fully connected layers, the gradient of the noise modulation layers and the gradient of the adaptive pattern modulation layers by using the loss value of the generator network and the gradient descent method;
iteratively updating the weight of each convolution kernel of each deconvolution layer of the generator network, the weights of the fully connected layers, the weights of the noise modulation layers and the weights of the adaptive pattern modulation layers by using an Adam optimizer with a learning rate of 0.005;
(7) training pattern generation confrontation network:
repeating the steps (5) and (6), alternately training the discriminator network and the generator network until the loss value of the generator network obtained in the current iteration is less than 20 and the mean loss value of the discriminator network is less than 10, so as to obtain the trained generator network weights; and storing the weights of each convolution kernel of each deconvolution layer, the weights of the fully connected layers, the weights of the noise modulation layers and the weights of the adaptive pattern modulation layers of the generator network in the trained pattern generation countermeasure network;
(8) expanding an infrared image sample:
and inputting the randomly generated feature vectors into a trained generator network for calculation, outputting infrared images containing ship samples, adding the infrared images into a training set, and completing the expansion of the infrared image ship samples.
2. The method for expanding ship samples in an infrared image based on a pattern generation countermeasure network as claimed in claim 1, wherein the loss function of the discriminator network in step (5) is as follows:
LD=E[D(G(z))]-E[D(x)]+γE[(D(x))^2]+λE[(‖∇D(x̂)‖2-1)^2]
wherein LD represents the loss function of the discriminator network, E[·] represents the expectation operation, D(·) represents the output of the discriminator network in the pattern generation countermeasure network, G(·) represents the output of the generator network in the pattern generation countermeasure network, z represents a randomly generated feature vector, x represents 16 real-shot infrared images in the training set, γ represents the square-term coefficient, λ represents the constraint-term coefficient, x̂ represents the constraint-term infrared images obtained by fusing, in random proportions, the 16 infrared images output by the generator with the corresponding 16 real-shot infrared images in the training set, ‖·‖2 represents the 2-norm operation, and ∇ represents the derivative (gradient) operation.
3. The method for expanding ship samples in an infrared image based on a pattern generation countermeasure network as claimed in claim 2, wherein the loss function of the generator network in step (6) is as follows:
LG=-E[D(G(z))]
wherein LG represents the loss function of the generator network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010650897.8A CN111814875B (en) | 2020-07-08 | 2020-07-08 | Ship sample expansion method in infrared image based on pattern generation countermeasure network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010650897.8A CN111814875B (en) | 2020-07-08 | 2020-07-08 | Ship sample expansion method in infrared image based on pattern generation countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111814875A true CN111814875A (en) | 2020-10-23 |
CN111814875B CN111814875B (en) | 2023-08-01 |
Family
ID=72842044
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010650897.8A Active CN111814875B (en) | 2020-07-08 | 2020-07-08 | Ship sample expansion method in infrared image based on pattern generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111814875B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112465736A (en) * | 2020-11-18 | 2021-03-09 | 武汉理工大学 | Infrared video image enhancement method for port ship monitoring |
CN112700408A (en) * | 2020-12-28 | 2021-04-23 | 中国银联股份有限公司 | Model training method, image quality evaluation method and device |
CN112784930A (en) * | 2021-03-17 | 2021-05-11 | 西安电子科技大学 | CACGAN-based HRRP identification database sample expansion method |
CN112819914A (en) * | 2021-02-05 | 2021-05-18 | 北京航空航天大学 | PET image processing method |
CN112835709A (en) * | 2020-12-17 | 2021-05-25 | 华南理工大学 | Method, system and medium for generating cloud load time sequence data based on generation countermeasure network |
CN112884003A (en) * | 2021-01-18 | 2021-06-01 | 中国船舶重工集团公司第七二四研究所 | Radar target sample expansion generation method based on sample expander |
CN112950505A (en) * | 2021-03-03 | 2021-06-11 | 西安工业大学 | Image processing method, system and medium based on generation countermeasure network |
CN113516656A (en) * | 2021-09-14 | 2021-10-19 | 浙江双元科技股份有限公司 | Defect image data processing simulation method based on ACGAN and Cameralink cameras |
CN113792764A (en) * | 2021-08-24 | 2021-12-14 | 北京遥感设备研究所 | Sample expansion method, system, storage medium and electronic equipment |
CN114022724A (en) * | 2021-10-08 | 2022-02-08 | 郑州大学 | Pipeline disease image data enhancement method for generating countermeasure network based on style migration |
CN114881884A (en) * | 2022-05-24 | 2022-08-09 | 河南科技大学 | Infrared target sample enhancement method based on generation countermeasure network |
CN115086674A (en) * | 2022-06-16 | 2022-09-20 | 西安电子科技大学 | Image steganography method based on generation of countermeasure network |
CN115393242A (en) * | 2022-09-30 | 2022-11-25 | 国网电力空间技术有限公司 | Method and device for enhancing foreign matter image data of power grid based on GAN |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018028255A1 (en) * | 2016-08-11 | 2018-02-15 | 深圳市未来媒体技术研究院 | Image saliency detection method based on adversarial network |
US20200097766A1 (en) * | 2018-09-26 | 2020-03-26 | Nec Laboratories America, Inc. | Multi-scale text filter conditioned generative adversarial networks |
CN111027454A (en) * | 2019-12-06 | 2020-04-17 | 西安电子科技大学 | SAR (synthetic Aperture Radar) ship target classification method based on deep dense connection and metric learning |
CN111260594A (en) * | 2019-12-22 | 2020-06-09 | 天津大学 | Unsupervised multi-modal image fusion method |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018028255A1 (en) * | 2016-08-11 | 2018-02-15 | 深圳市未来媒体技术研究院 | Image saliency detection method based on adversarial network |
US20200097766A1 (en) * | 2018-09-26 | 2020-03-26 | Nec Laboratories America, Inc. | Multi-scale text filter conditioned generative adversarial networks |
CN111027454A (en) * | 2019-12-06 | 2020-04-17 | 西安电子科技大学 | SAR (synthetic Aperture Radar) ship target classification method based on deep dense connection and metric learning |
CN111260594A (en) * | 2019-12-22 | 2020-06-09 | 天津大学 | Unsupervised multi-modal image fusion method |
Non-Patent Citations (2)
Title |
---|
苗壮; 张?; 李伟华: "Infrared target modeling method based on dual adversarial auto-encoder networks", Acta Optica Sinica, no. 11 *
黄硕; 胡勇; 巩彩兰; 郑付强: "Super-resolution reconstruction algorithm for infrared salient regions based on sparse coding", Journal of Infrared and Millimeter Waves, no. 03 *
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112465736B (en) * | 2020-11-18 | 2023-03-24 | 武汉理工大学 | Infrared video image enhancement method for port ship monitoring |
CN112465736A (en) * | 2020-11-18 | 2021-03-09 | 武汉理工大学 | Infrared video image enhancement method for port ship monitoring |
CN112835709B (en) * | 2020-12-17 | 2023-09-22 | 华南理工大学 | Cloud load time sequence data generation method, system and medium based on generation countermeasure network |
CN112835709A (en) * | 2020-12-17 | 2021-05-25 | 华南理工大学 | Method, system and medium for generating cloud load time sequence data based on generation countermeasure network |
CN112700408A (en) * | 2020-12-28 | 2021-04-23 | 中国银联股份有限公司 | Model training method, image quality evaluation method and device |
CN112700408B (en) * | 2020-12-28 | 2023-09-08 | 中国银联股份有限公司 | Model training method, image quality evaluation method and device |
CN112884003A (en) * | 2021-01-18 | 2021-06-01 | 中国船舶重工集团公司第七二四研究所 | Radar target sample expansion generation method based on sample expander |
CN112819914A (en) * | 2021-02-05 | 2021-05-18 | 北京航空航天大学 | PET image processing method |
CN112950505A (en) * | 2021-03-03 | 2021-06-11 | 西安工业大学 | Image processing method, system and medium based on generation countermeasure network |
CN112950505B (en) * | 2021-03-03 | 2024-01-23 | 西安工业大学 | Image processing method, system and medium based on generation countermeasure network |
CN112784930A (en) * | 2021-03-17 | 2021-05-11 | 西安电子科技大学 | CACGAN-based HRRP identification database sample expansion method |
CN112784930B (en) * | 2021-03-17 | 2022-03-04 | 西安电子科技大学 | CACGAN-based HRRP identification database sample expansion method |
CN113792764A (en) * | 2021-08-24 | 2021-12-14 | 北京遥感设备研究所 | Sample expansion method, system, storage medium and electronic equipment |
CN113516656A (en) * | 2021-09-14 | 2021-10-19 | 浙江双元科技股份有限公司 | Defect image data processing simulation method based on ACGAN and Cameralink cameras |
CN114022724A (en) * | 2021-10-08 | 2022-02-08 | 郑州大学 | Pipeline disease image data enhancement method for generating countermeasure network based on style migration |
CN114881884A (en) * | 2022-05-24 | 2022-08-09 | 河南科技大学 | Infrared target sample enhancement method based on generation countermeasure network |
CN114881884B (en) * | 2022-05-24 | 2024-03-29 | 河南科技大学 | Infrared target sample enhancement method based on generation countermeasure network |
CN115086674A (en) * | 2022-06-16 | 2022-09-20 | 西安电子科技大学 | Image steganography method based on generation of countermeasure network |
CN115086674B (en) * | 2022-06-16 | 2024-04-02 | 西安电子科技大学 | Image steganography method based on generation of countermeasure network |
CN115393242A (en) * | 2022-09-30 | 2022-11-25 | 国网电力空间技术有限公司 | Method and device for enhancing foreign matter image data of power grid based on GAN |
Also Published As
Publication number | Publication date |
---|---|
CN111814875B (en) | 2023-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111814875A (en) | Method for expanding ship samples in infrared image based on pattern generation countermeasure network | |
CN109934282B (en) | SAGAN sample expansion and auxiliary information-based SAR target classification method | |
CN110781830B (en) | SAR sequence image classification method based on space-time joint convolution | |
CN110348399B (en) | Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network | |
CN113178255B (en) | GAN-based medical diagnosis model attack resistance method | |
CN110458765B (en) | Image quality enhancement method based on perception preserving convolution network | |
CN103927531B (en) | It is a kind of based on local binary and the face identification method of particle group optimizing BP neural network | |
CN112052886A (en) | Human body action attitude intelligent estimation method and device based on convolutional neural network | |
CN105844627B (en) | A kind of sea-surface target image background suppressing method based on convolutional neural networks | |
Ablavatski et al. | Enriched deep recurrent visual attention model for multiple object recognition | |
CN109447936A (en) | A kind of infrared and visible light image fusion method | |
CN110570363A (en) | Image defogging method based on Cycle-GAN with pyramid pooling and multi-scale discriminator | |
CN112561796B (en) | Laser point cloud super-resolution reconstruction method based on self-attention generation countermeasure network | |
CN112465718B (en) | Two-stage image restoration method based on generation of countermeasure network | |
CN110826428A (en) | Ship detection method in high-speed SAR image | |
CN112489164B (en) | Image coloring method based on improved depth separable convolutional neural network | |
CN113420639A (en) | Method and device for establishing near-ground infrared target data set based on generation countermeasure network | |
CN114049434A (en) | 3D modeling method and system based on full convolution neural network | |
CN114359387A (en) | Bag cultivation mushroom detection method based on improved YOLOV4 algorithm | |
Zhang et al. | Ship HRRP target recognition based on CNN and ELM | |
CN117036875B (en) | Infrared weak and small moving target generation algorithm based on fusion attention GAN | |
CN113989612A (en) | Remote sensing image target detection method based on attention and generation countermeasure network | |
CN116258816A (en) | Remote sensing image simulation method based on nerve radiation field | |
CN113326924B (en) | Depth neural network-based key target photometric positioning method in sparse image | |
CN111681156B (en) | Deep compressed sensing image reconstruction method applied to wireless sensor network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||