CN117437502A - Fuzzy space-time image dataset amplification method, device and system and storage medium - Google Patents

Fuzzy space-time image dataset amplification method, device and system and storage medium

Info

Publication number
CN117437502A
Authority
CN
China
Prior art keywords
convolution
module
layer
improved
fuzzy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311274265.6A
Other languages
Chinese (zh)
Inventor
王剑平
胡淇铭
张果
金建辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CN202311274265.6A
Publication of CN117437502A
Legal status: Pending


Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00: Computing arrangements based on biological models
                    • G06N 3/02: Neural networks
                        • G06N 3/04: Architecture, e.g. interconnection topology
                            • G06N 3/045: Combinations of networks
                            • G06N 3/0464: Convolutional networks [CNN, ConvNet]
                            • G06N 3/0475: Generative networks
                        • G06N 3/08: Learning methods
                            • G06N 3/094: Adversarial learning
            • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
                • G06V 10/00: Arrangements for image or video recognition or understanding
                    • G06V 10/70: Arrangements using pattern recognition or machine learning
                        • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
                            • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
                        • G06V 10/82: Arrangements using neural networks

Abstract

The invention discloses a blurred spatiotemporal image dataset amplification method, device, system and storage medium, belonging to the technical field of hydrological flow measurement. The method comprises the following steps: generating real blurred spatiotemporal images from a pre-acquired river video and preprocessing them to obtain a training dataset; improving, respectively, the discriminator and the generator of a pre-acquired deep convolutional generative adversarial network (DCGAN) to obtain an improved DCGAN model; training the improved DCGAN model with the training dataset to obtain a trained improved DCGAN model; and generating false blurred spatiotemporal images having the same characteristics as the real blurred spatiotemporal images with the trained improved DCGAN model, thereby amplifying the blurred spatiotemporal image dataset. The method can generate false blurred spatiotemporal images with the same characteristics as the real ones and amplify the blurred spatiotemporal image dataset, so as to solve the current shortage of blurred spatiotemporal image datasets.

Description

Fuzzy space-time image dataset amplification method, device and system and storage medium
Technical Field
The invention relates to a blurred spatiotemporal image dataset amplification method, device, system and storage medium, and belongs to the technical field of hydrological flow measurement.
Background
River monitoring technology based on computer vision has been widely studied and put into practical use, promoting comprehensive river management and flood-control early warning; among its tasks, measurement of river flow velocity is one of the most important in hydrological information monitoring.
Spatiotemporal image velocimetry is a method for calculating the one-dimensional time-averaged flow velocity of the water surface by detecting the main texture angle of spatiotemporal images; the key to the method is obtaining an accurate main texture angle. However, traditional spatiotemporal image velocimetry is easily affected by image noise, leading to large detection errors.
The invention patent with publication number CN111062978A provides a texture recognition method for spatiotemporal image flow measurement based on frequency-domain filtering, which can effectively remove noise in spatiotemporal images; however, the filter parameters cannot be selected adaptively for spatiotemporal images under different conditions and must be adjusted manually.
To address the shortcomings of traditional methods, the invention patent with publication number CN113222976A provides a spatiotemporal image texture direction detection method and system based on a DCNN and transfer learning, obtaining the main texture angle of a spatiotemporal image through a DCNN-based regression model; however, a sufficiently large dataset is needed to train the model.
Blurred spatiotemporal images form the dataset required to train the network model. At present, this dataset is small in scale and its acquisition is time-consuming and labor-intensive, while traditional image data enhancement methods such as rotation, brightness adjustment and noise addition cannot effectively improve the diversity of blurred spatiotemporal images, so the network model can obtain only little additional information from them during training.
Disclosure of Invention
The object of the invention is to provide a blurred spatiotemporal image dataset amplification method, device, system and storage medium that can use a trained improved DCGAN model to generate false blurred spatiotemporal images with the same characteristics as the real blurred spatiotemporal images and amplify the blurred spatiotemporal image dataset, so as to solve the problem that existing blurred spatiotemporal image datasets are insufficient.
In order to achieve the above purpose, the present invention provides the following technical solutions:
In a first aspect, the present invention provides a blurred spatiotemporal image dataset amplification method, comprising:
generating real blurred spatiotemporal images from a pre-acquired river video and preprocessing them to obtain a training dataset;
improving, respectively, the discriminator and the generator of a pre-acquired deep convolutional generative adversarial network (DCGAN) to obtain an improved DCGAN model;
training the improved DCGAN model with the training dataset to obtain a trained improved DCGAN model;
and generating false blurred spatiotemporal images having the same characteristics as the real blurred spatiotemporal images with the trained improved DCGAN model, thereby amplifying the blurred spatiotemporal image dataset.
With reference to the first aspect, further, the improved generator comprises a reconstruction module, 2 BCC modules, a first upsampling module, a feature extraction module and a second upsampling module, which are connected in sequence. Randomly generated noise is input to the improved generator; the reconstruction module reshapes the noise into a feature map and inputs it to the BCC modules; the BCC modules perform feature extraction on the feature map, obtain a feature-enhanced feature map and input it to the first upsampling module; the first upsampling module upsamples the feature-enhanced feature map, obtains a feature map amplified by a factor of two and inputs it to the feature extraction module; the feature extraction module performs feature extraction on the twice-amplified feature map, obtains a feature map that aggregates different feature information at a uniform scale and inputs it to the second upsampling module; and the second upsampling module upsamples this feature map and outputs a false blurred spatiotemporal image.
With reference to the first aspect, further, the reconstruction module comprises a connected batch normalization layer and ReLU activation function layer; the BCC module comprises a connected bilinear interpolation upsampling layer, a 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer, a ReLU activation function layer and a CA attention module; the first upsampling module comprises a connected bilinear interpolation upsampling layer, a 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer and a ReLU activation function layer; the feature extraction module comprises a connected 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer, a PReLU activation function layer, a 1×1 convolution layer with stride 1 and padding 0, a 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer, a PReLU activation function layer and a CA attention module; the second upsampling module comprises a connected bilinear interpolation upsampling layer, a 3×3 convolution layer with stride 1 and padding 1, and a Tanh activation function layer.
With reference to the first aspect, further, the improved discriminator comprises a first convolution module, 3 second convolution modules, a CA attention module, a fully connected layer and a Sigmoid activation function layer, which are connected in sequence. The preprocessed real blurred spatiotemporal images and the false blurred spatiotemporal images output by the improved generator are input to the improved discriminator; the first convolution module and the second convolution modules downsample the input, obtain a reduced feature map and input it to the CA attention module; the CA attention module performs feature extraction in the height and width directions on the reduced feature map to obtain a feature map carrying attention weights in the height and width directions; and the fully connected layer and the Sigmoid activation function layer output the probability that a false blurred spatiotemporal image is discriminated as real.
With reference to the first aspect, further, the first convolution module comprises a connected 4×4 convolution layer with stride 2 and padding 1 and a LeakyReLU activation function layer; the second convolution module comprises a connected 4×4 convolution layer with stride 2 and padding 1, a batch normalization layer and a LeakyReLU activation function layer. The height and width of the feature map are halved each time it passes through a first convolution module or a second convolution module.
With reference to the first aspect, further, the processing of the CA attention module comprises:
encoding the position information of the input feature map by performing global average pooling on it separately along the height and width directions, obtaining feature maps in the height and width directions;
concatenating the height- and width-direction feature maps, passing them through a shared 1×1 convolution for dimensionality reduction, and applying a nonlinear transformation to obtain a dimensionality-reduced feature map;
applying 1×1 convolutions to the dimensionality-reduced feature map according to the height and width of the input feature map, restoring it to the height and width of the input feature map, and obtaining the attention weights of the input feature map in the height and width directions through a Sigmoid activation function;
and multiplying the input feature map by the attention weights in the height and width directions to obtain the feature map with attention weights in both directions.
With reference to the first aspect, further, training the improved DCGAN model with the training dataset to obtain the trained improved DCGAN model comprises:
inputting randomly generated noise to the improved generator, which outputs false blurred spatiotemporal images;
inputting the false blurred spatiotemporal images and the real blurred spatiotemporal images of the training dataset to the improved discriminator, which outputs the probability that a false blurred spatiotemporal image is discriminated as real;
and, according to a preset objective function, letting the improved generator and the improved discriminator optimize their respective parameters through an adversarial game until dynamic equilibrium is reached, obtaining the trained improved DCGAN model;
wherein the objective function is:

min_G max_D V(D, G) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(z)}[log(1 - D(G(z)))]

where x is a real blurred spatiotemporal image obeying the P_data(x) distribution; z is noise obeying the P_z(z) distribution, i.e. a random distribution; G(z) is the false blurred spatiotemporal image generated by the improved generator from the noise z; D(x) is the probability that the improved discriminator discriminates x as real; D(G(z)) is the probability that the improved discriminator discriminates G(z) as real; and E[·] denotes the mathematical expectation;
optimizing the respective parameters of the improved generator and the improved discriminator through the adversarial game according to the preset objective function comprises:
fixing the parameters of the generator and training the discriminator with the maximization objective of the objective function to obtain a trained discriminator;
and fixing the parameters of the trained discriminator and training the generator with the minimization objective of the objective function to obtain a trained generator.
With reference to the first aspect, the method further comprises transforming the generated false blurred spatiotemporal images having the same characteristics as the real blurred spatiotemporal images into the frequency domain by fast Fourier transform, thereby amplifying a frequency-domain image dataset.
In a second aspect, the present invention provides a blurred spatiotemporal image dataset amplification apparatus, comprising:
a preprocessing module, used for generating real blurred spatiotemporal images from a pre-acquired river video and preprocessing them to obtain a training dataset;
an improvement module, used for improving, respectively, the discriminator and the generator of a pre-acquired deep convolutional generative adversarial network (DCGAN) to obtain an improved DCGAN model;
a training module, used for training the improved DCGAN model with the training dataset to obtain a trained improved DCGAN model;
and an amplification module, used for generating false blurred spatiotemporal images having the same characteristics as the real blurred spatiotemporal images with the trained improved DCGAN model and amplifying the blurred spatiotemporal image dataset.
In a third aspect, the present invention provides a system comprising a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is operative according to the instructions to perform the steps of the method according to any one of the first aspects.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of any of the first aspects.
Compared with the prior art, the invention has the following beneficial effects:
according to the blurred spatiotemporal image dataset amplification method provided by the invention, the discriminator and the generator of the deep convolutional generative adversarial network (DCGAN) are improved, and the resulting improved DCGAN model is trained so that it can stably generate false blurred spatiotemporal images with the same characteristics as the real ones, amplifying the blurred spatiotemporal image dataset. The improved generator replaces the deconvolution of the original DCGAN model with a combination of bilinear interpolation upsampling and convolution, which eliminates the "checkerboard" texture in images generated by the original DCGAN model and improves the detail and quality of the generated images, thereby addressing the current shortage of blurred spatiotemporal image datasets.
Drawings
FIG. 1 is a flow chart of the blurred spatiotemporal image dataset amplification method provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of an improved generator structure provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the improved discriminator provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a CA attention module provided by an embodiment of the invention;
FIG. 5 is a false blurred spatiotemporal image provided by an embodiment of the present invention;
FIG. 6 is a true blurred spatiotemporal image provided by an embodiment of the present invention;
fig. 7 shows a frequency domain image provided by an embodiment of the present invention, where (a) is a frequency domain image of a true blurred spatiotemporal image and (b) is a frequency domain image of a false blurred spatiotemporal image.
Detailed Description
The technical scheme of the present application will be described in further detail with reference to the specific embodiments.
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application. The embodiments of the present application and the technical features in the embodiments may be combined with each other without conflict.
Example 1:
Fig. 1 is a flowchart of the blurred spatiotemporal image dataset amplification method provided in this embodiment. It shows only the logical order of the method; where there is no conflict, the steps shown or described may be performed in an order different from that in fig. 1.
The blurred spatiotemporal image dataset amplification method provided in this embodiment may be applied to a terminal and may be performed by a blurred spatiotemporal image dataset amplification apparatus, which may be implemented in software and/or hardware and integrated in the terminal, for example, any tablet or computer device with a communication function.
Referring to fig. 1, the method of this embodiment specifically includes the following steps:
step one: generating a real fuzzy space-time image by utilizing the pre-acquired river video and preprocessing to acquire a training data set;
When generating real blurred spatiotemporal images from the pre-acquired river video, a velocimetry line must first be set. The width of the generated real blurred spatiotemporal image equals the length of the velocimetry line, and its height equals the total number of frames of the river video; hence, differences in velocimetry-line length lead to differences in image width, and differences in video duration lead to differences in image height.
In this embodiment, a monitoring camera arranged on the shore captures the river and collects a river video with a duration of 20 s, a frame rate of 25 frames per second, a frame width of 1920 and a frame height of 1080. A velocimetry line is set in the collected river video, and real blurred spatiotemporal images are generated by a preset program. The real blurred spatiotemporal images are preprocessed and uniformly cropped to 224×224 px to serve as the training dataset.
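The spatiotemporal image construction described above can be sketched in a few lines of NumPy; the function and variable names are illustrative, not from the patent. Each frame contributes one row sampled along the velocimetry line, so a 20 s, 25 fps video yields a 500-row image whose width equals the line's length:

```python
import numpy as np

def generate_sti(frames, line_pts):
    """Stack pixel values along a fixed velocimetry line, one row per frame,
    into a space-time image (height = frame count, width = line length).

    frames   : array of shape (T, H, W), grayscale video frames
    line_pts : integer array of shape (L, 2) of (row, col) coordinates
               of the velocimetry line's pixels
    """
    rows, cols = line_pts[:, 0], line_pts[:, 1]
    # fancy-index every frame at the line's pixels -> shape (T, L)
    return frames[:, rows, cols]

# toy dimensions matching the embodiment: 20 s at 25 fps -> 500 frames
T, H, W = 20 * 25, 1080 // 4, 1920 // 4   # reduced H, W to keep the toy small
rng = np.random.default_rng(0)
frames = rng.random((T, H, W))
# a horizontal velocimetry line of 100 px at mid-height (illustrative)
line = np.stack([np.full(100, H // 2), np.arange(100)], axis=1)
sti = generate_sti(frames, line)
print(sti.shape)  # (500, 100): height = frame count, width = line length
```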
Step two: improving, respectively, the discriminator and the generator of the pre-acquired deep convolutional generative adversarial network (DCGAN) to obtain an improved DCGAN model;
In this embodiment, as shown in fig. 2, the improved generator comprises a reconstruction module, 2 BCC modules, a first upsampling module, a feature extraction module and a second upsampling module, which are connected in sequence. Randomly generated noise is input to the improved generator and reshaped by the reconstruction module into a feature map, which is input to the BCC modules; the BCC modules perform feature extraction, obtain a feature-enhanced feature map and input it to the first upsampling module; the first upsampling module upsamples it, obtains a feature map amplified by a factor of two and inputs it to the feature extraction module; the feature extraction module performs feature extraction on the twice-amplified feature map, obtains a feature map that aggregates different feature information at a uniform scale and inputs it to the second upsampling module; and the second upsampling module upsamples this feature map and outputs a false blurred spatiotemporal image.
The reconstruction module comprises a connected batch normalization layer and ReLU activation function layer; the BCC module comprises a connected bilinear interpolation upsampling layer, a 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer, a ReLU activation function layer and a CA attention module; the first upsampling module comprises a connected bilinear interpolation upsampling layer, a 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer and a ReLU activation function layer; the feature extraction module comprises a connected 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer, a PReLU activation function layer, a 1×1 convolution layer with stride 1 and padding 0, a 3×3 convolution layer with stride 1 and padding 1, a batch normalization layer, a PReLU activation function layer and a CA attention module; the second upsampling module comprises a connected bilinear interpolation upsampling layer, a 3×3 convolution layer with stride 1 and padding 1, and a Tanh activation function layer.
The bilinear interpolation upsampling layer adopts the bilinear interpolation algorithm, with align_corners set to False.
In this embodiment, the randomly generated 100-dimensional noise vector is input to the improved generator, reshaped by the reconstruction module into a feature map of size 14×14×1024, feature-extracted by the BCC modules, upsampled by the first upsampling module, feature-extracted by the feature extraction module, and upsampled by the second upsampling module, which outputs a false blurred spatiotemporal image of size 224×224×3. The BCC modules and the feature extraction module strengthen the network's extraction and learning of image detail information.
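The spatial sizes of this generator pipeline can be verified with a short sketch, under the assumption (consistent with the module descriptions above) that the four bilinear upsampling stages (2 BCC modules, the first upsampling module, the second upsampling module) each double the spatial size, while the stride-1, padding-1 3×3 convolutions between them preserve it. Function names are illustrative:

```python
def bilinear_upsample_size(h, w, scale=2):
    """Output spatial size of a bilinear x`scale` upsampling layer."""
    return h * scale, w * scale

# A 3x3 convolution with stride 1 and padding 1 preserves spatial size:
# (n + 2*1 - 3) // 1 + 1 == n, so only the upsampling stages matter.
size = (14, 14)                       # reconstruction module output
for _ in range(4):                    # 2 BCC + first + second upsampling
    size = bilinear_upsample_size(*size)
print(size)  # (224, 224), matching the generated image size
```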
In this embodiment, as shown in fig. 3, the improved discriminator comprises a first convolution module, 3 second convolution modules, a CA attention module, a fully connected layer and a Sigmoid activation function layer, which are connected in sequence. The preprocessed real blurred spatiotemporal images and the false blurred spatiotemporal images output by the improved generator are input to the improved discriminator; the first convolution module and the second convolution modules downsample the input, obtain a reduced feature map and input it to the CA attention module; the CA attention module performs feature extraction in the height and width directions on the reduced feature map to obtain a feature map carrying attention weights in the height and width directions; and the probability that a false blurred spatiotemporal image is discriminated as real is output through the fully connected layer and the Sigmoid activation function layer.
The first convolution module comprises a connected 4×4 convolution layer with stride 2 and padding 1 and a LeakyReLU activation function layer; the second convolution module comprises a connected 4×4 convolution layer with stride 2 and padding 1, a batch normalization layer and a LeakyReLU activation function layer. The height and width of the feature map are halved each time it passes through a first convolution module or a second convolution module.
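The halving follows from the standard convolution output-size formula; a quick check of the discriminator's four downsampling stages (helper name is illustrative):

```python
def conv_out(n, k=4, s=2, p=1):
    """Convolution output size: floor((n + 2p - k) / s) + 1.
    Defaults match the discriminator's 4x4, stride-2, padding-1 layers."""
    return (n + 2 * p - k) // s + 1

sizes = [224]                         # input blurred spatiotemporal image
for _ in range(4):                    # 1 first + 3 second convolution modules
    sizes.append(conv_out(sizes[-1]))
print(sizes)  # [224, 112, 56, 28, 14]: halved at every module
```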
In this embodiment, as shown in fig. 4, the processing of the CA attention module comprises:
(1) encoding the position information of the input feature map by performing global average pooling on it separately along the height and width directions, obtaining feature maps in the height and width directions;
global average pooling of the input feature graphs from the height direction is performed as:
wherein,is the average value of the characteristic information in the width direction of the height h in the channel c, x c (h, i) is characteristic information of the i-th wide position of the height h in the channel c, and W is the total width.
Global average pooling of the input feature map along the width direction is performed as:

z_c^w(w) = (1/H) * Σ_{j=0}^{H-1} x_c(j, w)

where z_c^w(w) is the average of the feature information across the height direction at width w in channel c, x_c(j, w) is the feature information at the j-th height position of width w in channel c, and H is the total height.
(2) concatenating the height- and width-direction feature maps, passing them through a shared 1×1 convolution for dimensionality reduction, and applying a nonlinear transformation to obtain a dimensionality-reduced feature map:

f = δ(F_1([z^h, z^w]))

where f is the dimensionality-reduced feature map, [z^h, z^w] denotes concatenating the height- and width-direction feature maps before the 1×1 convolution F_1, and δ(·) denotes the nonlinear transformation combining batch normalization and the ReLU activation function.
(3) applying 1×1 convolutions to the dimensionality-reduced feature map according to the height and width of the input feature map, restoring it to the height and width of the input feature map, and obtaining the attention weights of the input feature map in the height and width directions through a Sigmoid activation function:

g^h = σ(F_h(f^h)),  g^w = σ(F_w(f^w))

where g^h and g^w are the attention weights of the input feature map in the height and width directions, f^h and f^w are the feature maps in the height and width directions, F_h(f^h) and F_w(f^w) denote applying 1×1 convolutions to them, and σ(·) is the Sigmoid activation function.
(4) multiplying the input feature map by the attention weights in the height and width directions to obtain the feature map with attention weights in both directions.
In this embodiment, the number of channels of the CA attention module's input feature map is the same as that of its output feature map. The CA attention module acquires more feature information of the feature map in the height and width directions, strengthening the discriminator's feature extraction.
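Steps (1)-(4) can be sketched for a single sample in NumPy. This is a simplified illustration: identity transforms stand in for the module's shared 1×1 convolution F_1, the restoring convolutions F_h and F_w, and the batch normalization, so channel reduction is omitted; only the pooling, Sigmoid weighting and multiply-weighting of the real module are shown:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x):
    """Simplified CA weighting for one feature map of shape (C, H, W)."""
    C, H, W = x.shape
    z_h = x.mean(axis=2)             # (C, H): global average pool over width
    z_w = x.mean(axis=1)             # (C, W): global average pool over height
    g_h = sigmoid(z_h)[:, :, None]   # (C, H, 1): height-direction weights
    g_w = sigmoid(z_w)[:, None, :]   # (C, 1, W): width-direction weights
    return x * g_h * g_w             # broadcast multiply-weighting

x = np.zeros((4, 8, 8))
y = coordinate_attention(x)
print(y.shape)  # (4, 8, 8): channel count unchanged, as in the embodiment
```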
Step three: training the improved DCGAN model with the training dataset to obtain a trained improved DCGAN model;
Training the improved DCGAN model with the training dataset to obtain the trained improved DCGAN model comprises the following steps:
Step 1: inputting randomly generated noise to the improved generator, which outputs false blurred spatiotemporal images;
Step 2: inputting the false blurred spatiotemporal images and the real blurred spatiotemporal images of the training dataset to the improved discriminator, which outputs the probability that a false blurred spatiotemporal image is discriminated as real;
Step 3: according to a preset objective function, letting the improved generator and the improved discriminator optimize their respective parameters through an adversarial game until dynamic equilibrium is reached, obtaining the trained improved DCGAN model.
In this embodiment, the objective function is:

min_G max_D V(D, G) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(z)}[log(1 − D(G(z)))]

wherein x is a real blurred spatiotemporal image obeying the P_data(x) distribution; z is noise obeying the P_z(z) distribution, i.e., a random distribution; G(z) is the false blurred spatiotemporal image generated by the improved generator from the noise z; D(x) is the probability that the improved discriminator judges x as true; D(G(z)) is the probability that the improved discriminator judges G(z) as true; and E[·] is the mathematical expectation.
In this embodiment, optimizing the respective parameters of the improved generator and the improved discriminator through their mutual game according to the preset objective function comprises the following steps:
step i: fixing the parameters of the generator, training the discriminator with the maximization objective in the objective function, and acquiring a trained discriminator;
the maximization objective in the objective function is:

max_D V(D, G) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(z)}[log(1 − D(G(z)))]
step ii: fixing the parameters of the trained discriminator, training the generator with the minimization objective in the objective function, and acquiring a trained generator.
The minimization objective in the objective function is:

min_G V(D, G) = E_{z~P_z(z)}[log(1 − D(G(z)))]
In this embodiment, the number of training iterations is set to 500, the optimizer is Adam with the default learning rate of 0.0002, the batch size is set to 64, and the BCELoss loss function is used. After training, the trained generator generates a false blurred spatiotemporal image as shown in fig. 5; a real blurred spatiotemporal image is shown in fig. 6. It can be seen that the false blurred spatiotemporal image has the same characteristics as the real one; in particular, the generated false blurred spatiotemporal image contains the main texture direction.
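One training iteration of the alternating game in steps i-ii can be sketched as follows. This is a minimal illustration assuming PyTorch and the hyperparameters stated in this embodiment (Adam, learning rate 0.0002, BCELoss); `generator`, `discriminator`, and `noise_dim` stand in for the improved models described earlier.

```python
# Sketch of one alternating training step for the DCGAN improved model.
# Assumes PyTorch; models and noise_dim are placeholders for the improved
# generator/discriminator of this embodiment.
import torch
import torch.nn as nn

def train_step(generator, discriminator, real_batch, noise_dim, opt_g, opt_d):
    criterion = nn.BCELoss()
    b = real_batch.size(0)
    real_labels = torch.ones(b, 1)
    fake_labels = torch.zeros(b, 1)

    # Step i: fix the generator, train the discriminator on the
    # maximization objective (push D(x) -> 1 and D(G(z)) -> 0).
    opt_d.zero_grad()
    d_real = discriminator(real_batch)
    fake = generator(torch.randn(b, noise_dim))
    d_fake = discriminator(fake.detach())  # detach: generator stays fixed
    loss_d = criterion(d_real, real_labels) + criterion(d_fake, fake_labels)
    loss_d.backward()
    opt_d.step()

    # Step ii: fix the discriminator, train the generator on the
    # minimization objective (push D(G(z)) -> 1 from G's perspective).
    opt_g.zero_grad()
    loss_g = criterion(discriminator(fake), real_labels)
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```

In a full run, this step would be repeated over the training dataset for the 500 iterations stated above, with batches of 64 real blurred spatiotemporal images.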
Step four: and generating false fuzzy space-time images with the same characteristics as the real fuzzy space-time images by using the trained DCGAN improved model, and amplifying a fuzzy space-time image data set.
In this embodiment, the trained DCGAN improved model is used to generate false blurred spatiotemporal images with the same characteristics as real blurred spatiotemporal images, thereby amplifying the blurred spatiotemporal image dataset; the generated false blurred spatiotemporal images are further converted to the frequency domain by fast Fourier transform to amplify a frequency-domain image dataset.
The spectrum main direction angle of the frequency-domain image obtained by fast Fourier transform of a false blurred spatiotemporal image is perpendicular to the main texture angle of that image, i.e., spatiotemporal image velocimetry can be realized by detecting the spectrum main direction angle. Both the real and the false blurred spatiotemporal images are transformed to the frequency domain by fast Fourier transform; the resulting frequency-domain images are shown in fig. 7, wherein (a) is the frequency-domain image of the real blurred spatiotemporal image and (b) is that of the false blurred spatiotemporal image. As can be seen from fig. 7, the centers of both frequency-domain images exhibit a distinct main-direction texture of the frequency spectrum.
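The frequency-domain conversion and spectrum main-direction detection can be sketched with NumPy. This is an illustrative example, not the patent's detection method: the synthetic striped image below only demonstrates that a texture yields a spectral main direction perpendicular to its stripes, as stated above.

```python
# Sketch: 2-D FFT of a space-time image and estimation of the spectrum's
# main direction angle. Assumes NumPy; the synthetic texture is illustrative.
import numpy as np

def spectrum_main_angle(image):
    # 2-D FFT with the zero-frequency component shifted to the center
    mag = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    h, w = mag.shape
    cy, cx = h // 2, w // 2
    mag[cy, cx] = 0.0  # suppress the DC component
    # Angle of the strongest off-center frequency component, in [0, 180)
    y, x = np.unravel_index(np.argmax(mag), mag.shape)
    return np.degrees(np.arctan2(y - cy, x - cx)) % 180.0

# Synthetic image whose stripes (iso-intensity lines) run at about 120 deg;
# the wave vector, and hence the spectral main direction, is near 30 deg,
# perpendicular to the stripes, consistent with the text above.
yy, xx = np.mgrid[0:128, 0:128]
theta = np.radians(30.0)
texture = np.sin(2 * np.pi * (xx * np.cos(theta) + yy * np.sin(theta)) / 8.0)
angle = spectrum_main_angle(texture)
```

Detecting this angle on the frequency-domain image is what makes the amplified frequency-domain dataset usable for spatiotemporal image velocimetry.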
The two datasets amplified in this embodiment can be applied to current spatiotemporal image velocimetry methods that incorporate deep learning.
In the blurred spatiotemporal image dataset amplification method described above, the discriminator and generator of the deep convolutional generative adversarial network (DCGAN) are improved, and the resulting DCGAN improved model is trained so that it can stably generate false blurred spatiotemporal images with the same characteristics as real ones, thereby amplifying the blurred spatiotemporal image dataset. The improved generator replaces the deconvolution of the original DCGAN model with a combination of bilinear interpolation upsampling and convolution, which eliminates the 'checkerboard' texture in images generated by the original DCGAN model, improves the detail and quality of the generated images, and alleviates the current shortage of blurred spatiotemporal image datasets.
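The design choice just described can be shown side by side. This is a minimal sketch assuming PyTorch; the channel counts and spatial size are illustrative, not the generator's actual dimensions.

```python
# Sketch: stride-2 deconvolution (original DCGAN) vs. the improved
# replacement of bilinear upsampling followed by an ordinary convolution.
# Assumes PyTorch; 64 -> 32 channels and 16x16 input are illustrative.
import torch
import torch.nn as nn

# Original DCGAN-style upsampling: stride-2 transposed convolution,
# whose uneven kernel overlap can produce "checkerboard" texture.
deconv = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)

# Improved replacement: bilinear interpolation upsampling, then a
# 3x3 convolution with stride 1 and padding 1 (no overlap artifacts).
up_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1),
)

x = torch.randn(1, 64, 16, 16)
# Both variants double the spatial size: (1, 32, 32, 32)
out_a, out_b = deconv(x), up_conv(x)
```

Either variant doubles the feature map's height and width; the improved one decouples the upsampling from the learned filtering, which is why the checkerboard pattern disappears.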
Example 2:
the present embodiment provides a fuzzy spatiotemporal image dataset amplification apparatus, comprising:
a preprocessing module, used for generating real blurred spatiotemporal images from pre-acquired river video, preprocessing them, and acquiring a training dataset;
an improvement module, used for respectively improving the discriminator and generator of a pre-acquired deep convolutional generative adversarial network (DCGAN) to acquire a DCGAN improved model;
a training module, used for training the DCGAN improved model with the training dataset to obtain a trained DCGAN improved model;
an amplification module, used for generating false blurred spatiotemporal images with the same characteristics as real blurred spatiotemporal images by using the trained DCGAN improved model, and amplifying the blurred spatiotemporal image dataset.
The fuzzy space-time image data set amplification device provided by the embodiment of the invention can execute the fuzzy space-time image data set amplification method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example 3:
the embodiment provides a system, which comprises a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is operative to perform the steps of the method of embodiment one in accordance with the instructions.
Example 4:
the present embodiment provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of the first embodiment.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (10)

1. A method for augmenting a blurred spatiotemporal image dataset, comprising:
generating a real fuzzy space-time image by utilizing the pre-acquired river video and preprocessing to acquire a training data set;
respectively improving a discriminator and a generator of a pre-acquired deep convolutional generative adversarial network (DCGAN) to acquire a DCGAN improved model;
training the DCGAN improved model by utilizing the training data set to obtain a trained DCGAN improved model;
and generating false fuzzy space-time images with the same characteristics as the real fuzzy space-time images by using the trained DCGAN improved model, and amplifying a fuzzy space-time image data set.
2. The fuzzy spatiotemporal image dataset amplification method of claim 1, wherein the improved generator comprises a reconstruction module, 2 BCC modules, a first upsampling module, a feature extraction module and a second upsampling module which are connected; randomly generated noise is input to the improved generator; the noise is reconstructed into a feature map by the reconstruction module and input to the BCC modules; the feature map is subjected to feature extraction by the BCC modules to obtain a feature-enhanced feature map, which is input to the first upsampling module; the feature-enhanced feature map is upsampled by the first upsampling module to obtain a feature map twice as large, which is input to the feature extraction module; the doubled feature map is subjected to feature extraction by the feature extraction module to obtain a feature map gathering different feature information at a unified scale, which is input to the second upsampling module; and a false blurred spatiotemporal image is obtained and output by the second upsampling module.
3. The fuzzy spatiotemporal image dataset augmentation method of claim 2, wherein the reconstruction module comprises a batch normalization layer and a ReLU activation function layer which are connected; the BCC module comprises a bilinear interpolation upsampling layer, a convolution layer with a 3×3 convolution kernel, a stride of 1 and a padding of 1, a batch normalization layer, a ReLU activation function layer and a CA attention module which are connected; the first upsampling module comprises a bilinear interpolation upsampling layer, a convolution layer with a 3×3 convolution kernel, a stride of 1 and a padding of 1, a batch normalization layer and a ReLU activation function layer which are connected; the feature extraction module comprises, connected in sequence, a convolution layer with a 3×3 convolution kernel, a stride of 1 and a padding of 1, a batch normalization layer, a PReLU activation function layer, a convolution layer with a 1×1 convolution kernel, a stride of 1 and a padding of 0, a convolution layer with a 3×3 convolution kernel, a stride of 1 and a padding of 1, a batch normalization layer, a PReLU activation function layer and a CA attention module; the second upsampling module comprises a bilinear interpolation upsampling layer, a convolution layer with a 3×3 convolution kernel, a stride of 1 and a padding of 1, and a Tanh activation function layer which are connected.
4. The fuzzy spatiotemporal image dataset augmentation method of claim 1, wherein the improved arbiter comprises a first convolution module, 3 second convolution modules, a CA attention module, a full connection layer and a Sigmoid activation function layer which are connected, the preprocessed real fuzzy spatiotemporal image and the false fuzzy spatiotemporal image output by the improved generator are input to the improved arbiter, downsampling is carried out by the first convolution module and the second convolution module, a reduced feature map is obtained and input to the CA attention module, feature extraction in the height and width directions is carried out on the reduced feature map by the CA attention module, a feature map with attention weight in the height and width directions is obtained, and the probability of judging the false fuzzy spatiotemporal image as true is output through the full connection layer and the Sigmoid activation function layer.
5. The method of claim 4, wherein the first convolution module comprises a convolution layer with a 4×4 convolution kernel, a stride of 2 and a padding of 1, and a LeakyReLU activation function layer which are connected; the second convolution module comprises a convolution layer with a 4×4 convolution kernel, a stride of 2 and a padding of 1, a batch normalization layer and a LeakyReLU activation function layer which are connected; the feature map is reduced by one half each time it passes through a first convolution module or a second convolution module.
6. The method of claim 3 or 4, wherein the processing performed by the CA attention module comprises:
encoding the position information of the input feature images, and carrying out global average pooling on the input feature images from the height direction and the width direction respectively to obtain the feature images in the height direction and the width direction;
the feature images in the height and width directions are spliced and then input into a convolution with a shared convolution kernel of 1 multiplied by 1 to carry out dimension reduction, nonlinear transformation is carried out, and a dimension reduction feature image is obtained;
carrying out convolution with convolution kernel of 1×1 on the dimension-reducing feature map according to the height and width of the input feature map, recovering the dimension-reducing feature map to the height and width of the input feature map, and acquiring the attention weight of the input feature map in the height and width directions through a Sigmoid activation function;
and carrying out multiplication weighted calculation on the input feature map according to the attention weights of the input feature map in the height and width directions to obtain the feature map with the attention weights in the height and width directions.
7. The method of claim 1, wherein training the DCGAN improved model using the training dataset to obtain the trained DCGAN improved model comprises:
inputting the randomly generated noise to an improved generator, and outputting false fuzzy space-time images by the improved generator;
inputting the false fuzzy space-time image and the real fuzzy space-time image in the training data set to an improved discriminator, and outputting the probability of discriminating the false fuzzy space-time image as true by the improved discriminator;
according to a preset objective function, the improved generator and the improved discriminator optimize their respective parameters by playing against each other until dynamic equilibrium is reached, yielding the trained DCGAN improved model;
wherein, the objective function is:

min_G max_D V(D, G) = E_{x~P_data(x)}[log D(x)] + E_{z~P_z(z)}[log(1 − D(G(z)))]

wherein x is a real blurred spatiotemporal image obeying the P_data(x) distribution; z is noise obeying the P_z(z) distribution, i.e., a random distribution; G(z) is the false blurred spatiotemporal image generated by the improved generator from the noise z; D(x) is the probability that the improved discriminator judges x as true; D(G(z)) is the probability that the improved discriminator judges G(z) as true; and E[·] is a mathematical expectation;
wherein, according to the preset objective function, optimizing the respective parameters of the improved generator and the improved discriminator through their mutual game comprises:
fixing the parameters of the generator, training the discriminator with the maximization objective in the objective function, and acquiring a trained discriminator;
and fixing the parameters of the trained discriminator, training the generator with the minimization objective in the objective function, and acquiring a trained generator.
8. The method of claim 1, further comprising amplifying a frequency domain image dataset by transforming a generated false blurred spatiotemporal image having the same characteristics as a true blurred spatiotemporal image into a frequency domain by a fast fourier transform.
9. A blurred spatiotemporal image dataset augmentation apparatus comprising:
a preprocessing module, used for generating real blurred spatiotemporal images from pre-acquired river video, preprocessing them, and acquiring a training dataset;
an improvement module, used for respectively improving the discriminator and generator of a pre-acquired deep convolutional generative adversarial network (DCGAN) to acquire a DCGAN improved model;
a training module, used for training the DCGAN improved model with the training dataset to obtain a trained DCGAN improved model;
an amplification module, used for generating false blurred spatiotemporal images with the same characteristics as real blurred spatiotemporal images by using the trained DCGAN improved model, and amplifying the blurred spatiotemporal image dataset.
10. A system comprising a processor and a storage medium;
the storage medium is used for storing instructions;
the processor being operative according to the instructions to perform the steps of the method according to any one of claims 1 to 8.
CN202311274265.6A 2023-09-28 2023-09-28 Fuzzy space-time image dataset amplification method, device and system and storage medium Pending CN117437502A (en)

Publications (1)

Publication Number Publication Date
CN117437502A true CN117437502A (en) 2024-01-23



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination