CN116452469B - Image defogging processing method and device based on deep learning - Google Patents

Image defogging processing method and device based on deep learning

Info

Publication number: CN116452469B (application CN202310729654.7A)
Authority: CN (China)
Prior art keywords: image, model, defogging, training, feature
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN116452469A (en)
Inventor: 裴朝科
Current Assignee: Shenzhen Ouye Semiconductor Co., Ltd.
Original Assignee: Shenzhen Ouye Semiconductor Co., Ltd.
Application filed by Shenzhen Ouye Semiconductor Co., Ltd.
Priority to CN202310729654.7A
Publication of CN116452469A
Application granted
Publication of CN116452469B

Classifications

    • G06T5/73
    • G06N3/09 Supervised learning (under G06N3/02 Neural networks; G06N3/08 Learning methods)
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention provides an image defogging processing method and device based on deep learning, wherein the method comprises: acquiring an original image to be processed, extracting each target feature map from the original image, and fusing the target feature maps to obtain an input feature map; and inputting the input feature map into a pre-trained image defogging model to obtain a defogged image. The image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set. The image defogging model obtained by this two-stage training method makes full use of the strong supervision information of the synthetic data and the real data distribution of the real data, reasonably resolves the domain-distribution gap between synthetic and real data, and effectively improves the defogging effect.

Description

Image defogging processing method and device based on deep learning
Technical Field
The invention relates to the field of computer vision, in particular to an image defogging processing method and device based on deep learning.
Background
Image defogging is an image processing technique that improves image quality by eliminating the effects of atmospheric scattering, and it is widely used in the field of computer vision. A foggy image is typically affected by factors such as haze, insufficient illumination and noise, which degrade image quality, blur details and make recognition and analysis difficult. Image defogging technology can remove these influences and improve the quality and visibility of the image, better meeting human visual requirements.
Currently, among image defogging techniques, CycleGAN is a generative adversarial network for unsupervised image-to-image translation that has been applied to many computer vision tasks. By learning a mapping between two different domains, CycleGAN can perform image translation without matched data. The Cycle-Dehaze algorithm, based on the CycleGAN principle, converts a foggy image into a fog-free image to realize defogging; however, its model features are extracted in a single way, and a purely unsupervised training method cannot defog images effectively.
Accordingly, the prior art has drawbacks and needs to be improved and developed.
Disclosure of Invention
The invention aims to address the above technical problems of the prior art by providing an image defogging processing method and device based on deep learning, so as to solve the problem that the prior art cannot defog images effectively.
The technical scheme adopted for solving the technical problems is as follows:
an image defogging processing method based on deep learning, comprising the following steps:
acquiring an original image to be processed, extracting each target feature map from the original image, and fusing the target feature maps to obtain an input feature map;
inputting the input feature map into a pre-trained image defogging model to obtain a defogged image;
wherein the image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set.
In one implementation, the training step of the image defogging model includes:
constructing a synthetic data set and a real scene data set;
constructing an initial image defogging model;
performing model training on the initial image defogging model by using the synthetic data set to obtain an initial training model;
and training the initial training model by utilizing the real scene data set to obtain a trained image defogging model.
In one implementation, constructing the synthetic data set and the real scene data set includes:
acquiring an original haze-free image, a transmittance map and a preset global atmospheric light, wherein the transmittance map is obtained by randomly generating a depth map and then applying a preset exponential decay function;
acquiring a preset atmospheric degradation model formula, and substituting the original haze-free image, the transmittance map and the preset global atmospheric light into the preset atmospheric degradation model formula to obtain a synthetic hazy image matched with the original haze-free image;
constructing the synthetic data set from the original haze-free image and the synthetic hazy image;
and acquiring a real foggy image and a real fog-free image in a real scene, and constructing the real scene data set from the real foggy image and the real fog-free image.
In one implementation, constructing the initial image defogging model includes:
adopting a CycleGAN-like cyclic network model with dual generators and dual discriminators as the framework, wherein the dual generators comprise a defogging generator and a synthetic fog-map generator, both of which adopt a U-Net-like structure;
constructing an adaptive MixUp-weighted skip-connection module for the defogging generator and the synthetic fog-map generator;
constructing a feature fusion layer in the adaptive MixUp-weighted skip-connection module;
and constructing a multi-layer perceptual loss function and a supervised absolute-value loss function.
In one implementation, performing model training on the initial image defogging model by using the synthetic data set to obtain an initial training model includes:
acquiring an original haze-free image in the synthetic data set and the synthetic hazy image matched with the original haze-free image;
acquiring a preset adversarial loss function and a preset cycle-consistency loss function, and obtaining a one-stage training loss function from the multi-layer perceptual loss function, the supervised absolute-value loss function, the adversarial loss function and the cycle-consistency loss function;
inputting the synthetic hazy image into a pre-constructed fog-map feature enhancement module, computing a first luminance feature map, a first contrast prior feature map and a first dark-channel feature map of the synthetic hazy image, and fusing the three to obtain a first training input feature map;
inputting the first training input feature map into the dual generators, processing it with the adaptive MixUp-weighted skip-connection module in the defogging generator, recording a first feature map after each downsampling of the encoder in the defogging generator, and recording a second feature map after each upsampling of the decoder in the defogging generator;
inputting the first feature map and the second feature map into the feature fusion layer for feature fusion, taking the original haze-free image as supervision information, and performing model training on the initial image defogging model based on the one-stage training loss function to obtain first candidate training models corresponding to different test index values, wherein the test index is the peak signal-to-noise ratio;
and taking the first candidate training model with the highest test index value as the initial training model.
In one implementation, training the initial training model by using the real scene data set to obtain a trained image defogging model includes:
acquiring a real foggy image and a real fog-free image in the real scene data set;
obtaining a two-stage training loss function from the multi-layer perceptual loss function, the adversarial loss function and the cycle-consistency loss function;
inputting the real foggy image into the trained initial training model;
passing the real foggy image through the fog-map feature enhancement module, computing a second luminance feature map, a second contrast prior feature map and a second dark-channel feature map of the real foggy image, and fusing the three to obtain a second training input feature map;
inputting the second training input feature map into the dual generators, processing it with the adaptive MixUp-weighted skip-connection module in the defogging generator, recording a third feature map after each downsampling of the encoder in the defogging generator, and recording a fourth feature map after each upsampling of the decoder in the defogging generator;
inputting the third feature map and the fourth feature map into the feature fusion layer for feature fusion, taking the real fog-free image as supervision information, and performing model training on the initial training model based on the two-stage training loss function to obtain second candidate training models corresponding to different test index values, wherein the test index is the peak signal-to-noise ratio;
and taking the second candidate training model with the highest test index value as the trained image defogging model.
In one implementation, the multi-layer perceptual loss function is expressed as:
$$L_{per} = \sum_{i} \lambda_i \, \mathrm{MSE}\big(\phi_i(x),\ \phi_i(G_{fog}(G_{dehaze}(x)))\big)$$

wherein $x$ denotes a foggy image, $i$ denotes the index of the feature layer, $\lambda_i$ denotes the weight of the error of each feature layer, $G_{fog}$ denotes the synthetic fog-map generator, $G_{dehaze}$ denotes the defogging generator, and $\phi_i(x)$ denotes the $i$-th layer feature map of the currently input foggy image;

the supervised absolute-value loss function is expressed as:

$$L_{sup} = \big\|G_{dehaze}(I) - J\big\|_1 + \big\|G_{fog}(J) - I\big\|_1$$

wherein $J$ is the original haze-free image in the synthetic data set and $I$ is the matching hazy image in the synthetic data set;

the one-stage training loss function is expressed as:

$$L_{stage1} = \lambda_1 L_{adv} + \lambda_2 L_{cyc} + \lambda_3 L_{per} + \lambda_4 L_{sup}$$

the two-stage training loss function is expressed as:

$$L_{stage2} = \lambda_1 L_{adv} + \lambda_2 L_{cyc} + \lambda_3 L_{per}$$

wherein $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are constants, $L_{adv}$ denotes the adversarial loss function, $L_{cyc}$ denotes the cycle-consistency loss function, $L_{per}$ denotes the multi-layer perceptual loss function, and $L_{sup}$ denotes the supervised absolute-value loss function.
The invention also provides an image defogging processing device based on deep learning, which comprises:
the acquisition module is used for acquiring an original image to be processed, extracting each target feature map from the original image, and fusing the target feature maps to obtain an input feature map;
the input module is used for inputting the input feature map into a pre-trained image defogging model to obtain a defogged image;
wherein the image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set.
The invention also provides a terminal, comprising: a memory, a processor, and a deep-learning-based image defogging processing program stored in the memory and runnable on the processor, wherein the steps of the image defogging processing method based on deep learning described above are implemented when the program is executed by the processor.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed, implements the steps of the image defogging processing method based on deep learning described above.
The invention provides an image defogging processing method and device based on deep learning, wherein the method comprises: acquiring an original image to be processed, extracting each target feature map from the original image, and fusing the target feature maps to obtain an input feature map; and inputting the input feature map into a pre-trained image defogging model to obtain a defogged image. The image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set. The image defogging model obtained by this two-stage training method makes full use of the strong supervision information of the synthetic data and the real data distribution of the real data, reasonably resolves the domain-distribution gap between synthetic and real data, and effectively improves the defogging effect.
Drawings
Fig. 1 is a flowchart of a preferred embodiment of an image defogging processing method based on deep learning in the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of an image defogging processing device based on deep learning in the present invention.
Fig. 3 is a functional block diagram of a terminal in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the invention.
Existing image defogging technology mainly comprises defogging algorithms based on traditional image enhancement and image defogging algorithms based on deep learning. The defogging algorithms based on traditional image enhancement mainly include histogram equalization (HE), adaptive histogram equalization (AHE), contrast-limited adaptive histogram equalization (CLAHE), the Retinex algorithm, wavelet transform, homomorphic filtering, and the like. Such algorithms require manually setting parameters for different scenes, which limits their generalization ability; for fog of different types and different degrees of blurring, traditional defogging algorithms cannot learn the multi-level features of an image well, including low-level features (such as edges and textures) and high-level features (such as objects and scenes), which limits the accuracy and robustness of the algorithm.
Deep learning has therefore been introduced into image defogging, since it can handle complex nonlinear problems and learn features automatically, thereby improving the defogging effect. Image defogging algorithms based on deep learning can be divided into two categories: defogging algorithms based on the atmospheric degradation model, and GAN-based generative single-image defogging algorithms. Defogging algorithms based on the atmospheric degradation model rest on rather simple assumptions about the atmospheric model, haze distribution and illumination conditions; the complexity of a real scene cannot be expressed by a single formula, which limits the accuracy of such algorithms, and images processed by these methods suffer accuracy problems in complex real scenes. GAN-based generative single-image defogging algorithms can learn features automatically, handle complex nonlinear problems, and have wide application scenarios. However, image defogging algorithms based on deep learning generally need to be trained on paired data and depend heavily on that data; even when synthetic data of real scenes is applied to real fog maps, the defogging effect remains unsatisfactory.
CycleGAN is a generative adversarial network for unsupervised image-to-image translation and has been applied to many computer vision tasks. The Cycle-Dehaze algorithm, based on the CycleGAN principle, converts a foggy image into a fog-free image and realizes the defogging function.
To address these drawbacks, the invention provides an end-to-end single-image defogging algorithm based on CycleGAN, which improves the defogging effect by fully utilizing multiple kinds of image information and introducing a perceptual loss. The invention further provides a training method that does not require matched data, which expands the application range of the algorithm.
The invention provides an image defogging processing method and device based on deep learning, wherein the method comprises: acquiring an original image to be processed, extracting each target feature map from the original image, and fusing the target feature maps to obtain an input feature map; and inputting the input feature map into a pre-trained image defogging model to obtain a defogged image. The image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set. The image defogging model obtained by this two-stage training method makes full use of the strong supervision information of the synthetic data and the real data distribution of the real data, reasonably resolves the domain-distribution gap between synthetic and real data, and effectively improves the defogging effect.
Referring to fig. 1, fig. 1 is a flowchart of an image defogging processing method based on deep learning in the present invention. As shown in fig. 1, the image defogging processing method based on deep learning according to the embodiment of the invention includes the following steps:
step S100, an original image to be processed is obtained, each target feature map in the original image is extracted, and the target feature maps are fused to obtain an input feature map.
Specifically, the original image is a foggy image. Extracting each target feature map from the original image includes: inputting the original image into a pre-constructed fog-map feature enhancement module, extracting a luminance feature map, a contrast prior feature map and a dark-channel feature map of the original image, and fusing these target feature maps to obtain the input feature map. By extracting target feature maps of multiple dimensions, the method makes full use of various kinds of image information (such as the dark channel, luminance and contrast) and effectively improves the defogging effect.
As shown in fig. 1, the image defogging processing method based on deep learning according to the embodiment of the invention further includes the following steps:
step S200, the input feature map is input into a pre-trained image defogging model to obtain a defogged image;
The image defogging model is obtained by carrying out model training on a pre-built initial image defogging model by a pre-built synthetic data set to obtain an initial training model, and then carrying out model training on the initial training model by a pre-built real scene data set.
Specifically, the method first trains the pre-constructed initial image defogging model on the synthetic data set to obtain an initial training model, and then trains the initial training model on the real scene data set to obtain the image defogging model. Through this two-stage training method, the strong supervision information of the synthetic data and the real data distribution of the real data are fully utilized, improving the defogging effect.
In one implementation, the training step of the image defogging model includes:
constructing a synthetic data set and a real scene data set;
constructing an initial image defogging model;
performing model training on the initial image defogging model by using the synthetic data set to obtain an initial training model;
and training the initial training model by utilizing the real scene data set to obtain a trained image defogging model.
Specifically, the initial image defogging model is first trained on the constructed synthetic data set to obtain the initial training model; the synthetic data set helps the model better understand the image defogging task, improving its performance and robustness. The initial training model is then trained on the constructed real scene data set to obtain the trained image defogging model; real scenes provide more realistic and diverse defogging samples, so that the model can better adapt to the different conditions and variations of real scenes. In this way the model learns defogging features better during training than a generic generative network does, effectively improving the defogging effect. The invention exploits the strong supervision information of the paired synthetic data set, can be rapidly applied to real scenes, and has extremely low dependence on paired data.
In one implementation, constructing the synthetic data set and the real scene data set includes:
acquiring an original haze-free image, a transmittance map and a preset global atmospheric light, wherein the transmittance map is obtained by randomly generating a depth map and then applying a preset exponential decay function;
acquiring a preset atmospheric degradation model formula, and substituting the original haze-free image, the transmittance map and the preset global atmospheric light into the preset atmospheric degradation model formula to obtain a synthetic hazy image matched with the original haze-free image;
constructing the synthetic data set from the original haze-free image and the synthetic hazy image;
and acquiring a real foggy image and a real fog-free image in a real scene, and constructing the real scene data set from the real foggy image and the real fog-free image.
In particular, the synthetic data set is constructed from original haze-free images and synthetic hazy images, and is a paired data set. The real scene data set is constructed from real foggy images and real fog-free images acquired in real scenes, and is an unpaired data set: there is no correspondence between the real foggy images and the real fog-free images. The original haze-free image is processed into a synthetic hazy image using the atmospheric degradation model, which is expressed as:

$$I(x) = J(x)\,t(x) + A\,\big(1 - t(x)\big)$$

wherein $I(x)$ is the synthetic foggy image; $J(x)$ is the original haze-free image matched with it; $t(x)$ is the transmittance map, which describes the degree of atmospheric transmission and takes values in $(0, 1]$; and $A$ is the global atmospheric light, which represents the intensity of light in the atmosphere and is set to a preset value. The transmittance map is expressed as:

$$t(x) = e^{-\beta\, d(x)}$$

wherein $d(x)$ is the depth map and $\beta$ is an attenuation coefficient representing the fog concentration, set to a preset value. The depth map is generated randomly: a gray-scale map is randomly generated with each coordinate set to a random value, or the randomly generated gray-scale map is used directly as the depth map. The haze-free image $J(x)$ is multiplied pixel by pixel by the transmittance map $t(x)$, the global atmospheric light $A$ is multiplied pixel by pixel by $(1 - t(x))$, and the two results are added to obtain the synthetic foggy image matched with the original haze-free image. The invention constructs the paired synthetic data set and the unpaired real scene data set for the subsequent two-stage training, which improves the precision of the model and yields a better defogging effect.
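For illustration, the data-synthesis step can be sketched in a few lines of NumPy. This code is not part of the original disclosure: the values of the global atmospheric light A and the attenuation coefficient β, and the use of a uniform random gray map as the depth map, are assumptions, since the description only states that they are preset or randomly generated.

```python
import numpy as np

def synthesize_hazy(J: np.ndarray, A: float = 0.9, beta: float = 1.2,
                    seed: int = 0) -> np.ndarray:
    """Apply the atmospheric degradation model I = J*t + A*(1 - t).

    J    -- haze-free image, float32 in [0, 1], shape (H, W, 3)
    A    -- global atmospheric light (assumed value)
    beta -- attenuation coefficient controlling fog density (assumed value)
    """
    rng = np.random.default_rng(seed)
    h, w = J.shape[:2]
    # Randomly generated gray map used directly as the depth map d(x).
    d = rng.random((h, w)).astype(np.float32)
    # Transmittance map t(x) = exp(-beta * d(x)), values in (0, 1].
    t = np.exp(-beta * d)[..., None]
    # Pixel-wise blend of scene radiance and atmospheric light.
    I = J * t + A * (1.0 - t)
    return np.clip(I, 0.0, 1.0)
```

A paired sample for the synthetic data set is then simply the tuple (J, synthesize_hazy(J)).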
In one implementation, constructing the initial image defogging model includes:
adopting a CycleGAN-like cyclic network model with dual generators and dual discriminators as the framework, wherein the dual generators comprise a defogging generator and a synthetic fog-map generator, both of which adopt a U-Net-like structure;
constructing an adaptive MixUp-weighted skip-connection module for the defogging generator and the synthetic fog-map generator;
constructing a feature fusion layer in the adaptive MixUp-weighted skip-connection module;
and constructing a multi-layer perceptual loss function and a supervised absolute-value loss function.
In particular, the defogging generator $G_{dehaze}$ and the synthetic fog-map generator $G_{fog}$ each have a downsampling encoder and an upsampling decoder; the total downsampling factor is 1/32 and the total upsampling factor is 32.
The adaptive MixUp-weighted skip-connection module (MixUp Weighted Skip Connection Module, MWSCM) of the defogging generator and the synthetic fog-map generator combines the MixUp data-augmentation idea with skip connections to establish weighted connections between different feature layers. It aims to improve the performance of deep neural networks on multi-scale and multi-domain problems, and is used to optimize the skip-link module in the U-Net.
The specific steps of constructing the adaptive MixUp-weighted skip-connection module of the defogging generator and the synthetic fog-map generator include:

recording the feature map $E_c$ of the encoder after each downsampling, wherein $c \in \{1,\dots,5\}$ and the size of $E_{c+1}$ is 1/2 that of $E_c$;

recording the feature map $D_c$ of the decoder after each upsampling, wherein $c \in \{1,\dots,5\}$ and the size of $D_c$ is 1/2 that of $D_{c+1}$;

constructing a feature fusion layer, whose weighted fusion formula is:

$$D_{c+1} = w' \cdot E_c + (1 - w') \cdot \mathrm{Deconv}(D_c), \qquad w' = \mathrm{sigmoid}(w)$$

wherein $c$ is the sampling index, $D_c$ is the feature map after the $c$-th upsampling of the decoder, $w$ is a preset learnable parameter, $w' = \mathrm{sigmoid}(w)$, $E_c$ is the feature map after the $c$-th downsampling of the encoder, and $D_{c+1}$ is the feature map after the $(c+1)$-th upsampling of the decoder.

By setting the learnable parameter $w$ and converting it to $w'$ through the sigmoid function, the model learns the fusion weights between different feature layers; this weighted fusion helps establish smooth connections between features at different levels of abstraction, thereby improving the generalization performance of the neural network. The decoder feature map is brought to the same resolution and channel number as the corresponding encoder feature map by deconvolution (Deconv) upsampling, and the weight $w'$ is determined by the learnable parameter $w$; through the weight $w'$, the model can dynamically learn the relative importance of the feature maps. The main advantage of the feature fusion layer is that smooth connections can be established between features at different levels of abstraction, improving the generalization performance of the network. Meanwhile, owing to the introduction of the MixUp method, the adaptive MixUp-weighted skip connection also provides a certain data-augmentation effect and can effectively reduce the risk of overfitting.
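The weighted fusion can be sketched as a small PyTorch module. This is an illustrative reconstruction rather than the patented implementation: the module name, the per-stage channel counts and the use of one scalar parameter per skip connection are assumptions.

```python
import torch
import torch.nn as nn

class MixupWeightedSkip(nn.Module):
    """Adaptive MixUp-weighted skip connection (illustrative sketch).

    Blends an encoder feature map E_c with the deconvolved decoder
    feature map Deconv(D_c) using a learnable weight w' = sigmoid(w).
    """

    def __init__(self, dec_channels: int, enc_channels: int):
        super().__init__()
        # Deconvolution brings the decoder feature map to the same
        # resolution and channel count as the encoder feature map.
        self.deconv = nn.ConvTranspose2d(dec_channels, enc_channels,
                                         kernel_size=2, stride=2)
        # Learnable scalar parameter w; w' = sigmoid(w) starts at 0.5.
        self.w = nn.Parameter(torch.zeros(1))

    def forward(self, e_c: torch.Tensor, d_c: torch.Tensor) -> torch.Tensor:
        w_prime = torch.sigmoid(self.w)
        d_up = self.deconv(d_c)  # same shape as e_c after deconvolution
        return w_prime * e_c + (1.0 - w_prime) * d_up

# Example: fuse a 1/16-scale decoder map with the matching encoder map.
skip = MixupWeightedSkip(dec_channels=256, enc_channels=128)
e = torch.randn(1, 128, 32, 32)   # encoder feature map E_c
d = torch.randn(1, 256, 16, 16)   # decoder feature map D_c
fused = skip(e, d)                # shape (1, 128, 32, 32)
```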
Constructing the multi-layer perceptual loss function includes:
selecting 5 output feature layers of the ImageNet-pre-trained VGG19 model, and applying them to the foggy image and the corresponding defogged image respectively to obtain foggy feature maps and fog-free feature maps;
using the mean square error (MSE) to measure the difference between the foggy feature maps and the fog-free feature maps.
The VGG multi-layer perceptual loss function $L_{per}$ constructed by the invention pays more attention to the higher-level features of the image than a conventional pixel-level loss function, and can better capture visual-perception differences between images.
A supervised absolute-value loss function $L_{sup}$ is constructed for the synthetic data and is used for strongly supervised training of the synthetic fog-map generator and the defogging generator. The fog map and the corresponding ground-truth clear map of the synthetic data are extracted; during training, the absolute-value loss is computed between each of them and the image produced by the respective generator, and the two absolute-value losses are added to obtain the supervised absolute-value loss function. The supervised absolute-value loss focuses on pixel-level errors between the generated image and the target image, making the images produced by the generators closer to the target images in both global structure and local detail.
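A sketch of the two loss terms in PyTorch follows. The five VGG19 feature layers and the layer weights are placeholders — the description says five output layers of the ImageNet-pre-trained VGG19 are used but does not enumerate them — and the function names are invented for illustration.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

# Assumed selection of 5 VGG19 feature layers (ReLU outputs, one per block);
# the patent does not enumerate which layers are used.
_LAYER_IDS = {3, 8, 17, 26, 35}
_vgg = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

def vgg_features(x: torch.Tensor):
    feats, h = [], x
    for i, layer in enumerate(_vgg):
        h = layer(h)
        if i in _LAYER_IDS:
            feats.append(h)
    return feats

def multilayer_perceptual_loss(x, x_rec, lambdas=(1.0,) * 5):
    """Weighted MSE between VGG19 features of the foggy image x and of
    the cycle-reconstructed image x_rec = G_fog(G_dehaze(x))."""
    loss = x.new_zeros(())
    for lam, fx, fr in zip(lambdas, vgg_features(x), vgg_features(x_rec)):
        loss = loss + lam * F.mse_loss(fx, fr)
    return loss

def supervised_l1_loss(dehazed, J, refogged, I):
    """Supervised absolute-value loss on paired synthetic data:
    ||G_dehaze(I) - J||_1 + ||G_fog(J) - I||_1."""
    return F.l1_loss(dehazed, J) + F.l1_loss(refogged, I)
```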
In one implementation, performing model training on the initial image defogging model by using the synthetic data set to obtain an initial training model includes:
acquiring an original haze-free image in the synthetic data set and the synthetic hazy image matched with the original haze-free image;
acquiring a preset adversarial loss function and a preset cycle-consistency loss function, and obtaining a one-stage training loss function from the multi-layer perceptual loss function, the supervised absolute-value loss function, the adversarial loss function and the cycle-consistency loss function;
inputting the synthetic hazy image into a pre-constructed fog-map feature enhancement module, computing a first luminance feature map, a first contrast prior feature map and a first dark-channel feature map of the synthetic hazy image, and fusing the three to obtain a first training input feature map;
inputting the first training input feature map into the dual generators, processing it with the adaptive MixUp-weighted skip-connection module in the defogging generator, recording a first feature map after each downsampling of the encoder in the defogging generator, and recording a second feature map after each upsampling of the decoder in the defogging generator;
inputting the first feature map and the second feature map into the feature fusion layer for feature fusion, taking the original haze-free image as supervision information, and performing model training on the initial image defogging model based on the one-stage training loss function to obtain first candidate training models corresponding to different test index values, wherein the test index is the peak signal-to-noise ratio;
and taking the first candidate training model with the highest test index value as the initial training model.
Specifically, an original haze-free image and the corresponding synthetic hazy image in the synthetic data set are obtained; a preset adversarial loss function and a preset cycle-consistency loss function are acquired and combined with the multi-layer perceptual loss function and the supervised absolute-value loss function to obtain the one-stage training loss function. In this training stage the strong supervision information of the synthetic data is needed, so the weight of the supervised absolute-value loss function is set relatively large.
The fog-map feature enhancement module is pre-constructed for enhancing the input features of the model and is applied at the input of the defogging generator $G_{dehaze}$. The module is implemented by computing the luminance feature map, the contrast prior feature map and the dark-channel feature map of the foggy image $I$ respectively, and fusing the three.
The fog-map feature enhancement module processes the image as follows:

converting the foggy image $I$ into a gray-scale map, and using the converted gray-scale map as the luminance feature map;

sliding a window $W$ of size 3x3 over the foggy image $I$, and computing the pixel mean $\mu$ and standard deviation $\sigma$ within each window, wherein the global map of $\sigma$ is the contrast prior feature map;

computing the dark-channel feature map based on the dark-channel prior formula:

$$I^{dark}(x) = \min_{y \in \Omega(x)}\Big(\min_{ch \in \{r,g,b\}} I^{ch}(y)\Big)$$

wherein $\Omega(x)$ denotes a local image block centered on pixel $x$, and the sliding window is set to size 3x3. The luminance feature map, the contrast prior feature map and the dark-channel feature map each pass through a 1x1 convolution layer for feature extraction, and are finally channel-concatenated with the foggy image $I$ to obtain an input feature map with 6 channels and the same size as the foggy image.
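The three prior feature maps can be sketched as follows. This NumPy version is illustrative only: the RGB-to-gray conversion weights are assumed, and the 1x1 convolution layers applied before channel concatenation are omitted, so the sketch returns the raw 6-channel stack.

```python
import numpy as np

def enhance_features(I: np.ndarray) -> np.ndarray:
    """Build the 6-channel input: foggy RGB plus luminance, contrast-prior
    and dark-channel maps, each over a 3x3 window. I is (H, W, 3) in [0, 1]."""
    h, w = I.shape[:2]
    # Luminance feature map: standard RGB-to-gray conversion (assumed weights).
    lum = I @ np.array([0.299, 0.587, 0.114], dtype=I.dtype)

    # Pad so every pixel has a full 3x3 neighborhood.
    pad = np.pad(I, ((1, 1), (1, 1), (0, 0)), mode="edge")
    contrast = np.empty((h, w), dtype=I.dtype)
    dark = np.empty((h, w), dtype=I.dtype)
    for y in range(h):
        for x in range(w):
            patch = pad[y:y + 3, x:x + 3, :]   # 3x3 local block Omega(x)
            contrast[y, x] = patch.std()       # contrast prior: local sigma
            dark[y, x] = patch.min()           # min over pixels and channels
    # Channel concatenation: 3 (RGB) + 3 feature maps = 6 channels.
    return np.concatenate([I, lum[..., None], contrast[..., None],
                           dark[..., None]], axis=-1)
```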
The peak signal-to-noise ratio for the one-stage training is expressed as:

$$\mathrm{PSNR} = 10 \cdot \log_{10}\frac{(2^n - 1)^2}{\mathrm{MSE}}$$

wherein the MSE is the mean square error between the defogged image $G_{dehaze}(I)$ and the original haze-free image $J$ in the synthetic data set, and $n = 8$. The larger the peak signal-to-noise value, the better the image restoration quality.
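For reference, the test index can be computed directly from the formula above; a small helper, assuming 8-bit images:

```python
import numpy as np

def psnr(dehazed: np.ndarray, reference: np.ndarray, n: int = 8) -> float:
    """PSNR = 10 * log10((2^n - 1)^2 / MSE), for uint8-range images."""
    diff = dehazed.astype(np.float64) - reference.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float(10.0 * np.log10((2 ** n - 1) ** 2 / mse))
```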
The invention uses the prior-based fog-map feature enhancement module to extract priors useful for defogging, such as luminance, contrast and the dark channel, and enhances the image features before model training, so that the model can learn defogging features better than a generic generative network.
In one implementation, training the initial training model by using the real scene data set to obtain a trained image defogging model includes:
acquiring a real foggy image and a real fog-free image in the real scene data set;
obtaining a two-stage training loss function from the multi-layer perceptual loss function, the adversarial loss function and the cycle-consistency loss function;
inputting the real foggy image into the trained initial training model;
passing the real foggy image through the fog-map feature enhancement module, computing a second luminance feature map, a second contrast prior feature map and a second dark-channel feature map of the real foggy image, and fusing the three to obtain a second training input feature map;
inputting the second training input feature map into the dual generators, processing it with the adaptive MixUp-weighted skip-connection module in the defogging generator, recording a third feature map after each downsampling of the encoder in the defogging generator, and recording a fourth feature map after each upsampling of the decoder in the defogging generator;
inputting the third feature map and the fourth feature map into the feature fusion layer for feature fusion, taking the real fog-free image as supervision information, and performing model training on the initial training model based on the two-stage training loss function to obtain second candidate training models corresponding to different test index values, wherein the test index is the peak signal-to-noise ratio;
and taking the second candidate training model with the highest test index value as the trained image defogging model.
Specifically, the peak signal-to-noise ratio for the two-stage training is expressed as:

$$\mathrm{PSNR} = 10 \cdot \log_{10}\frac{(2^n - 1)^2}{\mathrm{MSE}}$$

wherein the MSE is the mean square error between the defogged image $G_{dehaze}(I_{real})$ and the real fog-free image used as supervision information, $I_{real}$ is a real foggy image in the real scene data set, and $n = 8$.
By first training the initial image defogging model on the synthetic data set to obtain the initial training model and then training the initial training model on the real data set, the invention reasonably resolves the domain-distribution gap between synthetic and real data, reduces the model's dependence on data, keeps data cost low, fuses physical prior features, introduces a perceptual loss, and improves the end-to-end defogging performance.
In one implementation, the multi-layer perceptual loss function is expressed as:
$$L_{per} = \sum_{i} \lambda_i \, \mathrm{MSE}\big(\phi_i(x),\ \phi_i(G_{fog}(G_{dehaze}(x)))\big)$$

wherein $x$ denotes a foggy image, $i$ denotes the index of the feature layer, $\lambda_i$ denotes the weight of the error of each feature layer, $G_{fog}$ denotes the synthetic fog-map generator, $G_{dehaze}$ denotes the defogging generator, and $\phi_i(x)$ denotes the $i$-th layer feature map of the currently input foggy image;

the supervised absolute-value loss function is expressed as:

$$L_{sup} = \big\|G_{dehaze}(I) - J\big\|_1 + \big\|G_{fog}(J) - I\big\|_1$$

wherein $J$ is the original haze-free image in the synthetic data set and $I$ is the matching hazy image in the synthetic data set;

the one-stage training loss function is expressed as:

$$L_{stage1} = \lambda_1 L_{adv} + \lambda_2 L_{cyc} + \lambda_3 L_{per} + \lambda_4 L_{sup}$$

the two-stage training loss function is expressed as:

$$L_{stage2} = \lambda_1 L_{adv} + \lambda_2 L_{cyc} + \lambda_3 L_{per}$$

wherein $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are constants, $L_{adv}$ denotes the adversarial loss function, $L_{cyc}$ denotes the cycle-consistency loss function, $L_{per}$ denotes the multi-layer perceptual loss function, and $L_{sup}$ denotes the supervised absolute-value loss function.
In particular, in the multi-layer perceptual loss function, the weights $\lambda_i$ adjust the contributions of the losses at different levels, since the deep feature maps represent high-level semantic information of the image while the shallow feature maps represent its low-level texture information. In the one-stage training loss function, the constants $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are set to preset values; in the two-stage training loss function, the constants $\lambda_1$, $\lambda_2$ and $\lambda_3$ are set to preset values. The model training of the invention fuses virtual and real data; through the two-stage training method, the strong supervision information of the synthetic data and the real data distribution of the real data are fully utilized, improving the restoration quality of defogged images.
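How the two stage losses are assembled can be shown schematically. The weight values below are placeholders: the description only states that the constants are preset and that the supervised term is weighted relatively heavily in stage one.

```python
def stage1_loss(l_adv, l_cyc, l_per, l_sup,
                weights=(1.0, 10.0, 1.0, 10.0)):
    """One-stage loss on paired synthetic data: adversarial +
    cycle-consistency + perceptual + supervised L1 (assumed weights)."""
    w1, w2, w3, w4 = weights
    return w1 * l_adv + w2 * l_cyc + w3 * l_per + w4 * l_sup

def stage2_loss(l_adv, l_cyc, l_per, weights=(1.0, 10.0, 1.0)):
    """Two-stage loss on unpaired real data: the supervised L1 term is
    dropped because no paired ground truth exists for real foggy images."""
    w1, w2, w3 = weights
    return w1 * l_adv + w2 * l_cyc + w3 * l_per
```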
In one implementation, the image defogging model is light-weighted (model compression) to improve its running speed.
Further, as shown in fig. 2, based on the above image defogging processing method, the invention correspondingly provides an image defogging processing device based on deep learning, comprising:
the acquisition module 100, configured to acquire an original image to be processed, extract each target feature map from the original image, and fuse the target feature maps to obtain an input feature map;
the input module 200, configured to input the input feature map into a pre-trained image defogging model to obtain a defogged image;
wherein the image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set.
In one embodiment, as shown in fig. 3, the terminal includes: a memory 20, a processor 10, and a deep-learning-based image defogging processing program 30 stored in the memory and runnable on the processor; the steps of the image defogging processing method based on deep learning are implemented when the program 30 is executed by the processor 10.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed, implements the steps of the image defogging processing method based on deep learning described above.
In summary, the image defogging processing method and device based on deep learning provided by the invention comprise: acquiring an original image to be processed, extracting each target feature map from the original image, and fusing the target feature maps to obtain an input feature map; and inputting the input feature map into a pre-trained image defogging model to obtain a defogged image. The image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set. The image defogging model obtained by this two-stage training method makes full use of the strong supervision information of the synthetic data and the real data distribution of the real data, reasonably resolves the domain-distribution gap between synthetic and real data, and effectively improves the defogging effect.
It is to be understood that the application of the invention is not limited to the examples described above; those skilled in the art can make improvements and modifications in light of the above description, and all such improvements and modifications fall within the protection scope of the appended claims.

Claims (8)

1. An image defogging processing method based on deep learning is characterized by comprising the following steps:
acquiring an original image to be processed, extracting each target feature map from the original image, and fusing the target feature maps to obtain an input feature map;
inputting the input feature map into a pre-trained image defogging model to obtain a defogged image;
wherein the image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set;
the training step of the image defogging model comprises the following steps:
constructing a synthetic data set and a real scene data set;
constructing an initial image defogging model;
performing model training on the initial image defogging model by using the synthetic data set to obtain an initial training model;
training the initial training model by utilizing the real scene data set to obtain a trained image defogging model;
wherein constructing the synthetic data set and the real scene data set comprises:
acquiring an original haze-free image, a transmittance map and a preset global atmospheric light, wherein the transmittance map is obtained by randomly generating a depth map and then applying a preset exponential decay function;
acquiring a preset atmospheric degradation model formula, and substituting the original haze-free image, the transmittance map and the preset global atmospheric light into the preset atmospheric degradation model formula to obtain a synthetic hazy image matched with the original haze-free image;
constructing the synthetic data set from the original haze-free image and the synthetic hazy image;
acquiring a real foggy image and a real fog-free image in a real scene, and constructing the real scene data set from the real foggy image and the real fog-free image;
and extracting each target feature map from the original image comprises: inputting the original image into a pre-constructed fog-map feature enhancement module, and extracting a luminance feature map, a contrast prior feature map and a dark-channel feature map of the original image.
2. The image defogging processing method based on deep learning according to claim 1, wherein the constructing an initial image defogging model comprises:
adopting a CycleGAN-like cyclic network model with dual generators and dual discriminators as the framework, wherein the dual generators comprise a defogging generator and a synthetic fog-map generator, both of which adopt a U-Net-like structure;
constructing an adaptive MixUp-weighted skip-connection module for the defogging generator and the synthetic fog-map generator;
constructing a feature fusion layer in the adaptive MixUp-weighted skip-connection module;
and constructing a multi-layer perceptual loss function and a supervised absolute-value loss function.
3. The image defogging processing method based on deep learning according to claim 2, wherein performing model training on the initial image defogging model by using the synthetic data set to obtain an initial training model comprises:
acquiring an original haze-free image in the synthetic data set and the synthetic hazy image matched with the original haze-free image;
acquiring a preset adversarial loss function and a preset cycle-consistency loss function, and obtaining a one-stage training loss function from the multi-layer perceptual loss function, the supervised absolute-value loss function, the adversarial loss function and the cycle-consistency loss function;
inputting the synthetic hazy image into a pre-constructed fog-map feature enhancement module, computing a first luminance feature map, a first contrast prior feature map and a first dark-channel feature map of the synthetic hazy image, and fusing the three to obtain a first training input feature map;
inputting the first training input feature map into the dual generators, processing it with the adaptive MixUp-weighted skip-connection module in the defogging generator, recording a first feature map after each downsampling of the encoder in the defogging generator, and recording a second feature map after each upsampling of the decoder in the defogging generator;
inputting the first feature map and the second feature map into the feature fusion layer for feature fusion, taking the original haze-free image as supervision information, and performing model training on the initial image defogging model based on the one-stage training loss function to obtain first candidate training models corresponding to different test index values, wherein the test index is the peak signal-to-noise ratio;
and taking the first candidate training model with the highest test index value as the initial training model.
4. The image defogging processing method based on deep learning according to claim 3, wherein training the initial training model by using the real scene data set to obtain a trained image defogging model comprises:
acquiring a real foggy image and a real fog-free image in the real scene data set;
obtaining a two-stage training loss function from the multi-layer perceptual loss function, the adversarial loss function and the cycle-consistency loss function;
inputting the real foggy image into the trained initial training model;
passing the real foggy image through the fog-map feature enhancement module, computing a second luminance feature map, a second contrast prior feature map and a second dark-channel feature map of the real foggy image, and fusing the three to obtain a second training input feature map;
inputting the second training input feature map into the dual generators, processing it with the adaptive MixUp-weighted skip-connection module in the defogging generator, recording a third feature map after each downsampling of the encoder in the defogging generator, and recording a fourth feature map after each upsampling of the decoder in the defogging generator;
inputting the third feature map and the fourth feature map into the feature fusion layer for feature fusion, taking the real fog-free image as supervision information, and performing model training on the initial training model based on the two-stage training loss function to obtain second candidate training models corresponding to different test index values, wherein the test index is the peak signal-to-noise ratio;
and taking the second candidate training model with the highest test index value as the trained image defogging model.
5. The image defogging processing method based on deep learning of claim 4, wherein the multi-layer perceptual loss function is expressed as:
$$L_{per} = \sum_{i} \lambda_i \, \mathrm{MSE}\big(\phi_i(x),\ \phi_i(G_{fog}(G_{dehaze}(x)))\big)$$

wherein $x$ denotes a foggy image, $i$ denotes the index of the feature layer, $\lambda_i$ denotes the weight of the error of each feature layer, $G_{fog}$ denotes the synthetic fog-map generator, $G_{dehaze}$ denotes the defogging generator, and $\phi_i(x)$ denotes the $i$-th layer feature map of the currently input foggy image;

the supervised absolute-value loss function is expressed as:

$$L_{sup} = \big\|G_{dehaze}(I) - J\big\|_1 + \big\|G_{fog}(J) - I\big\|_1$$

wherein $J$ is the original haze-free image in the synthetic data set and $I$ is the matching hazy image in the synthetic data set;

the one-stage training loss function is expressed as:

$$L_{stage1} = \lambda_1 L_{adv} + \lambda_2 L_{cyc} + \lambda_3 L_{per} + \lambda_4 L_{sup}$$

the two-stage training loss function is expressed as:

$$L_{stage2} = \lambda_1 L_{adv} + \lambda_2 L_{cyc} + \lambda_3 L_{per}$$

wherein $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are constants, $L_{adv}$ denotes the adversarial loss function, $L_{cyc}$ denotes the cycle-consistency loss function, $L_{per}$ denotes the multi-layer perceptual loss function, and $L_{sup}$ denotes the supervised absolute-value loss function.
6. An image defogging processing device based on deep learning, characterized by comprising:
the acquisition module, configured to acquire an original image to be processed, extract each target feature map from the original image, and fuse the target feature maps to obtain an input feature map;
the input module, configured to input the input feature map into a pre-trained image defogging model to obtain a defogged image;
wherein the image defogging model is obtained by first training a pre-built initial image defogging model on a pre-built synthetic data set to obtain an initial training model, and then training the initial training model on a pre-built real scene data set;
the training step of the image defogging model comprises the following steps:
constructing a synthetic data set and a real scene data set;
constructing an initial image defogging model;
performing model training on the initial image defogging model by using the synthetic data set to obtain an initial training model;
training the initial training model by utilizing the real scene data set to obtain a trained image defogging model;
wherein constructing the synthetic data set and the real scene data set comprises:
acquiring an original haze-free image, a transmittance map and a preset global atmospheric light, wherein the transmittance map is obtained by randomly generating a depth map and then applying a preset exponential decay function;
acquiring a preset atmospheric degradation model formula, and substituting the original haze-free image, the transmittance map and the preset global atmospheric light into the preset atmospheric degradation model formula to obtain a synthetic hazy image matched with the original haze-free image;
constructing the synthetic data set from the original haze-free image and the synthetic hazy image;
acquiring a real foggy image and a real fog-free image in a real scene, and constructing the real scene data set from the real foggy image and the real fog-free image;
and extracting each target feature map from the original image comprises: inputting the original image into a pre-constructed fog-map feature enhancement module, and extracting a luminance feature map, a contrast prior feature map and a dark-channel feature map of the original image.
7. A terminal, comprising: a memory, a processor, and a deep-learning-based image defogging processing program stored in the memory and runnable on the processor, wherein the steps of the image defogging processing method based on deep learning according to any one of claims 1 to 5 are implemented when the program is executed by the processor.
8. A computer-readable storage medium storing a computer program which, when executed, implements the steps of the image defogging processing method based on deep learning according to any one of claims 1 to 5.
CN202310729654.7A 2023-06-20 2023-06-20 Image defogging processing method and device based on deep learning Active CN116452469B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310729654.7A CN116452469B (en) 2023-06-20 2023-06-20 Image defogging processing method and device based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310729654.7A CN116452469B (en) 2023-06-20 2023-06-20 Image defogging processing method and device based on deep learning

Publications (2)

Publication Number Publication Date
CN116452469A (en) 2023-07-18
CN116452469B (en) 2023-10-03

Family

ID=87120602

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310729654.7A Active CN116452469B (en) 2023-06-20 2023-06-20 Image defogging processing method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN116452469B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117611644A (en) * 2024-01-23 2024-02-27 南京航空航天大学 Method, device, medium and equipment for converting visible light image into SAR image

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112614070A (en) * 2020-12-28 2021-04-06 南京信息工程大学 DefogNet-based single image defogging method
CN114266933A (en) * 2021-12-10 2022-04-01 河南垂天科技有限公司 GAN image defogging algorithm based on deep learning improvement
WO2022267641A1 (en) * 2021-06-25 2022-12-29 南京邮电大学 Image defogging method and system based on cyclic generative adversarial network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220414838A1 (en) * 2021-06-25 2022-12-29 Nanjing University Of Posts And Telecommunications Image dehazing method and system based on cyclegan

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112614070A (en) * 2020-12-28 2021-04-06 南京信息工程大学 DefogNet-based single image defogging method
WO2022267641A1 (en) * 2021-06-25 2022-12-29 南京邮电大学 Image defogging method and system based on cyclic generative adversarial network
CN114266933A (en) * 2021-12-10 2022-04-01 河南垂天科技有限公司 GAN image defogging algorithm based on deep learning improvement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Xing Xiaomin et al. Two-stage end-to-end image dehazing generation network. Journal of Computer-Aided Design & Computer Graphics, No. 01, 2020, full text. *

Also Published As

Publication number Publication date
CN116452469A (en) 2023-07-18

Similar Documents

Publication Publication Date Title
Kim et al. Fast single image dehazing using saturation based transmission map estimation
CN108986050B (en) Image and video enhancement method based on multi-branch convolutional neural network
CN108875935B (en) Natural image target material visual characteristic mapping method based on generation countermeasure network
CN111915530B (en) End-to-end-based haze concentration self-adaptive neural network image defogging method
Anvari et al. Dehaze-GLCGAN: unpaired single image de-hazing via adversarial training
Wang et al. MAGAN: Unsupervised low-light image enhancement guided by mixed-attention
CN116452469B (en) Image defogging processing method and device based on deep learning
CN116311254B (en) Image target detection method, system and equipment under severe weather condition
Swami et al. Candy: Conditional adversarial networks based fully end-to-end system for single image haze removal
CN111931857A (en) MSCFF-based low-illumination target detection method
Su et al. Prior guided conditional generative adversarial network for single image dehazing
Zhang et al. Hierarchical attention aggregation with multi-resolution feature learning for GAN-based underwater image enhancement
CN113066025B (en) Image defogging method based on incremental learning and feature and attention transfer
Babu et al. An efficient image dahazing using Googlenet based convolution neural networks
Guo et al. Haze removal for single image: A comprehensive review
Saleem et al. A non-reference evaluation of underwater image enhancement methods using a new underwater image dataset
CN117011181A (en) Classification-guided unmanned aerial vehicle imaging dense fog removal method
CN115953312A (en) Joint defogging detection method and device based on single image and storage medium
Wali et al. Recent Progress in Digital Image Restoration Techniques: A Review
Li et al. Multi-scale fusion framework via retinex and transmittance optimization for underwater image enhancement
Kumar et al. Underwater Image Enhancement using deep learning
CN113744152A (en) Tide water image denoising processing method, terminal and computer readable storage medium
CN112927250A (en) Edge detection system and method based on multi-granularity attention hierarchical network
Yuan et al. Turbidity underwater image enhancement based on generative adversarial network
CN116563615B (en) Bad picture classification method based on improved multi-scale attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant