CN116542865A - Multi-scale real-time defogging method and device based on structural re-parameterization - Google Patents

Multi-scale real-time defogging method and device based on structural re-parameterization

Info

Publication number
CN116542865A
CN116542865A (application CN202310223074.0A)
Authority
CN
China
Prior art keywords
defogging
network
image
parameterization
structural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310223074.0A
Other languages
Chinese (zh)
Inventor
左方
刘家萌
高铭远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University
Original Assignee
Henan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University filed Critical Henan University
Priority to CN202310223074.0A priority Critical patent/CN116542865A/en
Publication of CN116542865A publication Critical patent/CN116542865A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/73 Deblurring; Sharpening
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776 Validation; Performance evaluation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-scale real-time defogging method and device based on structural re-parameterization, wherein the method comprises the following steps: constructing a structural re-parameterization module; constructing a multi-scale image defogging network based on structural re-parameterization; defogging the haze image by adopting a K-estimation image reconstruction module; defining a composite loss function of the multi-scale image defogging network based on structural re-parameterization; initializing the multi-scale image defogging network; preparing a data set; training the multi-scale image defogging network with the prepared data set; and defogging haze images with the trained multi-scale image defogging network while measuring the quality and efficiency of the defogged images. According to the invention, the K-estimation image reconstruction module is added to the multi-scale network to defog the haze image, so that the physical characteristics of haze weather contained in the image can be better learned and a higher-quality defogged picture can be recovered.

Description

Multi-scale real-time defogging method and device based on structural re-parameterization
Technical Field
The invention relates to the technical field of single-image defogging, in particular to a multi-scale real-time defogging method and device based on structural re-parameterization.
Background
Modern industrial development has made the haze phenomenon frequent; besides affecting daily life, it greatly impacts scientific research in the field of computer vision. Haze is a common atmospheric phenomenon produced by small floating particles such as dust and smoke, which strongly absorb and scatter light, reducing visibility, lowering the contrast of photographed images, blurring image quality, distorting pixels and severely affecting visible-light optical systems. Under the influence of haze, practical applications requiring high-quality clear pictures, such as remote sensing, navigation, automatic driving and video monitoring, are easily compromised, and high-level computer vision tasks such as detection and recognition become difficult to complete. Image defogging has therefore become an increasingly important low-level vision task with significant research value, and it remains a challenging subject. Traditional defogging algorithms fall into two main categories: those based on image enhancement and those based on image restoration. Image-enhancement-based defogging algorithms, such as wavelet transformation and homomorphic filtering, start from removing image noise as much as possible and improving image contrast in order to recover a clear defogged image. Image-restoration-based defogging algorithms, such as the dark channel defogging algorithm and Bayesian defogging algorithms, perform defogging on the basis of an atmospheric degradation model. The defogging effect of methods based on the atmospheric degradation model is generally better than that of image-enhancement-based algorithms.
In recent years, convolutional neural networks (Convolutional Neural Network, CNN) have developed rapidly in computer vision image processing, and many traditional computer vision algorithms have been replaced by Deep Learning (DL), so modern defogging techniques using CNNs keep emerging. These deep learning defogging techniques can be divided into two categories. The first still feeds the hazy image into an Atmospheric Scattering Model (ASM), using a neural network to estimate the global atmospheric light value and transmittance of the atmospheric degradation model, from which the clear defogged image is calculated. The second uses an end-to-end deep learning approach: the haze image is input into a convolutional neural network, which directly predicts and outputs the defogged image.
Conventional image defogging algorithms rely heavily on prior knowledge, such as the dark channel prior, colour attenuation prior and maximum contrast prior, to restore a sharp image. However, not all real scene images satisfy a predefined prior, so the performance of conventional image defogging algorithms is greatly limited. Recently, deep learning has shown its effectiveness in image defogging, and various convolutional-neural-network-based methods have been proposed to estimate the atmospheric degradation model. The degradation model can be expressed as I(x) = J(x)t(x) + A(1 - t(x)), where A is the global atmospheric light value and t(x) is the transmission matrix. Most current deep learning methods use a multi-branch network to estimate the transmission matrix t(x) and the global atmospheric light value A separately and then calculate the defogged image through the atmospheric degradation model. A multi-branch network can combine the low-level and high-level features of a convolutional neural network, preserving the detail information of the image while capturing more semantic information. However, a multi-branch network brings an excessive number of parameters and a large computation cost, which increases the time complexity of the defogging algorithm and prevents its application in scenarios with high real-time requirements, such as automatic navigation and real-time monitoring. Conversely, in pursuit of defogging speed, algorithms such as AOD-Net and Light-DehazeNet adopt a lightweight single-branch network for training and inference; however, the lightweight single-branch network has lower capacity, so the defogging effect is poor.
Disclosure of Invention
Aiming at the problem that existing image defogging methods do not strike a good balance between defogging speed and defogging quality, the invention provides a multi-scale real-time defogging method and device based on structural re-parameterization. Image defogging quality is improved through a structural re-parameterization module and a multi-branch network structure; during inference the structural re-parameterization module is converted into an ordinary convolution module, which reduces the number of parameters of the network model and thereby improves the inference speed of the defogging model while improving image defogging quality.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
the invention provides a multi-scale real-time defogging method based on structural re-parameterization, which comprises the following steps:
step 1: constructing a structural re-parameterization module;
step 2: constructing a multi-scale image defogging network based on structural re-parameterization; the multi-scale image defogging network comprises a structural re-parameterization module and a K-estimation image reconstruction module;
step 3: defogging the haze image by adopting a K-estimation image reconstruction module;
step 4: defining a composite loss function of the multi-scale image defogging network based on structural re-parameterization;
step 5: initializing a multi-scale image defogging network;
step 6: preparing a data set;
step 7: training a multi-scale image defogging network by using the prepared data set;
step 8: defogging haze images with the trained multi-scale image defogging network, and measuring the quality and efficiency of the defogged images.
Further, the structural re-parameterization module has different structures during network training and during inference; during training it has a plurality of different branches, which are converted into a single-branch structure through an identity transformation for inference, and the converted single-branch structure is used to perform equivalent inference.
Further, during training the structural re-parameterization module comprises an identity mapping branch, a 1×1 convolution layer, a 3×3 convolution layer and a 5×5 convolution layer; the identity mapping branch, the 1×1 convolution layer and the 3×3 convolution layer are represented as 5×5 convolution layers through zero padding, and the four branches are merged into a single-branch 5×5 convolution layer through element-wise addition; the converted structural re-parameterization module has only one branch, consisting of a 5×5 convolution layer and a nonlinear ReLU activation layer.
Further, the multi-scale image defogging network comprises three feature extraction modules of different scales and a K-estimation image reconstruction module, wherein each feature extraction module consists of a 3×3 convolution layer and two structural re-parameterization modules.
Further, the K-estimation module combines the global atmospheric light value and the transmission matrix into a single parameter K(x) through a mathematical transformation: K(x) = ((I(x) - A)/t(x) + (A - b)) / (I(x) - 1) and J(x) = K(x)I(x) - K(x) + b, where t(x) denotes the transmission matrix, A the global atmospheric light value, I(x) = J(x)t(x) + A(1 - t(x)) the atmospheric degradation model, and b a constant bias with a default value of 1.
Further, in the step 4, the composite loss function is composed of a mean square error loss function and an edge perception loss function.
Further, in the step 5, the convolution kernel parameters are initialized using a Gaussian distribution.
Further, the step 6 includes:
creating synthetic haze images using the NYU2 depth dataset, and dividing them proportionally into a training set, a validation set and a test set.
Further, the step 7 includes:
setting the initial learning rate, and setting the number of batch images and training rounds by adopting an ADAM optimizer until the network converges.
In another aspect, the present invention provides a multi-scale real-time defogging device based on structural re-parameterization, comprising:
the first network construction unit is used for constructing a structure re-parameterization module;
the second network construction unit is used for constructing a multi-scale image defogging network based on structural re-parameterization; the multi-scale image defogging network comprises a structural re-parameterization module and a K-estimation image reconstruction module;
the third network construction unit is used for defogging the haze image by adopting the K-estimation image reconstruction module;
the loss function construction unit is used for defining a composite loss function of the multi-scale image defogging network based on structural re-parameterization;
the network initialization unit is used for initializing a multi-scale image defogging network;
a data set construction unit for preparing a data set;
a network training unit for training a multi-scale image defogging network using the prepared data set;
and the defogging unit is used for defogging the haze image by using the trained multi-scale image defogging network and detecting the quality and efficiency of the defogged image.
Further, the structural re-parameterization module has different structures during network training and during inference; during training it has a plurality of different branches, which are converted into a single-branch structure through an identity transformation for inference, and the converted single-branch structure is used to perform equivalent inference.
Further, during training the structural re-parameterization module comprises an identity mapping branch, a 1×1 convolution layer, a 3×3 convolution layer and a 5×5 convolution layer; the identity mapping branch, the 1×1 convolution layer and the 3×3 convolution layer are represented as 5×5 convolution layers through zero padding, and the four branches are merged into a single-branch 5×5 convolution layer through element-wise addition; the converted structural re-parameterization module has only one branch, consisting of a 5×5 convolution layer and a nonlinear ReLU activation layer.
Further, the multi-scale image defogging network comprises three feature extraction modules of different scales and a K-estimation image reconstruction module, wherein each feature extraction module consists of a 3×3 convolution layer and two structural re-parameterization modules.
Further, the K-estimation module combines the global atmospheric light value and the transmission matrix into a single parameter K(x) through a mathematical transformation: K(x) = ((I(x) - A)/t(x) + (A - b)) / (I(x) - 1) and J(x) = K(x)I(x) - K(x) + b, where t(x) denotes the transmission matrix, A the global atmospheric light value, I(x) = J(x)t(x) + A(1 - t(x)) the atmospheric degradation model, and b a constant bias with a default value of 1.
Further, in the loss function construction unit, the composite loss function is composed of a mean square error loss function and an edge-aware loss function.
Further, in the network initialization unit, the gaussian distribution initialization is used for initializing parameters of the convolution kernel.
Further, the data set construction unit includes:
creating synthetic haze images using the NYU2 depth dataset, and dividing them proportionally into a training set, a validation set and a test set.
Further, the network training unit includes:
setting the initial learning rate, and setting the number of batch images and training rounds by adopting an ADAM optimizer until the network converges.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention provides a multi-scale real-time defogging method and device based on structural re-parameterization for the single-image defogging field. The structural re-parameterization module equivalently converts the multi-branch structure used in training into a single-branch structure used in inference: the multi-branch structure is used during training to improve the fitting capacity of the network, while the single-branch structure is used during inference to reduce the computation cost of the network.
(2) The K-estimation image reconstruction module is added to the multi-scale network to defog the haze image. The three features of different scales produced by the three feature extraction stages are fed into two up-sampling convolution layers and, after fusion, into the K-estimation module, so as to capture more critical low-level structural information and higher-level semantic information. The K-estimation module is a transformed structure derived from the atmospheric scattering model; through it, the physical characteristics of haze weather contained in the image can be better learned, so that a higher-quality defogged picture can be recovered.
(3) The present invention is trained and tested on the NYU2 depth dataset. Experimental results show that the defogging quality of the model is superior to that of mainstream deep-learning-based defogging algorithms, while the inference speed of the network reaches real-time levels. In addition, the lightweight network model can be conveniently embedded into computer-vision-based systems such as aerial photography, automatic navigation and real-time monitoring.
Drawings
FIG. 1 is a schematic flow chart of a multi-scale real-time defogging method based on structural re-parameterization according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a structural re-parameterized module according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a multi-scale image defogging network architecture according to an embodiment of the present invention;
FIG. 4 is a graph showing the comparison of defogging effects according to an embodiment of the present invention;
FIG. 5 is a diagram showing an example of defogging effect according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a multi-scale real-time defogging device based on structural re-parameterization according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated by the following description of specific embodiments in conjunction with the accompanying drawings:
as shown in fig. 1, a multi-scale real-time defogging method based on structural re-parameterization includes:
step one, a structural re-parameterization module is constructed. The structure re-parameterization module has a different structure at the time of network training and at the time of reasoning, and has a plurality of different branches at the time of training, including an identity mapping branch, a 1 x 1 convolution layer, a 3 x 3 convolution layer, and a 5 x 5 convolution layer. During reasoning, we convert the multi-branch structure during training into a single-branch structure during reasoning through an identity transformation. Specifically, the identity map branches, the 1×1 convolutional layer, and the 3×3 convolutional layer can be represented as a 5×5 convolutional layer by zero padding, and we convert the identity of the four branches into a single-branch 5×5 convolutional layer by performing element addition on the four branches. The converted structure re-parameterization module structure has only one branch consisting of a 5×5 convolution layer and a nonlinear activation function ReLU layer, and the equivalent reasoning is carried out by using the converted single-branch structure in the reasoning process, and the structure re-parameterization module is shown in fig. 2.
And step two, constructing a multi-scale image defogging network. The network contains three feature extraction stages in a pyramid structure to extract multi-scale features. The first feature extraction stage consists of a 3×3 convolution layer and two structural re-parameterization modules, and increases the number of feature map channels to 32; the next two stages have the same composition but increase the feature map depth to 64 and 128, respectively, while halving the feature resolution. The three features of different scales generated by the three feature extraction stages are fed into the channel attention module to capture more critical low-level structural information and high-level semantic information. The multi-scale image defogging network is shown in fig. 3.
Step three, a K-estimation image reconstruction module is constructed to defog the haze image. The K-estimation module is an equivalent transformation of the atmospheric degradation model I(x) = J(x)t(x) + A(1 - t(x)), which would otherwise require estimating the global atmospheric light value A and the transmission matrix t(x) separately. The K-estimation module instead combines the global atmospheric light value and the transmission matrix into a single parameter K through a mathematical transformation. Specifically, J(x) = K(x)I(x) - K(x) + b, where K(x) = ((I(x) - A)/t(x) + (A - b)) / (I(x) - 1). Through this formula, 1/t(x) and A are both integrated into the new variable K(x); b is a constant bias with a default value of 1. Since K(x) depends on I(x), an input-adaptive deep model can be constructed and trained by minimizing the error between its output J(x) and the haze-free image.
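The transformation can be checked numerically. In this sketch (`k_estimation` is an illustrative name, not from the patent), a hazy image is synthesised from a known clear image via the degradation model, K(x) is formed from A and t(x), and the clear image is recovered exactly via J(x) = K(x)I(x) - K(x) + b with b = 1:

```python
import numpy as np

def k_estimation(I, A, t, b=1.0):
    """Fold the atmospheric light A and transmission t(x) into K(x)."""
    return ((I - A) / t + (A - b)) / (I - 1.0)

rng = np.random.default_rng(1)
J = rng.uniform(0.1, 0.9, (8, 8))      # clear image
t = rng.uniform(0.3, 0.9, (8, 8))      # transmission matrix
A = 0.8                                # global atmospheric light value
I = J * t + A * (1.0 - t)              # atmospheric degradation model

K = k_estimation(I, A, t)
J_rec = K * I - K + 1.0                # J(x) = K(x)I(x) - K(x) + b, b = 1
assert np.allclose(J_rec, J)
```

In the network itself K(x) is predicted directly from the input I(x) rather than computed from A and t(x); the check above only confirms the algebraic equivalence.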
And step four, defining a composite loss function of the multi-scale image defogging network based on structural re-parameterization. The composite loss function is formed by a mean square error (MSE) loss function and an edge-aware loss function. During the training phase, the composite loss function L_total is defined as the weighted combination of these two loss functions:
L_total = λ1·L1 + λ2·L2
where λ1 and λ2 are the weights of the two loss functions, L1 is the mean square error loss function, and L2 is the edge-aware loss function.
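The patent does not spell out the form of the edge-aware term, so the sketch below uses simple finite-difference image gradients as an assumed stand-in; the function names and the default λ values are illustrative:

```python
import numpy as np

def mse_loss(pred, target):
    """L1 in the text: mean square error between images."""
    return np.mean((pred - target) ** 2)

def edge_loss(pred, target):
    """Assumed L2: MSE between horizontal and vertical image gradients."""
    dx = lambda im: im[:, 1:] - im[:, :-1]
    dy = lambda im: im[1:, :] - im[:-1, :]
    return mse_loss(dx(pred), dx(target)) + mse_loss(dy(pred), dy(target))

def total_loss(pred, target, lam1=1.0, lam2=0.1):
    """L_total = lam1 * L1 + lam2 * L2 (lam1, lam2 are assumed values)."""
    return lam1 * mse_loss(pred, target) + lam2 * edge_loss(pred, target)
```

For identical images every term vanishes, so `total_loss` is zero; the gradient term penalises blurred edges even when the pixel-wise MSE is small.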
And fifthly, initializing the multi-scale image defogging network. Specifically, before training, the convolution kernel weights are initialized using a Gaussian distribution.
Step six, preparing a data set. Specifically, this embodiment uses the NYU2 depth dataset to create synthetic haze images: 27256 synthetic haze images of different haze thicknesses, together with the corresponding clear images, generated from 1450 indoor scene images. As one implementation, the training set, validation set and test set are partitioned in a ratio of 8:1:1.
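The 8:1:1 partition can be sketched as follows; the shuffling and rounding policy are assumptions, not specified by the patent:

```python
import numpy as np

def split_811(n, seed=0):
    """Shuffle n sample indices and split them train/val/test = 8:1:1."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_811(27256)   # dataset size from the embodiment
```

Rounding leaves the leftover samples in the test split, so the three parts are disjoint and cover the whole dataset.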
And step seven, training the multi-scale image defogging network with the prepared data set. Specifically, the initial learning rate is set to 0.0001, training is optimized with the Adam optimizer, the batch size is set to 16 and the number of training epochs to 100, until the network converges.
And step eight, defogging haze images with the trained image defogging network and measuring the quality and efficiency of the defogged images. Specifically, the test set is fed into the trained defogging network model, which reconstructs defogged versions of the haze images. Each defogged picture is compared with the corresponding clear image; structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) are adopted as objective metrics to evaluate the merits of the defogging algorithm, and the defogging time is measured to demonstrate that the defogging efficiency reaches real-time levels.
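PSNR, one of the two metrics, can be computed as in this sketch (SSIM involves local statistics and is omitted here; `max_val=1.0` assumes images scaled to [0, 1]):

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform error of 0.1 per pixel gives an MSE of 0.01 and hence a PSNR of 20 dB.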
The specific parameter configuration of the multi-scale network based on the structural re-parameterization module is shown in table 1.
Table 1 network architecture parameter configuration
The method is compared with the traditional physical-model-based DCP defogging algorithm and the deep-learning-based MSCNN and AOD defogging algorithms, and performs better than all three. On the NYU2 depth dataset, the model's PSNR after defogging exceeds that of the DCP algorithm by 5.0% and its SSIM by 10%; compared with MSCNN, the PSNR of the defogged image is 3.5% higher and the SSIM 7% higher. Compared with AOD-Net, the model's PSNR is 1%-1.7% higher and its SSIM 0.5%-1% higher. Beyond this quantitative comparison, the defogging capability is also assessed qualitatively: the visual results on the NYU2 dataset are compared with the clear pictures in fig. 4, and, to measure generalization, a set of outdoor scenes is also defogged, with visual results shown in fig. 5.
The invention is further compared with the traditional physical-model-based DCP defogging algorithm and related lightweight deep-learning algorithms, such as the AOD defogging algorithm and the DCPDN algorithm. The comparison test was carried out on a workstation with an Nvidia Titan XP graphics card, with the experimental results shown in Table 2:
table 2 run times for four different models
As can be seen from Table 2, the defogging time of the traditional DCP algorithm is 1.62 s; among deep-learning-based algorithms, the high-performance AOD-Net takes 4.5 ms and DCPDN takes 41.7 ms, while the proposed method takes 7.6 ms. The method is thus of the same order of magnitude as AOD-Net and meets the real-time defogging requirement, while its defogging quality is better than that of the AOD-Net defogging algorithm.
On the basis of the above embodiment, as shown in fig. 6, the present invention further provides a multi-scale real-time defogging device based on structural re-parameterization, including:
the first network construction unit is used for constructing a structure re-parameterization module;
the second network construction unit is used for constructing a multi-scale image defogging network based on structural re-parameterization; the multi-scale image defogging network comprises a structural re-parameterization module and a K-estimation image reconstruction module;
the third network construction unit is used for defogging the haze image by adopting the K-estimation image reconstruction module;
the loss function construction unit is used for defining a composite loss function of the multi-scale image defogging network based on structural re-parameterization;
the network initialization unit is used for initializing a multi-scale image defogging network;
a data set construction unit for preparing a data set;
a network training unit for training a multi-scale image defogging network using the prepared data set;
and the defogging unit is used for defogging the haze image by using the trained multi-scale image defogging network and detecting the quality and efficiency of the defogged image.
Further, the structural re-parameterization module has different structures during network training and during inference; during training it has a plurality of different branches, which are converted into a single-branch structure through an identity transformation for inference, and the converted single-branch structure is used to perform equivalent inference.
Further, during training the structural re-parameterization module comprises an identity mapping branch, a 1×1 convolution layer, a 3×3 convolution layer and a 5×5 convolution layer; the identity mapping branch, the 1×1 convolution layer and the 3×3 convolution layer are represented as 5×5 convolution layers through zero padding, and the four branches are merged into a single-branch 5×5 convolution layer through element-wise addition; the converted structural re-parameterization module has only one branch, consisting of a 5×5 convolution layer and a nonlinear ReLU activation layer.
Further, the multi-scale image defogging network comprises three feature extraction modules of different scales and a K-estimation image reconstruction module, wherein each feature extraction module consists of a 3×3 convolution layer and two structural re-parameterization modules.
Further, the K-estimation module combines the global atmospheric light value and the transmission matrix into a single parameter K by mathematical transformation: K(x) = ((I(x) − A)/t(x) + (A − b)) / (I(x) − 1), where t(x) denotes the transmission matrix, A denotes the global atmospheric light value, I(x) = J(x)t(x) + A(1 − t(x)) is the atmospheric degradation model, b is a constant bias with a default value of 1, and the defogged image is recovered as J(x) = K(x)I(x) − K(x) + b.
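The algebra behind the K-estimation module can be verified numerically: substituting K(x) into J(x) = K(x)I(x) − K(x) + b recovers the clear image exactly from a hazy image synthesized with the atmospheric degradation model. A small NumPy sketch (variable names are illustrative; in the actual network K is predicted rather than computed from known t and A):

```python
import numpy as np

def k_from_physics(I, t, A, b=1.0):
    """K(x) = ((I(x) - A) / t(x) + (A - b)) / (I(x) - 1)."""
    return ((I - A) / t + (A - b)) / (I - 1.0)

def recover(I, K, b=1.0):
    """J(x) = K(x) * I(x) - K(x) + b."""
    return K * I - K + b

rng = np.random.default_rng(1)
J = rng.uniform(0.0, 1.0, (4, 4))   # clear scene radiance
t = rng.uniform(0.2, 0.9, (4, 4))   # transmission matrix
A = 0.8                             # global atmospheric light value
I = J * t + A * (1.0 - t)           # atmospheric degradation model

K = k_from_physics(I, t, A)
print(np.allclose(recover(I, K), J))  # True
```

Folding t(x) and A into one estimated quantity is what lets the network restore J(x) in a single step instead of estimating transmission and atmospheric light separately.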
Further, in the loss function construction unit, the composite loss function is composed of a mean square error loss function and an edge perception loss function.
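The patent does not give the exact form of the edge perception term or its weight; a common choice, assumed here purely for illustration, penalizes differences between Sobel gradient maps of the restored and ground-truth images and adds that to the mean squared error:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv_same(img, k):
    """'Same' cross-correlation with zero padding (3x3 kernels)."""
    kk = k.shape[0]
    pad = np.pad(img, kk // 2)
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(pad[i:i + kk, j:j + kk] * k)
    return out

def edge_loss(pred, target):
    """Mean squared difference between Sobel gradient maps (assumed form)."""
    return np.mean((conv_same(pred, SOBEL_X) - conv_same(target, SOBEL_X)) ** 2
                   + (conv_same(pred, SOBEL_Y) - conv_same(target, SOBEL_Y)) ** 2)

def composite_loss(pred, target, w_edge=0.1):
    """MSE plus a weighted edge term; w_edge is an illustrative weight."""
    mse = np.mean((pred - target) ** 2)
    return mse + w_edge * edge_loss(pred, target)

x = np.linspace(0, 1, 36).reshape(6, 6)
print(composite_loss(x, x))             # 0.0 for identical images
print(composite_loss(x + 0.05, x) > 0)  # any perturbation increases the loss
```

The edge term encourages the network to preserve structural boundaries that pure MSE tends to blur.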
Further, in the network initialization unit, Gaussian distribution initialization is used to initialize the parameters of the convolution kernels.
Further, the data set construction unit includes:
synthetic haze images are created using the NYU2 depth dataset, and the dataset is divided into a training set, a validation set and a test set in proportion.
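The exact split proportions are not stated in the text; the sketch below assumes an 8:1:1 ratio for illustration and performs a seeded shuffle-and-split over a list of sample identifiers:

```python
import random

def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and split items into train/val/test by the given proportions.

    The 8:1:1 ratio here is an assumption, not taken from the patent.
    """
    assert abs(sum(ratios) - 1.0) < 1e-9
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```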
Further, the network training unit includes:
setting an initial learning rate, adopting an Adam optimizer, and setting the number of batch images and training epochs, until the network converges.
In summary, the invention provides a multi-scale real-time defogging method and device based on structural re-parameterization for the field of single-image defogging. The structural re-parameterization module equivalently converts the multi-branch structure used in training into a single-branch structure used in inference: the multi-branch structure is used during training to improve the fitting capacity of the network, and the single-branch structure is used during inference to reduce the computational cost of the network.
A K-estimation image reconstruction module is added to the multi-scale network to defog the haze image. The three features of different scales generated in the three feature extraction stages are fed into two up-sampling convolution layers and, after fusion, are input into the K-estimation module so as to capture more key underlying structural information and higher-level semantic information. The K-estimation module is a transformed structure based on the atmospheric scattering model; through it, the physical characteristics of haze weather contained in the image can be better learned, so as to recover defogged pictures of higher quality.
The present invention is trained and tested on the NYU2 depth dataset. Experimental results show that the defogging quality of the model is superior to that of mainstream deep-learning-based defogging algorithms, while the inference speed of the network reaches real-time levels. In addition, the lightweight network model can be conveniently embedded into computer-vision-based systems such as aerial photography, automatic navigation and real-time monitoring.
The foregoing is merely illustrative of the preferred embodiments of this invention, and it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of this invention, and it is intended to cover such modifications and changes as fall within the true scope of the invention.

Claims (10)

1. A multi-scale real-time defogging method based on structural re-parameterization, which is characterized by comprising the following steps:
step 1: constructing a structural re-parameterization module;
step 2: constructing a multi-scale image defogging network based on structural re-parameterization; the multi-scale image defogging network comprises a structural re-parameterization module and a K-estimation image reconstruction module;
step 3: defogging the haze image by adopting the K-estimation image reconstruction module;
step 4: defining a composite loss function of a multi-scale image defogging network based on structural re-parameterization;
step 5: initializing a multi-scale image defogging network;
step 6: preparing a data set;
step 7: training a multi-scale image defogging network by using the prepared data set;
step 8: and defogging the haze image by using a trained multi-scale image defogging network, and detecting the quality and efficiency of the defogged image.
2. The multi-scale real-time defogging method based on structural re-parameterization according to claim 1, wherein the structural re-parameterization module has different structures during network training and during inference; during training it has a plurality of different branches, and during inference the multi-branch training structure is converted into a single-branch structure through an identity transformation, the converted single-branch structure being used to perform equivalent inference.
3. The multi-scale real-time defogging method based on structural re-parameterization according to claim 1, wherein during training the structural re-parameterization module comprises an identity mapping branch, a 1×1 convolution layer, a 3×3 convolution layer and a 5×5 convolution layer; the identity mapping branch, the 1×1 convolution layer and the 3×3 convolution layer are each represented as a 5×5 convolution layer through zero padding, and the four branches are converted into a single-branch 5×5 convolution layer through an element-wise addition operation; the converted structural re-parameterization module has only one branch, consisting of a 5×5 convolution layer and a nonlinear ReLU activation function layer.
4. The multi-scale real-time defogging method based on structural re-parameterization according to claim 1, wherein the multi-scale image defogging network comprises three feature extraction modules of different scales and a K-estimation image reconstruction module, wherein each feature extraction module consists of a 3×3 convolution layer and two structural re-parameterization modules.
5. The method of claim 1, wherein the K-estimation module combines the global atmospheric light value and the transmission matrix into a single parameter K by mathematical transformation: K(x) = ((I(x) − A)/t(x) + (A − b)) / (I(x) − 1), where t(x) denotes the transmission matrix, A denotes the global atmospheric light value, I(x) = J(x)t(x) + A(1 − t(x)) is the atmospheric degradation model, b is a constant bias with a default value of 1, and the defogged image is recovered as J(x) = K(x)I(x) − K(x) + b.
6. The multi-scale real-time defogging method based on structural re-parameterization according to claim 1, wherein in the step 4, the composite loss function is composed of a mean square error loss function and an edge perception loss function.
7. The multi-scale real-time defogging method based on structural re-parameterization according to claim 1, wherein in the step 5, the convolution kernel parameters are initialized using Gaussian distribution initialization.
8. A multi-scale real-time defogging method based on structural re-parameterization according to claim 1, wherein said step 6 comprises:
synthetic haze images are created using the NYU2 depth dataset, and the dataset is divided into a training set, a validation set and a test set in proportion.
9. A multi-scale real-time defogging method based on structural re-parameterization according to claim 1, wherein said step 7 comprises:
setting an initial learning rate, adopting an Adam optimizer, and setting the number of batch images and training epochs, until the network converges.
10. A multi-scale real-time defogging device based on structural re-parameterization, comprising:
the first network construction unit is used for constructing a structure re-parameterization module;
the second network construction unit is used for constructing a multi-scale image defogging network based on structural re-parameterization; the multi-scale image defogging network comprises a structural re-parameterization module and a K-estimation image reconstruction module;
the third network construction unit is used for defogging the haze image by adopting the K-estimation image reconstruction module;
the loss function construction unit is used for defining a composite loss function of the multi-scale image defogging network based on structural re-parameterization;
the network initialization unit is used for initializing a multi-scale image defogging network;
a data set construction unit for preparing a data set;
a network training unit for training a multi-scale image defogging network using the prepared data set;
and the defogging unit is used for defogging the haze image by using the trained multi-scale image defogging network and detecting the quality and efficiency of the defogged image.
CN202310223074.0A 2023-03-09 2023-03-09 Multi-scale real-time defogging method and device based on structural re-parameterization Pending CN116542865A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310223074.0A CN116542865A (en) 2023-03-09 2023-03-09 Multi-scale real-time defogging method and device based on structural re-parameterization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310223074.0A CN116542865A (en) 2023-03-09 2023-03-09 Multi-scale real-time defogging method and device based on structural re-parameterization

Publications (1)

Publication Number Publication Date
CN116542865A true CN116542865A (en) 2023-08-04

Family

ID=87454911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310223074.0A Pending CN116542865A (en) 2023-03-09 2023-03-09 Multi-scale real-time defogging method and device based on structural re-parameterization

Country Status (1)

Country Link
CN (1) CN116542865A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117786823A (en) * 2024-02-26 2024-03-29 陕西天润科技股份有限公司 Light weight processing method based on building monomer model
CN117786823B (en) * 2024-02-26 2024-05-03 陕西天润科技股份有限公司 Light weight processing method based on building monomer model

Similar Documents

Publication Publication Date Title
Dudhane et al. RYF-Net: Deep fusion network for single image haze removal
Dudhane et al. C^ 2msnet: A novel approach for single image haze removal
CN112184577B (en) Single image defogging method based on multiscale self-attention generation countermeasure network
CN110570363A (en) Image defogging method based on Cycle-GAN with pyramid pooling and multi-scale discriminator
CN111539247B (en) Hyper-spectrum face recognition method and device, electronic equipment and storage medium thereof
CN104217404A (en) Video image sharpness processing method in fog and haze day and device thereof
CN110148088B (en) Image processing method, image rain removing method, device, terminal and medium
CN112581409B (en) Image defogging method based on end-to-end multiple information distillation network
CN112241939B (en) Multi-scale and non-local-based light rain removal method
CN111539246B (en) Cross-spectrum face recognition method and device, electronic equipment and storage medium thereof
CN112381733B (en) Image recovery-oriented multi-scale neural network structure searching method and network application
CN107590779A (en) A kind of image denoising deblurring method based on image block cluster dictionary training
CN112149526B (en) Lane line detection method and system based on long-distance information fusion
CN115311186B (en) Cross-scale attention confrontation fusion method and terminal for infrared and visible light images
CN116542865A (en) Multi-scale real-time defogging method and device based on structural re-parameterization
CN116757986A (en) Infrared and visible light image fusion method and device
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
CN111598793A (en) Method and system for defogging image of power transmission line and storage medium
CN113628143A (en) Weighted fusion image defogging method and device based on multi-scale convolution
CN114155165A (en) Image defogging method based on semi-supervision
CN117392065A (en) Cloud edge cooperative solar panel ash covering condition autonomous assessment method
CN116385293A (en) Foggy-day self-adaptive target detection method based on convolutional neural network
CN115631108A (en) RGBD-based image defogging method and related equipment
CN115937048A (en) Illumination controllable defogging method based on non-supervision layer embedding and vision conversion model
CN115393901A (en) Cross-modal pedestrian re-identification method and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination