CN111915531B - Neural network image defogging method based on multi-level feature fusion and attention guidance


Info

Publication number
CN111915531B
CN111915531B
Authority
CN
China
Prior art keywords
image
convolution
feature
attention
features
Prior art date
Legal status
Active
Application number
CN202010781155.9A
Other languages
Chinese (zh)
Other versions
CN111915531A (en)
Inventor
张笑钦
王涛
徐曰旺
赵丽
Current Assignee
Wenzhou University
Original Assignee
Wenzhou University
Priority date
Filing date
Publication date
Application filed by Wenzhou University
Priority to CN202010781155.9A
Publication of CN111915531A
Application granted
Publication of CN111915531B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/73 - Deblurring; Sharpening
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/048 - Activation functions
    • G06N3/08 - Learning methods
    • Y02A90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
    • Y02T10/40 - Engine management systems


Abstract

The invention discloses a neural network image defogging method based on multi-level feature fusion and attention guidance, which comprises the following steps: constructing an image defogging model; acquiring hazy image data and extracting feature maps representing different stages through a feature extraction module; fusing the feature maps obtained at different stages by point-wise element multiplication in the multi-level feature fusion module of the defogging model, using the complementarity of low-level and high-level features to guide the network to better recover a clear image; reconstructing the features produced by the multi-level feature fusion module into a clear, haze-free image through a residual hybrid attention module; and computing the mean square error and perceptual loss between the restored image and the corresponding clear image to update the image defogging model, with the mean square error loss function and the perceptual loss function cooperatively optimizing the defogging model. With this technical scheme, defogging enhancement is performed on real captured hazy images, high-quality images are recovered, and the method has good practicality.

Description

Neural network image defogging method based on multi-level feature fusion and attention guidance
Technical Field
The invention relates to the technical field of image processing, in particular to a neural network image defogging method based on multi-level feature fusion and attention guidance.
Background
Low visibility in severe weather (heavy fog, heavy rain) is a major problem for most computer vision techniques applied to real scenes. Most automatic monitoring, autonomous driving, and outdoor target recognition systems assume that the incoming video and images have clear visibility. However, this ideal condition is often not satisfied, so enhancing low-quality images and video is an unavoidable task. Among such tasks, image defogging is a representative image quality enhancement problem. The process by which a clear image becomes foggy can be described by the atmospheric light scattering model proposed by McCartney et al.:
I = tJ + A(1 - t),
t(x) = e^(-βd(x)),
wherein I is the foggy image, t is the medium transmission, J is the clear image, A is the atmospheric light, d is the depth of the imaged object, and β is the atmospheric scattering coefficient. In this model, I is the known quantity; the objective of the image defogging task is to estimate A and t and then generate a sharp image, which makes image defogging an ill-posed problem. Over the past 20 years, researchers have developed many image defogging algorithms to process images taken in complex foggy scenes. Early algorithms primarily focused on estimating depth information from multiple images and atmospheric cues to achieve defogging. For example, Narasimhan et al. proposed a physics-based method to locate depth discontinuities and compute scene structure from two images of the same scene captured under different weather conditions. In addition, a series of algorithms enhance defogging with prior information; the most typical is the dark channel prior (DCP, Dark Channel Prior) method proposed by He et al. in 2009, a prior based on the observation and statistics that in most non-sky local regions of a hazy image, some pixels always have at least one color channel with a very low value. Similarly, Zhu Qingsong et al. proposed the color attenuation prior (CAP, Color Attenuation Prior). A clear image is then recovered through the atmospheric scattering model using the prior-estimated t. These priors improve the defogging performance of models to some extent; however, different priors rely on estimating a particular characteristic of the image and are often unsuitable for real scenes.
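As an illustration of this forward model, the following minimal NumPy sketch synthesizes a hazy image from a clear image and a depth map; the scalar atmospheric light A, the scattering coefficient beta, and the function name are illustrative assumptions, not part of the patent:

```python
import numpy as np

def synthesize_haze(J, d, A=0.8, beta=1.0):
    """Apply the atmospheric scattering model I = t*J + A*(1 - t),
    with transmission t(x) = exp(-beta * d(x)).

    J    : clear image as a float array in [0, 1], shape (H, W, 3)
    d    : scene depth map, shape (H, W)
    A    : global atmospheric light (a scalar here for simplicity)
    beta : atmospheric scattering coefficient
    """
    t = np.exp(-beta * d)[..., None]  # per-pixel transmission, broadcast over RGB
    I = t * J + A * (1.0 - t)         # hazy observation
    return np.clip(I, 0.0, 1.0)
```

The defogging task described below is the inverse problem: only I is observed while A and t are unknown, which is what makes the problem ill-posed.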
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to provide a neural network image defogging method based on multi-level feature fusion and attention guidance that performs defogging enhancement on real captured hazy images and recovers high-quality images.
In order to achieve the above purpose, the present invention provides the following technical solution: a neural network image defogging method based on multi-level feature fusion and attention guidance, comprising the following steps:
s1, constructing an image defogging model; the image defogging model comprises a feature extraction module, a multi-level feature fusion module and a residual mixed convolution attention module;
s2, acquiring foggy image data, and firstly converting a foggy image into 16 feature images through a convolution layer; then, the feature graphs are processed through four stages of a feature extraction module to obtain features of different layers;
s3, a multi-level feature fusion module fuses feature graphs obtained at different stages in a point-by-point element multiplication mode, and a network is guided to better recover a clear image by utilizing complementarity of low-level features and high-level features;
s4, the characteristics generated by the multi-level characteristic fusion module are subjected to residual mixed convolution attention module to obtain a weight graph with the same size as the input elements, the weight graph obtained from an attention layer designed based on an attention mechanism guides a network to discard redundant information, the characteristic information effective for restoring a clear graph is focused, meanwhile, the training and operation efficiency of the residual mixed convolution attention module can be improved through the depth separable convolution operation adopted in the residual mixed convolution attention module, and the characteristics are finally reconstructed into clear haze-free images after passing through the residual mixed convolution attention module;
s5, calculating the mean square error and the perception loss of the restored image and the corresponding clear image, and updating an image defogging model; the method comprises the steps of measuring deviation between a restored image and a corresponding clear image by means of a mean square error, enabling a perception loss help model to perceive the image from a higher dimension, enabling the restored image to be more true, and enabling two loss functions of the mean square error and the perception loss to cooperate and jointly optimize a defogging model.
Preferably, step S5 specifically includes:
calculating the mean square error and the perceptual loss between the restored image and the corresponding clear image; the first loss function is the mean square error loss:

L_mse = (1 / (3WH)) Σ_{c=1}^{3} Σ_{i=1}^{W} Σ_{j=1}^{H} (I_re(i, j, c) - I_gt(i, j, c))^2,

where W and H denote the width and height of the image, I_re and I_gt are the restored image and the corresponding clear image, i and j index pixel positions in the image, and c indexes the RGB channels, ranging from 1 to 3;

the second is the perceptual loss function, which uses a VGG16 network pre-trained on the ImageNet dataset (VGG16 has 13 convolutional layers, divided into 5 stages), extracting features from the last convolutional layer of each of the first three stages and computing the difference:

L_per = Σ_{k=1}^{3} (1 / (C_k W_k H_k)) ||φ_k(I_re) - φ_k(I_gt)||^2,

where {φ_k(·), k = 1, 2, 3} denote the feature extractors corresponding to the VGG16 convolutional layers Conv1-2, Conv2-2, and Conv3-3, and C_k, W_k, and H_k are the channel count, width, and height of the features produced by φ_k(·);
The total defogging model loss function is:

L = L_mse + α · L_per,

where α is a parameter that balances the two loss functions.
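For concreteness, here is a PyTorch sketch of the two losses under stated assumptions: a torchvision recent enough to expose VGG16_Weights, Conv1-2, Conv2-2, and Conv3-3 sitting at indices 2, 7, and 14 of vgg16().features, a squared-error form for the perceptual term, and a placeholder value for α; input normalization to ImageNet statistics is omitted for brevity:

```python
import torch.nn as nn
from torchvision import models

class PerceptualLoss(nn.Module):
    """Feature-space loss over VGG16 stages ending at Conv1-2, Conv2-2, Conv3-3
    (indices 2, 7, 14 of torchvision's vgg16().features)."""
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
        # phi_1..phi_3: consecutive slices, each ending at a stage's last conv.
        self.stages = nn.ModuleList([vgg[:3], vgg[3:8], vgg[8:15]])
        for p in self.parameters():
            p.requires_grad_(False)  # VGG16 stays frozen

    def forward(self, restored, sharp):
        loss = 0.0
        for stage in self.stages:
            restored, sharp = stage(restored), stage(sharp)
            # mse_loss averages over C_k * W_k * H_k, matching the 1/(C_k W_k H_k) factor
            loss = loss + nn.functional.mse_loss(restored, sharp)
        return loss

def total_loss(restored, sharp, perceptual, alpha=0.04):
    """L = L_mse + alpha * L_per; the value of alpha is a placeholder, not the patent's."""
    return nn.functional.mse_loss(restored, sharp) + alpha * perceptual(restored, sharp)
```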
Preferably, step S2 specifically includes:
feature extraction starts with a 3 x 3 convolution layer that converts the given input hazy image into 16 feature maps;
then the feature maps are processed through the following four stages to obtain features of different levels; each stage comprises four layers: the first layer is a 3 x 3 convolution with stride 2, which halves the resolution of the feature map and doubles its width (channel count); the second and third layers each comprise a 3 x 3 convolution, a ReLU activation function, and another 3 x 3 convolution; the fourth layer is a 1 x 1 convolution that reduces the width of the features produced by the third layer to 64 as the output of each stage.
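A PyTorch sketch of this extractor follows. The padding values and the assumption that each stage passes its full-width (pre-1 x 1) features to the next stage, while the 64-channel compressed map is the stage output, are inferred rather than stated in the patent:

```python
import torch.nn as nn

class Stage(nn.Module):
    """One of the four extraction stages of step S2."""
    def __init__(self, in_ch):
        super().__init__()
        mid = in_ch * 2  # the stride-2 conv doubles the width
        self.down = nn.Conv2d(in_ch, mid, 3, stride=2, padding=1)  # layer 1: halve resolution
        self.block1 = nn.Sequential(  # layer 2: conv - ReLU - conv
            nn.Conv2d(mid, mid, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1))
        self.block2 = nn.Sequential(  # layer 3: conv - ReLU - conv
            nn.Conv2d(mid, mid, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1))
        self.compress = nn.Conv2d(mid, 64, 1)  # layer 4: 1x1 conv down to 64 channels

    def forward(self, x):
        x = self.block2(self.block1(self.down(x)))
        return x, self.compress(x)  # (features for the next stage, 64-channel stage output)

class FeatureExtractor(nn.Module):
    """Head conv (3 -> 16 feature maps) followed by the four stages."""
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(3, 16, 3, padding=1)
        self.stages = nn.ModuleList(Stage(16 * 2 ** i) for i in range(4))

    def forward(self, img):
        x, outs = self.head(img), []
        for stage in self.stages:
            x, out = stage(x)
            outs.append(out)
        return outs  # four 64-channel maps at 1/2, 1/4, 1/8, 1/16 resolution
```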
Preferably, in step S3, the multi-level feature fusion module contains three feature fusion blocks arranged from top to bottom. The first fusion block fuses the high-level features (the feature map output by the fourth convolution-activation function-convolution combination) with lower-level features (the feature map output by the third convolution-activation function-convolution combination); the fused features are treated as new high-level features, which the second fusion block then fuses with the mid-level features in the feature map output by the second convolution-activation function-convolution combination. Finally, the features obtained by the second fusion block are treated as high-level features and fused, through the third fusion block, with the low-level features in the feature map output by the first convolution-activation function-convolution combination.
For each feature fusion block, given high-level and low-level features, element-wise multiplication realizes the fusion between features. The fused features then pass through a convolution layer, batch normalization, and a ReLU activation function before being processed by the next fusion block.
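A PyTorch sketch of one fusion block follows. Bilinear upsampling of the high-level map to the low-level resolution is an assumption added so that element-wise multiplication is well-defined; the patent does not specify how resolutions are matched:

```python
import torch.nn as nn
import torch.nn.functional as F

class FusionBlock(nn.Module):
    """Fuse a high-level feature map with a lower-level one (step S3)."""
    def __init__(self, channels=64):
        super().__init__()
        self.refine = nn.Sequential(  # conv -> batch norm -> ReLU after fusion
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True))

    def forward(self, high, low):
        # Match resolutions (assumed), then fuse by element-wise multiplication.
        high = F.interpolate(high, size=low.shape[2:], mode="bilinear",
                             align_corners=False)
        return self.refine(high * low)

# Top-down cascade over the four stage outputs f1 (lowest level) .. f4 (highest):
#   fused = f4
#   for low in (f3, f2, f1):
#       fused = FusionBlock()(fused, low)
```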
Preferably, step S4 specifically includes: the residual hybrid convolution attention module has three consecutive grouped convolution layers followed by an attention layer. The given features are processed by these layers and added to the residual (the module input) to obtain the output features. Grouped convolution splits the input features into groups along the channel dimension (the number of groups is a hyperparameter) and applies a convolution to each group separately. Because of this grouping, the FLOPs (floating-point operations) of the residual hybrid convolution attention module are greatly reduced, improving the training and defogging efficiency of the network. The group numbers of the three grouped convolution layers are 4, 8, and 16 respectively; that is, the input feature map is divided into 4, 8, and 16 groups along the channel dimension for processing. This configuration was determined experimentally.
After the three grouped convolutions, an attention layer is added; it makes the output features reflect the important feature information of the clear image within the input hazy image, so that the network focuses on the clear, haze-free image information worth adopting. The attention mechanism is realized in two steps: the first applies a depthwise convolution, then a ReLU activation function, then a pointwise convolution, then a Sigmoid activation function to obtain the feature weights; the second multiplies the original input features by the obtained weights, yielding a weight map of the same size as the input, which is applied to the input features by element-wise multiplication to output the final features. The weight map obtained from the attention layer guides the network to discard redundant information (haze feature information) and focus on the feature information of the clear, haze-free image, while the depthwise-separable convolution (combining the depthwise and pointwise convolutions) improves the training and inference efficiency of the residual hybrid convolution attention module.
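A PyTorch sketch of the module under stated assumptions: 64-channel features (divisible by all three group counts), 3 x 3 grouped convolutions, and no activations between the grouped layers, none of which the patent fixes explicitly:

```python
import torch.nn as nn

class AttentionLayer(nn.Module):
    """Depthwise conv -> ReLU -> pointwise conv -> Sigmoid produces a weight
    map of the same size as the input, which then rescales the input."""
    def __init__(self, channels=64):
        super().__init__()
        self.weight = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),  # depthwise
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),                              # pointwise
            nn.Sigmoid())

    def forward(self, x):
        return x * self.weight(x)  # element-wise reweighting of the input features

class ResidualHybridConvAttention(nn.Module):
    """Three grouped convolutions (groups 4, 8, 16) followed by the attention
    layer, with a residual connection around the whole module."""
    def __init__(self, channels=64):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=4),
            nn.Conv2d(channels, channels, 3, padding=1, groups=8),
            nn.Conv2d(channels, channels, 3, padding=1, groups=16))
        self.attn = AttentionLayer(channels)

    def forward(self, x):
        return x + self.attn(self.convs(x))  # residual addition
```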
Compared with the prior art, the invention has the following beneficial effects:
1. the invention provides a multi-level feature fusion module that can adaptively adopt features of different levels and recover clear images by exploiting the complementarity between those features;
2. the invention develops a residual hybrid convolution attention module with an attention layer; the hybrid convolution operation improves the efficiency of network operation, and the attention block focuses the model on more important information;
3. the invention also proposes using the mean square error loss and the perceptual loss function to cooperatively guide the defogging model toward better defogging performance. The mean square error measures the deviation between the restored image and the corresponding clear image, while the perceptual loss helps the model perceive the image from a higher dimension, restoring a more realistic clear image.
The invention is further described below with reference to the drawings and specific examples.
Drawings
FIG. 1 is a defogging flow chart according to an embodiment of the present invention;
FIG. 2 is an application scenario diagram of an embodiment of the present invention;
FIG. 3 is an application scenario diagram of a core component residual hybrid convolution module in the model of FIG. 2;
FIG. 4 is an application scenario diagram of the attention layer of the core component of the model of FIG. 3;
FIG. 5 is an effect diagram of the restored image in the image defogging model of FIG. 2 compared with other methods.
Detailed Description
Referring to FIGS. 1 to 5, the neural network image defogging method based on multi-level feature fusion and attention guidance disclosed by the invention comprises the following steps:
s1, constructing an image defogging model; the image defogging model comprises a feature extraction module, a multi-level feature fusion module and a residual mixed convolution attention module;
the specific process is that an image defogging model is constructed as shown in fig. 2. The image defogging model comprises a feature extraction module (shown in figure 2), a multi-level feature fusion module (shown in figure 2) and a residual mixed convolution attention module (shown in figure 2);
s2, acquiring foggy image data, and firstly converting a foggy image into 16 feature images through a convolution layer; then, the feature graphs are processed through four stages of a feature extraction module to obtain features of different layers;
s3, a multi-level feature fusion module fuses feature graphs obtained at different stages in a point-by-point element multiplication mode, and a network is guided to better recover a clear image by utilizing complementarity of low-level features and high-level features;
s4, the characteristics generated by the multi-level characteristic fusion module are subjected to residual mixed convolution attention module to obtain a weight graph with the same size as the input elements, the weight graph obtained from an attention layer designed based on an attention mechanism guides a network to discard redundant information, the characteristic information effective for restoring a clear graph is focused, meanwhile, the training and operation efficiency of the residual mixed convolution attention module can be improved through the depth separable convolution operation adopted in the residual mixed convolution attention module, and the characteristics are finally reconstructed into clear haze-free images after passing through the residual mixed convolution attention module;
s5, calculating the mean square error and the perception loss of the restored image and the corresponding clear image, and updating an image defogging model; the method comprises the steps of measuring deviation between a restored image and a corresponding clear image by means of a mean square error, enabling a perception loss help model to perceive the image from a higher dimension, enabling the restored image to be more true, and enabling two loss functions of the mean square error and the perception loss to cooperate and jointly optimize a defogging model.
Preferably, step S5 specifically includes:
calculating the mean square error and the perceptual loss between the restored image and the corresponding clear image; the first loss function is the mean square error loss:

L_mse = (1 / (3WH)) Σ_{c=1}^{3} Σ_{i=1}^{W} Σ_{j=1}^{H} (I_re(i, j, c) - I_gt(i, j, c))^2,

where W and H denote the width and height of the image, I_re and I_gt are the restored image and the corresponding clear image, i and j index pixel positions in the image, and c indexes the RGB channels, ranging from 1 to 3;

the second is the perceptual loss function, which uses a VGG16 pre-trained on the ImageNet dataset (VGG16 has 13 convolutional layers, divided into 5 stages), extracting features from the last convolutional layer of each of the first three stages and computing the difference:

L_per = Σ_{k=1}^{3} (1 / (C_k W_k H_k)) ||φ_k(I_re) - φ_k(I_gt)||^2,

where {φ_k(·), k = 1, 2, 3} denote the feature extractors corresponding to the VGG16 convolutional layers Conv1-2, Conv2-2, and Conv3-3, and C_k, W_k, and H_k are the channel count, width, and height of the features produced by φ_k(·);
The total defogging model loss function is:

L = L_mse + α · L_per,

where α is a parameter that balances the two loss functions.
Preferably, step S2 specifically includes: a hazy picture is obtained; unlike the feature extractors of other methods, the feature extractor here requires no pre-training and is lightweight;
feature extraction starts with a 3 x 3 convolution layer that converts the given input hazy image into 16 feature maps;
then the feature maps are processed through the following four stages to obtain features of different levels; each stage comprises four layers: the first layer is a 3 x 3 convolution with stride 2, which halves the resolution of the feature map and doubles its width (channel count); the second and third layers each comprise a 3 x 3 convolution, a ReLU activation function, and another 3 x 3 convolution; the fourth layer is a 1 x 1 convolution that reduces the width of the features produced by the third layer to 64 as the output of each stage.
Preferably, in step S3, the multi-level feature fusion module contains three feature fusion blocks arranged from top to bottom; the first fusion block fuses the high-level features (the feature map output by the fourth convolution-activation function-convolution combination) with lower-level features (the feature map output by the third convolution-activation function-convolution combination), and the fused features are treated as new high-level features, which the second fusion block then fuses with the mid-level features in the feature map output by the second convolution-activation function-convolution combination. Finally, the features obtained by the second fusion block are treated as high-level features and fused, through the third fusion block, with the low-level features in the feature map output by the first convolution-activation function-convolution combination.
For each feature fusion block, given high-level and low-level features, element-wise multiplication realizes the fusion between features. The fused features then pass through a convolution layer, batch normalization, and a ReLU activation function before being processed by the next fusion block.
Preferably, step S4 specifically includes: the residual hybrid convolution attention module has three consecutive grouped convolution layers followed by an attention layer. The given features are processed by these layers and added to the residual (the module input) to obtain the output features. Grouped convolution splits the input features into groups along the channel dimension (the number of groups is a hyperparameter) and applies a convolution to each group separately. Because of this grouping, the FLOPs (floating-point operations) of the residual hybrid convolution attention module are greatly reduced, improving the training and defogging efficiency of the network. The group numbers of the three grouped convolution layers are 4, 8, and 16 respectively; that is, the input feature map is divided into 4, 8, and 16 groups along the channel dimension for processing. This configuration was determined experimentally.
After the three grouped convolutions, an attention layer is added; it makes the output features reflect the important feature information of the clear image within the input hazy image, so that the network focuses on the clear, haze-free image information worth adopting. The attention mechanism is realized in two steps: the first applies a depthwise convolution, then a ReLU activation function, then a pointwise convolution, then a Sigmoid activation function to obtain the feature weights; the second multiplies the original input features by the obtained weights, yielding a weight map of the same size as the input, which is applied to the input features by element-wise multiplication to output the final features. The weight map obtained from the attention layer guides the network to discard redundant information (haze feature information) and focus on the feature information of the clear, haze-free image, while the depthwise-separable convolution (combining the depthwise and pointwise convolutions) improves the training and inference efficiency of the residual hybrid convolution attention module.
In actual application, a hazy image is first input into the feature extraction module, which uses the convolution layer-activation function-convolution layer combinations at its four stages to effectively extract features at four different levels of the image;
secondly, the four extracted feature maps are input into the multi-level feature fusion module, which multiplies features of different levels element by element, using the complementarity of low-level and high-level features to help the network better recover a clear image;
then the residual hybrid convolution attention module processes the features produced by the multi-level feature fusion module to obtain a weight map of the same size as the input. The weight map derived from the attention layer directs the network to discard redundant features and focus attention on more important ones, and the depthwise and pointwise convolution operations employed improve the efficiency of this module. After this module, the features are finally reconstructed into a clear, haze-free image;
finally, the mean square error and the perceptual loss between the restored image and the corresponding clear image are computed, and the image defogging model is updated; the mean square error measures the deviation between the restored image and the corresponding clear image, while the perceptual loss helps the model perceive the image from a higher dimension, restoring a more realistic clear image. The two loss functions cooperate to jointly optimize the defogging model.
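The following minimal training-step sketch shows how the combined loss updates the model; the stand-in network, the Adam optimizer, the learning rate, and the random tensors are placeholders rather than the patent's actual configuration (in practice the perceptual term from the loss sketch above would be added to the loss):

```python
import torch
import torch.nn as nn

# Stand-in defogging network: any nn.Module mapping a hazy image to a
# restored image of the same shape can take its place.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(16, 3, 3, padding=1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # optimizer and lr assumed
mse = nn.MSELoss()

hazy = torch.rand(4, 3, 128, 128)   # stand-in batch of hazy inputs
sharp = torch.rand(4, 3, 128, 128)  # corresponding clear targets

restored = model(hazy)
loss = mse(restored, sharp)  # + alpha * perceptual(restored, sharp) in the full model
optimizer.zero_grad()
loss.backward()
optimizer.step()
```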
The invention has the following beneficial effects:
1. compared with the prior art, the invention provides a multi-level feature fusion module that can adaptively adopt features of different levels and effectively recover clear images from blurred ones by exploiting the complementarity between the features;
2. compared with the prior art, the invention develops a residual hybrid convolution attention module with an attention layer; the hybrid convolution operation improves the efficiency of network operation, and the attention block focuses the model on more important information;
3. the invention also proposes using the mean square error loss and the perceptual loss function to cooperatively guide the defogging model toward better defogging performance. The mean square error measures the deviation between the restored image and the corresponding clear image, while the perceptual loss helps the model perceive the image from a higher dimension, restoring a more realistic clear image.
The foregoing embodiments are provided to further explain the present invention and are not to be construed as limiting its scope. Insubstantial modifications and variations made by those skilled in the art in light of the foregoing teachings remain within the scope of the invention.

Claims (3)

1. A neural network image defogging method based on multi-level feature fusion and attention guidance, characterized in that the method comprises the following steps:
s1, constructing an image defogging model; the image defogging model comprises a feature extraction module, a multi-level feature fusion module and a residual mixed convolution attention module;
s2, acquiring foggy image data, and firstly converting a foggy image into 16 feature images through a convolution layer; then, the feature graphs are processed through four stages of a feature extraction module to obtain features of different layers;
s3, a multi-level feature fusion module fuses feature graphs obtained at different stages in a point-by-point element multiplication mode, and a network is guided to better recover a clear image by utilizing complementarity of low-level features and high-level features;
s4, the characteristics generated by the multi-level characteristic fusion module are subjected to residual mixed convolution attention module to obtain a weight graph with the same size as the input elements, the weight graph obtained from an attention layer designed based on an attention mechanism guides a network to discard redundant information, the characteristic information effective for restoring a clear graph is focused, meanwhile, the training and operation efficiency of the residual mixed convolution attention module can be improved through the depth separable convolution operation adopted in the residual mixed convolution attention module, and the characteristics are finally reconstructed into clear haze-free images after passing through the residual mixed convolution attention module;
s5, calculating the mean square error and the perception loss of the restored image and the corresponding clear image, and updating an image defogging model; the method comprises the steps of measuring deviation between a restored image and a corresponding clear image by means of a mean square error, enabling a perception loss help model to perceive the image from a higher dimension, enabling the restored image to be more true, and enabling two loss functions of the mean square error and the perception loss to cooperate to jointly optimize a defogging model;
in step S3, the multi-level feature fusion module is provided with three feature fusion blocks arranged from top to bottom;
the first feature fusion block fuses the high-level features with lower-level features, and the fused features are regarded as new high-level features; the second feature fusion block then fuses them with the mid-level features in the feature map output by the second convolution-activation function-convolution combination; finally, the features obtained by the second fusion block are regarded as high-level features and fused, through the third fusion block, with the low-level features in the feature map output by the first convolution-activation function-convolution combination;
for each feature fusion block, given high-level and low-level features, element-wise multiplication realizes the fusion between features; the fused features pass through a convolution layer, batch normalization, and a ReLU activation function, and are then processed by the next fusion block;
step S4, specifically comprising:
the residual hybrid convolution attention module is provided with three consecutive grouped convolution layers followed by an attention layer; the given features are processed by these layers and added to the residual to obtain the output features; grouped convolution groups the input features by channel count and applies a convolution operation to each group separately; due to the grouping, the FLOPs of the residual hybrid convolution attention module are greatly reduced, improving the training and defogging efficiency of the network; the group numbers of the grouped convolution layers are 4, 8, and 16 respectively, i.e., the input feature map is divided into 4, 8, and 16 groups along the channel dimension for processing;
after the three grouped convolutions, an attention layer is added; it makes the output features reflect the important feature information of the clear image within the input hazy image, so that the network focuses on the clear, haze-free image information worth adopting; the attention mechanism is realized in two steps: the first applies a depthwise convolution, then a ReLU activation function, then a pointwise convolution, then a Sigmoid activation function to obtain the feature weights; the second multiplies the original input features by the obtained weights, yielding a weight map of the same size as the input, which is applied to the input features by element-wise multiplication to output the final features; the weight map obtained from the attention layer guides the network to discard redundant information and focus on the feature information of the clear, haze-free image, while the depthwise-separable convolution improves the training and inference efficiency of the residual hybrid convolution attention module.
2. The neural network image defogging method based on multi-level feature fusion and attention guidance according to claim 1, characterized in that step S5 specifically comprises:
calculating the mean square error and the perceptual loss between the restored image and the corresponding clear image; the first loss function is the mean square error loss:

L_mse = (1 / (3WH)) Σ_{c=1}^{3} Σ_{i=1}^{W} Σ_{j=1}^{H} (I_re(i, j, c) - I_gt(i, j, c))^2,

where W and H denote the width and height of the image, I_re and I_gt are the restored image and the corresponding clear image, i and j index pixel positions in the image, and c indexes the RGB channels, ranging from 1 to 3;

the second is the perceptual loss function, which uses the last convolutional layer of each of the first three stages of a VGG16 network pre-trained on the ImageNet dataset to extract features and compute the difference:

L_per = Σ_{k=1}^{3} (1 / (C_k W_k H_k)) ||φ_k(I_re) - φ_k(I_gt)||^2,

where {φ_k(·), k = 1, 2, 3} denote the feature extractors corresponding to the VGG16 convolutional layers, and C_k, W_k, and H_k are the channel count, width, and height of the features produced by φ_k(·);
the total defogging model loss function is:

L = L_mse + α · L_per,

where α is a parameter that balances the two loss functions.
3. The neural network image defogging method based on multi-level feature fusion and attention guidance according to claim 2, characterized in that step S2 specifically comprises:
feature extraction starts with a 3 x 3 convolution layer that converts the given input hazy image into 16 feature maps;
then the feature maps are processed through the following four stages to obtain features of different levels; each stage comprises four layers: the first layer is a 3 x 3 convolution with stride 2, which halves the resolution of the feature map and doubles its width (channel count); the second and third layers each comprise a 3 x 3 convolution, a ReLU activation function, and another 3 x 3 convolution; the fourth layer is a 1 x 1 convolution that reduces the width of the features produced by the third layer to 64 as the output of each stage.
CN202010781155.9A 2020-08-06 2020-08-06 Neural network image defogging method based on multi-level feature fusion and attention guidance Active CN111915531B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010781155.9A CN111915531B (en) 2020-08-06 2020-08-06 Neural network image defogging method based on multi-level feature fusion and attention guidance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010781155.9A CN111915531B (en) 2020-08-06 2020-08-06 Neural network image defogging method based on multi-level feature fusion and attention guidance

Publications (2)

Publication Number / Publication Date
CN111915531A - 2020-11-10
CN111915531B - 2023-09-29

Family

ID=73288183

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010781155.9A Active CN111915531B (en) 2020-08-06 2020-08-06 Neural network image defogging method based on multi-level feature fusion and attention guidance

Country Status (1)

Country Link
CN (1) CN111915531B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581409B (en) * 2021-01-05 2024-05-07 戚如嬅耳纹科技(深圳)有限公司 Image defogging method based on end-to-end multiple information distillation network
CN112991201B (en) * 2021-02-18 2024-04-05 西安理工大学 Image defogging method based on color correction and context aggregation residual error network
CN113222016B (en) * 2021-05-12 2022-07-12 中国民航大学 Change detection method and device based on cross enhancement of high-level and low-level features
CN113139922B (en) * 2021-05-31 2022-08-02 中国科学院长春光学精密机械与物理研究所 Image defogging method and defogging device
CN113284070A (en) * 2021-06-16 2021-08-20 河南理工大学 Non-uniform fog image defogging algorithm based on attention transfer mechanism
CN113450273B (en) * 2021-06-18 2022-10-14 暨南大学 Image defogging method and system based on multi-scale multi-stage neural network
CN113344806A (en) * 2021-07-23 2021-09-03 中山大学 Image defogging method and system based on global feature fusion attention network
CN113870126B (en) * 2021-09-07 2024-04-19 深圳市点维文化传播有限公司 Bayer image recovery method based on attention module
CN113689356B (en) * 2021-09-14 2023-11-24 三星电子(中国)研发中心 Image restoration method and device
CN113781363B (en) * 2021-09-29 2024-03-05 北京航空航天大学 Image enhancement method with adjustable defogging effect
CN114022371B (en) * 2021-10-22 2024-04-05 中国科学院长春光学精密机械与物理研究所 Defogging device and defogging method based on space and channel attention residual error network
CN113962901B (en) * 2021-11-16 2022-08-23 中国矿业大学(北京) Mine image dust removing method and system based on deep learning network
CN114283078B (en) * 2021-12-09 2024-06-18 北京理工大学 Self-adaptive fusion image defogging method based on two-way convolutional neural network
CN117853371B (en) * 2024-03-06 2024-05-31 华东交通大学 Multi-branch frequency domain enhanced real image defogging method, system and terminal

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110097519A (en) * 2019-04-28 2019-08-06 暨南大学 Double supervision image defogging methods, system, medium and equipment based on deep learning
AU2020100274A4 (en) * 2020-02-25 2020-03-26 Huang, Shuying DR A Multi-Scale Feature Fusion Network based on GANs for Haze Removal

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110097519A (en) * 2019-04-28 2019-08-06 暨南大学 Double supervision image defogging methods, system, medium and equipment based on deep learning
AU2020100274A4 (en) * 2020-02-25 2020-03-26 Huang, Shuying DR A Multi-Scale Feature Fusion Network based on GANs for Haze Removal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A defogging method based on conditional generative adversarial networks; 贾绪仲; 文志强; 信息与电脑(理论版), No. 09; full text *

Also Published As

Publication number Publication date
CN111915531A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN111915531B (en) Neural network image defogging method based on multi-level feature fusion and attention guidance
US10353271B2 (en) Depth estimation method for monocular image based on multi-scale CNN and continuous CRF
CN110111366B (en) End-to-end optical flow estimation method based on multistage loss
CN111915530B (en) End-to-end-based haze concentration self-adaptive neural network image defogging method
CN112288658A (en) Underwater image enhancement method based on multi-residual joint learning
CN111709895A (en) Image blind deblurring method and system based on attention mechanism
CN111754446A (en) Image fusion method, system and storage medium based on generation countermeasure network
CN110349093B (en) Single image defogging model construction and defogging method based on multi-stage hourglass structure
CN111539888B (en) Neural network image defogging method based on pyramid channel feature attention
CN115223004A (en) Method for generating confrontation network image enhancement based on improved multi-scale fusion
CN114742719A (en) End-to-end image defogging method based on multi-feature fusion
CN111275627A (en) Image snow removing algorithm based on snow model and deep learning fusion
CN111553845A (en) Rapid image splicing method based on optimized three-dimensional reconstruction
CN115115685A (en) Monocular image depth estimation algorithm based on self-attention neural network
CN115035010A (en) Underwater image enhancement method based on convolutional network guided model mapping
CN112419163A (en) Single image weak supervision defogging method based on priori knowledge and deep learning
CN114119694A (en) Improved U-Net based self-supervision monocular depth estimation algorithm
CN112508828A (en) Multi-focus image fusion method based on sparse representation and guided filtering
CN116542865A (en) Multi-scale real-time defogging method and device based on structural re-parameterization
CN112767275B (en) Single image defogging method based on artificial sparse annotation information guidance
CN116228550A (en) Image self-enhancement defogging algorithm based on generation of countermeasure network
CN113870162A (en) Low-light image enhancement method integrating illumination and reflection
CN117994167B (en) Diffusion model defogging method integrating parallel multi-convolution attention
CN115496694B (en) Method for recovering and enhancing underwater image based on improved image forming model
CN116128768B (en) Unsupervised image low-illumination enhancement method with denoising module

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant