CN112884680A - Single image defogging method using end-to-end neural network - Google Patents

Single image defogging method using end-to-end neural network

Info

Publication number
CN112884680A
Authority
CN
China
Prior art keywords
module
network
attention
image
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110326940.XA
Other languages
Chinese (zh)
Inventor
胡彬
顾铭岑
岳壮壮
李金航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University
Priority to CN202110326940.XA
Publication of CN112884680A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a single image defogging method using an end-to-end neural network, which comprises the following steps: constructing a grid attention network model in which the input is an image to be defogged; the image is first fed into a shallow feature extraction convolution layer, then into a GridNet module and an Attention module, and finally the features are passed to a reconstruction part and a global residual learning structure, which output a clear image. The invention combines a grid network with an attention mechanism: in a traditional multi-scale network or encoder-decoder network, the hierarchical structure often causes the information flow to suffer from a bottleneck effect, whereas the grid network avoids this problem by using up-sampling and down-sampling blocks densely connected across different scales. In addition, channel and pixel attention mechanisms give the network extra flexibility in processing different types of information and expand the representational capability of CNNs.

Description

Single image defogging method using end-to-end neural network
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a single image defogging method using an end-to-end neural network.
Background
Haze is a common atmospheric phenomenon produced by small particles such as dust and smoke floating in the air, which strongly absorb and scatter light and thereby reduce image quality. Under the influence of haze, practical applications such as video surveillance, remote sensing and automatic driving are easily compromised, and high-level computer vision tasks such as detection and recognition become difficult to complete. Image defogging (dehazing) has therefore become an increasingly important technology.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a single image defogging method using an end-to-end neural network, which combines a grid network with an attention mechanism, equips the network with channel and pixel attention, and thereby provides extra flexibility for processing different types of information.
To solve the above technical problem, an embodiment of the present invention provides a single image defogging method using an end-to-end neural network, including the following steps:
S1, constructing a grid attention network model: the input is an image to be defogged, which is first fed into a shallow feature extraction convolution layer and then into a GridNet module and an Attention module; finally the features are passed to a reconstruction part and a global residual learning structure, and a clear image is output;
S2, model training: the network is trained on the RESIDE dataset with the smooth L1 loss function:
L_SL1 = (1/N) Σ_x Σ_{i=1..3} F_SL1( Ĵ_i(x) − J_i(x) ),
F_SL1(e) = 0.5·e², if |e| < 1; |e| − 0.5, otherwise;
wherein N refers to the total number of image pixels, Ĵ_i(x) and J_i(x) refer to the pixel values at position x on the i-th channel (RGB, three channels in total), Ĵ_i(x) being the value computed by the network and J_i(x) the actual (ground-truth) value;
in step S1, the GridNet module has four rows and six columns, each row corresponds to a different feature scale, and is composed of five basic attention convolution modules ABD, which combine a skip connection and an attention module, and each column is a bridge connecting adjacent scales through up-sampling and down-sampling blocks; in each upsampling module, the size of the feature map is reduced by a factor of 2, while the number of feature maps is increased by a factor of 2, and the downsampling doubles the size of the features.
Further, the attention convolution module ABD in the GridNet module consists of local residual learning and an attention module; the local residual learning allows less important information, such as the low-frequency regions of the input features, to be bypassed through a skip connection.
In the Attention module of step S1, global average pooling is first adopted:
g_c = H_p(F_c) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} X_c(i, j),
wherein H_p represents the global average pooling function and X_c(i, j) the value of the c-th channel of the input at position (i, j);
SA1 is then obtained after processing by a convolution layer, ReLU, a convolution layer and a sigmoid activation function:
SA1 = σ(Conv(δ(Conv(g_c)))),
wherein σ represents the sigmoid function and δ represents the ReLU function;
the input F_c is multiplied element-wise by SA1 to obtain
F_c* = F_c ⊗ SA1;
SA2 is then obtained by passing F_c* through a convolution layer, ReLU, a convolution layer and a sigmoid activation function:
SA2 = σ(Conv(δ(Conv(F_c*))));
and the final output is
F̃ = F_c* ⊗ SA2.
the technical scheme of the invention has the following beneficial effects: the invention provides a single image defogging method by utilizing an end-to-end neural network, which combines a grid network and an attention mechanism, wherein in a traditional multi-scale network or a coding and decoding network, due to a hierarchical structure, information flow is often influenced by a bottleneck effect, and the grid network avoids the problem by using up-sampling and down-sampling blocks and by densely connecting the grid network and the coding and decoding network in different scales. In addition, the attention mechanism is given to a channel and pixel of the network, which can provide extra flexibility to process different types of information, and the attention mechanism also enables the network to expand the characterization capability of the CNNs.
Drawings
FIG. 1 is a block diagram of the grid attention network model of the present invention;
FIG. 2 is a block diagram of the GridNet module of the present invention;
FIG. 3 is a block diagram of an attention convolution module ABD of the present invention;
FIG. 4 is a structural diagram of an Attention module in the present invention;
FIG. 5 is a comparison of before and after image defogging according to the first embodiment of the present invention;
FIG. 6 is a comparison of before and after image defogging according to the second embodiment of the present invention;
FIG. 7 is a comparison of before and after image defogging according to the third embodiment of the present invention;
FIG. 8 is a comparison of before and after image defogging according to the fourth embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
The invention provides a single image defogging method using an end-to-end neural network, which comprises the following steps:
S1, constructing the grid attention network model shown in FIG. 1: the input is an image to be defogged, which is first fed into a shallow feature extraction convolution layer and then into a GridNet module and an Attention module; finally the features are passed to a reconstruction part and a global residual learning structure, and a clear image is output.
The structure of the GridNet module is shown in FIG. 2. The GridNet module has four rows and six columns; each row corresponds to a different feature scale and consists of five basic attention convolution modules ABD, each combining a skip connection and an attention module, and each column is a bridge connecting adjacent scales through up-sampling and down-sampling blocks. In each down-sampling block, the size of the feature maps is reduced by a factor of 2 while their number is doubled; each up-sampling block does the reverse.
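For concreteness, a minimal sketch of such sampling blocks, assuming a stride-2 convolution for down-sampling and a stride-2 transposed convolution for up-sampling (the patent does not fix these operator choices):

```python
import torch.nn as nn

class DownSample(nn.Module):
    """Halves the spatial size of the feature maps and doubles their number."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels * 2, kernel_size=3, stride=2, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.conv(x))


class UpSample(nn.Module):
    """Doubles the spatial size of the feature maps and halves their number."""
    def __init__(self, channels):
        super().__init__()
        self.deconv = nn.ConvTranspose2d(channels, channels // 2, kernel_size=4, stride=2, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.deconv(x))
```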
The attention convolution module ABD in the GridNet module (structure shown in FIG. 3) consists of local residual learning and an attention module; the local residual learning allows less important information, such as the low-frequency regions of the input features, to be bypassed through a skip connection.
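A minimal sketch of such an ABD block, assuming two 3x3 convolutions in the main path (the exact layer count is not specified above); the Attention module it calls is sketched after the FIG. 4 formulas below:

```python
import torch.nn as nn

class ABD(nn.Module):
    """Attention convolution block: convolutions followed by attention, with a local
    residual (skip) connection that lets less important, low-frequency information
    bypass the main path."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.attention = Attention(channels)   # channel + pixel attention, sketched below

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        out = self.attention(out)
        return x + out   # local residual learning
```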
The structure of the Attention module in the attention convolution module ABD is shown in FIG. 4. In the Attention module, global average pooling is first adopted:
g_c = H_p(F_c) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} X_c(i, j),
wherein H_p represents the global average pooling function and X_c(i, j) the value of the c-th channel of the input at position (i, j);
SA1 is then obtained after processing by a convolution layer, ReLU, a convolution layer and a sigmoid activation function:
SA1 = σ(Conv(δ(Conv(g_c)))),
wherein σ represents the sigmoid function and δ represents the ReLU function;
the input F_c is multiplied element-wise by SA1 to obtain
F_c* = F_c ⊗ SA1;
SA2 is then obtained by passing F_c* through a convolution layer, ReLU, a convolution layer and a sigmoid activation function:
SA2 = σ(Conv(δ(Conv(F_c*))));
and the final output is
F̃ = F_c* ⊗ SA2.
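The two attention steps above can be sketched in PyTorch as follows; the 1x1 kernels and the channel-reduction ratio r are assumptions, since the description only fixes the Conv-ReLU-Conv-sigmoid ordering:

```python
import torch.nn as nn

class Attention(nn.Module):
    """Channel attention (SA1) followed by pixel attention (SA2), mirroring the formulas above."""
    def __init__(self, channels, r=8):
        super().__init__()
        # channel attention: global average pooling -> Conv -> ReLU -> Conv -> sigmoid
        self.ca = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                  # g_c = H_p(F_c)
            nn.Conv2d(channels, channels // r, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // r, channels, 1),
            nn.Sigmoid(),                             # SA1
        )
        # pixel attention: Conv -> ReLU -> Conv -> sigmoid, one spatial map shared by all channels
        self.pa = nn.Sequential(
            nn.Conv2d(channels, channels // r, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // r, 1, 1),
            nn.Sigmoid(),                             # SA2
        )

    def forward(self, f):
        f_star = f * self.ca(f)          # F_c* = F_c (x) SA1
        return f_star * self.pa(f_star)  # output = F_c* (x) SA2
```

Producing the pixel attention map with a single output channel and broadcasting it over all channels is one common design choice; the patent text does not state the output dimensionality of SA2.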
s2, model training: by smoothing L1Loss function:
Figure BDA0002995000480000057
Figure BDA0002995000480000058
training in a data set RESIDE;
wherein N refers to the total number of image pixels,
Figure BDA0002995000480000059
and Ji(x) Refers to the pixel value of x on the ith channel (a total of RGB3 channels),
Figure BDA00029950004800000510
refers to a value calculated over a network, Ji(x) The actual value is represented by the value of,
Figure BDA00029950004800000511
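A minimal sketch of this loss, assuming the network output and ground truth are (B, 3, H, W) tensors with values in [0, 1]:

```python
import torch

def smooth_l1_loss(pred, target):
    """Smooth L1 loss: sum over the three RGB channels, average over pixels and batch."""
    diff = (pred - target).abs()
    elementwise = torch.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return elementwise.sum(dim=1).mean()
```

PyTorch's built-in torch.nn.functional.smooth_l1_loss applies the same element-wise function with a mean reduction over all elements, which differs from the channel sum above only by a constant factor.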
the purpose of model training is: and training the neural network by using a sample set consisting of the foggy images and the corresponding clean images to obtain an end-to-end network model. When the image is restored, the trained model is used to input the foggy image and output the clean image, so that no intermediate result exists. Or the method is mainly embodied in the construction of the network structure. After the network is constructed, the sample set can be used for training, and the training result can be directly taken for image defogging.
S3, test results: the results are measured with the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) and are better than those of conventional methods, reaching a PSNR of 36.69 with an SSIM of 0.9900 and a PSNR of 33.89 with an SSIM of 0.9865 on the indoor test set of RESIDE. Partial visualization results are shown in FIGS. 5-8.
Fig. 5a, 6a, 7a and 8a are images before defogging, and fig. 5b, 6b, 7b and 8b are images after defogging.
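For reference, PSNR and SSIM can be computed as in the following sketch (assuming scikit-image >= 0.19 and test samples given as (3, H, W) tensors in [0, 1]; the evaluation loop itself is an illustration, not part of the patent):

```python
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(model, dataset, device="cuda"):
    """Average PSNR and SSIM of the dehazed outputs against the clean ground truth."""
    model.to(device).eval()
    psnr_sum, ssim_sum = 0.0, 0.0
    with torch.no_grad():
        for hazy, clean in dataset:
            out = model(hazy.unsqueeze(0).to(device)).squeeze(0).clamp(0, 1)
            out = out.permute(1, 2, 0).cpu().numpy()    # (H, W, 3), values in [0, 1]
            gt = clean.permute(1, 2, 0).numpy()
            psnr_sum += peak_signal_noise_ratio(gt, out, data_range=1.0)
            ssim_sum += structural_similarity(gt, out, channel_axis=2, data_range=1.0)
    return psnr_sum / len(dataset), ssim_sum / len(dataset)
```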
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (4)

1. A single image defogging method using an end-to-end neural network, characterized by comprising the following steps:
S1, constructing a grid attention network model: the input is an image to be defogged, which is first fed into a shallow feature extraction convolution layer and then into a GridNet module and an Attention module; finally the features are passed to a reconstruction part and a global residual learning structure, and a clear image is output;
S2, model training: the network is trained on the RESIDE dataset with the smooth L1 loss function:
L_SL1 = (1/N) Σ_x Σ_{i=1..3} F_SL1( Ĵ_i(x) − J_i(x) ),
F_SL1(e) = 0.5·e², if |e| < 1; |e| − 0.5, otherwise;
wherein N refers to the total number of image pixels, Ĵ_i(x) and J_i(x) refer to the pixel values at position x on the i-th channel, Ĵ_i(x) being the value computed by the network and J_i(x) the actual value.
2. The method for defogging a single image using an end-to-end neural network as claimed in claim 1, wherein in step S1 the GridNet module has four rows and six columns; each row corresponds to a different feature scale and is composed of five basic attention convolution modules ABD, each combining a skip connection and an attention module, and each column is a bridge connecting adjacent scales through up-sampling and down-sampling blocks; in each down-sampling block, the size of the feature maps is reduced by a factor of 2 while their number is doubled, and each up-sampling block does the reverse.
3. The method for defogging a single image using an end-to-end neural network as claimed in claim 1 or 2, wherein the attention convolution module ABD in the GridNet module consists of local residual learning and an attention module, and the local residual learning allows less important information, such as the low-frequency regions of the input features, to be bypassed through a skip connection.
4. The method for defogging a single image using an end-to-end neural network as claimed in claim 1, wherein in the Attention module of step S1, global average pooling is first adopted:
g_c = H_p(F_c) = (1/(H×W)) Σ_{i=1..H} Σ_{j=1..W} X_c(i, j),
wherein H_p represents the global average pooling function and X_c(i, j) the value of the c-th channel of the input at position (i, j);
SA1 is then obtained after processing by a convolution layer, ReLU, a convolution layer and a sigmoid activation function:
SA1 = σ(Conv(δ(Conv(g_c)))),
wherein σ represents the sigmoid function and δ represents the ReLU function;
the input F_c is multiplied element-wise by SA1 to obtain
F_c* = F_c ⊗ SA1;
SA2 is then obtained by passing F_c* through a convolution layer, ReLU, a convolution layer and a sigmoid activation function:
SA2 = σ(Conv(δ(Conv(F_c*))));
and the final output is
F̃ = F_c* ⊗ SA2.
CN202110326940.XA 2021-03-26 2021-03-26 Single image defogging method using end-to-end neural network Pending CN112884680A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110326940.XA CN112884680A (en) 2021-03-26 2021-03-26 Single image defogging method using end-to-end neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110326940.XA CN112884680A (en) 2021-03-26 2021-03-26 Single image defogging method using end-to-end neural network

Publications (1)

Publication Number Publication Date
CN112884680A true CN112884680A (en) 2021-06-01

Family

ID=76042583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110326940.XA Pending CN112884680A (en) 2021-03-26 2021-03-26 Single image defogging method using end-to-end neural network

Country Status (1)

Country Link
CN (1) CN112884680A (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111832348A (en) * 2019-04-17 2020-10-27 中国科学院宁波材料技术与工程研究所 Pedestrian re-identification method based on pixel and channel attention mechanism
AU2020100274A4 (en) * 2020-02-25 2020-03-26 Huang, Shuying DR A Multi-Scale Feature Fusion Network based on GANs for Haze Removal
CN111598814A (en) * 2020-05-26 2020-08-28 北京理工大学 Single image defogging method based on extreme scattering channel
CN111814753A (en) * 2020-08-18 2020-10-23 深延科技(北京)有限公司 Target detection method and device under foggy weather condition
CN112184577A (en) * 2020-09-17 2021-01-05 西安理工大学 Single image defogging method based on multi-scale self-attention generation countermeasure network
CN112365414A (en) * 2020-11-04 2021-02-12 天津大学 Image defogging method based on double-path residual convolution neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIAOHONG LIU et al.: "GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing", IEEE, pages 7313-7322 *
XU QIN et al.: "FFA-Net: Feature Fusion Attention Network for Single Image Dehazing", arXiv, pages 1-8 *
殷帅; 胡越黎; 刘思齐; 燕明: "Data Acquisition and Annotation Based on YOLO Network" (基于YOLO网络的数据采集与标注), 仪表技术 (Instrument Technique), no. 12 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450273A (en) * 2021-06-18 2021-09-28 暨南大学 Image defogging method and system based on multi-scale multi-stage neural network
CN114022371A (en) * 2021-10-22 2022-02-08 中国科学院长春光学精密机械与物理研究所 Defogging device and defogging method based on space and channel attention residual error network
CN114022371B (en) * 2021-10-22 2024-04-05 中国科学院长春光学精密机械与物理研究所 Defogging device and defogging method based on space and channel attention residual error network
CN114529470A (en) * 2022-02-21 2022-05-24 南通大学 Single image rain removing method based on end-to-end neural network
CN114529470B (en) * 2022-02-21 2024-09-20 南通大学 Single image rain removing method based on end-to-end neural network
CN115018786A (en) * 2022-06-02 2022-09-06 国网江苏省电力有限公司电力科学研究院 Cable defect identification method and device based on grid network

Similar Documents

Publication Publication Date Title
CN112884680A (en) Single image defogging method using end-to-end neural network
CN110059772B (en) Remote sensing image semantic segmentation method based on multi-scale decoding network
CN111539887B (en) Channel attention mechanism and layered learning neural network image defogging method based on mixed convolution
CN113469094A (en) Multi-mode remote sensing data depth fusion-based earth surface coverage classification method
CN110570396A (en) industrial product defect detection method based on deep learning
CN113837938B (en) Super-resolution method for reconstructing potential image based on dynamic vision sensor
CN112365414B (en) Image defogging method based on double-path residual convolution neural network
CN112488025B (en) Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion
CN112862774B (en) Accurate segmentation method for remote sensing image building
CN110378344B (en) Spectral dimension conversion network-based convolutional neural network multispectral image segmentation method
CN107749048B (en) Image correction system and method, and color blindness image correction system and method
CN113269685A (en) Image defogging method integrating multi-attention machine system
CN113449691A (en) Human shape recognition system and method based on non-local attention mechanism
CN113409355A (en) Moving target identification system and method based on FPGA
CN115546046A (en) Single image defogging method fusing frequency and content characteristics
CN113052776A (en) Unsupervised image defogging method based on multi-scale depth image prior
CN116523875A (en) Insulator defect detection method based on FPGA pretreatment and improved YOLOv5
CN111242053B (en) Power transmission line flame detection method and system
CN117036182A (en) Defogging method and system for single image
CN115656444B (en) Method for reconstructing concentration of carbon dioxide field in large-scale venue
CN116823775A (en) Display screen defect detection method based on deep learning
CN114049274B (en) Defogging method for single image
CN115564647A (en) Novel super-division module and up-sampling method for image semantic segmentation
CN113409321B (en) Cell nucleus image segmentation method based on pixel classification and distance regression
CN115578256A (en) Unmanned aerial vehicle aerial insulator infrared video panorama splicing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination