CN112884680A - Single image defogging method using end-to-end neural network - Google Patents
- Publication number
- CN112884680A
- Authority
- CN
- China
- Prior art keywords
- module
- network
- attention
- image
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention provides a single image defogging method using an end-to-end neural network, comprising the following step: constructing a grid attention network model, in which the input is an image to be defogged; the image is fed into a shallow feature extraction convolutional layer, then into a GridNet module and an Attention module, and finally the features are passed to a reconstruction part and a global residual learning structure, and a clear image is output. The invention combines a grid network with an attention mechanism. In a conventional multi-scale network or encoder-decoder network, the hierarchical structure often subjects the information flow to a bottleneck effect; the grid network avoids this problem through dense connections across different scales using up-sampling and down-sampling blocks. In addition, the network is given a channel attention mechanism and a pixel attention mechanism, which provide extra flexibility for processing different types of information and extend the representational capability of CNNs.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a single image defogging method by using an end-to-end neural network.
Background
Haze is a common atmospheric phenomenon produced by small floating particles in the air, such as dust and smoke, which strongly absorb and scatter light and thereby reduce image quality. Under the influence of haze, many practical applications such as video surveillance, remote sensing and autonomous driving are easily compromised, and high-level computer vision tasks such as detection and recognition become difficult to complete, so image defogging (dehazing) has become an increasingly important technology.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a single image defogging method using an end-to-end neural network, which combines a grid network with an attention mechanism and gives the network a channel attention mechanism and a pixel attention mechanism, providing extra flexibility for processing different types of information.
To solve the above technical problem, an embodiment of the present invention provides a single image defogging method using an end-to-end neural network, including the following steps:
S1, constructing a grid attention network model: the input is an image to be defogged; the image is fed into a shallow feature extraction convolutional layer, then into a GridNet module and an Attention module, and finally the features are passed to a reconstruction part and a global residual learning structure, and a clear image is output;
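S2, model training: the model is trained with the smooth L1 loss function:
$$L_S = \frac{1}{N}\sum_{x=1}^{N}\sum_{i=1}^{3} F_S\left(\hat{J}_i(x) - J_i(x)\right), \qquad F_S(e) = \begin{cases} 0.5\,e^2, & |e| < 1 \\ |e| - 0.5, & \text{otherwise} \end{cases}$$
where $N$ is the total number of image pixels, and $\hat{J}_i(x)$ and $J_i(x)$ are the values of pixel $x$ on the $i$-th channel (three RGB channels in total), $\hat{J}_i(x)$ being the value computed by the network and $J_i(x)$ the ground-truth value.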
In step S1, the GridNet module has four rows and six columns. Each row corresponds to a different feature scale and is composed of five basic attention convolution modules (ABD), each of which combines a skip connection with an attention module; each column is a bridge connecting adjacent scales through up-sampling and down-sampling blocks. In each down-sampling block, the size of the feature maps is reduced by a factor of 2 while the number of feature maps is increased by a factor of 2; each up-sampling block doubles the size of the feature maps.
Further, the attention convolution module ABD in the GridNet module consists of local residual learning and an attention module; the local residual learning learns the less important information from the low-frequency regions of the input features through skip connections.
In the Attention module of step S1, global average pooling is applied first:
$$g_c = H_p(X_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} X_c(i, j),$$
where $H_p$ denotes the global average pooling function, $X_c(i, j)$ denotes the value of the $c$-th channel of the input at position $(i, j)$, and $H \times W$ is the spatial size of the feature map;
$SA_1$ is then obtained by passing $g_c$ through a convolutional layer, ReLU, a convolutional layer and a sigmoid activation function:
$$SA_1 = \sigma(\mathrm{Conv}(\delta(\mathrm{Conv}(g_c)))),$$
where $\sigma$ denotes the sigmoid function and $\delta$ denotes the ReLU function;
then the following is obtained through a convolutional layer, ReLU, a convolutional layer and a sigmoid activation function:
The technical scheme of the invention has the following beneficial effects: the invention provides a single image defogging method using an end-to-end neural network that combines a grid network with an attention mechanism. In a conventional multi-scale network or encoder-decoder network, the hierarchical structure often subjects the information flow to a bottleneck effect; the grid network avoids this problem through dense connections across different scales using up-sampling and down-sampling blocks. In addition, the network is given a channel attention mechanism and a pixel attention mechanism, which provide extra flexibility for processing different types of information and extend the representational capability of CNNs.
Drawings
FIG. 1 is a block diagram of an attention network model in accordance with the present invention;
FIG. 2 is a block diagram of the GridNet module of the present invention;
FIG. 3 is a block diagram of an attention convolution module ABD of the present invention;
FIG. 4 is a structural diagram of an Attention module in the present invention;
FIG. 5 is a comparison of before and after image defogging according to the first embodiment of the present invention;
FIG. 6 is a comparison of before and after image defogging according to the second embodiment of the present invention;
FIG. 7 is a comparison of before and after image defogging according to the third embodiment of the present invention;
FIG. 8 is a comparison of before and after image defogging according to the fourth embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
The invention provides a single image defogging method using an end-to-end neural network, comprising the following steps:
S1, constructing the grid attention network model shown in FIG. 1: the input is an image to be defogged; the image is fed into a shallow feature extraction convolutional layer, then into a GridNet module and an Attention module, and finally the features are passed to a reconstruction part and a global residual learning structure, and a clear image is output.
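The data flow of S1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage functions (`shallow`, `gridnet`, `attention`, `reconstruct`) are hypothetical placeholders passed in by the caller, images are represented as flat lists of pixel values, and "global residual learning" is assumed to mean that the reconstructed output is added back to the input image.

```python
from typing import Callable, List

Image = List[float]  # hypothetical flat list of pixel values, for illustration only
Stage = Callable[[Image], Image]


def dehaze(hazy: Image, shallow: Stage, gridnet: Stage,
           attention: Stage, reconstruct: Stage) -> Image:
    """Compose the stages of S1 and add the input back (assumed global residual)."""
    features = shallow(hazy)          # shallow feature extraction convolutional layer
    features = gridnet(features)      # GridNet module
    features = attention(features)    # Attention module
    residual = reconstruct(features)  # reconstruction part
    # Assumption: the network predicts a correction that is added to the input.
    return [h + r for h, r in zip(hazy, residual)]


if __name__ == "__main__":
    identity = lambda v: v
    # Toy reconstruction that removes half of each pixel value.
    print(dehaze([0.8, 0.6], identity, identity, identity,
                 lambda f: [-0.5 * v for v in f]))
```

With this arrangement the stages only need to model the difference between the foggy and the clean image, which is the usual motivation for a global residual connection.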
The structure of the GridNet module is shown in FIG. 2. The GridNet module has four rows and six columns. Each row corresponds to a different feature scale and consists of five basic attention convolution modules (ABD), each of which combines a skip connection with an attention module; each column is a bridge connecting adjacent scales through up-sampling and down-sampling blocks. In each down-sampling block, the size of the feature maps is reduced by a factor of 2 while the number of feature maps is increased by a factor of 2; each up-sampling block doubles the size of the feature maps.
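The scale arithmetic of the grid can be illustrated with a short Python sketch. It assumes that moving down one row (down-sampling) halves the spatial size and doubles the number of feature maps, and that up-sampling reverses this; the 240×240 input size and the 16 base channels are hypothetical values chosen only for illustration.

```python
from typing import List, Tuple


def row_shapes(height: int, width: int, channels: int,
               rows: int = 4) -> List[Tuple[int, int, int]]:
    """Return (height, width, channels) of the feature maps in each grid row."""
    shapes = []
    h, w, c = height, width, channels
    for _ in range(rows):
        shapes.append((h, w, c))
        # Assumed down-sampling block: halve the size, double the feature maps.
        h, w, c = h // 2, w // 2, c * 2
    return shapes


if __name__ == "__main__":
    for row, shape in enumerate(row_shapes(240, 240, 16)):
        print(f"row {row}: {shape}")
```

Under these assumptions the four rows operate at 1, 1/2, 1/4 and 1/8 of the input resolution, and an up-sampling block maps a row's shape back to that of the row above it.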
The attention convolution module ABD in the GridNet module (structure shown in FIG. 3) consists of local residual learning and an attention module; the local residual learning learns the less important information from the low-frequency regions of the input features through skip connections.
The structure of the Attention module in the attention convolution module ABD is shown in FIG. 4. In the Attention module, global average pooling is applied first:
$$g_c = H_p(X_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} X_c(i, j),$$
where $H_p$ denotes the global average pooling function, $X_c(i, j)$ denotes the value of the $c$-th channel of the input at position $(i, j)$, and $H \times W$ is the spatial size of the feature map;
$SA_1$ is then obtained by passing $g_c$ through a convolutional layer, ReLU, a convolutional layer and a sigmoid activation function:
$$SA_1 = \sigma(\mathrm{Conv}(\delta(\mathrm{Conv}(g_c)))),$$
where $\sigma$ denotes the sigmoid function and $\delta$ denotes the ReLU function;
then the following is obtained through a convolutional layer, ReLU, a convolutional layer and a sigmoid activation function:
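The first attention step described above (global average pooling followed by convolution, ReLU, convolution and sigmoid) can be sketched in plain Python. This is a minimal illustration, not the patented implementation: it assumes both convolutions are 1×1, so that on the pooled per-channel vector each one reduces to a matrix-vector product, and the weight and bias arguments are hypothetical parameters supplied by the caller.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # channels x rows x columns


def global_average_pool(x: FeatureMap) -> List[float]:
    """g_c: mean of each channel over all spatial positions (i, j)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def conv1x1(vec: List[float], weights: List[List[float]],
            bias: List[float]) -> List[float]:
    """Assumed 1x1 convolution on a pooled vector: a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def relu(vec: List[float]) -> List[float]:
    return [max(0.0, v) for v in vec]


def sigmoid(vec: List[float]) -> List[float]:
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]


def sa1(x: FeatureMap, w1, b1, w2, b2) -> List[float]:
    """SA_1 = sigmoid(Conv(ReLU(Conv(g_c)))), one weight per channel."""
    g = global_average_pool(x)
    return sigmoid(conv1x1(relu(conv1x1(g, w1, b1)), w2, b2))


if __name__ == "__main__":
    x = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]]]
    eye = [[1.0, 0.0], [0.0, 1.0]]
    print(global_average_pool(x))
    print(sa1(x, eye, [0.0, 0.0], [[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0]))
```

Because of the final sigmoid, every entry of the result lies strictly between 0 and 1, so it can act as a per-channel weight.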
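S2, model training: the model is trained with the smooth L1 loss function:
$$L_S = \frac{1}{N}\sum_{x=1}^{N}\sum_{i=1}^{3} F_S\left(\hat{J}_i(x) - J_i(x)\right), \qquad F_S(e) = \begin{cases} 0.5\,e^2, & |e| < 1 \\ |e| - 0.5, & \text{otherwise} \end{cases}$$
where $N$ is the total number of image pixels, and $\hat{J}_i(x)$ and $J_i(x)$ are the values of pixel $x$ on the $i$-th channel (three RGB channels in total), $\hat{J}_i(x)$ being the value computed by the network and $J_i(x)$ the ground-truth value.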
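The smooth L1 loss used in S2 can be sketched as follows. The sketch assumes the common form of the loss, quadratic for errors smaller than 1 and linear otherwise, summed over the three colour channels and divided by the number of pixels N; the pixel representation as (R, G, B) tuples is a hypothetical choice made for readability.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B)


def smooth_l1(e: float) -> float:
    """Assumed per-element penalty: 0.5*e^2 if |e| < 1, else |e| - 0.5."""
    a = abs(e)
    return 0.5 * e * e if a < 1.0 else a - 0.5


def smooth_l1_loss(pred: List[Pixel], target: List[Pixel]) -> float:
    """Sum the penalty over all pixels and channels, then divide by N pixels."""
    n = len(pred)
    total = sum(smooth_l1(p - t)
                for pred_px, target_px in zip(pred, target)
                for p, t in zip(pred_px, target_px))
    return total / n


if __name__ == "__main__":
    predicted = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
    ground_truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(smooth_l1_loss(predicted, ground_truth))
```

The quadratic region keeps gradients small near zero error, while the linear region makes the loss less sensitive to outliers than a pure squared error.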
The purpose of model training is to train the neural network on a sample set consisting of foggy images and their corresponding clean images, yielding an end-to-end network model. For image restoration, the foggy image is input to the trained model and the clean image is output directly, with no intermediate result. In other words, the method is mainly embodied in the construction of the network structure: once the network is constructed, it can be trained on the sample set, and the trained model can be used directly for image defogging.
S3, test results: the results are measured by peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) and are better than those of conventional methods. The reported values are a PSNR of 36.69 with an SSIM of 0.9900, and a PSNR of 33.89 with an SSIM of 0.9865, on the indoor test set of RESIDE. Partial visualization results are shown in FIGS. 5-8.
Fig. 5a, 6a, 7a and 8a are images before defogging, and fig. 5b, 6b, 7b and 8b are images after defogging.
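For reference, the PSNR metric used in S3 can be computed as in the sketch below. It assumes pixel values normalised to the range [0, 1] (hence `max_value=1.0`) and images given as flat lists; both are illustrative choices, and the patent does not state how its figures were computed.

```python
import math
from typing import List


def psnr(pred: List[float], target: List[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([0.0, 0.0], [0.1, 0.1]))
```

Higher values indicate a restored image closer to the ground truth; SSIM, the second metric, compares local structure and is not sketched here.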
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (4)
1. A single image defogging method by using an end-to-end neural network is characterized by comprising the following steps:
S1, constructing a grid attention network model: the input is an image to be defogged; the image is fed into a shallow feature extraction convolutional layer, then into a GridNet module and an Attention module, and finally the features are passed to a reconstruction part and a global residual learning structure, and a clear image is output;
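S2, model training: the model is trained with the smooth L1 loss function:
$$L_S = \frac{1}{N}\sum_{x=1}^{N}\sum_{i=1}^{3} F_S\left(\hat{J}_i(x) - J_i(x)\right), \qquad F_S(e) = \begin{cases} 0.5\,e^2, & |e| < 1 \\ |e| - 0.5, & \text{otherwise} \end{cases}$$
where $N$ is the total number of image pixels, $\hat{J}_i(x)$ is the value of pixel $x$ on the $i$-th channel computed by the network, and $J_i(x)$ is the corresponding ground-truth value.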
2. The method for defogging a single image by using an end-to-end neural network as claimed in claim 1, wherein in step S1, the GridNet module has four rows and six columns; each row corresponds to a different feature scale and is composed of five basic attention convolution modules (ABD), each of which combines a skip connection with an attention module; each column is a bridge connecting adjacent scales through up-sampling and down-sampling blocks; in each down-sampling block, the size of the feature maps is reduced by a factor of 2 while the number of feature maps is increased by a factor of 2, and each up-sampling block doubles the size of the feature maps.
3. The method for defogging a single image by using an end-to-end neural network as recited in claim 1 or 2, wherein the attention convolution module ABD in the GridNet module is composed of local residual learning and an attention module, and the local residual learning learns the less important information from the low-frequency regions of the input features through skip connections.
4. The method for defogging a single image by using an end-to-end neural network according to claim 1, wherein in the Attention module of step S1, global average pooling is applied first:
$$g_c = H_p(X_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} X_c(i, j),$$
where $H_p$ denotes the global average pooling function, $X_c(i, j)$ denotes the value of the $c$-th channel of the input at position $(i, j)$, and $H \times W$ is the spatial size of the feature map;
$SA_1$ is then obtained by passing $g_c$ through a convolutional layer, ReLU, a convolutional layer and a sigmoid activation function:
$$SA_1 = \sigma(\mathrm{Conv}(\delta(\mathrm{Conv}(g_c)))),$$
where $\sigma$ denotes the sigmoid function and $\delta$ denotes the ReLU function;
then the following is obtained through a convolutional layer, ReLU, a convolutional layer and a sigmoid activation function:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110326940.XA CN112884680A (en) | 2021-03-26 | 2021-03-26 | Single image defogging method using end-to-end neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112884680A (en) | 2021-06-01 |
Family
ID=76042583
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110326940.XA Pending CN112884680A (en) | 2021-03-26 | 2021-03-26 | Single image defogging method using end-to-end neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112884680A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111832348A (en) * | 2019-04-17 | 2020-10-27 | 中国科学院宁波材料技术与工程研究所 | Pedestrian re-identification method based on pixel and channel attention mechanism |
AU2020100274A4 (en) * | 2020-02-25 | 2020-03-26 | Huang, Shuying DR | A Multi-Scale Feature Fusion Network based on GANs for Haze Removal |
CN111598814A (en) * | 2020-05-26 | 2020-08-28 | 北京理工大学 | Single image defogging method based on extreme scattering channel |
CN111814753A (en) * | 2020-08-18 | 2020-10-23 | 深延科技(北京)有限公司 | Target detection method and device under foggy weather condition |
CN112184577A (en) * | 2020-09-17 | 2021-01-05 | 西安理工大学 | Single image defogging method based on multi-scale self-attention generation countermeasure network |
CN112365414A (en) * | 2020-11-04 | 2021-02-12 | 天津大学 | Image defogging method based on double-path residual convolution neural network |
Non-Patent Citations (3)
Title |
---|
XIAOHONG LIU et al.: "GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing", IEEE, pages 7313 - 7322 * |
XU QIN et al.: "FFA-Net: Feature Fusion Attention Network for Single Image Dehazing", arXiv, pages 1 - 8 * |
殷帅; 胡越黎; 刘思齐; 燕明: "Data collection and annotation based on the YOLO network", 仪表技术, no. 12 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450273A (en) * | 2021-06-18 | 2021-09-28 | 暨南大学 | Image defogging method and system based on multi-scale multi-stage neural network |
CN114022371A (en) * | 2021-10-22 | 2022-02-08 | 中国科学院长春光学精密机械与物理研究所 | Defogging device and defogging method based on space and channel attention residual error network |
CN114022371B (en) * | 2021-10-22 | 2024-04-05 | 中国科学院长春光学精密机械与物理研究所 | Defogging device and defogging method based on space and channel attention residual error network |
CN114529470A (en) * | 2022-02-21 | 2022-05-24 | 南通大学 | Single image rain removing method based on end-to-end neural network |
CN114529470B (en) * | 2022-02-21 | 2024-09-20 | 南通大学 | Single image rain removing method based on end-to-end neural network |
CN115018786A (en) * | 2022-06-02 | 2022-09-06 | 国网江苏省电力有限公司电力科学研究院 | Cable defect identification method and device based on grid network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||