CN115880177A - Full-resolution low-illumination image enhancement method for aggregating context and enhancing details - Google Patents
Full-resolution low-illumination image enhancement method for aggregating context and enhancing details
- Publication number
- CN115880177A CN115880177A CN202211600774.9A CN202211600774A CN115880177A CN 115880177 A CN115880177 A CN 115880177A CN 202211600774 A CN202211600774 A CN 202211600774A CN 115880177 A CN115880177 A CN 115880177A
- Authority
- CN
- China
- Prior art keywords
- low
- full
- convolution
- enhancement
- illumination image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Processing (AREA)
Abstract
The invention provides a full-resolution low-illumination image enhancement method for aggregating context and enhancing details, which comprises the following steps: performing data preprocessing, including data pairing, random cropping and data augmentation, to obtain a training data set; designing a full-resolution low-illumination image enhancement network for aggregating context and enhancing details, where the network consists of a full-resolution detail extraction module, a frequency-space domain context information attention module and a feature aggregation and enhancement module; designing a loss function to guide the parameter optimization of the network designed in step B; and training the network of step B with the training data set obtained in step A until convergence to a Nash equilibrium, obtaining a trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details. The invention can enhance low-illumination images and alleviate problems such as detail loss, color distortion and insufficient brightness.
Description
Technical Field
The invention belongs to the technical field of image processing and computer vision, and particularly relates to a full-resolution low-illumination image enhancement method for aggregating context and enhancing details.
Background
Low-light image enhancement is an important branch of image enhancement. Owing to environmental factors such as insufficient or non-uniform illumination and backlighting, and to interference that easily arises during camera imaging, low-illumination images suffer from degradations such as low brightness, high noise, and loss of color and detail information. A low-illumination image enhancement algorithm is therefore needed to recover useful information such as color, preserve useful detail, and remove noise, so as to obtain a normal-illumination image that satisfies human visual perception or is better suited to the analysis and processing of downstream tasks.
Low-illumination image enhancement technology has broad application prospects. It can improve the visibility of night-time inspection or surveillance. Cameras are installed in public places for monitoring and recording, but images captured by surveillance cameras at night are mostly dark and unclear when light is insufficient and therefore cannot provide strong evidence for some events; after processing with low-illumination image enhancement, the clearer video provides stronger support for judging and deciding such events. The technique can also improve the quality of human visual perception: a photograph taken under unsatisfactory ambient lighting can be quickly processed to a satisfactory level without retaking the scene. Low-illumination image enhancement can likewise improve the performance of downstream tasks. Many downstream technologies, such as face recognition and human key-point recognition, place relatively high requirements on input image quality; whether an algorithm can correctly locate faces and bodies depends on whether the input image is clear enough. When the input picture is dark or blurred, recognizing faces or body contours is very challenging, and low-illumination image enhancement can improve image quality and thereby significantly raise recognition and detection accuracy.
Early methods were mainly based on histogram equalization and Retinex theory. Histogram equalization achieves enhancement by expanding the dynamic range of image pixels and improves contrast well, but because it lacks local consideration it easily causes over-exposure and under-exposure. Retinex theory regards an image as the product of a reflectance component R and an illumination component I; it requires prior knowledge, and poor priors lead to unrealistic enhancement with severe color cast and amplified noise. In recent years many deep-learning methods have been proposed. Some combine Retinex theory with convolutional neural networks and directly take the reflectance as the enhanced image, causing loss of detail and color deviation; some directly transfer mainstream network architectures from other fields to the low-illumination task without considering low-illumination characteristics; still others independently solve individual aspects of the problems of insufficient illumination, high noise, and loss of color and detail information, ignoring the correlation among these problems. However, low-illumination images contain both low-level information such as details and noise and high-level information such as color and scene, and the two types of features are not unrelated: processing the scene and illumination facilitates the restoration of details, and restored details in turn facilitate the restoration of the overall scene.
Existing methods give low-level information such as details and noise and high-level information such as color and scene the same independent processing status. Low-illumination enhancement, however, is a relatively fine image-processing task: details should first receive focused processing and then be fused into the processing of high-level information such as color and scene. Moreover, the two types of features should be connected and enhanced together rather than handled independently.
Disclosure of Invention
In view of the defects and shortcomings of the prior art, the invention aims to provide a full-resolution low-illumination image enhancement method for aggregating context and enhancing details.
The invention designs a full-resolution low-illumination image enhancement method for aggregating context and enhancing details: a full-resolution detail extraction module is first designed to extract detail features; a frequency-space domain context information attention module is then designed to extract context features such as frequency-domain information and the color and scene of the spatial domain, with the attention module learning the importance of features in the frequency and spatial domains; finally, a feature aggregation and enhancement module is designed to aggregate the detail features and the context features and enhance them cooperatively.
It includes: carrying out data preprocessing, including data pairing, data random cutting and data enhancement processing, to obtain a training data set; designing a full-resolution low-illumination image enhancement network for aggregating context and enhancing details, wherein the network consists of a full-resolution detail extraction module, a frequency-space domain context information attention module and a feature aggregation and enhancement module; designing a loss function, and guiding the parameter optimization of the network designed in the step B; training the full-resolution low-illumination image enhancement network for the aggregation context and the enhancement details in the step B by using the training data set obtained in the step A, and converging to Nash balance to obtain a trained full-resolution low-illumination image enhancement model for the aggregation context and the enhancement details; and inputting the low-illumination image to be detected into a trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details, and outputting an enhanced normal-illumination image. The invention can enhance the low-illumination image and solve the problems of low-illumination image detail loss, color distortion, insufficient brightness and the like.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A full-resolution low-illumination image enhancement method for aggregating context and enhancing details, characterized by the following steps:
Step A, preprocessing the data, including data pairing, random cropping and data augmentation, to obtain a training data set;
Step B, designing a full-resolution low-illumination image enhancement network for aggregating context and enhancing details, comprising: a full-resolution detail extraction module, a frequency-space domain context information attention module and a feature aggregation and enhancement module;
Step C, designing a loss function to guide the parameter optimization of the network designed in step B;
Step D, training the network of step B with the training data set obtained in step A until convergence to a Nash equilibrium, obtaining a trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details;
Step E, inputting the low-illumination image to be processed into the trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details, and outputting the enhanced normal-illumination image.
Further, the specific implementation of step A is as follows:
Step A1, pairing each low-illumination image with its corresponding label image;
Step A2, randomly cropping each low-illumination image of size h×w×3 to an image of size p×p×3, and applying the same random crop to the corresponding label image, where h and w are the height and width of the low-illumination and label images and p is the height and width of the cropped image;
Step A3, randomly applying one of the following eight augmentation modes to each paired training image: keeping the original image, vertical flip, rotation by 90 degrees, rotation by 90 degrees followed by vertical flip, rotation by 180 degrees, rotation by 180 degrees followed by vertical flip, rotation by 270 degrees, and rotation by 270 degrees followed by vertical flip.
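A minimal sketch of the step-A preprocessing described above, assuming paired low-light/label images are given as NumPy arrays; the function names and the use of NumPy are illustrative assumptions, not part of the patent text.

```python
import random
import numpy as np

def random_crop_pair(low, label, p):
    """Crop the same random p x p window from both h x w x 3 images (step A2)."""
    h, w = low.shape[:2]
    top = random.randint(0, h - p)
    left = random.randint(0, w - p)
    return low[top:top + p, left:left + p], label[top:top + p, left:left + p]

def random_augment_pair(low, label):
    """Apply one of the 8 modes of step A3: rotation by k*90 degrees, optionally followed by a vertical flip."""
    k = random.randint(0, 3)          # 0, 90, 180 or 270 degrees
    flip = random.random() < 0.5      # vertical flip or keep as-is
    low, label = np.rot90(low, k), np.rot90(label, k)
    if flip:
        low, label = np.flipud(low), np.flipud(label)
    return np.ascontiguousarray(low), np.ascontiguousarray(label)
```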
Further, the specific implementation steps of step B are as follows:
Step B1, constructing a full-resolution detail extraction module, composed of a shallow feature extraction submodule, a CBAM-based attention submodule and a frequency-domain transform submodule, and extracting detail features with the designed network;
Step B2, designing a frequency-space domain context information attention module, composed of multi-scale feature extraction submodules and a frequency-space domain feature fusion submodule, and extracting context features with the designed network;
Step B3, designing a feature aggregation and enhancement module, composed of a feature aggregation convolution block and a cooperative enhancement submodule, which aggregates the detail features extracted in step B1 and the context features extracted in step B2 and enhances the two types of features together;
Step B4, designing the full-resolution low-illumination image enhancement network for aggregating context and enhancing details, which comprises the full-resolution detail extraction module, the frequency-space domain context information attention module and the feature aggregation and enhancement module.
Further, the specific implementation steps of step B1 are as follows:
Step B11, designing a shallow feature extraction submodule. Its input is the low-illumination image I, which passes through a 3×3 convolution to obtain an initial feature map F_ori and then enters three branches: the first branch contains 1 serial 3×3 convolution, the second branch contains 2 serial 3×3 convolutions, and the third branch contains 3 serial 3×3 convolutions. The outputs F_B1, F_B2, F_B3 of the three branches are concatenated along the channel dimension and passed through a 3×3 convolution to obtain the feature map F_low output by the shallow feature extraction submodule. The specific formulas are as follows:
F_ori = Conv3(I)
F_B1 = Conv3(F_ori)
F_B2 = Conv3(Conv3(F_ori))
F_B3 = Conv3(Conv3(Conv3(F_ori)))
F_low = Conv3(Concat(F_B1, F_B2, F_B3))
where Conv3 is a 3×3 convolution and Concat is concatenation along the channel dimension;
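A minimal PyTorch sketch of the step-B11 shallow feature extraction submodule, assuming a base channel width c; the framework and channel width are assumptions, since the text only fixes the convolution layout.

```python
import torch
import torch.nn as nn

class ShallowFeatureExtraction(nn.Module):
    def __init__(self, c=32):
        super().__init__()
        conv = lambda ci, co: nn.Conv2d(ci, co, 3, padding=1)
        self.head = conv(3, c)                                              # I -> F_ori
        self.branch1 = conv(c, c)                                           # 1 serial 3x3 conv
        self.branch2 = nn.Sequential(conv(c, c), conv(c, c))                # 2 serial 3x3 convs
        self.branch3 = nn.Sequential(conv(c, c), conv(c, c), conv(c, c))    # 3 serial 3x3 convs
        self.fuse = conv(3 * c, c)                                          # concat -> F_low

    def forward(self, img):
        f_ori = self.head(img)
        f = torch.cat([self.branch1(f_ori), self.branch2(f_ori), self.branch3(f_ori)], dim=1)
        return self.fuse(f)                                                 # F_low
```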
Step B12, constructing a CBAM-based attention submodule, composed of channel attention Att_c and spatial attention Att_s in series. The input feature map is the feature map F_low obtained in step B11, and the output of the CBAM-based attention submodule is the feature map F_spa. The specific formula is as follows:
F_spa = Att_s(Att_c(F_low))
where Att_c is channel attention and Att_s is spatial attention;
Step B13, designing a frequency-domain transform submodule. The input feature map is the feature map F_spa obtained in step B12; it is converted from the spatial domain to the frequency domain by a Fourier transform, passed sequentially through a 3×3 convolution, a batch normalization layer and a ReLU activation function, and then converted back from the frequency domain to the spatial domain by an inverse Fourier transform to obtain the output feature map F_fre. The specific formula is as follows:
F_fre = idft(ReLU(BN(Conv3(dft(F_spa)))))
where dft is the Fourier transform, idft is the inverse Fourier transform, ReLU is the ReLU activation function, BN is the batch normalization layer, and Conv3 is a 3×3 convolution;
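A minimal sketch of the step-B13 frequency-domain transform submodule. Convolution, batch normalization and ReLU do not operate on complex tensors directly, so this sketch makes the common assumption that the real and imaginary parts of the spectrum are stacked along the channel axis before the convolution; that realization and the channel width c are assumptions, not stated in the text.

```python
import torch
import torch.nn as nn

class FrequencyTransform(nn.Module):
    def __init__(self, c=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2 * c, 2 * c, 3, padding=1),   # 3x3 conv applied in the frequency domain
            nn.BatchNorm2d(2 * c),
            nn.ReLU(inplace=True),
        )

    def forward(self, f_spa):
        spec = torch.fft.fft2(f_spa, norm="ortho")            # dft
        spec = torch.cat([spec.real, spec.imag], dim=1)       # complex -> 2*c real channels
        spec = self.body(spec)
        real, imag = spec.chunk(2, dim=1)
        out = torch.fft.ifft2(torch.complex(real, imag), norm="ortho")  # idft
        return out.real                                        # F_fre back in the spatial domain
```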
Step B14, constructing the full-resolution detail extraction module, composed of the shallow feature extraction submodule, the CBAM-based attention submodule and the frequency-domain transform submodule. Its input is the low-illumination image I processed in step A, and the feature maps F_low, F_spa, F_fre are obtained after the shallow feature extraction, attention and frequency-domain transform submodules are applied in sequence.
Further, the specific implementation steps of step B2 are as follows:
Step B21, designing a multi-scale feature extraction submodule. Its input feature map is denoted F ∈ R^(H×W×C), where H, W and C are the height, width and number of channels of F. F first passes through an average pooling layer with kernel size 2×2 and stride 2, and then through a 1×1 convolution, a ReLU activation function, a 1×1 convolution and a ReLU activation function for dimensionality reduction, giving an intermediate feature map F_1. F_1 is then split into two branches. The upper branch continues the dimensionality reduction with a 1×1 convolution and passes through an upsampling layer to give the upper-branch output F_11 ∈ R^(H×W×a), where a is the number of channels after reduction. The other branch passes through an average pooling layer with kernel size 2×2 and stride 2 and then, in sequence, a 1×1 convolution, a ReLU activation function, a 1×1 convolution and a ReLU activation function for dimensionality reduction, giving an intermediate feature map F_121; F_121 then passes through an upsampling layer, a 1×1 convolution, a ReLU activation function and an upsampling layer to give the lower-branch output F_12 ∈ R^(H×W×a). F_11 and F_12 are added, the sum is concatenated with F along the channel dimension, and after an SE module the channels are adjusted by a 1×1 convolution to obtain the feature map F_m output by the multi-scale feature extraction submodule. The specific formulas are as follows:
F_1 = ReLU(Conv1(ReLU(Conv1(Avgpooling(F)))))
F_11 = Upsampling(Conv1(F_1))
F_121 = ReLU(Conv1(ReLU(Conv1(Avgpooling(F_1)))))
F_12 = Upsampling(ReLU(Conv1(Upsampling(F_121))))
F_m = Conv1(SE(Concat(F_11 + F_12, F)))
where ReLU is the ReLU activation function, Conv1 is a 1×1 convolution, SE(·) is an SE module, Avgpooling is an average pooling layer with kernel size 2×2 and stride 2, Upsampling is a twofold nearest-neighbour upsampling layer, and Concat is concatenation along the channel dimension;
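A minimal PyTorch sketch of the step-B21 multi-scale feature extraction submodule. The channel widths c (input) and a (after reduction) are hyperparameters not fixed by the text, and the SE block follows the standard squeeze-and-excitation formulation, which the text only names; all of these are assumptions.

```python
import torch
import torch.nn as nn

class SE(nn.Module):
    def __init__(self, c, r=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c, c // r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c // r, c, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)          # channel reweighting

class MultiScaleExtraction(nn.Module):
    def __init__(self, c=32, a=16):
        super().__init__()
        self.pool = nn.AvgPool2d(2, stride=2)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.reduce1 = nn.Sequential(nn.Conv2d(c, a, 1), nn.ReLU(True),
                                     nn.Conv2d(a, a, 1), nn.ReLU(True))      # -> F_1
        self.branch_up = nn.Conv2d(a, a, 1)                                  # upper branch before upsample
        self.reduce2 = nn.Sequential(nn.Conv2d(a, a, 1), nn.ReLU(True),
                                     nn.Conv2d(a, a, 1), nn.ReLU(True))      # -> F_121
        self.branch_low = nn.Sequential(nn.Conv2d(a, a, 1), nn.ReLU(True))
        self.se = SE(a + c)
        self.out = nn.Conv2d(a + c, c, 1)                                    # -> F_m

    def forward(self, f):
        f1 = self.reduce1(self.pool(f))                  # H/2 x W/2 x a
        f11 = self.up(self.branch_up(f1))                # H x W x a
        f121 = self.reduce2(self.pool(f1))               # H/4 x W/4 x a
        f12 = self.up(self.branch_low(self.up(f121)))    # H x W x a
        fused = torch.cat([f11 + f12, f], dim=1)         # (a + c) channels
        return self.out(self.se(fused))                  # F_m, H x W x c
```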
Step B22, designing a frequency-space domain feature fusion submodule, formed by channel attention and spatial attention connected in series;
Step B23, designing the frequency-space domain context information attention module, which consists of three multi-scale feature extraction submodules and one frequency-space domain feature fusion submodule. The inputs of the three multi-scale feature extraction submodules are the three feature maps F_low, F_spa, F_fre obtained in step B1; each is processed by the multi-scale feature extraction submodule designed in step B21 to obtain the context-aware feature maps F_low_m, F_spa_m, F_fre_m, which then pass through the frequency-space domain feature fusion submodule designed in step B22 to obtain the feature map F_f output by the frequency-space domain context information attention module.
Further, the specific implementation steps of step B22 are as follows:
Step B221, designing the channel attention. Its inputs are the feature maps F_low_m, F_spa_m, F_fre_m ∈ R^(H×W×C) obtained in step B23. Each of the three feature maps undergoes global average pooling over the spatial dimensions, giving three vectors of size 1×1×C, which are concatenated along the channel dimension into an intermediate feature map F_c ∈ R^(1×1×3C). F_c passes sequentially through a 1×1 convolution, a ReLU activation function, a 1×1 convolution, a ReLU activation function and a 1×1 convolution for dimensionality reduction and restoration, and then through a Sigmoid activation function to obtain the channel-dimension weights F_W1 ∈ R^(1×1×3C). F_W1 is decomposed along the channel dimension into three vectors F_W10, F_W11, F_W12 of size 1×1×C, which are multiplied with the input feature maps F_low_m, F_spa_m, F_fre_m of the frequency-space domain feature fusion submodule, respectively, to obtain the channel-attention output feature maps F_low_c, F_spa_c, F_fre_c. The specific formulas are as follows:
F_c = Concat(Avgpooling_s(F_low_m), Avgpooling_s(F_spa_m), Avgpooling_s(F_fre_m))
F_W1 = Sigmoid(Conv1(ReLU(Conv1(ReLU(Conv1(F_c))))))
F_low_c = F_W10 × F_low_m
F_spa_c = F_W11 × F_spa_m
F_fre_c = F_W12 × F_fre_m
where Concat is concatenation along the channel dimension, Avgpooling_s is global average pooling over the spatial dimensions, ReLU is the ReLU activation function, Conv1 is a 1×1 convolution, and Sigmoid is the Sigmoid activation function;
Step B222, designing the spatial attention. Its inputs are the three feature maps F_low_c, F_spa_c, F_fre_c obtained in step B221. Each of the three feature maps undergoes average pooling over the channel dimension, giving three feature maps of size H×W×1, which are concatenated along the channel dimension into an intermediate feature map F_s ∈ R^(H×W×3). F_s passes sequentially through an average pooling layer with kernel size 2×2 and stride 2, a ReLU activation function and an upsampling layer, and then through a Sigmoid activation function to obtain the spatial-dimension weights F_W2 ∈ R^(H×W×3). F_W2 is decomposed into three feature maps F_W20, F_W21, F_W22 of size H×W×1, which are multiplied with the spatial-attention input feature maps F_low_c, F_spa_c, F_fre_c, respectively, to obtain the spatial-attention output feature maps F_low_s, F_spa_s, F_fre_s. The specific formulas are as follows:
F_s = Concat(Avgpooling_c(F_low_c), Avgpooling_c(F_spa_c), Avgpooling_c(F_fre_c))
F_W2 = Sigmoid(Upsampling(ReLU(Avgpooling(F_s))))
F_low_s = F_W20 × F_low_c
F_spa_s = F_W21 × F_spa_c
F_fre_s = F_W22 × F_fre_c
where Concat is concatenation along the channel dimension, Avgpooling_c is average pooling over the channel dimension, ReLU is the ReLU activation function, Sigmoid is the Sigmoid activation function, Avgpooling is an average pooling layer with kernel size 2×2 and stride 2, and Upsampling is a twofold nearest-neighbour upsampling layer;
Step B223, designing the frequency-space domain feature fusion submodule. Its inputs are the feature maps F_low_m, F_spa_m, F_fre_m obtained in step B23. The three feature maps pass through the channel attention of step B221 to obtain the feature maps F_low_c, F_spa_c, F_fre_c, then through the spatial attention of step B222 to obtain the feature maps F_low_s, F_spa_s, F_fre_s, and the three results are added to obtain the final output F_f. The specific formula is as follows:
F_f = F_low_s + F_spa_s + F_fre_s.
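A minimal PyTorch sketch of the frequency-space domain feature fusion submodule (steps B221-B223). The three inputs are assumed to share the same shape (N, c, H, W), and the internal channel reduction inside the channel attention is an assumption, since the text does not fix it.

```python
import torch
import torch.nn as nn

class FreqSpatialFusion(nn.Module):
    def __init__(self, c=32):
        super().__init__()
        self.channel_att = nn.Sequential(                 # produces F_W1 of shape (N, 3c, 1, 1)
            nn.Conv2d(3 * c, c, 1), nn.ReLU(True),
            nn.Conv2d(c, c, 1), nn.ReLU(True),
            nn.Conv2d(c, 3 * c, 1), nn.Sigmoid(),
        )
        self.spatial_att = nn.Sequential(                 # produces F_W2 of shape (N, 3, H, W)
            nn.AvgPool2d(2, stride=2), nn.ReLU(True),
            nn.Upsample(scale_factor=2, mode="nearest"), nn.Sigmoid(),
        )

    def forward(self, f_low_m, f_spa_m, f_fre_m):
        feats = [f_low_m, f_spa_m, f_fre_m]
        # channel attention: spatial global average pooling -> one c-sized weight slice per input
        gap = torch.cat([x.mean(dim=(2, 3), keepdim=True) for x in feats], dim=1)
        w1 = self.channel_att(gap).chunk(3, dim=1)
        feats = [w * x for w, x in zip(w1, feats)]
        # spatial attention: channel-wise mean -> one per-pixel weight map per input
        cmean = torch.cat([x.mean(dim=1, keepdim=True) for x in feats], dim=1)
        w2 = self.spatial_att(cmean).chunk(3, dim=1)
        feats = [w * x for w, x in zip(w2, feats)]
        return feats[0] + feats[1] + feats[2]             # F_f
```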
Further, the specific implementation of step B3 is as follows:
Step B31, designing a feature aggregation convolution block to fuse the detail information and the context information. Its inputs are the feature map F_fre obtained in step B1 and the feature map F_f obtained in step B2; the two are concatenated along the channel dimension and passed through a 3×3 convolution to obtain the output feature map F_conv. The specific formula is as follows:
F_conv = Conv3(Concat(F_fre, F_f))
where Conv3 is a 3×3 convolution and Concat is concatenation along the channel dimension;
Step B32, designing a cooperative enhancement submodule to cooperatively enhance the fused detail and context information. Its input is the feature map F_conv obtained in step B31. F_conv passes sequentially through a 1×1 convolution, a ReLU6 activation function, a Dropout layer, a 1×1 convolution and a Dropout layer, and is added to F_conv to obtain the intermediate feature map F_mid; F_mid then passes through a LeakyReLU activation function, is concatenated with F_conv along the channel dimension, and passes through a 3×3 convolution to obtain the output feature map F_co. The specific formulas are as follows:
F_mid = Dropout(Conv1(Dropout(ReLU6(Conv1(F_conv))))) + F_conv
F_co = Conv3(Concat(LeakyReLU(F_mid), F_conv))
where Conv1 is a 1×1 convolution, Conv3 is a 3×3 convolution, Concat is concatenation along the channel dimension, Dropout is a random deactivation (dropout) layer, ReLU6 is the ReLU6 activation function, and LeakyReLU is the LeakyReLU activation function;
Step B33, designing the feature aggregation and enhancement module, composed of the feature aggregation convolution block and the cooperative enhancement submodule. Its inputs are the feature map F_fre obtained in step B1 and the feature map F_f obtained in step B2; the feature map F_conv is obtained after the feature aggregation convolution block, and the feature map F_co is obtained after the cooperative enhancement submodule.
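A minimal PyTorch sketch of the step-B31 to B33 feature aggregation and enhancement module. The channel width c, the dropout rate and the LeakyReLU slope are assumptions not fixed by the text.

```python
import torch
import torch.nn as nn

class AggregateAndEnhance(nn.Module):
    def __init__(self, c=32, p_drop=0.1):
        super().__init__()
        self.aggregate = nn.Conv2d(2 * c, c, 3, padding=1)         # F_conv from Concat(F_fre, F_f)
        self.mlp = nn.Sequential(                                   # residual 1x1 path of step B32
            nn.Conv2d(c, c, 1), nn.ReLU6(True), nn.Dropout(p_drop),
            nn.Conv2d(c, c, 1), nn.Dropout(p_drop),
        )
        self.act = nn.LeakyReLU(0.2, inplace=True)
        self.out = nn.Conv2d(2 * c, c, 3, padding=1)                # -> F_co

    def forward(self, f_fre, f_f):
        f_conv = self.aggregate(torch.cat([f_fre, f_f], dim=1))
        f_mid = self.mlp(f_conv) + f_conv                           # F_mid with skip connection
        return self.out(torch.cat([self.act(f_mid), f_conv], dim=1))
```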
Further, the specific implementation of step B4 is as follows:
Step B4, designing the full-resolution low-illumination image enhancement network for aggregating context and enhancing details, which integrates the full-resolution detail extraction module, the frequency-space domain context information attention module and the feature aggregation and enhancement module. The input is a low-illumination image I; the three feature maps F_low, F_spa, F_fre are obtained after the full-resolution detail extraction module of step B1, the feature map F_f is obtained after the frequency-space domain context information attention module, and the feature map F_co is obtained after the feature aggregation and enhancement module. F_co is then concatenated with the feature map F_low of step B1 along the channel dimension and passed through a 3×3 convolution to obtain the final enhanced image I_out. The specific formula is as follows:
I_out = Conv3(Concat(F_co, F_low))
where Conv3 is a 3×3 convolution and Concat is concatenation along the channel dimension.
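A minimal sketch of the step-B4 composition. The three submodules are passed in as plain nn.Module arguments (for example the hypothetical sketches above), so only the final concatenation and 3×3 convolution are spelled out; the module interfaces are assumptions consistent with the data flow described in step B4.

```python
import torch
import torch.nn as nn

class FullResolutionEnhancer(nn.Module):
    def __init__(self, detail_module, context_module, aggregate_module, c=32):
        super().__init__()
        self.detail = detail_module        # step B1: returns F_low, F_spa, F_fre
        self.context = context_module      # step B2: returns F_f
        self.aggregate = aggregate_module  # step B3: returns F_co
        self.tail = nn.Conv2d(2 * c, 3, 3, padding=1)   # Concat(F_co, F_low) -> I_out

    def forward(self, img):
        f_low, f_spa, f_fre = self.detail(img)
        f_f = self.context(f_low, f_spa, f_fre)
        f_co = self.aggregate(f_fre, f_f)
        return self.tail(torch.cat([f_co, f_low], dim=1))   # enhanced image I_out
```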
Further, the specific implementation of step C is as follows:
Step C, designing a loss function composed of an L2 loss and a VGG perceptual loss; the overall target loss function of the network is:
l = ω_1 · ||I_out − G||_2 + ω_2 · ||Φ(I_out) − Φ(G)||_1
where Φ(·) denotes extracting the Conv4-1 layer features with a VGG-16 classification model pre-trained on the ImageNet dataset, I_out denotes the enhanced image of the low-illumination image I, G denotes the label image corresponding to the low-illumination image I, ||·||_1 denotes the L1 loss, ||·||_2 denotes the L2 loss, and ω_1, ω_2 are weights.
Further, the specific implementation of step D is as follows:
Step D1, randomly dividing the training data set obtained in step A into several batches, each batch containing N pairs of images;
Step D2, inputting a low-illumination image I, obtaining the enhanced image I_out after the full-resolution low-illumination image enhancement network for aggregating context and enhancing details of step B, and computing the loss l with the formula of step C;
Step D3, computing the gradients of the network parameters by back-propagation according to the loss, and updating the network parameters with the Adam optimization method;
Step D4, repeating steps D1 to D3 batch by batch until the target loss function value of the network converges to a Nash equilibrium, and saving the network parameters to obtain the full-resolution low-illumination image enhancement model for aggregating context and enhancing details.
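A minimal sketch of the step-D training loop, assuming a model such as the enhancer sketch above, a criterion such as the loss sketch above, and a DataLoader yielding paired (low, gt) crops; the batch size, epoch count and learning rate are assumptions, since the text does not fix them.

```python
import torch

def train(model, criterion, loader, epochs=100, lr=1e-4, device="cuda"):
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)     # Adam, as in step D3
    for epoch in range(epochs):                                  # repeat D1-D3 batch by batch (D4)
        for low, gt in loader:
            low, gt = low.to(device), gt.to(device)
            loss = criterion(model(low), gt)                     # step D2: forward pass and loss l
            optimizer.zero_grad()
            loss.backward()                                      # step D3: back-propagate gradients
            optimizer.step()
    torch.save(model.state_dict(), "enhancer.pth")               # step D4: save the trained model
```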
Compared with the prior art, the method and its preferred schemes extract detail features at full resolution, extract context features in the frequency and spatial domains, and aggregate the two types of features for joint enhancement, so that both types of information are better extracted and the relationship between them is learned during enhancement. A full-resolution low-illumination image enhancement network for aggregating context and enhancing details is designed, in which a full-resolution detail extraction module, a frequency-space domain context information attention module and a feature aggregation and enhancement module are respectively responsible for extracting detail features, extracting frequency-space domain context features, and aggregating and cooperatively enhancing the two kinds of features; this differs from other methods that solve the problems present in low-illumination images independently.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a flow chart of an implementation of a method of an embodiment of the invention.
Fig. 2 is a block diagram of a full resolution low-light image enhancement network that aggregates context and enhanced details in an embodiment of the invention.
Fig. 3 is a structural diagram of a multi-scale feature extraction sub-module in the embodiment of the present invention.
Fig. 4 is a structural diagram of a frequency-space domain feature fusion submodule in an embodiment of the present invention.
Fig. 5 is a structural diagram of a cooperative enhancement sub-module in the embodiment of the present invention.
Detailed Description
In order to make the features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail as follows:
it should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
The embodiment is further described in detail with reference to the accompanying drawings:
the invention provides a full-resolution low-illumination image enhancement method for aggregating context and enhancing details, which comprises the following steps as shown in figures 1-5:
Step A, preprocessing the data, including data pairing, random cropping and data augmentation, to obtain a training data set;
Step B, designing a full-resolution low-illumination image enhancement network for aggregating context and enhancing details, where the network consists of a full-resolution detail extraction module, a frequency-space domain context information attention module and a feature aggregation and enhancement module;
Step C, designing a loss function to guide the parameter optimization of the network designed in step B;
Step D, training the network of step B with the training data set obtained in step A until convergence to a Nash equilibrium, obtaining a trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details;
Step E, inputting the low-illumination image to be processed into the trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details, and outputting the enhanced normal-illumination image.
Further, step A comprises the following steps:
Step A1, pairing each low-illumination image with its corresponding label image;
Step A2, randomly cropping each low-illumination image of size h×w×3 to an image of size p×p×3, and applying the same random crop to the corresponding label image, where h and w are the height and width of the low-illumination and label images and p is the height and width of the cropped image;
Step A3, randomly applying one of the following eight augmentation modes to each paired training image: keeping the original image, vertical flip, rotation by 90 degrees, rotation by 90 degrees followed by vertical flip, rotation by 180 degrees, rotation by 180 degrees followed by vertical flip, rotation by 270 degrees, and rotation by 270 degrees followed by vertical flip.
Further, step B comprises the steps of:
Step B1, constructing a full-resolution detail extraction module, composed of a shallow feature extraction submodule, a CBAM-based attention submodule and a frequency-domain transform submodule, and extracting detail features with the designed network;
Step B2, designing a frequency-space domain context information attention module, composed of multi-scale feature extraction submodules and a frequency-space domain feature fusion submodule, and extracting context features with the designed network;
Step B3, designing a feature aggregation and enhancement module, composed of a feature aggregation convolution block and a cooperative enhancement submodule, which aggregates the detail features extracted in step B1 and the context features extracted in step B2 and enhances the two types of features together;
Step B4, designing the full-resolution low-illumination image enhancement network for aggregating context and enhancing details, which comprises the full-resolution detail extraction module, the frequency-space domain context information attention module and the feature aggregation and enhancement module.
Further, step B1 comprises the steps of:
Step B11, designing a shallow feature extraction submodule. Its input is the low-illumination image I, which passes through a 3×3 convolution to obtain an initial feature map F_ori and then enters three branches: the first branch contains 1 serial 3×3 convolution, the second branch contains 2 serial 3×3 convolutions, and the third branch contains 3 serial 3×3 convolutions. The outputs F_B1, F_B2, F_B3 of the three branches are concatenated along the channel dimension and passed through a 3×3 convolution to obtain the feature map F_low output by the shallow feature extraction submodule. The specific formulas are as follows:
F_ori = Conv3(I)
F_B1 = Conv3(F_ori)
F_B2 = Conv3(Conv3(F_ori))
F_B3 = Conv3(Conv3(Conv3(F_ori)))
F_low = Conv3(Concat(F_B1, F_B2, F_B3))
where Conv3 is a 3×3 convolution and Concat is concatenation along the channel dimension;
Step B12, constructing a CBAM-based attention submodule, composed of channel attention Att_c and spatial attention Att_s in series. The input feature map is the feature map F_low obtained in step B11, and the output of the CBAM-based attention submodule is the feature map F_spa. The specific formula is as follows:
F_spa = Att_s(Att_c(F_low))
where Att_c is channel attention and Att_s is spatial attention.
Step B13, designing a frequency-domain transform submodule. The input feature map is the feature map F_spa obtained in step B12; it is converted from the spatial domain to the frequency domain by a Fourier transform, passed sequentially through a 3×3 convolution, a batch normalization layer and a ReLU activation function, and then converted back from the frequency domain to the spatial domain by an inverse Fourier transform to obtain the output feature map F_fre. The specific formula is as follows:
F_fre = idft(ReLU(BN(Conv3(dft(F_spa)))))
where dft is the Fourier transform, idft is the inverse Fourier transform, ReLU is the ReLU activation function, BN is the batch normalization layer, and Conv3 is a 3×3 convolution.
Step B14, constructing the full-resolution detail extraction module, composed of the shallow feature extraction submodule, the CBAM-based attention submodule and the frequency-domain transform submodule. Its input is the low-illumination image I processed in step A, and the feature maps F_low, F_spa, F_fre are obtained after the shallow feature extraction, attention and frequency-domain transform submodules are applied in sequence.
Further, step B2 comprises the steps of:
Step B21, designing a multi-scale feature extraction submodule. Its input feature map is denoted F ∈ R^(H×W×C), where H, W and C are the height, width and number of channels of F. F first passes through an average pooling layer with kernel size 2×2 and stride 2, and then through a 1×1 convolution, a ReLU activation function, a 1×1 convolution and a ReLU activation function for dimensionality reduction, giving an intermediate feature map F_1. F_1 is then split into two branches. The upper branch continues the dimensionality reduction with a 1×1 convolution and passes through an upsampling layer to give the upper-branch output F_11 ∈ R^(H×W×a), where a is the number of channels after reduction. The other branch passes through an average pooling layer with kernel size 2×2 and stride 2 and then, in sequence, a 1×1 convolution, a ReLU activation function, a 1×1 convolution and a ReLU activation function for dimensionality reduction, giving an intermediate feature map F_121; F_121 then passes through an upsampling layer, a 1×1 convolution, a ReLU activation function and an upsampling layer to give the lower-branch output F_12 ∈ R^(H×W×a). F_11 and F_12 are added, the sum is concatenated with F along the channel dimension, and after an SE module the channels are adjusted by a 1×1 convolution to obtain the feature map F_m output by the multi-scale feature extraction submodule. The specific formulas are as follows:
F_1 = ReLU(Conv1(ReLU(Conv1(Avgpooling(F)))))
F_11 = Upsampling(Conv1(F_1))
F_121 = ReLU(Conv1(ReLU(Conv1(Avgpooling(F_1)))))
F_12 = Upsampling(ReLU(Conv1(Upsampling(F_121))))
F_m = Conv1(SE(Concat(F_11 + F_12, F)))
where ReLU is the ReLU activation function, Conv1 is a 1×1 convolution, SE(·) is an SE module, Avgpooling is an average pooling layer with kernel size 2×2 and stride 2, Upsampling is a twofold nearest-neighbour upsampling layer, and Concat is concatenation along the channel dimension;
Step B22, designing a frequency-space domain feature fusion submodule, formed by channel attention and spatial attention connected in series;
Step B23, designing the frequency-space domain context information attention module, which consists of three multi-scale feature extraction submodules and one frequency-space domain feature fusion submodule. The inputs of the three multi-scale feature extraction submodules are the three feature maps F_low, F_spa, F_fre obtained in step B1; each is processed by the multi-scale feature extraction submodule designed in step B21 to obtain the context-aware feature maps F_low_m, F_spa_m, F_fre_m, which then pass through the frequency-space domain feature fusion submodule designed in step B22 to obtain the feature map F_f output by the frequency-space domain context information attention module.
Further, step B22 includes the steps of:
Step B221, designing the channel attention. Its inputs are the feature maps F_low_m, F_spa_m, F_fre_m ∈ R^(H×W×C) obtained in step B23. Each of the three feature maps undergoes global average pooling over the spatial dimensions, giving three vectors of size 1×1×C, which are concatenated along the channel dimension into an intermediate feature map F_c ∈ R^(1×1×3C). F_c passes sequentially through a 1×1 convolution, a ReLU activation function, a 1×1 convolution, a ReLU activation function and a 1×1 convolution for dimensionality reduction and restoration, and then through a Sigmoid activation function to obtain the channel-dimension weights F_W1 ∈ R^(1×1×3C). F_W1 is decomposed along the channel dimension into three vectors F_W10, F_W11, F_W12 of size 1×1×C, which are multiplied with the input feature maps F_low_m, F_spa_m, F_fre_m of the frequency-space domain feature fusion submodule, respectively, to obtain the channel-attention output feature maps F_low_c, F_spa_c, F_fre_c. The specific formulas are as follows:
F_c = Concat(Avgpooling_s(F_low_m), Avgpooling_s(F_spa_m), Avgpooling_s(F_fre_m))
F_W1 = Sigmoid(Conv1(ReLU(Conv1(ReLU(Conv1(F_c))))))
F_low_c = F_W10 × F_low_m
F_spa_c = F_W11 × F_spa_m
F_fre_c = F_W12 × F_fre_m
where Concat is concatenation along the channel dimension, Avgpooling_s is global average pooling over the spatial dimensions, ReLU is the ReLU activation function, Conv1 is a 1×1 convolution, and Sigmoid is the Sigmoid activation function;
Step B222, designing the spatial attention. Its inputs are the three feature maps F_low_c, F_spa_c, F_fre_c obtained in step B221. Each of the three feature maps undergoes average pooling over the channel dimension, giving three feature maps of size H×W×1, which are concatenated along the channel dimension into an intermediate feature map F_s ∈ R^(H×W×3). F_s passes sequentially through an average pooling layer with kernel size 2×2 and stride 2, a ReLU activation function and an upsampling layer, and then through a Sigmoid activation function to obtain the spatial-dimension weights F_W2 ∈ R^(H×W×3). F_W2 is decomposed into three feature maps F_W20, F_W21, F_W22 of size H×W×1, which are multiplied with the spatial-attention input feature maps F_low_c, F_spa_c, F_fre_c, respectively, to obtain the spatial-attention output feature maps F_low_s, F_spa_s, F_fre_s. The specific formulas are as follows:
F_s = Concat(Avgpooling_c(F_low_c), Avgpooling_c(F_spa_c), Avgpooling_c(F_fre_c))
F_W2 = Sigmoid(Upsampling(ReLU(Avgpooling(F_s))))
F_low_s = F_W20 × F_low_c
F_spa_s = F_W21 × F_spa_c
F_fre_s = F_W22 × F_fre_c
where Concat is concatenation along the channel dimension, Avgpooling_c is average pooling over the channel dimension, ReLU is the ReLU activation function, Sigmoid is the Sigmoid activation function, Avgpooling is an average pooling layer with kernel size 2×2 and stride 2, and Upsampling is a twofold nearest-neighbour upsampling layer;
Step B223, designing the frequency-space domain feature fusion submodule. Its inputs are the feature maps F_low_m, F_spa_m, F_fre_m obtained in step B23. The three feature maps pass through the channel attention of step B221 to obtain the feature maps F_low_c, F_spa_c, F_fre_c, then through the spatial attention of step B222 to obtain the feature maps F_low_s, F_spa_s, F_fre_s, and the three results are added to obtain the final output F_f. The specific formula is as follows:
F_f = F_low_s + F_spa_s + F_fre_s
Further, step B3 is implemented as follows:
Step B31, designing a feature aggregation convolution block to fuse the detail information and the context information. Its inputs are the feature map F_fre obtained in step B1 and the feature map F_f obtained in step B2; the two are concatenated along the channel dimension and passed through a 3×3 convolution to obtain the output feature map F_conv. The specific formula is as follows:
F_conv = Conv3(Concat(F_fre, F_f))
where Conv3 is a 3×3 convolution and Concat is concatenation along the channel dimension;
Step B32, designing a cooperative enhancement submodule to cooperatively enhance the fused detail and context information. Its input is the feature map F_conv obtained in step B31. F_conv passes sequentially through a 1×1 convolution, a ReLU6 activation function, a Dropout layer, a 1×1 convolution and a Dropout layer, and is added to F_conv to obtain the intermediate feature map F_mid; F_mid then passes through a LeakyReLU activation function, is concatenated with F_conv along the channel dimension, and passes through a 3×3 convolution to obtain the output feature map F_co. The specific formulas are as follows:
F_mid = Dropout(Conv1(Dropout(ReLU6(Conv1(F_conv))))) + F_conv
F_co = Conv3(Concat(LeakyReLU(F_mid), F_conv))
where Conv1 is a 1×1 convolution, Conv3 is a 3×3 convolution, Concat is concatenation along the channel dimension, Dropout is a random deactivation (dropout) layer, ReLU6 is the ReLU6 activation function, and LeakyReLU is the LeakyReLU activation function;
Step B33, designing the feature aggregation and enhancement module, composed of the feature aggregation convolution block and the cooperative enhancement submodule. Its inputs are the feature map F_fre obtained in step B1 and the feature map F_f obtained in step B2; the feature map F_conv is obtained after the feature aggregation convolution block, and the feature map F_co is obtained after the cooperative enhancement submodule.
Further, step B4 is implemented as follows:
Step B4, designing the full-resolution low-illumination image enhancement network for aggregating context and enhancing details, which integrates the full-resolution detail extraction module, the frequency-space domain context information attention module and the feature aggregation and enhancement module. The input is a low-illumination image I; the three feature maps F_low, F_spa, F_fre are obtained after the full-resolution detail extraction module of step B1, the feature map F_f is obtained after the frequency-space domain context information attention module, and the feature map F_co is obtained after the feature aggregation and enhancement module. F_co is then concatenated with the feature map F_low of step B1 along the channel dimension and passed through a 3×3 convolution to obtain the final enhanced image I_out. The specific formula is as follows:
I_out = Conv3(Concat(F_co, F_low))
where Conv3 is a 3×3 convolution and Concat is concatenation along the channel dimension.
Further, step C is implemented as follows:
Step C, designing a loss function composed of an L2 loss and a VGG perceptual loss; the overall target loss function of the network is:
l = ω_1 · ||I_out − G||_2 + ω_2 · ||Φ(I_out) − Φ(G)||_1
where Φ(·) denotes extracting the Conv4-1 layer features with a VGG-16 classification model pre-trained on the ImageNet dataset, I_out denotes the enhanced image of the low-illumination image I, G denotes the label image corresponding to the low-illumination image I, ||·||_1 denotes the L1 loss, ||·||_2 denotes the L2 loss, and ω_1, ω_2 are weights.
Further, step D is implemented as follows:
Step D1, randomly dividing the training data set obtained in step A into several batches, each batch containing N pairs of images;
Step D2, inputting a low-illumination image I, obtaining the enhanced image I_out after the full-resolution low-illumination image enhancement network for aggregating context and enhancing details of step B, and computing the loss l with the formula of step C;
Step D3, computing the gradients of the network parameters by back-propagation according to the loss, and updating the network parameters with the Adam optimization method.
Step D4, repeating steps D1 to D3 batch by batch until the target loss function value of the network converges to a Nash equilibrium, and saving the network parameters to obtain the full-resolution low-illumination image enhancement model for aggregating context and enhancing details.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention will still fall within the protection scope of the technical solution of the present invention.
The present invention is not limited to the above preferred embodiments; anyone may, in light of this patent disclosure, derive other full-resolution low-illumination image enhancement methods that aggregate context and enhance details.
Claims (10)
1. A full-resolution low-illumination image enhancement method that aggregates context and enhances details, characterized by:
Step A, preprocessing the data, including data pairing, random cropping and data augmentation, to obtain a training data set;
Step B, designing a full-resolution low-illumination image enhancement network for aggregating context and enhancing details, comprising: a full-resolution detail extraction module, a frequency-space domain context information attention module and a feature aggregation and enhancement module;
Step C, designing a loss function to guide the parameter optimization of the network designed in step B;
Step D, training the network of step B with the training data set obtained in step A until convergence to a Nash equilibrium, obtaining a trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details;
Step E, inputting the low-illumination image to be processed into the trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details, and outputting the enhanced normal-illumination image.
2. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 1, wherein step A is implemented as follows:
Step A1, pairing each low-illumination image with its corresponding label image;
Step A2, randomly cropping each low-illumination image of size h×w×3 to an image of size p×p×3, and applying the same random crop to the corresponding label image, where h and w are the height and width of the low-illumination and label images and p is the height and width of the cropped image;
Step A3, randomly applying one of the following eight augmentation modes to each paired training image: keeping the original image, vertical flip, rotation by 90 degrees, rotation by 90 degrees followed by vertical flip, rotation by 180 degrees, rotation by 180 degrees followed by vertical flip, rotation by 270 degrees, and rotation by 270 degrees followed by vertical flip.
3. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 1, wherein the step B is implemented as follows:
Step B1, constructing a full-resolution detail extraction module, composed of a shallow feature extraction submodule, a CBAM-based attention submodule and a frequency-domain transform submodule, and extracting detail features with the designed network;
Step B2, designing a frequency-space domain context information attention module, composed of multi-scale feature extraction submodules and a frequency-space domain feature fusion submodule, and extracting context features with the designed network;
Step B3, designing a feature aggregation and enhancement module, composed of a feature aggregation convolution block and a cooperative enhancement submodule, which aggregates the detail features extracted in step B1 and the context features extracted in step B2 and enhances the two types of features together;
Step B4, designing the full-resolution low-illumination image enhancement network for aggregating context and enhancing details, which comprises the full-resolution detail extraction module, the frequency-space domain context information attention module and the feature aggregation and enhancement module.
4. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 3, wherein the step B1 is implemented as follows:
step B11, designing a shallow feature extraction submodule, inputting the shallow feature extraction submodule into a low-illumination image I, and generating a low-illumination image I after 3 timesConvolution to obtain an initial feature map F ori Then, three branches are entered, the first branch containing 13 × 3 convolution, the second branch containing 2 serial 3 × 3 convolution, and the third branch containing 3 serial 3 × 3 convolution, and the processing results F of the three branches are processed B1 、F B2 、F B3 After splicing along the channel dimension, obtaining a feature graph F output by the shallow feature extraction submodule through a3 multiplied by 3 convolution low (ii) a The specific formula is as follows:
F ori =Conv3(I)
F B1 =Conv3(F ori )
F B2 =Conv3(Conv3(F ori ))
F B3 =Conv3(Conv3(Conv3(F ori )))
F low =Conv3(Concat(F B1 ,F B2 ,F B3 ))
where Conv3 is a3 × 3 convolution and Concat is a stitching operation along the channel dimension;
step B12, constructing a CBAM-based attention submodule according to the attention Att of the serial channel dimension c And attention Att in spatial dimension s Composition, the input characteristic diagram is the characteristic diagram F obtained in the step B11 low And obtaining a characteristic diagram F of the output of the attention submodule based on the CBAM spa (ii) a The specific formula is as follows:
F_spa = Att_s(Att_c(F_low))
wherein Att_c is the attention over the channel dimension and Att_s is the attention over the spatial dimension;
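A minimal sketch of such a serial channel/spatial attention, following the standard CBAM formulation, could be written as below; the reduction ratio and the 7×7 spatial kernel are assumptions not fixed by the claim:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CBAMAttention(nn.Module):
    """Sketch of the serial channel/spatial attention of step B12 (standard CBAM form)."""
    def __init__(self, channels=32, reduction=4):  # reduction ratio is an assumption
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)  # 7x7 kernel is an assumption

    def forward(self, x):
        # Att_c: channel attention from pooled descriptors
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))
        x = x * torch.sigmoid(avg + mx)
        # Att_s: spatial attention from channel-wise statistics
        stats = torch.cat([x.mean(dim=1, keepdim=True),
                           x.max(dim=1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.spatial(stats))
```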
step B13, designing a frequency-domain transformation submodule whose input is the feature map F_spa obtained in step B12; F_spa is converted from the spatial domain to the frequency domain with a Fourier transform, passed sequentially through a 3×3 convolution, a normalization layer and a ReLU activation function, and then converted back from the frequency domain to the spatial domain with an inverse Fourier transform to obtain the output feature map F_fre; the specific formula is as follows:
F_fre = idft(ReLU(BN(Conv3(dft(F_spa)))))
where dft is the Fourier transform, idft is the inverse Fourier transform, ReLU is the ReLU activation function, BN is the batch normalization layer, and Conv3 is a 3×3 convolution;
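A minimal sketch of step B13 with torch.fft is given below; the claim only names dft → Conv3 → BN → ReLU → idft, so stacking the real and imaginary parts of the spectrum as extra channels (a common way to make a real-valued 3×3 convolution applicable to complex spectra) is an assumption here:

```python
import torch
import torch.nn as nn

class FrequencyTransform(nn.Module):
    """Sketch of the frequency-domain transformation submodule (step B13)."""
    def __init__(self, channels=32):  # channel width is an assumption
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, 3, padding=1),  # Conv3 on real+imag channels
            nn.BatchNorm2d(2 * channels),                         # BN
            nn.ReLU(inplace=True),                                # ReLU
        )

    def forward(self, x):
        n, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")                   # dft
        feat = torch.cat([spec.real, spec.imag], dim=1)           # (N, 2C, H, W//2+1)
        feat = self.block(feat)
        real, imag = feat.chunk(2, dim=1)
        spec = torch.complex(real, imag)
        return torch.fft.irfft2(spec, s=(h, w), norm="ortho")     # idft, back to (N, C, H, W)
```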
step B14, constructing the full-resolution detail extraction module, which consists of the shallow feature extraction submodule, the CBAM-based attention submodule and the frequency-domain transformation submodule; the input is the low-illumination image I processed in step A, and the feature maps F_low, F_spa and F_fre are obtained after I is sequentially processed by the shallow feature extraction submodule, the attention submodule and the frequency-domain transformation submodule.
5. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 4, wherein the step B2 is implemented by the following steps:
step B21, designing a multi-scale feature extraction submodule whose input feature map is denoted F, wherein H, W and C are the height, width and number of channels of the feature map F; F first passes through an average pooling layer with kernel size 2×2 and stride 2, and is then dimension-reduced sequentially through a 1×1 convolution, a ReLU activation function, a 1×1 convolution and a ReLU activation function to obtain an intermediate feature map F_1; F_1 then enters two branches: the upper branch further reduces the dimension through a 1×1 convolution and passes through an upsampling layer to obtain the upper-branch output F_11, where a is the number of channels after dimensionality reduction; the other branch passes through an average pooling layer with kernel size 2×2 and stride 2 and is dimension-reduced sequentially through a 1×1 convolution, a ReLU activation function, a 1×1 convolution and a ReLU activation function to obtain an intermediate feature map F_121, which then passes sequentially through an upsampling layer, a 1×1 convolution, a ReLU activation function and an upsampling layer to obtain the lower-branch output F_12; F_11 and F_12 are added, concatenated with F along the channel dimension, passed through an SE module, and then through a 1×1 convolution that adjusts the channels to obtain the feature map F_m output by the multi-scale feature extraction submodule; the specific formulas are as follows:
F_1 = ReLU(Conv1(ReLU(Conv1(Avgpooling(F)))))
F_11 = Upsampling(Conv1(F_1))
F_121 = ReLU(Conv1(ReLU(Conv1(Avgpooling(F_1)))))
F_12 = Upsampling(ReLU(Conv1(Upsampling(F_121))))
F_m = Conv1(SE(Concat(F_11 + F_12, F)))
wherein ReLU is the activation function, Conv1 is a 1×1 convolution, SE(·) is an SE module, Avgpooling is an average pooling layer with kernel size 2×2 and stride 2, Upsampling is a ×2 nearest-neighbor upsampling layer, and Concat is a concatenation operation along the channel dimension;
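A minimal sketch of the multi-scale feature extraction submodule of step B21 is given below; the reduction factor a and the SE reduction ratio are assumptions, as the claim does not fix the channel counts:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation block used inside the submodule."""
    def __init__(self, channels, reduction=4):  # reduction ratio is an assumption
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)

class MultiScaleExtraction(nn.Module):
    """Sketch of the multi-scale feature extraction submodule (step B21)."""
    def __init__(self, channels=32, a=2):  # channel width and reduction factor a are assumptions
        super().__init__()
        r = channels // a
        pool = lambda: nn.AvgPool2d(kernel_size=2, stride=2)
        up = lambda: nn.Upsample(scale_factor=2, mode="nearest")
        self.down1 = nn.Sequential(pool(), nn.Conv2d(channels, r, 1), nn.ReLU(True),
                                   nn.Conv2d(r, r, 1), nn.ReLU(True))                    # F_1
        self.up_branch = nn.Sequential(nn.Conv2d(r, r, 1), up())                         # F_11
        self.down2 = nn.Sequential(pool(), nn.Conv2d(r, r, 1), nn.ReLU(True),
                                   nn.Conv2d(r, r, 1), nn.ReLU(True))                    # F_121
        self.low_branch = nn.Sequential(up(), nn.Conv2d(r, r, 1), nn.ReLU(True), up())   # F_12
        self.se = SEBlock(r + channels)
        self.out_conv = nn.Conv2d(r + channels, channels, 1)                             # F_m

    def forward(self, f):
        f1 = self.down1(f)
        f11 = self.up_branch(f1)
        f12 = self.low_branch(self.down2(f1))
        return self.out_conv(self.se(torch.cat([f11 + f12, f], dim=1)))
```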
step B22, designing a frequency-space domain feature fusion submodule formed by serially connecting channel attention and spatial attention;
step B23, designing a frequency-space domain context information attention module, which consists of three multi-scale feature extraction submodules and one frequency-space domain feature fusion submodule; the inputs of the three multi-scale feature extraction submodules are the three feature maps F_low, F_spa and F_fre obtained in step B1, which are processed separately by the multi-scale feature extraction submodule designed in step B21 to obtain the feature maps F_low_m, F_spa_m and F_fre_m carrying context information; the frequency-space domain feature fusion submodule designed in step B22 then yields the feature map F_f output by the frequency-space domain context information attention module.
6. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 5, wherein the step B22 is implemented by the following steps:
step B221, designing the channel attention; its inputs are the feature maps F_low_m, F_spa_m and F_fre_m obtained in step B23; the three feature maps are each globally average-pooled over the spatial dimensions to obtain three vectors of size 1×1×C, which are concatenated along the channel dimension to obtain an intermediate feature map F_c; F_c is dimension-reduced and dimension-raised sequentially through a 1×1 convolution, a ReLU activation function, a 1×1 convolution, a ReLU activation function and a 1×1 convolution, and the channel-dimension weights F_W1 are obtained through a Sigmoid activation function; F_W1 is decomposed along the channel dimension into three 1×1×C vectors F_W10, F_W11 and F_W12, which are multiplied with the input feature maps F_low_m, F_spa_m and F_fre_m of the frequency-space domain feature fusion submodule, respectively, to obtain the channel-attention output feature maps F_low_c, F_spa_c and F_fre_c; the specific formulas are as follows:
F_c = Concat(Avgpooling_s(F_low_m), Avgpooling_s(F_spa_m), Avgpooling_s(F_fre_m))
F_W1 = Sigmoid(Conv1(ReLU(Conv1(ReLU(Conv1(F_c))))))
F_low_c = F_W10 × F_low_m
F_spa_c = F_W11 × F_spa_m
F_fre_c = F_W12 × F_fre_m
wherein Concat is a concatenation operation along the channel dimension, Avgpooling_s is global average pooling over the spatial dimensions, ReLU is the activation function, Conv1 is a 1×1 convolution, and Sigmoid is the Sigmoid activation function;
step B222, designing the spatial attention; its inputs are the three feature maps F_low_c, F_spa_c and F_fre_c obtained in step B221; the three feature maps are each average-pooled over the channel dimension to obtain three H×W×1 feature maps, which are concatenated along the channel dimension to obtain an intermediate feature map F_s; F_s passes sequentially through an average pooling layer with kernel size 2×2 and stride 2, a ReLU activation function and an upsampling layer, and the spatial-dimension weights F_W2 are obtained through a Sigmoid activation function; F_W2 is decomposed into three H×W×1 feature maps F_W20, F_W21 and F_W22, which are multiplied with the spatial-attention input feature maps F_low_c, F_spa_c and F_fre_c, respectively, to obtain the spatial-attention output feature maps F_low_s, F_spa_s and F_fre_s; the specific formulas are as follows:
F_s = Concat(Avgpooling_c(F_low_c), Avgpooling_c(F_spa_c), Avgpooling_c(F_fre_c))
F_W2 = Sigmoid(Upsampling(ReLU(Avgpooling(F_s))))
F_low_s = F_W20 × F_low_c
F_spa_s = F_W21 × F_spa_c
F_fre_s = F_W22 × F_fre_c
wherein Concat is a concatenation operation along the channel dimension, Avgpooling_c is average pooling over the channel dimension, ReLU is the activation function, Sigmoid is the Sigmoid activation function, Avgpooling is an average pooling layer with kernel size 2×2 and stride 2, and Upsampling is a ×2 nearest-neighbor upsampling layer;
step B223, designing the frequency-space domain feature fusion submodule; its inputs are the feature maps F_low_m, F_spa_m and F_fre_m obtained in step B23; the three feature maps first pass through the channel attention of step B221 to obtain the feature maps F_low_c, F_spa_c and F_fre_c, then through the spatial attention of step B222 to obtain the feature maps F_low_s, F_spa_s and F_fre_s, and these three feature maps are added to obtain the final output F_f; the specific formula is as follows:
F_f = F_low_s + F_spa_s + F_fre_s
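A minimal sketch of the frequency-space domain feature fusion submodule of steps B221–B223 is given below; the channel width and the 1×1-convolution reduction ratio are assumptions:

```python
import torch
import torch.nn as nn

class FreqSpatialFusion(nn.Module):
    """Sketch of the frequency-space domain feature fusion submodule (steps B221-B223)."""
    def __init__(self, channels=32, reduction=4):  # width and reduction ratio are assumptions
        super().__init__()
        c3 = 3 * channels
        self.channel_mlp = nn.Sequential(                       # Conv1-ReLU-Conv1-ReLU-Conv1-Sigmoid
            nn.Conv2d(c3, c3 // reduction, 1), nn.ReLU(True),
            nn.Conv2d(c3 // reduction, c3 // reduction, 1), nn.ReLU(True),
            nn.Conv2d(c3 // reduction, c3, 1), nn.Sigmoid(),
        )
        self.spatial_weight = nn.Sequential(                    # Avgpooling-ReLU-Upsampling-Sigmoid
            nn.AvgPool2d(kernel_size=2, stride=2), nn.ReLU(True),
            nn.Upsample(scale_factor=2, mode="nearest"), nn.Sigmoid(),
        )

    def forward(self, f_low_m, f_spa_m, f_fre_m):
        feats = [f_low_m, f_spa_m, f_fre_m]
        # channel attention (step B221)
        descr = torch.cat([f.mean(dim=(2, 3), keepdim=True) for f in feats], dim=1)  # F_c
        w1 = self.channel_mlp(descr).chunk(3, dim=1)                                  # F_W10..F_W12
        feats = [f * w for f, w in zip(feats, w1)]                                    # F_*_c
        # spatial attention (step B222)
        stats = torch.cat([f.mean(dim=1, keepdim=True) for f in feats], dim=1)        # F_s
        w2 = self.spatial_weight(stats).chunk(3, dim=1)                               # F_W20..F_W22
        feats = [f * w for f, w in zip(feats, w2)]                                    # F_*_s
        return sum(feats)                                                             # F_f
```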
7. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 6, wherein the step B3 is implemented by the following steps:
step B31, designing a feature aggregation convolution block to fuse the detail information and the context information; its inputs are the feature map F_fre obtained in step B1 and the feature map F_f obtained in step B2, which are concatenated along the channel dimension and passed through a 3×3 convolution to obtain the output feature map F_conv; the specific formula is as follows:
F_conv = Conv3(Concat(F_fre, F_f))
where Conv3 is a 3×3 convolution and Concat is a concatenation operation along the channel dimension;
step B32, designing a cooperative enhancement submodule to jointly enhance the fused detail and context information; its input feature map is F_conv obtained in step B31; F_conv is passed sequentially through a 1×1 convolution, a ReLU6 activation function, a Dropout layer, a 1×1 convolution and a Dropout layer, and added to F_conv to obtain an intermediate feature map F_mid; F_mid then passes through a LeakyReLU activation function, is concatenated with F_conv along the channel dimension, and passes through a 3×3 convolution to obtain the output feature map F_co; the specific formulas are as follows:
F_mid = Dropout(Conv1(Dropout(ReLU6(Conv1(F_conv))))) + F_conv
F_co = Conv3(Concat(LeakyReLU(F_mid), F_conv))
wherein Conv1 is a 1×1 convolution, Conv3 is a 3×3 convolution, Concat is a concatenation operation along the channel dimension, Dropout is a random-deactivation (dropout) layer, ReLU6 is the ReLU6 activation function, and LeakyReLU is the LeakyReLU activation function;
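A minimal sketch of the feature aggregation convolution block and the cooperative enhancement submodule of steps B31–B32 might look as follows; the channel width, dropout rate and LeakyReLU slope are assumptions:

```python
import torch
import torch.nn as nn

class AggregationAndEnhancement(nn.Module):
    """Sketch of the feature aggregation block and cooperative enhancement submodule (steps B31-B32)."""
    def __init__(self, channels=32, p_drop=0.1):  # width and dropout rate are assumptions
        super().__init__()
        self.aggregate = nn.Conv2d(2 * channels, channels, 3, padding=1)   # produces F_conv
        self.enhance = nn.Sequential(                                      # Conv1-ReLU6-Dropout-Conv1-Dropout
            nn.Conv2d(channels, channels, 1), nn.ReLU6(True), nn.Dropout(p_drop),
            nn.Conv2d(channels, channels, 1), nn.Dropout(p_drop),
        )
        self.act = nn.LeakyReLU(0.2, inplace=True)                         # slope is an assumption
        self.out_conv = nn.Conv2d(2 * channels, channels, 3, padding=1)    # produces F_co

    def forward(self, f_fre, f_f):
        f_conv = self.aggregate(torch.cat([f_fre, f_f], dim=1))            # F_conv (step B31)
        f_mid = self.enhance(f_conv) + f_conv                              # F_mid (step B32)
        return self.out_conv(torch.cat([self.act(f_mid), f_conv], dim=1))  # F_co
```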
step B33, designing the feature aggregation and enhancement module, which consists of the feature aggregation convolution block and the cooperative enhancement submodule; its input feature maps are the feature map F_fre obtained in step B1 and the feature map F_f obtained in step B2, the feature map F_conv is obtained after the feature aggregation convolution block, and the feature map F_co is obtained after the cooperative enhancement submodule.
8. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 7, wherein the step B4 is implemented as follows:
step B4, designing the full-resolution low-illumination image enhancement network for aggregating context and enhancing details, which integrates the full-resolution detail extraction module, the frequency-space domain context information attention module and the feature aggregation and enhancement module; the input is the low-illumination image I; the three feature maps F_low, F_spa and F_fre are obtained after the full-resolution detail extraction module of step B1, the feature map F_f is obtained after the frequency-space domain context information attention module, and the feature map F_co is obtained after the feature aggregation and enhancement module; F_co is then concatenated with the feature map F_low of step B1 along the channel dimension and passed through a 3×3 convolution to obtain the final enhanced image I_out; the specific formula is as follows:
I_out = Conv3(Concat(F_co, F_low))
where Conv3 is a 3×3 convolution and Concat is a concatenation operation along the channel dimension.
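To show how the pieces fit together, a sketch of the overall network of step B4 is given below, wiring together the illustrative classes sketched earlier (ShallowFeatureExtraction, CBAMAttention, FrequencyTransform, MultiScaleExtraction, FreqSpatialFusion, AggregationAndEnhancement are those assumed definitions, not names from the claim):

```python
import torch
import torch.nn as nn

class FullResolutionEnhancementNet(nn.Module):
    """Sketch of the full-resolution enhancement network of step B4."""
    def __init__(self, channels=32):  # channel width is an assumption
        super().__init__()
        self.shallow = ShallowFeatureExtraction(channels)
        self.cbam = CBAMAttention(channels)
        self.freq = FrequencyTransform(channels)
        self.ms_low = MultiScaleExtraction(channels)
        self.ms_spa = MultiScaleExtraction(channels)
        self.ms_fre = MultiScaleExtraction(channels)
        self.fusion = FreqSpatialFusion(channels)
        self.agg = AggregationAndEnhancement(channels)
        self.out_conv = nn.Conv2d(2 * channels, 3, 3, padding=1)

    def forward(self, img):
        f_low = self.shallow(img)                              # step B11
        f_spa = self.cbam(f_low)                               # step B12
        f_fre = self.freq(f_spa)                               # step B13
        f_f = self.fusion(self.ms_low(f_low),                  # steps B21-B23
                          self.ms_spa(f_spa),
                          self.ms_fre(f_fre))
        f_co = self.agg(f_fre, f_f)                            # step B3
        return self.out_conv(torch.cat([f_co, f_low], dim=1))  # I_out
```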
9. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 1, wherein the step C is implemented as follows:
and step C, designing a loss function consisting of an L2 loss and a VGG perceptual loss; the total target loss function of the network is:
l = ω_1 · ||I_out - G||_2 + ω_2 · ||Φ(I_out) - Φ(G)||_1
where Φ(·) represents the operation of extracting Conv4-1 layer features using a VGG-16 classification model pre-trained on the ImageNet dataset; I_out represents the enhanced image of the low-illumination image I, G represents the label image corresponding to the low-illumination image I, ||·||_1 represents the L1 loss, ||·||_2 represents the L2 loss, and ω_1, ω_2 are the weights.
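A minimal sketch of this loss in PyTorch/torchvision is given below; the weights ω_1, ω_2 are hyperparameters not fixed by the claim, the L2 term is interpreted as mean-squared error, the slice index for the Conv4-1 layer is an assumption about torchvision's VGG-16 layout, and ImageNet mean/std normalisation of the inputs is omitted for brevity:

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class EnhancementLoss(nn.Module):
    """Sketch of the loss of step C: weighted L2 loss plus VGG-16 Conv4-1 perceptual loss."""
    def __init__(self, w1=1.0, w2=0.1):  # weight values are assumptions
        super().__init__()
        # features[:18] ends after the Conv4-1 layer in torchvision's VGG-16 (assumed layout)
        vgg = vgg16(weights="IMAGENET1K_V1").features[:18].eval()
        for p in vgg.parameters():
            p.requires_grad_(False)
        self.vgg, self.w1, self.w2 = vgg, w1, w2

    def forward(self, i_out, gt):
        l2 = torch.mean((i_out - gt) ** 2)                              # ||I_out - G||_2 term
        lperc = torch.mean(torch.abs(self.vgg(i_out) - self.vgg(gt)))   # L1 on Conv4-1 features
        return self.w1 * l2 + self.w2 * lperc
```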
10. The full-resolution low-illumination image enhancement method for aggregating context and enhancing details according to claim 1, wherein the step D is implemented as follows:
step D1, randomly dividing the training data set obtained in the step A into a plurality of batches, wherein each batch comprises N pairs of images;
step D2, inputting the low-illumination image I into the full-resolution low-illumination image enhancement network for aggregating context and enhancing details designed in step B to obtain the enhanced image I_out, and calculating the loss l using the formula in step C;
step D3, calculating the gradients of the network parameters from the loss by back-propagation, and updating the network parameters with the Adam optimization method;
and step D4, repeatedly executing steps D1 to D3 batch by batch until the target loss function value of the network converges to a Nash equilibrium, and saving the network parameters to obtain the trained full-resolution low-illumination image enhancement model for aggregating context and enhancing details.
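A minimal sketch of the training procedure of steps D1–D4 is given below; the epoch count, batch size N and learning rate are assumptions, and the Nash-equilibrium convergence check of step D4 is simplified to a fixed number of epochs here:

```python
import torch
from torch.utils.data import DataLoader

def train(model, dataset, criterion, epochs=100, batch_size=8, lr=2e-4, device="cuda"):
    """Sketch of step D: mini-batches of N image pairs, Adam updates, save the trained model."""
    model = model.to(device)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)    # step D1
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):                                          # step D4 (simplified)
        for low, gt in loader:
            low, gt = low.to(device), gt.to(device)
            loss = criterion(model(low), gt)                             # step D2
            optimizer.zero_grad()
            loss.backward()                                              # step D3
            optimizer.step()
    torch.save(model.state_dict(), "enhancement_model.pth")              # illustrative file name
```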