CN114581337B - Low-light image enhancement method combining multi-scale feature aggregation and lifting strategies - Google Patents
Low-light image enhancement method combining multi-scale feature aggregation and lifting strategies
- Publication number: CN114581337B (application CN202210278847.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- feature
- images
- low
- representing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
Abstract
The invention relates to a low-light image enhancement method combining multi-scale feature aggregation and lifting strategies, and belongs to the technical field of image enhancement. It improves a low-light image enhancement model with an encoding-decoding architecture based on a convolutional neural network, and proposes a multi-scale feature aggregation module (FABM) combining a lifting strategy, and a noise removal module (BPM) with a pixel attention mechanism. The former is based on an error feedback mechanism and uses a back-projection technique, so that all previous features can be considered when the current features are aggregated; the latter can improve the signal-to-noise ratio of the image and model the relationship between the pixels of the image, helping the network better identify the image content and thereby emphasize commonality and remove differences.
Description
Technical Field
The invention relates to a low-light image enhancement method combining multi-scale feature aggregation and lifting strategies, and belongs to the technical field of image enhancement.
Background
Low-light image enhancement has been a research hotspot in computer vision in recent years. It is widely used in advanced vision tasks such as object detection and semantic segmentation, and also in real-world applications such as all-day autonomous driving, visual surveillance and computational photography. Low-light image enhancement technology can improve the visibility and contrast of photos taken in dim light, backlight and extremely low light, strengthen the details of the photo content and improve its aesthetic quality; low-light image enhancement therefore has strong practical significance and value. Conventional low-light enhancement algorithms fall into two families: those based on histogram equalization and those based on Retinex theory. The former expands the dynamic range of the image by equalizing its histogram, improving contrast; the latter decomposes the image into illumination and reflection components through some prior or regularization, treats the reflection component as the final enhancement result, and derives it by predicting the illumination component. Although both kinds of method can enhance low-light images to some extent, each has its limitations: histogram-equalization-based methods focus only on enhancing contrast and disregard the noise present in the image, while for Retinex-based methods it is difficult to find an effective prior or regularization. As a result, images enhanced by conventional algorithms may retain or even amplify noise, and may suffer from artifacts, color deviation, overexposure and similar problems. In recent years a large number of deep-learning-based low-light image enhancement methods have been proposed, with remarkable success.
Compared with traditional algorithms, deep-learning-based low-light image enhancement methods offer higher accuracy, better robustness and faster speed, benefiting from careful network design and large training datasets. However, almost no deep-learning-based method focuses on the detailed content of the image during enhancement, so detail information is lost in the enhanced result and the image becomes over-smoothed. In addition, most deep-learning-based methods preserve or even amplify the noise in the original image. In short, current deep-learning-based low-light enhancement methods cannot simultaneously guarantee the preservation of detail and the removal of noise, which limits the application scenarios and effectiveness of the technology. To ensure that the enhanced low-light result has natural colors, fully preserves rich details, removes redundant information, and is of higher quality, the existing model methods need to be improved.
Disclosure of Invention
The invention aims to improve a low-light image enhancement model with an encoding-decoding architecture based on a convolutional neural network, and provides a low-light image enhancement method combining multi-scale feature aggregation and lifting strategies.
The technical scheme of the invention comprises the following steps:
step 1, data collection and processing are carried out,
(1.1) Training the low-light image enhancement model requires sufficient image data; two open-source low-light image datasets are used for model training. The first dataset contains 916 low-light images and 1016 normal-light images, which are unpaired; the second dataset contains 2117 low-light images and 2117 normal-light images, which are paired;
(1.2) For the images in both datasets, training images with a resolution of 320×320 are obtained by random cropping, and then random horizontal flipping is applied for data augmentation. For testing, a total of 672 images from 6 open-source low-light image datasets are used; 5 of the test datasets are unpaired, and the remaining test dataset contains 440 low-light images, of which 100 are unpaired without corresponding reference images and the other 340 are paired with corresponding reference images;
Step 2, constructing the model,
(2.1) A low-light image enhancement model is designed based on an encoding-decoding U-net structure; the model consists of a generator and two discriminators: a global discriminator that discriminates the whole image, and a local discriminator that discriminates image blocks;
(2.2) In the encoding stage of the generator, each scale's features are aggregated to all subsequent levels using the multi-scale feature aggregation module, abbreviated as FABM; in the decoding stage, the features extracted by each level of the encoder are gathered and combined with the denoising module, abbreviated as BPM, to enhance the image;
(2.2.1) The multi-scale feature aggregation module, abbreviated as FABM: in the encoder, for the level-C feature map E_C, first compute its difference, across the different scales, with each of the previous (C-1) levels' aggregated feature maps E_{i_A} (i = 1, ..., C-1):

e_i = p^j(E_C) - E_{i_A}, i = 1, ..., C-1 (1)

where p denotes the projection operator, which upsamples E_C; j denotes the number of upsampling steps, so that E_C is upsampled to the same size as each E_{i_A} (i = 1, ..., C-1);
The back-projected differences e_i are then used to update the feature map E_C, yielding the aggregated feature E_{C_A}:

E_{C_A} = E_C + Σ_{i=1}^{C-1} bp^j(e_i) (2)

where bp denotes the back-projection operator, which downsamples the difference e_i; j denotes the number of downsampling steps, so that each e_i is downsampled to the same size as the feature map E_C;
The resulting feature E_{C_A} thus takes the features of all scales into account; if level C is the last level of the encoder, E_{C_A} is the encoder output after aggregating the features of every level;
Each level of the decoder then re-aggregates E_{C_A} in the same way, making full use of the aggregated features to obtain the enhanced feature D_{C_E}:

D_{C_E} = bp^{C-1}[p^{C-1}(D_C) - E_{C_A}] + D_C (3)

where D_C denotes the level-C feature map in the decoder, and bp and p denote the back-projection and projection operators, respectively; in the decoder, bp denotes upsampling and p denotes downsampling;
(2.2.2) The denoising module, abbreviated as BPM, uses the previous estimate to improve the current signal. First, the feature C_f currently to be enhanced and the previously enhanced feature P_f are added to obtain an enhanced feature S_f with a high signal-to-noise ratio; the higher the signal-to-noise ratio, the easier denoising becomes:

S_f = C_f + P_f (4)

this operation achieves an implicit, unconstrained fusion of C_f and P_f;
The enhanced feature S_f is then fed into the pixel attention block proposed by the invention, which compresses the feature map X ∈ R^{H×W×C} along the channel dimension to obtain a pixel attention map M_P ∈ R^{H×W×1}; this map is multiplied with the original feature map X to output a feature map X_P ∈ R^{H×W×C} in which every pixel is related to every other pixel:

X_P = X · σ(conv(avg(X))) (5)

where σ, conv and avg denote the sigmoid function, a convolution operation and an average pooling operation, respectively;
After the enhanced feature S_f passes through the pixel attention block, the relationships between the pixels of the feature map are established and the recovered, signal-enhanced result R_f is obtained:

R_f = P(S_f) (6)

where P denotes the pixel attention block proposed by the invention; R_f represents a signal in which the noise is emphasized, so that the denoising module can better identify the image content and obtain a better denoising effect;
Finally, the previous feature P_f is subtracted to remove redundant information:

O_f = R_f - P_f (7)

where O_f denotes the final output of the denoising module; the whole process narrows the gap between local block modelling and the global recovery task;
(2.3) Finally, the network output is combined residually with the input image to obtain the final result; in addition to the global discriminator, a local discriminator is used to promote and stabilize training: image blocks are randomly cropped from the output image and the normal-light image and then fed into the local discriminator;
Step 3, loss function: the low-light image enhancement model of the invention adopts an adversarial loss and a perceptual loss. The adversarial loss follows LSGAN, whose mathematical form is given by formulas (8) and (9):

L_D = E_{x~p_r(x)}[(D(x) - 1)^2] + E_{y~p_f(y)}[(D(G(y)))^2] (8)

L_G = E_{y~p_f(y)}[(D(G(y)) - 1)^2] (9)

where L_D and L_G denote the losses of the discriminator and the generator, respectively; p_r(x) denotes the distribution of normal-light images/image blocks and p_f(y) the distribution of low-light images/image blocks; x and y denote samples drawn from p_r(x) and p_f(y), respectively; D denotes the discriminator, G the generator, and E the expected value;
The perceptual loss is based on feature maps output by a pre-trained VGG-16 network and is defined as:

L_per = (1 / (w_{ij} h_{ij} c_{ij})) ||φ_{ij}(I_x) - φ_{ij}(I_y)||_2^2 (10)

where φ_{ij} denotes the feature map obtained from the j-th convolutional layer of the i-th block of the VGG-16 network; w_{ij}, h_{ij} and c_{ij} denote the dimensions of that feature map; and I_x and I_y denote the low-light image/image block and the generated normal-light image/image block, respectively.
The invention improves a low-light image enhancement model with an encoding-decoding architecture based on a convolutional neural network, and proposes a multi-scale feature aggregation module (FABM) combining a lifting strategy, and a noise removal module (BPM) with a pixel attention mechanism. The former is based on an error feedback mechanism and uses a back-projection technique, so that all previous features can be considered when the current features are aggregated. The latter can improve the signal-to-noise ratio of the image and model the relationship between the pixels of the image, helping the network better identify the image content, thereby emphasizing commonality and removing differences (commonly referred to as noise).
The invention has the main advantages that:
1. An encoding-decoding low-light image enhancement model is built that can effectively enhance low-light images with single-stage training, giving the images rich details, a clean background and natural colors;
2. The multi-scale feature aggregation module considers the features of all scales and aggregates them effectively, so that the network can preserve the image content while enhancing the low-light image and generate an image with rich details;
3. The denoising module can effectively improve the signal-to-noise ratio of the image and model the relationship between the pixels of the image, helping the network better identify the image content, thereby emphasizing commonality and eliminating differences (commonly referred to as noise).
Drawings
FIG. 1 is a schematic diagram of the overall structure of the present invention.
FIG. 2 is a schematic diagram of a multi-scale feature aggregation module.
Fig. 3 is a schematic diagram of a pixel attention module.
Detailed Description
The objects, technical solutions and advantages of the present invention will become more apparent by the following detailed description of the present invention with reference to the accompanying drawings. It should be understood that the description is only illustrative and is not intended to limit the scope of the invention. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the present invention;
in addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other. The invention will be described in more detail below with reference to the accompanying drawings. Like elements are denoted by like reference numerals throughout the various figures. For clarity, the various features of the drawings are not drawn to scale.
A method of low-light image enhancement incorporating a multi-scale feature aggregation and promotion strategy according to an embodiment of the present invention is described below with reference to fig. 1 to 3, comprising the steps of:
step 1, data collection and processing are carried out,
(1.1) Training the low-light image enhancement model requires sufficient image data; the invention adopts two open-source low-light image datasets for model training. The first dataset contains 916 low-light images and 1016 normal-light images, which are unpaired; the second dataset contains 2117 low-light images and 2117 normal-light images, which are paired;
(1.2) For the images in both datasets, training images with a resolution of 320×320 are obtained by random cropping, and then random horizontal flipping is applied for data augmentation. For testing, the invention uses a total of 672 images from 6 open-source low-light image datasets; 5 of the test datasets are unpaired, and the remaining test dataset contains 440 low-light images, of which 100 are unpaired without corresponding reference images and the other 340 are paired with corresponding reference images;
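The random crop and horizontal flip of step (1.2) can be sketched as follows. This is a minimal illustration only; the function name and RNG handling are assumptions, not part of the patent.

```python
import numpy as np

def augment(img, crop=320, rng=None):
    """Randomly crop a crop x crop patch and apply a random horizontal flip.

    img: H x W x C numpy array with H, W >= crop. The 320 x 320 crop size
    and the flip follow step (1.2) of the patent; everything else here is
    an illustrative assumption.
    """
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)    # random crop origin
    left = rng.integers(0, w - crop + 1)
    patch = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:                 # random horizontal flip
        patch = patch[:, ::-1]
    return patch
```

A 400×480 input, for example, yields a 320×320 training patch regardless of where the crop lands.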
step 2, constructing a model,
(2.1) The invention designs a low-light image enhancement model based on an encoding-decoding U-net structure, whose overall structure is shown in Fig. 1. The model consists of a generator and two discriminators: a global discriminator that discriminates the whole image, and a local discriminator that discriminates image blocks;
(2.2) In the encoding stage of the generator, for each scale's features, the invention uses the proposed multi-scale feature aggregation module, abbreviated as FABM, to aggregate them to all subsequent levels; in the decoding stage, the invention gathers the features extracted by each level of the encoder and combines them with the proposed denoising module, abbreviated as BPM, to enhance the image;
(2.2.1) The multi-scale feature aggregation module, abbreviated as FABM, is shown in Fig. 2. In the encoder, for the level-C feature map E_C, first compute its difference, across the different scales, with each of the previous (C-1) levels' aggregated feature maps E_{i_A} (i = 1, ..., C-1):

e_i = p^j(E_C) - E_{i_A}, i = 1, ..., C-1 (1)

where p denotes the projection operator, which upsamples E_C; j denotes the number of upsampling steps, so that E_C is upsampled to the same size as each E_{i_A} (i = 1, ..., C-1);
The back-projected differences e_i are then used to update the feature map E_C, yielding the aggregated feature E_{C_A}:

E_{C_A} = E_C + Σ_{i=1}^{C-1} bp^j(e_i) (2)

where bp denotes the back-projection operator, which downsamples the difference e_i; j denotes the number of downsampling steps, so that each e_i is downsampled to the same size as the feature map E_C;
The resulting feature E_{C_A} thus takes the features of all scales into account. If level C is the last level of the encoder, E_{C_A} is the encoder output after aggregating the features of every level. Each level of the decoder then re-aggregates E_{C_A} in the same way, making full use of the aggregated features to obtain the enhanced feature D_{C_E}:

D_{C_E} = bp^{C-1}[p^{C-1}(D_C) - E_{C_A}] + D_C (3)

where D_C denotes the level-C feature map in the decoder, and bp and p denote the back-projection and projection operators, respectively; in the decoder, bp denotes upsampling and p denotes downsampling;
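The FABM aggregation described above can be sketched on single-channel toy feature maps as follows. The choice of nearest-neighbour upsampling for the projection p and average pooling for the back-projection bp is an assumption; the patent does not pin down the exact operators.

```python
import numpy as np

def up2(x):
    """Projection p: 2x nearest-neighbour upsampling of an H x W map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def down2(x):
    """Back-projection bp: 2x average-pool downsampling of an H x W map."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def repeat_op(op, x, j):
    """Apply an operator j times (p^j or bp^j in the patent's notation)."""
    for _ in range(j):
        x = op(x)
    return x

def fabm(e_levels):
    """Aggregate the last (level-C) feature map with all previous levels:
    difference e_i between the upsampled E_C and each E_{i_A}, then add the
    downsampled differences back onto E_C to form E_{C_A}.

    e_levels: list of H_i x W_i maps, each level half the size of the one
    before it. A toy single-channel sketch, not the patent's network code.
    """
    ec = e_levels[-1]
    c = len(e_levels)
    agg = ec.copy()
    for i, ei in enumerate(e_levels[:-1]):
        j = c - 1 - i                          # up/down steps to reach level i
        diff = repeat_op(up2, ec, j) - ei      # difference at level i's scale
        agg = agg + repeat_op(down2, diff, j)  # back-project and accumulate
    return agg
```

With three levels of sizes 8×8, 4×4 and 2×2, the output has the 2×2 size of the last level while folding in error terms from both earlier scales.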
(2.2.2) The denoising module, abbreviated as BPM, uses the previous estimate to improve the current signal; Fig. 1 depicts its working mechanism. First, the feature C_f currently to be enhanced and the previously enhanced feature P_f are added to obtain an enhanced feature S_f with a high signal-to-noise ratio; the higher the signal-to-noise ratio, the easier denoising becomes:

S_f = C_f + P_f (4)

this operation achieves an implicit, unconstrained fusion of C_f and P_f;
The enhanced feature S_f is then fed into the pixel attention block proposed by the invention, shown in Fig. 3. It compresses the feature map X ∈ R^{H×W×C} along the channel dimension to obtain a pixel attention map M_P ∈ R^{H×W×1}, which is multiplied with the original feature map X to output a feature map X_P ∈ R^{H×W×C} in which every pixel is related to every other pixel:

X_P = X · σ(conv(avg(X))) (5)

where σ, conv and avg denote the sigmoid function, a convolution operation and an average pooling operation, respectively;
After the enhanced feature S_f passes through the pixel attention block, the relationships between the pixels of the feature map are established and the recovered, signal-enhanced result R_f is obtained:

R_f = P(S_f) (6)

where P denotes the pixel attention block proposed by the invention; R_f represents a signal in which the noise is emphasized, so that the denoising module can better identify the image content and obtain a better denoising effect;
Finally, the previous feature P_f is subtracted to remove redundant information:

O_f = R_f - P_f (7)

where O_f denotes the final output of the denoising module; the whole process narrows the gap between local block modelling and the global recovery task;
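The BPM pipeline of formulas (4)-(7) can be sketched as follows. The channel-average pooling and the stand-in for the convolution (a single scalar weight) are simplifying assumptions; the patent only names conv, avg and the sigmoid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pixel_attention(x, conv_w=1.0):
    """Pixel attention block of formula (5): X_P = X * sigmoid(conv(avg(X))).

    x: H x W x C feature map. avg is channel-wise average pooling
    (H x W x C -> H x W x 1); conv_w is a scalar stand-in for the 1x1
    convolution, an assumption made for this sketch.
    """
    m = sigmoid(conv_w * x.mean(axis=-1, keepdims=True))  # attention map M_P
    return x * m                                          # X_P, same shape as x

def bpm(c_f, p_f):
    """Denoising module, formulas (4)-(7):
    S_f = C_f + P_f;  R_f = P(S_f);  O_f = R_f - P_f."""
    s_f = c_f + p_f            # (4) fuse current and previous features
    r_f = pixel_attention(s_f) # (6) signal-enhanced result
    return r_f - p_f           # (7) subtract P_f to drop redundant content
```

Shapes are preserved throughout: an H×W×C input yields an H×W×C output, matching the residual usage in the decoder.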
(2.3) Finally, the network output is combined residually with the input image to obtain the final result; in addition to the global discriminator, the invention also uses a local discriminator to promote and stabilize training: image blocks are randomly cropped from the output image and the normal-light image and then fed into the local discriminator;
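The random patch sampling for the local discriminator can be sketched as below. The 64-pixel patch size and the patch count are illustrative assumptions; the patent only states that image blocks are randomly cropped from the output image and the normal-light image.

```python
import numpy as np

def sample_patches(fake, real, patch=64, n=4, rng=None):
    """Randomly crop n aligned-size image blocks from the generated image
    and the normal-light image, for feeding to the local discriminator.

    fake, real: H x W x C arrays. Patch size and count are assumptions
    for this sketch, not values from the patent.
    """
    rng = rng or np.random.default_rng()
    h, w = fake.shape[:2]
    pairs = []
    for _ in range(n):
        t = rng.integers(0, h - patch + 1)
        l = rng.integers(0, w - patch + 1)
        pairs.append((fake[t:t + patch, l:l + patch],
                      real[t:t + patch, l:l + patch]))
    return pairs
```

Note the crops from the fake and real images need not share coordinates in principle; sampling them together, as here, is merely a convenient choice.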
Step 3, loss function: the low-light image enhancement model of the invention adopts an adversarial loss and a perceptual loss. The adversarial loss follows LSGAN, whose mathematical form is given by formulas (8) and (9):

L_D = E_{x~p_r(x)}[(D(x) - 1)^2] + E_{y~p_f(y)}[(D(G(y)))^2] (8)

L_G = E_{y~p_f(y)}[(D(G(y)) - 1)^2] (9)

where L_D and L_G denote the losses of the discriminator and the generator, respectively; p_r(x) denotes the distribution of normal-light images/image blocks and p_f(y) the distribution of low-light images/image blocks; x and y denote samples drawn from p_r(x) and p_f(y), respectively; D denotes the discriminator, G the generator, and E the expected value;
The perceptual loss is based on feature maps output by a pre-trained VGG-16 network and is defined as:

L_per = (1 / (w_{ij} h_{ij} c_{ij})) ||φ_{ij}(I_x) - φ_{ij}(I_y)||_2^2 (10)

where φ_{ij} denotes the feature map obtained from the j-th convolutional layer of the i-th block of the VGG-16 network; w_{ij}, h_{ij} and c_{ij} denote the dimensions of that feature map; and I_x and I_y denote the low-light image/image block and the generated normal-light image/image block, respectively.
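The three loss terms of step 3 can be sketched numerically as follows. The LSGAN forms are reconstructed from the standard least-squares GAN objective, since the formulas were lost in extraction; the feature maps passed to the perceptual loss stand in for the VGG-16 activations φ_ij.

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Discriminator loss, formula (8): E[(D(x) - 1)^2] + E[D(G(y))^2].
    d_real, d_fake: discriminator outputs on real and generated samples."""
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Generator loss, formula (9): E[(D(G(y)) - 1)^2]."""
    return np.mean((d_fake - 1.0) ** 2)

def perceptual_loss(feat_x, feat_y):
    """Perceptual loss, formula (10): squared L2 distance between feature
    maps, normalised by the map size w * h * c.

    feat_x, feat_y: h x w x c arrays standing in for phi_ij(I_x) and
    phi_ij(I_y); extracting them from a real VGG-16 is out of scope here."""
    h, w, c = feat_x.shape
    return np.sum((feat_x - feat_y) ** 2) / (w * h * c)
```

When the discriminator scores real samples as 1 and fakes as 0, the discriminator loss is zero; identical feature maps likewise give zero perceptual loss.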
It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Claims (1)
1. A low-light image enhancement method combining multi-scale feature aggregation and lifting strategies, characterized by comprising the following steps:
step 1, data collection and processing are carried out,
(1.1) Training the low-light image enhancement model requires sufficient image data; two open-source low-light image datasets are used for model training. The first dataset contains 916 low-light images and 1016 normal-light images, which are unpaired; the second dataset contains 2117 low-light images and 2117 normal-light images, which are paired;
(1.2) For the images in both datasets, training images with a resolution of 320×320 are obtained by random cropping, and then random horizontal flipping is applied for data augmentation. For testing, a total of 672 images from 6 open-source low-light image datasets are used, of which 5 test datasets are unpaired; the remaining test dataset contains 440 low-light images, 100 of which are unpaired without corresponding reference images and the other 340 of which are paired with corresponding reference images;
Step 2, constructing the model,
(2.1) A low-light image enhancement model is designed based on an encoding-decoding U-net structure; the model consists of a generator and two discriminators: a global discriminator that discriminates the whole image, and a local discriminator that discriminates image blocks;
(2.2) In the encoding stage of the generator, each scale's features are aggregated to all subsequent levels using the multi-scale feature aggregation module, abbreviated as FABM; in the decoding stage, the features extracted by each level of the encoder are gathered and combined with the denoising module, abbreviated as BPM, to enhance the image;
(2.2.1) The multi-scale feature aggregation module, abbreviated as FABM: in the encoder, for the level-C feature map E_C, first compute its difference, across the different scales, with each of the previous (C-1) levels' aggregated feature maps E_{i_A} (i = 1, ..., C-1):

e_i = p^j(E_C) - E_{i_A}, i = 1, ..., C-1 (1)

where p denotes the projection operator, which upsamples E_C; j denotes the number of upsampling steps, so that E_C is upsampled to the same size as each E_{i_A} (i = 1, ..., C-1);
The back-projected differences e_i are then used to update the feature map E_C, yielding the aggregated feature E_{C_A}:

E_{C_A} = E_C + Σ_{i=1}^{C-1} bp^j(e_i) (2)

where bp denotes the back-projection operator, which downsamples the difference e_i; j denotes the number of downsampling steps, so that each e_i is downsampled to the same size as the feature map E_C;
The resulting feature E_{C_A} thus takes the features of all scales into account; if level C is the last level of the encoder, E_{C_A} is the encoder output after aggregating the features of every level;
Each level of the decoder then re-aggregates E_{C_A} in the same way, making full use of the aggregated features to obtain the enhanced feature D_{C_E}:

D_{C_E} = bp^{C-1}[p^{C-1}(D_C) - E_{C_A}] + D_C (3)

where D_C denotes the level-C feature map in the decoder, and bp and p denote the back-projection and projection operators, respectively; in the decoder, bp denotes upsampling and p denotes downsampling;
(2.2.2) The denoising module, abbreviated as BPM, uses the previous estimate to improve the current signal. First, the feature C_f currently to be enhanced and the previously enhanced feature P_f are added to obtain an enhanced feature S_f with a high signal-to-noise ratio; the higher the signal-to-noise ratio, the easier denoising becomes:

S_f = C_f + P_f (4)

this operation achieves an implicit, unconstrained fusion of C_f and P_f;
The enhanced feature S_f is then fed into a pixel attention block, which compresses the feature map X ∈ R^{H×W×C} along the channel dimension to obtain a pixel attention map M_P ∈ R^{H×W×1}; this map is multiplied with the original feature map X to output a feature map X_P ∈ R^{H×W×C} in which every pixel is related to every other pixel:

X_P = X · σ(conv(avg(X))) (5)

where σ, conv and avg denote the sigmoid function, a convolution operation and an average pooling operation, respectively;
After the enhanced feature S_f passes through the pixel attention block, the relationships between the pixels of the feature map are established and the recovered, signal-enhanced result R_f is obtained:

R_f = P(S_f) (6)

where P denotes the pixel attention block; R_f represents a signal in which the noise is emphasized, so that the denoising module better recognizes the image content, thereby obtaining a better denoising effect;
Finally, the previous feature P_f is subtracted to remove redundant information:

O_f = R_f - P_f (7)

where O_f denotes the final output of the denoising module; the whole process narrows the gap between local block modelling and the global recovery task;
(2.3) finally, the output of the decoder and the input image are combined through a residual connection to obtain the final result; in addition to the global discriminator, a local discriminator is used to facilitate and stabilize training: image patches are randomly cropped from the output image and from the normal-light image, and are then fed to the local discriminator;
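The random patch cropping for the local discriminator might look like the sketch below; the patch size of 32 is an assumption, since the patent does not specify it:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_patch(img, patch=32):
    # Randomly crop a square patch for the local discriminator.
    # The patch size is an assumption; the patent does not give one.
    H, W = img.shape[:2]
    top = rng.integers(0, H - patch + 1)    # exclusive upper bound
    left = rng.integers(0, W - patch + 1)
    return img[top:top + patch, left:left + patch]
```

In training, the same cropping would be applied independently to the generator output and to the normal-light reference before both are passed to the local discriminator.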
Step 3: loss functions. The low-light image enhancement model adopts an adversarial loss and a perceptual loss. The adversarial loss follows LSGAN; its mathematical form, shown as formulas (8) and (9), is:
L_D = ½ E_{x~p_r(x)}[(D(x) - 1)^2] + ½ E_{y~p_f(y)}[D(G(y))^2] (8)
L_G = ½ E_{y~p_f(y)}[(D(G(y)) - 1)^2] (9)
where L_D and L_G denote the discriminator loss and the generator loss, respectively; p_r(x) denotes the distribution of normal-light images/patches and p_f(y) the distribution of low-light images/patches; x and y denote samples drawn from p_r(x) and p_f(y), respectively; D denotes the discriminator, G the generator, and E the expectation;
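A small sketch of the LSGAN discriminator and generator losses, assuming the standard least-squares form with real samples labelled 1 and fake samples labelled 0:

```python
import numpy as np

def lsgan_losses(d_real, d_fake):
    # LSGAN least-squares adversarial losses.
    # d_real: discriminator scores D(x) on normal-light samples;
    # d_fake: discriminator scores D(G(y)) on generated samples.
    L_D = 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)
    L_G = 0.5 * np.mean((d_fake - 1.0) ** 2)
    return L_D, L_G
```

A perfect discriminator (scores of 1 on real and 0 on fake samples) drives L_D to 0 while L_G reaches its maximum of 0.5, matching the usual adversarial dynamic.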
The perceptual loss is based on feature maps output by a pre-trained VGG-16 network and is defined as:
L_percep = (1 / (w_ij · h_ij · c_ij)) · ||φ_ij(I_x) - φ_ij(I_y)||_2^2 (10)
where φ_ij denotes the feature map obtained from the j-th convolutional layer of the i-th block of the VGG-16 network; w_ij, h_ij and c_ij denote the dimensions of that feature map; and I_x and I_y denote the low-light image/patch and the generated normal-light image/patch, respectively.
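Given two VGG-16 feature maps, the perceptual loss can be sketched as follows; the dimension-normalised squared L2 form is an assumption based on the symbol definitions in the text:

```python
import numpy as np

def perceptual_loss(feat_x, feat_y):
    # Perceptual loss between two feature maps phi_ij(I_x) and phi_ij(I_y),
    # each of shape (h_ij, w_ij, c_ij): squared L2 distance normalised by the
    # feature-map dimensions (the exact normalisation is an assumption).
    h, w, c = feat_x.shape
    return np.sum((feat_x - feat_y) ** 2) / (h * w * c)
```

In practice feat_x and feat_y would be the activations of the chosen VGG-16 layer for the low-light input and the generated normal-light output, respectively.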
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210278847.0A CN114581337B (en) | 2022-03-17 | 2022-03-17 | Low-light image enhancement method combining multi-scale feature aggregation and lifting strategies |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114581337A CN114581337A (en) | 2022-06-03 |
CN114581337B true CN114581337B (en) | 2024-04-05 |
Family
ID=81777353
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110889813A (en) * | 2019-11-15 | 2020-03-17 | 安徽大学 | Low-light image enhancement method based on infrared information |
CN112884668A (en) * | 2021-02-22 | 2021-06-01 | 大连理工大学 | Lightweight low-light image enhancement method based on multiple scales |
CN113052814A (en) * | 2021-03-23 | 2021-06-29 | 浙江工业大学 | Dark light image enhancement method based on Retinex and attention mechanism |
CN113628152A (en) * | 2021-09-15 | 2021-11-09 | 南京天巡遥感技术研究院有限公司 | Dim light image enhancement method based on multi-scale feature selective fusion |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112602088B (en) * | 2018-09-06 | 2024-03-12 | Oppo广东移动通信有限公司 | Method, system and computer readable medium for improving quality of low light images |
Non-Patent Citations (1)
Title |
---|
Image enhancement method under extreme low-light conditions; Yang Yong; Liu Huiyi; Journal of Graphics (图学学报); 2020-08-07 (Issue 04); full text *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||