CN116645305A - Low-light image enhancement method based on multi-attention mechanism and Retinex - Google Patents
- Publication number
- CN116645305A (application CN202310573810.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a low-light image enhancement method based on a multi-attention mechanism and Retinex, relating to the technical field of image processing. It addresses the prior-art problems of weak realism, severe noise and distortion, and partial under- and overexposure in enhanced images. The method comprises the following steps: inputting the low-light image into a trained decomposition module and determining an illumination map component and a reflection map component; inputting the illumination map component into a trained enhancement module and determining an enhanced illumination map component; and fusing the reflection map component with the enhanced illumination map component to determine the processed low-light image. The method removes noise accurately, amplifies the global brightness of features, improves the enhancement effect, and makes the output fused image smoother.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a low-light image enhancement method based on a multi-attention mechanism and Retinex.
Background
Low-light image enhancement is an end-to-end RGB color image restoration technique that enhances an image by analyzing its brightness, contrast, and related information. Existing image enhancement methods fall largely into three classes: traditional enhancement methods, Retinex-based methods, and deep-learning-based methods. Traditional image processing methods are typified by histogram equalization, a histogram-modification technique based on the cumulative distribution function: the image histogram is adjusted toward a uniform distribution to stretch the image's dynamic range and thereby improve contrast. Such methods are simple to operate and efficient, but the generated images are prone to artifacts and weak realism. Retinex-based low-light enhancement methods simulate the visual mechanism of the human eye to decompose an image into an illumination component and a reflection component, then improve brightness by adjusting the illumination component; however, their parameters must be set manually, so they cannot adaptively process diverse images, and partial under- and overexposure result. Deep-learning-based low-light enhancement methods learn an optimal network model from a large-scale dataset, establishing a complex mapping between low-light images and normal-light images.
With the rapid development of deep learning, many low-level vision tasks have advanced greatly. RetinexNet is a typical deep-learning-based low-light enhancement model whose network consists mainly of a decomposition network and an enhancement network; the decomposition network is composed of 5 convolution layers with ReLU activations, and the enhancement network is an encoder-decoder structure. However, its enhanced results suffer from low smoothness and excessive noise, causing severe image distortion.
Disclosure of Invention
The invention provides a low-light image enhancement method based on a multi-attention mechanism and Retinex, which solves the prior-art problems of weak realism, severe noise and distortion, and partial under- and overexposure in enhanced images, thereby removing noise accurately, amplifying the global brightness of features, improving the enhancement effect, and making the output fused image smoother.
The invention provides a low-light image enhancement method based on a multi-attention mechanism and Retinex, which comprises the following steps:
inputting the low-light image into a trained decomposition module and determining an illumination map component and a reflection map component, wherein the decomposition module comprises a convolutional block attention module (CBAM): the first channel attention module of the CBAM attends to different feature weights to suppress irrelevant feature responses and activate useful feature responses, and the first spatial attention module of the CBAM suppresses irrelevant spatial information;
inputting the illumination map component into a trained enhancement module and determining an enhanced illumination map component, wherein the enhancement module comprises a global attention module: the second channel attention module of the global attention module screens the feature information and retains the effective feature information, and the second spatial attention module of the global attention module fuses the feature information and amplifies the global brightness of the illumination map component;
and fusing the reflection map component with the enhanced illumination map component to determine a processed low-light image.
In one possible implementation, the first channel attention module of the CBAM suppressing irrelevant feature responses and activating useful feature responses according to the feature weights attended to, and the first spatial attention module of the CBAM suppressing irrelevant spatial information, comprises:

performing feature extraction on the low-light image using convolution and the ReLU activation function to determine an intermediate feature map;

injecting attention into the intermediate feature map with the first channel attention module and the first spatial attention module, then multiplying the attention with the intermediate feature map to determine a refined feature map;

processing the refined feature map with multi-layer convolution operations and activating the processing result with the ReLU activation function to determine a multi-layer convolution feature map;

concatenating the intermediate feature map and the multi-layer convolution feature map along the channel dimension to determine a concatenated feature map;

and normalizing the concatenated feature map to determine the 3-channel reflection map component and the 1-channel illumination map component.
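The final decomposition step above can be sketched as follows. This is a minimal NumPy illustration under assumptions the patent does not fully specify: the normalization is taken to be a sigmoid, and the decomposition head is assumed to emit 4 channels (3 for reflectance, 1 for illumination). The function name `split_decomposition` is our own.

```python
import numpy as np

def split_decomposition(feature_map: np.ndarray):
    """Normalize a 4-channel (C,H,W) output with a sigmoid and split it into
    a 3-channel reflection map component and a 1-channel illumination map
    component. Illustrative sketch only."""
    assert feature_map.shape[0] == 4
    normalized = 1.0 / (1.0 + np.exp(-feature_map))  # sigmoid -> values in (0, 1)
    reflectance = normalized[:3]    # 3-channel reflection map component R
    illumination = normalized[3:]   # 1-channel illumination map component I
    return reflectance, illumination

x = np.random.randn(4, 8, 8)   # stand-in for the concatenated feature map
R, I = split_decomposition(x)
```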
In one possible implementation, determining the refined feature map comprises:

performing global max pooling and mean pooling on the intermediate feature map per channel to obtain two pooled one-dimensional vectors, feeding both pooled vectors into a fully connected layer and adding the results to determine the one-dimensional channel attention;

multiplying the one-dimensional channel attention element-wise with the intermediate feature map to determine an adjusted feature map;

performing global max pooling and mean pooling on the adjusted feature map spatially to obtain two pooled two-dimensional maps, then concatenating and convolving them to determine the two-dimensional spatial attention;

and multiplying the two-dimensional spatial attention element-wise with the adjusted feature map to determine the refined feature map.
In one possible implementation manner, the training method of the decomposition module includes: training the decomposition module according to the decomposition loss function;
the decomposition loss function is expressed as:

$$L_{dec} = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{lap} L_{lap} + \lambda_{ssim} L_{ssim}$$

wherein $\lambda_{ir}$ represents the reflection-component consistency loss coefficient, $\lambda_{lap}$ represents the Laplace loss coefficient, and $\lambda_{ssim}$ represents the structural similarity loss coefficient;

$L_{recon}$ represents the reconstruction loss function, expressed as:

$$L_{recon} = \sum_{i \in \{low,\,high\}} \sum_{j \in \{low,\,high\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$

$L_{ir}$ represents the reflection-component consistency loss, expressed as:

$$L_{ir} = \left\| R_{low} - R_{high} \right\|_1$$

$L_{ssim}$ represents the structural similarity loss, expressed as:

$$L_{ssim} = 1 - \mathrm{SSIM}(R_{low}, R_{high})$$

$L_{lap}$ represents the Laplace loss, expressed as:

$$L_{lap} = \sum_{i} \left\| L_i(R_{high}) - L_i(R_{low}) \right\|_1$$

wherein $S$ represents the low-light image, $R$ represents the reflection map component, $I$ represents the illumination map component, the subscripts $low$ and $high$ denote the low-light and high-light versions, $\lambda_{ij}$ represents the reconstruction loss coefficient, $R_{high}$ represents the high-light reflection map component, $R_{low}$ represents the low-light reflection map component, $i$ indexes the $i$-th Laplacian pyramid layer, and $L_i(R_{high})$ and $L_i(R_{low})$ represent the Laplacian pyramids of the high-light and low-light reflection map components.
In one possible implementation, the second channel attention module of the global attention module screening the feature information and retaining the effective feature information, and the second spatial attention module of the global attention module fusing the feature information and amplifying the global brightness of the illumination map component, comprises:

performing a convolution operation (kernel size 3×3) on the illumination map component to generate a feature map, then applying the ReLU activation function for nonlinear activation to determine a feature map $F_1$;

processing the feature map $F_1$ with the global attention module and the Sigmoid activation function to determine a feature map $M(F_2)$;

processing the feature map $M(F_2)$ with multiple cascaded convolution, ReLU activation, and BN (batch normalization) layers to determine a feature map $F_3$;

processing the feature map $F_3$ with a convolution and the Sigmoid activation function to determine a feature map $F_4$;

and processing the feature map $F_4$ with a convolution, then reconstructing $F_4$ to determine the enhanced illumination map component.
In a possible implementation, processing the feature map $F_1$ with the global attention module and the Sigmoid activation function to determine the feature map $M(F_2)$ comprises:

performing dimension permutation on the feature map $F_1$ with the global attention module to determine a permuted feature map $F_1'$;

processing the permuted feature map $F_1'$ with a multi-layer perceptron to determine a feature map $F_1''$ with the same dimensions as $F_1$, then activating $F_1''$ with the Sigmoid function to determine a feature map $F_2$;

performing a convolution operation with kernel size 7 on the feature map $F_2$ to reduce its channel count, determining a feature map $F_2'$;

and performing another convolution operation with kernel size 7 on the feature map $F_2'$ to restore the channel count, determining the feature map $M(F_2)$, which has the same number of channels as $F_1$.
In one possible implementation, processing the feature map $M(F_2)$ with multiple cascaded convolution, ReLU activation, and BN layers to determine the feature map $F_3$ comprises:

performing a convolution operation on the feature map $M(F_2)$ and adding a BN layer so that the convolved feature maps have similar distributions;

activating the feature map $M(F_2)$ with the ReLU activation function;

and repeating the above steps several times to determine the feature map $F_3$.
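The cascaded convolution, BN, and ReLU processing can be sketched in NumPy as follows; here batch normalization is reduced to per-channel standardization (γ=1, β=0) since a single sample is processed, and the kernel count and size are illustrative assumptions.

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 'same' convolution: x is (C_in, H, W), w is (C_out, C_in, k, k)."""
    c_out, _, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    h, wd = x.shape[1:]
    out = np.empty((c_out, h, wd))
    for o in range(c_out):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(xp[:, i:i + k, j:j + k] * w[o])
    return out

def conv_bn_relu_stack(x, weights, eps=1e-5):
    """Cascade of conv(3x3) -> BN -> ReLU blocks, repeated once per weight."""
    for w in weights:
        x = conv2d_same(x, w)
        mean = x.mean(axis=(1, 2), keepdims=True)
        var = x.var(axis=(1, 2), keepdims=True)
        x = (x - mean) / np.sqrt(var + eps)   # BN: similar distribution per channel
        x = np.maximum(x, 0.0)                # ReLU activation
    return x

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 6, 6))            # stand-in for M(F2)
ws = [rng.standard_normal((4, 4, 3, 3)) * 0.1 for _ in range(3)]
F3 = conv_bn_relu_stack(x, ws)
```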
In a possible implementation, reconstructing the feature map $F_4$ with a convolution comprises: reconstructing the feature map $F_4$ with a 3×3 convolution kernel.
In one possible implementation manner, the training method of the enhancement module includes: training the enhancement module according to an enhancement loss function;
the enhancement loss function is expressed as:

$$L_{enh} = L_{recon} + \lambda_{lap} L_{lap}$$

wherein $\lambda_{lap}$ represents the Laplace loss coefficient;

$L_{recon}$ represents the reconstruction loss function, expressed as:

$$L_{recon} = \lambda \left\| R_{low} \circ \hat{I}_{low} - S_{high} \right\|_1$$

wherein $\lambda$ represents the reconstruction loss coefficient, $R_{low}$ represents the low-light reflection map component, $\hat{I}_{low}$ represents the enhanced illumination map component, and $S_{high}$ represents the high-light input image;

$L_{lap}$ represents the Laplace loss, expressed as:

$$L_{lap} = \sum_{i} \left\| L_i(\hat{S}) - L_i(S_{low}) \right\|_1$$

wherein $L_i(\hat{S})$ represents the Laplacian pyramid of the enhanced image and $L_i(S_{low})$ represents the Laplacian pyramid of the low-light image.
In one possible implementation, the fusing the reflectogram component with the enhanced illumination map component includes:
and multiplying the reflection map component and the enhanced illumination map component element by element, and outputting the processed low illumination image.
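The element-wise fusion step is a Hadamard product in which the single illumination channel broadcasts over the three color channels; a one-line NumPy sketch:

```python
import numpy as np

def fuse(reflectance, illumination):
    """Element-wise product of the 3-channel reflection map component and
    the 1-channel enhanced illumination map component."""
    return reflectance * illumination  # (3,H,W) * (1,H,W) -> (3,H,W)

R = np.full((3, 4, 4), 0.8)        # reflection map component
I_hat = np.full((1, 4, 4), 0.5)    # enhanced illumination map component
S_out = fuse(R, I_hat)             # processed low-light image
```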
One or more technical solutions provided in the embodiments of the present invention at least have the following technical effects or advantages:
the invention adopts a low-light image enhancement method based on a multi-attention mechanism and Retinex, which comprises the following steps: inputting the low-light image into a trained decomposition module to determine an illumination map component and a reflection map component, wherein the decomposition module comprises: the method comprises the steps that a CBAM, a first channel attention module of the CBAM reduces irrelevant characteristic responses and activates useful characteristic responses according to different focused characteristic weights, irrelevant spatial information is reduced through a first spatial attention module of the CBAM, sensitivity of a decomposition network to noise is improved, and noise with different degrees is removed accurately; inputting the illumination map component into a trained enhancement module, and determining an enhanced illumination map component; wherein, the enhancement module includes: the second channel attention module of the global attention module screens the characteristic information and retains the effective characteristic information, the second space attention module of the global attention module fuses the characteristic information, the global brightness of the illumination map component is amplified, the global brightness of the characteristic is effectively amplified by the enhancement module, and the enhancement effect of the enhancement module is improved; fusing the reflection image component and the enhanced illumination image component, and determining a processed low illumination image; the problems of weak sense of reality, serious noise and distortion, partial underexposure and overexposure of the enhanced image in the prior art are effectively solved, the accurate noise removal is further realized, the global brightness of the features is amplified, the enhancement effect is improved, and the output fusion image is smoother.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a flowchart of steps of a low-light image enhancement method based on a multi-attention mechanism and Retinex according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating steps performed by the decomposing module to decompose a low-light image according to an embodiment of the present invention;
FIG. 3 is a low-light image of an input provided by an embodiment of the present invention;
fig. 4 is an image enhanced by RetinexNet in the prior art according to an embodiment of the present invention;
fig. 5 is a prior art BIMEF enhanced image provided by an embodiment of the present invention;
FIG. 6 is a LIME enhanced image of the prior art provided by an embodiment of the present invention;
fig. 7 is a processed low-light image output according to the method provided by the embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
Traditional image processing methods are simple to operate and efficient, but the generated images are prone to artifacts and weak realism. Retinex-based enhancement methods require manually set parameters, cannot adaptively process diverse images, and suffer partial under- and overexposure. The RetinexNet method produces severe noise, low smoothness, and serious image distortion. The deep-learning-based NPE model preserves the naturalness of illumination while enhancing, but does not relate the illumination in different scenes. The BIMEF model provides a multi-exposure fusion framework for enhancing low-light images, adopting a dual-exposure fusion algorithm to obtain accurate contrast and illumination intensity, but the brightness of its enhancement results is low. The LIME model, based on Retinex, takes the maximum value over each pixel's channels of the input image to estimate the illumination map, refines the illumination map with structured prior knowledge, and outputs the reflection map as the enhancement result; this method is prone to over-enhancement. GLADNet proposes a global-awareness and detail-preserving network that estimates the illumination map from the low-light image by global illumination estimation and then reconstructs it through a detail module; however, it enhances the low-light image without a denoising module and without considering the effect of noise on the image.
The embodiment of the invention provides a low-light image enhancement method based on a multi-attention mechanism and Retinex, which comprises the following steps S101 to S103 as shown in FIG. 1.
S101, inputting the low-light image into a trained decomposition module and determining an illumination map component and a reflection map component, wherein the decomposition module comprises a Convolutional Block Attention Module (CBAM): the first channel attention module of the CBAM attends to different feature weights to suppress irrelevant feature responses and activate useful feature responses, and the first spatial attention module of the CBAM suppresses irrelevant spatial information. The invention improves the network by adding the CBAM to the decomposition network. The CBAM computes attention weights for the decomposition module's feature maps: the channel attention module attends to the feature weights of different channels to suppress irrelevant feature responses and activate useful ones, and the spatial attention module suppresses irrelevant spatial information, improving the decomposition module's sensitivity to noise and removing noise of different degrees accurately.
In one possible implementation, the first channel attention module of the CBAM reduces irrelevant feature responses and activates useful feature responses according to focusing on different feature weights, and reduces irrelevant spatial information through the first spatial attention module of the CBAM, specifically including the following steps S201 to S205 as shown in fig. 2.
S201, feature extraction is performed on the low-light image using convolution and the ReLU activation function, determining an intermediate feature map $F \in \mathbb{R}^{C \times H \times W}$.
S202, the first channel attention module and the first spatial attention module inject attention into the intermediate feature map, which is then multiplied by the attention maps to determine a refined feature map. CBAM is a simple and effective attention module for convolutional neural networks: given any intermediate feature map in a CNN, CBAM infers attention maps along the two independent dimensions of channel and space, multiplies them with the input feature map, and performs adaptive feature refinement on it. Because CBAM is an end-to-end generic module, it can be seamlessly integrated into any CNN architecture and trained end-to-end with the base CNN.
Determining the refined feature map specifically comprises the following steps (1) to (4). (1) Perform global max pooling and mean pooling on the intermediate feature map $F \in \mathbb{R}^{C \times H \times W}$ per channel to obtain two pooled one-dimensional vectors, feed them into a fully connected layer and add the results, determining the one-dimensional channel attention $M_C \in \mathbb{R}^{C \times 1 \times 1}$. (2) Multiply the channel attention $M_C$ element-wise with the intermediate feature map $F$, determining the adjusted feature map $F'$. (3) Perform global max pooling and mean pooling on $F'$ spatially to obtain two pooled two-dimensional maps, then concatenate and convolve them, determining the two-dimensional spatial attention $M_s \in \mathbb{R}^{1 \times H \times W}$. (4) Multiply the spatial attention $M_s$ element-wise with $F'$, determining the refined feature map.
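Steps (1) to (4) can be sketched in NumPy as follows; the weights are random placeholders, the shared MLP is the standard two-layer form of CBAM's channel attention, the spatial convolution is implemented naively, and the function name `cbam_refine` is our own.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_refine(F, W1, W2, k_spatial):
    """CBAM refinement of an intermediate feature map F (C,H,W).
    W1 (C/r, C) and W2 (C, C/r): shared MLP of the channel attention.
    k_spatial (2, k, k): kernel of the spatial-attention convolution."""
    C, H, W = F.shape
    # (1) channel attention: global max- and mean-pool per channel -> two C-vectors
    v_max = F.reshape(C, -1).max(axis=1)
    v_avg = F.reshape(C, -1).mean(axis=1)
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)   # FC -> ReLU -> FC
    Mc = sigmoid(mlp(v_max) + mlp(v_avg))          # M_C in R^{C x 1 x 1}
    # (2) reweight channels -> adjusted feature map F'
    Fp = F * Mc[:, None, None]
    # (3) spatial attention: max- and mean-pool over channels, concat, k x k conv
    s = np.stack([Fp.max(axis=0), Fp.mean(axis=0)])  # (2, H, W)
    k = k_spatial.shape[-1]
    p = k // 2
    s_pad = np.pad(s, ((0, 0), (p, p), (p, p)))
    Ms = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            Ms[i, j] = np.sum(s_pad[:, i:i + k, j:j + k] * k_spatial)
    Ms = sigmoid(Ms)                                 # M_s in R^{1 x H x W}
    # (4) reweight spatial locations -> refined feature map
    return Fp * Ms[None]

rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 2
refined = cbam_refine(rng.standard_normal((C, H, W)),
                      rng.standard_normal((C // r, C)),
                      rng.standard_normal((C, C // r)),
                      rng.standard_normal((2, 7, 7)))
```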
S203, the refined feature map is processed with multi-layer convolution operations and the result is activated with the ReLU activation function, determining a 64-channel multi-layer convolution feature map.

S204, the intermediate feature map $F$ and the multi-layer convolution feature map are concatenated along the channel dimension, determining a 128-channel concatenated feature map.

S205, the concatenated feature map is normalized, determining the 3-channel reflection map component and the 1-channel illumination map component.
The training method of the decomposition module provided by the invention comprises the following steps: and training the decomposition module according to the decomposition loss function.
The decomposition loss function is expressed as:

$$L_{dec} = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{lap} L_{lap} + \lambda_{ssim} L_{ssim}$$

wherein $\lambda_{ir}$ represents the reflection-component consistency loss coefficient, $\lambda_{lap}$ represents the Laplace loss coefficient, and $\lambda_{ssim}$ represents the structural similarity loss coefficient;

$L_{recon}$ represents the reconstruction loss function, expressed as:

$$L_{recon} = \sum_{i \in \{low,\,high\}} \sum_{j \in \{low,\,high\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$

$L_{ir}$ represents the reflection-component consistency loss, expressed as:

$$L_{ir} = \left\| R_{low} - R_{high} \right\|_1$$

$L_{ssim}$ represents the structural similarity loss, expressed as:

$$L_{ssim} = 1 - \mathrm{SSIM}(R_{low}, R_{high})$$

$L_{lap}$ represents the Laplace loss, expressed as:

$$L_{lap} = \sum_{i} \left\| L_i(R_{high}) - L_i(R_{low}) \right\|_1$$

wherein $S$ represents the low-light image, $R$ represents the reflection map component, $I$ represents the illumination map component, the subscripts $low$ and $high$ denote the low-light and high-light versions, $\lambda_{ij}$ represents the reconstruction loss coefficient, $R_{high}$ represents the high-light reflection map component, $R_{low}$ represents the low-light reflection map component, and $i$ indexes the $i$-th Laplacian pyramid layer. The Laplace loss captures global and local information of the features and estimates the color difference between the high-light reflection component $R_{high}$ and the low-light reflection component $R_{low}$, where $L_i(R_{high})$ and $L_i(R_{low})$ are the Laplacian pyramids of the high-light and low-light reflection map components.
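A NumPy sketch of the decomposition loss described above, under stated assumptions: ℓ1 norms throughout, a single-window SSIM computed over the whole map (a simplification of windowed SSIM), an average-pool Laplacian pyramid, and placeholder coefficient values.

```python
import numpy as np

def _down(x):
    """2x downsampling by 2x2 average pooling (crops to even size first)."""
    h, w = x.shape[-2] // 2 * 2, x.shape[-1] // 2 * 2
    x = x[..., :h, :w]
    return (x[..., ::2, ::2] + x[..., 1::2, ::2]
            + x[..., ::2, 1::2] + x[..., 1::2, 1::2]) / 4.0

def _up(x):
    return np.repeat(np.repeat(x, 2, axis=-2), 2, axis=-1)

def lap_pyramid(x, levels=3):
    pyr, cur = [], x
    for _ in range(levels - 1):
        nxt = _down(cur)
        up = _up(nxt)[..., :cur.shape[-2], :cur.shape[-1]]
        pyr.append(cur - up)
        cur = nxt
    return pyr + [cur]

def global_ssim(a, b, c1=1e-4, c2=9e-4):
    """Single-window SSIM over the whole map (simplified stand-in)."""
    ma, mb = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - ma) * (b - mb)).mean()
    return ((2 * ma * mb + c1) * (2 * cov + c2)) / ((ma**2 + mb**2 + c1) * (va + vb + c2))

def decomposition_loss(R, I, S, lam_ij, lam_ir=0.01, lam_lap=0.1, lam_ssim=0.1):
    """R, I, S: dicts with 'low'/'high' entries for reflectance (3,H,W),
    illumination (1,H,W), and input image (3,H,W)."""
    recon = sum(lam_ij[(i, j)] * np.abs(R[i] * I[j] - S[j]).mean()
                for i in ('low', 'high') for j in ('low', 'high'))
    l_ir = np.abs(R['low'] - R['high']).mean()
    l_lap = sum(np.abs(a - b).mean()
                for a, b in zip(lap_pyramid(R['high']), lap_pyramid(R['low'])))
    l_ssim = 1.0 - global_ssim(R['low'], R['high'])
    return recon + lam_ir * l_ir + lam_lap * l_lap + lam_ssim * l_ssim

# sanity check: a perfectly consistent decomposition yields zero loss
R = {'low': np.full((3, 8, 8), 0.5), 'high': np.full((3, 8, 8), 0.5)}
I = {'low': np.full((1, 8, 8), 0.8), 'high': np.full((1, 8, 8), 0.8)}
S = {k: R[k] * I[k] for k in ('low', 'high')}
lam_ij = {(i, j): 1.0 for i in ('low', 'high') for j in ('low', 'high')}
loss = decomposition_loss(R, I, S, lam_ij)
```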
S102, inputting the illumination map component into a trained enhancement module and determining an enhanced illumination map component, wherein the enhancement module comprises a global attention module: its second channel attention module screens the feature information and retains the effective feature information, and its second spatial attention module fuses the feature information and amplifies the global brightness of the illumination map component. The invention improves the network by adding a Global Attention Module (GAM) attention mechanism, which focuses on global brightness information and thus better enhances the illumination map component.
In S102, the second channel attention module of the global attention module screening the feature information and retaining the effective feature information, and the second spatial attention module fusing the feature information and amplifying the global brightness of the illumination map component, specifically comprises the following steps (1) to (5).
(1) Performing convolution operation on the illumination map component to generate a feature map, wherein the convolution kernel has a size of 3×3, generating the feature map by convolution operation with the illumination map component obtained in the decomposition module as input in step S205, performing nonlinear activation on the generated feature map by using an activation function ReLu, and determining a feature map F 1 。
(2) Process feature map F1 with the global attention module and the Sigmoid activation function, determining feature map M(F2). This specifically comprises the following steps (1) to (4).
(1) Use the global attention module to perform dimension conversion on feature map F1, determining the transformed-dimension feature map F1'.
(2) Process the transformed-dimension feature map F1' with a multi-layer perceptron, determining a feature map F1'' with the same dimensions as F1; activate F1'' with the Sigmoid activation function, determining feature map F2.
(3) Perform a convolution operation on feature map F2 with a convolution kernel of size 7, reducing the number of channels of F2 and determining feature map F2'.
(4) Perform another convolution operation on feature map F2' with a convolution kernel of size 7, increasing the number of channels of F2' and determining feature map M(F2); M(F2) has the same number of channels as feature map F1.
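Steps (1) to (4) above can be sketched in PyTorch. This is a minimal sketch: the reduction ratio r=4 for the MLP and the 7×7 convolutions is an assumption (the patent does not state it), and it follows the text literally in returning M(F2) rather than multiplying the attention map back onto F1:

```python
import torch
import torch.nn as nn

class GlobalAttention(nn.Module):
    """Sketch of the GAM sub-steps (1)-(4): channel attention via a
    permute + MLP + Sigmoid, then spatial attention via two 7x7
    convolutions that reduce and restore the channel count."""
    def __init__(self, channels: int, r: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(                      # channel-attention MLP
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
        )
        self.conv_down = nn.Conv2d(channels, channels // r, 7, padding=3)
        self.conv_up = nn.Conv2d(channels // r, channels, 7, padding=3)
        self.sigmoid = nn.Sigmoid()

    def forward(self, f1: torch.Tensor) -> torch.Tensor:
        # (1) dimension conversion: (B,C,H,W) -> (B,H,W,C), i.e. F1'
        f1p = f1.permute(0, 2, 3, 1)
        # (2) MLP restores F1'' to F1's dimensions; Sigmoid gives F2
        f2 = self.sigmoid(self.mlp(f1p)).permute(0, 3, 1, 2)
        # (3) 7x7 convolution reduces the channel count: F2'
        f2p = self.conv_down(f2)
        # (4) 7x7 convolution restores the channel count: M(F2)
        return self.sigmoid(self.conv_up(f2p))
```

A usage note: because the last layer is a Sigmoid, all entries of M(F2) lie in (0, 1), which is what makes it usable as an attention map.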
(3) Use convolution, the activation function ReLU, and a BN layer in cascaded form multiple times to process feature map M(F2), determining feature map F3. This specifically comprises the following steps (1) to (3): (1) perform a convolution operation on feature map M(F2), with 64 convolution kernels of size 3×3, and add a BN layer so that the feature maps after the convolution operation have similar distributions; (2) activate the result with the activation function ReLU; (3) cycle the above steps multiple times, determining feature map F3.
(4) Process feature map F3 with a convolution (kernel size 3×3) and the Sigmoid activation function, determining feature map F4.
(5) Process feature map F4 with a convolution, then reconstruct F4 with a further convolution, determining the enhanced illumination map component. Reconstructing feature map F4 with convolution comprises: reconstructing F4 with a convolution operation whose kernel size is 3×3.
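The five steps above can be sketched end to end. In this minimal sketch the GAM of step (2) is abbreviated to a single Sigmoid-gated 7×7 convolution, and n_blocks = 4 is an assumed repeat count for step (3) (the patent only says that step is cycled a plurality of times):

```python
import torch
import torch.nn as nn

class EnhanceNet(nn.Module):
    """Sketch of enhancement steps (1)-(5). The channel width 64 comes
    from step (3); the attention gate stands in for the full GAM."""
    def __init__(self, n_blocks: int = 4):
        super().__init__()
        # (1): 3x3 conv + ReLU on the 1-channel illumination map -> F1
        self.head = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(True))
        # (2): stand-in for the GAM, producing an M(F2)-like gated map
        self.gate = nn.Sequential(nn.Conv2d(64, 64, 7, padding=3), nn.Sigmoid())
        # (3): cascaded conv + BN + ReLU blocks -> F3
        body = []
        for _ in range(n_blocks):
            body += [nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(True)]
        self.body = nn.Sequential(*body)
        # (4): 3x3 conv + Sigmoid -> F4
        self.tail = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.Sigmoid())
        # (5): 3x3 reconstruction conv back to a 1-channel illumination map
        self.recon = nn.Conv2d(64, 1, 3, padding=1)

    def forward(self, illum: torch.Tensor) -> torch.Tensor:
        f1 = self.head(illum)
        m = self.gate(f1)
        f3 = self.body(m)
        f4 = self.tail(f3)
        return self.recon(f4)
```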
In S102, the training method of the enhancement module includes: training the enhancement module according to the enhancement loss function; the enhancement loss function is expressed as:
wherein λ_lap represents the Laplace loss coefficient.
The reconstruction loss function is represented as:
wherein λ represents the reconstruction loss coefficient, R_low represents the low-light reflection map component, Î represents the enhanced illumination map component, and S_high represents the high-light input image.
The Laplace loss is represented as:
wherein L_i(Ŝ) represents the Laplacian pyramid of the enhanced image and L_i(S_low) represents the Laplacian pyramid of the low-light image.
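The enhancement-loss formula images are likewise missing. From the variable definitions, a plausible form (an assumption, with Î denoting the enhanced illumination map component and Ŝ the enhanced image) is:

```latex
\mathcal{L}_{enhance} = \mathcal{L}_{recon} + \lambda_{lap}\,\mathcal{L}_{lap},\qquad
\mathcal{L}_{recon} = \lambda \left\| R_{low} \circ \hat{I} - S_{high} \right\|_1,\qquad
\mathcal{L}_{lap} = \sum_{i} \left\| L_i(\hat{S}) - L_i(S_{low}) \right\|_1
```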
In the enhancement module, a Laplace loss function is introduced to constrain the whole network, so that the output result is smoother.
And S103, multiplying the reflection map component and the enhanced illumination map component element by element for fusion, and determining the processed low illumination image.
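The fusion in S103 is a broadcast element-wise product of the 3-channel reflection map and the 1-channel enhanced illumination map. A minimal NumPy sketch (the clipping to [0, 1] is an assumption; the patent does not state how out-of-range values are handled):

```python
import numpy as np

def retinex_fuse(reflectance: np.ndarray, illum_enhanced: np.ndarray) -> np.ndarray:
    """Element-wise product of the reflection map component (H, W, 3)
    and the enhanced illumination map component (H, W), broadcast
    across the color channels."""
    if illum_enhanced.ndim == 2:
        illum_enhanced = illum_enhanced[..., None]  # (H, W) -> (H, W, 1)
    out = reflectance * illum_enhanced
    return np.clip(out, 0.0, 1.0)  # keep the result a valid image

# toy example: uniform reflectance, half-strength illumination
R = np.full((4, 4, 3), 0.8)
I = np.full((4, 4), 0.5)
enhanced = retinex_fuse(R, I)  # every pixel becomes 0.8 * 0.5 = 0.4
```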
Compared with existing Retinex-based low-illumination enhancement methods: the invention adds a CBAM attention mechanism to the decomposition module, which improves the denoising capability of the decomposition module, so that the reflection map component and the illumination map component it generates contain less noise.
The invention uses a denoising convolutional neural network (DnCNN) in the enhancement module and adds a GAM attention mechanism: the DnCNN structure strengthens the denoising capability of the enhancement module, and the GAM makes the network attend to global brightness, improving the enhancement effect of the network.
The invention uses peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as evaluation indexes. Compared on these indexes against 11 mainstream methods (RetinexNet, BIMEF, LIME, MF, Dong, NPE, SRIE, CRM, MSR, RRM, and GLAD), the proposed method is superior to all 11, as shown in Table 1.
Table 1 quantitative test comparison on LOL dataset
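The PSNR index used above can be implemented in a few lines; this is a generic sketch (the peak value of 1.0 for normalized images is an assumption, and SSIM is typically computed with an off-the-shelf routine such as scikit-image's structural_similarity):

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images with values in [0, peak]."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))

# example: a uniform error of 0.1 gives MSE = 0.01, i.e. PSNR = 20 dB
value = psnr(np.zeros((8, 8)), np.full((8, 8), 0.1))
```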
In terms of enhancement effect, this embodiment visually compares its results with the enhancement results of the other methods on the LOL dataset; the results of the proposed method are better in aspects such as detail and color, and the visual effect is more natural. In an embodiment of the present invention, an input low-light image is shown in fig. 3, a RetinexNet-enhanced image is shown in fig. 4, a BIMEF-enhanced image is shown in fig. 5, a LIME-enhanced image is shown in fig. 6, and the image finally output by the provided method is shown in fig. 7.
In this specification, each embodiment is described in a progressive manner, and the same or similar parts of each embodiment are referred to each other, and each embodiment is mainly described as a difference from other embodiments. All or portions of the present invention are operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, mobile communication terminals, multiprocessor systems, microprocessor-based systems, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the present invention; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (10)
1. A multi-attention mechanism and Retinex based low-light image enhancement method, comprising:
inputting the low-light image into a trained decomposition module, and determining an illumination map component and a reflection map component, wherein the decomposition module comprises: a CBAM, a first channel attention module of the CBAM reducing extraneous feature responses and activating useful feature responses according to different feature weights of interest, and reducing extraneous spatial information by a first spatial attention module of the CBAM;
inputting the illumination map component into a trained enhancement module, and determining an enhanced illumination map component; wherein the enhancement module comprises: the second channel attention module of the global attention module screens the characteristic information and retains the effective characteristic information, and the second space attention module of the global attention module fuses the characteristic information and amplifies the global brightness of the illumination graph component;
and fusing the reflection map component and the enhanced illumination map component, and determining a processed low illumination image.
2. The method of claim 1, wherein the first channel attention module of the CBAM reduces extraneous feature responses and activates useful feature responses according to different feature weights of interest, and reduces extraneous spatial information by the first spatial attention module of the CBAM, comprising:
performing feature extraction on the low-light image by using a convolution and activation function ReLu to determine an intermediate feature map;
the first channel attention module and the first space attention module inject attention to the intermediate feature map, and then multiply the attention by the intermediate feature map to determine a refined feature map;
processing the refined feature map by using a multi-layer convolution operation, and activating a processing result by using the activation function ReLu to determine a multi-layer convolution feature map;
performing channel splicing on the intermediate feature map and the multilayer convolution feature map to determine a spliced feature map;
and carrying out normalization processing on the spliced feature map, determining the 3-channel reflection map component and the 1-channel illumination map component.
3. The method of claim 2, wherein the determining a refined feature map comprises:
carrying out global maximum pooling and mean pooling on the intermediate feature map according to the channel to obtain two pooled one-dimensional vectors, inputting the two pooled one-dimensional vectors into a full-connection layer for addition operation, and determining the attention of the one-dimensional channel;
multiplying the attention of the one-dimensional channel with the intermediate feature map by elements to determine an adjustment feature map;
carrying out global maximum pooling and mean pooling on the adjustment feature map according to space to obtain two pooled two-dimensional vectors, and carrying out splicing operation and convolution operation on the two pooled two-dimensional vectors to determine the attention of the two-dimensional space;
and multiplying the two-dimensional space attention by the adjustment feature map by elements to determine the refinement feature map.
4. The method of claim 1, wherein the training method of the decomposition module comprises: training the decomposition module according to the decomposition loss function;
the decomposition loss function is expressed as:
wherein λ_ir represents the reflection component coherence loss coefficient, λ_lap represents the Laplace loss coefficient, and λ_ssim represents the structural similarity loss coefficient;
the reconstruction loss function is represented as:
the reflection component coherence loss is represented as:
the structural similarity loss is represented as:
the Laplace loss is represented as:
wherein S represents the low-light image, R represents the reflection map component, I represents the illumination map component, the subscripts low and high denote low light and high light, λ_ij represents the reconstruction loss coefficient, R_high represents the high-light reflection map component, R_low represents the low-light reflection map component, i represents the i-th layer of the Laplacian pyramid, L_j(R_high) represents the Laplacian pyramid of the high-light reflection map component, and L_j(R_low) represents the Laplacian pyramid of the low-light reflection map component.
5. The method of claim 1, wherein the second channel attention module of the global attention module screens feature information and retains valid feature information, and the second spatial attention module of the global attention module performs feature information fusion and amplifies the global brightness of the illumination map component, comprising:
performing a convolution operation on the illumination map component to generate a feature map, the convolution kernel size being 3×3, performing nonlinear activation on the generated feature map with the activation function ReLU, and determining feature map F1;
processing feature map F1 with the global attention module and the Sigmoid activation function, determining feature map M(F2);
processing feature map M(F2) multiple times in cascaded form with convolution, the activation function ReLU, and a BN layer, determining feature map F3;
processing feature map F3 with convolution and the Sigmoid activation function, determining feature map F4;
processing feature map F4 with convolution, then reconstructing feature map F4 with convolution, determining the enhanced illumination map component.
6. The method of claim 5, wherein processing feature map F1 with the global attention module and the Sigmoid activation function to determine feature map M(F2) comprises:
performing dimension conversion on feature map F1 with the global attention module, determining the transformed-dimension feature map F1';
processing the transformed-dimension feature map F1' with a multi-layer perceptron, determining a feature map F1'' with the same dimensions as F1, and activating F1'' with the Sigmoid activation function, determining feature map F2;
performing a convolution operation on feature map F2 with a convolution kernel of size 7, reducing the number of channels of F2 and determining feature map F2';
performing another convolution operation on feature map F2' with a convolution kernel of size 7, increasing the number of channels of F2' and determining feature map M(F2), wherein M(F2) has the same number of channels as feature map F1.
7. The method of claim 5, wherein processing feature map M(F2) to determine feature map F3 comprises:
performing a convolution operation on feature map M(F2) and adding a BN layer so that the feature maps after the convolution operation have similar distributions;
activating the result with the activation function ReLU;
cycling the above steps multiple times, determining feature map F3.
8. The method of claim 5, wherein reconstructing feature map F4 with convolution comprises:
reconstructing feature map F4 with a convolution whose kernel size is 3×3.
9. The method of claim 1, wherein the training method of the enhancement module comprises: training the enhancement module according to an enhancement loss function;
the enhancement loss function is expressed as:
wherein λ_lap represents the Laplace loss coefficient;
the reconstruction loss function is represented as:
wherein λ represents the reconstruction loss coefficient, R_low represents the low-light reflection map component, Î represents the enhanced illumination map component, and S_high represents the high-light input image.
the Laplace loss is represented as:
wherein L_i(Ŝ) represents the Laplacian pyramid of the enhanced image and L_i(S_low) represents the Laplacian pyramid of the low-light image.
10. The method of claim 1, wherein the fusing the reflectogram component with the enhanced illumination map component comprises:
and multiplying the reflection map component and the enhanced illumination map component element by element, and outputting the processed low illumination image.
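The CBAM attention steps recited in claims 2 and 3 can be sketched as follows. This is a minimal PyTorch sketch, assuming a conventional reduction ratio r=16 and a 7×7 spatial kernel (neither value is fixed by the claims):

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Sketch of claims 2-3: channel attention from global max- and
    mean-pooling through a shared fully-connected layer, then spatial
    attention from channel-wise max/mean maps and a convolution."""
    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        self.fc = nn.Sequential(                # shared fully-connected layer
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # one-dimensional channel attention: pool per channel, add through FC
        mx = torch.amax(x, dim=(2, 3))
        av = torch.mean(x, dim=(2, 3))
        ca = self.sigmoid(self.fc(mx) + self.fc(av)).view(b, c, 1, 1)
        adj = x * ca                            # "adjustment feature map"
        # two-dimensional spatial attention: pool over channels, splice, convolve
        sa = self.sigmoid(self.spatial(torch.cat(
            [adj.amax(dim=1, keepdim=True), adj.mean(dim=1, keepdim=True)], dim=1)))
        return adj * sa                         # refined feature map
```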
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310573810.5A CN116645305A (en) | 2023-05-19 | 2023-05-19 | Low-light image enhancement method based on multi-attention mechanism and Retinex |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116645305A true CN116645305A (en) | 2023-08-25 |
Family
ID=87642842
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310573810.5A Pending CN116645305A (en) | 2023-05-19 | 2023-05-19 | Low-light image enhancement method based on multi-attention mechanism and Retinex |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116645305A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---|
CN117372307A (en) * | 2023-12-01 | 2024-01-09 | 南京航空航天大学 | Multi-unmanned aerial vehicle collaborative detection distributed image enhancement method
CN117372307B (en) * | 2023-12-01 | 2024-02-23 | 南京航空航天大学 | Multi-unmanned aerial vehicle collaborative detection distributed image enhancement method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ren et al. | LECARM: Low-light image enhancement using the camera response model | |
Marnerides et al. | Expandnet: A deep convolutional neural network for high dynamic range expansion from low dynamic range content | |
Ancuti et al. | Single-scale fusion: An effective approach to merging images | |
CN110675336A (en) | Low-illumination image enhancement method and device | |
Rahman et al. | Structure revealing of low-light images using wavelet transform based on fractional-order denoising and multiscale decomposition | |
Wang et al. | Low-light image enhancement based on virtual exposure | |
JP2019219928A (en) | Image processing device, image processing method, and image processing program | |
Liu et al. | Underexposed image correction via hybrid priors navigated deep propagation | |
Robidoux et al. | End-to-end high dynamic range camera pipeline optimization | |
Lyu et al. | An efficient learning-based method for underwater image enhancement | |
CN116645305A (en) | Low-light image enhancement method based on multi-attention mechanism and Retinex | |
Tan et al. | A real-time video denoising algorithm with FPGA implementation for Poisson–Gaussian noise | |
CN115393231A (en) | Defect image generation method and device, electronic equipment and storage medium | |
Panetta et al. | Deep perceptual image enhancement network for exposure restoration | |
Li et al. | A degradation model for simultaneous brightness and sharpness enhancement of low-light image | |
Ou et al. | Real-time tone mapping: A state of the art report | |
Chen et al. | Retinex low-light image enhancement network based on attention mechanism | |
Wang et al. | Learning a self‐supervised tone mapping operator via feature contrast masking loss | |
Huang et al. | Underwater image enhancement based on color restoration and dual image wavelet fusion | |
Jia et al. | An extended variational image decomposition model for color image enhancement | |
Zhang et al. | A cross-scale framework for low-light image enhancement using spatial–spectral information | |
Tao et al. | An effective and robust underwater image enhancement method based on color correction and artificial multi-exposure fusion | |
Soma et al. | An efficient and contrast-enhanced video de-hazing based on transmission estimation using HSL color model | |
Wu et al. | Contrast enhancement based on reflectance-oriented probabilistic equalization | |
Cui et al. | Attention-guided multi-scale feature fusion network for low-light image enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||