CN116645305A - Low-light image enhancement method based on multi-attention mechanism and Retinex - Google Patents

Low-light image enhancement method based on multi-attention mechanism and Retinex

Info

Publication number
CN116645305A
Authority
CN
China
Prior art keywords
map
feature map
component
feature
low
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310573810.5A
Other languages
Chinese (zh)
Inventor
朱娟娟
郑世鑫
刘佳琪
陈鹏吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202310573810.5A priority Critical patent/CN116645305A/en
Publication of CN116645305A publication Critical patent/CN116645305A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a low-light image enhancement method based on a multi-attention mechanism and Retinex, relating to the technical field of image processing, which solves the prior-art problems of weak realism, severe noise and distortion, and partial under-exposure and over-exposure of the enhanced image. The method comprises the following steps: inputting the low-light image into a trained decomposition module and determining an illumination map component and a reflection map component; inputting the illumination map component into a trained enhancement module and determining an enhanced illumination map component; and fusing the reflection map component with the enhanced illumination map component to determine the processed low-light image. The method removes noise accurately, amplifies the global brightness of the features, improves the enhancement effect, and makes the output fused image smoother.

Description

Low-light image enhancement method based on multi-attention mechanism and Retinex
Technical Field
The invention relates to the technical field of image processing, in particular to a low-light image enhancement method based on a multi-attention mechanism and Retinex.
Background
Low-light image enhancement is an end-to-end RGB color image restoration technique that enhances an image by analyzing its brightness, contrast, and other information. Existing image enhancement methods fall largely into three categories: traditional enhancement methods, Retinex-based enhancement methods, and deep-learning-based enhancement methods. Traditional image processing methods are typified by histogram equalization, a histogram modification method based on the cumulative distribution function: the image histogram is adjusted toward a uniform distribution to stretch the dynamic range of the image and thereby improve contrast. Such methods are simple to operate and highly efficient, but the generated images are prone to artifacts and look unrealistic. Retinex-based low-light enhancement methods simulate the visual mechanism of the human eye to decompose an image into an illumination component and a reflection component, and improve image brightness by adjusting the illumination component; however, their parameters must be set manually, so they cannot adaptively process diverse images, causing partial under-exposure and over-exposure. Deep-learning-based low-light enhancement methods learn an optimal network model from a large-scale dataset and establish a complex mapping between low-light and normal-light images.
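By way of illustration, the cumulative-distribution remapping that histogram equalization performs can be sketched in a few lines of Python (a minimal sketch assuming an 8-bit grayscale input; illustrative only, not part of the claimed method):

```python
import numpy as np

def hist_equalize(gray):
    """CDF-based histogram equalization: remap intensities so the histogram
    approaches a uniform distribution, stretching the dynamic range."""
    hist = np.bincount(gray.ravel(), minlength=256)       # intensity histogram
    cdf = hist.cumsum().astype(np.float64)                # cumulative distribution
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())     # normalize CDF to [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)            # intensity remapping table
    return lut[gray]                                      # apply lookup per pixel
```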
With the rapid development of deep learning, many low-level vision tasks have benefited greatly. RetinexNet is a typical deep-learning-based low-light enhancement model whose network structure consists mainly of a decomposition network and an enhancement network: the decomposition network is composed of five convolution layers with ReLU, and the enhancement network is an encoder-decoder structure. However, its enhanced results suffer from low smoothness and excessive noise, causing severe image distortion.
Disclosure of Invention
The invention provides a low-light image enhancement method based on a multi-attention mechanism and Retinex, which solves the prior-art problems of weak realism, severe noise and distortion, and partial under-exposure and over-exposure of the enhanced image, thereby removing noise accurately, amplifying the global brightness of the features, improving the enhancement effect, and making the output fused image smoother.
The invention provides a low-light image enhancement method based on a multi-attention mechanism and Retinex, which comprises the following steps:
inputting the low-light image into a trained decomposition module, and determining an illumination map component and a reflection map component, wherein the decomposition module comprises: a CBAM, a first channel attention module of the CBAM reducing extraneous feature responses and activating useful feature responses according to different feature weights of interest, and reducing extraneous spatial information by a first spatial attention module of the CBAM;
inputting the illumination map component into a trained enhancement module, and determining an enhanced illumination map component; wherein the enhancement module comprises: a global attention module, the second channel attention module of the global attention module screening the feature information and retaining the effective feature information, and the second spatial attention module of the global attention module fusing the feature information and amplifying the global brightness of the illumination map component;
and fusing the reflection map component and the enhanced illumination map component, and determining a processed low illumination image.
In one possible implementation, the first channel attention module of the CBAM reduces irrelevant feature responses and activates useful feature responses according to the different feature weights attended to, and the first spatial attention module of the CBAM reduces irrelevant spatial information, comprising:
performing feature extraction on the low-light image by using convolution and the activation function ReLU to determine an intermediate feature map;
the first channel attention module and the first spatial attention module inject attention into the intermediate feature map, and the attention is then multiplied with the intermediate feature map to determine a refined feature map;
processing the refined feature map by using a multi-layer convolution operation, and activating the processing result by using the activation function ReLU to determine a multi-layer convolution feature map;
performing channel splicing on the intermediate feature map and the multilayer convolution feature map to determine a spliced feature map;
and carrying out normalization processing on the spliced feature map, and determining the 3-channel reflection map component and the 1-channel illumination map component.
In one possible implementation, the determining of the refined feature map comprises:
carrying out global max pooling and mean pooling on the intermediate feature map per channel to obtain two pooled one-dimensional vectors, inputting the two pooled one-dimensional vectors into a fully connected layer for addition, and determining the one-dimensional channel attention;
multiplying the one-dimensional channel attention element-wise with the intermediate feature map to determine an adjusted feature map;
carrying out global max pooling and mean pooling on the adjusted feature map spatially to obtain two pooled two-dimensional vectors, and carrying out splicing and convolution operations on the two pooled two-dimensional vectors to determine the two-dimensional spatial attention;
and multiplying the two-dimensional spatial attention element-wise with the adjusted feature map to determine the refined feature map.
In one possible implementation manner, the training method of the decomposition module includes: training the decomposition module according to the decomposition loss function;
the decomposition loss function is expressed as:

$$L_{decom} = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{lap} L_{lap} + \lambda_{ssim} L_{ssim}$$

wherein $\lambda_{ir}$ represents the reflection component consistency loss coefficient, $\lambda_{lap}$ represents the Laplace loss coefficient, and $\lambda_{ssim}$ represents the structural similarity loss coefficient;

$L_{recon}$ represents the reconstruction loss function, which is expressed as:

$$L_{recon} = \sum_{i \in \{low, high\}} \sum_{j \in \{low, high\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$

$L_{ir}$ represents the reflection component consistency loss, which is expressed as:

$$L_{ir} = \left\| R_{low} - R_{high} \right\|_1$$

$L_{ssim}$ represents the structural similarity loss, which is expressed as:

$$L_{ssim} = 1 - SSIM(R_{low}, R_{high})$$

$L_{lap}$ represents the Laplace loss, which is expressed as:

$$L_{lap} = \sum_{j} \left\| L_j(R_{high}) - L_j(R_{low}) \right\|_1$$

wherein $S$ represents the low-light image, $R$ represents the reflection map component, $I$ represents the illumination map component, the subscripts $low$ and $high$ denote low light and high light, $\lambda_{ij}$ represents the reconstruction loss coefficients, $R_{high}$ represents the high-light reflection map component, $R_{low}$ represents the low-light reflection map component, $j$ indexes the $j$-th layer of the Laplacian pyramid, $L_j(R_{high})$ represents the Laplacian pyramid of the high-light reflection map component, and $L_j(R_{low})$ represents the Laplacian pyramid of the low-light reflection map component.
In one possible implementation, the second channel attention module of the global attention module screens the feature information and retains the effective feature information, and the second spatial attention module of the global attention module fuses the feature information and amplifies the global brightness of the illumination map component, comprising:
performing a convolution operation on the illumination map component to generate a feature map, with a convolution kernel size of 3×3, performing nonlinear activation on the generated feature map by using the activation function ReLU, and determining a feature map F_1;
processing the feature map F_1 by using the global attention module and the activation function Sigmoid to determine a feature map M(F_2);
processing the feature map M(F_2) by multiple uses of convolution, the activation function ReLU, and BN layers in cascade form to determine a feature map F_3;
processing the feature map F_3 by using convolution and the activation function Sigmoid to determine a feature map F_4;
and processing the feature map F_4 by using convolution and then reconstructing the feature map F_4 to determine the enhanced illumination map component.
In one possible implementation, the processing of the feature map F_1 by using the global attention module and the activation function Sigmoid to determine the feature map M(F_2) comprises:
performing dimension conversion on the feature map F_1 by using the global attention module to determine a transformed-dimension feature map F_1';
processing the feature map F_1' by using a multi-layer perceptron to determine a feature map F_1'' with the same dimensions as the feature map F_1, and activating the feature map F_1'' by using the activation function Sigmoid to determine a feature map F_2;
performing a convolution operation on the feature map F_2 with a convolution kernel of 7 to reduce the number of channels of the feature map F_2, determining a feature map F_2';
and performing another convolution operation on the feature map F_2' with a convolution kernel of 7 to restore the number of channels, determining the feature map M(F_2), wherein the feature map M(F_2) has the same number of channels as the feature map F_1.
In one possible implementation, the processing of the feature map M(F_2) by multiple uses of convolution, the activation function ReLU, and BN layers in cascade form to determine the feature map F_3 comprises:
performing a convolution operation on the feature map M(F_2), and adding a BN layer so that the feature maps after the convolution operation have similar distributions;
activating the feature map M(F_2) by using the activation function ReLU;
and cycling the above steps a plurality of times to determine the feature map F_3.
In one possible implementation, the reconstructing of the feature map F_4 by using convolution comprises:
reconstructing the feature map F_4 by using a 3×3 convolution kernel.
In one possible implementation manner, the training method of the enhancement module includes: training the enhancement module according to an enhancement loss function;
the enhancement loss function is expressed as:

$$L_{enhance} = L_{recon} + \lambda_{lap} L_{lap}$$

wherein $\lambda_{lap}$ represents the Laplace loss coefficient;

$L_{recon}$ represents the reconstruction loss function, which is expressed as:

$$L_{recon} = \lambda \left\| R_{low} \circ \hat{I} - S_{high} \right\|_1$$

wherein $\lambda$ represents the reconstruction loss coefficient, $R_{low}$ represents the low-light reflection map component, $\hat{I}$ represents the enhanced illumination map component, and $S_{high}$ represents the high-light input image;

$L_{lap}$ represents the Laplace loss, which is expressed as:

$$L_{lap} = \sum_{i} \left\| L_i(\hat{S}) - L_i(S_{low}) \right\|_1$$

wherein $\hat{S} = R_{low} \circ \hat{I}$ denotes the enhanced image, $L_i(\hat{S})$ represents the Laplacian pyramid of the enhanced image, and $L_i(S_{low})$ represents the Laplacian pyramid of the low-light image.
In one possible implementation, the fusing the reflectogram component with the enhanced illumination map component includes:
and multiplying the reflection map component and the enhanced illumination map component element by element, and outputting the processed low illumination image.
One or more technical solutions provided in the embodiments of the present invention at least have the following technical effects or advantages:
the invention adopts a low-light image enhancement method based on a multi-attention mechanism and Retinex, which comprises the following steps: inputting the low-light image into a trained decomposition module to determine an illumination map component and a reflection map component, wherein the decomposition module comprises: the method comprises the steps that a CBAM, a first channel attention module of the CBAM reduces irrelevant characteristic responses and activates useful characteristic responses according to different focused characteristic weights, irrelevant spatial information is reduced through a first spatial attention module of the CBAM, sensitivity of a decomposition network to noise is improved, and noise with different degrees is removed accurately; inputting the illumination map component into a trained enhancement module, and determining an enhanced illumination map component; wherein, the enhancement module includes: the second channel attention module of the global attention module screens the characteristic information and retains the effective characteristic information, the second space attention module of the global attention module fuses the characteristic information, the global brightness of the illumination map component is amplified, the global brightness of the characteristic is effectively amplified by the enhancement module, and the enhancement effect of the enhancement module is improved; fusing the reflection image component and the enhanced illumination image component, and determining a processed low illumination image; the problems of weak sense of reality, serious noise and distortion, partial underexposure and overexposure of the enhanced image in the prior art are effectively solved, the accurate noise removal is further realized, the global brightness of the features is amplified, the enhancement effect is improved, and the output fusion image is smoother.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments of the present invention or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of steps of a low-light image enhancement method based on a multi-attention mechanism and Retinex according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating steps performed by the decomposing module to decompose a low-light image according to an embodiment of the present invention;
FIG. 3 is a low-light image of an input provided by an embodiment of the present invention;
fig. 4 is an image enhanced by RetinexNet in the prior art according to an embodiment of the present invention;
fig. 5 is a prior art BIMEF enhanced image provided by an embodiment of the present invention;
FIG. 6 is a LIME enhanced image of the prior art provided by an embodiment of the present invention;
fig. 7 is a processed low-light image output according to the method provided by the embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Traditional image processing methods are simple to operate and efficient, but the generated images are prone to artifacts and look unrealistic. The parameters of Retinex-based enhancement methods must be set manually, so they cannot adaptively process diverse images, and partial under-exposure and over-exposure occur. The RetinexNet method suffers from severe noise, low smoothness, and serious image distortion. The deep-learning-based NPE model maintains the naturalness of illumination while enhancing, but does not account for the relationship of illumination across different scenes. The BIMEF model provides a multi-exposure fusion framework for enhancing low-light images and adopts a dual-exposure fusion algorithm to deliver accurate contrast and illumination intensity, but the brightness of its enhancement results is low. The LIME model, based on Retinex, takes the maximum value in each pixel channel of the input image to estimate the illumination map, refines the illumination map with structured prior knowledge, and takes the resulting reflection map as the enhancement result; this approach easily leads to over-enhancement. GLADNet proposes a global-awareness and detail-preserving network that estimates the illumination map from the low-light image by global illumination estimation and then reconstructs details through a detail module, thereby enhancing the low-light image, but it lacks a denoising module and does not consider the influence of noise on the image.
The embodiment of the invention provides a low-light image enhancement method based on a multi-attention mechanism and Retinex, which comprises the following steps S101 to S103 as shown in FIG. 1.
S101, inputting the low-light image into a trained decomposition module, and determining an illumination map component and a reflection map component, wherein the decomposition module comprises a convolutional block attention module (Convolutional Block Attention Module, CBAM), whose first channel attention module attends to different feature weights to reduce irrelevant feature responses and activate useful feature responses, and whose first spatial attention module reduces irrelevant spatial information. The invention improves the decomposition network by adding the CBAM: the CBAM computes the attention weights for the feature maps of the decomposition module, the channel attention module attends to the feature weights of different channels to reduce irrelevant feature responses and activate useful ones, and the spatial attention module reduces irrelevant spatial information, improving the sensitivity of the decomposition module to noise so that noise of different degrees is removed accurately.
In one possible implementation, the first channel attention module of the CBAM reduces irrelevant feature responses and activates useful feature responses according to focusing on different feature weights, and reduces irrelevant spatial information through the first spatial attention module of the CBAM, specifically including the following steps S201 to S205 as shown in fig. 2.
S201, perform feature extraction on the low-light image by using convolution and the activation function ReLU, determining an intermediate feature map F ∈ R^{C×H×W}.
S202, the first channel attention module and the first spatial attention module inject attention into the intermediate feature map, which is then multiplied with the intermediate feature map to determine a refined feature map. CBAM is a simple and effective attention module for convolutional neural networks. Given any intermediate feature map in a convolutional neural network, CBAM injects attention maps along the two independent dimensions of channel and space, multiplies the attention with the input feature map, and performs adaptive feature refinement on it. Because CBAM is an end-to-end generic module, it can be seamlessly integrated into any CNN architecture and trained end to end together with the base CNN.
Determining the refined feature map specifically comprises the following steps (1) to (4). (1) Perform global max pooling and mean pooling on the intermediate feature map F ∈ R^{C×H×W} per channel to obtain two pooled one-dimensional vectors, input the two pooled vectors into a fully connected layer for addition, and determine the one-dimensional channel attention M_C ∈ R^{C×1×1}. (2) Multiply the one-dimensional channel attention M_C element-wise with the intermediate feature map F to determine the adjusted feature map F'. (3) Perform global max pooling and mean pooling on the adjusted feature map F' spatially to obtain two pooled two-dimensional vectors, and perform splicing and convolution operations on them to determine the two-dimensional spatial attention M_s ∈ R^{1×H×W}. (4) Multiply the two-dimensional spatial attention element-wise with the adjusted feature map F' to determine the refined feature map.
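A minimal PyTorch sketch of steps (1) to (4) might read as follows (the batched tensor layout and the reduction ratio r=16 of the shared MLP are assumptions, not taken from the text):

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional block attention module: channel attention followed by
    spatial attention, mirroring steps (1)-(4) above."""
    def __init__(self, channels, r=16):
        super().__init__()
        # Shared MLP (1x1 convolutions) for the channel attention branch.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // r, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // r, channels, 1, bias=False),
        )
        # 7x7 convolution for the spatial attention branch.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, f):  # f: (B, C, H, W) intermediate feature map F
        # (1) Channel-wise global max/mean pooling, shared MLP, addition, sigmoid.
        m_c = torch.sigmoid(self.mlp(f.amax(dim=(2, 3), keepdim=True))
                            + self.mlp(f.mean(dim=(2, 3), keepdim=True)))  # M_C
        f_adj = m_c * f  # (2) adjusted feature map F'
        # (3) Spatial max/mean pooling over channels, concatenation, 7x7 conv, sigmoid.
        m_s = torch.sigmoid(self.spatial(torch.cat(
            [f_adj.amax(dim=1, keepdim=True), f_adj.mean(dim=1, keepdim=True)], dim=1)))
        return m_s * f_adj  # (4) refined feature map
```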
S203, process the refined feature map by using a multi-layer convolution operation, and activate the processing result by using the activation function ReLU, determining a 64-channel multi-layer convolution feature map F''.
S204, perform channel splicing on the intermediate feature map F and the multi-layer convolution feature map F'', determining a 128-channel spliced feature map.
S205, perform normalization processing on the spliced feature map, determining the 3-channel reflection map component and the 1-channel illumination map component.
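Putting S201 through S205 together, a minimal sketch of the decomposition module might look like the following (the number of body layers, the 3×3 kernels, and the sigmoid used for the normalization step are assumptions where the text is silent; CBAM is the sketch above):

```python
class DecomNet(nn.Module):
    """Decomposition module sketch following S201-S205."""
    def __init__(self, channels=64):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1),
                                  nn.ReLU(inplace=True))              # S201
        self.cbam = CBAM(channels)                                    # S202
        self.body = nn.Sequential(*[nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True))
            for _ in range(4)])                                       # S203
        self.tail = nn.Conv2d(2 * channels, 4, 3, padding=1)          # S204-S205

    def forward(self, s):  # s: (B, 3, H, W) low-light image
        f = self.head(s)                      # intermediate feature map F
        f_conv = self.body(self.cbam(f))      # refined map, then multi-layer conv F''
        cat = torch.cat([f, f_conv], dim=1)   # S204: 128-channel splicing
        out = torch.sigmoid(self.tail(cat))   # S205: normalization to [0, 1]
        return out[:, :3], out[:, 3:]         # R (3 channels), I (1 channel)
```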
The training method of the decomposition module provided by the invention comprises the following steps: and training the decomposition module according to the decomposition loss function.
The decomposition loss function is expressed as:

$$L_{decom} = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{lap} L_{lap} + \lambda_{ssim} L_{ssim}$$

wherein $\lambda_{ir}$ represents the reflection component consistency loss coefficient, $\lambda_{lap}$ represents the Laplace loss coefficient, and $\lambda_{ssim}$ represents the structural similarity loss coefficient;

$L_{recon}$ represents the reconstruction loss function, which is expressed as:

$$L_{recon} = \sum_{i \in \{low, high\}} \sum_{j \in \{low, high\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$

$L_{ir}$ represents the reflection component consistency loss, which is expressed as:

$$L_{ir} = \left\| R_{low} - R_{high} \right\|_1$$

$L_{ssim}$ represents the structural similarity loss, which is expressed as:

$$L_{ssim} = 1 - SSIM(R_{low}, R_{high})$$

$L_{lap}$ represents the Laplace loss, which is expressed as:

$$L_{lap} = \sum_{j} \left\| L_j(R_{high}) - L_j(R_{low}) \right\|_1$$

wherein $S$ represents the low-light image, $R$ represents the reflection map component, $I$ represents the illumination map component, the subscripts $low$ and $high$ denote low light and high light, $\lambda_{ij}$ represents the reconstruction loss coefficients, $R_{high}$ represents the high-light reflection map component, $R_{low}$ represents the low-light reflection map component, $j$ indexes the $j$-th layer of the Laplacian pyramid, $L_j(R_{high})$ represents the Laplacian pyramid of the high-light reflection map component, and $L_j(R_{low})$ represents the Laplacian pyramid of the low-light reflection map component. The Laplace loss can capture global and local feature information and estimate the color difference between the high-light reflection component $R_{high}$ and the low-light reflection component $R_{low}$.
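Under these definitions, a sketch of the decomposition loss could look like the following (the pyramid depth, the λ_ij values, and the use of the third-party pytorch-msssim package for the SSIM term are assumptions):

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim  # third-party: pip install pytorch-msssim

def laplacian_pyramid(x, levels=4):
    """Laplacian pyramid built by average-pool downsampling; the depth and
    pooling choice are assumptions for illustration."""
    pyr = []
    for _ in range(levels - 1):
        down = F.avg_pool2d(x, 2)
        up = F.interpolate(down, size=x.shape[-2:], mode='bilinear',
                           align_corners=False)
        pyr.append(x - up)  # band-pass detail layer
        x = down
    pyr.append(x)           # low-frequency residual
    return pyr

def decom_loss(r_low, i_low, r_high, i_high, s_low, s_high,
               lam_ir=0.01, lam_lap=0.1, lam_ssim=0.1):
    """Decomposition loss L_decom; all coefficient values are assumptions."""
    r = {'low': r_low, 'high': r_high}
    i = {'low': i_low, 'high': i_high}
    s = {'low': s_low, 'high': s_high}
    lam_ij = {('low', 'low'): 1.0, ('high', 'high'): 1.0,
              ('low', 'high'): 0.001, ('high', 'low'): 0.001}  # assumed λ_ij
    l_recon = sum(w * (r[a] * i[b].expand_as(r[a]) - s[b]).abs().mean()
                  for (a, b), w in lam_ij.items())
    l_ir = (r_low - r_high).abs().mean()                 # reflectance consistency
    l_ssim = 1.0 - ssim(r_low, r_high, data_range=1.0)   # structural similarity
    l_lap = sum((lh - ll).abs().mean() for lh, ll in
                zip(laplacian_pyramid(r_high), laplacian_pyramid(r_low)))
    return l_recon + lam_ir * l_ir + lam_lap * l_lap + lam_ssim * l_ssim
```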
S102, inputting the illumination map component into a trained enhancement module, and determining an enhanced illumination map component; wherein the enhancement module comprises a global attention module (GAM), whose second channel attention module screens the feature information and retains the effective feature information, and whose second spatial attention module fuses the feature information and amplifies the global brightness of the illumination map component. The invention improves the network by adding the GAM attention mechanism, which focuses on global brightness information and thus enhances the illumination map component better.
In S102, the second channel attention module of the global attention module screens the feature information and retains the effective feature information, and the second spatial attention module of the global attention module fuses the feature information and amplifies the global brightness of the illumination map component, which comprises the following steps (1) to (5).
(1) Perform a convolution operation on the illumination map component to generate a feature map, with a convolution kernel size of 3×3; taking the illumination map component obtained from the decomposition module in step S205 as input, nonlinearly activate the generated feature map by using the activation function ReLU, and determine a feature map F_1.
(2) Process the feature map F_1 by using the global attention module and the activation function Sigmoid to determine a feature map M(F_2), which specifically comprises the following steps (a) to (d).
(a) Perform dimension conversion on the feature map F_1 with the global attention module to determine a transformed-dimension feature map F_1'.
(b) Process the feature map F_1' with a multi-layer perceptron to determine a feature map F_1'' with the same dimensions as the feature map F_1, and activate the feature map F_1'' with the activation function Sigmoid to determine a feature map F_2.
(c) Perform a convolution operation on the feature map F_2 with a convolution kernel of 7 to reduce the number of channels of the feature map F_2, determining a feature map F_2'.
(d) Perform another convolution operation on the feature map F_2' with a convolution kernel of 7 to restore the number of channels, determining the feature map M(F_2), which has the same number of channels as the feature map F_1.
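A possible PyTorch rendering of steps (a) to (d) is sketched below (the reduction ratio r=4 and the final reweighting of F_2 by M(F_2) follow the published GAM formulation and are assumptions here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GAM(nn.Module):
    """Global attention module sketch: channel attention by dimension
    permutation plus an MLP, then spatial attention by two 7x7 convolutions
    that squeeze and restore the channel count."""
    def __init__(self, channels, r=4):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(channels, channels // r),
                                 nn.ReLU(inplace=True),
                                 nn.Linear(channels // r, channels))
        self.conv_down = nn.Conv2d(channels, channels // r, 7, padding=3)
        self.conv_up = nn.Conv2d(channels // r, channels, 7, padding=3)

    def forward(self, f1):  # f1: (B, C, H, W) feature map F_1
        # (a)-(b): permute to (B, H, W, C) = F_1', MLP -> F_1'', sigmoid, reweight.
        att_c = torch.sigmoid(self.mlp(f1.permute(0, 2, 3, 1)).permute(0, 3, 1, 2))
        f2 = att_c * f1                                               # F_2
        # (c)-(d): 7x7 conv reduces channels (F_2'), 7x7 conv restores them.
        m = torch.sigmoid(self.conv_up(F.relu(self.conv_down(f2))))  # M(F_2)
        return m * f2
```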
(3) Process the feature map M(F_2) by multiple uses of convolution, the activation function ReLU, and BN layers in cascade form to determine a feature map F_3, specifically comprising the following steps (a) to (c): (a) perform a convolution operation on the feature map M(F_2), with 64 convolution kernels of size 3×3, and add a BN layer so that the feature maps after the convolution operation have similar distributions; (b) activate the feature map M(F_2) by using the activation function ReLU; (c) cycle the above steps a plurality of times to determine the feature map F_3.
(4) Process the feature map F_3 by using convolution, with a kernel size of 3×3, and the activation function Sigmoid to determine a feature map F_4.
(5) Process the feature map F_4 by using convolution, then reconstruct the feature map F_4 by using a convolution operation with a kernel size of 3×3, determining the enhanced illumination map component.
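Steps (1) to (5) could then be assembled as in the following sketch (the cascade depth of 5 and the single-channel illumination input are assumptions; GAM is the sketch above):

```python
class EnhanceNet(nn.Module):
    """Enhancement module sketch for steps (1)-(5): 3x3 conv + ReLU, GAM,
    a DnCNN-style cascade of conv + BN + ReLU blocks, a 3x3 conv + sigmoid,
    and a final 3x3 reconstruction conv."""
    def __init__(self, channels=64, depth=5):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1),
                                  nn.ReLU(inplace=True))              # step (1)
        self.gam = GAM(channels)                                      # step (2)
        self.body = nn.Sequential(*[nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
            for _ in range(depth)])                                   # step (3)
        self.attn = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                  nn.Sigmoid())                       # step (4)
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)              # step (5)

    def forward(self, i_low):  # i_low: (B, 1, H, W) illumination map component
        f1 = self.head(i_low)         # F_1
        f3 = self.body(self.gam(f1))  # M(F_2) -> F_3
        f4 = self.attn(f3)            # F_4
        return self.tail(f4)          # enhanced illumination map component Î
```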
In S102, the training method of the enhancement module comprises: training the enhancement module according to an enhancement loss function, which is expressed as:

$$L_{enhance} = L_{recon} + \lambda_{lap} L_{lap}$$

wherein $\lambda_{lap}$ represents the Laplace loss coefficient;

$L_{recon}$ represents the reconstruction loss function, which is expressed as:

$$L_{recon} = \lambda \left\| R_{low} \circ \hat{I} - S_{high} \right\|_1$$

wherein $\lambda$ represents the reconstruction loss coefficient, $R_{low}$ represents the low-light reflection map component, $\hat{I}$ represents the enhanced illumination map component, and $S_{high}$ represents the high-light input image;

$L_{lap}$ represents the Laplace loss, which is expressed as:

$$L_{lap} = \sum_{i} \left\| L_i(\hat{S}) - L_i(S_{low}) \right\|_1$$

wherein $\hat{S} = R_{low} \circ \hat{I}$ denotes the enhanced image, $L_i(\hat{S})$ represents the Laplacian pyramid of the enhanced image, and $L_i(S_{low})$ represents the Laplacian pyramid of the low-light image.
In the enhancement module, a Laplace loss function is introduced to constrain the whole network, so that the output result is smoother.
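Under these definitions, the enhancement loss might be sketched as follows (the coefficient values are assumptions, and laplacian_pyramid() is the helper from the decomposition-loss sketch):

```python
def enhance_loss(r_low, i_hat, s_low, s_high, lam=1.0, lam_lap=0.1):
    """Enhancement loss L_enhance; coefficient values are assumptions."""
    s_hat = r_low * i_hat.expand_as(r_low)           # enhanced image Ŝ = R_low ∘ Î
    l_recon = lam * (s_hat - s_high).abs().mean()    # reconstruction against S_high
    l_lap = sum((a - b).abs().mean() for a, b in
                zip(laplacian_pyramid(s_hat), laplacian_pyramid(s_low)))
    return l_recon + lam_lap * l_lap
```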
S103, multiply the reflection map component and the enhanced illumination map component element by element for fusion, and determine the processed low-light image.
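End to end, the three stages chain together as in this hypothetical usage (decom_net and enhance_net are instances of the sketches above):

```python
import torch

s_low = torch.rand(1, 3, 256, 256)           # placeholder low-light input in [0, 1]
decom_net, enhance_net = DecomNet(), EnhanceNet()
r_low, i_low = decom_net(s_low)              # S101: decomposition
i_hat = enhance_net(i_low)                   # S102: illumination enhancement
s_enhanced = r_low * i_hat.expand_as(r_low)  # S103: element-wise fusion
```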
Compared with the existing low illumination enhancement method based on Retinex: according to the invention, a CBAM attention mechanism is added into the decomposition module, so that the denoising capability of the decomposition module is improved, and the noise of the reflection map component and the illumination map component generated by the decomposition module is smaller.
The invention uses a denoising convolutional neural network (DnCNN) structure in the enhancement module and adds a GAM attention mechanism: the DnCNN structure strengthens the denoising capability of the enhancement module, while the GAM attends to the global brightness of the network and improves its enhancement effect.
The invention uses peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as evaluation indexes. Compared against 11 mainstream methods (RetinexNet, BIMEF, LIME, MF, Dong, NPE, SRIE, CRM, MSR, RRM, and GLAD), the results are superior to all of them, as shown in Table 1.
Table 1 quantitative test comparison on LOL dataset
In terms of enhancement effect, this embodiment visually compares its results on the LOL dataset with the enhancement results of other methods; the results of the method are better in detail, structure, and color, and the visual effect is more natural. In an embodiment of the present invention, an input low-light image is shown in FIG. 3, a RetinexNet-enhanced image is shown in FIG. 4, a BIMEF-enhanced image is shown in FIG. 5, a LIME-enhanced image is shown in FIG. 6, and the image finally output by the method provided by the present invention is shown in FIG. 7.
In this specification, each embodiment is described in a progressive manner, and the same or similar parts of each embodiment are referred to each other, and each embodiment is mainly described as a difference from other embodiments. All or portions of the present invention are operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, mobile communication terminals, multiprocessor systems, microprocessor-based systems, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the present invention; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (10)

1. A multi-attention mechanism and Retinex based low-light image enhancement method, comprising:
inputting the low-light image into a trained decomposition module, and determining an illumination map component and a reflection map component, wherein the decomposition module comprises: a CBAM, a first channel attention module of the CBAM reducing extraneous feature responses and activating useful feature responses according to different feature weights of interest, and reducing extraneous spatial information by a first spatial attention module of the CBAM;
inputting the illumination map component into a trained enhancement module, and determining an enhanced illumination map component; wherein the enhancement module comprises: a global attention module, the second channel attention module of the global attention module screening the feature information and retaining the effective feature information, and the second spatial attention module of the global attention module fusing the feature information and amplifying the global brightness of the illumination map component;
and fusing the reflection map component and the enhanced illumination map component, and determining a processed low illumination image.
2. The method of claim 1, wherein the first channel attention module of the CBAM reduces extraneous feature responses and activates useful feature responses according to different feature weights of interest, and reduces extraneous spatial information by the first spatial attention module of the CBAM, comprising:
performing feature extraction on the low-light image by using convolution and the activation function ReLU to determine an intermediate feature map;
the first channel attention module and the first spatial attention module inject attention into the intermediate feature map, and the attention is then multiplied with the intermediate feature map to determine a refined feature map;
processing the refined feature map by using a multi-layer convolution operation, and activating the processing result by using the activation function ReLU to determine a multi-layer convolution feature map;
performing channel splicing on the intermediate feature map and the multi-layer convolution feature map to determine a spliced feature map;
and carrying out normalization processing on the spliced feature map, and determining the 3-channel reflection map component and the 1-channel illumination map component.
3. The method of claim 2, wherein the determining a refined feature map comprises:
carrying out global max pooling and mean pooling on the intermediate feature map per channel to obtain two pooled one-dimensional vectors, inputting the two pooled one-dimensional vectors into a fully connected layer for addition, and determining the one-dimensional channel attention;
multiplying the one-dimensional channel attention element-wise with the intermediate feature map to determine an adjusted feature map;
carrying out global max pooling and mean pooling on the adjusted feature map spatially to obtain two pooled two-dimensional vectors, and carrying out splicing and convolution operations on the two pooled two-dimensional vectors to determine the two-dimensional spatial attention;
and multiplying the two-dimensional spatial attention element-wise with the adjusted feature map to determine the refined feature map.
4. The method of claim 1, wherein the training method of the decomposition module comprises: training the decomposition module according to the decomposition loss function;
the decomposition loss function is expressed as:

$$L_{decom} = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{lap} L_{lap} + \lambda_{ssim} L_{ssim}$$

wherein $\lambda_{ir}$ represents the reflection component consistency loss coefficient, $\lambda_{lap}$ represents the Laplace loss coefficient, and $\lambda_{ssim}$ represents the structural similarity loss coefficient;

$L_{recon}$ represents the reconstruction loss function, which is expressed as:

$$L_{recon} = \sum_{i \in \{low, high\}} \sum_{j \in \{low, high\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$

$L_{ir}$ represents the reflection component consistency loss, which is expressed as:

$$L_{ir} = \left\| R_{low} - R_{high} \right\|_1$$

$L_{ssim}$ represents the structural similarity loss, which is expressed as:

$$L_{ssim} = 1 - SSIM(R_{low}, R_{high})$$

$L_{lap}$ represents the Laplace loss, which is expressed as:

$$L_{lap} = \sum_{j} \left\| L_j(R_{high}) - L_j(R_{low}) \right\|_1$$

wherein $S$ represents the low-light image, $R$ represents the reflection map component, $I$ represents the illumination map component, the subscripts $low$ and $high$ denote low light and high light, $\lambda_{ij}$ represents the reconstruction loss coefficients, $R_{high}$ represents the high-light reflection map component, $R_{low}$ represents the low-light reflection map component, $j$ indexes the $j$-th layer of the Laplacian pyramid, $L_j(R_{high})$ represents the Laplacian pyramid of the high-light reflection map component, and $L_j(R_{low})$ represents the Laplacian pyramid of the low-light reflection map component.
5. The method of claim 1, wherein the second channel attention module of the global attention module filters feature information and retains valid feature information, the second spatial attention module of the global attention module performs feature information fusion, and the amplifying the global brightness of the illumination map component comprises:
performing a convolution operation on the illumination map component to generate a feature map, with a convolution kernel size of 3×3, performing nonlinear activation on the generated feature map by using the activation function ReLU, and determining a feature map F_1;
processing the feature map F_1 by using the global attention module and the activation function Sigmoid to determine a feature map M(F_2);
processing the feature map M(F_2) by multiple uses of convolution, the activation function ReLU, and BN layers in cascade form to determine a feature map F_3;
processing the feature map F_3 by using convolution and the activation function Sigmoid to determine a feature map F_4;
and processing the feature map F_4 by using convolution and then reconstructing the feature map F_4 to determine the enhanced illumination map component.
6. The method of claim 5, wherein the processing of the feature map F_1 by using the global attention module and the activation function Sigmoid to determine the feature map M(F_2) comprises:
performing dimension conversion on the feature map F_1 by using the global attention module to determine a transformed-dimension feature map F_1';
processing the feature map F_1' by using a multi-layer perceptron to determine a feature map F_1'' with the same dimensions as the feature map F_1, and activating the feature map F_1'' by using the activation function Sigmoid to determine a feature map F_2;
performing a convolution operation on the feature map F_2 with a convolution kernel of 7 to reduce the number of channels of the feature map F_2, determining a feature map F_2';
and performing another convolution operation on the feature map F_2' with a convolution kernel of 7 to restore the number of channels, determining the feature map M(F_2), wherein the feature map M(F_2) has the same number of channels as the feature map F_1.
7. The method of claim 5, wherein the processing of the feature map M(F_2) by multiple uses of convolution, the activation function ReLU, and BN layers in cascade form to determine the feature map F_3 comprises:
performing a convolution operation on the feature map M(F_2), and adding a BN layer so that the feature maps after the convolution operation have similar distributions;
activating the feature map M(F_2) by using the activation function ReLU;
and cycling the above steps a plurality of times to determine the feature map F_3.
8. The method of claim 5, wherein the reconstructing of the feature map F_4 by using convolution comprises:
reconstructing the feature map F_4 by using a 3×3 convolution kernel.
9. The method of claim 1, wherein the training method of the enhancement module comprises: training the enhancement module according to an enhancement loss function;
the enhancement loss function is expressed as:

$$L_{enhance} = L_{recon} + \lambda_{lap} L_{lap}$$

wherein $\lambda_{lap}$ represents the Laplace loss coefficient;

$L_{recon}$ represents the reconstruction loss function, which is expressed as:

$$L_{recon} = \lambda \left\| R_{low} \circ \hat{I} - S_{high} \right\|_1$$

wherein $\lambda$ represents the reconstruction loss coefficient, $R_{low}$ represents the low-light reflection map component, $\hat{I}$ represents the enhanced illumination map component, and $S_{high}$ represents the high-light input image;

$L_{lap}$ represents the Laplace loss, which is expressed as:

$$L_{lap} = \sum_{i} \left\| L_i(\hat{S}) - L_i(S_{low}) \right\|_1$$

wherein $\hat{S} = R_{low} \circ \hat{I}$ denotes the enhanced image, $L_i(\hat{S})$ represents the Laplacian pyramid of the enhanced image, and $L_i(S_{low})$ represents the Laplacian pyramid of the low-light image.
10. The method of claim 1, wherein the fusing the reflectogram component with the enhanced illumination map component comprises:
and multiplying the reflection map component and the enhanced illumination map component element by element, and outputting the processed low illumination image.
CN202310573810.5A 2023-05-19 2023-05-19 Low-light image enhancement method based on multi-attention mechanism and Retinex Pending CN116645305A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310573810.5A CN116645305A (en) 2023-05-19 2023-05-19 Low-light image enhancement method based on multi-attention mechanism and Retinex

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310573810.5A CN116645305A (en) 2023-05-19 2023-05-19 Low-light image enhancement method based on multi-attention mechanism and Retinex

Publications (1)

Publication Number Publication Date
CN116645305A (en) 2023-08-25

Family

ID=87642842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310573810.5A Pending CN116645305A (en) 2023-05-19 2023-05-19 Low-light image enhancement method based on multi-attention mechanism and Retinex

Country Status (1)

Country Link
CN (1) CN116645305A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117372307A (en) * 2023-12-01 2024-01-09 南京航空航天大学 Multi-unmanned aerial vehicle collaborative detection distributed image enhancement method
CN117372307B (en) * 2023-12-01 2024-02-23 南京航空航天大学 Multi-unmanned aerial vehicle collaborative detection distributed image enhancement method

Similar Documents

Publication Publication Date Title
Ren et al. LECARM: Low-light image enhancement using the camera response model
Marnerides et al. Expandnet: A deep convolutional neural network for high dynamic range expansion from low dynamic range content
Ancuti et al. Single-scale fusion: An effective approach to merging images
CN110675336A (en) Low-illumination image enhancement method and device
Rahman et al. Structure revealing of low-light images using wavelet transform based on fractional-order denoising and multiscale decomposition
Wang et al. Low-light image enhancement based on virtual exposure
JP2019219928A (en) Image processing device, image processing method, and image processing program
Liu et al. Underexposed image correction via hybrid priors navigated deep propagation
Robidoux et al. End-to-end high dynamic range camera pipeline optimization
Lyu et al. An efficient learning-based method for underwater image enhancement
CN116645305A (en) Low-light image enhancement method based on multi-attention mechanism and Retinex
Tan et al. A real-time video denoising algorithm with FPGA implementation for Poisson–Gaussian noise
CN115393231A (en) Defect image generation method and device, electronic equipment and storage medium
Panetta et al. Deep perceptual image enhancement network for exposure restoration
Li et al. A degradation model for simultaneous brightness and sharpness enhancement of low-light image
Ou et al. Real-time tone mapping: A state of the art report
Chen et al. Retinex low-light image enhancement network based on attention mechanism
Wang et al. Learning a self‐supervised tone mapping operator via feature contrast masking loss
Huang et al. Underwater image enhancement based on color restoration and dual image wavelet fusion
Jia et al. An extended variational image decomposition model for color image enhancement
Zhang et al. A cross-scale framework for low-light image enhancement using spatial–spectral information
Tao et al. An effective and robust underwater image enhancement method based on color correction and artificial multi-exposure fusion
Soma et al. An efficient and contrast-enhanced video de-hazing based on transmission estimation using HSL color model
Wu et al. Contrast enhancement based on reflectance-oriented probabilistic equalization
Cui et al. Attention-guided multi-scale feature fusion network for low-light image enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination