CN115082341A - Low-light image enhancement method based on event camera - Google Patents

Low-light image enhancement method based on event camera

Info

Publication number
CN115082341A
Authority
CN
China
Prior art keywords
image
gradient
event
branch
representing
Prior art date
Legal status
Pending
Application number
CN202210723127.0A
Other languages
Chinese (zh)
Inventor
金海燕
王乔斌
苏浩楠
肖照林
蔡磊
王彬
刘瑾
Current Assignee
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date
Filing date
Publication date
Application filed by Xian University of Technology filed Critical Xian University of Technology
Priority to CN202210723127.0A
Publication of CN115082341A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • G06T5/94Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/44Event detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a low-light image enhancement method based on an event camera. A data set containing normal-illumination images and an event stream is first selected, with the images and the event stream required to be spatially matched; the normal-illumination images are used to generate gradient images and noisy low-light images, and the event stream is preprocessed into event pseudo images with good edge information. A gradient image is then reconstructed and the low-light image is enhanced: a feature fusion module is designed, a conditional discriminator is added, the neural network built from the gradient branch and the low-light enhancement branch is trained, and the network model is saved; finally the model is tested and the enhanced image is output. The method guides low-light image enhancement in the image domain through the gradient image reconstructed from events, and can obtain a normal-light image with rich edge information.

Description

Low-light image enhancement method based on event camera
Technical Field
The invention belongs to the technical field of computer digital image processing, and particularly relates to a low-light image enhancement method based on an event camera.
Background
With the advent of the information age, photography has become an indispensable part of daily life; digital image processing has therefore developed rapidly, and the demands placed on it have steadily increased.
Under weak light there is still a demand for photographs of normal-illumination quality, and night-scene performance has become an important criterion when choosing a mobile phone. In a picture taken in weak light, a short exposure generally produces heavy noise, while prolonging the exposure time easily causes motion blur. Hardware solutions raise the cost of the imaging system; meanwhile, with the rapid development of deep learning, software-based enhancement can achieve good results.
Early low-light image enhancement mostly relied on traditional methods such as histogram equalization or gamma correction, which easily over-expose some regions while leaving others insufficiently enhanced, with inadequate overall contrast. With the marked increase in computing power, deep learning has developed rapidly and can obtain better experimental results than the traditional methods. Low-light enhancement can be performed from a single image, but under very weak light the noise severely destroys the structural information of the image and single-image enhancement cannot obtain good results. An event camera records changes in intensity; it offers high dynamic range, low latency and freedom from motion blur, so it can produce a good event stream even in weak light. From this stream an event pseudo image with a well-structured edge can be synthesized, supplementing dark-region structure that a conventional camera cannot capture, and finally a normal-light image with rich texture details is obtained.
Disclosure of Invention
The invention aims to provide a low-light image enhancement method based on an event camera, which guides the enhancement of a low-light image in an image domain through a gradient map reconstructed by an event and can obtain a normal light image with rich edge information.
The technical scheme adopted by the invention is that the low-light image enhancement method based on the event camera is implemented according to the following steps:
step 1, data set composition, selecting a data set with a normal illumination image and an event stream, requiring the normal illumination image and the event stream to be matched in space, then generating a gradient image and a noise-containing weak light image from the normal illumination image, and preprocessing the event stream to obtain an event pseudo image with good edge information;
step 2, reconstructing the event pseudo image obtained in the step 1 into a gradient image by adopting UNet gradient branches;
step 3, enhancing the low-light image obtained in the step 1 by adopting a UNet low-light image enhancement branch to obtain an enhanced image;
step 4, designing a feature fusion module, namely a channel and spatial attention module CBAM, to fuse the information contained in the gradient image of step 2 into the low-light enhancement branch of step 3;
step 5, adding a condition discriminator, wherein the condition is an event pseudo image and a gradient image, and generating a more real enhanced image;
step 6, training 300 epochs on the neural network built by the gradient branch in the step 2 and the weak light image enhancement branch in the step 3, verifying the training result and storing the model of the neural network;
and 7, testing the neural network model stored in the step 6, and outputting the enhanced image.
The present invention is also characterized in that,
the step 1 is as follows:
Step 1.1, select a data set containing normal-illumination images and an event stream, where the normal-illumination images and the event stream are required to be paired in space. An event is represented as e_j = (x_j, y_j, t_j, p_j), where x_j and y_j are the coordinates of the pixel, t_j is the timestamp, p_j is the polarity of the event, and j indexes the events. Assume the event stream spans a duration Δt, during which the event camera also returns n gray frames; the event stream is divided into n corresponding intervals, and the pixel values of the synthesized pseudo image are obtained by adding the polarity values of the events that fall at each pixel within an interval, which yields the event pseudo image;
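A minimal sketch of this accumulation, assuming the event stream is already unpacked into NumPy arrays xs, ys, ps of pixel coordinates and polarities (hypothetical variable names), could look as follows:

```python
import numpy as np

def events_to_pseudo_image(xs, ys, ps, height, width):
    """Sum event polarities per pixel to form an event pseudo image."""
    img = np.zeros((height, width), dtype=np.float32)
    # np.add.at accumulates repeated indices, so several events landing on the
    # same pixel all contribute their polarity values
    np.add.at(img, (ys, xs), ps)
    return img
```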
Step 1.2, for the reference image GT, add noise using Gaussian blind noise, i.e. the Gaussian noise is generated with a standard deviation drawn from a range rather than a fixed standard deviation; then simulate the low-light scene with gamma correction, as shown in formula (1),
V_out = (V_in)^gamma  (1)
where V_out is the corrected image, V_in is the image before correction, and gamma controls the scaling of the pixel values;
The data are then linearly normalized, as shown in formula (2),
X_norm = (X - X_min) / (X_max - X_min)  (2)
where X_norm is the normalized image, X is the pixel value at coordinates (x, y), X_min is the minimum pixel value of the image, and X_max is the maximum pixel value of the image; this yields the simulated noisy low-light image;
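As an illustration of step 1.2, a sketch of the low-light simulation is given below; the standard-deviation range and the gamma value are illustrative assumptions, not values fixed by the method:

```python
import numpy as np

def simulate_low_light(gt, sigma_range=(5.0, 50.0), gamma=3.0, rng=None):
    """Add Gaussian blind noise, apply gamma correction, then min-max normalize."""
    rng = np.random.default_rng() if rng is None else rng
    img = gt.astype(np.float32) / 255.0
    # blind noise: the standard deviation is drawn from a range rather than fixed
    sigma = rng.uniform(*sigma_range) / 255.0
    noisy = img + rng.normal(0.0, sigma, img.shape)
    # gamma correction (formula (1)): V_out = V_in ** gamma darkens the image
    dark = np.clip(noisy, 0.0, 1.0) ** gamma
    # linear normalization (formula (2))
    return (dark - dark.min()) / (dark.max() - dark.min() + 1e-8)
```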
Step 1.3, perform edge extraction on the normal-illumination image obtained in step 1.1 with the Sobel operator; the calculation for an image A is shown in formulas (3), (4) and (5):
G_x = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * A  (3)
G_y = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * A  (4)
G = sqrt(G_x² + G_y²)  (5)
where G_x is the first-order difference in the horizontal direction, G_y is the first-order difference in the vertical direction, * denotes the convolution operation, and G is the resulting gradient image.
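Step 1.3 corresponds to a standard Sobel filter; a sketch using SciPy's 2-D convolution (an equivalent library call such as cv2.Sobel would also work) is:

```python
import numpy as np
from scipy.signal import convolve2d

def sobel_gradient(image):
    """Compute the gradient magnitude map G from formulas (3)-(5)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float32)
    gx = convolve2d(image, kx, mode="same", boundary="symm")
    gy = convolve2d(image, ky, mode="same", boundary="symm")
    return np.sqrt(gx ** 2 + gy ** 2)
```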
The step 2 is as follows:
Step 2.1, first use a Dataset class of the deep-learning framework PyTorch and apply transform operations to the pictures in the class: convert the pictures to tensor format and then normalize them, as shown in formula (6):
O_c = (i_c - m_c) / S_c  (6)
where O_c is the output of the c-th channel, i_c is the input of the c-th channel, m_c is the mean of the c-th channel and S_c is the variance of the c-th channel; the data are then packed through a DataLoader;
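Step 2.1 is a conventional PyTorch data pipeline; the sketch below assumes the samples are already loaded as tuples of NumPy images, and the 0.5 mean/scale values are placeholders rather than the values used by the method:

```python
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class EventLowLightDataset(Dataset):
    """Hypothetical dataset returning (event pseudo image, low-light image, gradient GT, normal-light GT)."""
    def __init__(self, samples):
        self.samples = samples              # list of tuples of numpy arrays
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        tensors = [self.to_tensor(img) for img in self.samples[idx]]
        # per-channel normalization as in formula (6): (input - mean) / scale
        return tuple((t - 0.5) / 0.5 for t in tensors)

# pack the data through a DataLoader, e.g.
# loader = DataLoader(EventLowLightDataset(samples), batch_size=4, shuffle=True)
```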
Step 2.2, for the packed data, first perform feature extraction to obtain feature maps. UNet is selected as the backbone of the gradient branch, with the max-pooling operation replaced by a convolution with stride 2, followed by a batchnorm layer to adjust the data range and a ReLU activation; downsampling is performed seven times. Upsampling is then performed by deconvolution with stride 2, again followed by batchnorm and ReLU, and repeated seven times. Finally the upsampled feature map is used to reconstruct the gradient image. Padding is applied to preserve the feature-map size, and the information of the low-light image obtained in step 1.2 is introduced into the gradient branch. The convolution and deconvolution output sizes are given by formulas (7), (8) and (9), (10):
H_out = (H_in - k + 2×p) / s + 1  (7)
W_out = (W_in - k + 2×p) / s + 1  (8)
H_out = (H_in - 1)×2 - 2×p + k  (9)
W_out = (W_in - 1)×2 - 2×p + k  (10)
where H_out is the height of the output image, W_out is the width of the output image, H_in is the height of the input image, W_in is the width of the input image, p is the padding size, k is the convolution kernel size, and s is the stride;
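The size formulas (7)-(10) and the stride-2 convolution blocks of step 2.2 can be illustrated as follows; the kernel size of 4 and the channel arguments are assumptions for the sketch:

```python
import torch.nn as nn

def conv_out_size(size_in, k, p, s):
    # formulas (7)/(8): output size of a strided convolution
    return (size_in - k + 2 * p) // s + 1

def deconv_out_size(size_in, k, p):
    # formulas (9)/(10): output size of a stride-2 transposed convolution
    return (size_in - 1) * 2 - 2 * p + k

def down_block(c_in, c_out):
    """Stride-2 convolution (replacing max pooling) + BatchNorm + ReLU."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

def up_block(c_in, c_out):
    """Stride-2 deconvolution for upsampling + BatchNorm + ReLU."""
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )
```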
Step 2.3, for the reconstructed gradient map output by the gradient branch and the gradient reference map, the L1 loss is used, as shown in formulas (11) and (12),
L = {l_1, l_2, ..., l_n}, l_n = |x_n - y_n|  (11)
l_(x,y) = mean(L)  (12)
where L collects the loss values of all pixel points, x and y are the pixel coordinates, l_1, l_2, ..., l_n are the per-pixel loss values, n is the number of pixels, and l_(x,y) is the loss value computed as their mean; the gradient-branch parameters are updated with this loss to obtain the updated gradient branch.
The step 3 is as follows:
Step 3.1, the packed data from step 2.1 are fed into the image enhancement branch, with UNet selected as its backbone. The downsampling operation is replaced by a convolution with stride 2 followed by batchnorm and ReLU, repeated seven times; upsampling is then completed by deconvolution with stride 2 followed by batchnorm and ReLU, also repeated seven times, giving a feature map; a final convolution and ReLU on this feature map produces the output image with 3 channels;
Step 3.2, compute the L1 loss between the output image of the image enhancement branch and the normal-light image; for judging which of the output image and the normal-light image is real, the MSE loss is adopted, as shown in formulas (13) and (14),
L = {l_1, l_2, ..., l_n}, l_n = (x_n - y_n)²  (13)
l_(x,y) = mean(L)  (14)
where L collects the loss values of all patches, l_1, l_2, ..., l_n are the values of the corresponding patches, l_(x,y) is the loss value, and x and y are the patch coordinates; the loss is computed as the mean and used to update the image enhancement branch parameters, giving the updated image enhancement branch.
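Steps 2.3 and 3.2 use standard L1 and MSE losses; a sketch of how they might be combined with the adversarial term is shown below, where the weight lambda_adv is an assumed balance parameter:

```python
import torch.nn as nn

l1_loss = nn.L1Loss()    # formulas (11)/(12): per-pixel absolute error, averaged
mse_loss = nn.MSELoss()  # formulas (13)/(14): used for the real/fake judgement

def generator_loss(pred_gradient, gt_gradient, pred_image, gt_image,
                   disc_fake, real_label, lambda_adv=0.01):
    """Reconstruction losses of both branches plus an adversarial term (weights assumed)."""
    loss_grad = l1_loss(pred_gradient, gt_gradient)   # gradient branch, step 2.3
    loss_img = l1_loss(pred_image, gt_image)          # enhancement branch, step 3.2
    loss_adv = mse_loss(disc_fake, real_label)        # fool the conditional discriminator
    return loss_grad + loss_img + lambda_adv * loss_adv
```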
The step 4 is as follows:
Step 4.1, construct the feature fusion block using the channel and spatial attention module CBAM, which operates as follows: first apply channel attention to the output feature maps of step 2 and step 3, performing max pooling and average pooling over the spatial dimensions, feeding the results into a shared multi-layer perceptron MLP, adding them, and passing them through the tanh function shown in formula (15),
tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))  (15)
where tanh(x) is the output value after activation, x is the input value, and exp is the natural exponential function.
Spatial attention is then applied: max pooling and average pooling are performed over each channel, the results are concatenated, and the output passes through a sigmoid function; the resulting feature map is sent to the image enhancement branch of step 3;
Step 4.2, first concatenate the feature maps from steps 2 and 3 and convolve them to obtain a new feature map; the new feature map is followed by a CBAM module, then batchnorm and ReLU operations, and a dropout operation that randomly masks half of the neurons of the feature fusion block of step 4.1; this is repeated twice. Nine fusion modules in total are used across the gradient branch and the low-light enhancement branch, giving nine feature fusion modules.
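A rough sketch of such a fusion attention block is given below; it follows the CBAM structure described above (spatial pooling, shared MLP and a tanh gate for channel attention, then per-channel pooling and a sigmoid gate for spatial attention), with the reduction ratio and kernel size as assumptions:

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """CBAM-style block used to fuse gradient-branch and enhancement-branch features."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(          # shared multi-layer perceptron
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        # channel attention: spatial max/average pooling, shared MLP, tanh gate
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.tanh(avg + mx)
        # spatial attention: per-channel max/average pooling, concatenation, sigmoid gate
        s = torch.cat([torch.amax(x, dim=1, keepdim=True),
                       torch.mean(x, dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```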
The step 5 is as follows:
A relativistic generative adversarial network (RGAN) is adopted, and a conditional discriminator is used so that added conditions control the discriminator output; the conditions are the event pseudo image from step 1, the reconstructed gradient image, and the low-light image. When computing the probability that an image is real, a PatchGAN discriminator judges the probability that each image patch is real or fake, using the sigmoid activation function shown in formula (16),
sigmoid(x) = 1 / (1 + exp(-x))  (16)
The discriminator first concatenates the feature maps and extracts features through convolution, then applies InstanceNorm and ReLU; this is repeated four times, and the probability that the output image of step 3 is a normal-illumination image is judged;
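A sketch of a conditional PatchGAN-style discriminator consistent with this description is shown below; the layer widths and kernel sizes are assumptions:

```python
import torch
import torch.nn as nn

class ConditionalPatchDiscriminator(nn.Module):
    """Judges per-patch realness of the enhanced image, conditioned on the
    event pseudo image, the reconstructed gradient image and the low-light image."""
    def __init__(self, in_channels=6, base=64):
        super().__init__()
        layers, c = [], in_channels
        for i in range(4):  # four conv + InstanceNorm + ReLU stages
            layers += [nn.Conv2d(c, base * 2 ** i, 4, stride=2, padding=1),
                       nn.InstanceNorm2d(base * 2 ** i),
                       nn.ReLU(inplace=True)]
            c = base * 2 ** i
        layers += [nn.Conv2d(c, 1, 4, padding=1), nn.Sigmoid()]  # patch-wise probability map
        self.net = nn.Sequential(*layers)

    def forward(self, image, conditions):
        # concatenate the image with its conditions along the channel dimension
        return self.net(torch.cat([image] + list(conditions), dim=1))
```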
the step 6 is as follows:
Step 6.1, for the neural network formed by steps 2 and 3, the ADAM optimizer is selected with an initial learning rate of 0.0002; the scheduler uses a multi-step decay strategy with decay steps at 25 and 100, halving the learning rate each time, and training runs for 300 epochs in total. During training the PSNR is observed, as shown in formulas (17) and (18),
MSE = (1 / (m·n)) · Σ_{i=0}^{m-1} Σ_{j=0}^{n-1} [I(i, j) - K(i, j)]²  (17)
PSNR = 10 · log_10(MAX_I² / MSE)  (18)
where MSE is the mean square error, m and n are the height and width of the image, i and j are the pixel coordinates, I(i, j) is the image output by the network in step 3, K(i, j) is the normal-illumination image obtained in step 1, PSNR is the peak signal-to-noise ratio, and MAX_I is the maximum pixel value of the image.
The SSIM is shown in formula (23). Formula (22) contains three parts: the luminance comparison l(x, y) in formula (19), the structure comparison s(x, y) in formula (20), and the contrast comparison c(x, y) in formula (21). μ_x and μ_y denote the means of x and y, σ_x and σ_y their standard deviations, and σ_xy their covariance; c_1, c_2, c_3 are constants that keep the denominators from being 0. Setting α = β = γ = 1 and c_3 = c_2 / 2 reduces formula (22) to formula (23):
l(x, y) = (2·μ_x·μ_y + c_1) / (μ_x² + μ_y² + c_1)  (19)
s(x, y) = (σ_xy + c_3) / (σ_x·σ_y + c_3)  (20)
c(x, y) = (2·σ_x·σ_y + c_2) / (σ_x² + σ_y² + c_2)  (21)
SSIM(x, y) = [l(x, y)]^α · [c(x, y)]^β · [s(x, y)]^γ  (22)
SSIM(x, y) = ((2·μ_x·μ_y + c_1)(2·σ_xy + c_2)) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2))  (23)
dynamically adjusting the hyper-parameters by observing two indexes of PSNR and SSIM: learning rate lr, balance parameter λ, and training round number epoch;
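Step 6.1 corresponds to a conventional PyTorch training setup; a condensed sketch is shown below, where model, train_loader and compute_loss stand for the network, the DataLoader of step 2.1 and the losses of steps 2.3, 3.2 and 5:

```python
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

def train(model, train_loader, compute_loss, epochs=300, device="cuda"):
    """ADAM optimizer, lr 0.0002, multi-step decay at epochs 25 and 100 (halved each time)."""
    model.to(device)
    optimizer = Adam(model.parameters(), lr=2e-4)
    scheduler = MultiStepLR(optimizer, milestones=[25, 100], gamma=0.5)
    for epoch in range(epochs):
        for batch in train_loader:
            optimizer.zero_grad()
            loss = compute_loss(model, [t.to(device) for t in batch])
            loss.backward()
            optimizer.step()
        scheduler.step()
```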
Step 6.2, output the training-process loss, PSNR and SSIM reference indexes to TensorBoard through the SummaryWriter of the Python third-party library tensorboard, verify the results on the validation set, and then save the model: the neural network parameters trained in step 6.1, the current epoch number, the ADAM optimizer state and the scheduler state are stored, giving the trained network model.
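Step 6.2 can be sketched as follows; the metric values are assumed to come from the validation pass, and the checkpoint layout is an assumption:

```python
import torch
from torch.utils.tensorboard import SummaryWriter

def log_and_save(writer, epoch, loss, psnr, ssim, model, optimizer, scheduler, path):
    """Write training metrics to TensorBoard and save a resumable checkpoint."""
    writer.add_scalar("loss", loss, epoch)
    writer.add_scalar("psnr", psnr, epoch)
    writer.add_scalar("ssim", ssim, epoch)
    torch.save({"epoch": epoch,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "scheduler": scheduler.state_dict()}, path)

# writer = SummaryWriter(log_dir="runs/event_lowlight")
```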
Step 7 is specifically as follows:
Load the network model trained and saved in step 6, input the test set into it, and save the test results to obtain the enhanced images.
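A minimal sketch of the test phase, assuming the checkpoint layout of the saving sketch above and a model that takes the event pseudo image and the low-light image as inputs:

```python
import torch

def test(model, test_loader, checkpoint_path, device="cuda"):
    """Load the saved model, run it on the test set and collect the enhanced images."""
    state = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state["model"])
    model.to(device).eval()
    outputs = []
    with torch.no_grad():
        for event_img, low_light, _, _ in test_loader:
            enhanced = model(event_img.to(device), low_light.to(device))
            outputs.append(enhanced.cpu())
    return outputs
```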
The method has the following advantages: the event stream is synthesized into an event pseudo image, which is reconstructed into a gradient map; to reduce the number of parameters needed for reconstruction, part of the information of the low-light enhancement branch is introduced into the gradient branch, and the gradient information in turn guides the low-light enhancement branch. The fusion block adopts a CBAM module so that the network attends better to the information it needs to learn; the L1 loss is adopted to increase the robustness of the network; and a conditional discriminator is added to improve the realism of the generated image. Experimental results show that the method achieves a good low-light enhancement effect.
Drawings
FIG. 1 is a schematic diagram of the general structure of the low-light enhancement method based on the event camera;
FIG. 2 is an example of a data set for training, including a low light image, an event artifact, a normal light image, and a gradient image;
FIG. 3 is a schematic diagram of a network for reconstructing an event pseudo-image into a gradient image;
FIG. 4 is a schematic diagram of a network for enhancing a low-light image to a normal-light image;
FIG. 5(a) is a network schematic of a convergence module;
FIG. 5(b) is a network schematic of the channel and spatial attention module;
FIG. 6(a) is a diagram showing the change in the evaluation index psnr during training;
fig. 6(b) shows a change in the evaluation index ssim;
FIG. 6(c) shows a decrease in loss value;
FIG. 7 is a graph showing the results of the experiment according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention relates to a low-light image enhancement method based on an event camera, a flow chart is shown in figure 1, and the method is implemented according to the following steps:
step 1, data set composition, selecting a data set with a normal illumination image and an event stream, requiring the normal illumination image and the event stream to be matched in space, then generating a gradient image and a noise-containing weak light image from the normal illumination image, and preprocessing the event stream to obtain an event pseudo image with good edge information;
the step 1 is as follows:
the low-light image enhancement process based on the event camera comprises the following steps:
for an event e ═ (x) i ,y i ,t i ,p i ) Firstly, the polarities of events are summed to obtain an event pseudo image, then the event pseudo image is subjected to feature extraction and gradient image reconstruction, in order to reduce the calculation amount of parameters in the reconstruction process, a feature map of a low-light image is introduced into a Gradient Branch (GB), meanwhile, a fusion block is used for introducing gradient information into a low-light enhancement branch, and the image is completed through the low-light enhancement branchAnd (5) enhancing the image, and finally judging whether the generated image is true or not through a condition discriminator.
Firstly, the event combination is called as an event pseudo image, then a simulated weak light image is obtained by adding noise and gamma correction to a normal light image, sobel edge extraction operation is carried out on the normal light image to obtain a gradient map, and the gradient map is divided into a training set, a verification set and a test set according to the proportion of 7: 2: 1.
Step 1.1, select a data set containing normal-illumination images and an event stream, where the normal-illumination images and the event stream are required to be paired in space. An event is represented as e_j = (x_j, y_j, t_j, p_j), where x_j and y_j are the coordinates of the pixel, t_j is the timestamp, p_j is the polarity of the event, and j indexes the events. Assume the event stream spans a duration Δt, during which the event camera also returns n gray frames; the event stream is divided into n corresponding intervals, and the pixel values of the synthesized pseudo image are obtained by adding the polarity values of the events that fall at each pixel within an interval, which yields the event pseudo image;
Step 1.2, for the reference image GT, noise is added to obtain a more realistic low-light image; for the authenticity of the data set, Gaussian blind noise is used, i.e. the Gaussian noise is generated with a standard deviation drawn from a range rather than a fixed standard deviation. The low-light scene is then simulated with gamma correction, as shown in formula (1),
V_out = (V_in)^gamma  (1)
where V_out is the corrected image, V_in is the image before correction, and gamma controls the scaling of the pixel values;
The data are then linearly normalized, as shown in formula (2),
X_norm = (X - X_min) / (X_max - X_min)  (2)
where X_norm is the normalized image, X is the pixel value at coordinates (x, y), X_min is the minimum pixel value of the image, and X_max is the maximum pixel value of the image; this yields the simulated noisy low-light image;
Step 1.3, to obtain a gradient image, edge extraction is performed on the normal-illumination image obtained in step 1.1 with the Sobel operator; the calculation for an image A is shown in formulas (3), (4) and (5):
G_x = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * A  (3)
G_y = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * A  (4)
G = sqrt(G_x² + G_y²)  (5)
where G_x is the first-order difference in the horizontal direction, G_y is the first-order difference in the vertical direction, * denotes the convolution operation, and G is the resulting gradient image.
Step 2, reconstructing the event pseudo image obtained in the step 1 into a gradient image by adopting UNet gradient branches;
the step 2 is as follows:
Step 2.1, first use a Dataset class of the deep-learning framework PyTorch and apply transform operations to the pictures in the class: convert the pictures to tensor format and then normalize them so that the network can learn the data better, as shown in formula (6):
O_c = (i_c - m_c) / S_c  (6)
where O_c is the output of the c-th channel, i_c is the input of the c-th channel, m_c is the mean of the c-th channel and S_c is the variance of the c-th channel; the data are then packed through a DataLoader;
Step 2.2, for the packed data, first perform feature extraction to obtain feature maps. The UNet network has a good feature-extraction structure, so UNet is selected as the backbone of the gradient branch; the max-pooling operation is replaced by a convolution with stride 2, followed by a batchnorm layer to adjust the data range and a ReLU activation, and downsampling is performed seven times. Upsampling is then performed by deconvolution with stride 2, again followed by batchnorm and ReLU, and repeated seven times; finally the upsampled feature map is used to reconstruct the gradient image. To preserve the feature-map size a padding operation is performed, and to reduce the number of parameters the information of the low-light image obtained in step 1.2 is introduced into the gradient branch, as shown in convolution formulas (7), (8) and deconvolution formulas (9), (10):
H_out = (H_in - k + 2×p) / s + 1  (7)
W_out = (W_in - k + 2×p) / s + 1  (8)
H_out = (H_in - 1)×2 - 2×p + k  (9)
W_out = (W_in - 1)×2 - 2×p + k  (10)
where H_out is the height of the output image, W_out is the width of the output image, H_in is the height of the input image, W_in is the width of the input image, p is the padding size, k is the convolution kernel size, and s is the stride;
Step 2.3, for the reconstructed gradient map output by the gradient branch and the gradient reference map, the L1 loss is used, as shown in formulas (11) and (12),
L = {l_1, l_2, ..., l_n}, l_n = |x_n - y_n|  (11)
l_(x,y) = mean(L)  (12)
where L collects the loss values of all pixel points, x and y are the pixel coordinates, l_1, l_2, ..., l_n are the per-pixel loss values, n is the number of pixels, and l_(x,y) is the loss value computed as their mean; the gradient-branch parameters are updated with this loss to obtain the updated gradient branch.
Step 3, enhancing the low-light image obtained in the step 1 by adopting a UNet low-light image enhancement branch to obtain an enhanced image;
the step 3 is as follows:
Step 3.1, the packed data from step 2.1 are fed into the image enhancement branch, with UNet selected as its backbone. The downsampling operation is replaced by a convolution with stride 2 followed by batchnorm and ReLU, repeated seven times; upsampling is then completed by deconvolution with stride 2 followed by batchnorm and ReLU, also repeated seven times, giving a feature map; a final convolution and ReLU on this feature map produces the output image with 3 channels;
Step 3.2, compute the L1 loss between the output image of the image enhancement branch and the normal-light image; for judging which of the output image and the normal-light image is real, the MSE loss is adopted, as shown in formulas (13) and (14),
L = {l_1, l_2, ..., l_n}, l_n = (x_n - y_n)²  (13)
l_(x,y) = mean(L)  (14)
where L collects the loss values of all patches, l_1, l_2, ..., l_n are the values of the corresponding patches, l_(x,y) is the loss value, and x and y are the patch coordinates; the loss is computed as the mean and used to update the image enhancement branch parameters, giving the updated image enhancement branch.
Step 4, to help the gradient map better guide the enhancement of the low-light image, a feature fusion module is designed, namely a channel and spatial attention module CBAM, which fuses the information contained in the gradient image of step 2 into the low-light enhancement branch of step 3;
the step 4 is specifically as follows:
Step 4.1, construct the feature fusion block using the channel and spatial attention module CBAM, which operates as follows: first apply channel attention to the output feature maps of step 2 and step 3, performing max pooling and average pooling over the spatial dimensions, feeding the results into a shared multi-layer perceptron MLP, adding them, and passing them through the tanh function shown in formula (15),
tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))  (15)
where tanh(x) is the output value after activation, x is the input value, and exp is the natural exponential function.
Spatial attention is then applied: max pooling and average pooling are performed over each channel, the results are concatenated, and the output passes through a sigmoid function; the resulting feature map is sent to the image enhancement branch of step 3;
Step 4.2, first concatenate the feature maps from steps 2 and 3 and convolve them to obtain a new feature map; the new feature map is followed by a CBAM module, then batchnorm and ReLU operations, and a dropout operation that randomly masks half of the neurons of the feature fusion block of step 4.1; this is repeated twice. Nine fusion modules in total are used across the gradient branch and the low-light enhancement branch, giving nine feature fusion modules.
Step 5, adding a condition discriminator, wherein the condition is an event pseudo image and a gradient image, and generating a more real enhanced image;
the step 5 is as follows:
To increase the realism of the generated picture, a relativistic generative adversarial network (RGAN) is adopted, and a conditional discriminator allows the output of the discriminator to be better controlled; the added conditions are the event pseudo image from step 1, the reconstructed gradient image, and the low-light image. When computing the probability that a picture is real, a single value is not used; instead a PatchGAN discriminator judges the probability that each image patch is real or fake, using the sigmoid activation function shown in formula (16),
sigmoid(x) = 1 / (1 + exp(-x))  (16)
The discriminator first concatenates the feature maps and extracts features through convolution, then applies InstanceNorm and ReLU; this is repeated four times, and the probability that the output image of step 3 is a normal-illumination image is judged;
step 6, training 300 epochs on the neural network built by the gradient branch in the step 2 and the weak light image enhancement branch in the step 3, verifying the training result and storing the model of the neural network;
the step 6 is specifically as follows:
Step 6.1, for the neural network formed by steps 2 and 3, the ADAM optimizer is selected with an initial learning rate of 0.0002; the scheduler uses a multi-step decay strategy with decay steps at 25 and 100, halving the learning rate each time, and training runs for 300 epochs in total. During training the PSNR is observed, as shown in formulas (17) and (18),
MSE = (1 / (m·n)) · Σ_{i=0}^{m-1} Σ_{j=0}^{n-1} [I(i, j) - K(i, j)]²  (17)
PSNR = 10 · log_10(MAX_I² / MSE)  (18)
where MSE is the mean square error, m and n are the height and width of the image, i and j are the pixel coordinates, I(i, j) is the image output by the network in step 3, K(i, j) is the normal-illumination image obtained in step 1, PSNR is the peak signal-to-noise ratio, and MAX_I is the maximum pixel value of the image.
The SSIM is shown in formula (23). Formula (22) contains three parts: the luminance comparison l(x, y) in formula (19), the structure comparison s(x, y) in formula (20), and the contrast comparison c(x, y) in formula (21). μ_x and μ_y denote the means of x and y, σ_x and σ_y their standard deviations, and σ_xy their covariance; c_1, c_2, c_3 are constants that keep the denominators from being 0. Setting α = β = γ = 1 and c_3 = c_2 / 2 reduces formula (22) to formula (23):
l(x, y) = (2·μ_x·μ_y + c_1) / (μ_x² + μ_y² + c_1)  (19)
s(x, y) = (σ_xy + c_3) / (σ_x·σ_y + c_3)  (20)
c(x, y) = (2·σ_x·σ_y + c_2) / (σ_x² + σ_y² + c_2)  (21)
SSIM(x, y) = [l(x, y)]^α · [c(x, y)]^β · [s(x, y)]^γ  (22)
SSIM(x, y) = ((2·μ_x·μ_y + c_1)(2·σ_xy + c_2)) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2))  (23)
dynamically adjusting the hyper-parameters by observing two indexes of PSNR and SSIM: learning rate lr, balance parameter λ, and training round number epoch;
Step 6.2, to better visualize the training result, output the training-process loss, PSNR and SSIM reference indexes to TensorBoard through the SummaryWriter of the Python third-party library tensorboard and test the results on the validation set; then save the model. To make it convenient to continue training and tune the hyper-parameters, the neural network parameters trained in step 6.1, the current epoch number, the ADAM optimizer state and the scheduler state are stored, giving the trained network model.
And 7, testing the neural network model stored in the step 6, and outputting the enhanced image.
Step 7 is specifically as follows:
Load the network model trained and saved in step 6, input the test set into it, and save the test results to obtain the enhanced images.
As shown in fig. 1, an embodiment of the present invention includes:
In the event-camera-based low-light image enhancement, the event stream and the normal-light image are first extracted; the event stream is synthesized into an event pseudo image, and the normal-light image is used to generate a gradient image and a noisy low-light image. The event pseudo image is input into the gradient branch and reconstructed into a gradient image, while the low-light image is input into the low-light enhancement branch to obtain a normal-light image. To reduce the amount of parameter computation and to let the gradient image guide the enhancement of the low-light image, a module based on a channel and spatial attention mechanism is adopted; to increase the realism of the generated image, a discriminator network with added conditions is adopted. Experiments prove that this low-light image enhancement method has an excellent enhancement effect.
The dim light enhancement method based on the event camera is implemented according to the following steps:
Step 1, first the event stream is combined into event pseudo images; a simulated low-light image is then obtained by adding noise and applying gamma correction to the normal-light image, and a Sobel edge-extraction operation on the normal-light image yields the gradient image. The data are divided into a training set, a validation set and a test set in the ratio 7:2:1.
Step 1.1, an event is represented as e = (x_i, y_i, t_i, p_i). Assume the event stream spans a duration Δt, during which the event camera also returns n gray frames; the pixel values of the synthesized pseudo image are obtained by adding the polarity values of the events falling at each pixel, and the pseudo images serve as the input of the gradient-map reconstruction branch. Stacking events by fixed time intervals easily makes the events either excessively superposed or too sparse, so a fixed number N_e of events is adopted instead, which leads to better results; the amount of input to the gradient branch can be controlled by changing the value of N_e;
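A sketch of grouping by a fixed event count N_e instead of a fixed time interval (array names are hypothetical):

```python
import numpy as np

def group_events_fixed_count(xs, ys, ps, n_e, height, width):
    """Split the event stream into consecutive groups of N_e events and
    build one pseudo image per group by summing polarities."""
    images = []
    for start in range(0, len(xs) - n_e + 1, n_e):
        img = np.zeros((height, width), dtype=np.float32)
        sl = slice(start, start + n_e)
        np.add.at(img, (ys[sl], xs[sl]), ps[sl])
        images.append(img)
    return images
```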
Step 1.2, to obtain a more realistic low-light image, noise is added to the reference image (GT); to increase the authenticity of the data set, Gaussian blind noise is used, i.e. the Gaussian noise is generated with a standard deviation drawn from a range rather than a fixed standard deviation. The low-light scene is then simulated with gamma correction, formula (1), where V_out is the corrected image, V_in is the image before correction and gamma controls the scaling of the pixel values. The data are then linearly normalized, formula (2), where X is the pixel value at coordinates (x, y), X_min is the minimum pixel value of the image and X_max is the maximum pixel value, yielding the input of the low-light image enhancement branch;
V_out = (V_in)^gamma  (1)
X_norm = (X - X_min) / (X_max - X_min)  (2)
Step 1.3, to obtain a reference image for the reconstructed gradient map, edge extraction is performed on the normal-light image with the Sobel operator applied to an image A, formulas (3), (4), (5), where G_x is the first-order difference in the horizontal direction, G_y is the first-order difference in the vertical direction, * denotes the convolution operation and G is the gradient map, which serves as the reference image. The data set is illustrated in FIG. 2: the first and fifth rows show the low-light noisy images, the second and sixth rows the event-synthesized pseudo images, the third and seventh rows the gradient images extracted from the normal-light images, and the fourth and eighth rows the normal-light images.
G_x = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * A  (3)
G_y = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * A  (4)
G = sqrt(G_x² + G_y²)  (5)
The step 2 specifically comprises the following operations:
Step 2.1, first define a Dataset class and apply transform operations to the pictures in the class: convert the pictures to tensor format and normalize them so that the network can learn the data better, as shown in formula (6), where O_c is the output of the c-th channel, i_c is the input of the c-th channel, m_c is the mean of the c-th channel and S_c is the variance of the c-th channel; the data are then packed through a DataLoader;
O_c = (i_c - m_c) / S_c  (6)
Step 2.2, for the packed data, feature extraction is first performed to obtain feature maps. The UNet network has a good feature-extraction structure, so UNet is selected as the backbone of the gradient branch, as shown in FIG. 3; the max-pooling operation is replaced by a convolution with stride 2, followed by a batchnorm layer to adjust the data range and a ReLU activation, and downsampling is performed seven times. Upsampling is then performed by deconvolution with stride 2, again with batchnorm and ReLU, in the same way. Finally the upsampled feature map is used to reconstruct the gradient map; a padding operation is performed to preserve the feature-map size, and to reduce the number of parameters the feature map of the low-light image enhancement branch is introduced into the gradient branch. The convolution formulas (7), (8) and deconvolution formulas (9), (10) are given below, where H_in is the height of the input image, W_in its width, H_out the height of the output image, W_out its width, p the padding size, k the convolution kernel size and s the stride;
H_out = (H_in - k + 2×p) / s + 1  (7)
W_out = (W_in - k + 2×p) / s + 1  (8)
H_out = (H_in - 1)×2 - 2×p + k  (9)
W_out = (W_in - 1)×2 - 2×p + k  (10)
Step 2.3, for the reconstructed gradient map output by the gradient branch and the gradient reference map, the L1 loss is used, formulas (11) and (12), where x and y are the pixel coordinates and n is the number of pixel points; the loss is computed as the mean.
L = {l_1, l_2, ..., l_n}, l_n = |x_n - y_n|  (11)
l_(x,y) = mean(L)  (12)
The step 3 specifically comprises the following steps:
Step 3.1, the packed data from step 2.1 are fed into the image enhancement branch; likewise, the downsampling operation is replaced by a convolution with stride 2 followed by batchnorm and ReLU, repeated seven times, and the upsampling operation is completed by deconvolution with stride 2 followed by batchnorm and ReLU, also repeated seven times. Finally a convolution and ReLU are applied once to the feature map, giving an output image with 3 channels; the network structure is shown in FIG. 4;
Step 3.2, the L1 loss is computed between the output image of the image enhancement branch and the normal-light image, and for judging which of the two is real the MSE loss is adopted, formulas (13) and (14), where x_n and y_n are the values of the n-th patch, and the mean is used.
L = {l_1, l_2, ..., l_n}, l_n = (x_n - y_n)²  (13)
l_(x,y) = mean(L)  (14)
The step 4 specifically comprises the following steps:
Step 4.1, the feature fusion block adopts the channel and spatial attention module (CBAM). The CBAM first applies channel attention to the input feature map F: max pooling and average pooling are performed over the spatial dimensions, the results are fed into a shared multi-layer perceptron MLP, added, and passed through the tanh function of formula (15). Spatial attention is then applied: max pooling and average pooling are performed over each channel, the results are concatenated and passed through a sigmoid function;
tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))  (15)
Step 4.2, the feature maps to be fused are first concatenated and convolved to obtain a new feature map; the fused feature map is followed by a CBAM module, then batchnorm and ReLU operations, and a dropout operation that randomly masks half of the neurons; this is repeated twice. Nine fusion modules in total are used across the gradient branch and the low-light enhancement branch; the fusion-module network structure is shown in FIG. 5(a) and the channel and spatial attention module in FIG. 5(b);
the step 5 specifically comprises the following steps:
To increase the realism of the generated picture, a relativistic generative adversarial network (RGAN) is adopted; using a conditional discriminator, the added conditions allow the output of the generator to be better controlled. The added conditions are the event pseudo image, the reconstructed gradient image and the low-light image. When computing the probability that a picture is real, a single value is not used; instead PatchGAN judges the probability that each image patch is real or fake, using the sigmoid activation function of formula (16). The discriminator network module first concatenates the feature maps and extracts features through convolution, then applies InstanceNorm and ReLU; this is repeated four times.
sigmoid(x) = 1 / (1 + exp(-x))  (16)
The step 6 specifically comprises the following steps:
Step 6.1, the ADAM optimizer is selected with an initial learning rate of 0.0002; the scheduler uses a multi-step decay strategy with decay steps at 25 and 100, halving the learning rate each time, and 300 epochs are trained in total. During training, the PSNR is observed, formulas (17) and (18), where MAX_I is the maximum pixel value of the image, together with the SSIM, formula (23), where l(x, y) is the luminance comparison, c(x, y) the contrast comparison and s(x, y) the structure comparison, μ_x and μ_y the means of x and y, σ_x and σ_y their standard deviations, σ_xy their covariance, and c_1, c_2, c_3 constants that keep the denominators from being 0. Setting α = β = γ = 1 and c_3 = c_2 / 2 reduces formula (22) to formula (23). By observing these two indexes, some hyper-parameters are adjusted dynamically, such as the learning rate lr, the balance parameter λ and the number of training epochs.
MSE = (1 / (m·n)) · Σ_{i=0}^{m-1} Σ_{j=0}^{n-1} [I(i, j) - K(i, j)]²  (17)
PSNR = 10 · log_10(MAX_I² / MSE)  (18)
l(x, y) = (2·μ_x·μ_y + c_1) / (μ_x² + μ_y² + c_1)  (19)
s(x, y) = (σ_xy + c_3) / (σ_x·σ_y + c_3)  (20)
c(x, y) = (2·σ_x·σ_y + c_2) / (σ_x² + σ_y² + c_2)  (21)
SSIM(x, y) = [l(x, y)]^α · [c(x, y)]^β · [s(x, y)]^γ  (22)
SSIM(x, y) = ((2·μ_x·μ_y + c_1)(2·σ_xy + c_2)) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2))  (23)
Step 6.2, to better visualize the training result, the reference indexes such as loss, PSNR and SSIM of the training process are output to TensorBoard through SummaryWriter, and the results are tested on the validation set. The training results are shown in FIG. 6: FIG. 6(a) shows the PSNR per epoch in the training phase, FIG. 6(b) the SSIM per epoch, and FIG. 6(c) the decrease of the loss. The model is then saved; to make it convenient to continue training and tune the hyper-parameters, the network parameters, the current epoch number, the optimizer and the scheduler are stored, which also prepares for the following test.
The step 7 specifically comprises the following steps:
The trained model is loaded, the test set is input into the trained network model, and the test results are saved; the PSNR and SSIM indexes of the test results are computed, giving objective values of about 25.5 for PSNR and 0.82 for SSIM. The subjective results are shown in FIG. 7: the first row is the low-light image, the second row the event pseudo image, the third row the reconstructed gradient image, the fourth row the gradient image corresponding to the normal-light image, the fifth row the result after enhancement of the low-light image, and the sixth row the normal-light image.
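The reported indexes can be computed per test image, for example with scikit-image (assuming images scaled to [0, 1] and a scikit-image version that supports channel_axis):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(pred, gt):
    """PSNR and SSIM between an enhanced image and its normal-light reference."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
    ssim = structural_similarity(gt, pred, data_range=1.0, channel_axis=-1)
    return psnr, ssim
```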
The method enhances the low-light image by guiding the enhancement with the gradient image reconstructed from the event stream. The enhanced results reach high values on the objective indexes and a good enhancement effect in the subjective experiments; most of the noise of the low-light image can be removed and a normal-light image with good detail can be reconstructed.

Claims (8)

1. The low-light image enhancement method based on the event camera is characterized by being implemented according to the following steps:
step 1, data set composition, selecting a data set with a normal illumination image and an event stream, requiring the normal illumination image and the event stream to be matched in space, then generating a gradient image and a noise-containing weak light image from the normal illumination image, and preprocessing the event stream to obtain an event pseudo image with good edge information;
step 2, reconstructing the event pseudo image obtained in the step 1 into a gradient image by adopting UNet gradient branches;
step 3, enhancing the low-light image obtained in the step 1 by adopting a UNet low-light image enhancement branch to obtain an enhanced image;
step 4, designing a feature fusion module, namely adopting a module CBAM based on a channel and space attention, and fusing information contained in the gradient image in the step 2 to the weak light enhancement branch in the step 3;
step 5, adding a condition discriminator, wherein the condition is an event pseudo image and a gradient image, and generating a more real enhanced image;
step 6, training 300 epochs on the neural network built by the gradient branch in the step 2 and the weak light image enhancement branch in the step 3, verifying the training result and storing the model of the neural network;
and 7, testing the neural network model stored in the step 6, and outputting the enhanced image.
2. The event camera-based low-light image enhancement method according to claim 1, wherein the step 1 is specifically as follows:
step 1.1, selecting a data set containing normal-illumination images and an event stream, wherein the normal-illumination images and the event stream are required to be paired in space; an event is represented as e_j = (x_j, y_j, t_j, p_j), wherein x_j and y_j represent the coordinates of the pixel, t_j represents the timestamp, p_j represents the polarity of the event, and j indexes the events; assuming the event stream spans a duration Δt during which the event camera also returns n gray frames, the event stream is divided into n corresponding intervals, i = 1, 2, …, n, and the pixel values of the synthesized pseudo image are obtained by adding the polarity values of the events that fall at each pixel within an interval, yielding the event pseudo image;
step 1.2, adding noise to the reference image GT using Gaussian blind noise, specifically generating the Gaussian noise with a standard deviation drawn from a range rather than a fixed standard deviation, and then performing low-light scene simulation with gamma correction, as shown in formula (1),
V_out = (V_in)^gamma  (1)
wherein V_out represents the corrected image, V_in represents the image before correction, and gamma represents the scaling strength of the pixel values;
and then linearly normalizing the data, as shown in formula (2),
X_norm = (X - X_min) / (X_max - X_min)  (2)
wherein X_norm represents the normalized image, X represents the pixel value at coordinates (x, y), X_min represents the minimum pixel value of the image, and X_max represents the maximum pixel value of the image, obtaining the simulated noisy low-light image;
step 1.3, performing edge extraction on the normal-illumination image obtained in step 1.1 with the Sobel operator; the calculation for an image A is shown in formulas (3), (4) and (5):
G_x = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * A  (3)
G_y = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * A  (4)
G = sqrt(G_x² + G_y²)  (5)
wherein G_x represents the first-order difference in the horizontal direction, G_y represents the first-order difference in the vertical direction, * denotes the convolution operation, and G is the resulting gradient image.
3. The event camera-based low-light image enhancement method according to claim 2, wherein the step 2 is specifically as follows:
step 2.1, first using a Dataset class of the deep-learning framework PyTorch and applying transform operations to the pictures in the class: converting the pictures to tensor format and then normalizing them, as shown in formula (6):
O_c = (i_c - m_c) / S_c  (6)
wherein O_c denotes the output of the c-th channel, i_c the input of the c-th channel, m_c the mean of the c-th channel, and S_c the variance of the c-th channel; the data are then packed through a DataLoader;
step 2.2, for the packed data, first performing feature extraction to obtain feature maps, selecting UNet as the backbone of the gradient branch, replacing the max-pooling operation with a convolution of stride 2, then adjusting the data range with a batchnorm layer and activating with a ReLU function, downsampling seven times; then upsampling with a deconvolution of stride 2, again adjusting the data range with batchnorm and using the ReLU function, upsampling seven times in the same way; finally reconstructing the gradient map from the upsampled feature map to obtain the reconstructed gradient image, performing a padding operation, and introducing the information of the low-light image obtained in step 1.2 into the gradient branch, as shown in convolution formulas (7), (8) and deconvolution formulas (9), (10):
H_out = (H_in - k + 2×p) / s + 1  (7)
W_out = (W_in - k + 2×p) / s + 1  (8)
H_out = (H_in - 1)×2 - 2×p + k  (9)
W_out = (W_in - 1)×2 - 2×p + k  (10)
wherein H out Indicating the height, W, of the output image out Indicating the width of the input image, letter H in Denotes the height of the input picture, p denotes the padding size, k denotes the convolution kernel size, s denotes the step size, W in Representing the width of the input image;
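A minimal PyTorch sketch of one downsampling and one upsampling stage of the gradient branch of step 2.2; the channel counts and the choice k = 4, s = 2, p = 1 are illustrative and consistent with formulas (7)–(10):

```python
import torch
from torch import nn

def down_block(c_in, c_out):
    # Stride-2 convolution replaces max pooling, then batch norm and ReLU.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),   # formulas (7), (8)
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

def up_block(c_in, c_out):
    # Stride-2 deconvolution, then batch norm and ReLU.
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, kernel_size=4, stride=2, padding=1),  # formulas (9), (10)
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

# With k=4, s=2, p=1 a 256x256 input halves on the way down and doubles again on the way up,
# matching the size formulas above.
x = torch.randn(1, 3, 256, 256)
print(down_block(3, 64)(x).shape)                     # torch.Size([1, 64, 128, 128])
print(up_block(64, 3)(down_block(3, 64)(x)).shape)    # torch.Size([1, 3, 256, 256])
```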
step 2.3, for the reconstructed gradient map output by the gradient branch and the gradient reference map, using the L1 loss, as shown in formulas (11) and (12),
L = {l_1, l_2, …, l_n},  l_n = |x_n − y_n|  (11)
l(x, y) = mean(L)  (12)
where L collects the loss values of all pixel points, x_n and y_n denote the values of the reconstructed gradient map and the gradient reference map at the n-th pixel, l_1, l_2, …, l_n denote the per-pixel loss values, n denotes the number of pixels, and l(x, y) denotes the loss value computed as the mean; the gradient-branch parameters are updated with this loss value to obtain the updated gradient branch.
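A minimal PyTorch usage sketch of the mean-reduced L1 loss of formulas (11) and (12), with dummy tensors standing in for the reconstructed gradient map and the gradient reference map:

```python
import torch
from torch import nn

l1 = nn.L1Loss(reduction="mean")                       # formulas (11) and (12)
pred_grad = torch.rand(1, 1, 64, 64, requires_grad=True)  # reconstructed gradient map (dummy)
ref_grad = torch.rand(1, 1, 64, 64)                       # gradient reference map (dummy)
loss = l1(pred_grad, ref_grad)
loss.backward()  # gradients of this loss drive the gradient-branch parameter update
```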
4. The event camera-based low-light image enhancement method according to claim 3, wherein the step 3 is specifically as follows:
step 3.1, inputting the packaged data of step 2.1 into the image enhancement branch, selecting UNet as the backbone network of the image enhancement branch, replacing the downsampling operation with a convolution of stride 2 followed by batch-norm and ReLU operations, repeated seven times; the upsampling is completed by a deconvolution of stride 2 followed by batch-norm and ReLU operations, likewise repeated seven times, to obtain a feature map; one further convolution and ReLU operation on this feature map yields an output image with 3 channels;
step 3.2, calculating the L1 loss between the output image of the image enhancement branch and the normal-light image; for judging which of the output image and the normal-light image is real, the MSE loss is adopted, as shown in formulas (13) and (14),
L = {l_1, l_2, …, l_n},  l_n = (x_n − y_n)^2  (13)
l(x, y) = mean(L)  (14)
where L collects the loss values of all patches, l_1, l_2, …, l_n denote the loss values of the corresponding patches, l(x, y) denotes the loss value, and x and y denote the horizontal and vertical coordinates of a patch; the loss value, computed as the mean, is used to update the image-enhancement-branch parameters to obtain the updated image enhancement branch.
5. The event camera-based low-light image enhancement method according to claim 4, wherein the step 4 is specifically as follows:
step 4.1, constructing a feature fusion block by adopting the channel-and-spatial-attention module CBAM, which operates as follows: first, a channel attention operation is performed on the output feature maps of step 2 and step 3, spatial max pooling and average pooling are applied respectively, the pooled results are fed into a shared multi-layer perceptron MLP, and the two outputs are added and passed through a tanh function, as shown in formula (15),
tanh(x) = (exp(x) − exp(−x)) / (exp(x) + exp(−x))  (15)
where tanh(x) denotes the activated output value, x denotes the input value, and exp denotes the natural exponential function;
then a spatial attention operation is carried out: max pooling and average pooling are performed along the channel dimension, the pooled results are concatenated and, in the same way, passed through a sigmoid function, and the output feature map is sent to the image enhancement branch of step 3;
and step 4.2, first concatenating the feature maps of steps 2 and 3, then applying a convolution to obtain a new feature map, following the new feature map with a CBAM module, then performing a batch-norm operation and a ReLU operation, and adding a dropout operation that randomly masks half of the neurons of the feature fusion block of step 4.1; these operations are repeated twice, and nine such fusion modules are used jointly in the gradient branch and the weak-light enhancement branch, giving nine feature fusion modules.
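A minimal PyTorch sketch of such a feature fusion block; the reduction ratio, kernel sizes and channel counts follow the common CBAM formulation and are illustrative assumptions, not the patent's values:

```python
import torch
from torch import nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Channel attention of step 4.1: spatial max/avg pooling -> shared MLP -> add -> tanh."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(  # shared multi-layer perceptron
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))
        return x * torch.tanh(avg + mx)  # tanh as in formula (15)

class SpatialAttention(nn.Module):
    """Spatial attention of step 4.1: channel-wise max/avg pooling -> concat -> conv -> sigmoid."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class FusionBlock(nn.Module):
    """Fusion block of step 4.2 (sketch): concat -> conv -> CBAM -> batch norm -> ReLU -> dropout."""

    def __init__(self, c_grad, c_img, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_grad + c_img, c_out, 3, padding=1)
        self.ca = ChannelAttention(c_out)
        self.sa = SpatialAttention()
        self.post = nn.Sequential(nn.BatchNorm2d(c_out), nn.ReLU(inplace=True), nn.Dropout(p=0.5))

    def forward(self, f_grad, f_img):
        x = self.conv(torch.cat([f_grad, f_img], dim=1))  # splice the two branches' features
        return self.post(self.sa(self.ca(x)))
```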
6. The event camera-based low-light image enhancement method according to claim 5, wherein the step 5 is specifically as follows:
adopting a relativistic generative adversarial network RGAN, and adding conditions to control the output of the discriminator by means of a conditional discriminator, the added conditions being the event pseudo image, the reconstructed gradient image and the weak-light image of step 1; a patchGAN discriminator is adopted so that, when the probability of an image being real is computed, the real/fake probability is judged patch by patch, using the sigmoid activation function shown in formula (16),
δ(x) = 1 / (1 + exp(−x))  (16)
where δ(x) denotes the activated value, x denotes the input value, and exp denotes the natural exponential function; the specific operation of the discriminator is to first concatenate the feature maps and extract features through convolution operations, then perform instance-norm and ReLU operations, repeated four times, and finally judge the probability that the output image of step 3 is the normal-illumination image.
7. The event camera-based low-light image enhancement method according to claim 6, wherein the step 6 is as follows:
step 6.1, for the neural network formed by step 2 and step 3, selecting the ADAM optimizer, setting the initial learning rate to 0.0002, and setting the scheduler to a multi-step decay strategy with decay milestones at epochs 25 and 100 and a decay factor of one half, the total training lasting 300 epochs; during training, the PSNR is observed as shown in formulas (17) and (18),
MSE = (1 / (m×n)) × Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} [I(i, j) − K(i, j)]^2  (17)
PSNR = 10 × log10(MAX_I^2 / MSE)  (18)
where MSE denotes the mean square error, m and n denote the length and width of the image respectively, i and j denote the coordinates of a pixel point, I(i, j) denotes the image output by the network in step 3, K(i, j) denotes the normal-illumination image obtained in step 1, PSNR denotes the peak signal-to-noise ratio, and MAX_I denotes the maximum pixel value of the image;
the SSIM is shown in formula (23); formula (22) contains three parts: the luminance comparison l(x, y) shown in formula (19), the structure comparison s(x, y) shown in formula (20), and the contrast comparison c(x, y) shown in formula (21), where μ_x and μ_y denote the means of x and y respectively, σ_x and σ_y denote the standard deviations of x and y respectively, σ_xy denotes the covariance of x and y, and c_1, c_2, c_3 denote constants that keep the denominators from being 0; setting α = β = γ = 1 and c_3 = c_2/2 reduces formula (22) to formula (23):
l(x, y) = (2×μ_x×μ_y + c_1) / (μ_x^2 + μ_y^2 + c_1)  (19)
s(x, y) = (σ_xy + c_3) / (σ_x×σ_y + c_3)  (20)
c(x, y) = (2×σ_x×σ_y + c_2) / (σ_x^2 + σ_y^2 + c_2)  (21)
SSIM(x, y) = [l(x, y)]^α × [c(x, y)]^β × [s(x, y)]^γ  (22)
SSIM(x, y) = (2×μ_x×μ_y + c_1)(2×σ_xy + c_2) / ((μ_x^2 + μ_y^2 + c_1)(σ_x^2 + σ_y^2 + c_2))  (23)
by observing the two indexes PSNR and SSIM, the hyper-parameters, namely the learning rate lr, the balance parameter λ and the number of training epochs, are adjusted dynamically;
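A minimal NumPy sketch of the PSNR of formulas (17)–(18) and a single-window simplification of the SSIM of formula (23); the constants c_1 and c_2 follow common defaults and are assumptions here:

```python
import numpy as np

def psnr(img, ref, max_i=1.0):
    """PSNR of formulas (17) and (18) for float images in [0, max_i]."""
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_i ** 2 / mse)

def ssim_global(img, ref, max_i=1.0):
    """Single-window simplification of the SSIM of formula (23), computed over the whole
    image; the usual implementation averages SSIM over local windows."""
    c1, c2 = (0.01 * max_i) ** 2, (0.03 * max_i) ** 2   # common default constants (assumed)
    mu_x, mu_y = img.mean(), ref.mean()
    var_x, var_y = img.var(), ref.var()
    cov_xy = ((img - mu_x) * (ref - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```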
and step 6.2, outputting the reference indexes of the training process, namely the loss, PSNR and SSIM, to TensorBoard through the SummaryWriter of the Python third-party library tensorboard, checking the test results on the validation set, and then saving the model, i.e. storing the neural network parameters trained in step 6.1, the current training epoch, the ADAM optimizer and the scheduler, to obtain the trained network model.
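A minimal PyTorch sketch of the TensorBoard logging and model saving of step 6.2; the log directory, file names and checkpoint keys are illustrative assumptions:

```python
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/event_lowlight")  # assumed directory name

def log_metrics(epoch, loss, psnr_val, ssim_val):
    # Reference indexes of the training process written to TensorBoard.
    writer.add_scalar("train/loss", loss, epoch)
    writer.add_scalar("val/psnr", psnr_val, epoch)
    writer.add_scalar("val/ssim", ssim_val, epoch)

def save_checkpoint(path, model, optimizer, scheduler, epoch):
    # Store the trained parameters, current epoch, ADAM optimizer state and scheduler state.
    torch.save({
        "model": model.state_dict(),
        "optimizer": optimizer.state_dict(),
        "scheduler": scheduler.state_dict(),
        "epoch": epoch,
    }, path)
```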
8. The event camera-based low-light image enhancement method according to claim 7, wherein the step 7 is specifically as follows:
loading the trained network model of step 6, inputting the test set into the trained network model, and then saving the test results to obtain the enhanced images.
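A minimal PyTorch sketch of the test-time procedure of step 7; the checkpoint path and keys, and the model and test_loader objects, are illustrative assumptions carried over from the earlier sketches:

```python
import torch
from torchvision.utils import save_image

# Load the checkpoint produced in step 6 (path and key names are assumed).
checkpoint = torch.load("checkpoints/event_lowlight.pth", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()

with torch.no_grad():
    for idx, (low, events, grad) in enumerate(test_loader):
        enhanced = model(low, events, grad)                       # forward pass of the network
        save_image(enhanced.clamp(0, 1), f"results/enhanced_{idx:04d}.png")  # store the result
```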
CN202210723127.0A 2022-06-24 2022-06-24 Low-light image enhancement method based on event camera Pending CN115082341A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210723127.0A CN115082341A (en) 2022-06-24 2022-06-24 Low-light image enhancement method based on event camera

Publications (1)

Publication Number Publication Date
CN115082341A true CN115082341A (en) 2022-09-20

Family

ID=83254982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210723127.0A Pending CN115082341A (en) 2022-06-24 2022-06-24 Low-light image enhancement method based on event camera

Country Status (1)

Country Link
CN (1) CN115082341A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116091337A (en) * 2022-11-29 2023-05-09 北京大学 Image enhancement method and device based on event signal nerve coding mode
CN116091337B (en) * 2022-11-29 2024-02-02 北京大学 Image enhancement method and device based on event signal nerve coding mode
JP7425276B1 (en) 2023-05-11 2024-01-31 浙江工商大学 Method, medium and device for augmenting reconstructed images of an event camera that fuses visible light images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination