CN115456903B - Deep learning-based full-color night vision enhancement method and system - Google Patents

Deep learning-based full-color night vision enhancement method and system

Info

Publication number
CN115456903B
CN115456903B (application CN202211166825.1A)
Authority
CN
China
Prior art keywords
image
network
loss function
representing
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211166825.1A
Other languages
Chinese (zh)
Other versions
CN115456903A (en)
Inventor
Peng Chenglei
Liu Zhihao
Yue Tao
Pan Hongbing
Wang Yuxuan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202211166825.1A
Publication of CN115456903A
Application granted
Publication of CN115456903B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a full-color night vision enhancement method and system based on deep learning. The method comprises the following steps: S1, acquiring RAW-format image sequence information under various ambient illuminances; S2, preprocessing the RAW-format image sequence to obtain a pixel-fused RGB-format image sequence; S3, obtaining a black-level image and removing the black level; S4, linearly brightening according to the brightness of typical regions of the image; S5, obtaining a denoised image sequence through a denoising network with a gated recurrent unit; S6, restoring the initial brightness; and S7, adaptively adjusting the brightness of the image sequence through a self-supervised recurrent convolutional neural network. The invention denoises the image sequence using long-time-sequence information, can effectively remove image noise collected in environments of around 10⁻³ Lux, and improves the signal-to-noise ratio of the image.

Description

Deep learning-based full-color night vision enhancement method and system
Technical Field
The invention relates to a full-color night vision enhancement method and system based on deep learning, and belongs to the field of computer vision.
Background
Night vision imaging enhancement is a back-end image processing technique: after an infrared or low-light image is obtained at night, algorithmic processing removes noise and color deviation to form a clear, easily observable enhanced image. Enhancement of infrared images generally uses filtering algorithms to suppress noise and background and highlight the primary targets. Enhancement of black-and-white low-light images is similar, mainly using adaptive filtering and related methods to raise the signal-to-noise ratio of the image. In addition, some enhancement algorithms impart pseudo-color information to black-and-white low-light images to highlight the main targets in the scene and improve the viewing experience. Compared with these night vision technologies, acquiring full-color low-light images and improving their quality with deep learning can yield images with accurate color, high signal-to-noise ratio and balanced brightness, improving the night vision viewing experience.
Existing artificial-intelligence low-light enhancement techniques use paired or unpaired low-light images and normal-illumination images to train a network that maps low-light images to normal illumination, and achieve good enhancement results. However, full-color night vision enhancement at starlight-level or airglow-level illumination is lacking: existing methods can neither raise the signal-to-noise ratio of the image nor adjust its brightness, and when the detected ambient illuminance is around 10⁻³ Lux or lower, existing deep-learning-based low-light enhancement algorithms cannot effectively improve image quality.
Disclosure of Invention
In order to solve the technical problems existing in the prior art at ambient illuminances as low as 10⁻³ Lux, the invention provides a deep-learning-based full-color night vision enhancement method and system in which denoising and brightness adjustment are performed separately.
The specific technical scheme of the method is as follows:
A full-color night vision enhancement method based on deep learning comprises the following steps:
S1: Collect a low-light image in RAW format and record the image information as X_RAW;
S2: Perform pixel fusion on the image information X_RAW and convert it into an RGB-format image, denoted X_RGB;
S3: Using the same acquisition parameters as in step S1 and the processing method of step S2, obtain N dark-field images in RGB format, and take the average of the N dark-field images as the black-level information, denoted X_BLACK;
S4: Select image blocks of M×N resolution at five typical positions of the image information X_RAW (four around the periphery and one at the center), calculate their mean value X̄, from it compute the linear brightness-enhancement coefficient Ratio, and form the denoising-network input X_IN1 = Ratio × (X_RGB − X_BLACK);
S5: Input X_IN1 into the denoising network to obtain the denoised image X_OUT1;
S6: Using the coefficient Ratio obtained in step S4, restore the brightness of the denoised image X_OUT1 to form the input of the adaptive brightness adjustment network: X_IN2 = X_OUT1 / Ratio;
S7: Input X_IN2 into the adaptive brightness adjustment network to obtain the final output image sequence.
The invention has the following beneficial effects:
(1) Aiming at the imaging characteristics of low-light images in extremely dark environments, the invention uses preprocessing methods such as pixel fusion and black-level subtraction to remove part of the noise and color deviation in advance, improving the quality of the low-light image.
(2) The low-light enhancement task is split into two steps, denoising and adaptive brightness adjustment, each realized by its own convolutional neural network; the adaptive brightness adjustment network processes the already-denoised low-light image, which effectively improves noise removal for extremely low-illumination images while raising image brightness.
(3) The denoising network uses a gated recurrent unit (GRU) to exploit image time-sequence information, effectively raising the signal-to-noise ratio and removing image noise collected in environments of around 10⁻³ Lux. Upsampling is done with the pixel-shuffle method (PixelShuffle), avoiding the image blurring and checkerboard artifacts introduced by deconvolution.
(4) The recurrent convolutional neural network is trained by self-supervised learning; reusing weight parameters reduces memory occupation, and self-supervision effectively improves the robustness of adaptive brightness adjustment, so the output images have uniform, stable overall brightness, small color deviation and consistent display.
Drawings
FIG. 1 is a schematic diagram of a full color night vision enhancement system of the present invention;
FIG. 2 is a flow chart of the full color night vision enhancement method of the present invention;
FIG. 3 is a schematic diagram of the structure of the denoising network DenoiseNet of the present invention;
fig. 4 is a schematic structural diagram of an adaptive brightness adjustment network LightNet according to the present invention;
FIG. 5 is a schematic view of a GRU unit structure used in the present invention.
Detailed Description
The following describes the scheme of the invention in detail with reference to the accompanying drawings.
As shown in fig. 1, the present embodiment provides a full-color night vision enhancement system based on deep learning, including:
the low-light image acquisition module, used to collect low-light images in RAW format; a full-color low-light camera is generally used, which captures visible-light information and images effectively under starlight-level ambient illumination; the collected image format is RAW with an RGGB arrangement;
the preprocessing module, used to preprocess the RAW-format image collected by the low-light image acquisition module: perform BIN2 pixel fusion on the RAW image, i.e., fuse each group of four adjacent pixels, convert it into RGB format, and subtract the black-level information, which is the average of N RGB-format dark-field images obtained by the low-light image acquisition module;
the denoising module, used to remove noise from the preprocessed RGB image through a denoising network: the RGB image is linearly brightened, input into the denoising network, and then linearly restored to its initial brightness distribution to serve as the input of the adaptive brightness adjustment network;
the adaptive brightness adjustment module, used to process the denoised RGB image through the trained adaptive brightness adjustment network and adjust its brightness distribution;
and the encoding output module, used to encode the RGB image enhanced by the adaptive brightness adjustment module into a video signal and either store it on a local storage medium or transmit it to a display.
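Taken together, the modules form a per-frame pipeline. The minimal Python sketch below strings them together for one frame; `bin2_fuse` and `estimate_ratio` are defined in the sketches further down, while `denoise_net`, `light_net` and the recurrent hidden state `h` stand in for the trained networks — all names here are illustrative assumptions, not taken from the patent.

```python
# A minimal per-frame pipeline sketch under the assumptions above; the
# recurrent hidden state h is carried between frames by the caller.
def enhance_frame(x_raw, x_black, denoise_net, light_net, h):
    x_rgb = bin2_fuse(x_raw)              # S2: BIN2 pixel fusion (sketch below)
    ratio = estimate_ratio(x_raw)         # S4: linear brightening coefficient (sketch below)
    x_in1 = ratio * (x_rgb - x_black)     # S4: denoising-network input
    x_out1, h = denoise_net(x_in1, h)     # S5: recurrent denoising
    x_in2 = x_out1 / ratio                # S6: restore initial brightness
    return light_net(x_in2), h            # S7: adaptive brightness adjustment
```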
As shown in fig. 2, the method for enhancing full-color night vision based on deep learning provided in this embodiment includes the following steps:
S1: Use a low-light camera to collect a RAW-format image with fixed camera parameters such as exposure time, gain and aperture size, and record the initial image information as X_RAW;
S2: Perform BIN2 fusion on X_RAW, i.e., fuse each group of four adjacent pixels, then convert the image into an RGB-format image, denoted X_RGB;
S3: using the same acquisition parameters in the step S1, acquiring N Zhang Anchang images by using a micro-light camera, then converting the dark field images into RGB format by using the method in the step S2, taking the average value of the N dark field images in RGB format as the black level information of the camera, and recording as X BLACK
S4: selecting the image information X in step S1 RAW Image blocks of M x N resolution in five typical positions around and in the center of (a), calculating the mean value
Figure BDA0003862056340000031
And further calculating to obtain linear brightness enhancement coefficient +.>
Figure BDA0003862056340000032
And input X of denoising network IN1 =Ratio×(X RGB -X BLACK );
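A sketch of steps S3 and S4, assuming pixel values normalized to [0, 1]; the patent shows the mean and Ratio formulas only as images, so the 64×64 block size, the corner placement of the four peripheral blocks, and the target/mean form of Ratio are all assumptions:

```python
import numpy as np

def black_level(dark_frames):
    """S3: average N dark-field RGB frames to estimate the black level X_BLACK."""
    return np.mean(np.stack(dark_frames), axis=0)

def estimate_ratio(x_raw, block=(64, 64), target=0.5):
    """S4: mean over five blocks (four corners + centre), then Ratio = target / mean."""
    h, w = x_raw.shape[:2]
    m, n = block
    tops = [(0, 0), (0, w - n), (h - m, 0), (h - m, w - n),
            ((h - m) // 2, (w - n) // 2)]
    mean = np.mean([x_raw[r:r + m, c:c + n].mean() for r, c in tops])
    return target / max(mean, 1e-6)

# Denoising-network input, as in step S4:
# x_in1 = estimate_ratio(x_raw) * (x_rgb - black_level(dark_frames))
```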
S5: x is to be IN1 Inputting a denoising network DenoiseNet to obtain a denoised image X OUT1
The de-noiseNet has a specific structure shown in FIG. 3, and comprises an encoder, a feature mapping unit and a decoder, wherein the encoder encodes an image with a size of H×W×3 into an image with a size of H×W×3
Figure BDA0003862056340000033
Wherein H and W represent the height and width of the input image, respectively), the decoder uses the pixel rebinning PixelShuffle method to scale +.>
Figure BDA0003862056340000041
The feature map data of (2) is rearranged into an output image with the size of H multiplied by W multiplied by 3, a feature mapping unit consisting of a residual error network ResNet and a gating circulation unit GRU is inserted into an encoder and a decoder, and the mapping from noise features to noiseless features is completed; the encoder section comprises three 3×3 convolutions, step size 2, padding 1, activation function ReLU, output +.>
Figure BDA0003862056340000042
Feature map F of (1) encode The method comprises the steps of carrying out a first treatment on the surface of the The feature mapping unit comprises two ResNet blocks and a GRU unit, the GRU unit structure is shown in FIG. 5, F is firstly carried out encode Splitting into F 1 And F 2 The sizes are all +.>
Figure BDA0003862056340000043
Then, the four 3×3 convolutions are respectively performed, the step length is 1, the filling is 1, the activation function is ReLU, and F is obtained 3 F is passed through a 3X 3 convolution layer 3 The signature path is compressed from 192 to 64, denoted F 4 And input into GRU unit, GRU unit receives the hidden characteristic layer H obtained by processing the previous frame image by GRU unit t-1 And obtaining the output of the GRU: current frame implicit feature layer H t ,H t The number of recovered channels was 192 by 3×3 convolution, and the mapping from noise to noise-free features was accomplished by the same structure as the ResNet block described above, the result of which was denoted as F 5 Size of +.>
Figure BDA0003862056340000044
The decoder consists of a layer of PixelShuffle layer, the up-sampling multiple is set to 8, i.e. 192 channels are reduced by 64 times to 3 channels, both length and width are increased by 8 times, and the output is h×w×3. Wherein, at the initial frame, an array of all 0's is used as an implicit feature layer of the gating loop.
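A structural sketch of DenoiseNet in PyTorch following the description above; the convolutional GRU gating, the intermediate encoder widths (48 and 96 channels), and the merging of the F_1/F_2 branches into a single ResNet block are simplifications and assumptions, not the patent's exact layout:

```python
import torch
import torch.nn as nn

class ConvGRU(nn.Module):
    """Convolutional GRU cell: carries a hidden feature layer between frames."""
    def __init__(self, ch):
        super().__init__()
        self.gates = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)  # update + reset gates
        self.cand = nn.Conv2d(2 * ch, ch, 3, padding=1)       # candidate hidden state

    def forward(self, x, h_prev):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h_prev], 1))).chunk(2, 1)
        h_hat = torch.tanh(self.cand(torch.cat([x, r * h_prev], 1)))
        return (1 - z) * h_prev + z * h_hat

class ResBlock(nn.Module):
    """Plain residual block used by the feature mapping unit."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class DenoiseNet(nn.Module):
    """Encoder (3 stride-2 convs) -> ResNet+GRU mapping -> PixelShuffle decoder."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(              # H x W x 3 -> H/8 x W/8 x 192
            nn.Conv2d(3, 48, 3, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(48, 96, 3, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 192, 3, 2, 1), nn.ReLU(inplace=True))
        self.res_in = ResBlock(192)
        self.squeeze = nn.Conv2d(192, 64, 3, padding=1)   # 192 -> 64 channels for the GRU
        self.gru = ConvGRU(64)
        self.expand = nn.Conv2d(64, 192, 3, padding=1)    # restore 192 channels
        self.res_out = ResBlock(192)
        self.decoder = nn.PixelShuffle(8)          # 192 ch -> 3 ch, height and width x8

    def forward(self, x, h_prev=None):
        f = self.squeeze(self.res_in(self.encoder(x)))
        if h_prev is None:                         # all-zero hidden state at the first frame
            h_prev = torch.zeros_like(f)
        h = self.gru(f, h_prev)
        return self.decoder(self.res_out(self.expand(h))), h
```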
In particular, the denoising network DenoiseNet is trained with supervised learning, and during training the average loss over the image sequence is back-propagated as the error. First, the noise distribution characteristics of the collected and preprocessed RAW-format images are analyzed and modeled as a mixture of Gaussian noise, Poisson noise, dynamic stripe noise and color-degradation noise; a noisy dataset is then constructed from a noise-free image sequence, and the loss function is designed as follows:
L_DN = L_pixel + L_ssim + α₁·L_tv + α₂·L_lpips
where L_pixel = (1/N)·Σ_i |DN(x_i) − y_i|, in which N is the number of pixels, x_i the pixel value of the input image at point i, y_i the pixel value of the label image at point i, and DN(x_i) the pixel value of the denoised output at point i; this loss term is the per-pixel absolute error between the output image and the real image;
L_ssim = 1 − [(2·μ_x·μ_y + C₁)(2·σ_xy + C₂)] / [(μ_x² + μ_y² + C₁)(σ_x² + σ_y² + C₂)], in which μ_x and μ_y are the means of the input image and the output image, σ_xy is the covariance between the input image and the output image, σ_x² and σ_y² are their variances, and C₁ and C₂ are constants; this loss term characterizes the structural-similarity error between the output image and the real image;
L_tv = (1/N)·Σ_i (|∇_x DN(x_i)| + |∇_y DN(x_i)|), in which ∇_x DN and ∇_y DN are the gradients of the output image in the x and y directions; this loss term characterizes noise error;
L_lpips = ‖φ(DN(x)) − φ(y)‖², in which φ(·) denotes feature extraction by the VGG16 network; this loss term is the consistency error between the feature vectors of the output image and the real image after VGG16 feature extraction, i.e., the consistency of high-dimensional features between the two images;
α₁ and α₂ are adjustable parameters.
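A sketch of the composite loss in PyTorch; the global (unwindowed) SSIM statistics, the absolute-gradient TV term, the MSE-on-features stand-in for LPIPS, and the weights a1 = a2 = 0.1 are assumptions, with `feat_net` a placeholder for a frozen feature extractor such as torchvision's VGG16 features:

```python
import torch
import torch.nn.functional as F

def ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM from global image statistics (a windowed form may be intended)."""
    mu_x, mu_y = x.mean(), y.mean()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (x.var() + y.var() + c2))
    return 1 - ssim

def tv_loss(x):
    """Mean absolute gradient of the output in the x and y directions."""
    return (x[..., :, 1:] - x[..., :, :-1]).abs().mean() + \
           (x[..., 1:, :] - x[..., :-1, :]).abs().mean()

def denoise_loss(out, target, feat_net, a1=0.1, a2=0.1):
    """L_DN = L_pixel + L_ssim + a1 * L_tv + a2 * L_lpips (weights assumed)."""
    l_pixel = (out - target).abs().mean()                  # per-pixel absolute error
    l_feat = F.mse_loss(feat_net(out), feat_net(target))   # LPIPS-style feature term
    return l_pixel + ssim_loss(out, target) + a1 * tv_loss(out) + a2 * l_feat
```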
S6: restoring the initial brightness distribution of the denoised image by the Ratio obtained in the step S4 as the input X of the self-adaptive brightness adjustment network IN2 =X OUT1 /Ratio;
S7: x is to be IN2 Inputting a self-adaptive brightness adjustment network (LightNet) to obtain a final output image sequence;
the self-adaptive brightness adjustment network comprises a coder-decoder and a gating circulation unit, X IN2 After the input of the self-adaptive brightness adjustment network, the increment output delta is obtained i And hidden layer output H of the gated loop unit i Transferring the output back to the input, adding the output to the input, and performing second enhancement, namely X IN2i Inputting into a self-adaptive brightness adjustment network, and H i And inputting the input data into a gating circulation unit to realize circulation. In this embodiment, the specific structure of the adaptive brightness adjustment network LightNet is shown in fig. 4, and includes an encoder composed of 3 convolution layers, 1 gate control circulation unit GRU, and a decoder composed of 3 deconvolution layers, where the output of each GRU and the output of the network are sent back to the input of the GRU and the input of the network, so as to implement circulation; the encoder comprises three 3 x 3 convolutional layers, step size 2, padding 1, and mapping a feature map of input size H x W x 3 to size
Figure BDA0003862056340000055
As input to the GRU; the GRU unit realizes characteristic transmission of each cycle, and the structure of the GRU unit is shown in fig. 5; the decoder comprises three 3 x 3 deconvolution layers, step size 2, padding 1, input +.>
Figure BDA0003862056340000056
Is characterized by (a) feature mapThe output is H×W×3, and after 8 cycles, the final output result is obtained.
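A sketch of the recurrent enhancement loop in PyTorch, reusing the ConvGRU cell from the DenoiseNet sketch above; the channel width of 64 and the exact deconvolution parameters are assumptions:

```python
import torch
import torch.nn as nn

class LightNet(nn.Module):
    """Recurrent enhancement: the same encoder/GRU/decoder weights are reused
    for all 8 cycles, and each cycle adds an increment back onto the input."""
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Sequential(          # H x W x 3 -> H/8 x W/8 x ch
            nn.Conv2d(3, ch, 3, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, 2, 1), nn.ReLU(inplace=True))
        self.gru = ConvGRU(ch)                 # ConvGRU cell from the DenoiseNet sketch
        self.decoder = nn.Sequential(          # H/8 x W/8 x ch -> H x W x 3
            nn.ConvTranspose2d(ch, ch, 3, 2, 1, output_padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ch, ch, 3, 2, 1, output_padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ch, 3, 3, 2, 1, output_padding=1))

    def forward(self, x, cycles=8):
        h = None
        for _ in range(cycles):                # weight reuse across cycles saves memory
            f = self.encoder(x)
            h = self.gru(f, torch.zeros_like(f) if h is None else h)
            x = x + self.decoder(h)            # add the increment back onto the input
        return x
```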
In particular, the adaptive brightness adjustment network LightNet is trained by self-supervised learning: no paired low-illumination/normal-illumination image datasets are needed, and training uses a large-scale low-illumination image dataset together with a real low-illumination image dataset. The loss function is designed as follows:
L_LN = L_light + β₁·L_contrast + β₂·L_color
Here the image is first divided into M pixel blocks of 16×16, and the loss terms below are computed over these blocks.
L_light = (1/M)·Σ_k (Y[LN(x)_k] − 0.5)², in which Y[·] computes the average gray value of a pixel block processed by the adaptive brightness adjustment network LN; this loss term constrains the overall brightness of the output image toward 0.5;
L_contrast = (1/M)·Σ_k |G(LN(x)_k) − G(x_k)|, in which G(·) computes the sum of the means of the absolute gradients of a pixel block in the x and y directions; this loss term constrains the output image to have contrast similar to the input image;
L_color = (1/M)·Σ_k Σ_{c∈{R,G,B}} (μ_c(LN(x)_k) − μ_c(x_k))², in which μ_c(·) denotes the mean of RGB channel c over the pixel block; this loss term constrains the output image to be consistent in color with the input image;
β₁ and β₂ are adjustable parameters.
S8: the encoded output is either saved as video or output to a display.
In summary, the method and system provided in this embodiment cascade a denoising network DenoiseNet and an adaptive brightness adjustment network LightNet, establish a noise model of low-light images, collect and preprocess data in RAW format, use a gated recurrent unit (GRU) to remove noise along the time sequence, increase the signal-to-noise ratio of the full-color night vision image, and optimize the brightness distribution of the output image, so that night vision images can be presented clearly under extremely low illumination.
The above description is only a specific embodiment of the present invention and is not intended to limit it in any way. The low-light image capture device used does not limit the invention, nor do the image resolution or the image content. The scope of the present invention is not limited to the embodiment; any variation or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein is intended to fall within the scope of the present invention.

Claims (6)

1. A full-color night vision enhancement method based on deep learning, characterized by comprising the following steps:
S1: Collect a low-light image in RAW format and record the image information as X_RAW;
S2: Perform pixel fusion on the image information X_RAW and convert it into an RGB-format image, denoted X_RGB;
S3: Using the same acquisition parameters as in step S1 and the processing method of step S2, obtain N dark-field images in RGB format, and take the average of the N dark-field images as the black-level information, denoted X_BLACK;
S4: Select image blocks of M×N resolution at five typical positions of the image information X_RAW (four around the periphery and one at the center), calculate their mean value X̄, from it compute the linear brightness-enhancement coefficient Ratio, and form the denoising-network input X_IN1 = Ratio × (X_RGB − X_BLACK);
S5: Input X_IN1 into the denoising network to obtain the denoised image X_OUT1; the denoising network comprises an encoder, a feature mapping unit and a decoder; the feature mapping unit comprises a residual network and a gated recurrent unit and is used to complete the mapping from noisy features to noise-free features; the loss function of the denoising network is:
L_DN = L_pixel + L_ssim + α₁·L_tv + α₂·L_lpips
where L_pixel = (1/N)·Σ_i |DN(x_i) − y_i|, in which N is the number of pixels, x_i the pixel value of the input image at point i, y_i the pixel value of the label image at point i, and DN(x_i) the pixel value of the denoised output at point i; this loss term is the per-pixel absolute error between the output image and the real image;
L_ssim = 1 − [(2·μ_x·μ_y + C₁)(2·σ_xy + C₂)] / [(μ_x² + μ_y² + C₁)(σ_x² + σ_y² + C₂)], in which μ_x and μ_y are the means of the input image and the output image, σ_xy is the covariance between the input image and the output image, σ_x² and σ_y² are their variances, and C₁ and C₂ are constants; this loss term characterizes the structural-similarity error between the output image and the real image;
L_tv = (1/N)·Σ_i (|∇_x DN(x_i)| + |∇_y DN(x_i)|), in which ∇_x DN and ∇_y DN are the gradients of the output image in the x and y directions; this loss term characterizes noise error;
L_lpips = ‖φ(DN(x)) − φ(y)‖², in which φ(·) denotes feature extraction by a convolutional neural network; this loss term is the consistency error between the feature vectors of the output image and the real image after feature extraction, i.e., the consistency of high-dimensional features between the two images;
α₁ and α₂ are adjustable parameters;
S6: Using the coefficient Ratio obtained in step S4, restore the brightness of the denoised image X_OUT1 to form the input of the adaptive brightness adjustment network: X_IN2 = X_OUT1 / Ratio;
S7: Input X_IN2 into the adaptive brightness adjustment network to obtain the final output image sequence; the adaptive brightness adjustment network comprises an encoder-decoder and a gated recurrent unit and is trained by a self-supervised learning method with the loss function:
L_LN = L_light + β₁·L_contrast + β₂·L_color
where the image is first divided into M pixel blocks of 16×16 and the loss terms are computed over these blocks;
L_light = (1/M)·Σ_k (Y[LN(x)_k] − 0.5)², in which Y[·] computes the average gray value of a pixel block processed by the adaptive brightness adjustment network LN; this loss term constrains the overall brightness of the output image toward 0.5;
L_contrast = (1/M)·Σ_k |G(LN(x)_k) − G(x_k)|, in which G(·) computes the sum of the means of the absolute gradients of a pixel block in the x and y directions; this loss term constrains the output image to have contrast similar to the input image;
L_color = (1/M)·Σ_k Σ_{c∈{R,G,B}} (μ_c(LN(x)_k) − μ_c(x_k))², in which μ_c(·) denotes the mean of RGB channel c over the pixel block; this loss term constrains the output image to be consistent in color with the input image;
β₁ and β₂ are adjustable parameters.
2. The deep-learning-based full-color night vision enhancement method of claim 1, wherein the encoder is configured to encode an RGB image of size H×W×3 into a feature map of size H/8×W/8×192, and the decoder is configured to rearrange the feature-map data of size H/8×W/8×192 into an output image of size H×W×3.
3. The deep-learning-based full-color night vision enhancement method of claim 2, wherein the feature mapping unit consists of two residual networks and a gated recurrent unit, and its specific implementation comprises: first, the feature map of size H/8×W/8×192 is split into two sub-maps of size H/8×W/8×96; the effective features in the two sub-maps are extracted by the two residual networks respectively and concatenated to obtain the input G_IN of the gated recurrent unit; the gated recurrent unit simultaneously receives the hidden feature layer H_{t−1} obtained when the previous frame was processed and outputs the hidden feature layer H_t of the current frame, wherein at the initial frame an all-zero array is used as the hidden feature layer of the gated recurrent unit; the residual network is then used to map the hidden feature layer H_t to noise-free features.
4. The deep-learning-based full-color night vision enhancement method of claim 1, wherein the denoising network is trained by a supervised learning method; the training data is a simulated dataset with artificially added noise, the noise being modeled as a mixture of Gaussian noise, Poisson noise, dynamic stripe noise and color-degradation noise; during training, the average loss over the image sequence is back-propagated as the error.
5. The deep-learning-based full-color night vision enhancement method of claim 1, wherein in step S7, X_IN2 is input into the adaptive brightness adjustment network to obtain the incremental output Δ_i and the hidden-layer output H_i of the gated recurrent unit; the output is fed back and added to the input for the next enhancement pass, i.e., X_IN2 + Δ_i is input into the adaptive brightness adjustment network and H_i is input into the gated recurrent unit, realizing the recurrence.
6. A deep-learning-based full-color night vision enhancement system, the system comprising:
the low-light image acquisition module, used to collect low-light images in RAW format;
the preprocessing module, used to preprocess the RAW-format image collected by the low-light image acquisition module;
the denoising module, used to remove noise from the RGB image obtained by the preprocessing module through a denoising network; the denoising network comprises an encoder, a feature mapping unit and a decoder; the feature mapping unit comprises a residual network and a gated recurrent unit and is used to complete the mapping from noisy features to noise-free features; the loss function of the denoising network is:
L_DN = L_pixel + L_ssim + α₁·L_tv + α₂·L_lpips
where L_pixel = (1/N)·Σ_i |DN(x_i) − y_i|, in which N is the number of pixels, x_i the pixel value of the input image at point i, y_i the pixel value of the label image at point i, and DN(x_i) the pixel value of the denoised output at point i; this loss term is the per-pixel absolute error between the output image and the real image;
L_ssim = 1 − [(2·μ_x·μ_y + C₁)(2·σ_xy + C₂)] / [(μ_x² + μ_y² + C₁)(σ_x² + σ_y² + C₂)], in which μ_x and μ_y are the means of the input image and the output image, σ_xy is the covariance between the input image and the output image, σ_x² and σ_y² are their variances, and C₁ and C₂ are constants; this loss term characterizes the structural-similarity error between the output image and the real image;
L_tv = (1/N)·Σ_i (|∇_x DN(x_i)| + |∇_y DN(x_i)|), in which ∇_x DN and ∇_y DN are the gradients of the output image in the x and y directions; this loss term characterizes noise error;
L_lpips = ‖φ(DN(x)) − φ(y)‖², in which φ(·) denotes feature extraction by a convolutional neural network; this loss term is the consistency error between the feature vectors of the output image and the real image after feature extraction, i.e., the consistency of high-dimensional features between the two images;
α₁ and α₂ are adjustable parameters;
the adaptive brightness adjustment module, used to process the RGB image denoised by the denoising module through the adaptive brightness adjustment network and adjust the brightness distribution of the image; the adaptive brightness adjustment network comprises an encoder-decoder and a gated recurrent unit and is trained by a self-supervised learning method with the loss function:
L_LN = L_light + β₁·L_contrast + β₂·L_color
where the image is first divided into M pixel blocks of 16×16 and the loss terms are computed over these blocks;
L_light = (1/M)·Σ_k (Y[LN(x)_k] − 0.5)², in which Y[·] computes the average gray value of a pixel block processed by the adaptive brightness adjustment network LN; this loss term constrains the overall brightness of the output image toward 0.5;
L_contrast = (1/M)·Σ_k |G(LN(x)_k) − G(x_k)|, in which G(·) computes the sum of the means of the absolute gradients of a pixel block in the x and y directions; this loss term constrains the output image to have contrast similar to the input image;
L_color = (1/M)·Σ_k Σ_{c∈{R,G,B}} (μ_c(LN(x)_k) − μ_c(x_k))², in which μ_c(·) denotes the mean of RGB channel c over the pixel block; this loss term constrains the output image to be consistent in color with the input image;
β₁ and β₂ are adjustable parameters;
and the encoding output module, used to encode the RGB image enhanced by the adaptive brightness adjustment module into a video signal.
CN202211166825.1A 2022-09-23 2022-09-23 Deep learning-based full-color night vision enhancement method and system Active CN115456903B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211166825.1A CN115456903B (en) 2022-09-23 2022-09-23 Deep learning-based full-color night vision enhancement method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211166825.1A CN115456903B (en) 2022-09-23 2022-09-23 Deep learning-based full-color night vision enhancement method and system

Publications (2)

Publication Number Publication Date
CN115456903A (en) 2022-12-09
CN115456903B (en) 2023-05-09

Family ID: 84307312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211166825.1A Active CN115456903B (en) 2022-09-23 2022-09-23 Deep learning-based full-color night vision enhancement method and system

Country Status (1)

Country Link
CN (1) CN115456903B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110223253A (en) * 2019-06-10 2019-09-10 江苏科技大学 A kind of defogging method based on image enhancement
CN110880163A (en) * 2018-09-05 2020-03-13 南京大学 Low-light color imaging method based on deep learning
CN112614061A (en) * 2020-12-08 2021-04-06 北京邮电大学 Low-illumination image brightness enhancement and super-resolution method based on double-channel coder-decoder
CN113643202A (en) * 2021-07-29 2021-11-12 西安理工大学 Low-light-level image enhancement method based on noise attention map guidance
CN113822830A (en) * 2021-08-30 2021-12-21 天津大学 Multi-exposure image fusion method based on depth perception enhancement

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10929955B2 (en) * 2017-06-05 2021-02-23 Adasky, Ltd. Scene-based nonuniformity correction using a convolutional recurrent neural network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110880163A (en) * 2018-09-05 2020-03-13 南京大学 Low-light color imaging method based on deep learning
CN110223253A (en) * 2019-06-10 2019-09-10 江苏科技大学 A kind of defogging method based on image enhancement
CN112614061A (en) * 2020-12-08 2021-04-06 北京邮电大学 Low-illumination image brightness enhancement and super-resolution method based on double-channel coder-decoder
CN113643202A (en) * 2021-07-29 2021-11-12 西安理工大学 Low-light-level image enhancement method based on noise attention map guidance
CN113822830A (en) * 2021-08-30 2021-12-21 天津大学 Multi-exposure image fusion method based on depth perception enhancement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Deep Residual Convolutional Network for Natural Image Denoising and Brightness Enhancement; Wenjie Xu et al.; 2018 International Conference on Platform Technology and Service; 1-6 *
Low-Light Image Enhancement via Progressive-Recursive Network; Jinjiang Li et al.; IEEE Transactions on Circuits and Systems for Video Technology; Vol. 31, No. 11; 4227-4240 *

Also Published As

Publication number Publication date
CN115456903A (en) 2022-12-09

Similar Documents

Publication Publication Date Title
CN107123089B (en) Remote sensing image super-resolution reconstruction method and system based on depth convolution network
CN111739082A (en) Stereo vision unsupervised depth estimation method based on convolutional neural network
Wang et al. Joint iterative color correction and dehazing for underwater image enhancement
CN115393227B (en) Low-light full-color video image self-adaptive enhancement method and system based on deep learning
CN112116601A (en) Compressive sensing sampling reconstruction method and system based on linear sampling network and generation countermeasure residual error network
CN113658057A (en) Swin transform low-light-level image enhancement method
CN114170286B (en) Monocular depth estimation method based on unsupervised deep learning
CN111553856B (en) Image defogging method based on depth estimation assistance
CN115209119B (en) Video automatic coloring method based on deep neural network
CN113034413A (en) Low-illumination image enhancement method based on multi-scale fusion residual error codec
CN115953321A (en) Low-illumination image enhancement method based on zero-time learning
CN113379606B (en) Face super-resolution method based on pre-training generation model
CN115035011A (en) Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy
CN117422653A (en) Low-light image enhancement method based on weight sharing and iterative data optimization
CN117611467A (en) Low-light image enhancement method capable of balancing details and brightness of different areas simultaneously
CN115456903B (en) Deep learning-based full-color night vision enhancement method and system
CN116703750A (en) Image defogging method and system based on edge attention and multi-order differential loss
CN114638764B (en) Multi-exposure image fusion method and system based on artificial intelligence
CN116208812A (en) Video frame inserting method and system based on stereo event and intensity camera
CN115861113A (en) Semi-supervised defogging method based on fusion of depth map and feature mask
CN115841523A (en) Double-branch HDR video reconstruction algorithm based on Raw domain
CN114549343A (en) Defogging method based on dual-branch residual error feature fusion
CN114549386A (en) Multi-exposure image fusion method based on self-adaptive illumination consistency
CN113240589A (en) Image defogging method and system based on multi-scale feature fusion
Xie et al. Just noticeable visual redundancy forecasting: a deep multimodal-driven approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant