CN110246105B - Video denoising method based on actual camera noise modeling - Google Patents

Video denoising method based on actual camera noise modeling

Info

Publication number
CN110246105B
CN110246105B CN201910518690.2A CN201910518690A CN110246105B CN 110246105 B CN110246105 B CN 110246105B CN 201910518690 A CN201910518690 A CN 201910518690A CN 110246105 B CN110246105 B CN 110246105B
Authority
CN
China
Prior art keywords
noise
video
actual
denoising
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910518690.2A
Other languages
Chinese (zh)
Other versions
CN110246105A (en)
Inventor
王伟
陈鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201910518690.2A
Publication of CN110246105A
Application granted
Publication of CN110246105B
Status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/14Picture signal circuitry for video frequency region
    • H04N5/21Circuitry for suppressing or minimising disturbance, e.g. moiré or halo
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses a video denoising method based on actual camera noise modeling. The specific steps are as follows: (1) explore the physical causes of the main noise in the imaging process and establish a mathematical model of the noise distribution; (2) based on the established noise model, calibrate the model parameters and generate a realistic noisy-video training set; (3) design a video denoising and enhancement neural network that combines spatial and temporal information to suppress the noise; (4) train and optimize the neural network, and verify the practicality of the method on synthetic and real-captured video datasets. The denoising method of the invention is suited to low-light video denoising and has important applications in national defense and military affairs, security surveillance, scientific research, and environmental protection.

Description

A video denoising method based on actual camera noise modeling

Technical Field

The present invention relates to the fields of computational photography and deep learning, and in particular to low-light video denoising based on actual camera noise modeling.

Background Art

Under extremely low illumination, heavy noise significantly degrades image quality, which makes low-light video imaging a challenging problem. Many video denoising and enhancement algorithms have been proposed to address it, but most of them rest on simple independent and identically distributed noise assumptions, such as additive white Gaussian noise, Poisson noise, or a Gaussian-Poisson mixture. In practice, video noise is far more complex; under low illumination in particular, factors that are usually ignored, such as dynamic stripe noise, inter-channel noise correlation, and the truncation effect, become dominant problems.

Deep-learning-based methods have achieved remarkable progress on many image processing tasks. By building neural network models with many hidden layers and training them on massive data, deep learning learns more useful features of the noise and ultimately improves denoising quality. Through layer-by-layer feature transformations, the representation of a sample in the original space is mapped into a new feature space in which image reconstruction is easier.

Therefore, how to accurately model the noise distribution of video mathematically and use neural networks to restore high-quality low-light video is a current research direction.

Summary of the Invention

In view of the shortcomings of the existing video denoising methods described above, the purpose of the present invention is to propose a video denoising method based on actual camera noise modeling.

To achieve the above object, the technical solution adopted by the present invention is as follows:

A low-light video denoising method based on actual camera noise modeling, comprising:

Step 1, establish a mathematical model of the actual noise in a low-light environment; the model includes dynamic stripe noise, inter-channel noise correlation, and the truncation effect.

Step 2, calibrate the parameters of the mathematical model of step 1 using the noise of an actual camera, and generate a noisy-video training set that matches real conditions.

Step 3, construct a video denoising and enhancement neural network that combines spatial and temporal information to suppress the noise.

Step 4, train and optimize the neural network using the noisy-video training set generated in step 2 together with a real-captured low-light video dataset.

The present invention considers the physical causes of the main noise sources in actual camera imaging. It first establishes a mathematical model of the noise distribution and uses it to generate data closer to real conditions for training a video denoising and enhancement network based on long short-term memory (LSTM). The designed network accepts low-light video with any number of frames and outputs high-quality video frames frame by frame.

The advantages of the present invention are as follows: (1) the proposed actual noise model is grounded in the physical imaging process and the hardware characteristics of the sensor, and accounts for three kinds of noise that cannot be neglected under low light: dynamic stripe noise, inter-channel noise correlation, and the truncation effect; the model can therefore handle the complex noise of real scenarios well, especially video captured under low-light imaging;

(2) an estimation method for the noise model is proposed to synthesize a realistic noisy-video training set without requiring large amounts of real-captured training data, which is highly attractive for video denoising, where paired noisy and clean video frames are difficult to acquire simultaneously;

(3) a video denoising and enhancement neural network based on long short-term memory (LSTM) is designed; the network structure can memorize and exploit information from up to the 20 preceding frames to restore the current frame;

(4) the low-light video denoising method of the present invention has important application prospects in national defense and military affairs, security surveillance, scientific research, and environmental protection.

Brief Description of the Drawings

FIG. 1 is a flow chart of the denoising method in an embodiment of the present invention.

FIG. 2 is a schematic diagram of the noise sources in the imaging process in an embodiment of the present invention.

FIG. 3 shows the 2D lookup tables in an embodiment of the present invention: (a) mean correction for the truncation effect; (b) variance correction for the truncation effect.

FIG. 4 is a schematic diagram of the network architecture designed in an embodiment of the present invention.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.

Referring to FIG. 1, a video denoising method based on actual camera noise modeling according to this embodiment comprises the following specific steps:

Step 1, explore the physical causes of the main noise in the imaging process and establish a mathematical model of the noise distribution. During low-light imaging, high-sensitivity camera settings turn some otherwise negligible noise sources into significant noise components. FIG. 2 shows the common noise sources throughout the imaging process; based on this physical process, a basic noise model containing three main noise sources is established. The model assumes the camera follows a globally uniform linear response curve with gain K. The measured value y_i is obtained by:

y_i = K(S_i + D_i + R_i)    (1)

where i is the pixel index and y_i is the acquired pixel value; S_i denotes shot noise, with S_i ~ P(N_e^i), where P(·) is the Poisson distribution and N_e^i is the number of photoelectrons at pixel i; D_i denotes dark current, with D_i ~ P(N_d), where N_d is the number of dark-current electrons per pixel; R_i denotes readout noise, with R_i ~ N(0, σ_r^2), where N(·) denotes the Gaussian distribution and σ_r^2 is its variance.
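As an illustration of formula (1), the following minimal sketch draws Monte Carlo samples of y_i from the three noise sources; the gain and noise parameters here are arbitrary placeholder values, not calibrated constants.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_basic_noise(n_e, K=2.0, N_d=5.0, sigma_r=1.5):
    """Monte Carlo sample of formula (1): y_i = K * (S_i + D_i + R_i).

    n_e: array of expected photoelectron counts N_e^i per pixel.
    K, N_d, sigma_r are illustrative values, not calibrated constants.
    """
    S = rng.poisson(n_e)                     # shot noise    S_i ~ P(N_e^i)
    D = rng.poisson(N_d, size=n_e.shape)     # dark current  D_i ~ P(N_d)
    R = rng.normal(0.0, sigma_r, n_e.shape)  # read noise    R_i ~ N(0, sigma_r^2)
    return K * (S + D + R)

y = sample_basic_noise(np.full((4, 6), 20.0))
```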

Beyond these widely known noise categories, the present invention identifies some particular noise sources in low-light imaging.

Dynamic stripe noise: unlike traditional fixed-pattern noise, dynamic stripe noise varies continuously and cannot be removed by direct subtraction. In video it usually appears as row stripes. The phenomenon exists not only in rolling-shutter cameras but also in global-shutter cameras, with slightly different characteristics. An examination of the imaging process of the sensor chip suggests two physical causes: circuit fluctuations and asynchronous triggering. Both slightly perturb the per-row gain, so K in formula (1) is replaced by the row gain K_r:

K_r = K(1 + λβ^(1/f) + (1 − λ)β^w)    (2)

where K is the globally uniform system gain, β^(1/f) is the perturbation caused by circuit fluctuations, which follows a colored (1/f) Gaussian distribution, β^w is the perturbation caused by asynchronous triggering, which follows a white Gaussian distribution, and λ weighs the circuit fluctuations against the asynchronous triggering. Because β^(1/f) and β^w both follow zero-mean Gaussian distributions, the expectation of K_r is exactly K. To simplify parameter calibration, formula (2) is reduced to:

K_r = Kβ_r    (3)

where β_r is the correction parameter of the dynamic stripe noise; it corresponds to the circuit-fluctuation term in rolling-shutter cameras and to the asynchronous-trigger term in global-shutter cameras, so β_r follows a colored or white Gaussian distribution with mean 1, respectively.

Inter-channel noise correlation: in the present invention, the noise relationship between channels is modeled by examining the physical characteristics of the color sensor, considering both pixel uniformity and channel differences. Rolling-shutter and global-shutter cameras typically acquire three-channel images by overlaying a color filter array on the silicon sensor. In practice, the noise of the three channels is not identical, for two main reasons: the three channels have different system gains, and the three channels exhibit distinct DPN fluctuations. Accordingly, K_r in formula (3) is modified to:

K_r^c = K_c β_r^c, c ∈ {r, g, b}    (4)

where β_r^c is the channel-dependent correction parameter of the dynamic stripe noise and K_c is the channel-dependent global system gain.

Truncation effect: leaving readout noise aside, digital sensor outputs are normally non-negative. Readout noise, however, follows a zero-mean Gaussian distribution, so negative values do occur; in low-light environments especially, many signals are even weaker than the readout noise, which produces many negative values. These negative values are truncated to zero before the result is output. Mathematically, the truncation operation ⌊·⌋ can be expressed as:

⌊x⌋ = x if x ≥ 0, and 0 otherwise.

Summing up the above analysis, the final form of the actual noise model for low-light environments proposed by the present invention is:

y_i = ⌊K_c β_r^c (S_i + D_i + R_i)⌋, c ∈ {r, g, b}    (5)

For global-shutter cameras, β_r^c is identical across the three channels.
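To make the final model concrete, here is a hedged sketch that synthesizes one noisy frame per formula (5); the channel gains, the row-noise strength, and the use of white (rather than colored 1/f) row fluctuations are illustrative assumptions corresponding to the global-shutter case.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_full_model(n_e, K_c=(2.1, 2.0, 1.9), N_d=5.0,
                      sigma_r=1.5, sigma_beta=0.02):
    """Sketch of final model (5): y = truncate(K_c * beta_r^c * (S + D + R)).

    n_e: (H, W, 3) expected photoelectrons. beta_r is drawn per row as
    white Gaussian around 1 (the global-shutter case); a rolling-shutter
    camera would use colored (1/f) row noise instead.
    """
    H, W, C = n_e.shape
    y = np.empty_like(n_e, dtype=float)
    for c in range(C):
        beta_r = rng.normal(1.0, sigma_beta, size=(H, 1))  # row-gain fluctuation
        S = rng.poisson(n_e[..., c])
        D = rng.poisson(N_d, size=(H, W))
        R = rng.normal(0.0, sigma_r, size=(H, W))
        y[..., c] = np.maximum(K_c[c] * beta_r * (S + D + R), 0.0)  # truncation
    return y
```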

Step 2, based on the established noise model, calibrate the model parameters and generate a realistic noisy-video training set. The noise model of the present invention allows parameter calibration against the noise of an actual camera. To simplify the derivation, the truncation operation is ignored at first; the bias it induces in the expectation and variance is then corrected using the 2D lookup tables proposed by the present invention, as shown in FIG. 3.

Step 21, calibrate β_r^c. Capturing video frames in a dark field gives y_i = K_c β_r^c (D_i + R_i). Dividing the mean pixel value of each row by the global mean pixel value yields an estimate of β_r^c. Once β_r^c is obtained, the dynamic stripe noise can be removed by dividing each row's pixel values by it, which gives the stripe-corrected expression y_i = K_c(D_i + R_i)|c∈{r,g,b}. Depending on the camera type, rolling shutter or global shutter, the distribution of the dynamic stripe noise is determined to be colored noise or white Gaussian noise.
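A minimal sketch of this row-gain estimation, assuming a stack of dark-field frames for a single color channel is already available:

```python
import numpy as np

def calibrate_beta_r(dark_frames):
    """Estimate the row-gain fluctuation beta_r^c from dark-field frames.

    dark_frames: (T, H, W) stack for one color channel. Per frame, the
    row mean divided by the global mean estimates beta_r for that row;
    dividing each row by its estimate removes the dynamic stripe noise.
    """
    row_mean = dark_frames.mean(axis=2, keepdims=True)          # (T, H, 1)
    global_mean = dark_frames.mean(axis=(1, 2), keepdims=True)  # (T, 1, 1)
    beta_r = row_mean / global_mean                             # (T, H, 1)
    corrected = dark_frames / beta_r   # leaves y_i = K_c (D_i + R_i)
    return beta_r.squeeze(-1), corrected
```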

Step 22, calibrate K_c. With dark current D_i ~ P(N_d) and readout noise R_i ~ N(0, σ_r^2), the expectation and variance of the stripe-corrected measurement y' can be expressed as:

E[y'] = K_c N_d
D[y'] = K_c^2 (N_d + σ_r^2)    (6)

Substituting N_d = E[y']/K_c yields the following formula:

D[y'] = K_c E[y'] + K_c^2 σ_r^2    (7)

Since the dark current scales with exposure time, using dark-field video frames at different exposures eliminates the constant term K_c^2 σ_r^2, giving the final formula:

ΔD[y'] = K_c ΔE[y']    (8)

where ΔE[y'] and ΔD[y'] are the differences of E[y'] and D[y'] between exposure times t_1 and t_2. In practice, E[y'] can be taken as mean(y') and D[y'] as var(y'). Shooting a series of dark-field videos at different exposure times yields a set of points (ΔE[y'], ΔD[y']), and K_c is obtained by linearly fitting these points.
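The slope fit of formula (8) can be sketched as follows, assuming each entry of dark_videos is a stripe-corrected dark-field clip captured at a distinct exposure time:

```python
import numpy as np

def calibrate_K_c(dark_videos):
    """Fit K_c from stripe-corrected dark videos at different exposures.

    dark_videos: list of (T, H, W) arrays, one per exposure time.
    Formula (8): Delta D[y'] = K_c * Delta E[y'], so K_c is the slope of
    a line through the (Delta mean, Delta variance) point cloud.
    """
    stats = [(v.mean(), v.var()) for v in dark_videos]
    dE = np.array([m1 - m2 for (m1, _), (m2, _) in zip(stats, stats[1:])])
    dD = np.array([v1 - v2 for (_, v1), (_, v2) in zip(stats, stats[1:])])
    return np.polyfit(dE, dD, 1)[0]    # least-squares slope = K_c
```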

步骤23,校准Nd

Figure BDA00020959059200000414
校准得到Kc后,Nd
Figure BDA00020959059200000415
可由公式(6)计算得到。Step 23, calibrate Nd and
Figure BDA00020959059200000414
After calibration, K c is obtained, N d and
Figure BDA00020959059200000415
It can be calculated by formula (6).

Step 24, correct the truncation error by table lookup. The mean mean(⌊x⌋) and variance var(⌊x⌋) after the truncation operation differ substantially from the untruncated mean(x) and var(x). In actual computation, the effect of truncation is hard to evaluate without prior knowledge of the pixel x. In the present invention, all random variables subject to truncation are split into two parts: a Poisson-distributed part and a zero-mean Gaussian part. Using MATLAB, a large number of pixels x with a range of expectations and variances are generated, and 2D tables are built that map the post-truncation mean mean(⌊x⌋) and variance var(⌊x⌋) to the true mean(x) and var(x). The true values are then recovered by table lookup.
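The 2D tables can be generated with a short Monte Carlo routine (shown here in Python rather than the MATLAB used in the embodiment); the grid ranges and sample count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def build_truncation_tables(means, sigmas, n=200_000):
    """Build the 2D lookup tables of FIG. 3 by Monte Carlo simulation.

    For each (Poisson mean, Gaussian sigma) grid point, draw
    x = P(mean) + N(0, sigma), truncate at zero, and record the mean and
    variance after truncation; inverting the table at calibration time
    maps measured (truncated) statistics back to the true ones.
    """
    mean_tab = np.empty((len(means), len(sigmas)))
    var_tab = np.empty_like(mean_tab)
    for i, m in enumerate(means):
        for j, s in enumerate(sigmas):
            x = rng.poisson(m, n) + rng.normal(0.0, s, n)
            t = np.maximum(x, 0.0)                   # truncation
            mean_tab[i, j], var_tab[i, j] = t.mean(), t.var()
    return mean_tab, var_tab
```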

Step 25, training data synthesis. According to formula (5) and the camera parameter calibration steps above, a noisy training dataset can be synthesized from clean video sequences. The present invention first derives the expected number of photoelectrons N_e:

E[N_e] = L · S_pixel · C_lum2radiant · Q_e / E_p    (9)

where L is the luminous exposure at the pixel, S_pixel is the area of a single pixel, C_lum2radiant is the conversion constant from luminous flux to radiant intensity, E_p is the energy of a single photon, and Q_e is the equivalent system quantum efficiency of the camera. By adjusting the mean of the image to the expected photoelectron count E[N_e], the expected photon count can be derived from the image pixel values; the noisy training dataset is then synthesized via Monte Carlo simulation according to formula (5).
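Putting the calibration and the model together, a sketch of the pair-synthesis step might look like the following; target_Ne and the final renormalization are assumptions about how the mean adjustment and display scaling are carried out, and sample_full_model refers to the sketch after formula (5) above:

```python
def synthesize_training_pair(clean, target_Ne, **noise_params):
    """Turn a clean frame into a (noisy, clean) training pair.

    clean: (H, W, 3) float image in [0, 1]. Its mean is rescaled to the
    desired mean photoelectron count E[N_e] from formula (9), then the
    calibrated model (5) is applied via sample_full_model (defined above).
    """
    n_e = clean * (target_Ne / clean.mean())   # per-pixel expected photoelectrons
    noisy = sample_full_model(n_e, **noise_params)
    noisy = noisy / noisy.max()                # back to a displayable range
    return noisy, clean
```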

Step 3, design a video denoising and enhancement neural network that combines spatial and temporal information to suppress the noise. The present invention proposes an LSTM-based video denoising and enhancement network whose input is noisy video shot by an actual camera under low light and whose output is bright, clear video frames. FIG. 4 shows the network architecture designed in this embodiment. To adaptively extract both short-term and long-term dependencies from the video, the network adopts the spatiotemporal memory unit ST-LSTM, which models spatial and temporal representations in a unified memory cell and passes the memory vertically across layers and horizontally across states. The network architecture of the present invention contains two convolutional layers and four ST-LSTM layers: first, a convolutional layer extracts features of the input frame and passes them to the ST-LSTM layers; a skip connection is added in the spatial association; and the last layer merges the reconstruction information learned by the preceding layers back into sRGB space. The first and fourth ST-LSTM layers use 3x3 convolution kernels, the second and third use 5x5 kernels, and each layer has 64 feature channels. Zero padding keeps the input and output sizes consistent.
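The following PyTorch sketch mirrors the described layout (two convolutional layers, four recurrent layers with 3x3/5x5/5x5/3x3 kernels, 64 features, a spatial skip connection, frame-by-frame output). For brevity it uses a plain ConvLSTM cell as a stand-in; the actual design uses ST-LSTM units, which additionally share a spatiotemporal memory across layers.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Simplified recurrent cell standing in for ST-LSTM; a real ST-LSTM
    also passes a spatiotemporal memory vertically across layers."""
    def __init__(self, ch, k):
        super().__init__()
        self.conv = nn.Conv2d(2 * ch, 4 * ch, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = self.conv(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class DenoiseNet(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.ch = ch
        self.head = nn.Conv2d(3, ch, 3, padding=1)         # feature extraction
        self.cells = nn.ModuleList(ConvLSTMCell(ch, k) for k in (3, 5, 5, 3))
        self.tail = nn.Conv2d(ch, 3, 3, padding=1)         # merge back to sRGB

    def forward(self, frames):                             # (B, T, 3, H, W)
        B, T, _, H, W = frames.shape
        state = [(frames.new_zeros(B, self.ch, H, W),
                  frames.new_zeros(B, self.ch, H, W)) for _ in self.cells]
        out = []
        for t in range(T):                                 # any number of frames
            x = self.head(frames[:, t])
            skip = x                                       # spatial skip connection
            for l, cell in enumerate(self.cells):
                h, c = cell(x, *state[l])
                state[l] = (h, c)
                x = h
            out.append(self.tail(x + skip))
        return torch.stack(out, dim=1)                     # frame-by-frame output
```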

Step 4, train and optimize the neural network, and verify the practicality of the method using the noisy training dataset synthesized in step 25 and a real-captured low-light video dataset. The network is trained by minimizing the loss function (formula (10)) between the frame I output by the network and the corresponding ground-truth frame I*.

ℓ(I, I*) = (1/N) Σ_{n=1}^{N} [α ℓ_abs(I_n, I*_n) + β ℓ_mse(I_n, I*_n) + γ ℓ_vgg(I_n, I*_n) + δ ℓ_tv(I_n)]    (10)

The basic loss is defined as a weighted average of the ℓ_abs and ℓ_mse losses; both the ℓ_1 distance and the ℓ_2 distance measure pixel-intensity consistency, the former smoothing the output and the latter preserving more detail. To further improve perceptual quality, a perceptual loss ℓ_vgg is introduced, which constrains the network output against the ground truth using high-level features extracted by a pre-trained VGG (Visual Geometry Group) network. In addition, a total variation regularizer ℓ_tv is added to the loss function as a smoothness term. Here, α, β, γ, δ are hyperparameters of the training process and N is the number of video frames used in training. In the training process of the present invention, α = 5, β = 1, γ = 0.06, δ = 2×10^-6, and N = 8.
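A sketch of formula (10) in PyTorch; vgg_features stands for any frozen, pre-trained VGG feature extractor (its exact layers are not specified here), and frames are flattened across the batch and time dimensions before the perceptual term:

```python
import torch
import torch.nn.functional as F

def total_variation(x):
    """Total variation: mean absolute difference between neighboring pixels."""
    return (x[..., :, 1:] - x[..., :, :-1]).abs().mean() + \
           (x[..., 1:, :] - x[..., :-1, :]).abs().mean()

def loss_fn(pred, target, vgg_features,
            alpha=5.0, beta=1.0, gamma=0.06, delta=2e-6):
    """Weighted loss of formula (10) over a clip of shape (B, N, 3, H, W)."""
    l_abs = F.l1_loss(pred, target)
    l_mse = F.mse_loss(pred, target)
    l_vgg = F.mse_loss(vgg_features(pred.flatten(0, 1)),
                       vgg_features(target.flatten(0, 1)))
    l_tv = total_variation(pred)
    return alpha * l_abs + beta * l_mse + gamma * l_vgg + delta * l_tv
```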

The deep learning framework used to implement the network of the present invention is PyTorch; the optimizer used during training is Adam, and the learning rate is 1×10^-6. The training set contains a large number of clean videos, from which about 900 sequences rich in moving scenes were selected. Since each camera has its own unique set of noise parameters, the network is trained with different training data for different cameras.
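A minimal training loop under these settings; loader (yielding synthesized noisy/clean clips) and vgg_features are assumed to be defined elsewhere, and DenoiseNet/loss_fn refer to the sketches above:

```python
model = DenoiseNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-6)

for noisy, clean in loader:            # (B, N, 3, H, W) clips, N = 8
    pred = model(noisy)
    loss = loss_fn(pred, clean, vgg_features)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```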

Claims (4)

1. A video denoising method based on actual camera noise modeling, characterized by comprising the following steps:
step 1, establishing an actual noise mathematical model of a low-light environment, wherein the model comprises dynamic stripe noise, inter-channel noise correlation and the truncation effect, and is specifically:
y_i = ⌊K_c β_r^c (S_i + D_i + R_i)⌋, c ∈ {r, g, b}
where i is the pixel index, y_i is the acquired pixel value, ⌊·⌋ is the truncation operation, K_c is the channel-dependent global system gain, and β_r^c is the fluctuation factor of the dynamic stripe noise; S_i denotes shot noise, with S_i ~ P(N_e^i), where P(·) is the Poisson distribution and N_e^i is the number of photoelectrons at pixel i; D_i denotes dark current, with D_i ~ P(N_d), where N_d is the number of dark-current electrons per pixel value; R_i denotes readout noise, with R_i ~ N(0, σ^2), where N(·) is the Gaussian distribution sign and σ^2 is the variance of the Gaussian distribution; and c ∈ {r, g, b} indexes the three color channels;
step 2, calibrating the parameters of the mathematical model of step 1 using the noise of an actual camera to generate a noisy-video training set conforming to the actual situation;
step 3, constructing a video denoising and enhancement neural network, combining spatial and temporal information to suppress and weaken the noise;
step 4, training and optimizing the neural network using the noisy-video training set generated in step 2 and an actually acquired low-light video dataset.
2. The video denoising method based on actual camera noise modeling according to claim 1, wherein in step 2 a series of pixels x with different means and variances is generated, two-dimensional tables are built that map the post-truncation mean mean(⌊x⌋) and variance var(⌊x⌋) to the true mean mean(x) and variance var(x), and the values of the real data are obtained by looking up the tables.
3. The video denoising method based on actual camera noise modeling according to claim 1, wherein in step 3 the input of the video denoising and enhancement neural network is noisy video shot under low light by an actual camera, and the output is bright, clear video frames.
4. The video denoising method based on actual camera noise modeling according to claim 1, wherein in step 4 the network is trained by minimizing the following loss function between the frame I output by the network and the corresponding real frame I*:
ℓ = α ℓ_abs + β ℓ_mse + γ ℓ_vgg + δ ℓ_tv
where ℓ is the final loss function, ℓ_abs is the absolute-error function, ℓ_mse is the mean-squared-error function, ℓ_vgg is the perceptual loss function, ℓ_tv is the total variation regularization function, and α, β, γ, δ are all hyperparameters.
CN201910518690.2A 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling Active CN110246105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910518690.2A CN110246105B (en) 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910518690.2A CN110246105B (en) 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling

Publications (2)

Publication Number Publication Date
CN110246105A CN110246105A (en) 2019-09-17
CN110246105B true CN110246105B (en) 2023-03-28

Family

ID=67887384

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910518690.2A Active CN110246105B (en) 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling

Country Status (1)

Country Link
CN (1) CN110246105B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807812B (en) * 2019-09-29 2022-04-05 浙江大学 Digital image sensor system error calibration method based on prior noise model
CN111260579B (en) * 2020-01-17 2021-08-03 北京理工大学 A low-light image denoising enhancement method based on physical noise generation model
CN111724317A (en) * 2020-05-20 2020-09-29 天津大学 A Supervised Dataset Construction Method for Video Denoising in Raw Domain
CN112381731B (en) * 2020-11-12 2021-08-10 四川大学 Single-frame stripe image phase analysis method and system based on image denoising
CN112686828B (en) * 2021-03-16 2021-07-02 腾讯科技(深圳)有限公司 Video denoising method, device, equipment and storage medium
CN115460359A (en) * 2021-06-08 2022-12-09 寒武纪(昆山)信息科技有限公司 Device, board card and method for denoising camera and readable storage medium
CN115514906B (en) * 2021-06-21 2025-01-10 寒武纪(昆山)信息科技有限公司 Integrated circuit device for camera denoising based on foreground data
CN114219820B (en) * 2021-12-08 2024-09-06 苏州工业园区智在天下科技有限公司 Neural network generation method, denoising method and device thereof
CN114418073B (en) * 2022-03-30 2022-06-21 深圳时识科技有限公司 Impulse neural network training method, storage medium, chip and electronic product
CN114897729B (en) * 2022-05-11 2024-06-04 北京理工大学 Filtering array type spectral image denoising enhancement method and system based on physical modeling

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7652788B2 (en) * 2006-06-23 2010-01-26 Nokia Corporation Apparatus, method, mobile station and computer program product for noise estimation, modeling and filtering of a digital image
CN107424176A (en) * 2017-07-24 2017-12-01 福州智联敏睿科技有限公司 A kind of real-time tracking extracting method of weld bead feature points
CN109214990A (en) * 2018-07-02 2019-01-15 广东工业大学 A kind of depth convolutional neural networks image de-noising method based on Inception model

Also Published As

Publication number Publication date
CN110246105A (en) 2019-09-17

Similar Documents

Publication Publication Date Title
CN110246105B (en) Video denoising method based on actual camera noise modeling
Wang et al. Enhancing low light videos by exploring high sensitivity camera noise
Lee et al. Deep chain hdri: Reconstructing a high dynamic range image from a single low dynamic range image
CN111539879A (en) Video blind denoising method and device based on deep learning
Liu et al. Noise estimation from a single image
Wang et al. Joint iterative color correction and dehazing for underwater image enhancement
CN111986084A (en) Multi-camera low-illumination image quality enhancement method based on multi-task fusion
CN108492262A (en) It is a kind of based on gradient-structure similitude without ghost high dynamic range imaging method
CN113284061B (en) Underwater image enhancement method based on gradient network
CN111724317A (en) A Supervised Dataset Construction Method for Video Denoising in Raw Domain
WO2023086194A1 (en) High dynamic range view synthesis from noisy raw images
CN115082341A (en) Low-light image enhancement method based on event camera
CN115861113B (en) A semi-supervised dehazing method based on fusion of depth map and feature mask
Huang et al. Underwater image enhancement based on color restoration and dual image wavelet fusion
CN111652815B (en) A mask camera image restoration method based on deep learning
Ye et al. Lfienet: Light field image enhancement network by fusing exposures of lf-dslr image pairs
CN115209119A (en) Video automatic coloring method based on deep neural network
CN118247418B (en) A method for reconstructing neural radiation fields using a small number of blurred images
CN113935917A (en) A method for removing thin clouds from optical remote sensing images based on cloud computing and multi-scale generative adversarial network
Lv et al. Unsupervised low-light video enhancement with spatial-temporal co-attention transformer
Paliwal et al. Multi-stage raw video denoising with adversarial loss and gradient mask
CN117097997A (en) Noise image synthesis method for reverse image signal processing
CN115841523A (en) Double-branch HDR video reconstruction algorithm based on Raw domain
De Neve et al. An improved HDR image synthesis algorithm
Zhang et al. Joint Luminance Adjustment and Color Correction for Low-Light Image Enhancement Network.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant