CN110246105B - Video denoising method based on actual camera noise modeling - Google Patents

Video denoising method based on actual camera noise modeling

Info

Publication number
CN110246105B
CN110246105B CN201910518690.2A CN201910518690A CN110246105B CN 110246105 B CN110246105 B CN 110246105B CN 201910518690 A CN201910518690 A CN 201910518690A CN 110246105 B CN110246105 B CN 110246105B
Authority
CN
China
Prior art keywords
noise
video
actual
denoising
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910518690.2A
Other languages
Chinese (zh)
Other versions
CN110246105A (en)
Inventor
王伟
陈鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201910518690.2A
Publication of CN110246105A
Application granted
Publication of CN110246105B
Status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/14Picture signal circuitry for video frequency region
    • H04N5/21Circuitry for suppressing or minimising disturbance, e.g. moiré or halo
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses a video denoising method based on actual camera noise modeling. The specific steps are as follows: (1) explore the physical causes of the main noise in the imaging process and establish a mathematical model of the noise distribution; (2) based on the established noise model, calibrate the model parameters and generate a realistic noisy-video training set; (3) design a video denoising and enhancement neural network that combines spatial and temporal information to suppress the noise; (4) train and optimize the neural network, and verify the practicality of the method on synthetic and real-captured video datasets. The denoising method of the invention is suited to low-light video denoising and has important applications in national defense and military affairs, security surveillance, scientific research, and environmental protection.

Description

A video denoising method based on actual camera noise modeling

Technical Field

The present invention relates to the fields of computational photography and deep learning, and in particular to low-light video denoising based on actual camera noise modeling.

Background Art

Under extremely low illumination, heavy noise significantly degrades image quality, which makes low-light video imaging a challenging problem. Many video denoising and enhancement algorithms have been proposed to address it, but most of them rest on simple independent and identically distributed noise assumptions, such as additive white Gaussian noise, Poisson noise, or a Gaussian-Poisson mixture. In practice, video noise is far more complex; under low illumination in particular, factors that are usually ignored, such as dynamic stripe noise, inter-channel noise correlation, and the truncation effect, become dominant problems.

Deep-learning-based methods have achieved remarkable progress on many image processing tasks. By building neural network models with many hidden layers and training them on massive data, deep learning learns more useful features of the noise and ultimately improves denoising quality. Through layer-by-layer feature transformations, the representation of a sample in the original space is mapped into a new feature space in which image reconstruction is easier.

Therefore, how to accurately model the noise distribution of video mathematically and use neural networks to restore high-quality low-light video is a current research direction.

Summary of the Invention

In view of the shortcomings of the existing video denoising methods described above, the purpose of the present invention is to propose a video denoising method based on actual camera noise modeling.

To achieve the above object, the technical solution adopted by the present invention is as follows:

A low-light video denoising method based on actual camera noise modeling, comprising:

Step 1, establish a mathematical model of the actual noise in a low-light environment; the model includes dynamic stripe noise, inter-channel noise correlation, and the truncation effect.

Step 2, calibrate the parameters of the mathematical model of step 1 using the noise of an actual camera, and generate a noisy-video training set that matches real conditions.

Step 3, construct a video denoising and enhancement neural network that combines spatial and temporal information to suppress the noise.

Step 4, train and optimize the neural network using the noisy-video training set generated in step 2 together with a real-captured low-light video dataset.

The present invention considers the physical causes of the main noise sources in actual camera imaging. It first establishes a mathematical model of the noise distribution and uses it to generate data closer to real conditions for training a video denoising and enhancement network based on long short-term memory (LSTM). The designed network accepts low-light video with any number of frames and outputs high-quality video frames frame by frame.

The advantages of the present invention are as follows: (1) the proposed actual noise model is grounded in the physical imaging process and the hardware characteristics of the sensor, and accounts for three kinds of noise that cannot be neglected under low light: dynamic stripe noise, inter-channel noise correlation, and the truncation effect; the model can therefore handle the complex noise of real scenarios well, especially video captured under low-light imaging;

(2) an estimation method for the noise model is proposed to synthesize a realistic noisy-video training set without requiring large amounts of real-captured training data, which is highly attractive for video denoising, where paired noisy and clean video frames are difficult to acquire simultaneously;

(3) a video denoising and enhancement neural network based on long short-term memory (LSTM) is designed; the network structure can memorize and exploit information from up to the 20 preceding frames to restore the current frame;

(4) the low-light video denoising method of the present invention has important application prospects in national defense and military affairs, security surveillance, scientific research, and environmental protection.

Brief Description of the Drawings

FIG. 1 is a flow chart of the denoising method in an embodiment of the present invention.

FIG. 2 is a schematic diagram of the noise sources in the imaging process in an embodiment of the present invention.

FIG. 3 shows the 2D lookup tables in an embodiment of the present invention: (a) mean correction for the truncation effect; (b) variance correction for the truncation effect.

FIG. 4 is a schematic diagram of the network architecture designed in an embodiment of the present invention.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.

Referring to FIG. 1, a video denoising method based on actual camera noise modeling according to this embodiment comprises the following specific steps:

Step 1, explore the physical causes of the main noise in the imaging process and establish a mathematical model of the noise distribution. During low-light imaging, high-sensitivity camera settings turn some otherwise negligible noise sources into significant noise components. FIG. 2 shows the common noise sources throughout the imaging process; based on this physical process, a basic noise model containing three main noise sources is established. The model assumes the camera follows a globally uniform linear response curve with gain K. The measured value y_i is obtained by:

y_i = K(S_i + D_i + R_i)    (1)

where i is the pixel index and y_i is the acquired pixel value; S_i denotes shot noise, with S_i ~ P(N_e^i), where P(·) is the Poisson distribution and N_e^i is the number of photoelectrons at pixel i; D_i denotes dark current, with D_i ~ P(N_d), where N_d is the number of dark-current electrons per pixel; R_i denotes readout noise, with R_i ~ N(0, σ_r^2), where N(·) denotes the Gaussian distribution and σ_r^2 is its variance.
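As an illustration of formula (1), the following minimal sketch draws Monte Carlo samples of y_i from the three noise sources; the gain and noise parameters here are arbitrary placeholder values, not calibrated constants.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_basic_noise(n_e, K=2.0, N_d=5.0, sigma_r=1.5):
    """Monte Carlo sample of formula (1): y_i = K * (S_i + D_i + R_i).

    n_e: array of expected photoelectron counts N_e^i per pixel.
    K, N_d, sigma_r are illustrative values, not calibrated constants.
    """
    S = rng.poisson(n_e)                     # shot noise    S_i ~ P(N_e^i)
    D = rng.poisson(N_d, size=n_e.shape)     # dark current  D_i ~ P(N_d)
    R = rng.normal(0.0, sigma_r, n_e.shape)  # read noise    R_i ~ N(0, sigma_r^2)
    return K * (S + D + R)

y = sample_basic_noise(np.full((4, 6), 20.0))
```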

Beyond these widely known noise categories, the present invention identifies some particular noise sources in low-light imaging.

Dynamic stripe noise: unlike traditional fixed-pattern noise, dynamic stripe noise varies continuously and cannot be removed by direct subtraction. In video it usually appears as row stripes. The phenomenon exists not only in rolling-shutter cameras but also in global-shutter cameras, with slightly different characteristics. An examination of the imaging process of the sensor chip suggests two physical causes: circuit fluctuations and asynchronous triggering. Both slightly perturb the per-row gain, so K in formula (1) is replaced by the row gain K_r:

K_r = K(1 + λβ^(1/f) + (1 − λ)β^w)    (2)

where K is the globally uniform system gain, β^(1/f) is the perturbation caused by circuit fluctuations, which follows a colored (1/f) Gaussian distribution, β^w is the perturbation caused by asynchronous triggering, which follows a white Gaussian distribution, and λ weighs the circuit fluctuations against the asynchronous triggering. Because β^(1/f) and β^w both follow zero-mean Gaussian distributions, the expectation of K_r is exactly K. To simplify parameter calibration, formula (2) is reduced to:

K_r = Kβ_r    (3)

where β_r is the correction parameter of the dynamic stripe noise; it corresponds to the circuit-fluctuation term in rolling-shutter cameras and to the asynchronous-trigger term in global-shutter cameras, so β_r follows a colored or white Gaussian distribution with mean 1, respectively.

Inter-channel noise correlation: in the present invention, the noise relationship between channels is modeled by examining the physical characteristics of the color sensor, considering both pixel uniformity and channel differences. Rolling-shutter and global-shutter cameras typically acquire three-channel images by overlaying a color filter array on the silicon sensor. In practice, the noise of the three channels is not identical, for two main reasons: the three channels have different system gains, and the three channels exhibit distinct DPN fluctuations. Accordingly, K_r in formula (3) is modified to:

K_r^c = K_c β_r^c, c ∈ {r, g, b}    (4)

where β_r^c is the channel-dependent correction parameter of the dynamic stripe noise and K_c is the channel-dependent global system gain.

Truncation effect: leaving readout noise aside, digital sensor outputs are normally non-negative. Readout noise, however, follows a zero-mean Gaussian distribution, so negative values do occur; in low-light environments especially, many signals are even weaker than the readout noise, which produces many negative values. These negative values are truncated to zero before the result is output. Mathematically, the truncation operation ⌊·⌋ can be expressed as:

⌊x⌋ = x if x ≥ 0, and 0 otherwise.

Summing up the above analysis, the final form of the actual noise model for low-light environments proposed by the present invention is:

y_i = ⌊K_c β_r^c (S_i + D_i + R_i)⌋, c ∈ {r, g, b}    (5)

For global-shutter cameras, β_r^c is identical across the three channels.
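To make the final model concrete, here is a hedged sketch that synthesizes one noisy frame per formula (5); the channel gains, the row-noise strength, and the use of white (rather than colored 1/f) row fluctuations are illustrative assumptions corresponding to the global-shutter case.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_full_model(n_e, K_c=(2.1, 2.0, 1.9), N_d=5.0,
                      sigma_r=1.5, sigma_beta=0.02):
    """Sketch of final model (5): y = truncate(K_c * beta_r^c * (S + D + R)).

    n_e: (H, W, 3) expected photoelectrons. beta_r is drawn per row as
    white Gaussian around 1 (the global-shutter case); a rolling-shutter
    camera would use colored (1/f) row noise instead.
    """
    H, W, C = n_e.shape
    y = np.empty_like(n_e, dtype=float)
    for c in range(C):
        beta_r = rng.normal(1.0, sigma_beta, size=(H, 1))  # row-gain fluctuation
        S = rng.poisson(n_e[..., c])
        D = rng.poisson(N_d, size=(H, W))
        R = rng.normal(0.0, sigma_r, size=(H, W))
        y[..., c] = np.maximum(K_c[c] * beta_r * (S + D + R), 0.0)  # truncation
    return y
```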

Step 2, based on the established noise model, calibrate the model parameters and generate a realistic noisy-video training set. The noise model of the present invention allows parameter calibration against the noise of an actual camera. To simplify the derivation, the truncation operation is ignored at first; the bias it induces in the expectation and variance is then corrected using the 2D lookup tables proposed by the present invention, as shown in FIG. 3.

Step 21, calibrate β_r^c. Capturing video frames in a dark field gives y_i = K_c β_r^c (D_i + R_i). Dividing the mean pixel value of each row by the global mean pixel value yields an estimate of β_r^c. Once β_r^c is obtained, the dynamic stripe noise can be removed by dividing each row's pixel values by it, which gives the stripe-corrected expression y_i = K_c(D_i + R_i)|c∈{r,g,b}. Depending on the camera type, rolling shutter or global shutter, the distribution of the dynamic stripe noise is determined to be colored noise or white Gaussian noise.
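A minimal sketch of this row-gain estimation, assuming a stack of dark-field frames for a single color channel is already available:

```python
import numpy as np

def calibrate_beta_r(dark_frames):
    """Estimate the row-gain fluctuation beta_r^c from dark-field frames.

    dark_frames: (T, H, W) stack for one color channel. Per frame, the
    row mean divided by the global mean estimates beta_r for that row;
    dividing each row by its estimate removes the dynamic stripe noise.
    """
    row_mean = dark_frames.mean(axis=2, keepdims=True)          # (T, H, 1)
    global_mean = dark_frames.mean(axis=(1, 2), keepdims=True)  # (T, 1, 1)
    beta_r = row_mean / global_mean                             # (T, H, 1)
    corrected = dark_frames / beta_r   # leaves y_i = K_c (D_i + R_i)
    return beta_r.squeeze(-1), corrected
```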

Step 22, calibrate K_c. With dark current D_i ~ P(N_d) and readout noise R_i ~ N(0, σ_r^2), the expectation and variance of the stripe-corrected measurement y' can be expressed as:

E[y'] = K_c N_d
D[y'] = K_c^2 (N_d + σ_r^2)    (6)

Substituting N_d = E[y']/K_c yields the following formula:

D[y'] = K_c E[y'] + K_c^2 σ_r^2    (7)

Since the dark current scales with exposure time, using dark-field video frames at different exposures eliminates the constant term K_c^2 σ_r^2, giving the final formula:

ΔD[y'] = K_c ΔE[y']    (8)

where ΔE[y'] and ΔD[y'] are the differences of E[y'] and D[y'] between exposure times t_1 and t_2. In practice, E[y'] can be taken as mean(y') and D[y'] as var(y'). Shooting a series of dark-field videos at different exposure times yields a set of points (ΔE[y'], ΔD[y']), and K_c is obtained by linearly fitting these points.
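The slope fit of formula (8) can be sketched as follows, assuming each entry of dark_videos is a stripe-corrected dark-field clip captured at a distinct exposure time:

```python
import numpy as np

def calibrate_K_c(dark_videos):
    """Fit K_c from stripe-corrected dark videos at different exposures.

    dark_videos: list of (T, H, W) arrays, one per exposure time.
    Formula (8): Delta D[y'] = K_c * Delta E[y'], so K_c is the slope of
    a line through the (Delta mean, Delta variance) point cloud.
    """
    stats = [(v.mean(), v.var()) for v in dark_videos]
    dE = np.array([m1 - m2 for (m1, _), (m2, _) in zip(stats, stats[1:])])
    dD = np.array([v1 - v2 for (_, v1), (_, v2) in zip(stats, stats[1:])])
    return np.polyfit(dE, dD, 1)[0]    # least-squares slope = K_c
```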

步骤23,校准Nd

Figure BDA00020959059200000414
校准得到Kc后,Nd
Figure BDA00020959059200000415
可由公式(6)计算得到。Step 23, calibrate Nd and
Figure BDA00020959059200000414
After calibration, K c is obtained, N d and
Figure BDA00020959059200000415
It can be calculated by formula (6).

Step 24, correct the truncation error by table lookup. The mean mean(⌊x⌋) and variance var(⌊x⌋) after the truncation operation differ substantially from the untruncated mean(x) and var(x). In actual computation, the effect of truncation is hard to evaluate without prior knowledge of the pixel x. In the present invention, all random variables subject to truncation are split into two parts: a Poisson-distributed part and a zero-mean Gaussian part. Using MATLAB, a large number of pixels x with a range of expectations and variances are generated, and 2D tables are built that map the post-truncation mean mean(⌊x⌋) and variance var(⌊x⌋) to the true mean(x) and var(x). The true values are then recovered by table lookup.
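The 2D tables can be generated with a short Monte Carlo routine (shown here in Python rather than the MATLAB used in the embodiment); the grid ranges and sample count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def build_truncation_tables(means, sigmas, n=200_000):
    """Build the 2D lookup tables of FIG. 3 by Monte Carlo simulation.

    For each (Poisson mean, Gaussian sigma) grid point, draw
    x = P(mean) + N(0, sigma), truncate at zero, and record the mean and
    variance after truncation; inverting the table at calibration time
    maps measured (truncated) statistics back to the true ones.
    """
    mean_tab = np.empty((len(means), len(sigmas)))
    var_tab = np.empty_like(mean_tab)
    for i, m in enumerate(means):
        for j, s in enumerate(sigmas):
            x = rng.poisson(m, n) + rng.normal(0.0, s, n)
            t = np.maximum(x, 0.0)                   # truncation
            mean_tab[i, j], var_tab[i, j] = t.mean(), t.var()
    return mean_tab, var_tab
```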

Step 25, training data synthesis. According to formula (5) and the camera parameter calibration steps above, a noisy training dataset can be synthesized from clean video sequences. The present invention first derives the expected number of photoelectrons N_e:

E[N_e] = L · S_pixel · C_lum2radiant · Q_e / E_p    (9)

where L is the luminous exposure at the pixel, S_pixel is the area of a single pixel, C_lum2radiant is the conversion constant from luminous flux to radiant intensity, E_p is the energy of a single photon, and Q_e is the equivalent system quantum efficiency of the camera. By adjusting the mean of the image to the expected photoelectron count E[N_e], the expected photon count can be derived from the image pixel values; the noisy training dataset is then synthesized via Monte Carlo simulation according to formula (5).
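Putting the calibration and the model together, a sketch of the pair-synthesis step might look like the following; target_Ne and the final renormalization are assumptions about how the mean adjustment and display scaling are carried out, and sample_full_model refers to the sketch after formula (5) above:

```python
def synthesize_training_pair(clean, target_Ne, **noise_params):
    """Turn a clean frame into a (noisy, clean) training pair.

    clean: (H, W, 3) float image in [0, 1]. Its mean is rescaled to the
    desired mean photoelectron count E[N_e] from formula (9), then the
    calibrated model (5) is applied via sample_full_model (defined above).
    """
    n_e = clean * (target_Ne / clean.mean())   # per-pixel expected photoelectrons
    noisy = sample_full_model(n_e, **noise_params)
    noisy = noisy / noisy.max()                # back to a displayable range
    return noisy, clean
```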

Step 3, design a video denoising and enhancement neural network that combines spatial and temporal information to suppress the noise. The present invention proposes an LSTM-based video denoising and enhancement network whose input is noisy video shot by an actual camera under low light and whose output is bright, clear video frames. FIG. 4 shows the network architecture designed in this embodiment. To adaptively extract both short-term and long-term dependencies from the video, the network adopts the spatiotemporal memory unit ST-LSTM, which models spatial and temporal representations in a unified memory cell and passes the memory vertically across layers and horizontally across states. The network architecture of the present invention contains two convolutional layers and four ST-LSTM layers: first, a convolutional layer extracts features of the input frame and passes them to the ST-LSTM layers; a skip connection is added in the spatial association; and the last layer merges the reconstruction information learned by the preceding layers back into sRGB space. The first and fourth ST-LSTM layers use 3x3 convolution kernels, the second and third use 5x5 kernels, and each layer has 64 feature channels. Zero padding keeps the input and output sizes consistent.
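The following PyTorch sketch mirrors the described layout (two convolutional layers, four recurrent layers with 3x3/5x5/5x5/3x3 kernels, 64 features, a spatial skip connection, frame-by-frame output). For brevity it uses a plain ConvLSTM cell as a stand-in; the actual design uses ST-LSTM units, which additionally share a spatiotemporal memory across layers.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Simplified recurrent cell standing in for ST-LSTM; a real ST-LSTM
    also passes a spatiotemporal memory vertically across layers."""
    def __init__(self, ch, k):
        super().__init__()
        self.conv = nn.Conv2d(2 * ch, 4 * ch, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = self.conv(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class DenoiseNet(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.ch = ch
        self.head = nn.Conv2d(3, ch, 3, padding=1)         # feature extraction
        self.cells = nn.ModuleList(ConvLSTMCell(ch, k) for k in (3, 5, 5, 3))
        self.tail = nn.Conv2d(ch, 3, 3, padding=1)         # merge back to sRGB

    def forward(self, frames):                             # (B, T, 3, H, W)
        B, T, _, H, W = frames.shape
        state = [(frames.new_zeros(B, self.ch, H, W),
                  frames.new_zeros(B, self.ch, H, W)) for _ in self.cells]
        out = []
        for t in range(T):                                 # any number of frames
            x = self.head(frames[:, t])
            skip = x                                       # spatial skip connection
            for l, cell in enumerate(self.cells):
                h, c = cell(x, *state[l])
                state[l] = (h, c)
                x = h
            out.append(self.tail(x + skip))
        return torch.stack(out, dim=1)                     # frame-by-frame output
```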

Step 4, train and optimize the neural network, and verify the practicality of the method using the noisy training dataset synthesized in step 25 and a real-captured low-light video dataset. The network is trained by minimizing the loss function (formula (10)) between the frame I output by the network and the corresponding ground-truth frame I*.

ℓ(I, I*) = (1/N) Σ_{n=1}^{N} [α ℓ_abs(I_n, I*_n) + β ℓ_mse(I_n, I*_n) + γ ℓ_vgg(I_n, I*_n) + δ ℓ_tv(I_n)]    (10)

The basic loss is defined as a weighted average of the ℓ_abs and ℓ_mse losses; both the ℓ_1 distance and the ℓ_2 distance measure pixel-intensity consistency, the former smoothing the output and the latter preserving more detail. To further improve perceptual quality, a perceptual loss ℓ_vgg is introduced, which constrains the network output against the ground truth using high-level features extracted by a pre-trained VGG (Visual Geometry Group) network. In addition, a total variation regularizer ℓ_tv is added to the loss function as a smoothness term. Here, α, β, γ, δ are hyperparameters of the training process and N is the number of video frames used in training. In the training process of the present invention, α = 5, β = 1, γ = 0.06, δ = 2×10^-6, and N = 8.
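A sketch of formula (10) in PyTorch; vgg_features stands for any frozen, pre-trained VGG feature extractor (its exact layers are not specified here), and frames are flattened across the batch and time dimensions before the perceptual term:

```python
import torch
import torch.nn.functional as F

def total_variation(x):
    """Total variation: mean absolute difference between neighboring pixels."""
    return (x[..., :, 1:] - x[..., :, :-1]).abs().mean() + \
           (x[..., 1:, :] - x[..., :-1, :]).abs().mean()

def loss_fn(pred, target, vgg_features,
            alpha=5.0, beta=1.0, gamma=0.06, delta=2e-6):
    """Weighted loss of formula (10) over a clip of shape (B, N, 3, H, W)."""
    l_abs = F.l1_loss(pred, target)
    l_mse = F.mse_loss(pred, target)
    l_vgg = F.mse_loss(vgg_features(pred.flatten(0, 1)),
                       vgg_features(target.flatten(0, 1)))
    l_tv = total_variation(pred)
    return alpha * l_abs + beta * l_mse + gamma * l_vgg + delta * l_tv
```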

The deep learning framework used to implement the network of the present invention is PyTorch; the optimizer used during training is Adam, and the learning rate is 1×10^-6. The training set contains a large number of clean videos, from which about 900 sequences rich in moving scenes were selected. Since each camera has its own unique set of noise parameters, the network is trained with different training data for different cameras.
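A minimal training loop under these settings; loader (yielding synthesized noisy/clean clips) and vgg_features are assumed to be defined elsewhere, and DenoiseNet/loss_fn refer to the sketches above:

```python
model = DenoiseNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-6)

for noisy, clean in loader:            # (B, N, 3, H, W) clips, N = 8
    pred = model(noisy)
    loss = loss_fn(pred, clean, vgg_features)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```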

Claims (4)

1. A video denoising method based on actual camera noise modeling, characterized by comprising the following steps:
step 1, establishing an actual noise mathematical model of a low-light environment, wherein the model comprises dynamic stripe noise, inter-channel noise correlation and the truncation effect, and is specifically:
y_i = ⌊K_c β_r^c (S_i + D_i + R_i)⌋, c ∈ {r, g, b}
where i is the pixel index, y_i is the acquired pixel value, ⌊·⌋ is the truncation operation, K_c is the channel-dependent global system gain, and β_r^c is the fluctuation factor of the dynamic stripe noise; S_i denotes shot noise, with S_i ~ P(N_e^i), where P(·) is the Poisson distribution and N_e^i is the number of photoelectrons at pixel i; D_i denotes dark current, with D_i ~ P(N_d), where N_d is the number of dark-current electrons per pixel value; R_i denotes readout noise, with R_i ~ N(0, σ^2), where N(·) is the Gaussian distribution sign and σ^2 is the variance of the Gaussian distribution; and c ∈ {r, g, b} indexes the three color channels;
step 2, calibrating the parameters of the mathematical model of step 1 using the noise of an actual camera to generate a noisy-video training set conforming to the actual situation;
step 3, constructing a video denoising and enhancement neural network, combining spatial and temporal information to suppress and weaken the noise;
step 4, training and optimizing the neural network using the noisy-video training set generated in step 2 and an actually acquired low-light video dataset.
2. The video denoising method based on actual camera noise modeling according to claim 1, wherein in step 2 a series of pixels x with different means and variances is generated, two-dimensional tables are built that map the post-truncation mean mean(⌊x⌋) and variance var(⌊x⌋) to the true mean mean(x) and variance var(x), and the values of the real data are obtained by looking up the tables.
3. The video denoising method based on actual camera noise modeling according to claim 1, wherein in step 3 the input of the video denoising and enhancement neural network is noisy video shot under low light by an actual camera, and the output is bright, clear video frames.
4. The video denoising method based on actual camera noise modeling according to claim 1, wherein in step 4 the network is trained by minimizing the following loss function between the frame I output by the network and the corresponding real frame I*:
ℓ = α ℓ_abs + β ℓ_mse + γ ℓ_vgg + δ ℓ_tv
where ℓ is the final loss function, ℓ_abs is the absolute-error function, ℓ_mse is the mean-squared-error function, ℓ_vgg is the perceptual loss function, ℓ_tv is the total variation regularization function, and α, β, γ, δ are all hyperparameters.
CN201910518690.2A 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling Active CN110246105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910518690.2A CN110246105B (en) 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910518690.2A CN110246105B (en) 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling

Publications (2)

Publication Number Publication Date
CN110246105A CN110246105A (en) 2019-09-17
CN110246105B true CN110246105B (en) 2023-03-28

Family

ID=67887384

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910518690.2A Active CN110246105B (en) 2019-06-15 2019-06-15 Video denoising method based on actual camera noise modeling

Country Status (1)

Country Link
CN (1) CN110246105B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807812B (en) * 2019-09-29 2022-04-05 浙江大学 Digital image sensor system error calibration method based on prior noise model
CN111260579B (en) * 2020-01-17 2021-08-03 北京理工大学 A low-light image denoising enhancement method based on physical noise generation model
CN111724317A (en) * 2020-05-20 2020-09-29 天津大学 A Supervised Dataset Construction Method for Video Denoising in Raw Domain
CN112381731B (en) * 2020-11-12 2021-08-10 四川大学 Single-frame stripe image phase analysis method and system based on image denoising
CN112686828B (en) * 2021-03-16 2021-07-02 腾讯科技(深圳)有限公司 Video denoising method, device, equipment and storage medium
CN115460359A (en) * 2021-06-08 2022-12-09 寒武纪(昆山)信息科技有限公司 Device, board card and method for denoising camera and readable storage medium
CN115514906B (en) * 2021-06-21 2025-01-10 寒武纪(昆山)信息科技有限公司 Integrated circuit device for camera denoising based on foreground data
CN114219820B (en) * 2021-12-08 2024-09-06 苏州工业园区智在天下科技有限公司 Neural network generation method, denoising method and device thereof
CN114418073B (en) * 2022-03-30 2022-06-21 深圳时识科技有限公司 Impulse neural network training method, storage medium, chip and electronic product
CN114897729B (en) * 2022-05-11 2024-06-04 北京理工大学 Filtering array type spectral image denoising enhancement method and system based on physical modeling

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7652788B2 (en) * 2006-06-23 2010-01-26 Nokia Corporation Apparatus, method, mobile station and computer program product for noise estimation, modeling and filtering of a digital image
CN107424176A (en) * 2017-07-24 2017-12-01 福州智联敏睿科技有限公司 A kind of real-time tracking extracting method of weld bead feature points
CN109214990A (en) * 2018-07-02 2019-01-15 广东工业大学 A kind of depth convolutional neural networks image de-noising method based on Inception model

Also Published As

Publication number Publication date
CN110246105A (en) 2019-09-17

Similar Documents

Publication Publication Date Title
CN110246105B (en) Video denoising method based on actual camera noise modeling
Wang et al. Enhancing low light videos by exploring high sensitivity camera noise
Lee et al. Deep chain hdri: Reconstructing a high dynamic range image from a single low dynamic range image
CN111539879A (en) Video blind denoising method and device based on deep learning
Liu et al. Noise estimation from a single image
Wang et al. Joint iterative color correction and dehazing for underwater image enhancement
CN111986084A (en) Multi-camera low-illumination image quality enhancement method based on multi-task fusion
CN108492262A (en) It is a kind of based on gradient-structure similitude without ghost high dynamic range imaging method
CN113284061B (en) Underwater image enhancement method based on gradient network
CN111724317A (en) A Supervised Dataset Construction Method for Video Denoising in Raw Domain
WO2023086194A1 (en) High dynamic range view synthesis from noisy raw images
CN115082341A (en) Low-light image enhancement method based on event camera
CN115861113B (en) A semi-supervised dehazing method based on fusion of depth map and feature mask
Huang et al. Underwater image enhancement based on color restoration and dual image wavelet fusion
CN111652815B (en) A mask camera image restoration method based on deep learning
Ye et al. Lfienet: Light field image enhancement network by fusing exposures of lf-dslr image pairs
CN115209119A (en) Video automatic coloring method based on deep neural network
CN118247418B (en) A method for reconstructing neural radiation fields using a small number of blurred images
CN113935917A (en) A method for removing thin clouds from optical remote sensing images based on cloud computing and multi-scale generative adversarial network
Lv et al. Unsupervised low-light video enhancement with spatial-temporal co-attention transformer
Paliwal et al. Multi-stage raw video denoising with adversarial loss and gradient mask
CN117097997A (en) Noise image synthesis method for reverse image signal processing
CN115841523A (en) Double-branch HDR video reconstruction algorithm based on Raw domain
De Neve et al. An improved HDR image synthesis algorithm
Zhang et al. Joint Luminance Adjustment and Color Correction for Low-Light Image Enhancement Network.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant