CN112396674A - Rapid event image filling method and system based on lightweight generative adversarial network - Google Patents
Rapid event image filling method and system based on lightweight generative adversarial network
- Publication number
- CN112396674A (application number CN202011133015.7A)
- Authority
- CN
- China
- Prior art keywords
- event
- sequence
- loss
- discriminator
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/40—Filling a planar surface by adding surface attributes, e.g. colour or texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a fast event image filling method and system based on a lightweight generative adversarial network. The method comprises: constructing a lightweight generative adversarial network; acquiring training data comprising multiple matched pairs of lossy event images and loss-free event images; optimizing the lightweight generative adversarial network with the training data to obtain optimal network parameters; and acquiring a lossy event image to be filled and inputting it into the lightweight generative adversarial network configured with the optimal network parameters, to obtain the filled event image output by the network. The method and system fully exploit the sparsity of event images and improve both the realism and the structural fineness of the filled image.
Description
Technical Field

The present application belongs to the technical field of image processing, and in particular relates to a fast event image filling method and system based on a lightweight generative adversarial network.
Background

An event camera (event-based camera, EB; sometimes also called a DVS, for Dynamic Vision Sensor) is a new type of sensor. Unlike a traditional camera, which captures complete images, an event camera captures "events", which can be understood simply as changes in pixel brightness; that is, an event camera outputs the change in brightness at each pixel.

Event cameras can generate sparse event streams and capture high-speed motion. However, as temporal resolution increases, spatial resolution drops sharply. Although generative adversarial networks have achieved remarkable results in conventional image inpainting, applying them directly to event filling buries the fast-response characteristic of the event camera, and the sparsity of the event stream is left underexploited.
Summary of the Invention

The purpose of the present application is to provide a fast event image filling method and system based on a lightweight generative adversarial network that fully exploit the sparsity of event images and improve both the realism and the structural fineness of the filled images.

To achieve the above purpose, the technical solution adopted by the present application is as follows:

A fast event image filling method based on a lightweight generative adversarial network, comprising:

constructing a lightweight generative adversarial network;

acquiring training data, the training data comprising multiple matched pairs of lossy event images and loss-free event images;

optimizing the lightweight generative adversarial network with the training data to obtain optimal network parameters;

acquiring a lossy event image to be filled and inputting it into the lightweight generative adversarial network configured with the optimal network parameters, to obtain the filled event image output by the network;

wherein the lightweight generative adversarial network comprises a generator and a discriminator; the generator comprises an encoder, a decoder, and two residual blocks connected between the encoder and the decoder; the encoder comprises three 3D convolutions and downsamples the image twice; the decoder comprises three 3D transposed convolutions and upsamples the image twice; the discriminator comprises an event-frame discriminator and an event-sequence discriminator, each of PatchGAN structure, the convolutions in the event-frame discriminator being 2D convolutions and the convolutions in the event-sequence discriminator being 3D convolutions.
Several optional implementations are provided below. They are not additional limitations on the overall solution above, but only further additions or preferences; provided there is no technical or logical contradiction, each option may be combined with the overall solution on its own, and several options may also be combined with one another.

Preferably, the convolutions in the residual blocks are dilated convolutions with a dilation factor of 2.

Preferably, optimizing the lightweight generative adversarial network with the training data to obtain optimal network parameters comprises:

taking P matched pairs of lossy event images and loss-free event images from the training data;

inputting the P lossy event images into the generator as one lossy event image sequence, to obtain the filled event image sequence output by the generator, each filled event image in the filled sequence corresponding to one lossy event image in the input sequence;

taking the P loss-free event images as one loss-free event image sequence and, based on the loss-free event image sequence and the filled event image sequence, first backpropagating the discriminator on the discriminator's total loss function and then backpropagating the generator on the generator's total loss function;

repeating the training until the optimal network parameters of the lightweight generative adversarial network are obtained.
Preferably, the total loss function of the discriminator is:

$$L_D = \lambda_{D_s} L_{D_s} + \lambda_{D_f} L_{D_f}$$

where $L_D$ is the total loss function of the discriminator, $L_{D_s}$ is the loss function of the event-sequence discriminator, $L_{D_f}$ is the loss function of the event-frame discriminator, and $\lambda_{D_s}$ and $\lambda_{D_f}$ are the weight parameters of the event-sequence discriminator and the event-frame discriminator, respectively;

the loss function $L_{D_s}$ of the event-sequence discriminator and the loss function $L_{D_f}$ of the event-frame discriminator are as follows:

$$L_{D_s} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_s(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_s(G(I_{in}))\big)\big]$$

$$L_{D_f} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_f(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_f(G(I_{in}))\big)\big]$$

where $I_{gt}$ denotes the loss-free event image sequence, $P_{data}(I_{gt})$ the distribution of loss-free event image sequences, and $\mathbb{E}[*]$ the expected value under a distribution; $D_s(I_{gt})$ is the probability with which the event-sequence discriminator judges its input to be a loss-free event image, and $D_f(I_{gt})$ the corresponding probability for the event-frame discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $1 - D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges its input to be a filled event image output by the generator, and $1 - D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator.
Preferably, the total loss function of the generator is:

$$L_G = \lambda_1 L_1 + \lambda_p L_{perc} + \lambda_s L_{style} + \lambda_g L_g$$

where $L_G$ is the total loss function of the generator; $L_1$ is the L1 loss function and $\lambda_1$ its weight parameter; $L_{perc}$ is the perceptual loss function and $\lambda_p$ its weight parameter; $L_{style}$ is the style loss function and $\lambda_s$ its weight parameter; $L_g$ is the generator adversarial loss function and $\lambda_g$ its weight parameter;

the generator adversarial loss function $L_g$ is as follows:

$$L_g = -\mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log D_s(G(I_{in})) + \log D_f(G(I_{in}))\big]$$

where $G$ denotes the generator and $D$ the discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $\mathbb{E}[*]$ denotes the expected value under a distribution; $G(I_{in})$ denotes the filled event image sequence output by the generator; $D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges a filled event image to be a loss-free event image, and $D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator;

the L1 loss function $L_1$ is as follows:

$$L_1 = \mathbb{E}\big[\,\lVert I_{gt} - I_{pred} \rVert_1\,\big]$$

where $I_{gt}$ denotes the loss-free event image sequence and $I_{pred}$ the filled event image sequence output by the generator;

the perceptual loss function $L_{perc}$ is as follows:

$$L_{perc} = \mathbb{E}\Big[\sum_j \frac{1}{N_j}\,\big\lVert \phi_j(I_{gt}) - \phi_j(I_{pred}) \big\rVert_1\Big]$$

where $\phi_j$ is the activation map of the $j$-th layer of a pre-trained VGG-19 network; $\phi_j(I_{gt})$ denotes the activation map sequence obtained by feeding the loss-free event image sequence into layer $j$ of the VGG-19 network, and $\phi_j(I_{pred})$ the activation map sequence obtained by feeding the filled event image sequence into layer $j$; $N_j$ denotes the number of feature channels of layer $j$ of the VGG-19 network;

the style loss function $L_{style}$ is as follows:

$$L_{style} = \mathbb{E}_j\Big[\big\lVert G^{\phi}_j(I_{gt}) - G^{\phi}_j(I_{pred}) \big\rVert_1\Big]$$

where $G^{\phi}_j$ is the $C_j \times C_j$ Gram matrix constructed from the activation map $\phi_j$; $G^{\phi}_j(I_{gt})$ denotes the Gram matrices constructed from the activation map sequence of the loss-free event image sequence, and $G^{\phi}_j(I_{pred})$ those constructed from the activation map sequence of the filled event image sequence.
The present application also provides a fast event image filling system based on a lightweight generative adversarial network, comprising:

a first module for constructing a lightweight generative adversarial network;

a second module for acquiring training data, the training data comprising multiple matched pairs of lossy event images and loss-free event images;

a third module for optimizing the lightweight generative adversarial network with the training data to obtain optimal network parameters;

a fourth module for acquiring a lossy event image to be filled and inputting it into the lightweight generative adversarial network configured with the optimal network parameters, to obtain the filled event image output by the network;

wherein the lightweight generative adversarial network comprises a generator and a discriminator; the generator comprises an encoder, a decoder, and two residual blocks connected between the encoder and the decoder; the encoder comprises three 3D convolutions and downsamples the image twice; the decoder comprises three 3D transposed convolutions and upsamples the image twice; the discriminator comprises an event-frame discriminator and an event-sequence discriminator, each of PatchGAN structure, the convolutions in the event-frame discriminator being 2D convolutions and the convolutions in the event-sequence discriminator being 3D convolutions.

Preferably, the convolutions in the residual blocks are dilated convolutions with a dilation factor of 2.

Preferably, the third module, in optimizing the lightweight generative adversarial network with the training data to obtain optimal network parameters, performs the following operations:

taking P matched pairs of lossy event images and loss-free event images from the training data;

inputting the P lossy event images into the generator as one lossy event image sequence, to obtain the filled event image sequence output by the generator, each filled event image in the filled sequence corresponding to one lossy event image in the input sequence;

taking the P loss-free event images as one loss-free event image sequence and, based on the loss-free event image sequence and the filled event image sequence, first backpropagating the discriminator on the discriminator's total loss function and then backpropagating the generator on the generator's total loss function;

repeating the training until the optimal network parameters of the lightweight generative adversarial network are obtained.
Preferably, the total loss function of the discriminator is:

$$L_D = \lambda_{D_s} L_{D_s} + \lambda_{D_f} L_{D_f}$$

where $L_D$ is the total loss function of the discriminator, $L_{D_s}$ is the loss function of the event-sequence discriminator, $L_{D_f}$ is the loss function of the event-frame discriminator, and $\lambda_{D_s}$ and $\lambda_{D_f}$ are the weight parameters of the event-sequence discriminator and the event-frame discriminator, respectively;

the loss function $L_{D_s}$ of the event-sequence discriminator and the loss function $L_{D_f}$ of the event-frame discriminator are as follows:

$$L_{D_s} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_s(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_s(G(I_{in}))\big)\big]$$

$$L_{D_f} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_f(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_f(G(I_{in}))\big)\big]$$

where $I_{gt}$ denotes the loss-free event image sequence, $P_{data}(I_{gt})$ the distribution of loss-free event image sequences, and $\mathbb{E}[*]$ the expected value under a distribution; $D_s(I_{gt})$ is the probability with which the event-sequence discriminator judges its input to be a loss-free event image, and $D_f(I_{gt})$ the corresponding probability for the event-frame discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $1 - D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges its input to be a filled event image output by the generator, and $1 - D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator.
Preferably, the total loss function of the generator is:

$$L_G = \lambda_1 L_1 + \lambda_p L_{perc} + \lambda_s L_{style} + \lambda_g L_g$$

where $L_G$ is the total loss function of the generator; $L_1$ is the L1 loss function and $\lambda_1$ its weight parameter; $L_{perc}$ is the perceptual loss function and $\lambda_p$ its weight parameter; $L_{style}$ is the style loss function and $\lambda_s$ its weight parameter; $L_g$ is the generator adversarial loss function and $\lambda_g$ its weight parameter;

the generator adversarial loss function $L_g$ is as follows:

$$L_g = -\mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log D_s(G(I_{in})) + \log D_f(G(I_{in}))\big]$$

where $G$ denotes the generator and $D$ the discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $\mathbb{E}[*]$ denotes the expected value under a distribution; $G(I_{in})$ denotes the filled event image sequence output by the generator; $D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges a filled event image to be a loss-free event image, and $D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator;

the L1 loss function $L_1$ is as follows:

$$L_1 = \mathbb{E}\big[\,\lVert I_{gt} - I_{pred} \rVert_1\,\big]$$

where $I_{gt}$ denotes the loss-free event image sequence and $I_{pred}$ the filled event image sequence output by the generator;

the perceptual loss function $L_{perc}$ is as follows:

$$L_{perc} = \mathbb{E}\Big[\sum_j \frac{1}{N_j}\,\big\lVert \phi_j(I_{gt}) - \phi_j(I_{pred}) \big\rVert_1\Big]$$

where $\phi_j$ is the activation map of the $j$-th layer of a pre-trained VGG-19 network; $\phi_j(I_{gt})$ denotes the activation map sequence obtained by feeding the loss-free event image sequence into layer $j$ of the VGG-19 network, and $\phi_j(I_{pred})$ the activation map sequence obtained by feeding the filled event image sequence into layer $j$; $N_j$ denotes the number of feature channels of layer $j$ of the VGG-19 network;

the style loss function $L_{style}$ is as follows:

$$L_{style} = \mathbb{E}_j\Big[\big\lVert G^{\phi}_j(I_{gt}) - G^{\phi}_j(I_{pred}) \big\rVert_1\Big]$$

where $G^{\phi}_j$ is the $C_j \times C_j$ Gram matrix constructed from the activation map $\phi_j$; $G^{\phi}_j(I_{gt})$ denotes the Gram matrices constructed from the activation map sequence of the loss-free event image sequence, and $G^{\phi}_j(I_{pred})$ those constructed from the activation map sequence of the filled event image sequence.
To overcome the large model size, parameter redundancy and slow inference of conventional image inpainting models, as well as the loss of temporal consistency caused by 2D convolutions, the fast event image filling method and system based on a lightweight generative adversarial network provided by the present application construct a shallow 3D generator to fully exploit the sparsity of event images; at the same time, to guarantee the realism and structural fineness of the filled event results, L1, perceptual and style losses are added to the original adversarial loss. Finally, an event-sequence discriminator is proposed to improve the temporal consistency of the results.
Brief Description of the Drawings

FIG. 1 is a flowchart of the fast event image filling method based on a lightweight generative adversarial network of the present application;

FIG. 2 is a schematic structural diagram of the lightweight generative adversarial network constructed by the present application;

FIG. 3 is an example of a lossy event image according to an embodiment of the present application;

FIG. 4 is the filled event image output by the lightweight generative adversarial network of the present application for FIG. 3.
Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part, not all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present application.

Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field to which this application belongs. The terms used in this specification are for the purpose of describing specific embodiments only and are not intended to limit the present application.

In one embodiment, a fast event image filling method based on a lightweight generative adversarial network is provided for the field of image processing, in particular for filling event camera images whose spatial resolution is impaired.
As shown in FIG. 1, the fast event image filling method based on a lightweight generative adversarial network comprises the following steps.

Step S1: construct a lightweight generative adversarial network.

Applying a generative adversarial network directly to fill event images would bury the fast-response characteristic of the event camera, and the sparsity of the event stream could not be fully exploited; this embodiment therefore constructs a lightweight generative adversarial network suited to highly dynamic scenes.

As shown in FIG. 2, the lightweight generative adversarial network constructed in this embodiment comprises a generator and a discriminator. The generator comprises an encoder, a decoder, and two residual blocks connected between them.

Conventional image inpainting networks are too deep for event images, which makes inference too slow. The encoder of the present invention therefore comprises three 3D convolutions and downsamples the image only twice, doubling the number of feature channels after each downsampling; correspondingly, the decoder comprises three 3D transposed convolutions and upsamples the image only twice, halving the number of feature channels after each upsampling.

Because of the sparsity of event images, a shallow network does not degrade the quality of the generated event images excessively, so this embodiment uses only two residual blocks between the encoder and the decoder. To enlarge the receptive field, dilated convolutions with a dilation factor of 2 replace the regular convolutions in the residual blocks, and to improve generalization and retain more spatio-temporal information, 3D dilated convolutions replace 2D dilated convolutions. This embodiment applies instance normalization to all layers of the network.
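The following is a minimal PyTorch sketch of the generator described above. The patent fixes only the layer counts (three 3D convolutions, two dilated residual blocks, three 3D transposed convolutions), the two down/upsampling steps, the channel doubling and halving, and the use of instance normalization; the base channel width, kernel sizes, strides and sigmoid output activation are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock3D(nn.Module):
    """Residual block whose first convolution is 3D and dilated (factor 2)."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=2, dilation=2),
            nn.InstanceNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm3d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

class Generator(nn.Module):
    """Shallow 3D encoder, two residual blocks, 3D transposed-conv decoder."""
    def __init__(self, base=32):
        super().__init__()
        self.encoder = nn.Sequential(
            # first 3D conv keeps resolution; the next two each halve H and W
            # (spatial stride 2) and double the feature channels
            nn.Conv3d(1, base, 3, stride=1, padding=1),
            nn.InstanceNorm3d(base), nn.ReLU(True),
            nn.Conv3d(base, base * 2, 3, stride=(1, 2, 2), padding=1),
            nn.InstanceNorm3d(base * 2), nn.ReLU(True),
            nn.Conv3d(base * 2, base * 4, 3, stride=(1, 2, 2), padding=1),
            nn.InstanceNorm3d(base * 4), nn.ReLU(True),
        )
        self.middle = nn.Sequential(ResidualBlock3D(base * 4),
                                    ResidualBlock3D(base * 4))
        self.decoder = nn.Sequential(
            # two 3D transposed convs each double H and W and halve the channels
            nn.ConvTranspose3d(base * 4, base * 2, 3, stride=(1, 2, 2),
                               padding=1, output_padding=(0, 1, 1)),
            nn.InstanceNorm3d(base * 2), nn.ReLU(True),
            nn.ConvTranspose3d(base * 2, base, 3, stride=(1, 2, 2),
                               padding=1, output_padding=(0, 1, 1)),
            nn.InstanceNorm3d(base), nn.ReLU(True),
            nn.ConvTranspose3d(base, 1, 3, stride=1, padding=1),
        )

    def forward(self, x):  # x: (N, 1, P, H, W) lossy event image sequence
        return torch.sigmoid(self.decoder(self.middle(self.encoder(x))))
```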
To improve the temporal consistency and quality of the filled event images, the discriminator constructed in this embodiment comprises an event-frame discriminator and an event-sequence discriminator. Both are of PatchGAN structure; the convolutions in the event-frame discriminator are 2D, and those in the event-sequence discriminator are 3D.

The discriminators use a 70×70 PatchGAN structure, which judges whether overlapping 70×70 image patches are real. The event-frame discriminator uses 2D convolutions and focuses on the spatial feature consistency of individual event frames. Although the 3D convolutions in the generator retain more spatio-temporal information, they also blur image edges; the event-sequence discriminator is therefore introduced, using 3D convolutions to raise the quality of the generated images while focusing on the temporal dependence and correlation of pixel changes. Finally, to improve training stability, spectral normalization is applied to the discriminators.
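Below is a sketch of the two PatchGAN discriminators, again assuming PyTorch. The 2D frame discriminator scores individual event frames and the 3D sequence discriminator scores the whole sequence, with spectral normalization applied as described; the layer widths and kernel shapes are illustrative assumptions.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

def frame_discriminator():
    """70x70-style PatchGAN over single event frames, input (N, 1, H, W)."""
    def block(cin, cout, stride):
        return nn.Sequential(
            spectral_norm(nn.Conv2d(cin, cout, 4, stride=stride, padding=1)),
            nn.LeakyReLU(0.2, inplace=True))
    return nn.Sequential(
        block(1, 64, 2), block(64, 128, 2), block(128, 256, 2),
        block(256, 512, 1),
        spectral_norm(nn.Conv2d(512, 1, 4, stride=1, padding=1)),
        nn.Sigmoid())  # per-patch real/fake probabilities

def sequence_discriminator():
    """PatchGAN with 3D convolutions, input (N, 1, P, H, W)."""
    def block(cin, cout, stride):
        return nn.Sequential(
            spectral_norm(nn.Conv3d(cin, cout, (3, 4, 4),
                                    stride=(1, stride, stride), padding=1)),
            nn.LeakyReLU(0.2, inplace=True))
    return nn.Sequential(
        block(1, 64, 2), block(64, 128, 2), block(128, 256, 2),
        block(256, 512, 1),
        spectral_norm(nn.Conv3d(512, 1, (3, 4, 4), stride=1, padding=1)),
        nn.Sigmoid())
```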
This embodiment uses regular, dilated and transposed convolutions. Take a 5×5 input feature map as an example, with a 3×3 kernel and stride 1: the regular convolution outputs a 3×3 feature map, which then serves as the input of the dilated convolution. With one zero inserted between kernel points (the effective kernel size becomes 5×5 while the parameter count is unchanged and the receptive field grows), the feature map edge padding set to 2 (zero padding) and stride 1, the dilated convolution again outputs a 3×3 feature map. For the transposed convolution, setting the feature map edge padding to 2, with the other parameters as in the regular convolution, restores the 3×3 feature map to 5×5.
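The 5×5 example can be checked directly; the sketch below assumes PyTorch. Note that the "edge padding of 2" described for the transposed convolution refers to zero-padding the input of an equivalent regular convolution; in PyTorch semantics this corresponds to ConvTranspose2d with padding=0.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 5, 5)
regular = nn.Conv2d(1, 1, kernel_size=3, stride=1)               # 5x5 -> 3x3
dilated = nn.Conv2d(1, 1, kernel_size=3, dilation=2, padding=2)  # 3x3 -> 3x3
transposed = nn.ConvTranspose2d(1, 1, kernel_size=3, stride=1)   # 3x3 -> 5x5

y = regular(x)
print(y.shape)              # torch.Size([1, 1, 3, 3])
print(dilated(y).shape)     # torch.Size([1, 1, 3, 3])
print(transposed(y).shape)  # torch.Size([1, 1, 5, 5])
```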
Step S2: acquire training data, the training data comprising multiple matched pairs of lossy event images and loss-free event images.

This embodiment mainly fills images from an event camera, so an event camera is taken as the example. The output of an event camera can be regarded as a continuous stream of events $\{e_i\}_{i \in \mathbb{N}}$. Each event $e_i$ can be expressed in the following form:

$$e_i = (x_i, y_i, t_i, p_i) \qquad (1)$$

where $(x_i, y_i)$ is the spatial position of the pixel generating the event, $t_i$ is the time coordinate of the brightness change, $p_i \in \{-1, 1\}$ is the positive or negative change in intensity at that pixel, and $i$ is the event index.

Within the exposure interval $[t, t+\tau]$ of length $\Delta t = \tau$, the event frame $F_\tau(t)$ is obtained by summing, at the pixel level, all events between times $t$ and $t+\tau$; the event frame can therefore be expressed as:

$$F_\tau(t)(x, y) = \sum_{e_i \in E_{t,\tau}} p_i \,\mathbb{1}\big[(x_i, y_i) = (x, y)\big] \qquad (2)$$

where $E_{t,\tau} = \{e_i \mid t_i \in [t, t+\tau]\}$. In this way, an event frame can be represented as a grayscale image of size $1 \times w \times h$ that integrates all events occurring within a given time interval into a single channel. Based on this frame-accumulation scheme, when generating matched lossy and loss-free event images, this embodiment accumulates $M_1$ events into one event frame as the lossy event image and $M_2$ events into one event frame as the loss-free event image, where $M_2$ is at least 80 times $M_1$, to guarantee the validity of the loss-free event image.

For example, in this embodiment each lossy event image is accumulated from 100 events per frame, while the corresponding loss-free event image is accumulated from 7500 events per frame. FIG. 3 shows a lossy event image accumulated at 100 events per frame; in this way, low-resolution event images at a frame rate of about 2000 FPS can be obtained.
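A minimal sketch of this count-based accumulation follows, assuming the event stream is given as NumPy arrays of coordinates and polarities. The function name, the sensor resolution in the usage note, and the normalization to a grayscale range are illustrative assumptions.

```python
import numpy as np

def accumulate_frames(x, y, p, events_per_frame, width, height):
    """Integrate consecutive groups of events into single-channel frames."""
    n_frames = len(x) // events_per_frame
    frames = np.zeros((n_frames, height, width), dtype=np.float32)
    for k in range(n_frames):
        s = slice(k * events_per_frame, (k + 1) * events_per_frame)
        # pixel-level accumulation of the signed polarities (+1 / -1)
        np.add.at(frames[k], (y[s], x[s]), p[s])
    # map to [0, 1] grayscale with 0.5 meaning "no event" (illustrative)
    peak = np.abs(frames).max() + 1e-8
    return np.clip(frames / (2.0 * peak) + 0.5, 0.0, 1.0)

# usage: lossy frames from 100 events each, loss-free frames from 7500 each
# (the 346x260 sensor resolution is a hypothetical example)
# lossy = accumulate_frames(x, y, p, 100, 346, 260)
# gt    = accumulate_frames(x, y, p, 7500, 346, 260)
```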
Step S3: optimize the lightweight generative adversarial network with the training data to obtain the optimal network parameters.

When training the lightweight generative adversarial network, to avoid running out of GPU memory or under-utilizing it, this embodiment feeds multiple event frames as one sequence clip.

Specifically, this embodiment takes P (P > 1, for example P = 8) matched pairs of lossy and loss-free event images from the training data.

The P lossy event images are input into the generator as one lossy event image sequence, to obtain the filled event image sequence output by the generator, each filled event image corresponding to one lossy event image in the input sequence.

The P loss-free event images are taken as one loss-free event image sequence. Based on the loss-free event image sequence and the filled event image sequence, backpropagation is first performed on the discriminator with the discriminator's total loss function, and then on the generator with the generator's total loss function.

Training is repeated until the optimal network parameters of the lightweight generative adversarial network are obtained.

This embodiment has two discriminators. To strengthen the correlation between them, the two discriminators are trained jointly: when updating the discriminator loss, the losses of the two discriminators are first summed and then backpropagated, rather than each loss being updated separately. The total loss function of the discriminator is therefore:

$$L_D = \lambda_{D_s} L_{D_s} + \lambda_{D_f} L_{D_f} \qquad (3)$$

where $L_D$ is the total loss function of the discriminator, $L_{D_s}$ is the loss function of the event-sequence discriminator, $L_{D_f}$ is the loss function of the event-frame discriminator, and $\lambda_{D_s}$ and $\lambda_{D_f}$ are the weight parameters of the event-sequence and event-frame discriminators; this embodiment uses preferred values for both weight parameters.

The loss function $L_{D_s}$ of the event-sequence discriminator and the loss function $L_{D_f}$ of the event-frame discriminator are as follows:

$$L_{D_s} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_s(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_s(G(I_{in}))\big)\big] \qquad (4)$$

$$L_{D_f} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_f(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_f(G(I_{in}))\big)\big] \qquad (5)$$

where $I_{gt}$ denotes the loss-free event image sequence, $P_{data}(I_{gt})$ the distribution of loss-free event image sequences, and $\mathbb{E}[*]$ the expected value under a distribution; $D_s(I_{gt})$ is the probability with which the event-sequence discriminator judges its input to be a loss-free event image, and $D_f(I_{gt})$ the corresponding probability for the event-frame discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $1 - D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges its input to be a filled event image output by the generator, and $1 - D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator.
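A sketch of the joint discriminator update of Eqs. (3) to (5), assuming PyTorch with binary cross-entropy on per-patch outputs. Equal weighting of the two discriminators is an assumption, since the preferred values of $\lambda_{D_s}$ and $\lambda_{D_f}$ are not given in this text.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(D_s, D_f, I_gt, I_pred, lam_s=0.5, lam_f=0.5):
    # sequence discriminator sees whole sequences: (N, 1, P, H, W)
    real_s, fake_s = D_s(I_gt), D_s(I_pred.detach())
    # frame discriminator sees single frames: fold P into the batch dimension
    frames_gt = I_gt.transpose(1, 2).flatten(0, 1)            # (N*P, 1, H, W)
    frames_pred = I_pred.detach().transpose(1, 2).flatten(0, 1)
    real_f, fake_f = D_f(frames_gt), D_f(frames_pred)

    loss_s = F.binary_cross_entropy(real_s, torch.ones_like(real_s)) + \
             F.binary_cross_entropy(fake_s, torch.zeros_like(fake_s))
    loss_f = F.binary_cross_entropy(real_f, torch.ones_like(real_f)) + \
             F.binary_cross_entropy(fake_f, torch.zeros_like(fake_f))
    # sum the two discriminator losses first, then backpropagate once
    return lam_s * loss_s + lam_f * loss_f
```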
To guarantee the realism and quality of the filled event image sequence, this embodiment combines several generator losses. The total loss function of the generator is:

$$L_G = \lambda_1 L_1 + \lambda_p L_{perc} + \lambda_s L_{style} + \lambda_g L_g \qquad (6)$$

where $L_G$ is the total loss function of the generator; $L_1$ is the L1 loss function and $\lambda_1$ its weight parameter; $L_{perc}$ is the perceptual loss function and $\lambda_p$ its weight parameter; $L_{style}$ is the style loss function and $\lambda_s$ its weight parameter; $L_g$ is the generator adversarial loss function and $\lambda_g$ its weight parameter. This embodiment prefers $\lambda_1 = 1$, $\lambda_g = \lambda_p = 0.1$ and $\lambda_s = 250$.

The lightweight generative adversarial network fills each image in the input image sequence and outputs the result, and uses BCELoss (binary cross-entropy loss) to bring the distribution of the filled event image sequence close to the distribution of the real labels; the generator adversarial loss function $L_g$ is therefore:

$$L_g = -\mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log D_s(G(I_{in})) + \log D_f(G(I_{in}))\big] \qquad (7)$$

where $G$ denotes the generator and $D$ the discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $\mathbb{E}[*]$ denotes the expected value under a distribution; $G(I_{in})$ denotes the filled event image sequence output by the generator; $D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges a filled event image to be a loss-free event image, and $D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator.
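A sketch of the generator adversarial term of Eq. (7) under the same assumptions: the generator is rewarded when both discriminators assign its filled sequence a high probability of being real.

```python
import torch
import torch.nn.functional as F

def generator_adversarial_loss(D_s, D_f, I_pred):
    out_s = D_s(I_pred)                                   # sequence score
    frames_pred = I_pred.transpose(1, 2).flatten(0, 1)    # (N*P, 1, H, W)
    out_f = D_f(frames_pred)                              # per-frame scores
    # BCE toward the "real" label for both discriminators
    return F.binary_cross_entropy(out_s, torch.ones_like(out_s)) + \
           F.binary_cross_entropy(out_f, torch.ones_like(out_f))
```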
To fully exploit the sparsity of event images, an L1 loss is added to the original generator loss; the L1 loss focuses on pixel-level features. The L1 loss function $L_1$ used in this embodiment is:

$$L_1 = \mathbb{E}\big[\,\lVert I_{gt} - I_{pred} \rVert_1\,\big] \qquad (8)$$

where $I_{gt}$ denotes the loss-free event image sequence and $I_{pred}$ the filled event image sequence output by the generator.

While the L1 loss function preserves the generator's pixel-level features, it also blurs the result, so this embodiment introduces a perceptual loss and a style loss to preserve image content. The perceptual loss regularizes the generated target image $I_{pred}$ to be closer to the real label $I_{gt}$ in the VGG feature subspace; the perceptual loss function $L_{perc}$ is:

$$L_{perc} = \mathbb{E}\Big[\sum_j \frac{1}{N_j}\,\big\lVert \phi_j(I_{gt}) - \phi_j(I_{pred}) \big\rVert_1\Big] \qquad (9)$$

where $\phi_j$ is the activation map of the $j$-th layer of a pre-trained VGG-19 network; $\phi_j(I_{gt})$ denotes the activation map sequence obtained by feeding the loss-free event image sequence into layer $j$ of the VGG-19 network, and $\phi_j(I_{pred})$ the activation map sequence obtained by feeding the filled event image sequence into layer $j$; $N_j$ denotes the number of feature channels of layer $j$ of the VGG-19 network.

Unlike the perceptual loss, the style loss first applies an autocorrelation (a Gram matrix) to the features in order to better recover detailed texture. The style loss measures the difference between the covariances of the activation maps and is likewise computed with VGG. Given an activation map of size $C_j \times H_j \times W_j$, the style loss is computed as:

$$L_{style} = \mathbb{E}_j\Big[\big\lVert G^{\phi}_j(I_{gt}) - G^{\phi}_j(I_{pred}) \big\rVert_1\Big] \qquad (10)$$

where $G^{\phi}_j$ is the $C_j \times C_j$ Gram matrix constructed from the activation map $\phi_j$; $G^{\phi}_j(I_{gt})$ denotes the Gram matrices constructed from the activation map sequence of the loss-free event image sequence, and $G^{\phi}_j(I_{pred})$ those constructed from the activation map sequence of the filled event image sequence.
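A sketch of the L1, perceptual, style and total generator losses of Eqs. (6) and (8) to (10), assuming PyTorch and torchvision's pre-trained VGG-19. The choice of VGG layers (relu1_1 through relu5_1) and the omission of ImageNet input normalization are assumptions; event frames are replicated to three channels because VGG expects RGB input.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

class VGGFeatures(torch.nn.Module):
    """Frozen VGG-19 feature extractor (assumed layers relu1_1..relu5_1)."""
    def __init__(self, layer_ids=(1, 6, 11, 20, 29)):
        super().__init__()
        self.vgg = vgg19(pretrained=True).features.eval()
        self.layer_ids = set(layer_ids)
        for param in self.vgg.parameters():
            param.requires_grad_(False)

    def forward(self, x):                    # x: (N*P, 1, H, W) in [0, 1]
        x = x.repeat(1, 3, 1, 1)             # grayscale -> 3-channel input
        feats = []
        for i, layer in enumerate(self.vgg):
            x = layer(x)
            if i in self.layer_ids:
                feats.append(x)
        return feats

def gram(phi):                               # (N, C, H, W) -> (N, C, C)
    n, c, h, w = phi.shape
    f = phi.reshape(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def generator_total_loss(vgg_feats, I_gt, I_pred, adv,
                         lam_1=1.0, lam_p=0.1, lam_s=250.0, lam_g=0.1):
    """Eq. (6) with the preferred weights lambda_1=1, lambda_g=lambda_p=0.1,
    lambda_s=250; `adv` is the adversarial term of Eq. (7)."""
    l1 = F.l1_loss(I_pred, I_gt)             # Eq. (8), pixel-level term
    frames_gt = I_gt.transpose(1, 2).flatten(0, 1)
    frames_pred = I_pred.transpose(1, 2).flatten(0, 1)
    perc, style = 0.0, 0.0
    for f_gt, f_pred in zip(vgg_feats(frames_gt), vgg_feats(frames_pred)):
        perc = perc + F.l1_loss(f_pred, f_gt)                 # Eq. (9)
        style = style + F.l1_loss(gram(f_pred), gram(f_gt))   # Eq. (10)
    return lam_1 * l1 + lam_p * perc + lam_s * style + lam_g * adv
```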
This embodiment mainly fills event image sequences of high temporal resolution and low spatial resolution, or event image sequences of normal temporal resolution whose spatial resolution is impaired. Events are first accumulated into event frames at high temporal resolution, and the event image sequence is fed into the generator, which outputs the filled event image sequence. The filled event image sequence and the real labels are then fed into the two discriminators together, which judge authenticity and feed the result back to the generator, thereby guaranteeing the temporal consistency and image quality of the filled event image sequence.
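Tying the pieces together, a minimal training step under the same assumptions, using the loss sketches above: the two discriminators are updated first on their summed loss in a single backward pass, then the generator on its total loss, matching the order described in step S3.

```python
import torch

def train_step(G, D_s, D_f, opt_g, opt_d, vgg_feats, I_in, I_gt):
    I_pred = G(I_in)                  # filled event image sequence

    # 1) joint discriminator update: summed loss, single backward pass
    opt_d.zero_grad()
    d_loss = discriminator_loss(D_s, D_f, I_gt, I_pred)
    d_loss.backward()
    opt_d.step()

    # 2) generator update on the total loss of Eq. (6)
    opt_g.zero_grad()
    adv = generator_adversarial_loss(D_s, D_f, I_pred)
    g_loss = generator_total_loss(vgg_feats, I_gt, I_pred, adv)
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```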
Step S4: acquire the lossy event image to be filled and input it into the lightweight generative adversarial network configured with the optimal network parameters, to obtain the filled event image output by the network.

To make maximal use of the trained lightweight generative adversarial network, this embodiment likewise inputs P lossy event images as one lossy event image sequence when filling, and the corresponding output is a filled event image sequence. FIG. 4 shows the event image output after filling by the lightweight generative adversarial network of this embodiment for the lossy event image of FIG. 3; as the figure shows, the structures filled in by the network are realistic and finely detailed, restoring the image to a large extent.

The present application uses a lightweight generative adversarial network to fill event images; the model is smaller than the adversarial network models used in conventional image inpainting and its inference is faster. The 3D convolutions and the event-sequence discriminator effectively improve the temporal consistency and quality of the filled results. The method is suited to capturing fast-moving objects and fully preserves the high dynamic response of event cameras; it can be applied, for example, to ultra-high-speed human motion capture and high-frame-rate scenes.

This embodiment uses a shallow 3D generator to fully exploit the sparsity of event images and, to guarantee the realism and structural fineness of the filled event results, adds L1, perceptual and style losses to the original adversarial loss. Finally, an event-frame discriminator and an event-sequence discriminator are used to improve the temporal consistency of the results. The model of the invention is small, with an inference speed of up to 500 FPS, which largely meets the need of event cameras to capture high-speed moving objects and also suits highly dynamic scenes.
In another embodiment, a fast event image filling system based on a lightweight generative adversarial network is also provided, comprising:

a first module for constructing a lightweight generative adversarial network;

a second module for acquiring training data, the training data comprising multiple matched pairs of lossy event images and loss-free event images;

a third module for optimizing the lightweight generative adversarial network with the training data to obtain optimal network parameters;

a fourth module for acquiring a lossy event image to be filled and inputting it into the lightweight generative adversarial network configured with the optimal network parameters, to obtain the filled event image output by the network;

wherein the lightweight generative adversarial network comprises a generator and a discriminator; the generator comprises an encoder, a decoder, and two residual blocks connected between the encoder and the decoder; the encoder comprises three 3D convolutions and downsamples the image twice; the decoder comprises three 3D transposed convolutions and upsamples the image twice; the discriminator comprises an event-frame discriminator and an event-sequence discriminator, each of PatchGAN structure, the convolutions in the event-frame discriminator being 2D convolutions and the convolutions in the event-sequence discriminator being 3D convolutions.

For specific limitations of the fast event image filling system based on a lightweight generative adversarial network, reference may be made to the limitations of the fast event image filling method above, which are not repeated here. Each of the above modules may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.

In another embodiment, the convolutions in the residual blocks are dilated convolutions with a dilation factor of 2.

In another embodiment, the third module, in optimizing the lightweight generative adversarial network with the training data to obtain optimal network parameters, performs the following operations:

taking P matched pairs of lossy event images and loss-free event images from the training data;

inputting the P lossy event images into the generator as one lossy event image sequence, to obtain the filled event image sequence output by the generator, each filled event image in the filled sequence corresponding to one lossy event image in the input sequence;

taking the P loss-free event images as one loss-free event image sequence and, based on the loss-free event image sequence and the filled event image sequence, first backpropagating the discriminator on the discriminator's total loss function and then backpropagating the generator on the generator's total loss function;

repeating the training until the optimal network parameters of the lightweight generative adversarial network are obtained.
In another embodiment, the total loss function of the discriminator is:

$$L_D = \lambda_{D_s} L_{D_s} + \lambda_{D_f} L_{D_f}$$

where $L_D$ is the total loss function of the discriminator, $L_{D_s}$ is the loss function of the event-sequence discriminator, $L_{D_f}$ is the loss function of the event-frame discriminator, and $\lambda_{D_s}$ and $\lambda_{D_f}$ are the weight parameters of the event-sequence discriminator and the event-frame discriminator, respectively;

the loss function $L_{D_s}$ of the event-sequence discriminator and the loss function $L_{D_f}$ of the event-frame discriminator are as follows:

$$L_{D_s} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_s(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_s(G(I_{in}))\big)\big]$$

$$L_{D_f} = -\mathbb{E}_{I_{gt} \sim P_{data}(I_{gt})}\big[\log D_f(I_{gt})\big] - \mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log\big(1 - D_f(G(I_{in}))\big)\big]$$

where $I_{gt}$ denotes the loss-free event image sequence, $P_{data}(I_{gt})$ the distribution of loss-free event image sequences, and $\mathbb{E}[*]$ the expected value under a distribution; $D_s(I_{gt})$ is the probability with which the event-sequence discriminator judges its input to be a loss-free event image, and $D_f(I_{gt})$ the corresponding probability for the event-frame discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $1 - D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges its input to be a filled event image output by the generator, and $1 - D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator.

In another embodiment, the total loss function of the generator is:

$$L_G = \lambda_1 L_1 + \lambda_p L_{perc} + \lambda_s L_{style} + \lambda_g L_g$$

where $L_G$ is the total loss function of the generator; $L_1$ is the L1 loss function and $\lambda_1$ its weight parameter; $L_{perc}$ is the perceptual loss function and $\lambda_p$ its weight parameter; $L_{style}$ is the style loss function and $\lambda_s$ its weight parameter; $L_g$ is the generator adversarial loss function and $\lambda_g$ its weight parameter;

the generator adversarial loss function $L_g$ is as follows:

$$L_g = -\mathbb{E}_{I_{in} \sim P_{data}(I_{in})}\big[\log D_s(G(I_{in})) + \log D_f(G(I_{in}))\big]$$

where $G$ denotes the generator and $D$ the discriminator; $I_{in}$ denotes the lossy event image sequence and $P_{data}(I_{in})$ its distribution; $\mathbb{E}[*]$ denotes the expected value under a distribution; $G(I_{in})$ denotes the filled event image sequence output by the generator; $D_s(G(I_{in}))$ is the probability with which the event-sequence discriminator judges a filled event image to be a loss-free event image, and $D_f(G(I_{in}))$ the corresponding probability for the event-frame discriminator;

the L1 loss function $L_1$ is as follows:

$$L_1 = \mathbb{E}\big[\,\lVert I_{gt} - I_{pred} \rVert_1\,\big]$$

where $I_{gt}$ denotes the loss-free event image sequence and $I_{pred}$ the filled event image sequence output by the generator;

the perceptual loss function $L_{perc}$ is as follows:

$$L_{perc} = \mathbb{E}\Big[\sum_j \frac{1}{N_j}\,\big\lVert \phi_j(I_{gt}) - \phi_j(I_{pred}) \big\rVert_1\Big]$$

where $\phi_j$ is the activation map of the $j$-th layer of a pre-trained VGG-19 network; $\phi_j(I_{gt})$ denotes the activation map sequence obtained by feeding the loss-free event image sequence into layer $j$ of the VGG-19 network, and $\phi_j(I_{pred})$ the activation map sequence obtained by feeding the filled event image sequence into layer $j$; $N_j$ denotes the number of feature channels of layer $j$ of the VGG-19 network;

the style loss function $L_{style}$ is as follows:

$$L_{style} = \mathbb{E}_j\Big[\big\lVert G^{\phi}_j(I_{gt}) - G^{\phi}_j(I_{pred}) \big\rVert_1\Big]$$

where $G^{\phi}_j$ is the $C_j \times C_j$ Gram matrix constructed from the activation map $\phi_j$; $G^{\phi}_j(I_{gt})$ denotes the Gram matrices constructed from the activation map sequence of the loss-free event image sequence, and $G^{\phi}_j(I_{pred})$ those constructed from the activation map sequence of the filled event image sequence.
It should be understood that, although the steps in the flowchart of FIG. 1 are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on the execution of these steps, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 1 may comprise multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; likewise, their execution order is not necessarily sequential, and they may be performed in turn with, or alternately with, other steps or at least part of the sub-steps or stages of other steps.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features have been described; nevertheless, as long as a combination of these technical features involves no contradiction, it should be regarded as falling within the scope of this specification.
The above embodiments merely express several implementations of the present application; their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Accordingly, the protection scope of this patent shall be determined by the appended claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011133015.7A CN112396674B (en) | 2020-10-21 | 2020-10-21 | Rapid event image filling method and system based on lightweight generation countermeasure network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011133015.7A CN112396674B (en) | 2020-10-21 | 2020-10-21 | Rapid event image filling method and system based on lightweight generation countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112396674A true CN112396674A (en) | 2021-02-23 |
CN112396674B CN112396674B (en) | 2024-10-18 |
Family
ID=74596029
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011133015.7A Active CN112396674B (en) | 2020-10-21 | 2020-10-21 | Rapid event image filling method and system based on lightweight generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112396674B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109559287A (en) * | 2018-11-20 | 2019-04-02 | 北京工业大学 | A kind of semantic image restorative procedure generating confrontation network based on DenseNet |
CN110930418A (en) * | 2019-11-27 | 2020-03-27 | 江西理工大学 | A Retinal Vessel Segmentation Method Fusion W-net and Conditional Generative Adversarial Networks |
CN111402179A (en) * | 2020-03-12 | 2020-07-10 | 南昌航空大学 | Image synthesis method and system combining countermeasure autoencoder and generation countermeasure network |
CN111695435A (en) * | 2020-05-19 | 2020-09-22 | 东南大学 | Driver behavior identification method based on deep hybrid coding and decoding neural network |
Non-Patent Citations (5)
Title |
---|
ALEX ZIHAO ZHU et al.: "EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras", arXiv preprint arXiv:1802.06898, 31 August 2018 (2018-08-31), page 3 *
CHE SUN et al.: "Adversarial 3D Convolutional Auto-Encoder for Abnormal Event Detection in Videos", IEEE Transactions on Multimedia, vol. 23, 10 September 2020 (2020-09-10), pages 3292-3305 *
DMITRY ULYANOV et al.: "Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis", 2017 IEEE Conference on Computer Vision and Pattern Recognition, 31 December 2017 (2017-12-31), pages 4106-4113 *
YANG Dongsheng et al.: "Transient Stability Assessment Method for Power Systems Based on a Dual-Generator Generative Adversarial Network", Power System Technology, vol. 45, no. 8, 14 October 2020 (2020-10-14), pages 2394-2945 *
WANG Wanliang et al.: "Research Progress on Generative Adversarial Networks", Journal on Communications, vol. 39, no. 2, 28 February 2018 (2018-02-28), pages 2018032-1 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114266786A (en) * | 2021-12-21 | 2022-04-01 | 北京工业大学 | Gastric lesion segmentation method and system based on generative adversarial network |
CN114266786B (en) * | 2021-12-21 | 2024-09-13 | 北京工业大学 | Gastric lesion segmentation method and system based on generative adversarial network |
CN115860054A (en) * | 2022-07-21 | 2023-03-28 | 广州工商学院 | Sparse codebook multiple access coding and decoding system based on generation countermeasure network |
CN115860054B (en) * | 2022-07-21 | 2023-09-26 | 广州工商学院 | Sparse codebook multiple access coding and decoding system based on generation countermeasure network |
Also Published As
Publication number | Publication date |
---|---|
CN112396674B (en) | 2024-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lim et al. | DSLR: Deep stacked Laplacian restorer for low-light image enhancement | |
Pan et al. | DACNN: Blind image quality assessment via a distortion-aware convolutional neural network | |
WO2021093620A1 (en) | Method and system for high-resolution image inpainting | |
CN112164011B (en) | Moving Image Deblurring Method Based on Adaptive Residual and Recursive Cross Attention | |
CN111798400A (en) | Reference-free low-light image enhancement method and system based on generative adversarial network | |
CN111028150A (en) | A fast spatiotemporal residual attention video super-resolution reconstruction method | |
CN112750201B (en) | Three-dimensional reconstruction method, related device and equipment | |
CN112653899A (en) | Network live broadcast video feature extraction method based on joint attention ResNeSt under complex scene | |
Liu et al. | BE-CALF: Bit-depth enhancement by concatenating all level features of DNN | |
CN112465727A (en) | Low-illumination image enhancement method without normal illumination reference based on HSV color space and Retinex theory | |
CN110689599A (en) | 3D visual saliency prediction method for generating countermeasure network based on non-local enhancement | |
Zheng et al. | T-net: Deep stacked scale-iteration network for image dehazing | |
CN112862689A (en) | Image super-resolution reconstruction method and system | |
Zhang et al. | Multi-attention convolutional neural network for video deblurring | |
CN114463176B (en) | Image super-resolution reconstruction method based on improved ESRGAN | |
Han et al. | Hybrid high dynamic range imaging fusing neuromorphic and conventional images | |
US11915383B2 (en) | Methods and systems for high definition image manipulation with neural networks | |
CN117576402B (en) | Deep learning-based multi-scale aggregation transducer remote sensing image semantic segmentation method | |
CN114638836A (en) | An urban streetscape segmentation method based on highly effective driving and multi-level feature fusion | |
CN115115540A (en) | Method and device for unsupervised low-light image enhancement based on illumination information guidance | |
CN114463218A (en) | Event data driven video deblurring method | |
CN112396674A (en) | Rapid event image filling method and system based on lightweight generation countermeasure network | |
CN112435165B (en) | Two-stage video super-resolution reconstruction method based on generation countermeasure network | |
Ren et al. | A lightweight object detection network in low-light conditions based on depthwise separable pyramid network and attention mechanism on embedded platforms | |
CN114842400A (en) | Video frame generation method and system based on residual block and feature pyramid |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||