CN111275651B - Face bright removal method based on antagonistic neural network - Google Patents
- Publication number
- CN111275651B (application CN202010115474.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- discriminator
- generator
- face
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10004—Still image; Photographic image
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30201—Face
Abstract
The invention discloses a method for removing specular highlights from human faces based on an adversarial neural network. The method includes: synthesizing faces by computer graphics to build a virtual face dataset of paired highlighted and highlight-free face images; photographing real faces both normally and through a polarizing filter to obtain a real face dataset of corresponding highlighted and highlight-free image pairs; building an adversarial neural network and training its parameters on the datasets to obtain a highlight-removal network model; and feeding a highlighted face image into the trained highlight-removal model to obtain the highlight-free image of that face. By means of the adversarial neural network, the invention achieves more natural and realistic removal of facial highlights.
Description
Technical Field
The invention relates to the field of pattern recognition, and in particular to a method for removing specular highlights from human faces based on an adversarial neural network.
Background Art
The skin surface reflects ambient light, which appears in images as shiny, highlighted skin. Such highlights mask the original color and texture of the affected region, and local highlights on the face affect computer vision, computer graphics, and medical examination. In computer vision, facial highlights hinder face detection and face recognition, causing missed detections and recognition errors. In computer graphics, highlights on a real face must be consistent with the environment: under a light source that produces no highlights, the skin should show none. In medical examination, highlights interfere with the detection of facial pigmented skin diseases, since it is hard to tell whether such a disease is present in the highlighted region, which hampers diagnosis and treatment planning. Facial highlight removal is therefore necessary.
Because skin is a complex material, general-purpose highlight-removal methods do not apply, and existing research has not yet achieved natural and realistic removal of highlights from facial skin. This is especially true when the face bears dense markings such as freckles: the highlight must be removed naturally while restoring, as far as possible, the original skin and features such as spots and moles beneath it. Facial highlight removal is therefore a complex and challenging problem. Removal methods based on explicit illumination models are limited by the physical model, which cannot fully capture the real lighting environment, especially for complex materials such as facial skin, so their results are not realistic enough.
With the rise of artificial intelligence, deep learning algorithms have advanced rapidly. Compared with de-lighting methods based on explicit illumination models, an end-to-end neural network can predict the corresponding highlight-free image directly from the original highlighted image. It is not restricted to a fixed illumination model and can restore the skin in the highlighted regions more accurately and realistically.
Summary of the Invention
Purpose of the invention: the technical problem to be solved by the present invention is to propose a method for removing facial highlights based on an adversarial neural network. The method uses an end-to-end neural network to repair the highlighted regions of a face, removing them naturally and realistically while restoring the original skin color and texture of the corresponding regions, and is applicable to a wide variety of facial highlights.
Technical solution: to achieve the above purpose, the present invention provides a method for removing facial highlights based on an adversarial neural network, comprising the following steps:
(1) Synthesize faces by computer graphics, generate virtual pairs of highlighted and highlight-free face images, and build a virtual face dataset;
(2) Photograph faces both normally and through a polarizing filter to obtain pairs of highlighted and highlight-free face images in real environments, and build a real face dataset;
(3) Build a generative adversarial network model;
(4) Pre-train the network on the virtual face dataset;
(5) Fine-tune the pre-trained network parameters on the real face dataset to obtain the final highlight-removal network model;
(6) Input the face image to be processed into the highlight-removal network model to obtain the corresponding highlight-free face image.
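As an illustration only, the six steps can be sketched as a skeleton pipeline. Every function and variable name below is a hypothetical placeholder standing in for a stage of the method; none of them comes from the patent itself:

```python
# Hypothetical stage names for the six-step method; each stub only records
# what a real implementation would do at that stage.

def pretrain(model, dataset):        # step (4): virtual data first
    return model + ["pretrained:" + dataset]

def finetune(model, dataset):        # step (5): then real data refines it
    return model + ["finetuned:" + dataset]

def remove_highlights(model, image): # step (6): inference on a new image
    return "matte(" + image + ")"

virtual = "virtual_pairs"            # step (1): CG-rendered image pairs
real = "real_pairs"                  # step (2): camera + polarizer pairs
model = []                           # step (3): freshly built GAN

model = pretrain(model, virtual)
model = finetune(model, real)
print(remove_highlights(model, "face.png"))  # matte(face.png)
```

The ordering matters: pre-training on abundant synthetic pairs precedes fine-tuning on the smaller real dataset.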
Further, the method of building the virtual face dataset in step (1) is as follows:
(1.1) Collect highlight-free frontal face images from the Internet as source material for generating three-dimensional faces;
(1.2) Generate the three-dimensional face corresponding to each image with HeadShot, add light sources, assign different degrees of reflectivity to the face material, and obtain different highlighted images of the face under different expressions via the preset expressions of the 3D face model. By capturing renders, obtain the highlight-free face image in the environment without light sources and the highlighted face image in the environment with light sources, and build the virtual face dataset.
Further, the method of building the real face dataset in step (2) is as follows:
(2.1) Collect data with a camera; set the camera position, the subject's position, the polarizing filter's position, and the light source's position and angle so that the polarizer can eliminate the highlights during shooting;
(2.2) While images are collected, the subject stays still in a fixed position. Capture real face images with different highlights under different lighting intensities, then capture the highlight-free version of each image through the polarizer, and build the real face dataset.
Further, in step (3), the generative adversarial network model is built as follows:
Build an adversarial neural network consisting of a generator and a discriminator. The generator uses a U-net structure with 15 layers in total; the features of layer i are connected to the features of layer 15-i, where i ∈ [1, 7], so that corresponding layers share low-level information, improving the output. The generator's input image is 256*256. Downsampling is performed by a 4*4 convolution, batch normalization, and ReLU activation; after 8 such downsampling steps the feature sizes are, in order, 128*128, 64*64, 32*32, 16*16, 8*8, 4*4, 2*2, and 1*1. Upsampling is then performed by activation, a 4*4 transposed convolution, and batch normalization; after 8 such upsampling steps the feature sizes are, in order, 2*2, 4*4, 8*8, 16*16, 32*32, 64*64, and 128*128, followed by a 256*256 output.
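The downsampling/upsampling schedule above can be checked with a few lines of arithmetic — a sketch assuming, as is standard for 4*4 stride-2 (transposed) convolutions with padding 1, that each downsampling step halves and each upsampling step doubles the spatial size:

```python
def unet_feature_sizes(input_size=256, steps=8):
    """Spatial sizes after each stride-2 downsample, then each upsample."""
    down = []
    s = input_size
    for _ in range(steps):   # 4*4 conv, stride 2: size halves
        s //= 2
        down.append(s)
    up = []
    for _ in range(steps):   # 4*4 transposed conv, stride 2: size doubles
        s *= 2
        up.append(s)
    return down, up

down, up = unet_feature_sizes()
print(down)  # [128, 64, 32, 16, 8, 4, 2, 1]
print(up)    # [2, 4, 8, 16, 32, 64, 128, 256]
```

The last upsampling size, 256, is the 256*256 output named in the text; the earlier seven sizes match the listed feature dimensions.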
The discriminator is a Markovian discriminator (PatchGAN). It splits the input image into N*N patches, judges each patch, and outputs the average of the per-patch decisions. Multi-scale discrimination is achieved by changing the size of the input image while keeping the discriminator's patch parameter N unchanged; the discriminator's result is the average of the results of two discriminators at different scales. Training is optimized with the Adam optimizer.
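The PatchGAN decision rule just described — average the per-patch judgments, then average the two scales — can be sketched as follows; the patch scores are dummy numbers standing in for the convolutional discriminator's outputs:

```python
def patchgan_score(patch_scores):
    """Average the per-patch real/fake scores of one discriminator."""
    flat = [s for row in patch_scores for s in row]
    return sum(flat) / len(flat)

def multiscale_score(scores_scale1, scores_scale2):
    """Average the outputs of two fixed-N discriminators, each fed a
    differently sized version of the same image."""
    return (patchgan_score(scores_scale1) + patchgan_score(scores_scale2)) / 2

# 2x2 grids of dummy patch scores at two input scales
s1 = [[0.8, 0.6], [0.7, 0.9]]   # full-resolution input
s2 = [[0.5, 0.7], [0.6, 0.6]]   # downscaled input
print(multiscale_score(s1, s2))  # ≈ 0.675
```

Because N stays fixed while the input shrinks, each patch of the downscaled image covers a larger portion of the face, which is how the second scale captures more global structure.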
Further, in step (4) the neural network is pre-trained on the virtual face dataset as follows. The network built in step (3) is first pre-trained on the virtual face dataset to obtain the parameters of the pre-trained model's generator and discriminator. The adversarial training works as follows: a highlighted image a from the training set is fed into the generator G to obtain the generated image G(a); G(a) together with the corresponding highlighted image a is fed into the discriminators D1 and D2 at different scales to obtain the outputs D(a, G(a)). Since the generator's early outputs are not realistic, the probability the discriminator assigns to the generated image being real approaches 0, i.e. its output tends to 0. The highlighted image a together with the corresponding highlight-free image a' from the training set is fed into D1 and D2 to obtain the outputs D(a, a'); since the discriminator regards this input as a real image, its output tends to 1. The generator's parameters are then updated to maximize the probability that the generated image is judged real, until that value exceeds 0.5, at which point the discriminator takes the generated image for a real one. Next, with the generator's parameters held fixed, the discriminator is trained to minimize the probability that the generated image is judged real, until that value falls below 0.5, at which point the discriminator regards the generated image as fake. The generator and discriminator are trained alternately in this way until the discriminator finds real and generated images indistinguishable, i.e. a Nash equilibrium is reached between them, and training is complete.
To ensure similarity between the generated image G(a) and the highlight-free training image a', the L1 distance between G(a) and a' is added as part of the loss. The overall objective is G* = arg min_G max_D L_GAN(G, D) + λ·L1(G), where the adversarial loss of the generator and discriminator is L_GAN = E_{a,a'}[log D(a, a')] + E_a[log(1 − D(a, G(a)))]; training the discriminator D maximizes L_GAN, and training the generator G minimizes it. The L1-norm loss is L1 = E_{a,a'}[||a' − G(a)||_1], where E denotes expectation and λ is a weighting coefficient.
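On toy numbers, the combined objective L_GAN + λ·L1 can be evaluated directly (all pixel values and discriminator outputs below are invented for illustration; λ = 100 follows the value given later in the description):

```python
import math

def gan_loss(d_real, d_fake):
    """L_GAN = E[log D(a, a')] + E[log(1 - D(a, G(a)))], on single samples."""
    return math.log(d_real) + math.log(1 - d_fake)

def l1_loss(generated, target):
    """Mean absolute difference between generated and target pixels."""
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(generated)

d_real, d_fake = 0.9, 0.2          # dummy discriminator outputs in (0, 1)
generated = [0.5, 0.4, 0.8]        # dummy generator output pixels
target    = [0.6, 0.4, 0.7]        # dummy ground-truth highlight-free pixels
lam = 100                          # L1 weight

total = gan_loss(d_real, d_fake) + lam * l1_loss(generated, target)
print(round(total, 4))             # 6.3382
```

With λ this large, the L1 term dominates early training, pushing G(a) toward the paired highlight-free image before the adversarial term sharpens the result.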
Further, in step (5), the pre-trained network's parameters are fine-tuned on the real face dataset to obtain the final highlight-removal network model, as follows. Building on step (4), the network is trained on the real face dataset, adjusting the generator's and discriminator's parameters so that the final highlight-removal model generates highlight-free face images indistinguishable from real faces. The adversarial training works as follows: a highlighted image a from the training set is fed into the generator G to obtain the generated image G(a); G(a) together with the corresponding highlighted image a is fed into the discriminators D1 and D2 at different scales to obtain the outputs D(a, G(a)). Since the generator's early outputs are not realistic, the probability the discriminator assigns to the generated image being real approaches 0, i.e. its output tends to 0. The highlighted image a together with the corresponding highlight-free image a' is fed into D1 and D2 to obtain the outputs D(a, a'); since the discriminator regards this input as a real image, its output tends to 1. The generator's parameters are then updated to maximize the probability that the generated image is judged real, until that value exceeds 0.5, at which point the discriminator takes the generated image for a real one. Next, with the generator's parameters held fixed, the discriminator is trained to minimize the probability that the generated image is judged real, until that value falls below 0.5, at which point the discriminator regards the generated image as fake. The generator and discriminator are trained alternately in this way until the discriminator finds real and generated images indistinguishable, i.e. a Nash equilibrium is reached between them, and training is complete.
To ensure similarity between the generated image G(a) and the highlight-free training image a', the L1 distance between G(a) and a' is added as part of the loss. The overall objective is G* = arg min_G max_D L_GAN(G, D) + λ·L1(G), where the adversarial loss of the generator and discriminator is L_GAN = E_{a,a'}[log D(a, a')] + E_a[log(1 − D(a, G(a)))]; training the discriminator D maximizes L_GAN, and training the generator G minimizes it. The L1-norm loss is L1 = E_{a,a'}[||a' − G(a)||_1], where E denotes expectation and λ is a weighting coefficient.
Beneficial effects: compared with the prior art, the technical solution of the present invention has the following beneficial technical effects:
(1) The invention removes facial highlights with an end-to-end neural network; it is not restricted to a fixed physical illumination model and produces more realistic highlight-removal results;
(2) Compared with highlight removal on other materials, the invention adapts well to a material as complex as skin;
(3) The invention not only restores the original skin color of the highlighted region, but also recovers, to a degree, spots, patches of pigment, and similar features in that region;
(4) Other features and advantages of the invention are set forth in the description that follows, and in part become apparent from the description or are learned by practicing the invention. The purposes and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the drawings.
Brief Description of the Drawings
FIG. 1 is a flow chart of the facial highlight-removal method based on an adversarial neural network implemented by the present invention;
FIG. 2 is a flow chart of training the adversarial neural network in the present invention;
FIG. 3 is a schematic diagram of the structure of the adversarial-network generator built by the present invention;
FIG. 4 is a schematic diagram of the highlight-removal effect of the present invention.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the drawings and examples, so that the process by which the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented. It should be noted that, as long as no conflict arises, the embodiments of the invention and the features within them may be combined with one another, and all resulting technical solutions fall within the scope of protection of the invention.
In addition, the steps shown in the flow charts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flow charts, in some cases the steps shown or described may be executed in an order different from that given here.
FIG. 1 is a flow chart of the facial highlight-removal method based on an adversarial neural network implemented by the present invention; each step is described in detail below with reference to FIG. 1.
Step S110: since no public dataset of highlighted face images exists, and collecting large numbers of paired real highlighted face images is time-consuming, labor-intensive, and hard to achieve, computer-graphics methods are used to add highlights to face images, yielding paired virtual highlighted and highlight-free face images and building the virtual face dataset.
Face images are collected from the Internet; HeadShot generates the corresponding three-dimensional face, light sources are added, different reflectivity levels are assigned to the face material, and different highlighted images of the face under different expressions are obtained through the preset expressions of the 3D face model. The virtual face dataset includes men and women of Black, White, and Asian ethnicity, ensuring the comprehensiveness of the dataset and providing a basis for the trained network to generalize across populations.
Step S120: faces generated by computer graphics look somewhat artificial; to compensate, corresponding images of real highlighted and highlight-free faces are needed. To this end, the subject stays still in a fixed position while images are collected; real face images with different highlights are captured under different lighting intensities (with the light-source angle fixed), and a polarizing lens mounted at a fixed angle then removes the highlights, giving a highlight-free version of the same face. A smaller real face dataset is built in this way.
Step S130: build the adversarial neural network, consisting of a generator and a discriminator. Subsequent training yields the generator's and discriminator's parameters, so that the generator produces natural highlight-free images and the discriminator accepts the generated images as genuine, natural highlight-free images.
Step S140: pre-train the network built above on the virtual face dataset to obtain the parameters of the pre-trained model's generator and discriminator.
Step S150: starting from the pre-trained model, train on the real face dataset, adjusting the generator's and discriminator's parameters so that the final highlight-removal model generates highlight-free face images as natural as real ones. Because the real face dataset is small, the data are augmented by image flipping, increasing the data volume and improving the trained network's performance. The generator and discriminator parameters from the final round of training are saved.
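The flip-based augmentation mentioned above amounts to mirroring each image left-to-right, applied to both members of a pair so they stay aligned — a minimal sketch on toy nested-list "images":

```python
def hflip(image):
    """Mirror an image (rows of pixels) left-to-right."""
    return [row[::-1] for row in image]

lit   = [[1, 2, 3], [4, 5, 6]]     # toy highlighted image
matte = [[7, 8, 9], [0, 1, 2]]     # its paired highlight-free image

# Flip both members of the pair so the pixel correspondence is preserved.
augmented_pair = (hflip(lit), hflip(matte))
print(augmented_pair[0])  # [[3, 2, 1], [6, 5, 4]]
```

Flipping the highlighted image without flipping its paired target would break the pixel-wise L1 supervision, which is why the pair is transformed together.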
Step S160: input a highlighted face image into the highlight-removal network model obtained in the previous step to obtain the corresponding highlight-free face image.
FIG. 2 is a flow chart of training the adversarial neural network in the present invention, divided into two stages, pre-training and parameter fine-tuning. The adversarial network has two parts, a generator and a discriminator. The generator uses a U-net structure with 15 layers in total; the features of layer i are connected to those of layer 15-i, so that corresponding layers share low-level information, improving the output. The widely used Markovian discriminator (PatchGAN) serves as the discriminator. Because a single discriminator has a limited receptive field, the invention applies the discriminator at multiple scales to capture detail and global information separately: by resizing the input image and feeding it into discriminators of identical structure, details of the image are extracted at different scales. The discriminator loss is the sum of the losses at all scales.
Both the virtual and the real face datasets consist of pairs. To focus the network's attention on the differences in the highlighted regions, each image pair is joined together as the input; the network can then automatically learn the residual between the images and thus learn features of the highlighted regions effectively. The overall loss function is the adversarial loss of the generator and discriminator plus the L1-norm loss between the generator's output and the corresponding highlight-free image in the dataset. Training is optimized with the Adam optimizer.
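Joining a pair of images as a single network input, as described above, is typically a concatenation along the channel axis; a minimal NumPy sketch of the shapes involved (the network itself is not shown):

```python
import numpy as np

lit   = np.zeros((256, 256, 3))   # highlighted input image (H, W, C)
matte = np.ones((256, 256, 3))    # paired highlight-free image

# Concatenate along the channel axis: the network sees both images at once
# and can focus on their residual, i.e. the highlighted regions.
pair = np.concatenate([lit, matte], axis=-1)
print(pair.shape)  # (256, 256, 6)
```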
The training procedure is: first pre-train on the virtual face dataset to obtain approximate generator and discriminator parameters, then train on the real face dataset, fine-tuning both against real face images so that the generator's outputs become more realistic and natural and the discriminator judges image realism and highlight removal more accurately. The adversarial training works as follows: a highlighted image a from the training set is fed into the generator G to obtain the generated image G(a); G(a) together with the corresponding highlighted image a is fed into the discriminators D1 and D2 at different scales to obtain the outputs D(a, G(a)). Since the generator's early outputs are barely realistic, the probability the discriminator assigns to the generated image being real approaches 0, i.e. its output tends to 0; the highlighted image a together with the corresponding highlight-free image a' is fed into D1 and D2 to obtain the outputs D(a, a'), and since the discriminator regards this input as real, its output tends to 1. The generator's parameters are updated to maximize the probability that the generated image is judged real, until that value exceeds 0.5, at which point the discriminator takes the generated image for real; then, with the generator's parameters fixed, the discriminator is trained to minimize that probability, until it falls below 0.5, at which point the discriminator regards the generated image as fake. Generator and discriminator are trained alternately in this way until the discriminator can no longer tell whether the generator's images are real (it finds real and generated images indistinguishable), i.e. they reach a Nash equilibrium, and training is complete. To ensure similarity between G(a) and the highlight-free training image a', the L1 distance between G(a) and a' is added as part of the loss. The overall objective is G* = arg min_G max_D L_GAN(G, D) + λ·L1(G), where L_GAN = E_{a,a'}[log D(a, a')] + E_a[log(1 − D(a, G(a)))]; training the discriminator D maximizes L_GAN, and training the generator G minimizes it. The L1-norm loss is L1 = E_{a,a'}[||a' − G(a)||_1], E denotes expectation, and λ is a coefficient, set to 100 during training.
图3为本发明搭建的对抗神经网络生成器的结构示意图,是在简单的Encoder-Decoder结构的基础上添加了维度拼接。生成器采用U-net的结构,总层数为15,将第i层特征与第15-i层特征之间连接起来(其中i∈[1,7]),对应层共享底层信息,从而优化输出结果。生成器输入图像的大小为256*256,经过卷积(卷积核大小为4*4)、批量归一化和ReLu激活,实现下采样,经过8次上述的下采样,每次采样后得到的特征的维度依次为:128*128、64*64、32*32、16*16、8*8*、4*4、2*2和1*1;再进行激活、反卷积(卷积核大小为4*4)、批量归一化,实现上采样,经过8次上述的上采样,每次采样后得到维度依次为:2*2、4*4、8*8、16*16、32*32、64*64、128*128的特征和大小为256*256的输出。Figure 3 is a schematic diagram of the structure of the adversarial neural network generator built by the present invention, which adds dimension splicing on the basis of a simple Encoder-Decoder structure. The generator adopts the U-net structure with a total of 15 layers, connecting the features of the i-th layer with the features of the 15-i-th layer (where i∈[1,7]), and the corresponding layers share the underlying information, thereby optimizing the output results. The size of the generator input image is 256*256. After convolution (the convolution kernel size is 4*4), batch normalization and ReLu activation, downsampling is achieved. After 8 times of the above downsampling, the dimensions of the features obtained after each sampling are: 128*128, 64*64, 32*32, 16*16, 8*8*, 4*4, 2*2 and 1*1; then activation, deconvolution (the convolution kernel size is 4*4), batch normalization are performed to achieve upsampling. After 8 times of the above upsampling, the dimensions of the features obtained after each sampling are: 2*2, 4*4, 8*8, 16*16, 32*32, 64*64, 128*128 and an output size of 256*256.
FIG. 4 is a diagram showing the light removal effect of the present invention.
Although the embodiments of the present invention are disclosed as above, they are described only to facilitate understanding of the present invention and are not intended to limit it. Any person skilled in the technical field to which the present invention belongs may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed herein, but the patent protection scope of the present invention shall still be subject to the scope defined by the appended claims.
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010115474.6A CN111275651B (en) | 2020-02-25 | 2020-02-25 | Face bright removal method based on antagonistic neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111275651A CN111275651A (en) | 2020-06-12 |
CN111275651B true CN111275651B (en) | 2023-05-12 |
Family
ID=71002283
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010115474.6A Active CN111275651B (en) | 2020-02-25 | 2020-02-25 | Face bright removal method based on antagonistic neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111275651B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111951373B (en) * | 2020-06-30 | 2024-02-13 | 重庆灵翎互娱科技有限公司 | Face image processing method and equipment |
CN112598598B (en) * | 2020-12-25 | 2023-11-28 | 南京信息工程大学滨江学院 | Image reflected light removing method based on two-stage reflected light eliminating network |
CN112669284A (en) * | 2020-12-29 | 2021-04-16 | 天津大学 | Method for realizing pulmonary nodule detection by generating confrontation network |
CN114565509B (en) * | 2022-02-08 | 2025-02-28 | 厦门美图之家科技有限公司 | A method, device and equipment for removing facial oiliness based on CNN |
CN115018996A (en) * | 2022-08-08 | 2022-09-06 | 海马云(天津)信息技术有限公司 | Method and device for generating 3D face model according to real person face photo |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108205659A (en) * | 2017-11-30 | 2018-06-26 | 深圳市深网视界科技有限公司 | Face occluder removes and its method, equipment and the medium of model construction |
CN108596062B (en) * | 2018-04-12 | 2021-04-06 | 清华大学 | Real-time highlight removal method and device for face images based on deep learning |
CN109308450A (en) * | 2018-08-08 | 2019-02-05 | 杰创智能科技股份有限公司 | A kind of face's variation prediction method based on generation confrontation network |
CN109711364A (en) * | 2018-12-29 | 2019-05-03 | 成都视观天下科技有限公司 | A kind of facial image super-resolution reconstruction method, device and computer equipment |
CN109815893B (en) * | 2019-01-23 | 2021-03-26 | 中山大学 | Color face image illumination domain normalization method based on cyclic generation countermeasure network |
CN110728628B (en) * | 2019-08-30 | 2022-06-17 | 南京航空航天大学 | A face de-occlusion method based on conditional generative adversarial network |
-
2020
- 2020-02-25 CN CN202010115474.6A patent/CN111275651B/en active Active
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||