WO2023236594A1 - Face beauty prediction method and apparatus, electronic device, and storage medium - Google Patents

Face beauty prediction method and apparatus, electronic device, and storage medium Download PDF

Info

Publication number
WO2023236594A1
WO2023236594A1 · PCT/CN2023/078761 · CN2023078761W
Authority
WO
WIPO (PCT)
Prior art keywords
network
probability
face
beauty prediction
task
Prior art date
Application number
PCT/CN2023/078761
Other languages
English (en)
French (fr)
Inventor
甘俊英
谢小山
何国辉
Original Assignee
五邑大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 五邑大学 filed Critical 五邑大学
Publication of WO2023236594A1 publication Critical patent/WO2023236594A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Definitions

  • The present disclosure relates to the field of neural network technology, and in particular to a face beauty prediction method, system, and storage medium based on a generative adversarial network.
  • Face beauty prediction is a cutting-edge topic in the fields of machine learning and computer vision. It mainly studies how to give computers a human-like ability to judge the beauty of faces. However, current research in this area suffers from a lack of large-scale face databases for the supervised training of neural networks, so supervision information is insufficient and models are prone to overfitting.
  • the main purpose of the embodiments of the present disclosure is to propose a face beauty prediction method and apparatus, an electronic device, and a computer-readable storage medium, which can effectively solve the problems of insufficient supervision information and model overfitting in face beauty prediction research.
  • a first aspect of the embodiments of the present disclosure proposes a face beauty prediction method, which method includes:
  • the pseudo-face image and the original image are judged to obtain a first probability and a second probability, wherein the first probability represents the probability that the pseudo-face image is judged to be a real image, and the second probability represents the probability that the original image is judged to be a real image;
  • a training set is generated through an optimized generative adversarial network; wherein the training set includes a plurality of training samples, and the training samples include labels that reflect the facial beauty level of the training samples;
  • the training set is input into the face beauty prediction task network and the face beauty prediction task network is trained to obtain a trained first task network.
  • the generative adversarial network includes a generation module and the decision module, and optimizing the generative adversarial network includes:
  • the generation module is updated according to the expression \( \nabla_{\theta_g}\,\frac{1}{m}\sum_{i=1}^{m}\log\big(1-D(G(z^{(i)}))\big) \), i.e., by descending this stochastic gradient;
  • the decision module is updated according to the expression \( \nabla_{\theta_d}\,\frac{1}{m}\sum_{i=1}^{m}\big[\log D(x^{(i)})+\log\big(1-D(G(z^{(i)}))\big)\big] \), i.e., by ascending this stochastic gradient, where D represents the decision module, G represents the generation module, \( \nabla_{\theta_g} \) represents the stochastic gradient of the generation module, \( \nabla_{\theta_d} \) represents the stochastic gradient of the decision module, m is the number of samples in a mini-batch, x^{(i)} represents the i-th sample of the original images, and z^{(i)} represents the i-th sample of the face pseudo-images.
  • inputting the training set into the face beauty prediction task network and training the face beauty task network includes:
  • each dimension of the multi-dimensional labels is used to supervise the corresponding first sub-task network, and the total number of dimensions of the multi-dimensional labels is equal to the total number of first sub-task networks;
  • Supervised learning is performed on a plurality of first sub-task networks through the multi-dimensional labels to obtain a plurality of trained second sub-task networks.
  • the supervised learning of multiple first sub-task networks through the multi-dimensional labels includes:
  • the method further includes:
  • if the first multi-dimensional vector corresponds to the second multi-dimensional vector, the first multi-dimensional vector is correct;
  • the first multi-dimensional vector is corrected according to a plurality of the first output results.
  • modifying the first multi-dimensional vector according to a plurality of the first output results includes:
  • the preset rule is: modify the first output result based on the criteria that only a minimum number of first output results need to be modified and the confidence level of the modified first output result is the lowest.
  • inputting the training set into the face beauty prediction task network and training the face beauty task network includes:
  • the parameters of the first subtask network are cyclically optimized using a backpropagation algorithm.
  • the second aspect of the embodiment of the present disclosure proposes a facial beauty prediction device, the device includes:
  • a generation module, used to generate face pseudo-images based on Gaussian noise;
  • the judgment module is used to judge the fake face image and the original image to obtain the first probability and the second probability;
  • a generative adversarial network optimization module configured to optimize the generative adversarial network when the difference between the first probability and the second probability is greater than a preset threshold
  • a training set generation module used to generate a training set through an optimized generative adversarial network
  • the training module is used to input the training set into the face beauty prediction task network and train the face beauty prediction task network to obtain the trained first task network;
  • a third aspect of the embodiments of the present disclosure proposes an electronic device, which includes a memory, a processor, a program stored in the memory and executable on the processor, and a data bus for implementing connection and communication between the processor and the memory.
  • the program is run by the processor, the face beauty prediction method as described in any one of the embodiments of the first aspect of the present application is implemented.
  • a fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium for computer-readable storage, wherein the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the face beauty prediction method described in any one of the embodiments of the first aspect above.
  • the face beauty prediction method and apparatus, electronic device, and computer-readable storage medium proposed by the embodiments of the present disclosure obtain the original image and Gaussian noise; generate a face pseudo-image based on the Gaussian noise; and judge the face pseudo-image and the original image.
  • the generative adversarial network can output face pseudo-images that are extremely similar to real face images, and the output face pseudo-images are assembled into a training set.
  • the face beauty prediction task network is trained on this training set, which solves the problems of insufficient supervision information and model overfitting in face beauty prediction research.
  • Figure 1 is a flow chart of a face beauty prediction method provided by an embodiment of the present disclosure
  • FIG. 2 is a flow chart of step S400 in Figure 1;
  • FIG. 3 is a flow chart of step S300 in Figure 1;
  • FIG. 4 is a flow chart of step S330 in Figure 1;
  • Figure 5 is a module structure block diagram of a face beauty prediction device provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure.
  • Embodiments of the present disclosure may be used in a variety of general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics devices, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and the like.
  • the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • the present application may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote storage media including storage devices.
  • the face beauty prediction method according to the first aspect of the embodiment of the present disclosure includes but is not limited to step S100 to step S600.
  • Step S100 obtain the original image and Gaussian noise
  • the original image and Gaussian noise are obtained.
  • the Gaussian noise and original image here may be pre-stored inside the system, or may be input externally.
  • the original image refers to a real face image obtained through photography equipment or other methods.
  • Step S200 generate a pseudo face image based on Gaussian noise
  • a pseudo-face image is generated based on Gaussian noise.
  • After receiving the Gaussian noise, the generator of the generative adversarial network generates a pseudo-face image based on the Gaussian noise.
  • Step S300 Determine the pseudo face image and the original image to obtain the first probability and the second probability
  • In step S300 of some embodiments, the pseudo-face image and the original image are judged to obtain the first probability and the second probability, wherein the first probability represents the probability that the pseudo-face image is judged to be a real image, and the second probability represents the probability that the original image is judged to be a real image; the pseudo-face image generated by the generator of the generative adversarial network and the original image are fed to the judger.
  • After receiving an image, the judger determines the source of the image, i.e., the probability that the image is a pseudo-image generated by the generator rather than a real face image obtained by photography or other means.
  • When the image generated by the generator is clearly unrealistic, the judger will conclude that the probability that the image is a real face image is close to 0.
  • When the image generated by the generator is very realistic and indistinguishable from a real face image, the judger cannot identify its source and can only guess blindly.
  • In that case, the probability that the pseudo-face image generated by the generator is judged to be a real face image will be close to 50%.
  • Step S400 When the difference between the first probability and the second probability is greater than the preset threshold, optimize the generative adversarial network
  • In step S400 of some embodiments, when the difference between the first probability and the second probability is greater than a preset threshold, the generative adversarial network is optimized. The preset threshold is a very small value (e.g., 0.1%). When the difference between the first probability and the second probability output by the judger is greater than the preset threshold, the judger can readily distinguish the pseudo-face image generated by the generator from a real face image. This shows that the images generated by the generator are not realistic enough to fool the judger; therefore, the generator should be optimized to improve the quality of the pseudo-face images it generates.
  • The judger also needs to be optimized so that it can better distinguish whether an image is a pseudo-face image generated by the generator or a real face image.
  • This continues until the probability that the pseudo-face image generated by the generator is judged to be a real face image is very close to, or even equal to, the probability that the original image is judged to be a real face image.
  • At that point the judger cannot distinguish whether the pseudo-face image is a real face image; that is, the pseudo-face image generated by the generator is realistic enough to pass for a real one.
  • a trained generator is obtained, through which a large number of realistic face images can be generated.
  • the face images are used as face data to form a database.
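The optimization criterion described above (keep optimizing while the judger can still tell pseudo from real) can be sketched as follows. This is an illustrative sketch: the function name is my own, and the 0.1% default merely echoes the example threshold mentioned in the text.

```python
def needs_optimization(first_prob: float, second_prob: float,
                       threshold: float = 0.001) -> bool:
    """Return True while the judger still separates pseudo from real images.

    first_prob:  probability that the pseudo-face image is judged real.
    second_prob: probability that the original image is judged real.
    threshold:   preset threshold (e.g. 0.1%, as in the text).
    """
    return abs(first_prob - second_prob) > threshold

# A fully fooled judger outputs ~50% for both, so the difference
# falls below the threshold and training can stop.
```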
  • Step S500 generate a training set through the optimized generative adversarial network
  • a training set is generated through the optimized generative adversarial network; in the above steps, through the continuous game between the generator and the judger of the generative adversarial network, an optimized generative adversarial network has been obtained whose generator can generate images very close to real faces.
  • a series of face images can be generated through the generative adversarial network, and these face images can be assembled into a set, that is, a training set.
  • the training set includes multiple training samples, and the training samples include labels reflecting the face beauty level of the training samples;
  • Step S600 Input the training set to the face beauty prediction task network and train the face beauty prediction task network to obtain the trained first task network.
  • the training set is input into the face beauty prediction task network and the face beauty prediction task network is trained to obtain the trained first task network.
  • the face beauty prediction task network can be a convolutional neural network (CNN).
  • the training set generated in step S500, which includes a large number of face images and the face beauty level labels corresponding to those images, is used as input to perform supervised training on the CNN, so as to obtain a well-trained neural network for completing the face beauty prediction task.
  • the generative adversarial network includes a generation module and a decision module. As shown in Figure 2, step S400 includes but is not limited to step S210 to step S220.
  • Step S210 descend the stochastic gradient of the generation module to update the generation module
  • In step S210 of some embodiments, the stochastic gradient of the generation module is descended to update the generation module; specifically, the generation module is updated according to the expression \( \nabla_{\theta_g}\,\frac{1}{m}\sum_{i=1}^{m}\log\big(1-D(G(z^{(i)}))\big) \), where D represents the decision module, G represents the generation module, \( \nabla_{\theta_g} \) represents the stochastic gradient of the generation module, m is the number of samples in a mini-batch, and z^{(i)} represents the i-th sample of the face pseudo-images.
  • Step S220 ascend the stochastic gradient of the decision module to update the decision module
  • In step S220 of some embodiments, the stochastic gradient of the decision module is ascended to update the decision module; specifically, the decision module is updated according to the expression \( \nabla_{\theta_d}\,\frac{1}{m}\sum_{i=1}^{m}\big[\log D(x^{(i)})+\log\big(1-D(G(z^{(i)}))\big)\big] \), where D represents the decision module, G represents the generation module, \( \nabla_{\theta_d} \) represents the stochastic gradient of the decision module, x^{(i)} represents the i-th sample of the original images, and z^{(i)} represents the i-th sample of the face pseudo-images.
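The two update expressions above can be sketched as plain functions of the judger's scalar outputs. This is an illustrative sketch, not the patent's implementation: the function names are my own, and the networks D and G are replaced by lists of their output probabilities.

```python
import math

def decision_objective(d_real, d_fake):
    # (1/m) * sum[ log D(x_i) + log(1 - D(G(z_i))) ]
    # ascended when updating the decision module
    m = len(d_real)
    return sum(math.log(r) + math.log(1.0 - f) for r, f in zip(d_real, d_fake)) / m

def generation_objective(d_fake):
    # (1/m) * sum[ log(1 - D(G(z_i))) ]
    # descended when updating the generation module
    return sum(math.log(1.0 - f) for f in d_fake) / len(d_fake)
```

As D(G(z)) rises toward 1 (the pseudo-images look more real to the judger), the generation objective decreases, which matches the descent direction of step S210; a judger that separates real from fake well makes the decision objective large, which matches the ascent of step S220.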
  • step S600 includes but is not limited to step S310 to step S330.
  • Step S310 decompose the face beauty prediction task into multiple binary classification subtasks, and generate multiple first subtask networks corresponding to each binary classification subtask;
  • the face beauty prediction task is decomposed into multiple binary classification subtasks, and multiple first subtask networks are generated corresponding to each binary classification subtask.
  • In this way, single-task data can be used for multi-task learning.
  • Step S320 Generate multi-dimensional labels based on the facial beauty level labels of the training samples
  • a multi-dimensional label is generated according to the face beauty level label of the training sample, where the dimensions of the multi-dimensional label correspond one-to-one to the first sub-task networks, each dimension of the multi-dimensional label is used to supervise the corresponding first sub-task network, and the total number of dimensions of the multi-dimensional label is equal to the total number of first sub-task networks;
  • Step S330 Perform supervised learning on multiple first sub-task networks through multi-dimensional labels to obtain multiple trained second sub-task networks.
  • In step S330 of some embodiments, supervised learning is performed on the multiple first sub-task networks through the multi-dimensional labels to obtain multiple trained second sub-task networks, and each dimension of the multi-dimensional labels is used to supervise the corresponding sub-task network.
  • Specifically, it is judged whether the output result of each first sub-task network is equal to the corresponding dimension of the multi-dimensional label, and the parameters of the first sub-task networks are cyclically optimized using the backpropagation algorithm.
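The label construction of steps S320 to S330 can be illustrated with a small sketch. The patent does not specify how the multi-dimensional label is derived from the beauty level label, so the one-dimension-per-level (one-vs-rest) encoding below is an assumption; it only respects the stated constraint that the number of label dimensions equals the number of first sub-task networks.

```python
def beauty_level_to_label(level: int, num_levels: int) -> list[int]:
    """One binary dimension per sub-task network (an illustrative
    one-vs-rest encoding; the patent only requires that
    #dimensions == #sub-task networks)."""
    if not 0 <= level < num_levels:
        raise ValueError("level out of range")
    return [1 if k == level else 0 for k in range(num_levels)]

def dimension_supervises(label: list[int], outputs: list[int]) -> list[bool]:
    """Compare each sub-task output with its corresponding label dimension,
    as done when supervising each first sub-task network."""
    return [o == t for o, t in zip(outputs, label)]
```

A sample with beauty level 1 out of 3 levels would get the label [0, 1, 0], and each of the 3 sub-task networks is supervised by exactly one of those dimensions.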
  • the steps after step S330 include but are not limited to steps S410 to S440.
  • Step S410 integrate the first output results of the multiple trained second sub-task networks into a first multi-dimensional vector
  • step S410 of some embodiments the first output results of the multiple trained second sub-task networks are integrated into a first multi-dimensional vector.
  • the face beauty prediction task is decomposed into multiple binary classification subtasks.
  • each subtask network outputs one result.
  • Integrating these results yields a multi-dimensional vector. For example, if there are 3 subtask networks and their output results are 1, 1, and 0 respectively, the multi-dimensional vector [1,1,0] is obtained.
  • Step S420 Compare the first multi-dimensional vector with the second multi-dimensional vector to determine whether the first multi-dimensional vector is in error
  • step S420 of some embodiments the first multidimensional vector is compared with the second multidimensional vector to determine whether the first multidimensional vector is in error.
  • the output results of the subtask network are integrated to obtain the first multidimensional vector.
  • Step S430 if the first multi-dimensional vector corresponds to the second multi-dimensional vector, then the first multi-dimensional vector is correct;
  • Step S440 If the first multi-dimensional vector does not correspond to the second multi-dimensional vector, correct the first multi-dimensional vector according to the plurality of first output results.
  • step S440 of some embodiments if the first multidimensional vector does not correspond to the second multidimensional vector, the first multidimensional vector is corrected according to the plurality of first output results.
  • the first output result is modified according to the preset rules to correct the first multi-dimensional vector.
  • the preset rule is: modify the first output results on the criteria that as few first output results as possible are modified and the modified first output results have the lowest confidence.
  • Since the first output results are all Boolean elements, a correction simply changes a 0 to 1 or a 1 to 0.
  • For example, after comparing the first multi-dimensional vector [0,0,0] with the second multi-dimensional vector, it may turn out that correcting either the first item or the second item alone (modifying only one item) makes the vector conform to the second multi-dimensional vector. In this case, the confidence levels of the output results of the sub-task networks corresponding to the first and second items should be compared, and the output result with the lower confidence level corrected.
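The preset correction rule can be sketched as a search over the valid label vectors. This is an illustrative implementation: the argument names, and the idea of representing the second multi-dimensional vectors as an explicit list `valid_vectors`, are my own assumptions, not details from the patent.

```python
def correct_vector(first, confidences, valid_vectors):
    """Correct the first multi-dimensional vector per the preset rule:
    flip as few Boolean outputs as possible, and among ties prefer
    flipping the outputs with the lowest confidence."""
    def cost(candidate):
        flipped = [i for i, (a, b) in enumerate(zip(first, candidate)) if a != b]
        # primary key: number of flips; secondary key: total confidence
        # of the flipped outputs (lower confidence is cheaper to flip)
        return (len(flipped), sum(confidences[i] for i in flipped))
    return min(valid_vectors, key=cost)
```

For the example in the text, with first vector [0,0,0], valid vectors [1,0,0] and [0,1,0], and sub-task confidences 0.9 and 0.6 on the first two items, both corrections need one flip, so the lower-confidence second item is flipped and [0,1,0] is returned.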
  • the face beauty prediction method proposed in the embodiments of the present disclosure obtains the original image and Gaussian noise; generates a pseudo-face image based on the Gaussian noise; and judges the pseudo-face image and the original image to obtain the first probability and the second probability, where the first probability represents the probability that the pseudo-face image is judged to be a real image and the second probability represents the probability that the original image is judged to be a real image. When the difference between the first probability and the second probability is greater than the preset threshold, the generative adversarial network is optimized; a training set is generated through the optimized generative adversarial network, where the training set includes multiple training samples and the training samples include labels reflecting their face beauty levels; and the training set is input into the face beauty prediction task network, which is trained to obtain the trained first task network.
  • the generative adversarial network can generate highly realistic face images, and the training set generated by the generative adversarial network is input into the face beauty prediction task network to train it, thus solving the problem in face beauty prediction research that the lack of a large-scale face database for the supervised training of neural networks leads to insufficient supervision information and models that easily overfit.
  • Embodiments of the present disclosure also provide a face beauty prediction device, as shown in Figure 5, which can implement the above face beauty prediction method.
  • the face beauty prediction device includes: an acquisition module 510, used to acquire the original image and Gaussian noise;
  • the generation module 520 is used to generate a fake face image based on Gaussian noise;
  • the decision module 530 is used to judge the fake face image and the original image to obtain the first probability and the second probability;
  • the generative adversarial network optimization module 540 is used to optimize the generative adversarial network when the difference between the first probability and the second probability is greater than the preset threshold;
  • the training set generation module 550 is used to generate a training set through the optimized generative adversarial network;
  • the training module 560 is used to input the training set into the face beauty prediction task network and train the face beauty prediction task network to obtain the trained first task network.
  • the face beauty prediction device in the embodiment of the present disclosure is used to execute the face beauty prediction method in the above embodiment. Its specific processing process is the same as the face beauty prediction method in the above embodiment, and will not be described again here.
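The cooperation of modules 510 to 560 can be sketched as a simple composition. This is an illustrative stub only: the class name, method names, and the callable-per-module design are my own, and the actual networks are replaced by placeholder callables.

```python
class FaceBeautyPredictionDevice:
    """Composition of the modules 510-560 described above (illustrative stub)."""

    def __init__(self, acquire, generate, judge, optimize, make_training_set, train):
        # each argument is a callable standing in for one module
        self.acquire, self.generate, self.judge = acquire, generate, judge
        self.optimize, self.make_training_set, self.train = optimize, make_training_set, train

    def run(self, threshold: float):
        original, noise = self.acquire()          # module 510
        pseudo = self.generate(noise)             # module 520
        p1, p2 = self.judge(pseudo, original)     # module 530
        while abs(p1 - p2) > threshold:           # module 540: optimize until the gap closes
            self.optimize()
            pseudo = self.generate(noise)
            p1, p2 = self.judge(pseudo, original)
        # module 550 builds the training set; module 560 trains the task network
        return self.train(self.make_training_set())
```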
  • An embodiment of the present disclosure also provides an electronic device 600, including:
  • at least one processor; and
  • a memory communicatively connected to at least one processor; wherein,
  • the memory stores instructions, and the instructions are executed by at least one processor, so that when the at least one processor executes the instructions, the method as in any one of the embodiments of the first aspect of the present application is implemented.
  • the computer device includes: a processor 610, a memory 620, an input/output interface 630, a communication interface 640 and a bus 650.
  • the processor 610 can be implemented by a general central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, for executing relevant programs.
  • the memory 620 can be implemented in the form of read-only memory (Read Only Memory, ROM), static storage device, dynamic storage device, or random access memory (Random Access Memory, RAM).
  • the memory 620 can store operating systems and other application programs.
  • the relevant program codes are stored in the memory 620 and called by the processor 610 to execute the face beauty prediction method of the embodiments of the present disclosure.
  • the input/output interface 630 is used to implement information input and output;
  • the communication interface 640 is used to realize communication and interaction between this device and other devices, which can be achieved through wired means (such as USB, network cable, etc.) or wireless means (such as mobile network, WIFI, Bluetooth, etc.); and
  • Bus 650 which transmits information between various components of the device (such as processor 610, memory 620, input/output interface 630, and communication interface 640);
  • the processor 610, the memory 620, the input/output interface 630 and the communication interface 640 implement communication connections between each other within the device through the bus 650.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separate, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • “At least one (item)” refers to one or more, and “plurality” refers to two or more.
  • “And/or” is used to describe the relationship between associated objects, indicating that there can be three relationships. For example, “A and/or B” can mean: only A exists, only B exists, or A and B exist simultaneously, where A and B can be singular or plural. The character “/” generally indicates that the related objects are in an “or” relationship. “At least one of the following” or similar expressions refers to any combination of these items, including any combination of a single item or a plurality of items.
  • “At least one of a, b or c” can mean: a, b, c, “a and b”, “a and c”, “b and c”, or “a and b and c”, where a, b, and c can be single or multiple.
  • the disclosed devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of the present application, or the part that contributes to the prior art, can be embodied in whole or in part in the form of a software product.
  • the computer software product is stored in a storage medium.
  • the storage medium includes multiple instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned computer-readable storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, referred to as ROM), random access memory (Random Access Memory, referred to as RAM), magnetic disk or optical disk, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Image Analysis (AREA)

Abstract

A face beauty prediction method and device, an electronic device, and a storage medium, belonging to the technical field of neural networks. The method comprises: acquiring an original image and Gaussian noise; generating a pseudo face image from the Gaussian noise; performing decision on the pseudo face image and the original image to obtain a first probability and a second probability; when the difference between the first probability and the second probability is greater than a preset threshold, optimizing the generative adversarial network; generating a training set by means of the optimized generative adversarial network; and inputting the training set into a face beauty prediction task network and training the face beauty prediction task network to obtain a trained first task network. By optimizing the generative adversarial network so that it can generate realistic face images and construct a training set with which to train a neural network, the method solves the problem in face beauty prediction research that, owing to the lack of a large-scale face beauty database for supervised training of neural networks, supervision information is insufficient and models are prone to overfitting.

Description

Face beauty prediction method and device, electronic device, storage medium — Technical field
The present invention relates to the technical field of neural networks, and in particular to a face beauty prediction method, system, and storage medium based on a generative adversarial network.
Background art
Face beauty prediction is a frontier topic in machine learning and computer vision; it mainly studies how to give computers an ability to judge facial beauty similar to that of humans. At present, however, research in this area lacks a large-scale face database for supervised training of neural networks, so supervision information is insufficient and models are prone to overfitting.
Summary of the invention
The main purpose of the embodiments of the present disclosure is to propose a face beauty prediction method and device, an electronic device, and a computer-readable storage medium, which can effectively solve the problems of insufficient supervision information and model overfitting in face beauty prediction research.
To achieve the above purpose, a first aspect of the embodiments of the present disclosure proposes a face beauty prediction method, the method comprising:
acquiring an original image and Gaussian noise;
generating a pseudo face image from the Gaussian noise;
performing decision on the pseudo face image and the original image to obtain a first probability and a second probability, wherein the first probability represents the probability that the pseudo face image is judged to be a real image, and the second probability represents the probability that the original image is judged to be a real image;
when the difference between the first probability and the second probability is greater than a preset threshold, optimizing the generative adversarial network;
generating a training set by means of the optimized generative adversarial network, wherein the training set comprises a plurality of training samples, and each training sample comprises a label reflecting the face beauty grade of the training sample; and
inputting the training set into a face beauty prediction task network and training the face beauty prediction task network to obtain a trained first task network.
In some embodiments, the generative adversarial network comprises a generation module and a decision module, and optimizing the generative adversarial network comprises:
descending the stochastic gradient of the generation module to update the generation module; and
ascending the stochastic gradient of the decision module to update the decision module;
wherein the generation module is updated according to the expression $\nabla_{\theta_g}\frac{1}{m}\sum_{i=1}^{m}\log\left(1-D\left(G\left(z^{(i)}\right)\right)\right)$ and the decision module is updated according to the expression $\nabla_{\theta_d}\frac{1}{m}\sum_{i=1}^{m}\left[\log D\left(x^{(i)}\right)+\log\left(1-D\left(G\left(z^{(i)}\right)\right)\right)\right]$, where $D$ denotes the decision module, $G$ denotes the generation module, $\nabla_{\theta_g}$ denotes the stochastic gradient of the generation module, $\nabla_{\theta_d}$ denotes the stochastic gradient of the decision module, $x^{(i)}$ denotes the $i$-th sample of the original images, and $z^{(i)}$ denotes the $i$-th sample of the pseudo face images.
In some embodiments, inputting the training set into the face beauty prediction task network and training the face beauty task network comprises:
decomposing the face beauty prediction task into a plurality of binary classification subtasks, and generating a plurality of first subtask networks each corresponding to one binary classification subtask;
generating a multi-dimensional label from the face beauty grade label of the training sample, wherein each dimension of the multi-dimensional label is used to supervise the corresponding first subtask network, and the total number of dimensions of the multi-dimensional label is equal to the total number of the first subtask networks; and
performing supervised learning on the plurality of first subtask networks by means of the multi-dimensional label to obtain a plurality of trained second subtask networks.
In some embodiments, performing supervised learning on the plurality of first subtask networks by means of the multi-dimensional label comprises:
judging whether the output result of each first subtask network is equal to the corresponding dimension of the multi-dimensional label.
In some embodiments, after performing supervised learning on the plurality of first subtask networks by means of the multi-dimensional label to obtain the plurality of trained second subtask networks, the method further comprises:
integrating the first output results of the plurality of trained second subtask networks into a first multi-dimensional vector;
comparing the first multi-dimensional vector with a second multi-dimensional vector to judge whether the first multi-dimensional vector is erroneous;
if the first multi-dimensional vector corresponds to the second multi-dimensional vector, determining that the first multi-dimensional vector is correct; and
if the first multi-dimensional vector does not correspond to the second multi-dimensional vector, correcting the first multi-dimensional vector according to the plurality of first output results.
In some embodiments, correcting the first multi-dimensional vector according to the plurality of first output results comprises:
modifying the first output results according to a preset rule to correct the first multi-dimensional vector;
wherein the preset rule is: modify the first output results on the criteria that the fewest first output results need to be modified and that the modified first output results have the lowest confidence.
In some embodiments, inputting the training set into the face beauty prediction task network and training the face beauty task network comprises:
cyclically optimizing the parameters of the first subtask networks by using a back-propagation algorithm.
A second aspect of the embodiments of the present disclosure proposes a face beauty prediction device, the device comprising:
an acquisition module for acquiring an original image and Gaussian noise;
a generation module for generating a pseudo face image from the Gaussian noise;
a decision module for performing decision on the pseudo face image and the original image to obtain a first probability and a second probability;
a generative adversarial network optimization module for optimizing the generative adversarial network when the difference between the first probability and the second probability is greater than a preset threshold;
a training set generation module for generating a training set by means of the optimized generative adversarial network; and
a training module for inputting the training set into a face beauty prediction task network and training the face beauty prediction task network to obtain a trained first task network.
A third aspect of the embodiments of the present disclosure proposes an electronic device, the electronic device comprising a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for implementing connection and communication between the processor and the memory, wherein when the program is executed by the processor, the face beauty prediction method according to any one of the embodiments of the first aspect of the present application is implemented.
A fourth aspect of the embodiments of the present disclosure proposes a computer-readable storage medium for computer-readable storage, wherein the computer-readable storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the face beauty prediction method according to any one of the embodiments of the first aspect.
According to the face beauty prediction method and device, electronic device, and computer-readable storage medium proposed in the embodiments of the present disclosure, an original image and Gaussian noise are acquired; a pseudo face image is generated from the Gaussian noise; decision is performed on the pseudo face image and the original image to obtain a first probability and a second probability; when the difference between the first probability and the second probability is greater than a preset threshold, the generative adversarial network is optimized; a training set is generated by means of the optimized generative adversarial network; and the training set is input into a face beauty prediction task network, which is trained to obtain a trained first task network. By continuously optimizing the generative adversarial network, the network can output pseudo face images highly similar to real face images; the output pseudo face images are constructed into a training set, with which the face beauty prediction task network is trained, thereby solving the problems of insufficient supervision information and model overfitting in face beauty prediction research.
Brief description of the drawings
Fig. 1 is a flowchart of a face beauty prediction method provided by an embodiment of the present disclosure;
Fig. 2 is a flowchart of step S400 in Fig. 1;
Fig. 3 is a flowchart of step S300 in Fig. 1;
Fig. 4 is a flowchart of step S330 in Fig. 1;
Fig. 5 is a block diagram of the module structure of a face beauty prediction device provided by an embodiment of the present disclosure;
Fig. 6 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure.
Detailed description of the embodiments
In order to make the purpose, technical solutions, and advantages of the present invention clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
It should be noted that although functional modules are divided in the device schematic diagrams and a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed with a module division different from that in the device, or in an order different from that in the flowcharts. The terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
In addition, the described features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. In the following description, numerous specific details are provided to give a full understanding of the embodiments of the present disclosure. However, those skilled in the art will realize that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or other methods, components, devices, steps, and the like may be adopted. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the present disclosure.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically independent entities. That is, these functional entities may be implemented in software form, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are only exemplary illustrations; they do not necessarily include all contents and operations/steps, nor must they be executed in the described order. For example, some operations/steps may be decomposed, and some may be combined or partially combined, so the actual execution order may change according to the actual situation.
The embodiments of the present disclosure may be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and so on. The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform specific tasks or implement specific abstract data types. The present application may also be practiced in distributed computing environments in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in local and remote storage media including storage devices.
Referring to Fig. 1, the face beauty prediction method according to the embodiments of the first aspect of the present disclosure includes, but is not limited to, steps S100 to S600.
Step S100: acquire an original image and Gaussian noise.
In step S100 of some embodiments, an original image and Gaussian noise are acquired; the Gaussian noise and the original image here may be pre-stored inside the system or input from outside. The original image refers to a real face image acquired by photographic equipment or by other means.
Step S200: generate a pseudo face image from the Gaussian noise.
In step S200 of some embodiments, a pseudo face image is generated from the Gaussian noise: after receiving the Gaussian noise, the generator of the generative adversarial network generates a pseudo face image from it.
Step S300: perform decision on the pseudo face image and the original image to obtain a first probability and a second probability.
In step S300 of some embodiments, decision is performed on the pseudo face image and the original image to obtain a first probability and a second probability, wherein the first probability represents the probability that the pseudo face image is judged to be a real image, and the second probability represents the probability that the original image is judged to be a real image. The pseudo face image generated by the generator of the generative adversarial network is transmitted together with the original image to the discriminator; after receiving an image, the discriminator judges its source and outputs the probability that the image is a fake image generated by the generator or a real face image obtained by photography or other means. For example, when the image generated by the generator lacks basic facial features, the discriminator will conclude that the probability that the image is a real face image is close to 0; conversely, when the image generated by the generator is so realistic that it is indistinguishable from a real photograph, the discriminator cannot identify its source and can only guess blindly, in which case the probability that the pseudo face image generated by the generator is judged to be a real face image approaches 50%.
Step S400: when the difference between the first probability and the second probability is greater than a preset threshold, optimize the generative adversarial network.
In step S400 of some embodiments, when the difference between the first probability and the second probability is greater than a preset threshold, the generative adversarial network is optimized. The preset threshold is a very small value (e.g., 0.1%). When the difference between the first probability and the second probability output by the discriminator is greater than the preset threshold, the discriminator can readily tell that the pseudo face image generated by the generator is not a real face image, which indicates that the generated image is not realistic enough to deceive the discriminator. The generator should therefore be optimized to improve the quality of the pseudo face images it generates. At the same time, as the quality of the generated pseudo face images improves, the discriminator also needs to be optimized so that it can better distinguish whether a picture is a pseudo face image generated by the generator or a real face image, until the probability that the discriminator judges a generated pseudo face image to be a real face image is very close, or even equal, to the probability that the original image is judged to be a real face image. At this point, the discriminator can no longer tell whether the pseudo face images generated by the generator are real, i.e., the generated pseudo face images are realistic enough to pass for genuine ones; a trained generator is thereby obtained, with which a large number of realistic face images can be generated as face data to build a database.
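The optimization trigger described in step S400 can be sketched as a single comparison (a minimal illustration; the function name and the `0.001` default, i.e. the 0.1% example value mentioned above, are assumptions, not part of the patent):

```python
def gan_needs_update(p_fake_as_real: float, p_real_as_real: float,
                     threshold: float = 0.001) -> bool:
    """Return True while the discriminator still separates the generated pseudo
    face images from real ones, i.e. while the gap between the second probability
    and the first probability exceeds the preset threshold."""
    return abs(p_real_as_real - p_fake_as_real) > threshold
```

Training would loop over generator and discriminator updates while `gan_needs_update(...)` stays `True`; once both probabilities approach 50%, the generator is considered trained.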
Step S500: generate a training set by means of the optimized generative adversarial network.
In step S500 of some embodiments, a training set is generated by means of the optimized generative adversarial network. In the above steps, through the continuous game between the generator and the discriminator of the generative adversarial network, an optimized generator that can generate images very close to real faces has been obtained. At this point, a series of face images can be generated by the generative adversarial network and assembled into a set, i.e., the training set, wherein the training set includes a plurality of training samples, and each training sample includes a label reflecting its face beauty grade.
Step S600: input the training set into a face beauty prediction task network and train the face beauty prediction task network to obtain a trained first task network.
In step S600 of some embodiments, the training set is input into the face beauty prediction task network, which is trained to obtain a trained first task network. The face beauty prediction task network may be a CNN; the training set generated in step S500, which includes a large number of face images and face beauty grade labels in one-to-one correspondence with the images, is used as input for supervised training of the CNN, so as to obtain a trained neural network for the face beauty prediction task.
In some embodiments, the generative adversarial network includes a generation module and a decision module; as shown in Fig. 2, step S400 includes, but is not limited to, steps S210 to S220.
Step S210: descend the stochastic gradient of the generation module to update the generation module.
In step S210 of some embodiments, the stochastic gradient of the generation module is descended to update the generation module; specifically, the generation module is updated according to the expression $\nabla_{\theta_g}\frac{1}{m}\sum_{i=1}^{m}\log\left(1-D\left(G\left(z^{(i)}\right)\right)\right)$, where $D$ denotes the decision module, $G$ denotes the generation module, $\nabla_{\theta_g}$ denotes the stochastic gradient of the generation module, and $z^{(i)}$ denotes the $i$-th sample of the pseudo face images.
Step S220: ascend the stochastic gradient of the decision module to update the decision module.
In step S220 of some embodiments, the stochastic gradient of the decision module is ascended to update the decision module; specifically, the decision module is updated according to the expression $\nabla_{\theta_d}\frac{1}{m}\sum_{i=1}^{m}\left[\log D\left(x^{(i)}\right)+\log\left(1-D\left(G\left(z^{(i)}\right)\right)\right)\right]$, where $D$ denotes the decision module, $G$ denotes the generation module, $\nabla_{\theta_d}$ denotes the stochastic gradient of the decision module, $x^{(i)}$ denotes the $i$-th sample of the original images, and $z^{(i)}$ denotes the $i$-th sample of the pseudo face images.
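The two update rules can be checked numerically on a toy one-parameter model with generator G(z) = θ_g·z and decision module D(x) = σ(θ_d·x) (a hedged sketch: the scalar parameterization, the data, and the learning rate are illustrative assumptions; the hand-derived gradients follow the two expressions above):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(0)
m = 64                                 # minibatch size
x = rng.normal(4.0, 1.0, m)            # x^(i): "original image" samples (real data)
z = rng.normal(0.0, 1.0, m)            # Gaussian noise fed to the generator
theta_g, theta_d, lr = 1.0, 0.5, 0.01  # toy parameters and step size

def d_objective(td, tg):
    """(1/m) * sum[ log D(x_i) + log(1 - D(G(z_i))) ] with D(x)=sigmoid(td*x), G(z)=tg*z."""
    fake = tg * z
    return np.mean(np.log(sigmoid(td * x)) + np.log(1.0 - sigmoid(td * fake)))

# Step S220: ascend the decision module's stochastic gradient of its objective.
fake = theta_g * z
grad_d = np.mean((1.0 - sigmoid(theta_d * x)) * x) + np.mean(-sigmoid(theta_d * fake) * fake)
before = d_objective(theta_d, theta_g)
theta_d_new = theta_d + lr * grad_d       # gradient ascent step
after = d_objective(theta_d_new, theta_g) # objective should increase for a small step

# Step S210: descend the generation module's gradient of (1/m) * sum log(1 - D(G(z_i))).
grad_g = np.mean(-sigmoid(theta_d * fake) * theta_d * z)
theta_g_new = theta_g - lr * grad_g       # gradient descent step
```

In a real implementation both steps would be applied per minibatch by a deep-learning framework's automatic differentiation rather than by hand-derived gradients.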
In some embodiments, as shown in Fig. 3, step S600 includes, but is not limited to, steps S310 to S330.
Step S310: decompose the face beauty prediction task into a plurality of binary classification subtasks, and generate a plurality of first subtask networks each corresponding to one binary classification subtask.
In step S310 of some embodiments, the face beauty prediction task is decomposed into a plurality of binary classification subtasks, and a plurality of first subtask networks are generated, each corresponding to one binary classification subtask, so that multi-task prediction learning can be performed with single-task data.
Step S320: generate a multi-dimensional label from the face beauty grade label of the training sample.
In step S320 of some embodiments, a multi-dimensional label is generated from the face beauty grade label of the training sample, wherein each dimension of the multi-dimensional label corresponds one-to-one to a first subtask network and is used to supervise it, and the total number of dimensions of the multi-dimensional label is equal to the total number of the first subtask networks.
Step S330: perform supervised learning on the plurality of first subtask networks by means of the multi-dimensional label to obtain a plurality of trained second subtask networks.
In step S330 of some embodiments, supervised learning is performed on the plurality of first subtask networks by means of the multi-dimensional label to obtain a plurality of trained second subtask networks; each dimension of the multi-dimensional label supervises one subtask network. Specifically, it is judged whether the output result of each first subtask network is equal to the corresponding dimension of the multi-dimensional label, and the parameters of the first subtask networks are cyclically optimized by using a back-propagation algorithm.
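Steps S310–S330 can be sketched as follows (a hedged illustration: the three-subtask configuration and the grade-to-vector mapping are taken from the worked example given for step S420 below, grades 1, 2, 3 mapped to [0,1,0], [1,0,0], [1,1,0]; the function names are assumptions):

```python
# Hypothetical grade-to-codeword mapping, following the example given for step S420
# (face beauty grades 1, 2, 3 -> [0,1,0], [1,0,0], [1,1,0]); any fixed binary code works.
GRADE_CODEWORDS = {1: [0, 1, 0], 2: [1, 0, 0], 3: [1, 1, 0]}

def multi_dim_label(grade: int) -> list:
    """Step S320: expand a scalar face beauty grade into a multi-dimensional label,
    one binary target per subtask network."""
    return GRADE_CODEWORDS[grade]

def subtask_targets(grades):
    """Step S330 supervision: dimension k of every sample's label supervises
    subtask network k."""
    n_subtasks = len(next(iter(GRADE_CODEWORDS.values())))
    return [[multi_dim_label(g)[k] for g in grades] for k in range(n_subtasks)]

# Three training samples with grades 2, 1, 3:
# subtask 0 is supervised by [1, 0, 1], subtask 1 by [0, 1, 1], subtask 2 by [0, 0, 0].
targets = subtask_targets([2, 1, 3])
```

Each per-subtask target list would then drive the binary cross-entropy loss of the corresponding subtask network during back-propagation.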
In some embodiments, as shown in Fig. 4, step S330 is followed by, but not limited to, steps S410 to S440.
Step S410: integrate the first output results of the plurality of trained second subtask networks into a first multi-dimensional vector.
In step S410 of some embodiments, the first output results of the plurality of trained second subtask networks are integrated into a first multi-dimensional vector. In the above steps, after the face beauty prediction task is decomposed into a plurality of binary classification subtasks, each subtask can output a result, and integrating the output results of the plurality of subtask networks yields a multi-dimensional vector. For example, if there are 3 subtask networks and their output results are 1, 1, and 0 respectively, the multi-dimensional vector [1, 1, 0] is obtained.
Step S420: compare the first multi-dimensional vector with a second multi-dimensional vector to judge whether the first multi-dimensional vector is erroneous.
In step S420 of some embodiments, the first multi-dimensional vector is compared with a second multi-dimensional vector to judge whether the first multi-dimensional vector is erroneous. In step S410 above, the first multi-dimensional vector is obtained by integrating the output results of the subtask networks; it is then compared with the second multi-dimensional vector, wherein the second multi-dimensional vector covers the cases corresponding to a plurality of different face beauty grades, for example [0, 1, 0], [1, 0, 0], and [1, 1, 0] corresponding to face beauty grades 1, 2, and 3 respectively.
Step S430: if the first multi-dimensional vector corresponds to the second multi-dimensional vector, the first multi-dimensional vector is correct.
Step S440: if the first multi-dimensional vector does not correspond to the second multi-dimensional vector, correct the first multi-dimensional vector according to the plurality of first output results.
In step S440 of some embodiments, if the first multi-dimensional vector does not correspond to the second multi-dimensional vector, the first multi-dimensional vector is corrected according to the plurality of first output results. The first multi-dimensional vector is compared with the second multi-dimensional vector; if it matches none of the candidates, for example when the first multi-dimensional vector is [0, 0, 0], the first multi-dimensional vector is erroneous. In this case, the first output results are modified according to a preset rule to correct the first multi-dimensional vector, wherein the preset rule is: modify the first output results on the criteria that the fewest first output results need to be modified and that the modified first output results have the lowest confidence. Since the first output results are all Boolean elements, correction simply flips 0 to 1 or 1 to 0. Comparing the first multi-dimensional vector [0, 0, 0] with the second multi-dimensional vector shows that correcting either the first or the second element requires modifying only one element to match the second multi-dimensional vector; in that case, the confidences of the output results of the subtask networks corresponding to the first and second elements should be compared, and the output result with the lower confidence is the one corrected.
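The preset correction rule of step S440 can be sketched as follows (a hedged illustration: the function name, the per-bit confidence list, and breaking ties by the summed confidence of the flipped bits are assumptions; the candidate codewords and the erroneous vector [0, 0, 0] come from the example above):

```python
def correct_vector(output, confidences, codewords):
    """Step S440: snap an invalid output vector onto the valid codeword reachable
    with the fewest bit flips; among equally close codewords, flip the output
    bits whose confidence is lowest (smallest summed confidence of flipped bits)."""
    output = list(output)
    if output in [list(c) for c in codewords]:
        return output  # already matches a valid codeword, nothing to correct

    def cost(codeword):
        flipped = [i for i, (a, b) in enumerate(zip(output, codeword)) if a != b]
        return (len(flipped), sum(confidences[i] for i in flipped))

    return list(min(codewords, key=cost))

codewords = [[0, 1, 0], [1, 0, 0], [1, 1, 0]]  # grades 1, 2, 3 from the example
# [0, 0, 0] is one flip away from both [0, 1, 0] and [1, 0, 0]; bit 1 has the
# lower confidence (0.6 < 0.9), so it is the bit that gets corrected.
fixed = correct_vector([0, 0, 0], [0.9, 0.6, 0.8], codewords)  # -> [0, 1, 0]
```

This is the usual decoding step of an error-correcting output code, here with confidence used as the tie-breaker the text prescribes.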
According to the face beauty prediction method proposed in the embodiments of the present disclosure, an original image and Gaussian noise are acquired; a pseudo face image is generated from the Gaussian noise; decision is performed on the pseudo face image and the original image to obtain a first probability and a second probability, wherein the first probability represents the probability that the pseudo face image is judged to be a real image and the second probability represents the probability that the original image is judged to be a real image; when the difference between the first probability and the second probability is greater than a preset threshold, the generative adversarial network is optimized; a training set is generated by means of the optimized generative adversarial network, wherein the training set includes a plurality of training samples, and each training sample includes a label reflecting its face beauty grade; and the training set is input into a face beauty prediction task network, which is trained to obtain a trained first task network. By optimizing the generative adversarial network so that it can generate highly realistic face images, generating a training set with the generative adversarial network, and transmitting the training set to the face beauty prediction task network to train it, the method solves the problem in face beauty prediction research that, owing to the lack of a large-scale face database for supervised training of neural networks, supervision information is insufficient and models are prone to overfitting.
An embodiment of the present disclosure further provides a face beauty prediction device which, as shown in Fig. 5, can implement the above face beauty prediction method. The face beauty prediction device includes: an acquisition module 510 for acquiring an original image and Gaussian noise; a generation module 520 for generating a pseudo face image from the Gaussian noise; a decision module 530 for performing decision on the pseudo face image and the original image to obtain a first probability and a second probability; a generative adversarial network optimization module 540 for optimizing the generative adversarial network when the difference between the first probability and the second probability is greater than a preset threshold; a training set generation module 550 for generating a training set by means of the optimized generative adversarial network; and a training module 560 for inputting the training set into a face beauty prediction task network and training it to obtain a trained first task network.
The face beauty prediction device of the embodiment of the present disclosure is used to execute the face beauty prediction method in the above embodiments, and its specific processing is the same as that of the face beauty prediction method in the above embodiments, so it is not repeated here.
An embodiment of the present disclosure further provides an electronic device 600, including:
at least one processor, and
a memory communicatively connected to the at least one processor, wherein
the memory stores instructions which are executed by the at least one processor so that, when executing the instructions, the at least one processor implements the method according to any one of the embodiments of the first aspect of the present application.
The hardware structure of the electronic device 600 is described in detail below with reference to Fig. 6. The computer device includes a processor 610, a memory 620, an input/output interface 630, a communication interface 640, and a bus 650.
The processor 610 may be implemented by a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), one or more integrated circuits, or the like, and is used to execute related programs to implement the technical solutions provided by the embodiments of the present disclosure.
The memory 620 may be implemented in the form of a read-only memory (ROM), a static storage device, a dynamic storage device, a random access memory (RAM), or the like. The memory 620 may store an operating system and other application programs; when the technical solutions provided by the embodiments of this specification are implemented by software or firmware, the related program code is stored in the memory 620 and called by the processor 610 to execute the face beauty prediction method of the embodiments of the present disclosure.
The input/output interface 630 is used to implement information input and output.
The communication interface 640 is used to implement communication and interaction between this device and other devices; communication may be implemented in a wired manner (e.g., USB, network cable) or in a wireless manner (e.g., mobile network, WiFi, Bluetooth); and
the bus 650 transmits information between the various components of the device (e.g., the processor 610, the memory 620, the input/output interface 630, and the communication interface 640);
wherein the processor 610, the memory 620, the input/output interface 630, and the communication interface 640 are communicatively connected to each other inside the device through the bus 650.
The embodiments described herein are intended to explain the technical solutions of the embodiments of the present disclosure more clearly and do not constitute a limitation on them. Those skilled in the art will appreciate that, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided by the embodiments of the present disclosure are equally applicable to similar technical problems.
Those skilled in the art can understand that the technical solutions shown in Figs. 1 to 6 do not constitute a limitation on the embodiments of the present disclosure, and may include more or fewer steps than shown, combine certain steps, or include different steps.
The device embodiments described above are merely illustrative; the units described as separate components may or may not be physically separated, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art can understand that all or some of the steps of the methods disclosed above and the functional modules/units in the systems and devices may be implemented as software, firmware, hardware, and appropriate combinations thereof.
The terms "first", "second", "third", "fourth", and the like (if present) in the specification of the present application and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that the data so used are interchangeable under appropriate circumstances so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here. In addition, the terms "comprise" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And/or" describes the association relationship of associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate the three cases of only A, only B, and both A and B, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it. "At least one of the following" or similar expressions refers to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c may indicate: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be singular or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division of the units is only a logical function division, and there may be other division methods in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a computer-readable storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned computer-readable storage media include various media that can store programs, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The preferred embodiments of the present disclosure have been described above with reference to the accompanying drawings, which does not thereby limit the scope of rights of the embodiments of the present disclosure. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and essence of the embodiments of the present disclosure shall fall within the scope of rights of the embodiments of the present disclosure.

Claims (10)

  1. A face beauty prediction method based on a generative adversarial network, characterized in that the method comprises:
    acquiring an original image and Gaussian noise;
    generating a pseudo face image from the Gaussian noise;
    performing decision on the pseudo face image and the original image to obtain a first probability and a second probability, wherein the first probability represents the probability that the pseudo face image is judged to be a real image, and the second probability represents the probability that the original image is judged to be a real image;
    when the difference between the first probability and the second probability is greater than a preset threshold, optimizing the generative adversarial network;
    generating a training set by means of the optimized generative adversarial network, wherein the training set comprises a plurality of training samples, and each training sample comprises a label reflecting the face beauty grade of the training sample; and
    inputting the training set into a face beauty prediction task network and training the face beauty prediction task network to obtain a trained first task network.
  2. The face beauty prediction method according to claim 1, characterized in that the generative adversarial network comprises a generation module and a decision module, and optimizing the generative adversarial network comprises:
    descending the stochastic gradient of the generation module to update the generation module; and
    ascending the stochastic gradient of the decision module to update the decision module;
    wherein the generation module is updated according to the expression $\nabla_{\theta_g}\frac{1}{m}\sum_{i=1}^{m}\log\left(1-D\left(G\left(z^{(i)}\right)\right)\right)$ and the decision module is updated according to the expression $\nabla_{\theta_d}\frac{1}{m}\sum_{i=1}^{m}\left[\log D\left(x^{(i)}\right)+\log\left(1-D\left(G\left(z^{(i)}\right)\right)\right)\right]$, where $D$ denotes the decision module, $G$ denotes the generation module, $\nabla_{\theta_g}$ denotes the stochastic gradient of the generation module, $\nabla_{\theta_d}$ denotes the stochastic gradient of the decision module, $x^{(i)}$ denotes the $i$-th sample of the original images, and $z^{(i)}$ denotes the $i$-th sample of the pseudo face images.
  3. The face beauty prediction method according to claim 1, characterized in that inputting the training set into the face beauty prediction task network and training the face beauty task network comprises:
    decomposing the face beauty prediction task into a plurality of binary classification subtasks, and generating a plurality of first subtask networks each corresponding to one of the binary classification subtasks;
    generating a multi-dimensional label from the face beauty grade label of the training sample, wherein each dimension of the multi-dimensional label is used to supervise the corresponding first subtask network, and the total number of dimensions of the multi-dimensional label is equal to the total number of the first subtask networks; and
    performing supervised learning on the plurality of first subtask networks by means of the multi-dimensional label to obtain a plurality of trained second subtask networks.
  4. The face beauty prediction method according to claim 3, characterized in that performing supervised learning on the plurality of first subtask networks by means of the multi-dimensional label comprises:
    judging whether the output result of each first subtask network is equal to the corresponding dimension of the multi-dimensional label.
  5. The face beauty prediction method according to claim 3, characterized in that after performing supervised learning on the plurality of first subtask networks by means of the multi-dimensional label to obtain the plurality of trained second subtask networks, the method further comprises:
    integrating the first output results of the plurality of trained second subtask networks into a first multi-dimensional vector;
    comparing the first multi-dimensional vector with a second multi-dimensional vector to judge whether the first multi-dimensional vector is erroneous;
    if the first multi-dimensional vector corresponds to the second multi-dimensional vector, determining that the first multi-dimensional vector is correct; and
    if the first multi-dimensional vector does not correspond to the second multi-dimensional vector, correcting the first multi-dimensional vector according to the plurality of first output results.
  6. The face beauty prediction method according to claim 5, characterized in that correcting the first multi-dimensional vector according to the plurality of first output results comprises:
    modifying the first output results according to a preset rule to correct the first multi-dimensional vector;
    wherein the preset rule is: modify the first output results on the criteria that the fewest first output results need to be modified and that the modified first output results have the lowest confidence.
  7. The face beauty prediction method according to any one of claims 3 to 6, characterized in that inputting the training set into the face beauty prediction task network and training the face beauty task network comprises:
    cyclically optimizing the parameters of the first subtask networks by using a back-propagation algorithm.
  8. A face beauty prediction device, characterized in that the device comprises:
    an acquisition module for acquiring an original image and Gaussian noise;
    a generation module for generating a pseudo face image from the Gaussian noise;
    a decision module for performing decision on the pseudo face image and the original image to obtain a first probability and a second probability;
    a generative adversarial network optimization module for optimizing the generative adversarial network when the difference between the first probability and the second probability is greater than a preset threshold;
    a training set generation module for generating a training set by means of the optimized generative adversarial network; and
    a training module for inputting the training set into a face beauty prediction task network and training the face beauty prediction task network to obtain a trained first task network.
  9. An electronic device, characterized in that the electronic device comprises a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for implementing connection and communication between the processor and the memory, wherein when the program is executed by the processor, the face beauty prediction method according to any one of claims 1 to 7 is implemented.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the face beauty prediction method according to any one of claims 1 to 7.
PCT/CN2023/078761 2022-06-09 2023-02-28 Face beauty prediction method and device, electronic device, storage medium WO2023236594A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210646405.7A CN114973377A (zh) 2022-06-09 2022-06-09 Face beauty prediction method and device, electronic device, storage medium
CN202210646405.7 2022-06-09

Publications (1)

Publication Number Publication Date
WO2023236594A1 true WO2023236594A1 (zh) 2023-12-14

Family

ID=82961597

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/078761 WO2023236594A1 (zh) 2022-06-09 2023-02-28 Face beauty prediction method and device, electronic device, storage medium

Country Status (2)

Country Link
CN (1) CN114973377A (zh)
WO (1) WO2023236594A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114973377A (zh) * 2022-06-09 2022-08-30 五邑大学 人脸美丽预测方法和装置、电子设备、存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695602A (zh) * 2020-05-18 2020-09-22 五邑大学 Multi-dimensional task face beauty prediction method, system, and storage medium
WO2021052159A1 (zh) * 2019-09-20 2021-03-25 五邑大学 Face beauty prediction method and device based on adversarial transfer learning
CN112613435A (zh) * 2020-12-28 2021-04-06 杭州魔点科技有限公司 Face image generation method, apparatus, device, and medium
CN113705492A (zh) * 2021-08-31 2021-11-26 杭州艾芯智能科技有限公司 Method, system, computer device, and storage medium for generating face training sample images
CN114973377A (zh) * 2022-06-09 2022-08-30 五邑大学 Face beauty prediction method and device, electronic device, storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021052159A1 (zh) * 2019-09-20 2021-03-25 五邑大学 Face beauty prediction method and device based on adversarial transfer learning
CN111695602A (zh) * 2020-05-18 2020-09-22 五邑大学 Multi-dimensional task face beauty prediction method, system, and storage medium
CN112613435A (zh) * 2020-12-28 2021-04-06 杭州魔点科技有限公司 Face image generation method, apparatus, device, and medium
CN113705492A (zh) * 2021-08-31 2021-11-26 杭州艾芯智能科技有限公司 Method, system, computer device, and storage medium for generating face training sample images
CN114973377A (zh) * 2022-06-09 2022-08-30 五邑大学 Face beauty prediction method and device, electronic device, storage medium

Also Published As

Publication number Publication date
CN114973377A (zh) 2022-08-30

Similar Documents

Publication Publication Date Title
CN108615073B (zh) Image processing method and device, computer-readable storage medium, and electronic device
US11055555B2 (en) Zero-shot object detection
CN111859960B (zh) Semantic matching method and device based on knowledge distillation, computer device, and medium
Sariyar et al. The RecordLinkage package: detecting errors in data.
US10482380B2 (en) Conditional parallel processing in fully-connected neural networks
WO2019052311A1 (zh) Style sentence generation method, model training method, device, and computer equipment
US10552712B2 (en) Training device and training method for training image processing device
GB2618917A (en) Method for few-shot unsupervised image-to-image translation
US20210150412A1 (en) Systems and methods for automated machine learning
JP2020077343A (ja) ルール生成装置、ルール生成方法及びルール生成プログラム
WO2023236594A1 (zh) 人脸美丽预测方法和装置、电子设备、存储介质
Wu et al. Yunet: A tiny millisecond-level face detector
US11669687B1 (en) Systems and methods for natural language processing (NLP) model robustness determination
WO2018036547A1 (zh) Data processing method and device
US20220230061A1 (en) Modality adaptive information retrieval
CN112418320A (zh) Enterprise association relationship identification method, device, and storage medium
US20190050740A1 (en) Accelerated decision tree execution
CN113971733A (zh) Hypergraph-structure-based model training method, classification method, and device
Zhang et al. The classification and detection of malware using soft relevance evaluation
WO2021012263A1 (en) Systems and methods for end-to-end deep reinforcement learning based coreference resolution
Singh et al. Distributed quadratic programming solver for kernel SVM using genetic algorithm
JP2020052935A (ja) 学習済みモデルを生成する方法、データを分類する方法、コンピュータおよびプログラム
CN112817560B (zh) Table-function-based computing task processing method and system, and computer-readable storage medium
Villaverde et al. PREMER: a tool to infer biological networks
WO2021251959A1 (en) Class agnostic repetition counting in video(s) utilizing a temporal self-similarity matrix

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23818762

Country of ref document: EP

Kind code of ref document: A1