CN115936796A - Virtual makeup changing method, system, equipment and storage medium - Google Patents
Virtual makeup changing method, system, equipment and storage medium
- Publication number
- CN115936796A (application CN202111169098.XA)
- Authority
- CN
- China
- Prior art keywords
- picture
- makeup
- information
- neural network
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a virtual makeup changing method, which comprises the following steps: 1) a user uploads a picture to be made up; 2) the user selects or uploads a target makeup picture; 3) model features are disassembled and extracted from the picture to be made up and the target makeup picture; 4) the model features comprise facial structure information of the picture to be made up and facial makeup information of the target makeup picture; 5) the processed makeup picture and the features extracted from the facial structure information of the user picture are input into a GAN neural network as conditions; 6) the GAN neural network outputs a synthesized makeup-changed picture according to the conditions. By adopting the facial structure information and the makeup information as two separate input conditions of the GAN neural network, the invention obtains a more realistic makeup changing effect. Meanwhile, an improved neural network training method is adopted in which the two required conditions are disassembled from a single picture for training, so that the large-scale one-to-one matched data otherwise required for neural network training is no longer needed; compared with the traditional material-pasting method, the realism and fusion of the generated image are greatly improved.
Description
Technical Field
The invention belongs to the field of virtual makeup changing, and particularly relates to a virtual makeup changing method, system, equipment and storage medium using a neural network, and more particularly to a method for generating a virtual makeup-changed picture by using a GAN neural network model trained on model features obtained by disassembling single pictures.
Background
With the development of internet technology, new modes such as online shopping, online live-broadcast interaction, online interactive entertainment and online friend-making are increasingly popular. Compared with shopping in a physical store, online shopping offers a large selection space, many commodity types, and savings in time and labor, while online entertainment and social display are convenient, fast and reach a wider audience. However, some problems remain difficult to solve when purchasing commodities online, above all that the real effect of a commodity on the purchaser cannot be checked visually. This problem is most pronounced for cosmetic products such as eye shadow, eyeliner, blusher, lipstick and foundation. Whereas in a physical store the effect of a product can be tried and checked in real time, online cosmetic shopping cannot provide an effect picture for the individual consumer; at best it provides pictures of the product on a model, and sometimes not even that, so the consumer cannot intuitively judge in real time how well the cosmetics match his or her face and skin, which causes great inconvenience.
In addition, on some interactive entertainment occasions users also want to experience different makeup looks and effects. When an ordinary user sees a makeup look he or she likes on the internet, that makeup should be reproduced realistically on the user's own face so that the user can judge whether it suits them. Several makeup-changing techniques have been developed in the prior art to address such needs. Virtual makeup replacement is typically the process of transferring the makeup of a target picture onto an original picture: given a picture to be made up and a target makeup picture, the algorithm must transfer the makeup of the target picture onto the picture to be made up. In this way the user can see his or her own made-up appearance without spending time and money actually purchasing the cosmetics or applying them, and the method can be widely applied in the internet industry.
In the prior art, a robust reference-picture-based makeup transfer method is disclosed, with the following steps. Step one: define the makeup-changing problem as a mapping G that takes the picture to be made up x and a reference picture y as input and outputs a picture of the same person as x wearing the makeup of the reference picture y. Step two: extract the makeup matrices of the reference picture y, obtaining a first makeup matrix gamma and a second makeup matrix beta with a makeup extraction network. Step three: calculate the similarity between each pixel of the picture to be made up and each pixel of the reference picture. Step four: deform the makeup matrices extracted in step two using the pixel similarity to obtain adaptive makeup matrices. Step five: apply the adaptive makeup matrices to the visual feature map of the picture to be made up. Step six: up-sample the visual feature map to obtain the makeup-changed picture. Although this kind of makeup changing can complete the corresponding operation with simple processing, it is essentially rendered material fitting after key-point positioning: the main operation is effect fusion, it depends on key-point positioning and on prepared materials, makeup cannot be changed at will, the existing makeup in the picture is not removed, and when the user already wears heavy makeup the superposed effect cannot meet high user expectations.
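For illustration only, the makeup-matrix modulation of steps four to six of this prior-art method could be sketched as follows; the tensor shapes, the use of an attention-style similarity matrix and the function names are assumptions of this sketch, not the disclosed implementation.

```python
import torch
import torch.nn.functional as F

def transfer_makeup(feat_x, gamma_y, beta_y, sim):
    """Apply adaptive makeup matrices to the source feature map.

    feat_x:  (B, C, H, W) features of the picture to be made up
    gamma_y: (B, C, H, W) first makeup matrix extracted from reference y
    beta_y:  (B, C, H, W) second makeup matrix extracted from reference y
    sim:     (B, H*W, H*W) pixel-wise similarity between x and y
    """
    B, C, H, W = feat_x.shape
    # Deform the makeup matrices with the pixel similarity (step four).
    gamma = torch.bmm(sim, gamma_y.flatten(2).transpose(1, 2))  # (B, H*W, C)
    beta = torch.bmm(sim, beta_y.flatten(2).transpose(1, 2))
    gamma = gamma.transpose(1, 2).reshape(B, C, H, W)
    beta = beta.transpose(1, 2).reshape(B, C, H, W)
    # Apply the adaptive makeup matrices to the visual feature map (step five).
    made_up = gamma * feat_x + beta
    # Up-sample back toward image resolution (step six).
    return F.interpolate(made_up, scale_factor=4, mode="bilinear", align_corners=False)
```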
In the current virtual makeup changing field, covering the user's original makeup with the target makeup remains the mainstream approach because of its small computational load and simple implementation. With the rapid development of neural network models in recent years, however, methods that complete virtual makeup replacement with a neural network model are gradually coming onto the stage. For these methods the quality of the final output image of the neural network model, that is, the goodness of fit between the original image and the target image, becomes the most critical factor; especially for deep learning neural networks, how to train on images and achieve the best training effect and training efficiency is a problem to be solved urgently.
Disclosure of Invention
To match the development trend of the internet industry and meet the demand of users in the virtual makeup changing field to change makeup freely, see the effect immediately, and obtain a personalized result ("a thousand faces for a thousand people") while browsing makeup looks they like, a virtual makeup changing method is needed that requires only simple input, whose computational load does not exceed the capacity of terminal equipment, and whose effect is close to real offline makeup. To solve these problems, the invention provides such a makeup changing method, together with a neural network training method.
The invention provides a virtual makeup changing method, which comprises the following steps: 1) Uploading a picture to be made up by a user; 2) Selecting or uploading a target makeup picture by a user; 3) Disassembling and extracting model characteristics of the picture to be made up and the target makeup picture; 4) The model characteristics comprise facial structure information of the picture to be made up and facial makeup information of the target makeup picture; 5) Inputting the processed makeup picture and the extracted features of the facial structure information of the user picture into a GAN neural network as conditions; 6) And the GAN neural network outputs the synthesized makeup changing picture according to the condition.
Further, in step 2), the user can select a target makeup picture provided by the system or upload a partial picture of the target makeup.
Further, in step 3), the terminal uploads the picture information to the server to complete the disassembly and extraction of the model features, or directly completes the disassembly and extraction of the model features at the terminal and then uploads the feature information to the server.
Further, the step of disassembling and extracting the model characteristics of the picture to be made up also comprises the steps of 1) processing the two-dimensional image of the user to obtain a human head outline image; 2) inputting the two-dimensional human head contour image into a first neural network subjected to deep learning to perform regression of key points; 3) obtaining key point information of the user's head; 4) obtaining semantic segmentation maps of all parts of the head; 5) carrying out color disturbance on the picture, removing the facial makeup information by disturbing the Lab color values, brightness, contrast and shadow; 6) extracting the features of the disturbed picture through an encoder to obtain a representation of the facial structure information of the user picture.
Further, the step of disassembling and extracting the model features of the target makeup picture also comprises the steps of 1) converting the three-dimensional face picture into a planar picture; 2) performing WARP stretching deformation processing on the picture according to a preset rule; 3) unfolding the picture so that designated parts lie at fixed positions, according to preset key point information; 4) the picture WARPed to the fixed positions carries the facial makeup information of the target makeup picture and can be input directly into the GAN neural network.
Further, the method also comprises the steps of 1) presetting key points as 5 points, wherein the appointed parts comprise canthus, nose tip and mouth corner; 2) In addition to the preset key points, other points are disturbed in a mode of increasing colors, adding and subtracting RGB values and changing contrast, brightness and light shadow; 3) The information of the preserved makeup comprises 5 key points and lipstick, eye shadow and color number information within a certain range around the key points.
Further, before the model characteristics are input into the GAN neural network model as conditions, the method also comprises a process of training the GAN neural network, and the training process is finished by adopting a method of self-disassembling a single picture with makeup.
Further, the training process comprises: 1) Disturbance removal is carried out on the face makeup information of the picture, and face structure information is extracted from the picture; 2) Disturbance removal is carried out on the picture face structure information, and face makeup information is extracted from the picture; 3) Inputting the extracted facial structure information of the picture into an encoder to extract features; 4) Inputting the obtained features and the facial makeup information into the GAN neural network model as input conditions; 5) The training is performed using the picture itself as the true value of the model.
Furthermore, when the makeup information of the face of the original picture is removed, besides random disturbance the following disturbance operation is also performed: the colors of designated parts extracted from other face pictures are overlaid on the corresponding designated parts of the original picture by a histogram matching method, so that the makeup information at the designated positions in the original picture is completely stripped.
In addition, the invention also provides a system for realizing the virtual makeup changing method, which comprises the following steps: 1) The picture acquisition module is used for acquiring a picture to be made up uploaded by a user, and the user selects or uploads a target makeup picture; 2) The disassembling and extracting module is used for acquiring model characteristics of the picture to be made up and the target makeup picture; the model features comprise facial structure information of the picture to be made up and facial makeup information of the target makeup picture; 3) And the picture synthesis output module comprises a GAN neural network and is used for inputting the processed makeup picture and the characteristics extracted from the facial structure information of the user picture into the GAN neural network as conditions and outputting the synthesized makeup changing picture.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method steps of any of the preceding claims.
An electronic device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus; a memory for storing a computer program; a processor for implementing any of the above method steps when executing a program stored in the memory.
The invention has the beneficial effects that:
1. High realism. The invention uses a GAN network to generate the synthesized picture directly and, unlike existing GAN approaches, creatively provides two completely independent, non-interfering model input conditions: the user's facial structure information and the target facial makeup information. Compared with the covering type of virtual makeup changing, although a few processing steps are added, the scheme avoids the fusion operation of merely attaching rendered material after key-point positioning, and overcomes the defect that the superposed effect cannot meet user requirements when the original picture already carries heavy makeup. At the same time the scheme makes lower demands on the user picture, gives a better makeup-changing effect, allows the target makeup picture to be chosen freely, does not depend on prefabricated operation materials, and achieves "a thousand faces for a thousand people".
2. Training material is easy to obtain and the training effect is good. Following the usual model design idea, training a GAN for the virtual makeup task would require a bare-faced picture of the user, a target makeup picture, and a picture of the user actually wearing the target makeup, the last serving as the model's true value for training. In practice, however, it is difficult to collect complete sets of such photo material, so few makeup styles are available for training, which cannot meet the needs of internet entertainment and interaction. The invention therefore creatively uses a specific information-perturbation scheme to disassemble and extract the facial structure information and the facial makeup information from a single picture, uses these two pieces of information in place of the user pixel picture and the target makeup picture as input conditions, and uses the made-up picture itself as the model's output true value for training. Under this scheme any made-up picture collected through various channels can serve as training material, which greatly reduces the requirements on training material. With mass data training guaranteed, the GAN neural network can produce synthesized pictures free of visual discordance. The improved neural network training method adopted by the virtual makeup changing method of the invention can finish relatively accurate training with only single pictures, so the parameter precision of the deep learning neural network is greatly improved, convergence is markedly accelerated, and the consistency and fidelity of the generated synthesized image are greatly improved.
3. Deep learning neural networks are used as an aid. The invention makes full use of the advantages of deep learning networks and can restore facial structure information and makeup information with high precision in various complex scenes. Different neural networks are used for different purposes; by using models with different input conditions and training modes, accurate contour separation under a complex background, semantic segmentation of each part of the human body, and determination of key points and joint points are realized, the influence of head posture, hair style and the like is eliminated, and the real head or face is approached to the greatest extent. The prior art also uses neural network models, but because the input conditions, input parameters and training modes differ, their functions and effects differ greatly.
4. User operation is simple. The invention provides a method for generating a synthesized picture from the original facial information and the target makeup information through a GAN neural network; virtual makeup replacement can be realized quickly with only two pictures, which fits the characteristics and trends of the internet era and is simple and fast. The user needs no preparation: uploading one photo and selecting one makeup completes all the work. When applied to scenes such as entertainment applets or online shopping, the invention can greatly enhance user experience and user stickiness. The real shape of the body and face does not need to be measured or mapped, which opens wide application scenes for industries such as interactive entertainment and online shopping.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without any creative effort.
FIG. 1 is a process flow diagram of an embodiment;
FIG. 2 is a process flow diagram of one embodiment;
FIG. 3 is a diagram illustrating the structural information of the neural network model condition A according to an embodiment;
FIG. 4 is a diagram illustrating the cosmetic information of neural network model condition B according to an embodiment;
FIG. 5 is a diagram illustrating a neural network model training method according to an embodiment;
FIG. 6 is a schematic diagram of the system of the present invention.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details.
It should be further noted that, for the convenience of description, only some but not all of the relevant portions of the present application are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but could have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
As shown in fig. 1-2, the present invention provides a virtual makeup changing method, comprising: 1) Uploading a picture to be made up by a user; 2) Selecting or uploading a target makeup picture by a user; 3) Disassembling and extracting model characteristics of the picture to be made up and the target makeup picture; 4) The model features comprise facial structure information of the picture to be made up and facial makeup information of the target makeup picture; 5) Inputting the processed makeup picture and the extracted features of the facial structure information of the user picture into a GAN neural network as conditions; 6) And the GAN neural network outputs the synthesized makeup changing picture according to the condition.
Here, after uploading the picture, the user does not actually see the subsequent steps until the synthesized makeup-changed picture is transmitted back to the user's terminal. After the picture to be made up and the target makeup picture are obtained, the two pictures are each processed by a sub-neural network: the user picture keeps only facial structure information as input condition A, the makeup picture keeps only makeup information as input condition B, and conditions A and B are input into the GAN network. Referring to fig. 3 and 4, condition A can be understood as a picture without makeup information (especially color information) and condition B as a picture without structure information. In other words, the outline of the face can still be seen in the processed user picture, so the person can basically be identified, but the original color information is no longer shown; the facial makeup can be seen in the processed target makeup picture, but the facial contour features can no longer be fully distinguished, i.e. the owner of the picture cannot be recognized. After the trained main GAN network receives these two input conditions, it automatically superimposes and fuses the two independent conditions A (structure information) and B (makeup information) to generate the makeup-changed composite picture. Because the new makeup information is superimposed on a facial structure map that contains no makeup information at all, interference from the original makeup is completely avoided, the makeup changing effect is greatly improved, and the changed makeup is more natural and real.
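A minimal sketch of the overall data flow just described is given below; `extract_structure_condition`, `extract_makeup_condition` and `generator` are hypothetical placeholders standing in for the sub-neural-networks and the main GAN of the invention, not actual disclosed modules.

```python
def virtual_makeup_transfer(user_picture, target_makeup_picture,
                            extract_structure_condition,
                            extract_makeup_condition,
                            generator):
    """Sketch of the pipeline: build condition A and condition B, then fuse them."""
    cond_a = extract_structure_condition(user_picture)        # structure only, makeup removed
    cond_b = extract_makeup_condition(target_makeup_picture)  # makeup only, identity removed
    return generator(cond_a, cond_b)                          # synthesized makeup-changed picture
```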
Further, in step 2), the user can select a target makeup picture provided by the system or upload a picture of the target makeup. This embodies the user's freedom of choice: if one of the candidate makeup pictures provided by the system is satisfactory, it can be selected directly as the target makeup picture, saving the user the trouble of uploading one; and since the template pictures provided by the system have all undergone post-processing, they perform better in terms of fusion and realism.
Further, in step 3), the terminal uploads the picture information to the server to complete the disassembly and extraction of the model features, or the disassembly and extraction of the model features is completed directly at the terminal and the feature information is then uploaded to the server. With users and regulators paying more and more attention to privacy protection, how to protect the collected customer information is a problem that must be solved properly. Besides executing strict data-compliance measures at the server end, the computing power and storage space of the user terminal can be used to complete part of the model calculation and the storage of the original data. In this mode of operation the server no longer needs to disassemble the pictures uploaded by the user, and because the received image information has already been processed to a specified format and size, part of the network and server bandwidth resources are saved; meanwhile the user's original data is stored locally and only the perturbed image information is uploaded, which reduces the risk of privacy disclosure.
Further, the step of disassembling and extracting the model features of the picture to be made up also comprises the steps of 1) processing the two-dimensional image of the user to obtain a human head contour image; 2) inputting the two-dimensional human head contour image into a first neural network subjected to deep learning to perform regression of key points; 3) obtaining key point information of the user's head; 4) obtaining semantic segmentation maps of all parts of the head; 5) carrying out color disturbance on the picture, removing the facial makeup information by disturbing the Lab color values, brightness, contrast and shadow; 6) extracting features from the disturbed picture through an encoder to obtain a representation of the facial structure information of the user picture.
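A sketch of how step 5) might be realized with common image-processing primitives is given below; the disturbance ranges and the use of OpenCV are assumptions made for illustration, not the exact perturbation scheme of the invention.

```python
import cv2
import numpy as np

def perturb_makeup_away(face_bgr, rng=np.random.default_rng()):
    """Disturb colour so that makeup cues are removed while the facial
    structure (edges, contours) remains recognisable."""
    lab = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    lab[..., 1:] += rng.uniform(-20, 20, size=2).astype(np.float32)  # shift a/b chroma
    lab[..., 0] *= rng.uniform(0.8, 1.2)                             # scale lightness
    bgr = cv2.cvtColor(np.clip(lab, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)
    alpha = rng.uniform(0.7, 1.3)   # contrast
    beta = rng.uniform(-30, 30)     # brightness / shadow-like shift
    return cv2.convertScaleAbs(bgr, alpha=alpha, beta=beta)
```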
In this part the acquired original picture of the user is processed to obtain the structural parameter information required for generating the makeup-changed picture, as shown in fig. 2. Previously, key points were usually selected manually, but that approach is inefficient and unsuited to the fast pace of the internet era, so using deep-learning neural networks instead of manual selection has become the trend now that neural networks are widely applied. How to use the neural network efficiently, however, still needs further study.
The two-dimensional face contour image is obtained with a target detection algorithm, a fast region-proposal network based on a convolutional neural network. Before the two-dimensional face image is input into the first neural network model, the neural network must first be trained; the training samples comprise standard images in which the positions of the original key points have been marked manually on the two-dimensional images with high accuracy. Here a target image is first acquired and human structure information is detected in it with the target detection algorithm. Human detection does not mean measuring a real human body with an instrument; it means that any given image, generally a two-dimensional picture containing enough information such as a face and a head, is searched with a certain strategy to determine whether it contains a face, and if so, structural parameters such as the positions and sizes of the facial organs are given. In this embodiment, before the key points of the facial structure in the target image are obtained, face detection must be performed on the target image to obtain a frame marking the position of the face. Since the image input by the user can be any image, it inevitably contains some non-human background such as desks, chairs, trees, cars or buildings, and these useless backgrounds are removed with mature algorithms.
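Any mature face detector can supply the face frame described above; the following sketch uses an off-the-shelf OpenCV Haar cascade purely as an example, not the convolutional region-proposal network of the embodiment.

```python
import cv2

def detect_face_region(image_bgr):
    """Locate the face box so that irrelevant background can be cropped away."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest face
    return image_bgr[y:y + h, x:x + w]
```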
Meanwhile key-point detection and edge detection are carried out, and a neural network is used to generate a key-point map of the face; optionally the target detection algorithm can be a fast region-proposal network based on a convolutional neural network. This first neural network needs a large amount of training data: collected photos are annotated with key points manually and then fed into the network for training. With a deep-learning neural network, key points with the same accuracy and effect as manual annotation can be obtained almost immediately after a photo is input, at an efficiency tens or even hundreds of times that of manual annotation. In the invention, obtaining the positions of the facial key points in the picture completes only the first step and yields 1D point information; 2D facial information is then generated from the 1D points, for example the positions of the eyebrows, the eye corners, the eye contours and the lip contours, and this work can be completed with neural network models and mature prior-art algorithms.
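As an illustration of going from 1D key points to 2D facial regions, the sketch below uses dlib's 68-point landmark predictor as a stand-in for the trained first neural network; the model file path and the landmark index ranges are assumptions of this example.

```python
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed local model file

def face_keypoints(image_gray):
    """Return the 1D key points grouped into 2D facial regions."""
    boxes = detector(image_gray)
    if not boxes:
        return None
    shape = predictor(image_gray, boxes[0])
    pts = [(p.x, p.y) for p in shape.parts()]
    return {
        "jaw_contour": pts[0:17],
        "eyebrows": pts[17:27],
        "eyes": pts[36:48],
        "lips": pts[48:68],
    }
```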
Further, the step of disassembling and extracting the model features of the target makeup picture further comprises the steps of 1) converting the three-dimensional face picture into a planar picture; 2) performing WARP stretching deformation on the picture according to a preset rule; 3) unfolding the picture so that designated parts lie at fixed positions, according to preset key point information; 4) the picture WARPed to the fixed positions carries the facial makeup information of the target makeup picture and can be input directly into the GAN neural network.
This step mainly removes the structural information while preserving the makeup information completely. The original three-dimensional face is stretched and deformed with a conventional WARP image affine transformation; in the result, referring to fig. 4, the original face shape can no longer be recognized. During the stretching deformation, several key points obtained in the previous step are preset as fixed points, so that the positions of the same parts in different original pictures can be matched accurately. For example, in different original pictures the processed eye corner always lies at the same coordinate, so the makeup information at the eye corner can be transferred accurately to the eye-corner position of the target picture.
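A sketch of this WARP step under stated assumptions (five anchor points, a 256x256 canvas and a partial affine transform) follows; the canonical coordinates are illustrative values, not figures taken from the patent.

```python
import cv2
import numpy as np

# Canonical coordinates for the fixed anchor points (eye corners, nose tip,
# mouth corners) in a 256x256 unwrapped picture -- illustrative values only.
CANONICAL_5PTS = np.float32([
    [72, 100], [184, 100],   # left / right eye corner
    [128, 150],              # nose tip
    [88, 200], [168, 200],   # left / right mouth corner
])

def warp_to_fixed_positions(face_bgr, src_5pts, size=(256, 256)):
    """WARP-stretch the face so the five preset key points always land on
    the same coordinates; the makeup colours travel with them."""
    matrix, _ = cv2.estimateAffinePartial2D(np.float32(src_5pts), CANONICAL_5PTS)
    return cv2.warpAffine(face_bgr, matrix, size, flags=cv2.INTER_LINEAR)
```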
Further, the method also comprises the steps of 1) setting the preset key points as 5 points, wherein the designated parts comprise canthus, nose tip and mouth angle; 2) Except for presetting key points, disturbing other points in a mode of increasing colors, adding and subtracting RGB values and changing contrast, brightness and light shadow; 3) The reserved makeup information comprises 5 key points and lipstick, eye shadow and color number information within a certain range around the key points.
This step addresses a special case that may be encountered in the GAN model output. The synthesis task of the invention can be completed with an ordinary GAN (generative adversarial network) model, but after a certain number of samples have been processed, the makeup information (color, brightness and the like) of different parts or regions may still interfere with each other, producing uneven or discordant colors after synthesis. The preliminary judgment is that the disturbance of the structure (ID) information subtly affects the makeup information between different regions, resulting in unnatural transitions. The makeup information is therefore fixed at several designated key points and the areas around them, for which an area or pixel threshold is set, while the makeup information of the other, non-key regions is further disturbed. The overall disturbance process and method are similar to those used when disassembling condition A, and further modes can be added as needed. With this treatment, the colors of the makeup information are more accurate during transfer and the transitions are more natural.
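The idea of keeping the makeup untouched near the preset key points while disturbing the rest could be sketched as follows; the radius, jitter ranges and mask construction are assumptions made for illustration.

```python
import cv2
import numpy as np

def keep_makeup_near_keypoints(warped_bgr, keypoints, radius=40,
                               rng=np.random.default_rng()):
    """Keep the makeup colours untouched inside a radius around the preset
    key points and disturb everything outside, as described above."""
    mask = np.zeros(warped_bgr.shape[:2], dtype=np.uint8)
    for (x, y) in keypoints:                      # e.g. eye corners, nose tip, mouth corners
        cv2.circle(mask, (int(x), int(y)), radius, 255, thickness=-1)
    # Disturb the non-key regions: RGB offset plus contrast change.
    jitter = rng.uniform(-25, 25, size=3).astype(np.float32)
    disturbed = warped_bgr.astype(np.float32) * rng.uniform(0.8, 1.2) + jitter
    disturbed = np.clip(disturbed, 0, 255).astype(np.uint8)
    disturbed[mask > 0] = warped_bgr[mask > 0]    # restore original makeup near key points
    return disturbed
```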
Further, before the model features are input into the GAN neural network model as conditions, the GAN neural network must be trained; referring to fig. 5, the training process is completed by self-disassembly of single made-up pictures. The training process comprises: 1) disturbing away the facial makeup information of the picture and extracting the facial structure information from it; 2) disturbing away the facial structure information of the picture and extracting the facial makeup information from it; 3) inputting the extracted facial structure information into an encoder to extract features; 4) inputting the obtained features and the facial makeup information into the GAN neural network model as input conditions; 5) training with the picture itself as the true value of the model.
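One training step of this self-disassembly scheme can be sketched as below; every network, perturbation function and loss is a hypothetical placeholder, and only the data flow follows the description above.

```python
def training_step(made_up_picture, encoder, generator,
                  extract_structure, extract_makeup, loss_fn):
    """One step of the self-disassembly training scheme: both input
    conditions and the ground truth come from the same picture."""
    cond_a = encoder(extract_structure(made_up_picture))  # structure features, makeup perturbed away
    cond_b = extract_makeup(made_up_picture)              # makeup info, structure warped/perturbed away
    fake = generator(cond_a, cond_b)
    # The picture itself serves as the model's true value.
    return loss_fn(fake, made_up_picture)
```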
The core of the training method is that the whole training process can be completed with only a single made-up picture. As is well known, neural network algorithms and parameters are numerous, and loss functions (LOSS) vary with project requirements; only through continuous training and iteration can the network better approach the most correct "answer". According to the most standard design idea, training a GAN for the virtual makeup changing task would require a bare-faced picture of the user, a target makeup picture, and a picture of the user wearing the target makeup; training the model with real after-makeup pictures gives better results, and the more samples, the more accurate the model output. In practical tests, however, this training strategy was found to place high requirements on training material: finding matched bare-face, makeup and after-makeup photos is very difficult for an ordinary company, so the amount of training material is small, few makeup styles can be exchanged, the model parameters and loss functions are hard to tune, the realism and fusion of the result are low, and many complex transformations show obvious discordance.
To overcome these shortcomings, the invention uses any collected made-up picture as training material, disassembles and extracts the facial structure information and the facial makeup information from the picture, uses these two pieces of information in place of the user pixel picture and the target makeup picture as input conditions, and uses the made-up picture itself as the model's output true value for training. Because the neural network is trained with the most correct answer and the training material changes from hard to obtain to easy to obtain, the model's dependence on training material is greatly reduced. With mass data training, the accuracy of the parameters and the rationality of the loss improve greatly, the convergence speed of the GAN neural network model becomes ideal, and after training the model synthesizes and outputs pictures directly, yielding synthesized pictures with good realism and fusion: the facial structure information in the picture is consistent with the input original picture, and because there is no interference from the original makeup, the superposition of the makeup information is more natural and real. The invention trains the neural network with this improved self-disassembly method; relatively accurate training can be finished with single pictures only, which greatly reduces the difficulty of collecting a training set, makes a large amount of data obtainable at low cost, improves the generalization ability of the network, markedly accelerates the convergence of the GAN neural network, and greatly improves the accuracy and authenticity of the generated synthesized image.
The method used in GAN model training is consistent with the actual computation process of the GAN model: whereas in actual makeup changing two pictures serve as the sources of conditions A and B, for convenience and accuracy the training method uses only one original picture as training material and disassembles conditions A and B from it. This skillfully exploits the fact that the material picture actually contains the information of both condition A and condition B. Meanwhile, the disassembly method in training is consistent with that used in real model processing: makeup information is removed when condition A is disassembled and extracted, and structure information is removed when condition B is disassembled and extracted. The specific processing means, including the various disturbance methods, are also consistent, so the system's operations become no harder to execute, and the same set of operations serves both for training the model and for producing the synthesized picture in the main process. This guarantees the pertinence and continuity of training, and model iteration can come ever closer to the user's requirements.
Further, when the makeup information of the face in the original picture is removed, in addition to random disturbance the following disturbance operation is also carried out: the colors of designated parts extracted from other face pictures are overlaid on the designated parts of the original picture by histogram matching, so that the makeup information at those positions in the original picture is stripped completely. In model training, every condition influences the final result to some degree, and it was found that during the disassembly of condition A, random disturbance of the parameters alone often strips the makeup information incompletely; colors such as lip color and eye-shadow color cannot be removed entirely, so the synthesized, overlaid makeup looks unnatural. It is desirable to hide the color of the lips and similar parts completely from the original picture, especially when the makeup color is dark and its area large. A simple method with an obvious effect is therefore designed: the makeup of the designated part of the original picture is replaced completely by the makeup of the corresponding part of a prepared picture. Because a makeup image copied from another picture contrasts strongly with the picture's own makeup, the disturbance easily achieves complete removal, the disturbance effect for the designated part or range is greatly improved, and the final picture is more natural and less disturbed by the original picture.
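A sketch of the histogram-matching overlay, assuming a scikit-image backend (0.19 or later) and a binary mask for the designated part, is given below; the library choice and function arguments are assumptions of this example.

```python
import numpy as np
from skimage.exposure import match_histograms

def strip_region_makeup(original_bgr, donor_bgr, region_mask):
    """Overlay the colour statistics of another face onto the designated
    region of the original picture, completely stripping the makeup there."""
    matched = match_histograms(original_bgr, donor_bgr, channel_axis=-1)
    matched = np.clip(matched, 0, 255).astype(original_bgr.dtype)
    result = original_bgr.copy()
    result[region_mask > 0] = matched[region_mask > 0]  # replace only the designated part
    return result
```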
The original made-up picture is fed back into the neural network as the true-value result to train it, so the neural network learns, for given input conditions, what the most accurate matching output of the model should be. There are many training methods for neural networks, but the basic flow and idea are almost the same: let the neural network know where it is wrong. A common deep-learning training procedure generally comprises: (1) preprocessing the data; (2) inputting the data into the neural network (each neuron first accumulates its weighted inputs and then passes the sum through an activation function as its output value) and propagating forward to obtain a score or result; (3) feeding the score or result into an error function (with a regularization penalty to prevent over-fitting) and comparing it with the expected value to obtain the error, which measures how well the network recognizes the input (the smaller the loss, the better); the total error is a sum of several terms; (4) determining the gradient vectors by back propagation (differentiating backwards through the error function and every activation function in the network, with the final goal of minimizing the error); (5) adjusting each weight along the gradient vectors so that the error tends toward 0 or the score or result converges; (6) repeating the process for a set number of iterations or until the average loss no longer drops (the lowest point); (7) finishing training.
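For reference, steps (2)-(6) of this generic procedure map onto a conventional gradient-descent loop such as the following sketch; PyTorch is assumed here only as an example framework, and the optimizer and hyper-parameters are illustrative.

```python
import torch

def train(model, data_loader, loss_fn, epochs=10, lr=1e-4, weight_decay=1e-5):
    """Minimal sketch: forward pass, error function with regularization,
    back propagation, and weight updates until the loss stops dropping."""
    optim = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    for epoch in range(epochs):
        for inputs, targets in data_loader:
            scores = model(inputs)            # (2) forward propagation
            loss = loss_fn(scores, targets)   # (3) error against the expected value
            optim.zero_grad()
            loss.backward()                   # (4) back propagation -> gradient vectors
            optim.step()                      # (5) adjust each weight along the gradient
```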
It can be seen that most neural network training is based on large amounts of data. At the start of training the model performs poorly in many scenes, so a large number of bad cases must be labeled and added to the training set so that the network learns what their true values should be and can predict such scenes accurately when it later meets similar images. In practice this kind of training is not very efficient; training is really an iterative process, and if the model can be trained with results that are almost identical to the standard answers, fewer bad cases appear later, the convergence rate rises quickly, and the model performs better.
In addition, the virtual makeup changing method of the embodiments of the invention described in conjunction with figs. 1 to 5 may be implemented by a corresponding electronic device. Fig. 6 is a diagram illustrating a hardware architecture 300 according to an embodiment of the invention.
The invention also discloses a system for realizing the virtual makeup changing method, which comprises the following steps: 1) The picture acquisition module is used for acquiring a picture to be made up uploaded by a user, and the user selects or uploads a target makeup picture; 2) The disassembling and extracting module is used for acquiring model characteristics of the picture to be made up and the target makeup picture; the model features comprise facial structure information of the picture to be made up and facial makeup information of the target makeup picture; 3) And the picture synthesis output module comprises a GAN neural network and is used for inputting the processed makeup picture and the characteristics extracted from the facial structure information of the user picture into the GAN neural network as conditions and outputting the synthesized makeup changing picture.
And, an apparatus, comprising: one or more processors; a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors are caused to perform the virtual makeup changing method of any of the preceding embodiments.
And a computer-readable storage medium on which a computer program is stored, which program, when executed by a processor, performs the virtual makeup changing method as described in any one of the preceding embodiments.
The apparatus 300 for implementing the present invention in this embodiment includes: the device comprises a processor 301, a memory 302, a communication interface 303 and a bus 310, wherein the processor 301, the memory 302 and the communication interface 303 are connected through the bus 310 and complete mutual communication.
In particular, the processor 301 may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing an embodiment of the present invention.
That is, the device 300 may be implemented to include: a processor 301, a memory 302, a communication interface 303, and a bus 310. The processor 301, memory 302 and communication interface 303 are coupled by a bus 310 and communicate with each other. The memory 302 is used to store program code; the processor 301 executes a program corresponding to the executable program code by reading the executable program code stored in the memory 302 for executing the method in any embodiment of the present invention, thereby implementing the method and apparatus described with reference to the figures.
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
As described above, only the specific embodiments of the present invention are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.
Claims (12)
1. A virtual makeup replacement method, the method comprising:
1) Uploading a picture to be made up by a user;
2) Selecting or uploading a target makeup picture by a user;
3) Disassembling and extracting model characteristics of the picture to be made up and the target makeup picture;
4) The model features comprise facial structure information of the picture to be made up and facial makeup information of the target makeup picture;
5) Inputting the processed makeup picture and the extracted features of the facial structure information of the user picture into a GAN neural network as conditions;
6) And the GAN neural network outputs the synthesized makeup changing picture according to the condition.
2. The method as set forth in claim 1, wherein in step 2), the user can select a target makeup picture provided by the system or upload a partial picture of the target makeup.
3. The method according to claim 1, wherein in step 3), the terminal uploads the picture information to the server to complete the parsing and extraction of the model features, or completes the parsing and extraction of the model features directly at the terminal and then uploads the feature information to the server.
4. The method according to claim 1, wherein the step of disassembling and extracting the model features of the picture to be made up further comprises, 1) obtaining a human head contour image by processing the two-dimensional image of the user; 2) inputting the two-dimensional human head contour image into a first neural network subjected to deep learning to perform regression of key points; 3) obtaining key point information of the user's head; 4) obtaining semantic segmentation maps of all parts of the head; 5) carrying out color disturbance on the picture, removing the facial makeup information by disturbing the Lab color values, brightness, contrast and shadow; 6) extracting features from the disturbed picture through an encoder to obtain a representation of the facial structure information of the user picture.
5. The method as claimed in claim 1, wherein the step of disassembling and extracting the model features of the target makeup picture further comprises, 1) converting the face stereo picture into a planar picture; 2) Performing WARP stretching deformation processing on the picture according to a certain preset rule; 3) Fixedly unfolding the picture into a picture with a designated part at a fixed position according to preset key point information; 4) The picture from the WARP to the fixed position carries the face makeup information of the target makeup picture, and can be directly input into the GAN neural network.
6. The method of claim 5, further comprising, 1) the preset key points are set to 5 points, and the designated parts include the corners of the eyes, the tip of the nose and the corners of the mouth; 2) Except for presetting key points, disturbing other points in a mode of increasing colors, adding and subtracting RGB values and changing contrast, brightness and light shadow; 3) The information of the preserved makeup comprises 5 key points and lipstick, eye shadow and color number information within a certain range around the key points.
7. The method as claimed in claim 1, further comprising a process of training the GAN neural network before model features are inputted into the GAN neural network model as conditions, wherein the training process is performed by a self-dismantling method using a single cosmetic picture.
8. The method of claim 7, wherein the training process comprises: 1) Disturbance removal is carried out on the face makeup information of the picture, and face structure information is extracted from the picture; 2) Disturbance removal is carried out on the picture face structure information, and face makeup information is extracted from the picture; 3) Inputting the extracted facial structure information of the picture into an encoder to extract features; 4) Inputting the obtained features and the face makeup information as input conditions into a GAN neural network model; 5) The picture itself is used as the true value of the model for training.
9. The method as claimed in claim 8, wherein when removing the makeup information of the face in the original picture, in addition to random disturbance, a disturbance operation is performed to overlay the extracted color of the specified part in the other face picture on the specified part in the original picture by means of histogram matching, so that the makeup information at the specified position in the original picture is completely stripped.
10. A system for implementation of a virtual makeup replacement method, comprising: 1) The picture acquisition module is used for acquiring a picture to be made up uploaded by a user, and the user selects or uploads a target makeup picture; 2) The disassembling and extracting module is used for acquiring model characteristics of the picture to be made up and the target makeup picture; the model features comprise facial structure information of the picture to be made up and facial makeup information of the target makeup picture; 3) And the picture synthesis output module comprises a GAN neural network and is used for inputting the processed makeup picture and the characteristics extracted from the facial structure information of the user picture into the GAN neural network as conditions and outputting the synthesized makeup changing picture.
11. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-9.
12. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus; the memory is used for storing a computer program; the processor is used for implementing the method steps of any one of claims 1-9 when executing the program stored in the memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111169098.XA CN115936796A (en) | 2021-10-02 | 2021-10-02 | Virtual makeup changing method, system, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111169098.XA CN115936796A (en) | 2021-10-02 | 2021-10-02 | Virtual makeup changing method, system, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115936796A true CN115936796A (en) | 2023-04-07 |
Family
ID=86647823
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111169098.XA Pending CN115936796A (en) | 2021-10-02 | 2021-10-02 | Virtual makeup changing method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115936796A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116486054A (en) * | 2023-06-25 | 2023-07-25 | 四川易景智能终端有限公司 | AR virtual cosmetic mirror and working method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109978930B (en) | Stylized human face three-dimensional model automatic generation method based on single image | |
JP7526412B2 (en) | Method for training a parameter estimation model, apparatus for training a parameter estimation model, device and storage medium | |
CN107924579A (en) | The method for generating personalization 3D head models or 3D body models | |
US20240037852A1 (en) | Method and device for reconstructing three-dimensional faces and storage medium | |
US11587288B2 (en) | Methods and systems for constructing facial position map | |
US11562536B2 (en) | Methods and systems for personalized 3D head model deformation | |
CN104123749A (en) | Picture processing method and system | |
JP7462120B2 (en) | Method, system and computer program for extracting color from two-dimensional (2D) facial images | |
CN110796593A (en) | Image processing method, device, medium and electronic equipment based on artificial intelligence | |
CN113570684A (en) | Image processing method, image processing device, computer equipment and storage medium | |
US11417053B1 (en) | Methods and systems for forming personalized 3D head and facial models | |
CN113870404B (en) | Skin rendering method of 3D model and display equipment | |
CN113362422B (en) | Shadow robust makeup transfer system and method based on decoupling representation | |
CN113808277B (en) | Image processing method and related device | |
CN114821675B (en) | Object processing method and system and processor | |
CN114693570A (en) | Human body model image fusion processing method, device and storage medium | |
CN115936796A (en) | Virtual makeup changing method, system, equipment and storage medium | |
CN116740281A (en) | Three-dimensional head model generation method, three-dimensional head model generation device, electronic equipment and storage medium | |
Beacco et al. | Automatic 3D avatar generation from a single RBG frontal image | |
CN113822986B (en) | Virtual clothes changing method and system based on improved GRNet network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |