CN111860167B - Face fusion model acquisition method, face fusion model acquisition device and storage medium - Google Patents

Face fusion model acquisition method, face fusion model acquisition device and storage medium

Info

Publication number
CN111860167B
CN111860167B
Authority
CN
China
Prior art keywords
face
fusion
attribute information
model
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010560733.6A
Other languages
Chinese (zh)
Other versions
CN111860167A (en)
Inventor
余席宇
洪智滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010560733.6A
Publication of CN111860167A
Application granted
Publication of CN111860167B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The application discloses a face fusion model acquisition method, a face fusion apparatus, and a storage medium, relating to the fields of artificial intelligence, deep learning, and computer vision. The method may comprise: obtaining a training sample comprising a target face image and a template face image; and using the training sample to train a face fusion model for performing face fusion on an input target face image and template face image, wherein the training comprises: acquiring attribute information of the target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face; constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face; and performing self-supervised learning of the face fusion model by using the attribute consistency loss function. The scheme can be used in cloud services, improves the face fusion effect, and has wide applicability.

Description

Face fusion model acquisition method, face fusion model acquisition device and storage medium
Technical Field
The present application relates to computer application technologies, and in particular to a method, an apparatus, and a storage medium for acquiring a face fusion model and performing face fusion, in the fields of artificial intelligence, deep learning, and computer vision.
Background
At present, face fusion technology is widely applied in scenarios such as virtual avatars and entertainment effects in long- and short-form video and photo-album applications. Face fusion typically requires retaining the identity information of one face image while fusing in the attribute information of another face image.
Because of their low implementation cost, face fusion methods based on blendshapes and the like are currently the most common, but their fusion effect is poor and the resulting fused images are not realistic enough.
Disclosure of Invention
The application provides a face fusion model acquisition method, a face fusion model acquisition apparatus, and a storage medium.
A face fusion model acquisition method comprises the following steps:
obtaining a training sample, wherein the training sample comprises: a target face image and a template face image;
training, by using the training sample, a face fusion model for performing face fusion on an input target face image and template face image, wherein the training comprises:
acquiring attribute information of a target face in the target face image;
performing face fusion on the target face image and the template face image to obtain a fused face;
acquiring attribute information of the fused face;
and constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
A face fusion method comprising:
acquiring a target face image and a template face image;
and inputting the target face image and the template face image into a face fusion model obtained according to the method above, to obtain an output face fusion result.
A face fusion model acquisition device, comprising: a sample acquisition module and a model training module;
the sample acquisition module is configured to obtain a training sample, where the training sample comprises: a target face image and a template face image;
the model training module is configured to train, by using the training sample, a face fusion model for performing face fusion on an input target face image and template face image, including: acquiring attribute information of a target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face; and constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
A face fusion apparatus, comprising: an image acquisition module and a face fusion module;
the image acquisition module is configured to acquire a target face image and a template face image;
the face fusion module is configured to input the target face image and the template face image into a face fusion model obtained according to the method above, to obtain an output face fusion result.
An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method as described above.
A computer program product comprising a computer program which, when executed by a processor, implements a method as described above.
One embodiment of the above application has the following advantages or benefits. A face fusion model can be obtained through training, during which the attribute information of the target face and of the fused face is acquired separately, realizing the decoupling of identity and attributes; an attribute consistency loss function can be constructed from the acquired attribute information and used to guide the training of the model, thereby improving the training effect and, in turn, the fusion effect when the trained model is used for face fusion, so that more realistic fused images can be obtained. Moreover, given any target face image and template face image, the trained model can produce the corresponding face fusion result, so the scheme has wide applicability and low implementation cost. It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is a flowchart of an embodiment of a method for acquiring a face fusion model according to the present application;
FIG. 2 is a schematic diagram of a face fusion model acquisition process described in the present application;
FIG. 3 is a flowchart of an embodiment of a face fusion method described herein;
fig. 4 is a schematic structural diagram of an embodiment of a face fusion model acquisition device 40 according to the present application;
fig. 5 is a schematic structural diagram of an embodiment of a face fusion device 50 according to the present application;
fig. 6 is a block diagram of an electronic device for implementing the method according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In addition, it should be understood that the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
Fig. 1 is a flowchart of an embodiment of a method for obtaining a face fusion model described in the present application. As shown in fig. 1, the following detailed implementation is included.
In 101, a training sample is obtained, where the training sample comprises: a target face image and a template face image.
In 102, the training sample is used to train a face fusion model for performing face fusion on an input target face image and template face image, where the training process comprises the contents shown in 103-106.
In 103, attribute information of a target face in the target face image is acquired.
In 104, the target face image and the template face image are subjected to face fusion to obtain a fused face.
In 105, attribute information of the fused face is acquired.
In 106, an attribute consistency loss function is constructed according to the attribute information of the target face and the attribute information of the fused face, and self-supervised learning of the face fusion model is performed by using the attribute consistency loss function.
It can be seen that, in this embodiment, a face fusion model can be obtained through training. During training, the attribute information of the target face and of the fused face is acquired separately, realizing the decoupling of identity and attributes; an attribute consistency loss function is constructed from the acquired attribute information and used to guide the training of the model. This improves the model training effect and, in turn, the fusion effect when face fusion is performed with the trained model, so that more realistic fused images can be obtained.
During model training, a plurality of training samples can be obtained according to actual needs. Each training sample includes a target face image and a template face image, which correspond to two different persons; that is, each training sample corresponds to a pair of persons, and the persons corresponding to different training samples may also differ.
For the target face image in a training sample, the attribute information of the target face can be acquired by using a three-dimensional face reconstruction method. Preferably, the three-dimensional morphable model (3DMM, 3D Morphable Model) parameters of the target face are obtained with the three-dimensional face reconstruction method, and the attribute information of the target face is then extracted from the 3DMM parameters.
The three-dimensional face reconstruction method itself can be implemented with the prior art. Different dimensions of the 3DMM parameters correspond to the person's identity, expression, pose, illumination, and other information; the attribute information here mainly comprises the expression and pose information.
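By way of illustration only, the following sketch shows how such attribute information might be sliced out of a fitted 3DMM parameter vector; the block layout and the dimension sizes (ID_DIM, EXP_DIM, POSE_DIM, LIGHT_DIM) are assumptions for illustration, not the concrete parameterization used by the application.

```python
import torch

# Hypothetical layout of a fitted 3DMM parameter vector; real fitters
# (and the application itself) may order and size these blocks differently.
ID_DIM, EXP_DIM, POSE_DIM, LIGHT_DIM = 80, 64, 6, 27


def extract_attributes(params_3dmm: torch.Tensor) -> torch.Tensor:
    """Keep the expression and pose blocks; drop identity and illumination."""
    assert params_3dmm.shape[-1] == ID_DIM + EXP_DIM + POSE_DIM + LIGHT_DIM
    expression = params_3dmm[..., ID_DIM:ID_DIM + EXP_DIM]
    pose = params_3dmm[..., ID_DIM + EXP_DIM:ID_DIM + EXP_DIM + POSE_DIM]
    return torch.cat([expression, pose], dim=-1)
```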
In addition, the target face image and the template face image are fused to obtain a fused face. Preferably, a face fusion method based on a generative adversarial network (GAN, Generative Adversarial Network) is used to fuse the target face image and the template face image. In practice, any GAN-based face fusion method may be used, such as the existing FaceShifter method.
Face fusion typically requires retaining the identity information of one face image while fusing in the attribute information of another. In the FaceShifter method, an auto-encoder-based attribute reconstruction network is applied to the target face, and its multi-scale features are fused with the identity information of the template face.
The GAN-based face fusion method achieves a good fusion effect, which facilitates subsequent processing; moreover, combining the GAN-based face fusion method with the three-dimensional face reconstruction method exploits the advantages of both and further improves the model training effect.
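By way of illustration only, a FaceShifter-style fusion generator might expose the interface sketched below; the module decomposition and the names are assumptions for illustration, not the concrete network of the application.

```python
import torch
import torch.nn as nn


class FusionGenerator(nn.Module):
    """Sketch of a GAN fusion generator: template identity + target attributes."""

    def __init__(self, identity_encoder: nn.Module,
                 attribute_encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.identity_encoder = identity_encoder    # identity embedding network
        self.attribute_encoder = attribute_encoder  # auto-encoder yielding multi-scale features
        self.decoder = decoder                      # injects identity into attribute features

    def forward(self, target_img: torch.Tensor,
                template_img: torch.Tensor) -> torch.Tensor:
        identity = self.identity_encoder(template_img)   # identity to retain
        attr_feats = self.attribute_encoder(target_img)  # attributes to retain
        return self.decoder(attr_feats, identity)        # fused face image
```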
Then, the attribute information of the fused face can be acquired with the three-dimensional face reconstruction method. Preferably, the 3DMM parameters of the fused face are obtained, and the attribute information of the fused face is extracted from the 3DMM parameters.
Based on the three-dimensional face reconstruction method, the required attribute information can be obtained conveniently and accurately, realizing the decoupling of identity and attributes and facilitating subsequent processing.
After the above processing, the attribute information of the target face and of the fused face has been obtained separately. When faces are fused, these two pieces of attribute information are expected to be consistent, so an attribute consistency loss function can be constructed from them and used for self-supervised learning of the face fusion model.
Preferably, the L2 norm (L2-norm) of the difference between the attribute information of the target face and the attribute information of the fused face is calculated and used as the attribute consistency loss function; its specific form may be ||A - B||₂, where A and B denote the attribute information of the target face and the attribute information of the fused face, respectively.
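As an illustration, a minimal sketch of this loss follows, assuming the attribute information is given as fixed-length vectors batched along the first dimension:

```python
import torch


def attribute_consistency_loss(attr_target: torch.Tensor,
                               attr_fused: torch.Tensor) -> torch.Tensor:
    """||A - B||_2 between target-face and fused-face attribute vectors,
    averaged over the batch."""
    return torch.norm(attr_target - attr_fused, p=2, dim=-1).mean()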
In addition, self-supervised learning of the face fusion model can be performed by combining the attribute consistency loss function with the identity consistency loss function of the GAN-based face fusion method, for example the identity consistency loss function in the FaceShifter method.
In this way, the attributes of the fused face are kept consistent with those of the target face, and the identity of the fused face is kept consistent with that of the template face. A sketch of such a combined training step is given below.
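By way of illustration only, the following sketch shows one such combined self-supervised training step; the loss weights, the cosine-distance form of the identity consistency term, and the differentiable 3DMM regression network fit_3dmm are assumptions for illustration (reusing the extract_attributes helper sketched above), not the concrete choices of the application.

```python
import torch
import torch.nn.functional as F


def train_step(generator, face_recognizer, fit_3dmm, optimizer,
               target_img, template_img, w_attr=1.0, w_id=1.0):
    """One self-supervised step combining attribute and identity consistency."""
    fused = generator(target_img, template_img)

    # Attribute consistency: the fused face keeps the target face's attributes.
    # fit_3dmm is assumed to be a differentiable 3DMM regression network.
    attr_target = extract_attributes(fit_3dmm(target_img))
    attr_fused = extract_attributes(fit_3dmm(fused))
    loss_attr = torch.norm(attr_target - attr_fused, p=2, dim=-1).mean()

    # Identity consistency: the fused face keeps the template face's identity
    # (written here as a cosine-distance term, one common choice).
    id_template = face_recognizer(template_img)
    id_fused = face_recognizer(fused)
    loss_id = 1.0 - F.cosine_similarity(id_template, id_fused, dim=-1).mean()

    loss = w_attr * loss_attr + w_id * loss_id
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```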
In summary, fig. 2 is a schematic diagram of the face fusion model acquisition process described in the present application. As shown in fig. 2, the attribute information of the target face in the target face image is acquired with the three-dimensional face reconstruction method, for example by obtaining the 3DMM parameters of the target face and extracting the attribute information from them. The target face image and the template face image are fused, for example with the GAN-based face fusion method, to obtain a fused face. The attribute information of the fused face is then acquired in the same way, for example by obtaining the 3DMM parameters of the fused face and extracting the attribute information from them. Further, an attribute consistency loss function is constructed according to the attribute information of the target face and the attribute information of the fused face, for example as the L2-norm of their difference, and self-supervised learning of the face fusion model is performed by combining it with the identity consistency loss function of the GAN-based face fusion method.
Fig. 3 is a flowchart of an embodiment of a face fusion method described in the present application. As shown in fig. 3, the following detailed implementation is included.
In 301, a target face image and a template face image are acquired.
In 302, a target face image and a template face image are input into a face fusion model obtained through training in advance, and an output face fusion result is obtained.
The face fusion model may be a face fusion model obtained according to the method shown in fig. 1.
After the face fusion model is obtained, it can be used for actual face fusion: any target face image and template face image to be fused can be input into the model to obtain an output face fusion result, as sketched below.
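By way of illustration only, a minimal usage sketch follows; preprocessing such as face alignment and normalization is assumed to have been done, and the model interface matches the generator sketched earlier.

```python
import torch


@torch.no_grad()
def fuse_faces(model: torch.nn.Module, target_img: torch.Tensor,
               template_img: torch.Tensor) -> torch.Tensor:
    """Run the trained fusion model on one aligned, normalized image pair."""
    model.eval()
    fused = model(target_img.unsqueeze(0), template_img.unsqueeze(0))
    return fused.squeeze(0)
```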
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of combined actions, but those skilled in the art will appreciate that the present application is not limited by the order of the actions described, as some steps may be performed in another order or simultaneously. Furthermore, the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application. Portions not described in detail in one embodiment may be found in the descriptions of the other embodiments.
The scheme can be used for cloud services. The foregoing is a description of embodiments of the method, and the following further describes embodiments of the device.
Fig. 4 is a schematic structural diagram of an embodiment of a face fusion model acquisition device 40 according to the present application. As shown in fig. 4, it comprises: a sample acquisition module 401 and a model training module 402.
The sample obtaining module 401 is configured to obtain a training sample, where the training sample includes: a target face image and a template face image.
The model training module 402 is configured to train, by using the training sample, a face fusion model for performing face fusion on an input target face image and template face image, including: acquiring attribute information of a target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face; and constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
For the target face image in a training sample, the model training module 402 may acquire the attribute information of the target face by using a three-dimensional face reconstruction method, for example by obtaining the 3DMM parameters of the target face with the three-dimensional face reconstruction method and extracting the attribute information of the target face from the 3DMM parameters.
Different dimensions of the 3DMM parameters correspond to the person's identity, expression, pose, illumination, and other information; the attribute information mainly comprises the expression and pose information.
The model training module 402 may also perform face fusion on the target face image and the template face image by using a GAN-based face fusion method. In practice, any GAN-based face fusion method can be used, such as the existing FaceShifter method.
The model training module 402 may further acquire the attribute information of the fused face by using the three-dimensional face reconstruction method, for example by obtaining the 3DMM parameters of the fused face and extracting the attribute information of the fused face from them.
After the attribute information of the target face and the attribute information of the fused face are obtained, the model training module 402 may calculate the L2-norm of the difference between the two as the attribute consistency loss function, which may take the specific form ||A - B||₂, where A and B denote the attribute information of the target face and of the fused face, respectively.
In addition, the model training module 402 may also perform self-supervised learning of the face fusion model by combining the attribute consistency loss function with the identity consistency loss function of the GAN-based face fusion method, for example the identity consistency loss function in the FaceShifter method.
Fig. 5 is a schematic structural diagram of an embodiment of a face fusion device 50 according to the present application. As shown in fig. 5, it comprises: an image acquisition module 501 and a face fusion module 502.
The image acquisition module 501 is configured to acquire a target face image and a template face image.
The face fusion module 502 is configured to input the target face image and the template face image into a face fusion model obtained by training in advance, and obtain an output face fusion result.
The face fusion model may be a face fusion model obtained according to the method shown in fig. 1, that is, a face fusion model obtained by the apparatus shown in fig. 4.
After the face fusion model is obtained, it can be used for actual face fusion: any target face image and template face image to be fused can be input into the model to obtain an output face fusion result.
The specific workflow of the embodiment of the apparatus shown in fig. 4 and 5 is referred to the related description in the foregoing method embodiment, and will not be repeated.
In summary, with the solution of the apparatus embodiments of the present application, a face fusion model can be obtained through training. The three-dimensional face reconstruction method decouples the identity and attributes of the target face and of the fused face; an attribute consistency loss function is constructed from the acquired attribute information and used to guide the training of the model, thereby improving the training effect and, in turn, the fusion effect when the trained model performs face fusion, so that more realistic fused images can be obtained. Given any target face image and template face image, the trained model can produce the corresponding face fusion result, so the scheme has wide applicability and low implementation cost. In addition, the GAN-based face fusion method achieves a good fusion effect and facilitates subsequent processing, and combining it with the three-dimensional face reconstruction method exploits the advantages of both and further improves the model training effect. Based on the three-dimensional face reconstruction method, the required attribute information can be obtained conveniently and accurately. Furthermore, self-supervised learning of the face fusion model can be performed by combining the attribute consistency loss function with the identity consistency loss function of the GAN-based face fusion method, ensuring that the attributes of the fused face are consistent with those of the target face and that the identity of the fused face is consistent with that of the template face.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
Fig. 6 is a block diagram of an electronic device for implementing the method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant as examples only and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 6, the electronic device includes: one or more processors Y01, a memory Y02, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a graphical user interface on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used with multiple memories, if desired. Likewise, multiple electronic devices may be connected, each providing some of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). In fig. 6, one processor Y01 is taken as an example.
The memory Y02 is a non-transitory computer readable storage medium provided in the present application. Wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the methods provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein.
The memory Y02 serves as a non-transitory computer readable storage medium, and may be used to store a non-transitory software program, a non-transitory computer executable program, and modules, such as program instructions/modules corresponding to the methods in the embodiments of the present application. The processor Y01 executes various functional applications of the server and data processing, i.e., implements the methods in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory Y02.
The memory Y02 may include a program storage area and a data storage area, where the program storage area may store an operating system and application programs required by at least one function, and the data storage area may store data created according to the use of the electronic device, etc. In addition, the memory Y02 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory Y02 may optionally include memories remotely located relative to the processor Y01, and these remote memories may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, blockchain networks, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device Y03 and an output device Y04. The processor Y01, memory Y02, input device Y03, and output device Y04 may be connected by a bus or otherwise, with bus connections being exemplified in fig. 6.
The input device Y03 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, such as a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, and like input devices. The output means Y04 may include a display device, an auxiliary lighting means, a tactile feedback means (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display, a light emitting diode display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific integrated circuitry, computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computing programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. The terms "machine-readable medium" and "computer-readable medium" as used herein refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices) for providing machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a cathode ray tube or a liquid crystal display monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks, wide area networks, blockchain networks, and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved, and are not limited herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (12)

1. A face fusion model acquisition method, characterized by comprising the following steps:
obtaining a training sample, wherein the training sample comprises: a target face image and a template face image;
training, by using the training sample, a face fusion model for performing face fusion on an input target face image and template face image, wherein the training comprises:
acquiring attribute information of a target face in the target face image, comprising: acquiring three-dimensional morphable model parameters of the target face by using a three-dimensional face reconstruction method, and extracting the attribute information of the target face from the three-dimensional morphable model parameters;
performing face fusion on the target face image and the template face image to obtain a fused face;
acquiring attribute information of the fused face, comprising: acquiring three-dimensional morphable model parameters of the fused face by using the three-dimensional face reconstruction method, and extracting the attribute information of the fused face from the three-dimensional morphable model parameters;
and constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
2. The face fusion model acquisition method according to claim 1, wherein
the performing face fusion on the target face image and the template face image comprises:
performing face fusion on the target face image and the template face image by using a face fusion method based on a generative adversarial network.
3. The face fusion model acquisition method according to claim 1, wherein
the constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face comprises:
calculating the L2 norm of the difference between the attribute information of the target face and the attribute information of the fused face as the attribute consistency loss function.
4. The face fusion model acquisition method according to claim 2, wherein
the performing self-supervised learning of the face fusion model by using the attribute consistency loss function comprises:
performing self-supervised learning of the face fusion model by combining the attribute consistency loss function with an identity consistency loss function of the face fusion method based on the generative adversarial network.
5. A method of face fusion comprising:
acquiring a target face image and a template face image;
inputting the target face image and the template face image into a face fusion model obtained by the method according to any one of claims 1-4 to obtain an output face fusion result.
6. A face fusion model acquisition device, characterized by comprising: a sample acquisition module and a model training module;
the sample acquisition module is configured to obtain a training sample, wherein the training sample comprises: a target face image and a template face image;
the model training module is configured to train, by using the training sample, a face fusion model for performing face fusion on an input target face image and template face image, including: acquiring attribute information of a target face in the target face image by acquiring three-dimensional morphable model parameters of the target face with a three-dimensional face reconstruction method and extracting the attribute information of the target face from the three-dimensional morphable model parameters; performing face fusion on the target face image and the template face image to obtain a fused face; acquiring attribute information of the fused face by acquiring three-dimensional morphable model parameters of the fused face with the three-dimensional face reconstruction method and extracting the attribute information of the fused face from the three-dimensional morphable model parameters; and constructing an attribute consistency loss function according to the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model by using the attribute consistency loss function.
7. The face fusion model acquisition apparatus according to claim 6, wherein,
the model training module performs face fusion on the target face image and the template face image by using a face fusion method based on a generated countermeasure network.
8. The face fusion model acquisition apparatus according to claim 6, wherein,
the model training module calculates the L2 norm of the difference between the attribute information of the target face and the attribute information of the fused face as the attribute consistency loss function.
9. The face fusion model acquisition apparatus according to claim 7, wherein,
the model training module combines the attribute consistency loss function with the identity consistency loss function of the face fusion method based on the generative adversarial network to perform self-supervised learning of the face fusion model.
10. A face fusion apparatus, comprising: an image acquisition module and a face fusion module;
the image acquisition module is used for acquiring a target face image and a template face image;
the face fusion module is configured to input the target face image and the template face image into a face fusion model obtained by the method according to any one of claims 1 to 4, and obtain an output face fusion result.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-5.
CN202010560733.6A 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium Active CN111860167B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010560733.6A CN111860167B (en) 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010560733.6A CN111860167B (en) 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium

Publications (2)

Publication Number Publication Date
CN111860167A CN111860167A (en) 2020-10-30
CN111860167B true CN111860167B (en) 2024-01-26

Family

ID=72986226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010560733.6A Active CN111860167B (en) 2020-06-18 2020-06-18 Face fusion model acquisition method, face fusion model acquisition device and storage medium

Country Status (1)

Country Link
CN (1) CN111860167B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113052025A (en) * 2021-03-12 2021-06-29 咪咕文化科技有限公司 Training method of image fusion model, image fusion method and electronic equipment
CN113705316A (en) * 2021-04-13 2021-11-26 腾讯科技(深圳)有限公司 Method, device and equipment for acquiring virtual image and storage medium
CN112950775A (en) * 2021-04-27 2021-06-11 南京大学 Three-dimensional face model reconstruction method and system based on self-supervision learning
CN113362263B (en) * 2021-05-27 2023-09-15 百度在线网络技术(北京)有限公司 Method, apparatus, medium and program product for transforming an image of a virtual idol
CN113642491A (en) * 2021-08-20 2021-11-12 北京百度网讯科技有限公司 Face fusion method, and training method and device of face fusion model
CN114973349A (en) * 2021-08-20 2022-08-30 腾讯科技(深圳)有限公司 Face image processing method and training method of face image processing model
CN113850168A (en) * 2021-09-16 2021-12-28 百果园技术(新加坡)有限公司 Fusion method, device and equipment of face pictures and storage medium
CN113902956B (en) * 2021-09-30 2023-04-07 北京百度网讯科技有限公司 Training method of fusion model, image fusion method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107578017A (en) * 2017-09-08 2018-01-12 百度在线网络技术(北京)有限公司 Method and apparatus for generating image
CN110503703A (en) * 2019-08-27 2019-11-26 北京百度网讯科技有限公司 Method and apparatus for generating image

Also Published As

Publication number Publication date
CN111860167A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111860167B (en) Face fusion model acquisition method, face fusion model acquisition device and storage medium
EP3920147A1 (en) Method and apparatus for generating virtual avatar, device, storage medium and computer program product
US11587300B2 (en) Method and apparatus for generating three-dimensional virtual image, and storage medium
CN111832745B (en) Data augmentation method and device and electronic equipment
CN111968203B (en) Animation driving method, device, electronic equipment and storage medium
CN111861955A (en) Method and device for constructing image editing model
CN112001180A (en) Multi-mode pre-training model acquisition method and device, electronic equipment and storage medium
CN110806865B (en) Animation generation method, device, equipment and computer readable storage medium
US20210241498A1 (en) Method and device for processing image, related electronic device and storage medium
CN111967297B (en) Image semantic segmentation method and device, electronic equipment and medium
CN111739167B (en) 3D human head reconstruction method, device, equipment and medium
CN111709875B (en) Image processing method, device, electronic equipment and storage medium
CN111539897A (en) Method and apparatus for generating image conversion model
CN112241716B (en) Training sample generation method and device
CN111695516B (en) Thermodynamic diagram generation method, device and equipment
CN111539347B (en) Method and device for detecting target
JP2022185096A (en) Method and apparatus of generating virtual idol, and electronic device
JP7168837B2 (en) PRODUCT INFORMATION METHOD, APPARATUS, DEVICE, STORAGE MEDIUM, AND PROGRAM
CN112116548A (en) Method and device for synthesizing face image
CN116167426A (en) Training method of face key point positioning model and face key point positioning method
CN112508163B (en) Method and device for displaying subgraph in neural network model and storage medium
CN111898489B (en) Method and device for marking palm pose, electronic equipment and storage medium
CN112053280A (en) Panoramic map display method, device, equipment and storage medium
CN111385489B (en) Method, device and equipment for manufacturing short video cover and storage medium
US11670029B2 (en) Method and apparatus for processing character image data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant