WO2023036239A1 - Human face fusion method and apparatus, device and storage medium - Google Patents
- Publication number
- WO2023036239A1 (PCT/CN2022/117804)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- facial
- face image
- material object
- fused
- Prior art date
Classifications
- G06N3/0475 — Neural networks; architecture; generative networks
- G06N3/094 — Neural networks; learning methods; adversarial learning
- G06T11/00 — 2D [Two Dimensional] image generation
- G06T7/33 — Determination of transform parameters for the alignment of images, i.e. image registration, using feature-based methods
Definitions
- the present disclosure relates to the technical field of image processing, and in particular to a face fusion method, device, equipment and storage medium.
- the present disclosure provides a face fusion method, device, equipment and storage medium, which train a target network based on a sample group and provide a face fusion special effect function, so as to meet users' diverse needs for special effect gameplay.
- the present disclosure provides a face fusion method, the method comprising:
- acquiring a face image to be fused and a facial image of a material object to be fused; and inputting the face image to be fused and the facial image of the material object to be fused to a trained target network to obtain a target fused image, wherein the target network is trained based on a sample group, and the sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image and the facial image match.
- the present disclosure provides a human face fusion device, which includes:
- an image acquisition module, configured to acquire a face image to be fused and a facial image of a material object to be fused;
- a fusion module, configured to input the face image to be fused and the facial image of the material object to be fused to a trained target network to obtain a target fused image, wherein the target network is trained based on a sample group, and the sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image and the facial image match.
- the embodiment of the present invention also provides an electronic device, which includes:
- one or more processors; and a storage apparatus configured to store one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the face fusion method provided in any embodiment of the present invention.
- An embodiment of the present invention also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the face fusion method provided by any embodiment of the present invention is implemented.
- Embodiments of the present disclosure provide a face fusion method, apparatus, device, and storage medium, capable of acquiring a face image to be fused and a facial image of a material object to be fused, and inputting them to a trained target network to obtain a target fused image. Because the target network is trained based on a sample group that includes a face image, a facial image of the material object, and a fused image obtained by fusing the two, where the facial image of the material object is obtained by rendering an initial facial image of the material object and the face image matches the facial feature parameters of the facial image, the trained target network can fuse the face image with the facial image. This yields a special effect gameplay that combines the human face with the material, improves the interest of the video, thereby improving the user experience and meeting users' diverse needs for special effect gameplay.
- FIG. 1 is a schematic flowchart of a face fusion method provided by an embodiment of the present disclosure
- FIG. 2 is a schematic flowchart of a target network training method provided by an embodiment of the present disclosure
- Fig. 3 is a schematic flow chart of obtaining a sample group provided by an embodiment of the present disclosure
- FIG. 4 is a schematic flowchart of another face fusion method provided by an embodiment of the present disclosure.
- Fig. 5 is a schematic structural diagram of a face fusion device provided by an embodiment of the present disclosure.
- Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
- embodiments of the present disclosure provide a face fusion method, device, device, and storage medium, which can use a trained target network to fuse a face image with a facial image of a material, obtaining a special effect gameplay that fuses the human face with the material. This improves the interest of the video, which in turn improves the user experience and meets users' diverse needs for special effect gameplay.
- the face fusion method provided by the embodiment of the present disclosure will first be described below with reference to FIG. 1 to FIG. 4 .
- Fig. 1 shows a schematic flowchart of a face fusion method provided by an embodiment of the present disclosure.
- the face fusion method shown in FIG. 1 may be executed by an electronic device.
- electronic devices may include devices with communication functions such as mobile phones, tablet computers, desktop computers, notebook computers, vehicle terminals, wearable devices, all-in-one computers, and smart home devices, as well as devices simulated by virtual machines or simulators.
- the face fusion method may include the following steps.
- the face image to be fused may be the original face image that needs to be fused.
- the face image may be an image of a real person under any facial feature parameter.
- the facial feature parameters may include shooting angle, expression, light intensity, etc., which are not limited here.
- the facial image of the material object to be fused may be a material image fused with a human face image.
- the facial image may be an image of the material object under any facial feature parameter.
- the material object may be a real person or a cartoon character.
- the electronic device obtains the facial image of the material object to be fused selected by the user, collects a face image, and uses the collected face image as the face image to be fused, so as to further perform image fusion based on the face image to be fused and the facial image of the material object to be fused.
- the target network may be a generative adversarial network (GAN), which is a network model that includes a generator and a discriminator. Based on the target network trained with the sample group, the face image to be fused and the facial image of the material object to be fused can be fused to obtain the target fused image.
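- As general background on the GAN named above (this formula is standard GAN theory, not quoted from this publication), such a network is trained with the minimax objective in which the generator G tries to fool the discriminator D:

```latex
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
+ \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

- In the setting described here, G would map the face image and the material facial image to a candidate fused image, and D would judge it against the fused images in the sample group.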
- the facial image of the material object is obtained after rendering the initial facial image of the material object, and the face image matches the facial feature parameters of the facial image.
- the method for rendering the initial facial image of the material object may include: adjusting the shooting angle, expression, and light intensity of the initial facial image of the material object to obtain the facial image of the material object, so that the facial feature parameters of the face image and the facial image match.
- the face image, the facial image of the material object, and the fused image obtained by fusing the two can be input to a preset network; the preset network fuses the face image and the facial image to obtain a predicted fused image, and the network parameters of the preset network are adjusted based on the predicted fused image and the sample's fused image until the predicted fused image output by the network matches the fused image, so that the network parameters stabilize and the trained target network is obtained.
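- The parameter-adjustment loop described above (tune the preset network until its predicted fused image matches the sample's fused image) can be illustrated with a deliberately tiny stand-in: a single blend weight fitted by gradient descent on the mean squared error. The function name and the one-parameter model are illustrative assumptions, not the patent's actual network.

```python
import numpy as np

def train_blend(face, material, target_fused, lr=0.5, steps=200):
    """Fit one blend weight alpha so that
    alpha * face + (1 - alpha) * material approximates the sample's
    fused image, by gradient descent on the mean squared error."""
    alpha = 0.0
    diff = face - material  # d(pred)/d(alpha), constant per pixel
    for _ in range(steps):
        pred = alpha * face + (1 - alpha) * material
        grad = 2.0 * np.mean((pred - target_fused) * diff)
        alpha -= lr * grad
    return alpha
```

- A real preset network would have many parameters and a richer fusion function, but the stopping criterion is the same: iterate until the prediction matches the ground-truth fused sample.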
- after the trained target network is obtained, the face image to be fused and the facial image of the material object to be fused are input to it, and the target network fuses the two to obtain the target fused image.
- in this way, the face image to be fused and the facial image of the material object to be fused can be acquired and input to the trained target network to obtain the target fused image. Because the target network is trained based on a sample group that includes a face image, a facial image of the material object, and a fused image obtained by fusing the two, where the facial image of the material object is obtained by rendering an initial facial image of the material object and the face image matches the facial feature parameters of the facial image, the trained target network can fuse the face image with the facial image. This yields a special effect gameplay that combines the human face with the material, improves the interest of the video, which in turn improves the user experience and meets users' diverse needs for special effect gameplay.
- the electronic device may also perform a model training step on the target network before performing S110.
- Fig. 2 shows a schematic flowchart of a target network training method provided by an embodiment of the present disclosure.
- the target network training method may further include the following steps.
- the sample group includes a face image, a facial image of a material object, and a fused image obtained by fusing the face image and the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the face image matches the facial feature parameters of the facial image.
- FIG. 3 shows a schematic flow chart of acquiring a sample group.
- S210 may include:
- each facial image is an image of a material object under any facial feature parameter
- each human face image is an image of a real person under any facial feature parameter.
- the facial images of the material objects may be the original facial images corresponding to the multiple materials.
- the multiple material objects may include multiple different real characters and multiple different virtual characters.
- the multiple face images may be images of the same real person under any facial feature parameter.
- facial feature parameters of multiple material object facial images and multiple human face images may be the same or different.
- S2102 may perform at least one of the following operations:
- the face angle may be a shooting angle of the material object.
- specifically, the electronic device screens out, from the facial images, the facial image of the material object whose shooting angle is consistent with that of the real person in the face image, uses it as the facial image to be matched, and matches the face image with the facial image to be matched.
- the expression data may be an expression displayed by the material object.
- specifically, the electronic device screens out, from the facial images, the facial image of the material object whose expression data is consistent with that of the real person in the face image, uses it as the facial image to be matched, and matches the face image with the facial image to be matched.
- the facial contour may be determined according to the key points of the facial contour of the material object.
- specifically, the electronic device determines the facial contour of the material object according to the key points of its facial contour, screens out, from the facial images and according to that contour, the facial image of the material object whose facial contour is consistent with that of the real person in the face image, uses it as the facial image to be matched, and matches the face image with the facial image to be matched.
- the features of facial features may be determined according to the key points of facial features in the facial image of the material object.
- specifically, the electronic device determines the features of the facial features of the material object according to the key points of the facial features in its facial image, screens out, from the facial images and according to those features, the facial image of the material object whose facial features are consistent with those of the real person in the face image, uses it as the facial image to be matched, and matches the face image with the facial image to be matched.
- the facial decoration may be determined according to the non-face key points of the material object except the key points of the facial features and the key points of the facial contour.
- specifically, the electronic device determines the facial decoration of the material object according to its non-facial key points, screens out, from the facial images and according to that decoration, the facial image of the material object whose facial decoration is consistent with that of the real person in the face image, uses it as the facial image to be matched, and matches the face image with the facial image to be matched.
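- The per-parameter screening steps above can be sketched as a simple filter over candidate material images. The `FacialParams` fields, the tolerance, and the use of exact string comparison are illustrative assumptions; a real system would derive these values from detected key points.

```python
from dataclasses import dataclass

@dataclass
class FacialParams:
    angle: float       # shooting angle of the face, in degrees
    expression: str    # e.g. "neutral", "smile"
    decoration: str    # e.g. "glasses", "none"

def screen_candidates(face, candidates, angle_tol=5.0):
    """Keep the material facial images whose facial feature parameters
    are consistent with those of the face image to be fused."""
    return [c for c in candidates
            if abs(c.angle - face.angle) <= angle_tol
            and c.expression == face.expression
            and c.decoration == face.decoration]
```

- Each retained candidate plays the role of the "facial image to be matched" in the description above.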
- the electronic device can also acquire non-facial feature parameters.
- the non-facial feature parameters may include at least one of the following: lighting conditions, hairstyle data, body shape data, clothing data, and the like.
- matching the face image and the facial image based on the facial feature parameters may also include performing at least one of the following operations:
- the light state may be the light intensity of the facial image.
- specifically, after the electronic device matches the face image with the facial image based on the above at least one facial feature parameter, it can also screen out, from the facial images and based on the light state of the material object, the facial image of the material object whose light state is consistent with that of the real person in the face image, use it as the facial image to be matched, and match the face image with the facial image to be matched.
- the hairstyle data may be the hairstyle displayed by the material object.
- specifically, after the electronic device matches the face image with the facial image based on the above at least one facial feature parameter, it can also screen out, from the facial images and based on the hairstyle data of the material object, the facial image of the material object whose hairstyle data is consistent with that of the real person in the face image, use it as the facial image to be matched, and match the face image with the facial image to be matched.
- the body shape data may be determined based on the key points of the body shape outline of the material object. Specifically, after the electronic device matches the face image with the facial image based on the above at least one facial feature parameter, it can also determine the body shape data based on the key points of the body shape outline of the material object, screen out, from the facial images and based on that body shape data, the facial image of the material object whose body shape data is consistent with that of the real person in the face image, use it as the facial image to be matched, and match the face image with the facial image to be matched.
- the clothing data may be determined based on the texture information of the clothing worn by the material object. Specifically, after the electronic device matches the face image with the facial image based on the above at least one facial feature parameter, it can also determine the clothing data based on the texture information of the clothing worn by the material object, screen out, from the facial images and based on that clothing data, the facial image of the material object whose clothing data is consistent with that of the real person in the face image, use it as the facial image to be matched, and match the face image with the facial image to be matched.
- S2102 may perform at least one of the following operations:
- the partial facial feature parameters may be used to initially match the facial feature parameters between the face image and the facial image, so as to preliminarily screen out the partial facial images matching the partial facial feature parameters of the human face image from multiple facial images.
- the other facial feature parameters may be facial feature parameters used for secondary matching between the face image and the facial image, so as to accurately screen out the target facial image matching the other facial feature parameters of the human face image from partial facial images.
- both the partial facial feature parameters and the other facial feature parameters can include at least one of facial angle, expression data, facial contour, facial features, and facial decoration, and the facial feature parameters included in the partial facial feature parameters differ from those included in the other facial feature parameters.
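- A minimal sketch of this coarse-then-fine screening, assuming the partial parameter is the facial angle and the other parameter is the expression (this particular split is an illustrative choice; the description allows any disjoint split of the listed parameters):

```python
def two_stage_match(face, candidates, angle_tol=10.0):
    """Stage 1: preliminary screen on a partial facial feature
    parameter (angle) to keep a subset of the facial images.
    Stage 2: exact screen on the remaining parameter (expression)
    to pick the target facial images from that subset."""
    partial = [c for c in candidates
               if abs(c["angle"] - face["angle"]) <= angle_tol]
    return [c for c in partial
            if c["expression"] == face["expression"]]
```

- The first stage cheaply discards clearly inconsistent candidates, so the more precise second-stage comparison runs on fewer images.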
- the electronic device can also acquire non-facial feature parameters.
- At least one of the following operations is performed:
- S23-S26 are similar to S16-S19, and will not be repeated here.
- in this way, the face image can be matched with the facial images through multiple matching methods, and the target facial image matching the facial feature parameters of the face image can be selected from the multiple facial images, adapting to a variety of matching scenarios. Moreover, by matching images based on multiple facial feature parameters, matching results in a variety of styles can be obtained, and the target network can be trained on them, so that when the target network is used for face fusion, face fusion results in a variety of styles can be obtained, improving the fun of the video.
- S2103 may perform at least one of the following operations:
- specifically, the electronic device can divide the face image and the target facial image into multiple image blocks according to a preset block size, based on an adaptive image fusion algorithm, where each image block corresponds to a pixel matrix. Wavelet transformation is then performed on the pixel matrix of each block of the face image and on the pixel matrix of each block of the target facial image, yielding a first wavelet coefficient matrix for each block of the face image and a second wavelet coefficient matrix for each block of the target facial image. Next, sparse processing is performed on the first wavelet coefficient matrix and the second wavelet coefficient matrix respectively, to obtain a first sparse matrix and a second sparse matrix. Then, according to the principle of taking the largest absolute value, the first sparse matrix and the second sparse matrix of each block are fused to obtain a fused sparse matrix, and the fused sparse matrix is subjected to an inverse wavelet transform to obtain the fused image.
- in this way, the sparse matrices can accurately express the image information of each block; fusing the per-block sparse matrices and inversely transforming the fused sparse matrix yields the fused image, which improves the fusion effect of the face image and the target facial image.
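- The block-wise fusion pipeline above can be sketched with a single-level 2-D Haar transform. The Haar wavelet, the hard threshold as the "sparse processing", and the block size are assumptions for illustration; the description does not fix a particular wavelet or sparsification scheme.

```python
import numpy as np

def haar2d(block):
    """Single-level 2-D Haar transform; the four sub-bands are packed
    into one coefficient matrix the same size as the block."""
    a = block[0::2, 0::2]; b = block[0::2, 1::2]
    c = block[1::2, 0::2]; d = block[1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # approximation
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return np.block([[ll, lh], [hl, hh]])

def ihaar2d(coeffs):
    """Exact inverse of haar2d."""
    n, m = coeffs.shape[0] // 2, coeffs.shape[1] // 2
    ll, lh = coeffs[:n, :m], coeffs[:n, m:]
    hl, hh = coeffs[n:, :m], coeffs[n:, m:]
    out = np.empty((2 * n, 2 * m))
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

def fuse_images(img_a, img_b, block=8, thresh=0.0):
    """Block-wise wavelet fusion with the max-absolute-value rule.
    Assumes both images share a shape divisible by the (even) block size."""
    h, w = img_a.shape
    fused = np.empty_like(img_a, dtype=float)
    for i in range(0, h, block):
        for j in range(0, w, block):
            ca = haar2d(img_a[i:i+block, j:j+block].astype(float))
            cb = haar2d(img_b[i:i+block, j:j+block].astype(float))
            # "sparse processing": zero out coefficients below the threshold
            ca[np.abs(ca) < thresh] = 0.0
            cb[np.abs(cb) < thresh] = 0.0
            # fuse coefficient-wise by largest absolute value
            cf = np.where(np.abs(ca) >= np.abs(cb), ca, cb)
            fused[i:i+block, j:j+block] = ihaar2d(cf)
    return fused
```

- With `thresh=0.0` the pipeline is lossless, so fusing an image with itself returns the image, which is a convenient sanity check on the transform pair.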
- the training process of the target network and the face fusion process can also be described as a whole.
- Fig. 4 shows a schematic flowchart of another face fusion method provided by an embodiment of the present disclosure.
- the face fusion method may include the following steps.
- the sample group includes a face image, a facial image of a material object, and a fused image obtained by fusing the face image and the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the face image matches the facial feature parameters of the facial image.
- S410-S420 are similar to S210-S220, and S430-S440 are similar to S110-S120, which will not be repeated here.
- Fig. 5 shows a schematic structural diagram of a human face fusion device provided by an embodiment of the present disclosure.
- the face fusion apparatus shown in FIG. 5 may be applied to electronic equipment.
- electronic devices may include devices with communication functions such as mobile phones, tablet computers, desktop computers, notebook computers, vehicle terminals, wearable devices, all-in-one computers, and smart home devices, as well as devices simulated by virtual machines or simulators.
- the face fusion device 500 may include: an image acquisition module 510 and a fusion module 520 .
- the image acquisition module 510 is used to acquire the face image to be fused and the facial image of the material object to be fused;
- the fusion module 520 is configured to input the face image to be fused and the facial image of the material object to be fused to the trained target network to obtain the target fused image, wherein the target network is trained based on a sample group, and the sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image and the facial image match.
- in this way, the face image to be fused and the facial image of the material object to be fused can be acquired and input to the trained target network to obtain the target fused image. Because the target network is trained based on a sample group that includes a face image, a facial image of the material object, and a fused image obtained by fusing the two, where the facial image of the material object is obtained by rendering an initial facial image of the material object and the face image matches the facial feature parameters of the facial image, the face image and the facial image can be fused using the trained target network, obtaining a special effect gameplay that fuses the human face with the material. This improves the interest of the video, which in turn improves the user experience and meets users' diverse needs for special effect gameplay.
- the device may also include a target network training module.
- the target network training module includes a sample group acquisition unit and a target network training unit.
- the sample group obtaining unit is used to obtain the sample group.
- the target network training unit is used to train the preset network based on the sample group to obtain the target network for facial fusion of human faces and material objects.
- the sample group acquisition unit may include: an image acquisition subunit, an image matching subunit, and an image fusion subunit.
- the image acquisition subunit is used to acquire the facial images of multiple material objects and multiple face images, where each facial image is an image of a material object under arbitrary facial feature parameters, and each face image is an image of a real person under arbitrary facial feature parameters.
- the image matching subunit is used to match the face image and the facial image based on the facial feature parameters, and screens out the target facial image matching the facial feature parameters of the human face image from a plurality of facial images;
- the image fusion subunit is used to fuse the human face image and the target facial image to generate a fusion image, and obtain a sample group composed of the human face image, the target facial image and the fusion image corresponding to the human face image.
- the image matching subunit can also be used to perform at least one of the following operations:
- the face image and the facial image are matched
- the image matching subunit can also be used to acquire non-facial feature parameters.
- the image matching subunit can also be used to perform at least one of the following operations:
- the face image and the facial image are matched;
- the face image and the facial image are matched.
- the image matching subunit can also be used to match the face image and the facial image based on some facial feature parameters, and select a partial facial image that matches the partial facial feature parameters of the human face image from multiple facial images;
- the face image is matched with the facial image based on other facial feature parameters, and a target facial image matching the other facial feature parameters of the human face image is screened out from part of the facial images.
- the image fusion subunit can also be used to: block the face image and the target facial image respectively; perform wavelet transformation and sparse processing on the pixel matrix of each block to obtain the first sparse matrix of each block of the face image and the second sparse matrix of each block of the target facial image;
- fuse the first sparse matrix and the second sparse matrix of each block, and determine the fused image according to the fused sparse matrix.
- the face fusion device 500 shown in FIG. 5 can execute the steps in the method embodiments shown in FIG. 1 to FIG. 4, and realize the processes and effects in those method embodiments, which will not be repeated here.
- Fig. 6 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
- the electronic device 600 may include a controller 601 and a memory 602 storing computer program instructions.
- the controller 601 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
- Memory 602 may include mass storage for information or instructions.
- the memory 602 may include a hard disk drive (Hard Disk Drive, HDD), a floppy disk drive, a flash memory, an optical disk, a magneto-optical disk, a magnetic tape, or a Universal Serial Bus (Universal Serial Bus, USB) drive or two or more thereof.
- Storage 602 may include removable or non-removable (or fixed) media, where appropriate.
- Memory 602 may be internal or external to the integrated gateway device, where appropriate.
- memory 602 is a non-volatile solid-state memory.
- the memory 602 includes a read-only memory (Read-Only Memory, ROM).
- the ROM may be a mask-programmed ROM, a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), an electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
- the controller 601 executes the steps of the face fusion method provided by the embodiments of the present disclosure by reading and executing the computer program instructions stored in the memory 602 .
- the electronic device 600 may further include a transceiver 603 and a bus 604 .
- the controller 601 , the memory 602 and the transceiver 603 are connected through a bus 604 and complete mutual communication.
- Bus 604 includes hardware, software, or both.
- a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Extended (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), another suitable bus, or a combination of two or more of these.
- Bus 604 may comprise one or more buses, where appropriate.
- This embodiment provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a face fusion method, the method comprising: acquiring a face image to be fused and a facial image of a material object to be fused; and inputting the face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image, wherein the target network is trained based on a sample group, the sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object, the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image match those of the facial image.
- in the storage medium containing computer-executable instructions provided by an embodiment of the present disclosure, the computer-executable instructions are not limited to the above method operations, and can also perform the relevant operations of the face fusion method provided by any embodiment of the present disclosure.
- the present disclosure can be realized by means of software plus the necessary general-purpose hardware, and of course can also be realized by hardware alone, but in many cases the former is the better implementation.
- the technical solution of the present disclosure can be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a computer floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk, or an optical disc, and which includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the face fusion method provided by the various embodiments of the present disclosure.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
Abstract
The present disclosure relates to a human face fusion method and apparatus, a device, and a storage medium. The method comprises: acquiring a human face image to be fused and a facial image of a material object to be fused, and inputting the human face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image, where the target network is obtained by training on a sample group, the sample group comprises a human face image, a facial image of the material object, and a fused image obtained by fusing the human face image and the facial image of the material object, the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the human face image match those of the facial image. The foregoing technical solution realizes a special effect that fuses the human face with the material, improves the fun of a video, thereby enhancing the user experience and meeting diversified user requirements for special effects.
Description
Cross-Reference to Related Applications

This application claims priority to Chinese Patent Application No. 202111064260.1, entitled "Face Fusion Method, Apparatus, Device and Storage Medium", filed on September 10, 2021, the entire content of which is incorporated herein by reference.

The present disclosure relates to the technical field of image processing, and in particular to a face fusion method, apparatus, device and storage medium.
With the development of technology, more and more applications, such as short-video applications, have entered users' lives and gradually enriched their leisure time. Users can record their lives in videos and photos and upload them to a short-video application.

Short-video applications offer many special-effect features built on image algorithms and rendering technologies. These special effects attract more and more users to short-video applications, and in turn users demand more of them, so the special effects need to be updated constantly to meet users' diverse needs.
Summary of the Invention

In order to solve, or at least partially solve, the above technical problems, the present disclosure provides a face fusion method, apparatus, device and storage medium that train a target network based on sample groups and thereby provide a face-fusion special effect, so as to meet users' diverse needs for special effects.
The present disclosure provides a face fusion method, the method comprising:

acquiring a face image to be fused and a facial image of a material object to be fused;

inputting the face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image, wherein the target network is trained based on a sample group, the sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object, the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image match those of the facial image.
The present disclosure provides a face fusion apparatus, the apparatus comprising:

an image acquisition module, configured to acquire a face image to be fused and a facial image of a material object to be fused;

a fusion module, configured to input the face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image, wherein the target network is trained based on a sample group, the sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object, the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image match those of the facial image.
An embodiment of the present disclosure further provides an electronic device, the device comprising:

one or more processors;

a storage apparatus, configured to store one or more programs,

wherein, when executed by the one or more processors, the one or more programs cause the one or more processors to implement the face fusion method provided in any embodiment of the present disclosure.

An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the face fusion method provided in any embodiment of the present disclosure.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have the following advantages:

The embodiments of the present disclosure provide a face fusion method, apparatus, device and storage medium, which can acquire a face image to be fused and a facial image of a material object to be fused, and input them into a trained target network to obtain a target fused image. Because the target network is trained based on a sample group that includes a face image, a facial image of the material object, and a fused image obtained by fusing the two, because the facial image of the material object is obtained by rendering an initial facial image of the material object, and because the facial feature parameters of the face image match those of the facial image, the trained target network can fuse the face image with the facial image. This yields a special effect that fuses a human face with a material, makes videos more entertaining, and thereby improves the user experience and meets users' diverse needs for special effects.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.

In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure or in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, those of ordinary skill in the art can derive other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a face fusion method provided by an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of a target network training method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic flowchart of acquiring a sample group provided by an embodiment of the present disclosure;

FIG. 4 is a schematic flowchart of another face fusion method provided by an embodiment of the present disclosure;

FIG. 5 is a schematic structural diagram of a face fusion apparatus provided by an embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
In order to understand the above objects, features and advantages of the present disclosure more clearly, the solutions of the present disclosure are further described below. It should be noted that, in the absence of conflict, the embodiments of the present disclosure and the features therein may be combined with each other.

Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure can also be implemented in ways other than those described herein; obviously, the embodiments in this specification are only some, rather than all, of the embodiments of the present disclosure.
With the development of technology, more and more applications, such as short-video applications, have entered users' lives and gradually enriched their leisure time. Users can record their lives in videos and photos and upload them to a short-video application.

Short-video applications offer many special-effect features built on image algorithms and rendering technologies, which attract more and more users. To make short-video applications more entertaining, users want to use ever more special effects. As a result, users' demand for special effects on short-video applications keeps growing, and the special effects need to be updated constantly to meet users' diverse needs.
In order to solve the above problems, embodiments of the present disclosure provide a face fusion method, apparatus, device and storage medium that can use a trained target network to fuse a face image with a facial image, thereby obtaining a special effect that fuses a human face with a material, making videos more entertaining, improving the user experience, and meeting users' diverse needs for special effects.

The face fusion method provided by the embodiments of the present disclosure is first described below with reference to FIG. 1 to FIG. 4.
FIG. 1 shows a schematic flowchart of a face fusion method provided by an embodiment of the present disclosure.

In some embodiments of the present disclosure, the face fusion method shown in FIG. 1 may be executed by an electronic device. The electronic device may include a device with a communication function, such as a mobile phone, tablet computer, desktop computer, notebook computer, in-vehicle terminal, wearable device, all-in-one machine, or smart home device, and may also include a device simulated by a virtual machine or an emulator.

As shown in FIG. 1, the face fusion method may include the following steps.

S110. Acquire a face image to be fused and a facial image of a material object to be fused.
In the embodiment of the present disclosure, the face image to be fused may be an original face image that needs to be fused.

The face image may be an image of a real person under any facial feature parameters. Optionally, the facial feature parameters may include shooting angle, expression, light intensity, and so on, which are not limited here.

In the embodiment of the present disclosure, the facial image of the material object to be fused may be a material image to be fused with the face image.

The facial image may be an image of the material object under any facial feature parameters. Optionally, the material object may be a real person, a cartoon character, or the like.

Specifically, when a user uses the face fusion special effect in a short-video application on the electronic device, the electronic device acquires the facial image of the material object to be fused that the user selects, captures a face image, and takes the captured face image as the face image to be fused, so as to further perform image fusion based on the face image to be fused and the facial image of the material object to be fused.
S120. Input the face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image, wherein the target network is trained based on a sample group, and the sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object.

In the embodiment of the present disclosure, the target network may be a Generative Adversarial Network (GAN), which is a network model including a generator and a discriminator. The target network trained on the sample groups can fuse the face image to be fused with the facial image of the material object to be fused to obtain the target fused image.

In the embodiment of the present disclosure, the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image match those of the facial image.

The method of rendering the initial facial image of the material object may include: adjusting the shooting angle, expression, and light intensity of the initial facial image of the material object to obtain the facial image of the material object, so that the facial feature parameters of the face image match those of the facial image.

Specifically, when training the target network, the electronic device may input the face image to be fused, the facial image of the material object to be fused, and the fused image obtained by fusing the two into a preset network; fuse the face image and the facial image based on the preset network to obtain a predicted fused image; and adjust the network parameters of the preset network based on the predicted fused image and the ground-truth fused image until the output predicted fused image matches the fused image, so that the network parameters become stable and the trained target network is obtained.

Further, after the trained target network is obtained, the face image to be fused and the facial image of the material object to be fused are input into the trained target network, which fuses them to obtain the target fused image.
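For illustration only, and not as part of the disclosed embodiments: the training loop described above (predict a fused image, compare it with the ground-truth fused image, and adjust the network parameters until the output matches) can be sketched with a toy one-parameter "generator". A real implementation would train a GAN's generator and discriminator; here a single blending weight `alpha` stands in for the network parameters, and images are plain lists of pixel values.

```python
# Toy sketch of the supervised part of the target-network training loop.
# `alpha` is a stand-in for the generator's parameters, NOT the disclosed model.

def fuse(alpha, face, material):
    """Predicted fused image: a pixel-wise blend of the two inputs."""
    return [alpha * f + (1 - alpha) * m for f, m in zip(face, material)]

def train(samples, alpha=0.0, lr=0.1, steps=200):
    """Each sample is (face_image, material_face_image, ground_truth_fusion)."""
    for _ in range(steps):
        for face, material, fused in samples:
            pred = fuse(alpha, face, material)
            # Gradient of the mean squared error with respect to alpha.
            grad = sum(2 * (p - t) * (f - m)
                       for p, t, f, m in zip(pred, fused, face, material)) / len(face)
            alpha -= lr * grad
    return alpha

# Ground truth generated with a blend weight of 0.7: training should recover it.
face, material = [1.0, 0.0, 0.5], [0.0, 1.0, 0.5]
target = [0.7 * f + 0.3 * m for f, m in zip(face, material)]
alpha = train([(face, material, target)])
print(round(alpha, 2))  # → 0.7
```

The point of the sketch is only the shape of the loop: parameters are adjusted from the difference between the predicted fusion and the ground-truth fusion until they stabilize, which is the behavior the description attributes to the preset network.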
In the embodiment of the present disclosure, the face image to be fused and the facial image of the material object to be fused can be acquired and input into the trained target network to obtain the target fused image. Because the target network is trained based on a sample group that includes a face image, a facial image of the material object, and a fused image obtained by fusing the two, because the facial image of the material object is obtained by rendering an initial facial image of the material object, and because the facial feature parameters of the face image match those of the facial image, the trained target network can fuse the face image with the facial image. This yields a special effect that fuses a human face with a material, makes videos more entertaining, and thereby improves the user experience and meets users' diverse needs for special effects.
In another implementation of the present disclosure, to ensure that the target network can be used to fuse the face image with the facial image, the electronic device may also perform a model training step for the target network before executing S110.

FIG. 2 shows a schematic flowchart of a target network training method provided by an embodiment of the present disclosure.

As shown in FIG. 2, before acquiring the face image to be fused and the facial image of the material object to be fused, the target network training method may further include the following steps.
S210. Acquire sample groups.

In the embodiment of the present disclosure, a sample group includes a face image, a facial image of the material object, and a fused image obtained by fusing the face image and the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the facial feature parameters of the face image match those of the facial image.

In some embodiments of the present disclosure, FIG. 3 shows a schematic flowchart of acquiring a sample group.

As shown in FIG. 3, S210 may include:

S2101. Acquire facial images of multiple material objects and multiple face images, where each facial image is an image of a material object under any facial feature parameters, and each face image is an image of a real person under any facial feature parameters.

S2102. Match the face image with the facial images based on the facial feature parameters, and select, from the multiple facial images, a target facial image whose facial feature parameters match those of the face image.

S2103. Fuse the face image with the target facial image to generate a fused image, thereby obtaining a sample group composed of the face image, the target facial image, and the fused image corresponding to the face image.
The facial images of the multiple material objects may be the original facial images corresponding to the multiple materials. Optionally, the multiple material objects may include multiple different real persons and multiple different virtual characters.

The multiple face images may be images of the same real person under any facial feature parameters.

It should be noted that the facial feature parameters of the facial images of the multiple material objects and of the multiple face images may be the same or different.
In some embodiments, S2102 may perform at least one of the following operations:

S11. Match the face image with the facial image according to the facial angle of the material object in the facial feature parameters.

S12. Match the face image with the facial image according to the expression data of the material object in the facial feature parameters.

S13. Match the face image with the facial image according to the facial contour of the material object in the facial feature parameters.

S14. Match the face image with the facial image according to the facial features of the material object in the facial feature parameters.

S15. Match the face image with the facial image according to the facial decoration of the material object in the facial feature parameters.
The facial angle may be the shooting angle of the material object. Specifically, according to the facial angle of the material object in the facial feature parameters, the electronic device selects, from the facial images, a facial image of a material object whose shooting angle is consistent with that of the real person in the face image as the facial image to be matched, and matches the face image with the facial image to be matched.

The expression data may be the expression shown by the material object. Specifically, according to the expression data of the material object in the facial feature parameters, the electronic device selects, from the facial images, a facial image of a material object whose expression data is consistent with that of the real person in the face image as the facial image to be matched, and matches the face image with the facial image to be matched.

The facial contour may be determined according to key points of the facial contour of the material object. Specifically, the electronic device determines the facial contour of the material object according to these key points and, according to that contour, selects from the facial images a facial image of a material object whose facial contour is consistent with that of the real person in the face image as the facial image to be matched, and matches the face image with the facial image to be matched.

The facial features may be determined according to key points of the facial features in the facial image of the material object. Specifically, the electronic device determines the facial features of the material object according to these key points and, according to those features, selects from the facial images a facial image of a material object whose facial features are consistent with those of the real person in the face image as the facial image to be matched, and matches the face image with the facial image to be matched.

The facial decoration may be determined according to non-facial key points of the material object, that is, key points other than those of the facial features and the facial contour. Specifically, the electronic device determines the facial decoration of the material object according to these non-facial key points and, according to that decoration, selects from the facial images a facial image of a material object whose facial decoration is consistent with that of the real person in the face image as the facial image to be matched, and matches the face image with the facial image to be matched.
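As an illustrative sketch only of the parameter-matching idea in S11–S15, assuming each image carries a dictionary of facial feature parameters (the field names and exact-equality comparison are invented for illustration and are not taken from the disclosure):

```python
# Hypothetical filter: keep material facial images whose facial feature
# parameters agree with those of the real-person face image.

FACIAL_KEYS = ("face_angle", "expression", "contour", "features", "decoration")

def match_target_face(face_params, material_faces, keys=FACIAL_KEYS):
    """Return the material images matching the face image on every key."""
    return [m for m in material_faces
            if all(m["params"].get(k) == face_params.get(k) for k in keys)]

face = {"face_angle": 30, "expression": "smile", "contour": "oval",
        "features": "A", "decoration": "glasses"}
materials = [
    {"id": 1, "params": {"face_angle": 30, "expression": "smile",
                         "contour": "oval", "features": "A",
                         "decoration": "glasses"}},
    {"id": 2, "params": {"face_angle": 0, "expression": "neutral",
                         "contour": "oval", "features": "A",
                         "decoration": "none"}},
]
print([m["id"] for m in match_target_face(face, materials)])  # → [1]
```

A production system would compare continuous quantities (angles, key-point distances) within tolerances rather than by equality; the sketch only shows the screening structure.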
Further, in order to improve the accuracy of matching the face image with the facial image, the electronic device may also acquire non-facial feature parameters.

Optionally, the non-facial feature parameters may include at least one of the following: lighting conditions, hairstyle data, body shape data, clothing data, and the like.

Correspondingly, matching the face image with the facial image based on the facial feature parameters may also include performing at least one of the following operations:

S16. Match the face image with the facial image based on the lighting conditions of the facial image in the non-facial feature parameters.

S17. Match the face image with the facial image based on the hairstyle data of the material object in the non-facial feature parameters.

S18. Match the face image with the facial image based on the body shape data of the material object in the non-facial feature parameters.

S19. Match the face image with the facial image based on the clothing data of the material object in the non-facial feature parameters.
The light state may be the light intensity of a facial image. Specifically, after matching the face image with the facial images based on at least one of the facial feature parameters described above, the electronic device may further screen out, based on the light state of the material object, those facial images of the material object whose light state is consistent with that of the real person in the face image, take them as facial images to be matched, and match the face image against the facial images to be matched.

The hairstyle data may be the hairstyle displayed by the material object. Specifically, after matching the face image with the facial images based on at least one of the facial feature parameters described above, the electronic device may further screen out, based on the hairstyle data of the material object, those facial images of the material object whose hairstyle data is consistent with that of the real person in the face image, take them as facial images to be matched, and match the face image against the facial images to be matched.

The body shape data may be determined from key points of the body contour of the material object. Specifically, after matching the face image with the facial images based on at least one of the facial feature parameters described above, the electronic device may further determine the body shape data from the key points of the material object's body contour and, based on that data, screen out those facial images of the material object whose body shape data is consistent with that of the real person in the face image, take them as facial images to be matched, and match the face image against the facial images to be matched.

The clothing data may be determined from texture information of the clothing worn by the material object. Specifically, after matching the face image with the facial images based on at least one of the facial feature parameters described above, the electronic device may further determine the clothing data from the texture information of the clothing worn by the material object and, based on that data, screen out those facial images of the material object whose clothing data is consistent with that of the real person in the face image, take them as facial images to be matched, and match the face image against the facial images to be matched.
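The four screening operations above share one shape: from the candidate facial images of the material object, keep only those whose non-facial attribute agrees with the real person in the face image, and use the survivors as the facial images to be matched. A minimal sketch of that shared filter, assuming the attributes have already been extracted into plain dictionaries (the key names here are illustrative, not taken from the disclosure):

```python
def screen_to_be_matched(person_attrs, candidates,
                         keys=("light_state", "hairstyle", "body_shape", "clothing")):
    """Keep only candidate facial images whose non-facial attributes all
    agree with the real person's attributes, for every key both sides have."""
    kept = candidates
    for key in keys:
        if key in person_attrs:
            # Drop any candidate whose attribute differs from the person's.
            kept = [c for c in kept if c.get(key) == person_attrs[key]]
    return kept
```

Screening on light state and hairstyle, for instance, keeps only candidates that agree on both attributes; attributes absent from the person's record are not used for filtering.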
In other embodiments, S2102 may include performing at least one of the following operations:

S21. Match the face image with the facial images based on some of the facial feature parameters, and screen out, from the multiple facial images, partial facial images matching those facial feature parameters of the face image.

S22. Match the face image with the facial images based on the other facial feature parameters, and screen out, from the partial facial images, a target facial image matching those other facial feature parameters of the face image.

Here, the partial facial feature parameters may be the facial feature parameters used to initially match the face image with the facial images, so that partial facial images matching the partial facial feature parameters of the face image are preliminarily screened out from the multiple facial images.

The other facial feature parameters may be the facial feature parameters used for a second round of matching between the face image and the facial images, so that a target facial image matching the other facial feature parameters of the face image is accurately screened out from the partial facial images.

Optionally, the partial facial feature parameters and the other facial feature parameters may each include at least one of facial angle, expression data, facial contour, facial features, and facial decoration, with the two sets including different facial feature parameters.
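Steps S21 and S22 above form a coarse-to-fine cascade: one subset of facial feature parameters prunes the candidate pool, and the remaining, disjoint parameters select the target from the survivors. A hedged sketch of this cascade, using exact equality on symbolic feature values purely for illustration (a real implementation would compare angles and expressions with numeric tolerances):

```python
def cascade_match(face_feats, candidates, coarse_keys, fine_keys):
    """Two-stage matching: S21 coarse filter, then S22 fine filter.

    coarse_keys and fine_keys are assumed disjoint, as the text
    requires the two parameter sets to differ.
    """
    def agrees(cand, keys):
        return all(cand.get(k) == face_feats.get(k) for k in keys)

    partial = [c for c in candidates if agrees(c, coarse_keys)]  # S21: partial facial images
    return [c for c in partial if agrees(c, fine_keys)]          # S22: target facial image(s)
```

For example, filtering first on facial angle and then on expression and contour leaves only candidates that survive both stages.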
Furthermore, to improve the accuracy of matching the face image with the facial images, the electronic device may also acquire non-facial feature parameters.

Correspondingly, after the target facial image matching the other facial feature parameters of the face image is screened out from the partial facial images, at least one of the following operations is further performed:

S23. Match the face image with the target facial image based on the light state of the facial image among the non-facial feature parameters.

S24. Match the face image with the facial image based on the hairstyle data of the material object among the non-facial feature parameters.

S25. Match the face image with the facial image based on the body shape data of the material object among the non-facial feature parameters.

S26. Match the face image with the facial image based on the clothing data of the material object among the non-facial feature parameters.

S23 to S26 are similar to S16 to S19 and are not repeated here.

Thus, in the embodiments of the present disclosure, the face image can be matched with the facial images in multiple ways, and a target facial image matching the facial feature parameters of the face image can be screened out from the multiple facial images, which adapts to a variety of matching scenarios. Moreover, matching images based on multiple facial feature parameters yields matching results in a variety of styles, and training the target network on those results means that, when the target network is used for face fusion, face fusion results in a variety of styles can be obtained, making the videos more entertaining.
In some embodiments, S2103 may include performing at least one of the following operations:

S31. Divide the face image and the target facial image into blocks.

S32. Compute a first sparse matrix for each block of the face image and a second sparse matrix for each block of the target facial image.

S33. Fuse each pair of first and second sparse matrices, and determine the fused image from the fused sparse matrices.

Specifically, the electronic device first divides the face image and the target facial image into multiple image blocks of a preset block size according to an adaptive image fusion algorithm, each image block corresponding to a pixel matrix. It then applies a wavelet transform to the pixel matrix of each block of the face image and of the target facial image, obtaining a first wavelet coefficient matrix for each block of the face image and a second wavelet coefficient matrix for each block of the target facial image. Next, the first and second wavelet coefficient matrices are each sparsified to obtain the first and second sparse matrices. Then, following the rule of keeping the coefficient with the larger absolute value, each pair of first and second sparse matrices is fused into a fused sparse matrix. Finally, an inverse wavelet transform is applied to the fused sparse matrices to obtain the fused image.
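The per-block pipeline just described (transform, fuse coefficients by larger absolute value, inverse transform) can be sketched as follows. This simplification substitutes a one-level Haar transform for the unspecified wavelet and skips the separate sparsification step, so it illustrates the fusion rule rather than reproducing the exact algorithm:

```python
def haar1d(x):
    # One-level Haar transform: pairwise averages, then pairwise differences.
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(half)]
    return low + high

def ihaar1d(y):
    # Inverse of haar1d: rebuild each pair from its average and difference.
    half = len(y) // 2
    x = []
    for i in range(half):
        x.append(y[i] + y[half + i])
        x.append(y[i] - y[half + i])
    return x

def transpose(m):
    return [list(col) for col in zip(*m)]

def haar2d(block):
    # Transform rows, then columns.
    rows = [haar1d(r) for r in block]
    return transpose([haar1d(c) for c in transpose(rows)])

def ihaar2d(coeffs):
    # Undo the column transform, then the row transform.
    cols = transpose([ihaar1d(c) for c in transpose(coeffs)])
    return [ihaar1d(r) for r in cols]

def fuse_blocks(block_a, block_b):
    """Fuse two same-sized, even-sided pixel blocks: transform each,
    keep the coefficient with the larger absolute value, invert."""
    ca, cb = haar2d(block_a), haar2d(block_b)
    fused = [[a if abs(a) >= abs(b) else b for a, b in zip(ra, rb)]
             for ra, rb in zip(ca, cb)]
    return ihaar2d(fused)
```

Running `fuse_blocks` over every aligned pair of blocks from the two images and reassembling the outputs yields the fused image; the max-absolute-value rule favors whichever block carries the stronger detail at each coefficient.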
Thus, in the embodiments of the present disclosure, dividing the images into blocks and computing a sparse matrix for each block allows the image information of each block to be represented accurately; fusing the per-block sparse matrices and inverse-transforming the fused result then yields the fused image, improving the fusion quality of the face image and the target facial image.
S220. Train the preset network based on the sample groups to obtain a target network for performing facial fusion of human faces and material objects.

For the way the target network is trained in S220, refer to S120; it is not repeated here.

In yet another implementation of the present disclosure, the training process of the target network and the face fusion process can also be described as a whole.

Fig. 4 is a schematic flowchart of another face fusion method provided by an embodiment of the present disclosure.

As shown in Fig. 4, the face fusion method may include the following steps.

S410. Acquire sample groups.

In the embodiments of the present disclosure, a sample group includes a face image, a facial image of a material object, and a fused image obtained by fusing the face image with the facial image of the material object. The facial image of the material object is obtained by rendering an initial facial image of the material object, and the face image matches the facial feature parameters of the facial image.

S420. Train the preset network based on the sample groups to obtain a target network for performing facial fusion of human faces and material objects.

S430. Acquire a face image to be fused and a facial image of a material object to be fused.

S440. Input the face image to be fused and the facial image of the material object to be fused into the trained target network to obtain a target fused image.

S410 to S420 are similar to S210 to S220, and S430 to S440 are similar to S110 to S120; they are not repeated here.
Fig. 5 is a schematic structural diagram of a face fusion apparatus provided by an embodiment of the present disclosure.

In some embodiments of the present disclosure, the face fusion apparatus shown in Fig. 5 may be applied to an electronic device. The electronic device may include a device with a communication function such as a mobile phone, tablet computer, desktop computer, notebook computer, in-vehicle terminal, wearable device, all-in-one machine, or smart home device, and may also include a device simulated by a virtual machine or an emulator.

As shown in Fig. 5, the face fusion apparatus 500 may include an image acquisition module 510 and a fusion module 520.

The image acquisition module 510 is configured to acquire a face image to be fused and a facial image of a material object to be fused.

The fusion module 520 is configured to input the face image to be fused and the facial image of the material object to be fused into the trained target network to obtain a target fused image, where the target network is trained on sample groups, each sample group including a face image, a facial image of a material object, and a fused image obtained by fusing the face image with the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the face image matches the facial feature parameters of the facial image.

In the embodiments of the present disclosure, a face image to be fused and a facial image of a material object to be fused can be acquired and input into the trained target network to obtain a target fused image. Because the target network is trained on sample groups of this kind, fusing the face image with the facial image through the trained network yields a special effect in which a human face is fused with the material, which makes videos more entertaining, improves the user experience, and meets users' diverse demands for special-effect gameplay.
Optionally, the apparatus may further include a target network training module, which includes a sample group acquisition unit and a target network training unit.

The sample group acquisition unit is configured to acquire sample groups.

The target network training unit is configured to train the preset network based on the sample groups to obtain a target network for performing facial fusion of human faces and material objects.

Optionally, the sample group acquisition unit may include an image acquisition subunit, an image matching subunit, and an image fusion subunit.

The image acquisition subunit is configured to acquire facial images of multiple material objects and multiple face images, each facial image being an image of a material object under some facial feature parameter, and each face image being an image of a real person under some facial feature parameter.

The image matching subunit is configured to match the face image with the facial images based on the facial feature parameters, and to screen out, from the multiple facial images, a target facial image matching the facial feature parameters of the face image.

The image fusion subunit is configured to fuse the face image with the target facial image to generate a fused image, obtaining a sample group consisting of the face image, the target facial image, and the fused image corresponding to the face image.
Optionally, the image matching subunit may further be configured to perform at least one of the following operations:

matching the face image with the facial image according to the facial angle of the material object among the facial feature parameters;

matching the face image with the facial image according to the expression data of the material object among the facial feature parameters;

matching the face image with the facial image according to the facial contour of the material object among the facial feature parameters;

matching the face image with the facial image according to the facial features of the material object among the facial feature parameters;

matching the face image with the facial image according to the facial decoration of the material object among the facial feature parameters.
Optionally, the image matching subunit may further be configured to acquire non-facial feature parameters.

Correspondingly, the image matching subunit may further be configured to perform at least one of the following operations:

matching the face image with the facial image based on the light state of the facial image among the non-facial feature parameters;

matching the face image with the facial image based on the hairstyle data of the material object among the non-facial feature parameters;

matching the face image with the facial image based on the body shape data of the material object among the non-facial feature parameters;

matching the face image with the facial image based on the clothing data of the material object among the non-facial feature parameters.

Optionally, the image matching subunit may further be configured to match the face image with the facial images based on some of the facial feature parameters, screening out, from the multiple facial images, partial facial images matching those facial feature parameters of the face image;

and to match the face image with the facial images based on the other facial feature parameters, screening out, from the partial facial images, a target facial image matching the other facial feature parameters of the face image.
Optionally, the image fusion subunit may further be configured to divide the face image and the target facial image into blocks;

to compute a first sparse matrix for each block of the face image and a second sparse matrix for each block of the target facial image;

and to fuse each pair of first and second sparse matrices, determining the fused image from the fused sparse matrices.

It should be noted that the face fusion apparatus 500 shown in Fig. 5 can perform each step of the method embodiments shown in Fig. 1 and Fig. 4, and realize each process and effect of those method embodiments; this is not repeated here.
Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.

As shown in Fig. 6, the electronic device 600 may include a controller 601 and a memory 602 storing computer program instructions.

Specifically, the controller 601 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.

The memory 602 may include mass storage for information or instructions. By way of example and not limitation, the memory 602 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 602 may include removable or non-removable (fixed) media, and may be internal or external to the integrated gateway device. In particular embodiments, the memory 602 is a non-volatile solid-state memory. In particular embodiments, the memory 602 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.

The controller 601 reads and executes the computer program instructions stored in the memory 602 to perform the steps of the face fusion method provided by the embodiments of the present disclosure.

In one example, the electronic device 600 may further include a transceiver 603 and a bus 604. As shown in Fig. 6, the controller 601, the memory 602, and the transceiver 603 are connected via the bus 604 and communicate with one another.

The bus 604 includes hardware, software, or both. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. Where appropriate, the bus 604 may include one or more buses. Although particular buses are described and illustrated in the embodiments of the present application, the present application contemplates any suitable bus or interconnect.
This embodiment provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a face fusion method comprising:

acquiring a face image to be fused and a facial image of a material object to be fused;

inputting the face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image, where the target network is trained on sample groups, each sample group including a face image, a facial image of a material object, and a fused image obtained by fusing the face image with the facial image of the material object; the facial image of the material object is obtained by rendering an initial facial image of the material object, and the face image matches the facial feature parameters of the facial image.

Of course, in the storage medium containing computer-executable instructions provided by the embodiments of the present invention, the computer-executable instructions are not limited to the method operations described above, and may also perform related operations of the face fusion method provided by any embodiment of the present invention.
From the above description of the implementations, those skilled in the art will clearly understand that the present invention may be implemented by means of software plus the necessary general-purpose hardware, and of course may also be implemented by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk, or optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the face fusion method provided by the embodiments of the present invention.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them; more equivalent embodiments may be included without departing from the concept of the present invention, and the scope of the present invention is determined by the appended claims.
Claims (10)
- A face fusion method, comprising:
acquiring a face image to be fused and a facial image of a material object to be fused; and
inputting the face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image,
wherein the target network is trained based on sample groups, each sample group comprising a face image, a facial image of a material object, and a fused image obtained by fusing the face image with the facial image of the material object, the facial image of the material object being obtained by rendering an initial facial image of the material object, and the face image matching facial feature parameters of the facial image.
- The method according to claim 1, wherein training the target network comprises:
acquiring sample groups; and
training a preset network based on the sample groups to obtain the target network for performing facial fusion of human faces and material objects.
- The method according to claim 2, wherein acquiring the sample groups comprises:
acquiring facial images of a plurality of material objects and a plurality of face images, each facial image being an image of a material object under some facial feature parameter, and each face image being an image of a real person under some facial feature parameter;
matching the face image with the facial images based on the facial feature parameters, and screening out, from the plurality of facial images, a target facial image matching the facial feature parameters of the face image; and
fusing the face image with the target facial image to generate the fused image, thereby obtaining the sample group consisting of the face image, the target facial image, and the fused image corresponding to the face image.
- The method according to claim 3, wherein matching the face image with the facial image based on the facial feature parameters comprises performing at least one of the following operations:
matching the face image with the facial image according to a facial angle of the material object among the facial feature parameters;
matching the face image with the facial image according to expression data of the material object among the facial feature parameters;
matching the face image with the facial image according to a facial contour of the material object among the facial feature parameters;
matching the face image with the facial image according to facial features of the material object among the facial feature parameters; and
matching the face image with the facial image according to facial decoration of the material object among the facial feature parameters.
- The method according to claim 4, further comprising acquiring non-facial feature parameters,
wherein matching the face image with the facial image based on the facial feature parameters further comprises performing at least one of the following operations:
matching the face image with the facial image based on a light state of the facial image among the non-facial feature parameters;
matching the face image with the facial image based on hairstyle data of the material object among the non-facial feature parameters;
matching the face image with the facial image based on body shape data of the material object among the non-facial feature parameters; and
matching the face image with the facial image based on clothing data of the material object among the non-facial feature parameters.
- The method according to claim 3, wherein matching the face image with the facial image based on the facial feature parameters comprises:
matching the face image with the facial images based on some of the facial feature parameters, and screening out, from the plurality of facial images, partial facial images matching those facial feature parameters of the face image; and
matching the face image with the facial images based on the other facial feature parameters, and screening out, from the partial facial images, a target facial image matching the other facial feature parameters of the face image.
- The method according to claim 3, wherein fusing the face image with the target facial image to generate the fused image comprises:
dividing the face image and the target facial image into blocks;
computing a first sparse matrix for each block of the face image and a second sparse matrix for each block of the target facial image; and
fusing each pair of first and second sparse matrices, and determining the fused image from the fused sparse matrices.
- A face fusion apparatus, comprising: an image acquisition module, configured to acquire a face image to be fused and a facial image of a material object to be fused; and a fusion module, configured to input the face image to be fused and the facial image of the material object to be fused into a trained target network to obtain a target fused image, wherein the target network is trained on sample groups, each sample group comprising a face image, a facial image of a material object, and a fused image obtained by fusing the face image with the facial image of the material object, the facial image of the material object being obtained by rendering an initial facial image of the material object, and the face image matching the facial feature parameters of the facial image.
- An electronic device, comprising: one or more processors; and a storage apparatus configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the face fusion method according to any one of claims 1-7.
- A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the face fusion method according to any one of claims 1-7.
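The coarse-to-fine matching described in the claims above (first filtering candidate facial images by a subset of the facial feature parameters, then narrowing the surviving candidates by the remaining parameters) can be sketched as follows. This is an illustrative sketch only: the parameter names, the dict representation, and the distance threshold are assumptions, not part of the patent.

```python
# Hypothetical sketch of the two-stage matching in the claims above.
# Feature parameters are modeled as dicts of named scalars; the names
# ("face_shape", etc.) and the tolerance are illustrative assumptions.

def match_subset(face_params, candidate_params, keys, tol=0.1):
    """True if the candidate is within `tol` of the face on every listed key."""
    return all(abs(face_params[k] - candidate_params[k]) <= tol for k in keys)

def select_target(face_params, candidates, coarse_keys, fine_keys, tol=0.1):
    # Stage 1: keep candidates matching a subset of the parameters.
    partial = [c for c in candidates if match_subset(face_params, c, coarse_keys, tol)]
    # Stage 2: among those, keep candidates matching the remaining parameters.
    return [c for c in partial if match_subset(face_params, c, fine_keys, tol)]

face = {"face_shape": 0.8, "eye_spacing": 0.5, "nose_width": 0.3}
candidates = [
    {"face_shape": 0.82, "eye_spacing": 0.52, "nose_width": 0.31},
    {"face_shape": 0.81, "eye_spacing": 0.9,  "nose_width": 0.3},
    {"face_shape": 0.2,  "eye_spacing": 0.5,  "nose_width": 0.3},
]
targets = select_target(face, candidates, ["face_shape"], ["eye_spacing", "nose_width"])
print(len(targets))  # only the first candidate survives both stages
```

The two-stage structure mirrors the claim: the cheap coarse filter prunes the candidate pool before the remaining parameters are checked.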
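The block-wise fusion in the method claims above (dividing both images into blocks, computing a sparse matrix per block, fusing the per-block matrices, and assembling the fused image from them) can be sketched as below. The per-block fusion rule, simple alpha blending of the two sparse matrices, is an illustrative assumption; the claims do not specify how the first and second sparse matrices are combined.

```python
import numpy as np
from scipy.sparse import csr_matrix

def fuse_images_blockwise(face, target, bs=4, alpha=0.5):
    """Divide two equally-sized grayscale images into bs x bs blocks,
    build a sparse matrix per block, fuse the per-block matrices by
    alpha blending (an assumed rule), and assemble the fused image."""
    assert face.shape == target.shape
    h, w = face.shape
    fused = np.empty((h, w), dtype=float)
    for i in range(0, h, bs):
        for j in range(0, w, bs):
            a = csr_matrix(face[i:i+bs, j:j+bs].astype(float))    # first sparse matrix
            b = csr_matrix(target[i:i+bs, j:j+bs].astype(float))  # second sparse matrix
            fused[i:i+bs, j:j+bs] = (alpha * a + (1 - alpha) * b).toarray()
    return fused

face = np.arange(64, dtype=float).reshape(8, 8)
target = np.zeros((8, 8))
out = fuse_images_blockwise(face, target)
print(out[0, 1])  # 0.5 * 1.0 + 0.5 * 0.0 = 0.5
```

Processing per block rather than whole-image keeps each sparse matrix small, which is the practical motivation for the division step in the claim.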
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111064260.1A CN115797481A (en) | 2021-09-10 | 2021-09-10 | Face fusion method, device, equipment and storage medium |
CN202111064260.1 | 2021-09-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023036239A1 true WO2023036239A1 (en) | 2023-03-16 |
Family
ID=85416880
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/117804 WO2023036239A1 (en) | 2021-09-10 | 2022-09-08 | Human face fusion method and apparatus, device and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115797481A (en) |
WO (1) | WO2023036239A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108765272A (en) * | 2018-05-31 | 2018-11-06 | Oppo广东移动通信有限公司 | Image processing method and apparatus, electronic device, and readable storage medium |
CN111508050A (en) * | 2020-04-16 | 2020-08-07 | 北京世纪好未来教育科技有限公司 | Image processing method and device, electronic equipment and computer storage medium |
CN113361387A (en) * | 2021-06-03 | 2021-09-07 | 湖南快乐阳光互动娱乐传媒有限公司 | Face image fusion method and device, storage medium and electronic equipment |
- 2021-09-10: CN CN202111064260.1A patent/CN115797481A/en (active, pending)
- 2022-09-08: WO PCT/CN2022/117804 patent/WO2023036239A1/en (status unknown)
Also Published As
Publication number | Publication date |
---|---|
CN115797481A (en) | 2023-03-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 32PN | EP: public notification in the EP bulletin as the address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 27/06/2024) |