CN113486785A - Video face changing method, device, equipment and storage medium based on deep learning - Google Patents

Video face changing method, device, equipment and storage medium based on deep learning

Info

Publication number
CN113486785A
CN113486785A (Application CN202110754216.7A)
Authority
CN
China
Prior art keywords
face
changed
video
image
video frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110754216.7A
Other languages
Chinese (zh)
Inventor
张攀 (Zhang Pan)
刘求索 (Liu Qiusuo)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Inveno Technology Co ltd
Original Assignee
Shenzhen Inveno Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Inveno Technology Co ltd filed Critical Shenzhen Inveno Technology Co ltd
Priority to CN202110754216.7A
Publication of CN113486785A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of face changing, and discloses a video face changing method, device, equipment and storage medium based on deep learning. The method comprises the following steps: when a face changing instruction for a video to be face-changed is received, acquiring a pre-stored face changing model according to the face changing instruction; extracting a video frame to be face-changed from the video to be face-changed; when face information exists in the video frame to be face-changed, determining a face changing image according to the face information in the video frame through the pre-stored face changing model; and updating the video to be face-changed according to the face changing image to obtain a target video. In this way, the pre-stored face changing model is acquired when a face changing instruction is received, the video frame that needs face changing is extracted, the face changing image is determined through the pre-stored face changing model when face information exists in that frame, and the video to be face-changed is finally updated according to the face changing image to obtain the target video. Video face changing therefore requires few training runs, little time and few resources.

Description

Video face changing method, device, equipment and storage medium based on deep learning
Technical Field
The invention relates to the technical field of face changing, in particular to a video face changing method, device, equipment and storage medium based on deep learning.
Background
Face changing technology has emerged in recent years and is being studied by many researchers in both industry and academia. It can automatically replace the face in a video or an image with the face of another person and is widely applied in various fields. However, existing face changing technology must be trained for a long time and many times every time a video is processed, and its resource consumption is excessive.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a video face changing method, device, equipment and storage medium based on deep learning, so as to solve the technical problems in the prior art that replacing faces in a video requires long training time, many training runs and large resource consumption.
In order to achieve the above object, the present invention provides a video face changing method based on deep learning, which comprises the following steps:
when a face changing instruction of a video to be face changed is received, acquiring a pre-stored face changing model according to the face changing instruction;
extracting a video frame to be face-changed in the video to be face-changed;
when the face information exists in the video frame to be changed, determining a face changing image according to the face information in the video frame to be changed through the pre-stored face changing model to obtain a face changing image;
and updating the video to be face-changed according to the face-changed image to obtain a target video.
Optionally, when a face change instruction of a video to be face changed is received, before a pre-stored face change model is obtained according to the face change instruction, the method further includes:
acquiring a historical face-changing image, a historical replacement image and a historical target video frame;
training an original model according to the historical face-changing image, the historical replacement image and the historical target video frame to obtain a trained model;
and adjusting the trained model through a preset format to obtain a pre-stored face changing model.
Optionally, before the training an original model according to the historical face-changed image, the historical replacement image, and the historical target video frame to obtain a trained model, the method further includes:
obtaining a plurality of preset loss functions;
obtaining a preset total loss function according to the preset loss function;
updating the original model according to the preset total loss function to obtain a target model;
the training of the original model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model comprises the following steps:
and training the target model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model.
Optionally, when the face information exists in the video frame to be changed, determining a face change image according to the face information in the video frame to be changed through the pre-stored face change model, to obtain a face change image, including:
when the face information exists in the video frame to be changed, obtaining an aligned face image according to the video frame to be changed;
and determining the face changing image of the aligned face image through the pre-stored face changing model to obtain the face changing image.
Optionally, when a human face is identified in the video frame to be changed, obtaining an aligned human face image according to the video frame to be changed includes:
when a face is identified in the video frame to be face-changed, detecting the video frame to be face-changed to obtain a plurality of face key points;
cutting a human face area image with a preset size in the video frame to be changed according to the human face key point;
and carrying out alignment operation on the face according to the face region image to obtain an aligned face image.
Optionally, after extracting a frame of the video to be face-changed in the video to be face-changed, the method further includes:
when no human face is identified in the video frame to be changed, taking the video frame to be changed as a first target video frame;
inquiring video frame position information of the first target video frame in the video to be changed;
and writing the first target video frame back to the video to be changed according to the video frame position information.
Optionally, the updating the video to be face-changed according to the face-changed image to obtain a target video, including:
inquiring the corresponding position of the face changing image in the video frame to be changed;
determining a replacement image according to the corresponding position;
replacing the face changing image in the video frame to be changed with the replacing image to obtain a second target video frame;
and updating the video to be face-changed according to the second target video frame to obtain a target video.
In addition, in order to achieve the above object, the present invention further provides a video face changing device based on deep learning, including:
the calling module is used for acquiring a pre-stored face changing model according to a face changing instruction when the face changing instruction of a video to be face changed is received;
the extraction module is used for extracting a video frame to be changed in the video to be changed;
the determining module is used for determining a face changing image according to the face information in the video frame to be face-changed through the pre-stored face changing model when the face information exists in the video frame to be face-changed, so as to obtain the face changing image;
and the updating module is used for updating the video to be face-changed according to the face-changed image to obtain a target video.
In addition, in order to achieve the above object, the present invention further provides a video face changing device based on deep learning, including: a memory, a processor, and a deep learning based video face changing program stored on the memory and executable on the processor, the deep learning based video face changing program being configured to implement the deep learning based video face changing method as described above.
In addition, to achieve the above object, the present invention further provides a storage medium, on which a deep learning based video face changing program is stored, which when executed by a processor implements the deep learning based video face changing method as described above.
When a face changing instruction for a video to be face-changed is received, a pre-stored face changing model is acquired according to the face changing instruction; a video frame to be face-changed is extracted from the video to be face-changed; when face information exists in the video frame to be face-changed, a face changing image is determined according to the face information in the video frame through the pre-stored face changing model; and the video to be face-changed is updated according to the face changing image to obtain a target video. In this way, the pre-stored face changing model is acquired when a face changing instruction is received, the video frame that needs face changing is extracted, the face changing image is determined through the pre-stored face changing model when face information exists in that frame, and the video to be face-changed is finally updated according to the face changing image to obtain the target video. Because only one pre-stored face changing model is called to process the video frames and obtain the face changing image, video face changing requires few training runs, the steps take little time, and the resource consumption is small.
Drawings
Fig. 1 is a schematic structural diagram of a deep learning-based video face changing device in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a video face-changing method based on deep learning according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating a video face-changing method based on deep learning according to a second embodiment of the present invention;
FIG. 4 is a schematic view of key points of a human face in an embodiment of a deep learning-based video face changing method of the present invention;
fig. 5 is a block diagram illustrating a video face changing device based on deep learning according to a first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a video face changing device based on deep learning in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the video face changing device based on deep learning may include: a processor 1001 such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and optionally may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM) such as a disk memory, and may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the architecture shown in fig. 1 does not constitute a limitation of the video face changing device based on deep learning, which may include more or fewer components than shown, combine some components, or arrange the components differently.
As shown in fig. 1, the memory 1005, which is a storage medium, may include an operating system, a network communication module, a user interface module, and a video face changing program based on deep learning.
In the video face changing device based on deep learning shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 are disposed in the video face changing device based on deep learning, which calls the video face changing program based on deep learning stored in the memory 1005 through the processor 1001 and executes the video face changing method based on deep learning provided by the embodiments of the present invention.
An embodiment of the present invention provides a video face changing method based on deep learning, and referring to fig. 2, fig. 2 is a schematic flow diagram of a first embodiment of a video face changing method based on deep learning according to the present invention.
In this embodiment, the video face changing method based on deep learning includes the following steps:
step S10: and when a face changing instruction of a video to be face changed is received, acquiring a pre-stored face changing model according to the face changing instruction.
It should be noted that the execution subject of this embodiment is a controller. The controller is mainly used for controlling the video face changing method based on deep learning and may be any device capable of implementing this function, which is not limited in this embodiment. This embodiment and the following embodiments are described by taking the controller of the video face changing method based on deep learning as an example.
It should be understood that the face change instruction refers to an instruction for instructing to start executing face change of a video to be face changed, and the face change instruction may be an instruction in any form, which is not limited in this embodiment.
In specific implementation, the pre-stored face changing model refers to a model for face changing that has been processed and stored in advance. In this embodiment, the pre-stored face changing model is obtained by modifying part of the structure and adjusting the parameters of the FaceShifter model.
It should be noted that the video to be changed is a video that needs to perform the step of the face changing method, and may be a video with any length and definition, which is not limited in this embodiment.
It should be understood that acquiring a pre-stored face changing model according to the face changing instruction when a face changing instruction of a video to be face-changed is received means that the pre-stored face changing model is called when an instruction indicating that the video to be face-changed needs face changing is received.
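The patent text contains no source code; purely as an illustrative, non-limiting sketch, acquiring a pre-stored face changing model exported in the PB format could look like the following, where the file name, the use of TensorFlow and the helper name load_prestored_face_swap_model are assumptions:

import tensorflow as tf

def load_prestored_face_swap_model(pb_path="face_swap_model.pb"):
    # Read the frozen graph that was exported in the preset (PB) format.
    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    graph = tf.Graph()
    with graph.as_default():
        tf.compat.v1.import_graph_def(graph_def, name="")
    # The session is created once when the face changing instruction is received
    # and reused for every video frame to be face-changed.
    return tf.compat.v1.Session(graph=graph)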
Further, before the step S10, in order to obtain the pre-stored face-changing model after training in advance, the method further includes:
acquiring a historical face-changing image, a historical replacement image and a historical target video frame;
training an original model according to the historical face-changing image, the historical replacement image and the historical target video frame to obtain a trained model;
and adjusting the trained model through a preset format to obtain a pre-stored face changing model.
It should be noted that the historical face changing image is a face changing image stored during a previous, successfully completed run of the face changing method. The face changing image refers to the image area in a video frame to be face-changed that is determined by the pre-stored face changing model and needs to be replaced.
It should be understood that the historical replacement image refers to a replacement image stored during a previous, successfully completed run of the face changing method. The replacement image refers to the image that replaces the face changing image in the video frame to be face-changed.
In a specific implementation, the historical target video frame refers to a video frame, stored during a previous successful run of the face changing method, in which the face has already been completely replaced.
It should be noted that the historical face-changing image, the historical replacement image, and the historical target video frame are stored in the form of a source data set CelebA-HQ or FFHQ.
It should be understood that training the original model according to the historical face changing image, the historical replacement image and the historical target video frame to obtain the trained model means extracting the historical face changing images, historical replacement images and historical target video frames from the training data set and then training the original model with them to obtain the trained model.
In a specific implementation, the original model refers to the FaceShifter model that is invoked prior to training.
It should be noted that the preset format refers to the PB format. Adjusting the trained model through the preset format to obtain the pre-stored face changing model means converting the trained model into the PB format; the converted model is the pre-stored face changing model.
By the method, the original model can be trained accurately and quickly, and the pre-stored face-changing model is obtained.
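As an illustrative sketch only (the patent does not specify the framework; a TensorFlow/Keras generator and the helper name export_trained_model_to_pb are assumptions), adjusting the trained model into the PB format could be done as follows:

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

def export_trained_model_to_pb(trained_model, out_dir=".", pb_name="face_swap_model.pb"):
    # Wrap the forward pass in a concrete function with a fixed input signature.
    spec = tf.TensorSpec(trained_model.inputs[0].shape, trained_model.inputs[0].dtype)
    concrete_fn = tf.function(lambda x: trained_model(x)).get_concrete_function(spec)
    # Fold the trained variables into constants and write out the frozen PB graph.
    frozen_fn = convert_variables_to_constants_v2(concrete_fn)
    tf.io.write_graph(frozen_fn.graph, out_dir, pb_name, as_text=False)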
Further, in order to perform structure adjustment on the original model, the step of training the original model according to the historical face-changed image, the historical replacement image, and the historical target video frame further includes, before obtaining the trained model:
obtaining a plurality of preset loss functions;
obtaining a preset total loss function according to the preset loss function;
updating the original model according to the preset total loss function to obtain a target model;
the training of the original model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model comprises the following steps:
and training the target model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model.
It should be understood that the preset loss function refers to a loss function that is stored in advance to replace the function in the original model.
In a specific implementation, the preset loss function may include the following four loss functions, which are, in order, a GAN loss function, an id loss function, a multi-layer attribute loss function, and a pixel-level reconstruction loss function:

L_{adv} = \mathbb{E}_{X_t}[\log D(X_t)] + \mathbb{E}_{X_s, X_t}[\log(1 - D(G(X_s, X_t)))]

L_{id} = 1 - \cos\big(z_{id}(\hat{Y}_{s,t}),\ z_{id}(X_s)\big)

L_{att} = \frac{1}{2}\sum_{k=1}^{n}\big\| z_{att}^{k}(\hat{Y}_{s,t}) - z_{att}^{k}(X_t) \big\|_2^2

L_{rec} = \begin{cases}\frac{1}{2}\big\| \hat{Y}_{s,t} - X_t \big\|_2^2 & \text{if } X_t = X_s \\ 0 & \text{otherwise}\end{cases}

wherein D is the discriminator, G is the generator, \mathbb{E} is the mathematical expectation, z_{id} is the face identity feature, X_s is the source face, i.e. the face changing image that needs to be changed, X_t is the target face, i.e. the replacement image that replaces the face changing image, \hat{Y}_{s,t} is the generated face, i.e. the face after replacement, and z_{att}^{k} is the embedding of a certain attribute at the k-th layer.
In a specific implementation, obtaining the preset total loss function according to the preset loss function means obtaining the preset total loss function according to a GAN loss function, an id loss function, a multi-layer attribute loss function, and a pixel-level reconstruction loss function.
It should be noted that the calculation method of the preset total loss function is as follows:

L_{total} = L_{adv} + \lambda_{att} L_{att} + \lambda_{id} L_{id} + \lambda_{rec} L_{rec}

wherein \lambda_{att}, \lambda_{id} and \lambda_{rec} are preset coefficients.
It should be understood that, updating the original model according to the preset total loss function to obtain the target model means that, after the preset total loss function is obtained, the preset total loss function is used to replace the loss function in the original model, and the obtained model is the target model.
By the method, the accuracy of the original model can be improved by replacing the loss function in the original model, and the accuracy and the final effect of video face changing are improved.
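The following is only an illustrative sketch of how the preset total loss could be assembled from the preset losses; the coefficient values and helper names are assumed for the example and are not fixed by this description:

import tensorflow as tf

def identity_loss(z_id_generated, z_id_source):
    # id loss: 1 - cosine similarity between the identity features of the
    # generated face and the source face.
    cos = tf.reduce_sum(tf.nn.l2_normalize(z_id_generated, axis=-1) *
                        tf.nn.l2_normalize(z_id_source, axis=-1), axis=-1)
    return tf.reduce_mean(1.0 - cos)

def attribute_loss(att_generated, att_target):
    # Multi-layer attribute loss: L2 distance between attribute embeddings, summed over layers.
    return 0.5 * tf.add_n([tf.reduce_mean(tf.square(g - t))
                           for g, t in zip(att_generated, att_target)])

def total_loss(adv_loss, id_loss, att_loss, rec_loss,
               lambda_att=10.0, lambda_id=5.0, lambda_rec=10.0):
    # Preset total loss: adversarial term plus weighted attribute, identity and
    # reconstruction terms (the lambda values here are example coefficients).
    return adv_loss + lambda_att * att_loss + lambda_id * id_loss + lambda_rec * rec_loss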
Step S20: and extracting a video frame to be changed in the video to be changed.
In a specific implementation, the video frame to be changed is a designated video frame that needs to be changed in the video to be changed, and the video frame to be changed may be any frame of a picture in the video to be changed, which is not limited in this embodiment.
It should be noted that, extracting a video frame to be changed in the video to be changed means extracting a video frame to be changed from the video to be changed as the video frame to be changed.
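A minimal sketch of this extraction step with OpenCV (the library choice is an assumption; the patent does not name one):

import cv2

def extract_frames(video_path):
    # Yield (frame position, frame) pairs from the video to be face-changed.
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        yield index, frame
        index += 1
    cap.release()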
Step S30: and when the face information exists in the video frame to be changed, determining a face changing image according to the face information in the video frame to be changed through the pre-stored face changing model to obtain the face changing image.
It should be understood that image recognition, or any other suitable method, may be used to determine whether face information exists in the video frame to be face-changed, which is not limited in this embodiment.
In a specific implementation, determining the face changing image according to the face information in the video frame to be face-changed through the pre-stored face changing model when face information exists in that frame means that, once face information is found in the video frame to be face-changed, the video frame is input into the pre-stored face changing model, and the image finally output by the pre-stored face changing model is the determined face changing image.
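One possible way, assumed here for illustration only, to decide whether face information exists in a frame is a standard OpenCV face detector:

import cv2

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def has_face(frame):
    # Return True when at least one face is detected in the video frame to be face-changed.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0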
Step S40: and updating the video to be face-changed according to the face-changed image to obtain a target video.
It should be noted that the target video refers to a face-changed video that needs to be finally obtained after the method is performed.
It should be understood that updating the video to be face-changed according to the face changing image to obtain the target video means that, after the face changing image is determined, the prepared replacement image replaces and covers the face changing image in the video frame to be face-changed to obtain a second target video frame, and the video to be face-changed is then updated according to the second target video frame to obtain the target video.
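For illustration (the codec, container and helper name write_target_video are assumptions), writing the processed frames back in their original order to obtain the target video could be sketched as:

import cv2

def write_target_video(frames_in_order, out_path, fps, frame_size):
    # frame_size is (width, height); each frame is written back at its original position.
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(out_path, fourcc, fps, frame_size)
    for frame in frames_in_order:
        writer.write(frame)
    writer.release()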
In this embodiment, when a face changing instruction of a video to be face-changed is received, a pre-stored face changing model is acquired according to the face changing instruction; a video frame to be face-changed is extracted from the video to be face-changed; when face information exists in the video frame to be face-changed, a face changing image is determined according to the face information in the video frame through the pre-stored face changing model; and the video to be face-changed is updated according to the face changing image to obtain a target video. In this way, the pre-stored face changing model is acquired when a face changing instruction is received, the video frame that needs face changing is extracted, the face changing image is determined through the pre-stored face changing model when face information exists in that frame, and the video to be face-changed is finally updated according to the face changing image to obtain the target video. Because only one pre-stored face changing model is called to process the video frames and obtain the face changing image, video face changing requires few training runs, the steps take little time, and the resource consumption is small.
Referring to fig. 3, fig. 3 is a flowchart illustrating a video face changing method based on deep learning according to a second embodiment of the present invention.
Based on the first embodiment, the video face changing method based on deep learning of the present embodiment includes, in the step S30:
step S301: and when the face information exists in the video frame to be changed, obtaining an aligned face image according to the video frame to be changed.
It should be noted that the aligned face image refers to the face image obtained by processing the video frame to be face-changed when face information exists in that frame.
It should be understood that obtaining an aligned face image according to the video frame to be face-changed when face information exists in that frame means that the video frame is processed as follows: the image position of the face is determined first, and the face image is then aligned to obtain the aligned face image.
Further, in order to accurately obtain the aligned face image, step S301 includes:
when a face is identified in the video frame to be face-changed, detecting the video frame to be face-changed to obtain a plurality of face key points;
cutting a human face area image with a preset size in the video frame to be changed according to the human face key point;
and carrying out alignment operation on the face according to the face region image to obtain an aligned face image.
It should be noted that the face key points refer to key image points on the face that are selected automatically once the video frame to be face-changed has been detected, and are used to identify the positions of facial features. The number of face key points may be any number, which is not limited in this embodiment.
In a specific implementation, as shown in fig. 4, a schematic diagram of selecting face key points is shown, the face key points are selected based on feature parts such as facial organs of an identified face, and the number of the face key points is not specifically limited in this embodiment, and may be any number. Fig. 4 is only an illustration, and does not limit the description of the present embodiment.
It should be understood that when a human face is identified in the video frame to be changed, detecting the video frame to be changed to obtain a plurality of human face key points means that when human face information exists in the video frame to be changed, image detection is automatically performed on the video frame to be changed, and then the plurality of human face key points are determined.
In a specific implementation, the preset size is a size that can be set by an administrator or a user, and is used for cutting the face image, and the size of the preset size is not limited in this embodiment.
The face region image refers to an image of a preset size cut out from a video frame to be changed according to the position of a face key point.
It should be understood that, the cutting of the face region image with the preset size in the video frame to be changed according to the face key points means that, after a plurality of face key points are determined, an image region with the preset size is cut out in the video frame to be changed according to the positions of the face key points to serve as the face region image.
In a specific implementation, performing an alignment operation on the face according to the face region image to obtain an aligned face image means that, after the face region image is obtained, the face image is aligned through an affine transformation, that is, the position and angle of the face are adjusted so that it faces the front; the resulting image is the aligned face image.
By the method, the face image can be accurately and quickly changed into the aligned face image, so that the subsequent face changing step is more convenient, and the final face changing effect is improved.
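A simplified sketch of the alignment operation, assuming only two eye key points and a rotation plus crop (the real number of key points and the preset size are left open by the embodiment):

import cv2
import numpy as np

def align_face(frame, left_eye, right_eye, preset_size=256):
    # Rotate the frame so the eyes lie on a horizontal line, then crop the face region image.
    left_eye = np.asarray(left_eye, dtype=float)
    right_eye = np.asarray(right_eye, dtype=float)
    center = (float((left_eye[0] + right_eye[0]) / 2),
              float((left_eye[1] + right_eye[1]) / 2))
    angle = np.degrees(np.arctan2(right_eye[1] - left_eye[1],
                                  right_eye[0] - left_eye[0]))
    rotation = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(frame, rotation, (frame.shape[1], frame.shape[0]))
    # Crop a face region image of the preset size around the eye center.
    x = max(int(center[0] - preset_size / 2), 0)
    y = max(int(center[1] - preset_size / 2), 0)
    return rotated[y:y + preset_size, x:x + preset_size]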
Step S302: and determining the face changing image of the aligned face image through the pre-stored face changing model to obtain the face changing image.
It should be noted that, determining the face-changing image of the aligned face image through the pre-stored face-changing model to obtain the face-changing image means that the aligned face image is input into the pre-stored face-changing model obtained after training is completed, then an image output by the pre-stored face-changing model is obtained, and the image output by the pre-stored face-changing model is used as the face-changing image.
Further, in order to put the video frame to be face-changed back into the video to be face-changed when no human face is recognized in it, after the step of extracting the video frame to be face-changed from the video to be face-changed, the method further includes:
when no human face is identified in the video frame to be changed, taking the video frame to be changed as a first target video frame;
inquiring video frame position information of the first target video frame in the video to be changed;
and writing the first target video frame back to the video to be changed according to the video frame position information.
It should be understood that when no human face is identified in the video frame to be changed, taking the video frame to be changed as the first target video frame means that when no human face is identified after image recognition is performed on the video frame to be changed, taking the video frame to be changed without the human face identified as the first target video frame.
In a specific implementation, the video frame position information refers to a position of a frame number of the first target video frame in the video to be face-changed, that is, a frame number of the first target video frame in the video to be face-changed.
It should be noted that, the querying of the video frame position information of the first target video frame in the video to be face-changed refers to querying of the video frame position information of the first target video frame in the video to be face-changed after the first target video frame is determined.
It should be understood that, the step of placing the first target video frame back to the video to be face-changed according to the video frame position information refers to placing the first target video frame back to the video to be face-changed after determining the video frame position information of the first target video frame.
By the method, the video frame to be changed can be quickly placed back to the video to be changed when the face information is not identified in the video frame to be changed, and the video face changing efficiency is improved.
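The handling of frames with and without a recognized face can be sketched as a simple per-frame dispatch; the helper names and the dictionary keyed by frame position are assumptions for illustration:

def process_frame(frame_index, frame, output_frames, has_face, swap_face):
    # output_frames maps video frame position information to the frame written back there.
    if not has_face(frame):
        # First target video frame: no face recognized, so the frame is put back
        # unchanged at its original position in the video to be face-changed.
        output_frames[frame_index] = frame
    else:
        output_frames[frame_index] = swap_face(frame)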
Further, in order to change the face of the video to be changed after the face change image is obtained, the step of updating the video to be changed according to the face change image to obtain a target video includes:
inquiring the corresponding position of the face changing image in the video frame to be changed;
determining a replacement image according to the corresponding position;
replacing the face changing image in the video frame to be changed with the replacing image to obtain a second target video frame;
and updating the video to be face-changed according to the second target video frame to obtain a target video.
It should be noted that, the querying of the corresponding position of the face-changing image in the video frame to be face-changed refers to querying the position of the face-changing image in the video frame to be face-changed, that is, the corresponding position.
It should be understood that the replacement image refers to a face image which is stored in advance and is used for replacing the face image, namely the face image in the video frame to be changed after the replacement is completed.
In a specific implementation, the determining of the replacement image according to the corresponding position refers to determining a replacement image to be replaced according to the corresponding position after determining the corresponding position of the face-changed image.
It should be noted that, replacing the face-changed image in the video frame to be face-changed with the replacement image to obtain the second target video frame means that the face-changed image in the video frame to be face-changed is replaced with the replacement image, and the finally obtained video frame is the second target video frame. That is, the second target video frame is an image in which the face-changed image in the video frame to be face-changed is changed to the replacement image.
It should be understood that updating the video to be face-changed according to the second target video frame to obtain the target video means that, after the second target video frame is obtained, the position of the second target video frame in the video to be face-changed is queried, that is, which frame number it occupies, and the second target video frame is then written back into the video to be face-changed in place of the original video frame to be face-changed, so as to obtain the target video.
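A hedged sketch of producing the second target video frame; representing the corresponding position as an (x, y, w, h) bounding box is an assumption made only for this example:

import cv2

def replace_face_region(frame, replacement_image, box):
    # box = (x, y, w, h): the corresponding position of the face changing image in the frame.
    x, y, w, h = box
    resized = cv2.resize(replacement_image, (w, h))
    second_target_frame = frame.copy()
    second_target_frame[y:y + h, x:x + w] = resized
    return second_target_frame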
In this embodiment, when face information exists in the video frame to be face-changed, an aligned face image is obtained according to the video frame to be face-changed, and the face changing image of the aligned face image is determined through the pre-stored face changing model to obtain the face changing image. In this way, the video frame to be face-changed is preprocessed before being fed into the pre-stored face changing model, and the resulting aligned face image is input into the pre-stored face changing model, so the model receives a more accurate input image and the face changing effect is improved.
In addition, an embodiment of the present invention further provides a storage medium, where the storage medium stores a deep learning-based video face changing program, and the deep learning-based video face changing program, when executed by a processor, implements the steps of the deep learning-based video face changing method as described above.
Since the storage medium adopts all technical solutions of all the embodiments described above, at least all the beneficial effects brought by the technical solutions of the embodiments described above are achieved, and are not described in detail herein.
Referring to fig. 5, fig. 5 is a block diagram illustrating a first embodiment of a video face changing apparatus based on deep learning according to the present invention.
As shown in fig. 5, the video face changing apparatus based on deep learning according to the embodiment of the present invention includes:
the calling module 10 is configured to, when a face changing instruction of a video to be face changed is received, obtain a pre-stored face changing model according to the face changing instruction.
And the extracting module 20 is configured to extract a video frame to be changed in the video to be changed.
And the processing module 30 is configured to determine a face change image according to the pre-stored face change model when the face information exists in the video frame to be changed, so as to obtain the face change image.
And the updating module 40 is configured to update the video to be face-changed according to the face-changed image to obtain a target video.
It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in a specific application, a person skilled in the art may set the technical solution as needed, and the present invention is not limited thereto.
In this embodiment, when a face changing instruction of a video to be face-changed is received, a pre-stored face changing model is acquired according to the face changing instruction; a video frame to be face-changed is extracted from the video to be face-changed; when face information exists in the video frame to be face-changed, a face changing image is determined according to the face information in the video frame through the pre-stored face changing model; and the video to be face-changed is updated according to the face changing image to obtain a target video. In this way, the pre-stored face changing model is acquired when a face changing instruction is received, the video frame that needs face changing is extracted, the face changing image is determined through the pre-stored face changing model when face information exists in that frame, and the video to be face-changed is finally updated according to the face changing image to obtain the target video. Because only one pre-stored face changing model is called to process the video frames and obtain the face changing image, video face changing requires few training runs, the steps take little time, and the resource consumption is small.
In this embodiment, the calling module 10 is further configured to obtain a historical face-changing image, a historical replacement image, and a historical target video frame; training an original model according to the historical face-changing image, the historical replacement image and the historical target video frame to obtain a trained model; and adjusting the trained model through a preset format to obtain a pre-stored face changing model.
In this embodiment, the calling module 10 is further configured to obtain a plurality of preset loss functions; obtaining a preset total loss function according to the preset loss function; updating the original model according to the preset total loss function to obtain a target model; the training of the original model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model comprises the following steps: and training the target model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model.
In this embodiment, the processing module 30 is further configured to, when face information exists in the video frame to be face-changed, obtain an aligned face image according to the video frame to be face-changed; and determining the face changing image of the aligned face image through the pre-stored face changing model to obtain the face changing image.
In this embodiment, the processing module 30 is further configured to, when a human face is identified in the video frame to be face-changed, detect the video frame to be face-changed to obtain a plurality of human face key points; cutting a human face area image with a preset size in the video frame to be changed according to the human face key point; and carrying out alignment operation on the face according to the face region image to obtain an aligned face image.
In this embodiment, the extracting module 20 is further configured to, when no human face is identified in the video frame to be face-changed, take the video frame to be face-changed as a first target video frame; query video frame position information of the first target video frame in the video to be face-changed; and put the first target video frame back into the video to be face-changed according to the video frame position information.
In this embodiment, the updating module 40 is further configured to query a corresponding position of the face-changed image in the video frame to be face-changed; determining a replacement image according to the corresponding position; replacing the face changing image in the video frame to be changed with the replacing image to obtain a second target video frame; and updating the video to be face-changed according to the second target video frame to obtain a target video.
Since the present apparatus employs all technical solutions of all the above embodiments, at least all the beneficial effects brought by the technical solutions of the above embodiments are achieved, and are not described in detail herein.
It should be noted that the above-described work flows are only exemplary, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of them to achieve the purpose of the solution of the embodiment according to actual needs, and the present invention is not limited herein.
In addition, the technical details that are not described in detail in this embodiment may be referred to a video face changing method based on deep learning provided in any embodiment of the present invention, and are not described herein again.
Further, it is to be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (e.g. Read Only Memory (ROM)/RAM, magnetic disk, optical disk), and includes several instructions for enabling a terminal device (e.g. a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A video face changing method based on deep learning is characterized in that the video face changing method based on deep learning comprises the following steps:
when a face changing instruction of a video to be face changed is received, acquiring a pre-stored face changing model according to the face changing instruction;
extracting a video frame to be face-changed in the video to be face-changed;
when the face information exists in the video frame to be changed, determining a face changing image according to the face information in the video frame to be changed through the pre-stored face changing model to obtain a face changing image;
and updating the video to be face-changed according to the face-changed image to obtain a target video.
2. The method of claim 1, wherein before the obtaining of the pre-stored face changing model according to the face changing instruction when the face changing instruction of the video to be face changed is received, the method further comprises:
acquiring a historical face-changing image, a historical replacement image and a historical target video frame;
training an original model according to the historical face-changing image, the historical replacement image and the historical target video frame to obtain a trained model;
and adjusting the trained model through a preset format to obtain a pre-stored face changing model.
3. The method of claim 2, wherein before the training an original model according to the historical re-face image, the historical replacement image and the historical target video frame to obtain a trained model, further comprising:
obtaining a plurality of preset loss functions;
obtaining a preset total loss function according to the preset loss function;
updating the original model according to the preset total loss function to obtain a target model;
the training of the original model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model comprises the following steps:
and training the target model according to the historical face-changing image, the historical image to be replaced and the historical target video frame to obtain a trained model.
4. The method of claim 1, wherein when the face information exists in the video frame to be changed, determining a face change image according to the pre-stored face change model by using the face information in the video frame to be changed to obtain the face change image, comprising:
when the face information exists in the video frame to be changed, obtaining an aligned face image according to the video frame to be changed;
and determining the face changing image of the aligned face image through the pre-stored face changing model to obtain the face changing image.
5. The method of claim 4, wherein when a human face is recognized in the video frame to be face-changed, obtaining an aligned human face image according to the video frame to be face-changed comprises:
when a face is identified in the video frame to be face-changed, detecting the video frame to be face-changed to obtain a plurality of face key points;
cutting a human face area image with a preset size in the video frame to be changed according to the human face key point;
and carrying out alignment operation on the face according to the face region image to obtain an aligned face image.
6. The method according to any one of claims 1 to 5, wherein after extracting the frame of the video to be face-changed from the video to be face-changed, the method further comprises:
when no human face is identified in the video frame to be changed, taking the video frame to be changed as a first target video frame;
inquiring video frame position information of the first target video frame in the video to be changed;
and writing the first target video frame back to the video to be changed according to the video frame position information.
7. The method according to any one of claims 1 to 5, wherein the updating the video to be face-changed according to the face-changed image to obtain a target video comprises:
inquiring the corresponding position of the face changing image in the video frame to be changed;
determining a replacement image according to the corresponding position;
replacing the face changing image in the video frame to be changed with the replacing image to obtain a second target video frame;
and updating the video to be face-changed according to the second target video frame to obtain a target video.
8. A video face changing device based on deep learning, comprising:
the calling module is used for acquiring a pre-stored face changing model according to a face changing instruction when the face changing instruction of a video to be face changed is received;
the extraction module is used for extracting a video frame to be changed in the video to be changed;
the processing module is used for determining face changing images of the face information in the video frame to be changed through the pre-stored face changing model when the face information exists in the video frame to be changed, so as to obtain face changing images;
and the updating module is used for updating the video to be face-changed according to the face-changed image to obtain a target video.
9. A video face changing device based on deep learning, the device comprising: a memory, a processor, and a deep learning based video face changing program stored on the memory and executable on the processor, the deep learning based video face changing program being configured to implement the deep learning based video face changing method as claimed in any one of claims 1 to 7.
10. A storage medium having stored thereon a deep learning based video face changing program which, when executed by a processor, implements the deep learning based video face changing method as claimed in any one of claims 1 to 7.
CN202110754216.7A 2021-07-01 2021-07-01 Video face changing method, device, equipment and storage medium based on deep learning Pending CN113486785A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110754216.7A CN113486785A (en) 2021-07-01 2021-07-01 Video face changing method, device, equipment and storage medium based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110754216.7A CN113486785A (en) 2021-07-01 2021-07-01 Video face changing method, device, equipment and storage medium based on deep learning

Publications (1)

Publication Number Publication Date
CN113486785A true CN113486785A (en) 2021-10-08

Family

ID=77940555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110754216.7A Pending CN113486785A (en) 2021-07-01 2021-07-01 Video face changing method, device, equipment and storage medium based on deep learning

Country Status (1)

Country Link
CN (1) CN113486785A (en)


Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190259384A1 (en) * 2018-02-19 2019-08-22 Invii.Ai Systems and methods for universal always-on multimodal identification of people and things
US20200302184A1 (en) * 2019-03-21 2020-09-24 Samsung Electronics Co., Ltd. Electronic device and controlling method thereof
CN110222607A (en) * 2019-05-24 2019-09-10 北京航空航天大学 The method, apparatus and system of face critical point detection
WO2021083069A1 (en) * 2019-10-30 2021-05-06 上海掌门科技有限公司 Method and device for training face swapping model
US20210152751A1 (en) * 2019-11-19 2021-05-20 Tencent Technology (Shenzhen) Company Limited Model training method, media information synthesis method, and related apparatuses
CN111243626A (en) * 2019-12-30 2020-06-05 清华大学 Speaking video generation method and system
CN111476710A (en) * 2020-04-13 2020-07-31 上海艾麒信息科技有限公司 Video face changing method and system based on mobile platform
CN111783608A (en) * 2020-06-24 2020-10-16 南京烽火星空通信发展有限公司 Face changing video detection method
CN111783603A (en) * 2020-06-24 2020-10-16 有半岛(北京)信息科技有限公司 Training method for generating confrontation network, image face changing method and video face changing method and device
CN111950497A (en) * 2020-08-20 2020-11-17 重庆邮电大学 AI face-changing video detection method based on multitask learning model
CN111914812A (en) * 2020-08-20 2020-11-10 腾讯科技(深圳)有限公司 Image processing model training method, device, equipment and storage medium
CN112102157A (en) * 2020-09-09 2020-12-18 咪咕文化科技有限公司 Video face changing method, electronic device and computer readable storage medium
CN112163511A (en) * 2020-09-25 2021-01-01 天津大学 Method for identifying authenticity of image
CN112734631A (en) * 2020-12-31 2021-04-30 北京深尚科技有限公司 Video image face changing method, device, equipment and medium based on fine adjustment model
CN112446364A (en) * 2021-01-29 2021-03-05 中国科学院自动化研究所 High-definition face replacement video generation method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LINGZHI LI et al.: "FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping", Computer Vision and Pattern Recognition, 15 September 2020 (2020-09-15), pages 1-11 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114222077A (en) * 2021-12-14 2022-03-22 惠州视维新技术有限公司 Video processing method and device, storage medium and electronic equipment
CN115187446A (en) * 2022-05-26 2022-10-14 北京健康之家科技有限公司 Face changing video generation method and device, computer equipment and readable storage medium
CN115358916A (en) * 2022-07-06 2022-11-18 北京健康之家科技有限公司 Face-changed image generation method and device, computer equipment and readable storage medium
CN117196937A (en) * 2023-09-08 2023-12-08 天翼爱音乐文化科技有限公司 Video face changing method, device and storage medium based on face recognition model
CN117196937B (en) * 2023-09-08 2024-05-14 天翼爱音乐文化科技有限公司 Video face changing method, device and storage medium based on face recognition model

Similar Documents

Publication Publication Date Title
CN113486785A (en) Video face changing method, device, equipment and storage medium based on deep learning
CN109961009B (en) Pedestrian detection method, system, device and storage medium based on deep learning
CN108197618B (en) Method and device for generating human face detection model
CN111814905A (en) Target detection method, target detection device, computer equipment and storage medium
WO2021147221A1 (en) Text recognition method and apparatus, and electronic device and storage medium
CN112633313B (en) Bad information identification method of network terminal and local area network terminal equipment
DE102014117895A1 (en) Note-based spot-healing techniques
CN107292817B (en) Image processing method, device, storage medium and terminal
CN109963072B (en) Focusing method, focusing device, storage medium and electronic equipment
US11232561B2 (en) Capture and storage of magnified images
CN113420769A (en) Image mask recognition, matting and model training method and device and electronic equipment
CN113079273A (en) Watermark processing method, device, electronic equipment and medium
CN112381092A (en) Tracking method, device and computer readable storage medium
CN109871205B (en) Interface code adjustment method, device, computer device and storage medium
CN112532884B (en) Identification method and device and electronic equipment
US11442982B2 (en) Method and system for acquiring data files of blocks of land and of building plans and for making matches thereof
CN111539390A (en) Small target image identification method, equipment and system based on Yolov3
CN111127458A (en) Target detection method and device based on image pyramid and storage medium
CN115860026A (en) Bar code detection method and device, bar code detection equipment and readable storage medium
CN110660000A (en) Data prediction method, device, equipment and computer readable storage medium
CN112749769B (en) Graphic code detection method, graphic code detection device, computer equipment and storage medium
CN112286430B (en) Image processing method, apparatus, device and medium
CN112950167A (en) Design service matching method, device, equipment and storage medium
CN113127058A (en) Data annotation method, related device and computer program product
CN112819885A (en) Animal identification method, device and equipment based on deep learning and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination