CN111986301A - Method and device for processing data in live broadcast, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111986301A
Authority
CN
China
Prior art keywords
face
interchange
data
user
anchor
Prior art date
Legal status
Pending
Application number
CN202010924402.6A
Other languages
Chinese (zh)
Inventor
巢娅
Current Assignee
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd
Priority to CN202010924402.6A
Publication of CN111986301A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L 65/40 Support for services or applications

Abstract

The embodiment of the invention provides a method and a device for processing data in live broadcast, an electronic device, and a storage medium, wherein the method comprises the following steps: determining first attribute information for a first user and second attribute information for a second user during a live connection between the first user and the second user; determining a target face interchange mode according to the first attribute information and the second attribute information; and performing face interchange processing on the first user and the second user in a video picture of the live connection according to the target face interchange mode. The embodiment of the invention realizes face swapping in any live scene: by determining the face interchange mode from the user information, real-time face changing during a live broadcast is achieved with low delay and good visual quality while system performance is preserved, which adds interest and entertainment value to the live broadcast.

Description

Method and device for processing data in live broadcast, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image technologies, and in particular, to a method and an apparatus for processing data in live broadcast, an electronic device, and a storage medium.
Background
At present, a direct-mapping face changing method is generally adopted to present the face interchange effect, but its fusion quality is poor, which degrades the result. One related technique extracts facial feature points and fits them to another person's three-dimensional face model, but it must construct the three-dimensional model and map textures in real time, so the algorithm is computationally expensive and costly in system performance. Another related technique collects face pictures of the people whose faces are to be swapped in advance, trains a model, and then swaps faces with the trained model; however, because the model must be trained beforehand, this method is unsuitable for real-time face changing during a live broadcast.
Neither of these face changing methods balances algorithm effectiveness against system energy consumption; neither is designed around the attribute characteristics of anchors in a live scene, and neither suits real-time face changing between anchors of differing attributes. Therefore, when faces are interchanged between anchors in a live scene, how to extend the face interchange method to the live service scene and achieve real-time face interchange is a problem that urgently needs to be solved.
Disclosure of Invention
In view of the above, a method and an apparatus for data processing in live broadcast, an electronic device, and a storage medium are proposed that overcome, or at least partially solve, the above problems, including:
a method of data processing in live broadcasting, the method comprising:
determining first attribute information for a first user and second attribute information for a second user during a live connection between the first user and the second user;
determining a target face interchange pattern according to the first attribute information and the second attribute information;
and according to the target face interchange mode, performing face interchange processing on the first user and the second user in a video picture of the live connection.
Optionally, the determining a target face exchange pattern according to the first attribute information and the second attribute information includes:
acquiring a preset attribute information set, and judging whether the first attribute information and the second attribute information are both matched with the attribute information set;
if so, determining that the preset first face interchange mode is the target face interchange mode;
otherwise, determining the preset second face interchange mode as the target face interchange mode.
Optionally, before performing the face interchange processing on the first user and the second user in the video frame of the live connection according to the target face interchange mode, the method further includes:
acquiring first original face data for the first user and second original face data for the second user from a video picture of the live broadcast connection;
the performing, according to the target face interchange mode, face interchange processing on the first user and the second user in a video picture of the live broadcast connection includes:
acquiring a target face interchange model corresponding to the target face interchange mode;
generating first exchanged face data corresponding to the first original face data and second exchanged face data corresponding to the second original face data by adopting the target face exchange model;
performing replacement processing on the first original face data according to the first interchanged face data, and performing replacement processing on the second original face data according to the second interchanged face data.
Optionally, the obtaining a target face interchange model corresponding to the target face interchange pattern includes:
when the target face interchange mode is the first face interchange mode, acquiring a preset first face interchange model as a target face interchange model; wherein the first face interchange model comprises a first decoding model for the first user and a second decoding model for the second user;
generating first interchanged face data corresponding to the first original face data and second interchanged face data corresponding to the second original face data by adopting the target face interchange model, wherein the generating comprises:
generating first interchange face data corresponding to the first original face data by adopting the second decoding model;
and generating second interchange face data corresponding to the second original face data by adopting the first decoding model.
Optionally, the obtaining a target face interchange model corresponding to the target face interchange pattern includes:
when the target face interchange pattern is the second face interchange pattern, adopting the first original face data to construct a second face interchange model for the first user;
constructing a third face interchange model for the second user by adopting the second original face data;
generating first interchanged face data corresponding to the first original face data and second interchanged face data corresponding to the second original face data by adopting the target face interchange model, wherein the generating comprises:
generating first face interchange data corresponding to the first original face data by adopting the second face interchange model;
and generating second face interchange data corresponding to the second original face data by adopting the third face interchange model.
Optionally, the replacing the first original face data according to the first interchanged face data includes:
generating first face mask data corresponding to the first original face data, and obtaining first target face data according to the first face mask data and the first interchange face data;
replacing the first original face data with the first target face data;
the replacing the second original face data according to the second interchanged face data includes:
generating second face mask data corresponding to the second original face data, and obtaining second target face data according to the second face mask data and the second interchange face data;
replacing the second original face data with the second target face data.
Optionally, the obtaining first target face data according to the first face mask data and the first face exchange data includes:
obtaining first target face data according to the first face mask data, the first interchange face data and the skin color data of the first user;
said deriving second target face data from said second face mask data and said second interchange face data, comprising:
and obtaining second target face data according to the second face mask data, the second interchange face data and the skin color data of the second user.
Optionally, in the live connection process, a video picture of the live connection is presented at the viewer client, where the video picture simultaneously includes a video picture of the first user and a video picture of the second user.
Optionally, the first face exchange mode is an autoencoder mode, and the second face exchange mode is a three-dimensional reconstruction mode.
An apparatus for data processing in a live broadcast, the apparatus comprising:
an attribute information determining module, used for determining first attribute information for a first user and second attribute information for a second user during a live connection between the first user and the second user;
a target face interchange mode determining module, used for determining a target face interchange mode according to the first attribute information and the second attribute information;
and a face interchange processing module, used for performing face interchange processing on the first user and the second user in a video picture of the live connection according to the target face interchange mode.
An electronic device comprising a processor, a memory and a computer program stored on the memory and being executable on the processor, the computer program, when executed by the processor, implementing a method of data processing in live broadcast as described above.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method of data processing in live broadcast as described above.
The embodiment of the invention has the following advantages:
in the embodiment of the invention, first attribute information for the first user and second attribute information for the second user are determined while the first user and the second user are in a live connection; a target face interchange mode is then determined according to the first attribute information and the second attribute information; and face interchange processing is performed on the first user and the second user in a video picture of the live connection according to the target face interchange mode. Face swapping is thereby realized in any live scene: because the face interchange mode is determined from the users' attribute information, the face interchange algorithm can be adjusted flexibly, live real-time face interchange is achieved with low delay and a good visual result, system performance is preserved, and the live broadcast gains interest and entertainment value.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings needed in the description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating steps of a method for processing data in live broadcast according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating steps of another method for processing data in a live broadcast according to an embodiment of the present invention;
FIG. 3a is a diagram illustrating an example of facial feature points according to an embodiment of the present invention;
FIG. 3b is a diagram illustrating an example of an original facial picture according to an embodiment of the present invention;
fig. 3c is a schematic diagram of an example of processing before face changing in live broadcast according to an embodiment of the present invention;
fig. 3d is a schematic diagram of an example of a post-processing after face changing in live broadcast according to an embodiment of the present invention;
fig. 3e is a schematic diagram of an example of an overall process of changing faces in live broadcast according to an embodiment of the present invention;
fig. 4 is a flowchart illustrating steps of another method for processing data in a live broadcast according to an embodiment of the present invention;
FIG. 5a is a diagram illustrating an example of model training provided by an embodiment of the present invention;
FIG. 5b is a diagram illustrating an exemplary application of a model provided by an embodiment of the invention;
fig. 6 is a flowchart illustrating steps of another method for processing data in a live broadcast according to an embodiment of the present invention;
FIG. 7 is a diagram of another example of a model application provided by an embodiment of the invention;
fig. 8 is a schematic structural diagram of an apparatus for processing data in live broadcast according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In order to better understand the method for processing data in live broadcast of the present invention, the live broadcast architecture used in the embodiment of the present invention is described first. The live network architecture can include a server and a plurality of terminals. The server, which may be called a background server, a component server, and the like, provides the background service of the live webcast and can comprise a single server, a server cluster, or a cloud platform. A terminal may be any intelligent terminal with a live webcast function, for example a computer, a smartphone, a tablet computer, a PDA (Personal Digital Assistant), a multimedia player, or a wearable device.
Referring to fig. 1, a flowchart of the steps of a method of data processing in live broadcast provided by an embodiment of the invention is shown. The steps illustrated may be performed in a computer system, for example as a set of computer-executable instructions, and although a logical order is shown in the flowchart, in some cases the steps may be performed in an order different from the one shown here. The method specifically comprises the following steps:
step 101, in a process of performing live connection between a first user and a second user, determining first attribute information for the first user and second attribute information for the second user;
as an example, the first user and the second user may be anchor users in a live connection; the first attribute information may be identity information for the first user, and the second attribute information identity information for the second user. For example, for anchor A and anchor B in a live connection, the anchor ID of anchor A and the anchor ID of anchor B may be obtained.
In the process of live broadcast connection, first attribute information for a first user and second attribute information for a second user can be acquired for the first user and the second user of the current live broadcast connection, so that the user identity can be further determined through the attribute information.
For example, the anchor IDs of the two anchors, anchor A and anchor B, of the current live connection may be obtained, and identity information for the two anchors may then be derived from the anchor IDs; the identity information may indicate whether an anchor user is a well-known anchor or not.
The execution subject for determining the first attribute information and the second attribute information may be a server or a client.
In an example, the first user and the second user may be anchor users of the same live room.
In yet another example, the first user and the second user may be anchor users of different live rooms; in this case the live connection may be a cross-room mic-linking (co-streaming) mode, and the anchor users of the different live rooms may compete against each other in a PK battle.
For example, consider a scene in which two anchors link mics for a PK battle during a live broadcast. In this scene a viewer can send a virtual gift that triggers the face exchange function for the two PK anchors: the face of anchor A is replaced with the face of anchor B while anchor A's facial pose is kept, and at the same time the face of anchor B is replaced with the face of anchor A while anchor B's facial pose is kept. The face exchange effect for the two anchors thus adds entertainment value and viewer interactivity to the live broadcast.
Step 102, determining a target face interchange mode according to the first attribute information and the second attribute information;
the target face exchange mode may be a mode for performing face exchange processing for the first user and the second user, and the target face exchange mode may be in a corresponding relationship with the first attribute information and the second attribute information, for example, a face exchange mode corresponding to identity information of two anchor in a live broadcast line may be preset for identity information of the two anchor.
After the first attribute information and the second attribute information are determined, the face interchange pattern corresponding to the first attribute information and the second attribute information may be used as a target face interchange pattern to perform face interchange processing for the first user and the second user.
In practical application, the user ID of the first user and the user ID of the second user may be obtained, and whether each of the two anchors is a well-known anchor may then be determined from the user IDs, so that face interchange processing can be performed in the face interchange mode corresponding to the identities (well-known or not) of the anchors.
In an example, face interchange mode A may be preset for the case where both anchors of a live connection are well-known anchors; face interchange mode B may be preset for the case where neither anchor of a live connection is well-known, and face interchange mode B may also be preset for the case where one anchor is well-known and the other is not.
In an embodiment of the present invention, step 102 may include the following sub-steps:
substep 11, acquiring a preset attribute information set, and judging whether the first attribute information and the second attribute information are both matched with the attribute information set;
as an example, the set of attribute information may be a set of attribute information for specified users. For example, a whitelist of well-known anchors may be preset, containing the anchor IDs of the well-known anchors, and whether an anchor is in the whitelist can then be determined by checking the anchor ID.
After the first attribute information and the second attribute information are determined, a preset attribute information set may be obtained, where the attribute information set may be a set for attribute information of a specified user, and it may be determined whether both of the first attribute information and the second attribute information exist in the attribute information set, and it may be further determined whether both of the first attribute information and the second attribute information match the attribute information set.
Substep 12, if yes, determining the preset first face interchange mode as the target face interchange mode;
as an example, the first face exchange mode may be an autoencoder mode; for example, for the case where both anchors of a live connection are well-known anchors, the corresponding face exchange mode may be preset to the autoencoder mode.
In a specific implementation, when both the first attribute information and the second attribute information exist in the attribute information set, it may be determined that the first attribute information and the second attribute information match the attribute information set, and the preset first face exchange pattern may be taken as the target face exchange pattern.
For example, when it is recognized that the anchor ID of anchor A and the anchor ID of anchor B both exist in the whitelist of well-known anchors in the live connection, it may be determined that anchor A and anchor B of the live connection are to undergo face exchange processing in the autoencoder mode; that is, anchor IDs in the whitelist trigger the autoencoder mode for face exchange processing.
In an example, the autoencoder mode may perform face interchange processing with an autoencoder face-changing algorithm. The algorithm collects face pictures of well-known anchors in advance for model training; training an autoencoder on data collected in advance can provide a face-changing algorithm with a better visual result for premium, well-known anchors.
And substep 13, otherwise, determining the preset second face interchange pattern as the target face interchange pattern.
As an example, the second face interchange mode is a three-dimensional reconstruction mode. For example, for the case where neither anchor of the live connection is well-known, the corresponding face interchange mode is preset to the three-dimensional reconstruction mode; likewise for the case where one anchor is well-known and the other is not.
In a specific implementation, when the first attribute information and/or the second attribute information does not exist in the attribute information set, it may be determined that the first attribute information and the second attribute information do not match the attribute information set, and the preset second face interchange pattern may be taken as the target face interchange pattern.
For example, when it is recognized that neither the anchor ID of anchor A nor the anchor ID of anchor B exists in the whitelist of well-known anchors in a live connection, it may be determined that anchor A and anchor B are to undergo face exchange processing in the three-dimensional reconstruction mode; the same holds when only the anchor ID of anchor A or only the anchor ID of anchor B is missing from the whitelist. That is, any anchor ID outside the whitelist triggers the three-dimensional reconstruction mode for face exchange processing.
In an example, the three-dimensional reconstruction mode may use a three-dimensional face reconstruction interchange algorithm: it reconstructs a three-dimensional model of an anchor's face from the anchor's facial feature points and renders the picture texture of the target face onto the three-dimensional model to generate the interchanged target face. Because it needs no per-anchor training, the three-dimensional face interchange algorithm provides a somewhat less polished but fully general result for ordinary anchors.
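As an illustrative, non-limiting sketch of this whitelist-based selection (sub-steps 11 to 13), the following Python fragment shows the decision in code form. All names (the whitelist contents, the enum, the function) are hypothetical assumptions for illustration, not identifiers from the disclosure:

```python
# Minimal sketch of the whitelist-based mode selection; names are assumed.
from enum import Enum

class FaceSwapMode(Enum):
    AUTOENCODER = 1      # first face interchange mode (pre-trained, best quality)
    RECONSTRUCT_3D = 2   # second face interchange mode (general, no training)

# Preset attribute information set: anchor IDs of well-known anchors.
WELL_KNOWN_WHITELIST = {"anchor_id_001", "anchor_id_002"}

def select_face_swap_mode(first_anchor_id: str, second_anchor_id: str) -> FaceSwapMode:
    """Return the target face interchange mode for two linked anchors."""
    if first_anchor_id in WELL_KNOWN_WHITELIST and second_anchor_id in WELL_KNOWN_WHITELIST:
        # Both attribute information items match the attribute information set.
        return FaceSwapMode.AUTOENCODER
    # Otherwise fall back to three-dimensional reconstruction, which requires
    # no per-anchor training and so generalizes to any pair of anchors.
    return FaceSwapMode.RECONSTRUCT_3D
```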
Step 103, according to the target face interchange mode, performing face interchange processing on the first user and the second user in a video picture of the live connection.
As an example, during a live connection, a video frame of the live connection may be presented at the viewer client, where the video frame may include both a video frame of the first user and a video frame of the second user.
After the target face exchange mode is determined, face exchange processing can be performed on the first user and the second user in a video picture of the live connection by adopting the face exchange algorithm corresponding to the target face exchange mode.
For example, during the live broadcast, face exchange processing can be performed on the two anchors in the video picture of the live connection, and the face exchange result can be delivered to the audience. The face exchange effect of the two anchors in their PK battle is thus displayed to the viewers, which adds interest and entertainment value to the live scene.
In an example, the trigger for the face exchange processing between the first user and the second user may be a viewer giving a virtual gift to the anchors (i.e., the first user and the second user) during the live broadcast, which then triggers the face exchange between the anchors.
In another example, face interchange in a live scene puts high demands on the real-time performance of the face-changing algorithm and on the realism of the fusion result, and because the anchors in a live connection differ from session to session, the arbitrariness of the live PK scene also demands a highly extensible face-changing algorithm. Switching the face-changing algorithm according to which anchors are connected combines the autoencoder face-changing algorithm, which gives the better visual result, with the three-dimensional face reconstruction interchange algorithm, which gives the better extensibility, letting the two algorithms complement each other. Real-time face interchange can therefore be applied to any live scene while preserving system performance and improving adaptability to various service scenarios.
In the embodiment of the invention, first attribute information for the first user and second attribute information for the second user are determined while the first user and the second user are in a live connection; a target face interchange mode is then determined according to the first attribute information and the second attribute information; and face interchange processing is performed on the first user and the second user in a video picture of the live connection according to the target face interchange mode. Face swapping is thereby realized in any live scene: because the face interchange mode is determined from the users' attribute information, the face interchange algorithm can be adjusted flexibly, live real-time face interchange is achieved with low delay and a good visual result, system performance is preserved, and the live broadcast gains interest and entertainment value.
Referring to fig. 2, a flowchart illustrating steps of another method for processing data in live broadcast according to an embodiment of the present invention is shown, and specifically includes the following steps:
step 201, in a process of performing live connection between a first user and a second user, determining first attribute information for the first user and second attribute information for the second user;
in the process of live broadcast connection, first attribute information for a first user and second attribute information for a second user can be acquired for the first user and the second user of the current live broadcast connection, so that the user identity can be further determined through the attribute information.
In an example, for a live connection scene of anchor A and anchor B, the co-streamed video of anchor A and anchor B linked for a PK can be decoded through the server's transcoding service. The co-streamed video may be in left-right split-screen form: for example, the left side may be anchor A's live picture and the right side anchor B's live picture.
After the live pictures are obtained, the live picture of anchor A and the live picture of anchor B may be input simultaneously to the face-change pre-processing interface, so that the original face pictures and facial feature point information for anchor A and anchor B are obtained through the pre-processing.
For example, the face-change pre-processing may include face detection and face segmentation. Taking anchor A as an example, a face detection module may be used to obtain the position of anchor A's face in the live picture, the 130 feature points of anchor A's face, and so on; the original face picture of anchor A may then be cropped out according to the detected face position.
Step 202, determining a target face interchange mode according to the first attribute information and the second attribute information;
after the first attribute information and the second attribute information are determined, the face interchange pattern corresponding to the first attribute information and the second attribute information may be used as a target face interchange pattern to perform face interchange processing for the first user and the second user.
Step 203, acquiring first original face data for the first user and second original face data for the second user from a video picture of the live broadcast connection;
as an example, the first original face data may be the original face picture and facial feature point information of the first user, and the second original face data may be the original face picture and facial feature point information of the second user. For example, the live picture of anchor A and the live picture of anchor B may each be obtained from the video picture of the live connection, and from them the original face picture and facial feature point information of anchor A and of anchor B.
In the face interchange process, first original face data for a first user and second original face data for a second user may be respectively acquired from a live-connected video picture, so that face interchange processing may be performed for the first user and the second user using the first original face data and the second original face data.
For example, the live picture of anchor A obtained from the live video may be input to a face detection module, which, using a deep-learning-based face detection method, can detect the 130 facial feature points, the 68 facial feature points (e.g., the annotated face points in fig. 3a), and the width and height of anchor A's face.
In an example, the 130 refined facial feature points may include the 68 basic facial feature points. The 130 points represent more facial information, including details such as the eyeballs, and may be used for three-dimensional face reconstruction. The 68 points may be used to segment the face picture, for example to crop from anchor A's live picture an original face picture of anchor A that is 128 × 128 pixels (as shown in fig. 3b), and they may also be used to fuse the face-change result during post-processing.
Specifically, to obtain anchor A's original face picture, point cloud matching can be performed on the 68 facial feature points of anchor A according to a preset rule to obtain a 2 × 3 face transformation matrix; affine transformation (translation, rotation, and cropping) can then be applied to anchor A's live picture using this matrix, yielding the 128 × 128 original face picture of anchor A.
In yet another example, again taking anchor A as an example, the obtained 130 facial feature points, 68 facial feature points, face transformation matrix, and original face picture of anchor A can all be stored in structured data, so that they can be reused in the subsequent face interchange processing.
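By way of illustration, a minimal Python sketch of this pre-processing is given below. It assumes OpenCV's estimateAffinePartial2D as the point-matching step and a canonical 68-point template as the "preset rule"; the disclosure does not name a specific routine, so all names here are hypothetical:

```python
# Hypothetical pre-processing sketch (cf. S301-S305): estimate the 2x3 face
# transformation matrix from the 68 feature points and crop a 128x128 face.
from dataclasses import dataclass
import numpy as np
import cv2

FACE_SIZE = 128  # side length in pixels of the cropped original face picture

@dataclass
class PreprocResult:            # the "structured data" saved for later stages
    landmarks130: np.ndarray    # refined points, used for 3D reconstruction
    landmarks68: np.ndarray     # basic points, used for cropping and fusion
    matrix: np.ndarray          # 2x3 face transformation matrix
    face: np.ndarray            # 128x128 original face picture

def align_face(frame: np.ndarray, landmarks68: np.ndarray,
               template68: np.ndarray):
    """Match the detected points to a canonical layout, then translate,
    rotate, and crop the live picture with the resulting 2x3 matrix."""
    matrix, _ = cv2.estimateAffinePartial2D(
        landmarks68.astype(np.float32), template68.astype(np.float32))
    face = cv2.warpAffine(frame, matrix, (FACE_SIZE, FACE_SIZE))
    return face, matrix  # both are stored for the post-processing stage
```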
Step 204, obtaining a target face interchange model corresponding to the target face interchange mode;
as an example, the target face exchange model may be a model used for the face exchange processing, e.g. an autoencoder model or a three-dimensional reconstructed face model.
After the target face exchange pattern is determined, a face exchange model corresponding to a preset target face exchange pattern may be obtained as a target face exchange model for further face exchange processing.
For example, when the autoencoder mode is adopted, anchor A and anchor B may undergo face interchange processing through the autoencoder model; when the three-dimensional reconstruction mode is adopted, through the three-dimensional reconstructed face model.
Step 205, generating first exchanged face data corresponding to the first original face data and second exchanged face data corresponding to the second original face data by using the target face exchange model;
as an example, the first interchanged face data may be the face data of the first user after face interchange processing, and the second interchanged face data the face data of the second user after face interchange processing.
After determining the target face interchange model, an autoencoder model or a three-dimensional reconstructed face model may be employed to generate the first interchanged face data corresponding to the first original face data of the first user and the second interchanged face data corresponding to the second original face data of the second user.
For example, it is judged whether the personal account IDs of anchor A and anchor B exist in a whitelist, which can be set on the server in advance. If the anchor IDs of both anchor A and anchor B exist in the whitelist, the autoencoder face-changing algorithm can be adopted and the face-exchanged pictures of anchor A and anchor B are each obtained through the autoencoder model; otherwise the three-dimensional reconstruction face-changing algorithm can be adopted and the face-exchanged pictures of anchor A and anchor B are each obtained through the three-dimensional reconstructed face model.
In an example, for the autoencoder face-changing algorithm, the 128 × 128 original face picture of anchor A and the 128 × 128 original face picture of anchor B may be used as input data of the autoencoder model, which then outputs the generated face-changed picture of anchor A and face-changed picture of anchor B, i.e. the interchanged face data after face exchange processing for anchor A and anchor B.
In yet another example, for the three-dimensional reconstruction face-changing algorithm, the 128 × 128 original face picture and the 130 facial feature points of anchor A, together with those of anchor B, may be used as input data of the three-dimensional reconstructed face model, which then outputs the generated face-changed picture of anchor A and face-changed picture of anchor B, i.e. the interchanged face data after face exchange processing for anchor A and anchor B.
Step 206, performing a replacement process on the first original face data according to the first interchanged face data, and performing a replacement process on the second original face data according to the second interchanged face data.
After the first and second interchanged face data are generated, the first original face data may be subjected to replacement processing using the first interchanged face data, and the second original face data may be subjected to replacement processing using the second interchanged face data, so as to obtain a face interchange effect for the first and second users.
For example, the face exchange results may be pasted back into the live pictures of anchor A and anchor B. Using the 130 facial feature points of anchor A and of anchor B, the face-change post-processing pastes the exchanged face picture of anchor A and the exchanged face picture of anchor B back into the pre-exchange live pictures at the feature point positions, which yields anchor A's live picture carrying anchor B's face and anchor B's live picture carrying anchor A's face. The two post-exchange live pictures can then be composed into a left-right split-screen picture by the server's encoding function and sent to the viewers, showing the face exchange effect of the two mic-linked PK anchors.
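The paste-back step can be pictured as inverting the alignment transform from pre-processing. The following is a minimal sketch assuming OpenCV; the blending weights come from the (feathered) face mask described below, and the function name is an assumption:

```python
# Hypothetical paste-back sketch: return a swapped 128x128 face to its
# position in the original live picture using the stored 2x3 matrix.
import numpy as np
import cv2

def paste_back(frame: np.ndarray, swapped_face: np.ndarray,
               mask: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    h, w = frame.shape[:2]
    inv = cv2.invertAffineTransform(matrix)          # crop space -> frame space
    warped_face = cv2.warpAffine(swapped_face, inv, (w, h))
    warped_mask = cv2.warpAffine(mask, inv, (w, h))[..., None] / 255.0
    # Alpha-blend only inside the (feathered) face mask region.
    out = warped_face * warped_mask + frame * (1.0 - warped_mask)
    return out.astype(np.uint8)
```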
In an embodiment of the present invention, step 206 may include the following sub-steps:
generating first face mask data corresponding to the first original face data, and obtaining first target face data according to the first face mask data and the first interchange face data; replacing the first original face data with the first target face data;
as an example, the first facial mask data may be facial data for a first user for post-faceting.
After the first face interchange effect is obtained, first face mask data corresponding to the first original face data may be generated through face interchange post-processing, then the first face mask data and the first face interchange data may be adopted to obtain first target face data, and then the first original face data may be replaced by the first target face data, so as to obtain a face interchange effect for the first user.
For example, taking anchor A as an example, the 68 facial feature points of anchor A and the face transformation matrix obtained in the face-change pre-processing may be used to generate a face mask for anchor A (i.e., the first face mask data); that is, the face mask may be obtained by a mapping operation over the 68 facial feature points combined with the face transformation matrix.
In an example, to make the edge blending natural in the post-processing that fuses the face-change result, edge erosion/blurring and feathering may be applied to the generated face mask.
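A minimal sketch of this mask generation, assuming the 68 points are already in the 128 × 128 crop's coordinates and using OpenCV's hull-fill, erode, and blur primitives (the kernel sizes are illustrative assumptions):

```python
# Hypothetical face-mask sketch: fill the convex hull of the 68 feature
# points, then erode and feather the edge so the later blend looks natural.
import numpy as np
import cv2

def make_face_mask(landmarks68: np.ndarray, size: int = 128) -> np.ndarray:
    mask = np.zeros((size, size), dtype=np.uint8)
    hull = cv2.convexHull(landmarks68.astype(np.int32))
    cv2.fillConvexPoly(mask, hull, 255)                # solid face region
    mask = cv2.erode(mask, np.ones((5, 5), np.uint8))  # pull the edge inward
    return cv2.GaussianBlur(mask, (15, 15), 0)         # feather the boundary
```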
In an embodiment of the present invention, the obtaining of first target face data according to the first face mask data and the first face exchange data includes:
and obtaining first target face data according to the first face mask data, the first interchange face data and the skin color data of the first user.
After the first face mask data is generated, the first face mask data, the first face exchange data, and the skin color data of the first user may be used, so that the first target face data may be obtained.
For example, in the face-exchange post-processing for anchor A, skin color migration may be applied to the face-changed picture obtained from anchor A's face exchange processing (i.e., the first interchanged face data). Using the Reinhard skin color migration algorithm together with the generated anchor A face mask (i.e., the first face mask data), anchor A's skin color (i.e., the skin color data of the first user) can be migrated into the anchor A face-changed picture that now carries anchor B's face, yielding the picture after skin color migration (i.e., the first target face data). The original face picture of anchor A (i.e., the first original face data) can then be replaced with this picture, making the fused face-change result more natural.
In another example, the skin-color-migrated face-changed picture of anchor A may be fused onto the face in anchor A's live picture: the migrated face-changed picture, the original face picture of anchor A (i.e., the first original face data), and the anchor A face mask are used to fuse the face pixels, and the fused face is then warped back into anchor A's original live picture for replacement using the face transformation matrix obtained in pre-processing.
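As an illustration of the Reinhard-style migration, the sketch below matches the per-channel Lab mean and standard deviation of the swapped face to those of the original face inside the mask. This is a generic Reinhard color transfer under stated assumptions, not the disclosure's exact implementation:

```python
# Hypothetical Reinhard-style skin color migration: shift the swapped face's
# Lab statistics (inside the face mask) toward the original anchor's skin tone.
import numpy as np
import cv2

def reinhard_transfer(swapped: np.ndarray, original: np.ndarray,
                      mask: np.ndarray) -> np.ndarray:
    src = cv2.cvtColor(swapped, cv2.COLOR_BGR2LAB).astype(np.float32)
    ref = cv2.cvtColor(original, cv2.COLOR_BGR2LAB).astype(np.float32)
    m = mask > 0                           # statistics over skin pixels only
    for c in range(3):                     # match mean/std per Lab channel
        s_mean, s_std = src[..., c][m].mean(), src[..., c][m].std() + 1e-6
        r_mean, r_std = ref[..., c][m].mean(), ref[..., c][m].std() + 1e-6
        src[..., c] = (src[..., c] - s_mean) * (r_std / s_std) + r_mean
    out = cv2.cvtColor(np.clip(src, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)
    # Keep migrated colors only inside the mask; elsewhere keep the input.
    m3 = mask[..., None] / 255.0
    return (out * m3 + swapped * (1.0 - m3)).astype(np.uint8)
```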
Step 206 may also include the following sub-steps:
generating second face mask data corresponding to the second original face data, and obtaining second target face data according to the second face mask data and the second interchange face data; replacing the second original face data with the second target face data.
As an example, the second facial mask data may be facial data for a second user for post-facechange processing.
After the second face interchange effect is obtained, second face mask data corresponding to the second original face data may be generated through face-exchange post-processing; the second face mask data and the second interchanged face data may then be used to obtain second target face data, and the second target face data may replace the second original face data, giving the face interchange effect for the second user.
In an embodiment of the present invention, the deriving second target face data from the second face mask data and the second interchange face data includes:
and obtaining second target face data according to the second face mask data, the second interchange face data and the second user skin color data.
After the second face mask data is generated, the second face mask data, the second interchange face data, and the second user skin color data may be employed, and then second target face data may be obtained.
For example, in the face-exchange post-processing for anchor B, skin color migration may be applied to the face-changed picture obtained from anchor B's face exchange processing (i.e., the second interchanged face data). Using the Reinhard skin color migration algorithm together with the generated anchor B face mask (i.e., the second face mask data), anchor B's skin color (i.e., the skin color data of the second user) can be migrated into the anchor B face-changed picture that now carries anchor A's face, yielding the picture after skin color migration (i.e., the second target face data). The original face picture of anchor B (i.e., the second original face data) can then be replaced with this picture, making the fused face-change result more natural.
In one example, the pre-processing and post-processing methods operate only on the face region for skin color migration, face mask generation, face-change result fusion, and so on, which speeds up the image processing.
In another example, the pre-processing and post-processing may serve as generic stages of the face exchange pipeline: accessing a new face-changing algorithm requires no change to them, which realizes switching the face-changing algorithm according to the service scene.
Deploying the face exchange processing at the server side allows the whole pipeline from video decoding through face exchange processing to video encoding to be completed there, realizing face exchange between the two anchors during the live broadcast and delivering the result to the viewers. Because it runs in real time on the live server, the scheme adapts to any live scene and adds interest and entertainment value to the broadcast.
In order to enable those skilled in the art to better understand the above steps, the embodiment of the present invention is described below by way of example with reference to figs. 3c to 3e, but it should be understood that the embodiment of the present invention is not limited thereto.
For the face-change pre-processing (taking anchor A as an example), the steps shown in fig. 3c may be performed:
S301: the live picture of anchor A can be obtained;
S302: a deep-learning face detection module is used for detection, obtaining the 130 facial feature points, the 68 facial feature points, and the width and height of anchor A's face;
S303: point cloud matching can be performed on the 68 facial feature points of anchor A to obtain a 2 × 3 face transformation matrix;
S304: affine transformation can be applied to anchor A's live picture using the face transformation matrix to obtain the 128 × 128 face picture (original face picture) of anchor A;
S305: the pre-processing information obtained by the face-change pre-processing can be stored in the structure.
For the face-change post-processing (taking anchor A as an example), the steps shown in fig. 3d may be performed:
S306: the 68 facial feature points of anchor A and the anchor A face transformation matrix obtained in pre-processing can be used to generate anchor A's face mask;
S307: anchor A's skin color can be migrated into the anchor A face-changed picture that has been replaced with anchor B's face (the generated anchor B face);
S308: the skin-color-migrated face-changed picture of anchor A can be fused onto the face of anchor A's live picture (face-change result fusion);
S309: the fused face-change result can be pasted back into anchor A's original live picture.
For the whole face-changing process (taking anchor A and anchor B exchanging faces as an example), the steps shown in fig. 3e may be performed (a consolidated sketch follows this list):
S310: while anchor A and anchor B are in a live connection, the mic-linked video streams of anchor A and anchor B can be decoded;
S311: decoding the video streams can yield separate live pictures of anchor A and of anchor B;
S312: the face-change pre-processing interface can be used to obtain the face picture (original face picture) and the key points (facial feature points);
S313: the face picture and key points of anchor A and the face picture and key points of anchor B are obtained respectively;
S314: it can be judged whether the anchor ID of anchor A and the anchor ID of anchor B are both in the whitelist of well-known anchors;
S315: when both are in the whitelist, face exchange processing can be performed with the autoencoder face-changing algorithm; when not, with the three-dimensional reconstruction face interchange algorithm;
S316: the face-change results of anchor A and anchor B can be fused through face-change post-processing;
S317: face-change post-processing yields the post-exchange live pictures of anchor A and of anchor B;
S318: the post-exchange live pictures of anchor A and anchor B are encoded to obtain the face-exchanged mic-link video stream.
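As a consolidated view of S310-S318, here is a hypothetical per-frame orchestration sketch. The pre-/post-processing helpers and the two swap back-ends are stubs standing in for the components described above; every function name is an assumption for illustration:

```python
# Hypothetical server-side orchestration of one decoded PK frame pair
# (cf. S310-S318). All functions are illustrative stubs/placeholders.
from typing import Tuple
import numpy as np

WELL_KNOWN_WHITELIST = {"anchor_id_001", "anchor_id_002"}  # assumed contents

def preprocess(frame: np.ndarray):
    """Detect and align; return (face_crop, landmarks68, landmarks130, matrix)."""
    raise NotImplementedError  # face detection + alignment (S312-S313)

def autoencoder_swap(face_a: np.ndarray, face_b: np.ndarray):
    raise NotImplementedError  # shared-encoder model path (whitelisted anchors)

def reconstruct3d_swap(face: np.ndarray, landmarks130: np.ndarray,
                       target_face: np.ndarray):
    raise NotImplementedError  # 3D reconstruction path (general anchors)

def postprocess(frame: np.ndarray, swapped_face: np.ndarray, pre) -> np.ndarray:
    raise NotImplementedError  # mask, skin color migration, fusion, paste-back

def process_pk_frames(frame_a, frame_b, id_a, id_b) -> Tuple[np.ndarray, np.ndarray]:
    pre_a, pre_b = preprocess(frame_a), preprocess(frame_b)
    if id_a in WELL_KNOWN_WHITELIST and id_b in WELL_KNOWN_WHITELIST:   # S314
        swapped_a, swapped_b = autoencoder_swap(pre_a[0], pre_b[0])     # S315
    else:
        swapped_a = reconstruct3d_swap(pre_a[0], pre_a[2], pre_b[0])
        swapped_b = reconstruct3d_swap(pre_b[0], pre_b[2], pre_a[0])
    out_a = postprocess(frame_a, swapped_a, pre_a)                      # S316-S317
    out_b = postprocess(frame_b, swapped_b, pre_b)
    return out_a, out_b  # composed into split-screen and re-encoded (S318)
```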
Referring to fig. 4, a flowchart illustrating steps of another method for processing data in live broadcast according to an embodiment of the present invention is shown, and specifically includes the following steps:
step 401, in a process of performing a live connection between a first user and a second user, determining first attribute information for the first user and second attribute information for the second user;
in the process of live broadcast connection, first attribute information for a first user and second attribute information for a second user can be acquired for the first user and the second user of the current live broadcast connection, so that the user identity can be further determined through the attribute information.
Step 402, determining a target face interchange mode according to the first attribute information and the second attribute information;
after the first attribute information and the second attribute information are determined, the face interchange pattern corresponding to the first attribute information and the second attribute information may be used as a target face interchange pattern to perform face interchange processing for the first user and the second user.
Step 403, when the target face interchange pattern is the first face interchange pattern, acquiring a preset first face interchange model as a target face interchange model; wherein the first face interchange model comprises a first decoding model for the first user and a second decoding model for the second user;
as an example, the first face exchange model may include an encoding model, a first decoding model for the first user, and a second decoding model for the second user; e.g. the autoencoder model may have one Encoder module (i.e., the encoding model) and two Decoder modules (i.e., the first decoding model and the second decoding model).
When face interchange is performed in the first face interchange mode, a preset first face interchange model may be obtained, and the first face interchange model may include a first decoding model for a first user and a second decoding model for a second user, so as to further perform face interchange processing for the first user and the second user.
For example, the original face picture of anchor A and the original face picture of anchor B obtained after face-change pre-processing may be input into the common Encoder module of a trained autoencoder model.
In an example, the autoencoder model may be trained on face data of well-known anchors collected in advance; since training happens before the face exchange processing is deployed, the trained model can be deployed directly to the server to generate face-change results.
In yet another example, the autoencoder algorithm may comprise one Encoder module and two Decoder modules: the Encoder module is shared, the Decoder_A module is trained with the face data set of anchor A, and the Decoder_B module is trained with the face data set of anchor B.
Step 404, generating first interchange face data corresponding to the first original face data by using the second decoding model;
in practical applications, the data output by the common Encoder module of the autoencoder model may be passed through the second decoding model of the second user to generate the first interchanged face data corresponding to the first original face data.
For example, anchor A's result after calculation by the Encoder module may be input into the Decoder_B module, so as to generate anchor B's face: a face result that keeps the pose and expression of anchor A while carrying the facial features of anchor B.
Step 405, generating second interchange face data corresponding to the second original face data by using the first decoding model;
in practical applications, the data output by the common Encoder module of the autoencoder model may be passed through the first decoding model of the first user to generate the second interchanged face data corresponding to the second original face data.
For example, anchor B's output from the Encoder module may be input into the Decoder_A module, so as to generate anchor A's face; the result is a face whose pose and expression match anchor B while its facial features match anchor A.
In one example, the autoencoder face-swapping algorithm produces its results through the two different Decoder modules; the core of the face interchange processing is to feed each user's encoded data into the counterpart's Decoder module to generate the swapped face.
In another example, the structure of the autoencoder model can be modified to reduce its size, which speeds up face generation and optimizes the running speed of the autoencoder face-swapping algorithm. When the slimmed model is deployed on a server, face-swapping results can be generated quickly enough to meet the real-time requirements of a live broadcast scene.
Step 406, performing a replacement process on the first original face data according to the first interchanged face data, and performing a replacement process on the second original face data according to the second interchanged face data.
After the first and second interchanged face data are generated, the first original face data may be subjected to replacement processing using the first interchanged face data, and the second original face data may be subjected to replacement processing using the second interchanged face data, so as to obtain a face interchange effect for the first and second users.
In order to enable those skilled in the art to better understand the above steps, the following description is provided for the exemplary embodiment of the present invention with reference to fig. 5a-5b, but it should be understood that the embodiment of the present invention is not limited thereto.
The autoencoder model training process (taking anchor A and anchor B as an example) can be carried out with the steps shown in fig. 5a:
A face data set for anchor A and a face data set for anchor B are collected in advance so that the autoencoder model can be trained; the trained model comprises a shared Encoder module, a Decoder_A module for anchor A, and a Decoder_B module for anchor B.
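A hedged sketch of that training scheme, reusing the Encoder/Decoder modules sketched earlier: each anchor's faces are reconstructed through the shared encoder and that anchor's own decoder. The data loaders loader_a and loader_b, the learning rate, and the L1 reconstruction loss are assumptions for illustration.

```python
import torch.nn.functional as F
import torch.optim as optim

# loader_a / loader_b: assumed DataLoaders over anchor A's and anchor B's face sets.
params = (list(encoder.parameters())
          + list(decoder_a.parameters())
          + list(decoder_b.parameters()))
opt = optim.Adam(params, lr=5e-5)

for faces_a, faces_b in zip(loader_a, loader_b):
    recon_a = decoder_a(encoder(faces_a))  # anchor A reconstructed by Decoder_A
    recon_b = decoder_b(encoder(faces_b))  # anchor B reconstructed by Decoder_B
    loss = F.l1_loss(recon_a, faces_a) + F.l1_loss(recon_b, faces_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
```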
The autoencoder model application process (taking anchor A and anchor B as an example) can be carried out with the steps shown in fig. 5b:
S501: the anchor A original face picture and the anchor B original face picture produced by the face-swap pre-processing may be obtained;
S502: the anchor A original face picture and the anchor B original face picture may be input into the shared Encoder module for processing;
S503: the Encoder module's output for anchor A may be input into the Decoder_B module for further processing, and the Encoder module's output for anchor B may be input into the Decoder_A module for further processing;
S504: the Decoder_B module may output the face-swapping result for anchor A (generating a face to be replaced with anchor B's), and the Decoder_A module may output the face-swapping result for anchor B (generating a face to be replaced with anchor A's).
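In code form, S501-S504 reduce to a few lines at inference time (a sketch reusing the modules above; face_a and face_b are assumed to be pre-processed face tensors):

```python
import torch

with torch.no_grad():
    z_a = encoder(face_a)       # S502: encode anchor A's pre-processed face crop
    z_b = encoder(face_b)       #       ... and anchor B's
    swapped_a = decoder_b(z_a)  # S503/S504: A's pose rendered with B's facial features
    swapped_b = decoder_a(z_b)  # S503/S504: B's pose rendered with A's facial features
```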
Referring to fig. 6, a flowchart illustrating steps of another method for processing data in live broadcast according to an embodiment of the present invention is shown, and specifically includes the following steps:
Step 601, in the process of carrying out a live broadcast connection between a first user and a second user, determining first attribute information for the first user and second attribute information for the second user;
In the process of the live broadcast connection, the first attribute information for the first user and the second attribute information for the second user can be acquired for the two users currently connected, so that each user's identity can be further determined from the attribute information.
Step 602, determining a target face interchange mode according to the first attribute information and the second attribute information;
After the first attribute information and the second attribute information are determined, the face interchange mode corresponding to them may be taken as the target face interchange mode for performing face interchange processing on the first user and the second user.
Step 603, when the target face interchange mode is the second face interchange mode, constructing a second face interchange model for the first user by using the first original face data;
As an example, the second face interchange model may be a three-dimensional reconstructed face model for the first user; for instance, a three-dimensional reconstructed face model for anchor A may be constructed.
When face interchange is performed in the second face interchange mode, the first original face data can be used to construct a second face interchange model for the first user, so as to further perform face interchange processing for the first user.
For example, the 130 facial feature points of anchor A obtained from the face-swap pre-processing can be acquired, and a three-dimensional reconstructed face model for anchor A can then be constructed in real time from these 130 feature points; the model can simulate anchor A's face shape, the positions and sizes of the facial features, the facial pose, and similar information.
Step 604, constructing a third face interchange model for the second user by using the second original face data;
As an example, the third face interchange model may be a three-dimensional reconstructed face model for the second user; for instance, a three-dimensional reconstructed face model for anchor B may be constructed.
When the second face interchange mode is adopted for face interchange, the second original face data can be used to construct a third face interchange model for the second user, so as to further perform face interchange processing for the second user.
For example, the 130 facial feature points of anchor B obtained from the face-swap pre-processing can be acquired, and a three-dimensional reconstructed face model for anchor B can then be constructed in real time from these 130 feature points; the model can simulate anchor B's face shape, the positions and sizes of the facial features, the facial pose, and similar information.
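One simplified way such a model could be fit from the 130 points is a linear least-squares fit against a morphable-model basis. This is only a sketch under assumed inputs — mean_shape and basis would come from a pre-built 3D face model indexed to the 130-point layout, and an identity camera pose is assumed — not this embodiment's exact reconstruction:

```python
import numpy as np

def fit_face_model(landmarks_2d, mean_shape, basis):
    """landmarks_2d: (130, 2); mean_shape: (130, 3); basis: (K, 130, 3)."""
    # Solve for the blend coefficients that best explain the observed x/y
    # landmark positions under the assumed identity pose.
    A = basis[:, :, :2].reshape(basis.shape[0], -1).T   # (130*2, K)
    b = (landmarks_2d - mean_shape[:, :2]).reshape(-1)  # (130*2,)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Reconstructed 3D face shape: mean shape plus the weighted basis.
    return mean_shape + np.tensordot(coeffs, basis, axes=1)
```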
Step 605, generating first interchanged face data corresponding to the first original face data by using the second face interchange model;
In practical applications, the second face interchange model for the first user may be used to generate the first interchanged face data corresponding to the first original face data; for example, anchor B's original face picture may be rendered onto the three-dimensional reconstructed face model for anchor A, so as to generate the face-swapping result for anchor A.
In an example, the three-dimensional reconstruction face-swapping algorithm renders anchor B's original face picture onto the three-dimensional reconstructed face model for anchor A, and anchor A's original face picture onto the three-dimensional reconstructed face model for anchor B, thereby obtaining face-swapping results for anchor A and anchor B respectively; the core of the face interchange processing is to render the counterpart's face picture onto one's own three-dimensional reconstructed face model.
Step 606, generating second interchanged face data corresponding to the second original face data by using the third face interchange model;
In practical applications, the third face interchange model for the second user may be used to generate the second interchanged face data corresponding to the second original face data; for example, anchor A's original face picture may be rendered onto the three-dimensional reconstructed face model for anchor B, so as to generate the face-swapping result for anchor B.
In an example, the three-dimensional reconstruction face-swapping algorithm can generate a fine result only for the face-swap region and a coarse result elsewhere, which increases generation speed and optimizes the running speed of the algorithm. Because this mode only needs the three-dimensional reconstructed face model and face-picture rendering, no face pictures have to be collected in advance for an unknown anchor, so the method can be applied to any live broadcast scene.
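For illustration only, the rendering step can be approximated in 2D by warping the counterpart's face picture onto the region spanned by one's own landmarks. This OpenCV sketch is a flat-warp simplification of the three-dimensional rendering described above; all inputs are assumed to come from the face-swap pre-processing:

```python
import cv2
import numpy as np

def render_counterpart_face(own_frame, own_landmarks, other_face, other_landmarks):
    """Landmark arrays are (N, 2) float32 in their respective image coordinates."""
    # Similarity transform taking the counterpart's landmarks onto our own.
    M, _ = cv2.estimateAffinePartial2D(other_landmarks, own_landmarks)
    h, w = own_frame.shape[:2]
    warped = cv2.warpAffine(other_face, M, (w, h))
    # Paste the warped face only inside our own face region.
    mask = np.zeros((h, w), dtype=np.uint8)
    cv2.fillConvexPoly(mask, cv2.convexHull(own_landmarks.astype(np.int32)), 255)
    out = own_frame.copy()
    out[mask > 0] = warped[mask > 0]
    return out
```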
Step 607, performing a replacement process on the first original face data according to the first interchanged face data, and performing a replacement process on the second original face data according to the second interchanged face data.
After the first and second interchanged face data are generated, the first original face data may be subjected to replacement processing using the first interchanged face data, and the second original face data may be subjected to replacement processing using the second interchanged face data, so as to obtain a face interchange effect for the first and second users.
In order to enable those skilled in the art to better understand the above steps, the following description is provided for the embodiment of the present invention with reference to fig. 7, but it should be understood that the embodiment of the present invention is not limited thereto.
The application process of the three-dimensional reconstructed face model (taking anchor A and anchor B as an example) can be carried out with the steps shown in fig. 7:
S701: the 130 facial feature points of anchor A and the 130 facial feature points of anchor B obtained from the face-swap pre-processing may be acquired, and a three-dimensional reconstructed face model for anchor A and one for anchor B respectively constructed in real time from these feature points;
S702: the anchor B original face picture may be rendered onto anchor A's three-dimensional reconstructed face to obtain the face-swapping result for anchor A (generating a face to be replaced with anchor B's); the anchor A original face picture may be rendered onto anchor B's three-dimensional reconstructed face to obtain the face-swapping result for anchor B (generating a face to be replaced with anchor A's).
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 8, a schematic structural diagram of a device for processing data in live broadcast according to an embodiment of the present invention is shown, and the device may specifically include the following modules:
an attribute information determining module 801, configured to determine, during a live connection between a first user and a second user, first attribute information for the first user and second attribute information for the second user;
a target face interchange mode determining module 802, configured to determine a target face interchange mode according to the first attribute information and the second attribute information;
a face interchange processing module 803, configured to perform face interchange processing on the first user and the second user in the video frame of the live broadcast connection according to the target face interchange mode.
In an embodiment of the present invention, the target face interchange mode determining module 802 includes:
the attribute information set acquisition submodule is used for acquiring a preset attribute information set and judging whether the first attribute information and the second attribute information are both matched with the attribute information set;
the first judgment submodule is used for determining, if both match, that a preset first face interchange mode is the target face interchange mode;
and the second judgment sub-module is used for determining, if they do not both match, that a preset second face interchange mode is the target face interchange mode.
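A hedged sketch of this judgment logic; the preset attribute set (for example, identifiers of well-known anchors whose face data has been collected in advance) is an assumed input:

```python
def choose_face_interchange_mode(first_attr, second_attr, preset_attribute_set):
    # The first mode (autoencoder) applies only when both users match the
    # preset set, i.e. pre-trained decoders exist for both of them.
    if first_attr in preset_attribute_set and second_attr in preset_attribute_set:
        return "first_face_interchange_mode"   # autoencoder face swap
    return "second_face_interchange_mode"      # three-dimensional reconstruction
```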
In an embodiment of the present invention, the method further includes:
an original face data acquisition module, configured to acquire, from a video frame of the live connection, first original face data for the first user and second original face data for the second user;
the face interchange processing module 803 includes:
the target face interchange model obtaining sub-module is used for obtaining a target face interchange model corresponding to the target face interchange mode;
an interchanged face data generation sub-module, configured to generate, using the target face interchange model, first interchanged face data corresponding to the first original face data and second interchanged face data corresponding to the second original face data;
and the replacement processing sub-module is used for performing replacement processing on the first original face data according to the first interchanged face data and performing replacement processing on the second original face data according to the second interchanged face data.
In an embodiment of the present invention, the target face interchange model obtaining sub-module includes:
a first face interchange model acquisition unit configured to acquire a preset first face interchange model as a target face interchange model when the target face interchange mode is the first face interchange mode; wherein the first face interchange model comprises a first decoding model for the first user and a second decoding model for the second user;
the interchange face data generation submodule includes:
a first interchanged face data generation unit, configured to generate first interchanged face data corresponding to the first original face data by using the second decoding model;
and a second interchanged face data generation unit, configured to generate second interchanged face data corresponding to the second original face data by using the first decoding model.
In an embodiment of the present invention, the target face interchange model obtaining sub-module includes:
a second face interchange model construction unit, configured to construct a second face interchange model for the first user using the first original face data when the target face interchange mode is the second face interchange mode;
a third face interchange model construction unit, configured to construct a third face interchange model for the second user using the second original face data;
the interchanged face data generation sub-module includes:
a first interchanged face data generation unit, configured to generate first interchanged face data corresponding to the first original face data by using the second face interchange model;
and a second interchanged face data generation unit, configured to generate second interchanged face data corresponding to the second original face data by using the third face interchange model.
In an embodiment of the present invention, the replacement processing sub-module includes:
a first target face data obtaining unit, configured to generate first face mask data corresponding to the first original face data, and obtain first target face data according to the first face mask data and the first interchanged face data;
a first original face data replacement unit, configured to replace the first original face data with the first target face data;
the replacement processing sub-module further includes:
a second target face data obtaining unit, configured to generate second face mask data corresponding to the second original face data, and obtain second target face data according to the second face mask data and the second interchanged face data;
a second original face data replacement unit, configured to replace the second original face data with the second target face data.
In an embodiment of the present invention, the first target face data obtaining unit includes:
a first target face data obtaining subunit, configured to obtain first target face data according to the first face mask data, the first interchanged face data, and the skin color data of the first user;
the second target face data obtaining unit includes:
a second target face data obtaining subunit, configured to obtain second target face data according to the second face mask data, the second interchanged face data, and the skin color data of the second user.
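A plausible sketch of the mask-and-skin-tone blending these units describe, using Poisson blending; the embodiment does not name this exact operator, so cv2.seamlessClone and the centroid placement are assumptions:

```python
import cv2
import numpy as np

def blend_swapped_face(original_frame, swapped_face, face_mask):
    """face_mask: uint8 mask of the face region; all images share one size."""
    ys, xs = np.nonzero(face_mask)
    center = (int(xs.mean()), int(ys.mean()))  # centroid of the face region
    # NORMAL_CLONE adapts the pasted face to the surrounding skin tone,
    # approximating the skin-colour matching described above.
    return cv2.seamlessClone(swapped_face, original_frame, face_mask,
                             center, cv2.NORMAL_CLONE)
```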
In an embodiment of the present invention, during the live broadcast connection, a video frame of the live broadcast connection is presented at the viewer client, and this frame simultaneously includes a video frame of the first user and a video frame of the second user.
In an embodiment of the invention, the first face interchange mode is an autoencoder mode, and the second face interchange mode is a three-dimensional reconstruction mode.
In the embodiment of the invention, first attribute information for a first user and second attribute information for a second user are determined while the first user and the second user are in a live broadcast connection; a target face interchange mode is then determined according to the first attribute information and the second attribute information; and face interchange processing is performed on the first user and the second user in the video picture of the live broadcast connection according to that mode. This realizes face-swapping processing applicable to any live broadcast scene: since the face interchange mode is determined from the users' attribute information, the face-swapping algorithm can be flexibly selected, achieving real-time face swapping in live broadcast with low delay and good results while taking system performance into account, and adding interest and entertainment to the live broadcast.
An embodiment of the present invention further provides an electronic device, which may include a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements the method for processing data in live broadcasting.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for processing data in live broadcasting.
Since the device embodiment is basically similar to the method embodiment, its description is relatively brief; for relevant details, refer to the description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The method and apparatus for processing data in live broadcast, the electronic device, and the storage medium provided above have been introduced in detail. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method and its core idea. Meanwhile, those skilled in the art may, following the idea of the present invention, make changes to the specific implementation and application scope. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (12)

1. A method for processing data in live broadcast, the method comprising:
determining, during a live broadcast connection between a first user and a second user, first attribute information for the first user and second attribute information for the second user;
determining a target face interchange mode according to the first attribute information and the second attribute information;
and according to the target face interchange mode, performing face interchange processing on the first user and the second user in a video picture of the live broadcast connection.
2. The method of claim 1, wherein determining a target face interchange mode according to the first attribute information and the second attribute information comprises:
acquiring a preset attribute information set, and judging whether the first attribute information and the second attribute information are both matched with the attribute information set;
if so, determining that the preset first face interchange mode is the target face interchange mode;
otherwise, determining the preset second face interchange mode as the target face interchange mode.
3. The method according to claim 1 or 2, wherein, before the face interchange processing is performed on the first user and the second user in the video picture of the live broadcast connection according to the target face interchange mode, the method further comprises:
acquiring first original face data for the first user and second original face data for the second user from a video picture of the live broadcast connection;
the performing, according to the target face interchange mode, face interchange processing on the first user and the second user in a video picture of the live broadcast connection includes:
acquiring a target face interchange model corresponding to the target face interchange mode;
generating first interchanged face data corresponding to the first original face data and second interchanged face data corresponding to the second original face data by adopting the target face interchange model;
performing replacement processing on the first original face data according to the first interchanged face data, and performing replacement processing on the second original face data according to the second interchanged face data.
4. The method of claim 3, wherein the obtaining a target face interchange model corresponding to the target face interchange mode comprises:
when the target face interchange mode is the first face interchange mode, acquiring a preset first face interchange model as a target face interchange model; wherein the first face interchange model comprises a first decoding model for the first user and a second decoding model for the second user;
generating first interchanged face data corresponding to the first original face data and second interchanged face data corresponding to the second original face data by adopting the target face interchange model, wherein the generating comprises:
generating first interchanged face data corresponding to the first original face data by adopting the second decoding model;
and generating second interchanged face data corresponding to the second original face data by adopting the first decoding model.
5. The method of claim 3, wherein the obtaining a target face interchange model corresponding to the target face interchange mode comprises:
when the target face interchange mode is the second face interchange mode, adopting the first original face data to construct a second face interchange model for the first user;
constructing a third face interchange model for the second user by adopting the second original face data;
generating first interchanged face data corresponding to the first original face data and second interchanged face data corresponding to the second original face data by adopting the target face interchange model, wherein the generating comprises:
generating first interchanged face data corresponding to the first original face data by adopting the second face interchange model;
and generating second interchanged face data corresponding to the second original face data by adopting the third face interchange model.
6. The method according to claim 4 or 5, wherein the replacement processing of the first original face data based on the first interchanged face data comprises:
generating first face mask data corresponding to the first original face data, and obtaining first target face data according to the first face mask data and the first interchanged face data;
replacing the first original face data with the first target face data;
the replacing the second original face data according to the second interchanged face data includes:
generating second face mask data corresponding to the second original face data, and obtaining second target face data according to the second face mask data and the second interchanged face data;
replacing the second original face data with the second target face data.
7. The method of claim 6, wherein said deriving first target face data from said first face mask data and said first interchanged face data comprises:
obtaining first target face data according to the first face mask data, the first interchanged face data, and the skin color data of the first user;
said deriving second target face data from said second face mask data and said second interchanged face data comprises:
and obtaining second target face data according to the second face mask data, the second interchanged face data, and the skin color data of the second user.
8. The method of claim 1, wherein during the live connection, a video frame of the live connection is presented at a viewer client, and the video frame includes a video frame of the first user and a video frame of the second user.
9. The method of claim 2, wherein the first face interchange mode is an autoencoder mode and the second face interchange mode is a three-dimensional reconstruction mode.
10. An apparatus for data processing in a live broadcast, the apparatus comprising:
the attribute information determining module is used for determining, during a live broadcast connection between a first user and a second user, first attribute information for the first user and second attribute information for the second user;
a target face interchange mode determination module for determining a target face interchange mode according to the first attribute information and the second attribute information;
and the face interchange processing module is used for performing face interchange processing on the first user and the second user in a video picture of the live broadcast connection according to the target face interchange mode.
11. An electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the method for processing data in live broadcast according to any one of claims 1 to 9.
12. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the method for processing data in live broadcast according to any one of claims 1 to 9.
CN202010924402.6A 2020-09-04 2020-09-04 Method and device for processing data in live broadcast, electronic equipment and storage medium Pending CN111986301A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010924402.6A CN111986301A (en) 2020-09-04 2020-09-04 Method and device for processing data in live broadcast, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111986301A 2020-11-24

Family

ID=73448340


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106447604A (en) * 2016-09-30 2017-02-22 北京奇虎科技有限公司 Method and device for transforming facial frames in videos
CN106534757A (en) * 2016-11-22 2017-03-22 北京金山安全软件有限公司 Face exchange method and device, anchor terminal and audience terminal
WO2018010682A1 (en) * 2016-07-15 2018-01-18 腾讯科技(深圳)有限公司 Live broadcast method, live broadcast data stream display method and terminal
CN108229239A (en) * 2016-12-09 2018-06-29 武汉斗鱼网络科技有限公司 A kind of method and device of image procossing
CN110533585A (en) * 2019-09-04 2019-12-03 广州华多网络科技有限公司 A kind of method, apparatus that image is changed face, system, equipment and storage medium
US10552977B1 (en) * 2017-04-18 2020-02-04 Twitter, Inc. Fast face-morphing using neural networks
JP2020071851A (en) * 2018-10-31 2020-05-07 バイドゥ オンライン ネットワーク テクノロジー (ベイジン) カンパニー リミテッド Method and apparatus for live broadcasting with avatar
CN111314719A (en) * 2020-01-22 2020-06-19 北京达佳互联信息技术有限公司 Live broadcast auxiliary method and device, electronic equipment and storage medium
CN111586424A (en) * 2020-04-28 2020-08-25 永康精信软件开发有限公司 Video live broadcast method and device for realizing multi-dimensional dynamic display of cosmetics



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination