WO2023273500A1 - Data display method, apparatus, electronic device, computer program, and computer-readable storage medium

Info

Publication number
WO2023273500A1
Authority
WO
WIPO (PCT)
Prior art keywords
anchor
real
head
real anchor
special effect
Prior art date
Application number
PCT/CN2022/085941
Other languages
French (fr)
Chinese (zh)
Inventor
邱丰
王佳梨
王权
Original Assignee
上海商汤智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023273500A1

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/2187 — Live feed (under H04N 21/218, Source of audio or video content, e.g. local disk arrays)
    • H04N 21/431 — Generation of visual interfaces for content selection or interaction; content or additional data rendering
    • H04N 21/4312 — Generation of visual interfaces involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N 21/435 — Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N 21/47205 — End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N 21/4781 — Games
    • H04N 21/485 — End-user interface for client configuration

Definitions

  • The present disclosure relates to the technical field of image processing, and in particular to a data presentation method, apparatus, electronic device, computer program, and computer-readable storage medium.
  • During a live broadcast, the host is generally required to face the display screen of the host terminal so as to enhance the interaction between the host and the audience.
  • If the anchor's face disappears from the display screen, this not only degrades the display of the animation special effects added for the anchor, but also reduces the viewing experience of the audience watching the live video.
  • If the audience then leaves the live broadcast room, this also indirectly affects the anchor's live broadcast experience and the popularity of the broadcast.
  • Embodiments of the present disclosure provide at least a data display method, apparatus, electronic device, computer program, and computer-readable storage medium.
  • In a first aspect, an embodiment of the present disclosure provides a data display method, including: acquiring multiple frames of video images of a real anchor during a live broadcast; detecting the head pose of the real anchor in each frame of video image; and, when it is determined from the head poses corresponding to the multiple frames of video images that the length of time the real anchor's head has been in a specified pose meets a special-effect triggering requirement, displaying a target special effect animation in the live video picture, where the live video picture displays a virtual anchor model driven by the real anchor.
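  • A minimal sketch of this flow, assuming hypothetical helpers for frame iteration, per-frame pose detection, and effect rendering (the 2-second trigger duration and all names here are illustrative, not values taken from the disclosure):

    import time
    from dataclasses import dataclass
    from typing import Callable, Iterable, Optional

    @dataclass
    class HeadPose:
        pose_type: str        # e.g. "head_down" or "head_up"
        is_specified: bool    # whether this counts as a specified pose

    TRIGGER_SECONDS = 2.0     # illustrative duration; not fixed by the disclosure

    def run_live_loop(frames: Iterable,
                      detect_head_pose: Callable[[object], HeadPose],
                      show_effect: Callable[[str], None]) -> None:
        pose_start: Optional[float] = None
        for frame in frames:
            pose = detect_head_pose(frame)          # per-frame detection (step S103)
            if pose.is_specified:
                pose_start = pose_start or time.time()
                if time.time() - pose_start >= TRIGGER_SECONDS:
                    show_effect(pose.pose_type)     # triggering requirement met (step S105)
            else:
                pose_start = None                   # pose interrupted: reset the timer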
  • In the related art, when the head of the real anchor is detected to be in a specified pose for a long time, the head of the virtual anchor model displayed in the live video picture may shake, affecting the anchor's live broadcast experience and the audience's viewing experience.
  • In the embodiments of the present disclosure, by displaying the virtual anchor model in the live video picture, the interest and interactivity of the live broadcast can be enhanced.
  • In addition, by displaying the target special effect animation that drives the virtual anchor model in the live video picture, the head of the virtual anchor model can be kept in a stable playback state while the display content of the live picture is enriched, so that the picture is no longer monotonous; this solves the problem of abnormal display of the virtual anchor model that arises in traditional live broadcast scenes when the facial image of the real anchor cannot be matched.
  • In a possible implementation, detecting the head pose of the real anchor in each frame of video image includes: when it is determined that the face of the real anchor is facing the video capture device, determining the first facial orientation of the real anchor at the current moment; determining the change information of the real anchor's head pose according to the first facial orientation, where the change information characterizes how the first facial orientation changes; and determining the head pose of the real anchor in each frame of video image based on the change information.
  • In this method, by determining the change information of the real anchor's head pose from the first facial orientation at the current moment and then determining the head pose from that change information, the timing information in the video sequence (that is, adjacent video images) can be used to analyze how the real anchor's head pose changes.
  • Compared with determining the head pose from a single video image, the technical solution of the present disclosure can improve the accuracy of the head pose and thus obtain more accurate pose results.
  • In a possible implementation, determining the head pose of the real anchor in each frame of video image based on the change information includes: when it is determined from the change information that the first facial orientation has increased to exceed a first threshold, determining that the head pose of the real anchor has changed from a non-specified pose to the specified pose.
  • In a possible implementation, determining the head pose of the real anchor in each frame of video image based on the change information includes: when it is determined from the change information that the first facial orientation, having exceeded the first threshold, has decreased to below a second threshold, determining that the head pose of the real anchor has changed from the specified pose to a non-specified pose, where the second threshold is smaller than the first threshold.
  • In a possible implementation, detecting the head pose of the real anchor in each frame of video image includes: when it is determined that the face of the real anchor is not facing the video capture device, processing the live video picture through a deep learning model to obtain the head pose of the real anchor, and determining from the head pose whether the real anchor's head is in the specified pose.
  • When the face of the real anchor is not facing the video capture device, performing pose estimation on the live video picture through the deep learning model can improve the estimation accuracy of the real anchor's head pose.
  • In a possible implementation, processing the live video picture through the deep learning model to obtain the head pose of the real anchor includes: acquiring a target reference image frame, where the target reference image frame includes at least one of the following image frames: the N image frames preceding the live video picture in the video sequence to which it belongs, and the first M image frames of that video sequence, N and M being positive integers greater than zero; and processing the live video picture and the target reference image frame through the deep learning model to obtain the head pose of the real anchor.
  • In this way, the real anchor's head pose determined from the N image frames (or the M image frames) can serve as guidance information for the live video picture to be processed at the current moment, guiding the deep learning model in predicting the real anchor's head pose in the current picture and yielding more accurate head pose detection results.
  • In a possible implementation, detecting the head pose of the real anchor in each frame of video image includes: performing feature point detection on the face of the real anchor in the video image to obtain a feature point detection result, where the feature point detection result characterizes the feature information of the real anchor's facial feature points; determining a second facial orientation of the real anchor from the feature point detection result, where the second facial orientation characterizes the orientation of the real anchor's face relative to the video capture device; and determining the head pose of the real anchor from the second facial orientation.
  • Through the feature point detection result, the orientation of the real anchor's face relative to the video capture device can be determined, for example whether the face is frontal to the device or turned to the side. Since a complete facial image cannot be collected when the real anchor's face is turned sideways, the accuracy of the head pose would suffer in that case; by determining the head pose separately for the frontal and non-frontal cases, the accuracy of the real anchor's head pose can be improved, as sketched below.
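  • One common way to turn detected facial feature points into a facial orientation (offered here as a sketch, not as the method of the disclosure) is a perspective-n-point fit against a generic 3D face template; the template coordinates and the focal-length guess below are rough, conventional values:

    import cv2
    import numpy as np

    # Approximate 3D reference positions (mm) for six landmarks: nose tip, chin,
    # left/right eye outer corner, left/right mouth corner. Generic values, not
    # taken from the disclosure.
    MODEL_POINTS = np.array([
        (0.0, 0.0, 0.0),          # nose tip
        (0.0, -330.0, -65.0),     # chin
        (-225.0, 170.0, -135.0),  # left eye outer corner
        (225.0, 170.0, -135.0),   # right eye outer corner
        (-150.0, -150.0, -125.0), # left mouth corner
        (150.0, -150.0, -125.0),  # right mouth corner
    ])

    def face_orientation(image_points: np.ndarray, frame_size) -> np.ndarray:
        """Return (pitch, yaw, roll) in degrees from six 2D landmarks
        (float64 array of shape (6, 2)); frame_size is (height, width)."""
        h, w = frame_size
        focal = w  # crude focal-length guess: the image width
        camera_matrix = np.array([[focal, 0, w / 2],
                                  [0, focal, h / 2],
                                  [0, 0, 1]], dtype=np.float64)
        ok, rvec, _ = cv2.solvePnP(MODEL_POINTS, image_points, camera_matrix,
                                   np.zeros(4))     # assume no lens distortion
        rot, _ = cv2.Rodrigues(rvec)                 # rotation vector -> matrix
        angles, *_ = cv2.RQDecomp3x3(rot)            # Euler angles in degrees
        return np.asarray(angles)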
  • In a possible implementation, displaying the target special effect animation in the live video picture includes: determining the pose type of the head pose; determining the special effect animation matching the pose type; using the matching special effect animation as the target special effect animation that drives the virtual anchor model; and displaying the target special effect animation in the live video picture.
  • In this way, different special effect animations are triggered for different head poses and pose types, which enriches the displayed special effects, increases the fun of the live broadcast, and provides users with a better live broadcast experience.
  • In a possible implementation, displaying the target special effect animation in the live video picture includes: determining the type information of each viewer watching the live broadcast of the virtual anchor model driven by the real anchor; determining the special effect animation matching the type information; using the matching special effect animation as the target special effect animation displayed with the virtual anchor model; and sending the target special effect animation to the viewer terminal so that it is displayed there.
  • In this way, the probability that viewers continue watching the live broadcast can be increased, reducing viewer loss; while maintaining the popularity of the real anchor's live broadcast, the corresponding interactive fun is also increased.
  • In a second aspect, an embodiment of the present disclosure provides a data display apparatus, including: an acquisition part configured to acquire multiple frames of video images of a real anchor during a live broadcast; a detection part configured to detect the head pose of the real anchor in each frame of video image; and a special-effect adding part configured to display a target special effect animation in the live video picture when it is determined, from the head poses corresponding to the multiple frames of video images, that the length of time the real anchor's head has been in a specified pose meets the special-effect triggering requirement, where the live video picture displays a virtual anchor model driven by the real anchor.
  • In a third aspect, an embodiment of the present disclosure further provides an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the first aspect, or of any possible implementation of the first aspect, are performed.
  • In a fourth aspect, embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the first aspect, or of any possible implementation of the first aspect, are performed.
  • In a fifth aspect, an embodiment of the present disclosure provides a computer program, including computer-readable code; when the code runs in an electronic device, a processor in the electronic device implements the above method.
  • Fig. 1 shows a first flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 2 shows a schematic diagram of the effect of a live video picture of a real anchor provided by an embodiment of the present disclosure;
  • Fig. 3 shows a second flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 4 shows a first schematic diagram of the orientation information between the real anchor and the video capture device provided by an embodiment of the present disclosure;
  • Fig. 5 shows a second schematic diagram of the orientation information between the real anchor and the video capture device provided by an embodiment of the present disclosure;
  • Fig. 6 shows a third flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 7 shows a fourth flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 8 shows a fifth flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 9 shows a third schematic diagram of the orientation information between the real anchor and the video capture device provided by an embodiment of the present disclosure;
  • Fig. 10 shows a sixth flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 11 shows a seventh flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 12 shows an eighth flowchart of a data presentation method provided by an embodiment of the present disclosure;
  • Fig. 13 shows a schematic diagram of a data display apparatus provided by an embodiment of the present disclosure;
  • Fig. 14 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • In the related art, the anchor is generally required to face the display screen of the anchor terminal during a live broadcast so as to enhance the interaction between the anchor and the audience.
  • If the anchor's face disappears from the display screen, this not only degrades the display of the animation special effects added for the anchor, but also reduces the viewing experience of the audience watching the live video.
  • If the audience then leaves the live broadcast room, this also indirectly affects the anchor's live broadcast experience and the popularity of the broadcast.
  • Based on this, the present disclosure provides a data presentation method.
  • The technical solution provided by the present disclosure can be applied in virtual live broadcast scenarios.
  • A virtual live broadcast scenario can be understood as one in which a preset virtual anchor model, such as a red panda, a little rabbit, or a cartoon character, replaces the actual image of the real anchor during the live broadcast; that is, the above-mentioned virtual anchor model is shown in the live video picture.
  • At the same time, the virtual anchor model can also be used to interact with the real anchor and the audience.
  • During the live broadcast, the camera of the live broadcast device can collect video images containing the real anchor; the electronic device then captures the head of the real anchor contained in the video images to obtain the real anchor's head pose.
  • Next, the electronic device can generate a corresponding driving signal, which drives the virtual anchor model in the live video picture to perform the action corresponding to the real anchor, and displays the virtual anchor model through the live video picture.
  • The real anchor can preset one or more virtual anchor models through the electronic device; for example, "the YYY character model in game XXX" can be preset as a virtual anchor model. When starting a virtual live broadcast at the current moment, one of the preset virtual anchor models can then be selected as the virtual anchor model for the current session.
  • The virtual anchor model may be a 2D model or a 3D model.
  • In addition to determining the virtual anchor model in the manner described above, the electronic device can also reshape a virtual anchor model for the real anchor in the video images after acquiring the multiple frames of video images.
  • In implementation, the electronic device can recognize the real anchor contained in the video image and reshape the virtual anchor model for the real anchor according to the recognition result.
  • The recognition result may include at least one of the following: the gender of the real anchor, the appearance characteristics of the real anchor, the clothing characteristics of the real anchor, and the like.
  • After the recognition result is obtained, the electronic device may search the virtual anchor model database for a model matching the recognition result to serve as the real anchor's virtual anchor model. For example, if the recognition result indicates that the real anchor wears a peaked cap and hip-hop style clothes during the live broadcast, the electronic device may search the database for a virtual anchor model matching "peaked cap" or "hip-hop style".
  • After the recognition result is obtained, the electronic device can also construct a corresponding virtual anchor model for the real anchor in real time through a model building module, based on the recognition result.
  • When constructing the virtual anchor model in real time, the electronic device can also use the virtual anchor models used in past virtual live broadcasts initiated by the real anchor as a reference for constructing the model driven by the real anchor at the current moment.
  • Here, the animation displayed on the viewer side's live viewing interface is the animation of the virtual anchor model performing the corresponding action.
  • Both the virtual anchor model and the video image containing the real anchor can be displayed in the live video picture at the live broadcast end.
  • For example, the virtual anchor model can be displayed on the left side of the live video picture, while the video image of the real anchor is displayed at position 21 in the lower-right corner of the picture.
  • Here, the target special effect animation includes multiple animation frames.
  • When the electronic device drives the virtual anchor model to perform a specified action, it can generate multiple animation frames and combine them to obtain the target special effect animation.
  • By displaying the target special effect animation that drives the virtual anchor model in the live video picture, the head of the virtual anchor model can be kept in a stable playback state while the display content of the picture is enriched, so that the live picture is no longer monotonous; this solves the problem of abnormal display of the virtual anchor model that arises in traditional live broadcast scenes when the facial image of the real anchor cannot be matched.
  • The execution subject of the data display method provided in the embodiments of the present disclosure is generally an electronic device with certain computing capability, for example a terminal device, a server, or another live broadcast device capable of supporting virtual live broadcast.
  • In some possible implementations, the data presentation method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • The data presentation method can be applied in any virtual live broadcast scenario, such as a chat live broadcast scenario or a game live broadcast scenario, which is not specifically limited in the present disclosure.
  • Referring to Fig. 1, a flowchart of a data presentation method provided by an embodiment of the present disclosure, the method includes steps S101 to S105:
  • The head pose can characterize the angle between the plane corresponding to the real anchor's face and the horizontal plane, and/or the angle between that plane and the plane in which the lens of the video capture device lies.
  • According to the head pose, the pose of the real anchor's head relative to the video capture device of the real anchor terminal can be determined, for example a head-raised pose, a head-down pose, or a level-gaze pose, where the level-gaze pose can be understood as the state in which the real anchor looks straight ahead, with the direction of the face roughly parallel to the horizontal plane.
  • When multiple real anchors appear in the video image, the head pose of every real anchor can be detected, or only that of a designated real anchor among them; this is not specifically limited here.
  • The specified pose can be understood as the head pose of the real anchor when the real anchor's face in the video image is in an invalid display state.
  • For example, it may be the head pose when the real anchor's face remains fixed for a long time, when the face disappears from the live video picture, when only part of the face is displayed, or when the real anchor does not face the video capture device for a long time.
  • In embodiments of the present disclosure, the specified pose includes poses such as: head down, head up, head down to the lower left, head down to the lower right, head up to the upper left, and head up to the upper right; they are not listed one by one here.
  • The target special effect animation can be understood as a special effect animation matching the specified pose.
  • The special effect animations matching different specified poses may be the same or different.
  • For each specified pose, one or more matching special effect animations may be preset, each corresponding to a different special-effect triggering requirement.
  • The target special effect animation may include a model animation and, in addition, material special effects.
  • The model animation may be the animation produced when specified limbs of the virtual anchor model are driven to perform corresponding actions, for example a finger-heart gesture, a greeting gesture, or a goodbye gesture.
  • The material special effect can be a preset dynamic or static sticker effect.
  • The material special effect may match the model animation, or match the specified pose of the real anchor.
  • When the material special effect matches the model animation, it can be displayed at a specified position in the live video picture while the model animation is displayed; when switching to the next model animation, the picture can switch to the material effect corresponding to the next model action.
  • When the material special effect matches the specified pose of the real anchor, it can be displayed continuously in the live video picture once the length of time the real anchor has been in the specified pose meets the triggering requirement, until the real anchor's head is detected to no longer be in the specified pose.
  • For example, the specified pose can be the real anchor keeping the head down for a long time.
  • In this case, the target special effect animation can include both model animations and material special effects.
  • The model animations can include a "finger heart" animation and a "greeting" animation of the virtual anchor model.
  • The material special effects can be sticker effects matching the model animations, for example a "Hello" sticker and a heart sticker.
  • While the "greeting" animation is displayed in the live video picture, the "Hello" sticker can be shown at the same time; while the "finger heart" animation is displayed, the heart sticker can be shown at the same time.
  • In this way, the content displayed in the live video picture can be enriched, improving the user's live broadcast experience.
  • In some embodiments, displaying the target special effect animation in the live video picture includes the following steps:
  • After the target special effect animation is determined, it may be requested from the server; it is then displayed in the live video picture of the live broadcast device where the real anchor is located, and the video stream corresponding to the target special effect animation is pushed to the devices of the viewer terminals so that it also plays there.
  • The number of target special effect animations may be one or more.
  • When there are multiple target special effect animations, they can be set to play in a loop until the real anchor's head is detected to no longer be in the specified pose; when there is one target special effect animation, it can likewise be set to play in a loop until the real anchor's head is no longer in the specified pose.
  • In a game live broadcast scenario, the virtual anchor model and the real-time game picture can be displayed in the live video picture at the same time; for example, the real-time game picture on the left side of the live video picture and the virtual anchor model on the right.
  • When the special-effect triggering requirement is met, the target special effect animation can be determined.
  • The target special effect animation can be, for example, a special effect animation of the virtual anchor model dancing, or one in which the virtual anchor model reminds the audience: "Please wait a moment; the excitement will continue shortly."
  • In some embodiments, a database containing mapping relationships can be created in advance, storing the various special effect animations; the mapping relationships characterize the correspondence between each specified pose and special effect animations, and/or between each specified pose's special-effect triggering requirement and special effect animations.
  • After the specified pose and the triggering requirement are determined, the special effect animation having a mapping relationship with them can be looked up in the database according to the mapping relationship, and the target special effect animation determined from the result.
  • In step S101, after the start instruction of the real anchor's live broadcast is detected, collection of the live video of the real anchor begins; the live video contains multiple frames of video images.
  • After the multiple frames of video images are acquired, step S103 is performed to detect the head pose of the real anchor in each frame of video image, which, as shown in Fig. 3, includes the following steps:
  • Feature point detection can be performed on the real anchor's face in the video image through a face detection network model to obtain the feature information of the real anchor's facial feature points.
  • The feature points can be understood as the feature points of the real anchor's facial features.
  • The number of feature points can be set according to actual needs; typically, 84 facial feature points may be selected.
  • The feature information of the feature points can be understood as the number of feature points, the labels of the feature points, the classification information of each feature point (for example, an eye, mouth, or nose feature point), and the feature value corresponding to each point.
  • The number of feature points affects the accuracy of the determined head pose of the real anchor: the larger the number of feature points, the higher the accuracy of the calculated head pose, and the smaller the number, the lower the accuracy.
  • In some embodiments, the number of feature points can be dynamically adjusted according to the remaining device memory of the real anchor terminal; for example, when the remaining memory is greater than a preset threshold, a detection result with a larger number of feature points may be selected for determining the real anchor's face orientation, as sketched below.
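  • A sketch of such memory-based adjustment; psutil as the memory probe, the 2 GiB threshold, and the hypothetical 106-point denser set are assumptions for illustration (84 is the typical count mentioned above):

    import psutil  # assumed dependency for querying available memory

    def choose_landmark_count(threshold_bytes: int = 2 * 1024**3) -> int:
        """Pick a denser landmark set when the terminal has memory to spare."""
        free = psutil.virtual_memory().available
        return 106 if free > threshold_bytes else 84  # denser vs. typical set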
  • After the feature point detection result is obtained, the face orientation of the real anchor (that is, the above-mentioned second facial orientation) can be determined according to it.
  • One optional implementation is to input the feature point detection result into a neural network model and process it through the model to obtain the real anchor's face orientation (that is, the second facial orientation).
  • Another optional implementation is to judge the classification information of the feature points contained in the detection result: if the classification information indicates that not all facial feature points are included, it can be determined that the real anchor's face is turned sideways to the video capture device; if all facial feature points are included, it can be determined that the real anchor's face is frontal to the video capture device.
  • The second facial orientation characterizes the orientation of the real anchor's face relative to the video capture device; the orientation information can be understood as the angle and distance of the real anchor's face relative to the video capture device of the real anchor terminal to which the real anchor belongs.
  • The video capture device is installed on the real anchor terminal; when the angle between the horizontal direction of the real anchor's face and the X-axis of the coordinate system of the video capture device is less than or equal to a specified threshold, the real anchor's face is determined to be frontal to the video capture device.
  • When that angle is greater than the specified threshold, the real anchor's face is determined to be sideways to the video capture device.
  • The specified threshold may be set to any value between 0 and 30 degrees, which is not specifically limited here.
  • After the second facial orientation is determined, it can be used to determine whether the real anchor's face is facing the video capture device, as in the sketch below.
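  • In code, this frontal/sideways decision reduces to a single comparison (the default threshold value is illustrative, within the 0-30 degree range above):

    def is_frontal(face_x_angle_deg: float, threshold_deg: float = 30.0) -> bool:
        """Frontal if the angle between the face's horizontal direction and the
        X-axis of the capture device's coordinate system is within the threshold."""
        return abs(face_x_angle_deg) <= threshold_deg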
  • If the face of the real anchor is frontal to the video capture device, the head pose of the real anchor is determined by threshold comparison.
  • Threshold comparison can be understood as determining whether the real anchor's head pose is the specified pose by comparing the change information of the head pose with preset thresholds.
  • If the face of the real anchor is not frontal to the video capture device, the head pose of the real anchor is determined through the neural network model.
  • Through the feature point detection result, the orientation of the real anchor's face relative to the video capture device can be determined, for example frontal to the device or turned to the side. Since a complete facial image cannot be collected when the real anchor's face is turned sideways, the accuracy of the head pose would suffer in that case.
  • By determining the head pose separately for the frontal and non-frontal cases, the accuracy of the real anchor's head pose can be improved.
  • In some embodiments, when the face of the real anchor is frontal to the video capture device, step S103, detecting the head pose of the real anchor in each frame of video image, includes the following steps:
  • First, the historical facial orientations can be obtained, where the historical facial orientations are the facial orientations of the real anchor determined from the video images collected at multiple historical moments before the current moment; they represent the historical angles between the plane of the real anchor's face and the horizontal plane at each historical moment.
  • Then, the historical facial orientations and the first facial orientation determined at the current moment can be combined to determine the change information of the real anchor's head pose; that is, the change information of the first facial orientation is determined from the historical angles and the current angle between the face plane and the horizontal plane.
  • Here, the first facial orientation represents the degree of inclination of the real anchor's face relative to the imaging plane of the video capture device.
  • The first facial orientation may be the angle between the real anchor's face and the horizontal plane, or the angle between the real anchor's face and the imaging plane of the video capture device; other included angles that characterize the degree of inclination may also be used.
  • The change information can be understood as trend information such as the first facial orientation gradually increasing, together with the magnitude of the increase, or gradually decreasing, together with the magnitude of the decrease.
  • The historical facial orientations are determined from the video images corresponding to multiple consecutive historical moments. For example, if the current moment is moment k, the historical moments can be moments k-n through k-1, and the historical facial orientations are those of the real anchor determined from the video images collected at moments k-n through k-1.
  • In some embodiments, when determining the head pose of the real anchor in each frame of video image from the change information, the change information can be compared with threshold transition intervals, where the threshold transition intervals are multiple transition intervals determined from multiple thresholds.
  • The change process of the real anchor's head pose can be determined through the threshold transition intervals, and the head pose at the current moment then determined from that change process, as sketched below.
  • Compared with determining the head pose from a single video image, the technical solution provided by the present disclosure can improve the accuracy of the head pose and obtain more accurate pose results.
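  • A sketch of accumulating the change information over a sliding window of historical orientations; the window length and the use of a simple first-to-last difference as the trend measure are assumptions:

    from collections import deque

    class OrientationHistory:
        """Track the first facial orientation over moments k-n .. k and report
        how it is changing."""
        def __init__(self, window: int = 10):
            self.angles = deque(maxlen=window)  # historical face/horizontal angles

        def update(self, angle_deg: float) -> dict:
            self.angles.append(angle_deg)
            if len(self.angles) < 2:
                return {"trend": "unknown", "delta": 0.0}
            delta = self.angles[-1] - self.angles[0]   # net change over the window
            trend = ("increasing" if delta > 0
                     else "decreasing" if delta < 0 else "flat")
            return {"trend": trend, "delta": abs(delta)}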
  • In some embodiments, step S13, determining the head pose of the real anchor in each frame of video image based on the change information, can be implemented by executing S13-1 or S13-2, as follows:
  • The first threshold may be set according to the angle range of the first facial orientation defined for the specified pose in the actual live broadcast scene.
  • For example, the first threshold may be set to any value in [27, 33], such as 30.
  • The embodiment of the present disclosure does not limit the specific value of the first threshold.
  • If the first facial orientation has not increased beyond the first threshold, head pose detection may simply continue on the collected video images.
  • After it is determined that the head pose of the real anchor has changed to the specified pose (for example, a head-down or head-up pose), a pose adjustment prompt can be sent to the real anchor, prompting the real anchor to adjust the head pose at the current moment.
  • The threshold A1 may be one of multiple thresholds greater than the first threshold; for example, A1 may be selected as 50 degrees, 60 degrees, 70 degrees, and so on. A1 can take multiple arbitrary values in [30, 90], which is not specifically limited in the present disclosure.
  • In some embodiments, the first threshold can be set according to the angle range of the first facial orientation defined for the specified pose in the actual live broadcast scene, and the second threshold according to the angle range of the first facial orientation defined for the non-specified pose.
  • In implementation, the first threshold can be set to any value in [27, 33], for example 30; the second threshold can be set to any value in [17, 23], for example 20.
  • For example, real anchor M broadcasts on the live platform through the real anchor terminal. After anchor M opens the live broadcast room, video images begin to be collected, and the head pose of the real anchor is determined in the manner described above.
  • Suppose the target angle between the real anchor's face and the imaging plane of the video capture device is alpha, and the change information shows alpha gradually increasing. While alpha increases from 0 to more than 20 degrees but less than 30 degrees, it is determined that the real anchor is not bowing or raising the head; once alpha increases beyond 30 degrees, it is determined that the real anchor is bowing or raising the head. Conversely, when alpha gradually decreases from an angle greater than 30 degrees into the interval between 20 and 30 degrees, it is determined that the real anchor is still bowing or raising the head; only when alpha continues to decrease below 20 degrees is it determined that the real anchor is not bowing or raising the head.
  • In the related single-threshold approach, a threshold is preset and the angle between the real anchor's face orientation and the horizontal plane is compared with it to determine whether the real anchor is in the specified pose.
  • However, single-threshold detection may misrecognize the real anchor's specified pose and thus trigger the corresponding special effect animation by mistake, giving the real anchor and the audience a bad live broadcast experience.
  • In the technical solution of the present disclosure, by comparing the change information of the target angle with the first threshold and the second threshold, the head pose of the real anchor is determined through multi-threshold comparison, which improves the accuracy of the real anchor's head pose and prevents the frequent head-pose flip-flops brought by a single-threshold solution, as sketched below.
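  • A sketch of this two-threshold (hysteresis) decision, using the 30/20-degree values from the example above:

    FIRST_THRESHOLD = 30.0   # degrees; enter the specified pose
    SECOND_THRESHOLD = 20.0  # degrees; leave the specified pose (must be < first)

    class PoseHysteresis:
        """Two-threshold decision matching the alpha example: enter the
        specified pose above 30 degrees, leave it only below 20 degrees."""
        def __init__(self):
            self.in_specified_pose = False

        def update(self, alpha: float) -> bool:
            if not self.in_specified_pose and alpha > FIRST_THRESHOLD:
                self.in_specified_pose = True       # non-specified -> specified
            elif self.in_specified_pose and alpha < SECOND_THRESHOLD:
                self.in_specified_pose = False      # specified -> non-specified
            # Between 20 and 30 degrees the previous state is kept, which is
            # what suppresses the flip-flops a single threshold would cause.
            return self.in_specified_pose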
  • In some embodiments, when the face of the real anchor is not frontal to the video capture device, step S103, detecting the head pose of the real anchor in each frame of video image, can be implemented by executing S21-S22, as follows:
  • When it is detected that the face of the real anchor is not facing the video capture device, the live video picture can be input into the deep learning model and processed by it to obtain the head pose of the real anchor.
  • Before the live video picture is input into the deep learning model, the model needs to be trained. Specifically, images of multiple real anchors at various angles relative to the video capture device can be collected and input into the deep learning model for training; the live video picture is then processed through the trained model to obtain the head pose of the real anchor.
  • The output data of the deep learning model can be a vector indicating at least one of the following information: whether the head is in a specified pose, the pose type of the specified pose (for example, a head-down pose or a head-up pose), the estimated angle between the real anchor's face orientation and the horizontal plane, and the orientation of the real anchor's face relative to the video capture device. A sketch of decoding such an output vector follows.
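  • The disclosure names the fields of the output vector but not their order or encoding, so the layout assumed in this sketch is hypothetical:

    import numpy as np

    POSE_TYPES = ["head_down", "head_up"]  # illustrative pose-type labels

    def decode_head_pose(output: np.ndarray) -> dict:
        """Decode a model output vector under an assumed field layout."""
        return {
            "is_specified_pose": bool(output[0] > 0.5),       # binary score
            "pose_type": POSE_TYPES[int(np.argmax(output[1:3]))],
            "face_horizontal_angle_deg": float(output[3]),    # estimated angle
            "facing_camera_frontally": bool(output[4] > 0.5), # orientation info
        }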
  • When it is determined from the head pose that the real anchor's head is in the specified pose and the triggering requirement is met, the live picture displays the target special effect animation.
  • In some embodiments, prompt information can also be generated for the real anchor, prompting the real anchor to move the video capture device so that the real anchor's face can be frontal to it.
  • For example, when the video capture device is set separately from the real anchor terminal and placed on the left side of the terminal, the live video image collected by the device contains the left side of the real anchor's face.
  • In the above embodiment, when the face of the real anchor is not frontal to the video capture device, pose estimation is performed on the live video picture through the deep learning model, which can improve the estimation accuracy of the real anchor's head pose.
  • In some embodiments, S21 can be implemented by executing S21-1 to S21-2, as shown in Fig. 10, as follows:
  • S21-1: Acquire a target reference image frame, where the target reference image frame includes at least one of the following image frames: the N image frames preceding the live video picture in the video sequence to which it belongs, and the first M image frames of that video sequence, N and M being positive integers greater than zero;
  • S21-2: Process the live video picture and the target reference image frame through the deep learning model to obtain the head pose of the real anchor.
  • In order to further improve the accuracy of the real anchor's head pose, the electronic device can determine the head pose at the current moment by combining, through the deep learning model, the timing information of the video sequence during the real anchor's live broadcast.
  • In implementation, the N image frames preceding the live video picture at the current moment can be determined; the N image frames, the output data corresponding to each of them, and the live video picture collected at the current moment are then input into the deep learning model for processing to obtain the head pose of the real anchor.
  • In general, the head poses of the real anchor in adjacent live video pictures of the video sequence tend to be the same pose.
  • Therefore, the head pose of the real anchor in the current live video picture can be predicted by combining the timing information in the video sequence: the head pose determined from the N image frames serves as guidance information for the live video picture to be processed at the current moment, guiding the deep learning model's prediction and yielding more accurate head pose detection results.
  • In implementation, the first M image frames of the video sequence may also be determined; the M image frames, the output data corresponding to each of them, and the live video picture collected at the current moment are input into the deep learning model for processing to obtain the head pose of the real anchor.
  • In general, at the start of a live broadcast the real anchor faces the video capture device in order to debug the equipment. Therefore, when predicting the live video picture to be processed at the current moment, the M image frames, their corresponding output data, and the current picture can be input into the deep learning model together for processing.
  • Since the M image frames can be understood as frames collected while the real anchor's face is frontal to the video capture device, they may contain the real anchor's complete face.
  • The deep learning model can then compare the real anchor in the current picture with the real anchor in the M image frames, which guides the model in predicting the head pose in the current picture and yields more accurate detection results.
  • In implementation, both the N image frames preceding the current picture and the first M image frames of the video sequence may be determined; the N and M image frames, the output data corresponding to each frame, and the live video picture collected at the current moment are then input into the deep learning model together to obtain the real anchor's head pose, as sketched below.
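  • A sketch of this reference-frame-guided inference; `model` is a stand-in callable, and the values of N and M and the way references are packed are assumptions:

    from collections import deque
    from typing import Callable, List

    class ReferenceGuidedEstimator:
        """Feed the current picture together with the first M frames of the
        session and the N most recent frames (plus their earlier outputs)."""
        def __init__(self, model: Callable, n_recent: int = 4, m_first: int = 4):
            self.model = model
            self.m_first: List = []                  # first M frames + outputs
            self.recent = deque(maxlen=n_recent)     # last N frames + outputs
            self.m_limit = m_first

        def estimate(self, frame) -> object:
            refs = self.m_first + list(self.recent)  # target reference image frames
            pose = self.model(frame, refs)           # guided prediction
            if len(self.m_first) < self.m_limit:
                self.m_first.append((frame, pose))   # frames from the session start
            self.recent.append((frame, pose))
            return pose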
  • After the head pose of the real anchor in each video image is detected in the manner described above, the length of time the real anchor's head has been in the specified pose can be determined from the head poses corresponding to the multiple frames of video images.
  • When that length of time meets the special-effect triggering requirement, the target special effect animation is displayed in the live video picture.
  • In some embodiments, the target special effect animation can also be displayed when the specified pose meets at least one additional special-effect triggering requirement, for example: the position of the head in the video image meets the triggering requirement.
  • In this way, the display modes of the special effect animation can be enriched, providing a richer interactive experience for real anchors and audiences.
  • In some embodiments, step S105, displaying the target special effect animation in the live video picture, includes the following steps:
  • In embodiments of the present disclosure, different special effect animations are set for head poses of different pose types. After the pose type of the head pose is determined, the model animation and/or material special effects matching the pose type can be looked up in a data table, and the found model animation and/or material special effects used as the target special effect animation that drives the virtual anchor model.
  • The target special effect animation can be one special effect animation or multiple special effect animations.
  • When there is one target special effect animation, it can be played cyclically in the video sequence corresponding to the live video picture.
  • When there are multiple target special effect animations, each can be played sequentially in the video sequence corresponding to the live video picture.
  • When the material special effect matches the model animation, it can be played in the live video picture in sequence, following its corresponding model animation.
  • When the material special effect matches the specified pose, it can be played in a loop in the live video picture without following the model animation, as in the sketch below.
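  • A sketch of the lookup-and-playback logic; the table contents and effect names are illustrative, and rendering is replaced by a print:

    import itertools
    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class TargetEffect:
        model_animations: List[str]       # e.g. ["finger_heart", "greeting"]
        paired_stickers: List[str]        # follow their model animation in sequence
        pose_stickers: List[str] = field(default_factory=list)  # loop independently

    # Illustrative table; in the described system this lives in a data table.
    EFFECT_TABLE = {
        "head_down": TargetEffect(["finger_heart", "greeting"],
                                  ["heart_sticker", "hello_sticker"]),
    }

    def play(effect: TargetEffect, still_in_pose: Callable[[], bool]) -> None:
        # Animation-matched stickers are shown in step with their model
        # animation, cycling for as long as the specified pose holds.
        for anim, sticker in itertools.cycle(zip(effect.model_animations,
                                                 effect.paired_stickers)):
            if not still_in_pose():
                break
            print(f"playing {anim} with sticker {sticker}")  # stand-in for rendering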
  • In some embodiments, displaying the target special effect animation in the live video picture in step S105 or S1052 may also include the following steps, as shown in Fig. 12:
  • S33: Determine the special effect animation matching the type information, use the matching special effect animation as the target special effect animation displayed with the virtual anchor model, and send the target special effect animation to the viewer terminal so that it is displayed there.
  • In implementation, the type information of each viewer can be determined, and the type information may include at least one of the following: gender, age, region, occupation, hobby, and rating.
  • After the type information is determined, the special effect animation matching it can be looked up in the database and used as the target special effect animation; the target special effect animation is then sent to the viewer terminal and played in the live video picture displayed there, as sketched below.
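  • A sketch of selecting an effect by viewer type information; the keys, effect names, and fallback are hypothetical:

    # Illustrative mapping from viewer type information to a special effect.
    VIEWER_EFFECTS = {
        ("gender", "female"): "sparkle_greeting",
        ("age", "teen"): "cartoon_wave",
    }
    DEFAULT_EFFECT = "please_wait_notice"  # e.g. "connection in progress, don't leave"

    def effect_for_viewer(viewer_info: dict) -> str:
        """Pick the first matching effect for this viewer's type information."""
        for key, value in viewer_info.items():
            effect = VIEWER_EFFECTS.get((key, value))
            if effect:
                return effect              # send this to the viewer terminal
        return DEFAULT_EFFECT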
  • In a specific implementation, the real anchor may keep the head down for a long time during the live broadcast.
  • In this case, the facial expression of the real anchor cannot be captured, so the virtual anchor model cannot be displayed normally in the live video picture.
  • If a viewer enters the live broadcast room and sees a virtual anchor model that cannot be displayed normally, the viewing experience suffers, and the viewer may leave the room.
  • With the data display method of the embodiments of the present application, a corresponding special effect animation can be displayed for the audience, for example: "The real anchor is handling a connection; please don't leave." This increases the probability that viewers continue watching the live broadcast, reduces viewer loss, and, while maintaining the popularity of the real anchor's live broadcast, also adds interactive fun.
  • Those skilled in the art can understand that, in the above methods of the specific implementations, the writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
  • Based on the same inventive concept, an embodiment of the present disclosure also provides a data display apparatus corresponding to the data display method. Since the problem-solving principle of the apparatus is similar to that of the above data display method, the implementation of the apparatus can refer to the implementation of the method, and repeated descriptions are omitted.
  • Referring to Fig. 13, a schematic diagram of a data display apparatus provided by an embodiment of the present disclosure, the apparatus includes: an acquisition part 51, a detection part 52, and a special-effect adding part 53, wherein:
  • the acquisition part 51 is configured to acquire multiple frames of video images of the real anchor during the live broadcast;
  • the detection part 52 is configured to detect the head pose of the real anchor in each frame of video image;
  • the special-effect adding part 53 is configured to display the target special effect animation in the live video picture when it is determined, from the head poses corresponding to the multiple frames of video images, that the length of time the real anchor's head has been in a specified pose meets the special-effect triggering requirement; the live video picture displays a virtual anchor model driven by the real anchor.
  • By displaying the virtual anchor model in the live video picture, the interest and interactivity of the live broadcast can be enhanced.
  • In addition, by displaying the target special effect animation that drives the virtual anchor model in the live video picture, the head of the virtual anchor model can be kept in a stable playback state while the display content of the picture is enriched, so that the live picture is no longer monotonous; this solves the problem of abnormal display of the virtual anchor model that arises in traditional live broadcast scenes when the facial image of the real anchor cannot be matched.
  • the detection part 52 is further configured to: determine the first face orientation of the real anchor at the current moment when it is determined that the face of the real anchor faces the video capture device;
  • the first facial orientation determines the change information of the head posture of the real anchor; the change information is used to characterize the change information of the first facial orientation; the video image of each frame is determined based on the change information The head pose of the real anchor in .
  • the detection part 52 is further configured to: determine that the head posture of the real anchor changes from a non-specified posture to the specified posture when it is determined, according to the change information, that the first facial orientation increases to exceed a first threshold.
  • the detection part 52 is further configured to: determine that the head posture of the real anchor changes from the specified posture to a non-specified posture when it is determined, according to the change information, that the first facial orientation drops from exceeding the first threshold to below a second threshold, where the second threshold is smaller than the first threshold.
  • the detection part 52 is further configured to: when it is determined that the face of the real anchor is not facing the video capture device frontally, process the live video picture through a deep learning model to obtain the head posture of the real anchor, and determine whether the head of the real anchor is in the specified posture according to the head posture.
  • the detection part 52 is further configured to: acquire a target reference image frame, where the target reference image frame includes at least one of the following image frames: the N image frames preceding the live video picture in the video sequence to which the live video picture belongs, and the first M image frames in the video sequence to which the live video picture belongs, N and M being positive integers greater than zero; and process the live video picture and the target reference image frame through the deep learning model to obtain the head posture of the real anchor.
  • the detection part 52 is further configured to: perform feature point detection on the face of the real anchor in the video image to obtain a feature point detection result, where the feature point detection result is used to characterize the feature information of the facial feature points of the real anchor; determine a second facial orientation of the real anchor according to the feature point detection result, where the second facial orientation is used to characterize the orientation information of the face of the real anchor relative to the video capture device; and determine the head posture of the real anchor according to the second facial orientation.
  • the special effect adding part 53 is further configured to: determine the posture type of the head posture; determine the special effect animation matching the posture type, use the matching special effect animation as the target special effect animation displayed by driving the virtual anchor model, and display the target special effect animation in the live video picture.
  • the special effect adding part 53 is further configured to: determine the type information of each viewer watching the live broadcast of the virtual anchor model driven by the real anchor; determine the special effect animation matching the type information, use the matching special effect animation as the target special effect animation displayed by driving the virtual anchor model, and send the target special effect animation to the viewer-side terminal so that the target special effect animation is displayed on the viewer-side terminal.
  • the embodiment of the present disclosure also provides an electronic device 600; as shown in FIG. 14, which is a schematic structural diagram of the electronic device 600 provided in the embodiment of the present disclosure, the electronic device includes:
  • a processor 61, a memory 62, and a bus 63. The memory 62 is configured to store execution instructions and includes an internal memory 621 and an external memory 622; the internal memory 621 is configured to temporarily store computation data for the processor 61 and the data exchanged with the external memory 622 (such as a hard disk), and the processor 61 exchanges data with the external memory 622 through the internal memory 621. When the electronic device 600 runs, the processor 61 communicates with the memory 62 through the bus 63, so that the processor 61 executes the following instructions:
  • acquire multiple frames of video images of the real anchor during the live broadcast; detect the head posture of the real anchor in each frame of the video images; and when it is determined, according to the head postures corresponding to the multiple frames of video images, that the length of time the head of the real anchor has been in the specified posture meets the special effect triggering requirement, display the target special effect animation in the live video picture;
  • the live video picture shows a virtual anchor model driven by the real anchor.
  • Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the data display method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • Embodiments of the present disclosure also provide a computer program product carrying program code; the instructions included in the program code can be used to execute the steps of the data display method described in the foregoing method embodiments. For details, reference may be made to the foregoing method embodiments, which will not be repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • in an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK) and so on.
  • the parts described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional part in each embodiment of the present disclosure may be integrated into one processing unit, each part may exist separately physically, or two or more parts may be integrated into one part.
  • if the functions are implemented in the form of software functional parts and sold or used as independent products, they may be stored in a non-volatile computer-readable storage medium executable by a processor.
  • in essence, the technical solution of the present disclosure, or the part thereof that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • the method provided by the disclosed technical solution can improve the accuracy of the head posture, so as to obtain more accurate posture results.
  • the orientation information of the real anchor relative to the video capture device can be determined, for example, whether the real anchor is facing the video capture device frontally or is sideways to the video capture device.
  • since a complete facial image cannot be captured when the real anchor is sideways to the video capture device, the accuracy of the real anchor's head posture is affected in that case.
  • the accuracy of the real anchor's head posture can be improved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Provided in embodiments of the present disclosure are a data display method, an apparatus, an electronic device, a computer program, and a computer-readable storage medium. The method comprises: acquiring multiple frames of video images of a real anchor during a live broadcast; detecting the head posture of the real anchor in each frame of the video images; and when it is determined, according to the head postures corresponding to the multiple frames of video images, that the length of time the head of the real anchor has been in a specified posture satisfies a special effect triggering requirement, displaying a target special effect animation in the live video picture, wherein a virtual anchor model driven by the real anchor is displayed in the live video picture.

Description

Data display method, apparatus, electronic device, computer program, and computer-readable storage medium
Cross-Reference to Related Applications
The present disclosure is based on and claims priority to Chinese patent application No. 202110728854.1, filed on June 29, 2021 and entitled "Data display method, apparatus, electronic device, and computer-readable storage medium", the entire content of which is hereby incorporated into the present application by reference.
Technical Field
The present disclosure relates to the technical field of image processing, and in particular to a data display method, apparatus, electronic device, computer program, and computer-readable storage medium.
Background
With the development of network technology, real-time video communication such as webcasting has become an increasingly popular form of entertainment. During a live broadcast, the anchor is generally required to face the display screen of the anchor-side terminal, so as to enhance the interaction between the anchor and the audience. In some special cases, when the anchor's face disappears from the display screen, this not only affects the display of the animation special effects added for the anchor, but also degrades the experience of the audience watching the live video. In addition, when viewers leave the live broadcast room, the anchor's live broadcast experience and the popularity of the live broadcast are indirectly affected.
Summary
Embodiments of the present disclosure provide at least a data display method, apparatus, electronic device, computer program, and computer-readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a data display method, including: acquiring multiple frames of video images of a real anchor during a live broadcast; detecting the head posture of the real anchor in each frame of the video images; and when it is determined, according to the head postures corresponding to the multiple frames of video images, that the length of time the head of the real anchor has been in a specified posture meets a special effect triggering requirement, displaying a target special effect animation in the live video picture, where the live video picture shows a virtual anchor model driven by the real anchor.
In the field of virtual live broadcasting, when the head of the real anchor is detected to have been in a specified posture for a long time, the head of the virtual anchor model displayed in the live video picture may jitter, which degrades both the anchor's live broadcast experience and the viewing experience. In the technical solution of the present disclosure, displaying a virtual anchor model in the live video picture enhances the interest and interactivity of the live broadcast. Further, when it is determined that the length of time the head of the real anchor has been in the specified posture meets the special effect triggering requirement, the target special effect animation corresponding to driving the virtual anchor model is displayed in the live video picture. This keeps the head of the virtual anchor model in a stable playback state and enriches the display content of the live video picture, so that the picture is no longer monotonous, thereby solving the problem in traditional live broadcast scenes that the virtual anchor model is displayed abnormally when the facial picture of the real anchor cannot be matched.
In an optional implementation, detecting the head posture of the real anchor in each frame of the video images includes: when it is determined that the face of the real anchor is facing the video capture device frontally, determining a first facial orientation of the real anchor at the current moment; determining change information of the head posture of the real anchor according to the first facial orientation, where the change information is used to characterize the change of the first facial orientation; and determining the head posture of the real anchor in each frame of the video images based on the change information.
In the above implementation, the change information of the head posture of the real anchor is determined according to the first facial orientation of the real anchor at the current moment, and the head posture is then determined from that change information. This allows the temporal information in the video sequence (that is, adjacent video images) to be used to analyze how the head posture of the real anchor changes. Compared with determining the head posture from a single video frame, the method provided by the technical solution of the present disclosure improves the accuracy of the head posture and yields more accurate posture results.
In an optional implementation, determining the head posture of the real anchor in each frame of the video images based on the change information includes: when it is determined according to the change information that the first facial orientation increases to exceed a first threshold, determining that the head posture of the real anchor changes from a non-specified posture to the specified posture.
In an optional implementation, determining the head posture of the real anchor in each frame of the video images based on the change information includes: when it is determined according to the change information that the first facial orientation drops from exceeding the first threshold to below a second threshold, determining that the head posture of the real anchor changes from the specified posture to a non-specified posture, where the second threshold is smaller than the first threshold.
In the above implementation, by comparing the change information of the target angle with the first threshold and the second threshold, the head posture of the real anchor is determined through multi-threshold comparison. This improves the accuracy of the head posture of the real anchor and prevents the frequent changes of the head posture that a single-threshold solution would cause.
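The two-threshold logic above can be pictured with a short sketch. The following Python fragment is illustrative only: the angle variable, the threshold values, and the state representation are assumptions made for the example, not values prescribed by the disclosure.

```python
class HeadPostureState:
    """Two-threshold (hysteresis) tracking of the specified posture.
    Entering the specified posture requires the facial orientation to
    exceed the first threshold; leaving it requires the orientation to
    drop below the smaller second threshold, so jitter around a single
    threshold cannot flip the state back and forth."""

    def __init__(self, first_threshold=30.0, second_threshold=20.0):
        assert second_threshold < first_threshold
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.in_specified_posture = False

    def update(self, facial_orientation):
        """facial_orientation: the first facial orientation for the
        current frame, expressed here as an angle in degrees."""
        if not self.in_specified_posture and facial_orientation > self.first_threshold:
            self.in_specified_posture = True   # non-specified -> specified
        elif self.in_specified_posture and facial_orientation < self.second_threshold:
            self.in_specified_posture = False  # specified -> non-specified
        return self.in_specified_posture
```

With these example values, an orientation sequence such as 29, 31, 28, 31 would flip a single-threshold detector at every frame, whereas the state above changes only once, which is the frequent-change problem the two-threshold comparison is described as preventing.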
In an optional implementation, detecting the head posture of the real anchor in each frame of the video images includes: when it is determined that the face of the real anchor is not facing the video capture device frontally, processing the live video picture through a deep learning model to obtain the head posture of the real anchor, and determining, according to the head posture, whether the head of the real anchor is in the specified posture.
In the above implementation, when the face of the real anchor is turned sideways to the video capture device, the complete facial feature points cannot be shown in the live video picture. Since incomplete facial feature points affect the result of determining the head posture, performing posture estimation on the live video picture through a deep learning model to obtain the head posture of the real anchor improves the estimation accuracy of the head posture.
In an optional implementation, processing the live video picture through the deep learning model to obtain the head posture of the real anchor includes: acquiring a target reference image frame, where the target reference image frame includes at least one of the following image frames: the N image frames preceding the live video picture in the video sequence to which the live video picture belongs, and the first M image frames in the video sequence to which the live video picture belongs, N and M being positive integers greater than zero; and processing the live video picture and the target reference image frame through the deep learning model to obtain the head posture of the real anchor.
In the above implementation, the temporal information in the video sequence is combined to predict the head posture of the real anchor in the live video picture at the current moment. The head posture determined from the N image frames (or the M image frames) serves as guidance information for the live video picture to be processed at the current moment, guiding the deep learning model to predict the head posture of the real anchor in the current picture and thus obtain a more accurate head posture detection result.
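As a rough sketch of how such reference frames might be assembled into a model input, consider the following Python fragment. The function name, the defaults for N and M, and the plain-list representation of frames are assumptions made for illustration, and the deep learning model itself is deliberately left abstract.

```python
def gather_model_input(sequence, n=4, m=2):
    """Collects the current live frame plus its target reference frames:
    the N frames immediately preceding it in the video sequence and the
    first M frames of that sequence (N, M > 0).

    `sequence` holds the frames seen so far, oldest first; the current
    live frame is sequence[-1]."""
    current = sequence[-1]
    preceding_n = sequence[max(0, len(sequence) - 1 - n):-1]
    first_m = sequence[:m]
    # The reference frames carry the temporal guidance information; here
    # they are simply concatenated ahead of the current frame before being
    # handed to whatever pose-estimation network is in use.
    return first_m + preceding_n + [current]
```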
In an optional implementation, detecting the head posture of the real anchor in each frame of the video images includes: performing feature point detection on the face of the real anchor in the video image to obtain a feature point detection result, where the feature point detection result is used to characterize feature information of the facial feature points of the real anchor; determining a second facial orientation of the real anchor according to the feature point detection result, where the second facial orientation is used to characterize the orientation information of the face of the real anchor relative to the video capture device; and determining the head posture of the real anchor according to the second facial orientation.
In the above implementation, the second facial orientation of the real anchor is determined according to the feature point detection result of the face of the real anchor in the video image, so the orientation information of the real anchor relative to the video capture device can be determined; for example, the real anchor faces the video capture device frontally, or the real anchor is turned sideways to the video capture device. When the real anchor is sideways to the video capture device, a complete facial image cannot be captured, which affects the accuracy of the head posture. Determining the head posture separately for the frontal and non-frontal cases improves the accuracy of the head posture of the real anchor.
In an optional implementation, displaying the target special effect animation in the live video picture includes: determining the posture type of the head posture; determining a special effect animation matching the posture type, using the matching special effect animation as the target special effect animation displayed by driving the virtual anchor model, and displaying the target special effect animation in the live video picture.
In the above implementation, triggering different types of special effect animations according to different posture types of the head posture enriches the display content of the special effect animations, thereby making the live broadcast more interesting and providing users with a better live broadcast experience.
In an optional implementation, displaying the target special effect animation in the live video picture includes: determining type information of each viewer watching the live broadcast of the virtual anchor model driven by the real anchor; determining a special effect animation matching the type information, using the matching special effect animation as the target special effect animation displayed by driving the virtual anchor model, and sending the target special effect animation to the viewer-side terminal so that the target special effect animation is displayed on the viewer-side terminal.
In the above implementation, a matching target special effect animation is determined according to the type information of each viewer and displayed on the viewer-side terminal. This increases the probability that viewers continue to watch the live broadcast and reduces viewer loss, maintaining the popularity of the real anchor's live broadcast while adding interactive fun.
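A minimal sketch of this per-viewer matching is given below. The viewer type labels, the animation identifiers, and the `send_to_terminal` callback are all hypothetical names introduced for the example, since the disclosure does not fix a concrete viewer taxonomy or transport mechanism.

```python
# Hypothetical mapping from viewer type information to an animation id.
ANIMATION_BY_VIEWER_TYPE = {
    "new_viewer": "anim_welcome",
    "regular_viewer": "anim_please_stay",
    "subscriber": "anim_exclusive_dance",
}

def dispatch_target_animations(viewers, send_to_terminal):
    """Picks the special effect animation matching each viewer's type and
    pushes it to that viewer's terminal for display."""
    for viewer in viewers:
        animation = ANIMATION_BY_VIEWER_TYPE.get(viewer["type"], "anim_default")
        send_to_terminal(viewer["id"], animation)
```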
In a second aspect, an embodiment of the present disclosure provides a data display apparatus, including: an acquisition part configured to acquire multiple frames of video images of a real anchor during a live broadcast; a detection part configured to detect the head posture of the real anchor in each frame of the video images; and a special effect adding part configured to display a target special effect animation in the live video picture when it is determined, according to the head postures corresponding to the multiple frames of video images, that the length of time the head of the real anchor has been in a specified posture meets a special effect triggering requirement, where the live video picture shows a virtual anchor model driven by the real anchor.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the first aspect, or of any possible implementation of the first aspect, are performed.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the first aspect, or of any possible implementation of the first aspect, are performed.
An embodiment of the present disclosure provides a computer program including computer-readable code. When the computer-readable code runs in an electronic device, a processor in the electronic device implements the above method upon executing the code.
To make the above objects, features, and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings used in the embodiments are briefly introduced below. The accompanying drawings here are incorporated into and constitute a part of this specification; they show embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can be obtained from these drawings without creative effort.
FIG. 1 shows a first flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of the effect of a live video picture on the real anchor side provided by an embodiment of the present disclosure;
FIG. 3 shows a second flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 4 shows a first schematic diagram of the orientation information between a real anchor and a video capture device provided by an embodiment of the present disclosure;
FIG. 5 shows a second schematic diagram of the orientation information between a real anchor and a video capture device provided by an embodiment of the present disclosure;
FIG. 6 shows a third flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 7 shows a fourth flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 8 shows a fifth flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 9 shows a third schematic diagram of the orientation information between a real anchor and a video capture device provided by an embodiment of the present disclosure;
FIG. 10 shows a sixth flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 11 shows a seventh flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 12 shows an eighth flowchart of a data display method provided by an embodiment of the present disclosure;
FIG. 13 shows a schematic diagram of a data display apparatus provided by an embodiment of the present disclosure;
FIG. 14 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item is defined in one figure, it does not require further definition or explanation in subsequent figures.
The term "and/or" herein merely describes an association relationship and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the term "at least one" herein indicates any one of multiple items or any combination of at least two of multiple items; for example, including at least one of A, B, and C may indicate including any one or more elements selected from the set consisting of A, B, and C.
Research has found that, during a live broadcast, the anchor is generally required to face the display screen of the anchor-side terminal so as to enhance the interaction between the anchor and the audience. In some special cases, when the anchor's face disappears from the display screen, this not only affects the display of the animation special effects added for the anchor, but also degrades the experience of the audience watching the live video. In addition, when viewers leave the live broadcast room, the anchor's live broadcast experience and the popularity of the live broadcast are indirectly affected.
Based on the above research, the present disclosure provides a data display method. The technical solution provided by the present disclosure can be applied to virtual live broadcast scenes. A virtual live broadcast scene can be understood as one in which a preset virtual anchor model, such as a red panda, a little rabbit, or a cartoon character, is used in place of the actual image of the real anchor for the live broadcast; in this case, what is shown in the live video picture is the virtual anchor model. The virtual anchor model can also be used for interaction between the real anchor and the audience.
For example, the camera of the live broadcast device may capture video images containing the real anchor; the electronic device then captures the head of the real anchor contained in the video images to obtain the head posture of the real anchor. After determining the head posture, the electronic device can generate a corresponding driving signal, which is used to drive the virtual anchor model in the live video picture to perform an action corresponding to that of the real anchor, and the picture of the virtual anchor model performing the action is shown in the live video picture.
In an optional implementation, the real anchor may preset one or more virtual anchor models through the electronic device; for example, "character YYY in game XXX" may be preset as a virtual anchor model. In this way, when the virtual live broadcast at the current moment is started, one of the preset virtual anchor models can be selected as the virtual anchor model for the current moment. The virtual anchor model may be a 2D model or a 3D model.
In another optional implementation, in addition to determining the virtual anchor model for the real anchor in the manner described above, the electronic device may also, after acquiring the multiple frames of video images, rebuild a virtual anchor model for the real anchor in the video images.
For example, the electronic device may recognize the real anchor contained in the video images, and rebuild the virtual anchor model for the real anchor according to the recognition result. The recognition result may include at least one of the following: the gender of the real anchor, the appearance features of the real anchor, the clothing features of the real anchor, and the like.
Based on the recognition result, the electronic device may search the virtual anchor model library for a model matching the recognition result as the virtual anchor model of the real anchor. For example, if it is determined from the recognition result that the real anchor wears a peaked cap and hip-hop-style clothes during the live broadcast, the electronic device may search the virtual anchor model library for a virtual anchor model matching "peaked cap" or "hip-hop style" as the virtual anchor model of the real anchor.
In addition to searching the virtual anchor model library for a model matching the recognition result, the electronic device may also construct a corresponding virtual anchor model for the real anchor in real time through a model building module based on the recognition result.
Here, when constructing the virtual anchor model in real time, the electronic device may also use the virtual anchor models used in virtual live broadcasts initiated by the real anchor in the past as a reference to construct the virtual anchor model driven by the real anchor at the current moment.
Through the manner of determining the virtual anchor model described above, a corresponding virtual anchor model can be personalized for the real anchor, avoiding stereotyped virtual anchor models. At the same time, a personalized virtual anchor model can leave a deeper impression on the audience.
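One way to picture the library search described above is a simple tag-overlap match, sketched below. The library entries and tag names are invented for the example, and a production system would presumably match on richer attributes than flat tags.

```python
# Illustrative virtual anchor model library; each entry carries tags that
# can be compared against the recognition result of the real anchor.
MODEL_LIBRARY = [
    {"name": "red_panda", "tags": {"cute", "animal"}},
    {"name": "hiphop_character", "tags": {"peaked_cap", "hip-hop"}},
]

def match_virtual_model(recognition_tags, library=MODEL_LIBRARY):
    """Returns the library model sharing the most tags with the recognition
    result (gender, appearance, clothing features, ...); returns None when
    nothing overlaps, so the caller can build a model from scratch instead."""
    best = max(library, key=lambda model: len(model["tags"] & recognition_tags))
    return best if best["tags"] & recognition_tags else None
```

For instance, `match_virtual_model({"peaked_cap", "hip-hop"})` would pick the hip-hop character, while an empty overlap would fall through to real-time model construction.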
For the viewer side, what is shown in the live viewing interface on the viewer side is the animation of the virtual anchor model performing the corresponding actions. For the broadcast side, the live video picture on the broadcast side may show the virtual anchor model and may also show the video image containing the real anchor; for example, as shown in FIG. 2, the virtual anchor model may be shown on the left side of the live video picture, and the video image of the real anchor may be shown at position 21 in the lower right corner of the live video picture.
In the embodiments of the present disclosure, the target special effect animation contains multiple animation frames. When driving the virtual anchor model to perform a specified action, the electronic device can generate multiple animation frames and obtain the target special effect animation by combining them.
In the embodiments of the present disclosure, by displaying the target special effect animation corresponding to driving the virtual anchor model in the live video picture, the head of the virtual anchor model can be kept in a stable playback state, and the display content of the live video picture is enriched so that the picture is no longer monotonous, thereby solving the problem in traditional live broadcast scenes that the virtual anchor model is displayed abnormally when the facial picture of the real anchor cannot be matched.
To facilitate understanding of this embodiment, a data display method disclosed in the embodiments of the present disclosure is first introduced in detail. The execution subject of the data display method provided by the embodiments of the present disclosure is generally an electronic device with certain computing capability, such as a terminal device, a server, or another live broadcast device capable of supporting virtual live broadcasting. In some possible implementations, the data display method may be implemented by a processor invoking computer-readable instructions stored in a memory.
In the embodiments of the present disclosure, the data display method can be applied to any virtual live broadcast scene, such as a chat live broadcast scene or a game live broadcast scene, which is not specifically limited in the present disclosure.
Referring to FIG. 1, which is a flowchart of a data display method provided by an embodiment of the present disclosure, the method includes steps S101 to S105, where:
S101. Acquire multiple frames of video images of a real anchor during a live broadcast.
S103. Detect the head posture of the real anchor in each frame of the video images.
Here, the head posture can be used to characterize the angle between the plane corresponding to the face of the real anchor and the horizontal plane, and/or the angle between the plane corresponding to the face of the real anchor and the plane of the lens of the video capture device, and/or the angle between the plane corresponding to the face of the real anchor and the plane of the real-anchor-side terminal.
In the embodiments of the present disclosure, the posture of the head of the real anchor relative to the video capture device of the real-anchor-side terminal can be determined according to the head posture, for example, a head-raised posture, a head-lowered posture, or a level-gaze posture, where the level-gaze posture can be understood as a state in which the face of the real anchor is relatively parallel to the horizontal plane.
In the embodiments of the present disclosure, when the video image contains multiple real anchors, the head posture of each real anchor may be detected, or the head posture of a designated real anchor among the multiple real anchors may be detected, which is not specifically limited in the present disclosure.
S105. When it is determined, according to the head postures corresponding to the multiple frames of video images, that the length of time the head of the real anchor has been in a specified posture meets the special effect triggering requirement, display the target special effect animation in the live video picture, where the live video picture shows a virtual anchor model driven by the real anchor.
Here, the specified posture can be understood as the head posture of the real anchor when the face of the real anchor in the video image is in an invalid display state. For example, it may be the head posture when the face of the real anchor stays fixed for a long time, when the face of the real anchor disappears from the live video picture, when only part of the face of the real anchor is shown in the live video picture, or when the real anchor is not facing the video capture device frontally for a long time.
For example, the specified posture includes the following postures: a head-down posture, a head-up posture, a posture of lowering the head to the lower left, a posture of lowering the head to the lower right, a posture of raising the head to the upper left, and a posture of raising the head to the upper right, which are not enumerated one by one here.
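Steps S101 to S105 can be summarized in a small sketch that accumulates how long the head has stayed in a specified posture and fires the special effect once the duration requirement is met. The three-second requirement and the use of a monotonic clock are illustrative assumptions; the disclosure does not fix a concrete duration.

```python
import time

class SpecialEffectTrigger:
    """Fires the target special effect once the real anchor's head has
    been in a specified posture for long enough (steps S103/S105)."""

    def __init__(self, required_seconds=3.0):
        self.required_seconds = required_seconds
        self.entered_at = None
        self.triggered = False

    def on_frame(self, in_specified_posture, now=None):
        now = time.monotonic() if now is None else now
        if not in_specified_posture:
            # Head left the specified posture: reset the timer and effect.
            self.entered_at, self.triggered = None, False
            return False
        if self.entered_at is None:
            self.entered_at = now
        if not self.triggered and now - self.entered_at >= self.required_seconds:
            self.triggered = True  # display the target special effect animation
        return self.triggered
```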
Here, the target special effect animation can be understood as a special effect animation matching the specified posture. The special effect animations matching a specified posture may be the same as or different from each other. For example, for the head-down posture or the head-up posture, one or more matching special effect animations may be preset, with each special effect animation corresponding to a different special effect triggering requirement.
In the embodiments of the present disclosure, the target special effect animation may contain a model animation and, in addition, material special effects. The model animation may be the animation of driving specified limbs of the virtual anchor model to perform a corresponding action, for example, a finger-heart gesture, a greeting gesture, or a goodbye gesture. The material special effects may be preset dynamic or static sticker special effects. Here, a material special effect may be a special effect matching the model animation, or a special effect matching the specified posture of the real anchor.
When the material special effect matches the model animation, while the model animation is shown in the live video picture, the material special effect can also be shown at a designated display position in the live video picture; when switching to the next model animation, the material special effect corresponding to the next model action can be switched in and displayed in the live video picture.
When the material special effect matches the specified posture of the real anchor, once it is detected that the length of time the real anchor has been in the specified posture meets the special effect triggering requirement, the material special effect can be displayed continuously in the live video picture until it is detected that the head of the real anchor is no longer in the specified posture.
For example, in a virtual game live broadcast scene, the specified posture may be the real anchor keeping the head down for a long time, and the target special effect animation may contain a model animation and material special effects. Here, the model animation may contain an animation of the virtual anchor model making a "finger heart" and an animation of the virtual anchor model "greeting", and the material special effect may be a sticker special effect matching the model animation; for example, the sticker special effects may be "Hello" and a heart sticker.
In this way, when the real anchor keeps the head down for a long time, the "greeting" animation and the "finger heart" animation can be displayed in a loop in the live video picture in turn, until it is detected that the head of the real anchor is no longer in the specified posture.
When the "greeting" animation is shown in the live video picture, the "Hello" sticker special effect can be shown in the picture at the same time; when the "finger heart" animation is shown, the heart sticker special effect can be shown at the same time.
By setting the target special effect animation to contain both model animations and material special effects, the content shown in the live video picture can be enriched, improving the user's live broadcast experience.
In an optional implementation, displaying the target special effect animation in the live video picture includes the following steps:
When it is detected that the length of time the head of the real anchor has been in the specified posture meets the special effect triggering requirement, the target special effect animation may be requested from the server. The target special effect animation is then shown in the live video picture of the live broadcast device on the real anchor side, and the video stream corresponding to the target special effect animation is pushed to the device on the viewer side, so that the target special effect animation is played on the live viewing interface of the viewer-side device.
In the embodiments of the present disclosure, the number of target special effect animations may be one or more. For example, multiple target special effect animations may be set to play in a loop until it is detected that the head of the real anchor is no longer in the specified posture; alternatively, a single target special effect animation may be set to play in a loop until it is detected that the head of the real anchor is no longer in the specified posture.
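This looping behaviour could be expressed as below; `still_in_specified_posture` and `play` are placeholder callables standing in for the posture check and the rendering step, neither of which is specified by the disclosure.

```python
import itertools

def loop_target_animations(animations, still_in_specified_posture, play):
    """Plays one or more target special effect animations in a loop until
    the head of the real anchor is no longer in the specified posture."""
    for animation in itertools.cycle(animations):
        if not still_in_specified_posture():
            break
        play(animation)
```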
For example, for a game live broadcast scene, the virtual anchor model and the real-time game picture can be shown simultaneously in the live video picture; for instance, the real-time game picture can be shown on the left side of the live video picture and the virtual anchor model on the right side. When the time the real anchor's head has been in the head-down posture meets the special effect triggering requirement, the target special effect animation can be determined; for example, it may be a special effect animation of the virtual anchor model dancing, or a special effect animation of the virtual anchor model reminding the audience "Please wait a moment, the excitement will continue shortly".
In the embodiments of the present disclosure, a database containing mapping relationships may be created in advance. The database stores multiple special effect animations and contains mapping relationships used to characterize the mapping between each specified posture and the special effect animations, and/or the mapping between the special effect triggering requirement corresponding to each specified posture and the special effect animations.
Before the target special effect animation is shown in the live video picture, the special effect animation having a mapping relationship with the specified posture and the special effect triggering requirement can be looked up in the database according to the mapping relationships, and the target special effect animation is determined based on the found special effect animation.
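An in-memory stand-in for such a mapping database might look like the following; the posture names, durations, and animation identifiers are purely illustrative assumptions.

```python
# Each specified posture maps to (trigger requirement in seconds, animation).
EFFECT_MAPPING = {
    "head_down": [(3.0, "anim_please_wait"), (10.0, "anim_dance")],
    "head_up": [(3.0, "anim_hello")],
}

def find_target_animation(posture, seconds_in_posture):
    """Returns the animation whose trigger requirement is satisfied by the
    time already spent in the posture, preferring the longest-duration
    requirement met; returns None when no requirement is satisfied."""
    satisfied = [(need, anim)
                 for need, anim in EFFECT_MAPPING.get(posture, [])
                 if seconds_in_posture >= need]
    return max(satisfied)[1] if satisfied else None
```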
For the above step S101, after the live broadcast start instruction of the real anchor is detected, the live video of the real anchor during the live broadcast starts to be collected, where the live video contains multiple frames of video images.
After the multiple frames of video images are collected, step S103 is performed to detect the head posture of the real anchor in each frame of the video images; as shown in FIG. 3, this may include the following steps:
S1031. Perform feature point detection on the face of the real anchor in the video image to obtain a feature point detection result, where the feature point detection result is used to characterize the feature information of the facial feature points of the real anchor.
S1032. Determine a second facial orientation of the real anchor according to the feature point detection result, where the second facial orientation is used to characterize the orientation information of the face of the real anchor relative to the video capture device.
S1033. Determine the head posture of the real anchor according to the second facial orientation.
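A deliberately simplified, self-contained sketch of steps S1031 to S1033 follows. It estimates a yaw-like angle from the horizontal asymmetry of three landmarks instead of running a real 84-point detector and 3D pose solver, so everything here, from the landmark inputs to the 30-degree cut-off, is an assumption made for illustration.

```python
import math

def second_facial_orientation(left_eye, right_eye, nose_tip):
    """S1032: a crude facial-orientation estimate from S1031's feature
    points. A frontal face has the nose tip roughly centred between the
    eyes; the more the face turns, the larger the horizontal asymmetry.
    All points are (x, y) pixel coordinates."""
    d_left = abs(nose_tip[0] - left_eye[0])
    d_right = abs(right_eye[0] - nose_tip[0])
    asymmetry = (d_right - d_left) / max(d_left + d_right, 1e-6)
    return math.degrees(math.asin(max(-1.0, min(1.0, asymmetry))))

def head_posture(orientation_deg, specified_threshold=30.0):
    """S1033: classify the head posture from the second facial orientation."""
    return "frontal" if abs(orientation_deg) <= specified_threshold else "turned"
```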
For each frame of video image, feature point detection can be performed on the face of the real anchor in the video image through a face detection network model, so as to obtain the feature information of the facial feature points of the real anchor.
Here, the feature points can be understood as the feature points of the facial features of the real anchor, and the number of feature points can be set according to actual needs; in general, 84 facial feature points may be chosen. The feature information of the feature points can be understood as the number of feature points, the labels of the feature points, the classification information of each feature point (for example, whether it belongs to the eye feature points, the mouth feature points, or the nose feature points), and the feature value corresponding to each feature point.
It should be noted that the number of feature points can affect the accuracy of the determined head posture of the real anchor; for example, the more feature points, the higher the accuracy of the computed head posture, and vice versa. Therefore, the number of feature points can be adjusted dynamically according to the remaining device memory of the real-anchor-side terminal. For example, when the remaining memory of the real-anchor-side terminal is greater than a preset threshold, a feature point detection result with a larger number of feature points can be chosen, so that the face orientation of the real anchor is determined from that feature point detection result.
By dynamically setting the number of feature points, a more accurate facial orientation can be obtained when the memory of the real-anchor-side terminal meets the computing requirement, thereby improving the accuracy of the head posture.
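The memory-based adjustment reduces to a tiny selection rule, sketched below; the 512 MB threshold and the reduced 34-point count are invented for the example, while the 84-point figure is the count mentioned above.

```python
def choose_feature_point_count(free_memory_mb, memory_threshold_mb=512):
    """More feature points give a more accurate head posture but cost more
    to compute and store, so fall back to a smaller set when the remaining
    memory of the anchor-side terminal is below the preset threshold."""
    return 84 if free_memory_mb > memory_threshold_mb else 34
```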
After feature point detection is performed on the real anchor's face and the feature point detection result is obtained, the face orientation of the real anchor (that is, the above-mentioned second facial orientation) can be determined from the detection result.
In an optional implementation, the feature point detection result can be input into a neural network model, which processes the result to obtain the face orientation of the real anchor (that is, the second facial orientation).
In another optional implementation, the classification information of the feature points contained in the detection result is examined. If, according to the classification information, the detected feature points do not cover all of the facial feature points, it can be determined that the real anchor is facing the video capture device sideways. If the detected feature points cover all of the facial features, it can be determined that the real anchor is frontally facing the video capture device.
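The second implementation above can be sketched as a simple completeness check over the landmark classes; the class labels and the detection-result layout are assumptions for illustration, not taken from the disclosure:

    REQUIRED_CLASSES = {"eye", "nose", "mouth"}  # assumed class labels

    def is_frontal(detection_result: list[dict]) -> bool:
        """Return True when every facial-feature class is present in the
        feature point detection result (frontal facing), and False when
        some classes are missing (sideways facing)."""
        detected_classes = {point["class"] for point in detection_result}
        return REQUIRED_CLASSES.issubset(detected_classes)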
Here, the second facial orientation characterizes the orientation information of the real anchor's face relative to the video capture device; this orientation information can be understood as the angle and distance of the real anchor's face relative to the video capture device of the real anchor's terminal.
FIG. 4 and FIG. 5 show the angle between the real anchor's face and the video capture device.
As shown in FIG. 4, the video capture device is mounted on the real anchor's terminal. When the angle between the horizontal plane of the real anchor's face and the X-axis of the coordinate system of the video capture device is less than or equal to a specified threshold, it is determined that the real anchor's face is frontally facing the video capture device.
As shown in FIG. 5, the video capture device is mounted on the real anchor's terminal. When the angle between the horizontal plane of the real anchor's face and the X-axis of the coordinate system of the video capture device is greater than the specified threshold, it is determined that the real anchor's face is facing the video capture device sideways.
In the embodiment of the present disclosure, the specified threshold may be set to any value between 0 and 30 degrees, which is not specifically limited here.
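The comparison of FIG. 4 and FIG. 5 can be expressed directly; the threshold value here is one example choice from the 0 to 30 degree range above:

    SPECIFIED_THRESHOLD_DEG = 30.0  # any value in [0, 30] per the text

    def is_frontally_facing(face_plane_angle_deg: float) -> bool:
        """FIG. 4 case when the angle between the face's horizontal plane
        and the capture device's X-axis is at most the threshold;
        FIG. 5 case (sideways) otherwise."""
        return abs(face_plane_angle_deg) <= SPECIFIED_THRESHOLD_DEG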
After the face orientation is determined, it can be used to determine whether the real anchor's face is frontally facing the video capture device.
When it is determined that the real anchor's face is frontally facing the video capture device, the head pose of the real anchor is determined by threshold comparison. Here, threshold comparison can be understood as comparing the change information of the real anchor's head pose with preset thresholds to determine whether the head pose is a specified pose. When it is determined that the real anchor's face is facing the video capture device sideways, the head pose of the real anchor is determined through a neural network model.
In the above implementation, by determining the second facial orientation of the real anchor from the result of feature point detection on the real anchor's face in the video image, the orientation information of the real anchor relative to the video capture device can be determined, for example, whether the real anchor is frontally facing the video capture device or facing it sideways. When the real anchor faces the video capture device sideways, a complete facial image cannot be captured, which affects the accuracy of the real anchor's head pose. By determining the head pose separately for the frontal and non-frontal (for example, sideways) cases, the accuracy of the real anchor's head pose can be improved.
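Putting the two branches together, a hedged dispatch sketch (pose_from_thresholds and pose_model are hypothetical placeholders for the threshold routine of Case 1 and the neural network model of Case 2 below):

    def estimate_head_pose(frame, detection_result, history, pose_model):
        """Frontal faces go through threshold comparison on the orientation
        history; sideways faces go through the neural network model."""
        if is_frontal(detection_result):          # from the earlier sketch
            return pose_from_thresholds(history)  # hypothetical; see Case 1
        return pose_model.predict(frame)          # neural branch; see Case 2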
The two cases, frontal facing and sideways facing, are described in detail below.
Case 1: the real anchor's face is frontally facing the video capture device.
In this case, as shown in FIG. 6, step S103 of detecting the head pose of the real anchor in each frame of the video image includes the following steps:
S11. When it is determined that the real anchor's face is frontally facing the video capture device, determine the first facial orientation of the real anchor at the current moment.
S12. Determine the change information of the real anchor's head pose according to the first facial orientation, where the change information characterizes the change of the first facial orientation.
S13. Determine the head pose of the real anchor in each frame of the video image based on the change information.
In the embodiment of the present disclosure, if it is determined that the real anchor's face is frontally facing the video capture device, historical facial orientations can be obtained, where a historical facial orientation is the face orientation of the real anchor determined from video images captured at multiple historical moments before the current moment; it can characterize the historical angle between the plane of the real anchor's face and the horizontal plane at each historical moment.
After the historical facial orientations are obtained, they can be combined with the first facial orientation determined at the current moment to determine the change information of the real anchor's head pose; that is, the change information of the first facial orientation is determined from the historical angles and the angle between the plane of the face and the horizontal plane at the current moment.
Here, the first facial orientation characterizes the degree of inclination of the real anchor's face relative to the imaging plane of the video capture device. For example, the first facial orientation may be the angle between the real anchor's face and the horizontal plane, or the angle between the real anchor's face and the imaging plane of the video capture device. Other angles capable of characterizing this inclination may also be used.
Here, the change information can be understood as trend information such as whether the first facial orientation is gradually increasing, and by what amount, or gradually decreasing, and by what amount.
It should be noted that the historical facial orientations are the face orientations determined from the video images corresponding to multiple consecutive historical moments. For example, if the current moment is moment k, the historical moments may be moments k-n through k-1, and the historical facial orientations are those determined from the video images captured at moments k-n through k-1.
In the embodiment of the present disclosure, when determining the head pose of the real anchor in each frame of video image according to the change information, the change information can be compared with threshold transition intervals, where the threshold transition intervals are multiple transition intervals determined from multiple thresholds. The change process of the real anchor's head pose can be determined through these intervals, and the head pose of the real anchor at the current moment can then be determined from that change process.
In the above implementation, by determining the change information of the real anchor's head pose from the first facial orientation at the current moment and the historical facial orientations at historical moments, and then determining the head pose from that change information, the temporal information in the video sequence (that is, adjacent video images) can be used to analyze the changes of the real anchor's head pose. Compared with determining the head pose from a single video frame, the method provided by the technical solution of the present disclosure can improve the accuracy of the head pose and thus yield more accurate pose results.
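One way to realize the change information over moments k-n through k-1, assuming a fixed window size n and per-frame angles in degrees (both assumptions; the disclosure does not fix them):

    from collections import deque

    class OrientationHistory:
        """Sliding window of first-facial-orientation angles for the
        historical moments k-n .. k-1 plus the current moment k."""

        def __init__(self, window: int = 10):  # window size n is assumed
            self._angles = deque(maxlen=window)

        def update(self, angle_deg: float) -> float | None:
            """Append the current angle and return the change relative to
            the previous moment (> 0 increasing, < 0 decreasing,
            None when there is no history yet)."""
            delta = angle_deg - self._angles[-1] if self._angles else None
            self._angles.append(angle_deg)
            return delta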
In an optional implementation, as shown in FIG. 7, the above step S13 of determining the head pose of the real anchor in each frame of the video image based on the change information can be implemented by performing S13-1 or S13-2, as follows:
S13-1. When it is determined according to the change information that the first facial orientation has increased to exceed a first threshold, determine that the head pose of the real anchor has changed from a non-specified pose to the specified pose.
In some embodiments, for S13-1, the first threshold can be set according to the angle range of the first facial orientation defined for the specified pose in the actual live-streaming scene. When it is determined from the change information that the target angle of the first facial orientation is gradually increasing, and the first facial orientation increases from below the first threshold to above it, it is determined that the head pose of the real anchor has changed to the specified pose.
Illustratively, the first threshold can be set to any value in [27, 33]; for example, it can be set to 30. When it is determined from the change information that the first facial orientation has increased to exceed 30 degrees, it is determined that the head pose of the real anchor has changed to the specified pose. The embodiment of the present disclosure does not limit the specific value of the first threshold.
Here, after it is determined that the first facial orientation has increased beyond the first threshold, head pose detection can continue on the captured video images. When it is detected that the first facial orientation, after exceeding the first threshold, continues to increase beyond a threshold A1, it is determined that the specified pose of the real anchor (for example, a head-down or head-up pose) is too severe. In this case, a pose adjustment prompt can be sent to the real anchor to prompt the real anchor to adjust the current head pose.
Here, the threshold A1 may be any of multiple thresholds greater than the first threshold; for example, A1 may be 50 degrees, or 60 degrees, 70 degrees, and so on. It can be understood that A1 can be selected as any value in [30, 90], which is not specifically limited in the present disclosure.
S13-2. When it is determined according to the change information that the first facial orientation has decreased from above the first threshold to below a second threshold, determine that the head pose of the real anchor has changed from the specified pose to a non-specified pose, where the second threshold is smaller than the first threshold.
In some embodiments, for S13-2, the first threshold can be set according to the angle range of the first facial orientation defined for the specified pose in the actual live-streaming scene, and the second threshold according to the angle range defined for the non-specified pose. Illustratively, the first threshold can be set to any value in [27, 33], for example 30, and the second threshold to any value in [17, 23], for example 20. When the change information shows a decrease from above the first threshold to below the second threshold, it is determined that the head pose of the real anchor has changed to a non-specified pose.
The above S13-1 and S13-2 are illustrated below with an actual scene; the process is described as follows:
A real anchor M streams on a live-streaming platform through the real anchor's terminal. After the real anchor M opens the live room, video images begin to be captured, and the head pose of the real anchor is determined in the manner described above.
Assume the target angle between the real anchor's face and the imaging plane of the video capture device (that is, the first facial orientation) is alpha. If the change information shows alpha gradually increasing: when alpha increases from 0 to above 20 degrees but has not reached 30 degrees, it is determined that the real anchor is not in a head-down or head-up pose; when alpha increases beyond 30 degrees, it is determined that the real anchor is in a head-down or head-up pose. Conversely, when alpha gradually decreases from above 30 degrees into the interval between 20 and 30 degrees, it is determined that the real anchor is still in a head-down or head-up pose, until alpha continues to decrease below 20 degrees, at which point it is determined that the real anchor is no longer in a head-down or head-up pose.
In an optional head-down detection solution, a single threshold is preset, and whether the real anchor is in the specified pose is determined by comparing the angle between the real anchor's face orientation and the horizontal plane with that threshold. However, when the real anchor nods, the target angle may frequently cross above and below the threshold. Since nodding is not the specified pose, a single-threshold detection technique may misrecognize the specified pose of the real anchor and thus wrongly trigger the corresponding special effect animation, giving the real anchor and the audience a poor live-streaming experience.
In the technical solution of the present disclosure, by comparing the change information of the target angle with the first threshold and the second threshold, the head pose of the real anchor can be determined through multi-threshold comparison, improving the accuracy of the real anchor's head pose and preventing the frequent head-pose flips caused by single-threshold solutions.
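The double-threshold logic of S13-1 and S13-2, including the severity threshold A1, can be sketched as a small state machine; the numeric values are the examples given above:

    FIRST_THRESHOLD_DEG = 30.0     # enter the specified pose (from [27, 33])
    SECOND_THRESHOLD_DEG = 20.0    # leave the specified pose (from [17, 23])
    SEVERITY_THRESHOLD_DEG = 50.0  # threshold A1 (any value in [30, 90])

    def update_specified_pose(in_pose: bool, alpha_deg: float) -> tuple[bool, bool]:
        """Return (new pose state, whether to prompt a pose adjustment)."""
        if not in_pose and alpha_deg > FIRST_THRESHOLD_DEG:
            in_pose = True       # S13-1: non-specified -> specified
        elif in_pose and alpha_deg < SECOND_THRESHOLD_DEG:
            in_pose = False      # S13-2: specified -> non-specified
        # Between the two thresholds the previous state is kept, so the
        # brief angle swings of a nod do not flip the result back and forth.
        prompt = in_pose and alpha_deg > SEVERITY_THRESHOLD_DEG
        return in_pose, prompt

With these example values, alpha rising from 0 to 25 degrees leaves the state unchanged, rising past 30 degrees enters the specified pose, and only falling below 20 degrees leaves it, which matches the worked scenario above.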
Case 2: the real anchor's face is not frontally facing the video capture device (for example, it is facing sideways).
In this case, as shown in FIG. 8, step S103 of detecting the head pose of the real anchor in each frame of the video image can be implemented by performing S21 and S22, as follows:
S21. When it is determined that the real anchor's face is not frontally facing the video capture device, process the live video picture through a deep learning model to obtain the head pose of the real anchor.
S22. Determine whether the real anchor's head is in the specified pose according to the head pose.
In the embodiment of the present disclosure, when it is detected that the real anchor's face is not frontally facing the video capture device, the live video picture can be input into a deep learning model, which processes it to obtain the head pose of the real anchor.
Before the live video picture is input into the deep learning model, the model needs to be trained. Specifically, images of multiple real anchors at various angles relative to the video capture picture can be collected and input into the deep learning model for training; the trained model is then used to analyze the live video picture to obtain the head pose of the real anchor.
In an optional implementation, the output data of the deep learning model may be a vector indicating at least one of the following: whether the head is in the specified pose, the pose type of the specified pose (for example, a head-down or head-up pose), the estimated angle between the real anchor's face orientation and the horizontal plane, and the orientation information of the real anchor's face relative to the video capture device.
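A possible decoded layout of that output vector, with field names chosen for illustration only:

    from dataclasses import dataclass

    @dataclass
    class HeadPoseOutput:
        """Decoded form of the model's output vector described above."""
        is_specified_pose: bool    # whether the head is in the specified pose
        pose_type: str             # e.g. "head_down" or "head_up"
        angle_estimate_deg: float  # estimated face-to-horizontal angle
        orientation: str           # face orientation relative to the camera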
In the embodiment of the present disclosure, when it is determined from the output data of the deep learning model that the real anchor's head is in the specified pose, and the specified pose meets the special-effect trigger requirements, the target special effect animation is displayed in the live video picture.
When it is determined from the output data of the deep learning model that the real anchor's head is in a non-specified pose and the real anchor's face is facing the video capture device sideways, prompt information can be generated for the real anchor, prompting the real anchor to move the video capture device so that the real anchor's face can frontally face it.
For example, as shown in FIG. 9, the video capture device is set up separately from the real anchor's terminal and placed on the left side of the terminal. When the real anchor faces the display screen of the terminal, the live video picture captured by the video capture device contains the left side of the real anchor's face. When it is detected that the real anchor is frontally facing the display screen of the terminal but facing the video capture device sideways, it is determined that the special-effect trigger condition is not met, and prompt information needs to be generated for the real anchor to prompt an adjustment of the position of the video capture device.
In the above implementation, when the real anchor's face is facing the video capture device sideways, the complete set of facial feature points cannot be shown in the live video picture. Since incomplete facial feature points affect the result of head pose determination, pose estimation is instead performed on the live video picture through the deep learning model to obtain the head pose of the real anchor, which can improve the estimation accuracy of the real anchor's head pose.
In an optional implementation, as shown in FIG. 10, the above S21 can be implemented by performing S21-1 and S21-2, as follows:
S21-1. When it is determined that the real anchor's face is not frontally facing the video capture device, acquire target reference image frames, where the target reference image frames include at least one of the following: the N image frames preceding the live video picture in the video sequence to which it belongs, and the first M image frames of that video sequence, N and M being positive integers greater than zero.
S21-2. Process the live video picture and the target reference image frames through the deep learning model to obtain the head pose of the real anchor.
In the embodiment of the present disclosure, to further improve the accuracy of the real anchor's head pose, the electronic device can use the deep learning model together with the temporal information of the video sequence during the live stream to determine the head pose of the real anchor at the current moment.
In an optional implementation, the N image frames preceding the live video picture corresponding to the current moment can be determined in the video sequence. The acquired N image frames, the output data corresponding to each of them, and the live video picture captured at the current moment are then input into the deep learning model for processing, so as to obtain the head pose of the real anchor.
Here, since the real anchor's head movements do not change very frequently during the live stream, the head poses of the real anchor corresponding to adjacent live video pictures in the video sequence may be the same. In this case, the head pose of the real anchor in the current live video picture can be predicted by combining the temporal information in the video sequence: the head pose determined from the N image frames serves as guidance information for the live video picture to be processed at the current moment, guiding the deep learning model to predict the head pose of the real anchor in the current picture and obtain a more accurate detection result.
In another optional implementation, the first M image frames of the video sequence can be determined. The acquired M image frames, the output data corresponding to each of them, and the live video picture captured at the current moment are then input into the deep learning model for processing, so as to obtain the head pose of the real anchor.
Here, when the real anchor starts the live stream, the real anchor's face is frontally facing the video capture device in order to set up and test the anchor-side equipment. Therefore, when predicting for the live video picture to be processed at the current moment, the M image frames, the output data corresponding to each of them, and the live video picture captured at the current moment can be input into the deep learning model for processing, so as to obtain the head pose of the real anchor.
Since the M image frames can be understood as frames captured while the real anchor's face was frontally facing the video capture device, they may contain the real anchor's complete face. In this way, the deep learning model can compare the picture of the real anchor in the current live video picture with the picture of the real anchor in the M image frames, guiding the model to predict the head pose of the real anchor in the current picture and obtain a more accurate detection result.
In yet another optional implementation, both the N image frames preceding the current live video picture and the first M image frames of the video sequence can be determined. The acquired N and M image frames, the output data corresponding to each image frame, and the live video picture captured at the current moment are then input into the deep learning model for processing, so as to obtain the head pose of the real anchor.
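Assembling the model input for the three variants above might look as follows; the list-based frame store, the default N and M, and the dictionary layout are assumptions for illustration:

    def build_model_input(frames: list, outputs: list, k: int,
                          n: int = 5, m: int = 3,
                          use_previous: bool = True,
                          use_first: bool = True) -> dict:
        """Collect the current picture plus the chosen reference frames:
        the N frames before moment k and/or the first M frames of the
        sequence, each paired with its earlier model output."""
        refs, ref_outputs = [], []
        if use_previous:
            refs += frames[max(0, k - n):k]
            ref_outputs += outputs[max(0, k - n):k]
        if use_first:
            refs += frames[:m]
            ref_outputs += outputs[:m]
        return {"current": frames[k],
                "references": refs,
                "reference_outputs": ref_outputs}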
In the embodiment of the present disclosure, once the head pose of the real anchor in the video images has been detected in the manner described above, the target special effect animation is displayed in the live video picture when it is determined, from the head poses corresponding to multiple frames of video images, that the length of time the real anchor's head has been in the specified pose meets the special-effect trigger requirements.
In an optional implementation, the target special effect animation can also be displayed in the live video picture when the specified pose meets at least one of the following special-effect trigger requirements (a combined check is sketched after this list):
the number of times the real anchor's head has been in the specified pose meets the special-effect trigger requirements;
the state type of the specified pose of the real anchor's head meets the special-effect trigger requirements;
the position of the real anchor's head in the video image while in the specified pose meets the special-effect trigger requirements.
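The sketch below combines the duration requirement with the optional requirements just listed; every threshold and the centered-region check are assumptions for illustration, since the disclosure does not fix concrete values:

    def should_trigger(duration_s: float, pose_count: int, pose_type: str,
                       head_box: tuple, frame_size: tuple) -> bool:
        """Trigger the target effect only when duration, count, state type,
        and head position all meet their (assumed) requirements."""
        x, y, w, h = head_box
        frame_w, frame_h = frame_size
        centered = (frame_w * 0.2 <= x and x + w <= frame_w * 0.8
                    and y + h <= frame_h * 0.9)  # assumed position region
        return (duration_s >= 3.0        # assumed minimum duration
                and pose_count >= 2      # assumed minimum repeat count
                and pose_type in ("head_down", "head_up")
                and centered)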
In the above implementation, by setting multiple special-effect trigger requirements, the ways in which special effect animations are displayed can be enriched, providing a richer interactive experience for the real anchor and the audience.
In an optional implementation, based on FIG. 1 and as shown in FIG. 11, the above step S105 of displaying the target special effect animation in the live video picture includes the following steps:
S1051. Determine the pose type of the head pose.
S1052. Determine the special effect animation matching the pose type, use the matched special effect animation as the target special effect animation displayed by driving the virtual anchor model, and display the target special effect animation in the live video picture.
In the embodiment of the present disclosure, different special effect animations are set for head poses of different pose types. After the pose type of the head pose is determined, the model animation and/or material special effects matching that pose type can be looked up in a data table; the found model animation and/or material special effects are used as the target special effect animation displayed by driving the virtual anchor model, and the target special effect animation is displayed on the live video picture.
It can be understood that the target special effect animation may be a single special effect animation or multiple special effect animations. When there is one target special effect animation, it can be played in a loop in the video sequence corresponding to the live video picture. When there are multiple target special effect animations, each of them can be played in turn in the video sequence corresponding to the live video picture.
When a material special effect matches a model animation, the material special effect can follow the corresponding model animation and be played in turn, in a loop, in the live video picture. When a material special effect matches the specified pose, it can be played in a loop in the live video picture without following any model animation.
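The data-table lookup of S1052 can be sketched as a simple mapping; the pose types and animation names are illustrative assumptions:

    EFFECTS_BY_POSE_TYPE = {
        "head_down": ["nap_model_animation", "zzz_material_effect"],
        "head_up": ["stargazing_model_animation"],
    }

    def target_effects(pose_type: str) -> list[str]:
        """Look up the model animations and/or material effects matched to
        the pose type; one effect loops, several play in turn."""
        return EFFECTS_BY_POSE_TYPE.get(pose_type, [])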
In the above implementation, triggering different types of special effect animations according to different head pose types can enrich the content displayed in the live video picture, making the virtual live stream more entertaining and providing users with a better live-streaming experience.
In an optional implementation, based on FIG. 1 or FIG. 11, displaying the target special effect animation in the live video picture in the above step S105 or S1052 may also, as shown in FIG. 12, include the following steps:
S31. Determine the type information of each viewer watching the live stream of the virtual anchor model driven by the real anchor.
S33. Determine the special effect animation matching the type information, use the matched special effect animation as the target special effect animation displayed by driving the virtual anchor model, and send the target special effect animation to the viewer's terminal so as to display it on the viewer's terminal.
In the embodiment of the present disclosure, the display of different types of special effect animations can be triggered for different viewers. First, the type information of each viewer can be determined; the type information may include at least one of the following: gender, age, region, occupation, hobbies, and level.
After the above type information is obtained, the special effect animation matching the type information can be looked up in a database as the target special effect animation. The target special effect animation is then sent to the viewer's terminal so that it is played on the live video picture displayed on the viewer's terminal.
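Matching per-viewer effects might be sketched as below; the type fields mirror the list above, while the matching rules and animation names are invented purely for illustration:

    def effect_for_viewer(viewer: dict) -> str:
        """Pick the target effect from the viewer's type information."""
        if viewer.get("age", 0) < 18:
            return "cartoon_wait_animation"
        if "games" in viewer.get("hobbies", ()):
            return "game_style_wait_animation"
        if viewer.get("level", 0) >= 10:
            return "vip_wait_animation"
        return "default_wait_animation"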
For example, a real anchor may keep the head down for a long time during a live stream. While the real anchor's head is down, the real anchor's facial expressions cannot be captured, so the virtual anchor model cannot be displayed normally in the live video picture. If a viewer enters the live room and sees a virtual anchor model that is not displaying normally, the viewing experience suffers and the viewer may leave the room. In the above situation where the virtual anchor model cannot be displayed normally because of the real anchor's head pose, the data display method of the embodiment of the present application can display a corresponding special effect animation for the viewer, for example: "The anchor is in a co-streaming session, please don't leave." This increases the probability that viewers keep watching the stream, reduces viewer churn, and, while maintaining the popularity of the real anchor's stream, also adds interactive fun.
Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure also provide a data display apparatus corresponding to the data display method. Since the problem-solving principle of the apparatus in the embodiments of the present disclosure is similar to that of the above data display method, the implementation of the apparatus can refer to the implementation of the method, and repeated descriptions are omitted.
Referring to FIG. 13, which is a schematic diagram of a data display apparatus provided by an embodiment of the present disclosure, the apparatus includes an acquisition part 51, a detection part 52, and a special-effect adding part 53, where:
the acquisition part 51 is configured to acquire multiple frames of video images of the real anchor during the live stream;
the detection part 52 is configured to detect the head pose of the real anchor in each frame of the video images;
the special-effect adding part 53 is configured to display the target special effect animation in the live video picture when it is determined, according to the head poses corresponding to the multiple frames of video images, that the length of time the real anchor's head has been in a specified pose meets the special-effect trigger requirements; the live video picture displays the virtual anchor model driven by the real anchor.
In the technical solution of the present disclosure, displaying the virtual anchor model in the live video picture can make the live stream more entertaining and interactive. Further, when it is determined that the length of time the real anchor's head has been in the specified pose meets the special-effect trigger requirements, the target special effect animation corresponding to driving the virtual anchor model can be displayed in the live video picture, ensuring that the head of the virtual anchor model remains in a stable playback state while also enriching the content displayed in the live video picture, so that the picture is no longer overly monotonous. This solves the problem in traditional live-streaming scenes where the virtual anchor model is displayed abnormally when the real anchor's facial picture cannot be matched.
In a possible implementation, the detection part 52 is further configured to: when it is determined that the real anchor's face is frontally facing the video capture device, determine the first facial orientation of the real anchor at the current moment; determine the change information of the real anchor's head pose according to the first facial orientation, where the change information characterizes the change of the first facial orientation; and determine the head pose of the real anchor in each frame of the video images based on the change information.
In a possible implementation, the detection part 52 is further configured to: when it is determined according to the change information that the first facial orientation has increased to exceed a first threshold, determine that the head pose of the real anchor has changed from a non-specified pose to the specified pose.
In a possible implementation, the detection part 52 is further configured to: when it is determined according to the change information that the first facial orientation has decreased from above the first threshold to below a second threshold, determine that the head pose of the real anchor has changed from the specified pose to a non-specified pose, where the second threshold is smaller than the first threshold.
In a possible implementation, the detection part 52 is further configured to: when it is determined that the real anchor's face is not frontally facing the video capture device, process the live video picture through a deep learning model to obtain the head pose of the real anchor, and determine whether the real anchor's head is in the specified pose according to the head pose.
In a possible implementation, the detection part 52 is further configured to: acquire the target reference image frames, where the target reference image frames include at least one of the following: the N image frames preceding the live video picture in the video sequence to which it belongs, and the first M image frames of that video sequence, N and M being positive integers greater than zero; and process the live video picture and the target reference image frames through the deep learning model to obtain the head pose of the real anchor.
In a possible implementation, the detection part 52 is further configured to: perform feature point detection on the face of the real anchor in the video image to obtain a feature point detection result, where the feature point detection result characterizes the feature information of the facial feature points of the real anchor; determine the second facial orientation of the real anchor according to the feature point detection result, where the second facial orientation characterizes the orientation information of the real anchor's face relative to the video capture device; and determine the head pose of the real anchor according to the second facial orientation.
In a possible implementation, the special-effect adding part 53 is further configured to: determine the pose type of the head pose; determine the special effect animation matching the pose type, use the matched special effect animation as the target special effect animation displayed by driving the virtual anchor model, and display the target special effect animation in the live video picture.
In a possible implementation, the special-effect adding part 53 is further configured to: determine the type information of each viewer watching the live stream of the virtual anchor model driven by the real anchor; determine the special effect animation matching the type information, use the matched special effect animation as the target special effect animation displayed by driving the virtual anchor model, and send the target special effect animation to the viewer's terminal so as to display it on the viewer's terminal.
For descriptions of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the above method embodiments, which are not detailed here.
Corresponding to the data display method in FIG. 1, an embodiment of the present disclosure also provides an electronic device 600. As shown in FIG. 14, which is a schematic structural diagram of the electronic device 600 provided by the embodiment of the present disclosure, the device includes:
a processor 61, a memory 62, and a bus 63. The memory 62 is configured to store execution instructions and includes an internal memory 621 and an external memory 622; the internal memory 621, also called main memory, is configured to temporarily store operation data of the processor 61 and data exchanged with the external memory 622, such as a hard disk. The processor 61 exchanges data with the external memory 622 through the internal memory 621. When the electronic device 600 runs, the processor 61 communicates with the memory 62 through the bus 63, so that the processor 61 executes the following instructions:
acquire multiple frames of video images of the real anchor during the live stream;
detect the head pose of the real anchor in each frame of the video images;
when it is determined, according to the head poses corresponding to the multiple frames of video images, that the length of time the real anchor's head has been in a specified pose meets the special-effect trigger requirements, display the target special effect animation in the live video picture; the live video picture displays the virtual anchor model driven by the real anchor.
An embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the data display method described in the above method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure also provides a computer program product carrying program code, where the instructions included in the program code can be used to execute the steps of the data display method described in the above method embodiments; for details, refer to the above method embodiments, which are not repeated here.
The above computer program product may be implemented by hardware, software, or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the system and apparatus described above can refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the parts is only a logical functional division, and there may be other division methods in actual implementation; for another example, multiple parts or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or parts, and may be electrical, mechanical, or in other forms.
The parts described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional parts in the embodiments of the present disclosure may be integrated into one processing unit, each part may exist physically alone, or two or more parts may be integrated into one part.
If the functions are implemented in the form of software functional parts and sold or used as an independent product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are merely specific implementations of the present disclosure, used to illustrate the technical solutions of the present disclosure rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field can still, within the technical scope disclosed by the present disclosure, modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
In the embodiments of the present disclosure, displaying the virtual anchor model in the live video picture can make the live stream more entertaining and interactive. Further, when it is determined that the length of time the real anchor's head has been in the specified pose meets the special-effect trigger requirements, the target special effect animation corresponding to driving the virtual anchor model can be displayed in the live video picture, ensuring that the head of the virtual anchor model remains in a stable playback state while also enriching the content displayed in the live video picture, so that the picture is no longer overly monotonous. This solves the problem in traditional live-streaming scenes where the virtual anchor model is displayed abnormally when the real anchor's facial picture cannot be matched. Moreover, by determining the change information of the real anchor's head pose from the first facial orientation at the current moment, and then determining the head pose from that change information, the temporal information in the video sequence (that is, adjacent video images) can be used to analyze the changes of the real anchor's head pose; compared with determining the head pose from a single video frame, the method provided by the technical solution of the present disclosure can improve the accuracy of the head pose and thus yield more accurate pose results. In addition, by comparing the change information of the target angle with the first threshold and the second threshold, the head pose of the real anchor can be determined through multi-threshold comparison, improving the accuracy of the real anchor's head pose and preventing the frequent head-pose flips caused by single-threshold solutions. Furthermore, by determining the second facial orientation of the real anchor from the result of feature point detection on the real anchor's face in the video image, the orientation information of the real anchor relative to the video capture device can be determined, for example, whether the real anchor is frontally facing the video capture device or facing it sideways. When the real anchor faces the video capture device sideways, a complete facial image cannot be captured, which affects the accuracy of the real anchor's head pose; by determining the head pose separately for the frontal and non-frontal cases, the accuracy of the real anchor's head pose can be improved.

Claims (13)

  1. A data display method, comprising:
    acquiring multiple frames of video images of a real anchor during a live broadcast;
    detecting a head pose of the real anchor in each frame of the video images;
    in a case where it is determined, according to the head poses corresponding to the multiple frames of video images, that a length of time for which a head of the real anchor is in a designated pose satisfies a special effect trigger requirement, displaying a target special effect animation in a live video picture, wherein the live video picture displays a virtual anchor model driven by the real anchor.
  2. The method according to claim 1, wherein the detecting the head pose of the real anchor in each frame of the video images comprises:
    in a case where it is determined that a face of the real anchor frontally faces a video capture device, determining a first facial orientation of the real anchor at a current moment;
    determining change information of the head pose of the real anchor according to the first facial orientation, wherein the change information is used to characterize a change in the first facial orientation;
    determining the head pose of the real anchor in each frame of the video images based on the change information.
  3. The method according to claim 2, wherein the determining the head pose of the real anchor in each frame of the video images based on the change information comprises:
    in a case where it is determined, according to the change information, that the first facial orientation has increased to exceed a first threshold, determining that the head pose of the real anchor has changed from a non-designated pose to the designated pose.
  4. The method according to claim 2 or 3, wherein the determining the head pose of the real anchor in each frame of the video images based on the change information comprises:
    in a case where it is determined, according to the change information, that the first facial orientation has decreased from exceeding a first threshold to being less than a second threshold, determining that the head pose of the real anchor has changed from the designated pose to a non-designated pose, wherein the second threshold is less than the first threshold.
  5. The method according to any one of claims 1 to 4, wherein the detecting the head pose of the real anchor in each frame of the video images comprises:
    in a case where it is determined that the face of the real anchor does not frontally face the video capture device, processing the live video picture through a deep learning model to obtain the head pose of the real anchor, and determining, according to the head pose, whether the head of the real anchor is in the designated pose.
  6. The method according to claim 5, wherein the processing the live video picture through the deep learning model to obtain the head pose of the real anchor comprises:
    acquiring a target reference image frame, wherein the target reference image frame comprises at least one of the following image frames: N image frames preceding the live video picture in a video sequence to which the live video picture belongs, or the first M image frames in the video sequence to which the live video picture belongs, N and M being positive integers greater than zero;
    processing the live video picture and the target reference image frame through the deep learning model to obtain the head pose of the real anchor.
  7. The method according to any one of claims 1 to 6, wherein the detecting the head pose of the real anchor in each frame of the video images comprises:
    performing feature point detection on the face of the real anchor in the video image to obtain a feature point detection result, wherein the feature point detection result is used to characterize feature information of facial feature points of the real anchor;
    determining a second facial orientation of the real anchor according to the feature point detection result, wherein the second facial orientation is used to characterize orientation information of the face of the real anchor relative to the video capture device;
    determining the head pose of the real anchor according to the second facial orientation.
  8. The method according to any one of claims 1 to 7, wherein the displaying the target special effect animation in the live video picture comprises:
    determining a pose type of the head pose;
    determining a special effect animation matching the pose type, taking the matched special effect animation as the target special effect animation displayed by driving the virtual anchor model, and displaying the target special effect animation in the live video picture.
  9. The method according to any one of claims 1 to 8, wherein the displaying the target special effect animation in the live video picture comprises:
    determining type information of each viewer watching the live broadcast of the virtual anchor model driven by the real anchor;
    determining a special effect animation matching the type information, taking the matched special effect animation as the target special effect animation displayed by driving the virtual anchor model, and sending the target special effect animation to a viewer-side terminal, so as to display the target special effect animation on the viewer-side terminal.
  10. A data display apparatus, comprising:
    an acquisition part, configured to acquire multiple frames of video images of a real anchor during a live broadcast;
    a detection part, configured to detect a head pose of the real anchor in each frame of the video images;
    a special effect adding part, configured to, in a case where it is determined, according to the head poses corresponding to the multiple frames of video images, that a length of time for which a head of the real anchor is in a designated pose satisfies a special effect trigger requirement, display a target special effect animation in a live video picture, wherein the live video picture displays a virtual anchor model driven by the real anchor.
  11. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory through the bus; and the machine-readable instructions, when executed by the processor, perform the steps of the data display method according to any one of claims 1 to 9.
  12. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when run by a processor, performs the steps of the data display method according to any one of claims 1 to 9.
  13. A computer program, comprising computer-readable code which, when run in an electronic device, causes a processor in the electronic device to implement the steps of the data display method according to any one of claims 1 to 9.
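
To make the duration-based trigger condition of claim 1 concrete, the following is a minimal sketch in Python; the frame rate, the 2-second requirement, and the detect_head_pose and play_effect helpers are hypothetical placeholders for components the claims do not specify.

```python
# Minimal sketch (hypothetical frame rate, duration, and helper functions):
# fire the target special effect once the real anchor's head has stayed in
# the designated pose across enough consecutive video frames.

FPS = 30                # assumed capture rate of the video images
REQUIRED_SECONDS = 2.0  # assumed special effect trigger requirement
REQUIRED_FRAMES = int(FPS * REQUIRED_SECONDS)

def process_stream(frames, detect_head_pose, play_effect):
    """Count consecutive designated-pose frames; trigger the effect once met."""
    consecutive = 0
    for frame in frames:
        if detect_head_pose(frame) == "designated":
            consecutive += 1
            if consecutive == REQUIRED_FRAMES:
                play_effect("target_special_effect_animation")
        else:
            consecutive = 0  # the pose was interrupted; restart the count
```

Counting consecutive frames rather than wall-clock time keeps the check deterministic per frame; under the stated assumptions, 60 consecutive designated-pose frames at 30 fps correspond to a 2-second requirement.
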
PCT/CN2022/085941 2021-06-29 2022-04-08 Data display method, apparatus, electronic device, computer program, and computer-readable storage medium WO2023273500A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110728854.1 2021-06-29
CN202110728854.1A CN113453034B (en) 2021-06-29 2021-06-29 Data display method, device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2023273500A1

Family

ID=77813960

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085941 WO2023273500A1 (en) 2021-06-29 2022-04-08 Data display method, apparatus, electronic device, computer program, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN113453034B (en)
WO (1) WO2023273500A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113453034B (en) * 2021-06-29 2023-07-25 上海商汤智能科技有限公司 Data display method, device, electronic equipment and computer readable storage medium
CN113850746A (en) * 2021-09-29 2021-12-28 北京字跳网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN114092678A (en) * 2021-11-29 2022-02-25 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN114363685A (en) * 2021-12-20 2022-04-15 咪咕文化科技有限公司 Video interaction method and device, computing equipment and computer storage medium
CN114125569B (en) * 2022-01-27 2022-07-15 阿里巴巴(中国)有限公司 Live broadcast processing method and device
CN115147312B (en) * 2022-08-10 2023-07-14 深圳因应特科技有限公司 Facial skin-polishing special-effect simplified identification system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103093490B (en) * 2013-02-02 2015-08-26 浙江大学 Based on the real-time face animation method of single video camera
US10210648B2 (en) * 2017-05-16 2019-02-19 Apple Inc. Emojicon puppeting
CN107493515B (en) * 2017-08-30 2021-01-01 香港乐蜜有限公司 Event reminding method and device based on live broadcast
CN110139115B (en) * 2019-04-30 2020-06-09 广州虎牙信息科技有限公司 Method and device for controlling virtual image posture based on key points and electronic equipment
CN112069863B (en) * 2019-06-11 2022-08-19 荣耀终端有限公司 Face feature validity determination method and electronic equipment
CN110933452B (en) * 2019-12-02 2021-12-03 广州酷狗计算机科技有限公司 Method and device for displaying lovely face gift and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160300100A1 (en) * 2014-11-10 2016-10-13 Intel Corporation Image capturing apparatus and method
CN109960986A (en) * 2017-12-25 2019-07-02 北京市商汤科技开发有限公司 Human face posture analysis method, device, equipment, storage medium and program
CN109803165A (en) * 2019-02-01 2019-05-24 北京达佳互联信息技术有限公司 Method, apparatus, terminal and the storage medium of video processing
CN110557625A (en) * 2019-09-17 2019-12-10 北京达佳互联信息技术有限公司 live virtual image broadcasting method, terminal, computer equipment and storage medium
CN112543343A (en) * 2020-11-27 2021-03-23 广州华多网络科技有限公司 Live broadcast picture processing method and device based on live broadcast with wheat and electronic equipment
CN113453034A (en) * 2021-06-29 2021-09-28 上海商汤智能科技有限公司 Data display method and device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN113453034B (en) 2023-07-25
CN113453034A (en) 2021-09-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22831331

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE