WO2023279713A1 - Special effect display method and apparatus, computer device, storage medium, computer program, and computer program product - Google Patents

Special effect display method and apparatus, computer device, storage medium, computer program, and computer program product Download PDF

Info

Publication number
WO2023279713A1
WO2023279713A1 (application PCT/CN2022/075015)
Authority
WO
WIPO (PCT)
Prior art keywords
animation
anchor
detection result
real
gesture
Prior art date
Application number
PCT/CN2022/075015
Other languages
French (fr)
Chinese (zh)
Inventor
邱丰
刘昕
王佳梨
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 filed Critical 上海商汤智能科技有限公司
Publication of WO2023279713A1 publication Critical patent/WO2023279713A1/en

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 — Animation
    • G06T13/20 — 3D [Three Dimensional] animation
    • G06T13/40 — 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 — Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 — Gesture based interaction, e.g. based on a set of recognized hand gestures
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 — Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 — Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 — Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 — Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 — Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 — Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 — Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 — Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245 — Processing of video elementary streams involving reformatting operations of video signals, the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 — Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 — Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 — Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213 — Monitoring of end-user related data
    • H04N21/44218 — Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program

Definitions

  • The present disclosure relates to the field of computer technology, and in particular to a special effect display method and apparatus, a computer device, a storage medium, a computer program, and a computer program product.
  • During a live broadcast, the anchor can trigger the display of special effect animations by operating special effect trigger buttons on the live broadcast device.
  • For example, the anchor can manually manipulate a mouse or keyboard to trigger the display of a special effect animation, or click or press a shortcut key preset in the live broadcast software to trigger and play the special effect animation.
  • However, the related virtual live broadcast solution requires the anchor to manually trigger the display of special effect animations, which occupies the anchor's hands during the live broadcast and reduces the efficiency of interaction between the anchor and viewers through hand movements, resulting in a poor user experience with the live broadcast software.
  • Embodiments of the present disclosure at least provide a special effect display method, device, computer equipment, storage medium, computer program, and computer program product.
  • An embodiment of the present disclosure provides a method for displaying special effects, including: acquiring a first video image of a real anchor during a live broadcast; performing posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; in the case of detecting, according to the posture detection result, that the real anchor is in a preset posture, determining the target animation special effect of the virtual anchor model corresponding to the real anchor according to the posture detection result; and displaying the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor.
  • The embodiments of the present disclosure are applicable to the field of virtual live broadcast: a virtual anchor model driven by a real anchor can be displayed in the live video interface, together with animation special effects of that virtual anchor model. That is to say, by recognizing the posture of the real anchor, the target animation special effect of the virtual anchor model driven by the real anchor can be determined based on the posture detection result, and the target animation special effect can be displayed in the live video interface. In this way, the target animation special effects corresponding to the virtual anchor model can be triggered and displayed in the live video interface through the posture detection results of the real anchor, without relying on an external control device to trigger the display of animation special effects, which at the same time improves the live broadcast experience of virtual live broadcast users.
  • An embodiment of the present disclosure also provides a special effect display apparatus, including: an acquisition unit, configured to acquire a first video image of a real anchor during a live broadcast; a detection unit, configured to perform posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; a determining unit, configured to determine, when it is detected according to the posture detection result that the real anchor is in a preset posture, the target animation special effect of the virtual anchor model corresponding to the real anchor according to the posture detection result; and a display unit, configured to display the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor.
  • An embodiment of the present disclosure further provides a computer device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the computer device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the above-mentioned first aspect, or of any possible implementation of the first aspect, are executed.
  • Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above-mentioned first aspect, or of any possible implementation of the first aspect, are executed.
  • An embodiment of the present disclosure further provides a computer program, including computer-readable code; when the computer-readable code is run in an electronic device, a processor in the device executes the steps of the above-mentioned first aspect, or of any possible implementation of the first aspect.
  • the present disclosure provides a computer program product, including computer program instructions.
  • When the computer program instructions are executed by a computer, the steps of the above-mentioned first aspect, or of any possible implementation of the first aspect, are implemented.
  • FIG. 1 shows a flow chart of a method for displaying special effects provided by an embodiment of the present disclosure;
  • FIG. 2 shows a schematic diagram of a posture detection result provided by an embodiment of the present disclosure;
  • FIG. 3 shows a schematic diagram of a special effect display apparatus provided by an embodiment of the present disclosure;
  • FIG. 4 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
  • Since this live broadcast solution requires the anchor to manually trigger the display of special effect animations, it occupies the anchor's hands during the live broadcast, reducing the efficiency of interaction between the anchor and viewers through hand movements and, in turn, the user's experience with the live broadcast software.
  • the present disclosure provides a special effect display method, device, computer equipment, storage medium, computer program and computer program product.
  • the technical solution provided by the present disclosure can be applied in a virtual live broadcast scenario.
  • A virtual live broadcast scene can be understood as one in which a preset virtual anchor model, such as a red panda, a little rabbit, or a cartoon character, is used to replace the actual image of the real anchor during the live broadcast.
  • In this case, the above-mentioned virtual anchor model is shown in the live video screen.
  • Interaction between the real anchor and the audience can also be carried out through the virtual anchor model.
  • the camera device of the live broadcast device can collect a video image containing a real anchor, and then capture the body of the real anchor contained in the video image, so as to obtain posture information of the real anchor. After the posture information is determined, a corresponding driving signal can be generated, and the driving signal is used to drive the live broadcast device to display the animation special effect corresponding to the virtual anchor model in the live video screen.
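The capture-detect-drive pipeline described above can be sketched as follows. This is a minimal illustration only: every class and function name is an assumption, and the posture detector is a stand-in for a real computer vision model, not part of the disclosure.

```python
from dataclasses import dataclass

@dataclass
class DriveSignal:
    """Signal instructing the live broadcast device to play an animation effect."""
    effect_id: str

def detect_posture(frame):
    """Placeholder posture detector; a real system would run a CV model on the frame."""
    # For this sketch, pretend the anchor is waving in every frame.
    return {"gesture": "wave", "body_pose": "arms_raised"}

def make_drive_signal(posture):
    """Map a posture detection result to a driving signal, or None if no effect applies."""
    if posture.get("gesture") == "wave":
        return DriveSignal(effect_id="wave_effect")
    return None

frame = object()  # stands in for a captured video frame
signal = make_drive_signal(detect_posture(frame))
print(signal.effect_id)  # -> wave_effect
```

The driving signal would then be consumed by the rendering side of the live broadcast device to play the corresponding animation on the virtual anchor model.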
  • the real anchor may preset a corresponding virtual anchor model, for example, the preset virtual anchor model may be "YYY role model in XXX game".
  • a real anchor can preset one or more virtual anchor models. When starting the virtual live broadcast at the current moment, one can be selected from one or more preset virtual anchor models as the virtual anchor model at the current moment.
  • the virtual anchor model may be a 2D model or a 3D model.
  • In addition to the real anchor determining the virtual anchor model, after the first video image is acquired, a virtual anchor model can be reshaped for the real anchor in the first video image.
  • the live broadcast device can identify the real anchor included in the video image, and reshape the virtual anchor model for the real anchor according to the recognition result.
  • the recognition result may include at least one of the following: the gender of the real anchor, the appearance characteristics of the real anchor, the wearing characteristics of the real anchor, and the like.
  • the live broadcast device may search for a model matching the recognition result from the virtual anchor model database as the virtual anchor model of the real anchor.
  • For example, if the live broadcast device determines according to the recognition result that the real anchor is wearing a peaked cap and hip-hop-style clothes during the live broadcast, it can search the virtual anchor model library and use the found virtual anchor model that matches "peaked cap" or "hip-hop style" as the virtual anchor model of the real anchor.
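The library lookup described above might be sketched as follows. The library contents, tag scheme, and scoring rule are assumptions made purely for illustration:

```python
# Hypothetical virtual anchor model library keyed by appearance/clothing tags.
MODEL_LIBRARY = [
    {"name": "hiphop_panda", "tags": {"peaked cap", "hip-hop"}},
    {"name": "classic_rabbit", "tags": {"formal"}},
]

def find_matching_model(recognition_tags):
    """Return the library model sharing the most tags with the recognition result,
    or None when nothing overlaps at all."""
    best = max(MODEL_LIBRARY, key=lambda m: len(m["tags"] & recognition_tags))
    return best["name"] if best["tags"] & recognition_tags else None

print(find_matching_model({"peaked cap", "hip-hop"}))  # -> hiphop_panda
```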
  • In addition to searching the virtual anchor model library for a model that matches the recognition result, the live broadcast device can also construct a corresponding virtual anchor model in real time based on the recognition result through a model construction module.
  • the virtual anchor model used in the virtual live broadcast initiated by the real anchor in the past can also be used as a reference to construct the virtual anchor model driven by the real anchor at the current moment.
  • The execution subject of the special effect display method provided in the embodiments of the present disclosure is generally a computer device with a certain computing power.
  • For example, the special effect display method can be executed by a terminal device, a server, or other processing equipment, where the terminal device can be user equipment, a mobile device, a user terminal, a terminal, a cellular phone, a personal digital assistant, a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.
  • the computer device can be any live broadcast device that supports the installation of virtual live broadcast software.
  • the method for displaying special effects may be implemented in a manner in which a processor invokes computer-readable instructions stored in a memory.
  • FIG. 1 it is a flow chart of a method for displaying special effects provided by an embodiment of the present disclosure.
  • the method includes steps S101 to S107, wherein:
  • S101: Obtain a first video image of a real anchor during a live broadcast.
  • the live broadcast device may collect a video stream of a real host during a live broadcast through a camera device pre-installed on the live broadcast device, and the first video image is a video frame in the video stream.
  • the video images of the collected video stream may include the real host's face and upper body parts.
  • the video image may also include part or all of the hand images.
  • In an actual live broadcast scene, when the real anchor leaves the shooting range of the camera device, or when the live broadcast scene of the real anchor is relatively complicated, the video image often contains an incomplete face and/or incomplete upper body parts.
  • the specified body parts may be at least part of the specified body parts of the real anchor.
  • the specified body part includes: a head part and an upper body part (two arm parts, a hand part and an upper body trunk part).
  • The above posture detection result can be used to characterize at least one of the following: the relative positional relationship between the designated body parts, and the gesture classification result of the gesture contained in the first video image.
  • the live broadcast device may detect whether the real host is in a preset posture according to the posture detection result. If the real anchor is in the preset posture, it is determined that the real anchor in the first video image satisfies the triggering condition of the animation special effect. At this point, the live broadcast device can determine the target animation special effect that matches the first video image, and display the target animation special effect.
  • the embodiments of the present disclosure can be applied to the field of virtual live broadcast.
  • the live broadcast device can display the virtual anchor model driven by the real anchor in the live video interface, and can display the animation special effects of the virtual anchor model in the live video interface.
  • the live broadcast device can determine the target animation effect of the virtual anchor model driven by the real anchor based on the posture detection result by recognizing the posture of the real anchor, and display the target animation special effect in the live video interface.
  • the gesture detection results of the real anchor can be used to trigger the display of the target animation effects corresponding to the virtual anchor model on the live video interface. In this way, there is no need to rely on external control devices to trigger the display of animation effects, and it also improves the live broadcast of virtual live broadcast users. experience.
  • the realization of obtaining the attitude detection result may include the following process:
  • Step S1031: Perform limb detection on the specified limb parts of the real anchor in the first video image to obtain a limb detection result.
  • Step S1032: If the limb detection result includes a hand detection frame, perform gesture detection on the image within the hand detection frame in the first video image to obtain a gesture classification result.
  • Step S1033: Determine the posture detection result according to the limb detection result and the gesture classification result.
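Steps S1031 to S1033 above can be sketched as follows. The detector internals are placeholders and all names are illustrative assumptions; the point is the control flow: gesture classification runs only when a hand detection frame was found, and both results are merged into one posture detection result.

```python
def detect_limbs(image):
    """Step S1031 stand-in: a real implementation would run a body detection model."""
    return {"keypoints": [], "hand_box": (10, 20, 50, 60)}  # keypoints omitted in sketch

def classify_gesture(image, hand_box):
    """Step S1032 stand-in: crop hand_box and run a gesture classifier.
    Returns per-gesture probabilities (a feature vector, as described below)."""
    return {"wave": 0.9, "fist": 0.1}

def get_posture_detection_result(image):
    limbs = detect_limbs(image)                      # step S1031
    gesture = None
    if limbs.get("hand_box") is not None:            # step S1032: only if a hand was found
        gesture = classify_gesture(image, limbs["hand_box"])
    return {"limbs": limbs, "gesture": gesture}      # step S1033: merge both results

result = get_posture_detection_result(object())
print(max(result["gesture"], key=result["gesture"].get))  # -> wave
```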
  • the live broadcast device may use the body detection model to perform body detection on the designated body parts of the real anchor in the first video image, and obtain body detection results.
  • the limb detection result includes at least one of the following: limb key points, size of the face frame, position information of the face frame, size of the hand detection frame, and position information of the hand detection frame.
  • the live broadcast device can determine the body key points of the specified body parts of the real anchor in the first video image through the body detection model. If it is recognized that the first video image contains a clear facial image, at least one of the size of the human face frame and the position information of the human face frame is obtained. Then, if it is recognized that the first video image contains a clear hand image, at least one of the size of the hand detection frame and the position information of the hand detection frame can be obtained.
  • When it is detected that the body detection result contains a hand detection frame, the live broadcast device can also perform gesture detection on the image located in the hand detection frame through a gesture recognition model to obtain the gesture classification result, where the gesture classification result is a feature vector used to characterize the probability that the hand gesture of the real anchor in the first video image belongs to each preset gesture.
  • the live broadcast device may determine the body detection result and gesture classification result as the gesture detection result. Then, judge whether the real anchor in the first video image satisfies the triggering condition of the animation special effect according to the body detection result and the gesture classification result. If the triggering condition of the animation special effect is satisfied, the target animation special effect of the virtual anchor model corresponding to the real anchor is determined according to the posture detection result.
  • Otherwise, the live broadcast device may discard the first video image and use the next video frame in the video stream as the first video image, processing it through the steps described above; the processing procedure is not described in detail here.
  • the body detection result includes body key points, for example, body key points of each specified body part.
  • the specified limb part includes a head part and an upper body part
  • The body key points include key points of the head part, as well as key points of the two arms, the hands, and the upper torso in the upper body part.
  • In this way, a posture detection result that accurately represents the action semantic information of the real anchor in the first video image can be obtained.
  • Since the target animation special effect is determined based on the posture detection result, the accuracy of the target animation special effect that is triggered for display can be improved.
  • the implementation of detecting that the real anchor is in a preset posture according to the posture detection result may include the following steps:
  • Step S11: According to the body detection result in the posture detection result, judge whether the real anchor in the first video image satisfies the gesture recognition condition, and obtain a judgment result;
  • Step S12: When the judgment result indicates that the real anchor satisfies the gesture recognition condition, detect whether the gesture indicated by the gesture classification result in the posture detection result is a preset gesture;
  • Step S13: If it is detected that the gesture indicated by the gesture classification result is the preset gesture, determine that the real anchor is in the preset posture.
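Steps S11 to S13 above amount to a two-stage check: first gate on the body detection result, then compare the classified gesture against a preset gesture. The sketch below illustrates this; the specific recognition condition, the threshold value, and all field names are assumptions for illustration only.

```python
PRESET_GESTURE = "wave"

def meets_recognition_condition(body_result):
    """Step S11 stand-in: e.g. require the hand to be close enough to the face.
    The 100-pixel threshold is an assumed example value."""
    return body_result["hand_face_distance"] < 100

def is_preset_posture(posture_result):
    body, gesture_scores = posture_result["body"], posture_result["gesture"]
    if not meets_recognition_condition(body):        # step S11: body gate first
        return False
    best = max(gesture_scores, key=gesture_scores.get)
    return best == PRESET_GESTURE                    # steps S12-S13: gesture check

result = {"body": {"hand_face_distance": 80},
          "gesture": {"wave": 0.9, "fist": 0.1}}
print(is_preset_posture(result))  # -> True
```

Running the cheap body gate before gesture comparison mirrors the efficiency argument made below: frames that fail the body condition never reach the gesture comparison.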
  • After the live broadcast device determines the body detection result described above, it may judge whether the real anchor in the first video image meets the gesture recognition condition according to the body detection result.
  • the live broadcast device may determine the relative positional relationship between each designated body part according to the body detection result; and determine whether the real host in the first video image meets the gesture recognition condition according to the relative positional relationship.
  • the above relative positional relationship includes at least one of the following: a relative distance between each designated body part, and an angular relationship between associated body parts in each designated body part.
  • the associated body parts may be adjacent specified body parts, or specified body parts of the same type.
  • the fact that the real anchor in the first video image satisfies the gesture recognition condition may be understood as: the body movement of the real anchor in the first video image is a preset body movement.
  • the live broadcast device may detect whether the gesture made by the real anchor in the first video image is a preset gesture.
  • the live broadcast device when the live broadcast device detects that the gesture made by the real anchor in the first video image is a preset gesture, it can determine that the real anchor is in the preset gesture.
  • In this case, it is determined that the real anchor in the first video image satisfies the triggering condition of the animation special effect, and the step of determining the target animation special effect of the virtual anchor model corresponding to the real anchor according to the posture detection result is then executed.
  • the live broadcast device can determine the target animation special effect of the virtual anchor model through the combination of the body detection result and the gesture classification result.
  • That is, not only is the gesture of the real anchor required to be a preset gesture, but the body movement of the real anchor when making the preset gesture is also required to be a preset body movement.
  • The approach of first judging, based on the body detection result, whether the real anchor in the first video image meets the gesture recognition condition can improve the efficiency of posture comparison and shorten its duration, so that this technical solution can be applied to live scenes with high real-time requirements.
  • step S11 determining whether the real anchor in the first video image satisfies the gesture recognition condition according to the limb detection result in the posture detection result, may include the following process:
  • The preset orientation information is used to represent the relative orientation relationship between the designated body parts when the real anchor is in a preset posture.
  • The live broadcast device may first determine the relative orientation information between the specified body parts of the real anchor according to the body detection result (that is, the relative positional relationship described above). Here, the relative orientation information between the specified body parts may include: relative orientation information between the hands and the face, between the two arms, and between the upper body torso and the arms.
  • the relative orientation information may include at least one of the following: relative distance, relative angle, and relative direction.
  • the relative distance may include: the relative distance between each designated body part.
  • the relative distance between the center point of the hand detection frame and the center point of the face detection frame is M1 pixels
  • the relative distance between the elbow of the left arm of the real anchor and the elbow of the right arm is M2 pixels
  • the relative distance between the center point of the upper body torso and the elbow of each arm is M3 pixels and M4 pixels.
  • the relative angle may include: an included angle between various specified body parts.
  • the relative direction may include: direction information between various specified body parts.
  • the hand detection frame is at the left position (or right position, lower position, upper position, etc.) of the face detection frame.
  • the left arm of the real anchor is on the left side of the right arm; the upper body torso is on the right side of the left arm, and on the left side of the right arm, etc.
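The three kinds of relative orientation information above (relative distance, relative angle, relative direction) could, for 2D keypoints, be computed along the following lines. This is an illustrative sketch; the coordinates are assumed pixel positions and the disclosure does not prescribe these formulas.

```python
import math

def relative_distance(p1, p2):
    """Euclidean distance in pixels between two keypoints."""
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

def relative_angle(p1, vertex, p2):
    """Included angle (degrees) at `vertex` between the segments to p1 and p2."""
    a1 = math.atan2(p1[1] - vertex[1], p1[0] - vertex[0])
    a2 = math.atan2(p2[1] - vertex[1], p2[0] - vertex[0])
    return abs(math.degrees(a1 - a2)) % 360

def relative_direction(p1, p2):
    """Whether p1 lies left of, right of, or horizontally aligned with p2 on screen."""
    return "left" if p1[0] < p2[0] else "right" if p1[0] > p2[0] else "aligned"

hand_center, face_center = (30, 40), (60, 80)
print(round(relative_distance(hand_center, face_center)))  # -> 50
print(relative_direction(hand_center, face_center))        # -> left
```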
  • Then, the live broadcast device can compare the relative orientation information with the preset orientation information to obtain a comparison result, and judge, based on the comparison result, whether the real anchor in the first video image meets the gesture recognition condition.
  • the relative orientation information includes a plurality of first sub-information
  • the preset orientation information includes a plurality of second sub-information.
  • the realization of comparing the relative orientation information with the preset orientation information to obtain the comparison result may include:
  • A difference value is calculated between each first sub-information and the corresponding second sub-information, where the corresponding second sub-information is the information in the preset orientation information that represents the same quantity, for example, the relative distance between the center point of the hand detection frame and the center point of the face detection frame.
  • the live broadcast device may judge whether the real host in the first video image meets the gesture recognition condition according to the multiple difference values.
  • In some embodiments, the live broadcast device can obtain the judgment result that the real anchor in the first video image satisfies the gesture recognition condition when each difference value is smaller than the preset difference threshold; when at least one of the difference values is greater than or equal to the preset difference threshold, it obtains the judgment result that the real anchor in the first video image does not satisfy the gesture recognition condition.
  • In other embodiments, the live broadcast device can obtain the judgment result that the real anchor in the first video image satisfies the gesture recognition condition when the number of difference values smaller than the preset difference threshold is greater than or equal to the preset number threshold; if the number of difference values smaller than the preset difference threshold is less than the preset number threshold, it obtains the judgment result that the real anchor in the first video image does not meet the gesture recognition condition.
  • the live broadcast device may determine the weight value corresponding to each difference value, and then perform a weighted summation according to each difference value and the corresponding weight value to obtain a weighted summation calculation result.
  • When the weighted summation calculation result is less than or equal to the preset weighted threshold, it is determined that the real anchor in the first video image meets the gesture recognition condition; when the weighted summation calculation result is greater than the preset weighted threshold, it is determined that the real anchor in the first video image does not meet the gesture recognition condition.
  • the preset difference threshold, the preset quantity threshold, and the preset weighting threshold can be set according to actual needs, which is not limited in this embodiment of the present disclosure.
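The three judgment strategies described above (every difference below the threshold, a sufficient count of differences below the threshold, and a weighted summation compared against a threshold) can be sketched as follows. All threshold and weight values here are illustrative, since the disclosure leaves them to actual needs.

```python
def all_below(diffs, diff_threshold):
    """Strategy 1: every difference value must be below the preset difference threshold."""
    return all(d < diff_threshold for d in diffs)

def count_based(diffs, diff_threshold, count_threshold):
    """Strategy 2: at least `count_threshold` differences below the difference threshold."""
    return sum(d < diff_threshold for d in diffs) >= count_threshold

def weighted(diffs, weights, weighted_threshold):
    """Strategy 3: weighted sum of differences at or below the preset weighted threshold."""
    return sum(d * w for d, w in zip(diffs, weights)) <= weighted_threshold

diffs = [3.0, 8.0, 2.0]  # example difference values between sub-information pairs
print(all_below(diffs, 10.0))                 # -> True
print(count_based(diffs, 5.0, 2))             # -> True
print(weighted(diffs, [0.5, 0.2, 0.3], 5.0))  # -> True
```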
  • determining the target animation special effect of the virtual anchor model corresponding to the real anchor according to the posture detection result may include the following steps:
  • Step S21: based on the posture detection result, determine first driving information for the animation special effect, where the first driving information is used to indicate the animation jump information of the animation special effect displayed in the live video interface;
  • Step S22: according to the first driving information, determine, among the plurality of animation sequences corresponding to the posture detection result, the animation sequence matching the first driving information, and determine the matching animation sequence as the target animation special effect.
  • the first driving information may be a 1×(P+Q) matrix, where P may be the number of preset gestures, and Q may be the number of multiple animation sequences corresponding to the preset gestures.
  • the preset pose that matches the pose detection result of the real anchor in the first video image can be determined among multiple preset poses based on the first driving information.
  • the element corresponding to the matched preset posture can be set to “1”, and the elements corresponding to other preset postures can be set to “0”, wherein “1” indicates that the preset posture is a posture that matches the posture detection result of the real anchor, and "0" represents that the preset posture is not a posture that matches the posture detection result of the real anchor.
  • the value of Q is associated with the “matched preset pose”; that is, Q can be understood as the number of animation sequences corresponding to the “matched preset pose”.
  • the element corresponding to the animation sequence matching the first video image can be set as “1”, and the other elements can be set as “0”.
  • after the live broadcast device determines the first driving information, it can determine, among multiple preset postures, the preset posture that matches the posture detection result according to the first driving information, and then determine, among the plurality of animation sequences corresponding to the matched posture, the target animation special effect of the virtual anchor model corresponding to the real anchor.
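As a hypothetical illustration of the 1×(P+Q) driving matrix described above, with the one-hot convention of “1” for the matched preset pose and for the matched animation sequence; the helper name and NumPy usage are assumptions for this sketch.

```python
import numpy as np

def build_driving_info(num_poses, num_sequences, matched_pose, matched_sequence):
    """Build a 1 x (P + Q) one-hot driving vector: the first P entries
    mark which preset pose matched the posture detection result, and the
    last Q entries mark which of that pose's animation sequences to play."""
    v = np.zeros((1, num_poses + num_sequences))
    v[0, matched_pose] = 1                  # "1" = the matched preset pose
    v[0, num_poses + matched_sequence] = 1  # "1" = the matched animation sequence
    return v
```

All other entries stay “0”, meaning those poses and sequences did not match, which is what keeps the data format compact.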
  • the live broadcast device can pre-set multiple stages for the preset posture, for example: an action entry stage, an action hold stage, and an action exit stage.
  • for example, the preset posture is the real anchor showing “OK” through hand movements as shown in Figure 2: the real anchor raising the arm to the head position and showing the “OK” action can be the above action entry stage, the real anchor keeping the “OK” action can be the above action hold stage, and the real anchor lowering the arm from the head position can be the above action exit stage.
  • the live broadcast device may preset a corresponding animation sequence for each stage, and identify which stage the preset gesture is in according to the corresponding preset animation sequence.
  • the live broadcast device can determine the target animation special effect according to the first driving information, thereby simplifying the data format, saving the device memory of the live broadcast device, and ensuring the smoothness of the live broadcast process.
  • in step S21, determining the first driving information for the animation special effect based on the posture detection result may include the following process:
  • the live broadcast device may detect the stability of the frame rate of the live broadcast device, and determine at least one video image in the video stream before the first video image according to the frame rate stability.
  • a target time window may be determined.
  • the acquisition time corresponding to the first video image is T seconds
  • the target time window may be [T-1, T]. That is to say, the live broadcast device may determine the video image within the target time window in the video stream as at least one video image.
  • N frames of video images preceding the first video image in the video stream may be acquired as at least one video image.
  • after obtaining the at least one video image, the live broadcast device can obtain the second driving information determined according to each of the video images, so as to obtain at least one piece of second driving information, and can determine estimated driving information for the animation special effect according to the posture detection result. Afterwards, the animation sequence driven by each piece of second driving information and by the estimated driving information may be determined, so as to obtain at least one animation sequence.
  • the preset number requirement may be the highest number of occurrences; that is, the live broadcast device may use the animation sequence with the highest number of occurrences among the at least one animation sequence as the animation sequence meeting the preset number requirement.
  • for example, for the 30 frames of video images preceding the first video image in the video stream, the live broadcast device can obtain the second driving information for animation special effects determined according to each frame, obtaining 30 pieces of second driving information; determine the animation sequence driven by each of the 30 pieces of second driving information, obtaining 30 animation sequences; and determine the second driving information corresponding to the animation sequence with the highest number of occurrences among the 30 animation sequences as the first driving information.
  • because the posture detection results of adjacent frames may differ, the animation effects determined for the virtual live broadcast model may also differ from frame to frame; as a result, the animation special effects displayed in the live video interface may shake.
  • the disclosed technical solution proposes a decision stabilization algorithm based on a time series. The algorithm first obtains at least one video image preceding the first video image in the video stream, then determines the second driving information for the animation special effect according to each video image, and then determines the first driving information according to the second driving information. Through this processing, signal jitter can be reduced while a low decision response delay is guaranteed, thereby improving the accuracy of triggering the animation sequence of the corresponding action.
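A minimal sketch of such a time-series decision stabilizer, using a sliding window over recent per-frame decisions and a majority vote; the class name, window size, and API are illustrative assumptions, not the disclosure's implementation.

```python
from collections import Counter, deque

class DecisionStabilizer:
    """Time-series decision stabilization: keep the per-frame animation
    decisions for a sliding window of recent frames and emit the one
    that occurs most often, suppressing single-frame jitter."""
    def __init__(self, window=30):
        self.history = deque(maxlen=window)

    def update(self, frame_decision):
        self.history.append(frame_decision)
        # The most common decision within the window wins.
        return Counter(self.history).most_common(1)[0][0]
```

A single mis-detected frame thus cannot flip the displayed animation; the decision only changes once the new state dominates the window.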
  • in step S22, determining, according to the first driving information, the animation sequence matching the first driving information among the multiple animation sequences corresponding to the gesture detection result may include the following process:
  • first, the animation state machines of the multiple animation sequences are acquired; the animation state machines are used to represent the jump relationships between multiple animation states, and each of the animation states corresponds to one or more animation sequences;
  • the live broadcast device may pre-set corresponding animation state machines for multiple animation sequences corresponding to preset poses.
  • the next animation state to be jumped to by the animation state machine can be determined according to the content contained in the first driving information, for example, jumping from the current animation state A to the animation state B .
  • the animation sequence corresponding to the next animation state to be jumped can be determined, and then the animation sequence is determined to be an animation sequence matching the first video image.
  • the jump from animation state A to animation state B may be smoothed through animation transition frames.
  • the live broadcast device can play the skeletal animation or special effect animation of the avatar through an animation blending algorithm based on the blending parameters of consecutive animation clips.
  • the animation blending algorithm may include at least one of animation algorithms such as a skeletal skinning animation algorithm, the 2D freeform cartesian algorithm, and the 2D simple directional algorithm of Mecanim animation, which is not limited in this embodiment of the present application.
  • the blending parameters may include parameters such as speed and direction.
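As a minimal sketch of what blending driven by such parameters could look like: a plain linear blend of two poses by a single blend weight (real engines such as Mecanim use 2D blend trees over speed and direction; the function and pose representation here are illustrative assumptions).

```python
def blend_pose(pose_a, pose_b, t):
    # Linearly blend two animation poses (e.g. per-joint angles) by
    # blend weight t in [0, 1]: t = 0 gives pose_a, t = 1 gives pose_b.
    return [(1 - t) * a + t * b for a, b in zip(pose_a, pose_b)]
```

A 2D blend tree generalizes this by computing the per-clip weights from two parameters (e.g. speed and direction) instead of one scalar.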
  • the animation state machine contains transition conditions between animation states. Therefore, controlling the animation state machine through the driving information realizes the transition of animation states in the live video interface, so that the live broadcast user can use more complex body movements to achieve smooth transitions between various action states, improving the user's live broadcast experience.
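The transition conditions of such an animation state machine can be sketched as a lookup from (current state, driving signal) to the next state; the state and signal names below are hypothetical, loosely following the entry/hold/exit stages described earlier.

```python
class AnimationStateMachine:
    def __init__(self):
        # Transition conditions: (current_state, driving_signal) -> next_state.
        self.transitions = {
            ("idle", "enter"): "action_enter",
            ("action_enter", "hold"): "action_hold",
            ("action_hold", "hold"): "action_hold",
            ("action_hold", "exit"): "action_exit",
            ("action_exit", "done"): "idle",
        }
        self.state = "idle"

    def step(self, signal):
        # Jump only when a transition condition is satisfied;
        # otherwise remain in the current animation state.
        self.state = self.transitions.get((self.state, signal), self.state)
        return self.state
```

Each resulting state would then select its one or more animation sequences, with transition frames smoothing the jump between states.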
  • the target animation special effects include at least one of body movement special effects used to characterize the body movements of the virtual anchor model and rendering material special effects.
  • displaying the target animation special effects of the virtual anchor model in the live video interface may include at least one of the following:
  • the target animation special effects include the virtual anchor's body movement special effects, for example, the movement special effects of the virtual anchor's limbs performing the “OK” action as shown in FIG. 2.
  • the target animation special effects also include rendering material special effects; for example, “rabbit ears” special effects can be added to the virtual anchor.
  • the target position may be associated with body motion special effects, for example, for the same rendering material, in the case of different body motion special effects, the display positions (ie, target positions) in the live video interface are also different.
  • the content displayed in the live video interface can be enriched, thereby increasing the fun of the live broadcast and improving the user's live broadcast experience.
  • the method further includes: acquiring a virtual live broadcast scene corresponding to the real anchor.
  • the virtual live broadcast scene may be a game commentary scene, a talent show scene, an emotional expression scene, etc., which is not limited in this embodiment of the present disclosure.
  • in step S105, determining the target animation special effect of the virtual anchor model corresponding to the real anchor according to the posture detection result may also include the following steps: obtaining initial animation special effects matching the posture detection result; and determining, among the initial animation special effects, the animation special effect matching the virtual live broadcast scene as the target animation special effect.
  • the live broadcast device can set different types of animation special effects.
  • in different virtual live broadcast scenes, the animation special effects displayed for the same posture may also be different; for example, in the game commentary scene, the animation special effects corresponding to the “OK” gesture are more in line with the action habits of game characters.
  • the live broadcast device may determine the animation special effect matching the virtual live broadcast scene as the target animation special effect in the initial animation special effect.
  • the default posture is the "OK" posture as shown in Figure 2
  • multiple initial animation special effects can be set in advance for the “OK” posture, for example, initial animation special effect M1, initial animation special effect M2, and initial animation special effect M3.
  • for each initial animation special effect, a corresponding scene label is set to indicate the virtual live broadcast scene to which the initial animation special effect applies.
  • after obtaining the virtual live broadcast scene corresponding to the real anchor, the virtual live broadcast scene can be matched against the scene labels, so as to determine the animation special effect matching the virtual live broadcast scene as the target animation special effect.
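A minimal sketch of this scene-label matching step; the effect names M1–M3 and the scene label strings are illustrative assumptions.

```python
def select_target_effect(initial_effects, scene):
    """Pick, from the initial animation special effects matching the posture,
    the one whose scene label matches the current virtual live broadcast scene."""
    for effect, scene_label in initial_effects:
        if scene_label == scene:
            return effect
    return None  # no scene-specific effect found; a default could be used

# Hypothetical initial effects preset for the "OK" posture, each with a scene label.
effects = [
    ("M1", "game_commentary"),
    ("M2", "talent_show"),
    ("M3", "emotional_expression"),
]
```

With this lookup, the same “OK” gesture yields M1 in a game commentary scene but M2 in a talent show scene.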
  • the real anchor is recorded as anchor A
  • the virtual anchor model driven by the real anchor is "Princess Rabbit”.
  • the live broadcast device selected by anchor A is a smart phone, and a camera is installed on the smart phone.
  • the preset gesture is the “OK” gesture as shown in FIG. 2.
  • the live broadcast device collects the first video image of the real anchor during the live broadcast through the camera; then, performs body detection on the designated body parts of the anchor A in the first video image, and obtains the body detection result.
  • the body detection result includes at least one of the following: body key points, size of the face frame, position information of the face frame, size of the hand detection frame, and position information of the hand detection frame.
  • if the relative orientation information between anchor A's hand detection frame and face frame, determined according to the limb detection result, matches the preset orientation information, it is determined that anchor A meets the gesture recognition condition. For example, as shown in FIG. 2, according to the body detection result, it can be determined that anchor A's hand is on one side of the face and adjacent to the face. At this time, gesture detection may be performed on the image within the hand detection frame in the first video image to obtain a gesture classification result. If it is recognized that the gesture made by anchor A is the “OK gesture”, it is determined that anchor A is in the preset posture as shown in FIG. 2.
  • the live broadcast device can determine the target animation special effect of the virtual anchor model "Princess Rabbit” corresponding to anchor A according to the posture detection result, and display the target animation special effect of "Princess Rabbit” on the live video interface corresponding to anchor A.
  • the “OK gesture” can include 3 animation stages: an action entry stage in which anchor A raises his arm to the head position and shows the “OK” action, an action hold stage in which anchor A maintains the “OK” action, and an action exit stage in which anchor A lowers his arm from the head position.
  • for each animation stage, a corresponding animation sequence can be determined.
  • after the live broadcast device obtains the gesture detection result, it can also determine at least one video image preceding the first video image in the video stream and obtain the second driving information for animation special effects determined according to each video image; the second driving information is used to indicate the animation jump information of the animation special effect displayed in the live video interface. At the same time, estimated driving information for the animation special effect can be determined according to the posture detection result. Afterwards, according to the second driving information and the estimated driving information, the animation sequence driven by each piece of driving information may be determined, and the animation sequence with the largest number of occurrences may be determined among the determined animation sequences. For example, if the animation sequence corresponding to the “action hold stage” has the highest number of occurrences, the animation sequence corresponding to the “action hold stage” can be played on the live video interface.
  • the writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • the embodiment of the present disclosure also provides a special effect display device corresponding to the special effect display method. Since the problem-solving principle of the device in the embodiment of the present disclosure is similar to that of the above special effect display method, the implementation of the device can refer to the implementation of the method, and repeated descriptions are omitted.
  • FIG. 3 is a schematic diagram of a special effect display device provided by an embodiment of the present disclosure.
  • the device includes: an acquisition part 41, a posture detection part 42, a determination part 43, and a display part 44; wherein,
  • the acquisition part 41 is configured to acquire the first video image of the real anchor during the live broadcast;
  • the posture detection part 42 is configured to perform posture detection on the designated limb parts of the real anchor in the first video image, and obtain a posture detection result;
  • the determining part 43 is configured to determine the target animation special effect of the virtual anchor model corresponding to the real anchor according to the posture detection result when it is detected that the real anchor is in a preset posture according to the posture detection result;
  • the display part 44 is configured to display the target animation special effects of the virtual anchor model in the live video interface corresponding to the real anchor.
  • the posture detection part 42 is further configured to: when the posture detection result includes at least one of the body detection result and the gesture classification result, specifying the real anchor in the first video image Performing limb detection on the limb parts to obtain the limb detection result; in the case that the limb detection result includes a hand detection frame, performing gesture detection on the image in the first video image located in the hand detection frame to obtain A gesture classification result: determining the posture detection result according to the limb detection result and the gesture classification result.
  • the determining part 43 is further configured to: judge whether the real anchor in the first video image satisfies the gesture recognition condition according to the body detection result in the posture detection result, and obtain the judgment result; When the judgment result indicates that the real anchor satisfies the gesture recognition condition, detect whether the gesture indicated by the gesture classification result in the gesture detection result is a preset gesture; when the gesture indicated by the gesture classification result is detected If the gesture is the preset gesture, it is determined that the real anchor is in the preset gesture.
  • the determination part 43 is further configured to: determine the relative orientation information between the designated limb parts of the real anchor according to the body detection results; and determine, according to the relative orientation information and preset orientation information, whether the real anchor in the first video image satisfies the gesture recognition condition; the preset orientation information is used to characterize the relative orientation relationship between the designated body parts of the real anchor in a preset posture.
  • the determining part 43 is further configured to: determine first driving information for animation special effects based on the posture detection result; wherein the first driving information is used to indicate that the live video interface The animation jump information of the animation special effects of the virtual live broadcast model shown in ; according to the first driving information, determine the animation that matches the first driving information in a plurality of animation sequences corresponding to the posture detection results sequence, and determine the matching animation sequence as the target animation special effect.
  • the determination part 43 is further configured to: determine at least one video image preceding the first video image in the video stream; obtain the second driving information for the animation special effect determined according to each video image, and determine the estimated driving information for the animation special effect according to the posture detection result; determine the animation sequence driven by each piece of driving information in the second driving information and the estimated driving information, obtaining at least one animation sequence; and determine the first driving information as the driving information corresponding to the animation sequence whose number of occurrences meets the preset number requirement in the at least one animation sequence.
  • the determining part 43 is further configured to: acquire animation state machines of the multiple animation sequences; the animation state machine is used to characterize the jump relationship between multiple animation states, each The animation state corresponds to one or more animation sequences; according to the first driving information, determine the next animation state to be jumped of the animation state machine; according to the animation state corresponding to the next animation state to be jumped An animation sequence, determining an animation sequence matching the first driving information.
  • the acquiring part 41 is also configured to acquire the virtual live broadcast scene corresponding to the real anchor; the determining part 43 is further configured to: obtain initial animation special effects matching the posture detection result; and determine, among the initial animation special effects, the animation special effect matching the virtual live broadcast scene as the target animation special effect.
  • the determining part 43 is further configured to perform at least one of the following: displaying the special effect of the body movement of the virtual anchor model in the live video interface; displaying the special effect of the rendering material at a target position associated with the body movement of the virtual anchor model in the live video interface.
  • a "part" may be a part of a circuit, a part of a processor, a part of a program or software, etc.; it may also be a unit, a module, or a non-modularized component.
  • the embodiment of the present disclosure also provides a computer device 500, as shown in FIG. 4, which is a schematic structural diagram of the computer device 500 provided by the embodiment of the present disclosure, including:
  • a processor 51, a memory 52, and a bus 53. The memory 52 is used for storing execution instructions and includes an internal memory 521 and an external memory 522. The internal memory 521 is used for temporarily storing computing data of the processor 51 and data exchanged with the external memory 522 such as a hard disk; the processor 51 exchanges data with the external memory 522 through the internal memory 521. When the computer device 500 is running, the processor 51 communicates with the memory 52 through the bus 53, so that the processor 51 executes the following instructions:
  • when it is detected according to the posture detection result that the real anchor is in a preset posture, the target animation special effect of the virtual anchor model corresponding to the real anchor is determined according to the posture detection result;
  • the target animation special effect of the virtual anchor model is displayed in the live video interface corresponding to the real anchor.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored. When the computer program is run by a processor, the steps of the method for displaying special effects described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • the embodiment of the present disclosure also provides a computer program product, the computer program product carries a program code, and the instructions included in the program code can be used to execute the steps of the special effect display method described in the above method embodiment; for details, please refer to the above method embodiment, which will not be repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK), and so on.
  • the specific working process of the above-described system and device can refer to the corresponding process in the foregoing method embodiments, which will not be repeated here.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and other media that can store program codes.
  • the target animation special effect of the virtual anchor model driven by the real anchor is determined based on the posture detection result, and the target animation special effect is displayed in the live video interface.
  • the target animation effects corresponding to the virtual anchor model can be triggered and displayed on the live video interface through the posture detection results of the real anchor, which increases the interaction efficiency between the anchor user and the viewer through body movements and also improves the live broadcast experience of the live broadcast user.


Abstract

The present disclosure provides a special effect display method and apparatus, a computer device, and a storage medium. The method comprises: acquiring a first video image of a real anchor in a live broadcast process; performing posture detection on a specified limb part of the real anchor in the first video image to obtain a posture detection result; upon detecting, according to the posture detection result, that the real anchor is in a preset posture, determining, according to the posture detection result, a target animation special effect of a virtual anchor model corresponding to the real anchor; and displaying the target animation special effect of the virtual anchor model in a video live broadcast interface corresponding to the real anchor.

Description

Special effect display method, device, computer equipment, storage medium, computer program and computer program product

Cross-Reference to Related Applications

This disclosure is based on, and claims priority to, the Chinese patent application with application number 202110768288.7, filed on July 7, 2021 and entitled "A special effect display method, device, computer equipment and storage medium", the entire content of which is hereby incorporated into this disclosure by reference.

Technical Field

The present disclosure relates to the technical field of computers, and in particular to a special effect display method, device, computer equipment, storage medium, computer program, and computer program product.

Background

In the current virtual live broadcast process, the anchor can trigger the display of special effect animations by performing trigger operations on special effect trigger buttons on the live broadcast device. For example, the anchor can manually operate a mouse or keyboard to trigger the display of special effect animations; or, the anchor can click or press a preset shortcut key in the live broadcast software to trigger and play the special effect animation. Because the related virtual live broadcast solutions require the anchor to manually trigger the display of special effect animations, this occupies the anchor's hand performance during the live broadcast and reduces the interaction efficiency between the anchor and viewers through hand movements, resulting in a poor user experience with the live broadcast software.

Summary of the Invention

Embodiments of the present disclosure provide at least a special effect display method, device, computer equipment, storage medium, computer program, and computer program product.

In a first aspect, an embodiment of the present disclosure provides a special effect display method, including: acquiring a first video image of a real anchor during a live broadcast; performing posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; when it is detected according to the posture detection result that the real anchor is in a preset posture, determining a target animation special effect of a virtual anchor model corresponding to the real anchor according to the posture detection result; and displaying the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor.

The embodiments of the present disclosure are applicable to the field of virtual live broadcast: a virtual anchor model driven by a real anchor can be displayed in the live video interface, together with the animation special effects of that virtual anchor model. That is to say, by recognizing the posture of the real anchor, the target animation special effect of the virtual anchor model driven by the real anchor can be determined based on the posture detection result, and the target animation special effect can be displayed in the live video interface. In this way, the target animation special effects corresponding to the virtual anchor model can be triggered and displayed on the live video interface through the posture detection results of the real anchor, without relying on an external control device to trigger the display of animation special effects, while also improving the live broadcast experience of virtual live broadcast users.

In a second aspect, an embodiment of the present disclosure also provides a special effect display device, including: an acquisition unit configured to acquire a first video image of a real anchor during a live broadcast; a posture detection unit configured to perform posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; a determination unit configured to, when it is detected according to the posture detection result that the real anchor is in a preset posture, determine a target animation special effect of a virtual anchor model corresponding to the real anchor according to the posture detection result; and a display unit configured to display the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor.

In a third aspect, an embodiment of the present disclosure also provides a computer device, including: a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the computer device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the above first aspect, or of any possible implementation of the first aspect, are executed.

In a fourth aspect, an embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the above first aspect, or of any possible implementation of the first aspect, are executed.
第五方面,本公开实施例还提供了一种计算机程序,包括计算机可读代码,当所述计算机可读代码在电子设备中运行时,所述计算机设备中的处理器执行时实现上述第一方面,或第一方面中任一种可能的实施方式中的步骤。In the fifth aspect, an embodiment of the present disclosure further provides a computer program, including computer readable code, when the computer readable code is run in an electronic device, the processor in the computer device implements the above first program when executed. aspect, or a step in any possible implementation of the first aspect.
第六方面,本公开提供了一种计算机程序产品,包括计算机程序指令,所述计算机程序指令被计算机执行时实现上述第一方面,或第一方面中任一种可能的实施方式中的步骤。In a sixth aspect, the present disclosure provides a computer program product, including computer program instructions. When the computer program instructions are executed by a computer, the above-mentioned first aspect, or the steps in any possible implementation manner of the first aspect are implemented.
To make the above objects, features, and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the Drawings
To illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings required in the embodiments are briefly introduced below.
FIG. 1 is a flowchart of a special effect display method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a posture detection result provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a special effect display apparatus provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings.
It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item is defined in one figure, it does not require further definition or explanation in subsequent figures.
As used herein, the term "and/or" merely describes an association relationship and indicates that three relationships may exist; for example, "A and/or B" may indicate three cases: A alone, both A and B, and B alone. In addition, the term "at least one" herein indicates any one of multiple items or any combination of at least two of multiple items; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set formed by A, B, and C.
Research has found that, because related virtual live broadcast solutions require the anchor to manually trigger the display of special effect animations, such solutions occupy the anchor's hand performance during the live broadcast, reducing the efficiency of interaction between the anchor and viewers through hand movements and thus degrading live streaming users' experience of the live streaming software.
Based on the above research, the present disclosure provides a special effect display method and apparatus, a computer device, a storage medium, a computer program, and a computer program product. The technical solutions provided by the present disclosure can be applied to virtual live broadcast scenarios. A virtual live broadcast scenario can be understood as one in which a preset virtual anchor model, such as a red panda, a rabbit, or a cartoon character, replaces the actual image of the real anchor in the live broadcast; in this case, the virtual anchor model is what is shown in the live video frame. Meanwhile, interaction between the real anchor and the audience can also be carried out through the virtual anchor model.
For example, a camera of the live broadcast device may capture a video image containing the real anchor, and the body of the real anchor in the video image is then captured to obtain posture information of the real anchor. After the posture information is determined, a corresponding driving signal can be generated, which drives the live broadcast device to display the animation special effect corresponding to the virtual anchor model in the live video frame.
In an optional implementation, the real anchor may preset a corresponding virtual anchor model; for example, the preset virtual anchor model may be "character YYY in game XXX". A real anchor may preset one or more virtual anchor models. When starting a virtual live broadcast, one of the preset virtual anchor models may be selected as the virtual anchor model for the current session. The virtual anchor model may be a 2D model or a 3D model.
In another optional implementation, in addition to the manner described above in which the real anchor determines the virtual anchor model, after the first video image is acquired, a virtual anchor model may be constructed for the real anchor in the first video image.
In some embodiments of the present disclosure, the live broadcast device may recognize the real anchor contained in the video image and construct a virtual anchor model for the real anchor according to the recognition result. Here, the recognition result may include at least one of the following: the gender of the real anchor, the appearance characteristics of the real anchor, the clothing characteristics of the real anchor, and the like. The live broadcast device may then search a virtual anchor model library for a model matching the recognition result to serve as the virtual anchor model of the real anchor.
For example, when the live broadcast device determines from the recognition result that the real anchor is wearing a peaked cap and hip-hop-style clothes during the live broadcast, it may search the virtual anchor model library and use a virtual anchor model matching the "peaked cap" or "hip-hop style" as the virtual anchor model of the real anchor.
In some embodiments of the present disclosure, in addition to searching the virtual anchor model library for a model matching the recognition result, the live broadcast device may also construct a corresponding virtual anchor model for the real anchor in real time through a model construction module based on the recognition result.
Here, when constructing the virtual anchor model in real time, the virtual anchor models used by the real anchor in virtual live broadcasts initiated in the past may also be used as a reference to construct the virtual anchor model driven by the real anchor at the current moment.
The manners of determining the virtual anchor model described above make it possible to customize a corresponding virtual anchor model for each real anchor, increasing the diversity of virtual anchor models. Meanwhile, a personalized virtual anchor model can also leave a deeper impression on the audience.
To facilitate understanding of this embodiment, a special effect display method disclosed in an embodiment of the present disclosure is introduced in detail here. The execution subject of the special effect display method provided by the embodiments of the present disclosure is generally a computer device with certain computing capability. For example, the special effect display method may be executed by a terminal device, a server, or another processing device, where the terminal device may be user equipment, a mobile device, a user terminal, a terminal, a cellular phone, a personal digital assistant, a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. The computer device may be any live broadcast device that supports installation of virtual live streaming software. In some possible implementations, the special effect display method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to FIG. 1, which is a flowchart of a special effect display method provided by an embodiment of the present disclosure, the method includes steps S101 to S107, where:
S101: Acquire a first video image of a real anchor during a live broadcast.
S103: Perform posture detection on a designated body part of the real anchor in the first video image to obtain a posture detection result.
In an embodiment of the present disclosure, the live broadcast device may collect a video stream of the real anchor during the live broadcast through a camera pre-installed on the live broadcast device, and the first video image is a video frame in that video stream.
Here, a video image of the collected video stream may contain the face and upper body parts of the real anchor. In some embodiments, the video image may also contain part or all of the hands. In an actual live broadcast scenario, when the real anchor leaves the shooting range of the camera, or when the live broadcast scene of the real anchor is relatively complex, the video image often contains an incomplete face and/or incomplete upper body parts.
In the embodiments of the present disclosure, the designated body parts may be at least some of the designated body parts of the real anchor. Here, the designated body parts include a head part and upper body parts (two arm parts, hand parts, and an upper torso part).
It can be understood that, when there are multiple designated body parts, the posture detection result may be used to characterize at least one of the following: the relative positional relationship between the designated body parts, and the gesture classification result of the gesture contained in the first video image.
S105: When it is detected, according to the posture detection result, that the real anchor is in a preset posture, determine, according to the posture detection result, a target animation special effect of the virtual anchor model corresponding to the real anchor.
S107: Display the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor.
In the embodiments of the present disclosure, after determining the posture detection result, the live broadcast device may detect, according to the posture detection result, whether the real anchor is in a preset posture. If the real anchor is in the preset posture, it is determined that the real anchor in the first video image satisfies the trigger condition of the animation special effect. At this point, the live broadcast device may determine the target animation special effect matching the first video image and display it.
The embodiments of the present disclosure can be applied to the field of virtual live streaming. The live broadcast device can display the virtual anchor model driven by the real anchor in the live video interface, as well as the animation special effects of that model. Here, by recognizing the posture of the real anchor, the live broadcast device can determine the target animation special effect of the virtual anchor model driven by the real anchor based on the posture detection result, and display it in the live video interface. Display of the target animation special effect corresponding to the virtual anchor model on the live video interface can thus be triggered by the posture detection result of the real anchor, without relying on an external control device, which also improves the live broadcast experience of virtual live streaming users.
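The per-frame flow of steps S101 to S107 can be expressed as a minimal Python sketch. All names, the confidence threshold, and the gesture-to-effect mapping are illustrative assumptions, not part of the disclosure; `detect_posture` and `render_effect` stand in for the posture detection step and the live-interface rendering step.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PostureDetectionResult:
    """Hypothetical container for the posture detection result of step S103."""
    body_result: dict           # body key points, face/hand boxes, etc.
    gesture_probs: list         # gesture classification result (per-class probabilities)

def is_preset_posture(result: PostureDetectionResult) -> bool:
    # Placeholder for the preset-posture check of step S105 (threshold is illustrative).
    return max(result.gesture_probs, default=0.0) > 0.9

def select_target_effect(result: PostureDetectionResult) -> str:
    # Map the most probable preset gesture to an animation effect name (names are made up).
    idx = result.gesture_probs.index(max(result.gesture_probs))
    effects = ["heart_effect", "wave_effect", "clap_effect"]
    return effects[idx % len(effects)]

def process_frame(frame, detect_posture, render_effect) -> Optional[str]:
    """One pass of steps S101-S107 for a single video frame."""
    result = detect_posture(frame)                 # S103: posture detection
    if result is None or not is_preset_posture(result):
        return None                                # no animation effect triggered
    effect = select_target_effect(result)          # S105: pick the target effect
    render_effect(effect)                          # S107: show it in the live video interface
    return effect
```

In this sketch the camera acquisition of S101 is abstracted into the `frame` argument; the point is that the trigger is driven entirely by the detection result, with no external control device.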
In some embodiments of the present disclosure, for step S103 above, when the posture detection result includes at least one of a body detection result and a gesture classification result, performing posture detection on the designated body parts of the real anchor in the first video image to obtain the posture detection result may include the following process:
Step S1031: Perform body detection on the designated body parts of the real anchor in the first video image to obtain a body detection result.
Step S1032: When the body detection result contains a hand detection box, perform gesture detection on the image within the hand detection box in the first video image to obtain a gesture classification result.
Step S1033: Determine the posture detection result according to the body detection result and the gesture classification result.
In the embodiments of the present disclosure, the live broadcast device may perform body detection on the designated body parts of the real anchor in the first video image through a body detection model to obtain the body detection result. Here, the body detection result includes at least one of the following: body key points, the size of the face box, the position information of the face box, the size of the hand detection box, and the position information of the hand detection box.
As shown in FIG. 2, the live broadcast device can determine the body key points of the designated body parts of the real anchor in the first video image through the body detection model. When it is recognized that the first video image contains a clear facial image, at least one of the size of the face box and the position information of the face box is obtained. Then, when it is recognized that the first video image contains a clear hand image, at least one of the size of the hand detection box and the position information of the hand detection box can be obtained.
In the embodiments of the present disclosure, when it is detected that the body detection result contains a hand detection box, the live broadcast device may further perform gesture detection on the image within the hand detection box through a gesture recognition model to obtain a gesture classification result, where the gesture classification result is a feature vector that characterizes the probability that the hand posture of the real anchor in the first video image belongs to each preset gesture.
In the embodiments of the present disclosure, after obtaining the above body detection result and gesture classification result, the live broadcast device may determine them together as the posture detection result. It then judges, according to the body detection result and the gesture classification result, whether the real anchor in the first video image satisfies the trigger condition of the animation special effect. If the trigger condition is satisfied, the target animation special effect of the virtual anchor model corresponding to the real anchor is determined according to the posture detection result.
In the embodiments of the present disclosure, when the body detection result does not contain a hand detection box, the live broadcast device may discard the first video image, take the next video frame in the video stream as the first video image, and process it through the steps described above; the processing is not described again here.
Here, the body detection result contains body key points, for example, the body key points of each designated body part. If the designated body parts include a head part and upper body parts, the body key points include the key points of the head part, as well as the key points of the two arms, the hands, and the upper torso among the upper body parts.
In the above implementation, by performing body detection and gesture detection on the first video image and integrating the body detection result with the gesture classification result, a posture detection result that accurately represents the action semantics of the real anchor in the first video image can be obtained. When the target animation special effect is determined from this posture detection result, the accuracy of the triggered target animation special effect can be improved.
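Steps S1031 to S1033 can be sketched as follows, assuming `body_model` and `gesture_model` are callables standing in for the body detection model and the gesture recognition model; the dictionary keys and box format are illustrative.

```python
def detect_posture(frame, body_model, gesture_model):
    """S1031-S1033: body detection, then gesture classification inside the
    hand detection box, then merging both into the posture detection result."""
    body_result = body_model(frame)                          # S1031: key points, face/hand boxes
    hand_box = body_result.get("hand_box")
    if hand_box is None:
        return None                                          # no hand box: discard this frame
    x1, y1, x2, y2 = hand_box
    hand_crop = [row[x1:x2] for row in frame[y1:y2]]         # image within the hand detection box
    gesture_probs = gesture_model(hand_crop)                 # S1032: per-gesture probability vector
    return {"body": body_result, "gesture": gesture_probs}   # S1033: merged posture detection result
```

Returning `None` mirrors the behavior described above of discarding the frame and moving on to the next video frame when no hand detection box is found.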
In an optional implementation, in step S105 above, detecting, according to the posture detection result, that the real anchor is in a preset posture may include the following steps:
Step S11: Judge, according to the body detection result in the posture detection result, whether the real anchor in the first video image satisfies a gesture recognition condition, and obtain a judgment result.
Step S12: When the judgment result indicates that the real anchor satisfies the gesture recognition condition, detect whether the gesture indicated by the gesture classification result in the posture detection result is a preset gesture.
Step S13: When it is detected that the gesture indicated by the gesture classification result is the preset gesture, determine that the real anchor is in the preset posture.
In the embodiments of the present disclosure, after determining the body detection result described above, the live broadcast device may judge, according to the body detection result, whether the real anchor in the first video image satisfies the gesture recognition condition.
In the embodiments of the present disclosure, the live broadcast device may determine the relative positional relationship between the designated body parts according to the body detection result, and, according to this relative positional relationship, determine whether the real anchor in the first video image satisfies the gesture recognition condition.
It can be understood that the above relative positional relationship includes at least one of the following: the relative distance between the designated body parts, and the angular relationship between associated body parts among the designated body parts. The associated body parts may be adjacent designated body parts, or designated body parts of the same type.
In the embodiments of the present disclosure, the real anchor in the first video image satisfying the gesture recognition condition may be understood as: the body movement of the real anchor in the first video image is a preset body movement.
Therefore, when it is detected that the body movement of the real anchor in the first video image is a preset body movement, the live broadcast device may detect whether the gesture made by the real anchor in the first video image is a preset gesture.
In the embodiments of the present disclosure, when the live broadcast device detects that the gesture made by the real anchor in the first video image is a preset gesture, it can determine that the real anchor is in the preset posture. At this point, it can be determined that the real anchor in the first video image satisfies the trigger condition of the animation special effect, and the step of determining, according to the posture detection result, the target animation special effect of the virtual anchor model corresponding to the real anchor is then performed.
In the embodiments of the present disclosure, the live broadcast device may determine the target animation special effect of the virtual anchor model through the combination of the body detection result and the gesture classification result. In this case, not only must the gesture of the real anchor be a preset gesture, but the body movement of the real anchor when making the preset gesture must also be a preset body movement. Judging first, based on the body detection result, whether the real object in the first video image satisfies the gesture recognition condition can improve the efficiency of posture comparison and shorten the comparison time, making this technical solution applicable to live broadcast scenarios with high real-time requirements.
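The two-stage check of steps S11 to S13 (cheap body-movement condition first, gesture classification second) can be sketched as below; `body_condition`, the probability threshold, and the argument shapes are assumptions for illustration.

```python
def is_preset_posture(body_result, gesture_probs, preset_gesture_id,
                      body_condition, prob_threshold=0.8):
    """Two-stage check of steps S11-S13. `body_condition` is a predicate on the
    body detection result; only when it passes is the gesture result inspected."""
    # S11: does the body movement match the preset body movement?
    if not body_condition(body_result):
        return False
    # S12: is the most probable gesture the preset gesture, with enough confidence?
    best = max(range(len(gesture_probs)), key=gesture_probs.__getitem__)
    if best != preset_gesture_id or gesture_probs[best] < prob_threshold:
        return False
    # S13: both conditions hold, so the real anchor is in the preset posture.
    return True
```

Ordering the checks this way means most frames are rejected by the inexpensive body comparison alone, which is the efficiency benefit described above.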
In an optional implementation, in step S11 above, determining, according to the body detection result in the posture detection result, whether the real anchor in the first video image satisfies the gesture recognition condition may include the following process:
(1) Determine, according to the body detection result, relative orientation information between the designated body parts of the real anchor.
(2) Judge, according to the relative orientation information and preset orientation information, whether the real anchor in the first video image satisfies the gesture recognition condition, and obtain the judgment result; the preset orientation information characterizes the relative orientation relationship between the designated body parts when the real anchor is in the preset posture.
In the embodiments of the present disclosure, when the posture detection result contains the body detection result and the gesture classification result, the live broadcast device may first determine, according to the body detection result, the relative orientation information between the designated body parts of the real anchor (that is, the relative positional relationship described above). Here, the relative orientation information between the designated body parts may include the following: relative orientation information between a hand and the face, between the two arms, between the upper torso and an arm, and between an arm and the upper torso.
Here, the relative orientation information may include at least one of the following: relative distance, relative angle, and relative direction.
In the embodiments of the present disclosure, the relative distance may include the relative distance between the designated body parts. For example, the relative distance between the center point of the hand detection box and the center point of the face detection box is M1 pixels; the relative distance between the elbow of the real anchor's left arm and the elbow of the right arm is M2 pixels; and the relative distances between the center point of the upper torso and the elbows of the two arms are M3 pixels and M4 pixels, respectively.
In the embodiments of the present disclosure, the relative angle may include the angle between the designated body parts. For example, the angle between the horizontal line and the line connecting the center point of the hand detection box and the center point of the face detection box is N1; the angle between the left arm and the right arm of the real anchor is N2; and the angles between the upper torso and the two arms are N3 and N4, respectively.
In the embodiments of the present disclosure, the relative direction may include direction information between the designated body parts. For example, the hand detection box is to the left of the face detection box (or to the right, below, above, and so on); the left arm of the real anchor is to the left of the right arm; the upper torso is to the right of the left arm and to the left of the right arm; and so on.
In the embodiments of the present disclosure, after determining the relative orientation information, the live broadcast device may compare the relative orientation information with the preset orientation information to obtain a comparison result, and then judge, according to the comparison result, whether the real anchor in the first video image satisfies the gesture recognition condition.
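The three kinds of relative orientation information above (distance, angle to the horizontal, and direction) can be computed from two part centers as in this sketch; the coordinate convention (image pixels, x increasing to the right) and the left/right encoding are illustrative assumptions.

```python
import math

def relative_orientation(p, q):
    """Relative orientation between two part centers `p` and `q` in pixel
    coordinates, e.g. the hand detection box center and the face box center."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    distance = math.hypot(dx, dy)                       # relative distance in pixels (e.g. M1)
    angle = math.degrees(math.atan2(abs(dy), abs(dx)))  # angle between the connecting line and the horizontal (e.g. N1)
    direction = "left" if dx < 0 else "right"           # horizontal direction of q relative to p
    return {"distance": distance, "angle": angle, "direction": direction}
```

For example, `relative_orientation(face_center, hand_center)` yields the hand-to-face distance, the angle of their connecting line to the horizontal, and whether the hand box lies to the left or right of the face box.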
在一个可选的实施方式中,相对方位信息包含多个第一子信息,预设方位信息中包含多个第二子信息。上述将相对方位信息和预设方位信息进行比较,得到比较结果的实现,可以包括:In an optional implementation manner, the relative orientation information includes a plurality of first sub-information, and the preset orientation information includes a plurality of second sub-information. The realization of comparing the relative orientation information with the preset orientation information to obtain the comparison result may include:
(a) Pairing sub-information items of the same type among the plurality of first sub-information items and the plurality of second sub-information items, to obtain a plurality of information pairs to be compared.
Here, "the same type" means that the paired items correspond to the same specified body parts and represent the same physical quantity.
For example, if a first sub-information item represents, within the relative orientation information, the relative distance between the center point of the hand detection frame and the center point of the face detection frame, then its paired second sub-information item is the item in the preset orientation information that likewise represents the relative distance between the center point of the hand detection frame and the center point of the face detection frame.
(b) Determining the difference between the first sub-information item and the second sub-information item in each information pair to be compared, thereby obtaining a plurality of difference values; the comparison result includes the plurality of difference values.
In the embodiments of the present disclosure, after obtaining the plurality of difference values, the live broadcast device may judge, according to the plurality of difference values, whether the real anchor in the first video image satisfies the gesture recognition condition.
In some embodiments of the present disclosure, the live broadcast device may conclude that the real anchor in the first video image satisfies the gesture recognition condition when every difference value is smaller than a preset difference threshold, and conclude that the real anchor in the first video image does not satisfy the gesture recognition condition when at least one of the difference values is greater than or equal to the preset difference threshold.
In some embodiments of the present disclosure, the live broadcast device may conclude that the real anchor in the first video image satisfies the gesture recognition condition when the number of difference values smaller than the preset difference threshold is greater than or equal to a preset quantity threshold, and conclude that the real anchor in the first video image does not satisfy the gesture recognition condition when the number of difference values smaller than the preset difference threshold is less than the preset quantity threshold.
In some embodiments of the present disclosure, the live broadcast device may determine a weight value corresponding to each difference value, and then compute a weighted sum of the difference values with their corresponding weight values. When the weighted-sum result is less than or equal to a preset weighting threshold, it is determined that the real anchor in the first video image satisfies the gesture recognition condition; when the weighted-sum result is greater than the preset weighting threshold, it is determined that the real anchor in the first video image does not satisfy the gesture recognition condition.
It should be noted that the preset difference threshold, the preset quantity threshold, and the preset weighting threshold may all be set according to actual needs, which is not limited in the embodiments of the present disclosure.
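The three decision strategies above (all difference values under the threshold, enough difference values under the threshold, and weighted sum under the threshold) can be sketched as follows; the thresholds and weights passed in are placeholders to be set according to actual needs, as the disclosure states:

```python
def all_under(diffs, diff_thresh):
    # Strategy 1: satisfied only if every difference value is below the preset difference threshold
    return all(d < diff_thresh for d in diffs)

def enough_under(diffs, diff_thresh, count_thresh):
    # Strategy 2: satisfied if at least count_thresh difference values are below the threshold
    return sum(d < diff_thresh for d in diffs) >= count_thresh

def weighted_under(diffs, weights, weighted_thresh):
    # Strategy 3: satisfied if the weighted sum of the difference values
    # does not exceed the preset weighting threshold
    return sum(d * w for d, w in zip(diffs, weights)) <= weighted_thresh
```

Any one of the three predicates may serve as the gesture recognition condition check; which one is used, like the threshold values themselves, is an implementation choice.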
In the above implementation, determining whether the body movement of the real anchor in the first video image is the preset body movement by comparing the relative orientation information with the preset orientation information yields a more accurate body comparison result, so that whether the first video image satisfies the gesture recognition condition can be determined more accurately.
In an optional implementation, in the above step S105, determining the target animation special effect of the virtual anchor model corresponding to the real anchor according to the posture detection result may include the following steps:
Step S21: based on the posture detection result, determining first driving information for the animation special effect, where the first driving information is used to indicate animation jump information of the animation special effect displayed in the live video interface.
Step S22: according to the first driving information, determining, among the plurality of animation sequences corresponding to the posture detection result, the animation sequence matching the first driving information, and determining the matching animation sequence as the target animation special effect.
In the embodiments of the present disclosure, the first driving information may be a 1×(P+Q) matrix, where P may be the number of preset postures and Q may be the number of animation sequences corresponding to a preset posture.
Here, the first driving information can identify, among the plurality of preset postures, the preset posture matching the posture detection result of the real anchor in the first video image. In the above 1×(P+Q) matrix, the element corresponding to the matching preset posture may be set to "1" and the elements corresponding to the other preset postures may be set to "0", where "1" indicates that the preset posture matches the posture detection result of the real anchor, and "0" indicates that it does not.
In the above 1×(P+Q) matrix, Q is associated with the matching preset posture; that is, the Q elements may be understood as covering the plurality of animation sequences corresponding to the matching preset posture. In this matrix, the element corresponding to the animation sequence matching the first video image may be set to "1", and the remaining elements may be set to "0".
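Under this encoding, the first driving information is simply the concatenation of two one-hot segments: one of length P selecting the matched preset posture, and one of length Q selecting the matched animation sequence. A minimal sketch (any layout detail beyond what the disclosure states is assumed):

```python
def build_driving_info(p_count, q_count, pose_idx, seq_idx):
    # 1 x (P+Q) row vector: the first P entries are one-hot over the preset postures,
    # the last Q entries are one-hot over the matched posture's animation sequences
    vec = [0] * (p_count + q_count)
    vec[pose_idx] = 1
    vec[p_count + seq_idx] = 1
    return vec

def decode_driving_info(vec, p_count):
    # Recover (matched preset-posture index, matched animation-sequence index)
    return vec[:p_count].index(1), vec[p_count:].index(1)
```

For example, with P = 3 preset postures and Q = 4 animation sequences, a drive vector selecting posture 1 and sequence 2 is `[0, 1, 0, 0, 0, 1, 0]`.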
In the embodiments of the present disclosure, after determining the first driving information, the live broadcast device may, according to the first driving information, determine the preset posture matching the posture detection result among the plurality of preset postures, and determine, among the plurality of animation sequences corresponding to that preset posture, the target animation special effect of the virtual anchor model corresponding to the real anchor.
For any preset posture, the live broadcast device may predefine multiple phases, for example: an action entry phase, an action hold phase, and an action exit phase. As an example, where the preset posture is the real anchor showing the "OK" gesture with a hand movement as shown in FIG. 2, raising the arm to head height and displaying the "OK" gesture may be the action entry phase, holding the "OK" gesture may be the action hold phase, and lowering the arm from the head position may be the action exit phase. It should be noted that the live broadcast device may predefine a corresponding animation sequence for each phase and identify, according to the predefined animation sequences, which phase the preset posture is in.
In the above implementation, by determining the first driving information, the live broadcast device can determine the target animation special effect from the first driving information, which simplifies the data format and saves device memory of the live broadcast device, thereby ensuring the smoothness of the live broadcast.
In an optional implementation, in step S21, determining the first driving information for the animation special effect based on the posture detection result may include the following process:
(1) Determining at least one video image located before the first video image in the video stream.
(2) Obtaining second driving information for the animation special effect determined from each of the video images, and determining estimated driving information for the animation special effect according to the posture detection result.
(3) Determining the animation sequence whose display is driven by each item of driving information among the second driving information and the estimated driving information, to obtain at least one animation sequence.
(4) Determining, as the first driving information, the driving information corresponding to the animation sequence whose number of occurrences in the at least one animation sequence meets a preset count requirement.
In the embodiments of the present disclosure, the live broadcast device may detect the stability of its frame rate and, according to that stability, determine the at least one video image located before the first video image in the video stream.
For example, for a live broadcast device with a relatively stable frame rate, a target time window may be determined. As an example, if the capture moment corresponding to the first video image is T seconds, the target time window may be [T-1, T]. That is, the live broadcast device may determine the video images of the video stream falling within the target time window as the at least one video image.
As another example, for a live broadcast device with an unstable frame rate, the N frames of video images preceding the first video image in the video stream may be acquired as the at least one video image.
After obtaining the at least one video image, the live broadcast device may obtain the second driving information, which is determined from each of the video images; in this way the live broadcast device obtains at least one item of second driving information, and it determines the estimated driving information for the animation special effect according to the posture detection result. Afterwards, the animation sequence driven by each item among the second driving information and the estimated driving information can be determined, thereby obtaining at least one animation sequence.
After obtaining the at least one animation sequence, the animation sequence whose number of occurrences meets the preset count requirement can be determined from the at least one animation sequence, and the driving information corresponding to that animation sequence is then determined as the first driving information. As an example, the preset count requirement may be the highest number of occurrences; that is, the live broadcast device may take the animation sequence occurring most frequently among the at least one animation sequence as the animation sequence meeting the preset count requirement.
For example, suppose the video stream is captured from time T0 and 30 frames of video images have been obtained by time T1. The live broadcast device may obtain the second driving information for the animation special effect determined from each video image, yielding 30 items of second driving information; determine the animation sequence driven by each of the 30 items, yielding 30 animation sequences; and determine the second driving information corresponding to the animation sequence occurring most frequently among the 30 as the first driving information for each of these 30 frames of video images.
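The 30-frame example above amounts to a majority vote over the animation sequences that the per-frame driving information would trigger. A sketch, assuming each item of driving information can be mapped to a hashable animation-sequence identifier (the mapping `seq_of` is a hypothetical callback, not part of the disclosure):

```python
from collections import Counter

def stabilized_driving_info(driving_infos, seq_of):
    # driving_infos: driving information collected over the recent frames
    #                (second driving information plus the estimated driving information)
    # seq_of: maps one item of driving information to the animation sequence it triggers
    sequences = [seq_of(d) for d in driving_infos]
    winner = Counter(sequences).most_common(1)[0][0]
    # The first item whose sequence is the most frequent one is taken as
    # the first driving information shared by all frames in the window
    return next(d for d in driving_infos if seq_of(d) == winner)
```

A single outlier frame in the window then cannot change which animation sequence is played, which is the jitter-suppression effect described below.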
Since the virtual anchor model is driven from each of multiple video images, its animation special effect may differ from frame to frame, and the animation special effect displayed in the live video interface may therefore jitter. To address this, the technical solution of the present disclosure proposes a time-series-based decision stabilization algorithm: first obtain at least one video image preceding the first video image in the video stream, then determine second driving information for the animation special effect from each video image, and finally determine the first driving information from the second driving information. This processing reduces signal jitter while keeping the decision response delay low, thereby improving the accuracy of triggering the animation sequence of the corresponding action.
In an optional implementation, in the above step S22, determining, according to the first driving information, the animation sequence matching the first driving information among the plurality of animation sequences corresponding to the posture detection result may include the following process:
(1) Obtaining an animation state machine for the plurality of animation sequences, where the animation state machine is used to represent the jump relationships between multiple animation states, and each animation state corresponds to one or more animation sequences.
(2) Determining, according to the first driving information, the next animation state to which the animation state machine is to jump.
(3) Determining, according to the animation sequence corresponding to the next animation state to be jumped to, the animation sequence matching the first driving information.
In the embodiments of the present disclosure, the live broadcast device may set up corresponding animation state machines in advance for the plurality of animation sequences corresponding to the preset postures. In this way, after the first driving information is obtained, the next animation state to be jumped to can be determined from the content of the first driving information, for example jumping from the current animation state A to animation state B. Once the next animation state is determined, the animation sequence corresponding to it can be determined and taken as the animation sequence matching the first video image.
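Such a state machine can be sketched as a transition table keyed by (current state, drive signal). The states and signals below are illustrative only, mirroring the entry/hold/exit phases described earlier, and are not states defined by the disclosure:

```python
class AnimationStateMachine:
    # Transition conditions: (current state, drive signal) -> next state
    TRANSITIONS = {
        ("idle", "enter"): "entering",
        ("entering", "hold"): "holding",
        ("holding", "hold"): "holding",
        ("holding", "exit"): "exiting",
        ("exiting", "done"): "idle",
    }
    # Each animation state corresponds to one (or more) animation sequences
    SEQUENCES = {
        "idle": "idle_loop",
        "entering": "ok_enter_seq",
        "holding": "ok_hold_seq",
        "exiting": "ok_exit_seq",
    }

    def __init__(self):
        self.state = "idle"

    def step(self, signal):
        # Jump only when a transition condition is met; otherwise stay in place
        self.state = self.TRANSITIONS.get((self.state, signal), self.state)
        return self.SEQUENCES[self.state]  # the matching animation sequence
```

Feeding the decoded first driving information into `step` then yields the animation sequence to display for the current frame, with illegal jumps (e.g. "idle" directly to "exiting") simply ignored.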
In some embodiments, the jump from A to B may be smoothed with animation transition frames. Here, based on blending parameters of consecutive animation clips, the live broadcast device may play the skeletal animation or special-effect animation of the avatar through an animation blending algorithm.
In some embodiments, the animation blending algorithm may include at least one of a skeletal skinning animation algorithm, the 2D freeform cartesian algorithm of Mecanim animation, the 2D simple directional algorithm, and other animation algorithms, which is not limited in the embodiments of the present application.
In some embodiments, the blending parameters may include parameters such as speed and direction.
It can be understood that the animation state machine contains the transition conditions between the animation states. Therefore, controlling the animation state machine through the driving information realizes the jumps between animation states in the live video interface, so that live broadcast users can use more complex body movements during the live broadcast and achieve smooth transitions between various action states, improving the user's live broadcast experience.
In an optional implementation, where the target animation special effect includes at least one of a body-movement special effect representing the body movements of the virtual anchor model and a rendering-material special effect, the above step S105 of displaying the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor may include at least one of the following:
displaying the body-movement special effect of the body movements of the virtual anchor model in the live video interface; and displaying the rendering-material special effect at a target position in the live video interface associated with the body movements of the virtual anchor model.
In the embodiments of the present disclosure, the target animation special effect includes body-movement special effects of the virtual anchor, for example, the special effect of the virtual anchor's limbs performing the "OK" gesture shown in FIG. 2. In addition, the target animation special effect also includes rendering-material special effects; for example, a "rabbit ears" special effect may be added to the virtual anchor.
Here, the target position may be associated with the body-movement special effect; for example, for the same rendering material, the display position in the live video interface (i.e., the target position) differs under different body-movement special effects.
In the above implementation, displaying body-movement special effects and rendering-material special effects in the live video interface enriches the content shown in the live video interface, making the live broadcast more entertaining and improving the user's live broadcast experience.
In an optional implementation, the method further includes: acquiring the virtual live broadcast scene corresponding to the real anchor.
Here, the virtual live broadcast scene may be a game commentary scene, a talent show scene, an emotional expression scene, or the like, which is not limited in the embodiments of the present disclosure.
The above step S105 of determining, according to the posture detection result, the target animation special effect of the virtual anchor model corresponding to the real anchor may further include the following steps: acquiring initial animation special effects matching the posture detection result; and determining, among the initial animation special effects, the animation special effect matching the virtual live broadcast scene as the target animation special effect.
In the embodiments of the present disclosure, the live broadcast device may set different types of animation special effects for different virtual live broadcast scenes. As an example, for the "OK" posture shown in FIG. 2, the displayed animation special effect may differ across virtual live broadcast scenes; for instance, in a game commentary scene, the animation special effect corresponding to the "OK" posture better matches the movement habits of game characters.
In the embodiments of the present disclosure, after acquiring the initial animation special effects matching the posture detection result, the live broadcast device may determine, among the initial animation special effects, the animation special effect matching the virtual live broadcast scene as the target animation special effect.
For example, where the preset posture is the "OK" posture shown in FIG. 2, multiple initial animation special effects may be set for it in advance, e.g., initial animation special effects M1, M2, and M3. A scene label is set for each initial animation special effect to indicate the virtual live broadcast scene to which that initial animation special effect applies.
After acquiring the virtual live broadcast scene corresponding to the real anchor, the virtual live broadcast scene can be matched against the scene labels, thereby determining the animation special effect matching the virtual live broadcast scene as the target animation special effect.
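Matching the virtual live broadcast scene against the scene labels of the initial animation special effects reduces to a lookup over (effect, label) pairs. A sketch; the effect names reuse M1/M2/M3 from the example above, while the scene-label strings are hypothetical:

```python
def pick_target_effect(initial_effects, scene):
    # initial_effects: (effect name, scene label) pairs preset for the matched posture
    # scene: the virtual live broadcast scene acquired for the real anchor
    for name, label in initial_effects:
        if label == scene:
            return name
    return None  # no scene-specific effect; a caller could fall back to a default

# Illustrative initial animation special effects preset for the "OK" posture
EFFECTS_FOR_OK = [
    ("M1", "game_commentary"),
    ("M2", "talent_show"),
    ("M3", "emotional_expression"),
]
```

With this table, an anchor streaming in a talent show scene who makes the "OK" gesture would be assigned effect M2 as the target animation special effect.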
In the above implementation, filtering the initial animation special effects by the virtual live broadcast scene to obtain the target animation special effect makes it possible to customize personalized animation special effects for live broadcast users, so that the determined target animation special effect better satisfies the user's live broadcast needs, thereby improving the user's live broadcast experience.
The special effect display method described above will be introduced below with reference to a specific implementation.
As an example, the real anchor is denoted anchor A, and the virtual anchor model driven by the real anchor is "Princess Rabbit". The live broadcast device chosen by anchor A is a smartphone equipped with a camera. The preset posture is the posture shown in FIG. 2, and the preset gesture is the "OK" gesture shown in FIG. 2.
In the embodiments of the present disclosure, the live broadcast device first captures, through the camera, the first video image of the real anchor during the live broadcast, and then performs body detection on the specified body parts of anchor A in the first video image to obtain a body detection result. The body detection result includes at least one of the following: body key points, the size of the face frame, the position information of the face frame, the size of the hand detection frame, and the position information of the hand detection frame.
After the body detection result is obtained, if the relative orientation information between anchor A's hand detection frame and face frame determined from the body detection result shows that anchor A satisfies the gesture recognition condition — for example, as shown in FIG. 2, the body detection result indicates that anchor A's hand is at one side of the face and adjacent to it — gesture detection may then be performed on the portion of the first video image within the hand detection frame to obtain a gesture classification result. If the gesture made by anchor A is recognized as the "OK" gesture, it is determined that anchor A is in the preset posture shown in FIG. 2.
Afterwards, the live broadcast device may determine, according to the posture detection result, the target animation special effect of the virtual anchor model "Princess Rabbit" corresponding to anchor A, and display the target animation special effect of "Princess Rabbit" in the live video interface corresponding to anchor A.
As an example, the "OK" gesture may involve three animation phases: the action entry phase in which anchor A raises the arm to head height and displays the "OK" gesture, the action hold phase in which anchor A holds the "OK" gesture, and the action exit phase in which anchor A lowers the arm from the head position. A corresponding animation sequence may be determined for each phase.
After obtaining the posture detection result, the live broadcast device may also determine at least one video image preceding the first video image in the video stream and obtain the second driving information for the animation special effect determined from each video image; the second driving information indicates the animation jump information of the animation special effect displayed in the live video interface. Meanwhile, the estimated driving information for the animation special effect may be determined according to the posture detection result. Then, from the second driving information and the estimated driving information, the animation sequence driven by each item of driving information is determined, and the animation sequence occurring most frequently among the determined animation sequences is identified. For example, if the animation sequence corresponding to the action hold phase occurs most frequently, that animation sequence can then be played in the live video picture.
Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure further provide a special effect display apparatus corresponding to the special effect display method. Since the problem-solving principle of the apparatus in the embodiments of the present disclosure is similar to that of the above special effect display method of the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated descriptions are omitted.
Referring to FIG. 3, which is a schematic diagram of a special effect display apparatus provided by an embodiment of the present disclosure, the apparatus includes: an acquisition part 41, a posture detection part 42, a determination part 43, and a display part 44, wherein:
the acquisition part 41 is configured to acquire a first video image of a real anchor during a live broadcast;
the posture detection part 42 is configured to perform posture detection on specified body parts of the real anchor in the first video image to obtain a posture detection result;
the determination part 43 is configured to, when it is detected from the posture detection result that the real anchor is in a preset posture, determine a target animation special effect of a virtual anchor model corresponding to the real anchor according to the posture detection result;
the display part 44 is configured to display the target animation special effect of the virtual anchor model in a live video interface corresponding to the real anchor.
In a possible implementation, when the posture detection result includes at least one of a body detection result and a gesture classification result, the posture detection part 42 is further configured to: perform body detection on the specified body parts of the real anchor in the first video image to obtain a body detection result; when the body detection result contains a hand detection box, perform gesture detection on the image within the hand detection box in the first video image to obtain a gesture classification result; and determine the posture detection result according to the body detection result and the gesture classification result.
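A minimal sketch of this two-stage detection in Python, assuming a `body_detector` that returns a dict with an optional `hand_box` and a `gesture_classifier` that labels a hand crop — both hypothetical placeholders, since the disclosure does not name concrete models:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PostureResult:
    body_result: dict             # boxes/keypoints for the specified body parts
    gesture_label: Optional[str]  # None when no hand detection box was found

def detect_posture(frame, body_detector, gesture_classifier):
    """Two-stage posture detection: body detection first, then gesture
    classification restricted to the image inside the hand detection box."""
    body_result = body_detector(frame)          # hypothetical body-detection model
    gesture_label = None
    hand_box = body_result.get("hand_box")      # (x1, y1, x2, y2) or absent
    if hand_box is not None:
        x1, y1, x2, y2 = hand_box
        hand_crop = [row[x1:x2] for row in frame[y1:y2]]  # crop the hand region
        gesture_label = gesture_classifier(hand_crop)     # hypothetical classifier
    return PostureResult(body_result, gesture_label)
```

Running the gesture classifier only when a hand box is present keeps the per-frame cost low on frames where no gesture can possibly be recognized.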
In a possible implementation, the determination part 43 is further configured to: judge, according to the body detection result in the posture detection result, whether the real anchor in the first video image satisfies a gesture recognition condition, to obtain a judgment result; when the judgment result indicates that the real anchor satisfies the gesture recognition condition, detect whether the gesture indicated by the gesture classification result in the posture detection result is a preset gesture; and when it is detected that the gesture indicated by the gesture classification result is the preset gesture, determine that the real anchor is in the preset posture.
In a possible implementation, the determination part 43 is further configured to: determine relative orientation information between the specified body parts of the real anchor according to the body detection result; and determine, according to the relative orientation information and preset orientation information, whether the real anchor in the first video image satisfies the gesture recognition condition; the preset orientation information is used to characterize the relative orientation relationship between the specified body parts of the real anchor when the real anchor is in the preset posture.
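The relative-orientation check can be sketched as a set of pairwise constraints over detected keypoints; the keypoint names, the constraint relations, and the image-coordinate convention (y grows downward) are all illustrative assumptions, not fixed by the disclosure:

```python
def satisfies_recognition_condition(keypoints, preset):
    """keypoints: {"left_wrist": (x, y), ...} in image coordinates (y grows downward).
    preset: constraints like ("left_wrist", "left_elbow", "above") describing the
    relative orientation the body parts must have in the preset posture.
    Returns True only when every constraint holds."""
    def above(a, b):
        return keypoints[a][1] < keypoints[b][1]   # smaller y means higher in image
    def left_of(a, b):
        return keypoints[a][0] < keypoints[b][0]
    checks = {"above": above, "left_of": left_of}
    return all(checks[rel](a, b) for a, b, rel in preset)
```

A gating check like this lets the more expensive gesture classification run only when the body parts are already arranged roughly as the preset posture requires.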
In a possible implementation, the determination part 43 is further configured to: determine, based on the posture detection result, first driving information for an animation special effect, where the first driving information is used to indicate animation jump information of the animation special effect of the virtual live broadcast model displayed in the live video interface; and, according to the first driving information, determine, among a plurality of animation sequences corresponding to the posture detection result, an animation sequence matching the first driving information, and determine the matching animation sequence as the target animation special effect.
In a possible implementation, the determination part 43 is further configured to: determine at least one video image preceding the first video image in the video stream; acquire second driving information for the animation special effect determined from each of those video images, and determine estimated driving information for the animation special effect according to the posture detection result; determine the animation sequence whose display is driven by each piece of driving information among the second driving information and the estimated driving information, to obtain at least one animation sequence; and determine, as the first driving information, the driving information corresponding to an animation sequence whose number of occurrences in the at least one animation sequence meets a preset count requirement.
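This occurrence-count selection amounts to a vote over the driving information from recent frames, which damps single-frame jitter in the detection. A sketch under the assumption that each piece of driving information carries the animation sequence it would trigger, with a hypothetical threshold `min_count`:

```python
from collections import Counter

def select_driving_info(recent_driving_infos, estimated_driving_info, min_count=3):
    """recent_driving_infos: driving info derived from frames preceding the current one.
    estimated_driving_info: driving info estimated from the current posture detection.
    The driving info whose animation sequence occurs at least `min_count` times among
    the candidates becomes the first driving information; otherwise nothing changes."""
    candidates = recent_driving_infos + [estimated_driving_info]
    sequences = [info["sequence"] for info in candidates]
    winner_seq, count = Counter(sequences).most_common(1)[0]
    if count < min_count:
        return None  # no sequence is stable enough; keep the current animation
    return next(info for info in candidates if info["sequence"] == winner_seq)
```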
In a possible implementation, the determination part 43 is further configured to: acquire an animation state machine of the plurality of animation sequences, where the animation state machine is used to characterize jump relationships between a plurality of animation states, and each animation state corresponds to one or more animation sequences; determine, according to the first driving information, the next animation state to which the animation state machine is to jump; and determine, according to the animation sequence corresponding to that next animation state, the animation sequence matching the first driving information.
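The animation state machine can be sketched as a transition table keyed by (current state, trigger); the state names, triggers, and sequence names below are illustrative placeholders:

```python
class AnimationStateMachine:
    """Each animation state maps to an animation sequence; the transition table
    encodes which jumps between states are legal."""
    def __init__(self, transitions, sequences, initial="idle"):
        self.transitions = transitions  # {(state, trigger): next_state}
        self.sequences = sequences      # {state: animation sequence for that state}
        self.state = initial

    def step(self, trigger):
        """Jump to the next state for `trigger` and return its animation sequence.
        Triggers with no matching transition leave the current state unchanged."""
        self.state = self.transitions.get((self.state, trigger), self.state)
        return self.sequences[self.state]
```

For example, `transitions = {("idle", "heart_gesture"): "heart"}` with `sequences = {"idle": "idle_loop", "heart": "heart_burst"}` jumps from the idle loop to a heart-effect sequence when the driving information reports a heart gesture.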
In a possible implementation, the acquisition part 41 is further configured to acquire a virtual live broadcast scene corresponding to the real anchor; and the determination part 43 is further configured to: acquire an initial animation special effect matching the posture detection result; and determine, among the initial animation special effects, an animation special effect matching the virtual live broadcast scene as the target animation special effect.
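Selecting the scene-matched effect from the initial candidates can be sketched as a simple filter; tagging each candidate effect with the scenes it suits is an assumption about the data layout, which the disclosure leaves open:

```python
def pick_scene_effect(initial_effects, scene, default=None):
    """initial_effects: animation effects matching the posture detection result,
    each tagged (hypothetically) with the virtual live broadcast scenes it suits.
    Returns the first effect matching the current scene, else `default`."""
    for effect in initial_effects:
        if scene in effect["scenes"]:
            return effect
    return default
```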
In a possible implementation, when the target animation special effect includes at least one of a body movement special effect used to characterize a body movement of the virtual anchor model and a rendering material special effect, the determination part 43 is further configured to perform at least one of the following: displaying, in the live video interface, the body movement special effect of the body movement of the virtual anchor model; and displaying the rendering material special effect at a target position associated with the body movement of the virtual anchor model in the live video interface.
For descriptions of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the foregoing method embodiments, and details are not repeated here.
In the embodiments of the present disclosure and other embodiments, a "part" may be part of a circuit, part of a processor, part of a program or software, and the like; it may of course also be a unit, and may be modular or non-modular.
Corresponding to the special effect display method in FIG. 1, an embodiment of the present disclosure further provides a computer device 500. As shown in FIG. 4, which is a schematic structural diagram of the computer device 500 provided by an embodiment of the present disclosure, the computer device includes:
a processor 51, a memory 52, and a bus 53. The memory 52 is configured to store execution instructions and includes an internal memory 521 and an external memory 522. The internal memory 521, also called internal storage, is configured to temporarily store computing data for the processor 51 and data exchanged with the external memory 522 such as a hard disk; the processor 51 exchanges data with the external memory 522 through the internal memory 521. When the computer device 500 runs, the processor 51 communicates with the memory 52 through the bus 53, so that the processor 51 executes the following instructions:
acquiring a first video image of a real anchor during a live broadcast;
performing posture detection on specified body parts of the real anchor in the first video image to obtain a posture detection result;
when it is detected according to the posture detection result that the real anchor is in a preset posture, determining, according to the posture detection result, a target animation special effect of a virtual anchor model corresponding to the real anchor;
displaying the target animation special effect of the virtual anchor model in a live video interface corresponding to the real anchor.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the special effect display method described in the foregoing method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product carrying program code. The instructions included in the program code can be used to execute the steps of the special effect display method described in the foregoing method embodiments; for details, refer to the foregoing method embodiments, which are not repeated here.
The computer program product may be implemented by hardware, software, or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working process of the system and apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments, and details are not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is merely a division by logical function, and there may be other division manners in actual implementation; for another example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such an understanding, the technical solution of the present disclosure essentially, or the part contributing to the related art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are merely specific implementations of the present disclosure, used to illustrate the technical solutions of the present disclosure rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the art may, within the technical scope disclosed in the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
In the embodiments of the present disclosure, by recognizing the posture of a real anchor, the target animation special effect of the virtual anchor model driven by the real anchor is determined based on the posture detection result, and the target animation special effect is displayed in the live video interface. In this way, the posture detection result of the real anchor can trigger the display, on the live video interface, of the target animation special effect corresponding to the virtual anchor model, which improves the efficiency of interaction between the anchor user and viewers through body movements and also improves the live broadcast experience of live broadcast users.
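The flow summarized above — posture detection, preset-posture check, effect selection, display — can be sketched end to end for a single frame; every component here is a hypothetical placeholder, since the disclosure fixes the flow but not concrete models or data structures:

```python
def run_effect_pipeline(frame, detect, is_preset_posture, choose_effect, display):
    """One iteration of the disclosed method for a single live-stream frame."""
    result = detect(frame)                 # posture detection on the real anchor
    if not is_preset_posture(result):      # effects trigger only on preset postures
        return None
    effect = choose_effect(result)         # target effect for the virtual anchor model
    display(effect)                        # show it in the live video interface
    return effect
```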

Claims (14)

1. A special effect display method, comprising:
    acquiring a first video image of a real anchor during a live broadcast;
    performing posture detection on specified body parts of the real anchor in the first video image to obtain a posture detection result;
    when it is detected according to the posture detection result that the real anchor is in a preset posture, determining, according to the posture detection result, a target animation special effect of a virtual anchor model corresponding to the real anchor;
    displaying the target animation special effect of the virtual anchor model in a live video interface corresponding to the real anchor.
2. The method according to claim 1, wherein the posture detection result comprises at least one of a body detection result and a gesture classification result;
    the performing posture detection on the specified body parts of the real anchor in the first video image to obtain a posture detection result comprises:
    performing body detection on the specified body parts of the real anchor in the first video image to obtain a body detection result;
    when the body detection result contains a hand detection box, performing gesture detection on an image within the hand detection box in the first video image to obtain a gesture classification result.
3. The method according to claim 1 or 2, wherein the detecting, according to the posture detection result, that the real anchor is in a preset posture comprises:
    judging, according to the body detection result in the posture detection result, whether the real anchor in the first video image satisfies a gesture recognition condition, to obtain a judgment result;
    when the judgment result indicates that the real anchor satisfies the gesture recognition condition, detecting whether a gesture indicated by the gesture classification result in the posture detection result is a preset gesture;
    when it is detected that the gesture indicated by the gesture classification result is the preset gesture, determining that the real anchor is in the preset posture.
4. The method according to claim 3, wherein the judging, according to the body detection result in the posture detection result, whether the real anchor in the first video image satisfies a gesture recognition condition, to obtain a judgment result, comprises:
    determining relative orientation information between the specified body parts of the real anchor according to the body detection result;
    judging, according to the relative orientation information and preset orientation information, whether the real anchor in the first video image satisfies the gesture recognition condition, to obtain the judgment result, wherein the preset orientation information is used to characterize the relative orientation relationship between the specified body parts of the real anchor when the real anchor is in the preset posture.
5. The method according to any one of claims 1 to 4, wherein the determining, according to the posture detection result, a target animation special effect of a virtual anchor model corresponding to the real anchor comprises:
    determining, based on the posture detection result, first driving information for an animation special effect, wherein the first driving information is used to indicate animation jump information of the animation special effect of the virtual live broadcast model displayed in the live video interface;
    determining, according to the first driving information, an animation sequence matching the first driving information among a plurality of animation sequences corresponding to the posture detection result, and determining the matching animation sequence as the target animation special effect.
6. The method according to claim 5, wherein the determining, based on the posture detection result, first driving information for an animation special effect comprises:
    determining at least one video image preceding the first video image in the video stream;
    acquiring second driving information for the animation special effect determined from each of the video images, and determining estimated driving information for the animation special effect according to the posture detection result;
    determining an animation sequence whose display is driven by each piece of driving information among the second driving information and the estimated driving information, to obtain at least one animation sequence;
    determining, as the first driving information, the driving information corresponding to an animation sequence whose number of occurrences in the at least one animation sequence meets a preset count requirement.
7. The method according to claim 5, wherein the determining, according to the first driving information, an animation sequence matching the first driving information among the plurality of animation sequences corresponding to the posture detection result comprises:
    acquiring an animation state machine of the plurality of animation sequences, wherein the animation state machine is used to characterize jump relationships between a plurality of animation states, and each animation state corresponds to one or more animation sequences;
    determining, according to the first driving information, a next animation state to which the animation state machine is to jump;
    determining, according to the animation sequence corresponding to the next animation state, the animation sequence matching the first driving information.
8. The method according to any one of claims 1 to 7, wherein the method further comprises:
    acquiring a virtual live broadcast scene corresponding to the real anchor;
    wherein the determining, according to the posture detection result, a target animation special effect of a virtual anchor model corresponding to the real anchor comprises:
    acquiring an initial animation special effect matching the posture detection result;
    determining, among the initial animation special effects, an animation special effect matching the virtual live broadcast scene as the target animation special effect.
9. The method according to any one of claims 1 to 8, wherein the target animation special effect comprises at least one of a body movement special effect used to characterize a body movement of the virtual anchor model and a rendering material special effect;
    the displaying the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor comprises at least one of the following:
    displaying, in the live video interface, the body movement special effect of the body movement of the virtual anchor model;
    displaying the rendering material special effect at a target position associated with the body movement of the virtual anchor model in the live video interface.
10. A special effect display apparatus, comprising:
    an acquisition part, configured to acquire a first video image of a real anchor during a live broadcast;
    a posture detection part, configured to perform posture detection on specified body parts of the real anchor in the first video image to obtain a posture detection result;
    a determination part, configured to, when it is detected according to the posture detection result that the real anchor is in a preset posture, determine, according to the posture detection result, a target animation special effect of a virtual anchor model corresponding to the real anchor;
    a display part, configured to display the target animation special effect of the virtual anchor model in a live video interface corresponding to the real anchor.
11. A computer device, comprising: a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus; and when the machine-readable instructions are executed by the processor, the steps of the special effect display method according to any one of claims 1 to 9 are executed.
12. A computer-readable storage medium, on which a computer program is stored, wherein, when the computer program is run by a processor, the steps of the special effect display method according to any one of claims 1 to 9 are executed.
13. A computer program, comprising computer-readable code, wherein, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the steps of the special effect display method according to any one of claims 1 to 9.
14. A computer program product, comprising computer program instructions, wherein, when the computer program instructions are executed by a computer, the steps of the special effect display method according to any one of claims 1 to 9 are implemented.
PCT/CN2022/075015 2021-07-07 2022-01-29 Special effect display method and apparatus, computer device, storage medium, computer program, and computer program product WO2023279713A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110768288.7 2021-07-07
CN202110768288.7A CN113487709A (en) 2021-07-07 2021-07-07 Special effect display method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023279713A1 true WO2023279713A1 (en) 2023-01-12

Family

ID=77941870

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/075015 WO2023279713A1 (en) 2021-07-07 2022-01-29 Special effect display method and apparatus, computer device, storage medium, computer program, and computer program product

Country Status (3)

Country Link
CN (1) CN113487709A (en)
TW (1) TW202303526A (en)
WO (1) WO2023279713A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487709A (en) * 2021-07-07 2021-10-08 上海商汤智能科技有限公司 Special effect display method and device, computer equipment and storage medium
CN114302153B (en) * 2021-11-25 2023-12-08 阿里巴巴达摩院(杭州)科技有限公司 Video playing method and device
CN114092678A (en) * 2021-11-29 2022-02-25 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN116630488A (en) * 2022-02-10 2023-08-22 北京字跳网络技术有限公司 Video image processing method, device, electronic equipment and storage medium

Citations (7)

Publication number Priority date Publication date Assignee Title
CN106804007A (en) * 2017-03-20 2017-06-06 合网络技术(北京)有限公司 The method of Auto-matching special efficacy, system and equipment in a kind of network direct broadcasting
CN109922354A (en) * 2019-03-29 2019-06-21 广州虎牙信息科技有限公司 Living broadcast interactive method, apparatus, live broadcast system and electronic equipment
CN109936774A (en) * 2019-03-29 2019-06-25 广州虎牙信息科技有限公司 Virtual image control method, device and electronic equipment
US20200125031A1 (en) * 2018-10-19 2020-04-23 Infinite Kingdoms Llc System for providing an immersive experience using multi-platform smart technology, content streaming, and special effects systems
CN113038229A (en) * 2021-02-26 2021-06-25 广州方硅信息技术有限公司 Virtual gift broadcasting control method, virtual gift broadcasting control device, virtual gift broadcasting control equipment and virtual gift broadcasting control medium
CN113487709A (en) * 2021-07-07 2021-10-08 上海商汤智能科技有限公司 Special effect display method and device, computer equipment and storage medium
CN114155605A (en) * 2021-12-03 2022-03-08 北京字跳网络技术有限公司 Control method, control device and computer storage medium

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
CN106470343B (en) * 2016-09-29 2019-09-17 广州华多网络科技有限公司 Live video stream long-range control method and device
CN107277599A (en) * 2017-05-31 2017-10-20 珠海金山网络游戏科技有限公司 A kind of live broadcasting method of virtual reality, device and system
CN109803165A (en) * 2019-02-01 2019-05-24 北京达佳互联信息技术有限公司 Method, apparatus, terminal and the storage medium of video processing
CN110139115B (en) * 2019-04-30 2020-06-09 广州虎牙信息科技有限公司 Method and device for controlling virtual image posture based on key points and electronic equipment
CN110475150B (en) * 2019-09-11 2021-10-08 广州方硅信息技术有限公司 Rendering method and device for special effect of virtual gift and live broadcast system
CN111667589A (en) * 2020-06-12 2020-09-15 上海商汤智能科技有限公司 Animation effect triggering display method and device, electronic equipment and storage medium
CN112135160A (en) * 2020-09-24 2020-12-25 广州博冠信息科技有限公司 Virtual object control method and device in live broadcast, storage medium and electronic equipment
CN112752149B (en) * 2020-12-29 2023-06-06 广州繁星互娱信息科技有限公司 Live broadcast method, live broadcast device, terminal and storage medium

Patent Citations (8)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN106804007A (en) * | 2017-03-20 | 2017-06-06 | 合网络技术(北京)有限公司 | Method, system and device for automatically matching special effects in webcast live streaming |
| US20200125031A1 (en) * | 2018-10-19 | 2020-04-23 | Infinite Kingdoms Llc | System for providing an immersive experience using multi-platform smart technology, content streaming, and special effects systems |
| CN109922354A (en) * | 2019-03-29 | 2019-06-21 | 广州虎牙信息科技有限公司 | Live broadcast interaction method and apparatus, live broadcast system and electronic device |
| CN109936774A (en) * | 2019-03-29 | 2019-06-25 | 广州虎牙信息科技有限公司 | Virtual image control method and device, and electronic device |
| CN111641844A (en) * | 2019-03-29 | 2020-09-08 | 广州虎牙信息科技有限公司 | Live broadcast interaction method and device, live broadcast system and electronic device |
| CN113038229A (en) * | 2021-02-26 | 2021-06-25 | 广州方硅信息技术有限公司 | Virtual gift broadcasting control method, device, equipment and medium |
| CN113487709A (en) * | 2021-07-07 | 2021-10-08 | 上海商汤智能科技有限公司 | Special effect display method and device, computer equipment and storage medium |
| CN114155605A (en) * | 2021-12-03 | 2022-03-08 | 北京字跳网络技术有限公司 | Control method, control device and computer storage medium |

Also Published As

Publication number Publication date
CN113487709A (en) 2021-10-08
TW202303526A (en) 2023-01-16

Similar Documents

Publication Publication Date Title
WO2023279713A1 (en) Special effect display method and apparatus, computer device, storage medium, computer program, and computer program product
KR102292537B1 (en) Image processing method and apparatus, and storage medium
CN111726536B (en) Video generation method, device, storage medium and computer equipment
US10198845B1 (en) Methods and systems for animating facial expressions
Betancourt et al. The evolution of first person vision methods: A survey
CN109729426B (en) Method and device for generating video cover image
US20180121733A1 (en) Reducing computational overhead via predictions of subjective quality of automated image sequence processing
WO2023273500A1 (en) Data display method, apparatus, electronic device, computer program, and computer-readable storage medium
CN108525305B (en) Image processing method, image processing device, storage medium and electronic equipment
GB2590208A (en) Face-based special effect generation method and apparatus, and electronic device
US20150248167A1 (en) Controlling a computing-based device using gestures
JP6750046B2 (en) Information processing apparatus and information processing method
CN106201173B (en) Projection-based interaction control method and system for user interactive icons
CN109064387A (en) Image special effect generation method, device and electronic equipment
CN113422977B (en) Live broadcast method and device, computer equipment and storage medium
WO2023024442A1 (en) Detection method and apparatus, training method and apparatus, device, storage medium and program product
TW202304212A (en) Live broadcast method, system, computer equipment and computer readable storage medium
CN109034063A (en) Multi-face tracking method and device for face special effects, and electronic device
KR20130032620A (en) Method and apparatus for providing moving picture using 3d user avatar
CN113766168A (en) Interactive processing method, device, terminal and medium
CN111638784A (en) Facial expression interaction method, interaction device and computer storage medium
CN111797850A (en) Video classification method and device, storage medium and electronic equipment
JP6730461B2 (en) Information processing system and information processing apparatus
CN114513694A (en) Scoring determination method and device, electronic equipment and storage medium
JP5776471B2 (en) Image display system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 22836473
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE