TW202303526A - Special effect display method, computer equipment and computer-readable storage medium - Google Patents

Special effect display method, computer equipment and computer-readable storage medium

Info

Publication number
TW202303526A
Authority
TW
Taiwan
Prior art keywords
animation
anchor
detection result
real
gesture
Prior art date
Application number
TW111117706A
Other languages
Chinese (zh)
Inventor
邱豐
劉昕
王佳梨
錢晨
Original Assignee
大陸商上海商湯智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 大陸商上海商湯智能科技有限公司
Publication of TW202303526A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/203D [Three Dimensional] animation
    • G06T13/403D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44218Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Processing Or Creating Images (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure provides a special effect display method, computer equipment, and a computer-readable storage medium. The method includes: acquiring a first video image of a real anchor during a live broadcast; performing posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; when it is detected from the posture detection result that the real anchor is in a preset posture, determining, according to the posture detection result, a target animation effect of the virtual anchor model corresponding to the real anchor; and displaying the target animation effect of the virtual anchor model in the live video interface corresponding to the real anchor.

Description

Special effect display method, computer equipment, and computer-readable storage medium

The present invention relates to the technical field of computers, and in particular to a special effect display method, computer equipment, and a computer-readable storage medium.

In current virtual live broadcasting, an anchor triggers the display of special-effect animations by operating effect-trigger controls on the live-streaming device. For example, the anchor may manually operate a mouse or keyboard to trigger an effect animation, or click or press pre-configured shortcut keys in the live-streaming software to trigger and play one. Because such virtual live-streaming solutions require the anchor to trigger effect animations manually, they occupy the anchor's hands during the broadcast, reduce the efficiency of hand-gesture interaction between the anchor and viewers, and degrade the anchor's experience with the live-streaming software.

Embodiments of the present invention provide at least a special effect display method, computer equipment, and a computer-readable storage medium.

In a first aspect, an embodiment of the present invention provides a special effect display method, including: acquiring a first video image of a real anchor during a live broadcast; performing posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; when it is detected from the posture detection result that the real anchor is in a preset posture, determining, according to the posture detection result, a target animation effect of the virtual anchor model corresponding to the real anchor; and displaying the target animation effect of the virtual anchor model in the live video interface corresponding to the real anchor.
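The four steps of the first aspect can be sketched as a per-frame loop. This is a minimal illustration only: the function names, the posture labels, and the posture-to-effect mapping are all invented, and the stub detector stands in for the body/gesture models the description introduces later.

```python
# Hypothetical sketch of the first-aspect flow; all names are illustrative.

def detect_posture(frame):
    """Stand-in posture detector: returns a posture label for the frame.
    A real implementation would run body and gesture models on the image."""
    return frame.get("posture")

def display_effect(interface, effect):
    """Stand-in for rendering the animation effect in the live interface."""
    interface.append(effect)

# Invented mapping from preset postures to target animation effects.
PRESET_POSTURES = {"heart_hands": "heart_rain_effect"}

def process_frame(frame, live_interface):
    # Step 1: the first video image is the incoming frame.
    # Step 2: posture detection on designated body parts.
    result = detect_posture(frame)
    # Step 3: only a preset posture triggers an effect.
    if result in PRESET_POSTURES:
        effect = PRESET_POSTURES[result]
        # Step 4: display the target animation effect in the live interface.
        display_effect(live_interface, effect)
        return effect
    return None
```

A non-preset posture simply yields no effect, matching the triggering-condition check described for steps S105 and S107 below.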

Embodiments of the present invention apply to the field of virtual live streaming: the virtual anchor model driven by a real anchor can be displayed in the live video interface, together with the model's animation effects. That is, by recognizing the real anchor's posture, the target animation effect of the virtual anchor model driven by the real anchor can be determined from the posture detection result and displayed in the live video interface. The real anchor's posture detection result thus triggers the display of the target animation effect of the virtual anchor model on the live video interface, without relying on an external control device, which also improves the live-streaming experience of virtual live-streaming users.

In a second aspect, an embodiment of the present invention further provides a special effect display apparatus, including: an acquisition unit configured to acquire a first video image of a real anchor during a live broadcast; a posture detection unit configured to perform posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; a determination unit configured to, when it is detected from the posture detection result that the real anchor is in a preset posture, determine, according to the posture detection result, a target animation effect of the virtual anchor model corresponding to the real anchor; and a display unit configured to display the target animation effect of the virtual anchor model in the live video interface corresponding to the real anchor.

In a third aspect, an embodiment of the present invention further provides a computer device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the first aspect, or of any possible implementation of the first aspect, are performed.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the first aspect, or of any possible implementation of the first aspect.

In a fifth aspect, an embodiment of the present invention further provides a computer program including computer-readable code; when the computer-readable code runs on an electronic device, a processor in the device performs the steps of the first aspect, or of any possible implementation of the first aspect.

In a sixth aspect, the present invention provides a computer program product including computer program instructions that, when executed by a computer, implement the steps of the first aspect, or of any possible implementation of the first aspect.

To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings.

It should be noted that similar reference numerals and letters denote similar items in the following figures; once an item is defined in one figure, it need not be further defined or explained in subsequent figures.

The term "and/or" herein merely describes an association relationship and indicates that three relationships are possible; for example, "A and/or B" covers three cases: A alone, both A and B, and B alone. In addition, the term "at least one" herein means any one of several items, or any combination of at least two of them; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.

Research has found that, because related virtual live-streaming solutions require the anchor to trigger effect animations manually, they occupy the anchor's hands during the broadcast, which reduces the efficiency of hand-gesture interaction between the anchor and viewers and in turn degrades the user experience of the live-streaming software.

Based on the above research, the present invention provides a special effect display method, apparatus, computer device, storage medium, computer program, and computer program product. The technical solution provided by the present invention can be applied to virtual live-streaming scenarios, in which a preset virtual anchor model, such as a red panda, a rabbit, or a cartoon character, replaces the real anchor's actual image during the broadcast; the live video frame then shows the virtual anchor model. The real anchor can also interact with the audience through this virtual anchor model.

For example, the camera of the live-streaming device may capture video images containing the real anchor and then capture the anchor's body in those images to obtain the anchor's posture information. Once the posture information is determined, a corresponding driving signal can be generated, which drives the live-streaming device to display the animation effect corresponding to the virtual anchor model in the live video frame.

In an optional implementation, the real anchor may preset one or more corresponding virtual anchor models in advance; for example, a preset virtual anchor model could be "the YYY character model from the game XXX". When starting a virtual live broadcast, the anchor can select one of the preset models as the virtual anchor model for the current session. The virtual anchor model may be a 2D model or a 3D model.

In another optional implementation, besides the approach described above in which the real anchor chooses the virtual anchor model, a virtual anchor model can also be rebuilt for the real anchor in the first video image after that image has been acquired.

In some embodiments of the present invention, the live-streaming device can recognize the real anchor contained in the video image and rebuild a virtual anchor model for the anchor according to the recognition result. Here, the recognition result may include at least one of: the real anchor's gender, appearance features, and clothing features. The live-streaming device can then search the virtual anchor model library for a model matching the recognition result and use it as the real anchor's virtual anchor model.
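The library search described above can be pictured as matching recognition tags against tagged models. This is a deliberately simplified sketch: the library contents, tag names, and scoring rule (most shared tags wins) are assumptions, not details from the patent.

```python
# Invented model library; each entry carries descriptive tags.
MODEL_LIBRARY = [
    {"name": "red_panda", "tags": {"cute", "animal"}},
    {"name": "hiphop_avatar", "tags": {"hiphop", "peaked_cap"}},
]

def match_model(recognition_tags):
    """Return the library model sharing the most tags with the recognition
    result, or None when nothing matches at all."""
    best = max(MODEL_LIBRARY, key=lambda m: len(m["tags"] & recognition_tags))
    return best["name"] if best["tags"] & recognition_tags else None
```

For an anchor recognized as wearing a peaked cap in a hip-hop style, `match_model({"peaked_cap", "hiphop"})` would pick the hip-hop avatar, mirroring the example in the next paragraph.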

For example, when the live-streaming device determines from the recognition result that the real anchor is wearing a peaked cap and hip-hop-style clothes during the broadcast, it can search the virtual anchor model library and use a found model matching "peaked cap" or "hip-hop style" as the real anchor's virtual anchor model.

In some embodiments of the present invention, besides searching the virtual anchor model library for a model matching the recognition result, the live-streaming device can also build a corresponding virtual anchor model for the real anchor in real time from the recognition result through a model construction module.

Here, when building the virtual anchor model in real time, the virtual anchor models used by the real anchor in past virtual broadcasts can also serve as references for building the model the anchor drives at the current moment.

The ways of determining the virtual anchor model described above make it possible to customize a virtual anchor model for each real anchor, increasing the diversity of virtual anchor models; a personalized model also leaves a deeper impression on the audience.

To facilitate understanding of this embodiment, a special effect display method disclosed in the embodiments of the present invention is introduced in detail here. The execution subject of the method is generally a computer device with a certain computing capability: the method may be executed by a terminal device, a server, or another processing device, where the terminal device may be user equipment, a mobile device, a user terminal, a terminal, a cellular phone, a personal digital assistant, a handheld device, a computing device, a vehicle-mounted device, a wearable device, and so on. The computer device may be any live-streaming device that supports installing virtual live-streaming software. In some possible implementations, the method may be implemented by a processor calling computer-readable instructions stored in a memory.

Referring to FIG. 1, which is a flowchart of a special effect display method provided by an embodiment of the present invention, the method includes steps S101 to S107 as follows.

S101: Acquire a first video image of a real anchor during a live broadcast.

S103: Perform posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result.

In the embodiment of the present invention, the live-streaming device can capture the video stream of the real anchor during the broadcast through a camera pre-installed on the device; the first video image is a video frame of that stream.

Here, the video images of the captured stream may contain the real anchor's face and upper-body parts. In some embodiments, the video image may also contain part or all of the hands. In an actual live-streaming scene, when the real anchor leaves the camera's shooting range, or when the anchor's scene is relatively complex, the video image often contains an incomplete face and/or incomplete upper-body parts.

In the embodiment of the present invention, the designated body parts may be at least some of the real anchor's designated body parts. Here, the designated body parts include the head and the upper-body parts (the two arms, the hands, and the upper torso).

It can be understood that, when there are multiple designated body parts, the above posture detection result may characterize at least one of the following: the relative positional relationship between the designated body parts, and the gesture classification result of the gesture contained in the first video image.

S105: When it is detected from the posture detection result that the real anchor is in a preset posture, determine, according to the posture detection result, a target animation effect of the virtual anchor model corresponding to the real anchor.

S107: Display the target animation effect of the virtual anchor model in the live video interface corresponding to the real anchor.

In the embodiment of the present invention, after determining the posture detection result, the live-streaming device can check from it whether the real anchor is in a preset posture. If the real anchor is in a preset posture, the real anchor in the first video image is determined to satisfy the triggering condition of the animation effect; the device can then determine the target animation effect matching the first video image and display it.

The embodiment of the present invention can be applied to the field of virtual live streaming: the live-streaming device can display the virtual anchor model driven by the real anchor in the live video interface, along with the model's animation effects. Here, by recognizing the real anchor's posture, the device determines the target animation effect of the virtual anchor model driven by the real anchor from the posture detection result and displays it in the live video interface. The real anchor's posture detection result thus triggers the display of the target animation effect of the virtual anchor model on the live video interface, without relying on an external control device, which also improves the live-streaming experience of virtual live-streaming users.

In some embodiments of the present invention, for the above step S103, when the posture detection result includes at least one of a body detection result and a gesture classification result, performing posture detection on the designated body parts of the real anchor in the first video image to obtain the posture detection result may include the following process: Step S1031: Perform body detection on the designated body parts of the real anchor in the first video image to obtain a body detection result. Step S1032: When the body detection result contains a hand detection box, perform gesture detection on the image within the hand detection box in the first video image to obtain a gesture classification result. Step S1033: Determine the posture detection result according to the body detection result and the gesture classification result.
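Steps S1031 to S1033 can be sketched as a small pipeline. The detector and classifier below are stubs standing in for the body detection model and gesture recognition model the description names; the input format and field names are assumptions made for illustration.

```python
def detect_body(frame):
    """Stub body detector: returns body keypoints and an optional hand box."""
    return {"keypoints": frame["keypoints"], "hand_box": frame.get("hand_box")}

def classify_gesture(frame, hand_box):
    """Stub gesture classifier, notionally run on the crop inside hand_box."""
    return frame.get("gesture")

def posture_detection(frame):
    body = detect_body(frame)                # S1031: body detection
    gesture = None
    if body["hand_box"] is not None:         # S1032: only when a hand box exists
        gesture = classify_gesture(frame, body["hand_box"])
    # S1033: the posture detection result combines both partial results.
    return {"body": body, "gesture": gesture}
```

When no hand box is found, the gesture slot stays empty, which corresponds to the frame-discarding branch described further below.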

In the embodiment of the present invention, the live-streaming device can perform body detection on the designated body parts of the real anchor in the first video image through a body detection model to obtain a body detection result. Here, the body detection result includes at least one of: body keypoints, the size of the face box, the position of the face box, the size of the hand detection box, and the position of the hand detection box.
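The fields listed above can be collected in one container. A minimal sketch, with assumed field names and an assumed `(x, y, width, height)` box encoding; every field is optional, since the result "includes at least one of" them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # assumed encoding: (x, y, width, height)

@dataclass
class BodyDetectionResult:
    keypoints: List[Tuple[int, int]] = field(default_factory=list)
    face_box: Optional[Box] = None   # covers both face-box size and position
    hand_box: Optional[Box] = None   # covers both hand-box size and position

    def has_hand(self) -> bool:
        """Gesture detection (step S1032) only runs when a hand box exists."""
        return self.hand_box is not None
```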

As shown in FIG. 2, the live-streaming device can determine, through the body detection model, the body keypoints of the designated body parts of the real anchor in the first video image. When a clear facial image is recognized in the first video image, at least one of the size and position of the face box is obtained. Then, when a clear hand image is recognized in the first video image, at least one of the size and position of the hand detection box can be obtained.

In the embodiment of the present invention, when the body detection result is found to contain a hand detection box, the live-streaming device can also perform gesture detection on the image within the hand detection box through a gesture recognition model to obtain a gesture classification result. The gesture classification result is a feature vector representing the probability that the real anchor's hand pose in the first video image belongs to each preset gesture.
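Reading such a probability vector back into a gesture label is a simple argmax. A hedged sketch: the gesture names and the confidence threshold are invented, and the patent does not specify how the vector is consumed.

```python
# Invented preset gesture vocabulary, index-aligned with the probability vector.
GESTURES = ["heart_hands", "thumbs_up", "wave"]

def pick_gesture(probs, threshold=0.5):
    """Return the most probable preset gesture, or None when the top
    probability falls below the (assumed) confidence threshold."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return GESTURES[best] if probs[best] >= threshold else None
```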

In the embodiment of the present invention, after obtaining the above body detection result and gesture classification result, the live-streaming device can take them together as the posture detection result. It then judges, from the body detection result and the gesture classification result, whether the real anchor in the first video image satisfies the triggering condition of the animation effect; if so, the target animation effect of the virtual anchor model corresponding to the real anchor is determined according to the posture detection result.

In the embodiment of the present invention, when the body detection result does not contain a hand detection box, the live-streaming device can discard the first video image and take the next video frame in the video stream as the first video image, processing it through the steps described above; the processing is not described in detail again here.

Here, the body detection result contains body keypoints, for example, the body keypoints of each designated body part. If the designated body parts include the head and the upper-body parts, the body keypoints include keypoints of the head, as well as keypoints of the two arms, the hands, and the upper torso.

In the above implementation, by performing body detection and gesture detection on the first video image and integrating the body detection result with the gesture classification result, a posture detection result that accurately represents the action semantics of the real anchor in the first video image can be obtained. When the target animation special effect is determined from this posture detection result, the accuracy of the triggered and displayed target animation special effect can be improved.

In an optional implementation, in the above step S105, detecting, according to the posture detection result, that the real anchor is in a preset posture may include the following steps:

Step S11: according to the body detection result in the posture detection result, judge whether the real anchor in the first video image satisfies a gesture recognition condition, and obtain a judgment result.

Step S12: when the judgment result indicates that the real anchor satisfies the gesture recognition condition, detect whether the gesture indicated by the gesture classification result in the posture detection result is a preset gesture.

Step S13: when it is detected that the gesture indicated by the gesture classification result is the preset gesture, determine that the real anchor is in the preset posture.
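The S11-S13 flow above can be sketched as a short guard chain; the boolean inputs are illustrative simplifications of the judgment result and the gesture classification result, not the patent's actual data structures:

```python
def detect_preset_posture(meets_recognition_condition, classified_gesture, preset_gesture):
    """Sketch of steps S11-S13: S11 yields a judgment result on the gesture
    recognition condition; S12 checks the classified gesture only when that
    condition holds; S13 confirms that the real anchor is in the preset posture."""
    if not meets_recognition_condition:              # S11 judgment result fails
        return False
    return classified_gesture == preset_gesture      # S12 check, S13 confirmation

in_posture = detect_preset_posture(True, "ok", "ok")
```

Note that the gesture comparison in S12 is skipped entirely when the body-level condition of S11 fails, which is what makes the two-stage check cheap for real-time use.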

In an embodiment of the present invention, after determining the body detection result described above, the live broadcast device may judge, according to the body detection result, whether the real anchor in the first video image satisfies the gesture recognition condition.

In an embodiment of the present invention, the live broadcast device may determine the relative positional relationship between the specified body parts according to the body detection result, and determine, according to this relative positional relationship, whether the real anchor in the first video image satisfies the gesture recognition condition.

It can be understood that the above relative positional relationship includes at least one of the following: the relative distance between the specified body parts, and the angular relationship between associated body parts among the specified body parts. The associated body parts may be adjacent specified body parts, or specified body parts of the same type.

In an embodiment of the present invention, the real anchor in the first video image satisfying the gesture recognition condition may be understood as: the body movement of the real anchor in the first video image is a preset body movement.

Therefore, when it is detected that the body movement of the real anchor in the first video image is a preset body movement, the live broadcast device may detect whether the gesture made by the real anchor in the first video image is a preset gesture.

In an embodiment of the present invention, when the live broadcast device detects that the gesture made by the real anchor in the first video image is the preset gesture, it may determine that the real anchor is in the preset posture. At this point, it may be determined that the real anchor in the first video image satisfies the triggering condition of the animation special effect, and the step of determining, according to the posture detection result, the target animation special effect of the virtual anchor model corresponding to the real anchor is then executed.

In an embodiment of the present invention, the live broadcast device may determine the target animation special effect of the virtual anchor model through the combination of the body detection result and the gesture classification result. In this case, not only must the gesture of the real anchor be a preset gesture, but the body movement of the real anchor when making the preset gesture must also be a preset body movement. In this way, first judging whether the real object in the first video image satisfies the gesture recognition condition based on the body detection result can improve the efficiency of posture comparison and shorten the comparison time, so that the technical solution is applicable to live broadcast scenarios with high real-time requirements.

In an optional implementation, the above step S11 of determining, according to the body detection result in the posture detection result, whether the real anchor in the first video image satisfies the gesture recognition condition may include the following process:

(1) determining relative orientation information between the specified body parts of the real anchor according to the body detection result;

(2) judging, according to the relative orientation information and preset orientation information, whether the real anchor in the first video image satisfies the gesture recognition condition, and obtaining the judgment result, where the preset orientation information represents the relative orientation relationship between the specified body parts of the real anchor in a preset posture.

In an embodiment of the present invention, when the posture detection result includes the body detection result and the gesture classification result, the live broadcast device may first determine, according to the body detection result, the relative orientation information between the specified body parts of the real anchor (that is, the relative positional relationship described above). Here, the relative orientation information between the specified body parts may include: relative orientation information between the hand and the face, between the two arms, between the upper torso and an arm, and between an arm and the upper torso.

Here, the relative orientation information may include at least one of the following: a relative distance, a relative angle, and a relative direction.

In an embodiment of the present invention, the relative distance may include the relative distance between the specified body parts. For example, the relative distance between the center point of the hand detection frame and the center point of the face detection frame is M1 pixels; the relative distance between the elbow of the real anchor's left arm and the elbow of the right arm is M2 pixels; and the relative distances between the center point of the upper torso and the elbow of each arm are M3 pixels and M4 pixels respectively.

In an embodiment of the present invention, the relative angle may include the included angle between the specified body parts. For example, the included angle between the horizontal line and the line connecting the center point of the hand detection frame and the center point of the face detection frame is N1; the included angle between the left arm and the right arm of the real anchor is N2; and the included angles between the upper torso and each arm are N3 and N4.

In an embodiment of the present invention, the relative direction may include direction information between the specified body parts. For example, the hand detection frame is to the left of (or to the right of, below, above, etc.) the face detection frame; the real anchor's left arm is to the left of the right arm; the upper torso is to the right of the left arm and to the left of the right arm; and so on.
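The three kinds of relative orientation information above (distance, angle, direction) can be computed from two body-part center points as follows; the point values and the left/right convention along the x-axis are illustrative assumptions:

```python
import math

def relative_orientation(p1, p2):
    """Compute relative distance (pixels), relative angle of the connecting
    line to the horizontal (degrees), and relative direction of p2 with
    respect to p1 along the x-axis, for two detection-frame center points."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    distance = math.hypot(dx, dy)              # relative distance in pixels
    angle = math.degrees(math.atan2(dy, dx))   # angle vs. horizontal line
    direction = "left" if dx < 0 else "right"  # where p2 lies relative to p1
    return distance, angle, direction

# e.g. hand-frame center vs. face-frame center (image coordinates)
dist, ang, direc = relative_orientation((100, 200), (160, 120))
```

With image coordinates (y grows downward), a negative angle here means the second point is above the first, which is how "above/below" direction information could also be derived.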

In an embodiment of the present invention, after determining the relative orientation information, the live broadcast device may compare the relative orientation information with the preset orientation information to obtain a comparison result, and then judge, according to the comparison result, whether the real anchor in the first video image satisfies the gesture recognition condition.

In an optional implementation, the relative orientation information contains a plurality of pieces of first sub-information, and the preset orientation information contains a plurality of pieces of second sub-information. Comparing the relative orientation information with the preset orientation information to obtain the comparison result may include:

(a) pairing sub-information of the same type among the pieces of first sub-information and the pieces of second sub-information to obtain a plurality of information pairs to be compared. Here, "the same type" means corresponding to the same type of specified body part and representing the same physical meaning. For example, if a piece of first sub-information represents the relative distance between the center point of the hand detection frame and the center point of the face detection frame, the paired second sub-information is the piece of preset orientation information that likewise represents the relative distance between the center point of the hand detection frame and the center point of the face detection frame.

(b) determining the difference between the first sub-information and the second sub-information in each information pair to be compared, thereby obtaining a plurality of difference values; the comparison result includes the plurality of difference values.
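The pairing and difference computation of steps (a) and (b) can be sketched as follows; representing sub-information as a dict keyed by an assumed type label is an illustrative simplification:

```python
def compare_orientation(observed, preset):
    """Step (a): pair same-type sub-information from the observed relative
    orientation information and the preset orientation information (matched
    by a shared type key). Step (b): compute one absolute difference value
    per pair; the returned dict is the comparison result."""
    diffs = {}
    for key, first_sub in observed.items():
        if key in preset:                   # only same-type pairs are compared
            diffs[key] = abs(first_sub - preset[key])
    return diffs

observed = {"hand_face_dist": 95.0, "arm_angle": 40.0}
preset   = {"hand_face_dist": 100.0, "arm_angle": 45.0, "torso_arm_angle": 30.0}
diffs = compare_orientation(observed, preset)
```

Preset entries with no observed counterpart (here `torso_arm_angle`) simply produce no pair, matching the requirement that only same-type sub-information is compared.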

In an embodiment of the present invention, after obtaining the plurality of difference values, the live broadcast device may judge, according to the difference values, whether the real anchor in the first video image satisfies the gesture recognition condition.

In some embodiments of the present invention, when every difference value is smaller than a preset difference threshold, the live broadcast device obtains the judgment result that the real anchor in the first video image satisfies the gesture recognition condition; when at least one of the difference values is greater than or equal to the preset difference threshold, it obtains the judgment result that the real anchor in the first video image does not satisfy the gesture recognition condition.

In some embodiments of the present invention, when it is determined that the number of difference values smaller than the preset difference threshold is greater than or equal to a preset quantity threshold, the live broadcast device obtains the judgment result that the real anchor in the first video image satisfies the gesture recognition condition; when it is determined that the number of difference values smaller than the preset difference threshold is less than the preset quantity threshold, it obtains the judgment result that the real anchor in the first video image does not satisfy the gesture recognition condition.

In some embodiments of the present invention, the live broadcast device may determine a weight value corresponding to each difference value, and then compute a weighted sum of the difference values and their corresponding weight values. When the weighted-sum result is less than or equal to a preset weighting threshold, it is determined that the real anchor in the first video image satisfies the gesture recognition condition; when the weighted-sum result is greater than the preset weighting threshold, it is determined that the real anchor in the first video image does not satisfy the gesture recognition condition.
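The weighted-sum variant above can be sketched as follows; the weight values, the default weight of 1.0, and the threshold are illustrative assumptions:

```python
def satisfies_condition(diffs, weights, threshold):
    """Weighted-sum decision: multiply each difference value by its weight
    and compare the sum against the preset weighting threshold. The gesture
    recognition condition is satisfied when the sum does not exceed it."""
    weighted_sum = sum(d * weights.get(k, 1.0) for k, d in diffs.items())
    return weighted_sum <= threshold

diffs = {"hand_face_dist": 5.0, "arm_angle": 4.0}
weights = {"hand_face_dist": 0.5, "arm_angle": 1.0}
ok = satisfies_condition(diffs, weights, threshold=10.0)
```

Here the weighted sum is 0.5*5.0 + 1.0*4.0 = 6.5, which is below the threshold of 10.0, so the condition is satisfied; weighting lets less reliable measurements (e.g. noisy angles) count for less.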

It should be noted that the preset difference threshold, the preset quantity threshold, and the preset weighting threshold can all be set according to actual needs; the embodiments of the present invention do not limit them.

In the above implementation, determining whether the body movement of the real anchor in the first video image is a preset body movement by comparing the relative orientation information with the preset orientation information yields a more accurate body comparison result, so that whether the first video image satisfies the gesture recognition condition can be determined more accurately.

In an optional implementation, in the above step S105, determining, according to the posture detection result, the target animation special effect of the virtual anchor model corresponding to the real anchor may include the following steps:

Step S21: based on the posture detection result, determine first driving information for the animation special effect, where the first driving information indicates animation jump information of the animation special effect displayed in the live video interface.

Step S22: according to the first driving information, determine, among a plurality of animation sequences corresponding to the posture detection result, an animation sequence matching the first driving information, and determine the matching animation sequence as the target animation special effect.

In an embodiment of the present invention, the first driving information may be a 1×(P+Q) matrix, where P may be the number of preset postures and Q may be the number of animation sequences corresponding to a preset posture.

Here, the preset posture matching the posture detection result of the real anchor in the first video image can be determined among the plurality of preset postures through the first driving information. In the above 1×(P+Q) matrix, the element corresponding to the matching preset posture may be set to "1" and the elements corresponding to the other preset postures may be set to "0", where "1" indicates that the preset posture matches the posture detection result of the real anchor and "0" indicates that it does not.

In the above 1×(P+Q) matrix, Q is associated with the matching preset posture; that is, the Q elements can be understood as corresponding to the plurality of animation sequences of the matching preset posture. The element corresponding to the animation sequence matching the first video image may be set to "1", and the remaining elements may be set to "0".
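The 1×(P+Q) driving-information vector described above can be built as follows; the index conventions (poses first, then sequences) are illustrative assumptions:

```python
def build_driving_info(p, q, matched_pose, matched_sequence):
    """Build the 1x(P+Q) driving-information vector: among the first P
    slots, set the matched preset posture to 1; among the following Q
    slots, set the matched animation sequence to 1; all other elements
    stay 0."""
    vec = [0] * (p + q)
    vec[matched_pose] = 1             # which of the P preset postures matched
    vec[p + matched_sequence] = 1     # which of its Q animation sequences matched
    return vec

# P = 3 preset postures, Q = 4 sequences of the matched posture
drive = build_driving_info(3, 4, matched_pose=1, matched_sequence=2)
```

Exactly two elements are "1": one pose slot and one sequence slot, which is the compact data format the patent credits with saving device memory.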

In an embodiment of the present invention, after determining the first driving information, the live broadcast device may determine, according to the first driving information, the preset posture matching the posture detection result among the plurality of preset postures, and determine, among the plurality of animation sequences corresponding to that preset posture, the target animation special effect of the virtual anchor model corresponding to the real anchor.

For any preset posture, the live broadcast device may predefine a plurality of phases, for example an action entry phase, an action hold phase, and an action exit phase. Exemplarily, the preset posture is the real anchor showing an "OK" sign with a hand movement as shown in FIG. 2: raising the arm to head level and showing the "OK" sign may be the action entry phase, keeping the "OK" sign may be the action hold phase, and lowering the arm from head level may be the action exit phase. It should be noted that the live broadcast device may predefine a corresponding animation sequence for each phase and identify which phase the preset posture is in according to the predefined animation sequences.

In the above implementation, by determining the first driving information, the live broadcast device can determine the target animation special effect from the first driving information, thereby simplifying the data format and saving internal device memory, which helps ensure the smoothness of the live broadcast.

In an optional implementation, step S21 of determining the first driving information for the animation special effect based on the posture detection result may include the following process:

(1) determining at least one video image preceding the first video image in the video stream;

(2) acquiring second driving information for the animation special effect determined from each of the video images, and determining estimated driving information for the animation special effect according to the posture detection result;

(3) determining the animation sequence whose display is driven by each piece of driving information among the second driving information and the estimated driving information, to obtain at least one animation sequence;

(4) determining, as the first driving information, the driving information corresponding to an animation sequence whose number of occurrences in the at least one animation sequence meets a preset count requirement.

In an embodiment of the present invention, the live broadcast device may detect the stability of its frame rate and determine, according to the frame-rate stability, the at least one video image preceding the first video image in the video stream.

For example, for a live broadcast device with a relatively stable frame rate, a target time window may be determined. Exemplarily, if the acquisition moment corresponding to the first video image is T seconds, the target time window may be [T-1, T]. That is, the live broadcast device may determine the video images of the video stream that fall within this target time window as the at least one video image.

For another example, for a live broadcast device with an unstable frame rate, the N video frames preceding the first video image in the video stream may be acquired as the at least one video image.

After obtaining the at least one video image, the live broadcast device may acquire the second driving information, which is determined from each of the video images; in this way, the live broadcast device obtains at least one piece of second driving information, and determines estimated driving information for the animation special effect according to the posture detection result. It may then determine the animation sequence whose display is driven by each piece of driving information among the second driving information and the estimated driving information, thereby obtaining at least one animation sequence.

After obtaining the at least one animation sequence, the animation sequence whose number of occurrences meets the preset count requirement can be determined from it, and the driving information corresponding to that animation sequence is determined as the first driving information. Exemplarily, the preset count requirement may be the highest number of occurrences; that is, the live broadcast device may take the animation sequence that occurs most frequently among the at least one animation sequence as the animation sequence meeting the preset count requirement.

For example, video stream acquisition starts at time T0, and by time T1, 30 video frames have been obtained. The live broadcast device may acquire the second driving information for the animation special effect determined from each video image, obtaining 30 pieces of second driving information; determine the animation sequence whose display is driven by each of the 30 pieces of second driving information, obtaining 30 animation sequences; and determine the second driving information corresponding to the animation sequence with the highest number of occurrences among the 30 animation sequences as the first driving information of each of the 30 video frames.
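The majority-vote step of the example above can be sketched as follows; modeling each frame's driving information simply as the id of the animation sequence it would display is an illustrative simplification:

```python
from collections import Counter

def stabilize(per_frame_sequences):
    """Time-series decision stabilization: each frame in the window votes
    for the animation sequence its driving information would display, and
    the sequence with the most votes determines the first driving
    information for the whole window."""
    counts = Counter(per_frame_sequences)
    winner, _ = counts.most_common(1)[0]   # most frequent animation sequence
    return winner

# e.g. a 10-frame window where a few jittery frames vote differently
frames = ["seq_wave"] * 7 + ["seq_idle"] * 2 + ["seq_ok"]
first_driving_sequence = stabilize(frames)
```

A few misclassified frames cannot flip the result, which is exactly the jitter suppression described in the next paragraph.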

Since the virtual anchor model is determined from each of a plurality of video images, its animation special effects may differ between images, in which case the animation special effect displayed in the live video interface may jitter. For this reason, the technical solution of the present invention proposes a time-series-based decision stabilization algorithm: it first acquires at least one video image preceding the first video image in the video stream, then determines the second driving information for the animation special effect from each video image, and finally determines the first driving information from the second driving information. This processing reduces signal jitter while keeping the decision response delay low, thereby improving the accuracy of the animation sequence triggered for the corresponding action.

In an optional implementation, the above step S22 of determining, according to the first driving information, the animation sequence matching the first driving information among the plurality of animation sequences corresponding to the posture detection result may include the following process:

(1) acquiring an animation state machine for the plurality of animation sequences, where the animation state machine represents jump relationships between a plurality of animation states, and each animation state corresponds to one or more animation sequences;

(2) determining, according to the first driving information, the next animation state to which the animation state machine is to jump;

(3) determining, according to the animation sequence corresponding to the next animation state to be jumped to, the animation sequence matching the first driving information.

In an embodiment of the present invention, the live broadcast device may pre-configure a corresponding animation state machine for the plurality of animation sequences corresponding to a preset posture. In this way, after obtaining the first driving information, the next animation state to be jumped to can be determined from the content of the first driving information, for example, jumping from the current animation state A to animation state B. After the next animation state is determined, the animation sequence corresponding to it can be determined, and this animation sequence is determined as the animation sequence matching the first video image.
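The animation state machine described above can be sketched as a transition table; the state names, drive signals, and sequence names below are illustrative assumptions modeled on the entry/hold/exit phases of the "OK" posture:

```python
class AnimationStateMachine:
    """Minimal animation state machine: states are linked by jump relations,
    each state corresponds to an animation sequence, and the driving
    information selects the next state to jump to."""

    def __init__(self):
        # jump relations: (current state, drive signal) -> next state
        self.transitions = {
            ("idle", "enter_ok"): "ok_enter",
            ("ok_enter", "hold_ok"): "ok_hold",
            ("ok_hold", "exit_ok"): "ok_exit",
            ("ok_exit", "done"): "idle",
        }
        # each animation state corresponds to an animation sequence
        self.sequences = {
            "idle": "seq_idle",
            "ok_enter": "seq_ok_enter",
            "ok_hold": "seq_ok_hold",
            "ok_exit": "seq_ok_exit",
        }
        self.state = "idle"

    def jump(self, drive_signal):
        """Jump to the next animation state (staying put if no transition
        is defined) and return that state's animation sequence."""
        self.state = self.transitions.get((self.state, drive_signal), self.state)
        return self.sequences[self.state]

sm = AnimationStateMachine()
seq = sm.jump("enter_ok")   # idle -> ok_enter
```

Encoding the allowed jumps explicitly is what lets undefined drive signals be ignored safely, so a spurious detection cannot force an invalid transition mid-animation.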

In some embodiments, the jump from A to B may be smoothed through animation transition frames. Here, based on the blend parameters of consecutive animation clips, the live broadcast device may play the skeletal animation or special-effect animation of the avatar through an animation blending algorithm.

In some embodiments, the animation blending algorithm may include at least one of a skeletal skinning animation algorithm and Mecanim animation algorithms such as 2D freeform cartesian and 2D simple directional, which is not limited in the embodiments of the present application.

In some embodiments, the blend parameters may include parameters such as speed and direction.

It can be understood that the animation state machine contains the transition conditions between the animation states. Therefore, controlling the animation state machine through the driving information realizes the jumps between animation states in the live video interface, so that a live streamer can use more complex body movements during the live broadcast and achieve smooth transitions between action states, improving the user's live broadcast experience.

In an optional implementation, when the target animation special effect includes at least one of a body-movement special effect representing the body movements of the virtual anchor model and a rendering-material special effect, the above step S105 of displaying the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor may include at least one of the following: displaying, in the live video interface, the body-movement special effect of the body movements of the virtual anchor model; and displaying the rendering-material special effect at a target position in the live video interface associated with the body movements of the virtual anchor model.

In an embodiment of the present invention, the target animation special effect includes the body-movement special effect of the virtual anchor, for example, the special effect in which the virtual anchor's body performs the "OK" action shown in FIG. 2. In addition, the target animation special effect also includes a rendering-material special effect; for example, a "rabbit ears" special effect may be added to the virtual anchor.

Here, the target position may be associated with the body-movement special effect; for example, for the same rendering material, the display position in the live video interface (that is, the target position) differs under different body-movement special effects.

In the above implementation, displaying body-movement special effects and rendering-material special effects in the live video interface enriches the content displayed in the interface, thereby making the live broadcast more interesting and improving the user's live broadcast experience.

In an optional implementation, the method further includes: acquiring the virtual live broadcast scene corresponding to the real anchor.

Here, the virtual live-broadcast scene may be a game-commentary scene, a talent-show scene, an emotional-expression scene, or the like, which is not limited in the embodiments of the present invention.

The implementation of the above step S105 of determining, according to the posture detection result, the target animation effect of the virtual anchor model corresponding to the real anchor may further include the following steps: acquiring initial animation effects matching the posture detection result; and determining, among the initial animation effects, the animation effect matching the virtual live-broadcast scene as the target animation effect.

In this embodiment of the present invention, the live-broadcast device may set different types of animation effects for different virtual live-broadcast scenes. Exemplarily, for the "OK" gesture shown in FIG. 2, the displayed animation effect may differ across virtual live-broadcast scenes; for example, in a game-commentary scene, the animation effect corresponding to the "OK" gesture better matches the movement habits of a game character.

In this embodiment of the present invention, after acquiring the initial animation effects matching the posture detection result, the live-broadcast device may determine, among the initial animation effects, the animation effect matching the virtual live-broadcast scene as the target animation effect.

For example, if the preset posture is the "OK" posture shown in FIG. 2, multiple initial animation effects may be set for the "OK" posture in advance, for example, initial animation effects M1, M2, and M3. A scene tag is set for each initial animation effect, indicating the virtual live-broadcast scene to which that initial animation effect applies.

After the virtual live-broadcast scene corresponding to the real anchor is acquired, the virtual live-broadcast scene can be matched against the scene tags, so as to determine the animation effect matching the virtual live-broadcast scene as the target animation effect.

In the above implementation, by using the virtual live-broadcast scene to filter the initial animation effects and obtain the target animation effect, personalized animation effects can be customized for live-streaming users, so that the determined target animation effect better meets the user's live-streaming needs, thereby improving the user's live-streaming experience.

The special-effect display method described above is introduced below with reference to a specific implementation.

Exemplarily, the real anchor is denoted as anchor A, and the virtual anchor model driven by the real anchor is "Princess Rabbit". The live-broadcast device selected by anchor A is a smartphone equipped with a camera. The preset posture is the posture shown in FIG. 2, and the preset gesture is the "OK" gesture shown in FIG. 2.

In this embodiment of the present invention, the live-broadcast device first captures, through the camera, a first video image of the real anchor during the live broadcast; it then performs limb detection on the designated body parts of anchor A in the first video image to obtain a limb detection result. The limb detection result includes at least one of the following: limb key points, the size of the face frame, position information of the face frame, the size of the hand detection frame, and position information of the hand detection frame.

After the limb detection result is obtained, if the relative orientation information between anchor A's hand detection frame and face frame, determined from the limb detection result, indicates that anchor A satisfies the gesture recognition condition, gesture detection is performed. For example, as shown in FIG. 2, the limb detection result may show that anchor A's hand is to one side of the face and adjacent to it. In that case, gesture detection may be performed on the portion of the first video image within the hand detection frame to obtain a gesture classification result. If the gesture made by anchor A is recognized as the "OK" gesture, it is determined that anchor A is in the preset posture shown in FIG. 2.
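A minimal sketch of the gesture recognition condition follows: the hand box must lie beside the face box (adjacent horizontally, overlapping vertically). The adjacency threshold and box format are assumptions for illustration, not values from the disclosure.

```python
def satisfies_gesture_condition(face_box, hand_box, max_gap: int = 30) -> bool:
    """Boxes are (x, y, w, h); True when the hand is beside and near the face."""
    fx, fy, fw, fh = face_box
    hx, hy, hw, hh = hand_box
    # Horizontal gap between the two boxes (0 if they overlap horizontally).
    gap = max(fx - (hx + hw), hx - (fx + fw), 0)
    # Vertical overlap: the hand must be roughly at face height.
    overlap = min(fy + fh, hy + hh) - max(fy, hy)
    return gap <= max_gap and overlap > 0
```

Running the (comparatively cheap) box-orientation test first means the heavier hand-crop gesture classifier only executes when the pose plausibly matches the preset posture.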

Afterwards, the live-broadcast device can determine, according to the posture detection result, the target animation effect of the virtual anchor model "Princess Rabbit" corresponding to anchor A, and display the target animation effect of "Princess Rabbit" in the live video interface corresponding to anchor A.

Exemplarily, the "OK" gesture may involve three animation phases: an action-entry phase in which anchor A's arm is raised to the head position and the "OK" action is shown; an action-hold phase in which anchor A maintains the "OK" action; and an action-exit phase in which anchor A's arm is lowered from the head position. A corresponding animation sequence can be determined for each phase.
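The three phases can be modeled as a tiny state machine cycling enter → hold → exit, with each phase mapped to its animation sequence. The event and sequence names below are invented for the sketch.

```python
# (current state, event) -> next state; hypothetical events.
TRANSITIONS = {
    ("idle", "ok_detected"): "enter",
    ("enter", "enter_done"): "hold",
    ("hold", "ok_lost"): "exit",
    ("exit", "exit_done"): "idle",
}

# Each phase maps to its animation sequence (illustrative names).
PHASE_SEQUENCES = {
    "enter": "ok_arm_raise_seq",
    "hold": "ok_hold_seq",
    "exit": "ok_arm_lower_seq",
}

def step(state: str, event: str) -> str:
    """Advance the phase state machine; unknown events keep the current state."""
    return TRANSITIONS.get((state, event), state)
```

Keeping unknown events in place matches the later smoothing idea: a single noisy frame should not jump the animation out of its current phase.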

After obtaining the posture detection result, the live-broadcast device may further determine at least one video image preceding the first video image in the video stream, and acquire second driving information for the animation effect determined from each of those video images; the second driving information indicates animation jump information for the animation effect displayed in the live video interface. At the same time, the device may determine, according to the posture detection result, estimated driving information for the animation effect. Then, the animation sequence driven by each piece of driving information can be determined from the second driving information and the estimated driving information, and the animation sequence that occurs most frequently among the determined animation sequences is selected. For example, if the animation sequence corresponding to the action-hold phase occurs most frequently, the animation sequence corresponding to the action-hold phase can then be played in the live video picture.
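The majority-vote step above can be sketched as follows: collect the sequence driven by each piece of driving information (from previous frames plus the current estimate) and play the one most of them agree on. The mapping from driving information to sequence is stubbed out for the example.

```python
from collections import Counter

def pick_sequence(driving_infos: list[str], drive_to_seq: dict[str, str]) -> str:
    """Return the animation sequence that the most driving infos agree on."""
    sequences = [drive_to_seq[d] for d in driving_infos]
    # most_common(1) yields [(sequence, count)] for the top sequence.
    return Counter(sequences).most_common(1)[0][0]
```

Voting over a short window of frames in this way damps single-frame detection glitches, so the displayed animation does not flicker between phases.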

Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.

Based on the same inventive concept, an embodiment of the present invention further provides a special-effect display apparatus corresponding to the special-effect display method. Since the problem-solving principle of the apparatus in the embodiments of the present invention is similar to that of the above special-effect display method, the implementation of the apparatus may refer to the implementation of the method, and repeated descriptions are omitted.

Referring to FIG. 3, which is a schematic diagram of a special-effect display apparatus provided by an embodiment of the present invention, the apparatus includes: an acquisition part 41, a posture detection part 42, a determination part 43, and a display part 44; wherein:
the acquisition part 41 is configured to acquire a first video image of a real anchor during a live broadcast;
the posture detection part 42 is configured to perform posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result;
the determination part 43 is configured to, when it is detected according to the posture detection result that the real anchor is in a preset posture, determine, according to the posture detection result, a target animation effect of a virtual anchor model corresponding to the real anchor; and
the display part 44 is configured to display the target animation effect of the virtual anchor model in a live video interface corresponding to the real anchor.

In a possible implementation, the posture detection part 42 is further configured to: in the case where the posture detection result includes at least one of a limb detection result and a gesture classification result, perform limb detection on the designated body parts of the real anchor in the first video image to obtain the limb detection result; in the case where the limb detection result includes a hand detection frame, perform gesture detection on the image within the hand detection frame in the first video image to obtain the gesture classification result; and determine the posture detection result according to the limb detection result and the gesture classification result.

In a possible implementation, the determination part 43 is further configured to: judge, according to the limb detection result in the posture detection result, whether the real anchor in the first video image satisfies a gesture recognition condition, to obtain a judgment result; in the case where the judgment result indicates that the real anchor satisfies the gesture recognition condition, detect whether the gesture indicated by the gesture classification result in the posture detection result is a preset gesture; and, in the case where it is detected that the gesture indicated by the gesture classification result is the preset gesture, determine that the real anchor is in the preset posture.

In a possible implementation, the determination part 43 is further configured to: determine relative orientation information between the designated body parts of the real anchor according to the limb detection result; and determine, according to the relative orientation information and preset orientation information, whether the real anchor in the first video image satisfies the gesture recognition condition; the preset orientation information is used to represent the relative orientation relationship between the designated body parts of the real anchor when the real anchor is in the preset posture.

In a possible implementation, the determination part 43 is further configured to: determine, based on the posture detection result, first driving information for the animation effect, where the first driving information is used to indicate animation jump information of the animation effect of the virtual live-broadcast model displayed in the live video interface; and, according to the first driving information, determine, among multiple animation sequences corresponding to the posture detection result, the animation sequence matching the first driving information, and determine the matching animation sequence as the target animation effect.

In a possible implementation, the determination part 43 is further configured to: determine at least one video image preceding the first video image in the video stream; acquire second driving information for the animation effect determined from each of the video images, and determine estimated driving information for the animation effect according to the posture detection result; determine the animation sequence driven by each piece of driving information in the second driving information and the estimated driving information, to obtain at least one animation sequence; and determine, as the first driving information, the driving information corresponding to an animation sequence whose number of occurrences in the at least one animation sequence satisfies a preset count requirement.

In a possible implementation, the determination part 43 is further configured to: acquire an animation state machine of the multiple animation sequences, where the animation state machine is used to represent jump relationships between multiple animation states, and each animation state corresponds to one or more animation sequences; determine, according to the first driving information, the next animation state to be jumped to by the animation state machine; and determine, according to the animation sequence corresponding to the next animation state to be jumped to, the animation sequence matching the first driving information.

In a possible implementation, the acquisition part 41 is further configured to acquire the virtual live-broadcast scene corresponding to the real anchor; and the determination part 43 is further configured to: acquire initial animation effects matching the posture detection result; and determine, among the initial animation effects, the animation effect matching the virtual live-broadcast scene as the target animation effect.

In a possible implementation, in the case where the target animation effect includes at least one of a body-movement effect used to represent body movements of the virtual anchor model and a rendering-material effect, the determination part 43 is further configured to perform at least one of the following: displaying, in the live video interface, the body-movement effect of the body movements of the virtual anchor model; and displaying the rendering-material effect at a target position in the live video interface associated with the body movements of the virtual anchor model.

For descriptions of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the above method embodiments, and details are not repeated here.

In the embodiments of the present invention and other embodiments, a "part" may be part of a circuit, part of a processor, part of a program or software, and the like; of course, it may also be a unit, and it may be a module or be non-modular.

Corresponding to the special-effect display method in FIG. 1, an embodiment of the present invention further provides a computer device 500. As shown in FIG. 4, a schematic structural diagram of the computer device 500 provided by an embodiment of the present invention, the device includes:
a processor 51, a memory 52, and a bus 53. The memory 52 is used to store execution instructions and includes an internal memory 521 and an external memory 522; the internal memory 521 is used to temporarily store operation data in the processor 51 and data exchanged with the external memory 522 such as a hard disk, and the processor 51 exchanges data with the external memory 522 through the internal memory 521. When the computer device 500 runs, the processor 51 communicates with the memory 52 through the bus 53, so that the processor 51 executes the following instructions:
acquiring a first video image of a real anchor during a live broadcast;
performing posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result;
when it is detected according to the posture detection result that the real anchor is in a preset posture, determining, according to the posture detection result, a target animation effect of a virtual anchor model corresponding to the real anchor; and
displaying the target animation effect of the virtual anchor model in a live video interface corresponding to the real anchor.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the special-effect display method described in the above method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.

An embodiment of the present invention further provides a computer program product carrying program code; the instructions included in the program code can be used to execute the steps of the special-effect display method described in the above method embodiments. For details, reference may be made to the above method embodiments, which are not repeated here.

The above computer program product may be specifically implemented by hardware, software, or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).

Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems and apparatuses described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here. In the several embodiments provided by the present invention, it should be understood that the disclosed apparatuses and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the related art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage media include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Finally, it should be noted that the above embodiments are merely specific implementations of the present invention, used to illustrate the technical solutions of the present invention rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, easily conceive of changes, or make equivalent replacements of some of the technical features; and these modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Industrial Applicability
In the embodiments of the present invention, by recognizing the posture of a real anchor, the target animation effect of the virtual anchor model driven by the real anchor is determined based on the posture detection result, and the target animation effect is displayed in the live video interface. In this way, the posture detection result of the real anchor can trigger the display of the target animation effect corresponding to the virtual anchor model on the live video interface, which increases the efficiency of interaction between the anchor user and viewers through body movements, and also improves the live-streaming experience of the live-streaming user.

41: acquisition part
42: posture detection part
43: determination part
44: display part
500: computer device
51: processor
52: memory
521: internal memory
522: external memory
53: bus
S101~S107: steps

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below.
FIG. 1 shows a flowchart of a special-effect display method provided by an embodiment of the present invention;
FIG. 2 shows a schematic diagram of a posture detection result provided by an embodiment of the present invention;
FIG. 3 shows a schematic diagram of a special-effect display apparatus provided by an embodiment of the present invention;
FIG. 4 shows a schematic diagram of a computer device provided by an embodiment of the present invention.


Claims (11)

A special-effect display method, comprising: acquiring a first video image of a real anchor during a live broadcast; performing posture detection on designated body parts of the real anchor in the first video image to obtain a posture detection result; when it is detected according to the posture detection result that the real anchor is in a preset posture, determining, according to the posture detection result, a target animation effect of a virtual anchor model corresponding to the real anchor; and displaying the target animation effect of the virtual anchor model in a live video interface corresponding to the real anchor.

The method according to claim 1, wherein the posture detection result includes at least one of a limb detection result and a gesture classification result; and the performing posture detection on the designated body parts of the real anchor in the first video image to obtain the posture detection result includes: performing limb detection on the designated body parts of the real anchor in the first video image to obtain the limb detection result; and, in the case where the limb detection result includes a hand detection frame, performing gesture detection on the image within the hand detection frame in the first video image to obtain the gesture classification result.
The method according to claim 1 or 2, wherein the detecting, according to the posture detection result, that the real anchor is in a preset posture includes: judging, according to the limb detection result in the posture detection result, whether the real anchor in the first video image satisfies a gesture recognition condition, to obtain a judgment result; in the case where the judgment result indicates that the real anchor satisfies the gesture recognition condition, detecting whether the gesture indicated by the gesture classification result in the posture detection result is a preset gesture; and, in the case where it is detected that the gesture indicated by the gesture classification result is the preset gesture, determining that the real anchor is in the preset posture.

The method according to claim 3, wherein the judging, according to the limb detection result in the posture detection result, whether the real anchor in the first video image satisfies the gesture recognition condition, to obtain the judgment result, includes: determining relative orientation information between the designated body parts of the real anchor according to the limb detection result; and judging, according to the relative orientation information and preset orientation information, whether the real anchor in the first video image satisfies the gesture recognition condition, to obtain the judgment result; the preset orientation information is used to represent the relative orientation relationship between the designated body parts of the real anchor when the real anchor is in the preset posture.

The method according to claim 1 or 2, wherein the determining, according to the posture detection result, the target animation effect of the virtual anchor model corresponding to the real anchor includes: determining, based on the posture detection result, first driving information for the animation effect, wherein the first driving information is used to indicate animation jump information of the animation effect of the virtual live-broadcast model displayed in the live video interface; and, according to the first driving information, determining, among multiple animation sequences corresponding to the posture detection result, an animation sequence matching the first driving information, and determining the matching animation sequence as the target animation effect.
The method according to claim 5, wherein the determining, based on the posture detection result, first driving information for an animation special effect comprises: determining at least one video image preceding the first video image in the video stream; acquiring second driving information for the animation special effect determined from each of the at least one video image, and determining estimated driving information for the animation special effect according to the posture detection result; determining an animation sequence whose display is driven by each piece of driving information among the second driving information and the estimated driving information, to obtain at least one animation sequence; and determining, as the first driving information, the driving information corresponding to an animation sequence whose number of occurrences in the at least one animation sequence meets a preset count requirement.
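Claim 6 can be read as a temporal stability filter: driving information from preceding frames plus the current frame's estimate each map to an animation sequence, and only a sequence that recurs often enough determines the first driving information. The sketch below assumes that reading; the names, the mapping table, and the threshold of 2 are invented for illustration.

```python
# Hedged sketch of selecting the first driving information by occurrence
# count across recent frames (claim 6 reading; names are assumptions).
from collections import Counter

def select_first_driving_info(second_driving_infos, estimated_driving_info,
                              sequence_of, min_count=2):
    """second_driving_infos: driving info determined from preceding frames;
    estimated_driving_info: driving info estimated from the current frame;
    sequence_of: maps a piece of driving info to the animation sequence it drives."""
    candidates = list(second_driving_infos) + [estimated_driving_info]
    sequences = [sequence_of(d) for d in candidates]
    counts = Counter(sequences)
    for info, seq in zip(candidates, sequences):
        if counts[seq] >= min_count:  # occurrence count meets the preset requirement
            return info               # becomes the "first driving information"
    return None                       # no sequence is stable enough yet

seq_table = {"raise_hand": "wave_anim", "heart": "heart_anim"}
picked = select_first_driving_info(
    ["heart", "raise_hand", "heart"], "heart", seq_table.get)
print(picked)  # "heart": its animation sequence occurs 3 times
```

A vote like this suppresses one-frame classification flickers, so the displayed special effect does not jump back and forth when the detector briefly misreads a pose.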
The method according to claim 5, wherein the determining, according to the first driving information, an animation sequence matching the first driving information among the plurality of animation sequences corresponding to the posture detection result comprises: acquiring an animation state machine of the plurality of animation sequences, wherein the animation state machine represents jump relationships among a plurality of animation states, and each animation state corresponds to one or more animation sequences; determining, according to the first driving information, a next animation state to which the animation state machine is to jump; and determining the animation sequence matching the first driving information according to the animation sequence corresponding to the next animation state.

The method according to claim 1 or 2, further comprising: acquiring a virtual live broadcast scene corresponding to the real anchor; wherein the determining, according to the posture detection result, the target animation special effect of the virtual anchor model corresponding to the real anchor comprises: acquiring initial animation special effects matching the posture detection result; and determining, among the initial animation special effects, an animation special effect matching the virtual live broadcast scene as the target animation special effect.
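The animation state machine of claim 7 can be sketched as a transition table keyed by (current state, driving information), where each state carries one or more animation sequences. The concrete states, transitions, and sequence names below are invented for illustration and are not taken from the patent.

```python
# Hypothetical animation state machine (claim 7 sketch; names are assumptions).
class AnimationStateMachine:
    def __init__(self, transitions, state_sequences, start="idle"):
        self.transitions = transitions          # (state, driving_info) -> next state
        self.state_sequences = state_sequences  # state -> list of animation sequences
        self.state = start

    def jump(self, driving_info):
        """Jump to the next animation state for this driving information
        and return that state's animation sequences; stay put if no edge exists."""
        self.state = self.transitions.get((self.state, driving_info), self.state)
        return self.state_sequences[self.state]

asm = AnimationStateMachine(
    transitions={("idle", "heart"): "heart_pose",
                 ("heart_pose", "none"): "idle"},
    state_sequences={"idle": ["idle_loop"],
                     "heart_pose": ["heart_burst", "heart_loop"]},
)
print(asm.jump("heart"))  # ['heart_burst', 'heart_loop']
print(asm.jump("none"))   # back to idle -> ['idle_loop']
```

Encoding the jump relationships in a state machine keeps transitions legal by construction: driving information can only select among the next states reachable from the current one, rather than cutting to an arbitrary animation mid-playback.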
The method according to claim 1 or 2, wherein the target animation special effect comprises at least one of a limb action special effect representing a limb action of the virtual anchor model and a rendering material special effect; and the displaying the target animation special effect of the virtual anchor model in the live video interface corresponding to the real anchor comprises at least one of: displaying, in the live video interface, the limb action special effect of the limb action of the virtual anchor model; and displaying the rendering material special effect at a target position, in the live video interface, associated with the limb action of the virtual anchor model.

A computer device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the special effect display method according to any one of claims 1 to 9.

A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when run by a processor, performs the steps of the special effect display method according to any one of claims 1 to 9.
TW111117706A 2021-07-07 2022-05-11 Special effect display method, computer equipment and computer-readable storage medium TW202303526A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110768288.7 2021-07-07
CN202110768288.7A CN113487709A (en) 2021-07-07 2021-07-07 Special effect display method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
TW202303526A true TW202303526A (en) 2023-01-16

Family

ID=77941870

Family Applications (1)

Application Number Title Priority Date Filing Date
TW111117706A TW202303526A (en) 2021-07-07 2022-05-11 Special effect display method, computer equipment and computer-readable storage medium

Country Status (3)

Country Link
CN (1) CN113487709A (en)
TW (1) TW202303526A (en)
WO (1) WO2023279713A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487709A (en) * 2021-07-07 2021-10-08 上海商汤智能科技有限公司 Special effect display method and device, computer equipment and storage medium
CN114302153B (en) * 2021-11-25 2023-12-08 阿里巴巴达摩院(杭州)科技有限公司 Video playing method and device
CN114092678A (en) * 2021-11-29 2022-02-25 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN116630488A (en) * 2022-02-10 2023-08-22 北京字跳网络技术有限公司 Video image processing method, device, electronic equipment and storage medium

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106470343B (en) * 2016-09-29 2019-09-17 广州华多网络科技有限公司 Live video stream long-range control method and device
CN106804007A (en) * 2017-03-20 2017-06-06 合网络技术(北京)有限公司 The method of Auto-matching special efficacy, system and equipment in a kind of network direct broadcasting
CN107277599A (en) * 2017-05-31 2017-10-20 珠海金山网络游戏科技有限公司 A kind of live broadcasting method of virtual reality, device and system
US20200125031A1 (en) * 2018-10-19 2020-04-23 Infinite Kingdoms Llc System for providing an immersive experience using multi-platform smart technology, content streaming, and special effects systems
CN109803165A (en) * 2019-02-01 2019-05-24 北京达佳互联信息技术有限公司 Method, apparatus, terminal and the storage medium of video processing
CN111641844B (en) * 2019-03-29 2022-08-19 广州虎牙信息科技有限公司 Live broadcast interaction method and device, live broadcast system and electronic equipment
CN109936774A (en) * 2019-03-29 2019-06-25 广州虎牙信息科技有限公司 Virtual image control method, device and electronic equipment
CN110139115B (en) * 2019-04-30 2020-06-09 广州虎牙信息科技有限公司 Method and device for controlling virtual image posture based on key points and electronic equipment
CN110475150B (en) * 2019-09-11 2021-10-08 广州方硅信息技术有限公司 Rendering method and device for special effect of virtual gift and live broadcast system
CN111667589A (en) * 2020-06-12 2020-09-15 上海商汤智能科技有限公司 Animation effect triggering display method and device, electronic equipment and storage medium
CN112135160A (en) * 2020-09-24 2020-12-25 广州博冠信息科技有限公司 Virtual object control method and device in live broadcast, storage medium and electronic equipment
CN112752149B (en) * 2020-12-29 2023-06-06 广州繁星互娱信息科技有限公司 Live broadcast method, live broadcast device, terminal and storage medium
CN113038229A (en) * 2021-02-26 2021-06-25 广州方硅信息技术有限公司 Virtual gift broadcasting control method, virtual gift broadcasting control device, virtual gift broadcasting control equipment and virtual gift broadcasting control medium
CN113487709A (en) * 2021-07-07 2021-10-08 上海商汤智能科技有限公司 Special effect display method and device, computer equipment and storage medium
CN114155605B (en) * 2021-12-03 2023-09-15 北京字跳网络技术有限公司 Control method, device and computer storage medium

Also Published As

Publication number Publication date
WO2023279713A1 (en) 2023-01-12
CN113487709A (en) 2021-10-08

Similar Documents

Publication Publication Date Title
WO2023279713A1 (en) Special effect display method and apparatus, computer device, storage medium, computer program, and computer program product
KR102292537B1 (en) Image processing method and apparatus, and storage medium
CN111726536B (en) Video generation method, device, storage medium and computer equipment
US10198845B1 (en) Methods and systems for animating facial expressions
CN109462776B (en) Video special effect adding method and device, terminal equipment and storage medium
WO2023273500A1 (en) Data display method, apparatus, electronic device, computer program, and computer-readable storage medium
CN108525305B (en) Image processing method, image processing device, storage medium and electronic equipment
JP6750046B2 (en) Information processing apparatus and information processing method
CN110119700B (en) Avatar control method, avatar control device and electronic equipment
US20180121733A1 (en) Reducing computational overhead via predictions of subjective quality of automated image sequence processing
US20130235045A1 (en) Systems and methods for creating and distributing modifiable animated video messages
US11778263B2 (en) Live streaming video interaction method and apparatus, and computer device
TW202304212A (en) Live broadcast method, system, computer equipment and computer readable storage medium
CN113422977B (en) Live broadcast method and device, computer equipment and storage medium
WO2023024442A1 (en) Detection method and apparatus, training method and apparatus, device, storage medium and program product
CN111638784A (en) Facial expression interaction method, interaction device and computer storage medium
CN111797850A (en) Video classification method and device, storage medium and electronic equipment
JP6730461B2 (en) Information processing system and information processing apparatus
CN114513694A (en) Scoring determination method and device, electronic equipment and storage medium
JP5776471B2 (en) Image display system
CN111915744A (en) Interaction method, terminal and storage medium for augmented reality image
JP5850188B2 (en) Image display system
CN115061577A (en) Hand projection interaction method, system and storage medium
CN111429209A (en) Method, device and server for multi-user virtual shopping
CN115426505B (en) Preset expression special effect triggering method based on face capture and related equipment