TW202304212A - Live broadcast method, system, computer equipment and computer readable storage medium

Info

Publication number: TW202304212A
Application number: TW111117705A
Authority: TW
Inventors: 王佳梨; 王權; 蘇麗偉
Original assignee: 大陸商上海商湯智能科技有限公司
Priority date: 2021-07-07
Filing date: 2022-05-11
Publication date: 2023-01-16
Also published as: CN113507621A; WO2023279705A1

Abstract

The present disclosure provides a live broadcast method, system, computer device, and computer-readable storage medium. The method includes: determining a virtual live broadcast mode, where the virtual live broadcast mode indicates at least one target capture part for body capture of a real anchor; collecting video images of the real anchor during the live broadcast; performing, based on the virtual live broadcast mode, action recognition on the target capture part in the video images to obtain a target recognition result; and determining, based on the target recognition result, video stream data corresponding to a virtual anchor model driven by the real anchor, where the video stream data presents the virtual anchor model performing the action indicated by the target recognition result.

Description

Live broadcast method, system, computer equipment and computer-readable storage medium

The present invention relates to the field of computer technology, and in particular to a live broadcast method, system, computer device, and computer-readable storage medium.

In virtual live broadcasting in the related art, the anchor must wear a motion capture device, which captures the anchor's motion data. Because of the limitations of such devices, a motion capture device in the related art can capture only part of the anchor's body movements, so virtual live broadcast software in the related art cannot meet the anchor's motion capture needs. Motion capture devices also restrict the scenes in which the anchor can broadcast, restrict the actions the anchor can perform during a broadcast, and increase the complexity of virtual live broadcasting.

Embodiments of the present invention provide at least a live broadcast method, system, computer device, and computer-readable storage medium.

In a first aspect, an embodiment of the present invention provides a live broadcast method applied to an anchor-end device, including: determining a virtual live broadcast mode, where the virtual live broadcast mode indicates at least one target capture part for body capture of a real anchor; collecting video images of the real anchor during the live broadcast; performing, based on the virtual live broadcast mode, action recognition on the target capture part in the video images to obtain a target recognition result; and determining, based on the target recognition result, video stream data corresponding to a virtual anchor model driven by the real anchor, where the video stream data presents the virtual anchor model performing the action indicated by the target recognition result.

In the embodiments of the present invention, once the virtual live broadcast mode is determined, the target capture part is determined from that mode. The target capture part does not have to be set manually, and body parts that do not need motion capture are filtered out automatically, so the body parts of the real anchor that are to be captured are captured automatically. This shortens motion capture time and reduces the complexity of virtual live broadcasting, and it also makes the live broadcast method suitable for live broadcast scenarios with strict real-time requirements.

In an optional implementation, determining the virtual live broadcast mode includes: determining a running scenario of the anchor-end device; determining, from multiple running versions preset for the anchor-end device, a target running version that matches the running scenario; and determining the virtual live broadcast mode based on the determined target running version.

In the embodiments of the present invention, determining the running scenario makes it possible to determine the computing capability of the installation device corresponding to the anchor-end device, and then to run on that installation device a running version that fits its computing capability. This ensures normal operation of the anchor-end device and reduces abnormal operation caused by a heavy computing load combined with insufficient computing capability of the installation device.

In an optional implementation, determining the running scenario of the anchor-end device includes: when a first installation operation for a live broadcast application is detected, obtaining device identification information of the anchor-end device and determining the running scenario according to the device identification information; and/or, when an opening operation for the live broadcast application is detected, obtaining the remaining device computing resources of the anchor-end device at the current moment and determining the running scenario according to the remaining device computing resources.

In the embodiments of the present invention, detecting the first installation operation or the opening operation makes it possible to determine the current computing capability of the anchor-end device before the anchor-end device is started. The running scenario of the anchor-end device can therefore be determined accurately from that computing capability before startup, which ensures normal operation of the anchor-end device.
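
The two triggers described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the event names, the `DEVICE_SCENARIOS` table, the scenario labels, and the resource thresholds are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical lookup table: device identification info -> running scenario.
DEVICE_SCENARIOS = {"phone-basic": "low", "desktop-gpu": "high"}


def scenario_from_resources(free_fraction: float) -> str:
    """Map the remaining compute resources (0.0 to 1.0) to a scenario label.

    The thresholds are illustrative assumptions.
    """
    if free_fraction >= 0.6:
        return "high"
    if free_fraction >= 0.3:
        return "medium"
    return "low"


def determine_running_scenario(
    event: str,
    device_id: Optional[str] = None,
    free_fraction: Optional[float] = None,
) -> str:
    """Pick the running scenario for a first install or an app-open event."""
    if event == "first_install":
        # First installation: decide from the device identification info.
        # Unknown devices fall back to the most conservative scenario.
        return DEVICE_SCENARIOS.get(device_id, "low")
    if event == "open":
        # App opened: decide from the resources that are free right now.
        if free_fraction is None:
            raise ValueError("free_fraction is required for an 'open' event")
        return scenario_from_resources(free_fraction)
    raise ValueError(f"unsupported event: {event!r}")
```

For example, `determine_running_scenario("open", free_fraction=0.45)` yields the assumed `"medium"` scenario, while an unrecognized device at first installation falls back to `"low"`.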

In an optional implementation, determining the virtual live broadcast mode includes: in response to a live broadcast mode selection instruction from the real anchor, determining the virtual live broadcast mode from among multiple preset live broadcast modes.

In the embodiments of the present invention, letting the user select the virtual live broadcast mode for the real anchor from among the preset live broadcast modes gives the user a richer set of virtual live broadcast scenarios, which meets the user's needs and improves the user experience.

In an optional implementation, the method further includes: before collecting the video images of the real anchor during the live broadcast, collecting a preview image containing the real anchor; determining whether the target capture part contained in the preview image satisfies an action recognition condition; and, when the target capture part contained in the preview image does not satisfy the action recognition condition, generating target adjustment information until it is determined that the target capture part satisfies the action recognition condition, where the target adjustment information reminds the real anchor to adjust how the target capture part is displayed in the preview image.

In the embodiments of the present invention, generating target adjustment information when the preview image shows that the target capture part does not satisfy the action recognition condition allows the camera and/or the display state of the target capture part to be adjusted before the user starts broadcasting. This improves action recognition accuracy during the live broadcast and therefore improves the broadcast result.
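
The preview check can be sketched as a loop that keeps prompting the anchor until the condition holds. In this sketch the action recognition condition is assumed to be "every required part is visible in the preview", and each preview image is represented only by the set of part names detected in it; both are simplifications introduced for illustration.

```python
from typing import Callable, Iterable, Set


def missing_parts(visible: Set[str], required: Set[str]) -> Set[str]:
    """Return the required capture parts that are not visible in the preview."""
    return required - visible


def run_preview_check(
    previews: Iterable[Set[str]],
    required: Set[str],
    notify: Callable[[str], None],
) -> bool:
    """Inspect preview images in order until the target parts are all visible.

    Each preview is modelled as the set of parts detected in that image.
    Returns True as soon as one preview satisfies the assumed condition,
    or False if the previews run out first.
    """
    for visible in previews:
        missing = missing_parts(visible, required)
        if not missing:
            return True
        # Target adjustment information: tell the anchor what to fix.
        notify(
            "Please adjust so these parts are visible: "
            + ", ".join(sorted(missing))
        )
    return False
```

A caller would pass a function that shows the message in the anchor's interface as `notify`, and would start collecting live video images only after `run_preview_check` returns `True`.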

In an optional implementation, performing action recognition on the target capture part in the video images based on the virtual live broadcast mode to obtain the target recognition result includes: when it is detected that the target capture part includes a hand part, obtaining the mode tag corresponding to the virtual live broadcast mode, where the mode tag includes a target mode tag indicating whether hand recognition is to be performed on the hand part; and, when the mode tag is determined to be the target mode tag, performing action recognition on the target capture part in the video images and performing hand detection on the hand part in the target capture part, to obtain a target recognition result that includes a hand detection result.

In an optional implementation, after determining the video stream data corresponding to the virtual anchor model driven by the real anchor, the method further includes: when the hand detection result shows that the hand posture of the real anchor is a preset gesture, obtaining the rendering material special effect corresponding to the preset gesture; and rendering that special effect in specified video frames of the video stream data.

In the embodiments of the present invention, the above processing allows the target capture part to be handled at a finer granularity, which meets users' different live broadcast needs.
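
The two hand-related steps above, gating hand detection on the mode tag and mapping a preset gesture to a special effect, can be sketched as follows. The mode names, the tag table, and the gesture-to-effect table are hypothetical values chosen for the example.

```python
from typing import Dict, Optional, Set

# Hypothetical mode tags: True means the mode carries the "target mode tag",
# that is, hand detection should run for this mode.
HAND_DETECTION_TAG: Dict[str, bool] = {
    "head_shoulders": False,
    "waist": True,
    "full_body": True,
}

# Hypothetical mapping from preset gestures to rendering material effects.
GESTURE_EFFECTS: Dict[str, str] = {
    "heart": "heart_particles",
    "ok": "ok_sticker",
}


def should_detect_hands(mode: str, capture_parts: Set[str]) -> bool:
    """Run hand detection only if hands are captured and the mode is tagged."""
    return "hands" in capture_parts and HAND_DETECTION_TAG.get(mode, False)


def effect_for_gesture(gesture: Optional[str]) -> Optional[str]:
    """Return the effect to render for a preset gesture, or None otherwise."""
    if gesture is None:
        return None
    return GESTURE_EFFECTS.get(gesture)
```

When `effect_for_gesture` returns a value, the renderer would draw that effect into the specified frames of the video stream; a gesture that is not in the table produces no effect.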

In an optional implementation, determining the video stream data corresponding to the virtual anchor model driven by the real anchor based on the target recognition result includes: obtaining a live viewing request sent by at least one viewer-end device; determining, based on the live viewing request, the interface background image of the live viewing interface corresponding to each viewer-end device, where the interface background image is a static background image or a dynamic background image; determining, based on the target recognition result, multiple video images that represent the virtual anchor model performing the action indicated by the target recognition result; and replacing the background image of each of the multiple video images with the interface background image, and determining the video stream data based on the modified video images.

In an optional implementation, the method further includes: after determining the video stream data corresponding to the virtual anchor model driven by the real anchor, pushing the video stream data containing a given interface background image to the viewer-end devices that correspond to that same interface background image.

In the embodiments of the present invention, one set of video stream data is generated for all viewer-end devices that request the same interface background image, and that video stream data is pushed to each of those devices. This saves time in determining the video stream data and improves the quality of the virtual live broadcast.
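
The saving described above comes from composing each background once and sharing the result. A minimal sketch, assuming each viewing request is a `(viewer_id, background)` pair and representing background replacement by simply pairing each frame with the background name:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_viewers_by_background(
    requests: List[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Group viewer ids by the interface background they requested."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for viewer_id, background in requests:
        groups[background].append(viewer_id)
    return dict(groups)


def build_streams(
    frames: List[str],
    requests: List[Tuple[str, str]],
) -> Dict[str, Tuple[List[Tuple[str, str]], List[str]]]:
    """Compose one stream per distinct background and list its recipients.

    Background replacement is stood in for by pairing each frame with the
    background name; a real pipeline would composite images here.
    """
    streams = {}
    for background, viewers in group_viewers_by_background(requests).items():
        composed = [(frame, background) for frame in frames]
        streams[background] = (composed, viewers)
    return streams
```

With three viewers and two distinct backgrounds, the frames are composed twice rather than three times, and the two viewers sharing a background receive the same stream.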

In an optional implementation, the method further includes: after determining the virtual live broadcast mode, displaying first indication information and/or second indication information in the display interface of the anchor-end device, where the first indication information indicates the target capture parts that are in a valid capture state, and the second indication information indicates the target capture parts that are in an invalid capture state.

In the embodiments of the present invention, displaying the first indication information and/or the second indication information in the display interface of the anchor-end device guides the real anchor to make movements that can be captured effectively. This improves motion capture accuracy and helps ensure stable operation of the anchor-end device.

In a second aspect, an embodiment of the present invention provides a live broadcast system, including an anchor-end device and a viewer-end device. The anchor-end device is configured to determine, according to the live broadcast method of any implementation of the first aspect, the video stream data corresponding to the virtual anchor model driven by the real anchor, and to push the video stream data to the viewer-end device. The viewer-end device is configured to obtain the video stream data and play it in a live viewing interface.

In an optional implementation, the viewer-end device includes a mobile terminal device and a PC device. When the viewer-end device is the mobile terminal device, the anchor-end device transmits the video stream data to the viewer-end device through a content delivery network (CDN). When the viewer-end device is the PC device, the anchor-end device transmits the video stream data to the viewer-end device through the CDN and a relay streaming server.
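
The device-dependent delivery path can be written as a small routing function. The device type strings and hop names below are assumptions, and the order of hops in the returned tuple follows the order in which the text names them, not a confirmed network topology.

```python
from typing import Tuple


def delivery_path(device_type: str) -> Tuple[str, ...]:
    """Return the transport hops used to reach a viewer-end device."""
    if device_type == "mobile":
        # Mobile terminals are served through the CDN only.
        return ("cdn",)
    if device_type == "pc":
        # PC viewers are served through the CDN and a relay streaming server.
        return ("cdn", "relay_streaming_server")
    raise ValueError(f"unsupported viewer device type: {device_type!r}")
```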

In a third aspect, an embodiment of the present invention provides a live broadcast apparatus arranged in an anchor-end device, including: a first determining unit configured to determine a virtual live broadcast mode, where the virtual live broadcast mode indicates at least one target capture part for body capture of a real anchor; a collecting unit configured to collect video images of the real anchor during the live broadcast; an action recognition unit configured to perform, based on the virtual live broadcast mode, action recognition on the target capture part in the video images to obtain a target recognition result; and a second determining unit configured to determine, based on the target recognition result, video stream data corresponding to a virtual anchor model driven by the real anchor, where the video stream data presents the virtual anchor model performing the action indicated by the target recognition result.

In a fourth aspect, an embodiment of the present invention further provides an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device runs, the processor and the memory communicate over the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the first aspect or of any possible implementation of the first aspect.

In a fifth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the first aspect or of any possible implementation of the first aspect.

An embodiment of the present invention further provides a computer program including computer-readable code. When the computer-readable code runs in an electronic device, a processor of the electronic device executes the live broadcast method of any of the above embodiments.

To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. The elements of the embodiments generally described and illustrated in the drawings may be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments that a person skilled in the art obtains from these embodiments without creative effort fall within the scope of protection of the present invention.

Note that similar reference numerals and letters denote similar items in the following drawings. Once an item is defined in one drawing, it does not need to be defined or explained again in subsequent drawings.

The term "and/or" herein merely describes an association and indicates that three relationships may exist. For example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the term "at least one" herein means any one of multiple items, or any combination of at least two of them. For example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.

Research has found that, in virtual live broadcasting in the related art, the anchor must wear a motion capture device, which captures the anchor's motion data. Because of the limitations of such devices, a motion capture device in the related art can capture only part of the anchor's body movements, so virtual live broadcast software in the related art cannot meet the anchor's motion capture needs. Motion capture devices also restrict the scenes in which the anchor can broadcast, restrict the actions the anchor can perform during a broadcast, and increase the complexity of virtual live broadcasting.

Based on the above research, an embodiment of the present invention provides a live broadcast method that can be applied to virtual live broadcast scenarios. A virtual live broadcast scenario can be understood as broadcasting with a preset virtual anchor model, such as a red panda, a bunny, or a cartoon character, in place of the real anchor's actual image. In this case, the live video shows the virtual anchor model, and the real anchor can also interact with the audience through the virtual anchor model.

For example, the camera of the anchor-end device can collect video images containing the real anchor, and the body of the real anchor in the video images is captured to obtain the real anchor's posture information. Once the posture information is determined, a corresponding driving signal can be generated. The driving signal drives the display, in the live video, of the animation effects corresponding to the virtual anchor model.

In an optional implementation, the real anchor may preset a corresponding virtual anchor model. For example, the preset virtual anchor model may be "character YYY from game XXX". The real anchor may preset one or more virtual anchor models. When starting the current virtual live broadcast, the real anchor may select one of the preset virtual anchor models as the virtual anchor model for that broadcast. The virtual anchor model may be a two-dimensional (2D) model or a three-dimensional (3D) model.

In another optional implementation, instead of determining the virtual anchor model for the real anchor in the manner described above, a virtual anchor model may be rebuilt for the real anchor in the video image after the video image is obtained.

For example, the real anchor contained in the video image can be recognized, and a virtual anchor model can be rebuilt for the real anchor according to the recognition result. The recognition result may include at least one of the following: the real anchor's gender, the real anchor's appearance features, and the real anchor's clothing and accessory features.

In this case, the virtual anchor model library can be searched for a model that matches the recognition result, and that model is used as the real anchor's virtual anchor model. For example, suppose the recognition result shows that the real anchor is wearing a peaked cap and hip-hop style clothing during the broadcast. The virtual anchor model library can then be searched for a virtual anchor model that matches "peaked cap" or "hip-hop style", and that model is used as the real anchor's virtual anchor model.
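
One simple way to realize such a search is to tag every library model and pick the model whose tags overlap most with the recognized features. This is an illustrative sketch only: the source does not specify the matching rule, and the library contents and tag names here are made up.

```python
from typing import Dict, Optional, Set


def find_matching_model(
    library: Dict[str, Set[str]],
    recognized: Set[str],
) -> Optional[str]:
    """Return the model whose tags share the most features with the result.

    Returns None when no model shares any feature with the recognition result.
    Ties are resolved in favour of the model listed first in the library.
    """
    best_name: Optional[str] = None
    best_score = 0
    for name, tags in library.items():
        score = len(tags & recognized)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

For a library containing a model tagged `{"peaked cap", "hip-hop"}`, a recognition result that includes both features selects that model; a result with no overlapping features returns `None`, at which point a model could be constructed instead.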

Besides searching the virtual anchor model library for a model that matches the recognition result, a model construction module can also build a corresponding virtual anchor model for the real anchor in real time, based on the recognition result.

When the virtual anchor model is built in real time, the virtual anchor models used in virtual live broadcasts that the real anchor started in the past can also serve as a reference for building the virtual anchor model that the real anchor drives at the current moment.

Determining the virtual anchor model in the ways described above allows a virtual anchor model to be customized for each real anchor, which makes virtual anchor models richer and more varied. A customized virtual anchor model can also leave a deeper impression on the audience.

To aid understanding of this embodiment, the live broadcast method disclosed in the embodiments of the present invention is first described in detail. The method is generally executed by a computer device with a certain computing capability, which may be a device on which the anchor-end device described above can be installed. In some possible implementations, the live broadcast method may be implemented by a processor calling computer-readable instructions stored in a memory.

FIG. 1A is a flowchart of a live broadcast method provided by an embodiment of the present invention. The method includes the following steps S101 to S107.

S101: Determine a virtual live broadcast mode, where the virtual live broadcast mode indicates the target capture part for body capture of the real anchor.

In the embodiments of the present invention, the virtual live broadcast mode may include, but is not limited to, a head-and-shoulders live broadcast mode, a waist-up live broadcast mode, and a full-body live broadcast mode.

Here, the head-and-shoulders live broadcast mode can be understood as performing action recognition on the parts of the real anchor above the shoulders. The waist-up live broadcast mode can be understood as performing action recognition on the parts of the real anchor above the waist. The full-body live broadcast mode can be understood as performing action recognition on all body parts of the real anchor.

Other live broadcast modes may be included in addition to the virtual live broadcast modes described above; they are not enumerated here.
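
The relationship between a mode and its target capture parts can be sketched as a lookup followed by a filter. The three modes come from the text; the exact part names assigned to each mode are assumptions made for the example.

```python
from enum import Enum
from typing import Dict, FrozenSet, Tuple


class LiveMode(Enum):
    HEAD_SHOULDERS = "head_shoulders"
    WAIST_UP = "waist_up"
    FULL_BODY = "full_body"


# Assumed part lists for each mode; the source names only the modes.
_UPPER = frozenset({"head", "shoulders"})
_WAIST = _UPPER | {"arms", "hands", "torso"}
_FULL = _WAIST | {"legs", "feet"}

CAPTURE_PARTS: Dict[LiveMode, FrozenSet[str]] = {
    LiveMode.HEAD_SHOULDERS: _UPPER,
    LiveMode.WAIST_UP: _WAIST,
    LiveMode.FULL_BODY: _FULL,
}


def filter_keypoints(
    mode: LiveMode,
    keypoints: Dict[str, Tuple[float, float]],
) -> Dict[str, Tuple[float, float]]:
    """Keep only the keypoints that belong to the mode's capture parts."""
    parts = CAPTURE_PARTS[mode]
    return {part: pos for part, pos in keypoints.items() if part in parts}
```

Because the filter is driven entirely by the mode, parts that do not need motion capture are dropped without any manual configuration by the anchor.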

S103: Collect video images of the real anchor during the live broadcast.

Here, the video images may be collected by a camera installed on the device on which the anchor-end device runs.

S105: Based on the virtual live broadcast mode, perform action recognition on the target capture part in the video images to obtain a target recognition result.

Here, the target capture part is a target capture part of the real anchor. The target recognition result includes at least one of the following: position information of the body key points of each target capture part, size information of the face detection box, and position information of the face detection box.

If the target capture part includes a hand, the target recognition result further includes at least one of the following: size information of the hand detection box, position information of the hand detection box, gesture classification information for the hand enclosed by the hand detection box, and the validity of the gesture indicated by the gesture classification information.

Here, validity indicates whether the gesture indicated by the gesture classification information satisfies the special effect trigger condition. If it does, the gesture is valid; otherwise, it is invalid.
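
The fields listed above map naturally onto a small data structure. The sketch below is one possible layout, not the format used by the described system: the class and field names, the `(x, y, width, height)` box convention, and the set of trigger gestures are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: x, y, width, height

# Hypothetical gestures that satisfy the special effect trigger condition.
TRIGGER_GESTURES = frozenset({"heart", "ok"})


@dataclass
class HandResult:
    box: Box        # position and size of the hand detection box
    gesture: str    # gesture classification for the enclosed hand

    @property
    def is_valid(self) -> bool:
        """A gesture is valid only if it can trigger a special effect."""
        return self.gesture in TRIGGER_GESTURES


@dataclass
class TargetRecognitionResult:
    # Body keypoint positions, keyed by capture part name.
    keypoints: Dict[str, Tuple[float, float]] = field(default_factory=dict)
    # Face detection box, if a face was found.
    face_box: Optional[Box] = None
    # Hand result, present only when the capture parts include a hand.
    hand: Optional[HandResult] = None
```

Every field is optional because the text requires only "at least one" of the listed items to be present in a given result.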

S107: Based on the target recognition result, determine the video stream data corresponding to the virtual anchor model driven by the real anchor, where the video stream data presents the virtual anchor model performing the action indicated by the target recognition result.

FIG. 1B is a schematic diagram of a system architecture to which the live broadcast method of an embodiment of the present invention can be applied. As shown in FIG. 1B, the system architecture includes a video image acquisition terminal 201, a network 202, and a control terminal 203. To support an exemplary application, the video image acquisition terminal 201 and the control terminal 203 establish a communication connection through the network 202. The video image acquisition terminal 201 reports the collected video images of the real anchor during the live broadcast to the control terminal 203 through the network 202. The control terminal 203 performs action recognition on the target capture part in the video images based on the virtual live broadcast mode, and determines, based on the target recognition result, the video stream data corresponding to the virtual anchor model driven by the real anchor. Finally, the control terminal 203 displays the video stream data in the live video interface and sends it to the video image acquisition terminal 201 through the network 202.

As an example, the video image acquisition terminal 201 may include an image acquisition device, and the control terminal 203 may include a vision processing device with visual information processing capability or a remote server. The network 202 may use a wired or wireless connection. When the control terminal 203 is a vision processing device, the video image acquisition terminal 201 may communicate with it over a wired connection, for example by exchanging data over a bus. When the control terminal 203 is a remote server, the video image acquisition terminal 201 may exchange data with the remote server over a wireless network.

或者，在一些場景中，視頻圖像獲取終端201可以是帶有視頻採集模組的視覺處理設備，可以是帶有攝影頭的主機。這時，本發明實施例的直播方法可以由視頻圖像獲取終端201執行，上述系統架構可以不包含網路202和控制終端203。Or, in some scenarios, the video image acquisition terminal 201 may be a vision processing device with a video acquisition module, or a host computer with a camera. At this time, the live broadcasting method of the embodiment of the present invention may be executed by the video image acquisition terminal 201 , and the above-mentioned system architecture may not include the network 202 and the control terminal 203 .

In an optional implementation, as shown in FIG. 2, determining the virtual live broadcast mode in step S101 includes the following process: step S11, determining the running scenario of the anchor device, where the running scenario represents the hardware computing resources of the installation device corresponding to the anchor device; step S12, determining, from multiple running versions preset for the anchor device, a target running version that matches the running scenario; and step S13, determining the virtual live broadcast mode based on the determined target running version.

在本發明實施例中，首先，可以確定主播端設備的運行場景，該運行場景可以理解為該主播端設備的安裝場景。例如，該主播端設備所對應的安裝設備（即，上述電腦設備）的安裝場景。在確定出運行場景之後，就可以確定與該運行場景相匹配的虛擬直播模式。In the embodiment of the present invention, firstly, the running scenario of the host device can be determined, and the running scenario can be understood as the installation scenario of the host device. For example, the installation scene of the installation device corresponding to the host device (that is, the above-mentioned computer device). After the operating scenario is determined, a virtual live broadcast mode matching the operating scenario can be determined.

在本發明實施例中，預先為主播端設備設定了多個運行版本，其中，每個運行版本對應相應的運行標籤，該運行標籤用於指示該運行版本所對應的運行場景。針對不同的運行版本，預先設定了不同的虛擬直播模式。In the embodiment of the present invention, multiple running versions are pre-set for the host device, wherein each running version corresponds to a corresponding running tag, and the running tag is used to indicate the running scenario corresponding to the running version. For different operating versions, different virtual live broadcast modes are preset.

這裡，運行版本可以為直播軟體的運行版本，不同運行版本對應不同的虛擬直播模式。Here, the running version may be the running version of the live broadcast software, and different running versions correspond to different virtual live broadcast modes.

在一種可選的實施方式中，該運行版本可以理解為該直播軟體的軟體開發套件（Software Development Kit，簡稱SDK）的版本。In an optional implementation manner, the running version may be understood as a software development kit (Software Development Kit, SDK for short) version of the live broadcast software.

舉例來說，在主播端設備上首次安裝該直播軟體時，可以選擇安裝與當前主播端設備的運行場景相匹配的軟體開發套件SDK。For example, when installing the live broadcast software on the host device for the first time, you can choose to install the software development kit SDK that matches the running scene of the current host device.

在另一種可選的實施方式中，該運行版本還可以理解為該直播軟體的軟體開發套件SDK中的多個軟體運行模式。In another optional implementation manner, the running version can also be understood as multiple software running modes in the software development kit SDK of the live broadcast software.

舉例來說，在主播端設備上首次安裝該直播軟體時，可以安裝該直播軟體的軟體開發套件SDK。在主播端設備上運行該直播軟體時，可以根據該主播端設備的運行場景（例如，剩餘計算資源）自動運行該直播軟體中的對應運行模式。For example, when the live broadcast software is installed on the host device for the first time, the software development kit SDK of the live broadcast software can be installed. When running the live broadcast software on the anchor device, the corresponding operation mode in the live software can be automatically run according to the operation scenario (for example, remaining computing resources) of the anchor device.

在確定出運行場景之後，就可以基於預先為該主播端設備設定的多個運行版本所對應的版本標籤，確定與該運行場景相匹配的目標運行版本。進而，基於確定出的目標運行版本確定真實主播的虛擬直播模式。After the running scenario is determined, the target running version matching the running scenario can be determined based on the version tags corresponding to the multiple running versions preset for the anchor device. Furthermore, the virtual live broadcast mode of the real anchor is determined based on the determined target running version.

In the embodiment of the present invention, assume that the multiple running versions described above include running version 1 and running version 2. Running version 1 supports head expression driving, torso driving, arm driving, and finger driving; running version 2 supports expression driving and torso driving.
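
The matching of a running scenario to a running version can be sketched as a simple lookup. This is only an illustrative sketch: the names `RunningVersion`, `VERSIONS`, and `select_running_version`, and the scenario tags `"strong"` and `"weak"`, are hypothetical and do not come from the disclosure; only the lists of driven parts follow the two example versions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningVersion:
    name: str
    scenario_tag: str    # running tag: the running scenario this version is meant for
    driven_parts: tuple  # parts of the virtual anchor model this version can drive


# Hypothetical registry mirroring running version 1 and running version 2.
# Which scenario each version is tagged with is an assumption of this sketch.
VERSIONS = (
    RunningVersion("version_1", "strong", ("expression", "torso", "arm", "finger")),
    RunningVersion("version_2", "weak", ("expression", "torso")),
)


def select_running_version(scenario: str) -> RunningVersion:
    """Return the preset running version whose running tag matches the scenario."""
    for version in VERSIONS:
        if version.scenario_tag == scenario:
            return version
    raise ValueError(f"no running version is tagged for scenario {scenario!r}")


target = select_running_version("weak")
print(target.name, target.driven_parts)
```

The virtual live broadcast mode would then be derived from the selected version, for example from its `driven_parts`.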

In the embodiment of the present invention, determining the running scenario makes it possible to determine the computing capability of the installation device corresponding to the anchor device, and then to run on that installation device a running version that fits its computing capability. This ensures normal operation of the anchor device and reduces abnormal operation caused by heavy computing demand combined with insufficient computing capability of the installation device.

在一個可選的實施方式中，上述步驟S11，確定所述主播端設備的運行場景，包括如下幾種方式。In an optional implementation manner, the above-mentioned step S11, determining the running scenario of the host device, includes the following methods.

Method 1: when a first installation operation for the live broadcast application is detected, the device identification information of the anchor device is obtained, and the running scenario is determined according to the device identification information. In the embodiment of the present invention, when it is detected that the user is installing the live broadcast application for the first time, the device identification information of the anchor device can be obtained, and the computing capability of the anchor device can then be determined from that information. The running scenario can then be determined from the determined computing capability. For example, the running scenario is an anchor device with strong computing capability, or an anchor device with weak computing capability.

Method 2: when an opening operation for the live broadcast application is detected, the remaining device computing resources of the anchor device at the current moment are obtained, and the running scenario is determined according to the remaining device computing resources. In the embodiment of the present invention, when it is detected that the user opens the live broadcast application, the remaining device computing resources of the anchor device can be obtained, and the computing capability of the anchor device can then be determined from them. The running scenario can then be determined from the determined computing capability. For example, the running scenario is an anchor device with strong computing capability, or an anchor device with weak computing capability.

Method 3: in the embodiment of the present invention, when it is detected that the user installs the live broadcast application for the first time and opens it, both the device identification information and the remaining device computing resources of the anchor device can be obtained, and the computing capability of the anchor device can then be determined from the two together. The running scenario of the anchor device can then be determined from the determined computing capability. For example, the running scenario is an anchor device with strong computing capability, or an anchor device with weak computing capability.
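
The three methods can be sketched as one function that accepts whichever inputs are available. This is a minimal sketch under stated assumptions: the table `KNOWN_DEVICE_SCORES`, the score scale from 0 to 1, the threshold of 0.5, and the choice to take the more pessimistic of the two estimates in Method 3 are all hypothetical; the disclosure does not specify how computing capability is calculated.

```python
from typing import Optional

# Hypothetical lookup from device identification information to a capability score.
KNOWN_DEVICE_SCORES = {"workstation-x": 0.9, "laptop-y": 0.4}


def classify_scenario(device_id: Optional[str] = None,
                      free_compute_ratio: Optional[float] = None,
                      threshold: float = 0.5) -> str:
    """Classify the running scenario as "strong" or "weak".

    Method 1 passes only device_id, Method 2 only free_compute_ratio,
    and Method 3 passes both.
    """
    scores = []
    if device_id is not None:
        # Unknown devices are treated as weak (an assumption of this sketch).
        scores.append(KNOWN_DEVICE_SCORES.get(device_id, 0.0))
    if free_compute_ratio is not None:
        scores.append(free_compute_ratio)
    if not scores:
        raise ValueError("need device_id, free_compute_ratio, or both")
    capability = min(scores)  # assumed rule: the lower estimate decides
    return "strong" if capability >= threshold else "weak"


print(classify_scenario(device_id="workstation-x"))                          # Method 1
print(classify_scenario(free_compute_ratio=0.2))                             # Method 2
print(classify_scenario(device_id="workstation-x", free_compute_ratio=0.3))  # Method 3
```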

In the above implementation, detecting the first installation operation or the opening operation makes it possible to determine the computing capability of the anchor device at the current moment before the live broadcast application is started. The running scenario of the anchor device can therefore be determined accurately from that computing capability before the anchor device is started, which in turn ensures normal operation of the anchor device.

在一個可選的實施方式中，針對上述步驟S101，確定虛擬直播模式，包括如下步驟。In an optional implementation manner, for the above step S101, determining the virtual live broadcast mode includes the following steps.

步驟S21，回應於所述真實主播的直播模式的選擇指令，在多個預設直播模式中確定所述虛擬直播模式。Step S21, in response to the selection instruction of the live broadcast mode of the real anchor, determine the virtual live broadcast mode among a plurality of preset live broadcast modes.

In the embodiment of the present invention, multiple preset live broadcast modes may be set in advance, for example, the "head and shoulders scene" and the "waist scene" shown in FIG. 3.

在使用者打開該主播端設備之後，可以如圖3所示，在主播端設備的展示介面上展示該多個預設直播模式。使用者可以在如圖3所示的展示介面中選擇相對應的直播模式作為虛擬直播模式，例如，使用者可以選擇進入“頭肩場景”，或者，選擇進入“腰身場景”。After the user turns on the host device, as shown in FIG. 3 , the plurality of preset live broadcast modes can be displayed on the display interface of the host device. The user can select the corresponding live broadcast mode as the virtual live broadcast mode in the display interface shown in Figure 3, for example, the user can choose to enter the "head and shoulders scene", or choose to enter the "waist scene".

當使用者選擇進入“頭肩場景”之後，可以在展示介面上展示出如圖4所示的資訊；當使用者選擇進入“腰身場景”之後，可以在展示介面上展示出如圖5所示的資訊。When the user chooses to enter the "head and shoulders scene", the information shown in Figure 4 can be displayed on the display interface; when the user chooses to enter the "waist scene", the information shown in Figure 5 can be displayed on the display interface information.

上述實施方式中，通過設置使用者在多個預設直播模式中選擇虛擬直播模式的方式，可以為使用者提供更加豐富的虛擬直播場景，從而滿足使用者的需求，提高使用者的使用體驗。In the above embodiments, by setting the way for the user to select a virtual live broadcast mode from multiple preset live broadcast modes, more abundant virtual live broadcast scenes can be provided for the user, thereby meeting the needs of the user and improving the user experience.

In the embodiment of the present invention, the method further includes the following step: after the virtual live broadcast mode of the real anchor is determined, displaying first indication information and/or second indication information in the display interface of the anchor device, where the first indication information indicates a target capture part in a valid capture state, and the second indication information indicates a target capture part in an invalid capture state.

In the embodiment of the present invention, after the user chooses to enter the "head and shoulders scene", the information shown in FIG. 4 can be displayed on the display interface. After the user chooses to enter the "waist scene", the information shown in FIG. 5 can be displayed on the display interface. In either case, the first indication information and/or the second indication information can be displayed on the display interface.

這裡，第一指示資訊用於指示處於有效捕捉狀態的目標捕捉部位。例如，在該“頭肩場景”模式下，真實主播的頭部完整的出現在視頻圖像中，或者真實主播的肩部完整的出現在視頻圖像中。例如，在該“腰身場景”模式下，真實主播的完整上肢部位出現在視頻圖像中。Here, the first indication information is used to indicate the target capture part in an effective capture state. For example, in the "head and shoulders scene" mode, the head of the real anchor completely appears in the video image, or the shoulders of the real anchor completely appear in the video image. For example, in this "waist scene" mode, the complete upper body parts of the real anchor appear in the video image.

Here, there may be one or more pieces of second indication information. Different pieces of second indication information indicate target capture parts in different invalid capture states. For example, FIG. 4 shows two pieces of second indication information, each indicating a target capture part of the anchor in a different invalid capture state.

In the above implementation, displaying the first indication information and/or the second indication information on the display interface of the anchor device guides the real anchor to make movements that can be captured effectively, which improves the accuracy of motion capture and ensures stable operation of the anchor device.

In the embodiment of the present invention, after the virtual live broadcast mode is determined and before the video images of the real anchor during the live broadcast are collected, the method further includes the following steps: (1) collecting a preview image containing the real anchor; (2) determining whether the target capture part contained in the preview image satisfies the action recognition condition; (3) when the target capture part contained in the preview image does not satisfy the action recognition condition, generating target adjustment information until it is determined that the target capture part satisfies the action recognition condition, where the target adjustment information is used to remind the real anchor to adjust the display state of the target capture part in the preview image.

In the embodiment of the present invention, after the user selects the virtual live broadcast mode, collection of preview images containing the real anchor can begin. The preview image can then be used to check whether the target capture part at the current moment satisfies the action recognition condition. In some embodiments, the action recognition condition is a preset action of the target capture part: if the action of the target capture part conforms to the preset action, the target capture part satisfies the action recognition condition; if the action of the target capture part does not conform to the preset action, the target capture part does not satisfy the action recognition condition.

在確定出滿足該條件的情況下，可以提示使用者進入直播模式，此時，使用者可以通過點擊預覽介面中的“進入”按鈕，進入到對應虛擬直播模式下的直播介面。When it is determined that the condition is satisfied, the user can be prompted to enter the live broadcast mode. At this time, the user can click the "Enter" button in the preview interface to enter the live broadcast interface corresponding to the virtual live broadcast mode.

When it is determined that the condition is not satisfied, it can be determined that the target capture part in the preview image is in an invalid capture state, for example, one of the states indicated by the second indication information in FIG. 4 or FIG. 5. In this case, the prompt "the target capture part is in an invalid state" can be displayed in the display interface of the preview image, and target adjustment information can also be generated and displayed to remind the real anchor to adjust the display state of the target capture part. For example, the target adjustment information may be: "Please move your body up, or move the camera down, so that your upper body is within the preview image"; or "Please adjust your sitting posture, or adjust the orientation of the camera, so that your face is within the preview image".
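
The preview check can be sketched as a function that returns the adjustment hints still outstanding, so that an empty result means the anchor may enter the live broadcast. This sketch makes several assumptions: it treats "the part is visible in the preview" as the action recognition condition, and the tables `REQUIRED_PARTS` and `HINTS`, the mode names, and the part names are hypothetical. Detecting which parts are visible is outside the sketch and would come from a pose detector.

```python
# Hypothetical mapping from virtual live broadcast mode to the parts that must be visible.
REQUIRED_PARTS = {
    "head_shoulders": ("head", "shoulders"),
    "waist": ("head", "shoulders", "arms", "torso"),
}

# Hypothetical hint texts, modeled on the example adjustment information above.
HINTS = {
    "head": "Please adjust your sitting posture or the camera orientation "
            "so that your face is within the preview image.",
    "torso": "Please move your body up or move the camera down "
             "so that your upper body is within the preview image.",
}


def target_adjustment_info(mode, visible_parts):
    """Return one hint per required part that is missing from the preview image."""
    missing = [part for part in REQUIRED_PARTS[mode] if part not in visible_parts]
    default = "Please adjust your position so that your {} is within the preview image."
    return [HINTS.get(part, default.format(part)) for part in missing]


# The caller repeats the check on each new preview image until no hints remain.
for hint in target_adjustment_info("waist", {"head", "shoulders"}):
    print(hint)
```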

In the above implementation, generating target adjustment information when the preview image shows that the target capture part does not satisfy the action recognition condition allows the camera and/or the display state of the target capture part to be adjusted before the user goes live, which improves the accuracy of action recognition during the live broadcast and thereby the live broadcast effect.

In the embodiment of the present invention, performing action recognition on the target capture part in the video image based on the virtual live broadcast mode to obtain the target recognition result, as in step S105 above, includes the following steps: step S1051, when it is detected that the target capture part includes a hand part, obtaining the mode tag corresponding to the virtual live broadcast mode, where the mode tags include a target mode tag indicating that hand recognition is to be performed on the hand part; step S1052, when the mode tag is determined to be the target mode tag, performing action recognition on the target capture part in the video image and performing hand detection on the hand part in the target capture part, to obtain a target recognition result that includes the hand detection result.

In the embodiment of the present invention, assume that the target capture part indicated by the virtual live broadcast mode includes a hand part. After the real anchor enters the virtual live broadcast mode, finger drive information, for example the information shown in FIG. 6, can be displayed in the virtual live broadcast interface on the anchor side. As shown in FIG. 6, buttons for enabling "finger drive" and "gesture recognition" can be displayed in that interface. When the real anchor enables "finger drive" and/or "gesture recognition", a corresponding mode tag for recognizing the hand part can be generated.

Therefore, in the embodiment of the present invention, when it is determined from the virtual live broadcast mode that the target capture part includes a hand part, the mode tag corresponding to the virtual live broadcast mode can also be obtained, so as to determine from the mode tag whether the real anchor has enabled "finger drive" and "gesture recognition". When the mode tag is determined to be a target mode tag, at least one of "finger drive" and "gesture recognition" is enabled, and the target mode tag can further be used to determine which of the two the real anchor has enabled.

For example, if the target mode tag is "01", "finger drive" is determined to be enabled; if the target mode tag is "10", "gesture recognition" is determined to be enabled; if the target mode tag is "11", both "finger drive" and "gesture recognition" are determined to be enabled.
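
The tag values in this example behave like a two-bit flag string, which can be decoded as in the following sketch. The function name `decode_mode_tag` and the dictionary keys are hypothetical, and treating "00" as "neither feature enabled" is an assumption, since the example only lists "01", "10", and "11".

```python
def decode_mode_tag(tag: str) -> dict:
    """Decode a two-character mode tag into the enabled hand features.

    First character: gesture recognition; second character: finger drive.
    """
    if len(tag) != 2 or any(ch not in "01" for ch in tag):
        raise ValueError(f"invalid mode tag: {tag!r}")
    return {
        "gesture_recognition": tag[0] == "1",
        "finger_drive": tag[1] == "1",
    }


def is_target_mode_tag(tag: str) -> bool:
    """A tag counts as a target mode tag when at least one hand feature is enabled."""
    return any(decode_mode_tag(tag).values())


print(decode_mode_tag("01"))  # {'gesture_recognition': False, 'finger_drive': True}
print(decode_mode_tag("11"))  # {'gesture_recognition': True, 'finger_drive': True}
```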

In the embodiment of the present invention, when "finger drive" is enabled, the hand detection result contains the position information of each key point of the hand. When "gesture recognition" is enabled, the hand detection result contains the gesture recognition result, for example, that the hand is making an OK gesture or a finger-heart gesture. When both "finger drive" and "gesture recognition" are enabled, the hand detection result contains both the gesture recognition result and the position information of each key point of the hand.

如圖7所示，在主播端的展示介面中還可以展示出該主播端設備所能夠識別的手勢，例如，“OK”，“666”和“hello”。As shown in FIG. 7 , gestures that can be recognized by the host device, such as "OK", "666" and "hello", can also be displayed in the presentation interface of the host terminal.

上述實施方式中，通過上述處理方式，可以實現對目標捕捉部位進行精細化處理，從而滿足使用者的不同直播需求。In the above-mentioned embodiment, through the above-mentioned processing method, it is possible to perform refined processing on the target capture part, so as to meet different live broadcast requirements of users.

在本發明實施例中，在確定所述真實主播驅動的虛擬主播模型對應的視頻流資料之後，該方法還包括如下步驟：（1）、根據所述手部檢測結果檢測到所述真實主播的手部姿勢為預設手勢，獲取與所述預設手勢相對應的渲染素材特效；（2）、在所述視頻流資料中的指定視頻幀中渲染所述渲染素材特效。 In the embodiment of the present invention, after determining the video stream data corresponding to the virtual anchor model driven by the real anchor, the method further includes the following steps: (1) According to the hand detection result, it is detected that the hand gesture of the real anchor is a preset gesture, and the rendering material special effect corresponding to the preset gesture is acquired; (2) Render the special effect of the rendering material in a specified video frame in the video stream data.

In the embodiment of the present invention, when the real anchor has enabled "gesture recognition", the hand detection result contains the gesture recognition result, which can be used to determine whether the hand gesture of the real anchor is a preset gesture. If it is, the rendering material special effect corresponding to the preset gesture is determined and rendered in the specified video frames of the video stream data.

In an optional implementation, before the hand gesture of the real anchor is determined to be a preset gesture according to the hand detection result, the following steps may also be performed. First, pose detection is performed on the real anchor in the video image to obtain a pose detection result, and whether the video image satisfies the gesture recognition condition is then determined according to the limb detection result contained in the pose detection result.

在本發明實施例中，在確定出上述所描述的肢體檢測結果之後，根據肢體檢測結果確定各個目標捕捉部位之間的相對位置關係；根據該相對位置關係，確定視頻圖像是否滿足手勢識別條件。In the embodiment of the present invention, after the limb detection result described above is determined, the relative positional relationship between each target capture part is determined according to the limb detection result; according to the relative positional relationship, it is determined whether the video image satisfies the gesture recognition condition .

可以理解的是，上述相對位置關係包含以下至少之一：各個目標捕捉部位之間的相對距離、各個目標捕捉部位中相關聯肢體部位之間的角度關係。其中，相關聯肢體部位可以理解為相鄰的目標捕捉部位，或者，類型相同的目標捕捉部位。It can be understood that the above relative positional relationship includes at least one of the following: a relative distance between each target capturing part, and an angular relationship between associated limb parts in each target capturing part. Wherein, the associated body parts may be understood as adjacent target capture parts, or target capture parts of the same type.

在本發明實施例中，視頻圖像滿足手勢識別條件可以理解為：視頻圖像中真實主播的肢體動作為預先設定的肢體動作。In the embodiment of the present invention, the video image satisfying the gesture recognition condition may be understood as: the body movement of the real host in the video image is a preset body movement.

其次，在確定出滿足所述手勢識別條件的情況下，根據手部檢測結果確定真實主播的手部姿勢是否為預設手勢；並在檢測出所述預設手勢的情況下，在所述視頻流資料中的指定視頻幀中添加所述渲染素材特效。Secondly, when it is determined that the gesture recognition condition is satisfied, determine whether the hand gesture of the real anchor is a preset gesture according to the hand detection result; and when the preset gesture is detected, in the video Add the rendering material special effect to the specified video frame in the streaming data.

In the embodiment of the present invention, the rendering material special effect corresponding to the video image is determined by combining the limb detection result with the hand detection result. Recognizing the gesture of the real anchor only after the pose detection result shows that the gesture recognition condition is satisfied improves the efficiency of pose comparison and shortens its duration, so that the technical solution is suitable for live broadcast scenarios with strict real-time requirements.
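
The two-stage check can be sketched as a cheap limb-angle gate in front of the gesture recognizer. Everything specific here is an assumption for illustration: the use of the elbow angle as the "angular relationship between associated limb parts", the 120-degree limit, the `EFFECTS` table, and the function names. The gesture recognizer is passed in as a callable so that it only runs when the gate is passed.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b between the segments b->a and b->c (2D points)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident key points")
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


# Hypothetical mapping from preset gesture to rendering material special effect.
EFFECTS = {"OK": "ok_sparkles", "heart": "floating_hearts"}


def pick_effect(limbs, recognize_gesture, max_elbow_angle=120.0):
    """Run gesture recognition only if the limb pose passes the gate."""
    angle = joint_angle(limbs["shoulder"], limbs["elbow"], limbs["wrist"])
    if angle > max_elbow_angle:
        return None  # arm is nearly straight: skip the more expensive hand step
    return EFFECTS.get(recognize_gesture())


bent_arm = {"shoulder": (0, 0), "elbow": (1, 0), "wrist": (1, 1)}
print(pick_effect(bent_arm, lambda: "OK"))
```

Frames that fail the gate never reach the recognizer, which is where the time saving described above would come from.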

在本發明實施例中，上述步驟S107，基於所述目標識別結果，確定所述真實主播驅動的虛擬主播模型對應的視頻流資料，具體包括如下步驟：步驟S1071，獲取至少一個觀眾端設備發送的直播觀看請求；步驟S1072，基於所述直播觀看請求確定每個觀眾端設備對應的直播觀看介面的介面背景圖像，其中，所述介面背景圖像包含：靜態背景圖像或者動態背景圖像；步驟S1073，確定用於表徵所述虛擬主播模型執行所述目標識別結果所指示動作的多個視頻圖像；步驟S1074，將所述多個視頻圖像中每個視頻圖像的背景圖像替換為所述介面背景圖像，並基於修改之後的所述多個視頻圖像確定所述視頻流資料。 In the embodiment of the present invention, the above step S107, based on the target recognition result, determines the video stream data corresponding to the virtual anchor model driven by the real anchor, which specifically includes the following steps: Step S1071, obtaining a live viewing request sent by at least one viewer device; Step S1072: Determine the interface background image of the live viewing interface corresponding to each viewer device based on the live viewing request, wherein the interface background image includes: a static background image or a dynamic background image; Step S1073, determining a plurality of video images used to represent that the virtual anchor model performs the action indicated by the target recognition result; Step S1074, replacing the background image of each video image in the plurality of video images with the interface background image, and determining the video stream data based on the modified video images.

在本發明實施例中，在確定所述真實主播驅動的虛擬主播模型對應的視頻流資料之前，還可以獲取至少一個觀眾端設備發送的直播觀看請求。該直播觀看請求中可以攜帶每個觀眾端設備所請求展示的介面背景圖像。如果觀眾端設備未請求介面背景圖像，那麼該介面背景圖像可以設置為透明背景圖像。In the embodiment of the present invention, before determining the video stream material corresponding to the virtual anchor model driven by the real anchor, a live viewing request sent by at least one viewer terminal device may also be acquired. The live viewing request may carry the interface background image requested by each viewer device. If the viewer device does not request the interface background image, then the interface background image can be set as a transparent background image.

此時，可以確定用於表徵所述虛擬主播模型執行所述目標識別結果所指示動作的多個視頻圖像，並將多個視頻圖像中每個視頻圖像的背景圖像替換為所述介面背景圖像，並基於修改之後的所述多個視頻圖像確定所述視頻流資料。At this point, it is possible to determine a plurality of video images used to characterize the virtual anchor model performing the action indicated by the target recognition result, and replace the background image of each video image in the plurality of video images with the interface background image, and determine the video stream data based on the modified video images.

In the embodiment of the present invention, each of the multiple video images can be segmented into a foreground image and a background image, and the segmented foreground image can then be fused with the interface background image to obtain the modified video images. Each video image carries a corresponding time tag (for example, a timestamp), so the modified video images can be processed based on their time tags to generate the video stream data.
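
The fusion and ordering steps can be sketched with frames held as nested lists of pixel values. This is a toy sketch under stated assumptions: the foreground mask is taken as given (a real system would obtain it from a segmentation model), fusion is a hard per-pixel selection rather than a soft blend, and the names `composite` and `build_stream` are hypothetical.

```python
def composite(frame, mask, background):
    """Keep frame pixels where mask is truthy (foreground), else use the background."""
    return [
        [f if m else b for f, m, b in zip(frame_row, mask_row, bg_row)]
        for frame_row, mask_row, bg_row in zip(frame, mask, background)
    ]


def build_stream(frames, background):
    """frames: iterable of (timestamp, frame, mask). Returns frames in time order."""
    ordered = sorted(frames, key=lambda item: item[0])
    return [(ts, composite(frame, mask, background)) for ts, frame, mask in ordered]


frame = [[1, 2], [3, 4]]
mask = [[1, 0], [0, 1]]          # 1 marks the virtual anchor (foreground)
background = [[9, 9], [9, 9]]    # interface background image
print(composite(frame, mask, background))  # [[1, 9], [9, 4]]
```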

在確定出待驅動的虛擬主播模型所對應的視頻流資料之後，還可以向對應相同介面背景圖像的觀眾端設備推送包含該介面背景圖像的視頻流資料。After the video stream data corresponding to the virtual anchor model to be driven is determined, the video stream data including the interface background image can also be pushed to the audience device corresponding to the same interface background image.

In the above implementation, video stream data is generated once for each requested interface background and pushed to all viewer devices that requested that interface background image. This shortens the time needed to determine the video stream data and improves the quality of the virtual live broadcast.
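
Grouping viewer devices by requested background, so that each distinct stream is produced once, can be sketched as follows. The request format (pairs of viewer identifier and background name) and the function name are hypothetical; falling back to a transparent background when none is requested follows the description above.

```python
from collections import defaultdict


def group_viewers_by_background(requests, default="transparent"):
    """Map each requested interface background to the viewers that asked for it.

    requests: iterable of (viewer_id, background or None).
    """
    groups = defaultdict(list)
    for viewer_id, background in requests:
        groups[background or default].append(viewer_id)
    return dict(groups)


requests = [("v1", "beach"), ("v2", None), ("v3", "beach")]
for background, viewers in group_viewers_by_background(requests).items():
    # One stream is built per background and pushed to every viewer in the group.
    print(background, viewers)
```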

In the embodiment of the present invention, the above step of performing action recognition on the target capture part in the video image to obtain the target recognition result further includes the following process: when it is detected that the target capture part contained in the video image is incomplete, the video image frame can be expanded to obtain a target image, where the target image contains an area used for pose detection of the missing target capture part; pose detection is then performed on the target image by a pre-trained pose detection model to obtain the pose detection result of the real anchor.

採用上述處理方式，可以實現在視頻圖像幀中不包含完整的肢體部位的情況下，依然可以對真實主播進行姿態檢測，從而保證主播端設備能夠正常穩定運行。Using the above processing method, it is possible to detect the posture of the real anchor even when the video image frame does not contain complete body parts, so as to ensure the normal and stable operation of the anchor device.
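
One possible reading of the expansion step is padding the frame so the pose model has room to place the missing part. The sketch below pads only at the bottom with a constant fill value; the direction, the fill value, and the name `expand_frame` are assumptions, since the disclosure does not specify how the frame is expanded.

```python
def expand_frame(frame, pad_bottom, fill=0):
    """Return a copy of frame with pad_bottom extra rows of fill pixels appended."""
    if pad_bottom < 0:
        raise ValueError("pad_bottom must be non-negative")
    width = len(frame[0]) if frame else 0
    padded = [row[:] for row in frame]  # copy so the input frame is not modified
    padded.extend([fill] * width for _ in range(pad_bottom))
    return padded


frame = [[1, 2, 3], [4, 5, 6]]
target_image = expand_frame(frame, pad_bottom=2)
print(len(target_image), len(target_image[0]))  # 4 3
```

The expanded target image, rather than the original frame, would then be passed to the pose detection model.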

In the embodiment of the present invention, determining, based on the target recognition result, the video stream data corresponding to the virtual anchor model driven by the real anchor further includes the following steps: determining the position information of the target capture part according to the target recognition result, and, by means of a driving signal for the corresponding part of the virtual anchor model in the virtual live image, determining the position information of the key points of that corresponding part. A corresponding virtual live image can be determined for each video image, and the video stream data is obtained from these virtual live images. In the embodiment of the present invention, the virtual anchor model may be a three-dimensional avatar (3D avatar) model.

Referring to FIG. 8, which is a schematic diagram of a live broadcast system provided by an embodiment of the present invention, the live broadcast system includes an anchor device 71 and a viewer device 72.

The anchor device is configured to determine, according to the live broadcast method described in any of the above embodiments, the video stream data corresponding to the virtual anchor model driven by the real anchor, and to push the video stream data to the viewer device.

The viewer device is configured to obtain the video stream data and play it in a live viewing interface.

Here, the viewer device includes a mobile terminal device and a PC device.

When the viewer device is the mobile terminal device, the anchor device transmits the video stream data to the viewer device through a CDN distribution network.

When the viewer device is the PC device, the anchor device transmits the video stream data to the viewer device through the CDN distribution network and a relay streaming server.
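The two delivery paths can be written down as a small routing function. This is only a sketch of the rule stated above; the enum `ViewerKind` and the hop labels are illustrative names, not identifiers from the text.

```python
from enum import Enum
from typing import List


class ViewerKind(Enum):
    MOBILE = "mobile"
    PC = "pc"


def delivery_path(kind: ViewerKind) -> List[str]:
    """Return the hops a video stream takes from the anchor device to a viewer."""
    if kind is ViewerKind.MOBILE:
        # Mobile terminals receive the stream through the CDN only.
        return ["anchor_device", "cdn", "viewer_device"]
    if kind is ViewerKind.PC:
        # PC viewers additionally go through the relay streaming server.
        return ["anchor_device", "cdn", "relay_server", "viewer_device"]
    raise ValueError(f"unsupported viewer kind: {kind!r}")
```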

As can be seen from FIG. 8, the live broadcast application is installed on the anchor device 71. First, the virtual live broadcast mode of the real anchor is determined. Once the mode is determined, the parts to be driven (that is, the target capture parts) can be determined from it, for example the facial expression, the arms, and the torso. The camera of the anchor device 71 then collects video images of the real anchor during the live broadcast, and action recognition is performed, based on the virtual live broadcast mode, on the target capture parts in the video images to obtain the target recognition result. A virtual camera then renders images based on the target recognition result to obtain virtual live images, and the video stream data is determined from the multiple virtual live images.
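The flow on the anchor device can be sketched as a short loop. The mode names in `MODE_PARTS` are hypothetical, and the recognizer and renderer are passed in as plain callables because the text does not specify their interfaces.

```python
from typing import Any, Callable, Dict, Iterable, List, Set

# Hypothetical mapping from virtual live broadcast mode to target capture parts.
MODE_PARTS: Dict[str, Set[str]] = {
    "face_only": {"expression"},
    "upper_body": {"expression", "arms", "torso"},
}


def run_pipeline(mode: str,
                 frames: Iterable[Any],
                 recognize: Callable[[Any, Set[str]], Dict[str, Any]],
                 render: Callable[[Dict[str, Any]], Any]) -> List[Any]:
    """Mode -> capture parts -> per-frame recognition -> rendering -> stream."""
    parts = MODE_PARTS[mode]
    stream = []
    for frame in frames:
        # Recognition is restricted to the parts selected by the mode.
        result = recognize(frame, parts)
        # The virtual camera renders one virtual live image per video image.
        stream.append(render(result))
    return stream
```

Passing the capture parts into the recognizer is what lets a lighter mode skip work for body parts it does not need.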

The video stream data can then be transmitted to the mobile terminal device through the CDN distribution network, and to the PC device through the CDN distribution network and the relay streaming server, so that it is played on the viewer device.

In this embodiment of the present invention, determining the target capture parts through the virtual live broadcast mode and performing action recognition only on those parts allows the body parts of the real anchor that are to be captured to be chosen more flexibly. Body parts that do not need motion capture are filtered out, which shortens the motion capture time and reduces the complexity of the virtual live broadcast. It also makes the live broadcast method suitable for live broadcast scenarios with strict real-time requirements.

Those skilled in the art will understand that, in the above method of the specific implementations, the order in which the steps are written does not imply a strict execution order and does not limit the implementation in any way; the actual execution order of the steps should be determined by their functions and possible internal logic.

Based on the same inventive concept, an embodiment of the present invention also provides a live broadcast apparatus corresponding to the live broadcast method. Because the apparatus solves the problem on a principle similar to that of the live broadcast method described above, the implementation of the apparatus may refer to the implementation of the method, and repeated details are omitted.

Referring to FIG. 9, which is a schematic diagram of a live broadcast apparatus provided by an embodiment of the present invention, the apparatus includes a first determining unit 81, a collection unit 82, an action recognition unit 83, and a second determining unit 84. The first determining unit 81 is configured to determine a virtual live broadcast mode, where the virtual live broadcast mode is used to indicate the target capture part for body capture of the real anchor. The collection unit 82 is configured to collect video images of the real anchor during the live broadcast. The action recognition unit 83 is configured to perform, based on the virtual live broadcast mode, action recognition on the target capture part in the video image to obtain a target recognition result. The second determining unit 84 is configured to determine, based on the target recognition result, the video stream data corresponding to the virtual anchor model driven by the real anchor, where the video stream data is used to present a process in which the virtual anchor model performs the action indicated by the recognition result.

In a possible implementation, the first determining unit 81 is further configured to: determine the running scenario of the anchor device; determine, from multiple running versions preset for the anchor device, a target running version that matches the running scenario; and determine the virtual live broadcast mode based on the determined target running version.

In a possible implementation, the first determining unit 81 is further configured to: when a first installation operation for the live broadcast application is detected, obtain device identification information of the anchor device and determine the running scenario according to the device identification information; and/or, when an opening operation for the live broadcast application is detected, obtain the remaining device computing resources of the anchor device at the current moment and determine the running scenario according to the remaining device computing resources.
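The chain from running scenario to running version to virtual live broadcast mode can be sketched as a few table lookups. The device table, the resource thresholds, and the version and mode names below are all assumptions chosen for illustration; the text only states that such a mapping exists.

```python
# Hypothetical lookup tables; none of these names or values come from the text.
DEVICE_TIERS = {"model-a": "high", "model-b": "low"}
VERSION_FOR_SCENARIO = {"high": "full", "medium": "standard", "low": "lite"}
MODE_FOR_VERSION = {"full": "full_body", "standard": "upper_body", "lite": "face_only"}


def scenario_from_device(device_id: str) -> str:
    """Used on first installation: classify the device by its identifier."""
    return DEVICE_TIERS.get(device_id, "medium")


def scenario_from_resources(free_fraction: float) -> str:
    """Used when the application is opened: classify by remaining resources."""
    if not 0.0 <= free_fraction <= 1.0:
        raise ValueError("free_fraction must be within [0, 1]")
    if free_fraction >= 0.6:  # assumed threshold
        return "high"
    if free_fraction >= 0.3:  # assumed threshold
        return "medium"
    return "low"


def select_mode(scenario: str) -> str:
    """Running scenario -> target running version -> virtual live broadcast mode."""
    return MODE_FOR_VERSION[VERSION_FOR_SCENARIO[scenario]]
```

For example, `select_mode(scenario_from_resources(0.1))` would fall back to the lightest mode in this sketch, so a busy device tracks fewer body parts.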

In a possible implementation, the first determining unit 81 is further configured to: in response to a live broadcast mode selection instruction from the real anchor, determine the virtual live broadcast mode among multiple preset live broadcast modes.

In a possible implementation, the apparatus is further configured to: before collecting the video images of the real anchor during the live broadcast, collect a preview image containing the real anchor; determine whether the target capture part contained in the preview image satisfies an action recognition condition; and, when the target capture part contained in the preview image does not satisfy the action recognition condition, generate target adjustment information until it is determined that the target capture part satisfies the action recognition condition, where the target adjustment information is used to prompt the real anchor to adjust how the target capture part is displayed in the preview image.
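The preview check can be sketched as a loop that keeps issuing hints until the condition holds. The text does not define the action recognition condition, so the sketch assumes it means "every required part is visible in the preview"; the function names and hint wording are illustrative.

```python
from typing import Iterable, List, Optional, Set, Tuple


def adjustment_hint(visible_parts: Set[str], required_parts: Set[str]) -> Optional[str]:
    """Return a hint for the anchor, or None when the condition is satisfied.

    Assumption: the condition is that all required parts are visible.
    """
    missing = sorted(required_parts - visible_parts)
    if not missing:
        return None
    return "Please adjust so these parts are visible: " + ", ".join(missing)


def wait_until_ready(previews: Iterable[Set[str]],
                     required_parts: Set[str]) -> Tuple[bool, List[str]]:
    """Check successive preview images, collecting the hints shown to the anchor."""
    hints: List[str] = []
    for visible in previews:
        hint = adjustment_hint(visible, required_parts)
        if hint is None:
            return True, hints
        hints.append(hint)
    return False, hints
```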

In a possible implementation, the action recognition unit 83 is further configured to: when it is detected that the target capture part includes a hand part, obtain the mode tag corresponding to the virtual live broadcast mode, where the mode tag includes a target mode tag indicating whether gesture recognition is performed on the hand part; and, when it is determined that the mode tag is the target mode tag, perform action recognition on the target capture part in the video image and perform gesture detection on the hand part within the target capture part, to obtain the target recognition result containing a gesture recognition result.

In a possible implementation, the action recognition unit 83 is further configured to: when it is detected, according to the hand detection result, that the hand pose of the real anchor is a preset gesture, obtain the rendering material special effect corresponding to the preset gesture; and render the rendering material special effect in a specified video frame of the video stream data.
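The mode tag gating and the gesture-to-effect lookup can be sketched together. The `Mode` class, the gesture names, and the effect names are hypothetical; the sketch only shows that gesture detection runs when the hand is a capture part and the mode carries the target tag, and that a preset gesture selects an effect.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional

# Hypothetical preset gestures and the effects they trigger.
PRESET_EFFECTS: Dict[str, str] = {"heart": "heart_particles", "thumbs_up": "like_burst"}


@dataclass(frozen=True)
class Mode:
    capture_parts: FrozenSet[str]
    gesture_tag: bool  # True when the mode carries the target mode tag


def recognize_frame(mode: Mode, detected_gesture: Optional[str]) -> dict:
    """Build a recognition result; gesture data is included only when enabled."""
    result = {"parts": sorted(mode.capture_parts)}
    if "hand" in mode.capture_parts and mode.gesture_tag:
        result["gesture"] = detected_gesture
    return result


def effect_for(result: dict) -> Optional[str]:
    """Return the effect to render for a preset gesture, if any."""
    gesture = result.get("gesture")
    if gesture is None:
        return None
    return PRESET_EFFECTS.get(gesture)
```

A renderer would then draw the returned effect into the chosen video frame; gestures that are not in the preset table produce no effect.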

In a possible implementation, the second determining unit 84 is further configured to: obtain a live viewing request sent by at least one viewer device; determine, based on the live viewing request, the interface background image of the live viewing interface corresponding to each viewer device, where the interface background image is a static background image or a dynamic background image; determine multiple video images that represent the virtual anchor model performing the action indicated by the target recognition result; and replace the background image of each of the multiple video images with the interface background image, and determine the video stream data based on the modified video images.

In a possible implementation, the apparatus is further configured to: after determining the video stream data corresponding to the virtual anchor model driven by the real anchor, push the video stream data containing a given interface background image to the viewer devices that correspond to that same interface background image.
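Grouping viewers by background means each background variant of the stream is composed once and pushed to every viewer that asked for it. The sketch below assumes the viewing request reduces to a viewer id and a background id, and represents a composed frame as a `(background, avatar_frame)` pair; both are simplifications for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_background(requests: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each interface background to the viewers that requested it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for viewer_id, background in requests.items():
        groups[background].append(viewer_id)
    return dict(groups)


def compose_streams(avatar_frames: List[str], requests: Dict[str, str]) -> Dict[str, dict]:
    """Compose one stream per background and list the viewers it is pushed to."""
    streams = {}
    for background, viewers in group_by_background(requests).items():
        # Stand-in for background replacement: pair each frame with the background.
        frames = [(background, frame) for frame in avatar_frames]
        streams[background] = {"frames": frames, "viewers": viewers}
    return streams
```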

In a possible implementation, the apparatus is further configured to: after the virtual live broadcast mode is determined, display first indication information and/or second indication information in the display interface of the anchor device, where the first indication information indicates the target capture parts that are in a valid capture state, and the second indication information indicates the target capture parts that are in an invalid capture state.
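A minimal sketch of how the two kinds of indication information might be derived is shown below. It assumes a part counts as validly captured when it is currently tracked; the function name and the dictionary keys are illustrative only.

```python
from typing import Dict, List, Set


def capture_indications(target_parts: Set[str], tracked_parts: Set[str]) -> Dict[str, List[str]]:
    """Split the target capture parts into valid and invalid capture states.

    Assumption: a part is in a valid capture state when it is currently tracked.
    """
    valid = sorted(part for part in target_parts if part in tracked_parts)
    invalid = sorted(part for part in target_parts if part not in tracked_parts)
    return {"first_indication": valid, "second_indication": invalid}
```

The display interface could, for instance, highlight the parts listed under the first key and grey out those under the second.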

For the processing flow of each module in the apparatus and the interaction flow between the modules, refer to the relevant description in the method embodiments above; details are not repeated here.

Corresponding to the live broadcast method in FIG. 1A, an embodiment of the present invention also provides a computer device 900. FIG. 10 is a schematic structural diagram of the computer device 900, which includes a processor 91, a memory 92, and a bus 93. The memory 92 is used to store execution instructions and includes an internal memory 921 and an external memory 922. The internal memory 921 temporarily holds the operation data of the processor 91 and the data exchanged with the external memory 922, such as a hard disk; the processor 91 exchanges data with the external memory 922 through the internal memory 921. When the computer device 900 is running, the processor 91 communicates with the memory 92 through the bus 93, so that the processor 91 executes the following instructions: determining a virtual live broadcast mode, where the virtual live broadcast mode is used to indicate the target capture part for body capture of the real anchor; collecting video images of the real anchor during the live broadcast; performing, based on the virtual live broadcast mode, action recognition on the target capture part in the video image to obtain a target recognition result; and determining, based on the target recognition result, the video stream data corresponding to the virtual anchor model driven by the real anchor, where the video stream data is used to present a process in which the virtual anchor model performs the action indicated by the recognition result.

An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the live broadcast method described in the method embodiments above are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.

An embodiment of the present invention also provides a computer program product carrying program code, where the instructions included in the program code can be used to execute the steps of the live broadcast method described in the method embodiments above. For details, refer to the method embodiments; they are not repeated here.

The computer program product may be implemented in hardware, software, or a combination of the two. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).

The subject matter involved in the embodiments of the present invention may be at least one of a system, a method, and a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions configured to cause a processor to implement various aspects of the present invention.

A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. Examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or flash memory, static random-access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or raised structures in a groove on which instructions are stored, and any suitable combination of the above. A computer-readable storage medium as used here is not to be construed as a transient signal per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (for example, light pulses through a fiber-optic cable), or electrical signals transmitted through a wire.

The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to individual computing/processing devices, or downloaded to an external computer or external storage device over a network, such as at least one of the Internet, a local area network, a wide area network, and a wireless network. The network may include at least one of copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within that computing/processing device.

Computer program instructions configured to perform the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized using state information of the computer-readable program instructions; the electronic circuit can then execute the computer-readable program instructions, thereby implementing various aspects of the embodiments of the present invention.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and apparatus described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable, non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Finally, it should be noted that the embodiments described above are only specific implementations of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Industrial applicability

An embodiment of the present invention provides a live broadcast method, system, computer device, and computer-readable storage medium, where the method includes: determining a virtual live broadcast mode, where the virtual live broadcast mode is used to indicate at least one target capture part for body capture of a real anchor; collecting video images of the real anchor during the live broadcast; performing, based on the virtual live broadcast mode, action recognition on the target capture part in the video image to obtain a target recognition result; and determining, based on the target recognition result, the video stream data corresponding to the virtual anchor model driven by the real anchor, where the video stream data is used to present a process in which the virtual anchor model performs the action indicated by the recognition result.

201: video image acquisition terminal
202: network
203: control terminal
71: anchor device
72: viewer device
81: first determining unit
82: collection unit
83: action recognition unit
84: second determining unit
900: computer device
91: processor
92: memory
921: internal memory
922: external memory
93: bus
S101~S107, S11~S13: steps

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings are incorporated into and form part of this specification; they show embodiments consistent with the present invention and, together with the description, serve to explain its technical solutions. It should be understood that the following drawings show only some embodiments of the present invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
FIG. 1A shows a flowchart of a live broadcast method provided by an embodiment of the present invention;
FIG. 1B shows a schematic diagram of a system architecture to which the live broadcast method of an embodiment of the present invention can be applied;
FIG. 2 shows a flowchart of another live broadcast method provided by an embodiment of the present invention;
FIG. 3 shows a schematic diagram of a selection interface for a virtual live broadcast mode provided by an embodiment of the present invention;
FIG. 4 shows a schematic diagram of another selection interface for a virtual live broadcast mode provided by an embodiment of the present invention;
FIG. 5 shows a schematic diagram of a third selection interface for a virtual live broadcast mode provided by an embodiment of the present invention;
FIG. 6 shows a schematic diagram of a setting interface for a mode tag provided by an embodiment of the present invention;
FIG. 7 shows a schematic diagram of a virtual live broadcast interface containing gesture information of preset gestures provided by an embodiment of the present invention;
FIG. 8 shows a schematic diagram of a live broadcast system provided by an embodiment of the present invention;
FIG. 9 shows a schematic diagram of a live broadcast apparatus provided by an embodiment of the present invention;
FIG. 10 shows a schematic diagram of a computer device provided by an embodiment of the present invention.

S101~S107: steps

Claims

A live broadcast method, applied to an anchor device, comprising: determining a virtual live broadcast mode, where the virtual live broadcast mode is used to indicate at least one target capture part for body capture of a real anchor; collecting video images of the real anchor during the live broadcast; performing, based on the virtual live broadcast mode, action recognition on the target capture part in the video image to obtain a target recognition result; and determining, based on the target recognition result, video stream data corresponding to a virtual anchor model driven by the real anchor, where the video stream data is used to present a process in which the virtual anchor model performs the action indicated by the target recognition result.

The method according to claim 1, wherein determining the virtual live broadcast mode comprises: determining the running scenario of the anchor device; determining, from multiple running versions preset for the anchor device, a target running version that matches the running scenario; and determining the virtual live broadcast mode based on the determined target running version.

The method according to claim 2, wherein determining the running scenario of the anchor device comprises: when a first installation operation for a live broadcast application is detected, obtaining device identification information of the anchor device and determining the running scenario according to the device identification information; and/or, when an opening operation for the live broadcast application is detected, obtaining the remaining device computing resources of the anchor device at the current moment and determining the running scenario according to the remaining device computing resources.

The method according to any one of claims 1 to 3, wherein determining the virtual live broadcast mode comprises: in response to a live broadcast mode selection instruction from the real anchor, determining the virtual live broadcast mode among a plurality of preset live broadcast modes.

The method according to any one of claims 1 to 3, wherein the method further comprises: before collecting the video images of the real anchor during the live broadcast, collecting a preview image that includes the real anchor; determining whether the target capture part contained in the preview image satisfies a motion recognition condition; and if the target capture part contained in the preview image does not satisfy the motion recognition condition, generating target adjustment information until it is determined that the target capture part satisfies the motion recognition condition, wherein the target adjustment information is used to prompt the real anchor to adjust the display state of the target capture part in the preview image.
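
As a non-limiting illustration, the preview check might be sketched as the loop below. It assumes, purely for the example, that the motion recognition condition is "every target capture part is visible in the preview image"; the function names and the set-of-visible-parts representation of a preview image are hypothetical.

```python
from typing import Callable, Iterable, List, Set


def missing_parts(visible_parts: Set[str], target_parts: Iterable[str]) -> List[str]:
    """Return the target capture parts that are not visible in a preview image."""
    return [part for part in target_parts if part not in visible_parts]


def wait_until_ready(
    previews: Iterable[Set[str]],
    target_parts: List[str],
    notify: Callable[[str], None],
) -> int:
    """Emit adjustment prompts until a preview satisfies the assumed condition.

    Returns the index of the first satisfying preview, or -1 if none does.
    """
    for index, visible in enumerate(previews):
        missing = missing_parts(visible, target_parts)
        if not missing:
            return index
        # Target adjustment information: tell the anchor what to bring into view.
        notify("Please adjust so these parts are visible: " + ", ".join(missing))
    return -1
```

Here `notify` stands in for whatever mechanism displays the target adjustment information on the anchor device.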

The method according to claim 1, wherein performing, based on the virtual live broadcast mode, action recognition on the target capture part in the video image to obtain the target recognition result comprises: when it is detected that the target capture part includes a hand part, obtaining a mode tag corresponding to the virtual live broadcast mode, wherein the mode tag includes a target mode tag used to indicate whether to perform hand recognition on the hand part; and when it is determined that the mode tag is the target mode tag, performing action recognition on the target capture part in the video image and performing hand detection on the hand part in the target capture part, to obtain the target recognition result including a hand detection result.

The method according to claim 6, wherein, after determining the video stream data corresponding to the virtual anchor model driven by the real anchor, the method further comprises: when it is detected, according to the hand detection result, that the hand gesture of the real anchor is a preset gesture, obtaining a rendering material special effect corresponding to the preset gesture; and rendering the rendering material special effect in a specified video frame of the video stream data.
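
As a non-limiting illustration, gating hand detection on the mode tag and then looking up a special effect for a preset gesture might be sketched as follows. The tag value, the gesture names, the effect names and the dictionary standing in for a video frame are all hypothetical; no real detector is invoked.

```python
from typing import Optional

HAND_DETECTION_TAG = "hand_detection_on"  # hypothetical target mode tag
GESTURE_EFFECTS = {  # hypothetical preset gesture -> rendering material special effect
    "heart": "heart_particles",
    "ok": "ok_sticker",
}


def recognize(frame: dict, capture_parts: list, mode_tag: str) -> dict:
    """Build a target recognition result from a frame.

    `frame` is a stand-in dict holding precomputed 'pose' and 'gesture' values.
    Hand detection runs only if hands are captured and the tag enables it.
    """
    result = {"pose": frame.get("pose")}
    if "hands" in capture_parts and mode_tag == HAND_DETECTION_TAG:
        result["hand_detection"] = frame.get("gesture")
    return result


def effect_for(result: dict) -> Optional[str]:
    """Return the special effect for a preset gesture, or None if there is none."""
    return GESTURE_EFFECTS.get(result.get("hand_detection"))
```

When the tag disables hand detection, the result carries no hand detection entry and no effect is selected, so the extra detection cost is skipped.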

The method according to any one of claims 1 to 3, wherein determining, based on the target recognition result, the video stream data corresponding to the virtual anchor model driven by the real anchor comprises: obtaining a live viewing request sent by at least one viewer device; determining, based on the live viewing request, an interface background image of the live viewing interface corresponding to each viewer device, wherein the interface background image includes a static background image or a dynamic background image; determining a plurality of video images used to represent the virtual anchor model performing the action indicated by the target recognition result; and replacing the background image of each of the plurality of video images with the interface background image, and determining the video stream data based on the modified video images.

The method according to claim 8, wherein the method further comprises: after the video stream data corresponding to the virtual anchor model driven by the real anchor is determined, pushing the video stream data including the interface background image to the viewer device corresponding to the same interface background image.
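
As a non-limiting illustration, replacing the background per interface background image and pushing each resulting stream to the matching viewer devices might be sketched as follows. The request format (pairs of viewer identifier and background name) and the dictionary representation of a composited frame are hypothetical.

```python
from collections import defaultdict


def group_viewers_by_background(requests):
    """Group viewer ids by requested interface background image.

    `requests` is an assumed list of (viewer_id, background_name) pairs.
    """
    groups = defaultdict(list)
    for viewer_id, background in requests:
        groups[background].append(viewer_id)
    return dict(groups)


def build_streams(avatar_frames, requests):
    """Composite each avatar frame once per distinct background and list its recipients."""
    streams = {}
    for background, viewers in group_viewers_by_background(requests).items():
        composited = [{"avatar": f, "background": background} for f in avatar_frames]
        streams[background] = {"frames": composited, "push_to": viewers}
    return streams
```

Grouping first means the background replacement is done once per distinct background rather than once per viewer device.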

The method according to claim 1, wherein the method further comprises: after the virtual live broadcast mode is determined, displaying first indication information and/or second indication information in a display interface of the anchor device, wherein the first indication information is used to indicate target capture parts in a valid capture state, and the second indication information is used to indicate target capture parts in an invalid capture state.

A live broadcast system, comprising: an anchor device and a viewer device; wherein the anchor device is configured to determine, according to the live broadcast method of any one of claims 1 to 10, the video stream data corresponding to the virtual anchor model driven by the real anchor, and to push the video stream data to the viewer device; and the viewer device is configured to obtain the video stream data and play the video stream data on a live viewing interface.

The system according to claim 11, wherein the viewer device includes a mobile terminal device and a PC device; when the viewer device is the mobile terminal device, the anchor device transmits the video stream data to the viewer device through a CDN distribution network; and when the viewer device is the PC device, the anchor device transmits the video stream data to the viewer device through a CDN distribution network and a forwarding stream server.
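
As a non-limiting illustration, choosing the delivery components by viewer device type might be sketched as follows. The device type strings and component names are hypothetical, and a set is returned because the claim does not state the order in which the CDN distribution network and the forwarding stream server are traversed.

```python
def delivery_components(device_type: str) -> set:
    """Return the components used to deliver the video stream to a viewer device."""
    if device_type == "mobile":
        return {"cdn"}
    if device_type == "pc":
        return {"cdn", "forwarding_stream_server"}
    raise ValueError("unknown viewer device type: " + device_type)
```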

A computer device, comprising: a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device is running, the processor and the memory communicate through the bus; and when the machine-readable instructions are executed by the processor, the steps of the live broadcast method according to any one of claims 1 to 10 are performed.

A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the live broadcast method as described in any one of claims 1 to 10 are executed.