WO2021047419A1 - Live broadcast interaction method, live broadcast system, electronic device and storage medium - Google Patents

Live broadcast interaction method, live broadcast system, electronic device and storage medium

Info

Publication number
WO2021047419A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
live
information
live broadcast
server
Prior art date
Application number
PCT/CN2020/112793
Other languages
English (en)
French (fr)
Inventor
曾衍
Original Assignee
广州华多网络科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州华多网络科技有限公司 filed Critical 广州华多网络科技有限公司
Publication of WO2021047419A1 publication Critical patent/WO2021047419A1/zh

Links

Images

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 - Server components or server architectures
    • H04N21/218 - Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 - Live feed
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 - Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 - Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 - Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44012 - Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 - Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 - End-user applications
    • H04N21/488 - Data services, e.g. news ticker
    • H04N21/4882 - Data services, e.g. news ticker, for displaying messages, e.g. warnings, reminders

Definitions

  • This application relates to the field of live broadcast technology, and in particular to a live broadcast interaction method, a live broadcast system, an electronic device, and a storage medium.
  • Network platforms include one-to-one chat and dating platforms, host chat room platforms, forum-based friend-making platforms, and the like. Among these, one-to-one chat platforms and host chat room platforms are more popular with users because they support real-time video communication.
  • To increase user stickiness, live video platforms often provide a rich set of gifts that can be given, thereby increasing interaction between users during live video.
  • However, existing gifts are displayed on the public screen and then disappear; this has nothing to do with the picture of the live video itself, so gifts given during a live broadcast have a monotonous presentation effect and a short presentation time.
  • The present application provides a live broadcast interaction method, a live broadcast system, an electronic device, and a storage medium, to solve the problem in the prior art that the live broadcast interaction mode is monotonous.
  • A technical solution adopted in this application is to provide a live broadcast interaction method applied to a live broadcast system, where the live broadcast system includes a host end, a viewer end, and a server;
  • the live broadcast interaction method includes:
  • the host end collects contour information and live video, encodes the contour information into the network abstraction layer of a video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server;
  • the server sends the encoded video bitstream to the viewer end;
  • the host end and/or the viewer end further obtain a trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction;
  • the host end and/or the viewer end decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
  • Another technical solution adopted in this application is to provide a live broadcast system that includes at least a host end, a viewer end, and a server;
  • the host end is configured to collect contour information and live video, encode the contour information into the network abstraction layer of a video bitstream, encode the live video into the video coding layer of the video bitstream, and upload the encoded video bitstream to the server;
  • the server is configured to send the encoded video bitstream to the viewer end;
  • the host end and/or the viewer end are configured to further obtain a trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction;
  • the host end and/or the viewer end are further configured to decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
  • Another technical solution adopted in this application is to provide another live broadcast interaction method applied to an electronic device; the live broadcast interaction method includes:
  • collecting contour information and live video, encoding the contour information into the network abstraction layer of a video bitstream, encoding the live video into the video coding layer of the video bitstream, and uploading the encoded video bitstream to the server, so that the server sends the encoded video bitstream to the viewer end;
  • further obtaining a trigger instruction, and obtaining corresponding special-effect information based on the trigger instruction;
  • decoding the contour information and the live video from the encoded video bitstream, and rendering the special-effect information onto the live video based on the contour information.
  • Another technical solution adopted in this application is to provide an electronic device including a memory and a processor coupled to the memory;
  • the memory is used to store program data;
  • the processor is used to execute the program data to implement the above live broadcast interaction method.
  • Another technical solution adopted in this application is to provide a computer storage medium in which a computer program is stored; when executed, the computer program implements the steps of the above live broadcast interaction method.
  • Different from the prior art, the beneficial effects of this application are as follows: the host end collects contour information and live video, encodes the contour information into the network abstraction layer of the video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server; the server sends the encoded video bitstream to the viewer end; the host end and/or the viewer end further obtain the trigger instruction generated by the server and obtain the corresponding special-effect information based on it; the host end and/or the viewer end decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
  • Through the live broadcast interaction method of the present application, the person and the special effects can be rendered together during the live broadcast, which effectively makes co-streaming (Lianmai) interaction more entertaining, enriches the live content, and improves the interactivity of webcasts.
  • FIG. 1 is a schematic flowchart of a first embodiment of the live broadcast interaction method provided by the present application;
  • FIG. 2 is a schematic flowchart of the uplink logic of the host end provided by the present application;
  • FIG. 3 is a schematic diagram of the AI special-effect animation provided by the present application;
  • FIG. 4 is a schematic flowchart of a second embodiment of the live broadcast interaction method provided by the present application;
  • FIG. 5 is a schematic flowchart of a third embodiment of the live broadcast interaction method provided by the present application;
  • FIG. 6 is a schematic flowchart of a fourth embodiment of the live broadcast interaction method provided by the present application;
  • FIG. 7 is a schematic flowchart of the downlink logic of the host end provided by the present application;
  • FIG. 8 is a schematic flowchart of the processing logic of mixed-picture transcoding provided by the present application;
  • FIG. 9 is a schematic flowchart of the downlink logic of the viewer end provided by the present application;
  • FIG. 10 is a schematic structural diagram of an embodiment of the live broadcast system provided by the present application;
  • FIG. 11 is a schematic flowchart of a fifth embodiment of the live broadcast interaction method provided by the present application;
  • FIG. 12 is a schematic structural diagram of an embodiment of the electronic device provided by the present application;
  • FIG. 13 is a schematic structural diagram of an embodiment of the computer storage medium provided by the present application.
  • The live broadcast system applied in this embodiment includes at least a host end, a viewer end, and a server.
  • During live interaction, the host end and the viewer end each establish a communication connection with the server, so that the host end can interact through the server and the viewer end can watch the host's live content through the server.
  • The electronic device corresponding to the host end can be, for example, a smartphone, tablet, laptop, desktop computer, or wearable device; the electronic device corresponding to the viewer end can likewise be a smartphone, tablet, laptop, desktop computer, or wearable device.
  • The device types of multiple viewer ends may be the same as or different from the device type of the host end.
  • Both the host end and the viewer end can establish a wireless connection such as WIFI, Bluetooth, or ZigBee with the server.
  • FIG. 1 is a schematic flowchart of the first embodiment of the live interaction method provided by the present application.
  • the live broadcast interaction method of this embodiment can be applied to the above live broadcast system, and the specific structure of the live broadcast system will not be repeated here.
  • the live interaction method of this embodiment specifically includes the following steps:
  • S101: The host end collects contour information and live video, encodes the contour information into the network abstraction layer of the video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server.
  • Specifically, the host end uploads the AI data, i.e., the contour information, together with the live video to the server through the video bitstream. The specific flow is described with reference to FIG. 1 and FIG. 2, where FIG. 2 is a schematic flowchart of the uplink logic of the host end.
  • The contour information collected by the host end can be the human body contour information of the host, or contour information of another preset target.
  • For example, the preset target contour may be the contour of an object that often appears in a live video.
  • In the following description, the present application takes human body contour information as an example.
  • The host end performs video capture on the live video recorded by the camera to obtain the color data of the video, i.e., YUV data.
  • YUV is a color encoding method that is often used in video processing components. When encoding photos or videos, YUV takes human perception into account and allows the bandwidth of the chroma channels to be reduced.
  • YUV is a way of encoding a true-color color space, where "Y" represents luminance (luma) and "U" and "V" represent the two chrominance (chroma) components.
  • After the host end obtains the color data of the video, it performs AI processing to obtain the human body contour information in the live video, where the human body contour includes at least the facial contour and the limb contour.
  • The host end uses a video compression standard such as H.264/H.265 to encode the human contour information into the network abstraction layer (NAL) of the video bitstream. Specifically, the host end compresses and encodes the human contour information into SEI messages in the NAL of the bitstream.
  • SEI, i.e., Supplemental Enhancement Information, belongs to the bitstream and provides a way to add extra information to the video bitstream. The basic characteristics of SEI include: 1. it is not a mandatory part of the decoding process; 2. it may be helpful to the decoding process (error tolerance, error correction); 3. it is embedded in the video bitstream.
  • In this embodiment, the host end encodes the body contour information into SEI, so that the body contour information can be transmitted to the server together with the live video through the video bitstream, i.e., to the host network in FIG. 2.
  • Further, when the host end has not updated its application version in time, or its device performance does not meet the requirements for displaying AI special effects, the host end informs the server and the corresponding viewer ends promptly. For example, when the host starts broadcasting, the host end tests whether the device performance can support displaying AI special effects; if so, while collecting human contour information it actively reports to the server that this host end currently supports AI special-effect gifts. If the server does not receive the host end's AI special-effect report, it considers that this host end does not support AI special effects.
  • If an abnormal situation occurs during the live broadcast, for example a viewer gives an AI special-effect gift but the host end's application version or terminal device performance does not support it, a corresponding prompt is sent to the viewer: a default special-effect animation can be played in this case, but that animation is not combined with the host's face or body contour.
  • S102: The server sends the encoded video bitstream to the viewer end.
  • Here, the SEI information of the encoded video bitstream carries the host's body contour information.
  • S103: The host end and/or the viewer end further obtain the trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction.
  • During live interaction, the server generates a corresponding trigger instruction when a gift is given or when a human action is recognized, to instruct the host end and the viewer end to download the corresponding special-effect information based on that instruction. There are two main ways to generate a trigger instruction:
  • (1) When the server obtains gift information sent by a viewer end, it judges whether the gift is an ordinary gift or an AI special-effect gift. When the viewer end sends AI special-effect gift information, the server generates a trigger instruction based on that gift information.
  • (2) The server presets a variety of action instructions. When it receives the encoded video bitstream from the host end, the server recognizes the host's actions in the live video, such as gestures. When the host performs an action preset by the server in the live video, the server generates the corresponding trigger instruction. For example, when the server recognizes that the host makes a preset gesture, it generates a trigger instruction that makes an angel fly three circles around the host's head and then kiss the host's face.
  • Further, since many AI special-effect gifts are shown repeatedly during a live broadcast, the host end and/or the viewer end can cache the corresponding special-effect information locally on first download, for use the next time the same AI special-effect gift is triggered. Therefore, when the host end and/or the viewer end receive a trigger instruction, they first search the local cache for special-effect information corresponding to that instruction. If it exists, they extract the special-effect information directly from the cache; if not, they send request information to the server based on the trigger instruction, so that the server returns the corresponding special-effect information.
  • Further, when the host end and/or the viewer end receive multiple trigger instructions for AI special-effect gifts within a relatively short period, they place the trigger instructions in a queue in order of arrival, and then play the corresponding AI special-effect gifts in chronological order.
  • S104: The host end and/or the viewer end decode the human body contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information, to display the corresponding live interface.
  • When the host end and/or the viewer end receive the trigger instruction from the server, they decode the SEI information from the network abstraction layer of the encoded video bitstream, thereby obtaining the human body contour information carried in the SEI.
  • The host end and/or the viewer end input the decoded human contour information into the animation renderer for rendering. According to the gift type, the animation renderer obtains the animation playback resource of that gift type, i.e., the special-effect information from S103, and then renders and draws the playback resource based on the human body contour information.
  • For example, if the animation resource flies three circles around the body and then its wings fall outside the video, the renderer uses the body contour information to draw the three circles around the human contour, and draws the falling wings outside the live video area.
  • Through the rendering of the animation renderer, the host end and/or the viewer end can render the special-effect information onto the live video based on the human contour information, and display the corresponding live interface.
  • For a specific illustration of the live interface, refer to FIG. 3, a schematic diagram of the AI special-effect animation provided by the present application.
  • The live interface includes the host's human body contour 11 and the special-effect animation 12.
  • The special-effect animation 12 is displayed around the human body contour 11, and can produce an occlusion effect, or a partial-transparency effect of the animation over the body.
  • For example, an airplane special effect flies one circle around the body and disappears when it flies behind the body; or the effect starts somewhere in the live video area and flies to a certain part of the body in the video area.
  • In this embodiment, the host end collects contour information and live video, encodes the contour information into the network abstraction layer of the video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server; the server sends the encoded video bitstream to the viewer end; the host end and/or the viewer end further obtain the trigger instruction generated by the server and obtain the corresponding special-effect information; the host end and/or the viewer end then decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
  • In S104 above, the human contour information comes from the host end's live video, so the host end can feed the decoded contour and special-effect information directly to the animation renderer. However, when the viewer end's video resolution differs from the host end's, the viewer end may not be able to render the special effects directly from the contour information. The present application therefore proposes another live broadcast interaction method; refer to FIG. 4, a schematic flowchart of the second embodiment of the live broadcast interaction method provided by the present application.
  • The live interaction method of this embodiment specifically includes the following steps:
  • S201: The viewer end obtains the video resolution of the host end based on the contour information.
  • On the one hand, the viewer end obtains its own video resolution; on the other hand, it obtains the host end's video resolution from the decoded human contour information or the live video.
  • S202: When the viewer end's video resolution differs from the host end's, the viewer end performs a proportional coordinate conversion of the contour information based on the host end's video resolution.
  • When the two resolutions are the same, the viewer end does not need to convert the body contour information; when they differ, the viewer end needs to convert the coordinate information of the human contour proportionally.
  • For example, the host end broadcasts on a terminal device with a video resolution of 1920*1680, so the coordinate system of the collected human contour information is defined at that resolution, while the viewer end watches at a video resolution of 1080*720. In this case, the viewer end needs to convert the coordinate system of the human contour information according to the ratio between the viewer end's and the host end's video resolutions, so that the live video rendered by the animation renderer from the contour and special-effect information can be displayed normally on the viewer end.
  • In this embodiment, for the case where the host end's and viewer end's video resolutions differ, the viewer end can proportionally convert the coordinate system of the human contour information according to the resolution relationship of the two clients, so that the live broadcast interaction method of the present application can adapt to different terminal devices.
  • FIG. 5 is a schematic flowchart of the third embodiment of the live interaction method provided by this application.
  • the live interaction method of this embodiment specifically includes the following steps:
  • S301: The host end determines the number of collection points for the contour information based on the service requirements and the transmission bandwidth requirements, and collects the contour information using that number of collection points.
  • The host end collects the host's body contour information in real time while broadcasting, and the number of collection points depends on the corresponding service and the transmission bandwidth requirements.
  • For example, if a full-body special effect is required, a relatively large number of collection points can represent the collected human contour, e.g., 256 collection points for the contour of the whole body.
  • If a facial special effect is required, relatively few collection points can represent the face contour, e.g., 68 points for the contour of the face.
  • S302: The host end judges whether the bandwidth required by the encoded video bitstream is greater than or equal to a preset bandwidth.
  • After collecting the human contour information, the host end compresses and encodes it into the video bitstream. As shown in FIG. 2, before transmitting the encoded video bitstream, the host end needs to check whether the content to be transmitted meets the requirements.
  • S303: The host end discards the human contour information.
  • The check can cover at least the following two aspects:
  • (1) The host end can judge whether the bandwidth required by the encoded video bitstream is greater than or equal to the uplink bandwidth; if so, to keep the live broadcast fluent, the host end discards the body contour information while the uplink bandwidth is insufficient.
  • (2) The host end can also judge whether the size of the body contour information is greater than a preset byte count; if so, it likewise discards the body contour information. For example, when the body contour information is larger than 400 bytes, the host end discards it and then transmits the video bitstream.
  • Further, when the host end discards all or part of the human contour information, it can adaptively reduce the number of collection points used the next time it collects contour information, based on the size of the discarded information, thereby reducing the size of subsequently transmitted contour information.
  • In the above embodiments, the live interaction method is applied to a single host end, i.e., single-player special-effect gameplay.
  • In other embodiments, the live interaction method of the present application can also be applied to multiple hosts, i.e., multiplayer special-effect gameplay.
  • FIG. 6 is a schematic flowchart of a fourth embodiment of a live interaction method provided by the present application.
  • The host end in the foregoing embodiments may include a first host end and a second host end.
  • the live interaction method of this embodiment specifically includes the following steps:
  • S401: The first host end collects first contour information and a first live video, encodes the first contour information into the network abstraction layer of a first video bitstream, encodes the first live video into the video coding layer of the first video bitstream, and uploads the encoded first video bitstream to the server.
  • S402: The second host end collects second contour information and a second live video, encodes the second contour information into the network abstraction layer of a second video bitstream, encodes the second live video into the video coding layer of the second video bitstream, and uploads the encoded second video bitstream to the server.
  • In S401 and S402, the first host end and the second host end each collect and encode body contour information.
  • The specific process is the same as S101 in the foregoing embodiment and is not repeated here.
  • S403: The server sends the encoded first video bitstream and the encoded second video bitstream to the viewer end, sends the encoded first video bitstream to the second host end, and sends the encoded second video bitstream to the first host end.
  • S404: The first host end, the second host end, and/or the viewer end further obtain the trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction.
  • S405: The first host end decodes the second contour information and the second live video from the encoded second video bitstream; the second host end decodes the first contour information and the first live video from the encoded first video bitstream; and the viewer end decodes the first contour information, the second contour information, the first live video, and the second live video from the encoded first and second video bitstreams.
  • Refer to FIG. 7, a schematic flowchart of the downlink logic of the host end provided by the present application. Taking the second host end decoding the first human contour information as an example: the host network, i.e., the server, transmits the encoded first video bitstream to the second host end.
  • The second host end strips the SEI information out of the encoded first video bitstream, thereby decoding the first human body contour information.
  • S406: The first host end, the second host end, and the viewer end mix the first live video and the second live video into an interactive video, and render the special-effect information onto the interactive video based on the first contour information and the second contour information.
  • This step is explained with reference to FIG. 8 and FIG. 9. After obtaining the first live video and the second live video, the host network mixes the two live videos to obtain the interactive video.
  • The interactive video includes the first human contour information, the second human contour information, and the mixed-picture layout of the first and second live videos.
  • Further, the host network can also transcode the interactive video and transmit it to a CDN (Content Delivery Network) to adapt to different network bandwidths, different terminal processing capabilities, and different user needs.
  • The transcoded interactive video includes the transcoding parameters.
  • Referring to the viewer-end downlink logic in FIG. 9, the CDN sends the transcoded interactive video to the viewer end, and the viewer end strips the SEI information from the transcoded interactive video, thereby decoding the first human contour information, the second human contour information, the mixed-picture layout, and the transcoding parameters.
  • To implement the live interaction method of the above embodiments, the present application provides a live broadcast system; refer to FIG. 10, a schematic structural diagram of an embodiment of the live broadcast system provided by the present application.
  • The live broadcast system 200 of this embodiment includes at least a host end 21, a viewer end 22, and a server 23.
  • The host end 21 and the viewer end 22 each establish a communication connection with the server 23.
  • The host end 21 is used to collect contour information and live video, encode the contour information into the network abstraction layer of the video bitstream, encode the live video into the video coding layer of the video bitstream, and upload the encoded video bitstream to the server 23.
  • The server 23 is configured to send the encoded video bitstream to the viewer end 22.
  • The host end 21 and/or the viewer end 22 are used to further obtain the trigger instruction generated by the server 23, and obtain corresponding special-effect information based on the trigger instruction.
  • The host end 21 and/or the viewer end 22 are also used to decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
  • To solve the above technical problem, the present application further provides another live broadcast interaction method; refer to FIG. 11, a schematic flowchart of the fifth embodiment of the live broadcast interaction method provided by the present application.
  • The live broadcast interaction method of this embodiment is applied to an electronic device, which may specifically be the host end 21 in the live broadcast system 200 described above and is not repeated here.
  • The live interaction method of this embodiment specifically includes the following steps:
  • S501: Collect contour information and live video, encode the contour information into the network abstraction layer of the video bitstream, encode the live video into the video coding layer of the video bitstream, and upload the encoded video bitstream to the server, so that the server sends the encoded video bitstream to the viewer end.
  • S502: Further obtain a trigger instruction, and obtain corresponding special-effect information based on the trigger instruction.
  • S503: Decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
  • FIG. 12 is a schematic structural diagram of an embodiment of the electronic device provided in this application.
  • the electronic device 300 of this embodiment includes a memory 31 and a processor 32, where the memory 31 is coupled to the processor 32.
  • the memory 31 is used to store program data
  • the processor 32 is used to execute the program data to implement the live interaction method of the foregoing embodiment.
  • The processor 32 may also be referred to as a CPU (Central Processing Unit).
  • The processor 32 may be an integrated circuit chip with signal processing capabilities.
  • The processor 32 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The general-purpose processor may be a microprocessor, or the processor 32 may be any conventional processor or the like.
  • FIG. 13 is a schematic structural diagram of an embodiment of the computer storage medium provided by the present application.
  • The computer storage medium 400 stores program data 41; when executed by a processor, the program data 41 implements the live interaction method of the foregoing embodiments.
  • When the embodiments of the present application are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Abstract

A live broadcast interaction method, a live broadcast system, an electronic device, and a storage medium. The live broadcast interaction method is applied to a live broadcast system that includes a host end, a viewer end, and a server, and includes: the host end collects contour information and live video, encodes the contour information into the network abstraction layer of a video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server; the server sends the encoded video bitstream to the viewer end; the host end and/or the viewer end further obtain a trigger instruction generated by the server and obtain corresponding special-effect information based on the trigger instruction; the host end and/or the viewer end decode the human body contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information. The live broadcast interaction method of the present application makes co-streaming interaction more entertaining and the live content richer, thereby improving interactivity.

Description

Live broadcast interaction method, live broadcast system, electronic device and storage medium
This application claims priority to Chinese Patent Application No. 201910865638.4, filed with the Chinese Patent Office on September 12, 2019 and entitled "Live broadcast interaction method, live broadcast system, electronic device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of live broadcast technology, and in particular to a live broadcast interaction method, a live broadcast system, an electronic device, and a storage medium.
Background
With the development of network communication, more and more users choose to make friends and entertain themselves through network platforms, including one-to-one chat and dating platforms, host chat room platforms, forum-based friend-making platforms, and the like. Among these, one-to-one chat platforms and host chat room platforms are more popular with users because they support real-time video communication.
To increase user stickiness, live video platforms often provide a rich set of gifts that can be given, thereby increasing interaction between users during live video. However, existing gifts are displayed on the public screen and then disappear, unrelated to the live video picture, so gifts given during a live broadcast have a monotonous presentation effect and a short presentation time.
Summary
The present application provides a live broadcast interaction method, a live broadcast system, an electronic device, and a storage medium, to solve the problem in the prior art that the live broadcast interaction mode is monotonous.
To solve the above technical problem, one technical solution adopted in this application is to provide a live broadcast interaction method applied to a live broadcast system, where the live broadcast system includes a host end, a viewer end, and a server;
the live broadcast interaction method includes:
the host end collects contour information and live video, encodes the contour information into the network abstraction layer of a video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server;
the server sends the encoded video bitstream to the viewer end;
the host end and/or the viewer end further obtain a trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction;
the host end and/or the viewer end decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
To solve the above technical problem, another technical solution adopted in this application is to provide a live broadcast system that includes at least a host end, a viewer end, and a server;
the host end is configured to collect contour information and live video, encode the contour information into the network abstraction layer of a video bitstream, encode the live video into the video coding layer of the video bitstream, and upload the encoded video bitstream to the server;
the server is configured to send the encoded video bitstream to the viewer end;
the host end and/or the viewer end are configured to further obtain the trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction;
the host end and/or the viewer end are further configured to decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
To solve the above technical problem, another technical solution adopted in this application is to provide another live broadcast interaction method applied to an electronic device, the method including:
collecting contour information and live video, encoding the contour information into the network abstraction layer of a video bitstream, encoding the live video into the video coding layer of the video bitstream, and uploading the encoded video bitstream to the server, so that the server sends the encoded video bitstream to the viewer end;
further obtaining a trigger instruction, and obtaining corresponding special-effect information based on the trigger instruction;
decoding the contour information and the live video from the encoded video bitstream, and rendering the special-effect information onto the live video based on the contour information.
To solve the above technical problem, another technical solution adopted in this application is to provide an electronic device including a memory and a processor coupled to the memory;
the memory is used to store program data, and the processor is used to execute the program data to implement the live broadcast interaction method described above.
To solve the above technical problem, another technical solution adopted in this application is to provide a computer storage medium in which a computer program is stored; when executed, the computer program implements the steps of the live broadcast interaction method described above.
Different from the prior art, the beneficial effects of the present application are: the host end collects contour information and live video, encodes the contour information into the network abstraction layer of the video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server; the server sends the encoded video bitstream to the viewer end; the host end and/or the viewer end further obtain the trigger instruction generated by the server and obtain the corresponding special-effect information based on it; the host end and/or the viewer end decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information. Through the live broadcast interaction method of the present application, the person and the special effects can be rendered together during the live broadcast, which effectively makes co-streaming interaction more entertaining, enriches the live content, and improves the interactivity of webcasts.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.
FIG. 1 is a schematic flowchart of a first embodiment of the live broadcast interaction method provided by the present application;
FIG. 2 is a schematic flowchart of the uplink logic of the host end provided by the present application;
FIG. 3 is a schematic diagram of the AI special-effect animation provided by the present application;
FIG. 4 is a schematic flowchart of a second embodiment of the live broadcast interaction method provided by the present application;
FIG. 5 is a schematic flowchart of a third embodiment of the live broadcast interaction method provided by the present application;
FIG. 6 is a schematic flowchart of a fourth embodiment of the live broadcast interaction method provided by the present application;
FIG. 7 is a schematic flowchart of the downlink logic of the host end provided by the present application;
FIG. 8 is a schematic flowchart of the processing logic of mixed-picture transcoding provided by the present application;
FIG. 9 is a schematic flowchart of the downlink logic of the viewer end provided by the present application;
FIG. 10 is a schematic structural diagram of an embodiment of the live broadcast system provided by the present application;
FIG. 11 is a schematic flowchart of a fifth embodiment of the live broadcast interaction method provided by the present application;
FIG. 12 is a schematic structural diagram of an embodiment of the electronic device provided by the present application;
FIG. 13 is a schematic structural diagram of an embodiment of the computer storage medium provided by the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.
The present application first proposes a live broadcast interaction method that can be applied to a live broadcast system. The live broadcast system applied in this embodiment includes at least a host end, a viewer end, and a server.
During live interaction, the host end and the viewer end each establish a communication connection with the server, so that the host end can interact through the server and the viewer end can watch the host's live content through the server.
The electronic device corresponding to the host end can be, for example, a smartphone, tablet, laptop, desktop computer, or wearable device; the electronic device corresponding to the viewer end can likewise be a smartphone, tablet, laptop, desktop computer, or wearable device.
The device types of multiple viewer ends may be the same as or different from the device type of the host end.
Both the host end and the viewer end can establish a wireless connection such as WIFI, Bluetooth, or ZigBee with the server.
Please continue to refer to FIG. 1, a schematic flowchart of the first embodiment of the live broadcast interaction method provided by the present application. The live broadcast interaction method of this embodiment can be applied to the above live broadcast system, whose specific structure is not repeated here.
Specifically, the live broadcast interaction method of this embodiment includes the following steps:
S101: The host end collects contour information and live video, encodes the contour information into the network abstraction layer of the video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server.
The host end uploads the AI data, i.e., the contour information, together with the live video to the server through the video bitstream. The specific flow is described with reference to FIG. 1 and FIG. 2, where FIG. 2 is a schematic flowchart of the uplink logic of the host end provided by the present application.
Specifically, the contour information collected by the host end can be the human body contour information of the host, or contour information of another preset target; for example, the preset target contour may be the contour of an object that often appears in a live video. In the description of the following embodiments, the present application takes human body contour information as an example.
Specifically, the host end performs video capture on the live video recorded by the camera to obtain the color data of the video, i.e., YUV data. YUV is a color encoding method that is often used in video processing components. When encoding photos or videos, YUV takes human perception into account and allows the bandwidth of the chroma channels to be reduced. YUV is a way of encoding a true-color color space, where "Y" represents luminance (luma) and "U" and "V" represent the two chrominance (chroma) components.
After the host end obtains the color data of the video, it performs AI processing to obtain the human body contour information in the live video, where the human body contour includes at least the facial contour and the limb contour. The host end uses a video compression standard such as H.264/H.265 to encode the human contour information into the network abstraction layer (NAL) of the video bitstream; specifically, the host end compresses and encodes the human contour information into the SEI in the NAL of the bitstream. SEI, i.e., Supplemental Enhancement Information, belongs to the bitstream and provides a way to add extra information to the video bitstream. The basic characteristics of SEI include: 1. it is not a mandatory part of the decoding process; 2. it may be helpful to the decoding process (error tolerance, error correction); 3. it is embedded in the video bitstream.
In this embodiment, the host end encodes the human contour information into SEI, so that the contour information can be transmitted to the server, i.e., the host network in FIG. 2, together with the live video through the video bitstream.
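The patent does not fix a byte layout for the contour data inside SEI, but a minimal sketch of the idea follows: the contour points are serialized and wrapped in an H.264 user_data_unregistered SEI message (payload type 5). The 16-byte UUID and the field layout here are purely illustrative assumptions.

```python
import struct

# Hypothetical app-defined UUID identifying the contour payload; the patent
# only says the contour data goes into SEI, so UUID and layout are assumptions.
CONTOUR_UUID = bytes.fromhex("9a21f3b47c5e4d08a1b2c3d4e5f60718")

def _emulation_prevent(raw: bytes) -> bytes:
    """Insert 0x03 bytes so the payload cannot imitate a NAL start code."""
    out, zeros = bytearray(), 0
    for b in raw:
        if zeros >= 2 and b <= 3:
            out.append(0x03)
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)

def contour_sei_nal(points, width, height) -> bytes:
    """Pack (x, y) contour points into an H.264 SEI NAL unit
    (nal_unit_type 6, payloadType 5 = user_data_unregistered)."""
    body = struct.pack(">HHH", width, height, len(points))
    for x, y in points:
        body += struct.pack(">HH", x, y)          # 4 bytes per point
    payload = CONTOUR_UUID + body
    sei = bytearray([0x05])                       # payloadType = 5
    size = len(payload)
    while size >= 255:                            # size coded in 0xFF runs
        sei.append(0xFF)
        size -= 255
    sei.append(size)
    sei += payload
    sei.append(0x80)                              # rbsp_trailing_bits
    return b"\x00\x00\x00\x01\x06" + _emulation_prevent(bytes(sei))
```

A real encoder would emit such a NAL unit alongside the coded video frames of the same time sequence, which is what lets the contour travel in-band with the live video.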
Further, when the host end has not updated its application version in time, or its device performance does not meet the requirements for displaying AI special effects, the host end informs the server and the corresponding viewer ends promptly. For example, when the host starts broadcasting, the host end tests whether the device performance can support displaying AI special effects; if so, while collecting human contour information it actively reports to the server that this host end currently supports AI special-effect gifts. If the server does not receive the host end's AI special-effect report, it considers that this host end does not support AI special effects.
The significance of this report includes:
(1) When the host end broadcasts on an old application version that does not support AI special-effect gifts, a viewer end on a new version that does support them receives a corresponding prompt when giving a gift, reminding the viewer that even if an AI special-effect gift is given to the host, the host end cannot display it.
(2) When the performance of the host end's terminal device is poor and does not support real-time collection of human contour information, a corresponding feedback prompt is likewise given when a viewer end gives an AI special-effect gift.
If an abnormal situation occurs during the live broadcast, for example a viewer gives an AI special-effect gift but the host end's application version or terminal device performance does not support it, a corresponding prompt is sent to the viewer: a default special-effect animation can be played in this case, but it is not combined with the host's face or body contour.
S102: The server sends the encoded video bitstream to the viewer end.
Here, the SEI information of the encoded video bitstream carries the host's human body contour information.
S103: The host end and/or the viewer end further obtain the trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction.
During live interaction, the server generates a corresponding trigger instruction when a gift is given or when a human action is recognized, to instruct the host end and the viewer end to download the corresponding special-effect information based on that instruction.
There are two main ways to generate a trigger instruction:
(1) When the server obtains gift information sent by a viewer end, it judges whether the gift information is ordinary gift information or AI special-effect gift information. When the viewer end sends AI special-effect gift information, the server generates a trigger instruction based on that gift information.
(2) The server presets a variety of action instructions. When it receives the encoded video bitstream from the host end, the server recognizes the host's actions in the live video, such as gestures. When the host performs an action preset by the server in the live video, the server generates the corresponding trigger instruction. For example, when the server recognizes that the host makes a preset gesture, it generates a trigger instruction that makes an angel fly three circles around the host's head and then kiss the host's face.
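As a rough illustration of the two trigger paths just described, the sketch below maps gift types and recognized gestures to trigger instructions. The gift names, gesture names, and effect identifiers are hypothetical, since the patent leaves them to the implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TriggerInstruction:
    effect_id: str      # which special-effect resource the clients should fetch
    source: str         # "gift" or "gesture"

# Hypothetical lookup tables; real deployments would configure these.
AI_EFFECT_GIFTS = {"angel": "fx_angel", "plane": "fx_plane"}
PRESET_GESTURES = {"finger_heart": "fx_angel_kiss"}

def on_gift(gift_type: str) -> Optional[TriggerInstruction]:
    """Way (1): ordinary gifts keep the old public-screen path; AI
    special-effect gifts generate a trigger instruction."""
    effect = AI_EFFECT_GIFTS.get(gift_type)
    return TriggerInstruction(effect, "gift") if effect else None

def on_recognized_action(action: str) -> Optional[TriggerInstruction]:
    """Way (2): server-side action recognition on the host's video."""
    effect = PRESET_GESTURES.get(action)
    return TriggerInstruction(effect, "gesture") if effect else None
```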
Further, since many AI special-effect gifts are shown repeatedly during a live broadcast, the host end and/or the viewer end can cache the corresponding special-effect information locally on first download, for use the next time the same AI special-effect gift is triggered. Therefore, when the host end and/or the viewer end receive a trigger instruction, they first search the local cache for special-effect information corresponding to that instruction. If it exists, they extract the special-effect information directly from the cache; if not, they send request information to the server based on the trigger instruction, so that the server returns the corresponding special-effect information.
Further, when the host end and/or the viewer end receive multiple trigger instructions for AI special-effect gifts within a relatively short period, they place the trigger instructions in a queue in order of arrival, and then play the corresponding AI special-effect gifts in chronological order.
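A minimal client-side sketch of the cache-then-request flow and the arrival-order play queue described in the two paragraphs above; the cache directory layout and the download callback are illustrative assumptions.

```python
import collections
import os

class EffectClient:
    """Check the local cache first, request from the server only on a miss,
    and queue triggers that arrive close together so they play in order."""

    def __init__(self, cache_dir="fx_cache"):
        self.cache_dir = cache_dir
        self.play_queue = collections.deque()     # FIFO: play in time order

    def _cache_path(self, effect_id):
        return os.path.join(self.cache_dir, effect_id + ".anim")

    def fetch_effect(self, effect_id, download):
        path = self._cache_path(effect_id)
        if not os.path.exists(path):              # miss: ask the server
            data = download(effect_id)            # e.g. an HTTP GET, not shown
            os.makedirs(self.cache_dir, exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
        return path

    def on_trigger(self, effect_id, download):
        self.play_queue.append(self.fetch_effect(effect_id, download))

    def next_to_play(self):
        return self.play_queue.popleft() if self.play_queue else None
```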
S104: The host end and/or the viewer end decode the human body contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information, to display the corresponding live interface.
When the host end and/or the viewer end receive the trigger instruction from the server, they decode the SEI information from the network abstraction layer of the encoded video bitstream, thereby obtaining the human contour information in the SEI. The host end and/or the viewer end input the decoded human contour information into the animation renderer for rendering; according to the gift type, the animation renderer obtains the animation playback resource of that gift type, i.e., the special-effect information from S103, and then renders and draws the playback resource based on the human contour information.
For example, if the animation playback resource flies three circles around the body and then its wings fall outside the video, the renderer uses the body contour information to draw the three circles around the human contour, and draws the picture of the falling wings outside the live video area.
Through the rendering of the animation renderer, the host end and/or the viewer end can render the special-effect information onto the live video based on the human contour information and display the corresponding live interface. For a specific illustration of the live interface, refer to FIG. 3, a schematic diagram of the AI special-effect animation provided by the present application. The live interface includes the host's human body contour 11 and the special-effect animation 12. The special-effect animation 12 is displayed around the human body contour 11 and can produce an occlusion effect, or a partial-transparency effect of the animation over the body. For example, an airplane special effect flies one circle around the body and disappears when it flies behind the body; or the effect starts somewhere in the live video area and flies to a certain part of the body in the video area.
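The occlusion behavior described above, where the effect disappears when it passes behind the body, can be sketched as alpha compositing with a body mask rasterized from the decoded contour. This is one assumed renderer design, not the patent's prescribed implementation.

```python
import numpy as np

def composite_effect(frame, effect_rgba, body_mask, behind=True):
    """Blend one RGBA effect frame onto a video frame.

    frame:       HxWx3 uint8 video frame.
    effect_rgba: HxWx4 uint8 effect frame (alpha in channel 3).
    body_mask:   HxW float array in [0, 1], rasterized from the contour points.
    behind:      when True, the mask suppresses the effect over the body,
                 so the person occludes the animation.
    """
    alpha = effect_rgba[..., 3:4].astype(np.float32) / 255.0
    if behind:
        alpha = alpha * (1.0 - body_mask[..., None])   # hide where the body is
    out = frame.astype(np.float32) * (1.0 - alpha) \
        + effect_rgba[..., :3].astype(np.float32) * alpha
    return out.astype(np.uint8)
```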
In this embodiment, the host end collects contour information and live video, encodes the contour information into the network abstraction layer of the video bitstream, encodes the live video into the video coding layer of the video bitstream, and uploads the encoded video bitstream to the server; the server sends the encoded video bitstream to the viewer end; the host end and/or the viewer end further obtain the trigger instruction generated by the server and obtain the corresponding special-effect information based on it; the host end and/or the viewer end decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information. Through the live broadcast interaction method of the present application, the person and the special effects can be rendered together during the live broadcast, which effectively makes co-streaming interaction more entertaining, enriches the live content, and improves the interactivity of webcasts.
In S104 above, the human contour information comes from the host end's live video, so after the host end obtains the contour information from the SEI of the encoded bitstream, it can directly use the animation renderer to render the contour and special-effect information onto the live video. In other embodiments, however, after the viewer end obtains the human contour information from the SEI of the encoded live video, if the viewer end's video resolution differs from the host end's, the viewer end may not be able to render the special-effect animation directly from the contour information. The present application therefore proposes another live broadcast interaction method; refer to FIG. 4, a schematic flowchart of the second embodiment of the live broadcast interaction method provided by the present application.
As shown in FIG. 4, the live interaction method of this embodiment specifically includes the following steps:
S201: The viewer end obtains the video resolution of the host end based on the contour information.
On the one hand, the viewer end obtains its own video resolution; on the other hand, it obtains the host end's video resolution from the decoded human contour information or the live video.
S202: When the viewer end's video resolution differs from the host end's, the viewer end performs a proportional coordinate conversion of the contour information based on the host end's video resolution.
When the viewer end's video resolution is the same as the host end's, the viewer end does not need to convert the human contour information. When they differ, the viewer end needs to convert the coordinate information of the human contour proportionally.
For example, the host end broadcasts on a terminal device with a video resolution of 1920*1680, so the coordinate system of the collected human contour information is defined at that resolution, while the viewer end watches on a terminal device with a video resolution of 1080*720. In this case, the viewer end needs to convert the coordinate system of the human contour information according to the ratio between the viewer end's and the host end's video resolutions, so that the live video rendered by the animation renderer from the contour and special-effect information can be displayed normally on the viewer end.
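The proportional conversion amounts to scaling each contour point by the ratio of the two resolutions; a minimal sketch:

```python
def rescale_contour(points, src_res, dst_res):
    """Proportionally convert contour coordinates from the host's
    resolution (e.g. (1920, 1680)) to the viewer's (e.g. (1080, 720))."""
    sx = dst_res[0] / src_res[0]
    sy = dst_res[1] / src_res[1]
    return [(round(x * sx), round(y * sy)) for (x, y) in points]

# Usage matching the example above:
# viewer_points = rescale_contour(host_points, (1920, 1680), (1080, 720))
```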
In this embodiment, for the case where the host end's and the viewer end's video resolutions differ, the viewer end can proportionally convert the coordinate system of the human contour information according to the resolution relationship of the two clients, so that the live broadcast interaction method of the present application can adapt to different terminal devices.
For S101 of the above embodiment, the present application proposes another specific live broadcast interaction method; refer to FIG. 5, a schematic flowchart of the third embodiment of the live broadcast interaction method provided by the present application.
As shown in FIG. 5, the live interaction method of this embodiment specifically includes the following steps:
S301: The host end determines the number of collection points for the contour information based on the service requirements and the transmission bandwidth requirements, and collects the contour information using that number of collection points.
The host end collects the host's human contour information in real time while broadcasting, and the number of collection points depends on the corresponding service and the transmission bandwidth requirements.
For example, if a full-body special effect is required, a relatively large number of collection points can represent the collected human contour, e.g., 256 collection points for the contour of the whole body. If a facial special effect is required, relatively few collection points can represent the contour information of the face, e.g., 68 points.
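One way to vary the number of collection points, e.g. 256 for the whole body versus 68 for the face, is to resample the captured contour evenly by arc length. The patent does not specify the sampling scheme, so the sketch below is an assumed approach.

```python
import numpy as np

def resample_contour(points, n):
    """Resample a closed contour to n points spaced evenly by arc length,
    so the same outline can be sent with 256 points (full body) or
    68 points (face) depending on the service and bandwidth budget."""
    pts = np.asarray(points, dtype=np.float64)
    closed = np.vstack([pts, pts[:1]])                 # close the loop
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])      # cumulative length
    targets = np.linspace(0.0, cum[-1], n, endpoint=False)
    xs = np.interp(targets, cum, closed[:, 0])
    ys = np.interp(targets, cum, closed[:, 1])
    return np.stack([xs, ys], axis=1)
```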
S302: The host end judges whether the bandwidth required by the encoded video bitstream is greater than or equal to a preset bandwidth.
After collecting the human contour information, the host end compresses and encodes it into the video bitstream. As shown in FIG. 2, before transmitting the encoded video bitstream, the host end needs to check whether the content to be transmitted meets the requirements.
S303: The host end discards the human contour information.
The check can cover at least the following two aspects:
(1) The host end can judge whether the bandwidth required by the encoded video bitstream is greater than or equal to the uplink bandwidth; if so, to keep the live broadcast fluent, the host end discards the human contour information while the uplink bandwidth is insufficient.
(2) The host end can also judge whether the size of the human contour information is greater than a preset byte count; if so, to keep the live broadcast fluent, the host end likewise discards the human contour information. For example, when the human contour information is larger than 400 bytes, the host end discards it and then transmits the video bitstream.
Further, when the host end discards all or part of the human contour information, it can adaptively reduce the number of collection points used the next time it collects contour information, based on the size of the discarded information, thereby reducing the size of subsequently transmitted human contour information.
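The two checks from FIG. 2 and the adaptive point reduction can be sketched as follows; the 4-bytes-per-point estimate matches the illustrative SEI layout sketched earlier and is an assumption.

```python
MAX_CONTOUR_BYTES = 400   # the preset byte limit from the example above

def should_drop_contour(stream_bps, uplink_bps, contour_bytes,
                        max_bytes=MAX_CONTOUR_BYTES):
    """Drop the contour payload when the encoded stream needs more than the
    available uplink bandwidth, or when the contour data itself exceeds
    the preset byte budget."""
    return stream_bps >= uplink_bps or contour_bytes > max_bytes

def next_point_budget(current_points, dropped_bytes, bytes_per_point=4):
    """Adaptively shrink the collection-point count for the next time
    sequence by roughly the amount that had to be discarded."""
    return max(1, current_points - dropped_bytes // bytes_per_point)
```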
In the above embodiments, the live interaction method is applied to a single host end, i.e., single-player special-effect gameplay. In other embodiments, the live interaction method of the present application can also be applied to multiple hosts, i.e., multiplayer special-effect gameplay.
Refer to FIG. 6, a schematic flowchart of the fourth embodiment of the live broadcast interaction method provided by the present application. The host end in the above embodiments may include a first host end and a second host end.
As shown in FIG. 6, the live interaction method of this embodiment specifically includes the following steps:
S401: The first host end collects first contour information and a first live video, encodes the first contour information into the network abstraction layer of a first video bitstream, encodes the first live video into the video coding layer of the first video bitstream, and uploads the encoded first video bitstream to the server.
S402: The second host end collects second contour information and a second live video, encodes the second contour information into the network abstraction layer of a second video bitstream, encodes the second live video into the video coding layer of the second video bitstream, and uploads the encoded second video bitstream to the server.
In S401 and S402, the first host end and the second host end each collect and encode human contour information; the specific process is the same as S101 in the above embodiment and is not repeated here.
S403: The server sends the encoded first video bitstream and the encoded second video bitstream to the viewer end, sends the encoded first video bitstream to the second host end, and sends the encoded second video bitstream to the first host end.
S404: The first host end, the second host end, and/or the viewer end further obtain the trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction.
S405: The first host end decodes the second contour information and the second live video from the encoded second video bitstream; the second host end decodes the first contour information and the first live video from the encoded first video bitstream; and the viewer end decodes the first contour information, the second contour information, the first live video, and the second live video from the encoded first video bitstream and the encoded second video bitstream.
Refer to FIG. 7, a schematic flowchart of the downlink logic of the host end provided by the present application, taking as an example the second host end decoding the first human contour information from the encoded first video bitstream. Specifically, the host network, i.e., the server, transmits the encoded first video bitstream to the second host end. The second host end strips the SEI information out of the encoded first video bitstream, thereby decoding the first human contour information.
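On the receiving side, stripping the SEI out of the bitstream amounts to walking the Annex-B NAL units and extracting payloads from SEI NAL units (type 6 in H.264). A simplified sketch, which ignores some malformed-stream edge cases:

```python
def strip_emulation(rbsp: bytes) -> bytes:
    """Remove 0x000003 emulation-prevention bytes."""
    out, i = bytearray(), 0
    while i < len(rbsp):
        if rbsp[i] == 0x03 and i >= 2 and rbsp[i - 2:i] == b"\x00\x00":
            i += 1
            continue
        out.append(rbsp[i])
        i += 1
    return bytes(out)

def iter_sei_payloads(annexb: bytes):
    """Walk an Annex-B stream and yield (payload_type, payload) for every
    SEI NAL unit, which is where the contour data travels."""
    for nal in annexb.split(b"\x00\x00\x01"):
        if not nal or nal[0] & 0x1F != 6:          # keep only SEI NALs
            continue
        data, i = strip_emulation(nal[1:]), 0
        while i < len(data) and data[i] != 0x80:   # stop at trailing bits
            ptype = 0
            while data[i] == 0xFF:                 # type coded in 0xFF runs
                ptype += 255; i += 1
            ptype += data[i]; i += 1
            size = 0
            while data[i] == 0xFF:                 # size coded in 0xFF runs
                size += 255; i += 1
            size += data[i]; i += 1
            yield ptype, data[i:i + size]
            i += size
```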
S406: The first host end, the second host end, and the viewer end mix the first live video and the second live video to obtain an interactive video, and render the special-effect information onto the interactive video based on the first contour information and the second contour information.
This step is explained with reference to FIG. 8 and FIG. 9. After obtaining the first live video and the second live video, the host network mixes the two live videos to obtain the interactive video. The interactive video includes the first human contour information, the second human contour information, and the mixed-picture layout of the first and second live videos.
Further, the host network can also transcode the interactive video and transmit it to a CDN (Content Delivery Network) to adapt to different network bandwidths, different terminal processing capabilities, and different user needs; the transcoded interactive video includes the transcoding parameters.
Referring to the flowchart of the viewer-end downlink logic in FIG. 9, the CDN sends the transcoded interactive video to the viewer end, and the viewer end strips the SEI information from the transcoded interactive video, thereby decoding the first human contour information, the second human contour information, the mixed-picture layout, and the transcoding parameters.
To implement the live interaction method of the above embodiments, the present application provides a live broadcast system; refer to FIG. 10, a schematic structural diagram of an embodiment of the live broadcast system provided by the present application.
The live broadcast system 200 of this embodiment includes at least a host end 21, a viewer end 22, and a server 23, where the host end 21 and the viewer end 22 each establish a communication connection with the server 23.
The host end 21 is used to collect contour information and live video, encode the contour information into the network abstraction layer of the video bitstream, encode the live video into the video coding layer of the video bitstream, and upload the encoded video bitstream to the server 23.
The server 23 is used to send the encoded video bitstream to the viewer end 22.
The host end 21 and/or the viewer end 22 are used to further obtain the trigger instruction generated by the server 23, and obtain corresponding special-effect information based on the trigger instruction.
The host end 21 and/or the viewer end 22 are also used to decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
To solve the above technical problem, the present application further proposes another live broadcast interaction method; refer to FIG. 11, a schematic flowchart of the fifth embodiment of the live broadcast interaction method provided by the present application. The live broadcast interaction method of this embodiment is applied to an electronic device, which may specifically be the host end 21 in the live broadcast system 200 described above and is not repeated here.
As shown in FIG. 11, the live interaction method of this embodiment specifically includes the following steps:
S501: Collect contour information and live video, encode the contour information into the network abstraction layer of the video bitstream, encode the live video into the video coding layer of the video bitstream, and upload the encoded video bitstream to the server, so that the server sends the encoded video bitstream to the viewer end.
S502: Further obtain a trigger instruction, and obtain corresponding special-effect information based on the trigger instruction.
S503: Decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
To implement the live interaction method of the above embodiments, the present application provides an electronic device; refer to FIG. 12, a schematic structural diagram of an embodiment of the electronic device provided by the present application.
The electronic device 300 of this embodiment includes a memory 31 and a processor 32, where the memory 31 is coupled to the processor 32.
The memory 31 is used to store program data, and the processor 32 is used to execute the program data to implement the live interaction method of the above embodiments.
In this embodiment, the processor 32 may also be referred to as a CPU (Central Processing Unit). The processor 32 may be an integrated circuit chip with signal processing capabilities. The processor 32 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor 32 may be any conventional processor or the like.
The present application also provides a computer storage medium; please continue to refer to FIG. 13, a schematic structural diagram of an embodiment of the computer storage medium provided by the present application. The computer storage medium 400 stores program data 41, and when executed by a processor, the program data 41 implements the live interaction method of the above embodiments.
When the embodiments of the present application are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above is only an implementation of the present application and does not thereby limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (11)

  1. A live broadcast interaction method, wherein the live broadcast interaction method is applied to a live broadcast system, and the live broadcast system includes a host end, a viewer end, and a server;
    the live broadcast interaction method includes:
    the host end collecting contour information and live video, encoding the contour information into the network abstraction layer of a video bitstream, encoding the live video into the video coding layer of the video bitstream, and uploading the encoded video bitstream to the server;
    the server sending the encoded video bitstream to the viewer end;
    the host end and/or the viewer end further obtaining a trigger instruction generated by the server, and obtaining corresponding special-effect information based on the trigger instruction;
    the host end and/or the viewer end decoding the contour information and the live video from the encoded video bitstream, and rendering the special-effect information onto the live video based on the contour information.
  2. The live broadcast interaction method according to claim 1, wherein
    after the step of the host end and/or the viewer end decoding the contour information from the encoded video bitstream, the method includes:
    the viewer end obtaining the video resolution of the host end based on the contour information;
    when the video resolution of the viewer end differs from the video resolution of the host end, the viewer end performing a proportional coordinate conversion of the contour information based on the video resolution of the host end.
  3. The live broadcast interaction method according to claim 1, wherein
    the step of the host end collecting contour information includes:
    the host end determining the number of collection points for the contour information based on service requirements and transmission bandwidth requirements, and collecting the contour information based on that number of collection points.
  4. The live broadcast interaction method according to claim 3, wherein
    before the step of uploading the encoded video bitstream to the server, the method includes:
    the host end judging whether the bandwidth required by the encoded video bitstream is greater than or equal to a preset bandwidth;
    if so, the host end discarding the contour information;
    or, the host end judging whether the size of the contour information is greater than a preset byte count;
    if so, the host end discarding the contour information.
  5. The live broadcast interaction method according to claim 1, wherein
    the step of the host end and/or the viewer end further obtaining the trigger instruction generated by the server includes:
    when the server obtains special-effect gift information sent by the viewer end, or recognizes a preset action in the live video, the server generating the trigger instruction and sending the trigger instruction to the host end and the viewer end.
  6. The live broadcast interaction method according to claim 1, wherein
    the step of obtaining corresponding special-effect information based on the trigger instruction includes:
    if the host end and/or the viewer end have downloaded the special-effect information before, the host end and/or the viewer end retrieving the special-effect information directly from the local cache;
    if the host end and/or the viewer end have not downloaded the special-effect information, the host end and/or the viewer end sending request information to the server based on the trigger instruction, so that the server sends the special-effect information corresponding to the request information.
  7. The live broadcast interaction method according to claim 1, wherein
    the host end includes a first host end and a second host end;
    the live broadcast interaction method includes:
    the first host end collecting first contour information and a first live video, encoding the first contour information into the network abstraction layer of a first video bitstream, encoding the first live video into the video coding layer of the first video bitstream, and uploading the encoded first video bitstream to the server;
    the second host end collecting second contour information and a second live video, encoding the second contour information into the network abstraction layer of a second video bitstream, encoding the second live video into the video coding layer of the second video bitstream, and uploading the encoded second video bitstream to the server;
    the server sending the encoded first video bitstream and the encoded second video bitstream to the viewer end, sending the encoded first video bitstream to the second host end, and sending the encoded second video bitstream to the first host end;
    the first host end, the second host end, and/or the viewer end further obtaining the trigger instruction generated by the server, and obtaining corresponding special-effect information based on the trigger instruction;
    the first host end decoding the second contour information and the second live video from the encoded second video bitstream, the second host end decoding the first contour information and the first live video from the encoded first video bitstream, and the viewer end decoding the first contour information, the second contour information, the first live video, and the second live video from the encoded first video bitstream and the encoded second video bitstream;
    the first host end, the second host end, and the viewer end mixing the first live video and the second live video to obtain an interactive video, and rendering the special-effect information onto the interactive video based on the first contour information and the second contour information.
  8. A live broadcast system, wherein the live broadcast system includes at least a host end, a viewer end, and a server;
    the host end is configured to collect contour information and live video, encode the contour information into the network abstraction layer of a video bitstream, encode the live video into the video coding layer of the video bitstream, and upload the encoded video bitstream to the server;
    the server is configured to send the encoded video bitstream to the viewer end;
    the host end and/or the viewer end are configured to further obtain a trigger instruction generated by the server, and obtain corresponding special-effect information based on the trigger instruction;
    the host end and/or the viewer end are further configured to decode the contour information and the live video from the encoded video bitstream, and render the special-effect information onto the live video based on the contour information.
  9. A live broadcast interaction method, wherein the live broadcast interaction method is applied to an electronic device, and the live broadcast interaction method includes:
    collecting contour information and live video, encoding the contour information into the network abstraction layer of a video bitstream, encoding the live video into the video coding layer of the video bitstream, and uploading the encoded video bitstream to the server, so that the server sends the encoded video bitstream to the viewer end;
    further obtaining a trigger instruction, and obtaining corresponding special-effect information based on the trigger instruction;
    decoding the contour information and the live video from the encoded video bitstream, and rendering the special-effect information onto the live video based on the contour information.
  10. An electronic device, wherein the electronic device includes a memory and a processor coupled to the memory;
    the memory is used to store program data, and the processor is used to execute the program data to implement the live broadcast interaction method according to claim 9.
  11. A computer storage medium, wherein the computer storage medium is used to store program data, and when executed by a processor, the program data implements the live broadcast interaction method according to any one of claims 1 to 7 and claim 9.
PCT/CN2020/112793 2019-09-12 2020-09-01 Live broadcast interaction method, live broadcast system, electronic device and storage medium WO2021047419A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910865638.4A CN110557649B (zh) 2019-09-12 Live broadcast interaction method, live broadcast system, electronic device and storage medium
CN201910865638.4 2019-09-12

Publications (1)

Publication Number Publication Date
WO2021047419A1 true WO2021047419A1 (zh) 2021-03-18

Family

ID=68740284

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/112793 WO2021047419A1 (zh) 2019-09-12 2020-09-01 Live broadcast interaction method, live broadcast system, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN110557649B (zh)
WO (1) WO2021047419A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113395533A (zh) * 2021-05-24 2021-09-14 广州博冠信息科技有限公司 Virtual gift special effect display method and apparatus, computer device, and storage medium
CN113473168A (zh) * 2021-07-02 2021-10-01 北京达佳互联信息技术有限公司 Live streaming method and system, live streaming method executed by portable device, and portable device
CN113840177A (zh) * 2021-09-22 2021-12-24 广州博冠信息科技有限公司 Live interaction method and apparatus, storage medium, and electronic device
CN113923530A (zh) * 2021-10-18 2022-01-11 北京字节跳动网络技术有限公司 Interactive information display method and apparatus, electronic device, and storage medium
CN113949900A (zh) * 2021-10-08 2022-01-18 上海哔哩哔哩科技有限公司 Live streaming sticker processing method and system
CN114125501A (zh) * 2021-10-30 2022-03-01 杭州当虹科技股份有限公司 Interactive video generation method and playback method and apparatus therefor

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110536151B (zh) * 2019-09-11 2021-11-19 广州方硅信息技术有限公司 Method and apparatus for compositing virtual gift special effects, and live broadcast system
CN110557649B (zh) * 2019-09-12 2021-12-28 广州方硅信息技术有限公司 Live broadcast interaction method, live broadcast system, electronic device and storage medium
CN111464828A (zh) * 2020-05-14 2020-07-28 广州酷狗计算机科技有限公司 Virtual special effect display method, apparatus, terminal, and storage medium
CN112000252B (zh) * 2020-08-14 2022-07-22 广州市百果园信息技术有限公司 Method, apparatus, device, and storage medium for sending and displaying virtual items
CN112261428A (zh) * 2020-10-20 2021-01-22 北京字节跳动网络技术有限公司 Picture display method and apparatus, electronic device, and computer-readable medium
CN112929680B (zh) * 2021-01-19 2023-09-05 广州虎牙科技有限公司 Live room image rendering method and apparatus, computer device, and storage medium
CN113382275B (zh) * 2021-06-07 2023-03-07 广州博冠信息科技有限公司 Live data generation method and apparatus, storage medium, and electronic device
CN114025219A (zh) * 2021-11-01 2022-02-08 广州博冠信息科技有限公司 Rendering method, apparatus, medium, and device for augmented reality special effects
CN115174954A (zh) * 2022-08-03 2022-10-11 抖音视界有限公司 Video live streaming method and apparatus, electronic device, and storage medium
CN116896649B (zh) * 2023-09-11 2024-01-19 北京达佳互联信息技术有限公司 Live interaction method and apparatus, electronic device, and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106131591A (zh) * 2016-06-30 2016-11-16 广州华多网络科技有限公司 Live streaming method, apparatus, and terminal
CN106231434A (zh) * 2016-07-25 2016-12-14 武汉斗鱼网络科技有限公司 Method and system for implementing live interactive special effects based on face detection
CN107343220A (zh) * 2016-08-19 2017-11-10 北京市商汤科技开发有限公司 Data processing method, apparatus, and terminal device
CN109151489A (zh) * 2018-08-14 2019-01-04 广州虎牙信息科技有限公司 Live video image processing method, apparatus, storage medium, and computer device
US20190190970A1 (en) * 2017-12-18 2019-06-20 Facebook, Inc. Systems and methods for providing device-based feedback
CN110475150A (zh) * 2019-09-11 2019-11-19 广州华多网络科技有限公司 Method and apparatus for rendering virtual gift special effects, and live broadcast system
CN110493630A (zh) * 2019-09-11 2019-11-22 广州华多网络科技有限公司 Method and apparatus for processing virtual gift special effects, and live broadcast system
CN110536151A (zh) * 2019-09-11 2019-12-03 广州华多网络科技有限公司 Method and apparatus for compositing virtual gift special effects, and live broadcast system
CN110557649A (zh) * 2019-09-12 2019-12-10 广州华多网络科技有限公司 Live broadcast interaction method, live broadcast system, electronic device and storage medium
CN110784730A (zh) * 2019-10-31 2020-02-11 广州华多网络科技有限公司 Live video data transmission method, apparatus, device, and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101141608B (zh) * 2007-09-28 2011-05-11 腾讯科技(深圳)有限公司 Video instant messaging system and method
WO2013181756A1 (en) * 2012-06-08 2013-12-12 Jugnoo Inc. System and method for generating and disseminating digital video
CN103729610B (zh) * 2013-12-24 2017-01-11 北京握奇智能科技有限公司 QR code focus display method and system
CN104780339A (zh) * 2015-04-16 2015-07-15 美国掌赢信息科技有限公司 Method for loading expression special-effect animation in instant video, and electronic device
CN106331735B (zh) * 2016-08-18 2020-04-21 北京奇虎科技有限公司 Special effect processing method, electronic device, and server
US20180234708A1 (en) * 2017-02-10 2018-08-16 Seerslab, Inc. Live streaming image generating method and apparatus, live streaming service providing method and apparatus, and live streaming system
CN106804007A (zh) * 2017-03-20 2017-06-06 合网络技术(北京)有限公司 Method, system, and device for automatically matching special effects in webcasting
CN107682729A (zh) * 2017-09-08 2018-02-09 广州华多网络科技有限公司 Live-broadcast-based interaction method, live broadcast system, and electronic device
CN107995155A (zh) * 2017-10-11 2018-05-04 上海聚力传媒技术有限公司 Video data encoding, decoding, and display methods, video system, and storage medium
CN107888965B (zh) * 2017-11-29 2020-02-14 广州酷狗计算机科技有限公司 Image gift display method and apparatus, terminal, system, and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106131591A (zh) * 2016-06-30 2016-11-16 广州华多网络科技有限公司 Live streaming method, apparatus, and terminal
CN106231434A (zh) * 2016-07-25 2016-12-14 武汉斗鱼网络科技有限公司 Method and system for implementing live interactive special effects based on face detection
CN107343220A (zh) * 2016-08-19 2017-11-10 北京市商汤科技开发有限公司 Data processing method, apparatus, and terminal device
CN109151489A (zh) * 2018-08-14 2019-01-04 广州虎牙信息科技有限公司 Live video image processing method, apparatus, storage medium, and computer device
US20190190970A1 (en) * 2017-12-18 2019-06-20 Facebook, Inc. Systems and methods for providing device-based feedback
CN110475150A (zh) * 2019-09-11 2019-11-19 广州华多网络科技有限公司 Method and apparatus for rendering virtual gift special effects, and live broadcast system
CN110493630A (zh) * 2019-09-11 2019-11-22 广州华多网络科技有限公司 Method and apparatus for processing virtual gift special effects, and live broadcast system
CN110536151A (zh) * 2019-09-11 2019-12-03 广州华多网络科技有限公司 Method and apparatus for compositing virtual gift special effects, and live broadcast system
CN110557649A (zh) * 2019-09-12 2019-12-10 广州华多网络科技有限公司 Live broadcast interaction method, live broadcast system, electronic device and storage medium
CN110784730A (zh) * 2019-10-31 2020-02-11 广州华多网络科技有限公司 Live video data transmission method, apparatus, device, and storage medium

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113395533A (zh) * 2021-05-24 2021-09-14 广州博冠信息科技有限公司 Virtual gift special effect display method and apparatus, computer device, and storage medium
CN113473168A (zh) * 2021-07-02 2021-10-01 北京达佳互联信息技术有限公司 Live streaming method and system, live streaming method executed by portable device, and portable device
CN113473168B (zh) * 2021-07-02 2023-08-08 北京达佳互联信息技术有限公司 Live streaming method and system, live streaming method executed by portable device, and portable device
CN113840177A (zh) * 2021-09-22 2021-12-24 广州博冠信息科技有限公司 Live interaction method and apparatus, storage medium, and electronic device
CN113840177B (zh) * 2021-09-22 2024-04-30 广州博冠信息科技有限公司 Live interaction method and apparatus, storage medium, and electronic device
CN113949900A (zh) * 2021-10-08 2022-01-18 上海哔哩哔哩科技有限公司 Live streaming sticker processing method and system
CN113949900B (zh) * 2021-10-08 2023-11-24 上海哔哩哔哩科技有限公司 Live streaming sticker processing method, system, device, and storage medium
CN113923530A (zh) * 2021-10-18 2022-01-11 北京字节跳动网络技术有限公司 Interactive information display method and apparatus, electronic device, and storage medium
CN113923530B (zh) * 2021-10-18 2023-12-22 北京字节跳动网络技术有限公司 Interactive information display method and apparatus, electronic device, and storage medium
CN114125501A (zh) * 2021-10-30 2022-03-01 杭州当虹科技股份有限公司 Interactive video generation method and playback method and apparatus therefor

Also Published As

Publication number Publication date
CN110557649A (zh) 2019-12-10
CN110557649B (zh) 2021-12-28

Similar Documents

Publication Publication Date Title
WO2021047419A1 (zh) Live broadcast interaction method, live broadcast system, electronic device and storage medium
CN110798698B (zh) Multi-server stream-pushing method, device, and storage medium for a live streaming application
WO2018121014A1 (zh) Video playback control method and apparatus, and terminal device
KR100889367B1 (ko) System and method for implementing a virtual studio over a network
US11882188B2 (en) Methods and systems for maintaining smooth frame rate during transmission of streaming video content
JP6337114B2 (ja) Method and apparatus for resource utilization in a source device for wireless display
TW201119405A (en) System and method for multi-stream video compression using multiple encoding formats
US20160029079A1 (en) Method and Device for Playing and Processing a Video Based on a Virtual Desktop
US20220193540A1 (en) Method and system for a cloud native 3d scene game
CN107241654A (zh) Cloud-accelerated rendering cluster panoramic game live streaming system and method
WO2023131057A1 (zh) Video live streaming method and system, and computer storage medium
CN104837043B (zh) Multimedia information processing method and electronic device
JP2016508679A (ja) 複数の視覚コンポーネントを有する画面を共有するためのシステム、装置、および方法
US11120615B2 (en) Dynamic rendering of low frequency objects in a virtual reality system
WO2023040825A1 (zh) Media information transmission method, computing device, and storage medium
CN113301359A (zh) Audio and video processing method and apparatus, and electronic device
US9838463B2 (en) System and method for encoding control commands
Wang et al. A study of live video streaming system for mobile devices
WO2022206016A1 (zh) Layered data transmission method, apparatus, and system
KR20160015123A (ko) Cloud streaming service system, still-image-based cloud streaming service method, and apparatus therefor
CN112954394B (zh) Encoding, decoding, and playback method, apparatus, device, and medium for high-definition video
CN115243074A (zh) Video stream processing method and apparatus, storage medium, and electronic device
CN114554277B (zh) Multimedia processing method, apparatus, server, and computer-readable storage medium
WO2016107174A1 (zh) Multimedia file data processing method and system, player, and client
CN113747181A (zh) Remote-desktop-based webcasting method, live broadcast system, and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20863766

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20863766

Country of ref document: EP

Kind code of ref document: A1