WO2022062896A1 - Live broadcast interaction method and apparatus - Google Patents

Live broadcast interaction method and apparatus

Info

Publication number
WO2022062896A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
display image
live broadcast
live
behavior
Prior art date
Application number
PCT/CN2021/117040
Other languages
English (en)
French (fr)
Inventor
张水发
Original Assignee
北京达佳互联信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京达佳互联信息技术有限公司 filed Critical 北京达佳互联信息技术有限公司
Publication of WO2022062896A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/21: Server components or server architectures
    • H04N 21/218: Source of audio or video content, e.g. local disk arrays
    • H04N 21/2187: Live feed
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431: Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N 21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N 21/44012: Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N 21/47: End-user applications
    • H04N 21/485: End-user interface for client configuration

Definitions

  • the present disclosure relates to the field of Internet technologies, and in particular, to a live interactive method, device, electronic device, and storage medium.
  • Interactive live broadcast is an enhanced application of live video, adding interactive functions to live video.
  • the interactive function in the interactive live broadcast includes adding voice and video interaction in the live video broadcast.
  • the present disclosure provides a live interactive method, device, electronic device and storage medium.
  • the technical solutions of the present disclosure are as follows:
  • a live interactive method including:
  • a second display avatar is rendered in the live scene.
  • a live interactive device including:
  • the display module is configured to display the live broadcast scene in the interface of the live broadcast room;
  • a collection module configured to collect behavior data of the first target object
  • a display image generation module configured to generate a first display image corresponding to the first target object according to the behavior data of the first target object
  • a first rendering module configured to render the first display image in the live broadcast scene
  • an acquisition module configured to acquire a second display image of the second target object, and the second display image is generated according to the behavior data of the second target object;
  • the second rendering module is further configured to render the second display image in the live broadcast scene.
  • an electronic device comprising:
  • a processor for executing instructions stored in a memory;
  • the processor is configured to execute the instructions to implement the live interaction method described in any one of the embodiments of the first aspect.
  • a storage medium, wherein when instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the live interaction method described in any one of the embodiments of the first aspect.
  • a computer program product comprising a computer program, the computer program being stored in a readable storage medium, and at least one processor of a device from the readable storage medium The computer program is read and executed, so that the device executes the live interaction method described in any one of the embodiments of the first aspect.
  • a live broadcast scene is pre-established, and the same live broadcast scene is displayed on the host end and the audience end; the host end and the audience end simultaneously collect the behavior data of their respective users to generate display images, and the display images are transmitted in both directions, so that the host and the audience can interact with real-world behaviors in the same virtual scene, making the way of live broadcast interaction more comprehensive.
  • FIG. 1 is an application environment diagram of a live broadcast interaction method according to an exemplary embodiment.
  • Fig. 2 is a flow chart of a live interactive method according to an exemplary embodiment.
  • Fig. 3 is a flowchart showing a step of collecting behavior data according to an exemplary embodiment.
  • Fig. 4 is a flow chart of a live interactive method according to an exemplary embodiment.
  • Fig. 5 is a schematic diagram of a live broadcast scene according to an exemplary embodiment.
  • Fig. 6 is a flow chart of a live interactive method according to another exemplary embodiment.
  • Fig. 7 is a block diagram of a live interactive device according to an exemplary embodiment.
  • Fig. 8 is an internal structure diagram of an electronic device according to an exemplary embodiment.
  • the related method, apparatus, device and storage medium can obtain relevant information of the user.
  • the present disclosure provides a live interactive method.
  • the live interactive method provided by the present disclosure can be applied to the application environment shown in FIG. 1 .
  • the host terminal 110 and the server 120 communicate through the network, and at least one viewer terminal 130 and the server 120 communicate through the network.
  • the viewer terminals 130 at least include viewer terminals participating in the live broadcast interaction (hereinafter referred to as interactive viewer terminals).
  • An application program that can be used for live broadcasting is installed in the host terminal 110 .
  • An application program that can be used to watch the live broadcast is installed in the viewer terminal 130 .
  • the application installed in the host terminal 110 for live broadcast and the application installed in the viewer terminal 130 for watching the live broadcast may be the same application.
  • when the host terminal 110 creates a live broadcast room, it can obtain the live broadcast scene material selected by the host to set up the live broadcast room.
  • the host 110 collects the host's behavior data, generates the host's display image corresponding to the host according to the host's behavior data, and renders the host's display image in the live broadcast scene.
  • the viewer terminal 130 enters the live broadcast room and displays, on its screen, the live broadcast scene including the host's display image. Some or all of the viewer terminals 130 (interactive viewer terminals) may request the host terminal 110 to perform live interaction.
  • the interactive audience terminal collects the behavior data of the interactive audience, generates the interactive audience display image corresponding to the interactive audience according to the behavior data of the interactive audience, and renders the interactive audience display image in the live broadcast scene.
  • the interactive viewer terminal sends the interactive viewer display image to the server 120, so that the server 120 forwards it to the host terminal 110 and the other viewer terminals not participating in the interaction, and the host terminal 110 and those viewer terminals render the interactive viewer display image in the live broadcast scene.
  • the host terminal 110 can be, but is not limited to, various personal computers, laptops, smartphones, and tablet computers.
  • the server 120 can be implemented by an independent server or a server cluster composed of multiple servers.
  • the viewer terminal 130 can be, but is not limited to, various personal computers, laptops, smartphones, and tablets.
  • FIG. 2 is a flow chart of a live broadcast interaction method according to an exemplary embodiment. As shown in FIG. 2, the live broadcast interaction method is used for the host terminal 110 or an interactive viewer terminal among the viewer terminals 130 in FIG. 1, and includes the following steps.
  • step S210 the live broadcast scene is displayed on the interface of the live broadcast room.
  • the live broadcast scene refers to a virtual scene set for the live broadcast room.
  • the material of the live broadcast scene can be pre-configured, for example, a game scene, a virtual image background, etc., or it can be selected by the user from the album of the terminal device;
  • the host can trigger the creation request of the live room through the host terminal.
  • the host terminal obtains the material of the live broadcast scene in response to the creation request of the live broadcast room; and creates the live broadcast scene according to the obtained material of the live broadcast scene.
  • the host side displays the created live broadcast scene.
  • the audience can enter the live room through search, hotspot recommendation, etc., and the screen of the audience will display the same live scene as the host.
  • step S220 the behavior data of the first target object is collected, a first display image corresponding to the first target object is generated according to the behavior data of the first target object, and the first display image is rendered in the live broadcast scene.
  • the first target object may be a host or an interactive audience participating in the live broadcast interaction. Interactive viewers can be all or part of the viewers who are watching the live broadcast.
  • the behavior data of the first target object is collected in real time through an image acquisition device. Corresponding processing is performed on the behavior data of the first target object, a first display image corresponding to the first target object is generated, and the first display image is rendered in the live broadcast scene of the first client.
  • the behavior data of the first target object includes but is not limited to video data, voice data, or text comment data of the first target object.
  • the first displayed image corresponding to the first target object may be obtained based on the deep learning theory.
  • the first display image may be generated in different ways depending on the type of behavior data: if the behavior data of the first target object is a behavior image obtained by photographing the first target object, the first display image may be the first target object image obtained by semantically segmenting the behavior image, or may be a three-dimensional model driven by the human body pose estimation result of the first target object; if the behavior data of the first target object is voice data obtained by voice collection of the first target object, the first display image may be the related text content obtained by performing speech recognition on the voice data.
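  • As a non-limiting illustration of this dispatch (a minimal sketch, not the patent's implementation), the following stubs stand in for the segmentation, pose-estimation, and speech-recognition models:
```python
# Minimal sketch: choose how the collected behavior data becomes a display image.
# The three generator functions are placeholder stubs, not APIs from the patent.
def segment_portrait(frame):
    return frame                       # stub: would return the segmented portrait

def drive_3d_avatar(frame):
    return frame                       # stub: would return a pose-driven 3D model

def transcribe_speech(audio):
    return ""                          # stub: would return recognized text content

def generate_first_display_image(kind, payload, use_3d_avatar=False):
    if kind == "image":
        return drive_3d_avatar(payload) if use_3d_avatar else segment_portrait(payload)
    if kind == "voice":
        return transcribe_speech(payload)    # rendered as text content in the scene
    return payload                           # e.g. a text comment is shown directly
```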
  • step S230 a second display image of the second target object is obtained, and the second display image is generated according to the behavior data of the second target object.
  • step S240 the second display image is rendered in the live broadcast scene.
  • the second target object may be a host or an interactive audience participating in the live broadcast interaction.
  • the second target object may be an interactive viewer; when the first target object is an interactive viewer, the second target object may be a host and/or other interactive viewers.
  • a second display image corresponding to the second target object is generated according to the behavior data of the second target object, and the second display image is rendered in the live broadcast scene displayed on the second client.
  • the second client sends the acquired second display image to the server, and the server sends the second display image corresponding to the second target object to the first client.
  • the first client receives the second display image of the second target object sent by the server, and renders the second display image in the displayed live broadcast scene.
  • the first display image corresponding to the first target object can be received from the server, and the first display image can be rendered in the live broadcast scene displayed by the second client, so that the The second client and the first client present the same live broadcast scene.
  • for a viewer terminal not participating in the live broadcast interaction, the first display image corresponding to the first target object and the second display image corresponding to the second target object can be obtained from the server, and the first display image and the second display image are rendered in the live broadcast scene displayed on that viewer terminal, so that the non-participating viewer terminal, the first client, and the second client present the same live broadcast scene.
  • a live broadcast scene is pre-established, and the same live broadcast scene is displayed on the anchor end and the audience end; the anchor end and the audience end collect the behavior data of their respective users at the same time to generate display images, and the display images are transmitted in both directions, so that the anchor and the audience can interact with real-world behaviors in the same virtual scene, making the way of live broadcast interaction more comprehensive.
  • step S220, in which the behavior data of the first target object is collected, a first display image corresponding to the first target object is generated according to the behavior data of the first target object, and the first display image is rendered in the live broadcast scene, includes: collecting multiple frames of behavioral images of the first target object, performing semantic segmentation processing on each frame of behavioral image to obtain the first display image of each frame, and rendering the first display image of each frame in the live broadcast scene.
  • the behavior data of the first target object may be continuous multiple frames of behavior images of the first target object collected in real time by an image acquisition device.
  • the pre-configured trained semantic segmentation model is invoked.
  • a first target object image is obtained by performing semantic segmentation processing on each frame of behavioral image through the trained semantic segmentation model, and the obtained first target object image is used as the first display image.
  • the first client renders the acquired first display image of each frame in the live broadcast scene.
  • the semantic segmentation model includes but is not limited to DeepLab (a semantic segmentation network), FCN (Fully Convolutional Network), SegNet (a semantic segmentation network), BiSeNet (Bilateral Segmentation Network for real-time semantic segmentation, a dual-path real-time semantic segmentation model), etc.
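  • As a concrete illustration (a minimal sketch; the patent does not fix a particular model), a pretrained DeepLabV3 from torchvision can cut the person out of a behavior frame:
```python
# Minimal sketch: person segmentation with torchvision's pretrained DeepLabV3.
# Any of the models listed above could be substituted; this choice is an assumption.
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

PERSON_CLASS = 15  # "person" index in the Pascal VOC label set used by this model

model = deeplabv3_resnet50(weights="DEFAULT").eval()   # torchvision >= 0.13
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def person_mask(frame_rgb):
    """Return a boolean H x W mask of the pixels labelled as a person.

    frame_rgb: a PIL image or an H x W x 3 uint8 array of the behavior frame.
    """
    batch = preprocess(frame_rgb).unsqueeze(0)
    with torch.no_grad():
        logits = model(batch)["out"][0]          # (21, H, W) class scores
    return logits.argmax(0) == PERSON_CLASS      # display image = frame * mask
```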
  • the first target object image may be a corresponding portrait of a real host or a portrait of a real interactive audience.
  • the second display image of the second target object acquired by the first client is obtained by performing semantic segmentation processing on each frame of behavioral image of the second target object in the same manner as described above.
  • the server sends the second display image obtained by the semantic segmentation process to the first client, so that the first client can render the second display image in the live broadcast scene.
  • the behavioral images of the anchor and/or the interactive viewers participating in the live broadcast interaction are collected, semantic segmentation is performed on the obtained behavioral images to obtain real portraits, and the obtained real portraits are rendered in the live broadcast scene, so that the virtual live broadcast scene is closer to the real-world scene. This can improve the authenticity of the live broadcast interaction, help keep users in the live broadcast room, and improve the user retention rate of the live broadcast application.
  • performing semantic segmentation processing on each frame of behavioral image includes: sending the multiple frames of behavioral images to a server; and receiving, from the server, the first display image of each frame obtained by performing semantic segmentation processing on each frame of behavioral image.
  • performing semantic segmentation processing on the multi-frame behavior images collected by the first client and/or the second client may also be performed by the server.
  • the first client and/or the second client acquires the multi-frame behavior images collected by their respective image acquisition devices.
  • the first client and/or the second client sends the acquired multi-frame behavior images to the server in real time.
  • the server invokes a pre-deployed semantic segmentation model.
  • the semantic segmentation process is performed on each frame of behavioral image through the semantic segmentation model to obtain the first target object image and the second target object image; the obtained first target object image is used as the first display image, and the obtained second target object image is used as the second display image.
  • the server can send the first display image and the second display image to the associated clients in the live broadcast room (which may refer to the clients corresponding to all accounts that have entered the live broadcast room), so that the associated clients can render the first display image and the second display image in the currently displayed live broadcast scene.
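  • A minimal sketch of this server-side fan-out follows; the in-memory registry and the connection object's send() method are assumptions, since the patent does not specify the transport:
```python
# Minimal sketch: push both display images to every associated client of a room.
from collections import defaultdict

associated_clients = defaultdict(set)      # room_id -> set of client connections

def join_room(room_id, connection):
    associated_clients[room_id].add(connection)

def push_display_images(room_id, first_image, second_image):
    for connection in associated_clients[room_id]:
        connection.send({"first_display_image": first_image,      # send() is assumed
                         "second_display_image": second_image})
```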
  • by performing the semantic segmentation processing on the server, the operating pressure of the terminal device can be reduced and the response speed of the terminal device can be improved.
  • rendering the first display image in the live broadcast scene includes: performing tracking processing on the multiple frames of behavioral images to obtain motion track information of the first target object; and rendering the motion trajectory of the first display image in the live broadcast scene according to the motion track information of the first target object.
  • a trained target tracking algorithm is deployed on the first client in advance.
  • a target tracking algorithm is used to track and process multiple frames of behavior images collected by the first client to obtain motion track information of the first target object. Further, according to the motion trajectory information of the first target object, the motion trajectory of the first display image is rendered in the live broadcast scene.
  • the target tracking algorithm can use a tracking algorithm based on a correlation filter, such as the KCF Tracker (Kernel Correlation Filter tracking algorithm), the MOSSE Tracker (Minimum Output Sum of Squared Error filter tracking algorithm), etc.
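  • As an illustration (a minimal sketch; the patent does not prescribe a library), OpenCV's KCF tracker can turn consecutive behavior frames into a trajectory of centre points; depending on the OpenCV build, the tracker lives under cv2 or cv2.legacy:
```python
# Minimal sketch: correlation-filter tracking of the target across behavior frames.
import cv2

def track_trajectory(frames, init_bbox):
    """Return the centre point of the tracked target in each subsequent frame.

    init_bbox: (x, y, w, h) of the target in frames[0].
    """
    tracker = (cv2.legacy.TrackerKCF_create() if hasattr(cv2, "legacy")
               else cv2.TrackerKCF_create())
    tracker.init(frames[0], init_bbox)
    trajectory = []
    for frame in frames[1:]:
        ok, (x, y, w, h) = tracker.update(frame)
        if ok:
            trajectory.append((x + w / 2, y + h / 2))
    return trajectory
```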
  • rendering the second display image in the live broadcast scene includes: acquiring motion trajectory information of the second target object, and rendering the motion trajectory of each frame of the second display image in the live broadcast scene according to the motion trajectory information of the second target object.
  • the first client may also receive the motion trajectory information of the second target object sent by the server.
  • according to the motion track information of the second target object, the motion track of the second display image is rendered in the currently displayed live broadcast scene.
  • the motion trajectory information of the second target object may be obtained by tracking the multi-frame behavior images of the second target object through a target tracking algorithm preconfigured on the second client.
  • the first client and the second client can also send the first display image and the motion track information of the first target object, as well as the second display image and the motion track information of the second target object, through the server to the other associated clients in the live broadcast room, so that the other associated clients render the first display image and its motion trajectory, and the second display image and its motion trajectory, in the currently displayed live broadcast scene.
  • the target tracking algorithm is pre-deployed, the motion trajectory information of the target object in the real world is obtained through the target tracking algorithm, and the motion trajectory of the display image is rendered in the live broadcast scene according to that information, so that the images displayed in the live broadcast scene can interact according to the behaviors of real-world characters. This can make the live broadcast interaction more comprehensive, improve the authenticity of the live broadcast interaction, and help increase the user's stay time.
  • performing tracking processing on the multiple frames of behavior images to obtain motion trajectory information of the first target object includes: sending the multiple frames of behavior images to a server; and receiving, from the server, the motion track information of the first target object obtained by tracking the multiple frames of behavior images.
  • the tracking processing of the multi-frame behavior images collected by the first client and/or the second client may also be performed by the server.
  • the first client and/or the second client respectively acquire the multi-frame behavior images collected by their respective image acquisition devices.
  • the first client and/or the second client send the acquired multi-frame behavior images to the server in real time.
  • the server invokes a pre-deployed target tracking algorithm.
  • the multi-frame behavior images are tracked by the target tracking algorithm, and the motion track information corresponding to the first target object and the second target object is obtained.
  • the server may send the motion trajectory information corresponding to the first target object and the second target object to the associated clients in the live broadcast room, so that the associated clients can render, in the currently displayed live broadcast scene, the motion trajectories corresponding to the first display image and the second display image.
  • by performing the tracking processing on the server, the operating pressure of the terminal device can be reduced, and the response speed of the terminal device can be improved.
  • before performing semantic segmentation processing on each frame of behavioral image, the method further includes: acquiring scene display parameters of the live broadcast scene and device parameters of the image acquisition device; and adjusting each frame of behavioral image according to the scene display parameters and the device parameters.
  • the scene display parameters of the live broadcast scene include but are not limited to information such as brightness and contrast of the live broadcast scene.
  • the scene display parameters of the live broadcast scene can be manually configured by the host when creating the live broadcast room, or pre-configured default parameters can be used.
  • Device parameters refer to the parameters of the image acquisition device used to acquire behavioral images. Device parameters include but are not limited to factors such as illumination, contrast, camera resolution, and lens distortion. The device parameters of the image capturing devices corresponding to the first client and the second client may be different.
  • the first client obtains scene display parameters of the live broadcast scene.
  • the first client obtains the device parameters of the image collecting device.
  • the first client adjusts each frame of the acquired behavior image according to the scene display parameters of the live broadcast scene and the device parameters of the image acquisition device.
  • for example, if the acquired scene display parameters of the live broadcast scene and the device parameters of the image capture device both include brightness, and the brightness in the scene display parameters is less than the brightness in the device parameters, the brightness of the behavioral image of the first target object can be reduced accordingly.
  • similarly, when collecting the behavior image of the second target object, the second client obtains the scene display parameters of the live broadcast scene and the device parameters of its image acquisition device, and adjusts each frame of the acquired behavioral images according to these parameters.
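  • A minimal sketch of the brightness adjustment example above, assuming both parameter sets expose a numeric brightness value (the exact parameter format is not given in the patent):
```python
# Minimal sketch: scale a captured frame so its brightness matches the scene setting.
import cv2

def adjust_brightness_to_scene(frame_bgr, scene_brightness, device_brightness):
    if device_brightness <= 0:
        return frame_bgr
    gain = scene_brightness / device_brightness   # < 1 darkens the captured frame
    return cv2.convertScaleAbs(frame_bgr, alpha=gain, beta=0)
```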
  • performing semantic segmentation processing on each frame of behavior image specifically includes: performing semantic segmentation processing on each frame of adjusted behavior image.
  • the first client invokes a pre-deployed semantic segmentation model to perform semantic segmentation processing on each frame of behavioral image of the first target object, to obtain the first target object image, and the obtained first target object image is used as the first display image.
  • in this way, the behavior images collected by different clients can be rendered more consistently in the live broadcast scene.
  • rendering the first display image in the live broadcast scene includes: performing behavior analysis on the first display image to obtain a behavior category of the first display image, and rendering the first display image in the live broadcast scene according to a rendering method corresponding to the behavior category.
  • the behavior categories include but are not limited to dancing, duet, jumping, high-five, motivation, etc.
  • the rendering method corresponding to the behavior category may refer to the relevant special effects rendering method corresponding to the behavior category.
  • the rendering method corresponding to the behavior category of dancing can be lighting effects
  • the rendering method corresponding to the behavior category of high-five can be: when at least one other display image approaches for the high-five, corresponding special effects are added at the high-five position.
  • the behavioral analysis of the first displayed avatar may be performed based on deep learning theory.
  • a deep learning model can be used to perform action recognition on the first display image to obtain the behavior category of the first display image;
  • if the first display image is related text content obtained by performing speech recognition on the voice data, keyword recognition may be performed on the related text content to obtain the behavior category of the first display image.
  • the corresponding relationship between the behavior category and the rendering mode may be pre-configured on the first client.
  • after the first client obtains the behavior category of the first display image, it can look up the rendering mode corresponding to that behavior category from the pre-configured correspondence between behavior categories and rendering modes, and render the first display image in the live broadcast scene according to the rendering mode corresponding to the behavior category.
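  • A minimal sketch of such a correspondence held on the first client follows; the category names and effect identifiers are illustrative assumptions, not values from the patent:
```python
# Minimal sketch: look up the special-effect rendering mode for a behavior category.
RENDERING_MODES = {
    "dancing":   "stage_lighting_effect",
    "high_five": "spark_effect_at_contact",
    "jumping":   "bounce_trail_effect",
}

def rendering_mode_for(behavior_category):
    # Unknown categories fall back to plain rendering without special effects.
    return RENDERING_MODES.get(behavior_category)
```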
  • rendering the second display image in the live broadcast scene includes: acquiring a behavior category of the second display image, and rendering the second display image in the live broadcast scene according to a rendering method corresponding to the behavior category of the second display image .
  • after acquiring the second display image, the second client can perform behavior analysis on the second display image based on the deep learning theory to obtain the behavior category of the second display image.
  • the second client may send the behavior category of the second display avatar to the server.
  • the server sends the behavior category of the second display image to the first client, so that the first client can render the second display image in the live broadcast scene according to the rendering method corresponding to that behavior category.
  • the behavior category of the displayed image is obtained by analyzing the behavior of the displayed image in the live broadcast scene, and the displayed image is rendered in the live broadcast scene according to the rendering method corresponding to the behavior category. This further enriches the live broadcast interaction methods and makes the displayed image in the live broadcast scene more vivid in visual effect, which helps to increase the number of viewers in the live broadcast room and prolong the stay time of the audience in the live broadcast room.
  • the first target object is the host, and the second target object is a viewer; obtaining the second display image of the second target object includes: in response to an interaction request of the second target object, obtaining the second display image of the second target object according to the interaction request.
  • the second target object is an interactive audience participating in the live broadcast interaction.
  • the second target object may trigger the interaction request through the second client.
  • the second client collects behavior data of the second target object, and generates a second display image corresponding to the second target object according to the behavior data of the second target object.
  • the second client can send the second display image to the first client corresponding to the host through the server, so that the first client obtains the second display image and renders the acquired second display image in the currently displayed live broadcast scene .
  • the anchor terminal and the audience terminal simultaneously collect the behavior data of their respective users to generate display images, and the display images are transmitted in both directions, so that the anchor and the audience can use real-world behaviors for live interaction in the same virtual scene, which can make the live broadcast interaction more comprehensive.
  • the first target object is a viewer
  • the second target object is a host or a viewer
  • collecting behavior data of the first target object includes: in response to an interaction request of the first target object, receiving a confirmation message of the interaction request, and collecting the behavior data of the first target object according to the confirmation message of the interaction request.
  • the second target object may be other interactive viewers or hosts participating in the live broadcast interaction.
  • the first target object may trigger an interaction request through the first client.
  • the first client can send the interaction request to the second client corresponding to the host through the server.
  • the host can trigger the permission instruction through the second client.
  • the server sends a confirmation message of the interaction request to the first client, so that the first client can start collecting behavior data of the first target object according to the confirmation message of the interaction request.
  • the viewer can collect behavior data of the viewer only after receiving the confirmation message from the host, so that the host can manage the interactive viewers in a unified manner.
  • a confirmation message of the interaction request is received, and the behavior data of the first target object is collected according to the confirmation message of the interaction request, including:
  • step S310 in response to the interaction request of the first target object, the number of displayed avatars in the live broadcast scene is acquired.
  • step S320 when the number of displayed images does not reach the number threshold, the interaction request is uploaded.
  • step S330 a confirmation message of the interaction request is received, and behavior data of the first target object is collected according to the confirmation message.
  • the number of displayed images in the live broadcast scene may refer to the number of displayed images corresponding to the interactive viewers in the current live broadcast scene.
  • the number threshold refers to the maximum number of interactive viewers allowed to participate in live interaction.
  • the quantity threshold can be manually configured by the host when creating the live room, or it can be a pre-configured default threshold.
  • if the first target object is a viewer, the second target object may be another interactive viewer or the host.
  • the first target object may trigger an interaction request through the first client.
  • in response to the interaction request, the first client acquires the number of displayed images in the current live broadcast scene, and compares it with a pre-acquired number threshold.
  • when the number of displayed images does not reach the number threshold, the interaction request of the first client is sent to the second client of the host through the server.
  • the host can trigger the permission instruction through the second client.
  • the server sends a confirmation message of the interaction request to the first client, so that the first client can collect behavior data of the first target object according to the confirmation message of the interaction request.
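  • A minimal sketch of the flow in steps S310 to S330 on the viewer's client follows; the client and live_scene objects and their methods are assumed interfaces, not APIs defined by the patent:
```python
# Minimal sketch: upload the interaction request only while display-image slots remain.
def try_start_interaction(client, live_scene, number_threshold):
    if len(live_scene.display_images) >= number_threshold:
        return False                              # room is full, request is not uploaded
    client.upload_interaction_request()           # relayed to the host via the server
    if client.wait_for_confirmation():            # host approved the interaction request
        client.start_collecting_behavior_data()
        return True
    return False
```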
  • by limiting the number of displayed images in this way, the display effect of the displayed images in the live broadcast scene can be improved.
  • step S220, in which the behavior data of the first target object is collected and the first display image corresponding to the first target object is generated according to the behavior data of the first target object, includes: collecting the behavior data of the first target object; and when the whole-body image of the first target object is identified according to the behavior data of the first target object, generating the first display image corresponding to the first target object according to the behavior data of the first target object.
  • the second target object may be other interactive viewers or a host.
  • the first target object may trigger an interaction request through the first client.
  • the first client may send the interaction request of the first client to the second client of the host through the server.
  • the host can trigger the permission instruction through the second client.
  • the server sends a confirmation message of the interaction request to the first client, so that the first client can collect behavior data of the first target object according to the confirmation message of the interaction request.
  • the behavior data of the first target object includes a behavior image of the first target object.
  • the first client can identify the behavior image of the first target object, and determine whether the behavior image contains the full-body image of the first target object. If the whole body image of the first target object is included, the first display image is acquired, and the first display image is rendered into the live broadcast scene.
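  • A minimal sketch of the whole-body check follows, assuming a pose estimator that returns per-keypoint confidences; the patent only states that a full-body image must be identified before collection continues:
```python
# Minimal sketch: decide whether the captured frame contains the user's whole body.
REQUIRED_KEYPOINTS = {"nose", "left_shoulder", "right_shoulder",
                      "left_hip", "right_hip", "left_ankle", "right_ankle"}

def contains_whole_body(keypoint_scores, min_confidence=0.5):
    """keypoint_scores: dict mapping keypoint name -> detection confidence."""
    visible = {name for name, score in keypoint_scores.items() if score >= min_confidence}
    return REQUIRED_KEYPOINTS <= visible
```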
  • by allowing the interactive client to continue collecting the behavior data of the interactive audience only after judging that it can capture the whole-body image of the interactive audience, the compliance of the live interaction can be improved.
  • FIG. 4 is a flow chart of a live broadcast interaction method according to an exemplary embodiment. As shown in FIG. 4 , the live broadcast interaction method used in the host terminal includes the following steps.
  • step S401 the host creates a live room, and configures a live broadcast scene in the live broadcast room and a threshold for the number of displayed images in the live broadcast scene.
  • step S402 the host terminal displays the live broadcast scene in the interface of the live broadcast room.
  • step S403 the behavior data of the anchor is collected, and the behavior data of the anchor may be continuous multiple frames of behavior images of the anchor collected by the camera.
  • step S404 semantic segmentation, tracking, and behavior analysis are performed on each frame of the anchor's behavior image to obtain the anchor's display image, the anchor's motion track information, and the behavior category of the anchor's display image.
  • semantic segmentation is performed on each frame of the anchor's behavior image by using a semantic segmentation model to obtain a segmentation result of the anchor's portrait in each frame, which is used as the anchor's display image in each frame.
  • the anchor's display image is identified through the action recognition model, and the behavior category of the anchor's display image is obtained.
  • the motion trajectory information is obtained by tracking the multi-frame anchor behavior images through the target tracking algorithm.
  • the behavior category of the anchor's displayed image can also be obtained by performing behavior detection on multiple frames of anchor behavior images by a target tracking algorithm, which is not specifically limited here.
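  • A minimal per-frame sketch of step S404 follows; the three callables stand in for the semantic segmentation model, the target tracking algorithm, and the action recognition model, none of which is specified concretely by the patent:
```python
# Minimal sketch: one frame of the host-side analysis pipeline in step S404.
def process_anchor_frame(frame, segment, track, classify_behavior):
    display_image = segment(frame)                 # anchor portrait cut out of the frame
    position = track(frame)                        # one point of the motion trajectory
    behavior_category = classify_behavior(display_image)
    return display_image, position, behavior_category
```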
  • step S405 the anchor display image and the motion trajectory of the anchor display image are rendered in the live broadcast scene of the anchor end, and the anchor display image in the live broadcast scene is rendered according to the rendering method corresponding to the behavior category of the anchor display image.
  • step S406 the host's display image, the anchor's motion track information, and the behavior category of the anchor's displayed image are sent to the server, so that the server sends the anchor's displayed image, the anchor's motion track information, and the behavior category of the anchor's displayed image to all viewers end.
  • the viewer renders the anchor display image and the motion trajectory of the anchor display image in the live broadcast scene, and renders the anchor display image in the live broadcast scene according to the rendering method corresponding to the behavior category of the anchor display image.
  • step S407 in response to the interaction request from the interactive viewer terminal, a permission instruction is obtained, and initial location information is allocated to the audience display image of the interactive viewer.
  • step S408 a confirmation message of the interaction request is sent to the interactive viewer.
  • step S409 an audience display image of the interactive audience is acquired, and the audience display image is rendered to a corresponding initial position according to the initial position information.
  • the audience display image of the interactive audience is obtained from the collected audience behavior images when the interactive audience terminal or the host terminal detects that the number of displayed images in the live broadcast scene does not exceed the number threshold and the interactive audience terminal determines that the camera can capture the whole-body image of the audience.
  • step S410 continue to acquire the displayed image of the audience, the movement track information of the interactive audience, and the behavior category of the displayed image of the audience.
  • the displayed image of the audience, the motion track information of the interactive audience, and the behavior category of the displayed image of the audience can be obtained by referring to step S404, and will not be described in detail here.
  • step S411 the host renders the audience display image and the motion trajectory of the audience display image in the live broadcast scene, and renders the audience display image in the live broadcast scene according to the rendering method corresponding to the behavior category of the audience display image.
  • FIG. 5 exemplarily shows a live broadcast scene displayed by the host terminal in one embodiment.
  • the live broadcast scene is a pre-selected virtual scene
  • the displayed image of the anchor and the displayed image of the audience are the real anchor portrait and the real audience portrait obtained through the semantic segmentation model.
  • FIG. 6 is a flow chart of a live broadcast interaction method according to an exemplary embodiment. As shown in FIG. 6, the live broadcast interaction method used in an interactive audience terminal includes the following steps.
  • step S601 the interactive viewer terminal displays the live broadcast scene in the live broadcast room interface.
  • step S602 the anchor's display image, the anchor's motion track information, and the behavior category of the anchor's display image are acquired.
  • step S603 the anchor display image and the motion trajectory of the anchor display image are rendered in the live broadcast scene, and the anchor display image in the live broadcast scene is rendered according to the rendering method corresponding to the behavior category of the anchor display image.
  • step S604 in response to the interaction request triggered by the interactive audience, the number of displayed images in the live broadcast scene is obtained, and when the number of displayed images does not reach the number threshold, an interaction request is sent to the host terminal.
  • step S605 a confirmation message of the interaction request sent by the host is received, the confirmation message carries the initial location information, and behavior data of the interactive viewer is collected according to the confirmation message.
  • the behavior data of the interactive audience may be an audience behavior image of the interactive audience collected by a camera.
  • step S606 when the whole-body image of the interactive audience can be identified from the audience behavior image, semantic segmentation processing is performed on the behavior image of the interactive audience to obtain the audience display image, and the audience display image is rendered to the corresponding initial position according to the initial position information.
  • step S607 the display image of the viewer is sent to the server, so that the server sends the display image of the viewer to the host and all other viewers.
  • step S608 continuous acquisition of multiple frames of audience behavior images of the audience is continued.
  • step S609 semantic segmentation, tracking, and behavior analysis are performed on each frame of the audience behavior image to obtain the audience display image, the motion track information of the interactive audience, and the behavior category of the audience display image.
  • a semantic segmentation process is performed on each frame of the audience behavior image through a semantic segmentation model to obtain a segmentation result of the audience portrait in each frame, which is used as the audience display image in each frame.
  • through the action recognition model, the displayed image of the audience is recognized to obtain the behavior category of the audience display image.
  • the target tracking algorithm is used to track and process multiple frames of audience behavior images to obtain the motion track information of the audience. Further, the behavior category of the audience image can also be obtained by performing behavior detection on multiple frames of audience behavior images through the target tracking algorithm, which is not specifically limited here.
  • step S610 the audience display image and the motion trajectory of the audience display image are rendered in the live broadcast scene, and the audience display image in the live broadcast scene is rendered according to the rendering method corresponding to the behavior category of the audience display image.
  • the interactive viewer terminal and the host terminal present the same live broadcast scene, for details, please refer to the schematic diagram of the live broadcast scene in FIG. 5 .
  • step S611 the displayed image of the audience, the motion trajectory information of the interactive audience, and the behavior category of the displayed image of the audience are sent to the server, so that the server sends the displayed image of the audience, the motion trajectory information of the audience, and the behavior category of the displayed image of the audience to the host terminal and all other viewer terminals.
  • the audience display image and the motion trajectory of the audience display image are rendered in the live broadcast scene through the anchor terminal and all other viewer terminals, and the audience display image in the live broadcast scene is rendered according to the special effect rendering method corresponding to the behavior category of the audience display image.
  • although the steps in the above flow charts are displayed in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in the above flow charts may include multiple sub-steps or stages; these sub-steps or stages are not necessarily executed at the same time, but may be executed at different times, and their execution order is not necessarily sequential, but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
  • FIG. 7 is a block diagram of a live interactive device 700 according to an exemplary embodiment.
  • the apparatus 700 includes a display module 701 , a collection module 702 , a display image generation module 703 , a first rendering module 704 , an obtaining module 705 and a second rendering module 706 .
  • the display module 701 is configured to display the live scene in the live room interface; the collection module 702 is configured to collect the behavior data of the first target object; the display image generation module 703 is configured to generate according to the behavior data of the first target object The first display image corresponding to the first target object; the first rendering module 704 is configured to render the first display image in the live broadcast scene; the obtaining module 705 is configured to obtain the second display image of the second target object, the second display image The avatar is generated according to the behavior data of the second target object; the second rendering module 706 is further configured to render the second display avatar in the live broadcast scene.
  • the acquisition module 702 is configured to acquire multiple frames of behavioral images of the first target object; the apparatus 700 further includes: an image segmentation module, configured to perform semantic segmentation processing on each frame of behavioral images, The first display image of each frame is obtained; the first rendering module 704 is further configured to render the first display image of each frame in the live broadcast scene.
  • the image segmentation module includes: a sending unit, configured to send the multiple frames of behavioral images to a server; and a receiving unit, configured to receive the first display image of each frame, sent by the server and obtained by performing semantic segmentation processing on each frame of behavioral image.
  • the first rendering module 704 includes: a tracking unit configured to perform tracking processing on multiple frames of behavioral images to obtain motion trajectory information of the first target object; the first rendering unit configured to The motion trajectory information of the first target object, rendering the motion trajectory of the first display image in the live broadcast scene; the second rendering module 706 includes: a trajectory information acquisition unit configured to acquire the motion trajectory information of the second target object; The rendering unit is configured to render the motion trajectory of the second display image in each frame in the live broadcast scene according to the motion trajectory information of the second target object.
  • the tracking unit is configured to send the multi-frame behavior images to the server; and receive the motion trajectory information of the first target object sent by the server and obtained by tracking the multi-frame behavior images.
  • the acquiring module 705 is further configured to acquire the scene display parameters of the live broadcast scene and the device parameters of the image acquisition device; the apparatus 700 further includes: an image adjustment module, configured to display the parameters and The device parameters are used to adjust each frame of behavioral images; the image segmentation module is configured to perform semantic segmentation processing on the adjusted behavioral images of each frame.
  • the first rendering module 704 includes: a behavior analysis unit configured to perform behavior analysis on the first display image to obtain a behavior category of the first displayed image; and a third rendering unit configured to The rendering mode corresponding to the behavior category renders the first display image in the live broadcast scene;
  • the second rendering module 706 includes: a behavior category acquisition unit, configured to acquire the behavior category of the second display image; a fourth rendering unit, configured as The second display image is rendered in the live broadcast scene according to the rendering mode corresponding to the behavior category of the second display image.
  • the first target object is the host, and the second target object is a viewer; the obtaining module 705 is configured to, in response to an interaction request of the second target object, obtain the second display image of the second target object according to the interaction request.
  • the first target object is a viewer, and the second target object is the host or a viewer;
  • the collection module 702 is configured to, in response to the interaction request of the first target object, receive a confirmation message of the interaction request, and collect the behavior data of the first target object according to the confirmation message of the interaction request.
  • the collection module 702 includes: a quantity acquisition unit, configured to acquire, in response to an interaction request of the first target object, the number of displayed images in the live broadcast scene; an uploading unit, configured to upload the interaction request when the number of displayed images does not reach the quantity threshold; and a collection unit, configured to receive a confirmation message of the interaction request, and collect the behavior data of the first target object according to the confirmation message.
  • the collection module 702 is configured to collect behavior data of the first target object; when the whole body image of the first target object is identified according to the behavior data of the first target object, according to the behavior data of the first target object The behavior data generates a first display image corresponding to the first target object.
  • FIG. 8 is a block diagram of a device 800 for live interaction according to an exemplary embodiment.
  • device 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, or the like.
  • device 800 may include one or more of the following components: processing component 802, memory 804, power supply component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and Communication component 816.
  • the processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 802 can include one or more processors 820 to execute instructions to perform all or some of the steps of the methods described above.
  • processing component 802 may include one or more modules that facilitate interaction between processing component 802 and other components.
  • processing component 802 may include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
  • Memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and the like. Memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
  • Power component 806 provides power to the various components of device 800.
  • Power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for device 800.
  • Multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
  • multimedia component 808 includes a front-facing camera and/or a rear-facing camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data.
  • Each of the front camera and the rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
  • Audio component 810 is configured to output and/or input audio signals.
  • audio component 810 includes a microphone (MIC) that is configured to receive external audio signals when device 800 is in operating modes, such as call mode, recording mode, and voice recognition mode.
  • the received audio signal may be further stored in memory 804 or transmitted via communication component 816 .
  • audio component 810 also includes a speaker for outputting audio signals.
  • the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
  • Sensor component 814 includes one or more sensors for providing status assessments of various aspects of device 800.
  • For example, the sensor component 814 can detect the open/closed state of device 800 and the relative positioning of components (for example, the display and keypad of device 800), and can also detect a change in the position of device 800 or of one of its components, the presence or absence of user contact with device 800, the orientation or acceleration/deceleration of device 800, and changes in the temperature of device 800.
  • Sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • The sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 816 is configured to facilitate wired or wireless communications between device 800 and other devices.
  • Device 800 may access wireless networks based on communication standards, such as WiFi, carrier networks (e.g., 2G, 3G, 4G, or 5G), or a combination thereof.
  • the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
  • in an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 804 including instructions, where the instructions are executable by the processor 820 of device 800 to perform the above method.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.

Abstract

A live broadcast interaction method and apparatus, an electronic device, and a storage medium. The method includes: displaying a live broadcast scene in a live room interface; collecting behavior data of a first target object, generating, according to the behavior data of the first target object, a first display image corresponding to the first target object, and rendering the first display image in the live broadcast scene; acquiring a second display image of a second target object, the second display image being generated according to behavior data of the second target object; and rendering the second display image in the live broadcast scene.

Description

直播互动方法及装置
相关申请的交叉引用
本公开基于申请日为2020年9月22日、申请号为202011001739.6号的中国专利申请,并要求该中国专利申请的优先权,在此全文引用上述中国专利申请公开的内容以作为本公开的一部分。
技术领域
本公开涉及互联网技术领域,尤其涉及一种直播互动方法、装置、电子设备及存储介质。
背景技术
互动直播是视频直播的增强应用,是在视频直播中增加互动功能。
相关技术中,互动直播中的互动功能包括在视频直播中增加语音、视频的互动。
发明内容
本公开提供一种直播互动方法、装置、电子设备及存储介质。本公开的技术方案如下:
根据本公开实施例的第一方面,提供一种直播互动方法,包括:
在直播间界面中显示直播场景;
采集第一目标对象的行为数据,根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象,在直播场景渲染第一显示形象;
获取第二目标对象的第二显示形象,第二显示形象是根据第二目标对象的行为数据生成的;
在直播场景中渲染第二显示形象。
根据本公开实施例的第二方面,提供一种直播互动装置,包括:
显示模块,被配置为在直播间界面中显示直播场景;
采集模块,被配置为采集第一目标对象的行为数据;
显示形象生成模块,被配置为根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象;
第一渲染模块,被配置为在直播场景渲染第一显示形象;
获取模块,被配置为获取第二目标对象的第二显示形象,第二显示形象是根据第二目标对象的行为数据生成的;
第二渲染模块,还被配置为在直播场景中渲染第二显示形象。
根据本公开实施例的第三方面,提供一种电子设备,包括:
处理器;用于存储所述处理器可执行指令的存储器;
其中,所述处理器被配置为执行所述指令,以实现第一方面的任一项实施例中所述的直播互动方法。
根据本公开实施例的第四方面,提供一种存储介质,当所述存储介质中的指令由电子设备的处理器执行时,使得所述电子设备能够执行第一方面的任一项实施例中所述的直播互动方法。
根据本公开实施例的第五方面,提供一种计算机程序产品,所述程序产品包括计算机程序,所述计算机程序存储在可读存储介质中,设备的至少一个处理器从所述可读存储介质读取并执行所述计算机程序,使得设备执行第一方面的任一项实施例中所述的直播互动方法。
根据本公开的方案,预先建立直播场景,在主播端和观众端显示同一个直播场景;主播端和观众端同时采集各自用户的行为数据生成显示形象,并同时进行双向传播显示形象,使主播端和观众端能够在同一虚拟场景中以真实世界行为进行直播互动,从而使直播互动方式更加全面。
附图说明
图1是根据一示例性实施例示出的一种直播互动方法的应用环境图。
图2是根据一示例性实施例示出的一种直播互动方法的流程图。
图3是根据一示例性实施例示出的一种采集行为数据步骤的流程图。
图4是根据一示例性实施例示出的一种直播互动方法的流程图。
图5是根据一示例性实施例示出的一种直播场景的示意图。
图6是根据另一示例性实施例示出的一种直播互动方法的流程图。
图7是根据一示例性实施例示出的一种直播互动装置的框图。
图8是根据一示例性实施例示出的一种电子设备的内部结构图。
具体实施方式
为了使本领域普通人员更好地理解本公开的技术方案,下面将结合附图,对本公开实施例中的技术方案进行清楚、完整地描述。
需要说明的是,本公开的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本公开的实施例能够以除了在这里图示或描述的那些以外的顺序实施。以下示例性实施例中所描述的实施方式并不代表与本公开相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置和方法的例子。
需要说明的是,本公开实施例中所描述的获取用户信息以及用户账户的相关信息,包括社交关系身份信息之类的,均已获得用户许可,在取得用户充分许可授权的前提下,本 公开所涉及的方法,装置,设备,存储介质可以获取用户的相关信息。
相关技术中,在互动过程中通常只能对主播端影像进行处理,并由主播端向观众端进行单向展示主播端影像,存在互动方式单一的问题。
据此,本公开提供一种直播互动方法。
本公开所提供的直播互动方法,可以应用于如图1所示的应用环境中。其中,主播端110和服务器120通过网络进行通信,至少一个观众端130和服务器120通过网络进行通信。观众端130中至少包含参与直播互动的观众端(以下称为互动观众端)。主播端110中安装有能够用于进行直播的应用程序。观众端130中安装有能够用于观看直播的应用程序。主播端110中安装的用于进行直播的应用程序和观众端130中安装的用于观看直播的应用程序可以是相同的应用程序。主播端110创建直播间时,可以获取主播选择的直播场景素材,建立直播间。主播端110进行直播的过程中,主播端110采集主播的行为数据,根据主播的行为数据生成主播对应的主播显示形象,在直播场景渲染主播显示形象。在主播端110进行直播的过程中,观众端130进入该直播间,并在观众端130的屏幕上显示包含主播显示形象的直播场景。观众端130中的部分或者全部观众端(互动观众端)可以向主播端110请求进行直播互动。互动观众端采集互动观众的行为数据,根据互动观众的行为数据生成互动观众对应的互动观众显示形象,在直播场景渲染互动观众显示形象。互动观众端将互动观众显示形象发送至服务器120,以使服务器120将互动观众显示形象发送至主播端110以及未参与互动的其他观众端,使主播端110以及未参与互动的其他观众端在直播场景中渲染互动观众显示形象。其中,主播端110可以但不限于是各种个人计算机、笔记本电脑、智能手机、平板电脑,服务器120可以用独立的服务器或者是多个服务器组成的服务器集群来实现,观众端130可以但不限于是各种个人计算机、笔记本电脑、智能手机、平板电脑。
图2是根据一示例性实施例示出的一种直播互动方法的流程图,如图2所示,直播互动方法用于图1中的主播端110或者观众端130中的互动观众端,包括以下步骤。
在步骤S210中,在直播间界面中显示直播场景。
其中,直播场景是指为直播间设置的虚拟场景。直播场景的素材可以预先配置,例如,可以是游戏场景、虚拟图像背景等,或者可以是用户在终端设备的相册中选择得到;或者通过图像采集设备实时拍摄图像得到,在此不做限定。在一些实施例中,主播可以通过主播端触发直播间的创建请求。主播端响应于直播间的创建请求,获取直播场景的素材;根据所获取的直播场景的素材创建直播场景。主播端显示已创建的直播场景。观众端可以通过搜索、热点推荐等方式进入该直播间,在观众端的屏幕上显示与主播端相同的直播场景。
在步骤S220中,采集第一目标对象的行为数据,根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象,在直播场景渲染第一显示形象。
其中,第一目标对象可以是主播或者参与直播互动的互动观众。互动观众可以是正在观看直播的全部或者部分观众。在一些实施例中,对于第一目标对象对应的第一客户端, 实时通过图像采集设备采集第一目标对象的行为数据。对第一目标对象的行为数据进行相应的处理,生成第一目标对象对应的第一显示形象,并在第一客户端的直播场景渲染第一显示形象。第一目标对象的行为数据不限于是第一目标对象的视频数据、语音数据或者文字评论数据等。第一目标对象对应的第一显示形象可以基于深度学习理论得到。
示例性地,若第一目标对象的行为数据是对第一目标对象进行拍摄得到的行为图像,第一显示形象则可以是对行为图像进行语义分割处理得到的第一目标对象图像,也可以是通过第一目标对象的人体姿态估计结果驱动的三维模型;若第一目标对象的行为数据是对第一目标对象进行语音采集得到的语音数据,第一显示形象则可以是对语音数据进行语音识别得到的相关文字内容。
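As a hedged illustration only, the sketch below shows how the different kinds of behavior data listed above could each yield a first display image; the data layout and the segment_person / transcribe helpers are assumptions made for this example and are not taken from the disclosure.

```python
# Illustrative dispatch from the type of behavior data to a "first display image".
# segment_person() and transcribe() are hypothetical helpers standing in for the
# semantic-segmentation and speech-recognition steps described in the text.
def build_display_image(behavior_data: dict, segment_person, transcribe):
    kind = behavior_data.get("type")
    if kind == "video_frame":
        return segment_person(behavior_data["frame"])    # person image cut out of the frame
    if kind == "audio":
        return transcribe(behavior_data["samples"])      # recognized text used as the display content
    if kind == "comment":
        return behavior_data["text"]                     # text comment displayed directly
    raise ValueError(f"unsupported behavior data type: {kind}")
```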
在步骤S230中,获取第二目标对象的第二显示形象,第二显示形象是根据第二目标对象的行为数据生成的。
在步骤S240中,在直播场景中渲染第二显示形象。
其中,第二目标对象可以是主播或者参与直播互动的互动观众。当第一目标对象是主播时,第二目标对象可以是互动观众;当第一目标对象是互动观众时,第二目标对象可以是主播和/或其他互动观众。在一些实施例中,对于第二目标对象对应的第二客户端,可以参照步骤S220,根据第二目标对象的行为数据生成第二目标对象对应的第二显示形象,并在第二客户端显示的直播场景中渲染第二显示形象。第二客户端将已获取的第二显示形象发送至服务器,通过服务器将第二目标对象对应的第二显示形象发送第一客户端。第一客户端接收服务器发送的第二目标对象的第二显示形象,在显示的直播场景中渲染第二显示形象。
同理,对于第二目标对象对应的第二客户端,可以从服务器接收第一目标对象对应的第一显示形象,并在第二客户端显示的直播场景中渲染第一显示形象,从而使第二客户端与第一客户端呈现相同的直播场景。
进一步地,对于未参与直播互动的账户对应的观众端,可以从服务器获取第一目标对象对应的第一显示形象,以及第二目标对象对应的第二显示形象,并在观众端显示的直播场景中渲染第一显示形象和第二显示形象,从而使未参与直播互动的观众端、第一客户端和第二客户端呈现相同的直播场景。
上述直播互动方法中,预先建立直播场景,在主播端和观众端显示同一个直播场景;主播端和观众端同时采集各自用户的行为数据生成显示形象,并同时进行双向传播显示形象,使主播端和观众端能够在同一虚拟场景中以真实世界行为进行直播互动,从而可以使直播互动方式更加全面。
在一示例性实施例中,在步骤S220中,采集第一目标对象的行为数据,根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象,在直播场景渲染第一显示形象,包括:采集第一目标对象的多帧行为图像,对每帧行为图像进行语义分割处理,得到每帧第一显示形象,在直播场景中渲染每帧第一显示形象。
在一些实施例中，第一目标对象的行为数据可以是通过图像采集设备实时采集到的第一目标对象的连续多帧行为图像。第一客户端每获取一帧行为图像，调用预先配置的已训练的语义分割模型。通过已训练的语义分割模型对每帧行为图像进行语义分割处理，得到第一目标对象图像，将所得到的第一目标对象图像作为第一显示形象。第一客户端在直播场景中渲染所获取的每帧第一显示形象。其中，语义分割模型不限于采用DeepLab(一种语义分割网络)、FCN(Fully Convolutional Networks，全卷积网络)、SegNet(Semantic Segmentation，语义分割网络)、BiSeNet(Bilateral Segmentation Network for Real-time Semantic Segmentation，一种双通道实时语义分割模型)等。在一些实施例中，当第一目标对象是主播或者互动观众时，第一目标对象图像则可以是对应的真实主播人像或者真实互动观众人像。
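A minimal sketch of the per-frame person segmentation described above, using an off-the-shelf DeepLabV3 model from torchvision (version 0.13 or later assumed) as a stand-in for whichever trained segmentation model an implementation actually deploys.

```python
# Sketch only: cut the first target object (a person) out of one behavior frame.
import numpy as np
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights="DEFAULT").eval()    # stand-in for the trained segmentation model
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
PERSON_CLASS = 15  # "person" index in the VOC label set used by this model

def segment_person(frame_rgb: np.ndarray) -> np.ndarray:
    """Return the frame with non-person pixels zeroed, i.e. one frame of the first display image."""
    with torch.no_grad():
        logits = model(preprocess(frame_rgb).unsqueeze(0))["out"][0]  # num_classes x H x W
    mask = (logits.argmax(0) == PERSON_CLASS).numpy()                 # boolean person mask
    return frame_rgb * mask[..., None]                                # keep only the person pixels
```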
进一步地，对于第一客户端所获取的第二目标对象的第二显示形象，同样可以按照上述方式对第二目标对象的每帧行为图像进行语义分割处理得到。服务器将语义分割处理得到的第二显示形象发送至第一客户端，以使第一客户端能够在直播场景中渲染第二显示形象。
一些实施例中,通过采集参与直播互动的主播和/或互动观众的行为图像,对所得到的行为图像进行语义分割处理,得到真实人像,并在直播场景中渲染所得到的真实人像,使得虚拟直播场景更接近真实世界场景,从而可以提升直播互动的真实性,有助于提升用户在直播间的停留时间、提高直播应用的用户留存率。
在一示例性实施例中，对每帧行为图像进行语义分割处理，包括：将多帧行为图像发送至服务器；接收服务器发送的对每帧行为图像进行语义分割处理得到的每帧第一显示形象。
在一些实施例中，对第一客户端和/或第二客户端采集到的多帧行为图像进行语义分割处理，还可以通过服务器端执行。在第一客户端和/或第二客户端获取各自的图像采集设备采集到的多帧行为图像后，第一客户端和/或第二客户端实时将所获取的多帧行为图像发送至服务器。服务器调用预先部署的语义分割模型。通过语义分割模型对每帧行为图像进行语义分割处理，得到第一目标对象图像和第二目标对象图像，将所得到的第一目标对象图像作为第一显示形象，将所得到的第二目标对象图像作为第二显示形象。服务器可以将第一显示形象和第二显示形象发送至直播间的关联客户端(可以是指已进入直播间的所有账户对应的客户端)，以使关联客户端能够在当前显示的直播场景中渲染第一显示形象和第二显示形象。
一些实施例中,通过在服务器中预先部署语义分割模型,用于对第一客户端和第二客户端获取的多帧行为图像进行语义分割处理,可以减轻终端设备的运行压力,提升终端设备的响应速度。
在一示例性实施例中，在直播场景渲染第一显示形象，包括：对多帧行为图像进行跟踪处理，得到第一目标对象的运动轨迹信息；根据第一目标对象的运动轨迹信息，在直播场景中渲染第一显示形象的运动轨迹。
在一些实施例中,为了使直播场景中渲染的第一显示形象能够更加接近真实世界的人物行为,一些实施例中,预先在第一客户端部署已训练的目标跟踪算法。通过目标跟踪算法对第一客户端采集多帧行为图像进行跟踪处理,得到第一目标对象的运动轨迹信息。进而根据第一目标对象的运动轨迹信息,在直播场景中渲染第一显示形象的运动轨迹。其中,目标跟踪算法可以采用基于相关滤波器的跟踪算法,例如KCF Tracker(Kernel Correlation Filter,核相关滤波跟踪算法)、MOSSE Tracker(Minimum Output Sum of Squared Error Tracker,误差最小平方和滤波器跟踪算法)等。
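As one concrete and merely illustrative instance of the correlation-filter trackers named above, the sketch below uses OpenCV's KCF tracker; it assumes an OpenCV build that ships the tracking module (e.g. opencv-contrib-python) and an initial bounding box around the target person.

```python
# Sketch: derive motion-trajectory information for the target object with KCF tracking.
import cv2

def track_motion(frames, init_box):
    """frames: sequence of images; init_box: (x, y, w, h) around the target in frames[0]."""
    tracker = cv2.TrackerKCF_create()
    tracker.init(frames[0], init_box)
    trajectory = [init_box]
    for frame in frames[1:]:
        ok, box = tracker.update(frame)                  # correlation-filter update per frame
        trajectory.append(tuple(int(v) for v in box) if ok else trajectory[-1])
    return trajectory                                    # the motion-trajectory information
```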
在一些实施例中,在直播场景中渲染第二显示形象,包括:获取第二目标对象的运动轨迹信息,根据第二目标对象的运动轨迹信息,在直播场景中渲染每帧第二显示形象的运动轨迹。
在一些实施例中，第一客户端还可以接收服务器发送的第二目标对象的运动轨迹信息。根据第二目标对象的运动轨迹信息，在当前显示的直播场景中渲染第二显示形象的运动轨迹。第二目标对象的运动轨迹信息可以通过预先配置在第二客户端的目标跟踪算法，对第二目标对象的多帧行为图像进行跟踪处理得到。
进一步地,第一客户端和第二客户端还可以通过服务器将第一显示形象、第一目标对象的运动轨迹信息,以及第二显示形象、第二目标对象的运动轨迹信息发送至直播间的其他关联客户端,以使其他关联客户端在当前显示的直播场景中渲染第一显示形象、第一显示形象的运动轨迹,以及渲染第二显示形象、第二显示形象的运动轨迹。
一些实施例中,通过预先部署目标跟踪算法,通过目标跟踪算法得到真实世界的目标对象的运动轨迹信息,并根据真实世界的目标对象的运动轨迹信息在直播场景中渲染显示形象的运动轨迹,使得直播场景中显示形象可按照真实世界人物的行为进行互动,可以使直播互动方式更加全面,且可以提高直播互动的真实性,有助于提高用户的停留时间。
在一示例性实施例中,对多帧行为图像进行跟踪处理,得到第一目标对象的运动轨迹信息,包括:将多帧行为图像发送至服务器;接收服务器发送的对多帧行为图像进行跟踪处理得到的第一目标对象的运动轨迹信息。
在一些实施例中,对第一客户端和/或第二客户端采集到的多帧行为图像进行跟踪处理,还可以通过服务器端执行。在第一客户端和/或第二客户端各自获取图像采集设备采集到的多帧行为图像后,第一客户端和/或第二客户端实时将所获取的多帧行为图像发送至服务器。服务器调用预先部署的目标跟踪算法。通过目标跟踪算法对多帧行为图像进行跟踪处理,得到第一目标对象和第二目标对象各自对应的运动轨迹信息。服务器可以将第一目标对象和第二目标对象各自对应的运动轨迹信息发送至直播间的关联客户端,以使关联客户端能够在当前显示的直播场景中渲染第一显示形象和第二显示形象各自对应的运动轨迹。
一些实施例中,通过在服务器中预先部署已训练的目标跟踪算法,用于对第一客户端 和第二客户端获取的多帧行为图像进行跟踪处理,可以减轻终端设备的运行压力,提升终端设备的响应速度。
在一示例性实施例中,在对每帧行为图像进行语义分割处理之前,还包括:获取直播场景的场景显示参数以及图像采集设备的设备参数;根据场景显示参数和设备参数,对每帧行为图像进行调整。
其中,直播场景的场景显示参数不限于包括直播场景的亮度、对比度等信息。直播场景的场景显示参数可以是主播在创建直播间时手动配置的,或者采用预先配置的默认参数。设备参数是指用于采集行为图像的图像采集设备的参数。设备参数不限于包括光照、对比度、摄像头分辨率、镜头畸变等系数。第一客户端和第二客户端各自对应的图像采集设备的设备参数可能不同。
在一些实施例中,第一客户端获取直播场景的场景显示参数。在采集第一目标对象的行为图像时,第一客户端获取图像采集设备的设备参数。第一客户端根据直播场景的场景显示参数,对所获取的每帧行为图像进行调整。示例性地,若已获取的直播场景的场景显示参数和图像采集设备的设备参数都包含亮度,且场景显示参数的亮度小于设备参数中的亮度,则可以根据场景显示参数的亮度,减小第一目标对象的行为图像的亮度。
同样地,对于第二客户端,在采集第二目标对象的行为图像时,第二客户端获取直播场景的场景显示参数以及图像采集设备的设备参数。第二客户端根据直播场景的场景显示参数,对所获取的每帧行为图像进行调整。
在一些实施例中,对每帧行为图像进行语义分割处理,具体包括:对调整后的每帧行为图像进行语义分割处理。在一些实施例中,在对第一目标对象的每帧行为图像进行调整后,第一客户端调用预先部署的语义分割模型对第一目标对象的每帧行为图像进行语义分割处理,得到第一目标对象图像,并将所得到的第一目标对象图像作为第一显示形象。
一些实施例中,通过根据直播场景的场景显示参数以及图像采集设备的设备参数对所获取的行为图像进行调整,可以使不同的客户端采集到的行为图像在直播场景中呈现的效果更加一致。
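The function below is a small, self-contained sketch of the brightness-matching example above: when the capture device reports a higher brightness than the scene's display parameter, the frame is scaled down. The parameter names are illustrative only.

```python
# Sketch: adjust one behavior frame so it does not exceed the scene's configured brightness.
import numpy as np

def match_scene_brightness(frame: np.ndarray, scene_brightness: float,
                           device_brightness: float) -> np.ndarray:
    """Darken the frame when the device brightness exceeds the scene display brightness."""
    if device_brightness <= 0 or device_brightness <= scene_brightness:
        return frame
    scale = scene_brightness / device_brightness
    return np.clip(frame.astype(np.float32) * scale, 0, 255).astype(np.uint8)
```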
在一示例性实施例中,在直播场景渲染第一显示形象,包括,对第一显示形象进行行为分析,得到第一显示形象的行为类别,按照与行为类别对应的渲染方式在直播场景中渲染第一显示形象。
其中,行为类别不限于跳舞、对唱、跳跃、击掌、激励等。与行为类别对应的渲染方式可以是指与行为类别对应的相关特效渲染方式,例如,与行为类别为跳舞对应的渲染方式可以是灯光特效,与行为类别为击掌对应的渲染方式可以是将同为击掌的至少一个显示形象靠近,并在击掌部位增加相应的特效。
在一些实施例中,对第一显示形象进行行为分析可以基于深度学习理论执行。示例性地,若第一显示形象是对行为图像进行语义分割处理得到的第一目标对象图像,则可以采用深度学习模型对第一显示形象进行动作识别,得到第一显示形象的行为类别;若第一显 示形象是对语音数据进行语音识别得到的相关文字内容,则可以对相关文字内容进行关键字识别,得到第一显示形象的行为类别。行为类别与渲染方式的对应关系可以预先配置在第一客户端。在第一客户端获取第一显示形象的行为类别后,可以从行为类别与渲染方式的对应关系中查找与行为类别对应的渲染方式,并按照与行为类别对应的渲染方式在直播场景中渲染第一显示形象。
在一些实施例中,在直播场景中渲染第二显示形象,包括:获取第二显示形象的行为类别,按照与第二显示形象的行为类别对应的渲染方式,在直播场景中渲染第二显示形象。
同样地,对于第二客户端,在获取第二显示形象后,第二客户端可以基于深度学习理论对第二显示形象进行行为分析,得到第二显示形象的行为类别。第二客户端可以将第二显示形象的行为类别发送至服务器。通过服务器将第二显示形象的行为类别发送至第一客户端,以使第一客户端能够在直播场景中按照与第二显示形象的行为类别对应的渲染方式,在直播场景中渲染第二显示形象。
一些实施例中,通过对直播场景中的显示形象进行行为分析,得到显示形象的行为类别,并按照与行为类别对应的渲染方式在直播场景中渲染显示形象,进一步丰富了直播互动方式,且可以使直播场景中的显示形象在视觉效果上更加生动形象,有助于增加直播间的观众数量,提高直播间观众的停留时长。
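The mapping below is a hedged sketch of how a recognized behavior category could select a rendering effect; the category names, effect identifiers, and the scene.render interface are assumptions made for illustration, not part of the disclosure.

```python
# Sketch: choose a rendering effect from the recognized behavior category.
RENDER_EFFECTS = {
    "dance": "stage_lights",       # e.g. light effects while dancing
    "high_five": "spark_burst",    # e.g. move the two display images together and add a spark effect
    "duet": "floating_notes",
}

def render_with_effect(scene, display_image, behavior_category: str) -> None:
    effect = RENDER_EFFECTS.get(behavior_category)       # None -> plain rendering, no special effect
    scene.render(display_image, effect=effect)
```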
在一示例性实施例中,第一目标对象为主播,第二目标对象为观众;获取第二目标对象的第二显示形象,包括:响应于第二目标对象的互动请求,根据互动请求获取第二目标对象的第二显示形象。
其中,若第一目标对象为主播,则第二目标对象为参与直播互动的互动观众。在一些实施例中,第二目标对象可以通过第二客户端触发互动请求。第二客户端响应于互动请求,采集第二目标对象的行为数据,并根据第二目标对象的行为数据生成第二目标对象对应的第二显示形象。第二客户端可以通过服务器将第二显示形象发送至主播对应的第一客户端,以使第一客户端获取第二显示形象,并在当前显示的直播场景中渲染所获取的第二显示形象。
一些实施例中,通过使观看直播间的观众能够参与直播互动;主播端和观众端同时采集各自用户的行为数据生成显示形象,并同时进行双向传播显示形象,使主播端和观众端能够在同一虚拟场景中以真实世界行为进行直播互动,从而可以使直播互动方式更加全面。
在一示例性实施例中,第一目标对象为观众,第二目标对象为主播或观众;采集第一目标对象的行为数据,包括:响应于第一目标对象的互动请求,接收互动请求的确认消息,根据互动请求的确认消息采集第一目标对象的行为数据。
其中,若第一目标对象为互动观众,则第二目标对象可以为参与直播互动的其他互动观众或者主播。在一些实施例中,第一目标对象可以通过第一客户端触发互动请求。第一客户端可以通过服务器将互动请求发送至主播对应的第二客户端。主播可以通过第二客户 端触发许可指令。服务器响应于该许可指令,向第一客户端发送互动请求的确认消息,以使第一客户端能够根据互动请求的确认消息开始采集第一目标对象的行为数据。一些实施例中,通过使观众端在接收到主播端的确认消息后,才能够采集观众的行为数据,便于主播对互动观众进行统一管理。
在一示例性实施例中,如图3所示,响应于第一目标对象的互动请求,接收互动请求的确认消息,根据互动请求的确认消息采集第一目标对象的行为数据,包括:
在步骤S310中,响应于第一目标对象的互动请求,获取直播场景中的显示形象数量。
在步骤S320中,当显示形象数量未达到数量阈值时,上传互动请求;
在步骤S330中,接收互动请求的确认消息,根据确认消息采集第一目标对象的行为数据。
其中，直播场景中的显示形象数量可以是指当前直播场景中互动观众对应的显示形象数量。数量阈值是指允许参与直播互动的最大互动观众数量。数量阈值可以是主播在创建直播间时手动配置的，也可以是预先配置的默认阈值。在一些实施例中，若第一目标对象为观众，则第二目标对象可以是其他互动观众或者主播。第一目标对象可以通过第一客户端触发互动请求。第一客户端响应于互动请求，获取当前直播场景中的显示形象数量。将当前直播场景中的显示形象数量与预先获取的数量阈值进行比较。若显示形象数量未达到数量阈值，则通过服务器将第一客户端的互动请求发送至主播的第二客户端。主播可以通过第二客户端触发许可指令。服务器响应于该许可指令，向第一客户端发送互动请求的确认消息，以使第一客户端能够根据互动请求的确认消息采集第一目标对象的行为数据。一些实施例中，通过为直播场景配置相应的数量阈值，控制参与直播互动的观众人数，可以改善直播场景中显示形象的展示效果。
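The function below is a minimal sketch of the viewer-side flow of steps S310 to S330, with the networking collapsed into callables supplied by the caller; every name here is illustrative rather than prescribed by the disclosure.

```python
# Sketch: upload the interaction request only while the scene has room, then collect on confirmation.
def request_interaction(display_image_count: int, quantity_threshold: int,
                        upload_request, wait_for_confirmation, collect_behavior_data):
    if display_image_count >= quantity_threshold:
        return None                         # scene already full: do not upload the request
    upload_request()                        # forward the interaction request to the broadcaster side
    if wait_for_confirmation():             # confirmation message received for the request
        return collect_behavior_data()      # start collecting the first target object's behavior data
    return None
```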
在一示例性实施例中,在步骤S220中,采集第一目标对象的行为数据,根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象,包括:采集第一目标对象的行为数据;当根据第一目标对象的行为数据识别出第一目标对象的全身形象时,根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象。
在一些实施例中,若第一目标对象为观众,则第二目标对象可以为其他互动观众或者主播。第一目标对象可以通过第一客户端触发互动请求。第一客户端可以通过服务器将第一客户端的互动请求发送至主播的第二客户端。主播可以通过第二客户端触发许可指令。服务器响应于该许可指令,向第一客户端发送互动请求的确认消息,以使第一客户端能够根据互动请求的确认消息采集第一目标对象的行为数据。第一目标对象的行为数据中包含第一目标对象的行为图像。第一客户端可以对第一目标对象的行为图像进行识别,判断行为图像中是否包含第一目标对象的全身形象。若包含第一目标对象的全身形象,则获取第一显示形象,并将第一显示形象渲染至直播场景中。
一些实施例中,通过在判断互动观众的客户端能够采集互动观众的全身形象后,再允许互动客户端继续采集互动观众的行为数据,可以提升直播互动的合规性。
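One possible heuristic for the whole-body check described above, assuming a person mask such as the one produced by the segmentation sketch earlier; the disclosure does not prescribe a particular detection method.

```python
# Sketch: treat the body as fully captured when the person mask touches no image border.
import numpy as np

def full_body_visible(person_mask: np.ndarray, border: int = 2) -> bool:
    if not person_mask.any():
        return False
    touches_edge = (person_mask[:border].any() or person_mask[-border:].any() or
                    person_mask[:, :border].any() or person_mask[:, -border:].any())
    return not bool(touches_edge)
```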
图4是根据一示例性实施例示出的一种直播互动方法的流程图,如图4所示,直播互动方法用于主播端中,包括以下步骤。
在步骤S401中,主播端创建直播间,配置直播间的直播场景以及直播场景中的显示形象的数量阈值。
在步骤S402中,主播端在直播间界面中显示直播场景。
在步骤S403中,采集主播的行为数据,主播的行为数据可以是通过摄像头采集的连续多帧主播行为图像。
在步骤S404中,对每帧主播行为图像进行语义分割处理、跟踪处理以及行为分析,得到主播显示形象、主播的运动轨迹信息以及主播显示形象的行为类别。
在一些实施例中，通过语义分割模型对每帧主播行为图像进行语义分割处理，得到每帧主播人像的分割结果，作为每帧主播显示形象。通过动作识别模型对主播显示形象进行识别，得到主播显示形象的行为类别。通过目标跟踪算法对多帧主播行为图像进行跟踪处理，得到主播的运动轨迹信息。进一步地，主播显示形象的行为类别也可以通过目标跟踪算法对多帧主播行为图像进行行为检测得到，在此不做具体限定。
在步骤S405中,在主播端的直播场景中渲染主播显示形象以及主播显示形象的运动轨迹,并根据与主播显示形象的行为类别对应的渲染方式,对直播场景中的主播显示形象进行渲染。
在步骤S406中,将主播显示形象、主播的运动轨迹信息以及主播显示形象的行为类别发送至服务器,以使服务器将主播显示形象、主播的运动轨迹信息以及主播显示形象的行为类别发送至所有观众端。通过观众端在直播场景中渲染主播显示形象以及主播显示形象的运动轨迹,并根据与主播显示形象的行为类别对应的渲染方式,对直播场景中的主播显示形象进行渲染。
在步骤S407中,响应于互动观众端的互动请求,获取许可指令以及为互动观众端的观众显示形象分配的初始位置信息。
在步骤S408中,向互动观众端发送互动请求的确认消息。
在步骤S409中,获取互动观众的观众显示形象,根据初始位置信息将观众显示形象渲染至对应的初始位置。其中,互动观众的观众显示形象是互动观众端或者主播端在检测直播场景中的显示形象数量未超过数量阈值、且互动观众端确定摄像头可以采集到观众的全身图像时,根据采集的观众行为图像得到的。
在步骤S410中,继续获取观众显示形象、互动观众的运动轨迹信息以及观众显示形象的行为类别。其中,观众显示形象、互动观众的运动轨迹信息以及观众显示形象的行为类别可以参照步骤S404得到,在此不做具体阐述。
在步骤S411中，主播端在直播场景中渲染观众显示形象、观众显示形象的运动轨迹，并按照与观众显示形象的行为类别对应的渲染方式，对直播场景中的观众显示形象进行渲染。图5示例性示出了一个实施例中主播端显示的直播场景。其中，直播场景为预先选择的虚拟场景，主播显示形象和观众显示形象为通过语义分割模型得到的真实主播人像和真实观众人像。
图6是根据一示例性实施例示出的一种直播互动方法的流程图，如图6所示，直播互动方法用于互动观众端中，包括以下步骤。
在步骤S601中,互动观众端在直播间界面中显示直播场景。
在步骤S602中,获取主播显示形象、主播的运动轨迹信息以及主播显示形象的行为类别。
在步骤S603中,在直播场景中渲染主播显示形象以及主播显示形象的运动轨迹,并根据与主播显示形象的行为类别对应的渲染方式,对直播场景中的主播显示形象进行渲染。
在步骤S604中,响应于互动观众触发的互动请求,获取直播场景中的显示形象数量,并在显示形象数量未达到数量阈值时,向主播端发送互动请求。
在步骤S605中,接收主播端发送的互动请求的确认消息,确认消息携带初始位置信息,根据确认消息采集互动观众的行为数据。互动观众的行为数据可以是通过摄像头采集的互动观众的观众行为图像。
在步骤S606中,当根据观众行为图像能够识别出互动观众的全身形象时,对互动观众的行为图像进行语义分割处理,得到观众显示形象,并根据初始位置信息将观众显示形象渲染至对应的初始位置。
在步骤S607中,将观众显示形象发送至服务器,以使服务器将该观众显示形象发送至主播端和其他所有观众端。
在步骤S608中,继续采集观众的连续多帧观众行为图像。
在步骤S609中,对每帧观众行为图像进行语义分割处理、跟踪处理以及行为分析,得到观众显示形象、互动观众的运动轨迹信息以及观众显示形象的行为类别。
在一些实施例中,通过语义分割模型对每帧观众行为图像进行语义分割处理,得到每帧观众人像分割结果,作为每帧观众显示形象。通过动作识别模型对观众显示形象进行识别,得到观众显示形象的行为类别。通过目标跟踪算法对多帧观众行为图像进行跟踪处理,得到观众的运动轨迹信息。进一步地,观众形象的行为类别也可以通过目标跟踪算法对多帧观众行为图像进行行为检测得到,在此不做具体限定。
在步骤S610中,在直播场景中渲染观众显示形象以及观众显示形象的运动轨迹,并根据与观众显示形象的行为类别对应的渲染方式,对直播场景中的观众显示形象进行渲染。其中,互动观众端与主播端呈现相同的直播场景,具体可以参照图5的直播场景示意图。
在步骤S611中,将观众显示形象、互动观众的运动轨迹信息以及观众显示形象的行为类别发送至服务器,以使服务器将观众显示形象、观众的运动轨迹信息以及观众显示形象的行为类别发送至主播端和其他所有观众端。通过主播端和其他所有观众端在直播场景 中渲染观众显示形象以及观众显示形象的运动轨迹,并根据与观众显示形象的行为类别对应的特效渲染方式,对直播场景中的观众显示形象进行渲染。
应该理解的是,虽然上述流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,上述流程图的至少一部分步骤可以包括多个步骤或者多个阶段,这些步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤中的步骤或者阶段的至少一部分轮流或者交替地执行。
图7是根据一示例性实施例示出的一种直播互动装置700框图。参照图7,该装置700包括显示模块701、采集模块702、显示形象生成模块703、第一渲染模块704、获取模块705和第二渲染模块706。
显示模块701,被配置为在直播间界面中显示直播场景;采集模块702,被配置为采集第一目标对象的行为数据;显示形象生成模块703,被配置为根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象;第一渲染模块704,被配置为在直播场景渲染第一显示形象;获取模块705,被配置为获取第二目标对象的第二显示形象,第二显示形象是根据第二目标对象的行为数据生成的;第二渲染模块706,还被配置为在直播场景中渲染第二显示形象。
在一示例性实施例中,采集模块702,被配置为采集第一目标对象的多帧行为图像;所述装置700还包括:图像分割模块,被配置为对每帧行为图像进行语义分割处理,得到每帧第一显示形象;第一渲染模块704,还被配置为在直播场景中渲染每帧第一显示形象。
在一示例性实施例中，图像分割模块，包括：发送单元，被配置为将多帧行为图像发送至服务器；接收单元，被配置为接收服务器发送的对每帧行为图像进行语义分割处理得到的每帧第一显示形象。
在一示例性实施例中,第一渲染模块704,包括:跟踪单元,被配置为对多帧行为图像进行跟踪处理,得到第一目标对象的运动轨迹信息;第一渲染单元,被配置为根据第一目标对象的运动轨迹信息,在直播场景中渲染第一显示形象的运动轨迹;第二渲染模块706,包括:轨迹信息获取单元,被配置为获取第二目标对象的运动轨迹信息;第二渲染单元,被配置为根据第二目标对象的运动轨迹信息,在直播场景中渲染每帧第二显示形象的运动轨迹。
在一示例性实施例中,跟踪单元,被配置为将多帧行为图像发送至服务器;接收服务器发送的对多帧行为图像进行跟踪处理得到的第一目标对象的运动轨迹信息。
在一示例性实施例中,获取模块705,还被配置为获取直播场景的场景显示参数以及图像采集设备的设备参数;所述装置700还包括:图像调整模块,被配置为根据场景显示参数和设备参数,对每帧行为图像进行调整;图像分割模块,被配置为对调整后的每帧行为图像进行语义分割处理。
在一示例性实施例中,第一渲染模块704,包括:行为分析单元,被配置为对第一显示形象进行行为分析,得到第一显示形象的行为类别;第三渲染单元,被配置为按照与行为类别对应的渲染方式在直播场景中渲染第一显示形象;第二渲染模块706,包括:行为类别获取单元,被配置为获取第二显示形象的行为类别;第四渲染单元,被配置为按照与第二显示形象的行为类别对应的渲染方式,在直播场景中渲染第二显示形象。
在一示例性实施例中,第一目标对象为主播,第二目标对象为观众;获取模块705,被配置为响应于第二目标对象的互动请求,根据互动请求获取第二目标对象的第二显示形象。
在一示例性实施例中,第一目标对象为观众,第二目标对象为主播或观众;采集模块702,被配置为响应于第一目标对象的互动请求,接收互动请求的确认消息,根据互动请求的确认消息采集第一目标对象的行为数据。
在一示例性实施例中,采集模块702,包括:数量获取单元,被配置为响应于第一目标对象的互动请求,获取直播场景中的显示形象数量;上传单元,被配置为当显示形象数量未达到数量阈值时,上传互动请求;采集单元,被配置为接收互动请求的确认消息,根据确认消息采集第一目标对象的行为数据。
在一示例性实施例中,采集模块702,被配置为采集第一目标对象的行为数据;当根据第一目标对象的行为数据识别出第一目标对象的全身形象时,根据第一目标对象的行为数据生成第一目标对象对应的第一显示形象。
关于上述实施例中的装置,其中各个模块执行操作的具体方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。
图8是根据一示例性实施例示出的一种用于直播互动的设备800的框图。例如,设备800可以是移动电话、计算机、数字广播终端、消息收发设备、游戏控制台、平板设备、医疗设备、健身设备、个人数字助理等。
参照图8,设备800可以包括以下一个或多个组件:处理组件802、存储器804、电源组件806、多媒体组件808、音频组件810、输入/输出(I/O)的接口812、传感器组件814以及通信组件816。
处理组件802通常控制设备800的整体操作,诸如与显示、电话呼叫、数据通信、相机操作和记录操作相关联的操作。处理组件802可以包括一个或多个处理器820来执行指令,以完成上述的方法的全部或部分步骤。此外,处理组件802可以包括一个或多个模块,便于处理组件802和其他组件之间的交互。例如,处理组件802可以包括多媒体模块,以方便多媒体组件808和处理组件802之间的交互。
存储器804被配置为存储各种类型的数据以支持在设备800的操作。这些数据的示例包括用于在设备800上操作的任何应用程序或方法的指令、联系人数据、电话簿数据、消息、图片、视频等。存储器804可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM)、电可擦除可编程只读存储器(EEPROM)、 可擦除可编程只读存储器(EPROM)、可编程只读存储器(PROM)、只读存储器(ROM)、磁存储器、快闪存储器、磁盘或光盘。
电源组件806为设备800的各种组件提供电力。电源组件806可以包括电源管理系统,一个或多个电源,及其他与为设备800生成、管理和分配电力相关联的组件。
多媒体组件808包括在所述设备800和用户之间的提供一个输出接口的屏幕。在一些实施例中,屏幕可以包括液晶显示器(LCD)和触摸面板(TP)。如果屏幕包括触摸面板,屏幕可以被实现为触摸屏,以接收来自用户的输入信号。触摸面板包括一个或多个触摸传感器以感测触摸、滑动和触摸面板上的手势。所述触摸传感器可以不仅感测触摸或滑动动作的边界,而且还检测与所述触摸或滑动操作相关的持续时间和压力。在一些实施例中,多媒体组件808包括一个前置摄像头和/或后置摄像头。当设备800处于操作模式,如拍摄模式或视频模式时,前置摄像头和/或后置摄像头可以接收外部的多媒体数据。每个前置摄像头和后置摄像头可以是一个固定的光学透镜系统或具有焦距和光学变焦能力。
音频组件810被配置为输出和/或输入音频信号。例如,音频组件810包括一个麦克风(MIC),当设备800处于操作模式,如呼叫模式、记录模式和语音识别模式时,麦克风被配置为接收外部音频信号。所接收的音频信号可以被进一步存储在存储器804或经由通信组件816发送。在一些实施例中,音频组件810还包括一个扬声器,用于输出音频信号。
I/O接口812为处理组件802和外围接口模块之间提供接口,上述外围接口模块可以是键盘,点击轮,按钮等。这些按钮可包括但不限于:主页按钮、音量按钮、启动按钮和锁定按钮。
传感器组件814包括一个或多个传感器,用于为设备800提供各个方面的状态评估。例如,传感器组件814可以检测到设备800的打开/关闭状态,组件的相对定位,例如所述组件为设备800的显示器和小键盘,传感器组件814还可以检测设备800或设备800一个组件的位置改变,用户与设备800接触的存在或不存在,设备800方位或加速/减速和设备800的温度变化。传感器组件814可以包括接近传感器,被配置用来在没有任何的物理接触时检测附近物体的存在。传感器组件814还可以包括光传感器,如CMOS或CCD图像传感器,用于在成像应用中使用。在一些实施例中,该传感器组件814还可以包括加速度传感器、陀螺仪传感器、磁传感器、压力传感器或温度传感器。
通信组件816被配置为便于设备800和其他设备之间有线或无线方式的通信。设备800可以接入基于通信标准的无线网络,如WiFi,运营商网络(如2G、3G、4G或5G),或它们的组合。在一个示例性实施例中,通信组件816经由广播信道接收来自外部广播管理系统的广播信号或广播相关信息。在一个示例性实施例中,所述通信组件816还包括近场通信(NFC)模块,以促进短程通信。例如,在NFC模块可基于射频识别(RFID)技术,红外数据协会(IrDA)技术,超宽带(UWB)技术,蓝牙(BT)技术和其他技术来实现。
在示例性实施例中,设备800可以被一个或多个应用专用集成电路(ASIC)、数字 信号处理器(DSP)、数字信号处理设备(DSPD)、可编程逻辑器件(PLD)、现场可编程门阵列(FPGA)、控制器、微控制器、微处理器或其他电子元件实现,用于执行上述方法。
在示例性实施例中,还提供了一种包括指令的非临时性计算机可读存储介质,例如包括指令的存储器804,上述指令可由设备800的处理器820执行以完成上述方法。例如,所述非临时性计算机可读存储介质可以是ROM、随机存取存储器(RAM)、CD-ROM、磁带、软盘和光数据存储设备等。
本公开所有实施例均可以单独被执行,也可以与其他实施例相结合被执行,均视为本公开要求的保护范围。

Claims (25)

  1. 一种直播互动方法,其中,所述方法应用于直播端或者观众端,包括:
    在直播间界面中显示直播场景;
    采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象;
    获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;
    在所述直播场景中渲染所述第二显示形象。
  2. 根据权利要求1所述的直播互动方法,其中,所述采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象,包括:
    采集所述第一目标对象的多帧行为图像,对每帧行为图像进行语义分割处理,得到所述每帧第一显示形象,在所述直播场景中渲染所述每帧第一显示形象。
  3. 根据权利要求2所述的直播互动方法,其中,所述对每帧行为图像进行语义分割处理,包括:
    将所述多帧行为图像发送至服务器;
    接收所述服务器发送的对所述每帧行为图像进行语义分割处理得到的所述每帧第一显示形象。
  4. 根据权利要求2所述的直播互动方法,其中,所述在所述直播场景渲染所述第一显示形象,包括:对所述多帧行为图像进行跟踪处理,得到所述第一目标对象的运动轨迹信息;根据所述第一目标对象的运动轨迹信息,在所述直播场景中渲染所述第一显示形象的运动轨迹;
    所述在所述直播场景中渲染所述第二显示形象,包括:获取所述第二目标对象的运动轨迹信息,根据所述第二目标对象的运动轨迹信息,在所述直播场景中渲染每帧第二显示形象的运动轨迹。
  5. 根据权利要求4所述的直播互动方法,其中,所述对所述多帧行为图像进行跟踪处理,得到所述第一目标对象的运动轨迹信息,包括:
    将所述多帧行为图像发送至服务器;
    接收所述服务器发送的对所述多帧行为图像进行跟踪处理得到的所述第一目标对象的运动轨迹信息。
  6. 根据权利要求2所述的直播互动方法,其中,所述方法还包括:
    获取所述直播场景的场景显示参数以及所述直播端或观众端的图像采集设备的设备参数;
    根据所述场景显示参数和所述设备参数,对所述每帧行为图像进行调整;
    所述对每帧行为图像进行语义分割处理，包括：对调整后的所述每帧行为图像进行语义分割处理。
  7. 根据权利要求1所述的直播互动方法,其中,所述在所述直播场景渲染所述第一显示形象,包括,对所述第一显示形象进行行为分析,得到所述第一显示形象的行为类别,按照与所述行为类别对应的渲染方式在所述直播场景中渲染所述第一显示形象;
    所述在所述直播场景中渲染所述第二显示形象,包括:
    获取所述第二显示形象的行为类别,按照与所述第二显示形象的行为类别对应的渲染方式,在所述直播场景中渲染所述第二显示形象。
  8. 根据权利要求1所述的直播互动方法,其中,所述第一目标对象为主播,所述第二目标对象为观众;所述获取第二目标对象的第二显示形象,包括:
    响应于第二目标对象的互动请求,根据所述互动请求获取第二目标对象的第二显示形象。
  9. 根据权利要求1所述的直播互动方法,其中,所述第一目标对象为观众,所述第二目标对象为主播或观众;所述采集第一目标对象的行为数据,包括:
    响应于第一目标对象的互动请求,接收所述互动请求的确认消息,根据所述互动请求的确认消息采集所述第一目标对象的行为数据。
  10. 根据权利要求9所述的直播互动方法,其中,所述响应于第一目标对象的互动请求,接收所述互动请求的确认消息,根据所述互动请求的确认消息采集所述第一目标对象的行为数据,包括:
    响应于所述第一目标对象的互动请求,获取所述直播场景中的显示形象数量;
    响应于所述显示形象数量未达到数量阈值,上传所述互动请求;
    接收所述互动请求的确认消息,根据所述确认消息采集所述第一目标对象的行为数据。
  11. 根据权利要求9所述的直播互动方法,其中,所述采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,包括:
    采集所述第一目标对象的行为数据;
    响应于根据所述第一目标对象的行为数据识别出所述第一目标对象的全身形象,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象。
  12. 一种直播互动装置,其中,所述装置包括:
    显示模块,被配置为在直播间界面中显示直播场景;
    采集模块,被配置为采集第一目标对象的行为数据;
    显示形象生成模块,被配置为根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象;
    第一渲染模块,被配置为在所述直播场景渲染所述第一显示形象;
    获取模块,被配置为获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;
    第二渲染模块,还被配置为在所述直播场景中渲染所述第二显示形象。
  13. 根据权利要求12所述的直播互动装置,其中,所述采集模块,被配置为采集所述第一目标对象的多帧行为图像;
    所述装置还包括:图像分割模块,被配置为对每帧行为图像进行语义分割处理,得到所述每帧第一显示形象;
    所述第一渲染模块,还被配置为在所述直播场景中渲染所述每帧第一显示形象。
  14. 根据权利要求13所述的直播互动装置,其中,所述图像分割模块,包括:
    发送单元,被配置为将所述多帧行为图像发送至服务器;
    接收单元，被配置为接收所述服务器发送的对所述每帧行为图像进行语义分割处理得到的所述每帧第一显示形象。
  15. 根据权利要求13所述的直播互动装置,其中,所述第一渲染模块,包括:
    跟踪单元,被配置为对所述多帧行为图像进行跟踪处理,得到所述第一目标对象的运动轨迹信息;
    第一渲染单元,被配置为根据所述第一目标对象的运动轨迹信息,在所述直播场景中渲染所述第一显示形象的运动轨迹;
    所述第二渲染模块,包括:
    轨迹信息获取单元,被配置为获取所述第二目标对象的运动轨迹信息;
    第二渲染单元,被配置为根据所述第二目标对象的运动轨迹信息,在所述直播场景中渲染每帧第二显示形象的运动轨迹。
  16. 根据权利要求15所述的直播互动装置,其中,所述跟踪单元,被配置为将所述多帧行为图像发送至服务器;接收所述服务器发送的对所述多帧行为图像进行跟踪处理得到的所述第一目标对象的运动轨迹信息。
  17. 根据权利要求13所述的直播互动装置,其中,所述获取模块,还被配置为获取所述直播场景的场景显示参数以及图像采集设备的设备参数;
    所述装置还包括:图像调整模块,被配置为根据所述场景显示参数和所述设备参数,对所述每帧行为图像进行调整;
    所述图像分割模块,被配置为对调整后的所述每帧行为图像进行语义分割处理。
  18. 根据权利要求12所述的直播互动装置,其中,所述第一渲染模块,包括:行为分析单元,被配置为对所述第一显示形象进行行为分析,得到所述第一显示形象的行为类别;
    第三渲染单元,被配置为按照与所述行为类别对应的渲染方式在所述直播场景中渲染所述第一显示形象;
    所述第二渲染模块,包括:
    行为类别获取单元,被配置为获取所述第二显示形象的行为类别;
    第四渲染单元,被配置为按照与所述第二显示形象的行为类别对应的渲染方式,在所述直播场景中渲染所述第二显示形象。
  19. 根据权利要求12所述的直播互动装置,其中,所述第一目标对象为主播,所述第二目标对象为观众;所述获取模块,被配置为响应于第二目标对象的互动请求,根据所述互动请求获取第二目标对象的第二显示形象。
  20. 根据权利要求12所述的直播互动装置,其中,所述第一目标对象为观众,所述第二目标对象为主播或观众;所述采集模块,被配置为响应于第一目标对象的互动请求,接收所述互动请求的确认消息,根据所述互动请求的确认消息采集所述第一目标对象的行为数据。
  21. 根据权利要求20所述的直播互动装置，其中，所述采集模块，包括：
    数量获取单元,被配置为响应于所述第一目标对象的互动请求,获取所述直播场景中的显示形象数量;
    上传单元,被配置为响应于所述显示形象数量未达到数量阈值,上传所述互动请求;
    采集单元,被配置为接收所述互动请求的确认消息,根据所述确认消息采集所述第一目标对象的行为数据。
  22. 根据权利要求20所述的直播互动装置,其中,所述采集模块,被配置为采集所述第一目标对象的行为数据;响应于根据所述第一目标对象的行为数据识别出所述第一目标对象的全身形象,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象。
  23. 一种电子设备,其中,所述电子设备包括:
    处理器;
    用于存储所述处理器可执行指令的存储器;
    其中,所述处理器被配置为执行所述指令,以实现一种直播互动方法,所述方法包括:
    在直播间界面中显示直播场景;
    采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象;
    获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;
    在所述直播场景中渲染所述第二显示形象。
  24. 一种非易失性存储介质,当所述存储介质中的指令由电子设备的处理器执行时,使得所述电子设备能够执行一种直播互动方法,所述方法包括:
    在直播间界面中显示直播场景;
    采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象;
    获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;
    在所述直播场景中渲染所述第二显示形象。
  25. 一种计算机程序产品,包括计算机程序,其中,所述计算机程序被处理器执行时实现一种直播互动方法,所述方法包括:
    在直播间界面中显示直播场景;
    采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象;
    获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;
    在所述直播场景中渲染所述第二显示形象。
PCT/CN2021/117040 2020-09-22 2021-09-07 直播互动方法及装置 WO2022062896A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011001739.6 2020-09-22
CN202011001739.6A CN112153400B (zh) 2020-09-22 2020-09-22 直播互动方法、装置、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022062896A1 true WO2022062896A1 (zh) 2022-03-31

Family

ID=73893673

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/117040 WO2022062896A1 (zh) 2020-09-22 2021-09-07 直播互动方法及装置

Country Status (2)

Country Link
CN (1) CN112153400B (zh)
WO (1) WO2022062896A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114979682A (zh) * 2022-04-19 2022-08-30 阿里巴巴(中国)有限公司 多主播虚拟直播方法以及装置
CN115086693A (zh) * 2022-05-07 2022-09-20 北京达佳互联信息技术有限公司 虚拟对象交互方法、装置、电子设备和存储介质
CN115190347A (zh) * 2022-05-31 2022-10-14 北京达佳互联信息技术有限公司 消息处理方法、消息处理装置、电子设备和存储介质
CN115426509A (zh) * 2022-08-15 2022-12-02 北京奇虎科技有限公司 直播信息同步方法、装置、设备及存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112153400B (zh) * 2020-09-22 2022-12-06 北京达佳互联信息技术有限公司 直播互动方法、装置、电子设备及存储介质
CN113660503B (zh) * 2021-08-17 2024-04-26 广州博冠信息科技有限公司 同屏互动控制方法及装置、电子设备、存储介质
CN113900522A (zh) * 2021-09-30 2022-01-07 温州大学大数据与信息技术研究院 一种虚拟形象的互动方法、装置
CN113965812B (zh) * 2021-12-21 2022-03-25 广州虎牙信息科技有限公司 直播方法、系统及直播设备
CN115396688B (zh) * 2022-10-31 2022-12-27 北京玩播互娱科技有限公司 一种基于虚拟场景的多人互动网络直播方法及系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789991A (zh) * 2016-12-09 2017-05-31 福建星网视易信息系统有限公司 一种基于虚拟场景的多人互动方法及系统
CN107613310A (zh) * 2017-09-08 2018-01-19 广州华多网络科技有限公司 一种直播方法、装置及电子设备
CN110519611A (zh) * 2019-08-23 2019-11-29 腾讯科技(深圳)有限公司 直播互动方法、装置、电子设备及存储介质
US20200099960A1 (en) * 2016-12-19 2020-03-26 Guangzhou Huya Information Technology Co., Ltd. Video Stream Based Live Stream Interaction Method And Corresponding Device
CN112153400A (zh) * 2020-09-22 2020-12-29 北京达佳互联信息技术有限公司 直播互动方法、装置、电子设备及存储介质

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105812813B (zh) * 2016-03-21 2019-07-12 深圳宸睿科技有限公司 一种授课视频压缩、播放方法及压缩、播放装置
WO2018033156A1 (zh) * 2016-08-19 2018-02-22 北京市商汤科技开发有限公司 视频图像的处理方法、装置和电子设备
CN106804007A (zh) * 2017-03-20 2017-06-06 合网络技术(北京)有限公司 一种网络直播中自动匹配特效的方法、系统及设备
CN107750014B (zh) * 2017-09-25 2020-10-16 迈吉客科技(北京)有限公司 一种连麦直播方法和系统
CN109874021B (zh) * 2017-12-04 2021-05-11 腾讯科技(深圳)有限公司 直播互动方法、装置及系统
CN108154086B (zh) * 2017-12-06 2022-06-03 北京奇艺世纪科技有限公司 一种图像提取方法、装置及电子设备
CN109963163A (zh) * 2017-12-26 2019-07-02 阿里巴巴集团控股有限公司 网络视频直播方法、装置及电子设备
CN110163861A (zh) * 2018-07-11 2019-08-23 腾讯科技(深圳)有限公司 图像处理方法、装置、存储介质和计算机设备
CN109271553A (zh) * 2018-08-31 2019-01-25 乐蜜有限公司 一种虚拟形象视频播放方法、装置、电子设备及存储介质
CN109766473B (zh) * 2018-11-30 2019-12-24 北京达佳互联信息技术有限公司 信息交互方法、装置、电子设备及存储介质
CN110691279A (zh) * 2019-08-13 2020-01-14 北京达佳互联信息技术有限公司 一种虚拟直播的方法、装置、电子设备及存储介质
CN111641843A (zh) * 2020-05-29 2020-09-08 广州华多网络科技有限公司 直播间中虚拟蹦迪活动展示方法、装置、介质及电子设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789991A (zh) * 2016-12-09 2017-05-31 福建星网视易信息系统有限公司 一种基于虚拟场景的多人互动方法及系统
US20200099960A1 (en) * 2016-12-19 2020-03-26 Guangzhou Huya Information Technology Co., Ltd. Video Stream Based Live Stream Interaction Method And Corresponding Device
CN107613310A (zh) * 2017-09-08 2018-01-19 广州华多网络科技有限公司 一种直播方法、装置及电子设备
CN110519611A (zh) * 2019-08-23 2019-11-29 腾讯科技(深圳)有限公司 直播互动方法、装置、电子设备及存储介质
CN112153400A (zh) * 2020-09-22 2020-12-29 北京达佳互联信息技术有限公司 直播互动方法、装置、电子设备及存储介质

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114979682A (zh) * 2022-04-19 2022-08-30 阿里巴巴(中国)有限公司 多主播虚拟直播方法以及装置
CN114979682B (zh) * 2022-04-19 2023-10-13 阿里巴巴(中国)有限公司 多主播虚拟直播方法以及装置
CN115086693A (zh) * 2022-05-07 2022-09-20 北京达佳互联信息技术有限公司 虚拟对象交互方法、装置、电子设备和存储介质
CN115190347A (zh) * 2022-05-31 2022-10-14 北京达佳互联信息技术有限公司 消息处理方法、消息处理装置、电子设备和存储介质
CN115190347B (zh) * 2022-05-31 2024-01-02 北京达佳互联信息技术有限公司 消息处理方法、消息处理装置、电子设备和存储介质
CN115426509A (zh) * 2022-08-15 2022-12-02 北京奇虎科技有限公司 直播信息同步方法、装置、设备及存储介质
CN115426509B (zh) * 2022-08-15 2024-04-16 北京奇虎科技有限公司 直播信息同步方法、装置、设备及存储介质

Also Published As

Publication number Publication date
CN112153400B (zh) 2022-12-06
CN112153400A (zh) 2020-12-29

Similar Documents

Publication Publication Date Title
WO2022062896A1 (zh) 直播互动方法及装置
CN106791893B (zh) 视频直播方法及装置
CN108495032B (zh) 图像处理方法、装置、存储介质及电子设备
CN111314617B (zh) 视频数据处理方法、装置、电子设备及存储介质
US20220150594A1 (en) Method for message interaction, terminal and storage medium
US20220150598A1 (en) Method for message interaction, and electronic device
CN112905074B (zh) 交互界面展示方法、交互界面生成方法、装置及电子设备
CN110677734B (zh) 视频合成方法、装置、电子设备及存储介质
WO2022077970A1 (zh) 特效添加方法及装置
CN114025105B (zh) 视频处理方法、装置、电子设备、存储介质
CN114009003A (zh) 图像采集方法、装置、设备及存储介质
CN112312190A (zh) 视频画面的展示方法、装置、电子设备和存储介质
CN109218709B (zh) 全息内容的调整方法及装置和计算机可读存储介质
CN107105311B (zh) 直播方法及装置
CN116939275A (zh) 直播虚拟资源展示方法、装置、电子设备、服务器及介质
CN108986803B (zh) 场景控制方法及装置、电子设备、可读存储介质
CN111586296B (zh) 图像拍摄方法、图像拍摄装置及存储介质
CN115914721A (zh) 直播画面处理方法、装置、电子设备及存储介质
CN113315903B (zh) 图像获取方法和装置、电子设备、存储介质
CN108769513B (zh) 相机拍照方法及装置
CN110312117B (zh) 数据刷新方法及装置
KR20210157289A (ko) 촬영 프리뷰 이미지를 표시하는 방법, 장치 및 매체
CN113747113A (zh) 图像显示方法及装置、电子设备、计算机可读存储介质
CN111356001A (zh) 视频显示区域获取方法以及视频画面的显示方法、装置
CN111385400A (zh) 背光亮度调节方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21871268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21871268

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20-09-2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21871268

Country of ref document: EP

Kind code of ref document: A1