WO2022062896A1 - Live broadcast interaction method and apparatus - Google Patents
Live broadcast interaction method and apparatus
- Publication number: WO2022062896A1 (application PCT/CN2021/117040)
- Authority
- WO
- WIPO (PCT)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/2187—Live feed
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/44012—Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
- H04N21/485—End-user interface for client configuration
Definitions
- the present disclosure relates to the field of Internet technologies, and in particular, to a live interactive method, device, electronic device, and storage medium.
- Interactive live broadcast is an enhanced form of live video that adds interactive functions, such as voice and video interaction, to the live video stream.
- the present disclosure provides a live interactive method, device, electronic device and storage medium.
- the technical solutions of the present disclosure are as follows:
- a live interactive method including:
- a second display image is rendered in the live broadcast scene.
- a live interactive device including:
- the display module is configured to display the live broadcast scene in the interface of the live broadcast room;
- a collection module configured to collect behavior data of the first target object
- a display image generation module configured to generate a first display image corresponding to the first target object according to the behavior data of the first target object
- a first rendering module configured to render the first display image in the live broadcast scene
- an acquisition module configured to acquire a second display image of the second target object, and the second display image is generated according to the behavior data of the second target object;
- the second rendering module is further configured to render the second display image in the live broadcast scene.
- an electronic device comprising:
- a processor for executing instructions stored in a memory;
- the processor is configured to execute the instructions to implement the live interaction method described in any one of the embodiments of the first aspect.
- a storage medium, wherein when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the live interaction method described in any one of the embodiments of the first aspect.
- a computer program product comprising a computer program stored in a readable storage medium; at least one processor of a device reads and executes the computer program from the readable storage medium, so that the device executes the live interaction method described in any one of the embodiments of the first aspect.
- a live broadcast scene is pre-established, and the same live broadcast scene is displayed on both the host terminal and the viewer terminal; the host terminal and the viewer terminal each collect the behavior data of their respective users to generate display images, and transmit the display images bidirectionally, so that the host and the audience can interact through real-world behaviors in the same virtual scene, making the live broadcast interaction more comprehensive.
- FIG. 1 is an application environment diagram of a live broadcast interaction method according to an exemplary embodiment.
- Fig. 2 is a flow chart of a live interactive method according to an exemplary embodiment.
- Fig. 3 is a flowchart showing a step of collecting behavior data according to an exemplary embodiment.
- Fig. 4 is a flow chart of a live interactive method according to an exemplary embodiment.
- Fig. 5 is a schematic diagram of a live broadcast scene according to an exemplary embodiment.
- Fig. 6 is a flow chart of a live interactive method according to another exemplary embodiment.
- Fig. 7 is a block diagram of a live interactive device according to an exemplary embodiment.
- Fig. 8 is an internal structure diagram of an electronic device according to an exemplary embodiment.
- the related method, apparatus, device and storage medium can obtain relevant information of the user.
- the present disclosure provides a live interactive method.
- the live interactive method provided by the present disclosure can be applied to the application environment shown in FIG. 1 .
- the host terminal 110 and the server 120 communicate through the network, and at least one viewer terminal 130 and the server 120 communicate through the network.
- the viewer terminals 130 at least include viewer terminals participating in the live broadcast interaction (hereinafter referred to as interactive viewer terminals).
- An application program that can be used for live broadcasting is installed in the host terminal 110 .
- An application program that can be used to watch the live broadcast is installed in the viewer terminal 130 .
- the application installed in the host terminal 110 for live broadcast and the application installed in the viewer terminal 130 for watching the live broadcast may be the same application.
- when the host terminal 110 creates a live broadcast room, it can obtain the live broadcast scene material selected by the host to establish the live broadcast room.
- the host 110 collects the host's behavior data, generates the host's display image corresponding to the host according to the host's behavior data, and renders the host's display image in the live broadcast scene.
- the viewer terminal 130 enters the live broadcast room and displays the live broadcast scene, including the display image of the host, on its screen. Some or all of the viewer terminals 130 (the interactive viewer terminals) may request the host terminal 110 to perform live interaction.
- the interactive audience terminal collects the behavior data of the interactive audience, generates the interactive audience display image corresponding to the interactive audience according to the behavior data of the interactive audience, and renders the interactive audience display image in the live broadcast scene.
- the interactive viewer terminal sends the interactive viewer display image to the server 120, so that the server 120 forwards it to the host terminal 110 and the other viewer terminals not participating in the interaction, which then render the interactive viewer display image in the live broadcast scene.
- the host terminal 110 can be, but is not limited to, various personal computers, laptops, smartphones, and tablet computers
- the server 120 can be implemented by an independent server or a server cluster composed of multiple servers
- the viewer terminal 130 can be, but is not limited to, various personal computers, laptops, smartphones, and tablet computers
- FIG. 2 is a flow chart of a live broadcast interaction method according to an exemplary embodiment. As shown in FIG. 2, the live broadcast interaction method is used for the host terminal 110 or an interactive viewer terminal among the viewer terminals 130 in FIG. 1, and includes the following steps.
- step S210 the live broadcast scene is displayed on the interface of the live broadcast room.
- the live broadcast scene refers to a virtual scene set for the live broadcast room.
- the material of the live broadcast scene can be pre-configured, for example, it can be a game scene, a virtual image background, etc., or it can be obtained by the user in the album of the terminal device;
- the host can trigger the creation request of the live room through the host terminal.
- the host terminal obtains the material of the live broadcast scene in response to the creation request of the live broadcast room; and creates the live broadcast scene according to the obtained material of the live broadcast scene.
- the host side displays the created live broadcast scene.
- the audience can enter the live room through search, hotspot recommendation, etc., and the screen of the audience will display the same live scene as the host.
- step S220 the behavior data of the first target object is collected, a first display image corresponding to the first target object is generated according to the behavior data of the first target object, and the first display image is rendered in the live broadcast scene.
- the first target object may be a host or an interactive audience participating in the live broadcast interaction. Interactive viewers can be all or part of the viewers who are watching the live broadcast.
- the behavior data of the first target object is collected in real time through an image acquisition device. Corresponding processing is performed on the behavior data of the first target object, a first display image corresponding to the first target object is generated, and the first display image is rendered in the live broadcast scene of the first client.
- the behavior data of the first target object is not limited to video data, voice data or text comment data of the first target object.
- the first displayed image corresponding to the first target object may be obtained based on the deep learning theory.
- if the behavior data of the first target object is a behavior image obtained by photographing the first target object, the first display image may be the first target object image obtained by semantically segmenting the behavior image, or may be a three-dimensional model driven by the human body pose estimation result of the first target object; if the behavior data of the first target object is voice data obtained by voice collection of the first target object, the first display image may be the related text content obtained by performing speech recognition on the voice data.
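As an illustrative sketch only (not the patent's implementation), the choice of display image based on the kind of behavior data could be dispatched as follows; `segment_person`, `drive_3d_model`, and `speech_to_text` are hypothetical stubs standing in for the semantic-segmentation, pose-driven-model, and speech-recognition steps described above:

```python
# Sketch: map behavior data to a "display image" by data type.
# All three generators are illustrative stand-ins, not real models.

def segment_person(frame):
    # stand-in for semantic segmentation: keep only "person" pixels
    return {"kind": "portrait", "pixels": [p for p in frame if p == "person"]}

def drive_3d_model(pose):
    # stand-in for a 3D model driven by a human pose estimate
    return {"kind": "3d_model", "pose": pose}

def speech_to_text(audio):
    # stand-in for speech recognition producing related text content
    return {"kind": "text", "content": "transcript of %d samples" % len(audio)}

def generate_display_image(behavior_data):
    """Dispatch behavior data (a (type, payload) pair) to a generator."""
    kind, payload = behavior_data
    if kind == "image":
        return segment_person(payload)
    if kind == "pose":
        return drive_3d_model(payload)
    if kind == "voice":
        return speech_to_text(payload)
    raise ValueError("unsupported behavior data: %s" % kind)

assert generate_display_image(("image", ["person", "bg"]))["kind"] == "portrait"
assert generate_display_image(("voice", [0, 1, 2]))["kind"] == "text"
```

The dispatch structure mirrors the alternatives listed in the paragraph above; real clients would choose a branch based on which capture device produced the data.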
- step S230 a second display image of the second target object is obtained, and the second display image is generated according to the behavior data of the second target object.
- step S240 the second display image is rendered in the live broadcast scene.
- the second target object may be a host or an interactive audience participating in the live broadcast interaction.
- when the first target object is the host, the second target object may be an interactive viewer; when the first target object is an interactive viewer, the second target object may be the host and/or other interactive viewers.
- a second display image corresponding to the second target object is generated according to the behavior data of the second target object, and the second display image is rendered in the live broadcast scene displayed on the second client.
- the second client sends the acquired second display image to the server, and the server sends the second display image corresponding to the second target object to the first client.
- the first client receives the second display image of the second target object sent by the server, and renders the second display image in the displayed live broadcast scene.
- the second client can receive the first display image corresponding to the first target object from the server and render the first display image in the live broadcast scene it displays, so that the second client and the first client present the same live broadcast scene.
- the viewer terminals not participating in the live broadcast interaction can obtain the first display image corresponding to the first target object and the second display image corresponding to the second target object from the server, and render the first display image and the second display image in the live broadcast scene displayed on the viewer terminal, so that the non-participating viewer terminals, the first client, and the second client all present the same live broadcast scene.
- a live broadcast scene is pre-established, and the same live broadcast scene is displayed on both the host terminal and the viewer terminal; the host terminal and the viewer terminal each collect the behavior data of their respective users to generate display images, and transmit the display images bidirectionally, so that the host and the audience can interact through real-world behaviors in the same virtual scene, making the live broadcast interaction more comprehensive.
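The symmetric flow described above can be sketched as a per-frame loop on each client. In this sketch the server is reduced to an in-memory relay and every processing step is a stub, so it only illustrates the data flow (collect → generate → render locally → exchange), not the patent's actual implementation:

```python
class FakeServer:
    """Stand-in for server 120: relays display images between clients."""
    def __init__(self):
        self.outbox = []

    def send(self, client_id, image):
        self.outbox.append((client_id, image))

    def receive_for(self, client_id):
        # everything sent by any *other* client
        return [img for cid, img in self.outbox if cid != client_id]

def make_display_image(client_id, frame):
    # stub for segmentation / pose estimation / speech recognition
    return {"from": client_id, "frame": frame}

def live_step(client_id, frame, server, scene):
    """One frame of the loop: collect -> generate -> render -> exchange."""
    own = make_display_image(client_id, frame)
    scene.append(own)                 # render own display image locally
    server.send(client_id, own)       # share it with the other side
    for remote in server.receive_for(client_id):
        if remote not in scene:
            scene.append(remote)      # render the peer's display image too

server, host_scene, viewer_scene = FakeServer(), [], []
live_step("host", "frame-h0", server, host_scene)
live_step("viewer", "frame-v0", server, viewer_scene)
live_step("host", "frame-h1", server, host_scene)
# both ends now contain display images from both participants
assert any(i["from"] == "viewer" for i in host_scene)
assert any(i["from"] == "host" for i in viewer_scene)
```

The point of the sketch is the symmetry: both ends run the same loop, so both scenes converge to the same set of display images.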
- step S220, in which the behavior data of the first target object is collected, the first display image corresponding to the first target object is generated according to the behavior data, and the first display image is rendered in the live broadcast scene, includes: collecting multiple frames of behavior images of the first target object, performing semantic segmentation processing on each frame of behavior image to obtain each frame of the first display image, and rendering each frame of the first display image in the live broadcast scene.
- the behavior data of the first target object may be continuous multiple frames of behavior images of the first target object collected in real time by an image acquisition device.
- the pre-configured trained semantic segmentation model is invoked.
- a first target object image is obtained by performing semantic segmentation processing on each frame of behavior image through the trained semantic segmentation model, and the obtained first target object image is used as the first display image.
- the first client renders the acquired first display image of each frame in the live broadcast scene.
- the semantic segmentation model is not limited to DeepLab (a semantic segmentation network), FCN (Fully Convolutional Networks), SegNet (a semantic segmentation network), BiSeNet (Bilateral Segmentation Network for Real-time Semantic Segmentation), etc.
- the first target object image may be a corresponding portrait of a real host or a portrait of a real interactive audience.
- the second display image of the second target object acquired by the first client is obtained by performing semantic segmentation processing on each frame of behavior image of the second target object in the same manner as described above.
- the server sends the second display image obtained by the semantic segmentation process to the first client, so that the first client can render the second display image in the live broadcast scene.
- the behavior images of the hosts and/or interactive viewers participating in the live broadcast interaction are collected and semantically segmented to obtain real portraits, and the obtained real portraits are rendered in the live broadcast scene, so that the virtual live broadcast scene is closer to a real-world scene. This can improve the authenticity of the live broadcast interaction, help keep users in the live broadcast room, and improve the user retention rate of the live broadcast application.
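A minimal sketch of the cutout step that follows segmentation: given a per-pixel person mask produced by some segmentation model (the mask is hard-coded here; in practice a network such as DeepLab, FCN, SegNet, or BiSeNet would supply it), keep the person pixels and make the background transparent so the portrait can be composited into the live scene:

```python
def apply_person_mask(frame, mask, transparent=(0, 0, 0, 0)):
    """Produce an RGBA portrait: keep pixels where mask == 1, clear the rest.

    frame: 2D list of (r, g, b) tuples; mask: 2D list of 0/1 values.
    """
    out = []
    for row_px, row_m in zip(frame, mask):
        out_row = []
        for (r, g, b), m in zip(row_px, row_m):
            out_row.append((r, g, b, 255) if m else transparent)
        out.append(out_row)
    return out

frame = [[(10, 10, 10), (200, 150, 120)],
         [(12, 12, 12), (190, 140, 110)]]
mask = [[0, 1],
        [0, 1]]  # right column is "person"
portrait = apply_person_mask(frame, mask)
assert portrait[0][0] == (0, 0, 0, 0)           # background cleared
assert portrait[0][1] == (200, 150, 120, 255)   # person kept opaque
```

Real pipelines would operate on image arrays rather than nested tuples, but the compositing logic (mask in, transparent-background portrait out) is the same.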
- performing semantic segmentation processing on each frame of behavior image includes: sending the multiple frames of behavior images to a server, and receiving each frame of the first display image obtained by the server's semantic segmentation processing.
- performing semantic segmentation processing on the multi-frame behavior images collected by the first client and/or the second client may also be performed by the server.
- the first client and/or the second client acquires the multi-frame behavior images collected by the respective image acquisition devices
- the first client and/or the second client sends the acquired multi-frame behavior images to the server in real time.
- the server invokes a pre-deployed semantic segmentation model.
- semantic segmentation processing is performed on each frame of behavior image through the semantic segmentation model to obtain the first target object image and the second target object image; the first target object image is used as the first display image, and the second target object image is used as the second display image.
- the server can send the first display image and the second display image to the associated clients in the live broadcast room (which may refer to the clients corresponding to all accounts that have entered the live broadcast room), so that the associated clients can render the first display image and the second display image in the currently displayed live broadcast scene.
- by having the server perform the semantic segmentation, the computing load on the terminal device can be reduced and the response speed of the terminal device can be improved.
- rendering the first display image in the live broadcast scene includes: performing tracking processing on the multiple frames of behavior images to obtain motion trajectory information of the first target object, and rendering the motion trajectory of the first display image in the live broadcast scene according to the motion trajectory information of the first target object.
- a trained target tracking algorithm is deployed on the first client in advance.
- a target tracking algorithm is used to track and process multiple frames of behavior images collected by the first client to obtain motion track information of the first target object. Further, according to the motion trajectory information of the first target object, the motion trajectory of the first display image is rendered in the live broadcast scene.
- the target tracking algorithm can be a tracking algorithm based on correlation filtering, such as the KCF Tracker (Kernelized Correlation Filter tracker) or the MOSSE Tracker (Minimum Output Sum of Squared Error tracker), etc.
- rendering the second display image in the live broadcast scene includes: acquiring motion trajectory information of the second target object, and rendering the motion trajectory of each frame of the second display image in the live broadcast scene according to the motion trajectory information of the second target object.
- the first client may also receive the motion trajectory information of the second target object sent by the server.
- according to the motion trajectory information of the second target object, the motion trajectory of the second display image is rendered in the currently displayed live broadcast scene.
- the motion trajectory information of the second target object may be obtained by tracking the multi-frame behavior images of the second target object through a target tracking algorithm preconfigured on the second client.
- the first client and the second client can also send the first display image and the motion trajectory information of the first target object, and the second display image and the motion trajectory information of the second target object, to the other associated clients in the live broadcast room through the server, so that the other associated clients render, in the currently displayed live broadcast scene, the first display image and its motion trajectory as well as the second display image and its motion trajectory.
- the target tracking algorithm is pre-deployed, the motion trajectory information of the target object in the real world is obtained through the target tracking algorithm, and the motion trajectory of the display image is rendered in the live broadcast scene according to that information, so that the images displayed in the live broadcast scene can interact according to the behaviors of real-world people. This makes the live broadcast interaction more comprehensive, improves its authenticity, and helps increase the user's stay time.
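As a toy stand-in for a correlation-filter tracker such as KCF or MOSSE (which operate on image patches and are not reproduced here), the sketch below simply turns per-frame bounding boxes of the tracked target into a trajectory of scene coordinates that a renderer could consume; the `(x, y, w, h)` box format and the linear frame-to-scene mapping are assumptions for illustration:

```python
def box_center(box):
    """Center of an (x, y, w, h) bounding box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def trajectory_from_boxes(boxes, frame_size, scene_size):
    """Map per-frame target boxes to scene coordinates via simple scaling.

    A real tracker (KCF/MOSSE) would produce `boxes`; rendering then
    moves the display image along the returned trajectory.
    """
    fw, fh = frame_size
    sw, sh = scene_size
    traj = []
    for box in boxes:
        cx, cy = box_center(box)
        traj.append((cx * sw / fw, cy * sh / fh))
    return traj

# target drifts rightward across three camera frames
boxes = [(0, 0, 10, 10), (10, 0, 10, 10), (20, 0, 10, 10)]
traj = trajectory_from_boxes(boxes, frame_size=(100, 100), scene_size=(200, 200))
assert traj == [(10.0, 10.0), (30.0, 10.0), (50.0, 10.0)]
```

Separating "track in camera space" from "render in scene space" like this is what lets the same trajectory information be shared with other clients, as the surrounding paragraphs describe.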
- performing tracking processing on the multiple frames of behavior images to obtain the motion trajectory information of the first target object includes: sending the multiple frames of behavior images to the server, and receiving the motion trajectory information of the first target object obtained by the server's tracking processing.
- the tracking processing of the multi-frame behavior images collected by the first client and/or the second client may also be performed by the server.
- the first client and/or the second client respectively acquire the multi-frame behavior images collected by the image acquisition device
- the first client and/or the second client send the acquired multi-frame behavior images to the server in real time.
- the server invokes a pre-deployed target tracking algorithm.
- the multi-frame behavior images are tracked by the target tracking algorithm, and the motion track information corresponding to the first target object and the second target object is obtained.
- the server may send the motion trajectory information corresponding to the first target object and the second target object to the associated clients in the live broadcast room, so that the associated clients can render, in the currently displayed live broadcast scene, the motion trajectories corresponding to the first display image and the second display image.
- by having the server perform the tracking processing, the computing load on the terminal device can be reduced and the response speed of the terminal device can be improved.
- before performing semantic segmentation processing on each frame of behavior image, the method further includes: acquiring the scene display parameters of the live broadcast scene and the device parameters of the image acquisition device, and adjusting each frame of behavior image accordingly.
- the scene display parameters of the live broadcast scene are not limited to including information such as brightness and contrast of the live broadcast scene.
- the scene display parameters of the live broadcast scene can be manually configured by the host when creating the live broadcast room, or pre-configured default parameters can be used.
- Device parameters refer to the parameters of the image acquisition device used to acquire behavioral images. Device parameters are not limited to including factors such as illumination, contrast, camera resolution, and lens distortion. The device parameters of the image capturing devices corresponding to the first client and the second client may be different.
- the first client obtains scene display parameters of the live broadcast scene.
- the first client obtains the device parameters of the image collecting device.
- the first client adjusts each frame of the acquired behavior image according to the scene display parameters of the live broadcast scene.
- for example, if the acquired scene display parameters of the live broadcast scene and the device parameters of the image capture device both include brightness, and the brightness in the scene display parameters is less than the brightness in the device parameters, the brightness of the behavior image of the first target object can be reduced accordingly.
- when collecting the behavior images of the second target object, the second client likewise obtains the scene display parameters of the live broadcast scene and the device parameters of its image acquisition device, and adjusts each acquired frame of behavior image according to the scene display parameters of the live broadcast scene.
- performing semantic segmentation processing on each frame of behavior image specifically includes: performing semantic segmentation processing on each frame of adjusted behavior image.
- the first client invokes a pre-deployed semantic segmentation model to perform semantic segmentation processing on each adjusted frame of behavior image of the first target object, obtains the first target object image, and uses the obtained first target object image as the first display image.
- in this way, the behavior images collected by different clients can be rendered more consistently in the live broadcast scene.
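One plausible reading of the brightness adjustment (an assumption, since the text gives no formula) is to scale each pixel so a frame captured at the device's brightness level approximates the scene's configured brightness:

```python
def adjust_brightness(frame, device_brightness, scene_brightness):
    """Scale grayscale pixel values so a frame captured at
    device_brightness approximates scene_brightness (clamped to 255).

    The linear gain is an illustrative assumption, not the patent's rule.
    """
    if device_brightness <= 0:
        raise ValueError("device_brightness must be positive")
    gain = scene_brightness / float(device_brightness)
    return [[min(255, int(round(p * gain))) for p in row] for row in frame]

frame = [[100, 200], [50, 255]]
# scene is configured darker than the capture device: dim the frame
adjusted = adjust_brightness(frame, device_brightness=200, scene_brightness=100)
assert adjusted == [[50, 100], [25, 128]]
```

Per the surrounding paragraphs, each client would run an adjustment like this on its own frames before segmentation, which is why portraits from different cameras end up visually consistent in the shared scene.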
- rendering the first display image in the live broadcast scene includes: performing behavior analysis on the first display image to obtain a behavior category of the first display image, and rendering the first display image in the live broadcast scene according to a rendering method corresponding to the behavior category.
- the behavior categories are not limited to dancing, duet, jumping, high-five, motivation, etc.
- the rendering method corresponding to the behavior category may refer to the relevant special effects rendering method corresponding to the behavior category.
- the rendering method corresponding to the behavior category of dancing can be lighting effects
- the rendering method corresponding to the behavior category of high-five can be moving the display image close to at least one other display image performing the high-five, and adding corresponding special effects at the high-five position.
- the behavior analysis of the first display image may be performed based on deep learning theory.
- a deep learning model can be used to perform action recognition on the first display image to obtain the behavior category of the first display image;
- if the first display image is related text content obtained by performing speech recognition on voice data, keyword recognition may be performed on the related text content to obtain the behavior category of the first display image.
- the corresponding relationship between the behavior category and the rendering mode may be pre-configured on the first client.
- after the first client obtains the behavior category of the first display image, it can look up the rendering mode corresponding to that behavior category in the pre-configured correspondence between behavior categories and rendering modes, and render the first display image in the live broadcast scene according to that rendering mode.
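The pre-configured correspondence between behavior categories and rendering modes amounts to a lookup table. A minimal sketch, with illustrative (not patent-specified) category and effect names:

```python
# Assumed correspondence table between behavior categories and rendering
# modes; the entries are hypothetical examples.
RENDER_MODE_BY_CATEGORY = {
    "dancing": "stage_lighting",
    "high_five": "spark_at_contact_point",
    "jumping": "motion_trail",
}

def render_mode_for(category, default="no_special_effect"):
    """Look up the rendering mode for a behavior category, falling back
    to a default when the category has no configured effect."""
    return RENDER_MODE_BY_CATEGORY.get(category, default)
```

In practice the table could be shipped with the client or pushed by the server when the live room is created.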
- rendering the second display image in the live broadcast scene includes: acquiring a behavior category of the second display image, and rendering the second display image in the live broadcast scene according to a rendering method corresponding to the behavior category of the second display image.
- the second client after acquiring the second displayed image, the second client can perform behavior analysis on the second displayed image based on the deep learning theory to obtain the behavior category of the second displayed image.
- the second client may send the behavior category of the second display image to the server.
- the server sends the behavior category of the second display image to the first client, so that the first client can render the second display image in the live broadcast scene according to the rendering method corresponding to that behavior category.
- the behavior category of the displayed image is obtained by analyzing the behavior of the displayed image in the live broadcast scene, and the displayed image is rendered in the live broadcast scene according to the rendering method corresponding to the behavior category. This further enriches the live broadcast interaction method and makes the displayed image in the live broadcast scene more vivid in visual effect, which helps to increase the number of viewers in the live broadcast room and prolong the time viewers stay in the live broadcast room.
- the first target object is the host, and the second target object is a viewer; obtaining the second display image of the second target object includes: in response to an interaction request of the second target object, obtaining the second display image of the second target object according to the interaction request.
- the second target object is an interactive audience participating in the live broadcast interaction.
- the second target object may trigger the interaction request through the second client.
- the second client collects behavior data of the second target object, and generates a second display image corresponding to the second target object according to the behavior data of the second target object.
- the second client can send the second display image to the first client corresponding to the host through the server, so that the first client obtains the second display image and renders it in the currently displayed live broadcast scene.
- the host terminal and the viewer terminal simultaneously collect the behavior data of their respective users to generate display images, and transmit the display images in both directions, so that the host terminal and the viewer terminal can use their real-world behaviors to interact in the same virtual scene, which makes the live broadcast interaction more comprehensive.
- the first target object is a viewer
- the second target object is a host or a viewer
- collecting behavior data of the first target object includes: in response to an interaction request of the first target object, receiving a confirmation message of the interaction request, and collecting the behavior data of the first target object according to the confirmation message of the interaction request.
- the second target object may be other interactive viewers or hosts participating in the live broadcast interaction.
- the first target object may trigger an interaction request through the first client.
- the first client can send the interaction request to the second client corresponding to the host through the server.
- the host can trigger the permission instruction through the second client.
- the server sends a confirmation message of the interaction request to the first client, so that the first client can start collecting behavior data of the first target object according to the confirmation message of the interaction request.
- the viewer terminal collects the viewer's behavior data only after receiving the host's confirmation message, so that the host can manage the interactive viewers in a unified manner.
- in some embodiments, receiving the confirmation message of the interaction request and collecting the behavior data of the first target object according to the confirmation message includes:
- step S310 in response to the interaction request of the first target object, the number of display images in the live broadcast scene is acquired.
- step S320 when the number of display images does not reach the number threshold, the interaction request is uploaded.
- step S330 a confirmation message of the interaction request is received, and behavior data of the first target object is collected according to the confirmation message.
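Steps S310–S330 above reduce to a simple client-side gate. A minimal sketch, with hypothetical function names:

```python
# Sketch of the gating in steps S310-S330: the interaction request is
# uploaded only while the live scene still has room, and collection
# starts only after the host's confirmation arrives.
def should_upload_request(display_image_count, count_threshold):
    """Step S320: upload the request only if the number of display
    images has not reached the threshold."""
    return display_image_count < count_threshold

def collection_state_on_confirmation(confirmed):
    """Step S330: behavior-data collection starts only once the
    confirmation message of the interaction request is received."""
    return "collecting" if confirmed else "idle"
```

The threshold itself would be the host-configured or default value described below.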
- the number of displayed images in the live broadcast scene may refer to the number of displayed images corresponding to the interactive viewers in the current live broadcast scene.
- the number threshold refers to the maximum number of interactive viewers allowed to participate in live interaction.
- the quantity threshold can be manually configured by the host when creating the live room, or it can be a pre-configured default threshold.
- if the first target object is a viewer, the second target object may be other interactive viewers or the host.
- the first target object may trigger an interaction request through the first client.
- in response to the interaction request, the first client acquires the number of display images in the current live broadcast scene and compares it with a pre-acquired number threshold.
- when the number of display images does not reach the number threshold, the first client uploads the interaction request, and the server forwards it to the second client of the host.
- the host can trigger the permission instruction through the second client.
- the server sends a confirmation message of the interaction request to the first client, so that the first client can collect behavior data of the first target object according to the confirmation message of the interaction request.
- by limiting the number of display images in the live broadcast scene, overcrowding is avoided and the display effect of the display images in the live broadcast scene can be improved.
- in step S220, collecting the behavior data of the first target object and generating the first display image corresponding to the first target object according to the behavior data includes: collecting the behavior data of the first target object; and when the whole-body image of the first target object is identified from the behavior data, generating the first display image corresponding to the first target object according to the behavior data of the first target object.
- the second target object may be other interactive viewers or a host.
- the first target object may trigger an interaction request through the first client.
- the first client may send the interaction request of the first client to the second client of the host through the server.
- the host can trigger the permission instruction through the second client.
- the server sends a confirmation message of the interaction request to the first client, so that the first client can collect behavior data of the first target object according to the confirmation message of the interaction request.
- the behavior data of the first target object includes a behavior image of the first target object.
- the first client can identify the behavior image of the first target object and determine whether it contains the whole-body image of the first target object. If the whole-body image is contained, the first display image is generated and rendered into the live broadcast scene.
- by allowing the interactive viewer's client to continue collecting behavior data only after judging that it can capture the viewer's whole-body image, the compliance of the live interaction can be improved.
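The whole-body gate described above can be sketched as follows. The part-based check is an assumed stand-in for whatever detector (e.g. a pose/keypoint model) an implementation would actually use:

```python
# Sketch: generate a display image only when the captured behavior image
# contains the viewer's full body. REQUIRED_PARTS and the dict-shaped
# "display image" are illustrative placeholders.
REQUIRED_PARTS = frozenset({"head", "torso", "legs", "feet"})

def contains_whole_body(detected_parts):
    """True when every required body part was detected in the frame."""
    return REQUIRED_PARTS <= set(detected_parts)

def maybe_generate_display_image(detected_parts, behavior_image):
    """Return a placeholder display image only when the whole body is
    visible; otherwise keep collecting and render nothing."""
    if not contains_whole_body(detected_parts):
        return None
    return {"portrait_of": behavior_image}
```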
- FIG. 4 is a flow chart of a live broadcast interaction method according to an exemplary embodiment. As shown in FIG. 4 , the live broadcast interaction method used in the host terminal includes the following steps.
- step S401 the host creates a live room, and configures a live broadcast scene in the live broadcast room and a threshold for the number of displayed images in the live broadcast scene.
- step S402 the host terminal displays the live broadcast scene in the interface of the live broadcast room.
- step S403 the behavior data of the anchor is collected, and the behavior data of the anchor may be continuous multiple frames of behavior images of the anchor collected by the camera.
- step S404 semantic segmentation, tracking, and behavior analysis are performed on each frame of the anchor's behavior image to obtain the anchor's display image, the anchor's motion track information, and the behavior category of the anchor's display image.
- semantic segmentation is performed on each frame of the anchor's behavior image by using a semantic segmentation model to obtain a segmentation result of the anchor's portrait in each frame, which is used as the anchor's display image in each frame.
- the anchor's display image is identified through the action recognition model, and the behavior category of the anchor's display image is obtained.
- the motion trajectory information is obtained by tracking the multi-frame anchor behavior images through the target tracking algorithm.
- the behavior category of the anchor's displayed image can also be obtained by performing behavior detection on multiple frames of anchor behavior images by a target tracking algorithm, which is not specifically limited here.
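Step S404's per-frame pipeline can be sketched as below. The three helpers are stubs standing in for a semantic-segmentation model, a target tracker, and an action-recognition model; real implementations would be learned models, not these placeholders:

```python
# Hedged sketch of step S404: per-frame semantic segmentation, tracking,
# and behavior analysis. All three helpers are illustrative stubs.
def segment_portrait(frame):
    return {"mask_for": frame}        # stands in for the segmentation model

def locate(portrait):
    return (0.0, 0.0)                 # stands in for the target tracker

def recognize_behavior(portrait):
    return "dancing"                  # stands in for action recognition

def process_anchor_frames(frames):
    """Yield, per frame: the anchor's display image, the accumulated
    motion trajectory, and the behavior category."""
    trajectory, results = [], []
    for frame in frames:
        portrait = segment_portrait(frame)      # semantic segmentation
        trajectory.append(locate(portrait))     # tracking across frames
        category = recognize_behavior(portrait) # behavior analysis
        results.append((portrait, category))
    return results, trajectory
```

The outputs of this loop are exactly what step S405 renders and step S406 sends to the server.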
- step S405 the anchor display image and the motion trajectory of the anchor display image are rendered in the live broadcast scene of the anchor end, and the anchor display image in the live broadcast scene is rendered according to the rendering method corresponding to the behavior category of the anchor display image.
- step S406 the anchor's display image, the anchor's motion trajectory information, and the behavior category of the anchor's display image are sent to the server, so that the server sends them to all viewer terminals.
- the viewer terminal renders the anchor's display image and its motion trajectory in the live broadcast scene, and renders the anchor's display image according to the rendering method corresponding to its behavior category.
- step S407 in response to the interaction request from an interactive viewer, a permission instruction is obtained, along with initial position information allocated for the interactive viewer's display image.
- step S408 a confirmation message of the interaction request is sent to the interactive viewer.
- step S409 an audience display image of the interactive audience is acquired, and the audience display image is rendered to a corresponding initial position according to the initial position information.
- the audience display image of the interactive viewer is obtained from the collected viewer behavior images after the interactive viewer terminal or the host terminal detects that the number of display images in the live broadcast scene does not exceed the number threshold and the interactive viewer terminal determines that the camera can capture the viewer's whole-body image.
- step S410 continue to acquire the displayed image of the audience, the movement track information of the interactive audience, and the behavior category of the displayed image of the audience.
- the displayed image of the audience, the motion track information of the interactive audience, and the behavior category of the displayed image of the audience can be obtained by referring to step S404, and will not be described in detail here.
- step S411 the host renders the audience display image and the motion trajectory of the audience display image in the live broadcast scene, and renders the audience display image in the live broadcast scene according to the rendering method corresponding to the behavior category of the audience display image.
- FIG. 5 exemplarily shows a live broadcast scene displayed by the host terminal in one embodiment.
- the live broadcast scene is a pre-selected virtual scene
- the displayed image of the anchor and the displayed image of the audience are the real anchor portrait and the real audience portrait obtained through the semantic segmentation model.
- FIG. 6 is a flow chart of a live broadcast interaction method according to an exemplary embodiment. As shown in FIG. 6, the live broadcast interaction method used in an interactive audience terminal includes the following steps.
- step S601 the interactive viewer terminal displays the live broadcast scene in the live broadcast room interface.
- step S602 the anchor's display image, the anchor's motion track information, and the behavior category of the anchor's display image are acquired.
- step S603 the anchor display image and the motion trajectory of the anchor display image are rendered in the live broadcast scene, and the anchor display image in the live broadcast scene is rendered according to the rendering method corresponding to the behavior category of the anchor display image.
- step S604 in response to the interaction request triggered by the interactive viewer, the number of display images in the live broadcast scene is obtained, and when the number of display images does not reach the number threshold, an interaction request is sent to the host.
- step S605 a confirmation message of the interaction request sent by the host is received, the confirmation message carries the initial location information, and behavior data of the interactive viewer is collected according to the confirmation message.
- the behavior data of the interactive audience may be an audience behavior image of the interactive audience collected by a camera.
- step S606 when the whole-body image of the interactive viewer can be identified from the viewer behavior image, semantic segmentation processing is performed on the behavior image of the interactive viewer to obtain the audience display image, and the audience display image is rendered to the corresponding initial position according to the initial position information.
- step S607 the display image of the viewer is sent to the server, so that the server sends the display image of the viewer to the host and all other viewers.
- step S608 continuous acquisition of multiple frames of audience behavior images of the audience is continued.
- step S609 semantic segmentation, tracking, and behavior analysis are performed on each frame of the audience behavior image to obtain the audience display image, the motion track information of the interactive audience, and the behavior category of the audience display image.
- a semantic segmentation process is performed on each frame of the audience behavior image through a semantic segmentation model to obtain a segmentation result of the audience portrait in each frame, which is used as the audience display image in each frame.
- the displayed image of the audience is identified through the action recognition model, and the behavior category of the displayed image of the audience is obtained.
- the target tracking algorithm is used to track and process multiple frames of audience behavior images to obtain the motion track information of the audience. Further, the behavior category of the audience image can also be obtained by performing behavior detection on multiple frames of audience behavior images through the target tracking algorithm, which is not specifically limited here.
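The patent does not name a specific tracking algorithm; one simple stand-in is to derive the motion trajectory from the portrait's bounding-box centroid in each frame:

```python
# Illustrative reduction of "motion trajectory information": track the
# centroid of the portrait's bounding box across frames. Boxes are
# assumed to be (x0, y0, x1, y1) tuples.
def centroid(box):
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def motion_trajectory(boxes_per_frame):
    """One centroid per frame, in frame order."""
    return [centroid(box) for box in boxes_per_frame]
```

A production tracker would also handle occlusion and identity across multiple portraits, which this sketch ignores.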
- step S610 the audience display image and the motion trajectory of the audience display image are rendered in the live broadcast scene, and the audience display image in the live broadcast scene is rendered according to the rendering method corresponding to the behavior category of the audience display image.
- the interactive viewer terminal and the host terminal present the same live broadcast scene, for details, please refer to the schematic diagram of the live broadcast scene in FIG. 5 .
- step S611 the displayed image of the audience, the motion trajectory information of the interactive audience, and the behavior category of the displayed image of the audience are sent to the server, so that the server sends them to the host terminal and all other viewer terminals.
- the audience display image and the motion trajectory of the audience display image are rendered in the live broadcast scene through the anchor terminal and all other viewer terminals, and the audience display image in the live broadcast scene is rendered according to the special effect rendering method corresponding to the behavior category of the audience display image.
- although the steps in the above flow charts are displayed in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the above flow charts may include multiple sub-steps or stages, which are not necessarily executed at the same time but may be executed at different times; their execution order is also not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
- FIG. 7 is a block diagram of a live interactive device 700 according to an exemplary embodiment.
- the apparatus 700 includes a display module 701 , a collection module 702 , a display image generation module 703 , a first rendering module 704 , an obtaining module 705 and a second rendering module 706 .
- the display module 701 is configured to display the live broadcast scene in the live broadcast room interface; the collection module 702 is configured to collect behavior data of the first target object; the display image generation module 703 is configured to generate the first display image corresponding to the first target object according to the behavior data of the first target object; the first rendering module 704 is configured to render the first display image in the live broadcast scene; the obtaining module 705 is configured to obtain the second display image of the second target object, the second display image being generated according to the behavior data of the second target object; and the second rendering module 706 is configured to render the second display image in the live broadcast scene.
- the acquisition module 702 is configured to acquire multiple frames of behavioral images of the first target object; the apparatus 700 further includes: an image segmentation module, configured to perform semantic segmentation processing on each frame of behavioral images, The first display image of each frame is obtained; the first rendering module 704 is further configured to render the first display image of each frame in the live broadcast scene.
- the image segmentation module includes: a sending unit, configured to send multiple frames of behavioral images to a server; and a receiving unit, configured to receive, from the server, the first display image of each frame obtained by performing semantic segmentation processing on each frame of behavioral image.
- the first rendering module 704 includes: a tracking unit, configured to perform tracking processing on multiple frames of behavioral images to obtain motion trajectory information of the first target object; and a first rendering unit, configured to render the motion trajectory of the first display image in the live broadcast scene according to the motion trajectory information of the first target object. The second rendering module 706 includes: a trajectory information acquisition unit, configured to acquire the motion trajectory information of the second target object; and a second rendering unit, configured to render the motion trajectory of the second display image in each frame in the live broadcast scene according to the motion trajectory information of the second target object.
- the tracking unit is configured to send the multi-frame behavior images to the server; and receive the motion trajectory information of the first target object sent by the server and obtained by tracking the multi-frame behavior images.
- the acquiring module 705 is further configured to acquire the scene display parameters of the live broadcast scene and the device parameters of the image acquisition device; the apparatus 700 further includes an image adjustment module, configured to adjust each frame of behavioral image according to the scene display parameters and the device parameters; and the image segmentation module is configured to perform semantic segmentation processing on each adjusted frame of behavioral image.
- the first rendering module 704 includes: a behavior analysis unit configured to perform behavior analysis on the first display image to obtain a behavior category of the first displayed image; and a third rendering unit configured to The rendering mode corresponding to the behavior category renders the first display image in the live broadcast scene;
- the second rendering module 706 includes: a behavior category acquisition unit, configured to acquire the behavior category of the second display image; and a fourth rendering unit, configured to render the second display image in the live broadcast scene according to the rendering mode corresponding to the behavior category of the second display image.
- the first target object is a broadcaster, and the second target object is a viewer; the obtaining module 705 is configured to, in response to an interaction request of the second target object, obtain the second display image of the second target object according to the interaction request.
- the first target object is a viewer
- the second target object is a broadcaster or a viewer
- the collection module 702 is configured to, in response to the interaction request of the first target object, receive a confirmation message of the interaction request, and collect the behavior data of the first target object according to the confirmation message of the interaction request.
- the collection module 702 includes: a quantity acquisition unit, configured to acquire, in response to an interaction request of the first target object, the number of display images in the live broadcast scene; an uploading unit, configured to upload the interaction request when the number of display images does not reach the number threshold; and a collection unit, configured to receive a confirmation message of the interaction request and collect the behavior data of the first target object according to the confirmation message.
- the collection module 702 is configured to collect behavior data of the first target object; when the whole body image of the first target object is identified according to the behavior data of the first target object, according to the behavior data of the first target object The behavior data generates a first display image corresponding to the first target object.
- FIG. 8 is a block diagram of a device 800 for live interaction according to an exemplary embodiment.
- device 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, or the like.
- device 800 may include one or more of the following components: processing component 802, memory 804, power supply component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and Communication component 816.
- the processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
- the processing component 802 can include one or more processors 820 to execute instructions to perform all or some of the steps of the methods described above.
- processing component 802 may include one or more modules that facilitate interaction between processing component 802 and other components.
- processing component 802 may include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
- Memory 804 is configured to store various types of data to support operation at device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and the like. Memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
- Power supply assembly 806 provides power to various components of device 800 .
- Power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to device 800 .
- Multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
- the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
- the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
- multimedia component 808 includes a front-facing camera and/or a rear-facing camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data.
- Each of the front and rear cameras can be a fixed optical lens system or have focal length and optical zoom capability.
- Audio component 810 is configured to output and/or input audio signals.
- audio component 810 includes a microphone (MIC) that is configured to receive external audio signals when device 800 is in operating modes, such as call mode, recording mode, and voice recognition mode.
- the received audio signal may be further stored in memory 804 or transmitted via communication component 816 .
- audio component 810 also includes a speaker for outputting audio signals.
- the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
- Sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of device 800 .
- the sensor component 814 can detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800; the sensor component 814 can also detect a change in the position of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and the temperature change of the device 800.
- Sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
- Sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
- the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
- Communication component 816 is configured to facilitate wired or wireless communications between device 800 and other devices.
- Device 800 may access wireless networks based on communication standards, such as WiFi, carrier networks (eg, 2G, 3G, 4G, or 5G), or a combination thereof.
- the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
- the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication.
- the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
- device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
- non-transitory computer-readable storage medium including instructions, such as memory 804 including instructions, executable by processor 820 of device 800 to perform the above method.
- the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
Claims (25)
- A live broadcast interaction method, wherein the method is applied to a host terminal or a viewer terminal and comprises: displaying a live broadcast scene in a live broadcast room interface; collecting behavior data of a first target object, generating a first display image corresponding to the first target object according to the behavior data of the first target object, and rendering the first display image in the live broadcast scene; acquiring a second display image of a second target object, the second display image being generated according to behavior data of the second target object; and rendering the second display image in the live broadcast scene.
- The live broadcast interaction method according to claim 1, wherein collecting the behavior data of the first target object, generating the first display image corresponding to the first target object according to the behavior data of the first target object, and rendering the first display image in the live broadcast scene comprises: collecting multiple frames of behavior images of the first target object, performing semantic segmentation processing on each frame of behavior image to obtain the first display image of each frame, and rendering the first display image of each frame in the live broadcast scene.
- The live broadcast interaction method according to claim 2, wherein performing semantic segmentation processing on each frame of behavior image comprises: sending the multiple frames of behavior images to a server; and receiving, from the server, the first display image of each frame obtained by performing semantic segmentation processing on each frame of behavior image.
- The live broadcast interaction method according to claim 2, wherein rendering the first display image in the live broadcast scene comprises: performing tracking processing on the multiple frames of behavior images to obtain motion trajectory information of the first target object, and rendering the motion trajectory of the first display image in the live broadcast scene according to the motion trajectory information of the first target object; and rendering the second display image in the live broadcast scene comprises: acquiring motion trajectory information of the second target object, and rendering the motion trajectory of the second display image of each frame in the live broadcast scene according to the motion trajectory information of the second target object.
- 根据权利要求4所述的直播互动方法,其中,所述对所述多帧行为图像进行跟踪处理,得到所述第一目标对象的运动轨迹信息,包括:将所述多帧行为图像发送至服务器;接收所述服务器发送的对所述多帧行为图像进行跟踪处理得到的所述第一目标对象的运动轨迹信息。
- 根据权利要求2所述的直播互动方法,其中,所述方法还包括:获取所述直播场景的场景显示参数以及所述直播端或观众端的图像采集设备的设备参数;根据所述场景显示参数和所述设备参数,对所述每帧行为图像进行调整;所述对每帧行为图像进行语义分割处理,包括:对调整后的所述每帧行为图像进行语 义分割处理。
- 根据权利要求1所述的直播互动方法,其中,所述在所述直播场景渲染所述第一显示形象,包括,对所述第一显示形象进行行为分析,得到所述第一显示形象的行为类别,按照与所述行为类别对应的渲染方式在所述直播场景中渲染所述第一显示形象;所述在所述直播场景中渲染所述第二显示形象,包括:获取所述第二显示形象的行为类别,按照与所述第二显示形象的行为类别对应的渲染方式,在所述直播场景中渲染所述第二显示形象。
- 根据权利要求1所述的直播互动方法,其中,所述第一目标对象为主播,所述第二目标对象为观众;所述获取第二目标对象的第二显示形象,包括:响应于第二目标对象的互动请求,根据所述互动请求获取第二目标对象的第二显示形象。
- 根据权利要求1所述的直播互动方法,其中,所述第一目标对象为观众,所述第二目标对象为主播或观众;所述采集第一目标对象的行为数据,包括:响应于第一目标对象的互动请求,接收所述互动请求的确认消息,根据所述互动请求的确认消息采集所述第一目标对象的行为数据。
- 根据权利要求9所述的直播互动方法,其中,所述响应于第一目标对象的互动请求,接收所述互动请求的确认消息,根据所述互动请求的确认消息采集所述第一目标对象的行为数据,包括:响应于所述第一目标对象的互动请求,获取所述直播场景中的显示形象数量;响应于所述显示形象数量未达到数量阈值,上传所述互动请求;接收所述互动请求的确认消息,根据所述确认消息采集所述第一目标对象的行为数据。
- 根据权利要求9所述的直播互动方法,其中,所述采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,包括:采集所述第一目标对象的行为数据;响应于根据所述第一目标对象的行为数据识别出所述第一目标对象的全身形象,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象。
- 一种直播互动装置,其中,所述装置包括:显示模块,被配置为在直播间界面中显示直播场景;采集模块,被配置为采集第一目标对象的行为数据;显示形象生成模块,被配置为根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象;第一渲染模块,被配置为在所述直播场景渲染所述第一显示形象;获取模块,被配置为获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;第二渲染模块,还被配置为在所述直播场景中渲染所述第二显示形象。
- 根据权利要求12所述的直播互动装置,其中,所述采集模块,被配置为采集所述第一目标对象的多帧行为图像;所述装置还包括:图像分割模块,被配置为对每帧行为图像进行语义分割处理,得到所述每帧第一显示形象;所述第一渲染模块,还被配置为在所述直播场景中渲染所述每帧第一显示形象。
- 根据权利要求13所述的直播互动装置,其中,所述图像分割模块,包括:发送单元,被配置为将所述多帧行为图像发送至服务器;接收单元,被配置为接收所述服务器发送的对所述每帧行为图像进行语音分割处理得到的所述每帧第一显示形象。
- 根据权利要求13所述的直播互动装置,其中,所述第一渲染模块,包括:跟踪单元,被配置为对所述多帧行为图像进行跟踪处理,得到所述第一目标对象的运动轨迹信息;第一渲染单元,被配置为根据所述第一目标对象的运动轨迹信息,在所述直播场景中渲染所述第一显示形象的运动轨迹;所述第二渲染模块,包括:轨迹信息获取单元,被配置为获取所述第二目标对象的运动轨迹信息;第二渲染单元,被配置为根据所述第二目标对象的运动轨迹信息,在所述直播场景中渲染每帧第二显示形象的运动轨迹。
- 根据权利要求15所述的直播互动装置,其中,所述跟踪单元,被配置为将所述多帧行为图像发送至服务器;接收所述服务器发送的对所述多帧行为图像进行跟踪处理得到的所述第一目标对象的运动轨迹信息。
- 根据权利要求13所述的直播互动装置,其中,所述获取模块,还被配置为获取所述直播场景的场景显示参数以及图像采集设备的设备参数;所述装置还包括:图像调整模块,被配置为根据所述场景显示参数和所述设备参数,对所述每帧行为图像进行调整;所述图像分割模块,被配置为对调整后的所述每帧行为图像进行语义分割处理。
- 根据权利要求12所述的直播互动装置,其中,所述第一渲染模块,包括:行为分析单元,被配置为对所述第一显示形象进行行为分析,得到所述第一显示形象的行为类别;第三渲染单元,被配置为按照与所述行为类别对应的渲染方式在所述直播场景中渲染所述第一显示形象;所述第二渲染模块,包括:行为类别获取单元,被配置为获取所述第二显示形象的行为类别;第四渲染单元,被配置为按照与所述第二显示形象的行为类别对应的渲染方式,在所述直播场景中渲染所述第二显示形象。
- 根据权利要求12所述的直播互动装置,其中,所述第一目标对象为主播,所述第二目标对象为观众;所述获取模块,被配置为响应于第二目标对象的互动请求,根据所述互动请求获取第二目标对象的第二显示形象。
- 根据权利要求12所述的直播互动装置,其中,所述第一目标对象为观众,所述第二目标对象为主播或观众;所述采集模块,被配置为响应于第一目标对象的互动请求,接收所述互动请求的确认消息,根据所述互动请求的确认消息采集所述第一目标对象的行为数据。
- 根据权利要求209所述的直播互动装置,其中,所述采集模块,包括:数量获取单元,被配置为响应于所述第一目标对象的互动请求,获取所述直播场景中的显示形象数量;上传单元,被配置为响应于所述显示形象数量未达到数量阈值,上传所述互动请求;采集单元,被配置为接收所述互动请求的确认消息,根据所述确认消息采集所述第一目标对象的行为数据。
- 根据权利要求20所述的直播互动装置,其中,所述采集模块,被配置为采集所述第一目标对象的行为数据;响应于根据所述第一目标对象的行为数据识别出所述第一目标对象的全身形象,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象。
- 一种电子设备,其中,所述电子设备包括:处理器;用于存储所述处理器可执行指令的存储器;其中,所述处理器被配置为执行所述指令,以实现一种直播互动方法,所述方法包括:在直播间界面中显示直播场景;采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象;获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;在所述直播场景中渲染所述第二显示形象。
- 一种非易失性存储介质,当所述存储介质中的指令由电子设备的处理器执行时,使得所述电子设备能够执行一种直播互动方法,所述方法包括:在直播间界面中显示直播场景;采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象;获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;在所述直播场景中渲染所述第二显示形象。
- 一种计算机程序产品,包括计算机程序,其中,所述计算机程序被处理器执行时实现一种直播互动方法,所述方法包括:在直播间界面中显示直播场景;采集第一目标对象的行为数据,根据所述第一目标对象的行为数据生成所述第一目标对象对应的第一显示形象,在所述直播场景渲染所述第一显示形象;获取第二目标对象的第二显示形象,所述第二显示形象是根据所述第二目标对象的行为数据生成的;在所述直播场景中渲染所述第二显示形象。
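Claims 2 and 3 describe extracting a per-frame "display image" of the target object by semantic segmentation, optionally offloaded to a server. A minimal NumPy sketch of the client-side compositing step, where `person_mask` is a hypothetical stand-in for the segmentation model (a real system would run a trained deep-learning network, which the claims do not specify):

```python
import numpy as np

def person_mask(frame: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a semantic-segmentation model:
    marks pixels brighter than the frame mean as 'person'.
    A real implementation would run a trained segmentation network."""
    gray = frame.mean(axis=2)
    return (gray > gray.mean()).astype(np.uint8)

def display_image(frame: np.ndarray) -> np.ndarray:
    """Build an RGBA display image: person pixels opaque,
    background fully transparent, so the image can be rendered
    on top of the live broadcast scene."""
    mask = person_mask(frame)
    rgba = np.zeros((*frame.shape[:2], 4), dtype=np.uint8)
    rgba[..., :3] = frame
    rgba[..., 3] = mask * 255  # alpha channel from the segmentation mask
    return rgba

frame = np.arange(48, dtype=np.uint8).reshape(4, 4, 3)  # toy 4x4 RGB frame
img = display_image(frame)
print(img.shape)  # (4, 4, 4)
```

Per claim 2, this step would run once per captured frame, and the resulting RGBA images would be rendered in sequence in the live broadcast scene.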
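Claims 4 and 5 add tracking across the multiple frames to obtain motion trajectory information. One simple illustrative tracker (an assumption for clarity, not the patented method) takes the per-frame segmentation masks and reports the centroid path of the target object:

```python
import numpy as np

def centroid(mask: np.ndarray) -> tuple[float, float]:
    """Centroid (row, col) of the nonzero pixels in a binary mask."""
    ys, xs = np.nonzero(mask)
    return float(ys.mean()), float(xs.mean())

def trajectory(masks: list[np.ndarray]) -> list[tuple[float, float]]:
    """Motion trajectory information: one centroid per frame mask.
    The renderer can replay this path when drawing the display image."""
    return [centroid(m) for m in masks]

# Two toy frames: a one-pixel "person" moving one step to the right.
m0 = np.zeros((3, 3), dtype=np.uint8); m0[1, 0] = 1
m1 = np.zeros((3, 3), dtype=np.uint8); m1[1, 1] = 1
print(trajectory([m0, m1]))  # [(1.0, 0.0), (1.0, 1.0)]
```

Per claim 5, this computation could equally run server-side, with the client only receiving the resulting trajectory.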
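Claim 10 gates a viewer's interaction request on the number of display images already in the live broadcast scene: the request is uploaded only while that count is below a quantity threshold, and capture starts only after a confirmation message. A schematic of that admission check (the threshold value, function name, and return labels are illustrative, not taken from the patent):

```python
QUANTITY_THRESHOLD = 4  # illustrative cap on concurrent display images

def handle_interaction_request(current_display_images: int,
                               threshold: int = QUANTITY_THRESHOLD) -> str:
    """Decide what to do with an incoming interaction request:
    upload it for confirmation only while the scene is below the cap."""
    if current_display_images < threshold:
        return "upload_request"   # then await the confirmation message
    return "reject"               # scene already holds enough display images

print(handle_interaction_request(2))  # upload_request
print(handle_interaction_request(4))  # reject
```

Once the confirmation message arrives, the client would begin collecting the viewer's behavior data, as in claim 9.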
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011001739.6 | 2020-09-22 | ||
CN202011001739.6A CN112153400B (zh) | 2020-09-22 | 2020-09-22 | 直播互动方法、装置、电子设备及存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022062896A1 true WO2022062896A1 (zh) | 2022-03-31 |
Family
ID=73893673
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/117040 WO2022062896A1 (zh) | 2020-09-22 | 2021-09-07 | 直播互动方法及装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112153400B (zh) |
WO (1) | WO2022062896A1 (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114979682A (zh) * | 2022-04-19 | 2022-08-30 | 阿里巴巴(中国)有限公司 | 多主播虚拟直播方法以及装置 |
CN115086693A (zh) * | 2022-05-07 | 2022-09-20 | 北京达佳互联信息技术有限公司 | 虚拟对象交互方法、装置、电子设备和存储介质 |
CN115190347A (zh) * | 2022-05-31 | 2022-10-14 | 北京达佳互联信息技术有限公司 | 消息处理方法、消息处理装置、电子设备和存储介质 |
CN115426509A (zh) * | 2022-08-15 | 2022-12-02 | 北京奇虎科技有限公司 | 直播信息同步方法、装置、设备及存储介质 |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112153400B (zh) * | 2020-09-22 | 2022-12-06 | 北京达佳互联信息技术有限公司 | 直播互动方法、装置、电子设备及存储介质 |
CN113660503B (zh) * | 2021-08-17 | 2024-04-26 | 广州博冠信息科技有限公司 | 同屏互动控制方法及装置、电子设备、存储介质 |
CN113900522A (zh) * | 2021-09-30 | 2022-01-07 | 温州大学大数据与信息技术研究院 | 一种虚拟形象的互动方法、装置 |
CN113965812B (zh) * | 2021-12-21 | 2022-03-25 | 广州虎牙信息科技有限公司 | 直播方法、系统及直播设备 |
CN115396688B (zh) * | 2022-10-31 | 2022-12-27 | 北京玩播互娱科技有限公司 | 一种基于虚拟场景的多人互动网络直播方法及系统 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106789991A (zh) * | 2016-12-09 | 2017-05-31 | 福建星网视易信息系统有限公司 | 一种基于虚拟场景的多人互动方法及系统 |
CN107613310A (zh) * | 2017-09-08 | 2018-01-19 | 广州华多网络科技有限公司 | 一种直播方法、装置及电子设备 |
CN110519611A (zh) * | 2019-08-23 | 2019-11-29 | 腾讯科技(深圳)有限公司 | 直播互动方法、装置、电子设备及存储介质 |
US20200099960A1 (en) * | 2016-12-19 | 2020-03-26 | Guangzhou Huya Information Technology Co., Ltd. | Video Stream Based Live Stream Interaction Method And Corresponding Device |
CN112153400A (zh) * | 2020-09-22 | 2020-12-29 | 北京达佳互联信息技术有限公司 | 直播互动方法、装置、电子设备及存储介质 |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105812813B (zh) * | 2016-03-21 | 2019-07-12 | 深圳宸睿科技有限公司 | 一种授课视频压缩、播放方法及压缩、播放装置 |
WO2018033156A1 (zh) * | 2016-08-19 | 2018-02-22 | 北京市商汤科技开发有限公司 | 视频图像的处理方法、装置和电子设备 |
CN106804007A (zh) * | 2017-03-20 | 2017-06-06 | 合网络技术(北京)有限公司 | 一种网络直播中自动匹配特效的方法、系统及设备 |
CN107750014B (zh) * | 2017-09-25 | 2020-10-16 | 迈吉客科技(北京)有限公司 | 一种连麦直播方法和系统 |
CN109874021B (zh) * | 2017-12-04 | 2021-05-11 | 腾讯科技(深圳)有限公司 | 直播互动方法、装置及系统 |
CN108154086B (zh) * | 2017-12-06 | 2022-06-03 | 北京奇艺世纪科技有限公司 | 一种图像提取方法、装置及电子设备 |
CN109963163A (zh) * | 2017-12-26 | 2019-07-02 | 阿里巴巴集团控股有限公司 | 网络视频直播方法、装置及电子设备 |
CN110163861A (zh) * | 2018-07-11 | 2019-08-23 | 腾讯科技(深圳)有限公司 | 图像处理方法、装置、存储介质和计算机设备 |
CN109271553A (zh) * | 2018-08-31 | 2019-01-25 | 乐蜜有限公司 | 一种虚拟形象视频播放方法、装置、电子设备及存储介质 |
CN109766473B (zh) * | 2018-11-30 | 2019-12-24 | 北京达佳互联信息技术有限公司 | 信息交互方法、装置、电子设备及存储介质 |
CN110691279A (zh) * | 2019-08-13 | 2020-01-14 | 北京达佳互联信息技术有限公司 | 一种虚拟直播的方法、装置、电子设备及存储介质 |
CN111641843A (zh) * | 2020-05-29 | 2020-09-08 | 广州华多网络科技有限公司 | 直播间中虚拟蹦迪活动展示方法、装置、介质及电子设备 |
- 2020
  - 2020-09-22 CN CN202011001739.6A patent/CN112153400B/zh active Active
- 2021
  - 2021-09-07 WO PCT/CN2021/117040 patent/WO2022062896A1/zh active Application Filing
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114979682A (zh) * | 2022-04-19 | 2022-08-30 | 阿里巴巴(中国)有限公司 | 多主播虚拟直播方法以及装置 |
CN114979682B (zh) * | 2022-04-19 | 2023-10-13 | 阿里巴巴(中国)有限公司 | 多主播虚拟直播方法以及装置 |
CN115086693A (zh) * | 2022-05-07 | 2022-09-20 | 北京达佳互联信息技术有限公司 | 虚拟对象交互方法、装置、电子设备和存储介质 |
CN115190347A (zh) * | 2022-05-31 | 2022-10-14 | 北京达佳互联信息技术有限公司 | 消息处理方法、消息处理装置、电子设备和存储介质 |
CN115190347B (zh) * | 2022-05-31 | 2024-01-02 | 北京达佳互联信息技术有限公司 | 消息处理方法、消息处理装置、电子设备和存储介质 |
CN115426509A (zh) * | 2022-08-15 | 2022-12-02 | 北京奇虎科技有限公司 | 直播信息同步方法、装置、设备及存储介质 |
CN115426509B (zh) * | 2022-08-15 | 2024-04-16 | 北京奇虎科技有限公司 | 直播信息同步方法、装置、设备及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN112153400B (zh) | 2022-12-06 |
CN112153400A (zh) | 2020-12-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022062896A1 (zh) | 直播互动方法及装置 | |
CN106791893B (zh) | 视频直播方法及装置 | |
CN108495032B (zh) | 图像处理方法、装置、存储介质及电子设备 | |
CN111314617B (zh) | 视频数据处理方法、装置、电子设备及存储介质 | |
US20220150594A1 (en) | Method for message interaction, terminal and storage medium | |
US20220150598A1 (en) | Method for message interaction, and electronic device | |
CN112905074B (zh) | 交互界面展示方法、交互界面生成方法、装置及电子设备 | |
CN110677734B (zh) | 视频合成方法、装置、电子设备及存储介质 | |
WO2022077970A1 (zh) | 特效添加方法及装置 | |
CN114025105B (zh) | 视频处理方法、装置、电子设备、存储介质 | |
CN114009003A (zh) | 图像采集方法、装置、设备及存储介质 | |
CN112312190A (zh) | 视频画面的展示方法、装置、电子设备和存储介质 | |
CN109218709B (zh) | 全息内容的调整方法及装置和计算机可读存储介质 | |
CN107105311B (zh) | 直播方法及装置 | |
CN116939275A (zh) | 直播虚拟资源展示方法、装置、电子设备、服务器及介质 | |
CN108986803B (zh) | 场景控制方法及装置、电子设备、可读存储介质 | |
CN111586296B (zh) | 图像拍摄方法、图像拍摄装置及存储介质 | |
CN115914721A (zh) | 直播画面处理方法、装置、电子设备及存储介质 | |
CN113315903B (zh) | 图像获取方法和装置、电子设备、存储介质 | |
CN108769513B (zh) | 相机拍照方法及装置 | |
CN110312117B (zh) | 数据刷新方法及装置 | |
KR20210157289A (ko) | 촬영 프리뷰 이미지를 표시하는 방법, 장치 및 매체 | |
CN113747113A (zh) | 图像显示方法及装置、电子设备、计算机可读存储介质 | |
CN111356001A (zh) | 视频显示区域获取方法以及视频画面的显示方法、装置 | |
CN111385400A (zh) | 背光亮度调节方法和装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21871268; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 21871268; Country of ref document: EP; Kind code of ref document: A1 |
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20-09-2023) |