CN116095353A - Live broadcast method and device based on volume video, electronic equipment and storage medium

Info

Publication number
CN116095353A
Authority
CN
China
Prior art keywords
live video
video data
fusion
information
virtual image
Prior art date
Legal status
Pending
Application number
CN202310080256.7A
Other languages
Chinese (zh)
Inventor
孙伟
罗栋藩
邵志兢
张煜
Current Assignee
Zhuhai Prometheus Vision Technology Co ltd
Original Assignee
Zhuhai Prometheus Vision Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhuhai Prometheus Vision Technology Co ltd filed Critical Zhuhai Prometheus Vision Technology Co ltd
Priority to CN202310080256.7A priority Critical patent/CN116095353A/en
Publication of CN116095353A publication Critical patent/CN116095353A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/2187 Live feed
    • H04N 21/4312 Generation of visual interfaces for content selection or interaction involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N 21/44008 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/44012 Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N 21/816 Monomedia components involving special video data, e.g. 3D video

Abstract

The embodiment of the application discloses a live broadcast method, a live broadcast device, electronic equipment and a storage medium based on volume video. In the embodiment of the application, live video data of a three-dimensional character model and a captured video of a fusion object can be acquired; image recognition processing is performed on the captured video of the fusion object to obtain an avatar of the fusion object; the live video data of the three-dimensional character model is parsed to obtain background information of the live video data; the avatar of the fusion object is fused with the background information of the live video data to obtain fused live video data; viewing angle information is acquired, and the display content of the three-dimensional character model and the avatar of the fusion object in the fused live video data is determined based on the viewing angle information; and the fused live video data is displayed based on the display content. Interaction between a viewing object and the three-dimensional character model in the live scene can thus be supported, improving the interactivity and interest of the volume video.

Description

Live broadcast method and device based on volume video, electronic equipment and storage medium
Technical Field
The application relates to the technical field of image processing, and in particular to a live broadcast method and device based on volume video, electronic equipment and a storage medium.
Background
Volume video (also known as volumetric video, spatial video, volumetric three-dimensional video, or 6-degree-of-freedom video, etc.) is a technique that generates a sequence of three-dimensional models by capturing information (e.g., depth information, color information, etc.) in three-dimensional space. Compared with traditional video, volume video adds the concept of space to video and uses three-dimensional models to better restore the real three-dimensional world, instead of simulating the sense of space of the real three-dimensional world with two-dimensional planar video plus camera movement. Because a volume video is a sequence of three-dimensional models, a user can adjust to any viewing angle for watching according to personal preference, so a volume video offers a higher degree of fidelity and immersion than two-dimensional planar video. Volume video can be applied in many different scenes; for example, it can be applied in a live broadcast scene. However, when volume video is applied to a live scene, interaction between the viewing object and the three-dimensional character model in the live scene is not supported.
Disclosure of Invention
The embodiment of the application provides a live broadcast method, device, equipment, storage medium and program product based on volume video, which can support interaction between a viewing object and a three-dimensional character model in a live broadcast scene, thereby improving the interactivity and interest of the volume video.
The embodiment of the application provides a live broadcast method based on volume video, which comprises the following steps:
acquiring live video data of a three-dimensional character model and a captured video of a fusion object;
performing image recognition processing on the captured video of the fusion object to obtain an avatar of the fusion object;
parsing the live video data of the three-dimensional character model to obtain background information of the live video data;
fusing the avatar of the fusion object with the background information of the live video data to obtain fused live video data;
acquiring viewing angle information, and determining display content of the three-dimensional character model and the avatar of the fusion object in the fused live video data based on the viewing angle information;
and displaying the fused live video data based on the display content.
Accordingly, an embodiment of the present application provides a live broadcast device based on volume video, including:
an acquisition unit, configured to acquire live video data of a three-dimensional character model and a captured video of a fusion object;
an image recognition unit, configured to perform image recognition processing on the captured video of the fusion object to obtain an avatar of the fusion object;
a parsing unit, configured to parse the live video data of the three-dimensional character model to obtain background information of the live video data;
a fusion unit, configured to fuse the avatar of the fusion object with the background information of the live video data to obtain fused live video data;
a determining unit, configured to acquire viewing angle information and determine display content of the three-dimensional character model and the avatar of the fusion object in the fused live video data based on the viewing angle information;
and a display unit, configured to display the fused live video data based on the display content.
In addition, the embodiment of the application also provides electronic equipment, which comprises a processor and a memory, wherein the memory stores a computer program, and the processor is configured to run the computer program in the memory to implement the live broadcast method based on volume video provided by the embodiment of the application.
In addition, the embodiment of the application further provides a computer-readable storage medium storing a computer program, the computer program being adapted to be loaded by a processor to execute any live broadcast method based on volume video provided by the embodiments of the application.
In addition, the embodiment of the application further provides a computer program product comprising a computer program which, when executed by a processor, implements any live broadcast method based on volume video provided by the embodiments of the application.
In the embodiment of the application, live video data of a three-dimensional character model and a captured video of a fusion object can be acquired; image recognition processing is performed on the captured video of the fusion object to obtain an avatar of the fusion object; the live video data of the three-dimensional character model is parsed to obtain background information of the live video data; the avatar of the fusion object is fused with the background information of the live video data to obtain fused live video data; viewing angle information is acquired, and the display content of the three-dimensional character model and the avatar of the fusion object in the fused live video data is determined based on the viewing angle information; and the fused live video data is displayed based on the display content. Interaction between a viewing object and the three-dimensional character model in the live scene is thus supported, improving the interactivity and interest of the volume video.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a live broadcast method based on volume video according to an embodiment of the present application;
FIG. 2 is a schematic diagram of different viewing angles provided by embodiments of the present application;
fig. 3 is a schematic structural diagram of a live broadcast device based on volume video according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The embodiment of the application provides a live broadcast method and device based on a volume video, electronic equipment and a storage medium. The live broadcast device based on the volume video can be integrated in electronic equipment, and the electronic equipment can be a server, a terminal and other equipment.
The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) acceleration services, and basic cloud computing services such as big data and artificial intelligence platforms.
The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited herein.
In addition, "plurality" in the embodiments of the present application means two or more. "first" and "second" and the like in the embodiments of the present application are used for distinguishing descriptions and are not to be construed as implying relative importance.
The following will describe in detail. The following description of the embodiments is not intended to limit the preferred embodiments.
In this embodiment, description will be made from the perspective of the live broadcast device based on volume video. For convenience, the live broadcast method based on volume video of the present application will be described in detail below with the live broadcast device integrated in a terminal, that is, with the terminal as the execution subject.
Referring to fig. 1, fig. 1 is a flow chart of a live broadcast method based on volume video according to an embodiment of the present application. The live broadcast method based on volume video may include:
101. and acquiring live video data of the three-dimensional character model and shooting video of the fusion object.
The live video of the embodiment of the application may be a video generated based on a volume video. For example, the live video may be a live concert, or the like.
Wherein the three-dimensional character model may be a character constructed based on volumetric video. For example, the three-dimensional character model may be a model constructed from singers.
A volume video (also called volumetric video, spatial video, volumetric three-dimensional video, or 6-degree-of-freedom video, etc.) is a technique for generating a three-dimensional model sequence by capturing information (such as depth information and color information) in three-dimensional space. Compared with traditional video, volume video adds the concept of space to video and uses three-dimensional models to better restore the real three-dimensional world, instead of simulating the sense of space of the real three-dimensional world with two-dimensional planar video plus camera movement. Because a volume video is a sequence of three-dimensional models, a user can adjust to any viewing angle for watching according to personal preference, so a volume video offers a higher degree of fidelity and immersion than two-dimensional planar video.
In one embodiment, in the present application, the three-dimensional model used to construct the volumetric video may be reconstructed as follows:
First, color images and depth images of a shooting object at different viewing angles, together with the camera parameters corresponding to the color images, are acquired; then, a neural network model implicitly expressing the three-dimensional model of the shooting object is trained according to the acquired color images, the corresponding depth images and the camera parameters, and iso-surface extraction is performed based on the trained neural network model, so as to realize three-dimensional reconstruction of the shooting object and obtain the three-dimensional model of the shooting object.
It should be noted that, in the embodiments of the present application, the neural network model of which architecture is adopted is not particularly limited, and may be selected by those skilled in the art according to actual needs. For example, a multi-layer perceptron (Multilayer Perceptron, MLP) without a normalization layer may be selected as a base model for model training.
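For illustration, a minimal sketch of such a base model is given below (Python/PyTorch; the sketch is not part of the original disclosure, and the class name, layer sizes and activation choices are assumptions): an MLP without normalization layers that maps the first coordinate information of a sampling point to a predicted SDF value and a predicted RGB color value.

```python
# Hypothetical sketch of the base model: an MLP with no normalization layers
# that maps a 3-D sample point to a predicted SDF value and an RGB color.
import torch
import torch.nn as nn

class ImplicitSDFModel(nn.Module):
    def __init__(self, hidden_dim: int = 256, num_layers: int = 8):
        super().__init__()
        layers, in_dim = [], 3  # (x, y, z) coordinates of a sampling point
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.Softplus(beta=100)]
            in_dim = hidden_dim
        self.backbone = nn.Sequential(*layers)
        self.sdf_head = nn.Linear(hidden_dim, 1)                               # predicted SDF value
        self.rgb_head = nn.Sequential(nn.Linear(hidden_dim, 3), nn.Sigmoid())  # predicted RGB value

    def forward(self, points: torch.Tensor):
        features = self.backbone(points)
        return self.sdf_head(features), self.rgb_head(features)
```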
The three-dimensional model reconstruction method provided in the present application will be described in detail below.
Firstly, a plurality of color cameras and depth cameras can be synchronously adopted to shoot a target object (the target object is a shooting object) which needs to be subjected to three-dimensional reconstruction at multiple visual angles, so as to obtain color images and corresponding depth images of the target object at multiple different visual angles, namely, at the same shooting moment (the difference value of actual shooting moments is smaller than or equal to a time threshold, namely, the shooting moments are considered to be the same), the color cameras at all visual angles shoot to obtain color images of the target object at the corresponding visual angles, and correspondingly, the depth cameras at all visual angles shoot to obtain depth images of the target object at the corresponding visual angles. The target object may be any object, including but not limited to living objects such as a person, an animal, and a plant, or inanimate objects such as a machine, furniture, and a doll.
Therefore, the color images of the target object at different visual angles are provided with the corresponding depth images, namely, when shooting, the color cameras and the depth cameras can adopt the configuration of a camera set, and the color cameras at the same visual angle are matched with the depth cameras to synchronously shoot the same target object. For example, a studio may be built, in which a central area is a photographing area, around which a plurality of sets of color cameras and depth cameras are paired at a certain angle interval in a horizontal direction and a vertical direction. When the target object is in the shooting area surrounded by the color cameras and the depth cameras, the color images and the corresponding depth images of the target object at different visual angles can be obtained through shooting by the color cameras and the depth cameras.
In addition, camera parameters of the color camera corresponding to each color image are further acquired. The camera parameters include internal parameters and external parameters of the color camera, which can be determined through calibration, wherein the internal parameters of the color camera are parameters related to the characteristics of the color camera, including but not limited to data such as focal length and pixels of the color camera, and the external parameters of the color camera are parameters of the color camera in a world coordinate system, including but not limited to data such as position (coordinates) of the color camera and rotation direction of the camera.
As described above, after obtaining the color images of the target object at different viewing angles and the corresponding depth images thereof at the same shooting time, the three-dimensional reconstruction of the target object can be performed according to the color images and the corresponding depth images thereof. Different from the mode of converting depth information into point cloud to perform three-dimensional reconstruction in the related technology, the method and the device train a neural network model to achieve implicit expression of the three-dimensional model of the target object, so that three-dimensional reconstruction of the target object is achieved based on the neural network model.
Optionally, the application selects an MLP that does not include a normalization layer as a base model, and trains the method as follows:
converting pixel points in each color image into rays based on corresponding camera parameters;
sampling a plurality of sampling points on the ray, and determining first coordinate information of each sampling point and a signed distance field (SDF) value of each sampling point relative to the pixel point;
inputting the first coordinate information of the sampling points into a basic model to obtain a predicted SDF value and a predicted RGB color value of each sampling point output by the basic model;
based on a first difference between the predicted SDF value and the SDF value and a second difference between the predicted RGB color value and the RGB color value of the pixel point, adjusting parameters of the basic model until a preset stop condition is met;
And taking the basic model meeting the preset stopping condition as a neural network model of the three-dimensional model of the implicitly expressed target object.
Firstly, converting a pixel point in a color image into a ray based on camera parameters corresponding to the color image, wherein the ray can be a ray passing through the pixel point and perpendicular to a color image surface; then, sampling a plurality of sampling points on the ray, wherein the sampling process of the sampling points can be executed in two steps, partial sampling points can be uniformly sampled firstly, and then the plurality of sampling points are further sampled at a key position based on the depth value of the pixel point, so that the condition that the sampling points near the surface of the model can be sampled as many as possible is ensured; then, calculating first coordinate information of each sampling point in a world coordinate system and an SDF value of each sampling point according to camera parameters and depth values of the pixel points, wherein the SDF value can be a difference value between the depth value of the pixel point and the distance between the sampling point and an imaging surface of the camera, the difference value is a signed value, when the difference value is a positive value, the sampling point is shown to be outside the three-dimensional model, when the difference value is a negative value, the sampling point is shown to be inside the three-dimensional model, and when the difference value is zero, the sampling point is shown to be on the surface of the three-dimensional model; then, after sampling of the sampling points is completed and the SDF value corresponding to each sampling point is obtained through calculation, first coordinate information of the sampling points in a world coordinate system is further input into a basic model (the basic model is configured to map the input coordinate information into the SDF value and the RGB color value and then output), the SDF value output by the basic model is recorded as a predicted SDF value, and the RGB color value output by the basic model is recorded as a predicted RGB color value; then, parameters of the basic model are adjusted based on a first difference between the predicted SDF value and the SDF value corresponding to the sampling point and a second difference between the predicted RGB color value and the RGB color value of the pixel point corresponding to the sampling point.
In addition, for other pixel points in the color image, sampling is performed in the above manner, and then coordinate information of the sampling point in the world coordinate system is input to the basic model to obtain a corresponding predicted SDF value and a predicted RGB color value, which are used for adjusting parameters of the basic model until a preset stopping condition is met, for example, the preset stopping condition may be configured to reach a preset number of iterations of the basic model, or the preset stopping condition may be configured to converge the basic model. When the iteration of the basic model meets the preset stopping condition, the neural network model which can accurately and implicitly express the three-dimensional model of the shooting object is obtained. Finally, an isosurface extraction algorithm can be adopted to extract the three-dimensional model surface of the neural network model, so that a three-dimensional model of the shooting object is obtained.
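As an illustration of the parameter-adjustment step, the following sketch (assuming the ImplicitSDFModel sketched earlier and illustrative tensor shapes) computes the first difference on SDF values and the second difference on RGB values and updates the base model; it is a simplified stand-in, not the patented training procedure itself.

```python
# One training step: penalize the SDF difference and the RGB difference.
import torch

def train_step(model, optimizer, sample_points, gt_sdf, pixel_rgb,
               sdf_weight=1.0, rgb_weight=1.0):
    pred_sdf, pred_rgb = model(sample_points)                     # (N, 1), (N, 3)
    first_diff = torch.abs(pred_sdf.squeeze(-1) - gt_sdf).mean()  # predicted SDF vs. SDF
    second_diff = torch.abs(pred_rgb - pixel_rgb).mean()          # predicted RGB vs. pixel RGB
    loss = sdf_weight * first_diff + rgb_weight * second_diff
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```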
Optionally, in some embodiments, determining an imaging plane of the color image based on camera parameters; and determining that the rays passing through the pixel points in the color image and perpendicular to the imaging surface are rays corresponding to the pixel points.
The coordinate information of the color image in the world coordinate system, namely the imaging surface, can be determined according to the camera parameters of the color camera corresponding to the color image. Then, it can be determined that the ray passing through the pixel point in the color image and perpendicular to the imaging plane is the ray corresponding to the pixel point.
Optionally, in some embodiments, determining second coordinate information and rotation angle of the color camera in the world coordinate system according to the camera parameters; and determining an imaging surface of the color image according to the second coordinate information and the rotation angle.
Optionally, in some embodiments, the first number of first sampling points are equally spaced on the ray; determining a plurality of key sampling points according to the depth values of the pixel points, and sampling a second number of second sampling points according to the key sampling points; the first number of first sampling points and the second number of second sampling points are determined as a plurality of sampling points obtained by sampling on the rays.
Firstly uniformly sampling n (i.e. a first number) first sampling points on rays, wherein n is a positive integer greater than 2; then, according to the depth value of the pixel point, determining a preset number of key sampling points closest to the pixel point from n first sampling points, or determining key sampling points smaller than a distance threshold from the pixel point from n first sampling points; then, resampling m second sampling points according to the determined key sampling points, wherein m is a positive integer greater than 1; and finally, determining the n+m sampling points obtained by sampling as a plurality of sampling points obtained by sampling on the rays. The m sampling points are sampled again at the key sampling points, so that the training effect of the model is more accurate at the surface of the three-dimensional model, and the reconstruction accuracy of the three-dimensional model is improved.
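A simplified sketch of this two-stage sampling is shown below (NumPy); the near/far bounds, the counts n and m, and the refinement radius are assumptions, and the key samples are approximated by a band around the pixel's depth value.

```python
# Sample n uniform points on a ray, then m extra points near the depth value,
# and compute each sample's SDF value as depth minus distance along the ray.
import numpy as np

def sample_points_on_ray(origin, direction, depth, n=64, m=32,
                         near=0.1, far=5.0, refine_radius=0.05):
    direction = direction / np.linalg.norm(direction)
    t_uniform = np.linspace(near, far, n)                 # first sampling points
    t_refined = np.random.uniform(depth - refine_radius,  # second sampling points
                                  depth + refine_radius, m)
    t_all = np.sort(np.concatenate([t_uniform, t_refined]))
    points = origin[None, :] + t_all[:, None] * direction[None, :]
    sdf = depth - t_all  # positive outside, negative inside, zero on the surface
    return points, sdf
```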
Optionally, in some embodiments, determining a depth value corresponding to the pixel point according to a depth image corresponding to the color image; calculating an SDF value of each sampling point from the pixel point based on the depth value; and calculating coordinate information of each sampling point according to the camera parameters and the depth values.
After a plurality of sampling points are sampled on the rays corresponding to each pixel point, for each sampling point, determining the distance between the shooting position of the color camera and the corresponding point on the target object according to the camera parameters and the depth value of the pixel point, and then calculating the SDF value of each sampling point one by one and the coordinate information of each sampling point based on the distance.
After training of the basic model is completed, for the coordinate information of any given point, the trained basic model can predict the corresponding SDF value, and the predicted SDF value indicates the positional relationship (inside, outside or on the surface) between the point and the three-dimensional model of the target object, thereby realizing implicit expression of the three-dimensional model of the target object and obtaining the neural network model that implicitly expresses the three-dimensional model of the target object.
Finally, iso-surface extraction is performed on the neural network model, for example, the surface of the three-dimensional model is drawn using a marching cubes (MC) iso-surface extraction algorithm, so as to obtain the surface of the three-dimensional model, and the three-dimensional model of the target object is further obtained from that surface.
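As an illustration, iso-surface extraction can be sketched as follows (Python, using scikit-image's marching cubes as a stand-in for the MC step; grid resolution, bounding box and chunk size are assumptions):

```python
# Query the trained network for SDF values on a dense grid, then run marching
# cubes on the zero level set to obtain the surface of the three-dimensional model.
import numpy as np
import torch
from skimage import measure

def extract_mesh(model, resolution=128, bound=1.0):
    xs = np.linspace(-bound, bound, resolution)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)  # (R, R, R, 3)
    points = torch.from_numpy(grid.reshape(-1, 3)).float()
    with torch.no_grad():
        sdf = torch.cat([model(chunk)[0] for chunk in torch.split(points, 65536)])
    volume = sdf.numpy().reshape(resolution, resolution, resolution)
    verts, faces, normals, _ = measure.marching_cubes(volume, level=0.0)
    return verts, faces, normals
```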
According to the three-dimensional reconstruction scheme, the three-dimensional model of the target object is implicitly modeled through the neural network, and depth information is added to improve the training speed and accuracy of the model. By adopting the three-dimensional reconstruction scheme provided by the application, the three-dimensional reconstruction is continuously carried out on the shooting object in time sequence, so that three-dimensional models of the shooting object at different moments can be obtained, and a three-dimensional model sequence formed by the three-dimensional models at different moments according to the time sequence is the volume video shot by the shooting object. Therefore, the volume video shooting can be carried out on any shooting object, and the volume video with specific content presentation can be obtained. For example, the dance shooting object can be shot with a volume video to obtain a volume video of dance of the shooting object at any angle, the teaching shooting object can be shot with a volume video to obtain a teaching volume video of the shooting object at any angle, and the like.
It should be noted that, the volume video according to the following embodiments of the present application may be obtained by shooting using the above volume video shooting method.
The fusion object may include a user watching the live video. In an embodiment, the application can support fusing a user watching the live video into the live video, so as to improve the interactivity of the live video. For example, a user may capture a video of himself or herself, and the captured video of the viewing user may then be fused with the live video.
In an embodiment, the user may request that the video of the viewing user be blended with the live video before or while the live video is being played.
For example, before the live video is played, a fusion request control may be included on the video playing interface, and the user may request to be fused into the live video by triggering the fusion request control. After the user triggers the fusion request control, the user's terminal device can acquire the user's captured video data and transmit the captured video data to the live broadcast terminal. Then, when the live video is played, the live broadcast terminal can acquire the live video data of the three-dimensional character model and the captured video of the fusion object.
In one embodiment, because there may be many viewers of the live video, the number of users that can be fused into the background of the live video is limited. For example, the number of users that can be fused into the background of the live video may be 100. Therefore, when all the preset fusion positions in the live video have been reserved, the fusion request control on the video playing interface is disabled, so as to avoid an excessive number of fusion objects affecting the live broadcast quality.
102. And performing image recognition processing on the shot video of the fusion object to obtain the virtual image of the fusion object.
In an embodiment, the user may also choose to fuse his or her captured video directly with the live video, or to fuse it with the live video in the form of an avatar.
For example, after the user triggers the merge request control, an avatar generation control and a merge selection control may be displayed on the video playback interface.
When the user selects the fusion selection control, the live broadcast device can directly fuse the shooting video of the user with the live broadcast video. When the user selects the avatar generation control, the avatar of the fusion object may be generated from the photographed video of the fusion object.
Among them, there are various ways in which the avatar of the fusion object can be generated.
For example, the live device may previously generate a plurality of avatar templates, and the user may select his/her desired avatar template, thereby generating his/her avatar.
For another example, the live broadcast device may intelligently generate the avatar of the fusion object from the user's captured video. For example, a contour image of the fusion object may be identified from the captured video, and then the contour image of the fusion object may be style-converted to obtain the avatar of the fusion object. Specifically, the step of performing image recognition processing on the captured video of the fusion object to obtain the avatar of the fusion object may include:
Carrying out frame division processing on the shot video to obtain at least one video frame of the shot video;
carrying out contour recognition on the video frame to obtain a contour image of the fusion object;
and performing style conversion on the contour image of the fusion object to obtain the virtual image of the fusion object.
In an embodiment, the captured video may be subjected to framing processing to obtain at least one video frame of the captured video. For example, the captured video may be framed using the Open Source Computer Vision Library (OpenCV) to obtain at least one video frame of the captured video. OpenCV is a cross-platform computer vision and machine learning software library that runs on multiple operating systems, provides interfaces for multiple programming languages, and implements many general-purpose algorithms for image processing and computer vision.
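A minimal OpenCV framing sketch is given below; reading every k-th frame is an assumed sampling choice, not something specified by the embodiment.

```python
# Split a captured video into frames with OpenCV.
import cv2

def split_into_frames(video_path: str, every_k: int = 1):
    frames, index = [], 0
    capture = cv2.VideoCapture(video_path)
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % every_k == 0:
            frames.append(frame)
        index += 1
    capture.release()
    return frames
```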
In an embodiment, after obtaining at least one video frame of the captured video, contour recognition may be performed on the video frame to obtain a contour image of the fusion object. There are various methods for performing contour recognition on a video frame to obtain a contour image of a fusion object.
For example, an artificial intelligence algorithm may be utilized to perform contour recognition on the video frames to obtain a contour image of the fused object. For example, the contour image of the fusion object may be obtained by performing contour recognition on the video frame using an artificial intelligence algorithm such as a convolutional neural network (Convolutional Neural Networks, CNN), a deconvolution neural network (De-Convolutional Networks, DN), or a deep neural network (Deep Neural Networks, DNN).
For another example, contour recognition can be performed on the video frame through the pixel points, so as to obtain a contour image of the fusion object. For example, pixel point information of a video frame may be extracted. Then, contour detection can be performed on the fusion object in the video frame according to the pixel point information, so as to obtain the position information of the contour pixel point. Then, the contour image of the fusion object can be cut out according to the position information of the contour pixel points.
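The pixel-based variant can be sketched as follows; Otsu thresholding and the largest-contour heuristic are assumptions standing in for the contour detection described above.

```python
# Detect the dominant contour in a video frame and crop the region it occupies.
import cv2
import numpy as np

def crop_contour_image(frame: np.ndarray):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)  # assume the fusion object dominates the frame
    x, y, w, h = cv2.boundingRect(largest)        # position information of the contour pixels
    return frame[y:y + h, x:x + w]
```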
In an embodiment, style conversion may be performed on the contour image of the fusion object to obtain the avatar of the fusion object. Style conversion of the contour image of the fusion object may refer to converting the image of the fusion object in the contour image into a virtual character style. For example, the image of the fusion object in the contour image may be converted into a cartoon style. For another example, the image of the fusion object in the contour image may be converted into a painting style, and so on. In one embodiment, an artificial intelligence algorithm may be used to perform style conversion on the contour image of the fusion object to obtain the avatar of the fusion object. For example, the contour image of the fusion object can be style-converted using an artificial intelligence algorithm such as CNN or DNN to obtain the avatar of the fusion object.
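As a simple illustration of style conversion, OpenCV's built-in edge-preserving stylization filter is used below as a stand-in for a trained style-transfer network such as a CNN; the filter parameters are assumptions.

```python
# Convert a contour image to a painted, avatar-like style.
import cv2

def to_avatar_style(contour_image):
    return cv2.stylization(contour_image, sigma_s=60, sigma_r=0.45)
```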
103. And analyzing the live video data of the three-dimensional character model to obtain the background information of the live video data.
The background information of the live video data may include the information in the live video data other than the three-dimensional character model. For example, in the live video data, the three-dimensional character model may belong to the foreground information, while information other than the three-dimensional character model may be the background information.
In an embodiment, the live video data may be parsed to obtain background information of the live video data. For example, foreground separation may be performed on video frames of live video to obtain background information of live video data. The foreground separation of the video frames of the live video may refer to separating the three-dimensional character model from the background information in the live video. For example, artificial intelligence techniques may be utilized to separate three-dimensional character models from background information in live video.
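For illustration, foreground/background separation on live video frames can be sketched with an OpenCV background subtractor; this is a simple stand-in for the AI-based separation mentioned above, and the function names other than the OpenCV calls are assumptions.

```python
# Separate the moving foreground (the three-dimensional character model region)
# from the background in a sequence of live video frames.
import cv2

def split_foreground_background(frames):
    subtractor = cv2.createBackgroundSubtractorMOG2()
    results = []
    for frame in frames:
        fg_mask = subtractor.apply(frame)  # foreground (character model) mask
        background = cv2.bitwise_and(frame, frame, mask=cv2.bitwise_not(fg_mask))
        results.append((fg_mask, background))
    return results
```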
104. And carrying out fusion processing on the virtual image of the fusion object and the background information of the live video data to obtain the fused live video data.
In an embodiment, in order to improve interactivity and interestingness of the volume video, the virtual image of the fusion object and background information of the live video data may be fused to obtain fused live video data.
In an embodiment, in order to ensure the quality of the live video, there are limited objects that can be fused in the background of the live video, and thus, the background of the live video may include at least one preset fusion position. And then, carrying out fusion processing on the virtual image of the fusion object and a preset fusion position in the background video to obtain fused live video data.
There are various ways to fuse the avatar of the fusion object with the background information of the live video data to obtain the fused live video data.
For example, the virtual image of the fusion object and the background information of the live video data may be randomly fused to obtain fused live video data. For example, a fusion position may be randomly allocated to the avatar of the fusion object, and then the avatar information of the fusion object may be fused into the background information according to the fusion position.
In an embodiment, the fusion object and the background information may be fused according to the level information of the fusion object. Specifically, the step of performing fusion processing on the virtual image of the fusion object and background information of the live video data to obtain fused live video data may include:
Acquiring grade information of a fusion object;
determining a fusion position of the virtual image of the fusion object in the background information of the live video data based on the grade information of the fusion object;
and carrying out fusion processing on the background information of the virtual image and the live video data according to the fusion position of the virtual image, so as to obtain the fused live video data.
The level information of the fusion object may include the account level of the fusion object, the amount paid for a preset fusion position, and the like. For example, the higher the amount the fusion object pays to be fused to a preset fusion position, the higher its level. As another example, the higher the account level of the fusion object, the higher its level, and so on.
Then, a fusion position of the avatar of the fusion object in the background information of the live video data may be determined according to the rating information of the fusion object. Specifically, the step of determining the fusion position of the avatar of the fusion object in the background information of the live video data based on the level information of the fusion object may include:
matching the level information of the fusion object with the level information of at least one preset fusion position in the background information to obtain a matching result;
And determining the fusion position of the avatar in the background information of the live video data in at least one preset fusion position based on the matching result.
For example, the preset fusion positions in the background information all have corresponding grade information. For example, the closer to the three-dimensional character model, the higher the level of the preset fusion position. Conversely, the farther from the three-dimensional character model the lower the level of the preset fusion position.
In an embodiment, the level information of at least one preset fusion position in the level information and the background information of the fusion object may be matched to obtain a matching result. Then, a fusion position of the avatar in the background information of the live video data may be determined in at least one preset fusion position based on the matching result. For example, the rank information of the fusion object may be compared with rank information of a preset fusion position. If the grade information of the fusion object is the same as the grade information of the preset fusion position, the fusion object is matched with the preset fusion position, and then the preset fusion position can be determined as the fusion position of the fusion object.
And then, the virtual image and the background information of the live video data can be fused according to the fusion position of the virtual image, so as to obtain the fused live video data.
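A hedged sketch of the level-based placement is given below: each preset fusion position carries a level, and the avatar is assigned a free position whose level matches the fusion object's level. The data structures are illustrative assumptions.

```python
# Match a fusion object's level to a preset fusion position in the background.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PresetPosition:
    position_id: int
    level: int        # higher level = closer to the three-dimensional character model
    x: float
    y: float
    occupied: bool = False

def assign_fusion_position(object_level: int,
                           positions: List[PresetPosition]) -> Optional[PresetPosition]:
    for position in positions:
        if not position.occupied and position.level == object_level:
            position.occupied = True
            return position
    return None  # no matching position is free; the fusion request may be declined
```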
105. And acquiring viewing angle information, and determining the display content of the three-dimensional character model and the virtual image of the fusion object in the fused live video data based on the viewing angle information.
In an embodiment, after the fused live video data is generated, because the live video is based on a three-dimensional character model, the live video can be presented in 360 degrees. Thus, a viewer can change the display content of the live video by adjusting his or her viewing angle. For example, as shown in fig. 2, 001 may be the live content viewed by a viewer from a first viewing angle, and 002 may be the live content viewed by a viewer from a second viewing angle. The first viewing angle and the second viewing angle are different viewing angles, so the content displayed by the live video is also different. Since the user can adjust his or her own viewing angle, the viewing angle information of the user can be acquired, and the display content of the three-dimensional character model and the avatar of the fusion object in the fused live video data can be determined based on the viewing angle information.
The viewing angle information may refer to the change between the user's original viewing angle and current viewing angle, that is, the angle change between the first viewing angle and the second viewing angle. For example, as shown in fig. 2, the first viewing angle and the second viewing angle in the figure differ by 180 degrees.
In one embodiment, the step of determining the display contents of the three-dimensional character model and the avatar of the fusion object in the fused live video data based on the viewing angle information may include:
performing angle calculation processing on the viewing angle information to obtain a viewing angle;
mapping the viewing angle to the fused live video data to obtain a display range corresponding to the live video data;
and determining the display content of the three-dimensional character model and the virtual image of the fusion object in the fused live video data according to the display range.
In an embodiment, angle calculation processing may be performed on the viewing angle information to obtain the viewing angle. For example, the viewing angle information may indicate the angle change between the first viewing angle and the second viewing angle. Then, the angle information corresponding to the first viewing angle and the angle change may be combined arithmetically to obtain the viewing angle. For example, the angle information corresponding to the first viewing angle and the angle change may be added or subtracted to obtain the viewing angle.
In an embodiment, the viewing angle may be mapped to the fused live video data to obtain the display range corresponding to the live video data. For example, the live video has preset three-dimensional coordinates. The viewing angle may be mapped into the three-dimensional coordinates, and the display range corresponding to the live video data may then be determined from the three-dimensional coordinates.
Then, the display contents of the three-dimensional character model and the avatar of the fusion object in the fused live video data may be determined according to the display range.
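The angle calculation and mapping can be illustrated as follows; treating the viewing angle as a yaw angle and using a fixed horizontal field of view are simplifying assumptions.

```python
# Combine the original viewing angle with the reported angle change, then map
# the resulting viewing angle to a visible azimuth interval (the display range).
def compute_display_range(original_angle_deg: float,
                          angle_change_deg: float,
                          fov_deg: float = 90.0):
    viewing_angle = (original_angle_deg + angle_change_deg) % 360.0  # angle calculation
    half_fov = fov_deg / 2.0
    return ((viewing_angle - half_fov) % 360.0,
            (viewing_angle + half_fov) % 360.0)                      # display range
```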
106. And displaying the fused live video data based on the display content.
In one embodiment, after determining the display content of the three-dimensional character model and the avatar of the fusion object in the fused live video data, the fused live video data may be displayed based on the display content.
In an embodiment, in order to further increase the interest of the live video, the avatar in the fused live video data may also be changed along with the motion of the three-dimensional character model. For example, when the hands of the three-dimensional character model are lifted, the avatar in the fused live video data may be lifted following the hands of the three-dimensional character model. For another example, as the three-dimensional character model turns around, the avatar in the fused live video data may rotate following the three-dimensional character model.
Specifically, the embodiment of the application may further include:
detecting the action of the three-dimensional character model;
when detecting that the action of the three-dimensional character model is matched with the preset triggering action, extracting virtual image control logic corresponding to the preset triggering action;
And adjusting the position information of the avatar in the fused live video data according to the avatar control logic.
In one embodiment, the motion of the three-dimensional character model in the fused live video data may be detected. By detecting the motion of the three-dimensional character model, it is possible to know what motion the three-dimensional character model has performed.
In one embodiment, it may be preset what actions of the three-dimensional character model will cause the position of the avatar to change. Thus, the actions of the three-dimensional character model and the preset trigger actions can be matched.
When it is detected that the action of the three-dimensional character model matches the preset trigger action, the avatar control logic corresponding to the preset trigger action can be extracted. The avatar control logic may be used to specify how the position of the avatar changes. For example, the avatar control logic may specify that when the three-dimensional character model rises, the position of the avatar follows the rise of the three-dimensional character model. As another example, the avatar control logic may specify that when the hand of the three-dimensional character model is lowered, the position of the avatar follows the lowering of the hand.
Then, the position information of the avatar in the fused live video data is adjusted according to the avatar control logic. Specifically, the step of adjusting the position information of the avatar in the fused live video data according to the avatar control logic may include:
calculating dynamic change acceleration according to the action of the three-dimensional character model;
converting the dynamically changing acceleration into noise information acting on the avatar according to the avatar control logic;
and adding the noise information to the avatar, so as to adjust the position information of the avatar in the fused live video data.
In one embodiment, when the motion of the three-dimensional character model is detected, the speed at which the motion changes may also be detected. The dynamically changing acceleration may then be calculated from the motion change speed of the three-dimensional character model, and this dynamically changing acceleration can be used to change the position of the avatar.
In one embodiment, the dynamically changing acceleration may be converted into noise information acting on the avatar according to the avatar control logic. The noise information may cause the position of the avatar to change. For example, the noise information may be Perlin noise. The dynamically changing acceleration may be converted into noise information acting on the avatar according to the avatar control logic; for example, when the avatar control logic specifies that the position of the avatar follows the rise of the three-dimensional character model, noise information in the upward direction can be generated. By adding the noise information to the avatar, the position information of the avatar in the fused live video data can be adjusted.
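A minimal sketch of this noise-driven adjustment is given below; the third-party `noise` package (providing a 1-D Perlin noise function) and all scaling constants are assumptions.

```python
# Turn the dynamically changing acceleration into a Perlin-noise displacement
# applied to each fused avatar's position.
import noise  # pip install noise

def adjust_avatar_positions(avatar_positions, acceleration, direction,
                            time_s, amplitude=0.1):
    strength = amplitude * acceleration  # stronger motion -> larger displacement
    adjusted = []
    for i, (x, y, z) in enumerate(avatar_positions):
        n = noise.pnoise1(time_s + i * 0.37)  # smooth per-avatar noise sample
        adjusted.append((x + direction[0] * n * strength,
                         y + direction[1] * n * strength,
                         z + direction[2] * n * strength))
    return adjusted
```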
In an embodiment, in order to further improve the interactivity between viewers of the live video and the live video, the embodiment of the application can also support, based on cloud rendering technology, a user watching the live video in operating his or her avatar to move freely in the scene, so that each of thousands of viewers can watch the concert in a personalized way.
For example, whether a user of the live video is qualified to control his or her avatar to move freely in the scene may be determined according to the level of the user watching the live video. For example, the level of a user watching the live video may range from level 1 to level 5, where a level-5 user can control his or her avatar to move in the scene, while users of levels 1 to 4 cannot.
For another example, it may be determined whether the user of the live video can control his/her avatar to freely move in the scene according to the position of the avatar of the user watching the live video in the live video. For example, avatars that may define certain locations in the live video background may move, and users at those locations may operate their own avatars to move.
As can be seen from the above, in the embodiment of the present application, live video data of a three-dimensional character model and a captured video of a fusion object can be acquired; image recognition processing is performed on the captured video of the fusion object to obtain an avatar of the fusion object; the live video data of the three-dimensional character model is parsed to obtain background information of the live video data; the avatar of the fusion object is fused with the background information of the live video data to obtain fused live video data; viewing angle information is acquired, and the display content of the three-dimensional character model and the avatar of the fusion object in the fused live video data is determined based on the viewing angle information; and the fused live video data is displayed based on the display content. Interaction between a viewing object and the three-dimensional character model in the live scene is thus supported, improving the interactivity and interest of the volume video.
In order to facilitate better implementation of the live broadcast method based on volume video provided by the embodiment of the application, the embodiment of the application also provides a device based on the above live broadcast method. The meanings of the terms are the same as in the live broadcast method described above, and specific implementation details may be found in the description of the method embodiments.
For example, as shown in fig. 3, the volume video-based live device may include:
an acquiring unit 301, configured to acquire live video data of a three-dimensional character model and a captured video of a fusion object;
the image recognition unit 302 is configured to perform image recognition processing on the captured video of the fusion object, so as to obtain an avatar of the fusion object;
the parsing unit 303 is configured to parse live video data of the three-dimensional character model to obtain background information of the live video data;
a fusion unit 304, configured to perform fusion processing on the virtual image of the fusion object and background information of the live video data, so as to obtain fused live video data;
a determining unit 305, configured to obtain viewing angle information, and determine display contents of the three-dimensional character model and the avatar of the fusion object in the fused live video data based on the viewing angle information;
and a display unit 306, configured to display the fused live video data based on the display content.
In an embodiment, the image recognition unit 302 may include:
the frame-dividing processing subunit is used for carrying out frame-dividing processing on the shot video to obtain at least one video frame of the shot video;
The contour recognition subunit is used for carrying out contour recognition on the video frame to obtain a contour image of the fusion object;
and the style conversion subunit is used for performing style conversion on the contour image of the fusion object to obtain the virtual image of the fusion object.
In an embodiment, the fusing unit 304 may include:
an information acquisition subunit, configured to acquire level information of the fusion object;
a position determining subunit, configured to determine, based on the level information of the fusion object, a fusion position of an avatar of the fusion object in background information of the live video data;
and the fusion subunit is used for carrying out fusion processing on the virtual image and the background information of the live video data according to the fusion position of the virtual image to obtain fused live video data.
In an embodiment, the location determining subunit may include:
the matching module is used for matching the grade information of the fusion object with the grade information of at least one preset fusion position in the background information to obtain a matching result;
and the position determining module is used for determining the fusion position of the avatar in the background information of the live video data in the at least one preset fusion position based on the matching result.
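For illustration only, the matching of grade information against preset fusion positions may be sketched as follows; the level thresholds and coordinates are assumed example values, not values specified by this application.

from dataclasses import dataclass

@dataclass
class PresetPosition:
    name: str
    min_level: int    # minimum viewer grade this slot accepts (assumed)
    coords: tuple     # (x, y, z) placement inside the background scene (assumed)

PRESET_POSITIONS = [
    PresetPosition("front_row", min_level=30, coords=(0.0, 0.0, 1.0)),
    PresetPosition("middle_row", min_level=10, coords=(0.0, 0.0, 3.0)),
    PresetPosition("back_row", min_level=0, coords=(0.0, 0.0, 6.0)),
]

def match_fusion_position(viewer_level, presets=PRESET_POSITIONS):
    # Match the fusion object's grade against each preset position's requirement.
    eligible = [p for p in presets if viewer_level >= p.min_level]
    if not eligible:
        # No slot matched: fall back to the least demanding preset position.
        return min(presets, key=lambda p: p.min_level)
    # Among the matched slots, take the one closest to the stage (smallest z).
    return min(eligible, key=lambda p: p.coords[2])

# Example: a grade-12 viewer's avatar is fused into the middle row.
print(match_fusion_position(12).name)   # -> "middle_row"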
In an embodiment, the determining unit 305 may include:
the angle calculating subunit is used for carrying out angle calculation processing on the viewing angle information to obtain a viewing angle;
the mapping subunit is used for mapping the viewing angle to the fused live video data to obtain a display range corresponding to the live video data;
and the content determining subunit is used for determining the display content of the three-dimensional character model and the virtual image of the fusion object in the fused live video data according to the display range.
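For illustration only, the angle calculation and mapping performed by these subunits may be sketched as follows; the 90-degree field of view and the object azimuths are assumed example values.

import math

FOV_DEG = 90.0  # assumed horizontal field of view of the virtual camera

def viewing_angle(dx, dy):
    # Angle calculation: turn a drag/gyroscope offset into a yaw angle in degrees.
    return math.degrees(math.atan2(dy, dx)) % 360.0

def display_range(yaw_deg, fov_deg=FOV_DEG):
    # Mapping: project the viewing angle onto the fused scene as an azimuth window.
    half = fov_deg / 2.0
    return (yaw_deg - half) % 360.0, (yaw_deg + half) % 360.0

def in_range(azimuth_deg, window):
    start, end = window
    azimuth_deg %= 360.0
    if start < end:
        return start <= azimuth_deg < end
    return azimuth_deg >= start or azimuth_deg < end

# Scene objects and their azimuths around the virtual camera (assumed values).
scene = {"character_model": 0.0, "avatar_A": 30.0, "avatar_B": 200.0}

window = display_range(viewing_angle(1.0, 0.2))
visible = [name for name, az in scene.items() if in_range(az, window)]
print(visible)   # the character model and avatar_A fall inside the 90-degree window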
In an embodiment, the live broadcast device may further include:
a detection unit for detecting the motion of the three-dimensional character model;
the extraction unit is used for extracting virtual image control logic corresponding to a preset trigger action when detecting that the action of the three-dimensional character model is matched with the preset trigger action;
and the adjusting unit is used for adjusting the position information of the virtual image in the fused live video data according to the virtual image control logic.
In an embodiment, the adjusting unit may include:
a calculating subunit, configured to calculate a dynamically changing acceleration according to the motion of the three-dimensional character model;
a conversion subunit for converting the dynamically changing acceleration into noise information acting on the avatar according to the avatar control logic;
and the adding subunit is used for adding the noise information to the virtual image, so as to adjust the position information of the virtual image in the fused live video data.
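For illustration only, the calculating, conversion and adding subunits may be sketched as follows; the gain and jitter constants are assumed example values.

import random

def dynamic_acceleration(prev_velocity, curr_velocity, dt):
    # Calculating subunit: acceleration from two successive motion samples of the character model.
    return tuple((c - p) / dt for p, c in zip(prev_velocity, curr_velocity))

def acceleration_to_noise(acceleration, gain=0.05, jitter=0.01):
    # Conversion subunit: scale the acceleration into small noise offsets, with a
    # little random jitter so avatars at the same spot do not move identically.
    return tuple(gain * a + random.uniform(-jitter, jitter) for a in acceleration)

def apply_noise(avatar_position, noise):
    # Adding subunit: shift the avatar's position in the fused live video data.
    return tuple(p + n for p, n in zip(avatar_position, noise))

# Example: the character model jumps (velocity changes sharply on the y axis),
# so a nearby viewer avatar is nudged upward as well.
prev_v, curr_v = (0.0, 0.0, 0.0), (0.0, 2.0, 0.0)
noise = acceleration_to_noise(dynamic_acceleration(prev_v, curr_v, dt=0.1))
print(apply_noise((1.0, 0.0, 3.0), noise))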
In specific implementations, each of the above modules may be implemented as an independent entity, or may be combined arbitrarily and implemented as the same entity or several entities. For the specific implementation and the corresponding beneficial effects of each module, reference may be made to the foregoing method embodiments, which are not repeated here.
The embodiment of the application further provides an electronic device, which may be a server or a terminal. Fig. 4 shows a schematic structural diagram of the electronic device according to the embodiment of the present application. Specifically:
the electronic device may include a processor 601 having one or more processing cores, a memory 602 having one or more computer-readable storage media, a power supply 603, an input unit 604, and other components. Those skilled in the art will appreciate that the electronic device structure shown in fig. 4 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or have a different arrangement of components. Wherein:
The processor 601 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing computer programs and/or modules stored in the memory 602, and invoking data stored in the memory 602. Optionally, the processor 601 may include one or more processing cores; preferably, the processor 601 may integrate an application processor and a modem processor, wherein the application processor primarily handles operating systems, user interfaces, applications, etc., and the modem processor primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601.
The memory 602 may be used to store computer programs and modules, and the processor 601 executes various functional applications and performs data processing by running the computer programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, a computer program required for at least one function (such as a sound playing function or an image playing function), and so on; the data storage area may store data created according to the use of the electronic device, and the like. In addition, the memory 602 may include high-speed random access memory, and may further include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 602 may further include a memory controller to provide the processor 601 with access to the memory 602.
The electronic device further comprises a power supply 603 for supplying power to the various components. Preferably, the power supply 603 may be logically connected to the processor 601 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system. The power supply 603 may further include any combination of one or more of a direct-current or alternating-current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and other components.
The electronic device may further comprise an input unit 604, which input unit 604 may be used for receiving input digital or character information and for generating keyboard, mouse, joystick, optical or trackball signal inputs in connection with user settings and function control.
Although not shown, the electronic device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 601 in the electronic device loads executable files corresponding to the processes of one or more computer programs into the memory 602 according to the following instructions, and the processor 601 executes the computer programs stored in the memory 602, so as to implement various functions, such as:
Acquiring live video data of a three-dimensional character model and shooting video of a fusion object;
performing image recognition processing on the shot video of the fusion object to obtain a virtual image of the fusion object;
analyzing the live video data of the three-dimensional character model to obtain background information of the live video data;
carrying out fusion processing on the virtual image of the fusion object and the background information of the live video data to obtain fused live video data;
obtaining viewing angle information, and determining display contents of the three-dimensional character model and the virtual image of the fusion object in the fused live video data based on the viewing angle information;
and displaying the fused live video data based on the display content.
The specific embodiments and the corresponding beneficial effects of the above operations can be seen from the above detailed description of the live broadcast method based on the volume video, and will not be described herein.
It will be appreciated by those of ordinary skill in the art that all or part of the steps of the various methods in the above embodiments may be completed by a computer program, or completed by related hardware controlled by a computer program, and the computer program may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer readable storage medium having stored therein a computer program that is capable of being loaded by a processor to perform steps in any of the volumetric video based live methods provided by embodiments of the present application. For example, the computer program may perform the steps of:
acquiring live video data of a three-dimensional character model and shooting video of a fusion object;
performing image recognition processing on the shot video of the fusion object to obtain a virtual image of the fusion object;
analyzing the live video data of the three-dimensional character model to obtain background information of the live video data;
carrying out fusion processing on the virtual image of the fusion object and the background information of the live video data to obtain fused live video data;
obtaining viewing angle information, and determining display contents of the three-dimensional character model and the virtual image of the fusion object in the fused live video data based on the viewing angle information;
and displaying the fused live video data based on the display content.
The specific embodiments and the corresponding beneficial effects of each of the above operations can be found in the foregoing embodiments, and are not described herein again.
Wherein the computer-readable storage medium may include: a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or the like.
Because the computer program stored in the computer-readable storage medium can execute the steps in any live broadcast method based on volume video provided in the embodiments of the present application, the beneficial effects that can be achieved by any live broadcast method based on volume video provided in the embodiments of the present application can likewise be achieved; for details, refer to the previous embodiments, which are not repeated here.
According to one aspect of the present application, a computer program product or computer program is provided, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the live broadcast method based on volume video described above.
The foregoing describes in detail the live broadcast method, device, electronic equipment and storage medium based on volume video provided in the embodiments of the present application. Specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method and core ideas of the present application. Meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope according to the ideas of the present application. In view of the above, the contents of this description should not be construed as limiting the present application.

Claims (10)

1. A live broadcast method based on volume video, comprising:
acquiring live video data of a three-dimensional character model and shooting video of a fusion object;
performing image recognition processing on the shot video of the fusion object to obtain a virtual image of the fusion object;
analyzing the live video data of the three-dimensional character model to obtain background information of the live video data;
carrying out fusion processing on the virtual image of the fusion object and the background information of the live video data to obtain fused live video data;
obtaining viewing angle information, and determining display contents of the three-dimensional character model and the virtual image of the fusion object in the fused live video data based on the viewing angle information;
and displaying the fused live video data based on the display content.
2. The method according to claim 1, wherein the performing image recognition processing on the shot video of the fusion object to obtain the virtual image of the fusion object comprises:
framing the shot video to obtain at least one video frame of the shot video;
performing contour recognition on the video frame to obtain a contour image of the fusion object;
and performing style conversion on the contour image of the fusion object to obtain the virtual image of the fusion object.
3. The method according to claim 1, wherein the performing fusion processing on the virtual image of the fusion object and the background information of the live video data to obtain fused live video data comprises:
acquiring the grade information of the fusion object;
determining a fusion position of the virtual image of the fusion object in the background information of the live video data based on the grade information of the fusion object;
and carrying out fusion processing on the virtual image and the background information of the live video data according to the fusion position of the virtual image to obtain the fused live video data.
4. The method according to claim 3, wherein the determining, based on the grade information of the fusion object, a fusion position of the virtual image of the fusion object in the background information of the live video data comprises:
matching the grade information of the fusion object with the grade information of at least one preset fusion position in the background information to obtain a matching result;
and determining the fusion position of the virtual image in the background information of the live video data in the at least one preset fusion position based on the matching result.
5. The method according to claim 1, wherein the determining, based on the viewing angle information, display content of the three-dimensional character model and the virtual image of the fusion object in the fused live video data comprises:
performing angle calculation processing on the viewing angle information to obtain a viewing angle;
mapping the viewing angle to the fused live video data to obtain a display range corresponding to the live video data;
and determining the display contents of the three-dimensional character model and the virtual image of the fusion object in the fused live video data according to the display range.
6. The method according to claim 1 or 5, characterized in that the method further comprises:
detecting the action of the three-dimensional character model;
when detecting that the action of the three-dimensional character model is matched with a preset trigger action, extracting virtual image control logic corresponding to the preset trigger action;
and adjusting position information of the virtual image in the fused live video data according to the virtual image control logic.
7. The method according to claim 6, wherein the adjusting the position information of the virtual image in the fused live video data according to the virtual image control logic comprises:
calculating a dynamically changing acceleration according to the action of the three-dimensional character model;
converting the dynamically changing acceleration into noise information acting on the avatar according to the avatar control logic;
and adding the noise information to the virtual image, so as to adjust the position information of the virtual image in the fused live video data.
8. A live broadcast device based on volume video, comprising:
the acquisition unit is used for acquiring live video data of the three-dimensional character model and shooting video of the fusion object;
the image recognition unit is used for performing image recognition processing on the shot video of the fusion object to obtain the virtual image of the fusion object;
the analyzing unit is used for analyzing the live video data of the three-dimensional character model to obtain background information of the live video data;
the fusion unit is used for carrying out fusion processing on the virtual image of the fusion object and the background information of the live video data to obtain fused live video data;
a determining unit, configured to obtain viewing angle information, and determine display content of the three-dimensional character model and the virtual image of the fusion object in the fused live video data based on the viewing angle information;
And the display unit is used for displaying the fused live video data based on the display content.
9. An electronic device comprising a processor and a memory, the memory storing a computer program, the processor being configured to execute the computer program in the memory to perform the live broadcast method based on volume video according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded by a processor to perform the live broadcast method based on volume video according to any one of claims 1 to 7.
CN202310080256.7A 2023-02-02 2023-02-02 Live broadcast method and device based on volume video, electronic equipment and storage medium Pending CN116095353A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310080256.7A CN116095353A (en) 2023-02-02 2023-02-02 Live broadcast method and device based on volume video, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310080256.7A CN116095353A (en) 2023-02-02 2023-02-02 Live broadcast method and device based on volume video, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116095353A true CN116095353A (en) 2023-05-09

Family

ID=86204218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310080256.7A Pending CN116095353A (en) 2023-02-02 2023-02-02 Live broadcast method and device based on volume video, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116095353A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116975654A (en) * 2023-08-22 2023-10-31 腾讯科技(深圳)有限公司 Object interaction method, device, electronic equipment, storage medium and program product
CN116975654B (en) * 2023-08-22 2024-01-05 腾讯科技(深圳)有限公司 Object interaction method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination