CN117793481A - Video stream generation method, device, equipment and readable storage medium - Google Patents

Video stream generation method, device, equipment and readable storage medium Download PDF

Info

Publication number
CN117793481A
Authority
CN
China
Prior art keywords
video stream
virtual scene
generating
target
target virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311835499.3A
Other languages
Chinese (zh)
Inventor
胡冬晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202311835499.3A priority Critical patent/CN117793481A/en
Publication of CN117793481A publication Critical patent/CN117793481A/en
Pending legal-status Critical Current

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present application discloses a video stream generation method, apparatus, device, and readable storage medium, relating to the technical field of video processing and intended to improve the flexibility of the video stream generation process. The method includes the following steps: generating a target virtual scene according to an original virtual scene, wherein attribute information of the target virtual scene is the same as that of the original virtual scene; acquiring a 3D element set in the target virtual scene; generating a second video stream based on the 3D element; and sending the second video stream to a server, wherein the second video stream is used by the server to obtain a target video stream based on a first video stream of the original virtual scene and the second video stream. Embodiments of the present application can improve the flexibility of the video stream generation process.

Description

Video stream generation method, device, equipment and readable storage medium
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to a method, an apparatus, a device, and a readable storage medium for generating a video stream.
Background
In the prior art, the elements of a virtual scene, whether rendered offline or in real time, are constructed or prefabricated in advance, which results in poor flexibility in the process of generating a video stream based on the virtual scene.
Disclosure of Invention
The embodiment of the application provides a video stream generation method, a device, equipment and a readable storage medium, so as to improve the flexibility of a video stream generation process.
In a first aspect, an embodiment of the present application provides a method for generating a video stream, including:
generating a target virtual scene according to an original virtual scene, wherein attribute information of the target virtual scene is the same as that of the original virtual scene;
acquiring a three-dimensional (3D) element set in the target virtual scene;
generating a second video stream based on the 3D element;
and sending the second video stream to a server, wherein the second video stream is used by the server to obtain a target video stream based on a first video stream of the original virtual scene and the second video stream.
In a second aspect, an embodiment of the present application further provides a video stream generating method, including:
receiving a second video stream sent by a client, wherein the second video stream is generated by the client based on 3D elements set in a target virtual scene, and the target virtual scene is generated based on an original virtual scene in the first video stream;
and obtaining a target video stream according to the first video stream and the second video stream.
In a third aspect, an embodiment of the present application further provides a video stream generating apparatus, including:
the first generation module is used for generating a target virtual scene according to an original virtual scene, wherein the attribute information of the target virtual scene is the same as the attribute information of the original virtual scene;
the first acquisition module is used for acquiring three-dimensional 3D elements arranged in the target virtual scene;
a second generation module for generating a second video stream based on the 3D element;
and the first sending module is used for sending the second video stream to a server, wherein the target video stream is obtained by the server based on a first video stream of the original virtual scene and the second video stream.
In a fourth aspect, the present application further provides a video stream generating apparatus, including:
the first receiving module is used for receiving a second video stream sent by a client, wherein the second video stream is generated by the client based on 3D elements arranged in a target virtual scene, and the target virtual scene is generated based on an original virtual scene in the first video stream;
the first generation module is used for obtaining a target video stream according to the first video stream and the second video stream.
In a fifth aspect, embodiments of the present application further provide a communication device, including: a memory, a processor, and a program stored on the memory and executable on the processor, wherein the processor implements the steps of the video stream generating method described above when executing the program.
In a sixth aspect, embodiments of the present application further provide a readable storage medium having a program stored thereon, which when executed by a processor, implements the steps in the video stream generating method as described above.
In the embodiments of the present application, in the process of generating the target video stream, the target virtual scene can be generated based on the original virtual scene, and the 3D element can be set in the target virtual scene, so that the second video stream is generated according to the set 3D element, and the target video stream is obtained according to the first video stream of the original virtual scene and the second video stream. Because the 3D element can be set during generation of the target video stream, the problems caused by presetting 3D elements in the prior art can be avoided, and the flexibility of the video stream generation process can be improved.
Drawings
Fig. 1 is one of flowcharts of a video stream generating method provided in an embodiment of the present application;
FIG. 2 is a second flowchart of a video stream generating method according to an embodiment of the present application;
fig. 3 is a schematic diagram of a video stream generating method according to an embodiment of the present application;
fig. 4 is one of the block diagrams of the video stream generating apparatus provided in the embodiment of the present application;
fig. 5 is a second block diagram of the video stream generating apparatus according to the embodiment of the present application.
Detailed Description
In the embodiments of the present application, the term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent: A exists alone, both A and B exist, or B exists alone. The character "/" generally indicates that the associated objects are in an "or" relationship.
The term "plurality" in the embodiments of the present application means two or more, and other adjectives are similar thereto.
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of a video stream generating method according to an embodiment of the present application, as shown in fig. 1, including the following steps:
and 101, generating a target virtual scene according to an original virtual scene, wherein attribute information of the target virtual scene is the same as that of the original virtual scene.
Wherein the original virtual scene includes, but is not limited to, a virtual game scene, a virtual conference scene, a virtual live scene, a virtual shopping scene, and the like.
In practical applications, the original virtual scene seen by the user is a video stream pushed to the user by the server, and the user's interactions with the video are realized using real-time communication technology. Based on this, in this step, the client may establish communication with the server through WebRTC (Web Real-Time Communication), so that the first video stream of the original virtual scene transmitted by the server can be acquired.
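As an illustrative sketch only (the patent does not prescribe an implementation), the client-side WebRTC setup described above could look like the following. The function name `receiveFirstVideoStream`, the `signaling` channel object, and its `send`/`onMessage` methods are hypothetical names introduced here for illustration:

```javascript
// Hypothetical sketch: the client opens a WebRTC peer connection and waits
// for the server to push the first video stream of the original virtual
// scene. `signaling` is an assumed message channel (e.g. a WebSocket
// wrapper); none of these names come from the patent itself.
function receiveFirstVideoStream(signaling, onStream) {
  const pc = new RTCPeerConnection();
  // The client only receives video; the server adds the stream on its side.
  pc.addTransceiver('video', { direction: 'recvonly' });
  pc.ontrack = (event) => onStream(event.streams[0]);
  pc.onicecandidate = (event) => {
    if (event.candidate) signaling.send({ candidate: event.candidate });
  };
  signaling.onMessage(async (msg) => {
    if (msg.answer) await pc.setRemoteDescription(msg.answer);
    if (msg.candidate) await pc.addIceCandidate(msg.candidate);
  });
  return pc
    .createOffer()
    .then((offer) => pc.setLocalDescription(offer))
    .then(() => signaling.send({ offer: pc.localDescription }));
}
```

The function uses browser-only APIs (`RTCPeerConnection`) and is shown as a sketch of the signaling flow rather than a complete client.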
In this step, a canvas is generated based on the first video stream, and attribute information of the original virtual scene is acquired; the target virtual scene is then generated on the canvas through a three-dimensional detection technology according to the attribute information of the original virtual scene. The attribute information includes, but is not limited to, the size, shape, center-point coordinate position, spatial translation, rotation data, and line-of-sight range. From this information, the contours, wireframe structure, and the like of the target virtual scene can be constructed in the canvas.
Specifically, a layer of canvas is superimposed on top of the first video stream; the canvas can be understood as an interactive 3D scene implemented using WebGL (Web Graphics Library) technology. A canvas differs from a video in that a video may include both sound and pictures, while a canvas has only pictures. The original virtual scene (a 3D virtual scene) shown in the video picture of the first video stream is replicated in the empty canvas: based on an image-based AI (Artificial Intelligence) 3D detection technique, 3D space coordinates can be calculated from the RGB values of the 2D projection of the original virtual scene, a 3D point cloud is formed from these coordinates, and the point cloud is then fused into a 3D model. In the embodiments of the present application, information such as the size, position, and viewing-distance range of the environment space of the original virtual scene is detected in this way. A replica virtual space with the same size, position, and operation experience can then be constructed from this information. The virtual space may not include any objects.
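The 3D-coordinate recovery described above can be illustrated with a simple pinhole-camera back-projection, assuming a per-pixel depth estimate is available from the AI detection step. The function names and the camera intrinsics parameters (`fx`, `fy`, `cx`, `cy`) are illustrative assumptions, not details from the patent:

```javascript
// Hypothetical back-projection: given a pixel (u, v), an estimated depth z,
// and camera intrinsics (fx, fy, cx, cy), recover a 3D point. The depth
// estimate itself would come from the AI 3D-detection step, which the
// patent does not detail; this only shows the geometric part.
function backProject(u, v, z, { fx, fy, cx, cy }) {
  return {
    x: ((u - cx) * z) / fx,
    y: ((v - cy) * z) / fy,
    z: z,
  };
}

// A point cloud is then just the back-projection of every pixel that has
// a depth estimate.
function buildPointCloud(pixels, intrinsics) {
  return pixels.map(({ u, v, z }) => backProject(u, v, z, intrinsics));
}
```

For example, with intrinsics `{ fx: 100, fy: 100, cx: 50, cy: 50 }`, the pixel (150, 50) at depth 2 back-projects to the 3D point (2, 0, 2).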
Step 102, obtaining 3D elements set in the target virtual scene.
The 3D element may be any 3D element that the user wants to add. For example, candidate 3D elements may be provided to the user, and the user may select the 3D element to be set, such as a chair, by clicking or dragging.
Step 103, generating a second video stream based on the 3D element.
After the 3D element is set, the user may operate on it, for example, setting its size or adjusting its position and angle. Here, the operations performed by the user on the 3D element in the canvas are acquired, and based on these operations, the content in the canvas is captured as the second video stream. For example, the content in the canvas may be captured as the video stream through the captureStream() method of the Canvas element application programming interface (API).
After the user has added the 3D element, rendering of the original virtual scene may be paused while the 3D element is placed into the target virtual scene. After placement of the 3D element, an auxiliary wireframe of the model may be displayed on the operator interface for adjusting the size, position, angle, etc. at which the 3D element is presented in the target virtual scene. After the user confirms the adjustment to the 3D element, the rendering of the original virtual scene may continue. At the same time, the target virtual scene also begins rendering.
Since the target virtual scene is drawn in the canvas while the original virtual scene may be a video stream, the second video stream can be obtained by capturing the canvas content in real time as a video stream (excluding audio) using the captureStream() method of the Canvas element API in HTML (HyperText Markup Language). Because the canvas's ability to capture video streams is natively supported, performance is good and the quality of the resulting second video stream is also better.
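A minimal sketch of this capture step follows; it uses browser-only APIs (`document`, `HTMLCanvasElement.captureStream`), and the function name, element id parameter, and default frame rate are illustrative assumptions:

```javascript
// Hypothetical sketch: capture the canvas holding the target virtual scene
// as a video-only MediaStream. captureStream() is natively supported on
// HTMLCanvasElement; the resulting stream has pictures only, no audio.
function captureTargetSceneStream(canvasId, frameRate = 30) {
  const canvas = document.getElementById(canvasId);
  return canvas.captureStream(frameRate);
}
```

The returned MediaStream could then be attached to a WebRTC sender so that the server receives the second video stream.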
Meanwhile, in the above process, a first input of the user to the original virtual scene may also be received and synchronized into the target virtual scene. The first input is used to adjust the attribute information of the original virtual scene. The first input may include operations such as rotation, translation, and scaling of the original virtual scene; these operations can be simulated through three-dimensional matrix transformations and synchronously reproduced in the target virtual scene, so that the effects in the original virtual scene and the target virtual scene stay in sync.
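The three-dimensional matrix transformation mentioned above can be sketched as applying the same homogeneous 4x4 matrix to points of both scenes. The flat row-major array representation and the function names are illustrative assumptions:

```javascript
// Hypothetical sketch of keeping the two scenes in sync: the user's input on
// the original scene is expressed as a 4x4 homogeneous transform, and the
// same matrix is applied to points of the replica (target) scene.
function applyTransform(m, p) {
  // m is a 4x4 homogeneous transform as a flat row-major array of 16 numbers.
  const w = m[12] * p.x + m[13] * p.y + m[14] * p.z + m[15];
  return {
    x: (m[0] * p.x + m[1] * p.y + m[2] * p.z + m[3]) / w,
    y: (m[4] * p.x + m[5] * p.y + m[6] * p.z + m[7]) / w,
    z: (m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11]) / w,
  };
}

// Example: a pure translation by (1, 2, 3).
const translate = [
  1, 0, 0, 1,
  0, 1, 0, 2,
  0, 0, 1, 3,
  0, 0, 0, 1,
];
```

Applying `translate` to the origin yields the point (1, 2, 3); rotation and scaling inputs would be encoded as other 4x4 matrices and applied identically in both scenes.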
Step 104, the second video stream is sent to the server, and the target video stream is obtained by the server based on the first video stream of the original virtual scene and the second video stream.
In the embodiments of the present application, in the process of generating the target video stream, the target virtual scene can be generated based on the original virtual scene, and the 3D element can be set in the target virtual scene, so that the second video stream is generated according to the set 3D element, and the target video stream is obtained according to the first video stream of the original virtual scene and the second video stream. Because the 3D element can be set during generation of the target video stream, the problems caused by presetting 3D elements in the prior art can be avoided, and the flexibility of the video stream generation process can be improved. In addition, since the user can set 3D elements while the video stream is being generated, interactivity is improved.
Referring to fig. 2, fig. 2 is a flowchart of a video stream generating method according to an embodiment of the present application, as shown in fig. 2, including the following steps:
step 201, receiving a second video stream sent by a client, wherein the second video stream is generated by the client based on 3D elements set in a target virtual scene, and the target virtual scene is generated based on an original virtual scene in the first video stream.
Step 202, obtaining a target video stream according to the first video stream and the second video stream.
Specifically, the server may combine the first video stream and the second video stream to obtain the target video stream.
In this way, the server has two video streams: a first video stream containing the original virtual scene, and the second video stream. The server can merge the two video streams and push the result to other users, or push them separately. If merged and pushed, other users can view the new picture of the target virtual scene; if pushed separately, the user side can choose to watch either one video stream or both, giving greater flexibility.
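As a hedged illustration of what merging the two streams could mean per frame, the canvas-derived second stream (which carries transparency) can be alpha-composited over the first stream. The pixel representation and function name below are illustrative assumptions, not the patent's specified mechanism:

```javascript
// Hypothetical sketch of the server-side merge: composite each frame of the
// second video stream (canvas content, with alpha) over the corresponding
// frame of the first video stream. Frames are arrays of {r, g, b, a} pixels
// with a in [0, 1]; a real implementation would operate on decoded frames.
function compositeFrames(firstFrame, secondFrame) {
  return firstFrame.map((base, i) => {
    const top = secondFrame[i];
    const a = top.a;
    return {
      r: Math.round(top.r * a + base.r * (1 - a)),
      g: Math.round(top.g * a + base.g * (1 - a)),
      b: Math.round(top.b * a + base.b * (1 - a)),
      a: 1,
    };
  });
}
```

Pixels where the second stream is fully transparent (a = 0) show the original scene unchanged; opaque pixels (a = 1) show only the newly set 3D elements.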
Optionally, the server may further obtain feedback information of the user, which indicates the viewing preference of the user, for example, whether the user wants to view the first video stream, the second video stream, or the target video stream. Then, the server can push corresponding video streams to the user according to the feedback information of the user so as to meet the requirements of different users.
In the embodiments of the present application, in the process of generating the target video stream, the target virtual scene can be generated based on the original virtual scene, and the 3D element can be set in the target virtual scene, so that the second video stream is generated according to the set 3D element, and the target video stream is obtained according to the first video stream of the original virtual scene and the second video stream. Because the 3D element can be set during generation of the target video stream, the problems caused by presetting 3D elements in the prior art can be avoided, and the flexibility of the video stream generation process can be improved.
Referring to fig. 3, fig. 3 is a schematic diagram of a process in an embodiment of the present application. The client obtains a first video stream of an original virtual scene from a server. After the target virtual scene is generated, a second video stream is generated based on the operation of the 3D element added in the target virtual scene and pushed to the server. The server generates a target video stream from the first video stream and the second video stream.
In the above process, since the original virtual scene and the target virtual scene have the same attribute, the user operation and the interaction effect can be synchronized between the two, and meanwhile, the 3D element is supported to be set by the user, so that the interactivity is stronger.
Referring to fig. 4, fig. 4 is a block diagram of a video stream generating apparatus according to an embodiment of the present application, which is applied to a client. As shown in fig. 4, the video stream generating apparatus includes:
a first generation module 401, configured to generate a target virtual scene according to an original virtual scene, where attribute information of the target virtual scene is the same as attribute information of the original virtual scene; a first obtaining module 402, configured to obtain a three-dimensional 3D element set in the target virtual scene; a second generating module 403, configured to generate a second video stream based on the 3D element; a first sending module 404, configured to send the second video stream to a server, where the second video stream is used to obtain, by the server, a target video stream based on the first video stream and the second video stream of the original virtual scene.
Optionally, the first generating module 401 may include:
a first generation sub-module for generating a canvas based on the first video stream;
the first acquisition sub-module is used for acquiring attribute information of the original virtual scene;
and the second generation sub-module is used for generating the target virtual scene on the canvas through a three-dimensional detection technology according to the attribute information of the original virtual scene.
Optionally, the second generating module 403 includes:
the first acquisition submodule is used for acquiring the operation of a user on the 3D element in the canvas;
and the first generation sub-module is used for capturing the content in the canvas as the video stream based on the operation.
Optionally, the first generating sub-module is further configured to capture, based on the operation, the content in the Canvas as the video stream through a stream capturing method of a Canvas element API.
Optionally, the apparatus may further include:
the first receiving module is used for receiving a first input of a user to the original virtual scene, wherein the first input is used for adjusting attribute information of the original virtual scene;
and the first processing module is used for synchronizing the first input into the target virtual scene.
The device provided in the embodiment of the present application may execute the above method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.
Referring to fig. 5, fig. 5 is a block diagram of a video stream generating apparatus according to an embodiment of the present application, which is applied to a server. As shown in fig. 5, the video stream generating apparatus includes:
a first receiving module 501, configured to receive a second video stream sent by a client, where the second video stream is generated by the client based on a 3D element set in a target virtual scene, and the target virtual scene is generated based on an original virtual scene in the first video stream;
a first generating module 502, configured to obtain a target video stream according to the first video stream and the second video stream.
Optionally, the first generating module is further configured to combine the first video stream and the second video stream to obtain the target video stream.
The device provided in the embodiment of the present application may execute the above method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.
It should be noted that, in the embodiment of the present application, the division of the units is schematic, which is merely a logic function division, and other division manners may be implemented in actual practice. In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution, in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The embodiment of the application provides a communication device, which comprises: a memory, a processor, and a program stored on the memory and executable on the processor; the processor is configured to read the program in the memory to implement the steps in the video stream generating method as described above.
The embodiment of the present application further provides a readable storage medium, where a program is stored, where the program, when executed by a processor, implements each process of the embodiment of the video stream generating method, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here. The readable storage medium may be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic memories (e.g., floppy disks, hard disks, magnetic tapes, magneto-optical disks (MO), etc.), optical memories (e.g., CD, DVD, BD, HVD, etc.), semiconductor memories (e.g., ROM, EPROM, EEPROM, nonvolatile memories (NAND FLASH), solid State Disks (SSD)), etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. In light of such understanding, the technical solutions of the present application may be embodied essentially or in part in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and including instructions for causing a terminal (which may be a cell phone, computer, server, air conditioner, or network device, etc.) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those of ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are also within the protection of the present application.

Claims (10)

1. A video stream generating method, comprising:
generating a target virtual scene according to an original virtual scene, wherein attribute information of the target virtual scene is the same as that of the original virtual scene;
acquiring a three-dimensional 3D element set in the target virtual scene;
generating a second video stream based on the 3D element;
and sending the second video stream to a server, wherein the second video stream is used by the server to obtain a target video stream based on a first video stream of the original virtual scene and the second video stream.
2. The method of claim 1, wherein the generating a target virtual scene from the original virtual scene comprises:
generating a canvas based on the first video stream;
acquiring attribute information of the original virtual scene;
and generating the target virtual scene on the canvas through a three-dimensional detection technology according to the attribute information of the original virtual scene.
3. The method of claim 2, wherein the generating a second video stream based on the 3D element comprises:
acquiring operations of a user on the 3D element in the canvas;
based on the operation, content in the canvas is captured as the video stream.
4. The method of claim 3, wherein the capturing content in the canvas as the video stream based on the operation comprises:
and capturing the content in the Canvas as the video stream based on the operation by a stream capturing method of a Canvas element application program interface API.
5. The method according to claim 1, wherein the method further comprises:
receiving a first input of a user to the original virtual scene, wherein the first input is used for adjusting attribute information of the original virtual scene;
synchronizing the first input into the target virtual scene.
6. A video stream generating method, comprising:
receiving a second video stream sent by a client, wherein the second video stream is generated by the client based on 3D elements set in a target virtual scene, and the target virtual scene is generated based on an original virtual scene in the first video stream;
and obtaining a target video stream according to the first video stream and the second video stream.
7. A video stream generating apparatus, comprising:
the first generation module is used for generating a target virtual scene according to an original virtual scene, wherein the attribute information of the target virtual scene is the same as the attribute information of the original virtual scene;
the first acquisition module is used for acquiring three-dimensional 3D elements arranged in the target virtual scene;
a second generation module for generating a second video stream based on the 3D element;
and the first sending module is used for sending the second video stream to a server, wherein the target video stream is obtained by the server based on a first video stream of the original virtual scene and the second video stream.
8. A video stream generating apparatus, comprising:
the first receiving module is used for receiving a second video stream sent by a client, wherein the second video stream is generated by the client based on 3D elements arranged in a target virtual scene, and the target virtual scene is generated based on an original virtual scene in the first video stream;
the first generation module is used for obtaining a target video stream according to the first video stream and the second video stream.
9. An electronic device, comprising: a memory, a processor, and a program stored on the memory and executable on the processor; characterized in that the processor is configured to read the program in the memory to implement the steps in the video stream generating method according to any one of claims 1 to 6.
10. A readable storage medium storing a program, wherein the program when executed by a processor implements the steps in the video stream generating method according to any one of claims 1 to 6.
CN202311835499.3A 2023-12-28 2023-12-28 Video stream generation method, device, equipment and readable storage medium Pending CN117793481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311835499.3A CN117793481A (en) 2023-12-28 2023-12-28 Video stream generation method, device, equipment and readable storage medium


Publications (1)

Publication Number Publication Date
CN117793481A true CN117793481A (en) 2024-03-29

Family

ID=90401570

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311835499.3A Pending CN117793481A (en) 2023-12-28 2023-12-28 Video stream generation method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN117793481A (en)

Similar Documents

Publication Publication Date Title
US11450071B2 (en) Adapting acoustic rendering to image-based object
CN108616731B (en) Real-time generation method for 360-degree VR panoramic image and video
US11750786B2 (en) Providing apparatus, providing method and computer readable storage medium for performing processing relating to a virtual viewpoint image
WO2020103218A1 (en) Live stream processing method in webrtc and stream pushing client
US20130321593A1 (en) View frustum culling for free viewpoint video (fvv)
CN107040794A (en) Video broadcasting method, server, virtual reality device and panoramic virtual reality play system
US11037321B2 (en) Determining size of virtual object
CN108877848B (en) Method and device for responding to user operation in virtual three-dimensional room speaking mode
US11582506B2 (en) Video processing method and apparatus, and storage medium
EP3776480A1 (en) Method and apparatus for generating augmented reality images
CN110730340B (en) Virtual audience display method, system and storage medium based on lens transformation
CN114863014A (en) Fusion display method and device for three-dimensional model
US20240056549A1 (en) Method, computer device, and computer program for providing high-quality image of region of interest by using single stream
KR101752691B1 (en) Apparatus and method for providing virtual 3d contents animation where view selection is possible
CN117793481A (en) Video stream generation method, device, equipment and readable storage medium
CN116962745A (en) Mixed drawing method, device and live broadcast system of video image
JP2014071870A (en) Virtual viewpoint image composition device, virtual viewpoint image composition method, and virtual viewpoint image composition program
JP2014050068A (en) Video view history analysis method, video view history analysis apparatus and video view history analysis program
JP6623905B2 (en) Server device, information processing method and program
JP7395725B2 (en) Media resource playback and text rendering methods, devices, equipment and storage media
WO2023207516A1 (en) Live streaming video processing method and apparatus, electronic device, and storage medium
WO2023236815A1 (en) Three-dimensional model transmission method and apparatus, and storage medium and program product
US20210400255A1 (en) Image processing apparatus, image processing method, and program
JP5937871B2 (en) Stereoscopic image display device, stereoscopic image display method, and stereoscopic image display program
Seligmann: Client for Remote Rendered Virtual Reality

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination