CN116939233A - Live video processing method, apparatus, device, storage medium and computer program - Google Patents


Info

Publication number
CN116939233A
Authority
CN
China
Prior art keywords
video
live
data
original video
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210370093.1A
Other languages
Chinese (zh)
Inventor
刘平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202210370093.1A
Priority to PCT/CN2023/076420 (published as WO2023193524A1)
Publication of CN116939233A
Legal status: Pending


Classifications

    All classifications fall under H04N21/00 (Selective content distribution, e.g. interactive television or video on demand [VOD]), within H04N (Pictorial communication, e.g. television):

    • H04N21/2187 Live feed
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44012 Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Reformatting operations involving transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217 End-user interface for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4781 Games

Abstract

The embodiment of the application provides a live video processing method, apparatus, device, storage medium and computer program, applied at least to the technical fields of image processing and games. The method includes: displaying an operation mode button on a setting interface of a live application; in response to a selection operation on the operation mode button and a live start operation for a target live object, acquiring a live video of the target live object based on the original video of the target live object and the operation mode corresponding to the selection operation, wherein in this operation mode the live video has a specific picture display effect; and displaying the live video on a live interface of the live application. With the method and apparatus of the application, a live video having the same picture quality parameters as the original video of the target live object can be generated and output, which improves the viewing experience of live viewers.

Description

Live video processing method, apparatus, device, storage medium and computer program
Technical Field
The embodiment of the application relates to the technical field of Internet, in particular to a live video processing method, a live video processing device, live video processing equipment, a storage medium and a computer program.
Background
Currently, in e-sports live broadcasting of High Dynamic Range (HDR) games, there is no software that lets an ordinary anchor broadcast HDR directly, so a compatibility mode is generally required. Taking current open-source live video software (for example, obs-studio) as an example, a Look-Up Table (LUT) processing step is usually added: the HDR game picture is captured by the live software, and LUT processing then maps the HDR colors to Standard Dynamic Range (SDR) colors close to those of the HDR game picture, so that a live picture whose display effect stays within a certain color range, without particularly severe color distortion, is obtained.
However, in the related art, anchors mainly use obs-studio or software developed on the basis of obs-studio. Because such software does not support HDR live broadcasting and does not process the video through a LUT, the live picture becomes severely distorted once the anchor enables game HDR, so the picture quality on the anchor's game side differs greatly from that of the live picture, and live viewers cannot experience the effect of HDR live broadcasting, which degrades their viewing experience.
Disclosure of Invention
The embodiment of the application provides a live video processing method, a device, equipment, a storage medium and a computer program, which are at least applied to the technical fields of image processing and games, and can generate and output live video with the same picture quality parameters as the original video of a target live object, thereby improving the live watching experience of a live watching user.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a live video processing method, which comprises the following steps:
displaying an operation mode button on a setting interface of the live broadcast application;
responding to the selection operation of the operation mode button and the live broadcast starting operation of a target live broadcast object, and acquiring the live broadcast video of the target live broadcast object based on the original video of the target live broadcast object and the operation mode corresponding to the selection operation; the live video has a corresponding picture display effect in the running mode;
and displaying the live video on a live interface of the live application.
In some embodiments, the method further comprises: initializing a texture region at a preset storage position of the live broadcast application; wherein the texture region enables cross-process sharing; when an original video of the target live broadcast object is generated, a hook function is called to carry out hook processing on the original video; copying the original video processed by the hook to the shared texture.
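As an illustration of this embodiment, the following is a minimal C++/Direct3D 11 sketch of the shared-texture and hook steps, assuming that the original video is a D3D11 game frame and that the hook point is IDXGISwapChain::Present; the function and variable names (InitSharedTexture, Hooked_Present, g_sharedTex and so on), and the choice of which process creates the shareable texture, are illustrative assumptions rather than details taken from the patent text.

#include <d3d11.h>
#include <dxgi.h>

// Original Present pointer, saved when the hook is installed (hook installation not shown).
static HRESULT (STDMETHODCALLTYPE* Real_Present)(IDXGISwapChain*, UINT, UINT) = nullptr;

static ID3D11Texture2D* g_sharedTex    = nullptr;   // cross-process shareable texture region
static HANDLE           g_sharedHandle = nullptr;   // handle the other process can open

// Initialize the texture region once (the patent initializes it on the live-application side).
HRESULT InitSharedTexture(ID3D11Device* dev, UINT width, UINT height, DXGI_FORMAT fmt)
{
    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width            = width;
    desc.Height           = height;
    desc.MipLevels        = 1;
    desc.ArraySize        = 1;
    desc.Format           = fmt;            // e.g. DXGI_FORMAT_R10G10B10A2_UNORM for an HDR10 frame
    desc.SampleDesc.Count = 1;
    desc.Usage            = D3D11_USAGE_DEFAULT;
    desc.BindFlags        = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET;
    desc.MiscFlags        = D3D11_RESOURCE_MISC_SHARED;   // enables cross-process sharing

    HRESULT hr = dev->CreateTexture2D(&desc, nullptr, &g_sharedTex);
    if (FAILED(hr)) return hr;

    IDXGIResource* res = nullptr;
    hr = g_sharedTex->QueryInterface(__uuidof(IDXGIResource), (void**)&res);
    if (SUCCEEDED(hr)) { res->GetSharedHandle(&g_sharedHandle); res->Release(); }
    return hr;
}

// Hooked IDXGISwapChain::Present: copy the freshly generated frame into the shared texture.
// Assumes the shared texture has the same size and format as the game's back buffer.
HRESULT STDMETHODCALLTYPE Hooked_Present(IDXGISwapChain* swap, UINT sync, UINT flags)
{
    ID3D11Texture2D* backBuffer = nullptr;
    if (g_sharedTex &&
        SUCCEEDED(swap->GetBuffer(0, __uuidof(ID3D11Texture2D), (void**)&backBuffer)))
    {
        ID3D11Device*        dev = nullptr;
        ID3D11DeviceContext* ctx = nullptr;
        backBuffer->GetDevice(&dev);
        dev->GetImmediateContext(&ctx);
        ctx->CopyResource(g_sharedTex, backBuffer);  // "copy the hooked original video to the shared texture"
        ctx->Release(); dev->Release(); backBuffer->Release();
    }
    return Real_Present(swap, sync, flags);          // hand control back to the original Present
}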
In some embodiments, when generating the original video of the target live object, invoking a hook function to hook the original video includes: when an original video of the target live object is generated, a hook function is called to carry out hook processing on a designated message for generating the original video; modifying the specified message processed by the hook to obtain a modified specified message; and acquiring the original video of the target live broadcast object based on the modified specified message to obtain the original video after the hook processing.
In some embodiments, the method further comprises: when the original video of the target live object is generated, an open shared resource function is called to open a shared handle of the graphic infrastructure; acquiring a texture region capable of realizing cross-process sharing and a region identifier of the texture region through the sharing handle; and determining the video picture type of the generated original video based on the format of the region identifier.
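A minimal sketch of the live-application side of this embodiment, assuming D3D11/DXGI: the shared handle obtained from the capture side is opened with ID3D11Device::OpenSharedResource, and the texture description serves as the region identifier whose format decides the video picture type. Which formats count as HDR (R10G10B10A2_UNORM, R16G16B16A16_FLOAT) is an illustrative assumption, not a list given in the patent.

#include <d3d11.h>

// Returns true if the opened shared texture looks like an HDR frame, judged by its format.
// The handle value is assumed to be exchanged between processes by some IPC not shown here.
bool OpenSharedFrameAndDetectHDR(ID3D11Device* dev, HANDLE sharedHandle,
                                 ID3D11Texture2D** outTex)
{
    // "Open shared resource function": ID3D11Device::OpenSharedResource.
    HRESULT hr = dev->OpenSharedResource(sharedHandle, __uuidof(ID3D11Texture2D),
                                         (void**)outTex);
    if (FAILED(hr)) return false;

    // The texture description acts as the region identifier; its format indicates
    // the video picture type of the generated original video.
    D3D11_TEXTURE2D_DESC desc = {};
    (*outTex)->GetDesc(&desc);

    switch (desc.Format)
    {
    case DXGI_FORMAT_R10G10B10A2_UNORM:      // typical HDR10 back-buffer format
    case DXGI_FORMAT_R16G16B16A16_FLOAT:     // typical scRGB HDR back-buffer format
        return true;                         // treat as HDR picture type
    default:
        return false;                        // treat as SDR picture type
    }
}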
In some embodiments, the obtaining the live video of the target live object based on the original video of the target live object and the operation mode corresponding to the selection operation includes: when the video picture type of the original video is matched with the operation mode corresponding to the selection operation, rendering the original video onto a target canvas of the live broadcast application by adopting a preset color format and a preset color space to obtain a rendered video; and encoding the rendered video to obtain the live video of the target live object.
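The following sketch shows one way the "preset color format and preset color space" could be applied to the live application's target canvas, assuming a DXGI 1.4+ swap chain and an HDR10 (10-bit, BT.2020 + PQ) target; the concrete format and color-space values are illustrative choices, not values mandated by the patent.

#include <dxgi1_4.h>

// Configure the presentation color space of the canvas swap chain for HDR10 rendering.
HRESULT ConfigureHdrCanvas(IDXGISwapChain3* swapChain)
{
    // Preset color space: BT.2020 primaries with the PQ (SMPTE ST 2084) transfer function.
    DXGI_COLOR_SPACE_TYPE cs = DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020;

    UINT support = 0;
    HRESULT hr = swapChain->CheckColorSpaceSupport(cs, &support);
    if (FAILED(hr) ||
        !(support & DXGI_SWAP_CHAIN_COLOR_SPACE_SUPPORT_FLAG_PRESENT))
        return E_FAIL;                       // fall back to an SDR path (not shown)

    return swapChain->SetColorSpace1(cs);    // subsequent rendering/composition uses this space
}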
In some embodiments, the encoding the rendered video to obtain a live video of the target live object includes: performing format conversion processing on the rendered video (the video to be encoded) to obtain format-converted video data; and performing software encoding processing, or hardware encoding processing, on the format-converted video to obtain the live video of the target live object.
In some embodiments, the encoded video is RGB format data; the performing format conversion processing on the encoded video to obtain video data after format conversion, including: performing bit operation on the RGB format data to obtain RGB component data of each pixel point; determining RGB component data of each preset number of pixel points as a data group; performing matrix conversion on RGB component data in each data set to obtain YUV data corresponding to each pixel point; and determining the video data after format conversion based on the YUV data corresponding to each pixel point.
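A CPU reference sketch of the bit-operation and matrix-conversion steps for a single pixel, assuming the RGB format data is packed as DXGI R10G10B10A2 and the target is 10-bit BT.2020 limited-range YUV; the actual matrix, range and pixel grouping (for example, grouping pixels for 4:2:0 chroma subsampling) used by the patent are not specified, so the constants below are assumptions.

#include <cstdint>
#include <cmath>
#include <algorithm>

struct Yuv10 { uint16_t y, cb, cr; };

// Convert one packed R10G10B10A2 pixel to 10-bit limited-range BT.2020 YCbCr.
Yuv10 ConvertPixelR10G10B10A2ToYuv10(uint32_t pixel)
{
    // Bit operations: extract the 10-bit R, G and B components (DXGI packs R in the low bits).
    float r = ((pixel >>  0) & 0x3FF) / 1023.0f;
    float g = ((pixel >> 10) & 0x3FF) / 1023.0f;
    float b = ((pixel >> 20) & 0x3FF) / 1023.0f;

    // Matrix conversion with BT.2020 non-constant-luminance coefficients (assumed).
    const float Kr = 0.2627f, Kb = 0.0593f, Kg = 1.0f - Kr - Kb;
    float y  = Kr * r + Kg * g + Kb * b;
    float cb = (b - y) / (2.0f * (1.0f - Kb));
    float cr = (r - y) / (2.0f * (1.0f - Kr));

    // Quantize to 10-bit limited range (Y: 64..940, Cb/Cr: 64..960).
    auto quant = [](float v, float scale, float offset, float lo, float hi) {
        float q = std::round(v * scale + offset);
        return (uint16_t)std::min(std::max(q, lo), hi);
    };
    Yuv10 out;
    out.y  = quant(y,  876.0f,  64.0f, 64.0f, 940.0f);
    out.cb = quant(cb, 896.0f, 512.0f, 64.0f, 960.0f);
    out.cr = quant(cr, 896.0f, 512.0f, 64.0f, 960.0f);
    return out;
}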
In some embodiments, the encoded video is RGB format data; the performing format conversion processing on the encoded video to obtain video data after format conversion, including: acquiring format textures of the RGB format data; performing linear conversion on the format texture to obtain RGB data after format conversion; sequentially performing color matrix conversion and reordering processing on the RGB data subjected to format conversion to obtain YUV data with preset bits; and determining the video data after format conversion based on the YUV data with the preset bit.
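In the GPU scheme the color-matrix conversion and reordering are performed by shaders; the sketch below is only a CPU-side model of the assumed output layout of the "YUV data with preset bits" step, taking P010 (10-bit 4:2:0, each sample stored in the high bits of a 16-bit word, a Y plane followed by an interleaved CbCr plane at half resolution) as the assumed target. P010 is a common input layout for 10-bit hardware encoders but is not named by the patent.

#include <cstdint>
#include <vector>

struct P010Frame {
    uint32_t width, height;
    std::vector<uint16_t> yPlane;    // width * height luma samples
    std::vector<uint16_t> uvPlane;   // (width/2) * (height/2) interleaved Cb,Cr pairs

    explicit P010Frame(uint32_t w, uint32_t h)
        : width(w), height(h), yPlane(w * h), uvPlane((w / 2) * (h / 2) * 2) {}

    // y10/cb10/cr10 are 10-bit values already produced by the color-matrix step.
    void StoreLuma(uint32_t x, uint32_t row, uint16_t y10)
    {
        yPlane[row * width + x] = (uint16_t)(y10 << 6);        // 10 bits kept in the high bits
    }
    void StoreChroma(uint32_t cx, uint32_t cy, uint16_t cb10, uint16_t cr10)
    {
        size_t base = (size_t)(cy * (width / 2) + cx) * 2;     // one CbCr pair per 2x2 luma block
        uvPlane[base + 0] = (uint16_t)(cb10 << 6);
        uvPlane[base + 1] = (uint16_t)(cr10 << 6);
    }
};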
In some embodiments, before the format-converted video is subjected to hardware encoding processing, the method further comprises: creating a swap chain when the original video is rendered, and acquiring preset example metadata; traversing the video data corresponding to the original video with the swap chain as the initial detection point, and determining the data content in the video data that is identical to the example metadata; determining an offset address based on the identical data content; and determining the offset address as the HDR metadata of the original video. Correspondingly, the performing hardware encoding processing on the format-converted video to obtain the live video of the target live object comprises: performing hardware encoding processing on the format-converted video based on the HDR metadata of the original video to obtain the live video of the target live object.
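A highly simplified sketch of the offset-discovery idea described above: starting from the swap chain as the initial detection point (modeled here as a plain byte range), the video data is traversed for a region identical to the preset example metadata, and the matching offset is recorded so that the real HDR metadata can later be read from the same offset. The use of DXGI_HDR_METADATA_HDR10 as the metadata layout, the search range, and the safety of reading that memory are all assumptions not stated in the patent.

#include <dxgi1_5.h>   // DXGI_HDR_METADATA_HDR10
#include <cstdint>
#include <cstddef>
#include <cstring>

// Returns the byte offset of the example metadata relative to `base`, or -1 if not found.
ptrdiff_t FindHdrMetadataOffset(const uint8_t* base, size_t searchBytes,
                                const DXGI_HDR_METADATA_HDR10& example)
{
    const size_t patternSize = sizeof(DXGI_HDR_METADATA_HDR10);
    if (searchBytes < patternSize) return -1;

    for (size_t off = 0; off + patternSize <= searchBytes; ++off)
    {
        if (std::memcmp(base + off, &example, patternSize) == 0)
            return (ptrdiff_t)off;           // the offset address of the HDR metadata
    }
    return -1;
}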
In some embodiments, the method further comprises: acquiring HDR metadata of the original video from the video data of the original video; determining a key frame in the live video after the hardware encoding processing; the HDR metadata is added as supplemental enhancement information to frame data of the key frames.
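A hedged sketch of adding the HDR metadata to a key frame as supplemental enhancement information, assuming an HEVC Annex B bitstream and the mastering_display_colour_volume SEI message (prefix SEI NAL, payloadType 137). Emulation-prevention bytes are omitted for brevity and would be required in a real implementation; the patent does not name the exact SEI message or bitstream format used.

#include <cstdint>
#include <vector>

struct MasteringDisplay {
    uint16_t primaries_x[3], primaries_y[3];   // display primaries (commonly G, B, R order)
    uint16_t white_x, white_y;                 // white point, in increments of 0.00002
    uint32_t max_luminance, min_luminance;     // in increments of 0.0001 cd/m^2
};

static void PutU16(std::vector<uint8_t>& v, uint16_t x)
{ v.push_back((uint8_t)(x >> 8)); v.push_back((uint8_t)(x & 0xFF)); }
static void PutU32(std::vector<uint8_t>& v, uint32_t x)
{ PutU16(v, (uint16_t)(x >> 16)); PutU16(v, (uint16_t)(x & 0xFFFF)); }

std::vector<uint8_t> BuildMasteringDisplaySei(const MasteringDisplay& m)
{
    std::vector<uint8_t> nal = { 0x00, 0x00, 0x00, 0x01,   // Annex B start code
                                 0x4E, 0x01,               // NAL header: PREFIX_SEI_NUT (39)
                                 137,                      // payloadType: mastering_display_colour_volume
                                 24 };                     // payloadSize in bytes
    for (int c = 0; c < 3; ++c) { PutU16(nal, m.primaries_x[c]); PutU16(nal, m.primaries_y[c]); }
    PutU16(nal, m.white_x);  PutU16(nal, m.white_y);
    PutU32(nal, m.max_luminance);  PutU32(nal, m.min_luminance);
    nal.push_back(0x80);                                   // rbsp_stop_one_bit + alignment
    return nal;
}

// Prepend the SEI to a key frame's Annex B data before the stream is pushed.
void AttachSeiToKeyFrame(std::vector<uint8_t>& keyFrameAnnexB, const MasteringDisplay& m)
{
    std::vector<uint8_t> sei = BuildMasteringDisplaySei(m);
    keyFrameAnnexB.insert(keyFrameAnnexB.begin(), sei.begin(), sei.end());
}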
In some embodiments, the video picture types of the original video and the live video are both HDR types.
The embodiment of the application provides a live video processing device, which comprises:
the first display module is used for displaying an operation mode button on a setting interface of the live broadcast application;
the acquisition module is used for responding to the selection operation of the operation mode button and the live broadcast starting operation of the target live broadcast object, and acquiring the live broadcast video of the target live broadcast object based on the original video of the target live broadcast object and the operation mode corresponding to the selection operation; the live video has a corresponding picture display effect in the running mode;
and the second display module is used for displaying the live video on a live interface of the live application.
In some embodiments, the apparatus further comprises: the initialization module is used for initializing a texture area at a preset storage position of the live broadcast application; wherein the texture region enables cross-process sharing; the first function calling module is used for calling a hook function to hook the original video of the target live object when the original video is generated; and the copying module is used for copying the original video processed by the hook to the shared texture.
In some embodiments, the first function call module is further configured to: when an original video of the target live object is generated, a hook function is called to carry out hook processing on a designated message for generating the original video; modifying the specified message processed by the hook to obtain a modified specified message; and acquiring the original video of the target live broadcast object based on the modified specified message to obtain the original video after the hook processing.
In some embodiments, the apparatus further comprises: the second function calling module is used for calling an open shared resource function to open a shared handle of the graphic infrastructure when the original video of the target live object is generated; the information acquisition module is used for acquiring a texture region capable of realizing cross-process sharing and a region identifier of the texture region through the sharing handle; and the video picture type determining module is used for determining the video picture type of the generated original video based on the format of the region identification.
In some embodiments, the acquisition module is further configured to: when the video picture type of the original video is matched with the operation mode corresponding to the selection operation, rendering the original video onto a target canvas of the live broadcast application by adopting a preset color format and a preset color space to obtain a rendered video; and encoding the rendered video to obtain the live video of the target live object.
In some embodiments, the acquisition module is further configured to: performing format conversion processing on the coded video to obtain video data after format conversion; and performing software coding processing on the video after format conversion, or performing hardware coding processing on the video after format conversion to obtain the live video of the target live object.
In some embodiments, the encoded video is RGB format data; the acquisition module is further configured to: performing bit operation on the RGB format data to obtain RGB component data of each pixel point; determining RGB component data of each preset number of pixel points as a data group; performing matrix conversion on RGB component data in each data set to obtain YUV data corresponding to each pixel point; and determining the video data after format conversion based on the YUV data corresponding to each pixel point.
In some embodiments, the encoded video is RGB format data; the acquisition module is further configured to: acquiring format textures of the RGB format data; performing linear conversion on the format texture to obtain RGB data after format conversion; sequentially performing color matrix conversion and reordering processing on the RGB data subjected to format conversion to obtain YUV data with preset bits; and determining the video data after format conversion based on the YUV data with the preset bit.
In some embodiments, the apparatus further comprises: a swap chain creation module, configured to create a swap chain when the original video is rendered, before the format-converted video is subjected to hardware encoding processing, and to acquire preset example metadata; a traversing module, configured to traverse the video data corresponding to the original video with the swap chain as the initial detection point and determine the data content in the video data that is identical to the example metadata; an offset address determining module, configured to determine an offset address based on the identical data content; and a metadata determining module, configured to determine the offset address as the HDR metadata of the original video. Correspondingly, the acquisition module is further configured to: perform hardware encoding processing on the format-converted video based on the HDR metadata of the original video to obtain the live video of the target live object.
In some embodiments, the apparatus further comprises: a metadata acquisition module, configured to acquire HDR metadata of the original video from video data of the original video; the key frame determining module is used for determining key frames in the live video after the hardware encoding processing; and the information adding module is used for adding the HDR metadata serving as supplementary enhancement information to the frame data of the key frame.
In some embodiments, the video picture types of the original video and the live video are both HDR types.
The embodiment of the application provides live video processing equipment, which comprises the following components:
a memory for storing executable instructions; and the processor is used for realizing the live video processing method when executing the executable instructions stored in the memory.
Embodiments of the present application provide a computer program product or computer program comprising executable instructions stored in a computer readable storage medium; the processor of the live video processing device reads the executable instructions from the computer readable storage medium and executes the executable instructions to realize the live video processing method.
The embodiment of the application provides a computer readable storage medium, which stores executable instructions for realizing the live video processing method when a processor executes the executable instructions.
The embodiment of the application has the following beneficial effects: in the process of live-broadcasting a target live object through a live application, in response to a selection operation on the operation mode button and a live start operation for the target live object, the live video of the target live object is acquired based on the original video of the target live object and the operation mode corresponding to the selection operation, and is displayed on a live interface of the live application, wherein the live video has a corresponding picture display effect in this operation mode. A live video whose picture display effect matches the operation mode is thus generated, live picture distortion is avoided, and the viewing experience of live viewers is improved.
Drawings
FIG. 1 is a diagram of a comparison of an original HDR game frame and an OBS live frame in the related art;
FIG. 2 is a flow diagram of a related art implementation of HDR game live based on LUT processing;
FIG. 3 is a schematic diagram of an alternative architecture of a live video processing system provided by an embodiment of the present application;
fig. 4 is a schematic structural diagram of a live video processing device according to an embodiment of the present application;
fig. 5 is a schematic flow chart of an alternative method for processing live video according to an embodiment of the present application;
fig. 6 is a schematic flow chart of another alternative method for processing live video according to an embodiment of the present application;
fig. 7 is a schematic flow chart of still another alternative method for processing live video according to an embodiment of the present application;
fig. 8 is an interface diagram of a setting page of a live application provided by an embodiment of the present application;
FIG. 9 is a diagram of a selection interface for opening a game HDR provided by an embodiment of the present application;
FIG. 10 is an interface diagram for selecting a game to be played according to an embodiment of the present application;
FIG. 11 is an interface diagram for starting live provided by an embodiment of the present application;
FIG. 12 is a comparison diagram between HDR game live broadcasting according to an embodiment of the present application and HDR game live broadcasting by OBS;
FIG. 13 is a flow chart of a live game of an embodiment of the present application;
FIG. 14 is a schematic diagram of an implementation of hooking a message by a hooking mechanism provided by an embodiment of the present application;
FIG. 15 is a schematic diagram of a comparison of two color formats;
FIG. 16 is a schematic diagram showing a comparison of a PQ curve and a conventional gamma curve;
FIG. 17 is a schematic diagram of a comparison of BT.2020 and BT.709 gamuts;
FIG. 18 is a schematic diagram of a YUV format;
FIG. 19 is a schematic flow chart of a CPU conversion scheme provided by an embodiment of the present application;
FIG. 20 is a flowchart of a GPU conversion scheme according to an embodiment of the present application;
FIG. 21 is a graph of performance versus CPU conversion scheme and GPU conversion scheme;
FIG. 22 is a flow chart of a method for acquiring game HDR metadata provided by an embodiment of the present application;
FIG. 23 is an interface diagram of a game HDR metadata acquisition method provided by an embodiment of the present application;
fig. 24 is a flowchart of a method for supporting HDR data by HEVC hard coding according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application clearer, the present application will be further described in detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without inventive effort fall within the scope of protection of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which embodiments of this application belong. The terminology used in the embodiments of the application is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
Before explaining the scheme of the embodiment of the present application, the nouns related to the embodiment of the present application are explained first:
(1) Live broadcast: a technology in which data on the broadcasting side is captured by certain devices and, through a series of processes such as video encoding, compressed into a video stream that can be watched and transmitted, which is then output to the viewing user side. Typically, production and viewing are synchronized.
(2) Video on demand: prerecorded videos are placed on a website and played according to the different preferences of users. Compared with video on demand, live broadcasting places higher requirements on software and hardware.
(3) High Dynamic Range (HDR): compared with an ordinary image, HDR provides a larger color range and more image detail, improves the contrast between bright and dark parts of the image, and can largely restore the real environment and present extremely high picture quality. It uses a larger brightness range, a wider color gamut and a greater bit depth than a traditional Standard Dynamic Range (SDR) image, together with a transfer function different from conventional gamma correction and new encoding schemes. HDR is applied in fields such as photography, video and gaming.
(4) Standard Dynamic Range (SDR): sometimes also called LDR. Compared with HDR, SDR images have less comprehensive detail and a narrower color range. When an image is overexposed, information in its brighter portions is lost; likewise, when an image is underexposed, information in its darker portions may be lost.
(5) Tone mapping: compressing the brightness of an HDR picture into the SDR range while preserving the details, colors and so on of the original HDR image as much as possible. Tone mapping mainly comprises two aspects: luminance mapping and gamut mapping. Conversely, inverse tone mapping maps an SDR picture into an HDR picture (an illustrative operator is sketched after item (13) below).
(6) HDR game live broadcast: live-broadcasting an HDR game so that viewers can experience the HDR effect, which can greatly improve the viewer experience.
(7) obs-studio: the OBS is video live broadcast open source software, and provides functions of game picture grabbing, coding, pushing and the like for users.
(8) DirectX: is a multimedia programming interface and is widely used for Windows electronic game development.
(9) Hook function (Hook): before the system calls a function, the hook program captures the message; the hook function gains control first and can process (or change) the execution behavior of the hooked function.
(10) Live stream data: the video and audio collected by the anchor user are encoded to form a code stream suitable for being transmitted in the network, so that the code stream can be decoded and played by a receiving end in real time without waiting for receiving all data.
(11) The anchor: or as a hosting user, refers to a user performing and sharing the performance in a live service.
(12) Live broadcast room: an application on the live platform through which anchor users publish different live services.
(13) Live audience: audience for performances of anchor users in live services.
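For illustration of the tone-mapping concept in definition (5), the following is a classic global operator (extended Reinhard) that compresses HDR luminance into the SDR range; it is a textbook example only, not an operator described by this application, whose approach is to keep HDR end to end rather than tone-map.

#include <algorithm>

// L is scene luminance (linear); Lwhite is the luminance that maps to pure white.
float ReinhardExtended(float L, float Lwhite)
{
    float mapped = L * (1.0f + L / (Lwhite * Lwhite)) / (1.0f + L);
    return std::min(mapped, 1.0f);   // clamp into the [0, 1] SDR range
}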
Before explaining the live video processing method according to the embodiment of the present application, a method in the related art will be described first.
At present, most mainstream games support HDR. Because HDR gives a better picture display effect, game players obtain a better visual experience, which greatly improves the gaming experience. The HDR game picture of a game that supports HDR can provide a larger color range and more image detail, has higher contrast between bright and dark parts of the image, and presents extremely high picture quality. Compared with an HDR game picture, the non-HDR picture of a game that does not support HDR has a smaller color range, less image detail, lower brightness contrast and poorer image quality. In the video-on-demand field, offline HDR video on demand has gradually come into view, and every major video platform supports HDR playback; however, for offline HDR video, the platforms only support playback in dedicated software or on mobile devices, and do not support HDR playback in a browser. For offline HDR video playback, the mobile side uses the H.265 encoding format, which incurs high costs.
Currently, anchors mainly use OBS-studio or software developed on the basis of OBS-studio; when an anchor enables game HDR and broadcasts in the OBS mode, the OBS live picture is severely distorted. Fig. 1 is a comparison between an original HDR game picture and an OBS live picture in the related art, where the left image is the original HDR game picture and the right image is the OBS live picture; as shown in fig. 1, the OBS live picture has noticeably poorer image quality than the original HDR game picture.
In the related art, when a game is live-broadcast, for an HDR game there is no software that lets an ordinary anchor broadcast it directly, so a compatibility mode is generally required to optimize the live effect. Taking obs-studio as an example, a LUT processing step is usually added in the implementation. As shown in fig. 2, which is a flow chart of HDR game live broadcasting based on LUT processing in the related art, the HDR game picture 201 is first collected 202 by the live software, the collected HDR game picture 201 is added to the live software canvas 203, and LUT processing 204 then maps the HDR colors to SDR colors close to those of the HDR game picture, yielding the live picture 205; in this way a live picture whose display effect stays within a certain color range, without particularly severe color distortion, is obtained.
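For reference, the related-art LUT step sketched below maps each HDR color through a 3D look-up table to an approximate SDR color (nearest-neighbor lookup for brevity; real implementations such as obs-studio interpolate between LUT entries). The LUT size, ordering and contents are assumptions; the sketch only illustrates where the color distortion and the extra per-pixel cost discussed next are introduced.

#include <vector>

struct Rgb { float r, g, b; };

struct Lut3D {
    int size;                    // e.g. 33 for a 33x33x33 LUT
    std::vector<Rgb> table;      // size^3 entries, r-fastest ordering assumed

    // Map one (normalized) HDR color to its SDR replacement.
    Rgb Apply(const Rgb& in) const
    {
        auto idx = [&](float v) {
            int i = (int)(v * (size - 1) + 0.5f);
            return i < 0 ? 0 : (i >= size ? size - 1 : i);
        };
        int r = idx(in.r), g = idx(in.g), b = idx(in.b);
        return table[(b * size + g) * size + r];
    }
};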
However, in the related-art method, because HDR is not supported end to end and what is actually broadcast is an SDR picture, live viewers with HDR-capable devices cannot experience an HDR effect consistent with the anchor's; moreover, pictures processed by the LUT can deviate considerably from the original pictures, and adding the LUT increases the algorithmic complexity of the whole live video processing pipeline.
Based on the above problems in the related art, the embodiments of the present application provide a live video processing method that supports HDR end to end in rendering, preprocessing and encoding, instead of compromising with LUT processing, so that the anchor and the live audience can truly experience the benefits of HDR live broadcasting, such as excellent pictures, a larger color range, more image detail, higher contrast between bright and dark parts of the image, and extremely high picture quality. The embodiment of the application provides a solution for HDR game live broadcasting that covers at least the following: collection of HDR game picture content, rendering of HDR game pictures and composition of the pictures with other SDR content, and encoding and pushing of HDR game pictures; each of these is described in detail below.
In the live video processing method provided by the embodiment of the application, firstly, an operation mode button is displayed on a setting interface of a live application; then, responding to a selection operation for an operation mode button and a live broadcast starting operation for a target live broadcast object, and acquiring a live broadcast video of the target live broadcast object based on an original video of the target live broadcast object and an operation mode corresponding to the selection operation; finally, displaying the live video on a live interface of the live application; the live video has a corresponding picture display effect in the running mode. Therefore, the live video with the picture display effect matched with the running mode is generated and output, live picture distortion is avoided, and live watching experience of a live watching user is improved.
An exemplary application of the live video processing device according to the embodiment of the present application is described below, where the live video processing device provided by the embodiment of the present application may be implemented as a terminal or as a server. In one implementation manner, the live video processing device provided by the embodiment of the application may be implemented as any terminal with an image processing function, such as a notebook computer, a tablet computer, a desktop computer, a mobile device (for example, a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, a portable game device), an intelligent robot, an intelligent home appliance, and an intelligent vehicle-mounted device; in another implementation manner, the live video processing device provided by the embodiment of the present application may be implemented as a server, where the server may be an independent physical server, or may be a server cluster or a distributed system formed by multiple physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content distribution networks (CDN, content Delivery Network), and basic cloud computing services such as big data and artificial intelligence platforms. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiment of the present application. In the following, an exemplary application when the live video processing device is implemented as a terminal will be described.
Referring to fig. 3, fig. 3 is an optional architecture diagram of a live video processing system provided by an embodiment of the present application. In order to support any live application, and to live-broadcast a target live object through the live application in a specific operation mode so as to present the live video, at least the live application and an operation application for operating the target live object are installed on a terminal. The live video processing system 10 comprises at least a terminal 100, a network 200 and a server 300, where the server 300 is a server of the live application, and the terminal 100 may constitute the live video processing device of the embodiment of the present application. The terminal 100 is connected to the server 300 through the network 200, which may be a wide area network, a local area network, or a combination of the two. During the running of the live application, the client of the live application can acquire the original video generated by operating the target live object in the operation application. In some embodiments, there may be multiple terminals, one terminal constituting the anchor terminal and the other terminals constituting the audience terminals of the live audience (i.e., terminals 100-1 and 100-2 in FIG. 3).
During live broadcast, the terminal 100 (anchor terminal) displays an operation mode button on a setting interface of a live broadcast application; and receives a selection operation of a user (which may be a main cast) for an operation mode button and a live broadcast start operation for a target live broadcast object, and then, the terminal 100 acquires a live broadcast video of the target live broadcast object from the server 300 based on an original video of the target live broadcast object and an operation mode corresponding to the selection operation in response to the selection operation of the operation mode button and the live broadcast start operation for the target live broadcast object; and displaying the live video on a live interface of the live application of all terminals.
In some embodiments, the server 300 determines a live video of the target live object based on the original video of the target live object and an operation mode corresponding to the selection operation in response to the selection operation for the operation mode button and the live start operation for the target live object, and distributes the live video to the anchor terminal and all audience terminals.
The live video processing method provided by the embodiment of the application can also be realized based on a cloud platform and by a cloud technology, for example, the server 300 can be a cloud server, and the live video of the target live object is determined by the cloud server based on the original video of the target live object and an operation mode corresponding to the selection operation.
In some embodiments, a cloud memory may be further provided, and an original video of the target live object, a live video of the target live object, and the like may be stored in the cloud memory. In this way, the live video based on the target live object can be on demand later, i.e. other audiences can watch the live video again after the live is finished.
Here, cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and networks in a wide area network or a local area network to realize the computation, storage, processing and sharing of data. Cloud technology is the general term for the network technology, information technology, integration technology, management platform technology, application technology and the like applied on the basis of the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Background services of technical network systems require a large amount of computing and storage resources, such as video websites, picture websites and other portal websites. With the rapid development and application of the internet industry, each article may have its own identification mark in the future, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong system backing support, which can only be realized through cloud computing.
Fig. 4 is a schematic structural diagram of a live video processing device according to an embodiment of the present application, where the live video processing device shown in fig. 4 includes: at least one processor 310, a memory 350, at least one network interface 320, and a user interface 330. The various components in the live video processing device are coupled together by a bus system 340. It is understood that the bus system 340 is used to enable connected communications between these components. The bus system 340 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration the various buses are labeled as bus system 340 in fig. 4.
The processor 310 may be an integrated circuit chip with signal processing capabilities such as a general purpose processor, which may be a microprocessor or any conventional processor, or the like, a digital signal processor (DSP, digital Signal Processor), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
The user interface 330 includes one or more output devices 331 that enable presentation of media content, and one or more input devices 332.
Memory 350 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 350 optionally includes one or more storage devices physically located remote from processor 310. Memory 350 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a random access Memory (RAM, random Access Memory). The memory 350 described in embodiments of the present application is intended to comprise any suitable type of memory. In some embodiments, memory 350 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
The operating system 351 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
network communication module 352 for reaching other computing devices via one or more (wired or wireless) network interfaces 320, exemplary network interfaces 320 include: bluetooth, wireless compatibility authentication (WiFi), and universal serial bus (USB, universal Serial Bus), etc.;
An input processing module 353 for detecting one or more user inputs or interactions from one of the one or more input devices 332 and translating the detected inputs or interactions.
In some embodiments, the apparatus provided by the embodiments of the present application may be implemented in a software manner, fig. 4 shows a live video processing apparatus 354 stored in a memory 350, where the live video processing apparatus 354 may be a live video processing apparatus in a live video processing device, and may be software in the form of a program and a plug-in, and includes the following software modules: first display module 3541, acquisition module 3542, and second display module 3543, which are logical, and thus can be arbitrarily combined or further split depending on the functions implemented. The functions of the respective modules will be described hereinafter.
In other embodiments, the apparatus provided by the embodiments of the present application may be implemented in hardware, and by way of example, the apparatus provided by the embodiments of the present application may be a processor in the form of a hardware decoding processor that is programmed to perform the live video processing method provided by the embodiments of the present application, for example, the processor in the form of a hardware decoding processor may employ one or more application specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSP, programmable logic device (PLD, Programmable Logic Device), complex programmable logic device (CPLD, Complex Programmable Logic Device), field programmable gate array (FPGA, Field-Programmable Gate Array), or other electronic components.
The live video processing method provided by the embodiments of the present application may be executed by a live video processing device, where the live video processing device may be any terminal having an image processing function, a video display function, and a game function, or may also be a server, that is, the live video processing method of the embodiments of the present application may be executed by a terminal, or may also be executed by a server, or may also be executed by interaction between the terminal and the server.
Referring to fig. 5, fig. 5 is a schematic flowchart of an alternative method for processing live video according to an embodiment of the present application, and the steps shown in fig. 5 will be described below, where the method for processing live video in fig. 5 is described by taking a terminal as an execution body as an example.
Step S501, an operation mode button is displayed on a setting interface of the live application.
In the embodiment of the application, a live broadcast application is run on a terminal by a host, wherein the live broadcast application provides a setting interface for setting live broadcast parameters of the current live broadcast process. The device interface of the live broadcast application can be provided with an operation mode button, and the button is used for selecting whether to operate the live broadcast application according to a specific operation mode or not, so that live broadcast video can be output by adopting a video output effect corresponding to the specific operation mode in the live broadcast process.
Referring to fig. 8, which shows the setting interface of the live application, the specific operation mode is taken to be an HDR operation mode as an example; the HDR operation mode is displayed on the setting interface and indicates that the live video output by the current live process is an HDR video. When the anchor turns on the HDR switch corresponding to the HDR operation mode in the setting interface, the live video output by the current live process is an HDR video; correspondingly, when the live video of the current live process is generated, the live video processing method provided by the embodiment of the present application can be adopted to process the video to obtain the HDR video.
In step S502, in response to the selection operation for the operation mode button and the live broadcast start operation for the target live broadcast object, the live broadcast video of the target live broadcast object is acquired based on the original video of the target live broadcast object and the operation mode corresponding to the selection operation.
Here, the live start operation refers to clicking a start button in the live application to start the live broadcast. Referring to fig. 11, when the user has set the live parameters and clicks the start-live button, the live video of the target live object can be acquired and pushed, thereby realizing the live broadcast.
In the embodiment of the application, the target live object can be any live type, for example, any type of live type such as games, shopping, performances, lectures and the like, or any type of live object, for example, any type of live object such as game roles, sold goods, performance roles, figures, content lectures and the like.
In some embodiments, an operation application may also be run on the terminal, where the operation application is used to generate the original video of the target live object, that is, the user performs a series of operations on the operation application to generate the original video. For example, the operating application may be a game application, and the host plays at a client of the game application to effect running of the game application, thereby producing a game running picture that constitutes an original video of the target live object.
In other embodiments, the live application may have a video recording function, that is, the live application may call a video acquisition module on the terminal to perform video acquisition, and generate an original video. For example, a host may play a program during the live application running and capture the host's performance by a camera of the host terminal to generate the original video. Or when the target live object is a game, the host plays the game application by operating at the client of the game application, and generates a game running picture in the running process of the game application, and the live application can obtain a game video by recording the game running picture, wherein the game video forms an original video of the target live object.
When the user selects a specific operation mode through the operation mode button on the setting interface, selects a target live object, and clicks the start-live button, the live video of the target live object is acquired, where the live video is generated based on the original video of the target live object and the operation mode corresponding to the selection operation. In the embodiment of the present application, the generation of the live video of the target live object based on the original video and the operation mode corresponding to the selection operation may be performed by the server of the live application; that is, the background server of the live application, in response to the selection operation and the live start operation, generates the live video of the target live object based on the original video of the target live object and the operation mode corresponding to the selection operation, and sends the live video to the terminal.
In the embodiment of the application, in the operation mode corresponding to the selection operation, the live video has a corresponding picture display effect, wherein the picture display effect corresponds to a picture quality parameter, and the picture quality parameter is greater than a preset picture quality parameter threshold, or the picture quality parameter is greater than or equal to the picture quality parameter of the original video.
The original video already has the preset picture quality parameter when it is generated in the operation application, so when the live video is generated, it can be generated based on this picture quality parameter. This ensures that the picture quality parameter of the live video is at least the same as that of the original video, so that the finally presented video can achieve a display effect equal to, or consistent with, that of the video captured during the live broadcast.
Step S503, displaying the live video on a live interface of the live application.
In the embodiment of the application, after the live video is generated, the live video can be distributed to each audience terminal watching live broadcast, and the live video is displayed on the current interface of each audience terminal, so that the current live broadcast process of the anchor is promoted.
In some embodiments, the picture quality parameters of the live video may be the same as the picture quality parameters of the original video. Wherein the picture quality parameters include, but are not limited to, at least one of: color range, image shading contrast, color space, color format, color gamut, brightness range, etc.
According to the live video processing method provided by the embodiment of the application, in the live broadcast process of a target live broadcast object through a live broadcast application, the live video of the target live broadcast object is obtained and displayed on a live broadcast interface of the live broadcast application based on the original video of the target live broadcast object and the operation mode corresponding to the selection operation by responding to the selection operation of the operation mode button and the live broadcast start operation of the target live broadcast object, wherein the live video has a corresponding picture display effect in the operation mode, so that the live video with the picture display effect matched with the operation mode is generated, the live broadcast picture distortion is avoided, and the live broadcast watching experience of a live broadcast watching user is improved.
In some embodiments, a live video processing system is shown in fig. 3, and at least includes a terminal and a server, where the terminal includes a hosting terminal and a plurality of audience terminals, and fig. 6 is a schematic flowchart of another alternative live video processing method provided in an embodiment of the present application, where fig. 6 illustrates an audience terminal as an example. Live broadcast applications are run on both the anchor terminal and the audience terminal, or live broadcast applications are run on the anchor terminal, and the audience terminal enters a live broadcast room to watch live broadcast through an applet or other third party programs. And the anchor terminal also operates an operation application corresponding to the target live object, and the original video of the target live object is generated by operating the operation application and performing corresponding operation on a client of the operation application. For example, the operation application may be a game application by which a host may operate on a game character (i.e., a target live object) to generate a game video (i.e., an original video). As shown in fig. 6, the live video processing method according to the embodiment of the application includes the following steps:
step S601, an operation mode button is displayed on a setting interface of a live application operated on the anchor terminal.
In step S602, the anchor terminal receives a selection operation of the anchor for the operation mode button.
Step S603, the anchor terminal receives the live broadcast start operation of the anchor for the target live broadcast object.
In step S604, the anchor terminal receives a series of operation instructions input by the anchor at the client of the operation application, and generates an original video of the target live object based on the operation instructions.
In some embodiments, since the live video shown by the live application is generated in real time based on the acquired original video, the live application needs to have the right to acquire the original video generated by the operation application; that is, right acquisition reminding information may be sent to the host in advance when the live application and the operation application are started, so as to remind the host that the live application needs the right to acquire the original video generated by the operation application. When the host chooses to allow acquisition, the live application has the right to acquire the original video generated by the operation application, and acquires the original video when the operation application generates it.
In the embodiment of the present application, when the live broadcast application generates the original video, the acquisition of the original video may be achieved through the following steps S11 to S13 (not shown in the figure):
Step S11, initializing a texture region at a preset storage position of a live broadcast application; wherein, texture regions enable cross-process sharing.
Here, the texture region is a region for copying an original video picture, and the texture region can be shared across processes, that is, for video pictures acquired by different processes, all can be copied to the same texture region.
And step S12, when the original video of the target live broadcast object is generated, calling a hook function to hook the original video.
In some embodiments, step S12 may be implemented by the following steps S121 to S123 (not shown in the figures):
step S121, when generating the original video of the target live object, a hook function is called to perform a hook process on a specified message for generating the original video.
Here, the specified message may be a generation instruction for generating the original video in the operation application, based on which the original video of the target live object can be generated.
Step S122, modifying the specified message after the hooking process to obtain the modified specified message.
In the embodiment of the application, after the specified message after the hook processing is obtained, the control right of the specified message is obtained, that is, the live broadcast application can control the specified message to generate the original video, so that the specified message after the hook processing can be modified based on the original video generation requirement to add the current required functions of the user into the specified message.
Step S123, obtaining the original video of the target live object based on the modified specified message, and obtaining the original video after hooking.
And S13, copying the original video processed by the hook to the shared texture.
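For illustration only, a minimal sketch of steps S11 to S13 in a Direct3D 11 environment (matching the exemplary application described later) might look as follows; the names g_sharedTexture, g_originalPresent and HookedPresent are hypothetical, and this is a sketch under those assumptions rather than the exact implementation:
#include <d3d11.h>
#include <dxgi1_2.h>
// Minimal sketch: g_sharedTexture is assumed to have been created with the
// D3D11_RESOURCE_MISC_SHARED flag so that another process can open it.
ID3D11Texture2D* g_sharedTexture = nullptr;            // cross-process shared texture region
typedef HRESULT(WINAPI* PresentFn)(IDXGISwapChain*, UINT, UINT);
PresentFn g_originalPresent = nullptr;                  // original Present saved by the hook
HRESULT WINAPI HookedPresent(IDXGISwapChain* swapChain, UINT syncInterval, UINT flags)
{
    ID3D11Texture2D* backBuffer = nullptr;
    if (SUCCEEDED(swapChain->GetBuffer(0, __uuidof(ID3D11Texture2D), (void**)&backBuffer)))
    {
        ID3D11Device* device = nullptr;
        backBuffer->GetDevice(&device);
        ID3D11DeviceContext* context = nullptr;
        device->GetImmediateContext(&context);
        // copy the original video frame into the shared texture region (step S13)
        context->CopyResource(g_sharedTexture, backBuffer);
        context->Release();
        device->Release();
        backBuffer->Release();
    }
    // forward to the original Present so that the hooked application still renders normally
    return g_originalPresent(swapChain, syncInterval, flags);
}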
In step S605, the anchor terminal transmits the original video of the target live object and the operation mode corresponding to the selection operation to the server.
In step S606, the server generates the live video of the target live object based on the original video of the target live object and the operation mode corresponding to the selection operation.
Here, after generating the live video, the server transcodes and distributes the live video. Transcoding refers to generating live videos with multiple code rates, resolutions and dynamic ranges for different users to watch.
In the embodiment of the application, when the operation mode corresponding to the selection operation matches the video type of the original video of the target live object, if the video type of the original video is a high-quality video type (i.e., the picture quality parameter of the original video is greater than or equal to the preset picture quality parameter threshold), a live video with the same video type as the original video can be generated.
In some embodiments, if the video type of the original video is not a high quality video type (i.e., the picture quality parameter of the original video is smaller than the preset picture quality parameter threshold), the video type of the original video may be adjusted, i.e., the picture of the original video is repaired, so as to improve the picture quality parameter of the original video, and obtain a live video with a high quality video type.
In step S607, the server transmits the live video to all the viewer terminals.
Step S608, the audience terminal displays the live video on a live interface of a live application; the picture quality parameters of the live video are the same as those of the original video, or the live video is a video of a high-quality video type.
According to the live video processing method provided by the embodiment of the application, the live video of the target live object is generated based on the original video of the target live object and the operation mode corresponding to the selection operation, and the picture quality parameter of the live video is the same as that of the original video or the live video is a video of a high-quality video type, that is, the picture quality parameter of the generated live video is not smaller than that of the original video, so that the picture effect of the live video is improved.
Fig. 7 is a schematic flowchart of still another alternative method for processing live video according to an embodiment of the present application, as shown in fig. 7, where the method includes the following steps:
step S701, displaying an operation mode button on a setting interface of a live application operated on the anchor terminal.
In step S702, the anchor terminal receives a selection operation of the anchor for the operation mode button.
In step S703, the anchor terminal receives the live broadcast start operation of the anchor for the target live broadcast object.
In step S704, the anchor terminal receives a series of operation instructions input by the anchor at the client of the operation application, and generates an original video of the target live object based on the operation instructions.
Step S705, when generating the original video of the target live object, the anchor terminal calls an open shared resource function to open the shared handle of the graphic infrastructure.
Here, a graphics hook in the form of a dynamic link library (e.g., graphics-hook.dll) may be pre-built and injected into the operation application. Upon receipt of a notification of successful injection, the live application may call the open shared resource function (i.e., the open shared resource interface) to operate the shared handle of the operation application process on the graphics infrastructure, thereby obtaining the texture region formed by initialization at the preset storage location of the live application.
In step S706, the anchor terminal obtains, through the shared handle, a texture region and a region identifier of the texture region that can be shared across processes.
In the embodiment of the application, when each texture region is initialized, a region identifier is correspondingly generated, and the region identifier is used for distinguishing different texture regions.
In step S707, the anchor terminal determines the video picture type of the generated original video based on the format of the region identification.
Here, the format field of the region identifier may be acquired, and it is determined whether the format of the region identifier is a preset format; when the format of the region identifier is the preset format, it may be determined that the video picture type of the generated original video is a specific type. For example, if the acquired region identifier is in the DXGI_FORMAT_R10G10B10A2_UNORM format, it may be determined that the acquired original video is an HDR video.
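As a minimal illustrative sketch of steps S705 to S707 (assuming Direct3D 11, and assuming that device and sharedHandle were obtained from the injected hook; the helper name IsSharedTextureHDR is hypothetical), the live application may open the shared texture and inspect its format field to decide whether the captured picture is HDR:
#include <d3d11.h>
// Minimal sketch: open the cross-process shared texture and check its format field.
bool IsSharedTextureHDR(ID3D11Device* device, HANDLE sharedHandle)
{
    ID3D11Texture2D* texture = nullptr;
    HRESULT hr = device->OpenSharedResource(sharedHandle, __uuidof(ID3D11Texture2D), (void**)&texture);
    if (FAILED(hr))
        return false;
    D3D11_TEXTURE2D_DESC desc = {};
    texture->GetDesc(&desc);                       // read the texture description
    texture->Release();
    // the preset format of the region identifier indicates an HDR picture
    return desc.Format == DXGI_FORMAT_R10G10B10A2_UNORM;
}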
In step S708, the anchor terminal transmits the original video of the target live object, the video frame type of the original video, and the operation mode corresponding to the selection operation to the server.
In step S709, when the video frame type of the original video matches with the operation mode corresponding to the selection operation, the server adopts the preset color format and the preset color space to render the original video onto the target canvas of the live broadcast application, so as to obtain the rendered video.
In the embodiment of the present application, the video picture type of the original video may be an HDR type, i.e., the original video may be an HDR type video.
Here, the preset color format is a color format matching the HDR type, and the preset color space is a color space matching the HDR type. For example, the preset color format may be the DXGI_FORMAT_R10G10B10A2_UNORM color format, in which each of R, G and B occupies 10 bits and A occupies 2 bits; the preset color space may be the DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020 color space, under which the HDR wide color gamut of BT.2020 is employed and the PQ curve of HDR is followed.
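For illustration, one way such a preset color format and color space could be applied in a Direct3D environment is sketched below; this is a hedged sketch only, the swapChain3 interface and the helper name ConfigureHdr10ColorSpace are assumptions and not part of the method itself, and it presumes the back buffers were created in DXGI_FORMAT_R10G10B10A2_UNORM:
#include <dxgi1_4.h>
// Minimal sketch: switch a DXGI swap chain to the BT.2020 / PQ (HDR10) color space.
HRESULT ConfigureHdr10ColorSpace(IDXGISwapChain3* swapChain3)
{
    const DXGI_COLOR_SPACE_TYPE hdr10 = DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020;
    UINT support = 0;
    // check that the swap chain supports the HDR10 color space before switching
    if (SUCCEEDED(swapChain3->CheckColorSpaceSupport(hdr10, &support)) &&
        (support & DXGI_SWAP_CHAIN_COLOR_SPACE_SUPPORT_FLAG_PRESENT))
    {
        return swapChain3->SetColorSpace1(hdr10);
    }
    return E_FAIL;
}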
In step S710, the server performs encoding processing on the rendered video to obtain a live video of the target live object.
In some embodiments, the encoding process of the rendered video may include a software encoding process and a hardware encoding process, that is, step S710 may be implemented by any one of the following two ways:
mode one: performing format conversion processing on the coded video to obtain video data after format conversion; and performing software coding processing on the video after format conversion to obtain the live video of the target live object.
In some embodiments, the encoded video may be RGB format data; in a possible implementation manner, the format conversion process is performed on the encoded video, so as to obtain video data after format conversion, which may be implemented through the following steps S21 to S24:
step S21, performing bit operation on the RGB format data to obtain RGB component data of each pixel point.
In some embodiments, the RGB format data is stored in a graphics processor (GPU, Graphics Processing Unit), and the format conversion of the encoded video may be completed by a central processing unit (CPU, Central Processing Unit), so the RGB format data may be copied from the GPU to the CPU prior to the bit operation on the RGB format data (a sketch of this copy is given after step S24 below). Then, the CPU performs the bit operation on the RGB format data to obtain the RGB component data of each pixel point and completes the subsequent steps S22 to S24, that is, the CPU completes the format conversion processing of the encoded video.
Step S22, determining RGB component data for each preset number of pixels as one data set.
In step S23, matrix conversion is performed on the RGB component data in each data set to obtain YUV data corresponding to each pixel.
Step S24, determining the video data after format conversion based on the YUV data corresponding to each pixel point.
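Before the bit operations of steps S21 to S24 can run, the RGB data must be copied from the GPU to the CPU as mentioned above. A minimal sketch of such a copy in Direct3D 11 is given below; the helper name CopyRgbTextureToCpu and the calling convention are illustrative assumptions, not the exact implementation:
#include <d3d11.h>
// Minimal sketch: copy the GPU-side RGB texture into a CPU-readable staging texture and map it.
HRESULT CopyRgbTextureToCpu(ID3D11Device* device, ID3D11DeviceContext* context,
                            ID3D11Texture2D* gpuTexture, D3D11_MAPPED_SUBRESOURCE* mapped,
                            ID3D11Texture2D** stagingOut)
{
    D3D11_TEXTURE2D_DESC desc = {};
    gpuTexture->GetDesc(&desc);
    desc.Usage = D3D11_USAGE_STAGING;              // CPU-accessible copy
    desc.BindFlags = 0;
    desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
    desc.MiscFlags = 0;
    ID3D11Texture2D* staging = nullptr;
    HRESULT hr = device->CreateTexture2D(&desc, nullptr, &staging);
    if (FAILED(hr))
        return hr;
    context->CopyResource(staging, gpuTexture);    // GPU to staging copy
    hr = context->Map(staging, 0, D3D11_MAP_READ, 0, mapped);  // expose the pixels to the CPU
    *stagingOut = staging;                         // caller Unmaps and Releases when done
    return hr;
}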
In another possible implementation manner, the format conversion process is performed on the encoded video, so as to obtain video data after format conversion, which may be implemented through the following steps S25 to S27:
step S25, format textures of the RGB format data are obtained, and linear conversion is carried out on the format textures to obtain the RGB data after format conversion.
Step S26, sequentially performing color matrix conversion and reordering on the RGB data after format conversion to obtain YUV data with preset bit.
Step S27, determining the video data after format conversion based on the YUV data having the preset bit.
In the embodiment of the application, the format conversion processing of the coded video can be completed through the GPU, that is, the format texture is linearly converted through the GPU to obtain the RGB data after format conversion, and the RGB data after format conversion is sequentially subjected to color matrix conversion and reordering processing.
Mode two: performing format conversion processing on the coded video to obtain video data after format conversion; and carrying out hardware coding processing on the video after format conversion to obtain the live video of the target live object.
In the embodiment of the application, the video picture type of the live video can be an HDR type, namely, the live video can be an HDR type video.
In some embodiments, because the hardware encoding process lacks HDR metadata, or the HDR metadata is only native display data rather than the HDR key metadata actually used by the game, the effects seen by the spectators will be inaccurate; therefore, the embodiments of the present application further provide two ways of obtaining HDR metadata:
mode one of acquiring HDR metadata: before carrying out hardware coding processing on the video after format conversion, creating a switching chain (swapchain) when the original video is rendered, and acquiring preset example metadata; traversing video data corresponding to the original video by taking a switching chain as an initial detection point, and determining the data content which is the same as the example metadata in the video data; determining an offset address based on the same data content; the offset address is determined as HDR metadata of the original video.
In an embodiment of the application, the switch chain is a series of virtual frame buffers used by the graphics card and the graphics application programming interface (API, Application Programming Interface) to stabilize the frame rate and for other functions. The switch chain typically resides in graphics memory, but may also reside in system memory. The lack of a switch chain may result in rendering stutter; the existence and use of a switch chain is required by many graphics APIs, and a switch chain with two buffers is double-buffered.
After the HDR metadata is obtained, the hardware encoding processing may be performed on the format-converted video based on the HDR metadata of the original video, so as to obtain the live video of the target live object.
The second way to obtain HDR metadata is: acquiring the HDR metadata of the original video from the video data of the original video; determining a key frame in the live video after the hardware encoding processing; and adding the HDR metadata as supplemental enhancement information to the frame data of the key frame.
In step S711, the server transmits the live video to all the viewer terminals.
Step S712, the audience terminal displays live video on a live interface of a live application; the picture quality parameters of the live video are the same as those of the original video.
According to the live video processing method provided by the embodiment of the application, different format conversion modes are provided for carrying out format conversion processing on the coded video, so that more optional operation modes are provided for a user, and the user can select an appropriate format conversion mode to carry out format conversion processing on the coded video based on the performances of the GPU and the CPU which are currently used; in addition, the method can respectively adopt a software coding processing mode or a hardware coding processing mode to code the rendered video, can provide more optional operation modes for users, improves the processing effect of video coding, and enables the live video processing equipment to efficiently generate the live video of the target live object, thereby improving the live effect.
In the following, an exemplary application of the embodiment of the present application in a practical application scenario will be described. The embodiment of the application provides a live video processing method which can be applied to HDR game live broadcast.
In the live video processing method of the embodiment of the present application, an HDR switch option is added to a setting page of a live application, providing the user with the ability to start HDR live broadcast. As shown in fig. 8, which is an interface diagram of the setting page of the live application provided in the embodiment of the present application, an HDR switch option 801 is provided on the setting page 80; when a host is live-broadcasting a game, if the game is an HDR game, the HDR switch option 801 may be checked, so as to implement HDR game live broadcast. Fig. 9 is a diagram of a selection interface for starting up the game HDR provided in the embodiment of the present application; as shown in fig. 9, after the host clicks the HDR switch option 801 on the setting page, a function option 901 for choosing to start up the game HDR may also be provided. Then, as shown in fig. 10, the host clicks the game process option 101 to obtain the selectable game processes 102, and then selects the game 103 to be live-broadcast. Then, as shown in fig. 11, clicking the start live button 111 starts the HDR game live broadcast.
Fig. 12 is a diagram of a contrast between an HDR game live broadcast picture and an OBS live broadcast HDR game picture according to an embodiment of the present application, where, as shown in fig. 12, the upper path is a distortion effect of the OBS live broadcast HDR game, and the lower path is a path processed by the solution according to the embodiment of the present application, where, the HDR game live broadcast is normal, and a viewer obtains an experience consistent with the host.
In some embodiments, the game live broadcast generally includes the steps shown in FIG. 13: S131, collecting game pictures; S132, rendering the game picture; S133, game picture video preprocessing; S134, video encoding of the game pictures and generating the live video; S135, live video stream pushing. The embodiment of the application supports HDR game live broadcast throughout all of these stages.
In the embodiment of the present application, in step S131, the HDR game screen acquisition is divided into two parts, which are respectively required to be operated in the game process and the live software process.
In a game process, the game process behavior includes: a hooking function (for example, a graphics-hook.dll function) is written first; the hooking function can hook (i.e., hook off) the current frame presentation function (Present function) of the system, which is a necessary step in the process of Direct3D (D3D) rendering to the screen, and the game process calls the Present function at certain time intervals. Fig. 14 is a schematic diagram of different implementation manners of hooking a message by using a hooking mechanism provided in an embodiment of the present application; the hooking mechanism (hook) allows a live program to intercept and process a Windows message or a specified event, and when the specified message is sent out, the hooking program 141 captures the message before the message reaches the target window, so as to obtain control rights for the message, and further process or modify the message and add a required function.
In the embodiment of the application, a block of texture (i.e., a texture region) which can be shared across processes can be initialized first; the actual operation is to create the texture (e.g., via CreateTexture2D), add the resource miscellaneous sharing flag (e.g., D3D11_RESOURCE_MISC_SHARED) (i.e., the region identifier corresponding to the texture region), and call the get shared handle function (e.g., GetSharedHandle) of the graphics infrastructure (DXGI, Microsoft DirectX Graphics Infrastructure) to obtain the shared handle of the texture.
Here, the hook processes the Present function of the system, so that the game screen can be copied onto the above texture before being displayed on the screen.
In the live software process, a hook function may be injected into the game process, and when an injection success notification is received, the shared handle is opened through an open shared resource interface (for example, OpenSharedResource) of the ID3D11Device, which can obtain an ID3D11Texture2D texture object. The texture description (D3D11_TEXTURE2D_DESC) can be acquired through the texture description acquisition function (GetDesc) of the ID3D11Texture2D, and then the Format field of the texture description can be read; if the format is the DXGI_FORMAT_R10G10B10A2_UNORM format, the acquired game picture is recognized as an HDR game picture.
In the process of rendering the game screen in step S132, after the HDR game screen is acquired, the game screen needs to be rendered onto the canvas of the live software. Unlike traditional SDR live broadcasting, which uses the DXGI_FORMAT_R8G8B8A8_UNORM color format, the DXGI_FORMAT_R10G10B10A2_UNORM color format (i.e., the preset color format) may be used here. FIG. 15 is a schematic diagram showing a comparison of the two color formats; as shown in FIG. 15, each of R, G, B and A in R8G8B8A8 occupies 8 bits, while in R10G10B10A2 each of R, G and B occupies 10 bits and A occupies 2 bits.
Then, a color space is set. The DXGI_COLOR_SPACE_RGB_FULL_G22_NONE_P709 color space used by SDR live broadcasting has the 709 color gamut and a gamma curve (Gamma curve). In the method of the embodiment of the present application, the DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020 color space (i.e., the preset color space) is used, which adopts the HDR wide color gamut of BT.2020 and follows the PQ curve of HDR.
Fig. 16 is a comparative schematic diagram of a PQ curve and a conventional gamma curve; as shown in fig. 16, the PQ curve can express a wider range of brightness. Fig. 17 is a schematic diagram of the BT.2020 color gamut versus the BT.709 color gamut; as shown in fig. 17, the BT.2020 color gamut is capable of expressing more colors.
The game picture video preprocessing in step S133 may be HDR encoding preprocessing. The HDR game picture has been rendered onto the game HDR canvas through step S132, while HDR encoding requires the YUV420P10LE format or the YUV P010 format; as shown in fig. 18, the YUV data (i.e., the YUV420P10LE or P010 format) is 10-bit (X in the figure represents the padding data, which is 0), so the R10G10B10A2 format used by the canvas needs to undergo format conversion for this purpose. Two schemes are proposed here to achieve the conversion from 10-bit RGB to 10-bit YUV.
Scheme one: CPU conversion scheme.
FIG. 19 is a flowchart of a CPU conversion scheme according to an embodiment of the present application. As shown in FIG. 19, first, in step S191, the 10-bit RGB data (R10G10B10A2) is copied from the GPU to the CPU; then, in step S192, since YUV data is sampled from RGB data, the r, g, b and a component data of each point can be obtained from the 10-bit RGB data by the following bit operations:
int r = ar30 & 0x3FF;          // low 10 bits: red component
int g = (ar30 >> 10) & 0x3FF;  // next 10 bits: green component
int b = (ar30 >> 20) & 0x3FF;  // next 10 bits: blue component
int a = (ar30 >> 30) * 0x55;   // top 2 bits: alpha, expanded from 2 bits (0-3) to 8 bits (0, 85, 170, 255)
Then, for every 4 points, 4 Y values, 1 U value and 1 V value can be obtained through the following conversion matrix:
// 4x5 conversion matrix for limited-range 10-bit BT.2020 RGB to YUV; the rows correspond to
// the Y, U, V and A outputs, and the columns to the R, G, B and A inputs plus a constant offset.
const float BT2020_10bit_limited_rgb_to_yuv[] = {
     0.224951f,  0.580575f,  0.050779f, 0.000000f, 0.062561f,
    -0.122296f, -0.315632f,  0.437928f, 0.000000f, 0.500489f,
     0.437928f, -0.402706f, -0.035222f, 0.000000f, 0.500489f,
     0.000000f,  0.000000f,  0.000000f, 1.000000f, 0.000000f,
};
in the embodiment of the application, when the U, V component is calculated, the average value of four points can be taken. Finally, 10bit YUV data is obtained, and in step S193, the 10bit YUV data is fed into the encoder.
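Putting steps S191 to S193 together, a simplified sketch of the CPU conversion for one 2x2 block of pixels might look as follows; the helper name ConvertBlockToYuv, the array kRgbToYuv (a copy of the BT2020_10bit_limited_rgb_to_yuv matrix above) and the buffer layout are illustrative assumptions, not the exact implementation:
#include <cstdint>
// Copy of the limited-range BT.2020 10-bit RGB-to-YUV matrix given above.
static const float kRgbToYuv[20] = {
     0.224951f,  0.580575f,  0.050779f, 0.000000f, 0.062561f,
    -0.122296f, -0.315632f,  0.437928f, 0.000000f, 0.500489f,
     0.437928f, -0.402706f, -0.035222f, 0.000000f, 0.500489f,
     0.000000f,  0.000000f,  0.000000f, 1.000000f, 0.000000f,
};
// Minimal sketch: convert a 2x2 block of packed R10G10B10A2 pixels (red in the low bits)
// into 4 Y samples and one averaged U and V sample (10-bit YUV 4:2:0).
void ConvertBlockToYuv(const uint32_t rgb[4], uint16_t y[4], uint16_t* u, uint16_t* v)
{
    float uSum = 0.0f, vSum = 0.0f;
    for (int i = 0; i < 4; ++i)
    {
        const uint32_t ar30 = rgb[i];
        const float r = float(ar30 & 0x3FF) / 1023.0f;          // normalize 10-bit components
        const float g = float((ar30 >> 10) & 0x3FF) / 1023.0f;
        const float b = float((ar30 >> 20) & 0x3FF) / 1023.0f;
        const float yf = kRgbToYuv[0] * r + kRgbToYuv[1] * g + kRgbToYuv[2] * b + kRgbToYuv[4];
        const float uf = kRgbToYuv[5] * r + kRgbToYuv[6] * g + kRgbToYuv[7] * b + kRgbToYuv[9];
        const float vf = kRgbToYuv[10] * r + kRgbToYuv[11] * g + kRgbToYuv[12] * b + kRgbToYuv[14];
        y[i] = uint16_t(yf * 1023.0f + 0.5f);                    // back to a 10-bit integer
        uSum += uf;
        vSum += vf;
    }
    *u = uint16_t(uSum / 4.0f * 1023.0f + 0.5f);                 // average U over the 4 points
    *v = uint16_t(vSum / 4.0f * 1023.0f + 0.5f);                 // average V over the 4 points
}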
Scheme II: GPU conversion scheme.
Because the CPU conversion performance is lower, a GPU conversion scheme is also provided. FIG. 20 is a flowchart of a GPU conversion scheme according to an embodiment of the present application. As shown in FIG. 20, in step S201, an rgb10 format texture (i.e., R10G10B10A2) is linearly converted into an rgb16 format texture (i.e., RGBA16) by the GPU; then, in step S202, YUV data in which the high 10 bits are valid is obtained by color matrix conversion; in step S203, reordering is performed and the data is divided by 64 to obtain 10-bit YUV data (i.e., YUV420P10LE); in step S204, the 10-bit YUV data is copied from the GPU to the CPU; in step S205, the 10-bit YUV data is fed into the encoder.
FIG. 21 is a graph showing the comparison of the performance of the CPU conversion scheme and the GPU conversion scheme, wherein the algorithm I is the CPU conversion scheme, the algorithm II is the GPU conversion scheme, the CPU occupation of the algorithm I is 10%, and the CPU occupation of the algorithm II is 0, as shown in FIG. 21; the frame processing time of algorithm one is 22 milliseconds and the frame processing time of algorithm two is 1.2 milliseconds.
In the step S134 game picture video encoding process, two HDR encoding modes are supported. One of which is CPU software code and the other of which is GPU hardware code.
In the process of CPU software encoding, the libx264 open source encoder can be directly utilized. Compared with traditional live broadcast, the formats used in the embodiment of the application are AVCOL_SPC_BT2020_NCL, AVCOL_PRI_BT2020, AVCOL_TRC_SMPTE2084, and the 10-bit YUV format AV_PIX_FMT_YUV420P10LE. It should be noted that this coding format is not a standard format for HDR.
In the GPU hardware encoding process, an nvenc_hevc hardware encoder may be utilized. Compared with conventional live broadcast, the formats used in the embodiments of the present application are AVCOL_SPC_BT2020_NCL, AVCOL_PRI_BT2020, AVCOL_TRC_SMPTE2084, and the 10-bit AV_PIX_FMT_P010LE format.
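For illustration, a hedged sketch of setting these color properties on an FFmpeg encoder context is given below; "hevc_nvenc" is the FFmpeg name assumed here for the NVENC HEVC encoder mentioned above, its availability depends on the build, and the helper name CreateHdrHevcEncoder and the chosen frame rate are assumptions rather than the exact configuration used by the method:
extern "C" {
#include <libavcodec/avcodec.h>
}
// Minimal sketch: configure an HEVC hardware encoder context for 10-bit HDR10 output.
AVCodecContext* CreateHdrHevcEncoder(int width, int height)
{
    const AVCodec* codec = avcodec_find_encoder_by_name("hevc_nvenc");
    if (!codec)
        return nullptr;
    AVCodecContext* ctx = avcodec_alloc_context3(codec);
    ctx->width = width;
    ctx->height = height;
    ctx->time_base = AVRational{1, 60};                  // assumed 60 fps game capture
    ctx->pix_fmt = AV_PIX_FMT_P010LE;                    // 10-bit 4:2:0
    ctx->color_primaries = AVCOL_PRI_BT2020;             // BT.2020 primaries
    ctx->color_trc = AVCOL_TRC_SMPTE2084;                // PQ transfer curve
    ctx->colorspace = AVCOL_SPC_BT2020_NCL;              // non-constant luminance matrix
    if (avcodec_open2(ctx, codec, nullptr) < 0)
    {
        avcodec_free_context(&ctx);
        return nullptr;
    }
    return ctx;
}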
Because existing schemes lack HDR key metadata (i.e., HDR metadata), or the HDR key metadata is simply native display data, rather than the HDR key metadata actually used by the game, this can lead to inaccurate results seen by the spectators. Therefore, the embodiment of the application provides the following data scheme, which is divided into game HDR metadata acquisition and HEVC hard coding supporting HDR data.
Fig. 22 is a flowchart of a game HDR metadata acquisition method provided by an embodiment of the present application, as shown in fig. 22, in the game HDR metadata acquisition process, including the following steps:
in step S221, an HDR metadata exchange command (swap4 x hdrmetadata) is input.
Here, a separate process is created, the device (Device) and switch chain (SwapChain) data required for D3D rendering are created, and HDR example metadata is set, which corresponds to a swap4 address.
In step S222, i=0 is defined.
Step S223, determining whether the traversed video data is identical to the HDR example metadata.
Here, starting from the switch chain, the subsequent data of the video data is traversed one by one, and the content equal to the HDR example metadata is located so as to acquire an offset address (offset). This offset address is different for different versions of the operating system.
If yes, go to step S224; if the determination is negative, step S225 is performed.
In step S224, metadata offset=i is output.
In step S225, the value of i is added to 1.
In the embodiment of the application, the hook obtains the switch chain of the game process, reads the content at the offset address, namely the HDR metadata, and then returns it to the helper process through a pipe.
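A heavily hedged sketch of the metadata-offset search in steps S221 to S225 is given below, assuming a DXGI 1.5+ swap chain; FindHdrMetadataOffset, the example values and the scan length are hypothetical, and a real implementation must ensure the scanned range stays within readable memory:
#include <dxgi1_5.h>
#include <cstring>
// Minimal sketch: set known example HDR metadata on a probe swap chain, then scan the bytes
// following the swap chain object for the same content to learn the offset at which the
// runtime stores HDR metadata (the offset differs between operating system versions).
ptrdiff_t FindHdrMetadataOffset(IDXGISwapChain4* probeSwapChain, size_t maxScanBytes)
{
    DXGI_HDR_METADATA_HDR10 example = {};
    example.MaxMasteringLuminance = 10000000;            // distinctive example values
    example.MinMasteringLuminance = 1;
    example.MaxContentLightLevel = 1000;
    example.MaxFrameAverageLightLevel = 400;
    probeSwapChain->SetHDRMetaData(DXGI_HDR_METADATA_TYPE_HDR10, sizeof(example), &example);
    const unsigned char* base = reinterpret_cast<const unsigned char*>(probeSwapChain);
    for (size_t i = 0; i + sizeof(example) <= maxScanBytes; ++i)   // i plays the role of the loop index in fig. 22
    {
        if (std::memcmp(base + i, &example, sizeof(example)) == 0)
            return static_cast<ptrdiff_t>(i);            // offset of the HDR metadata relative to the swap chain
    }
    return -1;                                           // not found
}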
Fig. 23 is an interface diagram of a game HDR metadata acquisition method provided in an embodiment of the present application, in which an HDR metadata address 231 is obtained through traversal, under the HDR metadata address 231, there is HDR metadata content 232, and an exchange address (i.e. swap4 address) 233 is also provided, and under the exchange address 233, the same content 234 as the HDR metadata, i.e. the content in the HDR example metadata is provided. Note that HDR example metadata is stored at a swap4 address, and HDR metadata is stored at an HDR metadata address.
Fig. 24 is a flowchart of a method for supporting HDR data by hard-coding HEVC, which is provided in an embodiment of the present application, as shown in fig. 24, in the process of supporting HDR data by hard-coding HEVC, including the following steps:
in step S241, it is determined whether the current frame of the game video (i.e., the original video of the target live object) is a key frame.
If yes, go to step S242; if the determination result is negative, step S243 is performed.
Step S242, the HDR metadata of the game video is acquired through an AV bit stream filter (AVBitStreamFilter).
Step S243, an Rtmp data stream (RtmpStream) is acquired.
In the embodiment of the application, since 264 can use the AVBitStreamFilter of ffmpeg to add supplemental enhancement information (SEI, Supplemental Enhancement Information), but the 265 AVBitStreamFilter does not define HDR metadata, the ffmpeg source code is modified so that the 265 AVBitStreamFilter supports HDR metadata. A step is added to the original encoding flow, and the HDR metadata information acquired from the game is added to the key frame output by the encoder by using the AVBitStreamFilter.
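For reference, the generic FFmpeg bitstream-filter plumbing that such a step hooks into looks roughly as follows; note that the stock "hevc_metadata" filter does not insert the HDR SEI described here, that capability is the custom modification mentioned above, so the filter name below is only a placeholder and the helper names are assumptions:
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavcodec/bsf.h>
}
// Minimal sketch: create a bitstream filter for the encoded HEVC stream.
AVBSFContext* CreateFilter(const AVCodecParameters* par)
{
    const AVBitStreamFilter* filter = av_bsf_get_by_name("hevc_metadata"); // placeholder name
    AVBSFContext* bsf = nullptr;
    if (!filter || av_bsf_alloc(filter, &bsf) < 0)
        return nullptr;
    avcodec_parameters_copy(bsf->par_in, par);       // describe the input stream
    if (av_bsf_init(bsf) < 0)
    {
        av_bsf_free(&bsf);
        return nullptr;
    }
    return bsf;
}
// Minimal sketch: pass an encoded packet through the filter; the HDR-SEI insertion itself
// relies on the modified 265 AVBitStreamFilter described in the text.
int FilterPacket(AVBSFContext* bsf, AVPacket* pkt)
{
    int ret = av_bsf_send_packet(bsf, pkt);          // hand the encoded packet to the filter
    if (ret < 0)
        return ret;
    return av_bsf_receive_packet(bsf, pkt);          // get back the filtered packet
}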
In step S135, after encoding is completed, the encoded stream may be output to an offline MP4 file, or may be used for live webcasting.
The live video processing method provided by the embodiment of the application involves at least the following key points: acquisition of HDR game picture content; rendering of HDR game pictures and synthesis with other SDR content; encoding of HDR game pictures and live stream pushing; recording of HDR game frames.
It will be appreciated that in the embodiment of the present application, the content of the user information, such as the selection operation of the operation mode button by the anchor and the live start operation for the target live object, the original video of the target live object, the live video of the target live object, etc., if the data related to the user information or the enterprise information is involved, when the embodiment of the present application is applied to a specific product or technology, the user permission or consent needs to be obtained, and the collection, use and processing of the related data need to comply with the relevant laws and regulations and standards of the relevant country and region.
Continuing with the description below, the live video processing device 354 provided in the embodiments of the present application is implemented as an exemplary structure of a software module, and in some embodiments, as shown in fig. 4, the live video processing device 354 includes:
a first display module 3541 for displaying an operation mode button on a setting interface of a live application;
An obtaining module 3542, configured to obtain, in response to a selection operation for the operation mode button and a live broadcast start operation for a target live broadcast object, a live broadcast video of the target live broadcast object based on an original video of the target live broadcast object and an operation mode corresponding to the selection operation; the live video has a corresponding picture display effect in the running mode;
a second display module 3543, configured to display the live video on a live interface of the live application.
In some embodiments, the apparatus further comprises: the initialization module is used for initializing a texture area at a preset storage position of the live broadcast application; wherein the texture region enables cross-process sharing; the first function calling module is used for calling a hook function to hook the original video of the target live object when the original video is generated; and the copying module is used for copying the original video processed by the hook to the shared texture.
In some embodiments, the first function call module is further configured to: when an original video of the target live object is generated, a hook function is called to carry out hook processing on a designated message for generating the original video; modifying the specified message processed by the hook to obtain a modified specified message; and acquiring the original video of the target live broadcast object based on the modified specified message to obtain the original video after the hook processing.
In some embodiments, the apparatus further comprises: the second function calling module is used for calling an open shared resource function to open a shared handle of the graphic infrastructure when the original video of the target live object is generated; the information acquisition module is used for acquiring a texture region capable of realizing cross-process sharing and a region identifier of the texture region through the sharing handle; and the video picture type determining module is used for determining the video picture type of the generated original video based on the format of the region identification.
In some embodiments, the acquisition module is further configured to: when the video picture type of the original video is matched with the operation mode corresponding to the selection operation, rendering the original video onto a target canvas of the live broadcast application by adopting a preset color format and a preset color space to obtain a rendered video; and encoding the rendered video to obtain the live video of the target live object.
In some embodiments, the acquisition module is further configured to: performing format conversion processing on the coded video to obtain video data after format conversion; and performing software coding processing on the video after format conversion, or performing hardware coding processing on the video after format conversion to obtain the live video of the target live object.
In some embodiments, the encoded video is RGB format data; the acquisition module is further configured to: performing bit operation on the RGB format data to obtain RGB component data of each pixel point; determining RGB component data of each preset number of pixel points as a data group; performing matrix conversion on RGB component data in each data set to obtain YUV data corresponding to each pixel point; and determining the video data after format conversion based on the YUV data corresponding to each pixel point.
In some embodiments, the encoded video is RGB format data; the acquisition module is further configured to: acquiring format textures of the RGB format data; performing linear conversion on the format texture to obtain RGB data after format conversion; sequentially performing color matrix conversion and reordering processing on the RGB data subjected to format conversion to obtain YUV data with preset bits; and determining the video data after format conversion based on the YUV data with the preset bit.
In some embodiments, the apparatus further comprises: the exchange chain creation module is used for creating an exchange chain when the original video is rendered before the video subjected to the format conversion is subjected to hardware coding processing, and acquiring preset example metadata; the traversing module is used for traversing the video data corresponding to the original video by taking the exchange chain as an initial detection point and determining the data content which is the same as the example metadata in the video data; the offset address determining module is used for determining an offset address based on the same data content; a metadata determination module for determining the offset address as HDR metadata of the original video; correspondingly, the acquisition module is further configured to: and carrying out hardware coding processing on the video after format conversion based on the HDR metadata of the original video to obtain the live video of the target live object.
In some embodiments, the apparatus further comprises: a metadata acquisition module, configured to acquire HDR metadata of the original video from video data of the original video; the key frame determining module is used for determining key frames in the live video after the hardware encoding processing; and the information adding module is used for adding the HDR metadata serving as supplementary enhancement information to the frame data of the key frame.
In some embodiments, the video picture types of the original video and the live video are both HDR types.
It should be noted that, the description of the apparatus according to the embodiment of the present application is similar to the description of the embodiment of the method described above, and has similar beneficial effects as the embodiment of the method, so that a detailed description is omitted. For technical details not disclosed in the present apparatus embodiment, please refer to the description of the method embodiment of the present application for understanding.
Embodiments of the present application provide a computer program product or computer program comprising executable instructions, the executable instructions being computer instructions; the executable instructions are stored in a computer readable storage medium. When the processor of the live video processing device reads the executable instructions from the computer readable storage medium and executes them, the live video processing device is caused to perform the method according to the embodiments of the present application.
Embodiments of the present application provide a storage medium having stored therein executable instructions which, when executed by a processor, cause the processor to perform a method provided by embodiments of the present application, for example, as shown in fig. 5.
In some embodiments, the storage medium may be a computer readable storage medium, such as a ferroelectric Memory (FRAM, Ferroelectric Random Access Memory), Read Only Memory (ROM), Programmable Read Only Memory (PROM, Programmable Read Only Memory), Erasable Programmable Read Only Memory (EPROM, Erasable Programmable Read Only Memory), Electrically Erasable Programmable Read Only Memory (EEPROM, Electrically Erasable Programmable Read Only Memory), Flash Memory, magnetic surface Memory, optical Disk, or Compact Disk-Read Only Memory (CD-ROM), or the like; it may also be a variety of devices including one or any combination of the above memories.
In some embodiments, the executable instructions may be in the form of programs, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to files in a file system, and may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a hypertext markup language (HTML, Hyper Text Markup Language) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). As an example, the executable instructions may be deployed to execute on one computing device (which may be a live video processing device), or on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (15)

1. A live video processing method, the method comprising:
Displaying an operation mode button on a setting interface of the live broadcast application;
responding to the selection operation of the operation mode button and the live broadcast starting operation of a target live broadcast object, and acquiring the live broadcast video of the target live broadcast object based on the original video of the target live broadcast object and the operation mode corresponding to the selection operation; the live video has a corresponding picture display effect in the running mode;
and displaying the live video on a live interface of the live application.
2. The method according to claim 1, wherein the method further comprises:
initializing a texture region at a preset storage position of the live broadcast application; wherein the texture region enables cross-process sharing;
when an original video of the target live broadcast object is generated, a hook function is called to carry out hook processing on the original video;
copying the original video processed by the hook to the shared texture.
3. The method according to claim 2, wherein when generating the original video of the target live object, invoking a hook function to hook the original video comprises:
When an original video of the target live object is generated, a hook function is called to carry out hook processing on a designated message for generating the original video;
modifying the specified message processed by the hook to obtain a modified specified message;
and acquiring the original video of the target live broadcast object based on the modified specified message to obtain the original video after the hook processing.
4. The method according to claim 1, wherein the method further comprises:
when the original video of the target live object is generated, an open shared resource function is called to open a shared handle of the graphic infrastructure;
acquiring a texture region capable of realizing cross-process sharing and a region identifier of the texture region through the sharing handle;
and determining the video picture type of the generated original video based on the format of the region identifier.
5. The method according to claim 1, wherein the obtaining the live video of the target live object based on the original video of the target live object and the operation mode corresponding to the selection operation includes:
when the video picture type of the original video matches the operation mode corresponding to the selection operation, rendering the original video onto a target canvas of the live broadcast application by adopting a preset color format and a preset color space to obtain a rendered video;
and encoding the rendered video to obtain the live video of the target live object.
6. The method of claim 5, wherein the encoding the rendered video to obtain the live video of the target live object comprises:
performing format conversion processing on the coded video to obtain video data after format conversion;
and performing software coding processing on the video after format conversion, or performing hardware coding processing on the video after format conversion to obtain the live video of the target live object.
7. The method of claim 6, wherein the encoded video is RGB format data; the performing format conversion processing on the encoded video to obtain video data after format conversion, including:
performing bit operation on the RGB format data to obtain RGB component data of each pixel point;
determining RGB component data of each preset number of pixel points as a data group;
performing matrix conversion on RGB component data in each data set to obtain YUV data corresponding to each pixel point;
And determining the video data after format conversion based on the YUV data corresponding to each pixel point.
8. The method of claim 6, wherein the encoded video is RGB format data; the performing format conversion processing on the encoded video to obtain video data after format conversion, including:
acquiring format textures of the RGB format data;
performing linear conversion on the format texture to obtain RGB data after format conversion;
sequentially performing color matrix conversion and reordering processing on the RGB data subjected to format conversion to obtain YUV data with preset bits;
and determining the video data after format conversion based on the YUV data with the preset bit.
9. The method of claim 6, wherein prior to performing the hardware encoding process on the format-converted video, the method further comprises:
creating a switching chain when the original video is rendered, and acquiring preset example metadata;
traversing video data corresponding to the original video by taking the exchange chain as an initial detection point, and determining the data content which is the same as the example metadata in the video data;
Determining an offset address based on the same data content;
determining the offset address as HDR metadata of the original video;
correspondingly, the hardware encoding processing is performed on the video after format conversion to obtain the live video of the target live object, which comprises the following steps:
and carrying out hardware coding processing on the video after format conversion based on the HDR metadata of the original video to obtain the live video of the target live object.
10. The method of claim 6, wherein the method further comprises:
acquiring HDR metadata of the original video from the video data of the original video;
determining a key frame in the live video after the hardware encoding processing;
the HDR metadata is added as supplemental enhancement information to frame data of the key frames.
11. The method according to any of claims 1 to 10, wherein the video picture types of the original video and the live video are both HDR types.
12. A live video processing apparatus, the apparatus comprising:
the first display module is used for displaying an operation mode button on a setting interface of the live broadcast application;
The acquisition module is used for responding to the selection operation of the operation mode button and the live broadcast starting operation of the target live broadcast object, and acquiring the live broadcast video of the target live broadcast object based on the original video of the target live broadcast object and the operation mode corresponding to the selection operation; the live video has a corresponding picture display effect in the running mode;
and the second display module is used for displaying the live video on a live interface of the live application.
13. A live video processing device, comprising:
a memory for storing executable instructions; a processor for implementing the live video processing method of any one of claims 1 to 11 when executing executable instructions stored in the memory.
14. A computer readable storage medium, characterized in that executable instructions are stored for causing a processor to execute the executable instructions for implementing the live video processing method of any one of claims 1 to 11.
15. A computer program product or computer program comprising executable instructions stored in a computer readable storage medium;
The live video processing method of any of claims 1 to 11 is implemented when a processor of a live video processing device reads the executable instructions from the computer readable storage medium and executes the executable instructions.
CN202210370093.1A 2022-04-08 2022-04-08 Live video processing method, apparatus, device, storage medium and computer program Pending CN116939233A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210370093.1A CN116939233A (en) 2022-04-08 2022-04-08 Live video processing method, apparatus, device, storage medium and computer program
PCT/CN2023/076420 WO2023193524A1 (en) 2022-04-08 2023-02-16 Live streaming video processing method and apparatus, electronic device, computer-readable storage medium, and computer program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210370093.1A CN116939233A (en) 2022-04-08 2022-04-08 Live video processing method, apparatus, device, storage medium and computer program

Publications (1)

Publication Number Publication Date
CN116939233A true CN116939233A (en) 2023-10-24

Family

ID=88243942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210370093.1A Pending CN116939233A (en) 2022-04-08 2022-04-08 Live video processing method, apparatus, device, storage medium and computer program

Country Status (2)

Country Link
CN (1) CN116939233A (en)
WO (1) WO2023193524A1 (en)

Also Published As

Publication number Publication date
WO2023193524A1 (en) 2023-10-12

