CN116567244A - Determination method, device, equipment and storage medium of decoding configuration parameters - Google Patents

Determination method, device, equipment and storage medium of decoding configuration parameters

Info

Publication number
CN116567244A
CN116567244A (application number CN202210103032.9A)
Authority
CN
China
Prior art keywords
decoding
initial
rendering
frame rate
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210103032.9A
Other languages
Chinese (zh)
Inventor
杨小祥
曹洪彬
陈思佳
曹健
黄永铖
张佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210103032.9A priority Critical patent/CN116567244A/en
Publication of CN116567244A publication Critical patent/CN116567244A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output

Abstract

The application provides a method, a device, equipment and a storage medium for determining decoding configuration parameters, which can be applied to various scenes such as cloud technology, artificial intelligence, intelligent traffic, driving assistance, video and the like, and the method comprises the following steps: determining initial decoding rendering configuration information of the encoding device, wherein the initial decoding rendering configuration information comprises initial decoding parameters and initial rendering parameters of the encoding device; decoding the test code stream under the initial decoding configuration information to obtain an initial decoding output frame rate and initial single frame decoding delay; and if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold value, adjusting at least one of the initial decoding parameters and the initial rendering parameters to obtain target decoding rendering configuration information of the encoding equipment. That is, according to the method and the device, the influence of the decoding parameters and the rendering parameters is considered when the target decoding rendering configuration is determined, and the target decoding rendering configuration information meeting the low-delay and high-resolution scenes can be determined.

Description

Determination method, device, equipment and storage medium of decoding configuration parameters
Technical Field
The embodiment of the application relates to the technical field of image processing, in particular to a method, a device, equipment and a storage medium for determining decoding configuration parameters.
Background
There are many video application scenarios, such as a video call scenario, a video player scenario, a screen sharing scenario, and the like. The requirements of different scenarios on video are not identical; for example, a video call scenario requires low delay, a video playing scenario requires high resolution, and a screen sharing scenario requires both high resolution and low delay.
In order to meet the requirements of different scenarios on video, the decoding parameters of the decoding device can be configured. For example, the requirement of low delay can be met by comparing the decoding delays of different decoding chips and selecting the decoding chip with the smallest decoding delay as the target decoding chip for decoding; alternatively, the highest resolutions supported by different decoding chips are compared, and the decoding chip supporting the highest resolution is selected as the target decoding chip for decoding, so as to meet the requirement of high resolution.
That is, existing solutions mainly consider the influence of the decoding chip type on the decoding configuration; however, for a low-latency, high-frame-rate scenario, decoding configuration parameters that satisfy the requirements cannot be obtained merely by changing the decoding chip type.
Disclosure of Invention
The application provides a method, a device, equipment and a storage medium for determining decoding configuration parameters, so as to obtain target decoding rendering configuration information meeting the requirements of low-delay and high-frame-rate scenes and improve decoding effect.
In a first aspect, the present application provides a method for determining decoding configuration parameters, applied to an encoding device, the method including:
determining initial decoding rendering configuration information of an encoding device, wherein the initial decoding rendering configuration information comprises initial decoding parameters and initial rendering parameters of the encoding device;
decoding the test code stream under the initial decoding configuration information to obtain an initial decoding output frame rate and initial single frame decoding delay corresponding to the initial decoding rendering configuration information;
and if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold value, adjusting at least one of the initial decoding parameter and the initial rendering parameter to obtain target decoding rendering configuration information of the encoding equipment, wherein the decoding output frame rate and the single frame decoding delay corresponding to the target decoding rendering configuration meet the corresponding threshold value.
In a second aspect, the present application provides a determining apparatus for decoding configuration parameters, applied to an encoding device, including:
A determining unit configured to determine initial decoding rendering configuration information of an encoding device, where the initial decoding rendering configuration information includes initial decoding parameters and initial rendering parameters of the encoding device;
the detection unit is used for decoding the test code stream under the initial decoding configuration information to obtain an initial decoding output frame rate and initial single frame decoding delay corresponding to the initial decoding rendering configuration information;
and the adjusting unit is used for adjusting at least one of the initial decoding parameters and the initial rendering parameters to obtain target decoding rendering configuration information of the encoding equipment if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold value, wherein the decoding output frame rate and the single frame decoding delay corresponding to the target decoding rendering configuration meet the corresponding threshold value.
In a third aspect, an electronic device is provided, comprising: a processor and a memory for storing a computer program, the processor being for invoking and running the computer program stored in the memory to perform the method of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided for storing a computer program that causes a computer to perform the method of the first aspect.
In a fifth aspect, a chip is provided for implementing the method in the first aspect or each implementation manner thereof. Specifically, the chip includes: a processor for calling and running a computer program from a memory, causing a device on which the chip is mounted to perform the method as in the first aspect or implementations thereof described above.
In a sixth aspect, a computer program product is provided, comprising computer program instructions for causing a computer to perform the method of the first aspect or implementations thereof.
In a seventh aspect, there is provided a computer program which, when run on a computer, causes the computer to perform the method of the first aspect or implementations thereof described above.
In summary, in the present application, an encoding apparatus determines initial decoding rendering configuration information of the encoding apparatus, the initial decoding rendering configuration information including initial decoding parameters and initial rendering parameters of the encoding apparatus; decoding the test code stream under the initial decoding configuration information to obtain an initial decoding output frame rate and initial single frame decoding delay corresponding to the initial decoding rendering configuration information; and if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold value, adjusting at least one of the initial decoding parameter and the initial rendering parameter to obtain target decoding rendering configuration information of the encoding equipment, wherein the decoding output frame rate and the single frame decoding delay corresponding to the target decoding rendering configuration meet the corresponding threshold value. In other words, in the embodiment of the application, when the target decoding rendering configuration is determined, the decoding parameters and the rendering parameters of the encoding device are fully considered, the target decoding rendering configuration information meeting the low-delay and high-resolution scenes is determined by adjusting the decoding parameters and the rendering parameters, and when the target decoding rendering configuration information is used for encoding and decoding, the quality of encoding and decoding can be improved. In addition, in some embodiments, the initial decoding rendering configuration information is optimal decoding rendering configuration information of a plurality of devices, and the target decoding rendering configuration information of the encoding device can be found out from a complex configuration combination by performing limited detection on the basis of the initial decoding rendering configuration information, so that efficiency is high.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic view of an application scenario according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of a video encoder provided by an embodiment of the present application;
FIG. 3 is a schematic block diagram of a video decoder provided by an embodiment of the present application;
FIG. 4 is a flowchart of a method for determining decoding configuration parameters according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating detection of decoding parameters and rendering parameters according to an embodiment of the present application;
FIG. 6 is a flowchart of a method for determining decoding configuration parameters according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a decoding configuration parameter determining apparatus according to an embodiment of the present application;
fig. 8 is a schematic block diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
First, related concepts related to the embodiments of the present application will be described:
Decoding frame hoarding: refers to the phenomenon that a hardware decoder starts to output decoded images only after a certain number of video frames have been fed in.
Single frame decoding delay: refers to the time difference between a video frame being fed into the hardware decoder and that frame being output by the hardware decoder. For a decoding chip without frame hoarding, the single frame decoding delay is the real decoding delay. For a decoding chip with frame hoarding, the single frame decoding delay may include the hoarding time of the preceding frames.
Decoding input frame rate: refers to the frequency at which the video stream is fed into the decoder.
Decoding output frame rate: refers to the frequency at which the decoder outputs video image frames.
Video coding single-frame reference: refers to video frames that, when encoded, only reference the image content of the previous frame.
Video coding multi-frame reference: refers to video frames that, when encoded, reference the image content of the previous multiple frames. For some chip types, video coding multi-frame reference may increase the number of frames hoarded on the chip.
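The two quantities defined above (decoding frame hoarding and single frame decoding delay) can be made concrete with a minimal Python sketch. The decoder interface used here (feed/poll_output) is a hypothetical assumption for illustration only, not an interface defined by this application.

```python
import time

def measure_hoarding_and_delay(hw_decoder, frames):
    """Feed frames one by one; report how many frames were fed before the
    first decoded image appeared, and the per-frame decoding delays."""
    input_times = {}                    # frame index -> time the frame was fed in
    delays = []                         # observed single frame decoding delays (seconds)
    frames_before_first_output = None   # measures the frame hoarding behaviour

    for idx, frame in enumerate(frames):
        input_times[idx] = time.monotonic()
        hw_decoder.feed(frame)            # assumed non-blocking input call
        out = hw_decoder.poll_output()    # assumed: returns (frame_index, image) or None
        if out is not None:
            out_idx, _image = out
            if frames_before_first_output is None:
                frames_before_first_output = idx + 1
            delays.append(time.monotonic() - input_times[out_idx])

    return frames_before_first_output, delays
```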
Fig. 1 is a schematic view of an application scenario according to an embodiment of the present application, including a cloud server 101 and a terminal device 102. The cloud server 101 may be understood as an encoding device, and the terminal device 102 may be understood as a decoding device.
The cloud server 101 is configured to encode (may be understood as compressing) video data to generate a code stream, and transmit the code stream to the terminal device 102.
The cloud server 101 of the present embodiment may be understood as a device with a video encoding function, and the terminal device 102 may be understood as a device with a video decoding function. That is, the cloud server 101 and the terminal device 102 in the embodiments of the present application cover a wide range of devices, such as smart phones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, and the like.
In some embodiments, cloud server 101 may transmit encoded video data (e.g., a code stream) to terminal device 102 via channel 103. Channel 103 may comprise one or more media and/or devices capable of transmitting encoded video data from cloud server 101 to terminal device 102.
In one example, channel 103 includes one or more communication media that enable cloud server 101 to transmit encoded video data directly to terminal device 102 in real-time. In this example, cloud server 101 may modulate the encoded video data according to a communication standard and transmit the modulated video data to terminal device 102. Where the communication medium comprises a wireless communication medium, such as a radio frequency spectrum, the communication medium may optionally also comprise a wired communication medium, such as one or more physical transmission lines.
In another example, channel 103 comprises a storage medium that may store video data encoded by cloud server 101. Storage media include a variety of locally accessed data storage media such as compact discs, DVDs, flash memory, and the like. In this example, the terminal device 102 may obtain encoded video data from the storage medium.
In another example, channel 103 may comprise a storage server that may store video data encoded by cloud server 101. In this example, the terminal device 102 may download stored encoded video data from the storage server. Alternatively, the storage server may store the encoded video data and may transmit the encoded video data to the terminal device 102, such as a web server (e.g., for a website), a File Transfer Protocol (FTP) server, or the like.
In some embodiments, cloud server 101 includes a video encoder and an output interface. The output interface may comprise, among other things, a modulator/demodulator (modem) and/or a transmitter.
In some embodiments, cloud server 101 may include a video source in addition to a video encoder and an input interface.
The video source may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface for receiving video data from a video content provider, a computer graphics system for generating video data.
A video encoder encodes video data from a video source to produce a bitstream. The video data may include one or more pictures (pictures) or sequences of pictures (sequence of pictures). The code stream contains encoded information of the image or image sequence in the form of a bit stream. The encoded information may include encoded image data and associated data. The associated data may include a sequence parameter set (sequence parameter set, SPS for short), a picture parameter set (picture parameter set, PPS for short), and other syntax structures. An SPS may contain parameters that apply to one or more sequences. PPS may contain parameters that apply to one or more pictures. A syntax structure refers to a set of zero or more syntax elements arranged in a specified order in a bitstream.
The video encoder transmits the encoded video data directly to the terminal device 102 via an output interface. The encoded video data may also be stored on a storage medium or a storage server for subsequent reading by the terminal device 102.
In some embodiments, the terminal device 102 includes an input interface and a video decoder.
In some embodiments, the terminal device 102 may include a display in addition to an input interface and a video decoder.
Wherein the input interface comprises a receiver and/or a modem. The input interface may receive encoded video data over a channel.
The video decoder is used for decoding the encoded video data to obtain decoded video data, and transmitting the decoded video data to the display device.
The display device displays the decoded video data. The display means may be integrated with the terminal device or external to the terminal device. The display device may include a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or other types of display devices.
Alternatively, the cloud server 101 may be one or more. When the cloud server 101 is plural, there are at least two servers for providing different services, and/or there are at least two servers for providing the same service, for example, providing the same service in a load balancing manner, which is not limited in the embodiment of the present application.
Alternatively, the cloud server 101 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Network, content distribution networks), and basic cloud computing services such as big data and artificial intelligence platforms. Cloud server 101 may also become a node of the blockchain.
In some embodiments, the cloud server 101 is a cloud server with powerful computing resources, and is characterized by high virtualization and high distribution.
In some embodiments, the present application may be applied to the field of image encoding and decoding, the field of video encoding and decoding, the field of hardware video encoding and decoding, the field of dedicated circuit video encoding and decoding, the field of real-time video encoding and decoding, and the like. For example, the schemes of the present application may be incorporated into audio video coding standards (audio video coding standard, AVS for short), such as the H.264/advanced video coding (advanced video coding, AVC for short) standard, the H.265/high efficiency video coding (high efficiency video coding, HEVC for short) standard, and the H.266/versatile video coding (versatile video coding, VVC for short) standard. Alternatively, the schemes of the present application may operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. It should be understood that the techniques of this application are not limited to any particular codec standard or technique.
The following describes a video coding framework according to an embodiment of the present application.
Fig. 2 is a schematic block diagram of a video encoder provided by an embodiment of the present application. It should be appreciated that the video encoder 200 may be used for lossy compression of images (lossy compression) and may also be used for lossless compression of images (lossless compression). The lossless compression may be visual lossless compression (visually lossless compression) or mathematical lossless compression (mathematically lossless compression).
The video encoder 200 may be applied to image data in luminance and chrominance (YCbCr, YUV) format.
For example, the video encoder 200 reads video data and, for each frame of image in the video data, divides the frame into a number of coding tree units (CTUs). In some examples, a CTU may be referred to as a "tree block", a "largest coding unit" (LCU), or a "coding tree block" (CTB). Each CTU may be associated with a block of pixels of equal size within the image. Each pixel may correspond to one luminance (luma) sample and two chrominance (chroma) samples. Thus, each CTU may be associated with one block of luma samples and two blocks of chroma samples. The size of one CTU is, for example, 128×128, 64×64, or 32×32. A CTU may be further divided into several coding units (CUs), where a CU may be a rectangular block or a square block. A CU may be further divided into prediction units (PUs) and transform units (TUs), so that coding, prediction, and transform are decoupled and processing is more flexible. In one example, a CTU is divided into CUs in a quadtree manner, and a CU is divided into TUs and PUs in a quadtree manner.
Video encoders and video decoders may support various PU sizes. Assuming that the size of a particular CU is 2N×2N, video encoders and video decoders may support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PU sizes of 2N×2N, 2N×N, N×2N, N×N, or similar for inter prediction. Video encoders and video decoders may also support asymmetric PUs of 2N×nU, 2N×nD, nL×2N, and nR×2N for inter prediction.
In some embodiments, as shown in fig. 2, the video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filtering unit 260, a decoded image buffer 270, and an entropy encoding unit 280. It should be noted that video encoder 200 may include more, fewer, or different functional components.
Alternatively, in this application, a current block (current block) may be referred to as a current Coding Unit (CU) or a current Prediction Unit (PU), or the like. The prediction block may also be referred to as a prediction image block or an image prediction block, and the reconstructed image block may also be referred to as a reconstructed block or an image reconstructed image block.
In some embodiments, prediction unit 210 includes an inter prediction unit 211 and an intra estimation unit 212. Because of the strong correlation between adjacent pixels in a frame of video, intra-prediction methods are used in video coding techniques to eliminate spatial redundancy between adjacent pixels. Because of the strong similarity between adjacent frames in video, the inter-frame prediction method is used in the video coding and decoding technology to eliminate the time redundancy between adjacent frames, thereby improving the coding efficiency.
The inter prediction unit 211 may be used for inter prediction. Inter prediction may refer to image information of different frames, using motion information to find a reference block from a reference frame and generate a prediction block from the reference block, so as to eliminate temporal redundancy. The frames used for inter prediction may be P frames and/or B frames, where a P frame refers to a forward predicted frame and a B frame refers to a bidirectionally predicted frame. The motion information includes the reference frame list in which the reference frame is located, a reference frame index, and a motion vector. The motion vector may be of integer-pixel or sub-pixel precision; if the motion vector is of sub-pixel precision, interpolation filtering needs to be applied in the reference frame to generate the required sub-pixel block. The integer-pixel or sub-pixel block in the reference frame found according to the motion vector is referred to as the reference block. Some techniques use the reference block directly as the prediction block, while other techniques further process the reference block to generate the prediction block. Further processing the reference block to generate a prediction block can also be understood as taking the reference block as a prediction block and then processing it to generate a new prediction block.
The most commonly used inter prediction methods at present include: the geometric partitioning mode (geometric partitioning mode, GPM) in the VVC video codec standard, and angular weighted prediction (angular weighted prediction, AWP) in the AVS3 video codec standard. These two inter prediction modes share a common principle.
The intra estimation unit 212 predicts pixel information within the current code image block for eliminating spatial redundancy by referring to only information of the same frame image. The frame used for intra prediction may be an I-frame.
The intra prediction modes used by HEVC are Planar mode (Planar), DC, and 33 angular modes, for a total of 35 prediction modes. The intra modes used by VVC are Planar, DC and 65 angular modes, for a total of 67 prediction modes. The intra modes used by AVS3 are DC, plane, bilinear and 63 angular modes, for a total of 66 prediction modes.
In some embodiments, intra-estimation unit 212 may be implemented using intra-block copy techniques and intra-string copy techniques.
Residual unit 220 may generate a residual block of the CU based on the pixel block of the CU and the prediction block of the PU of the CU. For example, residual unit 220 may generate a residual block of the CU such that each sample in the residual block has a value equal to the difference between: samples in pixel blocks of a CU, and corresponding samples in prediction blocks of PUs of the CU.
The transform/quantization unit 230 may quantize the transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with TUs of a CU based on Quantization Parameter (QP) values associated with the CU. The video encoder 200 may adjust the degree of quantization applied to the transform coefficients associated with the CU by adjusting the QP value associated with the CU.
The inverse transform/quantization unit 240 may apply inverse quantization and inverse transform, respectively, to the quantized transform coefficients to reconstruct a residual block from the quantized transform coefficients.
The reconstruction unit 250 may add samples of the reconstructed residual block to corresponding samples of one or more prediction blocks generated by the prediction unit 210 to generate a reconstructed image block associated with the TU. In this way, reconstructing sample blocks for each TU of the CU, video encoder 200 may reconstruct pixel blocks of the CU.
Loop filtering unit 260 may perform a deblocking filtering operation to reduce blocking artifacts of pixel blocks associated with the CU.
In some embodiments, the loop filtering unit 260 includes a deblocking filtering unit for deblocking artifacts and a sample adaptive compensation/adaptive loop filtering (SAO/ALF) unit for removing ringing effects.
The decoded image buffer 270 may store reconstructed pixel blocks. Inter prediction unit 211 may use the reference image containing the reconstructed pixel block to perform inter prediction on PUs of other images. In addition, intra estimation unit 212 may use the reconstructed pixel blocks in decoded image buffer 270 to perform intra prediction on other PUs in the same image as the CU.
The entropy encoding unit 280 may receive the quantized transform coefficients from the transform/quantization unit 230. Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
Fig. 3 is a schematic block diagram of a video decoder provided by an embodiment of the present application.
As shown in fig. 3, the video decoder 300 includes: an entropy decoding unit 310, a prediction unit 320, an inverse quantization/transformation unit 330, a reconstruction unit 340, a loop filtering unit 350, and a decoded image buffer 360. It should be noted that the video decoder 300 may include more, fewer, or different functional components.
The video decoder 300 may receive the bitstream. The entropy decoding unit 310 may parse the bitstream to extract syntax elements from the bitstream. As part of parsing the bitstream, the entropy decoding unit 310 may parse entropy-encoded syntax elements in the bitstream. The prediction unit 320, the inverse quantization/transformation unit 330, the reconstruction unit 340, and the loop filtering unit 350 may decode video data according to syntax elements extracted from a bitstream, i.e., generate decoded video data.
In some embodiments, prediction unit 320 includes an intra prediction unit 322 and an inter prediction unit 321.
Intra prediction unit 322 may perform intra prediction to generate a prediction block for the PU. Intra-prediction unit 322 may use an intra-prediction mode to generate a prediction block for the PU based on pixel blocks of spatially-neighboring PUs. Intra-prediction unit 322 may also determine an intra-prediction mode for the PU based on one or more syntax elements parsed from the bitstream.
The inter prediction unit 321 may construct a first reference picture list (list 0) and a second reference picture list (list 1) according to syntax elements parsed from the bitstream. Furthermore, if the PU uses inter prediction encoding, entropy decoding unit 310 may parse the motion information of the PU. Inter prediction unit 321 may determine one or more reference blocks of the PU from the motion information of the PU. Inter prediction unit 321 may generate a prediction block of a PU from one or more reference blocks of the PU.
The inverse quantization/transform unit 330 may inverse quantize (i.e., dequantize) transform coefficients associated with the TUs. Inverse quantization/transform unit 330 may determine the degree of quantization using QP values associated with the CUs of the TUs.
After inverse quantizing the transform coefficients, inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients in order to generate a residual block associated with the TU.
Reconstruction unit 340 uses the residual blocks associated with the TUs of the CU and the prediction blocks of the PUs of the CU to reconstruct the pixel blocks of the CU. For example, the reconstruction unit 340 may add samples of the residual block to corresponding samples of the prediction block to reconstruct a pixel block of the CU, resulting in a reconstructed image block.
Loop filtering unit 350 may perform a deblocking filtering operation to reduce blocking artifacts of pixel blocks associated with the CU.
The video decoder 300 may store the reconstructed image of the CU in a decoded image buffer 360. The video decoder 300 may use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction or may transmit the reconstructed image to a display device for presentation.
The basic flow of video encoding and decoding is as follows: at the encoding end, a frame of image is divided into blocks, and for a current block, the prediction unit 210 generates a prediction block of the current block using intra prediction or inter prediction. The residual unit 220 may calculate a residual block, also referred to as residual information, based on the difference between the prediction block and the original block of the current block. The residual block is transformed and quantized by the transform/quantization unit 230, which removes information insensitive to the human eye so as to eliminate visual redundancy. Optionally, the residual block before transformation and quantization by the transform/quantization unit 230 may be referred to as a time domain residual block, and the time domain residual block after transformation and quantization by the transform/quantization unit 230 may be referred to as a frequency residual block or a frequency domain residual block. The entropy encoding unit 280 receives the quantized transform coefficients output from the transform/quantization unit 230, and may entropy encode the quantized transform coefficients to output a code stream. For example, the entropy encoding unit 280 may eliminate character redundancy according to the target context model and the probability information of the binary code stream.
At the decoding end, the entropy decoding unit 310 may parse the code stream to obtain the prediction information, quantized coefficient matrix, and the like of the current block, and the prediction unit 320 generates a prediction block of the current block using intra prediction or inter prediction based on the prediction information. The inverse quantization/transform unit 330 performs inverse quantization and inverse transform on the quantized coefficient matrix obtained from the code stream to obtain a residual block. The reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block. The reconstructed blocks constitute a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image, either image-based or block-based, to obtain a decoded image. The encoding end also needs to perform operations similar to those of the decoding end to obtain the decoded image. The decoded image may also be referred to as a reconstructed image, and may serve as a reference frame for inter prediction of subsequent frames.
The block division information determined by the encoding end, as well as mode information or parameter information such as prediction, transform, quantization, entropy coding, and loop filtering, is carried in the code stream when necessary. The decoding end parses the code stream and, based on the information already available, determines the same block division information as the encoding end, as well as the mode information or parameter information for prediction, transform, quantization, entropy coding, loop filtering, and so on, thereby ensuring that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
The foregoing is a basic flow of a video codec under a block-based hybrid coding framework, and as technology advances, some modules or steps of the framework or flow may be optimized.
In some embodiments, embodiments of the present invention may be applied to a variety of scenarios in which it is desirable to determine decoding configuration parameters, including, but not limited to, cloud technology (e.g., cloud gaming), artificial intelligence, intelligent transportation, assisted driving, and the like.
In some embodiments, the methods of embodiments of the present application may be applied to end-cloud collaborative coding. End-cloud collaborative coding refers to a scheme in which the cloud and the terminal collaboratively encode and compress video. Because the computing power of the video content producer (cloud) and that of the video content consumer (terminal) are different, a relatively complex video compression task can be completed collaboratively by the two ends, so that cloud resources and powerful computing power (such as encoding capability) can be utilized, the data volume of network transmission is reduced, and the computing power of the terminal (such as decoding capability) is also effectively utilized. The method can be used in scenarios such as cloud games.
In some embodiments, video coding is performed collaboratively, and the optimal coding configuration and coding strategy are selected according to the encoding and decoding capabilities of the intelligent terminal, in combination with the game type and the user network type.
The cloud end cooperative protocol refers to a unified protocol for data interaction between a cloud server and an intelligent terminal.
The intelligent terminal cooperative interface is an intelligent terminal software and hardware module interface, and can effectively interact with the intelligent terminal through the interface, configure video coding and rendering parameters and acquire real-time operation performance of hardware.
Decoding performance refers to the highest decoding frame rate and single frame decoding delay supported for a given video size under a particular decoding protocol. The video size is defined as follows: 360p, 576p, 720p, 1080p, 2K, 4K. The video frame rate is defined as follows: 30fps, 40fps, 50fps, 60fps, 90fps, 120fps.
The definition of the video resolution and video frame rate of the terminal device is shown in tables 1 and 2.
Table 1 definition of video resolution of terminal device
Video resolution Enumeration definition
360p 0x1
576p 0x2
720p 0x4
1080p 0x8
2k 0x10
4k 0x20
Table 2 definition of video frame rate of terminal equipment
Video frame rate Enumeration definition
30fps 0x1
40fps 0x2
50fps 0x4
60fps 0x8
90fps 0x10
120fps 0x20
Optionally, the decoding performance supported by the terminal device is given in the form of a triple: the first element is the enumeration definition of the video resolution, the second element is the enumeration definition of the video frame rate, and the third element is the single frame decoding delay at that video resolution and video frame rate. For example, for H264 decoding on device A, the single frame decoding delay at 720p@60fps is 10ms, which is denoted as (4, 8, 10).
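As an illustration (not part of the original text), the enumeration values of Tables 1 and 2 and the triple described above can be written as the following Python sketch; the class and function names are assumptions.

```python
from enum import IntFlag

class VideoResolution(IntFlag):
    R360P = 0x1
    R576P = 0x2
    R720P = 0x4
    R1080P = 0x8
    R2K = 0x10
    R4K = 0x20

class VideoFrameRate(IntFlag):
    F30 = 0x1
    F40 = 0x2
    F50 = 0x4
    F60 = 0x8
    F90 = 0x10
    F120 = 0x20

def decoding_capability(resolution, frame_rate, delay_ms):
    """Pack one supported decoding performance point into a triple."""
    return (int(resolution), int(frame_rate), delay_ms)

# Device A, H264 decoding: 720p@60fps with a 10 ms single frame decoding delay.
assert decoding_capability(VideoResolution.R720P, VideoFrameRate.F60, 10) == (4, 8, 10)
```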
The video coding collaborative optimization scheme is that a cloud server determines a coding function set to be started according to game types and network conditions, and then determines the optimal coding configuration of the current equipment through equipment types and coding capacities reported by an intelligent terminal.
In some embodiments, the terminal device decoding capability data structure requirements are shown in table 3.
Table 3 decoding capability data structure requirements for terminal devices
The cloud server determines the optimal encoding and decoding configuration of the current device, such as the decoding protocol, decoding resolution and video frame rate, as well as encoding and decoding strategies such as the number of video coding reference frames and whether SVC is enabled, according to the decoding capability of the intelligent terminal and in combination with the game type and the user network condition.
Current video applications fall into three general categories: video call class, video player class, screen sharing class.
The video call class requires low latency and has no special requirement for high resolution or high frame rate; the video resolution is typically 480p-720p and the video frame rate is 30fps. Video call applications are more concerned with the single frame decoding delay, and may compare the single frame decoding delays of different decoding chips (H264/H265/…) to find the configuration with the smallest delay. Some decoding chips exhibit frame hoarding during decoding, and this factor needs to be taken into account when selecting the optimal configuration.
The video player class requires high resolution, the video frame rate only needs to reach 30fps, there is no hard requirement on the single frame decoding delay, and frame hoarding of the decoding chip is not a concern as long as the stability of the decoding output frame rate is ensured. Such applications are more concerned with the highest resolution supported by the decoding chip (H264/H265/…).
The screen sharing class requires high resolution and low latency, and the decoding frame rate is typically required to be 15-30fps. Such applications consider the frame hoarding of the decoding chip in addition to the highest resolution supported by the decoding chip (H264/H265/…).
In summary, existing technical solutions mainly consider the influence of the decoding chip type (H264/H265/…) on the decoding configuration. However, for low-delay and high-frame-rate scenarios, decoding configuration parameters meeting the requirements cannot be obtained merely by changing the decoding chip type.
In order to solve the technical problems, in the embodiment of the present application, when determining the decoding configuration parameters, the decoding parameters and the rendering parameters of the decoding device are fully considered, and by adjusting the decoding parameters and the rendering parameters, the target decoding rendering configuration information satisfying the low-delay and high-resolution scenes is determined, and when the target decoding rendering configuration information is used for encoding and decoding, the quality of encoding and decoding can be improved.
The following describes the technical solutions of the embodiments of the present application in detail through some embodiments. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Fig. 4 is a flowchart of a method for determining decoding configuration parameters according to an embodiment of the present application, where the method may be applied to the terminal device 102 shown in fig. 1 or to the decoder shown in fig. 3, that is, the embodiment of the present application is applied to the decoding side.
As shown in fig. 4, the embodiment of the present application includes the following steps:
s401, determining initial decoding rendering configuration information of the terminal equipment.
The initial decoding rendering configuration information comprises initial decoding parameters and initial rendering parameters of the terminal equipment.
In order to meet the requirements of high resolution, high frame rate and low delay, the embodiment of the application selects the optimal decoding rendering configuration by detecting the decoding rendering performance of different decoding configurations on the terminal equipment so as to exert the limit capability of the current terminal equipment, thereby meeting the requirements of specific service scenes.
In some embodiments, the decoding parameters of the terminal device include at least any one of: the decoding chip type, video resolution and decoding input frame rate.
In some embodiments, the rendering parameters of the terminal device include at least any one of the following: rendering control type, rendering frame loss strategy and coding code stream parameters.
Wherein, decoding chip type: including H264, H265, VP9, AVS, etc.
Video resolution: including 720p, 1080p, 2K, 4K resolutions.
Decoding the input frame rate: including 50fps, 60fps, etc.
Rendering control type: the different systems are different, taking a Windows system as an example, the rendering control types include Direct3D (Direct 3D is abbreviated as D3D, which is a set of 3D drawing programming interfaces developed by Microsoft corporation on Microsoft Windows operating system) and OpenGL (open graphics library). In some embodiments, the rendering control type is also referred to as a rendering window type.
Rendering frame loss strategy: including rendering with frame loss and rendering without frame loss. The display is refreshed according to a vertical synchronization signal; for a 60Hz display device, the signal arrives once every 16ms. The frame-loss rendering strategy means that, when several video images are submitted between two vertical synchronization signals, only the most recently submitted video image is displayed (see the sketch after these parameter definitions). Enabling frame-loss rendering reduces the smoothness of the picture. At present, the rendering frequency of most displays is 60Hz; without frame loss in rendering, the decoding frame rate can support at most 60fps, but if there is decoding jitter or network jitter, the decoding rendering delay gradually increases due to delay accumulation. At this point, rendering frame loss needs to be turned on to keep the overall latency low. For a decoding frame rate of 50fps, however, there is no need to turn on the rendering frame loss strategy, since the upper rendering frequency of the device is not reached.
Coding code stream parameters: including video coding single-frame reference and video coding multi-frame reference. The specific coding semantics can affect the single frame decoding delay; for some chips, the number of coding reference frames affects the number of frames hoarded by the chip during decoding, thereby affecting the decoding delay. For such chips, video coding single-frame reference is required.
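The following is a minimal Python sketch of the frame-loss rendering strategy described above. The 16ms vertical synchronization interval corresponds to a 60Hz display; the function name and data layout are illustrative assumptions rather than part of this application.

```python
VSYNC_INTERVAL_MS = 16  # 60 Hz display

def select_frames_to_display(submissions):
    """submissions: list of (submit_time_ms, frame_id), sorted by time.
    Returns the frame displayed at each vertical sync tick when the
    frame-loss rendering strategy is enabled."""
    displayed = []
    pending = None
    next_vsync = VSYNC_INTERVAL_MS
    for t, frame_id in submissions:
        # Flush vsync ticks that passed before this submission.
        while t >= next_vsync:
            if pending is not None:
                displayed.append((next_vsync, pending))
                pending = None
            next_vsync += VSYNC_INTERVAL_MS
        pending = frame_id  # a later submission in the same interval overwrites (drops) the earlier one
    if pending is not None:
        displayed.append((next_vsync, pending))
    return displayed

# Two frames submitted within one 16 ms interval: only the second one is displayed.
assert select_frames_to_display([(5, "f0"), (10, "f1"), (20, "f2")]) == [(16, "f1"), (32, "f2")]
```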
As shown in fig. 5, the decoding rendering capability of the terminal device in the embodiment of the present application may be obtained by means of detection, where the detection may include static capability detection and dynamic capability detection. Static capability detection mainly performs decoding capability detection and rendering capability detection; decoding capability detection mainly includes detection of the decoding chip type and the decoding chip capability, and rendering capability detection mainly includes detection of the rendering control type and the frame loss strategy.
The static capability detection refers to detailed hardware parameter information that can be directly obtained from a hardware interface. Decoding parameters that can be obtained through static capability detection include the decoding chip types supported by the terminal device (e.g., H264/H265/VP9/AVS, etc.), the highest resolution supported by each decoding chip, the decoding capability of each decoding chip, and so on. The decoding capability of a decoding chip includes its Size, Profile, and Level, where Profile is a description of the video compression characteristics of the decoding chip (e.g., CABAC, number of color samples, etc.) and Level is a description of the characteristics of the decoding chip itself (e.g., code rate, resolution, frame rate, etc.). In brief, the higher the Profile, the more advanced the compression characteristics employed; the higher the Level, the higher the code rate, resolution, and frame rate of the video. The rendering parameters that can be obtained through static capability detection include the rendering control types supported by the terminal device (Direct3D, OpenGL, etc. on the Windows platform) and the rendering frame loss policies supported by each rendering control.
According to the embodiment of the application, the decoding parameters and the rendering parameters of the terminal equipment can be obtained in a static capacity detection mode, and after the decoding parameters and the rendering parameters of the terminal equipment are obtained, various decoding rendering configuration combinations of the terminal equipment can be obtained.
The dynamic capability detection refers to real-time incoming video code stream, and the terminal equipment performs decoding rendering under the current configuration to obtain data such as decoding output frame rate, single frame decoding delay and the like of the terminal equipment. The single frame decoding delay of most terminal devices varies with the decoding input frame rate, and in general, the higher the decoding input frame rate, the faster the decoding speed, and the lower the single frame decoding delay. This is because an increase in the decoding input frame rate will increase the device operating frequency, resulting in faster decoding. Therefore, different decoding input frame rates have different single frame decoding delays, and the decoding input frame rate needs to be detected as a configuration in the dynamic capability detection stage of the terminal equipment.
For example, on a certain Windows platform device, it is necessary to find an optimal decoding rendering configuration that satisfies high resolution, high frame rate, and low latency. The static capability of the device is as follows: the decoding chip types include H264 and H265, the decoding resolution supports 720p and 1080p, the rendering control types include Direct3D and OpenGL, and the frame loss strategy supports frame loss and no frame loss, giving 16 combinations. In the dynamic detection phase, the performance of the device at two decoding input frame rates (50fps and 60fps) needs to be detected, and the influence of video coding single-frame reference and multi-frame reference also needs to be detected, giving 64 combinations as a whole. In order to mine the limit decoding rendering capability of the chip, all of the decoding rendering configurations described above need to be tried, as enumerated in the sketch below.
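A minimal Python sketch of this configuration space, under the assumption that each parameter takes exactly the values listed above (the dictionary keys are illustrative, not a data structure defined by this application):

```python
from itertools import product

chip_types        = ["H264", "H265"]
resolutions       = ["720p", "1080p"]
render_controls   = ["Direct3D", "OpenGL"]
frame_loss        = ["drop", "no_drop"]
input_frame_rates = [50, 60]
reference_modes   = ["single_ref", "multi_ref"]

# Every decoding rendering configuration combination to be tried.
configs = [
    {
        "chip": chip, "resolution": res, "render": render,
        "frame_loss": loss, "input_fps": fps, "reference": ref,
    }
    for chip, res, render, loss, fps, ref in product(
        chip_types, resolutions, render_controls,
        frame_loss, input_frame_rates, reference_modes)
]
assert len(configs) == 64  # 16 static combinations x 2 input frame rates x 2 reference modes
```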
In some embodiments, the initial decoding rendering configuration information in the embodiments of the present application is any combination of the decoding parameters and rendering parameters of the terminal device, for example, any one of the above 64 combinations. In other words, in the embodiment of the present application, any one of the 64 combinations is used as the initial decoding rendering configuration information; whether the decoding output frame rate and the single frame decoding delay corresponding to the initial decoding rendering configuration information meet the corresponding thresholds is determined; if not, the initial decoding rendering configuration information is adjusted, and whether the decoding output frame rate and the single frame decoding delay corresponding to the adjusted decoding rendering configuration information meet the corresponding thresholds is determined. These steps are repeated until decoding rendering configuration information whose decoding output frame rate and single frame decoding delay both meet the corresponding thresholds is obtained; this decoding rendering configuration information is stored as the target decoding rendering configuration information of the terminal device, and is loaded directly for decoding the next time the device is started.
In some embodiments, in order to increase the determination speed of the decoding rendering configuration information, the initial decoding rendering configuration information in the embodiments of the present application is target decoding rendering configuration information with highest occurrence probability among preset target decoding rendering configuration information of N terminal devices, where N is a positive integer greater than 1.
That is, the initial decoding rendering configuration information in the embodiment of the present application is obtained by performing a complete test on a certain number of terminal devices in advance, obtaining target decoding rendering configuration information (i.e., optimal decoding rendering configuration information) of the devices, and selecting, as the initial decoding rendering configuration information, the target decoding rendering configuration information with the highest occurrence frequency. For example, to select the initial decoding rendering configuration information of the Windows platform, it is necessary to perform an integrity test on a certain range of Windows platforms in advance, and select the optimal decoding rendering configuration information with the highest occurrence frequency as the initial decoding rendering configuration information of the Windows platform.
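As an illustrative sketch (the data, helper function, and configuration strings below are assumptions), selecting the initial configuration as the most frequent optimal configuration among pre-tested devices of the same platform could look as follows:

```python
from collections import Counter

def pick_initial_config(optimal_configs_per_device):
    """optimal_configs_per_device: list of hashable config descriptions,
    one per pre-tested device. Returns the most frequent one."""
    counts = Counter(optimal_configs_per_device)
    initial_config, _freq = counts.most_common(1)[0]
    return initial_config

# Hypothetical pre-test results for three Windows devices.
configs = [
    "H264+1080p+60fps+D3D9+no_drop+multi_ref",
    "H264+1080p+60fps+D3D9+no_drop+multi_ref",
    "H265+1080p+60fps+OpenGL+drop+single_ref",
]
assert pick_initial_config(configs) == "H264+1080p+60fps+D3D9+no_drop+multi_ref"
```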
Illustratively, the initial decoding rendering configuration information for the Windows platform may be: H264 + 1080p + 60fps + D3D9 + no frame loss + video coding multi-frame reference, that is, the decoding chip type is H264, the video resolution is 1080p, the decoding input frame rate is 60fps, the rendering control type is D3D9, the frame loss strategy is no frame loss, and the coding code stream parameter is video coding multi-frame reference.
According to the above method, the initial decoding rendering configuration information of the terminal device can be determined. For convenience of description, the decoding parameters included in the initial decoding rendering configuration information are denoted as the initial decoding parameters, and the rendering parameters included in the initial decoding rendering configuration information are denoted as the initial rendering parameters. Assume that the initial decoding rendering configuration information is: H264 + 1080p + 60fps + D3D9 + no frame loss + video coding multi-frame reference; then the initial decoding parameters are: H264 + 1080p + 60fps, and the initial rendering parameters are: D3D9 + no frame loss + video coding multi-frame reference.
S402, decoding the test code stream under the initial decoding configuration information to obtain an initial decoding output frame rate and initial single frame decoding delay corresponding to the initial decoding rendering configuration information.
After the initial decoding rendering configuration information of the terminal equipment is determined, judging whether the initial decoding rendering configuration information meets the requirement. Specifically, a section of test code stream is input to the terminal equipment, and the terminal equipment decodes the test code stream under the initial decoding configuration information to obtain a decoding output frame rate and a single frame decoding delay corresponding to the initial decoding rendering configuration information.
In the embodiment of the present application, for convenience of description, the decoding output frame rate corresponding to the initial decoding rendering configuration information is denoted as the initial decoding output frame rate, and the single-frame decoding delay corresponding to the initial decoding rendering configuration information is denoted as the initial single-frame decoding delay.
S403, if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold, adjusting at least one of the initial decoding parameters and the initial rendering parameters to obtain target decoding rendering configuration information of the terminal equipment.
The decoding output frame rate and the single frame decoding delay corresponding to the target decoding rendering configuration meet the corresponding threshold values.
According to the steps, after the initial decoding output frame rate and the initial single frame decoding delay corresponding to the initial decoding rendering configuration information are determined, comparing the initial decoding output frame rate with the corresponding threshold, judging whether the initial decoding output frame rate meets the corresponding threshold, comparing the initial single frame decoding delay with the corresponding threshold, and judging whether the initial single frame decoding delay meets the corresponding threshold.
In some embodiments, the initial decoding rendering configuration information is the target decoding rendering configuration information with the highest occurrence probability among the preset target decoding rendering configuration information of N terminal devices. In that case, the initial decoding rendering configuration information is summarized from a certain number of devices in advance, has a certain universality, and is already the optimal configuration for most devices. For the remaining devices, the optimal configuration can be found by making only partial adjustments to the initial decoding rendering configuration information. Whether configuration adjustment is needed is determined according to the decoding output frame rate and single frame decoding delay obtained by each round of dynamic detection.
And if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold value, adjusting at least one of the initial decoding parameters and the initial rendering parameters to obtain target decoding rendering configuration information meeting the threshold value.
Example 1: if the initial decoding output frame rate does not meet the corresponding threshold, the initial decoding parameters are adjusted, the test code stream is decoded using the adjusted decoding parameters, and the decoding output frame rate corresponding to the adjusted decoding parameters is determined. If that decoding output frame rate meets the corresponding threshold, the adjusted decoding parameters are fixed; otherwise, the decoding parameters are adjusted again and the above steps are repeated until decoding parameters meeting the threshold requirement are obtained.
Example 2: if the initial single frame decoding delay does not meet the corresponding threshold, the initial rendering parameters are adjusted, the test code stream is decoded using the adjusted rendering parameters, and the single frame decoding delay corresponding to the adjusted rendering parameters is determined. If that single frame decoding delay meets the corresponding threshold, the adjusted rendering parameters are fixed; otherwise, the rendering parameters are adjusted again and the above steps are repeated until rendering parameters meeting the threshold requirement are obtained.
Example 3: if neither the initial decoding output frame rate nor the initial single frame decoding delay meets the corresponding threshold, the initial decoding parameters and the initial rendering parameters are both adjusted, the test code stream is decoded using the adjusted decoding and rendering parameters, and the corresponding decoding output frame rate and single frame decoding delay are determined. If both meet the corresponding thresholds, the adjusted decoding parameters and rendering parameters are fixed; otherwise, the decoding parameters and rendering parameters are adjusted again and the above steps are repeated until decoding parameters and rendering parameters meeting the thresholds are obtained.
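The three examples above share the same probe-and-adjust structure. The following sketch illustrates that loop; `decode_test_stream`, `adjust_decoding` and `adjust_rendering` are hypothetical stand-ins for the device's actual measurement and adjustment steps, and the `max_probes` bound of 10 echoes the worst-case probe count discussed later in this application.

```python
def find_target_config(config, fps_threshold, delay_threshold,
                       decode_test_stream, adjust_decoding, adjust_rendering,
                       max_probes=10):
    """Probe the test code stream and adjust parameters until both thresholds are met."""
    for _ in range(max_probes):
        out_fps, frame_delay = decode_test_stream(config)   # one detection round
        fps_ok = out_fps >= fps_threshold
        delay_ok = frame_delay <= delay_threshold
        if fps_ok and delay_ok:
            return config                  # both thresholds met: this is the target config
        if not fps_ok:
            config = adjust_decoding(config)    # Examples 1 and 3
        if not delay_ok:
            config = adjust_rendering(config)   # Examples 2 and 3
    return config                          # best effort after max_probes rounds
```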
In some embodiments, the decoding parameters in the embodiments of the present application relate to the decoding output frame rate, so the decoding output frame rate can be adjusted by adjusting the decoding parameters; the rendering parameters relate to the single-frame decoding delay, so the single-frame decoding delay can be adjusted by adjusting the rendering parameters.
For convenience of description, the threshold corresponding to the decoding output frame rate is denoted as a decoding output frame rate threshold, and the threshold corresponding to the single-frame decoding delay is denoted as a single-frame decoding delay threshold.
In some embodiments, the above S403 includes the following cases:
Case 1: if the initial decoding output frame rate does not meet the decoding output frame rate threshold and the initial single-frame decoding delay meets the single-frame decoding delay threshold, the initial decoding parameters are adjusted to obtain target decoding parameters meeting the decoding output frame rate threshold, and the initial rendering parameters are determined as the target rendering parameters.
In the case 1, if the initial single frame decoding delay meets the single frame decoding delay threshold, it is indicated that the rendering parameters in the initial decoding rendering configuration information meet the requirements, the rendering parameters are not adjusted, and the initial rendering parameters are directly determined as target rendering parameters.
The initial decoding output frame rate does not meet the decoding output frame rate threshold, which indicates that the initial decoding parameters in the initial decoding rendering configuration information do not meet the requirements, and the initial decoding parameters are adjusted to obtain target decoding parameters meeting the decoding output frame rate threshold. The process of adjusting the initial decoding parameters to obtain the decoding parameters satisfying the decoding output frame rate threshold may refer to the description of example 1 above, and will not be described herein.
The specific value of the decoding output frame rate threshold is not limited, and is, for example, a preset value.
In some embodiments, the above-mentioned decoding output frame rate threshold is a decoding input frame rate, and determining whether the initial decoding output frame rate satisfies the decoding output frame rate threshold includes the following steps:
step 11, if the initial decoding output frame rate is smaller than the decoding input frame rate, determining that the initial decoding output frame rate does not meet the decoding output frame rate threshold;
and step 12, if the initial decoding output frame rate is equal to the decoding input frame rate, determining that the initial decoding output frame rate meets the decoding output frame rate threshold.
It should be noted that, in the above step 12, the initial decoded output frame rate is equal to the decoded input frame rate, which may be understood as that the initial decoded output frame rate is approximately equal to the decoded input frame rate.
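A minimal sketch of this check; the tolerance used to treat "equal" as "approximately equal" is an assumption, not a value given in this application.

```python
def output_frame_rate_ok(output_fps: float, input_fps: float, tolerance: float = 1.0) -> bool:
    # Satisfied when the measured decoding output frame rate is approximately equal to the
    # decoding input frame rate; a smaller output frame rate means the threshold is not met.
    return output_fps >= input_fps - tolerance
```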
In this embodiment of the present application, the initial decoding parameter includes at least one of a decoding chip type, a decoding input frame rate and a video resolution of the terminal device, and in this case, in case 1, adjusting the initial decoding parameter to obtain the target decoding parameter that meets the decoding output frame rate threshold includes the following steps:
Step A: adjusting at least one of the decoding chip type, the decoding input frame rate and the video resolution to obtain target decoding parameters meeting the decoding output frame rate threshold.
In some embodiments, any one or more of the decoding chip type, the decoding input frame rate and the video resolution included in the initial decoding parameters may be adjusted to obtain target decoding parameters satisfying the decoding output frame rate threshold. For example, the decoding input frame rate and/or the video resolution are reduced, it is determined whether the decoding output frame rate corresponding to the adjusted parameters meets the decoding output frame rate threshold, and if not, the decoding chip type is adjusted. For another example, the decoding chip type and/or the decoding input frame rate are adjusted first, and then the video resolution is adjusted. That is, in this embodiment, the adjustment order and combination of the decoding chip type, the decoding input frame rate and the video resolution are not limited, and are determined according to actual needs.
In some embodiments, step A may be performed according to the adjustment order shown in the following step A1:
Step A1: adjusting at least one of the decoding chip type, the decoding input frame rate and the video resolution according to an adjustment mode of first adjusting the decoding chip type and then adjusting the decoding input frame rate and the video resolution, to obtain the target decoding parameters.
Because different decoding chip types have an obvious influence on the decoding output frame rate, in order to quickly determine the target decoding parameters, when the initial decoding output frame rate does not meet the decoding output frame rate threshold, the decoding chip type (such as H264/H265) is replaced first, and it is determined whether the decoding output frame rate corresponding to the adjusted decoding chip meets the decoding output frame rate threshold. In a normal case, replacing the decoding chip type is enough to satisfy the decoding output frame rate threshold. If the decoding output frame rate corresponding to the adjusted decoding chip still does not meet the decoding output frame rate threshold, the decoding chip with the highest decoding output frame rate is selected as the target decoding chip type for subsequent detection, and on that basis the frame rate and the resolution are reduced in an attempt to find a decoding configuration that reaches the standard.
In an adjustment scheme of decoding input frame rate and video resolution, firstly, the decoding input frame rate is reduced, whether the decoding output frame rate corresponding to the adjusted decoding input frame rate meets a decoding output frame rate threshold value is judged, and if not, the video resolution is reduced.
In some embodiments, in the embodiments of the present application, the decoded input frame rate and the video resolution may be adjusted multiple times, specifically, if the decoded output frame rate corresponding to the decoded input frame rate and the video resolution after being adjusted down still does not meet the decoded output frame rate threshold, the decoded input frame rate and the video resolution are adjusted down again until the decoded input frame rate and the video resolution meeting the decoded output frame rate threshold are obtained.
For example, the decoded input frame rate is first reduced, and it is determined whether the decoded output frame rate corresponding to the reduced decoded input frame rate satisfies the decoded output frame rate threshold. If not, the video resolution is reduced, whether the decoded output frame rate corresponding to the reduced video resolution meets the decoded output frame rate threshold is judged, if not, the decoded input frame rate and the video resolution are reduced continuously according to the adjustment sequence of the video resolution after the decoded input frame rate is firstly reduced, and the steps are repeated until the decoded input frame rate and the video resolution meeting the decoded output frame rate threshold are obtained.
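Putting step A1 together, a possible sketch looks as follows. The helper `measure_output_fps` and the lowered candidate values (50 fps, 720p, taken from the Windows example in this application) are assumptions; a real implementation would iterate over whatever frame rates and resolutions the device supports.

```python
from dataclasses import replace

def adjust_decoding_params(params, measure_output_fps, chip_types=("H264", "H265")):
    # 1) replace the decoding chip type first and keep the best-performing one
    best_chip = max(chip_types, key=lambda c: measure_output_fps(replace(params, chip_type=c)))
    params = replace(params, chip_type=best_chip)
    if measure_output_fps(params) >= params.input_frame_rate:
        return params                      # threshold (the decoding input frame rate) reached
    # 2) otherwise lower the decoding input frame rate first, then also the video resolution
    candidates = [
        replace(params, input_frame_rate=50),
        replace(params, input_frame_rate=50, resolution="720p"),
    ]
    for candidate in candidates:
        if measure_output_fps(candidate) >= candidate.input_frame_rate:
            return candidate
    return params                          # no candidate reached the threshold
```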
In some embodiments, the initial decoding rendering configuration information may define only a portion of the parameters, with the remaining parameters determined by dynamic detection. For example, on some models the decoding capability of H265 is stronger than that of H264, but both H264 and H265 reach the standard for decoding output frame rate and single frame decoding delay under the default configuration. If the initial decoding rendering configuration information already includes a decoding chip type, only that default chip type would ever be selected. For this case, the decoding chip type may be set to an unknown type; the optimal decoding chip type is then detected on the device first to determine the default configuration, and the subsequent detection operations are performed afterwards.
That is, if the initial decoding rendering configuration information does not include the decoding chip type, the embodiment of the present application may determine the optimal decoding chip type first, and add the determined decoding chip type to the initial decoding rendering configuration information to obtain new initial decoding rendering configuration information. The new initial decoding rendering configuration information includes a decoding chip type with optimal decoding performance, and then, the method of the embodiment of the present application is executed using the new determined initial decoding rendering configuration information.
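A minimal sketch of this fill-in step (hypothetical helper names): each available decoding chip is measured under the initial configuration and the best-performing type is written back into the configuration before the normal detection flow runs.

```python
from dataclasses import replace

def fill_in_chip_type(config, available_chips, measure_output_fps):
    # Measure the decoding output frame rate with each available decoding chip under the
    # initial configuration and add the best-performing type to the configuration.
    best_chip = max(available_chips, key=lambda chip: measure_output_fps(config, chip))
    return replace(config, decoding=replace(config.decoding, chip_type=best_chip))
```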
Case 2: if the initial decoding output frame rate meets the decoding output frame rate threshold and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, the initial rendering parameters are adjusted to obtain target rendering parameters meeting the single-frame decoding delay threshold, and the initial decoding parameters are determined as the target decoding parameters.
In case 2, if the initial decoding output frame rate satisfies the decoding output frame rate threshold, it is indicated that the decoding parameters in the initial decoding rendering configuration information satisfy the requirements, and the decoding parameters are not adjusted, and the initial decoding parameters are directly determined as target decoding parameters.
If the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, it indicates that the initial rendering parameters in the initial decoding rendering configuration information do not meet the requirements, and the initial rendering parameters are adjusted to obtain target rendering parameters meeting the single-frame decoding delay threshold. The process of adjusting the initial rendering parameters to obtain rendering parameters satisfying the single frame decoding delay threshold may refer to the description of example 2 above and is not repeated here.
The specific value of the single frame decoding delay threshold is not limited, for example, a preset value.
In some embodiments, the single frame decoding delay threshold in the embodiments of the present application is the product of the sum of the number of hoarded frames and 1 and the reciprocal of the decoding input frame rate, that is, (1 + number of hoarded frames)/decoding input frame rate. On the premise that the decoding output frame rate meets the requirement, for a device that does not hoard frames during decoding, the single-frame decoding delay should be less than 1/decoding input frame rate. For a device that hoards frames during decoding, the decoding delay needs to take the number of hoarded frames into account. For example, for a device that hoards 3 frames, the decoded image of frame n is output from the decoder only after the video stream of frame n+4 has been input, so the single frame decoding delay additionally includes the hoarding time of the 3 hoarded frames, i.e., 3 × (1/decoding input frame rate). In summary, the criterion for the single frame decoding delay to reach the standard in the embodiment of the present application is that it is less than (1 + number of hoarded frames)/decoding input frame rate.
At this time, determining whether the initial single frame decoding delay meets the single frame decoding delay threshold includes the steps of:
step 21, if the initial single-frame decoding delay is less than or equal to the above threshold value, determining that the initial single-frame decoding delay meets the single-frame decoding delay threshold;
step 22, if the initial single-frame decoding delay is greater than the above threshold value, determining that the initial single-frame decoding delay does not meet the single-frame decoding delay threshold.
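A small sketch of this criterion, reading the threshold from the worked example above as (1 + number of hoarded frames)/decoding input frame rate; the function and variable names are assumptions.

```python
def single_frame_delay_ok(delay_s: float, input_fps: float, hoarded_frames: int = 0) -> bool:
    threshold = (1 + hoarded_frames) / input_fps   # seconds per frame
    return delay_s <= threshold

# Worked example from the text: a device that hoards 3 frames at a 60 fps decoding input
# frame rate is allowed up to (1 + 3) / 60 ≈ 0.067 s of single-frame decoding delay.
assert single_frame_delay_ok(0.05, 60, hoarded_frames=3)
```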
In some embodiments, if the initial rendering parameter includes at least one of a rendering control type and a rendering frame loss policy, in the case 2, adjusting the initial rendering parameter to obtain the target rendering parameter that meets the single frame decoding delay threshold includes the following step B:
Step B: adjusting at least one of the rendering control type and the rendering frame loss strategy to obtain target rendering parameters meeting the single-frame decoding delay threshold.
In the embodiment of the present application, if the decoding output frame rate meets the standard (i.e., the decoding output frame rate is approximately equal to the decoding output frame rate threshold) but the single-frame decoding delay does not (i.e., the single-frame decoding delay is greater than the single-frame decoding delay threshold), the reasons may be that: first, the rendering control has insufficient performance at the current rendering frame rate, in which case the rendering control needs to be switched; second, the decoding output frame rate has reached the upper limit of the device rendering frame rate, in which case the rendering frame loss policy needs to be enabled.
In some embodiments, the order and manner in which at least one of the rendering control type and the rendering frame loss policy is adjusted are not limited: for example, the rendering control type may be adjusted first and the rendering frame loss policy second, or the rendering frame loss policy first and the rendering control type second, or the rendering frame loss policy and the rendering control type may be adjusted simultaneously.
In some embodiments, the step B includes the following step B1:
Step B1: adjusting at least one of the rendering control type and the rendering frame loss strategy according to an adjustment mode of first adjusting the rendering control type and then adjusting the rendering frame loss strategy, to obtain target rendering parameters meeting the single-frame decoding delay threshold.
In step B1, if the initial single frame decoding delay is greater than the single-frame decoding delay threshold, it indicates that the initial rendering parameters do not meet the requirement. The rendering control type is adjusted first, and the single frame decoding delay corresponding to the adjusted rendering control type is determined; if that single frame decoding delay is less than or equal to the single-frame decoding delay threshold, the rendering control type and the rendering frame loss strategy at that moment are determined as the target rendering parameters. If that single frame decoding delay is still greater than the single-frame decoding delay threshold, the rendering frame loss strategy is adjusted, for example, no frame loss is modified to frame loss, and the single frame decoding delay corresponding to the adjusted rendering frame loss strategy is determined; if it is less than or equal to the single-frame decoding delay threshold, the rendering control type and the rendering frame loss strategy at that moment are determined as the target rendering parameters. If the single frame decoding delay is still greater than the single-frame decoding delay threshold, the above steps are repeated until target rendering parameters meeting the requirement are obtained.
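A possible sketch of step B1 together with the fallback step C described below (hypothetical helper names); it only illustrates the order rendering control type → rendering frame loss strategy → single-frame reference, and a real implementation may iterate over more candidates.

```python
from dataclasses import replace

def adjust_rendering_params(params, measure_frame_delay, delay_threshold,
                            control_types=("D3D9", "OpenGL")):
    # 1) switch the rendering control type first
    for control in control_types:
        candidate = replace(params, control_type=control)
        if measure_frame_delay(candidate) <= delay_threshold:
            return candidate
    # 2) then enable the rendering frame loss strategy
    candidate = replace(params, drop_frames=True)
    if measure_frame_delay(candidate) <= delay_threshold:
        return candidate
    # 3) step C: fall back to single-frame reference encoding to cut the delay further
    return replace(candidate, multi_frame_reference=False)
```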
In some embodiments, if the target rendering parameters satisfying the single frame decoding delay threshold are not obtained after the rendering frame loss policy and the rendering control type are adjusted according to the method, the method of the embodiment of the present application further includes:
Step C: adjusting the video coding multi-frame reference in the initial decoding rendering configuration information to a video coding single-frame reference.
As can be seen from the above description, the encoding parameter in the initial decoding rendering configuration information is a video coding multi-frame reference. Multi-frame reference requires more reference frames to be stored, which increases the single-frame decoding delay; therefore, in order to reduce the single-frame decoding delay, the video coding multi-frame reference in the initial decoding rendering configuration information can be adjusted to a video coding single-frame reference.
Case 3: if the initial decoding output frame rate does not meet the decoding output frame rate threshold and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, the initial decoding parameters are adjusted to obtain target decoding parameters meeting the decoding output frame rate threshold, and then, on the basis of the target decoding parameters, the initial rendering parameters are adjusted to obtain target rendering parameters meeting the single-frame decoding delay threshold.
In this case 3, the initial decoding output frame rate does not satisfy the decoding output frame rate threshold, and the initial single frame decoding delay does not satisfy the single frame decoding delay threshold, which indicates that both the initial decoding parameter and the initial rendering parameter in the initial decoding rendering configuration information do not conform to the requirements, and adjustment of both the initial decoding parameter and the initial rendering parameter is required. In this embodiment of the present application, initial decoding parameters are adjusted first to obtain target decoding parameters that meet the threshold of the decoding output frame rate, and the description of the foregoing case 1 is specifically referred to. Based on the target decoding parameters, the initial rendering parameters are adjusted to obtain target rendering parameters meeting the single-frame decoding delay threshold, and the related description of the condition 2 is specifically referred to.
In each of the above three cases, target decoding parameters and target rendering parameters are obtained, and the target decoding parameters and the target rendering parameters form the target decoding rendering configuration information of the terminal device.
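A compact sketch of how the three cases select which parameters to adjust (hypothetical helper names); the adjustment callables are assumed to be pre-configured wrappers of routines such as the decoding and rendering adjustment sketches above.

```python
from dataclasses import replace

def determine_target_config(config, out_fps, frame_delay, fps_threshold, delay_threshold,
                            adjust_decoding_params, adjust_rendering_params):
    fps_ok = out_fps >= fps_threshold
    delay_ok = frame_delay <= delay_threshold
    if fps_ok and delay_ok:
        return config                                  # already the target configuration
    if not fps_ok:                                     # cases 1 and 3: adjust decoding first
        config = replace(config, decoding=adjust_decoding_params(config.decoding))
    if not delay_ok:                                   # cases 2 and 3: then adjust rendering
        config = replace(config, rendering=adjust_rendering_params(config.rendering))
    return config
```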
According to the above method, the terminal device obtains the target decoding rendering configuration information, which can be understood as the optimal decoding rendering configuration parameters of the terminal device, and the target decoding rendering configuration information is stored in the terminal device.
In the embodiment of the present application, when the terminal device is started for the first time, the optimal decoding rendering configuration is detected based on the initial decoding rendering configuration information, so as to obtain the target decoding rendering configuration information of the terminal device, and the target decoding rendering configuration information matching the terminal device is stored. On subsequent normal startups, the locally stored target decoding rendering configuration information is used directly for decoding.
The target decoding rendering configuration information obtained by the embodiment of the application meets the requirements of high resolution, high frame rate and low time delay, and can be understood as the optimal configuration of the terminal equipment.
It should be noted that, in the embodiment of the present application, during one parameter adjustment process, other parameters are kept unchanged except for the adjusted parameters.
In some embodiments, after determining the target decoding rendering configuration information according to the above method, the terminal device sends the target decoding rendering configuration information to the cloud server, so that the cloud server encodes according to the target decoding rendering configuration information to obtain a code stream according with the decoding capability of the terminal device.
According to the method for determining the decoding configuration parameters, the terminal equipment determines initial decoding rendering configuration information of the terminal equipment, wherein the initial decoding rendering configuration information comprises initial decoding parameters and initial rendering parameters of the terminal equipment; decoding the test code stream under the initial decoding configuration information to obtain an initial decoding output frame rate and initial single frame decoding delay corresponding to the initial decoding rendering configuration information; and if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold value, adjusting at least one of the initial decoding parameter and the initial rendering parameter to obtain target decoding rendering configuration information of the terminal equipment, wherein the decoding output frame rate and the single frame decoding delay corresponding to the target decoding rendering configuration meet the corresponding threshold value. In other words, in the embodiment of the present application, when determining the target decoding rendering configuration, the decoding parameters and rendering parameters of the terminal device are fully considered, and the target decoding rendering configuration information satisfying the low-latency and high-resolution scenes is determined by adjusting the decoding parameters and rendering parameters, so that the quality of encoding and decoding can be improved when encoding and decoding are performed using the target decoding rendering configuration information. In addition, in some embodiments, the initial decoding rendering configuration information is optimal decoding rendering configuration information of a plurality of devices, and the target decoding rendering configuration information of the terminal device can be found out from a complex configuration combination by performing limited detection on the basis of the initial decoding rendering configuration information, so that efficiency is high.
Fig. 6 is a flowchart of a method for determining decoding configuration parameters according to a specific embodiment of the present application, and the method includes:
S601, determining initial decoding rendering configuration information of the terminal equipment.
The initial decoding rendering configuration information comprises initial decoding parameters and initial rendering parameters of the terminal equipment.
Optionally, the initial decoding rendering configuration information is target decoding rendering configuration information with highest occurrence probability in target decoding rendering configuration information of preset N terminal devices, where N is a positive integer greater than 1.
S602, determining an initial decoding output frame rate and an initial single frame decoding delay corresponding to the initial decoding rendering configuration information.
With specific reference to the description of S402, a detailed description is omitted here.
S603, judging whether the initial decoding output frame rate meets a corresponding threshold value.
If the initial decoding output frame rate satisfies the corresponding threshold, S611 is performed.
If the initial decoding output frame rate does not satisfy the corresponding threshold, the following S604 is performed.
S604, replacing the decoding chip type, and re-determining the decoding output frame rate.
S605, selecting the optimal decoding chip according to the decoding output frame rate.
In the embodiment of the present application, if the terminal device has a plurality of decoding chips, determining a decoding output frame rate corresponding to each decoding chip, and selecting one decoding chip with the optimal decoding output frame rate as the optimal decoding chip. Where the decoded output frame rate optimum is understood to be the smallest difference from the decoded output frame rate threshold.
S606, judging whether the decoding output frame rate corresponding to the optimal decoding chip meets a corresponding threshold value.
If satisfied, S611 is performed; if not satisfied, S607 is performed.
S607, the decoding input frame rate is reduced, and the decoding output frame rate is redetermined.
S608, judging whether the decoding output frame rate meets a corresponding threshold value.
If satisfied, S611 is performed; if not satisfied, S609 is performed.
S609, the decoding resolution is lowered, and the decoding output frame rate is redetermined.
S610, judging whether the decoding output frame rate meets a corresponding threshold value.
If satisfied, S611 is performed; if not satisfied, the process returns to S607.
S611, judging whether the initial single frame decoding delay meets a corresponding threshold value.
If satisfied, S616 is performed; if not satisfied, S612 is performed.
S612, the decoding parameters are kept unchanged, the rendering control type is adjusted, and the single-frame decoding delay is redetermined.
S613, judging whether the single frame decoding delay meets a corresponding threshold value.
If satisfied, S616 is performed; if not satisfied, S614 is performed.
S614, adjusting the rendering frame loss strategy and re-determining the single frame decoding delay.
S615, judging whether the single frame decoding delay meets a corresponding threshold value.
If satisfied, S616 is performed; if not satisfied, S617 is performed.
S616, target decoding rendering configuration information of the terminal equipment is obtained.
S617, adjusting the video coding multi-frame reference in the initial decoding rendering configuration information into a video coding single-frame reference.
In some embodiments, for frame-hoarding type devices, the above step S617 is also performed, so as to obtain the optimal encoding parameters.
In the embodiment of the application, the optimal decoding rendering configuration detection flow is as shown in fig. 6: first, the initial decoding rendering configuration information is detected to obtain the initial decoding output frame rate and the initial single frame decoding delay. If both reach the standard, the initial decoding rendering configuration information is the optimal configuration, and the optimal encoding parameters are then determined on top of this decoding configuration. If the initial decoding output frame rate reaches the standard but the single frame decoding delay does not, the rendering parameters need to be modified. If the initial decoding output frame rate does not reach the standard, the decoding chip type is modified, the decoding input frame rate is reduced, and the decoding resolution is reduced until a decoding configuration whose decoding output frame rate meets the condition is found.
Taking the above Windows platform as an example, the configurations to be detected are as follows: two decoding chip types (H264/H265), two decoding resolutions (1080p/720p), two decoding frame rates (60fps/50fps), two rendering controls (Direct3D and OpenGL), two rendering frame loss strategies (frame loss/no frame loss), and two encoding parameters (single-frame/multi-frame reference), giving 64 combinations in total.
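Since each of the six options is binary, the 64 combinations are simply a Cartesian product (2 ** 6 = 64); a quick check of this count:

```python
from itertools import product

combinations = list(product(("H264", "H265"),        # decoding chip type
                            ("1080p", "720p"),       # decoding resolution
                            (60, 50),                # decoding frame rate (fps)
                            ("Direct3D", "OpenGL"),  # rendering control
                            (True, False),           # rendering frame loss strategy
                            ("single", "multi")))    # encoding reference mode
assert len(combinations) == 64                       # 2 ** 6 = 64 configurations
```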
In order to reduce the number of detections, the initial decoding rendering configuration information is the target decoding rendering configuration information with the highest occurrence probability among the target decoding rendering configuration information of the preset N terminal devices. After the first detection, the result on most devices is that the initial decoding output frame rate reaches the standard, the initial single frame decoding delay reaches the standard, and there is no frame hoarding during decoding; the initial decoding rendering configuration information is then the optimal configuration, and only one detection is needed. For frame-hoarding type devices, one additional detection of the single-frame reference code stream is needed, that is, the video coding multi-frame reference in the initial decoding rendering configuration information is adjusted to a video coding single-frame reference, for a total of two detections. When the initial decoding output frame rate reaches the standard but the initial single frame decoding delay does not (i.e., the decoding capability is sufficient but the rendering performance is not), only 3 more detections are needed to select the optimal rendering configuration. Even in the worst case, at most 10 detections are required to find the optimal configuration, well below the 64 detections of complete probing.
It should be understood that fig. 4-6 are only examples of the present application and should not be construed as limiting the present application.
The preferred embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solutions of the present application within the scope of the technical concept of the present application, and all the simple modifications belong to the protection scope of the present application. For example, the specific features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described in detail. As another example, any combination of the various embodiments of the present application may be made without departing from the spirit of the present application, which should also be considered as disclosed herein.
Method embodiments of the present application are described in detail above in connection with fig. 4-6, and apparatus embodiments of the present application are described in detail below.
Fig. 7 is a schematic structural diagram of a decoding configuration parameter determining apparatus according to an embodiment of the present application, where the apparatus is applied to a terminal device, and the apparatus 10 includes:
a determining unit 11, configured to determine initial decoding rendering configuration information of a terminal device, where the initial decoding rendering configuration information includes initial decoding parameters and initial rendering parameters of the terminal device;
the detecting unit 12 is configured to decode the test code stream under the initial decoding configuration information, so as to obtain an initial decoding output frame rate and an initial single frame decoding delay corresponding to the initial decoding rendering configuration information;
and the adjusting unit 13 is configured to adjust at least one of the initial decoding parameter and the initial rendering parameter to obtain target decoding rendering configuration information of the terminal device if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold, where the decoding output frame rate and the single frame decoding delay corresponding to the target decoding rendering configuration meet the corresponding threshold.
In some embodiments, the initial decoding rendering configuration information is target decoding rendering configuration information with highest occurrence probability in target decoding rendering configuration information of preset N terminal devices, where N is a positive integer greater than 1.
In some embodiments, the threshold includes a decoding output frame rate threshold and a single frame decoding delay threshold, and the adjusting unit 13 is specifically configured to adjust the initial decoding parameter if the initial decoding output frame rate does not meet the decoding output frame rate threshold and the initial single frame decoding delay meets the single frame decoding delay threshold, to obtain a target decoding parameter that meets the decoding output frame rate threshold, and to determine the rendering parameter as a target rendering parameter;
if the initial decoding output frame rate meets the decoding output frame rate threshold and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, adjusting the initial rendering parameters to obtain target rendering parameters meeting the single-frame decoding delay threshold, and determining the initial decoding parameters as target decoding parameters;
if the initial decoding output frame rate does not meet the decoding output frame rate threshold and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, adjusting the initial decoding parameters to obtain target decoding parameters meeting the decoding output frame rate threshold, and adjusting the initial rendering parameters on the basis of the target decoding parameters to obtain target rendering parameters meeting the single-frame decoding delay threshold;
Wherein the target decoding parameters and the target rendering parameters constitute the target decoding rendering configuration information.
In some embodiments, the decoded output frame rate threshold is a decoded input frame rate, and the adjusting unit 13 is further configured to determine that the initial decoded output frame rate does not meet the decoded output frame rate threshold if the initial decoded output frame rate is less than the decoded input frame rate; and if the initial decoding output frame rate is equal to the decoding input frame rate, determining that the initial decoding output frame rate meets the decoding output frame rate threshold.
In some embodiments, the initial decoding parameter includes at least one of a decoding chip type, a decoding input frame rate, and a video resolution of the terminal device, and the adjusting unit 13 is specifically configured to adjust at least one of the decoding chip type, the decoding input frame rate, and the video resolution to obtain a target decoding parameter that meets the decoding output frame rate threshold.
In some embodiments, the adjusting unit 13 is specifically configured to adjust at least one of the decoding chip type, the decoding input frame rate, and the video resolution according to an adjustment manner of adjusting the decoding chip type first and then adjusting the decoding input frame rate and the video resolution, so as to obtain the target decoding parameter.
In some embodiments, the adjusting unit 13 is further configured to, if the decoding output frame rate corresponding to the reduced decoding input frame rate and video resolution still does not meet the decoding output frame rate threshold, reduce the decoding input frame rate and the video resolution again until a decoding input frame rate and video resolution meeting the decoding output frame rate threshold are obtained.
In some embodiments, if the initial decoding rendering configuration information does not include a decoding chip type, the adjusting unit 13 is further configured to obtain M decoding chips of the terminal device, where M is a positive integer greater than 1; determining a decoding output frame rate corresponding to each decoding chip in the M decoding chips under the initial decoding rendering configuration information; and adding the type corresponding to the decoding chip with the largest decoding output frame rate in the M decoding chips to the initial decoding rendering configuration information to obtain new initial decoding rendering configuration information.
In some embodiments, the single frame decoding delay threshold is the product of the sum of the number of hoarded frames and 1 and the reciprocal of the decoding input frame rate, and the adjusting unit 13 is further configured to determine that the initial single frame decoding delay meets the single frame decoding delay threshold if the initial single frame decoding delay is less than or equal to this value; and to determine that the initial single frame decoding delay does not meet the single frame decoding delay threshold if the initial single frame decoding delay is greater than this value.
In some embodiments, the initial rendering parameters include at least one of a rendering control type and a rendering frame loss policy, and the adjusting unit 13 is specifically configured to adjust at least one of the rendering control type and the rendering frame loss policy to obtain a target rendering parameter that meets the single frame decoding delay threshold.
In some embodiments, the adjusting unit 13 is specifically configured to adjust at least one of the rendering control type and the rendering frame loss policy according to an adjustment manner of adjusting the rendering control type first and then adjusting the rendering frame loss policy, so as to obtain a target rendering parameter that meets the single frame decoding delay threshold.
In some embodiments, if adjusting the rendering parameters does not yield target rendering parameters that meet the single frame decoding delay threshold, the adjusting unit 13 is further configured to adjust the video encoding multi-frame reference in the initial decoding rendering configuration information to a video encoding single frame reference.
In some embodiments, the adjusting unit 13 is further configured to send second indication information to a cloud server, where the second indication information is used to indicate target decoding rendering configuration information of the terminal device, so that the cloud server encodes according to the target decoding rendering configuration information.
It should be understood that apparatus embodiments and method embodiments may correspond with each other, and similar descriptions may refer to the method embodiments. To avoid repetition, no further description is provided here. Specifically, the apparatus 10 shown in fig. 7 may perform the above-described method embodiments, and the foregoing and other operations and/or functions of each module in the apparatus 10 are respectively for implementing the method embodiment shown in fig. 4 above, which are not further described herein for brevity.
The apparatus of the embodiments of the present application are described above in terms of functional modules in conjunction with the accompanying drawings. It should be understood that the functional module may be implemented in hardware, or may be implemented by instructions in software, or may be implemented by a combination of hardware and software modules. Specifically, each step of the method embodiments in the embodiments of the present application may be implemented by an integrated logic circuit of hardware in a processor and/or an instruction in software form, and the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented as a hardware decoding processor or implemented by a combination of hardware and software modules in the decoding processor. Alternatively, the software modules may be located in a well-established storage medium in the art such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and the like. The storage medium is located in a memory, and the processor reads information in the memory, and in combination with hardware, performs the steps in the above method embodiments.
Fig. 8 is a schematic block diagram of an electronic device provided in an embodiment of the present application, which may be the encoder and/or decoder described above.
As shown in fig. 8, the electronic device 40 may include:
a memory 41 and a processor 42, the memory 41 being adapted to store a computer program and to transfer the program code to the processor 42. In other words, the processor 42 may call and run a computer program from the memory 41 to implement the methods in the embodiments of the present application.
For example, the processor 42 may be used to perform the method embodiments described above in accordance with instructions in the computer program.
In some embodiments of the present application, the processor 42 may include, but is not limited to:
a general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
In some embodiments of the present application, the memory 41 includes, but is not limited to:
volatile memory and/or nonvolatile memory. The nonvolatile Memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable EPROM (EEPROM), or a flash Memory. The volatile memory may be random access memory (Random Access Memory, RAM) which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (Double Data Rate SDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), and Direct memory bus RAM (DR RAM).
In some embodiments of the present application, the computer program may be partitioned into one or more modules that are stored in the memory 41 and executed by the processor 42 to perform the methods provided herein. The one or more modules may be a series of computer program instruction segments capable of performing particular functions, for describing the execution of the computer program in the electronic device.
As shown in fig. 8, the electronic device 40 may further include:
a transceiver 43, where the transceiver 43 may be connected to the processor 42 or the memory 41.
The processor 42 may control the transceiver 43 to communicate with other devices; in particular, it may transmit information or data to other devices or receive information or data transmitted by other devices. The transceiver 43 may include a transmitter and a receiver. The transceiver 43 may further include antennas, and the number of antennas may be one or more.
It will be appreciated that the various components in the electronic device 40 are connected by a bus system, where the bus system includes a power bus, a control bus and a status signal bus in addition to a data bus.
The present application also provides a computer storage medium having stored thereon a computer program which, when executed by a computer, enables the computer to perform the method of the above-described method embodiments. Alternatively, embodiments of the present application also provide a computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method of the method embodiments described above.
When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces, in whole or in part, a flow or function consistent with embodiments of the present application. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital subscriber line (digital subscriber line, DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (digital video disc, DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A method for determining decoding configuration parameters, which is applied to a terminal device, comprising:
determining initial decoding rendering configuration information of terminal equipment, wherein the initial decoding rendering configuration information comprises initial decoding parameters and initial rendering parameters of the terminal equipment;
Decoding the test code stream under the initial decoding configuration information to obtain an initial decoding output frame rate and initial single frame decoding delay corresponding to the initial decoding rendering configuration information;
and if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not meet the corresponding threshold value, adjusting at least one of the initial decoding parameter and the initial rendering parameter to obtain target decoding rendering configuration information of the terminal equipment, wherein the decoding output frame rate and the single frame decoding delay corresponding to the target decoding rendering configuration meet the corresponding threshold value.
2. The method of claim 1, wherein the initial decoding rendering configuration information is target decoding rendering configuration information with highest occurrence probability among target decoding rendering configuration information of preset N terminal devices, and N is a positive integer greater than 1.
3. The method of claim 1, wherein the threshold includes a decoding output frame rate threshold and a single frame decoding delay threshold, and wherein adjusting at least one of the initial decoding parameter and the initial rendering parameter to obtain the target decoding rendering configuration information of the terminal device if at least one of the initial decoding output frame rate and the initial single frame decoding delay does not satisfy the corresponding threshold includes:
If the initial decoding output frame rate does not meet the decoding output frame rate threshold, and the initial single-frame decoding delay meets the single-frame decoding delay threshold, adjusting the initial decoding parameters to obtain target decoding parameters meeting the decoding output frame rate threshold, and determining the rendering parameters as target rendering parameters;
if the initial decoding output frame rate meets the decoding output frame rate threshold and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, adjusting the initial rendering parameters to obtain target rendering parameters meeting the single-frame decoding delay threshold, and determining the initial decoding parameters as target decoding parameters;
if the initial decoding output frame rate does not meet the decoding output frame rate threshold and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, adjusting the initial decoding parameters to obtain target decoding parameters meeting the decoding output frame rate threshold, and adjusting the initial rendering parameters on the basis of the target decoding parameters to obtain target rendering parameters meeting the single-frame decoding delay threshold;
wherein the target decoding parameters and the target rendering parameters constitute the target decoding rendering configuration information.
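The three branches of claim 3 map onto a simple case analysis. The sketch below assumes hypothetical helpers `adjust_decode_params` and `adjust_render_params` that return parameters already known to satisfy the respective threshold; it is illustrative only.

```python
def resolve_target_config(initial_decode, initial_render,
                          output_fps, delay,
                          output_fps_threshold, delay_threshold,
                          adjust_decode_params, adjust_render_params):
    fps_ok = output_fps >= output_fps_threshold
    delay_ok = delay <= delay_threshold
    if not fps_ok and delay_ok:
        # Frame rate fails, delay passes: tune decoding, keep the initial rendering parameters.
        return adjust_decode_params(initial_decode), initial_render
    if fps_ok and not delay_ok:
        # Delay fails, frame rate passes: tune rendering, keep the initial decoding parameters.
        return initial_decode, adjust_render_params(initial_render, initial_decode)
    if not fps_ok and not delay_ok:
        # Both fail: tune decoding first, then tune rendering on top of the new decoding parameters.
        target_decode = adjust_decode_params(initial_decode)
        return target_decode, adjust_render_params(initial_render, target_decode)
    # Both already pass: the initial configuration is already the target configuration.
    return initial_decode, initial_render
```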
4. The method of claim 3, wherein the decoding output frame rate threshold is the decoding input frame rate, the method further comprising:
if the initial decoding output frame rate is less than the decoding input frame rate, determining that the initial decoding output frame rate does not meet the decoding output frame rate threshold;
and if the initial decoding output frame rate is equal to the decoding input frame rate, determining that the initial decoding output frame rate meets the decoding output frame rate threshold.
5. The method of claim 4, wherein the initial decoding parameters include at least one of a decoding chip type, a decoding input frame rate, and a video resolution of the terminal device, and wherein the adjusting the initial decoding parameters to obtain target decoding parameters that satisfy the decoding output frame rate threshold comprises:
adjusting at least one of the decoding chip type, the decoding input frame rate, and the video resolution to obtain the target decoding parameters meeting the decoding output frame rate threshold.
6. The method of claim 5, wherein adjusting at least one of the decoding chip type, the decoding input frame rate, and the video resolution to obtain the target decoding parameters that satisfy the decoding output frame rate threshold comprises:
adjusting at least one of the decoding chip type, the decoding input frame rate, and the video resolution in an order in which the decoding chip type is adjusted first and the decoding input frame rate and the video resolution are adjusted afterwards, to obtain the target decoding parameters.
7. The method of claim 6, wherein the method further comprises:
if, after the decoding input frame rate and the video resolution are reduced, the corresponding decoding output frame rate still does not meet the decoding output frame rate threshold, reducing the decoding input frame rate and the video resolution again until a decoding input frame rate and a video resolution meeting the decoding output frame rate threshold are obtained.
8. The method of any of claims 1-7, wherein if the initial decoding rendering configuration information does not include a decoding chip type, the method further comprises:
obtaining M decoding chips of the terminal device, wherein M is a positive integer greater than 1;
determining a decoding output frame rate corresponding to each decoding chip in the M decoding chips under the initial decoding rendering configuration information;
and adding the type corresponding to the decoding chip with the highest decoding output frame rate among the M decoding chips to the initial decoding rendering configuration information to obtain new initial decoding rendering configuration information.
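A compact sketch of the chip selection in claim 8, assuming a hypothetical `measure_output_fps` callback that decodes the test code stream on a given chip under the initial configuration, and assuming the configuration is represented as a plain dictionary:

```python
def pick_decoding_chip(chip_types, initial_config, measure_output_fps):
    # Measure every available decoding chip under the initial decoding rendering
    # configuration and keep the one with the highest decoding output frame rate.
    best_chip = max(chip_types, key=lambda chip: measure_output_fps(chip, initial_config))
    return {**initial_config, "chip_type": best_chip}
```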
9. The method of any of claims 3-7, wherein the single-frame decoding delay threshold is obtained by multiplying the sum of the number of buffered frames and 1 by the reciprocal of the decoding input frame rate, the method further comprising:
if the initial single-frame decoding delay is less than or equal to the single-frame decoding delay threshold, determining that the initial single-frame decoding delay meets the single-frame decoding delay threshold;
and if the initial single-frame decoding delay is greater than the single-frame decoding delay threshold, determining that the initial single-frame decoding delay does not meet the single-frame decoding delay threshold.
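Read this way, the delay budget grows with the number of frames the terminal is allowed to buffer. A small sketch follows; the reading of the formula as (buffered frames + 1) frame intervals at the decoding input frame rate, the name `buffered_frames`, and the example numbers are all assumptions.

```python
def single_frame_delay_threshold_s(buffered_frames: int, decode_input_fps: float) -> float:
    # Assumed reading of claim 9: (buffered frames + 1) frame intervals
    # at the decoding input frame rate.
    return (buffered_frames + 1) / decode_input_fps

def delay_meets_threshold(delay_s: float, buffered_frames: int, decode_input_fps: float) -> bool:
    return delay_s <= single_frame_delay_threshold_s(buffered_frames, decode_input_fps)

# Example: 2 buffered frames at a 60 fps decoding input frame rate
# gives a 50 ms single-frame decoding delay budget.
assert abs(single_frame_delay_threshold_s(2, 60.0) - 0.05) < 1e-9
```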
10. The method of claim 9, wherein the initial rendering parameters include at least one of a rendering control type and a rendering frame loss policy, and wherein the adjusting the initial rendering parameters to obtain target rendering parameters that satisfy the single-frame decoding delay threshold comprises:
adjusting at least one of the rendering control type and the rendering frame loss policy to obtain a target rendering parameter meeting the single-frame decoding delay threshold.
11. The method of claim 10, wherein adjusting at least one of the rendering control type and the rendering frame loss policy to obtain the target rendering parameter that meets the single-frame decoding delay threshold comprises:
adjusting at least one of the rendering control type and the rendering frame loss policy in an order in which the rendering control type is adjusted first and the rendering frame loss policy is adjusted afterwards, to obtain the target rendering parameters meeting the single-frame decoding delay threshold.
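The ordering in claim 11 mirrors that of claim 6, now over rendering parameters. In the sketch below, the candidate lists and the `measure_delay` callback are assumptions.

```python
def tune_rendering_params(control_types, frame_drop_policies, measure_delay, delay_threshold_s):
    """control_types: e.g. ["vsync", "immediate"]; frame_drop_policies: e.g. ["keep_all", "drop_late"]
    (both illustrative). measure_delay: assumed callback (control, policy) -> single-frame delay in s.
    """
    for policy in frame_drop_policies:          # changed last (claim 11)
        for control in control_types:           # changed first (claim 11)
            if measure_delay(control, policy) <= delay_threshold_s:
                return {"render_control_type": control, "frame_drop_policy": policy}
    # No combination met the threshold; claim 12 then falls back to switching
    # video coding from multi-frame reference to single-frame reference.
    return None
```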
12. The method of claim 10, wherein if the target rendering parameter that meets the single-frame decoding delay threshold is not obtained by adjusting the initial rendering parameters, the method further comprises:
adjusting the video coding multi-frame reference in the initial decoding rendering configuration information to a video coding single-frame reference.
13. The method according to any one of claims 1-7, further comprising:
sending second indication information to a cloud server, wherein the second indication information is used for indicating the target decoding rendering configuration information of the terminal device, so that the cloud server performs encoding according to the target decoding rendering configuration information.
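Claim 13 only requires that the terminal report its target configuration to the cloud server so that encoding can match it. A minimal sketch using the Python standard library; the endpoint, payload shape, and field names are assumptions, not part of the claimed method.

```python
import json
import urllib.request

def report_target_config(server_url: str, device_id: str, target_config: dict) -> int:
    # Second indication information: the terminal's target decoding rendering
    # configuration, sent so the cloud server can encode accordingly.
    payload = json.dumps({
        "device_id": device_id,
        "target_decode_render_config": target_config,
    }).encode("utf-8")
    request = urllib.request.Request(
        server_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status  # HTTP status code returned by the cloud server
```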
14. A device for determining decoding configuration parameters, applied to a terminal device, the device comprising:
a determining unit, configured to determine initial decoding rendering configuration information of the terminal device, wherein the initial decoding rendering configuration information comprises initial decoding parameters and initial rendering parameters of the terminal device;
a detection unit, configured to decode a test code stream under the initial decoding rendering configuration information to obtain an initial decoding output frame rate and an initial single-frame decoding delay corresponding to the initial decoding rendering configuration information;
and an adjusting unit, configured to adjust at least one of the initial decoding parameters and the initial rendering parameters to obtain target decoding rendering configuration information of the terminal device if at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, wherein the decoding output frame rate and the single-frame decoding delay corresponding to the target decoding rendering configuration information meet the corresponding thresholds.
15. An electronic device, comprising:
a processor and a memory, the memory being configured to store a computer program, and the processor being configured to invoke and run the computer program stored in the memory to perform the method of any one of claims 1 to 13.
16. A computer storage medium comprising computer program instructions that, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 13.
CN202210103032.9A 2022-01-27 2022-01-27 Determination method, device, equipment and storage medium of decoding configuration parameters Pending CN116567244A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210103032.9A CN116567244A (en) 2022-01-27 2022-01-27 Determination method, device, equipment and storage medium of decoding configuration parameters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210103032.9A CN116567244A (en) 2022-01-27 2022-01-27 Determination method, device, equipment and storage medium of decoding configuration parameters

Publications (1)

Publication Number Publication Date
CN116567244A true CN116567244A (en) 2023-08-08

Family

ID=87490272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210103032.9A Pending CN116567244A (en) 2022-01-27 2022-01-27 Determination method, device, equipment and storage medium of decoding configuration parameters

Country Status (1)

Country Link
CN (1) CN116567244A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40091907
Country of ref document: HK