WO2021068341A1

WO2021068341A1 - Code rate configuration method, system and device for video frame

Info

Publication number: WO2021068341A1
Application number: PCT/CN2019/118056
Authority: WO
Inventors: 林鹏程; 黄学辉
Original assignee: 网宿科技股份有限公司
Priority date: 2019-10-10
Filing date: 2019-11-13
Publication date: 2021-04-15
Also published as: CN110769254B; CN110769254A

Abstract

Disclosed are a code rate configuration method, system and device for a video frame. The method comprises: establishing a mapping relationship between an encoded data volume of a macro block and configuration parameters of the macro block; determining actual encoded data volumes of macro blocks according to the mapping relationship and initial configuration parameters of the macro blocks in a video frame, wherein the actual encoded data volumes of the macro blocks constitute an actual encoded data volume of the video frame; and recognizing a target region in the video frame, and when the actual encoded data volume of the video frame is kept unchanged, redetermining, according to a picture quality improving coefficient of the target region, encoded data volumes and configuration parameters of the macro blocks in the video frame, so as to encode the video frame by means of the redetermined configuration parameters of the macro blocks. The technical solution provided in the present application can achieve a relatively good picture quality improving effect.

Description

Code rate configuration method, system and device for video frame

Technical field

The present invention relates to the technical field of image processing, and in particular to a method, system and device for configuring the code rate of a video frame.

Background art

With the improvement of video quality, the amount of video coded data is also increasing. However, the limited network bandwidth makes it impossible to improve the video quality indefinitely. Therefore, under the premise of limited bandwidth or limited amount of encoded data, how to improve the image quality of the video as much as possible has become an urgent problem to be solved.

With the development of image processing technology, many different coding standards have emerged. For example, these coding standards may include JPEG, JVT, MPEG, H.26X, and so on. During video encoding, certain encoding standards can allocate different amounts of encoded data to different video frames based on the complexity of each video frame and the reference weight of each video frame. Although this encoding method can improve the image quality of some of the video frames, it sacrifices the image quality of other video frames. When the user is watching the video, the image quality will fluctuate. Therefore, such an encoding method cannot achieve a better image quality improvement effect.

Summary of the invention

The purpose of this application is to provide a method, system and device for configuring the bit rate of a video frame, which can have a better image quality improvement effect.

In order to achieve the above objective, one aspect of the present application provides a method for configuring the code rate of a video frame, the video frame being divided into a plurality of macroblocks. The method includes: establishing a mapping relationship between the encoded data amount of a macroblock and the configuration parameters of the macroblock; determining the actual encoded data amount of each macroblock according to the mapping relationship and the initial configuration parameters of each macroblock in the video frame, wherein the actual encoded data amounts of the macroblocks constitute the actual encoded data amount of the video frame; and identifying a target area in the video frame and, while keeping the actual encoded data amount of the video frame unchanged, re-determining the encoded data amount and configuration parameters of each macroblock in the video frame according to the image quality improvement coefficient of the target area, so as to encode the video frame using the re-determined configuration parameters of each macroblock.

In order to achieve the above objective, another aspect of the present application also provides a system for configuring the code rate of a video frame, the video frame being divided into a plurality of macroblocks. The system includes: a mapping relationship establishing unit, configured to establish a mapping relationship between the encoded data amount of a macroblock and the configuration parameters of the macroblock; an encoded data amount determining unit, configured to determine the actual encoded data amount of each macroblock according to the mapping relationship and the initial configuration parameters of each macroblock in the video frame, wherein the actual encoded data amounts of the macroblocks constitute the actual encoded data amount of the video frame; and an encoding unit, configured to identify a target area in the video frame and, while keeping the actual encoded data amount of the video frame unchanged, re-determine the encoded data amount and configuration parameters of each macroblock in the video frame according to the image quality improvement coefficient of the target area, so as to encode the video frame using the re-determined configuration parameters of each macroblock.

In order to achieve the above objective, another aspect of the present application also provides a code rate configuration device for a video frame. The device includes a processor and a memory, the memory is used to store a computer program, and when the computer program is executed by the processor, the above-mentioned code rate configuration method for a video frame is implemented.

It can be seen from the above that the technical solutions provided by one or more embodiments of the present application can pre-establish the mapping relationship between the amount of encoded data of the macro block and the configuration parameters. In practical applications, the amount of coded data is usually fixed within a limited range, so it is possible to establish an accurate mapping relationship with the configuration parameters for the amount of coded data within this range. When encoding a video frame, the video frame can be divided into multiple different macroblocks. For each macro block, the initial configuration parameters can be substituted into the above mapping relationship, so as to determine the actual coded data amount of each macro block. When the image quality is improved, the target area that the human eye pays attention to can be identified in the video frame according to the principle of human vision. The image quality can be improved for the content in the target area, and the image quality can be appropriately reduced for the content outside the target area.

Specifically, according to the image quality improvement coefficient required for image quality improvement, the encoded data amount and configuration parameters of each macroblock in the video frame can be re-determined while keeping the actual encoded data amount of the video frame unchanged. In this way, according to the re-determined configuration parameters, each macro block in the video frame can be coded, thereby completing the process of image quality improvement.

It can be seen that, through the above solution, the image quality can be improved for the target area that the human eye pays attention to, and appropriately reduced for the areas that the human eye does not pay attention to. On the one hand, this ensures that the actual coded data amount of the video frame does not change significantly; on the other hand, it ensures that the area the user focuses on is always presented with higher picture quality, thereby achieving a better picture quality improvement effect.

Description of the drawings

In order to more clearly explain the technical solutions in the embodiments of the present invention, the following will briefly introduce the drawings needed in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

FIG. 1 is a step diagram of a method for configuring the code rate of a video frame in an embodiment of the present invention;

FIG. 2 is a schematic diagram of a mapping relationship model in an embodiment of the present invention;

FIG. 3 is a schematic diagram of a code rate configuration system for a video frame in an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a code rate configuration device for a video frame in an embodiment of the present invention.

Detailed description

In order to make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions of the present application will be described clearly and completely in conjunction with the specific embodiments of the present application and the corresponding drawings. Obviously, the described implementations are only a part of the implementations of the present application, rather than all of the implementations. Based on the implementation manners in this application, all other implementation manners obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

The code rate configuration method of the video frame provided by the present application can adjust the amount of encoded data of the macroblock (MacroBlock) in the video frame, so as to achieve the effect that the macroblocks in different regions have different amounts of encoded data. Specifically, when encoding a video frame, the video frame may be divided into multiple macroblocks. In practical applications, the divided macroblocks usually have the same size. Of course, the divided macroblocks may also have different sizes according to different regions, which is not limited in this application.

In this embodiment, each macroblock in the video frame can be mathematically transformed based on its residual with respect to a reference macroblock, so as to obtain the residual transform value (SATD, Sum of Absolute Transformed Difference) of each macroblock. Specifically, the pixel matrices of the current macroblock and of the reference macroblock can be extracted, and the two pixel matrices can be subtracted to obtain the residual between the two macroblocks. The residual can then be subjected to a Hadamard transform, and the resulting value can be used as the SATD value of the current macroblock. When encoding a video frame, in addition to calculating the SATD value of each macroblock, it is also necessary to set a quantization offset parameter for each macroblock. The quantization offset parameter can be used to reduce the number of bytes after encoding. In practical applications, the quantization offset parameter of each macroblock can be calculated by means of an existing coding standard. For example, the complexity of each macroblock can be analyzed by the MB-Tree algorithm that comes with x264, so as to calculate the quantization offset parameter of each macroblock. The above-mentioned residual transform value and quantization offset parameter can be used as the configuration parameters of the macroblock.
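The SATD calculation described above can be illustrated with a minimal sketch. It assumes a 4x4 block and an unnormalized Hadamard transform; the function name `satd_4x4` and the block size are illustrative choices rather than details given in this application, and real encoders may apply a normalization factor that is omitted here.

```python
from typing import List

Matrix = List[List[int]]

# 4x4 Hadamard matrix (entries are +1 / -1); it is symmetric.
H4: Matrix = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two small integer matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def satd_4x4(current: Matrix, reference: Matrix) -> int:
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks."""
    # Residual: element-wise difference of the two pixel matrices.
    residual = [
        [c - r for c, r in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, reference)
    ]
    # 2-D Hadamard transform of the residual: H * R * H^T.
    transformed = matmul(matmul(H4, residual), transpose(H4))
    # Sum of absolute transformed coefficients (no normalization applied).
    return sum(abs(v) for row in transformed for v in row)
```

For a constant residual, all of the energy lands in the first transformed coefficient, which makes the function easy to check by hand.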

In this embodiment, the SATD value and quantization offset parameter of each macroblock determined in the above manner can be used as the initial SATD value and initial quantization offset parameter of each macroblock. Subsequently, the process of configuring the bit rate of the video frame can be implemented through the steps shown in FIG. 1. Specifically, referring to FIG. 1, in one embodiment, a method for configuring a bit rate of a video frame may include the following multiple steps.

S1: Establish a mapping relationship between the amount of coded data of the macro block and the configuration parameters of the macro block.

In practical applications, the configuration parameters of the macroblock may include the above-mentioned residual transform value and quantization offset parameter. According to the SATD value and quantization offset parameter of the macroblock, the actual coded data amount of the macroblock can be calculated. Current coding standards already provide a mapping relationship between the amount of coded data, the SATD value, and the quantization offset parameter, but this mapping relationship is not accurate enough in low code rate scenarios. In view of this, in this embodiment, the mapping relationship between the amount of coded data, the SATD value, and the quantization offset parameter can be refitted in advance for low code rate scenarios through a machine learning method.

Specifically, when refitting the mapping relationship, video frame samples within a specified code rate range can first be obtained. The specified code rate range may be a range with a lower code rate, for example 100 kbps to 300 kbps. The video frame samples within the specified code rate range can all be video frames that have already been encoded. By analyzing the relationship between the amount of encoded data of the macroblocks in these video frames and the SATD values and quantization offset parameters of the macroblocks, an accurate mapping relationship for low code rate scenarios can be obtained.

In this embodiment, for each video frame sample, the coded data amount, SATD value, and quantization offset parameter of each macroblock in the video frame sample can be identified, and a prediction model can subsequently be trained based on the identified coded data amounts, SATD values, and quantization offset parameters. In the training process, different methods can be used to obtain the final prediction model. For example, algorithms such as Support Vector Machine (SVM) or logistic regression can take the identified SATD value and quantization offset parameter as input data, and the corresponding actual amount of encoded data can be compared with the output result to determine the error between the two. The prediction model is continuously corrected using this error, so that the error between the output result and the actual amount of encoded data is limited to an allowable range. In this way, by training on a large number of video frame samples, a more accurate prediction model can be obtained. The input of the prediction model may be the SATD value and the quantization offset parameter of a macroblock, and the output may be the predicted amount of coded data of the macroblock.

In this embodiment, the mathematical relationship between the input data and the output result of the prediction model can be expressed by a certain formula, and the mathematical relationship can be used as the mapping relationship established in step S1.

Specifically, the following describes the training process of the prediction model from the perspective of a mathematical formula. First, the mathematical expression of the predictive model can be as follows:

$$B_i = w \cdot \varphi(S_i, Q_i) + b$$

where $B_i$ represents the amount of coded data of the i-th macroblock in the video frame sample, $S_i$ represents the SATD value of the i-th macroblock, $Q_i$ represents the quantization offset parameter of the i-th macroblock, $\varphi(S_i, Q_i)$ represents the kernel function with the SATD value and quantization offset parameter as independent variables, $w$ represents the linear coefficient of the kernel function, and $b$ represents the undetermined constant.

It can be seen that, in the mapping relationship between the amount of encoded data, the SATD value, and the quantization offset parameter, the residual transform value and the quantization offset parameter of the macroblock serve as the independent variables of the kernel function, the kernel function has a linear coefficient, and an undetermined constant is also included in the mapping relationship. To fix the mathematical expression, the linear coefficient and the undetermined constant need to be determined. In addition, the values of any variables in the kernel function other than the SATD value and the quantization offset parameter also need to be determined.

Specifically, the above-mentioned expression representing the mapping relationship can be converted into a minimum value problem with constraint conditions. In that problem, $\xi_i$ and $\xi_i^*$ are the slack variables of the i-th macroblock, $C$ is the penalty factor, and $l$ represents the number of macroblocks contained in the current video frame sample.

It can be seen that both the expression characterizing the minimum value problem and the expression characterizing the constraint condition may include slack variables, and the expression characterizing the minimum value problem may also include a penalty factor.
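Assuming the standard ε-insensitive support vector regression formulation, which is consistent with the slack variables, penalty factor and sample count named above, the minimum value problem and its constraints can be sketched as follows. Here $y_i$ denotes the actual coded data amount of the i-th macroblock and $\varepsilon$ the error tolerance; the exact form used in this application is not reproduced in the text, so this is an assumed reconstruction.

```latex
\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\lVert w\rVert^{2}
  + C\sum_{i=1}^{l}\left(\xi_i + \xi_i^*\right)
\quad\text{s.t.}\quad
\begin{cases}
y_i - w\cdot\varphi(S_i,Q_i) - b \le \varepsilon + \xi_i,\\[2pt]
w\cdot\varphi(S_i,Q_i) + b - y_i \le \varepsilon + \xi_i^*,\\[2pt]
\xi_i \ge 0,\quad \xi_i^* \ge 0,\qquad i = 1,\dots,l.
\end{cases}
```

The first term keeps the model flat, while the penalty factor $C$ weights how strongly prediction errors larger than $\varepsilon$ are punished.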

In order to solve the above-mentioned minimum problem, Lagrange multipliers can be introduced, thereby converting the minimum problem into a dual problem; by solving the dual problem, the assignments of the above linear coefficient and undetermined constant can be obtained. Specifically, after the Lagrange multipliers are introduced, the above-mentioned minimum problem can be transformed into a dual form. In the dual form, $a_i$ and $a_i^*$ are the introduced Lagrange multipliers, $\varepsilon$ is the error tolerance associated with the slack variables, and $y_i$ is the actual amount of coded data of the i-th macroblock.
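Under the assumption that the minimum value problem is the standard ε-insensitive support vector regression problem, its dual form can be sketched as below. The shorthand $x_i = (S_i, Q_i)$ and $K(x_i, x_j) = \varphi(x_i)\cdot\varphi(x_j)$ is introduced here for readability and does not appear in the original text; $a_i$ is taken to be the multiplier of the constraint $y_i - w\cdot\varphi(x_i) - b \le \varepsilon + \xi_i$ and $a_i^*$ that of the opposite constraint.

```latex
\max_{a,\,a^*}\;
 -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}
    (a_i - a_i^*)(a_j - a_j^*)\,K(x_i, x_j)
 \;-\; \varepsilon\sum_{i=1}^{l}(a_i + a_i^*)
 \;+\; \sum_{i=1}^{l} y_i\,(a_i - a_i^*)
\quad\text{s.t.}\quad
\sum_{i=1}^{l}(a_i - a_i^*) = 0,\qquad
0 \le a_i \le C,\quad 0 \le a_i^* \le C.
```

In this formulation the linear coefficient follows from the multipliers as $w = \sum_{i=1}^{l}(a_i - a_i^*)\,\varphi(x_i)$, which is the transformation relationship that lets the multipliers determine the model.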

When solving the above dual problem, the kernel function can be a radial basis kernel function, in whose expression $\sigma$ represents the fitting parameter.
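As a minimal sketch of a radial basis kernel, the following assumes the common Gaussian form exp(-||x - z||^2 / (2σ^2)); the exact expression used in this application is not reproduced in the text, and the function name `rbf_kernel` is illustrative.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], sigma: float) -> float:
    """Gaussian radial basis kernel exp(-||x - z||^2 / (2 * sigma^2)).

    x and z are feature vectors such as (SATD value, quantization offset).
    The 2 * sigma^2 scaling is an assumed convention, not taken from the source.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

The kernel equals 1 for identical inputs and decays toward 0 as the two macroblocks' parameters move apart, with σ controlling how quickly it decays.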

In this embodiment, the specific values of the Lagrange multipliers are obtained by solving the above-mentioned dual problem. Using the transformation relationship between the Lagrange multipliers, the linear coefficient, and the undetermined constant that arises when the minimum problem is converted into the dual problem, the values of the linear coefficient and the undetermined constant can be derived from the values of the Lagrange multipliers. Finally, the mathematical expression of the prediction model can be obtained. This mathematical expression can be used as the established mapping relationship, that is, the mapping relationship between the amount of encoded data, the SATD value, and the quantization offset parameter at low code rates.
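Once the multipliers and the constant are known, a prediction can be evaluated as a kernel expansion. The sketch below assumes the usual support vector regression form f(x) = Σ (a_i - a_i^*) K(x_i, x) + b with a Gaussian kernel; the names `predict_coded_data`, `support`, and `coef` are hypothetical, and the numbers in any call are illustrative rather than trained values.

```python
import math
from typing import Sequence, Tuple

# One sample point: (SATD value, quantization offset parameter).
Point = Tuple[float, float]


def rbf_kernel(x: Point, z: Point, sigma: float) -> float:
    """Gaussian kernel; the 2 * sigma^2 scaling is an assumed convention."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def predict_coded_data(
    x: Point,
    support: Sequence[Point],
    coef: Sequence[float],
    b: float,
    sigma: float,
) -> float:
    """Evaluate sum_i coef[i] * K(support[i], x) + b.

    coef[i] stands for (a_i - a_i^*), the difference of the two Lagrange
    multipliers of training sample i, and b is the undetermined constant.
    """
    return sum(c * rbf_kernel(s, x, sigma) for s, c in zip(support, coef)) + b
```

Only training samples with a nonzero multiplier difference contribute to the sum, so in practice `support` can be restricted to those samples.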

S3: Determine the actual coded data amount of each macroblock according to the mapping relationship and the initial configuration parameters of each macroblock in the video frame, wherein the actual coded data amounts of the macroblocks constitute the actual coded data amount of the video frame.

In this embodiment, the initial configuration parameters of a macroblock may refer to the initial SATD value and the initial quantization offset parameter. The initial SATD value and the initial quantization offset parameter of each macroblock in the video frame to be encoded can be substituted into the above mapping relationship, so as to determine the actual coded data amount of each macroblock. Since the above mapping relationship is obtained by training in low code rate scenarios, the actual coded data amount determined according to it is more accurate. After the actual coded data amount of each macroblock is obtained, the sum of the actual coded data amounts of these macroblocks can be used as the actual coded data amount of the video frame.
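Step S3 can be sketched as applying the mapping to every macroblock and summing the results. In this sketch, `mapping` is a hypothetical callable standing in for the trained mapping relationship, and the function name `frame_coded_data` is illustrative.

```python
from typing import Callable, List, Sequence, Tuple

# mapping(satd, quant_offset) -> predicted coded data amount of one macroblock.
Mapping = Callable[[float, float], float]


def frame_coded_data(
    params: Sequence[Tuple[float, float]], mapping: Mapping
) -> Tuple[List[float], float]:
    """Return per-macroblock coded data amounts and their sum for one frame.

    params holds the initial (SATD value, quantization offset) pair of each
    macroblock; the sum is treated as the frame's actual coded data amount.
    """
    per_block = [mapping(satd, quant_offset) for satd, quant_offset in params]
    return per_block, sum(per_block)
```

Any model with the same two inputs and one output can be passed in as `mapping`, which keeps this step independent of how the mapping relationship was fitted.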

S5: Identify the target area in the video frame and, while keeping the actual amount of encoded data of the video frame unchanged, re-determine the coded data amount and configuration parameters of each macroblock in the video frame according to the image quality improvement coefficient of the target area, so as to encode the video frame using the re-determined configuration parameters of each macroblock.

In this embodiment, it is considered that a user watching a video usually pays attention to part of the main content on the screen and pays little attention to the rest. In view of this, the image quality of the area that the user pays attention to in the video frame can be enhanced, and the image quality of the areas that the user does not pay attention to can be appropriately reduced, so as to keep the overall amount of encoded data of the video frame unchanged while improving the user's viewing experience.

Specifically, the target area that the user pays attention to in the video frame can be identified. In practical applications, there are many ways to identify the target area. For example, a fixed area in the video frame, such as the central area of the video frame, may be used as the target area.

In addition, the target area in the video frame can be determined according to the focus information of the video frame. Generally speaking, the content mainly displayed in the video frame tends to have high definition, and the definition can be expressed by different focus information. The main display content and the other content can therefore have different focus information, and by identifying the focus information in the video frame, the target area that the user pays attention to can be determined.

Furthermore, the target area in the video frame can be determined by a target tracking algorithm. Specifically, feature extraction can be performed first, so as to extract one or more of the gray feature, color feature, texture feature, Haar-like rectangle feature, interest point feature, and superpixel feature in the video frame. Then, the motion state of the target object in multiple adjacent video frames is analyzed through a motion model, so as to determine the candidate area where the target object is located. The motion model may be one of mean shift, sliding window, Kalman filtering, particle filtering, and other models. Finally, for the determined candidate area, an appearance model can be used to determine whether the tracked target object is present in the candidate area of the current video frame. If it is, the candidate area can be used as the target area of the video frame.

Of course, there are more ways to determine the target area in the video frame in practical applications. The above examples in this embodiment are only possible implementations, and do not mean that the solution of this application can only be applied to the above implementations.
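The simplest of the options above, a fixed central area, can be sketched as follows. The sketch assumes macroblocks are indexed in raster order and that the target area is a centered rectangle covering a given fraction of each dimension; the function name `central_target_blocks` and the `fraction` parameter are hypothetical.

```python
from typing import Set


def central_target_blocks(mb_cols: int, mb_rows: int, fraction: float = 0.5) -> Set[int]:
    """Return raster-order indices of macroblocks in a centered rectangle.

    mb_cols and mb_rows are the frame dimensions measured in macroblocks;
    fraction is the share of each dimension covered by the target area.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    width = max(1, round(mb_cols * fraction))
    height = max(1, round(mb_rows * fraction))
    x0 = (mb_cols - width) // 2
    y0 = (mb_rows - height) // 2
    return {
        y * mb_cols + x
        for y in range(y0, y0 + height)
        for x in range(x0, x0 + width)
    }
```

A focus-based or tracking-based detector would produce the same kind of index set, so the later bit allocation steps do not depend on which identification method is used.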

In this embodiment, when encoding a video frame, a larger amount of encoded data can be allocated to the target area, thereby improving the image quality of the target area. Specifically, before the image quality of the target area is improved, an image quality improvement coefficient may be determined. The image quality improvement coefficient may be determined from the current coded data amount of the macroblocks in the target area and the coded data amount after improvement. For example, if the current coded data amount is 100kb and the improved coded data amount needs to reach 200kb, the image quality improvement coefficient can be 2. After the image quality improvement coefficient is obtained, the amount of encoded data and the residual transform value of each macroblock in the video frame may be re-determined while keeping the actual amount of encoded data of the video frame unchanged.

Specifically, the target macroblocks located in the target area can be determined among the macroblocks of the video frame, and the image quality can be enhanced for these target macroblocks. The product of the actual coded data amount of a target macroblock and the image quality improvement coefficient may therefore be used as the re-determined coded data amount of that target macroblock. In this way, the re-determined coded data amount of the target area, $CS_{ROI}$, can be determined by the following formula:

$$CS_{ROI} = \sum_{i=1}^{ROI} r \cdot B_i$$

where $r$ is the image quality improvement coefficient, $B_i$ is the actual coded data amount of the i-th target macroblock, and ROI (Region Of Interest) is the number of macroblocks included in the aforementioned target area.

In this embodiment, the sum of the re-determined coded data amounts of the target macroblocks, $CS_{ROI}$, can be calculated, together with the difference between the actual coded data amount of the video frame and that sum. The difference, $CS_{NROI}$, can be expressed by the following formula:

$$CS_{NROI} = CS_{Total} - CS_{ROI}$$

where $CS_{Total}$ represents the actual amount of encoded data of the video frame. The difference $CS_{NROI}$ is the total amount of coded data of the areas other than the target area after the image quality is adjusted. The number of other macroblocks located outside the target area in the video frame can then be determined, and the ratio between the difference $CS_{NROI}$ and that number is used as the re-determined coded data amount of each of the other macroblocks.
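The reallocation just described can be sketched as below. Target macroblocks receive r times their actual coded data amount, and the remainder of the frame budget is shared equally among the other macroblocks. The function name `reallocate_coded_data` and the error handling for an over-large coefficient are assumptions added for illustration.

```python
from typing import List, Sequence, Set


def reallocate_coded_data(
    actual: Sequence[float], target: Set[int], r: float
) -> List[float]:
    """Re-determine per-macroblock coded data while keeping the frame total.

    actual[i] is the actual coded data amount of macroblock i, target is the
    set of indices inside the target area, r is the improvement coefficient.
    """
    others = [i for i in range(len(actual)) if i not in target]
    if not others:
        raise ValueError("no macroblocks outside the target area to draw from")
    cs_total = sum(actual)                      # actual amount of the frame
    cs_roi = sum(r * actual[i] for i in target)  # re-determined target amount
    cs_nroi = cs_total - cs_roi                 # budget left for the rest
    if cs_nroi <= 0:
        raise ValueError("improvement coefficient exceeds the frame budget")
    per_other = cs_nroi / len(others)
    return [r * actual[i] if i in target else per_other for i in range(len(actual))]
```

For example, with actual amounts of 100, 100, 300 and 500, the first two blocks as the target area and r = 2, the target blocks receive 200 each and the other two share the remaining 600 equally, so the frame total stays at 1000.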

In this embodiment, after the re-determined coded data amount of each macroblock is calculated, the new SATD value of each macroblock can be further determined. Specifically, the product of the initial residual transform value of a target macroblock and the image quality improvement coefficient may be used as the re-determined residual transform value of that target macroblock. The re-determined SATD value of the target macroblock, written here as $S_i^{ROI}$, can be as follows:

$$S_i^{ROI} = r \cdot S_i$$

where $S_i$ represents the initial SATD value of the i-th macroblock (when that macroblock is a target macroblock) and $r$ is the image quality improvement coefficient.

Then, the ratio of the calculated difference $CS_{NROI}$ to the sum of the actual coded data amounts of the other macroblocks can be used as the image quality adjustment coefficient of the other macroblocks, and the product of the initial residual transform value of each other macroblock and the image quality adjustment coefficient can be used as the re-determined residual transform value of that macroblock. Specifically, the image quality adjustment coefficient of the other macroblocks, written here as $q$, can be as follows:

$$q = \frac{CS_{NROI}}{\sum_{j \in \mathrm{NROI}} B_j}$$

where $B_j$ is the actual coded data amount of the j-th macroblock and the sum runs over the macroblocks outside the target area. The re-determined residual transform values of the other macroblocks, written here as $S_i^{NROI}$, can then be as follows:

$$S_i^{NROI} = q \cdot S_i$$

where $S_i$ represents the initial SATD value of the i-th macroblock (when that macroblock is one of the other macroblocks).
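The two SATD updates can be sketched together. Target macroblocks have their SATD value scaled by the image quality improvement coefficient r, and the other macroblocks are scaled by an adjustment coefficient equal to the remaining frame budget divided by the sum of their actual coded data amounts. The function name `redetermine_satd` and the symbol `q` for the adjustment coefficient are illustrative choices.

```python
from typing import List, Sequence, Set, Tuple


def redetermine_satd(
    satd: Sequence[float], actual: Sequence[float], target: Set[int], r: float
) -> Tuple[List[float], float]:
    """Return the re-determined SATD values and the adjustment coefficient q.

    satd[i] and actual[i] are the initial SATD value and actual coded data
    amount of macroblock i; target holds the indices in the target area.
    """
    others = [i for i in range(len(actual)) if i not in target]
    others_actual = sum(actual[i] for i in others)
    if others_actual <= 0:
        raise ValueError("no coded data outside the target area")
    # Budget left for the non-target area once the target area is boosted.
    cs_nroi = sum(actual) - sum(r * actual[i] for i in target)
    q = cs_nroi / others_actual
    new_satd = [r * s if i in target else q * s for i, s in enumerate(satd)]
    return new_satd, q
```

With actual amounts of 100, 100, 300 and 500, the first two blocks as the target area and r = 2, the remaining budget is 600 against an original 800, so q is 0.75 and the non-target SATD values shrink by a quarter.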

In this embodiment, through the above-mentioned method, the coded data amount and SATD value of each macroblock in the target area and in the other areas can be re-determined using the image quality improvement coefficient while keeping the actual coded data amount of the video frame unchanged. The re-determined coded data amount and SATD value can then be substituted into the mapping relationship determined in step S1, so as to calculate the re-determined quantization offset parameter of each macroblock. In this way, the macroblocks in the video frame can be encoded using the re-determined SATD values and quantization offset parameters, so as to obtain an encoded video frame that meets the expected code rate.
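The text does not specify how the mapping relationship is inverted to obtain the quantization offset parameter. One possible approach, sketched here under the assumption that the predicted coded data amount is continuous and monotonically decreasing in the quantization offset over a known search interval, is bisection. The names `solve_quant_offset` and `mapping`, and the search bounds, are hypothetical.

```python
from typing import Callable

# mapping(satd, quant_offset) -> predicted coded data amount of one macroblock.
Mapping = Callable[[float, float], float]


def solve_quant_offset(
    mapping: Mapping,
    satd: float,
    target_data: float,
    q_lo: float,
    q_hi: float,
    tol: float = 1e-6,
    max_iter: int = 100,
) -> float:
    """Find q in [q_lo, q_hi] with mapping(satd, q) close to target_data.

    Assumes mapping is continuous and decreasing in q (an assumption of this
    sketch, not a property stated in the source).
    """
    if mapping(satd, q_lo) < target_data or mapping(satd, q_hi) > target_data:
        raise ValueError("target coded data amount is not bracketed")
    for _ in range(max_iter):
        mid = (q_lo + q_hi) / 2.0
        diff = mapping(satd, mid) - target_data
        if abs(diff) <= tol:
            return mid
        if diff > 0:
            q_lo = mid  # still too much data: move toward larger offsets
        else:
            q_hi = mid  # too little data: move toward smaller offsets
    return (q_lo + q_hi) / 2.0
```

If the fitted mapping is not monotonic in the quantization offset, a bracketing search like this one may miss valid solutions, and a grid search over candidate offsets would be a safer alternative.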

It can be seen from the above that the mapping relationship established in this application can be represented by the model shown in FIG. 2. In FIG. 2, the SATD value and the quantization offset parameter can be used as input data to predict the amount of coded data of the macroblock. In addition, if the expected coded data amount of the macroblock is fixed and the SATD value of the macroblock is set, the actual quantization offset parameter of the macroblock can also be derived from the model. In this way, by performing model training on video frame samples in low code rate scenarios, a quantitative relationship between the amount of encoded data and the improvement of image quality can be established, and the code rate allocation of the ROI in the video can be realized dynamically.

Referring to FIG. 3, the present application also provides a code rate configuration system for a video frame, where the video frame is divided into multiple macroblocks; the system includes:

A mapping relationship establishing unit, configured to establish a mapping relationship between the amount of coded data of a macroblock and the configuration parameters of the macroblock;

An encoded data amount determining unit, configured to determine the actual coded data amount of each macroblock according to the mapping relationship and the initial configuration parameters of each macroblock in the video frame, wherein the actual coded data amounts of the macroblocks constitute the actual coded data amount of the video frame;

An encoding unit, configured to identify the target area in the video frame and, while keeping the actual amount of encoded data of the video frame unchanged, re-determine the coded data amount and configuration parameters of each macroblock in the video frame according to the image quality improvement coefficient of the target area, so as to encode the video frame using the re-determined configuration parameters of each macroblock.

In one embodiment, the configuration parameters include a residual transform value and a quantization offset parameter; the encoding unit includes:

A parameter re-determination module, configured to re-determine the amount of encoded data and the residual transform value of each macroblock in the video frame;

A quantization offset parameter calculation module, configured to calculate the adjusted quantization offset parameter of each macroblock according to the mapping relationship and the re-determined coded data amount and residual transform value of each macroblock, so as to encode the video frame using the re-determined residual transform value and the calculated quantization offset parameter of each macroblock.

In an embodiment, the mapping relationship establishment unit includes:

A sample processing module, configured to obtain video frame samples within a specified code rate range and divide the video frame samples into multiple macroblocks;

A training module, configured to identify the amount of encoded data, the residual transform value and the quantization offset parameter of each macroblock in the video frame samples, and to train a prediction model based on the identified coded data amounts, residual transform values and quantization offset parameters;

A relationship establishment module, configured to use the mathematical relationship between the input data and the output result of the prediction model as the mapping relationship between the coded data amount of a macroblock and the residual transform value and quantization offset parameter of the macroblock.

In one embodiment, the parameter re-determination unit includes:

The image quality improvement module is used to determine the target macroblock located in the target area in each macroblock of the video frame, and to multiply the actual coded data amount of the target macroblock and the image quality improvement coefficient , As the amount of coded data after the target macroblock is re-determined;

A difference calculation module, which is used to calculate the sum of the amount of encoded data after each of the target macroblocks is re-determined, and calculate the difference between the actual amount of encoded data of the video frame and the sum of the amount of encoded data;

The image quality adjustment module is used to determine the number of other macroblocks located outside the target area in the video frame, and to use the ratio of the difference to the determined number as the re-determined coded data amount of each of the other macroblocks.
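
The three modules above can be sketched as a short Python function. This is a minimal illustration under stated assumptions, not the implementation of this application: the function name `redistribute_data_amounts`, the list-based representation of a frame, and the error handling are hypothetical.

```python
from typing import List, Sequence


def redistribute_data_amounts(
    actual: Sequence[float],
    in_target: Sequence[bool],
    improvement: float,
) -> List[float]:
    """Re-determine per-macroblock coded data amounts with a fixed frame total.

    actual[i]    -- actual coded data amount of macroblock i
    in_target[i] -- True if macroblock i lies in the target area
    improvement  -- image quality improvement coefficient of the target area
    """
    if len(actual) != len(in_target):
        raise ValueError("actual and in_target must have the same length")
    frame_total = sum(actual)
    # Target macroblocks: actual amount times the improvement coefficient.
    boosted = [a * improvement for a, t in zip(actual, in_target) if t]
    # Difference between the frame total and the boosted target total.
    difference = frame_total - sum(boosted)
    n_other = len(actual) - len(boosted)
    if n_other == 0 or difference < 0:
        raise ValueError("frame total cannot be kept with this coefficient")
    # Every other macroblock receives an equal share of the difference.
    share = difference / n_other
    return [a * improvement if t else share for a, t in zip(actual, in_target)]
```

For example, with actual amounts 100, 100, 200 and 200, the first macroblock in the target area and a coefficient of 1.5, the target macroblock receives 150 and the remaining 450 is split evenly among the other three, so the frame total stays at 600.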

In an embodiment, the parameter re-determination unit further includes:

A residual transform value improvement module, configured to use the product of the initial residual transform value of the target macroblock and the image quality improvement coefficient as the re-determined residual transform value of the target macroblock;

The residual transform value adjustment module is configured to use the ratio of the calculated difference to the sum of the actual coded data amounts of the other macroblocks as the image quality adjustment coefficient of the other macroblocks, and to use the product of the initial residual transform value of each of the other macroblocks and the image quality adjustment coefficient as the re-determined residual transform value of that macroblock.
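
The residual transform value update described by these two modules can be sketched in the same way. The function name `redetermine_residuals` and the list-based inputs are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import List, Sequence


def redetermine_residuals(
    residuals: Sequence[float],
    actual: Sequence[float],
    in_target: Sequence[bool],
    improvement: float,
) -> List[float]:
    """Scale initial residual transform values per macroblock.

    Target macroblocks are scaled by the image quality improvement
    coefficient; the others by the image quality adjustment coefficient,
    i.e. the difference divided by the sum of their actual data amounts.
    """
    if not (len(residuals) == len(actual) == len(in_target)):
        raise ValueError("all inputs must have the same length")
    frame_total = sum(actual)
    boosted_total = sum(a * improvement for a, t in zip(actual, in_target) if t)
    difference = frame_total - boosted_total
    other_total = sum(a for a, t in zip(actual, in_target) if not t)
    if other_total <= 0:
        raise ValueError("no coded data outside the target area to adjust")
    adjustment = difference / other_total
    return [
        r * improvement if t else r * adjustment
        for r, t in zip(residuals, in_target)
    ]
```

With actual amounts 100, 100, 200 and 200, the first macroblock in the target area and a coefficient of 1.5, the difference is 450 and the other macroblocks total 500, so their residual transform values are scaled by 0.9.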

Referring to FIG. 4, this application also provides a code rate configuration device for a video frame. The device includes a processor and a memory, the memory is used to store a computer program, and when the computer program is executed by the processor, the above-mentioned method for configuring the code rate of a video frame is implemented.

In this embodiment, the memory may include a physical device for storing information, which is usually digitized and then stored in a medium using electrical, magnetic, or optical methods. The memory described in this embodiment may include: a device that uses electrical energy to store information, such as RAM or ROM; a device that uses magnetic energy to store information, such as a hard disk, floppy disk, magnetic tape, magnetic core memory, bubble memory, or USB flash drive; and a device that uses optical means to store information, such as a CD or DVD. Of course, there are other types of memory, such as quantum memory or graphene memory.

In this embodiment, the processor can be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (for example, software or firmware) executable by that (micro)processor, or the form of logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller.

The various embodiments in this specification are described in a progressive manner, and the same or similar parts between the various embodiments can be referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the implementation of the system and the device, both can be explained with reference to the introduction of the implementation of the foregoing method.

Those skilled in the art should understand that the embodiments of the present invention can be provided as a method, a system, or a computer program product. Therefore, the present invention may adopt a form of a complete hardware implementation, a complete software implementation, or a combination of software and hardware implementations. Moreover, the present invention may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present invention. It should be understood that each process and/or block in the flowchart and/or block diagram, and any combination of processes and/or blocks in the flowchart and/or block diagram, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps is executed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.

In a typical configuration, the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include non-permanent memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by computing devices. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

It should also be noted that the terms "include", "comprise" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or equipment including a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, commodity or equipment. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, commodity, or equipment that includes the element.

The above are only the implementation manners of this application, and are not intended to limit this application. For those skilled in the art, this application can have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of this application shall be included in the scope of the claims of this application.

Claims

A method for configuring the bit rate of a video frame, wherein the video frame is divided into a plurality of macro blocks; the method includes:

Establish a mapping relationship between the amount of coded data of the macro block and the configuration parameters of the macro block;

According to the mapping relationship and the initial configuration parameters of each macroblock in the video frame, determine the actual coded data amount of each macroblock; wherein the actual coded data amounts of the macroblocks together constitute the actual coded data amount of the video frame;

Identify the target area in the video frame and, while keeping the actual coded data amount of the video frame unchanged, re-determine the coded data amount and configuration parameters of each macroblock in the video frame according to the image quality improvement coefficient of the target area, so as to encode the video frame with the re-determined configuration parameters of each macroblock.
The method according to claim 1, wherein the configuration parameters include a residual transform value and a quantization offset parameter; and re-determining the amount of coded data and configuration parameters of each of the macroblocks in the video frame comprises:

Re-determine the amount of encoded data and the residual transform value of each macroblock in the video frame;

According to the mapping relationship and the re-determined coded data amount and residual transform value of each macroblock, calculate the adjusted quantization offset parameter of each macroblock, so as to encode the video frame using the re-determined residual transform value and the calculated quantization offset parameter of each macroblock.
The method according to claim 2, wherein the mapping relationship is established in the following manner:

Obtaining video frame samples within a specified bit rate range, and dividing the video frame samples into multiple macro blocks;

Identifying the amount of encoded data, residual transformation value, and quantization offset parameter of each macroblock in the video frame sample, and training to obtain a prediction model based on the encoded data amount, residual transformation value, and quantization offset parameter obtained by the identification;

The mathematical relationship between the input data and the output result of the prediction model is used as the established mapping relationship between the amount of encoded data of the macro block and the residual transform value of the macro block and the quantization offset parameter.
The method according to claim 2, wherein in the mapping relationship, the residual transform value of the macroblock and the quantization bias parameter are used as the independent variables of the kernel function, and the kernel function has linear coefficients, and The mapping relationship also includes undetermined constants.
The method according to claim 4, wherein when establishing the mapping relationship, the method further comprises:

The expression characterizing the mapping relationship is converted into a minimum value problem with constraint conditions, wherein the expression characterizing the minimum value problem and the expression characterizing the constraint conditions both contain slack variables, and the expression characterizing the minimum value problem also includes a penalty factor;

A Lagrangian multiplier is introduced into the minimum value problem to convert it into a dual problem, and the values of the linear coefficients and the undetermined constant are obtained by solving the dual problem.
The method according to claim 2, wherein the amount of coded data of each macroblock in the video frame is re-determined in the following manner:

Determine, among the macroblocks of the video frame, the target macroblocks located in the target area, and use the product of the actual coded data amount of each target macroblock and the image quality improvement coefficient as the re-determined coded data amount of that target macroblock;

Calculating the sum of the coded data amount after each of the target macroblocks is re-determined, and calculating the difference between the actual coded data amount of the video frame and the sum of the coded data amount;

The number of other macroblocks located outside the target area in the video frame is determined, and the ratio between the difference and the determined number is used as the re-determined coded data amount of the other macroblocks.
The method according to claim 6, wherein the residual transform value of each macroblock in the video frame is re-determined in the following manner:

Taking the product of the initial residual transformation value of the target macroblock and the image quality improvement coefficient as the residual transformation value after the target macroblock is re-determined;

The ratio of the calculated difference to the sum of the actual coded data amounts of the other macroblocks is used as the image quality adjustment coefficient of the other macroblocks, and the product of the initial residual transform value of each of the other macroblocks and the image quality adjustment coefficient is used as the re-determined residual transform value of that macroblock.
A code rate configuration system for a video frame, wherein the video frame is divided into a plurality of macro blocks; the system includes:

A mapping relationship establishing unit for establishing a mapping relationship between the amount of coded data of the macro block and the configuration parameters of the macro block;

The coded data amount determining unit is configured to determine the actual coded data amount of each macroblock according to the mapping relationship and the initial configuration parameters of each macroblock in the video frame; wherein the actual coded data amounts of the macroblocks together constitute the actual coded data amount of the video frame;

The encoding unit is used to identify the target area in the video frame and, while keeping the actual coded data amount of the video frame unchanged, to re-determine the coded data amount and configuration parameters of each macroblock in the video frame according to the image quality improvement coefficient of the target area, so as to encode the video frame with the re-determined configuration parameters of each macroblock.
The system according to claim 8, wherein the configuration parameter includes a residual transformation value and a quantization offset parameter; and the coding unit includes:

A parameter re-determination module, configured to re-determine the amount of encoded data and the residual transformation value of each macroblock in the video frame;

The quantization offset parameter calculation module is used to calculate the adjusted quantization offset parameter of each macroblock according to the mapping relationship and the re-determined coded data amount and residual transform value of each macroblock, so that the video frame is encoded using the re-determined residual transform value and the calculated quantization offset parameter of each macroblock.
The system according to claim 9, wherein the mapping relationship establishment unit comprises:

The sample processing module is used to obtain video frame samples within a specified code rate range, and divide the video frame samples into multiple macroblocks;

The training module is used to identify the coded data amount, the residual transform value and the quantization offset parameter of each macroblock in the video frame samples, and to train a prediction model based on the identified coded data amounts, residual transform values and quantization offset parameters;

The relationship establishment module is used to take the mathematical relationship between the input data and the output result of the prediction model as the mapping relationship between the coded data amount of a macroblock and the residual transform value and quantization offset parameter of that macroblock.
The system according to claim 9, wherein the parameter re-determination unit comprises:

The image quality improvement module is used to determine, among the macroblocks of the video frame, the target macroblocks located in the target area, and to use the product of the actual coded data amount of each target macroblock and the image quality improvement coefficient as the re-determined coded data amount of that target macroblock;

A difference calculation module, which is used to calculate the sum of the amount of encoded data after each of the target macroblocks is re-determined, and calculate the difference between the actual amount of encoded data of the video frame and the sum of the amount of encoded data;

The image quality adjustment module is used to determine the number of other macroblocks located outside the target area in the video frame, and to use the ratio of the difference to the determined number as the re-determined coded data amount of each of the other macroblocks.
The system according to claim 11, wherein the parameter re-determining unit further comprises:

A residual transform value improvement module, configured to use the product of the initial residual transform value of the target macroblock and the image quality improvement coefficient as the re-determined residual transform value of the target macroblock;

The residual transform value adjustment module is configured to use the ratio of the calculated difference to the sum of the actual coded data amounts of the other macroblocks as the image quality adjustment coefficient of the other macroblocks, and to use the product of the initial residual transform value of each of the other macroblocks and the image quality adjustment coefficient as the re-determined residual transform value of that macroblock.
A code rate configuration device for a video frame, wherein the device includes a processor and a memory, the memory is used to store a computer program, and when the computer program is executed by the processor, it implements the method according to any one of claims 1 to 7.