WO2021093060A1

WO2021093060A1 - Video encoding method, system, and apparatus

Info

Publication number: WO2021093060A1
Application number: PCT/CN2019/123710
Authority: WO
Inventors: 黄学辉
Original assignee: 网宿科技股份有限公司
Priority date: 2019-11-15
Filing date: 2019-12-06
Publication date: 2021-05-20
Also published as: CN110996099B; CN110996099A

Abstract

Disclosed are a video encoding method, a system, and an apparatus. The method comprises: acquiring, according to a current frame and an encoding order of the video, a current image set from a video to be encoded; sequentially performing, in reverse order to the encoding order, reverse updating on a region of interest in each video frame in the current image set; and calculating quantization parameters for coding units in the updated region of interest and a non-region of interest in the current frame, respectively, and encoding the coding units in the current frame according to the calculated quantization parameters. The technical solution provided in the present application enhances the accuracy of video encoding.

Description

Video coding method, system and equipment

Technical field

The present invention relates to the technical field of image processing, in particular to a video coding method, system and equipment.

Background art

With the continuous development of high-definition video, the bandwidth required to transmit it keeps increasing. To control bandwidth cost, different areas of a video frame are currently assigned different bit rates when high-definition video is encoded, so that the clarity of key areas is increased as much as possible while the total bit rate remains unchanged.

To achieve this, the area of interest to the human eye must be identified in each video frame during encoding. More bit rate can then be allocated to that area, and less to the rest of the frame. In this coding method, the same quantization parameter is usually used for the whole area of interest. In fact, not every position in that area receives the viewer's attention, so the current encoding method is still not accurate enough.

Summary of the invention

The purpose of this application is to provide a video encoding method, system, and equipment, which can improve the accuracy of video encoding.

In order to achieve the above objective, one aspect of the present application provides a video encoding method. A video frame of the video to be encoded includes a region of interest and a region of non-interest. The method includes: obtaining a current image set from the video to be encoded according to the current frame in the video to be encoded and the encoding order of the video to be encoded, the current image set including the current frame and a specified number of video frames located after the current frame in the encoding order; reversely updating, in sequence and in the reverse of the encoding order, the region of interest in each video frame in the current image set; and calculating quantization parameters separately for the basic coding units in the updated region of interest and in the region of non-interest of the current frame, so as to encode each basic coding unit in the current frame according to the calculated quantization parameters.

In order to achieve the above objective, another aspect of the present application provides a video encoding system. A video frame of the video to be encoded includes a region of interest and a region of non-interest. The system includes: a current image set acquisition unit, configured to obtain a current image set from the video to be encoded according to the current frame in the video to be encoded and the encoding order of the video to be encoded, the current image set including the current frame and a specified number of video frames located after the current frame in the encoding order; a reverse update unit, configured to reversely update, in sequence and in the reverse of the encoding order, the region of interest in each video frame in the current image set; and a quantization coding unit, configured to calculate quantization parameters separately for the basic coding units in the updated region of interest and in the region of non-interest of the current frame, so as to encode each basic coding unit in the current frame according to the calculated quantization parameters.

In order to achieve the foregoing objective, another aspect of the present application provides a video encoding device. The device includes a processor and a memory. The memory is used to store a computer program, and when the computer program is executed by the processor, the foregoing video encoding method is implemented.

It can be seen from the above that, in the technical solutions provided by one or more implementations of the present application, the current frame can be analyzed together with multiple video frames located after it in the encoding order when it is encoded. Specifically, starting from the last video frame, the region of interest in each video frame may be updated in the reverse of the encoding order. In this way, if the region of interest changes abruptly at the same position in temporally adjacent video frames, the reverse update can reduce or eliminate the abrupt change, so that the regions of interest of adjacent frames do not look too abrupt after encoding. After the region of interest is updated in the reverse direction, a quantization parameter can be calculated for each basic coding unit in the current frame. When the current frame is subsequently encoded, different code rates can be assigned to different basic coding units, so that the code rate is allocated more finely within a video frame and the accuracy of video encoding is higher.

Description of the drawings

In order to more clearly explain the technical solutions in the embodiments of the present invention, the following will briefly introduce the drawings needed in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

Fig. 1 is a flowchart of the steps of a video encoding method in an embodiment of the present invention;

Fig. 2 is a schematic diagram of functional modules of a video encoding system in an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a video encoding device in an embodiment of the present invention.

Detailed description

In order to make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions of the present application will be described clearly and completely in conjunction with the specific embodiments of the present application and the corresponding drawings. Obviously, the described implementations are only a part of the implementations of the present application, rather than all of the implementations. Based on the implementation manners in this application, all other implementation manners obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

An embodiment of the present application provides a video encoding method that can encode each video frame in the video to be encoded. In practical applications, each video frame in the video to be encoded may have its own saliency image. The saliency image may be a grayscale image whose width (Width, W) and height (Height, H) are the same as those of the video frame. In the saliency image, the pixel value of each pixel may lie in the interval [0, 1] and can be used as the saliency value of the corresponding pixel in the video frame. The larger the pixel value, the more salient the corresponding pixel in the video frame. Specifically, each video frame in the video to be encoded can be processed by a visual saliency detection algorithm to obtain its saliency image. In this embodiment, the saliency value of each pixel in the saliency image can be represented by S(i, j). It follows from the above that S(i, j) satisfies the following conditions:

0≤S(i,j)≤1,0≤i＜W,0≤j＜H

Among them, W represents the width of the saliency image, and H represents the height of the saliency image.

In addition, each video frame in the video to be encoded can be divided into a region of interest and a region of non-interest. The region of interest is the area that the user pays attention to when watching the video, while the region of non-interest is the area that the user pays little attention to. The region of interest can be identified by a variety of algorithms, and this application does not limit how it is identified.

In an embodiment, the above-mentioned video encoding method may include multiple steps as shown in FIG. 1.

S1: Obtain a current image set from the video to be encoded according to the current frame in the video to be encoded and the encoding order of the video to be encoded. The current image set includes the current frame and a specified number of video frames located after the current frame in the encoding order.

In this embodiment, each video frame in the video to be encoded and its corresponding saliency image can be processed as input data. Specifically, the video frame currently being processed is used as the current frame. To encode the current frame accurately, it can be analyzed together with multiple video frames located after it in the encoding order. For example, given a preconfigured number N, the current frame and the N video frames located after it in the encoding order may be taken from the video to be encoded as the current image set. The video frames in the current image set can be numbered in encoding order and denoted as:

P_0, P_1, ..., P_N

where P_0 may be the current frame and P_N may be the last frame of the current image set in the encoding order.

In this embodiment, the saliency image of the t-th video frame in the current image set can be denoted as S_t(i, j), so that S_t(i, j) corresponds to P_t.
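Step S1 can be illustrated with a minimal sketch. The function name `current_image_set` and its parameters are hypothetical, and truncating the set near the end of the video is an assumption that the text does not address.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def current_image_set(frames_in_coding_order: Sequence[Frame],
                      current_index: int,
                      lookahead: int) -> List[Frame]:
    """Return [P_0, ..., P_N]: the current frame plus up to `lookahead`
    frames that follow it in coding order (N = lookahead)."""
    if not 0 <= current_index < len(frames_in_coding_order):
        raise IndexError("current_index is outside the video")
    if lookahead < 0:
        raise ValueError("lookahead must be non-negative")
    # Assumption: near the end of the video the set is simply shorter.
    end = min(current_index + lookahead + 1, len(frames_in_coding_order))
    return list(frames_in_coding_order[current_index:end])
```

With five frames and N = 2, calling the function for the second frame returns that frame and the two frames after it.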

S3: Reversely update the regions of interest in each video frame in the current image set in a reverse order that is opposite to the encoding order.

In this embodiment, the saliency value at the same position may change abruptly between video frames that are adjacent in encoding order, which can cause an abrupt change in the region of interest. After encoding, such abrupt changes may be visually jarring to users. To eliminate this effect, the region of interest can be updated in the reverse direction so that the saliency values at the same position in adjacent video frames are more consistent.

Specifically, each video frame in the current image set can be divided into several basic coding units. In practical applications, the basic coding units may be rectangles of the same size, and their size may differ between coding methods. For example, for an H.264 encoder, the basic coding unit may be a 16*16 macroblock. For an HEVC encoder, the basic coding unit may be the smallest HEVC coding unit allowed by the encoder, whose typical size may be 8*8. The width and height of the basic coding unit can be denoted as W_bu and H_bu, respectively.

In this embodiment, after each video frame is divided according to basic coding units, each basic coding unit may be further divided into multiple blocks during the coding process. Specifically, according to the coding sequence, the coding mode can be selected for each video frame in the current image set, with the coding basic unit as a unit. Wherein, the coding mode may include an inter prediction mode and an intra prediction mode. The intra-frame prediction mode may include information such as the division of intra-frame blocks and the direction of intra-frame prediction, and the inter-frame prediction mode may include the division of inter-frame blocks, the selection of the reference frame of the block, and the motion vector of the block.

Of course, in practical applications, the selection of the coding mode can be simplified appropriately to reduce the amount of computation. For example, the inter-frame motion search may use integer-pixel accuracy only. As another example, intra-frame prediction may use only the horizontal, vertical, and 45-degree directions, and the selection process may skip entropy coding and quantization.

In this embodiment, the optimal prediction mode of each basic coding unit in the video frame can be recorded. Whether the optimal prediction mode is inter-frame prediction or intra-frame prediction can be determined according to the difference between the original value and the predicted value in different prediction modes, and the amount of additional information that needs to be recorded. If the original value of the block in the coding basic unit is closer to the predicted value in a certain prediction mode, and the amount of information that needs to be recorded is less, then the prediction mode can be used as the optimal prediction mode. Finally, by separately detecting the inter-frame prediction mode and the intra-frame prediction mode, the optimal prediction mode of each coding basic unit in the video frame can be determined.

In this embodiment, after the optimal prediction mode of the basic coding unit is determined, the complexity of each block in the basic coding unit can also be calculated. This complexity is used in the subsequent reverse update of the region of interest. Specifically, the complexity of a block can be its Sum of Absolute Transformed Differences (SATD). To calculate the SATD value, the original pixel values of the block and the predicted pixel values obtained with the optimal prediction mode are obtained first. Note that both the original pixel values and the predicted pixel values are pixel matrices rather than single pixel values. The predicted pixel values are then subtracted from the original pixel values, and a Hadamard transform is applied to the result to obtain a transformed pixel matrix. Finally, the absolute values of the elements of the transformed pixel matrix are summed to give the SATD value of the block, which is used as the complexity of the block.
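The SATD calculation described above can be sketched as follows. This is a minimal illustration in pure Python, assuming square blocks whose side is a power of two and an unnormalized Hadamard matrix. Real encoders often apply a normalization factor, which is omitted here. The function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def hadamard(n: int) -> Matrix:
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def satd(original: Matrix, predicted: Matrix) -> int:
    """Sum of absolute values of the Hadamard-transformed residual."""
    n = len(original)
    residual = [[o - p for o, p in zip(ro, rp)]
                for ro, rp in zip(original, predicted)]
    h = hadamard(n)
    # The Sylvester matrix is symmetric, so H * R * H equals H * R * H^T.
    transformed = matmul(matmul(h, residual), h)
    return sum(abs(v) for row in transformed for v in row)
```

A block that is predicted perfectly has an SATD of 0, and the value grows as the prediction residual grows.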

In this embodiment, after the complexity of each block is calculated, the region of interest can be updated in the reverse direction. First, the order of the reverse update is determined, which is the reverse of the encoding order. For example, if the encoding order is from P_0 to P_N, the reverse update of the region of interest proceeds from P_N to P_0. The following takes the first video frame to undergo the reverse update, P_N, as an example to illustrate the process.

First, the basic coding units in the region of interest of the video frame can be identified, and the same operation is performed on each block in those basic coding units. Each basic coding unit in the region of interest of the video frame is traversed, and for any block in the current basic coding unit, the optimal prediction mode adopted by the block is identified first. The optimal prediction mode adopted by the block may be consistent with that of the basic coding unit to which the block belongs. If the optimal prediction mode is an intra-frame prediction mode, the block need not be processed, and processing jumps directly to the next block. If the optimal prediction mode is an inter-frame prediction mode, the reference block corresponding to the block can be found in the reference frame of the video frame according to the information recorded in the inter-frame prediction mode. The reference frame may be a video frame located before the video frame P_N in the current image set. If the reference block does not belong to the region of interest in the reference frame, the reference block may be set as part of the region of interest. After the same processing is performed on each block in the region of interest, the reverse update of the region of interest toward the reference frame is complete.

In this embodiment, after the region of interest is updated in the reverse direction, a saliency value also needs to be reset for each pixel in the updated region of interest. Specifically, the saliency value can be reset for the updated region of interest in the reference frame by passing saliency values backward. When a saliency value is passed backward, a transfer coefficient can be determined from the complexity of the block in the video frame P_N, and the product of the transfer coefficient and the saliency value of a pixel in the block is used as the saliency value to be transferred. The transfer coefficient can be obtained by substituting the complexity of the block into a preset transfer function. Because the saliency value passed to the reference block should not exceed the saliency value of the block itself, the transfer function can be a function whose result lies in the interval [0, 1]. In addition, a relatively large block complexity means that the similarity between the block and the reference block is relatively low. In that case, so as not to affect the authenticity of the reference block excessively, the saliency value transferred to the reference block should also be relatively small. The transfer function should therefore be a decreasing function of complexity. In summary, the preset transfer function φ(x) should meet the following conditions:

(1) 0≤φ(x)≤1

(2) For any 0≤a＜b, satisfy φ(a)≥φ(b)
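The text does not give a concrete transfer function. One function that satisfies both conditions is φ(x) = 1 / (1 + x / c), sketched below. The function itself and the constant `scale` (c) are assumptions chosen only for illustration.

```python
def transfer_coefficient(complexity: float, scale: float = 256.0) -> float:
    """Hypothetical transfer function phi(x) = 1 / (1 + x / scale).

    For x >= 0 the result lies in (0, 1] and never increases with x,
    so both conditions on phi are met.
    """
    if complexity < 0:
        raise ValueError("complexity must be non-negative")
    if scale <= 0:
        raise ValueError("scale must be positive")
    return 1.0 / (1.0 + complexity / scale)
```

A block with zero complexity passes its saliency on unchanged, while a block whose complexity equals `scale` passes on half of it.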

In this embodiment, the transfer coefficient is obtained by substituting the complexity of the block into the above transfer function. Multiplying the transfer coefficient by the saliency value of a pixel in the block gives the saliency value to be transferred, which can then be used as the saliency value of the corresponding pixel in the reference block. Updating the region of interest in the reverse direction and resetting the saliency values of the updated region of interest in the reference frame in this way reduces abrupt changes in the picture between the block and the reference block.

It can be seen from the above that, when the saliency value is reset, for a target block that participates in the reverse update of the region of interest in a target video frame, the transfer coefficient can be determined from the complexity of the target block, and the product of the transfer coefficient and the saliency value of a pixel in the target block is used as the saliency value to be transferred. The reference block of the target block may then be determined in the reference frame, and the saliency value of the corresponding pixel in the reference block may be replaced with the saliency value to be transferred. The above takes only the video frame P_N as an example to illustrate the reverse update of the region of interest. Those skilled in the art will understand that the same method can be applied to any target video frame in the current image set, so that the saliency image of the current frame is finally updated as well.

It should be noted that if the inter-frame prediction mode adopted by the video frame P_N uses two or more reference frames, the above processing can be performed for each reference frame, which will not be repeated here.
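The reverse update as a whole can be sketched as follows, under several simplifying assumptions that are not part of the text: saliency is stored once per block rather than per pixel, reference blocks are aligned to the block grid, only blocks newly added to the region of interest receive the transferred saliency, and the transfer function is the hypothetical φ(x) = 1 / (1 + x / 256). All type and function names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

BlockId = Tuple[int, int]  # hypothetical block address: (row, column) on a grid


@dataclass
class BlockInfo:
    mode: str                 # "intra" or "inter" (optimal prediction mode)
    complexity: float         # e.g. the SATD of the block
    refs: List[Tuple[int, BlockId]] = field(default_factory=list)  # (frame, block)


@dataclass
class FrameState:
    roi: Set[BlockId]                 # blocks in the region of interest
    saliency: Dict[BlockId, float]    # one value per block (simplification)
    blocks: Dict[BlockId, BlockInfo]


def phi(complexity: float, scale: float = 256.0) -> float:
    # Assumed transfer function: result in (0, 1], non-increasing in complexity.
    return 1.0 / (1.0 + complexity / scale)


def reverse_update(frames: List[FrameState]) -> None:
    """Walk from P_N down to P_1 and push the region of interest backward."""
    for t in range(len(frames) - 1, 0, -1):
        frame = frames[t]
        for block_id in sorted(frame.roi):
            info = frame.blocks[block_id]
            if info.mode != "inter":
                continue  # intra-predicted blocks are skipped
            passed = phi(info.complexity) * frame.saliency[block_id]
            for ref_t, ref_block in info.refs:
                if not 0 <= ref_t < t:
                    continue  # only earlier frames inside the current image set
                ref = frames[ref_t]
                if ref_block not in ref.roi:
                    ref.roi.add(ref_block)
                    ref.saliency[ref_block] = passed
```

Because the frames are processed from last to first, a region of interest that first appears in P_N can propagate through several frames until it reaches the current frame P_0, with its saliency attenuated at each step.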

S5: Calculate quantization parameters separately for the basic coding units in the updated region of interest and in the region of non-interest of the current frame, so as to encode each basic coding unit in the current frame according to the calculated quantization parameters.

In this embodiment, after the reverse update, the regions of interest in the video frames of the current image set may all have been updated in units of blocks. Because encoding works in units of basic coding units, the region of interest in the current frame can be adjusted to basic coding unit boundaries, and, to simplify subsequent data processing, a representative saliency value can be set for each basic coding unit in the adjusted region of interest.

Specifically, the way the region of interest is adjusted to basic coding units can be set flexibly according to the application scenario. For example, for any basic coding unit in the current frame, if the basic coding unit contains at least one block belonging to the region of interest, the basic coding unit may be set as part of the region of interest. As another example, a basic coding unit may be set as part of the region of interest only if every block in it belongs to the region of interest. The region of interest obtained in the second way is smaller than that obtained in the first way. An appropriate adjustment method can be selected according to the required processing accuracy or the computing capability of the device.
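The two adjustment rules can be sketched as a single helper. The function name and the `policy` values are hypothetical labels for the two examples in the text.

```python
from typing import Iterable


def unit_in_roi(block_flags: Iterable[bool], policy: str = "any") -> bool:
    """Decide whether a basic coding unit belongs to the adjusted ROI.

    block_flags holds one flag per block in the unit (True = block is in the ROI).
    "any": the unit joins the ROI if at least one block is in the ROI.
    "all": the unit joins the ROI only if every block is in the ROI.
    """
    flags = list(block_flags)
    if policy == "any":
        return any(flags)
    if policy == "all":
        return bool(flags) and all(flags)
    raise ValueError("policy must be 'any' or 'all'")
```

The "all" policy never selects a unit that "any" rejects, which is why it yields the smaller region of interest.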

After the region of interest in the current frame is adjusted, the representative saliency value of each basic coding unit in the adjusted region of interest can be determined. The representative saliency value can be determined flexibly according to the application scenario and the required accuracy. For example, in one embodiment, the largest saliency value in the current basic coding unit may be used as its representative saliency value. In another embodiment, the saliency value of each pixel in the current basic coding unit can be read, mean filtering or median filtering can be applied to these values, and the result can be used as the representative saliency value. In practical applications, the saliency value of each pixel can be mean filtered according to the following formula:

where S'(x, y) represents the saliency value after mean filtering, and W_bu and H_bu represent the width and height of the basic coding unit, respectively.

Of course, there are other ways to determine the representative saliency value of a basic coding unit in practical applications, which are not listed one by one here.
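The three options mentioned above (maximum, mean, median) can be sketched as follows. The mean filtering formula itself is not reproduced in this text, so the sketch assumes the simplest reading: the mean or median is taken over all W_bu × H_bu pixels of the unit. The function name and the `method` values are hypothetical.

```python
from statistics import mean, median
from typing import List


def representative_saliency(unit_pixels: List[List[float]],
                            method: str = "max") -> float:
    """Reduce the per-pixel saliency values of one basic coding unit to one value."""
    values = [v for row in unit_pixels for v in row]
    if not values:
        raise ValueError("the unit contains no pixels")
    if method == "max":
        return max(values)
    if method == "mean":
        # Assumption: plain average over the whole unit.
        return mean(values)
    if method == "median":
        return median(values)
    raise ValueError("method must be 'max', 'mean' or 'median'")
```

The maximum favors units with any highly salient pixel, while the mean and median are less sensitive to isolated peaks.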

In this embodiment, after the representative saliency value of each basic coding unit in the updated region of interest of the current frame is determined, it can be used to calculate the code rate adjustment coefficient of each basic coding unit in the updated region of interest. The code rate adjustment coefficient indicates how much the code rate of the basic coding unit is increased during encoding. The larger the code rate adjustment coefficient, the more code rate is allocated to the corresponding basic coding unit. In practical applications, the code rate adjustment coefficient may be greater than or equal to 1, with the representative saliency value of the basic coding unit as the independent variable. The larger the representative saliency value, the larger the corresponding code rate adjustment coefficient. Specifically, in one embodiment, the code rate adjustment coefficient may be determined according to the following formula:

g(s) = k·s^n + 1

where g(s) represents the code rate adjustment coefficient, s represents the representative saliency value of a given basic coding unit, and k and n are user-defined constants.

In addition, in another embodiment, the code rate adjustment coefficient can also be determined according to the following formula:

g(s) = k^s

Of course, as long as the calculation conforms to the above settings, the code rate adjustment coefficient can be calculated in other ways in practical applications, which are not listed one by one here. The code rate adjustment coefficient of the i-th basic coding unit in the current frame can be represented by g_i.
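Both formulas can be sketched directly. The default constants below are arbitrary illustrative choices. The checks on k and n are assumptions added so that the result stays at or above 1 and grows with s, as the text requires.

```python
def g_polynomial(s: float, k: float = 1.0, n: float = 2.0) -> float:
    """g(s) = k * s**n + 1, for a representative saliency value s in [0, 1]."""
    if k < 0 or n <= 0:
        raise ValueError("k must be >= 0 and n > 0 so that g is >= 1 and non-decreasing")
    return k * s ** n + 1.0


def g_exponential(s: float, k: float = 2.0) -> float:
    """g(s) = k**s, for a representative saliency value s in [0, 1]."""
    if k < 1:
        raise ValueError("k must be >= 1 so that g is >= 1 and non-decreasing")
    return k ** s
```

With these defaults, a unit with s = 0 keeps its code rate (g = 1), and a unit with s = 1 is given twice the weight (g = 2) under either formula.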

In this embodiment, it is also possible to calculate the scaling complexity of each coding basic unit in the region of interest after the current frame is updated. The scaling complexity can be determined according to the original complexity of each basic coding unit. Specifically, if the original encoder uses the same quantization parameter for each basic coding unit in the current frame, the scaling complexity of each basic coding unit may be equal to the original complexity. If the original encoder sets different quantization parameters for each basic coding unit in the current frame, then the average quantization parameter of the current frame may be calculated first according to the original quantization parameter of each basic coding unit in the current frame. Then, for any basic coding unit in the current frame, it can be calculated according to the average quantization parameter of the current frame, the complexity of each block in the basic coding unit, and the original quantization parameter of the basic coding unit The scaling complexity of the coding basic unit. In a specific application scenario, the calculation formula of the average quantization parameter can be as follows:

$$QP_{avg} = \frac{1}{NBU} \sum_{i=1}^{NBU} QP_i$$

where QP_avg represents the average quantization parameter of the current frame, QP_i represents the original quantization parameter of the i-th basic coding unit in the current frame, and NBU represents the total number of basic coding units in the current frame.

In the specific implementation process, for coding standards (such as H.264 and HEVC) in which the quantization step size is an exponential function of the quantization parameter with base 2, the following formula can be used to calculate the scaling complexity:

where FS_i represents the scaling complexity of the i-th basic coding unit, F_i(j) represents the complexity of the j-th block in the i-th basic coding unit, and NPU_i represents the total number of blocks in the i-th basic coding unit.

For coding standards where the quantization parameter is equal to the quantization step size, the following formula can be used to calculate the scaling complexity:
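The scaling complexity formulas are not reproduced in this text, so the following is only a sketch under stated assumptions. It assumes that the bits spent on a unit are roughly proportional to its complexity divided by its quantization step size, that the step size doubles for every increase of 6 in the quantization parameter (as in H.264 and HEVC), and that the scaling is taken relative to the frame average. The exact form in the original may differ. All function names are hypothetical.

```python
from typing import Sequence


def average_qp(qps: Sequence[float]) -> float:
    """QP_avg: arithmetic mean of the original QPs of all basic coding units."""
    return sum(qps) / len(qps)


def scaled_complexity(block_complexities: Sequence[float],
                      qp: float,
                      qp_avg: float,
                      exponential: bool = True) -> float:
    """Assumed FS_i: the summed block complexities F_i(j), rescaled by how far
    the unit's original QP lies from the frame average."""
    total = sum(block_complexities)
    if exponential:
        # Assumption: step size doubles every 6 QP (H.264/HEVC style).
        return total * 2.0 ** ((qp_avg - qp) / 6.0)
    # Assumption: QP equals the step size, so the scaling is a plain ratio.
    return total * qp_avg / qp
```

In both branches a unit whose original quantization parameter equals the frame average keeps its original complexity, which matches the statement that the scaling complexity equals the original complexity when all units share one quantization parameter.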

In one embodiment, to prevent the code rate of the non-interest region from becoming so low that the coded picture appears blurred, a code rate protection coefficient B needs to be set for the non-interest region in actual implementation. The protection coefficient represents the degree to which the code rate of a video frame can be transferred from the non-interest region to the region of interest. The protection coefficient B may be a number greater than 0 and less than 1. The larger the protection coefficient, the less code rate is transferred from the non-interest region to the region of interest; that is, the less the code rate of the non-interest region is reduced and, correspondingly, the less additional code rate is available to the region of interest.

In the specific implementation process, because of the protection coefficient B of the non-interest region, the code rate transferred from the non-interest region to the region of interest may be restricted when the output bit rate stays the same. The restriction can be represented by a value, referred to as the code rate adjustment completion degree. Specifically, the code rate adjustment completion degree P of the current frame can be calculated from the protection coefficient, the scaling complexity of each basic coding unit in the current frame, and the code rate adjustment coefficient of each basic coding unit in the updated region of interest.

In one embodiment, when the code rate adjustment completion degree of the current frame is calculated, the sum of the scaling complexities of the basic coding units in the updated non-interest region of the current frame may be calculated first, followed by the first product of that sum and the protection coefficient. Then, for each basic coding unit in the updated region of interest, the second product of its scaling complexity and its code rate adjustment coefficient can be calculated, and the second products can be summed. Finally, the difference between the sum of the scaling complexities of all basic coding units of the current frame and the first product can be calculated, and the ratio of this difference to the sum of the second products can be used as the code rate adjustment completion degree of the current frame.

The above process can be expressed by the following formula:

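$$P = \frac{\sum_{i} FS_i - B \cdot \sum_{i \in NROI} FS_i}{\sum_{i \in ROI} FS_i \cdot g_i}$$

where the first sum runs over all basic coding units of the current frame, NROI represents the region of non-interest in the current frame, and ROI represents the region of interest in the current frame.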
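The calculation of P follows directly from the verbal description above and can be sketched as follows. The function and parameter names are hypothetical.

```python
from typing import Sequence


def rate_adjustment_completion(fs_roi: Sequence[float],
                               g_roi: Sequence[float],
                               fs_nroi: Sequence[float],
                               protection: float) -> float:
    """P = (sum of all FS_i - B * sum of FS_i over NROI) / sum of FS_i * g_i over ROI."""
    if not 0.0 < protection < 1.0:
        raise ValueError("the protection coefficient B must lie in (0, 1)")
    if len(fs_roi) != len(g_roi) or not fs_roi:
        raise ValueError("need one adjustment coefficient per ROI unit")
    total = sum(fs_roi) + sum(fs_nroi)
    first_product = protection * sum(fs_nroi)
    second_products = sum(fs * g for fs, g in zip(fs_roi, g_roi))
    return (total - first_product) / second_products
```

For one ROI unit and one non-ROI unit of equal scaling complexity, with g = 2 and B = 0.5, the result is 0.75: the region of interest can receive only three quarters of the code rate it asks for.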

If the code rate adjustment completion degree calculated by the above formula is less than 1, the code rate transferred from the non-interest region to the region of interest in the current frame is restricted. For example, P = 0.8 means that, because of the code rate protection coefficient of the non-interest region, the code rate transferred from the non-interest region to the region of interest in the current frame is only 80% of the expected amount. If the code rate adjustment completion degree is greater than or equal to 1, the code rate transferred from the non-interest region to the region of interest according to the code rate adjustment coefficient of each basic coding unit is not limited, which is the expected situation. With this method, the code rate adjustment completion degree of the region of interest may differ between video frames in a specific implementation, so that the code rate of each video frame can be allocated accurately.

In this embodiment, the quantization parameter of each basic coding unit in the updated region of interest of the current frame can be calculated from the code rate adjustment completion degree, the code rate adjustment coefficient of each basic coding unit in the updated region of interest, and the original quantization parameter of each basic coding unit in the updated region of interest. The calculation may differ between coding standards.

Specifically, for coding standards in which the quantization step size is an exponential function of the quantization parameter with base 2 (such as the H.264 and HEVC coding standards), the quantization parameter of each coding basic unit can be recalculated according to the following formula:
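The formula is not reproduced in this text. A reconstruction consistent with the explanation that follows the formulas, and with the non-interest-region formula $QP'_i = QP_i - 6\log_2 D$ given later in the description, would be as shown here. The symbol $r_i$ for the code rate adjustment coefficient of the i-th coding basic unit is assumed notation, and the threshold is taken to be 1.

```latex
QP'_i =
\begin{cases}
QP_i - 6\log_2\!\left(P \cdot r_i\right), & P < 1 \\[4pt]
QP_i - 6\log_2 r_i, & P \ge 1
\end{cases}
```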

Here, QP'_i denotes the recalculated quantization parameter of the i-th coding basic unit.

For coding standards where the quantization parameter is equal to the quantization step size, the quantization parameter can be recalculated using the following formula:
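This formula is also missing from the text. Since the description states that either the difference or the ratio of the original quantization parameter and the quantization adjustment value is used, a plausible reconstruction for this linear case replaces the difference by a ratio. As before, $r_i$ is assumed notation for the code rate adjustment coefficient and the threshold is taken to be 1.

```latex
QP'_i =
\begin{cases}
\dfrac{QP_i}{P \cdot r_i}, & P < 1 \\[8pt]
\dfrac{QP_i}{r_i}, & P \ge 1
\end{cases}
```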

As can be seen from the above, when the quantization parameter of each coding basic unit in the region of interest is recalculated, for any target coding basic unit in the updated region of interest, if the rate adjustment completion degree of the current frame is less than a specified threshold, the quantization adjustment value of the target coding basic unit may be determined according to the product of the rate adjustment completion degree and the code rate adjustment coefficient of the target coding basic unit, and the difference or ratio between the original quantization parameter of the target coding basic unit and the quantization adjustment value is used as the quantization parameter of the target coding basic unit. If the rate adjustment completion degree of the current frame is greater than or equal to the specified threshold, the quantization adjustment value of the target coding basic unit may be determined according to the code rate adjustment coefficient of the target coding basic unit, and the difference or ratio between the original quantization parameter of the target coding basic unit and the quantization adjustment value is used as the quantization parameter of the target coding basic unit. The specified threshold is 1 in the above formulas. Of course, the values of the specified threshold and of the rate adjustment completion degree may be changed for different application scenarios.
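The two branches described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name and parameters are hypothetical, the 6·log2 form is inferred from the non-interest-region formula in the description, and rounding and clipping of the result to the codec's valid QP range are omitted.

```python
import math


def roi_quantization_parameter(
    original_qp: float,
    rate_coeff: float,           # code rate adjustment coefficient of this unit
    completion: float,           # rate adjustment completion degree P of the frame
    threshold: float = 1.0,      # specified threshold (1 in the description)
    exponential: bool = True,    # True: step size is exponential in QP (H.264/HEVC style)
) -> float:
    """Recalculate the QP of one coding basic unit in the region of interest."""
    # Below the threshold the transfer is restricted, so the coefficient is scaled by P.
    scale = completion * rate_coeff if completion < threshold else rate_coeff
    if exponential:
        # Quantization adjustment value enters as a difference.
        return original_qp - 6.0 * math.log2(scale)
    # QP equals the step size: quantization adjustment value enters as a ratio.
    return original_qp / scale


qp_unrestricted = roi_quantization_parameter(30.0, rate_coeff=2.0, completion=1.0)
qp_restricted = roi_quantization_parameter(30.0, rate_coeff=2.0, completion=0.5)
qp_linear = roi_quantization_parameter(30.0, rate_coeff=2.0, completion=1.0, exponential=False)
print(qp_unrestricted, qp_restricted, qp_linear)
```

With a coefficient of 2 and no restriction, the QP drops by 6, which in H.264 and HEVC halves the quantization step size; with P = 0.5 the same unit receives no adjustment because the effective coefficient is 1.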

In this embodiment, after the quantization parameters of the coding basic units in the region of interest have been calculated, the quantization parameters of the coding basic units in the non-interest region can be calculated. Specifically, the code rate reduction coefficient of the updated non-interest region may first be calculated according to the rate adjustment completion degree of the current frame, the scaling complexity of each coding basic unit in the current frame, and the code rate adjustment coefficient of each coding basic unit in the updated region of interest. Then, the quantization parameter of each coding basic unit in the non-interest region is calculated according to the code rate reduction coefficient of the non-interest region and the original quantization parameter of each coding basic unit in the non-interest region.

The code rate reduction coefficient D of the non-interest region can be calculated according to the following formula:
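The formula is not reproduced in this text. From the explanation that follows it, the formula can be reconstructed as shown here, with $C_i$ (scaling complexity), $r_i$ (code rate adjustment coefficient) and $k$ (protection coefficient) as assumed notation. The denominator of the second branch is taken to be the sum over the non-interest region, an interpretation under which $D$ equals $k$ exactly when $P = 1$, so the two branches agree at the threshold.

```latex
D =
\begin{cases}
k, & P < 1 \\[6pt]
\dfrac{\displaystyle\sum_{i \in \mathrm{ROI} \cup \mathrm{NROI}} C_i \;-\; \sum_{i \in \mathrm{ROI}} C_i \, r_i}
      {\displaystyle\sum_{i \in \mathrm{NROI}} C_i}, & P \ge 1
\end{cases}
```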

It can be seen that if the rate adjustment completion degree of the current frame is less than a specified threshold, the protection coefficient may be used as the code rate reduction coefficient of the updated non-interest region. If the rate adjustment completion degree of the current frame is greater than or equal to the specified threshold, the sum of the scaling complexities of the coding basic units in the updated non-interest region of the current frame can be calculated, the product of the scaling complexity of each coding basic unit in the updated region of interest and its code rate adjustment coefficient can be calculated, and these products can be summed. Then, the difference between the sum of the scaling complexities of all coding basic units in the current frame and the sum of the products is calculated, and the ratio of this difference to the sum of the scaling complexities of the non-interest region is used as the code rate reduction coefficient of the updated non-interest region. The specified threshold is 1 in the above formula. Of course, the values of the specified threshold and of the rate adjustment completion degree may be changed for different application scenarios.
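A minimal sketch of this two-branch calculation is given here. The function name, argument layout and sample values are assumptions for illustration, and the denominator in the unrestricted branch is taken to be the non-interest-region sum.

```python
from typing import Sequence


def nroi_rate_reduction(
    complexity: Sequence[float],   # scaling complexity of each coding basic unit
    in_roi: Sequence[bool],        # True if the unit lies in the updated region of interest
    rate_coeff: Sequence[float],   # code rate adjustment coefficient (ROI units only)
    protection: float,             # protection coefficient of the non-interest region
    completion: float,             # rate adjustment completion degree P
    threshold: float = 1.0,
) -> float:
    """Return the code rate reduction coefficient D of the non-interest region."""
    if completion < threshold:
        # Transfer is restricted: the non-interest region keeps the protected share.
        return protection
    total = sum(complexity)
    nroi_sum = sum(c for c, roi in zip(complexity, in_roi) if not roi)
    roi_weighted = sum(
        c * r for c, roi, r in zip(complexity, in_roi, rate_coeff) if roi
    )
    if nroi_sum == 0:
        raise ValueError("frame has no non-interest units")
    return (total - roi_weighted) / nroi_sum


complexity = [10.0, 10.0, 20.0, 20.0]
in_roi = [True, True, False, False]

d_restricted = nroi_rate_reduction(complexity, in_roi, [2.0, 2.0, 1.0, 1.0], 0.5, completion=0.8)
d_at_threshold = nroi_rate_reduction(complexity, in_roi, [2.0, 2.0, 1.0, 1.0], 0.5, completion=1.0)
d_unrestricted = nroi_rate_reduction(complexity, in_roi, [1.5, 1.5, 1.0, 1.0], 0.5, completion=4.0 / 3.0)
print(d_restricted, d_at_threshold, d_unrestricted)
```

In the unrestricted branch, D times the non-interest-region complexity sum equals the total complexity minus the weighted region-of-interest sum, so the bit rate gained by the region of interest is exactly what the non-interest region gives up.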

In this embodiment, after the code rate reduction coefficient of the non-interest region has been calculated, for any target coding basic unit in the updated non-interest region, the quantization adjustment value of the target coding basic unit may be determined according to the code rate reduction coefficient of the non-interest region, and the difference or ratio between the original quantization parameter of the target coding basic unit and the quantization adjustment value is used as the quantization parameter of the target coding basic unit. Specifically, this process can be expressed by the following formulas.

For coding standards in which the quantization step size is an exponential function of the quantization parameter with base 2 (such as the H.264 and HEVC coding standards), the quantization parameter of each coding basic unit in the non-interest region can be calculated according to the following formula:

$$QP'_i = QP_i - 6\log_2 D$$

For coding standards where the quantization parameter is equal to the quantization step size, the following formula can be used to calculate the quantization parameter of each basic coding unit in the non-interest area:
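The formula for this case is not reproduced in this text. By analogy with the exponential case, and given that the description allows the adjustment to enter as a ratio, a plausible reconstruction is:

```latex
QP'_i = \frac{QP_i}{D}
```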

In this embodiment, after the quantization parameters of the coding basic units in the region of interest and in the non-interest region of the current frame have been recalculated in the above manner, each coding basic unit can be encoded according to its recalculated quantization parameter. After this adjustment, the quantization step size of the coding basic units in the region of interest becomes smaller and the quantization step size of the coding basic units in the non-interest region becomes larger, so that the region of interest is encoded more finely, receives a larger share of the bit rate, and therefore appears clearer.
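The effect on the quantization step size can be illustrated with a short sketch. It relies on the fact that in H.264 and HEVC the quantization step size doubles for every increase of 6 in QP; the function names and the sample values are assumptions, and rounding and clipping of the QP are omitted.

```python
import math


def nroi_quantization_parameter(original_qp: float, reduction: float,
                                exponential: bool = True) -> float:
    """Recalculate the QP of one non-interest unit from the reduction coefficient D."""
    if exponential:
        return original_qp - 6.0 * math.log2(reduction)
    return original_qp / reduction


def step_size_ratio(old_qp: float, new_qp: float) -> float:
    """Ratio new_step / old_step when the step size doubles every 6 QP."""
    return 2.0 ** ((new_qp - old_qp) / 6.0)


new_qp = nroi_quantization_parameter(30.0, reduction=0.5)
ratio = step_size_ratio(30.0, new_qp)
new_qp_linear = nroi_quantization_parameter(30.0, reduction=0.5, exponential=False)
print(new_qp, ratio, new_qp_linear)
```

With D = 0.5 the QP of a non-interest unit rises by 6 and its step size doubles, which is the coarser quantization the paragraph above refers to.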

Referring to FIG. 2, the present application also provides a video encoding system, in which each video frame of the video to be encoded includes a region of interest and a non-interest region. The system includes:

A current image set acquiring unit, configured to acquire a current image set from the video to be encoded according to the current frame in the video to be encoded and the encoding order of the video to be encoded, the current image set including the current frame and a specified number of video frames located after the current frame in the encoding order;

A reverse update unit, configured to sequentially perform a reverse update on the region of interest in each video frame in the current image set, in an order opposite to the encoding order;

A quantization coding unit, configured to calculate quantization parameters separately for the coding basic units in the updated region of interest and in the non-interest region of the current frame, so that each coding basic unit in the current frame is encoded according to the calculated quantization parameters.

In an embodiment, the quantization coding unit includes:

The saliency value reset module is used to reset the saliency value for the updated region of interest;

The quantization parameter calculation module is used to determine the scaling complexity of each coding basic unit in the current frame and, based on the scaling complexity and the reset saliency value in the current frame, to calculate quantization parameters separately for the coding basic units in the updated region of interest and in the non-interest region of the current frame.

In an embodiment, the reverse update unit includes:

An encoding basic unit configuration module, configured to divide each video frame in the current image set according to an encoding basic unit, and determine an optimal prediction mode for each encoding basic unit obtained by the division;

The region of interest resetting module is used, for any target video frame in the current image set, to traverse each coding basic unit in the region of interest of the target video frame and to analyze each block in the current coding basic unit. If the optimal prediction mode adopted by the block is the inter prediction mode, the reference block corresponding to the block is searched for in the reference frame of the target video frame, and if the reference block does not belong to the region of interest in the reference frame, the reference block is set as the region of interest.
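The reverse update performed by these modules can be sketched as follows. This is a simplified illustration under several assumptions that are not in the source: blocks sit on a fixed grid, each reference block is aligned to that grid, each frame has at most one reference frame inside the current image set, and coding basic units are flattened into their blocks. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Pos = Tuple[int, int]  # (row, col) of a block on an assumed fixed grid


@dataclass
class Block:
    inter: bool                    # True if the optimal prediction mode is inter
    ref_pos: Optional[Pos] = None  # grid position of the reference block, if any


@dataclass
class Frame:
    blocks: Dict[Pos, Block]
    roi: Set[Pos] = field(default_factory=set)  # blocks currently marked as ROI
    ref_index: Optional[int] = None             # index of the reference frame in the set


def reverse_update(frames: List[Frame]) -> None:
    """Propagate ROI marks to reference blocks, last frame first.

    `frames` is assumed to be listed in encoding order.
    """
    for frame in reversed(frames):
        if frame.ref_index is None:
            continue
        ref_frame = frames[frame.ref_index]
        for pos in sorted(frame.roi):  # sorted() copies, so the set may grow safely
            block = frame.blocks.get(pos)
            if block is None or not block.inter or block.ref_pos is None:
                continue
            # Adding is a no-op when the reference block is already in the ROI.
            ref_frame.roi.add(block.ref_pos)


frames = [
    Frame(blocks={(0, 0): Block(inter=False)}),
    Frame(blocks={(0, 1): Block(inter=True, ref_pos=(0, 0))}, ref_index=0),
    Frame(blocks={(1, 1): Block(inter=True, ref_pos=(0, 1))}, roi={(1, 1)}, ref_index=1),
]
reverse_update(frames)
print(frames[0].roi, frames[1].roi)
```

Because the frames are visited from last to first, a region of interest that only appears in the last frame is carried back through the chain of reference blocks to the earliest frame.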

In one embodiment, the system further includes:

The complexity calculation unit is used to calculate the complexity of each block in the coding basic unit, wherein, for any block in the coding basic unit, the predicted pixel value of the block is determined according to the optimal prediction mode, the sum of absolute transformed differences (SATD) is calculated from the original pixel value and the predicted pixel value of the block, and the SATD is used as the complexity of the block.
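The complexity measure described above is commonly known as SATD (sum of absolute transformed differences). The text does not say which transform is used; the sketch here assumes a 4x4 Hadamard transform, a common choice in encoders, and omits any normalization factor. The function and variable names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]

# 4x4 Hadamard matrix (assumed transform; not specified in the source).
H4: Matrix = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def _matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def _transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def satd_4x4(original: Matrix, predicted: Matrix) -> int:
    """Sum of absolute values of the Hadamard-transformed prediction residual."""
    residual = [[o - p for o, p in zip(row_o, row_p)]
                for row_o, row_p in zip(original, predicted)]
    transformed = _matmul(_matmul(H4, residual), _transpose(H4))
    return sum(abs(v) for row in transformed for v in row)


flat = [[1] * 4 for _ in range(4)]
zero = [[0] * 4 for _ in range(4)]
print(satd_4x4(flat, flat), satd_4x4(flat, zero))
```

A block whose prediction matches the original exactly has complexity 0, while a larger residual after transformation gives a larger complexity.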

In an embodiment, the quantization coding unit includes:

The code rate adjustment coefficient calculation module is used to calculate the code rate adjustment coefficient of each coding basic unit in the updated region of interest according to the representative saliency value of each coding basic unit in the updated region of interest in the current frame;

The code rate adjustment completion degree calculation module is used to determine the protection coefficient of the non-interest region, and to calculate the rate adjustment completion degree of the current frame based on the protection coefficient, the scaling complexity of each coding basic unit in the current frame, and the code rate adjustment coefficient of each coding basic unit in the updated region of interest; wherein the protection coefficient is used to characterize the degree to which the code rate is transferred from the non-interest region to the region of interest;

The region of interest calculation module is used to calculate the quantization parameter of each coding basic unit in the updated region of interest in the current frame according to the rate adjustment completion degree, the code rate adjustment coefficient of each coding basic unit in the updated region of interest, and the original quantization parameter of each coding basic unit in the updated region of interest.

In an embodiment, the code rate adjustment completion degree calculation module includes:

The first product calculation module is used to calculate the sum of the scaling complexities of the coding basic units in the updated non-interest region in the current frame, and to calculate the first product of this sum and the protection coefficient;

The second product sum calculation module is used to calculate, for each coding basic unit in the updated region of interest, the second product of its scaling complexity and its code rate adjustment coefficient, and to calculate the sum of the second products of the coding basic units in the updated region of interest;

The ratio calculation module is used to calculate the difference between the sum of the scaling complexities of all coding basic units in the current frame and the first product, to calculate the ratio of the difference to the sum of the second products, and to use the ratio as the rate adjustment completion degree of the current frame.

Referring to FIG. 3, an embodiment of the present application also provides a video encoding device. The device includes a processor and a memory; the memory is used to store a computer program which, when executed by the processor, implements the above-mentioned video encoding method.

In this embodiment, the memory may include a physical device for storing information; the information is usually digitized and then stored in a medium by electrical, magnetic, or optical means. The memory described in this embodiment may also include: devices that store information electrically, such as RAM or ROM; devices that store information magnetically, such as hard disks, floppy disks, magnetic tape, magnetic core memory, bubble memory, or USB flash drives; and devices that store information optically, such as CDs or DVDs. Of course, there are other types of memory, such as quantum memory or graphene memory.

In this embodiment, the processor can be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (for example, software or firmware) executable by the (micro)processor, or the form of logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller.

It can be seen from the above that, in the technical solutions provided by one or more implementations of the present application, when the current frame is encoded it can be analyzed together with multiple video frames that follow it in the encoding order. Specifically, starting from the last video frame, the region of interest in each video frame can be updated in an order opposite to the encoding order, and the saliency value can be reset for the updated region of interest. In this way, if the region of interest at the same position changes abruptly between temporally adjacent video frames, the reverse update can soften or eliminate this abrupt change, so that the regions of interest of adjacent frames do not change too abruptly after encoding. After the region of interest has been updated in reverse, the quantization parameter can be calculated for each coding basic unit in the current frame individually. Since the scaling complexity and saliency value differ from one coding basic unit to another, and the quantization parameters of the region of interest and of the non-interest region are calculated in different ways, the quantization parameters of the coding basic units in the current frame may also differ. When the current frame is subsequently encoded, different code rates can be assigned to different coding basic units, so that the code rate is allocated more finely within a video frame and the accuracy of video encoding is higher.

The various embodiments in this specification are described in a progressive manner, and the same or similar parts between the various embodiments can be referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the implementation of the system and the device, both can be explained with reference to the introduction of the implementation of the foregoing method.

Those skilled in the art should understand that the embodiments of the present invention can be provided as a method, a system, or a computer program product. Therefore, the present invention may adopt a form of a complete hardware implementation, a complete software implementation, or a combination of software and hardware implementations. Moreover, the present invention may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, which implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps is executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

In a typical configuration, the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include non-permanent memory among computer-readable media, in the form of random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further restriction, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.

The above are only the implementation manners of this application, and are not intended to limit this application. For those skilled in the art, this application can have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of this application shall be included in the scope of the claims of this application.

Claims

A video encoding method, characterized in that the video frame of the video to be encoded includes a region of interest and a region of non-interest; the method includes:

Acquiring a current image set from the video to be encoded according to the current frame in the video to be encoded and the encoding order of the video to be encoded, the current image set including the current frame and a specified number of video frames located after the current frame in the encoding order;

Sequentially performing a reverse update on the region of interest in each video frame in the current image set, in an order opposite to the encoding order;

Calculating quantization parameters separately for the coding basic units in the updated region of interest and in the non-interest region of the current frame, so as to encode each coding basic unit in the current frame according to the calculated quantization parameters.
The method according to claim 1, wherein calculating quantization parameters for the updated coding basic units in the region of interest and the non-interest region in the current frame respectively comprises:

Resetting the saliency value for the updated region of interest;

Determining the scaling complexity of each coding basic unit in the current frame, and, based on the scaling complexity and the reset saliency value in the current frame, calculating quantization parameters separately for the coding basic units in the updated region of interest and in the non-interest region of the current frame.
The method according to claim 2, wherein, sequentially updating the region of interest in each video frame in the current image set comprises:

Dividing each video frame in the current image set according to basic coding units, and determining an optimal prediction mode for each basic coding unit obtained by the division;

For any target video frame in the current image set, traversing each coding basic unit in the region of interest of the target video frame and analyzing each block in the current coding basic unit; if the optimal prediction mode adopted by the block is the inter prediction mode, searching for the reference block corresponding to the block in the reference frame of the target video frame, and, if the reference block does not belong to the region of interest in the reference frame, setting the reference block as the region of interest.
The method according to claim 3, wherein when determining the optimal prediction mode for each of the divided coding basic units, the method further comprises:

Separately calculating the complexity of each block in the coding basic unit, wherein, for any block in the coding basic unit, the predicted pixel value of the block is determined according to the optimal prediction mode, a sum of absolute transformed differences (SATD) is calculated from the original pixel value and the predicted pixel value of the block, and the SATD is used as the complexity of the block.
The method according to claim 4, wherein resetting the saliency value for the updated region of interest comprises:

For a target block participating in the reverse update of the region of interest in the target video frame, determining a transfer coefficient according to the complexity of the target block, and using the product of the transfer coefficient and the saliency value of a pixel in the target block as the saliency value to be transferred;

The reference block of the target block is determined in the reference frame, and the saliency value of the corresponding pixel in the reference block of the target block is replaced with the saliency value to be transferred.
The method according to claim 2, wherein after resetting the saliency value for the updated region of interest, the method further comprises:

Adjusting the region of interest in the current frame in units of coding basic units, and setting a representative saliency value for each coding basic unit in the adjusted region of interest;

Wherein, for any basic coding unit in the current frame, if the basic coding unit contains a block belonging to a region of interest, the basic coding unit is set as the region of interest;

or

For any basic coding unit in the current frame, if each block in the basic coding unit belongs to the region of interest, the basic coding unit is set as the region of interest.
The method according to claim 2 or 6, wherein calculating a quantization parameter for the coding basic unit in the updated region of interest in the current frame comprises:

Respectively calculating the code rate adjustment coefficient of each coding basic unit in the updated region of interest according to the representative saliency value of each coding basic unit in the updated region of interest in the current frame;

Determining the protection coefficient of the non-interest region, and calculating the rate adjustment completion degree of the current frame based on the protection coefficient, the scaling complexity of each coding basic unit in the current frame, and the code rate adjustment coefficient of each coding basic unit in the updated region of interest; wherein the protection coefficient is used to characterize the degree to which the code rate is transferred from the non-interest region to the region of interest;

Calculating the quantization parameter of each coding basic unit in the updated region of interest in the current frame according to the rate adjustment completion degree, the code rate adjustment coefficient of each coding basic unit in the updated region of interest, and the original quantization parameter of each coding basic unit in the updated region of interest.
The method according to claim 7, wherein calculating the rate adjustment completion degree of the current frame comprises:

Calculating the sum of the scaling complexities of the coding basic units in the updated non-interest region in the current frame, and calculating the first product of this sum and the protection coefficient;

Calculating, for each coding basic unit in the updated region of interest, the second product of its scaling complexity and its code rate adjustment coefficient, and calculating the sum of the second products of the coding basic units in the updated region of interest;

Calculating the difference between the sum of the scaling complexities of all coding basic units in the current frame and the first product, calculating the ratio of the difference to the sum of the second products, and using the ratio as the rate adjustment completion degree of the current frame.
The method according to claim 7, wherein calculating the quantization parameter of each coding basic unit in the updated region of interest in the current frame comprises:

For any target coding basic unit in the updated region of interest, if the rate adjustment completion degree of the current frame is less than a specified threshold, determining the quantization adjustment value of the target coding basic unit according to the product of the rate adjustment completion degree and the code rate adjustment coefficient of the target coding basic unit, and using the difference or ratio between the original quantization parameter of the target coding basic unit and the quantization adjustment value as the quantization parameter of the target coding basic unit;

If the rate adjustment completion degree of the current frame is greater than or equal to the specified threshold, determining the quantization adjustment value of the target coding basic unit according to the code rate adjustment coefficient of the target coding basic unit, and using the difference or ratio between the original quantization parameter of the target coding basic unit and the quantization adjustment value as the quantization parameter of the target coding basic unit.
The method according to claim 7, wherein calculating quantization parameters for the updated coding basic units in the non-interest region in the current frame respectively comprises:

Calculating the code rate reduction coefficient of the updated non-interest region according to the rate adjustment completion degree of the current frame, the scaling complexity of each coding basic unit in the current frame, and the code rate adjustment coefficient of each coding basic unit in the updated region of interest;

Calculate the quantization parameter of each coding basic unit in the non-interest area according to the code rate reduction coefficient of the non-interest area and the original quantization parameter of each coding basic unit in the non-interest area.
The method according to claim 10, wherein calculating the code rate reduction coefficient of the updated non-interest region comprises:

If the code rate adjustment completion degree of the current frame is less than a specified threshold, use the protection coefficient as the code rate reduction coefficient of the updated non-interest region;

If the rate adjustment completion degree of the current frame is greater than or equal to the specified threshold, calculating the sum of the scaling complexities of the coding basic units in the updated non-interest region in the current frame, calculating the product of the scaling complexity of each coding basic unit in the updated region of interest and its code rate adjustment coefficient, and calculating the sum of the products; calculating the difference between the sum of the scaling complexities of all coding basic units in the current frame and the sum of the products, and using the ratio of the difference to the sum of the scaling complexities of the updated non-interest region as the code rate reduction coefficient of the updated non-interest region.
The method according to claim 10, wherein calculating the quantization parameter of each coding basic unit in the non-interest area comprises:

For any target coding basic unit in the updated non-interest region, determining the quantization adjustment value of the target coding basic unit according to the code rate reduction coefficient of the non-interest region, and using the difference or ratio between the original quantization parameter of the target coding basic unit and the quantization adjustment value as the quantization parameter of the target coding basic unit.
The method according to claim 2, wherein determining the scaling complexity of each coding basic unit in the current frame comprises:

Calculating the average quantization parameter of the current frame according to the original quantization parameter of each basic coding unit in the current frame;

For any coding basic unit in the current frame, calculating the scaling complexity of the coding basic unit according to the average quantization parameter of the current frame, the complexity of each block in the coding basic unit, and the original quantization parameter of the coding basic unit.
A video encoding system, characterized in that the video frame of the video to be encoded includes a region of interest and a region of non-interest; the system includes:

A current image set acquiring unit, configured to acquire a current image set in the video to be encoded according to the current frame in the video to be encoded and the encoding sequence of the video to be encoded, and the current image set includes the current frame And a specified number of video frames located after the current frame in the coding order;

A reverse update unit, configured to reversely update the region of interest in each video frame in the current image set in a reverse order opposite to the encoding order;

A quantization coding unit, configured to calculate quantization parameters separately for the coding basic units in the updated region of interest and in the non-interest region of the current frame, and to encode the coding basic units in the current frame using the calculated quantization parameters.
The system according to claim 14, wherein the quantization coding unit comprises:

The saliency value reset module is used to reset the saliency value for the updated region of interest;

The quantization parameter calculation module is used to determine the scaling complexity of each coding basic unit in the current frame and, based on the scaling complexity and the reset saliency values in the current frame, to calculate quantization parameters separately for the coding basic units in the updated region of interest and in the non-interest region of the current frame.
The system according to claim 15, wherein the reverse update unit comprises:

An encoding basic unit configuration module, configured to divide each video frame in the current image set according to an encoding basic unit, and determine an optimal prediction mode for each encoding basic unit obtained by the division;

The region of interest resetting module is used, for any target video frame in the current image set, to traverse each coding basic unit in the region of interest of the target video frame and to analyze each block in the current coding basic unit: if the optimal prediction mode adopted by the block is an inter prediction mode, the reference block corresponding to the block is searched for in the reference frame of the target video frame, and if the reference block does not belong to the region of interest in the reference frame, the reference block is set as part of the region of interest.
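The reverse update described by this module can be sketched as below. The data model is hypothetical and deliberately simplified: blocks are identified by grid positions, the coding basic unit and block levels are flattened into one, and each inter-predicted block points at a single grid-aligned reference block, whereas a real motion vector may point between blocks.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Block:
    mode: str                                 # "inter" or "intra" (optimal prediction mode)
    ref_frame: int = -1                       # index of the reference frame in coding order
    ref_pos: Optional[Tuple[int, int]] = None  # grid position of the reference block


@dataclass
class Frame:
    roi: set = field(default_factory=set)       # positions currently in the region of interest
    blocks: dict = field(default_factory=dict)  # position -> Block


def reverse_update_roi(frames):
    """Propagate the region of interest backwards through the image set.

    `frames` is listed in coding order and is visited last to first, so a
    block newly marked in a reference frame is itself followed when that
    reference frame is visited later in the loop.
    """
    for frame in reversed(frames):
        for pos in sorted(frame.roi):  # sorted() takes a snapshot of the set
            block = frame.blocks.get(pos)
            if block is None or block.mode != "inter":
                continue
            ref = frames[block.ref_frame]
            if block.ref_pos not in ref.roi:
                ref.roi.add(block.ref_pos)
    return frames
```

Visiting the frames in reverse is what lets a mark travel along a chain of references: a region-of-interest block in the last frame marks its reference block in the previous frame, which in turn marks its own reference block one frame earlier.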
The system according to claim 16, wherein the system further comprises:

The complexity calculation unit is used to calculate the complexity of each block in the coding basic unit, wherein, for any block in the coding basic unit, the predicted pixel values of the block are determined according to the optimal prediction mode, the sum of absolute transformed differences (SATD) is calculated from the original pixel values and the predicted pixel values of the block, and this sum is used as the complexity of the block.
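The block complexity measure named here is read below as the sum of absolute transformed differences (SATD). The sketch assumes a 4x4 block and a Hadamard transform, which is a common choice in video encoders, but the application does not fix either the block size or the transform. No normalization is applied, and all names are hypothetical.

```python
# 4x4 Hadamard matrix (entries are +1 or -1; the matrix is symmetric).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def _matmul4(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4x4(original, predicted):
    """Unnormalized SATD of a 4x4 block: sum of |H * (original - predicted) * H^T|."""
    diff = [[original[i][j] - predicted[i][j] for j in range(4)] for i in range(4)]
    h_transposed = [list(col) for col in zip(*H4)]
    transformed = _matmul4(_matmul4(H4, diff), h_transposed)
    return sum(abs(value) for row in transformed for value in row)
```

A perfectly predicted block has a complexity of 0. A prediction that is off by exactly 1 at every pixel concentrates all of the error in the single lowest-frequency coefficient, giving 16 with this unnormalized transform.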
The system according to claim 15, wherein the quantization coding unit comprises:

The code rate adjustment coefficient calculation module is used to calculate the code rate adjustment coefficient of each coding basic unit in the updated region of interest according to the representative saliency value of each coding basic unit in the updated region of interest in the current frame;

The code rate adjustment completion degree calculation module is used to determine the protection coefficient of the non-interest region, and to calculate the code rate adjustment completion degree of the current frame based on the protection coefficient, the scaling complexity of each coding basic unit in the current frame, and the code rate adjustment coefficient of each coding basic unit in the updated region of interest; wherein the protection coefficient characterizes the degree to which code rate is transferred from the non-interest region to the region of interest;

The region of interest calculation module is configured to calculate the quantization parameter of each coding basic unit in the updated region of interest in the current frame according to the code rate adjustment completion degree, the code rate adjustment coefficient of each coding basic unit in the updated region of interest, and the original quantization parameter of each coding basic unit in the updated region of interest.
The system according to claim 18, wherein the code rate adjustment completion degree calculation module comprises:

The first product calculation module is used to calculate the sum of the scaling complexities of the coding basic units in the updated non-interest region in the current frame, and to calculate the first product of this sum and the protection coefficient;

The second product sum calculation module is used to calculate, for each coding basic unit in the updated region of interest, the second product of its scaling complexity and its code rate adjustment coefficient, and to calculate the sum of the second products over the coding basic units in the updated region of interest;

The ratio calculation module is used to calculate the difference between the sum of the scaling complexities of the coding basic units in the current frame and the first product, to calculate the ratio of the difference to the sum of the second products, and to use the ratio as the code rate adjustment completion degree of the current frame.
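The three modules of this claim combine into a single ratio, sketched below. The function and parameter names are hypothetical, and the frame total is assumed to be the sum of the region-of-interest and non-interest totals.

```python
def rate_adjustment_completion(non_roi_complexity, roi_complexity,
                               roi_adjust_coeff, protection):
    """Code rate adjustment completion degree of the current frame.

    non_roi_complexity: scaling complexities of units in the non-interest region
    roi_complexity:     scaling complexities of units in the region of interest
    roi_adjust_coeff:   code rate adjustment coefficient of each ROI unit (same order)
    protection:         protection coefficient of the non-interest region
    """
    if len(roi_complexity) != len(roi_adjust_coeff):
        raise ValueError("one adjustment coefficient is required per ROI unit")

    # First product: non-interest complexity total times the protection coefficient.
    non_roi_sum = sum(non_roi_complexity)
    first_product = non_roi_sum * protection

    # Sum of second products: per-unit complexity times adjustment coefficient.
    second_product_sum = sum(c * k for c, k in zip(roi_complexity, roi_adjust_coeff))
    if second_product_sum <= 0:
        raise ValueError("region of interest must have positive weighted complexity")

    # Assumption: every unit belongs to exactly one of the two regions.
    frame_sum = non_roi_sum + sum(roi_complexity)

    return (frame_sum - first_product) / second_product_sum
```

For instance, with non-interest complexities `[10, 10, 20]`, region-of-interest complexities `[10, 10]`, coefficients `[1.5, 1.5]` and a protection coefficient of 0.9, the result is (60 - 36) / 30 = 0.8.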
A video encoding device, characterized in that the device includes a memory and a processor, the memory is used to store a computer program, and when the computer program is executed by the processor, the method according to any one of claims 1 to 13 is implemented.