CN116437092A

CN116437092A - Code rate control method and device for avoiding generation of oversized I frame

Info

Publication number: CN116437092A
Application number: CN202310216093.0A
Authority: CN
Inventors: 刘鹏飞
Original assignee: ASR Microelectronics Co Ltd
Current assignee: ASR Microelectronics Co Ltd
Priority date: 2023-03-07
Filing date: 2023-03-07
Publication date: 2023-07-14

Abstract

The invention discloses a rate control method that avoids generating oversized I frames. Step S1: perform intra-frame coding mode prediction on each image block to be encoded, obtaining all intra-frame coding mode candidates of each block and their corresponding prediction costs. Step S2: obtain the intra-frame coding complexity of each image block to be encoded and of the video frame. Step S3: calculate the upper limit on the number of coded bits of an I frame that does not cause abnormal rate control. Step S4: calculate the lower limit of the Lagrange multiplier of the I frame to be encoded. Step S5: calculate the Lagrange multiplier of the I frame to be encoded that suppresses the breathing effect. Step S6: select the larger of the lower limit of the Lagrange multiplier and the breathing-effect-suppressing Lagrange multiplier as the Lagrange multiplier of the I frame to be encoded. The method and device suppress the breathing effect while avoiding the generation of oversized I frames during encoding.

Description

Code rate control method and device for avoiding generation of oversized I frame

Technical Field

The invention relates to a video coding technology, in particular to a code rate control method.

Background

Video coding represents video information with as little data as possible by compressing the redundant components of video images. Common video coding standards include AVC (Advanced Video Coding, also known as H.264) and HEVC (High Efficiency Video Coding, also known as H.265). Video coding uses image blocks as its most basic coding units. For example, in HEVC the basic coding unit is the CU (Coding Unit). A CU may be an image block of 64×64, 32×32, 16×16, or 8×8 pixels. A 64×64-pixel image block is also called an LCU (Largest Coding Unit).

To remove spatial and temporal redundancy, video coding applies intra-frame coding (intra coding) and inter-frame coding (inter coding) to the input video frames. Encoded video frames are generally divided into I frames, which use only intra-frame coding, and P and B frames, which mix intra-frame and inter-frame coding. An I frame is encoded using only the information of the frame itself, without reference to any other encoded video frame, and all image blocks within an I frame are encoded as intra-coded blocks. A P or B frame can be encoded using both the information of the frame itself and the information of one or more other encoded video frames, so its image blocks may be encoded as either intra-coded or inter-coded blocks. Each image block in a P or B frame is encoded as an intra-coded block if intra-frame coding gives the smaller rate-distortion cost, and as an inter-coded block if inter-frame coding gives the smaller rate-distortion cost. For each input video frame, the frame type is determined by the group-of-pictures (GOP, Group Of Pictures, a set of temporally consecutive video frames) parameters set by the user. Because P and B frames have high coding efficiency and I frames block error propagation, encoders generally encode I frames periodically and encode the remaining video frames as P or B frames. Fig. 1 illustrates a common group-of-pictures structure containing I frames and P or B frames.

To maximize the compression ratio, common video coding algorithms such as AVC and HEVC are lossy. The reconstructed video therefore differs from the original video; that is, it is distorted. The performance of a lossy compression algorithm must be measured jointly by the coding bit rate (Rate) and the coding-induced distortion (Distortion). The two constrain each other: decreasing the coding bit rate tends to increase the coding distortion, and decreasing the coding distortion tends to increase the coding bit rate. Rate-distortion optimization (Rate Distortion Optimization, RDO) reduces the coding distortion as much as possible at a given coding bit rate, or reduces the coding bit rate as much as possible at a given coding distortion. Common video coding algorithms currently use rate-distortion optimization based on a Lagrange multiplier λ. Let D be the coding distortion, R the coding bit rate, and J the rate-distortion cost (Rate Distortion Cost, RD Cost); then J = D + λ × R. Lagrange-multiplier-based rate-distortion optimization can be expressed as min(J), where min() takes the minimum value. Each coding mode has its own coding distortion D and coding bit rate R, so its rate-distortion cost J can be calculated, and encoding with the mode of minimum J gives the best coding performance. The process by which the encoder selects the optimal coding mode for each image block to be encoded through rate-distortion optimization is called mode decision (Mode Decision) for that image block.
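The mode decision described above can be illustrated with a minimal sketch. The mode names and the (D, R) values are hypothetical and chosen only to show how λ shifts the choice; they do not come from any real encoder.

```python
from typing import Dict, Tuple


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def select_mode(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Return the name of the mode with the smallest J.

    `candidates` maps a mode name to its (distortion, rate_bits) pair.
    """
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))


# Hypothetical candidates: (distortion D, rate R in bits)
modes = {
    "intra_dc": (900.0, 40.0),
    "intra_planar": (700.0, 75.0),
    "inter_skip": (1200.0, 2.0),
}

low_lambda_choice = select_mode(modes, 1.0)    # distortion dominates
high_lambda_choice = select_mode(modes, 10.0)  # rate dominates
```

With a small λ the low-distortion mode wins; with a large λ the cheapest mode in bits wins, which is why raising λ reduces the size of an encoded frame.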

In real life, the channel bandwidth (channel capacity) used to transmit compressed video is limited. If the encoding bit rate of the compressed video is too high, the capacity of the channel bandwidth is exceeded, which can cause video transmission congestion and even packet loss. If the encoding bit rate of the compressed video is too low, the channel bandwidth is not fully utilized, and higher video quality cannot be obtained. Therefore, it is necessary to control the output code rate of the video encoder to match the channel bandwidth using a code rate control technique.

Rate control (Rate Control) adjusts the coding parameters of the video encoder so that its output bit rate equals the preset target bit rate, while reducing coding distortion as much as possible to improve coding quality. Common rate control algorithms generally accomplish this in two stages: target bit allocation and target bit control.

In a common rate control algorithm, target bit allocation is performed at three levels: the GOP level, the video frame level, and the image block level. After target bit allocation at the GOP level and the video frame level, the target number of coded bits of the current video frame to be encoded is determined. In the target bit control stage, the Lagrange multiplier λ is then calculated from the target number of coded bits of each video frame to be encoded and used when encoding that frame.

Because the operation amount of the video coding algorithm is large, in order to improve the video coding speed and realize real-time coding and transmission, it is a common practice in the industry to use an Application Specific Integrated Circuit (ASIC) to accelerate the video coding process in hardware. Such application specific integrated circuits that hardware accelerate the video encoding process are commonly referred to as hardware video encoders.

When video surveillance and video conferencing scenes are encoded, periodic visual flicker is often observed in the encoded video stream, most noticeably in stationary areas of the scene. This periodic flicker is known as the "breathing effect". It is caused by the periodically encoded I frames in the video stream. Because an I frame uses only intra-frame coding and cannot reference previously encoded video frames, its distortion differs considerably in magnitude and pattern from that of the P or B frames encoded before it, and this difference is perceived as flicker when the decoded video stream is viewed. Since I frames are typically encoded periodically, the flicker also occurs periodically, producing the breathing effect. The breathing effect seriously degrades the viewer's subjective visual experience and should therefore be suppressed during encoding.

The most common way to suppress the breathing effect during video coding is to set the QP (Quantization Parameter) of an I frame close to the QP of the previously encoded P or B frame: QP of the I frame = QP of the previously encoded P or B frame + offset δ, where δ is an integer of small absolute value, generally in the range [-2, 2]. The distortion of the I frame is then close to that of the P or B frame encoded before it, which effectively suppresses the breathing effect. Once the QP of the I frame to be encoded has been calculated from the QP of the previously encoded P or B frame and the offset δ, the Lagrange multiplier λ of the I frame can be calculated from the relation between QP and λ and used in the actual encoding of the I frame.

Encoding I frames this way effectively suppresses the breathing effect, but for certain video scenes, particularly scenes that are temporally static and have complex spatial texture, it tends to produce oversized I frames, that is, I frames whose number of coded bits far exceeds that of the P or B frames. This causes severe fluctuation of the instantaneous bit rate and can push the actual average bit rate far above the preset target bit rate, resulting in abnormal rate control. In such scenes, the QP of the inter-coded P and B frames is small because the content is temporally static. To suppress the breathing effect, the QP of the I frame is set close to the QP of the previously encoded P or B frame, so it is also small. Because the spatial texture is complex, intra-coding the I frame with this small QP produces an oversized I frame far larger than the P or B frames; in severe cases the actual average bit rate far exceeds the preset target bit rate and rate control fails. In other words, setting the QP of the I frame close to the QP of the previously encoded P or B frame suppresses the breathing effect in this case but produces an oversized I frame and abnormal rate control.

The industry therefore seeks a technique that, during rate control, sets the QP of the I frame close to the QP of the previously encoded P or B frame whenever this would not produce an oversized I frame, thereby suppressing the breathing effect and improving subjective visual quality, and restricts the size of the I frame only when an oversized I frame would otherwise be encoded, thereby avoiding abnormal rate control.

Currently, some solutions have been used to avoid the generation of oversized I frames during video coding.

In one solution, the QP of the I frame is still calculated from the QP of the previously encoded P or B frame, but the offset δ is chosen to be a larger positive integer. Because the QP of the I frame is then much larger than that of the P or B frame, the encoded I frame is small, and its number of coded bits cannot far exceed that of the P or B frames. This solution effectively avoids oversized I frames, keeps instantaneous bit rate fluctuation small, and controls the average bit rate accurately. However, it also applies the much larger QP indiscriminately to scenes that would not have produced an oversized I frame. In those scenes the coding quality of the I frame is poor, and the coding quality of the P and B frames that directly or indirectly reference the I frame drops as well, so the coding efficiency of the whole video stream is greatly reduced.

In another solution, the QP of the I frame is independent of the QP of the previously encoded P or B frame and is calculated entirely from the target number of coded bits of the I frame. This solution effectively avoids oversized I frames, but the QP of the I frame can easily differ considerably from the QP of the previously encoded P or B frame. The subjective visual quality of the I frame then differs considerably from that of the P or B frames, causing serious visual flicker and greatly reducing the subjective visual quality of the whole video stream.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a rate control method that avoids generating oversized I frames during encoding while keeping the subjective visual quality of the I frame as consistent as possible with that of the previously encoded P and B frames, thereby suppressing the breathing effect.

To solve the above technical problem, the invention provides a rate control method for avoiding the generation of oversized I frames, comprising the following steps. Step S1: perform intra-frame coding mode prediction on each image block to be encoded, obtaining all intra-frame coding mode candidates of each block and their corresponding prediction costs. Step S2: screen the prediction costs corresponding to the intra-frame coding mode candidates of each image block to be encoded and select the minimum prediction cost to represent the intra-frame coding complexity of that block; then take the sum of the intra-frame coding complexities of all image blocks in a video frame as the intra-frame coding complexity of that video frame. Step S3: calculate, from the coding state of the current video sequence, the upper limit on the number of coded bits of an I frame that does not cause abnormal rate control. Step S4: calculate the lower limit of the Lagrange multiplier of the I frame to be encoded from the actual number of coded bits, the Lagrange multiplier, and the intra-frame coding complexity of the previously encoded I frame, combined with the upper limit on the number of coded bits and the intra-frame coding complexity of the I frame to be encoded; here the intra-frame coding complexity of the I frame to be encoded reuses the intra-frame coding complexity of the video frame preceding it. Step S5: calculate, from the quantization parameter of the previously encoded P or B frame, the Lagrange multiplier of the I frame to be encoded that suppresses the breathing effect. Steps S1 to S4 form one group; this group and step S5 may be performed simultaneously, or either may be performed before the other. Step S6: select the larger of the lower limit of the Lagrange multiplier of the I frame to be encoded and the breathing-effect-suppressing Lagrange multiplier as the Lagrange multiplier of the I frame to be encoded.
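The selection in step S6 can be sketched as follows. The function and argument names are hypothetical; the two inputs are assumed to have been produced by steps S4 and S5 respectively.

```python
def select_i_frame_lambda(lambda_lower_bound: float,
                          lambda_breathing: float) -> float:
    """Step S6 sketch: use the breathing-effect-suppressing multiplier
    unless it falls below the lower bound that keeps the I frame within
    its bit budget, in which case use the lower bound instead."""
    return max(lambda_lower_bound, lambda_breathing)


# Hypothetical values: the bound is not binding in the first case,
# and is binding in the second.
normal_scene = select_i_frame_lambda(30.0, 45.0)
static_complex_scene = select_i_frame_lambda(80.0, 45.0)
```

Taking the maximum means the breathing-effect setting is left untouched in ordinary scenes and is overridden only when it would lead to an I frame above the bit limit from step S3.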

Further, in step S1, the prediction cost corresponding to each intra-frame coding mode is calculated for each image block to be encoded, and the one or more intra-frame coding modes with the lowest prediction cost are taken as the intra-frame coding mode candidates of that block; the prediction cost is a rate-distortion cost.

Preferably, step S2 calculates at least the intra-frame coding complexity of each I frame and of the video frame preceding each I frame.

Further, in step S3, the upper limit on the number of coded bits of an I frame that does not cause abnormal rate control is calculated using the state information of the sliding window that smooths frame-level bit allocation in the rate control process.

Further, in step S3, the target number of coded bits of the sliding window, $R_{sw}$, is calculated as $R_{sw} = R_{PicAvg} \times (N_{coded} + N_{SW}) - R_{coded}$, where $R_{PicAvg}$ is the average target number of coded bits per video frame calculated from the average bit rate of the video sequence, $N_{coded}$ is the number of video frames of the sequence encoded so far, $N_{SW}$ is the size of the sliding window, and $R_{coded}$ is the sum of the actual numbers of coded bits of all video frames of the sequence encoded so far. The upper limit $R_{Imax}$ on the number of coded bits of an I frame that does not cause abnormal rate control is then calculated from $R_{sw}$:

[The formula for $R_{Imax}$ is missing from this text.]

where $N_I$ is the number of I frames in the sliding window and $\eta$ is a constant in the range (0, 1].

Further, step S4 is carried out using formula five:

[Formula five is missing from this text.]

where $\lambda_{new}$ is the lower limit of the Lagrange multiplier of the I frame to be encoded; $f^{-1}$ is the inverse of the first function $f$; $I(n)_{new}$ is the upper limit on the number of coded bits of the I frame to be encoded that does not cause abnormal rate control; $I(m)_{old}$ is the actual number of coded bits of the previously encoded I frame; $g$ is the second function; $\omega_{n-1}$ is the intra-frame coding complexity of the video frame immediately preceding the I frame to be encoded, used in place of the intra-frame coding complexity $\omega_n$ of the I frame to be encoded; $\omega_m$ is the intra-frame coding complexity of the previously encoded I frame; and $\lambda_{old}$ is the Lagrange multiplier of the previously encoded I frame.

Further, the first function $f$ is obtained from formula one:

[Formula one is missing from this text.]

Formula one expresses that, when the same video frame is I-frame encoded with different Lagrange multipliers, the ratio of the resulting actual numbers of coded bits is related to the ratio of the Lagrange multipliers; this relation is expressed by the first function $f$. The second function $g$ is obtained from formula two:

[Formula two is missing from this text.]

Formula two expresses that, when different video frames are I-frame encoded with the same Lagrange multiplier, the ratio of the resulting actual numbers of coded bits is related to the ratio of the intra-frame coding complexities of the video frames; this relation is expressed by the second function $g$.

Further, formula three is derived from formulas one and two:

[Formula three is missing from this text.]

Formula four is then derived:

[Formula four is missing from this text.]

where $\lambda_{new}$ is the Lagrange multiplier of the I frame to be encoded. Replacing $I(n)_{new}$ with the target number of coded bits of the I frame to be encoded, and replacing $\omega_n$ with the intra-frame coding complexity $\omega_{n-1}$ of the video frame encoded immediately before the I frame to be encoded, gives formula five. Setting $I(n)_{new}$ to the upper limit on the number of coded bits of the I frame to be encoded that does not cause abnormal rate control, the calculated $\lambda_{new}$ is the lower limit of the Lagrange multiplier of the I frame to be encoded.
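The chain from formula one to formula five can be sketched from the verbal definitions alone. The following is one possible reading, not the confirmed notation of the publication: it assumes each ratio is written with the "new" quantity (or frame n) in the numerator and the "old" quantity (or frame m) in the denominator, and that $f$ is invertible.

```latex
\begin{align}
\frac{I(m)_{new}}{I(m)_{old}} &= f\!\left(\frac{\lambda_{new}}{\lambda_{old}}\right)
  && \text{(one: same frame, different } \lambda\text{)} \\
\frac{I(n)_{new}}{I(m)_{new}} &= g\!\left(\frac{\omega_n}{\omega_m}\right)
  && \text{(two: different frames, same } \lambda\text{)} \\
\frac{I(n)_{new}}{I(m)_{old}} &= f\!\left(\frac{\lambda_{new}}{\lambda_{old}}\right)
  \cdot g\!\left(\frac{\omega_n}{\omega_m}\right)
  && \text{(three: product of one and two)} \\
\lambda_{new} &= \lambda_{old} \cdot f^{-1}\!\left(
  \frac{I(n)_{new}}{I(m)_{old} \cdot g\!\left(\omega_n / \omega_m\right)}\right)
  && \text{(four: solve three for } \lambda_{new}\text{)} \\
\lambda_{new} &= \lambda_{old} \cdot f^{-1}\!\left(
  \frac{I(n)_{new}}{I(m)_{old} \cdot g\!\left(\omega_{n-1} / \omega_m\right)}\right)
  && \text{(five: } \omega_n \to \omega_{n-1}\text{)}
\end{align}
```

Under this reading, every quantity on the right-hand side of the last line is known before the I frame is encoded, which matches the list of symbols the text gives for formula five.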

Further, in step S5, the quantization parameter $QP_I$ of the I frame to be encoded that suppresses the breathing effect is calculated from the quantization parameter $QP_{PB}$ of the P or B frame encoded before the I frame to be encoded and the offset $\delta$: $QP_I = QP_{PB} + \delta$, where $\delta$ is an integer offset. The Lagrange multiplier $\lambda_I$ of the I frame to be encoded that suppresses the breathing effect is then calculated from $QP_I$:

[The formula for $\lambda_I$ is missing from this text.]

where Exp() is the exponential function with base e, the base of the natural logarithm.
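Step S5 can be sketched as below. Because the exact QP-to-λ formula is not available here, the sketch substitutes the relation QP = 4.2005 × ln(λ) + 13.7122 used in the HEVC reference software rate control; this is an assumption, and the constants of the actual method may differ. The function names are hypothetical.

```python
import math

# Assumed constants, taken from the HEVC reference software relation
# QP = 4.2005 * ln(lambda) + 13.7122 (not confirmed by this document).
QP_LN_SCALE = 4.2005
QP_OFFSET = 13.7122


def breathing_qp(qp_pb: int, delta: int = 0) -> int:
    """QP_I = QP_PB + delta, with delta a small integer offset."""
    return qp_pb + delta


def lambda_from_qp(qp: float) -> float:
    """Invert the assumed relation: lambda = exp((QP - b) / a)."""
    return math.exp((qp - QP_OFFSET) / QP_LN_SCALE)


qp_i = breathing_qp(qp_pb=30, delta=1)
lambda_i = lambda_from_qp(qp_i)
```

Whatever constants are used, the relation is monotonic: a larger QP maps to a larger λ, so a small QP inherited from static P or B frames yields a small λ and, for complex textures, a large I frame.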

The application also provides a rate control device for avoiding the generation of oversized I frames, comprising an intra-frame coding mode prediction module, an intra-frame coding complexity calculation module, an I-frame coded-bit upper limit calculation module, an I-frame Lagrange multiplier lower limit calculation module, an I-frame Lagrange multiplier calculation module, and an I-frame Lagrange multiplier selection module. The intra-frame coding mode prediction module performs intra-frame coding mode prediction on each image block to be encoded and outputs all intra-frame coding mode candidates of each block and their corresponding prediction costs. The intra-frame coding complexity calculation module screens the prediction costs corresponding to the intra-frame coding mode candidates of each image block to be encoded and selects the minimum prediction cost to represent the intra-frame coding complexity of that block; it then takes the sum of the intra-frame coding complexities of all image blocks in a video frame as the intra-frame coding complexity of that video frame. The I-frame coded-bit upper limit calculation module calculates, from the coding state of the current video sequence, the upper limit on the number of coded bits of an I frame that does not cause abnormal rate control. The I-frame Lagrange multiplier lower limit calculation module calculates the lower limit of the Lagrange multiplier of the I frame to be encoded from the actual number of coded bits, the Lagrange multiplier, and the intra-frame coding complexity of the previously encoded I frame, combined with the upper limit on the number of coded bits and the intra-frame coding complexity of the I frame to be encoded; here the intra-frame coding complexity of the I frame to be encoded reuses the intra-frame coding complexity of the video frame preceding it. The I-frame Lagrange multiplier calculation module calculates, from the quantization parameter of the previously encoded P or B frame, the Lagrange multiplier of the I frame to be encoded that suppresses the breathing effect. The I-frame Lagrange multiplier selection module selects the larger of the lower limit of the Lagrange multiplier of the I frame to be encoded and the breathing-effect-suppressing Lagrange multiplier as the Lagrange multiplier of the I frame to be encoded.

The technical effect of the invention is an I-frame rate control method that suppresses the breathing effect while avoiding the generation of oversized I frames during encoding.

Drawings

Fig. 1 is a schematic diagram of a common image group structure.

Fig. 2 is a flow chart of a code rate control method for avoiding generating an oversized I frame.

Fig. 3 is a schematic structural diagram of a rate control device for avoiding generating an oversized I frame according to the present application.

Reference numerals in the drawings: 1 is the intra-frame coding mode prediction module, 2 is the intra-frame coding complexity calculation module, 3 is the I-frame coded-bit upper limit calculation module, 4 is the I-frame Lagrange multiplier lower limit calculation module, 5 is the I-frame Lagrange multiplier calculation module, and 6 is the I-frame Lagrange multiplier selection module.

Detailed Description

Referring to fig. 2, the code rate control method for avoiding generating an oversized I frame provided in the present application includes the following steps.

Step S1: perform intra-frame coding mode prediction on each image block to be encoded. That is, calculate the prediction cost corresponding to each intra-frame coding mode for each image block to be encoded, take the one or more intra-frame coding modes with the lowest prediction cost as the intra-frame coding mode candidates of that block, and output all intra-frame coding mode candidates of each block together with their prediction costs. The prediction cost is a low-complexity, low-precision rate-distortion cost J, with J = D + λ × R. Here D is the coding distortion, typically computed with SAD (Sum of Absolute Differences) or SATD (Sum of Absolute Transformed Differences); R is the coding bit rate, typically estimated with a simplified algorithm such as Exponential-Golomb coding. Transform and quantization operations are generally not part of this calculation. This step is an inherent part of the existing video coding process, and intra-frame coding mode prediction is performed for all image blocks of I frames, P frames, and B frames.
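A low-precision prediction cost of this kind can be sketched as follows. The sketch uses SAD for D and, as a simplifying assumption, takes R to be the length of the order-0 unsigned Exp-Golomb code of a mode index; a real encoder estimates more syntax elements than this. All names are hypothetical.

```python
from typing import Sequence


def sad(original: Sequence[int], predicted: Sequence[int]) -> int:
    """Sum of absolute differences between original and predicted samples."""
    return sum(abs(o - p) for o, p in zip(original, predicted))


def exp_golomb_bits(value: int) -> int:
    """Bit length of the order-0 unsigned Exp-Golomb code of value >= 0,
    i.e. 2 * floor(log2(value + 1)) + 1."""
    return 2 * (value + 1).bit_length() - 1


def prediction_cost(original: Sequence[int], predicted: Sequence[int],
                    mode_index: int, lam: float) -> float:
    """J = D + lambda * R with D = SAD and R = assumed mode-index bits."""
    return sad(original, predicted) + lam * exp_golomb_bits(mode_index)


# Hypothetical 3-sample block and its prediction under mode index 3
cost = prediction_cost([10, 20, 30], [12, 18, 30], mode_index=3, lam=2.0)
```

No transform or quantization is involved, which is what keeps this cost cheap enough to evaluate for every candidate mode of every block.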

Both coding mode prediction and coding mode rate-distortion optimization use rate-distortion optimization to screen the coding modes of an image block. They differ mainly in the precision with which the rate-distortion cost is calculated. Coding mode rate-distortion optimization generally uses a rate-distortion cost that is more expensive to compute and more precise: coding distortion is generally calculated with SSD (Sum of Squared Differences), the coding bit rate is generally calculated with CABAC (Context Adaptive Binary Arithmetic Coding) or CAVLC (Context Adaptive Variable Length Coding), and transform and quantization operations are generally part of the calculation.

Step S2: screen the prediction costs corresponding to the intra-frame coding mode candidates of each image block to be encoded and select the minimum prediction cost to represent the intra-frame coding complexity of that block; then take the sum of the intra-frame coding complexities of all image blocks in a video frame as the intra-frame coding complexity of that video frame. Preferably, this step calculates at least the intra-frame coding complexity of each I frame and of the video frame preceding each I frame, so that the results can serve subsequent I frames. This step reuses the results of intra-frame coding mode prediction (step S1), requires no additional image preprocessing stage, and has a low hardware implementation cost.
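The complexity measure of step S2 can be sketched as below, assuming the candidate prediction costs of each block are already available from step S1. The function names and the sample costs are hypothetical.

```python
from typing import List


def block_complexity(candidate_costs: List[float]) -> float:
    """Intra-frame coding complexity of one block: its minimum candidate cost."""
    return min(candidate_costs)


def frame_complexity(blocks: List[List[float]]) -> float:
    """Intra-frame coding complexity of a frame: sum over all its blocks."""
    return sum(block_complexity(costs) for costs in blocks)


# Hypothetical frame with three blocks and their candidate costs
omega = frame_complexity([[120.0, 95.0], [300.0], [80.0, 82.0, 79.5]])
```

Since the candidate costs are a by-product of mode prediction, the only extra work is a minimum per block and one running sum per frame.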

Step S3: calculate, from the coding state of the current video sequence, the upper limit on the number of coded bits of an I frame that does not cause abnormal rate control. For example, this upper limit is calculated using the state information of the sliding window (Sliding Window, SW) that smooths frame-level bit allocation in the rate control process.

During video encoding, once a video frame has been encoded there is often an error between its actual number of coded bits and the target number of coded bits calculated before encoding. This error can be smoothed over a sliding window consisting of several subsequent consecutive video frames to reduce instantaneous bit rate fluctuation. The number of consecutive video frames in the sliding window is called the size of the sliding window, and the sum of the target numbers of coded bits of all video frames in it is called the target number of coded bits of the sliding window. The target number of coded bits of the sliding window can be calculated as $R_{sw} = R_{PicAvg} \times (N_{coded} + N_{SW}) - R_{coded}$, where $R_{sw}$ is the target number of coded bits of the sliding window, $R_{PicAvg}$ is the average target number of coded bits per video frame calculated from the average bit rate of the video sequence, $N_{coded}$ is the number of video frames of the sequence encoded so far, $N_{SW}$ is the size of the sliding window, and $R_{coded}$ is the sum of the actual numbers of coded bits of all video frames of the sequence encoded so far. $R_{PicAvg}$, $N_{coded}$, $N_{SW}$, and $R_{coded}$ characterize the coding state of the current video sequence, so the value of $R_{sw}$ changes with that state as encoding proceeds. Before each I frame is encoded, this step calculates $R_{sw}$ from the current coding state of the video sequence and then calculates from $R_{sw}$ the upper limit on the number of coded bits of an I frame that does not cause abnormal rate control. For example, the following formula is used:

[The formula for $R_{Imax}$ is missing from this text.]

where $R_{Imax}$ is the upper limit on the number of coded bits of an I frame that does not cause abnormal rate control; $R_{sw}$ is the target number of coded bits of the sliding window; $N_I$ is the number of I frames in the sliding window; and $\eta$ is a constant in the range (0, 1] whose specific value depends on the coding parameters of the video sequence and can generally be set to 0.67.
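The sliding-window budget $R_{sw}$ can be worked through with a short sketch. The numbers are hypothetical (a 2 Mbit/s target at 25 frames per second), and the sketch stops at $R_{sw}$ because the expression that maps $R_{sw}$, $N_I$, and $\eta$ to $R_{Imax}$ is not available in this text.

```python
def sliding_window_target_bits(r_pic_avg: int, n_coded: int,
                               n_sw: int, r_coded: int) -> int:
    """R_sw = R_PicAvg * (N_coded + N_SW) - R_coded."""
    return r_pic_avg * (n_coded + n_sw) - r_coded


# Hypothetical coding state: 2,000,000 bit/s at 25 fps gives 80,000 bits per frame.
R_PIC_AVG = 2_000_000 // 25

# 100 frames coded, 400,000 bits over budget so far, window of 40 frames.
r_sw = sliding_window_target_bits(
    r_pic_avg=R_PIC_AVG, n_coded=100, n_sw=40, r_coded=8_400_000)

# With no accumulated error the window would get 40 * 80,000 bits.
on_target_budget = R_PIC_AVG * 40
```

In this example the 400,000-bit overspend is taken out of the next window, so $R_{sw}$ is 2,800,000 bits rather than 3,200,000, and any I-frame limit derived from it tightens accordingly.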

Step S4: calculate the lower limit of the Lagrange multiplier of the I frame to be encoded from the actual number of coded bits, the Lagrange multiplier, and the intra-frame coding complexity of the previously encoded I frame, combined with the upper limit on the number of coded bits and the intra-frame coding complexity of the I frame to be encoded. Here the intra-frame coding complexity of the I frame to be encoded reuses the intra-frame coding complexity of the video frame preceding it.

As one example, the lower limit of the Lagrange multiplier of the I frame to be encoded is obtained from formula five:

[Formula five is missing from this text.]

where $\lambda_{new}$ is the Lagrange multiplier of the I frame to be encoded, and $f^{-1}$ is the inverse of the first function $f$. $I(n)_{new}$ is the actual number of coded bits obtained when the I frame to be encoded is encoded with $\lambda_{new}$; because that frame has not yet been encoded, its actual number of coded bits is not available and is approximated by the target number of coded bits of the I frame to be encoded. $I(m)_{old}$ is the actual number of coded bits of the previously encoded I frame. $g$ is the second function. $\omega_{n-1}$ is the intra-frame coding complexity of the video frame immediately preceding the I frame to be encoded, used in place of the intra-frame coding complexity $\omega_n$ of the I frame to be encoded. $\omega_m$ is the intra-frame coding complexity of the previously encoded I frame. $\lambda_{old}$ is the Lagrange multiplier of the previously encoded I frame. In formula five, setting $I(n)_{new}$ to the upper limit on the number of coded bits of the I frame to be encoded that does not cause abnormal rate control, the calculated $\lambda_{new}$ is the lower limit of the Lagrange multiplier of the I frame to be encoded.

Referring to fig. 1, the m-th frame and the n-th frame are two adjacent I frames, between which zero or more P frames or B frames may exist. $\lambda_{old}$ and $\lambda_{new}$ represent two different values of the Lagrangian multiplier. $I(m)_{old}$ represents the actual number of encoded bits after the m-th frame is encoded with $\lambda_{old}$. $I(m)_{new}$ represents the actual number of encoded bits after the m-th frame is encoded with $\lambda_{new}$. $I(n)_{old}$ represents the actual number of encoded bits after the n-th frame is encoded with $\lambda_{old}$. $I(n)_{new}$ represents the actual number of encoded bits after the n-th frame is encoded with $\lambda_{new}$. $\omega_m$ represents the intra-frame coding complexity of the m-th frame. $\omega_n$ represents the intra-frame coding complexity of the n-th frame.

The first function $f$ is obtained from equation one:

The physical meaning of equation one is as follows: when the same video frame is encoded as an I frame with different Lagrangian multipliers, the ratio of the actual numbers of encoded bits obtained is related to the ratio of the Lagrangian multipliers, and this relation is expressed by the first function $f$. The specific expression of the first function $f$ can be obtained by function fitting on the two sides of the equal sign of equation one, and is not described in detail here.

The second function $g$ is obtained from equation two:

The physical meaning of equation two is as follows: when different video frames are encoded as I frames with the same Lagrangian multiplier, the ratio of the actual numbers of encoded bits obtained is related to the ratio of the intra-frame coding complexities of the video frames, and this relation is expressed by the second function $g$. The specific expression of the second function $g$ can be obtained by function fitting on the two sides of the equal sign of equation two, and is not described in detail here.

In a specific implementation, multiple experiments are performed in advance, the expressions of the first function $f$ and the second function $g$ are obtained by function fitting, and these expressions are stored in the video encoder as known functional relations.
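The offline fitting step can be sketched as follows. The application does not state which function family is fitted, so the power-law form $y = c \cdot x^{k}$ used here is purely an assumption chosen for illustration, and the sample points are synthetic values generated inside the script rather than encoder measurements. The helper name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = c * x**k in the log-log domain; returns (c, k)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    k = sxy / sxx                      # slope in the log-log domain
    c = math.exp(mean_y - k * mean_x)  # intercept mapped back to linear scale
    return c, k


# Synthetic stand-in for measured (multiplier ratio, bit ratio) pairs,
# generated from an assumed exponent of -0.8. Real pairs would come from
# encoding the same frame with different Lagrangian multipliers.
lambda_ratios = [0.5, 0.75, 1.0, 1.5, 2.0, 3.0]
bit_ratios = [r ** -0.8 for r in lambda_ratios]

c, k = fit_power_law(lambda_ratios, bit_ratios)
print(c, k)
```

The same routine could be applied to pairs of complexity ratios and bit ratios to obtain a candidate for $g$; whether a power law is an adequate model is something the experiments themselves would have to confirm.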

Equation three is derived from equations one and two:

Further derivation yields equation four:

In equation four, the m-th frame is an already encoded I frame, so its actual number of encoded bits $I(m)_{old}$, the Lagrangian multiplier $\lambda_{old}$ used to encode it and its intra-frame coding complexity $\omega_m$ are all known quantities. The n-th frame is the I frame to be encoded. $I(n)_{new}$ originally denotes the actual number of encoded bits of the I frame to be encoded, which cannot be obtained because encoding of the n-th frame has not yet started. In order to derive the Lagrangian multiplier required for encoding the I frame before it is encoded, the present application approximates the actual number of encoded bits by the target number of encoded bits of the I frame to be encoded. Because the target number of encoded bits is precisely the number of bits the frame is expected to produce after actual encoding, the error introduced by this approximation is small. With this substitution, $I(n)_{new}$ is a known quantity. $\omega_n$, the intra-frame coding complexity of the n-th frame, is unknown because the n-th frame has not yet been encoded. $\lambda_{new}$ is the Lagrangian multiplier that must be used for the actual number of encoded bits of the n-th frame to equal the desired target $I(n)_{new}$; it is the unknown to be solved for. To calculate the Lagrangian multiplier $\lambda_{new}$ of the I frame to be encoded with equation four, the present invention exploits the temporal correlation of consecutive video frames (consecutive video frames are similar in content) and uses the intra-frame coding complexity $\omega_{n-1}$ of the already encoded (n-1)-th frame, which is adjacent to the n-th frame, in place of the intra-frame coding complexity $\omega_n$ of the n-th frame to be encoded.

Because the sampling times of the (n-1)-th frame and the n-th frame are very close, the contents of the two video frames are very similar, so the error introduced by substituting the intra-frame coding complexity is small and does not significantly affect the calculation of the frame-level Lagrangian multiplier $\lambda_{new}$ of the I frame to be encoded. Since encoding of the (n-1)-th frame has been completed, its intra-frame coding complexity $\omega_{n-1}$ is a known quantity. Substituting $\omega_{n-1}$ for $\omega_n$ in equation four yields equation five.
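A minimal sketch of the step S4 calculation is given below. It rests on two assumptions that are not taken from the application: first, that the two relations combine multiplicatively, so that the predicted bits of the n-th frame are approximately $I(m)_{old} \cdot f(\lambda_{new}/\lambda_{old}) \cdot g(\omega_{n-1}/\omega_m)$, which is one reading of the physical meanings stated for equations one and two; second, that $f$ and $g$ are power laws with the hypothetical exponents shown. All function names and numbers are illustrative.

```python
def f(x: float, a: float = 0.8) -> float:
    """Hypothetical first function: bit ratio for a Lagrangian multiplier ratio x."""
    return x ** -a


def f_inv(y: float, a: float = 0.8) -> float:
    """Inverse of the hypothetical f: multiplier ratio that yields bit ratio y."""
    return y ** (-1.0 / a)


def g(x: float, b: float = 1.0) -> float:
    """Hypothetical second function: bit ratio for a complexity ratio x."""
    return x ** b


def lambda_lower_bound(bits_cap: float, bits_old: float, lambda_old: float,
                       omega_prev: float, omega_old: float) -> float:
    """Smallest multiplier expected to keep the I frame at or below bits_cap.

    bits_cap   : upper limit of encoded bits for the I frame to be encoded
    bits_old   : actual bits of the previously encoded I frame, I(m)_old
    lambda_old : multiplier used for the previously encoded I frame
    omega_prev : intra complexity of frame n-1 (stands in for frame n)
    omega_old  : intra complexity of the previously encoded I frame
    """
    complexity_scale = g(omega_prev / omega_old)
    return lambda_old * f_inv(bits_cap / (bits_old * complexity_scale))


# Assumed example: the scene became twice as complex, the cap stayed the same.
bound = lambda_lower_bound(bits_cap=400_000, bits_old=400_000, lambda_old=50.0,
                           omega_prev=2.0e6, omega_old=1.0e6)
print(bound)
```

Under these assumptions a higher complexity with an unchanged bit cap pushes the lower bound above $\lambda_{old}$, which matches the intent of limiting I frame size through a larger multiplier.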

Step S5: According to the quantization parameter of the previously encoded P frame or B frame, calculate the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect.

In this step, a quantization parameter $QP_I$ with a breathing-effect suppression effect is first calculated for the I frame to be encoded from the quantization parameter $QP_{PB}$ of the P frame or B frame encoded before it and an offset $\delta$. The calculation formula is: $QP_I = QP_{PB} + \delta$, where $\delta$ is an integer offset with a small absolute value, for example in [-2, 2]. In this way, the distortion of the I frame is close to that of the P frame or B frame encoded before it, and the breathing effect is effectively suppressed. Then, from the quantization parameter $QP_I$, the Lagrangian multiplier $\lambda_I$ of the I frame to be encoded that has a breathing-effect suppression effect is calculated. The calculation formula is as follows:

where Exp() is the exponential function with base $e$, the base of the natural logarithm.
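Step S5 can be sketched as follows. The offset formula $QP_I = QP_{PB} + \delta$ is taken from the text. The mapping from quantization parameter to Lagrangian multiplier, however, is an assumption: the sketch uses the relation $QP = 4.2005 \ln\lambda + 13.7122$ known from the HEVC reference software rate control, and these constants are not taken from the application. Function names are hypothetical.

```python
import math

# Assumed constants of the QP / lambda relation QP = A * ln(lambda) + B,
# as used in HEVC reference software rate control; not from the application.
A = 4.2005
B = 13.7122


def qp_for_i_frame(qp_pb: int, delta: int = 0) -> int:
    """QP_I = QP_PB + delta, with delta a small integer offset."""
    return qp_pb + delta


def lambda_from_qp(qp: float) -> float:
    """Invert QP = A * ln(lambda) + B to obtain the Lagrangian multiplier."""
    return math.exp((qp - B) / A)


# Assumed example: the preceding P frame used QP 30 and the offset is +1.
qp_i = qp_for_i_frame(qp_pb=30, delta=1)
lambda_i = lambda_from_qp(qp_i)
print(qp_i, lambda_i)
```

Keeping $\delta$ small keeps the I frame's quantization close to that of its neighbours, which is what suppresses the visible quality jump at each I frame.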

Steps S1, S2, S3 and S4 form a group. No strict order is imposed between this group and step S5: they may be performed simultaneously, or either may be performed before the other.

Step S6: Select the larger of the lower limit of the Lagrangian multiplier of the I frame to be encoded and the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect as the Lagrangian multiplier of the I frame to be encoded, and use it in the actual encoding process of the I frame to be encoded.

Although the Lagrangian multiplier $\lambda_I$ calculated in step S5 can effectively suppress the breathing effect, in some coding scenarios it may produce an oversized I frame and thus cause abnormal code rate control. Therefore, when the Lagrangian multiplier $\lambda_I$ with a breathing-effect suppression effect is greater than or equal to the lower limit $\lambda_{new}$ of the Lagrangian multiplier that does not cause abnormal code rate control, that is, when $\lambda_I \geq \lambda_{new}$, $\lambda_I$ is selected for the actual encoding of the I frame to be encoded, which effectively suppresses the breathing effect. When $\lambda_I < \lambda_{new}$, $\lambda_{new}$ is selected for the actual encoding of the I frame to be encoded, which effectively avoids the generation of an oversized I frame. In this way, the object of the invention is achieved, namely: in the code rate control process of video coding, when the I frame would not be oversized, the QP of the I frame is set near the QP of the previously encoded P frame or B frame so as to suppress the breathing effect and improve subjective visual quality; the size of the I frame is restricted only when an oversized I frame would otherwise be produced, so that abnormal code rate control is avoided.
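The selection rule of step S6 reduces to taking the larger of the two multipliers. The sketch below illustrates both branches; the function name and the numeric values are illustrative assumptions.

```python
def select_i_frame_lambda(lambda_i: float, lambda_lower: float) -> float:
    """Return the multiplier used to encode the I frame.

    lambda_i     : multiplier with breathing-effect suppression (step S5)
    lambda_lower : lower limit that avoids an oversized I frame (step S4)
    """
    return max(lambda_i, lambda_lower)


# Case 1: lambda_i is at or above the lower limit, so breathing-effect
# suppression decides the multiplier.
normal_case = select_i_frame_lambda(lambda_i=60.0, lambda_lower=45.0)

# Case 2: lambda_i would produce an oversized I frame, so the lower limit
# takes over and the frame is encoded more coarsely.
capped_case = select_i_frame_lambda(lambda_i=60.0, lambda_lower=95.0)

print(normal_case, capped_case)
```

Because a larger multiplier yields fewer bits, clamping from below is sufficient to enforce the upper limit on I frame size without otherwise disturbing the choice made in step S5.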

Referring to fig. 3, the code rate control device for avoiding generation of an oversized I frame provided in the present application includes an intra-frame coding mode prediction module 1, an intra-frame coding complexity calculation module 2, an I frame coding bit number upper limit calculation module 3, an I frame Lagrangian multiplier lower limit calculation module 4, an I frame Lagrangian multiplier calculation module 5, and an I frame Lagrangian multiplier selection module 6. The device shown in fig. 3 corresponds to the method shown in fig. 2.

The intra-frame coding mode prediction module 1 is configured to perform intra-frame coding mode prediction on each image block to be coded, and output all intra-frame coding mode candidates of each image block to be coded and prediction costs corresponding to the candidates.

The intra-frame coding complexity calculation module 2 is configured to screen prediction costs corresponding to each intra-frame coding mode candidate of an image block to be coded, and select a minimum prediction cost to represent the intra-frame coding complexity of the image block to be coded; the sum of the intra-coding complexity of all image blocks within a video frame is then taken as the intra-coding complexity of that video frame.

The I frame coding bit number upper limit value calculating module 3 is configured to calculate, according to the coding state of the current video sequence, a coding bit number upper limit value of an I frame that does not cause an abnormal rate control.

The I frame Lagrangian multiplier lower limit calculation module 4 is configured to calculate the lower limit of the Lagrangian multiplier of the I frame to be encoded according to the actual number of encoded bits, the Lagrangian multiplier and the intra-frame coding complexity of the previously encoded I frame, combined with the upper limit of the number of encoded bits and the intra-frame coding complexity of the I frame to be encoded. Here, the intra-frame coding complexity of the I frame to be encoded reuses the intra-frame coding complexity of the video frame immediately preceding the I frame to be encoded.

The I frame Lagrangian multiplier calculation module 5 is configured to calculate, according to the quantization parameter of the previously encoded P frame or B frame, the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect.

The I frame Lagrangian multiplier selection module 6 is configured to select the larger of the lower limit of the Lagrangian multiplier of the I frame to be encoded and the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect as the Lagrangian multiplier of the I frame to be encoded.

The code rate control method provided by the present application uses the prediction cost of the intra-frame coding modes, which is already generated during video frame encoding, to represent the intra-frame coding complexity of the video frame. No additional image preprocessing stage is needed, so both hardware cost and implementation cost are low. The present application also uses the information of the encoded video frames and the encoding state of the video sequence to calculate the lower limit of the Lagrangian multiplier of the I frame to be encoded that does not cause abnormal code rate control, and uses this lower limit to constrain the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect. As a result, the generation of an oversized I frame is avoided while the breathing effect is still effectively suppressed, the subjective visual quality of the video is improved, and the normal code rate control process is ensured.

The Chinese patent application with application publication number CN115550656A and application publication date December 30, 2022 discloses a method and a device for controlling the bit rate of an I frame video frame that are suitable for hardware implementation. Compared with CN115550656A, the technical innovations of the present application are mainly the following.

First, the goal of CN115550656A is to achieve accurate control of the bit rate of the I frame, that is, to make the actual number of encoded bits of the I frame to be encoded as close as possible to the target number of encoded bits. The goal of the present application is to avoid the generation of oversized I frames during encoding while keeping the subjective visual quality of the I frame as consistent as possible with that of the previously encoded P frames and B frames, so as to suppress the breathing effect.

Second, equations one to five appear in both CN115550656A and the present application in the same written form, but the meanings of the variables in them are quite different. In CN115550656A, equation five is used to calculate the Lagrangian multiplier required for a given target number of encoded bits of the I frame to be encoded: $I(n)_{new}$ represents the target number of encoded bits of the I frame to be encoded, and $\lambda_{new}$ represents the Lagrangian multiplier required for the number of bits after actual encoding of the I frame to equal the target number of encoded bits $I(n)_{new}$. In the present application, equation five is used to calculate the lower limit of the Lagrangian multiplier of the I frame to be encoded that does not cause abnormal code rate control: $I(n)_{new}$ represents the upper limit of the number of encoded bits of the I frame to be encoded that does not cause abnormal code rate control, and $\lambda_{new}$ represents the lower limit of the Lagrangian multiplier of the I frame to be encoded that does not cause abnormal code rate control.

Third, in the present application, when the lower limit of the Lagrangian multiplier of the I frame to be encoded that does not cause abnormal code rate control is calculated, the upper limit of the number of encoded bits of the I frame that does not cause abnormal code rate control is first calculated according to the encoding state of the current video sequence (i.e. step S3). This is not found in CN115550656A.

Fourth, in the present application, the Lagrangian multiplier $\lambda_I$ of the I frame to be encoded that has a breathing-effect suppression effect is calculated (i.e. step S5). This is also absent from CN115550656A.

Fifth, in CN115550656A, once the Lagrangian multiplier $\lambda_{new}$ of the I frame to be encoded has been calculated with equation five, $\lambda_{new}$ is used directly to encode the I frame to be encoded. In the present application, after the lower limit $\lambda_{new}$ of the Lagrangian multiplier of the I frame to be encoded that does not cause abnormal code rate control has been calculated with equation five, $\lambda_{new}$ is used only as a lower threshold: only when the Lagrangian multiplier $\lambda_I$ with a breathing-effect suppression effect is smaller than $\lambda_{new}$ is $\lambda_{new}$ used to encode the I frame to be encoded; otherwise $\lambda_I$ is used (i.e. step S6).

The above are only preferred embodiments of the present invention, and are not intended to limit the present invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A code rate control method for avoiding generation of an oversized I frame, characterized by comprising the following steps:

step S1: carrying out intra-frame coding mode prediction on each image block to be coded to obtain all intra-frame coding mode candidates of each image block to be coded and corresponding prediction cost;

step S2: screening prediction cost corresponding to each intra-frame coding mode candidate of the image block to be coded, and selecting the minimum prediction cost to represent the intra-frame coding complexity of the image block to be coded; then taking the sum of the intra-frame coding complexity of all image blocks in the video frame as the intra-frame coding complexity of the video frame;

Step S3: according to the coding state of the current video sequence, calculating the upper limit value of the coding bit number of the I frame which does not cause abnormal code rate control;

step S4: according to the actual number of encoded bits, the Lagrangian multiplier and the intra-frame coding complexity of the previously encoded I frame, combined with the upper limit of the number of encoded bits and the intra-frame coding complexity of the I frame to be encoded, calculating the lower limit of the Lagrangian multiplier of the I frame to be encoded; wherein the intra-frame coding complexity of the I frame to be encoded reuses the intra-frame coding complexity of the video frame immediately preceding the I frame to be encoded;

step S5: according to the quantization parameter of the previously encoded P frame or B frame, calculating the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect;

the steps S1 to S4 form a group; this group and step S5 are performed simultaneously, or either is performed before the other;

step S6: selecting the larger of the lower limit of the Lagrangian multiplier of the I frame to be encoded and the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect as the Lagrangian multiplier of the I frame to be encoded.

2. The method according to claim 1, wherein in the step S1, a prediction cost corresponding to each intra-frame coding mode is calculated for each image block to be coded, and one or more intra-frame coding modes with the lowest prediction cost are used as intra-frame coding mode candidates for the image block to be coded; the prediction cost refers to a coding rate distortion cost.

3. The method according to claim 1, wherein said step S2 calculates at least the intra-coding complexity of each I-frame and one video frame preceding each I-frame.

4. The method according to claim 1, wherein in step S3, the upper limit value of the number of encoded bits of the I frame that does not cause abnormal rate control is further calculated by using state information of a sliding window used for smoothing frame-level bit allocation in the rate control process.

5. The code rate control method for avoiding generation of an oversized I frame according to claim 4, wherein in step S3, the target number of encoded bits $R_{sw}$ of the sliding window is first calculated as $R_{sw} = R_{PicAvg} \times (N_{coded} + N_{SW}) - R_{coded}$; wherein $R_{PicAvg}$ is the average target number of encoded bits per video frame calculated from the average code rate of the video sequence, $N_{coded}$ is the number of video frames already encoded in the video sequence, $N_{SW}$ is the size of the sliding window, and $R_{coded}$ is the sum of the actual numbers of encoded bits of all video frames already encoded in the video sequence; then, according to $R_{sw}$, the upper limit $R_{Imax}$ of the number of encoded bits of the I frame that does not cause abnormal code rate control is calculated;

wherein $N_I$ is the number of I frames in the sliding window; $\eta$ is a constant with a value range of (0, 1].

6. The code rate control method for avoiding generation of an oversized I frame according to claim 4, wherein in step S4 the lower limit is obtained by solving equation five:

wherein $\lambda_{new}$ is the lower limit of the Lagrangian multiplier of the I frame to be encoded; $f^{-1}$ is the inverse function of the first function $f$; $I(n)_{new}$ represents the upper limit of the number of encoded bits of the I frame to be encoded that does not cause abnormal code rate control; $I(m)_{old}$ represents the actual number of encoded bits of the previously encoded I frame; $g$ is the second function; $\omega_{n-1}$ represents the intra-frame coding complexity of the video frame immediately preceding the I frame to be encoded, and is used in place of the intra-frame coding complexity $\omega_n$ of the I frame to be encoded; $\omega_m$ represents the intra-frame coding complexity of the previously encoded I frame; $\lambda_{old}$ represents the Lagrangian multiplier of the previously encoded I frame.

7. The code rate control method for avoiding generation of an oversized I frame according to claim 6, wherein the first function $f$ is obtained from equation one:

equation one represents: when the same video frame is encoded as an I frame with different Lagrangian multipliers, the ratio of the actual numbers of encoded bits obtained is related to the ratio of the Lagrangian multipliers, and this relation is expressed by the first function $f$;

the second function $g$ is obtained from equation two:

8. The code rate control method for avoiding generation of an oversized I frame according to claim 7, wherein equation three is derived from equation one and equation two:

further derivation yields equation four:

9. The method according to claim 1, wherein in step S5, a quantization parameter $QP_I$ with a breathing-effect suppression effect is first calculated for the I frame to be encoded from the quantization parameter $QP_{PB}$ of the P frame or B frame encoded before the I frame to be encoded and an offset $\delta$, the calculation formula being: $QP_I = QP_{PB} + \delta$; wherein $\delta$ is an integer offset; then, according to the quantization parameter $QP_I$, the Lagrangian multiplier $\lambda_I$ of the I frame to be encoded that has a breathing-effect suppression effect is calculated, the calculation formula being as follows:

wherein Exp() is the exponential function with base $e$, the base of the natural logarithm.

10. A code rate control device for avoiding generation of an oversized I frame, characterized by comprising an intra-frame coding mode prediction module, an intra-frame coding complexity calculation module, an I frame coding bit number upper limit calculation module, an I frame Lagrangian multiplier lower limit calculation module, an I frame Lagrangian multiplier calculation module and an I frame Lagrangian multiplier selection module;

the intra-frame coding mode prediction module is used for carrying out intra-frame coding mode prediction on each image block to be coded and outputting all intra-frame coding mode candidates of each image block to be coded and corresponding prediction cost;

the intra-frame coding complexity calculation module is used for screening the prediction cost corresponding to each intra-frame coding mode candidate of the image block to be coded, and selecting the minimum prediction cost to represent the intra-frame coding complexity of the image block to be coded; then taking the sum of the intra-frame coding complexity of all image blocks in the video frame as the intra-frame coding complexity of the video frame;

The I frame coding bit number upper limit value calculation module is used for calculating the coding bit number upper limit value of the I frame which does not cause abnormal code rate control according to the coding state of the current video sequence;

the I frame Lagrangian multiplier lower limit calculation module is used for calculating the lower limit of the Lagrangian multiplier of the I frame to be encoded according to the actual number of encoded bits, the Lagrangian multiplier and the intra-frame coding complexity of the previously encoded I frame, combined with the upper limit of the number of encoded bits and the intra-frame coding complexity of the I frame to be encoded; wherein the intra-frame coding complexity of the I frame to be encoded reuses the intra-frame coding complexity of the video frame immediately preceding the I frame to be encoded;

the I frame Lagrangian multiplier calculation module is used for calculating, according to the quantization parameter of the previously encoded P frame or B frame, the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect;

the I frame Lagrangian multiplier selection module is used for selecting the larger of the lower limit of the Lagrangian multiplier of the I frame to be encoded and the Lagrangian multiplier of the I frame to be encoded that has a breathing-effect suppression effect as the Lagrangian multiplier of the I frame to be encoded.