CN112291564B

CN112291564B - HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video

Info

Publication number: CN112291564B
Application number: CN202011305568.6A
Authority: CN
Inventors: 公衍超; 王玲; 刘颖; 杨楷芳; 王富平; 林庆帆; 高梓铭; 韦同胜
Original assignee: Xian University of Posts and Telecommunications
Current assignee: Xian University of Posts and Telecommunications
Priority date: 2020-11-20
Filing date: 2020-11-20
Publication date: 2021-09-14
Anticipated expiration: 2040-11-20
Also published as: CN112291564A

Abstract

An HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video comprises the steps of encoding the 1st frame of the surveillance video, determining the average frame difference of the LCU, determining the motion perception sensitivity factor of the LCU, determining the average gradient of the LCU, determining the texture perception sensitivity factor of the LCU, determining the target bits of the LCU, and encoding the subsequent I frames of the surveillance video. The method allocates more bits to visually sensitive regions with moderate texture complexity and low motion speed, and relatively fewer bits to regions with flat texture, overly complex texture, or high motion speed. It addresses the technical problems that existing rate control techniques do not fully consider the perceptual characteristics of the human visual system and involve complex parameter calculation, and it improves the perceptual quality of the reconstructed video particularly at medium and low bit rates. The method offers simple parameter calculation, high bit control accuracy, and reconstructed video quality consistent with the perceptual characteristics of the human visual system, and can be used in the technical field of surveillance video rate control.

Description

HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video

Technical Field

The invention belongs to the technical field of rate control in video coding, and particularly relates to an HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video.

Background

With the rapid development of the internet and multimedia technologies, multimedia applications have penetrated many aspects of daily life. Video, an important component of multimedia, has become a major source of information. Video surveillance is one of the important video applications and plays an extremely important role in fields such as public security and smart homes. Video coding is a key technology for deploying surveillance video in practical environments. The International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) jointly released the new-generation video coding standard HEVC in 2013; compared with H.264/AVC, it reduces the bit rate by nearly 50% at the same coding quality.

Although the compression efficiency of HEVC is greatly improved, channel bandwidth remains limited in existing real-time video transmission environments. To transmit video data effectively and ensure the playback quality of video services within the limits of channel bandwidth and transmission delay, rate control must be applied to the video coding process. The λ-domain rate control method first performs target bit allocation, that is, it allocates appropriate target bits to each coding unit according to the video content, the channel bandwidth, and the buffer status; the quantization parameter (QP) of the coding unit is then determined independently using an R-λ model so as to achieve the pre-allocated target bits. Considering intra-frame rate control only, target bit allocation is divided into a picture level and a largest coding unit (LCU) level according to the hierarchical structure of the encoding process. At the picture level the coding unit is a whole picture in the video sequence; at the LCU level the coding unit is a block of 64 × 64 pixels obtained by further dividing the picture. Picture-level target bits are allocated according to the number of remaining uncoded pictures, the coding complexity, and the remaining target bits. LCU-level target bits are allocated according to the number of remaining uncoded LCUs in the current picture, the coding complexity, and the remaining target bits. When the λ-domain rate control method in the current HEVC performs LCU-level bit allocation on an I frame, the target bits B_m,n of each LCU are determined by the following formula:

As coding proceeds, the target bits B′_m,n of the LCU are updated as follows:

Research on human visual characteristics has shown that the human eye is selective when watching video, and that visual attention is drawn to certain specific information. Objects with moderate texture complexity and moderate-to-slow motion, such as pedestrians or vehicles driving on urban roads, tend to attract the eye. In contrast, flat-textured regions such as sky and road surfaces, and almost stationary objects, carry little information and are easily ignored. For objects with complex texture, such as grassland and trees, or with high motion speed, texture masking and motion masking make their details hard to distinguish with the naked eye, so distortion in them is hard to perceive. Given these perceptual characteristics, limited bit resources can be redistributed: more bits are allocated to the regions on which visual attention focuses and relatively fewer bits to other, less important regions, so that the perceptual quality of the video can still be maintained when the bit rate is relatively insufficient.

The current λ-domain rate control method of HEVC allocates bits mainly according to objective factors such as the coding structure and the content characteristics of the video sequence. Since the final receiver of the video stream is the human visual system, a reasonable bit allocation strategy should allocate more bits to the regions the human eye attends to and fewer bits to the regions it does not. The performance of HEVC with the current λ-domain rate control method, which does not consider the perceptual characteristics of the human visual system, can therefore be further improved.

For the HEVC standard, several more effective rate control methods that consider the perceptual characteristics of the human visual system have been proposed. Their general idea is first to divide the regions of the video into levels according to importance or saliency, and then, in modules such as rate control bit allocation or QP determination, to allocate more bits and use a smaller QP for the more important or salient regions, so as to give priority to the quality of those regions and improve the perceptual coding quality of the whole video image. However, the existing methods have one or more of the following problems. First, some methods aim to improve the perceptual quality of video images but use peak signal-to-noise ratio (PSNR), mean square error (MSE), and other measures that do not consider human visual perception as the quality measure in model optimization or performance testing. A large body of literature shows that quality measures such as PSNR and MSE agree poorly with the quality actually perceived by the human visual system, so the perceptual quality improvement achieved by these rate control methods is limited. Second, some methods use complex operations to calculate and update model parameters, such as pre-encoding, motion estimation, and iterations of various mathematical optimization models. Video surveillance is an application with strict real-time requirements, so methods with complex parameter calculation and updating are unsuitable for it because of their high coding delay. Third, some methods are proposed mainly for inter-frame pictures (P frames or B frames) in HEVC and are not suitable for intra-frame pictures, because intra-frame and inter-frame pictures differ significantly in the coding techniques used and in rate-distortion performance. Fourth, some rate control methods based on neural networks have also been proposed, but they place very high demands on system hardware and are not suitable for use at the surveillance camera in a video surveillance system. In addition, their coding delay is very large, which also makes them unsuitable for video surveillance systems.

Disclosure of Invention

The technical problem to be solved by the present invention is to overcome the shortcomings of the prior art and to provide an HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video that is suitable for video surveillance systems, considers the perceptual characteristics of the human visual system, and has low coding delay.

The technical scheme adopted for solving the technical problems comprises the following steps:

(1) Encoding the 1st frame of the surveillance video

Given a target bit rate, the 1st frame of the surveillance video is encoded according to the λ-domain rate control method of HEVC.

(2) Determining average frame difference for LCU

The average frame difference D_m,n of the nth LCU in the mth I frame of the video is determined according to equation (1):

where m > 1, m and n are finite positive integers, w and h are the width and height of the nth LCU in the mth I frame of the video, respectively, and x_m,n(i, j) is the luma component value of the nth LCU in the mth I frame of the video at pixel (i, j).
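Equation (1) itself is not reproduced in this text. Purely as an illustration, the sketch below assumes that the average frame difference is the mean absolute luma difference between an LCU and the co-located LCU of the previous I frame. That definition, the function name `average_frame_difference`, and the nested-list input layout are assumptions and are not taken from the patent.

```python
from typing import Sequence


def average_frame_difference(
    cur_lcu: Sequence[Sequence[int]],
    prev_lcu: Sequence[Sequence[int]],
) -> float:
    """Mean absolute luma difference between two co-located LCUs.

    ASSUMPTION: this form stands in for equation (1), which is missing
    from the text. cur_lcu and prev_lcu are h rows of w luma samples.
    """
    h = len(cur_lcu)
    w = len(cur_lcu[0])
    if len(prev_lcu) != h or any(len(row) != w for row in prev_lcu):
        raise ValueError("both LCUs must have the same h x w size")
    total = sum(
        abs(c - p)
        for cur_row, prev_row in zip(cur_lcu, prev_lcu)
        for c, p in zip(cur_row, prev_row)
    )
    # Normalize by the LCU area w * h, the quantities named in the text.
    return total / (w * h)
```

Under this assumed definition, a static background LCU yields a value near 0 and a fast-moving object yields a large value, which matches the role D_m,n plays in the later steps.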

(3) Determining motion perception sensitivity factors for LCUs

The motion perception sensitivity factor d_m,n of the nth LCU in the mth I frame of the video is determined according to equation (2):

where p₁, q₁, p₂, q₂ are model parameters, p₁ ∈ (0, 1], q₁ ∈ [0.5, 1], p₂ ∈ [-0.02, 0), q₂ ∈ [2, 5].

(4) Determining average gradient of LCU

The average gradient T_m,n of the nth LCU in the mth I frame of the video is determined according to equation (3):

(5) Determining texture perception sensitivity factors for LCUs

The texture perception sensitivity factor t_m,n of the nth LCU in the mth I frame of the video is determined according to equation (4):

where a₁, a₂, a₃, a₄, a₅ are model parameters, a₁ ∈ [-2×10^-5, -1×10^-5], a₂ ∈ [0.001, 0.002], a₃ ∈ [-0.1, -0.05], a₄ ∈ [0.5, 2], a₅ ∈ [-0.5, 0).

(6) Determining perceptual sensitivity factors for LCUs

The perceptual sensitivity factor θ_m,n of the nth LCU in the mth I frame of the video is determined according to equation (5):

θ_m,n＝μ×d_m,n+(1-μ)×t_m,n (5)

μ＝b₁×D_m,n+b₂ (6)

where μ is the weight given by equation (6), and b₁ and b₂ are model parameters, b₁ ∈ [-0.001, 0), b₂ ∈ [0.3, 1].
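Equations (5) and (6) can be evaluated directly once d_m,n, t_m,n, and D_m,n are known. The following is a minimal sketch; the function name `perceptual_sensitivity` is hypothetical, and the default values of `b1` and `b2` are the preferred values stated later in the description. No clamping of μ is applied, because the text does not mention any.

```python
def perceptual_sensitivity(
    d: float,
    t: float,
    frame_diff: float,
    b1: float = -0.0004,
    b2: float = 0.85,
) -> float:
    """Combine motion (d) and texture (t) factors into theta.

    d          -- motion perception sensitivity factor d_m,n
    t          -- texture perception sensitivity factor t_m,n
    frame_diff -- average frame difference D_m,n
    """
    # Equation (6): the weight shrinks as the frame difference grows (b1 < 0).
    mu = b1 * frame_diff + b2
    # Equation (5): convex-style blend of the two factors.
    return mu * d + (1.0 - mu) * t
```

Because b₁ is negative, a larger frame difference lowers μ, so the texture factor gains influence over the motion factor for LCUs with more motion.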

(7) Determining target bits for LCUs

The target bits B_m,n of the nth LCU in the mth I frame of the video are determined according to equation (7):

where round() is a rounding function, B_m is the target number of bits of the mth I frame of the video, LN is the number of LCUs in the mth I frame of the video, and ω_m,n is the weight of the nth LCU in the mth I frame of the video, determined according to the λ-domain rate control method of HEVC. As coding proceeds, the target bits B′_m,n of the LCU are updated according to equation (8):

where R_res,tar is the number of target bits remaining in the mth I frame of the video, A_m,k is the number of coded bits actually output by the kth LCU in the mth I frame of the video, and sw is the size of the sliding window.

(8) Encoding subsequent I-frames of surveillance video

Using the obtained B′_m,n, rate control is performed on the nth LCU in the mth I frame of the surveillance video to complete the encoding of that LCU.

In step (3) of determining the motion perception sensitivity factor of the LCU, the preferred value of p₁ is 0.2984, the preferred value of q₁ is 0.6182, the preferred value of p₂ is -0.0187, and the preferred value of q₂ is 4.7163.

In step (5) of determining the texture perception sensitivity factor of the LCU, the preferred value of a₁ is -1.4749×10^-5, the preferred value of a₂ is 0.0017, the preferred value of a₃ is -0.0702, the preferred value of a₄ is 0.9954, and the preferred value of a₅ is -0.2183.

In step (6) of determining the perceptual sensitivity factor of the LCU, the preferred value of b₁ is -0.0004 and the preferred value of b₂ is 0.85.
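The eleven model parameters, their stated ranges, and their preferred values can be collected in one place. The sketch below is only one way to organize them; the names `ModelParams`, `RANGES`, and `out_of_range` are hypothetical, while the numeric values and the open or closed interval ends are those given in steps (3), (5), and (6).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelParams:
    """Model parameters; defaults are the preferred values from the text."""
    p1: float = 0.2984
    q1: float = 0.6182
    p2: float = -0.0187
    q2: float = 4.7163
    a1: float = -1.4749e-5
    a2: float = 0.0017
    a3: float = -0.0702
    a4: float = 0.9954
    a5: float = -0.2183
    b1: float = -0.0004
    b2: float = 0.85


# name -> (low, high, low_inclusive, high_inclusive), as stated in the text
RANGES = {
    "p1": (0.0, 1.0, False, True),
    "q1": (0.5, 1.0, True, True),
    "p2": (-0.02, 0.0, True, False),
    "q2": (2.0, 5.0, True, True),
    "a1": (-2e-5, -1e-5, True, True),
    "a2": (0.001, 0.002, True, True),
    "a3": (-0.1, -0.05, True, True),
    "a4": (0.5, 2.0, True, True),
    "a5": (-0.5, 0.0, True, False),
    "b1": (-0.001, 0.0, True, False),
    "b2": (0.3, 1.0, True, True),
}


def out_of_range(params: ModelParams) -> List[str]:
    """Return the names of parameters that fall outside their stated range."""
    bad = []
    for name, (lo, hi, lo_inc, hi_inc) in RANGES.items():
        value = getattr(params, name)
        above_lo = value >= lo if lo_inc else value > lo
        below_hi = value <= hi if hi_inc else value < hi
        if not (above_lo and below_hi):
            bad.append(name)
    return bad
```

Calling `out_of_range(ModelParams())` checks the preferred values against the ranges; passing other values, such as those of a different embodiment, checks that configuration instead.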

Because the invention determines a motion perception sensitivity factor and a texture perception sensitivity factor for each LCU, it takes into account both the content characteristics of surveillance video and the perceptual characteristics of the human visual system. The perceptual sensitivity factors are incorporated into the LCU-level bit allocation weights, so that more bits are allocated to visually sensitive regions with moderate texture complexity and moderate-to-low motion speed, and relatively fewer bits are used in regions with flat texture, overly complex texture, or high motion speed. This captures the relationship between visual perceptual quality and coding bit rate, addresses the technical problems that existing rate control techniques do not fully consider the perceptual characteristics of the human visual system and involve complex parameter calculation, and improves the perceptual quality of the reconstructed video particularly at medium and low bit rates. The method offers simple parameter calculation, high bit control accuracy, and reconstructed video quality consistent with the perceptual characteristics of the human visual system, and can be used in the technical field of surveillance video rate control.

Drawings

FIG. 1 is a flowchart of example 1 of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, but the present invention is not limited to the examples.

Example 1

As shown in FIG. 1, the HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video in this embodiment comprises the following steps:

(1) Encoding the 1st frame of the surveillance video

(2) Determining average frame difference for LCU

(3) Determining motion perception sensitivity factors for LCUs

The motion perception sensitivity factor d_m,n of the nth LCU in the mth I frame of the video is determined according to equation (2):

where p₁, q₁, p₂, q₂ are model parameters; p₁ ∈ (0, 1], and p₁ is 0.2984 in this embodiment; q₁ ∈ [0.5, 1], and q₁ is 0.6182 in this embodiment; p₂ ∈ [-0.02, 0), and p₂ is -0.0187 in this embodiment; q₂ ∈ [2, 5], and q₂ is 4.7163 in this embodiment.

This step takes into account the content characteristics of surveillance video and the temporal masking effect, addressing the technical problems that existing rate control techniques do not fully consider the perceptual characteristics of the human visual system and involve complex parameter calculation.

(4) Determining average gradient of LCU

(5) Determining texture perception sensitivity factors for LCUs

where a₁, a₂, a₃, a₄, a₅ are model parameters; a₁ ∈ [-2×10^-5, -1×10^-5], and a₁ is -1.4749×10^-5 in this embodiment; a₂ ∈ [0.001, 0.002], and a₂ is 0.0017 in this embodiment; a₃ ∈ [-0.1, -0.05], and a₃ is -0.0702 in this embodiment; a₄ ∈ [0.5, 2], and a₄ is 0.9954 in this embodiment; a₅ ∈ [-0.5, 0), and a₅ is -0.2183 in this embodiment.

This step takes into account the content characteristics of surveillance video and the spatial masking effect, addressing the technical problems that existing rate control techniques do not fully consider the perceptual characteristics of the human visual system and involve complex parameter calculation.

(6) Determining perceptual sensitivity factors for LCUs

θ_m,n＝μ×d_m,n+(1-μ)×t_m,n (5)

μ＝b₁×D_m,n+b₂ (6)

where μ is the weight given by equation (6), and b₁ and b₂ are model parameters; b₁ ∈ [-0.001, 0), and b₁ is -0.0004 in this embodiment; b₂ ∈ [0.3, 1], and b₂ is 0.85 in this embodiment.

(7) Determining target bits for LCUs

where round() is a rounding function, B_m is the target number of bits of the mth I frame of the video, LN is the number of LCUs in the mth I frame of the video, and ω_m,n is the weight of the nth LCU in the mth I frame of the video, determined according to the λ-domain rate control method of HEVC. As coding proceeds, the target bits B′_m,n of the LCU are updated according to equation (8):

The perceptual sensitivity factors are incorporated into the LCU-level bit allocation weights, so that more bits are allocated to visually sensitive regions with moderate texture complexity and moderate-to-low motion speed, and relatively fewer bits are used in regions with flat texture, overly complex texture, or high motion speed. This captures the relationship between visual perceptual quality and coding bit rate, addresses the technical problems that existing rate control techniques do not fully consider the perceptual characteristics of the human visual system and involve complex parameter calculation, and improves the perceptual quality of the reconstructed video particularly at medium and low bit rates.

(8) Encoding subsequent I-frames of surveillance video

This completes the HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video.

Example 2

The HEVC intra rate control method for optimizing the perceptual quality of a surveillance video of the present embodiment comprises the following steps:

(1) Encoding the 1st frame of the surveillance video

This procedure is the same as in example 1.

(2) Determining average frame difference for LCU

This procedure is the same as in example 1.

(3) Determining motion perception sensitivity factors for LCUs

where p₁, q₁, p₂, q₂ are model parameters; p₁ ∈ (0, 1], and p₁ is 0.1 in this embodiment; q₁ ∈ [0.5, 1], and q₁ is 0.5 in this embodiment; p₂ ∈ [-0.02, 0), and p₂ is -0.02 in this embodiment; q₂ ∈ [2, 5], and q₂ is 2 in this embodiment.

(4) Determining average gradient of LCU

This procedure is the same as in example 1.

(5) Determining texture perception sensitivity factors for LCUs

where a₁, a₂, a₃, a₄, a₅ are model parameters; a₁ ∈ [-2×10^-5, -1×10^-5], and a₁ is -2×10^-5 in this embodiment; a₂ ∈ [0.001, 0.002], and a₂ is 0.001 in this embodiment; a₃ ∈ [-0.1, -0.05], and a₃ is -0.1 in this embodiment; a₄ ∈ [0.5, 2], and a₄ is 0.5 in this embodiment; a₅ ∈ [-0.5, 0), and a₅ is -0.5 in this embodiment.

(6) Determining perceptual sensitivity factors for LCUs

θ_m,n＝μ×d_m,n+(1-μ)×t_m,n (5)

μ＝b₁×D_m,n+b₂ (6)

where μ is the weight given by equation (6), and b₁ and b₂ are model parameters; b₁ ∈ [-0.001, 0), and b₁ is -0.001 in this embodiment; b₂ ∈ [0.3, 1], and b₂ is 0.3 in this embodiment.

The other steps are the same as in example 1.

Example 3

(1) Encoding the 1st frame of the surveillance video

This procedure is the same as in example 1.

(2) Determining average frame difference for LCU

This procedure is the same as in example 1.

(3) Determining motion perception sensitivity factors for LCUs

where p₁, q₁, p₂, q₂ are model parameters; p₁ ∈ (0, 1], and p₁ is 1 in this embodiment; q₁ ∈ [0.5, 1], and q₁ is 1 in this embodiment; p₂ ∈ [-0.02, 0), and p₂ is -0.001 in this embodiment; q₂ ∈ [2, 5], and q₂ is 5 in this embodiment.

(4) Determining average gradient of LCU

This procedure is the same as in example 1.

(5) Determining texture perception sensitivity factors for LCUs

where a₁, a₂, a₃, a₄, a₅ are model parameters; a₁ ∈ [-2×10^-5, -1×10^-5], and a₁ is -1×10^-5 in this embodiment; a₂ ∈ [0.001, 0.002], and a₂ is 0.002 in this embodiment; a₃ ∈ [-0.1, -0.05], and a₃ is -0.05 in this embodiment; a₄ ∈ [0.5, 2], and a₄ is 2 in this embodiment; a₅ ∈ [-0.5, 0), and a₅ is -0.1 in this embodiment.

(6) Determining perceptual sensitivity factors for LCUs

θ_m,n＝μ×d_m,n+(1-μ)×t_m,n (5)

μ＝b₁×D_m,n+b₂ (6)

where μ is the weight given by equation (6), and b₁ and b₂ are model parameters; b₁ ∈ [-0.001, 0), and b₁ is -0.0001 in this embodiment; b₂ ∈ [0.3, 1], and b₂ is 1 in this embodiment.

The other steps are the same as in example 1.

To verify the beneficial effects of the present invention, the inventors carried out experiments on test videos using the method of Example 1, as follows:

Four videos from the PKU-SVD-A data set were selected as test sequences. Their names, resolutions, and frame rates are: Campus, 720x576, 30 frames/sec; Classover, 720x576, 30 frames/sec; Mainroad, 1600x1200, 30 frames/sec; Intersection, 1600x1200, 30 frames/sec. A comparison test was performed between the HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video of Example 1 (hereinafter, the method of the invention) and the rate control method in HM16.0, the encoder recommended by the HEVC standardization organization (hereinafter, comparison method 1). The bit error BE is used to measure the bit control accuracy of the two methods and is calculated according to the following formula:

where B_act is the actual bit rate of the coded video and B_tar is the target bit rate for coding the video. The smaller the BE, the more accurate the bit control. The perceptual quality of the reconstructed video is measured by the mean opinion score (MOS). The larger the MOS, the higher the perceptual quality of the reconstructed video.
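The BE formula itself is not reproduced in this text. A common definition of rate control bit error, assumed here, is the absolute deviation of the actual bit rate from the target bit rate relative to the target, expressed in percent, which is at least consistent with BE being reported as a percentage in the results. The function name `bit_error_percent` is hypothetical.

```python
def bit_error_percent(actual_bitrate: float, target_bitrate: float) -> float:
    """Relative bit rate deviation in percent.

    ASSUMPTION: BE = |B_act - B_tar| / B_tar * 100; the patent's own
    formula is missing from the text and may differ.
    """
    if target_bitrate <= 0:
        raise ValueError("target_bitrate must be positive")
    return abs(actual_bitrate - target_bitrate) / target_bitrate * 100.0
```

Under this assumption, an encode that lands exactly on the target yields a BE of 0, and overshoot and undershoot of the same size are penalized equally.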

The test model HM16.0 recommended by the international standardization organization was used with an all-intra coding structure, that is, the configuration file encoder_intra_main.cfg. At a given target bit rate, rate control was enabled and each test sequence was encoded with the method of the invention and with comparison method 1. The coding results are shown in Table 1.

Table 1. Coding results of the method of the invention and comparison method 1

As can be seen from Table 1, over all test sequences, comparison method 1 has an average BE of 0.0074% and an average MOS of 2.61, while the method of the invention has an average BE of 0.0039% and an average MOS of 2.93. Compared with comparison method 1, the method of the invention achieves higher bit control accuracy, delivers perceptual quality better matched to human vision, and has higher coding performance.
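To put these averages in relative terms, the short calculation below uses only the four averages quoted above. The helper name `relative_change_percent` is illustrative and not part of the patent.

```python
def relative_change_percent(baseline: float, value: float) -> float:
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Averages quoted in the text: comparison method 1 first, the invention second.
be_change = relative_change_percent(0.0074, 0.0039)   # average BE
mos_change = relative_change_percent(2.61, 2.93)      # average MOS

print(f"BE change: {be_change:.1f}%  MOS change: {mos_change:.1f}%")
```

The signs follow the quality measures: a negative BE change means tighter bit control, and a positive MOS change means higher perceived quality.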

Claims

1. An HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video, characterized by comprising the following steps:

(1) encoding the 1st frame of the surveillance video

given a target bit rate, the 1st frame of the surveillance video is encoded according to the λ-domain rate control method of HEVC;

(2) determining average frame difference for LCU

where m > 1, m and n are finite positive integers, w and h are the width and height of the nth LCU in the mth I frame of the video, respectively, and x_m,n(i, j) is the luma component value of the nth LCU in the mth I frame of the video at pixel (i, j);

(3) determining motion perception sensitivity factors for LCUs

where p₁, q₁, p₂, q₂ are model parameters, p₁ ∈ (0, 1], q₁ ∈ [0.5, 1], p₂ ∈ [-0.02, 0), q₂ ∈ [2, 5];

(4) Determining average gradient of LCU

(5) determining texture perception sensitivity factors for LCUs

where a₁, a₂, a₃, a₄, a₅ are model parameters, a₁ ∈ [-2×10^-5, -1×10^-5], a₂ ∈ [0.001, 0.002], a₃ ∈ [-0.1, -0.05], a₄ ∈ [0.5, 2], a₅ ∈ [-0.5, 0);

(6) Determining perceptual sensitivity factors for LCUs

θ_m,n＝μ×d_m,n+(1-μ)×t_m,n (5)

μ＝b₁×D_m,n+b₂ (6)

where μ is the weight given by equation (6), and b₁ and b₂ are model parameters, b₁ ∈ [-0.001, 0), b₂ ∈ [0.3, 1];

(7) Determining target bits for LCUs

the target bits B_m,n of the nth LCU in the mth I frame of the video are determined according to equation (7):

where round() is a rounding function, B_m is the target number of bits of the mth I frame of the video, LN is the number of LCUs in the mth I frame of the video, and ω_m,n is the weight of the nth LCU in the mth I frame of the video, determined according to the λ-domain rate control method of HEVC; as coding proceeds, the target bits B′_m,n of the LCU are updated according to equation (8):

where R_res,tar is the number of target bits remaining in the mth I frame of the video, A_m,k is the number of coded bits actually output by the kth LCU in the mth I frame of the video, and sw is the size of the sliding window;

(8) encoding subsequent I-frames of surveillance video

2. The HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video according to claim 1, wherein: in step (3) of determining the motion perception sensitivity factor of the LCU, p₁ is 0.2984, q₁ is 0.6182, p₂ is -0.0187, and q₂ is 4.7163.

3. The HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video according to claim 1, wherein: in step (5) of determining the texture perception sensitivity factor of the LCU, a₁ is -1.4749×10^-5, a₂ is 0.0017, a₃ is -0.0702, a₄ is 0.9954, and a₅ is -0.2183.

4. The HEVC intra-frame rate control method for optimizing the perceptual quality of surveillance video according to claim 1, wherein: in step (6) of determining the perceptual sensitivity factor of the LCU, b₁ is -0.0004 and b₂ is 0.85.