CN113225556A - Video coding method

Info

Publication number: CN113225556A
Application number: CN202110598168.7A
Authority: CN
Inventors: 廖义; 李日; 谢亚光; 孙彦龙
Original assignee: Hangzhou Arcvideo Technology Co ltd
Current assignee: Hangzhou Arcvideo Technology Co ltd
Priority date: 2021-05-31
Filing date: 2021-05-31
Publication date: 2021-08-06

Abstract

The invention discloses a video coding method comprising the following steps: step 1, starting the CU division decision at a given CU depth; step 2, computing the luminance variance var1 of the current CU, and executing step 3 if var1 is greater than a first threshold TH1, otherwise executing step 4; step 3, performing DCT and quantization on the coding residual of the current CU and counting the quantized coefficients in the current CU that are greater than 0, denoted N1; if N1 is greater than a second threshold TH2, deciding that the current CU is to be divided, otherwise executing step 4; step 4, deciding whether to divide the current CU according to the Lagrangian rate-distortion CU size selection method: if that method indicates division, the current CU is divided; otherwise it is not divided. The method provided by the embodiment of the invention makes the CU size chosen for local flat blocks more reasonable, reduces noise in local flat blocks, and improves the subjective quality of the video.

Description

Video coding method

Technical Field

The invention belongs to the technical field of video coding, and particularly relates to a video coding method.

Background

Video technology is widely applied in mobile terminals, live webcasting, home theater, remote monitoring and similar fields. Video resolution has gradually moved from Standard Definition (SD) to High Definition (HD) and Ultra High Definition (UHD). Commonly used international video encoding and decoding standards currently include H.264 and H.265/HEVC (High Efficiency Video Coding), and domestic standards include AVS (Audio Video Coding Standard), AVS+ and AVS2.

The HEVC encoder divides each frame into CTUs (Coding Tree Units) of the same size. Each CTU is further divided into CUs (Coding Units) of different sizes, such as 64x64, 32x32, 16x16 and 8x8, according to information such as the texture and motion of each region. The CU depths corresponding to these sizes are 0, 1, 2 and 3, respectively. A larger CU size typically saves more bit rate but introduces more coding distortion, while a smaller CU size typically consumes more bit rate but introduces less coding distortion.
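The relationship between CU depth and CU size described above can be sketched as follows. This is a minimal illustration that assumes a 64x64 CTU, as in the example sizes given; the names `cu_size_at_depth` and `split_cu` are hypothetical and do not come from any encoder.

```python
CTU_SIZE = 64   # CTU edge length in luma samples (assumed from the 64x64 example)
MAX_DEPTH = 3   # depth 3 corresponds to the smallest CU size, 8x8


def cu_size_at_depth(depth: int, ctu_size: int = CTU_SIZE) -> int:
    """Return the CU edge length at a given quadtree depth."""
    if not 0 <= depth <= MAX_DEPTH:
        raise ValueError("depth must be between 0 and MAX_DEPTH")
    # Each quadtree level halves the edge length.
    return ctu_size >> depth


def split_cu(x: int, y: int, size: int) -> list[tuple[int, int, int]]:
    """Split a CU at (x, y) into its four quadtree children (x, y, size)."""
    half = size // 2
    return [
        (x, y, half),
        (x + half, y, half),
        (x, y + half, half),
        (x + half, y + half, half),
    ]
```

Each division step produces four child CUs whose depth is one greater than that of the parent.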

To balance bit rate against distortion, an HEVC encoder processes a CU recursively in quadtree form, as shown in FIG. 1. It determines the CU size by comparing the RDCost (Rate Distortion Cost) of the CU at each size and selects the size with the minimum RDCost as the optimal CU size, where the RDCost is calculated as:

$$\mathrm{RDcost} = \lambda \cdot R + \mathrm{SSD}$$

where λ is the Lagrange multiplier, R is the number of bits needed to encode the CU, and SSD is the sum of squared differences between the original and reconstructed pixels. This method is called the CU size selection method based on Lagrangian rate distortion; it selects a CU size with low bit-rate consumption and low coding distortion. For example, FIG. 2 shows the CU size division result of this method for a certain video, in which black flat areas usually select a large CU and areas with complex texture usually select a smaller CU.
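The cost comparison behind this selection can be sketched as follows. This is an illustration only: it assumes that R is a bit count, that SSD is computed between original and reconstructed pixels, and that the cost of dividing a CU is the sum of the costs of its four sub-CUs. The function names are hypothetical.

```python
def ssd(original: list[int], reconstructed: list[int]) -> int:
    """Sum of squared differences between two equally long pixel sequences."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed))


def rd_cost(distortion_ssd: float, rate_bits: float, lam: float) -> float:
    """RDcost = lambda * R + SSD, as in the formula above."""
    return lam * rate_bits + distortion_ssd


def should_split_by_rd(cost_unsplit: float, costs_sub_cus: list[float]) -> bool:
    """Divide the CU only if the four sub-CUs together cost less than the whole CU."""
    return sum(costs_sub_cus) < cost_unsplit
```

In a real encoder the rate and distortion of each candidate come from actually encoding it; here they are plain inputs.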

HEVC adopts efficient predictive coding and transform coding techniques. Predictive coding predicts the pixels of the current CU from the pixels of CUs that are correlated with it in the temporal and spatial domains, which reduces the data the current CU has to carry. Transform coding subtracts the predicted pixels of the CU from its original pixels to form the coding residual, and then applies DCT (Discrete Cosine Transform) and quantization to the coding residual to compress the residual information further.

DCT concentrates most of the energy of the coding residual in a small region of the frequency domain, so that only a few bits are needed to describe the insignificant components. In addition, the frequency-domain decomposition mirrors the processing of the human visual system and allows the subsequent quantization process to match its sensitivity. The DCT transform formula is:

$$Y = \left(C X C^{T}\right) \otimes E$$

where X is the coding residual coefficient matrix, Y is the DCT coefficient matrix, C is the transformation matrix, E is the correction matrix, and ⊗ denotes element-wise multiplication.
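The energy-compaction property can be illustrated with a small floating-point example. This sketch is an assumption-laden stand-in: it uses an orthonormal DCT-II computed as C·X·C^T, with the scaling folded into C instead of a separate correction matrix E, and it is not the integer transform an HEVC encoder actually uses. All function names are hypothetical.

```python
import math


def dct_matrix(n: int) -> list[list[float]]:
    """Orthonormal n x n DCT-II transformation matrix C."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos((2 * j + 1) * k * math.pi / (2 * n)) for j in range(n)]
        )
    return rows


def matmul(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Plain matrix product of two rectangular lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m: list[list[float]]) -> list[list[float]]:
    return [list(row) for row in zip(*m)]


def dct2(x: list[list[float]]) -> list[list[float]]:
    """2-D DCT of a square block: Y = C * X * C^T."""
    c = dct_matrix(len(x))
    return matmul(matmul(c, x), transpose(c))
```

For a perfectly flat 4x4 block, all of the energy ends up in the single coefficient Y(0, 0) and every other coefficient is zero up to rounding error, which is why flat residuals quantize to very few coefficients.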

Although the conventional CU size selection method based on Lagrangian rate distortion selects a CU size with low bit rate and low objective coding distortion, it cannot select the CU size that best matches the subjective perception of the human eye. For a CU that contains both flat regions and textured regions, the high-frequency information is not concentrated enough after the DCT in the HEVC encoder, which makes it difficult for the quantization process to eliminate the high-frequency information.

Disclosure of Invention

In view of the above problems, the present invention provides a video encoding method.

To solve the above technical problems, the invention adopts the following technical solution:

a video encoding method, comprising:

step 1, starting the CU division decision at a given CU depth;

step 2, computing the luminance variance var1 of the current CU; if var1 is greater than a first threshold TH1, executing step 3, otherwise executing step 4;

step 3, performing DCT and quantization on the coding residual of the current CU, and counting the quantized coefficients in the current CU that are greater than 0, denoted N1; if N1 is greater than a second threshold TH2, deciding that the current CU is to be divided, otherwise executing step 4;

step 4, deciding whether to divide the current CU according to the Lagrangian rate-distortion CU size selection method: if that method indicates division, the current CU is divided; otherwise the current CU is not divided.

Preferably, the first threshold TH1 is in the range [1, 1000].

Preferably, the first threshold TH1 has a value of 600.

Preferably, the second threshold TH2 is in the range [1, 20].

Preferably, the second threshold TH2 has a value of 8.
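The decision flow of steps 1 to 4, with the preferred threshold values, can be sketched as follows. This is a minimal illustration, not an encoder implementation: the function name `decide_split` and the two callables are hypothetical. They stand in for the DCT-and-quantization coefficient count of step 3 and the Lagrangian rate-distortion decision of step 4, and they are passed as callables so that each is evaluated only when the flow reaches it.

```python
from typing import Callable

TH1_DEFAULT = 600  # preferred first threshold (luminance variance)
TH2_DEFAULT = 8    # preferred second threshold (coefficient count)


def decide_split(
    var1: float,
    count_positive_coeffs: Callable[[], int],
    rd_says_split: Callable[[], bool],
    th1: float = TH1_DEFAULT,
    th2: int = TH2_DEFAULT,
) -> bool:
    """Return True if the current CU should be divided."""
    # Step 2: only CUs with a large luminance variance go to step 3.
    if var1 > th1:
        # Step 3: count quantized coefficients greater than 0 (N1).
        n1 = count_positive_coeffs()
        if n1 > th2:
            return True
    # Step 4: otherwise fall back to the rate-distortion decision.
    return rd_says_split()
```

Both comparisons are strict, so a CU whose variance equals TH1 or whose count equals TH2 falls through to the rate-distortion decision.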

Preferably, the luminance variance value var1 is specifically:

$$\mathrm{var1} = \frac{1}{N}\sum_{t=1}^{N}\left(y_t - \mu\right)^2$$

where N denotes the number of pixels in the current CU, y_t denotes the luminance value of the t-th pixel in the current CU, and μ denotes the average of the luminance values of all pixels in the current CU.
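A direct computation of var1 from these definitions might look like the sketch below. It assumes the population variance (the mean of the squared deviations from μ over all N pixels) and takes the luminance samples of the CU as a flat sequence; the function name is hypothetical.

```python
def luma_variance(luma: list[int]) -> float:
    """Population variance of the luminance samples of one CU."""
    n = len(luma)
    if n == 0:
        raise ValueError("the CU must contain at least one pixel")
    mu = sum(luma) / n  # average luminance of the CU
    return sum((y - mu) ** 2 for y in luma) / n
```

A perfectly flat CU has a variance of 0, so it never exceeds TH1 and always goes straight to the rate-distortion decision.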

Preferably, in step 3, the quantization coefficients are:

$$L(i,j) = \mathrm{floor}\!\left(\frac{Y(i,j)}{Q_{step}} + f\right)$$

where Y(i, j) denotes the DCT coefficient at position (i, j) in the DCT coefficient matrix Y, L(i, j) is the quantization coefficient at position (i, j), Q_step denotes the quantization step size, floor() is the round-down function, and f is the rounding offset.
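The quantization and the count N1 used in step 3 can be sketched as follows. The sketch assumes that each quantization coefficient is floor(Y(i, j) / Q_step + f), built only from the symbols defined above, and it counts coefficients strictly greater than 0 as step 3 states. The function names and the sample values are hypothetical.

```python
import math


def quantize(y: list[list[float]], q_step: float, f: float) -> list[list[int]]:
    """Quantize a DCT coefficient matrix: L(i, j) = floor(Y(i, j) / Q_step + f)."""
    return [[math.floor(v / q_step + f) for v in row] for row in y]


def count_positive(levels: list[list[int]]) -> int:
    """N1: the number of quantization coefficients greater than 0."""
    return sum(1 for row in levels for v in row if v > 0)


# Hypothetical 2x2 coefficient block, step size and rounding offset.
example_levels = quantize([[52.0, -12.0], [7.0, 3.0]], q_step=10.0, f=0.5)
example_n1 = count_positive(example_levels)
```

The beneficial effects are described in terms of nonzero quantized coefficients; changing the predicate in `count_positive` to `v != 0` would count negative coefficients as well.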

The invention has the following beneficial effects: the luminance variance of a block is used to identify blocks that are prone to noise, and when the variance is large the CU tends to select a smaller size; the number of nonzero quantized coefficients is used to estimate coding distortion, and when that number is large the distortion is judged to be large and a smaller CU size is selected. The method provided by the embodiment of the invention makes the CU size chosen for local flat blocks more reasonable, reduces noise in local flat blocks, and improves the subjective quality of the video.

Drawings

FIG. 1 is a diagram illustrating a quad-tree partitioning structure of a CTU in the prior art;

FIG. 2 is a schematic diagram of the prior art partitioning of CU sizes based on Lagrangian rate distortion;

FIG. 3 is a flowchart illustrating steps of a video encoding method according to an embodiment of the present invention;

FIG. 4 is an encoded output picture produced by the open-source x265 video encoder, which serves as the experiment platform and comparison platform;

FIG. 5 is a picture obtained after video encoding and transcoding by the method according to the embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.

Referring to fig. 3, a flow chart of the steps of a video encoding method according to an embodiment of the present invention is shown, including:

step 1, starting the CU division decision at a given CU depth;

step 2, computing the luminance variance var1 of the current CU; if var1 is greater than a first threshold TH1, executing step 3, otherwise executing step 4; the threshold TH1 is in the range [1, 1000], and is typically 600;

step 3, performing DCT and quantization on the coding residual of the current CU, and counting the quantized coefficients in the current CU that are greater than 0, denoted N1; if N1 is greater than a second threshold TH2, deciding that the current CU is to be divided, otherwise executing step 4; the threshold TH2 is in the range [1, 20], and is typically 8;

step 4, deciding whether to divide the current CU according to the Lagrangian rate-distortion CU size selection method: if that method indicates division, the current CU is divided; otherwise the current CU is not divided.

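In a specific application example, the luminance variance value var1 is specifically:

$$\mathrm{var1} = \frac{1}{N}\sum_{t=1}^{N}\left(y_t - \mu\right)^2$$

where N denotes the number of pixels in the current CU, y_t denotes the luminance value of the t-th pixel in the current CU, and μ denotes the average of the luminance values of all pixels in the current CU.

The quantization process is in effect an optimization of the DCT coefficients. It exploits the insensitivity of the human eye to high frequencies to simplify the data substantially: each component in the frequency domain is divided by a constant for that component, and the result is rounded to an integer. In a specific application example, in step 3, the quantization coefficients are:

$$L(i,j) = \mathrm{floor}\!\left(\frac{Y(i,j)}{Q_{step}} + f\right)$$

where Y(i, j) denotes the DCT coefficient at position (i, j) in the DCT coefficient matrix Y, L(i, j) is the quantization coefficient at position (i, j), Q_step denotes the quantization step size, floor() is the round-down function, and f is the rounding offset.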

The open-source x265 video encoder is used as both the experiment platform and the comparison platform. Using the luminance variance of the local flat block and the number of nonzero coefficients of the CU after quantization, the method selects a smaller CU size in areas that are prone to noise, which reduces the noise of local flat blocks and improves subjective quality. FIG. 4 and FIG. 5 show the encoded output of the x265 method and of the method of the present invention, respectively. With the x265 method, noise is very noticeable in the boundary region between flat and textured areas: in the black box in FIG. 4, there is considerable stripe-shaped noise beside the light ray. With the method of the present invention, noise in the boundary region between flat and textured areas is very small, as in the black box in FIG. 5, which indicates that the method markedly reduces local flat block noise. By removing local flat block noise, the method improves the subjective quality of the video, and it can be applied to video compression standards such as H.265/HEVC and AVS2.

It is to be understood that the exemplary embodiments described herein are illustrative and not restrictive. Although one or more embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Claims

1. A video encoding method, comprising:

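step 1, starting the CU division decision at a given CU depth;

step 2, computing the luminance variance var1 of the current CU; if var1 is greater than a first threshold TH1, executing step 3, otherwise executing step 4;

step 3, performing DCT and quantization on the coding residual of the current CU, and counting the quantized coefficients in the current CU that are greater than 0, denoted N1; if N1 is greater than a second threshold TH2, deciding that the current CU is to be divided, otherwise executing step 4;

step 4, deciding whether to divide the current CU according to the Lagrangian rate-distortion CU size selection method: if that method indicates division, the current CU is divided; otherwise the current CU is not divided.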

2. The video encoding method of claim 1, wherein the first threshold TH1 is in the range [1, 1000].

3. The video encoding method of claim 1, wherein the first threshold TH1 has a value of 600.

4. The video encoding method of claim 1, wherein the second threshold TH2 is in the range [1, 20].

5. The video encoding method of claim 1, wherein the second threshold TH2 has a value of 8.

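6. The video encoding method of any of claims 1 to 5, wherein the luminance variance value var1 is specifically:

$$\mathrm{var1} = \frac{1}{N}\sum_{t=1}^{N}\left(y_t - \mu\right)^2$$

where N denotes the number of pixels in the current CU, y_t denotes the luminance value of the t-th pixel in the current CU, and μ denotes the average of the luminance values of all pixels in the current CU.

7. The video encoding method of any of claims 1 to 5, wherein in step 3, the quantization coefficients are:

$$L(i,j) = \mathrm{floor}\!\left(\frac{Y(i,j)}{Q_{step}} + f\right)$$

where Y(i, j) denotes the DCT coefficient at position (i, j) in the DCT coefficient matrix Y, L(i, j) is the quantization coefficient at position (i, j), Q_step denotes the quantization step size, floor() is the round-down function, and f is the rounding offset.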