CN115941960A - Method for skipping CU partition between VVC frames in advance based on lightweight neural network - Google Patents



Publication number
CN115941960A
Authority
CN
China
Prior art date
Legal status: Pending (the status is an assumption, not a legal conclusion)
Application number
CN202211192262.3A
Other languages
Chinese (zh)
Inventor
李跃
刘武
万亚平
刘杰
田纹龙
Current Assignee
University of South China
Original Assignee
University of South China
Priority date
Filing date
Publication date
Application filed by University of South China
Priority claimed from application CN202211192262.3A
Publication of CN115941960A
Legal status: Pending


Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a lightweight-neural-network-based method for skipping VVC inter-frame CU partitions in advance, which comprises the following steps: extracting the residual values of the current CU (the differences between the original and predicted luminance pixel values) together with their variance values, the QP value, the depth value, and the block aspect ratio, and preprocessing them as the input of a neural network; passing them through the neural network to output a probability value that the current partition should not be skipped; and finally comparing the probability value with a preset threshold to decide whether the partition to be performed by the current CU (horizontal binary tree partition BTH, vertical binary tree partition BTV, horizontal ternary tree partition TTH, or vertical ternary tree partition TTV) is skipped in advance. Through a simple neural network, the invention effectively reduces the complexity of multi-type tree partitioning of coding units and reduces VVC inter-frame encoding time while leaving the encoding quality essentially unaffected.

Description

Method for skipping CU partition between VVC frames in advance based on lightweight neural network
[Technical Field]
The invention relates to the technical field of video coding, and in particular to a lightweight-neural-network-based method for skipping VVC inter-frame CU partitions in advance.
[Background]
With the rapid development of video technology, information spread in video form plays an ever larger role in how people exchange information, and video has become a vital part of daily life. The dramatic increase in the number of videos, together with the emergence of new and complex types of video, brings new challenges to video compression. In early 2020, JVET released a new-generation video compression standard, H.266/VVC. Compared with H.265/HEVC, H.266/VVC improves the main coding modules of the hybrid coding framework and introduces many new coding technologies. VVC adopts coding unit partitioning based on a quadtree plus binary tree plus ternary tree, supporting more flexible partition shapes, and the maximum coding unit size is increased from 64×64 to 128×128. To adapt to highly complex video, a new motion vector prediction model is added, which can describe more complex motion for better video compression. Compared with HEVC, VVC coding performance is remarkably improved: coding efficiency increases by about 40% while video quality remains essentially unchanged.
The H.266/VVC test model (VTM) introduces the quadtree with nested multi-type tree (QTMT) partition structure, increasing the partition modes to six. Specifically, QTMT partitioning in VVC first divides the current frame into CTUs of the same size (the maximum CTU size is 128×128). Each CTU is then partitioned, beginning with a quadtree split that yields four leaf-node CUs. Each CU then undergoes recursive partitioning among six modes: non-partitioning, quadtree partitioning (QT), horizontal binary tree partitioning (BTH), vertical binary tree partitioning (BTV), horizontal ternary tree partitioning (TTH), and vertical ternary tree partitioning (TTV); fig. 1 shows the six multi-type tree partition modes. VVC also imposes partition size restrictions: for example, the minimum leaf node of a quadtree partition is 16×16, the maximum leaf node of binary tree and ternary tree partitions is 64×64, and their minimum leaf node is 4×4. In addition, quadtree partitioning is no longer allowed once binary tree or ternary tree partitioning has been performed. Because of these added partition modes and rules, the possible partition results become numerous. To obtain the optimal CU partition of the current frame, all possible partition cases must be traversed, the RD-cost of each CU partition computed, and the partition mode with the minimum RD-cost finally selected as the optimal one.
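The exhaustive mode decision described above can be sketched as follows. Here `rd_cost` is a hypothetical stand-in for the encoder's real rate-distortion computation, and the cost numbers in the example are invented purely for illustration.

```python
# Six candidate modes: no split plus the five split types of QTMT.
MODES = ["NS", "QT", "BTH", "BTV", "TTH", "TTV"]

def best_partition(cu, rd_cost):
    """Traverse every candidate partition mode of one CU and return the
    mode with the minimum RD-cost, as the VTM mode decision does."""
    costs = {mode: rd_cost(cu, mode) for mode in MODES}
    return min(costs, key=costs.get)

# Toy example with invented costs; BTH happens to be cheapest here.
toy = {"NS": 9.1, "QT": 7.4, "BTH": 5.2, "BTV": 6.0, "TTH": 8.3, "TTV": 7.9}
print(best_partition(None, lambda cu, m: toy[m]))  # BTH
```

The method of the invention aims to prune some of these candidate evaluations before their cost is ever computed.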
The introduction of QTMT enables the H.266/VVC coding standard to handle more complex types of video, but it also greatly increases the time spent on block partitioning. Therefore, to reduce the complexity of QTMT partitioning and achieve fast CU partitioning, a simple and efficient method is needed to reduce the time overhead of this stage. By skipping unnecessary CU partition modes in advance while maintaining coding quality, coding complexity can be effectively reduced.
[Summary of the Invention]
The invention discloses a method for skipping inter-frame CU partitions in advance based on a lightweight neural network, which effectively reduces coding computation complexity and saves coding time while keeping the coding quality essentially unchanged, thereby solving the technical problems described in the background section.
To achieve the above purpose, the technical scheme of the invention is as follows:
a method for skipping CU partitions in advance between VVC frames based on a lightweight neural network comprises the following steps:
Step one, data collection: encoding 32 frames of the same video with the original VVC encoder under different quantization parameters QP; training four neural network models, namely a BTH model, a BTV model, a TTH model, and a TTV model, according to the four division modes of horizontal binary division, vertical binary division, horizontal quartering, and vertical quartering; and collecting data separately for each neural network model as its training set;
Step two, data training: training the four neural network models separately; preprocessing the data set extracted from the CUs and using it as the input of the neural network; obtaining, through two fully connected layers and an output layer, a probability value corresponding to the label; and after training, obtaining four networks for early termination of inter-frame CU partitioning in the VVC encoder;
Step three, model deployment: embedding the neural network models into the VVC encoder; during actual encoding, extracting the 5 feature values corresponding to the current partition mode and inputting them into the network trained in step two to obtain a prediction of whether the current CU partition should be skipped in advance; performing subsequent encoding according to the prediction; and applying an operation that improves the prediction accuracy of the neural network models.
As a preferred improvement of the present invention, in step one, the data includes a comparison value of the difference between the sub-block variance values, the maximum of the sub-block variance values, the QP value, the block aspect ratio, and the depth.
As a preferred improvement of the present invention, in step one, data is collected separately for each neural network model as its training set, specifically comprising:
BTH model: performing horizontal binary division and vertical binary division on the current CU, and computing the variance of the residual values of the upper and lower sub-blocks and the variance of the residual values of the left and right sub-blocks; then computing the absolute value of the difference between the upper and lower sub-block variances and the absolute value of the difference between the left and right sub-block variances; extracting the comparison result of these difference values for the current CU, the maximum of the upper and lower sub-block variances, the current QP value, the height ratio of the current CU, and the current depth as input information; and using whether the cost value of the current partition is the lowest cost value among all partitions as the corresponding label;
BTV model: performing horizontal binary division and vertical binary division on the current CU, and computing the variance of the residual values of the upper and lower sub-blocks and the variance of the residual values of the left and right sub-blocks; then computing the absolute value of the difference between the upper and lower sub-block variances and the absolute value of the difference between the left and right sub-block variances; extracting the comparison result of these difference values for the current CU, the maximum of the left and right sub-block variances, the current QP value, the width ratio of the current CU, and the current depth as input information; and using whether the cost value of the current partition is the lowest cost value among all partitions as the corresponding label;
TTH model: performing horizontal quartering and vertical quartering on the current CU, and computing the variances of the residual values of the four horizontally stacked sub-blocks and of the four vertically arranged sub-blocks; then computing the absolute value of the difference between the horizontal sub-block variances and the absolute value of the difference between the vertical sub-block variances; extracting the comparison result of these difference values for the current CU, the maximum of the horizontal sub-block variances, the current QP value, the height ratio of the current CU, and the current depth as input information; and using whether the cost value of the current partition is the lowest cost value among all partitions as the corresponding label;
TTV model: performing horizontal quartering and vertical quartering on the current CU, and computing the variances of the residual values of the four horizontally stacked sub-blocks and of the four vertically arranged sub-blocks; then computing the absolute value of the difference between the horizontal sub-block variances and the absolute value of the difference between the vertical sub-block variances; extracting the comparison result of these difference values for the current CU, the maximum of the vertical sub-block variances, the current QP value, the width ratio of the current CU, and the current depth as input information; and using whether the cost value of the current partition is the lowest cost value among all partitions as the corresponding label.
As a preferred improvement of the present invention, the cost values include the six cost values of non-partitioning, quadtree partitioning, horizontal binary tree partitioning, vertical binary tree partitioning, horizontal ternary tree partitioning, and vertical ternary tree partitioning.
As a preferred refinement of the present invention, in step one, the quantization parameters QP are 22, 27, 32, and 37.
As a preferred improvement of the present invention, in step two, the number of neurons in each of the two fully connected layers is 20.
As a preferred improvement of the present invention, in step three, the operation that improves the prediction accuracy of the neural network models specifically includes a decision that reduces coding quality loss: if the horizontal partition of the current CU has already been skipped through neural network prediction, the vertical partition of the current CU no longer undergoes neural network prediction.
The invention has the following beneficial effects:
1. The network model is simple; the time spent on network prediction by the four partition-specific models is a negligible fraction of the total encoding time;
2. The network structure is just two fully connected layers, so the neural network can be embedded directly into the VVC test software, reducing the complexity of invoking the network;
3. The decision threshold is adjustable, so a balance between encoding time savings and bit-rate increase can be struck by tuning it: if a higher encoding speed is required, a larger decision threshold can be set; if higher encoding quality is required, the decision threshold can be reduced so that fewer partitions are skipped.
[Description of the Drawings]
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings described below represent only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort, wherein:
FIG. 1 is a diagram of six ways of partitioning a VVC multi-type tree;
FIG. 2 is a model training diagram of the VVC interframe CU partition early skipping method based on the lightweight neural network;
fig. 3 is an encoding flowchart of the VVC inter-frame CU partition early-skipping method based on the lightweight neural network of the present invention.
[Detailed Description]
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
In addition, the technical solutions in the embodiments of the present invention may be combined with each other, but it must be based on the realization of the technical solutions by those skilled in the art, and when the technical solutions are contradictory to each other or cannot be realized, such a combination of the technical solutions should not be considered to exist, and is not within the protection scope of the present invention.
The invention provides a lightweight-neural-network-based method for skipping VVC inter-frame CU partitions in advance, comprising the following steps:
Step one, data collection: encoding 32 frames of the same video with the original VVC encoder under different quantization parameters QP; training four neural network models, namely a BTH model, a BTV model, a TTH model, and a TTV model, according to the four division modes of horizontal binary division, vertical binary division, horizontal quartering, and vertical quartering; and collecting data separately for each neural network model as its training set.
Specifically, the quantization parameters QP are 22, 27, 32, and 37, and the data includes a comparison value of the difference between the sub-block variance values, the maximum of the sub-block variance values, the QP value, the block aspect ratio, and the depth.
The data collected for each neural network model as its training set is as follows:
BTH model: perform horizontal binary division and vertical binary division on the current CU, and compute the variance of the residual values of the upper and lower sub-blocks and the variance of the residual values of the left and right sub-blocks. Then compute the absolute value of the difference between the upper and lower sub-block variances (SAD1) and the absolute value of the difference between the left and right sub-block variances (SAD2). Extract as input information the comparison result for the current CU (assign 0 if SAD2 is larger than SAD1, otherwise assign 1), the maximum of the upper and lower sub-block variances divided by 60, the current QP value divided by 51, the block ratio of the current CU, height/(width + height), and the current depth divided by 5. Whether the cost value of the current partition is the lowest cost value among all partitions is used as the corresponding label.
BTV model: perform horizontal binary division and vertical binary division on the current CU, and compute the variance of the residual values of the upper and lower sub-blocks and the variance of the residual values of the left and right sub-blocks. Then compute SAD1 and SAD2 as above. Extract as input information the comparison result for the current CU (assign 0 if SAD1 is larger than SAD2, otherwise assign 1), the maximum of the left and right sub-block variances divided by 60, the current QP value divided by 51, the block ratio of the current CU, width/(width + height), and the current depth divided by 5. Whether the cost value of the current partition is the lowest cost value among all partitions is used as the corresponding label.
TTH model: perform horizontal quartering and vertical quartering on the current CU, and compute the variances of the residual values of the four horizontally stacked sub-blocks and of the four vertically arranged sub-blocks. Then compute the absolute value of the difference between the horizontal sub-block variances (SAD1) and the absolute value of the difference between the vertical sub-block variances (SAD2). Extract as input information the comparison result for the current CU (assign 0 if SAD2 is larger than SAD1, otherwise assign 1), the maximum of the horizontal sub-block variances divided by 60, the current QP value divided by 51, the block ratio of the current CU, height/(width + height), and the current depth divided by 5. Whether the cost value of the current partition is the lowest cost value among all partitions is used as the corresponding label.
TTV model: perform horizontal quartering and vertical quartering on the current CU, and compute the variances and SAD1 and SAD2 as in the TTH model. Extract as input information the comparison result for the current CU (assign 0 if SAD1 is larger than SAD2, otherwise assign 1), the maximum of the vertical sub-block variances divided by 60, the current QP value divided by 51, the block ratio of the current CU, width/(width + height), and the current depth divided by 5. Whether the cost value of the current partition is the lowest cost value among all partitions is used as the corresponding label.
It should be further noted that the cost values include the six cost values of non-partitioning, quadtree partitioning, horizontal binary tree partitioning, vertical binary tree partitioning, horizontal ternary tree partitioning, and vertical ternary tree partitioning.
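The feature extraction for one of the models can be sketched as below for the BTH case; the /60, /51, and /5 normalizers are the ones stated in the text, while the function and variable names (including reusing SAD1/SAD2 for the absolute variance differences) are illustrative, not the patent's own.

```python
import numpy as np

def bth_features(residual, qp, depth):
    """Compute the five BTH-model inputs described above for one CU's
    luminance residual block (a 2-D array)."""
    h, w = residual.shape
    top, bottom = residual[:h // 2, :], residual[h // 2:, :]
    left, right = residual[:, :w // 2], residual[:, w // 2:]
    sad1 = abs(top.var() - bottom.var())    # variance gap of horizontal split
    sad2 = abs(left.var() - right.var())    # variance gap of vertical split
    cmp_flag = 0.0 if sad2 > sad1 else 1.0  # comparison feature (0 or 1)
    return [cmp_flag,
            max(top.var(), bottom.var()) / 60.0,  # max sub-block variance / 60
            qp / 51.0,                            # QP / 51
            h / (w + h),                          # height/(width + height)
            depth / 5.0]                          # depth / 5

rng = np.random.default_rng(1)
res = rng.standard_normal((16, 32)).astype(np.float32)  # toy residual block
print(len(bth_features(res, qp=32, depth=2)))  # 5
```

The BTV/TTH/TTV variants would differ only in the sub-block layout and which maximum variance is taken, per the descriptions above.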
Step two, data training: train the four neural network models separately. Preprocess the data set extracted from the CUs and use it as the input of the neural network; then obtain, through two fully connected layers and an output layer, a probability value corresponding to the label. After training, four networks for early termination of inter-frame CU partitioning in the VVC encoder are obtained.
Specifically, the number of neurons in each of the two fully connected layers is 20.
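A minimal NumPy sketch of the stated topology follows: five inputs, two fully connected layers of 20 neurons each, and a sigmoid output producing the probability value. The ReLU activations and the random weights are assumptions for illustration, since the text does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(n_in=5, n_hidden=20):
    """Build random weights for two FC layers of n_hidden neurons
    plus a single sigmoid output unit."""
    return {
        "W1": rng.standard_normal((n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, n_hidden)), "b2": np.zeros(n_hidden),
        "W3": rng.standard_normal((n_hidden, 1)), "b3": np.zeros(1),
    }

def forward(p, x):
    h1 = np.maximum(0.0, x @ p["W1"] + p["b1"])   # FC layer 1 + ReLU (assumed)
    h2 = np.maximum(0.0, h1 @ p["W2"] + p["b2"])  # FC layer 2 + ReLU (assumed)
    return 1.0 / (1.0 + np.exp(-(h2 @ p["W3"] + p["b3"])))  # probability

params = make_mlp()
prob = forward(params, np.array([1.0, 0.5, 32 / 51, 0.5, 0.4]))
print(0.0 <= prob[0] <= 1.0)  # True
```

A network this small is cheap enough per CU that its prediction cost is negligible relative to the RD-cost computations it can skip, which matches the "lightweight" claim.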
Step three, model deployment: embed the neural network models into the VVC encoder. During actual encoding, extract the 5 feature values corresponding to the current partition mode and input them into the network trained in step two to obtain a prediction of whether the current CU partition should be skipped in advance, perform subsequent encoding according to the prediction, and apply an operation that improves the prediction accuracy of the neural network models.
It should be noted that the operation improving the prediction accuracy of the neural network models specifically includes a decision that reduces coding quality loss: if the horizontal partition of the current CU has already been skipped through neural network prediction, the vertical partition of the current CU no longer undergoes neural network prediction.
The following describes in detail a lightweight neural network-based VVC inter-frame CU partition early-skipping method provided by the present invention with specific embodiments.
Referring to fig. 2 and fig. 3, the present embodiment provides a lightweight neural network-based VVC inter-frame CU partition early skipping method, including the following steps:
Step one: data collection. Using the method above, the video sequence BlowingBubbles is encoded for 32 frames under the four quantization parameters QP, collecting the four data sets corresponding to the four neural network models;
Step two: the four data sets are each fed into the neural network for training, and after training, four neural network models for early skipping of VVC inter-frame CU partitions are obtained;
Step three: model deployment. In the actual VVC encoding process, for each CU, the 5 feature values corresponding to the current partition mode are extracted and input into the trained neural network to obtain a prediction of whether the current CU partition should be skipped in advance, and subsequent encoding is performed according to the prediction. Specifically:
Extract the five feature values corresponding to the neural network model according to the partition mode of the current CU. Before computing the cost value of the current partition, input the five feature values into the corresponding neural network model to obtain a probability value, and compare it with a preset decision threshold. If the probability value is larger than the threshold, continue encoding according to the original VTM flow; if it is smaller than the threshold, skip the computation of the current partition's cost value. If the horizontal partition (binary tree or ternary tree) of the current CU has been skipped through network prediction, the corresponding vertical partition no longer undergoes network prediction and is encoded according to the original VTM flow.
Since erroneously skipping CU partitions that should not be skipped would cause a large coding loss, the training threshold and the decision threshold of the four neural network models are both set to 0.05 in order to reduce losses caused by prediction errors as much as possible.
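The decision logic of the two preceding paragraphs can be sketched as follows; the function name and flag names are illustrative, and the 0.05 threshold is the one stated above.

```python
THRESHOLD = 0.05  # decision threshold stated in the text

def skip_partition(prob, horizontal_was_skipped, is_vertical_split):
    """A partition's cost computation is skipped when the network's
    probability falls below the threshold, except that a vertical split
    is never predicted (and thus never skipped) once the matching
    horizontal split has already been skipped."""
    if is_vertical_split and horizontal_was_skipped:
        return False  # encode per the original VTM flow, no prediction
    return prob < THRESHOLD

print(skip_partition(0.03, False, False))  # True: skip this cost computation
print(skip_partition(0.20, False, False))  # False: continue per original VTM
print(skip_partition(0.03, True, True))    # False: vertical not predicted
```

Raising THRESHOLD skips more partitions (faster encoding, higher bit rate); lowering it skips fewer (better quality), which is the adjustable trade-off described in the beneficial effects.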
Simulation experiments are performed below to verify the encoding performance of the lightweight neural network-based VVC inter-frame CU partition early-skipping method provided by the invention.
To evaluate the feasibility and effectiveness of the proposed method, VTM14.2 was used as the test platform, executed independently on a PC with an 11th Gen Intel(R) Core(TM) i7-11700F CPU and 16 GB RAM. The test sequences cover five resolutions: 416x240 (BlowingBubbles, BQSquare, BasketballPass), 832x480 (RaceHorses, PartyScene, BasketballDrill), 1280x720 (Johnny, KristenAndSara), 1920x1080 (Cactus, MarketPlace, BQTerrace), and 3840x2160 (Tango2). The coding quantization parameter (QP) is set to (22, 27, 32, 37), and the coding configuration is the RA (Random Access) mode. The change in bit rate (BD-Rate) and the coding time saving (TS) are used to measure the performance of the algorithm, where TS is defined as:
TS = (T_o - T_p) / T_o × 100%
where T_o represents the encoding time of the original test model and T_p represents the encoding time after the method of the invention is applied to the original test model.
Table 1: Performance comparison between the present invention and VTM14.2 (unit: %)
Here TS represents the encoding time saving relative to the original VTM14.2 method. As can be seen from Table 1, the invention saves 39.60% of encoding time on average, while BD-Rate increases by only 1.6%. In conclusion, with a quality reduction that remains visually acceptable, the proposed method effectively balances encoding time savings against bit-rate increase.
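The TS formula above is easy to check numerically; the timing figures in this example are invented for illustration only.

```python
def time_saving(t_orig, t_prop):
    """TS as defined above: percentage of encoding time saved
    relative to the original test model."""
    return (t_orig - t_prop) / t_orig * 100.0

# e.g. a 100 s VTM encode cut to 60.4 s corresponds to the
# roughly 39.6% average saving reported in Table 1.
print(round(time_saving(100.0, 60.4), 2))  # 39.6
```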
While embodiments of the invention have been disclosed above, the invention is not limited to the applications set forth in the specification and the embodiments; it is fully applicable to the various fields to which it pertains, and further modifications may readily be made by those skilled in the art. The invention is therefore not limited to the details shown and described herein, provided there is no departure from the general concept defined by the appended claims and their equivalents.

Claims (7)

1. A method for skipping CU partitions in advance between VVC frames based on a lightweight neural network is characterized by comprising the following steps:
step one, data collection: encoding 32 frames of the same video with the original VVC encoder under different quantization parameters QP; training four neural network models, namely a BTH model, a BTV model, a TTH model, and a TTV model, according to the four division modes of horizontal binary division, vertical binary division, horizontal quartering, and vertical quartering; and collecting data separately for each neural network model as its training set;
step two, data training: training the four neural network models separately; preprocessing the data set extracted from the CUs and using it as the input of the neural network; obtaining, through two fully connected layers and an output layer, a probability value corresponding to the label; and after training, obtaining four networks for early termination of inter-frame CU partitioning in the VVC encoder;
step three, model deployment: embedding the neural network models into the VVC encoder; during actual encoding, extracting the 5 feature values corresponding to the current partition mode and inputting them into the network trained in step two to obtain a prediction of whether the current CU partition should be skipped in advance; performing subsequent encoding according to the prediction; and applying an operation that improves the prediction accuracy of the neural network models.
2. The lightweight-neural-network-based method for skipping VVC inter-frame CU partitions in advance according to claim 1, wherein in step one the data includes a comparison value of the difference between the sub-block variance values, the maximum of the sub-block variance values, the QP value, the block aspect ratio, and the depth.
3. The lightweight-neural-network-based method for skipping VVC inter-frame CU partitions in advance according to claim 2, wherein in step one, data is collected separately for each neural network model as its training set, specifically comprising:
BTH model: perform a horizontal binary split and a vertical binary split of the current CU, and compute the variance of the residual values of the upper and lower sub-blocks and of the left and right sub-blocks, respectively; then compute the absolute difference between the variances of the upper and lower sub-blocks and the absolute difference between the variances of the left and right sub-blocks; extract as input information the comparison result of these differences for the current CU, the maximum of the upper and lower sub-block variances, the current QP value, the aspect ratio of the current CU, and the current depth; whether the cost value of this partition is the lowest cost value among all partitions serves as the corresponding label;
BTV model: perform a horizontal binary split and a vertical binary split of the current CU, and compute the variance of the residual values of the upper and lower sub-blocks and of the left and right sub-blocks, respectively; then compute the absolute difference between the variances of the upper and lower sub-blocks and the absolute difference between the variances of the left and right sub-blocks; extract as input information the comparison result of these differences for the current CU, the maximum of the left and right sub-block variances, the current QP value, the aspect ratio of the current CU, and the current depth; whether the cost value of this partition is the lowest cost value among all partitions serves as the corresponding label;
TTH model: split the current CU into four equal parts horizontally and into four equal parts vertically, and compute the variances of the residual values of the four horizontal sub-blocks and of the four vertical sub-blocks, respectively; then compute the absolute difference between the variance values of the upper and lower sub-blocks and the absolute difference between the variance values of the left and right sub-blocks; extract as input information the comparison result of these differences for the current CU, the maximum of the upper and lower sub-block variance values, the current QP value, the aspect ratio of the current CU, and the current depth; whether the cost value of this partition is the lowest cost value among all partitions serves as the corresponding label;
TTV model: split the current CU into four equal parts horizontally and into four equal parts vertically, and compute the variances of the residual values of the four horizontal sub-blocks and of the four vertical sub-blocks, respectively; then compute the absolute difference between the variance values of the upper and lower sub-blocks and the absolute difference between the variance values of the left and right sub-blocks; extract as input information the comparison result of these differences for the current CU, the maximum of the left and right sub-block variance values, the current QP value, the aspect ratio of the current CU, and the current depth; and whether the cost value of this partition is the lowest cost value among all partitions serves as the corresponding label.
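Under the assumption that the residual block is available as a 2-D array of integers, the BTH feature extraction described above might be sketched as follows. The function names and the 0/1 encoding of the comparison result are illustrative, not taken from the patent.

```python
# Sketch of the BTH feature extraction of claim 3, assuming the residual
# block is a 2-D list of integers. Names are illustrative.

def variance(values):
    """Population variance of a flat list of residual samples."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def split_h(block):
    """Horizontal binary split: top and bottom halves."""
    mid = len(block) // 2
    return block[:mid], block[mid:]

def split_v(block):
    """Vertical binary split: left and right halves."""
    mid = len(block[0]) // 2
    return [row[:mid] for row in block], [row[mid:] for row in block]

def flat(block):
    return [v for row in block for v in row]

def bth_features(residual, qp, depth):
    """Five input features for the BTH model (claim 3)."""
    top, bottom = split_h(residual)
    left, right = split_v(residual)
    var_tb = [variance(flat(top)), variance(flat(bottom))]
    var_lr = [variance(flat(left)), variance(flat(right))]
    diff_h = abs(var_tb[0] - var_tb[1])    # |top var - bottom var|
    diff_v = abs(var_lr[0] - var_lr[1])    # |left var - right var|
    return [
        1 if diff_h > diff_v else 0,       # comparison of the two differences
        max(var_tb),                       # max of top/bottom variances
        qp,                                # current QP
        len(residual[0]) / len(residual),  # aspect ratio (width / height)
        depth,                             # current partition depth
    ]
```

The BTV, TTH, and TTV extractors would differ only in the split geometry and in which group of sub-block variances supplies the maximum.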
4. The lightweight-neural-network-based VVC inter-frame CU partition early-skipping method of claim 3, wherein said cost values comprise six cost values: no splitting, quad-tree partition, horizontal binary-tree partition, vertical binary-tree partition, horizontal ternary-tree partition, and vertical ternary-tree partition.
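The labelling rule implied by claims 3 and 4 can be illustrated with a small sketch; the RD-cost numbers below are made up for the example.

```python
# Made-up RD costs for the six candidate modes of claim 4: the label for a
# given partition mode is 1 only when that mode's cost is the minimum over
# all six candidates.

rd_costs = {
    "no_split": 1180.0,
    "quad_tree": 1325.5,
    "bt_horizontal": 1102.3,
    "bt_vertical": 1250.8,
    "tt_horizontal": 1410.1,
    "tt_vertical": 1390.7,
}

best_mode = min(rd_costs, key=rd_costs.get)  # mode with the lowest cost
label_bth = 1 if rd_costs["bt_horizontal"] == min(rd_costs.values()) else 0
```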
5. The lightweight-neural-network-based VVC inter-frame CU partition early-skipping method of claim 1, wherein in step one the quantization parameter QP takes the values 22, 27, 32, and 37.
6. The lightweight-neural-network-based VVC inter-frame CU partition early-skipping method of claim 1, wherein in step two the number of neurons in each of the two fully connected layers is 20.
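A pure-Python sketch of the network shape described in claim 6 follows: five input features, two fully connected hidden layers of 20 neurons each, and a single sigmoid output giving the skip probability. The random weights, ReLU activations, and sigmoid output are placeholders; the patent does not disclose trained parameters or activation functions.

```python
# Minimal forward pass for a 5 -> 20 -> 20 -> 1 fully connected network.
# Weights are random placeholders; activations are assumed, not from the claim.
import math
import random

random.seed(0)

def dense(inputs, weights, biases):
    """One fully connected layer with ReLU activation."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def init(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]

W1, b1 = init(20, 5), [0.0] * 20   # layer 1: 5 inputs -> 20 neurons
W2, b2 = init(20, 20), [0.0] * 20  # layer 2: 20 -> 20 neurons
W3, b3 = init(1, 20), [0.0]        # output: 20 -> 1

def predict_skip(features):
    h1 = dense(features, W1, b1)
    h2 = dense(h1, W2, b2)
    z = sum(w * x for w, x in zip(W3[0], h2)) + b3[0]
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid -> skip probability

p = predict_skip([0, 1.25, 32, 1.0, 2])
```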
7. The lightweight-neural-network-based VVC inter-frame CU partition early-skipping method of claim 1, wherein in step three the operation for improving the prediction accuracy of the neural network model specifically comprises a decision for reducing coding-quality loss: if the horizontal partition of the current CU has already been skipped by the neural network prediction, neural network prediction is no longer performed for the vertical partition of the current CU.
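The claim-7 guard can be sketched as a small decision function; the names and the 0.5 threshold are illustrative assumptions.

```python
# Sketch of the claim-7 guard: if the horizontal split of the current CU was
# already skipped by its model, the vertical split is not run through the
# network at all (the encoder evaluates it normally), limiting quality loss.

def predict_pair(model_h, model_v, feats_h, feats_v, threshold=0.5):
    """Return (skip_horizontal, skip_vertical) for the current CU."""
    skip_h = model_h(feats_h) >= threshold
    if skip_h:
        # Horizontal already skipped: do not consult the network for the
        # vertical split; let it go through the normal RD search.
        skip_v = False
    else:
        skip_v = model_v(feats_v) >= threshold
    return skip_h, skip_v
```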
CN202211192262.3A 2022-09-28 2022-09-28 Method for skipping CU partition between VVC frames in advance based on lightweight neural network Pending CN115941960A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211192262.3A CN115941960A (en) 2022-09-28 2022-09-28 Method for skipping CU partition between VVC frames in advance based on lightweight neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211192262.3A CN115941960A (en) 2022-09-28 2022-09-28 Method for skipping CU partition between VVC frames in advance based on lightweight neural network

Publications (1)

Publication Number Publication Date
CN115941960A true CN115941960A (en) 2023-04-07

Family

ID=86696433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211192262.3A Pending CN115941960A (en) 2022-09-28 2022-09-28 Method for skipping CU partition between VVC frames in advance based on lightweight neural network

Country Status (1)

Country Link
CN (1) CN115941960A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117676171A (en) * 2024-01-31 2024-03-08 腾讯科技(深圳)有限公司 Three-tree division processing method, equipment and storage medium for coding unit
CN117676171B (en) * 2024-01-31 2024-05-07 腾讯科技(深圳)有限公司 Three-tree division processing method, equipment and storage medium for coding unit
CN117692663A (en) * 2024-01-31 2024-03-12 腾讯科技(深圳)有限公司 Binary tree partitioning processing method, equipment and storage medium for coding unit
CN117692663B (en) * 2024-01-31 2024-05-24 腾讯科技(深圳)有限公司 Binary tree partitioning processing method, equipment and storage medium for coding unit

Similar Documents

Publication Publication Date Title
CN104796693B (en) A kind of quick CU depth of HEVC divides coding method
CN104754357B (en) Intraframe coding optimization method and device based on convolutional neural networks
CN105791826B (en) A kind of HEVC interframe fast schema selection method based on data mining
CN103067704B (en) A kind of method for video coding of skipping in advance based on coding unit level and system
CN111263145B (en) Multifunctional video rapid coding method based on deep neural network
CN101969561B (en) A kind of intra-frame mode selection method, device and a kind of encoder
CN106454342A (en) Interframe mode fast selecting method and system of video compressed coding
CN111429497B (en) Self-adaptive CU splitting decision method based on deep learning and multi-feature fusion
CN111355956A (en) Rate distortion optimization fast decision making system and method based on deep learning in HEVC intra-frame coding
CN104954788A (en) HEVC (High Efficiency Video Coding) intra-frame prediction mode selection method and device
CN111988628B (en) VVC rapid intra-frame coding method based on reinforcement learning
CN106937116A (en) Low-complexity video coding method based on random training set adaptive learning
CN109729351B (en) HEVC (high efficiency video coding) rapid mode selection method under low complexity configuration
CN116489386A (en) VVC inter-frame rapid coding method based on reference block
CN115941960A (en) Method for skipping CU partition between VVC frames in advance based on lightweight neural network
CN106454349A (en) Motion estimation block matching method based on H.265 video coding
CN103561270A (en) Coding control method and device for HEVC
CN109510987A (en) The determination method, apparatus and encoding device of code tree node division mode
CN108769696A (en) A kind of DVC-HEVC video transcoding methods based on Fisher discriminates
CN109743575A (en) A kind of DVC-HEVC video transcoding method based on naive Bayesian
CN110677624B (en) Monitoring video-oriented foreground and background parallel compression method based on deep learning
CN107690069A (en) A kind of cascade method for video coding of data-driven
CN103702131B (en) Pattern-preprocessing-based intraframe coding optimization method and system
CN110677644B (en) Video coding and decoding method and video coding intra-frame predictor
CN107295336A (en) Adaptive fast coding dividing elements method and device based on image correlation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination