CN116744020A - Method for quick intra-frame prediction of universal video coding based on adjacent block depth - Google Patents

Method for quick intra-frame prediction of universal video coding based on adjacent block depth

Info

Publication number
CN116744020A
Authority
CN
China
Prior art keywords
depth
current
block
flag
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210206966.5A
Other languages
Chinese (zh)
Inventor
宋云
文舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology filed Critical Changsha University of Science and Technology
Priority to CN202210206966.5A priority Critical patent/CN116744020A/en
Publication of CN116744020A publication Critical patent/CN116744020A/en
Pending legal-status Critical Current

Links

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 - Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 - Data rate or code amount at the encoder output
    • H04N19/147 - Data rate or code amount at the encoder output according to rate distortion criteria
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 - Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to a fast intra-frame prediction method for versatile video coding based on adjacent block depth, which comprises the following steps: 1) judging whether the current partition block in the current frame is a luminance block, and if so, performing fast prediction; 2) judging the size of the blocks obtained by partitioning during intra-frame prediction in each intra-picture frame (I-frame) of the video, and further processing only blocks whose width and height are less than or equal to 32; 3) obtaining a flag nei_flag from the acquired depths of the neighboring blocks: if the left and above neighboring blocks have equal depth, nei_flag is 1; if their depths are not equal and differ by 1, nei_flag is 0; otherwise, nei_flag is -1; 4) obtaining a flag isSplit, indicating whether partitioning should be terminated early, from the depth of the current CU block and nei_flag: if nei_flag is 1 and the depth of the current CU is greater than or equal to the left neighboring block depth plus 1, isSplit is 1 and further partitioning of the current CU is terminated early; if nei_flag is 0 and the depth of the current CU is greater than or equal to the larger of the two neighboring block depths plus 1, isSplit is 0 and further partitioning of the current CU is likewise terminated early; otherwise, isSplit is -1 and the CU partitioning follows the default VVC process to select the optimal partition result.

Description

Method for quick intra-frame prediction of universal video coding based on adjacent block depth
Technical Field
The invention relates to a method for fast intra-frame prediction of versatile video coding (VVC) based on adjacent block depth.
Background
Since the beginning of the digital age, digital video has developed very rapidly, riding the surge of IT technology, and the pursuit of higher definition has never stopped in the field of digital video technology. Today, a wide variety of video applications have penetrated every area of human society. Video carries a very large amount of data: for example, a video with a resolution of 480x240, 24 bits per pixel, and 30 frames per second has a raw bit rate of 82.944 Mbps without any compression, and a single one-hour video at 720p and 30 fps would require roughly 278 GB of storage. Clearly, storing or transmitting video without compression is not feasible. Although network bandwidth and storage capacity have increased rapidly in recent years, they still fall far short of what the transmission and storage of such vast amounts of video data require. Video coding technology has therefore long been a hot research area.
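As a quick check of the figures above (a worked example added for clarity, assuming 24 bits per pixel and a 1280x720 frame for "720p"):

$$480 \times 240 \times 24\ \text{bit} \times 30\ \text{fps} = 82{,}944{,}000\ \text{bit/s} = 82.944\ \text{Mbps}$$

$$1280 \times 720 \times 24\ \text{bit} \times 30\ \text{fps} \times 3600\ \text{s} \approx 2.39 \times 10^{12}\ \text{bit} \approx 298.6\ \text{GB} \approx 278\ \text{GiB}$$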
There are two basic preconditions that make digital video compressible: data redundancy and visual redundancy. Data redundancy in video, such as spatial redundancy and temporal redundancy, reflects the strong correlation among the pixels of an image; eliminating it causes no information loss and belongs to lossless compression. Visual redundancy arises from characteristics of the human visual system, such as the luminance discrimination threshold, the visibility threshold, and the differing sensitivities to luminance and chrominance, so that moderate errors introduced during encoding are not perceived. These visual characteristics can be exploited to trade a certain amount of objective distortion for data compression. To achieve higher compression rates, video is typically compressed lossily, i.e., a high compression ratio is achieved at the cost of some quality. The performance of a compression algorithm is measured by two parameters: bit rate and distortion. Lossy compression pursues the highest compression ratio (lowest bit rate) for a given quality loss, or the best video quality for a given bit rate.
Video compression is typically realized by a corresponding pair of encoder and decoder (a codec). The encoder reduces or eliminates the spatial, temporal, structural, and statistical redundancy of the video image through various techniques, converting the video into a compressed form for transmission and storage. In practical applications, the compressed video data is restored to the original video image by the corresponding decoder; the most common video players are essentially collections of decoders that can play back (i.e., decode) videos in different encoding formats. The encoding and decoding technology of digital video is therefore the key to solving the problems of large storage space and high network transmission cost for massive amounts of video, and it is especially important given the rapid development and popularization of the digital multimedia industry. On the premise of providing satisfactory viewing quality to users, video coding removes redundant components from video information to the greatest possible extent, thereby reducing the data volume.
To enable encoded streams to interoperate and be decoded in a standardized way over a wide range of applications, international organizations began to establish international standards for video coding in the 1980s. An international video coding standard generally represents the most advanced video coding technology of its time. Mainstream video coding standards are usually issued by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC). ITU-T has established the H.26x series of video coding standards, which are widely used in broadband video communication. The MPEG video coding standards formulated by the ISO/IEC Moving Picture Experts Group are mainly applied in broadcast television, video storage, and network transmission. In 2003, ITU-T and ISO/IEC jointly promulgated the H.264/MPEG-4 Advanced Video Coding (AVC) video compression standard. It is the most widely used video coding standard to date and achieved great success in coding efficiency and flexibility; its release enabled digital video to be widely used in both networking and engineering. In 2013, the two organizations jointly developed the next-generation video coding standard H.265/High Efficiency Video Coding (HEVC). Compared with H.264/AVC, this standard achieves the same reconstructed image quality at a lower bit rate while adding many new techniques, and it targets a wider range of applications. The formulation of this standard greatly drove the development of video coding technology.
With the improvement of computer processing power and the reduction of storage cost, the diversification of network video services, and the development of video coding technology, the digital video industry has an increasingly urgent demand for video coding standards with high compression ratio, strong stability, and good network adaptability. At the same time, as digital video continues to move toward high frame rate, high compression ratio, and high resolution, the limitations of the most widely used H.264 standard in compression, networking, and other aspects have become increasingly prominent, and it can no longer meet the needs of many video applications. Therefore, the VCEG working group of ITU-T and the MPEG working group of ISO/IEC again entered into full cooperation: in 2010 the Joint Collaborative Team on Video Coding (JCT-VC) began to seek a replacement for the existing coding standards and to develop a new generation of standards, and it is against this background that the High Efficiency Video Coding standard, HEVC, emerged.
The advent of the HEVC standard brought a new revolution to the high-definition path of Internet video. Early comparative studies showed that HEVC comprehensively broke through the functional and performance bottlenecks of H.264 and exhibited great advantages over it. Under the constant pressure to increase network speed and reduce bandwidth cost, HEVC has therefore become a catalyst for the popularization and development of high-definition video.
In 2016, the Joint Video Exploration Team (JVET) was formally established by ITU-T and ISO to study the next-generation video coding standard H.266; standardization work formally began in April 2018, when it was confirmed that the next-generation standard would be called Versatile Video Coding (VVC). The goal of H.266/VVC is to provide coding performance superior to H.265/HEVC while also supporting 360-degree panoramic video and HDR video. Because the performance of each coding algorithm must be examined during standardization, JVET provides a common reference software, the VVC Test Model (VTM).
Under the All Intra (AI) configuration, the VTM achieves an 18.03% bit-rate reduction compared with HM16.0, the latest HEVC reference software. Under the test conditions of the Random Access (RA) configuration, VTM3.0 improves performance by 23%.
VVC achieves its excellent BD-rate performance at the cost of time: compared to HEVC, the encoding time increases by a factor of approximately 10 and the decoding time by a factor of approximately 2. Reducing complexity is a key step in promoting the new standard, and early-termination decision algorithms for CU partitioning are one of the effective directions for minimizing computation. Unlike HEVC, VVC employs not only a quadtree but also binary-tree and ternary-tree structures, so non-square blocks are allowed in block partitioning. CU partitioning is therefore more complex because of the additional searches in the mode decision loop, which greatly increases encoding time.
In recent years, many fast intra prediction algorithms have been proposed to reduce computational complexity and save coding time while keeping the loss in coding quality small.
Disclosure of Invention
The embodiment of the invention provides a general video coding rapid intra-frame prediction method based on adjacent block depth, which can save a great amount of coding time under the condition of hardly reducing video coding quality.
The technical solution for solving the above problems is as follows: a method for fast intra prediction of versatile video coding based on neighboring block depths, characterized in that neighboring block depth is introduced into the test model of versatile video coding, the texture complexity of the current CU is judged using the depth features of the already coded CUs to the left of and above the current CU, and whether the current CU needs to be further divided is determined, thereby saving coding time.
H.266/VVC adopts a block-based hybrid video coding structure, but unlike the quadtree (QT) partitioning in HEVC, it adopts a quadtree plus multi-type tree (QTMT) partition structure. QTMT mainly includes three partition structures: the quadtree (QT), the ternary tree (TT), and the binary tree (BT), where both TT and BT have vertical and horizontal variants.
In VVC, the optimal partitioning of a CU is determined according to the rate-distortion costs (RD costs) of the different partitioning schemes; the partition with the lowest RD cost among all possible partitions is the best one. VVC defines several coding parameters to limit block partitioning under QTMT: MaxCUWidth and MaxCUHeight are both 64, so a 128×128 CTU must first be partitioned by QT. MinQTSize is set to 8 to limit the minimum size of a QT node; according to MinQTSize, an 8×8 CU can only be partitioned by MT. Both MaxBtSize and MaxTtSize are 32, so the maximum size of an MT node is 32×32. The improved coding tree structure supports coding blocks of larger size, and the partitioning is more flexible. VVC removes the concepts of Prediction Units (PUs) and Transform Units (TUs), retaining only Coding Units (CUs). In the coding tree structure a CU may be square or rectangular: BT partitions a CU into two rectangular sub-CUs of equal area, and TT partitions a CU into three rectangular sub-CUs with an area ratio of 1:2:1. Rectangular blocks with unequal width and height make the coding blocks more flexible.
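The partition constraints listed above can be summarized in code. The following C++ sketch is illustrative only: the enum, constant names, and helper function are assumptions added for clarity, not the VTM implementation, which has additional conditions (minimum BT/TT sizes, the no-QT-after-MT rule, implicit splits, etc.).

```cpp
#include <algorithm>

// Illustrative QTMT split modes described in the text.
enum class SplitMode { QT, BT_HOR, BT_VER, TT_HOR, TT_VER };

// Partition-limiting parameters as stated in the description (assumed fixed here).
constexpr int kMinQTSize = 8;   // minimum quadtree node size
constexpr int kMaxBtSize = 32;  // maximum binary-tree node size
constexpr int kMaxTtSize = 32;  // maximum ternary-tree node size

// Rough check of whether a split mode is permitted for a W x H luma block,
// mirroring only the constraints listed in the text.
bool isSplitAllowed(int w, int h, SplitMode mode) {
    switch (mode) {
    case SplitMode::QT:
        // QT applies to square blocks whose children stay at least MinQTSize.
        return w == h && w / 2 >= kMinQTSize;
    case SplitMode::BT_HOR:
    case SplitMode::BT_VER:
        return std::max(w, h) <= kMaxBtSize;   // MT nodes are at most 32x32
    case SplitMode::TT_HOR:
    case SplitMode::TT_VER:
        return std::max(w, h) <= kMaxTtSize;
    }
    return false;
}
```

Under these simplified checks, a 128×128 CTU and a 64×64 CU allow only QT, while an 8×8 CU allows only MT splits, matching the constraints described in the paragraph above.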
Because neighboring CUs in the same frame are strongly correlated in color, smoothness, texture, and so on, and the coding order within a CTU follows a Z-shaped scan, the units to the left of and above the current CU have already been coded when the current CU is encoded. Thus, if the depths of the neighboring blocks show that the texture of the current CU is simple and further partitioning is unnecessary, this can make a crucial contribution to reducing encoding time. To this end, the invention designs a CU-partition early-termination algorithm based on neighboring block depth so as to reduce the complexity of intra-frame coding in VVC.
The partitioning pattern of a Coding Tree Unit (CTU) is closely related to the texture characteristics of the block, so texture features can serve as an important basis for CTU partitioning. In regions of complex texture, small CUs are generally used and the CU depth is large, whereas in regions of simple texture, CU blocks are typically larger and the CU depth is smaller. Owing to spatial correlation, the current CU is highly similar to its spatially neighboring CUs; their coding depths may not be exactly identical, but they are usually close, so the spatial correlation of CUs is helpful for estimating the coding depth.
In the VTM, a 128×128 CTU block is split into four 64×64 sub-CU blocks by default, so 128×128 blocks need not be considered. Small CUs occupy little encoding time, while terminating the partitioning of a 64×64 CU early tends to cause a relatively large quality loss. Therefore, the invention applies fast prediction only to CU blocks whose width and height do not exceed 32, to achieve a balance between computational complexity and coding quality.
The invention is based on the luminance component, so it is first determined whether the current block is a luminance block, and then the depths of the neighboring blocks are acquired. Within a frame, every CU except those in the leftmost column and the topmost row has left and above neighboring CUs. The possible maximum depth of the current CU is then inferred from the depths of the neighboring CUs, and the partitioning of the current CU is terminated early when its depth is greater than or equal to that maximum depth. CU depth in VVC differs from HEVC. In HEVC, CUs are square with sizes 64×64, 32×32, 16×16, and 8×8, corresponding to CU depths 0, 1, 2, and 3. In VVC, a CU may be rectangular because of MT partitioning, and the CU depth is split into a QT depth and an MT depth; the depth of a CU is its QT depth plus its MT depth. In the default VVC setting, a CU cannot be QT-partitioned after MT partitioning has been used, so QT partitioning is applied first, and MT partitioning can start once the CU is 32×32. For example, a 32×32 CU partitioned into two 32×16 sub-CUs by a binary tree gives each 32×16 sub-CU a QT depth of 2, an MT depth of 1, and a total depth of 3.
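A small illustration of the depth bookkeeping described above (the struct and field names are assumptions for clarity, not taken from the VTM source):

```cpp
#include <cassert>

// Illustrative depth bookkeeping for a CU, as described above:
// total coding depth = quadtree depth + multi-type-tree depth.
struct CuDepth {
    int qtDepth;   // number of quadtree splits applied so far
    int mtDepth;   // number of BT/TT splits applied so far
    int total() const { return qtDepth + mtDepth; }
};

int main() {
    // 128x128 CTU -> QT -> 64x64 -> QT -> 32x32: qtDepth = 2, mtDepth = 0.
    CuDepth cu32x32{2, 0};
    // Binary split of the 32x32 CU gives two 32x16 sub-CUs: mtDepth becomes 1.
    CuDepth cu32x16{cu32x32.qtDepth, cu32x32.mtDepth + 1};
    assert(cu32x16.total() == 3);   // matches the example in the text
    return 0;
}
```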
The invention acquires the depths of the left and above neighboring blocks. If the depths of the two neighboring blocks are equal, the depth of the current CU is likely to be equal to, or 1 greater than, the depth of the neighboring blocks. If the depths of the two neighboring blocks are not equal and differ by 1, the depth of the current CU is likely equal to either the left CU depth or the above CU depth. If the depths of the two neighboring blocks differ by 2 or more, the coding depth of the current CU shows no obvious pattern, and the optimal CU partitioning result is selected by full search and recursive rate-distortion cost comparison.
The invention obtains a flag nei_flag from the depths of the neighboring CUs. If the two neighboring blocks have equal depth, nei_flag is 1. If their depths are not equal and differ by 1, nei_flag is 0. Otherwise, nei_flag is -1. nei_flag is calculated as follows:

$$nei\_flag = \begin{cases} 1, & Left_{depth} = Above_{depth} \\ 0, & \lvert Left_{depth} - Above_{depth} \rvert = 1 \\ -1, & \text{otherwise} \end{cases}$$

where $Left_{depth}$ and $Above_{depth}$ denote the depth of the left neighboring block and the depth of the above neighboring block, respectively.
The invention obtains a flag isSplit, indicating whether partitioning should be terminated early, from the depth of the current CU block and nei_flag. If nei_flag is 1 and the depth of the current CU is greater than or equal to the left neighboring block depth plus 1, isSplit is 1 and further partitioning of the current CU is terminated early. If nei_flag is 0 and the depth of the current CU is greater than or equal to the larger of the two neighboring block depths plus 1, isSplit is 0 and further partitioning of the current CU is likewise terminated early. Otherwise, isSplit is -1 and the CU partitioning follows the default VVC process to select the optimal partition result. isSplit is calculated as follows:

$$isSplit = \begin{cases} 1, & nei\_flag = 1 \text{ and } CU_{depth} \ge Left_{depth} + 1 \\ 0, & nei\_flag = 0 \text{ and } CU_{depth} \ge \max(Above_{depth}, Left_{depth}) + 1 \\ -1, & \text{otherwise} \end{cases}$$

where $CU_{depth}$ is the depth of the current CU and $\max(Above_{depth}, Left_{depth})$ is the depth of the deeper of the two neighboring blocks.
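A corresponding C++ sketch (again with assumed names, not the VTM implementation):

```cpp
#include <algorithm>   // std::max

// isSplit from the current CU depth, the neighbor depths, and nei_flag.
// Return values: 1 or 0 -> terminate further partitioning early;
//                -1     -> fall back to the default VVC partition search.
int computeIsSplit(int cuDepth, int leftDepth, int aboveDepth, int neiFlag) {
    if (neiFlag == 1 && cuDepth >= leftDepth + 1) {
        return 1;   // neighbors agree and the current CU is already deep enough
    }
    if (neiFlag == 0 && cuDepth >= std::max(aboveDepth, leftDepth) + 1) {
        return 0;   // neighbors differ by 1; compare against the deeper one
    }
    return -1;      // no early decision
}
```

Note that both return values 1 and 0 lead to early termination; the value only records which neighbor condition triggered it, while -1 leaves the default rate-distortion search unchanged.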
By using the depths of the left and above neighboring CUs, which are strongly spatially correlated with the current CU, the method decides whether to terminate the partitioning of the current CU early, and it can greatly reduce the computational complexity of encoding and the encoding time with only a small quality loss.
Drawings
FIG. 1 is a general flow chart of a method for fast intra prediction of video coding based on neighboring block depths according to the present invention;
Detailed Description
The principles and features of the present invention are described below with reference to the drawings; the examples are given for the purpose of explanation only and are not intended to limit the scope of the invention.
As shown in FIG. 1, the method for fast intra prediction of video coding based on neighboring block depth specifically comprises the following steps:
the user provides the video to be tested, and the test sequence is a YUV sequence.
Step one: the video sequence is first encoded under the all-intra (All Intra, AI) configuration. The recommended quantization parameter values are 22, 27, 32, and 37.
Step two: a preliminary judgment is made. If the coding block is a luminance block, fast prediction is performed; if it is a chrominance block, no operation is performed.
Step three: for the luminance coding blocks obtained during intra coding whose width and height are less than or equal to 32, if the current CU has left and above neighboring blocks, the depths of the left and above neighboring blocks of the current CU are acquired.
Step four: a flag nei_flag is obtained from the acquired neighboring block depths. If the two neighboring blocks have equal depth, nei_flag is 1. If their depths are not equal and differ by 1, nei_flag is 0. Otherwise, nei_flag is -1.
Step five: a flag isSplit, indicating whether partitioning should be terminated early, is obtained from the depth of the current CU block and nei_flag. If nei_flag is 1 and the depth of the current CU is greater than or equal to the left neighboring block depth plus 1, isSplit is 1 and further partitioning of the current CU is terminated early. If nei_flag is 0 and the depth of the current CU is greater than or equal to the larger of the two neighboring block depths plus 1, isSplit is 0 and further partitioning of the current CU is likewise terminated early. Otherwise, isSplit is -1 and the CU partitioning follows the default VVC process to select the optimal partition result.
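A compact sketch tying steps two through five together (a hypothetical helper under assumed names and a simplified block descriptor, not the patent's or VTM's actual code):

```cpp
#include <algorithm>
#include <cstdlib>
#include <optional>

// Assumed minimal view of a coding block, for illustration purposes only.
struct BlockInfo {
    bool isLuma;
    int  width, height;
    int  depth;                       // QT depth + MT depth of the current CU
    std::optional<int> leftDepth;     // depth of left neighbor, if available
    std::optional<int> aboveDepth;    // depth of above neighbor, if available
};

// Returns true if further partitioning of the current CU should be skipped.
// Follows steps two to five of the described method.
bool earlyTerminatePartition(const BlockInfo& cu) {
    // Step two: only luminance blocks are handled.
    if (!cu.isLuma) return false;
    // Step three: only blocks with width and height <= 32, and with both
    // left and above neighbors available.
    if (cu.width > 32 || cu.height > 32) return false;
    if (!cu.leftDepth || !cu.aboveDepth) return false;

    const int left  = *cu.leftDepth;
    const int above = *cu.aboveDepth;

    // Step four: nei_flag from the neighbor depths.
    int neiFlag;
    if (left == above)                    neiFlag = 1;
    else if (std::abs(left - above) == 1) neiFlag = 0;
    else                                  neiFlag = -1;

    // Step five: isSplit decision; 1 and 0 both mean "terminate early".
    if (neiFlag == 1 && cu.depth >= left + 1)                  return true;
    if (neiFlag == 0 && cu.depth >= std::max(above, left) + 1) return true;
    return false;   // isSplit == -1: keep the default VVC partition search
}
```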
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

Claims (2)

1. A method for fast intra-frame prediction of versatile video coding based on neighboring block depths, characterized in that: neighboring block depth is introduced into the test model of versatile video coding; the texture complexity of the current CU is judged using the depth features of the already coded CUs to the left of and above the current CU, and whether the current CU needs to be further divided is determined, thereby saving coding time.
H.266/VVC adopts a block-based hybrid video coding structure, but unlike the quadtree (QT) partitioning in HEVC, which uses only square blocks, it adopts a quadtree plus multi-type tree (QTMT) partition structure. QTMT mainly includes three partition structures: the quadtree (QT), the ternary tree (TT), and the binary tree (BT), where both TT and BT have vertical and horizontal variants.
In VVC, the optimal partitioning of a CU is determined according to the rate-distortion costs (RD costs) of the different partitioning schemes; the partition with the lowest RD cost among all possible partitions is the best one. VVC defines several coding parameters to limit block partitioning under QTMT: MaxCUWidth and MaxCUHeight are both 64, so a 128×128 CTU must first be partitioned by QT. MinQTSize is set to 8 to limit the minimum size of a QT node; according to MinQTSize, an 8×8 CU can only be partitioned by MT. Both MaxBtSize and MaxTtSize are 32, so the maximum size of an MT node is 32×32. The improved coding tree structure supports coding blocks of larger size, and the partitioning is more flexible. VVC removes the concepts of Prediction Units (PUs) and Transform Units (TUs), retaining only Coding Units (CUs). In the coding tree structure a CU may be square or rectangular: BT partitions a CU into two rectangular sub-CUs of equal area, and TT partitions a CU into three rectangular sub-CUs with an area ratio of 1:2:1. Rectangular blocks with unequal width and height make the coding blocks more flexible.
Because neighboring CUs in the same frame are strongly correlated in color, smoothness, texture, and so on, and the coding order within a CTU follows a Z-shaped scan, the units to the left of and above the current CU have already been coded when the current CU is encoded. Thus, if the depths of the neighboring blocks show that the texture of the current CU is simple and further partitioning is unnecessary, this can make a crucial contribution to reducing encoding time. To this end, the invention designs a CU-partition early-termination algorithm based on neighboring block depth so as to reduce the complexity of intra-frame coding in VVC.
The partitioning pattern of a Coding Tree Unit (CTU) is closely related to the texture characteristics of the block, so texture features can serve as an important basis for CTU partitioning. In regions of complex texture, small CUs are generally used and the CU depth is large, whereas in regions of simple texture, CU blocks are typically larger and the CU depth is smaller. Owing to spatial correlation, the current CU is highly similar to its spatially neighboring CUs; their coding depths may not be exactly identical, but they are usually close, so the spatial correlation of CUs is helpful for estimating the coding depth.
In the VTM, a 128×128 CTU block is split into four 64×64 sub-CU blocks by default, so 128×128 blocks need not be considered. Small CUs occupy little encoding time, while terminating the partitioning of a 64×64 CU early tends to cause a relatively large quality loss. Therefore, the invention applies fast prediction only to CU blocks whose width and height do not exceed 32, to achieve a balance between computational complexity and coding quality.
Within a frame, every CU except those in the leftmost column and the topmost row has left and above neighboring CUs. The possible maximum depth of the current CU is then inferred from the depths of the neighboring CUs, and the partitioning of the current CU is terminated early when its depth is greater than or equal to that maximum depth. CU depth in VVC differs from HEVC. In HEVC, CUs are square with sizes 64×64, 32×32, 16×16, and 8×8, corresponding to CU depths 0, 1, 2, and 3. In VVC, a CU may be rectangular because of MT partitioning, and the CU depth is split into a QT depth and an MT depth; the depth of a CU is its QT depth plus its MT depth. In the default VVC setting, a CU cannot be QT-partitioned after MT partitioning has been used, so QT partitioning is applied first, and MT partitioning can start once the CU is 32×32. For example, a 32×32 CU partitioned into two 32×16 sub-CUs by a binary tree gives each 32×16 sub-CU a QT depth of 2, an MT depth of 1, and a total depth of 3.
The invention acquires the depths of the left and above neighboring blocks. If the depths of the two neighboring blocks are equal, the depth of the current CU is likely to be equal to, or 1 greater than, the depth of the neighboring blocks. If the depths of the two neighboring blocks are not equal and differ by 1, the depth of the current CU is likely equal to either the left CU depth or the above CU depth. If the depths of the two neighboring blocks differ by 2 or more, the coding depth of the current CU shows no obvious pattern, and the optimal CU partitioning result is selected by full search and recursive rate-distortion cost comparison.
The present invention is a method based on the luminance component, so it is first determined whether the current block is a luminance block; the texture complexity of the current CU is then judged based on the depth features of the already coded CUs to the left of and above the current CU, and whether the current CU needs to be further divided is determined, thereby saving encoding time.
2. The method for fast intra prediction of versatile video coding based on neighboring block depth according to claim 1, wherein a flag nei_flag is derived from the depths of the neighboring CUs. If the two neighboring blocks have equal depth, nei_flag is 1. If their depths are not equal and differ by 1, nei_flag is 0. Otherwise, nei_flag is -1. nei_flag is calculated as follows:

$$nei\_flag = \begin{cases} 1, & Left_{depth} = Above_{depth} \\ 0, & \lvert Left_{depth} - Above_{depth} \rvert = 1 \\ -1, & \text{otherwise} \end{cases}$$

where $Left_{depth}$ and $Above_{depth}$ denote the depth of the left neighboring block and the depth of the above neighboring block, respectively.
The invention obtains a flag isSplit, indicating whether partitioning should be terminated early, from the depth of the current CU block and nei_flag. If nei_flag is 1 and the depth of the current CU is greater than or equal to the left neighboring block depth plus 1, isSplit is 1 and further partitioning of the current CU is terminated early. If nei_flag is 0 and the depth of the current CU is greater than or equal to the larger of the two neighboring block depths plus 1, isSplit is 0 and further partitioning of the current CU is likewise terminated early. Otherwise, isSplit is -1 and the CU partitioning follows the default VVC process to select the optimal partition result. isSplit is calculated as follows:

$$isSplit = \begin{cases} 1, & nei\_flag = 1 \text{ and } CU_{depth} \ge Left_{depth} + 1 \\ 0, & nei\_flag = 0 \text{ and } CU_{depth} \ge \max(Above_{depth}, Left_{depth}) + 1 \\ -1, & \text{otherwise} \end{cases}$$

where $CU_{depth}$ is the depth of the current CU and $\max(Above_{depth}, Left_{depth})$ is the depth of the deeper of the two neighboring blocks.
By using the depths of the left and above neighboring CUs, which are strongly spatially correlated with the current CU, the method decides whether to terminate the partitioning of the current CU early, and it can greatly reduce the computational complexity of encoding and the encoding time with only a small quality loss.
CN202210206966.5A 2022-03-04 2022-03-04 Method for quick intra-frame prediction of universal video coding based on adjacent block depth Pending CN116744020A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210206966.5A CN116744020A (en) 2022-03-04 2022-03-04 Method for quick intra-frame prediction of universal video coding based on adjacent block depth

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210206966.5A CN116744020A (en) 2022-03-04 2022-03-04 Method for quick intra-frame prediction of universal video coding based on adjacent block depth

Publications (1)

Publication Number Publication Date
CN116744020A true CN116744020A (en) 2023-09-12

Family

ID=87917338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210206966.5A Pending CN116744020A (en) 2022-03-04 2022-03-04 Method for quick intra-frame prediction of universal video coding based on adjacent block depth

Country Status (1)

Country Link
CN (1) CN116744020A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117676171A (en) * 2024-01-31 2024-03-08 腾讯科技(深圳)有限公司 Three-tree division processing method, equipment and storage medium for coding unit
CN117676171B (en) * 2024-01-31 2024-05-07 腾讯科技(深圳)有限公司 Three-tree division processing method, equipment and storage medium for coding unit

Legal Events

Date Code Title Description
PB01 Publication