CN111885383A - CU subdivision based on CU texture complexity - Google Patents


Info

Publication number
CN111885383A
CN111885383A (application CN202010578401.0A)
Authority
CN
China
Prior art keywords
sub
current
partition
texture complexity
coding
Prior art date
Legal status
Pending
Application number
CN202010578401.0A
Other languages
Chinese (zh)
Inventor
张萌萌
刘志
岳�文
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202010578401.0A priority Critical patent/CN111885383A/en
Publication of CN111885383A publication Critical patent/CN111885383A/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

There is provided a method of Coding Unit (CU) sub-division based on the texture complexity of a CU, the method comprising: calculating a texture complexity value of the current CU before performing luma intra prediction on the current CU; if the texture complexity value is greater than a threshold: if the size of the current CU is 4x8 (or 8x4), dividing the current CU into two 4x4 sub-partitions; or if the size of the current CU is greater than 4x8 (or 8x4) and less than or equal to 64x64, dividing the current CU into 4 sub-partitions, wherein each sub-partition includes at least 16 pixels; or if the size of the current CU is larger than 64x64, not further sub-dividing the current CU; or if the texture complexity value is less than or equal to the threshold, not further sub-partitioning the current CU, wherein, if it is determined that the current CU is to be further sub-partitioned, a CU sub-partition flag is set for the current CU. A corresponding decoding method is also provided.

Description

CU subdivision based on CU texture complexity
Technical Field
The present invention relates to the field of image and video processing, and more particularly, to a method, apparatus, and computer program product for CU sub-partitioning based on CU texture complexity in Versatile Video Coding (VVC).
Background
In April 2010, the two international video coding standards organizations, VCEG and MPEG, established the Joint Collaborative Team on Video Coding (JCT-VC) to jointly develop the High Efficiency Video Coding (HEVC) standard, also known as H.265. The first edition of the HEVC standard was completed in January 2013, and three versions were released successively in April 2013, October 2014, and April 2015. These are readily available online, and the present application incorporates the three versions of the HEVC standard described above into the present specification as background for the present invention.
HEVC proposes completely new syntax elements: a Coding Unit (CU) is the basic unit of prediction, transform, quantization, and entropy coding; a Prediction Unit (PU) is the basic unit of intra/inter prediction; and a Transform Unit (TU) is the basic unit of transform and quantization. In addition, each CU defines a region that shares the same prediction mode (intra or inter).
As shown in fig. 1, in HEVC, switching between intra-prediction mode and inter-prediction mode may be performed. In both intra prediction mode and inter prediction mode, HEVC adopts a coding structure of a Coding Tree Unit (CTU), which is a basic processing unit of HEVC coding and decoding. The CTU consists of 1 luma CTB, 2 chroma CTBs and corresponding syntax elements. Fig. 2 shows the CTU structure after one LCU (largest coding unit) coding. In HEVC, an LCU may contain only one Coding Unit (CU), or may be partitioned into CUs of different sizes using a CTU quadtree structure.
There are four CU sizes in HEVC: 64x64, 32x32, 16x16, and 8x8. The smaller the CU block, the deeper it is located in the CTU tree. CUs of size 64x64, 32x32, and 16x16 are in 2Nx2N mode (indicating that partitioning into smaller CUs is possible), while CUs of size 8x8 are in NxN mode (indicating that no further partitioning is possible). For intra prediction, a CU is thus assigned one of two PartModes (2Nx2N and NxN) depending on whether it can be split into smaller CUs: CUs of size 64x64, 32x32, and 16x16 belong to 2Nx2N, and CUs of size 8x8 belong to NxN.
In HEVC, a PU is the basic unit of intra/inter prediction. PU partitioning is CU-based, with five regular sizes: 64x64, 32x32, 16x16, 8x8, and 4x4. More specifically, the PU size is based on the PartMode: for the 2Nx2N PartMode the PU size is the same as that of the CU, while a CU with the NxN PartMode can be divided into four 4x4 sub-PUs. For a 2Nx2N CU, the optional modes of the intra-prediction PU include 2Nx2N and NxN, and the optional modes of the inter-prediction PU include 8 kinds: 4 symmetric modes (2Nx2N, 2NxN, Nx2N, NxN) and 4 asymmetric modes (2NxnU, 2NxnD, nLx2N, nRx2N), where 2NxnU and 2NxnD are divided with top-to-bottom ratios of 1:3 and 3:1, respectively, and nLx2N and nRx2N are divided with left-to-right ratios of 1:3 and 3:1, respectively.
In HEVC, mode selection continues to use the Lagrangian Rate Distortion Optimization (RDO) of H.264/AVC; the RD cost is computed for each intra mode as:
J=D+λR
where J is the Lagrangian cost (i.e., RD-cost), D represents the distortion of the current intra mode, R represents the number of bits needed to encode all information in the current prediction mode, and λ is the Lagrangian multiplier. D is typically implemented using the sum of absolute Hadamard-transformed differences (SATD).
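Although the patent text contains no code, the cost computation above can be sketched in Python. This is an illustrative sketch only: the 4x4 Hadamard construction, the toy residual, and the λ and bit-count values are assumptions, not the encoder's actual implementation.

```python
import numpy as np

def hadamard4():
    # Build the 4x4 Hadamard matrix recursively from the 2x2 base matrix.
    h2 = np.array([[1, 1], [1, -1]])
    return np.kron(h2, h2)

def satd4x4(residual):
    """Sum of absolute Hadamard-transformed differences for a 4x4 residual."""
    h = hadamard4()
    return np.abs(h @ residual @ h.T).sum()

def rd_cost(distortion, bits, lam):
    """Lagrangian cost J = D + lambda * R used for mode selection."""
    return distortion + lam * bits

# Toy example: score one candidate mode for a 4x4 block.
orig = np.arange(16).reshape(4, 4)
pred = np.full((4, 4), 7)
d = satd4x4(orig - pred)
j = rd_cost(d, bits=32, lam=0.85)
```

The Hadamard transform approximates transform-domain distortion far more cheaply than a full transform-and-reconstruct loop, which is why SATD is commonly used during mode search.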
Processing a frame of video requires first dividing it into multiple LCUs (64x64) and then encoding each LCU in turn. Each LCU is recursively divided, and whether to continue dividing is determined by calculating the RD-cost at the current depth. An LCU may be divided into units as small as 8x8, as shown in fig. 2. The encoder judges whether to continue dividing by comparing the RD-cost values at each depth: if the sum of the coding costs of the 4 sub-CUs at the current depth is larger than that of the current CU, the division stops; otherwise, the division continues until it is finished.
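The recursive split-or-not decision described above can be sketched as a toy Python model. The CU representation (x, y, size) and the injected `rd_cost_fn` are assumptions for illustration; a real encoder derives the RD-cost by actually coding each candidate.

```python
def best_partition(cu, depth, max_depth, rd_cost_fn):
    """Recursively decide whether to quad-split a CU by comparing RD costs.

    Returns (cost, tree), where tree is either the CU itself (no split)
    or a list of four sub-trees. Toy model: a CU is (x, y, size).
    """
    no_split_cost = rd_cost_fn(cu)
    x, y, size = cu
    if depth == max_depth or size <= 8:        # 8x8 is the smallest unit
        return no_split_cost, cu
    half = size // 2
    children = [(x, y, half), (x + half, y, half),
                (x, y + half, half), (x + half, y + half, half)]
    split_cost, subtrees = 0.0, []
    for child in children:
        c, t = best_partition(child, depth + 1, max_depth, rd_cost_fn)
        split_cost += c
        subtrees.append(t)
    # Keep the split only if the four sub-CUs together are cheaper.
    if split_cost < no_split_cost:
        return split_cost, subtrees
    return no_split_cost, cu
```

With a cost model that penalizes large blocks, the recursion splits all the way down; with a constant per-CU cost, it never splits, mirroring the comparison rule in the text.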
Those skilled in the art will readily appreciate that, since the CTU is the tree coding structure by which an LCU is partitioned into CUs, CU partitioning in a CTU begins from the LCU; hence the two terms are often used interchangeably in the art.
In intra prediction, a total of 35 prediction modes are used for each PU. Using coarse mode decision (RMD), three candidate modes are obtained for 64x64, 32x32, and 16x16 blocks, and eight candidate modes for 8x8 and 4x4 blocks. The best candidate list for each PU size is obtained by merging the Most Probable Modes (MPMs) from neighboring blocks. Then, the best intra prediction mode for the current PU is selected by RDO. When intra prediction of all PUs included in the current CU is completed, intra prediction of the current CU is completed. The partitioning with the smaller RD-cost is then selected by comparing the RD-cost of the current CU against the total RD-cost of its four sub-CUs. When all CU partitions are completed, intra prediction of the current CTU is complete. For HEVC, when coding an LCU, intra prediction of 85 CUs (one 64x64 CU, four 32x32 CUs, sixteen 16x16 CUs, and sixty-four 8x8 CUs) should be performed. When a CU is encoded, intra prediction of one PU or four sub-PUs should be performed. This large number of CUs and PUs results in the high complexity of intra prediction.
Versatile Video Coding (VVC, H.266), proposed by JVET at the San Diego meeting in April 2018, is a new generation of video coding technology improved on the basis of H.265/HEVC. Its main objective is to improve upon the existing HEVC, providing higher compression performance while also optimizing for emerging applications (360° panoramic video and HDR).
Relevant documents and test platforms for VVC are available from https://jvet.hhi.fraunhofer.de/, and proposals for H.266 are available from http://phenix.it-sudparis.eu/jvet/.
VVC continues the hybrid coding framework adopted since H.264; the general block diagram of its VTM8 encoder is shown in fig. 1. Inter and intra prediction coding eliminate the temporal and spatial correlation, respectively. Transform coding transform-codes the residual to remove spatial correlation. Entropy coding eliminates statistical redundancy. VVC focuses on researching new coding tools and techniques that improve video compression efficiency within this hybrid coding framework.
Although both VVC and HEVC use a tree structure for CTU partitioning, VVC uses a tree-structured CTU partitioning method different from HEVC's. As described above, in HEVC, CTUs are partitioned into CUs (i.e., a coding tree) using a quadtree structure. Decisions regarding intra-coding and inter-coding are made at the leaf-node CUs. Then, each leaf CU may be further divided into 1, 2, or 4 prediction units (PUs) according to the PU partition type. Within each PU, the same prediction process is used, and the relevant information is sent to the decoder on a PU basis. After the residual block is obtained by the PU-based prediction process, the leaf CU may be divided into TUs according to another quadtree-like structure similar to the coding tree of the CU. In VVC, a quadtree splitting structure with nested multi-type trees using binary trees and ternary trees is employed; that is, the separate forms of CU, PU, and TU are eliminated in VVC. A CTU is first partitioned by a quadtree and then further partitioned by a multi-type tree. As shown in fig. 8, VVC specifies 4 multi-type tree partitioning modes: horizontal binary tree partitioning, vertical binary tree partitioning, horizontal ternary tree partitioning, and vertical ternary tree partitioning. The leaf nodes of a multi-type tree are called Coding Units (CUs), and unless a CU is too large for the maximum transform length, the CU partition is used for prediction and transform processing without further partitioning. This means that in most cases the CU, PU, and TU have the same block size in the quadtree splitting structure with nested multi-type trees; the exception occurs when the maximum supported transform length is smaller than the width or height of a color component of the CU. Fig. 9 illustrates a particular embodiment of CTU-to-CU partitioning with VVC's quadtree structure with nested multi-type trees, where bold boxes represent quadtree partitioning and the remaining edges represent multi-type tree partitioning.
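The four multi-type tree modes can be illustrated by the sub-block shapes they produce. The following Python sketch is illustrative only (the mode names are my own labels); the binary splits halve the block, and the ternary splits use VVC's 1:2:1 ratio.

```python
def mtt_split(width, height, mode):
    """Return the sub-block (w, h) list for the four multi-type tree splits.

    'BT_H'/'BT_V' split a block into two halves (horizontal/vertical);
    'TT_H'/'TT_V' split it 1:2:1 into three parts.
    """
    if mode == 'BT_H':
        return [(width, height // 2)] * 2
    if mode == 'BT_V':
        return [(width // 2, height)] * 2
    if mode == 'TT_H':
        q = height // 4
        return [(width, q), (width, 2 * q), (width, q)]
    if mode == 'TT_V':
        q = width // 4
        return [(q, height), (2 * q, height), (q, height)]
    raise ValueError(mode)
```

For example, a 32x32 block under a vertical ternary split yields 8x32, 16x32, and 8x32 sub-blocks.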
Intra prediction has long been a main research topic in video coding; it removes spatial information redundancy by exploiting the spatial correlation of images to achieve compression of video data. Many new intra prediction techniques are introduced in VVC (H.266), including 67 intra prediction modes, cross-component linear model prediction, position-dependent intra prediction combination, multi-reference line intra prediction, matrix-weighted intra prediction, and the intra sub-partitioning (ISP) coding mode.
Intra prediction coding is of great significance in video coding technology, and improving its performance greatly affects the performance of the whole video encoder. Therefore, in order to improve coding performance, it is necessary to study in depth the intra prediction coding technology of the next-generation coding standard H.266/VVC so as to achieve high coding performance.
In the intra prediction process, the current block refers to the reconstructed pixels to its left and above it to obtain a prediction signal. The prediction residual is then computed from the prediction result, and the residual is transformed, quantized, entropy coded, and sent to the decoding end. The reference samples that can be used to create the intra-predicted signal are located only to the left of and above the block. Since the correlation between samples in a natural image generally decreases with increasing distance, the prediction quality of samples located near the lower-right corner of an image block is generally worse than that of samples located near its upper-left corner.
To solve this problem, VVC introduces the intra sub-partition coding mode, which divides a luma intra-prediction block horizontally or vertically into 4 or 2 equally sized sub-blocks containing at least 16 samples each. The minimum block size for which the ISP can be used is 4x8 (or 8x4) and the maximum is 64x64. If the block size is equal to 4x8 (or 8x4), the block is divided into 2 sub-blocks; if the block size is greater than 4x8 (or 8x4), the block is divided into 4 sub-blocks. For example, a 32x16 block may be divided horizontally into four 32x4 sub-blocks or vertically into four 8x16 sub-blocks, while an 8x4 block may only be divided vertically into two 4x4 sub-blocks, as shown in fig. 3.
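The ISP sub-block sizes described in this paragraph can be sketched as follows. This is an illustrative Python helper reflecting the rules as stated here (in particular, the minimum-size 4x8/8x4 block always yields two 4x4 sub-blocks); function and parameter names are assumptions, not VVC syntax.

```python
def isp_sub_partitions(width, height, vertical):
    """Sub-partition shapes per the rule above: 4x8/8x4 -> two 4x4 blocks;
    larger blocks up to 64x64 -> four parts of at least 16 samples each.
    Returns a list of (w, h), or None if ISP is not allowed."""
    if width * height < 32 or width > 64 or height > 64:
        return None                       # outside the 4x8/8x4 .. 64x64 range
    if width * height == 32:
        return [(4, 4), (4, 4)]           # minimum block: two 4x4 sub-blocks
    w, h = (width // 4, height) if vertical else (width, height // 4)
    if w * h < 16:
        return None                       # each sub-block needs >= 16 samples
    return [(w, h)] * 4
```

This reproduces the examples in the text: a 32x16 block splits horizontally into four 32x4 sub-blocks or vertically into four 8x16 sub-blocks.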
Each sub-block is processed similarly to an intra-prediction block in VVC: first, an intra prediction signal and a residual signal are generated, and the residual signal is then transformed, quantized, entropy coded, and transmitted to the decoding end. The decoding end entropy-decodes the received residual signal, performs inverse quantization and inverse transform, and adds the intra prediction signal to obtain reconstructed samples. After each sub-block is processed, its reconstructed samples can be used to calculate the prediction signal of the next sub-block, and the same steps are repeated until all sub-blocks are encoded; all sub-blocks of a coding block share the same intra prediction mode. As shown in fig. 4, a block is divided horizontally into four sub-blocks; the first sub-block uses neighboring CU samples as reference samples, its reconstructed samples can be used to predict the second sub-block, and the process continues until all four sub-blocks are processed.
The ISP coding mode was developed from the Line-based Intra Prediction (LIP) coding mode, proposed at JVET meeting K. The main idea of LIP is to divide the luma intra-prediction block into one-dimensional partitions, or lines, and to perform LIP partitioning on each block. With the AI configuration, the BD-rate can be reduced by an average of 2.34%, but the encoding run time becomes 293% of the original. Such a large increase in encoder run time is detrimental to hardware implementation. For example, a 4x4 block may be divided into four 4x1 lines, which may create throughput problems, and all blocks being 1x4 (or 4x1) instead of 4x4 (the current case of VTM) may result in a worse bitstream. If the number of lines generated is large (e.g., 64 lines), the encoder needs to perform a large number of operations and memory accesses while performing the necessary RD checks. Column sub-partitioning (1xN) is even harder to implement, because samples are allocated using raster scanning and memory access is expensive. To solve these problems, at JVET meeting L a partitioning was proposed in which each block size is divided into a set number of partitions (each with at least 16 samples), and the width or height of the resulting partitions must be at least 4 samples. With the AI configuration, the BD-rate can be reduced by 1% on average while the coding run time becomes 156% of the original, successfully reducing the complexity of the tool and the associated hardware. After this meeting, LIP was formally renamed ISP. The ISP algorithm was then optimized again at meeting M in order to strike a better balance between coding gain and encoder run time; experimental results show that, under the AI configuration, the BD-rate can be reduced by 0.59% on average while the encoding run time becomes 112% of the original.
By analyzing the experimental results of these proposals, it is found that the ISP coding mode achieves significant BD-rate improvement for video content containing rich textures, but little improvement for video content with simple textures. Therefore, there is still much room for optimizing the ISP coding mode.
From prior art efforts, many researchers have used CU texture characteristics to optimize video coding. However, the prior art focuses on solving CU size decision and mode selection problems with texture-based methods, achieving significant time savings with negligible coding performance degradation. None has considered using a texture-based approach to optimize the CU sub-partition coding mode proposed in H.266.
Disclosure of Invention
The invention provides a method, a device, a codec and a processor-readable storage medium for VVC (H.266). More specifically, the present invention is used for CU sub-partitioning in VVC (h.266) based on CU texture complexity.
The ISP coding mode in VVC is not efficient for coding very flat or constant video content, and in order to further improve coding efficiency, a CU sub-partitioning scheme based on CU texture complexity is proposed herein.
In one aspect, whether a CU needs to be sub-divided may be determined in advance according to the texture complexity of the CU, thereby enabling more efficient encoding. The CU sub-division method is performed as follows: first, the texture complexity of a CU is calculated before the CU undergoes sub-division; then, whether the CU is to be sub-divided is judged according to the texture complexity, thereby realizing more efficient intra prediction.
In another aspect, the present invention proposes setting a corresponding CU sub-division flag when it is determined to sub-divide a CU. In one embodiment, the CU sub-partition flag may indicate a horizontal sub-partition or a vertical sub-partition.
According to an aspect of the present invention, there is provided a method for decoding a video encoded bitstream, comprising:
reading a CU sub-partition flag of the current CU from the bitstream;
when the CU sub-partition flag is set:
if the size of the current CU is 4x8 (or 8x4), dividing the current CU into two 4x4 sub-partitions; or
if the size of the current CU is greater than 4x8 (or 8x4) and less than or equal to 64x64, dividing the current CU into 4 sub-partitions, wherein each sub-partition includes at least 16 pixels; and
when the CU sub-partition flag is not set, not further sub-dividing the current CU.
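The decoding steps above can be sketched as follows. This is illustrative Python; the choice to split along the longer side is an assumption, since the text leaves the split direction to a separate indication.

```python
def decode_cu_sub_partitions(width, height, flag_set):
    """Decoder-side interpretation of the CU sub-partition flag."""
    if not flag_set:
        return [(width, height)]            # no further sub-division
    if (width, height) in ((4, 8), (8, 4)):
        return [(4, 4), (4, 4)]             # minimum block: two 4x4 parts
    # Larger CU (up to 64x64): four sub-partitions of >= 16 pixels each,
    # split here along the longer side (an assumption for illustration).
    if width >= height:
        return [(width // 4, height)] * 4
    return [(width, height // 4)] * 4
```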
According to another aspect of the present invention, the setting of the CU sub-partition flag is associated with the texture complexity of the current CU.
According to another aspect of the present invention, the setting of the CU sub-partition flag is associated with a comparison of the texture complexity of the current CU against a threshold.
According to an aspect of the present invention, there is provided a Coding Unit (CU) sub-division method based on the texture complexity of the CU, the method comprising:
calculating a texture complexity value of the current CU before performing luma intra prediction on the current CU;
if the texture complexity value is greater than a threshold:
if the size of the current CU is 4x8 (or 8x4), dividing the current CU into two 4x4 sub-partitions; or
if the size of the current CU is greater than 4x8 (or 8x4) and less than or equal to 64x64, dividing the current CU into 4 sub-partitions, wherein each sub-partition includes at least 16 pixels; or
if the size of the current CU is larger than 64x64, not further sub-dividing the current CU; or
if the texture complexity value is less than or equal to the threshold, not further sub-partitioning the current CU,
wherein, if it is determined that the current CU is to be further sub-partitioned, a CU sub-partition flag is set for the current CU.
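The decision steps above can be sketched as a simple predicate. This is illustrative Python: the threshold value 400 is the example given in a later aspect of this document, and the treatment of CUs below the 4x8/8x4 minimum is an assumption.

```python
THRESHOLD = 400  # example threshold given elsewhere in this document

def should_sub_partition(width, height, texture_complexity):
    """Encoder-side decision: True iff the CU sub-partition flag should be
    set (texture above threshold and CU size within 4x8/8x4 .. 64x64)."""
    if texture_complexity <= THRESHOLD:
        return False          # flat content: do not sub-divide
    if width > 64 or height > 64:
        return False          # larger than 64x64: do not sub-divide
    if width * height < 32:
        return False          # below the 4x8/8x4 minimum (assumed excluded)
    return True
```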
According to another aspect of the invention, the threshold is 400.
According to another aspect of the present invention, the CU sub-partition flag indicates a horizontal sub-partition or a vertical sub-partition.
According to another aspect of the present invention, there is provided a video codec capable of CU sub-division based on texture complexity of a Coding Unit (CU), the video codec performing an encoding operation as described above with respect to an input original video stream or a decoding operation as described above with respect to an input encoded video stream.
According to another aspect of the present invention, a computing device capable of performing video coding is presented, comprising:
a processor; and
a non-volatile memory coupled to the processor, the non-volatile memory storing instructions or program code that, when executed by the processor, enable encoding operations as described above for an input raw video stream or decoding operations as described above for an input encoded video stream.
According to another aspect of the invention, the computing device may be a system on a chip (SOC).
According to another aspect of the invention, a computer program product for a method as described above is presented.
Drawings
Fig. 1 shows an embodiment of a general block diagram of a generic encoder for HEVC/VVC.
Fig. 2 shows a schematic diagram of a coding tree unit (CTU) in HEVC.
Fig. 3 shows a schematic diagram of CU sub-partitioning according to an embodiment of the present invention.
Fig. 4 shows a prediction diagram of CU sub-partitioning according to an embodiment of the present invention.
Fig. 5 shows a flow chart of an encoding method according to an embodiment of the invention.
Fig. 6 shows a flow chart of a decoding method according to an embodiment of the invention.
Fig. 7 shows a schematic diagram of a device for implementing the encoding method of an embodiment of the present invention.
Fig. 8 illustrates a multi-type tree splitting pattern for VVC.
Fig. 9 illustrates one particular embodiment of CTU-to-CU partitioning of a quad-tree partitioning structure of a VVC with nested multi-type trees.
Detailed Description
Various aspects are now described with reference to the drawings. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more aspects. It may be evident, however, that such aspect(s) may be practiced without these specific details.
As used in this application, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, such as but not limited to hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets, e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal.
The invention provides a method, a device, a codec and a processor-readable storage medium for VVC (H.266). More specifically, the present invention is used for CU sub-partitioning in VVC (h.266) based on CU texture complexity.
The ISP coding mode in VVC is not efficient for coding very flat or constant video content, and in order to improve the coding efficiency of intra prediction, a CU sub-partitioning scheme based on CU texture complexity is proposed herein.
In one aspect, whether a CU needs to be sub-divided may be determined in advance according to the texture complexity of the CU, thereby enabling more efficient encoding. The CU sub-division method is performed as follows: first, the texture complexity of a CU is calculated before the CU undergoes sub-division; then, whether the CU is to be sub-divided is judged according to the texture complexity, thereby realizing more efficient intra prediction.
In another aspect, the present invention proposes setting a corresponding CU sub-division flag when it is determined to sub-divide a CU. In one embodiment, the CU sub-partition flag may indicate a horizontal sub-partition or a vertical sub-partition.
The invention proposes a novel algorithm for VVC. However, those skilled in the art will readily appreciate that the present invention is equally applicable to encoding other types of video frames. In addition, it is readily understood by those skilled in the art that the present invention is directed to the luminance component, not the chrominance component.
It is easily understood that the CU sub-division method proposed in the present invention is not the same as HEVC's manner of dividing a single CU into multiple Prediction Units (PUs), nor the same as the multi-type tree leaf-node CU division manner in VVC. For example, in the present invention, the CU sub-division scheme is for intra prediction. For another example, in the present invention, all sub-partitions of one coding block (CU) share the same intra prediction mode. After each sub-partition is processed, its reconstructed samples can be used to calculate the prediction signal of the next sub-partition, and the same steps are repeated until all sub-partitions are encoded: as shown in fig. 4, a CU is divided horizontally into four sub-partitions; the first sub-partition uses neighboring CU samples as reference samples, its reconstructed samples can be used to predict the second sub-partition, and this process continues until all four sub-partitions are processed. As another example, the CU sub-division of the present invention is performed after the multi-type tree leaf-node CU division of VVC has finished, and is not the same as the multi-type tree splits (binary tree horizontal/vertical, ternary tree horizontal/vertical) specified by VVC.
Fig. 6 shows a flow diagram of an encoding method according to an embodiment of the invention. The encoding method performs Coding Unit (CU) sub-division based on the texture complexity of the CU in Versatile Video Coding (VVC).
Fig. 6 begins at block 601. CU partitioning is performed for each frame to be encoded. In one embodiment, the current frame may first be partitioned into slices; in another embodiment, no slice partitioning is performed. The current frame is divided into a plurality of CTUs. As is well known in the art, both HEVC and VVC coding are CTU based. For example, the CTUs may be partitioned into CUs based on Rate Distortion Optimization (RDO). Embodiments of CTU partitioning are shown in figs. 2 and 4, respectively. The present invention can perform CU partitioning in various ways; the inventive idea is not how to divide a CU, but how to sub-divide an already divided CU based on its texture complexity. Therefore, CU partitioning is not discussed in detail here. Before block 601, the final partitioning of the CU has been completed, but the sub-division of the CU still needs to be considered. As described above, in VVC, the final partition of a CU is not the same as the CU sub-partitions of the present invention.
In block 603, a texture complexity value of the current CU is calculated prior to luma intra prediction for the current CU. It is noted here that the present invention is applied only to the luma component, not to the chroma components. Because RDO has already been fully considered in the foregoing CU partitioning, the texture complexity value here is preferably not calculated using RDO, so as to avoid the inefficiency of unnecessary repeated calculations.
In one embodiment, the texture complexity of the current CU is calculated based on a combination of one or more of horizontal texture complexity, vertical texture complexity, 45-degree diagonal complexity, and 135-degree diagonal complexity. In a specific embodiment taking all of the above into account, the texture complexity of the current CU is calculated as follows:
D = (1/(W×H)) · Σ_{i=1..W} Σ_{j=1..H} ( |G_HOR(i,j)| + |G_VER(i,j)| + |G_45(i,j)| + |G_135(i,j)| )    (1)
wherein W and H are the width and height of the CU, respectively, and G_HOR, G_VER, G_45 and G_135 respectively represent the horizontal texture complexity, vertical texture complexity, 45-degree diagonal complexity, and 135-degree diagonal complexity, which can be calculated using the following equations (2)-(3). In equations (2)-(3), S_k is a texture operator, A is a 3x3 pixel matrix, and P(i, j) is the luminance pixel value at (i, j).
G_k = S_k * A, (k = HOR, VER, 45, 135)    (2)
A = [ P(i-1,j-1) P(i-1,j) P(i-1,j+1); P(i,j-1) P(i,j) P(i,j+1); P(i+1,j-1) P(i+1,j) P(i+1,j+1) ]    (3)
In one particular embodiment, the horizontal texture operator may be as follows:
S_HOR = [ -1 -2 -1; 0 0 0; 1 2 1 ]
in one particular embodiment, the vertical texture operator may be as follows:
S_VER = [ -1 0 1; -2 0 2; -1 0 1 ]
in one particular embodiment, the 45 degree texture operator may be as follows:
S_45 = [ 0 1 2; -1 0 1; -2 -1 0 ]
in one particular embodiment, the 135 degree texture operator may be as follows:
S_135 = [ 2 1 0; 1 0 -1; 0 -1 -2 ]
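As an illustration of equations (1)-(3), the following NumPy sketch computes a directional texture complexity for a CU. The four operator matrices are assumed to be the standard Sobel-type directional forms (the published text renders the exact matrices only as images), and the function name and looping strategy are illustrative, not part of the patent.

```python
import numpy as np

# Assumed standard Sobel-type 3x3 directional operators S_k.
OPERATORS = {
    "HOR": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]]),
    "VER": np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]),
    "45":  np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]]),
    "135": np.array([[2, 1, 0], [1, 0, -1], [0, -1, -2]]),
}

def texture_complexity(cu, directions=("HOR", "VER", "45", "135")):
    """Average directional gradient magnitude over a CU, in the style of eq. (1)."""
    cu = np.asarray(cu, dtype=np.float64)
    h, w = cu.shape
    total = 0.0
    # Apply each operator S_k to the 3x3 neighbourhood A of every interior
    # pixel (border pixels are skipped for simplicity).
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            a = cu[i - 1:i + 2, j - 1:j + 2]
            total += sum(abs(np.sum(OPERATORS[k] * a)) for k in directions)
    return total / (w * h)
```

A flat block has zero gradient in every direction and thus zero complexity, while any luminance ramp yields a positive value.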
in one particular embodiment, not all 4 of the texture complexities described above are used, but only 1-3 of them. In this case, equation (1) may be modified to contain only the texture complexity terms to be considered. For example, in many cases the texture complexity may be calculated using only the horizontal and vertical texture complexities. In a particular embodiment, for the first N CUs of an I frame, a first texture complexity is calculated using the horizontal, vertical, 45-degree diagonal, and 135-degree diagonal complexities, while a second texture complexity is calculated using only the horizontal and vertical texture complexities, and it is determined whether the difference between the first and second texture complexities is less than a threshold. If it is less than the threshold, the texture complexity is calculated using only the horizontal and vertical texture complexities for the remaining CUs. In one embodiment, this determination is maintained for a set of multiple frames (e.g., all frames within a GOP), since the texture complexity within a scene, which is typically made up of multiple frames, is approximately constant.
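The adaptive simplification just described can be sketched as follows. The function name, the averaging of per-CU differences, and the tuple return convention are illustrative assumptions; `full_scores` and `hv_scores` stand for the first and second texture complexities of the first N CUs of an I frame.

```python
def select_direction_set(full_scores, hv_scores, eps):
    """Compare 4-direction vs. HOR/VER-only complexities over the first N CUs
    of an I frame; if they agree to within eps on average, drop the diagonal
    terms for the remaining CUs (e.g., for the rest of the GOP)."""
    n = len(full_scores)
    diff = sum(abs(f - h) for f, h in zip(full_scores, hv_scores)) / n
    if diff < eps:
        return ("HOR", "VER")                  # diagonals add little: simplify
    return ("HOR", "VER", "45", "135")         # keep the full 4-direction measure
```

The chosen direction set would then be held fixed for all frames of the group, per the GOP-level embodiment above.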
In an alternative embodiment, the texture complexity value may be calculated using the following equation (5):
D = (1/(W×H)) · Σ_{i=1..W} Σ_{j=1..H} ( p(i,j) − p̄ )²,  where p̄ denotes the mean pixel value of the CU    (5)
where W, H are the width and height of the CU, respectively, and p is the pixel value (luminance value or chrominance value) located at (i, j) within the CU.
In a preferred embodiment, the texture complexity value may be calculated using the following equation (6):
D = (1/(W×H)) · Σ_{i=1..W} Σ_{j=1..H} | p(i,j) − p̄ |,  where p̄ denotes the mean pixel value of the CU    (6)
in other embodiments, other ways of computing texture complexity may be used. For example, in one embodiment, the texture complexity may be calculated based on RDO. However, as noted above, it is preferable not to calculate the texture complexity value using RDO, so as to avoid the inefficiency of unnecessary repeated calculations.
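For the alternative complexity measures of equations (5)-(6), whose exact forms appear only as images in the published text, pixel-deviation statistics such as the following are natural stand-ins: a mean squared deviation (variance-style) measure and a cheaper mean absolute deviation. Both function names and forms are illustrative assumptions.

```python
import numpy as np

def variance_complexity(cu):
    """Mean squared deviation of the CU's pixels from their mean (eq. (5)-style)."""
    cu = np.asarray(cu, dtype=np.float64)
    return float(np.mean((cu - cu.mean()) ** 2))

def mad_complexity(cu):
    """Mean absolute deviation: no multiplications per pixel, so cheaper
    in hardware (eq. (6)-style, a plausible 'preferred' variant)."""
    cu = np.asarray(cu, dtype=np.float64)
    return float(np.mean(np.abs(cu - cu.mean())))
```

Either measure is zero for a flat CU and grows with pixel spread, which is all the thresholding step in decision block 605 requires.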
In decision block 605, the calculated texture complexity is compared to a texture threshold.
In one embodiment, the texture threshold is determined by testing a plurality of video sequences. According to experiments, this particular texture threshold may be set to 400 in particular embodiments using all 4 texture complexities and using the texture operators described above.
When the texture complexity is less than or equal to the texture threshold (decision block 605: no), the current CU is not further sub-divided (block 615), but is intra-prediction encoded directly based on the current CU (block 619). The intra-frame coding may adopt any conventional intra-frame coding mode specified by HEVC or VVC.
When the texture complexity is greater than the texture threshold (decision block 605: Yes), then in decision block 607, it is determined whether the size of the current CU is 4x8 (or 8x 4).
If the size of the current CU is 4x8 (or 8x4) (decision block 607: Yes), the current CU is divided into two 4x4 sub-partitions in an ISP-like manner. Fig. 3(b) shows two specific embodiments of this sub-division.
If the size of the current CU is not 4x8 (or 8x4) (decision block 607: No), then it is further determined in decision block 609 whether the size of the current CU is greater than 64x64, where the size of the current CU is determined to be greater than 64x64 as long as either of the width and the height of the current CU is greater than 64.
If the size of the current CU is equal to or less than 64x64 (decision block 609: No), the current CU is partitioned into 4 sub-partitions. The principle of this partitioning is that each sub-partition contains at least 16 pixels (or samples). Fig. 3(a) shows two specific embodiments of the sub-division, but these two embodiments are by no means limiting on the manner of sub-division. In one embodiment, predetermined sub-division schemes for various CU sizes are predefined, so that no specific sub-division scheme needs to be signaled in the coded bitstream. In another embodiment, the sub-division scheme is determined by a conventional rate-distortion scheme, with the best sub-division scheme being the one with the minimum rate-distortion cost among all possible sub-division schemes. In one particular embodiment, a sub-division flag is used to indicate a vertical or horizontal sub-division, regardless of the manner in which the sub-division is determined.
If the size of the current CU is greater than 64x64 (decision block 609: Yes), the current CU is not further sub-partitioned based on the ISP (block 615), but is directly intra-coded based on the current CU (block 619).
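The decision logic of blocks 605-609 can be summarized in a short sketch. The function name and its return convention (number of sub-partitions, with 1 meaning no split) are illustrative; the default threshold of 400 is taken from the particular embodiment described above.

```python
def sub_partition_decision(width, height, texture_complexity, threshold=400):
    """Mirror decision blocks 605 (texture), 607 (4x8/8x4), and 609 (>64x64)."""
    if texture_complexity <= threshold:
        return 1                               # smooth CU: no sub-division
    if (width, height) in ((4, 8), (8, 4)):
        return 2                               # two 4x4 sub-partitions
    if width > 64 or height > 64:
        return 1                               # larger than 64x64: no sub-division
    return 4                                   # each sub-partition keeps >= 16 samples
```

Note that the >64x64 test fires when either dimension exceeds 64, matching the definition in decision block 609.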
In block 617, whenever the CU has been sub-divided (in block 611 or 613), a sub-division flag for the current CU is set. In a particular embodiment, the sub-division flag indicates a vertical or horizontal sub-division, depending on the particular sub-division scheme.
In block 619, the current CU is intra-coded based on its sub-partitions. Each sub-partition is processed similarly to an intra prediction block in HEVC or VVC: an intra prediction signal and a residual signal are first generated, and the residual signal is then transformed, quantized, entropy-coded, and sent to the decoding end. The decoding end entropy-decodes the received residual signal, performs inverse quantization and inverse transformation, and adds the intra prediction signal to obtain reconstructed samples. After each sub-partition is processed, its reconstructed samples can be used to calculate the prediction signal of the next sub-partition, and the same steps are repeated until all sub-partitions are coded; all sub-partitions of a coding block share the same intra prediction mode. As shown in fig. 4, a block is divided horizontally into four sub-partitions; the first sub-partition uses neighboring CU samples as reference samples, its reconstructed samples can be used to predict the second sub-partition, and this process continues until all four sub-partitions are processed. This process is similar to the ISP of VVC and is therefore not described in detail.
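The chained prediction described above (each sub-partition's reconstruction serving as the reference for the next) can be illustrated with a deliberately simplified toy model. The DC-style predictor, the function name, and the lossless pass-through reconstruction are illustrative assumptions, not the codec's actual transform/quantization pipeline.

```python
import numpy as np

def encode_sub_partitions(cu, n_splits, ref_row):
    """Toy horizontal ISP-style chain: each sub-partition is predicted from the
    mean of the reference row above it (a stand-in for a real intra mode, the
    same mode for every sub-partition), and its reconstruction becomes the
    reference for the next sub-partition."""
    h = cu.shape[0] // n_splits
    reference = np.asarray(ref_row, dtype=np.float64)
    residuals, recon_rows = [], []
    for s in range(n_splits):
        part = cu[s * h:(s + 1) * h].astype(np.float64)
        pred = np.full_like(part, reference.mean())  # DC-style prediction signal
        resid = part - pred                          # residual to transform/quantize/code
        recon = pred + resid                         # lossless toy reconstruction
        residuals.append(resid)
        recon_rows.append(recon)
        reference = recon[-1]                        # last reconstructed row feeds the next
    return residuals, np.vstack(recon_rows)
```

With no quantization loss, stacking the reconstructed sub-partitions reproduces the CU exactly, which makes the chaining easy to verify.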
Fig. 5 shows a flow chart of a decoding method according to an embodiment of the invention.
In block 501, the decoder reads the encoded video bitstream, parses the syntax elements, obtains the syntax elements associated with the CU, e.g., in a format specified in the HEVC or VVC protocol.
In block 503, the sub-division flag of the current CU is read.
In decision block 505, it is determined whether the sub-division flag is set. As described above for the encoding method of fig. 6, the sub-division flag for the current CU is set whenever the CU has been sub-divided. In a particular embodiment, the sub-division flag indicates a vertical or horizontal sub-division, depending on the particular sub-division scheme.
When the CU sub-partition flag is not set (decision block 505: No), the current CU is not further sub-partitioned (block 515), but is directly intra-decoded based on the current CU (block 517). The intra decoding may be based on any conventional intra coding mode specified by HEVC or VVC, which is typically specified in the parsed syntax elements.
When the CU sub-partition flag is set (decision block 505: Yes), it is determined in decision block 507 whether the size of the current CU is 4x8 (or 8x4).
When the size of the current CU is 4x8 (or 8x4) (decision block 507: Yes), the current CU is divided into two 4x4 sub-partitions.
When the size of the current CU is not 4x8 (or 8x4) (decision block 507: No), the current CU is divided into 4 sub-partitions. As described above for the encoding method of fig. 6, the principle of this partitioning is that each sub-partition contains at least 16 pixels (or samples). In one embodiment, the manner of sub-division may be indicated by the sub-division flag. In another embodiment, the manner of sub-division may be predefined.
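The decoder-side sub-partition count can be sketched with the hypothetical helper below. It mirrors decision blocks 505 and 507 only; orientation signaling and the actual partition geometry are omitted.

```python
def decode_sub_partition_count(flag_set, width, height):
    """Number of sub-partitions the decoder reconstructs for the current CU,
    given the parsed sub-division flag and the CU size (1 = no sub-division)."""
    if not flag_set:
        return 1                               # flag not set: decode the CU whole
    if (width, height) in ((4, 8), (8, 4)):
        return 2                               # two 4x4 sub-partitions
    return 4                                   # otherwise four sub-partitions
```

No texture complexity is recomputed at the decoder; the flag alone carries the encoder's decision.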
In block 517, the current CU is intra-prediction decoded based on its sub-partitions. As described above with respect to the encoding method of fig. 6, each sub-partition is processed similarly to an intra prediction block in HEVC or VVC: the encoder first generates an intra prediction signal and a residual signal, and the residual signal is transformed, quantized, entropy-coded, and sent to the decoding end. The decoding end entropy-decodes the received residual signal, performs inverse quantization and inverse transformation, and adds the intra prediction signal to obtain reconstructed samples. After each sub-partition is processed, its reconstructed samples can be used to calculate the prediction signal of the next sub-partition, and the same steps are repeated until all sub-partitions are decoded; all sub-partitions of a coding block (CU) share the same intra prediction mode. As shown in fig. 4, a block is divided horizontally into four sub-partitions; the first sub-partition uses neighboring CU samples as reference samples, its reconstructed samples can be used to predict the second sub-partition, and this process continues until all four sub-partitions are processed. This process is similar to the ISP of VVC and is therefore not described in detail.
In a particular embodiment, the sub-division flag is associated with the texture complexity of the current CU, as described above for the encoding method of fig. 6. More specifically, the setting of the CU sub-division flag is associated with a comparison of the texture complexity of the current CU to a threshold.
An apparatus usable for video coding is shown in fig. 7. The apparatus comprises a processor and a memory, the memory including processor-executable code for implementing the various methods of the present invention.
According to another aspect, the present disclosure may also relate to an encoder for implementing the above-described encoding method. The encoder may be dedicated hardware.
According to another aspect, the disclosure may also relate to a corresponding decoder for decoding an encoded video stream.
According to another aspect, the present disclosure may also relate to a video codec for the above-described encoding method or decoding method.
According to another aspect, the present disclosure may also relate to a computer program product for performing the methods described herein. According to a further aspect, the computer program product has a non-transitory storage medium having stored thereon computer code/instructions that, when executed by a processor, may implement the various operations described herein.
Although the above discussion mainly concerns VVC, those skilled in the art will readily understand that the present invention can obviously be applied to other video coding standards, as long as those standards contain the general elements (e.g., CU) and their equivalents referred to in the claims.
When implemented in hardware, the video encoder may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. Additionally, at least one processor may include one or more modules operable to perform one or more of the steps and/or operations described above.
When implemented in hardware, the video encoder or the device containing the video codec may be a System On Chip (SOC).
When the video encoder is implemented in hardware circuitry, such as an ASIC, FPGA, or the like, it may include various circuit blocks configured to perform various functions. Those skilled in the art can design and implement these circuits in various ways to achieve the various functions disclosed herein, depending on various constraints imposed on the overall system.
While the foregoing disclosure discusses illustrative aspects and/or embodiments, it should be noted that many changes and modifications could be made herein without departing from the scope of the described aspects and/or embodiments as defined by the appended claims. Furthermore, although elements of the described aspects and/or embodiments may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated. Additionally, all or a portion of any aspect and/or embodiment may be utilized with all or a portion of any other aspect and/or embodiment, unless stated to the contrary.

Claims (10)

1. A method for decoding a video encoded bitstream, comprising:
reading a CU subdivision identification of the current CU from the bitstream;
when the CU subdivision identification is set:
if the size of the current CU is 4x8 (or 8x4), dividing the current CU into two sub-divisions of 4x 4; or
If the size of the current CU is greater than 4x8 (or 8x4) and less than or equal to 64x64, dividing the current CU into 4 sub-partitions, wherein each sub-partition includes at least 16 pixels; or
When the CU subdivision identification is not set, not further subdividing the current CU.
2. The method of claim 1, wherein the setting of the CU sub-partition identification is associated with a texture complexity of the current CU.
3. The method of claim 2, wherein the setting of the CU sub-partition identification is associated with a comparison of texture complexity of the current CU to a threshold.
4. A method of Coding Unit (CU) sub-division based on texture complexity of a CU, the method comprising:
calculating a texture complexity value of the current CU before performing brightness intra-frame prediction on the current CU;
if the texture complexity value is greater than a threshold value:
if the size of the current CU is 4x8 (or 8x4), dividing the current CU into two sub-divisions of 4x 4; or
If the size of the current CU is greater than 4x8 (or 8x4) and less than or equal to 64x64, dividing the current CU into 4 sub-partitions, wherein each sub-partition includes at least 16 pixels; or
If the size of the current CU is larger than 64x64, not further sub-dividing the current CU; or
If the texture complexity value is less than or equal to the threshold, not further sub-partitioning the current CU, wherein if it is determined that the current CU is further sub-partitioned, a CU sub-partition flag for the current CU is set.
5. The method of claim 4, wherein the CU sub-partition flag indicates a horizontal sub-partition or a vertical sub-partition.
6. The method of claim 4, wherein the threshold is 400.
7. A video codec capable of CU sub-division based on texture complexity of a Coding Unit (CU), the video codec performing the decoding operation of any one of claims 1-3 on an input video stream or performing the encoding operation of any one of claims 4-6 on an input encoded video stream.
8. A computing device capable of performing video coding, comprising:
a processor; and
a non-volatile memory coupled to the processor, the non-volatile memory storing instructions or program code which, when executed by the processor, is capable of carrying out a decoding operation according to any one of claims 1-3 or an encoding operation according to any one of claims 4-6 on an incoming encoded video stream.
9. The computing device of claim 8, wherein the computing device is a system on a chip (SOC).
10. A computer program product for performing the method of any one of claims 1-6.
CN202010578401.0A 2020-06-23 2020-06-23 CU subdivision based on CU texture complexity Pending CN111885383A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010578401.0A CN111885383A (en) 2020-06-23 2020-06-23 CU subdivision based on CU texture complexity

Publications (1)

Publication Number Publication Date
CN111885383A true CN111885383A (en) 2020-11-03


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106961606A (en) * 2017-01-26 2017-07-18 浙江工业大学 The HEVC intra-frame encoding mode systems of selection of feature are divided based on texture
CN108881908A (en) * 2018-05-28 2018-11-23 北方工业大学 Quick partitioning based on coding unit Texture complication in Video coding
CN111147867A (en) * 2019-12-18 2020-05-12 重庆邮电大学 Multifunctional video coding CU partition rapid decision-making method and storage medium

Non-Patent Citations (2)

Title
LIANG ZHAO et al.: "Non-CE3: ISP simplifications", JVET meeting document *
SANTIAGO DE-LUXÁN-HERNÁNDEZ et al.: "Non-CE3: Restriction of the maximum CU size for ISP to 64×64", JVET meeting document *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination