CN111263156B

CN111263156B - Video decoding method, video encoding method and device

Info

Publication number: CN111263156B
Application number: CN202010105200.9A
Authority: CN
Inventors: 马宗全; 修晓宇; 陈漪纹; 王祥林
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2019-02-20
Filing date: 2020-02-20
Publication date: 2022-03-25
Anticipated expiration: 2040-02-20
Also published as: CN111263156A

Abstract

The disclosure relates to a video decoding method, a video encoding method and a device, and belongs to the technical field of video encoding and compression. The video decoding method comprises: receiving block size information and target prediction mode information of a coding block; determining the sub-block size of the coding block in each division direction according to the block size information and configured sub-block division information for that block size; determining, for each division direction, whether to disable the ISP mode for the coding block according to the sub-block size in that direction and a configured set of sub-block sizes for which the ISP mode is disabled; determining the reconstruction size of the coding block according to whether the ISP mode is disabled in each division direction; and reconstructing the coding block according to the reconstruction size and the target prediction mode information. The sizes in the set of sub-block sizes are newly introduced sizes, and the ISP mode is disabled in the corresponding division direction for coding blocks that would be divided into sub-blocks of these sizes, which alleviates the reconstruction delay that the ISP mode introduces for coding blocks.

Description

Video decoding method, video encoding method and device

This application claims priority to U.S. Patent Application No. 62/808,257, entitled "Methods and Devices for intra sub-partition coding mode", filed with the U.S. Patent Office on February 20, 2019, the entire contents of which are incorporated herein by reference.

Technical Field

The present disclosure relates to the field of video encoding and compression technologies, and in particular, to a video decoding method, a video encoding method, and an apparatus.

Background

In the field of video coding and compression, video data may be compressed using a variety of video coding techniques, and video coding may be performed according to one or more video coding standards, including Versatile Video Coding (VVC), the Joint Exploration Test Model (JEM), High Efficiency Video Coding (H.265/HEVC), Advanced Video Coding (H.264/AVC), Moving Picture Experts Group (MPEG) coding, and the like. An important goal of video coding techniques is to reduce the bit rate of the compressed video as much as possible while minimizing the loss of video quality.

The first version of the HEVC standard, finalized in October 2013, saves about 50% of the bit rate compared with the previous-generation video coding standard H.264/MPEG AVC while providing equivalent perceptual quality, but there is evidence that higher coding efficiency can be achieved with additional coding tools. With a view to future video coding standardization, the Video Coding Experts Group (VCEG) and MPEG both began exploring new coding techniques. ITU-T VCEG and ISO/IEC MPEG established the Joint Video Exploration Team (JVET) in October 2015 and began significant research into advanced technologies that can greatly improve coding efficiency. JVET maintains reference software called the Joint Exploration Model (JEM), which integrates a number of additional coding tools on top of the HEVC Test Model (HM).

In October 2017, ITU-T and ISO/IEC issued a joint Call for Proposals (CfP) on video compression with efficiency exceeding that of HEVC. In April 2018, 23 CfP responses were evaluated at the tenth JVET meeting, and the results showed a compression efficiency about 40% higher than that of HEVC. On this basis, JVET launched a new project to develop the next-generation video coding standard, VVC, and in the same month established a reference software code base called the VVC Test Model (VTM) to demonstrate a reference implementation of the VVC standard.

Like HEVC, VVC is built on the block-based hybrid video coding framework. Fig. 1 shows a block diagram of a generic video encoder that processes the incoming video data block by block; these blocks are called Coding Units (CUs). In VTM-1.0, a CU can be up to 128 × 128 pixels. However, unlike HEVC, which partitions blocks using quadtrees only, VVC partitions a Coding Tree Unit (CTU) into CUs using quadtrees, binary trees and ternary trees in order to adapt to local features of the image. In addition, VVC removes the concept of multiple partition unit types in HEVC: there is no longer a separation of CUs, Prediction Units (PUs) and Transform Units (TUs), and each CU is always used as the basic unit for both prediction and transform without further partitioning. In the multi-type tree structure, a CTU is first divided based on the quadtree structure, and each quadtree leaf node is then further divided based on the binary tree and ternary tree structures. As shown in fig. 2, there are five partition types: quaternary partitioning, horizontal binary partitioning, vertical binary partitioning, horizontal ternary partitioning and vertical ternary partitioning, where W denotes the width of the CTU and H denotes the height of the CTU. In fig. 1, spatial prediction and/or temporal prediction may be performed. Spatial prediction (i.e., intra prediction) predicts the current video block using samples of already encoded neighboring blocks in the same video frame/slice (referred to as reference samples), to reduce the spatial redundancy inherent in the video frame; temporal prediction (i.e., inter prediction or motion compensated prediction) predicts the current video block using reconstructed pixels from already encoded video frames, to reduce the temporal redundancy inherent in the video.
In particular, the temporal prediction signal of a CU is typically signaled by one or more Motion Vectors (MVs), which indicate the amount and direction of motion between the current CU and its temporal reference block; if there are multiple reference pictures, a reference picture index is additionally sent to identify which reference picture in the reference picture store the temporal prediction signal comes from. After spatial prediction and/or temporal prediction, a mode decision module in the encoder may select the optimal prediction mode according to a rate-distortion optimization method. The prediction block obtained using the optimal prediction mode is then subtracted from the current CU to obtain a prediction residual; the prediction residual is decorrelated by a transform unit and quantized by a quantization unit; the quantized residual coefficients are inverse quantized and inverse transformed to form a reconstructed residual; and the reconstructed residual is added back to the prediction block to form the reconstructed signal of the current CU. Before the reconstructed CU is placed in the reference picture store, loop filters such as a deblocking filter, a Sample Adaptive Offset (SAO) filter and an adaptive in-loop filter (ALF) may also be applied to it. Finally, the coding mode (inter or intra), the prediction mode information, the motion information and the quantized residual coefficients are sent to an entropy coding unit for further compression and packing to form the final video bitstream.
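
The residual and reconstruction steps described above can be illustrated with a minimal sketch. It assumes a flat uniform quantizer with a hypothetical step size `qstep` and omits the transform, entropy coding and loop filters; the function names are illustrative only and are not taken from any codec.

```python
def encode_block(cur, pred, qstep):
    """Return quantized residual levels for one block (lists of ints)."""
    # Prediction residual: current samples minus predicted samples.
    residual = [c - p for c, p in zip(cur, pred)]
    # Assumed uniform scalar quantizer (a real encoder transforms first).
    return [int(round(r / qstep)) for r in residual]


def reconstruct_block(levels, pred, qstep):
    """Rebuild the block from quantized levels and the prediction."""
    # Inverse quantization gives the reconstructed residual.
    recon_residual = [lv * qstep for lv in levels]
    # Adding it back to the prediction gives the reconstructed signal.
    return [p + r for p, r in zip(pred, recon_residual)]


cur = [104, 97, 101, 120]
pred = [100, 100, 100, 100]
levels = encode_block(cur, pred, qstep=4)
recon = reconstruct_block(levels, pred, qstep=4)
```

Because the encoder runs the same reconstruction as the decoder, both sides predict later blocks from identical reference samples even though quantization is lossy.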

Fig. 3 shows a block diagram of a generic video decoder. First, a video bitstream is entropy decoded in an entropy decoding unit, prediction information is transmitted to a spatial prediction unit to form a prediction block if the coding mode is intra-coding, prediction information is transmitted to a temporal prediction unit to form a prediction block if the coding mode is inter-coding, residual transform coefficients are transmitted to an inverse quantization unit and an inverse transformation unit to reconstruct a residual block, and the reconstructed block is obtained by adding the prediction block and the residual block. The reconstructed block may also be loop filtered before it is stored in the reference picture store. The reconstructed video in the reference picture store may be used to drive a display device or to predict video blocks.

In general, the basic intra prediction scheme applied in VVC remains the same as that of HEVC, except that several modules are further extended and/or improved, such as the Intra Sub-Partition (ISP) coding mode, extended intra prediction with wide-angle intra directions, Position-Dependent intra Prediction Combination (PDPC), and 4-tap intra interpolation.

In the related art, when the ISP mode is used to reconstruct a coding block, the current sub-block cannot be reconstructed until reconstruction of its adjacent sub-block is complete, so the reconstruction delay is pronounced and the coding efficiency is low.

Disclosure of Invention

The present disclosure provides a video decoding method, a video encoding method and a device, so as to at least solve the problems in the related art of pronounced reconstruction delay and relatively low coding efficiency when the ISP mode is used to reconstruct a coding block. The technical solutions of the disclosure are as follows:

According to a first aspect of the embodiments of the present disclosure, there is provided a video decoding method, including:

receiving block size information and target prediction mode information of a coding block;

determining the sub-block size of the coding block in each division direction according to the block size information and configured sub-block division information for that block size, wherein the sub-block division information comprises at least two division directions and the number of sub-blocks in each division direction;

determining, for each division direction, whether to disable the ISP mode for the coding block in that direction according to the sub-block size of the coding block in that direction and a configured set of sub-block sizes for which the ISP mode is disabled, wherein the sizes in the set of sub-block sizes are non-standard sizes;

determining the reconstruction size of the coding block according to the information of whether the ISP mode is forbidden to the coding block in each dividing direction;

and reconstructing the coding block according to the reconstruction size and the target prediction mode information.

In one possible embodiment, the set of sub-block sizes is any combination of the following sub-block sizes:

1×N；N×1；2×N；N×2；

wherein, when the sub-block size is 1×N or N×1, N is 16, 32 or 64; when the sub-block size is 2×N or N×2, N is 8, 16, 32 or 64.
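
As an illustration, the sub-block sizes listed above can be enumerated and used to test each division direction. This is a minimal sketch: it assumes that the full combination of sizes is configured, and `num_parts`, the function names, the direction labels and the (width, height) tuple convention are hypothetical.

```python
def disabled_subblock_sizes():
    """Enumerate (width, height) pairs for which the ISP mode is disabled."""
    sizes = set()
    for n in (16, 32, 64):          # 1xN and Nx1
        sizes.update({(1, n), (n, 1)})
    for n in (8, 16, 32, 64):       # 2xN and Nx2
        sizes.update({(2, n), (n, 2)})
    return sizes


def subblock_size(width, height, direction, num_parts):
    """Sub-block size when a block is split into num_parts equal parts."""
    if direction == "horizontal":   # horizontal split: height is divided
        return (width, height // num_parts)
    return (width // num_parts, height)  # vertical split: width is divided


def isp_disabled(width, height, direction, num_parts):
    """True if the resulting sub-block size is in the disabled set."""
    size = subblock_size(width, height, direction, num_parts)
    return size in disabled_subblock_sizes()
```

For example, under these assumptions a 4×16 block split vertically into four parts would yield 1×16 sub-blocks, so the ISP mode would be disabled in the vertical direction, whereas a horizontal split yields 4×4 sub-blocks and remains allowed.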

In a possible implementation manner, determining a reconstruction size of the coding block according to information on whether to disable an ISP mode for the coding block in each dividing direction includes:

if the ISP mode is forbidden to the coding block in all the dividing directions, determining the reconstruction size as the block size of the coding block;

if the ISP mode is disabled for the coding block in only some of the division directions, determining the reconstruction size according to a received ISP identification of the coding block, wherein the ISP identification indicates whether the coding block is reconstructed using the ISP mode, and is sent by the encoder after the block size information of the coding block.

In one possible implementation, determining the reconstruction size according to the received ISP identifier of the coding block includes:

if the ISP identification indicates that an ISP mode is not adopted to reconstruct the coding block, determining the reconstruction size as the block size of the coding block;

and if the ISP identification indicates that the coding block is reconstructed by adopting an ISP mode, determining the reconstruction size according to the number of the division directions of the coding block which does not forbid the ISP mode.

In a possible implementation, determining the reconstruction size according to the number of dividing directions in which the coding block does not disable the ISP mode includes:

if the ISP mode is not disabled for the coding block in only one division direction, determining the reconstruction size as the sub-block size of the coding block in that division direction;

if the ISP mode is not disabled for the coding block in at least two division directions, determining a target division direction of the coding block according to received division direction indication information of the coding block, and determining the reconstruction size as the sub-block size of the coding block in the target division direction, wherein the division direction indication information is sent by the encoder after the ISP identification of the coding block.
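
The decoder-side decision described in the preceding embodiments can be sketched as follows. This is an illustrative reading rather than a definitive implementation: the function name, the dictionary layout and the direction labels are hypothetical, and the sketch assumes the ISP identification is read whenever at least one division direction remains enabled.

```python
def reconstruction_size(block_size, enabled_sub_sizes, isp_flag=None, direction=None):
    """Pick the size at which the coding block is reconstructed.

    block_size        -- (width, height) of the coding block
    enabled_sub_sizes -- dict mapping each division direction in which the
                         ISP mode is NOT disabled to its sub-block size
    isp_flag          -- parsed ISP identification (used only if at least
                         one direction is still enabled)
    direction         -- parsed division direction indication (used only
                         if at least two directions are enabled)
    """
    if not enabled_sub_sizes:            # ISP disabled in every direction
        return block_size
    if not isp_flag:                     # ISP allowed but not used
        return block_size
    if len(enabled_sub_sizes) == 1:      # direction is implied, not signaled
        return next(iter(enabled_sub_sizes.values()))
    return enabled_sub_sizes[direction]  # direction signaled after the flag
```

Note that the direction indication is consulted only in the last branch, which mirrors the order in which the encoder sends the block size, the ISP identification and the direction indication.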

According to a second aspect of the embodiments of the present disclosure, there is provided a video encoding method, including:

acquiring a video sequence;

determining the subblock size of a coding block in each dividing direction according to the block size information of the coding block in the video sequence and the configured subblock dividing information of the block size information, wherein the subblock dividing information comprises at least two dividing directions and subblock dividing quantity in each dividing direction;

determining target prediction mode information of the coding block according to the reconstruction size and each prediction mode corresponding to the configured reconstruction size;

and sending the block size information and the target prediction mode information of the coding block.

1×N；N×1；2×N；N×2；

and if the ISP mode is disabled for the coding block in only some of the division directions, determining the reconstruction sizes as the block size of the coding block and the sub-block size of the coding block in each division direction in which the ISP mode is not disabled.

In a possible implementation, if the ISP mode is disabled for the coding block in only some of the division directions, the method further includes:

determining whether to adopt an ISP mode to reconstruct the coding block according to the reconstruction size corresponding to the target prediction mode information;

and after the block size information of the coding block is sent, sending an ISP identification of the coding block, wherein the ISP identification is used for indicating whether an ISP mode is adopted to reconstruct the coding block.

In a possible implementation manner, if the ISP identifier indicates that the coding block is reconstructed in the ISP mode, the method further includes:

if the coding block does not forbid the ISP mode in at least two dividing directions, after the ISP identification of the coding block is sent, the dividing direction indication information of the coding block is sent, and the dividing direction indication information is used for indicating the target dividing direction corresponding to the target prediction mode information.
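
The signaling order on the encoder side can be sketched as below. The element names (`block_size`, `isp_flag`, `isp_direction`) and the function are hypothetical, the target prediction mode information is left out for brevity, and the omission of the ISP identification when every direction is disabled is inferred from the decoder-side description rather than stated for the encoder.

```python
def isp_syntax_elements(block_size, enabled_directions, use_isp, target_direction=None):
    """List the (name, value) pairs sent for one coding block, in order."""
    elements = [("block_size", block_size)]
    if not enabled_directions:
        # ISP disabled in every direction: nothing further to signal.
        return elements
    # The ISP identification follows the block size information.
    elements.append(("isp_flag", int(use_isp)))
    if use_isp and len(enabled_directions) >= 2:
        # The direction indication follows the ISP identification.
        elements.append(("isp_direction", target_direction))
    return elements
```

When only one division direction remains enabled, the direction indication is not sent, because the decoder can infer the direction from the block size alone.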

According to a third aspect of the embodiments of the present disclosure, there is provided a video decoding apparatus comprising:

a receiving module configured to perform receiving block size information and target prediction mode information of an encoded block;

a sub-block size determination module configured to determine the sub-block size of the coding block in each division direction according to the block size information and configured sub-block division information for that block size, the sub-block division information including at least two division directions and the number of sub-blocks in each division direction;

a judging module configured to execute determining whether to disable an ISP mode for the coding block in each dividing direction according to the subblock size of the coding block in each dividing direction and a configured set of subblock sizes of the disabled ISP mode, wherein the sizes in the set of subblock sizes are non-standard sizes;

a reconstruction size determination module configured to perform determination of a reconstruction size of the coding block according to information on whether to disable an ISP mode for the coding block in each division direction;

a reconstruction module configured to perform reconstructing the encoded block according to the reconstruction size and the target prediction mode information.

1×N；N×1；2×N；N×2；

In a possible implementation, the reconstruction size determination module is specifically configured to perform:

According to a fourth aspect of the embodiments of the present disclosure, there is provided a video encoding apparatus comprising:

an acquisition module configured to perform acquiring a video sequence;

a subblock size determining module configured to perform a subblock size determination of a coding block in each division direction according to block size information of the coding block in the video sequence and subblock division information of the configured block size information, the subblock division information including at least two division directions and a subblock division number in each division direction;

a reconstruction module configured to determine target prediction mode information of the coding block according to the reconstruction size and the configured prediction modes corresponding to the reconstruction size;

a transmitting module configured to perform transmitting block size information and target prediction mode information of the encoded block.

1×N；N×1；2×N；N×2；

In one possible implementation, the apparatus further includes an ISP identification determination module:

the ISP identification determination module is configured to determine, if the ISP mode is disabled for the coding block in only some of the division directions, whether to reconstruct the coding block using the ISP mode according to the reconstruction size corresponding to the target prediction mode information;

the sending module is further configured to send, after sending the block size information of the coding block, an ISP identifier of the coding block, where the ISP identifier is used to indicate whether to reconstruct the coding block in an ISP mode.

In a possible implementation, if the ISP identifier indicates that the coding block is reconstructed in the ISP mode, the sending module is further configured to perform:

According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic apparatus including: at least one processor, and a memory communicatively coupled to the at least one processor, wherein:

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any of the video decoding or video encoding methods described above.

According to a sixth aspect of embodiments of the present disclosure, there is provided a storage medium, wherein when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is capable of executing any one of the video decoding or video encoding methods described above.

According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program product which, when invoked by a computer, may cause the computer to perform any of the video decoding or video encoding methods described above.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

Block size information and target prediction mode information of a coding block are received; the sub-block size of the coding block in each division direction is determined according to the block size information and the configured sub-block division information for that block size; whether the ISP mode is disabled for the coding block in each division direction is determined according to the sub-block size in that direction and a configured set of sub-block sizes for which the ISP mode is disabled; the reconstruction size of the coding block is determined according to whether the ISP mode is disabled in each division direction; and the coding block is reconstructed according to the reconstruction size and the target prediction mode information. The sizes in the set of sub-block sizes are non-standard sizes, that is, newly introduced sizes, and the ISP mode is disabled in the corresponding division direction for coding blocks that would be divided into sub-blocks of these sizes. This alleviates the reconstruction delay of the ISP mode and improves the coding efficiency of the ISP mode for coding blocks. In addition, no extra reconstruction circuitry needs to be added for sub-blocks of these sizes, so fewer modifications to the decoder and encoder are required and the hardware cost is lower.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

Fig. 1 is a block diagram illustrating an encoder in accordance with an exemplary embodiment.

Fig. 2 is a schematic diagram illustrating five segmentation types according to an exemplary embodiment.

Fig. 3 is a block diagram illustrating a decoder according to an example embodiment.

Fig. 4 is a diagram illustrating an intra mode according to an example embodiment.

Fig. 5 is a schematic diagram illustrating a positional relationship between a coding block and a reference row according to an example embodiment.

Fig. 6a is a schematic diagram illustrating a positional relationship between a predicted sample and a reference sample according to an exemplary embodiment.

Fig. 6b is a schematic diagram illustrating a positional relationship between a further predicted sample and a reference sample according to an example embodiment.

Fig. 6c is a schematic diagram illustrating a positional relationship between a further prediction sample and a reference sample according to an example embodiment.

Fig. 7 is a schematic diagram illustrating a positional relationship between still another predicted sample and a reference sample according to an example embodiment.

Fig. 8a is a diagram illustrating a division of an encoded block in accordance with an example embodiment.

Fig. 8b is a diagram illustrating yet another partitioning of an encoded block in accordance with an example embodiment.

Fig. 8c is a diagram illustrating yet another division of an encoded block according to an example embodiment.

Fig. 9a is a schematic diagram illustrating a relationship between an effective intra direction of a conventional W × H encoded block and an orientation of a reference sample according to an example embodiment.

Fig. 9b is a schematic diagram illustrating the relationship between the effective intra direction of the sub-block of W × H and the orientation of the reference sample according to an exemplary embodiment.

Fig. 10 is a diagram illustrating a relationship between an effective intra direction of a sub-block and an orientation of a reference sample when a wide-angle intra prediction mode is enabled/disabled based on a size of the sub-block, according to an example embodiment.

Fig. 11 is a schematic diagram illustrating a positional relationship between one type of unavailable reference sample and an alternative reference sample in accordance with an exemplary embodiment.

Fig. 12 is a schematic diagram illustrating a positional relationship between still another unavailable reference sample and an alternative reference sample in accordance with an exemplary embodiment.

Fig. 13 is a schematic diagram illustrating a positional relationship between still another unavailable reference sample and an alternative reference sample in accordance with an exemplary embodiment.

Fig. 14 is a schematic diagram illustrating a positional relationship between yet another unavailable reference sample and an alternative reference sample in accordance with an exemplary embodiment.

Fig. 15 is a flow chart illustrating a method of video decoding according to an example embodiment.

Fig. 16 is a flow chart illustrating a method of video encoding according to an example embodiment.

Fig. 17 is a block diagram illustrating a video decoding apparatus according to an example embodiment.

Fig. 18 is a block diagram illustrating a video encoding apparatus according to an example embodiment.

Fig. 19 is a schematic structural diagram illustrating an electronic device for implementing a video decoding or video encoding method according to an exemplary embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

The disclosed embodiments are primarily intended to improve existing ISP designs in VVC standards. In the following, a brief review of other coding tools in VVC, such as tools in intra prediction and transform coding, closely related to the techniques proposed in embodiments of the present disclosure, is first made.

Intra prediction mode with wide angle intra direction.

As in HEVC, VVC predicts the samples of the current CU using a set of decoded samples adjacent to the current CU (above or to the left). To capture the finer edge directions present in natural video (especially high-resolution content, e.g., 4K), the number of angular intra modes is extended from 33 in HEVC to 93 in VVC. The planar mode of HEVC (which assumes that the slope of the surface changes gradually from the boundaries in both the horizontal and vertical directions) and the Direct Current (DC) mode (which assumes a flat surface) are also retained in the VVC standard. Fig. 4 shows the intra modes defined in the VVC standard. Similar to intra prediction in HEVC, all intra modes in VVC (i.e., planar, DC and the angular directions) use a set of neighboring reconstructed samples above and to the left of the current coding block as reference samples for intra prediction. Fig. 5 shows the positional relationship between the current coding block and the reference samples. In HEVC, only the sample row/column closest to the current coding block (i.e., row 0 in fig. 5) is used as reference samples, whereas VVC introduces Multiple Reference Lines (MRL), in which two additional rows/columns (i.e., row 1 and row 3 in fig. 5) may also be used as reference samples, and the encoder may send the index of the selected reference row/column to the decoder.

Figs. 6a to 6c show the positional relationship between prediction samples and reference samples in VVC. In addition to square coding blocks, rectangular coding blocks exist in VVC because quadtree/binary tree/ternary tree partitioning is applied. Because the width and height of a rectangular coding block are unequal, different sets of angular directions are selected for different block shapes, which is also referred to as wide-angle intra prediction. Specifically, for both square and rectangular coding blocks, each block shape supports 65 of the 93 angular directions in addition to the planar mode and the DC mode, as shown in Table 1. This design not only captures the directional structures commonly present in video efficiently (by adaptively selecting the angular directions based on the block shape), but also ensures that every coding block can use 67 intra modes (i.e., planar, DC and 65 angular modes). It thus provides a consistent intra mode signaling design for coding blocks of different sizes, so intra modes are signaled efficiently.

Table 1. Intra prediction angular directions selected for coding blocks of different shapes in VVC

Position-dependent intra prediction combination (PDPC).

In practical applications, intra prediction samples may be generated from either an unfiltered or a filtered set of neighboring reference samples, which may introduce discontinuities along the boundaries between the current coding block and its neighboring blocks. To address this problem, HEVC employs boundary filtering. In particular, the boundary filtering combines the first row/column of predicted samples of the DC, horizontal (i.e., mode 18) and vertical (i.e., mode 50) prediction modes with unfiltered reference samples using a 2-tap filter or a gradient-based smoothing filter, where the 2-tap filter is generally applied for the DC mode and the gradient-based smoothing filter is generally applied for the horizontal and vertical prediction modes.

The PDPC tool in VVC extends the above idea by employing a weighted combination of intra-predicted samples and unfiltered reference samples. In the current VVC working draft, PDPC is enabled without signaling for the following intra modes: planar, DC, horizontal (i.e., mode 18), vertical (i.e., mode 50), angular directions close to the lower-left diagonal direction (i.e., intra modes 2, 3, 4, …, 10), and angular directions close to the upper-right diagonal direction (i.e., intra modes 58, 59, 60, …, 66).

Referring to fig. 7, assuming that the prediction sample at coordinates (x, y) is pred (x, y), its corresponding value after PDPC is performed is:

$$\mathrm{pred}(x,y) = \bigl(wL \times R_{-1,y} + wT \times R_{x,-1} - wTL \times R_{-1,-1} + (64 - wL - wT + wTL) \times \mathrm{pred}(x,y) + 32\bigr) \gg 6 \tag{1}$$

where $R_{-1,y}$ denotes the reference sample to the left of the current sample position (x, y) and wL is the weight of $R_{-1,y}$; $R_{x,-1}$ denotes the reference sample above the current sample position (x, y) and wT is the weight of $R_{x,-1}$; and $R_{-1,-1}$ denotes the reference sample at the top-left corner of the current coding block and wTL is the weight of $R_{-1,-1}$.

In particular, the weights wL, wT, and wTL in equation (1) may be adaptively selected according to the prediction mode and the sampling position, as follows:

For the DC mode:

wT = 32 >> ((y << 1) >> shift);

wL = 32 >> ((x << 1) >> shift);

wTL = (wL >> 4) + (wT >> 4); (2)

For the planar mode:

wT = 32 >> ((y << 1) >> shift);

wL = 32 >> ((x << 1) >> shift);

wTL = 0; (3)

For the horizontal mode:

wT = 32 >> ((y << 1) >> shift);

wL = 32 >> ((x << 1) >> shift);

wTL = wT; (4)

For the vertical mode:

wT = 32 >> ((y << 1) >> shift);

wL = 32 >> ((x << 1) >> shift);

wTL = wL; (5)

For the bottom-left diagonal directions:

wT = 16 >> ((y << 1) >> shift);

wL = 16 >> ((x << 1) >> shift);

wTL = 0; (6)

For the top-right diagonal directions:

wT = 16 >> ((y << 1) >> shift);

wL = 16 >> ((x << 1) >> shift);

wTL = 0; (7)

where shift = (log2(W) - 2 + log2(H) - 2 + 2) >> 2, W is the width of the current coding block, and H is the height of the current coding block.
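As an illustration of equations (1) to (7), the following minimal Python sketch computes the PDPC weights and the filtered sample for one position. The function names and the string labels used for the modes ("dc", "planar", "horizontal", "vertical", "bottom_left", "top_right") are assumptions made for this example and do not come from any codec implementation; block dimensions are assumed to be powers of two.

```python
def pdpc_shift(width, height):
    # shift = (log2(W) - 2 + log2(H) - 2 + 2) >> 2, for power-of-two W and H
    log2_w = width.bit_length() - 1
    log2_h = height.bit_length() - 1
    return (log2_w - 2 + log2_h - 2 + 2) >> 2


def pdpc_weights(mode, x, y, width, height):
    """Return (wL, wT, wTL) following equations (2) to (7).

    `mode` is a hypothetical label, not a real intra mode index.
    """
    shift = pdpc_shift(width, height)
    # The diagonal cases start from 16, all other cases from 32.
    base = 16 if mode in ("bottom_left", "top_right") else 32
    w_t = base >> ((y << 1) >> shift)
    w_l = base >> ((x << 1) >> shift)
    if mode == "dc":
        w_tl = (w_l >> 4) + (w_t >> 4)
    elif mode == "horizontal":
        w_tl = w_t
    elif mode == "vertical":
        w_tl = w_l
    else:  # planar and both diagonal cases
        w_tl = 0
    return w_l, w_t, w_tl


def pdpc_sample(pred, r_left, r_top, r_topleft, w_l, w_t, w_tl):
    # Equation (1): the four weights always sum to 64, hence the >> 6.
    return (w_l * r_left + w_t * r_top - w_tl * r_topleft
            + (64 - w_l - w_t + w_tl) * pred + 32) >> 6
```

Because the weights in equation (1) sum to 64, a flat region (prediction and all reference samples equal) is left unchanged by the filter, which is a convenient sanity check for any implementation.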

Multiple transform selection and shape adaptive transform selection.

To enable the Multiple Transform Selection (MTS) tool, VVC introduces DCT-VIII, DST-IV and DST-VII in addition to DCT-II in HEVC. In VVC, the transform type is adaptively selected at the coding block level by adding an MTS flag in the bitstream. Specifically, when the MTS flag of a block is equal to 0, a pair of fixed transforms (e.g., DCT-II) are applied in the horizontal direction and the vertical direction; when the MTS flag is equal to 1, two additional flags will be sent for the coding block to indicate the transform type for each direction, where the transform type for each direction is DCT-VIII or DST-VII.

On the other hand, since a block division structure based on a quadtree/binary tree/ternary tree is introduced in the VVC, the residual distribution of intra prediction is highly correlated with the block shape. When the MTS is disabled (i.e., the MTS flag of a coding block is equal to 0), all coding blocks use a shape adaptive transform selection method in which the DCT-II and DST-VII are implicitly enabled based on the width and height of the current coding block. Specifically, for each rectangular coding block, DST-VII is used in the direction related to the short side of the coding block, and DCT-II is used in the direction related to the long side of the coding block; for each square coding block, DST-VII is used in both horizontal and vertical directions.

Furthermore, to avoid introducing new transform types in coding blocks of different shapes, DST-VII is only used when the shorter edge of one coding block is less than or equal to 16; otherwise, DCT-II is always used. Table 2 shows horizontal transforms and vertical transforms of encoded blocks enabled based on the shape adaptive transform selection method in VVC.

Table 2: Shape adaptive transform selection for coding blocks in VVC
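The shape adaptive rule described above can be sketched as follows. This is only an illustration of the prose rule, not a reproduction of Table 2: the function name is hypothetical, and the sketch assumes that the size limit of 16 is checked on the dimension to which DST-VII would be applied.

```python
def implicit_transforms(width, height, max_dst7_size=16):
    """Return (horizontal, vertical) transform names for an intra block
    whose MTS flag is 0, following the shape adaptive rule in the text."""

    def pick(size, other):
        # DST-VII goes to the shorter side (or to both sides of a square
        # block), and only if that side does not exceed the size limit.
        if size <= other and size <= max_dst7_size:
            return "DST-VII"
        return "DCT-II"

    # The horizontal transform acts along the width, the vertical one
    # along the height.
    return pick(width, height), pick(height, width)
```

For example, a 4 × 16 block would use DST-VII horizontally and DCT-II vertically under this sketch, while a 32 × 32 block would use DCT-II in both directions.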

Intra sub-partition (ISP) mode.

A conventional intra mode generates the prediction samples of a coding block using only the reference samples adjacent to that coding block. With this approach, the spatial correlation between a prediction sample and its reference samples decreases roughly in proportion to the distance between them. Therefore, the prediction quality of samples inside a coding block (especially the samples located in the lower right corner of the block) is usually worse than that of samples close to the block boundary. To further improve the quality of intra prediction, Short-Distance Intra Prediction (SDIP) was proposed and studied intensively during the development of the HEVC standard. In SDIP, one coding block is divided horizontally or vertically into a plurality of sub-blocks. Typically, a square block is divided into four sub-blocks. For example, an 8 × 8 block may be divided into four 2 × 8 or four 8 × 2 sub-blocks. One extreme case of such sub-block based intra prediction is the so-called row/column based prediction. For example, a W × H (width × height) coding block may be partitioned into H sub-blocks of size W × 1, or into W sub-blocks of size 1 × H. Each sub-block (row/column) is then coded in the same way as a conventional coding block (as shown in fig. 1), i.e., it is predicted by one of the available intra modes, and the prediction error is decorrelated by transform and quantization and then sent to the decoder. The reconstructed samples of one sub-block (e.g., row/column) are used as reference samples for the next sub-block, and the above process is repeated until all sub-blocks within the coding block are predicted. Furthermore, to reduce signaling overhead, all sub-blocks within one coding block may share the same intra mode.

With SDIP, different sub-block partitions may provide different coding efficiencies. In general, row-based prediction provides the best coding efficiency, since it gives the shortest prediction distance among the different partitions. However, row-based prediction also has the worst encoding/decoding throughput for codec hardware implementations. For example, compared with a coding block using 4 × 4 sub-blocks, the same coding block using 4 × 1 or 1 × 4 sub-blocks achieves only a quarter of the throughput.

Recently, VVC introduced a video coding tool called Intra Sub-Partition prediction (ISP). Conceptually, ISP is very similar to SDIP. Specifically, depending on the block size of the coding block, ISP may divide the coding block into 2 or 4 sub-blocks in the horizontal direction or in the vertical direction, and each sub-block contains at least 16 samples. Fig. 8a to 8c show all possible sub-block divisions for coding blocks of different sizes.
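The division rule above can be illustrated with a short Python sketch. It rests on two assumptions that are not stated explicitly in the text: the largest sub-block count (4, then 2) that still leaves at least 16 samples per sub-block is chosen, and a "vertical" division splits the width while a "horizontal" division splits the height. The function names are hypothetical.

```python
def isp_num_subblocks(width, height, min_samples=16):
    """Number of ISP sub-blocks (4 or 2), or 0 if the block is too small."""
    total = width * height
    for count in (4, 2):
        if total // count >= min_samples:
            return count
    return 0


def isp_subblock_size(width, height, direction):
    """Return the (width, height) of each sub-block, or None if ISP
    cannot be applied to the block."""
    count = isp_num_subblocks(width, height)
    if count == 0:
        return None
    if direction == "vertical":      # assumed: vertical division splits the width
        return (width // count, height)
    return (width, height // count)  # assumed: horizontal division splits the height
```

Under these assumptions a 4 × 4 block cannot use ISP, a 4 × 8 block is divided into two sub-blocks, and a 4 × 16 block divided vertically yields four 1 × 16 sub-blocks.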

In addition, the following main aspects are also included in the current ISP design to handle the interaction of ISP tools with other coding tools in the VVC:

1. Interaction with the wide-angle intra direction: ISP is combined with the wide-angle intra direction. In the current scheme, whether to use the normal intra direction or its corresponding wide-angle intra direction is determined based on the block size (e.g., width/height ratio) of the original coding block (i.e., the coding block before sub-block division).

2. Interaction with multiple reference lines (MRL): ISP cannot be enabled in conjunction with multiple reference lines. In the current VVC signaling design, the ISP enable/disable flag is signaled after the MRL index. When a coding block has a non-zero MRL index (i.e., the nearest neighboring samples are not used), the ISP flag is not signaled but is directly inferred to be 0, i.e., ISP mode is automatically disabled for the coding block. Here an ISP flag of 0 indicates that ISP is disabled and an ISP flag of 1 indicates that ISP is enabled.

3. Interaction with the most probable mode: similar to the conventional intra Mode, the intra Mode of one ISP coded block is transmitted through a Most Probable Mode (MPM) mechanism. However, compared to the conventional intra mode, the following modifications are made to the MPM method of the ISP:

1) each ISP coding block enables only the intra-modes contained in the MPM list and disables all other intra-modes not in the MPM list;

2) the MPM list of each ISP coding block excludes the DC mode, and prioritizes horizontal intra modes when the block is divided in the horizontal direction and vertical intra modes when it is divided in the vertical direction.

4. Interaction with multiple transform selection: ISP and MTS are mutually exclusive. When a coding block uses ISP, the MTS flag of the coding block is not signaled but is always inferred to be 0, i.e., MTS is disabled. However, rather than always using DCT-II, a fixed set of core transforms (including DST-VII and DCT-II) is implicitly applied to ISP coding blocks according to their block sizes. In particular, assuming that W and H are the width and height of one sub-block, the intra mode, horizontal transform, and vertical transform of the sub-block may be selected according to Table 3.

Table 3: Selection of intra mode, horizontal transform and vertical transform for sub-blocks

Although the ISP tool in VVC can improve intra prediction efficiency, its performance can be further improved, and some parts of the existing ISP design also need to be simplified to improve the hardware implementation efficiency of codecs. In particular, the following problems in the existing ISP design have been identified in the disclosed embodiments.

1. To minimize the cost of implementing ISP, a practical codec design should reuse the existing intra prediction modules of conventional coding blocks (e.g., reference sample access, intra sample prediction, etc.) for ISP coding blocks to the greatest extent possible. However, when ISP is used together with wide-angle intra prediction, whether the original intra mode (i.e., the intra mode signaled at the coding block level) should be replaced by the corresponding wide-angle intra mode for each sub-block is determined by the block size/shape of the original coding block rather than that of the actual block being predicted (i.e., the sub-block). This is inconsistent with conventional (non-ISP) coding blocks, which select between the conventional intra direction and the wide-angle intra direction based on their own block size. Such an inconsistent design may lead to the following complexity problems for hardware implementation.

First, the range of valid intra directions supported by a block of the same size differs between ISP mode and conventional (non-ISP) mode.

Second, the neighboring reference samples used for intra prediction of a block of the same size differ between ISP mode and conventional (non-ISP) mode.

Furthermore, to support the effective intra direction defined by the current ISP, each sub-block may need to access more top or left side neighboring reference samples than a coded block of the same size but not coded by the ISP mode. To illustrate this problem, fig. 9 a-9 b compare schematic diagrams of the effective intra direction of a coding block and the range of reference samples used when the coding block is encoded in a conventional intra mode and when the coding block is encoded as a sub-block of the ISP mode (assuming that the coding block is divided into two sub-blocks from the vertical direction). As shown in fig. 9a, when using conventional intra mode, the intra angular directions in which the coding block is valid range from mode 2 to mode 66 (i.e. covering an angular range of 45 ° to-135 °), and in order to support these directions, when predicting the coding block, it is necessary to access 2W +1 reference samples on the upper side and 2H +1 reference samples on the left side of the coding block. However, when encoding the coding block using ISP mode, as shown in fig. 9b, since the parent coding block is flat rectangular in shape, wide-angle intra prediction needs to be applied to the coding block so that the intra-angular direction for which the coding block is effective ranges from mode 8 to mode 72 (i.e. covers an angular range of 63.4 ° to-116.6 °), and in order to support these directions, when predicting the coding block, 3W +1 reference samples on the upper side and 3H/2+1 reference samples on the left side of the coding block need to be accessed. That is, fig. 9b requires access to W more reference samples from among the adjacent samples accessing the coding block compared to fig. 9 a.

2. Because there is a strong correlation between the residuals of intra prediction, DCT-II and DST-VII are applied simultaneously to intra predicted coded blocks when MTS is disabled. However, as shown in tables 2 and 3, the conventional intra mode encoded coding block and the ISP mode encoded coding block utilize different methods to select the optimal horizontal/vertical transform between DCT-II and DST-VII. In practice, the choice of the best transform depends on the actual distribution of the prediction residual, which should be highly correlated to the block size of the coding block and the intra mode to which the coding block applies, rather than on whether the coding block applies the ISP mode. Furthermore, choosing a uniform design for the transformation of all the coding blocks is more beneficial to the hardware implementation of the ISP.

3. MRL cannot be used in conjunction with ISP mode. When the MRL index of a coding block is not zero, ISP mode is always disabled by inferring the value of the ISP flag to zero. However, the gains of MRL tools mainly come from two aspects:

1) since quantization/dequantization is applied in the transform domain, the reconstructed samples at different locations may have different reconstruction quality, i.e. the nearest neighbourhood may not always be the best reference sample for intra prediction;

2) there may be coding noise and occlusion in the nearest neighbourhood, which may lead to a degradation of the quality of the intra-predicted samples.

Based on the above analysis, it seems unreasonable to disable MRL for ISP mode, in other words, additional coding gain can be expected when a combination of ISP and MRL is enabled.

4. There may be implementation problems if the sub-block size of an ISP coded block belongs to 1 × N, N × 1, 2 × N or N × 2. In ISP mode, the reconstructed pixels of the neighboring sub-block must be ready for predicting the current sub-block, that is, in order to predict the pixels in the current sub-block, the reconstruction loop of the neighboring sub-block must be completed, and when the difference between the width and the height of the sub-block is large, the reconstruction delay of the sub-block is significant, and the reconstruction delay reduces the throughput of the VVC, thereby affecting the encoding performance of the ISP.

To address these issues, embodiments of the present disclosure propose methods to further improve ISP coding efficiency and simplify existing ISP designs to facilitate hardware implementation.

Wide-angle intra direction in ISP mode is enabled/disabled based on sub-block size.

In the current VVC design, wide-angle intra prediction is applied to an ISP coding block, and whether an original intra mode is applied to one sub block or a corresponding wide-angle intra mode is applied to one sub block is determined according to the entire coding block, so that not only is the intra direction range supported by the ISP coding block and a conventional coding block inconsistent, but also the number of reference samples in the upper neighborhood or the left neighborhood of the coding block may be increased. To this end, in one embodiment of the present disclosure, it is proposed to enable/disable wide-angle intra mode of an encoded block based on the sub-block size. Using the same example of fig. 9 a-9 b, fig. 10 shows a schematic diagram of the effective intra direction of a sub-block and the reference samples used by the sub-block when applying the method, and it can be seen from fig. 10 that after applying the method, the intra modes supported by each sub-block range from mode 2 to mode 66 (i.e. from 45 ° to-135 °), and the reference samples required for intra prediction include 2W +1 reference samples from the upper neighborhood and 2H +1 reference samples from the left neighborhood, which are the same as the statistics of a conventional coding block of the same size (i.e. W × H) in fig. 9a, so the method can provide a uniform design for wide-angle intra prediction of coding blocks regardless of whether they are coded using ISP mode or not.

In addition, in the current VTM-3.0, the width or height of the largest coding block is 64 and the width or height of the smallest coding block is 4. Accordingly, the aspect ratio of one coding block may be M:1 or 1:M, where M can be 1, 2, 4, 8, 16, 32, and 64. However, after ISP mode is enabled, the width or height of one sub-block may be reduced to one sample, so an aspect ratio of 1:64 or 64:1 also becomes possible. Therefore, when wide-angle intra prediction is enabled/disabled for ISP based on the sub-block size, a new pair of aspect ratios, namely 64:1 and 1:64, should be introduced when defining the wide-angle intra directions supported by ISP. To achieve this, new elements need to be introduced in angTable[] and invAngTable[]; the gray elements in Table 4 are the newly introduced elements, where angTable[] and invAngTable[] define the tan values and arctan values for the various intra angles.

Table 4: angTable[] and invAngTable[] modified for angular intra prediction

According to a second embodiment of the present disclosure, no new elements are added to Table 4 to handle intra prediction of sub-blocks with the new aspect ratios 1:64 or 64:1; instead, the ISP mode is always disabled for such sub-blocks. In other words, at the decoder side, when the resulting sub-block has an aspect ratio of 64:1 or 1:64, the ISP flag is inferred to be 0 (i.e., ISP mode is disabled).

According to a third embodiment of the present disclosure, no new elements are added to Table 4 to handle the new aspect ratios 1:64 and/or 64:1; instead, the intra prediction used for sub-blocks with aspect ratios 1:16 and 16:1 is reused for sub-blocks with aspect ratios 1:64 and 64:1.

Transform selection of the coded blocks is unified.

In the current VVC, when MTS is disabled, conventional coding blocks (see Table 2) and ISP coding blocks (see Table 3) use different methods to choose the best horizontal/vertical transform from DCT-II and DST-VII. This may not be reasonable, because the statistical distribution of the intra prediction residual of each block/sub-block should be independent of whether the current coding block uses ISP mode. To achieve a unified design, two methods are proposed below to harmonize the transform selection of ISP coding blocks with that of conventional coding blocks.

In one possible implementation, the transform selection method for conventional coding blocks (see Table 2) is extended to ISP coding blocks. Specifically, the transform selection method for conventional coding blocks remains the same as the existing design in Table 2, while the transform selection method for ISP coding blocks is modified as follows: when an ISP coding block is divided into rectangular sub-blocks, DST-VII is applied to the shorter dimension of each sub-block and DCT-II is applied to the longer dimension; when an ISP coding block is divided into square sub-blocks, DST-VII is applied in both the horizontal and vertical directions. In addition, to avoid introducing new transform sizes, DST-VII is used only when the corresponding dimension of the sub-block is equal to or smaller than 16 (the same constraint as in Table 2).

In another possible implementation, the transform selection method for ISP coding blocks (see Table 3) is extended to conventional coding blocks. Specifically, the transform selection method for sub-blocks remains the same as the existing design in Table 3, and the transform selection of conventional coding blocks is changed to follow Table 3, i.e., a conventional coding block also selects the intra mode, horizontal transform and vertical transform from Table 3 according to its block size.

In the current VVC design, the ISP mode cannot be applied in conjunction with the MTS mode. Specifically, when one coding block is encoded in ISP mode, the MTS enable/disable flag is not transmitted; instead, the MTS flag is always inferred to be 0, i.e., MTS is disabled. Conceptually, the gain of the ISP mode mainly comes from the following two sources.

First, since the distance between the prediction sample and the reference sample is shortened, the average intra prediction correlation is improved when the ISP mode is enabled, and thus, the intra prediction efficiency can be improved.

Secondly, due to the presence of sub-blocks, the encoder/decoder has more freedom to apply transforms of various sizes that can better adapt to the specific characteristics of the local residual inside the coding block. Generally, after dividing an ISP coding block into a plurality of sub-blocks, each sub-block has more opportunities to achieve smaller prediction residuals. For example, all prediction residuals of one sub-block may be zero, in which case only a single bit (i.e., CBF = 0) needs to be indicated in the bitstream to reconstruct the residuals at the decoder, which significantly reduces signaling overhead. On the other hand, the gain of MTS comes from the adaptive selection among multiple transforms to better compress the residual information of one coding block. There should therefore be no strong overlap between the gains of the ISP and MTS modes, i.e., additional coding gain can be expected by enabling the combination of ISP and MTS modes. Meanwhile, always disabling the MTS mode for ISP coding blocks is inconsistent with the design of conventional coding blocks, for which the MTS mode is enabled, and from the viewpoint of facilitating the hardware implementation of the encoder/decoder it is more desirable to provide a consistent MTS design for ISP coding blocks and conventional coding blocks. To improve intra coding performance while achieving a unified design, the present disclosure proposes to enable the MTS mode for the ISP mode.

In one embodiment, a pair of MTS horizontal/vertical transforms may be signaled for an ISP coding block (e.g., through the MTS flag and MTS index syntax elements), and all sub-blocks within the coding block share the signaled transforms.

In another embodiment, each sub-block selects its own MTS transform. Specifically, when ISP mode is enabled for a coding block, an MTS flag is sent for each sub-block. If the MTS flag is 0, implicit transform selection may be used for that sub-block (see Table 3); if the MTS flag is 1, additional syntax elements (e.g., MTS indices) may be signaled to indicate the transforms of the sub-block in the horizontal and vertical directions (e.g., DCT-VIII, DST-IV, and DST-VII).

In the above method, each subblock selects its MTS transform, and multiple MTS enable/disable flags need to be sent in a bitstream, which results in a relatively large signaling overhead. To reduce signaling overhead, in another embodiment, MTS mode may be enabled/disabled simultaneously for all sub-blocks within an ISP coded block, but each sub-block is allowed to select its own transform when MTS mode is enabled. For example, an MTS flag is signaled at the coding block level, and when the MTS flag is zero, the transforms of all sub-blocks can be selected according to table 3; when the MTS flag is 1, the MTS index of each subblock may be signaled to indicate the transform selected for the subblock.
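The block-level signaling variant described in the preceding paragraph can be sketched as a small parsing routine. This is a hedged illustration only: the function name, the dictionary layout and the idea of reading already-decoded syntax element values from an iterator are assumptions of this example, not the syntax of any VVC draft.

```python
def parse_isp_mts(symbols, num_subblocks):
    """Parse MTS information for one ISP coding block.

    `symbols` is a hypothetical iterable of already entropy-decoded
    syntax element values, in bitstream order.
    """
    it = iter(symbols)
    mts_flag = next(it)  # one flag for the whole coding block
    if mts_flag == 0:
        # No index is sent; every sub-block falls back to implicit
        # transform selection (Table 3 in the text).
        return {"mts_flag": 0, "mts_idx": [None] * num_subblocks}
    # MTS enabled: one index per sub-block selects its transforms.
    return {"mts_flag": 1,
            "mts_idx": [next(it) for _ in range(num_subblocks)]}
```

Compared with sending one MTS flag per sub-block, this layout spends a single flag per coding block and only adds per-sub-block indices when MTS is actually enabled.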

Harmonizing the MPM generation methods of ISP coding blocks and conventional coding blocks.

ISP coding blocks form the MPM list differently from conventional coding blocks: the list does not include the DC mode, and it prioritizes either neighboring horizontal intra modes or neighboring vertical intra modes depending on the applied partition direction. As a result, the corresponding coding benefit may be limited.

Under one possible implementation, the present disclosure proposes to generate an MPM list of ISP coding blocks using the same MPM list generation method as a conventional coding block.

In another possible implementation, the present disclosure provides a more efficient MPM list generation method for ISP coded blocks. On the premise of keeping different MPM list generation methods of ISP coding blocks and conventional coding blocks, the MPM list generation method of the ISP coding blocks in the VVC is improved so as to achieve better coding efficiency.

In particular, when generating the MPM list for an ISP coding block, if the current coding block is horizontally partitioned, the intra prediction directions ranging from -14 to 18 as shown in fig. 4 (except for planar mode 0 and DC mode 1) are excluded from selection, because these prediction directions are unlikely to provide a prediction benefit for horizontally partitioned sub-blocks in ISP mode. Likewise, if the current coding block is vertically partitioned, the intra prediction directions ranging from 50 to 80 as shown in fig. 4 are excluded from selection, because these prediction directions are unlikely to provide a prediction benefit for vertically partitioned sub-blocks in ISP mode.

In addition, when some intra prediction directions are excluded from the MPM list of an ISP coding block based on the above rule, other intra prediction directions may be added instead if the MPM list then contains fewer than 6 MPMs.

In one example, if there is a corresponding wide-angle intra prediction direction for the excluded intra prediction directions, the wide-angle intra prediction direction may be placed in the MPM list as a substitute. In another example, some intra prediction directions adjacent to the intra mode already in the MPM list may be put in the MPM list instead.
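The exclusion rule and the second substitution example (adding directions adjacent to modes already in the list) can be sketched as follows. The function name, the candidate ordering, and the restriction of substitutes to the regular angular range 2 to 66 are assumptions made for this illustration; the text does not prescribe a specific fill order.

```python
PLANAR, DC = 0, 1


def filter_isp_mpm(candidates, split, list_size=6):
    """Drop directions that are excluded for the given split direction,
    then refill the list with angular neighbours of the kept modes."""
    if split == "horizontal":
        def excluded(m):
            # Directions -14..18, except planar (0) and DC (1).
            return m not in (PLANAR, DC) and -14 <= m <= 18
    else:
        def excluded(m):
            return 50 <= m <= 80

    mpm = []
    for m in candidates:
        if not excluded(m) and m not in mpm and len(mpm) < list_size:
            mpm.append(m)

    # Refill with modes adjacent to angular modes already in the list,
    # widening the offset until the list is full (assumed strategy).
    offset = 1
    while len(mpm) < list_size and offset <= 64:
        for m in list(mpm):
            if m in (PLANAR, DC):
                continue
            for cand in (m - offset, m + offset):
                if (2 <= cand <= 66 and not excluded(cand)
                        and cand not in mpm and len(mpm) < list_size):
                    mpm.append(cand)
        offset += 1
    return mpm
```

For a horizontally partitioned block with candidates [0, 1, 10, 34, 50, 66], mode 10 is dropped and, under the assumed fill order, mode 33 is added in its place.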

A combination of ISP mode and MRL mode.

In the current VVC, when the reference samples used are not from the nearest neighborhood of a coding block (i.e., the MRL index is not zero), the ISP flag is forced to zero (i.e., the ISP mode is disabled), so the ISP mode and the MRL mode cannot be applied to one coding block at the same time. However, the ISP mode aims to improve intra prediction efficiency by shortening the distance between the prediction samples and the reference samples, whereas the MRL mode aims to mitigate the negative impact of coding noise and occlusion in the nearest neighboring samples on the overall intra coding performance. There is little overlap between the coding advantages of these two tools, i.e., a combined benefit can be expected. To further improve the efficiency of intra prediction, the present disclosure proposes to enable the ISP and MRL modes simultaneously for one coding block.

In one embodiment, an ISP flag may be sent before or after the MRL index is sent, and the MRL index is shared by all sub-blocks in the same ISP coding block, i.e., all sub-blocks will use the ith row/column of their respective reconstructed samples (as indicated by the MRL index) as a reference to generate the intra-predicted samples.

In another embodiment, each sub-block in an ISP encoded block may reference a different row/column of reconstructed neighboring samples. In particular, the MRL index is signaled after the ISP flag is sent, if the ISP flag is equal to zero (i.e. the coding block is not partitioned), one MRL index is signaled, which is used to determine the reference sample for the entire coding block; if the ISP flag is equal to 1, a plurality of MRL indices (one for each sub-block) are transmitted according to the number of sub-blocks in the coding block to indicate the location of the corresponding reference sample of each sub-block, respectively.

Extending the reference samples of ISP.

In the current ISP design, for a sub-block that is not located at the first row and first column of an ISP coding block, when a reference sample of the sub-block is unavailable, it is replaced by the closest available reference sample. Referring to fig. 11, when the reference samples in the light gray area are unavailable (their prediction has not been completed), the reference sample indicated by the arrow replaces all reference samples in the light gray area. Consequently, the number of distinct reference samples actually used is relatively small and the prediction quality suffers. To improve the prediction quality of ISP, the present disclosure proposes different schemes for selecting substitute reference samples for the unavailable reference samples.

In one embodiment, the reference samples of the current coding block are used as substitutes for the unavailable reference samples. As shown in fig. 12 and fig. 13, the substitute for each unavailable reference sample is selected from the reference samples of the current coding block along the direction indicated by the angular intra prediction mode.

It should be noted that when generating alternative reference samples, interpolation filters or reference sample smoothing filters used by conventional intra prediction may also be applied here. Furthermore, when the prediction mode is DC mode or planar mode, no additional process is required to determine the alternative reference samples.

In another embodiment, a simple copy is deployed to generate an alternative reference sample, as shown in FIG. 14.

In particular, the ISP extended sample generation methods proposed by the present disclosure (shown in fig. 12-14) can be freely combined with other ISP improvement/simplification methods proposed by the present disclosure.

In one specific example, the ISP extended sample generation method is combined with the ISP wide-angle intra direction method, i.e., whether to enable/disable the wide-angle intra direction is determined according to the sub-block size. When this combination is enabled, the first sub-block may need more neighboring reference samples. This is because the unavailable reference samples of a non-first sub-block (a sub-block that is not at the first row and first column of the coding block) are generated from the neighboring reference samples of the first sub-block (the sub-block at the first row and first column of the coding block), and the number of substitute reference samples used by the non-first sub-block depends on the wide-angle intra direction of that sub-block, which is determined by the sub-block size rather than the coding block size. In other words, such a design may increase the number of reference samples that the first sub-block needs to access.

To avoid this, in one embodiment of the present disclosure, the value of the nearest reference sample in the original reference sample region of the sub-block is used to pad those additional reference samples. In another embodiment, the intra mode of a non-first sub-block is clipped to the closest intra mode that does not require reference samples beyond those of the first sub-block.

Removing some sub-block sizes in ISP mode.

In the current VVC, the subblock size may be 1 × N, N × 1, 2 × N, or N × 2. To predict the current subblock, reconstructed pixels in neighboring subblocks must be generated first, and this delay reduces the throughput of the VVC.

In a first embodiment, the coding block is disabled from using the ISP mode when the sub-block size after the coding block is divided is 1 × N and/or 2 × N.

In a second embodiment, when the sub-block size after the coding block division is 1 × N, 2 × N, N × 1 and/or N × 2, the coding block is disabled from using the ISP mode.

In another embodiment, when the size of the subblock after the coding block is divided is 1 × N, the coding block is disabled from using the ISP mode.

In yet another embodiment, the coding block is disabled from using the ISP mode when the size of the sub-block after the coding block is divided is 1 × N and/or N × 1.

Table 5 shows the relationship between the block size of the coding block, the sub-block size, and the sub-block division direction of the coding block, on the basis of which the block size and the sub-block division direction of the disabled coding block can be determined according to the sub-block size of a given disabled ISP mode.

For example, to disable 1 × N and 2 × N sub-blocks, the ISP mode may be disabled in the vertical direction for coding blocks of 4 × 16, 4 × 32, 4 × 64, 4 × 8, 8 × 16, 8 × 32, and 8 × 64 sizes.

Table 5: Relationship between the block size of a coding block, the sub-block size, and the sub-block division direction of the coding block
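The lookup that Table 5 supports can be approximated in code: given a set of disabled sub-block widths or heights, enumerate the coding block sizes and division directions for which ISP would be disabled. The sketch below does not reproduce Table 5. It assumes that the largest sub-block count (4, then 2) leaving at least 16 samples per sub-block is used, and that a vertical division splits the width; all function names are hypothetical.

```python
def isp_subblock_size(width, height, direction, min_samples=16):
    """Sub-block (width, height) under the assumed split rule, or None."""
    count = next((k for k in (4, 2) if width * height // k >= min_samples), 0)
    if count == 0:
        return None
    if direction == "vertical":       # assumed: splits the width
        return (width // count, height)
    return (width, height // count)   # assumed: splits the height


def isp_disabled(width, height, direction,
                 disabled_widths=(1, 2), disabled_heights=()):
    """True if ISP should be disabled for this block and direction."""
    sub = isp_subblock_size(width, height, direction)
    if sub is None:
        return True  # ISP is not applicable at all
    return sub[0] in disabled_widths or sub[1] in disabled_heights


def blocks_with_isp_disabled(direction, sizes=(4, 8, 16, 32, 64), **kwargs):
    """List every (W, H) that supports ISP in general but has it
    disabled in `direction` because of the sub-block size."""
    return [(w, h) for w in sizes for h in sizes
            if isp_subblock_size(w, h, direction) is not None
            and isp_disabled(w, h, direction, **kwargs)]
```

With 1 × N and 2 × N disabled, `blocks_with_isp_disabled("vertical")` covers the 4 × 8, 4 × 16, 4 × 32, 4 × 64, 8 × 16, 8 × 32 and 8 × 64 blocks named in the example above. Under the assumed split rule it also lists 8 × 8, whose vertical division would give 2 × 8 sub-blocks.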

Fig. 15 is a flowchart illustrating a video decoding method according to an exemplary embodiment, the flowchart including the following steps.

S1501: Block size information and target prediction mode information of a coding block are received.

S1502: The sub-block size of the coding block in each division direction is determined according to the block size information and the sub-block division information configured for that block size information.

The sub-block division information configured for each piece of block size information includes at least two division directions and the number of sub-blocks in each division direction.

Figs. 8a to 8c show the sub-block division information of coding blocks of all sizes. The division directions are the horizontal direction and the vertical direction, the number of sub-blocks is 2 or 4, and the coding block is divided equally in each division direction, that is, all sub-blocks obtained by dividing the coding block in a given division direction have the same size.

S1503: Whether the ISP mode is disabled for the coding block in each division direction is determined according to the sub-block size of the coding block in that division direction and a configured set of sub-block sizes for which the ISP mode is disabled, where each size in the set is a non-standard size.

Generally, the largest width or height of a coding block is 64 and the smallest is 4. That is, the sizes of the coding blocks that the encoder can produce by partitioning are already specified, and these are the standard coding block sizes.

When the ISP mode is used, the coding block is divided again and sub-blocks of non-standard sizes may appear. Reconstructing such sub-blocks can aggravate the reconstruction delay of the ISP mode and, moreover, requires additional reconstruction circuitry for those sub-blocks, so that the benefit of the ISP mode is not realized.

For this reason, the embodiments of the present disclosure configure a set of sub-block sizes for which the ISP mode is disabled. The sizes in this set are non-standard sizes, that is, sizes that newly appear only through sub-block division. For a coding block that would be divided into one of these sub-block sizes, the ISP mode is disabled in the corresponding division direction, which mitigates the reconstruction delay that the ISP mode introduces and improves the coding efficiency of the ISP mode. In addition, no extra reconstruction circuit is needed for sub-blocks of these sizes, so the changes to the decoder and the encoder are small and the hardware cost is low.

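In a possible implementation, the sub-block size set is any combination of the following sub-block sizes: 1 × N; N × 1; 2 × N; N × 2.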
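As a minimal sketch of the check in S1503, the configured set can be held as shape patterns such as "1xN" or "Nx2" and matched against the sub-block size in each division direction. The pattern strings, function names, and the dictionary layout are assumptions made for illustration; the disclosure does not prescribe a data representation.

```python
def matches(sub_w, sub_h, pattern):
    """True if a sub-block of sub_w x sub_h has the shape named by pattern,
    e.g. "1xN" (width 1, any height) or "Nx2" (any width, height 2)."""
    left, right = pattern.split("x")
    if right == "N":
        return sub_w == int(left)
    return sub_h == int(right)


def isp_disabled_per_direction(sub_sizes, disabled_set):
    """sub_sizes maps a direction name to the (width, height) of its sub-blocks.
    Returns a dict: direction -> True when ISP is disabled for that direction."""
    return {
        direction: any(matches(w, h, p) for p in disabled_set)
        for direction, (w, h) in sub_sizes.items()
    }


# An 8x16 block split into 4 parts: 2x16 columns (vertical) or 8x4 rows (horizontal).
flags = isp_disabled_per_direction(
    {"vertical": (2, 16), "horizontal": (8, 4)},
    disabled_set={"1xN", "2xN"},
)
```

With the set {1 × N, 2 × N}, the vertical direction of this block is flagged as disabled while the horizontal direction stays available.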

S1504: The reconstruction size of the coding block is determined according to whether the ISP mode is disabled for the coding block in each division direction.

In one possible implementation, the ISP mode is disabled for the coding block in all division directions, in which case the block size of the coding block may be determined as the reconstruction size.

In this case, the decoder can directly infer that the coding block is not reconstructed in the ISP mode. The encoder therefore need not transmit an ISP identifier indicating whether the coding block uses the ISP mode, which saves bits and improves the video compression rate.

In another possible implementation, the ISP mode is disabled for the coding block in only some of the division directions. In this case, the reconstruction size may be determined according to the received ISP identifier of the coding block, where the ISP identifier is transmitted by the encoder after the block size information of the coding block.

Specifically, if the ISP identifier indicates that the coding block is not reconstructed in the ISP mode, the block size of the coding block may be determined as the reconstruction size; if the ISP identifier indicates that the coding block is reconstructed in the ISP mode, the reconstruction size may be determined according to the number of division directions in which the ISP mode is not disabled for the coding block.

If the ISP mode is not disabled in exactly one division direction, the sub-block size of the coding block in that division direction may be determined as the reconstruction size. In this case, the decoder can directly infer the division direction of the coding block, so the encoder does not need to transmit division direction indication information indicating the target division direction, which saves bits and improves the video compression rate.

If the ISP mode is not disabled in at least two division directions, the target division direction of the coding block may be determined according to the received division direction indication information of the coding block, and the sub-block size of the coding block in the target division direction may be determined as the reconstruction size, where the division direction indication information is transmitted by the encoder after the ISP identifier of the coding block.
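The decision logic of S1504 can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: `read_bit` stands in for a hypothetical bitstream reader, a flag value of 1 is taken to mean "ISP used", and the direction bit is assumed to index into the fixed order (horizontal, vertical).

```python
DIRECTIONS = ("horizontal", "vertical")  # assumed order; the direction bit indexes into it


def decode_reconstruction_size(block_size, sub_sizes, disabled, read_bit):
    """block_size: (w, h) of the coding block.
    sub_sizes: direction -> (w, h) of the sub-blocks in that direction.
    disabled: direction -> True when ISP is disabled in that direction.
    read_bit: callable returning the next signalled bit (hypothetical reader).
    Returns (reconstruction_size, direction or None)."""
    allowed = [d for d in DIRECTIONS if not disabled[d]]
    if not allowed:
        # ISP disabled everywhere: nothing is parsed, non-ISP is inferred.
        return block_size, None
    if read_bit() == 0:
        # ISP identifier says the block is not coded in ISP mode.
        return block_size, None
    if len(allowed) == 1:
        # Only one direction remains, so it is inferred without parsing.
        return sub_sizes[allowed[0]], allowed[0]
    direction = DIRECTIONS[read_bit()]
    return sub_sizes[direction], direction


# 8x16 block with the vertical direction disabled; only the ISP identifier is read.
bits = iter([1])
size, direction = decode_reconstruction_size(
    (8, 16),
    {"horizontal": (8, 4), "vertical": (2, 16)},
    {"horizontal": False, "vertical": True},
    lambda: next(bits),
)
```

In the usage example a single bit is consumed, because the horizontal direction is the only one left and is therefore inferred.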

S1505: The coding block is reconstructed according to the reconstruction size and the target prediction mode information.

Specifically, when the reconstruction size is the block size of the coding block, the coding block is reconstructed in the existing non-ISP manner; when the reconstruction size is a sub-block size of the coding block, the coding block is reconstructed in the existing ISP manner. Neither is described again here.

Fig. 16 is a flowchart illustrating a video encoding method according to an exemplary embodiment, the flowchart including the following steps.

S1601: A video sequence is acquired.

S1602: The sub-block size of the coding block in each division direction is determined according to the block size information of a coding block in the video sequence and the sub-block division information configured for that block size information.

S1603: Whether the ISP mode is disabled for the coding block in each division direction is determined according to the sub-block size of the coding block in that division direction and a configured set of sub-block sizes for which the ISP mode is disabled, where each size in the set is a non-standard size.

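In a possible implementation, the sub-block size set is any combination of the following sub-block sizes: 1 × N; N × 1; 2 × N; N × 2.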

S1604: The reconstruction size of the coding block is determined according to whether the ISP mode is disabled for the coding block in each division direction.

Specifically, if the ISP mode is disabled for the coding block in all division directions, the reconstruction size is the block size of the coding block. If the ISP mode is disabled in only some of the division directions, the reconstruction sizes are the block size of the coding block together with the sub-block size of the coding block in each division direction in which the ISP mode is not disabled. That is, there are multiple reconstruction sizes in this case, and the encoder may subsequently try each of them to determine a target prediction mode that meets the coding performance requirement.

S1605: Target prediction mode information of the coding block is determined according to the reconstruction size and each prediction mode configured for the reconstruction size.

Specifically, for each reconstruction size, the coding block may be reconstructed with each prediction mode corresponding to that reconstruction size, and the coding performance of that prediction mode at that reconstruction size is determined from the reconstruction result.

The coding performance of a prediction mode characterizes how well the prediction mode codes the coding block, and may be determined from the prediction error and the number of bits produced when the coding block is reconstructed with that prediction mode.

In addition, if the reconstruction size is a sub-block size of the coding block, the coding performance of the prediction mode on the coding block may be the sum of its coding performance on each sub-block of the coding block.

Further, the target prediction mode may be determined according to the coding performance of each prediction mode at each reconstruction size.

For example, the target coding performance that meets a preset requirement, such as the best coding performance, is determined, and the prediction mode that achieves the target coding performance is determined as the target prediction mode.
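One way to picture the selection in S1605 is with a Lagrangian cost J = D + λ·R, where D is the prediction error and R the number of bits. The text only says that coding performance depends on prediction error and bit count, so the cost formula, the weight `lam`, the function name, and the numbers in the example are assumptions chosen for illustration.

```python
def select_target_mode(candidates, lam=1.0):
    """candidates: list of (reconstruction_size, mode, parts), where parts is a
    list of (distortion, bits) pairs, one pair for the whole block or one pair
    per sub-block when the reconstruction size is a sub-block size.
    Returns (reconstruction_size, mode, cost) with the lowest total cost."""
    best = None
    for size, mode, parts in candidates:
        # Cost of a block coded in sub-blocks is the sum over its sub-blocks.
        cost = sum(d + lam * b for d, b in parts)
        if best is None or cost < best[2]:
            best = (size, mode, cost)
    return best


# Illustrative numbers only: an 8x16 block tried whole and as four 8x4 sub-blocks.
candidates = [
    ((8, 16), "DC", [(100.0, 10)]),
    ((8, 4), "DC", [(20.0, 4)] * 4),
    ((8, 4), "planar", [(15.0, 5)] * 4),
]
best = select_target_mode(candidates, lam=2.0)
```

With these made-up figures the three candidates cost 120, 112, and 100, so the planar mode at the 8 × 4 sub-block size is selected, which in turn implies that the block is coded in ISP mode.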

S1606: The block size information and the target prediction mode information of the coding block are sent.

Specifically, when the ISP mode is disabled for the coding block in all division directions, the decoder can infer this as well, so only the block size information and the target prediction mode information of the coding block need to be sent, without the ISP identifier of the coding block, which saves bits and improves the video compression rate.

When the ISP mode is disabled for the coding block in only some of the division directions, the encoder does not necessarily select the ISP mode for the coding block, so the ISP identifier of the coding block needs to be transmitted.

To this end, the encoder determines whether the coding block is reconstructed in the ISP mode according to the reconstruction size corresponding to the target prediction mode information. Specifically, if that reconstruction size is the block size of the coding block, it is determined that the ISP mode is not used; if that reconstruction size is a sub-block size of the coding block, it is determined that the ISP mode is used.

Then, after sending the block size information of the coding block, the encoder sends the ISP identifier of the coding block to inform the decoder whether to reconstruct the coding block in the ISP mode.

When the ISP identifier indicates that the coding block is reconstructed in the ISP mode and the ISP mode is not disabled in exactly one division direction, the decoder can directly infer the division direction. The encoder therefore need not send division direction indication information indicating the target division direction of the coding block, which saves bits and improves coding efficiency.

When the ISP mode is not disabled in at least two division directions, the encoder may send the division direction indication information of the coding block after the ISP identifier, to inform the decoder of the target division direction, which is the division direction corresponding to the target prediction mode information.
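The conditional signalling on the encoder side can be summarised in a short sketch. The element names `isp_flag` and `split_direction` are hypothetical labels for the ISP identifier and the division direction indication information, and the function only lists what follows the block size information; it does not model actual entropy coding.

```python
def syntax_elements_to_send(disabled, use_isp, target_direction=None):
    """disabled: direction -> True when ISP is disabled in that direction.
    use_isp: whether the selected reconstruction size is a sub-block size.
    Returns the ordered (name, value) pairs sent after the block size info."""
    allowed = [d for d, off in disabled.items() if not off]
    elements = []
    if allowed:
        # At least one direction is available, so the ISP identifier is needed.
        elements.append(("isp_flag", int(use_isp)))
        if use_isp and len(allowed) >= 2:
            # The direction is ambiguous only when two or more remain.
            elements.append(("split_direction", target_direction))
    return elements


none_sent = syntax_elements_to_send({"horizontal": True, "vertical": True}, False)
flag_only = syntax_elements_to_send({"horizontal": False, "vertical": True}, True)
flag_and_dir = syntax_elements_to_send(
    {"horizontal": False, "vertical": False}, True, "vertical"
)
```

The three calls correspond to the three cases described above: nothing extra is sent, only the ISP identifier is sent, or the ISP identifier is followed by the direction indication.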

When the method provided in the embodiments of the present disclosure is implemented in software or hardware or a combination of software and hardware, a plurality of functional modules may be included in the electronic device, and each functional module may include software, hardware or a combination of software and hardware.

Fig. 17 is a block diagram illustrating a video decoding apparatus according to an exemplary embodiment, which includes a receiving module 1701, a sub-block size determining module 1702, a judging module 1703, a reconstruction size determining module 1704, and a reconstruction module 1705.

a receiving module 1701 configured to receive block size information and target prediction mode information of a coding block;

a subblock size determining module 1702 configured to determine the sub-block size of the coding block in each division direction according to the block size information and the sub-block division information configured for that block size information, the sub-block division information including at least two division directions and the number of sub-blocks in each division direction;

a judging module 1703 configured to determine whether the ISP mode is disabled for the coding block in each division direction according to the sub-block size of the coding block in that division direction and a configured set of sub-block sizes for which the ISP mode is disabled, where each size in the set is a non-standard size;

a reconstruction size determination module 1704 configured to determine the reconstruction size of the coding block according to whether the ISP mode is disabled for the coding block in each division direction;

a reconstruction module 1705 configured to reconstruct the coding block according to the reconstruction size and the target prediction mode information.

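In a possible implementation, the sub-block size set is any combination of the following sub-block sizes: 1 × N; N × 1; 2 × N; N × 2.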

Under one possible implementation, the reconstruction size determination module 1704 is specifically configured to perform:

Fig. 18 is a block diagram illustrating a video encoding apparatus according to an exemplary embodiment, which includes an acquisition module 1801, a sub-block size determining module 1802, a judging module 1803, a reconstruction size determining module 1804, a reconstructing module 1805, and a sending module 1806.

an acquisition module 1801 configured to acquire a video sequence;

a sub-block size determining module 1802 configured to determine the sub-block size of a coding block in the video sequence in each division direction according to the block size information of the coding block and the sub-block division information configured for that block size information, the sub-block division information including at least two division directions and the number of sub-blocks in each division direction;

a judging module 1803 configured to determine whether the ISP mode is disabled for the coding block in each division direction according to the sub-block size of the coding block in that division direction and a configured set of sub-block sizes for which the ISP mode is disabled, where each size in the set is a non-standard size;

a reconstruction size determining module 1804 configured to determine the reconstruction size of the coding block according to whether the ISP mode is disabled for the coding block in each division direction;

a reconstructing module 1805 configured to determine target prediction mode information of the coding block according to the reconstruction size and each prediction mode configured for the reconstruction size;

a sending module 1806 configured to send the block size information and the target prediction mode information of the coding block.

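In a possible implementation, the sub-block size set is any combination of the following sub-block sizes: 1 × N; N × 1; 2 × N; N × 2.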

Under one possible implementation, the reconstruction size determination module 1804 is specifically configured to perform:

In a possible implementation, the apparatus further includes an ISP identification determining module 1807:

the ISP identification determining module 1807 is configured to, if the ISP mode is disabled for the coding block in only some of the division directions, determine whether the coding block is reconstructed in the ISP mode according to the reconstruction size corresponding to the target prediction mode information;

the sending module 1806 is further configured to send, after sending the block size information of the coding block, an ISP identifier of the coding block, where the ISP identifier indicates whether the coding block is reconstructed in the ISP mode.

In a possible implementation manner, if the ISP identifier indicates that the coding block is reconstructed in the ISP mode, the sending module 1806 is further configured to perform:

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

The division into modules in the embodiments of the present disclosure is illustrative and is only a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments of the present disclosure may be integrated in one processor, may exist physically on their own, or two or more of them may be integrated into one module. The modules may be coupled to one another through interfaces, which are typically electrical communication interfaces, although mechanical or other forms of interface are not excluded. Thus, modules described as separate components may or may not be physically separate, and may be located in one place or distributed over different locations on the same or different devices. An integrated module may be implemented in hardware or as a software functional module.

Fig. 19 is a schematic diagram illustrating a structure of an electronic device according to an exemplary embodiment, where the electronic device includes a transceiver 1901 and a processor 1902, and the processor 1902 may be a Central Processing Unit (CPU), a microprocessor, an application specific integrated circuit, a programmable logic circuit, a large scale integrated circuit, or a digital processing unit. The transceiver 1901 is used for data transmission and reception between the electronic device and another device.

The electronic device may further comprise a memory 1903 for storing software instructions executed by the processor 1902, but may also store some other data required by the electronic device, such as identification information of the electronic device, encryption information of the electronic device, user data, etc. The memory 1903 may be a volatile memory (volatile memory), such as a random-access memory (RAM); the memory 1903 may also be a non-volatile memory (non-volatile memory) such as, but not limited to, a read-only memory (ROM), a flash memory (flash memory), a Hard Disk Drive (HDD) or a solid-state drive (SSD), or the memory 1903 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 1903 may be a combination of the above memories.

The specific connection medium between the processor 1902, the memory 1903, and the transceiver 1901 is not limited in the embodiments of the present disclosure. In fig. 19, the embodiment of the present disclosure is described by taking only an example that the memory 1903, the processor 1902, and the transceiver 1901 are connected by the bus 1904, the bus is shown by a thick line in fig. 19, and the connection manner between other components is merely illustrative and not limiting. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 19, but it is not intended that there be only one bus or one type of bus.

The processor 1902 may be dedicated hardware or a processor that runs software. When the processor 1902 runs software, it reads the software instructions stored in the memory 1903 and, driven by those instructions, executes the video decoding or video encoding method described in the foregoing embodiments.

The disclosed embodiments also provide a storage medium, and when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is capable of executing the video decoding or video encoding method referred to in the foregoing embodiments.

In some possible embodiments, various aspects of the video decoding or video encoding method provided by the present disclosure may also be implemented in the form of a program product including program code for causing an electronic device to perform the video decoding or video encoding method referred to in the foregoing embodiments when the program product is run on the electronic device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The program product for video decoding or video encoding in embodiments of the present disclosure may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a computing device. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, the features and functions of two or more units described above may be embodied in one unit, in accordance with embodiments of the present disclosure. Conversely, the features and functions of one unit described above may be further divided so as to be embodied by a plurality of units.

Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.

The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present disclosure have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the disclosure.

It will be apparent to those skilled in the art that various changes and modifications can be made in the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims

1. A video decoding method, comprising:

reconstructing the coding block according to the reconstruction size and the target prediction mode information;

determining a reconstruction size of the coding block according to information on whether to disable an ISP mode for the coding block in each division direction, including:

2. The method of claim 1, wherein the set of sub-block sizes is any combination of the following sub-block sizes:

1×N；N×1；2×N；N×2；

3. The method of claim 1, wherein determining the reconstruction size based on the received ISP identification of the coding block comprises:

4. The method of claim 3, wherein determining the reconstruction size according to the number of partitioning directions in which the coding block does not disable the ISP mode comprises:

5. A video encoding method, comprising:

acquiring a video sequence;

sending the block size information and the target prediction mode information of the coding block;

6. The method of claim 5, wherein the set of sub-block sizes is any combination of the following sub-block sizes:

1×N；N×1；2×N；N×2；

7. The method of claim 5, wherein, if the ISP mode is disabled for the coding block in only some of the division directions, the method further comprises:

8. The method of claim 7, wherein if the ISP identification indicates that the coding block is to be reconstructed in ISP mode, further comprising:

9. A video decoding apparatus, comprising:

a reconstruction module configured to perform reconstruction of the coding block according to the reconstruction size and the target prediction mode information;

the reconstruction size determination module is specifically configured to perform:

10. The apparatus of claim 9, wherein the set of sub-block sizes is any combination of the following sub-block sizes:

1×N；N×1；2×N；N×2；

11. The apparatus of claim 9, wherein the reconstruction size determination module is specifically configured to perform:

12. The apparatus of claim 11, wherein the reconstruction size determination module is specifically configured to perform:

13. A video encoding apparatus, comprising:

an acquisition module configured to perform acquiring a video sequence;

a sending module configured to perform sending block size information and target prediction mode information of the coding block;

14. The apparatus of claim 13, wherein the set of sub-block sizes is any combination of the following sub-block sizes:

1×N；N×1；2×N；N×2；

15. The apparatus of claim 13, further comprising an ISP identification determination module:

16. The apparatus of claim 15, wherein if the ISP identification indicates that the coding block is to be reconstructed in ISP mode, the sending module is further configured to perform:

17. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein:

the memory stores instructions executable by the at least one processor, the at least one processor being capable of performing the method of any one of claims 1-4 or 5-8 when the instructions are executed by the at least one processor.

18. A storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any of claims 1-4 or 5-8.