CN113348671A - Video coding using intra sub-partition coding modes

Info

Publication number: CN113348671A
Authority: CN (China)
Prior art keywords: sub, prediction, mode, partition, intra
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202080010728.6A
Other languages: Chinese (zh)
Inventors: 陈漪纹, 修晓宇, 王祥林, 马宗全
Assignee (current and original): Beijing Dajia Internet Information Technology Co., Ltd. (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Priority to: CN202111101969.4A (published as CN113630607A)

Classifications

    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals (H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION), including:
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/52 Processing of motion vectors by predictive encoding
    • H04N19/11 Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/176 The coding unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/593 Predictive coding involving spatial prediction techniques
    • H04N19/70 Characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/96 Tree coding, e.g. quad-tree coding
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/186 The coding unit being a colour or a chrominance component
    • H04N19/46 Embedding additional information in the video signal during the compression process

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video encoding method, comprising: a respective intra prediction is independently generated for each of a plurality of respective sub-partitions. The respective intra prediction for each sub-partition is generated using a plurality of reference samples from the current coding block. By way of illustration, reconstructed samples of a first sub-partition of the plurality of respective sub-partitions are not used to generate the respective intra predictions for any other sub-partitions, and a width of each sub-partition of the plurality of respective sub-partitions is less than or equal to 2.

Description

Video coding using intra sub-partition coding modes
Cross reference to related applications
This application claims priority to U.S. Provisional Patent Application No. 62/801,214, filed on February 5, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present application relate generally to video encoding and compression. More particularly, the present application relates to systems and methods for video encoding using intra sub-partition coding modes.
Background
This section provides background information related to the present application. The information contained in this section is not necessarily to be construed as prior art.
Video data may be compressed using any of a variety of video coding techniques. Video encoding may be performed according to one or more video coding standards. Some illustrative video coding standards include Versatile Video Coding (VVC), Joint Exploration test Model (JEM) coding, High Efficiency Video Coding (H.265/HEVC), Advanced Video Coding (H.264/AVC), and Moving Picture Experts Group (MPEG) coding.
Video coding typically employs prediction methods (e.g., inter-prediction, intra-prediction, etc.) that exploit redundancy inherent in video images or sequences. One goal of video coding techniques is to compress video data into a form that uses a low bit rate while avoiding or reducing degradation of video quality.
The first version of the HEVC standard, finalized in October 2013, provides a bit rate saving of approximately 50% at equivalent perceptual quality compared to the previous-generation video coding standard (H.264/MPEG AVC). Although the HEVC standard provides significant coding improvements over previous standards, there is evidence that coding efficiency beyond that of HEVC can be achieved using additional coding tools. Based on this evidence, the Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG) began exploratory work to develop new coding techniques for future video coding standardization. The Joint Video Exploration Team (JVET), formed by ITU-T VCEG and ISO/IEC MPEG in October 2015, began substantial study of advanced technologies that could significantly improve coding efficiency. JVET maintains a reference software model, called the Joint Exploration Model (JEM), by integrating several additional coding tools on top of the HEVC Test Model (HM).
In October 2017, ITU-T and ISO/IEC issued a joint Call for Proposals (CfP) for video compression with capability beyond HEVC. In April 2018, 23 CfP responses were received and evaluated at the 10th JVET meeting. These responses demonstrated compression efficiency gains of about 40% over the HEVC standard. Based on the results of this evaluation, JVET launched a new project to develop a next-generation video coding standard, named Versatile Video Coding (VVC). Also in April 2018, a reference software codebase, called the VVC Test Model (VTM), was established to demonstrate a reference implementation of the VVC standard.
Disclosure of Invention
This section provides a general summary of the application and is not a comprehensive disclosure of its full scope or all of its features.
According to a first aspect of the present application, there is provided a video encoding method implemented in a computing device having one or more processors and a memory storing a plurality of programs for execution by the one or more processors. The method comprises the following steps: a respective intra prediction is independently generated for each of a plurality of respective sub-partitions, wherein the respective intra prediction for each sub-partition is generated using a plurality of reference samples from a current encoding block.
According to a second aspect of the present application, there is provided a video encoding method implemented in a computing device having one or more processors and a memory storing a plurality of programs for execution by the one or more processors. The method comprises the following steps: for a luma component of an intra sub-partition (ISP) encoded block, generating respective intra predictions for each of a plurality of corresponding sub-partitions using only N of M possible intra prediction modes, wherein M and N are positive integers and N is less than M.
According to a third aspect of the present application, there is provided a video encoding method implemented in a computing device having one or more processors and a memory storing a plurality of programs for execution by the one or more processors. The method comprises the following steps: for chroma components of an intra sub-partition (ISP) encoded block, intra prediction is generated using only N of M possible intra prediction modes, where M and N are positive integers and N is less than M.
According to a fourth aspect of the present application, there is provided a video encoding method implemented in a computing device having one or more processors and a memory storing a plurality of programs for execution by the one or more processors. The method comprises the following steps: for the luma component, respective luma intra predictions are generated for each of a plurality of corresponding sub-partitions of an entire intra sub-partition (ISP) encoded block, and for the chroma component, chroma intra predictions are generated for the entire intra sub-partition (ISP) encoded block.
According to a fifth aspect of the present application, there is provided a video encoding method implemented in a computing device having one or more processors and a memory storing a plurality of programs for execution by the one or more processors. The method comprises the following steps: generating a first prediction using an intra sub-partition mode; generating a second prediction using the inter prediction mode; and combining the first prediction and the second prediction to generate a final prediction by applying a weighted average to the first prediction and the second prediction.
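The weighted combination described in the fifth aspect can be sketched as follows. This is an illustrative sketch only: the function name and the equal default weights are assumptions, since the text does not fix particular weight values here.

```python
def combine_predictions(intra_pred, inter_pred, intra_weight=0.5):
    """Blend two equally sized prediction blocks sample by sample.

    The final prediction is a weighted average of an intra (ISP) prediction
    and an inter prediction; intra_weight = 0.5 gives the arithmetic mean.
    """
    inter_weight = 1.0 - intra_weight
    return [
        [intra_weight * a + inter_weight * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(intra_pred, inter_pred)
    ]

# Example: with equal weights every output sample is the mean of the inputs.
intra = [[100, 104], [108, 112]]
inter = [[120, 116], [112, 108]]
final = combine_predictions(intra, inter)  # -> [[110.0, 110.0], [110.0, 110.0]]
```

In a real codec the weights would typically depend on block position or be signaled; the fixed scalar here only illustrates the averaging step.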
Drawings
In the following, a set of illustrative, non-limiting embodiments of the present application will be described in connection with the accompanying drawings. Those of ordinary skill in the pertinent art may make variations in structure, method, or function based on the examples given herein, and such variations are included within the scope of the present application. Provided there is no conflict, the teachings of the different embodiments may be combined with one another.
Fig. 1 is a block diagram illustrating an exemplary encoder that may be used in connection with various video coding standards.
Fig. 2 is a block diagram illustrating an exemplary decoder that may be used in connection with various video coding standards.
FIG. 3 illustrates five exemplary block partitions for a multi-type tree structure.
Fig. 4 illustrates an exemplary set of intra modes for use with the VVC standard.
Fig. 5 shows a set of a plurality of reference lines for intra prediction.
Fig. 6A shows a first set of reference samples and angular directions for intra prediction of a first rectangular block.
Fig. 6B shows a second set of reference samples and angular directions for intra prediction of the second rectangular block.
Fig. 6C shows a third set of reference samples and angular directions for intra prediction of square blocks.
Fig. 7 shows an exemplary set of positions of neighboring reconstructed samples for position dependent intra prediction combining (PDPC) of one coding block.
Fig. 8A illustrates a set of exemplary short-distance intra prediction (SDIP) partitions for an 8 × 4 block.
Fig. 8B illustrates a set of exemplary short-distance intra prediction (SDIP) partitions for a 4 × 8 block.
Fig. 8C illustrates a set of exemplary short-distance intra prediction (SDIP) partitions for arbitrarily-sized blocks.
FIG. 9A is a plot of chrominance values as a function of luminance values, which is used to derive a set of linear model parameters.
FIG. 9B shows the positions of samples used to derive the linear model parameters of FIG. 9A.
Fig. 10 illustrates the generation of reference samples for intra prediction for all sub-partitions using only reference samples outside the current coding block.
Fig. 11 illustrates a combination of inter-predicted samples and intra-predicted samples of the first sub-partition of fig. 10.
Fig. 12 illustrates a combination of inter-predicted samples and intra-predicted samples of the second sub-partition of fig. 10.
Detailed Description
The terminology used in the present application is for the purpose of describing particular examples only and is not intended to limit the present application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be understood that the term "and/or" as used herein refers to any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may be termed second information without departing from the scope of the present application; similarly, second information may also be termed first information. As used herein, the term "if" may be understood to mean "when" or "upon" or "in response to," depending on the context.
Reference throughout this specification to "one embodiment," "an embodiment," "another embodiment," or the like, in the singular or plural, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment," "in another embodiment," and the like, in the singular or plural, in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Conceptually, many video coding standards are similar, including those mentioned previously in the background section. For example, almost all video coding standards use block-based processing and share similar video coding block diagrams to achieve video compression. As with HEVC, the VVC standard builds on top of the block-based hybrid video coding framework.
Fig. 1 shows a block diagram of an exemplary encoder 100 that may be used in connection with a variety of video coding standards. In encoder 100, a video frame is divided into a plurality of video blocks for processing. For each given video block, a prediction (prediction) is formed based on an inter prediction method (inter prediction approach) or an intra prediction method (intra prediction approach). In inter prediction, one or more predictors (predictors) are formed by motion estimation and motion compensation based on pixels from a previously reconstructed frame. In intra prediction, the predictor is formed based on reconstructed pixels in the current frame. Through the mode decision, the best predictor can be selected to predict the current block.
The prediction residual, which represents the difference between the current video block and its predictor, is sent to the transform circuit 102. The transform coefficients are then sent from transform circuitry 102 to quantization circuitry 104 for entropy reduction. The quantized coefficients are then fed to an entropy encoding circuit 106 to generate a compressed video bitstream. As shown in fig. 1, prediction related information 110, such as video block partition information, motion vectors, reference picture indices, and intra prediction modes, from inter prediction circuitry and/or intra prediction circuitry 112 is also fed through entropy coding circuitry 106 and saved into compressed video bitstream 114.
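As a simplified illustration of the residual path just described, the sketch below forms a prediction residual and applies uniform quantization. It is a toy model under stated assumptions: the transform stage is omitted, the quantization step size is an arbitrary illustrative value, and the function name is not from the text.

```python
def forward_path(block, predictor, qstep=8):
    """Toy encoder forward path: residual formation + uniform quantization.

    The residual is the sample-wise difference between the current block and
    its predictor; quantization divides by a step size, shrinking the values
    that the entropy coder must represent (the transform is omitted here).
    """
    residual = [
        [c - p for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(block, predictor)
    ]
    quantized = [[round(r / qstep) for r in row] for row in residual]
    return residual, quantized

# A flat DC predictor of 64 against a varying 2x2 block:
res, q = forward_path([[56, 72], [48, 88]], [[64, 64], [64, 64]])
# res -> [[-8, 8], [-16, 24]], q -> [[-1, 1], [-2, 3]]
```

Note how quantization is the lossy step: the decoder can only recover multiples of the step size, which is why the encoder also runs the inverse path to keep its reference pixels in sync with the decoder.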
In the encoder 100, circuitry associated with the decoder is also required to reconstruct the pixels for prediction purposes. First, the prediction residual is reconstructed by the inverse quantization circuit 116 and the inverse transform circuit 118. This reconstructed prediction residual is combined with the block predictor 120 to generate an unfiltered reconstructed pixel for the current video block.
To improve coding efficiency and visual quality, an in-loop filter 115 is typically used. For example, a deblocking filter is available in AVC, HEVC, and the current version of VVC. In HEVC, an additional in-loop filter called SAO (sample adaptive offset) is defined to further improve coding efficiency. In the latest version of the VVC standard, yet another in-loop filter called ALF (adaptive loop filter), which is highly likely to be included in the final standard, is being actively studied.
These in-loop filter operations are optional. Performing these operations helps to improve coding efficiency and visual quality. They may also be turned off as decisions are made by the encoder 100 to reduce computational complexity.
It should be noted that intra prediction is typically based on unfiltered reconstructed pixels, whereas inter prediction is based on filtered reconstructed pixels if these filter options are turned on by the encoder 100.
Fig. 2 is a block diagram illustrating an exemplary decoder 200 that may be used in connection with various video coding standards. The decoder 200 is similar to the reconstruction related part residing in the encoder 100 of fig. 1. In the decoder 200 (fig. 2), an input video bitstream 201 is first decoded by an entropy decoding circuit 202 to derive quantized coefficient levels and prediction related information. These quantized coefficient levels are processed by inverse quantization circuit 204 and inverse transform circuit 206 to obtain a reconstructed prediction residual. The block predictor mechanism implemented in the intra/inter mode selector 212 is configured to perform either the intra prediction process 208 or the motion compensation process 210 based on the decoded prediction information. The set of unfiltered reconstructed pixels is obtained by adding the reconstructed prediction residual from the inverse transform circuit 206 to the prediction output generated by the block predictor mechanism using an adder 214. The reconstructed block may also pass through an in-loop filter 209 before being stored in a picture buffer 213 that serves as a reference picture storage. The reconstructed video in the picture buffer 213 may then be sent out to drive a display device and used to predict future video blocks. With the in-loop filter 209 on, a filtering operation is performed on these reconstructed pixels to derive the final reconstructed video output 222.
Referring again to fig. 1, the video signal input to the encoder 100 is processed block by block. Each block is called a Coding Unit (CU). In VTM-1.0, a CU may be as large as 128 × 128 pixels. In High Efficiency Video Coding (HEVC), the Joint Exploration test Model (JEM), and Versatile Video Coding (VVC), the basic unit for compression is called a Coding Tree Unit (CTU). However, in contrast to the HEVC standard, which partitions blocks based only on quadtrees, in the VVC standard one Coding Tree Unit (CTU) is split into CUs based on a quadtree/binary-tree/ternary-tree structure to accommodate different local characteristics. In addition, the concept of multiple partition unit types in the HEVC standard does not exist in the VVC standard; that is, there is no separation of CU, Prediction Unit (PU), and Transform Unit (TU) in the VVC standard. Instead, each CU is always used as the basic unit for both prediction and transform without further partitioning. For the 4:2:0 chroma format, the maximum CTU size of HEVC and JEM is defined as a maximum of 64 × 64 luma pixels and two 32 × 32 chroma pixel blocks. The maximum luma block size allowed in a CTU is specified to be 128 × 128 (although the maximum size of a luma transform block is 64 × 64).
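The size limits quoted above can be expressed as a small helper. The constant and function names below are illustrative assumptions, not identifiers from the VVC specification; the check only restates the 128 × 128 CU versus 64 × 64 transform-block limit.

```python
# Illustrative constants taken from the text: a luma CU may reach 128x128,
# but a luma transform block is capped at 64x64.
MAX_CU_LUMA = 128
MAX_TRANSFORM_LUMA = 64

def needs_transform_split(width, height):
    """A CU exceeding the transform-block limit in either dimension must be
    split into smaller transform blocks before the transform is applied."""
    return width > MAX_TRANSFORM_LUMA or height > MAX_TRANSFORM_LUMA

# A full-size 128x128 CU needs splitting; a 64x64 CU does not.
```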
FIG. 3 illustrates five exemplary block partitions for a multi-type tree structure. Exemplary block partitions of these five types include quad partition 301, horizontal binary partition 302, vertical binary partition 303, horizontal ternary partition 304, and vertical ternary partition 305. In the case of using a multi-type tree structure, one CTU is first partitioned using a quad tree structure. The leaf nodes of each quadtree may then be further partitioned by binary and ternary tree structures.
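The five multi-type-tree splits can be sketched as a function returning the child block dimensions. The mode names are illustrative assumptions; the 1:2:1 ratio for the ternary splits follows the VVC multi-type-tree design.

```python
def split(width, height, mode):
    """Return the (width, height) of each child block for the five
    multi-type-tree partitions: quad, binary (horizontal/vertical), and
    ternary (horizontal/vertical, with a 1:2:1 size ratio)."""
    if mode == "quad":
        return [(width // 2, height // 2)] * 4
    if mode == "horz_bin":
        return [(width, height // 2)] * 2
    if mode == "vert_bin":
        return [(width // 2, height)] * 2
    if mode == "horz_tern":  # 1/4, 1/2, 1/4 of the height
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    if mode == "vert_tern":  # 1/4, 1/2, 1/4 of the width
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    raise ValueError(f"unknown split mode: {mode}")

# Every split preserves total area, e.g. a 32x32 block under a vertical
# ternary split yields 8x32, 16x32, and 8x32 children.
```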
Using one or more of the exemplary block partitions 301, 302, 303, 304, and 305 of fig. 3, spatial prediction and/or temporal prediction may be performed using the configuration shown in fig. 1. Spatial prediction (or "intra prediction") uses pixels from samples of already-encoded neighboring blocks (referred to as reference samples) in the same video picture/slice to predict the current video block. Spatial prediction reduces the spatial redundancy inherent in video signals.
Temporal prediction (also referred to as "inter prediction" or "motion compensated prediction") uses reconstructed pixels from an encoded video picture to predict a current video block. Temporal prediction reduces temporal redundancy inherent in video signals. The temporal prediction signal for a given CU is typically sent (signaled) by one or more Motion Vectors (MVs) that indicate the amount and direction of motion between the current CU and its temporal reference. Furthermore, if multiple reference pictures are supported, one reference picture index is additionally sent for identifying from which reference picture in the reference picture store the temporal prediction signal comes.
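The motion-compensated fetch described above can be sketched for the integer-pel case. This is a simplified illustration: real codecs also perform sub-pel interpolation and reference-index selection, both omitted here, and the function name is an assumption.

```python
def motion_compensate(ref_frame, x, y, mv_x, mv_y, w, h):
    """Copy a w x h predictor block from a reconstructed reference frame,
    displaced from (x, y) by an integer-pel motion vector (mv_x, mv_y).

    ref_frame is a list of pixel rows; no sub-pel interpolation is done.
    """
    return [
        row[x + mv_x : x + mv_x + w]
        for row in ref_frame[y + mv_y : y + mv_y + h]
    ]

# A 4x4 reference frame where pixel (r, c) has value 10*r + c:
ref = [[r * 10 + c for c in range(4)] for r in range(4)]
pred = motion_compensate(ref, 0, 0, 1, 1, 2, 2)  # -> [[11, 12], [21, 22]]
```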
After spatial and/or temporal prediction has been made, the intra/inter mode decision circuit 121 in the encoder 100 selects the best prediction mode, for example, based on a rate-distortion optimization method. Block predictor 120 is then subtracted from the current video block; and the resulting prediction residual is decorrelated using transform circuitry 102 and quantization circuitry 104. The resulting quantized residual coefficients are inverse quantized by inverse quantization circuit 116 and inverse transformed by inverse transform circuit 118 to form the reconstructed residual, which is then added back to the prediction block to form the reconstructed signal for the CU. Further, in-loop filtering 115, such as a deblocking filter, Sample Adaptive Offset (SAO), and/or adaptive in-loop filter (ALF), may be applied to the reconstructed CU before being placed into reference picture storage of picture buffer 117 and used to encode future video blocks. To form the output video bitstream 114, the (inter or intra) coding mode, prediction mode information, motion information, and quantized residual coefficients are all sent to the entropy coding unit 106 to be further compressed and packed to form the bitstream.
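The rate-distortion mode decision mentioned above can be sketched as minimizing a Lagrangian cost D + λR over candidate modes. The dictionary layout, λ value, and cost units below are illustrative assumptions, not values from any encoder.

```python
def best_mode(candidates, lam=0.5):
    """Rate-distortion optimization sketch: choose the candidate mode that
    minimizes distortion + lambda * rate (the Lagrangian RD cost)."""
    return min(candidates, key=lambda m: m["distortion"] + lam * m["rate"])

candidates = [
    {"name": "intra", "distortion": 100, "rate": 40},   # cost 120
    {"name": "inter", "distortion": 80,  "rate": 100},  # cost 130
]
choice = best_mode(candidates)  # "intra" wins despite higher distortion
```

The trade-off is the point: a mode with worse distortion can still win if it needs far fewer bits, and λ tunes how strongly bits are penalized.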
The basic intra prediction scheme applied in the VVC standard remains substantially the same as that of the HEVC standard, except that several modules, such as the intra sub-partition (ISP) coding mode, extended intra prediction with wide-angle intra directions, position-dependent intra prediction combining (PDPC), and 4-tap intra interpolation, are further extended and/or improved in the VVC standard. One broad aspect of the present application is directed to improvements in the existing ISP design in the VVC standard. In addition, other coding tools included in the VVC standard that are closely related to the techniques proposed in the present application (such as those in the intra prediction and transform coding processes) will be discussed in detail below.
Intra prediction mode with wide-angle intra direction
As in the HEVC standard, the VVC standard predicts the samples of a current CU using a set of previously decoded samples that are adjacent to (i.e., above or to the left of) the CU. However, to capture the finer edge directions present in natural video (especially for video content with high resolution, e.g., 4K), the number of angular intra modes is extended from 33 modes in the HEVC standard to 93 modes in the VVC standard. In addition to the angular directions, both the HEVC standard and the VVC standard provide a planar mode (which assumes a gradually changing surface with horizontal and vertical slopes derived from the boundaries) and a DC mode (which assumes a flat surface).
Fig. 4 illustrates an exemplary set of intra modes 400 for use with the VVC standard, and fig. 5 illustrates a set of multiple reference lines for intra prediction. Referring to fig. 4, the exemplary set of intra modes 400 includes modes 0, 1, -14, -12, -10, -8, -6, -4, -2, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, and 80. Mode 0 corresponds to the planar mode and mode 1 corresponds to the DC mode. Similar to the intra prediction process of the HEVC standard, all defined intra modes in the VVC standard (i.e., the planar, DC, and angular direction modes) utilize a set of neighboring reconstructed samples above and to the left of the prediction block as a reference for intra prediction. However, unlike the HEVC standard, which uses only the nearest row/column of reconstructed samples (row 0, 501 in fig. 5) as a reference, multiple reference lines (MRL) are introduced in VVC, with two additional rows/columns (i.e., row 1, 503 and row 3, 505 in fig. 5) being usable for the intra prediction process. The index of the selected reference row/column is signaled from encoder 100 (fig. 1) to decoder 200 (fig. 2). When a non-nearest row/column is selected in fig. 5, such as row 1, 503 or row 3, 505, the planar and DC modes of fig. 4 are excluded from the set of intra modes that can be used to predict the current block.
Fig. 6A shows a first set of reference samples and angular directions 602, 604 for intra prediction of a wide rectangular block (width-to-height ratio W/H equal to 2). The first set of reference samples comprises a first sample 601 and a second sample 603. Fig. 6B shows a second set of reference samples and angular directions 606, 608 for intra prediction of a tall rectangular block (W/H equal to 1/2). The second set of reference samples comprises a third sample 605 and a fourth sample 607. Fig. 6C shows a third set of reference samples and angular directions 610, 612 for intra prediction of a square block (W = H). The third set of reference samples comprises a fifth sample 609 and a sixth sample 610. Assuming that the nearest neighboring row/column is utilized, fig. 6C shows the locations of the reference samples that may be used in the VVC standard to derive the prediction samples for an intra block. As shown in fig. 6C, since the quad/binary/ternary tree partition structure is applied, in the context of the VVC standard there are rectangular coding blocks for the intra prediction process in addition to square coding blocks.
Since the width and height of a given block may not be equal, different sets of angular directions are selected for different block shapes, which is also referred to as wide-angle intra prediction. Specifically, for both square and rectangular coding blocks, in addition to the planar and DC modes, 65 of the 93 angular directions are supported for each block shape, as shown in table 1. Such a design not only effectively captures the directional structures that typically appear in video (by adaptively selecting the angular directions according to the block shape), but also ensures that a total of 67 intra modes (i.e., the planar mode, the DC mode, and 65 angular directions) are enabled for each coding block. This enables good efficiency of signaling intra modes while providing a consistent design across different block sizes.
Table 1: Angular directions selected in VVC for intra prediction of different block shapes
[Table 1 is presented as an image (Figure BDA0003177332060000081) in the original publication and is not reproduced here.]
Position dependent intra prediction combining
As described above, intra-predicted samples are generated from a set of unfiltered or filtered neighboring reference samples, which may result in discontinuities along the block boundaries between the current coding block and its neighboring blocks. To address this issue, boundary filtering is applied in the HEVC standard for the DC, horizontal (i.e., mode 18 of fig. 4), and vertical (i.e., mode 50) prediction modes by combining the first row/column of prediction samples with the unfiltered reference samples, using either a 2-tap filter (for the DC mode) or a gradient-based smoothing filter (for the horizontal and vertical prediction modes).
The position-dependent intra prediction combining (PDPC) tool in the VVC standard extends the foregoing concept by employing a weighted combination of intra-predicted samples and unfiltered reference samples. In the current VVC working draft, PDPC is enabled, without signaling, for the following intra modes: planar, DC, horizontal (i.e., mode 18), vertical (i.e., mode 50), the angular directions near the lower-left diagonal (i.e., modes 2, 3, 4, ..., 10), and the angular directions near the upper-right diagonal (i.e., modes 58, 59, 60, ..., 66). Assuming that the prediction sample at coordinate (x, y) is pred(x, y), its corresponding value after PDPC is calculated as follows:
pred(x, y) = (wL × R(-1, y) + wT × R(x, -1) - wTL × R(-1, -1) + (64 - wL - wT + wTL) × pred(x, y) + 32) >> 6   (1);
where R(x, -1) and R(-1, y) represent the reference samples located above and to the left of the current sample (x, y), respectively, and R(-1, -1) represents the reference sample located at the upper-left corner of the current block.
Fig. 7 shows an exemplary set of positions of neighboring reconstructed samples used for position-dependent intra prediction combining (PDPC) of one coding block. The first reference sample 701, R(x, -1), is the reference sample located above the current prediction sample (x, y). The second reference sample 703, R(-1, y), is the reference sample located to the left of the current prediction sample (x, y). The third reference sample 705, R(-1, -1), is the reference sample located at the upper-left corner of the current prediction sample (x, y).
The reference samples including the first, second and third reference samples 701, 703 and 705 are combined with the current prediction sample (x, y) in the PDPC process. The weights wL, wT and wTL in equation (1) are adaptively selected according to the prediction mode and sample position, as follows, assuming that the size of the current coding block is W × H:
For the DC mode:
wT = 32 >> ((y << 1) >> shift), wL = 32 >> ((x << 1) >> shift), wTL = (wL >> 4) + (wT >> 4)   (2);
For the planar mode:
wT = 32 >> ((y << 1) >> shift), wL = 32 >> ((x << 1) >> shift), wTL = 0   (3);
For the horizontal mode:
wT = 32 >> ((y << 1) >> shift), wL = 32 >> ((x << 1) >> shift), wTL = wT   (4);
For the vertical mode:
wT = 32 >> ((y << 1) >> shift), wL = 32 >> ((x << 1) >> shift), wTL = wL   (5);
For the lower-left diagonal directions:
wT = 16 >> ((y << 1) >> shift), wL = 16 >> ((x << 1) >> shift), wTL = 0   (6);
For the upper-right diagonal directions:
wT = 16 >> ((y << 1) >> shift), wL = 16 >> ((x << 1) >> shift), wTL = 0   (7);
where shift = (log2(W) - 2 + log2(H) - 2 + 2) >> 2.
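As an illustration, the weight derivation in equations (2)-(7) and the blend in equation (1) can be sketched as follows. This is a simplified sketch, not VTM reference code; the function names and the string mode labels are our own, and block dimensions are assumed to be powers of two.

```python
# Hypothetical helpers; mode labels ("dc", "planar", ...) are our own naming.

def pdpc_weights(mode, x, y, w, h):
    """Return (wL, wT, wTL) for the prediction sample at (x, y) in a w x h
    block, per equations (2)-(7)."""
    # shift = (log2(W) - 2 + log2(H) - 2 + 2) >> 2, for power-of-two W, H
    shift = ((w.bit_length() - 1) + (h.bit_length() - 1) - 2) >> 2
    # Base weight 32 for planar/DC/horizontal/vertical, 16 for the
    # near-diagonal angular directions (equations (6) and (7)).
    base = 32 if mode in ("dc", "planar", "horizontal", "vertical") else 16
    wT = base >> ((y << 1) >> shift)
    wL = base >> ((x << 1) >> shift)
    if mode == "dc":
        wTL = (wL >> 4) + (wT >> 4)
    elif mode == "horizontal":
        wTL = wT
    elif mode == "vertical":
        wTL = wL
    else:  # planar and near-diagonal angular directions
        wTL = 0
    return wL, wT, wTL

def pdpc_sample(pred, r_left, r_top, r_topleft, wL, wT, wTL):
    """Equation (1): blend the predictor with unfiltered reference samples.
    r_left = R(-1, y), r_top = R(x, -1), r_topleft = R(-1, -1)."""
    return (wL * r_left + wT * r_top - wTL * r_topleft
            + (64 - wL - wT + wTL) * pred + 32) >> 6
```

For example, in a 4 × 4 block the shift evaluates to 0, so the sample at (0, 0) in DC mode gets the strongest correction weights.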
Multiple transform selection and shape adaptive transform selection
In addition to the DCT-II transform used in the HEVC standard, a Multiple Transform Selection (MTS) tool is enabled in the VVC standard by introducing the additional core transforms DCT-VIII and DST-VII. In the VVC standard, adaptive selection of these transforms is enabled at the coding block level by signaling an MTS flag in the bitstream. Specifically, when the MTS flag of a block is equal to 0, a pair of fixed transforms (e.g., DCT-II) is applied in the horizontal and vertical directions. Otherwise (when the MTS flag is equal to 1), two additional flags are signaled for the block to indicate the transform type (DCT-VIII or DST-VII) for each direction.
On the other hand, since a quad/binary/ternary tree-based block partition structure is introduced in the VVC standard, the residual distribution of intra prediction is highly correlated with the block shape. Thus, when MTS is disabled (i.e., the MTS flag is equal to 0 for a coding block), a shape-adaptive transform selection method is applied to all intra-coded blocks, in which the DCT-II and DST-VII transforms are implicitly selected depending on the width and height of the current block. More specifically, for each rectangular block, the method uses the DST-VII transform in the direction associated with the shorter side of the block and the DCT-II transform in the direction associated with the longer side of the block. For each square block, DST-VII is applied in both directions. Furthermore, to avoid introducing new transforms at different block sizes, the DST-VII transform is enabled only when the shorter side of an intra-coded block is equal to or smaller than 16. Otherwise, the DCT-II transform is always applied.
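A minimal sketch of the shape-adaptive selection rule described above, under our reading of the text (DST-VII along the shorter or equal side when that side is at most 16 samples, DCT-II otherwise); the function name and the string labels are illustrative, not from the VVC specification.

```python
def shape_adaptive_transforms(w, h):
    """Return the (horizontal, vertical) transform types for an intra-coded
    w x h block under the shape-adaptive rule described above."""
    def pick(side, other):
        # DST-VII along the shorter (or equal) side, but only up to length 16
        return "DST-VII" if side <= other and side <= 16 else "DCT-II"
    return pick(w, h), pick(h, w)
```

For instance, a 4 × 16 block would use DST-VII horizontally and DCT-II vertically, while a square block no larger than 16 × 16 would use DST-VII in both directions.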
Table 2 illustrates enabled horizontal and vertical transforms for intra-coded blocks based on a shape adaptive transform selection method in VVC.
Table 2: Shape-adaptive transform selection for intra blocks in VVC
[Table 2 is presented as an image (Figure BDA0003177332060000101) in the original publication and is not reproduced here.]
Intra-frame sub-partition coding mode
Conventional intra mode only uses reconstructed samples adjacent to one encoded block to generate intra predicted samples for that block. Based on this approach, the spatial correlation between the prediction samples and the reference samples is approximately proportional to the distance between the prediction samples and the reference samples. Therefore, the inner samples (especially the samples located in the lower right corner of the block) typically have a lower prediction quality than the samples close to the block boundary. In order to further improve the intra prediction efficiency, Short Distance Intra Prediction (SDIP) has been proposed and well studied. The method divides an intra-coded block horizontally or vertically into a plurality of sub-blocks for prediction. A square block is typically divided into four sub-blocks. For example, an 8 × 8 block may be divided into four 2 × 8 or four 8 × 2 sub-blocks. One extreme case of such sub-block based intra prediction is referred to as line-based prediction, where a block is divided into one-dimensional rows/columns for prediction. For example, one W × H (width × height) block may be divided into H sub-blocks of size W × 1 or W sub-blocks of size 1 × H for intra prediction. Each resulting row/column is encoded (as shown in fig. 6A, 6B, 6C and 7) in the same way as a normal two-dimensional (2-D) block, i.e. it is predicted by one of the available intra modes, and the prediction error is de-correlated based on the transform and quantization and sent to the decoder 200 (fig. 2) for reconstruction. Thus, reconstructed samples in one sub-block (e.g., row/column) may be used as a reference to predict samples in the next sub-block. The above process is repeated until all sub-blocks within the current block are predicted and encoded. In addition, to reduce signaling overhead, all sub-blocks within one coded block share the same intra-mode.
With SDIP, different sub-block partitions may provide different coding efficiencies. In general, row-based prediction provides the best coding efficiency because it provides the "shortest prediction distance" between different partitions; on the other hand, row-based prediction also has the worst encoding/decoding throughput for codec hardware implementation. For example, consider a block with 4 x 4 sub-blocks and the same block with 4 x 1 or 1 x 4 sub-blocks, the latter case being only one-fourth of the throughput of the former case. In HEVC, the minimum intra prediction block size for luma is 4 × 4.
Fig. 8A illustrates a set of exemplary short-distance intra prediction (SDIP) partitions for an 8 × 4 block 801, fig. 8B illustrates a set of exemplary SDIP partitions for a 4 × 8 block 803, and fig. 8C illustrates a set of exemplary SDIP partitions for an arbitrarily sized block 805. Recently, a video coding tool called intra sub-partition (ISP) prediction has been introduced into the VVC standard. Conceptually, the ISP is very similar to SDIP. Specifically, depending on the block size, the ISP divides the current coding block into 2 or 4 sub-blocks in the horizontal or vertical direction, each sub-block containing at least 16 samples.
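Under the rule just stated, the sub-partition geometry can be sketched as follows. The split-count heuristic (2 sub-blocks when the block holds exactly 32 samples, 4 otherwise, so every sub-block keeps at least 16 samples) is our reading of the text and the partitions in figs. 8A-8C, and the function name is hypothetical.

```python
def isp_partitions(w, h, direction):
    """Return the (width, height) of each ISP sub-partition of a w x h block.
    4x8 and 8x4 blocks (32 samples) split into 2 parts; larger blocks into 4,
    so that every sub-partition contains at least 16 samples."""
    n = 2 if w * h == 32 else 4
    if direction == "horizontal":   # horizontal split: sub-blocks stacked vertically
        return [(w, h // n)] * n
    return [(w // n, h)] * n        # vertical split: sub-blocks side by side
```

For example, an 8 × 4 block split vertically yields two 4 × 4 sub-partitions, matching fig. 8A, while an 8 × 8 block split horizontally yields four 8 × 2 sub-partitions.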
Taken together, fig. 8A, 8B and 8C show all possible partitioning scenarios for different coding block sizes. Furthermore, the current ISP design also includes the following main aspects to handle its interaction with other coding tools in the VVC standard:
Interaction with the wide-angle intra directions: the ISP is combined with the wide-angle intra directions. In the present design, the block size (i.e., the width/height ratio) used to determine whether to apply a normal intra direction or its corresponding wide-angle intra direction is that of the original coding block, i.e., the block before sub-block partitioning.
Interaction with multiple reference lines: the ISP cannot be turned on in conjunction with multiple reference lines. Specifically, in the current VVC signaling design, the ISP enable/disable flag is signaled after the MRL index. When an intra block has a non-zero MRL index (i.e., refers to non-nearest neighboring samples), the ISP enable/disable flag is not signaled but is instead inferred to be 0; that is, in this case, the ISP is automatically disabled for that coding block.
Interaction with the most probable modes: similar to the normal intra modes, the intra mode of an ISP block is signaled by the Most Probable Mode (MPM) mechanism. However, compared to the normal intra modes, the MPM approach of the ISP is modified as follows: 1) each ISP block enables only the intra modes contained in the MPM list and disables all other intra modes not in the MPM list; 2) for each ISP block, its MPM list excludes the DC mode and prioritizes horizontal intra modes for ISP horizontal partitions and vertical intra modes for ISP vertical partitions, respectively.
At least one non-zero Coefficient Block Flag (CBF): in the current VVC, a CBF flag is signaled for each Transform Unit (TU) to specify whether the transform block contains one or more transform coefficient levels not equal to 0. For a given block using the ISP, the decoder assumes that at least one of the sub-partitions has a non-zero CBF. For this reason, if n is the number of sub-partitions and the first n-1 sub-partitions have produced zero CBFs, the CBF of the n-th sub-partition is inferred to be 1. Therefore, the CBF of the last sub-partition does not have to be transmitted and decoded.
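The inference rule for the last sub-partition's CBF can be sketched as a decoder-side helper (the function name is hypothetical):

```python
def last_subpartition_cbf(prior_cbfs):
    """Decoder-side view of the rule above: if the first n-1 sub-partitions
    all signaled CBF == 0, the last CBF is not parsed and is inferred to be 1.
    Returns None when the last CBF must instead be read from the bitstream."""
    return 1 if not any(prior_cbfs) else None
```

In other words, the bitstream only carries the last CBF when at least one earlier sub-partition already had a non-zero CBF.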
Interaction with multiple transform selection: the ISP is mutually exclusive with the MTS, i.e., when a coding block uses the ISP, its MTS flag is not signaled but is always inferred to be 0, i.e., disabled. However, rather than always using the DCT-II transform, a fixed set of core transforms (including DST-VII and DCT-II) is implicitly applied to ISP-coded blocks based on the block size. Specifically, assuming W and H are the width and height of an ISP sub-partition, the horizontal and vertical transforms are selected according to the rules shown in table 3.
Table 3: Selected horizontal and vertical transforms for ISP blocks
[Table 3 is presented as an image (Figure BDA0003177332060000121) in the original publication and is not reproduced here.]
Cross component linear model prediction
FIG. 9A is a view of chrominance values as a function of luminance values, where the view is used to derive a set of linear model parameters. More specifically, a straight-line relation 901 between chrominance values and luminance values is used to derive a set of linear model parameters α and β as follows. To reduce cross-component redundancy, a cross-component linear model (CCLM) prediction mode is employed in the VVC, and chroma samples for this prediction mode are predicted based on reconstructed luma samples of the same CU using the following linear model:
predC(i, j) = α · recL′(i, j) + β   (8);
where predC(i, j) denotes the predicted chroma samples in the CU and recL′(i, j) denotes the downsampled reconstructed luma samples of the same CU. The linear model parameters α and β are derived from the straight-line relationship 901 between the luma and chroma values of two samples: the minimum luma sample A (xA, yA) and the maximum luma sample B (xB, yB) within a set of neighboring luma samples, as shown in fig. 9A. Here, xA and yA are the x-coordinate value (i.e., luma value) and the y-coordinate value (i.e., chroma value) of sample A, and xB and yB are the x-coordinate value and the y-coordinate value of sample B. The linear model parameters α and β are obtained according to the following equations:
α = (yB - yA) / (xB - xA);
β = yA - α · xA.
This method is also known as the min-Max method. The division in the above equation can be avoided by replacing it with a multiplication and a shift.
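A floating-point sketch of the min-Max derivation of α and β and the chroma prediction in equation (8); the standard itself replaces the division with an integer multiply-and-shift, and the function names here are illustrative.

```python
def cclm_parameters(luma, chroma):
    """min-Max derivation of (alpha, beta) from co-located neighbor samples.
    Floating point is used for clarity; the standard avoids the division
    with a multiplication and a shift."""
    i_min = min(range(len(luma)), key=luma.__getitem__)
    i_max = max(range(len(luma)), key=luma.__getitem__)
    xA, yA = luma[i_min], chroma[i_min]   # sample A: minimum luma neighbor
    xB, yB = luma[i_max], chroma[i_max]   # sample B: maximum luma neighbor
    alpha = (yB - yA) / (xB - xA) if xB != xA else 0.0
    beta = yA - alpha * xA
    return alpha, beta

def predict_chroma(rec_luma, alpha, beta):
    """Equation (8): predC(i, j) = alpha * recL'(i, j) + beta."""
    return alpha * rec_luma + beta
```

For example, if the neighboring luma values are (10, 20, 30) with chroma values (15, 25, 35), the fitted line has α = 1 and β = 5, so a reconstructed luma sample of 20 predicts a chroma value of 25.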
Fig. 9B shows the positions of the samples used to derive the linear model parameters of fig. 9A. For a square coding block, the above two equations for the linear model parameters α and β are applied directly. For a non-square coding block, the neighboring samples of the longer boundary are first sub-sampled to have the same number of samples as the shorter boundary. Fig. 9B shows the positions of the left and above samples involved in the CCLM mode and the samples of the current block, including the N × N set of chroma samples 903 and the 2N × 2N set of luma samples 905. In addition to computing the linear model coefficients using the above and left templates together, these templates can be used alternatively in two other LM modes, called the LM_A and LM_L modes.
In the LM_A mode, only the pixel samples in the above template are used to calculate the linear model coefficients. To obtain more samples, the above template is expanded to (W + W). In the LM_L mode, only the pixel samples in the left template are used to compute the linear model coefficients. To obtain more samples, the left template is expanded to (H + H). It should be noted that when the above reference line is located at the CTU boundary, only one luma line (the general line buffer in intra prediction) is used to form the downsampled luma samples.
For chroma intra mode coding, a total of 8 intra modes are allowed. These modes include five conventional intra modes and three cross-component linear model modes (CCLM, LM_A, and LM_L). The chroma mode signaling and derivation process is shown in table 4. Chroma mode coding directly depends on the intra prediction mode of the corresponding luma block. Since separate block partition structures for the luma and chroma components are enabled in I slices, one chroma block may correspond to multiple luma blocks. Therefore, for the chroma DM mode, the intra prediction mode of the corresponding luma block covering the center position of the current chroma block is directly inherited.
Table 4: Derivation of the chroma prediction mode from the luma mode when CCLM is enabled
[Table 4 is presented as an image (Figure BDA0003177332060000141) in the original publication and is not reproduced here.]
Although the ISP tool in VVC can improve intra prediction efficiency, there is still room to further improve the performance of VVC. At the same time, some parts of the existing ISP design would benefit from further simplification to provide a more efficient codec hardware implementation and/or improved coding efficiency. In the present application, several approaches are proposed to further increase the ISP coding efficiency, simplify the existing ISP design, and/or facilitate improved hardware implementations.
Independent (or parallel) sub-partition predictor generation for ISPs
Fig. 10 shows that only reference samples outside the current coding block 1000 are used to generate the reference samples for intra prediction 1007 for all sub-partitions. The current coding block 1000 includes a first sub-partition 1001, a second sub-partition 1002, a third sub-partition 1003, and a fourth sub-partition 1004. In the present application, it is proposed to generate the intra prediction for each sub-partition 1001, 1002, 1003, and 1004 independently. In other words, all predictors for the sub-partitions 1001, 1002, 1003, and 1004 can be generated in a parallel manner. In one embodiment, the predictors for all the sub-partitions are generated using the same method as used in the conventional non-sub-partitioned intra mode. Specifically, the reconstructed samples of one sub-partition are not used to generate the intra prediction samples for any other sub-partition in the same coding unit; all predictors for each sub-partition 1001, 1002, 1003, 1004 are generated using reference samples of the current coding block 1000, as shown in fig. 10.
For ISP mode in the VVC standard, the width of each sub-partition may be less than or equal to 2. A detailed example is as follows. According to the ISP mode in the VVC standard, 2 × N (width × height) subblock prediction is not allowed to depend on a reconstructed value of a previously decoded 2 × N subblock of the encoding block such that the minimum width of subblock prediction becomes four samples. For example, an 8 × 8 coding block encoded using vertically divided ISP is divided into 4 prediction areas of size 2 × 8, and the two 2 × 8 prediction areas on the left are merged into the first 4 × 8 prediction area for intra prediction. Transform circuitry 102 (fig. 1) is applied to each 2 x 8 partition.
According to this example, the two 2 × 8 prediction regions on the right side are merged into the second 4 × 8 prediction region for intra prediction. Transform circuitry 102 is applied to each 2 x 8 partition. It should be noted that the first 4 x 8 prediction region uses neighboring pixels of the current coding block to generate the intra predictor, while the second 4 x 8 region uses reconstructed pixels from the first 4 x 8 region (located on the left side of the second 4 x 8 region) or neighboring pixels from the current coding block (located on top of the second 4 x 8 region).
In another embodiment, only the Horizontal (HOR) prediction mode (shown as mode 18 in fig. 4) and the prediction modes with mode indices less than 18 (as shown in fig. 4) may be used to form the intra prediction for horizontal sub-partitions; and only the Vertical (VER) prediction mode (i.e., mode 50 in fig. 4) and the prediction modes with mode indices greater than 50 (as shown in fig. 4) may be used to form the intra prediction for vertical sub-partitions. Therefore, the intra prediction for each horizontal sub-partition can be performed independently and in parallel using the HOR prediction mode (mode 18 in fig. 4) and all angular prediction modes with mode indices less than 18. Likewise, the intra prediction for each vertical sub-partition can be performed independently and in parallel using the VER prediction mode (mode 50 in fig. 4) and all angular prediction modes with mode indices greater than 50.
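The allowed-mode restriction in this embodiment can be sketched as a simple predicate; the angular mode-index bounds (2 to 66 in the conventional 67-mode numbering) and the function name are assumptions for illustration, since the text only fixes the thresholds 18 and 50.

```python
HOR_MODE, VER_MODE = 18, 50  # mode indices from fig. 4

def mode_allows_parallel_prediction(mode, split):
    """True when the angular mode lets every sub-partition be predicted only
    from reference samples outside the coding block, so all sub-partitions
    can be processed in parallel. Bounds 2..66 are assumed for illustration."""
    if split == "horizontal":
        return 2 <= mode <= HOR_MODE       # HOR and mode indices below 18
    return VER_MODE <= mode <= 66          # VER and mode indices above 50
```

For a horizontal split, mode 18 (HOR) qualifies but mode 19 does not, because mode 19 would need reference samples from the sub-partition above.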
Intra-prediction mode coding for luma component of ISP
In the present application, it is proposed to allow only N of all possible intra prediction modes (N being a positive integer) to be used for the luma component of an ISP-coded block. In one embodiment, only one mode is allowed for the luma component of the ISP-coded block. This single allowed mode may be, for example, the planar mode. In another example, the single allowed mode may be the DC mode. In yet another example, the single allowed mode may be one of the HOR prediction mode, the VER prediction mode, and the Diagonal (DIA, mode 34 in fig. 4) intra prediction mode.
In another embodiment, only one mode is allowed for the luma component of the ISP coded block, and this mode may be different depending on the orientation of the sub-partition, i.e., whether it is a horizontal sub-partition or a vertical sub-partition. For example, for horizontal sub-partitions, only the HOR prediction mode is allowed; whereas for vertical sub-partitions only VER prediction mode is allowed. In yet another example, for horizontal sub-partitions only the VER prediction mode is allowed, and for vertical sub-partitions only the HOR prediction mode is allowed.
In yet another example, only two modes are allowed for the luma component of an ISP coded block. Each of the respective modes may be selected in response to the respective sub-partition direction, i.e. whether the sub-partition direction is horizontal or vertical. For example, for horizontal sub-partitions, only the PLANAR and HOR prediction modes are allowed, while for vertical sub-partitions, only the PLANAR and VER prediction modes are allowed.
To signal the N modes allowed for the luma component of ISP-coded blocks, the conventional Most Probable Mode (MPM) mechanism is not used. Instead, it is proposed to signal the intra modes for the ISP-coded blocks using a binary codeword determined for each intra mode. The codeword may be generated using any of a number of different processes, including a Truncated Binary (TB) binarization process, a fixed-length binarization process, a Truncated Rice (TR) binarization process, a k-th order Exp-Golomb (EGk) binarization process, a limited EGk binarization process, and so on. These binary codeword generation processes are well defined in the HEVC specification. Truncated Rice with the Rice parameter equal to zero is also referred to as truncated unary binarization. A set of examples of codewords using different binarization methods is given in table 5.
Table 5: Binary codewords generated using different binarization methods
[Table 5 is presented as an image (Figure BDA0003177332060000161) in the original publication and is not reproduced here.]
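For reference, truncated unary (Truncated Rice with Rice parameter 0) and Truncated Binary codeword construction can be sketched as below. This follows the common HEVC-style definitions rather than reproducing the specification text, and the function names are our own.

```python
def truncated_unary(value, max_value):
    """Truncated Rice with Rice parameter 0 (truncated unary binarization):
    value 1-bits, then a terminating 0-bit unless value == max_value."""
    return "1" * value + ("" if value == max_value else "0")

def truncated_binary(value, n):
    """Truncated Binary code for value in [0, n-1], HEVC-style: the first u
    values get floor(log2(n))-bit codes, the rest get one extra bit."""
    k = n.bit_length() - 1        # floor(log2(n))
    u = (1 << (k + 1)) - n        # count of shorter, k-bit codewords
    if value < u:
        return format(value, "b").zfill(k) if k else ""
    return format(value + u, "b").zfill(k + 1)
```

For example, with an alphabet of 5 symbols, Truncated Binary assigns the 2-bit codes 00, 01, 10 to values 0-2 and the 3-bit codes 110, 111 to values 3-4.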
In yet another embodiment, the MPM derivation process for the conventional intra mode is directly reused for the ISP mode, and the signaling method of the MPM flag and MPM index remains the same as in the existing ISP design.
Intra-prediction mode coding for chroma components of ISP
In the present application, it is proposed to allow only Nc of all possible chroma intra prediction modes (Nc being a positive integer) to be used for the chroma components of an ISP-coded block. In one embodiment, only one mode is allowed for the chroma components of the ISP-coded block. For example, the single allowed mode may be the Direct Mode (DM). In another example, the single allowed mode may be the LM mode. In yet another example, the single allowed mode may be one of the HOR prediction mode or the VER prediction mode. The Direct Mode (DM) applies the same intra prediction mode used by the corresponding luma block to the chroma block.
In another embodiment, only two modes are allowed for the chroma components of the ISP encoded block. In one example, only DM and LM are allowed for the chroma components of the ISP coded block.
In yet another embodiment, only four modes are allowed for the chroma components of the ISP-coded block. In one example, only DM, LM, LM_L, and LM_A are allowed for the chroma components of the ISP-coded block.
To signal the Nc modes allowed for the chroma components of ISP-coded blocks, the conventional MPM mechanism is not used. Instead, a fixed binary codeword is used to indicate the selected chroma mode in the bitstream. For example, the chroma intra prediction mode for the ISP-coded block may be signaled using a determined binary codeword. The codeword may be generated using different processes, including a Truncated Binary (TB) binarization process, a fixed-length binarization process, a Truncated Rice (TR) binarization process, a k-th order Exp-Golomb (EGk) binarization process, a limited EGk binarization process, and so on.
Coding block size for chroma components of ISP coding blocks
In the present application, it is proposed that sub-partition encoding is not allowed for the chroma components of ISP encoded blocks. Instead, normal intra prediction based on the entire block is used for the chroma components of the ISP coded block. In other words, for an ISP coded block, only its luma component is sub-partitioned, and its chroma components are not sub-partitioned.
Combination of ISP and inter-frame prediction
Fig. 11 illustrates a combination of the inter-prediction samples and the intra-prediction samples of the first sub-partition 1001 of fig. 10. Likewise, fig. 12 shows a combination of the inter-prediction samples and the intra-prediction samples of the second sub-partition 1002 of fig. 10. More specifically, to further improve coding efficiency, a new prediction mode is provided in which the prediction is generated as a weighted combination (e.g., a weighted average) of the ISP mode and an inter prediction mode. The intra predictor generation for the ISP mode is the same as that described above in connection with fig. 10. The inter predictor may be generated through the merge mode or the inter mode process.
In the illustrative example of fig. 11, the inter prediction samples 1101 for the current block (including all sub-partitions) are generated by motion compensation using the merge candidate indicated by the merge index. The intra prediction samples 1103 are generated by intra prediction using the signaled intra mode. It should be noted that the process may use the reconstructed samples of previous sub-partitions to generate the intra predictor samples for the non-first sub-partitions (e.g., the second sub-partition 1002), as shown in fig. 12. After the inter and intra predictor samples are generated, they are weighted-averaged to generate the final prediction samples for the sub-partition. The combined mode can be considered an intra mode. Alternatively, the combined mode may be considered an inter mode or a merge mode, rather than an intra mode.
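The weighted-average step can be sketched as follows. The equal default weights and the function name are assumptions for illustration, since the text does not fix the weighting rule.

```python
def combined_prediction(inter_samples, intra_samples, w_intra=1, w_inter=1):
    """Weighted average of the inter and intra predictors for one
    sub-partition, with round-half-up integer averaging. Equal weights
    are an assumption; the actual weighting rule is design-dependent."""
    total = w_intra + w_inter
    return [(w_intra * a + w_inter * b + total // 2) // total
            for a, b in zip(intra_samples, inter_samples)]
```

For example, with equal weights, intra samples (100, 50) and inter samples (100, 60) combine to (100, 55).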
CBF signaling for ISP coding blocks
To simplify the ISP design, it is proposed in this application to always signal the CBF for the last sub-partition.
In another embodiment of the present application, it is proposed not to send the CBF for the last sub-partition, but to deduce its value at the decoder side. For example, the value of CBF for the last sub-partition is always inferred to be 1. In another example, the value of CBF for the last sub-partition is always inferred to be zero.
According to an embodiment of the present application, there is provided a method of video encoding including independently generating a respective intra prediction for each of a plurality of corresponding sub-partitions, wherein each respective intra prediction is generated using a plurality of reference samples from a current encoding block.
In some examples, reconstructed samples from a first sub-partition of the plurality of corresponding sub-partitions are not used to generate respective intra predictions for any other sub-partition of the plurality of corresponding sub-partitions.
In some examples, a width of each of the plurality of respective sub-partitions is less than or equal to 2.
In some examples, the plurality of respective sub-partitions includes a plurality of vertical sub-partitions and a plurality of horizontal sub-partitions, and the method further includes generating the first set of intra-predictions for the plurality of horizontal sub-partitions using only the horizontal prediction mode and generating the second set of intra-predictions for the plurality of vertical sub-partitions using only the vertical prediction mode.
In some examples, the horizontal prediction mode is performed using a mode index less than 18.
In some examples, the vertical prediction mode is performed using a mode index greater than 50.
In some examples, the horizontal prediction mode is performed independently and in parallel for each of a plurality of horizontal sub-partitions.
In some examples, the vertical prediction modes are performed independently and in parallel for each of a plurality of vertical sub-partitions.
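A minimal sketch of the direction-dependent mode restriction described above, assuming a 67-mode intra scheme in which indices below 18 are horizontal-leaning and indices above 50 are vertical-leaning; the function and its interface are illustrative only:

```python
def allowed_modes(direction, candidate_modes):
    """Filter candidate intra mode indices by sub-partition direction.

    Per the text above: horizontal sub-partitions keep only modes with
    index < 18 and vertical sub-partitions keep only modes with
    index > 50, so each sub-partition can be predicted independently
    and in parallel from reference samples of the current block.
    """
    if direction == "horizontal":
        return [m for m in candidate_modes if m < 18]
    if direction == "vertical":
        return [m for m in candidate_modes if m > 50]
    raise ValueError("direction must be 'horizontal' or 'vertical'")
```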
In some examples, the plurality of respective sub-partitions includes a last sub-partition, and the method further includes signaling a Coded Block Flag (CBF) value for the last sub-partition.
In some examples, the plurality of respective sub-partitions includes a last sub-partition, and the method further includes inferring a Coded Block Flag (CBF) value for the last sub-partition at the decoder.
In some examples, the Coded Block Flag (CBF) value is always inferred to be 1.
In some examples, the Coded Block Flag (CBF) value is always inferred to be zero.
According to another embodiment of the present application, there is provided a video encoding method including: for a luma component of an intra sub-partition (ISP) encoded block, generating respective intra predictions for each of a plurality of corresponding sub-partitions using only N of M possible intra prediction modes, wherein M and N are positive integers and N is less than M.
In some examples, a width of each of the plurality of respective sub-partitions is less than or equal to 2.
In some examples, N is equal to 1, such that only a single mode is allowed for the luminance component.
In some examples, the single mode is a planar mode.
In some examples, the single mode is a DC mode.
In some examples, the single mode is any one of a Horizontal (HOR) prediction mode, a Vertical (VER) prediction mode, and a Diagonal (DIA) prediction mode.
In some examples, the single mode is selected in response to a sub-partition direction, wherein the Horizontal (HOR) prediction mode is selected in response to the sub-partition direction being horizontal, and the Vertical (VER) prediction mode is selected in response to the sub-partition direction being vertical.
In some examples, the single mode is selected in response to a sub-partition direction, wherein the Horizontal (HOR) prediction mode is selected in response to the sub-partition direction being vertical, and the Vertical (VER) prediction mode is selected in response to the sub-partition direction being horizontal.
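The two direction-dependent single-mode selection rules above can be sketched as follows; the `aligned` flag and the string mode labels are hypothetical conveniences, standing in for the codec's actual mode indices (e.g., 18 and 50 in a 67-mode scheme):

```python
def select_single_mode(direction, aligned=True):
    """Pick the single allowed luma mode from the sub-partition direction.

    aligned=True implements the first variant (HOR for horizontal
    sub-partitions, VER for vertical ones); aligned=False implements
    the second, crossed variant.
    """
    if direction not in ("horizontal", "vertical"):
        raise ValueError("unknown sub-partition direction")
    same = "HOR" if direction == "horizontal" else "VER"
    other = "VER" if direction == "horizontal" else "HOR"
    return same if aligned else other
```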
In some examples, N is equal to 2, such that two modes are allowed for the luma component.
In some examples, a first set of two modes is selected in response to a first sub-partition direction and a second set of two modes is selected in response to a second sub-partition direction.
In some examples, the first set of two modes comprises a planar mode and a Horizontal (HOR) prediction mode, the first sub-partition direction comprises a horizontal sub-partition, the second set of two modes comprises a planar mode and a Vertical (VER) prediction mode, the second sub-partition direction comprises a vertical sub-partition.
In some examples, each respective mode of the N modes is signaled using a respective binary codeword from a predetermined set of binary codewords.
In some examples, the predetermined set of binary codewords is generated using at least one of a Truncated Binary (TB) binarization process, a fixed-length binarization process, a Truncated Rice (TR) binarization process, a truncated unary binarization process, a k-th order Exp-Golomb (EGk) binarization process, and a limited EGk binarization process.
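Of the listed binarization processes, Truncated Binary (TB) binarization admits a compact sketch. This assumes the common definition (not restated in the text): with n symbols, the first u = 2^(k+1) - n values get k-bit codewords, where k = floor(log2(n)), and the remaining values get (k+1)-bit codewords:

```python
import math

def truncated_binary(value, n):
    """Truncated Binary (TB) binarization of `value` in [0, n-1].

    Returns the codeword as a bit string. Short k-bit codewords go to
    the first u symbols; the rest are coded as (value + u) in k+1 bits,
    which keeps the set of codewords prefix-free.
    """
    assert 0 <= value < n
    k = int(math.floor(math.log2(n)))
    u = (1 << (k + 1)) - n
    if value < u:
        return format(value, "0{}b".format(k)) if k > 0 else ""
    return format(value + u, "0{}b".format(k + 1))
```

For n = 5 (k = 2, u = 3), the codewords are 00, 01, 10, 110, 111, so the first three symbols cost one bit less than the last two.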
According to still another embodiment of the present application, there is provided a video encoding method including: for chroma components of an intra sub-partition (ISP) encoded block, intra prediction is generated using only N of M possible intra prediction modes, where M and N are positive integers and N is less than M.
In some examples, N is equal to 1, such that only a single mode is allowed for the chroma component.
In some examples, the single mode is a Direct Mode (DM), a Linear Model (LM) mode, a Horizontal (HOR) prediction mode, or a Vertical (VER) prediction mode.
In some examples, N is equal to 2, such that two modes are allowed for the chroma component.
In some examples, N is equal to 4, such that four modes are allowed for each chroma component.
In some examples, each respective mode of the N modes is signaled using a respective binary codeword from a predetermined set of binary codewords.
In some examples, the predetermined set of binary codewords is generated using at least one of a Truncated Binary (TB) binarization process, a fixed-length binarization process, a Truncated Rice (TR) binarization process, a truncated unary binarization process, a k-th order Exp-Golomb (EGk) binarization process, and a limited EGk binarization process.
According to still another embodiment of the present application, there is provided a video encoding method including: for the luma component, respective luma intra predictions are generated for each of a plurality of corresponding sub-partitions of an entire intra sub-partition (ISP) encoded block, and for the chroma component, chroma intra predictions are generated for the entire intra sub-partition (ISP) encoded block.
In some examples, a width of each of the plurality of respective sub-partitions is less than or equal to 2.
According to yet another embodiment of the present application, there is provided a video encoding method including generating a first prediction using an intra sub-partition mode; generating a second prediction using the inter prediction mode; and combining the first prediction and the second prediction to generate a final prediction by applying a weighted average to the first prediction and the second prediction.
In some examples, the second prediction is generated using at least one of a merge mode and an inter mode.
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. The computer readable medium may comprise a computer readable storage medium corresponding to a tangible medium, such as a data storage medium, or a communication medium, including any medium that facilitates transfer of a computer program from one place to another, such as according to a communication protocol. In this manner, the computer-readable medium may generally correspond to (1) a non-transitory tangible computer-readable storage medium or (2) a communication medium such as a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the embodiments described herein. The computer program product may include a computer-readable medium.
Further, the above-described methods may be implemented using an apparatus comprising one or more circuits, including an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components. The apparatus may use the circuits in combination with other hardware or software components to perform the above-described methods. Each module, sub-module, unit, or sub-unit disclosed above may be implemented, at least in part, using one or more circuits.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles thereof, and including such departures from the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise examples described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. It is intended that the scope of the invention be limited only by the claims appended hereto.

Claims (36)

1. A video encoding method, comprising:
a respective intra prediction is independently generated for each of a plurality of respective sub-partitions, wherein the respective intra prediction for each sub-partition is generated using a plurality of reference samples from a current encoding block.
2. The method of claim 1, wherein no reconstructed samples from a first sub-partition of the plurality of respective sub-partitions are used to generate respective intra predictions for any other sub-partitions of the plurality of respective sub-partitions.
3. The method of claim 2, wherein a width of each of the plurality of respective sub-partitions is less than or equal to 2.
4. The method of claim 3, wherein the plurality of respective sub-partitions includes a plurality of vertical sub-partitions and a plurality of horizontal sub-partitions, the method further comprising generating a first set of intra predictions for the plurality of horizontal sub-partitions using only horizontal prediction modes and generating a second set of intra predictions for the plurality of vertical sub-partitions using only vertical prediction modes.
5. The method of claim 4, further comprising: the horizontal prediction mode is performed using a mode index smaller than 18.
6. The method of claim 4, further comprising: the vertical prediction mode is performed using a mode index greater than 50.
7. The method of claim 5, further comprising: the horizontal prediction mode is performed independently and in parallel for each horizontal sub-partition of the plurality of horizontal sub-partitions.
8. The method of claim 5, further comprising: the vertical prediction mode is performed independently and in parallel for each of the plurality of vertical sub-partitions.
9. The method of claim 3, wherein the plurality of respective sub-partitions comprises a last sub-partition, and further comprising signaling a Coded Block Flag (CBF) value for the last sub-partition.
10. The method of claim 9, wherein the plurality of respective sub-partitions includes a last sub-partition, and the method further comprises inferring a Coded Block Flag (CBF) value for the last sub-partition at a decoder.
11. The method of claim 10, wherein the Coded Block Flag (CBF) value is always inferred to be 1.
12. The method of claim 10, wherein the Coded Block Flag (CBF) value is always inferred to be zero.
13. A video encoding method, comprising:
for a luma component of an intra sub-partition (ISP) encoded block, generating respective intra predictions for each of a plurality of corresponding sub-partitions using only N of M possible intra prediction modes, wherein M and N are positive integers and N is less than M.
14. The method of claim 13, wherein a width of each of the plurality of respective sub-partitions is less than or equal to 2.
15. The method of claim 14, wherein N is equal to 1 such that only a single mode is allowed for the luma component.
16. The method of claim 15, wherein the single mode is a planar mode.
17. The method of claim 15, wherein the single mode is a DC mode.
18. The method of claim 15, wherein the single mode is any one of a Horizontal (HOR) prediction mode, a Vertical (VER) prediction mode, and a Diagonal (DIA) prediction mode.
19. The method of claim 15, further comprising selecting the single mode in response to a sub-partition direction, wherein the Horizontal (HOR) prediction mode is selected in response to the sub-partition direction being horizontal, and the Vertical (VER) prediction mode is selected in response to the sub-partition direction being vertical.
20. The method according to claim 15, further comprising selecting the single mode in response to a sub-partition direction, wherein the Horizontal (HOR) prediction mode is selected in response to the sub-partition direction being vertical, and the Vertical (VER) prediction mode is selected in response to the sub-partition direction being horizontal.
21. The method of claim 14, wherein N is equal to 2 such that two modes are allowed for the luma component.
22. The method of claim 21, wherein a first set of two modes is selected in response to a first sub-partition direction and a second set of two modes is selected in response to a second sub-partition direction.
23. The method according to claim 22, wherein the first set of two modes comprises a planar mode and a Horizontal (HOR) prediction mode, the first sub-partition direction comprises a horizontal sub-partition, the second set of two modes comprises a planar mode and a Vertical (VER) prediction mode, the second sub-partition direction comprises a vertical sub-partition.
24. The method of claim 15, further comprising: signaling each respective mode of the N modes using a respective binary codeword from a predetermined set of binary codewords.
25. The method of claim 24, wherein the predetermined set of binary codewords is generated using at least one of a Truncated Binary (TB) binarization process, a fixed-length binarization process, a Truncated Rice (TR) binarization process, a truncated unary binarization process, a k-th order Exp-Golomb (EGk) binarization process, and a limited EGk binarization process.
26. A video encoding method, comprising:
for chroma components of an intra sub-partition (ISP) encoded block, intra prediction is generated using only N of M possible intra prediction modes, where M and N are positive integers and N is less than M.
27. The method of claim 26, wherein N is equal to 1 such that only a single mode is allowed for the chroma components.
28. The method of claim 27, wherein the single mode is a Direct Mode (DM), a Linear Model (LM) mode, a Horizontal (HOR) prediction mode, or a Vertical (VER) prediction mode.
29. The method of claim 26, wherein N is equal to 2 such that two modes are allowed for the chroma component.
30. The method of claim 26, wherein N is equal to 4 such that four modes are allowed for the chroma component.
31. The method of claim 26, further comprising: signaling each respective mode of the N modes using a respective binary codeword from a predetermined set of binary codewords.
32. The method of claim 31, wherein the predetermined set of binary codewords is generated using at least one of a Truncated Binary (TB) binarization process, a fixed-length binarization process, a Truncated Rice (TR) binarization process, a truncated unary binarization process, a k-th order Exp-Golomb (EGk) binarization process, and a limited EGk binarization process.
33. A video encoding method, comprising:
for the luma component, respective luma intra predictions are generated for each of a plurality of corresponding sub-partitions of an entire intra sub-partition (ISP) encoded block, an
For the chroma component, chroma intra prediction is generated for the entire intra sub-partition (ISP) encoded block.
34. The method of claim 33, wherein a width of each of the plurality of respective sub-partitions is less than or equal to 2.
35. A video encoding method, comprising:
generating a first prediction using an intra sub-partition mode;
generating a second prediction using the inter prediction mode; and
merging the first prediction and the second prediction to generate a final prediction by applying a weighted average to the first prediction and the second prediction.
36. The method of claim 35, further comprising: generating the second prediction using at least one of a merge mode and an inter mode.
CN202080010728.6A 2019-02-05 2020-02-05 Video coding using intra sub-partition coding modes Pending CN113348671A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111101969.4A CN113630607A (en) 2019-02-05 2020-02-05 Video coding using intra sub-partition coding modes

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962801214P 2019-02-05 2019-02-05
US62/801,214 2019-02-05
PCT/US2020/016888 WO2020163535A1 (en) 2019-02-05 2020-02-05 Video coding using intra sub-partition coding mode

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202111101969.4A Division CN113630607A (en) 2019-02-05 2020-02-05 Video coding using intra sub-partition coding modes

Publications (1)

Publication Number Publication Date
CN113348671A true CN113348671A (en) 2021-09-03

Family

ID=71947318

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111101969.4A Pending CN113630607A (en) 2019-02-05 2020-02-05 Video coding using intra sub-partition coding modes
CN202080010728.6A Pending CN113348671A (en) 2019-02-05 2020-02-05 Video coding using intra sub-partition coding modes

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202111101969.4A Pending CN113630607A (en) 2019-02-05 2020-02-05 Video coding using intra sub-partition coding modes

Country Status (7)

Country Link
US (2) US20210368205A1 (en)
EP (1) EP3922029A4 (en)
JP (2) JP2022518612A (en)
KR (3) KR20230049758A (en)
CN (2) CN113630607A (en)
MX (1) MX2021009355A (en)
WO (1) WO2020163535A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102657933B1 (en) * 2018-03-30 2024-04-22 광동 오포 모바일 텔레커뮤니케이션즈 코포레이션 리미티드 Image/video coding method based on intra prediction, and apparatus thereof
WO2020163535A1 (en) * 2019-02-05 2020-08-13 Beijing Dajia Internet Information Technology Co., Ltd. Video coding using intra sub-partition coding mode
US20220166968A1 (en) * 2019-03-22 2022-05-26 Lg Electronics Inc. Intra prediction method and apparatus based on multi-reference line in image coding system
KR20220019241A (en) * 2019-07-08 2022-02-16 엘지전자 주식회사 Video or image coding based on adaptive loop filter
WO2023132622A1 (en) * 2022-01-04 2023-07-13 엘지전자 주식회사 Dimd mode-based intra prediction method and device

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8976870B1 (en) * 2006-08-30 2015-03-10 Geo Semiconductor Inc. Block and mode reordering to facilitate parallel intra prediction and motion vector prediction
KR101432775B1 (en) * 2008-09-08 2014-08-22 에스케이텔레콤 주식회사 Video Encoding/Decoding Method and Apparatus Using Arbitrary Pixel in Subblock
US9787982B2 (en) * 2011-09-12 2017-10-10 Qualcomm Incorporated Non-square transform units and prediction units in video coding
KR101810916B1 (en) 2011-11-11 2017-12-20 지이 비디오 컴프레션, 엘엘씨 Effective prediction using partition coding
US9491457B2 (en) * 2012-09-28 2016-11-08 Qualcomm Incorporated Signaling of regions of interest and gradual decoding refresh in video coding
KR101835358B1 (en) 2012-10-01 2018-03-08 지이 비디오 컴프레션, 엘엘씨 Scalable video coding using inter-layer prediction contribution to enhancement layer prediction
WO2016072611A1 (en) 2014-11-04 2016-05-12 삼성전자 주식회사 Method and device for encoding/decoding video using intra prediction
EP3364658A4 (en) * 2015-10-15 2019-07-03 LG Electronics Inc. Method and apparatus for encoding and decoding video signal
WO2017142327A1 (en) 2016-02-16 2017-08-24 삼성전자 주식회사 Intra-prediction method for reducing intra-prediction errors and device for same
CN109155851A (en) * 2016-05-02 2019-01-04 汉阳大学校产学协力团 Utilize the image coding of intra-frame prediction, coding/decoding method and device
WO2017205703A1 (en) * 2016-05-25 2017-11-30 Arris Enterprises Llc Improved weighted angular prediction coding for intra coding
CN116744022A (en) * 2016-11-25 2023-09-12 株式会社Kt Method for encoding and decoding video
CN117041561A (en) 2016-12-07 2023-11-10 株式会社Kt Method for decoding or encoding video and apparatus for storing video data
CN111971959A (en) * 2018-02-09 2020-11-20 弗劳恩霍夫应用研究促进协会 Partition-based intra coding concept
WO2020012023A1 (en) * 2018-07-13 2020-01-16 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Partitioned intra coding concept
EP3840387B1 (en) * 2018-10-12 2024-03-06 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method for encoding/decoding image signal and device for same
CN112514384A (en) * 2019-01-28 2021-03-16 株式会社 Xris Video signal encoding/decoding method and apparatus thereof
US11272198B2 (en) * 2019-01-30 2022-03-08 Tencent America LLC Method and apparatus for improved sub-block partitioning intra sub-partitions coding mode
WO2020163535A1 (en) * 2019-02-05 2020-08-13 Beijing Dajia Internet Information Technology Co., Ltd. Video coding using intra sub-partition coding mode
US20200252608A1 (en) * 2019-02-05 2020-08-06 Qualcomm Incorporated Sub-partition intra prediction
US11418811B2 (en) * 2019-03-12 2022-08-16 Apple Inc. Method for encoding/decoding image signal, and device therefor
WO2021006612A1 (en) * 2019-07-08 2021-01-14 현대자동차주식회사 Method and device for intra prediction coding of video data

Also Published As

Publication number Publication date
WO2020163535A1 (en) 2020-08-13
KR102517389B1 (en) 2023-04-03
MX2021009355A (en) 2021-09-14
US20210368193A1 (en) 2021-11-25
US20210368205A1 (en) 2021-11-25
EP3922029A4 (en) 2022-08-24
KR20220021036A (en) 2022-02-21
US11936890B2 (en) 2024-03-19
JP2023090929A (en) 2023-06-29
CN113630607A (en) 2021-11-09
KR20230049758A (en) 2023-04-13
EP3922029A1 (en) 2021-12-15
KR20210113259A (en) 2021-09-15
JP2022518612A (en) 2022-03-15

Similar Documents

Publication Publication Date Title
US11252405B2 (en) Image signal encoding/decoding method and apparatus therefor
US9451279B2 (en) Method for decoding a moving picture
WO2017190288A1 (en) Intra-picture prediction using non-adjacent reference lines of sample values
CN113348671A (en) Video coding using intra sub-partition coding modes
KR101809630B1 (en) Method and apparatus of image encoding/decoding using adaptive deblocking filtering
CN117834909A (en) Method for decoding video and method for encoding video
KR101966195B1 (en) Method and apparatus of image encoding/decoding using reference pixel composition in intra prediction
KR20200034646A (en) Method for encodign/decodign video signal and apparatus therefor
US20160191942A1 (en) Apparatus for decoding a moving picture
CN112956194A (en) Image signal encoding/decoding method and apparatus thereof
CN114143548B (en) Coding and decoding of transform coefficients in video coding and decoding
US20240187623A1 (en) Video Coding Using Intra Sub-Partition Coding Mode
KR101782155B1 (en) Image encoding/decoding method and image decoding apparatus using motion vector precision
CN112806006A (en) Method and apparatus for encoding and decoding using reference samples determined by predefined criteria
CN116389737B (en) Coding and decoding of transform coefficients in video coding and decoding
KR102103100B1 (en) Apparatus and method for decoding an image
CN113615202A (en) Method and apparatus for intra prediction for screen content encoding and decoding
WO2023158765A1 (en) Methods and devices for geometric partitioning mode split modes reordering with pre-defined modes order
WO2023154574A1 (en) Methods and devices for geometric partitioning mode with adaptive blending
KR20190023294A (en) Method and apparatus for encoding/decoding a video signal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination