WO2021058033A1 - Method and apparatus of combined inter and intra prediction with different chroma formats for video coding - Google Patents
- Publication number: WO2021058033A1 (PCT/CN2020/118961)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- current block
- prediction
- mode
- hypothesis
- coding
- Prior art date
Classifications
All classes fall under H04N19/00 (H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; Methods or arrangements for coding, decoding, compressing or decompressing digital video signals):
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
- H04N19/96—Tree coding, e.g. quad-tree coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/176—Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
- H04N19/186—Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/1883—Adaptive coding characterised by the coding unit, the unit relating to sub-band structure, e.g. hierarchical level, directional tree, e.g. low-high [LH], high-low [HL], high-high [HH]
- H04N19/70—Characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the proposed GEO partitioning for Inter is allowed for uni-predicted blocks not smaller than 8×8 in order to have the same memory bandwidth usage as bi-predicted blocks at the decoder side.
- motion vector prediction for GEO partitioning is aligned with TPM. Also, the TPM blending between the two predictions is applied on the inner boundary.
- the split boundary of geometric Merge mode is described by an angle φi and a distance offset ρi as shown in Fig. 4.
- the angle φi represents a quantized angle between 0 and 360 degrees and the distance offset ρi represents a quantized offset of the largest distance ρmax.
- the split directions overlapped with binary tree splits and TPM splits are excluded.
- the angle φi is quantized between 0 and 360 degrees with a fixed step.
- the angle is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles as shown in Fig. 5A.
- Fig. 5B illustrates the reduced set of angles with 24 values.
- the distance ρi is quantized from the largest possible distance ρmax with a fixed step.
- the value of ρmax can be geometrically derived by Eq. (1) for either w or h equal to 8 and scaled with the log2 of the short edge length. For the case where the angle is equal to 0 degrees, ρmax is equal to w/2, and for the case where the angle is equal to 90 degrees, ρmax is equal to h/2. The “1.0” samples shifted back are to avoid the split boundary being too close to the corner.
- in CE4-1.1, the distance ρi is quantized with 5 steps. Combined with 32 angles, there is a total of 140 split modes excluding the binary tree and TPM splits. In CE4-1.2, the distance ρi is quantized with 4 steps; combined with 32 angles, there is a total of 108 split modes, and combined with 24 angles, there is a total of 80 split modes, again excluding the binary tree and TPM splits.
- the GEO mode is signalled as an additional Merge mode together with the TPM mode as shown in Table 1.
- merge_geo_flag [] [] is signalled with 4 CABAC context models, where the first three are derived depending on the modes of the above and left neighbouring blocks, and the fourth is derived depending on the aspect ratio of the current block.
- merge_geo_flag [] [] indicates whether the current block uses GEO mode or TPM mode, which is similar to a “most probable mode” flag.
- the geo_partition_idx [] [] is used as an index to the lookup table that stores the angle φi and distance ρi pairs.
- the geo_partition_idx is binarized using a truncated binary code and coded with bypass bins.
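As an illustration of the truncated binary binarization mentioned above, the following is a minimal sketch of a truncated binary encoder (the function name and the bit-string output format are illustrative, not part of the specification):

```python
from math import floor, log2

def truncated_binary_encode(value, n):
    """Truncated binary code for an alphabet of n symbols.

    With k = floor(log2(n)), the first u = 2**(k + 1) - n symbols
    receive the shorter k-bit codewords; the remaining symbols
    receive (k + 1)-bit codewords for the value shifted by u.
    """
    k = floor(log2(n))
    u = (1 << (k + 1)) - n
    if value < u:
        # Shorter codeword: k bits.
        return format(value, "0{}b".format(k))
    # Longer codeword: k + 1 bits.
    return format(value + u, "0{}b".format(k + 1))
```

For n = 140 split modes, k = 7 and u = 116, so partition indices 0 to 115 cost 7 bins each while indices 116 to 139 cost 8 bins; all bins are bypass-coded as stated above.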
- a method and apparatus for video coding are disclosed. According to this method, a current block is received at an encoder side or compressed data comprising the current block is received at a decoder side, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block.
- the single partition tree is a single tree for luma and chroma.
- a target coding mode is determined for the current block.
- the current block is then encoded or decoded according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
- the additional hypothesis of prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold and the threshold is equal to 4.
- the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode. In another embodiment, the multi-hypothesis prediction mode corresponds to Triangular Prediction Mode (TPM). In yet another embodiment, the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO).
- the current block is in chroma format 4:4:4, 4:2:2 or 4:2:0.
- the threshold is predefined implicitly in the standard or signalled at a Transform Unit (TU) or Transform Block (TB), Coding Unit (CU) or Coding Block (CB), Coding Tree Unit (CTU) or Coding Tree Block (CTB), slice, tile, tile group, Sequence Parameter Set (SPS), Picture Parameter Set (PPS), or picture level of a video bitstream.
- the image area corresponds to a Coding Tree Unit (CTU) .
- Fig. 1 illustrates an example of TPM (Triangular Prediction Mode) , where a CU is split into two triangular prediction units, in either diagonal or inverse diagonal direction. Each triangular prediction unit in the CU is Inter-predicted using its own uni-prediction motion vector and reference frame index to generate prediction from a uni-prediction candidate.
- Fig. 2 illustrates an example of adaptive weighting process, where weightings are shown for the luma block (left) and the chroma block (right) .
- Fig. 3A illustrates partition shapes for the triangular prediction mode (TPM) as disclosed in VTM-6.0.
- Fig. 3B illustrates additional shapes being discussed for geometric Merge mode.
- Fig. 4 illustrates the split boundary of geometric Merge mode that is described by an angle φi and a distance offset ρi.
- Fig. 5A illustrates an example where the angle is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles.
- Fig. 5B illustrates an example where the angle is quantized between 0 and 360 degrees with a step of 11.25 degrees and some near-vertical direction angles are removed, which results in a total of 24 angles.
- Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
- Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
- a multiple hypothesis (MH) prediction mode is disclosed.
- an additional hypothesis of prediction is combined with the existing hypothesis of prediction by a weighted average process and the combined prediction is the final prediction of the current block.
- a simplification method of multiple hypothesis (MH) prediction mode is disclosed, where the MH prediction mode is not applied to chroma blocks under certain conditions according to this invention.
- when the MH prediction mode is not applied to chroma blocks, it means that the additional hypothesis of prediction is not combined with the existing hypothesis of prediction for the chroma block and the existing hypothesis of prediction is used as the final prediction of the current chroma block.
- when the MH prediction mode is applied to chroma blocks, it means that the additional hypothesis of prediction is combined with the existing hypothesis of prediction and the combined prediction is used as the final prediction of the current chroma block.
- when the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is then applied.
- MH prediction mode can be CIIP, TPM, or GEO.
- the proposed method can be applied even if the original flag for MH mode (e.g., CIIP, TPM, or GEO) at the CU level is true.
- MH mode is not applied to the chroma blocks even if the CU-level CIIP flag is true. It means that the final prediction for the luma block is the combined prediction, which is formed by the existing hypothesis of prediction and the additional hypothesis of prediction; for chroma blocks, the final prediction is the existing prediction.
- the block size may range from 128 to 4 for the luma component or from 64 to 2 for the chroma components.
- Intra blocks have more dependency than Inter blocks. The main concern is about 2xN Intra blocks. The smallest size for luma is already set as 4x4. 2xN Intra chroma is already removed in the dual-tree cases. However, there are still some 2xN Intra chroma blocks in single-tree cases (for example, 2xN Intra chroma blocks for CIIP).
- “MH mode is not applied to the chroma blocks” means that the additional hypothesis of prediction is not combined with the original (existing) hypothesis of prediction for the chroma blocks.
- “MH mode is not applied to the chroma blocks” means that, for the chroma blocks, Intra prediction is not combined with Inter prediction so that Inter prediction is used directly.
- the proposed method is enabled for chroma format 4:4:4.
- the proposed method is enabled for chroma format 4:2:0.
- the proposed method is enabled for chroma format 4:2:2.
- the proposed method is enabled for chroma format 4:2:1.
- the proposed method is enabled for chroma format 4:1:1.
- the proposed method is enabled for chroma format 4:0:0 (i.e., monochrome).
- the pre-defined condition is in terms of block width, height, or area.
- the block in this embodiment can be a luma block or a chroma block.
- the corresponding block width or height depends on the used chroma format. For example, if the used chroma format is 4:2:0, the corresponding block width is assigned half the width of the collocated luma block.
- the pre-defined condition is that the block width is smaller than threshold-1 and/or the block height is smaller than threshold-2.
- according to the proposed method, the MH prediction mode is not applied to the chroma block.
- the chroma block can be a chroma block for Cb component or Cr component.
- the pre-defined condition is that the block width is larger than threshold-1 and/or the block height is larger than threshold-2.
- the pre-defined condition is that the block area is smaller than threshold-3.
- the pre-defined condition is that the block area is larger than threshold-3.
- threshold-1 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
- threshold-1 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level.
- the variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
- threshold-2 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
- threshold-2 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level.
- the variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
- threshold-3 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
- threshold-3 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level.
- the variable can be 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
- threshold-1 and threshold-2 can be the same.
- threshold-1, threshold-2, and/or threshold-3 can be different for different chroma formats.
- the “block” in this invention can be CU, CB, TU or TB.
- the proposed method is enabled depending on an explicit flag at TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level.
- the proposed method can be used for the luma block, i.e., the multiple hypothesis (MH) prediction mode is not applied to the luma blocks under certain conditions.
- the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is applied.
- MH mode is not applied to chroma.
- when chroma format 4:4:4 is used and the chroma block width or height is smaller than 4, MH mode is not applied to chroma.
- when chroma format 4:2:0 is used and the chroma block width (depending on the used chroma format) is smaller than 4, MH mode is not applied to chroma.
- when other enabling conditions of MH mode are satisfied (e.g., assuming MH mode is CIIP, the CIIP flag is enabled) and the chroma block width (depending on the used chroma format) is larger than or equal to 4, MH mode is applied to not only the luma block but also the chroma blocks.
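The decision logic in the examples above can be sketched as follows. This is a non-normative sketch: the function and table names are illustrative, and only the width-based test with the threshold of 4 is shown (height- and area-based variants are analogous):

```python
# Width/height subsampling divisors per chroma format (4:0:0 has no
# chroma blocks, so it is omitted here).
CHROMA_SUBSAMPLING = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def apply_mh_to_chroma(luma_width, luma_height, chroma_format, threshold=4):
    """Return True if the additional hypothesis is combined for the
    chroma blocks, assuming the MH-mode flag (e.g., the CIIP flag) is
    already true for the CU.  The chroma block width is derived from
    the collocated luma block according to the chroma format."""
    sub_w, sub_h = CHROMA_SUBSAMPLING[chroma_format]
    chroma_width = luma_width // sub_w
    # Disable the additional hypothesis for chroma when the chroma
    # block width is smaller than the threshold.
    return chroma_width >= threshold
```

For an 8x8 CU in 4:2:0 the chroma width is 4, so MH mode is applied to chroma as well; for a 4xN CU in 4:2:0 the chroma width is 2, so only the existing Inter hypothesis is used as the final chroma prediction.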
- any of the foregoing proposed methods can be implemented in encoders and/or decoders.
- any of the proposed methods can be implemented in an Intra/Inter coding module of an encoder, or in a motion compensation module or a Merge candidate derivation module of a decoder.
- alternatively, any of the proposed methods can be implemented as a circuit coupled to the Intra/Inter coding module of the encoder and/or to the motion compensation module or the Merge candidate derivation module of the decoder.
- Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) .
- the steps shown in the flowchart, as well as other following flowcharts in this disclosure, may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side and/or the decoder side.
- the steps shown in the flowchart may also be implemented based on hardware, such as one or more electronic devices or processors arranged to perform the steps in the flowchart.
- a current block comprising one luma block and one or more chroma blocks is received in step 610, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block.
- the single partition tree is a single tree for luma and chroma.
- a target coding mode for the current block is determined in step 620.
- the current block is encoded according to the target coding mode in step 630, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
- Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) .
- compressed data comprising a current block are received in step 710, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block.
- the single partition tree is a single tree for luma and chroma.
- a target coding mode for the current block is determined in step 720.
- the current block is decoded according to the target coding mode in step 730, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
- Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both.
- an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein.
- An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
- the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA).
- These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
- the software code or firmware code may be developed in different programming languages and different formats or styles.
- the software code may also be compiled for different target platforms.
- different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
Abstract
A method and apparatus for video coding are disclosed. According to this method, a current block is received at an encoder side or compressed data comprising the current block is received at a decoder side, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block. A target coding mode is determined for the current block. The current block is then encoded or decoded according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and the width, height or area of said one or more chroma blocks is smaller than a threshold.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
The present invention claims priority to U.S. Provisional Patent Application, Serial No. 62/907,699, filed on September 29, 2019. The U.S. Provisional Patent Application is hereby incorporated by reference in its entirety.
The present invention relates to prediction for video coding using CIIP (Combined Inter/Intra Prediction) . In particular, the present invention discloses techniques to improve processing throughput for small block sizes.
BACKGROUND AND RELATED ART
High-Efficiency Video Coding (HEVC) is a new international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) . HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit for compression, termed coding unit (CU) , is a 2Nx2N square block, and each CU can be recursively split into four smaller CUs until the predefined minimum size is reached. Each CU contains one or multiple prediction units (PUs) .
To achieve the best coding efficiency of hybrid coding architecture in HEVC, there are two kinds of prediction modes (i.e., Intra prediction and Inter prediction) for each PU. For Intra prediction modes, the spatial neighbouring reconstructed pixels can be used to generate the directional predictions.
After the development of the HEVC standard, another emerging video coding standard, named Versatile Video Coding (VVC), is being developed under the Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11. Various new coding tools along with some existing coding tools have been evaluated for VVC.
In VTM (VVC Test Model) software, when a CU is coded in Merge mode, and if the CU contains at least 64 luma samples (i.e., CU width × CU height equal to or larger than 64), an additional flag (CIIP flag) is signalled at the CU level to indicate whether the Combined Inter/Intra Prediction (CIIP) mode is applied to the current CU. In order to form the CIIP prediction, an Intra prediction mode is first derived from two additional syntax elements or implicitly assigned. For example, planar mode is implicitly assigned as the Intra prediction mode. As another example, up to four possible Intra prediction modes can be used: DC, planar, horizontal, or vertical. The Inter prediction (the existing hypothesis of prediction) and Intra prediction signals (the additional hypothesis of prediction) are then derived using regular Inter and Intra decoding processes. Finally, weighted averaging of the Inter and Intra prediction signals is performed to obtain the CIIP prediction. A more detailed explanation of the algorithm can be found in JVET-L0100 (M.-S. Chiang, et al., “CE10.1.1: Multi-hypothesis prediction for improving AMVP mode, skip or merge mode, and intra mode,” ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Macao, CN, Oct. 2018, Document: JVET-L0100).
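The final weighted-averaging step can be sketched as follows. The weight derivation shown here follows the VVC design in which the Intra weight grows with the number of Intra-coded neighbouring blocks; the function name and the flat sample-list interface are illustrative assumptions, not the normative process:

```python
def ciip_blend(inter_samples, intra_samples, top_is_intra, left_is_intra):
    """Combine the Inter hypothesis with the additional Intra
    hypothesis to form the CIIP prediction.

    The Intra weight is 1, 2 or 3 (out of 4) depending on how many of
    the top/left neighbouring blocks are Intra-coded; the +2 term
    rounds the 2-bit right shift.
    """
    w_intra = 1 + int(top_is_intra) + int(left_is_intra)
    return [((4 - w_intra) * p_inter + w_intra * p_intra + 2) >> 2
            for p_inter, p_intra in zip(inter_samples, intra_samples)]
```

With both neighbours Intra-coded, an Inter sample of 100 and an Intra sample of 60 blend to (1·100 + 3·60 + 2) >> 2 = 70; with neither neighbour Intra-coded the result is 90, i.e., closer to the Inter hypothesis.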
Triangular prediction
For VTM, in JVET-L0124 (R.-L. Liao, et al., “CE10.3.1.b: Triangular prediction unit mode,” ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Macao, CN, Oct. 2018, Document: JVET-L0124) and JVET-L0208 (T. Poirier, et al., “CE10 related: multiple prediction unit shapes,” ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Macao, CN, Oct. 2018, Document: JVET-L0208), the Triangular Prediction unit Mode (TPM) is proposed. The concept is to introduce a new triangular partition for motion compensated prediction. It splits a CU into two triangular prediction units, in either the diagonal or the inverse diagonal direction, as shown in Fig. 1. Each triangular prediction unit in the CU is Inter-predicted using its own uni-prediction motion vector and reference frame. An adaptive weighting process is applied to the diagonal edge after predicting the triangular prediction units. Then, the transform and quantization processes are applied to the whole CU. It is noted that this mode is only applied to skip and Merge modes. An additional flag is signalled to indicate whether TPM is applied.
Adaptive weighting process
After predicting each triangular prediction unit, an adaptive weighting process is applied to the diagonal edge between the two triangular prediction units to derive the final prediction for the whole CU. Two weighting factor groups are listed as follows:
● First weighting factor group: {7/8, 6/8, 4/8, 2/8, 1/8} and {7/8, 4/8, 1/8} are used for the luminance and the chrominance samples, respectively;
● Second weighting factor group: {7/8, 6/8, 5/8, 4/8, 3/8, 2/8, 1/8} and {6/8, 4/8, 2/8} are used for the luminance and the chrominance samples, respectively.
One weighting factor group is selected based on a comparison of the motion vectors of the two triangular prediction units. The second weighting factor group is used when the reference pictures of the two triangular prediction units are different from each other or when their motion vector difference is larger than 16 pixels. Otherwise, the first weighting factor group is used. An example is shown in Fig. 2, where weightings 210 are shown for the luma block and weightings 220 are shown for the chroma block. A more detailed explanation of the algorithm can be found in JVET-L0124 and JVET-L0208.
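The selection rule above can be sketched as follows, for the luma groups only. This is an illustrative sketch: how the 16-pixel motion vector difference is measured (here, per component) and the function and variable names are assumptions, not the normative test.

```python
# Luma weighting factor groups quoted in the text.
LUMA_GROUP_1 = [7/8, 6/8, 4/8, 2/8, 1/8]
LUMA_GROUP_2 = [7/8, 6/8, 5/8, 4/8, 3/8, 2/8, 1/8]

def select_luma_weighting_group(ref_pic_0, ref_pic_1, mv0, mv1):
    # Second (finer) group when the reference pictures differ or the
    # motion vector difference exceeds 16 pixels; otherwise first group.
    mv_diff_large = abs(mv0[0] - mv1[0]) > 16 or abs(mv0[1] - mv1[1]) > 16
    if ref_pic_0 != ref_pic_1 or mv_diff_large:
        return LUMA_GROUP_2
    return LUMA_GROUP_1
```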
Geometric Merge mode (GEO)
Geometric Merge mode (also called geometric partitioning mode, GPM) is proposed in JVET-P0068 (H. Gao, et al., “CE4: CE4-1.1, CE4-1.2 and CE4-1.14: Geometric Merge Mode (GEO),” ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 16th Meeting: Geneva, CH, 1–11 October 2019, Document: JVET-P0068), which uses the same predictor-blending concept as TPM and extends the blending masks to up to 140 different modes with 32 angles and 5 distance offsets.
The 140-mode version is defined as CE4-1.1 in JVET-P0068. To further reduce the complexity, 108-mode and 80-mode versions of GEO are tested in CE4-1.2. In CE4-1.14, a TPM-like simplified motion storage is tested.
Fig. 3A illustrates partition shapes (311-312) for TPM in VTM-6.0 and Fig. 3B illustrates additional shapes (313-319) being proposed for non-rectangular Inter blocks.
Similar to TPM, the proposed GEO partitioning for Inter prediction is allowed for uni-predicted blocks not smaller than 8×8 in order to have the same memory bandwidth usage as bi-predicted blocks at the decoder side. Motion vector prediction for GEO partitioning is aligned with TPM. Also, the TPM blending between the two predictions is applied on the inner boundary.
The split boundary of geometric Merge mode is described by an angle φ_i and a distance offset ρ_i, as shown in Fig. 4. The angle φ_i represents a quantized angle between 0 and 360 degrees, and the distance offset ρ_i represents a quantized offset of the largest distance ρ_max. In addition, the split directions that overlap with binary tree splits and TPM splits are excluded.
GEO angle and distance quantization
The angle φ_i is quantized between 0 and 360 degrees with a fixed step. In CE4-1.1, in CE4-1.2 with 108 modes, and in CE4-1.14, the angle φ_i is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles as shown in Fig. 5A.
In CE4-1.2 with 80 modes, the angle φ_i is still quantized with 11.25-degree steps; however, the near-vertical direction angles (i.e., those giving near-horizontal split boundaries) are removed, since in natural video the objects and motions are mostly horizontal. Fig. 5B illustrates the reduced set of 24 angles.
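The angle grid described above can be sketched as follows. Which eight near-vertical angles are removed in the 80-mode configuration is an assumption here for illustration; the text only states that 24 of the 32 angles remain.

```python
def quantized_geo_angles(step_deg=11.25, removed=()):
    # Fixed-step angle grid over [0, 360); 360 / 11.25 gives 32 angles.
    n = int(360 / step_deg)
    angles = [i * step_deg for i in range(n)]
    return [a for a in angles if a not in removed]
```

Removing eight near-vertical angles (four around 90 degrees and four around 270 degrees) leaves the 24 angles of Fig. 5B.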
The distance ρ_i is quantized from the largest possible distance ρ_max with a fixed step. The value of ρ_max can be geometrically derived by Eq. (1) for either w or h equal to 8 and scaled with the log2-scaled short edge length. For the case where φ_i is equal to 0 degrees, ρ_max is equal to w/2, and for the case where φ_i is equal to 90 degrees, ρ_max is equal to h/2. The “1.0 sample” shift-back is to avoid the split boundary being too close to a corner.
In CE4-1.1 and CE4-1.14, the distance ρ_i is quantized with 5 steps. Combined with 32 angles, there is a total of 140 split modes, excluding the binary tree and TPM splits. In CE4-1.2 with 108 modes, the distance ρ_i is quantized with 4 steps. Combined with 32 angles, there is a total of 108 split modes, excluding the binary tree and TPM splits. In CE4-1.2 with 80 modes, the distance ρ_i is quantized with 4 steps. Combined with 24 angles, there is a total of 80 split modes, excluding the binary tree and TPM splits.
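The mode counts above imply how many raw angle × distance combinations coincide with binary tree or TPM splits and are therefore excluded. The exclusion counts below are inferred from the quoted totals, not stated explicitly in the proposals.

```python
def geo_mode_count(num_angles, num_distances, num_excluded):
    # Total GEO split modes: raw angle/distance grid minus the split
    # directions that coincide with binary tree or TPM splits.
    return num_angles * num_distances - num_excluded

# CE4-1.1 / CE4-1.14: 32 angles x 5 distances, 20 excluded -> 140 modes
# CE4-1.2 (108):      32 angles x 4 distances, 20 excluded -> 108 modes
# CE4-1.2 (80):       24 angles x 4 distances, 16 excluded ->  80 modes
```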
Mode signalling
According to the proposed method, the GEO mode is signalled as an additional Merge mode together with TPM mode as shown in Table 1.
Table 1 Syntax elements introduced by the proposal
The merge_geo_flag [] [] is signalled with 4 CABAC context models, where the first three are derived depending on the mode of above and left neighbouring blocks, the fourth is derived depending on the aspect ratio of the current block. merge_geo_flag [] [] indicates whether the current block uses GEO mode or TPM mode, which is similar to a “most probable mode” flag.
The geo_partition_idx[][] is used as an index into the lookup table that stores the (angle φ_i, distance ρ_i) pairs. The geo_partition_idx is binarized using truncated binary and coded using bypass bins.
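Truncated binary binarization, mentioned above, assigns shorter codewords to the first symbols when the alphabet size is not a power of two. A sketch of the generic index-to-bitstring mapping (the bypass coding of the resulting bins is not shown, and the exact binarization in the VVC draft may differ in detail):

```python
from math import floor, log2

def truncated_binary(symbol, n):
    """Truncated binary codeword for a symbol in [0, n)."""
    k = floor(log2(n))          # length of the short codewords
    u = (1 << (k + 1)) - n      # number of k-bit codewords
    if symbol < u:
        return format(symbol, f'0{k}b')
    return format(symbol + u, f'0{k + 1}b')

# With n = 140 partition indices, the first 116 indices get 7 bits
# and the remaining 24 get 8 bits.
```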
BRIEF SUMMARY OF THE INVENTION
A method and apparatus for video coding are disclosed. According to this method, a current block is received at an encoder side or compressed data comprising the current block is received at a decoder side, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block. The single partition tree is a single tree for luma and chroma. A target coding mode is determined for the current block. The current block is then encoded or decoded according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
In one embodiment, the additional hypothesis of prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold and the threshold is equal to 4.
In one embodiment, the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode. In another embodiment, the multi-hypothesis prediction mode corresponds to Triangular Prediction mode (TPM) . In yet another embodiment, the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO) .
In one embodiment, the current block is in chroma format 4:4:4, 4:2:2 or 4:2:0.
In one embodiment, the threshold is predefined implicitly in the standard or signalled at a Transform Unit (TU) or Transform Block (TB) , Coding Unit (CU) or Coding Block (CB) , Coding Tree Unit (CTU) or Coding Tree Block (CTB) , slice, tile, tile group, Sequence Parameter Set (SPS) , Picture Parameter Set (PPS) , or picture level of a video bitstream.
In one embodiment, the image area corresponds to a Coding Tree Unit (CTU) .
Fig. 1 illustrates an example of TPM (Triangular Prediction Mode) , where a CU is split into two triangular prediction units, in either diagonal or inverse diagonal direction. Each triangular prediction unit in the CU is Inter-predicted using its own uni-prediction motion vector and reference frame index to generate prediction from a uni-prediction candidate.
Fig. 2 illustrates an example of adaptive weighting process, where weightings are shown for the luma block (left) and the chroma block (right) .
Fig. 3A illustrates partition shapes for the triangular prediction mode (TPM) as disclosed in VTM-6.0.
Fig. 3B illustrates additional shapes being discussed for geometric Merge mode.
Fig. 4 illustrates the split boundary of geometric Merge mode, which is described by the angle φ_i and the distance offset ρ_i.
Fig. 5A illustrates an example where the angle φ_i is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles.
Fig. 5B illustrates an example where the angle φ_i is quantized between 0 and 360 degrees with a step of 11.25 degrees and some near-vertical direction angles are removed, which results in a total of 24 angles.
Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
To improve the coding efficiency, a multiple hypothesis (MH) prediction mode is disclosed. When the current block uses an MH prediction mode, an additional hypothesis of prediction is combined with the existing hypothesis of prediction by a weighted average process and the combined prediction is the final prediction of the current block. In order to overcome the processing efficiency issue associated with small blocks, a simplification method of the multiple hypothesis (MH) prediction mode is disclosed, where the MH prediction mode is not applied to chroma blocks under certain conditions according to this invention. When the MH prediction mode is not applied to chroma blocks, the additional hypothesis of prediction is not combined with the existing hypothesis of prediction for the chroma block and the existing hypothesis of prediction is used as the final prediction of the current chroma block. When the MH prediction mode is applied to chroma blocks, the additional hypothesis of prediction is combined with the existing hypothesis of prediction and the combined prediction is used as the final prediction of the current chroma block. When the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is applied.
In one embodiment, MH prediction mode can be CIIP, TPM, or GEO.
In another embodiment, the proposed method can be applied even if the original flag for MH mode (e.g., CIIP, TPM, or GEO) at the CU level is true. For example, MH mode is not applied to the chroma blocks even if the CU-level CIIP flag is true. It means that the final prediction for the luma block is the combined prediction, which is formed by the existing hypothesis of prediction and the additional hypothesis of prediction; for chroma blocks, the final prediction is the existing prediction.
Current VVC supports a flexible partitioning mechanism including QT, BT, and TT. In this split structure, the block size may range from 128 down to 4 for the luma component and from 64 down to 2 for the chroma components. The introduction of small block sizes, i.e., 2xN, leads to an inefficient hardware implementation: it causes pipeline delay and requires 2xN-pixel processing in the hardware architecture. In most hardware implementations, 4x1 pixels per CPU (or GPU) clock are processed for luma and chroma. However, it is asserted that an extra 2x2-pixels-per-clock processing path is needed for 2xN blocks. In addition, memory access (reading and writing) is inefficient for 2xN blocks, because each access fetches only 2x1 pixels. Intra blocks have more dependency than Inter blocks, so the main concern is 2xN Intra blocks. The smallest size for luma is already set to 4x4, and 2xN Intra chroma is already removed in the dual-tree cases. However, there are still some 2xN Intra chroma blocks in single-tree cases (for example, 2xN Intra chroma blocks for CIIP). To solve this issue, in another embodiment, “MH mode is not applied to the chroma blocks” means that the additional hypothesis of prediction is not combined with the original (existing) hypothesis of prediction for chroma blocks. In the case of CIIP, it means that for the chroma blocks, Intra prediction is not combined with Inter prediction, so Inter prediction is used directly.
In another embodiment, the proposed method is enabled for chroma format 4:4:4.
In another embodiment, the proposed method is enabled for chroma format 4:2:0.
In another embodiment, the proposed method is enabled for chroma format 4:2:2.
In another embodiment, the proposed method is enabled for chroma format 4:2:1.
In another embodiment, the proposed method is enabled for chroma format 4:1:1.
In another embodiment, the proposed method is enabled for chroma format 4:0:0 (i.e., monochrome).
In another embodiment, the pre-defined condition is in terms of block width, height, or area.
In one sub-embodiment, the “block” in this embodiment can be a luma block or a chroma block. When the block is a chroma block, the corresponding block width or height depends on the chroma format used. For example, if the chroma format used is 4:2:0, the corresponding block width is half of the width of the collocated luma block.
In one sub-embodiment, the pre-defined condition is that the block width is smaller than threshold-1 and/or the block height is smaller than threshold-2. For example, when the CIIP flag is enabled and the block width of the corresponding chroma block is smaller than 4, the proposed method (MH prediction mode is not applied to the chroma block) is used. The chroma block can be a chroma block of the Cb component or the Cr component.
In another sub-embodiment, the pre-defined condition is that the block width is larger than threshold-1 and/or the block height is larger than threshold-2.
In another sub-embodiment, the pre-defined condition is that the block area is smaller than threshold-3.
In another sub-embodiment, the pre-defined condition is that the block area is larger than threshold-3.
In another embodiment, threshold-1 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-1 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level. The variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-2 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-2 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level. The variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-3 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-3 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level. The variable can be 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another sub-embodiment, threshold-1 and threshold-2 can be the same.
In another sub-embodiment, threshold-1, threshold-2, and/or threshold-3 can be different for different chroma formats.
In another embodiment, the “block” in this invention can be CU, CB, TU or TB.
In another embodiment, the proposed method is enabled depending on an explicit flag at TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level.
In another embodiment, the proposed method can be used for the luma block, i.e., the multiple hypothesis (MH) prediction mode is not applied to the luma blocks under certain conditions. When the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is applied.
Any combination of the above methods can be applied. For example, when chroma format 4:4:4 is used and the chroma block width or height is smaller than 4, MH mode is not applied to chroma. For another example, when chroma format 4:2:0 is used and the chroma block width (which depends on the chroma format used) is smaller than 4, MH mode is not applied to chroma. In other words, when the other enabling conditions of MH mode are satisfied (e.g., assuming MH mode is CIIP, the CIIP flag is enabled) and the chroma block width (which depends on the chroma format used) is larger than or equal to 4, MH mode is applied not only to the luma block but also to the chroma blocks.
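The width-based condition in the examples above can be sketched as follows. The subsampling table and the function name are illustrative assumptions, and only the width test with threshold 4 is shown (the height and area variants follow the same pattern).

```python
# Chroma horizontal/vertical subsampling factors per chroma format
# (4:0:0 has no chroma blocks, so it is omitted).
SUBSAMPLING = {'4:4:4': (1, 1), '4:2:2': (2, 1), '4:2:0': (2, 2)}

def mh_applied_to_chroma(luma_w, luma_h, chroma_format, threshold=4):
    # The additional hypothesis of prediction is skipped for chroma
    # when the chroma block width falls below the threshold.
    sx, sy = SUBSAMPLING[chroma_format]
    chroma_w = luma_w // sx
    return chroma_w >= threshold
```

For a 4xN luma CU in 4:2:0, the chroma width is 2, so the chroma blocks keep the existing hypothesis only; in 4:4:4 the same CU keeps MH prediction for chroma.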
Any of the foregoing proposed methods can be implemented in encoders and/or decoders. For example, any of the proposed methods can be implemented in an Intra/Inter coding module of an encoder, a motion compensation module, a merge candidate derivation module of a decoder. Alternatively, any of the proposed methods can be implemented as a circuit coupled to the Intra/Inter coding module of an encoder and/or motion compensation module, a Merge candidate derivation module of the decoder.
Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) . The steps shown in the flowchart, as well as other following flowcharts in this disclosure, may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side and/or the decoder side. The steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to this method, a current block comprising one luma block and one or more chroma blocks is received in step 610, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block. The single partition tree is a single tree for luma and chroma. A target coding mode for the current block is determined in step 620. The current block is encoded according to the target coding mode in step 630, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) . According to this method, compressed data comprising a current block are received in step 710, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block. The single partition tree is a single tree for luma and chroma. A target coding mode for the current block is determined in step 720. The current block is decoded according to the target coding mode in step 730, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
The flowcharts shown are intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced without these specific details.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) . These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Claims (18)
- A method of video encoding, the method comprising:receiving a current block comprising one luma block and one or more chroma blocks, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;determining a target coding mode for the current block; andencoding the current block according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
- The method of Claim 1, wherein the additional hypothesis of prediction is Intra prediction and the Intra prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold equal to 4.
- The method of Claim 1, wherein the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode.
- The method of Claim 1, wherein the multi-hypothesis prediction mode corresponds to Triangular Prediction mode (TPM) .
- The method of Claim 1, wherein the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO) .
- The method of Claim 1, wherein the current block is in chroma format 4: 4: 4, 4: 2: 2 or 4: 2: 0.
- The method of Claim 1, wherein the threshold is signalled at a Transform Unit (TU) or Transform Block (TB) , Coding Unit (CU) or Coding Block (CB) , Coding Tree Unit (CTU) or Coding Tree Block (CTB) , slice, tile, tile group, Sequence Parameter Set (SPS) , Picture Parameter Set (PPS) , or picture level of a video bitstream.
- The method of Claim 1, wherein the image area corresponds to a Coding Tree Unit (CTU) .
- An apparatus of video encoding, the apparatus comprising one or more electronic circuits or processors arranged to:receive a current block comprising one luma block and one or more chroma blocks, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;determine a target coding mode for the current block; andencode the current block according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if width, height or area of said one or more chroma blocks is smaller than a threshold and the target coding mode corresponds to the multi-hypothesis prediction mode.
- A method of video decoding, the method comprising:receiving compressed data comprising a current block, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;determining a target coding mode for the current block; anddecoding the current block according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
- The method of Claim 10, wherein the additional hypothesis of prediction is Intra prediction and the Intra prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold equal to 4.
- The method of Claim 10, wherein the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode.
- The method of Claim 10, wherein the multi-hypothesis prediction mode corresponds to Triangular Prediction mode (TPM) .
- The method of Claim 10, wherein the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO) .
- The method of Claim 10, wherein the current block is in chroma format 4: 4: 4, 4: 2: 2 or 4: 2: 0.
- The method of Claim 10, wherein the threshold is parsed at a Transform Unit (TU) or Transform Block (TB) , Coding Unit (CU) or Coding Block (CB) , Coding Tree Unit (CTU) or Coding Tree Block (CTB) , slice, tile, tile group, Sequence Parameter Set (SPS) , Picture Parameter Set (PPS) , or picture level of a video bitstream.
- The method of Claim 10, wherein the image area corresponds to a Coding Tree Unit (CTU) .
- An apparatus of video decoding, the apparatus comprising one or more electronic circuits or processors arranged to:receive compressed data comprising a current block, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;determine a target coding mode for the current block, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if a width, a height or an area of said one or more chroma blocks is smaller than a threshold; anddecode the current block according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MX2022003827A MX2022003827A (en) | 2019-09-29 | 2020-09-29 | Method and apparatus of combined inter and intra prediction with different chroma formats for video coding. |
CN202080068079.5A CN114731427A (en) | 2019-09-29 | 2020-09-29 | Method and apparatus for video encoding and decoding incorporating intra-frame inter-prediction with different chroma formats |
KR1020227013214A KR20220061247A (en) | 2019-09-29 | 2020-09-29 | Method and apparatus of combined inter and intra prediction using different chroma formats for video coding |
EP20869647.6A EP4029265A4 (en) | 2019-09-29 | 2020-09-29 | Method and apparatus of combined inter and intra prediction with different chroma formats for video coding |
TW109133764A TWI774075B (en) | 2019-09-29 | 2020-09-29 | Method and apparatus of multi-hypothesis prediction mode with different chroma formats for video coding |
US17/764,385 US11831928B2 (en) | 2019-09-29 | 2020-09-29 | Method and apparatus of combined inter and intra prediction with different chroma formats for video coding |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201962907699P | 2019-09-29 | 2019-09-29 | |
US62/907,699 | 2019-09-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021058033A1 (en) | 2021-04-01 |
Family
ID=75166765
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/118961 WO2021058033A1 (en) | 2019-09-29 | 2020-09-29 | Method and apparatus of combined inter and intra prediction with different chroma formats for video coding |
Country Status (7)
Country | Link |
---|---|
US (1) | US11831928B2 (en) |
EP (1) | EP4029265A4 (en) |
KR (1) | KR20220061247A (en) |
CN (1) | CN114731427A (en) |
MX (1) | MX2022003827A (en) |
TW (1) | TWI774075B (en) |
WO (1) | WO2021058033A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3857889A4 (en) * | 2018-11-16 | 2021-09-22 | Beijing Bytedance Network Technology Co. Ltd. | Weights in combined inter intra prediction mode |
US11277624B2 (en) | 2018-11-12 | 2022-03-15 | Beijing Bytedance Network Technology Co., Ltd. | Bandwidth control methods for inter prediction |
US11509923B1 (en) | 2019-03-06 | 2022-11-22 | Beijing Bytedance Network Technology Co., Ltd. | Usage of converted uni-prediction candidate |
WO2023040993A1 (en) * | 2021-09-16 | 2023-03-23 | Beijing Bytedance Network Technology Co., Ltd. | Method, device, and medium for video processing |
WO2023154359A1 (en) * | 2022-02-11 | 2023-08-17 | Beijing Dajia Internet Information Technology Co., Ltd. | Methods and devices for multi-hypothesis-based prediction |
US11838539B2 (en) | 2018-10-22 | 2023-12-05 | Beijing Bytedance Network Technology Co., Ltd | Utilization of refined motion vector |
US11956465B2 (en) | 2018-11-20 | 2024-04-09 | Beijing Bytedance Network Technology Co., Ltd | Difference calculation based on partial position |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024017188A1 (en) * | 2022-07-22 | 2024-01-25 | Mediatek Inc. | Method and apparatus for blending prediction in video coding system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114375582A (en) * | 2019-06-24 | 2022-04-19 | 阿里巴巴集团控股有限公司 | Method and system for processing luminance and chrominance signals |
US11206413B2 (en) * | 2019-08-13 | 2021-12-21 | Qualcomm Incorporated | Palette predictor updates for local dual trees |
US11463693B2 (en) * | 2019-08-30 | 2022-10-04 | Qualcomm Incorporated | Geometric partition mode with harmonized motion field storage and motion compensation |
US11509910B2 (en) * | 2019-09-16 | 2022-11-22 | Tencent America LLC | Video coding method and device for avoiding small chroma block intra prediction |
2020
- 2020-09-29 WO PCT/CN2020/118961 patent/WO2021058033A1/en unknown
- 2020-09-29 MX MX2022003827A patent/MX2022003827A/en unknown
- 2020-09-29 KR KR1020227013214A patent/KR20220061247A/en active Search and Examination
- 2020-09-29 EP EP20869647.6A patent/EP4029265A4/en active Pending
- 2020-09-29 CN CN202080068079.5A patent/CN114731427A/en active Pending
- 2020-09-29 US US17/764,385 patent/US11831928B2/en active Active
- 2020-09-29 TW TW109133764A patent/TWI774075B/en active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013155028A1 (en) * | 2012-04-09 | 2013-10-17 | Vid Scale, Inc. | Weighted prediction parameter signaling for video coding |
US20140169475A1 (en) * | 2012-12-17 | 2014-06-19 | Qualcomm Incorporated | Motion vector prediction in video coding |
WO2019147628A1 (en) * | 2018-01-24 | 2019-08-01 | Vid Scale, Inc. | Generalized bi-prediction for video coding with reduced coding complexity |
Non-Patent Citations (3)
Title |
---|
M.-S. CHIANG, C.-W. HSU, Y.-W. HUANG, S.-M. LEI (MEDIATEK): "CE10.1.1: Multi-hypothesis prediction for improving AMVP mode, skip or merge mode, and intra mode", 12. JVET MEETING; 20181003 - 20181012; MACAO; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-L0100, 24 September 2018 (2018-09-24), XP030193644 * |
M.-S. CHIANG, C.-W. HSU, Y.-W. HUANG, S.-M. LEI (MEDIATEK): "CE10.1.4: Simplification of combined inter and intra prediction", 13. JVET MEETING; 20190109 - 20190118; MARRAKECH; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-M0177, 2 January 2019 (2019-01-02), XP030200216 * |
See also references of EP4029265A4 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11889108B2 (en) | 2018-10-22 | 2024-01-30 | Beijing Bytedance Network Technology Co., Ltd | Gradient computation in bi-directional optical flow |
US11838539B2 (en) | 2018-10-22 | 2023-12-05 | Beijing Bytedance Network Technology Co., Ltd | Utilization of refined motion vector |
US11277624B2 (en) | 2018-11-12 | 2022-03-15 | Beijing Bytedance Network Technology Co., Ltd. | Bandwidth control methods for inter prediction |
US11284088B2 (en) | 2018-11-12 | 2022-03-22 | Beijing Bytedance Network Technology Co., Ltd. | Using combined inter intra prediction in video processing |
US11516480B2 (en) | 2018-11-12 | 2022-11-29 | Beijing Bytedance Network Technology Co., Ltd. | Simplification of combined inter-intra prediction |
US11956449B2 (en) | 2018-11-12 | 2024-04-09 | Beijing Bytedance Network Technology Co., Ltd. | Simplification of combined inter-intra prediction |
US11843725B2 (en) | 2018-11-12 | 2023-12-12 | Beijing Bytedance Network Technology Co., Ltd | Using combined inter intra prediction in video processing |
EP3857889A4 (en) * | 2018-11-16 | 2021-09-22 | Beijing Bytedance Network Technology Co. Ltd. | Weights in combined inter intra prediction mode |
US11956465B2 (en) | 2018-11-20 | 2024-04-09 | Beijing Bytedance Network Technology Co., Ltd | Difference calculation based on partial position |
US11509923B1 (en) | 2019-03-06 | 2022-11-22 | Beijing Bytedance Network Technology Co., Ltd. | Usage of converted uni-prediction candidate |
US11930165B2 (en) | 2019-03-06 | 2024-03-12 | Beijing Bytedance Network Technology Co., Ltd | Size dependent inter coding |
WO2023040993A1 (en) * | 2021-09-16 | 2023-03-23 | Beijing Bytedance Network Technology Co., Ltd. | Method, device, and medium for video processing |
WO2023154359A1 (en) * | 2022-02-11 | 2023-08-17 | Beijing Dajia Internet Information Technology Co., Ltd. | Methods and devices for multi-hypothesis-based prediction |
Also Published As
Publication number | Publication date |
---|---|
TW202121901A (en) | 2021-06-01 |
EP4029265A4 (en) | 2023-11-08 |
US20220360824A1 (en) | 2022-11-10 |
KR20220061247A (en) | 2022-05-12 |
MX2022003827A (en) | 2023-01-26 |
US11831928B2 (en) | 2023-11-28 |
TWI774075B (en) | 2022-08-11 |
CN114731427A (en) | 2022-07-08 |
EP4029265A1 (en) | 2022-07-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021058033A1 (en) | Method and apparatus of combined inter and intra prediction with different chroma formats for video coding | |
US11109052B2 (en) | Method of motion vector derivation for video coding | |
US11259025B2 (en) | Method and apparatus of adaptive multiple transforms for video coding | |
US11089323B2 (en) | Method and apparatus of current picture referencing for video coding | |
US10334281B2 (en) | Method of conditional binary tree block partitioning structure for video and image coding | |
US11956421B2 (en) | Method and apparatus of luma most probable mode list derivation for video coding | |
EP3130147B1 (en) | Methods of block vector prediction and decoding for intra block copy mode coding | |
US20190215521A1 (en) | Method and apparatus for video coding using decoder side intra prediction derivation | |
US20170310988A1 (en) | Method of Motion Vector Predictor or Merge Candidate Derivation in Video Coding | |
US11381838B2 (en) | Method and apparatus of improved merge with motion vector difference for video coding | |
WO2019210857A1 (en) | Method and apparatus of syntax interleaving for separate coding tree in video coding | |
US20220286714A1 (en) | Method and Apparatus of Partitioning Small Size Coding Units with Partition Constraints | |
US20220224890A1 (en) | Method and Apparatus of Partitioning Small Size Coding Units with Partition Constraints | |
EP4243416A2 (en) | Method and apparatus of chroma direct mode generation for video coding | |
WO2024088058A1 (en) | Method and apparatus of regression-based intra prediction in video coding system | |
WO2023207511A1 (en) | Method and apparatus of adaptive weighting for overlapped block motion compensation in video coding system | |
WO2024083251A1 (en) | Method and apparatus of region-based intra prediction using template-based or decoder side intra mode derivation in video coding system | |
WO2023020390A1 (en) | Method and apparatus for low-latency template matching in video coding system | |
WO2023207646A1 (en) | Method and apparatus for blending prediction in video coding system | |
WO2024022325A1 (en) | Method and apparatus of improving performance of convolutional cross-component model in video coding system | |
US20230119121A1 (en) | Method and Apparatus for Signaling Slice Partition Information in Image and Video Coding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20869647 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 20227013214 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2020869647 Country of ref document: EP Effective date: 20220414 |