WO2021058033A1 - Method and apparatus of combined inter and intra prediction with different chroma formats for video coding - Google Patents


Info

Publication number
WO2021058033A1
Authority
WO
WIPO (PCT)
Prior art keywords
current block
prediction
mode
hypothesis
coding
Prior art date
Application number
PCT/CN2020/118961
Other languages
French (fr)
Inventor
Man-Shu CHIANG
Chih-Wei Hsu
Tzu-Der Chuang
Original Assignee
Mediatek Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mediatek Inc. filed Critical Mediatek Inc.
Priority to MX2022003827A priority Critical patent/MX2022003827A/en
Priority to CN202080068079.5A priority patent/CN114731427A/en
Priority to KR1020227013214A priority patent/KR20220061247A/en
Priority to EP20869647.6A priority patent/EP4029265A4/en
Priority to TW109133764A priority patent/TWI774075B/en
Priority to US17/764,385 priority patent/US11831928B2/en
Publication of WO2021058033A1 publication Critical patent/WO2021058033A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/1883 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit relating to sub-band structure, e.g. hierarchical level, directional tree, e.g. low-high [LH], high-low [HL], high-high [HH]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • There are two kinds of prediction modes (i.e., Intra prediction and Inter prediction) for each PU.
  • For Intra prediction modes, the spatial neighbouring reconstructed pixels can be used to generate the directional predictions.
  • VVC Versatile Video Coding
  • JVET Joint Video Exploration Team
  • First weighting factor group: {7/8, 6/8, 4/8, 2/8, 1/8} and {7/8, 4/8, 1/8} are used for the luminance and the chrominance samples, respectively;
  • One weighting factor group is selected based on the comparison of the motion vectors of two triangular prediction units.
  • the second weighting factor group is used when the reference pictures of the two triangular prediction units are different from each other or their motion vector difference is larger than 16 pixels. Otherwise, the first weighting factor group is used.
  • An example is shown in Fig. 2, where weightings 210 are shown for the luma block and weightings 220 are shown for the chroma block. A more detailed explanation of the algorithm can be found in JVET-L0124 and JVET-L0208.
  • Geometric Merge mode also called geometric partitioning mode, GPM
  • GPM geometric partitioning mode
  • The 140 modes are defined as CE4-1.1 in P0068. To further reduce the complexity, GEO with 108 modes and with 80 modes is tested in CE4-1.2. In CE4-1.14, a TPM-like simplified motion storage is tested.
  • the proposed GEO partitioning for Inter is allowed for uni-predicted blocks not smaller than 8 ⁇ 8 in order to have the same memory bandwidth usage as the bi-predicted blocks at the decoder side.
  • Motion vector prediction for GEO partitioning is aligned with TPM. Also, the TPM blending between two predictions is applied on inner boundary.
  • The split boundary of geometric Merge mode is described by an angle φi and a distance offset ρi as shown in Fig. 4.
  • The angle φi represents a quantized angle between 0 and 360 degrees and the distance offset ρi represents a quantized offset of the largest distance ρmax.
  • the split directions overlapped with binary tree splits and TPM splits are excluded.
  • Angles are quantized between 0 and 360 degrees with a fixed step.
  • The angle is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles as shown in Fig. 5A.
  • Fig. 5B illustrates the reduced angles with 24 values.
  • The distance ρi is quantized from the largest possible distance ρmax with a fixed step.
  • The value of ρmax can be geometrically derived by Eq. (1) for either w or h equal to 8 and scaled with the log2 scaled short edge length. For the case where φ is equal to 0 degrees, ρmax is equal to w/2, and for the case where φ is equal to 90 degrees, ρmax is equal to h/2. The "1.0" samples shifted back avoid the split boundary being too close to the corner.
  • In CE4-1.1, the distance ρi is quantized with 5 steps; combined with 32 angles, there is a total of 140 split modes excluding the binary tree and TPM splits. In the first CE4-1.2 configuration, the distance ρi is quantized with 4 steps; combined with 32 angles, there is a total of 108 split modes excluding the binary tree and TPM splits. In the second CE4-1.2 configuration, the distance ρi is quantized with 4 steps; combined with 24 angles, there is a total of 80 split modes excluding the binary tree and TPM splits.
  • the GEO mode is signalled as an additional Merge mode together with TPM mode as shown in Table 1.
  • merge_geo_flag[][] is signalled with 4 CABAC context models, where the first three are derived depending on the modes of the above and left neighbouring blocks, and the fourth is derived depending on the aspect ratio of the current block.
  • merge_geo_flag[][] indicates whether the current block uses GEO mode or TPM mode, which is similar to a "most probable mode" flag.
  • The geo_partition_idx[][] is used as an index into the lookup table that stores the angle φi and distance ρi pairs.
  • The geo_partition_idx is binarized using truncated binary code and coded using bypass bins.
  • A method and apparatus for video coding are disclosed. According to this method, a current block is received at an encoder side or compressed data comprising the current block is received at a decoder side, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode are allowed for the current block.
  • the single partition tree is a single tree for luma and chroma.
  • A target coding mode is determined for the current block.
  • the current block is then encoded or decoded according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
  • the additional hypothesis of prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold and the threshold is equal to 4.
  • the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode. In another embodiment, the multi-hypothesis prediction mode corresponds to Triangular Prediction mode (TPM) . In yet another embodiment, the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO) .
  • CIIP Combined Inter/Intra Prediction
  • TPM Triangular Prediction mode
  • GEO Geometric Merge mode
  • The current block is in chroma format 4:4:4, 4:2:2 or 4:2:0.
  • The threshold is predefined implicitly in the standard or signalled at a Transform Unit (TU) or Transform Block (TB), Coding Unit (CU) or Coding Block (CB), Coding Tree Unit (CTU) or Coding Tree Block (CTB), slice, tile, tile group, Sequence Parameter Set (SPS), Picture Parameter Set (PPS), or picture level of a video bitstream.
  • TU Transform Unit
  • CU Coding Unit
  • CB Coding Block
  • CTU Coding Tree Unit
  • CTB Coding Tree Block
  • SPS Sequence Parameter Set
  • PPS Picture Parameter Set
  • the image area corresponds to a Coding Tree Unit (CTU) .
  • Fig. 1 illustrates an example of TPM (Triangular Prediction Mode) , where a CU is split into two triangular prediction units, in either diagonal or inverse diagonal direction. Each triangular prediction unit in the CU is Inter-predicted using its own uni-prediction motion vector and reference frame index to generate prediction from a uni-prediction candidate.
  • TPM Triangular Prediction Mode
  • Fig. 2 illustrates an example of adaptive weighting process, where weightings are shown for the luma block (left) and the chroma block (right) .
  • Fig. 3A illustrates partition shapes for the triangular prediction mode (TPM) as disclosed in VTM-6.0
  • Fig. 3B illustrates additional shapes being discussed for geometric Merge mode.
  • Fig. 4 illustrates the split boundary of geometric Merge mode that is described by an angle φi and a distance offset ρi.
  • Fig. 5A illustrates an example where the angle is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles.
  • Fig. 5B illustrates an example where the angle is quantized between 0 and 360 degrees with a step of 11.25 degrees and some near-vertical direction angles are removed, which results in a total of 24 angles.
  • Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
  • Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
  • a multiple hypothesis (MH) prediction mode is disclosed.
  • an additional hypothesis of prediction is combined with the existing hypothesis of prediction by a weighted average process and the combined prediction is the final prediction of the current block.
  • a simplification method of multiple hypothesis (MH) prediction mode is disclosed, where the MH prediction mode is not applied to chroma blocks under certain conditions according to this invention.
  • When the MH prediction mode is not applied to chroma blocks, it means that the additional hypothesis of prediction is not combined with the existing hypothesis of prediction for the chroma block and the existing hypothesis of prediction is used as the final prediction of the current chroma block.
  • When the MH prediction mode is applied to chroma blocks, it means that the additional hypothesis of prediction is combined with the existing hypothesis of prediction and the combined prediction is used as the final prediction of the current chroma block.
  • When the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is applied.
  • MH prediction mode can be CIIP, TPM, or GEO.
  • the proposed method can be applied even if the original flag for MH mode (e.g., CIIP, TPM, or GEO) at the CU level is true.
  • MH mode is not applied to the chroma blocks even if the CU-level CIIP flag is true. It means that the final prediction for the luma block is the combined prediction, which is formed by the existing hypothesis of prediction and the additional hypothesis of prediction; for chroma blocks, the final prediction is the existing prediction.
  • the block size may range from 128 to 4 for the luma component or from 64 to 2 for the chroma components.
  • Intra blocks have more dependency than Inter blocks. The main concern is about 2xN Intra blocks. The smallest size for luma is already set as 4x4. 2xN Intra chroma is already removed in the dual-tree cases. However, there are still some 2xN Intra chroma blocks in single-tree cases (for example, 2xN Intra chroma blocks for CIIP).
  • “MH mode is not applied to the chroma blocks” means that additional hypothesis of prediction is not combined with the original (existing) hypothesis of prediction for chroma blocks.
  • "MH mode is not applied to the chroma blocks" means that, for the chroma blocks, Intra prediction is not combined with Inter prediction so that Inter prediction is used directly.
  • The proposed method is enabled for chroma format 4:4:4.
  • The proposed method is enabled for chroma format 4:2:0.
  • The proposed method is enabled for chroma format 4:2:2.
  • The proposed method is enabled for chroma format 4:2:1.
  • The proposed method is enabled for chroma format 4:1:1.
  • The proposed method is enabled for chroma format 4:0:0 (i.e., monochrome).
  • the pre-defined condition is in terms of block width, height, or area.
  • The block in this embodiment can be a luma block or a chroma block.
  • The corresponding block width or height depends on the used chroma format. For example, if the used chroma format is 4:2:0, the corresponding block width is assigned half of the width of the collocated luma block.
  • the pre-defined condition is that the block width is smaller than threshold-1 and/or the block height is smaller than threshold-2.
  • In the proposed method, the MH prediction mode is not applied to the chroma block.
  • the chroma block can be a chroma block for Cb component or Cr component.
  • the pre-defined condition is that the block width is larger than threshold-1 and/or the block height is larger than threshold-2.
  • the pre-defined condition is that the block area is smaller than threshold-3.
  • the pre-defined condition is that the block area is larger than threshold-3.
  • threshold-1 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
  • Threshold-1 can be a variable defined at TU (or TB), CU (or CB), CTU (or CTB), slice, tile, tile group, SPS, PPS, or picture level.
  • the variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
  • threshold-2 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
  • Threshold-2 can be a variable defined at TU (or TB), CU (or CB), CTU (or CTB), slice, tile, tile group, SPS, PPS, or picture level.
  • the variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
  • threshold-3 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
  • Threshold-3 can be a variable defined at TU (or TB), CU (or CB), CTU (or CTB), slice, tile, tile group, SPS, PPS, or picture level.
  • the variable can be 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
  • threshold-1 and threshold-2 can be the same.
  • threshold-1, threshold-2, and/or threshold-3 can be different for different chroma formats.
  • the “block” in this invention can be CU, CB, TU or TB.
  • The proposed method is enabled depending on an explicit flag at TU (or TB), CU (or CB), CTU (or CTB), slice, tile, tile group, SPS, PPS, or picture level.
  • the proposed method can be used for the luma block, i.e., the multiple hypothesis (MH) prediction mode is not applied to the luma blocks under certain conditions.
  • When the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is applied.
  • MH mode is not applied to chroma.
  • When chroma format 4:4:4 is used and the chroma block width or height is smaller than 4, MH mode is not applied to chroma.
  • When chroma format 4:2:0 is used and the chroma block width (depending on the used chroma format) is smaller than 4, MH mode is not applied to chroma.
  • When the other enabling conditions of MH mode are satisfied (e.g., assuming MH mode is CIIP, the CIIP flag is enabled) and the chroma block width (depending on the used chroma format) is larger than or equal to 4, MH mode is applied to not only the luma block but also the chroma blocks.
  • any of the foregoing proposed methods can be implemented in encoders and/or decoders.
  • Any of the proposed methods can be implemented in an Intra/Inter coding module of an encoder, or in a motion compensation module or a Merge candidate derivation module of a decoder.
  • Any of the proposed methods can be implemented as a circuit coupled to the Intra/Inter coding module of an encoder and/or to the motion compensation module or the Merge candidate derivation module of the decoder.
  • Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) .
  • the steps shown in the flowchart, as well as other following flowcharts in this disclosure, may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side and/or the decoder side.
  • The steps shown in the flowchart may also be implemented in hardware, such as one or more electronic devices or processors arranged to perform the steps in the flowchart.
  • A current block comprising one luma block and one or more chroma blocks is received in step 610, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode are allowed for the current block.
  • the single partition tree is a single tree for luma and chroma.
  • a target coding mode for the current block is determined in step 620.
  • the current block is encoded according to the target coding mode in step 630, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
  • Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) .
  • Compressed data comprising a current block are received in step 710, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode are allowed for the current block.
  • the single partition tree is a single tree for luma and chroma.
  • a target coding mode for the current block is determined in step 720.
  • the current block is decoded according to the target coding mode in step 730, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
  • Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both.
  • An embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein.
  • An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
  • DSP Digital Signal Processor
  • the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) .
  • These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
  • the software code or firmware code may be developed in different programming languages and different formats or styles.
  • the software code may also be compiled for different target platforms.
  • different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
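Taken together, the enabling rule described in the embodiments above can be sketched as follows. This is an illustrative sketch, not normative text: the function name and the chroma subsampling table are assumptions, and the threshold of 4 follows the embodiment given earlier.

```python
# Horizontal/vertical subsampling factors per chroma format (assumed table).
CHROMA_SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}

def mh_applies_to_chroma(luma_w, luma_h, chroma_format, threshold=4):
    """Return True when the additional hypothesis of prediction is combined
    for the chroma blocks; False when MH mode is disabled for chroma and
    the existing hypothesis alone becomes the final chroma prediction."""
    sub_w, sub_h = CHROMA_SUBSAMPLING[chroma_format]
    chroma_w, chroma_h = luma_w // sub_w, luma_h // sub_h
    return chroma_w >= threshold and chroma_h >= threshold

# 8x16 luma CU in 4:2:0 -> 4x8 chroma: MH is still applied to chroma.
print(mh_applies_to_chroma(8, 16, "4:2:0"))   # True
# 4x16 luma CU in 4:2:0 -> 2x8 chroma: width < 4, MH disabled for chroma.
print(mh_applies_to_chroma(4, 16, "4:2:0"))   # False
```

Note that the luma block is unaffected by this rule: when the CU-level MH flag (e.g. the CIIP flag) is true, luma still uses the combined prediction.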

Abstract

A method and apparatus for video coding are disclosed. According to this method, a current block is received at an encoder side or compressed data comprising the current block is received at a decoder side, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode are allowed for the current block. A target coding mode is determined for the current block. The current block is then encoded or decoded according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and the width, height or area of said one or more chroma blocks is smaller than a threshold.

Description

METHOD AND APPARATUS OF COMBINED INTER AND INTRA PREDICTION WITH DIFFERENT CHROMA FORMATS FOR VIDEO CODING
CROSS REFERENCE TO RELATED APPLICATIONS
The present invention claims priority to U.S. Provisional Patent Application, Serial No. 62/907,699, filed on September 29, 2019. The U.S. Provisional Patent Application is hereby incorporated by reference in its entirety.
FIELD OF THE INVENTION
The present invention relates to prediction for video coding using CIIP (Combined Inter/Intra Prediction) . In particular, the present invention discloses techniques to improve processing throughput for small block sizes.
BACKGROUND AND RELATED ART
High-Efficiency Video Coding (HEVC) is a new international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) . HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit for compression, termed coding unit (CU) , is a 2Nx2N square block, and each CU can be recursively split into four smaller CUs until the predefined minimum size is reached. Each CU contains one or multiple prediction units (PUs) .
To achieve the best coding efficiency of hybrid coding architecture in HEVC, there are two kinds of prediction modes (i.e., Intra prediction and Inter prediction) for each PU. For Intra prediction modes, the spatial neighbouring reconstructed pixels can be used to generate the directional predictions.
After the development of the HEVC standard, another emerging video coding standard, named Versatile Video Coding (VVC), is being developed under the Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11. Various new coding tools along with some existing coding tools have been evaluated for VVC.
In VTM (VVC Test Model) software, when a CU is coded in Merge mode, and if the CU contains at least 64 luma samples (i.e., CU width × CU height equal to or larger than 64) , an additional flag (CIIP flag) is signalled at CU level to indicate if the Combined Inter/Intra Prediction (CIIP) mode is applied to the current CU. In order to form the CIIP prediction, an Intra prediction mode is first derived from two additional syntax elements or implicitly assigned. For example, planar mode is implicitly assigned as the Intra prediction mode. For another example, up to four possible Intra prediction modes can be used: DC, planar, horizontal, or vertical. The Inter prediction (the existing hypothesis of prediction) and Intra prediction signals (the additional hypothesis of prediction) are then derived using regular Intra and Inter decoding processes. Finally, weighted averaging of the Inter and Intra prediction signals is performed to obtain the CIIP prediction. A more detailed explanation of the algorithm can be found in JVET-L0100 (M. -S. Chiang, et al., “CE10.1.1: Multi-hypothesis prediction for improving AMVP mode, skip or merge mode, and intra mode, ” ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Macao, CN, Oct. 2018, Document: JVET-L0100) .
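The final weighted-averaging step can be sketched as follows. This is an illustrative integer blend, not the normative VVC process; the derivation of the intra weight from neighbouring blocks is simplified to a parameter here.

```python
def ciip_blend(pred_inter, pred_intra, w_intra=2, shift=2):
    """Combine the Inter hypothesis with the Intra hypothesis by a
    weighted average with rounding, in integer arithmetic.
    w_intra is the intra weight out of (1 << shift); VVC derives it
    from the modes of neighbouring blocks (values 1, 2 or 3 out of 4)."""
    total = 1 << shift
    return (pred_inter * (total - w_intra)
            + pred_intra * w_intra
            + (total >> 1)) >> shift

# Equal weights: (100*2 + 60*2 + 2) >> 2 = 80
print(ciip_blend(100, 60))             # 80
# Intra-favoured weight 3/4: (100*1 + 60*3 + 2) >> 2 = 70
print(ciip_blend(100, 60, w_intra=3))  # 70
```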
Triangular prediction
For VTM, in JVET-L0124 (R.-L. Liao, et al., “CE10.3.1.b: Triangular prediction unit mode,” ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Macao, CN, Oct. 2018, Document: JVET-L0124) and JVET-L0208 (T. Poirier, et al., “CE10 related: multiple prediction unit shapes,” ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Macao, CN, Oct. 2018, Document: JVET-L0208), the Triangular Prediction unit Mode (TPM) is proposed. The concept is to introduce a new triangular partition for motion compensated prediction. It splits a CU into two triangular prediction units, in either the diagonal or the inverse diagonal direction, as shown in Fig. 1. Each triangular prediction unit in the CU is Inter-predicted using its own uni-prediction motion vector and reference frame. An adaptive weighting process is performed on the diagonal edge after predicting the triangular prediction units. Then, the transform and quantization process is applied to the whole CU. It is noted that this mode is only applied to skip and merge modes. An additional flag is signalled to indicate if TPM is applied.
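The diagonal split into two triangular prediction units can be sketched as follows. This is an illustrative labelling only; samples near the edge are later blended by the adaptive weighting process described in the next subsection.

```python
def triangle_split(w, h, inverse=False):
    """Label each sample of a w x h block with the triangular prediction
    unit (0 or 1) it belongs to, split along the diagonal or, when
    inverse is True, along the inverse diagonal."""
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xx = (w - 1 - x) if inverse else x
            labels[y][x] = 1 if xx * h < y * w else 0
    return labels

for row in triangle_split(4, 4):
    print(row)
# [0, 0, 0, 0]
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 0]
```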
Adaptive weighting process
After predicting each triangular prediction unit, an adaptive weighting process is applied to the diagonal edge between the two triangular prediction units to derive the final prediction for the whole CU. Two weighting factor groups are listed as follows:
● First weighting factor group: {7/8, 6/8, 4/8, 2/8, 1/8} and {7/8, 4/8, 1/8} are used for the luminance and the chrominance samples, respectively;
● Second weighting factor group: {7/8, 6/8, 5/8, 4/8, 3/8, 2/8, 1/8} and {6/8, 4/8, 2/8} are used for the luminance and the chrominance samples, respectively.
One weighting factor group is selected based on the comparison of the motion vectors of two triangular prediction units. The second weighting factor group is used when the reference pictures of the two triangular prediction units are different from each other or their motion vector difference is larger than 16 pixels. Otherwise, the first weighting factor group is used. An example is shown in Fig. 2, where weightings 210 are shown for the luma block and weightings 220 are shown for the chroma block. A more detailed explanation of the algorithm can be found in JVET-L0124 and JVET-L0208.
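The selection rule between the two weighting factor groups can be sketched as follows. Measuring the motion vector difference component-wise in pixels is an assumption made for illustration, not the normative check:

```python
def select_weight_group(ref_idx0, ref_idx1, mv0, mv1):
    """Return the luma weighting factors for the diagonal edge.

    The second group is used when the two triangular prediction units
    use different reference pictures or their motion vector difference
    is larger than 16 pixels; otherwise the first group is used.
    """
    mvd = max(abs(mv0[0] - mv1[0]), abs(mv0[1] - mv1[1]))
    if ref_idx0 != ref_idx1 or mvd > 16:
        return [7/8, 6/8, 5/8, 4/8, 3/8, 2/8, 1/8]   # second weighting factor group
    return [7/8, 6/8, 4/8, 2/8, 1/8]                 # first weighting factor group

print(len(select_weight_group(0, 0, (0, 0), (4, 4))))   # first group: 5 factors
print(len(select_weight_group(0, 1, (0, 0), (0, 0))))   # second group: 7 factors
```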
Geometric Merge mode (GEO)
Geometric Merge mode (also called geometric partitioning mode, GPM) is proposed in JVET-P0068 (H. Gao, et al., “CE4: CE4-1.1, CE4-1.2 and CE4-1.14: Geometric Merge Mode (GEO) ” , ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 16th Meeting: Geneva, CH, 1–11 October 2019, Document: JVET-P0068) , which uses the same predictor blending concept as TPM and extends the blending masks to up to 140 different modes with 32 angles and 5 distance offsets.
The 140 modes are defined as CE4-1.1 in JVET-P0068. To further reduce the complexity, GEO with 108 modes and with 80 modes is tested in CE4-1.2. In CE4-1.14, a TPM-like simplified motion storage is tested.
Fig. 3A illustrates partition shapes (311-312) for TPM in VTM-6.0 and Fig. 3B illustrates additional shapes (313-319) being proposed for non-rectangular Inter blocks.
Similar to TPM, the proposed GEO partitioning for Inter is allowed for uni-predicted blocks not smaller than 8×8 in order to have the same memory bandwidth usage as bi-predicted blocks at the decoder side. Motion vector prediction for GEO partitioning is aligned with TPM. Also, the TPM blending between the two predictions is applied on the inner boundary.
The split boundary of geometric Merge mode is described by angle φ_i and distance offset ρ_i as shown in Fig. 4. Angle φ_i represents a quantized angle between 0 and 360 degrees and distance offset ρ_i represents a quantized offset of the largest distance ρ_max. In addition, the split directions overlapped with binary tree splits and TPM splits are excluded.
GEO angle and distance quantization
The angle φ_i is quantized between 0 and 360 degrees with a fixed step. In CE4-1.1, CE4-1.2 with 108 modes, and CE4-1.14, the angle φ_i is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles as shown in Fig. 5A.
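The 32-angle count follows directly from the 11.25-degree step:

```python
# Quantizing the angle over [0, 360) with a fixed step of 11.25 degrees
# yields the 32 candidate angles illustrated in Fig. 5A.
STEP = 11.25
angles = [i * STEP for i in range(int(360 / STEP))]
print(len(angles))  # 32
```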
In CE4-1.2 with 80 modes, the angle φ_i is still quantized with 11.25-degree steps; however, the near-vertical direction angles (i.e., angles giving near-horizontal split boundaries) are removed, since in natural content objects and motions are mostly horizontal. Fig. 5B illustrates the reduced set of 24 angles.
Distance ρ_i is quantized from the largest possible distance ρ_max with a fixed step. The value of ρ_max can be geometrically derived by Eq. (1) for either w or h equal to 8 and scaled with the log2-scaled short edge length. For the case where φ_i is equal to 0 degrees, ρ_max is equal to w/2, and for the case where φ_i is equal to 90 degrees, ρ_max is equal to h/2. The "1.0" samples shifted back avoid the split boundary lying too close to the corner.

ρ_max = cos(φ_i) · ((h/2) · tan(φ_i) + w/2) − 1.0        (1)
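As a sketch, taking Eq. (1) in the form ρ_max = cos(φ)·((h/2)·tan(φ) + w/2) − 1.0 (an assumption about the exact expression, consistent with the 0-degree and 90-degree cases described above), the endpoint cases reduce to w/2 − 1 and h/2 − 1:

```python
import math

def rho_max(phi_deg, w, h):
    """Largest split-boundary offset for a w x h block, shifted back by
    1.0 sample so the boundary does not sit on the block corner
    (assumed form of Eq. (1))."""
    phi = math.radians(phi_deg)
    # cos(phi)*((h/2)*tan(phi) + w/2) - 1.0  ==  (h/2)*sin(phi) + (w/2)*cos(phi) - 1.0
    return (h / 2) * math.sin(phi) + (w / 2) * math.cos(phi) - 1.0

print(rho_max(0, 8, 8))   # w/2 - 1
print(rho_max(90, 8, 8))  # h/2 - 1
```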
In CE4-1.1 and CE4-1.14, the distance ρ_i is quantized with 5 steps. Combined with 32 angles, there is a total of 140 split modes excluding the binary tree and TPM splits. In CE4-1.2 with 108 modes, the distance ρ_i is quantized with 4 steps; combined with 32 angles, there is a total of 108 split modes excluding the binary tree and TPM splits. In CE4-1.2 with 80 modes, the distance ρ_i is quantized with 4 steps; combined with 24 angles, there is a total of 80 split modes excluding the binary tree and TPM splits.
Mode signalling
According to the proposed method, the GEO mode is signalled as an additional Merge mode together with TPM mode as shown in Table 1.
Table 1 Syntax elements introduced by the proposal
(Table 1 is reproduced as an image in the original publication; it lists the merge_geo_flag[][] and geo_partition_idx[][] syntax elements described below.)
The merge_geo_flag[][] is signalled with 4 CABAC context models, where the first three are derived depending on the modes of the above and left neighbouring blocks, and the fourth is derived depending on the aspect ratio of the current block. merge_geo_flag[][] indicates whether the current block uses GEO mode or TPM mode, which is similar to a “most probable mode” flag.
The geo_partition_idx[][] is used as an index to the lookup table that stores the angle φ_i and distance ρ_i pairs. The geo_partition_idx is binarized using truncated binary and coded with bypass bins.
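A truncated binary (TB) binarization for an alphabet of size n assigns the first u = 2^(k+1) − n symbols k = ⌊log2(n)⌋ bits and the remaining symbols k + 1 bits. The sketch below illustrates the scheme; the alphabet size 5 is only an example, not the actual GEO lookup-table size:

```python
def truncated_binary(v, n):
    """Truncated binary codeword (as a bit string) for symbol v in an
    alphabet of size n, the binarization used for geo_partition_idx."""
    k = n.bit_length() - 1        # floor(log2(n))
    u = (1 << (k + 1)) - n        # number of short (k-bit) codewords
    if v < u:
        return format(v, f"0{k}b") if k > 0 else ""
    return format(v + u, f"0{k + 1}b")  # long (k+1-bit) codewords

print([truncated_binary(v, 5) for v in range(5)])
# ['00', '01', '10', '110', '111']
```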
BRIEF SUMMARY OF THE INVENTION
A method and apparatus for video coding are disclosed. According to this method, a current block is received at an encoder side or compressed data comprising the current block is received at a decoder side, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block. The single partition tree is a single tree for luma and chroma. A target coding mode is determined for the current block. The current block is then encoded or decoded according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and the width, height or area of said one or more chroma blocks is smaller than a threshold.
In one embodiment, the additional hypothesis of prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold and the threshold is equal to 4.
In one embodiment, the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode. In another embodiment, the multi-hypothesis prediction mode corresponds to Triangular Prediction mode (TPM) . In yet another  embodiment, the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO) .
In one embodiment, the current block is in chroma format 4: 4: 4, 4: 2: 2 or 4: 2: 0.
In one embodiment, the threshold is predefined implicitly in the standard or signalled at a Transform Unit (TU) or Transform Block (TB) , Coding Unit (CU) or Coding Block (CB) , Coding Tree Unit (CTU) or Coding Tree Block (CTB) , slice, tile, tile group, Sequence Parameter Set (SPS) , Picture Parameter Set (PPS) , or picture level of a video bitstream.
In one embodiment, the image area corresponds to a Coding Tree Unit (CTU) .
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates an example of TPM (Triangular Prediction Mode) , where a CU is split into two triangular prediction units, in either diagonal or inverse diagonal direction. Each triangular prediction unit in the CU is Inter-predicted using its own uni-prediction motion vector and reference frame index to generate prediction from a uni-prediction candidate.
Fig. 2 illustrates an example of adaptive weighting process, where weightings are shown for the luma block (left) and the chroma block (right) .
Fig. 3A illustrates partition shapes for the triangular prediction mode (TPM) as disclosed in VTM-6.0.
Fig. 3B illustrates additional shapes being discussed for geometric Merge mode.
Fig. 4 illustrates the split boundary of geometric Merge mode, which is described by angle φ_i and distance offset ρ_i.
Fig. 5A illustrates an example where the angle φ_i is quantized between 0 and 360 degrees with a step of 11.25 degrees, which results in a total of 32 angles.
Fig. 5B illustrates an example where the angle φ_i is quantized between 0 and 360 degrees with a step of 11.25 degrees and some near-vertical direction angles are removed, which results in a total of 24 angles.
Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks.
DETAILED DESCRIPTION OF THE INVENTION
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
To improve the coding efficiency, a multiple hypothesis (MH) prediction mode is disclosed. When the current block is using an MH prediction mode, an additional hypothesis of prediction is combined with the existing hypothesis of prediction by a weighted average process and the combined prediction is the final prediction of the current block. In order to overcome the processing efficiency issue associated with small blocks, a simplification method of the multiple hypothesis (MH) prediction mode is disclosed, where the MH prediction mode is not applied to chroma blocks under certain conditions according to this invention. When the MH prediction mode is not applied to chroma blocks, it means that the additional hypothesis of prediction is not combined with the existing hypothesis of prediction for the chroma block and the existing hypothesis of prediction is used as the final prediction of the current chroma block. When the MH prediction mode is applied to chroma blocks, it means that the additional hypothesis of prediction is combined with the existing hypothesis of prediction and the combined prediction is used as the final prediction of the current chroma block. When the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is then applied.
In one embodiment, MH prediction mode can be CIIP, TPM, or GEO.
In another embodiment, the proposed method can be applied even if the original flag for MH mode (e.g., CIIP, TPM, or GEO) at the CU level is true. For example, MH mode is not applied to the chroma blocks even if the CU-level CIIP flag is true. It means that the final prediction for the luma block is the combined prediction, which is formed by the existing hypothesis of prediction and the additional hypothesis of prediction; for chroma blocks, the final prediction is the existing prediction.
Current VVC supports a flexible partitioning mechanism including QT, BT, and TT. In this split structure, the block size may range from 128 to 4 for the luma component and from 64 to 2 for the chroma components. The introduction of small block sizes, i.e., 2xN, leads to an inefficient hardware implementation. It causes pipeline delay and requires processing 2xN pixels in the hardware architecture. In most hardware implementations, 4x1 pixels per CPU (or GPU) clock are processed for luma and chroma. However, it is asserted that an extra 2x2-pixel-per-clock process is needed for 2xN blocks. In addition, memory access (reading and writing) is inefficient with 2xN blocks, because each access fetches only 2x1 pixels. Intra blocks have more dependency than Inter blocks, and the biggest concern is about 2xN Intra blocks. The smallest size for luma is already set as 4x4, and 2xN Intra chroma is already removed in the dual-tree cases. However, there are still some 2xN Intra chroma blocks in single-tree cases (for example, 2xN Intra chroma blocks for CIIP) . In order to solve this issue, in another embodiment, “MH mode is not applied to the chroma blocks” means that the additional hypothesis of prediction is not combined with the original (existing) hypothesis of prediction for chroma blocks. In the case of CIIP, “MH mode is not applied to the chroma blocks” means that for the chroma blocks, Intra prediction is not combined with Inter prediction so that Inter prediction is used directly.
In another embodiment, the proposed method is enabled for chroma format 4: 4: 4.
In another embodiment, the proposed method is enabled for chroma format 4: 2: 0.
In another embodiment, the proposed method is enabled for chroma format 4: 2: 2.
In another embodiment, the proposed method is enabled for chroma format 4: 2: 1.
In another embodiment, the proposed method is enabled for chroma format 4: 1: 1.
In another embodiment, the proposed method is enabled for chroma format 4: 0: 0 (i.e., monochrome) .
In another embodiment, the pre-defined condition is in terms of block width, height, or area.
In one sub-embodiment, “block” in this embodiment can be a luma block or a chroma block. When the block means a chroma block, the corresponding block width or height depends on the used chroma format. For example, if the used chroma format is 4: 2: 0, the corresponding block width is assigned half of the width of the collocated luma block.
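For illustration, the chroma block dimensions for the common subsampled formats follow directly from the collocated luma block:

```python
# Horizontal and vertical chroma subsampling factors per chroma format.
SUBSAMPLING = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def chroma_dims(luma_w, luma_h, chroma_format):
    """Chroma block width/height derived from the collocated luma block."""
    sx, sy = SUBSAMPLING[chroma_format]
    return luma_w // sx, luma_h // sy

print(chroma_dims(8, 16, "4:2:0"))  # (4, 8)
```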
In one sub-embodiment, the pre-defined condition is that the block width is smaller than threshold-1 and/or the block height is smaller than threshold-2. For example, when CIIP flag is enabled and the block width of the corresponding chroma block is smaller than 4, the proposed method (MH prediction mode is not applied to the chroma block) is used. The chroma block can be a chroma block for Cb component or Cr component.
In another sub-embodiment, the pre-defined condition is that the block width is larger than threshold-1 and/or the block height is larger than threshold-2.
In another sub-embodiment, the pre-defined condition is that the block area is  smaller than threshold-3.
In another sub-embodiment, the pre-defined condition is that the block area is larger than threshold-3.
In another embodiment, threshold-1 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-1 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level. The variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-2 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-2 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level. The variable is 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-3 can be a positive integer such as 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another embodiment, threshold-3 can be a variable defined in TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level. The variable can be 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, or 1024.
In another sub-embodiment, threshold-1 and threshold-2 can be the same.
In another sub-embodiment, threshold-1, threshold-2, and/or threshold-3 can be different for different chroma formats.
In another embodiment, the “block” in this invention can be CU, CB, TU or TB.
In another embodiment, the proposed method is enabled depending on an explicit  flag at TU (or TB) , CU (or CB) , CTU (or CTB) , slice, tile, tile group, SPS, PPS, or picture level.
In another embodiment, the proposed method can be used for the luma block, i.e., the multiple hypothesis (MH) prediction mode is not applied to the luma blocks under certain conditions. When the proposed method is enabled and the pre-defined condition is satisfied, the proposed method is applied.
Any combination of the above methods can be applied. For example, when chroma format 4: 4: 4 is used and when the chroma block width or height is smaller than 4, MH mode is not applied to chroma. For another example, when chroma format 4: 2: 0 is used and the chroma block width (depending on the used chroma format) is smaller than 4, MH mode is not applied to chroma. In other words, when other enabling conditions of MH mode are satisfied (e.g. assuming MH mode is CIIP, CIIP flag is enabled) and the chroma block width (depending on the used chroma format) is larger than or equal to 4, MH mode is applied to not only the luma block but also chroma blocks.
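The combined condition in the examples above can be sketched as follows; the function name, the hard-coded threshold of 4, and checking only the width are illustrative assumptions, not normative syntax:

```python
def mh_applies_to_chroma(luma_w, luma_h, chroma_format, threshold=4):
    """Sketch of the proposed rule: the additional hypothesis is skipped
    for the chroma blocks when the chroma block width (derived from the
    chroma format) falls below the threshold."""
    sub_w = {"4:4:4": 1, "4:2:2": 2, "4:2:0": 2}[chroma_format]
    chroma_w = luma_w // sub_w
    return chroma_w >= threshold

# 4:2:0, 4xN luma -> 2xN chroma: additional hypothesis disabled for chroma
print(mh_applies_to_chroma(4, 16, "4:2:0"))   # False
print(mh_applies_to_chroma(8, 16, "4:2:0"))   # True
```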
Any of the foregoing proposed methods can be implemented in encoders and/or decoders. For example, any of the proposed methods can be implemented in an Intra/Inter coding module of an encoder, or in a motion compensation module or a Merge candidate derivation module of a decoder. Alternatively, any of the proposed methods can be implemented as a circuit coupled to the Intra/Inter coding module of the encoder and/or the motion compensation module or Merge candidate derivation module of the decoder.
Fig. 6 illustrates a flowchart of an exemplary prediction for video encoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) . The steps shown in the flowchart, as well as other following flowcharts in this disclosure, may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side and/or the decoder side. The steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to this method, a current block comprising one luma block and one or more chroma blocks is received in step 610, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block. The single partition tree is a single tree for luma and chroma. A target coding mode for the current block is determined in step 620. The current block is encoded according to the target coding mode in step 630, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and the width, height or area of said one or more chroma blocks is smaller than a threshold.
Fig. 7 illustrates a flowchart of an exemplary prediction for video decoding according to an embodiment of the present invention, where the additional hypothesis of prediction is disabled for small chroma blocks (the existing prediction is used as the final prediction for the small chroma blocks) . According to this method, compressed data comprising a current block are received in step 710, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block. The single partition tree is a single tree for luma and chroma. A target coding mode for the current block is determined in step 720. The current block is decoded according to the target coding mode in step 730, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
The flowcharts shown are intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) . These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims (18)

  1. A method of video encoding, the method comprising:
    receiving a current block comprising one luma block and one or more chroma blocks, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;
    determining a target coding mode for the current block; and
    encoding the current block according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
  2. The method of Claim 1, wherein the additional hypothesis of prediction is Intra prediction and the Intra prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold equal to 4.
  3. The method of Claim 1, wherein the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode.
  4. The method of Claim 1, wherein the multi-hypothesis prediction mode corresponds to Triangular Prediction mode (TPM) .
  5. The method of Claim 1, wherein the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO) .
  6. The method of Claim 1, wherein the current block is in chroma format 4: 4: 4, 4: 2: 2 or 4: 2: 0.
  7. The method of Claim 1, wherein the threshold is signalled at a Transform Unit (TU) or Transform Block (TB) , Coding Unit (CU) or Coding Block (CB) , Coding Tree Unit (CTU) or Coding Tree Block (CTB) , slice, tile, tile group, Sequence Parameter Set (SPS) , Picture Parameter Set (PPS) , or picture level of a video bitstream.
  8. The method of Claim 1, wherein the image area corresponds to a Coding Tree Unit (CTU) .
  9. An apparatus of video encoding, the apparatus comprising one or more electronic circuits or processors arranged to:
    receive a current block comprising one luma block and one or more chroma blocks, wherein the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;
    determine a target coding mode for the current block; and
    encode the current block according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if width, height or area of said one or more chroma blocks is smaller than a threshold and the target coding mode corresponds to the multi-hypothesis prediction mode.
  10. A method of video decoding, the method comprising:
    receiving compressed data comprising a current block, wherein the current block comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;
    determining a target coding mode for the current block; and
    decoding the current block according to the target coding mode, wherein an  additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
  11. The method of Claim 10, wherein the additional hypothesis of prediction is Intra prediction and the Intra prediction for said one or more chroma blocks is disabled if the width of said one or more chroma blocks is smaller than the threshold equal to 4.
  12. The method of Claim 10, wherein the multi-hypothesis prediction mode corresponds to Combined Inter/Intra Prediction (CIIP) mode.
  13. The method of Claim 10, wherein the multi-hypothesis prediction mode corresponds to Triangular Prediction mode (TPM) .
  14. The method of Claim 10, wherein the multi-hypothesis prediction mode corresponds to Geometric Merge mode (GEO) .
  15. The method of Claim 10, wherein the current block is in chroma format 4: 4: 4, 4: 2: 2 or 4: 2: 0.
  16. The method of Claim 10, wherein the threshold is parsed at a Transform Unit (TU) or Transform Block (TB) , Coding Unit (CU) or Coding Block (CB) , Coding Tree Unit (CTU) or Coding Tree Block (CTB) , slice, tile, tile group, Sequence Parameter Set (SPS) , Picture Parameter Set (PPS) , or picture level of a video bitstream.
  17. The method of Claim 10, wherein the image area corresponds to a Coding Tree Unit (CTU) .
  18. An apparatus of video decoding, the apparatus comprising one or more electronic circuits or processors arranged to:
    receive compressed data comprising a current block, wherein the current block  comprises one luma block and one or more chroma blocks, the current block is generated by partitioning an image area using a single partition tree into one or more partitioned blocks comprising the current block, and one or more coding tools comprising a multi-hypothesis prediction mode is allowed for the current block;
    determine a target coding mode for the current block, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if a width, a height or an area of said one or more chroma blocks is smaller than a threshold; and
    decode the current block according to the target coding mode, wherein an additional hypothesis of prediction for said one or more chroma blocks is disabled if the target coding mode corresponds to the multi-hypothesis prediction mode and width, height or area of said one or more chroma blocks is smaller than a threshold.
PCT/CN2020/118961 2019-09-29 2020-09-29 Method and apparatus of combined inter and intra prediction with different chroma formats for video coding WO2021058033A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
MX2022003827A MX2022003827A (en) 2019-09-29 2020-09-29 Method and apparatus of combined inter and intra prediction with different chroma formats for video coding.
CN202080068079.5A CN114731427A (en) 2019-09-29 2020-09-29 Method and apparatus for video encoding and decoding incorporating intra-frame inter-prediction with different chroma formats
KR1020227013214A KR20220061247A (en) 2019-09-29 2020-09-29 Method and apparatus of combined inter and intra prediction using different chroma formats for video coding
EP20869647.6A EP4029265A4 (en) 2019-09-29 2020-09-29 Method and apparatus of combined inter and intra prediction with different chroma formats for video coding
TW109133764A TWI774075B (en) 2019-09-29 2020-09-29 Method and apparatus of multi-hypothesis prediction mode with different chroma formats for video coding
US17/764,385 US11831928B2 (en) 2019-09-29 2020-09-29 Method and apparatus of combined inter and intra prediction with different chroma formats for video coding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962907699P 2019-09-29 2019-09-29
US62/907,699 2019-09-29

Publications (1)

Publication Number Publication Date
WO2021058033A1 true WO2021058033A1 (en) 2021-04-01

Family

ID=75166765

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118961 WO2021058033A1 (en) 2019-09-29 2020-09-29 Method and apparatus of combined inter and intra prediction with different chroma formats for video coding

Country Status (7)

Country Link
US (1) US11831928B2 (en)
EP (1) EP4029265A4 (en)
KR (1) KR20220061247A (en)
CN (1) CN114731427A (en)
MX (1) MX2022003827A (en)
TW (1) TWI774075B (en)
WO (1) WO2021058033A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024017188A1 (en) * 2022-07-22 2024-01-25 Mediatek Inc. Method and apparatus for blending prediction in video coding system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013155028A1 (en) * 2012-04-09 2013-10-17 Vid Scale, Inc. Weighted prediction parameter signaling for video coding
US20140169475A1 (en) * 2012-12-17 2014-06-19 Qualcomm Incorporated Motion vector prediction in video coding
WO2019147628A1 (en) * 2018-01-24 2019-08-01 Vid Scale, Inc. Generalized bi-prediction for video coding with reduced coding complexity

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114375582A (en) * 2019-06-24 2022-04-19 阿里巴巴集团控股有限公司 Method and system for processing luminance and chrominance signals
US11206413B2 (en) * 2019-08-13 2021-12-21 Qualcomm Incorporated Palette predictor updates for local dual trees
US11463693B2 (en) * 2019-08-30 2022-10-04 Qualcomm Incorporated Geometric partition mode with harmonized motion field storage and motion compensation
US11509910B2 (en) * 2019-09-16 2022-11-22 Tencent America LLC Video coding method and device for avoiding small chroma block intra prediction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
M.-S. CHIANG, C.-W. HSU, Y.-W. HUANG, S.-M. LEI (MEDIATEK): "CE10.1.1: Multi-hypothesis prediction for improving AMVP mode, skip or merge mode, and intra mode", 12. JVET MEETING; 20181003 - 20181012; MACAO; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-L0100, 24 September 2018 (2018-09-24), XP030193644 *
M.-S. CHIANG, C.-W. HSU, Y.-W. HUANG, S.-M. LEI (MEDIATEK): "CE10.1.4: Simplification of combined inter and intra prediction", 13. JVET MEETING; 20190109 - 20190118; MARRAKECH; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-M0177, 2 January 2019 (2019-01-02), XP030200216 *
See also references of EP4029265A4 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11889108B2 (en) 2018-10-22 2024-01-30 Beijing Bytedance Network Technology Co., Ltd Gradient computation in bi-directional optical flow
US11838539B2 (en) 2018-10-22 2023-12-05 Beijing Bytedance Network Technology Co., Ltd Utilization of refined motion vector
US11277624B2 (en) 2018-11-12 2022-03-15 Beijing Bytedance Network Technology Co., Ltd. Bandwidth control methods for inter prediction
US11284088B2 (en) 2018-11-12 2022-03-22 Beijing Bytedance Network Technology Co., Ltd. Using combined inter intra prediction in video processing
US11516480B2 (en) 2018-11-12 2022-11-29 Beijing Bytedance Network Technology Co., Ltd. Simplification of combined inter-intra prediction
US11956449B2 (en) 2018-11-12 2024-04-09 Beijing Bytedance Network Technology Co., Ltd. Simplification of combined inter-intra prediction
US11843725B2 (en) 2018-11-12 2023-12-12 Beijing Bytedance Network Technology Co., Ltd Using combined inter intra prediction in video processing
EP3857889A4 (en) * 2018-11-16 2021-09-22 Beijing Bytedance Network Technology Co. Ltd. Weights in combined inter intra prediction mode
US11956465B2 (en) 2018-11-20 2024-04-09 Beijing Bytedance Network Technology Co., Ltd Difference calculation based on partial position
US11509923B1 (en) 2019-03-06 2022-11-22 Beijing Bytedance Network Technology Co., Ltd. Usage of converted uni-prediction candidate
US11930165B2 (en) 2019-03-06 2024-03-12 Beijing Bytedance Network Technology Co., Ltd Size dependent inter coding
WO2023040993A1 (en) * 2021-09-16 2023-03-23 Beijing Bytedance Network Technology Co., Ltd. Method, device, and medium for video processing
WO2023154359A1 (en) * 2022-02-11 2023-08-17 Beijing Dajia Internet Information Technology Co., Ltd. Methods and devices for multi-hypothesis-based prediction

Also Published As

Publication number Publication date
TW202121901A (en) 2021-06-01
EP4029265A4 (en) 2023-11-08
US20220360824A1 (en) 2022-11-10
KR20220061247A (en) 2022-05-12
MX2022003827A (en) 2023-01-26
US11831928B2 (en) 2023-11-28
TWI774075B (en) 2022-08-11
CN114731427A (en) 2022-07-08
EP4029265A1 (en) 2022-07-20

Similar Documents

Publication Publication Date Title
WO2021058033A1 (en) Method and apparatus of combined inter and intra prediction with different chroma formats for video coding
US11109052B2 (en) Method of motion vector derivation for video coding
US11259025B2 (en) Method and apparatus of adaptive multiple transforms for video coding
US11089323B2 (en) Method and apparatus of current picture referencing for video coding
US10334281B2 (en) Method of conditional binary tree block partitioning structure for video and image coding
US11956421B2 (en) Method and apparatus of luma most probable mode list derivation for video coding
EP3130147B1 (en) Methods of block vector prediction and decoding for intra block copy mode coding
US20190215521A1 (en) Method and apparatus for video coding using decoder side intra prediction derivation
US20170310988A1 (en) Method of Motion Vector Predictor or Merge Candidate Derivation in Video Coding
US11381838B2 (en) Method and apparatus of improved merge with motion vector difference for video coding
WO2019210857A1 (en) Method and apparatus of syntax interleaving for separate coding tree in video coding
US20220286714A1 (en) Method and Apparatus of Partitioning Small Size Coding Units with Partition Constraints
US20220224890A1 (en) Method and Apparatus of Partitioning Small Size Coding Units with Partition Constraints
EP4243416A2 (en) Method and apparatus of chroma direct mode generation for video coding
WO2024088058A1 (en) Method and apparatus of regression-based intra prediction in video coding system
WO2023207511A1 (en) Method and apparatus of adaptive weighting for overlapped block motion compensation in video coding system
WO2024083251A1 (en) Method and apparatus of region-based intra prediction using template-based or decoder side intra mode derivation in video coding system
WO2023020390A1 (en) Method and apparatus for low-latency template matching in video coding system
WO2023207646A1 (en) Method and apparatus for blending prediction in video coding system
WO2024022325A1 (en) Method and apparatus of improving performance of convolutional cross-component model in video coding system
US20230119121A1 (en) Method and Apparatus for Signaling Slice Partition Information in Image and Video Coding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 20869647
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
ENP Entry into the national phase
    Ref document number: 20227013214
    Country of ref document: KR
    Kind code of ref document: A
ENP Entry into the national phase
    Ref document number: 2020869647
    Country of ref document: EP
    Effective date: 20220414