WO2024027700A1 - Joint indexing of geometric partitioning mode in video coding - Google Patents

Joint indexing of geometric partitioning mode in video coding

Info

Publication number
WO2024027700A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2023/110528
Other languages
English (en)
Inventor
Yu-Ling Hsiao
Man-Shu CHUANG
Chih-Wei Hsu
Original Assignee
Mediatek Inc.
Application filed by Mediatek Inc.
Publication of WO2024027700A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/176 Adaptive coding characterised by the coding unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/52 Processing of motion vectors by predictive encoding
    • H04N19/70 Coding characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present disclosure relates generally to video coding.
  • the present disclosure relates to methods of coding pixel blocks by geometric partitioning mode (GPM) .
  • High-Efficiency Video Coding is an international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) .
  • HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture.
  • The basic unit for compression, termed a coding unit (CU), is a 2Nx2N square block of pixels, and each CU can be recursively split into four smaller CUs until the predefined minimum size is reached.
  • Each CU contains one or multiple prediction units (PUs) .
  • Versatile Video Coding (VVC) is a video coding standard developed by the Joint Video Expert Team (JVET).
  • the input video signal is predicted from the reconstructed signal, which is derived from the coded picture regions.
  • the prediction residual signal is processed by a block transform.
  • the transform coefficients are quantized and entropy coded together with other side information in the bitstream.
  • the reconstructed signal is generated from the prediction signal and the reconstructed residual signal after inverse transform on the de-quantized transform coefficients.
  • the reconstructed signal is further processed by in-loop filtering for removing coding artifacts.
  • the decoded pictures are stored in the frame buffer for predicting the future pictures in the input video signal.
  • a coded picture is partitioned into non-overlapped square block regions represented by the associated coding tree units (CTUs) .
  • the leaf nodes of a coding tree correspond to the coding units (CUs) .
  • a coded picture can be represented by a collection of slices, each comprising an integer number of CTUs. The individual CTUs in a slice are processed in raster-scan order.
  • a bi-predictive (B) slice may be decoded using intra prediction or inter prediction with at most two motion vectors and reference indices to predict the sample values of each block.
  • a predictive (P) slice is decoded using intra prediction or inter prediction with at most one motion vector and reference index to predict the sample values of each block.
  • An intra (I) slice is decoded using intra prediction only.
  • a CTU can be partitioned into one or multiple non-overlapped coding units (CUs) using the quadtree (QT) with nested multi-type-tree (MTT) structure to adapt to various local motion and texture characteristics.
  • a CU can be further split into smaller CUs using one of the five split types: quad-tree partitioning, vertical binary tree partitioning, horizontal binary tree partitioning, vertical center-side triple-tree partitioning, horizontal center-side triple-tree partitioning.
  • Each CU contains one or more prediction units (PUs) .
  • the prediction unit together with the associated CU syntax, works as a basic unit for signaling the predictor information.
  • the specified prediction process is employed to predict the values of the associated pixel samples inside the PU.
  • Each CU may contain one or more transform units (TUs) for representing the prediction residual blocks.
  • A transform unit (TU) comprises a transform block (TB) of luma samples and two corresponding transform blocks of chroma samples, and each TB corresponds to one residual block of samples from one color component.
  • An integer transform is applied to a transform block.
  • the level values of quantized coefficients together with other side information are entropy coded in the bitstream.
  • The abbreviations CTB (coding tree block), CB (coding block), PB (prediction block), and TB (transform block) are used in this disclosure.
  • Motion parameters consisting of motion vectors, reference picture indices, and a reference picture list usage index, together with additional information, are used for inter-predicted sample generation.
  • the motion parameter can be signalled in an explicit or implicit manner.
  • When a CU is coded with skip mode, the CU is associated with one PU and has no significant residual coefficients, no coded motion vector delta, and no reference picture index.
  • a merge mode is specified whereby the motion parameters for the current CU are obtained from neighbouring CUs, including spatial and temporal candidates, and additional schedules introduced in VVC.
  • the merge mode can be applied to any inter-predicted CU.
  • The alternative to merge mode is the explicit transmission of motion parameters, where the motion vector, the corresponding reference picture index for each reference picture list, the reference picture list usage flag, and other needed information are signalled explicitly for each CU.
  • Some embodiments of the disclosure provide a method for implementing a candidate list for different combinations of geometric partitioning mode (GPM).
  • a video coder generates a list of candidates, each candidate specifying (i) a partition mode and (ii) first and second prediction modes. At least a first candidate in the list of candidates specifies motion information for an inter-coded partition of the current block.
  • The video coder signals or receives a selection of a candidate from the list of candidates by, e.g., signaling or receiving an index that is assigned according to the computed costs of the candidates in the list.
  • the video coder segments the current block into a first partition and a second partition according to the partition mode of the selected candidate.
  • the video coder generates first and second predictions for the first and second partitions according to the first and second prediction modes of the selected candidate.
  • the video coder encodes or decodes the current block by using the first and second predictions.
  • the candidates in the list of candidates are assigned indices according to an order determined based on costs computed for the candidates.
  • the cost of a candidate is computed based on reconstructed samples neighboring the current block and reference samples derived according to the partition mode and the prediction modes of the candidate.
  • A candidate having the lowest cost among the costs computed for all candidates in the list is assigned the shortest code word among all candidates in the list.
  • the first candidate may specify a refinement of an inter-prediction based on the motion information of the first candidate.
  • the refinement may be for GPM-TM, based on minimizing a matching cost between reconstructed samples neighboring the current block and reconstructed samples neighboring a reference block identified by the motion information of the first candidate.
  • the refinement may be for GPM-MMVD, based on a motion vector difference that is specified by a distance and a direction.
  • the first candidate in the list of candidates may specify motion information for both the first and second partitions if both the first and second partitions of the first candidate are inter-coded.
  • the first candidate may specify an intra-prediction mode (e.g., an intra-prediction direction) for an intra coded partition of the current block.
  • The list of candidates may also include a second candidate that specifies two intra-prediction modes for two intra-coded partitions.
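The unified candidate list and its cost-based index assignment, as summarized above, can be sketched as follows. This is an illustrative sketch only, not the normative process: the candidate fields, the precomputed costs, and the helper names (GpmCandidate, assign_indices_by_cost) are assumptions for illustration.

```python
# Illustrative sketch: a unified GPM candidate list where each entry
# pairs a partition mode with two prediction modes, and indices are
# assigned in ascending template-cost order so the lowest-cost
# candidate maps to the shortest codeword. All names are hypothetical.
from dataclasses import dataclass

@dataclass
class GpmCandidate:
    partition_mode: int      # GPM split (angle/offset) index
    pred_mode_part0: str     # "inter" or "intra" for the first partition
    pred_mode_part1: str     # "inter" or "intra" for the second partition
    cost: float = 0.0        # template-matching cost (precomputed here)

def assign_indices_by_cost(candidates):
    """Return candidates in ascending-cost order; the position in the
    returned list is the signaled index (index 0 = shortest codeword)."""
    return sorted(candidates, key=lambda c: c.cost)

candidates = [
    GpmCandidate(3, "inter", "intra", cost=120.0),
    GpmCandidate(7, "inter", "inter", cost=45.0),
    GpmCandidate(3, "intra", "intra", cost=80.0),
]
ordered = assign_indices_by_cost(candidates)
# ordered[0] is the inter/inter candidate with the lowest cost (45.0)
```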
  • FIG. 1 shows the intra-prediction modes in different directions.
  • FIG. 2 conceptually illustrates template matching based on a search area around an initial motion vector (MV) .
  • FIG. 3 conceptually illustrates merge mode with motion vector difference (MMVD) candidates and their corresponding offsets.
  • FIG. 4 illustrates the partitioning of a coding unit by the geometric partitioning mode (GPM) .
  • FIG. 5 illustrates an example uni-prediction candidate list for a GPM partition and the selection of a uni-prediction MV for GPM.
  • FIG. 6 illustrates an example partition edge blending process for GPM for a coding unit.
  • FIGS. 7A-C illustrate GPM with inter and intra predictions for a current block.
  • FIG. 8 illustrates GPM with intra and intra prediction for a current block.
  • FIG. 9 conceptually illustrates extending GPM partition edge into the reference template.
  • FIG. 10 conceptually illustrates an example process that a video coder may perform for encoding or decoding a GPM-partitioned current block.
  • FIGS. 11A-C conceptually illustrate a unified GPM candidate list for coding the current block.
  • FIG. 12 illustrates an example video encoder that may encode pixel blocks using GPM.
  • FIG. 13 illustrates portions of the video encoder that implement a unified GPM candidate list.
  • FIG. 14 conceptually illustrates a process for using a unified candidate list of GPM combinations.
  • FIG. 15 illustrates an example video decoder that may decode GPM coded blocks.
  • FIG. 16 illustrates portions of the video decoder that implement a unified GPM candidate list.
  • FIG. 17 conceptually illustrates a process for using a unified candidate list of GPM combinations.
  • FIG. 18 conceptually illustrates an electronic system with which some embodiments of the present disclosure are implemented.
  • The intra-prediction method exploits one reference tier adjacent to the current prediction unit (PU) and one of the intra-prediction modes to generate the predictors for the current PU.
  • the Intra-prediction direction can be chosen among a mode set containing multiple prediction directions. For each PU coded by Intra-prediction, one index will be used and encoded to select one of the intra-prediction modes. The corresponding prediction will be generated and then the residuals can be derived and transformed.
  • The number of directional intra modes may be extended from 33, as used in HEVC, to 65 directional modes so that the range of k is from ±1 to ±16.
  • These denser directional intra prediction modes apply for all block sizes and for both luma and chroma intra predictions.
  • The number of intra-prediction modes is 35 (or 67).
  • Some modes are identified as a set of most probable modes (MPM) for intra-prediction of the current prediction block.
  • the encoder may reduce bit rate by signaling an index to select one of the MPMs instead of an index to select one of the 35 (or 67) intra-prediction modes.
  • the intra-prediction mode used in the left prediction block and the intra-prediction mode used in the above prediction block are used as MPMs.
  • If the two neighboring blocks use the same intra-prediction mode, that mode can be used as an MPM.
  • If a neighboring block is coded with a directional mode, the two neighboring directions immediately next to this directional mode can also be used as MPMs.
  • DC mode and Planar mode are also considered as MPMs to fill the available spots in the MPM set, especially if the left or above neighboring blocks are not available or not coded in intra-prediction, or if the intra-prediction modes in the neighboring blocks are not directional modes.
  • If the intra-prediction mode for the current prediction block is one of the modes in the MPM set, 1 or 2 bits are used to signal which one it is. Otherwise, the intra-prediction mode of the current block does not match any entry in the MPM set, and the current block is coded as a non-MPM mode. There are altogether 32 such non-MPM modes and a (5-bit) fixed-length coding method is applied to signal the mode.
  • the MPM list is constructed based on intra modes of the left and above neighboring block.
  • The mode of the left neighboring block is denoted as Left and the mode of the above neighboring block is denoted as Above, and the unified MPM list may be constructed as follows:
  • Max − Min is equal to 1:
  • Max − Min is greater than or equal to 62:
  • Max − Min is equal to 2:
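The general pattern of building an MPM list from the Left and Above modes can be sketched as follows. The exact handling of the Max − Min cases above is not reproduced, so the fill rules here (seeding with Planar and DC, then padding with angular modes adjacent to the larger neighbor mode) are simplifying assumptions, not the normative derivation.

```python
# Simplified sketch of MPM list construction from the Left and Above
# neighbor intra modes. Mode 0 is Planar, mode 1 is DC, and modes
# 2..66 are angular. The padding strategy is an assumption here.
PLANAR, DC = 0, 1

def build_mpm_list(left, above, size=6):
    mpm = [PLANAR]                       # Planar is always seeded first
    for m in (left, above, DC):
        if m not in mpm:
            mpm.append(m)
    anchor = max(left, above, 2)         # larger neighbor (angular) mode
    delta = 1
    while len(mpm) < size:               # pad with adjacent angular modes
        for cand in (anchor - delta, anchor + delta):
            if cand >= 2 and cand not in mpm and len(mpm) < size:
                mpm.append(cand)
        delta += 1
    return mpm

mpm = build_mpm_list(left=18, above=50)
# both neighbor modes end up in the list, alongside Planar and DC
```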
  • Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction.
  • In VVC, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks.
  • the replaced modes are signalled using the original mode indices, which are remapped to indices of wide angular modes after parsing.
  • A template matching method can be applied by computing the cost between reconstructed samples and predicted samples.
  • One of the examples is template-based intra mode derivation (TIMD) .
  • TIMD is a coding method in which the intra prediction mode of a CU is implicitly derived by using a neighboring template at both encoder and decoder, instead of the encoder signaling the exact intra prediction mode to the decoder.
  • Decoder-side intra mode derivation (DIMD) is a technique in which two intra prediction modes/angles/directions are derived from the reconstructed neighbor samples (template) of a block, and those two predictors are combined with the planar mode predictor with weights derived from the gradients.
  • the DIMD mode is used as an alternative prediction mode and is always checked in high-complexity RDO mode.
  • a texture gradient analysis is performed at both encoder and decoder sides. This process starts with an empty Histogram of Gradient (HoG) having 65 entries, corresponding to the 65 angular/directional intra prediction modes. Amplitudes of these entries are determined during the texture gradient analysis.
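The HoG accumulation described above can be sketched as follows. The central-difference gradient and the angle-to-entry mapping are simplified stand-ins for the actual texture gradient analysis; only the overall shape of the process (65 entries, amplitudes accumulated per direction) follows the description.

```python
# Sketch of the DIMD texture-gradient analysis: a 65-entry Histogram
# of Gradients (one entry per angular mode) is accumulated from
# template samples; the strongest entries give the derived modes.
import math

def dimd_histogram(samples):
    """samples: 2D list of reconstructed template sample values."""
    hog = [0.0] * 65
    for y in range(1, len(samples) - 1):
        for x in range(1, len(samples[0]) - 1):
            gx = samples[y][x + 1] - samples[y][x - 1]   # horizontal gradient
            gy = samples[y + 1][x] - samples[y - 1][x]   # vertical gradient
            if gx == 0 and gy == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi         # fold to [0, pi)
            mode = int(angle / math.pi * 64)             # map to an entry
            hog[mode] += abs(gx) + abs(gy)               # amplitude
    return hog

# A purely horizontal ramp has gradients in a single direction, so all
# amplitude accumulates in one histogram entry.
hog = dimd_histogram([[10 * x for x in range(4)] for _ in range(4)])
strongest = max(range(65), key=lambda m: hog[m])
```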
  • Template matching is a decoder-side MV derivation method to refine the motion information of the current CU by finding the closest match between a template of the current CU (e.g., top and/or left neighbouring blocks of the current CU) in the current picture and a set of pixels (i.e., same size to the template) in a reference picture.
  • FIG. 2 conceptually illustrates template matching based on a search area around an initial motion vector (MV) .
  • The video coder searches the reference picture or frame 201 within a [−8, +8]-pel search range around an initial MV 210 for a better or refined MV 211.
  • the search is based on minimizing the difference (or cost) between a current template 220 neighboring the current block 205 and a reference template 221 identified by the refined MV 211.
  • the template matching may be performed with a search step size that is determined based on an adaptive motion vector resolution mode (AMVR) .
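The template-matching search around the initial MV can be sketched as follows, assuming integer-pel steps and a SAD cost (the actual search uses AMVR-dependent step sizes and iterative patterns). The fetch_ref_template accessor is a hypothetical stand-in for reading reference-template samples.

```python
# Sketch of template matching refinement: scan a [-8, +8]-pel window
# around the initial MV for the MV whose reference template best
# matches the current template (minimum SAD).

def sad(a, b):
    """Sum of absolute differences between two sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

def tm_refine(init_mv, current_template, fetch_ref_template, search_range=8):
    best_mv = init_mv
    best_cost = sad(current_template, fetch_ref_template(init_mv))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            mv = (init_mv[0] + dx, init_mv[1] + dy)
            cost = sad(current_template, fetch_ref_template(mv))
            if cost < best_cost:
                best_mv, best_cost = mv, cost
    return best_mv, best_cost

# Toy model: the reference template at mv is simply [mv.x, mv.y], so a
# current template of [3, 4] is matched exactly at MV (3, 4).
refined, cost = tm_refine((1, 2), [3, 4], lambda mv: [mv[0], mv[1]])
```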
  • the template matching process can be cascaded with a bilateral matching process in merge modes.
  • An MVP candidate is determined based on template matching error, by selecting the one that reaches the minimum difference between the current block template and the reference block template; TM is then performed only for this particular MVP candidate for MV refinement.
  • the TM process refines this MVP candidate, starting from full-pel MVD precision (or 4-pel for 4-pel AMVR mode) within a [–8, +8] -pel search range by using iterative diamond search.
  • The AMVP candidate may be further refined by using cross search with full-pel MVD precision (or 4-pel for 4-pel AMVR mode), followed sequentially by half-pel and quarter-pel searches depending on an AMVR mode search pattern according to Table 1 below.
  • Table 1 Search patterns of AMVR and merge mode with AMVR
  • This search process ensures that the MVP candidate still keeps the same MV precision as indicated by the AMVR mode after the TM process.
  • During the search process, if the difference between the previous minimum cost and the current minimum cost in the iteration is less than a threshold that is equal to the area of the block, the search process terminates.
  • the video coder may apply a similar TM search method to refine the merge candidate indicated by the merge index.
  • TM may be performed all the way down to 1/8-pel MVD precision or skipping those beyond half-pel MVD precision, depending on whether an alternative interpolation filter (that is used when AMVR is of half-pel mode) is used according to merged motion information.
  • template matching may work as an independent process or as an extra MV refinement process between block-based and subblock-based bilateral matching (BM) methods, depending on whether BM can be enabled or not according to its enabling condition check.
  • the merge candidates may be adaptively reordered with template matching (TM) .
  • the reordering method is applied to regular merge mode, template matching (TM) merge mode, and affine merge mode (excluding the SbTMVP candidate) .
  • TM merge mode merge candidates are reordered before the refinement process.
  • merge candidates are divided into several subgroups.
  • the subgroup size is set to 5 for regular merge mode and TM merge mode.
  • the subgroup size is set to 3 for affine merge mode.
  • Merge candidates in each subgroup are reordered in ascending order of cost values based on template matching. For simplification, merge candidates in the last subgroup (when it is not also the first) are not reordered.
  • the template matching cost of a merge candidate is measured by the sum of absolute differences (SAD) between samples of a template of the current block and their corresponding reference samples.
  • the template includes a set of reconstructed samples neighboring the current block. Reference samples of the template are located by the motion information of the merge candidate.
  • When a merge candidate utilizes bi-prediction, the reference samples of the template of the merge candidate are also generated by bi-prediction.
  • The above template includes several sub-templates of size Wsub × 1, and the left template includes several sub-templates of size 1 × Hsub.
  • the motion information of the subblocks in the first row and the first column of current block is used to derive the reference samples of each sub-template.
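The subgroup reordering described above can be sketched as follows, with precomputed costs standing in for the SAD-based template-matching costs.

```python
# Sketch of adaptive merge-candidate reordering: candidates are split
# into subgroups and each subgroup is reordered in ascending cost
# order, except that the last subgroup (when it is not also the first)
# is left unsorted, per the simplification described above.

def reorder_by_cost(costs, subgroup_size=5):
    out = []
    for start in range(0, len(costs), subgroup_size):
        group = costs[start:start + subgroup_size]
        is_last = start + subgroup_size >= len(costs)
        out.extend(group if (is_last and start > 0) else sorted(group))
    return out

reordered = reorder_by_cost([9, 1, 5, 3, 7, 4, 2])
# the first subgroup of 5 is sorted; the trailing subgroup keeps order
```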
  • In merge mode with motion vector difference (MMVD), the derived motion information is further refined by a motion vector difference (MVD).
  • MMVD also extends the list of candidates for merge mode by adding additional MMVD candidates based on predefined offsets (also referred to as MMVD offsets) .
  • An MMVD flag may be signaled after sending a skip flag and merge flag to specify whether MMVD mode is used for a CU. If MMVD mode is used, a selected merge candidate is refined by the MVD information.
  • The MVD information includes a merge candidate flag, a distance index to specify the motion magnitude, and an index indicating the motion direction.
  • the merge candidate flag is signaled to specify which of the first two merge candidates is to be used as a starting MV.
  • the distance index is used to specify motion magnitude information by indicating a pre-defined offset from the starting MV.
  • the offset may be added to either horizontal component or vertical component of the starting MV.
  • An example mapping from the distance index to the pre-defined offset is specified in Table 2 below:
  • the direction index represents the direction of the MVD relative to the starting point.
  • the direction index can represent one of the four directions as shown in Table 3.
  • The MVD sign may vary according to the information of the starting MV.
  • When the starting MV is a uni-prediction MV, or a bi-prediction MV with both lists pointing to the same side of the current picture (i.e., the picture order counts, or POCs, of the two reference pictures are both larger than the POC of the current picture, or are both smaller than the POC of the current picture), the sign in Table 3 specifies the sign of the MV offset added to the starting MV.
  • When the starting MV is a bi-prediction MV with the two MVs pointing to different sides of the current picture (i.e., the POC of one reference picture is larger than the POC of the current picture and the POC of the other reference picture is smaller than the POC of the current picture), the sign in Table 3 specifies the sign of the MV offset added to the list-0 MV component of the starting MV, and the sign for the list-1 MV has the opposite value.
  • a predefined offset (MmvdOffset) of a MMVD candidate is derived from or expressed as a distance value (MmvdDistance) and a directional sign (MmvdSign) .
  • FIG. 3 conceptually illustrates MMVD candidates and their corresponding offsets.
  • The figure illustrates a merge candidate 310 as the starting MV and several MMVD candidates in the vertical direction and in the horizontal direction.
  • Each of the MMVD candidates is derived by applying an offset to the starting MV 310.
  • The MMVD candidate 322 is derived by adding an offset of 2 to the horizontal component of the merge candidate 310, and the MMVD candidate 324 is derived by adding an offset of -1 to the vertical component of the merge candidate 310.
  • MMVD candidates with offsets in the horizontal direction, such as the MMVD candidate 322, are referred to as horizontal MMVD candidates; MMVD candidates with offsets in the vertical direction, such as the MMVD candidate 324, are referred to as vertical MMVD candidates.
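The derivation of an MMVD candidate from a distance index and a direction index can be sketched as follows. Since Table 2 is not reproduced here, the distance values below assume the common VVC mapping expressed in quarter-luma-sample units; the four directions follow the Table 3 description.

```python
# Sketch of deriving an MMVD candidate from a starting MV, a distance
# index, and a direction index. The distance table is an assumption
# (VVC-style mapping in 1/4-pel units); directions are the four
# axis-aligned (sign_x, sign_y) cases described above.

DISTANCES = [1, 2, 4, 8, 16, 32, 64, 128]        # assumed 1/4-pel units
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # (sign_x, sign_y)

def mmvd_offset(distance_idx, direction_idx):
    """Combine MmvdDistance and MmvdSign into an MmvdOffset."""
    d = DISTANCES[distance_idx]
    sx, sy = DIRECTIONS[direction_idx]
    return (sx * d, sy * d)

def apply_mmvd(start_mv, distance_idx, direction_idx):
    ox, oy = mmvd_offset(distance_idx, direction_idx)
    return (start_mv[0] + ox, start_mv[1] + oy)

# e.g., a horizontal MMVD candidate at distance index 1, to the left
candidate = apply_mmvd((10, 10), 1, 1)
```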
  • Intra block copy (IBC) is also referred to as current picture referencing (CPR).
  • An IBC (or CPR) motion vector is one that refers to the already-reconstructed reference samples in the current picture.
  • The IBC prediction mode is treated as a third prediction mode, in addition to the intra and inter prediction modes, for coding a CU.
  • IBC mode is implemented as a block level coding mode
  • block matching is performed at the encoder to find the optimal block vector (or motion vector) for each CU.
  • a block vector (BV) is used to indicate the displacement from the current block to a reference block, which is already reconstructed inside the current picture.
  • the luma block vector of an IBC-coded CU is in integer precision.
  • The geometric partitioning mode is signalled using a CU-level flag as one kind of merge mode, alongside other merge modes that include the regular merge mode, the MMVD mode, the CIIP mode, and the subblock merge mode.
  • The geometric partitioning mode may be applied to CUs of size w × h = 2^m × 2^n with m, n ∈ {3…6}, excluding 8×64 and 64×8.
  • FIG. 4 illustrates the partitioning of a CU by the geometric partitioning mode (GPM) .
  • Each GPM partitioning or GPM split is a partition mode characterized by a distance-angle pairing that defines a bisecting or segmenting line.
  • the figure illustrates examples of the GPM splits grouped by identical angles.
  • a CU is split into at least two parts by a geometrically located straight line.
  • the location of the splitting line is mathematically derived from the angle and offset parameters of a specific partition.
  • Each partition in the CU formed by a partition mode of GPM is inter-predicted using its own motion (vector) .
  • only uni-prediction is allowed for each partition, that is, each part has one motion vector and one reference index.
  • The uni-prediction motion constraint is applied to ensure that, as in conventional bi-prediction, only two motion-compensated predictions are performed for each CU.
  • a geometric partition index indicating the partition mode of the geometric partitioning (angle and offset) and two merge indices (one for each partition) are further signalled.
  • Each of the at least two partitions created by the geometric partitioning according to a partition mode may be assigned a merge index to select a candidate from a uni-prediction candidate list (also referred to as the GPM candidate list) .
  • the pair of merge indices of the two partitions therefore select a pair of merge candidates.
  • the maximum number of candidates in the GPM candidate list may be signalled explicitly in SPS to specify syntax binarization for GPM merge indices.
  • The sample values along the geometric partitioning edge are adjusted using a blending process with adaptive weights. The result is the prediction signal for the whole CU, and the transform and quantization processes are applied to the whole CU as in other prediction modes.
  • the motion field of the CU as predicted by GPM is then stored.
  • the uni-prediction candidate list for a GPM partition may be derived directly from the merge candidate list of the current CU.
  • FIG. 5 illustrates an example uni-prediction candidate list 500 for a GPM partition and the selection of a uni-prediction MV for GPM.
  • The GPM candidate list 500 is constructed in an even-odd manner with only uni-prediction candidates that alternate between L0 MVs and L1 MVs.
  • Let n be the index of the uni-prediction motion in the uni-prediction candidate list for GPM.
  • the LX (i.e., L0 or L1) motion vector of the n-th extended merge candidate, with X equal to the parity of n, is used as the n-th uni-prediction motion vector for GPM. (These motion vectors are marked with “x” in the figure. ) In case a corresponding LX motion vector of the n-th extended merge candidate does not exist, the L (1 -X) motion vector of the same candidate is used instead as the uni-prediction motion vector for GPM.
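The even-odd (parity) derivation of the uni-prediction candidate list can be sketched as follows, modeling merge candidates as dictionaries with optional "L0"/"L1" motion vectors.

```python
# Sketch of deriving the GPM uni-prediction candidate list from a
# regular merge list: candidate n takes the LX motion of the n-th
# merge candidate with X equal to the parity of n, falling back to
# L(1-X) when the LX motion does not exist.

def gpm_uni_list(merge_list):
    uni = []
    for n, cand in enumerate(merge_list):
        primary = "L%d" % (n % 2)        # parity of n selects the list
        fallback = "L%d" % (1 - n % 2)
        mv = cand.get(primary)
        if mv is None:                   # LX missing -> use L(1-X)
            mv = cand.get(fallback)
        uni.append(mv)
    return uni

merge_list = [
    {"L0": (1, 0), "L1": (2, 0)},   # n=0: parity 0 -> take L0
    {"L0": (3, 0)},                 # n=1: parity 1, no L1 -> fall back
    {"L0": (4, 0), "L1": (5, 0)},   # n=2: parity 0 -> take L0
]
uni = gpm_uni_list(merge_list)
```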
  • The sample values along the geometric partition edge are adjusted using a blending process with adaptive weights. Specifically, after predicting each part of a geometric partition using its own motion, blending is applied to the at least two prediction signals to derive the samples around the geometric partition edge.
  • The blending weight for each position of the CU is derived based on the distance between the individual position and the partition edge.
  • The distance for a position (x, y) to the partition edge is derived as: d(x, y) = (2x + 1 − w)·cos(φ_i) + (2y + 1 − h)·sin(φ_i) − ρ_j, where w and h are the width and height of the CU.
  • Here i, j are the indices for the angle and offset of a geometric partition, which depend on the signaled geometric partition index, and the offset ρ_j is derived from the components ρ_x,j and ρ_y,j.
  • The signs of ρ_x,j and ρ_y,j depend on the angle index i.
  • FIG. 6 illustrates an example partition edge blending process for GPM for a CU 600.
  • blending weights are generated based on an initial blending weight w 0 .
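The mapping from a signed sample-to-edge distance to a blending weight can be sketched as follows. The clip-and-shift ramp (weights 0..8 in eighths) is an assumed VVC-style mapping for illustration; the normative derivation uses the angle/offset-dependent distance d(x, y) described above.

```python
# Sketch of GPM edge blending: a signed distance to the partition
# edge is mapped to a weight in [0, 1], then the two partition
# predictions are mixed. The ramp shape is an assumption.

def blend_weight(d):
    """Map a signed distance (in sample units) to a weight in [0, 1]."""
    w_idx = 32 + int(8 * d)                  # 32 exactly on the edge
    return max(0, min(8, (w_idx + 4) >> 3)) / 8.0

def blend(p0, p1, d):
    """Blend the two partition predictions at distance d from the edge."""
    w = blend_weight(d)
    return w * p0 + (1 - w) * p1

# On the partition edge (d = 0) the two predictions mix equally;
# far from the edge, one prediction dominates completely.
```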
  • the motion field of a CU predicted using GPM is stored. Specifically, Mv1 from the first part of the geometric partition, Mv2 from the second part of the geometric partition and a combined Mv of Mv1 and Mv2 are stored in the motion field of the GPM coded CU.
  • The stored motion vector type sType for each individual position in the motion field is determined as: sType = abs(motionIdx) < 32 ? 2 : (motionIdx ≤ 0 ? (1 − partIdx) : partIdx)
  • motionIdx is equal to d (4x+2, 4y+2) , which is recalculated from equation (1) .
  • The partIdx depends on the angle index i. If sType is equal to 0 or 1, Mv1 or Mv2 (respectively) is stored in the corresponding motion field; otherwise, if sType is equal to 2, a combined Mv from Mv1 and Mv2 is stored.
  • The combined Mv is generated using the following process: (i) if Mv1 and Mv2 are from different reference picture lists (one from L0 and the other from L1), then Mv1 and Mv2 are simply combined to form the bi-prediction motion vector; (ii) otherwise, if Mv1 and Mv2 are from the same list, only the uni-prediction motion Mv2 is stored.
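The per-position motion-field storage rule can be sketched as follows. The sType expression is reconstructed from the description above, and MVs are modeled as (reference list, vector) pairs; this is an illustration, not the normative derivation.

```python
# Sketch of GPM motion-field storage: positions near the partition
# edge (small |motionIdx|) store a combined Mv; positions on either
# side store Mv1 or Mv2 depending on partIdx and the sign of motionIdx.

def stored_motion(motion_idx, part_idx, mv1, mv2):
    """mv1/mv2 are (reference_list, vector) pairs, list 0 or 1."""
    if abs(motion_idx) < 32:
        s_type = 2                       # near the edge: combined Mv
    elif motion_idx <= 0:
        s_type = 1 - part_idx
    else:
        s_type = part_idx
    if s_type == 0:
        return [mv1]
    if s_type == 1:
        return [mv2]
    if mv1[0] != mv2[0]:                 # different lists -> bi-prediction
        return [mv1, mv2]
    return [mv2]                         # same list -> uni-prediction Mv2

mv1, mv2 = (0, (1, 1)), (1, (2, 2))
edge_field = stored_motion(10, 0, mv1, mv2)   # near the edge: both MVs
```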
  • GPM is extended by applying motion vector refinement to existing GPM uni-directional MVs.
  • a flag is first signaled for a GPM CU, to specify whether this mode is used. If the mode is used, each geometric partition of a GPM CU can further decide whether to signal motion vector difference (MVD) or not. If MVD is signaled for a geometric partition, after a GPM merge candidate is selected, the motion of the partition is further refined by the signaled MVDs information. All other procedures are kept the same as in GPM.
  • the MVD is signaled as a pair of distance and direction, similar as in MMVD.
  • when pic_fpel_mmvd_enabled_flag is equal to 1, the MVD is left shifted by 2 as in MMVD.
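The distance/direction signaling can be sketched as follows. The distance table (in quarter-luma-sample units) and the four directions mirror MMVD-style signaling; treating them as plain Python tables is a simplifying assumption:

```python
# quarter-luma-sample distance table and the four MMVD directions
MMVD_DISTANCES = [1, 2, 4, 8, 16, 32, 64, 128]
MMVD_DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # +x, -x, +y, -y

def gpm_mmvd_offset(dist_idx, dir_idx, fpel_enabled=False):
    """Turn a signaled (distance index, direction index) pair into an
    MVD in quarter-sample units. When the full-pel flag is set, the
    offset is left shifted by 2, as described above."""
    dist = MMVD_DISTANCES[dist_idx]
    if fpel_enabled:
        dist <<= 2  # full-pel MMVD: scale quarter-pel units to full-pel
    dx, dy = MMVD_DIRECTIONS[dir_idx]
    return (dx * dist, dy * dist)
```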
  • template matching may be applied to refine MVs of GPM partitions.
  • when GPM mode is enabled for a CU, a CU-level flag is signaled to indicate whether TM is applied to both geometric partitions. Motion information for each geometric partition is refined using TM.
  • when TM is chosen, a template is constructed using left, above, or left and above neighboring samples according to the partition angle. Table 4 below shows the template for the first and second geometric partitions, where A represents using above samples, L represents using left samples, and L+A represents using both left and above samples.
  • a GPM candidate list is constructed as follows: (1) the video coder derives interleaved List-0 MV candidates and List-1 MV candidates directly from the regular merge candidate list, where List-0 MV candidates are higher priority than List-1 MV candidates. A pruning method with an adaptive threshold based on the current CU size is applied to remove redundant MV candidates; (2) the video coder further derives interleaved List-1 MV candidates and List-0 MV candidates directly from the regular merge candidate list, where List-1 MV candidates are higher priority than List-0 MV candidates. The same pruning method with the adaptive threshold is also applied to remove redundant MV candidates; and (3) the video coder pads the GPM candidate list with zero MV candidates until the GPM candidate list is full.
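The three-step list construction above can be sketched as follows. The merge-candidate representation (a dict with optional `"L0"`/`"L1"` vectors), the Manhattan-distance pruning metric, and the exact per-candidate interleaving are simplified assumptions:

```python
def build_gpm_candidate_list(merge_list, list_size, prune_thr):
    """Build a GPM uni-prediction candidate list from a regular merge
    list: (1) a pass preferring L0 MVs, (2) a pass preferring L1 MVs,
    both pruned with a CU-size-dependent threshold, then (3) zero-MV
    padding until the list is full."""
    def too_close(mv, kept):
        # adaptive pruning: drop MVs within prune_thr of an existing one
        return any(abs(mv[0] - k[0]) + abs(mv[1] - k[1]) <= prune_thr
                   for k in kept)

    out = []
    for first, second in (("L0", "L1"), ("L1", "L0")):
        for cand in merge_list:
            for lst in (first, second):
                mv = cand.get(lst)
                if mv is not None and not too_close(mv, out):
                    out.append(mv)
                    break  # take only the higher-priority list's MV
    while len(out) < list_size:
        out.append((0, 0))  # step (3): pad with zero MV candidates
    return out[:list_size]
```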
  • the GPM-MMVD and GPM-TM are exclusively enabled to one CU for which GPM is used. In some embodiments, this is done by firstly signaling the GPM-MMVD syntax. When both of the two GPM-MMVD control flags are set to false (i.e., the GPM-MMVD are disabled for two GPM partitions) , the GPM-TM flag is signaled to indicate whether the template matching refinement is applied to the GPM partitions, as described in Section II above. Otherwise (at least one GPM-MMVD flag is set to true) , the value of the GPM-TM flag is inferred to be false, and the MMVD refinement is applied to the GPM partitions, as described in Section III above.
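The exclusive signaling order can be sketched as a small parsing routine; the `read_flag` callback standing in for actual bitstream parsing is an assumption:

```python
def parse_gpm_refinement_flags(read_flag):
    """Sketch of the exclusive GPM-MMVD / GPM-TM signaling: the two
    per-partition MMVD flags are parsed first; the TM flag is parsed
    only when both are false, and is otherwise inferred to be false."""
    mmvd0 = read_flag()
    mmvd1 = read_flag()
    if not mmvd0 and not mmvd1:
        tm = read_flag()     # template matching refinement flag
    else:
        tm = False           # inferred false; MMVD refinement applies
    return mmvd0, mmvd1, tm
```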
  • one GPM region is coded in inter prediction and the other region is coded in intra prediction.
  • the final prediction samples are generated by weighting inter predicted samples and intra predicted samples for each GPM-separated region.
  • the inter predicted samples are derived by inter GPM.
  • the intra predicted samples are derived by an intra prediction mode (IPM) candidate list and an index signaled from the encoder.
  • IPM candidate list size is pre-defined as 3.
  • FIGS. 7A-C illustrate GPM with inter and intra predictions for a current block 700.
  • the current block is partitioned by GPM into an inter-coded region 710 and an intra-coded region 720.
  • FIG. 7A illustrates the region 720 intra coded by a parallel angular mode against the GPM block boundary (Parallel mode) .
  • FIG. 7B illustrates the region 720 intra coded by a perpendicular angular mode against the GPM block boundary (Perpendicular mode) .
  • FIG. 7C illustrates the region 720 intra coded by Planar mode.
  • the parallel mode, perpendicular mode, and the planar mode are available IPM candidates in an IPM candidate list.
  • FIG. 8 illustrates GPM with intra and intra prediction for a current block 800.
  • GPM with intra and inter prediction may be restricted to reduce the signalling overhead for IPMs and avoid an increase in the size of the intra prediction circuit on the hardware decoder.
  • a direct motion vector and IPM storage on the GPM-blending area is introduced to further improve the coding performance.
  • for the decoder-side intra mode derivation (DIMD) method and the neighboring-mode-based IPM derivation, the Parallel mode is registered first. Therefore, at most two IPM candidates derived from the DIMD method and/or the neighboring blocks can be registered if the same IPM candidate is not already in the list.
  • for the neighboring mode derivation, there are at most five positions for available neighboring blocks, but they are restricted by the angle of the GPM block boundary as shown in Table 4 above. Table 4 shows the positions of available neighboring blocks for IPM candidate derivation based on the angle of the GPM block boundary, which are already used for GPM with template matching (GPM-TM) .
  • GPM-intra can be combined with GPM with merge with motion vector difference (GPM-MMVD) .
  • the Parallel mode can be registered first, then IPM candidates of template-based intra mode derivation (TIMD), DIMD, and neighboring blocks.
  • the respective TM cost values of GPM split modes are computed. Then, all GPM split modes are reordered in ascending order based on the TM cost values. Instead of signaling the GPM split mode directly, an index coded with a Golomb-Rice code may be signaled to indicate where the exact GPM split mode is located in the reordered list.
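The reorder-then-index scheme above can be sketched as follows; the Rice parameter `k` is an illustrative assumption:

```python
def reorder_split_modes(tm_costs):
    """tm_costs: dict mapping split mode -> TM cost. Returns modes in
    ascending cost order; the signaled index is the position of the
    chosen split mode in this reordered list."""
    return sorted(tm_costs, key=tm_costs.get)

def golomb_rice_encode(value, k=1):
    """Golomb-Rice codeword for the reordered index (sketch):
    unary-coded quotient, '0' separator, then k-bit binary remainder,
    so smaller (lower-cost) indices get shorter codewords."""
    q, r = value >> k, value & ((1 << k) - 1)
    return "1" * q + "0" + format(r, "b").zfill(k)
```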
  • the reordering method for GPM split modes is a two-step process performed after the respective reference templates of the two GPM partitions in a coding unit are generated, as follows:
  • FIG. 9 conceptually illustrates extending GPM partition edge into the reference template.
  • the figure illustrates a current block 900 that is coded by GPM and partitioned into GPM partitions 910 and 920 by a GPM partition edge 905. Neighboring regions of the current block are used as a template 930 (including a top section and a left section) .
  • the GPM partition edge 905 is extended to divide the template 930 into 930a and 930b for purpose of TM cost calculation.
  • the template section 930a is used for TM of the GPM partition 910 and the template section 930b is used for TM of the GPM partition 920.
  • because the GPM partition edge is extended from that of the current CU over the template 930, the GPM blending process is not used in the template area across the edge.
  • FIG. 10 conceptually illustrates an example process 1000 that a video coder may perform for encoding or decoding a GPM-partitioned current block.
  • a computing device implementing a video coder performs the process 1000 by executing instructions stored in a computer readable medium.
  • an electronic apparatus implementing the video coder performs the process 1000.
  • the video coder parses (at block 1010) GPM syntax, which may include a blending index, a GPM mode index, a merge index, and an intra index.
  • the GPM syntax may support GPM variations such as GPM, GPM-MMVD, GPM-TM, and GPM-Intra.
  • the combinations of GPM variations used to code a GPM coded block may include: ⁇ GPM, GPM ⁇ , ⁇ GPM, GPM-MMVD ⁇ , ⁇ GPM-MMVD, GPM-Intra ⁇ , ⁇ GPM-TM, GPM-Intra ⁇ , ⁇ GPM-TM, GPM-TM ⁇ , etc.
  • the video coder generates (at block 1020) a GPM merge candidate list.
  • the video coder may apply MMVD to the selected candidate if GPM-MMVD is used.
  • the video coder may compute TM costs with the left and/or above templates for two or more candidates if GPM-TM is used.
  • the video coder may generate an intra prediction mode (IPM) candidate list if GPM-Intra is used.
  • the video coder performs (at block 1030) motion compensation and blending for the GPM partitions.
  • the video coder may reorder the 64 partition indices and select one by the mode index as the partition index.
  • the video coder may perform intra prediction for the selected candidate if GPM-Intra is used.
  • the video coder may perform overlapped block motion compensation (OBMC) to the inter-predicted partition if the other partition uses GPM-Intra.
  • the video coder may perform GPM blending on rounded samples if GPM-Intra is selected.
  • the video coder applies (at block 1040) motion information to the inter-predicted GPM partition. No motion information is applied to a GPM-Intra partition.
  • the video coder may also perform OBMC if GPM-Intra is not used.
  • the video coder may construct a GPM candidate list, with at least one entry of the candidate list corresponding to a combination of one partition split mode and two intra prediction modes, and different combinations can be formed from one of the 26 partition modes and 3 of the intra prediction modes.
  • the video coder may signal the index of the candidate selected from the GPM candidate list.
  • the list of candidates is reordered using TM costs, where the SAD between the prediction and the reconstruction of the template is used for ordering.
  • a joint indexing scheme is used for a block coded by GPM or any one of GPM's variations or extensions (e.g., GPM for Skip, Merge, Direct, Intra modes, Inter modes, and/or IBC modes) .
  • the joint index is used to indicate a combination of:
  • a subset from the partition mode and the one or more prediction modes for multiple hypotheses of prediction.
  • when the block is coded with GPM/GPM-MMVD/GPM-TM, the joint index may indicate a combination of a partition mode and two motion candidates/information.
  • the joint index may indicate a combination of two motion candidates/information.
  • the joint index may indicate a combination of a partition mode and a motion candidate/information. After ascending reordering using TM cost, an index is signaled.
  • the combination indicated by the joint index includes a partition mode and a GPM-MMVD distance; or the combination may include a partition mode and a GPM-MMVD direction; or the combination may include a partition mode, a GPM-MMVD distance, and a GPM-MMVD direction.
  • a list of different combinations is re-ordered according to template matching costs.
  • each combination in the list includes a partition mode and two motion candidates/information.
  • each combination in the list includes two motion candidates/information, and the template matching costs are determined by the candidates in the list of combinations and a signaled partition mode.
  • the joint index indicates a combination from a reordered list of combinations by template matching based method.
  • the order in the reordered list of combinations implies the signaling priority order of the combinations. That is, the combination at the first position in the list of combinations is signaled/parsed with the shortest codeword among all combinations in the list.
  • the syntax for indicating the combination at the first position in the list of combinations is coded with one or more contexts.
  • the context selection may depend on the block width, height, area, or neighboring mode information.
  • the one or more contexts used are not reused by the remaining combinations in the list of combinations.
  • the syntax for indicating the combinations after the first position in the list of combinations is not coded with contexts.
  • FIGS. 11A-C conceptually illustrate a unified GPM candidate list 1105 for coding the current block 1100.
  • the unified GPM candidate list 1105 can be used for different GPM variations.
  • the GPM candidate list 1105 includes entries that correspond to different combinations of GPM partition mode and prediction modes for the two (or more) GPM partitions.
  • Each candidate in the list identifies (i) a GPM partition mode that partitions the current block into two or more GPM partitions and (ii) two (or more) prediction modes for the two or more GPM partitions of the candidate.
  • the prediction mode of a GPM partition may indicate whether the partition is coded by intra or inter prediction, and may specify the intra-prediction mode or the motion information (MV and reference picture) to be used.
  • the candidate combination may further specify refinement for the motion information (e.g., by template matching or by MMVD direction+distance. )
  • the candidate list 1105 includes a candidate 1140 that partitions the current block 1100 into an inter-predicted partition and an intra-predicted partition, and the entry that corresponds to the candidate 1140 specifies the GPM partition mode, the intra prediction mode of the intra-predicted partition, and the motion information of the inter-predicted partition.
  • the candidate list 1105 includes a candidate 1150 that partitions the current block 1100 into two inter-predicted partitions, and the entry that corresponds to the candidate 1150 specifies the GPM partition mode and the motion information of each of the inter-predicted partitions.
  • the entries 1140 and 1150 may also specify information for motion refinement for their respective inter-coded partitions.
  • the entries of the GPM candidate list 1105 are re-ordered according to the template matching costs of the candidates, with indices assigned to corresponding entries/candidates in the list.
  • the template matching cost of a candidate in the list is computed based on comparing (i) the reconstructed samples of a template region 1110 neighboring the current block 1100 with (ii) the reference samples derived according to the prediction modes of the GPM partitions specified by the candidate.
  • FIG. 11B illustrates the TM cost calculations for the candidate 1140.
  • the GPM partition mode of the candidate 1140 partitions the current block 1100 into two partitions 1141 and 1142.
  • the partition 1141 is intra predicted with a particular intra-prediction mode specified by the candidate 1140.
  • the partition 1142 is inter-predicted with motion information provided by the candidate 1140 to refer to a reference block 1120 (in a reference picture or the current picture) .
  • the TM cost of the candidate 1140 is computed based on reference samples derived from (i) neighboring samples 1115 of the template region 1110 that are identified by the specified intra-prediction mode and (ii) neighboring samples 1130 of the reference block 1120.
  • the reference samples may be derived by blending according to the GPM partition mode specified by the candidate 1140.
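The per-candidate TM cost described above can be sketched as a SAD over a reference template blended with GPM edge weights; flattening the template to 1-D sample lists and using floating-point weights are simplifying assumptions:

```python
def tm_cost(cur_template, ref_template_a, ref_template_b, weights):
    """SAD between the current block's reconstructed template and a
    reference template blended from the two partitions' reference
    samples with per-sample GPM edge weights; this cost drives the
    reordering of the candidate list."""
    cost = 0.0
    for c, a, b, w in zip(cur_template, ref_template_a, ref_template_b,
                          weights):
        blended = w * a + (1.0 - w) * b   # GPM-style edge blending
        cost += abs(c - blended)
    return cost
```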
  • FIG. 11C illustrates the TM cost calculations for the candidate 1150.
  • the GPM partition mode of the candidate 1150 partitions the current block 1100 into two inter-predicted partitions 1151 and 1152.
  • the partition 1151 has motion information (e.g., MV or BV) referring to a reference block 1121.
  • the partition 1152 has motion information (e.g., MV or BV) referring to a reference block 1122.
  • the TM cost of the candidate 1150 is computed based on reference samples derived from (i) neighboring samples 1131 of the reference block 1121 and (ii) neighboring samples 1132 of the reference block 1122.
  • the reference samples may be derived by blending according to the GPM partition mode specified by the candidate 1150.
  • a unified candidate list including prediction modes for one or more hypotheses of prediction, is generated and different GPM modes (different GPM variations and extensions) share the same unified candidate list (different GPM modes refer to and use the same list) .
  • only one circuit for generating the unified candidate list is used and reused by multiple GPM modes.
  • the selection of one or more candidates from the unified candidate list may depend on the signaling belonging to the particular GPM mode (instead of the signaling shared/unified with other GPM modes) .
  • a candidate list that includes combinations (of e.g., partition mode, motion information, etc. ) , is generated, and different GPM modes share the same unified candidate list. Therefore, only one circuit for generating the candidate list is used and reused by multiple GPM modes.
  • the selection of one or more candidates from the unified candidate list and/or the usage of the selected one or more candidates may depend on the signaling belonging to this GPM mode (instead of the signaling shared/unified with other GPM modes) .
  • the foregoing proposed method can be implemented in encoders and/or decoders.
  • the proposed method can be implemented in an inter prediction module and/or an intra block copy prediction module of an encoder, and/or an inter prediction module (and/or intra block copy prediction module) of a decoder.
  • FIG. 12 illustrates an example video encoder 1200 that may encode pixel blocks using GPM.
  • the video encoder 1200 receives input video signal from a video source 1205 and encodes the signal into bitstream 1295.
  • the video encoder 1200 has several components or modules for encoding the signal from the video source 1205, at least including some components selected from a transform module 1210, a quantization module 1211, an inverse quantization module 1214, an inverse transform module 1215, an intra-picture estimation module 1220, an intra-prediction module 1225, a motion compensation module 1230, a motion estimation module 1235, an in-loop filter 1245, a reconstructed picture buffer 1250, a MV buffer 1265, a MV prediction module 1275, and an entropy encoder 1290.
  • the motion compensation module 1230 and the motion estimation module 1235 are part of an inter-prediction module 1240.
  • the modules 1210 –1290 are modules of software instructions being executed by one or more processing units (e.g., a processor) of a computing device or electronic apparatus. In some embodiments, the modules 1210 –1290 are modules of hardware circuits implemented by one or more integrated circuits (ICs) of an electronic apparatus. Though the modules 1210 –1290 are illustrated as being separate modules, some of the modules can be combined into a single module.
  • the video source 1205 provides a raw video signal that presents pixel data of each video frame without compression.
  • a subtractor 1208 computes the difference between the raw video pixel data of the video source 1205 and the predicted pixel data 1213 from the motion compensation module 1230 or intra-prediction module 1225 as prediction residual 1209.
  • the transform module 1210 converts the difference (or the residual pixel data, or residual signal 1209) into transform coefficients (e.g., by performing Discrete Cosine Transform, or DCT) .
  • the quantization module 1211 quantizes the transform coefficients into quantized data (or quantized coefficients) 1212, which is encoded into the bitstream 1295 by the entropy encoder 1290.
  • the inverse quantization module 1214 de-quantizes the quantized data (or quantized coefficients) 1212 to obtain transform coefficients, and the inverse transform module 1215 performs inverse transform on the transform coefficients to produce reconstructed residual 1219.
  • the reconstructed residual 1219 is added with the predicted pixel data 1213 to produce reconstructed pixel data 1217.
  • the reconstructed pixel data 1217 is temporarily stored in a line buffer (not illustrated) for intra-picture prediction and spatial MV prediction.
  • the reconstructed pixels are filtered by the in-loop filter 1245 and stored in the reconstructed picture buffer 1250.
  • the reconstructed picture buffer 1250 is a storage external to the video encoder 1200.
  • the reconstructed picture buffer 1250 is a storage internal to the video encoder 1200.
  • the intra-picture estimation module 1220 performs intra-prediction based on the reconstructed pixel data 1217 to produce intra prediction data.
  • the intra-prediction data is provided to the entropy encoder 1290 to be encoded into bitstream 1295.
  • the intra-prediction data is also used by the intra-prediction module 1225 to produce the predicted pixel data 1213.
  • the motion estimation module 1235 performs inter-prediction by producing MVs to reference pixel data of previously decoded frames stored in the reconstructed picture buffer 1250. These MVs are provided to the motion compensation module 1230 to produce predicted pixel data.
  • the video encoder 1200 uses MV prediction to generate predicted MVs, and the difference between the MVs used for motion compensation and the predicted MVs is encoded as residual motion data and stored in the bitstream 1295.
  • the MV prediction module 1275 generates the predicted MVs based on reference MVs that were generated for encoding previous video frames, i.e., the motion compensation MVs that were used to perform motion compensation.
  • the MV prediction module 1275 retrieves reference MVs from previous video frames from the MV buffer 1265.
  • the video encoder 1200 stores the MVs generated for the current video frame in the MV buffer 1265 as reference MVs for generating predicted MVs.
  • the MV prediction module 1275 uses the reference MVs to create the predicted MVs.
  • the predicted MVs can be computed by spatial MV prediction or temporal MV prediction.
  • the difference between the predicted MVs and the motion compensation MVs (MC MVs) of the current frame (residual motion data) is encoded into the bitstream 1295 by the entropy encoder 1290.
  • the entropy encoder 1290 encodes various parameters and data into the bitstream 1295 by using entropy-coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman encoding.
  • the entropy encoder 1290 encodes various header elements, flags, along with the quantized transform coefficients 1212, and the residual motion data as syntax elements into the bitstream 1295.
  • the bitstream 1295 is in turn stored in a storage device or transmitted to a decoder over a communications medium such as a network.
  • the in-loop filter 1245 performs filtering or smoothing operations on the reconstructed pixel data 1217 to reduce the artifacts of coding, particularly at boundaries of pixel blocks.
  • the filtering or smoothing operations performed by the in-loop filter 1245 include deblock filter (DBF) , sample adaptive offset (SAO) , and/or adaptive loop filter (ALF) .
  • FIG. 13 illustrates portions of the video encoder 1200 that implement a unified GPM candidate list.
  • a GPM candidate generation module 1310 specifies various candidate GPM combinations. Each candidate GPM combination may specify one GPM partition mode that partitions the current block into two or more GPM partitions, and two or more prediction modes for the two or more GPM partitions. For an intra-coded GPM partition, the candidate combination may specify the intra prediction mode. For an inter-coded GPM partition, the candidate combination may specify motion information (e.g., motion vector, reference picture, or a selected merge candidate) . The motion information may be retrieved from the MV buffer 1265. A candidate combination may also specify refinement for the motion information according to GPM-MMVD or GPM-TM.
  • a template identification module 1320 retrieves samples for a current template and a reference template from the reconstructed picture buffer 1250 based on the prediction modes specified in the candidate GPM combination. For a candidate GPM combination that specifies one or more inter-coded partitions, the template identification module 1320 may retrieve neighboring samples of the current block as the current template and use the motion information specified in the combination to retrieve neighboring samples of a reference block as the reference template. For a candidate GPM combination that specifies one or more intra-coded partitions, the template identification module 1320 may retrieve neighboring samples of the current block as the current template and neighboring samples of the current template that are identified by the specified intra-prediction mode as the reference template.
  • the template identification module 1320 provides the reference template (s) , the current template (s) , of a currently indicated candidate GPM combination to a cost calculator 1330, which performs template matching to produce a cost for the indicated candidate GPM combination.
  • the cost calculator 1330 may combine the reference templates of the different partitions (with edge blending) according to the GPM partition mode of the indicated GPM combination when determining the TM cost.
  • the computed costs of the various candidate GPM combinations are provided to a candidate ordering module 1340, which assigns indices to the various candidate GPM combinations according to their corresponding computed TM costs.
  • a candidate selection module 1345 may select one of the candidate GPM combinations and provide the index of the selected combination to the entropy encoder 1290 to be signaled.
  • the content of the selected GPM combination (including GPM partition mode, prediction modes of the GPM partitions, motion information, intra-prediction mode, etc. ) is provided to a GPM prediction generation module 1350.
  • the GPM prediction generation module 1350 generates a prediction for each GPM partition according to the selected GPM combination.
  • the intra prediction module 1225 may be used to generate the prediction for an intra-predicted partition and the motion compensation module 1230 may be used to generate the prediction for an inter-predicted partition.
  • the GPM prediction generation module 1350 then combines the predictions of the different GPM partitions (by performing GPM edge blending) into one GPM predictor for the current block as the predicted pixel data 1213.
  • FIG. 14 conceptually illustrates a process 1400 for using a unified candidate list of GPM combinations.
  • a computing device implementing the encoder 1200 performs the process 1400 by executing instructions stored in a computer readable medium.
  • an electronic apparatus implementing the encoder 1200 performs the process 1400.
  • the encoder receives (at block 1410) data to be encoded as a current block of pixels in a current picture of a video.
  • the encoder generates (at block 1420) a list of candidates, each candidate specifying (i) a partition mode and (ii) first and second prediction modes. At least a first candidate in the list of candidates specifies motion information for an inter-coded partition of the current block.
  • the candidates in the list of candidates are assigned indices according to an order determined based on costs computed for the candidates.
  • the cost of a candidate is computed based on reconstructed samples neighboring the current block and reference samples derived according to the partition mode and the prediction modes of the candidate.
  • a candidate having a lowest cost among costs computed for all candidates in the list is assigned a shortest code word among all candidates in the list.
  • the first candidate may specify a refinement of an inter-prediction based on the motion information of the first candidate.
  • the refinement may be for GPM-TM, based on minimizing a matching cost between reconstructed samples neighboring the current block and reconstructed samples neighboring a reference block identified by the motion information of the first candidate.
  • the refinement may be for GPM-MMVD, based on a motion vector difference that is specified by a distance and a direction.
  • the first candidate in the list of candidates may specify motion information for both the first and second partitions if both the first and second partitions of the first candidate are inter-coded.
  • the first candidate may specify an intra-prediction mode (e.g., an intra-prediction direction) for an intra coded partition of the current block.
  • the list of candidates may also include a second candidate that specifies two intra-prediction modes for two intra-coded partitions (GPM-Intra) .
  • the encoder signals (at block 1430) a selection of a candidate from the list of candidates, e.g., by signaling an index that is assigned according to the computed costs of the candidates in the list.
  • the encoder segments (at block 1440) the current block into a first partition and a second partition according to the partition mode of the selected candidate.
  • the encoder generates (at block 1450) first and second predictions for the first and second partitions according to the first and second prediction modes of the selected candidate.
  • the encoder encodes (at block 1460) the current block by using the first and second predictions.
  • the video encoder combines the first and second predictions by GPM edge blending to generate a combined prediction, and the combined prediction is used to generate a residual for encoding the current block.
  • an encoder may signal (or generate) one or more syntax elements in a bitstream, such that a decoder may parse said one or more syntax elements from the bitstream.
  • FIG. 15 illustrates an example video decoder 1500 that may decode GPM coded blocks.
  • the video decoder 1500 is an image-decoding or video-decoding circuit that receives a bitstream 1595 and decodes the content of the bitstream into pixel data of video frames for display.
  • the video decoder 1500 has several components or modules for decoding the bitstream 1595, including some components selected from an inverse quantization module 1511, an inverse transform module 1510, an intra-prediction module 1525, a motion compensation module 1530, an in-loop filter 1545, a decoded picture buffer 1550, a MV buffer 1565, a MV prediction module 1575, and a parser 1590.
  • the motion compensation module 1530 is part of an inter-prediction module 1540.
  • the modules 1510 –1590 are modules of software instructions being executed by one or more processing units (e.g., a processor) of a computing device. In some embodiments, the modules 1510 –1590 are modules of hardware circuits implemented by one or more ICs of an electronic apparatus. Though the modules 1510 –1590 are illustrated as being separate modules, some of the modules can be combined into a single module.
  • the parser 1590 receives the bitstream 1595 and performs initial parsing according to the syntax defined by a video-coding or image-coding standard.
  • the parsed syntax elements include various header elements, flags, as well as quantized data (or quantized coefficients) 1512.
  • the parser 1590 parses out the various syntax elements by using entropy-coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman encoding.
  • the inverse quantization module 1511 de-quantizes the quantized data (or quantized coefficients) 1512 to obtain transform coefficients, and the inverse transform module 1510 performs inverse transform on the transform coefficients 1516 to produce reconstructed residual signal 1519.
  • the reconstructed residual signal 1519 is added with predicted pixel data 1513 from the intra-prediction module 1525 or the motion compensation module 1530 to produce decoded pixel data 1517.
  • the decoded pixel data is filtered by the in-loop filter 1545 and stored in the decoded picture buffer 1550.
  • the decoded picture buffer 1550 is a storage external to the video decoder 1500.
  • the decoded picture buffer 1550 is a storage internal to the video decoder 1500.
  • the intra-prediction module 1525 receives intra-prediction data from bitstream 1595 and according to which, produces the predicted pixel data 1513 from the decoded pixel data 1517 stored in the decoded picture buffer 1550.
  • the decoded pixel data 1517 is also stored in a line buffer (not illustrated) for intra-picture prediction and spatial MV prediction.
  • the content of the decoded picture buffer 1550 is used for display.
  • a display device 1555 either retrieves the content of the decoded picture buffer 1550 for display directly, or retrieves the content of the decoded picture buffer to a display buffer.
  • the display device receives pixel values from the decoded picture buffer 1550 through a pixel transport.
  • the motion compensation module 1530 produces predicted pixel data 1513 from the decoded pixel data 1517 stored in the decoded picture buffer 1550 according to motion compensation MVs (MC MVs) . These motion compensation MVs are decoded by adding the residual motion data received from the bitstream 1595 with predicted MVs received from the MV prediction module 1575.
  • the MV prediction module 1575 generates the predicted MVs based on reference MVs that were generated for decoding previous video frames, e.g., the motion compensation MVs that were used to perform motion compensation.
  • the MV prediction module 1575 retrieves the reference MVs of previous video frames from the MV buffer 1565.
  • the video decoder 1500 stores the motion compensation MVs generated for decoding the current video frame in the MV buffer 1565 as reference MVs for producing predicted MVs.
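The MV reconstruction loop described above (a predicted MV from module 1575 plus residual motion data from the bitstream, with the result stored in the MV buffer 1565 for later frames) can be sketched as follows. The tuple representation and the plain-list stand-in for the MV buffer are illustrative assumptions.

```python
def decode_mv(predicted_mv, mvd):
    # MC MV = predicted MV (MV prediction module 1575) + residual motion
    # data (MVD) parsed from the bitstream 1595.
    return (predicted_mv[0] + mvd[0], predicted_mv[1] + mvd[1])

mv_buffer = []  # stands in for the MV buffer 1565

mv = decode_mv((5, -3), (1, 2))
mv_buffer.append(mv)  # stored as a reference MV for predicting later MVs
print(mv)  # (6, -1)
```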
  • the in-loop filter 1545 performs filtering or smoothing operations on the decoded pixel data 1517 to reduce the artifacts of coding, particularly at boundaries of pixel blocks.
  • the filtering or smoothing operations performed by the in-loop filter 1545 include a deblocking filter (DBF), sample adaptive offset (SAO), and/or an adaptive loop filter (ALF).
  • FIG. 16 illustrates portions of the video decoder 1500 that implement a unified GPM candidate list.
  • a GPM candidate generation module 1610 specifies various candidate GPM combinations. Each candidate GPM combination may specify one GPM partition mode that partitions the current block into two or more GPM partitions, and two or more prediction modes for the two or more GPM partitions. For an intra-coded GPM partition, the candidate combination may specify the intra prediction mode. For an inter-coded GPM partition, the candidate combination may specify motion information (e.g., motion vector, reference picture, or a selected merge candidate) . The motion information may be retrieved from the MV buffer 1565. A candidate combination may also specify refinement for the motion information according to GPM-MMVD or GPM-TM.
  • a template identification module 1620 retrieves samples for a current template and a reference template from the decoded picture buffer 1550 based on the prediction modes specified in the candidate GPM combination. For a candidate GPM combination that specifies one or more inter-coded partitions, the template identification module 1620 may retrieve neighboring samples of the current block as the current template and use the motion information specified in the combination to retrieve neighboring samples of a reference block as the reference template. For a candidate GPM combination that specifies one or more intra-coded partitions, the template identification module 1620 may retrieve neighboring samples of the current block as the current template and neighboring samples of the current template that are identified by the specified intra-prediction mode as the reference template.
  • the template identification module 1620 provides the reference template(s) and the current template(s) of a currently indicated candidate GPM combination to a cost calculator 1630, which performs template matching to produce a cost for the indicated candidate GPM combination.
  • the cost calculator 1630 may combine the reference templates of the different partitions (with edge blending) according to the GPM partition mode of the indicated GPM combination when determining the TM cost.
  • the computed costs of the various candidate GPM combinations are provided to a candidate ordering module 1640, which assigns indices to the various candidate GPM combinations according to their corresponding computed TM costs.
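The cost-based index assignment performed by the candidate ordering module 1640 can be sketched as a sort by template-matching cost, so that cheaper candidates receive smaller indices. Using SAD as the matching metric and a dictionary per candidate are assumptions for illustration; the real module blends per-partition reference templates before costing.

```python
def sad(a, b):
    # Template-matching cost: sum of absolute differences between the
    # current template and the (blended) reference template.
    return sum(abs(x - y) for x, y in zip(a, b))

def order_candidates(candidates, current_template):
    # Sort candidate GPM combinations by ascending TM cost; a candidate's
    # assigned index is its rank, so lower-cost candidates get smaller
    # indices (and hence shorter code words).
    costs = sorted((sad(c["ref_template"], current_template), i)
                   for i, c in enumerate(candidates))
    return [candidates[i] for _, i in costs]

cands = [{"name": "A", "ref_template": [10, 10]},
         {"name": "B", "ref_template": [1, 2]}]
print([c["name"] for c in order_candidates(cands, [0, 0])])  # ['B', 'A']
```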
  • a candidate selection module 1645 may select one of the candidate GPM combinations based on an index provided by the entropy decoder 1590.
  • the content of the selected GPM combination (including the GPM partition mode, the prediction modes of the GPM partitions, motion information, intra-prediction mode, etc.) is provided to a GPM prediction generation module 1650.
  • the GPM prediction generation module 1650 generates a prediction for each GPM partition according to the selected GPM combination.
  • the intra-prediction module 1525 may be used to generate the prediction for an intra-predicted partition and the motion compensation module 1530 may be used to generate the prediction for an inter-predicted partition.
  • the GPM prediction generation module 1650 then combines the predictions of the different GPM partitions (by performing GPM edge blending) into one GPM predictor for the current block as the predicted pixel data 1513.
  • FIG. 17 conceptually illustrates a process 1700 for using a unified candidate list of GPM combinations.
  • a computing device implementing the decoder 1500 performs the process 1700 by executing instructions stored in a computer readable medium.
  • an electronic apparatus implementing the decoder 1500 performs the process 1700.
  • the decoder receives (at block 1710) data to be decoded as a current block of pixels in a current picture of a video.
  • the decoder generates (at block 1720) a list of candidates, each candidate specifying (i) a partition mode and (ii) first and second prediction modes. At least a first candidate in the list of candidates specifies motion information for an inter-coded partition of the current block.
  • the candidates in the list of candidates are assigned indices according to an order determined based on costs computed for the candidates.
  • the cost of a candidate is computed based on reconstructed samples neighboring the current block and reference samples derived according to the partition mode and the prediction modes of the candidate.
  • a candidate having a lowest cost among costs computed for all candidates in the list is assigned a shortest code word among all candidates in the list.
  • the first candidate may specify a refinement of an inter-prediction based on the motion information of the first candidate.
  • the refinement may be for GPM-TM, based on minimizing a matching cost between reconstructed samples neighboring the current block and reconstructed samples neighboring a reference block identified by the motion information of the first candidate.
  • the refinement may be for GPM-MMVD, based on a motion vector difference that is specified by a distance and a direction.
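A minimal sketch of how a GPM-MMVD refinement can be expressed as a motion-vector difference built from a distance index and a direction index follows. The particular direction and distance tables below are illustrative assumptions, not the tables of any standard.

```python
# Illustrative MMVD tables (assumed): four directions and four distances
# in quarter-pel units.
DIRECTIONS = {0: (1, 0), 1: (-1, 0), 2: (0, 1), 3: (0, -1)}
DISTANCES = [1, 2, 4, 8]

def mmvd_offset(distance_idx, direction_idx):
    # The signaled MVD is direction unit vector times distance.
    dx, dy = DIRECTIONS[direction_idx]
    d = DISTANCES[distance_idx]
    return (dx * d, dy * d)

def refine_mv(mv, distance_idx, direction_idx):
    # GPM-MMVD refinement: add the derived MVD to the candidate's MV.
    off = mmvd_offset(distance_idx, direction_idx)
    return (mv[0] + off[0], mv[1] + off[1])

print(refine_mv((10, 4), 2, 3))  # (10, 0)
```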
  • the first candidate in the list of candidates may specify motion information for both the first and second partitions if both the first and second partitions of the first candidate are inter-coded.
  • the first candidate may specify an intra-prediction mode (e.g., an intra-prediction direction) for an intra coded partition of the current block.
  • the list of candidates may also include a second candidate that specifies two intra-prediction modes for two intra-coded partitions (GPM-Intra) .
  • the decoder receives (at block 1730) a selection of a candidate from the list of candidates, e.g., by receiving an index that is assigned according to the computed costs of the candidates in the list.
  • the decoder segments (at block 1740) the current block into a first partition and a second partition according to the partition mode of the selected candidate.
  • the decoder generates (at block 1750) first and second predictions for the first and second partitions according to the first and second prediction modes of the selected candidate.
  • the decoder reconstructs (at block 1760) the current block by using the first and second predictions.
  • the video decoder combines the first and second predictions by GPM edge blending to generate a combined prediction, and the combined prediction is used as a prediction block to reconstruct the current block.
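The combining step above can be sketched as a per-sample weighted average of the two partition predictions, with weights that transition gradually across the partition edge. The 8-level integer weights and rounding below are assumptions chosen for illustration only.

```python
def blend(pred0, pred1, weights):
    # GPM edge blending: per-sample weighted average of the two partition
    # predictions. A weight of 8 takes pred0 entirely, 0 takes pred1, and
    # intermediate weights give a soft transition near the partition edge.
    return [(w * a + (8 - w) * b + 4) >> 3
            for a, b, w in zip(pred0, pred1, weights)]

# Samples on one side of the edge follow pred0, the other side pred1,
# with blended samples in between.
print(blend([100, 100, 100, 100], [20, 20, 20, 20], [8, 6, 2, 0]))
# [100, 80, 40, 20]
```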
  • the decoder may then provide the reconstructed current block for display as part of the reconstructed current picture.
  • Computer readable storage medium also referred to as computer readable medium
  • when these instructions are executed by one or more computational or processing unit(s) (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions.
  • Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, random-access memory (RAM) chips, hard drives, erasable programmable read only memories (EPROMs) , electrically erasable programmable read-only memories (EEPROMs) , etc.
  • the computer readable media do not include carrier waves and electronic signals passing wirelessly or over wired connections.
  • the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage which can be read into memory for processing by a processor.
  • multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions.
  • multiple software inventions can also be implemented as separate programs.
  • any combination of separate programs that together implement a software invention described here is within the scope of the present disclosure.
  • the software programs when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.
  • FIG. 18 conceptually illustrates an electronic system 1800 with which some embodiments of the present disclosure are implemented.
  • the electronic system 1800 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc. ) , phone, PDA, or any other sort of electronic device.
  • Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media.
  • Electronic system 1800 includes a bus 1805, processing unit (s) 1810, a graphics-processing unit (GPU) 1815, a system memory 1820, a network 1825, a read-only memory 1830, a permanent storage device 1835, input devices 1840, and output devices 1845.
  • the bus 1805 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 1800.
  • the bus 1805 communicatively connects the processing unit (s) 1810 with the GPU 1815, the read-only memory 1830, the system memory 1820, and the permanent storage device 1835.
  • the processing unit (s) 1810 retrieves instructions to execute and data to process in order to execute the processes of the present disclosure.
  • the processing unit (s) may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 1815.
  • the GPU 1815 can offload various computations or complement the image processing provided by the processing unit (s) 1810.
  • the read-only-memory (ROM) 1830 stores static data and instructions that are used by the processing unit (s) 1810 and other modules of the electronic system.
  • the permanent storage device 1835 is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 1800 is off. Some embodiments of the present disclosure use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 1835.
  • the system memory 1820 is a read-and-write memory device. However, unlike storage device 1835, the system memory 1820 is a volatile read-and-write memory, such as a random-access memory.
  • the system memory 1820 stores some of the instructions and data that the processor uses at runtime.
  • processes in accordance with the present disclosure are stored in the system memory 1820, the permanent storage device 1835, and/or the read-only memory 1830.
  • the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit (s) 1810 retrieves instructions to execute and data to process in order to execute the processes of some embodiments.
  • the bus 1805 also connects to the input and output devices 1840 and 1845.
  • the input devices 1840 enable the user to communicate information and select commands to the electronic system.
  • the input devices 1840 include alphanumeric keyboards and pointing devices (also called “cursor control devices” ) , cameras (e.g., webcams) , microphones or similar devices for receiving voice commands, etc.
  • the output devices 1845 display images generated by the electronic system or otherwise output data.
  • the output devices 1845 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD) , as well as speakers or similar audio output devices. Some embodiments include devices such as a touchscreen that function as both input and output devices.
  • bus 1805 also couples electronic system 1800 to a network 1825 through a network adapter (not shown) .
  • the computer can be a part of a network of computers (such as a local area network ( “LAN” ) , a wide area network ( “WAN” ) , or an Intranet) , or a network of networks, such as the Internet. Any or all components of electronic system 1800 may be used in conjunction with the present disclosure.
  • Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media) .
  • computer-readable media include RAM, ROM, read-only compact discs (CD-ROM) , recordable compact discs (CD-R) , rewritable compact discs (CD-RW) , read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM) , a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW) , etc.
  • the computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.
  • integrated circuits execute instructions that are stored on the circuit itself.
  • the terms “computer” , “server” , “processor” , and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people.
  • display or displaying means displaying on an electronic device.
  • the terms “computer readable medium, ” “computer readable media, ” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.
  • any two components so associated can also be viewed as being “operably connected” , or “operably coupled” , to each other to achieve the desired functionality, and any two components capable of being so associated can also be viewed as being “operably couplable” , to each other to achieve the desired functionality.
  • operably couplable include but are not limited to physically mateable and/or physically interacting components and/or wirelessly interactable and/or wirelessly interacting components and/or logically interacting and/or logically interactable components.

Abstract

A method of implementing a candidate list for different combinations of a geometric partitioning mode (GPM) is provided. A video coder generates a list of candidates, each candidate specifying (i) a partition mode and (ii) first and second prediction modes. A first candidate in the list of candidates specifies motion information for an inter-coded partition of the current block. The video coder signals or receives a selection of a candidate from the list of candidates. The video coder segments the current block into a first partition and a second partition according to the partition mode of the selected candidate. The video coder generates first and second predictions for the first and second partitions according to the first and second prediction modes of the selected candidate. The video coder encodes or decodes the current block by using the first and second predictions.
PCT/CN2023/110528 2022-08-05 2023-08-01 Indexation conjointe de mode de partitionnement géométrique dans un codage vidéo WO2024027700A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263370509P 2022-08-05 2022-08-05
US63/370,509 2022-08-05

Publications (1)

Publication Number Publication Date
WO2024027700A1 true WO2024027700A1 (fr) 2024-02-08

Family

ID=89848503

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/110528 WO2024027700A1 (fr) 2022-08-05 2023-08-01 Indexation conjointe de mode de partitionnement géométrique dans un codage vidéo

Country Status (1)

Country Link
WO (1) WO2024027700A1 (fr)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210227206A1 (en) * 2020-01-12 2021-07-22 Mediatek Inc. Video Processing Methods and Apparatuses of Merge Number Signaling in Video Coding Systems
WO2021196242A1 (fr) * 2020-04-03 2021-10-07 Oppo广东移动通信有限公司 Procédé de prédiction intertrame, codeur, décodeur et support d'enregistrement
CN113497936A (zh) * 2020-04-08 2021-10-12 Oppo广东移动通信有限公司 编码方法、解码方法、编码器、解码器以及存储介质
CN113840148A (zh) * 2020-06-24 2021-12-24 Oppo广东移动通信有限公司 帧间预测方法、编码器、解码器以及计算机存储介质
WO2021258841A1 (fr) * 2020-06-24 2021-12-30 Oppo广东移动通信有限公司 Procédé de prédiction inter-trames, codeur, décodeur, et support de stockage informatique

Similar Documents

Publication Publication Date Title
US11310526B2 (en) Hardware friendly constrained motion vector refinement
US11172203B2 (en) Intra merge prediction
US10715827B2 (en) Multi-hypotheses merge mode
US20220248064A1 (en) Signaling for illumination compensation
US11553173B2 (en) Merge candidates with multiple hypothesis
WO2020169082A1 (fr) Simplification de liste de fusion de copie intra-bloc
US20220224915A1 (en) Usage of templates for decoder-side intra mode derivation
US11240524B2 (en) Selective switch for parallel processing
CN113141783A (zh) 用于多重假设的帧内预测
WO2020103946A1 (fr) Signalisation pour prédiction de ligne de référence multiple et prédiction multi-hypothèse
WO2023020446A1 (fr) Réordonnancement de candidats et affinement de vecteur de mouvement pour un mode de partitionnement géométrique
WO2024027700A1 (fr) Indexation conjointe de mode de partitionnement géométrique dans un codage vidéo
WO2024007789A1 (fr) Génération de prédiction avec contrôle hors limite dans un codage vidéo
WO2023217140A1 (fr) Seuil de similarité pour liste de candidats
WO2024017004A1 (fr) Réordonnancement de liste de référence dans un codage vidéo
WO2024017224A1 (fr) Affinement de candidat affine
WO2023198105A1 (fr) Dérivation et prédiction de mode intra implicites basées sur une région
WO2024016955A1 (fr) Vérification hors limite dans un codage vidéo
WO2023174426A1 (fr) Mode de partitionnement géométrique et réorganisation de candidats à la fusion
WO2023236914A1 (fr) Codage de prédiction d'hypothèses multiples
WO2023198187A1 (fr) Dérivation et prédiction de mode intra basées sur un modèle
WO2023020444A1 (fr) Réordonnancement de candidats pour mode de fusion avec différence de vecteur de mouvement
WO2023241347A1 (fr) Zones adaptatives pour dérivation et prédiction de mode intra côté décodeur
WO2023236916A1 (fr) Mise à jour d'attributs de mouvement de candidats de fusion
WO2023131298A1 (fr) Mise en correspondance de limites pour codage vidéo

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23849406

Country of ref document: EP

Kind code of ref document: A1