WO2024078331A1 - Method and apparatus for subblock-based motion vector prediction with reordering and refinement in video coding

Method and apparatus for subblock-based motion vector prediction with reordering and refinement in video coding

Info

Publication number
WO2024078331A1
WO2024078331A1 (PCT/CN2023/121759)
Authority
WO
WIPO (PCT)
Prior art keywords
current block
block
motion
collocated reference
sbtmvp
Application number
PCT/CN2023/121759
Other languages
English (en)
Inventor
Chen-Yen LAI
Ching-Yeh Chen
Tzu-Der Chuang
Chih-Wei Hsu
Yi-Wen Chen
Original Assignee
Mediatek Inc.
Application filed by Mediatek Inc. filed Critical Mediatek Inc.
Publication of WO2024078331A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/52: Processing of motion vectors by predictive encoding
    • H04N 19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N 19/119: Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N 19/147: Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N 19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/70: Coding characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present invention is a non-provisional application of, and claims priority to, U.S. Provisional Patent Application No. 63/379,459, filed on October 14, 2022. The U.S. Provisional Patent Application is hereby incorporated by reference in its entirety.
  • The present invention relates to video coding systems using SbTMVP (Subblock-based Temporal Motion Vector Prediction). In particular, the present invention relates to techniques to improve the coding efficiency of SbTMVP.
  • Versatile Video Coding (VVC) is a video coding standard developed by the Joint Video Experts Team (JVET), a joint effort of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group (MPEG). VVC is specified in ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, published Feb. 2021.
  • VVC is developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources, including 3-dimensional (3D) video signals.
  • Fig. 1A illustrates an exemplary adaptive Inter/Intra video coding system incorporating loop processing.
  • For Intra Prediction, the prediction data is derived based on previously coded video data in the current picture.
  • For Inter Prediction, Motion Estimation (ME) is performed at the encoder side and Motion Compensation (MC) is performed based on the result of ME to provide prediction data derived from other picture(s) and motion data.
  • Switch 114 selects Intra Prediction 110 or Inter-Prediction 112 and the selected prediction data is supplied to Adder 116 to form prediction errors, also called residues.
  • the prediction error is then processed by Transform (T) 118 followed by Quantization (Q) 120.
  • the transformed and quantized residues are then coded by Entropy Encoder 122 to be included in a video bitstream corresponding to the compressed video data.
  • the bitstream associated with the transform coefficients is then packed with side information such as motion and coding modes associated with Intra prediction and Inter prediction, and other information such as parameters associated with loop filters applied to underlying image area.
  • The side information associated with Intra Prediction 110, Inter Prediction 112 and In-loop Filter 130 is provided to Entropy Encoder 122 as shown in Fig. 1A. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well.
  • the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues.
  • the residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct video data.
  • the reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames.
  • incoming video data undergoes a series of processing in the encoding system.
  • the reconstructed video data from REC 128 may be subject to various impairments due to a series of processing.
  • in-loop filter 130 is often applied to the reconstructed video data before the reconstructed video data are stored in the Reference Picture Buffer 134 in order to improve video quality.
  • For example, a deblocking filter (DF), Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) may be used.
  • the loop filter information may need to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, loop filter information is also provided to Entropy Encoder 122 for incorporation into the bitstream.
  • Loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in the reference picture buffer 134.
  • The system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder. It may correspond to the High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264 or VVC.
  • The decoder can use similar or the same functional blocks as the encoder, except for Transform 118 and Quantization 120, since the decoder only needs Inverse Quantization 124 and Inverse Transform 126.
  • the decoder uses an Entropy Decoder 140 to decode the video bitstream into quantized transform coefficients and needed coding information (e.g. ILPF information, Intra prediction information and Inter prediction information) .
  • the Intra prediction 150 at the decoder side does not need to perform the mode search. Instead, the decoder only needs to generate Intra prediction according to Intra prediction information received from the Entropy Decoder 140.
  • the decoder only needs to perform motion compensation (MC 152) according to Inter prediction information received from the Entropy Decoder 140 without the need for motion estimation.
  • An input picture is partitioned into non-overlapped square block regions referred to as CTUs (Coding Tree Units), similar to HEVC.
  • Each CTU can be partitioned into one or multiple smaller size coding units (CUs) .
  • the resulting CU partitions can be in square or rectangular shapes.
  • VVC divides a CTU into prediction units (PUs) as units to apply the prediction process, such as Inter prediction, Intra prediction, etc.
  • the VVC standard incorporates various new coding tools to further improve the coding efficiency over the HEVC standard.
  • Among the various new coding tools, some coding tools relevant to the present invention are reviewed as follows.
  • VVC supports the subblock-based temporal motion vector prediction (SbTMVP) method. Similar to the temporal motion vector prediction (TMVP) in HEVC, SbTMVP uses the motion field in the collocated picture to improve motion vector prediction and merge mode for CUs in the current picture. The same collocated picture used by TMVP is used for SbTMVP. SbTMVP differs from TMVP in the following two main aspects:
  • TMVP predicts motion at CU level but SbTMVP predicts motion at sub-CU level;
  • TMVP fetches the temporal motion vectors from the collocated block in the collocated picture (i.e., the collocated block is the bottom-right or centre block relative to the current CU)
  • SbTMVP applies a motion shift before fetching the temporal motion information from the collocated picture, where the motion shift is obtained from the motion vector from one of the spatial neighbouring blocks of the current CU.
  • the SbTMVP process is illustrated in Figs. 2A-B.
  • SbTMVP predicts the motion vectors of the sub-CUs within the current CU in two steps.
  • In the first step, the spatial neighbour A1 of the current block 222 in the current picture 220 in Fig. 2A is examined. If A1 has a motion vector that uses the collocated picture as its reference picture, this motion vector is selected as the motion shift to be applied. If no such motion is identified, then the motion shift is set to (0, 0).
  • the motion shift 240 identified in Step 1 is applied (i.e. added to the current block’s coordinates) to obtain sub-CU level motion information (motion vectors and reference indices) from the collocated picture 230 as shown in Fig. 2B.
  • the example in Fig. 2B assumes the motion shift is set to block A1’s motion, and the collocated block 232 in the collocated picture 230 can be located based on the collocated reference subblock A1’.
  • For each sub-CU, the motion information of its corresponding block (the smallest motion grid that covers the centre sample) in the collocated picture is used to derive the motion information for the sub-CU. For example, the motion information of the upper-left subblock of the collocated CU 232 is used to derive the prediction for the motion information of the upper-left subblock of the current CU 222.
  • After the motion information of the collocated sub-CU is identified, it is converted to the motion vectors and reference indices of the current sub-CU in a similar way as the TMVP process of HEVC, where temporal motion scaling is applied to align the reference pictures of the temporal motion vectors to those of the current CU.
  • the arrow (s) in each subblock of the collocated picture 230 correspond (s) to the motion vector (s) of a collocated subblock (thick-lined arrow for L0 MV and thin-lined arrow for L1 MV) .
  • the arrow (s) in each subblock correspond (s) to the scaled motion vector (s) of a current subblock (thick-lined arrow for L0 MV and thin-lined arrow for L1 MV) . If no motion information of the collocated sub-CU is available (e.g. an intra coded subblock) , a default motion is used.
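The two-step derivation above can be sketched in a simplified form. This is a toy model, not the VVC specification: the collocated motion field is a dictionary over an 8x8 subblock grid, and the grid size, coordinate convention and default MV are illustrative assumptions.

```python
def derive_sbtmvp(cu_x, cu_y, cu_w, cu_h, a1_mv, a1_refs_col,
                  col_field, default_mv=(0, 0), sub=8):
    """Simplified SbTMVP: pick a motion shift, then fetch per-subblock MVs."""
    # Step 1: use A1's MV as the motion shift only if A1 references the
    # collocated picture; otherwise the shift is (0, 0).
    shift = a1_mv if a1_refs_col else (0, 0)
    mvs = {}
    for sy in range(0, cu_h, sub):
        for sx in range(0, cu_w, sub):
            # Step 2: centre sample of the subblock, displaced by the shift.
            cx = cu_x + sx + sub // 2 + shift[0]
            cy = cu_y + sy + sub // 2 + shift[1]
            # Smallest motion grid (sub x sub) covering the centre sample.
            mv = col_field.get((cx // sub, cy // sub))
            # None models an intra-coded subblock: fall back to default motion.
            mvs[(sx // sub, sy // sub)] = mv if mv is not None else default_mv
    return mvs
```

For example, an 8x8 CU at (8, 8) with motion shift (8, 0) fetches the MV stored at grid position (2, 1) of the collocated motion field.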
  • A combined subblock-based merge list, which contains both the SbTMVP candidate and the affine merge candidates, is used for the signalling of the subblock-based merge mode.
  • The SbTMVP mode is enabled/disabled by a sequence parameter set (SPS) flag. If the SbTMVP mode is enabled, the SbTMVP predictor is added as the first entry of the list of subblock-based merge candidates, followed by the affine merge candidates.
  • SbTMVP mode is only applicable to CUs with both width and height larger than or equal to 8.
  • the encoding processing flow of the additional SbTMVP merge candidate is the same as for the other merge candidates, that is, for each CU in P or B slice, an additional RD check is performed to decide whether to use the SbTMVP candidate.
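The placement of the SbTMVP predictor in the subblock-based merge list can be sketched as follows (a simplified model of the list construction described above; the candidate representation is an assumption):

```python
def build_subblock_merge_list(sbtmvp_cand, affine_cands, sps_sbtmvp_enabled,
                              cu_width, cu_height):
    """SbTMVP candidate first (when the SPS flag enables it and the CU is
    at least 8x8), followed by the affine merge candidates."""
    candidates = []
    if (sps_sbtmvp_enabled and cu_width >= 8 and cu_height >= 8
            and sbtmvp_cand is not None):
        candidates.append(sbtmvp_cand)
    candidates.extend(affine_cands)
    return candidates
```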
  • Non-Adjacent Motion Vector Prediction (NAMVP)
  • In JVET-L0399, a coding tool referred to as Non-Adjacent Motion Vector Prediction (NAMVP) is proposed (Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 12th Meeting: Macao, CN, 3–12 Oct. 2018, Document: JVET-L0399).
  • the non-adjacent spatial merge candidates are inserted after the TMVP (i.e., the temporal MVP) in the regular merge candidate list.
  • The pattern of the non-adjacent spatial merge candidates is shown in Fig. 3, where each small numbered box corresponds to a NAMVP candidate and the candidates are ordered according to the distance (as shown by the number inside each box).
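The distance-based ordering of the non-adjacent candidates can be sketched as follows. The candidate positions and the distance metric (Manhattan distance here) are illustrative assumptions; the normative pattern is the one in the figure.

```python
def order_namvp_candidates(positions, block_centre):
    """Order non-adjacent candidate positions by increasing distance from
    the current block centre (Manhattan distance used as a stand-in)."""
    cx, cy = block_centre
    return sorted(positions, key=lambda p: abs(p[0] - cx) + abs(p[1] - cy))
```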
  • Multi-Pass Decoder-Side Motion Vector Refinement (MP-DMVR)
  • A multi-pass decoder-side motion vector refinement is applied. In the first pass, bilateral matching (BM) is applied to the coding block. In the second pass, BM is applied to each 16x16 subblock within the coding block. In the third pass, the MV in each 8x8 subblock is refined by applying bi-directional optical flow (BDOF).
  • First pass: a refined MV is derived by applying BM to the coding block. Similar to decoder-side motion vector refinement (DMVR), in the bi-prediction operation, a refined MV is searched around the two initial MVs (i.e., MV0 and MV1) in the reference picture lists L0 and L1. The refined MVs (i.e., MV0_pass1 and MV1_pass1) are derived around the initial MVs based on the minimum bilateral matching cost between the two reference blocks in L0 and L1.
  • BM performs local search to derive integer sample precision intDeltaMV.
  • The local search applies a 3×3 square search pattern to loop through the search range [-sHor, sHor] in the horizontal direction and [-sVer, sVer] in the vertical direction, wherein the values of sHor and sVer are determined by the block dimension, and the maximum value of sHor and sVer is 8.
  • The MRSAD cost function is applied to remove the DC effect of the distortion between the reference blocks.
  • When the centre point of the 3×3 search pattern has the minimum cost, the intDeltaMV local search is terminated. Otherwise, the current minimum cost search point becomes the new centre point of the 3×3 search pattern, and the search for the minimum cost continues until it reaches the end of the search range.
  • the existing fractional sample refinement is further applied to derive the final deltaMV.
  • The refined MVs after the first pass are then derived as:
      MV0_pass1 = MV0 + deltaMV
      MV1_pass1 = MV1 - deltaMV
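The first-pass integer search can be sketched as follows. This is a toy model: the bilateral matching cost is abstracted as a callable, and the `mrsad` helper only illustrates the DC removal on 1-D sample lists; it is not the normative search.

```python
def mrsad(block_a, block_b):
    """Mean-removed SAD: subtract the DC (mean) difference before the SAD."""
    dc = (sum(block_a) - sum(block_b)) // len(block_a)
    return sum(abs(a - b - dc) for a, b in zip(block_a, block_b))

def bm_int_search(cost_at, s_hor=8, s_ver=8):
    """3x3 square search re-centred on the running minimum, stopping when
    the centre already has the minimum cost or the range edge is reached."""
    centre = (0, 0)
    best = cost_at(centre)
    while True:
        points = [(centre[0] + dx, centre[1] + dy)
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  if abs(centre[0] + dx) <= s_hor and abs(centre[1] + dy) <= s_ver]
        candidate = min(points, key=cost_at)
        if cost_at(candidate) >= best:   # centre is already the minimum
            return centre, best          # (intDeltaMV, its cost)
        centre, best = candidate, cost_at(candidate)
```

With a convex toy cost centred at (3, -2), the search walks there and stops.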
  • Second pass: a refined MV is derived by applying BM to a 16×16 grid subblock. For each subblock, a refined MV is searched around the two MVs (e.g. MV0_pass1 and MV1_pass1) obtained during the first pass, in the reference picture lists L0 and L1.
  • The refined MVs are denoted as MV0_pass2(sbIdx2) and MV1_pass2(sbIdx2), where sbIdx2 is the index of the 16×16 subblock.
  • For each subblock, BM performs a full search to derive the integer sample precision intDeltaMV.
  • The full search has a search range [-sHor, sHor] in the horizontal direction and [-sVer, sVer] in the vertical direction, wherein the values of sHor and sVer are determined by the block dimension, and the maximum value of sHor and sVer is 8.
  • The search area (2*sHor + 1) * (2*sVer + 1) is divided into up to 5 diamond-shaped search regions, shown in Fig. 4 in 5 different shades.
  • Each search region is assigned a costFactor, which is determined by the distance (intDeltaMV) between each search point and the starting MV, and each diamond region is processed in the order starting from the centre of the search area. In each region, the search points are processed in the raster scan order starting from the top left going to the bottom right corner of the region.
  • Once the minimum cost within the current search region is less than a threshold, the int-pel full search is terminated; otherwise, the int-pel full search continues to the next search region until all search points are examined. Additionally, if the difference between the previous minimum cost and the current minimum cost in the iteration is less than a threshold that is equal to the area of the block, the search process terminates.
  • the existing VVC DMVR fractional sample refinement is further applied to derive the final deltaMV (sbIdx2) .
  • The refined MVs at the second pass are then derived as:
      MV0_pass2(sbIdx2) = MV0_pass1 + deltaMV(sbIdx2)
      MV1_pass2(sbIdx2) = MV1_pass1 - deltaMV(sbIdx2)
  • Third pass: a refined MV is derived by applying BDOF to an 8×8 grid subblock. For each 8×8 subblock, BDOF refinement is applied to derive scaled Vx and Vy without clipping, starting from the refined MV of the parent subblock of the second pass.
  • the derived bioMv (Vx, Vy) is rounded to 1/16 sample precision and clipped between -32 and 32.
  • The refined MVs (e.g. MV0_pass3(sbIdx3) and MV1_pass3(sbIdx3)) at the third pass are derived as:
      MV0_pass3(sbIdx3) = MV0_pass2(sbIdx2) + bioMv
      MV1_pass3(sbIdx3) = MV1_pass2(sbIdx2) - bioMv
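The third-pass update can be sketched as follows. MVs are kept as integers in 1/16-pel units; the rounding and clipping follow the description above, while the unit convention itself is an assumption of this sketch.

```python
def bdof_third_pass(mv0_pass2, mv1_pass2, vx, vy):
    """Round bioMv to 1/16-sample precision, clip it to [-32, 32]
    (in 1/16-pel units), then add it to the L0 MV and subtract it
    from the L1 MV of the parent second-pass subblock."""
    bio = [max(-32, min(32, int(round(v * 16)))) for v in (vx, vy)]
    mv0 = (mv0_pass2[0] + bio[0], mv0_pass2[1] + bio[1])
    mv1 = (mv1_pass2[0] - bio[0], mv1_pass2[1] - bio[1])
    return mv0, mv1
```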
  • The adaptive decoder-side motion vector refinement method is an extension of multi-pass DMVR, which consists of two new merge modes that refine the MV in only one direction, either L0 or L1, of the bi-prediction, for the merge candidates that meet the DMVR conditions.
  • The multi-pass DMVR process is applied for the selected merge candidate to refine the motion vectors; however, either MVD0 or MVD1 is set to zero in the 1st pass (i.e. PU level) DMVR.
  • The merge candidates for the new merge modes are derived from spatial neighbouring coded blocks, TMVPs, non-adjacent blocks, HMVPs and pair-wise candidates, similar to the regular merge mode. The difference is that only those meeting the DMVR conditions are added into the candidate list. The same merge candidate list is used by the two new merge modes. The list of BM candidates may contain inherited BCW weights, and the DMVR process is unchanged except that the computation of the distortion is made using MRSAD or MRSATD if the weights are non-equal and the bi-prediction is weighted with the BCW weights. The merge index is coded as in the regular merge mode.
  • Template matching (TM) is a decoder-side MV derivation method to refine the motion information of the current CU by finding the closest match between a template (i.e., the top 514 and/or left 516 neighbouring blocks of the current CU 512) in the current picture 510 and a block of the same size as the template (i.e., blocks 524 and 526) in a reference picture 520, as shown in Fig. 5.
  • A better MV is searched around the initial motion 530 of the current CU 512 of the current picture 510 within a [-8, +8]-pel search range 522 around the location 528 in the reference picture 520 pointed to by the initial MV 530.
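A brute-force version of this template search can be sketched as follows. The real search is iterative and precision-aware; the exhaustive integer scan and the flat-list template representation here are simplifications.

```python
def tm_search(init_mv, cur_template, ref_template_at, search_range=8):
    """Find the integer MV within [-8, +8] pels of the initial MV that
    minimises the SAD between the current and reference templates."""
    def sad(mv):
        ref = ref_template_at(mv)  # reference template samples for this MV
        return sum(abs(a - b) for a, b in zip(cur_template, ref))
    candidates = [(init_mv[0] + dx, init_mv[1] + dy)
                  for dx in range(-search_range, search_range + 1)
                  for dy in range(-search_range, search_range + 1)]
    return min(candidates, key=sad)
```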
  • The template matching method in JVET-J0021 (Yi-Wen Chen, et al., “Description of SDR, HDR and 360° video coding technology proposal by Qualcomm and Technicolor – low and high complexity versions”, Joint Video Experts Team (JVET) of ITU-T SG16 WP3 and ISO/IEC JTC 1/SC 29/WG11, 10th Meeting: San Diego, US, 10–20 Apr. 2018, Document: JVET-J0021) is used with the following modifications: the search step size is determined based on the AMVR mode, and TM can be cascaded with the bilateral matching process in merge modes.
  • In AMVP mode, an MVP candidate is determined based on the template matching error, i.e., the candidate that reaches the minimum difference between the current block template and the reference block template is selected.
  • TM is then performed only for this particular MVP candidate for MV refinement.
  • TM refines this MVP candidate by using an iterative diamond search starting from full-pel MVD precision (or 4-pel for 4-pel AMVR mode) within a [-8, +8]-pel search range.
  • the AMVP candidate may be further refined by using cross search with full-pel MVD precision (or 4-pel for 4-pel AMVR mode) , followed sequentially by half-pel and quarter-pel ones depending on AMVR mode as specified in Table 1. This search process ensures that the MVP candidate still keeps the same MV precision as indicated by the AMVR mode after the TM process. In the search process, if the difference between the previous minimum cost and the current minimum cost in the iteration is less than a threshold that is equal to the area of the block, the search process terminates.
  • In merge mode, TM may be performed all the way down to 1/8-pel MVD precision, or skip the precisions beyond half-pel, depending on whether the alternative interpolation filter (used when AMVR is in half-pel mode) is used, according to the merged motion information.
  • template matching may work as an independent process or an extra MV refinement process between block-based and subblock-based bilateral matching (BM) methods, depending on whether BM can be enabled or not according to its enabling condition check.
  • the merge candidates are adaptively reordered according to costs evaluated using template matching (TM) .
  • the reordering method can be applied to the regular merge mode, template matching (TM) merge mode, and affine merge mode (excluding the SbTMVP candidate) .
  • For the TM merge mode, merge candidates are reordered before the refinement process.
  • merge candidates are divided into multiple subgroups.
  • the subgroup size is set to 5 for the regular merge mode and TM merge mode.
  • the subgroup size is set to 3 for the affine merge mode.
  • Merge candidates in each subgroup are reordered ascendingly according to cost values based on template matching. For simplification, merge candidates in the last subgroup are not reordered when it is not the first subgroup.
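The subgroup-wise reordering can be sketched as follows (the TM cost is abstracted as a callable; the candidate representation is an assumption):

```python
def armc_reorder(candidates, tm_cost, subgroup_size=5):
    """Reorder merge candidates in subgroups of `subgroup_size` by ascending
    TM cost; the last subgroup is left as-is unless it is also the first."""
    groups = [candidates[i:i + subgroup_size]
              for i in range(0, len(candidates), subgroup_size)]
    out = []
    for i, group in enumerate(groups):
        last_not_first = (i == len(groups) - 1) and i > 0
        out += group if last_not_first else sorted(group, key=tm_cost)
    return out
```

With the candidate value itself as a stand-in cost, the first subgroup of five is sorted while a trailing subgroup is kept in signalled order.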
  • the template matching cost of a merge candidate is measured as the sum of absolute differences (SAD) between samples of a template of the current block and their corresponding reference samples.
  • the template comprises a set of reconstructed samples neighbouring to the current block. Reference samples of the template are located by the motion information of the merge candidate.
  • When a merge candidate utilizes bi-directional prediction, the reference samples of the template of the merge candidate are also generated by bi-prediction, as shown in Fig. 6.
  • block 612 corresponds to a current block in current picture 610
  • blocks 622 and 632 correspond to reference blocks in reference pictures 620 and 630 in list 0 and list 1 respectively.
  • Templates 614 and 616 are for current block 612
  • templates 624 and 626 are for reference block 622
  • templates 634 and 636 are for reference block 632.
  • Motion vectors 640, 642 and 644 are merge candidates in list 0 and motion vectors 650, 652 and 654 are merge candidates in list 1.
  • For subblock-based merge candidates, the above template comprises several sub-templates, each with size Wsub × 1, and the left template comprises several sub-templates, each with size 1 × Hsub. The motion information of the subblocks in the first row and the first column of the current block is used to derive the reference samples of each sub-template.
  • block 712 corresponds to a current block in current picture 710
  • block 722 corresponds to a collocated block in reference picture 720.
  • Each small square in the current block and the collocated block corresponds to a subblock.
  • the dot-filled areas on the left and top of the current block correspond to template for the current block.
  • the boundary subblocks are labelled from A to G.
  • the arrow associated with each subblock corresponds to the motion vector of the subblock.
  • the reference subblocks (labelled as Aref to Gref) are located according to the motion vectors associated with the boundary subblocks.
  • Merge mode with motion vector differences (MMVD)
  • In MMVD, after a merge candidate is selected, it is further refined by the signalled MVD information.
  • the further information includes a merge candidate flag, an index to specify motion magnitude, and an index for indication of motion direction.
  • In MMVD mode, one of the first two candidates in the merge list is selected to be used as the MV basis.
  • the MMVD candidate flag is signalled to specify which one is used between the first and second merge candidates.
  • The distance index specifies motion magnitude information and indicates the pre-defined offset from the starting points (812 and 822) for the L0 reference block 810 and L1 reference block 820, as shown in Fig. 8.
  • an offset is added to either the horizontal component or the vertical component of the starting MV, where small circles in different styles correspond to different offsets from the centre.
  • the relation between the distance index and pre-defined offset is specified in Table 2.
  • The direction index represents the direction of the MVD relative to the starting point. The direction index can represent the four directions shown in Table 3. It is noted that the meaning of the MVD sign can vary according to the information of the starting MVs.
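Assuming the VVC MMVD tables (eight distance steps from 1/4-pel to 32-pel, and four axis-aligned directions), the offset derivation from the two indices can be sketched as:

```python
# Assumed VVC MMVD tables: distances in quarter-pel units, directions as
# unit vectors (the tables in this document, Tables 2 and 3, are not shown).
MMVD_DIST = [1, 2, 4, 8, 16, 32, 64, 128]      # 1/4, 1/2, 1, 2, ..., 32 pel
MMVD_DIR = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # +x, -x, +y, -y

def mmvd_offset(distance_idx, direction_idx):
    """MVD offset in quarter-pel units from the two signalled MMVD indices."""
    d = MMVD_DIST[distance_idx]
    sx, sy = MMVD_DIR[direction_idx]
    return (sx * d, sy * d)
```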
  • When the starting MVs are a uni-prediction MV or bi-prediction MVs with both lists pointing to the same side of the current picture (i.e., the POCs of the two reference pictures are both larger than the POC of the current picture, or both smaller than the POC of the current picture), the sign in Table 3 specifies the sign of the MV offset added to the starting MV.
  • Otherwise, when the starting MVs are bi-prediction MVs with the two reference pictures on different sides of the current picture: if the POC difference in list 0 is greater than that in list 1, the sign in Table 3 specifies the sign of the MV offset added to the list-0 MV component of the starting MV, and the sign for the list-1 MV has the opposite value; otherwise, if the POC difference in list 1 is greater than that in list 0, the sign in Table 3 specifies the sign of the MV offset added to the list-1 MV component of the starting MV, and the sign for the list-0 MV has the opposite value.
  • The MVD is scaled according to the difference of POCs in each direction. If the POC differences in both lists are the same, no scaling is needed. Otherwise, if the POC difference between the list-0 (L0) reference picture and the current picture is larger than the one between the list-1 (L1) reference picture and the current picture, the MVD for list 1 is scaled according to a ratio of td and tb, where td corresponds to the POC difference between the L0 reference picture and the current picture and tb corresponds to the POC difference between the L1 reference picture and the current picture. If the POC difference of the L1 reference picture is larger, the MVD for list 0 is scaled in a similar way. If the starting MV is uni-predicted, the MVD is added to the available MV.
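The list-dependent scaling can be sketched as follows; the integer truncation is a simplification of the normative MV scaling.

```python
def scale_mmvd_offset(offset, poc_cur, poc_ref_l0, poc_ref_l1):
    """Apply the signalled offset to the list with the larger POC distance
    and scale the other list's offset by the distance ratio."""
    td = abs(poc_ref_l0 - poc_cur)   # L0 POC distance
    tb = abs(poc_ref_l1 - poc_cur)   # L1 POC distance
    if td == tb:
        return offset, offset        # equal distances: no scaling needed
    if td > tb:                      # L0 farther: scale the L1 offset
        return offset, tuple(v * tb // td for v in offset)
    return tuple(v * td // tb for v in offset), offset
```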
  • A method and apparatus for video coding using SbTMVP (Subblock-based Temporal Motion Vector Prediction) are disclosed.
  • input data associated with a current block are received, wherein the input data comprise pixel data for the current block to be encoded at an encoder side or encoded data associated with the current block to be decoded at a decoder side.
  • One or more motion shift candidates are determined based on one or more spatial neighbouring blocks of the current block.
  • Two or more collocated reference blocks in a collocated picture are determined based on said one or more motion shift candidates and said one or more spatial neighbouring blocks of the current block respectively.
  • a target collocated reference block is determined from said two or more collocated reference blocks.
  • Subblock motion information for subblocks of the current block is derived based on target motion information of corresponding subblocks of the target collocated reference block.
  • An SbTMVP (Subblock-based Temporal Motion Vector Prediction) candidate for the current block is generated based on the subblock motion information for the subblocks of the current block.
  • the current block is encoded or decoded by using a motion prediction set comprising the SbTMVP candidate.
  • said two or more collocated reference blocks in the collocated picture are determined by locating a base collocated reference block according to a location of a target spatial neighbouring block of the current block and a target motion shift candidate associated with the target spatial neighbouring block of the current block, and one or more additional collocated reference blocks are located by adding one or more additional motion shifts to the base collocated reference block.
  • the target spatial neighbouring block of the current block corresponds to a bottom-left neighbouring block and three additional collocated reference blocks are located on top side, left side and bottom side of the base collocated reference block respectively.
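The geometry of the bullet above can be sketched as follows; the 4-sample shift and the (x, y) coordinate convention are assumptions for illustration, matching the layout of Fig. 9A where A1' is the base collocated position and T1/L1/B1 lie on its top, left and bottom sides.

```python
SHIFT = 4  # assumed additional motion shift in luma samples

def collocated_candidates(a1_pos, motion_shift):
    """Base collocated position A1' plus top/left/bottom shifted candidates.

    a1_pos and motion_shift are (x, y) pairs in luma samples; a sketch only.
    """
    bx = a1_pos[0] + motion_shift[0]
    by = a1_pos[1] + motion_shift[1]
    return {
        "A1'": (bx, by),              # base collocated reference position
        "T1": (bx, by - SHIFT),       # top side
        "L1": (bx - SHIFT, by),       # left side
        "B1": (bx, by + SHIFT),       # bottom side
    }
```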
  • the target collocated reference block is determined from said two or more collocated reference blocks according to Rate-Distortion costs associated with said two or more collocated reference blocks.
  • a first syntax is signalled or parsed to indicate whether said two or more collocated reference blocks are used. In one embodiment, when the first syntax indicates said two or more collocated reference blocks being used, a second syntax is signalled or parsed to indicate the target collocated reference block as selected.
  • said two or more collocated reference blocks in the collocated picture comprise one base collocated reference block from each of said two or more motion shift candidates and said one base collocated reference block is located according to a location of each spatial neighbouring block of the current block and a target motion shift candidate associated with said one base collocated reference block.
  • a first syntax is signalled or parsed to indicate whether said two or more motion shift candidates are used.
  • a second syntax is signalled or parsed to indicate which base collocated reference block is selected.
  • the first syntax is signalled or parsed only if affine MMVD is disabled for the current block.
  • corresponding SbTMVP candidates associated with said two or more collocated reference blocks are reordered according to ARMC-TM (Adaptive Reordering of Merge Candidates with Template Matching) .
  • N best candidates are used for further rate-distortion cost evaluation, and wherein N is smaller than or equal to a total number of the corresponding SbTMVP candidates.
  • a set of indexes is used to indicate the N best candidates and a smaller index value is signalled using a shortened codeword.
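One common way to give a smaller index a shorter codeword is truncated-unary binarisation; this is a typical choice for merge-style indices, not necessarily the binarisation used by the embodiment.

```python
def truncated_unary(idx, max_idx):
    """Truncated-unary codeword: idx ones, then a terminating zero
    (omitted for the largest index). Smaller indices get shorter codes."""
    if idx < max_idx:
        return "1" * idx + "0"
    return "1" * max_idx
```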
  • one or more templates of the current block and one or more corresponding templates of the target collocated reference block are used for the ARMC-TM.
  • one or more templates of the current block and one or more corresponding templates of the target collocated reference block are used for the ARMC-TM, and wherein the subblocks of the target collocated reference block are located based on subblock motions.
  • Fig. 1A illustrates an exemplary adaptive Inter/Intra video coding system incorporating loop processing.
  • Fig. 1B illustrates a corresponding decoder for the encoder in Fig. 1A.
  • Fig. 2A illustrates an example of subblock-based Temporal Motion Vector Prediction (SbTMVP) in VVC, where the spatial neighbouring blocks are checked for availability of motion information.
  • Fig. 2B illustrates an example of SbTMVP for deriving the sub-CU motion field by applying a motion shift from a spatial neighbour and scaling the motion information from the corresponding collocated sub-CUs.
  • Fig. 3 illustrates an exemplary pattern of the non-adjacent spatial merge candidates.
  • Fig. 4 illustrates the 5 diamond shape search regions used for multi-pass decoder-side motion vector refinement.
  • Fig. 5 illustrates an example of template matching used to refine an initial MV by searching an area around the initial MV.
  • Fig. 6 illustrates an example of templates used for the current block and corresponding reference blocks to measure matching costs associated with merge candidates.
  • Fig. 7 illustrates the offset distances in the horizontal and vertical directions for a L0 reference block and L1 reference block according to MMVD.
  • Fig. 8 illustrates an example of Merge mode with MVD (MMVD) , where distance index specifies motion magnitude information and indicates the pre-defined offset from the starting points for a L0 reference block and L1 reference block.
  • Fig. 9A illustrates an example of SbTMVP with multiple motion vector shifts (sbTMVP with Mmvs) according to an embodiment of the present invention, where a motion shift based on A1 is used to locate A1’ in the collocated picture and additional candidates (T1, L1 and B1) are located by adding additional shifts.
  • Fig. 9B illustrates an example of the collocated reference block associated with candidate L1.
  • Fig. 10 illustrates an example of deriving multiple collocated reference blocks based on two different motion shifts associated with A1 and TR1 neighbouring blocks.
  • Fig. 11 illustrates an example of a CU-based template for calculating the template matching cost.
  • Fig. 12 illustrates a flowchart of an exemplary video coding system that utilizes SbTMVP with multiple motion shifts according to an embodiment of the present invention.
  • a motion vector of non-adjacent spatial neighbouring blocks is used as a motion shift.
  • the motion shift derivation is related to TM and/or ARMC-TM with the following variations:
  • a motion shift is refined by TM with CU template.
  • Two or more motion vectors of adjacent and/or non-adjacent spatial neighbouring blocks are reordered by ARMC-TM, then the first one or more are selected as the motion shift.
  • Two or more motion vectors of adjacent and/or non-adjacent spatial neighbouring blocks are refined by TM, then reordered by ARMC-TM, then the first one or more motion vectors are selected as the motion shift.
  • Two or more motion vectors of adjacent and/or non-adjacent spatial neighbouring blocks are reordered by ARMC-TM, then the first one or more motion vectors are selected as the motion shift and then refined by TM.
  • In the process of TM or ARMC-TM, the cost of a motion vector is set to a large value, or the motion vector is skipped, if treating that motion vector as a motion shift would make the SbTMVP candidate unavailable (i.e., the default motion is not available) .
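The reorder-then-select variations above can be sketched as a single routine; `tm_cost` and `makes_sbtmvp_available` are assumed callables standing in for the template-matching cost and the availability check, and a shift that would leave the SbTMVP candidate unavailable is given an infinite cost (effectively skipping it).

```python
import math

def select_motion_shifts(mv_candidates, tm_cost, makes_sbtmvp_available, keep=1):
    """ARMC-TM-style reordering sketch: sort candidate motion shifts by
    TM cost and keep the first `keep` of them as the motion shift(s)."""
    def cost(mv):
        if not makes_sbtmvp_available(mv):
            return math.inf          # unavailable SbTMVP candidate: skip
        return tm_cost(mv)
    return sorted(mv_candidates, key=cost)[:keep]
```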
  • the default motion is further refined by TM and/or BM with the following variations:
  • a default motion is refined by TM with the CU template.
  • a default motion is refined by BM.
  • a default motion is refined by TM and then BM.
  • a default motion is refined by BM and then TM.
  • SbTMVP with multiple motion vector shifts (sbTMVP with Mmvs) is proposed as a new mode.
  • more than one sbTMVP candidate with different motion shifts is tried and the best one will be selected based on the RD cost.
  • A1 is a bottom-left neighbouring block of the current block 912.
  • the motion vector 930 points to A1’ in the collocated picture 920 and the collocated CU 922 can be located according to A1.
  • the first sbTMVP with Mmvs candidate derived based on the temporal motion from A1 is candidate 0 (i.e., A1’) .
  • the second sbTMVP with Mmvs candidate derived based on the temporal motion from A1 and with motion shift to left 4 pixels is candidate 1 (i.e., L1) .
  • subblock L1 is located by shifting A1’ to the left by 4 pixels.
  • the corresponding collocated CU 942 can be located according to L1 as shown in Fig. 9B.
  • the third sbTMVP with Mmvs candidate derived based on the temporal motion from A1 and with the motion shift to top 4 pixels is candidate 2 (i.e., T1) .
  • the fourth sbTMVP with Mmvs candidate derived based on the temporal motion from A1 and with the motion shift to bottom 4 pixels is candidate 3 (i.e., B1) .
  • the four sbTMVPs are generated by adding additional motion shifts to a base motion shift 930 so that three additional collocated reference blocks (i.e., T1, L1 and B1 in Fig. 9A) are identified in addition to a base collocated reference block (i.e., A1’) .
  • the collocated CUs corresponding to B1 and T1 can be determined in a way similar to that for L1.
  • the best sbTMVP with Mmvs candidate is determined based on Rate-Distortion (RD) cost.
  • a flag (i.e., sbtmvp_mmvd_flag) is signalled to indicate the on-off of sbTMVP with Mmvs. If sbtmvp_mmvd_flag is equal to 1, sbtmvp_merge_idx is signalled to indicate the best candidate in the sbTMVP with Mmvs candidate list.
  • sbtmvp_base_idx can be further signalled if sbtmvp_mmvd_flag is equal to 1 to indicate a different initial candidate (base candidate) .
  • sbtmvp_base_idx is signalled to indicate the best candidate derived from base candidate 0 (i.e., A1) or base candidate 1 (i.e., TR1) . If the best candidate is derived from base 1 (i.e., TR1) , then the sbtmvp_merge_idx is used to indicate the best candidate being TT1, TB1, or TL1. If the best candidate is derived from base 0 (i.e., A1) , then the sbtmvp_merge_idx is used to indicate the best candidate being T1, B1, or L1.
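The signalling above can be sketched from the decoder's perspective; `read_flag`/`read_idx` stand in for entropy-decoder reads, and the parsing order and binarisation are assumptions based on the text, not a normative syntax table.

```python
def parse_sbtmvp_mmvs(read_flag, read_idx):
    """Parsing sketch: sbtmvp_mmvd_flag, then (if set) the base index
    choosing A1 or TR1 and the merge index choosing the shifted candidate."""
    syntax = {"sbtmvp_mmvd_flag": read_flag()}
    if syntax["sbtmvp_mmvd_flag"]:
        syntax["sbtmvp_base_idx"] = read_idx()   # base 0 (A1) or base 1 (TR1)
        syntax["sbtmvp_merge_idx"] = read_idx()  # T1/B1/L1 or TT1/TB1/TL1
    return syntax
```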
  • sbtmvp_base_idx can be further signalled if sbtmvp_mmvd_flag is equal to 1 to indicate a different initial candidate (i.e., base candidate) .
  • base candidate 0 is the motion at A1 position from collocated picture 0
  • base candidate 1 is the motion at A1 position from collocated picture 1.
  • sbtmvp_base_idx can be further signalled if sbtmvp_mmvd_flag is equal to 1 to indicate a different initial candidate (i.e., base candidate) .
  • base candidate 0 is the motion at A1 position from collocated picture 0
  • base candidate 1 is the motion at the A1 position from collocated picture 1. If neither the L0 motion nor the L1 motion of the A1 position is from collocated picture 0 or collocated picture 1, a motion scaling technique can be used on the motion from L0 or the motion from L1.
  • sbTMVP with multiple subblock motion shifts (sbTMVP with sMmvs) is proposed as a new mode. That means more than one motion shift is added to each subblock motion of an sbTMVP candidate to generate more than one sbTMVP with sMmvs candidate.
  • an sbTMVP candidate is generated by using a motion from A1.
  • a group of subblock motions (i.e., an initial motion group) is thereby obtained.
  • more than one subblock motion shift will be added to each subblock motion of the initial motion group to generate more sbTMVP with sMmvs candidates.
  • the best sbTMVP with sMmvs candidate will be selected based on the RD cost of each candidate.
  • a flag (i.e., sbtmvp_mmvd_flag) is signalled to indicate the on-off of sbTMVP with sMmvs. If sbtmvp_mmvd_flag is equal to 1, sbtmvp_merge_idx is signalled to indicate the best candidate in the sbTMVP with sMmvs candidate list.
  • ARMC-TM is performed on the candidate list of sbTMVP with sMmvs or sbTMVP with Mmvs.
  • the candidate list of sbTMVP with sMmvs or sbTMVP with Mmvs can be reordered according to ARMC-TM first, and only the best N candidates after reordering will be used to calculate the RD cost for comparison.
  • N is an integer smaller than or equal to the number of candidates in candidate list.
  • TM cost is calculated using the template of the current block on the collocated picture.
  • Fig. 11 shows templates of candidate A1’ and L3.
  • TM costs are calculated based on CU templates (1122 for A1’ and 1124 for L3) , where current picture 1110 and collocated picture 1120 are shown.
  • TM cost is calculated by using subblock motions similar to the technique shown in Fig. 7.
  • the availability of the sbTMVP with sMmvs or sbTMVP with Mmvs candidates is checked before the RD calculation.
  • regarding invalid candidates: if the centre position of the current block of a candidate in the collocated picture is coded in intra mode or IBC mode, the candidate is treated as invalid.
  • the final candidate list is generated so as not to include invalid candidates.
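The validity check can be sketched as a simple filter; `mode_at` is an assumed lookup of the coding mode at a position in the collocated picture, and the candidate representation is hypothetical.

```python
def filter_valid_candidates(candidates, mode_at):
    """Drop candidates whose centre position in the collocated picture is
    intra- or IBC-coded; the remainder forms the final candidate list."""
    return [c for c in candidates
            if mode_at(c["centre"]) not in ("INTRA", "IBC")]
```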
  • a flag (e.g., sbtmvp_mmvd_flag) is signalled to indicate the on-off of sbTMVP with Mmvs. If sbtmvp_mmvd_flag is equal to 1, sbtmvp_merge_idx is signalled to indicate the best candidate in the sbTMVP with Mmvs candidate list. sbtmvp_base_idx is signalled if more than one base candidate is used, and these syntax elements are signalled after the affine with mmvd related syntax. An exemplary syntax design is shown as follows.
  • SbTMVP with Mmvs can only be enabled when sbTMVP is present in the affine merge candidate list. Therefore, affine with mmvd can only be enabled when sbTMVP is not present in the affine merge candidate list. That means SbTMVP with Mmvs can share the same syntax data with affine with mmvd.
  • the decoder can determine whether SbTMVP with Mmvs or affine with mmvd is enabled. If sbTMVP is present in the affine merge candidate list and subblock_mmvd_flag is equal to 1, SbTMVP with Mmvs is enabled.
  • An exemplary syntax design is shown as follows.
  • affine with mmvd can apply 2 base candidates without signalling affine_base_idx.
  • the syntax design is shown as follows. In the encoder, the number of base candidates of affine with mmvd is equal to 1 when at least one sbTMVP with Mmvs candidate is valid.
  • affine_mmvd_flag affine_merge_idx
  • sbtmvp_mmvd_flag affine_merge_idx
  • sbtmvp_merge_idx affine_merge_idx
  • affine_mmvd_flag [x0] [y0] equal to 1 specifies that, for the current coding unit, affine with mmvd is enabled.
  • the array indices x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
  • affine_merge_idx [x0] [y0] specifies the merging candidate index of the affine with mmvd candidate list where x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
  • sbtmvp_mmvd_flag [x0] [y0] equal to 1 specifies that, for the current coding unit, sbTMVP with Mmvs is enabled or base candidate 1 is used in affine with mmvd.
  • the array indices x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture. If none of the sbTMVP with Mmvs candidates is valid, affineBaseIdx is equal to sbtmvp_mmvd_flag; otherwise, affineBaseIdx is equal to 0.
  • affineEnableFlag is equal to 1; otherwise, affineEnableFlag is equal to affine_mmvd_flag.
  • sbtmvp_merge_idx [x0] [y0] specifies the merging candidate index of the SbTMVP with Mmvs candidate list or merging candidate index of the affine with mmvd candidate list where x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
  • if none of the sbTMVP with Mmvs candidates is valid, affineMmvdMergeIdx is equal to sbtmvp_merge_idx and sbtmvpMmvdMergeIdx is equal to 0; otherwise, affineMmvdMergeIdx is equal to affine_merge_idx and sbtmvpMmvdMergeIdx is equal to sbtmvp_merge_idx.
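The shared-syntax variable derivation can be sketched as follows; the branching condition is assumed to mirror the affineBaseIdx rule (i.e., the validity of the sbTMVP with Mmvs candidates), and the function name is hypothetical.

```python
def derive_shared_syntax(sbtmvp_valid, sbtmvp_mmvd_flag,
                         sbtmvp_merge_idx, affine_merge_idx):
    """Derive affineBaseIdx, affineMmvdMergeIdx and sbtmvpMmvdMergeIdx from
    the shared syntax elements (illustrative sketch)."""
    if not sbtmvp_valid:
        # No valid sbTMVP with Mmvs candidate: reuse the syntax for affine
        affine_base_idx = sbtmvp_mmvd_flag
        affine_mmvd_merge_idx = sbtmvp_merge_idx
        sbtmvp_mmvd_merge_idx = 0
    else:
        affine_base_idx = 0
        affine_mmvd_merge_idx = affine_merge_idx
        sbtmvp_mmvd_merge_idx = sbtmvp_merge_idx
    return affine_base_idx, affine_mmvd_merge_idx, sbtmvp_mmvd_merge_idx
```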
  • the motion vector shift for an sbTMVP can be signalled through a scheme similar to the MMVD scheme.
  • a merge candidate is selected by a merge index
  • the further information includes, but not limited to, an index to specify motion magnitude and an index to indicate the motion direction.
  • the set of the motion magnitudes and the set of the motion directions can contain any predefined values and are not limited to the sets used in the current MMVD design in VVC, as illustrated in Table 4 and Table 5, respectively.
  • the merge candidate list can be derived by a merge list construction process which is the same as the process of regular merge candidate list construction. Since the merge candidate can be a bi-directional merge candidate, additional syntax elements are signalled to indicate which direction is used as the motion vector shift.
  • the merge candidate list can be derived by only inserting uni-directional motion vectors into the candidate list. Since the merge candidate can only be a uni-directional merge candidate, the signalled MVD information is directly added to the selected uni-directional merge candidate to derive the motion vector shift for the sbTMVP.
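Adding the signalled MMVD-style offset to a uni-directional merge candidate can be sketched as follows. The distance and direction tables mirror the MMVD design in VVC referenced above (distances in quarter-luma-sample units), but here they are illustrative assumptions, since the text allows any predefined sets.

```python
# Assumed tables mirroring the VVC MMVD design (quarter-luma-sample units)
MMVD_DISTANCES = [1, 2, 4, 8, 16, 32, 64, 128]
MMVD_DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # +x, -x, +y, -y

def motion_shift_from_mmvd(base_mv, distance_idx, direction_idx):
    """Add the signalled offset to a uni-directional merge candidate to
    obtain the motion shift for the sbTMVP (illustrative sketch)."""
    dx, dy = MMVD_DIRECTIONS[direction_idx]
    d = MMVD_DISTANCES[distance_idx]
    return (base_mv[0] + dx * d, base_mv[1] + dy * d)
```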
  • any of the SbTMVP methods with multiple motion shifts as described above can be implemented in encoders and/or decoders.
  • any of the proposed SbTMVP multiple motion shifts methods can be implemented in an inter coding module and/or a merge/AMVP candidate derivation module of an encoder (e.g. Inter Pred. 112 in Fig. 1A) , or a motion compensation module (e.g., MC 152 in Fig. 1B) and/or a merge/AMVP candidate derivation module of a decoder.
  • any of the proposed methods can be implemented as a circuit coupled to the inter coding module and/or a merge/AMVP candidate derivation module of an encoder and/or motion compensation module and/or a merge/AMVP candidate derivation module of the decoder.
  • the Inter-Pred. 112 and MC 152 are shown as individual processing units to support the SbTMVP methods, they may correspond to executable software or firmware codes stored on a media, such as hard disk or flash memory, for a CPU (Central Processing Unit) or programmable devices (e.g. DSP (Digital Signal Processor) or FPGA (Field Programmable Gate Array) ) .
  • Fig. 12 illustrates a flowchart of an exemplary video coding system that utilizes SbTMVP with multiple motion shifts according to an embodiment of the present invention.
  • the steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side.
  • the steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart.
  • input data associated with a current block are received in step 1210, wherein the input data comprise pixel data for the current block to be encoded at an encoder side or encoded data associated with the current block to be decoded at a decoder side.
  • One or more motion shift candidates are determined based on one or more spatial neighbouring blocks of the current block in step 1220.
  • Two or more collocated reference blocks in a collocated picture are determined based on said one or more motion shift candidates and said one or more spatial neighbouring blocks of the current block respectively in step 1230.
  • a target collocated reference block is determined from said two or more collocated reference blocks in step 1240.
  • Subblock motion information for subblocks of the current block is derived based on target motion information of corresponding subblocks of the target collocated reference block in step 1250.
  • An SbTMVP (Subblock-based Temporal Motion Vector Prediction) candidate for the current block is generated based on the subblock motion information for the subblocks of the current block in step 1260.
  • the current block is encoded or decoded by using a motion prediction set comprising the SbTMVP candidate in step 1270.
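Steps 1210 through 1270 above can be summarised as a pipeline skeleton; each callable stands in for one stage of the flowchart, and the function signature is a hypothetical framing, not an implementation of the claimed method.

```python
def sbtmvp_flow(input_data, derive_shifts, locate_blocks, pick_target,
                derive_subblock_mvs, build_candidate, code_block):
    """Skeleton of the flowchart in Fig. 12 (steps 1220-1270)."""
    shifts = derive_shifts(input_data)        # step 1220: motion shift candidates
    blocks = locate_blocks(shifts)            # step 1230: collocated ref blocks
    target = pick_target(blocks)              # step 1240: target collocated block
    sub_mvs = derive_subblock_mvs(target)     # step 1250: subblock motion info
    candidate = build_candidate(sub_mvs)      # step 1260: SbTMVP candidate
    return code_block(input_data, candidate)  # step 1270: encode/decode block
```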
  • Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both.
  • an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein.
  • An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
  • the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) .
  • These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
  • the software code or firmware code may be developed in different programming languages and different formats or styles.
  • the software code may also be compiled for different target platforms.
  • different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Methods and apparatus of video coding using SbTMVP (Subblock-based Temporal Motion Vector Prediction) are disclosed. According to the method, one or more motion shift candidates are determined based on one or more spatial neighbouring blocks of the current block. Multiple collocated reference blocks in a collocated picture are determined based on said one or more motion shift candidates and said one or more spatial neighbouring blocks of the current block respectively. A target collocated reference block is determined from the multiple collocated reference blocks. Subblock motion information for subblocks of the current block is derived based on target motion information of corresponding subblocks of the target collocated reference block. An SbTMVP candidate for the current block is generated based on the subblock motion information for the subblocks of the current block. The current block is encoded or decoded using a motion prediction set comprising the SbTMVP candidate.
PCT/CN2023/121759 2022-10-14 2023-09-26 Procédé et appareil de prédiction de vecteurs de mouvement basée sur un sous-bloc avec réorganisation et affinement dans un codage vidéo WO2024078331A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263379459P 2022-10-14 2022-10-14
US63/379459 2022-10-14

Publications (1)

Publication Number Publication Date
WO2024078331A1 true WO2024078331A1 (fr) 2024-04-18

Family

ID=90668736

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/121759 WO2024078331A1 (fr) 2022-10-14 2023-09-26 Procédé et appareil de prédiction de vecteurs de mouvement basée sur un sous-bloc avec réorganisation et affinement dans un codage vidéo

Country Status (1)

Country Link
WO (1) WO2024078331A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018506908A (ja) * 2015-01-26 2018-03-08 クゥアルコム・インコーポレイテッドQualcomm Incorporated サブ予測ユニットベース高度時間動きベクトル予測
CN112204964A (zh) * 2018-04-01 2021-01-08 Lg电子株式会社 基于帧间预测模式的图像处理方法及其装置
CN113261294A (zh) * 2019-01-02 2021-08-13 Lg 电子株式会社 基于sbtmvp的帧间预测方法和设备
CN114270821A (zh) * 2019-06-19 2022-04-01 Lg电子株式会社 包括通过应用确定的预测模式来生成预测样本的图像解码方法及其装置


Similar Documents

Publication Publication Date Title
US11956462B2 (en) Video processing methods and apparatuses for sub-block motion compensation in video coding systems
US11700391B2 (en) Method and apparatus of motion vector constraint for video coding
TWI702834B (zh) 視訊編解碼系統中具有重疊塊運動補償的視訊處理的方法以及裝置
US20190158870A1 (en) Method and apparatus for affine merge mode prediction for video coding system
US20190387251A1 (en) Methods and Apparatuses of Video Processing with Overlapped Block Motion Compensation in Video Coding Systems
US11985324B2 (en) Methods and apparatuses of video processing with motion refinement and sub-partition base padding
US20200014931A1 (en) Methods and Apparatuses of Generating an Average Candidate for Inter Picture Prediction in Video Coding Systems
US11539977B2 (en) Method and apparatus of merge with motion vector difference for video coding
US11889099B2 (en) Methods and apparatuses of video processing for bi-directional prediction with motion refinement in video coding systems
WO2020098653A1 (fr) Procédé et appareil de codage de vidéo à hypothèses multiples
US20230232012A1 (en) Method and Apparatus Using Affine Non-Adjacent Candidates for Video Coding
WO2024078331A1 (fr) Procédé et appareil de prédiction de vecteurs de mouvement basée sur un sous-bloc avec réorganisation et affinement dans un codage vidéo
WO2024027784A1 (fr) Procédé et appareil de prédiction de vecteurs de mouvement temporel basée sur un sous-bloc avec réorganisation et affinement dans un codage vidéo
WO2023208189A1 (fr) Procédé et appareil pour l'amélioration d'un codage vidéo à l'aide d'une fusion avec un mode mvd avec mise en correspondance de modèles
WO2023208224A1 (fr) Procédé et appareil de réduction de complexité de codage vidéo à l'aide de fusion avec mode mvd
US20230328278A1 (en) Method and Apparatus of Overlapped Block Motion Compensation in Video Coding System
WO2024016844A1 (fr) Procédé et appareil utilisant une estimation de mouvement affine avec affinement de vecteur de mouvement de point de commande
WO2023208220A1 (fr) Procédé et appareil pour réordonner des candidats de fusion avec un mode mvd dans des systèmes de codage vidéo
WO2023134564A1 (fr) Procédé et appareil dérivant un candidat de fusion à partir de blocs codés affine pour un codage vidéo
WO2023143325A1 (fr) Procédé et appareil de codage vidéo utilisant un mode fusion avec mvd
WO2023222016A1 (fr) Procédé et appareil de réduction de complexité d'un codage vidéo à l'aide d'une fusion avec un mode mvd
WO2024141071A1 (fr) Procédé, appareil et support de traitement vidéo
WO2024012396A1 (fr) Procédé et appareil de prédiction inter à l'aide d'une mise en correspondance de modèles dans des systèmes de codage vidéo
WO2023143119A1 (fr) Procédé et appareil d'attribution de mv de mode de partition géométrique dans un système de codage vidéo
CN118354099A (zh) 用于视频编解码系统中的子块运动补偿的视频处理方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23876535

Country of ref document: EP

Kind code of ref document: A1