WO2020211755A1 - Motion vector and prediction sample decomposition - Google Patents


Info

Publication number
WO2020211755A1
WO2020211755A1 (PCT/CN2020/084726)
Authority
WO
WIPO (PCT)
Prior art keywords
block
mvd
picture
sub
bdof
Prior art date
Application number
PCT/CN2020/084726
Other languages
English (en)
Inventor
Hongbin Liu
Kai Zhang
Li Zhang
Jizheng Xu
Yue Wang
Original Assignee
Beijing Bytedance Network Technology Co., Ltd.
Bytedance Inc.
Priority date
Filing date
Publication date
Application filed by Beijing Bytedance Network Technology Co., Ltd. and Bytedance Inc.
Priority to CN202080028662.3A (CN113796084B)
Publication of WO2020211755A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 - Motion estimation or motion compensation
    • H04N19/577 - Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Definitions

  • This patent document relates to video coding techniques, devices and systems.
  • Devices, systems and methods related to digital video coding, and specifically, to management of motion vectors are described.
  • The described methods may be applied to existing video coding standards (e.g., High Efficiency Video Coding (HEVC) or Versatile Video Coding (VVC)) and to future video coding standards or video codecs.
  • the disclosed technology may be used to perform a method of visual media processing.
  • the method includes performing a conversion between a current video block and a bitstream representation of the current video block, wherein the conversion includes co-existence of one or more decoder motion vector derivation (DMVD) steps for refining motion vector information signaled in the bitstream representation, wherein, during the co-existence of the one or more DMVD steps, the motion vector information of the current video block and motion vector information of sub-blocks of the current video block are jointly derived, wherein the co-existence of the one or more DMVD steps includes a use of one or more of: a decoder motion vector refinement (DMVR) step, a Bi-directional Optical flow (BDOF) step, or a frame-rate up-conversion (FRUC) step.
  • the disclosed technology may be used to perform another method of visual media processing.
  • the method includes performing a conversion between a current video block and a bitstream representation of the current video block, wherein the conversion includes co-existence of one or more decoder motion vector derivation (DMVD) steps for refining motion vector information signaled in the bitstream representation, wherein, during the co-existence of the one or more DMVD steps, the motion vector information of the current video block and motion vector information of sub-blocks of the current video block are jointly derived, wherein the co-existence of the one or more DMVD steps includes a use of one or more of: a decoder motion vector refinement (DMVR) step, a Bi-directional Optical flow (BDOF) step, or a frame-rate up-conversion (FRUC) step; and selectively enabling the co-existence of the one or more DMVD steps for the current video block and/or sub-blocks of the current video block.
  • the disclosed technology may be used to perform a method for processing video.
  • The method includes deriving, for a conversion between a first block of video and a bitstream representation of the first block, at least one motion vector difference (MVD) from among an MVD associated with the first block (MVDb) and an MVD associated with a sub-block within the first block (MVDsb), by jointly using a first process and a second process of multiple decoder motion vector derivation (DMVD) processes, the MVDb being derived at least using the first process, and the MVDsb being derived at least using the second process; refining the motion vector (MV) of the first block (MVb) using the at least one MVD; and performing the conversion based on the refined motion vector of the first block.
  • an apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon is disclosed.
  • a computer program product stored on a non-transitory computer readable media, the computer program product including program code for carrying out any one or more of the disclosed methods is disclosed.
  • FIG. 1 shows an example of constructing a merge candidate list.
  • FIG. 2 shows an example of positions of spatial candidates.
  • FIG. 3 shows an example of candidate pairs subject to a redundancy check of spatial merge candidates.
  • FIGs. 4A and 4B show examples of the position of a second prediction unit (PU) based on the size and shape of the current block.
  • FIG. 5 shows an example of motion vector scaling for temporal merge candidates.
  • FIG. 6 shows an example of candidate positions for temporal merge candidates.
  • FIG. 7 shows an example of generating a combined bi-predictive merge candidate.
  • FIG. 8 shows an example of constructing motion vector prediction candidates.
  • FIG. 9 shows an example of motion vector scaling for spatial motion vector candidates.
  • FIG. 10 shows an example of decoder-side motion vector refinement (DMVR) in JEM7.
  • FIG. 11 shows an example of motion vector difference (MVD) in connection with DMVR.
  • FIG. 12 shows an example illustrating checks on motion vectors.
  • FIG. 13 shows an example of a bi-directional optical flow based motion modeling.
  • FIG. 14 is a block diagram of an example of a hardware platform for implementing a visual media decoding or a visual media encoding technique described in the present document.
  • FIG. 15 shows a flowchart of an example method for video coding.
  • FIG. 16 shows a flowchart of an example method for video coding.
  • Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards.
  • The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC (High Efficiency Video Coding) standards.
  • the video coding standards are based on the hybrid video coding structure wherein temporal prediction plus transform coding are utilized.
  • The Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015.
  • JVET develops reference software named the Joint Exploration Model (JEM).
  • Each inter-predicted PU has motion parameters for one or two reference picture lists.
  • Motion parameters include a motion vector and a reference picture index. Usage of one of the two reference picture lists may also be signalled using inter_pred_idc. Motion vectors may be explicitly coded as deltas relative to predictors.
  • a merge mode is specified whereby the motion parameters for the current PU are obtained from neighbouring PUs, including spatial and temporal candidates.
  • the merge mode can be applied to any inter-predicted PU, not only for skip mode.
  • The alternative to merge mode is the explicit transmission of motion parameters, where the motion vector (to be more precise, the motion vector difference (MVD) relative to a motion vector predictor), the corresponding reference picture index for each reference picture list, and the reference picture list usage are signalled explicitly for each PU.
  • Such a mode is named Advanced motion vector prediction (AMVP) in this disclosure.
  • When signalling indicates that one of the two reference picture lists is to be used, the PU is produced from one block of samples. This is referred to as ‘uni-prediction’. Uni-prediction is available for both P-slices and B-slices.
  • When signalling indicates that both reference picture lists are to be used, the PU is produced from two blocks of samples. This is referred to as ‘bi-prediction’. Bi-prediction is available for B-slices only.
  • The term ‘inter prediction’ is used to denote prediction derived from data elements (e.g., sample values or motion vectors) of reference pictures other than the current decoded picture.
  • a picture can be predicted from multiple reference pictures.
  • the reference pictures that are used for inter prediction are organized in one or more reference picture lists.
  • the reference index identifies which of the reference pictures in the list should be used for creating the prediction signal.
  • A single reference picture list, List 0, is used for a P slice, and two reference picture lists, List 0 and List 1, are used for B slices. It should be noted that reference pictures included in List 0/1 may be past or future pictures in terms of capturing/display order.
  • Step 1.2 Redundancy check for spatial candidates
  • These steps are also schematically depicted in FIG. 1.
  • For spatial merge candidate derivation, a maximum of four merge candidates are selected among candidates that are located in five different positions.
  • For temporal merge candidate derivation, a maximum of one merge candidate is selected among two candidates. Since a constant number of candidates for each PU is assumed at the decoder, additional candidates are generated when the number of candidates obtained from step 1 does not reach the maximum number of merge candidates (MaxNumMergeCand), which is signalled in the slice header. Since the number of candidates is constant, the index of the best merge candidate is encoded using truncated unary binarization (TU). If the size of the CU is equal to 8, all the PUs of the current CU share a single merge candidate list, which is identical to the merge candidate list of the 2N×2N prediction unit.
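The truncated unary binarization mentioned above can be sketched as follows. This is an illustrative Python helper, not text from the standard; the function name `truncated_unary` is chosen for this example.

```python
def truncated_unary(index, max_index):
    """Truncated unary binarization: `index` ones followed by a
    terminating zero.  The terminating zero is dropped when
    index == max_index, since the decoder can infer it from the
    known maximum."""
    bits = [1] * index
    if index < max_index:
        bits.append(0)
    return bits
```

Because the number of merge candidates is constant, the decoder always knows `max_index` and can stop reading bins early, which is why the last codeword needs no terminator.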
  • a maximum of four merge candidates are selected among candidates located in the positions depicted in FIG. 2.
  • The order of derivation is A1, B1, B0, A0 and B2.
  • Position B2 is considered only when any PU of positions A1, B1, B0, A0 is not available (e.g. because it belongs to another slice or tile) or is intra coded.
  • After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check which ensures that candidates with the same motion information are excluded from the list, so that coding efficiency is improved.
  • To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in FIG. 3 are considered.
  • FIGs. 4A and 4B depict the second PU for the cases of N×2N and 2N×N, respectively.
  • When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction; adding this candidate would lead to two prediction units having the same motion information, which is redundant, as it would be equivalent to having just one PU in the coding unit.
  • Similarly, position B1 is not considered when the current PU is partitioned as 2N×N.
  • a scaled motion vector is derived based on co-located PU belonging to the picture which has the smallest picture order count (POC) difference with current picture within the given reference picture list.
  • the reference picture list to be used for derivation of the co-located PU is explicitly signalled in the slice header.
  • The scaled motion vector for the temporal merge candidate is obtained as illustrated by the dotted line in FIG. 5.
  • tb is defined to be the POC difference between the reference picture of the current picture and the current picture
  • td is defined to be the POC difference between the reference picture of the co-located picture and the co-located picture.
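The POC-based scaling of the co-located MV using tb and td can be sketched as below. This is a simplified floating-point version for illustration; HEVC/VVC specify an equivalent fixed-point computation, and the function name is an assumption of this example.

```python
def scale_temporal_mv(mv, tb, td):
    """Scale a co-located motion vector by the ratio of POC distances:
    tb is the POC distance between the current picture and its
    reference picture, td the distance between the co-located picture
    and its reference picture."""
    mvx, mvy = mv
    s = tb / td
    return (round(mvx * s), round(mvy * s))
```

For example, if the co-located MV is (8, -4), tb = 1 and td = 2, the scaled temporal merge candidate MV is (4, -2).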
  • the reference picture index of temporal merge candidate is set equal to zero.
  • The position for the temporal candidate is selected between candidates C0 and C1, as depicted in FIG. 6. If the PU at position C0 is not available, is intra coded, or is outside of the current coding tree unit (CTU, a.k.a. LCU, largest coding unit) row, position C1 is used. Otherwise, position C0 is used in the derivation of the temporal merge candidate.
  • Combined bi-predictive merge candidates are generated by utilizing spatial and temporal merge candidates. The combined bi-predictive merge candidate is used for B-slices only. The combined bi-predictive candidates are generated by combining the first reference picture list motion parameters of an initial candidate with the second reference picture list motion parameters of another. If these two tuples provide different motion hypotheses, they form a new bi-predictive candidate. As an example, FIG. 7 illustrates this process.
  • Zero motion candidates are inserted to fill the remaining entries in the merge candidate list and thereby reach the MaxNumMergeCand capacity. These candidates have zero spatial displacement and a reference picture index which starts from zero and increases every time a new zero motion candidate is added to the list. No redundancy check is performed on these candidates.
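The zero-candidate filling step can be sketched as follows. Candidates are represented here as `(mv, ref_idx)` tuples; the function name and representation are choices of this example, not of the standard.

```python
def fill_with_zero_candidates(cand_list, max_num_merge_cand, num_ref_pics):
    """Append zero-MV candidates until the list reaches MaxNumMergeCand.
    The reference index starts at zero and increases with each new zero
    candidate, saturating at the last valid index.  No redundancy check
    is applied to these candidates."""
    ref_idx = 0
    while len(cand_list) < max_num_merge_cand:
        cand_list.append(((0, 0), ref_idx))
        if ref_idx < num_ref_pics - 1:
            ref_idx += 1
    return cand_list
```

Saturating the reference index keeps every inserted candidate valid even when fewer reference pictures exist than remaining list slots.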
  • AMVP exploits spatio-temporal correlation of motion vector with neighbouring PUs, which is used for explicit transmission of motion parameters.
  • A motion vector candidate list is constructed by first checking the availability of left, above, and temporally neighbouring PU positions, removing redundant candidates, and adding zero vectors to make the candidate list a constant length. The encoder can then select the best predictor from the candidate list and transmit the corresponding index indicating the chosen candidate. As with merge index signalling, the index of the best motion vector candidate is encoded using truncated unary. The maximum value to be encoded in this case is 2 (see FIG. 8).
  • FIG. 8 summarizes the derivation process for motion vector prediction candidates.
  • Two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates.
  • For spatial motion vector candidate derivation, two motion vector candidates are eventually derived based on the motion vectors of PUs located in five different positions, as depicted in FIG. 2.
  • For temporal motion vector candidate derivation, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.
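The assembly of the AMVP candidate list described above can be sketched as follows. This simplified example keeps only the deduplication, the two-candidate limit, and the zero-vector padding; the reference-index pruning step is omitted for brevity, and the function name is an assumption of this example.

```python
def build_amvp_list(spatial, temporal):
    """Assemble an AMVP candidate list (simplified): concatenate spatial
    and temporal candidates, remove duplicates, keep at most two, and
    pad with zero motion vectors up to the constant length of two."""
    cands = []
    for mv in spatial + temporal:
        if mv not in cands:      # duplicate removal
            cands.append(mv)
    cands = cands[:2]            # at most two candidates
    while len(cands) < 2:        # constant-length padding
        cands.append((0, 0))
    return cands
```

The constant list length is what allows the predictor index to be signalled with a truncated unary code whose maximum value is 2.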
  • a maximum of two candidates are considered among five potential candidates, which are derived from PUs located in positions as depicted in FIG. 2, those positions being the same as those of motion merge.
  • The order of derivation for the left side of the current PU is defined as A0, A1, scaled A0, scaled A1.
  • The order of derivation for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2.
  • The no-spatial-scaling cases are checked first, followed by the spatial scaling cases. Spatial scaling is considered when the POC differs between the reference picture of the neighbouring PU and that of the current PU, regardless of the reference picture list. If all PUs of the left candidates are not available or are intra coded, scaling for the above motion vector is allowed to help parallel derivation of left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.
  • In a spatial scaling process, the motion vector of the neighbouring PU is scaled in a similar manner as for temporal scaling, as depicted in FIG. 9.
  • The main difference is that the reference picture list and index of the current PU are given as input; the actual scaling process is the same as that of temporal scaling.
  • Adaptive motion vector difference resolution (AMVR)
  • Affine prediction mode
  • Triangular prediction mode (TPM)
  • Advanced TMVP (ATMVP)
  • Generalized Bi-Prediction (GBI)
  • Bi-directional Optical flow (BDOF)
  • Decoder-side motion vector refinement (DMVR)
  • Merge mode with MVD (MMVD)
  • A QuadTree/BinaryTree/TernaryTree (QT/BT/TT) structure is adopted to divide a picture into square or rectangular blocks.
  • A separate tree (a.k.a. dual coding tree) is also adopted in VVC for I-frames.
  • With the separate tree, the coding block structure is signaled separately for the luma and chroma components.
  • In the bi-prediction operation, for the prediction of one block region, two prediction blocks, formed using a motion vector (MV) of list 0 and an MV of list 1, respectively, are combined to form a single prediction signal.
  • In the decoder-side motion vector refinement (DMVR) mode, the motion vectors are further refined by a bilateral template matching process.
  • The bilateral template matching is applied in the decoder to perform a distortion-based search between a bilateral template and the reconstructed samples in the reference pictures, in order to obtain a refined MV without transmission of additional motion information.
  • An example is depicted in FIG. 10.
  • the bilateral template is generated as the weighted combination (i.e. average) of the two prediction blocks, from the initial MV0 of list0 and MV1 of list1, respectively, as shown in FIG. 11.
  • the template matching operation consists of calculating cost measures between the generated template and the sample region (around the initial prediction block) in the reference picture. For each of the two reference pictures, the MV that yields the minimum template cost is considered as the updated MV of that list to replace the original one.
  • Nine MV candidates are searched for each list.
  • The nine MV candidates include the original MV and eight surrounding MVs with a one-luma-sample offset from the original MV in the horizontal direction, the vertical direction, or both.
  • Finally, the two new MVs, i.e., MV0′ and MV1′ as shown in FIG. 12, are used for generating the final bi-prediction results.
  • a sum of absolute differences (SAD) is used as the cost measure.
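The nine-candidate search per list can be sketched as follows. The cost function is passed in as a callable (in JEM it is the SAD between the bilateral template and the candidate reference block); the function name and interface are choices of this example.

```python
def dmvr_search(cost, center):
    """Search the original MV plus its eight one-sample neighbours
    (horizontal, vertical, and diagonal offsets) and return the
    candidate with minimum cost.  `cost` maps an MV to its template
    matching cost, e.g. a SAD."""
    cx, cy = center
    candidates = [(cx + dx, cy + dy)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return min(candidates, key=cost)
```

Running this once per reference list yields the updated MV0′ and MV1′ used for the final bi-prediction.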
  • MVD mirroring between list 0 and list 1 is assumed as shown in FIG. 11, and bilateral matching is performed to refine the MVs, i.e., to find the best MVD among several MVD candidates.
  • Denote the original list 0 and list 1 motion vectors as MVL0 = (L0X, L0Y) and MVL1 = (L1X, L1Y), respectively.
  • the MVD denoted by (MvdX, MvdY) for list 0 that could minimize the cost function (e.g., SAD) is defined as the best MVD.
  • The SAD function is defined as the SAD between the reference block of list 0, derived with the motion vector (L0X+MvdX, L0Y+MvdY) in the list 0 reference picture, and the reference block of list 1, derived with the motion vector (L1X-MvdX, L1Y-MvdY) in the list 1 reference picture.
  • A pair of associated MVDs for L0 and L1 (e.g. (MvdX, MvdY) for L0 and (-MvdX, -MvdY) for L1) is denoted as an MVD-pair.
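The cost of one MVD-pair under the mirroring assumption can be sketched as follows. The SAD between the two refined reference blocks is abstracted as a callable; names and interface are choices of this example.

```python
def bilateral_cost(sad, mv0, mv1, mvd):
    """Cost of one MVD-pair under MVD mirroring: the list 0 block is
    fetched at MVL0 + MVD while the list 1 block is fetched at
    MVL1 - MVD.  `sad` takes the two refined MVs and returns the
    matching cost between the corresponding reference blocks."""
    (l0x, l0y), (l1x, l1y) = mv0, mv1
    dx, dy = mvd
    return sad((l0x + dx, l0y + dy), (l1x - dx, l1y - dy))
```

Evaluating this cost for each candidate MVD-pair and keeping the minimum implements the bilateral matching refinement described above.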
  • The motion vector refinement process may iterate twice. In each iteration, at most 6 MVDs (with integer-pel precision) may be checked in two steps, as shown in FIG. 12. In the first step, the MVDs (0, 0), (-1, 0), (1, 0), (0, -1) and (0, 1) are checked. In the second step, one of the MVDs (-1, -1), (-1, 1), (1, -1) or (1, 1) may be selected and further checked. Suppose the function Sad(x, y) returns the SAD value of the MVD (x, y). The MVD, denoted by (MvdX, MvdY), checked in the second step is decided based on the Sad values of the MVDs checked in the first step.
  • In the first iteration, the starting point is the signaled MV.
  • In the second iteration, the starting point is the signaled MV plus the best MVD selected in the first iteration.
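The selection of the diagonal MVD for the second step can be sketched as follows. The exact formula is not reproduced in this text, so this is one plausible reconstruction: the sign of each component follows the cheaper of the two axis-aligned neighbours checked in the first step.

```python
def second_step_mvd(sad):
    """Pick the diagonal MVD checked in the second DMVR step
    (reconstruction of the selection rule sketched in the text).
    `sad` is the Sad(x, y) function over integer-pel MVDs."""
    mvd_x = -1 if sad(-1, 0) < sad(1, 0) else 1
    mvd_y = -1 if sad(0, -1) < sad(0, 1) else 1
    return (mvd_x, mvd_y)
```

This keeps the second step down to a single additional SAD evaluation, for a total of at most six checked MVDs per iteration.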
  • DMVR applies only when one reference picture is a preceding picture and the other reference picture is a following picture, and the two reference pictures have the same picture order count distance from the current picture.
  • JVET-M0147 proposed several changes to the design. More specifically, the DMVR design adopted into VTM-4.0 (to be released soon) has the following main features:
  • DMVR may be enabled when:
  • the DMVR enabling flag in the SPS (i.e., sps_dmvr_enabled_flag) is set;
  • the TPM flag, inter-affine flag, subblock merge flag (either ATMVP or affine merge) and MMVD flag are all equal to 0;
  • the current CU height is greater than or equal to 8;
  • the number of luma samples (CU width × height) is greater than or equal to 64.
  • the parametric error surface fit is computed only if the center position is the best cost position in a given iteration.
  • motion compensation is first performed to generate the first predictions (in each prediction direction) of the current block.
  • the first predictions are used to derive the spatial gradient, the temporal gradient and the optical flow of each subblock/pixel within the block, which are then used to generate the second prediction, i.e., the final prediction of the subblock/pixel.
  • BDOF is a sample-wise motion refinement which is performed on top of block-wise motion compensation for bi-prediction.
  • The sample-level motion refinement does not use signalling.
  • The motion vector field (vx, vy) is given by the optical flow equation: ∂I^(k)/∂t + vx ∂I^(k)/∂x + vy ∂I^(k)/∂y = 0.
  • τ0 and τ1 denote the distances to the reference frames, as shown in FIG. 13.
  • The motion vector field (vx, vy) is determined by minimizing the difference Δ between the values at points A and B (the intersections of the motion trajectory with the reference frame planes in FIG. 13).
  • The model uses only the first linear term of a local Taylor expansion for Δ: Δ = (I^(0) − I^(1)) + vx (τ1 ∂I^(1)/∂x + τ0 ∂I^(0)/∂x) + vy (τ1 ∂I^(1)/∂y + τ0 ∂I^(0)/∂y).
  • All values in this expression depend on the sample location (i′, j′), which was omitted from the notation so far. Assuming the motion is consistent in the local surrounding area, Δ is minimized inside a (2M+1)×(2M+1) square window Ω centered on the currently predicted point (i, j), where M is equal to 2.
  • With BDOF, it is possible that the motion field is refined for each sample.
  • To reduce complexity, a block-based design of BDOF is used, in which the motion refinement is calculated based on 4×4 blocks.
  • In the block-based BDOF, the values of s_n in Equation 7 for all samples in a 4×4 block are aggregated, and then the aggregated values of s_n are used to derive the BDOF motion vector offset for the 4×4 block.
  • A 6×6 sub-block region, with the 4×4 sub-block located at its center, is used to derive the motion vector of the 4×4 sub-block. More specifically, the sub-block based BDOF derivation sums the values s_n over this region.
  • Here, b_k denotes the set of samples belonging to the k-th 6×6 region of the predicted block.
  • The MV refinement of BDOF might be unreliable due to noise or irregular motion. Therefore, in BDOF, the magnitude of the MV refinement is clipped to a threshold value thBDOF.
  • thBDOF is set to max(2, 2^(13−d)), where d is the bit depth of the input samples.
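The clipping of the per-block refinement can be sketched as follows; the function name is a choice of this example.

```python
def bdof_clip(v, bit_depth):
    """Clip one component of the BDOF motion refinement to the range
    [-thBDOF, thBDOF], where thBDOF = max(2, 2**(13 - d)) and d is the
    input sample bit depth."""
    th = max(2, 2 ** (13 - bit_depth))
    return max(-th, min(th, v))
```

For 10-bit input, thBDOF = max(2, 2^3) = 8, so any refinement component larger than 8 in magnitude is saturated.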
  • DMVR is first performed to find the best MVD for the entire block (or the entire 16×16/N×16/16×N block mentioned above).
  • Then, BDOF is performed to find the best MVD for each 4×4 block within the block (or within the 16×16/N×16/16×N block).
  • When both DMVR and BDOF are allowed for a block, DMVR is performed first, followed by BDOF. First, the best MV offset (also called the MV difference (MVD)) is derived for the entire block (or the entire 16×16/N×16/16×N block mentioned above); then, the best MV offset is derived for each 4×4 block within the block.
  • In this design, DMVR and BDOF work independently and cannot be optimized jointly. Meanwhile, the complexity is relatively high due to the two-stage optimization.
  • In the following, decoder motion vector derivation (DMVD) is used to represent DMVR, BDOF, FRUC, etc., i.e., methods which derive the MV or/and MVD at the decoder side.
  • MVDb and MVDsb are used to represent the derived MVDs of the block (or processing unit, such as 16×16) and of the sub-block, respectively.
  • POC distance is used to represent the absolute POC difference between two pictures.
  • a “unit” may refer to a “block” and a “sub-unit” may refer to a “sub-block” .
  • The motion vector offsets/differences for one unit and for one sub-unit may be jointly determined, wherein the unit may be a block or a region of fixed size, and a sub-unit may be a smaller region within the unit.
  • the corresponding prediction blocks associated with the candidate in one or two prediction directions may be further modified before being used to decide the best MVD-pair in the DMVR process.
  • BDOF may be applied for a given MVD-pair candidate checked in the DMVR process.
  • When interpolating the reference blocks with the proposed MV refinement method, the interpolation filter may be different from the one used for regular inter-prediction without MV refinement.
  • For example, a bilinear filter, 4-tap filter or 6-tap filter may be used in the proposed method.
  • Alternatively, the interpolation filter used in regular inter mode may be used.
  • The integer-pixel part of MVb[i] + MVDbj[i] may be used to identify the reference blocks, and therefore no sample interpolation is required.
  • MVb[i] + MVDbj[i] may be rounded to integer precision, toward zero or away from zero.
  • The MVDbj[i] may be of N-pixel precision, where N may be 1/16, 1/4, 1/2, 1, 2, etc.
  • MVb[i] + MVDbj[i] may be rounded to the target precision, toward zero or away from zero.
  • The set of allowed MVD-pair candidates for MVDb[i] may be the same as that utilized in the DMVR process.
  • A cost function may be defined, and the cost may be calculated at block level for each MVDbj[i], using the associated MVDsbj[i] for each sub-block and gradient information of the corresponding reference blocks based on MVDbj[i].
  • The MVDbj[i] achieving the minimum cost and its associated MVDsbj[i] may be used as the final MVDs for the block and sub-blocks. Denote the index of the best MVDbj[i] and MVDsbj[i] as ibest.
  • For example, the cost function may be defined in terms of Δ, where Δ is defined in equation (7).
  • Alternatively, a different cost function may be defined.
  • the cost function may be calculated over all samples in the block/sub-block.
  • the cost may be calculated on partial samples in the block/sub-block.
  • Partial samples may be the even (or odd) rows of the block/sub-block.
  • Partial samples may be the even (or odd) columns of the block/sub-block.
  • Partial samples may include the 1 st row (or/and column) of every N rows (or/and columns) of the block/sub-block.
  • Partial samples may include the first N1 rows (or/and columns) of every N2 rows (or/and columns) of the block/sub-block.
  • Partial samples may depend on block/sub-block width or/and height.
  • the partial samples may include the 1 st row of every N1 rows; otherwise, the partial samples may include the 1 st row of every N2 rows.
  • N1 > N2.
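The size-dependent row subsampling described above can be sketched as follows. The size cut-off used to switch between N1 and N2 is not specified in the text, so the `threshold` parameter here is a hypothetical placeholder, as is the function name.

```python
def partial_rows(height, width, n1, n2, threshold=16):
    """Row indices used for a partial-sample cost: the 1st row of
    every N1 rows for blocks at or above an assumed size threshold,
    and of every N2 rows otherwise (N1 > N2, so larger blocks are
    subsampled more aggressively)."""
    step = n1 if height * width >= threshold else n2
    return list(range(0, height, step))
```

The cost function is then evaluated only on these rows, trading a small loss of accuracy for a proportional reduction in SAD computations.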
  • Final motion compensation may be performed for the block using MVb[i] + MVDbibest[i] with regular interpolation filters and/or regular motion precision, if different interpolation filters and/or different motion precisions are used to search for the best refinement MVDs.
  • MVDsbibest[i] may be further used to generate the refined prediction samples, e.g. according to equation (6).
  • vx and vy in equation (6) are the horizontal and vertical components of MVDsbibest[0] of the sub-block which covers the corresponding sample, respectively, and I(0) and I(1) are generated in the final motion compensation.
  • BDOF may be performed to derive the MVD for each sub-block and the refined prediction sample for each pixel (e.g. according to equation (6) ) .
  • Alternatively, such a cost function may be calculated at sub-block level for each MVDbj[i], using the associated MVDsbj[i] and gradient information of the corresponding reference blocks based on MVDbj[i].
  • The MVDbj[i] and MVDsbj[i] achieving the minimum cost are used in the final prediction sample generation process. Denote the index of the best MVDbj and MVDsbj as ibest for a sub-block.
  • Final motion compensation may be performed for the sub-block using MVb[i] + MVDbibest[i] with regular interpolation filters and/or regular motion precision, if different interpolation filters and/or different motion precisions are used to search for the best refinement MVDs.
  • MVDsbibest[i] may be further used to generate the refined prediction sample, e.g. according to equation (6).
  • vx and vy in equation (6) are the horizontal and vertical components of MVDsbibest[0] of the sub-block, respectively, and I(0) and I(1) are generated in the final motion compensation.
  • BDOF may be performed for the sub-block to derive its MVD and generate the refined prediction samples (e.g. according to equation (6) ) .
  • The sample refinement process in equation (6) may be applied only to some color components.
  • Alternatively, it may be applied to all color components.
  • K may be equal to 2, 3, 4, 5, etc.
  • DMVR or other DMVD methods may be used to select K best MVD-pair candidates from M (M > K) MVD candidates for the block, and then item 1 may be applied with the selected K best MVD candidates.
  • K may be equal to 1, 2, 3, etc.
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be applied to certain pictures/tiles/slices etc.
  • Whether to apply DMVR or/and BDOF or/and other DMVD methods or/and the proposed methods may be signaled in the VPS/SPS/PPS/slice header/tile group header, etc.
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be applied to pictures that may be referenced by other pictures only.
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be applied to a block when the POC distances between the current picture and the two reference pictures of the block are both smaller (or larger) than a threshold.
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be applied to a block when the POC distance between current picture and one of the two reference pictures of the block is smaller (or larger) than a threshold.
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be applied to a block when the POC distance between current picture and one of the two reference pictures of the block is within a range [T1, T2] .
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be applied to a picture when the POC distances between the picture and its two nearest reference pictures in the two reference picture lists are both smaller (or larger) than a threshold.
  • whether DMVR or/and BDOF or/and other DMVD methods or/and proposed methods is applied to a unit (e.g., block) or not may depend on coded information of the unit.
  • the coded information may include motion information, residual information, transform information, mode information, dimension etc. of the unit.
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be disallowed if additional transform is applied when encoding the residual of the block.
  • the additional transform may be secondary transform or reduced secondary transform or rotational transform or KLT (Karhunen-Loève transform) or any other transform.
  • DMVR or/and BDOF or/and other DMVD methods or/and proposed methods may be disallowed if the additional transform is applied and width or/and height of the block are of specific sizes.
  • the block is of size 4*4/4*8/8*4/8*8/4*16/16*4 etc.
  • additional transform may be disallowed for blocks wherein DMVR or/and BDOF or/and other DMVD methods or/and proposed methods are applied.
  • the indication of additional transform may be signaled for these blocks, but is constrained to be false (i.e., the additional transform does not apply) in a conformance bitstream.
  • Proposed methods may be enabled/disabled according to the rule on block dimension.
  • when a block contains fewer than M*H samples, e.g., 16 or 32 or 64 luma samples, proposed methods are not allowed.
  • when a block contains more than M*H samples, e.g., 16 or 32 or 64 luma samples, proposed methods are not allowed.
  • X is set to 8.
  • X is set to 64.
  • M*M, e.g., 128x128
  • th1 and/or th2 is set to 8.
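The dimension-based enable/disable rules above survive only as fragments in this extraction. The sketch below shows how such gating might combine a sample-count limit, a minimum side length X, and a disallowed M*M size; all three threshold values are assumed examples:

```python
# Hedged sketch of block-dimension gating. The thresholds (64 samples,
# minimum side 8, disallowed 128x128) are illustrative values only.

def joint_process_enabled(width, height,
                          min_samples=64, min_side=8, disabled_square=128):
    if width * height < min_samples:        # too few samples
        return False
    if min(width, height) < min_side:       # a side smaller than X
        return False
    if width == height == disabled_square:  # e.g. 128x128 disallowed
        return False
    return True
```

Flipping any comparison yields the complementary "more than" variants mentioned above.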
  • Proposed methods may be performed at sub-block level.
  • L is 64
  • a 64*128/128*64 block is split into two 64*64 sub-blocks
  • a 128x128 block is split into four 64*64 sub-blocks.
  • N*128/128*N block, wherein N < 64, is not split into sub-blocks.
  • L is 64
  • a 64*128/128*64 block is split into two 64*64 sub-blocks
  • a 128x128 block is split into four 64*64 sub-blocks
  • N*128/128*N block, wherein N < 64, is split into two N*64/64*N sub-blocks.
  • when the width (or height) of the block is larger than L, it is split vertically (or horizontally), and the width or/and height of each sub-block is no larger than L.
  • when the size (i.e., width * height) of the block is larger than a threshold L1, it may be split into multiple sub-blocks. Each sub-block is treated in the same way as a normal coding block with size equal to the sub-block size.
  • the block is split into sub-blocks with the same size that is no larger than L1.
  • when the width (or height) of the block is no larger than a threshold L2, it is not split vertically (or horizontally).
  • L1 is 1024
  • L2 is 32
  • a 16x128 block is split into two 16*64 sub-blocks.
  • the threshold L may be pre-defined or signaled in SPS/PPS/picture/slice/tile group/tile level.
  • the thresholds may depend on certain coded information, such as block size, picture type, temporal layer index, etc.
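The sub-block splitting rules above (with L = 64) can be sketched by halving any dimension that exceeds the limit. This non-normative sketch reproduces the listed examples: a 64*128 block into two 64*64 sub-blocks, a 128x128 block into four, and a 16x128 block into two 16*64:

```python
# Sketch of splitting a block into sub-blocks whose width and height are
# both no larger than `limit` (L = 64 assumed, as in the examples above).

def split_into_subblocks(width, height, limit=64):
    blocks = [(width, height)]
    done = []
    while blocks:
        w, h = blocks.pop()
        if w > limit:                       # split vertically
            blocks += [(w // 2, h), (w // 2, h)]
        elif h > limit:                     # split horizontally
            blocks += [(w, h // 2), (w, h // 2)]
        else:
            done.append((w, h))
    return done
```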
  • FIG. 14 is a block diagram of a video processing apparatus 1400.
  • the apparatus 1400 may be used to implement one or more of the methods described herein.
  • the apparatus 1400 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on.
  • the apparatus 1400 may include one or more processors 1402, one or more memories 1404 and video processing hardware 1406.
  • the processor (s) 1402 may be configured to implement one or more methods described in the present document.
  • the memory (memories) 1404 may be used for storing data and code used for implementing the methods and techniques described herein.
  • the video processing hardware 1406 may be used to implement, in hardware circuitry, some techniques described in the present document, and may be partly or completely a part of the processors 1402 (e.g., a graphics processor core (GPU) or other signal processing circuitry) .
  • video processing may refer to video encoding, video decoding, video compression or video decompression.
  • video compression algorithms may be applied during conversion from pixel representation of a video to a corresponding bitstream representation or vice versa.
  • the bitstream representation of a current video block may, for example, correspond to bits that are either co-located or spread in different places within the bitstream, as is defined by the syntax.
  • a macroblock may be encoded in terms of transformed and coded error residual values and also using bits in headers and other fields in the bitstream.
  • FIG. 15 is a flowchart for an example method 1500 of video processing.
  • the method 1500 includes, at 1510, performing a conversion between a current video block and a bitstream representation of the current video block, wherein the conversion includes co-existence of one or more decoder motion vector derivation (DMVD) steps for refining motion vector information signaled in the bitstream representation, wherein, during the co-existence of the one or more DMVD steps, the motion vector information of the current video block and motion vector information of sub-blocks of the current video block are jointly derived, wherein the co-existence of the one or more DMVD steps includes a use of one or more of: a decoder motion vector refinement (DMVR) step, a Bi-directional Optical flow (BDOF) step, or a frame-rate up-conversion (FRUC) step.
  • DMVR decoder motion vector refinement
  • BDOF Bi-directional Optical flow
  • FRUC frame-rate up-conversion
  • a method of visual media processing comprising:
  • DMVD decoder motion vector derivation
  • the conversion includes co-existence of one or more decoder motion vector derivation (DMVD) steps for refining motion vector information signaled in the bitstream representation, wherein, during the co-existence of the one or more DMVD steps, the motion vector information of the current video block and motion vector information of sub-blocks of the current video block are jointly derived, wherein the co-existence of the one or more DMVD steps includes a use of one or more of: a decoder motion vector refinement (DMVR) step, a Bi-directional Optical flow (BDOF) step, or a frame-rate up-conversion (FRUC) step.
  • interpolation filter is a bilinear filter, a 4-tap filter, or a 6-tap filter.
  • the motion vector information of the current video block is derived from motion vector information of two other video blocks and refining the derived motion vector information occurs during the DMVR step.
  • prediction of the motion vector information of the current video block is based, at least in part, on the K best pairs of motion vector information selected from M candidates in the candidate set (M > K) .
  • a method of visual media processing comprising:
  • DMVD decoder motion vector derivation
  • the conversion includes co-existence of one or more decoder motion vector derivation (DMVD) steps for refining motion vector information signaled in the bitstream representation, wherein, during the co-existence of the one or more DMVD steps, the motion vector information of the current video block and motion vector information of sub-blocks of the current video block are jointly derived, wherein the co-existence of the one or more DMVD steps includes a use of one or more of: a decoder motion vector refinement (DMVR) step, a Bi-directional Optical flow (BDOF) step, or a frame-rate up-conversion (FRUC) step; and
  • the additional coded information of the current video block or the sub-blocks of the current video block includes one or more of: a motion information, a residual information, a transform information, a mode information, or a dimension information.
  • motion vector information of the current video block and motion vector information of sub-blocks of the current video block to be jointly derived, in response to determining that a dimension of the current video block or a dimension of the sub-blocks of the current video block satisfy one or more rules.
  • An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to implement the method in any one or more of clauses 1 to 31.
  • a computer program product stored on a non-transitory computer readable media including program code for carrying out the method in any one or more of clauses 1 to 31.
  • FIG. 16 is a flowchart for an example method 1600 of video processing.
  • the method 1600 includes, at 1602, deriving, for a conversion between a first block of video and a bitstream representation of the first block, at least one motion vector difference (MVD) of the MVD associated with the first block (MVDb) and the MVD associated with a sub-block within the first block (MVDsb) by jointly using a first process and a second process of multiple decoder motion vector derivation (DMVD) processes, the MVDb being derived at least using the first process, and the MVDsb being derived at least using the second process; at 1604, refining the motion vector (MV) of the first block (MVb) using the at least one MVD; and at 1606, performing the conversion based on the refined motion vector of the first block.
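The three steps of method 1600 can be sketched at a high level as follows. The derivation functions are placeholders standing in for DMVR (block-level MVD) and BDOF (sub-block MVDs), not the specified algorithms:

```python
# Non-normative sketch of the method-1600 flow: derive MVDb with a first
# DMVD process and MVDsb with a second (1602), refine the block MV (1604),
# then use the result in the conversion (1606). All callables are stand-ins.

def derive_and_refine(mv_b, first_process, second_process, subblocks):
    mvd_b = first_process(mv_b)                          # e.g. DMVR (1602)
    mvd_sb = {sb: second_process(sb) for sb in subblocks}  # e.g. BDOF (1602)
    refined_mv = (mv_b[0] + mvd_b[0], mv_b[1] + mvd_b[1])  # refine MVb (1604)
    return refined_mv, mvd_sb                            # for conversion (1606)

mv, per_sb = derive_and_refine(
    (10, -4),
    lambda mv: (1, 0),   # placeholder block-level MVD
    lambda sb: (0, 1),   # placeholder sub-block MVD
    ["sb0", "sb1"],
)
```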
  • MVD motion vector difference
  • DMVD decoder motion vector derivation
  • the multiple DMVD processes comprise one or more of: a decoder motion vector refinement (DMVR) process, a Bi-directional Optical flow (BDOF) process, and a frame-rate up-conversion (FRUC) process.
  • the first process is the DMVR process and the second process is the BDOF process.
  • corresponding reference blocks associated with the given MVD-pair candidate in one or two prediction directions are further modified before being used to decide the best MVD-pair in the DMVR process.
  • the corresponding reference blocks are further modified by the BDOF process.
  • the two reference blocks are interpolated by using an interpolation filter different from a regular interpolation filter used in regular inter mode, wherein the interpolation filter is selected from a bilinear filter, a 4-tap filter or a 6-tap filter and the regular interpolation filter is an 8-tap filter.
  • the two reference blocks are interpolated by using a regular interpolation filter used in regular inter mode, wherein the regular interpolation filter is an 8-tap filter.
  • the reference blocks are identified by using the integer-pixel part of MV_b[i] + MVD_b^j[i] without sample interpolation.
  • the MV_b[i] + MVD_b^j[i] is rounded to an integer precision toward zero or away from zero.
  • the MVD_b^j[i] is of N-pixel precision, wherein N is one of 1/16, 1/4, 1/2, 1 and 2.
  • the MV_b[i] + MVD_b^j[i] is rounded to a target precision toward zero or away from zero.
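The toward-zero / away-from-zero rounding mentioned above can be sketched for MV components stored in assumed 1/16-pel units:

```python
# Sketch of rounding a fractional MV component (in assumed 1/16-pel units)
# to integer-pel precision, either toward zero or away from zero.

def round_toward_zero(v_sixteenths):
    q = abs(v_sixteenths) // 16          # drop the fractional part
    return q if v_sixteenths >= 0 else -q

def round_away_from_zero(v_sixteenths):
    q, r = divmod(abs(v_sixteenths), 16)
    q += 1 if r else 0                   # any fraction bumps the magnitude
    return q if v_sixteenths >= 0 else -q
```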
  • the set of allowed MVD-pair candidates for MVD_b[i] is the same as that utilized in the DMVR process.
  • a cost function is defined for searching best MVDs for the first block and/or sub-blocks.
  • the cost of the cost function is calculated at block level for each MVD_b^j[i] using the associated MVD_sb^j[i] for each sub-block and gradient information of the corresponding reference blocks based on MVD_b^j[i].
  • the MVD_b^j[i] achieving the minimum cost and its associated MVD_sb^j[i] are used as the best MVDs for the first block and sub-blocks, and the index of the best MVD_b^j[i] and MVD_sb^j[i] is denoted as ibest.
  • the cost function is defined as
  • the cost function is defined as
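The two cost-function equations following "defined as" were lost in this extraction and are left as-is above. Purely as an illustrative stand-in, a common DMVD matching cost is the sum of absolute differences (SAD) between the list-0 and list-1 predictions:

```python
# Illustrative SAD cost between the two reference-list predictions; this is
# a stand-in, not the (unrecovered) cost function defined in the text.

def sad_cost(pred0, pred1):
    return sum(abs(a - b) for a, b in zip(pred0, pred1))

cost = sad_cost([100, 102, 98], [101, 100, 99])  # |-1| + |2| + |-1|
```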
  • the cost function is calculated on all samples in the first block and/or the sub-blocks.
  • the cost function is calculated on partial samples in the first block and/or the sub-blocks.
  • the partial samples are the even or odd rows of the first block and/or the sub-blocks.
  • the partial samples are the even or odd columns of the first block and/or the sub-blocks.
  • the partial samples include the 1st row of every N rows of the first block and/or the sub-block and/or the 1st column of every N columns of the first block and/or the sub-block.
  • the partial samples include the first N1 rows of every N2 rows of the first block and/or the sub-block and/or the first N1 columns of every N2 columns of the first block and/or the sub-block, where N1 and N2 are integers.
  • the partial samples depend on width or/and height of the first block and/or the sub-block.
  • the partial samples include the 1st row of every N1 rows; otherwise, the partial samples include the 1st row of every N2 rows, wherein N1 > N2.
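Restricting the cost to partial samples, as described above, is a plain subsampling of rows (or columns). A sketch of the "first N1 rows of every N2 rows" variant with illustrative data:

```python
# Sketch of partial-sample selection: keep the first n1 rows of every n2
# rows. The block contents and dimensions are illustrative only.

def partial_rows(block, n1, n2):
    """block is a list of rows; keep rows whose index mod n2 is < n1."""
    return [row for i, row in enumerate(block) if i % n2 < n1]

block = [[r] * 4 for r in range(8)]   # 8 rows, each tagged with its index
kept = partial_rows(block, 1, 2)      # the 1st row of every 2 rows
```

The column variant is the same selection applied within each row.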
  • final motion compensation is performed for the first block using MV_b[i] + MVD_b^ibest[i] with regular interpolation filters and/or regular motion precision.
  • the MVD_sb^ibest[i] is further used to generate the refined prediction samples according to the following sample refinement process:
  • τ0 is the picture order count (POC) distance from the current picture to the reference picture in reference list 0
  • τ1 is the POC distance from the reference picture in reference list 1 to the current picture
  • v_x and v_y are the horizontal and vertical components of MVD_sb^ibest[0] of the sub-block which covers the corresponding sample, respectively
  • I^(0) and I^(1) are generated in the final motion compensation.
  • BDOF is performed to derive the MVD for each sub-block and generate the refined prediction sample for each pixel according to the following sample refinement process:
  • τ0 is the picture order count (POC) distance from the current picture to the reference picture in reference list 0
  • τ1 is the POC distance from the reference picture in reference list 1 to the current picture
  • v_x and v_y are the horizontal and vertical components of MVD_sb^ibest[0] of the sub-block which covers the corresponding sample, respectively
  • I^(0) and I^(1) are generated in the final motion compensation.
  • the cost of the cost function is calculated at sub-block level for each MVD_b^j[i] using the associated MVD_sb^j[i] for each sub-block and gradient information of the corresponding reference blocks based on MVD_b^j[i].
  • the MVD_b^j[i] and MVD_sb^j[i] achieving the minimum cost are used in the final prediction sample generation process, and the index of the best MVD_b^j[i] and MVD_sb^j[i] is denoted as ibest for a sub-block.
  • final motion compensation is performed for the sub-block using MV_b[i] + MVD_b^ibest[i] with regular interpolation filters and/or regular motion precision.
  • the MVD_sb^ibest[i] is further used to generate the refined prediction samples according to the following sample refinement process:
  • τ0 is the picture order count (POC) distance from the current picture to the reference picture in reference list 0
  • τ1 is the POC distance from the reference picture in reference list 1 to the current picture
  • v_x and v_y are the horizontal and vertical components of MVD_sb^ibest[0] of the sub-block which covers the corresponding sample, respectively
  • I^(0) and I^(1) are generated in the final motion compensation.
  • BDOF is performed for the sub-block to derive its MVD and generate the refined prediction sample according to the following sample refinement process:
  • τ0 is the picture order count (POC) distance from the current picture to the reference picture in reference list 0
  • τ1 is the POC distance from the reference picture in reference list 1 to the current picture
  • v_x and v_y are the horizontal and vertical components of MVD_sb^ibest[0] of the sub-block which covers the corresponding sample, respectively
  • I^(0) and I^(1) are generated in the final motion compensation.
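The quantities listed above (τ0, τ1, v_x, v_y, I^(0), I^(1) and the reference-block gradients) suggest a BIO-style correction. The sketch below follows the common optical-flow form, where the bi-prediction average is corrected by the sub-block MVD applied to POC-weighted gradient differences; it is a hedged illustration, not the normative sample refinement formula:

```python
# Hedged BDOF-style sample refinement sketch. i0/i1 are the two prediction
# samples; gx*/gy* are the horizontal/vertical gradients of each prediction;
# (vx, vy) is the sub-block MVD; tau0/tau1 are POC distances to the two
# reference pictures. The exact normative expression is not reproduced here.

def refine_sample(i0, i1, gx0, gy0, gx1, gy1, vx, vy, tau0, tau1):
    b = vx * (tau1 * gx1 - tau0 * gx0) + vy * (tau1 * gy1 - tau0 * gy0)
    return (i0 + i1 + b) / 2

# With a zero MVD this degenerates to the plain bi-prediction average.
base = refine_sample(100, 104, 1, 1, 1, 1, 0, 0, 1, 1)
```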
  • the sample refinement process is applied to luma component only.
  • the sample refinement process is applied to all color components.
  • the K MVD-pair candidates are selected from M MVD-pair candidates with DMVR process or other DMVD processes for the first block, where M, K are integers and M > K.
  • K is equal to 2, 3, 4 or 5.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to at least one of certain pictures, tiles and slices.
  • whether the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied or not is signaled in at least one of video parameter set (VPS) , sequence parameter set (SPS) , picture parameter set (PPS) , sequence header, picture header, slice header, tile group header, tile header.
  • VPS video parameter set
  • SPS sequence parameter set
  • PPS picture parameter set
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to pictures that are referenced by other pictures only.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a block when picture order count (POC) distances between the current picture and the two reference pictures of the block are both smaller than a threshold.
  • POC picture order count
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a block when picture order count (POC) distances between the current picture and the two reference pictures of the block are both larger than a threshold.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a block when picture order count (POC) distance between the current picture and one of the two reference pictures of the block is smaller than a threshold.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a block when picture order count (POC) distance between the current picture and one of the two reference pictures of the block is larger than a threshold.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a picture when picture order count (POC) distances between the picture and its two nearest reference pictures in the two reference picture lists are both smaller than a threshold.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a picture when picture order count (POC) distances between the picture and its two nearest reference pictures in the two reference picture lists are both larger than a threshold.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a picture when picture order count (POC) distance between the picture and its two nearest reference pictures in reference picture list 0 or 1 is smaller than a threshold.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied to a picture when picture order count (POC) distances between the picture and its two nearest reference pictures in reference picture list 0 or 1 is larger than a threshold.
  • the unit is a block.
  • the coded information includes at least one of motion information, residual information, transform information, mode information, dimension of the unit.
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are disabled.
  • the additional transform includes at least one of secondary transform or reduced secondary transform or rotational transform or Karhunen-Loève transform (KLT) or any other transform.
  • KLT Karhunen-Loève transform
  • the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are disabled.
  • the specific sizes of the block include at least one of 4*4, 4*8, 8*4, 8*8, 4*16 and 16*4.
  • the additional transform is disabled for blocks wherein the DMVR process and/or BDOF process and/or other DMVD process and/or jointly used DMVR process and BDOF process are applied.
  • an indication of additional transform is signaled for these blocks, but is constrained to be false in a conformance bitstream.
  • whether the jointly used first process and second process is enabled or disabled depends on dimension of the block including width (W) and/or height (H) of the block, wherein W and H are integers.
  • a block size contains less than M*H samples
  • the jointly used first process and second process is disabled, where M is an integer.
  • a block size contains more than M*H samples
  • the jointly used first process and second process is disabled, where M is an integer.
  • M*H samples are 16 or 32 or 64 luma samples.
  • when the minimum of a block's width or/and height is smaller than or no larger than X, the jointly used first process and second process is disabled, where X is an integer.
  • X is 8.
  • th1 and/or th2 is set to 8.
  • the jointly used first process and second process is disabled for M*M block, wherein M is an integer.
  • M is 128.
  • the jointly used first process and second process is disabled for N*M or M*N block, wherein M and N are integers.
  • the jointly used first process and second process is performed at sub-block level.
  • L is 64.
  • when the block is a 64*128 or 128*64 block, the block is split into two 64*64 sub-blocks, and a 128x128 block is split into four 64*64 sub-blocks.
  • when the block is an N*128 or 128*N block, wherein N < 64, the block is not split into sub-blocks.
  • when the block is an N*128 or 128*N block, wherein N < 64, the block is split into two N*64 or 64*N sub-blocks.
  • when the width or height of the block is larger than L, the block is split vertically or horizontally so that the width or/and height of each sub-block is no larger than L.
  • when the size of the block, which is width*height of the block, is larger than a threshold L1, the block is split into multiple sub-blocks, and each sub-block is used as the first block with size equal to the sub-block size.
  • the block is split into sub-blocks with same size that is no larger than L1, wherein L1 is an integer.
  • when the width (or height) of the block is no larger than a threshold L2, the block is not split vertically (or horizontally) , respectively.
  • L1 is 1024, and L2 is 32.
  • the threshold L is pre-defined or signaled in at least one of SPS, PPS, picture, slice, tile group and tile level.
  • the thresholds L, L1 and L2 depend on certain coded information including block size, picture type, temporal layer index.
  • the conversion generates the first block of video from the bitstream representation.
  • the conversion generates the bitstream representation from the first block of video.
  • the disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them.
  • the disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus.
  • the computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them.
  • data processing apparatus encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
  • the apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
  • a propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
  • a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program does not necessarily correspond to a file in a file system.
  • a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) .
  • a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
  • processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
  • a processor will receive instructions and data from a read only memory or a random-access memory or both.
  • the essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
  • a computer need not have such devices.
  • Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks.
  • the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Motion vector and prediction sample refinement is disclosed. In one example, a method of video processing includes deriving, for a conversion between a first block of video and a bitstream representation of the first block, at least one motion vector difference (MVD) among an MVD associated with the first block (MVDb) and an MVD associated with a sub-block within the first block (MVDsb) by jointly using a first process and a second process of multiple decoder motion vector derivation (DMVD) processes, the MVDb being derived at least using the first process, and the MVDsb being derived at least using the second process; refining a motion vector (MV) of the first block (MVb) using the at least one MVD; and performing the conversion based on the refined motion vector of the first block.
PCT/CN2020/084726 2019-04-14 2020-04-14 Motion vector and prediction sample refinement WO2020211755A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202080028662.3A CN113796084B (zh) 2019-04-14 2020-04-14 Motion vector and prediction sample refinement

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNPCT/CN2019/082589 2019-04-14
CN2019082589 2019-04-14

Publications (1)

Publication Number Publication Date
WO2020211755A1 true WO2020211755A1 (fr) 2020-10-22

Family

ID=72838017

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/084726 WO2020211755A1 (fr) 2019-04-14 2020-04-14 Motion vector and prediction sample refinement

Country Status (2)

Country Link
CN (1) CN113796084B (fr)
WO (1) WO2020211755A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818046A (zh) * 2021-01-25 2021-05-18 同济大学 Non-spatiotemporal data transformation and aggregation processing system and method based on rail transit cloud control
WO2022262695A1 (fr) * 2021-06-15 2022-12-22 Beijing Bytedance Network Technology Co., Ltd. Method, device and medium for video processing
WO2023277755A1 (fr) * 2021-06-30 2023-01-05 Telefonaktiebolaget Lm Ericsson (Publ) Selective subblock-based motion refinement
WO2023277756A1 (fr) * 2021-06-30 2023-01-05 Telefonaktiebolaget Lm Ericsson (Publ) Overlapped decoder-side motion refinement

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102131091A (zh) * 2010-01-15 2011-07-20 联发科技股份有限公司 Decoder-side motion vector derivation method
WO2012045225A1 (fr) * 2010-10-06 2012-04-12 Intel Corporation System and method for low complexity motion vector calculation
CN102986224A (zh) * 2010-12-21 2013-03-20 英特尔公司 System and method for enhanced decoder-side motion vector derivation processing
US20180199057A1 * 2017-01-12 2018-07-12 Mediatek Inc. Method and Apparatus of Candidate Skipping for Predictor Refinement in Video Coding
US20180241998A1 * 2017-02-21 2018-08-23 Qualcomm Incorporated Deriving motion vector information at a video decoder
WO2018175720A1 (fr) * 2017-03-22 2018-09-27 Qualcomm Incorporated Constraining motion vector information derived by decoder-side motion vector derivation

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102223540B (zh) * 2011-07-01 2012-12-05 宁波大学 Information hiding method for H.264/AVC video
EP3264768A1 (fr) * 2016-06-30 2018-01-03 Thomson Licensing Method and apparatus for video coding with adaptive motion information refinement
EP3301918A1 (fr) * 2016-10-03 2018-04-04 Thomson Licensing Method and apparatus for encoding and decoding motion information
US10750203B2 (en) * 2016-12-22 2020-08-18 Mediatek Inc. Method and apparatus of adaptive bi-prediction for video coding
WO2018193967A1 (fr) * 2017-04-19 2018-10-25 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Dispositif de codage, dispositif de décodage, procédé de codage et procédé de décodage
US10856003B2 (en) * 2017-10-03 2020-12-01 Qualcomm Incorporated Coding affine prediction motion information for video coding


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818046A (zh) * 2021-01-25 2021-05-18 同济大学 Non-spatiotemporal data transformation and aggregation processing system and method based on rail transit cloud control
WO2022262695A1 (fr) * 2021-06-15 2022-12-22 Beijing Bytedance Network Technology Co., Ltd. Method, device, and medium for video processing
WO2023277755A1 (fr) * 2021-06-30 2023-01-05 Telefonaktiebolaget Lm Ericsson (Publ) Selective sub-block based motion refinement
WO2023277756A1 (fr) * 2021-06-30 2023-01-05 Telefonaktiebolaget Lm Ericsson (Publ) Overlapped decoder-side motion refinement

Also Published As

Publication number Publication date
CN113796084A (zh) 2021-12-14
CN113796084B (zh) 2023-09-15

Similar Documents

Publication Publication Date Title
US11889108B2 (en) Gradient computation in bi-directional optical flow
TWI727338B (zh) 用信號通知的運動向量精度
US11956465B2 (en) Difference calculation based on partial position
US11876932B2 (en) Size selective application of decoder side refining tools
US11641467B2 (en) Sub-block based prediction
US11729377B2 (en) Affine mode in video coding and decoding
WO2020147745A1 (fr) Motion candidate lists using local illumination compensation
US11991382B2 (en) Motion vector management for decoder side motion vector refinement
WO2020211755A1 (fr) Motion vector and prediction sample decomposition
WO2020156538A1 (fr) Interaction between motion vector precision coding and motion vector difference (MVD) coding
WO2020182140A1 (fr) Motion vector refinement in video coding
WO2020182187A1 (fr) Adaptive weight in multi-hypothesis prediction in video coding

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20791680; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN EP: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07.12.2021))
122 EP: PCT application non-entry in European phase (Ref document number: 20791680; Country of ref document: EP; Kind code of ref document: A1)