WO2020143831A1 - Mv precision constraints - Google Patents

Mv precision constraints Download PDF

Info

Publication number
WO2020143831A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
predicted
video
prediction
equal
Prior art date
Application number
PCT/CN2020/071771
Other languages
French (fr)
Inventor
Hongbin Liu
Li Zhang
Kai Zhang
Yue Wang
Original Assignee
Beijing Bytedance Network Technology Co., Ltd.
Bytedance Inc.
Priority date
Filing date
Publication date
Application filed by Beijing Bytedance Network Technology Co., Ltd., Bytedance Inc. filed Critical Beijing Bytedance Network Technology Co., Ltd.
Priority to CN202080008722.5A priority Critical patent/CN113574867B/en
Publication of WO2020143831A1 publication Critical patent/WO2020143831A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/105Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/573Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • This document is related to video coding technologies.
  • Digital video accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
  • the disclosed techniques may be used by video decoder or encoder embodiments in which interpolation is improved using a block-shape interpolation order technique.
  • a method of video bitstream processing includes determining a shape of a first video block, determining an interpolation order based on the shape of the first video block, the interpolation order indicative of a sequence of performing a horizontal interpolation and a vertical interpolation, and performing the horizontal interpolation and the vertical interpolation for the first video block in the sequence in accordance with the interpolation order to reconstruct a decoded representation of the first video block.
  • a method of video bitstream processing includes determining characteristics of a motion vector related to a first video block, determining an interpolation order based on the characteristics of the motion vector, the interpolation order indicative of a sequence of performing a horizontal interpolation and a vertical interpolation, and performing the horizontal interpolation and the vertical interpolation for the first video block in the sequence in accordance with the interpolation order to reconstruct a decoded representation of the first video block.
  • a method for video bitstream processing includes determining, by a processor, dimension characteristics of a first video block; determining, by the processor, that a first interpolation filter is to be applied to the first video block based on the determination of the dimension characteristics; and performing further processing of the first video block using the first interpolation filter.
  • a method for video bitstream processing includes determining, by a processor, first characteristics of a first video block; determining, by the processor, that a first interpolation filter is to be applied to the first video block based on the first characteristics; performing further processing of the first video block using the first interpolation filter; determining, by a processor, second characteristics of a second video block; determining, by the processor, that a second interpolation filter is to be applied to the second video block based on the second characteristics, the first interpolation filter and the second interpolation filter being different short-tap filters; and performing further processing of the second video block using the second interpolation filter.
  • a method for video bitstream processing includes determining, by a processor, characteristics of a first video block, the characteristics including one or more of: a dimension information of a first video block, a prediction direction of the first video block, or a motion information of the first video block; rounding motion vectors (MVs) related to the first video block to integer-pel precision or half-pel precision based on the determination of the characteristics of the first video block; and performing further processing of the first video block using the motion vectors that are rounded.
  • characteristics of a first video block including one or more of: a dimension information of a first video block, a prediction direction of the first video block, or a motion information of the first video block
  • MVs motion vectors
  • a method for video bitstream processing includes determining, by a processor, that a first video block is coded with a merge mode; rounding motion information related to the first video block to integer precision to generate modified motion information based on the determination that the first video block is coded with the merge mode; and performing a motion compensation process for the first video block using the modified motion information.
  • a method for video bitstream processing includes determining characteristics of a first video block, the characteristics being one or both of: a size of the first video block, or a shape of the first video block; modifying motion vectors related to the first video block to integer-pel precision or half-pel precision to generate modified motion vectors; and performing further processing of the first video block using the modified motion vectors.
  • a method for video bitstream processing includes determining characteristics of a first video block, the characteristics being one or both of: a size dimension of the first video block, or a prediction direction of the first video block; determining MMVD side information based on the determination of the characteristics of the first video block; and performing further processing of the first video block using the MMVD side information.
  • a method for video bitstream processing includes determining characteristics of a first video block, the characteristics being one or both of: a size of the first video block, or a shape of the first video block; modifying motion vectors related to the first video block to integer-pel precision or half-pel precision to generate modified motion vectors; and performing further processing of the first video block using the modified motion vectors.
  • a method for video bitstream processing includes determining characteristics of a first video block, the characteristics being one or both of: a size of the first video block, or a shape of the first video block; determining a threshold number of half-pel motion vector (MV) components or quarter-pel MV components to be constrained based on the determination of the characteristics of the first video block; and performing further processing of the first video block using the threshold number.
  • MV half-pel motion vector
  • a method for video bitstream processing includes determining characteristics of a first video block, the characteristics including a size of the first video block; modifying motion vectors (MVs) related to the first video block from fractional precision to integer precision based on the determination of the characteristics of the first video block; and performing motion compensation for the first video block using the modified MVs.
  • MVs motion vectors
  • a method for video bitstream processing includes determining a first dimension of a first video block; determining a first precision for motion vectors (MVs) related to the first video block based on the determination of the first dimension; determining a second dimension of a second video block, the first dimension and the second dimension being different dimensions; determining a second precision for MVs related to the second video block based on the determination of the second dimension, the first precision and the second precision being different precisions; and performing further processing of the first video block using the first dimension and of the second video block using the second dimension.
  • MVs motion vectors
  • a method of video processing includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining filters with interpolation filter parameters used for interpolation of the first block based on the characteristics of the first block; and performing the conversion by using the filters with the interpolation filter parameters.
  • a method of video processing includes fetching, for a conversion between a first block of video and a bitstream representation of the first block, reference pixels of a first reference block from a reference picture, wherein the first reference block is smaller than a second reference block required for motion compensation of the first block; padding the first reference block with padding pixels to generate the second reference block; and performing the conversion by using the generated second reference block.
  • a method for video bitstream processing includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing a rounding process on a motion vector (MV) of the first block based on the characteristics of the first block; and performing the conversion by using the rounded MV.
  • MV motion vector
  • a method for video bitstream processing includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing motion compensation for the first block using an MV with a first precision; and storing an MV with a second precision for the first block; wherein the first precision is different from the second precision.
  • a method for video bitstream processing includes determining, for a conversion between a first block of video and a bitstream representation of the first block, a coding mode of the first block; performing a rounding process on a motion vector (MV) of the first block if the coding mode of the first block satisfies a predetermined rule; and performing the motion compensation of the first block by using the rounded MV.
  • MV motion vector
  • a method for video bitstream processing includes generating, for a conversion between a first block of video and a bitstream representation of the first block, a first motion vector (MV) candidate list for the first block; performing a rounding process on the MV of at least one candidate before adding the at least one candidate into the first MV candidate list; and performing the conversion by using the first MV candidate list.
  • MV motion vector
  • a method for video bitstream processing includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining a constraint parameter to be applied to the first block based on the characteristics of the first block, wherein the constraint parameter constrains a maximum number of fractional motion vector (MV) components of the first block; and performing the conversion by using the constraint parameter.
  • MV fractional motion vector
  • a method for video bitstream processing includes acquiring a signaled indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing the conversion by using the indication when the characteristics of the first block satisfy the predetermined rule.
  • a method for video bitstream processing includes signaling an indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing the conversion based on the characteristics of the first block, wherein during the conversion, at least one of bi-prediction and uni-prediction is disabled when the characteristics of the first block satisfy the predetermined rule.
  • a method for video bitstream processing includes determining, for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for a first block; signaling, an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing the conversion by using the AMVR parameter.
  • MV fractional motion vector
  • AMVR Advanced Motion Vector Resolution
  • a method for video bitstream processing includes determining, for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for a first block; acquiring, an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing the conversion by using the AMVR parameter.
  • MV fractional motion vector
  • AMVR Advanced Motion Vector Resolution
  • the above-described methods may be implemented by a video decoder apparatus that comprises a processor.
  • the above-described methods may be implemented by a video encoder apparatus comprising a processor for decoding encoded video during the video encoding process.
  • these methods may be embodied in the form of processor-executable instructions and stored on a computer-readable program medium.
  • FIG. 1 is an illustration of a QUAD TREE BINARY TREE (QTBT) structure
  • FIG. 2 shows an example derivation process for merge candidates list construction.
  • FIG. 3 shows example positions of spatial merge candidates.
  • FIG. 4 shows an example of candidate pairs considered for redundancy check of spatial merge candidates.
  • FIG. 5A and 5B show examples of positions for the second prediction unit (PU) of N×2N and 2N×N partitions.
  • FIG. 6 is an illustration of motion vector scaling for temporal merge candidate.
  • FIG. 7 shows example candidate positions for temporal merge candidate, C0 and C1.
  • FIG. 8 shows an example of combined bi-predictive merge candidate.
  • FIG. 9 shows an example of a derivation process for motion vector prediction candidates.
  • FIG. 10 is an illustration of motion vector scaling for spatial motion vector candidate.
  • FIG. 11 shows an example of advanced temporal motion vector prediction (ATMVP) motion prediction for a coding unit (CU) .
  • ATMVP advanced temporal motion vector prediction
  • FIG. 12 shows an example of one CU with four sub-blocks (A-D) and its neighbouring blocks (a–d) .
  • FIG. 13 illustrates proposed non-adjacent merge candidates in one example.
  • FIG. 14 illustrates proposed non-adjacent merge candidates in one example.
  • FIG. 15 illustrates proposed non-adjacent merge candidates in one example.
  • FIG. 16 shows an example of integer samples and fractional sample positions for quarter sample luma interpolation.
  • FIG. 17 is a block diagram of an example of a video processing apparatus.
  • FIG. 18 shows a block diagram of an example implementation of a video encoder.
  • FIG. 19 is a flowchart for an example of a video bitstream processing method.
  • FIG. 20 is a flowchart for an example of a video bitstream processing method.
  • FIG. 21 shows an example of repeat boundary pixels of a reference block before interpolation.
  • FIG. 22 is a flowchart for an example of a video bitstream processing method.
  • FIG. 23 is a flowchart for an example of a video bitstream processing method.
  • FIG. 24 is a flowchart for an example of a video bitstream processing method.
  • FIG. 25 is a flowchart for an example of a video bitstream processing method.
  • FIG. 26 is a flowchart for an example of a video bitstream processing method.
  • FIG. 27 is a flowchart for an example of a video bitstream processing method.
  • FIG. 28 is a flowchart for an example of a video bitstream processing method.
  • FIG. 29 is a flowchart for an example of a video bitstream processing method.
  • FIG. 30 is a flowchart for an example of a video bitstream processing method.
  • FIG. 31 is a flowchart for an example of a video bitstream processing method.
  • FIG. 32 is a flowchart for an example of a video bitstream processing method.
  • FIG. 33 is a flowchart for an example of a video bitstream processing method.
  • FIG. 34 is a flowchart for an example of a video bitstream processing method.
  • the present document provides various techniques that can be used by a decoder of video bitstreams to improve the quality of decompressed or decoded digital video. Furthermore, a video encoder may also implement these techniques during the process of encoding in order to reconstruct decoded frames used for further encoding.
  • Section headings are used in the present document for ease of understanding and do not limit the embodiments and techniques to the corresponding sections. As such, embodiments from one section can be combined with embodiments from other sections.
  • This invention is related to video coding technologies. Specifically, it is related to interpolation in video coding. It may be applied to existing video coding standards like HEVC, or to the standard to be finalized (Versatile Video Coding). It may also be applicable to future video coding standards or video codecs.
  • Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards.
  • the ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video and H.264/MPEG-4 Advanced Video Coding (AVC) and H.265/HEVC standards.
  • AVC H.264/MPEG-4 Advanced Video Coding
  • H.265/HEVC High Efficiency Video Coding
  • the video coding standards are based on the hybrid video coding structure wherein temporal prediction plus transform coding are utilized.
  • The Joint Video Exploration Team (JVET) was founded by VCEG and MPEG jointly in 2015.
  • JEM Joint Exploration Model
  • FIG. 18 is a block diagram of an example implementation of a video encoder.
  • Quadtree plus binary tree (QTBT) block structure with larger CTUs
  • a CTU is split into CUs by using a quadtree structure denoted as coding tree to adapt to various local characteristics.
  • the decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level.
  • Each CU can be further split into one, two or four PUs according to the PU splitting type. Inside one PU, the same prediction process is applied and the relevant information is transmitted to the decoder on a PU basis.
  • a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree for the CU.
  • TUs transform units
  • the QTBT structure removes the concepts of multiple partition types, i.e. it removes the separation of the CU, PU and TU concepts, and supports more flexibility for CU partition shapes.
  • a CU can have either a square or rectangular shape.
  • a coding tree unit (CTU) is first partitioned by a quadtree structure.
  • the quadtree leaf nodes are further partitioned by a binary tree structure.
  • the binary tree leaf nodes are called coding units (CUs) , and that segmentation is used for prediction and transform processing without any further partitioning.
  • a CU sometimes consists of coding blocks (CBs) of different colour components, e.g. one CU contains one luma CB and two chroma CBs in the case of P and B slices of the 4:2:0 chroma format and sometimes consists of a CB of a single component, e.g., one CU contains only one luma CB or just two chroma CBs in the case of I slices.
  • CBs coding blocks
  • CTU size the root node size of a quadtree, the same concept as in HEVC
  • MinQTSize the minimum allowed quadtree leaf node size
  • MaxBTSize the maximum allowed binary tree root node size
  • MaxBTDepth the maximum allowed binary tree depth
  • MinBTSize the minimum allowed binary tree leaf node size
  • the CTU size is set as 128×128 luma samples with two corresponding 64×64 blocks of chroma samples
  • the MinQTSize is set as 16×16
  • the MaxBTSize is set as 64×64
  • the MinBTSize (for both width and height) is set as 4
  • the MaxBTDepth is set as 4.
  • the quadtree partitioning is applied to the CTU first to generate quadtree leaf nodes.
  • the quadtree leaf nodes may have a size from 16×16 (i.e., the MinQTSize) to 128×128 (i.e., the CTU size) .
  • the quadtree leaf node is also the root node for the binary tree and it has the binary tree depth as 0.
  • When the binary tree depth reaches MaxBTDepth (i.e., 4) , no further splitting is considered.
  • When the binary tree node has width equal to MinBTSize (i.e., 4) , no further horizontal splitting is considered.
  • When the binary tree node has height equal to MinBTSize, no further vertical splitting is considered.
  • the leaf nodes of the binary tree are further processed by prediction and transform processing without any further partitioning. In the JEM, the maximum CTU size is 256×256 luma samples.
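  • As a rough illustration of how the parameters above interact, the following C++ sketch checks whether a node may be split further. The constants use the example values given above; the function names are illustrative and are not taken from any reference software.

#include <cstdio>

// Example QTBT parameter values from the text above.
constexpr int kMinQTSize  = 16; // minimum allowed quadtree leaf node size
constexpr int kMaxBTSize  = 64; // maximum allowed binary tree root node size
constexpr int kMaxBTDepth = 4;  // maximum allowed binary tree depth
constexpr int kMinBTSize  = 4;  // minimum allowed binary tree leaf node size

// May a square quadtree node of the given size be split further by the quadtree?
bool quadSplitAllowed(int size) { return size > kMinQTSize; }

// May a node be split by the binary tree? Following the text above, a node whose
// width equals MinBTSize allows no further horizontal splitting, and a node
// whose height equals MinBTSize allows no further vertical splitting.
bool binarySplitAllowed(int width, int height, int btDepth, bool horizontalSplit) {
  if (btDepth >= kMaxBTDepth) return false;                    // depth limit reached
  if (width > kMaxBTSize || height > kMaxBTSize) return false; // too large for a BT root
  if (horizontalSplit && width <= kMinBTSize) return false;
  if (!horizontalSplit && height <= kMinBTSize) return false;
  return true;
}

int main() {
  printf("64x64 quadtree split allowed: %d\n", quadSplitAllowed(64));
  printf("4x8 at BT depth 2, horizontal split allowed: %d\n",
         binarySplitAllowed(4, 8, 2, true));  // width is already MinBTSize -> 0
  printf("4x8 at BT depth 2, vertical split allowed: %d\n",
         binarySplitAllowed(4, 8, 2, false)); // height 8 > MinBTSize -> 1
}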
  • FIG. 1 illustrates an example of block partitioning by using QTBT
  • FIG. 1 (right) illustrates the corresponding tree representation.
  • the solid lines indicate quadtree splitting and dotted lines indicate binary tree splitting.
  • For each splitting (i.e., non-leaf) node of the binary tree, one flag is signalled to indicate which splitting type (i.e., horizontal or vertical) is used, where 0 indicates horizontal splitting and 1 indicates vertical splitting.
  • For the quadtree splitting, there is no need to indicate the splitting type since quadtree splitting always splits a block both horizontally and vertically to produce 4 sub-blocks with an equal size.
  • the QTBT scheme supports the ability for the luma and chroma to have a separate QTBT structure.
  • the luma and chroma CTBs in one CTU share the same QTBT structure.
  • the luma CTB is partitioned into CUs by a QTBT structure
  • the chroma CTBs are partitioned into chroma CUs by another QTBT structure. This means that a CU in an I slice consists of a coding block of the luma component or coding blocks of two chroma components, and a CU in a P or B slice consists of coding blocks of all three colour components.
  • inter prediction for small blocks is restricted to reduce the memory access of motion compensation, such that bi-prediction is not supported for 4×8 and 8×4 blocks, and inter prediction is not supported for 4×4 blocks.
  • these restrictions are removed.
  • Each inter-predicted PU has motion parameters for one or two reference picture lists.
  • Motion parameters include a motion vector and a reference picture index. Usage of one of the two reference picture lists may also be signalled using inter_pred_idc. Motion vectors may be explicitly coded as deltas relative to predictors.
  • a merge mode is specified whereby the motion parameters for the current PU are obtained from neighbouring PUs, including spatial and temporal candidates.
  • the merge mode can be applied to any inter-predicted PU, not only for skip mode.
  • the alternative to merge mode is the explicit transmission of motion parameters, where motion vector (to be more precise, motion vector difference compared to a motion vector predictor) , corresponding reference picture index for each reference picture list and reference picture list usage are signalled explicitly per each PU.
  • Such a mode is named advanced motion vector prediction (AMVP) in this disclosure.
  • When signalling indicates that one of the two reference picture lists is to be used, the PU is produced from one block of samples. This is referred to as ‘uni-prediction’ . Uni-prediction is available both for P-slices and B-slices.
  • When signalling indicates that both of the reference picture lists are to be used, the PU is produced from two blocks of samples. This is referred to as ‘bi-prediction’ . Bi-prediction is available for B-slices only.
  • Step 1.2 Redundancy check for spatial candidates
  • a maximum of four merge candidates are selected among candidates that are located in five different positions.
  • a maximum of one merge candidate is selected among two candidates. Since a constant number of candidates for each PU is assumed at the decoder, additional candidates are generated when the number of candidates obtained from step 1 does not reach the maximum number of merge candidates (MaxNumMergeCand) which is signalled in the slice header. Since the number of candidates is constant, the index of the best merge candidate is encoded using truncated unary binarization (TU) . If the size of the CU is equal to 8, all the PUs of the current CU share a single merge candidate list, which is identical to the merge candidate list of the 2N×2N prediction unit.
  • TU truncated unary binarization
  • a maximum of four merge candidates are selected among candidates located in the positions depicted in FIG. 3.
  • the order of derivation is A1, B1, B0, A0 and B2.
  • Position B2 is considered only when any PU of position A1, B1, B0, A0 is not available (e.g. because it belongs to another slice or tile) or is intra coded.
  • After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check which ensures that candidates with the same motion information are excluded from the list so that coding efficiency is improved.
  • not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in FIG. 4 are considered.
  • FIG. 5A and FIG. 5B depict the second PU for the case of N×2N and 2N×N, respectively.
  • When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction; adding this candidate would lead to two prediction units having the same motion information, which is redundant with having just one PU in the coding unit.
  • position B1 is not considered when the current PU is partitioned as 2N×N.
  • a scaled motion vector is derived based on co-located PU belonging to the picture which has the smallest POC difference with current picture within the given reference picture list.
  • the reference picture list to be used for derivation of the co-located PU is explicitly signalled in the slice header.
  • the scaled motion vector for the temporal merge candidate is obtained as illustrated by the dashed line in FIG. 6.
  • tb is defined to be the POC difference between the reference picture of the current picture and the current picture
  • td is defined to be the POC difference between the reference picture of the co-located picture and the co-located picture.
  • the reference picture index of temporal merge candidate is set equal to zero.
  • FIG. 6 is an illustration of motion vector scaling for temporal merge candidate.
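  • The POC-distance scaling of tb and td can be sketched as below. The fixed-point procedure mirrors HEVC's temporal MV scaling (the constants 16384 and 4096 and the final 8-bit normalization come from that standard), so treat this as an illustrative reference, not this document's normative text.

#include <algorithm>
#include <cstdio>
#include <cstdlib>

int clip3(int lo, int hi, int v) { return std::min(hi, std::max(lo, v)); }

// Scale the co-located PU's MV by the ratio tb/td, where tb is the POC
// difference between the current picture and its reference picture, and td
// is the POC difference between the co-located picture and its reference.
int scaleMv(int mvCol, int tb, int td) {
  tb = clip3(-128, 127, tb);
  td = clip3(-128, 127, td);
  int tx = (16384 + (std::abs(td) >> 1)) / td;
  int distScaleFactor = clip3(-4096, 4095, (tb * tx + 32) >> 6);
  int scaled = distScaleFactor * mvCol;
  return clip3(-32768, 32767,
               (scaled >= 0 ? 1 : -1) * ((std::abs(scaled) + 127) >> 8));
}

int main() {
  // Co-located MV of 64 (16 luma samples in 1/4-pel units), td = 4, tb = 2:
  // the scaled MV is roughly halved.
  printf("scaled MV = %d\n", scaleMv(64, 2, 4)); // -> 32
}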
  • the position for the temporal candidate is selected between candidates C0 and C1, as depicted in FIG. 7. If the PU at position C0 is not available, is intra coded, or is outside of the current CTU row, position C1 is used. Otherwise, position C0 is used in the derivation of the temporal merge candidate.
  • Combined bi-predictive merge candidates are generated by utilizing spatial and temporal merge candidates, and are used for B-slices only. The combined bi-predictive candidates are generated by combining the first reference picture list motion parameters of an initial candidate with the second reference picture list motion parameters of another. If these two tuples provide different motion hypotheses, they will form a new bi-predictive candidate. As an example, FIG. 8 shows two candidates in the original list being combined to create a bi-predictive candidate that is added to the final list.
  • Zero motion candidates are inserted to fill the remaining entries in the merge candidates list and therefore hit the MaxNumMergeCand capacity. These candidates have zero spatial displacement and a reference picture index which starts from zero and increases every time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is one and two for uni and bi-directional prediction, respectively. Finally, no redundancy check is performed on these candidates.
  • HEVC defines the motion estimation region (MER) whose size is signalled in the picture parameter set using the “log2_parallel_merge_level_minus2” syntax element. When a MER is defined, merge candidates falling in the same region are marked as unavailable and therefore not considered in the list construction.
  • AMVP exploits spatio-temporal correlation of motion vector with neighbouring PUs, which is used for explicit transmission of motion parameters.
  • a motion vector candidate list is constructed by firstly checking availability of left and above temporally neighbouring PU positions, removing redundant candidates and adding zero vectors to make the candidate list a constant length. Then, the encoder can select the best predictor from the candidate list and transmit the corresponding index indicating the chosen candidate. As with merge index signalling, the index of the best motion vector candidate is encoded using truncated unary. The maximum value to be encoded in this case is 2 (see FIG. 9) .
  • FIG. 9 summarizes derivation process for motion vector prediction candidate.
  • For motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates.
  • For spatial motion vector candidate derivation, two motion vector candidates are eventually derived based on motion vectors of each PU located in five different positions as depicted in FIG. 3.
  • For temporal motion vector candidate derivation, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.
  • a maximum of two candidates are considered among five potential candidates, which are derived from PUs located in positions as depicted in FIG. 3, those positions being the same as those of motion merge.
  • the order of derivation for the left side of the current PU is defined as A0, A1, and scaled A0, scaled A1.
  • the order of derivation for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2.
  • the no-spatial-scaling cases are checked first followed by the spatial scaling. Spatial scaling is considered when the POC is different between the reference picture of the neighbouring PU and that of the current PU regardless of reference picture list. If all PUs of left candidates are not available or are intra coded, scaling for the above motion vector is allowed to help parallel derivation of left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.
  • FIG. 10 is an illustration of motion vector scaling for spatial motion vector candidate.
  • the motion vector of the neighbouring PU is scaled in a similar manner as for temporal scaling, as depicted as FIG. 10.
  • the main difference is that the reference picture list and index of current PU is given as input; the actual scaling process is the same as that of temporal scaling.
  • each CU can have at most one set of motion parameters for each prediction direction.
  • Two sub-CU level motion vector prediction methods are considered in the encoder by splitting a large CU into sub-CUs and deriving motion information for all the sub-CUs of the large CU.
  • Alternative temporal motion vector prediction (ATMVP) method allows each CU to fetch multiple sets of motion information from multiple blocks smaller than the current CU in the collocated reference picture.
  • STMVP spatial-temporal motion vector prediction
  • the motion compression for the reference frames is currently disabled.
  • temporal motion vector prediction (TMVP) is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU.
  • the sub-CUs are square N×N blocks (N is set to 4 by default) .
  • ATMVP predicts the motion vectors of the sub-CUs within a CU in two steps.
  • the first step is to identify the corresponding block in a reference picture with a so-called temporal vector.
  • the reference picture is called the motion source picture.
  • the second step is to split the current CU into sub-CUs and obtain the motion vectors as well as the reference indices of each sub-CU from the block corresponding to each sub-CU, as shown in FIG. 11.
  • a reference picture and the corresponding block is determined by the motion information of the spatial neighbouring blocks of the current CU.
  • the first merge candidate in the merge candidate list of the current CU is used.
  • the first available motion vector as well as its associated reference index are set to be the temporal vector and the index to the motion source picture. This way, in ATMVP, the corresponding block may be more accurately identified, compared with TMVP, wherein the corresponding block (sometimes called collocated block) is always in a bottom-right or center position relative to the current CU.
  • a corresponding block of the sub-CU is identified by the temporal vector in the motion source picture, by adding to the coordinate of the current CU the temporal vector.
  • the motion information of its corresponding block (the smallest motion grid that covers the center sample) is used to derive the motion information for the sub-CU.
  • the motion information of a corresponding N ⁇ N block is identified, it is converted to the motion vectors and reference indices of the current sub-CU, in the same way as TMVP of HEVC, wherein motion scaling and other procedures apply.
  • the decoder checks whether the low-delay condition (i.e., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) is fulfilled and possibly uses motion vector MVx (the motion vector corresponding to reference picture list X) to predict motion vector MVy (with X being equal to 0 or 1 and Y being equal to 1-X) for each sub-CU.
  • FIG. 12 illustrates this concept. Let us consider an 8×8 CU which contains four 4×4 sub-CUs A, B, C, and D. The neighbouring 4×4 blocks in the current frame are labelled as a, b, c, and d.
  • the motion derivation for sub-CU A starts by identifying its two spatial neighbours.
  • the first neighbour is the N×N block above sub-CU A (block c) . If this block c is not available or is intra coded, the other N×N blocks above sub-CU A are checked (from left to right, starting at block c) .
  • the second neighbour is a block to the left of the sub-CU A (block b) . If block b is not available or is intra coded, other blocks to the left of sub-CU A are checked (from top to bottom, starting at block b) .
  • the motion information obtained from the neighbouring blocks for each list is scaled to the first reference frame for a given list.
  • temporal motion vector predictor (TMVP) of sub-block A is derived by following the same procedure of TMVP derivation as specified in HEVC.
  • the motion information of the collocated block at location D is fetched and scaled accordingly.
  • all available motion vectors (up to 3) are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
  • the sub-CU modes are enabled as additional merge candidates and there is no additional syntax element required to signal the modes.
  • Two additional merge candidates are added to merge candidates list of each CU to represent the ATMVP mode and STMVP mode. Up to seven merge candidates are used, if the sequence parameter set indicates that ATMVP and STMVP are enabled.
  • the encoding logic of the additional merge candidates is the same as for the merge candidates in the HM, which means, for each CU in a P or B slice, two more RD checks are needed for the two additional merge candidates.
  • the derived candidates are added after TMVP candidates in the merge candidate list.
  • each candidate B (i, j) or C (i, j) has an offset of 16 in the vertical direction compared to its previous B or C candidates.
  • Each candidate A (i, j) or D (i, j) has an offset of 16 in the horizontal direction compared to its previous A or D candidates.
  • Each E (i, j) has an offset of 16 in both horizontal direction and vertical direction compared to its previous E candidates. The candidates are checked from inside to the outside.
  • the order of the candidates is A (i, j) , B (i, j) , C (i, j) , D (i, j) , and E (i, j) .
  • the candidates are added after TMVP candidates in the merge candidate list.
  • the extended spatial positions from 6 to 27 as in FIG. 15 are checked according to their numerical order after the temporal candidate.
  • all the spatial candidates are restricted within two CTU lines.
  • an 8-tap separable DCT-based interpolation filter is used for 2/4 precision samples and a 7-tap separable DCT-based interpolation filter is used for 1/4 precision samples, as shown in Table 1.
  • Table 1 8-tap DCT-IF coefficients for 1/4th luma interpolation.
  • a 4-tap separable DCT-based interpolation filter is used for the chroma interpolation filter, as shown in Table 2.
  • Table 2 4-tap DCT-IF coefficients for 1/8th chroma interpolation.
  • the bit-depth of the output of the interpolation filter is maintained at 14-bit accuracy, regardless of the source bit-depth, before the averaging of the two prediction signals.
  • the actual averaging process is done implicitly with the bit-depth reduction process as:
  • predSamples[ x, y ] = ( predSamplesL0[ x, y ] + predSamplesL1[ x, y ] + offset ) >> shift
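  • A minimal sketch of this implicit averaging, assuming HEVC's default weighted sample prediction values shift = 15 - bitDepth and offset = 1 << (shift - 1):

#include <algorithm>
#include <cstdio>

// Average two 14-bit intermediate prediction samples down to the output
// bit depth, with rounding and clipping.
int biAverage(int predL0, int predL1, int bitDepth) {
  const int shift = 15 - bitDepth;     // e.g. 7 for 8-bit video
  const int offset = 1 << (shift - 1); // rounding offset
  int v = (predL0 + predL1 + offset) >> shift;
  return std::min((1 << bitDepth) - 1, std::max(0, v)); // clip to sample range
}

int main() {
  // Samples 64 and 65 at 14-bit intermediate accuracy (value << 6 for 8-bit input).
  printf("%d\n", biAverage(64 << 6, 65 << 6, 8)); // -> 65 (64.5 rounded up)
}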
  • h_{k,0} = ( -A_{k,-3} + 4*A_{k,-2} - 11*A_{k,-1} + 40*A_{k,0} + 40*A_{k,1} - 11*A_{k,2} + 4*A_{k,3} - A_{k,4} ) >> shift1 (2-3)
  • Table 4 interpolation required for WxH luma component when the interpolation order is reversed.
  • different interpolation orders can lead to different interpolation results when the bit depth of the input video is greater than 8. Therefore, the interpolation order shall be defined implicitly in both encoder and decoder.
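  • The order dependence can be seen in a small experiment: applying the 8-tap half-pel filter of Table 1 in both orders, with an intermediate right-shift as used when the bit depth exceeds 8, may produce different results because the shift discards low-order bits. The patch size, synthetic sample values and shift below are illustrative assumptions.

#include <cstdio>

// 8-tap DCT-IF half-pel coefficients from Table 1.
static const int kTaps[8] = {-1, 4, -11, 40, 40, -11, 4, -1};

// Apply the 8-tap filter at s[0] with the given stride
// (1 = horizontal, 15 = one row of the 15x15 patch below = vertical).
int filter8(const int* s, int stride) {
  int acc = 0;
  for (int i = 0; i < 8; ++i) acc += kTaps[i] * s[(i - 3) * stride];
  return acc;
}

int main() {
  const int bitDepth = 10;
  const int shift1 = bitDepth - 8; // intermediate shift after the first pass
  int src[15][15];                 // synthetic 10-bit samples
  for (int y = 0; y < 15; ++y)
    for (int x = 0; x < 15; ++x)
      src[y][x] = (x * 37 + y * 101 + x * y) % 1024;

  // Order A: horizontal first (intermediate shift), then vertical.
  int tmpA[15];
  for (int y = 0; y < 15; ++y) tmpA[y] = filter8(&src[y][7], 1) >> shift1;
  int resA = filter8(&tmpA[7], 1);

  // Order B: vertical first (intermediate shift), then horizontal.
  int tmpB[15];
  for (int x = 0; x < 15; ++x) tmpB[x] = filter8(&src[7][x], 15) >> shift1;
  int resB = filter8(&tmpB[7], 1);

  printf("h-then-v: %d, v-then-h: %d\n", resA, resB); // the two may differ
}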
  • N: the interpolation filter tap in motion compensation, for example, 8, 6, 4, or 2
  • WxH: the current block size
  • triangle mode is considered as a bi-prediction mode, and the following techniques related to bi-prediction may be applied to triangle mode too.
  • the interpolation order depends on the current coding block shape (e.g., the coding block is a CU) .
  • For a block (such as a CU, PU or sub-block used in sub-block based prediction like affine, ATMVP or BIO) with width > height, vertical interpolation is performed first, and then horizontal interpolation is performed, e.g., pixels d_{k,0}, h_{k,0} and n_{k,0} are interpolated first and e_{0,0} to r_{0,0} are then interpolated.
  • An example of j_{0,0} is shown in equations 2-3 and 2-4.
  • For a block (such as a CU, PU or sub-block used in sub-block based prediction like affine, ATMVP or BIO) with width < height, horizontal interpolation is performed first, and then vertical interpolation is performed.
  • In one example, for a block (such as a CU, PU or sub-block used in sub-block based prediction like affine, ATMVP or BIO), horizontal interpolation is performed first, and then vertical interpolation is performed.
  • both the luma component and the chroma components follow the same interpolation order.
  • one chroma coding block corresponds to multiple luma coding blocks (e.g., for the 4:2:0 color format, one chroma 4x4 block may correspond to two 8x4 or 4x8 luma blocks)
  • luma and chroma may use different interpolation orders.
  • the scaling factors in the multiple stages may be further changed accordingly.
  • the interpolation order of the luma component can further depend on the MV; e.g., in one example, horizontal interpolation is performed first, and then vertical interpolation is performed.
  • the proposed methods are only applied to square coding blocks.
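  • A minimal sketch of the shape-dependent interpolation order above; the enum and function names are illustrative.

#include <cstdio>

enum class InterpOrder { HorizontalFirst, VerticalFirst };

// width > height -> vertical interpolation first, otherwise horizontal first.
// Running the longer dimension in the second pass keeps the first-pass
// intermediate buffer smaller (e.g. for a 16x4 block with 8-tap filters,
// vertical-first needs a 23x4 intermediate instead of 16x11).
InterpOrder chooseInterpOrder(int width, int height) {
  return (width > height) ? InterpOrder::VerticalFirst
                          : InterpOrder::HorizontalFirst;
}

int main() {
  printf("16x4: %s\n", chooseInterpOrder(16, 4) == InterpOrder::VerticalFirst
                           ? "vertical first" : "horizontal first");
  printf("4x16: %s\n", chooseInterpOrder(4, 16) == InterpOrder::VerticalFirst
                           ? "vertical first" : "horizontal first");
}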
  • the associated motion information may be modified to integer precision (e.g., via rounding) before invoking the motion compensation process.
  • merge candidates with fractional motion vectors may be excluded from the merge list.
  • fractional motion vectors may be firstly modified to integer precision (e.g., via rounding) before being added to the merge list.
  • a separate HMVP table may be kept on-the-fly to store motion candidates with integer precisions.
  • the above methods may be only applied when the merge candidate is a bi-prediction candidate.
  • the above methods may be applied to certain block dimensions, such as 4x16, 16x4, 4x8, 8x4, 4x4.
  • the above methods may be applied to the AMVP coded blocks wherein the merge candidate may be replaced by an AMVP candidate.
  • the above methods may be applied to certain block modes, such as non-affine mode.
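  • A minimal sketch of rounding a merge candidate's MV to integer precision before motion compensation, assuming MVs stored in 1/4-pel units; the struct and function names are illustrative.

#include <cstdio>

struct Mv { int x, y; };

// Round both components to the nearest integer-pel position
// (ties rounded away from zero).
Mv roundToIntegerPel(Mv mv, int shift /* 2 for 1/4-pel storage */) {
  const int offset = 1 << (shift - 1);
  auto r = [&](int v) {
    int s = v >= 0 ? 1 : -1;
    return s * (((s * v + offset) >> shift) << shift);
  };
  return {r(mv.x), r(mv.y)};
}

int main() {
  Mv mv{9, -6}; // (2.25, -1.5) luma samples in 1/4-pel units
  Mv r = roundToIntegerPel(mv, 2);
  printf("(%d,%d) -> (%d,%d)\n", mv.x, mv.y, r.x, r.y); // (9,-6) -> (8,-8)
}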
  • the MMVD side information (such as distance table, directions) may be dependent on block dimension and/or prediction direction (e.g., uni-prediction or bi-prediction) .
  • a distance table with all integer precisions may be defined or signaled.
  • the base merge candidate may be firstly modified (such as via rounding) to integer precision and then used to derive the final motion vectors for motion compensation.
  • the MV in MMVD mode may be constrained to integer-pel precision or half-pel precision for some block sizes or block shapes.
  • the base merge candidates used in MMVD may be firstly modified to integer-pel precision (such as via rounding) .
  • the base merge candidates used in MMVD may be modified to half-pel precision (such as via rounding) .
  • rounding may be performed in the base merge list construction process, therefore, rounded MVs are used in pruning.
  • rounding may be performed after the base merge list construction process, therefore, unrounded MVs are used in pruning.
  • binarization of the MVD index may be modified because the maximum MVD index is M - K - 1 instead of M - 1.
  • different context may be used in CABAC coding.
  • rounding may be performed after deriving the MV in MMVD mode.
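  • A minimal sketch of one of the placements above, rounding after deriving the MV in MMVD mode. The distance and direction tables are simplified illustrations, not the normative MMVD tables.

#include <cstdio>

struct Mv { int x, y; };

// Illustrative all-integer-pel distance table in 1/4-pel units
// (1, 2, 4, 8 luma samples), as in the "distance table with all integer
// precisions" alternative above, plus four MMVD-style directions.
static const int kDistance[4] = {4, 8, 16, 32};
static const int kDirX[4] = {1, -1, 0, 0};
static const int kDirY[4] = {0, 0, 1, -1};

Mv deriveMmvdMv(Mv base, int distIdx, int dirIdx, bool roundToInt) {
  Mv mv{base.x + kDirX[dirIdx] * kDistance[distIdx],
        base.y + kDirY[dirIdx] * kDistance[distIdx]};
  if (roundToInt) { // round the derived MV to the nearest integer-pel
    auto r = [](int v) {
      int s = v >= 0 ? 1 : -1;
      return s * (((s * v + 2) >> 2) << 2);
    };
    mv = {r(mv.x), r(mv.y)};
  }
  return mv;
}

int main() {
  Mv base{5, -3};                         // fractional base merge candidate
  Mv mv = deriveMmvdMv(base, 1, 0, true); // +2 samples horizontally, rounded
  printf("(%d,%d)\n", mv.x, mv.y);        // -> (12,-4)
}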
  • the constraint may be different for bi-prediction and uni-prediction.
  • the constraint may be not applied in uni-prediction.
  • the constraint may be different for different block sizes or block shapes.
  • half-pel MV components or/and quarter-pel MV components may be constrained for some block sizes or block shapes.
  • the bitstream shall conform to the constraint.
  • the constraint may be different for bi-prediction and uni-prediction.
  • the constraint may be not applied in uni-prediction.
  • such constraint may be applied to bi-predicted 4x8 or/and 8x4 or/and 4x16 or/and 16x4 block, however, it may be not applied to uni-predicted 4x8 or/and 8x4 or/and 4x16 or/and 16x4 block.
  • such constraint may be applied to both bi-predicted and uni-predicted 4x4 block.
  • the constraint may be different for different block sizes or block shapes.
  • the constraint may be applied to triangle mode.
  • such constraint may be applied to 4x16 or/and 16x4 block coded in triangle mode.
  • at most 0 fractional MV components may be allowed.
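  • A minimal sketch of checking such a constraint; the threshold rule keyed on block size and prediction type is an illustrative assumption, and MVs are assumed to be stored in 1/4-pel units.

#include <cstdio>

struct Mv { int x, y; };

// Count how many MV components (horizontal and vertical, over all
// prediction directions) point to fractional positions.
int countFractionalComponents(const Mv* mvs, int numMvs) {
  int n = 0;
  for (int i = 0; i < numMvs; ++i) {
    n += (mvs[i].x & 3) != 0; // fractional horizontal component
    n += (mvs[i].y & 3) != 0; // fractional vertical component
  }
  return n;
}

// Hypothetical rule: small bi-predicted blocks allow no fractional
// components; other blocks are left unconstrained here.
int maxFractionalAllowed(int w, int h, bool biPredicted) {
  if (biPredicted && ((w == 4 && h <= 16) || (h == 4 && w <= 16))) return 0;
  return 4; // effectively unconstrained for a bi-predicted block
}

int main() {
  Mv mvs[2] = {{9, 4}, {8, -6}}; // bi-prediction: one MV per direction
  int frac = countFractionalComponents(mvs, 2);
  bool ok = frac <= maxFractionalAllowed(4, 16, true);
  printf("fractional components: %d, conforming: %d\n", frac, ok); // 2, 0
}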
  • some components of a MV may be rounded to integer-pel precision or half-pel precision depending on the dimension (e.g., width and/or height, ratios of width and height) , or/and prediction direction or/and motion information of a block.
  • MV is rounded to the nearest integer-pel precision MV or/and half-pel precision MV.
  • rounding down, rounding up, rounding towards zero or rounding away from zero may be used.
  • MV rounding may be applied to the horizontal or/and vertical MV component.
  • MV rounding may be applied to the horizontal (or vertical) MV component.
  • thresholds L and L1 may be different for bi-predicted blocks and uni-predicted blocks. For example, smaller thresholds may be used for bi-predicted blocks.
  • MV rounding may be applied.
  • MV rounding may be applied only when both horizontal and vertical components of the MV are fractional, i.e., they point to a fractional pixel position instead of an integer pixel position.
  • MV rounding Whether MV rounding is applied or not may depend on whether the current block is bi-predicted or uni-predicted.
  • MV rounding may be applied only when the current block is bi-predicted.
  • Whether MV rounding is applied or not may depend on the prediction direction (e.g., from List 0 or List 1) and/or the associated motion vectors. In one example, for bi-predicted blocks, whether MV rounding is applied or not may be different for different prediction directions.
  • MV rounding may be applied to N MV components for prediction direction X; otherwise, MV rounding may not be applied.
  • For example, N is 0, 1 or 2.
  • N and M may be different for bi-predicted blocks and uni-predicted blocks.
  • N and M may be different for different block sizes (width or/and height or/and width *height) .
  • N is equal to 4 and M is equal to 4.
  • N is equal to 4 and M is equal to 3.
  • N is equal to 4 and M is equal to 2.
  • N is equal to 4 and M is equal to 1.
  • N is equal to 3 and M is equal to 3.
  • N is equal to 3 and M is equal to 2.
  • N is equal to 3 and M is equal to 1.
  • N is equal to 2 and M is equal to 2.
  • N is equal to 2 and M is equal to 1.
  • N is equal to 1 and M is equal to 1.
  • N is equal to 2 and M is equal to 2.
  • N is equal to 2 and M is equal to 1.
  • N is equal to 1 and M is equal to 1.
  • MV rounding Whether MV rounding is applied or not may be different for different color components such as Y, Cb and Cr.
  • MV rounding may depend on color formats such as 4:2:0, 4:2:2 or 4:4:4.
  • MV rounding may depend on the block size (or width, height) , block shapes, prediction direction etc.
  • some MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks may be rounded to half-pel precision.
  • some MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks may be rounded to integer-pel precision.
  • some MV components of 4x4 uni-predicted or/and bi-predicted luma blocks may be rounded to integer-pel precision.
  • some MV components of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks may be rounded to integer-pel precision.
  • the MV rounding may not be applied to sub-block prediction, such as affine prediction.
  • the MV rounding may be applied to sub-block prediction, such as ATMVP prediction.
  • each sub-block is treated as a coding block to judge whether and how to apply MV rounding.
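  • A minimal sketch combining several of the examples above into one rounding rule; the rule table is an illustrative assumption and MVs are assumed to be stored in 1/16-pel units.

#include <cstdio>

struct Mv { int x, y; };

enum class Precision { Sixteenth, Half, Integer };

// Illustrative rule: 4x16/16x4 blocks are rounded to half-pel, 4x4 blocks
// and bi-predicted 4x8/8x4 blocks to integer-pel, everything else untouched.
Precision targetPrecision(int w, int h, bool biPredicted) {
  if ((w == 4 && h == 16) || (w == 16 && h == 4)) return Precision::Half;
  if (w == 4 && h == 4) return Precision::Integer;
  if (biPredicted && ((w == 4 && h == 8) || (w == 8 && h == 4)))
    return Precision::Integer;
  return Precision::Sixteenth;
}

Mv roundMv(Mv mv, Precision p) {
  int shift = (p == Precision::Integer) ? 4 : (p == Precision::Half) ? 3 : 0;
  if (shift == 0) return mv; // already at storage precision
  auto r = [&](int v) {
    int s = v >= 0 ? 1 : -1;
    return s * (((s * v + (1 << (shift - 1))) >> shift) << shift);
  };
  return {r(mv.x), r(mv.y)};
}

int main() {
  Mv mv{37, -21}; // (2.3125, -1.3125) luma samples in 1/16-pel units
  Mv r = roundMv(mv, targetPrecision(16, 4, true));
  printf("(%d,%d)\n", r.x, r.y); // rounded to half-pel -> (40,-24)
}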
  • motion vectors of one block shall be modified to integer precision before being utilized for motion compensation, for example, if they are of fractional precision.
  • the stored motion vectors and those utilized for motion compensation may be in different precisions.
  • sub-pel precision (a.k.a. fractional precision, such as 1/4-pel, 1/16-pel) may be stored for blocks with certain block dimensions, but the motion compensation process is based on an integer version of those motion vectors (such as via rounding) .
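  • A minimal sketch of keeping two precisions per block, storing the fractional MV while motion compensation uses a rounded copy; the field and function names are illustrative.

#include <cstdio>

struct Mv { int x, y; }; // 1/16-pel units

struct BlockMotion {
  Mv stored;        // full fractional precision, e.g. kept for MV prediction
  Mv forMotionComp; // integer-pel rounded copy used by motion compensation
};

BlockMotion makeBlockMotion(Mv mv, bool roundForMc) {
  auto roundInt = [](int v) {
    int s = v >= 0 ? 1 : -1;
    return s * (((s * v + 8) >> 4) << 4);
  };
  BlockMotion bm;
  bm.stored = mv;
  bm.forMotionComp = roundForMc ? Mv{roundInt(mv.x), roundInt(mv.y)} : mv;
  return bm;
}

int main() {
  BlockMotion bm = makeBlockMotion({37, -21}, true);
  printf("stored (%d,%d), MC (%d,%d)\n", bm.stored.x, bm.stored.y,
         bm.forMotionComp.x, bm.forMotionComp.y); // stored (37,-21), MC (32,-16)
}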
  • an indication of disallowing bi-prediction for certain block dimensions may be signaled in sequence parameter set/picture parameter set/sequence header/picture header/tile header/tile group header/CTU rows/regions/other high-level syntax.
  • an indication of disallowing bi-prediction and/or uni-prediction for certain block dimensions may be signaled in sequence parameter set/picture parameter set/sequence header/picture header/tile header/tile group header/CTU rows/regions/other high-level syntax.
  • such indications may be only applied to certain modes, such as non-affine mode.
  • the signaling of AMVR indices may be modified accordingly, such as only integer-pel precisions are allowed, or different MV precisions may be utilized instead.
  • a conformance bitstream shall follow the rule that for certain block dimensions, only integer-pel motion vectors are allowed for bi-prediction coded blocks.
  • Signaling of AMVR flag may depend on whether fractional motion vectors are allowed for a block.
  • the flag indicating whether MV/MVD precision of the current block is 1/4-pel may be skipped and derived to be false implicitly.
  • the block dimensions mentioned above are, for example, 4x16, 16x4, 4x8, 8x4, 4x4.
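  • A minimal decoder-side sketch of this dependency: when fractional MV/MVD precision is disallowed for a block, the 1/4-pel flag is not parsed and is derived to be false. The dimension rule and the bit-reader interface are illustrative assumptions.

#include <cstdio>

// Hypothetical rule covering the dimensions mentioned above.
bool fractionalMvAllowed(int w, int h, bool biPredicted) {
  bool small = (w * h <= 32) || (w == 4 && h == 16) || (w == 16 && h == 4);
  return !(biPredicted && small);
}

// Parse or infer the "MV/MVD precision is 1/4-pel" flag.
bool parseQuarterPelFlag(bool (*readBit)(), int w, int h, bool biPredicted) {
  if (!fractionalMvAllowed(w, h, biPredicted))
    return false;   // flag skipped, derived to be false implicitly
  return readBit(); // otherwise read from the bitstream
}

static bool fakeReadBit() { return true; } // stand-in for an entropy decoder

int main() {
  printf("4x8 bi-predicted: quarter-pel = %d\n",
         parseQuarterPelFlag(fakeReadBit, 4, 8, true)); // inferred false
  printf("8x8 bi-predicted: quarter-pel = %d\n",
         parseQuarterPelFlag(fakeReadBit, 8, 8, true)); // read from bitstream
}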
  • Different interpolation filters may be used in interpolation depending on the dimension (e.g., width and/or height, ratios of width and height) of a block.
  • Different filters may be used for vertical interpolation and horizontal interpolation. For example, a shorter-tap filter may be applied for vertical interpolation compared to that for horizontal interpolation.
  • interpolation filters with fewer taps than the interpolation filters in VTM-3.0 may be applied in some cases. These interpolation filters with fewer taps are also called “short-tap filters” .
  • For some blocks, different filters (e.g., short-tap filters) may be used, i.e., a different filter from those used for other kinds of blocks may be selected.
  • the short-tap filters may be used only when both horizontal and vertical components of the MV are fractional, i.e., they point to a fractional pixel position instead of an integer pixel position.
  • Which filter to be used may depend on whether the current block is bi-predicted or uni-predicted.
  • the short-tap filters may be used only when the current block is bi-predicted.
  • Which filter to be used may depend on the prediction direction (e.g., from List 0 or List 1) and/or the associated motion vectors. In one example, for bi-predicted blocks, whether short-tap filters are used or not may be different for different prediction directions.
  • N and M may be different for bi-predicted blocks and uni-predicted blocks.
  • N and M may be different for different block sizes (width or/and height or/and width *height) .
  • N is equal to 4 and M is equal to 4.
  • N is equal to 4 and M is equal to 3.
  • N is equal to 4 and M is equal to 2.
  • N is equal to 4 and M is equal to 1.
  • N is equal to 3 and M is equal to 3.
  • N is equal to 3 and M is equal to 2.
  • N is equal to 3 and M is equal to 1.
  • N is equal to 2 and M is equal to 2.
  • N is equal to 2 and M is equal to 1.
  • N is equal to 1 and M is equal to 1.
  • N is equal to 2 and M is equal to 2.
  • N is equal to 2 and M is equal to 1.
  • N is equal to 1 and M is equal to 1.
  • K of the M MV components use an S1-tap filter and the remaining M - K MV components use an S2-tap filter.
  • S1 is equal to 6 and S2 is equal to 4.
  • different filters may be used only for some pixels. For example, they are used only for boundary pixels of the block.
  • short-tap filters may be different for uni-predicted blocks and bi-predicted blocks.
  • short-tap filters may be different for different color components such as Y, Cb and Cr.
  • whether to and how to apply short-tap filters may depend on color formats such as 4:2:0, 4:2:2 or 4:4:4.
  • Different short-tap filters may be used for different blocks.
  • the selected short-tap filters may depend on the block size (or width, height) , block shapes, prediction direction etc.
  • A 7-tap filter is used for horizontal and vertical interpolation of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks.
  • A 7-tap filter is used for horizontal (or vertical) interpolation of 4x4 uni-predicted or/and bi-predicted luma blocks.
  • A 6-tap filter is used for horizontal and vertical interpolation of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
  • A 6-tap filter and a 5-tap filter are used in horizontal interpolation and vertical interpolation respectively for 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
  • Different short-tap filters may be used for different kinds of motion vectors.
  • longer tap length filters may be used for motion vectors that only have fractional components in one direction (i.e., either the horizontal or the vertical direction).
  • shorter tap length filters may be used for motion vectors that have fractional components in both the horizontal and vertical directions.
  • an 8-tap filter is used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one direction, and the short-tap filters described above are used for such blocks that have fractional MV components in both directions.
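The bullets above amount to a lookup from block dimensions, prediction type and fractional-MV pattern to a filter length. A minimal sketch, assuming the example rules listed above (7-tap for 4x16/16x4 and 4x4, 6-tap for 4x8/8x4, and the regular 8-tap filter when at most one MV component is fractional); this is one possible configuration, not a normative table.

    def filter_taps(width, height, bi_predicted, frac_x, frac_y):
        # Regular 8-tap luma filter (as in VTM-3.0) when at most one MV
        # component is fractional.
        if not (frac_x and frac_y):
            return 8
        size = (width, height)
        if size in {(4, 16), (16, 4)}:
            return 7   # example rule above for 4x16/16x4 luma blocks
        if size in {(4, 8), (8, 4)}:
            return 6   # example rule above for 4x8/8x4 luma blocks
        if size == (4, 4):
            return 7   # example rule above for 4x4 luma blocks
        return 8       # all other blocks keep the regular filter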
  • interpolation filters used for affine motion may be different from those used for translational motion vectors.
  • interpolation filters with fewer taps may be used for affine motion than for translational motion vectors.
  • the short-tap filters may not be applied on sub-block prediction, such as affine prediction.
  • the short-tap filters may be applied on sub-block prediction, such as ATMVP prediction.
  • each sub-block is treated as a coding block to judge whether and how to apply short-tap filters.
  • whether to apply short-tap filters and/or how to apply short-tap filters may depend on the block dimension, coded information, etc.
  • when the corresponding conditions are met, short-tap filters may be applied.
  • padding or derivation from fetched reference samples may be applied.
  • pixels at the reference block boundaries are repeated to generate a (W + N – 1) * (H + N – 1) block, which is used for the final interpolation.
  • the fetched reference pixels may be identified by (x + MVXInt – N/2 + offSet1, y + MVYInt – N/2 + offSet2), wherein (x, y) is the top-left position of the current block, (MVXInt, MVYInt) is the integer part of the MV, and offSet1 and offSet2 are integers such as –2, –1, 0, 1, 2, etc.
  • PH is zero, and only left or/and right boundaries are repeated.
  • PW is zero, and only top or/and bottom boundaries are repeated.
  • both PW and PH are greater than zero, and first the left or/and the right boundaries are repeated, and then the top or/and bottom boundaries are repeated.
  • both PW and PH are greater than zero, and first the top or/and bottom boundaries are repeated, and then the left or/and right boundaries are repeated.
  • when M1 (or PW – M1) is greater than 1, instead of repeating the first left (or right) column M1 times, multiple columns may be utilized; for example, the M1 left columns (or PW – M1 right columns) may be repeated.
  • when M2 (or PH – M2) is greater than 1, instead of repeating the first top (or bottom) row M2 times, multiple rows may be utilized; for example, the M2 top rows (or PH – M2 bottom rows) may be repeated.
  • some default values may be used for boundary padding.
  • the boundary pixel repeating method may be used only when both horizontal and vertical components of the MV are fractional, i.e., they point to fractional pixel positions instead of integer pixel positions.
  • the boundary pixel repeating method may be applied to some or all reference blocks.
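A minimal sketch of the boundary-repetition idea above, assuming the fetched block is a 2-D numpy array that is PW columns and PH rows short of the (W + N – 1) * (H + N – 1) block an N-tap filter needs. Here the right column and bottom row are repeated; the other variants above (left/top, or multiple columns and rows) follow the same pattern.

    import numpy as np

    def pad_reference(fetched, pw, ph):
        # fetched: (H + N - 1 - PH) x (W + N - 1 - PW) array of reference pixels
        padded = fetched
        for _ in range(pw):
            padded = np.hstack([padded, padded[:, -1:]])  # repeat right column
        for _ in range(ph):
            padded = np.vstack([padded, padded[-1:, :]])  # repeat bottom row
        return padded  # (H + N - 1) x (W + N - 1)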
  • N and M may be different for bi-predicted blocks and uni-predicted blocks.
  • N and M may be different for different block sizes (width or/and height or/and width * height).
  • N is equal to 4 and M is equal to 4, or N is equal to 4 and M is equal to 3, or N is equal to 4 and M is equal to 2, or N is equal to 4 and M is equal to 1, or N is equal to 3 and M is equal to 3, or N is equal to 3 and M is equal to 2, or N is equal to 3 and M is equal to 1, or N is equal to 2 and M is equal to 2, or N is equal to 2 and M is equal to 1, or N is equal to 1 and M is equal to 1.
  • N is equal to 2 and M is equal to 2, or N is equal to 2 and M is equal to 1, or N is equal to 1 and M is equal to 1.
  • Different boundary pixel repeating methods may be used for the M MV components.
  • PW and/or PH may be different for different color components such as Y, Cb and Cr.
  • boundary pixel repeating may depend on color formats such as 4:2:0, 4:2:2 or 4:4:4.
  • PW and/or PH may be different for different block size or shape.
  • PW and PH are set equal to 1 for 4x16 or/and 16x4 bi-predicted or/and uni-predicted blocks.
  • PW and PH are set equal to 0 and 1 (or 1 and 0) , respectively, for 4x4 bi-predicted or/and uni-predicted blocks.
  • PW and PH are set equal to 2 for 4x8 or/and 8x4 bi-predicted or/and uni-predicted blocks.
  • PW and PH are set equal to 2 and 3 (or 3 and 2) respectively for 4x8 or/and 8x4 bi-predicted or/and uni-predicted blocks.
  • PW and PH may be different for uni-prediction and bi-prediction.
  • PW and PH may be different for different kinds of motion vectors.
  • PW and PH may be smaller (or even zero) for motion vectors that only have fractional components in one direction (i.e., either the horizontal or the vertical direction), and they may be larger for motion vectors that have fractional components in both the horizontal and vertical directions.
  • PW and PH are set equal to 0 for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one direction, and the PW and PH described above are used for such blocks that have fractional MV components in both directions.
  • Figure 21 shows an example of repeating boundary pixels of a reference block before interpolation.
  • the proposed methods may be applied to certain modes, block sizes/shapes, and/or certain sub-block sizes.
  • the proposed methods may be applied to certain modes, such as bi-predicted mode.
  • the proposed methods may be applied to certain block sizes.
  • the proposed methods may be applied to certain color component (such as only luma component) .
  • Shift (x, s) is defined as Shift (x, s) = (x + off) >> s, where off is an integer such as 0 or 2^(s – 1).
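As a worked example of the definition above, with off = 2^(s – 1) the shift implements round-to-nearest; a quarter-pel MV component (s = 2) can be rounded to integer-pel precision as sketched below (illustrative only, not any codec's actual implementation).

    def shift(x, s, off):
        # Shift(x, s) = (x + off) >> s, with off = 0 or 1 << (s - 1)
        return (x + off) >> s

    def round_to_integer_pel(mv_qpel):
        # Quarter-pel units: s = 2 and off = 1 << 1 round to the nearest
        # integer pel; shifting back keeps the result in quarter-pel units.
        s = 2
        return shift(mv_qpel, s, 1 << (s - 1)) << s

    # Example: 6 quarter-pel (1.5 pel) rounds to 8 quarter-pel (2.0 pel).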
  • The rounding operation may be defined as that used for motion vector rounding in the AMVR process, the affine process or other process modules.
  • how to round the MVs may depend on the MV components.
  • the y-component of the MV is rounded to integer-pel precision but the x-component of the MV is not rounded.
  • the MV may be rounded to integer pixels before motion compensation for the luma component, but rounded to 2-pel precision before motion compensation for the chroma components when the color format is 4:2:0.
  • a bilinear filter may be used to perform interpolation filtering in one or more specific cases.
  • a short-tap or second interpolation filter may be applied to a reference picture list which involves multiple reference blocks, while for another reference picture list with only one reference block, the same filter as that used for the normal prediction mode may be applied.
  • the proposed method may be applied under certain conditions, such as certain temporal layer (s), or the quantization parameter of a block/a tile/a slice/a picture containing the block being within a range (such as larger than a threshold).
  • FIG. 17 is a block diagram of a video processing apparatus 1700.
  • the apparatus 1700 may be used to implement one or more of the methods described herein.
  • the apparatus 1700 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on.
  • the apparatus 1700 may include one or more processors 1702, one or more memories 1704 and video processing hardware 1706.
  • the processor (s) 1702 may be configured to implement one or more methods described in the present document.
  • the memory (memories) 1704 may be used for storing data and code used for implementing the methods and techniques described herein.
  • the video processing hardware 1706 may be used to implement, in hardware circuitry, some techniques described in the present document.
  • FIG. 19 is a flowchart for a method 1900 of video bitstream processing.
  • the method 1900 includes determining (1905) a shape of a video block, determining (1910) an interpolation order based on the video block, the interpolation order being indicative of a sequence of performing horizontal interpolation and vertical interpolation, and performing the horizontal interpolation and the vertical interpolation in accordance with the interpolation order for the video block to reconstruct (1915) a decoded representation of the video block.
  • FIG. 20 is a flowchart for a method 2000 of video bitstream processing.
  • the method 2000 includes determining (2005) characteristics of a motion vector related to a video block, determining (2010) an interpolation order of the video block based on the characteristics of the motion vector, the interpolation order being indicative of a sequence of performing horizontal interpolation and vertical interpolation, and performing the horizontal interpolation and the vertical interpolation in accordance with the interpolation order for the video block to reconstruct (2015) a decoded representation of the video block.
  • FIG. 22 is a flowchart for a method 2200 of video bitstream processing.
  • the method 2200 includes determining (2205) dimension characteristics of a first video block, determining (2210) that a first interpolation filter is to be applied to the first video block based on the determination of the dimension characteristics, and performing (2215) further processing of the first video block using the first interpolation filter.
  • FIG. 23 is a flowchart for a method 2300 of video bitstream processing.
  • the method 2300 includes determining (2305) first characteristics of a first video block, determining (2310) that a first interpolation filter is to be applied to the first video block based on the determination of the first characteristics, performing (2315) further processing of the first video block using the first interpolation filter, determining (2320) second characteristics of a second video block, determining (2325) that a second interpolation filter is to be applied to the second video block based on the second characteristics, the first interpolation filter and the second interpolation filter being different short-tap filters, and performing (2330) further processing of the second video block using the second interpolation filter.
  • For example, as described in Section 4, under different shapes of the video block, a preference may be given to performing one of the horizontal interpolation or vertical interpolation first.
  • the horizontal interpolation is performed before the vertical interpolation, and in some embodiments the vertical interpolation is performed before the horizontal interpolation.
  • the video block may be encoded in the video bitstream in which bit efficiency may be achieved by using a bitstream generation rule related to interpolation orders that also depends on the shape of the video block.
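One way to realize the shape-dependent interpolation order above is to pick which 1-D pass runs first from the block's width and height. The sketch below shows the control flow only; horizontal_filter and vertical_filter stand for assumed 1-D interpolation routines, and the width-versus-height rule is an illustrative choice, not the normative one.

    def interpolate(block, width, height, horizontal_filter, vertical_filter):
        if width >= height:
            # e.g., horizontal interpolation first for wide blocks
            return vertical_filter(horizontal_filter(block))
        # and vertical interpolation first for tall blocks
        return horizontal_filter(vertical_filter(block))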
  • the methods can include wherein rounding the motion vectors includes one or more of: rounding to a nearest integer-pel precision MV, or rounding to a half-pel precision MV.
  • the methods can include wherein rounding the MVs includes one or more of: rounding down, rounding up, rounding towards zero, or rounding away from zero.
  • the methods can include wherein the dimension information represents that a size of the first video block is less than a threshold value, and rounding the MVs is applied to one or both of a horizontal MV component or a vertical MV component based on the dimension information representing that the size of the first video block is less than the threshold value.
  • the methods can include wherein the dimension information represents that a width or a height of the first video block is less than a threshold value, and rounding the MVs is applied to one or both of a horizontal MV component or a vertical MV component based on the dimension information representing that the width or the height of the first video block is less than the threshold value.
  • the methods can include wherein the threshold value is different for bi-predicted blocks and uni-predicted blocks.
  • the methods can include wherein the dimension information represents a ratio between a width and a height of the first video block is larger than a first threshold value or smaller than a second threshold value, and wherein the rounding of the MVs is based on the determination of the dimension information.
  • the methods can include wherein rounding the MVs is further based on both horizontal and vertical components of the MVs being fractional.
  • the methods can include wherein rounding the MVs is further based on the first video block being bi-predicted or uni-predicted.
  • the methods can include wherein rounding the MVs is further based on a prediction direction related to the first video block.
  • the methods can include wherein rounding the MVs is further based on color components of the first video block.
  • the methods can include wherein rounding the MVs is further based on a size of the first video block, a shape of the first video block, or a prediction shape of the first video block.
  • the methods can include wherein rounding the MVs is applied on sub-block prediction.
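The rounding modes mentioned above can be sketched as follows for a single MV component stored in quarter-pel units, so that integer-pel rounding snaps values to multiples of 4 and half-pel rounding to multiples of 2. Illustrative only.

    import math

    def round_component(v, unit=4, mode="nearest"):
        # v: MV component in quarter-pel units; unit=4 -> integer-pel,
        # unit=2 -> half-pel.
        q = v / unit
        if mode == "nearest":
            q = math.floor(q + 0.5)
        elif mode == "down":
            q = math.floor(q)
        elif mode == "up":
            q = math.ceil(q)
        elif mode == "towards_zero":
            q = math.trunc(q)
        elif mode == "away_from_zero":
            q = math.ceil(q) if q >= 0 else math.floor(q)
        return int(q) * unit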
  • the methods can include wherein a short-tap filter is applied to MV components based on the MV components having fractional precision.
  • the methods can include wherein short-tap filters are applied based on a dimension of the first video block, or coded information of the first video block.
  • the methods can include wherein short-tap filters are applied based on a mode of the first video block.
  • the methods can include wherein default values are used for boundary padding related to the first video block.
  • the methods can include wherein the merge mode is one or more of: a regular merge list, a triangular merge list, an affine merge list, or other non-intra or non-AMVP mode.
  • the methods can include wherein merge candidates with fractional motion vectors are excluded from a merge list.
  • the methods can include wherein rounding the motion information includes rounding a merge candidate associated with fractional motion vectors to integer precision, and the modified motion information is inserted into a merge list.
  • the methods can include wherein the motion information is a bi-prediction candidate.
  • the methods can include wherein MMVD is merge mode with motion vector difference.
  • the methods can include wherein the motion vectors are in MMVD mode.
  • the methods can include wherein the first video block is an MMVD coded block to be associated with integer-pel precision, and wherein base merge candidates used in MMVD are modified to integer-pel precision via rounding.
  • the methods can include wherein the first video block is an MMVD coded block to be associated with half-pel precision, and wherein base merge candidates used in MMVD are modified to half-pel precision via rounding.
  • the methods can include wherein the threshold number is a maximum number of allowed half-pel MV components or quarter-pel MV components.
  • the methods can include wherein the threshold number is different between bi-prediction and uni-prediction.
  • the methods can include wherein an indication disallowing bi-prediction is signaled in a sequence parameter set, a picture parameter set, a sequence header, a picture header, a tile header, a tile group header, a CTU row, a region, or other high-level syntax.
  • the methods can include wherein the methods are in conformance with a bitstream rule that allows for only integer-pel motion vectors for bi-prediction coded blocks having particular dimensions.
  • the methods can include wherein the first video block has a size of: 4x16, 16x4, 4x8, 8x4, or 4x4.
  • the methods can include wherein modifying or rounding the motion information includes modifying different MV components differently.
  • the methods can include wherein a y-component of a first MV is modified or rounded to integer-pixel, and an x-component of the first MV is not modified or rounded.
  • the methods can include wherein a luma component of a first MV is rounded to integer pixels, and a chroma component of the first MV is rounded to 2-pel precision.
  • the methods can include wherein the first MV is related to a video block having a color format that is 4:2:0.
  • the methods can include wherein the bilinear filter is used for 4x4 uni-prediction, 4x8 bi-prediction, 8x4 bi-prediction, 4x16 bi-prediction, 16x4 bi-prediction, 8x8 bi-prediction, 8x4 uni-prediction, or 4x8 uni-prediction.
  • FIG. 24 is a flowchart for a method 2400 of video processing.
  • the method 2400 includes determining (2402), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining (2404) filters with interpolation filter parameters used for interpolation of the first block based on the characteristics of the first block; and performing (2406) the conversion by using the filters with the interpolation filter parameters.
  • the interpolation filter parameters include filter taps and/or interpolation filter coefficients, and the interpolation includes at least one of vertical interpolation and horizontal interpolation.
  • the filters include short-tap filters with fewer taps than regular interpolation filters.
  • the regular interpolation filters have 8 taps.
  • the characteristics of the first block include dimension parameters including at least one of width, height, a ratio of width to height, and a size of width*height of the first block.
  • the filter used for the vertical interpolation is different from the filter used for the horizontal interpolation in number of taps.
  • the filter used for the vertical interpolation has fewer taps than the filter used for the horizontal interpolation.
  • the filter used for the horizontal interpolation has fewer taps than the filter used for the vertical interpolation.
  • the short-tap filters are used for the horizontal interpolation or/and the vertical interpolation.
  • the short-tap filters are used for the horizontal interpolation, or when the height of the first block is smaller than and/or equal to a threshold, the short-tap filters are used for the vertical interpolation.
  • the short-tap filters are used for the vertical interpolation and/or horizontal interpolation.
  • the characteristics of the first block include at least one motion vector (MV) associated with the first block.
  • the short-tap filters are used for the interpolation.
  • the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
  • whether the short-tap filters are used or not depends on the prediction parameter.
  • the short-tap filters are used for the interpolation.
  • the characteristics of the first block include a prediction direction indicating prediction from List 0 or List 1 and/or associated motion vectors (MVs).
  • whether the short-tap filters are used or not depends on prediction direction of the first block and/or the MVs.
  • the first block is a bi-predicted block
  • whether the short-tap filters are used or not is different for different prediction directions.
  • the short-tap filters are used for the prediction direction X; otherwise, the short-tap filters are not used.
  • N and M are different for bi-predicted blocks and uni-predicted blocks.
  • N is equal to 4 and M is equal to 4, or N is equal to 4 and M is equal to 3, or N is equal to 4 and M is equal to 2, or N is equal to 4 and M is equal to 1, or N is equal to 3 and M is equal to 3, or N is equal to 3 and M is equal to 2, or N is equal to 3 and M is equal to 1, or N is equal to 2 and M is equal to 2, or N is equal to 2 and M is equal to 1, or N is equal to 1 and M is equal to 1.
  • N is equal to 2 and M is equal to 2, or N is equal to 2 and M is equal to 1, or N is equal to 1 and M is equal to 1.
  • the short-tap filters include first short-tap filters with S1 taps and second short-tap filters with S2 taps, and wherein K MV components of the M MV components use the first short-tap filters, and (M – K) MV components of the M MV components use the second short-tap filters, wherein K is an integer in a range from 0 to M – 1, and S1 and S2 are integers.
  • N and M are different for different dimension parameters of blocks, wherein the dimension parameters include width or/and height or/and width * height of the blocks.
  • the characteristics of the first block include positions of the pixels of the first block.
  • whether the short-tap filters are used or not depends on the position of the pixels.
  • the short-tap filters are used only for boundary pixels of the first block.
  • the short-tap filters are used only for N1 right columns or/and N2 left columns or/and N3 top rows or/and N4 bottom rows of the first block, N1, N2, N3, N4 being integers.
  • the characteristics of the first block include color components of the first block.
  • whether the short-tap filters are used or not is different for different color components of the first block.
  • the color components include Y, Cb and Cr.
  • the characteristics of the first block include color formats of the first block.
  • whether to and how to apply the short-tap filters depend on color formats of the first block.
  • the color formats include 4:2:0, 4:2:2 or 4:4:4.
  • the filters include different short-tap filters with different taps, and selection of the different short-tap filters is based on the characteristics of the blocks.
  • a 7-tap filter is selected for horizontal and vertical interpolation of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks.
  • a 7-tap filter is selected for horizontal or vertical interpolation of 4x4 uni-predicted or/and bi-predicted luma blocks.
  • a 6-tap filter is selected for horizontal and vertical interpolation of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
  • a 6-tap filter and a 5-tap filter or a 5-tap filter and a 6-tap filter are selected for horizontal interpolation and vertical interpolation respectively for 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
  • the filters include different short-tap filters with different taps, and the different short-tap filters are used for different kinds of motion vectors (MVs).
  • longer tap length filters from the different short-tap filters are used for MVs that only have fractional components in one of horizontal or vertical direction, and shorter tap length filters from the different short-tap filters are used for MVs that have fractional components in both horizontal and vertical directions.
  • an 8-tap filter is used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one of the horizontal or vertical directions.
  • short-tap filters are used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that have fractional MV components in both directions.
  • filters used for affine motion are different from those used for translational motion vectors.
  • filters used for affine motion have fewer taps compared to those used for translational motion vectors.
  • the short-tap filters are not applied to sub-block based prediction including affine prediction.
  • the short-tap filters are applied to sub-block based prediction including Advanced Temporal Motion Vector Prediction (ATMVP) prediction.
  • each sub-block is used as a coding block to determine whether to and how to apply the short-tap filters.
  • the characteristics of the first block include dimension parameters and coded information of the first block, and whether to and how to apply the short-tap filters depend on the block dimension and coded information of the first block.
  • when the corresponding conditions are met, the short-tap filters are applied.
  • the conversion generates the first/second block of video from the bitstream representation.
  • the conversion generates the bitstream representation from the first/second block of video.
  • FIG. 25 is a flowchart for a method 2500 of video processing.
  • the method 2500 includes fetching (2502), for a conversion between a first block of video and a bitstream representation of the first block, reference pixels of a first reference block from a reference picture, wherein the first reference block is smaller than a second reference block required for motion compensation of the first block; padding (2504) the first reference block with padding pixels to generate the second reference block required for motion compensation of the first block; and performing (2506) the conversion by using the generated second reference block.
  • the first block has a size of W*H.
  • the first reference block has a size of (W + N – 1 – PW) * (H + N – 1 – PH).
  • the second reference block has a size of (W + N – 1) * (H + N – 1), wherein W is the width of the first block, H is the height of the first block, N is the number of interpolation filter taps used for the first block, and PW and PH are integers.
  • the step of padding the first reference block with padding pixels to generate the second reference block includes: repeating pixels at one or more boundaries of the first reference block as the padding pixels to generate the second reference block.
  • the boundaries are the top, left, bottom and right boundaries of the first reference block.
  • the pixels at the top, left and right boundary are repeated once, and the pixels at the bottom boundary are repeated twice.
  • the fetched reference pixels are identified by (x + MVXInt – N/2 + offSet1, y + MVYInt – N/2 + offSet2), wherein (x, y) is the top-left position of the first block, (MVXInt, MVYInt) is the integer part of the motion vector (MV) for the first block, and offSet1 and offSet2 are integers.
  • when PH is zero, only the pixels at the left or/and right boundaries of the first reference block are repeated.
  • both PW and PH are greater than zero, first the pixels at the left or/and the right boundaries of the first reference block are repeated, and then the pixels at the top or/and bottom boundaries of the first reference block are repeated, or first the top or/and bottom boundaries of the first reference block are repeated, and then the left or/and right boundaries of the first reference block are repeated.
  • the pixels of M1 left columns of the first reference block, or the pixels of (PW – M1) right columns of the first reference block, are repeated, wherein M1 > 1 or PW – M1 > 1.
  • the pixels of M2 top rows of the first reference block, or the pixels of (PH – M2) bottom rows of the first reference block, are repeated, wherein M2 > 1 or PH – M2 > 1.
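Putting the size and position formulas above together, the reduced fetch region can be computed as in the sketch below. All names are illustrative, and offSet1/offSet2 default to zero, one of the small integer offsets mentioned earlier.

    def fetch_region(x, y, mvx_int, mvy_int, w, h, n, pw, ph, off1=0, off2=0):
        # Top-left corner of the fetched pixels and the reduced fetch size
        # (W + N - 1 - PW) x (H + N - 1 - PH) for an N-tap filter.
        left = x + mvx_int - n // 2 + off1
        top = y + mvy_int - n // 2 + off2
        return left, top, w + n - 1 - pw, h + n - 1 - ph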
  • pixels at one or more boundaries of the first reference block are repeated as the padding pixels to generate the second reference block.
  • the first reference block is any one of some or all of the reference blocks of the first block.
  • pixels at one or more boundaries of the first reference block are repeated as the padding pixels to generate the second reference block for prediction direction X; otherwise, the pixels are not repeated.
  • N2 and M are different for bi-predicted blocks and uni-predicted blocks.
  • N2 and M are different for different block sizes, the block size being associated with width or/and height or/and width * height of the block.
  • N2 is equal to 4 and M is equal to 4, or N2 is equal to 4 and M is equal to 3, or N2 is equal to 4 and M is equal to 2, or N2 is equal to 4 and M is equal to 1, or N2 is equal to 3 and M is equal to 3, or N2 is equal to 3 and M is equal to 2, or N2 is equal to 3 and M is equal to 1, or N2 is equal to 2 and M is equal to 2, or N2 is equal to 2 and M is equal to 1, or N2 is equal to 1 and M is equal to 1.
  • N2 is equal to 2 and M is equal to 2, or N2 is equal to 2 and M is equal to 1, or N2 is equal to 1 and M is equal to 1.
  • pixels at different boundaries of the first reference block are repeated as the padding pixels in different ways to generate the second reference block for the M MV components.
  • PW is set equal to zero when fetching the first reference block using the MV.
  • PH is set equal to zero when fetching the first reference block using the MV.
  • PW and/or PH are different for different color components of the first block.
  • the color components include Y, Cb and Cr.
  • PW and/or PH are different for different block size or shape.
  • PW and PH are set equal to 1 for 4x16 or/and 16x4 bi-predicted or/and uni-predicted blocks.
  • PW and PH are set equal to 0 and 1, or 1 and 0 respectively, for 4x4 bi-predicted or/and uni-predicted blocks.
  • PW and PH are set equal to 2 for 4x8 or/and 8x4 bi-predicted or/and uni-predicted block.
  • PW and PH are set equal to 2 and 3, or 3 and 2 respectively, for 4x8 or/and 8x4 bi-predicted or/and uni-predicted blocks.
  • PW and PH are different for uni-prediction and bi-prediction.
  • PW and PH are different for different kinds of motion vectors.
  • PW and PH are set to a smaller value or equal to zero for motion vectors (MVs) that only have fractional components in one of horizontal or vertical direction, and PW and PH are set to a larger value for MVs that have fractional components in both horizontal and vertical directions.
  • MVs motion vectors
  • PW and PH are set equal to 0 for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one of horizontal or vertical direction.
  • the PW and PH described above are used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that have fractional MV components in both the horizontal and vertical directions.
  • whether to and how to repeat pixels at the boundaries depend on color formats of the first block.
  • the color formats include 4:2:0, 4:2:2 or 4:4:4.
  • the step of padding the first reference block with padding pixels to generate the second reference block includes: padding default values as the padding pixels to generate the second reference block.
  • the conversion generates the first block of video from the bitstream representation.
  • the conversion generates the bitstream representation from the first block of video.
  • FIG. 26 is a flowchart for a method 2600 of video processing.
  • the method 2600 includes determining (2602), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing (2604) a rounding process on a motion vector (MV) of the first block based on the characteristics of the first block; and performing (2606) the conversion by using the rounded MV.
  • performing the rounding process on the MV includes rounding the MV to integer-pel precision or half-pel precision.
  • the MV is rounded to a nearest integer-pel precision MV or half-pel precision MV.
  • performing the rounding process on the MV includes rounding up, rounding down, rounding towards zero or rounding away from zero of the MV.
  • the characteristics of the first block include dimension parameters including at least one of width, height, a ratio of width to height, and a size of width*height of the first block.
  • the rounding process is performed on the horizontal or/and vertical component of the MV.
  • the rounding process is performed on the horizontal component of the MV, or, when the height of the first block is smaller than and/or equal to the second threshold L1, the rounding process is performed on the vertical component of the MV.
  • the thresholds L and L1 are different for bi-predicted blocks and uni-predicted blocks.
  • the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
  • whether the rounding process is performed on the MV depends on the prediction parameter.
  • the rounding process is performed on the MV.
  • the characteristics of the first block include a prediction direction indicating prediction from List 0 or List 1 and/or the associated MVs.
  • whether the rounding process is performed on the MV depends on the prediction direction of the first block and/or the MVs.
  • the first block is a bi-predicted block, and whether the rounding process is performed on the MV is different for different prediction directions.
  • N is an integer in a range from 0 to 2; otherwise, the rounding process is not performed.
  • N1 and M are different for bi-predicted blocks and uni-predicted blocks.
  • N1 is equal to 4 and M is equal to 4, or N1 is equal to 4 and M is equal to 3, or N1 is equal to 4 and M is equal to 2, or N1 is equal to 4 and M is equal to 1, or N1 is equal to 3 and M is equal to 3, or N1 is equal to 3 and M is equal to 2, or N1 is equal to 3 and M is equal to 1, or N1 is equal to 2 and M is equal to 2, or N1 is equal to 2 and M is equal to 1, or N1 is equal to 1 and M is equal to 1.
  • N1 is equal to 2 and M is equal to 2, or N1 is equal to 2 and M is equal to 1, or N1 is equal to 1 and M is equal to 1.
  • N1 and M are different for different dimension parameters including at least one of width, height, a ratio of width to height, and a size of width*height of the first block.
  • K MV components of the M MV components are rounded to integer-pel precision and M – K MV components are rounded to half-pel precision, wherein K is an integer in a range from 0 to M – 1.
  • the characteristics of the first block include color components of the first block.
  • whether the rounding process is performed on the MV is different for different color components of the first block.
  • the color components include Y, Cb and Cr.
  • the characteristics of the first block include color formats of the first block.
  • whether the rounding process is performed on the MV depends on the color formats of the first block.
  • the color formats include 4:2:0, 4:2:2 or 4:4:4.
  • whether and/or how to perform the rounding process on the MV may depend on the characteristics of the block.
  • one or more MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks are rounded to half-pel precision.
  • one or more MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks are rounded to integer-pel precision.
  • one or more MV components of 4x4 uni-predicted or/and bi-predicted luma blocks are rounded to integer-pel precision.
  • one or more MV components of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks are rounded to integer-pel precision.
  • the characteristics of the first block include whether the first block is coded with a sub-block based prediction method, including affine prediction mode and Sub-block based Temporal Motion Vector Prediction (SbTMVP) mode.
  • the rounding process on the MV is not applied if the first block is coded with affine prediction mode.
  • the rounding process on the MV is applied if the first block is coded with SbTMVP mode, and the rounding process is performed for each sub-block of the first block.
  • performing the rounding process on the motion vector (MV) of the first block based on the characteristics of the first block comprises: determining whether at least one MV of the first block has fractional precision when the dimension parameters of the first block satisfy a predetermined rule; and in response to the determination that the at least one MV of the first block has fractional precision, performing the rounding process on the at least one MV to generate rounded MVs having integer precision.
  • the bitstream representation of the first block follows a rule depending on the dimension parameters of the first block, wherein only integer-pel MVs are allowed for bi-prediction coded blocks.
  • the dimension parameters of the first block are 4x16, 16x4, 4x8, 8x4, or 4x4.
  • the performing the conversion by using the rounded MV comprises: performing motion compensation for the first block by using the rounded MVs.
  • FIG. 27 is a flowchart for a method 2700 of video processing.
  • the method 2700 includes determining (2702), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing (2704) motion compensation for the first block using an MV with a first precision; and storing (2706) an MV with a second precision for the first block; wherein the first precision is different from the second precision.
  • the characteristics of the first block include dimension parameters including at least one of width, height, a ratio of width to height, and a size of width*height of the first block.
  • the first precision is integer precision and the second precision is fractional precision.
  • FIG. 28 is a flowchart for a method 2800 of video processing.
  • the method 2800 includes determining (2802), for a conversion between a first block of video and a bitstream representation of the first block, a coding mode of the first block; performing (2804) a rounding process on a motion vector (MV) of the first block if the coding mode of the first block satisfies a predetermined rule; and performing (2806) the motion compensation of the first block by using the rounded MV.
  • the predetermined rule comprises: the first block is coded with merge mode, non-intra modes or non-Advanced motion vector prediction (AMVP) mode.
  • FIG. 29 is a flowchart for a method 2900 of video processing.
  • the method 2900 includes generating (2902), for a conversion between a first block of video and a bitstream representation of the first block, a first motion vector (MV) candidate list for the first block; performing (2904) a rounding process on the MV of at least one candidate before adding the at least one candidate into the first MV candidate list; and performing (2906) the conversion by using the first MV candidate list.
  • the first block is coded with merge mode, non-intra modes or non-Advanced motion vector prediction (AMVP) mode
  • the MV candidate list includes a merge candidate list and a non-merge candidate list.
  • the candidates with fractional MVs are excluded from the first MV candidate list.
  • the at least one candidate comprises: a candidate derived from a spatial block, a candidate derived from a temporal block, a candidate derived from a history-based motion vector prediction (HMVP) table, or a pairwise bi-prediction merge candidate.
  • the method further comprises: providing a separate HMVP table to store the candidates with MV of integer precision.
  • the method further comprises: performing the rounding process on the MV or the rounding process on the MV of a candidate in the candidate list based on characteristics of the first block.
  • the characteristics of the first block includes dimension parameters including at least one of width, height, a ratio of width and height, a size of width*height of the first block.
  • the dimension parameters include at least one of 4x16, 16x4, 4x8, 8x4, 4x4.
  • the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
  • performing the rounding process on the MV comprises: performing the rounding process on the MV or on the MV of a candidate in the candidate list only when the candidate is a bi-prediction candidate.
  • the first block is coded with AMVP mode
  • the candidate is an AMVP candidate.
  • the first block is coded with a non-affine mode.
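A minimal sketch of the idea behind method 2900: candidate MVs are rounded (here, quarter-pel to nearest integer-pel) before being inserted into the list, with simple pruning of duplicates created by the rounding. All names are illustrative; this is not the actual merge list construction of any codec.

    def round_qpel_to_int(v):
        # (v + 2) >> 2 rounds a quarter-pel value to the nearest integer pel
        # (ties resolved by the shift's floor behavior); shifting back keeps
        # quarter-pel units.
        return ((v + 2) >> 2) << 2

    def build_rounded_candidate_list(raw_candidates, max_size):
        candidate_list = []
        for mvx, mvy in raw_candidates:
            rounded = (round_qpel_to_int(mvx), round_qpel_to_int(mvy))
            if rounded not in candidate_list:   # prune duplicates
                candidate_list.append(rounded)
            if len(candidate_list) == max_size:
                break
        return candidate_list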
  • FIG. 30 is a flowchart for a method 3000 of video processing.
  • the method 3000 includes determining (3002), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining (3004) a constraint parameter to be applied to the first block based on the characteristics of the first block, wherein the constraint parameter constrains a maximum number of fractional motion vector (MV) components of the first block; and performing (3006) the conversion by using the constraint parameter.
  • the MV components include at least one of a horizontal MV component and/or a vertical MV component.
  • the fractional MV components include at least one of half-pel MV components, quarter-pel MV components, or MV components with finer precision than quarter-pel.
  • the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
  • the constraint parameter is different for bi-prediction and uni-prediction.
  • the constraint parameter is not applied in uni-prediction.
  • the constraint parameter is applied when the first block is a bi-predicted 4x8, 8x4, 4x16, or 16x4 block.
  • the constraint parameter is not applied when the first block is a uni-predicted 4x8, 8x4, 4x16 or 16x4 block.
  • the constraint parameter is applied when the first block is a uni-predicted 4x4 or a bi-predicted 4x4 block.
  • the maximum number of the fractional MV components is 3, 2, 1 or 0.
  • the maximum number of the fractional MV components is 1 or 0.
  • the maximum number of the quarter-pel MV components is 3, 2, 1 or 0.
  • the maximum number of the quarter-pel MV components is 1 or 0.
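The constraint above can be read as a simple count, sketched below for MVs in quarter-pel units: the number of fractional components over all prediction directions must not exceed the maximum allowed by the constraint parameter. Illustrative only.

    def count_fractional_components(mvs):
        # mvs: list of (mvx, mvy) pairs in quarter-pel units, one pair per
        # prediction direction (two pairs for bi-prediction).
        return sum(c % 4 != 0 for mv in mvs for c in mv)

    def satisfies_constraint(mvs, max_fractional):
        return count_fractional_components(mvs) <= max_fractional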
  • the characteristics of the first block include dimension parameters including at least one of width, height, a ratio of width to height, a size of width*height, and the shape of the first block.
  • the constraint parameter is different for different sizes or shapes of the first block.
  • the characteristics of the first block include a mode parameter indicating the coding mode of the first block.
  • the coding mode includes a triangle mode in which the current block is split into two partitions, wherein each partition has at least one MV.
  • the constraint parameter is applied when the first block is 4x16 or 16x4 block coded in the triangle mode.
  • FIG. 31 is a flowchart for a method 3100 of video processing.
  • the method 3100 includes acquiring (3102) a signaled indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining (3104), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing (3106) the conversion by using the indication when the characteristics of the first block satisfy the predetermined rule.
  • FIG. 32 is a flowchart for a method 3200 of video processing.
  • the method 3200 includes signaling (3202) an indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining (3204), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing (3206) the conversion based on the characteristics of the first block, wherein during the conversion, at least one of bi-prediction and uni-prediction is disabled when the characteristics of the first block satisfy the predetermined rule.
  • the indication is signaled in sequence parameter set/picture parameter set/sequence header/picture header/tile header/tile group header/coding tree unit (CTU) rows/regions/other high-level syntax.
  • the characteristics of the first block include dimension parameters including at least one of width, height, a ratio of width to height, a size of width*height, and the shape of the first block.
  • the predetermined rule comprises: the first block has certain block dimensions.
  • the characteristics of the first block includes mode parameter indicating coding mode of the first block.
  • the predetermined rule comprises: the first block is coded with non-affine mode.
  • the signaling of the Advanced Motion Vector Resolution (AMVR) parameter for the first block is modified accordingly.
  • the signaling of the Advanced Motion Vector Resolution (AMVR) parameter is modified so that only integer-pel precisions are allowed for the first block.
  • the signaling of the Advanced Motion Vector Resolution (AMVR) parameter is modified so that different motion vector (MV) precisions are utilized.
  • the block dimension of the first block is at least one of 4x16, 16x4, 4x8, 8x4, 4x4.
  • the bitstream representation of the first block follows a rule depending on the dimension parameters of the first block, wherein only integer-pel MVs are allowed for bi-prediction coded blocks.
  • FIG. 33 is a flowchart for a method 3300 of video processing.
  • the method 3300 includes determining (3302), for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for the first block; signaling (3304) an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing (3306) the conversion by using the AMVR parameter.
  • FIG. 34 is a flowchart for a method 3400 of video processing.
  • the method 3400 includes determining (3402), for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for the first block; acquiring (3404) an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing (3406) the conversion by using the AMVR parameter.
  • when fractional precision is not allowed, the AMVR parameter indicating whether the MV/MVD precision of the current block is fractional may be skipped and implicitly derived to be false.
  • PW and PH are designed for 4x16, 16x4, 4x4, 8x4 and 4x8 blocks.
  • the MV of the block in reference list X is MVX
  • the interpolation filter tap (in motion compensation) is N (for example, 8, 6, 4, or 2)
  • the current block size is WxH
  • the position (i.e., the position of the top-left pixel) of the current block is (x, y).
  • the indices of the rows and columns start from 1; for example, H rows include the 1st, ..., Hth rows.
  • PW and PH are both set equal to 1 for prediction direction X.
  • (W + N – 2) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt[0] + x – N/2 + 1, MVXInt[1] + y – N/2 + 1).
  • the (W + N – 1)th column is generated by copying the (W + N – 2)th column.
  • the (H + N – 1)th row is generated by copying the (H + N – 2)th row.
  • PW and PH are set equal to 0 and 1 respectively.
  • (W + N – 1) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt[0] + x – N/2 + 1, MVXInt[1] + y – N/2 + 1).
  • the (H + N – 1)th row is generated by copying the (H + N – 2)th row.
  • PW and PH are set equal to 2 and 3 respectively.
  • (W + N – 3) * (H + N – 4) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt[0] + x – N/2 + 2, MVXInt[1] + y – N/2 + 2).
  • the 1st column is copied to its left side to obtain W + N – 2 columns; after that, the (W + N – 1)th column is generated by copying the (W + N – 2)th column.
  • PW and PH are both set equal to 1 for prediction direction X.
  • (W + N – 2) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt[0] + x – N/2 + 2, MVXInt[1] + y – N/2 + 2).
  • the 1st column is copied to its left side to obtain W + N – 1 columns.
  • the 1st row is copied to its upper side to obtain H + N – 1 rows.
  • PW and PH are set equal to 0 and 1 respectively.
  • (W + N – 1) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt[0] + x – N/2 + 1, MVXInt[1] + y – N/2 + 2).
  • the 1st row is copied to its upper side to obtain H + N – 1 rows.
  • PW and PH are set equal to 2 and 3 respectively.
  • (W + N – 3) * (H + N – 4) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt[0] + x – N/2 + 2, MVXInt[1] + y – N/2 + 2).
  • the 1st column is copied to its left side to obtain W + N – 2 columns; after that, the (W + N – 1)th column is generated by copying the (W + N – 2)th column.
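A worked sketch of the first example above (PW = PH = 1): fetch a (W + N – 2) * (H + N – 2) region and generate the last column and row by copying their neighbors. `reference` is assumed to be a 2-D numpy array holding the reference picture, and clipping at picture boundaries is omitted for brevity.

    import numpy as np

    def fetch_and_pad_pw1_ph1(reference, x, y, mvx_int, mvy_int, w, h, n):
        top = y + mvy_int - n // 2 + 1
        left = x + mvx_int - n // 2 + 1
        fetched = reference[top:top + h + n - 2, left:left + w + n - 2]
        padded = np.hstack([fetched, fetched[:, -1:]])  # copy last column
        padded = np.vstack([padded, padded[-1:, :]])    # copy last row
        return padded  # shape: (H + N - 1, W + N - 1)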
  • the disclosed techniques may be embodied in video encoders or decoders to improve compression efficiency when the coding units being compressed have shapes that are significantly different from the traditional square blocks or rectangular blocks that are half-square shaped.
  • new coding tools that use long or tall coding units such as 4x32 or 32x4 sized units may benefit from the disclosed techniques.
  • the disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them.
  • the disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus.
  • the computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them.
  • data processing apparatus encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
  • the apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
  • a propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
  • a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program does not necessarily correspond to a file in a file system.
  • a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) .
  • a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
  • processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
  • a processor will receive instructions and data from a read only memory or a random-access memory or both.
  • the essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
  • a computer need not have such devices.
  • Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks.
  • the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

MV precision constraints are described. A method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block (2402, 2602, 2702, 3002, 3104, 3204); determining a constraint parameter to be applied to the first block based on the characteristics of the first block, wherein the constraint parameter constrains a maximum number of fractional motion vector (MV) components of the first block (3004); and performing the conversion by using the constraint parameter (3006).

Description

MV PRECISION CONSTRAINTS
CROSS-REFERENCE TO RELATED APPLICATION
Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/071503, filed on January 12, 2019, and No. PCT/CN2019/077171, filed on March 6, 2019. The entire disclosures of International Patent Application No. PCT/CN2019/071503 and No. PCT/CN2019/077171 are incorporated by reference as part of the disclosure of this application.
TECHNICAL FIELD
This document is related to video coding technologies.
BACKGROUND
Digital video accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
SUMMARY
The disclosed techniques may be used by video decoder or encoder embodiments in which interpolation is improved using a block-shape interpolation order technique.
In one example aspect, a method of video bitstream processing is disclosed. The method includes determining a shape of a first video block, determining an interpolation order based on the shape of the first video block, the interpolation order indicative of a sequence of performing a horizontal interpolation and a vertical interpolation, and performing the horizontal interpolation and the vertical interpolation for the first video block in the sequence in accordance with the interpolation order to reconstruct a decoded representation of the first video block.
In another example aspect, a method of video bitstream processing includes determining characteristics of a motion vector related to a first video block, determining an interpolation order based on the characteristics of the motion vector, the interpolation order indicative of a sequence of performing a horizontal interpolation and a vertical interpolation, and performing the horizontal interpolation and the vertical interpolation for the first video block in the sequence in accordance with the interpolation order to reconstruct a decoded representation of the first video block.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, by a processor, dimension characteristics of a first video block; determining, by the processor, that a first interpolation filter is to be applied to the first video block based on the determination of the dimension characteristics; and performing further processing of the first video block using the first interpolation filter.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, by a processor, first characteristics of a first video block; determining, by the processor, that a first interpolation filter is to be applied to the first video block based on the first characteristics; performing further processing of the first video block using the first interpolation filter; determining, by a processor, second characteristics of a second video block; determining, by the processor, that a second interpolation filter is to be applied to the second video block based on the second characteristics, the first interpolation filter and the second interpolation filter being different short-tap filters; and performing further processing of the second video block using the second interpolation filter.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, by a processor, characteristics of a first video block, the characteristics including one or more of: a dimension information of a first video block, a prediction direction of the first video block, or a motion information of the first video block; rounding motion vectors (MVs) related to the first video block to integer-pel precision or half-pel precision based on the determination of the characteristics of the first video block; and performing further processing of the first video block using the motion vectors that are rounded.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, by a processor, that a first video block is coded with a merge mode; rounding motion information related to the first video block to integer precision to generate modified motion information based on the determination that the first video block is coded with  the merge mode; and performing a motion compensation process for the first video block using the modified motion information.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining characteristics of a first video block, the characteristics being one or both of: a size of the first video block, or a shape of the first video block; modifying motion vectors related to the first video block to integer-pel precision or half-pel precision to generate modified motion vectors; and performing further processing of the first video block using the modified motion vectors.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining characteristics of a first video block, the characteristics being one or both of: a size dimension of the first video block, or a prediction direction of the first video block; determining MMVD side information based on the determination of the characteristics of the first video block; and performing further processing of the first video block using the MMVD side information.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining characteristics of a first video block, the characteristics being one or both of: a size of the first video block, or a shape of the first video block; determining a threshold number of half-pel motion vector (MV) components or quarter-pel MV components to be constrained based on the determination of the characteristics of the first video block; and performing further processing of the first video block using the threshold number.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining characteristics of a first video block, the characteristics including a size of the first video block; modifying motion vectors (MVs) related to the first video block from fractional precision to integer precision based on the determination of the characteristics of the  first video block; and performing motion compensation for the first video block using the modified MVs.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining a first dimension of a first video block; determining a first precision for motion vectors (MVs) related to the first video block based on the determination of the first dimension; determining a second dimension of a second video block, the first dimension and the second dimension being different dimensions; determining a second precision for MVs related to the second video block based on the determination of the second dimension, the first precision and the second precision being different precisions; and performing further processing of the first video block using the first precision and of the second video block using the second precision.
In another example aspect, a method of video processing is disclosed. The method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining filters with interpolation filter parameters used for interpolation of the first block based on the characteristics of the first block; and performing the conversion by using the filters with the interpolation filter parameters.
In another example aspect, a method of video processing is disclosed. The method includes fetching, for a conversion between a first block of video and a bitstream representation of the first block, reference pixels of a first reference block from a reference picture, wherein the first reference block is smaller than a second reference block required for motion compensation of the first block; padding the first reference block with padding pixels to generate the second reference block; and performing the conversion by using the generated second reference block.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing a rounding process on a motion vector (MV) of the first block based on the characteristics of the first block; and performing the conversion by using the rounded MV.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing motion compensation for the first block using an MV with a first precision; and storing an MV with a second precision for the first block, wherein the first precision is different from the second precision.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, a coding mode of the first block; performing a rounding process on a motion vector (MV) of the first block if the coding mode of the first block satisfies a predetermined rule; and performing motion compensation of the first block by using the rounded MV.
In another example aspect, a method for video bitstream processing is disclosed. The method includes generating, for a conversion between a first block of video and a bitstream representation of the first block, a first motion vector (MV) candidate list for the first block; performing a rounding process on the MV of at least one candidate before adding the at least one candidate into the first MV candidate list; and performing the conversion by using the first MV candidate list.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining a constraint parameter to be applied to the first block based on the characteristics of the first block, wherein the constraint parameter constrains a maximum number of fractional motion vector (MV) components of the first block; and performing the conversion by using the constraint parameter.
In another example aspect, a method for video bitstream processing is disclosed. The method includes acquiring a signaled indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing the conversion by using the indication when the characteristics of the first block satisfy the predetermined rule.
In another example aspect, a method for video bitstream processing is disclosed. The method includes signaling an indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing the conversion based on the characteristics of the first block, wherein during the conversion, at least one of bi-prediction and uni-prediction is disabled when the characteristics of the first block satisfy the predetermined rule.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for the first block; signaling an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing the conversion by using the AMVR parameter.
In another example aspect, a method for video bitstream processing is disclosed. The method includes determining, for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for the first block; acquiring an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing the conversion by using the AMVR parameter.
In another example aspect, the above-described methods may be implemented by a video decoder apparatus that comprises a processor.
In another example aspect, the above-described methods may be implemented by a video encoder apparatus comprising a processor for decoding encoded video during a video encoding process.
In yet another example aspect, these methods may be embodied in the form of processor-executable instructions and stored on a computer-readable program medium.
These, and other, aspects are further described in the present document.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an illustration of a QUAD TREE BINARY TREE (QTBT) structure.
FIG. 2 shows an example derivation process for merge candidates list construction.
FIG. 3 shows example positions of spatial merge candidates.
FIG. 4 shows an example of candidate pairs considered for redundancy check of spatial merge candidates.
FIG. 5A and 5B show examples of positions for the second prediction unit (PU) of N×2N and 2N×N partitions.
FIG. 6 is an illustration of motion vector scaling for temporal merge candidate.
FIG. 7 shows example candidate positions for temporal merge candidate, C0 and C1.
FIG. 8 shows an example of combined bi-predictive merge candidate.
FIG. 9 shows an example of a derivation process for motion vector prediction candidates.
FIG. 10 is an illustration of motion vector scaling for spatial motion vector candidate.
FIG. 11 shows an example of advanced temporal motion vector prediction (ATMVP) motion prediction for a coding unit (CU) .
FIG. 12 shows an example of one CU with four sub-blocks (A-D) and its neighbouring blocks (a–d) .
FIG. 13 illustrates proposed non-adjacent merge candidates in one example.
FIG. 14 illustrates proposed non-adjacent merge candidates in one example.
FIG. 15 illustrates proposed non-adjacent merge candidates in one example.
FIG. 16 shows an example of integer samples and fractional sample positions for quarter sample luma interpolation.
FIG. 17 is a block diagram of an example of a video processing apparatus.
FIG. 18 shows a block diagram of an example implementation of a video encoder.
FIG. 19 is a flowchart for an example of a video bitstream processing method.
FIG. 20 is a flowchart for an example of a video bitstream processing method.
FIG. 21 shows an example of repeat boundary pixels of a reference block before interpolation.
FIG. 22 is a flowchart for an example of a video bitstream processing method.
FIG. 23 is a flowchart for an example of a video bitstream processing method.
FIG. 24 is a flowchart for an example of a video bitstream processing method.
FIG. 25 is a flowchart for an example of a video bitstream processing method.
FIG. 26 is a flowchart for an example of a video bitstream processing method.
FIG. 27 is a flowchart for an example of a video bitstream processing method.
FIG. 28 is a flowchart for an example of a video bitstream processing method.
FIG. 29 is a flowchart for an example of a video bitstream processing method.
FIG. 30 is a flowchart for an example of a video bitstream processing method.
FIG. 31 is a flowchart for an example of a video bitstream processing method.
FIG. 32 is a flowchart for an example of a video bitstream processing method.
FIG. 33 is a flowchart for an example of a video bitstream processing method.
FIG. 34 is a flowchart for an example of a video bitstream processing method.
DETAILED DESCRIPTION
The present document provides various techniques that can be used by a decoder of video bitstreams to improve the quality of decompressed or decoded digital video. Furthermore, a video encoder may also implement these techniques during the process of encoding in order to reconstruct decoded frames used for further encoding.
Section headings are used in the present document for ease of understanding and do not limit the embodiments and techniques to the corresponding sections. As such, embodiments from one section can be combined with embodiments from other sections.
1. Summary
This invention is related to video coding technologies. Specifically, it is related to interpolation in video coding. It may be applied to an existing video coding standard like HEVC, or to the to-be-finalized standard Versatile Video Coding (VVC). It may also be applicable to future video coding standards or video codecs.
2. Background
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards. Since H.262, video coding standards have been based on the hybrid video coding structure, wherein temporal prediction plus transform coding is utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded by VCEG and MPEG jointly in 2015. Since then, many new methods have been adopted by JVET and put into the reference software named Joint Exploration Model (JEM). In April 2018, the Joint Video Expert Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared to HEVC.
FIG. 18 is a block diagram of an example implementation of a video encoder.
2.1 Quadtree plus binary tree (QTBT) block structure with larger CTUs
In HEVC, a CTU is split into CUs by using a quadtree structure denoted as a coding tree to adapt to various local characteristics. The decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level. Each CU can be further split into one, two or four PUs according to the PU splitting type. Inside one PU, the same prediction process is applied and the relevant information is transmitted to the decoder on a PU basis. After obtaining the residual block by applying the prediction process based on the PU splitting type, a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree for the CU. One key feature of the HEVC structure is that it has multiple partition conceptions including CU, PU, and TU.
The QTBT structure removes the concepts of multiple partition types, i.e., it removes the separation of the CU, PU and TU concepts, and supports more flexibility for CU partition shapes. In the QTBT block structure, a CU can have either a square or rectangular shape. As shown in FIG. 1, a coding tree unit (CTU) is first partitioned by a quadtree structure. The quadtree leaf nodes are further partitioned by a binary tree structure. There are two splitting types, symmetric horizontal splitting and symmetric vertical splitting, in the binary tree splitting. The binary tree leaf nodes are called coding units (CUs), and that segmentation is used for prediction and transform processing without any further partitioning. This means that the CU, PU and TU have the same block size in the QTBT coding block structure. In the JEM, a CU sometimes consists of coding blocks (CBs) of different colour components, e.g. one CU contains one luma CB and two chroma CBs in the case of P and B slices of the 4:2:0 chroma format, and sometimes consists of a CB of a single component, e.g., one CU contains only one luma CB or just two chroma CBs in the case of I slices.
The following parameters are defined for the QTBT partitioning scheme.
– CTU size: the root node size of a quadtree, the same concept as in HEVC
– MinQTSize: the minimum allowed quadtree leaf node size
– MaxBTSize: the maximum allowed binary tree root node size
– MaxBTDepth: the maximum allowed binary tree depth
– MinBTSize: the minimum allowed binary tree leaf node size
In one example of the QTBT partitioning structure, the CTU size is set as 128×128 luma samples with two corresponding 64×64 blocks of chroma samples, the MinQTSize is set as 16×16, the MaxBTSize is set as 64×64, the MinBTSize (for both width and height) is set as 4×4, and the MaxBTDepth is set as 4. The quadtree partitioning is applied to the CTU first to generate quadtree leaf nodes. The quadtree leaf nodes may have a size from 16×16 (i.e., the MinQTSize) to 128×128 (i.e., the CTU size) . If the leaf quadtree node is 128×128, it will not be further split by the binary tree since the size exceeds the MaxBTSize (i.e., 64×64) . Otherwise, the leaf quadtree node could be further partitioned by the binary tree. Therefore, the quadtree leaf node is also the root node for the binary tree and it has the binary tree depth as 0. When the binary tree depth reaches MaxBTDepth (i.e., 4) , no further splitting is considered. When the binary tree node has width equal to MinBTSize (i.e., 4) , no further horizontal splitting is considered. Similarly, when the binary tree node has height equal to MinBTSize, no further vertical splitting is considered. The leaf nodes of the binary tree are further processed by prediction and transform processing without any further partitioning. In the JEM, the maximum CTU size is 256×256 luma samples.
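As a rough illustration (not part of the original specification; the function and parameter names are ours), the following sketch mirrors the binary-tree split rules of the example parameter set above, following the paragraph's convention that width equal to MinBTSize stops horizontal splitting and height equal to MinBTSize stops vertical splitting:

```python
def allowed_bt_splits(width, height, bt_depth,
                      max_bt_size=64, min_bt_size=4, max_bt_depth=4):
    """Binary-tree splits permitted under the example QTBT parameters above."""
    # A node larger than MaxBTSize is not split by the binary tree.
    if width > max_bt_size or height > max_bt_size:
        return []
    # At MaxBTDepth, no further splitting is considered.
    if bt_depth >= max_bt_depth:
        return []
    splits = []
    if width > min_bt_size:
        splits.append("horizontal")  # stops once width reaches MinBTSize
    if height > min_bt_size:
        splits.append("vertical")    # stops once height reaches MinBTSize
    return splits

# e.g., a 64x64 quadtree leaf at binary-tree depth 0 may be split either way:
# allowed_bt_splits(64, 64, 0) -> ['horizontal', 'vertical']
```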
FIG. 1 (left) illustrates an example of block partitioning by using QTBT, and FIG. 1 (right) illustrates the corresponding tree representation. The solid lines indicate quadtree splitting and the dotted lines indicate binary tree splitting. In each splitting (i.e., non-leaf) node of the binary tree, one flag is signalled to indicate which splitting type (i.e., horizontal or vertical) is used, where 0 indicates horizontal splitting and 1 indicates vertical splitting. For the quadtree splitting, there is no need to indicate the splitting type since quadtree splitting always splits a block both horizontally and vertically to produce 4 sub-blocks with an equal size.
In addition, the QTBT scheme supports the ability for the luma and chroma to have a separate QTBT structure. Currently, for P and B slices, the luma and chroma CTBs in one CTU share the same QTBT structure. However, for I slices, the luma CTB is partitioned into CUs by a QTBT structure, and the chroma CTBs are partitioned into chroma CUs by another QTBT structure. This means that a CU in an I slice consists of a coding block of the luma component or coding blocks of two chroma components, and a CU in a P or B slice consists of coding blocks of all three colour components.
In HEVC, inter prediction for small blocks is restricted to reduce the memory access of motion compensation, such that bi-prediction is not supported for 4×8 and 8×4 blocks, and inter  prediction is not supported for 4×4 blocks. In the QTBT of the JEM, these restrictions are removed.
2.2 Inter prediction in HEVC/H. 265
Each inter-predicted PU has motion parameters for one or two reference picture lists. Motion parameters include a motion vector and a reference picture index. Usage of one of the two reference picture lists may also be signalled using inter_pred_idc. Motion vectors may be explicitly coded as deltas relative to predictors.
When a CU is coded with skip mode, one PU is associated with the CU, and there are no significant residual coefficients, no coded motion vector delta or reference picture index. A merge mode is specified whereby the motion parameters for the current PU are obtained from neighbouring PUs, including spatial and temporal candidates. The merge mode can be applied to any inter-predicted PU, not only for skip mode. The alternative to merge mode is the explicit transmission of motion parameters, where motion vector (to be more precise, motion vector difference compared to a motion vector predictor) , corresponding reference picture index for each reference picture list and reference picture list usage are signalled explicitly per each PU. Such mode is named Advanced motion vector prediction (AMVP) in this disclosure.
When signalling indicates that one of the two reference picture lists is to be used, the PU is produced from one block of samples. This is referred to as ‘uni-prediction’ . Uni-prediction is available both for P-slices and B-slices.
When signalling indicates that both of the reference picture lists are to be used, the PU is produced from two blocks of samples. This is referred to as ‘bi-prediction’ . Bi-prediction is available for B-slices only.
The following text provides the details on the inter prediction modes specified in HEVC. The description will start with the merge mode.
2.2.1 Merge Mode
2.2.1.1 Derivation of candidates for merge mode
When a PU is predicted using merge mode, an index pointing to an entry in the merge candidates list is parsed from the bitstream and used to retrieve the motion information. The construction of this list is specified in the HEVC standard and can be summarized according to the following sequence of steps:
● Step 1: Initial candidates derivation
○ Step 1.1: Spatial candidates derivation
○ Step 1.2: Redundancy check for spatial candidates
○ Step 1.3: Temporal candidates derivation
● Step 2: Additional candidates insertion
○ Step 2.1: Creation of bi-predictive candidates
○ Step 2.2: Insertion of zero motion candidates
These steps are also schematically depicted in FIG. 2. For spatial merge candidate derivation, a maximum of four merge candidates are selected among candidates that are located in five different positions. For temporal merge candidate derivation, a maximum of one merge candidate is selected among two candidates. Since a constant number of candidates for each PU is assumed at the decoder, additional candidates are generated when the number of candidates obtained from step 1 does not reach the maximum number of merge candidates (MaxNumMergeCand), which is signalled in the slice header. Since the number of candidates is constant, the index of the best merge candidate is encoded using truncated unary binarization (TU). If the size of a CU is equal to 8, all the PUs of the current CU share a single merge candidate list, which is identical to the merge candidate list of the 2N×2N prediction unit.
In the following, the operations associated with the aforementioned steps are detailed.
2.2.1.2 Spatial candidates derivation
In the derivation of spatial merge candidates, a maximum of four merge candidates are selected among candidates located in the positions depicted in FIG. 3. The order of derivation is A1, B1, B0, A0 and B2. Position B2 is considered only when any PU of position A1, B1, B0, A0 is not available (e.g. because it belongs to another slice or tile) or is intra coded. After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check which ensures that candidates with the same motion information are excluded from the list, so that coding efficiency is improved. To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in FIG. 4 are considered, and a candidate is only added to the list if the corresponding candidate used for the redundancy check does not have the same motion information. Another source of duplicate motion information is the “second PU” associated with partitions different from 2Nx2N. As an example, FIGS. 5A and 5B depict the second PU for the cases of N×2N and 2N×N, respectively. When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction, since adding this candidate would lead to two prediction units having the same motion information, which is redundant when just one PU in the coding unit would suffice. Similarly, position B1 is not considered when the current PU is partitioned as 2N×N.
2.2.1.3 Temporal candidates derivation
In this step, only one candidate is added to the list. Particularly, in the derivation of this temporal merge candidate, a scaled motion vector is derived based on the co-located PU belonging to the picture which has the smallest POC difference with the current picture within the given reference picture list. The reference picture list to be used for derivation of the co-located PU is explicitly signalled in the slice header. The scaled motion vector for the temporal merge candidate is obtained as illustrated by the dashed line in FIG. 6: it is scaled from the motion vector of the co-located PU using the POC distances tb and td, where tb is defined as the POC difference between the reference picture of the current picture and the current picture, and td is defined as the POC difference between the reference picture of the co-located picture and the co-located picture. The reference picture index of the temporal merge candidate is set equal to zero. A practical realization of the scaling process is described in the HEVC specification. For a B-slice, two motion vectors, one for reference picture list 0 and the other for reference picture list 1, are obtained and combined to make the bi-predictive merge candidate.
FIG. 6 is an illustration of motion vector scaling for temporal merge candidate.
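A hedged sketch of such POC-distance scaling is given below. The fixed-point constants follow the HEVC specification's practical realization mentioned above; treat this as illustrative rather than normative, and note that tb and td are assumed to already be within the clipped range used by the standard:

```python
def clip3(lo, hi, x):
    return max(lo, min(hi, x))

def scale_mv(mv, tb, td):
    """Scale one MV component by the POC-distance ratio tb/td in fixed point,
    in the manner of the HEVC temporal scaling realization (sketch only)."""
    tx = int((16384 + (abs(td) >> 1)) / td)              # C-style truncating division
    dist_scale_factor = clip3(-4096, 4095, (tb * tx + 32) >> 6)
    scaled = dist_scale_factor * mv
    sign = -1 if scaled < 0 else 1
    return clip3(-32768, 32767, sign * ((abs(scaled) + 127) >> 8))

# e.g., halving an MV when the current POC distance is half the co-located one:
# scale_mv(64, tb=1, td=2) -> 32
```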
In the co-located PU (Y) belonging to the reference frame, the position for the temporal candidate is selected between candidates C0 and C1, as depicted in FIG. 7. If the PU at position C0 is not available, is intra coded, or is outside of the current CTU row, position C1 is used. Otherwise, position C0 is used in the derivation of the temporal merge candidate.
2.2.1.4 Additional candidates insertion
Besides spatial and temporal merge candidates, there are two additional types of merge candidates: combined bi-predictive merge candidate and zero merge candidate. Combined bi-predictive merge candidates are generated by utilizing spatial and temporal merge candidates. Combined bi-predictive merge candidate is used for B-Slice only. The combined bi-predictive candidates are generated by combining the first reference picture list motion parameters of an initial candidate with the second reference picture list motion parameters of another. If these two tuples provide different motion hypotheses, they will form a new bi-predictive candidate. As an example, FIG. 8 depicts the case when two candidates in the original list (on the left) , which  have mvL0 and refIdxL0 or mvL1 and refIdxL1, are used to create a combined bi-predictive merge candidate added to the final list (on the right) . There are numerous rules regarding the combinations which are considered to generate these additional merge candidates.
Zero motion candidates are inserted to fill the remaining entries in the merge candidates list and therefore hit the MaxNumMergeCand capacity. These candidates have zero spatial displacement and a reference picture index which starts from zero and increases every time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is one and two for uni and bi-directional prediction, respectively. Finally, no redundancy check is performed on these candidates.
2.2.1.5 Motion estimation regions for parallel processing
To speed up the encoding process, motion estimation can be performed in parallel whereby the motion vectors for all prediction units inside a given region are derived simultaneously. The derivation of merge candidates from spatial neighbourhood may interfere with parallel processing as one prediction unit cannot derive the motion parameters from an adjacent PU until its associated motion estimation is completed. To mitigate the trade-off between coding efficiency and processing latency, HEVC defines the motion estimation region (MER) whose size is signalled in the picture parameter set using the “log2_parallel_merge_level_minus2” syntax element. When a MER is defined, merge candidates falling in the same region are marked as unavailable and therefore not considered in the list construction.
2.2.2 AMVP
AMVP exploits the spatio-temporal correlation of a motion vector with neighbouring PUs, which is used for explicit transmission of motion parameters. For each reference picture list, a motion vector candidate list is constructed by first checking the availability of left, above, and temporally neighbouring PU positions, removing redundant candidates, and adding a zero vector to make the candidate list a constant length. Then, the encoder can select the best predictor from the candidate list and transmit the corresponding index indicating the chosen candidate. As with merge index signalling, the index of the best motion vector candidate is encoded using truncated unary. The maximum value to be encoded in this case is 2 (see FIG. 9). In the following sections, details about the derivation process of motion vector prediction candidates are provided.
2.2.2.1 Derivation of AMVP candidates
FIG. 9 summarizes derivation process for motion vector prediction candidate.
In motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidate and temporal motion vector candidate. For spatial motion vector candidate derivation, two motion vector candidates are eventually derived based on motion vectors of each PU located in five different positions as depicted in FIG. 3.
For temporal motion vector candidate derivation, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.
2.2.2.2 Spatial motion vector candidates
In the derivation of spatial motion vector candidates, a maximum of two candidates are considered among five potential candidates, which are derived from PUs located in positions as depicted in FIG. 3, those positions being the same as those of motion merge. The order of derivation for the left side of the current PU is defined as A0, A1, scaled A0, scaled A1. The order of derivation for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2. For each side there are therefore four cases that can be used as a motion vector candidate, with two cases not required to use spatial scaling, and two cases where spatial scaling is used. The four different cases are summarized as follows.
● No spatial scaling
– (1) Same reference picture list, and same reference picture index (same POC) 
– (2) Different reference picture list, but same reference picture (same POC) 
● Spatial scaling
– (3) Same reference picture list, but different reference picture (different POC) 
– (4) Different reference picture list, and different reference picture (different POC) 
The no-spatial-scaling cases are checked first followed by the spatial scaling. Spatial scaling is considered when the POC is different between the reference picture of the neighbouring PU and that of the current PU regardless of reference picture list. If all PUs of left  candidates are not available or are intra coded, scaling for the above motion vector is allowed to help parallel derivation of left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.
FIG. 10 is an illustration of motion vector scaling for spatial motion vector candidate.
In a spatial scaling process, the motion vector of the neighbouring PU is scaled in a similar manner as for temporal scaling, as depicted as FIG. 10. The main difference is that the reference picture list and index of current PU is given as input; the actual scaling process is the same as that of temporal scaling.
2.2.2.3 Temporal motion vector candidates
Apart for the reference picture index derivation, all processes for the derivation of temporal merge candidates are the same as for the derivation of spatial motion vector candidates (see FIG. 7) . The reference picture index is signalled to the decoder.
2.3 New inter merge candidates in JEM
2.3.1 Sub-CU based motion vector prediction
In the JEM with QTBT, each CU can have at most one set of motion parameters for each prediction direction. Two sub-CU level motion vector prediction methods are considered in the encoder by splitting a large CU into sub-CUs and deriving motion information for all the sub-CUs of the large CU. Alternative temporal motion vector prediction (ATMVP) method allows each CU to fetch multiple sets of motion information from multiple blocks smaller than the current CU in the collocated reference picture. In spatial-temporal motion vector prediction (STMVP) method motion vectors of the sub-CUs are derived recursively by using the temporal motion vector predictor and spatial neighbouring motion vector.
To preserve more accurate motion field for sub-CU motion prediction, the motion compression for the reference frames is currently disabled.
2.3.1.1 Alternative temporal motion vector prediction
In the alternative temporal motion vector prediction (ATMVP) method, the temporal motion vector prediction (TMVP) is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU. As shown in FIG. 11, the sub-CUs are square N×N blocks (N is set to 4 by default).
ATMVP predicts the motion vectors of the sub-CUs within a CU in two steps. The first step is to identify the corresponding block in a reference picture with a so-called temporal vector. The reference picture is called the motion source picture. The second step is to split the current CU into sub-CUs and obtain the motion vectors as well as the reference indices of each sub-CU from the block corresponding to each sub-CU, as shown in FIG. 11.
In the first step, a reference picture and the corresponding block is determined by the motion information of the spatial neighbouring blocks of the current CU. To avoid the repetitive scanning process of neighbouring blocks, the first merge candidate in the merge candidate list of the current CU is used. The first available motion vector as well as its associated reference index are set to be the temporal vector and the index to the motion source picture. This way, in ATMVP, the corresponding block may be more accurately identified, compared with TMVP, wherein the corresponding block (sometimes called collocated block) is always in a bottom-right or center position relative to the current CU.
In the second step, a corresponding block of the sub-CU is identified by the temporal vector in the motion source picture, by adding the temporal vector to the coordinate of the current CU. For each sub-CU, the motion information of its corresponding block (the smallest motion grid that covers the center sample) is used to derive the motion information for the sub-CU. After the motion information of a corresponding N×N block is identified, it is converted to the motion vectors and reference indices of the current sub-CU, in the same way as TMVP of HEVC, wherein motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition (i.e. the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) is fulfilled and possibly uses motion vector MVx (the motion vector corresponding to reference picture list X) to predict motion vector MVy (with X being equal to 0 or 1 and Y being equal to 1-X) for each sub-CU.
2.3.1.2 Spatial-temporal motion vector prediction
In this method, the motion vectors of the sub-CUs are derived recursively, following raster scan order. FIG. 12 illustrates this concept. Let us consider an 8×8 CU which contains four 4×4 sub-CUs A, B, C, and D. The neighbouring 4×4 blocks in the current frame are labelled as a, b, c, and d.
The motion derivation for sub-CU A starts by identifying its two spatial neighbours. The first neighbour is the N×N block above sub-CU A (block c). If this block c is not available or is intra coded, the other N×N blocks above sub-CU A are checked (from left to right, starting at block c). The second neighbour is a block to the left of sub-CU A (block b). If block b is not available or is intra coded, other blocks to the left of sub-CU A are checked (from top to bottom, starting at block b). The motion information obtained from the neighbouring blocks for each list is scaled to the first reference frame for a given list. Next, the temporal motion vector predictor (TMVP) of sub-block A is derived by following the same procedure of TMVP derivation as specified in HEVC. The motion information of the collocated block at location D is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors (up to 3) are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
2.3.1.3 Sub-CU motion prediction mode signalling
The sub-CU modes are enabled as additional merge candidates and there is no additional syntax element required to signal the modes. Two additional merge candidates are added to the merge candidates list of each CU to represent the ATMVP mode and the STMVP mode. Up to seven merge candidates are used if the sequence parameter set indicates that ATMVP and STMVP are enabled. The encoding logic of the additional merge candidates is the same as for the merge candidates in the HM, which means that, for each CU in a P or B slice, two more RD checks are needed for the two additional merge candidates.
In the JEM, all bins of the merge index are context coded by CABAC, while in HEVC, only the first bin is context coded and the remaining bins are context by-pass coded.
2.3.2 Non-adjacent merge candidates
Qualcomm proposes to derive additional spatial merge candidates from non-adjacent neighboring positions which are marked as 6 to 49 as in FIG. 13. The derived candidates are added after TMVP candidates in the merge candidate list.
Tencent proposes to derive additional spatial merge candidates from positions in an outer reference area which has an offset of (-96, -96) to the current block.
As shown in FIG. 14, the positions are marked as A (i, j), B (i, j), C (i, j), D (i, j) and E (i, j). Each candidate B (i, j) or C (i, j) has an offset of 16 in the vertical direction compared to its previous B or C candidates. Each candidate A (i, j) or D (i, j) has an offset of 16 in the horizontal direction compared to its previous A or D candidates. Each E (i, j) has an offset of 16 in both the horizontal direction and the vertical direction compared to its previous E candidates. The candidates are checked from the inside to the outside, and the order of the candidates is A (i, j), B (i, j), C (i, j), D (i, j), and E (i, j). Whether the number of merge candidates can be further reduced is to be studied further. The candidates are added after TMVP candidates in the merge candidate list.
In some examples, the extended spatial positions from 6 to 27 as in FIG. 15 are checked according to their numerical order after the temporal candidate. To save the MV line buffer, all the spatial candidates are restricted within two CTU lines.
2.4 Intra prediction in JEM
2.4.1 Intra mode coding with 67 intra prediction modes
For the luma interpolation filtering, an 8-tap separable DCT-based interpolation filter is used for 2/4 precision samples and a 7-tap separable DCT-based interpolation filter is used for 1/4 precision samples, as shown in Table 1.
Table 1: 8-tap DCT-IF coefficients for 1/4th luma interpolation.
Position    Filter coefficients
1/4         {-1, 4, -10, 58, 17, -5, 1}
2/4         {-1, 4, -11, 40, 40, -11, 4, -1}
3/4         {1, -5, 17, 58, -10, 4, -1}
Similarly, a 4-tap separable DCT-based interpolation filter is used for the chroma interpolation filter, as shown in Table 2.
Table 2: 4-tap DCT-IF coefficients for 1/8th chroma interpolation. 
Position    Filter coefficients
1/8         {-2, 58, 10, -2}
2/8         {-4, 54, 16, -2}
3/8         {-6, 46, 28, -4}
4/8         {-4, 36, 36, -4}
5/8         {-4, 28, 46, -6}
6/8         {-2, 16, 54, -4}
7/8         {-2, 10, 58, -2}
For the vertical interpolation for 4:2:2 and the horizontal and vertical interpolation for 4:4:4 chroma channels, the odd positions in Table 2 are not used, resulting in 1/4th chroma interpolation.
For the bi-directional prediction, the bit depth of the output of the interpolation filter is maintained at 14-bit accuracy, regardless of the source bit depth, before the averaging of the two prediction signals. The actual averaging process is done implicitly with the bit-depth reduction process as:
predSamples[x, y] = (predSamplesL0[x, y] + predSamplesL1[x, y] + offset) >> shift
where shift = (15 - BitDepth) and offset = 1 << (shift - 1)
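A minimal sketch of this implicit averaging with bit-depth reduction follows. The clipping to the valid sample range is our addition for completeness (it happens later in the actual reconstruction path and is not part of the formula above):

```python
def bi_average(pred_l0, pred_l1, bit_depth):
    """Average two 14-bit-accuracy prediction signals per the formula above."""
    shift = 15 - bit_depth
    offset = 1 << (shift - 1)
    max_val = (1 << bit_depth) - 1  # clipping added here for completeness
    return [max(0, min(max_val, (a + b + offset) >> shift))
            for a, b in zip(pred_l0, pred_l1)]

# e.g., two 14-bit prediction samples averaged down to 10-bit output:
# bi_average([8192], [8256], bit_depth=10) -> [257]
```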
If both the horizontal component and the vertical component of a motion vector point to sub-pixel positions, horizontal interpolation is always performed first, and then the vertical interpolation is performed. For example, to interpolate the sub-pixel j[0, 0] shown in FIG. 16, first b[0, k] (k = -3, -2, ..., 4) is interpolated according to equation 2-1, then j[0, 0] is interpolated according to equation 2-2. Here, shift1 = Min(4, BitDepthY - 8) and shift2 = 6.
b[0, k] = (-A[-3, k] + 4*A[-2, k] - 11*A[-1, k] + 40*A[0, k] + 40*A[1, k] - 11*A[2, k] + 4*A[3, k] - A[4, k]) >> shift1    (2-1)
j[0, 0] = (-b[0, -3] + 4*b[0, -2] - 11*b[0, -1] + 40*b[0, 0] + 40*b[0, 1] - 11*b[0, 2] + 4*b[0, 3] - b[0, 4]) >> shift2    (2-2)
Alternatively, we can first perform vertical interpolation and then perform horizontal interpolation. In this case, to interpolate j[0, 0], first h[k, 0] (k = -3, -2, ..., 4) is interpolated according to equation 2-3, then j[0, 0] is interpolated according to equation 2-4. When BitDepthY is smaller than or equal to 8, shift1 is 0 and nothing is lost in the first interpolation stage; therefore, the final interpolation result is not changed by the interpolation order. However, when BitDepthY is greater than 8, shift1 is greater than 0. In this case, the final interpolation result can be different when different interpolation orders are applied.
h[k, 0] = (-A[k, -3] + 4*A[k, -2] - 11*A[k, -1] + 40*A[k, 0] + 40*A[k, 1] - 11*A[k, 2] + 4*A[k, 3] - A[k, 4]) >> shift1    (2-3)
j[0, 0] = (-h[-3, 0] + 4*h[-2, 0] - 11*h[-1, 0] + 40*h[0, 0] + 40*h[1, 0] - 11*h[2, 0] + 4*h[3, 0] - h[4, 0]) >> shift2    (2-4)
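The order dependence can be checked directly. The sketch below applies the half-pel filter of equations 2-1 through 2-4 in both orders; for BitDepthY > 8 (so shift1 > 0) the two printed results may differ, exactly as stated above. The block layout and the random sample values are ours:

```python
import random

TAP = [-1, 4, -11, 40, 40, -11, 4, -1]  # half-pel DCT-IF, as in eq. 2-1

def apply_tap(samples, shift):
    # Plain arithmetic right shift with no rounding offset, as in eq. 2-1/2-3.
    return sum(c * s for c, s in zip(TAP, samples)) >> shift

def interp_j00(block, shift1, shift2, horizontal_first):
    # block[r][c]: the 8x8 integer-sample neighbourhood A[-3..4][-3..4],
    # with r the row (vertical) index and c the column (horizontal) index.
    if horizontal_first:
        b = [apply_tap(block[r], shift1) for r in range(8)]       # eq. 2-1
        return apply_tap(b, shift2)                               # eq. 2-2
    h = [apply_tap([block[r][c] for r in range(8)], shift1)
         for c in range(8)]                                       # eq. 2-3
    return apply_tap(h, shift2)                                   # eq. 2-4

bit_depth_y = 10
shift1, shift2 = min(4, bit_depth_y - 8), 6
random.seed(0)
block = [[random.randrange(1 << bit_depth_y) for _ in range(8)] for _ in range(8)]
print(interp_j00(block, shift1, shift2, True),
      interp_j00(block, shift1, shift2, False))  # the two values may differ
```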
3. Examples of Problems solved by embodiments
For a luma block of size WxH, if we always perform horizontal interpolation first, the required interpolation (per pixel) is shown in Table 3.
Table 3: interpolation required for WxH luma component by HEVC/JEM
[Table 3 is provided only as an image in the source document.]
On the other hand, if we perform vertical interpolation first, the required interpolation is shown in Table 4. The optimal interpolation order is whichever of the two requires the smaller number of interpolation operations.
Table 4: interpolation required for WxH luma component when the interpolation order is reversed.
[Table 4 is provided only as an image in the source document.]
For the chroma component, if we always perform horizontal interpolation first, the required interpolation (per pixel) is ((H + 3) x W + W x H) / (W x H) = 2 + 3/H. If we always perform vertical interpolation first, the required interpolation is ((W + 3) x H + W x H) / (W x H) = 2 + 3/W.
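The same per-pixel operation counts generalize to an N-tap filter, as the small helper below illustrates (our formulation; it assumes both MV components are fractional, which is the worst case):

```python
def interp_ops_per_pixel(w, h, taps, horizontal_first=True):
    """Interpolation operations per output pixel for a WxH block."""
    if horizontal_first:
        # (h + taps - 1) rows of w horizontal filterings, then w*h vertical ones.
        return ((h + taps - 1) * w + w * h) / (w * h)   # = 2 + (taps - 1)/h
    return ((w + taps - 1) * h + w * h) / (w * h)       # = 2 + (taps - 1)/w

# Chroma (4-tap), 8x4 block: horizontal-first costs 2 + 3/4 = 2.75 ops/pixel,
# vertical-first costs 2 + 3/8 = 2.375, so vertical-first is cheaper for this
# wide block, consistent with the shape rule proposed in item 1 below.
```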
As mentioned above, different interpolation orders can lead to different interpolation results when the bit depth of the input video is greater than 8. Therefore, the interpolation order shall be defined implicitly in both the encoder and the decoder.
4. Examples of embodiments
To tackle the problems, and to provide other benefits, we propose a shape-dependent interpolation order. Suppose the interpolation filter tap length (in motion compensation) is N (for example, 8, 6, 4, or 2), and the current block size is WxH.
Suppose the number of allowed MVDs in MMVD (i.e., the number of entries in the distance table) is M. Note that triangle mode is considered a bi-prediction mode, and the following techniques related to bi-prediction may be applied to triangle mode too.
The detailed examples below should be considered as examples to explain general concepts. These examples should not be interpreted in a narrow way. Furthermore, these examples can be combined in any manner.
1. It is proposed that the interpolation order depends on the current coding block shape (e.g., the coding block is a CU); a sketch of this selection is given after item 1 below.
a. In one example, for a block (such as a CU, PU or sub-block used in sub-block based prediction like affine, ATMVP or BIO) with width > height, vertical interpolation is first performed, and then horizontal interpolation is performed, e.g., pixels d[k, 0], h[k, 0] and n[k, 0] are first interpolated and e[0, 0] to r[0, 0] are then interpolated. An example for j[0, 0] is shown in equations 2-3 and 2-4.
i. Alternatively, for a block (such as a CU, PU or sub-block used in sub-block based prediction like affine, ATMVP or BIO) with width >= height, vertical interpolation is first performed, and then horizontal interpolation is performed.
b. In one example, for a block (such as a CU, PU or sub-block used in sub-block based prediction like affine, ATMVP or BIO) with width <= height, horizontal interpolation is first performed, and then vertical interpolation is performed.
i. Alternatively, for a block (such as a CU, PU or sub-block used in sub-block based prediction like affine, ATMVP or BIO) with width < height, horizontal interpolation is first performed, and then vertical interpolation is performed.
c. In one example, both the luma component and the chroma components follow the same interpolation order.
d. Alternatively, when one chroma coding block corresponds to multiple luma coding blocks (e.g., for the 4:2:0 color format, one chroma 4x4 block may correspond to two 8x4 or 4x8 luma blocks), luma and chroma may use different interpolation orders.
e. In one example, when different interpolation orders are utilized, the scaling factors in the multiple stages (i.e., shift1 and shift2) may be further changed accordingly.
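A compact sketch of the shape rule in items 1.a and 1.b follows (our formulation; the tie-breaking at width == height follows item 1.b here, while items 1.a.i and 1.b.i give the opposite convention):

```python
def interpolation_order(width, height):
    """Return the (first, second) interpolation passes for a WxH block."""
    if width > height:
        return ("vertical", "horizontal")   # item 1.a
    return ("horizontal", "vertical")       # item 1.b (covers width <= height)

# e.g., a wide 16x4 block interpolates vertically first:
# interpolation_order(16, 4) -> ('vertical', 'horizontal')
```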
2. Alternatively, in addition, it is proposed that the interpolation order of the luma component can further depend on the MV.
a. In one example, if the vertical MV component points to a quarter-pel position and the horizontal MV component points to a half-pel position, horizontal interpolation is first performed, and then vertical interpolation is performed.
b. In one example, if the vertical MV component points to a half-pel position and the horizontal MV component points to a quarter-pel position, vertical interpolation is first performed, and then horizontal interpolation is performed.
c. In one example, the proposed methods are only applied to square coding blocks.
3. It is proposed that for a block coded with merge mode (e.g., regular merge list, triangular merge list, affine merge list, or other non-intra/non-AMVP modes), the associated motion information may be modified to integer precision (e.g., via rounding) before invoking the motion compensation process; a rounding sketch is given after item 3 below.
a. Alternatively, merge candidates with fractional motion vectors may be excluded from the merge list.
b. Alternatively, when a merge candidate derived from spatial or temporal blocks or other ways (such as HMVP, pairwise bi-prediction merge candidates) is associated with fractional motion vectors, the fractional motion vectors may be firstly modified to integer precision (e.g., via rounding) before being added to the merge list.
c. In one example, a separate HMVP table may be kept on-the-fly to store motion candidates with integer precisions.
d. Alternatively, the above methods may be applied only when the merge candidate is a bi-prediction candidate.
e. In one example, the above methods may be applied to certain block dimensions, such as 4x16, 16x4, 4x8, 8x4, 4x4.
f. In one example, the above methods may be applied to the AMVP coded blocks wherein the merge candidate may be replaced by an AMVP candidate.
g. In one example, the above methods may be applied to certain block modes, such as non-affine mode.
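An illustrative rounding helper for item 3 is sketched below. The 1/16-pel MV storage unit and the candidate layout are assumptions on our part; rounding is to the nearest integer pel, with ties away from zero (one of the choices listed later in item 7.b):

```python
MV_SHIFT = 4  # assumed sub-pel bits: MVs stored in 1/16-pel units

def round_mv_component_to_int(mv):
    """Round one MV component to integer-pel precision, ties away from zero."""
    offset = 1 << (MV_SHIFT - 1)
    magnitude = (abs(mv) + offset) >> MV_SHIFT
    return (-magnitude if mv < 0 else magnitude) << MV_SHIFT

def round_merge_candidate(cand):
    """Round all MV components of a merge candidate before motion compensation.

    cand: {'mv_l0': (x, y), 'mv_l1': (x, y) or None} -- a hypothetical
    candidate layout used only for this sketch.
    """
    out = dict(cand)
    for key in ("mv_l0", "mv_l1"):
        if out.get(key) is not None:
            out[key] = tuple(round_mv_component_to_int(c) for c in out[key])
    return out

# e.g., (68, -5) in 1/16-pel units (4.25, -0.3125 pel) rounds to (64, 0):
# round_merge_candidate({'mv_l0': (68, -5), 'mv_l1': None})
```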
4. It is proposed that the MMVD side information (such as distance table, directions) may be dependent on block dimension and/or prediction direction (e.g., uni-prediction or bi-prediction) .
a. In one example, a distance table with all integer precisions may be defined or signaled.
b. In one example, if the base merge candidate is associated with motion vectors of fractional precision, it may be firstly modified (such as via rounding) to integer precision and then used to derive the final motion vectors for motion compensation.
5. It is proposed that the MV in MMVD mode may be constrained to integer-pel precision or half-pel precision for some block sizes or block shapes.
a. In one example, if integer-pel precision is selected for an MMVD coded block, the base merge candidates used in MMVD may be firstly modified to integer-pel precision (such as via rounding) .
b. In one example, if half-pel precision is selected for an MMVD coded block, the base merge candidates used in MMVD may be modified to half-pel precision (such as via rounding) .
i. In one example, rounding may be performed in the base merge list construction process, therefore, rounded MVs are used in pruning.
ii. In one example, rounding may be performed after the base merge list construction process, therefore, unrounded MVs are used in pruning.
c. In one example, if integer-pel precision or half-pel precision is used for MMVD mode, only MVDs with the same or lower precision are allowed.
i. For example, if integer-pel precision is used for MMVD mode, only integer-pel precision, 2-pel precision or N-pel precision (N >= 1) MVDs are allowed.
d. In one example, if K MVDs are not allowed in MMVD mode, the binarization of the MVD index may be modified because the maximum MVD index is M - K - 1 instead of M - 1 (see the binarization sketch after item 5). Meanwhile, a different context may be used in CABAC coding.
e. In one example, rounding may be performed after deriving the MV in MMVD mode.
f. The constraint may be different for bi-prediction and uni-prediction. For example, the constraint may not be applied in uni-prediction.
g. The constraint may be different for different block sizes or block shapes.
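The binarization change in item 5.d can be illustrated with truncated unary coding, where removing K disallowed MVDs shortens the longest codeword (a sketch; the CABAC context modelling is not shown):

```python
def truncated_unary_bins(index, c_max):
    """Truncated unary binarization: 'index' one-bins, then a terminating
    zero-bin unless index equals the maximum value c_max."""
    bins = [1] * index
    if index < c_max:
        bins.append(0)
    return bins

M, K = 8, 2
# The maximum MVD index drops from M - 1 to M - K - 1 when K MVDs are
# disallowed, shortening the longest codeword:
print(len(truncated_unary_bins(M - 1, M - 1)))          # 7 bins
print(len(truncated_unary_bins(M - K - 1, M - K - 1)))  # 5 bins
```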
6. It is proposed that the maximum number of half-pel MV components or/and quarter-pel MV components (e.g., horizontal MV or vertical MV) may be constrained for some block sizes or block shapes; a conformance-check sketch is given after item 6 below.
a. In one example, the bitstream shall conform to the constraint.
b. The constraint may be different for bi-prediction and uni-prediction. For example, the constraint may not be applied in uni-prediction.
i. For example, such a constraint may be applied to bi-predicted 4x8 or/and 8x4 or/and 4x16 or/and 16x4 blocks; however, it may not be applied to uni-predicted 4x8 or/and 8x4 or/and 4x16 or/and 16x4 blocks.
ii. For example, such a constraint may be applied to both bi-predicted and uni-predicted 4x4 blocks.
c. The constraint may be different for different block sizes or block shapes.
d. The constraint may be applied to triangle mode.
i. For example, such a constraint may be applied to 4x16 or/and 16x4 blocks coded in triangle mode.
e. In one example, for bi-predicted blocks, at most 3 quarter-pel MV components may be allowed.
f. In one example, for bi-predicted blocks, at most 2 quarter-pel MV components may be allowed.
g. In one example, for bi-predicted blocks, at most 1 quarter-pel MV component may be allowed.
h. In one example, for bi-predicted blocks, at most 0 quarter-pel MV components may be allowed.
i. In one example, for uni-predicted blocks, at most 1 quarter-pel MV component may be allowed.
j. In one example, for uni-predicted blocks, at most 0 quarter-pel MV components may be allowed.
k. In one example, for bi-predicted blocks, at most 3 fractional MV components may be allowed.
l. In one example, for bi-predicted blocks, at most 2 fractional MV components may be allowed.
m. In one example, for bi-predicted blocks, at most 1 fractional MV component may be allowed.
n. In one example, for bi-predicted blocks, at most 0 fractional MV components may be allowed.
o. In one example, for uni-predicted blocks, at most 1 fractional MV component may be allowed.
p. In one example, for uni-predicted blocks, at most 0 fractional MV components may be allowed.
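A conformance-style check for item 6 might look like the following sketch. The 1/16-pel MV storage unit and the MV-list layout are our assumptions; the threshold would come from one of the sub-items 6.e through 6.p above:

```python
MV_SHIFT = 4                      # assumed: MVs stored in 1/16-pel units
FRAC_MASK = (1 << MV_SHIFT) - 1   # non-zero sub-pel bits => fractional

def count_fractional_components(mv_list):
    """Count fractional MV components over all prediction directions.

    mv_list: e.g. [(mvx_l0, mvy_l0), (mvx_l1, mvy_l1)] for a bi-predicted
    block, or a single pair for a uni-predicted block.
    """
    return sum(int((mvx & FRAC_MASK) != 0) + int((mvy & FRAC_MASK) != 0)
               for mvx, mvy in mv_list)

def conforms_to_constraint(mv_list, max_fractional_components):
    """True when the block satisfies the fractional-MV-component constraint."""
    return count_fractional_components(mv_list) <= max_fractional_components

# Bi-predicted block with MVs (4.25, 7) and (2, 3.5) in pel units, i.e.
# (68, 112) and (32, 56) in 1/16-pel units: exactly 2 fractional components.
assert conforms_to_constraint([(68, 112), (32, 56)], 2)
```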
7. It is proposed that some components of an MV may be rounded to integer-pel precision or half-pel precision depending on the dimension (e.g., width and/or height, ratio of width and height), or/and prediction direction or/and motion information of a block.
a. In one example, the MV is rounded to the nearest integer-pel precision MV or/and half-pel precision MV.
b. In one example, different rounding methods may be used. For example, rounding down, rounding up, rounding towards zero or rounding away from zero may be used.
c. In one example, if the size (i.e., width × height) of a block is smaller than (or larger than), and/or equal to, a threshold L (e.g., L = 16 or 64), MV rounding may be applied to the horizontal or/and vertical MV component.
d. In one example, if the width (or height) of a block is smaller than (and/or equal to) a threshold L1 (e.g., L1 = 4, 8), MV rounding may be applied to the horizontal (or vertical) MV component.
e. In one example, thresholds L and L1 may be different for bi-predicted blocks and uni-predicted blocks. For example, smaller thresholds may be used for bi-predicted blocks.
f. In one example, if the ratio between width and height is larger than a first threshold or smaller than a second threshold (such as for narrow blocks like 4x16 or 16x4), MV rounding may be applied.
g. In one example, MV rounding may be applied only when both the horizontal and vertical components of the MV are fractional, i.e., they point to a fractional pixel position instead of an integer pixel position.
h. Whether MV rounding is applied or not may depend on whether the current block is bi-predicted or uni-predicted.
i. For example, MV rounding may be applied only when the current block is bi-predicted.
i. Whether MV rounding is applied or not may depend on the prediction direction (e.g., from List 0 or List 1) and/or the associated motion vectors. In one example, for bi-predicted blocks, whether MV rounding is applied or not may be different for different prediction directions.
i. In one example, if the MV of prediction direction X (X = 0 or 1) has fractional components in both horizontal and vertical directions, then MV rounding may be applied to N MV components for prediction direction X; otherwise, MV rounding may not be applied. Here, N = 0, 1 or 2.
ii. In one example, if N (N >= 0) MV components have fractional precision, MV rounding may be applied to M (0 <= M <= N) of the N MV components.
1. N and M may be different for bi-predicted blocks and uni-predicted blocks.
2. N and M may be different for different block sizes (width or/and height or/and width × height).
3. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 4.
4. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 3.
5. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 2.
6. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 1.
7. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 3.
8. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 2.
9. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 1.
10. For example, for bi-predicted blocks, N is equal to 2 and M is equal to 2.
11. For example, for bi-predicted blocks, N is equal to 2 and M is equal to 1.
12. For example, for bi-predicted blocks, N is equal to 1 and M is equal to 1.
13. For example, for uni-predicted blocks, N is equal to 2 and M is equal to 2.
14. For example, for uni-predicted blocks, N is equal to 2 and M is equal to 1.
15. For example, for uni-predicted blocks, N is equal to 1 and M is equal to 1.
iii. In one example, K of the M MV components are rounded to integer-pel precision and M – K MV components are rounded to half-pel precision, wherein K = 0, 1, …, M – 1.
j. Whether MV rounding is applied or not may be different for different color components such as Y, Cb and Cr.
i. For example, whether to and how to apply MV rounding may depend on color formats such as 4:2:0, 4:2:2 or 4:4:4.
k. Whether and/or how MV rounding is applied or not may depend on the block size (or width, height) , block shapes, prediction direction etc.
i. In one example, some MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks may be rounded to half-pel precision.
ii. In one example, some MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks may be rounded to integer-pel precision.
iii. In one example, some MV components of 4x4 uni-predicted or/and bi-predicted luma blocks may be rounded to integer-pel precision.
iv. In one example, some MV components of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks may be rounded to integer-pel precision.
l. In one example, the MV rounding may not be applied to sub-block prediction, such as affine prediction.
i. In an alternative example, the MV rounding may be applied to sub-block prediction, such as ATMVP prediction. In such a case, each sub-block is treated as a coding block to judge whether and how to apply MV rounding.
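A minimal sketch of the dimension-dependent rounding in bullet 7 follows. The thresholds (L = 64, L1 = 4), the bi-prediction gating and the round-to-nearest behavior are assumed example values; the proposal leaves all of them open.

```python
# Sketch of bullets 7.c/7.d: round MV components to integer-pel precision
# depending on block dimensions. Assumes 1/16-pel MV storage and example
# thresholds; other sub-bullets allow rounding to half-pel instead.

L, L1 = 64, 4  # assumed thresholds

def round_to_int_pel(v: int) -> int:
    """Round a 1/16-pel value to the nearest integer-pel, ties away from zero."""
    off = 8
    return (((v + off) >> 4) << 4) if v >= 0 else -(((-v + off) >> 4) << 4)

def maybe_round_mv(mv, width: int, height: int, bi_predicted: bool):
    mvx, mvy = mv
    if bi_predicted and width * height <= L:  # bullet 7.c: small bi-predicted block
        return round_to_int_pel(mvx), round_to_int_pel(mvy)
    if width <= L1:                           # bullet 7.d: narrow -> horizontal
        mvx = round_to_int_pel(mvx)
    if height <= L1:                          # bullet 7.d: short -> vertical
        mvy = round_to_int_pel(mvy)
    return mvx, mvy
```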
8. It is proposed that for certain block sizes, the motion vectors of a block shall be modified to integer precision before being utilized for motion compensation, for example, when they have fractional precision.
9. In one example, for certain block dimensions, the stored motion vectors and those utilized for motion compensation may be in different precisions.
a. In one example, sub-pel precision (a.k.a. fractional precision, such as 1/4-pel, 1/16-pel) may be stored for blocks with certain block dimensions, but the motion compensation process is based on an integer version of those motion vectors (such as via rounding).
10. It is proposed that an indication of disallowing bi-prediction for certain block dimensions may be signaled in sequence parameter set/picture parameter set/sequence header/picture header/tile header/tile group header/CTU rows/regions/other high-level syntax.
a. Alternatively, an indication of disallowing uni-prediction for certain block dimensions may be signaled in sequence parameter set/picture parameter set/sequence header/picture header/tile header/tile group header/CTU rows/regions/other high-level syntax.
b. Alternatively, an indication of disallowing bi-prediction and/or uni-prediction for certain block dimensions may be signaled in sequence parameter set/picture parameter set/sequence header/picture header/tile header/tile group header/CTU rows/regions/other high-level syntax.
c. Alternatively, furthermore, such indications may be only applied to certain modes, such as non-affine mode.
d. Alternatively, furthermore, when uni-/bi-prediction is disallowed for a block, the signaling of AMVR indices may be modified accordingly, such as only integer-pel precisions are allowed, or different MV precisions may be utilized instead.
e. Alternatively, furthermore, the above methods (such as bullets 3-9) may also be applicable.
11. It is proposed that a conformance bitstream shall follow the rule that for certain block dimensions, only integer-pel motion vectors are allowed for bi-prediction coded blocks.
12. Signaling of AMVR flag may depend on whether fractional motion vectors are allowed for a block.
a. In one example, if fractional (i.e., 1/4-pel) MV/MVD precision is disallowed for a block, the flag indicating whether the MV/MVD precision of the current block is 1/4-pel may be skipped and implicitly derived to be false.
13. In one example, the block dimensions mentioned above are, for example, 4x16, 16x4, 4x8, 8x4, 4x4.
14. It is proposed that different interpolation filters (e.g., with different filter taps and/or different filter coefficients) may be used in interpolation depending on the dimension (e.g., width and/or height, ratio of width and height) of a block.
a. Different filters may be used for vertical interpolation and horizontal interpolation. For example, a shorter-tap filter may be applied for vertical interpolation than for horizontal interpolation.
b. In one example, interpolation filters with fewer taps than the interpolation filters in VTM-3.0 may be applied in some cases. These interpolation filters with fewer taps are also called “short-tap filters”.
c. In one example, if the size (i.e., width × height) of a block is smaller than (or larger than), and/or equal to, a threshold L (e.g., L = 16 or 64), different filters (e.g., short-tap filters) may be used for horizontal or/and vertical interpolation.
d. In one example, if the width (or height) of a block is smaller than (and/or equal to) a threshold L1 (e.g., L1 = 4, 8), different filters (e.g., short-tap filters) may be used for horizontal (or vertical) interpolation.
e. In one example, if the ratio between width and height is larger than a first threshold or smaller than a second threshold (such as for narrow blocks like 4x16 or 16x4), a filter different from those used for other kinds of blocks (e.g., a short-tap filter) may be selected.
f. In one example, the short-tap filters may be used only when both the horizontal and vertical components of the MV are fractional, i.e., they point to a fractional pixel position instead of an integer pixel position.
g. Which filter is used (e.g., whether the short-tap filters are used or not) may depend on whether the current block is bi-predicted or uni-predicted.
i. For example, the short-tap filters may be used only when the current block is bi-predicted.
h. Which filter is used (e.g., whether the short-tap filters are used or not) may depend on the prediction direction (e.g., from List 0 or List 1) and/or the associated motion vectors. In one example, for bi-predicted blocks, whether short-tap filters are used or not may be different for different prediction directions.
i. In one example, if the MV of prediction direction X (X = 0 or 1) has fractional components in both horizontal and vertical directions, then short-tap filters are used for prediction direction X; otherwise, short-tap filters are not used.
ii. In one example, if N (N >= 0) MV components have fractional precision, short-tap filters may be applied to M (0 <= M <= N) of the N MV components.
1. N and M may be different for bi-predicted blocks and uni-predicted blocks.
2. N and M may be different for different block sizes (width or/and height or/and width × height).
3. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 4.
4. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 3.
5. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 2.
6. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 1.
7. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 3.
8. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 2.
9. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 1.
10. For example, for bi-predicted blocks, N is equal to 2 and M is equal to 2.
11. For example, for bi-predicted blocks, N is equal to 2 and M is equal to 1.
12. For example, for bi-predicted blocks, N is equal to 1 and M is equal to 1.
13. For example, for uni-predicted blocks, N is equal to 2 and M is equal to 2.
14. For example, for uni-predicted blocks, N is equal to 2 and M is equal to 1.
15. For example, for uni-predicted blocks, N is equal to 1 and M is equal to 1.
iii. Different short-tap filters may be used for the M MV components.
1. In one example, K of the M MV components use an S1-tap filter, and M – K MV components use an S2-tap filter, wherein K = 0, 1, …, M – 1. For example, S1 is equal to 6 and S2 is equal to 4.
i. In one example, different filters (e.g., the short-tap filters) may be used only for some pixels. For example, they are used only for boundary pixels of the block.
i. For example, they are only used for the N1 right columns or/and N2 left columns or/and N3 top rows or/and N4 bottom rows of the block.
j. Whether short-tap filters are used or not may be different for uni-predicted blocks and bi-predicted blocks.
k. Whether short-tap filters are used or not may be different for different color components such as Y, Cb and Cr.
i. For example, whether to and how to apply short-tap filters may depend on color formats such as 4:2:0, 4:2:2 or 4:4:4.
l. Different short-tap filters may be used for different blocks. The selected short-tap filters may depend on the block size (or width, height) , block shapes, prediction direction etc.
i. In one example, a 7-tap filter is used for horizontal and vertical interpolation of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks.
ii. In one example, a 7-tap filter is used for horizontal (or vertical) interpolation of 4x4 uni-predicted or/and bi-predicted luma blocks.
iii. In one example, a 6-tap filter is used for horizontal and vertical interpolation of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
1. Alternatively, a 6-tap filter and a 5-tap filter (or a 5-tap filter and a 6-tap filter) are used in horizontal interpolation and vertical interpolation, respectively, for 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
m. Different short-tap filters may be used for different kinds of motion vectors.
i. In one example, longer-tap filters may be used for motion vectors that only have fractional components in one direction (i.e., either the horizontal or the vertical direction), and shorter-tap filters may be used for motion vectors that have fractional components in both horizontal and vertical directions.
ii. For example, an 8-tap filter is used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one direction, and the short-tap filters described in bullet h above are used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that have fractional MV components in both directions.
iii. In one example, interpolation filters used for affine motion may be different from that used for translational motion vectors.
iv. In one example, shorter-tap interpolation filters may be used for affine motion than those used for translational motion vectors.
n. In one example, the short-tap filters may not be applied to sub-block prediction, such as affine prediction.
i. In an alternative example, the short-tap filters may be applied to sub-block prediction, such as ATMVP prediction. In such a case, each sub-block is treated as a coding block to judge whether and how to apply short-tap filters.
o. In one example, whether to apply short-tap filters and/or how to apply short-tap filters may depend on the block dimension, coded information, etc.
i. In one example, when a certain mode, such as OBMC or interweaved affine prediction mode, is enabled for a block, short-tap filters may be applied.
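To make the selection concrete, the sketch below maps block dimensions and MV fractionality to tap counts using example choices from bullets 14.l and 14.m. The exact mapping is an assumption; the proposal only requires that tap length may vary with these inputs.

```python
# Sketch of the filter-length selection in bullet 14, using example tap
# counts from bullets 14.l and 14.m; the mapping itself is illustrative.

def interpolation_taps(width: int, height: int, frac_x: bool, frac_y: bool):
    """Return (horizontal_taps, vertical_taps) for a luma block."""
    if not (frac_x and frac_y):
        return 8, 8      # bullet 14.m: full-length filter when only one
                         # direction needs fractional interpolation
    if (width, height) in {(4, 16), (16, 4)}:
        return 7, 7      # bullet 14.l.i
    if (width, height) == (4, 4):
        return 7, 7      # bullet 14.l.ii
    if (width, height) in {(4, 8), (8, 4)}:
        return 6, 5      # bullet 14.l.iii alternative: 6-tap H, 5-tap V
    return 8, 8          # default 8-tap filters as in VTM-3.0
```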
15. It is proposed that (W + N – 1 – PW) * (H + N – 1 – PH) reference pixels (instead of (W + N – 1) * (H + N – 1) reference pixels) may be fetched for motion compensation of a WxH block, wherein PW and PH cannot both be equal to 0.
a. In one example, furthermore, for the remaining reference pixels (not fetched, but required for motion compensation), padding or derivation from the fetched reference samples may be applied.
b. Alternatively, furthermore, pixels at the reference block boundaries (top, left, bottom and right boundaries) are repeated to generate a (W + N – 1) * (H + N – 1) block, which is used for the final interpolation. An example is shown in Figure 21, in which W = 8, H = 4, N = 7, PW = 2 and PH = 3.
c. The fetched reference pixels may be identified by (x + MVXInt – N/2 + offSet1, y + MVYInt – N/2 + offSet2), wherein (x, y) is the top-left position of the current block, (MVXInt, MVYInt) is the integer part of the MV, and offSet1 and offSet2 are integers such as -2, -1, 0, 1, 2, etc.
d. In one example, PH is zero, and only left or/and right boundaries are repeated.
e. In one example, PW is zero, and only top or/and bottom boundaries are repeated.
f. In one example, both PW and PH are greater than zero, and first the left or/and the right boundaries are repeated, and then the top or/and bottom boundaries are repeated.
g. In one example, both PW and PH are greater than zero, and first the top or/and bottom boundaries are repeated, and then the left or/and right boundaries are repeated.
h. In one example, the left boundary is repeated M1 times and the right boundary is repeated PW – M1 times, wherein M1 is an integer and M1 >= 0.
i. Alternatively, if M1 (or PW – M1) is greater than 1, instead of repeating the first left (or right) column M1 times, multiple columns may be utilized; for example, the M1 left columns (or PW – M1 right columns) may be repeated.
i. In one example, the top boundary is repeated M2 times and the bottom boundary is repeated PH – M2 times, wherein M2 is an integer and M2 >= 0.
i. Alternatively, if M2 (or PH – M2) is greater than 1, instead of repeating the first top (or bottom) row M2 times, multiple rows may be utilized; for example, the M2 top rows (or PH – M2 bottom rows) may be repeated.
j. In one example, some default values may be used for boundary padding.
k. In one example, such boundary-pixel repeating method may be used only when both the horizontal and vertical components of the MV are fractional, i.e., they point to a fractional pixel position instead of an integer pixel position.
l. In one example, such boundary pixels repeating method may be applied to some of or all reference blocks.
i. In one example, if the MV of prediction direction X (X = 0 or 1) has fractional components in both horizontal and vertical directions, then such boundary-pixel repeating method is used for prediction direction X; otherwise, it is not used.
ii. In one example, if N (N >= 0) MV components have fractional precision, the boundary-pixel repeating method may be applied to M (0 <= M <= N) of the N MV components.
1. N and M may be different for bi-predicted blocks and uni-predicted blocks.
2. N and M may be different for different block sizes (width or/and height or/and width × height).
3. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 4.
4. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 3.
5. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 2.
6. For example, for bi-predicted blocks, N is equal to 4 and M is equal to 1.
7. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 3.
8. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 2.
9. For example, for bi-predicted blocks, N is equal to 3 and M is equal to 1.
10. For example, for bi-predicted blocks, N is equal to 2 and M is equal to 2.
11. For example, for bi-predicted blocks, N is equal to 2 and M is equal to 1.
12. For example, for bi-predicted blocks, N is equal to 1 and M is equal to 1.
13. For example, for uni-predicted blocks, N is equal to 2 and M is equal to 2.
14. For example, for uni-predicted blocks, N is equal to 2 and M is equal to 1.
15. For example, for uni-predicted blocks, N is equal to 1 and M is equal to 1.
iii. Different boundary-pixel repeating methods may be used for the M MV components.
m. PW and/or PH may be different for different color components such as Y, Cb and Cr.
i. For example, whether to and how to apply boundary-pixel repeating may depend on color formats such as 4:2:0, 4:2:2 or 4:4:4.
n. In one example, PW and/or PH may be different for different block sizes or shapes.
i. In one example, PW and PH are set equal to 1 for 4x16 or/and 16x4 bi-predicted or/and uni-predicted blocks.
ii. In one example, PW and PH are set equal to 0 and 1 (or 1 and 0), respectively, for 4x4 bi-predicted or/and uni-predicted blocks.
iii. In one example, PW and PH are set equal to 2 for 4x8 or/and 8x4 bi-predicted or/and uni-predicted blocks.
1. Alternatively, PW and PH are set equal to 2 and 3 (or 3 and 2), respectively, for 4x8 or/and 8x4 bi-predicted or/and uni-predicted blocks.
o. In one example, PW and PH may be different for uni-prediction and bi-prediction.
p. PW and PH may be different for different kinds of motion vectors.
i. In one example, PW and PH may be smaller (or even zero) for motion vectors that only have fractional components in one direction (i.e., either the horizontal or the vertical direction), and they may be larger for motion vectors that have fractional components in both horizontal and vertical directions.
ii. For example, PW and PH are set equal to 0 for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one direction, and the PW and PH described in bullet n above are used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that have fractional MV components in both directions.
Figure 21 shows an example of repeating boundary pixels of a reference block before interpolation.
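The reduced fetch and padding of bullet 15 can be sketched as follows. This sketch assumes the simplest split, M1 = M2 = 0 (only the right column and bottom row are repeated), and the left/right-then-top/bottom order of bullet 15.f; Figure 21 itself uses a different split, repeating the top, left and right boundaries once and the bottom boundary twice.

```python
# Sketch of bullet 15: fetch a reduced (W+N-1-PW) x (H+N-1-PH) region and
# repeat boundary pixels to rebuild the (W+N-1) x (H+N-1) block that the
# interpolation needs. M1 = M2 = 0 is one permissible choice under bullets
# 15.h and 15.i, not the layout drawn in Figure 21.

def pad_reference(block, pw: int, ph: int):
    """block: fetched pixels as a list of rows; returns the padded block."""
    rows = [row + [row[-1]] * pw for row in block]  # extend the right edge
    rows += [list(rows[-1]) for _ in range(ph)]     # then the bottom edge
    return rows

# Figure 21 dimensions: W=8, H=4, N=7, PW=2, PH=3 -> fetch 12x7, pad to 14x10.
W, H, N, PW, PH = 8, 4, 7, 2, 3
fetched = [[0] * (W + N - 1 - PW) for _ in range(H + N - 1 - PH)]
padded = pad_reference(fetched, PW, PH)
assert len(padded) == H + N - 1 and len(padded[0]) == W + N - 1
```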
16. The proposed methods may be applied to certain modes, block sizes/shapes, and/or certain sub-block sizes.
a. The proposed methods may be applied to certain modes, such as bi-predicted mode.
b. The proposed methods may be applied to certain block sizes.
i. In one example, it is only applied to a block with w×h<=T, where w and h are the width and height of the current block.
ii. In one example, it is only applied to a block with h <=T.
c. The proposed methods may be applied to certain color component (such as only luma component) .
17. The rounding operations mentioned above may be defined as:
a. Shift (x, s) is defined as
Shift (x, s) = (x + off) >> s
b. SignShift (x, s) is defined as
SignShift (x, s) = (x + off) >> s, if x >= 0
SignShift (x, s) = – ( (–x + off) >> s), if x < 0
where off is an integer such as 0 or 2^(s-1).
c. It may be defined as those used for motion vector rounding in the AMVR process, affine process or other process modules.
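The two operators transcribe directly into code; the assertions are a worked example with s = 4 and off = 2^(s-1) = 8, where values round to the nearest multiple of 16.

```python
# Direct transcription of the rounding operators defined in bullet 17.

def Shift(x: int, s: int, off: int) -> int:
    return (x + off) >> s

def SignShift(x: int, s: int, off: int) -> int:
    # Shifts the magnitude, so rounding is symmetric about zero.
    return (x + off) >> s if x >= 0 else -((-x + off) >> s)

assert SignShift(9, 4, 8) == 1 and SignShift(-9, 4, 8) == -1
# The two operators differ on negative ties: -8/16 = -0.5 rounds up with
# Shift but away from zero with SignShift.
assert Shift(-8, 4, 8) == 0 and SignShift(-8, 4, 8) == -1
```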
18. In one example, how to round the MVs may depend on the MV components.
a. For example, the y-component of the MV is rounded to integer pixels but the x-component of the MV is not rounded.
b. In one example, the MV may be rounded to integer-pel precision before motion compensation for the luma component, but rounded to 2-pel precision before motion compensation for the chroma components when the color format is 4:2:0.
19. It is proposed that a bilinear filter is used for interpolation filtering in one or multiple specific cases, such as:
a. 4x4 uni-prediction;
b. 4x8 bi-prediction;
c. 8x4 bi-prediction;
d. 4x16 bi-prediction;
e. 16x4 bi-prediction;
f. 8x8 bi-prediction;
g. 8x4 uni-prediction;
h. 4x8 uni-prediction.
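As a sketch of what bullet 19 means in practice, bilinear interpolation needs only a 2x2 neighborhood per output pixel instead of the 8x8 footprint of separable 8-tap filtering. The 1/16-pel fractional offsets and the single combined rounding shift are assumptions; the bit-depth handling and clipping of a real codec are omitted.

```python
# Sketch of the bilinear interpolation proposed in bullet 19 for small
# uni-/bi-predicted blocks. Assumes 1/16-pel fractional offsets (fx, fy).

def bilinear_sample(ref, x: int, y: int, fx: int, fy: int) -> int:
    """Interpolate one pixel at (x + fx/16, y + fy/16) from a 2D list `ref`."""
    a, b = ref[y][x], ref[y][x + 1]
    c, d = ref[y + 1][x], ref[y + 1][x + 1]
    top = a * (16 - fx) + b * fx    # horizontal pass on row y
    bot = c * (16 - fx) + d * fx    # horizontal pass on row y + 1
    return (top * (16 - fy) + bot * fy + 128) >> 8  # vertical pass, /256 rounded

# Halfway between four samples, the result equals their average.
assert bilinear_sample([[0, 16], [16, 16]], 0, 0, 8, 8) == 12
```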
20. It is proposed that, when multi-hypothesis prediction is applied to one block, short-tap or otherwise different interpolation filters may be applied compared to the filters applied in normal prediction mode.
a. In one example, bilinear filter may be used.
b. A short-tap or second interpolation filter may be applied to a reference picture list that involves multiple reference blocks, while for another reference picture list with only one reference block, the same filter as that used for normal prediction mode may be applied.
c. The proposed method may be applied under certain conditions, such as certain temporal layer(s), or the quantization parameter of a block/tile/slice/picture containing the block being within a range (such as larger than a threshold).
FIG. 17 is a block diagram of a video processing apparatus 1700. The apparatus 1700 may be used to implement one or more of the methods described herein. The apparatus 1700 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 1700 may include one or more processors 1702, one or more memories 1704 and video processing hardware 1706. The processor(s) 1702 may be configured to implement one or more methods described in the present document. The memory (memories) 1704 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 1706 may be used to implement, in hardware circuitry, some techniques described in the present document.
FIG. 19 is a flowchart for a method 1900 of video bitstream processing. The method 1900 includes determining (1905) a shape of a video block, determining (1910) an interpolation order based on the video block, the interpolation order being indicative of a sequence of performing horizontal interpolation and vertical interpolation, and performing the horizontal interpolation and the vertical interpolation in accordance with the interpolation order for the video block to reconstruct (1915) a decoded representation of the video block.
FIG. 20 is a flowchart for a method 2000 of video bitstream processing. The method 2000 includes determining (2005) characteristics of a motion vector related to a video block, determining (2010) an interpolation order of the video block based on the characteristics of the motion vector, the interpolation order being indicative of a sequence of performing horizontal interpolation and vertical interpolation, and performing the horizontal interpolation and the vertical interpolation in accordance with the interpolation order for the video block to reconstruct (2015) a decoded representation of the video block.
FIG. 22 is a flowchart for a method 2200 of video bitstream processing. The method 2200 includes determining (2205) dimension characteristics of a first video block, determining (2210) that a first interpolation filter is to be applied to the first video block based on the determination of the dimension characteristics, and performing (2215) further processing of the first video block using the first interpolation filter.
FIG. 23 is a flowchart for a method 2300 of video bitstream processing. The method 2300 includes determining (2305) first characteristics of a first video block, determining (2310) that a first interpolation filter is to be applied to the first video block based on the determination of the first characteristics, performing (2315) further processing of the first video block using the first interpolation filter, determining (2320) second characteristics of a second video block, determining (2325) that a second interpolation filter is to be applied to the second video block based on the second characteristics, the first interpolation filter and the second interpolation filter being different short-tap filters, and performing (2330) further processing of the second video block using the second interpolation filter.
With reference to methods 1900, 2000, 2200, and 2300, some examples of sequences of performing horizontal interpolation and vertical interpolation and their use are described in Section 4 of the present document. For example, as described in Section 4, under different shapes of the video block, a preference may be given to performing one of the horizontal interpolation or vertical interpolation first. In some embodiments, the horizontal interpolation is performed before the vertical interpolation, and in some embodiments the vertical interpolation is performed before the horizontal interpolation.
With reference to  methods  1900, 2000, 2200, and 2300, the video block may be encoded in the video bitstream in which bit efficiency may be achieved by using a bitstream generation rule related to interpolation orders that also depends on the shape of the video block.
The methods can include wherein rounding the motion vectors includes one or more of: rounding to a nearest integer-pel precision MV, or rounding to a half-pel precision MV.
The methods can include wherein rounding the MVs includes one or more of: rounding down, rounding up, rounding towards zero, or rounding away from zero.
The methods can include wherein the dimension information represents that a size of the first video block is less than a threshold value, and rounding the MVs is applied to one or both of a horizontal MV component or a vertical MV component based on the dimension information representing that the size of the first video block is less than the threshold value.
The methods can include wherein the dimension information represents that a width or a height of the first video block is less than a threshold value, and rounding the MVs is applied to one or both of a horizontal MV component or a vertical MV component based on the dimension information representing that the width or the height of the first video block is less than the threshold value.
The methods can include wherein the threshold value is different for bi-predicted blocks and uni-predicted blocks.
The methods can include wherein the dimension information represents a ratio between  a width and a height of the first video block is larger than a first threshold value or smaller than a second threshold value, and wherein the rounding of the MVs is based on the determination of the dimension information.
The methods can include wherein rounding the MVs is further based on both horizontal and vertical components of the MVs being fractional.
The methods can include wherein rounding the MVs is further based on the first video block being bi-predicted or uni-predicted.
The methods can include wherein rounding the MVs is further based on a prediction direction related to the first video block.
The methods can include wherein rounding the MVs is further based on color components of the first video block.
The methods can include wherein rounding the MVs is further based on a size of the first video block, a shape of the first video block, or a prediction shape of the first video block.
The methods can include wherein rounding the MVs is applied on sub-block prediction.
The methods can include wherein a short-tap filter is applied to MV components based on the MV components having fractional precision.
The methods can include wherein short-tap filters are applied based on a dimension of the first video block, or coded information of the first video block.
The methods can include wherein short-tap filters are applied based on a mode of the first video block.
The methods can include wherein default values are used for boundary padding related to the first video block.
The methods can include wherein the merge mode is one or more of: a regular merge list, a triangular merge list, an affine merge list, or other non-intra or non-AMVP mode.
The methods can include wherein merge candidates with fractional merge candidates are excluded from a merge list.
The methods can include wherein rounding the motion information includes rounding a merge candidate associated with fractional motion vectors to integer precision, and the modified motion information is inserted into a merge list.
The methods can include wherein the motion information is a bi-prediction candidate.
The methods can include wherein MMVD is merge with motion vector difference.
The methods can include wherein the motion vectors are in MMVD mode.
The methods can include wherein the first video block is an MMVD coded block to be associated with integer-pel precision, and wherein base merge candidates used in MMVD are modified to integer-pel precision via rounding.
The methods can include wherein the first video block is an MMVD coded block to be associated with half-pel precision, and wherein base merge candidates used in MMVD are modified to half-pel precision via rounding.
The methods can include wherein the threshold number is a maximum number of allowed half-pel MV components or quarter-pel MV components.
The methods can include wherein the threshold number is different between bi-prediction and uni-prediction.
The methods can include wherein an indication disallowing bi-prediction is signaled in a sequence parameter set, a picture parameter set, a sequence header, a picture header, a tile header, a tile group header, a CTU row, a region, or other high-level syntax.
The methods can include wherein the methods are in conformance with a bitstream rule that allows for only integer-pel motion vectors for bi-prediction coded blocks having particular dimensions.
The methods can include wherein the first video block has a size of: 4x16, 16x4, 4x8, 8x4, or 4x4.
The methods can include wherein modifying or rounding the motion information includes modifying different MV components differently.
The methods can include wherein a y-component of a first MV is modified or rounded to integer-pixel, and an x-component of the first MV is not modified or rounded.
The methods can include wherein a luma component of a first MV is rounded to integer pixels, and a chroma component of the first MV is rounded to 2-pel pixels.
The methods can include wherein the first MV is related to a video block having a color format that is 4: 2: 0.
The methods can include wherein the bilinear filter is used for 4x4 uni-prediction, 4x8 bi-prediction, 8x4 bi-prediction, 4x16 bi-prediction, 16x4 bi-prediction, 8x8 bi-prediction, 8x4 uni-prediction, or 4x8 uni-prediction.
FIG. 24 is a flowchart for a method 2400 of video processing. The method 2400 includes determining (2402), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining (2404) filters with interpolation filter parameters used for interpolation of the first block based on the characteristics of the first block; and performing (2406) the conversion by using the filters with the interpolation filter parameters.
In some examples, the interpolation filter parameters include filter taps and/or interpolation filter coefficients, and the interpolation includes at least one of vertical interpolation and horizontal interpolation.
In some examples, the filters include short-tap filters with fewer taps than regular interpolation filters.
In some examples, the regular interpolation filters have 8 taps.
In some examples, the characteristics of the first block include dimension parameters including at least one of a width, a height, a ratio of width and height, and a size of width × height of the first block.
In some examples, the filter used for the vertical interpolation is different from the filter used for the horizontal interpolation in the number of taps.
In some examples, the filter used for the vertical interpolation has fewer taps than the filter used for the horizontal interpolation.
In some examples, the filter used for the horizontal interpolation has fewer taps than the filter used for the vertical interpolation.
In some examples, when the size of the first block is smaller than and/or equal to a threshold, the short-tap filters are used for the horizontal interpolation or/and the vertical interpolation.
In some examples, when the size of the first block is larger than and/or equal to a threshold, the short-tap filters are used for the horizontal interpolation or/and the vertical interpolation.
In some examples, when the width of the first block is smaller than and/or equal to a threshold, the short-tap filters are used for the horizontal interpolation, or when the height of the first block is smaller than and/or equal to a threshold, the short-tap filters are used for the vertical interpolation.
In some examples, when the ratio between the width and the height is larger than a first threshold or smaller than a second threshold, the short-tap filters are used for the vertical interpolation and/or horizontal interpolation.
In some examples, the characteristics of the first block include at least one motion vector (MV) associated with the first block.
In some examples, only when both horizontal and vertical components of the MV are fractional, the short-tap filters are used for the interpolation.
In some examples, the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
In some examples, whether the short-tap filters are used or not depends on the prediction parameter.
In some examples, only when the first block is bi-predicted, the short-tap filters are used for the interpolation.
In some examples, the characteristics of the first block include a prediction direction indicating List 0 or List 1 and/or associated motion vectors (MVs).
In some examples, whether the short-tap filters are used or not depends on the prediction direction of the first block and/or the MVs.
In some examples, in a case that the first block is a bi-predicted block, whether the short-tap filters are used or not is different for different prediction directions.
In some examples, if the MV of prediction direction X, X being 0 or 1, has fractional components in both horizontal and vertical directions, the short-tap filters are used for the prediction direction X; otherwise, the short-tap filters are not used.
In some examples, if N MV components have fractional precision, the short-tap filters are used for M MV components of the N MV components, wherein N, M are integers, and 0 <= M <= N.
In some examples, N and M are different for bi-predicted blocks and uni-predicted blocks.
In some examples, for bi-predicted blocks, N is equal to 4 and M is equal to 4, or N is equal to 4 and M is equal to 3, or N is equal to 4 and M is equal to 2, or N is equal to 4 and M is equal to 1, or N is equal to 3 and M is equal to 3, or N is equal to 3 and M is equal to 2, or N is equal to 3 and M is equal to 1, or N is equal to 2 and M is equal to 2, or N is equal to 2 and M is equal to 1, or N is equal to 1 and M is equal to 1.
In some examples, for uni-predicted blocks, N is equal to 2 and M is equal to 2, or N is equal to 2 and M is equal to 1, or N is equal to 1 and M is equal to 1.
In some examples, the short-tap filters include first short-tap filters with S1 taps and second short-tap filters with S2 taps, and wherein K MV components of the M MV components use the first short-tap filters, and (M – K) MV components of the M MV components use the second short-tap filters, wherein K is an integer in a range from 0 to M – 1, and S1 and S2 are integers.
In some examples, N and M are different for different dimension parameters of blocks, wherein the dimension parameters include width or/and height or/and width × height of the blocks.
In some examples, the characteristics of the first block include positions of the pixels of the first block.
In some examples, whether the short-tap filters are used or not depends on the positions of the pixels.
In some examples, the short-tap filters are used only for boundary pixels of the first block.
In some examples, the short-tap filters are used only for N1 right columns or/and N2 left columns or/and N3 top rows or/and N4 bottom rows of the first block, N1, N2, N3, N4 being integers.
In some examples, the characteristics of the first block include color components of the first block.
In some examples, whether the short-tap filters are used or not is different for different color components of the first block.
In some examples, the color components include Y, Cb and Cr.
In some examples, the characteristics of the first block include color formats of the first block.
In some examples, whether to and how to apply the short-tap filters depend on the color formats of the first block.
In some examples, the color formats include 4:2:0, 4:2:2 or 4:4:4.
In some examples, the filters include different short-tap filters with different taps, and the selection of the different short-tap filters is based on the characteristics of the blocks.
In some examples, a 7-tap filter is selected for horizontal and vertical interpolation of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks.
In some examples, a 7-tap filter is selected for horizontal or vertical interpolation of 4x4 uni-predicted or/and bi-predicted luma blocks.
In some examples, a 6-tap filter is selected for horizontal and vertical interpolation of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
In some examples, a 6-tap filter and a 5-tap filter, or a 5-tap filter and a 6-tap filter, are selected for horizontal interpolation and vertical interpolation, respectively, for 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks.
In some examples, the filters include different short-tap filters with different taps, and the different short-tap filters are used for different kinds of motion vectors (MVs).
In some examples, longer-tap filters from the different short-tap filters are used for MVs that only have fractional components in one of the horizontal or vertical directions, and shorter-tap filters from the different short-tap filters are used for MVs that have fractional components in both horizontal and vertical directions.
In some examples, an 8-tap filter is used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one of the horizontal or vertical directions, and short-tap filters are used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that have fractional MV components in both directions.
In some examples, filters used for affine motion are different from those used for translational motion vectors.
In some examples, filters used for affine motion have fewer taps compared to those used for translational motion vectors.
In some examples, the short-tap filters are not applied to sub-block based prediction including affine prediction.
In some examples, the short-tap filters are applied to sub-block based prediction including Advanced Temporal Motion Vector Prediction (ATMVP) prediction.
In some examples, each sub-block is used as a coding block to determine whether to and how to apply the short-tap filters.
In some examples, the characteristics of the first block include dimension parameters and coded information of the first block, and whether to and how to apply the short-tap filters depend on the block dimension and coded information of the first block.
In some examples, when a certain mode, including at least one of OBMC and interweaved affine prediction mode, is enabled for the first block, the short-tap filters are applied.
In some examples, the conversion generates the first/second block of video from the bitstream representation.
In some examples, the conversion generates the bitstream representation from the first/second block of video.
FIG. 25 is a flowchart for a method 2500 of video processing. The method 2500 includes fetching (2502), for a conversion between a first block of video and a bitstream representation of the first block, reference pixels of a first reference block from a reference picture, wherein the first reference block is smaller than a second reference block required for motion compensation of the first block; padding (2504) the first reference block with padding pixels to generate the second reference block required for motion compensation of the first block; and performing (2506) the conversion by using the generated second reference block.
In some examples, the first block has a size of W*H, the first reference block has a size of (W + N – 1 – PW) * (H + N – 1 – PH), and the second reference block has a size of (W + N – 1) * (H + N – 1), wherein W is the width of the first block, H is the height of the first block, N is the number of interpolation filter taps used for the first block, and PW and PH are integers.
In some examples, the step of padding the first reference block with padding pixels to generate the second reference block includes: repeating pixels at one or more boundaries of the first reference block as the padding pixels to generate the second reference block.
In some examples, the boundaries are top, left, bottom and right boundary of the first reference block.
In some examples, W = 8, H = 4, N = 7, PW = 2 and PH = 3.
In some examples, the pixels at the top, left and right boundary are repeated once, and the pixels at the bottom boundary are repeated twice.
In some examples, the fetched reference pixels are identified by (x + MVXInt – N/2 + offSet1, y + MVYInt – N/2 + offSet2), wherein (x, y) is the top-left position of the first block, (MVXInt, MVYInt) is the integer part of the motion vector (MV) for the first block, and offSet1 and offSet2 are integers.
In some examples, when PH is zero, only the pixels at the left or/and right boundaries of the first reference block are repeated.
In some examples, when PW is zero, only the pixels at the top or/and bottom boundaries of the first reference block are repeated.
In some examples, when both PW and PH are greater than zero, first the pixels at the left or/and the right boundaries of the first reference block are repeated, and then the pixels at the top or/and bottom boundaries of the first reference block are repeated, or first the top or/and bottom boundaries of the first reference block are repeated, and then the left or/and right boundaries of the first reference block are repeated.
In some examples, the pixels at the left boundary of the first reference block are repeated M1 times and the pixels at the right boundary of the first reference block are repeated (PW – M1) times, wherein M1 is an integer and M1 >= 0.
In some examples, the pixels of the M1 left columns of the first reference block, or the pixels of the (PW – M1) right columns of the first reference block, are repeated, wherein M1 > 1 or PW – M1 > 1.
In some examples, the pixels at the top boundary of the first reference block are repeated M2 times and the pixels at the bottom boundary of the first reference block are repeated (PH – M2) times, wherein M2 is an integer and M2 >= 0.
In some examples, the pixels of the M2 top rows of the first reference block, or the pixels of the (PH – M2) bottom rows of the first reference block, are repeated, wherein M2 > 1 or PH – M2 > 1.
In some examples, when both horizontal and vertical components of MV for the first block are fractional, pixels at one or more boundaries of the first reference block are repeated as the padding pixels to generate the second reference block.
In some examples, when MV of prediction direction X, X being 0 or 1, has fractional components in both horizontal and vertical directions, pixels at one or more boundaries of the first reference block are repeated as the padding pixels to generate the second reference block.
In some examples, the first reference block is any one of partial or all reference blocks of the first block.
In some examples, if MV of prediction direction X, X being 0 or 1, has fractional components in both horizontal and vertical directions, pixels at one or more boundaries of the first reference block are repeated as the padding pixels to generate the second reference block for prediction direction X; otherwise, the pixels are not repeated.
In some examples, if N2 MV components have fractional precision, pixels at one or more boundaries of the first reference block are repeated as the padding pixels to generate the second reference block for M MV components of the N2 MV components, wherein N2 and M are integers, and 0 <= M <= N2.
In some examples, N2 and M are different for bi-predicted blocks and uni-predicted blocks.
In some examples, N2 and M are different for different block sizes, the block size being associated with the width or/and height or/and width × height of the block.
In some examples, for bi-predicted blocks, N2 is equal to 4 and M is equal to 4, or N2 is equal to 4 and M is equal to 3, or N2 is equal to 4 and M is equal to 2, or N2 is equal to 4 and M is equal to 1, or N2 is equal to 3 and M is equal to 3, or N2 is equal to 3 and M is equal to 2, or N2 is equal to 3 and M is equal to 1, or N2 is equal to 2 and M is equal to 2, or N2 is equal to 2 and M is equal to 1, or N2 is equal to 1 and M is equal to 1.
In some examples, for uni-predicted blocks, N2 is equal to 2 and M is equal to 2, or N2 is equal to 2 and M is equal to 1, or N2 is equal to 1 and M is equal to 1.
In some examples, pixels at different boundaries of the first reference block are repeated as the padding pixels in different ways to generate the second reference block for the M MV components.
In some examples, when pixel padding is not used for a horizontal MV component, PW is set equal to zero when fetching the first reference block using the MV.
In some examples, when pixel padding is not used for a vertical MV component, PH is set equal to zero when fetching the first reference block using the MV.
In some examples, PW and/or PH are different for different color components of the first block.
In some examples, the color components includes Y, Cb and Cr.
In some examples, PW and/or PH are different for different block size or shape.
In some examples, PW and PH are set equal to 1 for 4x16 or/and 16x4 bi-predicted or/and uni-predicted blocks.
In some examples, PW and PH are set equal to 0 and 1, or 1 and 0 respectively, for 4x4 bi-predicted or/and uni-predicted blocks.
In some examples, PW and PH are set equal to 2 for 4x8 or/and 8x4 bi-predicted or/and uni-predicted blocks.
In some examples, PW and PH are set equal to 2 and 3, or 3 and 2 respectively, for 4x8 or/and 8x4 bi-predicted or/and uni-predicted blocks.
In some examples, PW and PH are different for uni-prediction and bi-prediction.
In some examples, PW and PH are different for different kinds of motion vectors.
In some examples, PW and PH are set to a smaller value or equal to zero for motion vectors (MVs) that only have fractional components in one of the horizontal or vertical directions, and PW and PH are set to a larger value for MVs that have fractional components in both horizontal and vertical directions.
In some examples, PW and PH are set equal to 0 for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that only have fractional MV components in one of the horizontal or vertical directions.
In some examples, the PW and PH are used for 4x16 or/and 16x4 or/and 4x8 or/and 8x4 or/and 4x4 bi-predicted or/and uni-predicted blocks that have fractional MV components in both horizontal and vertical directions.
In some examples, whether to and how to repeat pixels at the boundaries depend on the color formats of the first block.
In some examples, the color formats include 4:2:0, 4:2:2 or 4:4:4.
In some examples, the step of padding the first reference block with padding pixels to generate the second reference block includes: padding default values as the padding pixels to generate the second reference block.
In some examples, the conversion generates the first block of video from the bitstream representation.
In some examples, the conversion generates the bitstream representation from the first/second block of video.
FIG. 26 is a flowchart for a method 2600 of video processing. The method 2600 includes determining (2602), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing (2604) a rounding process on a motion vector (MV) of the first block based on the characteristics of the first block; and performing (2606) the conversion by using the rounded MV.
In some examples, performing the rounding process on the MV includes rounding the MV to integer-pel precision or half-pel precision.
In some examples, the MV is rounded to a nearest integer-pel precision MV or half-pel precision MV.
In some examples, performing the rounding process on the MV includes rounding up, rounding down, rounding towards zero or rounding away from zero of the MV.
In some examples, the characteristics of the first block include dimension parameters including at least one of a width, a height, a ratio of width and height, and a size of width × height of the first block.
In some examples, when the size of the first block is smaller than and/or equal to a threshold L, rounding process is performed on horizontal or/and vertical component of the MV.
In some examples, when the size of the first block is larger than and/or equal to a threshold L, rounding process is performed on horizontal or/and vertical component of the MV.
In some examples, when the width of the first block is smaller than and/or equal to a second threshold L1, rounding process is performed on horizontal component of the MV, or when the height of the first block is smaller than and/or equal to the second threshold L1, rounding process is performed on vertical component of the MV.
In some examples, the thresholds L and L1 are different for bi-predicted blocks and uni-predicted blocks.
In some examples, when the ratio between width and height is larger than a third threshold L3 or smaller than a fourth threshold L4, rounding process is performed on the MV.
In some examples, when both horizontal and vertical components of the MV are fractional, rounding process is performed on the MV.
In some examples, the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
In some examples, whether the rounding process is performed on the MV depends on the prediction parameter.
In some examples, only when the first block is bi-predicted, the rounding process is performed on the MV.
In some examples, the characteristics of the first block include a prediction direction indicating List 0 or List 1 and/or associated MVs.
In some examples, whether the rounding process is performed on the MV depends on the prediction direction of the first block and/or the MVs.
In some examples, in a case that the first block is a bi-predicted block, whether the rounding process is performed on the MV or not is different for different prediction directions.
In some examples, if the MV of prediction direction X, X being 0 or 1, has fractional components in both horizontal and vertical directions, the rounding process is performed on N MV components for the prediction direction X, N being an integer in a range from 0 to 2; otherwise, the rounding process is not performed.
In some examples, if N1 MV components have fractional precision, the rounding process is performed on M MV components of the N1 MV components, wherein N1, M are integers, and 0 <= M <= N1.
In some examples, N1 and M are different for bi-predicted blocks and uni-predicted blocks.
In some examples, for bi-predicted blocks,
N1 is equal to 4 and M is equal to 4, or
N1 is equal to 4 and M is equal to 3, or
N1 is equal to 4 and M is equal to 2, or
N1 is equal to 4 and M is equal to 1, or
N1 is equal to 3 and M is equal to 3, or
N1 is equal to 3 and M is equal to 2, or
N1 is equal to 3 and M is equal to 1, or
N1 is equal to 2 and M is equal to 2, or
N1 is equal to 2 and M is equal to 1, or
N1 is equal to 1 and M is equal to 1.
In some examples, for uni-predicted blocks,
N1 is equal to 2 and M is equal to 2, or
N1 is equal to 2 and M is equal to 1, or
N1 is equal to 1 and M is equal to 1.
In some examples, N1 and M are different for different dimension parameters including at least one of a width, a height, a ratio of width and height, and a size of width × height of the first block.
In some examples, K MV components of the M MV components are rounded to integer-pel precision and M – K MV components are rounded to half-pel precision, wherein K is an integer in a range from 0 to M – 1.
In some examples, the characteristics of the first block include color components of the first block.
In some examples, whether the rounding process is performed on the MV is different for different color components of the first block.
In some examples, the color components include Y, Cb and Cr.
In some examples, the characteristics of the first block include color formats of the first block.
In some examples, whether the rounding process is performed on the MV depends on the color formats of the first block.
In some examples, the color formats include 4:2:0, 4:2:2 or 4:4:4.
In some examples, whether and/or how to perform the rounding process on the MV depends on the characteristics of the block.
In some examples, one or more MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks are rounded to half-pel precision.
In some examples, one or more MV components of 4x16 or/and 16x4 bi-predicted or/and uni-predicted luma blocks are rounded to integer-pel precision.
In some examples, one or more MV components of 4x4 uni-predicted or/and bi-predicted luma blocks are rounded to integer-pel precision.
In some examples, one or more MV components of 4x8 or/and 8x4 bi-predicted or/and uni-predicted luma blocks are rounded to integer-pel precision.
In some examples, the characteristics of the first block include whether the first block is coded with a sub-block based prediction method, including affine prediction mode and Sub-block based Temporal Motion Vector Prediction (SbTMVP) mode.
In some examples, the rounding process on the MV is not applied if the first block is coded with affine prediction mode.
In some examples, the rounding process on the MV is applied if the first block is coded with SbTMVP mode, and the rounding process is performed for each sub-block of the first block.
In some examples, the performing of the rounding process on the motion vector (MV) of the first block based on the characteristics of the first block comprises: determining whether at least one MV of the first block has fractional precision when the dimension parameters of the first block satisfy a predetermined rule; and in response to determining that the at least one MV of the first block has fractional precision, performing the rounding process on the at least one MV to generate rounded MVs having integer precision.
In some examples, the bitstream representation of the first block follows the rule depending on the dimension parameters of the first block, wherein only integer-pel MVs are allowed for bi-prediction coded blocks.
In some examples, the dimension parameters of the first block are 4x16, 16x4, 4x8, 8x4, or 4x4.
In some examples, the performing of the conversion by using the rounded MV comprises: performing motion compensation for the first block by using the rounded MVs.
FIG. 27 is a flowchart for a method 2700 of video processing. The method 2700 includes determining (2702), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; performing (2704) motion compensation for the first block using a MV with a first precision; and storing (2706) a MV with a second precision for the first block; wherein the first precision is different from the second precision.
In some examples, the characteristics of the first block include dimension parameters including at least one of the width, the height, a ratio of width to height, and a size of width*height of the first block.
In some examples, the first precision is integer precision and the second precision is fractional precision.
FIG. 28 is a flowchart for a method 2800 of video processing. The method 2800 includes determining (2802), for a conversion between a first block of video and a bitstream representation of the first block, a coding mode of the first block; performing (2804) the rounding process on the motion vector (MV) of the first block if the coding mode of the first block satisfies a predetermined rule; and performing (2806) the motion compensation of the first block by using the rounded MV.
In some examples, the predetermined rule comprises: the first block is coded with merge mode, non-intra modes or non-Advanced motion vector prediction (AMVP) mode.
FIG. 29 is a flowchart for a method 2900 of video processing. The method 2900 includes generating (2902), for a conversion between a first block of video and a bitstream representation of the first block, a first motion vector (MV) candidate list for the first block; performing (2904) the rounding process on the MV of at least one candidate before adding the at least one candidate into the first MV candidate list; and performing (2906) the conversion by using the first MV candidate list.
In some examples, the first block is coded with merge mode, non-intra modes or non-Advanced motion vector prediction (AMVP) mode, and the MV candidate list includes a merge candidate list and a non-merge candidate list.
In some examples, the candidates with fractional MVs are excluded from the first MV candidate list.
In some examples, the at least one candidate comprises: a candidate derived from a spatial block, a candidate derived from a temporal block, a candidate derived from a History motion vector prediction (HMVP) table or a pairwise bi-prediction merge candidate.
In some examples, the method further comprises: providing a separate HMVP table to store the candidates whose MVs have integer precision.
In some examples, the method further comprises: performing the rounding process on the MV, or on the MV of a candidate in the candidate list, based on the characteristics of the first block.
In some examples, the characteristics of the first block include dimension parameters including at least one of the width, the height, a ratio of width to height, and a size of width*height of the first block.
In some examples, the dimension parameters include at least one of 4x16, 16x4, 4x8, 8x4, 4x4.
In some examples, the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted, and performing the rounding process on the MV comprises: performing the rounding process on the MV, or on the MV of a candidate in the candidate list, only when the candidate is a bi-prediction candidate.
In some examples, the first block is coded with AMVP mode, and the candidate is AMVP candidate.
In some examples, the first block is coded with a non-affine mode.
FIG. 30 is a flowchart for a method 3000 of video processing. The method 3000 includes determining (3002), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; determining (3004) a constraint parameter to be applied to the first block based on the characteristics of the first block, wherein the constraint parameter constrains a maximum number of fractional motion vector (MV) components of the first block; and performing (3006) the conversion by using the constraint parameter.
In some examples, the MV components include at least one of a horizontal MV component and/or a vertical MV component, and the fractional MV components include at least one of half-pel MV components, quarter-pel MV components, and MV components with finer precision than quarter-pel.
In some examples, the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
In some examples, the constraint parameter is different for bi-prediction and uni-prediction.
In some examples, the constraint parameter is not applied in uni-prediction.
In some examples, the constraint parameter is applied when the first block is a bi-predicted 4x8, 8x4, 4x16, or 16x4 block.
In some examples, the constraint parameter is not applied when the first block is a uni-predicted 4x8, 8x4, 4x16 or 16x4 block.
In some examples, the constraint parameter is applied when the first block is a uni-predicted 4x4 or a bi-predicted 4x4 block.
In some examples, for bi-predicted blocks, the maximum number of the fractional MV components is 3, 2, 1 or 0.
In some examples, for uni-predicted blocks, the maximum number of the fractional MV components is 1 or 0.
In some examples, for bi-predicted blocks, the maximum number of the quarter-pel MV components is 3, 2, 1 or 0.
In some examples, for uni-predicted blocks, the maximum number of the quarter-pel MV components is 1 or 0.
In some examples, the characteristics of the first block include at least one of the shape and dimension parameters of the first block, the dimension parameters including at least one of the width, the height, a ratio of width to height, and a size of width*height of the first block.
In some examples, the constraint parameter is different for different sizes or shapes of the first block.
In some examples, the characteristics of the first block include a mode parameter indicating the coding mode of the first block.
In some examples, the coding mode includes a triangle mode in which the current block is split into two partitions, wherein each partition has at least one MV.
In some examples, the constraint parameter is applied when the first block is a 4x16 or 16x4 block coded in the triangle mode.
FIG. 31 is a flowchart for a method 3100 of video processing. The method 3100 includes acquiring (3102) a signaled indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining (3104), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing (3106) the conversion by using the indication when the characteristics of the first block satisfy the predetermined rule.
FIG. 32 is a flowchart for a method 3200 of video processing. The method 3200 includes signaling (3202) an indication of disallowing at least one of bi-prediction and uni-prediction when the characteristics of a block satisfy a predetermined rule; determining (3204), for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block; and performing (3206) the conversion based on the characteristics of the first block, wherein during the conversion, at least one of bi-prediction and uni-prediction is disabled when the characteristics of the first block satisfy the predetermined rule.
In some examples, the indication is signaled in sequence parameter set/picture parameter set/sequence header/picture header/tile header/tile group header/coding tree unit (CTU) rows/regions/other high-level syntax.
In some examples, the characteristics of the first block include dimension parameters including at least one of the width, the height, a ratio of width to height, a size of width*height, and the shape of the first block.
In some examples, the predetermined rule comprises: the first block is of certain block dimensions.
In some examples, the characteristics of the first block include a mode parameter indicating the coding mode of the first block.
In some examples, the predetermined rule comprises: the first block is coded with non-affine mode.
In some examples, when at least one of uni-prediction and bi-prediction is disallowed for the first block, the signaling of Advanced Motion Vector Resolution (AMVR) parameter for the first block is modified accordingly.
In some examples, the signaling of Advanced Motion Vector Resolution (AMVR) parameter is modified so that only integer-pel precisions are allowed for the first block.
In some examples, the signaling of Advanced Motion Vector Resolution (AMVR) parameter is modified so that different motion vector (MV) precisions are utilized.
In some examples, the block dimension of the first block is at least one of 4x16, 16x4, 4x8, 8x4, 4x4.
In some examples, the bitstream representation of the first block follows the rule depending on the dimension parameters of the first block, wherein only integer-pel MVs are allowed for bi-prediction coded blocks.
FIG. 33 is a flowchart for a method 3300 of video processing. The method 3300 includes determining (3302), for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for the first block; signaling (3304) an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing (3306) the conversion by using the AMVR parameter.
FIG. 34 is a flowchart for a method 3400 of video processing. The method 3400 includes determining (3402), for a conversion between a first block of video and a bitstream representation of the first block, whether fractional motion vector (MV) or motion vector difference (MVD) precision is allowed for the first block; acquiring (3404) an Advanced Motion Vector Resolution (AMVR) parameter for the first block based on the determination; and performing (3406) the conversion by using the AMVR parameter.
In some examples, if fractional MV or MVD precision is disallowed for the first block, the AMVR parameter indicating whether the MV/MVD precision of the current block is fractional is skipped and implicitly derived to be false.
5. An Embodiment
In the following embodiments, PW and PH are designed for 4x16, 16x4, 4x4, 8x4 and 4x8 blocks.
Suppose the MV of the block in reference list X is MVX, and the horizontal and vertical components of MVX are MVX [0] and MVX [1] respectively, and the integer parts of MVX [0] and MVX [1] are MVXInt [0] and MVXInt [1] respectively, wherein X = 0 or 1. Suppose the interpolation filter tap (in motion compensation) is N (for example, 8, 6, 4, or 2), and the current block size is WxH, and the position (i.e., the position of the top-left pixel) of the current block is (x, y). The indices of the rows and columns start from 1; for example, H rows include the 1st, …, Hth rows.
The following boundary pixel repeating process is performed only when both MVX [0] and MVX [1] are fractional.
5.1 An Embodiment
For 4x16 and 16x4 uni-predicted and bi-predicted blocks, PW and PH are both set equal to 1 for prediction direction X. First, (W + N – 2) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt [0] + x – N/2 + 1, MVXInt [1] + y – N/2 + 1). Then, the (W + N – 1)th column is generated by copying the (W + N – 2)th column. Finally, the (H + N – 1)th row is generated by copying the (H + N – 2)th row.
For 4x4 uni-predicted blocks, PW and PH are set equal to 0 and 1 respectively. First, (W + N – 1) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt [0] + x – N/2 + 1, MVXInt [1] + y – N/2 + 1). Then, the (H + N – 1)th row is generated by copying the (H + N – 2)th row.
For 4x8 and 8x4 uni-predicted and bi-predicted blocks, PW and PH are set equal to 2 and 3 respectively. First, (W + N – 3) * (H + N – 4) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt [0] + x – N/2 + 2, MVXInt [1] + y – N/2 + 2). Then, the 1st column is copied to its left side to obtain W + N – 2 columns; after that, the (W + N – 1)th column is generated by copying the (W + N – 2)th column. Finally, the 1st row is copied above itself to obtain H + N – 3 rows; after that, the (H + N – 2)th row and the (H + N – 1)th row are generated by copying the (H + N – 3)th row.
5.2 An Embodiment
For 4x16 and 16x4 uni-predicted and bi-predicted blocks, PW and PH are both set equal to 1 for prediction direction X. First, (W + N – 2) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt [0] + x – N/2 + 2, MVXInt [1] + y – N/2 + 2). Then, the 1st column is copied to its left side to obtain W + N – 1 columns. Finally, the 1st row is copied above itself to obtain H + N – 1 rows.
For 4x4 uni-predicted blocks, PW and PH are set equal to 0 and 1 respectively. First, (W + N – 1) * (H + N – 2) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt [0] + x – N/2 + 1, MVXInt [1] + y – N/2 + 2). Then, the 1st row is copied above itself to obtain H + N – 1 rows.
For 4x8 and 8x4 uni-predicted and bi-predicted blocks, PW and PH are set equal to 2 and 3 respectively. First, (W + N – 3) * (H + N – 4) reference pixels are fetched from the reference picture, wherein the top-left position of the reference pixels is identified by (MVXInt [0] + x – N/2 + 2, MVXInt [1] + y – N/2 + 2). Then, the 1st column is copied to its left side to obtain W + N – 2 columns; after that, the (W + N – 1)th column is generated by copying the (W + N – 2)th column. Finally, the 1st row is copied above itself to obtain H + N – 3 rows; after that, the (H + N – 2)th row and the (H + N – 1)th row are generated by copying the (H + N – 3)th row.
It will be appreciated that the disclosed techniques may be embodied in video encoders or decoders to improve compression efficiency when the coding units being compressed have shapes that are significantly different from the traditional square-shaped blocks or rectangular blocks that are half-square shaped. For example, new coding tools that use long or tall coding units, such as 4x32 or 32x4 sized units, may benefit from the disclosed techniques.
The disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) . A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order  shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims (22)

  1. A method of video processing, comprising:
    determining, for a conversion between a first block of video and a bitstream representation of the first block, characteristics of the first block;
    determining a constraint parameter to be applied to the first block based on the characteristics of the first block, wherein the constraint parameter constrains a maximum number of fractional motion vector (MV) components of the first block; and
    performing the conversion by using the constraint parameter.
  2. The method of claim 1, wherein the MV components include at least one of a horizontal MV component and/or a vertical MV component, and the fractional MV components include at least one of half-pel MV components, quarter-pel MV components, and MV components with finer precision than quarter-pel.
  3. The method of claim 1 or 2, wherein the characteristics of the first block include a prediction parameter indicating whether the first block is bi-predicted or uni-predicted.
  4. The method of claim 3, wherein the constraint parameter is different for bi-prediction and uni-prediction.
  5. The method of claim 4, wherein the constraint parameter is not applied in uni-prediction.
  6. The method of any one of claims 1-5, wherein the constraint parameter is applied when the first block is a bi-predicted 4x8, 8x4, 4x16, or 16x4 block.
  7. The method of any one of claims 1-5, wherein the constraint parameter is not applied when the first block is a uni-predicted 4x8, 8x4, 4x16 or 16x4 block.
  8. The method of any one of claims 1-5, wherein the constraint parameter is applied when the first block is a uni-predicted 4x4 or a bi-predicted 4x4 block.
  9. The method of any one of claims 1-4, wherein for bi-predicted blocks, the maximum number of the fractional MV components is 3, 2, 1 or 0.
  10. The method of any one of claims 1-4, wherein for uni-predicted blocks, the maximum number of the fractional MV components is 1 or 0.
  11. The method of any one of claims 1-4, wherein for bi-predicted blocks, the maximum number of the quarter-pel MV components is 3, 2, 1 or 0.
  12. The method of any one of claims 1-4, wherein for uni-predicted blocks, the maximum number of the quarter-pel MV components is 1 or 0.
  13. The method of claim 1 or 2, wherein the characteristics of the first block include at least one of the shape and dimension parameters of the first block, the dimension parameters including at least one of the width, the height, a ratio of width to height, and a size of width*height of the first block.
  14. The method of claim 13, wherein the constraint parameter is different for different sizes or shapes of the first block.
  15. The method of claim 1 or 2, wherein the characteristics of the first block include a mode parameter indicating the coding mode of the first block.
  16. The method of claim 15, wherein the coding mode includes a triangle mode in which the current block is split into two partitions, wherein each partition has at least one MV.
  17. The method of claim 15, wherein the constraint parameter is applied when the first block is a 4x16 or 16x4 block coded in the triangle mode.
  18. The method of any one of claims 1-17, wherein the bitstream representation of the first block conforms to the constraint parameter.
  19. The method of any one of claims 1 to 18, wherein the conversion generates the first block of video from the bitstream representation.
  20. The method of any one of claims 1 to 18, wherein the conversion generates the bitstream representation from the first block of video.
  21. An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to implement the method in any one of claims 1 to 20.
  22. A computer program product stored on a non-transitory computer readable media, the computer program product including program code for carrying out the method in any one of claims 1 to 20.
PCT/CN2020/071771 2019-01-12 2020-01-13 Mv precision constraints WO2020143831A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202080008722.5A CN113574867B (en) 2019-01-12 2020-01-13 MV precision constraint

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CNPCT/CN2019/071503 2019-01-12
CN2019071503 2019-01-12
CN2019077171 2019-03-06
CNPCT/CN2019/077171 2019-03-06

Publications (1)

Publication Number Publication Date
WO2020143831A1 true WO2020143831A1 (en) 2020-07-16

Family

ID=71520978

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/CN2020/071774 WO2020143832A1 (en) 2019-01-12 2020-01-13 Bi-prediction constraints
PCT/CN2020/071771 WO2020143831A1 (en) 2019-01-12 2020-01-13 Mv precision constraints

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/071774 WO2020143832A1 (en) 2019-01-12 2020-01-13 Bi-prediction constraints

Country Status (2)

Country Link
CN (2) CN113574867B (en)
WO (2) WO2020143832A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130188720A1 (en) * 2012-01-24 2013-07-25 Qualcomm Incorporated Video coding using parallel motion estimation
CN107079164A (en) * 2014-09-30 2017-08-18 寰发股份有限公司 Method for the adaptive motion vector resolution ratio of Video coding
CN107852499A (en) * 2015-04-13 2018-03-27 联发科技股份有限公司 The method that constraint intra block for reducing the bandwidth under worst case in coding and decoding video replicates
CN107852490A (en) * 2015-07-27 2018-03-27 联发科技股份有限公司 Use the video coding-decoding method and system of intra block replication mode
CN108432250A (en) * 2016-01-07 2018-08-21 联发科技股份有限公司 The method and device of affine inter-prediction for coding and decoding video
CN108632619A (en) * 2016-03-16 2018-10-09 联发科技股份有限公司 Method for video coding and device and relevant video encoding/decoding method and device

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9237355B2 (en) * 2010-02-19 2016-01-12 Qualcomm Incorporated Adaptive motion resolution for video coding
US9591312B2 (en) * 2012-04-17 2017-03-07 Texas Instruments Incorporated Memory bandwidth reduction for motion compensation in video coding
WO2014015807A1 (en) * 2012-07-27 2014-01-30 Mediatek Inc. Method of constrain disparity vector derivation in 3d video coding
KR20130067280A (en) * 2013-04-18 2013-06-21 엠앤케이홀딩스 주식회사 Decoding method of inter coded moving picture
CN103561263B (en) * 2013-11-06 2016-08-24 北京牡丹电子集团有限责任公司数字电视技术中心 Based on motion vector constraint and the motion prediction compensation method of weighted motion vector
US9749642B2 (en) * 2014-01-08 2017-08-29 Microsoft Technology Licensing, Llc Selection of motion vector precision
US10327002B2 (en) * 2014-06-19 2019-06-18 Qualcomm Incorporated Systems and methods for intra-block copy
US20160337662A1 (en) * 2015-05-11 2016-11-17 Qualcomm Incorporated Storage and signaling resolutions of motion vectors
GB2539213A (en) * 2015-06-08 2016-12-14 Canon Kk Schemes for handling an AMVP flag when implementing intra block copy coding mode
US10404992B2 (en) * 2015-07-27 2019-09-03 Qualcomm Incorporated Methods and systems of restricting bi-prediction in video coding
RU2696551C1 (en) * 2016-03-15 2019-08-02 МедиаТек Инк. Method and device for encoding video with compensation of affine motion
WO2017156705A1 (en) * 2016-03-15 2017-09-21 Mediatek Inc. Affine prediction for video coding
US10779007B2 (en) * 2017-03-23 2020-09-15 Mediatek Inc. Transform coding of video data


Also Published As

Publication number Publication date
CN113574867B (en) 2022-09-13
CN113287303A (en) 2021-08-20
CN113574867A (en) 2021-10-29
WO2020143832A1 (en) 2020-07-16

Similar Documents

Publication Publication Date Title
US11997253B2 (en) Conditions for starting checking HMVP candidates depend on total number minus K
US11070820B2 (en) Condition dependent inter prediction with geometric partitioning
US11616945B2 (en) Simplified history based motion vector prediction
US11146785B2 (en) Selection of coded motion information for LUT updating
US11589071B2 (en) Invoke of LUT updating
US11595641B2 (en) Alternative interpolation filters in video coding
US11641483B2 (en) Interaction between merge list construction and other tools
US11503288B2 (en) Selective use of alternative interpolation filters in video processing
WO2020125628A1 (en) Shape dependent interpolation filter
WO2020156515A1 (en) Refined quantization steps in video coding
WO2020143830A1 (en) Integer mv motion compensation
WO2020143837A1 (en) Mmvd improvement
WO2020143831A1 (en) Mv precision constraints
WO2020012448A2 (en) Shape dependent interpolation order

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20738865

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 20738865

Country of ref document: EP

Kind code of ref document: A1