WO2021036976A1 - Reference picture resampling - Google Patents

Reference picture resampling Download PDF

Info

Publication number
WO2021036976A1
Authority
WO
WIPO (PCT)
Prior art keywords
samples
video
resolution
picture
motion vector
Application number
PCT/CN2020/110763
Other languages
French (fr)
Inventor
Kai Zhang
Li Zhang
Hongbin Liu
Zhipin DENG
Jizheng Xu
Yue Wang
Original Assignee
Beijing Bytedance Network Technology Co., Ltd.
Bytedance Inc.
Application filed by Beijing Bytedance Network Technology Co., Ltd. and Bytedance Inc.
Priority to CN202080057057.9A (published as CN114223205A)
Publication of WO2021036976A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/59: using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N19/105: using adaptive coding; selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/117: using adaptive coding; filters, e.g. for pre-processing or post-processing
    • H04N19/172: using adaptive coding characterised by the coding unit, the unit being an image region, e.g. a picture, frame or field
    • H04N19/186: using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/513: using predictive coding involving temporal prediction; processing of motion vectors
    • H04N19/523: using predictive coding involving temporal prediction; motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/70: characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/82: details of filtering operations specially adapted for video compression, involving filtering within a prediction loop
    • H04N19/86: using pre-processing or post-processing specially adapted for video compression, involving reduction of coding artifacts, e.g. of blockiness

Definitions

  • This patent document relates to video coding and decoding.
  • a method of video processing includes selecting an interpolation filter used for determining a prediction block for a current block of a current picture of video by motion compensation from a reference picture based on a rule, and performing a conversion between the current block of the video and a coded representation of the video based on the prediction block, wherein the rule specifies that the interpolation filter is a first interpolation filter in case that a resolution of the current picture and a resolution of the reference picture are different and the interpolation filter is a second interpolation filter in case that the resolution of the current picture and the resolution of the reference picture are same, wherein the first interpolation filter is different from the second interpolation filter.
  • another method of video processing includes performing a conversion between a video comprising a current video picture and a coded representation of the video according to a rule, wherein a reference picture is included in at least one of the reference picture lists of the current video picture, wherein the current video picture has a first resolution; wherein the reference picture has a second resolution; wherein the rule specifies whether and/or how the current video picture and/or the reference picture are permitted to have sub-pictures depending on at least one of the first resolution or the second resolution.
  • another method of video processing includes determining, for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing the conversion based on the determining such that the same horizontal or vertical interpolation filter is used for generating predicted values of the samples of a group of samples of the current block.
  • another method of video processing includes making a determination, due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a constraint on a motion vector used for deriving a prediction block for the current block; and performing a conversion between the video and a coded representation of the video based on the determination, wherein the constraint specifies that the motion vector is an integer motion vector.
  • another method of video processing includes performing a conversion between a video comprising a current block and a coded representation according to a rule, wherein the rule specifies whether a motion vector information used for determining a prediction block of the current block by motion compensation is made available for motion vector prediction of succeeding blocks in a current picture comprising the current block or another picture.
  • another method of video processing includes making a determination, due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a two-step process to generate a prediction block for the current block; and performing a conversion between the video and a coded representation of the video based on the determination, wherein the two-step process comprises a first step of resampling a region of the reference picture to generate a virtual reference block and a second step of generating the prediction block using an interpolation filter on the virtual reference block.
  • another method of video processing includes making a first determination that there is a difference between resolutions of one or more reference pictures used for generating a prediction block of a current block of a current picture of a video and a resolution of the current picture, using a rule to make a second determination, based on the difference, about whether or how to apply a filtering process for a conversion between the video and a coded representation of the video; and performing the conversion according to the second determination.
  • another method of video processing includes performing a conversion between a video comprising multiple video blocks of a video picture and a coded representation of the video according to a rule, wherein the multiple video blocks are processed in an order, wherein the rule specifies that motion vector information used for determining prediction block of a first video block is stored and used during processing of a succeeding video block of the multiple video blocks according to a resolution of a reference picture used by the motion vector information.
  • the above-described method is embodied in the form of processor-executable code and stored in a computer-readable program medium.
  • a device that is configured or operable to perform the above-described method.
  • the device may include a processor that is programmed to implement this method.
  • a video decoder apparatus may implement a method as described herein.
  • a computer readable medium includes a coded video representation generated using one of the above described methods.
  • FIG. 1 A 16x16 block is divided into 16 4x4 regions.
  • FIG. 2A-2C show examples of specific positions in a video block.
  • FIG. 3 is a block diagram of an example implementation of a hardware platform for video processing.
  • FIG. 4 is a flowchart for an example method of video processing.
  • FIGS. 5A to 5H are flowcharts for example methods of video processing.
  • Embodiments of the disclosed technology may be applied to existing video coding standards (e.g., HEVC, H. 265) and future standards to improve compression performance. Section headings are used in the present document to improve readability of the description and do not in any way limit the discussion or the embodiments (and/or implementations) to the respective sections only.
  • This document is related to video coding technologies. Specifically, it is related to adaptive resolution conversion in video coding or decoding. It may be applied to existing video/image coding standards like HEVC, or to the standard to be finalized (Versatile Video Coding). It may also be applicable to future video coding standards or video codecs.
  • Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards.
  • the ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards.
  • AVC: H.264/MPEG-4 Advanced Video Coding
  • HEVC: H.265 High Efficiency Video Coding
  • the video coding standards are based on the hybrid video coding structure wherein temporal prediction plus transform coding are utilized.
  • the Joint Video Exploration Team (JVET) was founded by VCEG and MPEG jointly in 2015.
  • JEM: Joint Exploration Model
  • AVC and HEVC do not have the ability to change resolution without introducing an IDR or intra random access point (IRAP) picture; such ability can be referred to as adaptive resolution change (ARC).
  • ARC adaptive resolution change
  • Rate adaptation in video telephony and conferencing: to adapt the coded video to changing network conditions, when the network condition gets worse and the available bandwidth becomes lower, the encoder may adapt by encoding pictures at a smaller resolution.
  • changing picture resolution can be done only after an IRAP picture; this has several issues.
  • An IRAP picture at reasonable quality will be much larger than an inter-coded picture and will be correspondingly more complex to decode: this costs time and resources. This is a problem if the resolution change is requested by the decoder for loading reasons. It can also break low-latency buffer conditions, forcing an audio re-sync, and the end-to-end delay of the stream will increase, at least temporarily. This can give a poor user experience.
  • the Dynamic Adaptive Streaming over HTTP (DASH) specification includes a feature named @mediaStreamStructureId. This enables switching between different representations at open-GOP random access points with non-decodable leading pictures, e.g., CRA pictures with associated RASL pictures in HEVC.
  • switching between the two representations at a CRA picture with associated RASL pictures can be performed, and the RASL pictures associated with the switching-at CRA pictures can be decoded with acceptable quality hence enabling seamless switching.
  • the @mediaStreamStructureId feature would also be usable for switching between DASH representations with different spatial resolutions.
  • ARC is also known as Dynamic resolution conversion.
  • ARC may also be regarded as a special case of Reference Picture Resampling (RPR), such as H.263 Annex P.
  • RPR Reference Picture Resampling
  • This mode describes an algorithm to warp the reference picture prior to its use for prediction. It can be useful for resampling a reference picture having a different source format than the picture being predicted. It can also be used for global motion estimation, or estimation of rotating motion, by warping the shape, size, and location of the reference picture.
  • the syntax includes warping parameters to be used as well as a resampling algorithm.
  • the simplest level of operation for the reference picture resampling mode is an implicit factor-of-4 resampling, as only a fixed FIR filter needs to be applied for the upsampling and downsampling processes. In this case, no additional signaling overhead is required, as its use is understood when the size of a new picture (indicated in the picture header) is different from that of the previous picture.
  • ARC, a.k.a. RPR (Reference Picture Resampling), is incorporated in JVET-O2001-v14.
  • TMVP is disabled if the collocated picture has a different resolution to the current picture.
  • BDOF and DMVR are disabled when the reference picture has a different resolution to the current picture.
  • the interpolation section is defined as below (section numbers refer to the current version of VVC; italicized text indicates proposed changes to the specification):
  • variable cIdx specifying the colour component index of the current block.
  • the prediction block border extension size brdExtSize is derived as follows:
  • brdExtSize = ( bdofFlag || ( inter_affine_flag[ xSb ][ ySb ] && sps_affine_prof_enabled_flag ) ) ? 2 : 0 (8-752)
  • variable fRefWidth is set equal to the PicOutputWidthL of the reference picture in luma samples.
  • variable fRefHeight is set equal to PicOutputHeightL of the reference picture in luma samples.
  • the motion vector mvLX is set equal to (refMvLX-mvOffset) .
  • hori_scale_fp = ( ( fRefWidth << 14 ) + ( PicOutputWidthL >> 1 ) ) / PicOutputWidthL (8-753)
  • the top-left coordinate of the bounding block for reference sample padding (xSbIntL, ySbIntL) is set equal to ( xSb + ( mvLX[ 0 ] >> 4 ), ySb + ( mvLX[ 1 ] >> 4 ) ).
  • let (refxSbL, refySbL) and (refxL, refyL) be luma locations pointed to by a motion vector (refMvLX[ 0 ], refMvLX[ 1 ]) given in 1/16-sample units.
  • the variables refxSbL, refxL, refySbL, and refyL are derived as follows:
  • refxSbL = ( ( xSb << 4 ) + refMvLX[ 0 ] ) * hori_scale_fp (8-755)
  • refxL = ( ( Sign( refxSbL ) * ( ( Abs( refxSbL ) + 128 ) >> 8 ) + xL * ( ( hori_scale_fp + 8 ) >> 4 ) ) + 32 ) >> 6 (8-756)
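  • A minimal C sketch of the fixed-point derivation in (8-753), (8-755) and (8-756) above; the variable names follow JVET-O2001, while the function scaffolding (rpr_scale_fp, rpr_ref_pos_x) is illustrative only, not part of the disclosure:

      #include <stdint.h>

      /* 14-bit fixed-point horizontal scaling factor (8-753);
       * the + (curWidth >> 1) term rounds to the nearest integer. */
      static int32_t rpr_scale_fp(int refWidth, int curWidth)
      {
          return ((refWidth << 14) + (curWidth >> 1)) / curWidth;
      }

      static int sign64(int64_t v) { return (v > 0) - (v < 0); }

      /* Map luma column xL of the subblock at xSb, with motion vector
       * component refMvX in 1/16-pel units, into the reference picture;
       * the result refxL is in 1/16-sample units ((8-755)/(8-756)). */
      static int32_t rpr_ref_pos_x(int xSb, int xL, int refMvX,
                                   int32_t hori_scale_fp)
      {
          int64_t refxSbL = (int64_t)((xSb << 4) + refMvX) * hori_scale_fp;
          int64_t absv = refxSbL < 0 ? -refxSbL : refxSbL;
          return (int32_t)(((sign64(refxSbL) * ((absv + 128) >> 8)
                 + (int64_t)xL * ((hori_scale_fp + 8) >> 4)) + 32) >> 6);
      }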
  • the prediction luma sample value predSamplesLX[ xL ][ yL ] is derived by invoking the luma integer sample fetching process as specified in clause 8.5.6.3.3 with ( xIntL + ( xFracL >> 3 ) - 1, yIntL + ( yFracL >> 3 ) - 1 ) and refPicLX as inputs.
  • the prediction luma sample value predSamplesLX[ xL ][ yL ] is derived by invoking the luma sample 8-tap interpolation filtering process as specified in clause 8.5.6.3.2 with ( xIntL - ( brdExtSize > 0 ? 1 : 0 ), yIntL - ( brdExtSize > 0 ? 1 : 0 ) ), ( xFracL, yFracL ), ( xSbIntL, ySbIntL ), refPicLX, hpelIfIdx, sbWidth, sbHeight and ( xSb, ySb ) as inputs.
  • the top-left coordinate of the bounding block for reference sample padding (xSbIntC, ySbIntC) is set equal to ( ( xSb / SubWidthC ) + ( mvLX[ 0 ] >> 5 ), ( ySb / SubHeightC ) + ( mvLX[ 1 ] >> 5 ) ).
  • refxSbC = ( ( xSb / SubWidthC << 5 ) + mvLX[ 0 ] ) * hori_scale_fp (8-763)
  • refxC = ( ( Sign( refxSbC ) * ( ( Abs( refxSbC ) + 256 ) >> 9 ) + xC * ( ( hori_scale_fp + 8 ) >> 4 ) ) + 16 ) >> 5 (8-764)
  • the prediction sample value predSamplesLX[ xC ][ yC ] is derived by invoking the process specified in clause 8.5.6.3.4 with ( xIntC, yIntC ), ( xFracC, yFracC ), ( xSbIntC, ySbIntC ), sbWidth, sbHeight and refPicLX as inputs.
  • Output of this process is a predicted luma sample value predSampleLXL
  • the variable shift1 is set equal to Min( 4, BitDepthY - 8 )
  • the variable shift2 is set equal to 6
  • the variable shift3 is set equal to Max( 2, 14 - BitDepthY ).
  • the variable picW is set equal to pic_width_in_luma_samples and the variable picH is set equal to pic_height_in_luma_samples.
  • the luma interpolation filter coefficients fL[ p ] for each 1/16 fractional sample position p equal to xFracL or yFracL are derived as follows:
  • the luma interpolation filter coefficients fL[ p ] are specified in Table 8-11 depending on hpelIfIdx.
  • if subpic_treated_as_pic_flag[ SubPicIdx ] is equal to 1, the following applies:
  • xInti = Clip3( SubPicLeftBoundaryPos, SubPicRightBoundaryPos, xIntL + i - 3 ) (8-771)
  • yInti = Clip3( SubPicTopBoundaryPos, SubPicBotBoundaryPos, yIntL + i - 3 ) (8-772)
  • xInti = Clip3( xSbIntL - 3, xSbIntL + sbWidth + 4, xInti ) (8-775)
  • yInti = Clip3( ySbIntL - 3, ySbIntL + sbHeight + 4, yInti ) (8-776)
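  • A C sketch of the Clip3-based clamping in (8-771) to (8-776); the helper below is a plain translation of the formulas, with the picture-boundary clamp of the non-sub-picture branch omitted for brevity:

      /* Clip3( lo, hi, v ) as used throughout the VVC draft. */
      static inline int clip3(int lo, int hi, int v)
      {
          return v < lo ? lo : (v > hi ? hi : v);
      }

      /* Integer position of filter tap i (0..7) of the 8-tap luma filter
       * centred at xIntL; clamped to the sub-picture boundary (8-771) and
       * to the bounding block for reference sample padding (8-775). */
      static int luma_tap_pos_x(int xIntL, int i, int xSbIntL, int sbWidth,
                                int subpicAsPic, int subPicLeft, int subPicRight)
      {
          int xInti = subpicAsPic
              ? clip3(subPicLeft, subPicRight, xIntL + i - 3)
              : xIntL + i - 3;
          return clip3(xSbIntL - 3, xSbIntL + sbWidth + 4, xInti);
      }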
  • the predicted luma sample value predSampleLXL is derived as follows:
  • if both xFracL and yFracL are equal to 0, predSampleLXL is derived as follows:
  • predSampleLXL = refPicLXL[ xInt3 ][ yInt3 ] << shift3 (8-777)
  • otherwise, if yFracL is equal to 0, predSampleLXL is derived by applying the horizontal 8-tap filter fL[ xFracL ] to refPicLXL[ xInti ][ yInt3 ] with a right shift by shift1; if xFracL is equal to 0, by applying the vertical 8-tap filter fL[ yFracL ] to refPicLXL[ xInt3 ][ yInti ] with a right shift by shift1; otherwise, by horizontal filtering into an intermediate array temp[ n ] (right shift by shift1) followed by vertical filtering of temp[ i ] with fL[ yFracL ] and a right shift by shift2 (see the separable-filtering sketch below).
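  • A C sketch of the separable 8-tap filtering of clause 8.5.6.3.2 for the general case (both fractions non-zero): horizontal filtering with a right shift by shift1, then vertical filtering of the intermediate samples with a right shift by shift2. The fL coefficient table and the ref() accessor (returning already-clipped reference samples) are assumed inputs:

      #include <stdint.h>

      static int16_t interp_luma_2d(const int8_t fL[16][8],
                                    int (*ref)(int x, int y),
                                    int xInt, int yInt,
                                    int xFrac, int yFrac, int bitDepthY)
      {
          int shift1 = bitDepthY - 8 < 4 ? bitDepthY - 8 : 4; /* Min(4, BitDepthY - 8) */
          const int shift2 = 6;
          int32_t temp[8];

          for (int n = 0; n < 8; n++) {          /* horizontal pass over 8 rows */
              int32_t sum = 0;
              for (int i = 0; i < 8; i++)
                  sum += fL[xFrac][i] * ref(xInt + i - 3, yInt + n - 3);
              temp[n] = sum >> shift1;
          }
          int32_t sum = 0;                        /* vertical pass */
          for (int i = 0; i < 8; i++)
              sum += fL[yFrac][i] * temp[i];
          return (int16_t)(sum >> shift2);
      }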
  • Output of this process is a predicted luma sample value predSampleLXL
  • the variable shift is set equal to Max( 2, 14 - BitDepthY ).
  • the variable picW is set equal to pic_width_in_luma_samples and the variable picH is set equal to pic_height_in_luma_samples.
  • the predicted luma sample value predSampleLXL is derived as follows:
  • predSampleLXL = refPicLXL[ xInt ][ yInt ] << shift3 (8-784)
  • Output of this process is a predicted chroma sample value predSampleLXC
  • the variable shift1 is set equal to Min( 4, BitDepthC - 8 )
  • the variable shift2 is set equal to 6
  • the variable shift3 is set equal to Max( 2, 14 - BitDepthC ).
  • the variable picWC is set equal to pic_width_in_luma_samples / SubWidthC and the variable picHC is set equal to pic_height_in_luma_samples / SubHeightC.
  • the chroma interpolation filter coefficients fC[ p ] for each 1/32 fractional sample position p equal to xFracC or yFracC are specified in Table 8-13.
  • the variable xOffset is set equal to ( ( sps_ref_wraparound_offset_minus1 + 1 ) * MinCbSizeY ) / SubWidthC.
  • if subpic_treated_as_pic_flag[ SubPicIdx ] is equal to 1, the following applies:
  • xInti = Clip3( SubPicLeftBoundaryPos / SubWidthC, SubPicRightBoundaryPos / SubWidthC, xIntL + i ) (8-785)
  • yInti = Clip3( SubPicTopBoundaryPos / SubHeightC, SubPicBotBoundaryPos / SubHeightC, yIntL + i ) (8-786)
  • xInti = Clip3( xSbIntC - 1, xSbIntC + sbWidth + 2, xInti ) (8-789)
  • yInti = Clip3( ySbIntC - 1, ySbIntC + sbHeight + 2, yInti ) (8-790)
  • the predicted chroma sample value predSampleLXC is derived as follows:
  • if both xFracC and yFracC are equal to 0, predSampleLXC is derived as follows:
  • predSampleLXC = refPicLXC[ xInt1 ][ yInt1 ] << shift3 (8-791)
  • otherwise, a horizontal and/or vertical 4-tap filtering with the coefficients fC[ xFracC ] and fC[ yFracC ] and right shifts by shift1 and shift2 is applied, analogous to the luma case.
  • When RPR (ARC) is applied in VVC, the following problems may arise:
  • the interpolation filters may be different for adjacent samples in a block, which is undesirable in SIMD (Single Instruction, Multiple Data) implementations.
  • a motion vector is denoted by (mv_x, mv_y) wherein mv_x is the horizontal component and mv_y is the vertical component.
  • predicted values for a group of samples (at least two samples) of a current block may be generated with the same horizontal and/or vertical interpolation filter.
  • the group may comprise all samples in a region of the block.
  • a block may be divided into S non-overlapping MxN rectangles.
  • Each MxN rectangle is a group.
  • a 16x16 block can be divided into 16 4x4 rectangles, each of which is a group.
  • M is an integer no larger than the block width. In one example, M is 4 or 8 or the block width.
  • N is an integer no larger than the block height. In one example, N is 4 or 8 or the block height.
  • M and/or N may be pre-defined or derived on-the-fly, such as based on block dimension/coded information, or signaled (a sketch of this grouping is given below).
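  • A C sketch of the grouping of FIG. 1: a WxH block is split into non-overlapping MxN rectangles and each rectangle is processed as one group (so one interpolation filter / shared MV per group); the callback scaffolding is illustrative only:

      /* Visit every MxN group of a WxH block; e.g. a 16x16 block with
       * M = N = 4 yields the 16 groups of FIG. 1. Assumes M divides W
       * and N divides H, as in the examples of the text. */
      static void for_each_group(int W, int H, int M, int N,
                                 void (*group_fn)(int x0, int y0, int M, int N))
      {
          for (int y0 = 0; y0 < H; y0 += N)
              for (int x0 = 0; x0 < W; x0 += M)
                  group_fn(x0, y0, M, N);
      }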
  • samples in the group may have the same MV (denoted as shared MV) .
  • samples in the group may have MVs with the same horizontal component (denoted as shared horizontal component) .
  • samples in the group may have MVs with the same vertical component (denoted as shared vertical component) .
  • samples in the group may have MVs with the same fractional part of the horizontal component (denoted as shared fractional horizontal component) .
  • samples in the group may have MVs with the same fractional part of the vertical component (denoted as shared fractional vertical component) .
  • to derive the prediction sample for a sample, the motion vector, denoted by MVb, may be firstly derived according to the resolutions of the current picture and the reference picture (e.g., (refxL, refyL) derived in 8.5.6.3.1 in JVET-O2001-v14).
  • MVb may be further modified (e.g., being rounded/truncated/clipped) to MV' to satisfy requirements such as the above bullets, and MV' will be used to derive the prediction sample for the sample.
  • MV' has the same integer part as MVb, and the fractional part of MV' is set to be the shared fractional horizontal and/or vertical component.
  • MV' is set to be the motion vector with the shared fractional horizontal and/or vertical component that is closest to MVb (see the sketch below).
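  • A C sketch of the MVb-to-MV' modification in the two bullets above, for one MV component in 1/16-pel units; the flooring convention mirrors the >> 4 integer-part convention of the draft, and the helper names are hypothetical:

      #include <stdint.h>

      static int32_t iabs32(int32_t v) { return v < 0 ? -v : v; }

      /* Keep the integer part of mvb and substitute the shared fractional
       * component (0..15). (mvb & ~15) floors, matching (mvb >> 4) << 4. */
      static int32_t mv_with_shared_frac(int32_t mvb, int32_t sharedFrac)
      {
          return (mvb & ~15) | (sharedFrac & 15);
      }

      /* Variant: among the candidates with the shared fractional part one
       * integer sample apart, pick the one closest to mvb. */
      static int32_t closest_mv_with_shared_frac(int32_t mvb, int32_t sharedFrac)
      {
          int32_t base = mv_with_shared_frac(mvb, sharedFrac);
          int32_t best = base, bestD = iabs32(mvb - base);
          int32_t cands[2] = { base - 16, base + 16 };
          for (int k = 0; k < 2; k++) {
              int32_t d = iabs32(mvb - cands[k]);
              if (d < bestD) { bestD = d; best = cands[k]; }
          }
          return best;
      }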
  • the shared motion vector (and/or shared horizontal component and/or shared vertical component and/or shared fractional horizontal component and/or shared fractional vertical component) may be set to be the motion vector (and/or horizontal component and/or vertical component and/or fractional horizontal component and/or fractional vertical component) of a specific sample in the group.
  • the specific sample may be at a corner of a rectangle-shaped group, such as “A”, “B”, “C” and “D” shown in FIG. 2A.
  • the specific sample may be at a center of a rectangle-shaped group, such as “E”, “F”, “G” and “H” shown in FIG. 2A.
  • the specific sample may be at an end of a row-shaped or column-shaped group, such as “A” and “D” shown in FIGS. 2B and 2C.
  • the specific sample may be at a middle of a row-shaped or column-shaped group, such as “B” and “C” shown in FIGS. 2B and 2C.
  • the motion vector of the specific sample may be the MVb mentioned in bullet g.
  • the shared motion vector (and/or shared horizontal component and/or shared vertical component and/or shared fractional horizontal component and/or shared fractional vertical component) may be set to be the motion vector (and/or horizontal component and/or vertical component and/or fractional horizontal component and/or fractional vertical component) of a virtual sample located at a different position compared to all samples in this group.
  • the virtual sample is not in the group, but it is located in the region covering all samples in the group.
  • the virtual sample is located outside the region covering all samples in the group, e.g., next to the bottom-right position of the region.
  • the MV of a virtual sample is derived in the same way as a real sample but with different positions.
  • V in FIGS. 2A-2C shows three examples of virtual samples.
  • the shared MV (and/or shared horizontal component and/or shared vertical component and/or shared fractional horizontal component and/or shared fractional vertical component) may be set to be a function of the MVs (and/or horizontal components and/or vertical components and/or fractional horizontal components and/or fractional vertical components) of multiple samples and/or virtual samples.
  • for example, the shared MV (and/or its components as above) may be set to be the average of the MVs of all or some of the samples in the group, or of samples “E”, “F”, “G”, “H” in FIG. 2A, or of samples “E”, “H” in FIG. 2A, or of samples “A”, “B”, “C”, “D” in FIG. 2A, or of samples “A”, “D” in FIG. 2A, or of samples “B”, “C” in FIG. 2B, or of samples “A”, “D” in FIG. 2B, or of samples “B”, “C” in FIG. 2C, or of samples “A”, “D” in FIG. 2C (an averaging sketch is given below).
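  • A C sketch of deriving a shared MV as the average of the MVs of several candidate positions (corner samples “A”-“D”, centre samples “E”-“H”, or a virtual sample “V” of FIGS. 2A-2C); mv_at() is an assumed per-position MV derivation, e.g. the (refxL, refyL) computation above:

      #include <stdint.h>

      typedef struct { int32_t x, y; } Mv;

      static Mv shared_mv_average(Mv (*mv_at)(int x, int y),
                                  const int pos[][2], int nPos)
      {
          int64_t sx = 0, sy = 0;
          for (int k = 0; k < nPos; k++) {
              Mv m = mv_at(pos[k][0], pos[k][1]);
              sx += m.x;
              sy += m.y;
          }
          Mv avg = { (int32_t)(sx / nPos), (int32_t)(sy / nPos) };
          return avg;
      }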
  • the decoded motion vectors for samples to be predicted are rounded to integer MVs before being used (a rounding sketch is given below).
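  • A C sketch of the integer-MV constraint: a 1/16-pel MV component is rounded to a whole sample before motion compensation. Round-half-away-from-zero is one plausible convention; the text does not fix the rounding rule:

      #include <stdint.h>

      static int32_t round_mv_to_integer(int32_t mv /* 1/16-pel units */)
      {
          return mv >= 0 ? ((mv + 8) >> 4) << 4
                         : -((((-mv) + 8) >> 4) << 4);
      }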
  • the motion vectors used in the motion compensation process for samples in a current block may be stored in the decoded picture buffer and utilized for motion vector prediction of succeeding blocks in current/different pictures.
  • the motion vectors used in the motion compensation process for samples in a current block may be disallowed to be utilized for motion vector prediction of succeeding blocks in current/different pictures.
  • the decoded motion vectors (e.g., MV b in above bullets) may be utilized for motion vector prediction of succeeding blocks in current/different pictures.
  • the motion vectors used in the motion compensation process for samples in a current block may be utilized in the filtering process (e.g., deblocking filter/SAO/ALF) .
  • the decoded motion vectors (e.g., MV b in above bullets) may be utilized in the filtering process.
  • the interpolation filters used in the motion compensation process to derive the prediction block of a current block may be selected depending on whether the resolution of the reference picture is different to the current picture.
  • the interpolation filters have fewer taps when the resolution of the reference picture is different to the current picture.
  • bi-linear filters are applied when the resolution of the reference picture is different to the current picture.
  • 4-tap filters or 6-tap filters are applied when the resolution of the reference picture is different to the current picture (a selection sketch is given below).
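  • A C sketch of the resolution-dependent filter selection described above; the enum and the choice of the 2-tap bilinear filter (rather than a 4- or 6-tap filter) for the different-resolution case are illustrative assumptions:

      typedef enum { FILTER_8TAP, FILTER_6TAP, FILTER_4TAP, FILTER_BILINEAR } InterpFilter;

      static InterpFilter select_interp_filter(int curW, int curH,
                                               int refW, int refH)
      {
          if (curW != refW || curH != refH)
              return FILTER_BILINEAR;  /* fewer taps when resolutions differ */
          return FILTER_8TAP;          /* default same-resolution luma filter */
      }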
  • a virtual reference block is generated by up-sampling or down-sampling a region in the reference picture depending on width and/or height of the current picture and the reference picture.
  • the prediction samples are generated from the virtual reference block by applying interpolation filtering, independent of the width and/or height of the current picture and the reference picture (a two-step sketch is given below).
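  • A C sketch of the two-step prediction: first resample a reference region to the current resolution (the "virtual reference block"), then run the ordinary same-resolution interpolation on it. resample() and interpolate() are assumed helpers, not APIs of any particular codec:

      #include <stdint.h>

      void resample(const uint8_t *src, int srcW, int srcH,
                    uint8_t *dst, int dstW, int dstH);           /* step 1 */
      void interpolate(const uint8_t *src, int w, int h,
                       uint8_t *pred, int mvFracX, int mvFracY); /* step 2 */

      static void predict_two_step(const uint8_t *refRegion, int refW, int refH,
                                   uint8_t *virtualBlk, int curW, int curH,
                                   uint8_t *pred, int mvFracX, int mvFracY)
      {
          /* up- or down-sampling chosen by the width/height ratio */
          resample(refRegion, refW, refH, virtualBlk, curW, curH);
          /* interpolation is now independent of the resolution ratio */
          interpolate(virtualBlk, curW, curH, pred, mvFracX, mvFracY);
      }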
  • the top-left coordinate of the bounding block for reference sample padding (xSbIntL, ySbIntL) as defined in 8.5.6.3.1 in JVET-O2001-v14 may be derived depending on the width and/or height of the current picture and the reference picture.
  • the luma locations in full-sample units are modified as:
  • xInti = Clip3( xSbIntL - Dx, xSbIntL + sbWidth + Ux, xInti ),
  • yInti = Clip3( ySbIntL - Dy, ySbIntL + sbHeight + Uy, yInti ),
  • where Dx and/or Dy and/or Ux and/or Uy may depend on the width and/or height of the current picture and the reference picture.
  • the chroma locations in full-sample units are modified as:
  • xInti = Clip3( xSbIntC - Dx, xSbIntC + sbWidth + Ux, xInti )
  • yInti = Clip3( ySbIntC - Dy, ySbIntC + sbHeight + Uy, yInti )
  • where Dx and/or Dy and/or Ux and/or Uy may depend on the width and/or height of the current picture and the reference picture.
  • the MV is clipped according to the bounding block for reference sample padding (e.g., (xSbIntL, ySbIntL) as defined in 8.5.6.3.1) only when DMVR is applied.
  • operations 8-775 and 8-776 in the luma sample interpolation filtering process as defined in JVET-O2001-v14 are applied only if DMVR is used for the current block.
  • the above methods may also be applicable to the clipping of chroma samples.
  • whether to and/or how to clip the MV according to the bounding block for reference sample padding may depend on whether picture wrapping is used (e.g., whether sps_ref_wraparound_enabled_flag is equal to 0 or 1).
  • the MV is clipped according to the bounding block for reference sample padding (e.g., (xSbIntL, ySbIntL) as defined in 8.5.6.3.1) only if picture wrapping is not used.
  • operations 8-775 and 8-776 in the luma sample interpolation filtering process as defined in JVET-O2001-v14 are applied only if picture wrapping is not used.
  • the above methods may also be applicable to the clipping of chroma samples (a conditional-clamping sketch is given below).
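  • A C sketch combining the two conditional-clamping bullets: the bounding-block clamp of (8-775)/(8-776) is applied only when DMVR is used and picture wrapping is not; the clip helper mirrors Clip3 of the earlier sketch:

      static inline int clip3i(int lo, int hi, int v)
      {
          return v < lo ? lo : (v > hi ? hi : v);
      }

      static int maybe_clamp_tap_x(int xInti, int xSbIntL, int sbWidth,
                                   int dmvrApplied, int wraparoundEnabled)
      {
          if (dmvrApplied && !wraparoundEnabled)
              return clip3i(xSbIntL - 3, xSbIntL + sbWidth + 4, xInti);
          return xInti; /* picture-boundary handling done elsewhere */
      }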
  • the luma locations in full-sample units are modified as:
  • xInti = Clip3( xSbIntL - Dx, xSbIntL + sbWidth + Ux, xInti ),
  • yInti = Clip3( ySbIntL - Dy, ySbIntL + sbHeight + Uy, yInti ),
  • where Dx and/or Dy and/or Ux and/or Uy may depend on whether picture wrapping is used.
  • the chroma locations in full-sample units are modified as:
  • xInti = Clip3( xSbIntC - Dx, xSbIntC + sbWidth + Ux, xInti )
  • yInti = Clip3( ySbIntC - Dy, ySbIntC + sbHeight + Uy, yInti )
  • where Dx and/or Dy and/or Ux and/or Uy may depend on whether picture wrapping is used.
  • the filtering process may depend on whether the reference pictures have different resolutions.
  • the boundary strength settings in the deblocking filters may take the resolution differences into consideration in addition to motion vector differences.
  • the boundary strength settings in the deblocking filters may use scaled motion vector differences based on the resolution differences.
  • the strength of the deblocking filter is increased if the resolution of at least one reference picture of block A is different from (or smaller than or larger than) the resolution of at least one reference picture of block B.
  • the strength of the deblocking filter is decreased if the resolution of at least one reference picture of block A is different from (or smaller than or larger than) the resolution of at least one reference picture of block B.
  • the strength of the deblocking filter is increased if the resolution of at least one reference picture of block A and/or block B is different from (or smaller than or larger than) the resolution of the current picture.
  • the strength of the deblocking filter is decreased if the resolution of at least one reference picture of block A and/or block B is different from (or smaller than or larger than) the resolution of the current picture (a boundary-strength sketch is given below).
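  • A C sketch of a resolution-aware boundary-strength rule for the deblocking filter: on top of the usual MV-difference test, the strength is raised when the two sides of an edge use reference pictures of different resolutions. The bS scale (0..2, as in HEVC/VVC) and the increment are illustrative assumptions:

      static int boundary_strength(int mvDiffLarge /* usual MV-difference test */,
                                   int refResDiffers /* res(refA) != res(refB) */)
      {
          int bS = mvDiffLarge ? 1 : 0;
          if (refResDiffers && bS < 2)
              bS++;              /* increase strength across a resolution change */
          return bS;
      }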
  • when a sub-picture exists, the reference picture must have the same resolution as the current picture.
  • sub-pictures may be defined separately for pictures with different resolutions.
  • the corresponding sub-picture in the reference picture can be derived by scaling and/or offsetting a sub-picture of the current picture, if the reference picture has a different resolution to the current picture (a coordinate-scaling sketch is given below).
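  • A C sketch of deriving the corresponding sub-picture rectangle in a reference picture of a different resolution by scaling the current sub-picture; nearest-integer rounding (and a zero offset) is an assumption:

      typedef struct { int x, y, w, h; } Rect;

      static Rect scale_subpic(Rect cur, int curW, int curH,
                               int refW, int refH)
      {
          Rect r;
          r.x = (cur.x * refW + curW / 2) / curW;
          r.y = (cur.y * refH + curH / 2) / curH;
          r.w = (cur.w * refW + curW / 2) / curW;
          r.h = (cur.h * refH + curH / 2) / curH;
          return r;
      }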
  • FIG. 3 is a block diagram of a video processing apparatus 300.
  • the apparatus 300 may be used to implement one or more of the methods described herein.
  • the apparatus 300 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on.
  • the apparatus 300 may include one or more processors 302, one or more memories 304 and video processing hardware 306.
  • the processor(s) 302 may be configured to implement one or more methods described in the present document.
  • the memory (or memories) 304 may be used for storing data and code used for implementing the methods and techniques described herein.
  • the video processing hardware 306 may be used to implement, in hardware circuitry, some techniques described in the present document. In some embodiments, the hardware 306 may be at least partly within the processor 302, e.g., a graphics co-processor.
  • a method of video processing comprising determining (402), for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing (404) the conversion based on the determining such that predicted values of a group of samples of the current block are generated using a horizontal or a vertical interpolation filter.
  • a motion vector for a specific sample is derived by modifying a value of a motion vector derived based on the resolution of the current picture and the resolution of the reference picture by a modification step including truncating, clipping or rounding.
  • a method of video processing comprising: determining, for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing the conversion based on the determining such that predicted values of a group of samples of the current block are generated as an interpolated version of a virtual reference block that is generated by sample rate changing a region in the reference picture, wherein the sample rate changing depends on a height or a width of the current picture or the reference picture.
  • a method of video processing comprising: determining, for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and based on the determining, deriving a top-left coordinate of a bounding block for reference sample padding based on a scheme that is dependent on a height or a width of the current picture or the reference picture, and performing the conversion using the derived top-left coordinate of the bounding block.
  • the scheme comprises calculating luma samples located at integer sample locations as:
  • xInti = Clip3( xSbIntL - Dx, xSbIntL + sbWidth + Ux, xInti ),
  • yInti = Clip3( ySbIntL - Dy, ySbIntL + sbHeight + Uy, yInti ),
  • where Dx and/or Dy and/or Ux and/or Uy depend on the width and/or the height of the current picture or the reference picture, and wherein (xSbIntL, ySbIntL) is the top-left coordinate.
  • the method of solution 16 comprises calculating chroma samples located at integer sample locations as:
  • xInti = Clip3( xSbIntC - Dx, xSbIntC + sbWidth + Ux, xInti )
  • yInti = Clip3( ySbIntC - Dy, ySbIntC + sbHeight + Uy, yInti )
  • where Dx and/or Dy and/or Ux and/or Uy depend on the width and/or the height of the current picture or the reference picture, and wherein (xSbIntC, ySbIntC) is the top-left coordinate.
  • a method of video processing comprising: determining, for a conversion between a current block in a current picture of a video and a coded representation of the video, a clipping operation applied to motion vector calculation according to a bounding block for reference sample padding, based on use of a decoder side motion vector refinement (DMVR) during the conversion of the current block; and performing the conversion based on the clipping operation.
  • DMVR decoder side motion vector refinement
  • a method of video processing comprising: determining, for a conversion between a current block in a current picture of a video and a coded representation of the video, a clipping operation applied to motion vector calculation according to a bounding block for reference sample padding, based on use of picture wrapping in the conversion; and performing the conversion based on the clipping operation.
  • xInti = Clip3( xSbIntL - Dx, xSbIntL + sbWidth + Ux, xInti ),
  • yInti = Clip3( ySbIntL - Dy, ySbIntL + sbHeight + Uy, yInti ),
  • a video decoding apparatus comprising a processor configured to implement a method recited in one or more of solutions 1 to 27.
  • a video encoding apparatus comprising a processor configured to implement a method recited in one or more of solutions 1 to 27.
  • a computer program product having computer code stored thereon, the code, when executed by a processor, causes the processor to implement a method recited in any of solutions 1 to 27.
  • the interpolation filters are used for filtering sample values to produce sample values at (fractional) resample locations.
  • a method of video processing comprising: selecting (502) an interpolation filter used for determining a prediction block for a current block of a current picture of video by motion compensation from a reference picture based on a rule, and performing (504) a conversion between the current block of the video and a coded representation of the video based on the prediction block, wherein the rule specifies that the interpolation filter is a first interpolation filter in case that a resolution of the current picture and a resolution of the reference picture are different and the interpolation filter is a second interpolation filter in case that the resolution of the current picture and the resolution of the reference picture are same, wherein the first interpolation filter is different from the second interpolation filter.
  • a method of video processing (e.g., method 570 shown in FIG. 5H) , comprising: performing (572) a conversion between a video comprising a current video picture and a coded representation of the video according to a rule, wherein a reference picture is included in at least one of the reference picture lists of the current video picture, wherein the current video picture has a first resolution; wherein the reference picture has a second resolution; wherein the rule specifies whether and/or how the current video picture and/or the reference picture are permitted to have sub-pictures depending on at least one of the first resolution or the second resolution.
  • a method of video processing (e.g., method 510 shown in FIG. 5B), comprising: determining (512), for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing (514) the conversion based on the determining such that the same horizontal or vertical interpolation filter is used for generating predicted values of the samples of a group of samples of the current block.
  • N is 4 or 8 or equal to a width of the current block.
  • values M and N are dependent on a dimension of the current block or a coded information of the current block or correspond to a syntax element included in the coded representation.
  • the motion vector used for generating the predicted values of samples in the group corresponds to a final motion vector MV' that is first derived according to the resolutions of the current picture and the reference picture (this first-derived vector is denoted as MVb) and then modified according to a motion vector characteristic for the group of samples.
  • MV' is selected to be a motion vector having a closest match to MVb and having a fractional part equal to a shared horizontal or a shared vertical component among samples of the group of samples.
  • a method of video processing (e.g., method 520 shown in FIG. 5C), comprising: making a determination (522), due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a constraint on a motion vector used for deriving a prediction block for the current block; and performing (524) a conversion between the video and a coded representation of the video based on the determination, wherein the constraint specifies that the motion vector is an integer motion vector.
  • a method of video processing (e.g., method 530 shown in FIG. 5D) , comprising: performing (532) a conversion between a video comprising a current block and a coded representation according to a rule, wherein the rule specifies whether a motion vector information used for determining a prediction block of the current block by motion compensation is made available for motion vector prediction of succeeding blocks in a current picture comprising the current block or another picture.
  • a method of video processing (e.g., method 540 shown in FIG. 5E), comprising: making a determination (542), due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a two-step process to generate a prediction block for the current block; and performing (544) a conversion between the video and a coded representation of the video based on the determination, wherein the two-step process comprises a first step of resampling a region of the reference picture to generate a virtual reference block and a second step of generating the prediction block using an interpolation filter on the virtual reference block.
  • resampling includes upsampling or downsampling, depending on widths and/or heights of the current picture and the reference picture.
  • a method of video processing (e.g., method 550 shown in FIG. 5F) , comprising: making a first determination (552) that there is a difference between resolutions of one or more reference pictures used for generating a prediction block of a current block of a current picture of a video and a resolution of the current picture, using a rule to make a second determination (554) , based on the difference, about whether or how to apply a filtering process for a conversion between the video and a coded representation of the video; and performing (556) the conversion according to the second determination.
  • a method of video processing (e.g., method 560 shown in FIG. 5G) , comprising: performing (562) a conversion between a video comprising multiple video blocks of a video picture and a coded representation of the video according to a rule, wherein the multiple video blocks are processed in an order, wherein the rule specifies that motion vector information used for determining prediction block of a first video block is stored and used during processing of a succeeding video block of the multiple video blocks according to a resolution of a reference picture used by the motion vector information.
  • a video decoding apparatus comprising a processor configured to implement a method recited in one or more of examples 1 to 74.
  • a video encoding apparatus comprising a processor configured to implement a method recited in one or more of examples 1 to 74.
  • a computer program product having computer code stored thereon, the code, when executed by a processor, causes the processor to implement a method recited in any of examples 1 to 74.
  • a non-transitory computer-readable storage medium storing instructions that cause a processor to implement a method recited in any of examples 1 to 74.
  • a non-transitory computer-readable recording medium storing a bitstream corresponding to the coded representation that is generated by a method recited in example 78.
  • the performing of the conversion includes using the results of a previous decision step during the encoding or decoding operation to arrive at the conversion results.
  • video processing may refer to video encoding, video decoding, video compression or video decompression.
  • video compression algorithms may be applied during conversion from pixel representation of a video to a corresponding bitstream representation or vice versa.
  • the bitstream representation, or coded representation, of a current video block may, for example, correspond to bits that are either co-located or spread in different places within the bitstream, as is defined by the syntax.
  • a video block may be encoded in terms of transformed and coded error residual values and also using bits in headers and other fields in the bitstream.
  • a decoder may parse a bitstream with the knowledge that some fields may be present, or absent, based on the determination, as is described in the above solutions.
  • an encoder may determine that certain syntax fields are or are not to be included and generate the coded representation accordingly by including or excluding the syntax fields from the coded representation.
  • Some embodiments of the disclosed technology include making a decision or determination to enable a video processing tool or mode.
  • the encoder when the video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of a block of video, but may not necessarily modify the resulting bitstream based on the usage of the tool or mode. That is, a conversion from the block of video to the bitstream representation of the video will use the video processing tool or mode when it is enabled based on the decision or determination.
  • the decoder when the video processing tool or mode is enabled, the decoder will process the bitstream with the knowledge that the bitstream has been modified based on the video processing tool or mode. That is, a conversion from the bitstream representation of the video to the block of video will be performed using the video processing tool or mode that was enabled based on the decision or determination.
  • Some embodiments of the disclosed technology include making a decision or determination to disable a video processing tool or mode.
  • the encoder will not use the tool or mode in the conversion of the block of video to the bitstream representation of the video.
  • the decoder will process the bitstream with the knowledge that the bitstream has not been modified using the video processing tool or mode that was disabled based on the decision or determination.
  • the disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them.
  • the disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus.
  • the computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them.
  • data processing apparatus encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
  • the apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
  • a propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
  • a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program does not necessarily correspond to a file in a file system.
  • a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) .
  • a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
  • processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
  • a processor will receive instructions and data from a read only memory or a random-access memory or both.
  • the essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
  • a computer need not have such devices.
  • Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks.
  • the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method of video processing includes selecting an interpolation filter used for determining a prediction block for a current block of a current picture of video by motion compensation from a reference picture based on a rule and performing a conversion between the current block of the video and a coded representation of the video based on the prediction block. Here, the rule specifies that the interpolation filter is a first interpolation filter in case that a resolution of the current picture and a resolution of the reference picture are different and the interpolation filter is a second interpolation filter in case that the resolution of the current picture and the resolution of the reference picture are same, wherein the first interpolation filter is different from the second interpolation filter.

Description

REFERENCE PICTURE RESAMPLING
CROSS REFERENCE TO RELATED APPLICATIONS
Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/102289, filed on August 23, 2019. For all purposes under the law, the entire disclosure of the aforementioned application is incorporated by reference as part of the disclosure of this application.
TECHNICAL FIELD
This patent document relates to video coding and decoding.
BACKGROUND
In spite of the advances in video compression, digital video still accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
SUMMARY
Devices, systems and methods related to digital video coding, and specifically, to video and image coding and decoding in which current pictures and reference pictures have different sizes or resolutions, are described.
In one example aspect, a method of video processing is disclosed. The method includes selecting an interpolation filter used for determining a prediction block for a current block of a current picture of video by motion compensation from a reference picture based on a rule, and performing a conversion between the current block of the video and a coded representation of the video based on the prediction block, wherein the rule specifies that the interpolation filter is a first interpolation filter in case that a resolution of the current picture and a resolution of the reference picture are different and the interpolation filter is a second interpolation filter in case that the resolution of the current picture and the resolution of the reference picture are same, wherein the first interpolation filter is different from the second interpolation filter.
In another example aspect, another method of video processing is disclosed. The method includes performing a conversion between a video comprising a current video picture and a coded representation of the video according to a rule, wherein a reference picture is included in at least one of the reference picture lists of the current video picture, wherein the current video picture has a first resolution; wherein the reference picture has a second resolution; wherein the rule specifies whether and/or how the current video picture and/or the reference picture are permitted to have sub-pictures depending on at least one of the first resolution or the second resolution.
In another example aspect, another method of video processing is disclosed. The method includes determining, for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing the conversion based on the determining such that the same horizontal interpolation filter or vertical interpolation filter is used for generating predicted values of samples of a group of samples of the current block.
In another example aspect, another method of video processing is disclosed. The method includes making a determination, due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a constraint on a motion vector used for deriving a prediction block for the current block; and performing a conversion between the video and a coded representation of the video based on the determination, wherein the constraint specifies that the motion vector is an integer motion vector.
In another example aspect, another method of video processing is disclosed. The method includes performing a conversion between a video comprising a current block and a coded representation according to a rule, wherein the rule specifies whether motion vector information used for determining a prediction block of the current block by motion compensation is made available for motion vector prediction of succeeding blocks in a current picture comprising the current block or another picture.
In another example aspect, another method of video processing is disclosed. The method includes making a determination, due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a two-step process to generate a prediction block for the current block; and performing a conversion between the video and a coded representation of the video based on the determination, wherein the two-step process comprises a first step of resampling a region of the reference picture to generate a virtual reference block and a second step of generating the prediction block using an interpolation filter on the virtual reference block.
In another example aspect, another method of video processing is disclosed. The method includes making a first determination that there is a difference between resolutions of one or more reference pictures used for generating a prediction block of a current block of a current picture of a video and a resolution of the current picture, using a rule to make a second determination, based on the difference, about whether or how to apply a filtering process for a conversion between the video and a coded representation of the video; and performing the conversion according to the second determination.
In another example aspect, another method of video processing is disclosed. The method includes performing a conversion between a video comprising multiple video blocks of a video picture and a coded representation of the video according to a rule, wherein the multiple video blocks are processed in an order, wherein the rule specifies that motion vector information used for determining a prediction block of a first video block is stored and used during processing of a succeeding video block of the multiple video blocks according to a resolution of a reference picture used by the motion vector information.
In yet another representative aspect, the above-described method is embodied in the form of processor-executable code and stored in a computer-readable program medium.
In yet another representative aspect, a device that is configured or operable to perform the above-described method is disclosed. The device may include a processor that is programmed to implement this method.
In yet another representative aspect, a video decoder apparatus may implement a method as described herein.
In yet another representative aspect, a computer readable medium is disclosed. The medium includes a coded video representation generated using one of the above described methods.
The above and other aspects and features of the disclosed technology are described in greater detail in the drawings, the description and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a 16x16 block divided into 16 4x4 regions.
FIG. 2A-2C show examples of specific positions in a video block.
FIG. 3 is a block diagram of an example implementation of a hardware platform for video processing.
FIG. 4 is a flowchart for an example method of video processing.
FIGS. 5A to 5H are flowcharts for example methods of video processing.
DETAILED DESCRIPTION
Embodiments of the disclosed technology may be applied to existing video coding standards (e.g., HEVC, H.265) and future standards to improve compression performance. Section headings are used in the present document to improve readability of the description and do not in any way limit the discussion or the embodiments (and/or implementations) to the respective sections only.
1. Introduction
This document is related to video coding technologies. Specifically, it is related to adaptive resolution conversion in video coding or decoding. It may be applied to existing video/image coding standards like HEVC, or to the standard to be finalized (Versatile Video Coding). It may also be applicable to future video coding standards or video codecs.
2. Initial discussion
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video and H.264/MPEG-4 Advanced Video Coding (AVC) and H.265/HEVC standards. Since H.262, the video coding standards have been based on the hybrid video coding structure, wherein temporal prediction plus transform coding are utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded by VCEG and MPEG jointly in 2015. Since then, many new methods have been adopted by JVET and put into the reference software named Joint Exploration Model (JEM) . In April 2018, the Joint Video Expert Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared to HEVC.
AVC and HEVC do not have the ability to change resolution without introducing an IDR or intra random access point (IRAP) picture; such an ability can be referred to as adaptive resolution change (ARC) . There are use cases or application scenarios that would benefit from an ARC feature, including the following:
- Rate adaptation in video telephony and conferencing: To adapt the coded video to changing network conditions, when the network condition gets worse and the available bandwidth becomes lower, the encoder may adapt by encoding pictures at a smaller resolution. Currently, picture resolution can be changed only after an IRAP picture; this has several issues. An IRAP picture at reasonable quality will be much larger than an inter-coded picture and will be correspondingly more complex to decode: this costs time and resources. This is a problem if the resolution change is requested by the decoder for loading reasons. It can also break low-latency buffer conditions, forcing an audio re-sync, and the end-to-end delay of the stream will increase, at least temporarily. This can give a poor user experience.
- Active speaker changes in multi-party video conferencing: For multi-party video conferencing, it is common that the active speaker is shown in bigger video size than the video for the rest of conference participants. When the active speaker changes, picture resolution for each participant may also need to be adjusted. The need to have ARC feature becomes more important when such change in active speaker happens frequently.
- Fast start in streaming: For streaming applications, it is common for the application to buffer up to a certain length of decoded pictures before it starts displaying. Starting the bitstream with a smaller resolution would allow the application to have enough pictures in the buffer to start displaying sooner.
- Adaptive stream switching in streaming: The Dynamic Adaptive Streaming over HTTP (DASH) specification includes a feature named @mediaStreamStructureId. This enables switching between different representations at open-GOP random access points with non-decodable leading pictures, e.g., CRA pictures with associated RASL pictures in HEVC. When two different representations of the same video have different bitrates but the same spatial resolution, and they have the same value of @mediaStreamStructureId, switching between the two representations can be performed at a CRA picture with associated RASL pictures, and the RASL pictures associated with the CRA pictures at which switching occurs can be decoded with acceptable quality, hence enabling seamless switching. With ARC, the @mediaStreamStructureId feature would also be usable for switching between DASH representations with different spatial resolutions.
ARC is also known as Dynamic resolution conversion.
ARC may also be regarded as a special case of Reference Picture Resampling (RPR) , such as H.263 Annex P.
2.1. Reference picture resampling in H.263 Annex P
This mode describes an algorithm to warp the reference picture prior to its use for prediction. It can be useful for resampling a reference picture having a different source format than the picture being predicted. It can also be used for global motion estimation, or estimation of rotating motion, by warping the shape, size, and location of the reference picture. The syntax includes warping parameters to be used as well as a resampling algorithm. The simplest level of operation for the reference picture resampling mode is an implicit factor of 4 resampling as only an FIR filter needs to be applied for the upsampling and downsampling processes. In this case, no additional signaling overhead is required as its use is understood when the size of a new picture (indicated in the picture header) is different from that of the previous picture.
2.2. Contributions on ARC to VVC
Several contributions have been proposed addressing ARC, as listed below:
JVET-M0135, JVET-M0259, JVET-N0048, JVET-N0052, JVET-N0118, JVET-N0279.
2.3. ARC in JVET-O2001-v14
ARC, a.k.a. RPR (Reference Picture Resampling) is incorporated in JVET-O2001-v14.
With RPR in JVET-O2001-v14, TMVP is disabled if the collocated picture has a different resolution to the current picture. Besides, BDOF and DMVR are disabled when the reference picture has a different resolution to the current picture.
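As a hedged illustration (not text from JVET-O2001; the type and function names below are ours), this gating logic can be sketched in a few lines of C:

typedef struct { int pic_w, pic_h; } PicInfo;  /* picture size in luma samples */

static int same_size(const PicInfo *a, const PicInfo *b) {
    return a->pic_w == b->pic_w && a->pic_h == b->pic_h;
}

/* TMVP is disabled when the collocated picture differs in resolution from
 * the current picture; BDOF and DMVR are disabled when the reference
 * picture differs in resolution from the current picture. */
static int tmvp_enabled(const PicInfo *cur, const PicInfo *col) {
    return same_size(cur, col);
}
static int bdof_dmvr_enabled(const PicInfo *cur, const PicInfo *ref) {
    return same_size(cur, ref);
}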
To handle the normal MC when the reference picture has a different resolution than the current picture, the interpolation section is defined as below (section numbers refer to the current version of VVC; italicized text in the original indicates proposed changes to the specification) :
8.5.6.3.1 General
Inputs to this process are:
- a luma location (xSb, ySb) specifying the top-left sample of the current coding subblock relative to the top-left luma sample of the current picture,
- a variable sbWidth specifying the width of the current coding subblock,
- a variable sbHeight specifying the height of the current coding subblock,
- a motion vector offset mvOffset,
- a refined motion vector refMvLX,
- the selected reference picture sample array refPicLX,
- the half sample interpolation filter index hpelIfIdx,
- the bi-directional optical flow flag bdofFlag,
- a variable cIdx specifying the colour component index of the current block.
Outputs of this process are:
- an (sbWidth + brdExtSize) x (sbHeight + brdExtSize) array predSamplesLX of prediction sample values.
The prediction block border extension size brdExtSize is derived as follows:
brdExtSize = (bdofFlag || (inter_affine_flag [xSb] [ySb] && sps_affine_prof_enabled_flag) ) ? 2: 0    (8-752)
The variable fRefWidth is set equal to the PicOutputWidthL of the reference picture in luma samples.
The variable fRefHeight is set equal to PicOutputHeightL of the reference picture in luma samples.
The motion vector mvLX is set equal to (refMvLX-mvOffset) .
- If cIdx is equal to 0, the following applies:
- The scaling factors and their fixed-point representations are defined as
hori_scale_fp = ( (fRefWidth << 14) + (PicOutputWidthL >> 1) ) /PicOutputWidthL      (8-753)
vert_scale_fp = ( (fRefHeight << 14) + (PicOutputHeightL >> 1) ) /PicOutputHeightL    (8-754)
- Let (xIntL, yIntL) be a luma location given in full-sample units and (xFracL, yFracL) be an offset given in 1/16-sample units. These variables are used only in this clause for specifying fractional-sample locations inside the reference sample arrays refPicLX.
- The top-left coordinate of the bounding block for reference sample padding (xSbInt L, ySbInt L) is set equal to (xSb + (mvLX [0] >> 4) , ySb + (mvLX [1] >> 4) ) .
- For each luma sample location (x L = 0.. sbWidth-1 + brdExtSize, y L = 0.. sbHeight-1 + brdExtSize) inside the prediction  luma sample array predSamplesLX, the corresponding prediction luma sample value predSamplesLX [x L] [y L] is derived as follows:
- Let (refxSb L, refySb L) and (refx L, refy L) be luma locations pointed to by a motion vector (refMvLX [0] , refMvLX [1] ) given in 1/16-sample units. The variables refxSb L, refx L, refySb L, and refy L are derived as follows:
refxSb L = ( (xSb << 4) + refMvLX [0] ) *hori_scale_fp               (8-755)
refx L = ( (Sign (refxSb L) * ( (Abs (refxSb L) + 128) >> 8) + x L * ( (hori_scale_fp + 8) >> 4) ) + 32) >> 6              (8-756)
refySb L = ( (ySb << 4) + refMvLX [1] ) *vert_scale_fp               (8-757)
refy L = ( (Sign (refySb L) * ( (Abs (refySb L) + 128) >> 8) + y L * ( (vert_scale_fp + 8) >> 4) ) + 32) >> 6                    (8-758)
- The variables xInt L, yInt L, xFrac L and yFrac L are derived as follows:
xInt L =refx L >> 4      (8-759)
yInt L = refy L >> 4      (8-760)
xFrac L = refx L & 15     (8-761)
yFrac L = refy L & 15     (8-762)
- If bdofFlag is equal to TRUE or (sps_affine_prof_enabled_flag is equal to TRUE and inter_affine_flag [xSb] [ySb] is equal to TRUE) , and one or more of the following conditions are true, the prediction luma sample value predSamplesLX [x L] [y L] is derived by invoking the luma integer sample fetching process as specified in clause 8.5.6.3.3 with (xInt L + (xFrac L >> 3) -1, yInt L + (yFrac L >> 3) -1) and refPicLX as inputs.
- x L is equal to 0.
- x L is equal to sbWidth + 1.
- y L is equal to 0.
- y L is equal to sbHeight + 1.
- Otherwise, the prediction luma sample value predSamplesLX [xL] [yL] is derived by invoking the luma sample 8-tap interpolation filtering process as specified in clause 8.5.6.3.2 with (xIntL - (brdExtSize > 0 ? 1: 0) , yIntL - (brdExtSize > 0 ? 1: 0) ) , (xFracL, yFracL) , (xSbInt L, ySbInt L) , refPicLX, hpelIfIdx, sbWidth, sbHeight and (xSb, ySb) as inputs.
- Otherwise (cIdx is not equal to 0) , the following applies:
- Let (xIntC, yIntC) be a chroma location given in full-sample units and (xFracC, yFracC) be an offset given in 1/32 sample units. These variables are used only in this clause for specifying general fractional-sample locations inside the reference sample arrays refPicLX.
- The top-left coordinate of the bounding block for reference sample padding (xSbIntC, ySbIntC) is set equal to ( (xSb/SubWidthC) + (mvLX [0] >> 5) , (ySb/SubHeightC) + (mvLX [1] >> 5) ) .
- For each chroma sample location (xC = 0.. sbWidth -1, yC = 0.. sbHeight -1) inside the prediction chroma sample arrays predSamplesLX, the corresponding prediction chroma sample value predSamplesLX [xC] [yC] is derived as follows:
- Let (refxSb C, refySb C) and (refx C, refy C) be chroma locations pointed to by a motion vector (mvLX [0] , mvLX [1] ) given in 1/32-sample units. The variables refxSb C, refySb C, refx C and refy C are derived as follows:
refxSb C= ( (xSb/SubWidthC << 5) + mvLX [0] ) *hori_scale_fp      (8-763)
refx C= ( (Sign (refxSb C) * ( (Abs (refxSb C) + 256) >> 9) + xC* ( (hori_scale_fp + 8) >> 4) ) + 16) >> 5      (8-764)
refySb C= ( (ySb/SubHeightC << 5) + mvLX [1] ) *vert_scale_fp      (8-765)
refy C = ( (Sign (refySb C) * ( (Abs (refySb C) + 256) > > 9) + yC* ( (vert_scale_fp + 8) >> 4) ) + 16) >> 5      (8-766)
- The variables xInt C, yInt C, xFrac C and yFrac C are derived as follows:
xInt C = refx C >> 5      (8-767)
yInt C = refy C >> 5      (8-768)
xFrac C = refx C & 31      (8-769)
yFrac C = refy C & 31      (8-770)
- The prediction sample value predSamplesLX [xC] [yC] is derived by invoking the process specified in clause 8.5.6.3.4 with (xIntC, yIntC) , (xFracC, yFracC) , (xSbIntC, ySbIntC) , sbWidth, sbHeight and refPicLX as inputs.
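For readability, the horizontal part of the luma position derivation above (equations 8-753, 8-755, 8-756, 8-759 and 8-761) can be condensed into the following C sketch. It is a minimal illustration under the assumption of C99 integer semantics; the function name and argument packaging are ours, not part of the specification.

#include <stdlib.h>   /* llabs */

/* Map column xL of a subblock anchored at xSb (full luma samples), with a
 * motion vector component mvx in 1/16 luma samples, onto the reference
 * picture, yielding the full-sample position xInt and the 1/16 phase xFrac. */
static void ref_pos_luma_x(int xSb, int xL, int mvx,
                           int fRefWidth, int PicOutputWidthL,
                           int *xInt, int *xFrac)
{
    /* 8-753: horizontal scaling factor in 1.14 fixed point */
    int hori_scale_fp =
        ((fRefWidth << 14) + (PicOutputWidthL >> 1)) / PicOutputWidthL;

    /* 8-755: scaled subblock anchor */
    long long refxSb = (long long)((xSb << 4) + mvx) * hori_scale_fp;

    /* 8-756: per-sample reference position in 1/16 luma samples */
    long long s = refxSb < 0 ? -1 : 1;
    long long refx = ((s * ((llabs(refxSb) + 128) >> 8)
                      + (long long)xL * ((hori_scale_fp + 8) >> 4)) + 32) >> 6;

    *xInt  = (int)(refx >> 4);   /* 8-759: full-sample part      */
    *xFrac = (int)(refx & 15);   /* 8-761: 1/16 fractional phase */
}

The vertical direction (8-754, 8-757, 8-758, 8-760 and 8-762) is identical with heights in place of widths.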
Luma sample interpolation filtering process
Inputs to this process are:
- a luma location in full-sample units (xInt L, yInt L) ,
- a luma location in fractional-sample units (xFrac L, yFrac L) ,
- a luma location in full-sample units (xSbInt L, ySbInt L) specifying the top-left sample of the bounding block for reference sample padding relative to the top-left luma sample of the reference picture,
- the luma reference sample array refPicLX L,
- the half sample interpolation filter index hpelIfIdx,
- a variable sbWidth specifying the width of the current subblock,
- a variable sbHeight specifying the height of the current subblock,
- a luma location (xSb, ySb) specifying the top-left sample of the current subblock relative to the top-left luma sample of the current picture,
Output of this process is a predicted luma sample value predSampleLX L
The variables shift1, shift2 and shift3 are derived as follows:
- The variable shift1 is set equal to Min (4, BitDepth Y-8) , the variable shift2 is set equal to 6 and the variable shift3 is set equal to Max (2, 14-BitDepth Y) .
- The variable picW is set equal to pic_width_in_luma_samples and the variable picH is set equal to pic_height_in_luma_samples.
The luma interpolation filter coefficients f L [p] for each 1/16 fractional sample position p equal to xFrac L or yFrac L are derived as follows:
- If MotionModelIdc [xSb] [ySb] is greater than 0, and sbWidth and sbHeight are both equal to 4, the luma interpolation filter coefficients f L [p] are specified in Table 8-12.
- Otherwise, the luma interpolation filter coefficients f L [p] are specified in Table 8-11 depending on hpelIfIdx.
The luma locations in full-sample units (xInt i, yInt i) are derived as follows for i = 0.. 7:
- If subpic_treated_as_pic_flag [SubPicIdx] is equal to 1, the following applies:
xInt i = Clip3 (SubPicLeftBoundaryPos, SubPicRightBoundaryPos, xInt L + i -3)    (8-771)
yInt i = Clip3 (SubPicTopBoundaryPos, SubPicBotBoundaryPos, yInt L + i -3)    (8-772)
- Otherwise (subpic_treated_as_pic_flag [SubPicIdx] is equal to 0) , the following applies:
xInt i = Clip3 (0, picW -1, sps_ref_wraparound_enabled_flag ? ClipH ( (sps_ref_wraparound_offset_minus1 + 1) *MinCbSizeY, picW, xInt L + i -3) : xInt L + i -3)           (8-773)
yInt i = Clip3 (0, picH -1, yInt L + i -3)           (8-774)
The luma locations in full-sample units are further modified as follows for i = 0.. 7:
xInt i = Clip3 (xSbInt L -3, xSbInt L + sbWidth + 4, xInt i)      (8-775)
yInt i = Clip3 (ySbInt L -3, ySbInt L + sbHeight + 4, yInt i)      (8-776)
The predicted luma sample value predSampleLX L is derived as follows:
- If both xFrac Land yFrac L are equal to 0, the value of predSampleLX L is derived as follows:
predSampleLX L = refPicLX L [xInt 3] [yInt 3] << shift3          (8-777)
- Otherwise, if xFrac L is not equal to 0 and yFrac L is equal to 0, the value of predSampleLX L is derived as follows:
predSampleLX L = ( Σ i = 0.. 7 f L [xFrac L] [i] *refPicLX L [xInt i] [yInt 3] ) >> shift1           (8-778)
- Otherwise, if xFrac L is equal to 0 and yFrac L is not equal to 0, the value of predSampleLX L is derived as follows:
predSampleLX L = ( Σ i = 0.. 7 f L [yFrac L] [i] *refPicLX L [xInt 3] [yInt i] ) >> shift1           (8-779)
- Otherwise, if xFrac L is not equal to 0 and yFrac L is not equal to 0, the value of predSampleLX L is derived as follows:
- The sample array temp [n] with n = 0.. 7, is derived as follows:
temp [n] = ( Σ i = 0.. 7 f L [xFrac L] [i] *refPicLX L [xInt i] [yInt n] ) >> shift1           (8-780)
- The predicted luma sample value predSampleLX L is derived as follows:
predSampleLX L = ( Σ i = 0.. 7 f L [yFrac L] [i] *temp [i] ) >> shift2           (8-781)
Table 8-11-Specification of the luma interpolation filter coefficients f L [p] for each 1/16 fractional sample position p.
[Table 8-11 is rendered as an image in the original publication and is not reproduced here.]
Table 8-12-Specification of the luma interpolation filter coefficients f L [p] for each 1/16 fractional sample position p for affine motion mode.
[Table 8-12 is rendered as an image in the original publication and is not reproduced here.]
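The four cases of the derivation above (8-777 to 8-781) amount to a separable 8-tap filter. The following hedged C sketch shows the structure; fL is assumed to be a [16][8] coefficient table corresponding to Table 8-11, and ref(x, y) stands for the clipped access to refPicLX L — both are placeholders of ours, not names from the specification.

/* Separable luma interpolation (sketch of 8-777..8-781). */
static int interp_luma(const short fL[16][8],
                       int (*ref)(int x, int y),
                       int xInt, int yInt, int xFrac, int yFrac,
                       int shift1, int shift2, int shift3)
{
    int i, n;
    if (xFrac == 0 && yFrac == 0)                  /* 8-777: no filtering */
        return ref(xInt, yInt) << shift3;

    if (yFrac == 0) {                              /* 8-778: horizontal only */
        int sum = 0;
        for (i = 0; i < 8; i++)
            sum += fL[xFrac][i] * ref(xInt + i - 3, yInt);
        return sum >> shift1;
    }
    if (xFrac == 0) {                              /* 8-779: vertical only */
        int sum = 0;
        for (i = 0; i < 8; i++)
            sum += fL[yFrac][i] * ref(xInt, yInt + i - 3);
        return sum >> shift1;
    }
    {                                              /* 8-780 and 8-781 */
        int temp[8], sum = 0;
        for (n = 0; n < 8; n++) {                  /* horizontal pass */
            int s = 0;
            for (i = 0; i < 8; i++)
                s += fL[xFrac][i] * ref(xInt + i - 3, yInt + n - 3);
            temp[n] = s >> shift1;
        }
        for (i = 0; i < 8; i++)                    /* vertical pass */
            sum += fL[yFrac][i] * temp[i];
        return sum >> shift2;
    }
}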
Luma integer sample fetching process
Inputs to this process are:
- a luma location in full-sample units (xInt L, yInt L) ,
- the luma reference sample array refPicLX L,
Output of this process is a predicted luma sample value predSampleLX L
The variable shift is set equal to Max (2, 14 -BitDepth Y) .
The variable picW is set equal to pic_width_in_luma_samples and the variable picH is set equal to pic_height_in_luma_samples.
The luma locations in full-sample units (xInt, yInt) are derived as follows:
xInt = Clip3 (0, picW -1, sps_ref_wraparound_enabled_flag ? ClipH ( (sps_ref_wraparound_offset_minus1 + 1) *MinCbSizeY, picW, xInt L) : xInt L)      (8-782)
yInt = Clip3 (0, picH-1, yInt L)      (8-783)
The predicted luma sample value predSampleLX L is derived as follows:
predSampleLX L = refPicLX L [xInt] [yInt] << shift    (8-784)
Chroma sample interpolation process
Inputs to this process are:
- a chroma location in full-sample units (xInt C, yInt C) ,
- a chroma location in 1/32 fractional-sample units (xFrac C, yFrac C) ,
- a chroma location in full-sample units (xSbIntC, ySbIntC) specifying the top-left sample of the bounding block for reference sample padding relative to the top-left chroma sample of the reference picture,
- a variable sbWidth specifying the width of the current subblock,
- a variable sbHeight specifying the height of the current subblock,
- the chroma reference sample array refPicLX C.
Output of this process is a predicted chroma sample value predSampleLX C
The variables shift1, shift2 and shift3 are derived as follows:
- The variable shift1 is set equal to Min (4, BitDepth C-8) , the variable shift2 is set equal to 6 and the variable shift3 is set equal to Max (2, 14-BitDepth C) .
- The variable picW C is set equal to pic_width_in_luma_samples/SubWidthC and the variable picH C is set equal to pic_height_in_luma_samples/SubHeightC.
The chroma interpolation filter coefficients f C [p] for each 1/32 fractional sample position p equal to xFrac C or yFrac C are specified in Table 8-13.
The variable xOffset is set equal to ( (sps_ref_wraparound_offset_minus1 + 1) *MinCbSizeY) /SubWidthC.
The chroma locations in full-sample units (xInt i, yInt i) are derived as follows for i = 0.. 3:
- If subpic_treated_as_pic_flag [SubPicIdx] is equal to 1, the following applies:
xInt i = Clip3 (SubPicLeftBoundaryPos/SubWidthC, SubPicRightBoundaryPos/SubWidthC, xInt C + i)    (8-785)
yInt i = Clip3 (SubPicTopBoundaryPos/SubHeightC, SubPicBotBoundaryPos/SubHeightC, yInt C + i)    (8-786)
- Otherwise (subpic_treated_as_pic_flag [SubPicIdx] is equal to 0) , the following applies:
xInt i = Clip3 (0, picW C -1, sps_ref_wraparound_enabled_flag ? ClipH (xOffset, picW C, xInt C + i -1) : xInt C + i -1)           (8-787)
yInt i = Clip3 (0, picH C -1, yInt C + i -1)           (8-788)
The chroma locations in full-sample units (xInt i, yInt i) are further modified as follows for i = 0.. 3:
xInt i = Clip3 (xSbIntC -1, xSbIntC + sbWidth + 2, xInt i)        (8-789)
yInt i = Clip3 (ySbIntC -1, ySbIntC + sbHeight + 2, yInt i)        (8-790)
The predicted chroma sample value predSampleLX C is derived as follows:
- If both xFrac C and yFrac C are equal to 0, the value of predSampleLX C is derived as follows:
predSampleLX C = refPicLX C [xInt 1] [yInt 1] << shift3        (8-791)
- Otherwise, if xFrac C is not equal to 0 and yFrac C is equal to 0, the value of predSampleLX C is derived as follows:
predSampleLX C = ( Σ i = 0.. 3 f C [xFrac C] [i] *refPicLX C [xInt i] [yInt 1] ) >> shift1           (8-792)
- Otherwise, if xFrac C is equal to 0 and yFrac C is not equal to 0, the value of predSampleLX C is derived as follows:
predSampleLX C = ( Σ i = 0.. 3 f C [yFrac C] [i] *refPicLX C [xInt 1] [yInt i] ) >> shift1           (8-793)
- Otherwise, if xFrac C is not equal to 0 and yFrac C is not equal to 0, the value of predSampleLX C is derived as follows:
- The sample array temp [n] with n = 0.. 3, is derived as follows:
temp [n] = ( Σ i = 0.. 3 f C [xFrac C] [i] *refPicLX C [xInt i] [yInt n] ) >> shift1           (8-794)
- The predicted chroma sample value predSampleLX C is derived as follows:
predSampleLX C = (f C [yFrac C] [0] *temp [0] + f C [yFrac C] [1] *temp [1] + f C [yFrac C] [2] *temp [2] + f C [yFrac C] [3] *temp [3] ) >> shift2           (8-795)
Table 8-13 -Specification of the chroma interpolation filter coefficients f C [p] for each 1/32 fractional sample position p.
[Table 8-13 is rendered as an image in the original publication and is not reproduced here.]
3. Technical problems associated with current video coding technologies
When RPR is applied in VVC, RPR (ARC) may have the following problems:
1. With RPR, the interpolation filters may be different for adjacent samples in a block, which is undesirable in SIMD (Single Instruction Multiple Data) implementation.
2. The derivation of the bounding block for reference sample padding does not take RPR into consideration.
4. A listing of embodiments and technical solutions
The list below should be considered as examples to explain general concepts. These items  should not be interpreted in a narrow way. Furthermore, these items can be combined in any manner.
A motion vector is denoted by (mv_x, mv_y) wherein mv_x is the horizontal component and mv_y is the vertical component.
1. When the resolution of the reference picture is different to the current picture, predicted values for a group of samples (at least two samples) of a current block may be generated with the same horizontal and/or vertical interpolation filter.
a. In one example, the group may comprise all samples in a region of the block.
i. For example, a block may be divided into S non-overlapping MxN rectangles. Each MxN rectangle is a group. In the example shown in FIG. 1, a 16x16 block is divided into 16 4x4 rectangles, each of which is a group.
ii. For example, a row with N samples is a group. N is an integer no larger than the block width. In one example, N is 4 or 8 or the block width.
iii. For example, a column with N samples is a group. N is an integer no larger than the block height. In one example, N is 4 or 8 or the block height.
iv. M and/or N may be pre-defined or derived on-the-fly, such as based on block dimension/coded information or signaled.
b. In one example, samples in the group may have the same MV (denoted as shared MV) .
c. In one example, samples in the group may have MVs with the same horizontal component (denoted as shared horizontal component) .
d. In one example, samples in the group may have MVs with the same vertical component (denoted as shared vertical component) .
e. In one example, samples in the group may have MVs with the same fractional part of the horizontal component (denoted as shared fractional horizontal component) .
i. For example, suppose the MV for a first sample is (MV1x, MV1y) and the MV for a second sample is (MV2x, MV2y) ; it should be satisfied that MV1x & (2^M -1) is equal to MV2x & (2^M -1) , where M denotes the MV precision. For example, M=4.
f. In one example, samples in the group may have MVs with the same fractional part of the vertical component (denoted as shared fractional vertical component) .
i. For example, suppose the MV for a first sample is (MV1x, MV1y) and the MV for a second sample is (MV2x, MV2y) ; it should be satisfied that MV1y & (2^M -1) is equal to MV2y & (2^M -1) , where M denotes the MV precision. For example, M=4.
g. In one example, for a sample in the group to be predicted, the motion vector, denoted by MV b, may first be derived according to the resolutions of the current picture and the reference picture (e.g., (refx L, refy L) derived in 8.5.6.3.1 in JVET-O2001-v14) . Then, MV b may be further modified (e.g., rounded/truncated/clipped) to MV’ to satisfy requirements such as those in the above bullets, and MV’ will be used to derive the prediction sample for the sample (a C sketch of this modification appears after this list) .
i. In one example, MV’ has the same integer part as MV b, and the fractional part of the MV’ is set to be the shared fractional horizontal and/or vertical component.
ii. In one example, MV’ is set to be the one with the shared fractional horizontal and/or vertical component, and closest to MV b.
h. The shared motion vector (and/or shared horizontal component and/or shared vertical component and/or shared fractional horizontal component and/or shared fractional vertical component) may be set to be the motion vector (and/or horizontal component and/or vertical component and/or fractional horizontal component and/or fractional vertical component) of a specific sample in the group.
i. For example, the specific sample may be at a corner of a rectangle-shaped group, such as "A" , "B" , "C" and "D" shown in FIG. 2A.
ii. For example, the specific sample may be at a center of a rectangle-shaped group, such as "E" , "F" , "G" and "H" shown in FIG. 2A.
iii. For example, the specific sample may be at an end of a row-shaped or column-shaped group, such as “A” and “D” shown in FIGS. 2B and 2C.
iv. For example, the specific sample may be at a middle of a row-shaped or column-shaped group, such as “B” and “C” shown in FIGS. 2B and 2C.
v. In one example, the motion vector of the specific sample may be the MV b mentioned in bullet g.
i. The shared motion vector (and/or shared horizontal component and/or shared vertical component and/or shared fractional horizontal component and/or shared fractional vertical component) may be set to be the motion vector (and/or horizontal component and/or vertical component and/or fractional horizontal component and/or fractional vertical component) of a virtual sample located at a different position compared to all samples in this group.
i. In one example, the virtual sample is not in the group, but it locates in the region covering all samples in the group.
1) Alternatively, the virtual sample is located outside the region covering all samples in the group, e.g., next to the bottom-right position of the region.
ii. In one example, the MV of a virtual sample is derived in the same way as a real sample but with different positions.
iii. “V” in FIGS. 2A-2C shows three examples of virtual samples.
j. The shared MV (and/or shared horizontal component and/or shared vertical component and/or shared fractional horizontal component and/or shared fractional vertical component) may be set to be a function of the MVs (and/or horizontal components and/or vertical components and/or fractional horizontal components and/or fractional vertical components) of multiple samples and/or virtual samples.
i. For example, the shared MV (and/or shared horizontal component and/or shared vertical component and/or shared fractional horizontal component and/or shared fractional vertical component) may be set to be the average of the MVs (and/or horizontal components and/or vertical components and/or fractional horizontal components and/or fractional vertical components) of all or some of the samples in the group, or of samples "E" , "F" , "G" , "H" in FIG. 2A, or of samples "E" , "H" in FIG. 2A, or of samples "A" , "B" , "C" , "D" in FIG. 2A, or of samples "A" , "D" in FIG. 2A, or of samples "B" , "C" in FIG. 2B, or of samples "A" , "D" in FIG. 2B, or of samples "B" , "C" in FIG. 2C, or of samples "A" , "D" in FIG. 2C.
2. It is proposed that only integer MVs are allowed to perform the motion compensation process to derive the prediction block of a current block when the resolution of the reference picture is different to the current picture.
a. In one example, the decoded motion vectors for samples to be predicted are rounded to integer MVs before being used.
3. The motion vectors used in the motion compensation process for samples in a current block (e.g., shared MV/shared horizontal or vertical or fractional component/MV’ mentioned in above bullets) may be stored in the decoded picture buffer and utilized for motion vector prediction of succeeding blocks in current/different pictures.
a. Alternatively, the motion vectors used in the motion compensation process for samples in a current block (e.g., shared MV/shared horizontal or vertical or fractional component/MV’ mentioned in above bullets) may be disallowed to be utilized for motion vector prediction of succeeding blocks in current/different pictures.
i. In one example, the decoded motion vectors (e.g., MV b in above bullets) may be utilized for motion vector prediction of succeeding blocks in current/different pictures.
b. In one example, the motion vectors used in the motion compensation process for samples in a current block may be utilized in the filtering process (e.g., deblocking filter/SAO/ALF) .
i. Alternatively, the decoded motion vectors (e.g., MV b in above bullets) may be utilized in the filtering process.
4. It is proposed that the interpolation filters used in the motion compensation process to derive the prediction block of a current block may be selected depending on whether the resolution of the reference picture is different to the current picture.
a. In one example, the interpolation filters have less taps when the resolution of the reference picture is different to the current picture.
i. In one example, bi-linear filters are applied when the resolution of the reference picture is different to the current picture.
ii. In one example, 4-tap filters or 6-tap filters are applied when the resolution of the reference picture is different to the current picture.
5. It is proposed that a two-stage process for prediction block generation is applied when the resolution of the reference picture is different to the current picture.
a. In the first stage, a virtual reference block is generated by up-sampling or down-sampling a region in the reference picture depending on width and/or height of the current picture and the reference picture.
b. In the second stage, the prediction samples are generated from the virtual reference block by applying interpolation filtering, independent of width and/or height of the current picture and the reference picture.
6. It is proposed that the calculation of top-left coordinate of the bounding block for reference sample padding (xSbInt L, ySbInt L) as defined in 8.5.6.3.1 in JVET-O2001-v14 may be derived depending on width and/or height of the current picture and the reference picture.
a. In one example, the luma locations in full-sample units are modified as:
xInt i = Clip3 (xSbInt L -Dx, xSbInt L + sbWidth + Ux, xInt i) ,
yInt i = Clip3 (ySbInt L -Dy, ySbInt L + sbHeight + Uy, yInt i) ,
where Dx and/or Dy and/or Ux and/or Uy may depend on width and/or height of the current picture and the reference picture.
b. In one example, the chroma locations in full-sample units are modified as:
xInti = Clip3 (xSbInt C -Dx, xSbInt C + sbWidth + Ux, xInti)
yInti = Clip3 (ySbInt C -Dy, ySbInt C + sbHeight + Uy, yInti)
where Dx and/or Dy and/or Ux and/or Uy may depend on width and/or height of the current picture and the reference picture.
7. It is proposed that whether to and/or how to clip MV according to the bounding block for reference sample padding (e.g., the (xSbInt L, ySbInt L) as defined in 8.5.6.3.1 in JVET-O2001-v14) may depend on the usage of DMVR.
a. In one example, MV is clipped according to the bounding block for reference sample padding (e.g., (xSbInt L, ySbInt L) as defined in 8.5.6.3.1) only when DMVR is applied.
i. For example, operations 8-775 and 8-776 in the luma sample interpolation filtering process as defined in JVET-O2001-v14 are applied only if DMVR is used for the current block.
ii. For example, operations 8-789 and 8-790 in the chroma sample interpolation filtering process as defined in JVET-O2001-v14 are applied only if DMVR is used for the current block.
b. Alternatively, furthermore, the above methods may be also applicable to the clipping of chroma samples.
8. It is proposed that whether to and/or how to clip MV according to the bounding block for reference sample padding (e.g., (xSbInt L, ySbInt L) as defined in 8.5.6.3.1 in JVET-O2001-v14) may depend on whether picture wrapping is used (e.g. whether sps_ref_wraparound_enabled_flag is equal to 0 or 1) .
a. In one example, MV is clipped according to the bounding block for reference sample padding (e.g., (xSbInt L, ySbInt L) as defined in 8.5.6.3.1) only if picture wrapping is not used.
i. For example, operations 8-775 and 8-776 in the luma sample interpolation filtering process as defined in JVET-O2001-v14 are applied only if picture wrapping is not used.
ii. For example, operations 8-789 and 8-790 in the chroma sample interpolation filtering process as defined in JVET-O2001-v14 are applied only if picture wrapping is not used.
b. Alternatively, furthermore, the above methods may be also applicable to the clipping of chroma samples.
c. In one example, the luma locations in full-sample units are modified as:
xInt i = Clip3 (xSbInt L -Dx, xSbInt L + sbWidth + Ux, xInt i) ,
yInt i = Clip3 (ySbInt L -Dy, ySbInt L + sbHeight + Uy, yInt i) ,
where Dx and/or Dy and/or Ux and/or Uy may depend on whether picture wrapping is used.
d. In one example, the chroma locations in full-sample units are modified as:
xInti = Clip3 (xSbInt C -Dx, xSbInt C + sbWidth + Ux, xInti)
yInti = Clip3 (ySbInt C -Dy, ySbInt C + sbHeight + Uy, yInti)
where Dx and/or Dy and/or Ux and/or Uy may depend on whether picture wrapping is used.
9. Whether to/how to apply filtering process (e.g., deblocking filter) may depend on whether the reference pictures are with different resolutions.
a. In one example, the boundary strength settings in the deblocking filters may take the resolution differences into consideration in addition to motion vector differences.
b. In one example, the boundary strength settings in the deblocking filters may use motion vector differences scaled based on resolution differences.
c. In one example, the strength of the deblocking filter is increased if the resolution of at least one reference picture of block A is different from (or smaller than or larger than) the resolution of at least one reference picture of block B.
d. In one example, the strength of the deblocking filter is decreased if the resolution of at least one reference picture of block A is different from (or smaller than or larger than) the resolution of at least one reference picture of block B.
e. In one example, the strength of the deblocking filter is increased if the resolution of at least one reference picture of block A and/or block B is different from (or smaller than or larger than) the resolution of the current block.
f. In one example, the strength of the deblocking filter is decreased if the resolution of at least one reference picture of block A and/or block B is different from (or smaller than or larger than) the resolution of the current block.
10. Instead of storing/using motion vectors for a block as if the reference picture had the same resolution as the current picture, it is proposed to use the real motion vectors, with the resolution difference taken into consideration.
a. Alternatively, furthermore, when using the motion vector to generate the prediction block, there is no need to further change the motion vector according to the resolutions of the current picture and the reference picture (e.g., (refx L, refy L) derived in 8.5.6.3.1 in JVET-O2001-v14) .
11. In one example, when a sub-picture exists, the reference picture must have the same resolution as the current picture.
a. Alternatively, when a reference picture has a different resolution to the current picture, there must be no sub-picture in the current picture.
12. In one example, sub-pictures may be defined separately for pictures with different resolutions.
13. In one example, the corresponding sub-picture in the reference picture can be derived by scaling and/or offsetting a sub-picture of the current picture, if the reference picture has a different resolution to the current picture.
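As an illustration of bullets 1. g and 2 above (referenced there), the modification of a derived motion vector component can be sketched in C. This is a hedged sketch only: the macro values assume 1/16-sample MV precision (M=4) and arithmetic right shifts on negative values, and all names are ours.

#define MV_PREC   4                       /* M = 4 -> 1/16-sample precision */
#define FRAC_MASK ((1 << MV_PREC) - 1)

/* Bullet 1.g.i: keep the (floor) integer part of the derived MVb and
 * overwrite its fractional part with the group's shared phase. */
static int share_fractional(int mvb, int shared_frac)
{
    return (mvb & ~FRAC_MASK) | (shared_frac & FRAC_MASK);
}

/* Bullet 2.a: round the decoded MV component to the nearest integer MV
 * before it is used for motion compensation. */
static int round_to_integer_mv(int mvb)
{
    return ((mvb + (1 << (MV_PREC - 1))) >> MV_PREC) << MV_PREC;
}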
FIG. 3 is a block diagram of a video processing apparatus 300. The apparatus 300 may be used to implement one or more of the methods described herein. The apparatus 300 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 300 may include one or more processors 302, one or more memories 304 and video processing hardware 306. The processor (s) 302 may be configured to implement one or more methods described in the present document. The memory (memories) 304 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 306 may be used to implement, in hardware circuitry, some techniques described in the present document. In some embodiments, the hardware 306 may be at least partly within the processor 302, e.g., a graphics co-processor.
The following solutions may be implemented as preferred solutions in some embodiments.
The following solutions may be implemented together with additional techniques described in items listed in the previous section (e.g., item 1) .
1. A method of video processing (e.g., method 400 depicted in FIG. 4) , comprising determining (402) , for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing (404) the conversion based on the determining such that predicted values of a group of samples of the current block are generated using a same horizontal or vertical interpolation filter.
2. The method of solution 1, wherein the group of samples corresponds to all samples of  the current block.
3. The method of solution 1, wherein the group of samples corresponds to some samples of the current block.
4. The method of solution 3, wherein the group of samples corresponds to all samples of a region in the current block.
5. The method of any of solutions 1-4, wherein the group of samples is selected to have a same motion vector (MV) used during the conversion.
6. The method of any of solutions 1-4, wherein the group of samples have a same horizontal motion vector component.
7. The method of any of solutions 1-4, wherein the group of samples have a same vertical motion vector component.
8. The method of any of solutions 1-4, wherein the group of samples have a same fractional horizontal motion vector component part.
9. The method of any of solutions 1-4, wherein the group of samples have a same fractional vertical motion vector component part.
10. The method of any of solutions 1-9, wherein, during the conversion, a motion vector for a specific sample is derived by modifying a value of motion vector derived based on the resolution of the current picture and the resolution of the reference picture by a modification step including truncating, clipping or rounding.
11. The method of any of solutions 1-7, wherein, during the conversion, a motion vector for a specific sample is set to a value of a shared motion vector that is shared by all samples in the group of samples.
12. The method of any of solutions 1-9, wherein the group of samples share a shared motion vector during the conversion, and wherein the shared motion vector is derived from motion vectors of one or more samples in the group of samples.
13. The method of solution 12, wherein the shared motion vector is further derived from a virtual sample.
The following solutions may be implemented together with additional techniques described in items listed in the previous section (e.g., item 5) .
14. A method of video processing, comprising: determining, for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing the conversion based on the determining such that predicted values of a group of samples of the current block are generated as an interpolated version of a virtual reference block that is generated by sample rate changing a region in the reference picture, wherein the sample rate changing depends on a height or a width of the current picture or the reference picture.
15. The method of solution 14, wherein the interpolated version is generated using an interpolation filter whose coefficients do not depend on the height or the width of the current picture or the reference picture.
The following solutions may be implemented together with additional techniques described in items listed in the previous section (e.g., item 6) .
16. A method of video processing, comprising: determining, for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and based on the determining, deriving a top-left coordinate of a bounding block for reference sample padding based on a scheme that is dependent on a height or a width of the current picture or the reference picture, and performing the conversion using the derived top-left coordinate of the bounding block.
17. The method of solution 16, wherein the scheme comprises calculating luma samples located at integer sample locations as:
xInt i = Clip3 (xSbInt L-Dx, xSbInt L + sbWidth + Ux, xInt i) ,
yInt i = Clip3 (ySbInt L-Dy, ySbInt L + sbHeight + Uy, yInt i) ,
where Dx and/or Dy and/or Ux and/or Uy depend on the width and/or the height of the current picture or the reference picture, and wherein (xSbInt L, ySbInt L) is the top left coordinate.
18. The method of solution 16, wherein the scheme comprises calculating chroma samples located at integer sample locations as:
xInti = Clip3 (xSbInt C-Dx, xSbInt C + sbWidth + Ux, xInti)
yInti = Clip3 (ySbInt C-Dy, ySbInt C + sbHeight + Uy, yInti)
where Dx and/or Dy and/or Ux and/or Uy depend on the width and/or the height of the current picture or the reference picture, and wherein (xSbInt C, ySbInt C) is the top left coordinate.
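A hedged C sketch of the clipping in solutions 17 and 18 (and, with different parameter choices, solution 25 below) is given here; Dx, Dy, Ux and Uy are assumed to be supplied according to the picture widths and heights or the picture-wrapping flag, and the function name is ours.

static int Clip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }

/* Clip an integer sample location (xInt, yInt) to the bounding block for
 * reference sample padding anchored at (xSbInt, ySbInt). */
static void clip_to_bounding_block(int xSbInt, int ySbInt,
                                   int sbWidth, int sbHeight,
                                   int Dx, int Dy, int Ux, int Uy,
                                   int *xInt, int *yInt)
{
    *xInt = Clip3(xSbInt - Dx, xSbInt + sbWidth  + Ux, *xInt);
    *yInt = Clip3(ySbInt - Dy, ySbInt + sbHeight + Uy, *yInt);
}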
The following solutions may be implemented together with additional techniques described in items listed in the previous section (e.g., item 7) .
19. A method of video processing, comprising: determining, for a conversion between a current block in a current picture of a video and a coded representation of the video, a clipping operation applied to motion vector calculation according to a bounding block for reference sample padding, based on use of a decoder side motion vector refinement (DMVR) during the conversion of the current block; and performing the conversion based on the clipping operation.
20. The method of solution 19, wherein the determining enables a legacy clipping operation due to the DMVR being used for the current block.
21. The method of any of solutions 19-20, wherein the current block is a chroma block.
The following solutions may be implemented together with additional techniques described in items listed in the previous section (e.g., item 8) .
22. A method of video processing, comprising: determining, for a conversion between a current block in a current picture of a video and a coded representation of the video, a clipping operation applied to motion vector calculation according to a bounding block for reference sample padding, based on use of picture wrapping in the conversion; and performing the conversion based on the clipping operation.
23. The method of solution 22, wherein the determining enables a legacy clipping operation only if the picture wrapping is disabled for the current block.
24. The method of any of solutions 22-23, wherein the current block is a chroma block.
25. The method of any of solutions 22-23, wherein the clipping operation is used to calculate luma samples as:
xInt i = Clip3 (xSbInt L -Dx, xSbInt L + sbWidth + Ux, xInt i) ,
yInt i = Clip3 (ySbInt L -Dy, ySbInt L + sbHeight + Uy, yInt i) ,
where Dx and/or Dy and/or Ux and/or Uy depend on the use of picture wrapping, and wherein (xSbInt L, ySbInt L) represents the bounding block.
26. The method of any of solutions 1 to 25, wherein the conversion comprises encoding the video into the coded representation.
27. The method of any of solutions 1 to 25, wherein the conversion comprises decoding the coded representation to generate pixel values of the video.
28. A video decoding apparatus comprising a processor configured to implement a method recited in one or more of solutions 1 to 27.
29. A video encoding apparatus comprising a processor configured to implement a method recited in one or more of solutions 1 to 27.
30. A computer program product having computer code stored thereon, the code, when executed by a processor, causes the processor to implement a method recited in any of solutions 1 to 27.
31. A method, apparatus or system described in the present document.
The following examples show features implemented by some preferred embodiments based on the disclosed technology.
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., item 4) . In these examples, the interpolation filters are used for filtering sample values to produce sample values at (fractional) resample locations.
1. A method of video processing (e.g., method 500 shown in FIG. 5A) , comprising: selecting (502) an interpolation filter used for determining a prediction block for a current block of a current picture of video by motion compensation from a reference picture based on a rule, and performing (504) a conversion between the current block of the video and a coded representation of the video based on the prediction block, wherein the rule specifies that the interpolation filter is a first interpolation filter in case that a resolution of the current picture and a resolution of the reference picture are different and the interpolation filter is a second interpolation filter in case that the resolution of the current picture and the resolution of the reference picture are same, wherein the first interpolation filter is different from the second interpolation filter.
2. The method of example 1, wherein the first interpolation filter has less taps than the second interpolation filter.
3. The method of any of examples 1-2, wherein the first interpolation filter is a bi-linear filter.
4. The method of any of examples 1-3, wherein the first interpolation filter is a 4-tap filter.
5. The method of any of examples 1-3, wherein the first interpolation filter is a 6-tap filter.
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., items 11, 12, 13) .
6. A method of video processing (e.g., method 570 shown in FIG. 5H) , comprising: performing (572) a conversion between a video comprising a current video picture and a coded representation of the video according to a rule, wherein a reference picture is included in at least one of the reference picture lists of the current video picture, wherein the current video picture has a first resolution; wherein the reference picture has a second resolution; wherein the rule specifies whether and/or how the current video picture and/or the reference picture are permitted to have sub-pictures depending on at least one of the first resolution or the second resolution.
7. The method of example 6, wherein the rule specifies that for the reference picture to include a sub-picture, the first resolution is identical to the second resolution. For example, the rule may specify that the same resolution may be both a necessary and sufficient condition.
8. The method of example 6, wherein the rule specifies that for the current picture to include a sub-picture, the first resolution is identical to the second resolution.
9. The method of example 6, wherein the rule specifies that if the first resolution is different from the second resolution, then the reference picture cannot be partitioned into sub-pictures.
10. The method of example 6, wherein the rule specifies that if the first resolution is different from the second resolution, then the current picture cannot be partitioned into sub-pictures.
11. The method of any of examples 6-10, wherein the rule specifies a partitioning of pictures into sub-pictures according to picture resolutions.
12. The method of example 6, wherein the rule specifies to use a scaling or offsetting for deriving a sub-picture in the reference picture from a sub-picture in the current picture in case that the first resolution is different than the second resolution.
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., item 1) .
13. A method of video processing (e.g., method 510 shown in FIG. 5B) , comprising: determining (512) , for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and performing (514) the conversion based on the determining such that the same horizontal interpolation filter or vertical interpolation filter is used for generating predicted values of samples of a group of samples of the current block.
14. The method of example 13, wherein the group of samples corresponds to all samples of the current block.
15. The method of example 13, wherein the group of samples corresponds to less than all samples of the current block.
16. The method of example 15, wherein the group of samples corresponds to all samples of a region in the current block.
17. The method of any of examples 13-16, wherein the current block is divided into MxN non-overlapping rectangles of samples, where M and N are integers, and wherein the group of samples corresponds to an MxN rectangle.
18. The method of any of examples 13-16, wherein the group of samples corresponds to N samples of a row of samples of the current block, where N is an integer.
19. The method of example 18, wherein N is 4 or 8 or equal to a width of the current block.
20. The method of any of examples 13-16, wherein the group of samples corresponds to N samples of a column of samples of the current block, where N is an integer.
21. The method of example 20, wherein N is 4 or 8 or equal to a height of the current block.
22. The method of any of examples 17-21, wherein values M and N are constant during the conversion.
23. The method of any of examples 13-22, wherein values M and N are dependent on a dimension of the current block or a coded information of the current block or correspond to a syntax element included in the coded representation.
24. The method of example 13, wherein the group of samples is selected from samples having a same motion vector.
25. The method of example 13, wherein the group of samples includes samples having a same horizontal component of a motion vector.
26. The method of example 13, wherein the group of samples includes samples having a same vertical component of a motion vector.
27. The method of example 13, wherein the group of samples includes samples that have a same fractional part of a horizontal component of a motion vector.
28. The method of example 27, wherein the fractional part of the horizontal component is represented using M least significant bits, where M is an integer.
29. The method of example 13, wherein the group of samples includes samples that have a same fractional part of a vertical component of a motion vector.
30. The method of example 29, wherein the fractional part of the vertical component is represented using M least significant bits, where M is an integer.
31. The method of any of examples 25-30, wherein the motion vector used for generating the predicted values of samples in the group of samples corresponds to a final motion vector, denoted MV', that is obtained by first deriving a motion vector, denoted MVb, according to the resolutions of the current picture and the reference picture, and then modifying MVb according to a motion vector characteristic shared by the group of samples.
32. The method of example 31, wherein an integer part of MV' is same as that of MVb, and wherein a fractional part of MV' is same for all samples in the group of samples.
33. The method of example 31, wherein MV' is selected to be a motion vector having a closest match to MVb and having a fractional part equal to a shared horizontal or a shared vertical component among samples of the group of samples.
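One possible reading of examples 31-32 is sketched below under the assumption of 1/16-pel motion vector precision (the masking convention is also an assumption): MV' keeps the integer part of MVb and substitutes the fractional part shared by the group.

```cpp
struct Mv { int hor, ver; };            // assumed 1/16-pel units

constexpr int kFracBits = 4;            // 1/16 pel
constexpr int kFracMask = (1 << kFracBits) - 1;

// Example 32: MV' takes the integer part of MVb and the fractional part
// shared by all samples in the group. For negative components this masking
// floors toward minus infinity, which is one reasonable convention.
Mv deriveSharedMv(Mv mvb, int sharedFracHor, int sharedFracVer) {
  return { (mvb.hor & ~kFracMask) | (sharedFracHor & kFracMask),
           (mvb.ver & ~kFracMask) | (sharedFracVer & kFracMask) };
}
```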
34. The method of example 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information corresponds to a motion vector value of a particular sample of the group of samples.
35. The method of example 34, wherein the group of samples has a rectangular shape, and wherein the particular sample is a corner sample of the rectangular shape.
36. The method of example 34, wherein the group of samples has a rectangular shape, and wherein the particular sample is a center sample of the rectangular shape.
37. The method of example 34, wherein the group of samples has a linear shape, and wherein the particular sample is an end sample of the linear shape.
38. The method of example 34, wherein the group of samples has a linear shape, and wherein the particular sample is a center sample of the linear shape.
39. The method of example 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information corresponds to a motion vector determined according to the resolution of the current picture and the resolution of the reference picture.
40. The method of example 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information corresponds to a motion vector value of a particular sample that is not in the group of samples.
41. The method of example 40, wherein the particular sample is in a region that encloses all samples in the group of samples.
42. The method of example 40, wherein the particular sample is in a region that is non-overlapping to all samples in the group of samples.
43. The method of example 42, wherein the particular sample is near a bottom-right of the group of samples.
44. The method of any of examples 40-43, wherein a motion vector for the particular sample is derived for the conversion.
45. The method of any of examples 40-44, wherein the particular sample is at a fractional position between a row of samples and/or a column of samples in the group of samples.
46. The method of example 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information is a function of motion vector information of one or more samples in the group of samples or one or more virtual samples related to the group of samples.
47. The method of example 46, wherein the shared motion vector information is an average of the motion vector information of the one or more samples in the group of samples or the one or more virtual samples related to the group of samples.
48. The method of any of examples 46-47, wherein the one or more virtual samples are at fractional positions with respect to the group of samples.
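The averaging variant of example 47 could be sketched as follows; the rounding convention and the assumption of a non-empty group are illustrative choices.

```cpp
#include <vector>

struct Mv { int hor, ver; };            // assumed 1/16-pel units

// Example 47: the shared motion vector information is the component-wise
// (rounded) average over the samples, or virtual samples, of the group.
// Assumes mvs is non-empty.
Mv averageMv(const std::vector<Mv>& mvs) {
  long long sumH = 0, sumV = 0;
  for (const Mv& mv : mvs) { sumH += mv.hor; sumV += mv.ver; }
  const long long n = static_cast<long long>(mvs.size());
  return { static_cast<int>((sumH + n / 2) / n),
           static_cast<int>((sumV + n / 2) / n) };
}
```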
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., item 2) .
49. A method of video processing (e.g., method 520 shown in FIG. 5C) , comprising: making a determination (522) , due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a constraint on a motion vector used for deriving a prediction block for the current block; and performing (524) a conversion between the video and a coded representation of the video based on the determination, wherein the constraint specifies that the motion vector is an integer motion vector.
50. The method of example 49, wherein the constraint is enforced during motion compensation by rounding decoded motion vectors from the coded representation to integer values.
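A sketch of the integer-motion-vector constraint of examples 49-50; the 1/16-pel precision and the round-to-nearest behavior are assumptions.

```cpp
// Round a decoded motion vector component to the nearest integer-pel
// position before motion compensation (examples 49-50). The component is
// assumed to be in 1/16-pel units; the result stays in 1/16-pel units.
int roundToIntegerPel(int mvComp) {
  const int step = 16;                          // 1/16-pel steps per pel
  const int offset = (mvComp >= 0) ? step / 2 : -(step / 2);
  return ((mvComp + offset) / step) * step;
}
```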
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., item 3) .
51. A method of video processing (e.g., method 530 shown in FIG. 5D) , comprising: performing (532) a conversion between a video comprising a current block and a coded representation of the video according to a rule, wherein the rule specifies whether a motion vector information used for determining a prediction block of the current block by motion compensation is made available for motion vector prediction of succeeding blocks in a current picture comprising the current block or another picture.
52. The method of example 51, wherein the motion vector information comprises a shared motion vector information recited in any of the above examples.
53. The method of any of examples 51-52, wherein the motion vector information comprises motion vector information that is coded in the coded representation.
54. The method of any of examples 51-53, wherein the motion vector information is used for motion vector prediction of succeeding blocks in the current picture or the another picture.
55. The method of any of examples 51-54, wherein the motion vector information is used for a filtering process in the conversion.
56. The method of example 55, wherein the filtering process comprises a deblocking filtering or a sample adaptive offset filtering or an adaptive loop filtering.
57. The method of example 51, wherein the rule disallows use of the motion vector information for motion compensation or filtering of a succeeding video block in the current picture or the another picture.
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., item 5) .
58. A method of video processing (e.g., method 540 shown in FIG. 5E) , comprising: making a determination (542) , due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a two-step process to generate a prediction block for the current block; and performing (544) a conversion between the video and a coded representation of the video based on the determination, wherein the two-step process comprises a first step of resampling a region of the reference picture to generate a virtual reference block and a second step of generating the prediction block using an interpolation filter on the virtual reference block.
59. The method of example 58, wherein the resampling includes upsampling or downsampling, depending on widths and/or heights of the current picture and the reference picture.
60. The method of any of examples 58-59, wherein the interpolation filter is independent of widths and/or heights of the current picture and the reference picture.
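The first step of examples 58-60 might look as follows. Nearest-neighbor mapping is used only to keep the sketch short (a real resampler would apply a proper filter), and all names are hypothetical; the second step then applies the usual, resolution-independent interpolation filter to the virtual reference block (example 60).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Plane = std::vector<int16_t>;     // row-major luma samples (hypothetical)

// Step 1: resample the needed reference-picture region onto the current
// picture's sampling grid, producing a "virtual reference block"
// (example 58). Whether this is an upsampling or a downsampling follows
// from the width/height ratio (example 59).
Plane makeVirtualReferenceBlock(const Plane& ref, int refW, int refH,
                                int x0, int y0, int w, int h,
                                int curW, int curH) {
  Plane out(static_cast<size_t>(w) * h);
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
      const int rx = std::clamp((x0 + x) * refW / curW, 0, refW - 1);
      const int ry = std::clamp((y0 + y) * refH / curH, 0, refH - 1);
      out[static_cast<size_t>(y) * w + x] =
          ref[static_cast<size_t>(ry) * refW + rx];
    }
  return out;
}
```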
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., item 9) .
61. A method of video processing (e.g., method 550 shown in FIG. 5F) , comprising: making a first determination (552) that there is a difference between resolutions of one or more reference pictures used for generating a prediction block of a current block of a current picture of a video and a resolution of the current picture; using a rule to make a second determination (554) , based on the difference, about whether or how to apply a filtering process for a conversion between the video and a coded representation of the video; and performing (556) the conversion according to the second determination.
62. The method of example 61, wherein the rule specifies that a boundary strength of a filter used in the filtering process is a function of the difference.
63. The method of example 62, wherein the rule specifies that the boundary strength is scaled according to the difference.
64. The method of example 62, wherein the rule specifies that the boundary strength is increased from normal, in case that a first reference picture of a first neighboring block of the current block and a second reference picture of a second neighboring block of the current block have different resolutions.
65. The method of example 62, wherein the rule specifies that the boundary strength is decreased from normal, in case that a first reference picture of a first neighboring block of the current block and a second reference picture of a second neighboring block of the current block have different resolutions.
66. The method of example 62, wherein the rule specifies that the boundary strength is increased from normal, in case that a resolution of a first reference picture of a first neighboring block and/or a resolution of a second reference picture of a second neighboring block of the current block are different.
67. The method of example 62, wherein the rule specifies that the boundary strength is decreased from normal, in case that a resolution of a first reference picture of a first neighboring block and/or a resolution of a second reference picture of a second neighboring block of the current block are different.
68. The method of any of examples 62-67, wherein the resolution of the first reference picture or the resolution of the second reference picture is greater than the resolution of the reference picture of the current block.
69. The method of any of examples 62-67, wherein the resolution of the first reference picture or the resolution of the second reference picture is less than the resolution of the reference picture of the current block.
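One reading of examples 62-65 is sketched below; the boundary-strength range [0, 2] matches common deblocking designs, but the unit adjustment step is an assumption.

```cpp
#include <algorithm>

// Adjust the deblocking boundary strength at an edge between two neighboring
// blocks when their reference pictures have different resolutions: increase
// it (example 64) or decrease it (example 65), clipped to the usual range.
int adjustBoundaryStrength(int bs,
                           int refW0, int refH0,   // neighbor 0's reference
                           int refW1, int refH1,   // neighbor 1's reference
                           bool increase) {
  const bool differentResolution = (refW0 != refW1) || (refH0 != refH1);
  if (!differentResolution) return bs;
  return std::clamp(bs + (increase ? 1 : -1), 0, 2);
}
```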
The following examples may be preferably implemented together with additional techniques described in items listed in the previous section (e.g., item 10) .
70. A method of video processing (e.g., method 560 shown in FIG. 5G) , comprising: performing (562) a conversion between a video comprising multiple video blocks of a video picture and a coded representation of the video according to a rule, wherein the multiple video blocks are processed in an order, wherein the rule specifies that motion vector information used for determining a prediction block of a first video block is stored and used during processing of a succeeding video block of the multiple video blocks according to a resolution of a reference picture used by the motion vector information.
71. The method of example 70, wherein the rule specifies that the motion vector information is used for processing the succeeding video block by adjusting according to a resolution difference between the resolution of the reference picture and a resolution of the current picture.
72. The method of example 70, wherein the rule specifies that the motion vector information is used for processing the succeeding video block without adjusting according to a resolution difference between the resolution of the reference picture and a resolution of the current picture.
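Example 71's adjusted reuse of a stored motion vector could be sketched as below; the plain integer ratio is an assumption, since a real codec would typically use fixed-point scaling factors.

```cpp
struct Mv { int hor, ver; };            // assumed 1/16-pel units

// Example 71: scale a stored motion vector by the ratio between the current
// picture's resolution and that of the reference picture the vector points
// into, before a succeeding block reuses it. Example 72 instead reuses the
// vector as stored, without adjustment.
Mv scaleStoredMv(Mv mv, int refW, int refH, int curW, int curH) {
  return { mv.hor * curW / refW, mv.ver * curH / refH };
}
```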
73. The method of any of examples 1 to 72, wherein the conversion comprises decoding the coded representation to generate pixel values of the video.
74. The method of any of examples 1 to 72, wherein the conversion comprises encoding pixel values of the video into the coded representation.
75. A video decoding apparatus comprising a processor configured to implement a method recited in one or more of examples 1 to 74.
76. A video encoding apparatus comprising a processor configured to implement a method recited in one or more of examples 1 to 74.
77. A computer program product having computer code stored thereon, wherein the code, when executed by a processor, causes the processor to implement a method recited in any of examples 1 to 74.
78. A non-transitory computer-readable storage medium storing instructions that cause a processor to implement a method recited in any of examples 1 to 74.
79. A non-transitory computer-readable recording medium storing a bitstream corresponding to the coded representation that is generated by a method recited in any of examples 1 to 74.
80. A method, apparatus or system described in the present document.
In the above solutions, performing the conversion includes using the results of a previous decision step during the encoding or decoding operation to arrive at the conversion results.
In the present document, the term “video processing” may refer to video encoding, video decoding, video compression or video decompression. For example, video compression algorithms may be applied during conversion from a pixel representation of a video to a corresponding bitstream representation or vice versa. The bitstream representation, or coded representation, of a current video block may, for example, correspond to bits that are either co-located or spread in different places within the bitstream, as is defined by the syntax. For example, a video block may be encoded in terms of transformed and coded error residual values and also using bits in headers and other fields in the bitstream. Furthermore, during conversion, a decoder may parse a bitstream with the knowledge that some fields may be present, or absent, based on the determination, as is described in the above solutions. Similarly, an encoder may determine that certain syntax fields are or are not to be included and generate the coded representation accordingly by including or excluding the syntax fields from the coded representation.
Some embodiments of the disclosed technology, e.g., the above described solutions and examples, include making a decision or determination to enable a video processing tool or mode. In an example, when the video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of a block of video, but may not necessarily modify the resulting bitstream based on the usage of the tool or mode. That is, a conversion from the block of video to the bitstream representation of the video will use the video processing tool or mode when it is enabled based on the decision or determination. In another example, when the video processing tool or mode is enabled, the decoder will process the bitstream with the knowledge that the bitstream has been modified based on the video processing tool or mode. That is, a conversion from the bitstream representation of the video to the block of video will be performed using the video processing tool or mode that was enabled based on the decision or determination.
Some embodiments of the disclosed technology include making a decision or determination to disable a video processing tool or mode. In an example, when the video processing tool or mode is disabled, the encoder will not use the tool or mode in the conversion of the block of video to the bitstream representation of the video. In another example, when the video processing tool or mode is disabled, the decoder will process the bitstream with the knowledge that the bitstream has not been modified using the video processing tool or mode that was disabled based on the decision or determination.
The disclosed and other solutions, examples, embodiments, modules and the functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) . A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any subject matter or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular techniques. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims (80)

  1. A method of video processing, comprising:
    selecting an interpolation filter used for determining a prediction block for a current block of a current picture of a video by motion compensation from a reference picture based on a rule, and
    performing a conversion between the current block of the video and a coded representation of the video based on the prediction block,
    wherein the rule specifies that the interpolation filter is a first interpolation filter in case that a resolution of the current picture and a resolution of the reference picture are different and the interpolation filter is a second interpolation filter in case that the resolution of the current picture and the resolution of the reference picture are same, wherein the first interpolation filter is different from the second interpolation filter.
  2. The method of claim 1, wherein the first interpolation filter has fewer taps than the second interpolation filter.
  3. The method of any of claims 1-2, wherein the first interpolation filter is a bi-linear filter.
  4. The method of any of claims 1-3, wherein the first interpolation filter is a 4-tap filter.
  5. The method of any of claims 1-3, wherein the first interpolation filter is a 6-tap filter.
  6. A method of video processing, comprising:
    performing a conversion between a video comprising a current video picture and a coded representation of the video according to a rule,
    wherein a reference picture is included in at least one of the reference picture lists of the current video picture,
    wherein the current video picture has a first resolution;
    wherein the reference picture has a second resolution;
    wherein the rule specifies whether and/or how the current video picture and/or the reference picture are permitted to have sub-pictures depending on at least one of the first resolution or the second resolution.
  7. The method of claim 6, wherein the rule specifies that for the reference picture to include a sub-picture, the first resolution is identical to the second resolution.
  8. The method of claim 6, wherein the rule specifies that for the current picture to include a sub-picture, the first resolution is identical to the second resolution.
  9. The method of claim 6, wherein the rule specifies that if the first resolution is different than the second resolution, then the reference picture cannot be partitioned into sub-pictures.
  10. The method of claim 6, wherein the rule specifies that if the first resolution is different than the second resolution, then the current picture cannot be partitioned into sub-pictures.
  11. The method of any of claims 6-10, wherein the rule specifies a partitioning of pictures into sub-pictures according to picture resolutions.
  12. The method of claim 6, wherein the rule specifies to use a scaling or offsetting for deriving a sub-picture in the reference picture from a sub-picture in the current picture in case that the first resolution is different than the second resolution.
  13. A method of video processing, comprising:
    determining, for a conversion between a current block of a video and a coded representation of the video, that a resolution of a current picture containing the current block and a resolution of a reference picture used for the conversion are different, and
    performing the conversion based on the determining such that the same horizontal interpolation filter or vertical interpolation filter is used for generating predicted values of samples of a group of samples of the current block.
  14. The method of claim 13, wherein the group of samples corresponds to all samples of the current block.
  15. The method of claim 13, wherein the group of samples corresponds to less than all samples of the current block.
  16. The method of claim 15, wherein the group of samples corresponds to all samples of a region in the current block.
  17. The method of any of claims 13-16, wherein the current block is divided into MxN non-overlapping rectangles of samples, where M and N are integers, and wherein the group of samples corresponds to an MxN rectangle.
  18. The method of any of claims 13-16, wherein the group of samples corresponds to N samples of a row of samples of the current block, where N is an integer.
  19. The method of claim 18, wherein N is 4 or 8 or equal to a width of the current block.
  20. The method of any of claims 13-16, wherein the group of samples corresponds to N samples of a column of samples of the current block, where N is an integer.
  21. The method of claim 20, wherein N is 4 or 8 or equal to a height of the current block.
  22. The method of any of claims 17-21, wherein values M and N are constant during the conversion.
  23. The method of any of claims 13-22, wherein values M and N are dependent on a dimension of the current block or coded information of the current block or correspond to a syntax element included in the coded representation.
  24. The method of claim 13, wherein the group of samples is selected from samples having a same motion vector.
  25. The method of claim 13, wherein the group of samples includes samples having a same horizontal component of a motion vector.
  26. The method of claim 13, wherein the group of samples includes samples having a same vertical component of a motion vector.
  27. The method of claim 13, wherein the group of samples includes samples that have a same fractional part of a horizontal component of a motion vector.
  28. The method of claim 27, wherein the fractional part of the horizontal component is represented using M least significant bits, where M is an integer.
  29. The method of claim 13, wherein the group of samples includes samples that have a same fractional part of a vertical component of a motion vector.
  30. The method of claim 29, wherein the fractional part of the vertical component is represented using M least significant bits, where M is an integer.
  31. The method of any of claims 25-30, wherein the motion vector used for generating the predicted values of samples in the group of samples corresponds to a final motion vector, denoted MV', that is obtained by first deriving a motion vector, denoted MVb, according to the resolutions of the current picture and the reference picture, and then modifying MVb according to a motion vector characteristic shared by the group of samples.
  32. The method of claim 31, wherein an integer part of MV' is same as that of MVb, and wherein a fractional part of MV' is same for all samples in the group of samples.
  33. The method of claim 31, wherein MV' is selected to be a motion vector having a closest match to MVb and having a fractional part equal to a shared horizontal or a shared vertical component among samples of the group of samples.
  34. The method of claim 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information corresponds to a motion vector value of a particular sample of the group of samples.
  35. The method of claim 34, wherein the group of samples has a rectangular shape, and wherein the particular sample is a corner sample of the rectangular shape.
  36. The method of claim 34, wherein the group of samples has a rectangular shape, and wherein the particular sample is a center sample of the rectangular shape.
  37. The method of claim 34, wherein the group of samples has a linear shape, and wherein the particular sample is an end sample of the linear shape.
  38. The method of claim 34, wherein the group of samples has a linear shape, and wherein the particular sample is a center sample of the linear shape.
  39. The method of claim 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information corresponds to a motion vector determined according to the resolution of the current picture and the resolution of the reference picture.
  40. The method of claim 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information corresponds to a motion vector value of a particular sample that is not in the group of samples.
  41. The method of claim 40, wherein the particular sample is in a region that encloses all samples in the group of samples.
  42. The method of claim 40, wherein the particular sample is in a region that is non-overlapping to all samples in the group of samples.
  43. The method of claim 42, wherein the particular sample is near a bottom-right of the group of samples.
  44. The method of any of claims 40-43, wherein a motion vector for the particular sample is derived for the conversion.
  45. The method of any of claims 40-44, wherein the particular sample is at a fractional position between a row of samples and/or a column of samples in the group of samples.
  46. The method of claim 13, wherein the predicted values of samples are generated by using a shared motion vector information for all samples in the group of samples, and wherein the shared motion vector information is a function of motion vector information of one or more samples in the group of samples or one or more virtual samples related to the group of samples.
  47. The method of claim 46, wherein the shared motion vector information is an average of the motion vector information of the one or more samples in the group of samples or the one or more virtual samples related to the group of samples.
  48. The method of any of claims 46-47, wherein the one or more virtual samples are at fractional positions with respect to the group of samples.
  49. A method of video processing, comprising:
    making a determination, due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a constraint on a motion vector used for deriving a prediction block for the current block; and
    performing a conversion between the video and a coded representation of the video based on the determination,
    wherein the constraint specifies that the motion vector is an integer motion vector.
  50. The method of claim 49, wherein the constraint is enforced during motion compensation by rounding decoded motion vectors from the coded representation to integer values.
  51. A method of video processing, comprising:
    performing a conversion between a video comprising a current block and a coded representation of the video according to a rule,
    wherein the rule specifies whether a motion vector information used for determining a prediction block of the current block by motion compensation is made available for motion vector prediction of succeeding blocks in a current picture comprising the current block or another picture.
  52. The method of claim 51, wherein the motion vector information comprises a shared motion vector information recited in any of the above claims.
  53. The method of any of claims 51-52, wherein the motion vector information comprises motion vector information that is coded in the coded representation.
  54. The method of any of claims 51-53, wherein the motion vector information is used for motion vector prediction of succeeding blocks in the current picture or the another picture.
  55. The method of any of claims 51-54, wherein the motion vector information is used for a filtering process in the conversion.
  56. The method of claim 55, wherein the filtering process comprises a deblocking filtering or a sample adaptive offset filtering or an adaptive loop filtering.
  57. The method of claim 51, wherein the rule disallows use of the motion vector information for motion compensation or filtering of a succeeding video block in the current picture or the another picture.
  58. A method of video processing, comprising:
    making a determination, due to a resolution of a current picture of a video comprising a current block being different from a resolution of a reference picture of the current block, to use a two-step process to generate a prediction block for the current block; and
    performing a conversion between the video and a coded representation of the video based on the determination,
    wherein the two-step process comprises a first step of resampling a region of the reference picture to generate a virtual reference block and a second step of generating the prediction block using an interpolation filter on the virtual reference block.
  59. The method of claim 58, wherein the resampling includes upsampling or downsampling, depending on widths and/or heights of the current picture and the reference picture.
  60. The method of any of claims 58-59, wherein the interpolation filter is independent of widths and/or heights of the current picture and the reference picture.
  61. A method of video processing, comprising:
    making a first determination that there is a difference between resolutions of one or more reference pictures used for generating a prediction block of a current block of a current picture of a video and a resolution of the current picture,
    using a rule to make a second determination, based on the difference, about whether or how to apply a filtering process for a conversion between the video and a coded representation of the video; and
    performing the conversion according to the second determination.
  62. The method of claim 61, wherein the rule specifies that a boundary strength of a filter used in the filtering process is a function of the difference.
  63. The method of claim 62, wherein the rule specifies that the boundary strength is scaled according to the difference.
  64. The method of claim 62, wherein the rule specifies that the boundary strength is increased from normal, in case that a first reference picture of a first neighboring block of the current block and a second reference picture of a second neighboring block of the current block have different resolutions.
  65. The method of claim 62, wherein the rule specifies that the boundary strength is decreased from normal, in case that a first reference picture of a first neighboring block of the current block and a second reference picture of a second neighboring block of the current block have different resolutions.
  66. The method of claim 62, wherein the rule specifies that the boundary strength is increased from normal, in case that a resolution of a first reference picture of a first neighboring block and/or a resolution of a second reference picture of a second neighboring block of the current block are different.
  67. The method of claim 62, wherein the rule specifies that the boundary strength is decreased from normal, in case that a resolution of a first reference picture of a first neighboring block and/or a resolution of a second reference picture of a second neighboring block of the current block are different.
  68. The method of any of claims 62-67, wherein the resolution of the first reference picture or the resolution of the second reference picture is greater than the resolution of the reference picture of the current block.
  69. The method of any of claims 62-67, wherein the resolution of the first reference picture or the resolution of the second reference picture is less than the resolution of the reference picture of the current block.
  70. A method of video processing, comprising:
    performing a conversion between a video comprising multiple video blocks of a video picture and a coded representation of the video according to a rule,
    wherein the multiple video blocks are processed in an order,
    wherein the rule specifies that motion vector information used for determining a prediction block of a first video block is stored and used during processing of a succeeding video block of the multiple video blocks according to a resolution of a reference picture used by the motion vector information.
  71. The method of claim 70, wherein the rule specifies that the motion vector information is used for processing the succeeding video block by adjusting according to a resolution difference between the resolution of the reference picture and a resolution of the current picture.
  72. The method of claim 70, wherein the rule specifies that the motion vector information is used for processing the succeeding video block without adjusting according to a resolution difference between the resolution of the reference picture and a resolution of the current picture.
  73. The method of any of claims 1 to 72, wherein the conversion comprises decoding the coded representation to generate pixel values of the video.
  74. The method of any of claims 1 to 72, wherein the conversion comprises encoding pixel values of the video into the coded representation.
  75. A video decoding apparatus comprising a processor configured to implement a method recited in one or more of claims 1 to 74.
  76. A video encoding apparatus comprising a processor configured to implement a method recited in one or more of claims 1 to 74.
  77. A computer program product having computer code stored thereon, wherein the code, when executed by a processor, causes the processor to implement a method recited in any of claims 1 to 74.
  78. A non-transitory computer-readable storage medium storing instructions that cause a processor to implement a method recited in any of claims 1 to 74.
  79. A non-transitory computer-readable recording medium storing a bitstream corresponding to the coded representation that is generated by a method recited in any of claims 1 to 74.
  80. A method, apparatus or system described in the present document.
PCT/CN2020/110763 2019-08-23 2020-08-24 Reference picture resampling WO2021036976A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202080057057.9A CN114223205A (en) 2019-08-23 2020-08-24 Reference picture resampling

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNPCT/CN2019/102289 2019-08-23
CN2019102289 2019-08-23

Publications (1)

Publication Number Publication Date
WO2021036976A1 true WO2021036976A1 (en) 2021-03-04

Family

ID=74684057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110763 WO2021036976A1 (en) 2019-08-23 2020-08-24 Reference picture resampling

Country Status (2)

Country Link
CN (1) CN114223205A (en)
WO (1) WO2021036976A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080267291A1 (en) * 2005-02-18 2008-10-30 Joseph J. Laks Thomson Licensing Llc Method for Deriving Coding Information for High Resolution Images from Low Resolution Images and Coding and Decoding Devices Implementing Said Method
US20120230393A1 (en) * 2011-03-08 2012-09-13 Sue Mon Thet Naing Methods and apparatuses for encoding and decoding video using adaptive interpolation filter length
CN107925772A (en) * 2015-09-25 2018-04-17 华为技术有限公司 The apparatus and method that video motion compensation is carried out using optional interpolation filter

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
J. SAMUELSSON (SHARPLABS), S. DESHPANDE (SHARPLABS), A. SEGALL (SHARP): "AHG8: Adaptive Resolution Change (ARC) with downsampling", 15. JVET MEETING; 20190703 - 20190712; GOTHENBURG; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 5 July 2019 (2019-07-05), XP030218945 *
J. SAMUELSSON (SHARPLABS), S. DESHPANDE (SHARPLABS), A. SEGALL (SHARP): "AHG8: On Adaptive Resolution Change (ARC) High-Level Syntax (HLS)", 15. JVET MEETING; 20190703 - 20190712; GOTHENBURG; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 25 June 2019 (2019-06-25), XP030218837 *
P. CHEN (BROADCOM), T. HELLMAN, B. HENG, W. WAN, M. ZHOU (BROADCOM): "AHG8: Adaptive Resolution Change", 15. JVET MEETING; 20190703 - 20190712; GOTHENBURG; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 5 July 2019 (2019-07-05), XP030219200 *
P. TOPIWALA, M. KRISHNAN, W. DAI (FASTVDO): "AHG8: On Adaptive Resolution Change", 15. JVET MEETING; 20190703 - 20190712; GOTHENBURG; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 26 June 2019 (2019-06-26), XP030219244 *
Y. HE (INTERDIGITAL), Y. HE (INTERDIGITAL), A. HAMZA (INTERDIGITAL): "AHG8: On adaptive resolution change constraint", 15. JVET MEETING; 20190703 - 20190712; GOTHENBURG; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 26 June 2019 (2019-06-26), XP030218774 *

Also Published As

Publication number Publication date
CN114223205A (en) 2022-03-22

Similar Documents

Publication Publication Date Title
WO2021036977A1 (en) Clipping in reference picture resampling
WO2021052490A1 (en) Scaling window in video coding
WO2021073488A1 (en) Interplay between reference picture resampling and video coding tools
WO2021068956A1 (en) Prediction type signaling in video coding
WO2021063418A1 (en) Level-based signaling of video coding tools
WO2021052491A1 (en) Deriving reference sample positions in video coding
WO2021078177A1 (en) Signaling for reference picture resampling
WO2021078178A1 (en) Calculation for multiple coding tools
WO2021036976A1 (en) Reference picture resampling
CN114556937B (en) Downsampling filter type for chroma blend mask generation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20859567

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20859567

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26/08/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20859567

Country of ref document: EP

Kind code of ref document: A1