WO2021188598A1 - Methods and devices for affine motion-compensated prediction refinement - Google Patents

Methods and devices for affine motion-compensated prediction refinement Download PDF

Info

Publication number
WO2021188598A1
WO2021188598A1 (PCT/US2021/022640)
Authority
WO
WIPO (PCT)
Prior art keywords
sub
block
horizontal
vertical
difference
Prior art date
Application number
PCT/US2021/022640
Other languages
English (en)
French (fr)
Inventor
Wei Chen
Xiaoyu XIU
Yi-Wen Chen
Tsung-Chuan MA
Hong-Jheng Jhu
Xianglin Wang
Bing Yu
Original Assignee
Beijing Dajia Internet Information Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co., Ltd. filed Critical Beijing Dajia Internet Information Technology Co., Ltd.
Priority to CN202180021435.2A priority Critical patent/CN115280779A/zh
Publication of WO2021188598A1 publication Critical patent/WO2021188598A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • H04N19/517Processing of motion vectors by encoding
    • H04N19/52Processing of motion vectors by encoding by predictive encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/523Motion estimation or motion compensation with sub-pixel accuracy

Definitions

  • the present disclosure relates to video coding and compression, and in particular but not limited to, methods and apparatuses for affine motion-compensated prediction refinement (AMPR) in video coding.
  • AMPR affine motion-compensated prediction refinement
  • Video coding is performed according to one or more video coding standards.
  • video coding standards include Versatile Video Coding (VVC), High Efficiency Video Coding (HEVC, also known as H.265 or MPEG-H Part 2) and Advanced Video Coding (AVC, also known as H.264 or MPEG-4 Part 10), which are jointly developed by ISO/IEC MPEG and ITU-T VCEG.
  • VVC Versatile Video Coding
  • HEVC High Efficiency Video Coding
  • AVC also known as H.264 or MPEG-4 Part 10
  • AOMedia Video 1 (AV1) was developed by the Alliance for Open Media (AOM) as a successor to its preceding standard VP9.
  • Audio Video Coding (AVS) refers to the digital audio and digital video compression standard.
  • AVS Audio Video Coding
  • Most of the existing video coding standards are built upon the famous hybrid video coding framework, i.e., using block-based prediction methods (e.g., inter prediction, intra prediction) to reduce redundancy present in video images or sequences and using transform coding to compact the energy of the prediction errors.
  • An important goal of video coding techniques is to compress video data into a form that uses a lower bit rate while avoiding or minimizing degradations to video quality.
  • the first generation AVS standard includes the Chinese national standard “Information Technology, Advanced Audio Video Coding, Part 2: Video” (known as AVS1) and “Information Technology, Advanced Audio Video Coding Part 16: Radio Television Video” (known as AVS+). It can offer around 50% bit-rate saving at the same perceptual quality compared to the MPEG-2 standard.
  • the AVS1 standard video part was promulgated as the Chinese national standard in February 2006.
  • the second generation AVS standard includes the series of Chinese national standards “Information Technology, Efficient Multimedia Coding” (known as AVS2), which is mainly targeted at the transmission of extra HD TV programs.
  • the coding efficiency of the AVS2 is double that of the AVS+. In May 2016, the AVS2 was issued as the Chinese national standard.
  • the AVS2 standard video part was submitted by the Institute of Electrical and Electronics Engineers (IEEE) as one international standard for applications.
  • the AVS3 standard is one new generation video coding standard for UHD video applications, aiming at surpassing the coding efficiency of the latest international standard HEVC.
  • In March 2019, at the 68th AVS meeting, the AVS3-P2 baseline was finalized, which provides approximately 30% bit-rate savings over the HEVC standard.
  • HPM high performance model
  • the present disclosure provides examples of techniques relating to AMPR for the AVS3 standards.
  • a method for AMPR includes that a sub-block prediction is generated, at a pixel location in a sub-block, by performing a sub-block-based affine motion compensation on a video picture that includes a plurality of sub-blocks. Additionally, the method includes that a horizontal spatial gradient and a vertical spatial gradient for the sub-block prediction are obtained, at the pixel location, by using an interpolation filter. Further, the method includes that a motion vector (MV) difference between a first MV and a second MV is obtained at the pixel location and based on the pixel location relative to a position within the sub-block.
  • MV motion vector
  • the first MV is an MV of a pixel located at the pixel location
  • the second MV is an MV of the sub-block.
  • an apparatus for AMPR includes one or more processors and a memory configured to store instructions executable by the one or more processors.
  • the one or more processors, upon execution of the instructions, are configured to: generate, at a pixel location in a sub-block, a sub-block prediction by performing a sub-block-based affine motion compensation on a video picture that includes a plurality of sub-blocks; obtain, at the pixel location, a horizontal spatial gradient and a vertical spatial gradient for the sub-block prediction by using an interpolation filter; and obtain, at the pixel location, an MV difference between a first MV and a second MV based on the pixel location relative to a position within the sub-block.
  • the first MV is an MV of a pixel located at the pixel location
  • the second MV is an MV of the sub-block.
  • a non-transitory computer readable storage medium for AMPR is provided, storing computer-executable instructions.
  • the instructions, when executed by one or more computer processors, cause the one or more computer processors to perform acts including: generating, at a pixel location in a sub-block, a sub-block prediction by performing a sub-block-based affine motion compensation on a video picture that includes a plurality of sub-blocks; obtaining, at the pixel location, a horizontal spatial gradient and a vertical spatial gradient for the sub-block prediction by using an interpolation filter; and obtaining, at the pixel location, an MV difference between a first MV and a second MV based on the pixel location relative to a position within the sub-block.
  • the first MV is an MV of a pixel located at the pixel location
  • the second MV is an MV of the sub-block.
  • FIG. 1 is a block diagram illustrating an exemplary video encoder in accordance with some implementations of the present disclosure.
  • FIG. 2 is a block diagram illustrating an exemplary video decoder in accordance with some implementations of the present disclosure.
  • FIGS. 3A-3E are schematic diagrams illustrating multi-type tree splitting modes in accordance with some implementations of the present disclosure.
  • FIG. 4 is a schematic diagram illustrating an example of a bi-directional optical flow (BIO) model in accordance with some implementations of the present disclosure.
  • FIGS. 5A-5B are schematic diagrams illustrating examples of 4-parameter affine model in accordance with some implementations of the present disclosure.
  • FIG. 6 is a schematic diagram illustrating an example of 6-parameter affine model in accordance with some implementations of the present disclosure.
  • FIG. 7 illustrates a prediction refinement with optical flow (PROF) process for affine mode in accordance with some implementations of the present disclosure.
  • FIG. 8 illustrates an example of calculation of a horizontal offset and a vertical offset from a sample location to a specific position of a sub-block where a sub-block MV is derived in accordance with some implementations of the present disclosure.
  • FIG. 9 illustrates an example of sub-blocks inside one affine CU in accordance with some implementations of the present disclosure.
  • FIG. 10 is a block diagram illustrating an exemplary apparatus for AMPR in accordance with some implementations of the present disclosure.
  • FIG. 11 is a flowchart illustrating an exemplary process of AMPR in accordance with some implementations of the present disclosure.
  • references throughout this specification to “one embodiment,” “an embodiment,” “an example,” “some embodiments,” “some examples,” or similar language mean that a particular feature, structure, or characteristic described is included in at least one embodiment or example. Features, structures, elements, or characteristics described in connection with one or some embodiments are also applicable to other embodiments, unless expressly specified otherwise.
  • the terms “first,” “second,” “third,” etc. are all used as nomenclature only for references to relevant elements, e.g., devices, components, compositions, steps, etc., without implying any spatial or chronological order, unless expressly specified otherwise.
  • a “first device” and a “second device” may refer to two separately formed devices, or two parts, components or operational states of a same device, and may be named arbitrarily.
  • module may include memory (shared, dedicated, or group) that stores code or instructions that can be executed by one or more processors.
  • a module may include one or more circuits with or without stored code or instructions.
  • the module or circuit may include one or more components that are directly or indirectly connected. These components may or may not be physically attached to, or located adjacent to, one another.
  • the term “if” or “when” may be understood to mean “upon” or “in response to,” depending on the context. These terms, if they appear in a claim, may not indicate that the relevant limitations or features are conditional or optional.
  • a method may comprise steps of: i) when or if condition X is present, function or action X’ is performed, and ii) when or if condition Y is present, function or action Y’ is performed.
  • the method may be implemented with both the capability of performing function or action X’, and the capability of performing function or action Y’.
  • the functions X’ and Y’ may both be performed, at different times, on multiple executions of the method.
  • a unit or module may be implemented purely by software, purely by hardware, or by a combination of hardware and software.
  • the unit or module may include functionally related code blocks or software components, that are directly or indirectly linked together, so as to perform a particular function.
  • FIG. 1 shows a block diagram illustrating an exemplary block-based hybrid video encoder 100 which may be used in conjunction with many video coding standards using block-based processing.
  • a video frame is partitioned into a plurality of video blocks for processing.
  • a prediction is formed based on either an inter prediction approach or an intra prediction approach.
  • In inter prediction, one or more predictors are formed through motion estimation and motion compensation, based on pixels from previously reconstructed frames.
  • In intra prediction, predictors are formed based on reconstructed pixels in a current frame. Through mode decision, a best predictor may be chosen to predict a current block.
  • Intra prediction uses pixels from the samples of already coded neighboring blocks (which are called reference samples) in the same video picture and/or slice to predict the current video block. Spatial prediction reduces spatial redundancy inherent in the video signal.
  • Inter prediction uses reconstructed pixels from already-coded video pictures to predict the current video block.
  • Temporal prediction reduces temporal redundancy inherent in the video signal.
  • Temporal prediction signal for a given coding unit (CU) or coding block is usually signaled by one or more motion vectors (MVs) which indicate the amount and the direction of motion between the current CU and its temporal reference. Further, if multiple reference pictures are supported, one reference picture index is additionally sent, which is used to identify from which reference picture in the reference picture store the temporal prediction signal comes.
  • MVs motion vectors
  • an intra/inter mode decision circuitry 121 in the encoder 100 chooses the best prediction mode, for example based on the rate-distortion optimization method.
  • the block predictor 120 is then subtracted from the current video block; and the resulting prediction residual is de-correlated using the transform circuitry 102 and the quantization circuitry 104.
  • the resulting quantized residual coefficients are inverse quantized by the inverse quantization circuitry 116 and inverse transformed by the inverse transform circuitry 118 to form the reconstructed residual, which is then added back to the prediction block to form the reconstructed signal of the CU.
  • in-loop filtering 115 such as a deblocking filter, a sample adaptive offset (SAO), and/or an adaptive in-loop filter (ALF) may be applied on the reconstructed CU before it is put in the reference picture store of the picture buffer 117 and used to code future video blocks.
  • the coding mode (inter or intra), prediction mode information, motion information, and quantized residual coefficients are all sent to the entropy coding unit 106 to be further compressed and packed to form the bit-stream.
  • a deblocking filter is available in AVC, HEVC as well as the now-current version of VVC.
  • SAO sample adaptive offset
  • ALF adaptive loop filter
  • These in-loop filter operations are optional. Performing these operations helps to improve coding efficiency and visual quality. They may also be turned off as a decision rendered by the encoder 100 to save computational complexity. It should be noted that intra prediction is usually based on unfiltered reconstructed pixels, while inter prediction is based on filtered reconstructed pixels if these filter options are turned on by the encoder 100.
  • FIG. 2 is a block diagram illustrating an exemplary block-based video decoder 200 which may be used in conjunction with many video coding standards.
  • This decoder 200 is similar to the reconstruction-related section residing in the encoder 100 of FIG. 1.
  • an incoming video bitstream 201 is first decoded through an Entropy Decoding 202 to derive quantized coefficient levels and prediction-related information.
  • the quantized coefficient levels are then processed through an Inverse Quantization 204 and an Inverse Transform 206 to obtain a reconstructed prediction residual.
  • a block predictor mechanism implemented in an Intra/inter Mode Selector 212, is configured to perform either an Intra Prediction 208, or a Motion Compensation 210, based on decoded prediction information.
  • a set of unfiltered reconstructed pixels are obtained by summing up the reconstructed prediction residual from the Inverse Transform 206 and a predictive output generated by the block predictor mechanism, using a summer 214.
  • the reconstructed block may further go through an In-Loop Filter 209 before it is stored in a Picture Buffer 213 which functions as a reference picture store.
  • the reconstructed video in the Picture Buffer 213 may be sent to drive a display device, as well as used to predict future video blocks.
  • a filtering operation is performed on these reconstructed pixels to derive a final reconstructed Video Output 222.
  • Video coding/decoding standards mentioned above, such as HEVC and AVS3, are conceptually similar. For example, they all use the block-based hybrid video coding framework. Block partitioning schemes in some standards are elaborated below.
  • HEVC partitions blocks only based on quad-trees.
  • the basic unit for compression is termed coding tree unit (CTU).
  • CTU coding tree unit
  • Each CTU may contain one coding unit (CU) or be recursively split into four smaller CUs until the predefined minimum CU size is reached.
  • Each CU (also named leaf CU) may be further partitioned into one or more prediction units (PUs) and a tree of transform units (TUs).
  • In the AVS3, one coding tree unit (CTU) is split into CUs to adapt to varying local characteristics based on quad/binary/extended-quad-tree partitioning. Additionally, the concept of multiple partition unit types in the HEVC is removed, i.e., the separation of CU, PU and TU does not exist in the AVS3.
  • each CU is always used as the basic unit for both prediction and transform without further partitions.
  • one CTU is firstly partitioned based on a quad-tree structure.
  • each quad-tree leaf node may be further partitioned based on a binary and extended-quad-tree structure.
  • FIGS. 3A-3E are schematic diagrams illustrating multi-type tree splitting modes in accordance with some implementations of the present disclosure. As shown in FIGS. 3A-3E, there are five splitting types in the multi-type tree structure: quaternary partitioning 301, vertical binary partitioning 302, horizontal binary partitioning 303, vertical extended quaternary partitioning 304, and horizontal extended quaternary partitioning 305.
  • block-based motion compensation may be applied to achieve a tradeoff among coding efficiency, complexity, and memory access bandwidth.
  • the average prediction accuracy is inferior to pixel-based prediction because all pixels within each block or sub-block share the same block level motion vector.
  • prediction refinement with optical flow (PROF) for affine mode is adopted as a coding tool in the current VVC standards.
  • AVS3 there is no similar tool.
  • Some examples of the present disclosure provide alternative optical-flow based methods to improve affine mode prediction accuracy as elaborated below.
  • affine motion compensated prediction is applied by signaling one flag for each inter coding block to indicate whether the translation motion model or the affine motion model is applied for inter prediction.
  • two affine modes, including a 4-parameter affine mode and a 6-parameter affine mode, are supported for one affine coding block.
  • the 4-parameter affine model may have the following parameters: two parameters for translation movement in the horizontal and vertical directions respectively, one parameter for zoom motion and one parameter for rotational motion for both directions.
  • horizontal zoom parameter may be equal to vertical zoom parameter
  • horizontal rotation parameter may be equal to vertical rotation parameter.
  • those affine parameters may be derived from two MVs, which are also called control point motion vectors (CPMVs), located at the top-left corner and top-right corner of a current block.
  • CPMV control point motion vector
  • FIGS. 5A-5B are schematic diagrams illustrating examples of 4-parameter affine model in accordance with some implementations of the present disclosure.
  • the affine motion field of the block is described by two control point MVs (V0, V1). Based on the control point motion, the motion field (vx, vy) of one affine coded block is described as equation (1) below:
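  • In a commonly used form (following the VVC design of the 4-parameter affine model, which this description also assumes), equation (1) may be written as:
      vx(x, y) = ((v1x - v0x) / w) * x - ((v1y - v0y) / w) * y + v0x
      vy(x, y) = ((v1y - v0y) / w) * x + ((v1x - v0x) / w) * y + v0y
    where (v0x, v0y) and (v1x, v1y) are the top-left and top-right control point MVs and w is the width of the block.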
  • the 6-parameter affine mode may have the following parameters: two parameters for translation movement in the horizontal and vertical directions respectively, two parameters for zoom motion and rotation motion respectively in the horizontal direction, and another two parameters for zoom motion and rotation motion respectively in the vertical direction.
  • the 6-parameter affine motion model is coded with three CPMVs.
  • FIG. 6 is a schematic diagram illustrating an example of 6-parameter affine model in accordance with some implementations of the present disclosure.
  • three control points of one 6-parameter affine block 601 are located at the top-left, top-right and bottom-left corners of the block.
  • the motion at top-left control point is related to translation motion
  • the motion at top-right control point is related to rotation and zoom motion in horizontal direction
  • the motion at bottom-left control point is related to rotation and zoom motion in vertical direction.
  • the rotation and zoom motion in the horizontal direction of the 6-parameter model may not be the same as the corresponding motion in the vertical direction.
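  • In the same commonly used form (following the VVC design of the 6-parameter affine model), the corresponding motion field, referred to elsewhere in this description as equation (2), may be written as:
      vx(x, y) = ((v1x - v0x) / w) * x + ((v2x - v0x) / h) * y + v0x
      vy(x, y) = ((v1y - v0y) / w) * x + ((v2y - v0y) / h) * y + v0y
    where (v0x, v0y), (v1x, v1y) and (v2x, v2y) are the top-left, top-right and bottom-left control point MVs, and w and h are the width and height of the block.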
  • the PROF is adopted in the VVC which refines the sub-block based affine motion compensation based on the optical flow model. Specifically, after performing the sub-block-based affine motion compensation, each luma prediction sample of one affine block is modified by one sample refinement value derived based on the optical flow equation.
  • the operations of the PROF may be summarized as following four steps.
  • the first step is that the sub-block-based affine motion compensation is performed to generate the sub-block prediction I(i, j) using the sub-block MVs as derived according to the equation (1) above for the 4-parameter affine model, or equation (2) above for the 6-parameter affine model.
  • the second step is that spatial gradients gx(i, j) and gy(i, j) of each prediction sample are calculated as equation (3) below:
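  • In the PROF design of the VVC, which this step follows, the spatial gradients take a central-difference form (precision shifts are omitted here for readability):
      gx(i, j) = I(i + 1, j) - I(i - 1, j)
      gy(i, j) = I(i, j + 1) - I(i, j - 1)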
  • one additional row and/or column of prediction samples needs to be generated on each of the four sides of one sub-block, which extends a 4x4 sub-block into a 6x6 sub-block.
  • the samples on the extended borders are copied from the nearest integer pixel position in the reference picture to avoid additional interpolation processes.
  • the third step is that the luma prediction refinement value is calculated by equation (4) below, where Δv(i, j) is the difference between the pixel MV computed for sample location (i, j), denoted by v(i, j), and the sub-block MV of the sub-block where the pixel (i, j) is located.
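  • In the PROF design of the VVC, which this step follows, equation (4) may be written as:
      ΔI(i, j) = gx(i, j) * Δvx(i, j) + gy(i, j) * Δvy(i, j)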
  • FIG. 7 illustrates a PROF process for the affine mode in accordance with some implementations of the present disclosure.
  • Function clip3(min, max, val) confines a given value “val” in the range of [min, max]
  • Since the affine model parameters and the pixel locations relative to the sub-block center do not change from sub-block to sub-block, Δv(i, j) can be calculated for the first sub-block, and reused for other sub-blocks in the same CU.
  • where Δx and Δy are the horizontal and vertical offsets from the sample location (i, j) to the center of the sub-block that the sample belongs to, Δv(i, j) can be derived as shown in equation (5) below:
  • parameters c, d, e, and f may be derived as shown in the equation below, where (v0x, v0y), (v1x, v1y), (v2x, v2y) are the top-left, top-right and bottom-left control point MVs of the current coding block, and w and h are the width and height of the block.
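  • Following the same PROF design, equation (5) and the model parameters may be written as:
      Δvx(i, j) = c * Δx + d * Δy
      Δvy(i, j) = e * Δx + f * Δy
    For the 4-parameter affine model: c = f = (v1x - v0x) / w and e = -d = (v1y - v0y) / w.
    For the 6-parameter affine model: c = (v1x - v0x) / w, d = (v2x - v0x) / h, e = (v1y - v0y) / w, and f = (v2y - v0y) / h.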
  • the MV differences Δvx and Δvy are always derived at the precision of 1/32-pel.
  • the fourth step is that the luma prediction refinement ΔI(i, j) is added to the sub-block prediction I(i, j).
  • the final prediction I'(i, j) for the sample at location (i, j) is generated as shown in equation (6) below:
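  • Following the same PROF design, equation (6) may be written as:
      I'(i, j) = I(i, j) + ΔI(i, j)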
  • Bi-prediction in video coding is a simple combination of two temporal prediction blocks obtained from the reference pictures.
  • the motion vectors received at a decoder end may not be so accurate.
  • the BIO tool is adopted in both VVC and AVS3 standards to compensate such motion for every sample inside one block.
  • the BIO is sample-wise motion refinement that is performed on top of the block-based motion-compensated predictions when bi-prediction is used.
  • the derivation of the refined motion vector for each sample in one block is based on the classical optical flow model.
  • the motion refinement (vx, vy) at (x, y) can be derived by the optical flow equation (7) below:
  • the BIO prediction may be obtained as shown in equation (8) below:
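  • As a reference (following the classical bi-directional optical flow formulation on which both the VVC and AVS3 designs are based), equations (7) and (8) may be written as:
      ∂I(k)/∂t + vx * ∂I(k)/∂x + vy * ∂I(k)/∂y = 0,  k = 0, 1
      predBIO(x, y) = ( I(0)(x, y) + I(1)(x, y) + (vx / 2) * (∂I(1)/∂x - ∂I(0)/∂x) + (vy / 2) * (∂I(1)/∂y - ∂I(0)/∂y) ) / 2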
  • FIG. 4 is a schematic diagram illustrating an example of a BIO model in accordance with some implementations of the present disclosure.
  • (MVx0, MVy0) and (MVx1, MVy1) indicate the block-level motion vectors that are used to generate the two prediction blocks I(0) and I(1).
  • the motion refinement (vx, vy) at the sample location (x, y) is calculated by minimizing the difference Δ between the values of the samples after motion refinement compensation (i.e., A and B in FIG. 4), as shown in equation (9) below:
  • the gradients need to be derived in the BIO for every sample of each motion compensated block (i.e., I(0) and I(1)) in order to derive the local motion refinement and generate the final prediction at that sample location.
  • the gradients are calculated by a 2D separable finite impulse response (FIR) filtering process which defines a set of 8-tap filters and applies different filters to derive the horizontal and vertical gradients according to the precision of the block-level motion vectors (e.g., (MVx0, MVy0) and (MVx1, MVy1) in FIG. 4).
  • Table 1 illustrates coefficients of the gradient filters that are used by the BIO.
  • the BIO is only applied to bi-prediction blocks which are predicted by two reference blocks from temporal neighboring pictures. Additionally, the BIO is enabled without sending additional information from encoder to decoder. Specifically, the BIO is applied to all the bi-directional predicted blocks which have both the forward and backward prediction signals.
  • the UMVE (ultimate motion vector expression) mode in the AVS3 standards is the same tool as the one named merge mode with motion vector differences (MMVD) in the VVC standards.
  • MMVD motion vector differences
  • the MMVD/UMVE mode is introduced in both the VVC and AVS standards as one special merge mode.
  • it is signaled by one MMVD flag at coding block level.
  • In the MMVD mode, two base merge candidates are firstly generated as the first two candidates of the regular merge mode. After one base merge candidate is selected and signaled, additional syntax elements are signaled to indicate the MVDs that are added to the motion of the selected merge candidate.
  • the MMVD syntax elements include a merge candidate flag to select the base merge candidate, a distance index to specify the MVD magnitude and a direction index to indicate the MVD direction.
  • In the AVS3 standards, sub-block based affine motion compensation (affine mode), similar to that in the VVC standards, is used to generate inter-predicted pixel values.
  • This sub-block-based prediction is a trade-off among coding efficiency, complexity and memory access bandwidth.
  • the average prediction accuracy is inferior to pixel-based prediction because all pixels within each sub-block share the same motion vector.
  • the AVS3 standards do not have pixel-level refinement after sub-block-based motion compensation.
  • the present disclosure provides a new method to improve affine mode prediction accuracy.
  • the prediction value of each pixel is refined by adding a differential value derived by the optical flow equation.
  • the proposed method may be referred to as Affine Motion-compensated Prediction Refinement (AMPR).
  • AMPR Affine Motion-compensated Prediction Refinement
  • the AMPR may achieve pixel level prediction accuracy without significantly increasing the complexity and also keeps the worst-case memory access bandwidth comparable to the regular sub-block based motion compensation in the affine mode.
  • Although the AMPR is also built upon optical flow, it is significantly different from the PROF in the VVC standards in the following aspects.
  • AMPR may be adaptively skipped at the decoder side based on certain defined conditions indicating that applying AMPR is not a good performance and/or complexity tradeoff.
  • this early termination method may be also used to simplify encoder side operations.
  • Some encoder side optimization methods for the AMPR process are also presented in the examples of the present disclosure to reduce its latency and energy consumption, such as skipping AMPR for affine UMVE, checking the best mode selection at the parent CU before applying AMPR, skipping AMPR for motion estimation at certain block sizes, checking the magnitude of the pixel MV difference before applying AMPR, skipping AMPR for some picture types (e.g., low-delay pictures or non-low-delay pictures), etc.
  • the AMPR method may include five steps as explained below.
  • regular sub-block-based affine motion compensation is performed to generate sub-block prediction I(i,j) at each pixel location
  • both the horizontal and vertical gradients of affine prediction samples are directly calculated from reference samples at integer sample positions in the temporal reference picture.
  • One advantage of doing so is that for each affine sub-block, its gradient values may be generated at the same time when generating its prediction samples.
  • Another design benefit of such a gradient calculation method is that it is also consistent with the gradient calculation process used by other coding tools in the AVS standard, such as BIO. Sharing the same process among different modules in a standard is friendly to the pipe-line and/or parallelism design in practical hardware codec implementations.
  • the input to the gradient derivation process is the same reference samples as those used for the motion compensation of the affine sub-block and the same fractional components (fracX, fracY) of the input motion (MVx, MVy) of the sub-block.
  • fracX, fracY fractional components of the input motion
  • MVx, MVy input motion of the sub-block
  • another set of new FIR filters h0 is introduced in the proposed method to calculate the gradient values.
  • the order of applying the filters h0 and h1 is different.
  • the gradient filter h0 is firstly applied in the horizontal direction to derive the horizontal gradient values at the horizontal fractional sample position fracX; then, the interpolation filter h1 is applied vertically to interpolate the gradient values at the vertical fractional sample position fracY.
  • the interpolation filter h1 is firstly applied horizontally to interpolate intermediate interpolation samples at the horizontal sample position fracX, followed by the gradient filter h0 being applied in the vertical direction to derive the vertical gradient values at the vertical fractional sample position fracY from the intermediate interpolation samples.
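  • As an illustration only (not the normative AVS3 filtering process; the actual filter taps, bit depths and rounding are not reproduced, and the helper names below are hypothetical), the order of the separable filtering described above may be sketched in Python as:
      import numpy as np

      def apply_1d(samples, taps, axis):
          # Apply a 1-D FIR filter along the given axis (axis=1: horizontal, axis=0: vertical).
          return np.apply_along_axis(lambda v: np.convolve(v, taps, mode="same"), axis, samples)

      def ampr_gradients(ref_block, h0_fracX, h1_fracX, h0_fracY, h1_fracY):
          # Horizontal gradient: gradient filter h0 horizontally at fracX,
          # then interpolation filter h1 vertically at fracY.
          gx = apply_1d(apply_1d(ref_block, h0_fracX, axis=1), h1_fracY, axis=0)
          # Vertical gradient: interpolation filter h1 horizontally at fracX,
          # then gradient filter h0 vertically at fracY.
          gy = apply_1d(apply_1d(ref_block, h1_fracX, axis=1), h0_fracY, axis=0)
          return gx, gy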
  • the gradient filters may be generated with different filter coefficient precisions and with different numbers of taps, which can provide various tradeoffs between gradient calculation precision and computational complexity. For instance, gradient filters with more filter taps and/or with a higher filter coefficient precision can generally lead to better coding efficiency, but at the expense of more computational operations (e.g., numbers of additions, multiplications and shifts) due to the gradient calculation processes.
  • the following 8-tap filters are proposed for the horizontal and/or vertical gradient calculations of the AMPR, as shown in Table 2.
  • Table 2 is an exemplary table of predefined 8-tap interpolation filter coefficients fgrad[p] for generating spatial gradients based on 1/16-pel precision of input sample values.
  • Table 3 is an exemplary table of predefined 4-tap interpolation filter coefficients fgrad[p] for generating spatial gradients based on 1/16-pel precision of input sample values.
  • FIG. 8 illustrates an example of calculation of a horizontal offset and a vertical offset from a sample location to a specific position of a sub-block where a sub-block MV is derived.
  • the horizontal offset Δx and the vertical offset Δy are calculated from the sample location (i, j) to the specific position (i', j') of the sub-block where the sub-block MV is derived.
  • the specific position (i', j') may not always be the center of the sub-block.
  • Δv(i, j) is calculated based on the pixel location relative to a specific position within the sub-block by equation (5).
  • let (i, j) be the pixel location/coordinate within a sub-block to which the pixel belongs.
  • Δx and Δy may be calculated by an equation shown below:
  • Δx and Δy can be calculated by an equation shown below:
  • Δv(i, j) may be calculated by equation (5), where Δx and Δy are the horizontal and vertical offsets from the sample location (i, j) to the pilot sample location of the sub-block to which the sample belongs.
  • the pilot sample location refers to the sample location inside one sub-block which is used to derive the MV for generating the sub-block-based prediction samples of the sub-block.
  • the values of Ax and Ay are derived as follows.
  • FIG. 9 illustrates an example of sub-blocks inside one affine CU in accordance with some implementations of the present disclosure.
  • Δx = i
  • Δx = i - (w >> 1) - 0.5
  • Δy = j - (h >> 1) - 0.5 when the 4-parameter affine model is applied.
  • Δx = i - (w >> 1) - 0.5
  • Δy = j - (h >> 1) - 0.5.
  • a prediction refinement value is calculated by the equation (4).
  • the prediction refinement is added to the sub-block prediction I(i, j).
  • the final prediction I'(i, j) is generated as in equation (6).
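  • As an illustration only (a minimal sketch of the per-pixel refinement of the third through fifth steps above, assuming the sub-block prediction I, the gradients gx/gy and the per-pixel MV differences dvx/dvy have already been derived; names are hypothetical and integer precision handling is omitted):
      def ampr_refine(I, gx, gy, dvx, dvy):
          # Return the refined prediction I'(i, j) = I(i, j) + dI(i, j).
          h, w = len(I), len(I[0])
          refined = [[0] * w for _ in range(h)]
          for j in range(h):
              for i in range(w):
                  # Optical-flow based refinement, equation (4): dI = gx*dvx + gy*dvy.
                  dI = gx[j][i] * dvx[j][i] + gy[j][i] * dvy[j][i]
                  refined[j][i] = I[j][i] + dI  # equation (6)
          return refined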
  • the proposed AMPR workflow may be applied to luma component and/or chroma components.
  • the proposed AMPR is only applied to refine the affine prediction samples of luma component while the chroma prediction samples are still generated based on the existing sub-block-based affine motion compensation.
  • both luma component and chroma components are refined by the proposed AMPR process.
  • the sample-wise MV difference Δv(i, j) may be derived in different manners.
  • the sample-wise MV difference Δv(i, j) may always be derived only once based on luma sub-blocks and then reused for chroma components when the prediction refinement value is calculated in the above fourth step.
  • the value of Δv(i, j) used by chroma components may be scaled according to the sampling grid ratio between the collocated luma and chroma coding blocks. For example, for 4:2:0 video, the value of the reused Δv(i, j) may be halved before being used by chroma components, while for 4:4:4 video, the same value of the reused Δv(i, j) may be used by chroma components. For 4:2:2 video, the horizontal offset of Δv(i, j) may be halved while the vertical offset of Δv(i, j) may not be changed before being used by chroma components.
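  • As an illustration only (a sketch of the scaling rule described above; the function name and chroma format strings are hypothetical):
      def scale_delta_mv_for_chroma(dvx, dvy, chroma_format):
          if chroma_format == "4:2:0":   # chroma subsampled in both directions
              return dvx / 2, dvy / 2
          if chroma_format == "4:2:2":   # chroma subsampled horizontally only
              return dvx / 2, dvy
          return dvx, dvy                # 4:4:4 uses the same sampling grid as luma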
  • the sample-wise MV difference Δv(i, j) may be separately derived for luma and chroma components, where the derivation process may be the same as the above-described third step.
  • one flag is signaled to indicate whether the AMPR is applied to chroma components at various coding levels, e.g., sequence level, picture level, slice level and so forth. Further, if the above enabling/disabling flag is true, another flag may be signaled from encoder to decoder to indicate whether the chroma motion refinements are re-calculated from the corresponding control-point motion vectors or directly borrowed from the corresponding motion refinements of the luma component.
  • the prediction refinement derived by applying AMPR may not always be beneficial and/or necessary.
  • the significance of the derived ΔI(i, j) is determined by the precision and magnitude of the derived Δv(i, j) and g(i, j).
  • the AMPR operation may be conditionally applied based on certain conditions. This may be achieved by signaling a flag for each block to indicate if the AMPR mode is applied or not. It may also be achieved by using the same conditions to enable the AMPR operation at both encoder and decoder sides, with no additional signaling required.
  • In some cases, the AMPR operation may not help, or may even hurt, coding performance, and therefore it is better to skip the AMPR operation for the block.
  • Another motivation of such conditional application of AMPR operation is that in some cases the benefit of applying AMPR may be marginal and from computation complexity point of view it is also better to turn the operation off.
  • the AMPR operation may be applied depending on whether the CPMVs are explicitly signaled or not.
  • In the affine merge mode, where the CPMVs are not explicitly signaled but implicitly derived from spatial neighbor CUs, AMPR may be skipped for the current CU because the CPMVs under this mode may not be accurate.
  • AMPR may be skipped.
  • threshold values may be determined based on various factors, e.g., CU aspect ratio and/or sub-block size, etc. Such an example may be implemented in different manners described below.
  • AMPR may be skipped for this sub-block.
  • This condition may have different implementation variations.
  • the check of the absolute value of the derived Δv(i, j) for all the pixels may be simplified by only checking the four corners of the current sub-block, where the maximum absolute value of the derived Δv(i, j) for all the pixels within a sub-block can be found as shown in equation (12) below, where the pixel location (i, j) may be any pixel coordinate in the sub-block, or may be from the four corners (0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1).
  • the calculation of the maximum absolute value of all Δv(i, j) may be obtained by the equation below, where the sample locations (i, j) are the four corners of those sub-blocks in a CU except the top-left (i.e., sub-block A in FIG. 9), top-right (i.e., sub-block B in FIG. 9) and bottom-left sub-blocks (i.e., sub-block C in FIG. 9).
  • the coordinates of the four corner pixels within a sub-block are: (0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1).
  • |x| is the function to take the absolute value of x.
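  • As an illustration only (a sketch of the four-corner early-termination check described above; delta_v(i, j), returning the pair (Δvx, Δvy), and the threshold names are hypothetical):
      def should_skip_ampr(delta_v, w, h, threshvx, threshvy):
          corners = [(0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1)]
          max_abs_dvx = max(abs(delta_v(i, j)[0]) for (i, j) in corners)
          max_abs_dvy = max(abs(delta_v(i, j)[1]) for (i, j) in corners)
          # If the largest per-pixel MV difference is below both thresholds,
          # the refinement is considered insignificant and AMPR may be skipped.
          return max_abs_dvx < threshvx and max_abs_dvy < threshvy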
  • the check of the derived Δv(i, j) may be combined with the non-simplified AMPR operation.
  • the equation (13) may be merged with equation (4), and then the prediction refinement value is calculated by the equation below:
  • the thresholds threshvx and threshvy may have different values or the same value.
  • the values of threshvx and threshvy may be determined depending on which position is used for deriving the sub-block MV. In other words, a different or the same pair of values of threshvx and threshvy may be determined for two sub-blocks if their MVs are derived using different positions.
  • for a sub-block whose sub-block level MV is derived based on a different position (e.g., the sub-block center), its pair of values of threshvx and threshvy may be the same as or different from that of a sub-block whose sub-block level MV is derived based on the position of the sub-block's top-left corner.
  • the values of threshvx and threshvy may be defined to be in the range of [1/32, 1/16] in units of pixels. For example, a value of (1/16)*(10/16), (1/16)*(12/16), or (1/16)*(14/16) may be used as the threshold. In this case, the threshold value is a floating-point number in units of 1/16-pel, e.g., 0.625 in units of 1/16-pel, 0.75 in units of 1/16-pel, or 0.875 in units of 1/16-pel. In some examples, the values of threshvx and threshvy may be defined based on picture types.
  • For low-delay pictures, the derived affine model parameters may have smaller magnitudes than those for non-low-delay pictures, since low-delay pictures tend to have smaller and/or smoother motions, and therefore smaller values may be preferred for those thresholds.
  • the values of threshv x and threshv y may be the same regardless of different picture types.
  • AMPR may be skipped for this sub-block.
  • a sub-block may contain a smooth surface which may consist of flat textures (e.g., with no or a small number of high-frequency details).
  • the significance of Δv(i, j) and g(i, j) may be considered jointly or used in a hybrid manner to decide whether AMPR should be skipped for the current sub-block or CU.
  • the affine UMVE mode is computation-intensive for the encoder because it involves choosing the best distance index for each merge mode candidate.
  • SATD sum of absolute transformed difference
  • the AMPR operation is skipped during the SATD-based cost calculation for the affine UMVE mode at the encoder side. It is found through experiments that while the best index is selected according to the best SATD cost, whether AMPR is applied during the SATD calculation or not usually does not change the ranking of the best SATD cost. Therefore, with the proposed method, enabling the AMPR mode would not incur obvious encoder complexity for the affine UMVE mode.
  • Motion estimation is another major overhead at the encoder side.
  • AMPR process may be skipped depending on certain conditions. These conditions indicate that the best encoding mode of a CU is unlikely to be affine mode after mode selection process.
  • One example of such a condition is whether a current CU has a parent CU which is already determined to be coded by explicit affine mode or affine merge mode. This is due to the strong correlation of coding mode selection between a CU and its parent CU, and it is more likely that the best coding mode for the current CU is also explicit affine mode if the condition above is true.
  • Another exemplary condition used for enabling AMPR is whether the parent CU of the current CU is determined to be inter-predicted with the explicit affine mode. If it is true, AMPR is applied during affine motion estimation of the current CU; otherwise, AMPR is skipped during affine motion estimation of the current CU.
  • AMPR may be skipped for small size CUs.
  • the size of a CU may be defined as the total number of pixels.
  • a pixel number threshold such as 16x16, 16x32, or 32x32 may be defined, and for a block with a size smaller than the defined threshold, AMPR may be skipped during the affine motion estimation process for the block.
  • FIG. 10 is a block diagram illustrating an apparatus for AMPR in accordance with some implementations of the present disclosure.
  • the apparatus 1000 may be a terminal, such as a mobile phone, a tablet computer, a digital broadcast terminal, a tablet device, or a personal digital assistant.
  • the apparatus 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power supply component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
  • the processing component 1002 usually controls overall operations of the apparatus 1000, such as operations relating to display, a telephone call, data communication, a camera operation and a recording operation.
  • the processing component 1002 may include one or more processors 1020 for executing instructions to complete all or a part of steps of the above method.
  • the processing component 1002 may include one or more modules to facilitate interaction between the processing component 1002 and other components.
  • the processing component 1002 may include a multimedia module to facilitate the interaction between the multimedia component 1008 and the processing component 1002.
  • the memory 1004 is configured to store different types of data to support operations of the apparatus 1000. Examples of such data include instructions, contact data, phonebook data, messages, pictures, videos, and so on for any application or method that operates on the apparatus 1000.
  • the memory 1004 may be implemented by any type of volatile or non-volatile storage devices or a combination thereof, and the memory 1004 may be a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic memory, a flash memory, a magnetic disk or a compact disk.
  • SRAM Static Random Access Memory
  • EEPROM Electrically Erasable Programmable Read-Only Memory
  • EPROM Erasable Programmable Read-Only Memory
  • PROM Programmable Read-Only Memory
  • ROM Read-Only Memory
  • the power supply component 1006 supplies power for different components of the apparatus 1000.
  • the power supply component 1006 may include a power supply management system, one or more power supplies, and other components associated with generating, managing and distributing power for the apparatus 1000.
  • the multimedia component 1008 includes a screen providing an output interface between the apparatus 1000 and a user.
  • the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen receiving an input signal from a user.
  • the touch panel may include one or more touch sensors for sensing a touch, a slide and a gesture on the touch panel. The touch sensor may not only sense a boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
  • the multimedia component 1008 may include a front camera and/or a rear camera.
  • the audio component 1010 is configured to output and/or input an audio signal.
  • the audio component 1010 includes a microphone (MIC).
  • the microphone is configured to receive an external audio signal.
  • the received audio signal may be further stored in the memory 1004 or sent via the communication component 1016.
  • the audio component 1010 further includes a speaker for outputting an audio signal.
  • the I/O interface 1012 provides an interface between the processing component 1002 and a peripheral interface module.
  • the above peripheral interface module may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button and a lock button.
  • the sensor component 1014 includes one or more sensors for providing a state assessment in different aspects for the apparatus 1000.
  • the sensor component 1014 may detect an on/off state of the apparatus 1000 and relative locations of components. For example, the components are a display and a keypad of the apparatus 1000.
  • the sensor component 1014 may also detect a position change of the apparatus 1000 or a component of the apparatus 1000, presence or absence of a contact of a user on the apparatus 1000, an orientation or acceleration/deceleration of the apparatus 1000, and a temperature change of apparatus 1000.
  • the sensor component 1014 may include a proximity sensor configured to detect presence of a nearby object without any physical touch.
  • the sensor component 1014 may further include an optical sensor, such as a CMOS or CCD image sensor used in an imaging application.
  • the sensor component 1014 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • the communication component 1016 is configured to facilitate wired or wireless communication between the apparatus 1000 and other devices.
  • the apparatus 1000 may access a wireless network based on a communication standard, such as WiFi, 4G, or a combination thereof.
  • the communication component 1016 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 1016 may further include a Near Field Communication (NFC) module for promoting short-range communication.
  • NFC Near Field Communication
  • the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra-Wide Band (UWB) technology, Bluetooth (BT) technology and other technology.
  • RFID Radio Frequency Identification
  • IrDA infrared data association
  • UWB Ultra-Wide Band
  • Bluetooth Bluetooth
  • the apparatus 1000 may be implemented by one or more of Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSPD), Programmable Logic Devices (PLD), Field Programmable Gate Arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic elements to perform the above method.
  • ASIC Application Specific Integrated Circuits
  • DSP Digital Signal Processors
  • DSPD Digital Signal Processing Devices
  • PLD Programmable Logic Devices
  • FPGA Field Programmable Gate Arrays
  • a non-transitory computer readable storage medium may be, for example, a Hard Disk Drive (HDD), a Solid-State Drive (SSD), Flash memory, a Hybrid Drive or Solid-State Hybrid Drive (SSHD), a Read-Only Memory (ROM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, etc.
  • HDD Hard Disk Drive
  • SSD Solid-State Drive
  • SSHD Solid-State Hybrid Drive
  • ROM Read-Only Memory
  • CD-ROM Compact Disc Read-Only Memory
  • FIG. 11 is a flowchart illustrating an exemplary process of AMPR in accordance with some implementations of the present disclosure.
  • In step 1102, the processor 1020 generates, at a pixel location in a sub-block, a sub-block prediction by performing a sub-block-based affine motion compensation on a video picture that includes a plurality of sub-blocks.
  • In step 1104, the processor 1020 obtains, at the pixel location, a horizontal spatial gradient and a vertical spatial gradient for the sub-block prediction by using an interpolation filter.
  • In step 1106, the processor 1020 obtains, at the pixel location, an MV difference between a first MV and a second MV based on the pixel location relative to a position within the sub-block.
  • the first MV may be an MV of a pixel located at the pixel location
  • the second MV may be an MV of the sub-block.
  • the processor 1020 may obtain a prediction refinement, at the pixel location, based on the horizontal spatial gradient, the vertical spatial gradient, a horizontal MV difference, and a vertical MV difference, and generate a final prediction at the pixel location by adding the prediction refinement to the sub-block prediction.
  • the MV difference may include the horizontal MV difference and the vertical MV difference.
  • the processor 1020 may determine whether the AMPR applies to the sub-block, enable the AMPR of the sub-block upon determining that the AMPR applies, and skip the AMPR of the sub-block upon determining that the AMPR does not apply.
  • the processor 1020 may determine whether one or more CPMVs of the sub-block are explicitly signaled with a flag indicating that the AMPR applies, and skip the AMPR of the sub-block upon determining that the one or more CPMVs are not explicitly signaled with the flag and are implicitly derived from spatial neighbor sub-blocks.
  • the processor 1020 may determine whether magnitude of the horizontal MV difference, the vertical MV difference, the horizontal spatial gradient, or the vertical spatial gradient is less than a threshold value and skip the AMPR of the sub-block upon determining that the magnitude is less than the threshold value.
  • the processor 1020 may determine whether magnitude of the horizontal MV difference or magnitude of the horizontal spatial gradient is less than a horizontal threshold value, and skip the AMPR of the sub-block on a horizontal direction upon determining that the magnitude of the horizontal MV difference or the magnitude of the horizontal spatial gradient is less than the horizontal threshold value.
  • the processor 1020 may determine whether magnitude of the vertical MV difference or magnitude of the vertical spatial gradient is less than a vertical threshold value and skip the AMPR of the sub-block on a vertical direction upon determining that the magnitude of the vertical MV difference or the magnitude of the vertical spatial gradient is less than the vertical threshold value.
  • the processor 1020 may determine whether magnitude of the horizontal MV difference or magnitude of the horizontal spatial gradient is less than a horizontal threshold value and determine whether magnitude of the vertical MV difference or magnitude of the vertical spatial gradient is less than a vertical threshold value, and skip the AMPR of the sub-block both on a horizontal direction and a vertical direction upon determining that the magnitude of the horizontal MV difference or the magnitude of the horizontal spatial gradient is less than the horizontal threshold value and in response to determining that the magnitude of the vertical MV difference or the magnitude of the vertical spatial gradient is less than the vertical threshold value.
  • the processor 1020 may obtain the horizontal spatial gradient and the vertical spatial gradient for the sub-block prediction at the same time as generating prediction samples for the sub-block at integer sample positions in a temporal reference picture associated with the sub-block.
  • the processor 1020 may obtain, at the pixel location, the horizontal spatial gradient by horizontally applying a gradient filter to derive the horizontal spatial gradient at a horizontal fractional sample position and vertically applying the interpolation filter to interpolate the horizontal spatial gradient at a vertical fractional sample position.
  • the processor 1020 may obtain, at the pixel location, the vertical spatial gradient by horizontally applying the interpolation filter to interpolate an intermediate interpolation sample at a horizontal sample position and vertically applying a gradient filter to derive the vertical spatial gradient at a vertical fraction sample position from the intermediate interpolation sample.
  • the processor 1020 may determine a maximum horizontal absolute value and a maximum vertical absolute value of the MV difference for a plurality of pixels within the sub-block and determine the MV difference based on the maximum horizontal absolute value and the maximum vertical absolute value by one of the following acts including: determining the horizontal MV difference of the plurality of pixels as zero upon determining that the maximum horizontal absolute value is less than a horizontal threshold; determining the vertical MV difference of the plurality of pixels as zero upon determining that the maximum vertical absolute value is less than a vertical threshold; and determining the horizontal MV difference and the vertical MV difference of the plurality of pixels as zero upon determining that the maximum horizontal absolute value is less than the horizontal threshold and the maximum vertical absolute value is less than the vertical threshold.
  • the processor 1020 may obtain the MV difference based on a luma sub-block and obtain a prediction refinement for a chroma sub-block according to the MV difference obtained based on the luma sub-block.
  • the processor 1020 may obtain the prediction refinement for the chroma sub-block according to the MV difference obtained based on the luma sub-block in response to a chroma sub-block signaled with a flag indicating that the AMPR applies.
  • an apparatus for video coding includes one or more processors 1020; and a memory 1004 configured to store instructions executable by the one or more processors; where the processor, upon execution of the instructions, is configured to perform a method as illustrated in FIG. 11.
  • a non-transitory computer readable storage medium 1004 having instructions stored therein. When the instructions are executed by one or more processors 1020, the instructions cause the processor to perform a method as illustrated in FIG. 11.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
PCT/US2021/022640 2020-03-20 2021-03-16 Methods and devices for affine motion-compensated prediction refinement WO2021188598A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202180021435.2A CN115280779A (zh) 2020-03-20 2021-03-16 用于仿射运动补偿预测细化的方法和装置

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202062992897P 2020-03-20 2020-03-20
US62/992,897 2020-03-20
US202062993654P 2020-03-23 2020-03-23
US62/993,654 2020-03-23

Publications (1)

Publication Number Publication Date
WO2021188598A1 true WO2021188598A1 (en) 2021-09-23

Family

ID=77768297

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/022640 WO2021188598A1 (en) 2020-03-20 2021-03-16 Methods and devices for affine motion-compensated prediction refinement

Country Status (2)

Country Link
CN (1) CN115280779A (zh)
WO (1) WO2021188598A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114157868A (zh) * 2022-02-07 2022-03-08 杭州未名信科科技有限公司 视频帧的编码模式筛选方法、装置及电子设备
WO2023078449A1 (en) * 2021-11-08 2023-05-11 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for video processing

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018002021A1 (en) * 2016-06-30 2018-01-04 Thomson Licensing Video coding with adaptive motion information refinement
US20200007877A1 (en) * 2018-06-27 2020-01-02 Avago Technologies General Ip (Singapore) Pte. Ltd. Low complexity affine merge mode for versatile video coding
WO2020049540A1 (en) * 2018-09-08 2020-03-12 Beijing Bytedance Network Technology Co., Ltd. Affine mode in video coding and decoding

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018002021A1 (en) * 2016-06-30 2018-01-04 Thomson Licensing Video coding with adaptive motion information refinement
US20200007877A1 (en) * 2018-06-27 2020-01-02 Avago Technologies General Ip (Singapore) Pte. Ltd. Low complexity affine merge mode for versatile video coding
WO2020049540A1 (en) * 2018-09-08 2020-03-12 Beijing Bytedance Network Technology Co., Ltd. Affine mode in video coding and decoding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
J. CHEN, Y. YE, S. KIM: "Algorithm description for Versatile Video Coding and Test Model 8 (VTM 8)", 17. JVET MEETING; 20200107 - 20200117; BRUSSELS; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 3 March 2020 (2020-03-03), pages 1 - 92, XP030288000 *
Y.-C. YANG (FGINNOV), P.-H. LIN (FOXCONN): "CE4-related: On Conditions for enabling PROF", 15. JVET MEETING; 20190703 - 20190712; GOTHENBURG; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-O0313, 26 June 2019 (2019-06-26), pages 1 - 4, XP030219229 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023078449A1 (en) * 2021-11-08 2023-05-11 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for video processing
CN114157868A (zh) * 2022-02-07 2022-03-08 杭州未名信科科技有限公司 视频帧的编码模式筛选方法、装置及电子设备

Also Published As

Publication number Publication date
CN115280779A (zh) 2022-11-01

Similar Documents

Publication Publication Date Title
US20210377557A1 (en) Methods and apparatus of motion vector rounding, clipping and storage for interprediction
CN116506609B (zh) 用于在视频编码中用信号发送合并模式的方法和装置
WO2021188598A1 (en) Methods and devices for affine motion-compensated prediction refinement
WO2021030502A1 (en) Methods and apparatuses for adaptive motion vector resolution in video coding
WO2022032028A1 (en) Methods and apparatuses for affine motion-compensated prediction refinement
WO2022081878A1 (en) Methods and apparatuses for affine motion-compensated prediction refinement
KR102663465B1 (ko) 비디오 코딩을 위한 예측 종속 잔차 스케일링을 위한 방법 및 장치
CN114009017A (zh) 使用组合帧间和帧内预测进行运动补偿
WO2020257365A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
CN114342390B (zh) 用于仿射运动补偿的预测细化的方法和装置
US20240098290A1 (en) Methods and devices for overlapped block motion compensation for inter prediction
US20240015316A1 (en) Overlapped block motion compensation for inter prediction
WO2022026480A1 (en) Weighted ac prediction for video coding
WO2021248135A1 (en) Methods and apparatuses for video coding using satd based cost calculation
WO2021062283A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
WO2024006231A1 (en) Methods and apparatus on chroma motion compensation using adaptive cross-component filtering
WO2021007133A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
WO2021021698A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
WO2021188707A1 (en) Methods and apparatuses for simplification of bidirectional optical flow and decoder side motion vector refinement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21772205

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21772205

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 30/03/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21772205

Country of ref document: EP

Kind code of ref document: A1