CN114125466A - Inter-frame prediction method, encoder, decoder, and computer storage medium


Info

Publication number
CN114125466A
CN114125466A (application CN202010873979.9A)
Authority
CN
China
Prior art keywords
motion vector
determining
sub-block
deviation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010873979.9A
Other languages
Chinese (zh)
Inventor
谢志煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010873979.9A
Priority to TW110130848A
Publication of CN114125466A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a pixel
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/533 Motion estimation using multistep search, e.g. 2D-log search or one-at-a-time search [OTS]


Abstract

Embodiments of the present application disclose an inter-frame prediction method, an encoder, a decoder, and a computer storage medium. The decoder parses a bitstream to obtain a prediction mode parameter of a current block; when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, it determines a first motion vector of a sub-block of the current block, where the current block comprises a plurality of sub-blocks; it determines a first prediction value of the sub-block based on the first motion vector, and a first motion vector deviation between each pixel position in the sub-block and the sub-block; it determines, according to the first motion vector deviation, a center position and a second motion vector deviation corresponding to each pixel position; and, based on the first prediction value, it performs secondary prediction or PROF processing according to the center position and the second motion vector deviation, determines a second prediction value of the sub-block, and takes the second prediction value as the inter prediction value of the sub-block.

Description

Inter-frame prediction method, encoder, decoder, and computer storage medium
Technical Field
The present application relates to the field of video encoding and decoding technologies, and in particular, to an inter-frame prediction method, an encoder, a decoder, and a computer storage medium.
Background
In the field of video encoding and decoding, in order to balance performance and cost, affine prediction in Versatile Video Coding (VVC) and in China's Audio Video coding Standard (AVS) is generally implemented on a sub-block basis. Two improvements to sub-block-based affine prediction have been proposed: on the one hand, prediction refinement with optical flow (PROF), which corrects sub-block-based affine prediction using the optical-flow principle; on the other hand, secondary prediction (also called quadratic prediction), which obtains a more accurate prediction value by predicting again after the sub-block-based prediction. Specifically, PROF corrects the affine prediction result using the horizontal and vertical gradients at the reference position, while secondary prediction re-predicts each pixel position in the sub-block with a filter.
However, if the center position of the filter used for secondary prediction or PROF carries a large motion vector deviation, prediction accuracy can drop significantly. In other words, the existing PROF correction and secondary prediction methods are not rigorous: when used to improve affine prediction, they do not apply well to all scenarios, and coding performance still needs to be improved.
Disclosure of Invention
The application provides an inter-frame prediction method, an encoder, a decoder, and a computer storage medium, which can reduce prediction errors and significantly improve coding performance, thereby improving encoding and decoding efficiency.
The technical scheme of the application is realized as follows:
in a first aspect, an embodiment of the present application provides an inter-frame prediction method applied to a decoder, where the method includes:
parsing a bitstream to obtain a prediction mode parameter of a current block;
when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, determining a first motion vector of a sub-block of the current block, wherein the current block comprises a plurality of sub-blocks;
determining a first prediction value of the sub-block based on the first motion vector, and a first motion vector deviation between each pixel position in the sub-block and the sub-block;
determining, according to the first motion vector deviation, a center position and a second motion vector deviation corresponding to each pixel position;
and performing, based on the first prediction value, secondary prediction or PROF processing according to the center position and the second motion vector deviation, determining a second prediction value of the sub-block, and taking the second prediction value as the inter prediction value of the sub-block.
In a second aspect, an embodiment of the present application provides an inter-frame prediction method applied to an encoder, where the method includes:
determining a prediction mode parameter of a current block;
when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, determining a first motion vector of a sub-block of the current block, wherein the current block comprises a plurality of sub-blocks;
determining a first prediction value of the sub-block based on the first motion vector, and a first motion vector deviation between each pixel position in the sub-block and the sub-block;
determining, according to the first motion vector deviation, a center position and a second motion vector deviation corresponding to each pixel position;
and performing, based on the first prediction value, secondary prediction or PROF processing according to the center position and the second motion vector deviation, determining a second prediction value of the sub-block, and taking the second prediction value as the inter prediction value of the sub-block.
In a third aspect, an embodiment of the present application provides a decoder, where the decoder includes a parsing unit and a first determining unit;
the parsing unit is configured to parse a bitstream to obtain a prediction mode parameter of a current block;
the first determining unit is configured to: determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, wherein the current block comprises a plurality of sub-blocks; determine a first prediction value of the sub-block based on the first motion vector, and a first motion vector deviation between each pixel position in the sub-block and the sub-block; determine, according to the first motion vector deviation, a center position and a second motion vector deviation corresponding to each pixel position; and perform, based on the first prediction value, secondary prediction or PROF processing according to the center position and the second motion vector deviation, determine a second prediction value of the sub-block, and take the second prediction value as the inter prediction value of the sub-block.
In a fourth aspect, an embodiment of the present application provides a decoder comprising a first processor and a first memory storing instructions executable by the first processor, where the instructions, when executed, implement the inter-frame prediction method described above.
In a fifth aspect, an embodiment of the present application provides an encoder, where the encoder includes a second determining unit;
the second determining unit is configured to: determine a prediction mode parameter of a current block; determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, wherein the current block comprises a plurality of sub-blocks; determine a first prediction value of the sub-block based on the first motion vector, and a first motion vector deviation between each pixel position in the sub-block and the sub-block; determine, according to the first motion vector deviation, a center position and a second motion vector deviation corresponding to each pixel position; and perform, based on the first prediction value, secondary prediction or PROF processing according to the center position and the second motion vector deviation, determine a second prediction value of the sub-block, and take the second prediction value as the inter prediction value of the sub-block.
In a sixth aspect, an embodiment of the present application provides an encoder comprising a second processor and a second memory storing instructions executable by the second processor, where the instructions, when executed, implement the inter-frame prediction method described above.
In a seventh aspect, an embodiment of the present application provides a computer storage medium storing a computer program which, when executed by the first processor or the second processor, implements the inter-frame prediction method described above.
According to the inter-frame prediction method, the encoder, the decoder, and the computer storage medium, the decoder parses the bitstream to obtain the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter prediction value of the current block is determined using an inter prediction mode, it determines a first motion vector of a sub-block of the current block, where the current block comprises a plurality of sub-blocks; it determines a first prediction value of the sub-block based on the first motion vector, and a first motion vector deviation between each pixel position in the sub-block and the sub-block; it determines, according to the first motion vector deviation, a center position and a second motion vector deviation corresponding to each pixel position; and, based on the first prediction value, it performs secondary prediction or PROF processing according to the center position and the second motion vector deviation, determines a second prediction value of the sub-block, and takes the second prediction value as the inter prediction value of the sub-block. The encoder determines the prediction mode parameter of the current block and then proceeds in the same way. That is, with the inter-frame prediction method proposed in the present application, after sub-block-based prediction, for pixel positions whose motion vector deviates greatly from the sub-block motion vector (the first motion vector deviation), the center position and the second motion vector deviation used for secondary prediction or PROF processing can be re-determined according to the first motion vector deviation, so that point-based secondary prediction using the center position and the second motion vector deviation can be performed on top of the sub-block-based first prediction value to obtain the second prediction value. The inter-frame prediction method provided by the application therefore applies well to all scenarios, reduces prediction errors, and significantly improves coding performance and thus encoding and decoding efficiency.
Drawings
FIG. 1 is a first schematic diagram of an affine model;
FIG. 2 is a second schematic diagram of an affine model;
FIG. 3 is a schematic diagram of pixel interpolation;
FIG. 4 is a first schematic diagram of sub-block interpolation;
FIG. 5 is a second schematic diagram of sub-block interpolation;
FIG. 6 is a schematic diagram of the motion vector of each sub-block;
FIG. 7 is a schematic diagram of sample positions;
FIG. 8 is a first schematic diagram of the center position;
FIG. 9 is a second schematic diagram of the center position;
FIG. 10 is a schematic block diagram of a video encoding system according to an embodiment of the present application;
FIG. 11 is a schematic block diagram of a video decoding system according to an embodiment of the present application;
FIG. 12 is a first flowchart of an implementation of an inter-frame prediction method;
FIG. 13 is a third schematic diagram of the center position;
FIG. 14 is a second flowchart of an implementation of an inter-frame prediction method;
FIG. 15 is a third flowchart of an implementation of an inter-frame prediction method;
FIG. 16 is a fourth flowchart of an implementation of an inter-frame prediction method;
FIG. 17 is a schematic diagram of a two-dimensional filter;
FIG. 18 is a fifth flowchart of an implementation of an inter-frame prediction method;
FIG. 19 is a sixth flowchart of an implementation of an inter-frame prediction method;
FIG. 20 is a first schematic structural diagram of a decoder;
FIG. 21 is a second schematic structural diagram of a decoder;
FIG. 22 is a first schematic structural diagram of an encoder;
FIG. 23 is a second schematic structural diagram of an encoder.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. It is to be understood that the specific embodiments described herein merely illustrate the application and do not limit it. It should also be noted that, for convenience of description, only the parts relevant to the application are shown in the drawings.
Current mainstream video coding standards adopt a block-based hybrid coding framework. Each frame of a video is divided into square Largest Coding Units (LCUs) of equal size (e.g., 128×128 or 64×64), and each LCU may be further divided into rectangular Coding Units (CUs) according to certain rules; a CU may be further divided into smaller Prediction Units (PUs). Specifically, the hybrid coding framework may include modules such as prediction, transform, quantization, entropy coding, and in-loop filtering; the prediction module may include intra prediction and inter prediction, and inter prediction may include motion estimation and motion compensation. Because adjacent pixels within one frame of a video are strongly correlated, intra prediction is used in video coding to remove the spatial redundancy between adjacent pixels; and because adjacent frames of a video are strongly similar, inter prediction is used to remove the temporal redundancy between adjacent frames, thereby improving coding efficiency. The remainder of this application describes inter prediction in detail.
Inter-frame prediction uses already encoded/decoded frames to predict the part of the current frame to be encoded/decoded; in a block-based coding framework this part is usually a coding unit or a prediction unit, collectively referred to herein as the current block. Translational motion is a common and simple type of motion in video, so prediction of translational motion is the traditional prediction method in video coding. Translational motion can be understood as a piece of content moving from one position in one frame to another position in another frame over time. Simple uni-directional translational prediction can be represented by a Motion Vector (MV) between some reference frame and the current frame: using motion information consisting of a reference frame and a motion vector, the current block finds a reference block of the same size in the reference frame and uses it as the prediction block of the current block. In ideal translational motion, the content of the current block undergoes no deformation, rotation, or change in luminance or color between frames; the content of real video, however, does not always match this ideal. Bi-directional prediction can alleviate this to some extent. Bi-directional prediction here generally means bi-directional translational prediction: using motion information consisting of two reference frames and two motion vectors, two reference blocks of the same size as the current block are found (the two reference frames may be the same frame), and the prediction block of the current block is generated from them, for example by averaging, weighted averaging, or some other calculation.
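As a small illustrative sketch (not taken from the patent text), plain averaging of two prediction blocks with rounding can be written as follows:

/* Illustrative sketch: generate a bi-directional prediction block by
 * averaging two same-sized reference blocks pred0 and pred1 (n samples). */
void bi_average(const int *pred0, const int *pred1, int *pred, int n)
{
    for (int i = 0; i < n; i++)
        pred[i] = (pred0[i] + pred1[i] + 1) >> 1;  /* rounded average */
}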
In this application, prediction may be considered part of motion compensation; some documents refer to what is called prediction here as motion compensation, and refer to affine prediction as affine motion compensation.
Rotation, zooming, warping, and other non-translational changes are also common in video, but ordinary translational prediction does not handle them well, so affine prediction models have been adopted in video codecs, e.g., affine in VVC and AVS3 (the two affine prediction models are similar). Under rotation, enlargement, reduction, warping, or other deformation, the current block cannot use the same MV for all points, so an MV needs to be derived for each point. The affine prediction model derives the MV of each point by calculation from a small number of parameters. The affine prediction models of VVC and AVS3 both use a 2-control-point (4-parameter) model and a 3-control-point (6-parameter) model: the 2 control points are at the top-left and top-right corners of the current block, and the 3 control points are at the top-left, top-right, and bottom-left corners. Illustratively, FIG. 1 and FIG. 2 are schematic diagrams of the two affine models. Since each MV comprises an x component and a y component, 2 control points give 4 parameters and 3 control points give 6 parameters.
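As an illustrative aid (not part of the patent text), the per-pixel MV derivation of both models can be sketched in floating point as follows; real codecs use the fixed-point form quoted later from the AVS3 standard text:

/* Sketch of affine MV derivation: (x, y) is the pixel position relative to
 * the top-left corner of the current block, W and H the block dimensions. */
typedef struct { double x, y; } MV;

MV affine_mv(MV mv0, MV mv1, MV mv2, int three_control_points,
             double x, double y, double W, double H)
{
    double dHorX = (mv1.x - mv0.x) / W;   /* MV change per horizontal step */
    double dHorY = (mv1.y - mv0.y) / W;
    double dVerX, dVerY;
    if (three_control_points) {           /* 6-parameter model */
        dVerX = (mv2.x - mv0.x) / H;
        dVerY = (mv2.y - mv0.y) / H;
    } else {                              /* 4-parameter model */
        dVerX = -dHorY;                   /* models rotation and zoom */
        dVerY =  dHorX;
    }
    MV mv = { mv0.x + dHorX * x + dVerX * y,
              mv0.y + dHorY * x + dVerY * y };
    return mv;
}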
According to the affine prediction model, an MV can be derived for each pixel position, and each pixel position can find its corresponding position in the reference frame; if that position is not an integer pixel position, the value of the sub-pixel position must be obtained by interpolation. Interpolation in video coding standards is usually implemented with Finite Impulse Response (FIR) filters, which is relatively costly. For example, AVS3 uses an 8-tap interpolation filter for the luma component, with 1/4-pixel sub-pel precision in the normal mode and 1/16-pixel precision in the affine mode. Interpolating each sub-pixel point at 1/16-pixel precision requires 8 integer pixels in the horizontal direction and 8 in the vertical direction, i.e., an 8×8 region of 64 integer pixels. FIG. 3 illustrates pixel interpolation: the circle is the sub-pixel point to be obtained, the dark square is the integer pixel position corresponding to that sub-pixel, and the vector between the two is the motion vector of the sub-pixel; the light squares are the pixels required to interpolate the circled sub-pixel position. To obtain the value of the sub-pixel position, the pixel values of this 8×8 region of light squares (which includes the dark pixel position) are required.
In conventional translational prediction, the MV is the same for every pixel position of the current block. The concept of sub-blocks can further be introduced, with sub-block sizes such as 4×4 or 8×8. FIG. 4 shows the pixel region required to interpolate a 4×4 sub-block, and FIG. 5 shows the region required for an 8×8 sub-block.
If the MV of every pixel position in a sub-block is the same, the pixel positions in the sub-block can be interpolated together: they share bandwidth, use filters of the same phase, and share the intermediate values of the interpolation process. If instead each pixel used its own MV, bandwidth would increase, filters of different phases might be needed, and the intermediate values of the interpolation process could not be shared.
Point-based affine prediction is costly; therefore, to balance performance and cost, affine prediction in VVC and AVS3 is implemented on a sub-block basis. AVS3 uses sub-block sizes of 4×4 and 8×8, while VVC uses 4×4. Each sub-block has one MV, shared by all pixel positions within it, so all pixel positions inside a sub-block are interpolated together. In this way, the motion compensation complexity of sub-block-based affine prediction is similar to that of other sub-block-based prediction methods.
It can be seen that in sub-block-based affine prediction, the pixel positions inside a sub-block share the same MV, and the shared MV is determined as the MV at the center of the current sub-block. For sub-blocks with an even number of pixels in the vertical and horizontal directions, such as 4×4 and 8×8, the center falls at a non-integer pixel position, so the current standards take a nearby integer position instead: for a 4×4 sub-block, the pixel position (2, 2) relative to the top-left corner; for an 8×8 sub-block, the position (4, 4).
The affine prediction model can derive the MV of each pixel position from the control points (2 or 3) used by the current block. In sub-block-based affine prediction, the MV computed at the position described in the preceding paragraph is used as the MV of the sub-block. FIG. 6 shows the motion vector of each sub-block: to derive it, the motion vector of the center sample of each sub-block is computed, rounded to 1/16 precision, and then used for motion compensation.
With the development of the technology, a prediction refinement technique using optical flow, PROF, was proposed. This technique improves the prediction value of block-based affine prediction without increasing bandwidth. After sub-block-based affine prediction is finished, the horizontal and vertical gradients of each predicted pixel are calculated. In VVC, PROF computes gradients with a 3-tap filter [-1, 0, 1], in the same way as Bi-directional Optical Flow (BDOF). Then, for each pixel position, the motion vector deviation is calculated, i.e., the difference between the motion vector of the current pixel position and the MV used by the whole sub-block. These deviations can all be computed from the formula of the affine prediction model; owing to the structure of the formula, the deviations at corresponding positions of certain sub-blocks are identical, so only one set of deviations needs to be computed and the other sub-blocks can directly reuse it. For each pixel position, the horizontal and vertical gradients and the motion vector deviation (horizontal and vertical components) of the point are used to compute a correction to the prediction value; adding this correction to the original, sub-block-based prediction value yields the corrected prediction value.
When calculating the horizontal and vertical gradients, the [-1, 0, 1] filter is used: for the current pixel position, the prediction values of the positions one pixel to the left and one pixel to the right are used in the horizontal direction, and those one pixel above and one pixel below in the vertical direction. If the current pixel position lies on the boundary of the current block, some of these positions fall one pixel outside the block; they are padded with the prediction value at the block boundary so that the gradient can still be computed, and no prediction values beyond the block boundary need to be fetched. Since the gradient calculation uses only the sub-block-based prediction values, no additional bandwidth is required.
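A minimal floating-point sketch of this refinement (not the standard's fixed-point formulation) is given below; the division by 2 normalizes the two-sample span of the [-1, 0, 1] gradient filter, and the padding described above is assumed to be in place:

#include <math.h>

/* PROF-style refinement of one W×H sub-block. P points at the top-left
 * prediction sample of a buffer padded by one sample on each side, with row
 * length stride; dmvx/dmvy hold the per-position MV deviation in pixels. */
void prof_refine(const int *P, int stride, int W, int H,
                 const double *dmvx, const double *dmvy, int *out)
{
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            const int *p = P + y * stride + x;
            double gx = (p[1] - p[-1]) / 2.0;            /* horizontal gradient */
            double gy = (p[stride] - p[-stride]) / 2.0;  /* vertical gradient   */
            double dI = gx * dmvx[y * W + x] + gy * dmvy[y * W + x];
            out[y * W + x] = p[0] + (int)lround(dI);     /* corrected prediction */
        }
    }
}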
In the VVC standard text, the MV of each sub-block of the current block is derived from the control-point MVs, as is the motion vector deviation of each pixel position within the sub-block. In VVC, each sub-block uses the same relative pixel position for its sub-block MV, so only one group of per-position motion vector deviations needs to be derived and the other sub-blocks can reuse it. The VVC description of the PROF flow, including PROF's calculation of the motion vector deviation, is part of this derivation.
Affine prediction in AVS3 follows the same basic principle as in VVC when calculating sub-block MVs, but AVS3 treats the top-left sub-block A, the top-right sub-block B, and the bottom-left sub-block C of the current block specially.
The following is the AVS3 standard-text derivation of the affine motion unit sub-block motion vector array:
If the affine control-point motion vector group contains 3 motion vectors, it is denoted mvsAffine(mv0, mv1, mv2); if it contains 2 motion vectors, it is denoted mvsAffine(mv0, mv1). The affine motion unit sub-block motion vector array can then be derived as follows:
1. Calculate the variables dHorX, dVerX, dHorY, and dVerY:
dHorX=(mv1_x-mv0_x)<<(7-Log(width));
dHorY=(mv1_y-mv0_y)<<(7-Log(width));
if the motion vector group is mvsAffine (mv0, mv1, mv2), then:
dVerX=(mv2_x-mv0_x)<<(7-Log(height));
dVerY=(mv2_y-mv0_y)<<(7-Log(height));
if the motion vector group is mvsAffine (mv0, mv1), then:
dVerX=-dHorY;
dVerY=dHorX;
it should be noted that fig. 7 is a sample position diagram, as shown in fig. 7, (xE, yE) is a position of an upper left corner sample of a luma prediction block of a current prediction unit in a luma sample matrix of a current image, a width and a height of the current prediction unit are width and height, respectively, a width and a height of each sub-block are sub-width and sub-height, respectively, a sub-block where the upper left corner sample of the luma prediction block of the current prediction unit is located is a, a sub-block where the upper right corner sample is located is B, and a sub-block where the lower left corner sample is located is C.
2.1. If the prediction reference mode of the current prediction unit is 'PRED_List01' or AffineSubblockSizeFlag is equal to 1 (AffineSubblockSizeFlag indicates the sub-block size), then subwidth and subheight are both equal to 8, and (x, y) is the coordinate of the top-left corner of a sub-block of size 8×8; the motion vector mvE(mvE_x, mvE_y) of each 8×8 luma sub-block is calculated as follows:
if the sub-block is A, both xPos and yPos are equal to 0;
if the sub-block is B, xPos equals width and yPos equals 0;
if the subblock is C and there are 3 motion vectors in mvsAffine, xPos equals 0 and yPos equals height;
otherwise, xPos equals (x-xE)+4 and yPos equals (y-yE)+4;
thus, the motion vector mvE for the current 8x8 sub-block is:
mvE_x=Clip3(-131072,131071,Rounding((mv0_x<<7)+dHorX×xPos+dVerX×yPos,7));
mvE_y=Clip3(-131072,131071,Rounding((mv0_y<<7)+dHorY×xPos+dVerY×yPos,7));
2.2. If the prediction reference mode of the current prediction unit is 'PRED_List0' or 'PRED_List1' and AffineSubblockSizeFlag is equal to 0, then subwidth and subheight are both equal to 4, (x, y) is the coordinate of the top-left corner of a sub-block of size 4×4, and the motion vector mvE(mvE_x, mvE_y) of each 4×4 luma sub-block is calculated:
if the sub-block is A, both xPos and yPos are equal to 0;
if the sub-block is B, xPos equals width and yPos equals 0;
if the sub-block is C and there are 3 motion vectors in mvsAffine, xPos equals 0 and yPos equals height;
otherwise, xPos equals (x-xE)+2 and yPos equals (y-yE)+2;
thus, the motion vector mvE for the current 4x4 sub-block is:
mvE_x=Clip3(-131072,131071,Rounding((mv0_x<<7)+dHorX×xPos+dVerX×yPos,7));
mvE_y=Clip3(-131072,131071,Rounding((mv0_y<<7)+dHorY×xPos+dVerY×yPos,7)).
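As an illustrative aid (not from the standard text), the 4×4 case of this derivation can be sketched in C as follows; Log() in the text above is base-2, and Rounding() is assumed here to be a sign-symmetric round-to-nearest right shift:

#define CLIP3(lo, hi, v) ((v) < (lo) ? (lo) : ((v) > (hi) ? (hi) : (v)))

static int rounding(int x, int s)   /* assumed AVS3-style rounding */
{
    return x >= 0 ? (x + (1 << (s - 1))) >> s
                  : -((-x + (1 << (s - 1))) >> s);
}

/* mv0 and dHorX/dHorY/dVerX/dVerY as derived above; (xRel, yRel) = (x-xE, y-yE).
 * is_a / is_b / is_c3 select the special sub-blocks A, B, and C (C only when
 * mvsAffine has 3 motion vectors). */
void subblock_mv_4x4(int mv0_x, int mv0_y,
                     int dHorX, int dHorY, int dVerX, int dVerY,
                     int xRel, int yRel, int is_a, int is_b, int is_c3,
                     int width, int height, int *mvE_x, int *mvE_y)
{
    int xPos, yPos;
    if (is_a)       { xPos = 0;        yPos = 0;      }   /* top-left sub-block  */
    else if (is_b)  { xPos = width;    yPos = 0;      }   /* top-right sub-block */
    else if (is_c3) { xPos = 0;        yPos = height; }   /* bottom-left, 3 CPs  */
    else            { xPos = xRel + 2; yPos = yRel + 2; } /* sub-block centre    */
    *mvE_x = CLIP3(-131072, 131071,
                   rounding((mv0_x << 7) + dHorX * xPos + dVerX * yPos, 7));
    *mvE_y = CLIP3(-131072, 131071,
                   rounding((mv0_y << 7) + dHorY * xPos + dVerY * yPos, 7));
}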
the following is a description of the AVS3 text for affine prediction sample derivation and luma, chroma sample interpolation:
if the position of the top left sample of the luma prediction block of the current prediction unit in the luma sample matrix of the current picture is (xE, yE).
If the prediction reference mode of the current prediction unit is 'PRED_List0' and the value of AffineSubblockSizeFlag is 0, mv0E0 is the L0 motion vector of the 4×4 unit at position (xE+x, yE+y) in the L0 motion vector array MvArrayL0. The value of element PredMatrixL0[x][y] in the luma prediction sample matrix is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix of the reference picture with reference index RefIdxL0 in reference picture queue 0, and the value of the corresponding element in the chroma prediction sample matrix is the sample value at position (((xE+2x)<<4)+MvC_x, ((yE+2y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix of that reference picture. Here x1 = ((xE+2x)>>3)<<3 and y1 = ((yE+2y)>>3)<<3; mv1E0 is the L0 motion vector of the 4×4 unit at position (x1, y1) in MvArrayL0, mv2E0 that at (x1+4, y1), mv3E0 that at (x1, y1+4), and mv4E0 that at (x1+4, y1+4), with:
MvC_x=(mv1E0_x+mv2E0_x+mv3E0_x+mv4E0_x+2)>>2
MvC_y=(mv1E0_y+mv2E0_y+mv3E0_y+mv4E0_y+2)>>2
If the prediction reference mode of the current prediction unit is 'PRED_List0' and the value of AffineSubblockSizeFlag is 1, mv0E0 is the L0 motion vector of the 8×8 unit at position (xE+x, yE+y) in MvArrayL0. The value of element PredMatrixL0[x][y] in the luma prediction sample matrix is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix of the reference picture with reference index RefIdxL0 in reference picture queue 0, and the value of the corresponding element in the chroma prediction sample matrix is the sample value at position (((xE+2x)<<4)+MvC_x, ((yE+2y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix of that reference picture, where MvC_x equals mv0E0_x and MvC_y equals mv0E0_y.
If the prediction reference mode of the current prediction unit is 'PRED_List1' and the value of AffineSubblockSizeFlag is 0, mv0E1 is the L1 motion vector of the 4×4 unit at position (xE+x, yE+y) in the L1 motion vector array MvArrayL1. The value of element PredMatrixL1[x][y] in the luma prediction sample matrix is the sample value at position (((xE+x)<<4)+mv0E1_x, ((yE+y)<<4)+mv0E1_y) in the 1/16-precision luma sample matrix of the reference picture with reference index RefIdxL1 in reference picture queue 1, and the value of the corresponding element in the chroma prediction sample matrix is the sample value at position (((xE+2x)<<4)+MvC_x, ((yE+2y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix of that reference picture. Here x1 = ((xE+2x)>>3)<<3 and y1 = ((yE+2y)>>3)<<3; mv1E1 is the L1 motion vector of the 4×4 unit at position (x1, y1) in MvArrayL1, mv2E1 that at (x1+4, y1), mv3E1 that at (x1, y1+4), and mv4E1 that at (x1+4, y1+4), with:
MvC_x=(mv1E1_x+mv2E1_x+mv3E1_x+mv4E1_x+2)>>2
MvC_y=(mv1E1_y+mv2E1_y+mv3E1_y+mv4E1_y+2)>>2
If the prediction reference mode of the current prediction unit is 'PRED_List1' and the value of AffineSubblockSizeFlag is 1, mv0E1 is the L1 motion vector of the 8×8 unit at position (xE+x, yE+y) in MvArrayL1. The value of element PredMatrixL1[x][y] in the luma prediction sample matrix is the sample value at position (((xE+x)<<4)+mv0E1_x, ((yE+y)<<4)+mv0E1_y) in the 1/16-precision luma sample matrix of the reference picture with reference index RefIdxL1 in reference picture queue 1, and the value of the corresponding element in the chroma prediction sample matrix is the sample value at position (((xE+2x)<<4)+MvC_x, ((yE+2y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix of that reference picture, where MvC_x equals mv0E1_x and MvC_y equals mv0E1_y.
If the prediction reference mode of the current prediction unit is 'PRED_List01', mv0E0 is the L0 motion vector of the 8×8 unit at position (xE+x, yE+y) in MvArrayL0, and mv0E1 is the L1 motion vector of the 8×8 unit at position (xE+x, yE+y) in MvArrayL1. The value of element PredMatrixL0[x][y] in the luma prediction sample matrix is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix of the reference picture with reference index RefIdxL0 in reference picture queue 0, and the value of the corresponding element in the list-0 chroma prediction sample matrix is the sample value at position (((xE+2x)<<4)+MvC0_x, ((yE+2y)<<4)+MvC0_y) in the 1/32-precision chroma sample matrix of that reference picture. Likewise, the value of element PredMatrixL1[x][y] in the luma prediction sample matrix is the sample value at position (((xE+x)<<4)+mv0E1_x, ((yE+y)<<4)+mv0E1_y) in the 1/16-precision luma sample matrix of the reference picture with reference index RefIdxL1 in reference picture queue 1, and the value of the corresponding element in the list-1 chroma prediction sample matrix is the sample value at position (((xE+2x)<<4)+MvC1_x, ((yE+2y)<<4)+MvC1_y) in the 1/32-precision chroma sample matrix of that reference picture. Here MvC0_x equals mv0E0_x, MvC0_y equals mv0E0_y, MvC1_x equals mv0E1_x, and MvC1_y equals mv0E1_y.
The element values at the positions in the 1/16-precision luma sample matrix and the 1/32-precision chroma sample matrix of the reference picture are obtained by the interpolation methods defined below for affine luma and affine chroma sample interpolation. Integer samples outside the reference picture are replaced with the nearest integer sample (edge or corner sample) inside the picture; i.e., motion vectors may point to samples outside the reference picture.
Specifically, the affine luminance sample interpolation process is as follows:
a, B, C, D are adjacent integer pixel samples, dx and dy are the horizontal and vertical distances between the integer pixel sample A and its peripheral sub-pixel samples a (dx, dy), dx is equal to fx&15, dy equals fy&15, where (fx, fy) is the coordinate of the sub-pixel sample in the luminance sample matrix with accuracy 1/16. Integer pixel Ax,y255 sub-pixels around (a) are sample ax,y(dx,dy)。
In particular, the sample position ax,0(x is 1-15) is obtained by filtering 8 integer values closest to the interpolation point in the horizontal direction, and the predicted value is obtained by the following method:
ax,0=Clip1((fL[x][0]×A-3,0+fL[x][1]×A-2,0+fL[x][2]×A-1,0+fL[x][3]×A0,0+fL[x][4]×A1,0+fL[x][5]×A2,0+fL[x][6] ×A3,0+fL[x][7]×A4,0+32)>>6)。
in particular, the sample position a0,y(y 1-15) filtering by 8 integer values closest to the interpolation point in the vertical directionThe predicted value is obtained in the following manner:
a0,y=Clip1((fL[y][0]×A0,-3+fL[y][1]×A-2,0+fL[y][2]×A-1,0+fL[y][3]×A0,0+fL[y][4]×A1,0+fL[y][5]×A2,0+fL[y][6]×A3,0+fL[y][7]×A-4,0+32)>>6)。
in particular, the sample position ax,yThe predicted values (x1 to 15, y1 to 15) are obtained as follows:
ax,y=Clip1((fL[y][0]×a'x,y-3+fL[y][1]×a'x,y-2+fL[y][2]×a'x,y-1+fL[y][3]×a'x,y+fL[y][4]×a'x,y+1+fL[y][5]×a'x,y+2+fL[y][6]×a'x,y+3+fL[y][7]×a'x,y+4+(1<<(19-BitDepth)))>>(20-BitDepth))。
wherein:
a'x,y=(fL[x][0]×A-3,y+fL[x][1]×A-2,y+fL[x][2]×A-1,y+fL[x][3]×A0,y+fL[x][4]×A1,y+fL[x][5]×A2,y+fL[x][6]×A3,y+fL[x][7]×A4,y+((1<<(BitDepth-8))>>1))>>(BitDepth-8)。
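As an illustrative aid (not part of the standard text), the three cases above can be implemented as a separable two-pass 8-tap filter. The sketch below assumes 8-bit content, assumes row 0 of the coefficient table fL is the identity filter [0, 0, 0, 64, 0, 0, 0, 0] so that the pure-horizontal and pure-vertical cases fall out of the general path, and applies the edge-replication rule described above; the fL values themselves are those of Table 1 and are not reproduced here.

#include <stdint.h>

extern const int16_t fL[16][8];   /* luma taps of Table 1; row 0 assumed identity */

/* Replicate edge samples for coordinates outside the reference picture. */
static int ref_pel(const uint8_t *ref, int stride, int w, int h, int x, int y)
{
    x = x < 0 ? 0 : (x >= w ? w - 1 : x);
    y = y < 0 ? 0 : (y >= h ? h - 1 : y);
    return ref[y * stride + x];
}

/* 1/16-pel luma sample at integer position (xInt, yInt), phase (dx, dy). */
int interp_luma_8bit(const uint8_t *ref, int stride, int w, int h,
                     int xInt, int yInt, int dx, int dy)
{
    int tmp[8];
    for (int k = 0; k < 8; k++) {          /* horizontal pass over 8 rows */
        int s = 0;
        for (int i = 0; i < 8; i++)
            s += fL[dx][i] * ref_pel(ref, stride, w, h,
                                     xInt - 3 + i, yInt - 3 + k);
        tmp[k] = s;                        /* no shift here for 8-bit content */
    }
    int s = 0;                             /* vertical pass */
    for (int k = 0; k < 8; k++)
        s += fL[dy][k] * tmp[k];
    s = (s + (1 << 11)) >> 12;             /* the two 6-bit normalisations of
                                              the text collapse to one shift */
    return s < 0 ? 0 : (s > 255 ? 255 : s);   /* Clip1 for 8 bits */
}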
the luminance interpolation filter coefficients are shown in table 1:
Table 1 (luma interpolation filter coefficients fL; the coefficient values are given as an image in the original publication)
Specifically, the affine chroma sample interpolation process is as follows:
a, B, C, D are adjacent integer pixel samples, dx and dy are the horizontal and vertical distances between the integer pixel sample A and its peripheral sub-pixel samples a (dx, dy), dx is equal to fx&31, dy equals fy&31, (fx, fy) is the coordinate of the sub-pixel sample in the chroma sample matrix with accuracy 1/32. Integer pixel Ax,yHas 1023 sub-pixel samples ax,y(dx,dy)。
Specifically, for a sub-pixel point where dx is equal to 0 or dy is equal to 0, it can be directly interpolated with chroma integer pixels, and for a point where dx is not equal to 0 and dy is not equal to 0, it is calculated using sub-pixels on the integer pixel row (dy is equal to 0):
if(dx==0){
ax,y(0,dy)=Clip3(0,(1<<BitDepth)-1,(fC[dy][0]×Ax,y-1+fC[dy][1]×Ax,y+fC[dy][2]×Ax,y+1+fC[dy][3]×Ax,y+2+32)>>6)
}
else if(dy==0){
ax,y(dx,0)=Clip3(0,(1<<BitDepth)-1,(fC[dx][0]×Ax-1,y+fC[dx][1]×Ax,y+fC[dx][2]×Ax+1,y+fC[dx][3]×Ax+2,y+32)>>6)
}
else{
ax,y(dx,dy)=Clip3(0,(1<<BitDepth)-1,(C[dy][0]×a'x,y-1(dx,0)+C[dy][1]×a'x,y(dx,0)+C[dy][2]×a'x,y+1 (dx,0)+C[dy][3]×a'x,y+2(dx,0)+(1<<(19-BitDepth)))>>(20-BitDepth))
}
wherein, a'x,y(dx, 0) is the temporary value for a sub-pixel on the integer pixel row, defined as: a'x,y(dx,0)=(fC[dx][0]×Ax-1,y+fC[dx][1]×Ax,y+fC[dx][2]×Ax+1y+fC[dx][3]×Ax+2,y+((1<<(BitDepth-8))>>1))>>(BitDepth-8)。
The chroma interpolation filter coefficients are shown in table 2:
Table 2 (chroma interpolation filter coefficients fC; the coefficient values are given as an image in the original publication)
A common affine prediction method may include the following steps:
Step 101: determine the motion vectors of the control points.
Step 102: determine the motion vector of the sub-block according to the control-point motion vectors.
Step 103: predict the sub-block according to the sub-block motion vector.
At present, improving the prediction value of block-based affine prediction with PROF may specifically include the following steps:
Step 101: determine the motion vectors of the control points.
Step 102: determine the motion vector of the sub-block according to the control-point motion vectors.
Step 103: predict the sub-block according to the sub-block motion vector.
Step 104: determine the deviation between each position in the sub-block and the sub-block motion vector, according to the control-point motion vectors and the sub-block motion vector.
Step 106: calculate the horizontal and vertical gradients of each position using the sub-block-based prediction values.
Step 107: calculate, using the optical-flow principle, the prediction offset of each position from its motion vector deviation and its horizontal and vertical gradients.
Step 108: add the prediction offset to the sub-block-based prediction value of each position to obtain the corrected prediction value.
Alternatively, the PROF improvement of block-based affine prediction may include the following steps:
Step 101: determine the motion vectors of the control points.
Step 109: determine, according to the control-point motion vectors, the motion vector of the sub-block and the deviation between each position in the sub-block and the sub-block motion vector.
Step 103: predict the sub-block according to the sub-block motion vector.
Step 106: calculate the horizontal and vertical gradients of each position using the sub-block-based prediction values.
Step 107: calculate, using the optical-flow principle, the prediction offset of each position from its motion vector deviation and its horizontal and vertical gradients.
Step 108: add the prediction offset to the sub-block-based prediction value of each position to obtain the corrected prediction value.
PROF corrects sub-block-based affine prediction using the optical-flow principle, thereby improving compression performance. However, the optical-flow calculation of PROF is only effective when the deviation between the motion vector of a pixel position inside the sub-block and the sub-block motion vector is very small. Because PROF depends on the horizontal and vertical gradients at the reference position, when the actual position is far from the reference position those gradients no longer truly reflect the gradients between the reference position and the actual position; the method is therefore much less effective when the deviation between the pixel's motion vector and the sub-block motion vector is large.
A secondary prediction method has therefore been proposed to overcome this defect of PROF. The decoder may perform secondary prediction as follows:
Step 201: predict the sub-block according to the sub-block motion vector to obtain a prediction value.
Step 202: determine the motion vector deviation between each position in the sub-block and the sub-block.
Step 203: filter the prediction value with a two-dimensional filter according to the motion vector deviation of each position to obtain the secondary-prediction value.
That is, after sub-block-based prediction, the secondary prediction method performs point-based secondary prediction on top of the sub-block-based prediction for the pixel positions whose motion vector deviates from the sub-block motion vector, and finally corrects the prediction value to obtain a new prediction value, the secondary-prediction value.
Specifically, point-based secondary prediction uses a two-dimensional filter, composed of adjacent points forming a preset shape; the adjacent points may be 9 points. For a pixel position, the result of the filtering is the secondary-prediction value of that position. The filter coefficients of the two-dimensional filter are determined by the motion vector deviation of each position; the filter input is the sub-block-based prediction value, and its output is the secondary-prediction value.
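The exact coefficient derivation of the 9-point filter is not given in this excerpt; the sketch below is therefore only a hypothetical illustration that builds the 3×3 coefficients as a separable bilinear kernel from the per-pixel deviation (dmx, dmy), both assumed to lie within (-1, 1) pixel:

#include <math.h>

/* Triangular 3-tap weights for one direction; d in (-1, 1), weights sum to 1. */
static void weights3(double d, double w[3])
{
    w[0] = d < 0 ? -d : 0;    /* left / top neighbour     */
    w[2] = d > 0 ?  d : 0;    /* right / bottom neighbour */
    w[1] = 1.0 - fabs(d);     /* centre sample            */
}

/* Hypothetical point-based secondary prediction of one pixel: P points at
 * that pixel inside a prediction block padded by one sample on each side. */
int secondary_predict(const int *P, int stride, double dmx, double dmy)
{
    double wx[3], wy[3], acc = 0.0;
    weights3(dmx, wx);
    weights3(dmy, wy);
    for (int j = -1; j <= 1; j++)         /* 3x3 (9-point) filter */
        for (int i = -1; i <= 1; i++)
            acc += wy[j + 1] * wx[i + 1] * P[j * stride + i];
    return (int)lround(acc);
}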
Further, if the affine mode is used for the current block, the motion vectors of the control points need to be determined first. The decoder may then perform secondary prediction as follows:
Step 204: determine the motion vectors of the control points.
Step 205: determine the motion vector of the sub-block according to the control-point motion vectors.
Step 201: predict the sub-block according to the sub-block motion vector to obtain a prediction value.
Step 202: determine the motion vector deviation between each position in the sub-block and the sub-block.
Step 203: filter the prediction value with a two-dimensional filter according to the motion vector deviation of each position to obtain the secondary-prediction value.
It can thus be seen that, in the embodiments of the present application, after the control-point motion vectors are determined, the sub-block can be predicted using them; once the deviation between each pixel position in the sub-block and the sub-block motion vector is determined, point-based secondary prediction can be performed on top of the sub-block-based prediction for the pixel positions whose motion vector deviates from the sub-block motion vector, finally correcting the prediction value to obtain a new prediction value, the secondary-prediction value.
Further, the two steps of determining the sub-block motion vector and determining the deviation of each position within the sub-block from the sub-block motion vector may also be performed together. The decoder may perform secondary prediction as follows:
Step 204: determine the motion vectors of the control points.
Step 206: determine, according to the control-point motion vectors, the motion vector of the sub-block and the motion vector deviation between each position in the sub-block and the sub-block.
Step 201: predict the sub-block according to the sub-block motion vector to obtain a prediction value.
Step 203: filter the prediction value with a two-dimensional filter according to the motion vector deviation of each position to obtain the secondary-prediction value.
In this case, after the control-point motion vectors are determined, the sub-block motion vector and the deviation between each position in the sub-block and the sub-block motion vector are determined at the same time; then, for the pixel positions whose motion vector deviates from the sub-block motion vector, point-based secondary prediction is performed on top of the sub-block-based prediction, finally correcting the prediction value to obtain a new prediction value, the secondary-prediction value.
It should be noted that, because the affine model can explicitly calculate the motion vector of each pixel position, or equivalently the deviation between each pixel position in the sub-block and the sub-block motion vector, the secondary prediction method can be used to improve affine prediction; it can of course also be applied to improve other sub-block-based predictions. That is, the sub-block-based prediction referred to herein includes, but is not limited to, affine sub-block-based prediction.
Further, the secondary prediction method may be based on the AVS3 standard and may also be applied to the VVC standard; this application does not specifically limit it.
For each pixel position in the sub-block, when secondary prediction or PROF processing is performed, the default center point of the filter is the same pixel position in the sub-block-based prediction block; if the motion vector deviation were 0, the prediction value of the pixel position would simply be the value at that position in the prediction block. In general, this works when the motion vector deviation between the pixel position and the sub-block is small. When the deviation is not small, however, for example when it exceeds half a pixel, or some proportion of a pixel, in the horizontal or vertical direction, the prediction quality achieved by secondary prediction or PROF may drop considerably.
FIG. 8 is a first schematic diagram of the center position. The squares are pixel positions of the sub-block-based prediction block, and the circle is the actual position of the pixel to be predicted; square 1 is the position in the prediction block that coincides with the pixel position to be predicted, i.e., position (1, 1) of the sub-block. Square 1 is the default point-based prediction reference position, i.e., the center position of the filter, and the motion vector deviation is denoted (dmv_x0, dmv_y0).
Fig. 9 is the second schematic diagram of the center position. As shown in Fig. 9, if both dmv_x0 and dmv_y0 are greater than one-half pixel, there is a large distance between square 1 and the pixel position to be predicted in the sub-block; if square 1 is nevertheless used as the center position of the filter to perform quadratic prediction or PROF processing, the final prediction result will have a large error.
Therefore, when the center position of the filter used for quadratic prediction or PROF carries a large motion vector deviation, the prediction performance may be reduced. That is, the conventional PROF prediction value correction method and the quadratic prediction method are in practice not rigorous: when used to improve affine prediction, they cannot be well applied to all scenes, and the encoding performance still needs to be improved.
In order to overcome the drawbacks of the prior art, in the embodiment of the present application, after the sub-block-based prediction, for a pixel position whose first motion vector deviation from the motion vector of the sub-block is large, the center position and the second motion vector deviation used for the quadratic prediction or the PROF process may be re-determined based on the first motion vector deviation, so that the second prediction value can be obtained by performing point-based quadratic prediction with this center position and second motion vector deviation on the basis of the sub-block-based first prediction value. Therefore, the inter-frame prediction method provided by the present application can be well applied to all scenes, the prediction error can be reduced, the coding performance is greatly improved, and the coding and decoding efficiency is improved.
It should be understood that the present application provides a video coding system. Fig. 10 is a schematic block diagram of the composition of the video coding system provided by the present application. As shown in Fig. 10, the video coding system 11 may include: a transform unit 111, a quantization unit 112, a mode selection and coding control logic unit 113, an intra prediction unit 114, an inter prediction unit 115 (including motion compensation and motion estimation), an inverse quantization unit 116, an inverse transform unit 117, a loop filtering unit 118, an encoding unit 119, and a decoded picture buffer unit 110. For an input original video signal, video reconstruction blocks can be obtained by dividing the signal into Coding Tree Units (CTUs); the coding mode is determined by the mode selection and coding control logic unit 113, and the residual pixel information obtained by intra-frame or inter-frame prediction is then processed by the transform unit 111 and the quantization unit 112, which includes transforming the residual information from the pixel domain to the transform domain and quantizing the resulting transform coefficients to further reduce the bit rate. The intra prediction unit 114 is configured to perform intra prediction on the video reconstruction block and to determine the optimal intra prediction mode (i.e., the target prediction mode) of the video reconstruction block. The inter prediction unit 115 is configured to perform inter prediction encoding of the received video reconstruction block relative to one or more blocks in one or more reference frames to provide temporal prediction information; here, motion estimation is the process of generating motion vectors that estimate the motion of the video reconstruction block, and motion compensation is then performed based on the motion vectors determined by motion estimation. After determining the inter prediction mode, the inter prediction unit 115 is also configured to supply the selected inter prediction data and the calculated motion vector data to the encoding unit 119. Furthermore, the inverse quantization unit 116 and the inverse transform unit 117 are used for the reconstruction of the video reconstruction block: a residual block is reconstructed in the pixel domain, its blocking artifacts are removed by the loop filtering unit 118, and the reconstructed residual block is then added to a predictive block in a frame of the decoded picture buffer unit 110 to generate a reconstructed video reconstruction block. The encoding unit 119 is used for encoding the various coding parameters and the quantized transform coefficients, and the decoded picture buffer unit 110 is used to store the reconstructed video reconstruction blocks for prediction reference. As the video coding proceeds, new reconstructed video blocks are continuously generated, and these reconstructed blocks are stored in the decoded picture buffer unit 110.
Fig. 11 is a schematic block diagram of the composition of a video decoding system provided in an embodiment of the present application. As shown in Fig. 11, the video decoding system 12 may include: a decoding unit 121, an inverse transform unit 127, an inverse quantization unit 122, an intra prediction unit 123, a motion compensation unit 124, a loop filtering unit 125, and a decoded picture buffer unit 126. After the input video signal is coded by the video coding system 11, the code stream of the video signal is output; the code stream is input into the video decoding system 12 and first passes through the decoding unit 121 to obtain the decoded transform coefficients. The transform coefficients are processed by the inverse transform unit 127 and the inverse quantization unit 122 to produce a residual block in the pixel domain. The intra prediction unit 123 may be used to generate prediction data for the current video decoded block based on the determined intra prediction direction and data from previously decoded blocks of the current frame or picture. The motion compensation unit 124 determines prediction information for the video decoded block by parsing motion vectors and other associated syntax elements, and uses this prediction information to generate the predictive block of the video decoded block being decoded. A decoded video block is formed by summing the residual block from the inverse transform unit 127 and the inverse quantization unit 122 with the corresponding predictive block generated by the intra prediction unit 123 or the motion compensation unit 124. The decoded video signal then passes through the loop filtering unit 125 to remove blocking artifacts, which improves the video quality; the decoded video blocks are stored in the decoded picture buffer unit 126, which stores reference pictures for subsequent intra prediction or motion compensation and also for the output of the video signal, thereby restoring the original video signal.
The inter-frame prediction method provided by the embodiment of the present application mainly acts on the inter prediction unit 115 of the video coding system 11 and on the inter prediction unit of the video decoding system 12, i.e., the motion compensation unit 124. That is, if the video coding system 11 can obtain a better prediction effect through the inter-frame prediction method provided in the embodiment of the present application, the video decoding system 12 can correspondingly improve the quality of the recovered decoded video.
Based on this, the technical solution of the present application is further elaborated below with reference to the drawings and the embodiments. Before the detailed description, it should be noted that 'first', 'second', 'third', etc. are used throughout the specification only to distinguish different features, and do not define priority, precedence, or size relationships.
It should be noted that the present embodiment is described based on the AVS3 standard as an example; the inter-frame prediction method proposed in the present application may also be applied to other coding standards such as VVC, and the present application is not limited in this respect.
The embodiment of the application provides an inter-frame prediction method, which is applied to a video decoding device, namely a decoder. The functions performed by the method may be implemented by a first processor in the decoder calling a computer program; the computer program may, of course, be stored in a first memory. It can be seen that the decoder comprises at least the first processor and the first memory.
Further, in an embodiment of the present application, fig. 12 is a first flowchart illustrating an implementation of an inter prediction method, and as shown in fig. 12, the method for a decoder to perform inter prediction may include the following steps:
Step 301, parsing the code stream to obtain the prediction mode parameter of the current block.
In an embodiment of the present application, a decoder may first parse a binary code stream to obtain the prediction mode parameters of the current block. Wherein the prediction mode parameter may be used to determine a prediction mode used by the current block.
It should be noted that an image to be decoded may be divided into a plurality of image blocks, and an image block to be decoded currently may be referred to as a current block (which may be represented by a CU), and an image block adjacent to the current block may be referred to as a neighboring block; that is, in the image to be decoded, the current block has a neighboring relationship with the neighboring block. Here, each current block may include a first image component, a second image component, and a third image component, that is, the current block represents an image block to be currently subjected to prediction of the first image component, the second image component, or the third image component in an image to be decoded.
Wherein, assuming that the current block performs the first image component prediction, and the first image component is a luminance component, that is, the image component to be predicted is a luminance component, then the current block may also be called a luminance block; alternatively, assuming that the current block performs the second image component prediction, and the second image component is a chroma component, that is, the image component to be predicted is a chroma component, the current block may also be referred to as a chroma block.
Further, in the embodiments of the present application, the prediction mode parameter may indicate not only the prediction mode adopted by the current block but also a parameter related to the prediction mode.
It is understood that, in the embodiments of the present application, the prediction modes may include inter prediction modes, conventional intra prediction modes, and non-conventional intra prediction modes, etc.
That is to say, on the encoding side, the encoder may select an optimal prediction mode to pre-encode the current block; in this process, the prediction mode of the current block is determined, and a prediction mode parameter indicating that prediction mode is then determined, so that the encoder writes the corresponding prediction mode parameter into the code stream and transmits it to the decoder.
Correspondingly, on the decoder side, the decoder can directly obtain the prediction mode parameter of the current block by parsing the code stream, and determine, according to the parsed prediction mode parameter, the prediction mode used by the current block and the related parameters corresponding to that prediction mode.
Further, in an embodiment of the present application, after parsing to obtain the prediction mode parameter, the decoder may determine whether the current block uses an inter prediction mode based on the prediction mode parameter.
Step 302, when the prediction mode parameter indicates that the inter prediction value of the current block is determined by using the inter prediction mode, determining a first motion vector of a sub-block of the current block; wherein the current block includes a plurality of sub-blocks.
In an embodiment of the present application, after parsing to obtain the prediction mode parameter, if the parsed prediction mode parameter indicates that the current block determines the inter prediction value of the current block using the inter prediction mode, the decoder may determine the first motion vector of each sub-block of the current block. Wherein a sub-block corresponds to a first motion vector.
It should be noted that, in the embodiment of the present application, the current block is the image block to be decoded in the current frame; the current frame is decoded block by block in a certain order, and the current block is the image block that is next to be decoded in that order. The current block may have a variety of sizes, such as 16×16, 32×32, or 32×16, where the numbers represent the numbers of rows and columns of pixels in the current block.
Further, in the embodiment of the present application, the current block may be divided into a plurality of sub-blocks of the same size, where a sub-block is a set of pixels of a smaller size. The size of a sub-block may be 8×8 or 4×4.
For example, in the present application, the size of the current block is 16 × 16, and the current block may be divided into 4 sub-blocks each having a size of 8 × 8.
It can be understood that, in the embodiment of the present application, in the case that the decoder parses the code stream to obtain the prediction mode parameter indicating that the inter prediction value of the current block is determined using the inter prediction mode, the inter prediction method provided in the embodiment of the present application may be continuously used.
In an embodiment of the present application, further, when the prediction mode parameter indicates that the inter prediction value of the current block is determined using the inter prediction mode, the method of the decoder determining the first motion vector of the sub-block of the current block may include the steps of:
step 302a, analyzing the code stream, and obtaining the affine mode parameter and the prediction reference mode of the current block.
Step 302b, when the affine mode parameter indicates that the affine mode is used, determining a control point mode and a sub-block size parameter.
Step 302c, determining a first motion vector according to the prediction reference mode, the control point mode and the sub-block size parameter.
In an embodiment of the application, after the decoder parses the code stream to obtain the prediction mode parameter, if the parsed prediction mode parameter indicates that the inter prediction value of the current block is determined by using the inter prediction mode, the decoder may further obtain the affine mode parameter and the prediction reference mode by parsing the code stream.
It should be noted that, in the embodiment of the present application, the affine mode parameter is used to indicate whether to use the affine mode. Specifically, the affine mode parameter may be the affine motion compensation enable flag affine_enable_flag, and the decoder may determine whether to use the affine mode from the value of this flag.
That is, in the present application, the affine mode parameter may be a binary variable. If the value of the affine mode parameter is 1, it indicates that the affine mode is used; if the value of the affine mode parameter is 0, it indicates that the affine mode is not used.
It is understood that, in the present application, if the decoder parses the code stream and the affine mode parameter is not present, this can also be understood as indicating that the affine mode is not used.
For example, in the present application, the value of the affine mode parameter may be equal to the value of the affine motion compensation enable flag affine_enable_flag; if the value of affine_enable_flag is '1', it indicates that affine motion compensation may be used; if the value of affine_enable_flag is '0', it indicates that affine motion compensation should not be used.
Further, in the embodiment of the present application, if the affine mode parameter obtained by the decoder parsing the code stream indicates that the affine mode is used, the decoder may proceed to obtain the control point mode and the sub-block size parameter.
In the embodiment of the present application, the control point mode is used to determine the number of control points. In the affine model, one sub-block may have 2 control points or 3 control points, and accordingly, the control point pattern may be a control point pattern corresponding to 2 control points or a control point pattern corresponding to 3 control points. I.e. the control point mode may comprise a 4-parameter mode and a 6-parameter mode.
It can be understood that, in the embodiment of the present application, for the AVS3 standard, if the current block uses the affine mode, the decoder needs to determine the number of control points in the affine mode of the current block, so as to determine whether to use the 4-parameter (2 control points) mode or the 6-parameter (3 control points) mode.
Further, in the embodiment of the present application, if the affine mode parameter obtained by the decoder parsing the code stream indicates that the affine mode is used, the decoder may further obtain the sub-block size parameter by parsing the code stream.
Specifically, the sub-block size parameter may be determined by the affine prediction sub-block size flag affine_subblock_size_flag; the decoder obtains the sub-block size flag by parsing the code stream and determines the size of the sub-blocks of the current block according to the value of this flag. The size of a sub-block may be 8×8 or 4×4. Specifically, in the present application, the sub-block size flag may be a binary variable: if its value is 1, the sub-block size parameter is 8×8; if its value is 0, the sub-block size parameter is 4×4.
For example, in the present application, the value of the sub-block size flag may be equal to the value of the affine prediction sub-block size flag affine_subblock_size_flag; if the value of affine_subblock_size_flag is '1', the current block is divided into sub-blocks of size 8×8; if the value of affine_subblock_size_flag is '0', the current block is divided into sub-blocks of size 4×4.
It can be understood that, in the present application, if the decoder parses the code stream and the sub-block size flag is not present, the current block is divided into 4×4 sub-blocks; that is, if affine_subblock_size_flag does not exist in the code stream, the value of the sub-block size flag may be directly set to 0.
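The semantics of these two flags can be summarized as in the following sketch, written under the assumption that the parser reports an absent flag as 0; the function name is illustrative and not part of the standard:
int derive_subblock_size(int affine_enable_flag, int affine_subblock_size_flag)
{
    if (!affine_enable_flag)
        return 0;  /* affine mode not used: no affine sub-block division */
    /* flag == 1: 8x8 sub-blocks; flag == 0 (or absent from the stream): 4x4 */
    return affine_subblock_size_flag ? 8 : 4;
}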
Further, in the embodiment of the present application, after determining the control point mode and the sub-block size parameter, the decoder may further determine the first motion vector of the sub-block in the current block according to the prediction reference mode, the control point mode, and the sub-block size parameter.
Specifically, in the embodiments of the present application, the decoder may first determine the control point motion vector group according to the prediction reference mode; a first motion vector for the sub-block may then be determined based on the set of control point motion vectors, the control point mode, and the sub-block size parameter.
It will be appreciated that in embodiments of the present application, a set of control point motion vectors may be used to determine the motion vectors for the control points.
It should be noted that, in the embodiment of the present application, the decoder may traverse each sub-block in the current block according to the above method, and determine the first motion vector of each sub-block by using the control point motion vector group, the control point mode, and the sub-block size parameter of each sub-block, so that the motion vector set may be constructed and obtained according to the first motion vector of each sub-block.
It is to be understood that, in the embodiment of the present application, the first motion vector of each sub-block of the current block may be included in the motion vector set of the current block.
Further, in the embodiment of the present application, when the decoder determines the first motion vector according to the control point motion vector group, the control point mode, and the sub-block size parameter, the decoder may first determine the difference variables according to the control point motion vector group, the control point mode, and the size parameters of the current block; the sub-block position may then be determined based on the prediction reference mode and the sub-block size parameter; finally, the difference variables and the position of the sub-block can be used to determine the first motion vector of the sub-block, from which the motion vector set of the sub-blocks of the current block is obtained.
For example, in the present application, the difference variables may include 4 variables, specifically dHorX, dVerX, dHorY, and dVerY. When calculating the difference variables, the decoder first needs to determine the control point motion vector group, which characterizes the motion vectors of the control points.
Specifically, if the control point mode is the 6-parameter mode, i.e., there are 3 control points, the control point motion vector group may be a motion vector group comprising 3 motion vectors, denoted mvsAffine(mv0, mv1, mv2); if the control point mode is the 4-parameter mode, i.e., there are 2 control points, the control point motion vector group may be a motion vector group comprising 2 motion vectors, denoted mvsAffine(mv0, mv1).
The decoder then performs the calculation of the difference variable using the set of control point motion vectors:
dHorX=(mv1_x-mv0_x)<<(7-Log(width));
dHorY=(mv1_y-mv0_y)<<(7-Log(width));
if the motion vector group is mvsAffine(mv0, mv1, mv2), then:
dVerX=(mv2_x-mv0_x)<<(7-Log(height));
dVerY=(mv2_y-mv0_y)<<(7-Log(height));
if the motion vector group is mvsAffine(mv0, mv1), then:
dVerX=-dHorY;
dVerY=dHorX.
Here, width and height are the width and the height of the current block, respectively, i.e., the size parameters of the current block; specifically, the size parameters of the current block may be obtained by the decoder by parsing the code stream. A code sketch of this derivation follows.
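The following is a sketch of the difference-variable derivation above, written under the assumption that Log() in the formulas denotes the base-2 logarithm of the block dimension; the helper names are illustrative:
static int ilog2(int v) { int n = 0; while (v > 1) { v >>= 1; n++; } return n; }
/* has3 is nonzero for the 6-parameter (3 control point) mode mvsAffine(mv0, mv1, mv2) */
void derive_diff_vars(int mv0_x, int mv0_y, int mv1_x, int mv1_y,
                      int mv2_x, int mv2_y, int has3, int width, int height,
                      int *dHorX, int *dHorY, int *dVerX, int *dVerY)
{
    *dHorX = (mv1_x - mv0_x) << (7 - ilog2(width));
    *dHorY = (mv1_y - mv0_y) << (7 - ilog2(width));
    if (has3) {
        *dVerX = (mv2_x - mv0_x) << (7 - ilog2(height));
        *dVerY = (mv2_y - mv0_y) << (7 - ilog2(height));
    } else {  /* 4-parameter mode mvsAffine(mv0, mv1) */
        *dVerX = -(*dHorY);
        *dVerY = *dHorX;
    }
}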
Further, in embodiments of the present application, after determining the difference variables, the decoder may determine the sub-block position based on the prediction reference mode and the sub-block size parameter. Specifically, the decoder can determine the size of the sub-block through the sub-block size flag, determine which prediction reference mode is used through the parsed parameters, and then determine the position of the sub-block according to the size of the sub-block and the prediction mode used.
For example, in the present application, if the value of the prediction reference mode of the current block is 2, i.e., the third reference mode 'Pred_List01' is used, or the sub-block size flag is 1, i.e., both the width and the height of a sub-block are equal to 8, and (x, y) are the coordinates of the upper-left corner of an 8×8 sub-block, the coordinates xPos and yPos of the sub-block position may be determined as follows:
if the sub-block contains the control point in the upper-left corner of the current block, both xPos and yPos are equal to 0;
if the sub-block contains the control point in the upper-right corner of the current block, xPos is equal to width and yPos is equal to 0;
if the sub-block contains the control point in the lower-left corner of the current block and the control point motion vector group comprises 3 motion vectors, xPos is equal to 0 and yPos is equal to height;
otherwise, xPos equals (x - xE) + 4 and yPos equals (y - yE) + 4, where (xE, yE) is the position of the top-left sample of the current block.
For example, in the present application, if the value of the prediction reference mode of the current block is 0 or 1, i.e., the first reference mode 'Pred_List0' or the second reference mode 'Pred_List1' is used, and the sub-block size flag is 0, i.e., both the width and the height of a sub-block are equal to 4, and (x, y) are the coordinates of the upper-left corner of a 4×4 sub-block, the coordinates xPos and yPos of the sub-block position may be determined as follows (a sketch combining both branches follows this list):
if the sub-block contains the control point in the upper-left corner of the current block, both xPos and yPos are equal to 0;
if the sub-block contains the control point in the upper-right corner of the current block, xPos is equal to width and yPos is equal to 0;
if the sub-block contains the control point in the lower-left corner of the current block and the control point motion vector group comprises 3 motion vectors, xPos is equal to 0 and yPos is equal to height;
otherwise, xPos equals (x - xE) + 2 and yPos equals (y - yE) + 2.
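The following sketch combines both branches above; half is 4 for 8×8 sub-blocks and 2 for 4×4 sub-blocks, and the boolean parameters are illustrative names for the corner tests described in the lists:
void derive_subblock_pos(int x, int y, int xE, int yE,
                         int width, int height, int half,
                         int is_top_left, int is_top_right, int is_bottom_left_with_3cp,
                         int *xPos, int *yPos)
{
    if (is_top_left) {                    /* sub-block holding the upper-left control point */
        *xPos = 0; *yPos = 0;
    } else if (is_top_right) {            /* sub-block holding the upper-right control point */
        *xPos = width; *yPos = 0;
    } else if (is_bottom_left_with_3cp) { /* lower-left control point, 3-motion-vector group */
        *xPos = 0; *yPos = height;
    } else {                              /* all other sub-blocks */
        *xPos = (x - xE) + half;
        *yPos = (y - yE) + half;
    }
}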
Further, in the embodiment of the present application, after the decoder calculates the positions of the sub-blocks, the first motion vector of each sub-block may be determined based on the position of the sub-block and the difference variables; finally, by traversing each sub-block of the current block to obtain its first motion vector, the motion vector set of the plurality of sub-blocks of the current block is constructed.
Illustratively, in the present application, after determining the sub-block position xPos and yPos, the decoder may determine the first motion vector mvE(mvE_x, mvE_y) of the sub-block in the following manner (see also the sketch after the formulas):
mvE_x=Clip3(-131072,131071,Rounding((mv0_x<<7)+dHorX×xPos+dVerX×yPos,7));
mvE_y=Clip3(-131072,131071,Rounding((mv0_y<<7)+dHorY×xPos+dVerY×yPos,7)).
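The following is a sketch of this derivation; Rounding() is assumed here to perform a symmetric round-to-nearest before the right shift, and the exact definition in the standard text governs:
static long clip3(long lo, long hi, long v) { return v < lo ? lo : (v > hi ? hi : v); }
static long rounding(long v, int s)  /* symmetric round-to-nearest: an assumption */
{
    return v >= 0 ? (v + (1L << (s - 1))) >> s
                  : -((-v + (1L << (s - 1))) >> s);
}
void derive_subblock_mv(int mv0_x, int mv0_y,
                        int dHorX, int dHorY, int dVerX, int dVerY,
                        int xPos, int yPos, int *mvE_x, int *mvE_y)
{
    *mvE_x = (int)clip3(-131072, 131071,
                        rounding(((long)mv0_x << 7) + (long)dHorX * xPos + (long)dVerX * yPos, 7));
    *mvE_y = (int)clip3(-131072, 131071,
                        rounding(((long)mv0_y << 7) + (long)dHorY * xPos + (long)dVerY * yPos, 7));
}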
It should be noted that, in the present application, when determining the deviation between each position in the sub-block and the motion vector of the sub-block, if the current block uses an affine prediction model, the motion vector of each position in the sub-block can be calculated according to the formula of the affine prediction model, and the deviation can be obtained by subtracting the motion vector of the sub-block. If the motion vectors of all sub-blocks are taken at the same position within the sub-block, e.g., a 4×4 block uses the position (2, 2) from its top-left corner and an 8×8 block uses the position (4, 4) from its top-left corner, then, according to the affine model used in current standards including VVC and AVS3, the motion vector deviation at the same position of each sub-block is the same. However, in AVS3 the sub-blocks at the upper-left corner, the upper-right corner, and, in the case of 3 control points, the lower-left corner (the A, B, and C positions shown in Fig. 7 of the AVS3 text above) use positions different from those used by the other sub-blocks, and the motion vector deviations calculated for these corner sub-blocks accordingly differ from those of the other sub-blocks.
Step 303, determining, based on the first motion vector, a first predictor of the sub-block and a first motion vector deviation between each pixel position in the sub-block and the sub-block.
In an embodiment of the present application, after determining the first motion vector of each sub-block of the current block, the decoder may determine, based on the first motion vector of the sub-block, a first predictor of the sub-block and a first motion vector deviation between each pixel position in the sub-block and the sub-block, where one pixel position corresponds to one first motion vector deviation.
It is understood that, in the embodiment of the present application, step 303 may specifically include:
Step 303a, determining the first predictor of the sub-block based on the first motion vector.
Step 303b, determining the first motion vector deviation between each pixel position in the sub-block and the sub-block based on the first motion vector.
In this application, the order in which the decoder performs step 303a and step 303b is not limited by the inter prediction method provided in this embodiment; that is, after determining the first motion vector of each sub-block of the current block, the decoder may perform step 303a first and then step 303b, perform step 303b first and then step 303a, or perform step 303a and step 303b at the same time.
Further, in an embodiment of the present application, the decoder may first determine a sample matrix when determining the first predictor of the sub-block based on the first motion vector; wherein the sample matrix comprises a luminance sample matrix and a chrominance sample matrix; the first predictor may then be determined based on the prediction reference mode, the sub-block size parameter, the sample matrix, and the set of motion vectors.
It should be noted that, in the embodiment of the present application, when the decoder determines the first predicted value according to the prediction reference mode, the sub-block size parameter, the sample matrix, and the motion vector set, the decoder may first determine a target motion vector from the motion vector set according to the prediction reference mode and the sub-block size parameter; then, a reference image queue and a reference index sample matrix corresponding to the prediction reference mode and a target motion vector can be utilized to determine a prediction sample matrix; wherein the prediction sample matrix includes a first prediction value of the plurality of sub-blocks.
Specifically, in an embodiment of the present application, the sample matrix may include a luma sample matrix and a chroma sample matrix, and accordingly, the prediction sample matrix determined by the decoder may include a luma prediction sample matrix and a chroma prediction sample matrix, wherein the luma prediction sample matrix includes a first luma predictor of the plurality of sub-blocks, the chroma prediction sample matrix includes a first chroma predictor of the plurality of sub-blocks, and the first luma predictor and the first chroma predictor constitute the first predictor of the sub-blocks.
For example, in the present application, it is assumed that the position of the top-left sample of the current block in the luma sample matrix of the current image is (xE, yE). If the value of the prediction reference mode of the current block is 0, i.e., the first reference mode 'PRED_List0' is used, and the sub-block size flag is 0, i.e., the sub-block size parameter is 4×4, the target motion vector mv0E0 is the first motion vector of the 4×4 sub-block of the motion vector set of the current block at the (xE+x, yE+y) position. The value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix with reference index RefIdxL0 in reference image queue 0, and the value of the element predMatrixL0[x][y] in the chroma prediction sample matrix predMatrixL0 is the sample value at position (((xE+2×x)<<4)+MvC_x, ((yE+2×y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL0 in reference image queue 0. Here x1 = ((xE+2×x)>>3)<<3 and y1 = ((yE+2×y)>>3)<<3; mv1E0 is the first motion vector of the 4×4 unit of the motion vector set of the current block at the (x1, y1) position, mv2E0 is the first motion vector of the 4×4 unit at the (x1+4, y1) position, mv3E0 is the first motion vector of the 4×4 unit at the (x1, y1+4) position, and mv4E0 is the first motion vector of the 4×4 unit at the (x1+4, y1+4) position.
Specifically, MvC_x and MvC_y may be determined as follows (a code sketch follows the formulas):
MvC_x=(mv1E0_x+mv2E0_x+mv3E0_x+mv4E0_x+2)>>2
MvC_y=(mv1E0_y+mv2E0_y+mv3E0_y+mv4E0_y+2)>>2
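The following is a sketch of this chroma motion vector derivation for the 4×4 case; the function name is illustrative:
void derive_chroma_mv_4x4(int mv1E0_x, int mv1E0_y, int mv2E0_x, int mv2E0_y,
                          int mv3E0_x, int mv3E0_y, int mv4E0_x, int mv4E0_y,
                          int *MvC_x, int *MvC_y)
{
    /* rounded average of the four neighbouring 4x4 sub-block motion vectors */
    *MvC_x = (mv1E0_x + mv2E0_x + mv3E0_x + mv4E0_x + 2) >> 2;
    *MvC_y = (mv1E0_y + mv2E0_y + mv3E0_y + mv4E0_y + 2) >> 2;
}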
For example, in the present application, assuming that the position of the top-left sample of the current block in the luma sample matrix of the current picture is (xE, yE), if the value of the prediction reference mode of the current block is 0, i.e., the first reference mode 'PRED_List0' is used, and the sub-block size flag is 1, i.e., the sub-block size parameter is 8×8, then the target motion vector mv0E0 is the first motion vector of the 8×8 unit of the motion vector set of the current block at the (xE+x, yE+y) position. The value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix with reference index RefIdxL0 in reference image queue 0, and the value of the element predMatrixL0[x][y] in the chroma prediction sample matrix predMatrixL0 is the sample value at position (((xE+2×x)<<4)+MvC_x, ((yE+2×y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL0 in reference image queue 0. Here MvC_x is equal to mv0E0_x and MvC_y is equal to mv0E0_y.
For example, in the present application, it is assumed that the position of the top-left sample of the current block in the luma sample matrix of the current image is (xE, yE). If the value of the prediction reference mode of the current block is 1, i.e., the second reference mode 'PRED_List1' is used, and the sub-block size flag is 0, i.e., the sub-block size parameter is 4×4, the target motion vector mv0E1 is the first motion vector of the 4×4 unit of the motion vector set of the current block at the (xE+x, yE+y) position. The value of the element predMatrixL1[x][y] in the luma prediction sample matrix predMatrixL1 is the sample value at position (((xE+x)<<4)+mv0E1_x, ((yE+y)<<4)+mv0E1_y) in the 1/16-precision luma sample matrix with reference index RefIdxL1 in reference image queue 1, and the value of the element predMatrixL1[x][y] in the chroma prediction sample matrix predMatrixL1 is the sample value at position (((xE+2×x)<<4)+MvC_x, ((yE+2×y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL1 in reference image queue 1. Here x1 = ((xE+2×x)>>3)<<3 and y1 = ((yE+2×y)>>3)<<3; mv1E1 is the first motion vector of the 4×4 unit of the motion vector set MvArray at the (x1, y1) position, mv2E1 is the first motion vector of the 4×4 unit at the (x1+4, y1) position, mv3E1 is the first motion vector of the 4×4 unit at the (x1, y1+4) position, and mv4E1 is the first motion vector of the 4×4 unit at the (x1+4, y1+4) position.
Specifically, MvC_x and MvC_y may be determined by:
MvC_x=(mv1E1_x+mv2E1_x+mv3E1_x+mv4E1_x+2)>>2
MvC_y=(mv1E1_y+mv2E1_y+mv3E1_y+mv4E1_y+2)>>2
for example, in the present application, it is assumed that the position of the top-left sample of the current block in the luma sample matrix of the current image is (xE, yE). If the prediction reference mode of the current block is taken to be 1, i.e., the second reference mode 'PRED _ List 1' is used, and the sub-block size flag is taken to be 1, i.e., the sub-block size parameter is 8 × 8, the target motion vector mv0E1 is the first motion vector of the 8 × 8 unit of the motion vector set of the current block at the (xE + x, yE + y) position. The value of element pred matrix l1[ x ] [ y ] in luma prediction sample matrix pred matrix l1 is the sample value with the position in the 1/16 precision luma sample matrix with reference index ref idxl1 in reference image queue 1 (((xE + x) < <4) + mv0E1_ x, (yE + y) < <4) + mv0E1_ y)), and the value of element pred matrix xl1[ x ] [ y ] in chroma prediction sample matrix pred matrix l1 is the sample value with the position in the 1/32 precision chroma sample matrix with reference index ref idxl1 in reference image queue 1 (((xE +2 × x) < <4) + MvC _ x, (yE +2 × y) < <4) + MvC _ y))). Where MvC _ x equals mv0E1_ x, MvC _ y equals mv0E 1.
For example, in the present application, it is assumed that the position of the top-left sample of the current block in the luma sample matrix of the current image is (xE, yE). If the value of the prediction reference mode of the current block is 2, i.e., the third reference mode 'PRED_List01' is used, the target motion vector mv0E0 is the first motion vector of the 8×8 unit of the motion vector set of the current block at the (xE+x, yE+y) position, and the target motion vector mv0E1 is likewise the first motion vector of the 8×8 unit of the motion vector set of the current block at the (xE+x, yE+y) position. The value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix with reference index RefIdxL0 in reference image queue 0; the value of the element predMatrixL0[x][y] in the chroma prediction sample matrix predMatrixL0 is the sample value at position (((xE+2×x)<<4)+MvC0_x, ((yE+2×y)<<4)+MvC0_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL0 in reference image queue 0; the value of the element predMatrixL1[x][y] in the luma prediction sample matrix predMatrixL1 is the sample value at position (((xE+x)<<4)+mv0E1_x, ((yE+y)<<4)+mv0E1_y) in the 1/16-precision luma sample matrix with reference index RefIdxL1 in reference image queue 1; and the value of the element predMatrixL1[x][y] in the chroma prediction sample matrix predMatrixL1 is the sample value at position (((xE+2×x)<<4)+MvC1_x, ((yE+2×y)<<4)+MvC1_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL1 in reference image queue 1. Here MvC0_x equals mv0E0_x, MvC0_y equals mv0E0_y, MvC1_x equals mv0E1_x, and MvC1_y equals mv0E1_y.
It should be noted that, in the embodiment of the application, the luma sample matrix in the sample matrix may be a 1/16-precision luma sample matrix, and the chroma sample matrix in the sample matrix may be a 1/32-precision chroma sample matrix.
It is understood that, in the embodiments of the present application, the reference image queues and the reference indexes obtained by the decoder by parsing the code stream are different for different prediction reference modes.
Further, in the embodiment of the present application, when the decoder determines the sample matrix, a luminance interpolation filter coefficient and a chrominance interpolation filter coefficient may be obtained first; a luma sample matrix may then be determined based on the luma interpolation filter coefficients, while a chroma sample matrix may be determined based on the chroma interpolation filter coefficients.
For example, in the present application, when determining the luma sample matrix, the decoder obtains the luma interpolation filter coefficients shown in Table 1, and then calculates the luma sample matrix according to the pixel position and the sample position.
Specifically, the sample at position a(x,0) (x = 1..15) is obtained by filtering the 8 integer samples nearest to the interpolation point in the horizontal direction:
a(x,0)=Clip1((fL[x][0]×A(-3,0)+fL[x][1]×A(-2,0)+fL[x][2]×A(-1,0)+fL[x][3]×A(0,0)+fL[x][4]×A(1,0)+fL[x][5]×A(2,0)+fL[x][6]×A(3,0)+fL[x][7]×A(4,0)+32)>>6).
Specifically, the sample at position a(0,y) (y = 1..15) is obtained by filtering the 8 integer samples nearest to the interpolation point in the vertical direction:
a(0,y)=Clip1((fL[y][0]×A(0,-3)+fL[y][1]×A(0,-2)+fL[y][2]×A(0,-1)+fL[y][3]×A(0,0)+fL[y][4]×A(0,1)+fL[y][5]×A(0,2)+fL[y][6]×A(0,3)+fL[y][7]×A(0,4)+32)>>6).
Specifically, the sample at position a(x,y) (x = 1..15, y = 1..15) is obtained as follows:
a(x,y)=Clip1((fL[y][0]×a'(x,y-3)+fL[y][1]×a'(x,y-2)+fL[y][2]×a'(x,y-1)+fL[y][3]×a'(x,y)+fL[y][4]×a'(x,y+1)+fL[y][5]×a'(x,y+2)+fL[y][6]×a'(x,y+3)+fL[y][7]×a'(x,y+4)+(1<<(19-BitDepth)))>>(20-BitDepth)).
wherein:
a'(x,y)=(fL[x][0]×A(-3,y)+fL[x][1]×A(-2,y)+fL[x][2]×A(-1,y)+fL[x][3]×A(0,y)+fL[x][4]×A(1,y)+fL[x][5]×A(2,y)+fL[x][6]×A(3,y)+fL[x][7]×A(4,y)+((1<<(BitDepth-8))>>1))>>(BitDepth-8).
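For illustration, the horizontal case a(x,0) can be sketched as follows, assuming fL holds the Table 1 coefficients and A[] the 8 nearest integer samples A(-3,0)..A(4,0); the helper names are illustrative:
static int clip1(int v, int bit_depth)  /* clip to the valid sample range */
{
    int hi = (1 << bit_depth) - 1;
    return v < 0 ? 0 : (v > hi ? hi : v);
}
int interp_luma_h(const int fL[16][8], int x, const int A[8], int bit_depth)
{
    int sum = 32;                /* rounding offset for the >>6 normalization */
    for (int k = 0; k < 8; k++)
        sum += fL[x][k] * A[k];  /* 8-tap filtering of the nearest integer samples */
    return clip1(sum >> 6, bit_depth);
}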
for example, in the present application, when determining the chroma sample matrix, the decoder may first parse the code stream to obtain the chroma interpolation filter coefficients as shown in table 2, and then calculate the chroma sample matrix according to the pixel position and the sample position.
Specifically, a sub-pixel position where dx is equal to 0 or dy is equal to 0 can be interpolated directly from the chroma integer pixels, while a position where dx is not equal to 0 and dy is not equal to 0 is calculated using the sub-pixels on the integer pixel row (dy equal to 0):
if(dx==0){
a(x,y)(0,dy)=Clip3(0,(1<<BitDepth)-1,(fC[dy][0]×A(x,y-1)+fC[dy][1]×A(x,y)+fC[dy][2]×A(x,y+1)+fC[dy][3]×A(x,y+2)+32)>>6)
}
else if(dy==0){
a(x,y)(dx,0)=Clip3(0,(1<<BitDepth)-1,(fC[dx][0]×A(x-1,y)+fC[dx][1]×A(x,y)+fC[dx][2]×A(x+1,y)+fC[dx][3]×A(x+2,y)+32)>>6)
}
else{
a(x,y)(dx,dy)=Clip3(0,(1<<BitDepth)-1,(fC[dy][0]×a'(x,y-1)(dx,0)+fC[dy][1]×a'(x,y)(dx,0)+fC[dy][2]×a'(x,y+1)(dx,0)+fC[dy][3]×a'(x,y+2)(dx,0)+(1<<(19-BitDepth)))>>(20-BitDepth))
}
where a'(x,y)(dx,0) is the temporary value of a sub-pixel on the integer pixel row, defined as: a'(x,y)(dx,0)=(fC[dx][0]×A(x-1,y)+fC[dx][1]×A(x,y)+fC[dx][2]×A(x+1,y)+fC[dx][3]×A(x+2,y)+((1<<(BitDepth-8))>>1))>>(BitDepth-8).
Further, in the embodiment of the present application, when determining the first motion vector deviation between each pixel position and the sub-block, the decoder may first parse the code stream to obtain the secondary prediction parameter; if the secondary prediction parameter indicates that secondary prediction is used, the decoder may determine the first motion vector deviation between the sub-block and each pixel position based on the difference variables.
Specifically, in the embodiment of the present application, when determining the first motion vector deviation between the sub-block and each pixel position based on the difference variables, the decoder may determine the 4 difference variables dHorX, dVerX, dHorY, and dVerY according to the control point motion vector group, the control point mode, and the size parameters of the current block by the method set forth in step 302, and then further determine the first motion vector deviation corresponding to each pixel position in the sub-block by using the difference variables.
Illustratively, in the present application, width and height are the width and the height of the current block obtained by the decoder, and the width and the height of a sub-block, subwidth and subheight, are determined from the sub-block size parameter. Assuming that (i, j) are the coordinates of any pixel inside the sub-block, where i ranges from 0 to (subwidth-1) and j ranges from 0 to (subheight-1), the first motion vector deviation at the position of each pixel (i, j) inside the 4 different types of sub-blocks can be calculated as follows:
if the sub-block contains the control point A in the upper-left corner of the current block, the first motion vector deviation dMvA[i][j] of the (i, j) pixel is:
dMvA[i][j][0]=dHorX×i+dVerX×j
dMvA[i][j][1]=dHorY×i+dVerY×j;
if the sub-block contains the control point B in the upper-right corner of the current block, the first motion vector deviation dMvB[i][j] of the (i, j) pixel is:
dMvB[i][j][0]=dHorX×(i-subwidth)+dVerX×j
dMvB[i][j][1]=dHorY×(i-subwidth)+dVerY×j;
if the sub-block contains the control point C in the lower-left corner of the current block and the control point motion vector group comprises 3 motion vectors, the first motion vector deviation dMvC[i][j] of the (i, j) pixel is:
dMvC[i][j][0]=dHorX×i+dVerX×(j-subheight)
dMvC[i][j][1]=dHorY×i+dVerY×(j-subheight);
otherwise, the first motion vector deviation dMvN[i][j] of the (i, j) pixel is:
dMvN[i][j][0]=dHorX×(i-(subwidth>>1))+dVerX×(j-(subheight>>1))
dMvN[i][j][1]=dHorY×(i-(subwidth>>1))+dVerY×(j-(subheight>>1)).
Here dMvX[i][j][0] represents the deviation value of the first motion vector deviation in the horizontal component, dMvX[i][j][1] represents the deviation value of the first motion vector deviation in the vertical component, and X is A, B, C, or N.
It is to be understood that, in the embodiment of the present application, after determining the first motion vector deviation between the sub-block and each pixel position based on the difference variables, the decoder may use all the first motion vector deviations corresponding to all the pixel positions in the sub-block to construct the motion vector deviation matrix corresponding to the sub-block, as the following sketch shows. The motion vector deviation matrix thus includes the motion vector deviation between the sub-block and each of its internal pixels, i.e., the first motion vector deviation.
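The four corner cases above can be gathered into one routine. The following is a sketch only; the enum and function names are illustrative, and the array is sized here for sub-blocks up to 8×8:
enum Corner { CORNER_A, CORNER_B, CORNER_C, CORNER_N };
void derive_mv_deviation_matrix(enum Corner corner,
                                int dHorX, int dHorY, int dVerX, int dVerY,
                                int subwidth, int subheight,
                                int dMv[8][8][2])  /* dMv[i][j][0]: horizontal, [1]: vertical */
{
    for (int i = 0; i < subwidth; i++) {
        for (int j = 0; j < subheight; j++) {
            int di = i, dj = j;                          /* CORNER_A: upper-left control point */
            if (corner == CORNER_B) di = i - subwidth;   /* upper-right control point */
            if (corner == CORNER_C) dj = j - subheight;  /* lower-left, 3-motion-vector group */
            if (corner == CORNER_N) {                    /* all other sub-blocks */
                di = i - (subwidth >> 1);
                dj = j - (subheight >> 1);
            }
            dMv[i][j][0] = dHorX * di + dVerX * dj;
            dMv[i][j][1] = dHorY * di + dVerY * dj;
        }
    }
}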
Further, in the embodiment of the present application, if the secondary prediction parameter obtained by the decoder parsing the code stream indicates that secondary prediction is not used, the decoder may select to directly use the first prediction value of the sub-block of the current block obtained in the above step 303a as the second prediction value of the sub-block, without performing the following processing of steps 304 and 305.
Specifically, in embodiments of the present application, if the secondary prediction parameter indicates that secondary prediction is not used, the decoder may determine the second prediction value using the prediction sample matrix: the first prediction value of the sub-block where the pixel position is located is determined as the second prediction value of the pixel position.
For example, in the present application, if the value of the prediction reference mode of the current block is 0 or 1, i.e., the first reference mode 'PRED_List0' or the second reference mode 'PRED_List1' is used, the first prediction value of the sub-block where the pixel position is located may be selected directly from the prediction sample matrix, which comprises one luma prediction sample matrix and two chroma prediction sample matrices, and determined as the inter prediction value of the pixel position, i.e., the second prediction value.
For example, in this application, if the value of the prediction reference mode of the current block is 2, that is, the third reference mode 'PRED_List01' is used, the 2 luma prediction sample matrices (and the 2 groups of chroma prediction sample matrices, 4 in total) included in the prediction sample matrix may be averaged to obtain 1 averaged luma prediction sample matrix (and 2 averaged chroma prediction sample matrices); finally, the first prediction value of the sub-block where the pixel position is located is selected from the averaged luma prediction samples (and the 2 averaged chroma prediction samples) and determined as the inter prediction value of the pixel position, i.e., the second prediction value, as sketched below.
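A minimal sketch of this averaging step, assuming the sample matrices are laid out as flat arrays of n elements and that the average is rounded to the nearest integer (the exact rounding is an assumption):
void average_pred_samples(const int *pred0, const int *pred1, int *out, int n)
{
    for (int k = 0; k < n; k++)
        out[k] = (pred0[k] + pred1[k] + 1) >> 1;  /* element-wise average of the two matrices */
}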
Step 304, determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation.
In an embodiment of the present application, after determining, based on the first motion vector, the first predictor of the sub-block and the first motion vector deviation between each pixel position in the sub-block and the sub-block, the decoder may determine the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation.
It should be noted that, in the embodiment of the present application, for the pixel at each pixel position in the sub-block, when performing quadratic prediction or PROF processing, if the motion vector deviation between the default center position used by the filter and the sub-block is large, the obtained prediction result may have a large error. Therefore, after determining the first motion vector deviation between each pixel position and the sub-block, the decoder may first confirm whether each pixel position is suitable as the center position of the filter used when performing quadratic prediction or PROF processing for that pixel, and if not, may adjust the center position.
It is understood that, in the present application, the center position and the second motion vector deviation corresponding to a pixel position are the center position and the motion vector deviation of the filter used when performing quadratic prediction or PROF processing on that pixel position; that is, the center position and the second motion vector deviation are used for the quadratic prediction or PROF processing of the pixel position.
Further, in an embodiment of the present application, the method for determining the center position and the second motion vector offset corresponding to each pixel position by the decoder according to the first motion vector offset may include the following steps:
step 304a, determining a horizontal deviation and a vertical deviation of the first motion vector deviation;
Step 304b, determining the center position and the second motion vector deviation according to the first absolute value of the horizontal deviation, the second absolute value of the vertical deviation, and a preset deviation threshold.
In the embodiment of the present application, the decoder may determine the deviation values of the first motion vector deviation in different directions, that is, determine the horizontal deviation and the vertical deviation corresponding to the first motion vector deviation. Then, the decoder may further calculate an absolute value of the horizontal deviation, i.e., a first absolute value, and an absolute value of the vertical deviation, i.e., a second absolute value. Finally, the decoder may further determine the center position and the second motion vector offset used for performing quadratic prediction or PROF processing according to the first absolute value, the second absolute value, and a preset offset threshold.
It should be noted that, in the embodiment of the present application, the decoder may compare the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with a preset deviation threshold, respectively, so that the determination of the center position and the second motion vector deviation may be performed according to the comparison result. The preset deviation threshold may be preset and used to determine whether to adjust the deviation between the center position and the motion vector.
For example, in the present application, the preset deviation threshold may be in units of pixels; specifically, the preset deviation threshold may be k pixels, where k is greater than or equal to 0.5 and less than or equal to 1. That is, the preset deviation threshold may be set in advance to one-half pixel, three-quarters pixel, or one pixel.
Further, in the embodiment of the present application, when the decoder determines the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold, if both the first absolute value and the second absolute value are smaller than the preset deviation threshold, the decoder may determine the first motion vector deviation as the second motion vector deviation and each pixel position as the center position.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold, if both absolute values are smaller than the preset deviation threshold, the decoder may consider that the motion vector deviation between the pixel position and the sub-block is small and that the prediction result obtained by performing quadratic prediction or PROF processing with the pixel position as the center position of the filter is sufficiently accurate. Therefore, the first motion vector deviation corresponding to the pixel position may be directly determined as the second motion vector deviation for quadratic prediction or PROF processing, and the pixel position itself may be directly determined as the center position used by the filter.
Further, in the embodiments of the present application, when the decoder determines the center position and the second motion vector deviation according to the horizontal deviation, the vertical deviation, and the preset deviation threshold, if the first absolute value is greater than or equal to the preset deviation threshold and the second absolute value is smaller than the preset deviation threshold, the decoder may determine the first adjustment direction according to the horizontal deviation; the center position and the second motion vector deviation can then be determined based on the first adjustment direction.
It should be noted that, in the embodiment of the present application, the first adjustment direction is used for adjusting the first motion vector in the horizontal direction, and therefore, the first adjustment direction includes the left side and the right side.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold, if the first absolute value is greater than or equal to the preset deviation threshold and the second absolute value is smaller than the preset deviation threshold, the motion vector deviation between the pixel position and the sub-block in the horizontal direction may be considered large, and the prediction result obtained by performing quadratic prediction or PROF processing with this pixel as the center position of the filter may have an error. Therefore, it is necessary to further determine the first adjustment direction according to the horizontal deviation and then perform the adjustment according to the first adjustment direction, so as to finally determine the center position and the second motion vector deviation used in the quadratic prediction or PROF processing.
Exemplarily, in the present application, when the first motion vector deviation of the pixel position is (0.75, 0) and the preset deviation threshold is 1/2 pixel, the first absolute value is 0.75 and the second absolute value is 0. It can be seen that the first absolute value is greater than the preset deviation threshold and the second absolute value is smaller than the preset deviation threshold, that is, the motion vector deviation of the pixel position in the horizontal direction is large; based on the corresponding horizontal deviation 0.75, it can be determined that the actual position is closer to the right side, and therefore the first adjustment direction can be determined to be the right side.
Illustratively, in the present application, when the first motion vector deviation of a pixel position is (-0.75, 0) and the preset deviation threshold is 1/2 pixel, the first absolute value is 0.75 and the second absolute value is 0. It can be seen that the first absolute value is greater than the preset deviation threshold and the second absolute value is smaller than the preset deviation threshold, that is, the motion vector deviation of the pixel position in the horizontal direction is large; based on the corresponding horizontal deviation -0.75, it can be determined that the actual position is closer to the left side, and therefore the first adjustment direction can be determined to be the left side.
Further, in the embodiment of the present application, when the decoder determines the center position and the second motion vector deviation according to the first adjustment direction, if the first adjustment direction is the left side, the pixel position adjacent on the left of the pixel position may be taken as the center position, and (1, 0) may then be added to the first motion vector deviation to obtain the corresponding second motion vector deviation.
For example, in the present application, when the first motion vector offset of a pixel position is (-0.75, 0), and the preset offset threshold is 1/2 pixels, the first adjustment direction is determined to be the left side, then the pixel position adjacent to the pixel position on the left side can be used as the central position of the filter, and then (1, 0) can be added to the corresponding first motion vector offset (-0.75, 0), and the finally obtained second motion vector offset is (0.25, 0).
Further, in the embodiment of the present application, the decoder determines the center position and the second motion vector offset based on the first adjustment direction, and if the first adjustment direction is the right side, the adjacent right pixel position of any pixel position may be taken as the center position, and then the first motion vector offset may be subtracted by (1, 0), so that the second motion vector offset may be obtained.
For example, in the present application, when the first motion vector offset of a pixel position is (0.75, 0) and the preset offset threshold is 1/2 pixels, the first adjustment direction is determined to be right, then the pixel position on the right adjacent to the pixel position may be used as the center position of the filter, and then the corresponding first motion vector offset (0.75, 0) may be subtracted by (1, 0), and the finally obtained second motion vector offset is (-0.25, 0).
Further, in the embodiments of the present application, when the decoder determines the center position and the second motion vector deviation according to the horizontal deviation, the vertical deviation, and the preset deviation threshold, if the second absolute value is greater than or equal to the preset deviation threshold and the first absolute value is less than the preset deviation threshold, the decoder may determine a second adjustment direction according to the vertical deviation; the center position and the second motion vector deviation may then be determined based on the second adjustment direction.
It should be noted that, in the embodiment of the present application, the second adjustment direction is used for adjusting the first motion vector in the vertical direction; therefore, the second adjustment direction is either the upper side or the lower side.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold respectively, if the second absolute value is greater than or equal to the preset deviation threshold and the first absolute value is less than the preset deviation threshold, the motion vector deviation between the pixel position and the sub-block in the vertical direction may be considered large, and the prediction result obtained by performing the secondary prediction or the PROF processing with this pixel as the center position of the filter may therefore be inaccurate. Accordingly, it is necessary to further determine the second adjustment direction according to the vertical deviation, and then perform the adjustment according to that direction, so that the center position and the second motion vector deviation used in the secondary prediction or the PROF processing can finally be determined.
Exemplarily, in the present application, when the first motion vector deviation of a pixel position is (0, 0.75) and the preset deviation threshold is 1/2 pixel, the second absolute value is 0.75 and the first absolute value is 0; the second absolute value is greater than the preset deviation threshold and the first absolute value is less than the preset deviation threshold, that is, the motion vector deviation of the pixel position in the vertical direction is large. Based on the corresponding vertical deviation 0.75, the position is closer to the lower side; therefore, the second adjustment direction can be determined to be the lower side.
Exemplarily, in the present application, when the first motion vector deviation of a pixel position is (0, -0.75) and the preset deviation threshold is 1/2 pixel, the second absolute value is 0.75 and the first absolute value is 0; the second absolute value is greater than the preset deviation threshold and the first absolute value is less than the preset deviation threshold, that is, the motion vector deviation of the pixel position in the vertical direction is large. Based on the corresponding vertical deviation -0.75, the position is closer to the upper side; therefore, the second adjustment direction can be determined to be the upper side.
Further, in the embodiment of the present application, when the decoder determines the center position and the second motion vector deviation according to the second adjustment direction, if the second adjustment direction is the upper side, the pixel position adjacent above the current pixel position may be taken as the center position, and (0, 1) may then be added to the first motion vector deviation to obtain the corresponding second motion vector deviation.
For example, in the present application, when the first motion vector deviation of a pixel position is (0, -0.75) and the preset deviation threshold is 1/2 pixel, the second adjustment direction is determined to be the upper side; the pixel position adjacent above that pixel position can be used as the center position of the filter, (0, 1) can be added to the corresponding first motion vector deviation (0, -0.75), and the finally obtained second motion vector deviation is (0, 0.25).
Further, in the embodiment of the present application, when the decoder determines the center position and the second motion vector deviation according to the second adjustment direction, if the second adjustment direction is the lower side, the pixel position adjacent below the current pixel position may be taken as the center position, and (0, 1) may then be subtracted from the first motion vector deviation to obtain the second motion vector deviation.
For example, in the present application, when the first motion vector deviation of a pixel position is (0, 0.75) and the preset deviation threshold is 1/2 pixel, the second adjustment direction is determined to be the lower side; the pixel position adjacent below that pixel position can be used as the center position of the filter, (0, 1) can be subtracted from the corresponding first motion vector deviation (0, 0.75), and the finally obtained second motion vector deviation is (0, -0.25).
Further, in the embodiments of the present application, when the decoder determines the center position and the second motion vector deviation according to the horizontal deviation, the vertical deviation, and the preset deviation threshold, if both the first absolute value and the second absolute value are greater than or equal to the preset deviation threshold, the decoder may determine a third adjustment direction according to the horizontal deviation and the vertical deviation; the center position and the second motion vector deviation may then be determined based on the third adjustment direction.
It should be noted that, in the embodiment of the present application, the third adjustment direction is used for adjusting the first motion vector in the horizontal direction and the vertical direction at the same time; therefore, the third adjustment direction is one of the upper right side, the lower right side, the upper left side, and the lower left side.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold respectively, if both the first absolute value and the second absolute value are greater than or equal to the preset deviation threshold, the decoder may consider that the motion vector deviations between the pixel position and the sub-block in both the horizontal direction and the vertical direction are large, and the prediction result obtained by performing the secondary prediction or the PROF processing with this pixel as the center position of the filter may therefore be inaccurate. Accordingly, it is necessary to further determine a third adjustment direction according to the horizontal deviation and the vertical deviation, and then perform the adjustment according to that direction, so that the center position and the second motion vector deviation used in the secondary prediction or the PROF processing can finally be determined.
Exemplarily, in the present application, when the first motion vector deviation of a pixel position is (0.75, 0.75) and the preset deviation threshold is 1/2 pixel, the first absolute value is 0.75 and the second absolute value is 0.75; both are greater than the preset deviation threshold, that is, the motion vector deviations of the pixel position in the horizontal and vertical directions are large. Based on the corresponding horizontal deviation 0.75 the position is closer to the right side, and based on the corresponding vertical deviation 0.75 it is closer to the lower side; therefore, the third adjustment direction can be determined to be the lower right side.
Illustratively, in the present application, when the first motion vector deviation of a pixel position is (-0.75, -0.75) and the preset deviation threshold is 1/2 pixel, the first absolute value is 0.75 and the second absolute value is 0.75; both are greater than the preset deviation threshold, that is, the motion vector deviations of the pixel position in the horizontal and vertical directions are large. Based on the corresponding horizontal deviation -0.75 the position is closer to the left side, and based on the corresponding vertical deviation -0.75 it is closer to the upper side; therefore, the third adjustment direction can be determined to be the upper left side.
Further, in the embodiment of the present application, when the decoder determines the center position and the second motion vector deviation according to the third adjustment direction, if the third adjustment direction is the upper left side, the pixel position adjacent to the upper left of the current pixel position may be taken as the center position, and (1, 1) may then be added to the first motion vector deviation to obtain the corresponding second motion vector deviation.
For example, in the present application, when the first motion vector deviation of a pixel position is (-0.75, -0.75) and the preset deviation threshold is 1/2 pixel, the third adjustment direction is determined to be the upper left side; the pixel position adjacent to the upper left of that pixel position can be used as the center position of the filter, (1, 1) can be added to the corresponding first motion vector deviation (-0.75, -0.75), and the finally obtained second motion vector deviation is (0.25, 0.25).
Further, in the embodiment of the present application, when the decoder determines the center position and the second motion vector deviation according to the third adjustment direction, if the third adjustment direction is the lower right side, the pixel position adjacent to the lower right of the current pixel position may be taken as the center position, and (1, 1) may then be subtracted from the first motion vector deviation to obtain the second motion vector deviation.
Illustratively, in the present application, when the first motion vector deviation of a pixel position is (0.75, 0.75) and the preset deviation threshold is 1/2 pixel, the third adjustment direction is determined to be the lower right side; the pixel position adjacent to the lower right of that pixel position can be used as the center position of the filter, (1, 1) can be subtracted from the corresponding first motion vector deviation (0.75, 0.75), and the finally obtained second motion vector deviation is (-0.25, -0.25).
It is to be understood that, in the embodiments of the present application, the absolute values of the second motion vector deviation in both the horizontal direction and the vertical direction are smaller than the preset deviation threshold.
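The three cases above differ only in which component crosses the threshold, so the adjustment can be summarized per component. The following is a minimal, non-normative C sketch of this center adjustment, assuming fractional deviations and a fractional threshold thr (e.g., 0.5); all names are illustrative:

#include <math.h>

typedef struct {
    int cx, cy;          /* center offset, in whole pixels, relative to the pixel position */
    double dmv_x, dmv_y; /* second motion vector deviation */
} CenterAdjust;

static CenterAdjust adjust_center(double dmv_x, double dmv_y, double thr)
{
    CenterAdjust r = { 0, 0, dmv_x, dmv_y };
    if (fabs(dmv_x) >= thr) {          /* horizontal deviation is large */
        r.cx = (dmv_x > 0) ? 1 : -1;   /* right neighbour if positive, left if negative */
        r.dmv_x -= r.cx;               /* moving the center by 1 pixel shifts the deviation by 1 */
    }
    if (fabs(dmv_y) >= thr) {          /* vertical deviation is large */
        r.cy = (dmv_y > 0) ? 1 : -1;   /* lower neighbour if positive, upper if negative */
        r.dmv_y -= r.cy;
    }
    return r;
}

For example, adjust_center(0.75, 0.75, 0.5) returns the lower-right neighbour as the center with a second motion vector deviation of (-0.25, -0.25), matching the example above.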
That is, in the present application, after determining the deviation of the motion vector of each pixel position in the sub-block from the motion vector of the sub-block, and before filtering the sub-block-based prediction block, the decoder may first determine the center position and the second motion vector deviation, where the center position and the second motion vector deviation are used for performing the secondary prediction or the PROF processing.
Specifically, in the present application, for each pixel position in the sub-block, the magnitude of the first motion vector deviation between the pixel position and the sub-block may be determined, including the horizontal deviation and the vertical deviation of the first motion vector deviation in the horizontal and vertical directions. If the absolute value of the horizontal and/or vertical component of the motion vector deviation of a certain pixel position is greater than the preset deviation threshold, the center position of the filter corresponding to that pixel position is no longer the pixel position itself, and the center position and the second motion vector deviation need to be determined further.
For example, in the present application, assume the current pixel position is CURRENT. Then:
the pixel position to the left of CURRENT in the horizontal direction is LEFT, where LEFT is the coordinate of CURRENT plus (-1, 0);
the pixel position to the right of CURRENT in the horizontal direction is RIGHT, where RIGHT is the coordinate of CURRENT plus (1, 0);
the pixel position above CURRENT in the vertical direction is UP, where UP is the coordinate of CURRENT plus (0, -1);
the pixel position below CURRENT in the vertical direction is DOWN, where DOWN is the coordinate of CURRENT plus (0, 1);
the pixel position at the upper left of CURRENT is UPLEFT, where UPLEFT is the coordinate of CURRENT plus (-1, -1);
the pixel position at the upper right of CURRENT is UPRIGHT, where UPRIGHT is the coordinate of CURRENT plus (1, -1);
the pixel position at the lower left of CURRENT is DOWNLEFT, where DOWNLEFT is the coordinate of CURRENT plus (-1, 1);
the pixel position at the lower right of CURRENT is DOWNRIGHT, where DOWNRIGHT is the coordinate of CURRENT plus (1, 1).
Further, in the embodiment of the present application, before performing the quadratic prediction or the PROF processing on a pixel position in a sub-block, if it is determined that the absolute value of the deviation component of the first motion vector deviation between the pixel position and the sub-block in the horizontal direction and/or the vertical direction is greater than or equal to the preset deviation threshold (e.g. 1/2 pixels, 3/4 pixels, or 1 pixel), the starting point used in the quadratic prediction or the PROF processing, that is, the center position of the two-dimensional filter used in the quadratic prediction or the PROF processing, needs to be adjusted.
It is understood that, in the present application, when adjusting the center position and the motion vector deviation, another pixel position in the current block may be selected as the center, such that the motion vector deviation of the current pixel position relative to that position is less than or equal to the preset deviation threshold in both the horizontal and vertical directions.
For example, in the present application, if the first motion vector deviation of the motion vector of the CURRENT pixel position in the sub-block from the motion vector of the sub-block is (0.75, 0), and the preset deviation threshold is 1/2 pixels, the pixel position on the RIGHT side of the CURRENT pixel position CURRENT, i.e., RIGHT, may be used as the start position of the quadratic prediction of the CURRENT pixel position, i.e., the center position of the two-dimensional filter. Accordingly, after adjusting the start position of the quadratic prediction, the first motion vector offset needs to be adjusted accordingly, and specifically, the second motion vector offset may be determined according to the offset between the motion vector of the current pixel position and the motion vector of the center position. For example, since the starting position is adjusted to RIGHT by CURRENT, the first motion vector offset needs to be subtracted by (1, 0), and then the adjusted second motion vector offset is (-0.25, 0).
For example, in the present application, if the first motion vector deviation of the motion vector of the CURRENT pixel position in the sub-block from the motion vector of the sub-block is (0.75, 0.75) and the preset deviation threshold is 1/2 pixel, the pixel position at the lower right of the current pixel position CURRENT, i.e., DOWNRIGHT, may be used as the start position of the PROF processing of the current pixel position, i.e., the center position of the one-dimensional filter. Accordingly, after adjusting the start position, the first motion vector deviation needs to be adjusted accordingly; specifically, the second motion vector deviation may be determined according to the deviation between the motion vector of the current pixel position and the motion vector of the center position. For example, since the start position is adjusted from CURRENT to DOWNRIGHT, (1, 1) needs to be subtracted from the first motion vector deviation, and the adjusted second motion vector deviation is then (-0.25, -0.25).
Based on the above fig. 9, fig. 13 is a third schematic diagram of the center position. As shown in fig. 13, both dmv_x0 and dmv_y0 are larger than one-half pixel; it can be seen that the pixel position of square 2, with coordinates (2, 2), is closest to the circle, and if the reference position, that is, the center position of the filter, is set at this position, the obtained prediction result can be considered more accurate than the original one. The adjusted motion vector deviation is (dmv_x, dmv_y), with the absolute value of dmv_x less than dmv_x0 and the absolute value of dmv_y less than dmv_y0. The dashed boxes in the figure represent the pixel positions used by the filter before the adjustment, and the solid boxes represent the pixel positions used by the filter after the adjustment.
In an embodiment of the present application, further, fig. 14 is a flowchart illustrating a second implementation of the inter prediction method. As shown in fig. 14, before determining the center position corresponding to each pixel position and the second motion vector deviation according to the first motion vector deviation, that is, before step 304, the method for performing inter prediction by the decoder may further include the following steps:
step 306, limiting the first motion vector deviation according to a preset deviation range; the preset deviation range comprises a deviation lower limit value and a deviation upper limit value.
In the embodiment of the present application, before determining the center position corresponding to each pixel position and the second motion vector deviation according to the first motion vector deviation, the decoder may limit the first motion vector deviation. Specifically, the decoder may perform a limiting process on the first motion vector deviation according to a preset deviation range, where the preset deviation range comprises a deviation lower limit value and a deviation upper limit value.
Further, in the embodiment of the present application, when the decoder limits the first motion vector deviation according to the preset deviation range, if the horizontal deviation and/or the vertical deviation is smaller than the deviation lower limit value, the horizontal deviation and/or the vertical deviation may be set to the deviation lower limit value; if the horizontal deviation and/or the vertical deviation is larger than the deviation upper limit value, the horizontal deviation and/or the vertical deviation may be set to the deviation upper limit value.
It should be noted that, in the present application, the decoder may also skip limiting the first motion vector deviation, i.e., not perform step 306, and instead directly adjust the center position to any suitable pixel position according to the first motion vector deviation. Alternatively, the first motion vector deviation may first be limited according to step 306, or both the adjustable range and the motion vector deviation may be limited.
Further, in the embodiment of the present application, the preset deviation range may consist of a deviation lower limit value min and a deviation upper limit value max, i.e., the preset deviation range may be represented as (min, max). When the first motion vector deviation is limited, if the deviation value of the first motion vector deviation in the horizontal direction and/or the vertical direction is smaller than min, that deviation value may be set directly to min; if it is larger than max, it may be set directly to max.
Exemplarily, in the present application, assuming that the upper limit value max is 1, i.e., the first motion vector deviation is limited to within 1 pixel, if the absolute value (the first absolute value and/or the second absolute value) of the horizontal and/or vertical deviation value of the first motion vector deviation is greater than 1, that absolute value is set to 1 pixel.
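A brief sketch of the limiting process in step 306 under the same assumptions (fractional deviations; min_dev and max_dev are the lower and upper deviation limits, illustrative names):

static double clamp_dev(double v, double min_dev, double max_dev)
{
    if (v < min_dev) return min_dev;   /* below the lower limit: set to the lower limit */
    if (v > max_dev) return max_dev;   /* above the upper limit: set to the upper limit */
    return v;
}

/* applied to both components, e.g. with the range (-1, 1), i.e. 1 pixel:     */
/* dmv_x = clamp_dev(dmv_x, -1.0, 1.0); dmv_y = clamp_dev(dmv_y, -1.0, 1.0); */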
In an embodiment of the present application, further, fig. 15 is a flowchart illustrating a third implementation of the inter prediction method. As shown in fig. 15, after determining the center position corresponding to each pixel position and the second motion vector deviation according to the first motion vector deviation, that is, after step 304, the method for performing inter prediction by the decoder may further include the following steps:
step 307, if the center position does not belong to the current block, re-determining the center position according to the pixel positions in the current block.
In the embodiment of the present application, after determining the center position corresponding to each pixel position and the second motion vector offset according to the first motion vector offset, the decoder may first determine whether the center position exceeds the current block, and if the center position exceeds the range of the current block, that is, the center position does not belong to the current block, the decoder needs to re-determine the center position according to the pixel position in the current block.
In an embodiment of the present application, further, fig. 16 is a flowchart illustrating a fourth implementation of the inter prediction method. As shown in fig. 16, after determining the center position corresponding to each pixel position and the second motion vector deviation according to the first motion vector deviation, that is, after step 304, the method for performing inter prediction by the decoder may further include the following steps:
step 308, if the center position does not belong to the current block, directly determining each pixel position as the center position corresponding to that pixel position.
In the embodiment of the present application, after determining the center position corresponding to each pixel position and the second motion vector deviation according to the first motion vector deviation, the decoder may first determine whether the center position exceeds the current block; if the center position exceeds the range of the current block, that is, the center position does not belong to the current block, the decoder may directly determine the corresponding original pixel position as the center position. That is, if the center position determined based on the first motion vector deviation corresponding to a pixel position in the sub-block falls outside the range of the current block, the decoder may choose to take that pixel position itself as the center position.
That is, in the embodiment of the present application, for the adjusted start position of the secondary prediction, i.e., the center position used by the filter as determined based on the first motion vector deviation, the decoder may restrict the center position not to exceed the range of the current block, i.e., not to exceed the range of the start positions of the secondary prediction of all pixel positions in the current block before the adjustment.
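Steps 307 and 308 are two alternative fallbacks for a center position that leaves the current block. A hedged sketch of both, where (px, py) is the original pixel position, (cx, cy) the candidate center, and the clamping in the step-307 branch is one possible way to re-determine the center from positions inside the block:

static void keep_center_in_block(int px, int py, int *cx, int *cy,
                                 int width, int height, int use_step_308)
{
    if (*cx < 0 || *cx >= width || *cy < 0 || *cy >= height) {
        if (use_step_308) {            /* step 308: fall back to the pixel position itself */
            *cx = px;
            *cy = py;
        } else {                       /* step 307: re-determine from positions inside the block */
            *cx = (*cx < 0) ? 0 : ((*cx >= width)  ? width  - 1 : *cx);
            *cy = (*cy < 0) ? 0 : ((*cy >= height) ? height - 1 : *cy);
        }
    }
}

If the center is moved in either branch, the second motion vector deviation has to be recomputed against the new center, as in the earlier adjust_center sketch.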
step 305, performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determining a second predicted value of the sub-block, and determining the second predicted value as the inter-frame predicted value of the sub-block.
In the embodiment of the present application, after determining the center position and the second motion vector offset corresponding to each pixel position according to the first motion vector offset, the decoder may perform quadratic prediction or PROF processing according to the center position and the second motion vector offset based on the first prediction value, so that the prediction value of the pixel at each pixel position may be obtained.
Further, in the embodiment of the present application, after the decoder traverses each pixel position in the sub-block to obtain the predicted value of the pixel at each pixel position, the decoder may determine the second predicted value of the sub-block according to the predicted value of the pixel at each pixel position, so that the second predicted value may be determined as the inter-frame predicted value of the sub-block.
It should be noted that, in the embodiment of the present application, since the center position and the second motion vector deviation may be used to perform secondary prediction or PROF processing on the pixel at each pixel position in the sub-block, after obtaining the center position and the second motion vector deviation, the decoder may perform secondary prediction or PROF processing on the pixel at each pixel position using them, based on the first predicted value, and finally obtain the second predicted value corresponding to the sub-block, which may then be determined as the inter-frame predicted value of the sub-block.
Further, in an embodiment of the present application, the method by which the decoder performs secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determines the second predicted value of the sub-block, and determines the second predicted value as the inter-frame predicted value of the sub-block may include the following steps:
step 305a, parsing the code stream to obtain a PROF parameter;
step 305b, when the PROF parameter indicates that PROF processing is to be performed, determining a pixel horizontal gradient and a pixel vertical gradient corresponding to the center position based on the first predicted value;
step 305c, calculating a deviation value corresponding to each pixel position according to the pixel horizontal gradient, the pixel vertical gradient, and the second motion vector deviation;
step 305d, obtaining a predicted value of the pixel at each pixel position based on the first predicted value and the deviation value;
step 305e, determining the second predicted value using the predicted value of the pixel at each pixel position.
In the embodiment of the application, the decoder may first parse the code stream to obtain the PROF parameter, and if the PROF parameter indicates to perform the PROF processing, the decoder may determine, based on the first predicted value, a pixel horizontal gradient and a pixel vertical gradient corresponding to the central position; the pixel horizontal gradient is a gradient value between a pixel value corresponding to the central position and a pixel value corresponding to an adjacent pixel position in the horizontal direction; the pixel vertical gradient is a gradient value between a pixel value corresponding to the center position and a pixel value corresponding to an adjacent pixel position in the vertical direction.
Further, in the embodiment of the present application, the decoder may calculate an offset value corresponding to each pixel position according to the pixel horizontal gradient, the pixel vertical gradient and the second motion vector offset of the center position corresponding to each pixel position. The offset value may be used to correct the predicted value of the pixel value at each pixel position.
It should be noted that, in the embodiment of the present application, the decoder may obtain, according to the first predicted value and the deviation value, a corrected predicted value corresponding to any pixel position; after traversing each pixel position in the current sub-block and obtaining the corrected predicted value corresponding to each of them, the decoder determines the second predicted value corresponding to the current sub-block using the corrected predicted values of all pixel positions, thereby determining the corresponding inter-frame predicted value. Specifically, in the present application, after the sub-block-based prediction is completed, the first predicted value of the current sub-block is used as the predicted value of each pixel position; the first predicted value is then added to the deviation value corresponding to each pixel position, which completes the correction of the predicted value of each pixel position and yields the corrected predicted value. The second predicted value of the current sub-block can thereby be obtained and used as the inter-frame predicted value corresponding to the current sub-block.
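Steps 305b to 305d amount to a gradient-based correction of the sub-block prediction. A minimal sketch, assuming integer prediction samples and fractional deviations (the rounding is illustrative, not normative):

#include <math.h>

/* PROF-style refinement of one pixel: the deviation value is the dot product
 * of the pixel gradients at the center position and the second motion vector
 * deviation, and it corrects the first (sub-block) predicted value. */
static int prof_refine(int pred,                    /* first predicted value          */
                       int gx, int gy,              /* horizontal / vertical gradient */
                       double dmv_x, double dmv_y)  /* second motion vector deviation */
{
    double di = gx * dmv_x + gy * dmv_y;  /* deviation (correction) value */
    return pred + (int)lround(di);        /* corrected predicted value    */
}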
Further, in an embodiment of the present application, the method by which the decoder performs secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determines the second predicted value of the sub-block, and determines the second predicted value as the inter-frame predicted value of the sub-block may include the following steps:
step 305f, parsing the code stream to obtain a secondary prediction parameter;
step 305g, when the secondary prediction parameter indicates that secondary prediction is used, determining the filter coefficients of the two-dimensional filter according to the second motion vector deviation, where the two-dimensional filter is used for performing secondary prediction according to a preset shape;
step 305h, determining a predicted value of the pixel at each pixel position based on the filter coefficients and the first predicted value;
step 305i, determining the second predicted value using the predicted value of the pixel at each pixel position.
In the embodiment of the application, the decoder may first parse the code stream to obtain a secondary prediction parameter, and if the secondary prediction parameter indicates that secondary prediction is used, the decoder may determine a filter coefficient of the two-dimensional filter according to the second motion vector deviation; the two-dimensional filter is used for carrying out quadratic prediction processing according to a preset shape.
It should be noted that, in the embodiment of the present application, the filter coefficient of the two-dimensional filter is related to the second motion vector offset corresponding to the target pixel position. That is, if the corresponding second motion vector deviations are different for different target pixel positions, the filter coefficients of the two-dimensional filters used are also different.
It will be appreciated that, in embodiments of the present application, the two-dimensional filter performs secondary prediction using a plurality of adjacent pixel positions that form a preset shape, where the preset shape is a rectangle, a rhombus, or any symmetrical shape.
That is, in the present application, the two-dimensional filter for performing secondary prediction is a filter constructed from the adjacent points constituting the preset shape. The adjacent points constituting the preset shape may include a plurality of points, for example, 9 points, and the preset shape may be any symmetrical shape, for example, a rectangle or a rhombus.
Illustratively, in the present application, the two-dimensional filter is a rectangular filter; specifically, it is a filter composed of 9 adjacent pixel positions constituting a rectangle. Of the 9 pixel positions, the one located at the center is the pixel position of the pixel currently requiring secondary prediction, i.e., the current pixel position.
Further, in the embodiment of the present application, when determining the filter coefficients of the two-dimensional filter according to the second motion vector deviation, the decoder may first parse the code stream to obtain the scale parameter, and may then determine the filter coefficients corresponding to the pixel position according to the scale parameter and the second motion vector deviation.
It should be noted that, in the embodiment of the present application, the scale parameter may include at least one scale value, and the second motion vector offset includes a horizontal offset and a vertical offset; wherein at least one of the proportional values is a non-zero real number.
Specifically, in the present application, when the two-dimensional filter performs secondary prediction using 9 adjacent pixel positions that form a rectangle, the pixel position located at the center of the rectangle is the position to be predicted, that is, the current pixel position, and the other 8 target pixel positions are located in the 8 directions of the current pixel position, namely the upper left, upper, upper right, left, right, lower left, lower, and lower right directions.
Accordingly, in the present application, the decoder may calculate the 9 filter coefficients corresponding to the 9 adjacent pixel positions according to a preset calculation rule, based on at least one scale value and the second motion vector deviation of the position to be predicted.
It should be noted that, in the present application, the preset calculation rule may include a plurality of different calculation manners, such as addition, subtraction, multiplication, and the like. Wherein, for different pixel positions, different calculation modes can be used for calculating the filter coefficient.
It is understood that, in the present application, when the decoder calculates the filter coefficients corresponding to the pixel positions according to the different calculation manners in the preset calculation rule, some of the filter coefficients may be linear functions of the second motion vector deviation, i.e., have a linear relationship with the second motion vector deviation, while others may be quadratic or higher-order functions of the second motion vector deviation, i.e., have a nonlinear relationship with it.
That is, in the present application, any one of the plurality of filter coefficients corresponding to a plurality of adjacent pixel positions may be a linear function, a quadratic function, or a high-order function of the second motion vector deviation.
Illustratively, in this application, assume the second motion vector deviation of a pixel position is (dmv_x, dmv_y). If the coordinates of the target pixel position are (i, j), dmv_x may be represented as dMv[i][j][0], i.e., the deviation value of the second motion vector deviation in the horizontal component, and dmv_y may be represented as dMv[i][j][1], i.e., the deviation value of the second motion vector deviation in the vertical component.
Accordingly, Table 3 shows the filter coefficients obtained based on the second motion vector deviation (dmv_x, dmv_y). As shown in Table 3, for the two-dimensional filter, the 9 filter coefficients corresponding to the 9 neighboring pixel positions can be obtained from the second motion vector deviation of the pixel position (horizontal deviation dmv_x, vertical deviation dmv_y) and different scale parameters, such as m and n; the decoder may directly set the filter coefficient of the current pixel position at the center to 1.
TABLE 3

Pixel position    Filter coefficient
Upper left        (-dmv_x - dmv_y) × m
Left              -dmv_x × n
Lower left        (-dmv_x + dmv_y) × m
Upper             -dmv_y × n
Center            1
Lower             dmv_y × n
Upper right       (dmv_x - dmv_y) × m
Right             dmv_x × n
Lower right       (dmv_x + dmv_y) × m
Here, the scale parameters m and n are typically fractional values; one possible case is that both m and n are reciprocals of powers of 2, such as 1/2, 1/4, 1/8, and so on. Both dmv_x and dmv_y are expressed at their actual size, i.e., a value of 1 for dmv_x or dmv_y represents a distance of 1 pixel, and dmv_x and dmv_y are fractional values.
It should be noted that, in the embodiment of the present application, for the currently common 8-tap filter, the motion vectors of the corresponding integer pixel position and sub-pixel position are non-negative in both the horizontal and vertical directions and lie between 0 pixels and 1 pixel, i.e., dmv_x and dmv_y cannot be negative. In the present application, by contrast, the motion vectors of the integer pixel position and sub-pixel position corresponding to the filter may be negative in both the horizontal and vertical directions, i.e., dmv_x and dmv_y may be negative.
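Under these definitions, Table 3 can be transcribed directly. A non-normative sketch that fills the 9 coefficients in raster order (upper left, upper, upper right, left, center, right, lower left, lower, lower right); the function name and ordering are illustrative:

static void filter_coeffs(double dmv_x, double dmv_y, double m, double n,
                          double c[9])
{
    c[0] = (-dmv_x - dmv_y) * m;  /* upper left  */
    c[1] = -dmv_y * n;            /* upper       */
    c[2] = ( dmv_x - dmv_y) * m;  /* upper right */
    c[3] = -dmv_x * n;            /* left        */
    c[4] = 1.0;                   /* center      */
    c[5] =  dmv_x * n;            /* right       */
    c[6] = (-dmv_x + dmv_y) * m;  /* lower left  */
    c[7] =  dmv_y * n;            /* lower       */
    c[8] = ( dmv_x + dmv_y) * m;  /* lower right */
}

Note that the 8 non-center coefficients cancel in pairs, so the coefficients always sum to 1 and the filter preserves the overall signal level regardless of (dmv_x, dmv_y), m, and n.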
For example, in the embodiment of the present application, if the scale parameter m is 1/16 and n is 1/2, Table 3 above may be expressed as the following Table 4:
TABLE 4

Pixel position    Filter coefficient
Upper left        (-dmv_x - dmv_y)/16
Left              -dmv_x/2
Lower left        (-dmv_x + dmv_y)/16
Upper             -dmv_y/2
Center            1
Lower             dmv_y/2
Upper right       (dmv_x - dmv_y)/16
Right             dmv_x/2
Lower right       (dmv_x + dmv_y)/16
It will be appreciated that, in video coding technologies and standards, magnification is typically used to avoid fractional and floating-point operations, and the result of the computation is then reduced by the appropriate factor to obtain the correct result. A left shift is typically used for magnification and a right shift for reduction. Therefore, when performing the secondary prediction with the two-dimensional filter, the following form is used in practice:
assuming that the second motion vector deviation of the pixel position is (dmv_x, dmv_y), and that (dmv_x', dmv_y') is obtained by left-shifting (dmv_x, dmv_y) by shift1 bits, the coefficients of the two-dimensional filter can be expressed, based on Table 4 above, as the following Table 5:
TABLE 5

Pixel position    Filter coefficient
Upper left        -dmv_x' - dmv_y'
Left              -dmv_x' × 8
Lower left        -dmv_x' + dmv_y'
Upper             -dmv_y' × 8
Center            16 << shift1
Lower             dmv_y' × 8
Upper right       dmv_x' - dmv_y'
Right             dmv_x' × 8
Lower right       dmv_x' + dmv_y'
Fig. 17 is a first schematic diagram of a two-dimensional filter. As shown in fig. 17, the secondary prediction is based on the result of the sub-block-based prediction; the light squares are the integer pixel positions of the filter, i.e., the positions obtained from the sub-block-based prediction. The circle is the sub-pixel position requiring secondary prediction, i.e., the pixel position; the dark squares are the integer pixel positions corresponding to the sub-pixel position, and the 9 integer pixel positions shown in the figure are needed to obtain the sub-pixel position by interpolation.
Fig. 18 is a second schematic diagram of a two-dimensional filter. As shown in fig. 18, the secondary prediction is likewise based on the result of the sub-block-based prediction; the light squares are the integer pixel positions of the filter, i.e., the positions obtained from the sub-block-based prediction. The circle is the sub-pixel position requiring secondary prediction, i.e., the pixel position; the dark squares are the integer pixel positions corresponding to the sub-pixel position, and the 13 integer pixel positions shown in the figure are needed to obtain the sub-pixel position by interpolation.
Further, in the embodiment of the present application, after determining the filter coefficient of the two-dimensional filter according to the second motion vector deviation, the decoder may determine the second predicted value of the current sub-block based on the filter coefficient and the first predicted value, so that secondary prediction of the current sub-block may be implemented.
It is understood that, in the embodiment of the present application, the decoder determines the filter coefficient by using the second motion vector offset corresponding to the pixel position, so that the first prediction value can be corrected by the two-dimensional filter according to the filter coefficient to obtain the corrected second prediction value of the current sub-block. As can be seen, the second predicted value is a corrected value based on the first predicted value.
Further, in the embodiment of the present application, when the decoder determines the second predicted value of the current sub-block based on the filter coefficients and the first predicted value, the decoder may first multiply each filter coefficient by the first predicted value at the corresponding pixel position to obtain a product result, accumulate the product results of all pixel positions of the current sub-block after traversing them to obtain a sum, and finally normalize the sum, thereby obtaining the corrected second predicted value of the current sub-block.
In the embodiment of the present application, before performing the secondary prediction, the first predicted value of the current sub-block where the pixel position is located is generally used as the predicted value before the correction of the pixel position, and therefore, when performing the filtering by the two-dimensional filter, the filter coefficient may be multiplied by the predicted value of the corresponding pixel position, that is, the first predicted value, and the multiplication results corresponding to each pixel position may be accumulated and then normalized.
It is understood that the decoder can perform the normalization in various ways in the present application. For example, the filter coefficients can be multiplied by the predicted values of the corresponding pixel positions and the accumulated result right-shifted by (4 + shift1) bits. Alternatively, the filter coefficients can be multiplied by the predicted values of the corresponding pixel positions, (1 << (3 + shift1)) added to the accumulated result, and the result then right-shifted by (4 + shift1) bits.
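Putting Table 5 and this normalization together, a hedged integer-only sketch (p[9] holds the first predicted values of the 9 neighbours in the same order as the coefficient sketch above; dmv_xs and dmv_ys are the left-shifted deviations dmv_x', dmv_y'; names are illustrative):

static int filter_apply(const int p[9], int dmv_xs, int dmv_ys, int shift1)
{
    const int c[9] = {
        -dmv_xs - dmv_ys,   /* upper left  */
        -dmv_ys * 8,        /* upper       */
         dmv_xs - dmv_ys,   /* upper right */
        -dmv_xs * 8,        /* left        */
        16 << shift1,       /* center      */
         dmv_xs * 8,        /* right       */
        -dmv_xs + dmv_ys,   /* lower left  */
         dmv_ys * 8,        /* lower       */
         dmv_xs + dmv_ys    /* lower right */
    };
    long sum = 1L << (3 + shift1);      /* rounding offset (1 << (3 + shift1)) */
    for (int k = 0; k < 9; k++)
        sum += (long)c[k] * p[k];
    return (int)(sum >> (4 + shift1));  /* normalize: right shift by (4 + shift1) */
}

Since the coefficients sum to 16 << shift1, the right shift by (4 + shift1) restores unit gain.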
Therefore, in the present application, after obtaining the second motion vector deviation corresponding to the pixel position inside the current sub-block, for each sub-block and each pixel position in each sub-block, filtering may be performed by using a two-dimensional filter based on the motion-compensated first predicted value of the current sub-block according to the second motion vector deviation, so as to complete secondary prediction on the current sub-block, and obtain a new second predicted value.
Further, in the embodiments of the present application, the two-dimensional filter may be understood as performing quadratic prediction using a plurality of adjacent pixel positions constituting a preset shape. The preset shape can be a rectangle, a rhombus or any symmetrical shape.
Specifically, in the embodiment of the present application, when performing secondary prediction using the 9 adjacent pixel positions constituting a rectangle, the two-dimensional filter may first determine the prediction sample matrix of the current block and the motion vector deviation matrix of the current sub-block of the current block, where the motion vector deviation matrix comprises the second motion vector deviations corresponding to all pixel positions; the secondary prediction sample matrix of the current block is then determined based on the 9 adjacent pixel positions constituting the rectangle, using the prediction sample matrix and the motion vector deviation matrix.
Illustratively, in the present application, if the width and height of the current block are width and height, respectively, the width and height of each sub-block are subwidth and subheight, respectively. As shown in fig. 7, the sub-block containing the top-left sample of the luma prediction sample matrix of the current block is A, the sub-block containing the top-right sample is B, the sub-block containing the bottom-left sample is C, and the sub-blocks at the other positions are the other sub-blocks.
For each sub-block in the current block, the motion vector deviation matrix of the sub-block can be recorded as dMv; then:
1. if the sub-block is A, dMv equals dMvA;
2. if the sub-block is B, dMv equals dMvB;
3. if the sub-block is C and there are 3 motion vectors in the control point motion vector group mvAffine of the current block, dMv is equal to dMvC;
4. dMv equals dMvN if the sub-block is other than A, B, C.
Further, assume that (x, y) is the coordinate of the upper-left corner of the current sub-block and (i, j) is the coordinate of a pixel inside the luma sub-block, where i ranges from 0 to (subwidth - 1) and j ranges from 0 to (subheight - 1). Let the sub-block-based prediction sample matrix be PredMatrixSb, the secondary prediction sample matrix be PredMatrixS, and dCenterX, dCenterY be the offset matrices of the secondary prediction start position in the horizontal and vertical directions. The secondary prediction sample PredMatrixS[x+i][y+j] at (x+i, y+j) can then be calculated as follows:
dCenterX[i][j]=(dMv[i][j][0]<0?-1:1)*((abs(dMv[i][j][0])+(1<<10))/(1<<11))
dCenterX[i][j]=Clip3(-i,width-1-i,dCenterX[i][j])
dCenterY[i][j]=(dMv[i][j][1]<0?-1:1)*((abs(dMv[i][j][1])+(1<<10))/(1<<11))
dCenterY[i][j]=Clip3(-j,height-1-j,dCenterY[i][j])
dMv[i][j][0]=dMv[i][j][0]-dCenterX[i][j]*(1<<11)
dMv[i][j][1]=dMv[i][j][1]-dCenterY[i][j]*(1<<11)
PredMatrixS[x+i][y+j]=
(UPLEFT(x+i,y+j)×(-dMv[i][j][0]-dMv[i][j][1])+
UP(x+i,y+j)×((-dMv[i][j][1])<<3)+
UPRIGHT(x+i,y+j)×(dMv[i][j][0]-dMv[i][j][1])+
LEFT(x+i,y+j)×((-dMv[i][j][0])<<3)+
CENTER(x+i,y+j)×(1<<15)+
RIGHT(x+i,y+j)×(dMv[i][j][0]<<3)+
DOWNLEFT(x+i,y+j)×(-dMv[i][j][0]+dMv[i][j][1])+
DOWN(x+i,y+j)×(dMv[i][j][1]<<3)+
DOWNRIGHT(x+i,y+j)×(dMv[i][j][0]+dMv[i][j][1])+
(1<<14))>>15
PredMatrixS[x+i][y+j]=Clip3(0,(1<<BitDepth)-1,PredMatrixS[x+i][y+j])。
wherein UPLEFT(x+i,y+j)=PredMatrixSb[max(0,x+i+dCenterX[i][j]-1)][max(0,y+j+dCenterY[i][j]-1)]
UP(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]][max(0,y+j+dCenterY[i][j]-1)]
UPRIGHT(x+i,y+j)=PredMatrixSb[min(width-1,x+i+dCenterX[i][j]+1)][max(0,y+j+dCenterY[i][j]-1)]
LEFT(x+i,y+j)=PredMatrixSb[max(0,x+i+dCenterX[i][j]-1)][y+j+dCenterY[i][j]]
CENTER(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]][y+j+dCenterY[i][j]]
RIGHT(x+i,y+j)=PredMatrixSb[min(width-1,x+i+dCenterX[i][j]+1)][y+j+dCenterY[i][j]]
DOWNLEFT(x+i,y+j)=PredMatrixSb[max(0,x+i+dCenterX[i][j]-1)][min(height-1,y+j+dCenterY[i][j]+1)]
DOWN(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]][min(height-1,y+j+dCenterY[i][j]+1)]
DOWNRIGHT(x+i,y+j)=PredMatrixSb[min(width-1,x+i+dCenterX[i][j]+1)][min(height-1,y+j+dCenterY[i][j]+1)]
Here, max(a, b) denotes the larger of a and b, and min(a, b) denotes the smaller of a and b.
Further, in the present application, if the prediction block of the sub-block-based prediction is expanded by one row and one column on each side, i.e., rows -1 and height and columns -1 and width are added to PredMatrixSb, the above expressions can be written as:
UPLEFT(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]-1][y+j+dCenterY[i][j]-1]
UP(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]][y+j+dCenterY[i][j]-1]
UPRIGHT(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]+1][y+j+dCenterY[i][j]-1]
LEFT(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]-1][y+j+dCenterY[i][j]]
CENTER(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]][y+j+dCenterY[i][j]]
RIGHT(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]+1][y+j+dCenterY[i][j]]
DOWNLEFT(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]-1][y+j+dCenterY[i][j]+1]
DOWN(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]][y+j+dCenterY[i][j]+1]
DOWNRIGHT(x+i,y+j)=PredMatrixSb[x+i+dCenterX[i][j]+1][y+j+dCenterY[i][j]+1]。
Further, in the present application, dCenterX and dCenterY need not be written in matrix form; they may instead be written as temporary variables.
It is understood that, in the embodiment of the present application, the CENTER(x+i, y+j) pixel position may be the central position among the 9 adjacent pixel positions constituting the rectangle, and the secondary prediction process may then be performed based on the (x+i, y+j) pixel position and the 8 pixel positions adjacent to it, namely UP, UPRIGHT, LEFT, RIGHT, DOWNLEFT, DOWN, DOWNRIGHT, and UPLEFT.
It should be noted that, in the present application, the calculation of PredMatrixS[x+i][y+j] may use a lower precision. For example, the terms of each multiplication may be right-shifted, e.g., dMv[i][j][0] and dMv[i][j][1] right-shifted by shift3 bits; accordingly, 1<<15 becomes 1<<(15-shift3), and (…+(1<<10))>>11 becomes (…+(1<<(10-shift3)))>>(11-shift3).
For example, the magnitude of the motion vector deviation may be limited to a reasonable range, e.g., the horizontal and vertical components used above, whether positive or negative, do not exceed 1 pixel, 1/2 pixel, or 1/4 pixel.
It can be understood that, if the prediction reference mode of the current block is 'Pred_List01', the decoder averages the prediction sample matrices of each component to obtain the final prediction sample matrix of that component. For example, a new luma prediction sample matrix is obtained by averaging the 2 luma prediction sample matrices.
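A small illustrative sketch of this averaging, treating each prediction sample matrix as a flat array (the rounded average is one common choice; the exact rounding follows the standard text):

static void average_preds(const int *p0, const int *p1, int *out, int count)
{
    for (int i = 0; i < count; i++)
        out[i] = (p0[i] + p1[i] + 1) >> 1;   /* rounded average of the two lists */
}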
Further, in the embodiment of the present application, after the prediction sample matrix of the current block is obtained, if the current block has no transform coefficients, the prediction matrix is used as the decoding result of the current block; if the current block has transform coefficients, the transform coefficients may first be decoded, a residual matrix obtained through inverse transform and inverse quantization, and the residual matrix added to the prediction matrix to obtain the decoding result.
It can be understood that, in the inter-frame prediction method provided by the present application, the initial position of the secondary prediction may be adjusted based on the first motion vector deviation corresponding to a pixel position in the sub-block, that is, the center position used by the filter performing the secondary prediction or PROF processing is adjusted; the first motion vector deviation may then be further adjusted based on the determined center position to obtain the second motion vector deviation; finally, point-based prediction may be performed on the pixel position using the center position and the second motion vector deviation to obtain the second predicted value of the sub-block.
It should be noted that the inter-frame prediction method proposed in the present application may be applied to any image component, and in the present embodiment, a quadratic prediction scheme is exemplarily used for the luminance component, but may also be applied to the chrominance component, or any component in other formats. The inter-frame prediction method proposed in the present application can also be applied to any video format, including but not limited to YUV format, including but not limited to the luma component in YUV format.
The present embodiment provides an inter prediction method. After the sub-block-based prediction, for a pixel position whose motion vector has a large first motion vector deviation from the motion vector of the sub-block, the center position and the second motion vector deviation used for performing the secondary prediction or the PROF processing can be re-determined based on the first motion vector deviation, so that point-based secondary prediction can be performed using the center position and the second motion vector deviation on the basis of the sub-block-based first predicted value to obtain the second predicted value. Therefore, the inter-frame prediction method provided by the present application can be well adapted to all scenes, the prediction error can be reduced, the coding performance is greatly improved, and the coding and decoding efficiency is improved.
The embodiment of the application provides an inter-frame prediction method, which is applied to a video coding device, namely an encoder. The functions implemented by the method may be implemented by the second processor in the encoder calling a computer program stored in the second memory; it is understood that the encoder comprises at least the second processor and the second memory.
Fig. 19 is a flowchart illustrating a fifth implementation of the inter prediction method, and as shown in fig. 19, the method for performing inter prediction by the encoder may include the following steps:
step 401, determining a prediction mode parameter of the current block.
In an embodiment of the present application, an encoder may first determine a prediction mode parameter of a current block. Specifically, the encoder may first determine a prediction mode used by the current block and then determine corresponding prediction mode parameters based on the prediction mode. Wherein the prediction mode parameter may be used to determine a prediction mode used by the current block.
It should be noted that, in the embodiment of the present application, an image to be encoded may be divided into a plurality of image blocks, the image block to be encoded currently may be referred to as a current block, and an image block adjacent to the current block may be referred to as an adjacent block; i.e. in the image to be encoded, the current block has a neighboring relationship with the neighboring block. Here, each current block may include a first image component, a second image component, and a third image component; that is, the current block is an image block to be subjected to prediction of a first image component, a second image component or a third image component in the image to be coded.
Wherein, assuming that the current block performs the first image component prediction, and the first image component is a luminance component, that is, the image component to be predicted is a luminance component, then the current block may also be called a luminance block; alternatively, assuming that the current block performs the second image component prediction, and the second image component is a chroma component, that is, the image component to be predicted is a chroma component, the current block may also be referred to as a chroma block.
It should be noted that, in the embodiment of the present application, the prediction mode parameter indicates the prediction mode adopted by the current block and a parameter related to the prediction mode. Here, for the determination of the prediction mode parameter, a simple decision strategy may be adopted, such as determining according to the magnitude of the distortion value; a complex decision strategy, such as determination based on the result of Rate Distortion Optimization (RDO), may also be adopted, and the embodiment of the present application is not limited in any way. Generally, the prediction mode parameter of the current block may be determined in an RDO manner.
Specifically, in some embodiments, when determining the prediction mode parameter of the current block, the encoder may perform pre-coding processing on the current block by using multiple prediction modes to obtain a rate-distortion cost value corresponding to each prediction mode; and then selecting the minimum rate distortion cost value from the obtained multiple rate distortion cost values, and determining the prediction mode parameters of the current block according to the prediction mode corresponding to the minimum rate distortion cost value.
That is, on the encoder side, the current block may be pre-encoded in a plurality of prediction modes. Here, the plurality of prediction modes generally include inter prediction modes, conventional intra prediction modes, and non-conventional intra prediction modes; the conventional intra prediction modes may include a Direct Current (DC) mode, a PLANAR mode, an angular mode, and the like; the non-conventional intra prediction modes may include a Matrix-based Intra Prediction (MIP) mode, a Cross-Component Linear Model (CCLM) prediction mode, an Intra Block Copy (IBC) mode, a Palette (PLT) mode, and the like; and the inter prediction modes may include a general inter prediction mode, a GPM mode, an AWP mode, and the like.
Therefore, after the current block is pre-coded with the plurality of prediction modes, a rate-distortion cost value corresponding to each prediction mode can be obtained; the minimum rate-distortion cost value is then selected from the obtained values, and the prediction mode parameter of the current block is determined according to the prediction mode corresponding to that minimum. Alternatively, after the current block is pre-coded with the plurality of prediction modes, a distortion value corresponding to each prediction mode can be obtained; the minimum distortion value is then selected, the prediction mode corresponding to it is determined as the prediction mode used by the current block, and the corresponding prediction mode parameter is set accordingly. In this way, the determined prediction mode parameter is finally used to encode the current block, so that the prediction residual is small and the coding efficiency is improved.
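Illustratively, the mode decision described above can be summarized in the following sketch, assuming a hypothetical pre_encode() helper that returns (distortion, bits) for one candidate mode; the Lagrangian cost J = D + lambda * R is the usual RDO formulation, not a formula fixed by the present application.

```python
def select_prediction_mode(current_block, candidate_modes, lam, pre_encode):
    # Pre-encode the current block with every candidate prediction mode and
    # keep the mode whose rate-distortion cost value is minimal.
    best_mode, best_cost = None, float("inf")
    for mode in candidate_modes:
        distortion, bits = pre_encode(current_block, mode)  # hypothetical helper
        cost = distortion + lam * bits                      # rate-distortion cost value
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode  # determines the prediction mode parameter to be signalled
```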
That is to say, on the encoding side, the encoder may select an optimal prediction mode to perform pre-encoding on the current block, and in this process, the prediction mode of the current block may be determined, and then a prediction mode parameter for indicating the prediction mode is determined, so that the corresponding prediction mode parameter is written into the code stream and transmitted to the decoder by the encoder.
Correspondingly, on the decoder side, the decoder can directly acquire the prediction mode parameters of the current block by analyzing the code stream, and determines the prediction mode used by the current block and the related parameters corresponding to the prediction mode according to the prediction mode parameters acquired by analyzing.
Step 402, determining a first motion vector of a sub-block of a current block when a prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode; wherein the current block includes a plurality of sub-blocks.
In an embodiment of the present application, if the prediction mode parameter indicates that the inter prediction value of the current block is determined using an inter prediction mode, the encoder may first determine the first motion vector of each sub-block of the current block, where each sub-block corresponds to one first motion vector.
It should be noted that, in the embodiment of the present application, the current block is the image block to be encoded in the current frame. The current frame is encoded block by block in a certain order, and the current block is the image block whose turn to be encoded comes next in that order. The current block may have various sizes, such as 16 × 16, 32 × 32, or 32 × 16, where the numbers denote the numbers of rows and columns of pixels in the current block.
Further, in the embodiment of the present application, the current block may be divided into a plurality of sub-blocks, where the size of each sub-block is the same, and the sub-blocks are a set of pixels with a smaller specification. The sub-blocks may be 8 × 8 or 4 × 4 in size.
For example, in the present application, the size of the current block is 16 × 16, and the current block may be divided into 4 sub-blocks each having a size of 8 × 8.
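As a small sketch of this partitioning, the helper below enumerates the top-left corners of the equally sized sub-blocks of a current block; the function name is illustrative only.

```python
def split_into_subblocks(block_w, block_h, sub_size):
    # Enumerate the top-left corner of every sub-block of the current block.
    return [(x, y)
            for y in range(0, block_h, sub_size)
            for x in range(0, block_w, sub_size)]

# A 16x16 current block with 8x8 sub-blocks yields the 4 sub-blocks
# [(0, 0), (8, 0), (0, 8), (8, 8)].
```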
It can be understood that, in the embodiment of the present application, when the encoder determines that the prediction mode parameter indicates that the inter prediction value of the current block is determined using an inter prediction mode, the inter prediction method provided by the embodiment of the present application may continue to be employed.
Further, in an embodiment of the present application, when the prediction mode parameter indicates that the inter prediction value of the current block is determined using an inter prediction mode, the encoder may, in determining the first motion vector of a sub-block of the current block, determine the affine mode parameter and the prediction reference mode of the current block. When the affine mode parameter indicates that the affine mode is used, the control point mode and the sub-block size parameter are determined. Finally, the first motion vector may be determined according to the prediction reference mode, the control point mode, and the sub-block size parameter.
In an embodiment of the present application, after the encoder determines the prediction mode parameter, the encoder may determine the affine mode parameter and the prediction reference mode if the prediction mode parameter indicates that the current block determines the inter prediction value of the current block using the inter prediction mode.
It should be noted that, in the embodiment of the present application, the affine mode parameter is used to indicate whether the affine mode is used. Specifically, the affine mode parameter may be the affine motion compensation enable flag affine_enable_flag, and the encoder may determine whether to use the affine mode by evaluating the value of this parameter.
That is, in the present application, the affine mode parameter may be a binary variable: if its value is 1, the affine mode is used; if its value is 0, the affine mode is not used.
For example, in the present application, the value of the affine mode parameter may be equal to the value of the affine motion compensation enable flag affine_enable_flag: if the value of affine_enable_flag is '1', affine motion compensation may be used; if the value of affine_enable_flag is '0', affine motion compensation should not be used.
Further, in embodiments of the present application, if the affine mode parameters determined by the encoder indicate that the affine mode is used, the encoder may proceed to obtain the control point mode and the sub-block size parameters.
In the embodiment of the present application, the control point mode is used to determine the number of control points. In the affine model, a block may have 2 control points or 3 control points; accordingly, the control point mode may be the mode corresponding to 2 control points or the mode corresponding to 3 control points, i.e., the control point mode may comprise a 4-parameter mode and a 6-parameter mode.
It can be understood that, in the embodiment of the present application, for the AVS3 standard, if the current block uses the affine mode, the encoder needs to determine the number of control points in the affine mode of the current block, so as to determine whether to use the 4-parameter (2 control points) mode or the 6-parameter (3 control points) mode.
Further, in embodiments of the present application, if the affine mode parameters determined by the encoder indicate that affine mode is used, the encoder may further determine sub-block size parameters.
Specifically, the sub-block size parameter may be characterized by the affine prediction sub-block size flag affine_subblock_size_flag, and the encoder may indicate the sub-block size parameter, i.e., the size of the sub-blocks of the current block, by setting the value of this flag. The size of a sub-block may be 8 × 8 or 4 × 4. Specifically, in the present application, the sub-block size flag may be a binary variable: if its value is 1, the sub-block size parameter is 8 × 8; if its value is 0, the sub-block size parameter is 4 × 4.
For example, in the present application, the value of the sub-block size flag may be equal to the value of the affine prediction sub-block size flag affine_subblock_size_flag: if the value of affine_subblock_size_flag is '1', the current block is divided into sub-blocks of size 8 × 8; if the value of affine_subblock_size_flag is '0', the current block is divided into sub-blocks of size 4 × 4.
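Illustratively, reading these two binary flags may be sketched as follows; the flag names follow the identifiers above, while the function itself is only a reading aid, not a normative parsing routine.

```python
def parse_affine_config(affine_enable_flag, affine_subblock_size_flag):
    # Interpret the affine enable flag and the affine prediction sub-block
    # size flag as described above (flags given as integers 0 or 1).
    use_affine = (affine_enable_flag == 1)
    sub_size = 8 if affine_subblock_size_flag == 1 else 4  # 8x8 or 4x4 sub-blocks
    return use_affine, sub_size
```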
Further, in the embodiment of the present application, after determining the control point mode and the sub-block size parameter, the encoder may further determine the first motion vector of the sub-block in the current block according to the prediction reference mode, the control point mode, and the sub-block size parameter.
Specifically, in the embodiment of the present application, the encoder may first determine the control point motion vector group according to the prediction reference mode; a first motion vector for the sub-block may then be determined based on the set of control point motion vectors, the control point mode, and the sub-block size parameter.
It will be appreciated that, in embodiments of the present application, the control point motion vector group contains the motion vectors of the control points.
It should be noted that, in the embodiment of the present application, the encoder may traverse each sub-block in the current block according to the above method, and determine the first motion vector of each sub-block by using the control point motion vector group, the control point mode, and the sub-block size parameter of each sub-block, so that the motion vector set may be constructed and obtained according to the first motion vector of each sub-block.
It is to be understood that, in the embodiment of the present application, the first motion vector of each sub-block of the current block may be included in the motion vector set of the current block.
Further, in the embodiment of the present application, when the encoder determines the first motion vector according to the control point motion vector group, the control point mode, and the sub-block size parameter, the encoder may first determine the difference variable according to the control point motion vector group, the control point mode, and the size parameter of the current block; the sub-block location may then be determined based on the prediction mode parameter and the sub-block size parameter; finally, the difference variable and the position of the sub-block can be used to determine the first motion vector of the sub-block, and then the motion vector set of the sub-blocks of the current block can be obtained.
It should be noted that, in the present application, when determining the deviation between each position in a sub-block and the motion vector of the sub-block, if the current block uses an affine prediction model, the motion vector of each position in the sub-block can be calculated from the formula of the affine prediction model, and the deviation is obtained by subtracting the motion vector of the sub-block from it. If the motion vector of every sub-block is taken at the same position within the sub-block, e.g., a 4 × 4 block uses the position (2, 2) from its top-left corner and an 8 × 8 block uses the position (4, 4) from its top-left corner, then, according to the affine model used in current standards including VVC and AVS3, the motion vector deviation at the same relative position of every sub-block is the same. In AVS3, however, the sub-blocks at the top-left corner, the top-right corner, and, in the case of 3 control points, the bottom-left corner take their motion vectors at the control point positions themselves (the A, B, C positions shown in Figure 7 of the AVS3 text above), which differ from the positions used by the other sub-blocks; correspondingly, the motion vector deviations of these corner sub-blocks are calculated differently from those of the other sub-blocks. The details are as shown in the embodiments.
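To make the relation between the affine model and the motion vector deviation concrete, the following floating-point sketch derives the per-position motion vector and its deviation from the sub-block motion vector under the common 4-/6-parameter affine model; real codecs use fixed-point arithmetic with shifts, and the AVS3 corner sub-block special case described above is deliberately not reproduced here.

```python
def affine_mv(cpmv, w, h, x, y):
    # cpmv: control point motion vectors, cpmv[0] at the top-left corner and
    # cpmv[1] at the top-right corner; a third entry (bottom-left corner)
    # selects the 6-parameter (3 control point) model.
    (v0x, v0y), (v1x, v1y) = cpmv[0], cpmv[1]
    d_hor_x, d_ver_x = (v1x - v0x) / w, (v1y - v0y) / w
    if len(cpmv) == 3:                      # 6-parameter model
        v2x, v2y = cpmv[2]
        d_hor_y, d_ver_y = (v2x - v0x) / h, (v2y - v0y) / h
    else:                                   # 4-parameter model
        d_hor_y, d_ver_y = -d_ver_x, d_hor_x
    return (v0x + d_hor_x * x + d_hor_y * y,
            v0y + d_ver_x * x + d_ver_y * y)

def first_mv_deviation(cpmv, w, h, sub_center, pixel_pos):
    # First motion vector deviation: the affine motion vector at the pixel
    # position minus the sub-block motion vector taken at the sub-block's
    # representative position, e.g. (2, 2) inside a 4x4 sub-block.
    mv_sub = affine_mv(cpmv, w, h, *sub_center)
    mv_pix = affine_mv(cpmv, w, h, *pixel_pos)
    return (mv_pix[0] - mv_sub[0], mv_pix[1] - mv_sub[1])
```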
step 403, determining a first predictor of the sub-block based on the first motion vector and a first motion vector offset between each pixel position in the sub-block and the sub-block.
In an embodiment of the present application, after determining the first motion vector of each sub-block of the current block, the encoder may determine a first prediction value of the sub-block and a first motion vector offset between each pixel position in the sub-block and the sub-block, respectively, based on the first motion vector of the sub-block.
It is understood that, in the embodiment of the present application, step 403 may specifically include:
step 403a, determining a first predictor of the sub-block based on the first motion vector.
Step 403b, determining a first motion vector offset between each pixel position in the sub-block and the sub-block based on the first motion vector.
In this application, the order in which the encoder performs steps 403a and 403b is not limited; that is, after determining the first motion vector of each sub-block of the current block, the encoder may perform step 403a first and then step 403b, perform step 403b first and then step 403a, or perform steps 403a and 403b simultaneously.
Further, in an embodiment of the present application, the encoder may first determine a sample matrix when determining the first predictor of the sub-block based on the first motion vector; wherein the sample matrix comprises a luminance sample matrix and a chrominance sample matrix; the first predictor may then be determined based on the prediction reference mode, the sub-block size parameter, the sample matrix, and the set of motion vectors.
It should be noted that, in the embodiment of the present application, when the encoder determines the first predicted value according to the prediction reference mode, the sub-block size parameter, the sample matrix, and the motion vector set, the encoder may first determine a target motion vector from the motion vector set according to the prediction reference mode and the sub-block size parameter; then, a reference image queue and a reference index sample matrix corresponding to the prediction reference mode and a target motion vector can be utilized to determine a prediction sample matrix; wherein the prediction sample matrix includes a first prediction value of the plurality of sub-blocks.
Specifically, in an embodiment of the present application, the sample matrix may include a luma sample matrix and a chroma sample matrix, and accordingly, the prediction sample matrix determined by the encoder may include a luma prediction sample matrix and a chroma prediction sample matrix, wherein the luma prediction sample matrix includes a first luma predictor of the plurality of sub-blocks, the chroma prediction sample matrix includes a first chroma predictor of the plurality of sub-blocks, and the first luma predictor and the first chroma predictor constitute the first predictor of the sub-blocks.
It should be noted that, in the embodiment of the application, the luminance sample matrix in the sample matrix may be a 1/16-precision luminance sample matrix, and the chrominance sample matrix may be a 1/32-precision chrominance sample matrix.
It is to be understood that in the embodiments of the present application, the reference image queues and reference indexes obtained by the encoder are not the same for different prediction reference modes.
Further, in the embodiment of the present application, when the encoder determines the sample matrix, a luminance interpolation filter coefficient and a chrominance interpolation filter coefficient may be obtained first; a luma sample matrix may then be determined based on the luma interpolation filter coefficients, while a chroma sample matrix may be determined based on the chroma interpolation filter coefficients.
Further, in embodiments of the present application, the encoder may determine a secondary prediction parameter when determining a first motion vector offset between each pixel location in the sub-block and the sub-block; if the secondary prediction parameters indicate that secondary prediction is used, the encoder may determine a first motion vector offset between the sub-block and each pixel location based on the difference variable.
It is to be understood that, in the embodiment of the present application, after determining the first motion vector offset between the sub-block and each pixel position based on the difference variable, the encoder may construct the motion vector offset matrix corresponding to the sub-block by using all the first motion vector offsets corresponding to all the pixel positions in the sub-block. As can be seen, the motion vector deviation matrix includes the motion vector deviation between the sub-block and any one of the internal pixel points, i.e. the first motion vector deviation.
Further, in the embodiment of the present application, if the secondary prediction parameter determined by the encoder indicates that secondary prediction is not used, the encoder may select to directly use the first prediction value of the sub-block of the current block obtained in the above step 403a as the second prediction value of the sub-block without performing the processes of the following steps 404 and 405.
Specifically, in embodiments of the present application, if the secondary prediction parameter indicates that secondary prediction is not used, the encoder may determine the second prediction value using the prediction sample matrix; that is, the encoder may determine the first prediction value of the sub-block where a pixel position is located as the second prediction value of that pixel position.
For example, in the present application, if the prediction reference mode of the current block takes the value 0 or 1, i.e., the first reference mode 'PRED_List0' or the second reference mode 'PRED_List1' is used, the first predictor of the sub-block where the pixel position is located may be selected directly from the prediction sample matrix, which comprises 1 luma prediction sample matrix (and 2 chroma prediction sample matrices), and determined as the inter predictor of that pixel position, i.e., the second predictor.
For example, in this application, if the prediction reference mode of the current block takes the value 2, i.e., the third reference mode 'PRED_List01' is used, an averaging operation may be performed on the 2 luma prediction sample matrices (and the 2 groups of chroma prediction sample matrices, 4 in total) contained in the prediction sample matrix to obtain 1 averaged luma prediction sample matrix (and 2 averaged chroma prediction sample matrices); the first predictor of the sub-block where the pixel position is located is then selected from the averaged matrices and determined as the inter predictor of that pixel position, i.e., the second predictor.
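A sketch of this reference-mode handling is given below: a single-list mode reads the predictor directly, while 'PRED_List01' averages the two per-list matrices; the rounded integer average used here is an assumption for illustration.

```python
def predictor_from_sample_matrices(pred_list0, pred_list1, reference_mode):
    # reference_mode 0/1: 'PRED_List0' / 'PRED_List1', single-list prediction.
    if reference_mode == 0:
        return pred_list0
    if reference_mode == 1:
        return pred_list1
    # reference_mode 2: 'PRED_List01', element-wise rounded average of both
    # per-list prediction sample matrices (integer samples assumed).
    return [[(a + b + 1) >> 1 for a, b in zip(row0, row1)]
            for row0, row1 in zip(pred_list0, pred_list1)]
```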
step 404, determining a central position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation.
In an embodiment of the present application, after respectively determining the first predictor of the sub-block and the first motion vector deviation between each pixel position in the sub-block and the sub-block based on the first motion vector, the encoder may determine the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation.
It should be noted that, in the embodiment of the present application, for the pixel at each pixel position in the sub-block, if the motion vector deviation between the default center position used by the filter and the sub-block is large when performing quadratic prediction or PROF processing, the resulting prediction may have a large error. Therefore, after determining the first motion vector deviation between each pixel position and the sub-block, the encoder may first confirm whether each pixel position is suitable as the center position of the filter used for quadratic prediction or PROF processing of that pixel; if not, the center position may be adjusted.
It is understood that, in the present application, the center position and the second motion vector offset corresponding to a pixel position may be the center position and the motion vector offset of a filter used in performing quadratic prediction or PROF processing on the pixel position. I.e. the center position and the second motion vector offset, are used for a quadratic prediction or PROF processing of the pixel position.
Further, in an embodiment of the present application, the method for determining the center position and the second motion vector offset corresponding to each pixel position by the encoder according to the first motion vector offset may include the following steps:
step 404a, determining a horizontal deviation and a vertical deviation of the first motion vector deviation;
step 404b, determining a center position and a second motion vector deviation according to the first absolute value of the horizontal deviation, the second absolute value of the vertical deviation and a preset deviation threshold value.
In an embodiment of the present application, the encoder may determine deviation values of the first motion vector deviation in different directions, that is, determine a horizontal deviation and a vertical deviation corresponding to the first motion vector deviation. Then, the encoder may further calculate an absolute value of the horizontal deviation, i.e., a first absolute value, and an absolute value of the vertical deviation, i.e., a second absolute value. Finally, the encoder may further determine the center position and the second motion vector offset used in performing the quadratic prediction or the PROF process according to the first absolute value, the second absolute value, and a preset offset threshold.
It should be noted that, in the embodiment of the present application, the encoder may compare the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with a preset deviation threshold, respectively, so that the determination of the center position and the second motion vector deviation may be performed according to the comparison result. The preset deviation threshold may be preset and used to determine whether to adjust the deviation between the center position and the motion vector.
Further, in an embodiment of the present application, when the encoder determines the center position and the second motion vector offset based on the horizontal offset, the vertical offset, and a preset offset threshold, if both the first absolute value and the second absolute value are smaller than the preset offset threshold, the encoder may determine the first motion vector offset as the second motion vector offset and each pixel position as the center position.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold, if both absolute values are smaller than the preset deviation threshold, the encoder may consider the motion vector deviation between the pixel position and the sub-block to be small, so the prediction result obtained by performing quadratic prediction or PROF processing with the pixel position itself as the filter center is sufficiently accurate. Therefore, the first motion vector deviation corresponding to the pixel position may be directly determined as the second motion vector deviation for quadratic prediction or PROF processing, and the pixel position itself may be directly determined as the center position used by the filter.
Further, in the embodiments of the present application, when the encoder determines the deviation between the center position and the second motion vector according to the horizontal deviation, the vertical deviation and the preset deviation threshold, if the first absolute value is greater than or equal to the preset deviation threshold and the second absolute value is less than the preset deviation threshold, the encoder may determine the first adjustment direction according to the horizontal deviation; the center position and the second motion vector offset can then be determined based on the first adjustment direction.
It should be noted that, in the embodiment of the present application, the first adjustment direction is used for adjusting the first motion vector in the horizontal direction, and therefore, the first adjustment direction includes the left side and the right side.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold, if the first absolute value is greater than or equal to the preset deviation threshold and the second absolute value is less than the preset deviation threshold, the motion vector deviation between the pixel position and the sub-block in the horizontal direction may be considered large, and the prediction result obtained by performing quadratic prediction or PROF processing with this pixel as the filter center may have an error. Therefore, the first adjustment direction is further determined according to the horizontal deviation, the adjustment is performed in that direction, and the center position and the second motion vector deviation used for quadratic prediction or PROF processing are finally determined.
Further, in the embodiment of the present application, when the encoder determines the center position and the second motion vector offset according to the first adjustment direction, if the first adjustment direction is the left side, the adjacent left pixel position of the arbitrary pixel position may be taken as the center position, and then (1, 0) may be added to the first motion vector offset, so that the corresponding second motion vector offset may be obtained.
Further, in the embodiment of the present application, the encoder determines the center position and the second motion vector offset based on the first adjustment direction, and if the first adjustment direction is the right side, the adjacent right pixel position of any pixel position may be taken as the center position, and then the first motion vector offset may be subtracted by (1, 0), so that the second motion vector offset may be obtained.
Further, in the embodiments of the present application, when the encoder determines the center position and the second motion vector deviation according to the horizontal deviation, the vertical deviation, and the preset deviation threshold, if the second absolute value is greater than or equal to the preset deviation threshold and the first absolute value is less than the preset deviation threshold, the encoder may determine the second adjustment direction according to the vertical deviation; the center position and the second motion vector offset may then be determined based on the second adjustment direction.
It should be noted that, in the embodiment of the present application, the second adjustment direction is used for adjusting the first motion vector in the vertical direction, and therefore, the second adjustment direction includes an upper side and a lower side.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold, if the second absolute value is greater than or equal to the preset deviation threshold and the first absolute value is less than the preset deviation threshold, the motion vector deviation between the pixel position and the sub-block in the vertical direction may be considered large, and the prediction result obtained by performing quadratic prediction or PROF processing with this pixel as the filter center may have an error. Therefore, the second adjustment direction is further determined according to the vertical deviation, the adjustment is performed in that direction, and the center position and the second motion vector deviation used for quadratic prediction or PROF processing are finally determined.
Further, in the embodiment of the present application, when the encoder determines the center position and the second motion vector offset from the second adjustment direction, if the second adjustment direction is the upper side, the pixel position on the upper side adjacent to the arbitrary pixel position may be taken as the center position, and then (0, 1) may be added to the first motion vector offset, so that the corresponding second motion vector offset may be obtained.
Further, in the embodiment of the present application, the encoder determines the center position and the second motion vector deviation based on the second adjustment direction, and if the second adjustment direction is the lower side, the adjacent lower side pixel position of any pixel position may be taken as the center position, and then the first motion vector deviation may be subtracted by (0, 1), so that the second motion vector deviation may be obtained.
Further, in the embodiments of the present application, when the encoder determines the center position and the second motion vector deviation according to the horizontal deviation, the vertical deviation, and the preset deviation threshold, if both the first absolute value and the second absolute value are greater than or equal to the preset deviation threshold, the encoder may determine a third adjustment direction according to the horizontal deviation and the vertical deviation; the center position and the second motion vector offset may then be determined based on the third adjustment direction.
It should be noted that, in the embodiment of the present application, the third adjustment direction is used for adjusting the first motion vector in the horizontal direction and the vertical direction at the same time, and therefore, the third adjustment direction includes an upper right side, a lower right side, an upper left side, and a lower left side.
It is understood that, in the embodiment of the application, after comparing the first absolute value of the horizontal deviation and the second absolute value of the vertical deviation with the preset deviation threshold, if both absolute values are greater than or equal to the preset deviation threshold, the motion vector deviations between the pixel position and the sub-block in both the horizontal and vertical directions may be considered large, and the prediction result obtained by performing quadratic prediction or PROF processing with this pixel as the filter center may have an error. Therefore, the third adjustment direction is further determined according to the horizontal deviation and the vertical deviation, the adjustment is performed in that direction, and the center position and the second motion vector deviation used for quadratic prediction or PROF processing are finally determined.
Further, in the embodiment of the present application, when the encoder determines the center position and the second motion vector offset according to the third adjustment direction, if the third adjustment direction is the upper left side, the adjacent upper left side pixel position of the arbitrary pixel position may be taken as the center position, and then (1, 1) may be added to the first motion vector offset, so that the corresponding second motion vector offset may be obtained.
Further, in the embodiment of the present application, the encoder determines the center position and the second motion vector deviation based on the third adjustment direction, and if the third adjustment direction is the lower right side, the adjacent lower right side pixel position of any pixel position may be taken as the center position, and then, the first motion vector deviation may be reduced by (1, 1), so that the second motion vector deviation may be obtained.
It is to be understood that, in the embodiments of the present application, the deviation value of the second motion vector deviation in both the horizontal direction and the vertical direction is smaller than the preset deviation threshold.
That is, in the present application, after determining the deviation between the motion vector of each pixel position in the sub-block and the motion vector of the sub-block, and before filtering the sub-block-based prediction block, the encoder may first determine the center position and the second motion vector deviation, which are then used for performing the quadratic prediction or PROF processing.
Specifically, in the present application, for each pixel position in the sub-block, the magnitude of the first motion vector deviation between the pixel position and the sub-block may be determined, including the horizontal deviation and the vertical deviation of the first motion vector deviation in the horizontal and vertical directions. If the absolute value of the deviation of a certain pixel position in the horizontal and/or vertical direction is larger than the preset deviation threshold, it can be determined that the center position of the filter corresponding to that pixel position is no longer the pixel position itself, and the center position and the second motion vector deviation need to be determined further.
Further, in the embodiment of the present application, before performing quadratic prediction or PROF processing on a pixel position in a sub-block, if it is determined that the absolute value of the deviation component of the first motion vector deviation between the pixel position and the sub-block in the horizontal and/or vertical direction is greater than or equal to the preset deviation threshold (e.g., 1/2 pixel, 3/4 pixel, or 1 pixel), the starting point used in the quadratic prediction or PROF processing, that is, the center position of the two-dimensional filter used therein, needs to be adjusted.
It is understood that, in the present application, when adjusting the center position and the motion vector deviation, another pixel position in the current block may be selected as the center, such that the motion vector deviation relative to it in both the horizontal and vertical directions is less than or equal to the preset deviation threshold.
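The case analysis of steps 404a and 404b can be summarized in the following sketch. The deviation is expressed in whole-pixel units, the one-pixel step mirrors the (1, 0)/(0, 1) corrections above, and the mapping from the sign of the deviation to the left/right and upper/lower adjustment directions is an assumption for illustration; a real implementation would work in fractional-sample fixed point.

```python
def adjust_center_and_deviation(pos, first_dmv, threshold):
    # pos: pixel position (x, y); first_dmv: first motion vector deviation
    # (horizontal deviation, vertical deviation) in pixel units.
    (x, y), (dx, dy) = pos, first_dmv
    cx, cy = x, y
    if abs(dx) >= threshold:             # horizontal component too large
        if dx < 0:                       # first adjustment direction: left
            cx, dx = x - 1, dx + 1.0     # left neighbour, add (1, 0)
        else:                            # first adjustment direction: right
            cx, dx = x + 1, dx - 1.0     # right neighbour, subtract (1, 0)
    if abs(dy) >= threshold:             # vertical component too large
        if dy < 0:                       # second adjustment direction: upper
            cy, dy = y - 1, dy + 1.0     # upper neighbour, add (0, 1)
        else:                            # second adjustment direction: lower
            cy, dy = y + 1, dy - 1.0     # lower neighbour, subtract (0, 1)
    return (cx, cy), (dx, dy)            # center position, second deviation
```

When both components exceed the threshold, both branches fire, which reproduces the diagonal cases of the third adjustment direction; together with the limiting process of step 406 below, this keeps both components of the second motion vector deviation below the preset deviation threshold.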
Further, in the embodiment of the present application, before determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, that is, before step 404, the method for performing inter-frame prediction by the encoder may further include the following step:
step 406, limiting the first motion vector deviation according to a preset deviation range; the preset deviation range comprises a deviation lower limit value and a deviation upper limit value.
In an embodiment of the present application, before determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, the encoder may limit the first motion vector deviation. Specifically, the encoder may perform the restriction processing on the first motion vector deviation according to a preset deviation range, where the preset deviation range comprises a deviation lower limit value and a deviation upper limit value.
Further, in the embodiment of the present application, when the encoder performs the limitation processing on the first motion vector deviation according to the preset deviation range, if the horizontal deviation and/or the vertical deviation is smaller than the lower deviation limit value, the horizontal deviation and/or the vertical deviation may be set as the lower deviation limit value; if the horizontal deviation and/or the vertical deviation is larger than the deviation upper limit value, the horizontal deviation and/or the vertical deviation can be set as the deviation upper limit value.
It should be noted that, in the present application, the encoder may also skip limiting the first motion vector deviation, i.e., not perform step 406, and instead directly adjust the center position to any suitable pixel position according to the first motion vector deviation. Alternatively, the first motion vector deviation may first be limited according to step 406, or both the adjustable range and the motion vector deviation may be limited.
Further, in the embodiment of the present application, the preset deviation range may consist of a deviation lower limit min and a deviation upper limit max, i.e., the preset deviation range may be represented as (min, max). When the first motion vector deviation is limited, if its deviation value in the horizontal and/or vertical direction is smaller than min, that deviation value may be set directly to min; if its deviation value in the horizontal and/or vertical direction is larger than max, that deviation value may be set directly to max.
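Illustratively, this limiting process of step 406 amounts to a per-component clamp; the helper name is an assumption.

```python
def clamp_deviation(first_dmv, dev_min, dev_max):
    # Clamp each component of the first motion vector deviation into the
    # preset deviation range (dev_min, dev_max) before the center adjustment.
    def clamp(v):
        return dev_min if v < dev_min else (dev_max if v > dev_max else v)
    return (clamp(first_dmv[0]), clamp(first_dmv[1]))
```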
In the embodiment of the present application, further, after determining the center position corresponding to each pixel position and the second motion vector offset according to the first motion vector offset, that is, after step 404, the method for performing inter-frame prediction by the encoder may further include the following steps:
step 407, if the center position does not belong to the current block, re-determining the center position according to the pixel position in the current block.
In the embodiment of the present application, after determining the center position corresponding to each pixel position and the second motion vector offset according to the first motion vector offset, the encoder may first determine whether the center position exceeds the current block, and if the center position exceeds the range of the current block, that is, the center position does not belong to the current block, the encoder needs to re-determine the center position according to the pixel position in the current block.
In the embodiment of the present application, further, after determining the center position corresponding to each pixel position and the second motion vector offset according to the first motion vector offset, that is, after step 404, the method for performing inter-frame prediction by the encoder may further include the following steps:
step 408, if the central position does not belong to the current block, directly determining each pixel position as the central position corresponding to each pixel position.
In the embodiment of the present application, after determining the center position corresponding to each pixel position and the second motion vector deviation according to the first motion vector deviation, the encoder may first determine whether the center position exceeds the current block. If it does, i.e., the center position does not belong to the current block, the encoder may directly determine the corresponding original pixel position as the center position. That is, if the center position determined based on the first motion vector deviation corresponding to a pixel position in the sub-block falls outside the range of the current block, the encoder may choose to take that pixel position itself as the center position.
That is, in the embodiment of the present application, for the adjusted starting point of the quadratic prediction, i.e., the filter center position determined based on the first motion vector deviation, the encoder may restrict the center position so that it does not exceed the range of the current block, i.e., does not go beyond the range spanned by the unadjusted quadratic prediction starting points of all pixel positions in the current block.
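A sketch of this boundary handling, covering both alternatives of steps 407 and 408 (with step 408 shown as the fallback), follows; the coordinate convention is an assumption.

```python
def confine_center(center, pixel_pos, block_w, block_h):
    # If the adjusted center position still lies inside the current block,
    # keep it; otherwise fall back to the original pixel position (step 408).
    # Step 407 would instead re-pick another position inside the block.
    cx, cy = center
    if 0 <= cx < block_w and 0 <= cy < block_h:
        return center
    return pixel_pos
```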
step 405, performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determining a second predicted value of the sub-block, and determining the second predicted value as the inter-frame predicted value of the sub-block.
In the embodiment of the present application, after determining the center position and the second motion vector offset corresponding to each pixel position according to the first motion vector offset, the encoder may perform quadratic prediction or PROF processing according to the center position and the second motion vector offset based on the first prediction value, so that the prediction value of the pixel at each pixel position may be obtained.
Further, in the embodiment of the present application, after the encoder traverses each pixel position in the sub-block to obtain the predicted value of the pixel at each pixel position, the encoder may determine the second predicted value of the sub-block according to the predicted value of the pixel at each pixel position, so that the second predicted value may be determined as the inter-frame predicted value of the sub-block.
It should be noted that, in the embodiment of the present application, since the center position and the second motion vector deviation may be used to perform secondary prediction or PROF processing on the pixel at each pixel position in the sub-block, after obtaining the center position and the second motion vector deviation, the encoder may perform secondary prediction or PROF processing on the pixel at each pixel position using them on the basis of the first predicted value, finally obtaining the second predicted value corresponding to the sub-block, which may then be determined as the inter-frame predicted value of the sub-block.
Further, in an embodiment of the present application, the method in which the encoder performs quadratic prediction or PROF processing according to the center position and the second motion vector deviation based on the first predictor to determine the second predictor of the sub-block, and the second predictor is determined as an inter predictor of the sub-block, may include the steps of:
step 405a, determining a PROF parameter;
step 405b, when the PROF parameter indicates that PROF processing is performed, determining a pixel horizontal gradient and a pixel vertical gradient corresponding to the central position based on the first predicted value;
step 405c, calculating a deviation value corresponding to each pixel position according to the pixel horizontal gradient, the pixel vertical gradient and the second motion vector deviation;
step 405d, obtaining a predicted value of a pixel at each pixel position based on the first predicted value and the deviation value;
step 405e determines a second predicted value using the predicted value of the pixel at each pixel location.
In an embodiment of the present application, the encoder may determine the PROF parameter first, and if the PROF parameter indicates that the PROF processing is performed, the encoder may determine a pixel horizontal gradient and a pixel vertical gradient corresponding to the central position based on the first prediction value; the pixel horizontal gradient is a gradient value between a pixel value corresponding to the central position and a pixel value corresponding to an adjacent pixel position in the horizontal direction; the pixel vertical gradient is a gradient value between a pixel value corresponding to the center position and a pixel value corresponding to an adjacent pixel position in the vertical direction.
Further, in the embodiment of the present application, the encoder may calculate an offset value corresponding to each pixel position according to the pixel horizontal gradient, the pixel vertical gradient, and the second motion vector offset of the center position corresponding to each pixel position. The offset value may be used to correct the predicted value of the pixel value at each pixel position.
It should be noted that, in the embodiment of the present application, the encoder may obtain the corrected predicted value corresponding to any pixel position according to the first predicted value and the deviation value; after traversing each pixel position in the current sub-block and obtaining the corrected predicted value for each, the encoder determines the second predicted value of the current sub-block from the corrected predicted values of all pixel positions, thereby determining the corresponding inter-frame predicted value. Specifically, in the present application, after the sub-block-based prediction is completed, the first predicted value of the current sub-block is used as the predicted value of each pixel position; the deviation value corresponding to each pixel position is then added to the first predicted value, which completes the correction of the predicted value of each pixel position and yields the corrected predicted value. The second predicted value of the current sub-block can thus be obtained and used as the inter-frame predicted value of the current sub-block.
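The following floating-point sketch traces steps 405b to 405e: pixel horizontal and vertical gradients, a per-pixel deviation value of the form gx * dx + gy * dy, and the corrected predictor. For simplicity the center position is taken as the pixel itself; with an adjusted center, the gradients and the second motion vector deviation of that neighboring position would be used instead, and real PROF implementations additionally pad the prediction block, use fixed-point gradients, and clip the result.

```python
def prof_refine(pred, second_dmv):
    # pred: first predictor of the sub-block, a 2-D list of samples.
    # second_dmv[y][x]: second motion vector deviation (dx, dy) at (x, y).
    h, w = len(pred), len(pred[0])
    out = [row[:] for row in pred]
    for y in range(h):
        for x in range(w):
            # Pixel horizontal/vertical gradients: central differences with
            # edge clamping (step 405b).
            gx = (pred[y][min(x + 1, w - 1)] - pred[y][max(x - 1, 0)]) / 2.0
            gy = (pred[min(y + 1, h - 1)][x] - pred[max(y - 1, 0)][x]) / 2.0
            dx, dy = second_dmv[y][x]
            delta = gx * dx + gy * dy        # deviation value (step 405c)
            out[y][x] = pred[y][x] + delta   # corrected predictor (step 405d)
    return out                               # second predicted value (step 405e)
```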
Further, in an embodiment of the present application, the method in which the encoder performs quadratic prediction or PROF processing according to the center position and the second motion vector deviation based on the first predictor to determine the second predictor of the sub-block, and the second predictor is determined as an inter predictor of the sub-block, may include the steps of:
step 405f, determining secondary prediction parameters;
step 405g, when the secondary prediction parameter indicates that secondary prediction is used, determining a filter coefficient of the two-dimensional filter according to the second motion vector deviation; the two-dimensional filter is used for carrying out quadratic prediction processing according to a preset shape;
step 405h, determining a predicted value of a pixel at each pixel position based on the filter coefficient and the first predicted value;
step 405i determines a second predicted value using the predicted value of the pixel at each pixel location.
In an embodiment of the present application, the encoder may first determine a quadratic prediction parameter, and if the quadratic prediction parameter indicates that quadratic prediction is used, the encoder may determine a filter coefficient of the two-dimensional filter according to the second motion vector deviation; the two-dimensional filter is used for carrying out quadratic prediction processing according to a preset shape.
It should be noted that, in the embodiment of the present application, the filter coefficient of the two-dimensional filter is related to the second motion vector offset corresponding to the target pixel position. That is, if the corresponding second motion vector deviations are different for different target pixel positions, the filter coefficients of the two-dimensional filters used are also different.
It will be appreciated that in embodiments of the present application, a two-dimensional filter is used for quadratic prediction using a plurality of adjacent pixel locations that form a predetermined shape. Wherein, the preset shape is a rectangle, a rhombus or any symmetrical shape.
That is, in the present application, the two-dimensional filter for performing quadratic prediction is a filter composed of adjacent points that form a preset shape. The adjacent points forming the preset shape may be several in number, for example, 9 points. The preset shape may be a symmetrical shape, for example, a rectangle, a diamond, or any other symmetrical shape.
Illustratively, in the present application, the two-dimensional filter is a rectangular filter, and specifically, the two-dimensional filter is a filter composed of 9 adjacent pixel positions constituting a rectangle. Of the 9 pixel positions, the pixel position located at the center is the pixel position of the pixel currently requiring quadratic prediction, i.e., the current pixel position.
Further, in the embodiment of the present application, when determining the filter coefficient of the two-dimensional filter according to the second motion vector deviation, the encoder may determine the scaling parameter first, and then may determine the filter coefficient corresponding to the pixel position according to the scaling parameter and the second motion vector deviation.
It should be noted that, in the embodiment of the present application, the scale parameter may include at least one scale value, and the second motion vector offset includes a horizontal offset and a vertical offset; wherein at least one of the proportional values is a non-zero real number.
Specifically, in the present application, when the two-dimensional filter performs secondary prediction using 9 adjacent pixel positions that form a rectangle, the pixel position located at the center of the rectangle is the position to be predicted, i.e., the current pixel position, and the other 8 target pixel positions are located in the 8 directions around the current pixel position: upper left, above, upper right, left, right, lower left, below, and lower right.
Accordingly, in the present application, the encoder may calculate the 9 filter coefficients corresponding to the 9 adjacent pixel positions according to a preset calculation rule, based on the at least one scale value and the second motion vector deviation of the position to be predicted.
It should be noted that, in the present application, the preset calculation rule may include a plurality of different calculation manners, such as addition, subtraction, multiplication, and the like. Wherein, for different pixel positions, different calculation modes can be used for calculating the filter coefficient.
It is understood that, in the present application, when the encoder calculates the plurality of filter coefficients corresponding to the plurality of pixel positions using different calculation manners of the preset calculation rule, some of the filter coefficients may be a linear function of the second motion vector deviation, i.e., in a linear relationship with it, while others may be a quadratic or higher-order function of the second motion vector deviation, i.e., in a nonlinear relationship with it.
That is, in the present application, any one of the plurality of filter coefficients corresponding to a plurality of adjacent pixel positions may be a linear function, a quadratic function, or a high-order function of the second motion vector deviation.
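As one plausible instantiation of such a preset calculation rule, the sketch below builds the 9 coefficients of a 3 × 3 rectangular filter as a separable quadratic (3-point Lagrange) function of the scaled second motion vector deviation. The application only requires the coefficients to be linear, quadratic, or higher-order functions of the deviation, so the exact formula here is an assumption for exposition.

```python
def quadratic_prediction_coeffs(dx, dy, scale=1.0):
    # (dx, dy): second motion vector deviation of the position to be predicted;
    # scale: a scale value of the scale parameter (a non-zero real number).
    dx, dy = scale * dx, scale * dy
    wx = [dx * (dx - 1) / 2, 1 - dx * dx, dx * (dx + 1) / 2]  # left, center, right
    wy = [dy * (dy - 1) / 2, 1 - dy * dy, dy * (dy + 1) / 2]  # upper, center, lower
    # Outer product: one coefficient per position of the 3x3 rectangle; the
    # middle tap belongs to the current pixel position, and the 9 coefficients
    # sum to 1 by construction.
    return [[wy[j] * wx[i] for i in range(3)] for j in range(3)]
```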
It is understood that, in the present application, the encoder may write the prediction mode parameter, the affine mode parameter, and the prediction reference mode into the code stream; the PROF parameter and the secondary prediction parameter may also be written into the code stream.
The present embodiment provides an inter prediction method. After sub-block-based prediction, for a pixel position whose first motion vector deviation from the motion vector of its sub-block is large, the method re-determines, based on that first motion vector deviation, the center position and the second motion vector deviation used for quadratic prediction or PROF processing, so that point-based quadratic prediction can be performed with them on top of the sub-block-based first prediction value to obtain a second prediction value. Therefore, the inter-frame prediction method provided by the application adapts well to all scenes, reduces the prediction error, and considerably improves coding performance and coding efficiency.
Based on the foregoing embodiments, in yet another embodiment of the present application, fig. 20 is a schematic structural diagram of a decoder, and as shown in fig. 20, a decoder 300 according to an embodiment of the present application may include an analysis unit 301 and a first determination unit 302;
the analysis unit 301 is configured to parse the code stream to obtain the prediction mode parameter of the current block;
the first determining unit 302, configured to determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode; wherein the current block comprises a plurality of sub-blocks; determining a first predictor for the sub-block based on the first motion vector and a first motion vector offset between each pixel location in the sub-block and the sub-block; determining a central position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation; and performing secondary prediction or PROF processing according to the central position and the second motion vector deviation based on the first predicted value, determining a second predicted value of the sub-block, and determining the second predicted value as an inter-frame predicted value of the sub-block.
Fig. 21 is a schematic diagram illustrating a composition structure of the decoder. As shown in fig. 21, the decoder 300 according to the embodiment of the present application may further include a first processor 303, a first memory 304 storing instructions executable by the first processor 303, a first communication interface 305, and a first bus 306 for connecting the first processor 303, the first memory 304, and the first communication interface 305.
Further, in an embodiment of the present application, the first processor 303 is configured to parse the code stream to obtain a prediction mode parameter of the current block; determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, wherein the current block comprises a plurality of sub-blocks; determine a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block; determine a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation; and perform secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determine a second predicted value of the sub-block, and determine the second predicted value as the inter prediction value of the sub-block.
Fig. 22 is a first schematic structural diagram of an encoder, and as shown in fig. 22, an encoder 400 according to an embodiment of the present application may include a second determining unit 401;
the second determining unit 401 is configured to determine a prediction mode parameter of the current block; determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, wherein the current block comprises a plurality of sub-blocks; determine a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block; determine a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation; and perform secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determine a second predicted value of the sub-block, and determine the second predicted value as the inter prediction value of the sub-block.
Fig. 23 is a schematic diagram of a second constituent structure of the encoder. As shown in fig. 23, the encoder 400 according to the embodiment of the present application may further include a second processor 402, a second memory 403 storing instructions executable by the second processor 402, a second communication interface 404, and a second bus 405 for connecting the second processor 402, the second memory 403, and the second communication interface 404.
Further, in an embodiment of the present application, the second processor 402 is configured to determine a prediction mode parameter of the current block; determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, wherein the current block comprises a plurality of sub-blocks; determine a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block; determine a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation; and perform secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determine a second predicted value of the sub-block, and determine the second predicted value as the inter prediction value of the sub-block.
Embodiments of the present application provide a decoder and an encoder, which can, after sub-block-based prediction, re-determine the center position and the second motion vector deviation used for secondary prediction or PROF processing for those pixel positions whose first motion vector deviation from the sub-block motion vector is large, so that point-based secondary prediction can be performed on top of the sub-block-based first predicted value using that center position and second motion vector deviation to obtain a second predicted value. The inter prediction method provided by the application therefore adapts well to all scenes, reduces the prediction error, and significantly improves coding performance and coding efficiency.
Embodiments of the present application provide a computer-readable storage medium on which a program is stored, which, when executed by a processor, implements the method described in the above embodiments.
Specifically, the program instructions corresponding to the inter-frame prediction method in the present embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive, and when the program instructions corresponding to the inter-frame prediction method in the storage medium are read and executed by an electronic device, the following steps are performed (an illustrative sketch is given after the list):
analyzing the code stream to obtain the prediction mode parameter of the current block;
determining a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode; wherein the current block comprises a plurality of sub-blocks;
determining a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block;
determining a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation;
and performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determining a second predicted value of the sub-block, and determining the second predicted value as the inter prediction value of the sub-block.
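The five steps above can be summarized by the control-flow sketch below. Every helper on the `helpers` object is a hypothetical placeholder supplied by the caller; the application does not define such an API, and the sketch only fixes the ordering of the steps.

```python
# A minimal decoding-side sketch; helpers.* are hypothetical placeholders.
def inter_predict(bitstream, block, helpers):
    mode = helpers.parse_prediction_mode(bitstream)              # step 1
    if not mode.is_inter:
        return None                                              # not handled here
    inter_pred = {}
    for sub in block.sub_blocks:
        mv1 = helpers.derive_first_mv(sub)                       # step 2
        pred1 = helpers.predict_subblock(sub, mv1)               # step 3
        for pos in sub.pixel_positions:
            dmv1 = helpers.first_deviation(pos, sub)
            center, dmv2 = helpers.recenter(pos, dmv1)           # step 4
            inter_pred[pos] = helpers.refine(pred1, center, dmv2)  # step 5
    return inter_pred
```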
Specifically, the program instructions corresponding to the inter-frame prediction method in the present embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive, and when the program instructions corresponding to the inter-frame prediction method in the storage medium are read and executed by an electronic device, the following steps are performed:
determining a prediction mode parameter of a current block;
determining a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode; wherein the current block comprises a plurality of sub-blocks;
determining a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block;
determining a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation;
and performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determining a second predicted value of the sub-block, and determining the second predicted value as the inter prediction value of the sub-block.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flow or flows of the flowcharts and/or the block or blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in the flow or flows of the flowcharts and/or the block or blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flow or flows of the flowcharts and/or the block or blocks of the block diagrams.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily without conflict to obtain new method embodiments.
Features disclosed in several of the product embodiments provided in the present application may be combined in any combination to yield new product embodiments without conflict.
The features disclosed in the several method or apparatus embodiments provided in the present application may be combined arbitrarily, without conflict, to arrive at new method embodiments or apparatus embodiments.
The above description is only a specific implementation of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that can be easily conceived by a person skilled in the art within the technical scope disclosed by the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (65)

1. An inter-prediction method applied to a decoder, the method comprising:
analyzing the code stream to obtain the prediction mode parameter of the current block;
determining a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode; wherein the current block comprises a plurality of sub-blocks;
determining a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block;
determining a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation;
and performing secondary prediction or prediction refinement with optical flow (PROF) processing according to the center position and the second motion vector deviation based on the first predicted value, determining a second predicted value of the sub-block, and determining the second predicted value as the inter prediction value of the sub-block.
2. The method of claim 1, wherein determining the first motion vector for the sub-block of the current block comprises:
analyzing the code stream to obtain an affine mode parameter and a prediction reference mode of the current block;
determining a control point mode and a sub-block size parameter when the affine mode parameter indicates that an affine mode is used;
and determining the first motion vector according to the prediction reference mode, the control point mode and the sub-block size parameter.
3. The method of claim 2, wherein determining the first motion vector according to the prediction reference mode, the control point mode, and the sub-block size parameter comprises:
determining a control point motion vector group according to the prediction reference mode;
and determining the first motion vector according to the control point motion vector group, the control point mode and the sub-block size parameter.
4. The method of claim 3, wherein determining the first motion vector according to the set of control point motion vectors, the control point mode, and the sub-block size parameter comprises:
determining a difference variable according to the control point motion vector group, the control point mode and the size parameter of the current block;
determining a subblock position based on the prediction mode parameter and the subblock size parameter;
determining the first motion vector of the current sub-block using the difference variable and the sub-block position.
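For the 4-parameter control point mode, the derivation of claims 2 to 4 can be sketched with the standard affine sub-block model used in VVC/AVS-style codecs; the "difference variable" then corresponds to the per-pixel change of the motion vector, and the sub-block motion vector is evaluated at the sub-block center. The fixed-point normalization used by the application may differ; floating point is used here for readability.

```python
# A sketch of the 4-parameter affine sub-block MV derivation (assumption:
# the standard model; cpmv0/cpmv1 are the top-left and top-right control
# point motion vectors of the current block, w is the block width in pixels).
def subblock_mv_4param(cpmv0, cpmv1, w, sub_x, sub_y, sub_size=4):
    # "Difference variables": horizontal change of the MV per pixel.
    dhx = (cpmv1[0] - cpmv0[0]) / w
    dhy = (cpmv1[1] - cpmv0[1]) / w
    # Evaluate at the sub-block center (the "sub-block position" of claim 4).
    cx = sub_x + sub_size / 2.0
    cy = sub_y + sub_size / 2.0
    mvx = cpmv0[0] + dhx * cx - dhy * cy
    mvy = cpmv0[1] + dhy * cx + dhx * cy
    return (mvx, mvy)
```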
5. The method according to any one of claims 2 to 4, further comprising:
and traversing each sub-block of the current block, and constructing a motion vector set according to the first motion vector of each sub-block.
6. The method of claim 5, wherein determining the first motion vector according to the set of control point motion vectors, the control point mode, and the sub-block size parameter comprises:
determining a difference variable according to the control point motion vector group, the control point mode and the size parameter of the current block;
determining a subblock position based on the prediction mode parameter and the subblock size parameter;
determining the first motion vector of the sub-block using the difference variable and the sub-block position.
7. The method of claim 5, wherein determining the first predicted value of the sub-block based on the first motion vector comprises:
determining a sample matrix; wherein the sample matrix comprises a luma sample matrix and a chroma sample matrix;
determining the first predicted value according to the prediction reference mode, the sub-block size parameter, the sample matrix, and the motion vector set.
8. The method of claim 7, wherein determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation comprises:
determining a horizontal deviation and a vertical deviation of the first motion vector deviation;
and determining the center position and the second motion vector deviation according to a first absolute value of the horizontal deviation, a second absolute value of the vertical deviation, and a preset deviation threshold.
9. The method of claim 8, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
and if the first absolute value and the second absolute value are both smaller than the preset deviation threshold, determining the first motion vector deviation as the second motion vector deviation, and determining each pixel position as the center position.
10. The method of claim 8, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
if the first absolute value is greater than or equal to the preset deviation threshold and the second absolute value is smaller than the preset deviation threshold, determining a first adjustment direction according to the horizontal deviation; wherein the first adjustment direction comprises a left side and a right side;
and determining the center position and the second motion vector deviation according to the first adjustment direction.
11. The method of claim 10, wherein determining the center position and the second motion vector deviation based on the first adjustment direction comprises:
if the first adjustment direction is the left side, taking the left-side adjacent pixel position of any pixel position as the center position;
and adding (1, 0) to the first motion vector deviation to obtain the second motion vector deviation.
12. The method of claim 10, wherein determining the center position and the second motion vector deviation based on the first adjustment direction comprises:
if the first adjustment direction is the right side, taking the right-side adjacent pixel position of any pixel position as the center position;
and subtracting (1, 0) from the first motion vector deviation to obtain the second motion vector deviation.
13. The method of claim 8, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
if the second absolute value is greater than or equal to the preset deviation threshold and the first absolute value is smaller than the preset deviation threshold, determining a second adjustment direction according to the vertical deviation; wherein the second adjustment direction comprises an upper side and a lower side;
and determining the center position and the second motion vector deviation according to the second adjustment direction.
14. The method of claim 13, wherein determining the center position and the second motion vector deviation based on the second adjustment direction comprises:
if the second adjustment direction is the upper side, taking the upper-side adjacent pixel position of any pixel position as the center position;
and adding (0, 1) to the first motion vector deviation to obtain the second motion vector deviation.
15. The method of claim 13, wherein determining the center position and the second motion vector deviation based on the second adjustment direction comprises:
if the second adjustment direction is the lower side, taking the lower-side adjacent pixel position of any pixel position as the center position;
and subtracting (0, 1) from the first motion vector deviation to obtain the second motion vector deviation.
16. The method of claim 8, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
if the first absolute value and the second absolute value are both greater than or equal to the preset deviation threshold, determining a third adjustment direction according to the horizontal deviation and the vertical deviation; wherein the third adjustment direction comprises an upper right side, a lower right side, an upper left side, and a lower left side;
and determining the center position and the second motion vector deviation according to the third adjustment direction.
17. The method according to any one of claims 8 to 16, wherein the preset deviation threshold is k pixels; wherein k is greater than 0.5 and less than or equal to 1.
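Claims 9 to 17 together define one selection rule per pixel position. A minimal sketch of that rule follows, assuming a y-down pixel coordinate system and that the adjustment direction is taken from the sign of the corresponding deviation component, and leaving out the out-of-block handling of claims 20 and 21:

```python
# Sketch of claims 9-16 for one pixel position (px, py) with first motion
# vector deviation (dx, dy), in pixels, and threshold k in (0.5, 1].
def center_and_second_deviation(px, py, dx, dy, k=1.0):
    big_x, big_y = abs(dx) >= k, abs(dy) >= k
    if not big_x and not big_y:          # claim 9: keep position and deviation
        return (px, py), (dx, dy)
    cx, cy, sx, sy = px, py, dx, dy
    if big_x:                            # claims 10-12: horizontal adjustment
        if dx < 0:                       # "left side": left neighbour, add (1, 0)
            cx, sx = px - 1, dx + 1.0
        else:                            # "right side": right neighbour, subtract (1, 0)
            cx, sx = px + 1, dx - 1.0
    if big_y:                            # claims 13-15: vertical adjustment
        if dy < 0:                       # "upper side": upper neighbour, add (0, 1)
            cy, sy = py - 1, dy + 1.0
        else:                            # "lower side": lower neighbour, subtract (0, 1)
            cy, sy = py + 1, dy - 1.0
    return (cx, cy), (sx, sy)            # claim 16: both axes adjusted -> diagonal
```

Note the invariant: the refined sample position is unchanged, since (center position + second deviation) always equals (pixel position + first deviation).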
18. The method of claim 8, wherein before determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, the method further comprises:
limiting the first motion vector deviation according to a preset deviation range; the preset deviation range comprises a deviation lower limit value and a deviation upper limit value.
19. The method according to claim 18, wherein limiting the first motion vector deviation according to the preset deviation range comprises:
if the horizontal deviation and/or the vertical deviation is smaller than the deviation lower limit value, setting the horizontal deviation and/or the vertical deviation as the deviation lower limit value;
and if the horizontal deviation and/or the vertical deviation is larger than the deviation upper limit value, setting the horizontal deviation and/or the vertical deviation as the deviation upper limit value.
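The limiting of claims 18 and 19 is an ordinary per-component clamp; a one-line sketch, with lower and upper the claimed limit values:

```python
def clamp_first_deviation(dx, dy, lower, upper):
    # Per-component clamp of the first motion vector deviation (claims 18-19).
    clamp = lambda v: max(lower, min(v, upper))
    return clamp(dx), clamp(dy)
```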
20. The method of claim 1, wherein after determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, the method further comprises:
and if the center position does not belong to the current block, re-determining the center position according to a pixel position in the current block.
21. The method of claim 1, wherein after determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, the method further comprises:
and if the center position does not belong to the current block, directly determining each pixel position as the center position corresponding to each pixel position.
22. The method of claim 1, wherein determining the second predicted value of the sub-block by performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, and determining the second predicted value as the inter prediction value of the sub-block comprises:
analyzing the code stream to obtain a PROF parameter;
when the PROF parameter indicates that PROF processing is performed, determining a pixel horizontal gradient and a pixel vertical gradient corresponding to the center position based on the first predicted value;
calculating a deviation value corresponding to each pixel position according to the pixel horizontal gradient, the pixel vertical gradient and the second motion vector deviation;
obtaining a predicted value of the pixel at each pixel position based on the first predicted value and the deviation value;
and determining the second predicted value by using the predicted value of the pixel at each pixel position.
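The PROF branch of claim 22 computes, for each pixel position, a deviation value as the dot product of the gradient at the center position with the second motion vector deviation, and adds it to the first predicted value. A minimal sketch, assuming central-difference gradients (the normative gradient filter may differ) and omitting border and out-of-block handling:

```python
import numpy as np

def prof_refine(pred1, center, dmv2):
    """pred1: 2-D array (first predicted value of the sub-block);
    center[(x, y)] -> (cx, cy); dmv2[(x, y)] -> (dvx, dvy)."""
    gx = np.zeros(pred1.shape)
    gy = np.zeros(pred1.shape)
    gx[:, 1:-1] = (pred1[:, 2:] - pred1[:, :-2]) / 2.0   # pixel horizontal gradient
    gy[1:-1, :] = (pred1[2:, :] - pred1[:-2, :]) / 2.0   # pixel vertical gradient
    pred2 = pred1.astype(float).copy()
    h, w = pred1.shape
    for y in range(h):
        for x in range(w):
            cx, cy = center[(x, y)]
            dvx, dvy = dmv2[(x, y)]
            # deviation value = gradient at the center position . second deviation
            pred2[y, x] += gx[cy, cx] * dvx + gy[cy, cx] * dvy
    return pred2
```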
23. The method of claim 1, wherein determining the second predicted value of the sub-block by performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, and determining the second predicted value as the inter prediction value of the sub-block comprises:
analyzing the code stream to obtain a secondary prediction parameter;
determining a filter coefficient of a two-dimensional filter according to the second motion vector deviation when the secondary prediction parameter indicates that secondary prediction is used; wherein the two-dimensional filter is configured to perform secondary prediction processing according to a preset shape;
determining a predicted value of the pixel at each pixel position based on the filter coefficient and the first predicted value;
and determining the second predicted value by using the predicted value of the pixel at each pixel position.
24. The method of claim 23, wherein the two-dimensional filter is used for secondary prediction using a plurality of adjacent pixel positions that form the preset shape.
25. The method of claim 24, wherein the preset shape is any one of a rectangle, a diamond, and other symmetrical shapes.
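For the secondary-prediction branch of claims 23 to 25, one concrete way to realize a two-dimensional filter whose coefficients are derived from the second motion vector deviation is a separable interpolator over a 3×3 rectangular preset shape. The second-order taps below are one possible choice, not the coefficient rule fixed by the application, and border handling is omitted.

```python
def taps(d):
    # Second-order Lagrange taps at nodes -1, 0, +1 for deviation component d.
    return (d * (d - 1.0) / 2.0, 1.0 - d * d, d * (d + 1.0) / 2.0)

def secondary_predict(pred1, x, y, dx, dy):
    """Filter the first predicted value pred1 (2-D list or array) around
    (x, y) with a 3x3 rectangular shape; (dx, dy) is the second deviation."""
    tx, ty = taps(dx), taps(dy)
    value = 0.0
    for j in (-1, 0, 1):
        for i in (-1, 0, 1):
            value += ty[j + 1] * tx[i + 1] * pred1[y + j][x + i]
    return value
```

A diamond preset shape would simply restrict the double loop to the positions with |i| + |j| <= 1.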
26. The method of claim 25,
if the value of the affine mode parameter is 1, the affine mode parameter indicates that the affine mode is used;
and if the value of the affine mode parameter is 0 or the affine mode parameter is not obtained through parsing, the affine mode parameter indicates that the affine mode is not used.
27. The method of claim 2, wherein determining the sub-block size parameter comprises:
analyzing the code stream to obtain a subblock size flag;
if the value of the subblock size flag is 1, determining that the subblock size parameter is 8×8;
and if the value of the subblock size flag is 0 or the subblock size flag is not obtained through parsing, determining that the subblock size parameter is 4×4.
28. The method of claim 2, wherein the control point modes include a 4-parameter mode and a 6-parameter mode.
29. An inter-prediction method applied to an encoder, the method comprising:
determining a prediction mode parameter of a current block;
determining a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode; wherein the current block comprises a plurality of sub-blocks;
determining a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block;
determining a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation;
and performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determining a second predicted value of the sub-block, and determining the second predicted value as the inter prediction value of the sub-block.
30. The method of claim 29, wherein determining the prediction mode parameter for the current block comprises:
performing pre-coding processing on the current block using multiple prediction modes, to obtain a rate distortion cost value corresponding to each prediction mode;
and selecting a minimum rate distortion cost value from the obtained rate distortion cost values, and determining the prediction mode parameter of the current block according to the prediction mode corresponding to the minimum rate distortion cost value.
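Claim 30 is the usual rate-distortion mode decision; a minimal sketch, assuming the common cost J = D + λ·R and a hypothetical precode() helper that returns the distortion and bit count of one pre-coding pass:

```python
def choose_prediction_mode(block, modes, lam, precode):
    """precode(block, mode) -> (distortion, bits); hypothetical helper."""
    best_mode, best_cost = None, float("inf")
    for mode in modes:
        distortion, bits = precode(block, mode)
        cost = distortion + lam * bits          # rate-distortion cost value
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode
```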
31. The method of claim 29, wherein determining the first motion vector for the sub-block of the current block comprises:
determining an affine mode parameter and a prediction reference mode of the current block;
determining a control point mode and a sub-block size parameter when the affine mode parameter indicates that an affine mode is used;
and determining the first motion vector according to the prediction reference mode, the control point mode and the sub-block size parameter.
32. The method of claim 31, wherein determining the first motion vector according to the prediction reference mode, the control point mode and the sub-block size parameter comprises:
determining a control point motion vector group according to the prediction reference mode;
and determining the first motion vector according to the control point motion vector group, the control point mode and the sub-block size parameter.
33. The method of claim 32, wherein determining the first motion vector according to the set of control point motion vectors, the control point mode, and the sub-block size parameter comprises:
determining a difference variable according to the control point motion vector group, the control point mode and the size parameter of the current block;
determining a subblock position based on the prediction mode parameter and the subblock size parameter;
determining the first motion vector of the current sub-block using the difference variable and the sub-block position.
34. The method of any one of claims 31 to 33, further comprising:
and traversing each sub-block of the current block, and constructing a motion vector set according to the first motion vector of each sub-block.
35. The method of claim 34, wherein determining the first motion vector according to the set of control point motion vectors, the control point mode, and the sub-block size parameter comprises:
determining a difference variable according to the control point motion vector group, the control point mode and the size parameter of the current block;
determining a subblock position based on the prediction mode parameter and the subblock size parameter;
determining the first motion vector of the sub-block using the difference variable and the sub-block position.
36. The method of claim 34, wherein determining the first predicted value of the sub-block based on the first motion vector comprises:
determining a sample matrix; wherein the sample matrix comprises a luma sample matrix and a chroma sample matrix;
determining the first predicted value according to the prediction reference mode, the sub-block size parameter, the sample matrix, and the motion vector set.
37. The method of claim 36, wherein determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation comprises:
determining a horizontal deviation and a vertical deviation of the first motion vector deviation;
and determining the center position and the second motion vector deviation according to a first absolute value of the horizontal deviation, a second absolute value of the vertical deviation, and a preset deviation threshold.
38. The method of claim 37, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
and if the first absolute value and the second absolute value are both smaller than the preset deviation threshold, determining the first motion vector deviation as the second motion vector deviation, and determining each pixel position as the center position.
39. The method of claim 37, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
if the first absolute value is greater than or equal to the preset deviation threshold and the second absolute value is smaller than the preset deviation threshold, determining a first adjustment direction according to the horizontal deviation; wherein the first adjustment direction comprises a left side and a right side;
and determining the center position and the second motion vector deviation according to the first adjustment direction.
40. The method of claim 39, wherein determining the center position and the second motion vector deviation based on the first adjustment direction comprises:
if the first adjustment direction is the left side, taking the left-side adjacent pixel position of any pixel position as the center position;
and adding (1, 0) to the first motion vector deviation to obtain the second motion vector deviation.
41. The method of claim 39, wherein determining the center position and the second motion vector deviation based on the first adjustment direction comprises:
if the first adjustment direction is the right side, taking the right-side adjacent pixel position of any pixel position as the center position;
and subtracting (1, 0) from the first motion vector deviation to obtain the second motion vector deviation.
42. The method of claim 37, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
if the second absolute value is greater than or equal to the preset deviation threshold and the first absolute value is smaller than the preset deviation threshold, determining a second adjustment direction according to the vertical deviation; wherein the second adjustment direction comprises an upper side and a lower side;
and determining the center position and the second motion vector deviation according to the second adjustment direction.
43. The method of claim 42, wherein determining the center position and the second motion vector deviation based on the second adjustment direction comprises:
if the second adjustment direction is the upper side, taking the upper-side adjacent pixel position of any pixel position as the center position;
and adding (0, 1) to the first motion vector deviation to obtain the second motion vector deviation.
44. The method of claim 42, wherein determining the center position and the second motion vector deviation based on the second adjustment direction comprises:
if the second adjustment direction is the lower side, taking the lower-side adjacent pixel position of any pixel position as the center position;
and subtracting (0, 1) from the first motion vector deviation to obtain the second motion vector deviation.
45. The method of claim 37, wherein determining the center position and the second motion vector deviation based on the horizontal deviation, the vertical deviation, and the preset deviation threshold comprises:
if the first absolute value and the second absolute value are both greater than or equal to the preset deviation threshold, determining a third adjustment direction according to the horizontal deviation and the vertical deviation; wherein the third adjustment direction comprises an upper right side, a lower right side, an upper left side, and a lower left side;
and determining the center position and the second motion vector deviation according to the third adjustment direction.
46. The method of any one of claims 37 to 45, wherein the preset deviation threshold is k pixels; wherein k is greater than 0.5 and less than or equal to 1.
47. The method of claim 37, wherein before determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, the method further comprises:
limiting the first motion vector deviation according to a preset deviation range; the preset deviation range comprises a deviation lower limit value and a deviation upper limit value.
48. The method according to claim 47, wherein limiting the first motion vector deviation according to the preset deviation range comprises:
if the horizontal deviation and/or the vertical deviation is smaller than the deviation lower limit value, setting the horizontal deviation and/or the vertical deviation as the deviation lower limit value;
and if the horizontal deviation and/or the vertical deviation is larger than the deviation upper limit value, setting the horizontal deviation and/or the vertical deviation as the deviation upper limit value.
49. The method of claim 29, wherein after determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, the method further comprises:
and if the center position does not belong to the current block, re-determining the center position according to a pixel position in the current block.
50. The method of claim 29, wherein after determining the center position and the second motion vector deviation corresponding to each pixel position according to the first motion vector deviation, the method further comprises:
and if the center position does not belong to the current block, directly determining each pixel position as the center position corresponding to each pixel position.
51. The method of claim 29, wherein determining the second predicted value of the sub-block by performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, and determining the second predicted value as the inter prediction value of the sub-block comprises:
determining a PROF parameter;
when the PROF parameter indicates that PROF processing is performed, determining a pixel horizontal gradient and a pixel vertical gradient corresponding to the center position based on the first predicted value;
calculating a deviation value corresponding to each pixel position according to the pixel horizontal gradient, the pixel vertical gradient and the second motion vector deviation;
obtaining a predicted value of the pixel at each pixel position based on the first predicted value and the deviation value;
and determining the second predicted value by using the predicted value of the pixel at each pixel position.
52. The method of claim 29, wherein determining the second predicted value of the sub-block by performing secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, and determining the second predicted value as the inter prediction value of the sub-block comprises:
determining a secondary prediction parameter;
determining a filter coefficient of a two-dimensional filter according to the second motion vector deviation when the secondary prediction parameter indicates that secondary prediction is used; wherein the two-dimensional filter is configured to perform secondary prediction processing according to a preset shape;
determining a predicted value of the pixel at each pixel position based on the filter coefficient and the first predicted value;
and determining the second predicted value by using the predicted value of the pixel at each pixel position.
53. The method of claim 52, wherein the two-dimensional filter is used for secondary prediction using a plurality of adjacent pixel positions that form the preset shape.
54. The method of claim 53, wherein the preset shape is any one of a rectangle, a diamond, and other symmetrical shapes.
55. The method of claim 54,
if the value of the affine mode parameter is 1, the affine mode parameter indicates that the affine mode is used;
and if the value of the affine mode parameter is 0 or the affine mode parameter is not obtained through parsing, the affine mode parameter indicates that the affine mode is not used.
56. The method of claim 31,
if the subblock size parameter is 8×8, setting the subblock size flag to 1, and writing the subblock size flag into a code stream;
and if the subblock size parameter is 4×4, setting the subblock size flag to 0, and writing the subblock size flag into a code stream.
57. The method of claim 31, wherein the control point modes comprise a 4-parameter mode and a 6-parameter mode.
58. The method of claim 31,
and writing the prediction mode parameters, the affine mode parameters and the prediction reference mode into a code stream.
59. The method of claim 51,
and writing the PROF parameters into a code stream.
60. The method of claim 52,
and writing the secondary prediction parameters into a code stream.
61. A decoder, characterized in that the decoder comprises a parsing unit and a first determining unit;
the parsing unit is configured to parse the code stream to obtain a prediction mode parameter of the current block;
the first determining unit is configured to determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, wherein the current block comprises a plurality of sub-blocks; determine a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block; determine a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation; and perform secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determine a second predicted value of the sub-block, and determine the second predicted value as the inter prediction value of the sub-block.
62. A decoder, comprising a first processor and a first memory having stored thereon instructions executable by the first processor, wherein the instructions, when executed by the first processor, implement the method of any one of claims 1-28.
63. An encoder, characterized in that the encoder comprises a second determining unit;
the second determining unit is configured to determine a prediction mode parameter of the current block; determine a first motion vector of a sub-block of the current block when the prediction mode parameter indicates that an inter prediction value of the current block is determined using an inter prediction mode, wherein the current block comprises a plurality of sub-blocks; determine a first predicted value of the sub-block based on the first motion vector and a first motion vector deviation between each pixel position in the sub-block and the sub-block; determine a center position and a second motion vector deviation corresponding to each pixel position according to the first motion vector deviation; and perform secondary prediction or PROF processing according to the center position and the second motion vector deviation based on the first predicted value, determine a second predicted value of the sub-block, and determine the second predicted value as the inter prediction value of the sub-block.
64. An encoder, comprising a second processor and a second memory having stored thereon instructions executable by the second processor, wherein the instructions, when executed by the second processor, implement the method of any one of claims 29-60.
65. A computer storage medium, characterized in that it stores a computer program which, when executed by a first processor, implements the method of any one of claims 1-28, or which, when executed by a second processor, implements the method of any one of claims 29-60.
CN202010873979.9A 2020-08-26 2020-08-26 Inter-frame prediction method, encoder, decoder, and computer storage medium Pending CN114125466A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010873979.9A CN114125466A (en) 2020-08-26 2020-08-26 Inter-frame prediction method, encoder, decoder, and computer storage medium
TW110130848A TW202209893A (en) 2020-08-26 2021-08-20 Inter-frame prediction method, coder, decoder and computer storage medium characterized by reducing the prediction errors and promoting the coding performance to increase the coding/decoding efficiency

Publications (1)

Publication Number Publication Date
CN114125466A true CN114125466A (en) 2022-03-01

Family

ID=80374203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010873979.9A Pending CN114125466A (en) 2020-08-26 2020-08-26 Inter-frame prediction method, encoder, decoder, and computer storage medium

Country Status (2)

Country Link
CN (1) CN114125466A (en)
TW (1) TW202209893A (en)

Also Published As

Publication number Publication date
TW202209893A (en) 2022-03-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination