WO2022037344A1 - Inter-frame prediction method, encoder, decoder, and computer storage medium - Google Patents

Inter-frame prediction method, encoder, decoder, and computer storage medium

Info

Publication number
WO2022037344A1
WO2022037344A1 (PCT/CN2021/106589; CN2021106589W)
Authority
WO
WIPO (PCT)
Prior art keywords
block
sub
current
pixel position
motion vector
Prior art date
Application number
PCT/CN2021/106589
Other languages
English (en)
French (fr)
Inventor
谢志煌
Original Assignee
Oppo Guangdong Mobile Communications Co., Ltd. (Oppo广东移动通信有限公司)
Priority date
Filing date
Publication date
Application filed by Oppo Guangdong Mobile Communications Co., Ltd.
Priority to CN202180005743.6A (published as CN114503582A)
Priority to MX2023000107A
Priority to CN202210827144.9A (published as CN114979668A)
Publication of WO2022037344A1
Priority to ZA2023/00127A (published as ZA202300127B)

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a pixel
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 - Motion estimation or motion compensation
    • H04N19/513 - Processing of motion vectors

Definitions

  • the present application relates to the technical field of video coding and decoding, and in particular, to an inter-frame prediction method, an encoder, a decoder, and a computer storage medium.
  • VVC: Versatile Video Coding
  • AVS: Audio Video coding Standard (Digital Audio Video Coding Standard Workgroup of China)
  • PROF: prediction refinement with optical flow
  • Secondary prediction achieves a more accurate prediction value, and thus an improvement over affine prediction.
  • PROF relies on the horizontal and vertical gradients at the reference position to correct the affine prediction result, while secondary prediction uses a filter to predict each pixel position in a sub-block a second time.
  • the present application proposes an inter-frame prediction method, an encoder, a decoder, and a computer storage medium, which can greatly improve encoding performance and thus improve encoding and decoding efficiency.
  • an embodiment of the present application provides an inter-frame prediction method, which is applied to a decoder, and the method includes:
  • a first motion vector of the current sub-block of the current block is determined; wherein the current block includes a plurality of sub-blocks;
  • the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined; wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used when performing secondary prediction or PROF processing on the pixel at the current pixel position;
  • if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain an updated pixel position;
  • a second predicted value corresponding to the current sub-block is determined based on the first predicted value and the updated pixel position, and the second predicted value is determined as the inter-frame predicted value of the current sub-block.
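The update step above can be sketched in a few lines. The description leaves the exact update rule open (boundary expansion and redefinition of the out-of-range position are both mentioned later); the function below is a minimal sketch assuming a clamp-to-sub-block rule, and all names are illustrative.

```python
def update_target_position(tx, ty, sub_x, sub_y, sub_w, sub_h):
    """If the target pixel position (tx, ty) falls outside the current
    sub-block, clamp it to the nearest position inside the sub-block;
    positions already inside are returned unchanged."""
    ux = min(max(tx, sub_x), sub_x + sub_w - 1)
    uy = min(max(ty, sub_y), sub_y + sub_h - 1)
    return ux, uy
```

With a 4x4 sub-block at the origin, a target at (-1, 2) is pulled back to (0, 2), so every position used by the secondary-prediction filter stays inside the same sub-block.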
  • an embodiment of the present application provides an inter-frame prediction method, which is applied to an encoder, and the method includes:
  • a first motion vector of the current sub-block of the current block is determined; wherein the current block includes a plurality of sub-blocks;
  • the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined; wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used when performing secondary prediction or PROF processing on the pixel at the current pixel position;
  • if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain an updated pixel position;
  • a second predicted value corresponding to the current sub-block is determined based on the first predicted value and the updated pixel position, and the second predicted value is determined as the inter-frame predicted value of the current sub-block.
  • an embodiment of the present application provides a decoder; the decoder includes a parsing part, a first determining part, and a first updating part;
  • the parsing part is configured to parse the code stream to obtain the prediction mode parameter of the current block;
  • the first determining part is configured to determine the first motion vector of the current sub-block of the current block when the prediction mode parameter indicates that the inter-frame predicted value of the current block is determined using the inter prediction mode, wherein the current block includes a plurality of sub-blocks; to determine the first predicted value of the current sub-block based on the first motion vector; and to determine the target pixel position corresponding to the current pixel position, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used when performing secondary prediction or PROF processing on the pixel at the current pixel position;
  • the first update part is configured to, if the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain the updated pixel position;
  • the first determining part is further configured to determine a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and to determine the second predicted value as the inter-frame predicted value of the current sub-block.
  • an embodiment of the present application provides a decoder.
  • the decoder includes a first processor and a first memory storing instructions executable by the first processor; when the instructions are executed by the first processor, the above-mentioned inter-frame prediction method is implemented.
  • an embodiment of the present application provides an encoder; the encoder includes a second determining part and a second updating part;
  • the second determining part is configured to determine the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter-frame predicted value of the current block is determined using the inter prediction mode, to determine the first motion vector of the current sub-block of the current block; to determine the first predicted value of the current sub-block based on the first motion vector; and to determine the target pixel position corresponding to the current pixel position, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used when performing secondary prediction or PROF processing on the pixel at the current pixel position;
  • the second update part is configured to, if the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain the updated pixel position;
  • the second determining part is further configured to determine a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and to determine the second predicted value as the inter-frame predicted value of the current sub-block.
  • an embodiment of the present application provides an encoder; the encoder includes a second processor and a second memory storing instructions executable by the second processor; when the instructions are executed by the second processor, the inter-frame prediction method as described above is implemented.
  • an embodiment of the present application provides a computer storage medium, where the computer storage medium stores a computer program; when the computer program is executed by the first processor and the second processor, the above-mentioned inter-frame prediction method is implemented.
  • In the inter-frame prediction method, encoder, decoder, and computer storage medium provided by the embodiments of the present application, the decoder parses the code stream and obtains the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter prediction mode is used to determine the inter-frame predicted value of the current block,
  • the first motion vector of the current sub-block of the current block is determined, wherein the current block includes a plurality of sub-blocks; the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined, wherein the current pixel position is the position of a pixel in the current sub-block and the target pixel position is the position of a pixel used when performing secondary prediction or PROF processing on the pixel at the current pixel position; if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain the updated pixel position;
  • In this way, the boundary of the current sub-block can be expanded, or a target pixel position beyond the boundary of the current sub-block can be redefined, so that the pixel positions used for secondary prediction or PROF processing are limited to the same sub-block. This solves the problem of degraded prediction performance caused by pixel positions that do not belong to the same sub-block, reduces prediction errors, greatly improves encoding performance, and thus improves encoding and decoding efficiency.
  • FIG. 1 is a first schematic diagram of the affine model;
  • FIG. 2 is a second schematic diagram of the affine model;
  • FIG. 3 is a schematic diagram of pixel interpolation;
  • FIG. 4 is a first schematic diagram of sub-block interpolation;
  • FIG. 5 is a second schematic diagram of sub-block interpolation;
  • FIG. 6 is a schematic diagram of the motion vector of each sub-block;
  • FIG. 7 is a schematic diagram of sample positions;
  • FIG. 8 is a schematic diagram of performing secondary prediction on the current pixel position;
  • FIG. 9 is a first schematic diagram of pixel positions that do not belong to the same sub-block;
  • FIG. 10 is a second schematic diagram of pixel positions that do not belong to the same sub-block;
  • FIG. 11 is a schematic diagram of applying PROF to the current pixel position;
  • FIG. 12 is a third schematic diagram of pixel positions that do not belong to the same sub-block;
  • FIG. 13 is a schematic block diagram of the composition of a video coding system provided by an embodiment of the present application;
  • FIG. 14 is a schematic block diagram of the composition of a video decoding system provided by an embodiment of the present application;
  • FIG. 15 is a first schematic flowchart of the inter-frame prediction method;
  • FIG. 16 is a second schematic flowchart of the inter-frame prediction method;
  • FIG. 17 is a schematic diagram of extending the current sub-block;
  • FIG. 18 is a first schematic diagram of a two-dimensional filter;
  • FIG. 19 is a second schematic diagram of a two-dimensional filter;
  • FIG. 20 is a third schematic flowchart of the inter-frame prediction method;
  • FIG. 21 is a schematic diagram of a 4x4 sub-block;
  • FIG. 22 is a first schematic diagram of extended sub-blocks;
  • FIG. 23 is a second schematic diagram of extended sub-blocks;
  • FIG. 24 is a schematic diagram of a replacement pixel;
  • FIG. 25 is a fourth schematic flowchart of the inter-frame prediction method;
  • FIG. 26 is a schematic diagram of a replacement pixel position;
  • FIG. 27 is a first schematic diagram of the composition structure of a decoder;
  • FIG. 28 is a second schematic diagram of the composition structure of a decoder;
  • FIG. 29 is a first schematic diagram of the composition structure of an encoder;
  • FIG. 30 is a second schematic diagram of the composition structure of an encoder.
  • LCU: Largest Coding Unit
  • CU: Coding Unit
  • PU: Prediction Unit
  • The hybrid coding framework may include modules such as prediction, transform, quantization, entropy coding, and in-loop filtering. The prediction module may include intra prediction and inter prediction, and inter prediction may include motion estimation and motion compensation. Since there is a strong correlation between adjacent pixels within a frame of a video image, intra-frame prediction is used in video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels; since there is also a strong similarity between adjacent frames, inter-frame prediction is used to eliminate the temporal redundancy between adjacent frames, thereby improving coding efficiency. The present application is described in detail below in terms of inter prediction.
  • Inter-frame prediction is to use the already encoded/decoded frame to predict the part that needs to be encoded/decoded in the current frame.
  • the part that needs to be encoded/decoded is usually a coding unit or a prediction unit.
  • the coding unit or prediction unit that needs to be encoded/decoded is collectively referred to as the current block.
  • Translational motion is a common and simple type of motion in video, so translation prediction is also a traditional prediction method in video codecs. Translational motion in video can be understood as a part of the content moving from a certain position in one frame to a certain position in another frame over time.
  • a simple unidirectional prediction of translation can be represented by a motion vector (MV) between a certain frame and the current frame.
  • a certain frame mentioned here is a reference frame of the current frame.
  • Through motion information consisting of a reference frame index and a motion vector, the current block can find a reference block of the same size as the current block in the reference frame, and this reference block is used as the predicted block of the current block.
  • In an ideal translational motion, the content of the current block does not change from frame to frame: there is no deformation or rotation, and no change in brightness or color. However, the content of a video does not always meet such an ideal situation.
  • Bidirectional prediction can solve the above problems to a certain extent.
  • the usual bidirectional prediction refers to bidirectional translation prediction.
  • Bidirectional prediction uses the motion information of two reference frames and two motion vectors to find two reference blocks of the same size as the current block in the two reference frames (which may be the same reference frame), and uses these two reference blocks to generate the predicted block of the current block.
  • Generation methods include averaging, weighted averaging, and some other calculations.
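The generation methods listed above (averaging and weighted averaging) can be sketched as integer arithmetic on two equally sized reference blocks; this is an illustrative sketch, not the normative combination formula of any particular codec, and all names are made up for the example.

```python
def bi_predict(ref0, ref1, w0=1, w1=1):
    """Combine two reference blocks into one predicted block by
    (weighted) averaging with round-to-nearest integer division."""
    total = w0 + w1
    return [[(w0 * a + w1 * b + total // 2) // total
             for a, b in zip(row0, row1)]
            for row0, row1 in zip(ref0, ref1)]
```

With w0 == w1 this is a plain average; unequal weights give the weighted-average variant.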
  • Prediction can be considered part of motion compensation; some documents call the prediction in this application motion compensation, and likewise call the affine prediction mentioned in this application affine motion compensation.
  • FIG. 1 is a schematic diagram 1 of an affine model
  • FIG. 2 is a schematic diagram 2 of an affine model. As shown in FIGS. 1 and 2, because each MV includes an x-component and a y-component, an affine model with 2 control points has 4 parameters, and one with 3 control points has 6 parameters.
  • With the affine model, an MV can be derived for each pixel position, and each pixel position can find its corresponding position in the reference frame. If that position is not an integer pixel position, the sub-pixel value needs to be obtained by interpolation.
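The per-position MV derivation described above follows the standard affine model. The sketch below uses the usual 4-parameter (two control points) and 6-parameter (three control points) formulas in floating point; real codecs use fixed-point arithmetic, so treat this as illustrative only.

```python
def affine_mv(x, y, mv0, mv1, w, h=None, mv2=None):
    """MV at position (x, y) from control-point MVs.

    mv0 is the top-left control point and mv1 the top-right one, w pixels
    apart.  Passing mv2 (bottom-left, h pixels below mv0) selects the
    6-parameter model; otherwise the 4-parameter model is used.
    """
    dhx = (mv1[0] - mv0[0]) / w      # horizontal MV change per pixel in x
    dhy = (mv1[1] - mv0[1]) / w
    if mv2 is None:                  # 4-parameter: rotation/zoom only
        dvx, dvy = -dhy, dhx
    else:                            # 6-parameter model
        dvx = (mv2[0] - mv0[0]) / h
        dvy = (mv2[1] - mv0[1]) / h
    return (mv0[0] + dhx * x + dvx * y,
            mv0[1] + dhy * x + dvy * y)
```

At the control-point positions the model reproduces the control-point MVs exactly, which is a quick sanity check on the formulas.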
  • The interpolation methods used in current video coding and decoding standards are usually implemented by finite impulse response (FIR) filters, which are costly. For example, in AVS3 an 8-tap interpolation filter is used for the luminance component; the sub-pixel accuracy of the normal mode is 1/4 pixel, and that of the affine mode is 1/16 pixel.
  • Figure 3 is a schematic diagram of pixel interpolation.
  • The circular pixel is the desired sub-pixel position;
  • the dark square pixel is the integer pixel position corresponding to the sub-pixel;
  • the vector between them is the motion vector of the sub-pixel;
  • the light-colored square pixels are the pixels needed to interpolate the circular sub-pixel position.
  • To interpolate the value of this one sub-pixel position, the pixel values of this 8x8 area of light-colored square pixels are needed, which also contains the dark pixel position.
  • FIG. 4 is a schematic diagram 1 of sub-block interpolation. The pixel area that needs to be used for 4 ⁇ 4 block interpolation is shown in FIG. 4 .
  • FIG. 5 is a schematic diagram 2 of sub-block interpolation. The pixel area that needs to be used for 8 ⁇ 8 block interpolation is shown in FIG. 5 .
  • the MV of each pixel location in a sub-block is the same.
  • The pixel positions in a sub-block can then be interpolated together, sharing the bandwidth, using filters of the same phase, and sharing the intermediate values of the interpolation process. If one MV were used per pixel, however, the bandwidth would increase, filters of different phases might be needed, and the intermediate values of the interpolation process could not be shared.
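The bandwidth argument above can be made concrete with a rough count of the reference samples an 8-tap FIR interpolation must fetch. The model is deliberately simplified (it ignores caching and alignment) and is only meant to show the order-of-magnitude difference between a shared sub-block MV and a per-pixel MV.

```python
def ref_samples_needed(block_w, block_h, taps=8, per_pixel=False):
    """Reference samples an FIR interpolation must fetch.

    A shared sub-block MV fetches one (w+taps-1) x (h+taps-1) patch;
    a per-pixel MV would fetch a taps x taps patch for every pixel.
    """
    if per_pixel:
        return block_w * block_h * taps * taps
    return (block_w + taps - 1) * (block_h + taps - 1)
```

For a 4x4 sub-block with a shared MV this gives 11 x 11 = 121 samples, versus 16 x 64 = 1024 in the worst-case per-pixel scenario.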
  • affine prediction in VVC and AVS3 is implemented based on sub-blocks.
  • the sub-block size in AVS3 is 4x4 and 8x8, and the 4x4 sub-block size is used in VVC.
  • Each sub-block has an MV, and the pixel positions within the sub-block share the same MV. In this way, all pixel positions inside the sub-block are uniformly interpolated.
  • the sub-block-based affine prediction is similar in motion compensation complexity to other sub-block-based prediction methods.
  • the method for determining this shared MV is to take the MV of the center of the current sub-block.
  • The center of the sub-block actually falls on a non-integer pixel position, so the position of an integer pixel is taken instead.
  • For example, for a 4x4 sub-block, the pixel at offset (2, 2) from the upper-left corner is taken; for an 8x8 sub-block, the pixel at offset (4, 4) from the upper-left corner is taken.
  • the affine prediction model can derive the MV for each pixel location based on the control points (2 control points or 3 control points) used by the current block.
  • The MV at the pixel position described in the preceding paragraph is calculated and used as the MV of the sub-block.
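The sampling rule above can be sketched as follows: each sub-block's shared MV is evaluated at the integer position offset (sub/2, sub/2) from its upper-left corner. The helper below only enumerates those sample points; the function name is illustrative.

```python
def subblock_mv_sample_points(block_w, block_h, sub=4):
    """Positions, relative to the block's upper-left corner, at which the
    shared MV of each sub-block is sampled (sub-block origin + half size)."""
    half = sub >> 1
    return [(x + half, y + half)
            for y in range(0, block_h, sub)
            for x in range(0, block_w, sub)]
```

For an 8x8 block split into 4x4 sub-blocks this yields the four points (2, 2), (6, 2), (2, 6), (6, 6), matching the (2, 2) offset rule stated above.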
  • FIG. 6 is a schematic diagram of the motion vector of each sub-block. As shown in FIG. 6, to derive the motion vector of each sub-block, the motion vector is sampled at the center of each sub-block as shown in the figure, rounded to 1/16 precision, and then used for motion compensation.
  • PROF is a prediction refinement technique.
  • This technique can improve the prediction value of block-based affine prediction without increasing bandwidth.
  • the gradients in the horizontal and vertical directions are calculated for each pixel for which the sub-block-based affine prediction has been completed.
  • PROF uses a 3-tap filter [-1, 0, 1] when calculating the gradient, and its calculation method is the same as that of Bi-directional Optical flow (BDOF).
  • BDOF Bi-directional Optical flow
  • The motion vector deviations at the same relative position of some sub-blocks are identical; for these sub-blocks, only one set of motion vector deviations needs to be calculated, and the other sub-blocks can directly reuse these values. For each pixel position, the horizontal and vertical gradients of that pixel and its motion vector deviation (comprising a horizontal deviation and a vertical deviation) are used to calculate a correction value for the predicted value of that pixel position; the original predicted value, i.e., the predicted value of the sub-block-based affine prediction, is then added to the correction value to obtain the corrected predicted value.
  • When calculating the gradient in the horizontal and vertical directions, the [-1, 0, 1] filter is used. That is, for the current pixel position, the horizontal direction uses the predicted values of the pixel positions at a distance of 1 to the left and 1 to the right, and the vertical direction uses the predicted values of the pixel positions at a distance of 1 above and 1 below. If the current pixel position is at the boundary of the current block, some of the above-mentioned pixel positions will exceed the boundary of the current block by a distance of one pixel.
  • Therefore, the predicted value at the boundary of the current block is used to fill the positions one pixel outward to satisfy the gradient calculation, so that there is no need to additionally fetch predicted values one pixel beyond the boundary of the current block. Since the gradient computation only needs the predicted values of the sub-block-based affine prediction, no additional bandwidth is required.
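The refinement described above can be sketched as: compute [-1, 0, 1] gradients with boundary repetition, then add gradient-times-deviation to the affine prediction. Floating-point arithmetic is used here for readability; the real PROF/BDOF pipelines use fixed-point shifts, so treat this as an illustrative sketch, with all names made up for the example.

```python
def prof_refine(pred, dvx, dvy):
    """PROF-style refinement: 3-tap [-1, 0, 1] gradients with edge
    repetition, then pred + gx*dvx + gy*dvy per pixel."""
    h, w = len(pred), len(pred[0])

    def at(x, y):  # repeat the boundary predicted value outward
        return pred[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = []
    for y in range(h):
        row = []
        for x in range(w):
            gx = (at(x + 1, y) - at(x - 1, y)) / 2   # horizontal gradient
            gy = (at(x, y + 1) - at(x, y - 1)) / 2   # vertical gradient
            row.append(pred[y][x] + gx * dvx[y][x] + gy * dvy[y][x])
        out.append(row)
    return out
```

On a horizontal ramp with a unit horizontal deviation, interior pixels are shifted by exactly one gradient step, while boundary pixels see a halved gradient because of the edge repetition.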
  • the MV of each sub-block of the current block and the motion vector deviation of each pixel position in the sub-block are derived from the MV of the control point.
  • Each sub-block in VVC uses the same relative pixel position for its sub-block MV, so it is only necessary to derive the motion vector deviations for one sub-block, and the other sub-blocks can reuse them.
  • In the above description of the PROF process, the calculation of the motion vector deviations by PROF is included.
  • AVS3's affine prediction has the same basic principle as VVC when calculating the MV of a sub-block, but AVS3 has special processing for the upper-left sub-block A, upper-right sub-block B and lower-left sub-block C of the current block.
  • If there are 3 motion vectors in the affine control point motion vector group, the motion vector group can be expressed as mvsAffine(mv0, mv1, mv2); if there are 2 motion vectors in the affine control point motion vector group, the motion vector group can be expressed as mvsAffine(mv0, mv1).
  • the affine motion unit subblock motion vector array can be derived as follows:
  • FIG. 7 is a schematic diagram of sample positions.
  • (xE, yE) is the position of the upper-left sample of the luminance prediction block of the current prediction unit in the luminance sample matrix of the current image; the width and height of the current prediction unit are width and height respectively, and the width and height of each sub-block are subwidth and subheight respectively. The sub-block containing the upper-left sample of the luminance prediction block of the current prediction unit is A, the sub-block containing the upper-right sample is B, and the sub-block containing the lower-left sample is C.
  • both xPos and yPos are equal to 0;
  • xPos is equal to width and yPos is equal to 0;
  • xPos is equal to (x-xE)+4, and yPos is equal to (y-yE)+4;
  • mvE_x = Clip3(-131072, 131071, Rounding((mv0_x << 7) + dHorX × xPos + dVerX × yPos, 7));
  • mvE_y = Clip3(-131072, 131071, Rounding((mv0_y << 7) + dHorY × xPos + dVerY × yPos, 7));
  • both xPos and yPos are equal to 0;
  • xPos is equal to width and yPos is equal to 0;
  • xPos is equal to (x-xE)+2, and yPos is equal to (y-yE)+2;
  • mvE_x = Clip3(-131072, 131071, Rounding((mv0_x << 7) + dHorX × xPos + dVerX × yPos, 7));
  • mvE_y = Clip3(-131072, 131071, Rounding((mv0_y << 7) + dHorY × xPos + dVerY × yPos, 7)).
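The mvE_x / mvE_y derivation above combines a left shift, a rounding right shift, and a clip. A sketch in integer arithmetic follows; the Rounding definition assumed here is the usual round-half-away-from-zero shift, so if the normative text defines it differently, adjust accordingly.

```python
def clip3(lo, hi, v):
    """Clamp v to [lo, hi]."""
    return max(lo, min(hi, v))

def rounding(x, s):
    """Round-half-away-from-zero right shift by s bits."""
    return (x + (1 << (s - 1))) >> s if x >= 0 else -((-x + (1 << (s - 1))) >> s)

def subblock_mv_component(mv0_c, d_hor, d_ver, x_pos, y_pos):
    """mvE_c = Clip3(-131072, 131071,
                     Rounding((mv0_c << 7) + dHor*xPos + dVer*yPos, 7))"""
    return clip3(-131072, 131071,
                 rounding((mv0_c << 7) + d_hor * x_pos + d_ver * y_pos, 7))
```

With zero deltas the component round-trips through the << 7 / Rounding(…, 7) pair unchanged, and out-of-range results are clipped to the 18-bit signed range.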
  • mv0E0 is the L0 motion vector of the 4x4 unit at position (xE+x, yE+y) in the motion vector array MvArrayL0.
  • the value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x) << 4) + mv0E0_x, ((yE+y) << 4) + mv0E0_y) in the 1/16-precision luma sample matrix of the reference image whose reference index in reference image queue 0 is RefIdxL0; the value of the element predMatrixL0[x][y] in the chrominance prediction sample matrix predMatrixL0 is taken from the reference image whose reference index in reference image queue 0 is RefIdxL0
  • x1 = ((xE + 2x) >> 3) << 3
  • y1 = ((yE + 2y) >> 3) << 3
  • mv1E0 is the L0 motion vector of the 4x4 unit at position (x1, y1) in MvArrayL0, and mv4E0 is the L0 motion vector of the 4x4 unit at position (x1+4, y1+4) in MvArrayL0.
  • MvC_x = (mv1E0_x + mv2E0_x + mv3E0_x + mv4E0_x + 2) >> 2
  • MvC_y = (mv1E0_y + mv2E0_y + mv3E0_y + mv4E0_y + 2) >> 2
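The MvC derivation above is a rounded average of the four 4x4 luma sub-block MVs; as a small sketch (the function name is illustrative):

```python
def chroma_mv(mv1, mv2, mv3, mv4):
    """Chroma MV as the rounded average of four 4x4 luma sub-block MVs,
    per MvC = (mv1 + mv2 + mv3 + mv4 + 2) >> 2, component-wise."""
    return ((mv1[0] + mv2[0] + mv3[0] + mv4[0] + 2) >> 2,
            (mv1[1] + mv2[1] + mv3[1] + mv4[1] + 2) >> 2)
```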
  • mv0E0 is the L0 motion vector of the 8x8 unit at position (xE+x, yE+y) in the motion vector array MvArrayL0.
  • the value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x) << 4) + mv0E0_x, ((yE+y) << 4) + mv0E0_y) in the 1/16-precision luma sample matrix of the reference image whose reference index in reference image queue 0 is RefIdxL0; the value of the element predMatrixL0[x][y] in the chroma prediction sample matrix predMatrixL0 is taken from the reference image whose reference index in reference image queue 0 is RefIdxL0
  • MvC_x is equal to mv0E0_x
  • MvC_y is equal to mv0E0_y
  • mv0E1 is the L1 motion vector of the 4x4 unit at position (xE+x, yE+y) in the motion vector array MvArrayL1.
  • the value of the element predMatrixL1[x][y] in the luma prediction sample matrix predMatrixL1 is the sample value at position (((xE+x) << 4) + mv0E1_x, ((yE+y) << 4) + mv0E1_y) in the 1/16-precision luma sample matrix of the reference image whose reference index in reference image queue 1 is RefIdxL1; the value of the element predMatrixL1[x][y] in the chrominance prediction sample matrix predMatrixL1 is taken from the reference image whose reference index in reference image queue 1 is RefIdxL1
  • mv1E1 is the L1 motion vector of the 4x4 unit at position (x1, y1) in MvArrayL1;
  • mv2E1 is the L1 motion vector of the 4x4 unit at position (x1+4, y1) in MvArrayL1;
  • mv3E1 is the L1 motion vector of the 4x4 unit at position (x1, y1+4) in MvArrayL1;
  • mv4E1 is the L1 motion vector of the 4x4 unit at position (x1+4, y1+4) in MvArrayL1.
  • MvC_x = (mv1E1_x + mv2E1_x + mv3E1_x + mv4E1_x + 2) >> 2
  • MvC_y = (mv1E1_y + mv2E1_y + mv3E1_y + mv4E1_y + 2) >> 2
  • mv0E1 is the L1 motion vector of the 8x8 unit at position (xE+x, yE+y) in the motion vector array MvArrayL1.
  • the value of the element predMatrixL1[x][y] in the luma prediction sample matrix predMatrixL1 is the sample value at position (((xE+x) << 4) + mv0E1_x, ((yE+y) << 4) + mv0E1_y) in the 1/16-precision luma sample matrix of the reference image whose reference index in reference image queue 1 is RefIdxL1; the value of the element predMatrixL1[x][y] in the chrominance prediction sample matrix predMatrixL1 is taken from the reference image whose reference index in reference image queue 1 is RefIdxL1
  • MvC_x is equal to mv0E1_x
  • MvC_y is equal to mv0E1_y
  • mv0E0 is the L0 motion vector of the 8x8 unit at position (xE+x, yE+y) in the motion vector array MvArrayL0, and mv0E1 is the L1 motion vector of the 8x8 unit at the same position in MvArrayL1.
  • the value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x) << 4) + mv0E0_x, ((yE+y) << 4) + mv0E0_y) in the 1/16-precision luma sample matrix of the reference image whose reference index in reference image queue 0 is RefIdxL0; the value of the element predMatrixL0[x][y] in the chrominance prediction sample matrix predMatrixL0 is taken from the reference image whose reference index in reference image queue 0 is RefIdxL0
  • the corresponding position in the matrix is (((xE+x) << 4) + mv0E1_x, ((yE+y) << 4) + mv0E1_y)
  • the value of the element predMatrixL1[x][y] in the chroma prediction sample matrix predMatrixL1 is the sample value at position (((xE+2x) << 4) + MvC1_x, ((yE+2y) << 4) + MvC1_y) in the 1/32-precision chroma sample matrix of the reference image whose reference index in reference image queue 1 is RefIdxL1.
  • MvC0_x is equal to mv0E0_x
  • MvC0_y is equal to mv0E0_y
  • MvC1_x is equal to mv0E1_x
  • MvC1_y is equal to mv0E1_y.
  • the element values of each position in the luminance 1/16 precision sample matrix and the chrominance 1/32 precision sample matrix of the reference image are obtained by the interpolation method defined by the following affine luminance sample interpolation process and affine chrominance sample interpolation process.
  • Integer samples outside the reference image should be replaced with the nearest integer sample (an edge or corner sample) in the image; that is, the motion vector is allowed to point to samples outside the reference image.
  • affine luminance sample interpolation process is as follows:
  • A, B, C, D are adjacent integer pixel samples
  • dx and dy are the horizontal and vertical distances between sub-pixel samples a(dx, dy) and A around integer pixel sample A
  • dx is equal to fx&15
  • dy is equal to fy&15
  • ( fx, fy) are the coordinates of the sub-pixel sample in the 1/16 precision luminance sample matrix.
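The decomposition of a 1/16-precision coordinate into an integer sample position and a fractional (sub-pixel) offset can be sketched as (function name is illustrative):

```python
def split_luma_coord(fx, fy):
    # A 1/16-precision luma coordinate splits into an integer sample
    # index (fx >> 4, fy >> 4) and a fractional offset (fx & 15, fy & 15),
    # matching dx = fx & 15 and dy = fy & 15 above.
    return (fx >> 4, fy >> 4), (fx & 15, fy & 15)
```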
  • a(x,0) = Clip1((fL[x][0]×A(−3,0) + fL[x][1]×A(−2,0) + fL[x][2]×A(−1,0) + fL[x][3]×A(0,0) + fL[x][4]×A(1,0) + fL[x][5]×A(2,0) + fL[x][6]×A(3,0) + fL[x][7]×A(4,0) + 32) >> 6).
  • a(0,y) = Clip1((fL[y][0]×A(0,−3) + fL[y][1]×A(0,−2) + fL[y][2]×A(0,−1) + fL[y][3]×A(0,0) + fL[y][4]×A(0,1) + fL[y][5]×A(0,2) + fL[y][6]×A(0,3) + fL[y][7]×A(0,4) + 32) >> 6).
  • a(x,y) = Clip1((fL[y][0]×a′(x,y−3) + fL[y][1]×a′(x,y−2) + fL[y][2]×a′(x,y−1) + fL[y][3]×a′(x,y) + fL[y][4]×a′(x,y+1) + fL[y][5]×a′(x,y+2) + fL[y][6]×a′(x,y+3) + fL[y][7]×a′(x,y+4) + (1 << (19−BitDepth))) >> (20−BitDepth)).
  • a′(x,y) = (fL[x][0]×A(−3,y) + fL[x][1]×A(−2,y) + fL[x][2]×A(−1,y) + fL[x][3]×A(0,y) + fL[x][4]×A(1,y) + fL[x][5]×A(2,y) + fL[x][6]×A(3,y) + fL[x][7]×A(4,y) + ((1 << (BitDepth−8)) >> 1)) >> (BitDepth−8).
  • the luminance interpolation filter coefficients are shown in Table 1:
  • affine chroma sample interpolation process is as follows:
  • A, B, C, D are adjacent integer pixel samples
  • dx and dy are the horizontal and vertical distances between sub-pixel samples a(dx, dy) and A around integer pixel sample A
  • dx is equal to fx&31
  • dy is equal to fy&31
  • ( fx, fy) are the coordinates of the sub-pixel sample in the 1/32-precision chroma sample matrix.
  • a(x,y)(0,dy) = Clip3(0, (1 << BitDepth) − 1, (fC[dy][0]×A(x,y−1) + fC[dy][1]×A(x,y) + fC[dy][2]×A(x,y+1) + fC[dy][3]×A(x,y+2) + 32) >> 6)
  • a(x,y)(dx,0) = Clip3(0, (1 << BitDepth) − 1, (fC[dx][0]×A(x−1,y) + fC[dx][1]×A(x,y) + fC[dx][2]×A(x+1,y) + fC[dx][3]×A(x+2,y) + 32) >> 6)
  • a(x,y)(dx,dy) = Clip3(0, (1 << BitDepth) − 1, (fC[dy][0]×a′(x,y−1)(dx,0) + fC[dy][1]×a′(x,y)(dx,0) + fC[dy][2]×a′(x,y+1)(dx,0) + fC[dy][3]×a′(x,y+2)(dx,0) + (1 << (19−BitDepth))) >> (20−BitDepth))
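Each direction of this chroma interpolation is a 4-tap filter followed by rounding; a one-dimensional sketch with hypothetical taps (the real coefficients come from the standard's chroma filter coefficient table, which is not reproduced here):

```python
def interp_4tap(samples, coeffs):
    # samples: the four integer pixels A(x-1..x+2) along one direction.
    # coeffs: four filter taps assumed to sum to 64, so that the
    # "+32 >> 6" step rounds the result back to sample scale, as in
    # the chroma formulas above. Clipping is omitted for brevity.
    acc = sum(c * s for c, s in zip(coeffs, samples))
    return (acc + 32) >> 6
```

For example, half-sample-like taps [0, 32, 32, 0] average the two central pixels.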
  • affine prediction methods can include the following steps:
  • Step 101 Determine the motion vector of the control point.
  • Step 102 Determine the motion vector of the sub-block according to the motion vector of the control point.
  • Step 103 Predict the sub-block according to the motion vector of the sub-block.
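The three steps above can be sketched with the common 4-parameter affine model, in which the motion vector at any sub-block position follows from the two control-point motion vectors. This fractional-arithmetic sketch is illustrative only (names are not from the standard, and real codecs use fixed-point shifts):

```python
def subblock_mv_4param(mv0, mv1, width, x, y):
    # mv0, mv1: control-point motion vectors at the top-left and
    # top-right corners of a block of the given width.
    dhx = (mv1[0] - mv0[0]) / width   # dHorX
    dhy = (mv1[1] - mv0[1]) / width   # dHorY
    # In the 4-parameter model the vertical derivatives follow by
    # rotation of the horizontal ones (assumption of this sketch).
    dvx, dvy = -dhy, dhx
    return (mv0[0] + dhx * x + dvx * y,
            mv0[1] + dhy * x + dvy * y)
```

Evaluating this at each sub-block centre (step 102) yields the motion vector used for the sub-block prediction of step 103.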
  • Step 101 Determine the motion vector of the control point.
  • Step 102 Determine the motion vector of the sub-block according to the motion vector of the control point.
  • Step 103 Predict the sub-block according to the motion vector of the sub-block.
  • Step 104 Determine the motion vector deviation between each position in the sub-block and the sub-block according to the motion vector of the control point and the motion vector of the sub-block.
  • Step 105 Determine the motion vector of the sub-block according to the motion vector of the control point.
  • Step 106 Use the sub-block-based prediction values to calculate the gradients of each position in the horizontal and vertical directions.
  • Step 107 Calculate the deviation value of the predicted value of each position according to the motion vector deviation of each position and the gradients in the horizontal and vertical directions by using the principle of optical flow.
  • Step 108 Add the deviation value of the predicted value to the predicted value based on the sub-block for each position to obtain a revised predicted value.
  • Step 101 Determine the motion vector of the control point.
  • Step 109 Determine the motion vector of the sub-block and the deviation between each position in the sub-block and the motion vector of the sub-block according to the motion vector of the control point.
  • Step 103 Predict the sub-block according to the motion vector of the sub-block.
  • Step 106 Use the sub-block-based prediction values to calculate the gradients of each position in the horizontal and vertical directions.
  • Step 107 Calculate the deviation value of the predicted value of each position according to the motion vector deviation of each position and the gradients in the horizontal and vertical directions by using the principle of optical flow.
  • Step 108 Add the deviation value of the predicted value to the predicted value based on the sub-block for each position to obtain a revised predicted value.
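Steps 106 to 108 above amount to a per-pixel optical-flow correction: the offset added to the sub-block prediction is the dot product of the local gradient with the pixel's motion-vector deviation. A minimal sketch (names are illustrative):

```python
def prof_refine(pred, gx, gy, dmv_x, dmv_y):
    # Step 107: delta_I = g_x * dv_x + g_y * dv_y (optical-flow principle).
    # Step 108: add the offset to the sub-block-based predicted value.
    delta = gx * dmv_x + gy * dmv_y
    return pred + delta
```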
  • PROF can revise sub-block-based affine prediction using the principle of optical flow, which improves the compression performance.
  • the application of PROF is premised on the motion vector of each pixel position within the sub-block deviating only very slightly from the sub-block motion vector.
  • the optical flow calculation used by PROF is effective only when that deviation is small. When the motion vector of a pixel position in the sub-block deviates greatly from the motion vector of the sub-block, the computed gradient no longer truly reflects the gradient in the horizontal and vertical directions between the reference position and the actual position, so this method is no longer particularly effective.
  • the decoder's method for secondary prediction may include the following steps:
  • Step 201 Predict the sub-block according to the motion vector of the sub-block to obtain a predicted value.
  • Step 202 Determine the motion vector deviation between each position in the sub-block and the sub-block.
  • Step 203 Use a two-dimensional filter to filter the predicted value according to the motion vector deviation of each position to obtain the predicted value of secondary prediction.
  • the secondary prediction method can perform point-based secondary prediction, on the basis of the sub-block-based prediction, for the pixel positions whose motion vectors deviate from the motion vector of the sub-block after the sub-block-based prediction; this finally completes the correction of the predicted value and yields a new predicted value, that is, the predicted value of the secondary prediction.
  • point-based secondary prediction uses a two-dimensional filter.
  • a two-dimensional filter is a filter composed of adjacent points forming a preset shape; the preset shape may consist of 9 adjacent points.
  • the result of the filter processing is the predicted value of the secondary prediction for that location.
  • the filter coefficient of the two-dimensional filter is determined by the motion vector deviation of each position, the input of the two-dimensional filter is the predicted value, and the output is the predicted value of the secondary prediction.
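A minimal sketch of such a two-dimensional filter over 9 adjacent points (a 3x3 window); here the nine coefficients are passed in directly, whereas in the scheme above they would be derived from the motion-vector deviation of each position (names are illustrative):

```python
def filter_3x3(pred, x, y, coeffs):
    # pred: sub-block-based prediction as a list of rows.
    # coeffs: 3x3 coefficients, assumed normalised (summing to 1).
    # The output is the secondary-prediction value for (x, y).
    total = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            total += coeffs[dy + 1][dx + 1] * pred[y + dy][x + dx]
    return total
```

With an identity kernel (centre 1, others 0) the filter returns the original predicted value unchanged.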
  • the decoder's method for secondary prediction may include the following steps:
  • Step 204 Determine the motion vector of the control point.
  • Step 205 Determine the motion vector of the sub-block according to the motion vector of the control point.
  • Step 201 Predict the sub-block according to the motion vector of the sub-block to obtain a predicted value.
  • Step 202 Determine the motion vector deviation between each position in the sub-block and the sub-block.
  • Step 203 Use a two-dimensional filter to filter the predicted value according to the motion vector deviation of each position to obtain the predicted value of secondary prediction.
  • after the motion vector of the control point is determined, it can be used to perform prediction processing on the sub-block, and the motion vector deviation between each pixel position in the sub-block and the sub-block can then be determined.
  • point-based secondary prediction is carried out, on the basis of the sub-block-based prediction, for the pixel positions whose motion vectors deviate from the motion vector of the sub-block; this finally completes the correction of the predicted value and yields a new predicted value, which is the predicted value of the secondary prediction.
  • the decoder's method for secondary prediction may include the following steps:
  • Step 204 Determine the motion vector of the control point.
  • Step 206 Determine the motion vector of the sub-block according to the motion vector of the control point, and the deviation of each position in the sub-block from the motion vector of the sub-block.
  • Step 201 Predict the sub-block according to the motion vector of the sub-block to obtain a predicted value.
  • Step 203 Use a two-dimensional filter to filter the predicted value according to the motion vector deviation of each position to obtain the predicted value of secondary prediction.
  • the motion vector of the sub-block and the deviation of each position in the sub-block from the motion vector of the sub-block can be determined at the same time, and the sub-block can then be predicted accordingly.
  • point-based secondary prediction is then performed on the basis of the sub-block-based prediction; this finally completes the correction of the predicted value and yields a new predicted value, that is, the predicted value of the secondary prediction.
  • the secondary prediction method can be used for affine prediction.
  • secondary prediction can also be applied to improve other sub-block-based predictions. That is to say, the sub-block-based prediction proposed in this application includes, but is not limited to, affine sub-block-based prediction.
  • the secondary prediction method may be based on the AVS3 standard, and may also be applied to the VVC standard, which is not specifically limited in this application.
  • the pixel values of adjacent pixel positions in the prediction block (current block) obtained by the sub-block-based prediction will be used.
  • the two-dimensional filter used is a 3x3 rectangular filter
  • the adjacent pixel positions include the pixel positions one pixel away from the current pixel position to the left, to the right, above, below, above-left, above-right, below-left, and below-right.
  • in addition to these 8 pixel positions, the current pixel position itself may also be included.
  • the pixel positions used by the filter may belong to more than one sub-block, that is, the filter may cross sub-blocks. It is often assumed that adjacent pixels between adjacent sub-blocks are continuous. In practice, however, since multiple sub-blocks of the current block may be predicted based on different motion vectors, the pixels adjacent between adjacent sub-blocks are very likely not connected in the reference image, that is, not actually adjacent there. If the gap between the motion vectors of adjacent sub-blocks of the current block is relatively small, this disconnection may not be obvious; but if the gap is relatively large, the disconnection will show.
  • FIG. 8 is a schematic diagram of performing secondary prediction on the current pixel position.
  • the current pixel position is position 1
  • position 2 is the left adjacent pixel of position 1, that is, the pixel position one pixel to the left of position 1
  • position 2 and position 1 are not in the same sub-block
  • position 1 belongs to sub-block 1
  • position 2 belongs to sub-block 2.
  • Fig. 9 is a schematic diagram 1 of a pixel position that does not belong to the same sub-block.
  • FIG. 9 shows a possible relative position, in the reference image, of the two adjacent sub-blocks, sub-block 1 and sub-block 2. The MVs of sub-block 1 and sub-block 2 do not differ much, and their positions in the reference image are slightly shifted, so the actual position in the reference image of position 2, which is adjacent to position 1, is position 3 in sub-block 2. It can be seen that position 3 in sub-block 2 is not exactly one pixel directly to the left of position 1 in sub-block 1; there is a positional deviation between the two pixel positions.
  • FIG. 10 is a schematic diagram 2 of a pixel position that does not belong to the same sub-block.
  • FIG. 10 shows another possible relative position, in the reference image, of the two adjacent sub-blocks, sub-block 1 and sub-block 2, in which no pixel position in sub-block 2 is directly adjacent to the left of position 1 in sub-block 1.
  • position 3 in sub-block 2 will nevertheless be selected as the left neighbor of position 1 in sub-block 1. It can be seen that position 1 and position 3 are far apart; if position 3 is used as the adjacent pixel position of position 1 for filtering, the prediction effect of the final secondary prediction will be greatly reduced.
  • PROF needs to calculate the gradients in the horizontal and vertical directions.
  • the gradient calculation in the horizontal direction needs to use, in the sub-block-based prediction of the current block, the pixel positions one pixel to the left and one pixel to the right of the current pixel position
  • the gradient calculation in the vertical direction needs to use, in the sub-block-based prediction of the current block, the pixel positions one pixel above and one pixel below the current pixel position.
  • the accuracy of the gradient computation decreases if they are not connected in the reference image.
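The gradients described above reduce to central differences over the sub-block-based prediction; the division by 2 is an illustrative normalisation (implementations use fixed-point shifts):

```python
def gradients(pred, x, y):
    # Central differences used by PROF: the horizontal gradient uses
    # the pixels one sample to the left and right of (x, y), the
    # vertical gradient the pixels one sample above and below.
    gx = (pred[y][x + 1] - pred[y][x - 1]) / 2
    gy = (pred[y + 1][x] - pred[y - 1][x]) / 2
    return gx, gy
```

If the left/right or above/below neighbors come from sub-blocks that are not actually connected in the reference image, these differences no longer measure the true gradient.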
  • FIG. 11 is a schematic diagram of using PROF for the current pixel position.
  • PROF calculates the gradient in the horizontal direction
  • it is assumed that the current pixel position is position 1.
  • position 1 belongs to sub-block 1
  • position 2 belongs to sub-block 2.
  • Figure 12 is a schematic diagram 3 where the pixel positions do not belong to the same sub-block.
  • the MVs of sub-block 1 and sub-block 2 do not differ much, and their positions in the reference image are slightly misplaced, so the actual position in the reference image of position 2, which is adjacent to position 1, is position 3 in sub-block 2. It can be seen that position 3 in sub-block 2 is not exactly one pixel directly to the left of position 1 in sub-block 1; there is a positional deviation between the two pixel positions.
  • the prediction performance may be degraded because the pixel positions are not connected. That is, the existing PROF correction method and the secondary prediction method are in fact not rigorous: when improving affine prediction, they cannot be applied well to all scenarios, and the coding performance needs to be improved.
  • by expanding the boundary of the current sub-block, redefining target pixel positions that fall beyond the boundary of the current sub-block, and so on, the pixel positions needed for secondary prediction or PROF processing are limited to the same sub-block. This solves the above problem, reduces the prediction error, and greatly improves the coding performance, thereby improving the coding and decoding efficiency.
  • FIG. 13 is a schematic block diagram of the composition of a video encoding system provided by an embodiment of the present application.
  • the video encoding system 11 may include: a transformation unit 111, quantization unit 112, mode selection and coding control logic unit 113, intra prediction unit 114, inter prediction unit 115 (including motion compensation and motion estimation), inverse quantization unit 116, inverse transform unit 117, loop filtering unit 118, encoding unit 119 and decoded image buffering unit 110. For the input original video signal, a video reconstruction block can be obtained by the division of the coding tree unit (Coding Tree Unit, CTU), and the encoding mode is determined by the mode selection and encoding control logic unit 113; then, for the residual pixel information obtained after intra-frame or inter-frame prediction, the video reconstruction block is transformed by the transform unit 111 and the quantization unit 112, including transforming the residual information from the pixel domain to the transform domain;
  • the obtained transform coefficients are quantized to further reduce the bit rate; the intra-frame prediction unit 114 is used to perform intra-frame prediction on the video reconstruction block and to determine the optimal intra-frame prediction mode (ie, target prediction mode) of the video reconstruction block; the inter prediction unit 115 is used to perform inter prediction encoding of the received video reconstruction block relative to one or more blocks in one or more reference frames to provide temporal prediction information;
  • motion estimation is the process of generating a motion vector; it can estimate the motion of the video reconstruction block, and motion compensation then performs motion compensation based on the motion vector determined by motion estimation; after determining the inter prediction mode,
  • the inter prediction unit 115 is also used to supply the selected inter-frame prediction data to the coding unit 119, and to send the calculated motion vector data to the coding unit 119; in addition, the inverse quantization unit 116 and the inverse transform unit 117 are used for the reconstruction of the video reconstruction block, reconstructing a residual block in the pixel domain; the
  • residual block is added to a predictive block from the decoded image buffer unit 110 to generate a reconstructed video reconstruction block; the encoding unit 119 is used to encode various encoding parameters and quantized transform coefficients.
  • the decoded image buffer unit 110 is used for storing reconstructed video reconstruction blocks for prediction reference. As the video image encoding proceeds, new reconstructed video reconstruction blocks are continuously generated, and these reconstructed video reconstruction blocks are all stored in the decoded image buffer unit 110 .
  • FIG. 14 is a schematic block diagram of the composition of a video decoding system provided by an embodiment of the present application.
  • the video decoding system 12 may include: a decoding unit 121, an inverse transformation unit 127, an inverse quantization unit 122, an intra-frame prediction unit 123, a motion compensation unit 124, a loop filtering unit 125 and a decoded image buffer unit 126;
  • the input of the video decoding system 12 is the code stream of the video signal; the code stream first passes through the decoding unit 121 to obtain the decoded transform coefficients; the transform coefficients are processed by the inverse transform unit 127 and the inverse quantization unit 122, so that a residual block is generated in the pixel domain;
  • intra-prediction unit 123 may be used to generate prediction data for the current video decoding block based on the determined intra-prediction direction and data from previously decoded blocks of the current frame or picture; motion compensation unit 124 generates prediction data for the current video decoding block by parsing the code stream.
  • An inter-frame prediction method provided in this embodiment of the present application mainly acts on the inter-frame prediction unit 115 of the video coding system 11 and the inter-frame prediction unit of the video decoding system 12, that is, the motion compensation unit 124;
  • the system 11 can obtain a better prediction effect through the inter-frame prediction method provided in the embodiment of the present application, and correspondingly, the video decoding system 12 can also improve the quality of video decoding and restoration.
  • this embodiment is exemplified based on the AVS3 standard, and the inter-frame prediction method proposed in this application can also be applied to other coding standard technologies such as VVC, which is not specifically limited in this application.
  • An embodiment of the present application provides an inter-frame prediction method, which is applied to a video decoding device, that is, a decoder.
  • the functions implemented by the method can be implemented by the first processor in the decoder calling a computer program, and of course the computer program can be stored in the first memory.
  • the decoder includes at least a first processor and a first memory.
  • FIG. 15 is a schematic diagram 1 of the implementation flow of the inter-frame prediction method.
  • the method for performing the inter-frame prediction by the decoder may include the following steps:
  • Step 301 parse the code stream, and obtain the prediction mode parameter of the current block.
  • the decoder may first parse the binary code stream to obtain the prediction mode parameter of the current block.
  • the prediction mode parameter may be used to determine the prediction mode used by the current block.
  • each current block may include a first image component, a second image component, and a third image component; that is, the current block is the image block in the image to be decoded on which prediction of the first image component, the second image component, or the third image component is currently performed.
  • suppose the current block performs the first image component prediction, and the first image component is a luminance component, that is, the image component to be predicted is a luminance component; then the current block may also be called a luminance block. Or, suppose the current block performs the second image component prediction, and the second image component is a chrominance component, that is, the image component to be predicted is a chrominance component; then the current block may also be called a chrominance block.
  • the prediction mode parameter may not only indicate the prediction mode adopted by the current block, but may also indicate parameters related to the prediction mode.
  • the prediction mode may include an inter prediction mode, a traditional intra prediction mode, a non-traditional intra prediction mode, and the like.
  • the encoder can select the optimal prediction mode to pre-encode the current block.
  • the prediction mode of the current block can be determined, and then the prediction mode parameters used to indicate the prediction mode can be determined. Thereby, the corresponding prediction mode parameters are written into the code stream and transmitted from the encoder to the decoder.
  • the decoder can directly obtain the prediction mode parameter of the current block by parsing the code stream, and determine the prediction mode used by the current block according to the prediction mode parameter obtained by parsing, and the correlation corresponding to the prediction mode. parameter.
  • the decoder may determine whether the current block uses the inter prediction mode based on the prediction mode parameter.
  • Step 302 When the prediction mode parameter indicates that the inter prediction value of the current block is determined using the inter prediction mode, determine the first motion vector of the current subblock of the current block; wherein the current block includes a plurality of subblocks.
  • the decoder may first determine the first motion vector of each sub-block of the current block, where one sub-block corresponds to one first motion vector.
  • the current block is the image block to be decoded in the current frame
  • the current frame is decoded in the form of image blocks in a certain order
  • the current block is the image block in the current frame in this order.
  • the current block may have various sizes, such as 16 ⁇ 16, 32 ⁇ 32, or 32 ⁇ 16, where the numbers represent the number of rows and columns of pixels on the current block.
  • the current block may be divided into multiple sub-blocks, wherein each sub-block of the current block is a prediction sub-block in the sub-block-based prediction; the size of each sub-block is the same, and a sub-block is a set of pixels of a smaller size.
  • the size of the sub-block can be 8x8 or 4x4.
  • the size of the current block is 16 ⁇ 16, which can be divided into 4 sub-blocks each with a size of 8 ⁇ 8.
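This division into equal-size sub-blocks can be sketched as follows (an illustrative helper, not decoder code):

```python
def subblock_origins(block_w, block_h, sub):
    # Enumerate the top-left coordinates of the sub-blocks a block is
    # divided into; e.g. a 16x16 block with 8x8 sub-blocks yields 4.
    return [(x, y)
            for y in range(0, block_h, sub)
            for x in range(0, block_w, sub)]
```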
  • when the decoder parses the code stream and obtains the prediction mode parameter indicating that the inter-frame prediction mode is used to determine the inter-frame prediction value of the current block, the inter prediction method provided by the embodiment of the present application can continue to be used.
  • the method for the decoder to determine the first motion vector of the current sub-block of the current block may include the following steps:
  • Step 302a Parse the code stream to obtain the affine mode parameter and prediction reference mode of the current block.
  • Step 302b when the affine mode parameter indicates to use the affine mode, determine the control point mode and the sub-block size parameter.
  • Step 302c Determine the first motion vector according to the prediction reference mode, the control point mode and the sub-block size parameter.
  • after the decoder obtains the prediction mode parameter by parsing, if the prediction mode parameter indicates that the current block uses the inter prediction mode to determine the inter prediction value of the current block, the decoder can parse the code stream to obtain the affine mode parameter and the prediction reference mode.
  • the affine mode parameter is used to indicate whether to use the affine mode.
  • the affine mode parameter may be the affine motion compensation enable flag affine_enable_flag, and the decoder may further determine whether to use the affine mode by determining the value of the affine mode parameter.
  • the affine mode parameter may be a binary variable. If the value of the affine mode parameter is 1, it indicates that the affine mode is used; if the value of the affine mode parameter is 0, it indicates that the affine mode is not used.
  • the decoder parses the code stream, and if the affine mode parameter is not obtained by parsing, it can also be understood as indicating that the affine mode is not used.
  • the value of the affine mode parameter may be equal to the value of the affine motion compensation enable flag affine_enable_flag: if the value of affine_enable_flag is '1', it means that affine motion compensation can be used; if the value of affine_enable_flag is '0', it means that affine motion compensation should not be used.
  • the decoder may obtain the control point mode and the sub-block size parameter.
  • control point mode is used to determine the number of control points.
  • a sub-block can have 2 control points or 3 control points; correspondingly, the control point mode can be the mode corresponding to 2 control points or the mode corresponding to 3 control points. That is, the control point mode can include a 4-parameter mode and a 6-parameter mode.
  • if the current block uses the affine mode, the decoder also needs to determine the number of control points of the current block in the affine mode, so as to determine whether the 4-parameter (2 control points) mode or the 6-parameter (3 control points) mode is used.
  • the decoder can further obtain the sub-block size parameter by parsing the code stream.
  • the sub-block size parameter can be determined by the affine prediction sub-block size flag affine_subblock_size_flag, the decoder obtains the sub-block size flag by parsing the code stream, and determines the size of the current sub-block of the current block according to the value of the sub-block flag .
  • the size of the sub-block may be 8 ⁇ 8 or 4 ⁇ 4.
  • the sub-block size flag may be a binary variable. If the value of the subblock size flag is 1, it indicates that the subblock size parameter is 8 ⁇ 8; if the value of the subblock size flag is 0, it indicates that the subblock size parameter is 4 ⁇ 4.
  • the value of the subblock size flag may be equal to the value of the affine prediction subblock size flag affine_subblock_size_flag. If the value of affine_subblock_size_flag is '1', the current block is divided into sub-blocks with a size of 8×8; if the value of affine_subblock_size_flag is '0', the current block is divided into sub-blocks with a size of 4×4.
  • the decoder parses the code stream, and if the sub-block size flag is not obtained by parsing, it can also be understood that the current block is divided into 4 ⁇ 4 sub-blocks. That is to say, if affine_subblock_size_flag does not exist in the code stream, the value of the subblock size flag can be directly set to 0.
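Under the semantics just described, the sub-block size follows directly from the flag, with an absent flag treated as 0 (a sketch only; the parameter name mirrors the flag in the text):

```python
def subblock_size(affine_subblock_size_flag=0):
    # '1' selects 8x8 affine sub-blocks; '0' (or an absent flag,
    # defaulted to 0) selects 4x4 sub-blocks.
    return 8 if affine_subblock_size_flag == 1 else 4
```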
  • the decoder can further determine the first motion vector of the current sub-block in the current block according to the prediction reference mode, the control point mode, and the sub-block size parameter.
  • the decoder may first determine the control point motion vector group according to the prediction reference mode, and then determine the first motion vector based on the control point motion vector group, the control point mode, and the sub-block size parameter.
  • control point motion vector group may be used to determine the motion vector of the control point.
  • the decoder can traverse each sub-block in the current block according to the above method and, using the control point motion vector group, the control point mode, and the sub-block size parameter of each sub-block, determine the first motion vector of each sub-block, so that the motion vector set can be constructed from the first motion vectors of all sub-blocks.
  • the motion vector set of the current block may include the first motion vector of each sub-block of the current block.
  • when determining the first motion vector according to the control point motion vector group, the control point mode, and the sub-block size parameter, the decoder may first determine the difference variables according to the control point motion vector group, the control point mode, and the size parameter of the current block; then the sub-block position can be determined based on the prediction mode parameter and the sub-block size parameter; finally, the first motion vector of the sub-block can be determined using the difference variables and the sub-block position, and the motion vector sets of the multiple sub-blocks of the current block can thus be obtained.
  • the difference variable may include 4 variables, specifically dHorX, dVerX, dHorY, and dVerY.
  • The decoder needs to first determine the control point motion vector group, where the control point motion vector group can characterize the motion vectors of the control points.
  • If the control point mode is a 6-parameter mode, that is, there are three control points, the control point motion vector group may be a motion vector group including 3 motion vectors, expressed as mvsAffine(mv0, mv1, mv2); if the control point mode is a 4-parameter mode, that is, there are two control points, the control point motion vector group may be a motion vector group including 2 motion vectors, expressed as mvsAffine(mv0, mv1).
  • the decoder can use the control point motion vector group to calculate the difference variable:
  • the width and height are respectively the width and height of the current block, that is, the size parameter of the current block.
  • the size parameter of the current block may be obtained by the decoder by parsing the code stream.
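  • The difference-variable computation can be sketched as follows. This is an illustrative sketch assuming an AVS3-style derivation, in which the differences between control point motion vectors are scaled by the block dimensions; the shift amount 7 and the 4-parameter fallback (dVerX = -dHorY, dVerY = dHorX) are assumptions based on common affine-model formulations, not quoted from this text:

```python
def affine_difference_variables(mvs_affine, width, height):
    """Sketch of the difference-variable derivation (dHorX, dVerX, dHorY,
    dVerY) from the control point motion vector group mvsAffine.

    mvs_affine: list of 2 (4-parameter mode) or 3 (6-parameter mode)
    (mv_x, mv_y) control point motion vectors.
    """
    log2w = width.bit_length() - 1    # block width, a power of two
    log2h = height.bit_length() - 1
    mv0, mv1 = mvs_affine[0], mvs_affine[1]
    # Horizontal gradients, scaled so the model is evaluated in 1/128 units.
    dHorX = (mv1[0] - mv0[0]) << (7 - log2w)
    dHorY = (mv1[1] - mv0[1]) << (7 - log2w)
    if len(mvs_affine) == 3:          # 6-parameter mode: third control point
        mv2 = mvs_affine[2]
        dVerX = (mv2[0] - mv0[0]) << (7 - log2h)
        dVerY = (mv2[1] - mv0[1]) << (7 - log2h)
    else:                             # 4-parameter mode: rotation/zoom model
        dVerX = -dHorY
        dVerY = dHorX
    return dHorX, dVerX, dHorY, dVerY
```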
  • The decoder may then determine the sub-block position based on the prediction mode parameter and the sub-block size parameter. Specifically, the decoder can determine the size of the sub-block through the sub-block size flag, determine which prediction mode to use through the prediction mode parameter, and then determine the sub-block position according to the size of the sub-block and the prediction mode used.
  • If the value of the prediction reference mode of the current block is 2, that is, the third reference mode 'Pred_List01' is used, or the value of the sub-block size flag is 1, that is, the width subwidth and height subheight of the sub-block are both equal to 8, then (x, y) is the coordinates of the upper left corner of the 8×8 sub-block, and the coordinates xPos and yPos of the sub-block position can be determined in the following ways:
  • if the sub-block is the upper left corner sub-block of the current block, both xPos and yPos are equal to 0;
  • if the sub-block is the upper right corner sub-block of the current block, xPos is equal to width and yPos is equal to 0;
  • if the control point motion vector group is a motion vector group including 3 motion vectors and the sub-block is the lower left corner sub-block of the current block, xPos is equal to 0 and yPos is equal to height;
  • otherwise, xPos is equal to (x-xE)+4 and yPos is equal to (y-yE)+4.
  • If the value of the prediction reference mode of the current block is 0 or 1, that is, the first reference mode 'Pred_List0' or the second reference mode 'Pred_List1' is used, and the value of the sub-block size flag is 0, that is, the width subwidth and height subheight of the sub-block are both equal to 4, then (x, y) is the coordinates of the upper left corner of the 4×4 sub-block, and the coordinates xPos and yPos of the sub-block position can be determined in the following ways:
  • if the sub-block is the upper left corner sub-block of the current block, both xPos and yPos are equal to 0;
  • if the sub-block is the upper right corner sub-block of the current block, xPos is equal to width and yPos is equal to 0;
  • if the control point motion vector group is a motion vector group including 3 motion vectors and the sub-block is the lower left corner sub-block of the current block, xPos is equal to 0 and yPos is equal to height;
  • otherwise, xPos is equal to (x-xE)+2 and yPos is equal to (y-yE)+2.
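  • The sub-block position selection can be sketched as follows; the tests for which sub-block is the upper left, upper right, or lower left corner sub-block are assumptions inferred from the control-point layout (positions A, B, C), not quoted conditions:

```python
def subblock_position(x, y, xE, yE, width, height, num_cpoints, sub_size):
    """Sketch of the (xPos, yPos) selection described above.

    (x, y) is the upper-left sample of the sub-block, (xE, yE) the upper-left
    sample of the current block; sub_size is 4 or 8. The corner sub-blocks
    that carry control points use the control-point positions themselves;
    every other sub-block uses its own centre.
    """
    half = sub_size // 2              # 2 for 4x4 sub-blocks, 4 for 8x8
    if (x, y) == (xE, yE):                       # upper left corner (A)
        return 0, 0
    if (x, y) == (xE + width - sub_size, yE):    # upper right corner (B)
        return width, 0
    if num_cpoints == 3 and (x, y) == (xE, yE + height - sub_size):
        return 0, height                         # lower left corner (C)
    return (x - xE) + half, (y - yE) + half      # ordinary sub-block centre
```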
  • Further, the first motion vector of the current sub-block can be determined based on the sub-block position and the difference variables; finally, by traversing each sub-block of the current block, the first motion vector of each sub-block can be obtained, and the motion vector set of the multiple sub-blocks of the current block can then be obtained.
  • the decoder may determine the first motion vector mvE (mvE_x, mvE_y) of the sub-block in the following manner:
  • mvE_x = Clip3(-131072, 131071, Rounding((mv0_x << 7) + dHorX × xPos + dVerX × yPos, 7));
  • mvE_y = Clip3(-131072, 131071, Rounding((mv0_y << 7) + dHorY × xPos + dVerY × yPos, 7)).
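  • A minimal sketch of the mvE computation above; Rounding(x, s) is assumed here to add half before the right shift (the text does not define it, and the standard's Rounding operator is sign-aware, so negative inputs may round slightly differently):

```python
def clip3(lo, hi, v):
    """Clip3(lo, hi, v): clamp v into [lo, hi]."""
    return max(lo, min(hi, v))

def rounding(value, shift):
    # Round-to-nearest right shift; the exact sign handling of the real
    # Rounding() operator is an assumption in this sketch.
    return (value + (1 << (shift - 1))) >> shift

def subblock_mv(mv0, dHorX, dVerX, dHorY, dVerY, xPos, yPos):
    """Evaluate mvE(mvE_x, mvE_y) per the formulas above."""
    mvE_x = clip3(-131072, 131071,
                  rounding((mv0[0] << 7) + dHorX * xPos + dVerX * yPos, 7))
    mvE_y = clip3(-131072, 131071,
                  rounding((mv0[1] << 7) + dHorY * xPos + dVerY * yPos, 7))
    return mvE_x, mvE_y
```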
  • When determining the deviation between the motion vector of each position in the sub-block and the motion vector of the sub-block, if the current block uses an affine prediction model, the deviation can be calculated according to the formula of the affine prediction model.
  • The deviation is obtained by subtracting the motion vector of the sub-block from the motion vector of each position within the sub-block. The motion vectors of the sub-blocks may all be taken at the same relative position within each sub-block; for example, a 4x4 block uses the position (2, 2) from the upper left corner, and an 8x8 block uses the position (4, 4) from the upper left corner.
  • This applies to the affine models used in VVC and AVS3, where the motion vector deviation at the same relative position of every sub-block is the same.
  • In AVS3, in the case of 3 control points, the positions used by the sub-blocks at the upper left corner, upper right corner, and lower left corner (the A, B, C positions in the above AVS3 text, as shown in FIG. 7) are different from the positions used by other sub-blocks; correspondingly, the calculation of the motion vector deviation of the upper left, upper right, and lower left sub-blocks in the case of three control points is also different from that of other sub-blocks.
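  • Because the affine motion field is linear, the deviation between the motion vector at a pixel and the motion vector of its sub-block depends only on the pixel's offset from the position at which the sub-block motion vector was evaluated. A sketch using the difference variables (final precision shifts omitted):

```python
def pixel_mv_deviation(i, j, ci, cj, dHorX, dVerX, dHorY, dVerY):
    """Deviation of the motion vector at pixel (i, j) of a sub-block from
    the sub-block motion vector, which was evaluated at (ci, cj), e.g.
    (2, 2) for a 4x4 sub-block and (4, 4) for an 8x8 sub-block.

    Since the affine field is linear, the deviation depends only on
    (i - ci, j - cj), so it is identical at the same relative position of
    every ordinary sub-block.
    """
    dmv_x = dHorX * (i - ci) + dVerX * (j - cj)
    dmv_y = dHorY * (i - ci) + dVerY * (j - cj)
    return dmv_x, dmv_y
```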
  • Step 303: Determine the first predicted value of the current sub-block based on the first motion vector, and determine the target pixel position corresponding to the current pixel position; wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used when performing secondary prediction or PROF processing on the pixel at the current pixel position.
  • The decoder may first determine the first predicted value of the current sub-block based on the first motion vector of the current sub-block, and then determine the target pixel position corresponding to the current pixel position in the current sub-block.
  • the target pixel position is a pixel position adjacent to the current pixel position.
  • the current pixel position is the position of a pixel in the current sub-block of the current block, where the current pixel position may represent the position of the pixel to be processed.
  • the current pixel position may be the position of the pixel to be re-predicted, or may be the position of the pixel to be PROF processed.
  • The target pixel position is a pixel position around the current pixel and adjacent to the current pixel position. Specifically, if the current pixel position is the position of a pixel to be re-predicted, then based on the shape of the filter used in the secondary prediction, the pixel positions adjacent to the current pixel position in all directions can be determined as the target pixel positions; if the current pixel position is the position of a pixel to be PROF processed, the pixel positions adjacent to the current pixel position in the horizontal and vertical directions can be determined as the target pixel positions.
  • If secondary prediction is performed on the current pixel position, the target pixel positions are the pixel positions adjacent to the current pixel position in 8 directions: top, bottom, left, right, top left, bottom left, top right, and bottom right. If PROF processing is performed on the current pixel position, the target pixel positions are the two pixel positions (left and right) adjacent to the current pixel position in the horizontal direction and the two pixel positions (up and down) adjacent to it in the vertical direction.
  • All of the target pixel positions may belong to the current sub-block corresponding to the current pixel position, or some of the target pixel positions may not belong to the current sub-block corresponding to the current pixel position. That is to say, the target pixel position and the current pixel position may belong to the same sub-block, or may belong to different sub-blocks.
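  • The relationship between the current pixel position and its target pixel positions can be sketched as follows (the mode names "secondary" and "prof" are illustrative labels, not identifiers from the text):

```python
def target_positions(x, y, mode):
    """Neighbours of the current pixel (x, y): the 8 surrounding positions
    for secondary prediction, the 4 horizontal/vertical neighbours for
    PROF processing."""
    if mode == "secondary":
        return [(x + dx, y + dy)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dx, dy) != (0, 0)]
    return [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]  # PROF

def in_current_subblock(pos, sub_x, sub_y, sub_w, sub_h):
    """True if pos lies inside the current sub-block with upper-left corner
    (sub_x, sub_y) and size sub_w x sub_h."""
    px, py = pos
    return sub_x <= px < sub_x + sub_w and sub_y <= py < sub_y + sub_h
```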
  • step 303 may specifically include:
  • Step 303a Determine the first predicted value of the current sub-block based on the first motion vector.
  • Step 303b Determine the target pixel position corresponding to the current pixel position.
  • The inter-frame prediction method proposed in the embodiments of the present application does not limit the order in which the decoder performs step 303a and step 303b; that is, in the present application, after determining the first motion vector of each sub-block of the current block, the decoder may perform step 303a first and then step 303b, may perform step 303b first and then step 303a, or may perform step 303a and step 303b simultaneously.
  • When determining the first predicted value of the current sub-block based on the first motion vector, the decoder may first determine a sample matrix, where the sample matrix includes a luminance sample matrix and a chrominance sample matrix; the first predicted value may then be determined from the prediction reference mode, the sub-block size parameter, the sample matrix, and the motion vector set.
  • When the decoder determines the first predicted value according to the prediction reference mode, the sub-block size parameter, the sample matrix, and the motion vector set, it can first determine the target motion vector from the motion vector set according to the prediction reference mode and the sub-block size parameter; then the reference picture queue and reference index corresponding to the prediction reference mode, the sample matrix, and the target motion vector can be used to determine the prediction sample matrix, where the prediction sample matrix includes the first predicted values of the multiple sub-blocks.
  • the sample matrix may include a luma sample matrix and a chroma sample matrix.
  • Accordingly, the prediction sample matrix determined by the decoder may include a luma prediction sample matrix and a chrominance prediction sample matrix, where the luma prediction sample matrix includes the first luma predicted values of the multiple sub-blocks, the chrominance prediction sample matrix includes the first chrominance predicted values of the multiple sub-blocks, and the first luma predicted value and the first chrominance predicted value together constitute the first predicted value of a sub-block.
  • Assume the position of the upper left corner sample of the current block in the luminance sample matrix of the current image is (xE, yE). If the prediction reference mode of the current block is 0, that is, the first reference mode 'PRED_List0' is used, and the value of the sub-block size flag is 0, that is, the sub-block size parameter is 4×4, then the target motion vector mv0E0 is the first motion vector, in the motion vector set of the current block, of the 4×4 sub-block at the (xE+x, yE+y) position.
  • The value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x) << 4) + mv0E0_x, ((yE+y) << 4) + mv0E0_y) in the 1/16-precision luma sample matrix whose reference index is RefIdxL0 in reference picture queue 0; the value of the corresponding element in the chrominance prediction sample matrix is taken from the 1/32-precision chrominance sample matrix whose reference index is RefIdxL0 in reference picture queue 0.
  • mv2E0 is the first motion vector of the 4×4 unit at the (x1+4, y1) position in the motion vector set of the current block;
  • mv3E0 is the first motion vector of the 4×4 unit at the (x1, y1+4) position in the motion vector set of the current block;
  • mv4E0 is the first motion vector of the 4×4 unit at the (x1+4, y1+4) position in the motion vector set of the current block.
  • MvC_x and MvC_y can be determined in the following ways:
  • MvC_x = (mv1E0_x + mv2E0_x + mv3E0_x + mv4E0_x + 2) >> 2
  • MvC_y = (mv1E0_y + mv2E0_y + mv3E0_y + mv4E0_y + 2) >> 2
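  • The MvC derivation above, averaging the first motion vectors of the four 4×4 units that cover one chroma unit with round-to-nearest, can be sketched as:

```python
def chroma_mv(mv1, mv2, mv3, mv4):
    """Average of the four 4x4 luma sub-block motion vectors covering one
    chroma unit. The +2 gives round-to-nearest before the >> 2; Python's
    >> floors, matching integer-arithmetic shift semantics."""
    mvc_x = (mv1[0] + mv2[0] + mv3[0] + mv4[0] + 2) >> 2
    mvc_y = (mv1[1] + mv2[1] + mv3[1] + mv4[1] + 2) >> 2
    return mvc_x, mvc_y
```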
  • Assume the position of the upper left corner sample of the current block in the luminance sample matrix of the current image is (xE, yE). If the prediction reference mode of the current block is 0, that is, the first reference mode 'PRED_List0' is used, and the value of the sub-block size flag is 1, that is, the sub-block size parameter is 8×8, then the target motion vector mv0E0 is the first motion vector, in the motion vector set of the current block, of the 8×8 unit at the (xE+x, yE+y) position.
  • The value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x) << 4) + mv0E0_x, ((yE+y) << 4) + mv0E0_y) in the 1/16-precision luma sample matrix whose reference index is RefIdxL0 in reference picture queue 0; the value of the corresponding element in the chrominance prediction sample matrix is taken from the 1/32-precision chrominance sample matrix whose reference index is RefIdxL0 in reference picture queue 0.
  • MvC_x is equal to mv0E0_x, and MvC_y is equal to mv0E0_y.
  • Assume the position of the upper left corner sample of the current block in the luminance sample matrix of the current image is (xE, yE). If the prediction reference mode of the current block is 1, that is, the second reference mode 'PRED_List1' is used, and the value of the sub-block size flag is 0, that is, the sub-block size parameter is 4×4, then the target motion vector mv0E1 is the first motion vector, in the motion vector set of the current block, of the 4×4 unit at the (xE+x, yE+y) position.
  • The value of the element predMatrixL1[x][y] in the luma prediction sample matrix predMatrixL1 is the sample value at position (((xE+x) << 4) + mv0E1_x, ((yE+y) << 4) + mv0E1_y) in the 1/16-precision luma sample matrix whose reference index is RefIdxL1 in reference picture queue 1; the value of the corresponding element in the chrominance prediction sample matrix is taken from the 1/32-precision chrominance sample matrix whose reference index is RefIdxL1 in reference picture queue 1.
  • mv1E1 is the first motion vector of the 4×4 unit at the (x1, y1) position in the motion vector set MvArray;
  • mv2E1 is the first motion vector of the 4×4 unit at the (x1+4, y1) position in the motion vector set MvArray;
  • mv3E1 is the first motion vector of the 4×4 unit at the (x1, y1+4) position in the motion vector set MvArray;
  • mv4E1 is the first motion vector of the 4×4 unit at the (x1+4, y1+4) position in the motion vector set MvArray.
  • MvC_x and MvC_y can be determined in the following ways:
  • MvC_x = (mv1E1_x + mv2E1_x + mv3E1_x + mv4E1_x + 2) >> 2
  • MvC_y = (mv1E1_y + mv2E1_y + mv3E1_y + mv4E1_y + 2) >> 2
  • Assume the position of the upper left corner sample of the current block in the luminance sample matrix of the current image is (xE, yE). If the prediction reference mode of the current block is 1, that is, the second reference mode 'PRED_List1' is used, and the value of the sub-block size flag is 1, that is, the sub-block size parameter is 8×8, then the target motion vector mv0E1 is the first motion vector, in the motion vector set of the current block, of the 8×8 unit at the (xE+x, yE+y) position.
  • The value of the element predMatrixL1[x][y] in the luma prediction sample matrix predMatrixL1 is the sample value at position (((xE+x) << 4) + mv0E1_x, ((yE+y) << 4) + mv0E1_y) in the 1/16-precision luma sample matrix whose reference index is RefIdxL1 in reference picture queue 1; the value of the corresponding element in the chrominance prediction sample matrix is taken from the 1/32-precision chrominance sample matrix whose reference index is RefIdxL1 in reference picture queue 1.
  • MvC_x is equal to mv0E1_x, and MvC_y is equal to mv0E1_y.
  • Assume the position of the upper left corner sample of the current block in the luminance sample matrix of the current image is (xE, yE). If the prediction reference mode of the current block is 2, that is, the third reference mode 'PRED_List01' is used, then the target motion vector mv0E0 is the first motion vector of the 8×8 unit, in the motion vector set of the current block, at the (xE+x, yE+y) position, and the target motion vector mv0E1 is the first motion vector of the 8×8 unit, in the motion vector set of the current block, at the (x, y) position.
  • The value of the element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x) << 4) + mv0E0_x, ((yE+y) << 4) + mv0E0_y) in the 1/16-precision luma sample matrix whose reference index is RefIdxL0 in reference picture queue 0; the value of the corresponding element in the chrominance prediction sample matrix is taken from the 1/32-precision chrominance sample matrix whose reference index is RefIdxL0 in reference picture queue 0.
  • MvC0_x is equal to mv0E0_x
  • MvC0_y is equal to mv0E0_y
  • MvC1_x is equal to mv0E1_x
  • MvC1_y is equal to mv0E1_y.
  • The luminance sample matrix in the sample matrix may be a 1/16-precision luminance sample matrix, and the chrominance sample matrix in the sample matrix may be a 1/32-precision chrominance sample matrix.
  • The reference picture queue and reference index obtained by the decoder by parsing the code stream differ depending on the prediction reference mode.
  • When the decoder determines the sample matrix, it may first obtain the luminance interpolation filter coefficients and the chrominance interpolation filter coefficients; the luminance sample matrix may then be determined based on the luminance interpolation filter coefficients, and the chrominance sample matrix may be determined based on the chrominance interpolation filter coefficients.
  • When the decoder determines the luminance sample matrix, it can first parse the code stream to obtain the luminance interpolation filter coefficients shown in Table 1, and then calculate the luminance sample matrix according to the pixel position and the sample position.
  • a(x,0) = Clip1((fL[x][0]×A(-3,0) + fL[x][1]×A(-2,0) + fL[x][2]×A(-1,0) + fL[x][3]×A(0,0) + fL[x][4]×A(1,0) + fL[x][5]×A(2,0) + fL[x][6]×A(3,0) + fL[x][7]×A(4,0) + 32) >> 6).
  • a(0,y) = Clip1((fL[y][0]×A(0,-3) + fL[y][1]×A(0,-2) + fL[y][2]×A(0,-1) + fL[y][3]×A(0,0) + fL[y][4]×A(0,1) + fL[y][5]×A(0,2) + fL[y][6]×A(0,3) + fL[y][7]×A(0,4) + 32) >> 6).
  • a(x,y) = Clip1((fL[y][0]×a'(x,y-3) + fL[y][1]×a'(x,y-2) + fL[y][2]×a'(x,y-1) + fL[y][3]×a'(x,y) + fL[y][4]×a'(x,y+1) + fL[y][5]×a'(x,y+2) + fL[y][6]×a'(x,y+3) + fL[y][7]×a'(x,y+4) + (1 << (19-BitDepth))) >> (20-BitDepth)).
  • a'(x,y) = (fL[x][0]×A(-3,y) + fL[x][1]×A(-2,y) + fL[x][2]×A(-1,y) + fL[x][3]×A(0,y) + fL[x][4]×A(1,y) + fL[x][5]×A(2,y) + fL[x][6]×A(3,y) + fL[x][7]×A(4,y) + ((1 << (BitDepth-8)) >> 1)) >> (BitDepth-8).
  • When the decoder determines the chrominance sample matrix, it can first parse the code stream to obtain the chrominance interpolation filter coefficients shown in Table 2 above, and then calculate the chrominance sample matrix according to the pixel position and the sample position.
  • a(x,y)(0,dy) = Clip3(0, (1 << BitDepth) - 1, (fC[dy][0]×A(x,y-1) + fC[dy][1]×A(x,y) + fC[dy][2]×A(x,y+1) + fC[dy][3]×A(x,y+2) + 32) >> 6)
  • a(x,y)(dx,0) = Clip3(0, (1 << BitDepth) - 1, (fC[dx][0]×A(x-1,y) + fC[dx][1]×A(x,y) + fC[dx][2]×A(x+1,y) + fC[dx][3]×A(x+2,y) + 32) >> 6)
  • a(x,y)(dx,dy) = Clip3(0, (1 << BitDepth) - 1, (fC[dy][0]×a'(x,y-1)(dx,0) + fC[dy][1]×a'(x,y)(dx,0) + fC[dy][2]×a'(x,y+1)(dx,0) + fC[dy][3]×a'(x,y+2)(dx,0) + (1 << (19-BitDepth))) >> (20-BitDepth))
  • Step 304 If the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain the updated pixel position.
  • After the decoder determines the first predicted value of the current sub-block based on the first motion vector and determines the target pixel position corresponding to the current pixel position, if the target pixel position does not belong to the current sub-block, that is, the target pixel position and the current pixel position belong to different sub-blocks, the decoder can update the target pixel position according to the current sub-block to obtain the updated pixel position.
  • A current pixel position may correspond to a plurality of target pixel positions. If all of the target pixel positions corresponding to the current pixel position belong to the current sub-block, it is determined that the target pixel positions and the current pixel position belong to the same sub-block; if at least one of the target pixel positions does not belong to the current sub-block, it is determined that the target pixel position and the current pixel position belong to different sub-blocks.
  • FIG. 16 is a schematic diagram 2 of the implementation flow of the inter-frame prediction method. Before the decoder updates the target pixel position according to the current sub-block and obtains the updated pixel position, that is, before step 304, the method for the decoder to perform inter-frame prediction may further include the following steps:
  • Step 306 Determine whether the current sub-block satisfies a preset restriction condition; wherein, the preset restriction condition is used to determine whether to limit the target pixel position within the current sub-block.
  • Step 307 If the preset restriction condition is satisfied, determine whether the target pixel position belongs to the current sub-block.
  • Step 308 If the preset restriction conditions are not met, determine a second predicted value based on the first predicted value and the target pixel position, and determine the second predicted value as an inter-frame predicted value.
  • The decoder may first determine whether the current sub-block satisfies the preset restriction conditions. If the preset restriction conditions are met, the decoder may continue to determine whether the target pixel position belongs to the current sub-block, so as to further decide whether to perform restriction processing on the target pixel position, that is, whether to perform update processing on the target pixel position. If the preset restriction conditions are not met, the decoder does not need to perform restriction processing on the target pixel position; instead, based directly on the first predicted value and the target pixel position, it further determines the corrected predicted value corresponding to the current pixel position, and after traversing each pixel position in the current sub-block to obtain the corrected predicted value corresponding to each pixel position, it determines the inter-frame predicted value of the current sub-block.
  • The preset restriction condition is preset by the decoder to determine whether to limit the target pixel position to within the current sub-block; that is to say, in the present application, the decoder may use the preset restriction condition to determine whether to restrict the current pixel position and the target pixel position to belong to the same sub-block.
  • If the preset restriction conditions are met, it can be considered that the current pixel position and the target pixel position need to be limited to the same sub-block, so it is necessary to continue to determine whether the target pixel position belongs to the current sub-block corresponding to the current pixel position.
  • If the preset restriction conditions are not met, it can be considered that the current pixel position and the target pixel position do not need to be restricted to the same sub-block, so the target pixel position can be directly used to perform secondary prediction or PROF processing on the current pixel position, and the inter-frame predicted value corresponding to the current sub-block is finally obtained.
  • The decoder may use the preset restriction conditions to limit whether all pixel positions used in secondary prediction or PROF processing belong to the same sub-block. Specifically, if the preset restriction conditions are met, the pixel values required for secondary prediction or PROF processing use the pixel values of the current sub-block, that is, all pixel positions are required to use the data of the current sub-block, and the affine prediction effect of the current pixel position can be improved according to the methods proposed in steps 305 to 307 to obtain the corresponding inter-frame predicted value; if the preset restriction conditions are not met, the pixel values required for secondary prediction or PROF processing may use the pixel values of other sub-blocks, that is, it is not required that all pixel positions use the data of the current sub-block, and the affine prediction effect can be improved for the current pixel position according to the method proposed in step 308 to obtain the corresponding inter-frame predicted value.
  • the decoder may determine whether the current sub-block satisfies the preset restriction condition in various ways. Specifically, the decoder can further determine whether the preset restriction condition is satisfied through the motion vector.
  • The decoder may first determine the first motion vector deviation based on the control point motion vector group; the first motion vector deviation is then compared with the preset deviation threshold, and if the first motion vector deviation is greater than or equal to the preset deviation threshold, it can be determined that the preset restriction conditions are met; if the first motion vector deviation is smaller than the preset deviation threshold, it is determined that the preset restriction conditions are not met.
  • the first motion vector deviation may be the difference between motion vectors of two control points of the current block.
  • In the affine mode, if the 4-parameter (2 control points) mode is used, the decoder can directly determine the difference between the motion vectors mv0 and mv1 of the two control points in the control point motion vector group mvsAffine(mv0, mv1) as the first motion vector deviation; if the 6-parameter (3 control points) mode is used, the decoder can determine the difference between the motion vectors of any two control points in the control point motion vector group mvsAffine(mv0, mv1, mv2) as the first motion vector deviation, or determine the maximum difference between the motion vectors in the group as the first motion vector deviation.
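  • A sketch of the control-point deviation check, using the maximum component-wise difference variant mentioned above; the threshold is supplied by the caller, since the text ties it to the block or sub-block size:

```python
def max_cp_mv_deviation(mvs_affine):
    """First motion vector deviation: largest component-wise difference
    between any two control point motion vectors in mvsAffine."""
    dev = 0
    for i in range(len(mvs_affine)):
        for j in range(i + 1, len(mvs_affine)):
            dev = max(dev,
                      abs(mvs_affine[i][0] - mvs_affine[j][0]),
                      abs(mvs_affine[i][1] - mvs_affine[j][1]))
    return dev

def restriction_applies(mvs_affine, threshold):
    """The preset restriction condition is met when the deviation reaches
    the preset deviation threshold."""
    return max_cp_mv_deviation(mvs_affine) >= threshold
```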
  • When the first motion vector deviation, which represents the difference between the motion vectors of the control points of the current block, is used to judge whether the preset restriction conditions are satisfied, the setting of the preset deviation threshold may be related to the size of the current block or the size of the sub-blocks of the current block used by the current affine prediction.
  • In this case, either all sub-blocks of the current block meet the preset restriction conditions, or all sub-blocks of the current block do not meet the preset restriction conditions.
  • The decoder may first determine the second motion vector deviation based on the first motion vector of each sub-block; the second motion vector deviation is then compared with the preset deviation threshold, and if the second motion vector deviation is greater than or equal to the preset deviation threshold, it is determined that the preset restriction conditions are met; if the second motion vector deviation is smaller than the preset deviation threshold, it is determined that the preset restriction conditions are not met.
  • the second motion vector deviation may be the difference between the motion vectors of any two or more sub-blocks in the current block.
  • The relative positions, within the sub-blocks, of the pixel positions whose motion vectors are used as the sub-block motion vectors are not all the same, as shown in FIG. 7; therefore, the motion vector differences between different sub-blocks are not exactly the same, and the motion vector difference between different sub-blocks of the current block can be used as the second motion vector deviation to judge whether the current sub-block satisfies the preset restriction conditions.
  • When the second motion vector deviation, which represents the motion vector difference between different sub-blocks of the current block, is used to judge whether the preset restriction conditions are satisfied, the setting of the preset deviation threshold may be related to the size of the sub-blocks of the current block used by the current affine prediction.
  • Whether each sub-block of the current block satisfies the preset restriction conditions can be determined based on the preset deviation threshold and the corresponding second motion vector deviation; except for the sub-blocks A, B, and C shown in FIG. 7 and the sub-blocks adjacent to them, all other sub-blocks conform to the same rule, so the other sub-blocks can share the same condition judgment result.
  • The decoder may first determine the third motion vector deviation between a preset pixel position in the current sub-block and the current sub-block; the third motion vector deviation is then compared with the preset deviation threshold, and if the third motion vector deviation is greater than or equal to the preset deviation threshold, it is determined that the preset restriction conditions are met; if the third motion vector deviation is smaller than the preset deviation threshold, it is determined that the preset restriction conditions are not met.
  • The third motion vector deviation may be the difference between the motion vector of a preset pixel position in the current sub-block and the motion vector of the current sub-block.
  • For different pixel positions in a sub-block, the corresponding motion vectors may be different. For each pixel position in a sub-block of the current block, except for the sub-blocks A, B, and C shown in FIG. 7 and the sub-blocks adjacent to them, all other sub-blocks conform to the same rule, so the other sub-blocks can share the same condition judgment result.
  • The decoder can select the difference between the motion vector of one or several specific pixel positions and the motion vector of the corresponding sub-block as the third motion vector deviation.
  • For example, suppose the motion vector of the pixel position at the upper left corner of a sub-block, whose position within the sub-block is (0, 0), is used as the motion vector of the sub-block, and the width of the sub-block is sub_w and its height is sub_h; then the difference between the motion vector at a pixel position and the motion vector of the A-type sub-block can be used as the third motion vector deviation corresponding to the A-type sub-block, and this difference can also be used as the third motion vector deviation corresponding to the current block.
  • the difference between the motion vector of the pixel position in the lower left corner of the sub-block and the motion vector of the corresponding sub-block can be selected as the third motion vector deviation;
  • the difference between the motion vector of the pixel position in the upper right corner of the sub-block and the motion vector of the corresponding sub-block is selected as the third motion vector deviation.
  • the preset deviation threshold may be determined according to the size parameter of the current block and/or the sub-block size parameter.
  • the decoder may update the target pixel position according to the current sub-block, so as to obtain the updated pixel position.
  • the decoder can obtain the updated pixel position in various ways.
  • When the decoder updates the target pixel position according to the current sub-block to obtain the updated pixel position, it can first perform expansion processing on the current sub-block to obtain the expanded sub-block; the updated pixel position corresponding to the target pixel position can then be determined in the expanded sub-block.
  • In order to use the pixel values of the current sub-block for all required pixel values during secondary prediction or PROF processing, that is, to restrict the pixel positions that are used to the current sub-block, the decoder can extend the current sub-block based on the prediction values of the sub-block, that is, the current sub-block is expanded first. Specifically, since the distance between the target pixel position and the current pixel position used in secondary prediction or PROF processing is 1 pixel position, even if the target pixel position does not belong to the current sub-block, it exceeds the boundary of the current sub-block by only one pixel position.
  • Therefore, when the decoder expands the current sub-block, it only needs to expand the prediction of the current sub-block by one or two rows of pixels and/or one or two columns of pixels to finally obtain the expanded sub-block.
  • FIG. 17 is a schematic diagram of expanding the current sub-block.
  • the sub-block-based prediction of the current sub-block can be expanded at its upper, lower, left, and right boundaries.
  • a simple expansion method is to copy, to the pixels extended on the left side, the pixel value of the left boundary in the corresponding horizontal position; to the pixels extended on the right side, the pixel value of the right boundary in the corresponding horizontal position; to the pixels extended on the upper side, the pixel value of the upper boundary in the corresponding vertical position; to the pixels extended on the lower side, the pixel value of the lower boundary in the corresponding vertical position; and to the four extended corners, the pixel value of the corresponding vertex.
  • when the decoder performs expansion processing on the current sub-block to obtain the expanded sub-block, it can choose to expand all the boundaries of the current sub-block, or it can choose to expand only the boundaries of the rows and/or columns corresponding to the target pixel position in the current sub-block.
  • that is, the expansion processing may be performed on all four boundaries of the current sub-block (upper, lower, left, and right), or only on the one or two boundaries corresponding to the target pixel position. For example, if a target pixel position belongs to the sub-block to the left of the current sub-block, then the left boundary of the current sub-block can be extended while the other three boundaries are not.
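  • the simple copy-based expansion described above can be sketched as follows (an illustrative Python sketch, not part of any standard; the function name and the list-of-rows representation are assumptions):

```python
def expand_subblock(block):
    """Expand a sub-block's prediction by one pixel on every side by
    edge replication: left/right pixels copy the boundary pixel in the
    corresponding horizontal position, upper/lower pixels copy the
    boundary pixel in the corresponding vertical position, and the four
    new corners copy the corresponding vertex. `block` is a list of rows."""
    # Widen each row by replicating its first and last pixel.
    rows = [[r[0]] + r + [r[-1]] for r in block]
    # Heighten by replicating the (already widened) top and bottom rows.
    return [rows[0][:]] + rows + [rows[-1][:]]
```

A 2x2 sub-block thus becomes 4x4, with each boundary value copied outward once.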
  • alternatively, when the decoder updates the target pixel position according to the current sub-block to obtain the updated pixel position, it can also replace the target pixel position with the pixel position in the current sub-block that is adjacent to the target pixel position, thereby obtaining the updated pixel position.
  • that is, if a target pixel position falls outside the current sub-block, the corresponding pixel position can be adjusted to a pixel position within the current sub-block. For example, if the pixel position of the upper left corner of the current sub-block is (0, 0), the width of the current sub-block is sub_width, and the height of the current sub-block is sub_height, then the horizontal range of the current sub-block is 0 to (sub_width-1), and the vertical range of the current sub-block is 0 to (sub_height-1).
  • assuming the pixel position to be used by the secondary prediction filter or PROF processing is (x, y): if x is less than 0, x is set to 0; if x is greater than sub_width-1, x is set to sub_width-1; similarly, if y is less than 0, y is set to 0 (and, by the same rule, a y greater than sub_height-1 would be set to sub_height-1).
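  • the clamping rule above can be sketched as follows; the source spells out the cases for x and the first case for y, so the remaining y case is assumed by symmetry, and the function name is illustrative:

```python
def clamp_to_subblock(x, y, sub_width, sub_height):
    """Clamp a pixel position (x, y) used by the secondary-prediction
    filter or PROF processing into the current sub-block's range:
    horizontal range 0..(sub_width-1), vertical range 0..(sub_height-1)."""
    x = max(0, min(x, sub_width - 1))
    y = max(0, min(y, sub_height - 1))
    return x, y
```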
  • Step 305 Based on the first predicted value and the updated pixel position, determine a second predicted value corresponding to the current sub-block, and determine the second predicted value as the inter-frame predicted value of the current sub-block.
  • after the decoder updates the target pixel position according to the current sub-block and obtains the updated pixel position, it can determine, based on the first predicted value and the updated pixel position, the corrected predicted value corresponding to the current pixel position; after traversing each pixel position in the current sub-block to obtain the corrected predicted value corresponding to each pixel position, it determines the second predicted value corresponding to the current sub-block, and then the second predicted value may be determined as the inter-frame predicted value of the current sub-block.
  • that is, the decoder may perform secondary prediction or PROF processing on the current pixel position based on the first predicted value and the updated pixel position, so as to obtain the corresponding second predicted value.
  • the method for the decoder to determine the second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position may include the following steps:
  • Step 305a parsing the code stream to obtain PROF parameters
  • Step 305b when the PROF parameter indicates that PROF processing is performed, the pixel horizontal gradient and the pixel vertical gradient between the current pixel position and the updated pixel position are determined based on the first predicted value;
  • Step 305c determine the fourth motion vector deviation between the updated pixel position and the current sub-block
  • Step 305d according to the pixel horizontal gradient, the pixel vertical gradient and the fourth motion vector deviation, calculate the deviation value corresponding to the current pixel position;
  • Step 305e based on the first predicted value and the deviation value, obtain a second predicted value.
  • that is to say, the decoder may first parse the code stream to obtain the PROF parameter. If the PROF parameter indicates to perform PROF processing, the decoder may determine, based on the first predicted value, the pixel horizontal gradient and the pixel vertical gradient between the current pixel position and the updated pixel position; where the pixel horizontal gradient is the gradient value, in the horizontal direction, between the pixel value corresponding to the current pixel position and the pixel value corresponding to the updated pixel position, and the pixel vertical gradient is the gradient value, in the vertical direction, between the pixel value corresponding to the current pixel position and the pixel value corresponding to the updated pixel position.
  • the decoder can also determine the fourth motion vector deviation between the updated pixel position and the current sub-block, where the fourth motion vector deviation is the difference between the updated pixel position’s motion vector and the current sub-block’s first motion vector .
  • the decoder may calculate and obtain the deviation value corresponding to the current pixel position according to the pixel horizontal gradient, the pixel vertical gradient and the fourth motion vector deviation corresponding to the current pixel position.
  • the deviation value can be used to correct the predicted value of the pixel value of the current pixel position.
  • further, the decoder may obtain the corrected predicted value corresponding to the current pixel position according to the first predicted value and the deviation value; after traversing each pixel position in the current sub-block to obtain the corrected predicted value corresponding to each pixel position, the second predicted value corresponding to the current sub-block is determined using the corrected predicted values of all pixel positions, thereby determining the corresponding inter-frame predicted value. Specifically, in this application, after the sub-block-based prediction is completed, the first predicted value of the current sub-block is used as the predicted value of the current pixel position; the first predicted value is then added to the deviation value corresponding to the current pixel position, which completes the correction of the predicted value of the current pixel position and yields the corrected predicted value, so that the second predicted value of the current sub-block can be further obtained and used as the inter-frame predicted value corresponding to the current sub-block.
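  • as a minimal sketch of steps 305b–305e for one pixel position, the deviation value is the pixel horizontal gradient times the horizontal component of the fourth motion vector deviation plus the pixel vertical gradient times its vertical component, added to the first predicted value (the function name and the use of plain arithmetic without fixed-point rounding or clipping are assumptions):

```python
def prof_refine(pred, grad_h, grad_v, dmv_x, dmv_y):
    """PROF-style correction of one sub-block-based prediction sample:
    deviation = horizontal gradient * horizontal MV deviation
              + vertical gradient * vertical MV deviation,
    and the corrected predicted value is the first predicted value
    plus that deviation."""
    deviation = grad_h * dmv_x + grad_v * dmv_y
    return pred + deviation
```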
  • the method for the decoder to determine the second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position may include the following steps:
  • Step 305f parsing the code stream to obtain secondary prediction parameters
  • Step 305g when the secondary prediction parameter indicates that secondary prediction is used, determine the fourth motion vector deviation between the updated pixel position and the current sub-block;
  • Step 305h determining the filter coefficient of the two-dimensional filter according to the fourth motion vector deviation; wherein, the two-dimensional filter is used to perform secondary prediction processing according to a preset shape;
  • Step 305i Determine a second predicted value based on the filter coefficient and the first predicted value, and determine the second predicted value as an inter-frame predicted value.
  • the decoder may first parse the code stream to obtain secondary prediction parameters, and if the secondary prediction parameters indicate the use of secondary prediction, the decoder may determine the updated pixel position and the fourth motion of the current sub-block Vector deviation, the fourth motion vector deviation is the difference between the motion vector of the updated pixel position and the first motion vector of the current sub-block, so that the filter coefficient of the two-dimensional filter can be determined according to the fourth motion vector deviation; Among them, the two-dimensional filter is used for secondary prediction processing according to the preset shape.
  • further, the decoder may determine the fourth motion vector deviation between the current sub-block and each pixel position based on the difference variables. Specifically, following the method proposed in step 302 above, the four difference variables dHorX, dVerX, dHorY and dVerY are determined according to the control point motion vector group, the control point mode and the size parameters of the current block, and the difference variables are then used to further determine the fourth motion vector deviation corresponding to each pixel position in the sub-block.
  • width and height are the width and height of the current block obtained by the decoder, respectively, and the width subwidth and height subheight of the subblock are determined by using the subblock size parameter.
  • assuming (i, j) are the coordinates of any pixel inside the sub-block, where the value range of i is 0 to (subwidth-1) and the value range of j is 0 to (subheight-1), the fourth motion vector deviation for each pixel position (i, j) inside the 4 different types of sub-blocks can be calculated as follows:
  • dMvB[i][j][1] = dHorY×(i−subwidth)+dVerY×j;
  • if the control point motion vector group is a motion vector group including 3 motion vectors, then the fourth motion vector deviation dMvC[i][j] is:
  • dMvC[i][j][1] = dHorY×i+dVerY×(j−subheight);
  • dMvN[i][j][0] = dHorX×(i−(subwidth>>1))+dVerX×(j−(subheight>>1));
  • dMvN[i][j][1] = dHorY×(i−(subwidth>>1))+dVerY×(j−(subheight>>1)).
  • dMvX[i][j][0] represents the deviation value of the fourth motion vector deviation in the horizontal component
  • dMvX[i][j][1] represents the deviation value of the fourth motion vector deviation in the vertical component.
  • X is A, B, C or N.
  • further, after determining the fourth motion vector deviation between the current sub-block and each pixel position based on the difference variables, the decoder can use the fourth motion vector deviations corresponding to all pixel positions in the sub-block to construct the motion vector deviation matrix corresponding to the current sub-block. It can be seen that the motion vector deviation matrix includes the fourth motion vector deviation between the sub-block and any internal pixel position.
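  • building the deviation matrix for an "other" sub-block N from the four difference variables, per the dMvN formulas above, can be sketched as follows (floating-point arithmetic and the function name are assumptions; a real codec uses fixed-point values):

```python
def motion_vector_deviation_matrix(subwidth, subheight,
                                   dHorX, dVerX, dHorY, dVerY):
    """Fourth motion vector deviation dMvN for every pixel (i, j) of an
    'other' sub-block: the deviation is measured from the sub-block
    centre (subwidth>>1, subheight>>1) using the four difference
    variables. Returns dMv[i][j] = [horizontal, vertical] deviation."""
    dMv = [[[0.0, 0.0] for _ in range(subheight)] for _ in range(subwidth)]
    for i in range(subwidth):
        for j in range(subheight):
            dMv[i][j][0] = dHorX * (i - (subwidth >> 1)) + dVerX * (j - (subheight >> 1))
            dMv[i][j][1] = dHorY * (i - (subwidth >> 1)) + dVerY * (j - (subheight >> 1))
    return dMv
```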
  • the filter coefficient of the two-dimensional filter is related to the fourth motion vector deviation corresponding to the target pixel position. That is to say, for different target pixel positions, if the corresponding fourth motion vector deviations are different, the filter coefficients of the used two-dimensional filters are also different.
  • the two-dimensional filter is used to perform secondary prediction by using a plurality of adjacent pixel positions that form a preset shape.
  • the preset shape is a rectangle, a rhombus, or any symmetrical shape.
  • the two-dimensional filter used for secondary prediction is a filter formed by adjacent points forming a preset shape.
  • the adjacent points constituting the preset shape may include a plurality of points, for example, 9 points.
  • the predetermined shape may be a symmetrical shape, for example, the predetermined shape may include a rectangle, a rhombus, or any other symmetrical shape.
  • the two-dimensional filter is a rectangular filter, and specifically, the two-dimensional filter is a filter composed of 9 adjacent pixel positions forming a rectangle.
  • the pixel position located in the center is the pixel position of the pixel currently requiring secondary prediction, that is, the current pixel position.
  • further, when the decoder determines the filter coefficients of the two-dimensional filter according to the fourth motion vector deviation, it can first parse the code stream to obtain the scale parameter, and then determine the filter coefficient corresponding to each pixel position according to the scale parameter and the fourth motion vector deviation.
  • the scale parameter may include at least one scale value
  • the fourth motion vector deviation includes a horizontal deviation and a vertical deviation; wherein, at least one scale value is a non-zero real number.
  • the pixel position located in the center of the rectangle is the position to be predicted, that is, the current pixel position, and the other 8 target pixel positions are located, in order, in the eight directions of upper left, upper, upper right, right, lower right, lower, lower left, and left of the current pixel position.
  • the decoder can calculate and obtain 9 filter coefficients corresponding to 9 adjacent pixel positions based on at least one scale value and the fourth motion vector deviation of the position to be predicted according to a preset calculation rule.
  • the preset calculation rule may include various calculation methods, such as addition operations, subtraction operations, multiplication operations, and the like. For different pixel positions, different calculation methods can be used to calculate the filter coefficients.
  • some filter coefficients may be linear functions of the fourth motion vector deviation, that is, the two are in a linear relationship; they may also be quadratic or higher-order functions of the fourth motion vector deviation, that is, the two are in a non-linear relationship.
  • any one of the filter coefficients of the plurality of filter coefficients corresponding to the plurality of adjacent pixel positions may be a linear function, a quadratic function, or a higher-order function of the fourth motion vector deviation.
  • assuming the fourth motion vector deviation of a pixel position is (dmv_x, dmv_y), where, if the coordinates of the target pixel position are (i, j), dmv_x can be expressed as dMvX[i][j][0], that is, the horizontal component of the fourth motion vector deviation, and dmv_y can be expressed as dMvX[i][j][1], that is, the vertical component of the fourth motion vector deviation.
  • Table 3 shows the filter coefficients obtained based on the fourth motion vector deviation (dmv_x, dmv_y).
  • assuming the horizontal deviation of the fourth motion vector deviation is dmv_x and the vertical deviation is dmv_y, the 9 filter coefficients corresponding to the 9 adjacent pixel positions can be obtained, wherein the decoder can directly set the filter coefficient of the current pixel position in the center to 1.
  • the scale parameters m and n are generally decimals or fractions; one possible situation is that both m and n are reciprocals of powers of 2, such as 1/2, 1/4, 1/8 and so on.
  • dmv_x and dmv_y are expressed at their actual scale, that is, a value of 1 for dmv_x or dmv_y represents a distance of 1 pixel, and dmv_x and dmv_y are decimals or fractions.
  • it should be noted that the motion vectors of the integer pixel position and the sub-pixel position corresponding to the current common 8-tap filter are non-negative in both the horizontal and vertical directions, with magnitudes between 0 pixels and 1 pixel, that is, dmv_x and dmv_y cannot be negative; whereas for the two-dimensional filter used here, the motion vectors of the integer pixel position and the sub-pixel position may be negative in both the horizontal and vertical directions, that is, dmv_x and dmv_y may be negative.
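  • Table 3 is not reproduced here, so the following sketch only illustrates the stated structure, not the actual coefficients: eight neighbour coefficients expressed as hypothetical linear functions of (dmv_x, dmv_y) with scale values m and n, and a centre coefficient set directly to 1:

```python
def neighbour_coefficients(dmv_x, dmv_y, m, n):
    """Hypothetical 3x3 filter-coefficient layout for the rectangular
    two-dimensional filter: each neighbour coefficient is a linear
    function of the fourth motion vector deviation (dmv_x, dmv_y)
    scaled by m and n; the centre coefficient is set to 1. The real
    values come from Table 3, which is not reproduced in this text."""
    return {
        "UPLEFT":    m * (-dmv_x) + n * (-dmv_y),
        "UP":                       n * (-dmv_y),
        "UPRIGHT":   m * ( dmv_x) + n * (-dmv_y),
        "LEFT":      m * (-dmv_x),
        "CENTER":    1,
        "RIGHT":     m * ( dmv_x),
        "DOWNLEFT":  m * (-dmv_x) + n * ( dmv_y),
        "DOWN":                     n * ( dmv_y),
        "DOWNRIGHT": m * ( dmv_x) + n * ( dmv_y),
    }
```

Note that, consistent with the text above, dmv_x and dmv_y (and hence the coefficients) may be negative.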
  • Fig. 18 is a schematic diagram 1 of a two-dimensional filter.
  • as shown in FIG. 18, the sub-block-based prediction result is used as the basis for the secondary prediction; the light-colored squares are the integer pixel positions of the filter, that is, the positions predicted based on the sub-block; the circle is the sub-pixel position that needs to be predicted a second time, that is, the current pixel position; and the dark squares are the integer pixel positions corresponding to the sub-pixel position. In this case, the 9 integer pixel positions shown in the figure are required.
  • Fig. 19 is a schematic diagram 2 of a two-dimensional filter.
  • as shown in FIG. 19, the sub-block-based prediction result is used as the basis for the secondary prediction; the light-colored squares are the integer pixel positions of the filter, that is, the positions predicted based on the sub-block; the circle is the sub-pixel position that needs to be predicted a second time, that is, the current pixel position; and the dark squares are the integer pixel positions corresponding to the sub-pixel position. In this case, the 13 integer pixel positions shown in the figure are required.
  • further, after determining the filter coefficients, the decoder can determine the second predicted value of the current sub-block based on the filter coefficients and the first predicted value, so that the secondary prediction of the current sub-block can be realized.
  • in this application, the decoder determines the filter coefficients using the fourth motion vector deviation corresponding to the pixel position, so that the first predicted value can be corrected through the two-dimensional filter according to the filter coefficients, obtaining the corrected second predicted value of the current sub-block. It can be seen that the second predicted value is a corrected value based on the first predicted value.
  • specifically, when determining the second predicted value of the current sub-block based on the filter coefficients and the first predicted value, the decoder may first multiply each filter coefficient by the corresponding first predicted value to obtain a product result; after traversing all pixel positions in the current sub-block, the product results of all pixel positions are added to obtain a summation result; finally the summation result is normalized, yielding the corrected second predicted value of the current sub-block.
  • it should be noted that the first predicted value of the current sub-block where a pixel position is located is generally used as the pre-correction predicted value of that pixel position. Therefore, when filtering through the two-dimensional filter, each filter coefficient can be multiplied by the predicted value of the corresponding pixel position, that is, the first predicted value, and the multiplication results corresponding to the pixel positions can be accumulated and then normalized.
  • the decoder can perform normalization in various ways. For example, the accumulated sum of the products of the filter coefficients and the predicted values of the corresponding pixel positions can be shifted to the right by 4+shift1 bits. Alternatively, (1<<(3+shift1)) can first be added to the accumulated sum, which is then shifted to the right by 4+shift1 bits.
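  • the second normalization variant described above (add a rounding offset of 1<<(3+shift1), then shift right by 4+shift1 bits) can be sketched as follows, under the assumption that the integer filter coefficients are scaled so that they sum to 1<<(4+shift1):

```python
def filter_and_normalize(coeffs, preds, shift1):
    """Apply the secondary-prediction filter at one pixel position:
    multiply each filter coefficient by the first predicted value at the
    corresponding pixel position, accumulate, add the rounding offset
    1 << (3 + shift1), then shift right by 4 + shift1 bits."""
    acc = sum(c * p for c, p in zip(coeffs, preds))
    return (acc + (1 << (3 + shift1))) >> (4 + shift1)
```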
  • that is to say, after obtaining the fourth motion vector deviation corresponding to each pixel position inside the current sub-block, for each sub-block and each pixel position in the sub-block, filtering can be performed with the two-dimensional filter according to the fourth motion vector deviation, based on the first predicted value obtained by motion compensation of the current sub-block, completing the secondary prediction of the current sub-block and obtaining a new second predicted value.
  • the two-dimensional filter can be understood as performing secondary prediction by using a plurality of adjacent pixel positions that form a preset shape.
  • the preset shape may be a rectangle, a rhombus or any symmetrical shape.
  • exemplarily, when the two-dimensional filter uses 9 adjacent pixel positions forming a rectangle to perform secondary prediction, it can first determine the prediction sample matrix of the current block and the motion vector deviation matrix of the current sub-block of the current block, where the motion vector deviation matrix includes the fourth motion vector deviations corresponding to all pixel positions; then, based on the 9 adjacent pixel positions forming a rectangle, the secondary prediction sample matrix of the current block is determined using the prediction sample matrix and the motion vector deviation matrix.
  • the width and height of the current block are width and height, respectively
  • the width and height of each sub-block are subwidth and subheight, respectively.
  • the sub-block where the upper-left sample of the luminance prediction sample matrix of the current block is located is A
  • the sub-block where the upper-right sample is located is B
  • the sub-block where the lower-left sample is located is C
  • the sub-blocks where the other positions are located are other sub-blocks.
  • the motion vector deviation matrix of the sub-block can be denoted as dMv; then:
  • if the current sub-block is A, dMv is equal to dMvA;
  • if the current sub-block is B, dMv is equal to dMvB;
  • if the current sub-block is C, dMv is equal to dMvC;
  • otherwise, dMv is equal to dMvN.
  • (x, y) is the coordinate of the upper left corner of the current sub-block
  • (i, j) is the coordinate of the pixel inside the luminance sub-block
  • the value range of i is 0 ⁇ (subwidth-1)
  • the value of j The value range is 0 ⁇ (subheight-1)
  • the prediction sample matrix based on sub-block is PredMatrixSb
  • the prediction sample matrix of secondary prediction is PredMatrixS.
  • the secondary prediction sample PredMatrixS[x+i][y+j] at position (x+i, y+j) can be calculated, and the result is then clipped as follows:
  • PredMatrixS[x+i][y+j] = Clip3(0, (1<<BitDepth)−1, PredMatrixS[x+i][y+j]).
  • it should be noted that the preset threshold may be equal to 2048, the value to which 1 integer pixel converts; that is, the case where the absolute value of the maximum value of dMv is less than 2048.
  • max(a, b) can be understood as taking the larger value of a and b
  • min(a, b) can be understood as taking the smaller value of a and b.
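  • the Clip3, max and min operators used above can be written out directly (function names follow the operators as used in the text):

```python
def clip3(a, b, v):
    """Clip3(a, b, v): clamp v into the range [a, b] using the max/min
    operators defined above (max takes the larger value, min the smaller)."""
    return max(a, min(b, v))

def clip_prediction_sample(sample, bit_depth):
    """Clip a secondary-prediction sample into the valid sample range
    for the bit depth: Clip3(0, (1 << bit_depth) - 1, sample)."""
    return clip3(0, (1 << bit_depth) - 1, sample)
```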
  • the CENTER (x+i, y+j) pixel position may be the center position of the above-mentioned 9 adjacent pixel positions forming a rectangle; secondary prediction processing is then performed based on the (x+i, y+j) pixel position and the other 8 pixel positions adjacent to it.
  • the other 8 pixel positions are UP (upper), UPRIGHT (upper right), LEFT (left), RIGHT (right), DOWNLEFT (lower left), DOWN (lower), DOWNRIGHT (lower right), UPLEFT (upper left) .
  • the precision of the calculation formula of PredMatrixS[x+i][y+j] may use a lower precision.
  • for example, the right-hand operand of each multiplication is shifted to the right: dMv[i][j][0] and dMv[i][j][1] are shifted right by shift3 bits; correspondingly, 1<<15 becomes 1<<(15−shift3), and ...+(1<<10))>>11 becomes ...+(1<<(10−shift3)))>>(11−shift3).
  • the size of the motion vector may be limited to a reasonable range, for example, the positive and negative values of the motion vector used above in the horizontal and vertical directions do not exceed 1 pixel, 1/2 pixel or 1/4 pixel, etc.
  • the decoder averages the multiple prediction sample matrices of each component to obtain the final prediction sample matrix of the component. For example, two luminance prediction sample matrices are averaged to obtain a new luminance prediction sample matrix.
  • if the current block has no transform coefficients, the prediction matrix is used as the decoding result of the current block; if the current block has transform coefficients, the transform coefficients are first decoded, the residual matrix is obtained through inverse transformation and inverse quantization, and the residual matrix is added to the prediction matrix to obtain the decoding result.
  • the decoder can update the target pixel positions that do not belong to the current sub-block to obtain the updated pixel positions, so that secondary prediction or PROF processing can be performed based on the updated pixel positions.
  • the above update process can be understood as an expansion of the boundary of the current sub-block, or as a redefinition of target pixel positions beyond the boundary of the current sub-block, so that the pixel positions that secondary prediction or PROF processing needs to use are restricted to the same sub-block, which improves prediction performance by ensuring that the pixels are connected.
  • here, that the pixel positions needed for secondary prediction are not in the same sub-block means that, for a certain pixel position in the current sub-block, that is, the current pixel position, the target pixel positions that secondary prediction needs to use for filtering do not all belong to the current sub-block; likewise, that the pixel positions needed for PROF processing are not in the same sub-block means that, for the current pixel position, the target pixel positions that PROF processing needs to use to calculate the gradient do not all belong to the current sub-block.
  • the inter-frame prediction method proposed in the embodiments of the present application can act on the entire coding unit or prediction unit, that is, on the current block; it can also act on each sub-block in the current block, or on each pixel position in any sub-block. This application does not impose any specific limitation.
  • when the inter-frame prediction method acts on the current block, the data dependency between the sub-blocks in the current block can also be reduced, so that sub-block-based secondary prediction or PROF processing can be executed in parallel; that is, when secondary prediction or PROF processing is performed on one sub-block, there is no need to wait for the sub-block-based predicted values of other sub-blocks.
  • parallel processing of sub-block-based prediction can be implemented, and parallel processing of point-based prediction (secondary prediction or PROF processing) can also be implemented on the basis of sub-block-based prediction.
  • inter-frame prediction method proposed in this application can be applied to any image component.
  • in the embodiments of the present application, a secondary prediction scheme is exemplarily applied to the luminance component, but it can also be applied to the chrominance component, or to any component in another format.
  • the inter-frame prediction method proposed in this application can also be applied to any video format, including but not limited to the YUV format, including but not limited to the luminance component of the YUV format.
  • This embodiment provides an inter-frame prediction method. After sub-block-based prediction, if the pixel positions required for secondary prediction or PROF processing are not in the same sub-block, the pixel positions that need to be used can be restricted to the same sub-block by extending the boundary of the current sub-block, redefining the target pixel positions beyond the boundary of the current sub-block, and so on. This solves the problem of degraded prediction performance caused by unconnected pixels, reduces prediction errors, and greatly improves encoding performance, thereby improving encoding and decoding efficiency.
  • FIG. 20 is a schematic diagram 3 of the implementation flow of the inter-frame prediction method.
  • the method for performing inter-frame prediction by a decoder may further include the following steps :
  • Step 309 When the prediction mode parameter indicates that the inter prediction mode is used to determine the inter prediction value of the current block, determine the extended subblock of the current subblock.
  • Step 3010 Determine the third prediction value of the extended sub-block based on the first motion vector, and determine the target pixel position corresponding to the current pixel position.
  • Step 3011 If the target pixel position does not belong to the current sub-block, determine a second predicted value based on the third predicted value and the target pixel position, and determine the second predicted value as an inter-frame predicted value.
  • the decoder may first determine the extended subblock of the current subblock .
  • the extended sub-block is determined after extending the current sub-block in the current block. Specifically, since the distance between the target pixel position and the current pixel position used in the secondary prediction or PROF processing is 1 pixel position, even if the target pixel position does not belong to the current sub-block, it only exceeds the boundary of the current sub-block by one pixel position. Therefore, when the decoder performs prediction based on the sub-block, it can directly predict the extended sub-block corresponding to the current sub-block, so as to solve the problem of poor prediction effect caused by the target pixel position not belonging to the current sub-block.
  • exemplarily, the extended sub-block corresponding to the current sub-block may add one row (column) of pixels on each of the upper, lower, left, and right sides of the current sub-block.
  • accordingly, the size parameter of the extended sub-block, that is, the size of the corresponding extended sub-block, can be 10x10 or 6x6.
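  • the extension by one pixel on each side implies an extended sub-block size of (subwidth+2) x (subheight+2), matching the 10x10 and 6x6 examples above (for 8x8 and 4x4 sub-blocks respectively); the function name is illustrative:

```python
def extended_subblock_size(subwidth, subheight):
    """Size of the extended sub-block obtained by adding one row of
    pixels above and below and one column of pixels left and right of
    the current sub-block: (subwidth + 2) x (subheight + 2)."""
    return subwidth + 2, subheight + 2
```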
  • further, the decoder can determine the third predicted value of the extended sub-block based on the first motion vector of the current sub-block, and at the same time determine the target pixel position corresponding to the current pixel position.
  • step 3010 may specifically include:
  • Step 3010a Determine a third predictor of the extended sub-block based on the first motion vector.
  • Step 3010b Determine the target pixel position corresponding to the current pixel position.
  • the inter-frame prediction method proposed in this embodiment of the present application does not limit the order in which the decoder performs step 3010a and step 3010b.
  • the decoder may first perform step 3010a and then step 3010b, or may perform step 3010b first and then step 3010a, or may perform step 3010a and step 3010b simultaneously.
  • the decoder when determining the third predicted value of the extended sub-block based on the first motion vector, may first determine a sample matrix; wherein, the sample matrix includes a luminance sample matrix and a chrominance sample matrix; The third predictor may then be determined based on the prediction reference mode, the size parameter of the extended sub-block, the sample matrix, and the motion vector set.
  • the pixel values of the pixel positions required by the secondary prediction or PROF processing can also be predicted at the same time. Since the distance between the target pixel position and the current pixel position used in the secondary prediction or PROF processing is 1 pixel position, in the secondary prediction or PROF processing, only the upper and lower rows and the left and right columns of the current sub-block need to be used. According to this principle, the corresponding extended sub-block can be determined.
  • the decoder predicts the pixel values of the pixel positions required by the secondary prediction or PROF processing in advance in the sub-block-based prediction process, then in the subsequent secondary prediction or PROF processing process, There will be no problem that the current pixel position and the target pixel position are not connected.
  • the decoder may, based on the third predicted value and the target pixel position, determine the second predicted value corresponding to the current sub-block, and then determine the second predicted value as the inter-frame predicted value.
  • the decoder can use the third predicted value of the extended sub-block and the target pixel position to further perform secondary prediction on the current pixel position and obtain the corrected predicted value corresponding to the current pixel position; after traversing each pixel position in the current sub-block and obtaining the corrected predicted value corresponding to each pixel position, it can finally determine the second predicted value of the current sub-block, and then determine the second predicted value as the inter-frame predicted value of the current block.
  • the width and height of the current block are width and height, respectively
  • the width and height of each sub-block are subwidth and subheight, respectively.
  • the sub-block where the upper-left sample of the luminance prediction sample matrix of the current block is located is A
  • the sub-block where the upper-right sample is located is B
  • the sub-block where the lower-left sample is located is C
  • the sub-blocks where the other positions are located are other sub-blocks.
  • the motion vector deviation matrix of the sub-block can be denoted as dMv; then:
  • if the current sub-block is sub-block A, dMv is equal to dMvA
  • if the current sub-block is sub-block B, dMv is equal to dMvB
  • if the current sub-block is sub-block C, dMv is equal to dMvC
  • otherwise, dMv is equal to dMvN.
  • (x, y) is the coordinate of the upper left corner of the current sub-block
  • (i, j) is the coordinate of the pixel inside the luminance sub-block
  • the value range of i is 0~(subwidth-1)
  • the value range of j is 0~(subheight-1)
  • the prediction sample matrix of the current sub-block based on the sub-block is PredMatrixTmp
  • PredMatrixTmp is the (subwidth+2)*(subheight+2) prediction sample matrix corresponding to the extended sub-block
  • the predicted sample matrix of the secondary prediction is PredMatrixS
  • the predicted sample PredMatrixS[x+i][y+j] of the secondary prediction of (x+i, y+j) can be calculated as follows:
  • PredMatrixS[x+i][y+j] = Clip3(0, (1 << BitDepth) - 1, PredMatrixS[x+i][y+j]).
  • the prediction sample matrix PredMatrixTmp corresponding to the extended sub-block whose size parameter is 6x6 or 10x10 can be a sub-block prediction matrix based on the original sub-block-based prediction.
  • PredMatrixTmp is the matrix corresponding to the sub-block
  • PredMatrixS is the matrix corresponding to the entire prediction unit, that is, the matrix of the current block, so their indices are different.
  • the index of the position corresponding to the sub-block is incremented by 1 in the horizontal direction and 1 in the vertical direction, respectively.
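As an illustrative sketch of this index relationship, the following copies one sub-block's secondary-prediction samples into the matrix of the whole prediction unit with the Clip3 sample-range clipping described above. The function name, the row/column convention, and the exact layout of the extension border are assumptions mirroring the text's PredMatrixTmp and PredMatrixS, not the normative process.

```python
def clip3(lo, hi, v):
    # Standard Clip3: bound v to the range [lo, hi]
    return max(lo, min(hi, v))

def write_secondary_samples(pred_matrix_tmp, pred_matrix_s, x, y,
                            subwidth, subheight, bit_depth):
    """Copy one sub-block's secondary-prediction samples into the matrix of
    the whole prediction unit, clipping each sample to [0, 2^bit_depth - 1].
    pred_matrix_tmp is the (subwidth+2) x (subheight+2) extended sub-block
    matrix; its indices are offset by +1 horizontally and +1 vertically
    relative to the sub-block, as the text describes."""
    for i in range(subwidth):
        for j in range(subheight):
            v = pred_matrix_tmp[i + 1][j + 1]  # +1 offset into the extension
            pred_matrix_s[x + i][y + j] = clip3(0, (1 << bit_depth) - 1, v)
```

For a 4x4 sub-block the extended matrix is 6x6, and the copied samples are clamped into the 8-bit range when bit_depth is 8.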
  • the decoder may also follow the method proposed in the above steps 306 to 308: first judge whether the current sub-block satisfies the preset restriction conditions, and only when the preset restriction conditions are met will the decoder determine the third predicted value of the extended sub-block and use it for secondary prediction or PROF processing to obtain the inter-frame predicted value of the current sub-block; if the current sub-block does not meet the preset restriction conditions, the decoder can directly use the first predicted value of the current sub-block and the target pixel position to determine the second predicted value, and then determine the second predicted value as the inter-frame predicted value.
  • the preset restriction condition may also be used to determine whether to determine the third predicted value of the extended sub-block for the current sub-block.
  • in addition to using the above-mentioned first, second, or third motion vector deviation to judge whether the preset restriction conditions are met, the decoder can also directly use the sub-block size parameter corresponding to the current sub-block to determine whether the preset restriction condition is satisfied.
  • if the sub-block size parameter corresponding to the current sub-block is 8×8, it can be determined that the preset restriction condition is met; if the sub-block size parameter corresponding to the current sub-block is 4×4, it can be determined that the preset restriction condition is not met.
  • if the size of the current sub-block is 8x8, it is determined that the preset restriction conditions are met, and the third predicted value of the extended sub-block is determined and used for secondary prediction or PROF processing; if the size of the current sub-block is 4x4, it is determined that the preset restriction condition is not met, and there is no need to determine the third predicted value of the extended sub-block; instead, the first predicted value of the current sub-block is directly used for secondary prediction or PROF processing.
  • otherwise, the bandwidth will be increased.
  • a filter with fewer than 8 taps can also be used for the interpolation, so that the bandwidth is not increased when the predicted values of the additional pixel positions are determined; further, the nearest reference pixel value can be directly used as the predicted value of the corresponding additional pixel position. This application does not impose any specific limitation.
  • FIG. 21 is a schematic diagram of a 4 ⁇ 4 sub-block.
  • with an 8-tap interpolation filter, a 4×4 sub-block requires the surrounding 11×11 reference pixels for interpolation, as shown in the figure. That is to say, for a 4x4 sub-block, if an 8-tap interpolation filter is used, 3 more reference pixels are required on the left and upper sides, and 4 more on the right and lower sides for interpolation.
  • the problem of bandwidth increase can be solved by using a filter with fewer taps for the extended sub-block, for example, an interpolation filter with n taps is used for the extended sub-block to obtain the third prediction value; where n is any of the following values: 6, 5, 4, 3, 2.
  • Figure 22 is a schematic diagram of an extended sub-block.
  • circle 1 can represent the sub-pixel in the reference image corresponding to a pixel position inside the sub-block, that is, it can represent the predicted value of the sub-block-based prediction of the sub-block
  • circle 2 can represent the extra predicted value, obtained by prediction in advance, that lies beyond the boundary of the original sub-block within the extended sub-block.
  • the reference pixels required to apply the same interpolation to circle 2 as to circle 1 would exceed the original 11x11 range; but with a filter with fewer taps, such as a 6-, 5-, 4-, 3-, or 2-tap filter, it is possible to stay within this range without increasing the bandwidth.
  • the 2-tap filter can be an average or weighted average of 2 adjacent integer pixels of a sub-pixel point.
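A minimal sketch of such a 2-tap (bilinear) filter is given below, assuming 1/16-pel motion vector precision and round-half-up integer weighting; the codec's actual precision and rounding rules may differ.

```python
def two_tap_filter(left_px, right_px, frac, precision=4):
    """Weighted average of the two integer pixels adjacent to a sub-pixel
    position. `frac` is the fractional MV component in 1/2**precision pel
    units (1/16-pel when precision == 4). The weighting and rounding here
    are illustrative assumptions."""
    denom = 1 << precision
    return ((denom - frac) * left_px + frac * right_px + denom // 2) >> precision
```

At the half-pel position (frac equal to 8 with 1/16-pel precision) this reduces to a plain average of the two neighbouring integer pixels.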
  • the filters mentioned above are all in the horizontal or vertical direction; if the position to be interpolated is a sub-pixel position in both the horizontal and vertical directions, then the horizontal and vertical filters need to be cascaded.
  • the reference pixels corresponding to the current sub-block can also be expanded first to obtain the expanded reference pixels; then, based on the expanded reference pixels, the interpolation filter corresponding to the current sub-block can be applied to obtain the third predicted value.
  • Figure 23 is a schematic diagram of the extended sub-block. As shown in Figure 23, in order not to increase the bandwidth, the outermost circle of the original interpolation reference pixel can also be extended. For example, the 11x11 reference pixel is extended to 13x13. Specifically, the extension method proposed in the above step 304 may be adopted. This allows the use of uniform filters.
  • the adjacent pixel positions corresponding to the extended sub-block can also be determined in the current block first; then the third predicted value can be determined according to the integer pixel values of those adjacent pixel positions.
  • Fig. 24 is a schematic diagram of replacing a pixel.
  • the value of the nearest integer pixel can also be directly used as the prediction value that needs to be used beyond the sub-block.
  • the pixel value of square 3 can be directly used as the predicted value of circle 2. If the horizontal sub-pixel MV is less than or equal to (or less than) 1/2 pixel, the integer pixel to the left of the sub-pixel is used; if the horizontal sub-pixel MV is greater than (or greater than or equal to) 1/2 pixel, the integer pixel to the right of the sub-pixel is used.
  • likewise, if the vertical sub-pixel MV is less than or equal to (or less than) 1/2 pixel, the integer pixel above the sub-pixel is used; if the vertical sub-pixel MV is greater than (or greater than or equal to) 1/2 pixel, the integer pixel below the sub-pixel is used.
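The nearest-integer-pixel rule above can be sketched as follows, assuming 1/16-pel precision and the "less than or equal to 1/2" tie-break (the text leaves "<" versus "<=" open); names and the reference-picture layout are illustrative.

```python
def nearest_integer_offset(frac, precision=4):
    """Return 0 (left/above) when the fractional MV component is <= 1/2 pel,
    else 1 (right/below). frac is in 1/2**precision pel units."""
    half = 1 << (precision - 1)
    return 0 if frac <= half else 1

def nearest_integer_pixel(ref, ix, iy, frac_x, frac_y, precision=4):
    """Pick the integer reference pixel nearest to a sub-pixel position.
    ref is the reference picture as a 2-D list of rows; (ix, iy) is the
    integer part of the motion-compensated position."""
    return ref[iy + nearest_integer_offset(frac_y, precision)][
        ix + nearest_integer_offset(frac_x, precision)]
```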
  • FIG. 25 is a schematic diagram 4 of the implementation flow of the inter prediction method.
  • the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined.
  • the method for performing inter-frame prediction by the decoder may further include the following steps:
  • Step 3012 If the target pixel position does not belong to the current sub-block, determine the adjacent pixel position in the current block; wherein, the adjacent pixel position is adjacent to the target pixel position in the current block.
  • Step 3013 Determine a second predicted value based on the first predicted value and adjacent pixel positions, and determine the second predicted value as an inter-frame predicted value.
  • the decoder may first determine, among all the pixel positions in the current block, the adjacent pixel position closest to the target pixel position; wherein the adjacent pixel position is adjacent to the target pixel position in the current block.
  • the decoder can re-determine the pixel value. Specifically, the decoder can calculate and obtain a neighboring pixel position in the current block that is closest to the target pixel position, and then can use the pixel value of the neighboring pixel position to perform secondary prediction or PROF processing.
  • FIG. 26 is a schematic diagram of an alternative pixel position.
  • position 3 in sub-block 2 will be selected as the position one pixel away from position 1 in sub-block 1. It can be seen that position 1 and position 3 are far apart.
  • filtering with position 3 as the adjacent pixel position of position 1 will greatly reduce the prediction effect of the final secondary prediction.
  • position 3 can be replaced with position 4 in the current block, that is, position 4 is the adjacent pixel position of position 2.
  • position 4 is the pixel position closest to position 2, and the deviation between the two is small. Filtering using position 4 as the adjacent pixel location for position 1 will result in more accurate results than using position 3.
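The selection of the nearest adjacent pixel position inside the current block can be sketched as a simple clamp to the block boundary; this is an illustrative assumption of how "closest" is resolved, not the normative derivation.

```python
def nearest_in_block(tx, ty, bx, by, bw, bh):
    """Clamp a target pixel position (tx, ty) to the nearest pixel position
    inside the current block, whose top-left corner is (bx, by) and whose
    size is bw x bh. Returns the adjacent pixel position used in place of a
    target position that falls outside the block."""
    nx = max(bx, min(bx + bw - 1, tx))
    ny = max(by, min(by + bh - 1, ty))
    return nx, ny
```

A target position one pixel outside the left edge of a 16x16 block is replaced by the boundary pixel in the same row, matching the position-3/position-4 substitution in Figure 26.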
  • when the decoder determines the adjacent pixel position in other sub-blocks in the current block, it needs to use the first motion vector of the current sub-block for calculation, and the result may be that the adjacent pixel position does not exist in the other sub-blocks in the current block. If the adjacent pixel position does not exist, the decoder may choose to continue to use the pixel value of the target pixel position for secondary prediction or PROF processing.
  • the decoder can update the target pixel positions that do not belong to the current sub-block to obtain the updated pixel positions, so that secondary prediction or PROF processing can be performed based on the updated pixel positions.
  • the above update process can be understood as an expansion of the boundary of the current sub-block, or as a redefinition of the target pixel positions beyond the boundary of the current sub-block, so that the pixel positions needed for secondary prediction or PROF processing are restricted to the same sub-block, which improves prediction performance by ensuring that the pixels are connected.
  • the inter-frame prediction method can act on the current block, on each sub-block in the current block, and on each pixel position in any sub-block. It can be seen that the inter-frame prediction method proposed in the embodiments of the present application can reduce the prediction error and improve the coding performance by avoiding the use of obviously non-adjacent pixels in the reference image as adjacent pixels for secondary prediction or PROF processing.
  • This embodiment provides an inter-frame prediction method. After sub-block-based prediction, if the pixel positions required for secondary prediction or PROF processing are not in the same sub-block, those pixel positions can be restricted to the same sub-block by extending the boundary of the current sub-block, redefining the target pixel positions beyond the boundary of the current sub-block, and so on. This solves the problem of degraded prediction performance caused by unconnected pixels, reduces prediction errors, greatly improves encoding performance, and thus improves encoding and decoding efficiency.
  • An embodiment of the present application provides an inter-frame prediction method, which is applied to a video encoding device, that is, an encoder.
  • the functions implemented by the method can be implemented by calling a computer program by the second processor in the encoder, and of course the computer program can be stored in the second memory.
  • the encoder includes at least a second processor and a second memory.
  • the method for performing inter-frame prediction by the encoder may include the following steps:
  • Step 401 Determine the prediction mode parameter of the current block.
  • the encoder may first determine the prediction mode parameter of the current block. Specifically, the encoder may first determine the prediction mode used by the current block, and then determine the corresponding prediction mode parameter based on the prediction mode. The prediction mode parameter may be used to determine the prediction mode used by the current block.
  • an image to be encoded may be divided into multiple image blocks; the image block currently to be encoded may be referred to as the current block, and an image block adjacent to the current block may be referred to as an adjacent block; that is, in the to-be-encoded image, the current block and its adjacent blocks have an adjacency relationship.
  • each current block may include a first image component, a second image component, and a third image component; that is, the current block is the image block in the image to be encoded on which prediction of the first, second, or third image component is to be performed.
  • assuming the current block performs first image component prediction and the first image component is a luminance component, that is, the image component to be predicted is a luminance component, the current block may also be called a luminance block; or, assuming the current block performs second image component prediction and the second image component is a chrominance component, that is, the image component to be predicted is a chrominance component, the current block may also be called a chrominance block.
  • the prediction mode parameter indicates the prediction mode adopted by the current block and parameters related to the prediction mode.
  • a simple decision-making strategy can be used, for example, a decision according to the magnitude of the distortion value; the embodiments of the present application do not impose any limitation on this.
  • the RDO method can be used to determine the prediction mode parameter of the current block.
  • the encoder when determining the prediction mode parameter of the current block, may first perform precoding processing on the current block by using multiple prediction modes to obtain the rate-distortion cost value corresponding to each prediction mode; then The minimum rate-distortion cost value is selected from the obtained multiple rate-distortion cost values, and the prediction mode parameter of the current block is determined according to the prediction mode corresponding to the minimum rate-distortion cost value.
  • the multiple prediction modes are each used to perform precoding processing on the current block respectively.
  • the multiple prediction modes usually include an inter prediction mode, traditional intra prediction modes, and non-traditional intra prediction modes; the traditional intra prediction modes may include the direct current (DC) mode, the planar (PLANAR) mode, angular modes, etc.
  • non-traditional intra prediction modes can include matrix-based intra prediction (MIP) mode, cross-component linear model prediction (Cross-component Linear model prediction, CCLM) mode, intra block copy (Intra Block Copy, IBC) mode and PLT (Palette) mode, etc.
  • the inter-frame prediction mode may include ordinary inter-frame prediction mode, GPM mode, and AWP mode.
  • the rate-distortion cost value corresponding to each prediction mode can be obtained; then the minimum rate-distortion cost value is selected from the obtained multiple rate-distortion cost values, The prediction mode corresponding to the minimum rate-distortion cost value is determined as the prediction mode parameter of the current block.
  • the encoder can select the optimal prediction mode to pre-encode the current block.
  • the prediction mode of the current block can be determined, and then the prediction mode parameters used to indicate the prediction mode can be determined. Thereby, the corresponding prediction mode parameters are written into the code stream and transmitted from the encoder to the decoder.
  • the decoder can directly obtain the prediction mode parameter of the current block by parsing the code stream, and determine the prediction mode used by the current block according to the prediction mode parameter obtained by parsing, and the correlation corresponding to the prediction mode. parameter.
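The minimum rate-distortion selection described above can be sketched as follows; the mode names are hypothetical, and the computation of each rate-distortion cost during precoding is omitted.

```python
def select_prediction_mode(rd_costs):
    """rd_costs: mapping from candidate prediction mode name to its
    rate-distortion cost obtained by precoding the current block with that
    mode. Returns the mode with the minimum cost, which determines the
    prediction mode parameter written into the code stream."""
    return min(rd_costs, key=rd_costs.get)
```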
  • Step 402 when the prediction mode parameter indicates that the inter prediction value of the current block is determined using the inter prediction mode, determine the first motion vector of the current subblock of the current block; wherein the current block includes multiple subblocks.
  • the encoder may first determine the first motion vector of each subblock of the current block. Wherein, one sub-block corresponds to one first motion vector.
  • the current block is the image block to be encoded in the current frame
  • the current frame is encoded in the form of image blocks in a certain order
  • the current block is the image block in the current frame in this order.
  • the current block may have various sizes, such as 16 ⁇ 16, 32 ⁇ 32, or 32 ⁇ 16, where the numbers represent the number of rows and columns of pixels on the current block.
  • the current block may be divided into a plurality of sub-blocks, wherein the size of each sub-block is the same, and the sub-block is a set of pixel points of a smaller specification.
  • the size of the sub-block can be 8x8 or 4x4.
  • the size of the current block is 16 ⁇ 16, which can be divided into 4 sub-blocks each with a size of 8 ⁇ 8.
  • if the encoder determines that the prediction mode parameter indicates that the inter-frame prediction mode is used to determine the inter-frame prediction value of the current block, the inter prediction method provided by the embodiments of the present application can continue to be used.
  • when the encoder determines the first motion vector of the current sub-block of the current block, it can first determine the affine mode parameter and the prediction reference mode of the current block.
  • if the affine mode parameter indicates that the affine mode is used, the control point mode and the sub-block size parameter are determined.
  • the first motion vector can be determined according to the prediction reference mode, the control point mode and the sub-block size parameter.
  • the affine mode parameter is used to indicate whether to use the affine mode.
  • the affine mode parameter may be the affine motion compensation enable flag affine_enable_flag, and the encoder may further determine whether to use the affine mode by determining the value of the affine mode parameter.
  • the affine mode parameter may be a binary variable. If the value of the affine mode parameter is 1, it indicates that the affine mode is used; if the value of the affine mode parameter is 0, it indicates that the affine mode is not used.
  • control point mode is used to determine the number of control points.
  • a sub-block can have 2 control points or 3 control points, correspondingly, the control point pattern can be the control point pattern corresponding to 2 control points, or the control point pattern corresponding to 3 control points . That is, the control point mode can include a 4-parameter mode and a 6-parameter mode.
  • if the current block uses the affine mode, the encoder also needs to determine the number of control points of the affine mode of the current block, so as to determine whether the 4-parameter (2 control points) mode or the 6-parameter (3 control points) mode is used.
  • the encoder may further determine the sub-block size parameter.
  • the sub-block size parameter can be represented by the affine prediction sub-block size flag affine_subblock_size_flag, and the encoder can set the value of the sub-block size flag to indicate the size of the current sub-block of the current block.
  • the size of the sub-block may be 8 ⁇ 8 or 4 ⁇ 4.
  • if the sub-block size parameter is 8×8, the sub-block size flag is set to 1 and written into the code stream; if the sub-block size parameter is 4×4, the sub-block size flag is set to 0 and written into the code stream.
  • the encoder may first determine the control point motion vector group according to the prediction reference mode, and then determine the first motion vector according to the control point motion vector group, the control point mode, and the sub-block size parameter.
  • control point motion vector group may be used to determine the motion vector of the control point.
  • the encoder can traverse each sub-block in the current block according to the above method, and use the control point motion vector group, the control point mode, and the sub-block size parameter to determine the first motion vector of each sub-block, so that the motion vector set can be constructed from the first motion vectors of all sub-blocks.
  • the motion vector set of the current block may include the first motion vector of each sub-block of the current block.
  • when the encoder determines the first motion vector according to the control point motion vector group, the control point mode, and the sub-block size parameter, it may first determine difference variables according to the control point motion vector group, the control point mode, and the size parameter of the current block; then determine the sub-block position based on the prediction mode parameter and the sub-block size parameter; finally, determine the first motion vector of the current sub-block using the difference variables and the current sub-block position, thereby obtaining the motion vector set of the multiple sub-blocks of the current block.
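The derivation of a sub-block motion vector from the control point motion vector group can be sketched with the standard affine model, in floating point for clarity; the codec itself works with fixed-point difference variables and additional rounding, and all names here are illustrative.

```python
def affine_subblock_mv(cpmv, ctrl_points, width, height, cx, cy):
    """Derive a sub-block motion vector from the control point MV group.
    cpmv: list of (mvx, mvy) control point MVs, 2 for 4-parameter mode or
    3 for 6-parameter mode; width/height are the current block size; (cx, cy)
    is the sub-block position (e.g. its centre) relative to the top-left
    corner of the current block."""
    v0x, v0y = cpmv[0]
    v1x, v1y = cpmv[1]
    dhx = (v1x - v0x) / width   # horizontal difference variables
    dhy = (v1y - v0y) / width
    if ctrl_points == 3:        # 6-parameter mode: third control point
        v2x, v2y = cpmv[2]
        dvx = (v2x - v0x) / height
        dvy = (v2y - v0y) / height
    else:                       # 4-parameter mode: rotation/zoom model
        dvx = -dhy
        dvy = dhx
    return (v0x + dhx * cx + dvx * cy, v0y + dhy * cx + dvy * cy)
```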
  • Step 403 determine the first predicted value of the current sub-block based on the first motion vector, and determine the target pixel position corresponding to the current pixel position; wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of the pixel used to perform secondary prediction or PROF processing on the pixel at the current pixel position.
  • the encoder may first determine the first predicted value of the current sub-block based on the first motion vector of the current sub-block, and then The target pixel position corresponding to the current pixel position in the current sub-block can be determined.
  • the target pixel position is a pixel position adjacent to the current pixel position.
  • the current pixel position is the position of a pixel in the current sub-block of the current block, where the current pixel position may represent the position of the pixel to be processed.
  • the current pixel position may be the position of the pixel to be re-predicted, or may be the position of the pixel to be PROF processed.
  • step 403 may specifically include:
  • Step 403a Determine the first predicted value of the sub-block based on the first motion vector.
  • Step 403b Determine the target pixel position corresponding to the current pixel position.
  • the inter-frame prediction method proposed in this embodiment of the present application does not limit the order in which the encoder performs step 403a and step 403b, that is, in the present application, after determining the first motion of each sub-block of the current block After the vector, the encoder may first perform step 403a and then step 403b, or may first perform step 403b and then perform step 403a, or may perform step 403a and step 403b simultaneously.
  • Step 404 If the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain the updated pixel position.
  • after the encoder determines the first predicted value of the current sub-block based on the first motion vector and determines the target pixel position corresponding to the current pixel position, if the target pixel position does not belong to the current sub-block, that is, if the target pixel position and the current pixel position belong to different sub-blocks, the encoder can update the target pixel position according to the current sub-block to obtain the updated pixel position.
  • before the encoder updates the target pixel position according to the current sub-block to obtain the updated pixel position, that is, before step 404, the method for performing inter-frame prediction by the encoder may further include the following steps:
  • Step 406 Determine whether the current sub-block satisfies a preset restriction condition; wherein, the preset restriction condition is used to determine whether to limit the target pixel position within the current sub-block.
  • Step 407 If the preset restriction condition is met, determine whether the target pixel position belongs to the current sub-block.
  • Step 408 If the preset restriction conditions are not met, determine a second predicted value based on the first predicted value and the target pixel position, and determine the second predicted value as an inter-frame predicted value.
  • the encoder may first determine whether the current sub-block satisfies the preset restriction conditions. If the preset restriction conditions are met, the encoder may continue to determine whether the target pixel position belongs to the current sub-block, thereby further determining whether to perform restriction processing on the target pixel position, that is, whether to perform the update processing on the target pixel position. If the preset restriction conditions are not met, the encoder does not need to perform restriction processing on the target pixel position, but can directly, based on the first predicted value and the target pixel position, further determine the corrected predicted value corresponding to the current pixel position; after traversing each pixel position in the current sub-block to obtain the corrected predicted value corresponding to each pixel position, it determines the second predicted value corresponding to the current sub-block, so as to determine the inter-frame predicted value of the current sub-block.
  • the encoder can determine whether the current sub-block satisfies the preset restriction condition in various ways. Specifically, the encoder can further determine whether the preset restriction condition is satisfied through the motion vector.
  • the encoder may first determine the first motion vector deviation based on the control point motion vector group, and then compare the first motion vector deviation with the preset deviation threshold; if the first motion vector deviation is greater than or equal to the preset deviation threshold, it can be determined that the preset restriction conditions are met; if the first motion vector deviation is smaller than the preset deviation threshold, it is determined that the preset restriction conditions are not met.
  • alternatively, the encoder may first determine the second motion vector deviation based on the first motion vector of each sub-block, and then compare it with the preset deviation threshold; if the second motion vector deviation is greater than or equal to the preset deviation threshold, it is determined that the preset restriction condition is met; if the second motion vector deviation is smaller than the preset deviation threshold, it is determined that the preset restriction condition is not met.
  • alternatively, the encoder may first determine the third motion vector deviation between the preset pixel position in the current sub-block and the current sub-block, and then compare the third motion vector deviation with the preset deviation threshold; if the third motion vector deviation is greater than or equal to the preset deviation threshold, it is determined that the preset restriction conditions are met; if the third motion vector deviation is smaller than the preset deviation threshold, it is determined that the preset restriction conditions are not met.
  • the preset deviation threshold may be determined according to the size parameter of the current block and/or the sub-block size parameter.
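A minimal sketch of the threshold comparison follows; how the preset deviation threshold is derived from the size parameter of the current block and/or the sub-block size parameter is left unspecified here, so the threshold is simply an input.

```python
def meets_restriction(mv_deviation, deviation_threshold):
    """Preset restriction condition check: the target pixel position is only
    restricted to the current sub-block when the motion vector deviation
    reaches the threshold (greater-than-or-equal, per the text)."""
    return mv_deviation >= deviation_threshold
```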
  • when the encoder updates the target pixel position according to the current sub-block to obtain the updated pixel position, it can first perform expansion processing on the current sub-block to obtain the expanded sub-block, and then determine the updated pixel position corresponding to the target pixel position in the expanded sub-block.
  • when the encoder performs expansion processing on the current sub-block to obtain the expanded sub-block, it can choose to use all the boundary positions of the current sub-block for the expansion processing, or it can choose to use only the boundary positions of the rows and/or columns corresponding to the target pixel position in the current sub-block for the expansion processing.
  • Step 405 Based on the first predicted value and the updated pixel position, determine a second predicted value corresponding to the current sub-block, and determine the second predicted value as the inter-frame predicted value of the current sub-block.
  • after the encoder updates the target pixel position according to the current sub-block and obtains the updated pixel position, it can determine, based on the first predicted value and the updated pixel position, the corrected predicted value corresponding to the current pixel position; after traversing each pixel position in the current sub-block to obtain the corrected predicted value corresponding to each pixel position, it determines the second predicted value corresponding to the current sub-block, and then the second predicted value may be determined as the inter-frame predicted value of the current sub-block.
  • the method for the encoder to determine the second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position may include the following steps:
  • Step 405a: determine PROF parameters;
  • Step 405b: when the PROF parameter indicates that PROF processing is performed, determine the pixel horizontal gradient and the pixel vertical gradient between the current pixel position and the updated pixel position based on the first predicted value;
  • Step 405c: determine the fourth motion vector deviation between the updated pixel position and the current sub-block;
  • Step 405d: calculate the deviation value corresponding to the current pixel position according to the pixel horizontal gradient, the pixel vertical gradient and the fourth motion vector deviation;
  • Step 405e: obtain a second predicted value based on the first predicted value and the deviation value.
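Steps 405b–405e can be sketched roughly as follows. This is a minimal illustrative sketch, not the standard's decoding process: the function name, the 1-pixel padded input layout, and the per-pixel deviation arrays `dmv_x`/`dmv_y` are all assumptions, and the 3-tap [-1, 0, 1] gradient filter follows the VVC-style PROF gradient described later in this document.

```python
def prof_refine(pred, dmv_x, dmv_y):
    """Illustrative PROF refinement sketch (steps 405b-405e).

    pred:  first predicted values with a 1-pixel border already padded,
           so an h x w sub-block is stored as (h+2) x (w+2).
    dmv_x, dmv_y: per-pixel motion vector deviations (the "fourth
           motion vector deviation"), h x w each.
    """
    h, w = len(pred) - 2, len(pred[0]) - 2
    refined = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # 3-tap [-1, 0, 1] horizontal and vertical gradients (step 405b)
            gx = pred[y + 1][x + 2] - pred[y + 1][x]
            gy = pred[y + 2][x + 1] - pred[y][x + 1]
            # deviation value = gradient dotted with the MV deviation (step 405d)
            di = gx * dmv_x[y][x] + gy * dmv_y[y][x]
            # second predicted value = first predicted value + deviation (step 405e)
            refined[y][x] = pred[y + 1][x + 1] + di
    return refined
```

With zero motion vector deviation the refinement leaves the first predicted values unchanged, which matches the intent that PROF is a correction on top of the sub-block prediction.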
  • the method for the encoder to determine the second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position may include the following steps:
  • Step 405f: determine secondary prediction parameters;
  • Step 405g: when the secondary prediction parameter indicates that secondary prediction is used, determine the fourth motion vector deviation between the updated pixel position and the current sub-block;
  • Step 405h: determine the filter coefficients of the two-dimensional filter according to the fourth motion vector deviation, where the two-dimensional filter is used to perform secondary prediction processing according to a preset shape;
  • Step 405i: determine a second predicted value based on the filter coefficients and the first predicted value, and determine the second predicted value as an inter-frame predicted value.
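Steps 405f–405i can be sketched as below. The cross-shaped tap pattern is purely illustrative (the actual preset shapes are those of FIGS. 18–19, which this text does not specify), and the assumption that the coefficients sum to 64 with a 6-bit shift is a convention choice, not taken from the standard.

```python
def secondary_predict(pred, y, x, coeffs):
    """Illustrative secondary prediction at one pixel position.

    pred:   first predicted values of the (extended) sub-block.
    (y, x): current pixel position inside pred.
    coeffs: filter coefficients, assumed derived from the fourth
            motion vector deviation (step 405h) and summing to 64.
    """
    # a 3x3 cross as a stand-in for the preset two-dimensional shape
    taps = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    acc = sum(c * pred[y + dy][x + dx] for c, (dy, dx) in zip(coeffs, taps))
    return acc >> 6  # normalize, assuming 6-bit coefficient precision
```

On a flat block any coefficient set that sums to 64 reproduces the input value, which is the expected behavior of an interpolation-style filter.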
  • the encoder can update the target pixel position that does not belong to the current sub-block to obtain the updated pixel position, so that secondary prediction or PROF processing can be performed based on the updated pixel position.
  • the above update process can be understood as expanding the boundary of the current sub-block, or as redefining target pixel positions beyond the boundary of the current sub-block, so that the pixel positions needed for secondary prediction or PROF processing are restricted to the same sub-block, which improves the prediction performance by ensuring that the pixels are connected.
  • the method for performing inter-frame prediction by the encoder may further include the following steps:
  • Step 409 When the prediction mode parameter indicates that the inter prediction value of the current block is determined using the inter prediction mode, determine the extended subblock of the current subblock.
  • Step 4010 Determine the third prediction value of the extended sub-block based on the first motion vector, and determine the target pixel position corresponding to the current pixel position.
  • Step 4011 If the target pixel position does not belong to the current sub-block, determine a second predicted value based on the third predicted value and the target pixel position, and determine the second predicted value as an inter-frame predicted value.
  • the encoder may also follow the steps proposed in steps 406 to 408 above.
  • the method is to first judge whether the current sub-block satisfies the preset restriction conditions; only when the preset restriction conditions are met will the third predicted value of the extended sub-block be determined and used to perform secondary prediction or PROF processing to obtain the inter-frame prediction value of the current sub-block. If the current sub-block does not meet the preset restriction conditions, the decoder can directly use the first predicted value and target pixel position of the current sub-block to determine the second predicted value, and then determine the second predicted value as the inter predicted value.
  • the preset restriction condition may also be used to determine whether to perform the determination of the third prediction value of the extended sub-block on the current sub-block.
  • the problem of bandwidth increase can be solved by using a filter with fewer taps for the extended sub-block; for example, an interpolation filter with n taps is used for the extended sub-block to obtain the third prediction value, where n is any of the following values: 6, 5, 4, 3, 2. That is, in the present application, using a filter with fewer taps, such as a 6-, 5-, 4-, 3- or 2-tap filter, keeps the required reference pixels within the original range and thus does not increase the bandwidth.
  • the 2-tap filter can take an average or weighted average of the 2 integer pixels adjacent to a sub-pixel point.
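Such a 2-tap weighted average can be sketched as follows. The 1/16 fractional precision, the rounding offset, and the function name are assumptions for illustration; the text above only states that a (weighted) average of the two adjacent integer pixels is taken.

```python
def two_tap(p0, p1, frac, precision=16):
    """Weighted average of the two integer pixels adjacent to a sub-pixel.

    p0, p1: integer pixels on either side (left/right or above/below).
    frac:   fractional offset of the sub-pixel from p0, in units of
            1/precision of a pixel (1/16 assumed here).
    """
    # linear weights proportional to the distance from the opposite pixel,
    # with rounding to nearest
    return (p0 * (precision - frac) + p1 * frac + precision // 2) // precision
```

A half-pixel offset (`frac=8` at 1/16 precision) degenerates to a plain average of the two neighbors.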
  • the filters mentioned above all operate in the horizontal or vertical direction. If the position to be interpolated is a sub-pixel in both the horizontal and vertical directions, then the horizontal and vertical filters need to be cascaded.
  • the reference pixels corresponding to the current sub-block can also be expanded first to obtain the expanded reference pixels, and then interpolation filtering can be performed on the current sub-block based on the expanded reference pixels to obtain the third predicted value. That is to say, in the present application, the outermost ring of the reference pixels originally used for interpolation can be expanded, for example from 11×11 reference pixels to 13×13 reference pixels, specifically using the expansion method proposed in step 304 above. This allows uniform filters to be used. At the same time, another method that does not increase the bandwidth is proposed: instead of expanding the pixels, when the pixels to be used by the filter exceed the original pixel range, the pixels in the nearest range are selected instead.
  • the adjacent pixel positions corresponding to the extended sub-block can also be determined in the current block first, and then the third predicted value can be determined according to the integer pixel values of the adjacent pixel positions. That is to say, in the present application, the value of the nearest adjacent integer pixel can also be directly used as the predicted value beyond the sub-block; specifically, the pixel value of square 3 can be directly used as the predicted value of circle 2.
  • if the sub-pixel MV in the horizontal direction is less than or equal to (or less than) 1/2 pixel, the integer pixel to the left of the sub-pixel is used; if the sub-pixel MV in the horizontal direction is greater than (or greater than or equal to) 1/2 pixel, the integer pixel to the right of the sub-pixel is used.
  • if the sub-pixel MV in the vertical direction is less than or equal to (or less than) 1/2 pixel, the integer pixel above the sub-pixel is used; if the sub-pixel MV in the vertical direction is greater than (or greater than or equal to) 1/2 pixel, the integer pixel below the sub-pixel is used.
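The nearest-integer-pixel substitution rule above can be sketched as follows. The 1/16 fractional precision (so half a pixel is 8/16) and the `<=` tie behavior are assumptions; the text itself leaves the tie convention open ("less than or equal to (or less than)").

```python
def nearest_integer_pixel(x_int, y_int, frac_x, frac_y, half=8):
    """Pick the integer pixel nearest to a sub-pixel position.

    (x_int, y_int): integer pixel to the top-left of the sub-pixel.
    frac_x, frac_y: fractional parts of the sub-pixel MV, in 1/16-pel
                    units (an assumption of this sketch).
    """
    px = x_int if frac_x <= half else x_int + 1  # left vs. right neighbor
    py = y_int if frac_y <= half else y_int + 1  # above vs. below neighbor
    return px, py
```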
  • after determining the first prediction value of the current sub-block based on the first motion vector and determining the target pixel position corresponding to the current pixel position, that is, after step 203, the method for the encoder to perform inter-frame prediction may also include the following steps:
  • Step 4012 if the target pixel position does not belong to the current sub-block, determine the adjacent pixel position in the current block; wherein, the adjacent pixel position is adjacent to the target pixel position in the current block.
  • Step 4013 Determine a second predicted value based on the first predicted value and adjacent pixel positions, and determine the second predicted value as an inter-frame predicted value.
  • the inter-frame prediction method proposed in the embodiments of the present application can act on the entire coding unit or prediction unit, that is, on the current block; it can also act on each sub-block in the current block, or on each pixel position in any sub-block. This application does not impose any specific limitation.
  • the encoder may write the prediction mode parameters, the affine mode parameters, and the prediction reference mode into the code stream.
  • PROF parameters and secondary prediction parameters can also be written into the code stream.
  • This embodiment provides an inter-frame prediction method. After sub-block-based prediction, if the pixel positions required for secondary prediction or PROF processing are not in the same sub-block, the pixel positions that need to be used can be restricted to the same sub-block by extending the boundary of the current sub-block, redefining target pixel positions beyond the boundary of the current sub-block, and so on. This solves the problem of degraded prediction performance caused by disconnected pixels, reduces prediction errors, greatly improves encoding performance, and thus improves encoding and decoding efficiency.
  • FIG. 27 is a first schematic diagram of the composition structure of the decoder.
  • the decoder 300 proposed in the embodiment of the present application may include a parsing part 301, a first determining part 302 and a first update part 303.
  • the parsing part 301 is configured to parse the code stream and obtain the prediction mode parameter of the current block;
  • the first determining part 302 is configured to determine the first motion vector of the current subblock of the current block when the prediction mode parameter indicates that the inter prediction value of the current block is determined using the inter prediction mode; wherein , the current block includes a plurality of sub-blocks; the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined; wherein, the current pixel position is the The position of a pixel in the current sub-block, the target pixel position is the position of the pixel that is subjected to secondary prediction or PROF processing on the pixel of the current pixel position;
  • the first update part 303 is configured to, if the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain the updated pixel position;
  • the first determining part 302 is further configured to determine a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determine the second predicted value as the The inter-predicted value of the current subblock.
  • FIG. 28 is a second schematic diagram of the composition and structure of the decoder.
  • the decoder 300 proposed in this embodiment of the present application may further include a first processor 304 and a first memory 305 storing executable instructions of the first processor 304 , a first communication interface 306 , and a first bus 307 for connecting the first processor 304 , the first memory 305 and the first communication interface 306 .
  • the above-mentioned first processor 304 is configured to parse the code stream and obtain the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter prediction mode is used to determine the inter prediction value of the current block,
  • the first motion vector of the current sub-block of the current block is determined, wherein the current block includes a plurality of sub-blocks; the first prediction value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined, wherein the current pixel position is the position of a pixel point in the current sub-block, and the target pixel position is the position of the pixel on which secondary prediction or PROF processing is performed for the pixel at the current pixel position; if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain the updated pixel position;
  • the first predicted value and the updated pixel position are used to determine a second predicted value corresponding to the current sub-block, and the second predicted value is determined as the inter-frame predicted value of the current sub-block.
  • FIG. 29 is a schematic diagram 1 of the composition structure of the encoder.
  • the encoder 400 proposed in this embodiment of the present application may include a second determination part 401 and a second update part 402 .
  • the second determining part 401 is configured to determine the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter prediction value of the current block is determined using the inter prediction mode, determine the current sub-frame of the current block The first motion vector of the block; wherein the current block includes a plurality of sub-blocks; the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined; wherein, The current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of the pixel that is subjected to secondary prediction or PROF processing to the pixel of the current pixel position;
  • the second update part 402 is configured to, if the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain the updated pixel position;
  • the second determining part 401 is further configured to determine a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determine the second predicted value as the The inter-predicted value of the current subblock.
  • FIG. 30 is a second schematic diagram of the composition and structure of the encoder.
  • the encoder 400 proposed in this embodiment of the present application may further include a second processor 403 and a second memory 404 storing executable instructions of the second processor 403 , a second communication interface 405 , and a second bus 406 for connecting the second processor 403 , the second memory 404 and the second communication interface 405 .
  • the above-mentioned second processor 403 is configured to determine the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter prediction mode is used to determine the inter prediction value of the current block,
  • determine the first motion vector of the current sub-block of the current block, wherein the current block includes multiple sub-blocks; determine the first predicted value of the current sub-block based on the first motion vector, and determine the target pixel position corresponding to the current pixel position, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of the pixel on which secondary prediction or PROF processing is performed for the pixel at the current pixel position; if the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain the updated pixel position; determine a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determine the second predicted value as the inter-frame predicted value of the current sub-block.
  • the embodiments of the present application provide a decoder and an encoder. After the sub-block-based prediction, if the pixel positions required for secondary prediction or PROF processing are not in the same sub-block, the decoder and the encoder can use The expansion of the boundary of the current sub-block, the redefinition of the target pixel position beyond the boundary of the current sub-block, etc., limit the pixel positions that need to be used for secondary prediction or PROF processing in the same sub-block, Therefore, the problem that the prediction performance is degraded because the pixels are not connected can be solved, the prediction error can be reduced, the coding performance can be greatly improved, and the coding and decoding efficiency can be improved.
  • Embodiments of the present application provide a computer-readable storage medium on which a program is stored, and when the program is executed by a processor, the method described in the foregoing embodiments is implemented.
  • a program instruction corresponding to an inter-frame prediction method in this embodiment may be stored on a storage medium such as an optical disc, a hard disk or a USB flash drive; when the program instruction is read or executed, the following steps are included:
  • the prediction mode parameter indicates that the inter prediction value of the current block is determined using the inter prediction mode
  • the first motion vector of the current subblock of the current block is determined; wherein the current block includes a plurality of subblocks
  • the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined; wherein, the current pixel position is the position of a pixel in the current sub-block , the target pixel position is the position of the pixel that performs secondary prediction or PROF processing on the pixel at the current pixel position;
  • if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain the updated pixel position;
  • based on the first predicted value and the updated pixel position, a second predicted value corresponding to the current sub-block is determined, and the second predicted value is determined as the inter-frame predicted value of the current sub-block.
  • a program instruction corresponding to an inter-frame prediction method in this embodiment may be stored on a storage medium such as an optical disc, a hard disk or a USB flash drive; when the program instruction is read or executed, the following steps are included:
  • a first motion vector of the current subblock of the current block is determined; wherein the current block includes a plurality of subblocks;
  • the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined; wherein, the current pixel position is the position of a pixel in the current sub-block , the target pixel position is the position of the pixel that performs secondary prediction or PROF processing on the pixel at the current pixel position;
  • if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain the updated pixel position;
  • based on the first predicted value and the updated pixel position, a second predicted value corresponding to the current sub-block is determined, and the second predicted value is determined as the inter-frame predicted value of the current sub-block.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein, including but not limited to disk storage, optical storage, and the like.
  • An inter-frame prediction method, an encoder, a decoder, and a computer storage medium are provided by the embodiments of the present application. The decoder parses the code stream and obtains the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter-frame prediction mode is used to determine the inter prediction value of the current block,
  • the first motion vector of the current sub-block of the current block is determined, wherein the current block includes a plurality of sub-blocks; the first prediction value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of the pixel on which secondary prediction or PROF processing is performed for the pixel at the current pixel position; if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain the updated pixel position;
  • a second predicted value corresponding to the current sub-block is determined based on the first predicted value and the updated pixel position, and the second predicted value is determined as the inter prediction value of the current sub-block. In this way, the pixel positions used for secondary prediction or PROF processing can be limited to the same sub-block by expanding the boundary of the current sub-block, redefining target pixel positions beyond the boundary of the current sub-block, and so on, thus solving the problem of degraded prediction performance caused by disconnected pixels, reducing prediction errors, greatly improving encoding performance, and thereby improving encoding and decoding efficiency.


Abstract

Embodiments of the present application disclose an inter-frame prediction method, an encoder, a decoder, and a computer storage medium, including: parsing a code stream to obtain a prediction mode parameter of a current block; when the prediction mode parameter indicates that an inter prediction mode is used to determine the inter prediction value of the current block, determining a first motion vector of a current sub-block of the current block; determining a first predicted value of the current sub-block based on the first motion vector, and determining a target pixel position corresponding to a current pixel position, where the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used for secondary prediction or PROF processing of the pixel at the current pixel position; if the target pixel position does not belong to the current sub-block, updating the target pixel position according to the current sub-block to obtain an updated pixel position; and determining a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determining the second predicted value as the inter prediction value of the current sub-block.

Description

Inter-frame prediction method, encoder, decoder, and computer storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application with application number 202010845318.5, entitled "Inter-frame prediction method, encoder, decoder, and computer storage medium", filed with the Chinese Patent Office on August 20, 2020, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the technical field of video encoding and decoding, and in particular to an inter-frame prediction method, an encoder, a decoder, and a computer storage medium.
Background
In the field of video encoding and decoding, in order to balance performance and cost, affine prediction in Versatile Video Coding (VVC) and in the Audio Video coding Standard Workgroup of China (AVS) is generally implemented on a sub-block basis. At present, on the one hand, prediction refinement with optical flow (PROF) has been proposed, which uses the optical flow principle to refine sub-block-based affine prediction; on the other hand, it has been proposed to use secondary prediction after sub-block-based prediction to obtain more accurate predicted values, thereby improving affine prediction. Specifically, PROF relies on the horizontal and vertical gradients at a reference position to correct the affine prediction result, while secondary prediction uses a filter to predict a pixel position in a sub-block once again.
However, both secondary prediction and PROF need to use other pixel positions around the current pixel position. Once one or more of these other pixel positions are not in the same sub-block as the current pixel position, the accuracy of the calculation may be greatly reduced. That is to say, the existing PROF refinement method and secondary prediction method are in fact not rigorous; when improving affine prediction, they are not well suited to all scenarios, and the coding performance remains to be improved.
Summary
The present application proposes an inter-frame prediction method, an encoder, a decoder, and a computer storage medium, which can greatly improve coding performance and thereby improve encoding and decoding efficiency.
The technical solutions of the present application are implemented as follows:
In a first aspect, an embodiment of the present application provides an inter-frame prediction method, applied to a decoder, the method including:
parsing a code stream to obtain a prediction mode parameter of a current block;
when the prediction mode parameter indicates that an inter prediction mode is used to determine an inter prediction value of the current block, determining a first motion vector of a current sub-block of the current block, wherein the current block includes a plurality of sub-blocks;
determining a first predicted value of the current sub-block based on the first motion vector, and determining a target pixel position corresponding to a current pixel position, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used for secondary prediction or PROF processing of the pixel at the current pixel position;
if the target pixel position does not belong to the current sub-block, updating the target pixel position according to the current sub-block to obtain an updated pixel position;
determining a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determining the second predicted value as the inter prediction value of the current sub-block.
In a second aspect, an embodiment of the present application provides an inter-frame prediction method, applied to an encoder, the method including:
determining a prediction mode parameter of a current block;
when the prediction mode parameter indicates that an inter prediction mode is used to determine an inter prediction value of the current block, determining a first motion vector of a current sub-block of the current block, wherein the current block includes a plurality of sub-blocks;
determining a first predicted value of the current sub-block based on the first motion vector, and determining a target pixel position corresponding to a current pixel position, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used for secondary prediction or PROF processing of the pixel at the current pixel position;
if the target pixel position does not belong to the current sub-block, updating the target pixel position according to the current sub-block to obtain an updated pixel position;
determining a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determining the second predicted value as the inter prediction value of the current sub-block.
In a third aspect, an embodiment of the present application provides a decoder, the decoder including a parsing part, a first determining part, and a first update part,
the parsing part is configured to parse a code stream to obtain a prediction mode parameter of a current block;
the first determining part is configured to, when the prediction mode parameter indicates that an inter prediction mode is used to determine an inter prediction value of the current block, determine a first motion vector of a current sub-block of the current block, wherein the current block includes a plurality of sub-blocks; determine a first predicted value of the current sub-block based on the first motion vector, and determine a target pixel position corresponding to a current pixel position, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used for secondary prediction or PROF processing of the pixel at the current pixel position;
the first update part is configured to, if the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain an updated pixel position;
the first determining part is further configured to determine a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determine the second predicted value as the inter prediction value of the current sub-block.
In a fourth aspect, an embodiment of the present application provides a decoder, the decoder including a first processor and a first memory storing instructions executable by the first processor, where the instructions, when executed by the first processor, implement the inter-frame prediction method as described above.
In a fifth aspect, an embodiment of the present application provides an encoder, the encoder including a second determining part and a second update part,
the second determining part is configured to determine a prediction mode parameter of a current block; when the prediction mode parameter indicates that an inter prediction mode is used to determine an inter prediction value of the current block, determine a first motion vector of a current sub-block of the current block, wherein the current block includes a plurality of sub-blocks; determine a first predicted value of the current sub-block based on the first motion vector, and determine a target pixel position corresponding to a current pixel position, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of a pixel used for secondary prediction or PROF processing of the pixel at the current pixel position;
the second update part is configured to, if the target pixel position does not belong to the current sub-block, update the target pixel position according to the current sub-block to obtain an updated pixel position;
the second determining part is further configured to determine a second predicted value corresponding to the current sub-block based on the first predicted value and the updated pixel position, and determine the second predicted value as the inter prediction value of the current sub-block.
In a sixth aspect, an embodiment of the present application provides an encoder, the encoder including a second processor and a second memory storing instructions executable by the second processor, where the instructions, when executed by the second processor, implement the inter-frame prediction method as described above.
In a seventh aspect, an embodiment of the present application provides a computer storage medium storing a computer program, where the computer program, when executed by a first processor and a second processor, implements the inter-frame prediction method as described above.
The embodiments of the present application provide an inter-frame prediction method, an encoder, a decoder, and a computer storage medium. The decoder parses the code stream and obtains the prediction mode parameter of the current block; when the prediction mode parameter indicates that the inter prediction mode is used to determine the inter prediction value of the current block, the first motion vector of the current sub-block of the current block is determined, wherein the current block includes a plurality of sub-blocks; the first predicted value of the current sub-block is determined based on the first motion vector, and the target pixel position corresponding to the current pixel position is determined, wherein the current pixel position is the position of a pixel in the current sub-block, and the target pixel position is the position of the pixel used for secondary prediction or PROF processing of the pixel at the current pixel position; if the target pixel position does not belong to the current sub-block, the target pixel position is updated according to the current sub-block to obtain the updated pixel position; based on the first predicted value and the updated pixel position, the second predicted value corresponding to the current sub-block is determined, and the second predicted value is determined as the inter prediction value of the current sub-block. In other words, with the inter-frame prediction method proposed in the present application, after sub-block-based prediction, if the pixel positions needed for secondary prediction or PROF processing are not in the same sub-block, the pixel positions to be used can be restricted to the same sub-block in various ways, such as extending the boundary of the current sub-block or redefining target pixel positions beyond the boundary of the current sub-block, thereby solving the problem of degraded prediction performance caused by disconnected pixels, reducing prediction errors, greatly improving coding performance, and thus improving encoding and decoding efficiency.
Brief description of the drawings
FIG. 1 is a first schematic diagram of an affine model;
FIG. 2 is a second schematic diagram of an affine model;
FIG. 3 is a schematic diagram of pixel interpolation;
FIG. 4 is a first schematic diagram of sub-block interpolation;
FIG. 5 is a second schematic diagram of sub-block interpolation;
FIG. 6 is a schematic diagram of the motion vector of each sub-block;
FIG. 7 is a schematic diagram of sample positions;
FIG. 8 is a schematic diagram of secondary prediction for a current pixel position;
FIG. 9 is a first schematic diagram of pixel positions not belonging to the same sub-block;
FIG. 10 is a second schematic diagram of pixel positions not belonging to the same sub-block;
FIG. 11 is a schematic diagram of applying PROF to a current pixel position;
FIG. 12 is a third schematic diagram of pixel positions not belonging to the same sub-block;
FIG. 13 is a schematic block diagram of a video encoding system provided by an embodiment of the present application;
FIG. 14 is a schematic block diagram of a video decoding system provided by an embodiment of the present application;
FIG. 15 is a first schematic flowchart of an implementation of an inter-frame prediction method;
FIG. 16 is a second schematic flowchart of an implementation of an inter-frame prediction method;
FIG. 17 is a schematic diagram of extending a current sub-block;
FIG. 18 is a first schematic diagram of a two-dimensional filter;
FIG. 19 is a second schematic diagram of a two-dimensional filter;
FIG. 20 is a third schematic flowchart of an implementation of an inter-frame prediction method;
FIG. 21 is a schematic diagram of a 4x4 sub-block;
FIG. 22 is a first schematic diagram of an extended sub-block;
FIG. 23 is a second schematic diagram of an extended sub-block;
FIG. 24 is a schematic diagram of replacing pixels;
FIG. 25 is a fourth schematic flowchart of an implementation of an inter-frame prediction method;
FIG. 26 is a schematic diagram of substitute pixel positions;
FIG. 27 is a first schematic diagram of the composition structure of a decoder;
FIG. 28 is a second schematic diagram of the composition structure of a decoder;
FIG. 29 is a first schematic diagram of the composition structure of an encoder;
FIG. 30 is a second schematic diagram of the composition structure of an encoder.
Detailed description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. It should be understood that the specific embodiments described herein are only intended to explain the related application, not to limit it. It should also be noted that, for ease of description, only the parts related to the relevant application are shown in the drawings.
Currently, general-purpose video coding standards all adopt a block-based hybrid coding framework. Each frame of a video image is partitioned into square Largest Coding Units (LCUs) of the same size (such as 128×128, 64×64, etc.), and each largest coding unit can be further divided into rectangular Coding Units (CUs) according to rules; moreover, a coding unit may be further divided into smaller Prediction Units (PUs). Specifically, the hybrid coding framework may include modules such as prediction, transform, quantization, entropy coding, and in-loop filtering; the prediction module may include intra prediction and inter prediction, and inter prediction may include motion estimation and motion compensation. Since there is a strong correlation between adjacent pixels within a frame of a video image, using intra prediction in video coding technology can eliminate the spatial redundancy between adjacent pixels; and since there is also a strong similarity between adjacent frames of a video image, using inter prediction in video coding technology eliminates the temporal redundancy between adjacent frames, thereby improving coding efficiency. The present application is described in detail below with respect to inter prediction.
Inter prediction uses frames that have already been encoded/decoded to predict the part of the current frame to be encoded/decoded. In a block-based coding framework, the part to be encoded/decoded is usually a coding unit or a prediction unit; here, the coding unit or prediction unit to be encoded/decoded is collectively referred to as the current block. Translational motion is a common and simple type of motion in video, so translational prediction is also a traditional prediction method in video coding. Translational motion in video can be understood as a part of the content moving, over time, from a certain position in one frame to a certain position in another frame. A simple unidirectional prediction of translation can be represented by a motion vector (MV) between a certain frame and the current frame. The certain frame mentioned here is a reference frame of the current frame; using this motion information, which contains the reference frame and the motion vector, the current block can find a reference block of the same size as itself in the reference frame and use this reference block as its prediction block. In ideal translational motion, the content of the current block undergoes no deformation, rotation, or other changes between frames, nor changes in brightness or color; however, the content in a video does not always conform to such an ideal situation. Bidirectional prediction can solve the above problem to a certain extent. Usual bidirectional prediction refers to bidirectional translational prediction, that is, using the motion information of two reference frames and two motion vectors to find, from the two reference frames (which may be the same reference frame), two reference blocks of the same size as the current block, and using these two reference blocks to generate the prediction block of the current block. Generation methods include averaging, weighted averaging, and some other calculations.
In the present application, prediction can be regarded as part of motion compensation. Some literature refers to what the present application calls prediction as motion compensation; for example, what the present application calls affine prediction is called affine motion compensation in some literature.
Rotation, zooming in, zooming out, warping, deformation, and so on are also common changes in video. However, ordinary translational prediction cannot handle such changes well, so affine prediction models have been applied to video coding, such as the affine modes in VVC and AVS, where the affine prediction models of VVC and AVS3 are similar. Under changes such as rotation, zooming, warping, and deformation, it can be considered that not all points of the current block use the same MV, so the MV of each point needs to be derived. The affine prediction model derives the MV of each point by calculation from a small number of parameters. The affine prediction models of both VVC and AVS3 use 2-control-point (4-parameter) and 3-control-point (6-parameter) models. The 2 control points are the top-left and top-right corners of the current block; the 3 control points are the top-left, top-right, and bottom-left corners of the current block. Exemplarily, FIG. 1 is a first schematic diagram of an affine model and FIG. 2 is a second schematic diagram of an affine model, as shown in FIGS. 1 and 2. Because each MV includes an x component and a y component, 2 control points give 4 parameters and 3 control points give 6 parameters.
According to the affine prediction model, an MV can be derived for each pixel position, and each pixel position can find its corresponding position in the reference frame. If this position is not an integer pixel position, the value of this sub-pixel position needs to be obtained by interpolation. The interpolation methods used in current video coding standards are usually implemented with Finite Impulse Response (FIR) filters, and the complexity (cost) of implementing them in this way is very high. For example, in AVS3, an 8-tap interpolation filter is used for the luma component; the sub-pixel precision of the normal mode is 1/4 pixel, while the sub-pixel precision of the affine mode is 1/16 pixel. Each sub-pixel point at 1/16-pixel precision needs to be interpolated from 8 integer pixels in the horizontal direction and 8 integer pixels in the vertical direction, that is, 64 integer pixels. FIG. 3 is a schematic diagram of pixel interpolation. As shown in FIG. 3, the circular pixel is the desired sub-pixel point, the dark square pixel is the position of the integer pixel corresponding to this sub-pixel, and the vector between the two is the motion vector of the sub-pixel. The light square pixels are the pixels needed for the interpolation of the circular sub-pixel position; to obtain the value of this sub-pixel position, the pixel values of this 8×8 region of light square pixels, which also contains the dark pixel position, are needed for interpolation.
In traditional translational prediction, the MV of each pixel position of the current block is the same. The concept of sub-blocks can be introduced further, with sub-block sizes such as 4x4, 8x8, etc. FIG. 4 is a first schematic diagram of sub-block interpolation; the pixel region needed for interpolation of a 4x4 block is shown in FIG. 4. FIG. 5 is a second schematic diagram of sub-block interpolation; the pixel region needed for interpolation of an 8x8 block is shown in FIG. 5.
If the MV of each pixel position in a sub-block is the same, then the pixel positions in a sub-block can be interpolated together, thereby sharing the bandwidth, using filters of the same phase, and sharing the intermediate values of the interpolation process. But if each pixel uses its own MV, the bandwidth will increase, filters of different phases may be needed, and the intermediate values of the interpolation process cannot be shared.
Point-based affine prediction is very costly. Therefore, to balance performance and cost, affine prediction in VVC and AVS3 is implemented on a sub-block basis. In AVS3, the sub-block sizes are 4x4 and 8x8; VVC uses a 4x4 sub-block size. Each sub-block has one MV, and the pixel positions inside a sub-block share the same MV, so interpolation is performed uniformly for all pixel positions inside the sub-block. With the above method, the motion compensation complexity of sub-block-based affine prediction is close to that of other sub-block-based prediction methods.
It can be seen that in the sub-block-based affine prediction method, the pixel positions inside a sub-block share the same MV, and the method for determining this shared MV is to take the MV of the center of the current sub-block. For sub-blocks such as 4x4 and 8x8, which have an even number of pixels in at least one of the horizontal and vertical directions, the center actually falls on a non-integer pixel position. Current standards all take an integer pixel position; for example, for a 4x4 sub-block, the pixel position at (2,2) from the top-left corner is taken, and for an 8x8 sub-block, the pixel position at (4,4) from the top-left corner is taken.
The affine prediction model can derive the MV of each pixel position according to the control points (2 or 3 control points) used by the current block. In sub-block-based affine prediction, the MV of the position described in the previous paragraph is calculated as the MV of that sub-block. FIG. 6 is a schematic diagram of the motion vector of each sub-block. As shown in FIG. 6, to derive the motion vector of each sub-block, the motion vector of the center sample of each sub-block is calculated as shown, rounded to 1/16 precision, and then motion compensation is performed.
As technology has developed, a prediction improvement technique using optical flow, called PROF, has been proposed. This technique can improve the predicted values of sub-block-based affine prediction without increasing bandwidth. After sub-block-based affine prediction is completed, the horizontal and vertical gradients are calculated for each pixel that has completed sub-block-based affine prediction. In VVC, PROF uses a 3-tap filter [-1, 0, 1] when calculating gradients, and the calculation method is the same as that of Bi-directional Optical flow (BDOF). Then, for each pixel position, its motion vector deviation is calculated; this motion vector deviation is the difference between the motion vector of the current pixel position and the MV used by the whole sub-block. These motion vector deviations can all be calculated according to the formulas of the affine prediction model. Due to the characteristics of the formulas, the motion vector deviations at the same positions of some sub-blocks are identical; for these sub-blocks, only one set of motion vector deviations needs to be calculated, and other sub-blocks can directly reuse these values. For each pixel position, the horizontal and vertical gradients of that point and the motion vector deviation (including the horizontal deviation and the vertical deviation) are used to calculate a correction value for the predicted value at that pixel position; then the original predicted value, i.e., the predicted value of the sub-block-based affine prediction, plus the correction value, gives the corrected predicted value.
When calculating the horizontal and vertical gradients, the [-1, 0, 1] filter is used. That is, for the current pixel position, the horizontal direction uses the predicted values of the pixel positions at a distance of 1 to its left and to its right, and the vertical direction uses the predicted values of the pixel positions at a distance of 1 above and below it. If the current pixel position is a boundary position of the current block, some of the above pixel positions will exceed the boundary of the current block by one pixel distance. The predicted values at the boundary of the current block are padded outward by one pixel distance to satisfy the gradient calculation, so the predicted values one pixel distance beyond the boundary of the current block do not need to be additionally generated. Since the gradient calculation only needs the predicted values of the sub-block-based affine prediction, no additional bandwidth is needed.
The VVC standard text describes the derivation, from the control-point MVs, of the MVs of the sub-blocks of the current block and of the motion vector deviations of the pixel positions within a sub-block. In VVC, the pixel position used as the sub-block MV is the same for every sub-block, so only one set of sub-block motion vector deviations needs to be derived, and other sub-blocks can reuse it. Further, in the VVC standard text's description of the PROF process, the PROF calculation of motion vector deviations is included in the above process.
When calculating the sub-block MVs, affine prediction in AVS3 follows the same basic principle as VVC, but AVS3 has special handling for the top-left sub-block A, the top-right sub-block B, and the bottom-left sub-block C of the current block.
The following is the description in the AVS3 standard text of the derivation of the motion vector array of affine motion unit sub-blocks:
If there are 3 motion vectors in the affine control point motion vector group, the motion vector group can be expressed as mvsAffine(mv0, mv1, mv2); if there are 2 motion vectors in the affine control point motion vector group, the motion vector group can be expressed as mvsAffine(mv0, mv1). Then, the motion vector array of affine motion unit sub-blocks can be derived according to the following steps:
1. Calculate the variables dHorX, dVerX, dHorY and dVerY:
dHorX=(mv1_x-mv0_x)<<(7-Log(width));
dHorY=(mv1_y-mv0_y)<<(7-Log(width));
If the motion vector group is mvsAffine(mv0, mv1, mv2), then:
dVerX=(mv2_x-mv0_x)<<(7-Log(height));
dVerY=(mv2_y-mv0_y)<<(7-Log(height));
If the motion vector group is mvsAffine(mv0, mv1), then:
dVerX=-dHorY;
dVerY=dHorX;
It should be noted that FIG. 7 is a schematic diagram of sample positions. As shown in FIG. 7, (xE, yE) is the position of the top-left sample of the luma prediction block of the current prediction unit in the luma sample matrix of the current picture; the width and height of the current prediction unit are width and height respectively; the width and height of each sub-block are subwidth and subheight respectively; the sub-block containing the top-left sample of the luma prediction block of the current prediction unit is A, the sub-block containing the top-right sample is B, and the sub-block containing the bottom-left sample is C.
2.1. If the prediction reference mode of the current prediction unit is 'Pred_List01' or AffineSubblockSizeFlag is equal to 1 (AffineSubblockSizeFlag indicates the sub-block size), then subwidth and subheight are both equal to 8, and (x, y) are the coordinates of the top-left position of an 8x8 sub-block; the motion vector mvE(mvE_x, mvE_y) of each 8x8 luma sub-block can then be calculated:
If the current sub-block is A, xPos and yPos are both equal to 0;
If the current sub-block is B, xPos is equal to width and yPos is equal to 0;
If the current sub-block is C and there are 3 motion vectors in mvsAffine, xPos is equal to 0 and yPos is equal to height;
Otherwise, xPos is equal to (x-xE)+4 and yPos is equal to (y-yE)+4;
Therefore, the motion vector mvE of the current 8x8 sub-block is:
mvE_x=Clip3(-131072,131071,Rounding((mv0_x<<7)+dHorX×xPos+dVerX×yPos,7));
mvE_y=Clip3(-131072,131071,Rounding((mv0_y<<7)+dHorY×xPos+dVerY×yPos,7));
2.2. If the prediction reference mode of the current prediction unit is 'Pred_List0' or 'Pred_List1' and AffineSubblockSizeFlag is equal to 0, then subwidth and subheight are both equal to 4, and (x, y) are the coordinates of the top-left position of a 4x4 sub-block; calculate the motion vector mvE(mvE_x, mvE_y) of each 4x4 luma sub-block:
If the current sub-block is A, xPos and yPos are both equal to 0;
If the current sub-block is B, xPos is equal to width and yPos is equal to 0;
If the current sub-block is C and there are 3 motion vectors in mvsAffine, xPos is equal to 0 and yPos is equal to height;
Otherwise, xPos is equal to (x-xE)+2 and yPos is equal to (y-yE)+2;
Therefore, the motion vector mvE of the current 4x4 sub-block is:
mvE_x=Clip3(-131072,131071,Rounding((mv0_x<<7)+dHorX×xPos+dVerX×yPos,7));
mvE_y=Clip3(-131072,131071,Rounding((mv0_y<<7)+dHorY×xPos+dVerY×yPos,7)).
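The derivation above can be sketched in code as follows for the 3-control-point case. This is an illustrative sketch only: the handling of negative values inside the Rounding operator is an assumption (the exact definition is given elsewhere in the AVS3 text), and the MVs are taken as (x, y) tuples in 1/16-pel units.

```python
import math

def clip3(lo, hi, v):
    """Clip v into [lo, hi], as in the Clip3 operator above."""
    return max(lo, min(hi, v))

def rounding(v, s):
    """Rounded right shift by s bits; sign handling is an assumption."""
    return (v + (1 << (s - 1))) >> s if v >= 0 else -((-v + (1 << (s - 1))) >> s)

def subblock_mv(mv0, mv1, mv2, width, height, x_pos, y_pos):
    """Sub-block MV mvE for mvsAffine(mv0, mv1, mv2), per the steps above.

    width/height are the prediction unit dimensions (powers of two);
    (x_pos, y_pos) is the xPos/yPos value chosen by the A/B/C/otherwise rules.
    """
    d_hor_x = (mv1[0] - mv0[0]) << (7 - int(math.log2(width)))
    d_hor_y = (mv1[1] - mv0[1]) << (7 - int(math.log2(width)))
    d_ver_x = (mv2[0] - mv0[0]) << (7 - int(math.log2(height)))
    d_ver_y = (mv2[1] - mv0[1]) << (7 - int(math.log2(height)))
    mv_e_x = clip3(-131072, 131071,
                   rounding((mv0[0] << 7) + d_hor_x * x_pos + d_ver_x * y_pos, 7))
    mv_e_y = clip3(-131072, 131071,
                   rounding((mv0[1] << 7) + d_hor_y * x_pos + d_ver_y * y_pos, 7))
    return mv_e_x, mv_e_y
```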
The following is the description in the AVS3 text of affine prediction sample derivation and luma and chroma sample interpolation:
Let the position of the top-left sample of the luma prediction block of the current prediction unit in the luma sample matrix of the current picture be (xE, yE).
If the prediction reference mode of the current prediction unit is 'PRED_List0' and the value of AffineSubblockSizeFlag is 0, mv0E0 is the L0 motion vector of the 4x4 unit at position (xE+x, yE+y) in the MvArrayL0 motion vector set. The value of element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix with reference index RefIdxL0 in reference picture list 0, and the value of element predMatrixL0[x][y] in the chroma prediction sample matrix predMatrixL0 is the sample value at position (((xE+2x)<<4)+MvC_x, ((yE+2y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL0 in reference picture list 0. Here, x1=((xE+2x)>>3)<<3, y1=((yE+2y)>>3)<<3, mv1E0 is the L0 motion vector of the 4x4 unit at position (x1, y1) in the MvArrayL0 motion vector set, mv2E0 is the L0 motion vector of the 4x4 unit at position (x1+4, y1), mv3E0 is the L0 motion vector of the 4x4 unit at position (x1, y1+4), and mv4E0 is the L0 motion vector of the 4x4 unit at position (x1+4, y1+4).
MvC_x=(mv1E0_x+mv2E0_x+mv3E0_x+mv4E0_x+2)>>2
MvC_y=(mv1E0_y+mv2E0_y+mv3E0_y+mv4E0_y+2)>>2
If the prediction reference mode of the current prediction unit is 'PRED_List0' and the value of AffineSubblockSizeFlag is 1, mv0E0 is the L0 motion vector of the 8x8 unit at position (xE+x, yE+y) in the MvArrayL0 motion vector set. The value of element predMatrixL0[x][y] in the luma prediction sample matrix predMatrixL0 is the sample value at position (((xE+x)<<4)+mv0E0_x, ((yE+y)<<4)+mv0E0_y) in the 1/16-precision luma sample matrix with reference index RefIdxL0 in reference picture list 0, and the value of element predMatrixL0[x][y] in the chroma prediction sample matrix predMatrixL0 is the sample value at position (((xE+2x)<<4)+MvC_x, ((yE+2y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL0 in reference picture list 0. Here, MvC_x is equal to mv0E0_x and MvC_y is equal to mv0E0_y.
If the prediction reference mode of the current prediction unit is 'PRED_List1' and the value of AffineSubblockSizeFlag is 0, mv0E1 is the L1 motion vector of the 4x4 unit at position (xE+x, yE+y) in the MvArrayL1 motion vector set. The value of element predMatrixL1[x][y] in the luma prediction sample matrix predMatrixL1 is the sample value at position (((xE+x)<<4)+mv0E1_x, ((yE+y)<<4)+mv0E1_y) in the 1/16-precision luma sample matrix with reference index RefIdxL1 in reference picture list 1, and the value of element predMatrixL1[x][y] in the chroma prediction sample matrix predMatrixL1 is the sample value at position (((xE+2x)<<4)+MvC_x, ((yE+2y)<<4)+MvC_y) in the 1/32-precision chroma sample matrix with reference index RefIdxL1 in reference picture list 1. Here, x1=((xE+2x)>>3)<<3, y1=((yE+2y)>>3)<<3, mv1E1 is the L1 motion vector of the 4x4 unit at position (x1, y1) in the MvArrayL1 motion vector set, mv2E1 is the L1 motion vector of the 4x4 unit at position (x1+4, y1), mv3E1 is the L1 motion vector of the 4x4 unit at position (x1, y1+4), and mv4E1 is the L1 motion vector of the 4x4 unit at position (x1+4, y1+4).
MvC_x=(mv1E1_x+mv2E1_x+mv3E1_x+mv4E1_x+2)>>2
MvC_y=(mv1E1_y+mv2E1_y+mv3E1_y+mv4E1_y+2)>>2
如果当前预测单元的预测参考模式是‘PRED_List1’且AffineSubblockSizeFlag的值为1,mv0E1是MvArrayL1运动矢量集合在(xE+x,yE+y)位置的、8x8单元的L1运动矢量。亮度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值,是参考图像队列1中参考索引为RefIdxL1的1/16精度亮度样本矩阵中位置为(((xE+x)<<4)+mv0E1_x,((yE+y)<<4)+mv0E1_y)的样本值,色度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/32精度色度样本矩阵中位置为(((xE+2x)<<4)+MvC_x,((yE+2y)<<4)+MvC_y)的样本值。其中MvC_x等于mv0E1_x,MvC_y等于mv0E1_y。
如果当前预测单元的预测模式是‘PRED_List01’,mv0E0是MvArrayL0运动矢量集合在(xE+x,yE+y)位置的8x8单元的L0运动矢量,mv0E1是MvArrayL1运动矢量集合在(x,y)位置的、8x8单元的L1运动矢量。亮度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/16精度亮度样本矩阵中位置为(((xE+x)<<4)+mv0E0_x,((yE+y)<<4)+mv0E0_y)的样本值,色度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/32精度色度样本矩阵中位置为(((xE+2x)<<4)+MvC0_x,((yE+2y)<<4)+MvC0_y)的样本值,亮度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/16精度亮度样本矩阵中位置为((((xE+x)<<4)+mv0E1_x,((yE+y)<<4)+mv0E1_y))的样本值,色度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/32精度色度样本矩阵中位置为(((xE+2x)<<4)+MvC1_x,((yE+2y)<<4)+MvC1_y)的样本值。其中MvC0_x等于mv0E0_x,MvC0_y等于mv0E0_y,MvC1_x等于mv0E1_x,MvC1_y等于mv0E1_y。
其中,参考图像的亮度1/16精度样本矩阵和色度1/32精度样本矩阵中各个位置的元素值通过如下的仿射亮度样本插值过程和仿射色度样本插值过程所定义的插值方法得到。参考图像外的整数样本应使用该图像内距离该样本最近的整数样本(边缘或角样本)代替,即运动矢量能指向参考图像外的样本。
具体地,仿射亮度样本插值过程如下:
A,B,C,D是相邻整像素样本,dx与dy是整像素样本A周边分像素样本a(dx,dy)与A的水平和垂直距离,dx等于fx&15,dy等于fy&15,其中(fx,fy)是该分像素样本在1/16精度的亮度样本矩阵中的坐标。整像素A x,y的周边有255个分像素样本a x,y(dx,dy)。
具体地,样本位置a x,0(x=1~15)由水平方向上距离插值点最近的8个整数值滤波得到,预测值的获取方式如下:
a x,0=Clip1((fL[x][0]×A -3,0+fL[x][1]×A -2,0+fL[x][2]×A -1,0+fL[x][3]×A 0,0+fL[x][4]×A 1,0+fL[x][5]×A 2,0+fL[x][6]×A 3,0+fL[x][7]×A 4,0+32)>>6)。
具体地,样本位置a 0,y(y=1~15)由垂直方向上距离插值点最近的8个整数值滤波得到,预测值的获取方式如下:
a 0,y=Clip1((fL[y][0]×A 0,-3+fL[y][1]×A 0,-2+fL[y][2]×A 0,-1+fL[y][3]×A 0,0+fL[y][4]×A 0,1+fL[y][5]×A 0,2+fL[y][6]×A 0,3+fL[y][7]×A 0,4+32)>>6)。
具体地,样本位置a x,y(x=1~15,y=1~15)的预测值的获取方式如下:
a x,y=Clip1((fL[y][0]×a' x,y-3+fL[y][1]×a' x,y-2+fL[y][2]×a' x,y-1+fL[y][3]×a' x,y+fL[y][4]×a' x,y+1+fL[y][5]×a' x,y+2+fL[y][6]×a' x,y+3+fL[y][7]×a' x,y+4+(1<<(19-BitDepth)))>>(20-BitDepth))。
其中:
a' x,y=(fL[x][0]×A -3,y+fL[x][1]×A -2,y+fL[x][2]×A -1,y+fL[x][3]×A 0,y+fL[x][4]×A 1,y+fL[x][5]×A 2,y+fL[x][6]×A 3,y+fL[x][7]×A 4,y+((1<<(BitDepth-8))>>1))>>(BitDepth-8)。
亮度插值滤波器系数如表1所示:
表1
Figure PCTCN2021106589-appb-000001
具体地,仿射色度样本插值过程如下:
A,B,C,D是相邻整像素样本,dx与dy是整像素样本A周边分像素样本a(dx,dy)与A的水平和垂直距离,dx等于fx&31,dy等于fy&31,其中(fx,fy)是该分像素样本在1/32精度的色度样本矩阵中的坐标。整像素A x,y的周边有1023个分像素样本a x,y(dx,dy)。
具体地,对于dx等于0或dy等于0的分像素点,可直接用色度整像素插值得到,对于dx不等于0且dy不等于0的点,使用整像素行(dy等于0)上的分像素进行计算:
if(dx==0){
a x,y(0,dy)=Clip3(0,(1<<BitDepth)-1,(fC[dy][0]×A x,y-1+fC[dy][1]×A x,y+fC[dy][2]×A x,y+1+fC[dy][3]×A x,y+2+32)>>6)
}
else if(dy==0){
a x,y(dx,0)=Clip3(0,(1<<BitDepth)-1,(fC[dx][0]×A x-1,y+fC[dx][1]×A x,y+fC[dx][2]×A x+1,y+fC[dx][3]×A x+2,y+32)>>6)
}
else{
a x,y(dx,dy)=Clip3(0,(1<<BitDepth)-1,(fC[dy][0]×a' x,y-1(dx,0)+fC[dy][1]×a' x,y(dx,0)+fC[dy][2]×a' x,y+1(dx,0)+fC[dy][3]×a' x,y+2(dx,0)+(1<<(19-BitDepth)))>>(20-BitDepth))
}
其中,a' x,y(dx,0)是整像素行上的分像素的临时值,定义为:a' x,y(dx,0)=(fC[dx][0]×A x-1,y+fC[dx][1]×A x,y+fC[dx][2]×A x+1,y+fC[dx][3]×A x+2,y+((1<<(BitDepth-8))>>1))>>(BitDepth-8)。
色度插值滤波器系数如表2所示:
表2
Figure PCTCN2021106589-appb-000002
常见的仿射预测的方法可以包括以下步骤:
步骤101、确定控制点的运动矢量。
步骤102、根据控制点的运动矢量确定子块的运动矢量。
步骤103、根据子块的运动矢量对子块进行预测。
目前,在通过PROF改进基于块的仿射预测的预测值时,具体可以包括以下步骤:
步骤101、确定控制点的运动矢量。
步骤102、根据控制点的运动矢量确定子块的运动矢量。
步骤103、根据子块的运动矢量对子块进行预测。
步骤104、根据控制点的运动矢量与子块的运动矢量确定子块内每个位置与子块的运动矢量偏差。
步骤105、根据控制点的运动矢量确定子块的运动矢量。
步骤106、使用基于子块的预测值,计算每一个位置的水平和垂直方向的梯度。
步骤107、利用光流原理,根据每一个位置的运动矢量偏差和水平、垂直方向的梯度,计算每一个位置的预测值的偏差值。
步骤108、对每一个位置基于子块的预测值加上预测值的偏差值,得到修正后的预测值。
目前,在通过PROF改进基于块的仿射预测的预测值时,具体还可以包括以下步骤:
步骤101、确定控制点的运动矢量。
步骤109、根据控制点的运动矢量确定子块的运动矢量及子块内每个位置与子块的运动矢量的偏差。
步骤103、根据子块的运动矢量对子块进行预测。
步骤106、使用基于子块的预测值,计算每一个位置的水平和垂直方向的梯度。
步骤107、利用光流原理,根据每一个位置的运动矢量偏差和水平、垂直方向的梯度,计算每一个位置的预测值的偏差值。
步骤108、对每一个位置基于子块的预测值加上预测值的偏差值,得到修正后的预测值。
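上述通过PROF改进基于块的仿射预测的各个步骤,可以用如下简化的示意代码串联起来(仅为说明流程的浮点草图:梯度采用带边界钳制的中心差分,函数与变量命名均为本示例的假设,并非AVS3或VVC的规范实现):

```python
# 示意:PROF 修正流程的简化浮点实现(非规范代码,
# 梯度采用子块内带边界钳制的中心差分,均为本示例的假设)。

def prof_refine(pred, dmv):
    """pred: H x W 的基于子块的预测值;dmv: H x W 的 (dmv_x, dmv_y) 运动矢量偏差。"""
    h, w = len(pred), len(pred[0])

    def grad(i, j):
        # 水平方向梯度:使用左、右相邻像素(对应步骤106)
        gx = (pred[i][min(j + 1, w - 1)] - pred[i][max(j - 1, 0)]) / 2.0
        # 垂直方向梯度:使用上、下相邻像素
        gy = (pred[min(i + 1, h - 1)][j] - pred[max(i - 1, 0)][j]) / 2.0
        return gx, gy

    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            gx, gy = grad(i, j)
            dx, dy = dmv[i][j]
            # 光流原理:预测值偏差 = 梯度 · 运动矢量偏差(步骤107),再叠加到预测值上(步骤108)
            out[i][j] = pred[i][j] + gx * dx + gy * dy
    return out
```

其中,运动矢量偏差越小,该一阶泰勒展开(光流)近似越准确,这也与下文对PROF适用范围的讨论一致。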
PROF可以使用光流原理对基于子块的仿射预测进行修正,提高了压缩性能。然而,PROF的应用是基于子块内的像素位置的运动矢量与子块运动矢量的偏差非常小的情况,也就是说,子块内的像素位置的运动矢量与子块运动矢量的偏差非常小的情况下使用PROF的光流计算方法是有效的,但是,由于PROF依赖于基准位置的水平方向和垂直方向的梯度,在实际位置离基准位置较远的情况下,基准位置的水平和垂直方向的梯度并不能真实的反映基准位置和实际位置之间的水平和垂直方向的梯度,因此,在子块内的像素位置的运动矢量与子块运动矢量的偏差较大的情况下,该方法就不是特别有效了。
此时,提出了二次预测的方法来克服PROF的缺陷。解码器进行二次预测的方法可以包括以下步骤:
步骤201、根据子块的运动矢量对子块进行预测,获得预测值。
步骤202、确定子块内每一个位置与子块的运动矢量偏差。
步骤203、根据每一个位置的运动矢量偏差,利用二维滤波器对预测值进行滤波,得到二次预测的预测值。
也就是说,二次预测方法可以在基于子块的预测之后,对运动矢量与子块的运动矢量有偏差的像素位置,在基于子块的预测的基础上进行基于点的二次预测,最终完成对预测值的修正,获得新的预测值,即二次预测的预测值。
具体地,基于点的二次预测使用二维滤波器。二维滤波器是相邻的构成预设形状的点构成的滤波器。相邻的构成预设形状的点可以为9个点。对一个像素位置,滤波器处理的结果为该位置的二次预测的预测值。其中,二维滤波器的滤波系数由每一个位置的运动矢量偏差确定,二维滤波器的输入为预测值,输出为二次预测的预测值。
进一步地,在本申请中,如果本申请中对当前块使用仿射模式,那么需要先对控制点的运动矢量进行确定。解码器进行二次预测的方法可以包括以下步骤:
步骤204、确定控制点的运动矢量。
步骤205、根据控制点的运动矢量确定子块的运动矢量。
步骤201、根据子块的运动矢量对子块进行预测,获得预测值。
步骤202、确定子块内每一个位置与子块的运动矢量偏差。
步骤203、根据每一个位置的运动矢量偏差,利用二维滤波器对预测值进行滤波,得到二次预测的预测值。
由此可见,在本申请的实施例中,在对控制点的运动矢量进行确定之后,可以利用控制点的运动矢量对子块进行预测处理,并在确定出子块中的像素位置与子块的运动矢量偏差之后,对运动矢量与子块的运动矢量有偏差的像素位置,在基于子块的预测的基础上进行基于点的二次预测,最终完成对预测值的修正,获得新的预测值,即二次预测的预测值。
进一步地,在本申请中,也可以将确定子块的运动矢量和确定子块内每个位置与子块的运动矢量偏差的两个步骤同时进行。解码器进行二次预测的方法可以包括以下步骤:
步骤204、确定控制点的运动矢量。
步骤206、根据控制点的运动矢量确定子块的运动矢量,和子块内每一个位置与子块的运动矢量偏差。
步骤201、根据子块的运动矢量对子块进行预测,获得预测值。
步骤203、根据每一个位置的运动矢量偏差,利用二维滤波器对预测值进行滤波,得到二次预测的预测值。
由此可见,在本申请的实施例中,在对控制点的运动矢量进行确定之后,可以同时进行子块的运动矢量和子块内每一个位置与子块的运动矢量偏差的确定,然后对运动矢量与子块的运动矢量有偏差的像素位置,在基于子块的预测的基础上进行基于点的二次预测,最终完成对预测值的修正,获得新的预测值,即二次预测的预测值。
需要说明的是,因为仿射模型可以明确地计算出每一个像素位置的运动矢量,或者说子块内每一个像素位置与子块运动矢量的偏差,因此,二次预测方法可以用于对仿射预测进行改善,当然,二次预测也可以应用于其他基于子块的预测的改善。也就是说,本申请所提出的基于子块的预测包括但不限于仿射基于子块的预测。
进一步地,二次预测方法可以以AVS3标准为基础,也可以应用于VVC标准,本申请不作具体限定。
在使用二次预测方法对子块中的当前像素位置进行二次预测时,会使用到当前像素位置在基于子块的预测的预测块(当前块)中的、相邻的像素位置的像素值。例如,使用的二维滤波器为3x3的矩形滤波器,相邻的像素位置包括当前像素位置的左边一个像素距离的像素位置、右边一个像素距离的像素位置、上边一个像素距离的像素位置、下边一个像素距离的像素位置、左边一个像素距离上边一个像素距离的像素位置、右边一个像素距离上边一个像素距离的像素位置、左边一个像素距离下边一个像素距离的像素位置、右边一个像素距离下边一个像素距离的像素位置这8个像素位置。如果使用其他形状的滤波器,也可能包括对应的其它像素位置。
如果滤波器使用到的与当前像素位置相邻的其他像素位置与当前像素位置不属于相同的子块,即滤波器使用到的像素位置可能属于不止一个子块,存在跨子块的情况。通常假设相邻子块之间相邻的像素是连续的,然而,在实际情况中,由于当前块的多个子块可能是基于不同的mv预测得到的,所以相邻子块之间相邻的像素很可能并不是相连的,或者说它们在参考图像中实际是不相邻的。如果当前块的相邻子块mv之间的差距比较小,这种不相连性可能并不明显,但是如果相邻子块mv之间的差距比较大,这种不相连性就会体现出来。
由此可见,在当前块的相邻子块之间相邻的像素不相连的情况下,它们之间的相关性就会变弱,此时,如果滤波器使用不属于相同的子块的、不相连的像素位置的像素进行滤波,效果就会变差。
图8为对当前像素位置进行二次预测的示意图,如图8所示,使用二次预测方法在基于子块的预测的预测块的基础上进行二次预测时,假设当前像素位置为位置1,即使用3x3的矩形滤波器对位置1进行二次预测,此时,需要使用到与位置1相邻的8个像素位置,其中,位置2为位置1的左相邻像素,即与位置1左边距离为1个像素距离的像素位置,而位置2与位置1并不在相同的子块中,位置1属于子块1,位置2属于子块2。一般情况下,可以认为相邻的子块1和子块2之间相邻的像素是连续的,即位置1和位置2是相连的,但是在参考图像中,它们的位置关系未必如此。
示例性的,图9为像素位置不属于相同子块的示意图一,如图9所示,子块1和子块2这两个相邻的子块在参考图像中的一种可能的相对位置,其中,子块1和子块2的MV相差不大,在参考图像中的位置稍有错位,那么与位置1相邻的位置2在参考图像中的实际位置是子块2中的位置3。可见,子块2中的位置3并不是子块1中位置1的正左方一个像素距离的位置,这两个像素位置之间存在位置偏差。
示例性的,图10为像素位置不属于相同子块的示意图二,如图10所示,子块1和子块2这两个相邻的子块在参考图像中的另一种可能的相对位置,其中,子块2中并不存在与子块1中的位置1正左方相邻的像素位置,如果按照目前的二次预测方法,会选择子块2中的位置3作为子块1中位置1的正左方一个像素距离的位置,可以看出,位置1和位置3相差较远,如果使用位置3作为位置1的相邻像素位置进行滤波,会大大降低最终获得的二次预测的预测效果。
相应地,PROF的情况也是类似的,PROF需要计算水平方向和垂直方向的梯度,水平方向的梯度计算需要使用基于子块的预测的当前块中的、当前像素位置的左边一个像素距离的像素位置和右边一个像素距离的像素位置,垂直方向的梯度计算需要使用基于子块的预测的当前块中的、当前像素位置的上边一个像素距离的像素位置和下边一个像素距离的像素位置,在需要使用的这些位置中的一个或多个与当前像素位置不在同一个子块中的情况下,如果它们在参考图像中不相连,那么梯度计算的准确性会下降。
示例性的,图11为对当前像素位置使用PROF的示意图,如图11所示,PROF计算水平方向的梯度时,假设当前像素位置为位置1,此时,需要使用到与位置1水平相邻的2个像素位置,其中,位置2为位置1的左相邻像素,即与位置1左边距离为1个像素距离的像素位置,而位置2与位置1并不在相同的子块中,位置1属于子块1,位置2属于子块2。一般情况下,可以认为相邻的子块1和子块2之间相邻的像素是连续的,即位置1和位置2是相连的,但是在参考图像中,它们的位置关系未必如此。
图12为像素位置不属于相同子块的示意图三,如图12所示,子块1和子块2的MV相差不大,在参考图像中的位置稍有错位,那么与位置1相邻的位置2在参考图像中的实际位置是子块2中的位置3。可见,子块2中的位置3并不是子块1中位置1的正左方一个像素距离的位置,这两个像素位置之间存在位置偏差。
由此可见,当二次预测或PROF需要使用的像素位置不在同一个子块时,可能会由于像素位置不相连而造成预测性能下降的问题,即现有的PROF修正预测值方法和二次预测方法其实并不严谨,在对仿射预测进行改善时,并不能很好的适用于全部场景,编码性能有待提升。
为了解决现有技术中存在的缺陷,在本申请的实施例中,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,可以通过对当前子块的边界的扩展、对超出当前子块的边界的目标像素位置的重新定义等多种方式,将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,从而解决了因为像素不相连所导致的预测性能下降的问题,能够减小预测的误差,大大提升编码性能,从而提高了编解码效率。
应理解,本申请实施例提供一种视频编码系统,图13为本申请实施例提供的一种视频编码系统的组成框图示意图,如图13所示,该视频编码系统11可以包括:变换单元111、量化单元112、模式选择和编码控制逻辑单元113、帧内预测单元114、帧间预测单元115(包括:运动补偿和运动估计)、反量化单元116、反变换单元117、环路滤波单元118、编码单元119和解码图像缓存单元110;针对输入的原始视频信号,通过编码树块(Coding Tree Unit,CTU)的划分可以得到一个视频重建块,通过模式选择和编码控制逻辑单元113确定编码模式,然后,对经过帧内或帧间预测后得到的残差像素信息,通过变换单元111、量化单元112对该视频重建块进行变换,包括将残差信息从像素域变换到变换域,并对所得的变换系数进行量化,用以进一步减少比特率;帧内预测单元114用于对该视频重建块进行帧内预测;其中,帧内预测单元114用于确定该视频重建块的最优帧内预测模式(即目标预测模式);帧间预测单元115用于执行所接收的视频重建块相对于一或多个参考帧中的一或多个块的帧间预测编码,以提供时间预测信息;其中,运动估计为产生运动向量的过程,运动向量可以估计该视频重建块的运动,然后,运动补偿基于由运动估计所确定的运动向量执行运动补偿;在确定帧间预测模式之后,帧间预测单元115还用于将所选择的帧间预测数据提供到编码单元119,而且,将所计算确定的运动向量数据也发送到编码单元119;此外,反量化单元116和反变换单元117用于该视频重建块的重构建,在像素域中重构建残差块,该重构建残差块通过环路滤波单元118去除方块效应伪影,然后,将该重构残差块添加到解码图像缓存单元110的帧中的一个预测性块,用以产生经重构建的视频重建块;编码单元119用于编码各种编码参数及量化后的变换系数。解码图像缓存单元110用于存放重构建的视频重建块,用于预测参考。随着视频图像编码的进行,会不断生成新的重构建的视频重建块,这些重构建的视频重建块都会被存放在解码图像缓存单元110中。
本申请实施例还提供一种视频解码系统,图14为本申请实施例提供的一种视频解码系统的组成框图示意图,如图14所示,该视频解码系统12可以包括:解码单元121、反变换单元127、反量化单元122、帧内预测单元123、运动补偿单元124、环路滤波单元125和解码图像缓存单元126;输入的视频信号经过视频编码系统11进行编码处理之后,输出该视频信号的码流;该码流输入视频解码系统12中,首先经过解码单元121,用于得到解码后的变换系数;针对该变换系数通过反变换单元127与反量化单元122进行处理,以便在像素域中产生残差块;帧内预测单元123可用于基于所确定的帧内预测方向和来自当前帧或图片的先前经解码块的数据而产生当前视频解码块的预测数据;运动补偿单元124是通过剖析运动向量和其他关联语法元素来确定用于视频解码块的预测信息,并使用该预测信息以产生正被解码的视频解码块的预测性块;通过对来自反变换单元127与反量化单元122的残差块与由帧内预测单元123或运动补偿单元124产生的对应预测性块进行求和,而形成解码的视频块;该解码的视频信号通过环路滤波单元125以便去除方块效应伪影,可以改善视频质量;然后将经解码的视频块存储于解码图像缓存单元126中,解码图像缓存单元126存储用于后续帧内预测或运动补偿的参考图像,同时也用于视频信号的输出,得到所恢复的原始视频信号。
本申请实施例提供的一种帧间预测方法主要作用于视频编码系统11的帧间预测单元115和视频解码系统12的帧间预测单元,即运动补偿单元124;也就是说,如果在视频编码系统11能够通过本申请实施例提供的帧间预测方法得到一个较好的预测效果,那么,对应地,在视频解码系统12,也能够改善视频解码恢复质量。
基于此,下面结合附图和实施例对本申请的技术方案进一步详细阐述。在进行详细阐述之前,需要说明的是,说明书通篇中提到的“第一”、“第二”、“第三”等,仅仅是为了区分不同的特征,不具有限定优先级、先后顺序、大小关系等功能。
需要说明的是,本实施例以AVS3标准为基础进行示例性说明,本申请提出的帧间预测方法同样可以适用于VVC等其他编码标准技术,本申请对此不作具体限定。
本申请实施例提供一种帧间预测方法,该方法应用于视频解码设备,即解码器。该方法所实现的功能可以通过解码器中的第一处理器调用计算机程序来实现,当然计算机程序可以保存在第一存储器中,可见,该解码器至少包括第一处理器和第一存储器。
进一步地,在本申请的实施例中,图15为帧间预测方法的实现流程示意图一,如图15所示,解码器进行帧间预测的方法可以包括以下步骤:
步骤301、解析码流,获取当前块的预测模式参数。
在本申请的实施例中,解码器可以先解析二进制码流,从而获得当前块的预测模式参数。其中,预测模式参数可以用于对当前块所使用的预测模式进行确定。
需要说明的是,待解码图像可以划分为多个图像块,而当前待解码的图像块可以称为当前块(可以用CU表示),与当前块相邻的图像块可以称为相邻块;即在待解码图像中,当前块与相邻块之间具有相邻关系。这里,每个当前块可以包括第一图像分量、第二图像分量和第三图像分量,也即当前块表示待解码图像中当前待进行第一图像分量、第二图像分量或者第三图像分量预测的图像块。
其中,假定当前块进行第一图像分量预测,而且第一图像分量为亮度分量,即待预测图像分量为亮度分量,那么当前块也可以称为亮度块;或者,假定当前块进行第二图像分量预测,而且第二图像分量为色度分量,即待预测图像分量为色度分量,那么当前块也可以称为色度块。
进一步地,在本申请的实施例中,预测模式参数不仅可以指示当前块采用的预测模式,还可以指示与该预测模式相关的参数。
可以理解的是,在本申请的实施例中,预测模式可以包括有帧间预测模式、传统帧内预测模式以及非传统帧内预测模式等。
也就是说,在编码侧,编码器可以选取最优的预测模式对当前块进行预编码,在这过程中就可以确定出当前块的预测模式,然后确定用于指示预测模式的预测模式参数,从而将相应的预测模式参数写入码流,由编码器传输到解码器。
相应地,在解码器侧,解码器通过解析码流便可以直接获取到当前块的预测模式参数,并根据解析获得的预测模式参数确定当前块所使用的预测模式,以及该预测模式对应的相关参数。
进一步地,在本申请的实施例中,解码器在解析获得预测模式参数之后,可以基于预测模式参数确定当前块是否使用帧间预测模式。
步骤302、当预测模式参数指示使用帧间预测模式确定当前块的帧间预测值时,确定当前块的当前子块的第一运动矢量;其中,当前块包括多个子块。
在本申请的实施例中,解码器在解析获得预测模式参数之后,如果解析获得的预测模式参数指示当前块使用帧间预测模式确定当前块的帧间预测值,那么解码器可以先确定出当前块的每一个子块的第一运动矢量。其中,一个子块对应有一个第一运动矢量。
需要说明的是,在本申请的实施例中,当前块为当前帧中待解码的图像块,当前帧以图像块的形式按一定顺序依次进行解码,该当前块为当前帧内按该顺序下一时刻待解码的图像块。当前块可具有多种规格尺寸,例如16×16、32×32或32×16等规格,其中数字表示当前块上像素点的行数和列数。
进一步地,在本申请的实施例中,当前块可以划分为多个子块,其中,当前块的每一个子块均是基于子块的预测的预测子块,每一个子块的尺寸大小都是相同的,子块为较小规格的像素点集合。子块的尺寸可以为8×8或4×4。
示例性的,在本申请中,当前块的尺寸为16×16,可以划分为4个尺寸均为8×8的子块。
可以理解的是,在本申请的实施例中,在解码器解析码流获取到预测模式参数指示使用帧间预测模式确定当前块的帧间预测值的情况下,就可以继续采用本申请实施例所提供的帧间预测方法。
在本申请的实施例中,进一步地,当预测模式参数指示使用帧间预测模式确定当前块的帧间预测值时,解码器确定当前块的当前子块的第一运动矢量的方法可以包括以下步骤:
步骤302a、解析码流,获取当前块的仿射模式参数和预测参考模式。
步骤302b、当仿射模式参数指示使用仿射模式时,确定控制点模式和子块尺寸参数。
步骤302c、根据预测参考模式、控制点模式以及子块尺寸参数,确定第一运动矢量。
在本申请的实施例中,解码器在解析获得预测模式参数之后,如果解析获得的预测模式参数指示当前块使用帧间预测模式确定当前块的帧间预测值,那么解码器可以通过解析码流,获得仿射模式参数和预测参考模式。
需要说明的是,在本申请的实施例中,仿射模式参数用于对是否使用仿射模式进行指示。具体地,仿射模式参数可以为仿射运动补偿允许标志affine_enable_flag,解码器通过仿射模式参数的取值的确定,可以进一步确定是否使用仿射模式。
也就是说,在本申请中,仿射模式参数可以为一个二值变量。若仿射模式参数的取值为1,则指示使用仿射模式;若仿射模式参数的取值为0,则指示不使用仿射模式。
可以理解的是,在本申请中,解码器解析码流,如果未解析得到仿射模式参数,那么也可以理解为指示不使用仿射模式。
示例性的,在本申请中,仿射模式参数的取值可以等于仿射运动补偿允许标志affine_enable_flag的值,如果affine_enable_flag的值为‘1’,表示可使用仿射运动补偿;如果affine_enable_flag的值为‘0’,表示不应使用仿射运动补偿。
进一步地,在本申请的实施例中,如果解码器解析码流所获取的仿射模式参数指示使用仿射模式,那么解码器可以进行控制点模式和子块尺寸参数的获取。
需要说明的是,在本申请的实施例中,控制点模式用于对控制点的个数进行确定。在仿射模型中,一个子块可以有2个控制点或者3个控制点,相应地,控制点模式可以为2个控制点对应的控制点模式,或者为3个控制点对应的控制点模式。即控制点模式可以包括4参数模式和6参数模式。
可以理解的是,在本申请的实施例中,对于AVS3标准,如果当前块使用了仿射模式,那么解码器还需要确定出当前块在仿射模式中控制点的个数进行确定,从而可以确定出使用的是4参数(2个控制点)模式,还是6参数(3个控制点)模式。
进一步地,在本申请的实施例中,如果解码器解析码流所获取的仿射模式参数指示使用仿射模式,那么解码器可以通过解析码流,进一步获取子块尺寸参数。
具体地,子块尺寸参数可以通过仿射预测子块尺寸标志affine_subblock_size_flag确定,解码器通过解析码流,获得子块尺寸标志,并根据子块尺寸标志的取值确定当前块的当前子块的尺寸大小。其中,子块的尺寸大小可以为8×8或4×4。具体地,在本申请中,子块尺寸标志可以为一个二值变量。若子块尺寸标志的取值为1,则指示子块尺寸参数为8×8;若子块尺寸标志的取值为0,则指示子块尺寸参数为4×4。
示例性的,在本申请中,子块尺寸标志的取值可以等于仿射预测子块尺寸标志affine_subblock_size_flag的值,如果affine_subblock_size_flag的值为‘1’,则当前块划分为尺寸为8×8的子块;如果affine_subblock_size_flag的值为‘0’,则当前块划分为尺寸为4×4的子块。
可以理解的是,在本申请中,解码器解析码流,如果未解析得到子块尺寸标志,那么也可以理解为当前块划分为4×4的子块。也就是说,如果码流中不存在affine_subblock_size_flag,可以直接将子块尺寸标志的取值设置为0。
进一步地,在本申请的实施例中,解码器在确定控制点模式和子块尺寸参数之后,便可以根据预测参考模式、控制点模式以及子块尺寸参数,进一步确定出当前块中的当前子块的第一运动矢量。
具体地,在本申请的实施例中,解码器可以先根据预测参考模式确定控制点运动矢量组;然后可以基于控制点运动矢量组、控制点模式以及子块尺寸参数,确定出当前子块的第一运动矢量。
可以理解的是,在本申请的实施例中,控制点运动矢量组可以用于对控制点的运动矢量进行确定。
需要说明的是,在本申请的实施例中,解码器可以按照上述方法,遍历当前块中的每一个子块,利用每一个子块的控制点运动矢量组、控制点模式以及子块尺寸参数,确定出每一个子块的第一运动矢量,从而可以根据每一个子块的第一运动矢量构建获得运动矢量集合。
可以理解的是,在本申请的实施例中,当前块的运动矢量集合中可以包括当前块的每一个子块的第一运动矢量。
进一步地,在本申请的实施例中,解码器在根据控制点运动矢量组、控制点模式以及子块尺寸参数,确定第一运动矢量时,可以先根据控制点运动矢量组、控制点模式以及当前块的尺寸参数,确定差值变量;然后可以基于预测模式参数和子块尺寸参数,确定子块位置;最后,便可以利用差值变量和子块位置,确定子块的第一运动矢量,进而可以获得当前块的多个子块的运动矢量集合。
示例性的,在本申请中,差值变量可以包括4个变量,具体为dHorX、dVerX、dHorY和dVerY,在计算差值变量时,解码器需要先确定出控制点运动矢量组,其中,控制点运动矢量组可以对控制点的运动矢量进行表征。
具体地,如果控制点模式为6参数模式,即存在3个控制点,那么控制点运动矢量组可以为包括3个运动矢量的运动矢量组,表示为mvsAffine(mv0,mv1,mv2);如果控制点模式为4参数模式,即存在2个控制点,那么控制点运动矢量组可以为包括2个运动矢量的运动矢量组,表示为mvsAffine(mv0,mv1)。
接着,解码器便可以利用控制点运动矢量组进行差值变量的计算:
dHorX=(mv1_x-mv0_x)<<(7-Log(width));
dHorY=(mv1_y-mv0_y)<<(7-Log(width));
若运动矢量组为mvsAffine(mv0,mv1,mv2),则:
dVerX=(mv2_x-mv0_x)<<(7-Log(height));
dVerY=(mv2_y-mv0_y)<<(7-Log(height));
若运动矢量组为mvsAffine(mv0,mv1),则:
dVerX=-dHorY;
dVerY=dHorX。
其中,width和height分别为当前块的宽度和高度,即当前块的尺寸参数,具体地,当前块的尺寸参数可以为解码器通过解析码流获取的。
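上述差值变量的计算可以用如下示意代码表示(假设width、height为不超过128的2的幂,Log为以2为底的对数;函数命名为本示例的假设,仅为说明性草图):

```python
# 示意:按上述公式计算差值变量 dHorX、dHorY、dVerX、dVerY
# (假设 width、height 为 2 的幂且不超过 128)。

def affine_deltas(mvs, width, height):
    """mvs 为控制点运动矢量组 [(x, y), ...],含 2 个或 3 个运动矢量。"""
    log_w = width.bit_length() - 1   # Log(width)
    log_h = height.bit_length() - 1  # Log(height)
    mv0, mv1 = mvs[0], mvs[1]
    d_hor_x = (mv1[0] - mv0[0]) << (7 - log_w)
    d_hor_y = (mv1[1] - mv0[1]) << (7 - log_w)
    if len(mvs) == 3:  # 6 参数(3 个控制点)模式
        mv2 = mvs[2]
        d_ver_x = (mv2[0] - mv0[0]) << (7 - log_h)
        d_ver_y = (mv2[1] - mv0[1]) << (7 - log_h)
    else:              # 4 参数(2 个控制点)模式
        d_ver_x = -d_hor_y
        d_ver_y = d_hor_x
    return d_hor_x, d_hor_y, d_ver_x, d_ver_y
```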
进一步地,在本申请的实施例中,解码器在确定出差值变量之后,可以接着基于预测模式参数和子块尺寸参数,确定子块位置。具体地,解码器可以通过子块尺寸标志确定出子块的尺寸,同时可以通过预测模式参数确定具体使用哪一种预测模式,然后便可以根据子块的尺寸和使用的预测模式,确定出子块位置。
示例性的,在本申请中,如果当前块的预测参考模式的取值为2,即为第三参考模式‘Pred_List01’,或者,子块尺寸标志的取值为1,即子块的宽度subwidth和高度subheight均等于8,则(x,y)是8×8子块左上角位置的坐标,那么可以通过以下方式确定子块位置的坐标xPos和yPos:
如果当前子块是当前块的左上角的控制点,则xPos和yPos均等于0;
如果当前子块是当前块的右上角的控制点,则xPos等于width,yPos等于0;
如果当前子块是当前块的左下角的控制点,控制点运动矢量组可以为包括3个运动矢量的运动矢量组,则xPos等于0, yPos等于height;
否则,xPos等于(x-xE)+4,yPos等于(y-yE)+4。
示例性的,在本申请中,如果当前块的预测参考模式的取值为0或1,即为第一参考模式‘Pred_List0’或第二参考模式‘Pred_List1’,且子块尺寸标志的取值为0,即子块的宽度subwidth和高度subheight均等于4,(x,y)是4×4子块左上角位置的坐标,那么可以通过以下方式确定子块位置的坐标xPos和yPos:
如果当前子块是当前块的左上角的控制点,则xPos和yPos均等于0;
如果当前子块是当前块的右上角的控制点,则xPos等于width,yPos等于0;
如果当前子块是当前块的左下角的控制点,控制点运动矢量组可以为包括3个运动矢量的运动矢量组,则xPos等于0,yPos等于height;
否则,xPos等于(x-xE)+2,yPos等于(y-yE)+2。
进一步地,在本申请的实施例中,解码器计算获得子块位置之后,接着可以基于子块位置和差值变量确定出该当前子块的第一运动矢量,最后,遍历当前块的每一个子块,获得每一个子块的第一运动矢量,便可以构建获得当前块的多个子块的运动矢量集合。
示例性的,在本申请中,解码器在确定出子块位置xPos和yPos之后,可以通过以下方式确定出子块的第一运动矢量mvE(mvE_x,mvE_y):
mvE_x=Clip3(-131072,131071,Rounding((mv0_x<<7)+dHorX×xPos+dVerX×yPos,7));
mvE_y=Clip3(-131072,131071,Rounding((mv0_y<<7)+dHorY×xPos+dVerY×yPos,7))。
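上述mvE的计算可以用如下示意代码表示(其中Rounding按“向最近整数舍入、负数对称处理”的方式实现,这一取整细节为本示例的假设,具体应以标准文本的定义为准):

```python
# 示意:由差值变量与子块位置计算子块运动矢量 mvE 的简化实现。

def clip3(lo, hi, v):
    return max(lo, min(hi, v))

def rounding(v, s):
    # 假设:正负对称的最近舍入(与标准文本的精确定义可能略有出入)
    if v >= 0:
        return (v + (1 << (s - 1))) >> s
    return -((-v + (1 << (s - 1))) >> s)

def subblock_mv(mv0, d_hor_x, d_hor_y, d_ver_x, d_ver_y, x_pos, y_pos):
    mve_x = clip3(-131072, 131071,
                  rounding((mv0[0] << 7) + d_hor_x * x_pos + d_ver_x * y_pos, 7))
    mve_y = clip3(-131072, 131071,
                  rounding((mv0[1] << 7) + d_hor_y * x_pos + d_ver_y * y_pos, 7))
    return mve_x, mve_y
```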
需要说明的是,在本申请中,在确定子块内的每个位置与子块的运动矢量的偏差时,如果当前块使用的是仿射预测模型,可以根据仿射预测模型的公式计算出子块内的每个位置的运动矢量,与子块的运动矢量相减得到它们的偏差。如果子块的运动矢量都选择子块内同一位置的运动矢量,如4x4的块使用距离左上角(2,2)的位置,8x8的块使用距离左上角(4,4)的位置,根据现在标准包括VVC和AVS3中使用的仿射模型,每个子块相同位置的运动矢量偏差都是相同的。但是AVS3在左上角、右上角,以及3个控制点的情况下的左下角(即上述AVS3文本中如图7所示的A、B、C位置)与其他块使用的位置不同,相应地,在计算左上角、右上角,以及3个控制点的情况下的左下角的子块的运动矢量偏差时与其他块也不同。
步骤303、基于第一运动矢量确定当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,当前像素位置为当前子块内的一个像素点的位置,目标像素位置为对当前像素位置的像素点进行二次预测或PROF处理的像素点的位置。
在本申请的实施例中,解码器在确定出当前块的每一个子块的第一运动矢量之后,可以先基于当前子块的第一运动矢量确定出当前子块的第一预测值,然后可以确定该当前子块中的当前像素位置所对应的目标像素位置。其中,目标像素位置是与当前像素位置相邻的像素位置。
需要说明的是,在本申请的实施例中,当前像素位置为当前块的当前子块中的一个像素点的位置,其中,当前像素位置可以表征待处理的像素点的位置。具体地,当前像素位置可以为待二次预测的像素点的位置,也可以为待PROF处理的像素点的位置。
进一步地,在本申请的实施例中,目标像素位置为当前像素点周围的、与当前像素位置相邻的像素位置。具体地,如果当前像素位置为待二次预测的像素点的位置,那么基于二次预测所使用的滤波器的形状,可以确定与当前像素位置在各个方向上相邻的像素位置为目标像素位置;如果当前像素位置为待PROF处理的像素点的位置,那么可以确定与当前像素位置在水平方向和垂直方向上相邻的像素位置为目标像素位置。
示例性的,在本申请中,如果使用矩形滤波器对当前像素位置进行二次预测,那么目标像素位置即为当前像素位置在上、下、左、右、左上、左下、右上、右下这8个方向上相邻的像素位置。如果对当前像素位置进行PROF处理,那么目标像素位置即为当前像素位置在水平方向上、左右相邻的2个像素位置,以及当前像素位置在垂直方向上、上下相邻的2个像素位置。
需要说明的是,在本申请的实施例中,目标像素位置中的全部位置可以均属于当前像素位置对应的当前子块,目标像素位置中的部分位置可以不属于当前像素位置对应的当前子块。也就是说,目标像素位置和当前像素位置可以均属于同一个子块,也可以属于不同的子块。
可以理解的是,在本申请的实施例中,步骤303具体可以包括:
步骤303a、基于第一运动矢量确定当前子块的第一预测值。
步骤303b、确定当前像素位置对应的目标像素位置。
其中,本申请实施例提出的帧间预测方法对解码器执行步骤303a和步骤303b的顺序不进行限定,也就是说,在本申请中,在确定出当前块的每一个子块的第一运动矢量之后,解码器可以先执行步骤303a,然后执行步骤303b,也可以先执行步骤303b,再执行步骤303a,还可以同时执行步骤303a和步骤303b。
进一步地,在本申请的实施例中,解码器在基于第一运动矢量确定当前子块的第一预测值时,可以先确定样本矩阵;其中,样本矩阵包括亮度样本矩阵和色度样本矩阵;然后可以根据预测参考模式、子块尺寸参数、样本矩阵以及运动矢量集合,确定第一预测值。
需要说明的是,在本申请的实施例中,解码器在根据预测参考模式、子块尺寸参数、样本矩阵以及运动矢量集合,确定第一预测值时,可以先根据预测参考模式和子块尺寸参数,从运动矢量集合中确定目标运动矢量;然后可以利用预测参考模式对应的参考图像队列和参考索引、样本矩阵以及目标运动矢量,确定预测样本矩阵;其中,预测样本矩阵包括多个子块的第一预测值。
具体地,在本申请的实施例中,样本矩阵可以包括亮度样本矩阵和色度样本矩阵,相应地,解码器确定出的预测样本矩阵可以包括亮度预测样本矩阵和色度预测样本矩阵,其中,亮度预测样本矩阵包括多个子块的第一亮度预测值,色度预测样本矩阵包括多个子块的第一色度预测值,第一亮度预测值和第一色度预测值构成子块的第一预测值。
示例性的,在本申请中,假设当前块左上角样本在当前图像的亮度样本矩阵中的位置为(xE,yE)。如果当前块的预测参考模式取值为0,即使用第一参考模式‘PRED_List0’,且子块尺寸标志的取值为0,即子块尺寸参数为4×4,那么目 标运动矢量mv0E0是当前块的运动矢量集合在(xE+x,yE+y)位置的4×4子块的第一运动矢量。亮度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/16精度亮度样本矩阵中位置为(((xE+x)<<4)+mv0E0_x,((yE+y)<<4)+mv0E0_y)的样本值,色度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/32精度色度样本矩阵中位置为(((xE+2×x)<<4)+MvC_x,((yE+2×y)<<4)+MvC_y)的样本值。其中,x1=((xE+2×x)>>3)<<3,y1=((yE+2×y)>>3)<<3,mv1E0是当前块的运动矢量集合在(x1,y1)位置的4×4单元的第一运动矢量,mv2E0是当前块的运动矢量集合在(x1+4,y1)位置的4×4单元的第一运动矢量,mv3E0是当前块的运动矢量集合在(x1,y1+4)位置的4×4单元的第一运动矢量,mv4E0是当前块的运动矢量集合在(x1+4,y1+4)位置的4×4单元的第一运动矢量。
具体地,MvC_x和MvC_y可以通过以下方式确定:
MvC_x=(mv1E0_x+mv2E0_x+mv3E0_x+mv4E0_x+2)>>2
MvC_y=(mv1E0_y+mv2E0_y+mv3E0_y+mv4E0_y+2)>>2
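上述由4个4×4子块的运动矢量求平均得到色度运动矢量MvC的过程,可以用如下示意代码表示(+2偏置配合右移2位即带四舍五入的除以4;仅为说明性草图):

```python
# 示意:色度运动矢量 MvC 由相邻 4 个 4×4 子块的运动矢量取平均得到
# (带 +2 偏置的右移即四舍五入到最近整数,假设各 mv 为整数对)。

def chroma_mv(mv1, mv2, mv3, mv4):
    mvc_x = (mv1[0] + mv2[0] + mv3[0] + mv4[0] + 2) >> 2
    mvc_y = (mv1[1] + mv2[1] + mv3[1] + mv4[1] + 2) >> 2
    return mvc_x, mvc_y
```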
示例性的,在本申请中,假设当前块左上角样本在当前图像的亮度样本矩阵中的位置为(xE,yE),如果当前块的预测参考模式取值为0,即使用第一参考模式‘PRED_List0’,且子块尺寸标志的取值为1,即子块尺寸参数为8×8,那么目标运动矢量mv0E0是当前块的运动矢量集合在(xE+x,yE+y)位置的8×8单元的第一运动矢量。亮度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/16精度亮度样本矩阵中位置为(((xE+x)<<4)+mv0E0_x,((yE+y)<<4)+mv0E0_y)的样本值,色度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/32精度色度样本矩阵中位置为(((xE+2×x)<<4)+MvC_x,((yE+2×y)<<4)+MvC_y)的样本值。其中,MvC_x等于mv0E0_x,MvC_y等于mv0E0_y。
示例性的,在本申请中,假设当前块左上角样本在当前图像的亮度样本矩阵中的位置为(xE,yE)。如果当前块的预测参考模式取值为1,即使用第二参考模式‘PRED_List1’,且子块尺寸标志的取值为0,即子块尺寸参数为4×4,那么目标运动矢量mv0E1是当前块的运动矢量集合在(xE+x,yE+y)位置的4×4单元的第一运动矢量。亮度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/16精度亮度样本矩阵中位置为(((xE+x)<<4)+mv0E1_x,((yE+y)<<4)+mv0E1_y)的样本值,色度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/32精度色度样本矩阵中位置为(((xE+2×x)<<4)+MvC_x,((yE+2×y)<<4)+MvC_y)的样本值。其中,x1=((xE+2×x)>>3)<<3,y1=((yE+2×y)>>3)<<3,mv1E1是当前块的运动矢量集合在(x1,y1)位置的4×4单元的第一运动矢量,mv2E1是当前块的运动矢量集合在(x1+4,y1)位置的4×4单元的第一运动矢量,mv3E1是当前块的运动矢量集合在(x1,y1+4)位置的4×4单元的第一运动矢量,mv4E1是当前块的运动矢量集合在(x1+4,y1+4)位置的4×4单元的第一运动矢量。
具体地,MvC_x和MvC_y可以通过以下方式确定:
MvC_x=(mv1E1_x+mv2E1_x+mv3E1_x+mv4E1_x+2)>>2
MvC_y=(mv1E1_y+mv2E1_y+mv3E1_y+mv4E1_y+2)>>2
示例性的,在本申请中,假设当前块左上角样本在当前图像的亮度样本矩阵中的位置为(xE,yE)。如果当前块的预测参考模式取值为1,即使用第二参考模式‘PRED_List1’,且子块尺寸标志的取值为1,即子块尺寸参数为8×8,那么目标运动矢量mv0E1是当前块的运动矢量集合在(xE+x,yE+y)位置的8×8单元的第一运动矢量。亮度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/16精度亮度样本矩阵中位置为(((xE+x)<<4)+mv0E1_x,((yE+y)<<4)+mv0E1_y)的样本值,色度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/32精度色度样本矩阵中位置为(((xE+2×x)<<4)+MvC_x,((yE+2×y)<<4)+MvC_y)的样本值。其中MvC_x等于mv0E1_x,MvC_y等于mv0E1_y。
示例性的,在本申请中,假设当前块左上角样本在当前图像的亮度样本矩阵中的位置为(xE,yE)。如果当前块的预测参考模式取值为2,即使用第三参考模式‘PRED_List01’,那么目标运动矢量mv0E0是当前块的运动矢量集合在(xE+x,yE+y)位置的8×8单元的第一运动矢量,目标运动矢量mv0E1是当前块的运动矢量集合在(x,y)位置的8×8单元的第一运动矢量。亮度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/16精度亮度样本矩阵中位置为(((xE+x)<<4)+mv0E0_x,((yE+y)<<4)+mv0E0_y)的样本值,色度预测样本矩阵predMatrixL0中的元素predMatrixL0[x][y]的值是参考图像队列0中参考索引为RefIdxL0的1/32精度色度样本矩阵中位置为(((xE+2×x)<<4)+MvC0_x,((yE+2×y)<<4)+MvC0_y)的样本值,亮度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/16精度亮度样本矩阵中位置为((((xE+x)<<4)+mv0E1_x,((yE+y)<<4)+mv0E1_y))的样本值,色度预测样本矩阵predMatrixL1中的元素predMatrixL1[x][y]的值是参考图像队列1中参考索引为RefIdxL1的1/32精度色度样本矩阵中位置为(((xE+2×x)<<4)+MvC1_x,((yE+2×y)<<4)+MvC1_y)的样本值。其中MvC0_x等于mv0E0_x,MvC0_y等于mv0E0_y,MvC1_x等于mv0E1_x,MvC1_y等于mv0E1_y。
需要说明的是,在本申请的实施例中,样本矩阵中的亮度样本矩阵可以为1/16精度亮度样本矩阵,样本矩阵中的色度样本矩阵可以为1/32精度色度样本矩阵。
可以理解的是,在本申请的实施例中,对于不同的预测参考模式,解码器通过解析码流所获取的参考图像队列和参考索引是不相同的。
进一步地,在本申请的实施例中,解码器确定样本矩阵时,可以先获取亮度插值滤波器系数和色度插值滤波器系数;然后可以基于亮度插值滤波器系数确定亮度样本矩阵,同时,可以基于色度插值滤波器系数确定色度样本矩阵。
示例性的,在本申请中,解码器在确定亮度样本矩阵时,获取的亮度插值滤波器系数如上述表1所示,然后按照像素位置和样本位置,计算获得亮度样本矩阵。
具体地,样本位置a x,0(x=1~15)由水平方向上距离插值点最近的8个整数值滤波得到,预测值的获取方式如下:
a x,0=Clip1((fL[x][0]×A -3,0+fL[x][1]×A -2,0+fL[x][2]×A -1,0+fL[x][3]×A 0,0+fL[x][4]×A 1,0+fL[x][5]×A 2,0+fL[x][6]×A 3,0+fL[x][7]×A 4,0+32)>>6)。
具体地,样本位置a 0,y(y=1~15)由垂直方向上距离插值点最近的8个整数值滤波得到,预测值的获取方式如下:
a 0,y=Clip1((fL[y][0]×A 0,-3+fL[y][1]×A 0,-2+fL[y][2]×A 0,-1+fL[y][3]×A 0,0+fL[y][4]×A 0,1+fL[y][5]×A 0,2+fL[y][6]×A 0,3+fL[y][7]×A 0,4+32)>>6)。
具体地,样本位置a x,y(x=1~15,y=1~15)的预测值的获取方式如下:
a x,y=Clip1((fL[y][0]×a' x,y-3+fL[y][1]×a' x,y-2+fL[y][2]×a' x,y-1+fL[y][3]×a' x,y+fL[y][4]×a' x,y+1+fL[y][5]×a' x,y+2+fL[y][6]×a' x,y+3+fL[y][7]×a' x,y+4+(1<<(19-BitDepth)))>>(20-BitDepth))。
其中:
a' x,y=(fL[x][0]×A -3,y+fL[x][1]×A -2,y+fL[x][2]×A -1,y+fL[x][3]×A 0,y+fL[x][4]×A 1,y+fL[x][5]×A 2,y+fL[x][6]×A 3,y+fL[x][7]×A 4,y+((1<<(BitDepth-8))>>1))>>(BitDepth-8)。
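上述8抽头亮度插值可以用如下示意代码表示(以水平方向为例;滤波系数行应取自表1中对应相位,此处测试传入的系数行仅为示例假设,Clip1按10bit位深裁剪):

```python
# 示意:亮度分像素水平插值的 8 抽头滤波(系数和为 64,
# +32 为舍入偏置,>>6 完成归一化;位深固定为 10bit 仅为示例假设)。

BIT_DEPTH = 10

def clip1(v):
    return max(0, min((1 << BIT_DEPTH) - 1, v))

def luma_interp_h(samples, pos, coeffs):
    """samples: 一行整像素;pos: 整像素 A0 的下标;coeffs: 8 个滤波系数。"""
    acc = sum(coeffs[k] * samples[pos - 3 + k] for k in range(8))  # A-3..A4
    return clip1((acc + 32) >> 6)
```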
示例性的,在本申请中,解码器在确定色度样本矩阵时,可以先解析码流获得色度插值滤波器系数如上述表2所示,然后按照像素位置和样本位置,计算获得色度样本矩阵。
具体地,对于dx等于0或dy等于0的分像素点,可直接用色度整像素插值得到,对于dx不等于0且dy不等于0的点,使用整像素行(dy等于0)上的分像素进行计算:
if(dx==0){
a x,y(0,dy)=Clip3(0,(1<<BitDepth)-1,(fC[dy][0]×A x,y-1+fC[dy][1]×A x,y+fC[dy][2]×A x,y+1+fC[dy][3]×A x,y+2+32)>>6)
}
else if(dy==0){
a x,y(dx,0)=Clip3(0,(1<<BitDepth)-1,(fC[dx][0]×A x-1,y+fC[dx][1]×A x,y+fC[dx][2]×A x+1,y+fC[dx][3]×A x+2,y+32)>>6)
}
else{
a x,y(dx,dy)=Clip3(0,(1<<BitDepth)-1,(fC[dy][0]×a' x,y-1(dx,0)+fC[dy][1]×a' x,y(dx,0)+fC[dy][2]×a' x,y+1(dx,0)+fC[dy][3]×a' x,y+2(dx,0)+(1<<(19-BitDepth)))>>(20-BitDepth))
}
其中,a' x,y(dx,0)是整像素行上的分像素的临时值,定义为:a' x,y(dx,0)=(fC[dx][0]×A x-1,y+fC[dx][1]×A x,y+fC[dx][2]×A x+1,y+fC[dx][3]×A x+2,y+((1<<(BitDepth-8))>>1))>>(BitDepth-8)。
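上述色度插值的三个分支(dx等于0、dy等于0、其余)可以用如下示意代码表示(系数行应取自表2中对应相位,此处测试传入的系数行与中间精度处理均为本示例的假设,中间步骤以浮点简化):

```python
# 示意:色度分像素插值的分支结构。dx==0 或 dy==0 时做一维 4 抽头滤波,
# 否则先在整像素行上水平滤波得到临时值,再垂直滤波(中间精度以浮点简化)。

def chroma_interp(block, x, y, dx, dy, fc_dx, fc_dy, bit_depth=10):
    hi = (1 << bit_depth) - 1
    clip = lambda v: max(0, min(hi, v))
    if dx == 0 and dy == 0:
        return block[y][x]
    if dx == 0:   # 仅垂直分像素:对 A(x, y-1..y+2) 滤波
        acc = sum(fc_dy[k] * block[y - 1 + k][x] for k in range(4))
        return clip((acc + 32) >> 6)
    if dy == 0:   # 仅水平分像素:对 A(x-1..x+2, y) 滤波
        acc = sum(fc_dx[k] * block[y][x - 1 + k] for k in range(4))
        return clip((acc + 32) >> 6)
    # 一般情况:水平临时值 + 垂直滤波
    tmp = [sum(fc_dx[k] * block[y - 1 + r][x - 1 + k] for k in range(4)) / 64.0
           for r in range(4)]
    acc = sum(fc_dy[k] * tmp[k] for k in range(4)) / 64.0
    return clip(int(round(acc)))
```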
步骤304、若目标像素位置不属于当前子块,则根据当前子块对目标像素位置进行更新处理,获得更新后像素位置。
在本申请的实施例中,解码器在基于第一运动矢量确定当前子块的第一预测值,并确定当前像素位置对应的目标像素位置之后,如果目标像素位置不属于当前子块,即目标像素位置和当前像素位置属于不同的子块,那么解码器可以根据当前子块对目标像素位置进行更新处理,从而获得更新后像素位置。
需要说明的是,在本申请的实施例中,一个当前像素位置对应有多个目标像素位置,如果该当前像素位置对应有多个目标像素位置均属于当前子块,那么确定目标像素位置和当前像素位置均属于当前子块;如果多个目标像素位置中存在不属于当前子块的至少一个位置,那么确定目标像素位置和当前像素位置属于不同的子块。
在本申请的实施例中,进一步地,图16为帧间预测方法的实现流程示意图二,如图16所示,若目标像素位置不属于当前子块,解码器则根据当前子块对目标像素位置进行更新处理,获得更新后像素位置之前,即步骤304之前,解码器进行帧间预测的方法还可以包括以下步骤:
步骤306、判断当前子块是否满足预设限制条件;其中,预设限制条件用于确定是否将目标像素位置限制在当前子块内。
步骤307、若满足预设限制条件,则判断目标像素位置是否属于当前子块。
步骤308、若不满足预设限制条件,则基于第一预测值和目标像素位置,确定第二预测值,将第二预测值确定为帧间预测值。
在本申请的实施例中,解码器可以先判断当前子块是否满足预设限制条件,如果满足预设限制条件,那么解码器可以继续确定目标像素位置是否属于当前子块,从而进一步确定是否对目标像素位置进行限制处理,即是否对目标像素位置执行更新处理的流程;如果不满足预设限制条件,那么解码器便可以不需要对目标像素位置进行限制处理,而是直接基于第一预测值和目标像素位置,进一步确定当前像素位置对应的修正后预测值,并在遍历当前子块中的每一个像素位置,获得每一个像素位置对应的修正后预测值之后,确定出当前子块对应的第二预测值,从而确定出当前子块的帧间预测值。
需要说明的是,在本申请的实施例中,预设限制条件为解码器预先设置的、用于确定是否将目标像素位置限制在当前子块内,也就是说,在本申请中,解码器可以利用预设限制条件来对是否限制当前像素位置和目标像素位置属于相同的子块进行确定。
可以理解的是,在本申请的实施例中,如果满足预设限制条件,便可以认为需要将当前像素位置和目标像素位置限制在相同的子块内,因此需要继续确定目标像素位置是否属于当前像素位置对应的当前子块。
可以理解的是,在本申请的实施例中,如果不满足预设限制条件,便可以认为不需要将当前像素位置和目标像素位置限制在相同的子块内,因此可以直接利用目标像素位置对当前像素位置进行二次预测或PROF处理,最终获得当前子块对应的帧间预测值。
也就是说,在本申请中,解码器可以基于预设限制条件来对二次预测或PROF处理所使用的全部像素位置是否属于相同的子块进行限制。具体地,如果满足预设限制条件,那么二次预测或PROF处理时所需要的像素值都使用当前子块的像素值,即要求全部像素位置均属于当前子块,可以按照步骤305至步骤307所提出的方法来对当前像素位置进行仿射预测效果的提升,获得对应的帧间预测值;如果不满足预设限制条件,那么二次预测或PROF处理时所需要的像素值可以使用其他子块的像素值,即不要求全部像素位置均属于当前子块,可以按照步骤308所提出的方法来对当前像素位置进行仿射预测效果的提升,获得对应的帧间预测值。
进一步地,在本申请的实施例中,解码器可以通过多种方式来判断当前子块是否满足预设限制条件。具体地,解码器可以通过运动矢量来进一步确定是否满足预设限制条件。
示例性的,在本申请中,解码器在判断当前子块是否满足预设限制条件时,可以先基于控制点运动矢量组确定第一运动矢量偏差;然后将第一运动矢量偏差与预设偏差阈值进行比较,如果第一运动矢量偏差大于或等于预设偏差阈值,那么可以判定满足预设限制条件;如果第一运动矢量偏差小于预设偏差阈值,那么判定不满足预设限制条件。
可以理解的是,在本申请中,第一运动矢量偏差可以为当前块的两个控制点的运动矢量的差值。其中,在仿射模式中,如果使用的是4参数(2个控制点)模式,那么解码器可以直接将控制点运动矢量组mvsAffine(mv0,mv1)中的这两个控制点的运动矢量mv0和mv1之间的差值确定为第一运动矢量偏差;如果使用的是6参数(3个控制点)模式,那么解码器可以将控制点运动矢量组mvsAffine(mv0,mv1,mv2)中的任意2个控制点的运动矢量的差值确定为第一运动矢量偏差,也可以将控制点运动矢量组mvsAffine(mv0,mv1,mv2)中的运动矢量的最大差值确定为第一运动矢量偏差。
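以第一运动矢量偏差为例,上述判断可以用如下示意代码表示(以控制点运动矢量的最大两两分量差作为偏差度量、阈值的具体取法均为本示例的假设,并非标准规定):

```python
# 示意:用控制点运动矢量组的最大两两分量差作为第一运动矢量偏差,
# 与预设偏差阈值比较,大于或等于阈值时判定满足预设限制条件。

def meets_restriction(mvs, threshold):
    """mvs: 控制点运动矢量组 [(x, y), ...];threshold: 预设偏差阈值。"""
    max_diff = 0
    for a in range(len(mvs)):
        for b in range(a + 1, len(mvs)):
            diff = max(abs(mvs[a][0] - mvs[b][0]), abs(mvs[a][1] - mvs[b][1]))
            max_diff = max(max_diff, diff)
    return max_diff >= threshold
```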
需要说明的是,在本申请的实施例中,如果使用表征当前块的两个控制点的运动矢量的差值的第一运动矢量偏差进行是否满足预设限制条件的判断,预设偏差阈值的设置可能与当前块的大小有关,也可能与当前仿射预测使用的当前块的子块的大小有关。
进一步地,在本申请中,如果使用表征当前块的两个控制点的运动矢量的差值的第一运动矢量偏差进行是否满足预设限制条件的判断,基于预设偏差阈值的设置,当前块的所有子块全都满足预设限制条件,或者,当前块的所有子块全都不满足预设限制条件。
示例性的,在本申请中,解码器在判断当前子块是否满足预设限制条件时,可以先基于每一个子块的第一运动矢量确定第二运动矢量偏差;然后将第二运动矢量偏差与预设偏差阈值进行比较,如果第二运动矢量偏差大于或等于预设偏差阈值,那么判定满足预设限制条件;如果第二运动矢量偏差小于预设偏差阈值,那么判定不满足预设限制条件。
可以理解的是,在本申请中,第二运动矢量偏差可以为当前块中的任意两个或多个子块的运动矢量的差值。其中,在AVS3中,对于不同的子块,作为子块的运动矢量的像素位置在子块中的相对位置不都完全相同,如图7中所示的A左上角、B右上角、C左下角这三个子块,因此,不同子块之间的运动矢量的差值也不完全相同,进而可以将当前块的不同子块之间的运动矢量差值作为第二运动矢量偏差,以判断当前子块是否满足预设限制条件。
需要说明的是,在本申请的实施例中,如果使用表征当前块的不同子块之间的运动矢量差值的第二运动矢量偏差进行是否满足预设限制条件的判断,预设偏差阈值的设置可能与当前仿射预测使用的当前块的子块的大小有关。
进一步地,在本申请中,如果使用表征当前块的不同子块之间的运动矢量差值的第二运动矢量偏差进行是否满足预设限制条件的判断,当前块的每一个子块均可以基于预设偏差阈值和与其对应的第二运动矢量偏差判断当前子块是否满足预设限制条件,其中,除了如图7中所示的A、B、C以及与它们相连的子块以外的其他子块之间均符合相同的规则,因此,其他子块可以共用同一个条件判断结果。
示例性的,在本申请中,解码器在判断当前子块是否满足预设限制条件时,可以先确定当前子块内的、预设像素位置与当前子块之间的第三运动矢量偏差;然后将第三运动矢量偏差与预设偏差阈值进行比较,如果第三运动矢量偏差大于或等于预设偏差阈值,那么判定满足预设限制条件;如果第三运动矢量偏差小于预设偏差阈值,那么判定不满足预设限制条件。
可以理解的是,在本申请中,第三运动矢量偏差可以为当前子块内的、预设像素位置与当前子块之间的运动矢量的差值。其中,在AVS3中,对于子块内的不同像素位置,对应的运动矢量可能是不同的,一般情况下,距离作为子块运动矢量的像素位置越远的预设像素位置,对应的第三运动矢量偏差越大,那么与该预设像素位置相邻的、其他子块的像素位置的不相连的可能性就越大。
进一步地,在本申请中,如果使用表征当前子块内的、预设像素位置与当前子块之间的运动矢量差值的第三运动矢量偏差进行是否满足预设限制条件的判断,当前块的子块内的每一个像素位置可以分别基于预设偏差阈值判断是否满足预设限制条件,其中,除了如图7中所示的A、B、C以及与它们相连的子块以外的其他子块之间均符合相同的规则,因此,其他子块可以共用同一个条件判断结果。
可以理解的是,在本申请中,如果使用表征当前子块内的、预设像素位置与当前子块之间的运动矢量差值的第三运动矢量偏差进行是否满足预设限制条件的判断,解码器可以选择某一个或某几个特定像素位置的运动矢量与其对应子块的运动矢量之间的差作为第三运动矢量偏差,例如,如图7所示的A类子块,A类子块使用子块内左上角的像素位置的运动矢量作为子块的运动矢量,记该像素位置在子块内的位置为(0,0),子块宽度为sub_w,子块高度为sub_h,那么子块的右下角的像素位置就为(sub_w-1,sub_h-1),由于右下角的像素位置与位置(0,0)的水平方向和垂直方向的距离都最远,因此,可以把右下角的像素位置的运动矢量与A类子块的运动矢量之间的差作为A类子块对应的第三运动矢量偏差,也可以将该差值作为当前块对应的第三运动矢量偏差。同理,对如图7所示的B,可以选择子块左下角的像素位置的运动矢量与其对应子块的运动矢量的差作为第三运动矢量偏差;对如图7所示的C,可以选择子块右上角的像素位置的运动矢量与其对应子块的运动矢量的差作为第三运动矢量偏差。
需要说明的是,在本申请中,可以根据当前块的尺寸参数和/或子块尺寸参数确定预设偏差阈值。
进一步地,在本申请的实施例中,如果目标像素位置不属于当前子块,那么解码器可以根据当前子块对目标像素位置进行更新处理,从而可以获得更新后像素位置。其中,解码器可以采用多种方式获取更新后像素位置。
可以理解的是,在本申请中,解码器在根据当前子块对目标像素位置进行更新处理,获得更新后像素位置时,可以先对当前子块进行扩展处理,获得扩展后子块;然后可以在扩展后子块内确定目标像素位置对应的更新后像素位置。
也就是说,在本申请中,为了在二次预测或PROF处理时使所需要的像素值都使用当前子块的像素值,即限制使用的像素位置均属于当前子块,解码器可以先对当前的子块基于子块的预测的预测子块进行扩展,即先对当前子块进行扩展处理。具体地,由于二次预测或PROF处理时所使用的目标像素位置与当前像素位置之间的距离为1个像素位置,因此,即使目标像素位置不属于当前子块,也仅仅超出当前子块的边界一个像素位置,可见,解码器在对当前子块进行扩展时,只需要对基于子块的预测的当前子块扩展一行或两行像素,和/或,一列或两列像素,最终便可以获得扩展后子块。
示例性的,在本申请中,图17为扩展当前子块的示意图,如图17所示,以9点矩形滤波器为例,可以子块的预测的当前子块可以在其上下左右边界各扩展出一个像素位置,一个简单的扩展方法是左边扩展的像素复制其水平方向对应的左边界的像素值,右边扩展的像素复制其水平方向对应的右边界的像素值,上边扩展的像素复制其垂直方向对应的上边界的像素值,下边扩展的像素复制其垂直方向对应的下边界的像素值,扩展的四个顶点可以复制其对应的顶点的像素值。
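上述对当前子块向四周各扩展一个像素、扩展像素复制最近边界像素值(四个顶点复制对应角点)的做法,可以用如下示意代码表示(纯Python草图,仅为说明扩展方式):

```python
# 示意:将 h×w 的子块预测值向四周各扩展一个像素,
# 越界下标被钳制回边界,等价于复制最近的边界像素/角点像素。

def extend_subblock(block):
    h, w = len(block), len(block[0])
    ext = []
    for i in range(-1, h + 1):
        src = block[min(max(i, 0), h - 1)]            # 上/下越界时复制边界行
        row = [src[min(max(j, 0), w - 1)] for j in range(-1, w + 1)]
        ext.append(row)
    return ext  # 尺寸为 (h+2) x (w+2)
```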
可以理解的是,在本申请的实施例中,解码器在对当前子块进行扩展处理,获得扩展后子块时,可以选择利用当前子块的全部边界位置进行扩展处理,获得扩展后子块;也可以选择利用当前子块内的、目标像素位置对应的行和/或列的边界位置进行扩展处理,获得扩展后子块。
也就是说,在本申请中,在对当前子块进行扩展时,可以对当前子块的上下左右四个边界都进行扩展处理,也可以仅仅对于目标像素对应的一个或者两个边界进行扩展处理,例如,如果一个目标像素位置属于当前子块左边的子块,那么可以对当前子块的左侧边界进行扩展处理,而不对当前子块的其他三侧边界进行扩展处理。
可以理解的是,在本申请中,解码器在根据当前子块对目标像素位置进行更新处理,获得更新后像素位置时,还可以利用当前子块内的、与目标像素位置相邻的像素位置,替换目标像素位置,从而获得更新后像素位置。
也就是说,在本申请中,如果二次预测的滤波器的某一个位置对应的像素位置或PROF处理使用的像素位置超出了当前子块的边界,那么可以调整其对应的像素位置为当前子块内的一个像素位置。例如,当前子块的左上角像素位置为(0,0),当前子块的宽度是sub_width,当前子块的高度是sub_height,那么当前子块水平方向的范围是0~(sub_width-1),当前子块垂直方向的范围是0~(sub_height-1)。若二次预测的滤波器或PROF处理需要使用的像素位置为(x,y),如果x小于0,那么将x设为0。如果x大于sub_width-1,那么将x设为sub_width-1。如果y小于0,那么将y设为0,如果y大于sub_height-1,那么将y设为sub_height-1。
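上述将滤波所需像素位置限制回当前子块范围内的更新方式,可以用如下示意代码表示(与上文的文字描述一一对应):

```python
# 示意:将二次预测或 PROF 需要使用的像素位置 (x, y) 钳制到
# 当前子块范围 0~(sub_width-1)、0~(sub_height-1) 之内。

def clamp_position(x, y, sub_width, sub_height):
    x = 0 if x < 0 else (sub_width - 1 if x > sub_width - 1 else x)
    y = 0 if y < 0 else (sub_height - 1 if y > sub_height - 1 else y)
    return x, y
```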
步骤305、基于第一预测值和更新后像素位置,确定当前子块对应的第二预测值,将第二预测值确定为当前子块的帧间预测值。
在本申请的实施例中,如果目标像素位置不属于当前子块,解码器在根据当前子块对目标像素位置进行更新处理,获得更新后像素位置之后,便可以基于第一预测值和更新后像素位置,确定出当前像素位置对应的修正后预测值,并在遍历当前子块中的每一个像素位置,获得每一个像素位置对应的修正后预测值之后,确定当前子块对应的第二预测值,然后可以将第二预测值确定为当前子块的帧间预测值。
需要说明的是,在本申请的实施例中,由于目标像素位置为对当前像素位置的像素点进行二次预测或PROF处理,因此,在获得目标像素位置对应的更新后像素位置以后,解码器可以基于第一预测值和更新后像素位置对当前像素位置进行二次预测或PROF处理,从而获得对应的第二预测值。
进一步地,在本申请的实施例中,解码器基于第一预测值和更新后像素位置,确定当前子块对应的第二预测值的方法可以包括以下步骤:
步骤305a、解析码流,获取PROF参数;
步骤305b、当PROF参数指示进行PROF处理时,基于第一预测值确定当前像素位置与更新后像素位置之间的像素水平梯度和像素垂直梯度;
步骤305c、确定更新后像素位置与当前子块的第四运动矢量偏差;
步骤305d、根据像素水平梯度、像素垂直梯度以及第四运动矢量偏差,计算当前像素位置对应的偏差值;
步骤305e、基于第一预测值和偏差值,获得第二预测值。
在本申请的实施例中,解码器可以先解析码流,获得PROF参数,如果PROF参数指示进行PROF处理,那么解码器可以基于第一预测值确定当前像素位置与更新后像素位置之间的像素水平梯度和像素垂直梯度;其中,像素水平梯度即为当前像素位置对应的像素值与水平方向上的、更新后像素位置对应的像素值之间的梯度值;像素垂直梯度即为当前像素位置对应的像素值与垂直方向上的、更新后像素位置对应的像素值之间的梯度值。同时,解码器还可以确定更新后像素位置与当前子块的第四运动矢量偏差,第四运动矢量偏差即为更新后像素位置的运动矢量与当前子块的第一运动矢量之间的差值。
进一步地,在本申请的实施例中,解码器可以根据当前像素位置所对应的像素水平梯度、像素垂直梯度以及第四运动矢量偏差,计算获得与当前像素位置对应的偏差值。其中,该偏差值可以用于对当前像素位置的像素值的预测值进行修正处理。
需要说明的是,在本申请的实施例中,解码器可以进一步根据第一预测值和偏差值,获得当前像素位置所对应的修正后预测值,并在遍历当前子块中的每一个像素位置,获得每一个像素位置对应的修正后预测值之后,利用全部像素位置对应的修正后预测值确定出当前子块对应的第二预测值,从而确定出对应的帧间预测值。具体地,在本申请中,在完成基于子块的预测之后,将当前子块的第一预测值作为当前像素位置的预测值,接着,将第一预测值与当前像素位置对应的偏差值相加,即可以完成对当前像素位置的预测值的修正处理,获得修正后预测值,从而可以进一步获得当前子块的第二预测值,并将第二预测值作为当前子块对应的帧间预测值。
进一步地,在本申请的实施例中,解码器基于第一预测值和更新后像素位置,确定当前子块对应的第二预测值的方法可以包括以下步骤:
步骤305f、解析码流,获取二次预测参数;
步骤305g、当二次预测参数指示使用二次预测时,确定更新后像素位置与当前子块的第四运动矢量偏差;
步骤305h、根据第四运动矢量偏差确定二维滤波器的滤波系数;其中,二维滤波器用于按照预设形状进行二次预测处理;
步骤305i、基于滤波系数和第一预测值,确定第二预测值,将第二预测值确定为帧间预测值。
在本申请的实施例中,解码器可以先解析码流,获得二次预测参数,如果二次预测参数指示使用二次预测,那么解码器可以确定更新后像素位置与当前子块的第四运动矢量偏差,第四运动矢量偏差即为更新后像素位置的运动矢量与当前子块的第一运动矢量之间的差值,从而便可以根据第四运动矢量偏差确定二维滤波器的滤波系数;其中,二维滤波器用于按照预设形状进行二次预测处理。
进一步地,在本申请的实施例中,解码器在确定更新后像素位置与当前子块之间的第四运动矢量偏差时,可以基于差值变量确定当前子块与每一个像素位置之间的第四运动矢量偏差。
具体地,在本申请的实施例中,解码器在基于差值变量确定当前子块与每一个像素位置之间的第四运动矢量偏差时,可以按照上述步骤302所提出的方法,根据控制点运动矢量组、控制点模式以及当前块的尺寸参数,确定出dHorX、dVerX、dHorY和dVerY这4个差值变量,然后再利用差值变量进一步确定出子块中的每一个像素位置所对应的第四运动矢量偏差。
示例性的,在本申请中,width和height分别为解码器获取的当前块的宽度和高度,利用子块尺寸参数所确定出子块的宽度subwidth和高度subheight。假设(i,j)为子块内部的任意像素点的坐标,其中,i的取值范围是0~(subwidth-1),j的取值范围是0~(subheight-1),那么,可以通过以下方法计算4种不同类型的子块的内部每个像素(i,j)位置的第四运动矢量偏差:
如果当前子块是当前块的左上角的控制点A,那么(i,j)像素的第四运动矢量偏差dMvA[i][j]:
dMvA[i][j][0]=dHorX×i+dVerX×j
dMvA[i][j][1]=dHorY×i+dVerY×j;
如果当前子块是当前块的右上角的控制点B,那么(i,j)像素的第四运动矢量偏差dMvB[i][j]:
dMvB[i][j][0]=dHorX×(i-subwidth)+dVerX×j
dMvB[i][j][1]=dHorY×(i-subwidth)+dVerY×j;
如果当前子块是当前块的左下角的控制点C,控制点运动矢量组可以为包括3个运动矢量的运动矢量组,那么(i,j)像素的第四运动矢量偏差dMvC[i][j]:
dMvC[i][j][0]=dHorX×i+dVerX×(j-subheight)
dMvC[i][j][1]=dHorY×i+dVerY×(j-subheight);
否则,(i,j)像素的第四运动矢量偏差dMvN[i][j]:
dMvN[i][j][0]=dHorX×(i-(subwidth>>1))+dVerX×(j-(subheight>>1))
dMvN[i][j][1]=dHorY×(i-(subwidth>>1))+dVerY×(j-(subheight>>1))。
其中,dMvX[i][j][0]表示第四运动矢量偏差在水平分量的偏差值,dMvX[i][j][1]表示第四运动矢量偏差在垂直分量的偏差值。X为A,B,C或N。
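上述四组公式可以用如下示意代码统一表示(kind用于区分A、B、C与其他子块,函数与参数命名为本示例的假设):

```python
# 示意:按子块类型(A/B/C/N)计算子块内像素 (i, j) 的第四运动矢量偏差,
# 返回 (水平分量偏差, 垂直分量偏差),d = (dHorX, dHorY, dVerX, dVerY)。

def pixel_mv_deviation(kind, i, j, d, subwidth, subheight):
    d_hor_x, d_hor_y, d_ver_x, d_ver_y = d
    if kind == 'A':            # 左上角控制点子块
        di, dj = i, j
    elif kind == 'B':          # 右上角控制点子块
        di, dj = i - subwidth, j
    elif kind == 'C':          # 左下角控制点子块(3 个控制点模式)
        di, dj = i, j - subheight
    else:                      # 其他子块:相对子块中心
        di, dj = i - (subwidth >> 1), j - (subheight >> 1)
    return (d_hor_x * di + d_ver_x * dj, d_hor_y * di + d_ver_y * dj)
```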
可以理解的是,在本申请的实施例中,解码器在基于差值变量确定当前子块与每一个像素位置之间的第四运动矢量偏差之后,便可以利用子块内的全部像素位置对应的全部第四运动矢量偏差,构建出该当前子块对应的运动矢量偏差矩阵。可见,运动矢量偏差矩阵中包括有子块与任意一个内部的像素点之间的第四运动矢量偏差。
需要说明的是,在本申请的实施例中,二维滤波器的滤波器系数是与目标像素位置所对应的第四运动矢量偏差相关的。也就是说,对于不同的目标像素位置,如果对应的第四运动矢量偏差不同,那么使用的二维滤波器的滤波系数也是不同的。
可以理解的是,在本申请的实施例中,二维滤波器用于利用多个相邻的、构成预设形状的像素位置进行二次预测。其中,预设形状为矩形、菱形或任意一种对称形状。
也就是说,在本申请的中,用于进行二次预测的二维滤波器是相邻的构成预设形状的点所构成的滤波器。相邻的构成预设形状的点可以包括多个点,例如由9个点构成。预设形状可以为对称形状,例如,预设形状可以包括矩形、菱形或其他任意一种对称形状。
示例性的,在本申请中,二维滤波器是一个矩形的滤波器,具体地,二维滤波器是由9个相邻的构成矩形的像素位置组成的滤波器。在9个像素位置中,位于中心的像素位置是当前需要二次预测的像素的像素位置,即当前像素位置。
进一步地,在本申请的实施例中,解码器在根据第四运动矢量偏差确定二维滤波器的滤波系数时,可以先解析码流,获取比例参数,然后可以根据比例参数和第四运动矢量偏差,确定像素位置对应的滤波器系数。
需要说明的是,在本申请的实施例中,比例参数可以包括至少一个比例值,第四运动矢量偏差包括水平偏差和垂直偏差;其中,至少一个比例值均为非零实数。
具体地,在本申请中,当二维滤波器利用9个相邻的、构成矩形的像素位置进行二次预测时,位于矩形的中心的像素位置为待预测位置,即当前像素位置,其他8个目标像素位置依次位于当前像素位置的左上、上、右上、右、右下、下、左下、左这8个方向。
相应地,在本申请中,解码器可以基于至少一个比例值和待预测位置的第四运动矢量偏差,按照预设计算规则计算获得9个相邻的像素位置对应的9个滤波器系数。
需要说明的是,在本申请中,预设计算规则可以包括多种不同的计算方式,如加法运算、减法运算、乘法运算等。其中,对于不同的像素位置,可以使用不同的计算方式进行滤波器系数的计算。
可以理解的是,在本申请中,解码器在按照预设计算规则中不同的计算方法计算获得多个像素位置对应的多个滤波器系数中,部分滤波器系数可以为第四运动矢量偏差的一次函数,即两者为线性关系,还可以为第四运动矢量偏差的二次函数或高次函数,即两者为非线性关系。
也就是说,在本申请中,多个相邻的像素位置对应的多个滤波器系数中的任意一个滤波器系数,可以为第四运动矢量偏差的一次函数、二次函数或者高次函数。
示例性的,在本申请中,假设像素位置的第四运动矢量偏差为(dmv_x,dmv_y),其中,如果目标像素位置的坐标为(i,j),那么dmv_x可以表示为dMvX[i][j][0],即表示第四运动矢量偏差在水平分量的偏差值,dmv_y可以表示为dMvX[i][j][1],即表示第四运动矢量偏差在垂直分量的偏差值。
相应地,表3为基于第四运动矢量偏差(dmv_x,dmv_y)所获得的滤波器系数,如表3所示,对于二维滤波器,按照像素位置的第四运动矢量偏差(水平偏差为dmv_x,垂直偏差为dmv_y)和不同的比例参数,如m和n,可以获得9个相邻的像素位置对应的9个滤波器系数,其中,解码器可以直接将中心的当前像素位置的滤波器系数设置为1。
表3
像素位置 滤波器系数
左上 (-dmv_x-dmv_y)×m
-dmv_x×n
左下 (-dmv_x+dmv_y)×m
-dmv_y×n
中心 1
dmv_y×n
右上 (dmv_x-dmv_y)×m
dmv_x×n
右下 (dmv_x+dmv_y)×m
其中,比例参数m和n一般是小数或分数,一种可能的情况是m和n都是2的幂,如1/2,1/4,1/8等。这里的dmv_x, dmv_y都是其实际的大小,即dmv_x,dmv_y的1表示1个像素的距离,dmv_x,dmv_y是小数或者分数。
需要说明的是,在本申请的实施例中,与现有的8抽头的滤波器相比,目前常见的8抽头的滤波器所对应的整像素位置和分像素位置的运动矢量在水平和垂直方向均为非负的,且大小均属于0像素到1像素之间,即dmv_x,dmv_y不可以为负的。而在本申请中,滤波器对应的整像素位置和分像素位置的运动矢量在水平和垂直方向都可以为负的,即dmv_x,dmv_y可以为负的。
示例性的,在本申请的实施例中,如果比例参数m为1/16,n为1/2,那么上述表3可以表示为下表4:
表4
像素位置 滤波器系数
左上 (-dmv_x-dmv_y)/16
-dmv_x/2
左下 (-dmv_x+dmv_y)/16
-dmv_y/2
中心 1
dmv_y/2
右上 (dmv_x-dmv_y)/16
dmv_x/2
右下 (dmv_x+dmv_y)/16
可以理解的是,在本申请的实施例中,在视频编解码技术以及标准中,通常使用放大倍数以避免小数,浮点数运算,然后将计算的结果缩小合适的倍数以得到正确的结果。放大倍数时通常使用左移,缩小倍数时通常使用右移。因此,在通过二维滤波器进行二次预测时,在实际应用时会写成如下的形式:
假设像素位置的第四运动矢量偏差为(dmv_x,dmv_y),经过左移shift1得到(dmv_x’,dmv_y’),基于上述表4,二维滤波器的系数可以表示为下表5:
表5
像素位置 滤波器系数
左上 -dmv_x’-dmv_y’
左 -dmv_x’×8
左下 -dmv_x’+dmv_y’
上 -dmv_y’×8
中心 16<<shift1
下 dmv_y’×8
右上 dmv_x’-dmv_y’
右 dmv_x’×8
右下 dmv_x’+dmv_y’
图18为二维滤波器的示意图一,如图18所示,以基于子块的预测的结果为二次预测的基础,浅色正方形为该滤波器的整像素位置,也就是基于子块的预测得到的位置。圆形是需要进行二次预测的分像素位置,即当前像素位置,深色正方形是该分像素位置对应的整像素位置,插值得到这个分像素位置时需要如图所示的9个整像素位置。
图19为二维滤波器的示意图二,如图19所示,以基于子块的预测的结果为二次预测的基础,浅色正方形为该滤波器的整像素位置,也就是基于子块的预测得到的位置。圆形是需要进行二次预测的分像素位置,即当前像素位置,深色正方形是该分像素位置对应的整像素位置,插值得到这个分像素位置时需要如图所示的13个整像素位置。
进一步地,在本申请的实施例中,解码器在根据第四运动矢量偏差确定出二维滤波器的滤波系数之后,便可以基于滤波系数和第一预测值,确定当前子块的第二预测值,从而可以实现对当前子块的二次预测。
可以理解的是,在本申请的实施例中,解码器利用像素位置所对应的第四运动矢量偏差确定出滤波器系数,从而可以按照滤波器系数,通过二维滤波器对第一预测值进行修正,获得修正后的、当前子块的第二预测值。可见,第二预测值为基于第一预测值的修正值。
进一步地,在本申请的实施例中,解码器在基于滤波系数和第一预测值,确定当前子块的第二预测值时,可以先对滤波器系数与第一预测值进行乘法运算,获得乘积结果,遍历当前子块中的全部像素位置之后,再对当前子块的全部像素位置的乘积结果进行加法运算,获得求和结果,最后可以对加法结果进行归一化处理,最终便可以获得当前子块修正后的第二预测值。
需要说明的是,在本申请的实施例中,在进行二次预测之前,一般情况下是将像素位置所在的当前子块的第一预测值作为该像素位置的修正前的预测值,因此,在通过二维滤波器进行滤波时,可以将滤波器系数与对应像素位置的预测值,即第一预测值相乘,并对每一个像素位置对应的乘积结果进行累加,然后归一化。
可以理解的是,在本申请中,解码器可以通过多种方式进行归一化处理,例如,可以将滤波器系数与对应的像素位置的预测值相乘后累加的结果,右移4+shift1位。或者,还可以将滤波器系数与对应的像素位置的预测值相乘后累加的结果,再加上(1<<(3+shift1)),然后右移4+shift1位。
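上述两种归一化方式的区别仅在于是否加入舍入偏移,可示意如下(假设性示例代码,acc表示滤波器系数与对应像素位置的预测值相乘后累加的结果):

```python
def normalize(acc, shift1):
    # 方式一: 直接右移4+shift1位(向下取整)
    truncated = acc >> (4 + shift1)
    # 方式二: 先加舍入偏移(1<<(3+shift1), 即右移量的一半), 再右移(即四舍五入)
    rounded = (acc + (1 << (3 + shift1))) >> (4 + shift1)
    return truncated, rounded

result = normalize(25, 0)  # 截断得1, 加舍入偏移后得2
```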
可见,在本申请中,在获得当前子块内部的像素位置对应的第四运动矢量偏差之后,对每一个子块,以及每一个子块中的每一个像素位置,可以根据第四运动矢量偏差,基于当前子块的运动补偿的第一预测值,使用二维滤波器进行滤波,完成对当前子块的二次预测,得到新的第二预测值。
进一步地,在本申请的实施例中,二维滤波器可以理解为利用多个相邻的、构成预设形状的像素位置进行二次预测。其中,预设形状可以为矩形、菱形或任意一种对称形状。
具体地,在本申请的实施例中,二维滤波器在利用9个相邻的、构成矩形的像素位置进行二次预测时,可以先确定当前块的预测样本矩阵,和当前块的当前子块的运动矢量偏差矩阵;其中,运动矢量偏差矩阵包括全部像素位置对应的第四运动矢量偏差;然后基于9个相邻的、构成矩形的像素位置,利用预测样本矩阵和运动矢量偏差矩阵,确定当前块的二次预测后的样本矩阵。
示例性的,在本申请中,如果当前块的宽度和高度分别是width和height,每个子块的宽度和高度分别是subwidth和subheight。如图7所示,当前块的亮度预测样本矩阵的左上角样本所在的子块为A,右上角样本所在的子块为B,左下角样本所在的子块为C,其他位置所在子块为其它子块。
对当前块中的每个子块,可以将子块的运动矢量偏差矩阵记为dMv,那么:
1、如果子块是A,dMv等于dMvA;
2、如果子块是B,dMv等于dMvB;
3、如果子块为C,且该当前子块的控制点运动矢量组mvAffine中有3个运动矢量,dMv等于dMvC;
4、如果子块为A、B、C以外的其他子块,dMv等于dMvN。
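上述dMv的选择逻辑可示意如下(假设性示例代码,参数命名为本示例自行引入;sub_kind表示子块类别,num_cpmv为控制点运动矢量组mvAffine中运动矢量的个数):

```python
def pick_dmv(sub_kind, num_cpmv, dMvA, dMvB, dMvC, dMvN):
    # 按子块所在位置(A/B/C/其它)选择对应的运动矢量偏差矩阵
    if sub_kind == "A":
        return dMvA
    if sub_kind == "B":
        return dMvB
    if sub_kind == "C" and num_cpmv == 3:
        return dMvC
    return dMvN
```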
进一步地,假设(x,y)是当前子块左上角位置的坐标,(i,j)是亮度子块内部像素的坐标,i的取值范围是0~(subwidth-1),j的取值范围是0~(subheight-1),基于子块的预测样本矩阵为PredMatrixSb,二次预测的预测样本矩阵为PredMatrixS,可以按照以下方法计算(x+i,y+j)的二次预测的预测样本PredMatrixS[x+i][y+j]:
PredMatrixS[x+i][y+j]=
(UPLEFT(x+i,y+j)×(-dMv[i][j][0]-dMv[i][j][1])+
UP(x+i,y+j)×((-dMv[i][j][1])<<3)+
UPRIGHT(x+i,y+j)×(dMv[i][j][0]-dMv[i][j][1])+
LEFT(x+i,y+j)×((-dMv[i][j][0])<<3)+
CENTER(x+i,y+j)×(1<<15)+
RIGHT(x+i,y+j)×(dMv[i][j][0]<<3)+
DOWNLEFT(x+i,y+j)×(-dMv[i][j][0]+dMv[i][j][1])+
DOWN(x+i,y+j)×(dMv[i][j][1]<<3)+
DOWNRIGHT(x+i,y+j)×(dMv[i][j][0]+dMv[i][j][1])+
(1<<14))>>15
PredMatrixS[x+i][y+j]=Clip3(0,(1<<BitDepth)-1,PredMatrixS[x+i][y+j])。
进一步地,在本申请中,如果dMv的最大值的绝对值小于一个预设阈值,例如,该预设阈值可以等于通过1个整像素换算的2048,即,当dMv的最大值的绝对值小于2048时,
UPLEFT(x+i,y+j)=PredMatrixSb[max(0,x+i-1)][max(0,y+j-1)]
UP(x+i,y+j)=PredMatrixSb[x+i][max(0,y+j-1)]
UPRIGHT(x+i,y+j)=PredMatrixSb[min(width-1,x+i+1)][max(0,y+j-1)]
LEFT(x+i,y+j)=PredMatrixSb[max(0,x+i-1)][y+j]
CENTER(x+i,y+j)=PredMatrixSb[x+i][y+j]
RIGHT(x+i,y+j)=PredMatrixSb[min(width-1,x+i+1)][y+j]
DOWNLEFT(x+i,y+j)=PredMatrixSb[max(0,x+i-1)][min(height-1,y+j+1)]
DOWN(x+i,y+j)=PredMatrixSb[x+i][min(height-1,y+j+1)]
DOWNRIGHT(x+i,y+j)=PredMatrixSb[min(width-1,x+i+1)][min(height-1,y+j+1)]
否则:
UPLEFT(x+i,y+j)=PredMatrixSb[max(x,x+i-1)][max(y,y+j-1)]
UP(x+i,y+j)=PredMatrixSb[x+i][max(y,y+j-1)]
UPRIGHT(x+i,y+j)=PredMatrixSb[min(x+subwidth-1,x+i+1)][max(y,y+j-1)]
LEFT(x+i,y+j)=PredMatrixSb[max(x,x+i-1)][y+j]
CENTER(x+i,y+j)=PredMatrixSb[x+i][y+j]
RIGHT(x+i,y+j)=PredMatrixSb[min(x+subwidth-1,x+i+1)][y+j]
DOWNLEFT(x+i,y+j)=PredMatrixSb[max(x,x+i-1)][min(y+subheight-1,y+j+1)]
DOWN(x+i,y+j)=PredMatrixSb[x+i][min(y+subheight-1,y+j+1)]
DOWNRIGHT(x+i,y+j)=PredMatrixSb[min(x+subwidth-1,x+i+1)][min(y+subheight-1,y+j+1)]
其中,可以将max(a,b)理解为取a,b中的较大值,可以将min(a,b)理解为取a,b中的较小值。
可以理解的是,在本申请的实施例中,CENTER(x+i,y+j)像素位置可以为上述9个相邻的、构成矩形的像素位置中的中心位置,然后可以基于该(x+i,y+j)像素位置,以及与其相邻的其他8个像素位置,进行二次预测处理。具体地,其他8个像素位置分别为UP(上)、UPRIGHT(右上)、LEFT(左)、RIGHT(右)、DOWNLEFT(左下)、DOWN(下)、DOWNRIGHT(右下)、UPLEFT(左上)。
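上述9点滤波及两种边界限制方式可以整体示意如下(假设性示例代码,并非标准文本;以布尔参数clamp_to_subblock代替"dMv最大值的绝对值是否小于阈值"的判断,pred_sb按[x][y]索引,dx*8等价于标准文本中的dx<<3):

```python
def clip3(lo, hi, v):
    # 等价于标准文本中的Clip3(lo, hi, v)
    return max(lo, min(hi, v))

def secondary_pred(pred_sb, dmv, x, y, sw, sh, width, height, bit_depth,
                   clamp_to_subblock):
    out = {}
    # 相邻样本索引的裁剪范围: 限制在当前子块内, 或限制在整个当前块内
    if clamp_to_subblock:
        x0, x1, y0, y1 = x, x + sw - 1, y, y + sh - 1
    else:
        x0, x1, y0, y1 = 0, width - 1, 0, height - 1
    px = lambda a, b: pred_sb[clip3(x0, x1, a)][clip3(y0, y1, b)]
    for i in range(sw):
        for j in range(sh):
            dx, dy = dmv[i][j]
            acc = (px(x+i-1, y+j-1) * (-dx - dy)       # 左上
                   + px(x+i,   y+j-1) * ((-dy) * 8)    # 上
                   + px(x+i+1, y+j-1) * (dx - dy)      # 右上
                   + px(x+i-1, y+j)   * ((-dx) * 8)    # 左
                   + px(x+i,   y+j)   * (1 << 15)      # 中心
                   + px(x+i+1, y+j)   * (dx * 8)       # 右
                   + px(x+i-1, y+j+1) * (-dx + dy)     # 左下
                   + px(x+i,   y+j+1) * (dy * 8)       # 下
                   + px(x+i+1, y+j+1) * (dx + dy)      # 右下
                   + (1 << 14)) >> 15
            out[(x + i, y + j)] = clip3(0, (1 << bit_depth) - 1, acc)
    return out
```

由于除中心外8个系数之和为0,对平坦区域(各位置预测值相同)输出与输入一致。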
需要说明的是,在本申请中,PredMatrixS[x+i][y+j]的计算公式可以使用更低的精度。比如将每一个乘法右边的项都右移,如dMv[i][j][0]和dMv[i][j][1]都右移shift3位,相应的,1<<15变为1<<(15-shift3),…+(1<<14))>>15变为…+(1<<(14-shift3)))>>(15-shift3)。
示例性的,可以限制运动矢量的大小在一个合理的范围,如上述使用的运动矢量在水平方向和垂直方向的正负值都不超过1个像素或1/2像素或1/4像素等。
可以理解的是,如果当前块的预测参考模式是‘Pred_List01’,那么解码器将各分量的多个预测样本矩阵平均得到该分量的最终的预测样本矩阵。例如,2个亮度预测样本矩阵平均后得到新的亮度预测样本矩阵。
进一步地,在本申请的实施例中,在得到当前块的预测样本矩阵后,如果当前块没有变换系数,那么预测矩阵就作为当前块的解码结果,如果当前块还有变换系数,那么,可以先解码变换系数,并通过反变换、反量化得到残差矩阵,将残差矩阵加到预测矩阵上得到解码结果。
综上所述,通过步骤301至步骤308所提出的帧间预测方法,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,为了避免因为像素不相连导致的预测性能的下降,解码器可以对不属于当前子块的目标像素位置进行更新处理,获得更新后的像素位置,从而可以基于更新后的像素位置进行二次预测或PROF处理。其中,上述更新处理的过程可以理解为对当前子块的边界的扩展,也可以理解为对超出当前子块的边界的目标像素位置的重新定义,从而将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,进而通过保证像素之间相连而提高预测性能。
可以理解的是,在本申请中,进行二次预测所需要使用的像素位置不在同一个子块是指对当前子块中的某一个像素位置,即当前像素位置,二次预测需要用来滤波的目标像素位置不全部属于该当前子块。相应地,进行PROF处理需要使用的像素位置不在同一个子块是指对当前子块中的某一个像素位置,即当前像素位置,PROF处理需要用来计算梯度的目标像素位置不全部属于该当前子块。
需要说明的是,本申请实施例提出的帧间预测方法,可以作用于整个编码单元或预测单元,即作用于当前块,也可以作用于当前块中的每一个子块,还可以作用于任意一个子块中的每一个像素位置。本申请不作具体限定。
进一步地,在本申请的实施例中,如果该帧间预测方法作用于当前块中,还能够降低当前块中的每一个子块之间的数据依赖性,从而可以实现基于子块的、二次预测或PROF处理的并行执行,也就是说,在对一个子块进行二次预测或PROF处理时,不需要等待其他子块的、基于子块的预测的预测值。具体地,可以实现基于子块的预测的并行处理,还可以在基于子块的预测的基础上,实现基于点的预测(二次预测或PROF处理)的并行处理。
需要说明的是,本申请提出的帧间预测方法,可以适用于任何一个图像分量上,在本实施例中示例性的对亮度分量使用二次预测方案,但是也可以用于色度分量,或者其他格式的任一分量。本申请提出的帧间预测方法也可以适用于任何一种视频格式上,包括但不限于YUV格式,包括但不限于YUV格式的亮度分量。
本实施例提供了一种帧间预测方法,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,可以通过对当前子块的边界的扩展、对超出当前子块的边界的目标像素位置的重新定义等多种方式,将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,从而解决了因为像素不相连所导致的预测性能下降的问题,能够减小预测的误差,大大提升编码性能,从而提高了编解码效率。
基于上述实施例,在本申请的再一实施例中,进一步地,图20为帧间预测方法的实现流程示意图三,如图20所示,解码器进行帧间预测的方法还可以包括以下步骤:
步骤309、当预测模式参数指示使用帧间预测模式确定当前块的帧间预测值时,确定当前子块的扩展子块。
步骤3010、基于第一运动矢量确定扩展子块的第三预测值,并确定当前像素位置对应的目标像素位置。
步骤3011、若目标像素位置不属于当前子块,基于第三预测值和目标像素位置,确定第二预测值,将第二预测值确定为帧间预测值。
在本申请的实施例中,解码器在解析获得预测模式参数之后,如果预测模式参数指示使用帧间预测模式确定当前块的帧间预测值,那么解码器可以先确定当前子块的扩展子块。
需要说明的是,在本申请的实施例中,扩展子块为对当前块中的当前子块进行扩展后所确定出的子块。具体地,由于二次预测或PROF处理时所使用的目标像素位置与当前像素位置之间的距离为1个像素位置,即使目标像素位置不属于当前子块,也仅仅超出当前子块的边界一个像素位置,因此,解码器在基于子块进行预测时,可以先直接对当前子块对应的扩展子块进行预测,从而解决目标像素位置不属于当前子块所造成的预测效果不佳的问题。
进一步地,在本申请的实施例中,当前子块对应的扩展子块与当前子块相比,可以在当前子块的上侧、下侧、左侧以及右侧各增加一行(列)像素点。
需要说明的是,在本申请的实施例中,基于子块尺寸参数确定出的当前子块的尺寸大小可以为8×8或4×4,相应地,扩展子块的尺寸参数,即对应的扩展子块的尺寸大小可以为10x10或6x6。
进一步地,在本申请的实施例中,解码器在确定当前子块的扩展子块之后,便可以基于当前子块的第一运动矢量确定出该扩展子块的第三预测值,同时,可以确定当前像素位置对应的目标像素位置。
可以理解的是,在本申请中,由于扩展子块的中心点与对应的当前子块的中心点是相同的,因此扩展子块与对应的当前子块的运动矢量是相同的。
需要说明的是,在本申请的实施例中,步骤3010具体可以包括:
步骤3010a、基于第一运动矢量确定扩展子块的第三预测值。
步骤3010b、确定当前像素位置对应的目标像素位置。
其中,本申请实施例提出的帧间预测方法对解码器执行步骤3010a和步骤3010b的顺序不进行限定,也就是说,在本申请中,在确定出当前块的每一个子块对应的第一运动矢量之后,解码器可以先执行步骤3010a,然后执行步骤3010b,也可以先执行步骤3010b,再执行步骤3010a,还可以同时执行步骤3010a和步骤3010b。
进一步地,在本申请的实施例中,解码器在基于第一运动矢量确定扩展子块的第三预测值时,可以先确定样本矩阵;其中,样本矩阵包括亮度样本矩阵和色度样本矩阵;然后可以根据预测参考模式、扩展子块的尺寸参数、样本矩阵以及运动矢量集合,确定第三预测值。
由此可见,在本申请中,在进行基于子块的预测的过程中,可以提前将二次预测或PROF处理所需要的像素位置的像素值也同时预测出来。由于二次预测或PROF处理时所使用的目标像素位置与当前像素位置之间的距离为1个像素位置,因此在二次预测或PROF处理仅仅需要使用当前子块的上下各一行、左右各一列的像素位置的像素值,按照该原则便可以确定出对应的扩展子块。例如,如果当前子块是尺寸大小为4x4的子块,那么二次预测或PROF处理需要使用尺寸大小为6x6的扩展子块的基于子块的预测值,如果当前子块是尺寸大小为8x8的子块,那么二次预测或PROF处理需要使用尺寸大小为10x10的扩展子块的基于子块的预测值。也就是说,解码器通过在基于子块的预测过程中,提前将二次预测或PROF处理所需要的像素位置的像素值也一起预测出来,那么在后续的二次预测或PROF处理过程中,便不会存在当前像素位置和目标像素位置不相连的问题。
在本申请的实施例中,进一步地,解码器在提前预测获得扩展子块的第三预测值之后,如果目标像素位置不属于当前子块,那么解码器可以基于第三预测值和目标像素位置,确定出当前子块对应的第二预测值,然后便可以将第二预测值确定为帧间预测值。
可以理解的是,在本申请中,即使目标像素位置不属于当前子块,也一定属于当前子块对应的扩展子块,因此,在确定目标像素位置不属于当前子块之后,解码器便可以利用扩展子块的第三预测值和目标像素位置进一步对当前像素位置进行二次预测,获得该当前像素位置所对应的修正后预测值,遍历当前子块中的每一个像素位置,获得每一个像素位置对应的修正后预测值,最终便可以确定当前子块的第二预测值,进而可以将第二预测值确定为当前块的帧间预测值。
示例性的,在本申请中,如果当前块的宽度和高度分别是width和height,每个子块的宽度和高度分别是subwidth和subheight。如图7所示,当前块的亮度预测样本矩阵的左上角样本所在的子块为A,右上角样本所在的子块为B,左下角样本所在的子块为C,其他位置所在子块为其它子块。
对当前块中的每个子块,可以将子块的运动矢量偏差矩阵记为dMv,那么:
1、如果子块是A,dMv等于dMvA;
2、如果子块是B,dMv等于dMvB;
3、如果子块为C,且该当前子块的控制点运动矢量组mvAffine中有3个运动矢量,dMv等于dMvC;
4、如果子块为A、B、C以外的其他子块,dMv等于dMvN。
进一步地,假设(x,y)是当前子块左上角位置的坐标,(i,j)是亮度子块内部像素的坐标,i的取值范围是0~(subwidth-1),j的取值范围是0~(subheight-1),当前子块基于子块的预测样本矩阵为PredMatrixTmp,PredMatrixTmp是扩展子块(subwidth+2)*(subheight+2)对应的预测样本矩阵,二次预测的预测样本矩阵为PredMatrixS,可以按照以下方法计算(x+i,y+j)的二次预测的预测样本PredMatrixS[x+i][y+j]:
PredMatrixS[x+i][y+j]=
(UPLEFT(x+i,y+j)×(-dMv[i][j][0]-dMv[i][j][1])+
UP(x+i,y+j)×((-dMv[i][j][1])<<3)+
UPRIGHT(x+i,y+j)×(dMv[i][j][0]-dMv[i][j][1])+
LEFT(x+i,y+j)×((-dMv[i][j][0])<<3)+
CENTER(x+i,y+j)×(1<<15)+
RIGHT(x+i,y+j)×(dMv[i][j][0]<<3)+
DOWNLEFT(x+i,y+j)×(-dMv[i][j][0]+dMv[i][j][1])+
DOWN(x+i,y+j)×(dMv[i][j][1]<<3)+
DOWNRIGHT(x+i,y+j)×(dMv[i][j][0]+dMv[i][j][1])+
(1<<14))>>15
PredMatrixS[x+i][y+j]=Clip3(0,(1<<BitDepth)-1,PredMatrixS[x+i][y+j])。
其中,UPLEFT(x+i,y+j)=PredMatrixTmp[i][j]
UP(x+i,y+j)=PredMatrixTmp[i+1][j]
UPRIGHT(x+i,y+j)=PredMatrixTmp[i+2][j]
LEFT(x+i,y+j)=PredMatrixTmp[i][j+1]
CENTER(x+i,y+j)=PredMatrixTmp[i+1][j+1]
RIGHT(x+i,y+j)=PredMatrixTmp[i+2][j+1]
DOWNLEFT(x+i,y+j)=PredMatrixTmp[i][j+2]
DOWN(x+i,y+j)=PredMatrixTmp[i+1][j+2]
DOWNRIGHT(x+i,y+j)=PredMatrixTmp[i+2][j+2]
可以理解的是,在本申请中,尺寸参数为6x6或10x10的扩展子块对应的预测样本矩阵PredMatrixTmp,可以为在原有基于子块的预测的子块预测矩阵基础上增加上下各一行、左右各一列的预测值的预测矩阵。其中,PredMatrixTmp是对应于子块的矩阵,而PredMatrixS是对应于整个预测单元的矩阵,即当前块的矩阵,所以它们的索引有区别。扩展子块与当前子块相比,由于PredMatrixTmp在左边多一列且在上边多一行,因此与子块对应的位置的索引分别水平方向加1和垂直方向加1。
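PredMatrixTmp的索引关系可示意如下(假设性示例代码;pred_tmp为(subwidth+2)×(subheight+2)的扩展子块预测矩阵,子块内坐标(i,j)的中心样本位于pred_tmp[i+1][j+1]):

```python
def neighbors_from_tmp(pred_tmp, i, j):
    # 由扩展子块预测矩阵直接取出9个相邻样本, 无需任何max/min边界裁剪
    return {
        "UPLEFT":    pred_tmp[i][j],
        "UP":        pred_tmp[i + 1][j],
        "UPRIGHT":   pred_tmp[i + 2][j],
        "LEFT":      pred_tmp[i][j + 1],
        "CENTER":    pred_tmp[i + 1][j + 1],
        "RIGHT":     pred_tmp[i + 2][j + 1],
        "DOWNLEFT":  pred_tmp[i][j + 2],
        "DOWN":      pred_tmp[i + 1][j + 2],
        "DOWNRIGHT": pred_tmp[i + 2][j + 2],
    }
```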
进一步地,在本申请的实施例中,在执行步骤3010a之前,即在基于第一运动矢量确定扩展子块的第三预测值之前,解码器也可以按照上述步骤306至步骤308所提出的方法,先对当前子块是否满足预设限制条件进行判断,只有在满足预设限制条件下,解码器才会确定扩展子块的第三预测值,并使用第三预测值进行二次预测或PROF处理,获得当前子块的帧间预测值;如果当前子块不满足预设限制条件,那么解码器便可以直接使用当前子块的第一预测值和目标像素位置,确定第二预测值,然后将第二预测值确定为帧间预测值。
也就是说,在本申请中,预设限制条件还可以用于确定是否对当前子块进行扩展子块的第三预测值的确定。
可以理解的是,在本申请的实施例中,除了使用上述的第一运动矢量偏差、第二运动矢量偏差或者第三运动矢量偏差进行是否满足预设限制条件的判断以外,解码器还可以直接利用当前子块对应的子块尺寸参数来判断是否满足预设限制条件。
示例性的,在本申请中,如果当前子块对应的子块尺寸参数为8x8,那么可以判定满足预设限制条件;如果当前子块对应的子块尺寸参数为4x4,那么可以判定不满足预设限制条件。
由于基于子块的预测需要用到插值滤波,一般情况下可以使用水平和垂直方向的8抽头滤波器进行内插。如果将插值点数由原来的4x4变成6x6,或者由原来的8x8变成10x10,那么需要增加额外的带宽,计算量也会增加。因此,需要基于预设限制条件对是否确定并使用当前子块对应的扩展子块的第三预测值进行判断。例如,如果当前子块的大小是8x8,那么便判定满足预设限制条件,则确定并使用扩展子块的第三预测值,以进行二次预测或PROF处理;如果当前子块的大小是4x4,那么便判定不满足预设限制条件,则不需要确定扩展子块的第三预测值,而是直接使用当前子块的第一预测值,以进行二次预测或PROF处理。
由于按现有的内插的方法增加新的预测值,如上述的上下各一行、左右各一列的预测值,会增加带宽。为了解决带宽增加的问题,还可以使用少于8抽头的滤波器进行内插,从而可以在确定额外像素位置的预测值时不增加带宽;进一步地,也可以直接使用距离最近的参考像素值作为对应的额外像素位置的预测值。本申请不作具体限定。
示例性的,在本申请中,图21为4x4子块的示意图,如图21所示,对一个4x4的子块使用8抽头的插值滤波器,需要使用如图所示的、该子块周围的11x11的参考像素进行插值。也就是说,对一个4x4子块来说,如果使用8抽头的插值滤波器,左侧和上侧需要多出3个参考像素、右侧和下侧需要多出4个参考像素进行插值。
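上述参考像素范围可以由"子块尺寸 + 抽头数 - 1"直接得出,示意如下(假设性示例代码):

```python
def ref_region(sub_size, taps):
    # k抽头插值滤波器在一个方向上需要 sub_size + taps - 1 个参考像素,
    # 其中左/上侧多出 taps//2 - 1 个, 右/下侧多出 taps//2 个
    return sub_size + taps - 1

# 4x4子块 + 8抽头滤波器 → 一个方向上需要11个参考像素, 即11x11
assert ref_region(4, 8) == 11
# 扩展为6x6后仍用8抽头 → 13x13
assert ref_region(6, 8) == 13
```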
可以理解的是,在本申请中,可以通过对扩展子块使用较少抽头的滤波器来解决带宽增加的问题,例如,对扩展子块使用n抽头的插值滤波器,获得第三预测值;其中,n为以下值中的任一者:6,5,4,3,2。
图22为扩展子块示意图一,如图22所示,圆形1可以表征子块内部的像素位置对应的参考图像中的分像素,即可以表征该子块的基于子块的预测的预测值,圆形2可以表征提前预测获得的、扩展子块中超出原子块边界的额外的预测值。其中,若对圆形2使用与圆形1一样的插值方式,所需要的参考像素就会超出原来的11x11的范围。但是使用更少抽头的滤波器,如6或5或4或3或2抽头滤波器,就可以不超出该范围从而不增加带宽。2抽头滤波器可以是分像素点的相邻的2个整像素的平均或加权平均。上述所说的滤波器都是水平或垂直方向的,如果要插值的位置在水平和垂直方向都是分像素,那么需要叠加使用水平和垂直方向的滤波器。
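其中提到的2抽头滤波器(相邻2个整像素的加权平均)可示意如下(假设性示例代码,此处假定分像素相位frac以1/16像素为单位,命名均为本示例自行引入):

```python
def two_tap(p0, p1, frac, prec=4):
    # p0、p1为分像素位置两侧(或上下)相邻的2个整像素值,
    # 权重由分像素相位frac决定, 加舍入偏移后右移归一化
    w1 = frac
    w0 = (1 << prec) - frac
    return (p0 * w0 + p1 * w1 + (1 << (prec - 1))) >> prec
```

例如two_tap(10, 20, 8)即1/2像素相位处两整像素的平均,结果为15。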
可以理解的是,在本申请中,为了解决带宽增加的问题,还可以先对当前子块对应的参考像素进行扩展处理,获得扩展后参考像素;然后基于扩展后参考像素,使用当前子块对应的插值滤波器,获得第三预测值。
图23为扩展子块示意图二,如图23所示,为了不增加带宽,还可以将原来插值的参考像素的最外面一圈进行扩展,例如,将11x11的参考像素扩充到13x13的参考像素,具体可以采用上述步骤304所提出的扩展方法。这样可以使用统一的滤波器。
示例性的,提出另一种不增加带宽的方法,可以不扩展像素,而是在滤波器需要使用的像素超出了原有的像素范围时,选择使用就近的范围内的像素来代替。
可以理解的是,在本申请中,为了解决带宽增加的问题,还可以先在当前块中确定扩展子块对应的相邻像素位置;然后根据相邻像素位置的整像素值确定第三预测值。
图24为替换像素的示意图,如图24所示,为了不增加带宽,还可以直接使用最相邻的整像素的值作为需要使用的、超出该子块之外的预测值。具体地,可以直接使用方形3的像素值作为圆形2的预测值。如果水平方向的分像素MV小于等于(或小于)1/2像素,则使用分像素左边的整像素,如果水平方向的分像素MV大于(或大于等于)1/2像素,则使用分像素右边的整像素,如果垂直方向的分像素MV小于等于(或小于)1/2像素,则使用分像素上边的整像素,如果垂直方向的分像素MV大于(或大于等于)1/2像素,则使用分像素下边的整像素。
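上述按分像素MV与1/2像素比较选取最近整像素的规则可示意如下(假设性示例代码,此处按"小于等于取左/上"的分支书写,返回值为相对于左/上整像素的偏移):

```python
def nearest_int_offset(frac_x, frac_y, half=0.5):
    # 水平方向: 分像素MV <= 1/2像素取左边整像素(偏移0), 否则取右边(偏移1)
    dx = 0 if frac_x <= half else 1
    # 垂直方向: 分像素MV <= 1/2像素取上边整像素(偏移0), 否则取下边(偏移1)
    dy = 0 if frac_y <= half else 1
    return dx, dy
```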
在本申请的实施例中,进一步地,图25为帧间预测方法的实现流程示意图四,如图25所示,在基于第一运动矢量确定当前子块的第一预测值,并确定当前像素位置对应的目标像素位置之后,即步骤303之后,解码器进行帧间预测的方法还可以包括以下步骤:
步骤3012、若目标像素位置不属于当前子块,则在当前块中确定邻近像素位置;其中,邻近像素位置在当前块中与目标像素位置相邻。
步骤3013、基于第一预测值和邻近像素位置,确定第二预测值,将第二预测值确定为帧间预测值。
在本申请的实施例中,解码器在确定当前像素位置对应的目标像素位置之后,如果目标像素位置不属于当前子块,即当前像素位置和目标像素位置并不在相同的子块中,那么解码器可以先在当前块中的全部像素位置中,确定出与目标像素位置距离最近的邻近像素位置;其中,该邻近像素位置是在当前块中与目标像素位置相邻的。
进一步地,在本申请的实施例中,如果进行二次预测或PROF处理时所需要的像素值是当前子块外的目标像素位置的像素值,那么解码器可以重新确定该像素值。具体地,解码器可以计算获得当前块中的、与该目标像素位置距离最近的一个邻近像素位置,然后可以使用该邻近像素位置的像素值进行二次预测或PROF处理。
示例性的,在本申请中,图26为替代像素位置的示意图,如图26所示,子块2中并不存在与子块1中的位置1正左方相邻的像素位置,如果按照目前的二次预测方法,会选择子块2中的位置3作为子块1中位置1的正左方一个像素距离的位置,可以看出,位置1和位置3相差较远,如果使用位置3作为位置1的相邻像素位置进行滤波,会大大降低最终获得的二次预测的预测效果。这时,可以在当前块中选择位置4替代位置3,即该位置4为位置2的邻近像素位置,具体地,位置4是与位置2距离最近的像素位置,两者的偏差较小,使用位置4作为位置1的相邻像素位置进行滤波,会比使用位置3获得更加准确的结果。
可以理解的是,在本申请的实施例中,解码器在当前块中的其他子块中确定该邻近像素位置时,需要使用当前子块的第一运动矢量进行计算,获得的结果也可能为当前块中的其他子块中不存在该邻近像素位置,如果邻近像素位置不存在,那么解码器可以选择继续使用目标像素位置的像素值进行二次预测或PROF处理。
也就是说,本申请实施例提出的帧间预测方法,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,为了避免因为像素不相连导致的预测性能的下降,基于将使用的像素位置限制在同一个子块中的原则,解码器可以对不属于当前子块的目标像素位置进行更新处理,获得更新后的像素位置,从而可以基于更新后的像素位置进行二次预测或PROF处理。其中,上述更新处理的过程可以理解为对当前子块的边界的扩展,也可以理解为对超出当前子块的边界的目标像素位置的重新定义,从而将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,进而通过保证像素之间相连而提高预测性能。其中,该帧间预测方法可以对当前块作用,也可以对当前块中的每一个子块作用,还可以对任意一个子块中的每一个像素位置作用。可见,本申请实施例提出的帧间预测方法,通过避免把参考图像中明显不相邻的像素作为相邻像素来进行二次预测或PROF处理,从而可以减小预测的误差,进而提升编码性能。
本实施例提供了一种帧间预测方法,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,可以通过对当前子块的边界的扩展、对超出当前子块的边界的目标像素位置的重新定义等多种方式,将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,从而解决了因为像素不相连所导致的预测性能下降的问题,能够减小预测的误差,大大提升编码性能,从而提高了编解码效率。
本申请实施例提供一种帧间预测方法,该方法应用于视频编码设备,即编码器。该方法所实现的功能可以通过编码器中的第二处理器调用计算机程序来实现,当然计算机程序可以保存在第二存储器中,可见,该编码器至少包括第二处理器和第二存储器。
进一步地,在本申请的实施例中,编码器进行帧间预测的方法可以包括以下步骤:
步骤401、确定当前块的预测模式参数。
在本申请的实施例中,编码器可以先确定当前块的预测模式参数。具体地,编码器可以先确定当前块使用的预测模式,然后基于该预测模式确定对应的预测模式参数。其中,预测模式参数可以用于对当前块所使用的预测模式进行确定。
需要说明的是,在本申请的实施例中,待编码图像可以划分为多个图像块,当前待编码的图像块可以称为当前块,与当前块相邻的图像块可以称为相邻块;即在待编码图像中,当前块与相邻块之间具有相邻关系。这里,每个当前块可以包括第一图像分量、第二图像分量和第三图像分量;也即当前块为待编码图像中当前待进行第一图像分量、第二图像分量或者第三图像分量预测的图像块。
其中,假定当前块进行第一图像分量预测,而且第一图像分量为亮度分量,即待预测图像分量为亮度分量,那么当前块也可以称为亮度块;或者,假定当前块进行第二图像分量预测,而且第二图像分量为色度分量,即待预测图像分量为色度分量,那么当前块也可以称为色度块。
需要说明的是,在本申请的实施例中,预测模式参数指示了当前块采用的预测模式以及与该预测模式相关的参数。这里,针对预测模式参数的确定,可以采用简单的决策策略,比如根据失真值的大小进行确定;也可以采用复杂的决策策略,比如根据率失真优化(Rate Distortion Optimization,RDO)的结果进行确定,本申请实施例不作任何限定。通常而言,可以采用RDO方式来确定当前块的预测模式参数。
具体地,在一些实施例中,编码器在确定当前块的预测模式参数时,可以先利用多种预测模式对当前块进行预编码处理,获得每一种预测模式对应的率失真代价值;然后从所获得的多个率失真代价值中选择最小率失真代价值,并根据最小率失真代价值对应的预测模式确定当前块的预测模式参数。
也就是说,在编码器侧,针对当前块可以采用多种预测模式分别对当前块进行预编码处理。这里,多种预测模式通常包括有帧间预测模式、传统帧内预测模式和非传统帧内预测模式;其中,传统帧内预测模式可以包括有直流(Direct Current,DC)模式、平面(PLANAR)模式和角度模式等,非传统帧内预测模式可以包括有基于矩阵的帧内预测(Matrix-based Intra Prediction,MIP)模式、跨分量线性模型预测(Cross-component Linear Model Prediction,CCLM)模式、帧内块复制(Intra Block Copy,IBC)模式和PLT(Palette)模式等,而帧间预测模式可以包括有普通帧间预测模式、GPM模式和AWP模式等。
这样,在利用多种预测模式分别对当前块进行预编码之后,可以得到每一种预测模式对应的率失真代价值;然后从所得到的多个率失真代价值中选取最小率失真代价值,并将该最小率失真代价值对应的预测模式确定为当前块的预测模式参数。除此之外,还可以在利用多种预测模式分别对当前块进行预编码之后,得到每一种预测模式对应的失真值;然后从所得到的多个失真值中选取最小失真值,然后将该最小失真值对应的预测模式确定为当前块使用的预测模式,并根据该预测模式设置对应的预测模式参数。如此,最终使用所确定的预测模式参数对当前块进行编码,而且在这种预测模式下,可以使得预测残差较小,能够提高编码效率。
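上述"从多个率失真代价值中选取最小者"的决策可示意如下(假设性示例代码,模式名与代价值均为虚构,仅用于说明选择逻辑):

```python
def choose_mode(rd_costs):
    # rd_costs: {预测模式: 率失真代价值}, 返回代价最小的模式
    return min(rd_costs, key=rd_costs.get)

best = choose_mode({"inter": 12.3, "intra_dc": 15.1, "affine": 11.8})
# best == "affine"
```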
也就是说,在编码侧,编码器可以选取最优的预测模式对当前块进行预编码,在这过程中就可以确定出当前块的预测模式,然后确定用于指示预测模式的预测模式参数,从而将相应的预测模式参数写入码流,由编码器传输到解码器。
相应地,在解码器侧,解码器通过解析码流便可以直接获取到当前块的预测模式参数,并根据解析获得的预测模式参数确定当前块所使用的预测模式,以及该预测模式对应的相关参数。
步骤402、当预测模式参数指示使用帧间预测模式确定当前块的帧间预测值时,确定当前块的当前子块的第一运动矢量;其中,当前块包括多个子块。
在本申请的实施例中,如果预测模式参数指示当前块使用帧间预测模式确定当前块的帧间预测值,那么编码器可以先确定出当前块的每一个子块的第一运动矢量。其中,一个子块对应有一个第一运动矢量。
需要说明的是,在本申请的实施例中,当前块为当前帧中待编码的图像块,当前帧以图像块的形式按一定顺序依次进行编码,该当前块为当前帧内按该顺序下一时刻待编码的图像块。当前块可具有多种规格尺寸,例如16×16、32×32或32×16等规格,其中数字表示当前块上像素点的行数和列数。
进一步地,在本申请的实施例中,当前块可以划分为多个子块,其中,每一个子块的尺寸大小都是相同的,子块为较小规格的像素点集合。子块的尺寸可以为8×8或4×4。
示例性的,在本申请中,当前块的尺寸为16×16,可以划分为4个尺寸均为8×8的子块。
可以理解的是,在本申请的实施例中,在编码器确定预测模式参数指示使用帧间预测模式确定当前块的帧间预测值的情况下,就可以继续采用本申请实施例所提供的帧间预测方法。
在本申请的实施例中,进一步地,当预测模式参数指示使用帧间预测模式确定当前块的帧间预测值时,编码器确定当前块的当前子块的第一运动矢量时,可以确定当前块的仿射模式参数和预测参考模式。当仿射模式参数指示使用仿射模式时,确定控制点模式和子块尺寸参数。最后可以根据预测参考模式、控制点模式以及子块尺寸参数,确定第一运动矢量。
需要说明的是,在本申请的实施例中,仿射模式参数用于对是否使用仿射模式进行指示。具体地,仿射模式参数可以为仿射运动补偿允许标志affine_enable_flag,编码器通过仿射模式参数的取值的确定,可以进一步确定是否使用仿射模式。
也就是说,在本申请中,仿射模式参数可以为一个二值变量。若仿射模式参数的取值为1,则指示使用仿射模式;若仿射模式参数的取值为0,则指示不使用仿射模式。
需要说明的是,在本申请的实施例中,控制点模式用于对控制点的个数进行确定。在仿射模型中,一个子块可以有2个控制点或者3个控制点,相应地,控制点模式可以为2个控制点对应的控制点模式,或者为3个控制点对应的控制点模式。即控制点模式可以包括4参数模式和6参数模式。
可以理解的是,在本申请的实施例中,对于AVS3标准,如果当前块使用了仿射模式,那么编码器还需要对当前块在仿射模式中控制点的个数进行确定,从而可以确定出使用的是4参数(2个控制点)模式,还是6参数(3个控制点)模式。
进一步地,在本申请的实施例中,如果编码器确定的仿射模式参数指示使用仿射模式,那么编码器可以进一步确定子块尺寸参数。
具体地,可以通过仿射预测子块尺寸标志affine_subblock_size_flag表征子块尺寸参数,编码器可以通过对子块尺寸标志的取值的设定来指示子块尺寸参数,即指示当前块的当前子块的尺寸大小。其中,子块的尺寸大小可以为8×8或4×4。
示例性的,在本申请中,若子块尺寸参数为8×8,则将子块尺寸标志设置为1,并将子块尺寸标志写入码流;若子块尺寸参数为4×4,则将子块尺寸标志设置为0,并将子块尺寸标志写入码流。
具体地,在本申请的实施例中,编码器可以先根据预测参考模式确定控制点运动矢量组;然后可以基于控制点运动矢量组、控制点模式以及子块尺寸参数,确定出子块的第一运动矢量。
可以理解的是,在本申请的实施例中,控制点运动矢量组可以用于对控制点的运动矢量进行确定。
需要说明的是,在本申请的实施例中,编码器可以按照上述方法,遍历当前块中的每一个子块,利用每一个子块的控制点运动矢量组、控制点模式以及子块尺寸参数,确定出每一个子块的第一运动矢量,从而可以根据每一个子块的第一运动矢量构建获得运动矢量集合。
可以理解的是,在本申请的实施例中,当前块的运动矢量集合中可以包括当前块的每一个子块的第一运动矢量。
进一步地,在本申请的实施例中,编码器在根据控制点运动矢量组、控制点模式以及子块尺寸参数,确定第一运动矢量时,可以先根据控制点运动矢量组、控制点模式以及当前块的尺寸参数,确定差值变量;然后可以基于预测模式参数和子块尺寸参数,确定子块位置;最后,便可以利用差值变量和当前子块位置,确定当前子块的第一运动矢量,进而可以获得当前块的多个子块的运动矢量集合。
步骤403、基于第一运动矢量确定当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,当前像素位置为当前子块内的一个像素点的位置,目标像素位置为对当前像素位置的像素点进行二次预测或PROF处理的像素点的位置。
在本申请的实施例中,编码器在确定出当前块的每一个子块的第一运动矢量之后,可以先基于当前子块的第一运动矢量确定出当前子块的第一预测值,然后可以确定该当前子块中的当前像素位置所对应的目标像素位置。其中,目标像素位置是与当前像素位置相邻的像素位置。
需要说明的是,在本申请的实施例中,当前像素位置为当前块的当前子块中的一个像素点的位置,其中,当前像素位置可以表征待处理的像素点的位置。具体地,当前像素位置可以为待二次预测的像素点的位置,也可以为待PROF处理的像素点的位置。
可以理解的是,在本申请的实施例中,步骤403具体可以包括:
步骤403a、基于第一运动矢量确定子块的第一预测值。
步骤403b、确定当前像素位置对应的目标像素位置。
其中,本申请实施例提出的帧间预测方法对编码器执行步骤403a和步骤403b的顺序不进行限定,也就是说,在本申请中,在确定出当前块的每一个子块的第一运动矢量之后,编码器可以先执行步骤403a,然后执行步骤403b,也可以先执行步骤403b,再执行步骤403a,还可以同时执行步骤403a和步骤403b。
步骤404、若目标像素位置不属于当前子块,则根据当前子块对目标像素位置进行更新处理,获得更新后像素位置。
在本申请的实施例中,编码器在基于第一运动矢量确定当前子块的第一预测值,并确定当前像素位置对应的目标像素位置之后,如果目标像素位置不属于当前子块,即目标像素位置和当前像素位置属于不同的子块,那么编码器便可以根据当前子块对目标像素位置进行更新处理,从而获得更新后像素位置。
进一步地,在本申请的实施例中,若目标像素位置不属于当前子块,编码器则根据当前子块对目标像素位置进行更新处理,获得更新后像素位置之前,即步骤404之前,编码器进行帧间预测的方法还可以包括以下步骤:
步骤406、判断当前子块是否满足预设限制条件;其中,预设限制条件用于确定是否将目标像素位置限制在当前子块内。
步骤407、若满足预设限制条件,则判断目标像素位置是否属于当前子块。
步骤408、若不满足预设限制条件,则基于第一预测值和目标像素位置,确定第二预测值,将第二预测值确定为帧间预测值。
在本申请的实施例中,编码器可以先判断当前子块是否满足预设限制条件,如果满足预设限制条件,那么编码器可以继续确定目标像素位置是否属于当前子块,从而进一步确定是否对目标像素位置进行限制处理,即是否对目标像素位置执行更新处理的流程;如果不满足预设限制条件,那么编码器便可以不需要对目标像素位置进行限制处理,而是直接基于第一预测值和目标像素位置,进一步确定当前像素位置对应的修正后预测值,并在遍历当前子块中的每一个像素位置,获得每一个像素位置对应的修正后预测值之后,确定出当前子块对应的第二预测值,从而确定出当前子块的帧间预测值。
进一步地,在本申请的实施例中,编码器可以通过多种方式来判断当前子块是否满足预设限制条件。具体地,编码器可以通过运动矢量来进一步确定是否满足预设限制条件。
示例性的,在本申请中,编码器在判断当前子块是否满足预设限制条件时,可以先基于控制点运动矢量组确定第一运动矢量偏差;然后将第一运动矢量偏差与预设偏差阈值进行比较,如果第一运动矢量偏差大于或者等于预设偏差阈值,那么可以判定满足预设限制条件;如果第一运动矢量偏差小于预设偏差阈值,那么判定不满足预设限制条件。
示例性的,在本申请中,编码器在判断当前子块是否满足预设限制条件时,可以先基于每一个子块的第一运动矢量确定第二运动矢量偏差;然后将第二运动矢量偏差与预设偏差阈值进行比较,如果第二运动矢量偏差大于或者等于预设偏差阈值,那么判定满足预设限制条件;如果第二运动矢量偏差小于预设偏差阈值,那么判定不满足预设限制条件。
示例性的,在本申请中,编码器在判断当前子块是否满足预设限制条件时,可以先确定当前子块内的、预设像素位置与当前子块之间的第三运动矢量偏差;然后将第三运动矢量偏差与预设偏差阈值进行比较,如果第三运动矢量偏差大于或者等于预设偏差阈值,那么判定满足预设限制条件;如果第三运动矢量偏差小于预设偏差阈值,那么判定不满足预设限制条件。
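以第二运动矢量偏差为例,上述与阈值的比较可示意如下(假设性示例代码;标准文本未给出该偏差的具体定义,此处假定取各子块第一运动矢量分量极差中的较大者,仅用于说明比较逻辑):

```python
def second_mv_deviation(sub_mvs):
    # sub_mvs: 各子块的第一运动矢量列表 [(mv_x, mv_y), ...]
    xs = [mv[0] for mv in sub_mvs]
    ys = [mv[1] for mv in sub_mvs]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def meets_restriction(sub_mvs, threshold):
    # 偏差大于或者等于预设偏差阈值时判定满足预设限制条件
    return second_mv_deviation(sub_mvs) >= threshold
```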
需要说明的是,在本申请中,可以根据当前块的尺寸参数和\或子块尺寸参数确定预设偏差阈值。
可以理解的是,在本申请中,编码器在根据当前子块对目标像素位置进行更新处理,获得更新后像素位置时,可以先对当前子块进行扩展处理,获得扩展后子块;然后可以在扩展后子块内确定目标像素位置对应的更新后像素位置。
可以理解的是,在本申请的实施例中,编码器在对当前子块进行扩展处理,获得扩展后子块时,可以选择利用当前子块的全部边界位置进行扩展处理,获得扩展后子块;也可以选择利用当前子块内的、目标像素位置对应的行和\或列的边界位置进行扩展处理,获得扩展后子块。
步骤405、基于第一预测值和更新后像素位置,确定当前子块对应的第二预测值,将第二预测值确定为当前子块的帧间预测值。
在本申请的实施例中,如果目标像素位置不属于当前子块,编码器在根据当前子块对目标像素位置进行更新处理,获得更新后像素位置之后,便可以基于第一预测值和更新后像素位置,确定出当前像素位置对应的修正后预测值,并在遍历当前子块中的每一个像素位置,获得每一个像素位置对应的修正后预测值之后,确定当前子块对应的第二预测值,然后可以将第二预测值确定为当前子块的帧间预测值。
进一步地,在本申请的实施例中,编码器基于第一预测值和更新后像素位置,确定当前子块对应的第二预测值的方法可以包括以下步骤:
步骤405a、确定PROF参数;
步骤405b、当PROF参数指示进行PROF处理时,基于第一预测值确定当前像素位置与更新后像素位置之间的像素水平梯度和像素垂直梯度;
步骤405c、确定更新后像素位置与当前子块的第四运动矢量偏差;
步骤405d、根据像素水平梯度、像素垂直梯度以及第四运动矢量偏差,计算当前像素位置对应的偏差值;
步骤405e、基于第一预测值和偏差值,获得第二预测值。
进一步地,在本申请的实施例中,编码器基于第一预测值和更新后像素位置,确定当前子块对应的第二预测值的方法可以包括以下步骤:
步骤405f、确定二次预测参数;
步骤405g、当二次预测参数指示使用二次预测时,确定更新后像素位置与当前子块的第四运动矢量偏差;
步骤405h、根据第四运动矢量偏差确定二维滤波器的滤波系数;其中,二维滤波器用于按照预设形状进行二次预测处理;
步骤405i、基于滤波系数和第一预测值,确定第二预测值,将第二预测值确定为帧间预测值。
综上所述,通过步骤401至步骤408所提出的帧间预测方法,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,为了避免因为像素不相连导致的预测性能的下降,编码器可以对不属于当前子块的目标像素位置进行更新处理,获得更新后的像素位置,从而可以基于更新后的像素位置进行二次预测或PROF处理。其中,上述更新处理的过程可以理解为对当前子块的边界的扩展,也可以理解为对超出当前子块的边界的目标像素位置的重新定义,从而将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,进而通过保证像素之间相连而提高预测性能。
进一步地,在本申请的实施例中,编码器进行帧间预测的方法还可以包括以下步骤:
步骤409、当预测模式参数指示使用帧间预测模式确定当前块的帧间预测值时,确定当前子块的扩展子块。
步骤4010、基于第一运动矢量确定扩展子块的第三预测值,并确定当前像素位置对应的目标像素位置。
步骤4011、若目标像素位置不属于当前子块,基于第三预测值和目标像素位置,确定第二预测值,将第二预测值确定为帧间预测值。
进一步地,在本申请的实施例中,在执行步骤4010a之前,即在基于第一运动矢量确定扩展子块的第三预测值之前,编码器也可以按照上述步骤406至步骤408所提出的方法,先对当前子块是否满足预设限制条件进行判断,只有在满足预设限制条件下,才会确定扩展子块的第三预测值,并使用第三预测值进行二次预测或PROF处理,获得当前子块的帧间预测值;如果当前子块不满足预设限制条件,那么编码器便可以直接使用当前子块的第一预测值和目标像素位置,确定第二预测值,然后将第二预测值确定为帧间预测值。
也就是说,在本申请中,预设限制条件还可以用于确定是否对当前子块进行扩展子块的第三预测值的确定。
可以理解的是,在本申请中,可以通过对扩展子块使用较少抽头的滤波器来解决带宽增加的问题,例如,对扩展子块使用n抽头的插值滤波器,获得第三预测值;其中,n为以下值中的任一者:6,5,4,3,2。也就是说,在本申请中,使用更少抽头的滤波器,如6或5或4或3或2抽头滤波器,就可以不超出该范围从而不增加带宽。2抽头滤波器可以是分像素点的相邻的2个整像素的平均或加权平均。上述所说的滤波器都是水平或垂直方向的,如果要插值的位置在水平和垂直方向都是分像素,那么需要叠加使用水平和垂直方向的滤波器。
可以理解的是,在本申请中,为了解决带宽增加的问题,还可以先对当前子块对应的参考像素进行扩展处理,获得扩展后参考像素;然后基于扩展后参考像素,使用当前子块对应的插值滤波器,获得第三预测值。也就是说,在本申请中,可以将原来插值的参考像素的最外面一圈进行扩展,例如,将11x11的参考像素扩充到13x13的参考像素,具体可以采用上述步骤304所提出的扩展方法。这样可以使用统一的滤波器。同时还提出另一种不增加带宽的方法,可以不扩展像素,而是在滤波器需要使用的像素超出了原有的像素范围时,选择使用就近的范围内的像素来代替。
可以理解的是,在本申请中,为了解决带宽增加的问题,还可以先在当前块中确定扩展子块对应的相邻像素位置;然后根据相邻像素位置的整像素值确定第三预测值。也就是说,在本申请中,还可以直接使用最相邻的整像素的值作为需要使用的、超出该子块之外的预测值。具体地,可以直接使用方形3的像素值作为圆形2的预测值。如果水平方向的分像素MV小于等于(或小于)1/2像素,则使用分像素左边的整像素,如果水平方向的分像素MV大于(或大于等于)1/2像素,则使用分像素右边的整像素,如果垂直方向的分像素MV小于等于(或小于)1/2像素,则使用分像素上边的整像素,如果垂直方向的分像素MV大于(或大于等于)1/2像素,则使用分像素下边的整像素。
进一步地,在本申请的实施例中,在基于第一运动矢量确定当前子块的第一预测值,并确定当前像素位置对应的目标像素位置之后,即步骤403之后,编码器进行帧间预测的方法还可以包括以下步骤:
步骤4012、若目标像素位置不属于当前子块,则在当前块中确定邻近像素位置;其中,邻近像素位置在当前块中与目标像素位置相邻。
步骤4013、基于第一预测值和邻近像素位置,确定第二预测值,将第二预测值确定为帧间预测值。
需要说明的是,本申请实施例提出的帧间预测方法,可以作用于整个编码单元或预测单元,即作用于当前块,也可以作用于当前块中的每一个子块,还可以作用于任意一个子块中的每一个像素位置。本申请不作具体限定。
可以理解的是,在本申请中,编码器可以将预测模式参数、仿射模式参数、预测参考模式写入码流。还可以将PROF参数、二次预测参数写入码流。
本实施例提供了一种帧间预测方法,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,可以通过对当前子块的边界的扩展、对超出当前子块的边界的目标像素位置的重新定义等多种方式,将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,从而解决了因为像素不相连所导致的预测性能下降的问题,能够减小预测的误差,大大提升编码性能,从而提高了编解码效率。
基于上述实施例,在本申请的再一实施例中,图27为解码器的组成结构示意图一,如图27所示,本申请实施例提出的解码器300可以包括解析部分301、第一确定部分302以及第一更新部分303。
所述解析部分301,配置为解析码流,获取当前块的预测模式参数;
所述第一确定部分302,配置为当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;
所述第一更新部分303,配置为若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;
所述第一确定部分302,还配置为基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
图28为解码器的组成结构示意图二,如图28所示,本申请实施例提出的解码器300还可以包括第一处理器304、存储有第一处理器304可执行指令的第一存储器305、第一通信接口306,和用于连接第一处理器304、第一存储器305以及第一通信接口306的第一总线307。
进一步地,在本申请的实施例中,上述第一处理器304,用于解析码流,获取当前块的预测模式参数;当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
图29为编码器的组成结构示意图一,如图29所示,本申请实施例提出的编码器400可以包括第二确定部分401和第二更新部分402。
所述第二确定部分401,配置为确定当前块的预测模式参数;当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;
所述第二更新部分402,配置为若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;
所述第二确定部分401,还配置为基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
图30为编码器的组成结构示意图二,如图30所示,本申请实施例提出的编码器400还可以包括第二处理器403、存储有第二处理器403可执行指令的第二存储器404、第二通信接口405,和用于连接第二处理器403、第二存储器404以及第二通信接口405的第二总线406。
进一步地,在本申请的实施例中,上述第二处理器403,配置为确定当前块的预测模式参数;当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
本申请实施例提供了一种解码器和编码器,该解码器和编码器在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,可以通过对当前子块的边界的扩展、对超出当前子块的边界的目标像素位置的重新定义等多种方式,将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,从而解决了因为像素不相连所导致的预测性能下降的问题,能够减小预测的误差,大大提升编码性能,从而提高了编解码效率。
本申请实施例提供一种计算机可读存储介质,其上存储有程序,该程序被处理器执行时实现如上述实施例所述的方法。
具体来讲,本实施例中的一种帧间预测方法对应的程序指令可以被存储在光盘,硬盘,U盘等存储介质上,当存储介质中的与一种帧间预测方法对应的程序指令被一电子设备读取或被执行时,包括如下步骤:
解析码流,获取当前块的预测模式参数;
当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;
基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;
若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;
基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
具体来讲,本实施例中的一种帧间预测方法对应的程序指令可以被存储在光盘,硬盘,U盘等存储介质上,当存储介质中的与一种帧间预测方法对应的程序指令被一电子设备读取或被执行时,包括如下步骤:
确定当前块的预测模式参数;
当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;
基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;
若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;
基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
本领域内的技术人员应明白,本申请的实施例可提供为方法、系统、或计算机程序产品。因此,本申请可采用硬件实施例、软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器和光学存储器等)上实施的计算机程序产品的形式。
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
工业实用性
本申请实施例所提供的一种帧间预测方法、编码器、解码器以及计算机存储介质,解码器解析码流,获取当前块的预测模式参数;当预测模式参数指示使用帧间预测模式确定当前块的帧间预测值时,确定当前块的当前子块的第一运动矢量;其中,当前块包括多个子块;基于第一运动矢量确定当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,当前像素位置为当前子块内的一个像素点的位置,目标像素位置为对当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;若目标像素位置不属于当前子块,则根据当前子块对目标像素位置进行更新处理,获得更新后像素位置;基于第一预测值和更新后像素位置,确定当前子块对应的第二预测值,将第二预测值确定为当前子块的帧间预测值。也就是说,本申请提出的帧间预测方法,在基于子块的预测之后,如果进行二次预测或PROF处理所需要使用的像素位置不在同一个子块中,可以通过对当前子块的边界的扩展、对超出当前子块的边界的目标像素位置的重新定义等多种方式,将进行二次预测或PROF处理所需要使用的像素位置限制在相同的子块中,从而解决了因为像素不相连所导致的预测性能下降的问题,能够减小预测的误差,大大提升编码性能,从而提高了编解码效率。

Claims (64)

  1. 一种帧间预测方法,应用于解码器,所述方法包括:
    解析码流,获取当前块的预测模式参数;
    当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;
    基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或使用光流原理的预测修正PROF处理的像素点的位置;
    若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;
    基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
  2. 根据权利要求1所述的方法,其中,所述确定所述当前块的当前子块的第一运动矢量,包括:
    解析所述码流,获取所述当前块的仿射模式参数和预测参考模式;
    当所述仿射模式参数指示使用仿射模式时,确定控制点模式和子块尺寸参数;
    根据所述预测参考模式、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量。
  3. 根据权利要求2所述的方法,其中,所述根据所述预测参考模式、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量,包括:
    根据所述预测参考模式确定控制点运动矢量组;
    根据所述控制点运动矢量组、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量。
  4. 根据权利要求3所述的方法,其中,所述根据所述控制点运动矢量组、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量,包括:
    根据所述控制点运动矢量组、所述控制点模式以及所述当前块的尺寸参数,确定差值变量;
    基于所述预测模式参数和所述子块尺寸参数,确定子块位置;
    利用所述差值变量和所述子块位置,确定所述当前子块的所述第一运动矢量。
  5. 根据权利要求2-4任一项所述的方法,其中,所述方法还包括:
    遍历所述当前块的每一个子块,根据所述每一个子块的第一运动矢量构建运动矢量集合。
  6. 根据权利要求5所述的方法,其中,所述基于所述第一运动矢量确定所述当前子块的第一预测值,包括:
    确定样本矩阵;其中,所述样本矩阵包括亮度样本矩阵和色度样本矩阵;
    根据所述预测参考模式、所述子块尺寸参数、所述样本矩阵以及所述运动矢量集合,确定所述第一预测值。
  7. 根据权利要求6所述的方法,其中,所述方法还包括:
    判断所述当前子块是否满足预设限制条件;其中,所述预设限制条件用于确定是否将所述目标像素位置限制在所述当前子块内;
    若满足所述预设限制条件,则判断所述目标像素位置是否属于所述当前子块。
  8. 根据权利要求7所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    基于所述控制点运动矢量组确定第一运动矢量偏差;
    若所述第一运动矢量偏差大于或者等于预设偏差阈值,则判定满足所述预设限制条件;
    若所述第一运动矢量偏差小于所述预设偏差阈值,则判定不满足所述预设限制条件。
  9. 根据权利要求7所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    基于所述每一个子块的第一运动矢量确定第二运动矢量偏差;
    若所述第二运动矢量偏差大于或者等于预设偏差阈值,则判定满足所述预设限制条件;
    若所述第二运动矢量偏差小于所述预设偏差阈值,则判定不满足所述预设限制条件。
  10. 根据权利要求7所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    确定所述当前子块内的、预设像素位置与所述当前子块之间的第三运动矢量偏差;
    若所述第三运动矢量偏差大于或者等于预设偏差阈值,则判定满足所述预设限制条件;
    若所述第三运动矢量偏差小于所述预设偏差阈值,则判定不满足所述预设限制条件。
  11. 根据权利要求1所述的方法,其中,所述根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置,包括:
    对所述当前子块进行扩展处理,获得扩展后子块;
    在所述扩展后子块内确定所述目标像素位置对应的所述更新后像素位置。
  12. 根据权利要求11所述的方法,其中,所述对所述当前子块进行扩展处理,获得扩展后子块,包括:
    利用所述当前子块的全部边界位置进行扩展处理,获得所述扩展后子块;或者,
    利用所述当前子块内的、所述目标像素位置对应的行和\或列的边界位置进行扩展处理,获得所述扩展后子块。
  13. 根据权利要求1所述的方法,其中,所述根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置,包括:
    利用所述当前子块内的、与所述目标像素位置相邻的像素位置,替换所述目标像素位置,获得所述更新后像素位置。
  14. 根据权利要求8至10任一项所述的方法,其中,
    根据所述当前块的尺寸参数和\或所述子块尺寸参数确定所述预设偏差阈值。
  15. 根据权利要求7所述的方法,其中,所述方法还包括:
    当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前子块的扩展子块;
    基于所述第一运动矢量确定所述扩展子块的第三预测值,并确定所述当前像素位置对应的所述目标像素位置;
    若所述目标像素位置不属于所述当前子块,基于所述第三预测值和所述目标像素位置,确定所述第二预测值,将所述第二预测值确定为所述帧间预测值。
  16. 根据权利要求15所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    若所述子块尺寸参数为8x8,则判定满足所述预设限制条件;
    若所述子块尺寸参数为4x4,则判定不满足所述预设限制条件。
  17. 根据权利要求15所述的方法,其中,
    对所述扩展子块使用n抽头的插值滤波器,获得所述第三预测值;其中,所述n为以下值中的任一者:6,5,4,3,2。
  18. 根据权利要求15所述的方法,其中,
    对所述当前子块对应的参考像素进行扩展处理,获得扩展后参考像素;
    基于所述扩展后参考像素,使用所述当前子块对应的插值滤波器,获得所述第三预测值。
  19. 根据权利要求15所述的方法,其中,
    确定所述扩展子块对应的相邻像素位置;
    根据所述相邻像素位置的整像素值确定所述第三预测值。
  20. 根据权利要求19所述的方法,其中,所述方法还包括:
    若所述扩展子块对应的水平方向的分像素位置的运动矢量小于或者等于1/2像素,则将所述分像素位置左边的所述相邻像素位置的整像素值确定为所述第三预测值;或者,
    若所述扩展子块对应的水平方向的分像素位置的运动矢量大于1/2像素,则将所述分像素位置右边的所述相邻像素位置的整像素值确定为所述第三预测值;或者,
    若所述扩展子块对应的垂直方向的分像素位置的运动矢量小于或者等于1/2像素,则将所述分像素位置上边的所述相邻像素位置的整像素值确定为所述第三预测值;或者,
    若所述扩展子块对应的垂直方向的分像素位置的运动矢量大于1/2像素,则将所述分像素位置下边的所述相邻像素位置的整像素值确定为所述第三预测值。
  21. 根据权利要求14所述的方法,其中,所述基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置之后,所述方法还包括:
    若所述目标像素位置不属于所述当前子块,则在所述当前块中确定邻近像素位置;其中,所述邻近像素位置在所述当前块中与所述目标像素位置相邻;
    基于所述第一预测值和所述邻近像素位置,确定所述第二预测值,将所述第二预测值确定为所述帧间预测值。
  22. 根据权利要求1所述的方法,其中,所述基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,包括:
    解析所述码流,获取PROF参数;
    当所述PROF参数指示进行PROF处理时,基于所述第一预测值确定所述当前像素位置与所述更新后像素位置之间的像素水平梯度和像素垂直梯度;
    确定所述更新后像素位置与所述当前子块的第四运动矢量偏差;
    根据所述像素水平梯度、所述像素垂直梯度以及所述第四运动矢量偏差,计算所述当前像素位置对应的偏差值;
    基于所述第一预测值和所述偏差值,获得所述第二预测值。
  23. 根据权利要求1所述的方法,其中,所述基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,包括:
    解析所述码流,获取二次预测参数;
    当所述二次预测参数指示使用二次预测时,确定所述更新后像素位置与所述当前子块的第四运动矢量偏差;
    根据所述第四运动矢量偏差确定二维滤波器的滤波系数;其中,所述二维滤波器用于按照预设形状进行二次预测处理;
    基于所述滤波系数和所述第一预测值,确定所述第二预测值,将所述第二预测值确定为所述帧间预测值。
  24. 根据权利要求23所述的方法,其中,所述二维滤波器用于利用多个相邻的、构成所述预设形状的像素位置进行二次预测。
  25. 根据权利要求24所述的方法,其中,所述预设形状为矩形、菱形或任意一种对称形状。
  26. 根据权利要求25所述的方法,其中,
    若所述仿射模式参数的取值为1,则指示使用所述仿射模式;
    若所述仿射模式参数的取值为0,或者,未解析得到所述仿射模式参数,则指示不使用所述仿射模式。
  27. 根据权利要求2所述的方法,其中,所述确定子块尺寸参数,包括:
    解析所述码流,获得子块尺寸标志;
    若所述子块尺寸标志的取值为1,则确定所述子块尺寸参数为8×8;
    若所述子块尺寸标志的取值为0,或者,未解析得到所述子块尺寸标志,则确定所述子块尺寸参数为4×4。
  28. 根据权利要求2所述的方法,其中,所述控制点模式包括4参数模式和6参数模式。
  29. 一种帧间预测方法,应用于编码器,所述方法包括:
    确定当前块的预测模式参数;
    当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前块的当前子块的第一运动矢量;其中,所述当前块包括多个子块;
    基于所述第一运动矢量确定所述当前子块的第一预测值,并确定当前像素位置对应的目标像素位置;其中,所述当前像素位置为所述当前子块内的一个像素点的位置,所述目标像素位置为对所述当前像素位置的像素点进行二次预测或PROF处理的像素点的位置;
    若所述目标像素位置不属于所述当前子块,则根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置;
    基于所述第一预测值和所述更新后像素位置,确定所述当前子块对应的第二预测值,将所述第二预测值确定为所述当前子块的帧间预测值。
  30. 根据权利要求29所述的方法,其中,所述确定所述当前块的当前子块的第一运动矢量,包括:
    确定所述当前块的仿射模式参数和预测参考模式;
    当所述仿射模式参数指示使用仿射模式时,确定控制点模式和子块尺寸参数;
    根据所述预测参考模式、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量。
  31. 根据权利要求30所述的方法,其中,所述根据所述预测参考模式、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量,包括:
    根据所述预测参考模式确定控制点运动矢量组;
    根据所述控制点运动矢量组、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量。
  32. 根据权利要求31所述的方法,其中,所述根据所述控制点运动矢量组、所述控制点模式以及所述子块尺寸参数,确定所述第一运动矢量,包括:
    根据所述控制点运动矢量组、所述控制点模式以及所述当前块的尺寸参数,确定差值变量;
    基于所述预测模式参数和所述子块尺寸参数,确定子块位置;
    利用所述差值变量和所述子块位置,确定所述当前子块的所述第一运动矢量。
  33. 根据权利要求30至32任一项所述的方法,其中,所述方法还包括:
    遍历所述当前块的每一个子块,根据所述每一个子块的第一运动矢量构建运动矢量集合。
  34. 根据权利要求33所述的方法,其中,所述基于所述第一运动矢量确定所述当前子块的第一预测值,包括:
    确定样本矩阵;其中,所述样本矩阵包括亮度样本矩阵和色度样本矩阵;
    根据所述预测参考模式、所述子块尺寸参数、所述样本矩阵以及所述运动矢量集合,确定所述第一预测值。
  35. 根据权利要求34所述的方法,其中,所述方法还包括:
    判断所述当前子块是否满足预设限制条件;其中,所述预设限制条件用于确定是否将所述目标像素位置限制在所述当前子块内;
    若满足所述预设限制条件,则判断所述目标像素位置是否属于所述当前子块。
  36. 根据权利要求35所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    基于所述控制点运动矢量组确定第一运动矢量偏差;
    若所述第一运动矢量偏差大于或者等于预设偏差阈值,则判定满足所述预设限制条件;
    若所述第一运动矢量偏差小于所述预设偏差阈值,则判定不满足所述预设限制条件。
  37. 根据权利要求35所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    基于所述每一个子块的第一运动矢量确定第二运动矢量偏差;
    若所述第二运动矢量偏差大于或者等于预设偏差阈值,则判定满足所述预设限制条件;
    若所述第二运动矢量偏差小于所述预设偏差阈值,则判定不满足所述预设限制条件。
  38. 根据权利要求35所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    确定所述当前子块内的、预设像素位置与所述当前子块之间的第三运动矢量偏差;
    若所述第三运动矢量偏差大于或者等于预设偏差阈值,则判定满足所述预设限制条件;
    若所述第三运动矢量偏差小于所述预设偏差阈值,则判定不满足所述预设限制条件。
  39. 根据权利要求29所述的方法,其中,所述根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置,包括:
    对所述当前子块进行扩展处理,获得扩展后子块;
    在所述扩展后子块内确定所述目标像素位置对应的所述更新后像素位置。
  40. 根据权利要求39所述的方法,其中,所述对所述当前子块进行扩展处理,获得扩展后子块,包括:
    利用所述当前子块的全部边界位置进行扩展处理,获得所述扩展后子块;或者,
    利用所述当前子块内的、所述目标像素位置对应的行和\或列的边界位置进行扩展处理,获得所述扩展后子块。
  41. 根据权利要求29所述的方法,其中,所述根据所述当前子块对所述目标像素位置进行更新处理,获得更新后像素位置,包括:
    利用所述当前子块内的、与所述目标像素位置相邻的像素位置,替换所述目标像素位置,获得所述更新后像素位置。
  42. 根据权利要求36至38任一项所述的方法,其中,
    根据所述当前块的尺寸参数和\或所述子块尺寸参数确定所述预设偏差阈值。
  43. 根据权利要求35所述的方法,其中,所述方法还包括:
    当所述预测模式参数指示使用帧间预测模式确定所述当前块的帧间预测值时,确定所述当前子块的扩展子块;
    基于所述第一运动矢量确定所述扩展子块的第三预测值,并确定所述当前像素位置对应的所述目标像素位置;
    若所述目标像素位置不属于所述当前子块,基于所述第三预测值和所述目标像素位置,确定所述第二预测值,将所述第二预测值确定为所述帧间预测值。
  44. 根据权利要求43所述的方法,其中,所述判断所述当前子块是否满足预设限制条件,包括:
    若所述子块尺寸参数为8x8,则判定满足所述预设限制条件;
    若所述子块尺寸参数为4x4,则判定不满足所述预设限制条件。
  45. 根据权利要求43所述的方法,其中,
    对所述扩展子块使用n抽头的插值滤波器,获得所述第三预测值;其中,n为以下值中的任一者:6,5,4,3,2。
  46. 根据权利要求43所述的方法,其中,
    对所述当前子块对应的参考像素进行扩展处理,获得扩展后参考像素;
    基于所述扩展后参考像素,使用所述当前子块对应的插值滤波器,获得所述第三预测值。
  47. 根据权利要求43所述的方法,其中,
    确定所述扩展子块对应的相邻像素位置;
    根据所述相邻像素位置的整像素值确定所述第三预测值。
  48. The method according to claim 47, wherein the method further comprises:
    if a motion vector of a fractional pixel position in the horizontal direction corresponding to the extended sub-block is less than or equal to 1/2 pixel, determining the integer pixel value of the adjacent pixel position to the left of the fractional pixel position as the third prediction value; or,
    if the motion vector of the fractional pixel position in the horizontal direction corresponding to the extended sub-block is greater than 1/2 pixel, determining the integer pixel value of the adjacent pixel position to the right of the fractional pixel position as the third prediction value; or,
    if a motion vector of a fractional pixel position in the vertical direction corresponding to the extended sub-block is less than or equal to 1/2 pixel, determining the integer pixel value of the adjacent pixel position above the fractional pixel position as the third prediction value; or,
    if the motion vector of the fractional pixel position in the vertical direction corresponding to the extended sub-block is greater than 1/2 pixel, determining the integer pixel value of the adjacent pixel position below the fractional pixel position as the third prediction value.
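Claim 48 selects an adjacent integer pixel by rounding the fractional part of the motion vector: at or below half-pel it takes the left/top neighbour, above half-pel the right/bottom one. A one-dimensional sketch, assuming the fractional position is expressed as a float in [0, 1) (a bitstream would normally carry it in fixed point):

```python
def nearest_integer_offset(frac):
    """Return 0 to take the left/top integer neighbour, or 1 for the
    right/bottom one, following claim 48's half-pel rule.

    frac: fractional part of the motion vector in one dimension,
    as a float in [0, 1) (an illustrative representation).
    """
    return 0 if frac <= 0.5 else 1

def round_position(int_pos, frac):
    """Map a fractional pixel position to the adjacent integer one."""
    return int_pos + nearest_integer_offset(frac)
```

Exactly half-pel positions fall on the "less than or equal" branch and therefore round toward the left/top neighbour, matching the claim's wording.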
  49. The method according to claim 42, wherein after determining the first prediction value of the current sub-block based on the first motion vector and determining the target pixel position corresponding to the current pixel position, the method further comprises:
    if the target pixel position does not belong to the current sub-block, determining a neighboring pixel position in the current block, wherein the neighboring pixel position is adjacent to the target pixel position in the current block;
    determining the second prediction value based on the first prediction value and the neighboring pixel position, and determining the second prediction value as the inter prediction value.
  50. The method according to claim 29, wherein determining the second prediction value corresponding to the current sub-block based on the first prediction value and the updated pixel position comprises:
    determining a PROF parameter;
    when the PROF parameter indicates that PROF processing is to be performed, determining, based on the first prediction value, a horizontal pixel gradient and a vertical pixel gradient between the current pixel position and the updated pixel position;
    determining a fourth motion vector deviation between the updated pixel position and the current sub-block;
    calculating a deviation value corresponding to the current pixel position according to the horizontal pixel gradient, the vertical pixel gradient, and the fourth motion vector deviation;
    obtaining the second prediction value based on the first prediction value and the deviation value.
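The PROF step in claim 50 refines each sample by an optical-flow term: the horizontal and vertical gradients of the first prediction, multiplied by the motion vector deviation at the (updated) pixel position. A per-sample sketch; the central-difference gradients used here follow the common PROF formulation but are an assumption, since the claim does not fix the gradient filter:

```python
def prof_refine(pred, x, y, dmv_x, dmv_y):
    """Refine one sample of the first prediction `pred` (a 2-D list)
    at interior position (x, y) with a PROF-style offset (claim 50).

    dmv_x, dmv_y: the fourth motion vector deviation, i.e. the
    difference between the per-pixel motion vector and the sub-block
    motion vector, in pixel units (illustrative representation).
    """
    grad_h = (pred[y][x + 1] - pred[y][x - 1]) / 2.0  # horizontal pixel gradient
    grad_v = (pred[y + 1][x] - pred[y - 1][x]) / 2.0  # vertical pixel gradient
    offset = grad_h * dmv_x + grad_v * dmv_y          # deviation value
    return pred[y][x] + offset                        # second prediction value
```

With a zero deviation the second prediction value reduces to the first prediction value, which is the expected degenerate case.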
  51. The method according to claim 29, wherein determining the second prediction value corresponding to the current sub-block based on the first prediction value and the updated pixel position comprises:
    determining a secondary prediction parameter;
    when the secondary prediction parameter indicates that secondary prediction is used, determining a fourth motion vector deviation between the updated pixel position and the current sub-block;
    determining filter coefficients of a two-dimensional filter according to the fourth motion vector deviation, wherein the two-dimensional filter is used to perform secondary prediction processing according to a preset shape;
    determining the second prediction value based on the filter coefficients and the first prediction value, and determining the second prediction value as the inter prediction value.
  52. The method according to claim 51, wherein the two-dimensional filter is used to perform secondary prediction using a plurality of adjacent pixel positions that constitute the preset shape.
  53. The method according to claim 52, wherein the preset shape is a rectangle, a rhombus, or any symmetric shape.
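Claims 51 to 53 leave the two-dimensional filter abstract apart from its preset shape. As a hedged sketch, the filtering step for a 5-point cross (one admissible symmetric shape under claim 53) can look like the following; deriving the coefficients from the fourth motion vector deviation, as claim 51 requires, is deliberately left outside this sketch:

```python
def secondary_predict(pred, x, y, coeffs):
    """Apply a 5-point cross-shaped two-dimensional filter at the
    interior position (x, y) of the first prediction `pred`
    (a 2-D list), producing one secondary-prediction sample.

    coeffs: (c_centre, c_left, c_right, c_top, c_bottom). How these
    are derived from the motion vector deviation is not shown here;
    this sketch only illustrates filtering over a preset shape.
    """
    c0, cl, cr, ct, cb = coeffs
    return (c0 * pred[y][x]
            + cl * pred[y][x - 1] + cr * pred[y][x + 1]
            + ct * pred[y - 1][x] + cb * pred[y + 1][x])
```

With a unit centre coefficient and zero neighbours the filter is the identity; coefficients that sum to 1 keep the output in the range of the inputs.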
  54. The method according to claim 53, wherein
    if a value of the affine mode parameter is 1, it indicates that the affine mode is used;
    if the value of the affine mode parameter is 0, or the affine mode parameter is not obtained by parsing, it indicates that the affine mode is not used.
  55. The method according to claim 31, wherein
    if the sub-block size parameter is 8×8, the sub-block size flag is set to 1 and the sub-block size flag is written into a bitstream;
    if the sub-block size parameter is 4×4, the sub-block size flag is set to 0 and the sub-block size flag is written into the bitstream.
  56. The method according to claim 31, wherein the control point mode comprises a 4-parameter mode and a 6-parameter mode.
  57. The method according to claim 30, wherein
    the prediction mode parameter, the affine mode parameter, and the prediction reference mode are written into a bitstream.
  58. The method according to claim 50, wherein
    the PROF parameter is written into a bitstream.
  59. The method according to claim 51, wherein
    the secondary prediction parameter is written into a bitstream.
  60. A decoder, comprising a parsing part, a first determining part, and a first updating part, wherein
    the parsing part is configured to parse a bitstream to obtain a prediction mode parameter of a current block;
    the first determining part is configured to: when the prediction mode parameter indicates that an inter prediction mode is used to determine an inter prediction value of the current block, determine a first motion vector of a current sub-block of the current block, wherein the current block comprises a plurality of sub-blocks; and determine a first prediction value of the current sub-block based on the first motion vector, and determine a target pixel position corresponding to a current pixel position, wherein the current pixel position is a position of a pixel within the current sub-block, and the target pixel position is a position of a pixel used for performing secondary prediction or PROF processing on the pixel at the current pixel position;
    the first updating part is configured to: if the target pixel position does not belong to the current sub-block, perform update processing on the target pixel position according to the current sub-block to obtain an updated pixel position;
    the first determining part is further configured to determine, based on the first prediction value and the updated pixel position, a second prediction value corresponding to the current sub-block, and determine the second prediction value as the inter prediction value of the current sub-block.
  61. A decoder, comprising a first processor and a first memory storing instructions executable by the first processor, wherein when the instructions are executed, the first processor implements the method according to any one of claims 1 to 28.
  62. An encoder, comprising a second determining part and a second updating part, wherein
    the second determining part is configured to: determine a prediction mode parameter of a current block; when the prediction mode parameter indicates that an inter prediction mode is used to determine an inter prediction value of the current block, determine a first motion vector of a current sub-block of the current block, wherein the current block comprises a plurality of sub-blocks; and determine a first prediction value of the current sub-block based on the first motion vector, and determine a target pixel position corresponding to a current pixel position, wherein the current pixel position is a position of a pixel within the current sub-block, and the target pixel position is a position of a pixel used for performing secondary prediction or PROF processing on the pixel at the current pixel position;
    the second updating part is configured to: if the target pixel position does not belong to the current sub-block, perform update processing on the target pixel position according to the current sub-block to obtain an updated pixel position;
    the second determining part is further configured to determine, based on the first prediction value and the updated pixel position, a second prediction value corresponding to the current sub-block, and determine the second prediction value as the inter prediction value of the current sub-block.
  63. An encoder, comprising a second processor and a second memory storing instructions executable by the second processor, wherein when the instructions are executed, the second processor implements the method according to any one of claims 29 to 59.
  64. A computer storage medium storing a computer program, wherein the computer program, when executed by a first processor, implements the method according to any one of claims 1 to 28, or, when executed by a second processor, implements the method according to any one of claims 29 to 59.
PCT/CN2021/106589 2020-08-20 2021-07-15 Inter-frame prediction method, encoder, decoder, and computer storage medium WO2022037344A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202180005743.6A CN114503582A (zh) 2020-08-20 2021-07-15 Inter-frame prediction method, encoder, decoder, and computer storage medium
MX2023000107A MX2023000107A (es) 2020-08-20 2021-07-15 Inter-frame prediction method, encoder, decoder, and computer storage medium
CN202210827144.9A CN114979668A (zh) 2020-08-20 2021-07-15 Inter-frame prediction method, encoder, decoder, and computer storage medium
ZA2023/00127A ZA202300127B (en) 2020-08-20 2023-01-03 Inter-frame prediction method, encoder, decoder, and computer storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010845318.5A CN114079784A (zh) 2020-08-20 2020-08-20 Inter-frame prediction method, encoder, decoder, and computer storage medium
CN202010845318.5 2020-08-20

Publications (1)

Publication Number Publication Date
WO2022037344A1 true WO2022037344A1 (zh) 2022-02-24

Family

ID=80282186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/106589 WO2022037344A1 (zh) 2020-08-20 2021-07-15 帧间预测方法、编码器、解码器以及计算机存储介质

Country Status (5)

Country Link
CN (3) CN114079784A (zh)
MX (1) MX2023000107A (zh)
TW (1) TW202209892A (zh)
WO (1) WO2022037344A1 (zh)
ZA (1) ZA202300127B (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110832858A (zh) * 2017-07-03 2020-02-21 Vid Scale, Inc. Motion-compensated prediction based on bi-directional optical flow
WO2020061082A1 (en) * 2018-09-21 2020-03-26 Vid Scale, Inc. Complexity reduction and bit-width control for bi-directional optical flow
CN111405277A (zh) * 2019-01-02 2020-07-10 Huawei Technologies Co., Ltd. Inter prediction method and apparatus, and corresponding encoder and decoder
WO2020163319A1 (en) * 2019-02-07 2020-08-13 Vid Scale, Inc. Systems, apparatus and methods for inter prediction refinement with optical flow


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Jiancong (Daniel) Luo, Yuwen He: "CE4-2.1: Prediction refinement with optical flow for affine mode", 15th JVET Meeting, 3-12 July 2019, Gothenburg (The Joint Video Exploration Team of ISO/IEC JTC1/SC29/WG11 and ITU-T SG.16), no. JVET-O0070, 18 June 2019, pages 1-4, XP030205608 *

Also Published As

Publication number Publication date
CN114503582A (zh) 2022-05-13
CN114079784A (zh) 2022-02-22
TW202209892A (zh) 2022-03-01
CN114979668A (zh) 2022-08-30
ZA202300127B (en) 2023-09-27
MX2023000107A (es) 2023-02-09

Similar Documents

Publication Publication Date Title
JP6467030B2 (ja) Method for motion compensated prediction
TW202315408A (zh) Block-based prediction techniques
JP5594841B2 (ja) Image encoding device and image decoding device
TWI504241B (zh) Video encoding method and device, video decoding method and device, and program products therefor
JP2023126972A (ja) Inter prediction concept using tile independence constraints
WO2020251470A1 (en) Simplified downsampling for matrix based intra prediction
JP7375224B2 (ja) Encoding and decoding methods, apparatus, and devices
EP2092752A2 (en) Adaptive interpolation method and system for motion compensated predictive video coding and decoding
AU2022235881A1 (en) Decoding method and apparatus, encoding method and apparatus, device, and storage medium
WO2022022278A1 (zh) Inter-frame prediction method, encoder, decoder, and computer storage medium
WO2022061680A1 (zh) Inter-frame prediction method, encoder, decoder, and computer storage medium
WO2022037344A1 (zh) Inter-frame prediction method, encoder, decoder, and computer storage medium
US11202082B2 (en) Image processing apparatus and method
CN116980596A (zh) Intra prediction method, encoder, decoder, and storage medium
WO2022077495A1 (zh) Inter-frame prediction method, encoder, decoder, and computer storage medium
TW202209893A (zh) Inter-frame prediction method, encoder, decoder, and computer storage medium
JP7061737B1 (ja) Image decoding device, image decoding method, and program
JP7083971B1 (ja) Image decoding device, image decoding method, and program
JP7034363B2 (ja) Image decoding device, image decoding method, and program
US20220264148A1 (en) Sample Value Clipping on MIP Reduced Prediction
WO2023051654A1 (en) Method, apparatus, and medium for video processing
CN115398893A (zh) Method and apparatus for video filtering

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21857439

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21857439

Country of ref document: EP

Kind code of ref document: A1