WO2020248105A1 - Method for determining predicted value, encoder, and computer storage medium - Google Patents

Method for determining predicted value, encoder, and computer storage medium

Info

Publication number
WO2020248105A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
encoded
candidate list
block
adjacent
Prior art date
Application number
PCT/CN2019/090594
Other languages
English (en)
French (fr)
Inventor
梁凡
曹健
曹思琪
李正仁
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to PCT/CN2019/090594 priority Critical patent/WO2020248105A1/zh
Priority to CN201980096199.3A priority patent/CN113796070A/zh
Publication of WO2020248105A1 publication Critical patent/WO2020248105A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Definitions

  • the embodiments of the present application relate to video encoding technology, and in particular, to a method for determining a prediction value, an encoder, and a computer storage medium.
  • the main coding mode of the screen content coding is the traditional intra prediction mode (Intra mode).
  • SCC Screen Content Coding
  • VVC Versatile Video Coding
  • IBC mode Intra Block Copy mode
  • Intra mode performs predictive coding based only on spatial texture. Compared with naturally captured video, the textures of SCC video images are less correlated, so the difference between the predicted value determined by Intra mode and the true value is large.
  • Although IBC mode can take advantage of the many repeated regions in screen content, it still performs intra prediction based on a translational motion model. In translational IBC mode, all pixels in a coding unit share the same motion vector (MV, Motion Vector). However, SCC scenes still contain complex motion such as scaling, rotation, deformation, and perspective transformation, so prediction based on the translational motion model also leaves large errors between the predicted value and the true value.
  • MV Motion Vector
  • the embodiments of the present application provide a method for determining a predicted value, an encoder, and a computer storage medium, which can improve coding efficiency.
  • an embodiment of the present application provides a method for determining a predicted value, the method is applied to an encoder, and the method includes:
  • the preset affine motion model is called, the reference image block of the image block to be encoded is calculated, and the pixel value of the reference image block is used as the predicted value of the image block to be encoded.
  • in a second aspect, an embodiment of the present application provides an encoder, and the encoder includes:
  • the acquisition module is used to acquire the image block to be encoded and the encoded image block from the current image frame;
  • a construction module configured to construct a block vector (BV) candidate list for the image block to be encoded according to the encoded image block;
  • an iterative module configured to perform an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain a BV candidate list after the iterative operation;
  • a selection module configured to select the BV of the image block to be encoded from the BV candidate list after the iterative operation;
  • a determining module configured to call a preset affine motion model according to the BV of the image block to be encoded, calculate the reference image block of the image block to be encoded, and use the pixel value of the reference image block as the predicted value of the image block to be encoded.
  • in a third aspect, an embodiment of the present application provides an encoder, and the encoder includes:
  • an embodiment of the present application provides a computer-readable storage medium in which executable instructions are stored; when the executable instructions are executed by one or more processors, the processors execute the method for determining the predicted value of the first aspect.
  • the embodiment of the application provides a method for determining a predicted value, an encoder, and a computer storage medium.
  • the encoder obtains the image block to be encoded and the encoded image block from the current image frame, constructs a BV candidate list for the image block to be encoded according to the encoded image block, and performs an iterative operation on each BV in the BV candidate list according to the preset iterative algorithm to obtain the BV candidate list after the iterative operation.
  • the preset affine motion model is called to calculate the reference image block of the image block to be encoded, and the pixel value of the reference image block is used as the predicted value of the image block to be encoded;
  • that is, a BV candidate list is constructed for the image block to be encoded and each BV in the list is iterated, so that the BV of the image block to be encoded can be selected from the BV candidate list after the iterative operation.
  • the affine motion model is then called, so that the predicted value is obtained by affine motion compensation during intra prediction coding.
  • as a result, the predicted value is closer to the true value, which effectively improves coding efficiency.
  • FIG. 1A is a schematic diagram of the arrangement of coding units whose prediction division mode is "NO_SPLIT";
  • FIG. 1B is a schematic diagram of the arrangement of coding units whose prediction division mode is "HOR_tN";
  • FIG. 1C is a schematic diagram of the arrangement of coding units whose prediction division mode is "HOR_UP";
  • FIG. 1D is a schematic diagram of the arrangement of coding units whose prediction division mode is "HOR_DOWN";
  • Fig. 1E is a schematic diagram of the arrangement of coding units whose prediction division mode is "VER_tN";
  • Fig. 1F is a schematic diagram of the arrangement of coding units whose prediction division mode is "VER_LEFT";
  • Fig. 1G is a schematic diagram of the arrangement of coding units whose prediction division mode is "VER_RIGHT";
  • Figure 2 is a schematic diagram of the intra prediction mode in AVS3;
  • FIG. 3A is a schematic diagram of the structure of a coding unit after QT division;
  • FIG. 3B is a schematic structural diagram of a coding unit after vertical BT division;
  • FIG. 3C is a schematic structural diagram of a coding unit after horizontal BT division;
  • FIG. 3D is a schematic structural diagram of a coding unit after vertical TT division;
  • FIG. 3E is a schematic diagram of the structure of a coding unit after horizontal TT division;
  • Figure 4 is a schematic diagram of the arrangement of 67 prediction directions
  • Figure 5 is a schematic diagram of the arrangement of BV mapping relationships in IBC Mode
  • FIG. 6A is a schematic diagram of the arrangement of image blocks to be coded using four-parameter Affine
  • FIG. 6B is a schematic diagram of the structure of an image block to be encoded using a six-parameter Affine
  • FIG. 7 is a schematic diagram of the arrangement of MVs of sub-blocks obtained by the affine transformation prediction method based on sub-blocks;
  • Fig. 8 is a schematic diagram of the test result of the first frame image of the SCC test sequence on the meter
  • FIG. 9A is a schematic diagram of mapping between coding units and reference units as scaling models
  • FIG. 9B is a schematic diagram of mapping between coding units and reference units as translation models
  • FIG. 10 is a schematic flowchart of an optional method for determining a predicted value according to an embodiment of this application.
  • FIG. 11A is a schematic diagram of using a four-parameter Affine to perform a blind search in a preset area
  • FIG. 11B is a schematic diagram of blind search in a preset area using six-parameter Affine
  • FIG. 12 is a schematic diagram of the arrangement of an optional image block to be encoded according to an embodiment of this application.
  • FIG. 13A is a schematic diagram of the arrangement of the BV of the translation relationship of the four-parameter Affine
  • FIG. 13B is a schematic diagram of the arrangement of the BV of the four-parameter Affine non-translation relationship
  • FIG. 13C is a schematic diagram of the arrangement of the BV of the translation relationship of the six-parameter Affine
  • FIG. 13D is a schematic diagram of the arrangement of the BV of the six-parameter Affine non-translation relationship
  • Figure 13E is a schematic diagram of the arrangement of the translation relationship in the SCC
  • Figure 13F is a schematic diagram of the arrangement of non-translational relationships in SCC
  • FIG. 14 is a schematic diagram of another optional arrangement of image blocks to be encoded according to an embodiment of this application.
  • FIG. 15 is a schematic diagram of the arrangement of the first optional BV candidate list construction provided by an embodiment of the application.
  • FIG. 16A is a schematic diagram of the arrangement of a second optional BV candidate list construction provided by an embodiment of this application.
  • FIG. 16B is a schematic diagram of the arrangement of a third optional BV candidate list construction provided by an embodiment of the application.
  • FIG. 17A is a schematic diagram of the arrangement of a third optional BV candidate list construction provided by an embodiment of this application.
  • FIG. 17B is a schematic diagram of the arrangement of a fourth optional BV candidate list construction provided by an embodiment of this application.
  • FIG. 18A is a schematic diagram of the arrangement of a fifth optional BV candidate list construction provided by an embodiment of this application.
  • FIG. 18B is a schematic diagram of the arrangement of the sixth optional BV candidate list construction provided by an embodiment of the application.
  • FIG. 19A is a schematic diagram of the arrangement of a seventh optional BV candidate list construction provided by an embodiment of this application.
  • FIG. 19B is a schematic diagram of the arrangement of the eighth optional construction of the BV candidate list provided by an embodiment of the application.
  • FIG. 19C is a schematic diagram of the arrangement of a ninth optional BV candidate list construction provided by an embodiment of this application.
  • FIG. 19D is a schematic diagram of the arrangement of the tenth optional BV candidate list construction provided by an embodiment of the application.
  • FIG. 20 is a schematic diagram of an optional process for constructing a BV candidate list according to an embodiment of the application.
  • FIG. 21 is a schematic flowchart of an optional iterative operation method provided by an embodiment of this application.
  • FIG. 22A is a schematic flowchart of an optional determining coding mode provided by an embodiment of the application.
  • FIG. 22B is a schematic diagram of another optional process for determining an encoding mode according to an embodiment of the application.
  • FIG. 23 is a schematic flowchart of a method for inferring an optional coding mode provided by an embodiment of the application.
  • FIG. 24 is a schematic flowchart of another method for inferring an optional encoding mode provided by an embodiment of this application.
  • FIG. 25 is a schematic structural diagram of an optional encoder provided by an embodiment of the application.
  • FIG. 26 is a schematic structural diagram of another optional encoder provided by an embodiment of the application.
  • SCC mainly adopts Intra Mode, which performs predictive coding based on spatial texture.
  • IntraCuFlag the luminance component value of an image
  • The coding unit type and related information when IntraCuFlag is 1 are shown in Table 1:
  • When the IntraCuFlag of the current coding unit is 1, Table 1 is looked up by coding unit type (CuType) to determine the prediction division mode (SplitMode) and the number of intra luma prediction blocks (NumOfIntraPredBlock). SplitMode indicates how the coding unit is divided into intra prediction blocks.
  • CuType the coding unit type
  • SplitMode the prediction division mode
  • NumOfIntraPredBlock the number of intra luma prediction blocks
  • Figure 1A is a schematic diagram of the arrangement of coding units whose prediction division mode is "NO_SPLIT". As shown in Figure 1A, the 2M×2N coding unit is not divided and forms a single block;
  • Figure 1B is a schematic diagram of the arrangement of coding units whose prediction division mode is "HOR_tN". The 2M×2N coding unit is divided into 4 blocks numbered 0, 1, 2, 3, each of size 2M×0.5N;
  • Figure 1C is a schematic diagram of the arrangement of coding units whose prediction division mode is "HOR_UP". The 2M×2N coding unit is divided into 2 blocks numbered 0 and 1, of sizes 2M×0.5N and 2M×1.5N respectively;
  • Figure 1D is a schematic diagram of the arrangement of coding units whose prediction division mode is "HOR_DOWN". The 2M×2N coding unit is divided into 2 blocks numbered 0 and 1, of sizes 2M×1.5N and 2M×0.5N respectively;
  • Figure 1E is a schematic diagram of the arrangement of coding units whose prediction division mode is "VER_tN". The 2M×2N coding unit is divided into 4 blocks numbered 0, 1, 2, 3, each of size 0.5M×2N;
  • Figure 1F is a schematic diagram of the arrangement of coding units whose prediction division mode is "VER_LEFT". The 2M×2N coding unit is divided into 2 blocks numbered 0 and 1, of sizes 0.5M×2N and 1.5M×2N respectively;
  • Figure 1G is a schematic diagram of the arrangement of coding units whose prediction division mode is "VER_RIGHT". The 2M×2N coding unit is divided into 2 blocks numbered 0 and 1, of sizes 1.5M×2N and 0.5M×2N respectively.
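  • The seven prediction division modes of Figures 1A to 1G can be summarized as fractions of the coding unit's width and height. The following is a minimal sketch under that reading; the table name `SPLIT_FRACTIONS` and the function `prediction_block_sizes` are illustrative names, not identifiers from AVS3.

```python
from fractions import Fraction

# Fraction of the coding unit's (width, height) taken by each prediction
# block, in block-index order, transcribed from the text for Figures 1A-1G
# (a 2M x 2N unit: 0.5N is one quarter of the height, 1.5N three quarters).
SPLIT_FRACTIONS = {
    "NO_SPLIT": [(1, 1)],
    "HOR_tN": [(1, Fraction(1, 4))] * 4,
    "HOR_UP": [(1, Fraction(1, 4)), (1, Fraction(3, 4))],
    "HOR_DOWN": [(1, Fraction(3, 4)), (1, Fraction(1, 4))],
    "VER_tN": [(Fraction(1, 4), 1)] * 4,
    "VER_LEFT": [(Fraction(1, 4), 1), (Fraction(3, 4), 1)],
    "VER_RIGHT": [(Fraction(3, 4), 1), (Fraction(1, 4), 1)],
}


def prediction_block_sizes(split_mode, cu_width, cu_height):
    """Return the (width, height) in pixels of each intra prediction block."""
    sizes = []
    for frac_w, frac_h in SPLIT_FRACTIONS[split_mode]:
        w, h = cu_width * frac_w, cu_height * frac_h
        if w != int(w) or h != int(h):
            raise ValueError("coding unit too small for this split mode")
        sizes.append((int(w), int(h)))
    return sizes
```

  • For example, `prediction_block_sizes("HOR_UP", 16, 16)` yields one 16×4 block followed by one 16×12 block, matching the 2M×0.5N and 2M×1.5N sizes of Figure 1C.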
  • the intra prediction mode can be obtained according to the value of IntraLumaPredMode, as shown in Table 2 below:
  • Table 2: correspondence between IntraLumaPredMode and intra prediction mode

      IntraLumaPredMode   Intra prediction mode
      0                   Intra_Luma_DC
      1                   Intra_Luma_Plane
      2                   Intra_Luma_Bilinear
      3 to 11             Intra_Luma_Angular
      12                  Intra_Luma_Vertical
      13 to 23            Intra_Luma_Angular
      24                  Intra_Luma_Horizontal
      25 to 32            Intra_Luma_Angular
      33                  Intra_Luma_PCM
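  • The mapping of Table 2 can be expressed as a small lookup. This is only an illustrative sketch; the function name `intra_luma_mode_name` is an assumption and is not part of AVS3.

```python
def intra_luma_mode_name(intra_luma_pred_mode):
    """Map an IntraLumaPredMode value (0..33) to the mode name of Table 2."""
    fixed = {
        0: "Intra_Luma_DC",
        1: "Intra_Luma_Plane",
        2: "Intra_Luma_Bilinear",
        12: "Intra_Luma_Vertical",
        24: "Intra_Luma_Horizontal",
        33: "Intra_Luma_PCM",
    }
    if intra_luma_pred_mode in fixed:
        return fixed[intra_luma_pred_mode]
    # Every remaining value between 3 and 32 is an angular mode.
    if 3 <= intra_luma_pred_mode <= 32:
        return "Intra_Luma_Angular"
    raise ValueError("IntraLumaPredMode must be in the range 0..33")
```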
  • Figure 2 is a schematic diagram of the intra-frame prediction mode in AVS3. According to the different prediction modes shown in Figure 2, the prediction error between the predicted value and the true value of the image block to be encoded can be calculated for each prediction mode. Among them, the prediction mode with the smallest coding overhead (error and redundant information) is the prediction mode that best matches the image block to be coded.
  • In VVC, there are two main encoding modes for SCC, namely the traditional Intra Mode and IBC Mode.
  • IBC Mode is turned on only when encoding SCC sequences and Class F sequences, while Intra Mode applies to all sequences.
  • VVC partitions the coding tree unit (CTU, Coding Tree Unit) using quad tree (QT, Quad Tree), ternary tree (TT, Ternary Tree), and binary tree (BT, Binary Tree) division.
  • Figure 3A is a schematic structural diagram of a coding unit after QT division. As shown in Figure 3A, VVC uses QT to divide the CTU into 4 sub-blocks;
  • Figure 3B is a schematic structural diagram of a coding unit after vertical BT division. As shown in Figure 3B, VVC uses vertical BT to divide the CTU into 2 sub-blocks;
  • Figure 3C is a schematic structural diagram of a coding unit after horizontal BT division. As shown in Figure 3C, VVC uses horizontal BT to divide the CTU into 2 sub-blocks;
  • Figure 3D is a schematic structural diagram of a coding unit after vertical TT division. As shown in Figure 3D, VVC uses vertical TT to divide the CTU into 3 sub-blocks;
  • Figure 3E is a schematic structural diagram of a coding unit after horizontal TT division. As shown in Figure 3E, VVC uses horizontal TT to divide the CTU into 3 sub-blocks. In this way, the CTU is divided into small non-overlapping blocks that serve as the basic units of coding, and these blocks are then coded.
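  • The five divisions of Figures 3A to 3E can be sketched as a function that returns non-overlapping sub-blocks. The mode names and the function `split_block` are illustrative; the 1:2:1 ratio used for the ternary split is the usual VVC convention and is not stated in this text, so it is an assumption here.

```python
def split_block(x, y, w, h, mode):
    """Split a block into sub-blocks given as (x, y, width, height) tuples."""
    if mode == "QT":  # four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_VER":  # vertical split line: left and right halves
        hw = w // 2
        return [(x, y, hw, h), (x + hw, y, hw, h)]
    if mode == "BT_HOR":  # horizontal split line: top and bottom halves
        hh = h // 2
        return [(x, y, w, hh), (x, y + hh, w, hh)]
    if mode == "TT_VER":  # three columns in a 1:2:1 ratio (assumed)
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "TT_HOR":  # three rows in a 1:2:1 ratio (assumed)
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError("unknown split mode: " + mode)
```

  • Applying `split_block` recursively to its own output gives the nested partitioning described above, where each leaf block becomes a basic coding unit.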
  • Figure 4 is a schematic diagram of the arrangement of 67 prediction directions. As shown in Figure 4, VVC expands the number of angular intra prediction modes to 65 to better adapt to textures in different directions; together with the planar and DC modes, there are 67 optional prediction directions.
  • For each prediction direction, the prediction error between the prediction block and the true value can be calculated.
  • The prediction direction with the smallest rate-distortion cost (Rdcost, Rate-distortion Cost), which is equivalent to the coding overhead, is the best matching prediction direction for the block.
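  • The selection by smallest Rdcost can be sketched as follows. The text does not give the cost formula; the Lagrangian form J = D + λ·R used here is the form commonly used in video encoders and is an assumption, as are the names `rd_cost` and `best_mode` and the example numbers.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * bits


def best_mode(candidates, lam):
    """Return the candidate with the smallest cost.

    candidates maps a mode or direction name to (distortion, bits).
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))


# Hypothetical measurements for three prediction directions.
example = {"DC": (100, 10), "Planar": (90, 30), "Angular_18": (60, 20)}
chosen = best_mode(example, lam=1.0)
```

  • The same comparison applies when whole coding modes compete: each mode is evaluated, and the one with the smallest cost is kept.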
  • IBC Mode can only be turned on when encoding SCC sequence and Class F sequence.
  • IBC Mode is similar to the traditional inter-frame mode in that it uses a motion vector (MV, Motion Vector) to map the image block to be encoded to the reference image block. The difference is that in IBC Mode the image block to be encoded and the reference image block are in the same frame. To distinguish it from the inter mode, a block vector (BV, Block Vector) is used to represent the reference relationship in IBC Mode.
  • BV Block Vector
  • FIG. 5 is a schematic diagram of the arrangement of BV mapping relationships in IBC Mode.
  • IBC Mode performs motion estimation based on a translational motion model, and all pixels in a prediction unit (PU, Prediction Unit) share the same BV. Limited by the prediction quality of the translational model, IBC Mode applies only to coding units whose width and height are both less than or equal to 64 pixels. For a large PU, sharing one BV among all pixels causes relatively large prediction errors, which increases coding time without improving the result, so IBC Mode is restricted to small blocks.
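  • Translational IBC prediction amounts to copying an equal-size block from the same frame at an offset given by the BV. The sketch below illustrates this with a frame stored as a list of rows; `ibc_predict` and `ibc_allowed` are illustrative names, and the check that the reference lies in the already coded area is omitted for brevity.

```python
def ibc_allowed(width, height, max_size=64):
    """IBC applies only when both dimensions are at most 64 pixels."""
    return width <= max_size and height <= max_size


def ibc_predict(frame, x, y, w, h, bv):
    """Predict the w x h block at (x, y) by copying the block at (x, y) + bv.

    frame is a list of pixel rows. Every pixel of the block shares the
    same BV, which is what makes the model purely translational.
    """
    bvx, bvy = bv
    rx, ry = x + bvx, y + bvy
    if rx < 0 or ry < 0 or ry + h > len(frame) or rx + w > len(frame[0]):
        raise ValueError("reference block falls outside the frame")
    return [row[rx:rx + w] for row in frame[ry:ry + h]]
```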
  • Affine is a high-order motion model that can better describe complex motion of the image block to be encoded, such as scaling, rotation, deformation, perspective, and rotation by irregular angles. Affine comes in four-parameter and six-parameter variants.
  • Figure 6A is a schematic diagram of the arrangement of image blocks to be encoded using a four-parameter Affine. As shown in Figure 6A, only the MVs (v0, v1) of the two control points in the upper left corner and the upper right corner need to be obtained.
  • Figure 6B is a schematic structural diagram of the image block to be encoded using a six-parameter Affine.
  • As shown in Figure 6B, the MVs (v0, v1, v2) of the three control points in the upper left corner, the upper right corner, and the lower left corner need to be obtained. The MV (vx, vy) of each pixel in the w×h image block to be encoded can then be inferred from formula (1) or formula (2). When the MVs of the pixels differ, non-translational motion is achieved; when the MV of every pixel is the same, the model is equivalent to a translation model.
  • Here, the coordinates of the control point v0 in the upper left corner are (v0x, v0y), and the coordinates of the control point v1 in the upper right corner are (v1x, v1y).
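  • Formulas (1) and (2) are not reproduced in this text. The sketch below uses the standard four-parameter and six-parameter affine forms known from VVC and AVS3, on the assumption that these are the formulas meant; the function names are illustrative, and v2 = (v2x, v2y) denotes the lower left control point.

```python
def affine_mv_4param(x, y, w, v0, v1):
    """Four-parameter affine MV at pixel (x, y) of a block of width w.

    Assumed form of formula (1):
      vx = (v1x - v0x)/w * x - (v1y - v0y)/w * y + v0x
      vy = (v1y - v0y)/w * x + (v1x - v0x)/w * y + v0y
    """
    a = (v1[0] - v0[0]) / w
    b = (v1[1] - v0[1]) / w
    return (a * x - b * y + v0[0], b * x + a * y + v0[1])


def affine_mv_6param(x, y, w, h, v0, v1, v2):
    """Six-parameter affine MV at pixel (x, y) of a w x h block.

    Assumed form of formula (2):
      vx = (v1x - v0x)/w * x + (v2x - v0x)/h * y + v0x
      vy = (v1y - v0y)/w * x + (v2y - v0y)/h * y + v0y
    """
    vx = (v1[0] - v0[0]) / w * x + (v2[0] - v0[0]) / h * y + v0[0]
    vy = (v1[1] - v0[1]) / w * x + (v2[1] - v0[1]) / h * y + v0[1]
    return (vx, vy)
```

  • When all control-point MVs are equal, both functions return that same MV for every pixel, which is the translational special case mentioned above.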
  • FIG. 7 is a schematic diagram of the arrangement of the MVs of the sub-blocks obtained by the sub-block-based affine transformation prediction method.
  • When AffineFlag is 1, each equal-size sub-block in the image block to be encoded shares one MV (calculated according to formula (1) or (2)) instead of an MV being calculated for each pixel separately. This not only achieves the non-translational effect but also effectively reduces coding complexity. The sub-block size is controlled by AffineSubblockSizeFlag: when AffineSubblockSizeFlag is 0, the sub-block size is 4×4; otherwise it is 8×8.
  • VVC also adopts the sub-block-based affine transform prediction method: each 4×4 sub-block in the image block to be encoded shares one MV instead of one MV being calculated per pixel, which likewise achieves the non-translational effect while effectively reducing coding complexity.
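  • A sub-block MV field can be sketched by evaluating the affine model once per sub-block. The sketch assumes the standard four-parameter form and samples the model at each sub-block's centre; the sampling position and the function name `subblock_mvs` are assumptions, not details given in this text.

```python
def subblock_mvs(w, h, v0, v1, affine_subblock_size_flag=0):
    """Return {(sub_x, sub_y): (vx, vy)} for a w x h block.

    The sub-block size is 4x4 when the flag is 0 and 8x8 otherwise.
    """
    size = 4 if affine_subblock_size_flag == 0 else 8
    a = (v1[0] - v0[0]) / w
    b = (v1[1] - v0[1]) / w
    mvs = {}
    for sy in range(0, h, size):
        for sx in range(0, w, size):
            # One MV per sub-block, sampled at its centre (assumption).
            cx, cy = sx + size / 2, sy + size / 2
            mvs[(sx, sy)] = (a * cx - b * cy + v0[0], b * cx + a * cy + v0[1])
    return mvs
```

  • A 16×16 block thus needs 16 MVs with 4×4 sub-blocks or 4 MVs with 8×8 sub-blocks, instead of 256 per-pixel MVs.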
  • The traditional Intra Mode predicts only from spatial texture.
  • The textures of SCC video images are weakly correlated, but repeated or similar regions are common.
  • Although IBC Mode can take advantage of the many repeated regions, it still performs intra prediction based on the translational motion model.
  • In translational IBC Mode, all pixels in a coding unit share the same MV, but SCC scenes still contain complex motion such as zooming, rotation, deformation, and perspective transformation, so prediction based on the translational motion model has large errors.
  • In addition, when non-translational complex motion such as rotation or zooming is present, the encoder tends, in order to maintain image quality, to divide an object into small units and approximate the complex motion by the translational motion of those small units. This division and prediction approach introduces a lot of redundant information, such as partition information, which hurts compression performance.
  • Figure 8 is a schematic diagram of the test results of the first frame image of the SCC test sequence on the meter. As shown in Figure 8, the translational motion model lets the current coding unit search only for equal-size references within the already coded area, and no unit with complex motion such as rotation or deformation can serve as its reference.
  • Fig. 9A is a schematic diagram of the mapping between the coding unit and the reference unit under a zoom model. For any two rectangles A and B of unequal size, the ideal reference process is shown in Fig. 9A: rectangle B can be coded directly with A as the reference unit through a non-translational model (including zoom motion). Figure 9B is a schematic diagram of the mapping between the coding unit and the reference unit under the translation model, and the actual reference process is shown in Figure 9B.
  • Under the translation model, the larger coding unit is divided into units B, C, and D, each as large as A, so that B, C, and D can each be coded with A as the mapping reference. This introduces a large amount of redundant partition information and motion parameters, which greatly affects coding performance.
  • the line pattern may rise, fall and oscillate, and the possibility of repeated areas within the same frame is less.
  • the translational motion model of IBC Mode is not effective.
  • an embodiment of the present application provides a method for determining a predicted value.
  • FIG. 10 is a schematic flowchart of an optional method for determining a predicted value provided by an embodiment of this application. As shown in FIG. 10, the method is applied to an encoder and may include:
  • Affine is introduced on the basis of intra prediction to provide a new coding mode, which is called Intra Affine Mode in AVS3.
  • In VVC, the new mode is called Intra Affine CPR Mode, that is, the intra-frame affine motion coding mode based on current picture referencing (CPR, Current Picture Referencing).
  • CPR Current Picture Referencing
  • Intra Affine Mode and Intra Affine CPR Mode are coding modes newly added alongside the original Intra Mode. They are independent of the original Intra Mode, do not interfere with it, and compete with it.
  • The Rdcost calculated at the end of the coding process determines the best encoding mode.
  • the encoder when encoding the current image frame, the encoder first obtains the image block to be encoded and the encoded image block, and then determines the predicted value of the image block to be encoded according to the pixel value of the encoded image block.
  • the type of the aforementioned pixel value may be the pixel value of the luminance component of the image block, or the pixel value of the chrominance component of the image block, which is not specifically limited in the embodiment of the present application.
  • S1002: Construct a BV candidate list for the image block to be encoded according to the encoded image block;
  • a BV candidate list needs to be constructed for the image block to be encoded.
  • the image block to be encoded is first screened to determine whether Intra Affine Mode or Intra Affine CPR Mode is applicable to it; in an optional embodiment, S1002 may include:
  • the intra-frame mode is used to encode the image block to be encoded to obtain the predicted value of the image block to be encoded.
  • that is, when the image block to be encoded passes the screening, Intra Affine Mode or Intra Affine CPR Mode is used for encoding, i.e. a BV candidate list is built for the image block to be encoded according to the encoded image block; otherwise, the traditional encoding mode is used.
  • Affine can predict large blocks better and reduce the prediction error, and the redundant information carried by the coding unit, such as partition information and motion parameters, is greatly reduced, which helps lower the bit rate;
  • coding units with a length and/or width less than 16 will skip Intra Affine Mode.
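  • The size screening can be sketched as a one-line gate. This reads "length and/or width less than 16" as "skip when either dimension is below 16"; that reading, the constant name, and the function name are assumptions.

```python
MIN_INTRA_AFFINE_SIZE = 16  # threshold taken from the text above


def intra_affine_allowed(width, height, min_size=MIN_INTRA_AFFINE_SIZE):
    """Return True when the coding unit is large enough for Intra Affine Mode.

    A unit is skipped when its width or its height is below the threshold.
    """
    return width >= min_size and height >= min_size
```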
  • each equal-size sub-block already has a derivable, independent BV that does not need to be transmitted, which achieves sufficient prediction accuracy, so dividing the block further makes little sense.
  • the main encoding mode of AVS3 is the traditional Intra Mode.
  • S1002 may include:
  • the searched BV is calculated for the image block to be encoded
  • the initial BV candidate list can be established by reusing the motion parameter information of spatially adjacent coded image blocks, on the premise that at least one adjacent image block is a coded image block whose coding mode is Intra Affine Mode. Constructing the BV candidate list in Intra Affine Mode therefore requires continuously inheriting the motion parameters of the coded region of the current frame, which raises the following problems: the first coding unit to enter Intra Affine Mode cannot construct a BV candidate list from spatially adjacent coded image blocks, so Intra Affine Mode cannot be used; in addition, constructing the BV candidate list only from adjacent image blocks may not be enough to find the optimal BV. The embodiment of this application therefore proposes a blind search method, which can perform a blind search within the largest coding unit (LCU, Largest Coding Unit) where the current coding unit is located, and further expand the BV candidate list of the
  • the first image block to be encoded that uses Intra Affine Mode must use a blind search method to construct a BV candidate list.
  • FIG. 11A is a schematic diagram of using a four-parameter Affine to perform a blind search in a preset area
  • FIG. 11B is a schematic diagram of using a six-parameter Affine to perform a blind search in a preset area.
  • the blind search method is as follows:
  • a coordinate system is established with the first pixel in the upper left corner of the LCU containing the current coding unit (equivalent to the above-mentioned image block to be encoded) as the origin, the width direction as the abscissa, and the height direction as the ordinate.
  • the current LCU size is W×H,
  • the preset area is composed of coded area A and coded area B, where the horizontal coordinate range of area A is [0, W-1] and its vertical coordinate range is [0, y-1], and the horizontal coordinate range of area B is [0, x-1] and its vertical coordinate range is [y, y+H-1].
  • the search step can be set as required: the smaller the search step, the finer the prediction but the longer the search takes; the embodiment of the present application does not specifically limit the value of the search step.
  • if Ref_LT and Ref_RT are in the same area, the search for Ref_RT starts at the pixel position of Ref_LT; otherwise it starts at the pixel in the upper left corner of the area (A or B) where Ref_RT is located. Ref_RT never appears to the left of Ref_LT during the search;
  • after Ref_RT has finished searching the area, Ref_LT is updated, that is, shifted right or down by the search step of 8 pixels, and the process returns to obtain the Ref_RT corresponding to the current Ref_LT;
  • blind search of the six-parameter Affine in FIG. 11B in the preset area is similar to the above-mentioned blind search method of the four-parameter Affine, which will not be repeated here.
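  • The four-parameter blind search can be sketched as an enumeration of (Ref_LT, Ref_RT) position pairs over areas A and B. Several steps of the procedure are not reproduced in this text, so this is only one possible reading: the height of area B is taken to be the height of the current coding unit, positions are visited in raster order, and the names `grid` and `blind_search_pairs` are illustrative.

```python
def grid(x_range, y_range, step):
    """Raster-order sample positions inside an inclusive rectangle."""
    (x0, x1), (y0, y1) = x_range, y_range
    return [(px, py)
            for py in range(y0, y1 + 1, step)
            for px in range(x0, x1 + 1, step)]


def blind_search_pairs(cu_x, cu_y, cu_h, lcu_w, step=8):
    """Enumerate candidate (Ref_LT, Ref_RT) pairs for a unit at (cu_x, cu_y)."""
    area_a = grid((0, lcu_w - 1), (0, cu_y - 1), step)           # rows above
    area_b = grid((0, cu_x - 1), (cu_y, cu_y + cu_h - 1), step)  # left side
    pairs = []
    for area_lt in (area_a, area_b):
        for i, ref_lt in enumerate(area_lt):
            for area_rt in (area_a, area_b):
                # Same area: Ref_RT starts at Ref_LT. Other area: Ref_RT
                # starts at that area's upper left position.
                start = i if area_rt is area_lt else 0
                for ref_rt in area_rt[start:]:
                    if ref_rt[0] >= ref_lt[0]:  # never left of Ref_LT
                        pairs.append((ref_lt, ref_rt))
    return pairs
```

  • Each pair would then be converted into control-point BVs by subtracting the positions of the current block's upper left and upper right corners, and evaluated as a candidate.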
  • S1002 may include:
  • the BV candidate list of adjacent image blocks is constructed from the searched BV.
  • FIG. 12 is a schematic diagram of the arrangement of an optional image block to be encoded according to an embodiment of the application.
  • As shown in FIG. 12, the adjacent image blocks of the image block to be encoded include A, B, C, D, G, and F. If the encoding mode of an adjacent image block is Intra Affine Mode, its BV is reused directly and filled into the BV candidate list.
  • That is, if the BV of an adjacent image block was obtained through blind search, the encoding mode adopted by that block is Intra Affine Mode, so its BV is added to the BV candidate list.
  • the method may further include:
  • the BV candidate list of an adjacent image block is constructed from the searched BV and/or the BV candidate lists of that block's own adjacent image blocks.
  • That is to say, when an adjacent image block adopts Intra Affine Mode, its BV candidate list was obtained by blind search and/or from the BV candidate lists of its own adjacent image blocks; the BV candidate list of the adjacent image block is therefore added to the BV candidate list of the image block to be encoded.
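  • Reusing the BVs of adjacent image blocks can be sketched as follows. The neighbour order follows the A, B, C, D, G, F listing above; the dictionary layout, the mode label `"INTRA_AFFINE"`, the removal of duplicate entries, and the function name are all assumptions made for illustration.

```python
NEIGHBOUR_ORDER = ("A", "B", "C", "D", "G", "F")


def build_bv_candidate_list(neighbours):
    """Collect BVs from neighbours that were coded in Intra Affine Mode.

    neighbours maps a position label to {"mode": str, "bvs": list}, where
    each BV entry is a tuple of control-point BVs such as ((-16, 0), (-16, 0)).
    """
    candidates = []
    for pos in NEIGHBOUR_ORDER:
        block = neighbours.get(pos)
        if block is None or block["mode"] != "INTRA_AFFINE":
            continue  # unavailable, or coded in another mode
        for bv in block["bvs"]:
            if bv not in candidates:  # skip duplicates (assumption)
                candidates.append(bv)
    return candidates
```

  • If the returned list is empty, as for the first coding unit to enter Intra Affine Mode, the list has to be filled by blind search instead.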
  • FIG. 13A is a schematic diagram of the arrangement of the BVs in the translational relationship of the four-parameter Affine. As shown in FIG. 13A, if the BVs of the upper-left control point (LT) and the upper-right control point (RT) of the current coding block are equal, the reference relationship is translational.
  • FIG. 13B is a schematic diagram of the arrangement of the BVs in the non-translational relationship of the four-parameter Affine. As shown in FIG. 13B, if they are not equal, the reference relationship describes non-translational complex motion such as zooming and rotation.
  • FIG. 13C is a schematic diagram of the arrangement of the BVs in the translational relationship of the six-parameter Affine. As shown in FIG. 13C, if the BVs of LT, RT, and the lower-left control point (LB) of the current coding block are equal, the reference relationship is translational.
  • FIG. 13D is a schematic diagram of the arrangement of the BVs in the non-translational relationship of the six-parameter Affine. As shown in FIG. 13D, if they are not equal, the reference relationship describes non-translational complex motion such as scaling and rotation.
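  • The distinction drawn in FIG. 13A to FIG. 13D reduces to comparing the control-point BVs. A minimal sketch, assuming each BV is an (x, y) tuple and using the hypothetical helper name `is_translational`:

```python
from typing import Sequence, Tuple

BV = Tuple[int, int]  # block vector as (horizontal, vertical) displacement


def is_translational(control_point_bvs: Sequence[BV]) -> bool:
    """Return True when all control-point BVs are equal.

    Pass two BVs (LT, RT) for the four-parameter model or
    three BVs (LT, RT, LB) for the six-parameter model.
    """
    if not control_point_bvs:
        raise ValueError("at least one control-point BV is required")
    first = control_point_bvs[0]
    return all(bv == first for bv in control_point_bvs)
```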
  • FIG. 13E is a schematic diagram of the arrangement of the translational relationship in SCC. As shown in FIG. 13E, the current coding unit refers to a coding unit of the same width and height located in the coded region of the current frame.
  • FIG. 13F is a schematic diagram of the arrangement of the non-translational relationship in SCC.
  • In this case, the current coding unit refers to a coding unit located in the coded area of the current frame whose width and height differ from its own (the BV of each sub-block is different, and the reference sub-blocks may not be located in the same coding unit).
  • Intra Affine CPR Mode can also implement non-translational reference relationships, improve the accuracy of intra prediction, and improve coding efficiency.
  • This technology can be used in common SCC scenarios such as game live broadcasts, remote desktops, and online games, and has good application prospects.
  • the main encoding modes adopted by SCC are Intra Mode and IBC Mode.
  • S1002 can include:
  • the BV candidate list of the adjacent image block is constructed from the BV of the reference image block of the adjacent image block.
  • IBC Mode can be used in VVC. If the coding mode adopted by the adjacent image block is IBC Mode, the BV candidate list of the adjacent image block is determined by the BV of the reference image block of the adjacent image block.
  • the BV candidate list of the adjacent image block is added to the BV candidate list, and the mode of encoding the BV candidate list constructed in this way is called Intra Affine CPR Mode.
  • FIG. 14 is a schematic diagram of another optional arrangement of image blocks to be encoded according to an embodiment of the application.
  • the BV candidate list can be obtained from the BV of the coded adjacent PU of CurPU.
  • the access order of adjacent PUs is from the right adjacent PU (A0) to the lower left adjacent PU (A1), to the directly above adjacent PU (B0), to the upper right adjacent PU (B1) and then to the upper left adjacent PU (B2).
  • Fig. 15 is a schematic diagram of the arrangement of the first optional BV candidate list construction provided by an embodiment of the application. As shown in Fig. 15, the four-parameter Affine is used.
  • If the coding mode adopted by the adjacent PU is IBC Mode, the BV candidate list of the adjacent image block is copied to all control points of the current PU. Although the initial BV candidate lists of the control points are the same, the subsequent iterative update adjusts the BV candidate list to achieve the effect of the non-translational model.
  • Figure 16A is a schematic diagram of the arrangement of the second optional BV candidate list construction provided by an embodiment of the application.
  • The four-parameter Affine is used. If the adjacent PU uses Intra Affine CPR Mode, the BV candidate list of the adjacent PU can be directly reused as the BV candidate list, as shown in FIG. 16A.
  • FIG. 16B is a schematic diagram of the arrangement of the third optional BV candidate list construction provided by the embodiments of the application, using the six-parameter Affine. If the adjacent PU uses Intra Affine CPR Mode, the BV candidate list of the adjacent PU can be directly reused as the BV candidate list, as shown in FIG. 16B.
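  • The direct-reuse construction described above can be sketched as follows. The `NeighborPU` record, the mode strings, and the duplicate filtering are assumptions made for the example; the access order A0, A1, B0, B1, B2 and the two reuse rules (copy an IBC BV to every control point, reuse Intra Affine CPR BVs as they are) follow the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

BV = Tuple[int, int]
Candidate = Tuple[BV, ...]  # one BV per control point of the current PU

ACCESS_ORDER = ("A0", "A1", "B0", "B1", "B2")  # neighbor visiting order from the text


@dataclass
class NeighborPU:
    mode: str            # "IBC", "INTRA_AFFINE_CPR", or any other mode (hypothetical labels)
    bvs: Tuple[BV, ...]  # one BV for IBC, one per control point for Intra Affine CPR


def build_candidates(neighbors: Dict[str, Optional[NeighborPU]],
                     num_control_points: int = 2) -> List[Candidate]:
    """Collect initial BV candidates from coded neighbors (2 control points = four-parameter)."""
    candidates: List[Candidate] = []
    for name in ACCESS_ORDER:
        pu = neighbors.get(name)
        if pu is None:
            continue  # neighbor not available or not yet coded
        if pu.mode == "IBC":
            # A single translational BV is copied to all control points.
            cand = (pu.bvs[0],) * num_control_points
        elif pu.mode == "INTRA_AFFINE_CPR" and len(pu.bvs) == num_control_points:
            cand = tuple(pu.bvs)  # reuse the neighbor's control-point BVs directly
        else:
            continue  # other modes contribute nothing
        if cand not in candidates:  # duplicate filtering is an assumption of this sketch
            candidates.append(cand)
    return candidates
```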
  • S1002 may include:
  • the BV candidate list of adjacent image blocks is constructed from the BV candidate list of adjacent image blocks of adjacent image blocks.
  • If the BV candidate list of the selected adjacent image block is constructed from the BV candidate list of the adjacent image block of the adjacent image block, the adjacent image block of the adjacent image block adopts IBC Mode or Intra Affine CPR Mode, so the BV candidate list of the adjacent image block is added to the BV candidate list.
  • S1002 may include:
  • the searched BV is calculated for the image block to be encoded
  • blind search in VVC is an optional way to build a BV candidate list
  • blind search in AVS3 is a necessary way to build a candidate list
  • S1002 may include:
  • the BV candidate list of adjacent image blocks is constructed from the BV candidate list of the reference image block of the adjacent image block, and/or the BV candidate list of the adjacent image block of the adjacent image block, and/or the searched BV .
  • The BV candidate list of the adjacent image block consists of the searched BV, and/or the BV candidate list of the adjacent image block of the adjacent image block, and/or the BV of the reference image block of the adjacent image block. That is, when the adjacent image block adopts Intra Affine Mode or IBC Mode, the BV candidate list of the adjacent image block can be added to the BV candidate list.
  • S1002 may include:
  • the control point of the reference image block of the adjacent image block is calculated
  • the BV of the image block to be encoded and the reference image block of the adjacent image block is calculated;
  • the BV of the image block to be encoded and the reference image block of the adjacent image block is added to the BV candidate list.
  • The BV of the adjacent image block is obtained first; this BV is the vector from the control point of the adjacent image block to the control point of its reference image block. Knowing the BV of the adjacent image block, the control point of the reference image block of the adjacent image block can be calculated. On this basis, the vector from the control point of the image block to be coded to the control point of the reference image block of the adjacent image block is calculated; this vector is the BV between the image block to be coded and the reference image block of the adjacent image block, and it is added to the BV candidate list.
  • FIG. 17A is a schematic diagram of the arrangement of the third optional construction of the BV candidate list provided by the embodiment of the application.
  • Intra Affine CPR Mode can directly reuse the two control points at the upper-left and upper-right corners of the reference image block of the adjacent image block, infer from the spatial position the BVs that point from the two control points of the image block to be encoded to the reference image block of the adjacent image block, and add the inferred BVs to the BV candidate list. Taking the six-parameter Affine as an example, FIG. 17B is a schematic diagram of the arrangement of the fourth optional construction of the BV candidate list provided by the embodiment of this application.
  • The BV of the adjacent image block is first obtained. This BV is the vector from the control point of the adjacent image block to the control point of its reference image block. On this basis, the control point of the reference image block of the adjacent image block can be calculated, the vector from the control point of the image block to be coded to the control point of the reference image block of the adjacent image block is calculated, and the resulting BV between the image block to be coded and the reference image block of the adjacent image block is added to the BV candidate list.
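  • The inference described above amounts to two vector operations per control point: locate the control point of the neighbor's reference block, then take its offset from the matching control point of the current block. A minimal sketch, assuming control points are given as (x, y) pixel positions in matching order (LT, RT, and optionally LB); the function name `infer_bvs` is hypothetical:

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]
BV = Tuple[int, int]


def infer_bvs(cur_cps: Sequence[Point],
              nei_cps: Sequence[Point],
              nei_bvs: Sequence[BV]) -> List[BV]:
    """Infer BVs from the current block's control points to the neighbor's reference block."""
    inferred: List[BV] = []
    for cur, nei, bv in zip(cur_cps, nei_cps, nei_bvs):
        # Control point of the neighbor's reference block = neighbor control point + its BV.
        ref = (nei[0] + bv[0], nei[1] + bv[1])
        # BV of the current block = reference control point - current control point.
        inferred.append((ref[0] - cur[0], ref[1] - cur[1]))
    return inferred
```

  • For a neighbor coded in IBC Mode, the same single BV would be passed for every control point.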
  • FIG. 18A is a schematic diagram of the arrangement of the fifth optional BV candidate list construction provided by an embodiment of the application.
  • When the coding mode of the adjacent image block (NeiPU in FIG. 18A) is IBC Mode, the BVs from the upper-left and upper-right control points of the image block to be coded (CurPU in FIG. 18A) to the upper-left and upper-right control points of the reference PU of NeiPU (NeiRefPU in FIG. 18A) are added to the BV candidate list.
  • FIG. 18B is a schematic diagram of the arrangement of the sixth alternative BV candidate list construction provided by an embodiment of the application, as shown in FIG. 18B, when the coding mode of adjacent image blocks (NeiPU in FIG.
  • S1002 may include:
  • the BV of the adjacent image block is used to calculate the indirect reference image block of the image block to be encoded
  • the vector addition and subtraction algorithm is used to calculate the BV of the image block to be encoded and the indirect reference image block ;
  • the BVs of the image block to be coded and the indirect reference image block are added to the BV candidate list.
  • The BV of the adjacent image block can be reused by the image block to be encoded: the indirect reference image block of the image block to be encoded is calculated, and the BV of the indirect reference image block is obtained. Finally, according to the vector addition and subtraction algorithm, the BV between the image block to be coded and the indirect reference image block is calculated and added to the BV candidate list.
  • FIG. 19A is a schematic diagram of the arrangement of the seventh alternative BV candidate list construction provided by an embodiment of the application.
  • LT represents the upper left corner control point of CurPU
  • RT represents the upper right corner control point of CurPU.
  • (v 0 , v 1 ) is the BV of NeiPU.
  • CurPU can find the indirect reference image block through the BV of NeiPU, it is represented by PU0 and PU1 in Fig. 19A.
  • The BV (v l1 , v r1 ) obtained by using the indirect reference image block can be expressed by the following formula:
  • FIG. 19B is a schematic diagram of the arrangement of the eighth optional construction of the BV candidate list provided by the embodiments of the application.
  • LT represents the upper left corner control point of CurPU
  • RT represents the upper right corner control point of CurPU
  • (v 0 , v 1 ) is the BV of NeiPU.
  • CurPU can find the indirect reference image block through the BV of NeiPU, it is represented by PU0 and PU1 in Figure 19B.
  • the BV (v l2 , v r2 ) obtained by using the indirect reference image block is expressed by the following formula :
  • The derivation for FIG. 19B is similar to that for FIG. 19A and is not repeated here.
  • Fig. 19C is a schematic diagram of the arrangement of the ninth optional construction of the BV candidate list provided by the embodiments of the application.
  • LT represents the upper left corner control point of CurPU
  • RT represents the upper right corner control point of CurPU
  • (v 0 , v 1 ) is the BV of NeiPU.
  • CurPU can find the indirect reference image block through the BV of NeiPU, it is represented by PU0 and PU1 in Figure 19C.
  • The BV (v l3 , v r3 ) obtained by using the indirect reference image block is expressed by the following formula:
  • The derivation for FIG. 19C is similar to that for FIG. 19A and is not repeated here.
  • The current PU first reuses the BV (v 0 , v 1 ) of the spatially adjacent PU (NeiPU) and finds the PU0 and PU1 pointed to by v 0 and v 1 . Then, from PU0's BV (v 00 , v 01 ) and PU1's BV (v 10 , v 11 ), the BV of the current PU is derived using formula (4), (5), or (6) above and filled into the BV candidate list.
  • FIG. 19D is a schematic diagram of the arrangement of the tenth optional construction of the BV candidate list provided by the embodiments of the application.
  • LT represents the upper left corner control point of CurPU
  • RT represents the upper right corner control point of CurPU
  • (v 0 , v 1 ) is the BV of NeiPU.
  • CurPU can find the indirect reference image block through the BV of NeiPU, it is represented by PU0 and PU1 in Figure 19D.
  • the BV obtained by using the indirect reference image block is expressed by the following formula:
  • NeiPU, PU0, and PU1 shown in FIG. 19A to FIG. 19C all use Intra Affine CPR Mode. If one of the PUs uses IBC Mode, this indirect reuse method can also be used; in this case, its BV only needs to be copied to all other control points, as shown in FIG. 19D. The same holds in other situations, which are simply combinations of IBC Mode and Intra Affine CPR Mode and are not repeated here.
  • When constructing the BV candidate list in VVC, Intra Affine CPR Mode needs to continuously inherit the motion parameters of the coded area of the current frame.
  • VVC Video Coding
  • AVS3 does not have IBC Mode. Without a blind search process, no starting point can be found, and the subsequent iterative update steps cannot be performed. Therefore, the blind search process must be included in AVS3. In VVC, when IBC Mode is turned on, the BV of IBC Mode can be reused and adjusted as the starting BV, so the blind search process is not necessarily included in VVC.
  • FIG. 20 is a schematic diagram of an optional BV candidate list construction process provided by an embodiment of the application. As shown in FIG. 20, the construction of the BV candidate list can be roughly divided into three ways: direct reuse, indirect reuse, and blind search. The specific implementation steps are as follows:
  • S2003 Construct a BV candidate list by using a blind search method, and perform coding according to the constructed candidate list, and execute S2006; here, the blind search method is the same as the above description, and will not be repeated here.
  • The searched BVs can all be added to the BV candidate list, or they can be selectively added to the BV candidate list according to the coding overhead of each searched BV.
  • the method may further include:
  • the above-mentioned first encoding overhead is the error between the true value of the image block to be encoded and the predicted value of the image block to be encoded and the redundant information in the encoding process.
  • The blind search steps of the six-parameter Affine in the LCU are similar to those of the four-parameter search, except that a search for Ref_LB, the reference point of the lower-left control point LB of the current coding unit, is added.
  • The Ref_LB search steps are similar to those for Ref_RT and are not repeated here.
  • S1003 Perform an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain a BV candidate list after the iterative operation;
  • S1003 may include:
  • each BV in the BV candidate list is substituted into the error formula between the true value of the image block to be encoded and the predicted value of the image block to be encoded, and the least squares algorithm is used to update the parameters of the affine motion model and the BV candidate list in the error formula;
  • the updated BV candidate list is used as the BV candidate list for the next iterative operation, and the parameters of the affine motion model in the updated error formula are used to update the parameters in the affine motion model;
  • Update i to i+1 and return to execute to determine whether the iteration number i is less than or equal to the preset iteration number.
  • the error formula between the true value of the image block to be encoded and the predicted value of the image block to be encoded can be expressed as:
  • A [a ⁇ 0 b ⁇ 1 ] T is the parameter of the Affine model.
  • the parameters of Affine and the BV candidate list can be updated to obtain the BV candidate list after the iterative operation.
  • Fig. 21 is a schematic flow diagram of an optional iterative operation method provided by an embodiment of the application. As shown in Fig. 21, the maximum number of iterations for the four-parameter Affine and the six-parameter Affine are set to 5 and 4, respectively; As an example, the iterative operation method is as follows:
  • S2104 The number of iterations is then incremented by 1; when the number of iterations is less than or equal to the iteration threshold, return to step S2102;
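  • The control flow of the iterative operation can be sketched as a bounded loop. The least-squares update itself depends on the error formula, which is not reproduced here, so it is passed in as a hypothetical callable `update_step`; only the iteration limits (5 for the four-parameter Affine, 4 for the six-parameter Affine) and the "increment, compare, repeat" structure come from the text.

```python
from typing import Callable, Dict, List, Sequence, Tuple

BV = Tuple[float, float]
Candidate = Tuple[BV, ...]  # one BV per control point

# Maximum number of iterations per affine model, as stated for FIG. 21.
MAX_ITERATIONS: Dict[int, int] = {4: 5, 6: 4}


def iterate_candidates(candidates: Sequence[Candidate],
                       update_step: Callable[[Candidate], Candidate],
                       num_params: int = 4) -> List[Candidate]:
    """Apply update_step to every candidate until the iteration limit is reached."""
    current = list(candidates)
    i = 1
    while i <= MAX_ITERATIONS[num_params]:
        # update_step stands in for the least-squares refinement of the text.
        current = [update_step(c) for c in current]
        i += 1  # the iteration count is incremented and compared with the limit
    return current
```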
  • S1004 may include:
  • the BV corresponding to the minimum value in the second encoding overhead is used as the BV of the image block to be encoded.
  • FIG. 22A is a schematic diagram of an optional process for determining an encoding mode according to an embodiment of the application. As shown in FIG. 22A, the specific implementation steps are as follows:
  • S22A1 Input the current coding unit; execute S22A2 and S22A3;
  • S22A2 Use conventional Intra Mode for encoding, and execute S22A6;
  • S22A3 Determine whether the length H and width W of the image block to be encoded are both greater than or equal to 16 pixels;
  • S22A6 Calculate the coding overhead using Intra Mode and Intra Affine Mode respectively, and finally select the coding mode corresponding to the minimum coding overhead from the coding overhead, and determine it as the final coding mode.
  • the conventional Intra Mode is used for encoding
  • the Intra Affine Mode is used when the length and width of the image block to be encoded are both greater than or equal to the distance of 16 pixels.
  • FIG. 22B is a schematic diagram of another optional process for determining an encoding mode provided by an embodiment of the application. As shown in FIG. 22B, the specific implementation steps are as follows:
  • S22B1 Input the current coding unit; execute S22B2, S22B3 and S22B7;
  • S22B2 Use conventional Intra Mode for encoding, and execute S22B8;
  • S22B3 Determine whether the length H and width W of the image block to be encoded are both greater than or equal to 16 pixels;
  • S22B5 Iteratively search for the best BV from the initial BV candidate list, and execute S22B8;
  • S22B6 Determine whether the length H and width W of the image block to be encoded are both less than or equal to 16 pixels;
  • S22B7 Use conventional IBC Mode for encoding, and execute S22B8;
  • S22B8 Calculate the coding overheads using Intra Mode, IBC Mode and Intra Affine CPR Mode respectively, and finally select the coding mode corresponding to the minimum value of the coding overhead from the coding overhead, and determine it as the final coding mode.
  • In VVC, after the image block to be encoded is acquired, the conventional Intra Mode is used for encoding, and IBC Mode is used for encoding when the length and width of the image block to be encoded are both less than or equal to 64 pixels.
  • Intra Affine CPR Mode is used for encoding (the BV corresponding to the minimum coding overhead can be selected through iterative calculation). The coding overheads of Intra Mode, IBC Mode, and Intra Affine CPR Mode are then calculated, and the coding mode corresponding to the minimum coding overhead is finally selected as the final coding mode.
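  • The mode decision above can be sketched as a minimum-cost selection over the modes that the block size permits. The function name `choose_mode_vvc`, the mode labels, and the `cost_of` callable are assumptions of this example; the 16-pixel lower bound for Intra Affine CPR Mode follows step S22B3 and the 64-pixel upper bound for IBC Mode follows the description above, and both are left as parameters.

```python
from typing import Callable, List


def choose_mode_vvc(h: int, w: int,
                    cost_of: Callable[[str], float],
                    affine_min: int = 16,
                    ibc_max: int = 64) -> str:
    """Return the candidate coding mode with the smallest coding overhead."""
    modes: List[str] = ["INTRA"]  # conventional Intra Mode is always tried
    if h <= ibc_max and w <= ibc_max:
        modes.append("IBC")
    if h >= affine_min and w >= affine_min:
        modes.append("INTRA_AFFINE_CPR")
    # cost_of is a hypothetical stand-in for the encoder's overhead calculation.
    return min(modes, key=cost_of)
```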
  • the method may further include:
  • Whether each BV in the BV candidate list after the iterative operation is legal is determined by whether the control point of the reference image block it points to lies in the coded image block: only a BV whose reference image block control point lies in the coded image block is legal; otherwise it is illegal.
  • the current frame itself needs to be used as the only reference frame.
  • the reference unit is in the coded area of the current frame.
  • The legality of every searched BV needs to be checked, including the initial BVs used to establish the BV candidate list mentioned above, and the continuously updated BVs and the optimal BV in the iterative search process. A BV that fails the legality check indicates that a reference pixel may lie beyond the image boundary or in the uncoded area, so the reference pixel value will not be available at the decoding end, resulting in decoding errors.
  • Assume that the width and length of the image are W_pic and H_pic respectively; the BVs of the control points LT (upper left corner) and RT (upper right corner) of the current coding unit are (v 0x , v 0y ) and (v 1x , v 1y ); the width and length of the current coding unit are w and h; the position of any pixel in the coding unit is (x, y); and the position of the upper left pixel is (X PU , Y PU ). The legality check requires that the reference pixels of all pixels in the unit are in the coded area and do not exceed the image boundary.
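  • A possible form of this check is sketched below. The exact condition is not reproduced in the text, so the sketch makes two assumptions: the per-pixel BV follows the widely used four-parameter affine form driven by the LT and RT control-point BVs, and the coded area is described by a hypothetical predicate `is_coded(x, y)` supplied by the caller.

```python
from typing import Callable, Tuple

BV = Tuple[float, float]


def is_legal(v0: BV, v1: BV, x_pu: int, y_pu: int, w: int, h: int,
             w_pic: int, h_pic: int,
             is_coded: Callable[[int, int], bool]) -> bool:
    """Check that every reference pixel is inside the picture and already coded."""
    a = (v1[0] - v0[0]) / w  # assumed four-parameter model coefficients
    b = (v1[1] - v0[1]) / w
    for y in range(h):
        for x in range(w):
            vx = a * x - b * y + v0[0]
            vy = b * x + a * y + v0[1]
            rx = round(x_pu + x + vx)  # reference pixel position
            ry = round(y_pu + y + vy)
            if not (0 <= rx < w_pic and 0 <= ry < h_pic):
                return False  # reference pixel exceeds the image boundary
            if not is_coded(rx, ry):
                return False  # reference pixel lies in the uncoded area
    return True
```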
  • S1005 According to the BV of the image block to be encoded, call a preset affine motion model, calculate the reference image block of the image block to be encoded, and use the pixel value of the reference image block as the predicted value of the image block to be encoded.
  • Formula (1) and formula (2) can be used to calculate the BV of each sub-block in the image block to be encoded. Then, for each sub-block, the reference image block of the sub-block is determined according to its BV, and the pixel value of the reference image block of each sub-block is used as the predicted value of that sub-block, so that the predicted value of the image block to be encoded can be determined.
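  • The per-sub-block prediction can be sketched as a copy from the current frame. Formula (1) and formula (2) are not restated here, so the sub-block BV is obtained from a hypothetical callable `subblock_bv`; the 4x4 sub-block size, integer BVs, and the list-of-rows frame layout are also assumptions of this example, and the BVs are assumed to have passed the legality check.

```python
from typing import Callable, List, Tuple

SUB = 4  # sub-block size in pixels (assumed)


def predict_block(frame: List[List[int]], x_pu: int, y_pu: int, w: int, h: int,
                  subblock_bv: Callable[[int, int], Tuple[int, int]]) -> List[List[int]]:
    """Build the predicted block by copying each sub-block's reference pixels."""
    pred = [[0] * w for _ in range(h)]
    for sy in range(0, h, SUB):
        for sx in range(0, w, SUB):
            # BV of the sub-block whose upper-left corner is at offset (sx, sy).
            bvx, bvy = subblock_bv(sx, sy)
            for dy in range(min(SUB, h - sy)):
                for dx in range(min(SUB, w - sx)):
                    ref_y = y_pu + sy + dy + bvy
                    ref_x = x_pu + sx + dx + bvx
                    pred[sy + dy][sx + dx] = frame[ref_y][ref_x]
    return pred
```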
  • The Intra Affine Mode and Intra Affine CPR Mode coding modes have been introduced above. To better implement them in the encoder and decoder, the following method is used:
  • When Intra Affine Mode is executed at the encoding end, the mode needs to be marked or inferred so that other encoding units can refer to encoding units coded with Intra Affine Mode and so that the decoding end can decode them.
  • FIG. 23 is a schematic flow diagram of a method for inference of an optional encoding mode provided by an embodiment of the application, as shown in FIG. 23, the specific inference flow is as follows:
  • For Intra Affine CPR Mode in VVC, the mode likewise needs to be marked or inferred so that other PUs can refer to PUs coded with Intra Affine CPR Mode and so that the decoder can decode them.
  • FIG. 24 is a schematic flowchart of another method for inference of an optional encoding mode provided by an embodiment of the application. As shown in FIG. 24, the specific inference process is as follows:
  • S2401 Determine whether Flag_predMode is MODE_INTER; if yes, execute S2402; if not, select Intra Mode;
  • S2402 Determine whether the reference frame is the current frame, if yes, execute S2403, otherwise select Inter Mode;
  • S2403 Determine whether Flag_Affine is true, if it is, select Intra Affine CPR Mode, otherwise select IBC mode.
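  • Steps S2401 to S2403 form a three-level decision, which can be sketched as follows. The function name `infer_mode` and the string representation of the flags and modes are assumptions made for the example.

```python
def infer_mode(flag_pred_mode: str, ref_is_current_frame: bool, flag_affine: bool) -> str:
    """Infer the coding mode of a PU from its signalled flags (S2401 to S2403)."""
    # S2401: anything other than MODE_INTER is conventional intra coding.
    if flag_pred_mode != "MODE_INTER":
        return "Intra Mode"
    # S2402: a reference frame other than the current frame means ordinary inter coding.
    if not ref_is_current_frame:
        return "Inter Mode"
    # S2403: the affine flag separates Intra Affine CPR Mode from IBC Mode.
    return "Intra Affine CPR Mode" if flag_affine else "IBC Mode"
```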
  • The embodiment of the application provides a method for determining a predicted value, which includes: an encoder obtains an image block to be encoded and an encoded image block from a current image frame; constructs a BV candidate list for the image block to be encoded according to the encoded image block; performs an iterative operation on each BV in the BV candidate list according to the preset iterative algorithm to obtain the BV candidate list after the iterative operation; selects the BV of the image block to be encoded from the BV candidate list after the iterative operation; and, according to the BV of the image block to be encoded, calls the preset affine motion model, calculates the reference image block of the image block to be encoded, and uses the pixel value of the reference image block as the predicted value of the image block to be encoded.
  • In other words, in the embodiments of this application, the BV of the image block to be encoded can be selected from the BV candidate list after the iterative operation, so that a more accurate BV can be quickly determined for the image block to be encoded. The affine motion model is then invoked, so that affine motion compensation is used to determine the predicted value during intra-frame prediction coding. The determined predicted value is closer to the true value, which effectively improves coding efficiency.
  • FIG. 25 is a schematic structural diagram of an optional encoder proposed in an embodiment of this application.
  • The encoder proposed in an embodiment of this application may include an acquisition module 251, a construction module 252, an iteration module 253, a selection module 254, and a determination module 255.
  • the obtaining module 251 is configured to obtain the image block to be coded and the coded image block from the current image frame;
  • the construction module 252 is configured to construct a block vector BV candidate list for the image block to be encoded according to the encoded image block;
  • the iteration module 253 is configured to perform an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain a BV candidate list after the iterative operation;
  • the selection module 254 is used to select the BV of the image block to be encoded from the BV candidate list after the iterative operation;
  • the determining module 255 is configured to call a preset affine motion model according to the BV of the image block to be encoded, calculate the reference image block of the image block to be encoded, and use the pixel value of the reference image block as the predicted value of the image block to be encoded.
  • the construction module 252 is specifically configured to:
  • the searched BV is calculated for the image block to be encoded
  • the construction module 252 is specifically used for:
  • the BV candidate list of adjacent image blocks is constructed from the searched BV.
  • the construction module 252 is also specifically used for:
  • the BV candidate list of adjacent image blocks is constructed from the searched BV and/or the BV candidate list of adjacent image blocks of adjacent image blocks.
  • the construction module 252 is specifically used for:
  • the BV candidate list of the adjacent image block is constructed from the BV of the reference image block of the adjacent image block.
  • the construction module 252 is specifically used for:
  • the BV candidate list of adjacent image blocks is constructed from the BV candidate list of adjacent image blocks of adjacent image blocks.
  • the construction module 252 is specifically used for:
  • the searched BV is calculated for the image block to be encoded
  • the construction module 252 is specifically used for:
  • the BV candidate list of adjacent image blocks is constructed from the BV candidate list of the reference image block of the adjacent image block, and/or the BV candidate list of the adjacent image block of the adjacent image block, and/or the searched BV .
  • the construction module 252 is specifically used for:
  • the control point of the reference image block of the adjacent image block is calculated
  • the BV of the image block to be encoded and the reference image block of the adjacent image block is calculated;
  • the BV of the image block to be encoded and the reference image block of the adjacent image block is added to the BV candidate list.
  • the construction module 252 is specifically used for:
  • the BV of the adjacent image block is used to calculate the indirect reference image block of the image block to be encoded
  • the vector addition and subtraction algorithm is used to calculate the BV of the image block to be encoded and the indirect reference image block ;
  • the construction module 252 is specifically used for:
  • the intra-frame mode is used to encode the image block to be encoded to obtain the predicted value of the image block to be encoded.
  • the construction module 252 is specifically used for:
  • the calculation uses each BV of the searched BVs to encode the image block to be encoded.
  • the iteration module 253 is specifically used for:
  • each BV in the BV candidate list is substituted into the error formula between the true value of the image block to be encoded and the predicted value of the image block to be encoded, and the least squares algorithm is used to update the parameters of the affine motion model and the BV candidate list in the error formula;
  • the updated BV candidate list is used as the BV candidate list for the next iterative operation, and the parameters of the affine motion model in the updated error formula are used to update the parameters in the affine motion model;
  • Update i to i+1 and return to execute to determine whether the iteration number i is less than or equal to the preset iteration number.
  • the selection module 254 is specifically used for:
  • the BV corresponding to the minimum value in the second encoding overhead is used as the BV of the image block to be encoded.
  • encoder is also used for:
  • FIG. 26 is a schematic structural diagram of another optional encoder proposed in an embodiment of the application.
  • The encoder 2600 proposed in an embodiment of the application may further include a processor 261 and a storage medium 262 storing instructions executable by the processor 261. The storage medium 262 relies on the processor 261 to perform operations through the communication bus 263, and when the instructions are executed by the processor 261, the method for determining the predicted value described in one or more embodiments is executed.
  • the communication bus 263 is used to implement connection and communication between these components.
  • the communication bus 263 also includes a power bus, a control bus, and a status signal bus.
  • various buses are marked as the communication bus 263 in FIG. 26.
  • An embodiment of the present application provides a computer storage medium that stores executable instructions. When the executable instructions are executed by one or more processors, the processors execute the method for determining the predicted value described in one or more embodiments above.
  • the memory in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
  • the volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache.
  • Many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM).
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • the steps of the above method can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • The aforementioned processor may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • DSP Digital Signal Processor
  • ASIC application specific integrated circuit
  • FPGA field-programmable gate array
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the embodiments described herein can be implemented by hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
  • ASIC application specific integrated circuit
  • DSP digital signal processor
  • DSPD digital signal processing device
  • PLD programmable logic device
  • FPGA field-programmable gate array
  • the technology described herein can be implemented through modules (such as procedures, functions, etc.) that perform the functions described herein.
  • the software codes can be stored in the memory and executed by the processor.
  • the memory can be implemented in the processor or external to the processor.
  • the method of the above embodiments can be implemented by means of software plus the necessary general hardware platform. It can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that enable a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to execute the method described in each embodiment of the present application.
  • the embodiment of the application provides a method for determining a predicted value, an encoder, and a computer storage medium.
  • the encoder obtains the image block to be encoded and the encoded image block from the current image frame and, based on the encoded image block, builds a BV candidate list for the image block to be encoded.
  • the encoder performs an iterative operation on each BV in the BV candidate list according to the preset iterative algorithm to obtain the iterated BV candidate list, and selects the BV of the image block to be encoded from it.
  • the preset affine motion model is invoked to calculate the reference image block of the image block to be encoded, and the pixel values of the reference image block are used as the predicted value of the image block to be encoded;
  • the predicted value is closer to the true value, which effectively improves the coding efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

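Embodiments of the present application disclose a method for determining a predicted value, an encoder, and a computer storage medium. The method includes: obtaining an image block to be encoded and an encoded image block from the current image frame; constructing a BV candidate list for the image block to be encoded according to the encoded image block; performing an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain an iterated BV candidate list; selecting the BV of the image block to be encoded from the iterated BV candidate list; and, according to the BV of the image block to be encoded, invoking a preset affine motion model to calculate a reference image block of the image block to be encoded, and using the pixel values of the reference image block as the predicted value of the image block to be encoded.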

Description

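Method for Determining a Predicted Value, Encoder, and Computer Storage Medium
Technical Field
The embodiments of the present application relate to video coding technology, and in particular to a method for determining a predicted value, an encoder, and a computer storage medium.
Background
In the Audio Video Coding Standard (AVS), the main coding mode for Screen Content Coding (SCC) is the conventional intra prediction mode (Intra mode). In Versatile Video Coding (VVC), SCC has two main coding modes: the conventional Intra mode and the Intra Block Copy mode (IBC mode).
However, Intra mode performs predictive coding based only on spatial texture. Compared with naturally captured video, the texture correlation of SCC video images is weak, so the predicted value determined with Intra mode differs considerably from the true value. IBC mode can exploit the many repeated regions in SCC content, but it still performs intra prediction based on a translational motion model: in translational IBC mode, all pixels in a coding unit share the same motion vector (MV). SCC scenes nevertheless contain complex motion such as scaling, rotation, deformation, and perspective transformation, so the predicted value based on the translational motion model also deviates considerably from the true value.
Summary
The embodiments of the present application provide a method for determining a predicted value, an encoder, and a computer storage medium, which can improve coding efficiency.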
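The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a method for determining a predicted value. The method is applied to an encoder and includes:
obtaining an image block to be encoded and an encoded image block from the current image frame;
constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block;
performing an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain an iterated BV candidate list;
selecting the BV of the image block to be encoded from the iterated BV candidate list;
according to the BV of the image block to be encoded, invoking a preset affine motion model to calculate a reference image block of the image block to be encoded, and using the pixel values of the reference image block as the predicted value of the image block to be encoded.
In a second aspect, an embodiment of the present application provides an encoder, which includes:
an obtaining module, configured to obtain an image block to be encoded and an encoded image block from the current image frame;
a construction module, configured to construct a block vector (BV) candidate list for the image block to be encoded according to the encoded image block;
an iteration module, configured to perform an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain an iterated BV candidate list;
a selection module, configured to select the BV of the image block to be encoded from the iterated BV candidate list;
a determination module, configured to invoke, according to the BV of the image block to be encoded, a preset affine motion model to calculate a reference image block of the image block to be encoded, and to use the pixel values of the reference image block as the predicted value of the image block to be encoded.
In a third aspect, an embodiment of the present application provides an encoder, which includes:
a processor and a storage medium storing instructions executable by the processor, where the storage medium relies on the processor, via a communication bus, to perform operations, and when the instructions are executed by the processor, the method for determining a predicted value according to the first aspect is performed.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing executable instructions; when the executable instructions are executed by one or more processors, the processors perform the method for determining a predicted value according to the first aspect.
The embodiments of the present application provide a method for determining a predicted value, an encoder, and a computer storage medium. The encoder obtains an image block to be encoded and an encoded image block from the current image frame, constructs a BV candidate list for the image block to be encoded according to the encoded image block, performs an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain an iterated BV candidate list, selects the BV of the image block to be encoded from the iterated BV candidate list, invokes a preset affine motion model according to that BV to calculate a reference image block of the image block to be encoded, and uses the pixel values of the reference image block as the predicted value of the image block to be encoded. In other words, in the embodiments of the present application, a BV candidate list is constructed for the image block to be encoded and each BV in it is iterated, so that the BV of the image block to be encoded can be selected from the iterated list and a more accurate BV can be determined for that block. The affine motion model is then invoked so that intra prediction coding uses affine motion compensation to produce the predicted value. The predicted value determined in this way is closer to the true value, which effectively improves coding efficiency.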
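Brief Description of the Drawings
Fig. 1A is a schematic diagram of the arrangement of a coding unit whose prediction split mode is "NO_SPLIT";
Fig. 1B is a schematic diagram of the arrangement of a coding unit whose prediction split mode is "HOR_tN";
Fig. 1C is a schematic diagram of the arrangement of a coding unit whose prediction split mode is "HOR_UP";
Fig. 1D is a schematic diagram of the arrangement of a coding unit whose prediction split mode is "HOR_DOWN";
Fig. 1E is a schematic diagram of the arrangement of a coding unit whose prediction split mode is "VER_tN";
Fig. 1F is a schematic diagram of the arrangement of a coding unit whose prediction split mode is "VER_LEFT";
Fig. 1G is a schematic diagram of the arrangement of a coding unit whose prediction split mode is "VER_RIGHT";
Fig. 2 is a schematic diagram of the intra prediction modes in AVS3;
Fig. 3A is a schematic structural diagram of a coding unit after QT splitting;
Fig. 3B is a schematic structural diagram of a coding unit after vertical BT splitting;
Fig. 3C is a schematic structural diagram of a coding unit after horizontal BT splitting;
Fig. 3D is a schematic structural diagram of a coding unit after vertical TT splitting;
Fig. 3E is a schematic structural diagram of a coding unit after horizontal TT splitting;
Fig. 4 is a schematic diagram of the arrangement of the 67 prediction directions;
Fig. 5 is a schematic diagram of the arrangement of the BV mapping relationship in IBC Mode;
Fig. 6A is a schematic diagram of the arrangement of an image block to be encoded that uses four-parameter Affine;
Fig. 6B is a schematic structural diagram of an image block to be encoded that uses six-parameter Affine;
Fig. 7 is a schematic diagram of the arrangement of the sub-block MVs obtained by the sub-block-based affine transform prediction method;
Fig. 8 is a schematic diagram of the test result for the first frame image of an SCC test sequence on an instrument;
Fig. 9A is a schematic diagram of the mapping between a coding unit and a reference unit under a scaling model;
Fig. 9B is a schematic diagram of the mapping between a coding unit and a reference unit under a translational model;
Fig. 10 is a schematic flowchart of an optional method for determining a predicted value provided by an embodiment of the present application;
Fig. 11A is a schematic diagram of a blind search in a preset area using four-parameter Affine;
Fig. 11B is a schematic diagram of a blind search in a preset area using six-parameter Affine;
Fig. 12 is a schematic diagram of the arrangement of an optional image block to be encoded provided by an embodiment of the present application;
Fig. 13A is a schematic diagram of the arrangement of BVs with a translational relationship under four-parameter Affine;
Fig. 13B is a schematic diagram of the arrangement of BVs with a non-translational relationship under four-parameter Affine;
Fig. 13C is a schematic diagram of the arrangement of BVs with a translational relationship under six-parameter Affine;
Fig. 13D is a schematic diagram of the arrangement of BVs with a non-translational relationship under six-parameter Affine;
Fig. 13E is a schematic diagram of the arrangement of a translational relationship in SCC;
Fig. 13F is a schematic diagram of the arrangement of a non-translational relationship in SCC;
Fig. 14 is a schematic diagram of the arrangement of another optional image block to be encoded provided by an embodiment of the present application;
Fig. 15 is a schematic diagram of the arrangement of a first optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 16A is a schematic diagram of the arrangement of a second optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 16B is a schematic diagram of the arrangement of a third optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 17A is a schematic diagram of the arrangement of a third optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 17B is a schematic diagram of the arrangement of a fourth optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 18A is a schematic diagram of the arrangement of a fifth optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 18B is a schematic diagram of the arrangement of a sixth optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 19A is a schematic diagram of the arrangement of a seventh optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 19B is a schematic diagram of the arrangement of an eighth optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 19C is a schematic diagram of the arrangement of a ninth optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 19D is a schematic diagram of the arrangement of a tenth optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 20 is a schematic flowchart of an optional way of constructing the BV candidate list provided by an embodiment of the present application;
Fig. 21 is a schematic flowchart of an optional iterative operation method provided by an embodiment of the present application;
Fig. 22A is a schematic flowchart of an optional way of determining the coding mode provided by an embodiment of the present application;
Fig. 22B is a schematic flowchart of another optional way of determining the coding mode provided by an embodiment of the present application;
Fig. 23 is a schematic flowchart of an optional method for inferring the coding mode provided by an embodiment of the present application;
Fig. 24 is a schematic flowchart of another optional method for inferring the coding mode provided by an embodiment of the present application;
Fig. 25 is a schematic structural diagram of an optional encoder provided by an embodiment of the present application;
Fig. 26 is a schematic structural diagram of another optional encoder provided by an embodiment of the present application.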
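Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the related application and do not limit it. Note also that, for ease of description, the drawings show only the parts related to the application.
At present, in AVS3, SCC mainly uses Intra Mode, which performs predictive coding based on spatial texture. Taking the luma component of an image as an example, the coding unit types and related information when IntraCuFlag is 1 are shown in Table 1 below:
Table 1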
CuType SplitMode NumOfIntraPredBlock
I_2Mx2N I_NO_SPLIT 1
I_2M_hN HOR_tN 4
I_2M_nU HOR_UP 2
I_2M_nD HOR_DOWN 2
I_hM_2N VER_tN 4
I_nL_2N VER_LEFT 2
I_nR_2N VER_RIGHT 2
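As shown in Table 1, if IntraCuFlag of the current coding unit is 1, the prediction split mode (SplitMode) and the number of intra luma prediction blocks (NumOfIntraPredBlock) can be determined by looking up the coding unit type (CuType) in Table 1. SplitMode indicates how the coding unit is split into intra prediction blocks.
Fig. 1A shows a coding unit whose prediction split mode is "NO_SPLIT": the 2M×2N coding unit is not split and remains a single block. Fig. 1B shows the split mode "HOR_tN": the 2M×2N coding unit is split into 4 blocks (0, 1, 2, 3), each of size 2M×0.5N. Fig. 1C shows "HOR_UP": the coding unit is split into 2 blocks (0, 1) of sizes 2M×0.5N and 2M×1.5N. Fig. 1D shows "HOR_DOWN": the coding unit is split into 2 blocks (0, 1) of sizes 2M×1.5N and 2M×0.5N. Fig. 1E shows "VER_tN": the coding unit is split into 4 blocks (0, 1, 2, 3), each of size 0.5M×2N. Fig. 1F shows "VER_LEFT": the coding unit is split into 2 blocks (0, 1) of sizes 0.5M×2N and 1.5M×2N. Fig. 1G shows "VER_RIGHT": the coding unit is split into 2 blocks (0, 1) of sizes 1.5M×2N and 0.5M×2N.
In AVS3, the intra prediction mode is obtained from the value of IntraLumaPredMode, as shown in Table 2 below:
Table 2
IntraLumaPredMode Intra prediction mode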
0 Intra_Luma_DC
1 Intra_Luma_Plane
2 Intra_Luma_Bilinear
3~11 Intra_Luma_Angular
12 Intra_Luma_Vertical
13~23 Intra_Luma_Angular
24 Intra_Luma_Horizontal
25~32 Intra_Luma_Angular
33 Intra_Luma_PCM
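Fig. 2 is a schematic diagram of the intra prediction modes in AVS3. For each of the prediction modes shown in Fig. 2, the prediction error between the predicted value and the true value of the image block to be encoded can be calculated. The prediction mode with the smallest coding cost (error plus redundant information) is the best-matching prediction mode for that block.
In VVC, SCC has two main coding modes, the conventional Intra Mode and IBC Mode. IBC Mode is enabled only when encoding SCC sequences and Class F sequences, whereas Intra Mode applies to all sequences.
VVC partitions a coding tree unit (CTU) using quadtree (QT), ternary tree (TT), or binary tree (BT) splits. Fig. 3A shows a coding unit after QT splitting: VVC splits the CTU into 4 sub-blocks. Fig. 3B shows vertical BT splitting into 2 sub-blocks, and Fig. 3C shows horizontal BT splitting into 2 sub-blocks. Fig. 3D shows vertical TT splitting into 3 sub-blocks, and Fig. 3E shows horizontal TT splitting into 3 sub-blocks. In this way the CTU is divided into non-overlapping small blocks that serve as the basic units of coding, which are then encoded.
Fig. 4 is a schematic diagram of the arrangement of the 67 prediction directions. As shown in Fig. 4, VVC extends the number of angular intra prediction modes to 65 to better fit textures in different directions; together with the planar and DC modes there are 67 selectable prediction directions.
Based on Fig. 4, the prediction error between the prediction block and the true value can be calculated for each prediction direction. The direction with the smallest rate-distortion cost (Rdcost, equivalent to the coding cost) is the best-matching prediction direction for the block.
IBC Mode is enabled only when encoding SCC sequences and Class F sequences. It resembles the conventional inter mode: a motion vector (MV) establishes the reference relationship from the image block to be encoded to the reference image block, and the pixels of the image block to be encoded are predicted accordingly. The difference is that in IBC Mode the image block to be encoded and the reference image block are in the same frame. To distinguish it from the inter mode, a block vector (BV) is used to denote the reference relationship in IBC Mode.
Fig. 5 is a schematic diagram of the arrangement of the BV mapping relationship in IBC Mode. As shown in Fig. 5, IBC Mode performs motion estimation based on a translational motion model, and all pixels in a prediction unit (PU) share the same BV. Because of the limited prediction quality of the translational model, IBC Mode applies only to coding units whose width and height are both at most 64 pixels (for a large PU, having all pixels share one BV causes large prediction errors, increasing encoding time with poor results, so IBC Mode is restricted to small blocks).
AVS3 and VVC use a newer technique for inter prediction, the affine motion model (Affine). When the configuration file allows affine motion, the encoder can use affine motion compensation. Affine is a higher-order motion model that better describes complex motion of the image block to be encoded, such as scaling, rotation, deformation, perspective, and rotation by irregular angles. Affine comes in four-parameter and six-parameter variants.
Fig. 6A is a schematic diagram of the arrangement of an image block to be encoded that uses four-parameter Affine: only the MVs (v0, v1) of the upper-left and upper-right control points are needed. Fig. 6B is a schematic structural diagram of an image block to be encoded that uses six-parameter Affine: the MVs (v0, v1, v2) of the upper-left, upper-right, and lower-left control points are needed. The MV (vx, vy) of each pixel in the w×h image block to be encoded can then be derived from formula (1) or formula (2) below. When the per-pixel MVs differ, the effect is non-translational; when all per-pixel MVs are equal, the model reduces to the translational model.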
Figure PCTCN2019090594-appb-000001
Figure PCTCN2019090594-appb-000002
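Here, in four-parameter Affine, the upper-left control point v0 has coordinates (v0x, v0y) and the upper-right control point v1 has coordinates (v1x, v1y); compared with four-parameter Affine, six-parameter Affine adds a lower-left control point v2 with coordinates (v2x, v2y).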
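For reference, the control-point form of the four- and six-parameter affine models commonly used in VVC-style codecs is sketched below. Formulas (1) and (2) are not reproduced in this text, so it is an assumption that they take this form; here $(x, y)$ is a sample position relative to the upper-left corner of the $w \times h$ block.

```latex
% Four-parameter model (assumed form of formula (1))
v_x = \frac{v_{1x}-v_{0x}}{w}\,x - \frac{v_{1y}-v_{0y}}{w}\,y + v_{0x}, \qquad
v_y = \frac{v_{1y}-v_{0y}}{w}\,x + \frac{v_{1x}-v_{0x}}{w}\,y + v_{0y}

% Six-parameter model (assumed form of formula (2))
v_x = \frac{v_{1x}-v_{0x}}{w}\,x + \frac{v_{2x}-v_{0x}}{h}\,y + v_{0x}, \qquad
v_y = \frac{v_{1y}-v_{0y}}{w}\,x + \frac{v_{2y}-v_{0y}}{h}\,y + v_{0y}
```

In both forms, setting all control-point vectors equal makes every coefficient of $x$ and $y$ vanish, which matches the statement that equal MVs reduce the model to pure translation.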
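In addition, AVS3 uses a sub-block-based affine transform prediction method. Fig. 7 is a schematic diagram of the arrangement of the sub-block MVs obtained by this method. As shown in Fig. 7, if the image block to be encoded is in affine mode (AffineFlag is 1), each equally sized sub-block in the block shares one MV (computed from formula (1) or (2)) instead of computing an MV for every pixel. This achieves the non-translational effect while effectively reducing coding complexity. The sub-block size is controlled by AffineSubblockSizeFlag: when AffineSubblockSizeFlag is 0 the sub-block size is 4×4, otherwise it is 8×8.
To simplify motion-compensated prediction, VVC also uses a sub-block-based affine transform prediction method. That is, each 4×4 sub-block in the image block to be encoded shares one MV instead of computing an MV for every pixel, which achieves the non-translational effect while effectively reducing coding complexity.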
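The sub-block derivation can be illustrated with a minimal Python sketch. It assumes the control-point form of the affine model commonly used in VVC-style codecs and evaluates it at each sub-block centre; the function names `affine_mv` and `subblock_mvs`, and the choice of the centre as the evaluation point, are illustrative assumptions rather than details taken from the application.

```python
def affine_mv(x, y, w, h, v0, v1, v2=None):
    """Vector at position (x, y) inside a w x h block, from control-point vectors.

    v0, v1 are the upper-left and upper-right control-point vectors; if v2
    (lower-left) is given the six-parameter model is used, otherwise the
    four-parameter model. Assumed model form, not quoted from the source.
    """
    a = (v1[0] - v0[0]) / w
    b = (v1[1] - v0[1]) / w
    if v2 is None:
        # Four-parameter model: rotation/zoom is fully determined by v0 and v1.
        c, d = -b, a
    else:
        c = (v2[0] - v0[0]) / h
        d = (v2[1] - v0[1]) / h
    return (a * x + c * y + v0[0], b * x + d * y + v0[1])


def subblock_mvs(w, h, v0, v1, v2=None, sub=4):
    """One vector per sub x sub sub-block, evaluated at the sub-block centre."""
    mvs = {}
    for sy in range(0, h, sub):
        for sx in range(0, w, sub):
            mvs[(sx, sy)] = affine_mv(sx + sub / 2, sy + sub / 2, w, h, v0, v1, v2)
    return mvs
```

With equal control-point vectors every sub-block receives the same vector, which is the translational special case; passing `sub=8` corresponds to the 8×8 sub-block size mentioned for AVS3.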
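In other words, in AVS3 and VVC the conventional Intra Mode predicts only from spatial texture and does not exploit a characteristic of SCC: compared with naturally captured video, the texture correlation of SCC images is weaker, but repeated or similar regions are more common.
Moreover, in VVC, although IBC Mode can exploit the many repeated regions, it still performs intra prediction based on a translational motion model, in which all pixels in a coding unit share the same MV. SCC scenes still contain complex motion such as scaling, rotation, deformation, and perspective transformation, so prediction based on the translational model has large errors. In addition, when non-translational complex motion such as rotation or scaling is present, an encoder using translational IBC Mode tends, in order to maintain image quality, to split an object into very small units and approximate the complex motion with the translational motion of those units. This splitting and prediction introduces a lot of redundant information, such as split information, which degrades compression performance.
Fig. 8 is a schematic diagram of the test result for the first frame image of an SCC test sequence on an instrument. As shown in Fig. 8, under the translational motion model the current coding unit can only search the already-encoded area for a unit of equal size, without rotation, deformation, or other complex motion, as its reference.
Fig. 9A is a schematic diagram of the mapping between a coding unit and a reference unit under a scaling model. For any two rectangles A and B of unequal size, the ideal reference process is shown in Fig. 9A: rectangle B can be encoded directly with A as its reference unit through a non-translational model (including scaling). Under the translational motion model, however, the actual reference process is the one shown in Fig. 9B, a schematic diagram of the mapping between a coding unit and a reference unit under a translational model: the larger coding unit has to be split into units B, C, and D of the same size as A before each can be encoded with A as its mapped reference. This introduces a large amount of redundant split information and motion parameters and significantly degrades coding performance.
Furthermore, in content such as stock charts, a line may rise, fall, and oscillate, so repeated regions within the same frame are unlikely. In such cases the translational motion model of IBC Mode performs poorly.
To bring the determined predicted value of the image block to be encoded closer to its true value, an embodiment of the present application provides a method for determining a predicted value. Fig. 10 is a schematic flowchart of an optional method for determining a predicted value provided by an embodiment of the present application. As shown in Fig. 10, the method is applied to an encoder and may include:
S1001: Obtain an image block to be encoded and an encoded image block from the current image frame;
Here, for AVS3 and VVC, Affine is introduced on top of intra prediction to provide new coding modes. In AVS3 the new mode is called Intra Affine Mode; in VVC it is called Intra Affine CPR Mode, where CPR stands for Current Picture Referencing.
Intra Affine Mode and Intra Affine CPR Mode are coding modes added alongside the original Intra Mode. Each runs independently of the original Intra Mode, without interference, and competes with it; the encoding process computes the Rdcost and determines the best coding mode according to it.
In the new coding modes, when encoding the current image frame, the encoder first obtains the image block to be encoded and the encoded image block, and then determines the predicted value of the image block to be encoded from the pixel values of the encoded image block.
The pixel values may be those of the luma component or of the chroma component of the image block; the embodiments of the present application do not specifically limit this.
S1002: Construct a BV candidate list for the image block to be encoded according to the encoded image block;
To determine the predicted value of the image block to be encoded, a BV candidate list must first be constructed for it.
Specifically, before the BV candidate list is constructed, the image block to be encoded is screened to determine whether Intra Affine Mode or Intra Affine CPR Mode is applicable to it. In an optional embodiment, S1002 may include:
determining whether the length and width of the image block to be encoded are both greater than or equal to 16 pixels;
when the length and width of the image block to be encoded are both greater than or equal to 16 pixels, constructing a BV candidate list for the image block to be encoded according to the encoded image block;
when the length or width of the image block to be encoded is less than 16 pixels, encoding the image block to be encoded in the intra mode to obtain its predicted value.
Here, after the image block to be encoded is obtained, its length and its width are each compared with 16 pixels. Only when both the length and the width are greater than or equal to 16 pixels is Intra Affine Mode or Intra Affine CPR Mode used, that is, a BV candidate list is constructed for the image block to be encoded according to the encoded image block; otherwise a conventional coding mode is used.
Note that Intra-Affine Mode as proposed in the embodiments of the present application selects only image blocks whose length and width are both at least 16 pixels, for the following reasons:
Compared with the translational model, Affine predicts large blocks well and reduces the prediction error, and the redundant information carried by coding units, such as split information and motion parameters, drops substantially, which helps lower the bit rate. To reduce complexity, coding units whose length and/or width is less than 16 skip Intra-Affine Mode. In addition, in Affine every equally sized sub-block has its own independent BV that can be derived and need not be transmitted, which gives sufficient prediction accuracy, so further splitting brings little benefit.
In SCC, the main coding mode of AVS3 is the conventional Intra Mode. In an optional embodiment, when the coding mode used by the encoder is the intra mode, S1002 may include:
obtaining a preset area adjacent to the image block to be encoded;
searching the preset area according to the starting control points set for the preset area and a preset search step, to find control points for the image block to be encoded;
calculating searched BVs for the image block to be encoded from the found control points and the control points of the image block to be encoded;
adding the searched BVs to the BV candidate list.
Specifically, when constructing the BV candidate list, an initial BV candidate list can be built by reusing the motion parameter information of encoded spatially adjacent image blocks, provided that at least one of the adjacent image blocks is an encoded image block whose coding mode is Intra-Affine Mode. Building the list this way requires continually inheriting motion parameters from the already-encoded area of the current frame, which raises the following problems: the first coding unit to enter Intra Affine Mode cannot build a BV candidate list from encoded spatially adjacent blocks and therefore cannot use Intra Affine Mode; moreover, a list built only from adjacent image blocks may contain too few candidates to find the optimal BV. The embodiments of the present application therefore propose a blind search, performed within the largest coding unit (LCU) containing the current coding unit, to further expand the BV candidate list of the image block to be encoded.
Note that, for AVS3, the first image block to be encoded that uses Intra Affine Mode during encoding must build its BV candidate list by blind search.
For example, Fig. 11A is a schematic diagram of a blind search in a preset area using four-parameter Affine, and Fig. 11B is a schematic diagram of a blind search in a preset area using six-parameter Affine. Taking four-parameter Affine as an example, the blind search proceeds as follows:
A coordinate system is established with the first pixel at the upper-left corner of the LCU containing the current coding unit (the image block to be encoded) as the origin, the width direction as the horizontal axis, and the height direction as the vertical axis. The current LCU has size W×H and the current coding unit has size w×h. The upper-left control point (LT) of the current coding unit is at pixel position (x1, y1) and the upper-right control point (RT) is at (x2, y2), with (x2, y2) = (x1+w, y1).
As shown in Fig. 11A, the preset area consists of encoded area A and encoded area B. Area A spans the horizontal range [0, W-1] and the vertical range [0, y-1]; area B spans the horizontal range [0, x-1] and the vertical range [y, y+H-1].
The reference points of control points LT and RT of the current coding unit are searched for in areas A and B and denoted Ref_LT(x01, y01) and Ref_RT(x02, y02). The BV of the LT control point (vLT) is calculated from the coordinates of LT and Ref_LT, and the BV of the RT control point (vRT) is obtained in the same way. The BVs of the two control points together form one BV (vLT, vRT), which serves as the search starting point for motion estimation, as follows: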
Figure PCTCN2019090594-appb-000003
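When searching for Ref_LT(x01, y01) and Ref_RT(x02, y02), because the smallest current coding unit is 16×16, the search step can be set as required with the time complexity of the search in mind. A smaller search step gives finer prediction but takes longer. The embodiments of the present application do not specifically limit the value of the search step.
Taking a search step of 8 pixels as an example, the search proceeds as follows:
First, set the search starting point of Ref_LT, namely the pixel at the upper-left corner of the area (A or B) in which Ref_LT lies;
Then, with Ref_LT fixed, search for Ref_RT by shifting it right or down in steps of 8 pixels. When Ref_LT and Ref_RT are in the same area, the search for Ref_RT starts from the pixel position of Ref_LT; otherwise it starts from the pixel at the upper-left corner of the area (A or B) in which Ref_RT lies. During the search, Ref_RT is constrained never to lie to the left of Ref_LT;
When Ref_RT has finished searching its area, update Ref_LT by shifting it right or down by the 8-pixel search step, then return to the previous step to search for the Ref_RT corresponding to the current Ref_LT;
The blind search ends when Ref_LT has finished searching areas A and B;
Note that the blind search in the preset area with six-parameter Affine in Fig. 11B is similar to the four-parameter blind search described above and is not repeated here.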
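The enumeration of candidate reference points can be sketched as follows. This is a simplified illustration, not the procedure of the application: it visits every pair of grid points in the given areas in raster order and only enforces that Ref_RT is not to the left of Ref_LT, and it assumes that a control point's BV is the reference position minus the control-point position (formula (3) is not reproduced in this text). The names `grid_points` and `blind_search_candidates` are hypothetical.

```python
def grid_points(region, step=8):
    """Grid samples of a rectangle (x0, y0, x1, y1), bounds inclusive, raster order."""
    x0, y0, x1, y1 = region
    return [(x, y) for y in range(y0, y1 + 1, step) for x in range(x0, x1 + 1, step)]


def blind_search_candidates(lt, rt, regions, step=8):
    """Yield candidate (v_LT, v_RT) pairs for control points lt and rt.

    Simplification: every pair of grid points is visited; the only constraint
    kept from the text is that Ref_RT must not lie to the left of Ref_LT.
    """
    points = [p for region in regions for p in grid_points(region, step)]
    for ref_lt in points:
        for ref_rt in points:
            if ref_rt[0] < ref_lt[0]:
                continue  # Ref_RT may not be left of Ref_LT
            # Assumed BV definition: reference position minus control point.
            v_lt = (ref_lt[0] - lt[0], ref_lt[1] - lt[1])
            v_rt = (ref_rt[0] - rt[0], ref_rt[1] - rt[1])
            yield (v_lt, v_rt)
```

A smaller `step` enumerates more pairs, which mirrors the trade-off between prediction accuracy and search time described above.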
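For AVS3, in an optional embodiment, S1002 may include:
selecting the adjacent image blocks of the image block to be encoded from the encoded image blocks;
adding the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from searched BVs.
Specifically, Fig. 12 is a schematic diagram of the arrangement of an optional image block to be encoded provided by an embodiment of the present application. As shown in Fig. 12, among the encoded image blocks, the adjacent image blocks of the image block to be encoded include A, B, C, D, G, and F. If the coding mode of an adjacent image block is Intra Affine Mode, its BV is reused directly and put into the BV candidate list.
For example, if the BV of an adjacent image block was obtained by blind search, the adjacent image block was coded in Intra Affine Mode, so its BV is added to the BV candidate list.
For adjacent image blocks coded in Intra Affine Mode, in an optional embodiment, after the adjacent image blocks of the image block to be encoded are selected from the encoded image blocks, the method may further include:
adding the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from searched BVs and/or from the BV candidate lists of the adjacent image block's own adjacent image blocks.
Here, if the coding mode of the adjacent image block is Intra Affine Mode, its BV candidate list consists of searched BVs and/or the BV candidate lists of its own adjacent image blocks. In other words, an adjacent image block in Intra Affine Mode obtained its BV candidate list by blind search and/or from the BV candidate lists of its own neighbors, so its BV candidate list is added to the BV candidate list.
In the embodiments of the present application, taking SCC in AVS3 as an example, in Intra Affine Mode with four-parameter Affine: Fig. 13A is a schematic diagram of the arrangement of BVs with a translational relationship under four-parameter Affine; as shown in Fig. 13A, if the BVs of the upper-left control point (LT) and the upper-right control point (RT) of the current coding block are equal, the reference relationship is translational. Fig. 13B is a schematic diagram of the arrangement of BVs with a non-translational relationship under four-parameter Affine; as shown in Fig. 13B, if they are not equal, the reference relationship is a non-translational complex motion such as scaling or rotation. Similarly, for six-parameter Affine: Fig. 13C shows the translational case, in which the BVs of LT, RT, and the lower-left control point (LB) of the current coding block are equal; Fig. 13D shows the non-translational case, in which they are not equal and the reference relationship is a non-translational complex motion such as scaling or rotation.
Under a translational reference relationship, the many repeated regions in SCC sequences yield bit-rate savings. In this case the sub-block BVs derived within a coding unit from formula (1) or (2) are all identical. Fig. 13E is a schematic diagram of the arrangement of a translational relationship in SCC: as shown in Fig. 13E, the current coding unit refers to a coding unit of exactly the same width and height located in the encoded area of the current frame.
When complex motion such as scaling, rotation, deformation, or perspective transformation is present, translational prediction has large errors, but Affine, being a higher-order motion model, can represent non-translational reference relationships. Under a non-translational reference relationship, the many similar regions in SCC sequences yield bit-rate savings. In this case the sub-block BVs derived within a coding unit from formula (1) or (2) differ from one another so as to model complex motion such as scaling, rotation, and perspective. Fig. 13F is a schematic diagram of the arrangement of a non-translational relationship in SCC: as shown in Fig. 13F, the current coding unit refers to a coding unit of different width and height located in the encoded area of the current frame (each sub-block has a different BV, and the referenced sub-blocks may not lie in the same coding unit).
In this way, Intra Affine Mode realizes non-translational reference relationships for SCC, which can improve the accuracy of intra prediction and the coding efficiency. The technique can be used in common SCC scenarios such as game live streaming, remote desktop, and online gaming, and has good application prospects.
Likewise, for SCC in VVC, Intra Affine CPR Mode can also realize non-translational reference relationships, improving the accuracy of intra prediction and the coding efficiency, and it can be used in the same common SCC scenarios.
For VVC, the main coding modes used by SCC are Intra Mode and IBC Mode. In an optional embodiment, when the coding modes used by the encoder are the intra mode and the intra block copy (IBC) mode, S1002 may include:
selecting the adjacent image blocks of the image block to be encoded from the encoded image blocks;
adding the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from the BV of the adjacent image block's reference image block.
Specifically, because VVC can use IBC Mode, if the coding mode of an adjacent image block is IBC Mode, that is, if the BV candidate list of the adjacent image block is constructed from the BV of its reference image block, the BV candidate list of the adjacent image block is added to the BV candidate list. The mode that encodes with a BV candidate list constructed in this way is called Intra Affine CPR Mode.
Fig. 14 is a schematic diagram of the arrangement of another optional image block to be encoded provided by an embodiment of the present application. As shown in Fig. 14, the BV candidate list can be obtained from the BVs of the encoded PUs adjacent to CurPU. The adjacent PUs are visited in the following order: the left adjacent PU (A0), the lower-left adjacent PU (A1), the upper adjacent PU (B0), the upper-right adjacent PU (B1), and then the upper-left adjacent PU (B2).
Fig. 15 is a schematic diagram of the arrangement of a first optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 15, four-parameter Affine is used. When the coding mode of the adjacent PU is IBC Mode, the BV candidate list of the adjacent image block is copied to all control points of the current PU. Although the initial BV candidate lists of the control points are identical, subsequent iterative updates adjust the BV candidate list to achieve the effect of a non-translational model.
Fig. 16A is a schematic diagram of the arrangement of a second optional way of constructing the BV candidate list provided by an embodiment of the present application, using four-parameter Affine. If the adjacent PU is in Intra Affine CPR Mode, the BV candidate list of the adjacent PU can be reused directly in the BV candidate list, as shown in Fig. 16A.
Fig. 16B is a schematic diagram of the arrangement of a third optional way of constructing the BV candidate list provided by an embodiment of the present application, using six-parameter Affine. If the adjacent PU is in Intra Affine CPR Mode, the BV candidate list of the adjacent PU can be reused directly in the BV candidate list, as shown in Fig. 16B.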
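To construct a BV candidate list for the image block to be encoded in VVC, in an optional embodiment, S1002 may include:
selecting the adjacent image blocks of the image block to be encoded from the encoded image blocks;
adding the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from the BV candidate lists of the adjacent image block's own adjacent image blocks.
Here, when the BV candidate list of the selected adjacent image block is constructed from the BV candidate lists of its own adjacent image blocks, those blocks used IBC Mode or Intra Affine CPR Mode, so the BV candidate list of the adjacent image block is added to the BV candidate list.
For VVC, because SCC can use IBC Mode, the BV candidate lists of adjacent image blocks coded in IBC Mode can be reused directly. Blind search is therefore an optional, not a mandatory, way of constructing the BV candidate list in VVC. In an optional embodiment, S1002 may include:
obtaining a preset area adjacent to the image block to be encoded;
searching the preset area according to the starting control points set for the preset area and a preset search step, to find control points for the image block to be encoded;
calculating searched BVs for the image block to be encoded from the found control points and the control points of the image block to be encoded;
adding the searched BVs to the BV candidate list.
Note that the blind search method in VVC is the same as in AVS3 and is not repeated here.
In VVC, blind search is an optional way of constructing the BV candidate list, whereas in AVS3 it is a mandatory one.
To construct a BV candidate list for the image block to be encoded in VVC, in an optional embodiment, S1002 may include:
selecting the adjacent image blocks of the image block to be encoded from the encoded image blocks;
adding the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from the BV of the adjacent image block's reference image block, and/or the BV candidate lists of the adjacent image block's own adjacent image blocks, and/or searched BVs.
Here, if the coding mode of the adjacent image block is Intra Affine CPR Mode or IBC Mode, its BV candidate list consists of searched BVs, and/or the BV candidate lists of its own adjacent image blocks, and/or the BV of its reference image block. In other words, when the adjacent image block uses Intra Affine CPR Mode or IBC Mode, its BV candidate list can be added to the BV candidate list.
To construct a BV candidate list for the image block to be encoded in VVC, in an optional embodiment, S1002 may include:
calculating the control points of the reference image block of the adjacent image block from the BV of the adjacent image block;
calculating the BV between the image block to be encoded and the reference image block of the adjacent image block from the control points of that reference image block and the control points of the image block to be encoded;
adding the BV between the image block to be encoded and the reference image block of the adjacent image block to the BV candidate list.
Specifically, when the adjacent image block uses Intra Affine CPR Mode, its BV is obtained first. The BV of the adjacent image block is the vector from the control points of the adjacent image block to the control points of its reference image block, so once the BV is known, the control points of the reference image block of the adjacent image block can be calculated. The vector from the control points of the image block to be encoded to the control points of that reference image block is then calculated, giving the BV between the image block to be encoded and the reference image block of the adjacent image block, which is added to the BV candidate list.
For example, Fig. 17A is a schematic diagram of the arrangement of a third optional way of constructing the BV candidate list provided by an embodiment of the present application. Taking four-parameter Affine as an example, as shown in Fig. 17A, if the adjacent image block is in Intra Affine CPR Mode, the upper-left and upper-right control points of its reference image block can be reused directly; the BVs from the two control points of the image block to be encoded to the two control points of that reference image block are inferred from the spatial positions and added to the BV candidate list. Taking six-parameter Affine as an example, Fig. 17B is a schematic diagram of the arrangement of a fourth optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 17B, if the adjacent image block is in Intra Affine CPR Mode, the upper-left, upper-right, and lower-left control points of its reference image block can be reused directly; the BVs from the three control points of the image block to be encoded to the three control points of that reference image block are inferred from the spatial positions and added to the BV candidate list.
In addition, when the adjacent image block uses IBC Mode, its BV is obtained first. The BV of the adjacent image block is the vector from the upper-left control point of the adjacent image block to the upper-left control point of its reference image block, so once the BV is known, the control points of the reference image block of the adjacent image block can be calculated. The vector from the control points of the image block to be encoded to the control points of that reference image block is then calculated, giving the BV between the image block to be encoded and the reference image block of the adjacent image block, which is added to the BV candidate list.
For example, Fig. 18A is a schematic diagram of the arrangement of a fifth optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 18A, when the coding mode of the adjacent image block (NeiPU in Fig. 18A) is IBC Mode, the BVs from the upper-left and upper-right control points of the image block to be encoded (CurPU in Fig. 18A) to the upper-left and upper-right control points of the reference PU of NeiPU (NeiRefPU in Fig. 18A) are added to the BV candidate list. Fig. 18B is a schematic diagram of the arrangement of a sixth optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 18B, when the coding mode of the adjacent image block (NeiPU in Fig. 18B) is IBC Mode, the BVs from the upper-left, upper-right, and lower-left control points of the image block to be encoded (CurPU in Fig. 18B) to the upper-left, upper-right, and lower-left control points of the reference PU of NeiPU (NeiRefPU in Fig. 18B) are added to the BV candidate list.
Note that, although the reference control points are the same as those of the adjacent PU, the width and height of the current PU and the adjacent PU are not necessarily the same, so the reference sub-block that each sub-block points to, as later inferred through Affine, will differ, which achieves the non-translational effect.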
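The vector arithmetic behind this reuse of a neighbor's reference control points can be sketched as below. The sketch assumes a BV is the displacement from a control point to its reference position; the function name `inherit_bv` and the list-of-tuples layout are illustrative, not taken from the application.

```python
def inherit_bv(cur_ctrl, nei_ctrl, nei_bv):
    """Derive candidate BVs for the current block from a neighbor's BVs.

    cur_ctrl, nei_ctrl: control-point positions of the current and neighbor
    blocks, in the same order (e.g. upper-left, upper-right).
    nei_bv: one BV per neighbor control point (assumed: reference position
    minus control-point position).
    """
    # Control points of the neighbor's reference block.
    ref_ctrl = [(p[0] + v[0], p[1] + v[1]) for p, v in zip(nei_ctrl, nei_bv)]
    # Vectors from the current block's control points to those reference points.
    return [(r[0] - c[0], r[1] - c[1]) for r, c in zip(ref_ctrl, cur_ctrl)]
```

For a neighbor coded in IBC Mode, which carries a single BV, the same call can be used by repeating that BV for every control point, for example `nei_bv = [bv, bv]` in the four-parameter case.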
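To construct a BV candidate list for the image block to be encoded in VVC, in an optional embodiment, S1002 may include:
calculating an indirect reference image block of the image block to be encoded by applying the BV of the adjacent image block to the control points of the image block to be encoded;
calculating the BV between the image block to be encoded and the indirect reference image block by vector addition and subtraction, from the BV of the adjacent image block, the control points of the indirect reference image block, the BV of the indirect reference image block, and the control points of the image block to be encoded;
adding the BV between the image block to be encoded and the indirect reference image block to the BV candidate list.
Specifically, after the BV of the adjacent image block is obtained, the image block to be encoded can reuse it to find its indirect reference image block, and the BV of the indirect reference image block is obtained. Finally, the BV between the image block to be encoded and the indirect reference image block is calculated by vector addition and subtraction and added to the BV candidate list.
For example, Fig. 19A is a schematic diagram of the arrangement of a seventh optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 19A, LT denotes the upper-left control point of CurPU, RT denotes the upper-right control point of CurPU, and (v0, v1) is the BV of NeiPU. When CurPU can find indirect reference image blocks through the BV of NeiPU, denoted PU0 and PU1 in Fig. 19A, the BV (vl1, vr1) obtained from the indirect reference image blocks can be expressed by the following formula: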
Figure PCTCN2019090594-appb-000004
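The vectors in formula (4) are shown in Fig. 19A; (w_pu0, 0) - (w_Cur_pu, 0) denotes the vector from RT to an intermediate point in the direction of LT, and its length is w_Cur_pu - w_pu0.
Fig. 19B is a schematic diagram of the arrangement of an eighth optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 19B, LT denotes the upper-left control point of CurPU, RT denotes the upper-right control point of CurPU, and (v0, v1) is the BV of NeiPU. When CurPU can find indirect reference image blocks through the BV of NeiPU, denoted PU0 and PU1 in Fig. 19B, the BV (vl2, vr2) obtained from the indirect reference image blocks is expressed by the following formula: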
Figure PCTCN2019090594-appb-000005
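The vectors in formula (5) are shown in Fig. 19B; they are analogous to those in Fig. 19A and are not described again here.
Fig. 19C is a schematic diagram of the arrangement of a ninth optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 19C, LT denotes the upper-left control point of CurPU, RT denotes the upper-right control point of CurPU, and (v0, v1) is the BV of NeiPU. When CurPU can find indirect reference image blocks through the BV of NeiPU, denoted PU0 and PU1 in Fig. 19C, the BV (vl3, vr3) obtained from the indirect reference image blocks is expressed by the following formula: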
Figure PCTCN2019090594-appb-000006
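The vectors in formula (6) are shown in Fig. 19C; they are analogous to those in Fig. 19A and are not described again here.
As shown in Figs. 19A to 19C above, the current PU (CurPU) first reuses the BV (v0, v1) of the spatially adjacent PU (NeiPU) to find PU0 and PU1, to which v0 and v1 point. Then, from the BV (v00, v01) of PU0 and the BV (v10, v11) of PU1, the BV of the current PU is derived using formula (4), (5), or (6) above and entered into the BV candidate list.
Fig. 19D is a schematic diagram of the arrangement of a tenth optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 19D, LT denotes the upper-left control point of CurPU, RT denotes the upper-right control point of CurPU, and (v0, v1) is the BV of NeiPU. When CurPU can find indirect reference image blocks through the BV of NeiPU, denoted PU0 and PU1 in Fig. 19D, the BV obtained from the indirect reference image blocks is expressed by the following formula: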
Figure PCTCN2019090594-appb-000007
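Note that NeiPU, PU0, and PU1 shown in Figs. 19A to 19C all use Intra Affine CPR Mode. If one of these PUs is in IBC Mode, this indirect reuse method can still be applied; it is only necessary to copy its BV to all the other control points, as shown in Fig. 19D. The other cases are handled in the same way, as combinations of IBC Mode and Intra-Affine CPR Mode, and are not described again here.
Constructing the BV candidate list in VVC requires continually inheriting motion parameters from the already-encoded area of the current frame, which raises the following problem. The proposed Intra Affine CPR Mode is a new mode alongside IBC Mode and the conventional Intra Mode: fully independent, non-interfering, and competing with them. If IBC Mode is turned off, BVs can no longer be reused directly or indirectly, so the first coding unit to enter Intra Affine CPR Mode cannot inherit a starting BV for motion estimation as described above and cannot perform the subsequent iterative update to find the optimal BV; step by step, none of the later coding units can inherit either. In other words, Intra Affine CPR Mode works well only if IBC Mode is also enabled. In addition, the BVs found only by inheriting motion estimation starting points as above may be too few to find the optimal BV, so blind search can be used in VVC.
Note a significant difference between VVC and AVS3: AVS3 has no IBC Mode, so without the blind search process no starting point can be found at all and the subsequent iterative update cannot be performed. AVS3 must therefore include the blind search process. VVC, when IBC Mode is enabled, can reuse the BVs of IBC Mode and adjust them to serve as starting BVs, so VVC does not necessarily include the blind search process.
For VVC, Fig. 20 is a schematic flowchart of an optional way of constructing the BV candidate list provided by an embodiment of the present application. As shown in Fig. 20, constructing the BV candidate list can broadly be divided into three approaches: direct reuse, indirect reuse, and blind search. The steps are as follows:
S2001: Clear the initial BV candidate list and set cnt_ind_inherit = 0 and cnt_inherit = 0; perform S2002 and S2003;
S2002: Determine whether the coding mode of the spatially adjacent PU is IBC Mode or Intra Affine CPR Mode. If yes, perform S2004 and S2005; if no, end.
S2003: Construct the BV candidate list by blind search and encode according to the constructed candidate list; perform S2006. The blind search is the same as described above and is not repeated here.
S2004: Construct the BV candidate list by indirect reuse and encode according to the constructed candidate list; perform S2006;
S2005: Construct the BV candidate list by direct reuse and encode according to the constructed candidate list; perform S2006;
S2006: Update cnt to cnt_inherit + cnt_ind_inherit + cnt_search, then end.
The specific steps of direct reuse are as described for Figs. 15, 16A, 16B, 17A, 17B, 18A, and 18B above; the specific steps of indirect reuse are as described for Figs. 19A to 19D above; blind search is the same as the blind search process in the other parts of the embodiments of the present application and is not repeated here.
For blind search, once BVs have been found they can all be added to the BV candidate list, or they can be added selectively according to the coding cost of each searched BV. In an optional embodiment, after the searched BVs are calculated for the image block to be encoded from the found control points and the control points of the image block to be encoded, the method may further include:
calculating the first coding cost incurred by encoding the image block to be encoded with each of the searched BVs;
ranking the BVs by first coding cost and selecting a preset number of them (in the example below, those with the smallest cost);
adding the preset number of BVs to the BV candidate list.
The first coding cost comprises the error between the true value and the predicted value of the image block to be encoded, together with the redundant information in the encoding process.
Taking Fig. 11A again as an example, after the searched BVs are obtained, a corresponding BV candidate list is built for each pair of Ref_LT and Ref_RT found during the search. The cost of each BV is computed directly without iterative search (the cost jointly accounts for the prediction error and other overhead, such as the motion parameters introduced), and all BVs in the searched BV candidate lists are sorted by cost;
After all the search variants are complete, only the 16 BVs with the smallest cost are kept as starting points for motion estimation and added to the initial BV candidate list.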
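The pruning step can be sketched in a few lines of Python. `keep_best` and `cost_fn` are hypothetical names; `cost_fn` stands in for whatever cost measure the encoder uses (prediction error plus side-information overhead), which is not specified in code form in the source.

```python
import heapq


def keep_best(candidates, cost_fn, keep=16):
    """Return the `keep` candidates with the smallest cost, cheapest first.

    cost_fn is a placeholder for the encoder's cost measure; it is called
    once per candidate.
    """
    return heapq.nsmallest(keep, candidates, key=cost_fn)
```

`heapq.nsmallest` avoids fully sorting the candidate set, which matters when the blind search has produced many more than 16 BVs.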
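For AVS3 (or VVC), the blind search within the LCU (called CTU in VVC) for six-parameter Affine is similar to that for four parameters, with the additional search for Ref_LB, the reference point of the lower-left control point LB of the current coding unit. The search for Ref_LB is similar to that for Ref_RT and is not described again here.
S1003: Perform an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain an iterated BV candidate list;
Here, an iteration formula and an iteration threshold are set in advance, and each BV in the BV candidate list is iterated to obtain the iterated BV candidate list. In an optional embodiment, S1003 may include:
determining whether the iteration count i is less than or equal to a preset iteration threshold, where the initial value of i is 1;
when i is less than or equal to the iteration threshold, substituting each BV in the BV candidate list into the error formula between the true value and the predicted value of the image block to be encoded, and applying the least squares method to update the affine motion model parameters in the error formula and the BV candidate list;
when the iteration count is greater than the preset iteration threshold, taking the updated BV candidate list as the iterated BV candidate list, and updating the parameters of the affine motion model with the updated affine motion model parameters from the error formula;
updating i to i+1 and returning to the step of determining whether the iteration count i is less than or equal to the preset iteration threshold.
Specifically, let p(xi, yi) denote the pixel value at (xi, yi) in the image block to be encoded, p'(xi, yi) the pixel value at the corresponding position in the reference image block, and (Δx, Δy) the displacement vector from this pixel to its reference pixel. The predicted value of this pixel in the image block to be encoded,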
Figure PCTCN2019090594-appb-000008
can be expressed as follows:
Figure PCTCN2019090594-appb-000009
where
Figure PCTCN2019090594-appb-000010
Figure PCTCN2019090594-appb-000011
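denote the gradients of the image in the horizontal and vertical directions respectively, which can be computed with the conventional Sobel operator.
The error formula between the true value and the predicted value of the image block to be encoded can be expressed as: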
Figure PCTCN2019090594-appb-000012
where A = [a ω0 b ω1]^T is the parameter vector of the Affine model.
By the least squares method, minimizing the prediction error gives:
Figure PCTCN2019090594-appb-000013
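Based on formulas (9) and (10), the Affine parameters and the BV candidate list can be updated to obtain the iterated BV candidate list.
Fig. 21 is a schematic flowchart of an optional iterative operation method provided by an embodiment of the present application. As shown in Fig. 21, the maximum numbers of iterations for four-parameter and six-parameter Affine are set to 5 and 4 respectively. Taking four parameters as an example, the iterative operation proceeds as follows:
S2101: Obtain each BV in the BV candidate list and the iteration threshold, with the iteration count cnt = 0;
S2102: Substitute each BV into formula (9) to obtain the error formula for each BV;
S2103: According to formula (10), compute the gradient matrix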
Figure PCTCN2019090594-appb-000014
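and the error d(xi, yi) to update the parameter matrix A and the BV; update the BV candidate list with the BV updated in this iteration, and update the Affine parameters with the parameter matrix A updated in this iteration.
S2104: Increment the iteration count by 1; when the iteration count is less than or equal to the iteration threshold, return to step S2102;
S2105: When the iteration count is greater than the iteration threshold, output the iterated BV candidate list.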
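Only the control flow of steps S2101 to S2105 is sketched below, since formulas (9) and (10) are not reproduced in this text. `update_fn` is a hypothetical placeholder for one least-squares update of a candidate, and the sketch assumes a fixed number of passes equal to the iteration cap (5 for four-parameter and 4 for six-parameter Affine, as stated above); the exact off-by-one behaviour of the counter in Fig. 21 is not modelled.

```python
# Iteration caps stated in the text, keyed by the number of affine parameters.
MAX_ITER = {4: 5, 6: 4}


def refine_candidates(candidates, update_fn, max_iter):
    """Apply update_fn to every candidate BV, max_iter times in total.

    update_fn stands in for one least-squares refinement step; it takes a
    candidate and returns the refined candidate.
    """
    refined = list(candidates)
    for _ in range(max_iter):
        refined = [update_fn(bv) for bv in refined]
    return refined
```

An encoder would typically also stop early once a candidate no longer changes; that refinement is omitted here because the source describes only the fixed iteration threshold.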
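S1004: Select the BV of the image block to be encoded from the iterated BV candidate list;
To select the best BV as the BV of the image block to be encoded, in an optional embodiment, S1004 may include:
calculating the second coding cost incurred by encoding the image block to be encoded with each BV in the iterated BV candidate list;
taking the BV corresponding to the minimum second coding cost as the BV of the image block to be encoded.
Specifically, for AVS3, when Intra Affine Mode is chosen for encoding, the best BV is determined by computing the coding cost for each BV and selecting the BV with the minimum cost as the BV of the image block to be encoded, which reduces the coding cost of encoding in Intra Affine Mode.
Fig. 22A is a schematic flowchart of an optional way of determining the coding mode provided by an embodiment of the present application. As shown in Fig. 22A, the steps are as follows:
S22A1: Input the current coding unit; perform S22A2 and S22A3;
S22A2: Encode with the conventional Intra Mode; perform S22A6;
S22A3: Determine whether the length H and width W of the image block to be encoded are both greater than or equal to 16 pixels;
S22A4: If both are, encode with Intra Affine Mode; specifically, first build the initial BV candidate list;
S22A5: Starting from the initial BV candidate list, iteratively search for the best BV;
S22A6: Compute the coding costs of Intra Mode and Intra Affine Mode, select the coding mode with the minimum coding cost, and take it as the final coding mode.
In other words, for AVS3, after the image block to be encoded is obtained, it is encoded with the conventional Intra Mode and, if its length and width are both at least 16 pixels, also with Intra Affine Mode (where the iterative operation selects the BV with the minimum coding cost). The coding costs of Intra Mode and Intra Affine Mode are then computed, and the coding mode with the minimum cost is taken as the final coding mode.
Fig. 22B is a schematic flowchart of another optional way of determining the coding mode provided by an embodiment of the present application. As shown in Fig. 22B, the steps are as follows:
S22B1: Input the current coding unit; perform S22B2, S22B3, and S22B7;
S22B2: Encode with the conventional Intra Mode; perform S22B8;
S22B3: Determine whether the length H and width W of the image block to be encoded are both greater than or equal to 16 pixels;
S22B4: If both are, encode with Intra Affine CPR Mode; specifically, first build the initial BV candidate list;
S22B5: Starting from the initial BV candidate list, iteratively search for the best BV; perform S22B8;
S22B6: Determine whether the length H and width W of the image block to be encoded are both less than or equal to 64 pixels;
S22B7: Encode with the conventional IBC Mode; perform S22B8;
S22B8: Compute the coding costs of Intra Mode, IBC Mode, and Intra Affine CPR Mode, select the coding mode with the minimum coding cost, and take it as the final coding mode.
In other words, for VVC, after the image block to be encoded is obtained, it is encoded with the conventional Intra Mode; with IBC Mode if its length and width are both at most 64 pixels; and with Intra Affine CPR Mode if its length and width are both at least 16 pixels (where the iterative operation selects the BV with the minimum coding cost). The coding costs of Intra Mode, IBC Mode, and Intra Affine CPR Mode are then computed, and the coding mode with the minimum cost among them is taken as the final coding mode.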
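The VVC-side decision can be summarised by a short sketch. The mode labels and the `costs` dictionary are illustrative assumptions; in a real encoder each cost would come from actually encoding the block in that mode.

```python
def choose_mode(width, height, costs):
    """Pick the eligible coding mode with the smallest cost.

    costs maps a mode label ("intra", "ibc", "intra_affine_cpr") to its
    coding cost; labels and data layout are assumptions for illustration.
    """
    eligible = {"intra"}  # conventional Intra Mode is always tried
    if width <= 64 and height <= 64:
        eligible.add("ibc")  # IBC Mode is limited to blocks of at most 64x64
    if width >= 16 and height >= 16:
        eligible.add("intra_affine_cpr")  # affine mode needs at least 16x16
    available = {mode: cost for mode, cost in costs.items() if mode in eligible}
    return min(available, key=available.get)
```

The AVS3 flow of Fig. 22A corresponds to the same logic without the `"ibc"` entry.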
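In an optional embodiment, before S1004, the method may further include:
determining whether the control points of the reference image block pointed to by each BV in the iterated BV candidate list lie within the encoded image block;
deleting from the iterated BV candidate list the BVs whose reference image block control points do not lie within the encoded image block.
Here, whether a BV in the BV candidate list is valid is determined by checking whether the control points of the reference image block it points to lie within the encoded image block. Only BVs in the iterated BV candidate list whose reference image block control points lie within the encoded image block are valid; all others are invalid.
For example, in Intra Affine Mode the current frame itself is the only reference frame, so to guarantee successful decoding the reference unit must lie in the encoded area of the current frame.
Therefore, in Intra Affine Mode the searched BVs must be checked for validity, including the initial BVs used to build the BV candidate list mentioned above, as well as the BVs continually updated during the iterative search and the optimal BV. If a BV fails the validity check, its reference pixels may lie beyond the image boundary or in an unencoded area, so the decoder would be unable to obtain usable reference pixel values and decoding would break down.
Taking four-parameter Affine as an example, suppose the width and length of the image are W_pic and H_pic, the BVs of control points LT (upper-left) and RT (upper-right) of the current coding unit are (v0x, v0y) and (v1x, v1y), the width and length of the current coding unit are w and h, the position of any pixel in the coding unit is (x, y), and the position of the upper-left pixel is (X_PU, Y_PU). To ensure that all pixels of the reference unit lie within the encoded area and do not exceed the image boundary, the validity constraints are as follows: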
Figure PCTCN2019090594-appb-000015
Figure PCTCN2019090594-appb-000016
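Each BV obtained in Intra Affine Mode is considered valid and usable only if it satisfies the above constraints; otherwise it is discarded.
The validity check for six parameters is essentially the same and is not described again here.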
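Because the constraint formulas are not reproduced in this text, the following is only a sketch of what such a check might look like for the four-parameter case. It maps the four corner samples of the block through an assumed four-parameter model and tests them against the picture bounds and a caller-supplied `is_encoded` predicate; the function name, the corner-only test, and the predicate are all assumptions.

```python
def bv_is_valid(x_pu, y_pu, w, h, v0, v1, pic_w, pic_h, is_encoded):
    """Check the reference positions of the four block corners.

    (x_pu, y_pu): upper-left sample of the block; v0, v1: BVs of the
    upper-left and upper-right control points; is_encoded(x, y) reports
    whether a position lies in the already-encoded area (hypothetical helper).
    """
    a = (v1[0] - v0[0]) / w
    b = (v1[1] - v0[1]) / w
    for x, y in [(0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1)]:
        # Reference position = sample position + BV from the affine model.
        rx = x_pu + x + a * x - b * y + v0[0]
        ry = y_pu + y + b * x + a * y + v0[1]
        if not (0 <= rx <= pic_w - 1 and 0 <= ry <= pic_h - 1):
            return False  # outside the picture
        if not is_encoded(rx, ry):
            return False  # points into an area not yet encoded
    return True
```

Checking only corners is sufficient for the picture-boundary test, because an affine map sends the block to a convex quadrilateral; for an irregular encoded area a per-sub-block or per-sample test would be needed instead.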
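Note that the BV validity check in Intra Affine CPR Mode is similar to that in Intra Affine Mode and is not described again here.
S1005: According to the BV of the image block to be encoded, invoke the preset affine motion model to calculate the reference image block of the image block to be encoded, and use the pixel values of the reference image block as the predicted value of the image block to be encoded.
Thus, once the BV of the image block to be encoded is determined, formula (1) or formula (2) can be used to calculate the BV of each sub-block of the image block to be encoded. For each sub-block, its reference image block is determined from its BV, and the pixel values of that reference image block are used as the predicted value of the sub-block, which gives the predicted value of the image block to be encoded.
The embodiments of the present application introduce the Intra Affine Mode and Intra Affine CPR Mode coding modes. To support these modes in the encoder and decoder, the following approach is used:
When Intra Affine Mode is used at the encoder, the mode needs to be marked or inferable, so that other coding units can refer to coding units coded in Intra Affine Mode and so that the decoder can decode.
In the embodiments of the present application, no flag needs to be transmitted in the bitstream to mark the proposed Intra Affine Mode. As described above, when the encoder enters Intra Affine Mode it sets the current picture type PictureType to 1, that is, a P frame (PictureType = 0 denotes an I frame, 1 a P frame, and 2 a B frame). The prediction mode of the coding unit is then set to PRED_No_Constraint (both intra and inter prediction may be used), the current picture is set as the only reference frame for prediction (the current frame is inserted into reference picture list 0), and the AffineFlag of the coding unit is set to true.
Therefore, at either the encoder or the decoder, whenever a coding unit is detected whose prediction mode is PRED_No_Constraint, whose reference frame is the current frame, and whose AffineFlag is true, it can be inferred that the coding unit uses the Intra Affine Mode proposed in the present application. Fig. 23 is a schematic flowchart of an optional method for inferring the coding mode provided by an embodiment of the present application. As shown in Fig. 23, the inference proceeds as follows:
S2301: Determine whether the prediction mode is PRED_No_Constraint; if yes, perform S2302, otherwise perform S2303;
S2302: Determine whether the reference frame is the current frame; if yes, perform S2304, otherwise select Inter Mode;
S2303: Determine whether the prediction mode is PRED_Intra_Only; if yes, select Intra Mode.
S2304: Determine whether AffineFlag equals true; if yes, select Intra Affine Mode.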
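The inference of steps S2301 to S2304 maps directly onto a small decision function. The string labels and the `None` return for branches the text does not define are assumptions of this sketch.

```python
def infer_mode_avs3(pred_mode, ref_is_current, affine_flag):
    """Infer the coding mode of a coding unit from existing syntax elements."""
    if pred_mode == "PRED_No_Constraint":          # S2301
        if not ref_is_current:                     # S2302
            return "Inter Mode"
        if affine_flag:                            # S2304
            return "Intra Affine Mode"
        return None  # branch not specified in the text
    if pred_mode == "PRED_Intra_Only":             # S2303
        return "Intra Mode"
    return None  # branch not specified in the text
```

Because the decision uses only syntax elements that are already present, the decoder can run the same function without any additional signalling.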
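In this embodiment, no additional flag needs to be transmitted; the existing flags in the bitstream are simply reused, which helps lower the bit rate and improve coding efficiency.
Similarly, for Intra Affine CPR Mode in VVC, the mode needs to be marked or inferable, so that other PUs can refer to PUs coded in Intra Affine CPR Mode and so that the decoder can decode.
In the embodiments of the present application, no flag needs to be transmitted in the bitstream to mark the proposed Intra Affine CPR Mode. Instead, when the encoder enters Intra Affine CPR Mode it changes the current picture type from I frame to P frame, sets the Flag_predMode of the coding unit to MODE_INTER (inter), and sets the current picture as the only reference frame for prediction. It also sets the Flag_Affine of the coding unit to true. Therefore, at either the encoder or the decoder, whenever a coding unit is detected whose Flag_predMode is MODE_INTER, whose reference frame is the current frame, and whose Flag_Affine is true, it can be inferred that the coding unit uses the Intra Affine CPR Mode proposed herein.
Fig. 24 is a schematic flowchart of another optional method for inferring the coding mode provided by an embodiment of the present application. As shown in Fig. 24, the inference proceeds as follows:
S2401: Determine whether Flag_predMode is MODE_INTER; if yes, perform S2402; if no, select Intra Mode;
S2402: Determine whether the reference frame is the current frame; if yes, perform S2403, otherwise select Inter Mode;
S2403: Determine whether Flag_Affine is true; if yes, select Intra Affine CPR Mode, otherwise select IBC mode.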
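Steps S2401 to S2403 can likewise be written as a small decision function; the string labels used for the modes are illustrative.

```python
def infer_mode_vvc(flag_pred_mode, ref_is_current, flag_affine):
    """Infer the coding mode of a coding unit from existing syntax elements."""
    if flag_pred_mode != "MODE_INTER":   # S2401
        return "Intra Mode"
    if not ref_is_current:               # S2402
        return "Inter Mode"
    # S2403: current-picture reference, distinguished by the affine flag.
    return "Intra Affine CPR Mode" if flag_affine else "IBC Mode"
```

Unlike the AVS3 flow, every branch is defined here, because a current-picture reference without the affine flag falls back to IBC Mode.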
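In this way, no additional flag needs to be transmitted; the existing flags in the bitstream are simply reused, which helps lower the bit rate and improve coding efficiency.
An embodiment of the present application provides a method for determining a predicted value, including: the encoder obtains an image block to be encoded and an encoded image block from the current image frame, constructs a BV candidate list for the image block to be encoded according to the encoded image block, performs an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain an iterated BV candidate list, selects the BV of the image block to be encoded from the iterated BV candidate list, invokes a preset affine motion model according to that BV to calculate a reference image block of the image block to be encoded, and uses the pixel values of the reference image block as the predicted value of the image block to be encoded. In other words, a BV candidate list is constructed for the image block to be encoded and each BV in it is iterated, so that the BV of the image block to be encoded can be selected from the iterated list and a more accurate BV can be determined for that block. The affine motion model is then invoked so that intra prediction coding uses affine motion compensation to produce the predicted value. The predicted value determined in this way is closer to the true value, which effectively improves coding efficiency.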
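Embodiment 2
Based on the same inventive concept, Fig. 25 is a schematic structural diagram of an optional encoder proposed by an embodiment of the present application. As shown in Fig. 25, the encoder may include an obtaining module 251, a construction module 252, an iteration module 253, a selection module 254, and a determination module 255.
The obtaining module 251 is configured to obtain an image block to be encoded and an encoded image block from the current image frame;
The construction module 252 is configured to construct a block vector (BV) candidate list for the image block to be encoded according to the encoded image block;
The iteration module 253 is configured to perform an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain an iterated BV candidate list;
The selection module 254 is configured to select the BV of the image block to be encoded from the iterated BV candidate list;
The determination module 255 is configured to invoke, according to the BV of the image block to be encoded, a preset affine motion model to calculate a reference image block of the image block to be encoded, and to use the pixel values of the reference image block as the predicted value of the image block to be encoded.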
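Further, when the coding mode used by the encoder is the intra mode, the construction module 252 is specifically configured to:
obtain a preset area adjacent to the image block to be encoded;
search the preset area according to the starting control points set for the preset area and a preset search step, to find control points for the image block to be encoded;
calculate searched BVs for the image block to be encoded from the found control points and the control points of the image block to be encoded;
add the searched BVs to the BV candidate list.
Further, the construction module 252 is specifically configured to:
select the adjacent image blocks of the image block to be encoded from the encoded image blocks;
add the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from searched BVs.
Further, the construction module 252 is also specifically configured to:
after the adjacent image blocks of the image block to be encoded are selected from the encoded image blocks, add the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from searched BVs and/or from the BV candidate lists of the adjacent image block's own adjacent image blocks.
Further, when the coding modes used by the encoder are the intra mode and the intra block copy (IBC) mode, the construction module 252 is specifically configured to:
select the adjacent image blocks of the image block to be encoded from the encoded image blocks;
add the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from the BV of the adjacent image block's reference image block.
Further, the construction module 252 is specifically configured to:
select the adjacent image blocks of the image block to be encoded from the encoded image blocks;
add the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from the BV candidate lists of the adjacent image block's own adjacent image blocks.
Further, the construction module 252 is specifically configured to:
obtain a preset area adjacent to the image block to be encoded;
search the preset area according to the starting control points set for the preset area and a preset search step, to find control points for the image block to be encoded;
calculate searched BVs for the image block to be encoded from the found control points and the control points of the image block to be encoded;
add the searched BVs to the BV candidate list.
Further, the construction module 252 is specifically configured to:
select the adjacent image blocks of the image block to be encoded from the encoded image blocks;
add the BV candidate list of an adjacent image block to the BV candidate list;
where the BV candidate list of the adjacent image block is constructed from the BV of the adjacent image block's reference image block, and/or the BV candidate lists of the adjacent image block's own adjacent image blocks, and/or searched BVs.
Further, the construction module 252 is specifically configured to:
calculate the control points of the reference image block of the adjacent image block from the BV of the adjacent image block;
calculate the BV between the image block to be encoded and the reference image block of the adjacent image block from the control points of that reference image block and the control points of the image block to be encoded;
add the BV between the image block to be encoded and the reference image block of the adjacent image block to the BV candidate list.
Further, the construction module 252 is specifically configured to:
calculate an indirect reference image block of the image block to be encoded by applying the BV of the adjacent image block to the control points of the image block to be encoded;
calculate the BV between the image block to be encoded and the indirect reference image block by vector addition and subtraction, from the BV of the adjacent image block, the control points of the indirect reference image block, the BV of the indirect reference image block, and the control points of the image block to be encoded;
add the BV between the image block to be encoded and the indirect reference image block to the BV candidate list.
Further, the construction module 252 is specifically configured to:
determine whether the length and width of the image block to be encoded are both greater than or equal to 16 pixels;
when the length and width of the image block to be encoded are both greater than or equal to 16 pixels, construct a block vector (BV) candidate list for the image block to be encoded according to the encoded image block;
when the length or width of the image block to be encoded is less than 16 pixels, encode the image block to be encoded in the intra mode to obtain its predicted value.
Further, the construction module 252 is specifically configured to:
after the searched BVs are calculated for the image block to be encoded from the found control points and the control points of the image block to be encoded, calculate the first coding cost incurred by encoding the image block to be encoded with each of the searched BVs;
rank the BVs by first coding cost and select a preset number of them;
add the preset number of BVs to the BV candidate list.
Further, the iteration module 253 is specifically configured to:
determine whether the iteration count i is less than or equal to a preset iteration threshold, where the initial value of i is 1;
when i is less than or equal to the iteration threshold, substitute each BV in the BV candidate list into the error formula between the true value and the predicted value of the image block to be encoded, and apply the least squares method to update the affine motion model parameters in the error formula and the BV candidate list;
when the iteration count is greater than the preset iteration threshold, take the updated BV candidate list as the iterated BV candidate list, and update the parameters of the affine motion model with the updated affine motion model parameters from the error formula;
update i to i+1 and return to the step of determining whether the iteration count i is less than or equal to the preset iteration threshold.
Further, the selection module 254 is specifically configured to:
calculate the second coding cost incurred by encoding the image block to be encoded with each BV in the iterated BV candidate list;
take the BV corresponding to the minimum second coding cost as the BV of the image block to be encoded.
Further, the encoder is also configured to:
before the BV of the image block to be encoded is selected from the iterated BV candidate list, determine whether the control points of the reference image block pointed to by each BV in the iterated BV candidate list lie within the encoded image block;
delete from the iterated BV candidate list the BVs whose reference image block control points do not lie within the encoded image block.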
图26为本申请实施例提出的另一种可选的编码器的结构示意图,如图26所示,本申请实施例提出 的编码器2600还可以包括处理器261以及存储有处理器261可执行指令的存储介质262,存储介质262通过通信总线263依赖处理器261执行操作,当指令被处理器261执行时,执行上述一个或多个实施例所述的预测值的确定方法。
需要说明的是,实际应用时,终端中的各个组件通过通信总线263耦合在一起。可理解,通信总线263用于实现这些组件之间的连接通信。通信总线263除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图26中将各种总线都标为通信总线263。
本申请实施例提供了一种计算机存储介质,存储有可执行指令,当所述可执行指令被一个或多个处理器执行的时候,所述处理器执行上述一个或多个实施例所述的预测值的确定方法。
可以理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDRSDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DRRAM)。本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
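The processor may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and it can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.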
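It can be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
For a software implementation, the techniques described herein may be implemented through modules (for example, procedures and functions) that perform the functions described herein. The software code may be stored in a memory and executed by a processor. The memory may be implemented inside or outside the processor.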
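It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or apparatus that includes that element.
The serial numbers of the above embodiments of this application are for description only and do not represent the relative merits of the embodiments.
From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Inspired by this application, those of ordinary skill in the art may devise many other forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.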
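Industrial Applicability
The embodiments of this application provide a method for determining a predicted value, an encoder, and a computer storage medium. The encoder obtains an image block to be encoded and an encoded image block from the current image frame; constructs a BV candidate list for the image block to be encoded according to the encoded image block; performs an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm to obtain a BV candidate list after the iterative operation; selects the BV of the image block to be encoded from that list; and, according to the BV of the image block to be encoded, invokes a preset affine motion model to calculate the reference image block of the image block to be encoded, using the pixel values of the reference image block as the predicted value of the image block to be encoded. In this way, the determined predicted value is closer to the true value, which effectively improves coding efficiency.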
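To illustrate how an affine motion model turns control-point BVs into a per-pixel displacement, here is a minimal sketch that assumes the widely used six-parameter form with control points at the top-left, top-right, and bottom-left corners of the block. The preset model of the application may differ; `affine_bv` and its parameters are illustrative names only.

```python
from typing import Tuple

Vector = Tuple[float, float]

def affine_bv(x: float, y: float, width: int, height: int,
              bv0: Vector, bv1: Vector, bv2: Vector) -> Vector:
    """Displacement at (x, y) inside a width x height block.

    bv0, bv1, bv2 are the BVs at the top-left, top-right and bottom-left
    control points (assumed layout); the result is a linear interpolation.
    """
    dx = bv0[0] + (bv1[0] - bv0[0]) * x / width + (bv2[0] - bv0[0]) * y / height
    dy = bv0[1] + (bv1[1] - bv0[1]) * x / width + (bv2[1] - bv0[1]) * y / height
    return (dx, dy)
```

When all three control-point BVs are equal, the model reduces to a pure translation, which corresponds to ordinary block copying.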

Claims (18)

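  1. A method for determining a predicted value, wherein the method is applied to an encoder, and the method comprises:
    obtaining an image block to be encoded and an encoded image block from a current image frame;
    constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block;
    performing an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm, to obtain a BV candidate list after the iterative operation;
    selecting a BV of the image block to be encoded from the BV candidate list after the iterative operation;
    invoking a preset affine motion model according to the BV of the image block to be encoded, calculating a reference image block of the image block to be encoded, and using pixel values of the reference image block as the predicted value of the image block to be encoded.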
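  2. The method according to claim 1, wherein, when the coding mode adopted by the encoder is an intra mode, the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    obtaining a preset region adjacent to the image block to be encoded;
    searching the preset region according to a starting point of the control points set for the preset region and a preset search step, to search out control points for the image block to be encoded;
    calculating searched BVs for the image block to be encoded according to the searched control points and the control points of the image block to be encoded;
    adding the searched BVs to the BV candidate list.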
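  3. The method according to claim 2, wherein the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    selecting an adjacent image block of the image block to be encoded from the encoded image block;
    adding the BV candidate list of the adjacent image block to the BV candidate list;
    wherein the BV candidate list of the adjacent image block is constructed from the searched BVs.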
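  4. The method according to claim 3, wherein, after the selecting an adjacent image block of the image block to be encoded from the encoded image block, the method further comprises:
    adding the BV candidate list of the adjacent image block to the BV candidate list;
    wherein the BV candidate list of the adjacent image block is constructed from the searched BVs and/or the BV candidate list of an adjacent image block of the adjacent image block.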
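  5. The method according to claim 1, wherein, when the coding modes adopted by the encoder are an intra mode and an intra block copy (IBC) mode, the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    selecting an adjacent image block of the image block to be encoded from the encoded image block;
    adding the BV candidate list of the adjacent image block to the BV candidate list;
    wherein the BV candidate list of the adjacent image block is constructed from the BV of the reference image block of the adjacent image block.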
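  6. The method according to claim 5, wherein the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    selecting an adjacent image block of the image block to be encoded from the encoded image block;
    adding the BV candidate list of the adjacent image block to the BV candidate list;
    wherein the BV candidate list of the adjacent image block is constructed from the BV candidate list of an adjacent image block of the adjacent image block.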
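  7. The method according to claim 5, wherein the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    obtaining a preset region adjacent to the image block to be encoded;
    searching the preset region according to a starting point of the control points set for the preset region and a preset search step, to search out control points for the image block to be encoded;
    calculating searched BVs for the image block to be encoded according to the searched control points and the control points of the image block to be encoded;
    adding the searched BVs to the BV candidate list.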
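  8. The method according to any one of claims 5 to 7, wherein the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    selecting an adjacent image block of the image block to be encoded from the encoded image block;
    adding the BV candidate list of the adjacent image block to the BV candidate list;
    wherein the BV candidate list of the adjacent image block is constructed from the BV of the reference image block of the adjacent image block, and/or the BV candidate list of an adjacent image block of the adjacent image block, and/or the searched BVs.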
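  9. The method according to claim 8, wherein the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    calculating the control points of the reference image block of the adjacent image block according to the BV of the adjacent image block;
    calculating the BV between the image block to be encoded and the reference image block of the adjacent image block according to the control points of the reference image block of the adjacent image block and the control points of the image block to be encoded;
    adding the BV between the image block to be encoded and the reference image block of the adjacent image block to the BV candidate list.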
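  10. The method according to claim 8, wherein the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    based on the control points of the image block to be encoded, using the BV of the adjacent image block to calculate an indirect reference image block of the image block to be encoded;
    calculating the BV between the image block to be encoded and the indirect reference image block by vector addition and subtraction, according to the BV of the adjacent image block, the control points of the indirect reference image block, the BV of the indirect reference image block, and the control points of the image block to be encoded;
    adding the BV between the image block to be encoded and the indirect reference image block to the BV candidate list.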
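  11. The method according to claim 1, wherein the constructing a block vector (BV) candidate list for the image block to be encoded according to the encoded image block comprises:
    determining whether the length and the width of the image block to be encoded are both greater than or equal to 16 pixels;
    when the length and the width of the image block to be encoded are both greater than or equal to 16 pixels, constructing the block vector (BV) candidate list for the image block to be encoded according to the encoded image block;
    when the length or the width of the image block to be encoded is less than 16 pixels, encoding the image block to be encoded in intra mode to obtain the predicted value of the image block to be encoded.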
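  12. The method according to claim 2 or 7, wherein, after the calculating searched BVs for the image block to be encoded according to the searched control points and the control points of the image block to be encoded, the method further comprises:
    calculating a first coding cost incurred by encoding the image block to be encoded with each of the searched BVs;
    selecting a preset number of BVs in descending order of the first coding cost;
    adding the preset number of BVs to the BV candidate list.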
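  13. The method according to claim 1, wherein the performing an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm, to obtain a BV candidate list after the iterative operation, comprises:
    determining whether an iteration count i is less than or equal to a preset iteration threshold, wherein the initial value of the iteration count i is 1;
    when i is less than or equal to the iteration threshold, substituting each BV in the BV candidate list into an error formula between the true value of the image block to be encoded and the predicted value of the image block to be encoded, and applying the least squares method to update the parameters of the affine motion model in the error formula and the BV candidate list;
    when the iteration count is greater than the iteration threshold, taking the updated BV candidate list as the BV candidate list after the iterative operation, and updating the parameters in the affine motion model with the updated parameters of the affine motion model in the error formula;
    updating i to i+1, and returning to the determining whether the iteration count i is less than or equal to the preset iteration threshold.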
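  14. The method according to claim 1, wherein the selecting a BV of the image block to be encoded from the BV candidate list after the iterative operation comprises:
    calculating a second coding cost incurred by encoding the image block to be encoded with each BV in the BV candidate list after the iterative operation;
    taking the BV corresponding to the minimum of the second coding costs as the BV of the image block to be encoded.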
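  15. The method according to claim 1, wherein, before the selecting a BV of the image block to be encoded from the BV candidate list after the iterative operation, the method further comprises:
    determining whether the control points of the reference image block pointed to by each BV in the BV candidate list after the iterative operation lie within the encoded image block;
    deleting, from the BV candidate list after the iterative operation, every BV whose reference image block has control points that do not lie within the encoded image block.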
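  16. An encoder, wherein the encoder comprises:
    an obtaining module, configured to obtain an image block to be encoded and an encoded image block from a current image frame;
    a construction module, configured to construct a BV candidate list for the image block to be encoded according to the encoded image block;
    an iteration module, configured to perform an iterative operation on each BV in the BV candidate list according to a preset iterative algorithm, to obtain a BV candidate list after the iterative operation;
    a selection module, configured to select a BV of the image block to be encoded from the BV candidate list after the iterative operation;
    a determination module, configured to invoke a preset affine motion model according to the BV of the image block to be encoded, calculate a reference image block of the image block to be encoded, and use pixel values of the reference image block as the predicted value of the image block to be encoded.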
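  17. An encoder, wherein the encoder comprises:
    a processor and a storage medium storing instructions executable by the processor, wherein the storage medium relies on the processor to perform operations via a communication bus, and when the instructions are executed by the processor, the method for determining a predicted value according to any one of claims 1 to 15 is performed.
  18. A computer-readable storage medium storing executable instructions, wherein, when the executable instructions are executed by one or more processors, the processors perform the method for determining a predicted value according to any one of claims 1 to 15.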
PCT/CN2019/090594 2019-06-10 2019-06-10 Method for determining predicted value, encoder, and computer storage medium WO2020248105A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
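PCT/CN2019/090594 WO2020248105A1 (zh) 2019-06-10 2019-06-10 Method for determining predicted value, encoder, and computer storage medium
CN201980096199.3A CN113796070A (zh) 2019-06-10 2019-06-10 Method for determining predicted value, encoder, and computer storage medium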

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/090594 WO2020248105A1 (zh) 2019-06-10 2019-06-10 Method for determining predicted value, encoder, and computer storage medium

Publications (1)

Publication Number Publication Date
WO2020248105A1 true WO2020248105A1 (zh) 2020-12-17

Family

ID=73780933

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090594 WO2020248105A1 (zh) 2019-06-10 2019-06-10 Method for determining predicted value, encoder, and computer storage medium

Country Status (2)

Country Link
CN (1) CN113796070A (zh)
WO (1) WO2020248105A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
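CN105659602A (zh) * 2013-10-14 2016-06-08 Microsoft Technology Licensing, LLC Encoder-side options for intra block copy prediction mode for video and image coding
CN105917648A (zh) * 2014-01-17 2016-08-31 Microsoft Technology Licensing, LLC Intra block copy prediction with asymmetric partitions and encoder-side search patterns, search ranges and approaches to partitioning
CN106464905A (zh) * 2014-05-06 2017-02-22 HFI Innovation Inc. Block vector prediction method for intra block copy mode coding
WO2017156705A1 (en) * 2016-03-15 2017-09-21 Mediatek Inc. Affine prediction for video coding
CN109076214A (zh) * 2016-05-28 2018-12-21 MediaTek Inc. Method and apparatus of current picture referencing for video coding using affine motion compensation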

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2050279B1 (en) * 2006-08-02 2018-08-29 Thomson Licensing Adaptive geometric partitioning for video decoding
US9438928B2 (en) * 2012-11-05 2016-09-06 Lifesize, Inc. Mechanism for video encoding based on estimates of statistically-popular motion vectors in frame
KR101908205B1 (ko) * 2014-02-21 2018-10-15 MediaTek Singapore Pte. Ltd. Video coding method using prediction based on intra picture block copy
CA2942292A1 (en) * 2014-03-11 2015-09-17 Samsung Electronics Co., Ltd. Depth image prediction mode transmission method and apparatus for encoding and decoding inter-layer video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SIQI CAO , HAIYANG HAN , JUN WANG , FAN LIANG , YUANFANG YU , YANG LIU: "Affine Motion Mode in Intra Coding", 13. JVET MEETING; 20190109 - 20190118; MARRAKECH; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-M0634, 14 January 2019 (2019-01-14), pages 1 - 2, XP030202148 *

Also Published As

Publication number Publication date
CN113796070A (zh) 2021-12-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19932367

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19932367

Country of ref document: EP

Kind code of ref document: A1