CN115211116A - Image prediction method, encoder, decoder, and storage medium - Google Patents


Info

Publication number
CN115211116A
Authority
CN
China
Prior art keywords
value
gradient
prediction
image block
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080097798.XA
Other languages
Chinese (zh)
Inventor
万帅
巩浩
冉启宏
霍俊彦
马彦卓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202310064906.9A (published as CN116647698A)
Publication of CN115211116A
Legal status: Pending


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/109: Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/577: Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Embodiments of the present application disclose an image prediction method, an encoder, a decoder, and a storage medium. The method includes: performing motion estimation on an image block to be encoded, and determining the unidirectional predicted values of the image block respectively corresponding to two prediction directions; determining gradient parameters corresponding to the image block, wherein the gradient parameters at least comprise a gradient flag value and a gradient direction index value; calculating a gradient value of the image block by using the determined gradient parameters and the unidirectional predicted values respectively corresponding to the two prediction directions; and correcting an initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.

Description

Image prediction method, encoder, decoder, and storage medium

Technical Field
The present disclosure relates to the field of video encoding and decoding technologies, and in particular, to an image prediction method, an encoder, a decoder, and a storage medium.
Background
With the widespread application of multimedia technology, modern coding technology adopts a hybrid coding framework that includes prediction, transform, quantization, entropy coding, and other processes. Predictive coding comprises intra-frame prediction and inter-frame prediction: intra-frame prediction predicts the image block currently to be encoded from image blocks that have already been encoded and reconstructed within the same frame, whereas inter-frame prediction predicts the image currently to be encoded from other frames that have already been encoded and reconstructed. Inter-frame predictive coding thus exploits the temporal correlation of video sequences to remove temporal redundancy, and is a very important link in the current video coding framework.
The reference software model HPM6.0 of the new-generation digital Audio and Video coding Standard (AVS3, the third-generation Audio and Video coding Standard) introduces a bidirectional prediction technique. When the current block is predicted, two unidirectional predicted values can be obtained from two sets of reference frames and Motion Vectors (MVs), and the two unidirectional predicted values are then averaged to obtain the bidirectional predicted value. This averaging may bias the bidirectional predicted value, so the prediction result may be inaccurate.
Disclosure of Invention
Embodiments of the present application provide an image prediction method, an encoder, a decoder, and a storage medium, which can improve accuracy of a prediction result by correcting an initial bidirectional prediction value.
The technical scheme of the embodiment of the application can be realized as follows:
in a first aspect, an embodiment of the present application provides an image prediction method applied to an encoder, where the method includes:
performing motion estimation on an image block to be coded, and determining unidirectional predicted values of the image block in two prediction directions respectively;
determining gradient parameters corresponding to the image block, wherein the gradient parameters at least comprise a gradient flag value and a gradient direction index value;
calculating gradient values of the image blocks by using the determined gradient parameters and the unidirectional predicted values respectively corresponding to the two prediction directions;
and correcting an initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
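The steps above can be sketched in code. Since the concrete gradient and correction formulas are not given at this point in the description, the sketch below makes illustrative assumptions: the gradient value is taken as the per-sample difference between the two unidirectional predictions, and the preset correction strength value is assumed to act as a right shift. All names and formulas here are hypothetical, not the patent's definitions.

```python
def correct_bi_prediction(pred0, pred1, strength_shift=2):
    """Hypothetical correction of the initial bidirectional predicted value.

    pred0, pred1: unidirectional prediction samples for the two directions.
    strength_shift: assumed form of the preset correction strength value.
    """
    corrected = []
    for p0, p1 in zip(pred0, pred1):
        initial = (p0 + p1 + 1) >> 1   # weighted sum of the two predictors (equal weights)
        gradient = p1 - p0             # assumed gradient between the two predictions
        corrected.append(initial + (gradient >> strength_shift))
    return corrected
```

For samples 100 and 108, the initial average is 104 and the assumed correction nudges the result toward the second predictor; the point is only to show how a gradient term and a strength parameter can modify the plain average.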
In a second aspect, an embodiment of the present application provides an image prediction method applied to a decoder, where the method includes:
analyzing the code stream to obtain a prediction mode parameter of an image block to be decoded;
when the prediction mode parameters indicate that the image block uses a bidirectional prediction mode, analyzing motion parameters of the image block, wherein the motion parameters comprise a motion vector and a reference image index;
according to the motion parameters, unidirectional predicted values of the image blocks in two prediction directions are determined;
determining gradient parameters of the image blocks, and calculating gradient values of the image blocks by using the gradient parameters and unidirectional predicted values of the image blocks corresponding to the two prediction directions respectively;
and correcting the initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
In a third aspect, an embodiment of the present application provides an encoder including a first determining unit, a first calculating unit, and a first correcting unit, wherein,
the first determining unit is configured to perform motion estimation on an image block to be encoded, and determine unidirectional prediction values respectively corresponding to the image block in two prediction directions;
the first determining unit is further configured to determine a gradient parameter corresponding to the image block, where the gradient parameter at least includes a gradient flag value and a gradient direction index value;
the first calculating unit is configured to calculate gradient values of the image block by using the determined gradient parameters and the unidirectional prediction values respectively corresponding to the two prediction directions;
the first correcting unit is configured to correct an initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain a bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is a weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
In a fourth aspect, embodiments of the present application provide an encoder that includes a first memory and a first processor, wherein,
a first memory for storing a computer program operable on a first processor;
a first processor adapted to perform the method according to the first aspect when running a computer program.
In a fifth aspect, an embodiment of the present application provides a decoder, which includes a parsing unit, a second determining unit, a second calculating unit, and a second modifying unit, wherein,
the parsing unit is configured to parse the code stream to obtain a prediction mode parameter of the image block to be decoded; and when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, parse motion parameters of the image block, wherein the motion parameters comprise a motion vector and a reference image index;
the second determining unit is configured to determine, according to the motion parameters, unidirectional prediction values corresponding to the image blocks in two prediction directions respectively;
the second calculating unit is configured to determine a gradient parameter of the image block, and calculate a gradient value of the image block by using the gradient parameter and unidirectional prediction values respectively corresponding to the image block in two prediction directions;
the second correcting unit is configured to correct the initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
In a sixth aspect, an embodiment of the present application provides a decoder, which includes a second memory and a second processor, wherein,
a second memory for storing a computer program operable on the second processor;
a second processor for performing the method according to the second aspect when running the computer program.
In a seventh aspect, an embodiment of the present application provides a computer storage medium, where a computer program is stored, and when the computer program is executed by a first processor, the method according to the first aspect is implemented, or when the computer program is executed by a second processor, the method according to the second aspect is implemented.
The embodiment of the application provides an image prediction method, an encoder, a decoder and a storage medium, wherein the method can be applied to the encoder, and the method comprises the steps of performing motion estimation on an image block to be encoded to determine unidirectional prediction values respectively corresponding to the image block in two prediction directions; determining gradient parameters corresponding to the image block, wherein the gradient parameters at least comprise a gradient mark value and a gradient direction index value; calculating gradient values of the image blocks by using the determined gradient parameters and the unidirectional predicted values respectively corresponding to the two prediction directions; and correcting an initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions. 
The method can also be applied to a decoder: the prediction mode parameter of the image block to be decoded is obtained by parsing the code stream; when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, the motion parameters of the image block are parsed, wherein the motion parameters comprise a motion vector and a reference image index; according to the motion parameters, the unidirectional predicted values of the image block respectively corresponding to the two prediction directions are determined; the gradient parameters of the image block are determined, and the gradient value of the image block is calculated by using the gradient parameters and the unidirectional predicted values respectively corresponding to the two prediction directions; and the initial bidirectional predicted value is corrected according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional predicted value of the image block. In this way, the gradient parameters corresponding to the image block can be used to obtain a gradient value between the two unidirectional predicted values, and the initial bidirectional predicted value is then corrected according to this gradient value and the preset correction strength value, so that the bidirectional predicted value is more accurate; this can improve the accuracy of the prediction result, improve the coding and decoding efficiency, and improve video image quality.
Drawings
Fig. 1A is a schematic structural diagram of unidirectional inter-frame prediction provided in a related art scheme;
fig. 1B is a schematic structural diagram of bi-directional inter-frame prediction according to a related art;
fig. 2A is a block diagram of a video coding system according to an embodiment of the present disclosure;
fig. 2B is a block diagram of a video decoding system according to an embodiment of the present application;
fig. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present disclosure;
fig. 4 is a graph illustrating a variation trend of a luminance value according to an embodiment of the present disclosure;
fig. 5 is a schematic flowchart of another image prediction method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an encoder according to an embodiment of the present disclosure;
fig. 7 is a schematic hardware structure diagram of an encoder according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of a decoder according to an embodiment of the present application;
fig. 9 is a schematic hardware structure diagram of a decoder according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant application and are not limiting of the application. It should be noted that, for the convenience of description, only the parts related to the related applications are shown in the drawings.
In a video image, a first image component, a second image component, and a third image component are generally used to represent a Coding Block (CB) or a Coding Unit (CU); wherein the three image components are respectively a luminance component, a blue chrominance component and a red chrominance component, and specifically, the luminance component is generally represented by a symbol Y, the blue chrominance component is generally represented by a symbol Cb or U, and the red chrominance component is generally represented by a symbol Cr or V; thus, the video image can be represented in YCbCr format, and also in YUV format.
In the embodiment of the present application, the first image component may be a luminance component, the second image component may be a blue chrominance component, and the third image component may be a red chrominance component, but is not particularly limited.
The following will describe a related art scheme for the inter prediction technique.
Inter-frame prediction uses the inter-frame correlation, i.e. the temporal correlation, of video images to achieve image compression. The content of consecutive video frames is similar and highly correlated. When a video frame is encoded, predictive coding can therefore be used: the current video frame is predicted from other, already decoded video frames, and only the prediction residual is encoded, which effectively reduces the code rate.
Inter-frame prediction can be divided into unidirectional prediction and bidirectional prediction: unidirectional prediction obtains the predicted value directly from one set of a reference frame and an MV, while bidirectional prediction obtains the predicted value from a joint calculation over two sets of reference frames and MVs. Fig. 1A shows an example of the structure of unidirectional prediction: besides the current frame, there is one set of parameters, i.e. a reference frame and an MV, from which the current frame can be unidirectionally predicted. Fig. 1B shows an example of the structure of bidirectional prediction: besides the current frame, there are two sets of parameters, i.e. reference frame 0 with MV0 and reference frame 1 with MV1; the current frame can be unidirectionally predicted in one direction from reference frame 0 and MV0, and in the other direction from reference frame 1 and MV1, thereby realizing bidirectional prediction of the current frame.
Specifically, for unidirectional prediction, the predicted value is calculated from one set of a reference frame and an MV. The MV comprises a horizontal component MV_H and a vertical component MV_V, which respectively indicate the horizontal and vertical positional relationship of the content of the CU between the current frame and the reference frame. Here, on the encoder side, the encoder can find the most suitable reference frame and MV through motion estimation; on the decoder side, the decoder can parse the reference frame and the MV from the code stream and thereby calculate the predicted value of the CU.
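As a rough illustration (not taken from the patent) of how one set of a reference frame and an MV yields a unidirectional predictor, the sketch below offsets the block position in the reference frame by the components MV_H and MV_V; integer-pel motion and all function and variable names are simplifying assumptions.

```python
def uni_predict(ref_frame, x, y, w, h, mv_h, mv_v):
    """Copy the w-by-h block located at (x + mv_h, y + mv_v) from the
    reference frame; this copied block serves as the unidirectional
    predictor for the current block at (x, y)."""
    return [row[x + mv_h : x + mv_h + w]
            for row in ref_frame[y + mv_v : y + mv_v + h]]
```

A real codec additionally performs sub-pel interpolation and boundary padding; those details are omitted here.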
For bidirectional prediction, the predicted value is calculated from two sets of reference frames and MVs. In the reference software HPM6.0 of AVS3, the bidirectional predicted value is calculated as follows: two unidirectional predicted values are first calculated using the two sets of reference frames and MVs, and the mean of the two unidirectional predicted values is then taken as the bidirectional predicted value, as shown below:
pred_BI = (pred0 + pred1 + 1) >> 1    (1)
wherein pred0 represents the unidirectional predicted value in one direction, pred1 represents the unidirectional predicted value in the other direction, and pred_BI represents the bidirectional predicted value; "=" denotes the assignment operator, and ">>" denotes the right-shift operator, e.g. ">> 1" indicates shifting one bit to the right in binary.
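Formula (1) can be tried out directly; the "+ 1" added before the shift rounds the average rather than truncating it:

```python
def bi_pred_average(pred0, pred1):
    # (pred0 + pred1 + 1) >> 1: the rounded mean of the two unidirectional
    # predicted values, computed with an integer right shift.
    return (pred0 + pred1 + 1) >> 1
```

For example, averaging 10 and 13 gives 11.5, which the "+ 1" rounds up to 12 instead of truncating to 11.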
In the latest-generation video coding standard H.266/Versatile Video Coding (VVC), a CU-level bidirectional prediction weighting technique (Bi-prediction with CU-level Weight, BCW) is introduced, in which the bidirectional predicted value is obtained as a weighted average. Assuming that the unidirectional predicted values obtained in the two directions are pred0 and pred1, respectively, and that the weight value is w, the bidirectional predicted value pred_BI is calculated as follows:
pred_BI = ((8 - w) × pred0 + w × pred1 + 4) >> 3    (2)
wherein w takes one of 5 values, w ∈ {-2, 3, 4, 5, 10}, selected according to a weight index. Specifically, the w value is obtained in one of two ways. For a CU not in merge mode, a Motion Vector Difference (MVD) needs to be transmitted, and the decoder side determines the weight index by parsing the code stream; for a CU in merge mode, the weight index may be inferred from neighboring blocks. However, the embodiments of the present application are not limited thereto.
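Formula (2) can likewise be sketched. The weight list below follows the w values quoted in the text; the function and variable names are illustrative:

```python
# BCW weight candidates as listed in the text; index selects one of them.
BCW_WEIGHTS = [-2, 3, 4, 5, 10]

def bcw_bi_pred(pred0, pred1, weight_index):
    w = BCW_WEIGHTS[weight_index]
    # ((8 - w) * pred0 + w * pred1 + 4) >> 3: weighted average with rounding;
    # the two weights (8 - w) and w sum to 8 = 2**3, so ">> 3" normalizes.
    return ((8 - w) * pred0 + w * pred1 + 4) >> 3
```

With w = 4 (equal weights), formula (2) reduces to the plain average of formula (1); w values of -2 and 10 extrapolate beyond the two predictors.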
Further, a Bi-directional Optical Flow (BIO) technique is adopted in AVS3, and the similar Bi-Directional Optical Flow (BDOF) technique is adopted in VVC. Bidirectional optical flow applies sample-level motion refinement on top of block-based, bidirectionally predicted motion compensation, without transmitting any extra signals. However, even after bidirectional prediction, a motion deviation is still found within parts of the current block.
In practical applications of AVS3, the current reference software HPM6.0 obtains the bidirectional predicted value as follows: two unidirectional predicted values are obtained from two sets of reference frames and MVs, and the mean of the two unidirectional predicted values is then computed; this mean is the bidirectional predicted value. Even when bidirectional prediction techniques such as BIO are added to adjust the bidirectional predicted value, the two unidirectional predicted values are still directly averaged during bidirectional prediction. This averaging may bias the bidirectional predicted value and thus make the prediction result inaccurate.
Based on this, the embodiments of the present application provide an image prediction method applied to an encoder or a decoder. For the encoder side, after unidirectional predicted values of the image block in two prediction directions are determined by performing motion estimation on the image block to be encoded, gradient values of the image block are calculated according to the gradient parameters corresponding to the determined image block and the unidirectional predicted values in the two prediction directions; and then correcting the initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions. For a decoder side, after obtaining a prediction mode parameter of an image block to be decoded by analyzing a code stream, when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, analyzing a motion parameter of the image block; according to the motion parameters, unidirectional prediction values of the image block in two prediction directions are determined; calculating gradient values of the image blocks according to the gradient parameters corresponding to the determined image blocks and the unidirectional prediction values respectively corresponding to the two prediction directions; and then correcting the initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block. 
Therefore, gradient parameters corresponding to the image blocks can be used for obtaining a gradient value between two unidirectional predicted values, and then the initial bidirectional predicted value is corrected according to the gradient value and the preset correction intensity value, so that the bidirectional predicted value is more accurate, the accuracy of a prediction result can be improved, the coding and decoding efficiency can be improved, and the video image quality is improved.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to fig. 2A, a block diagram of an example of a video coding system provided in an embodiment of the present application is shown. As shown in fig. 2A, the video coding system 10 includes a transform and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, a coding unit 109, a decoded picture buffer unit 110, and the like. The filtering unit 108 can implement deblocking filtering and Sample Adaptive Offset (SAO) filtering, and the coding unit 109 can implement header information coding and Context-based Adaptive Binary Arithmetic Coding (CABAC). For an input original video signal, a video coding block can be obtained by Coding Tree Unit (CTU) partitioning; the residual pixel information obtained after intra- or inter-frame prediction is then processed by the transform and quantization unit 101, which transforms the residual information from the pixel domain to the transform domain and quantizes the resulting transform coefficients to further reduce the bit rate. The intra estimation unit 102 and the intra prediction unit 103 are used for intra prediction of the video coding block; in particular, they determine the intra prediction mode to be used to encode the video coding block. The motion compensation unit 104 and the motion estimation unit 105 perform inter-prediction encoding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information; the motion estimation performed by the motion estimation unit 105 generates motion vectors that estimate the motion of the video coding block, and the motion compensation unit 104 then performs motion compensation based on the motion vectors determined by the motion estimation unit 105. After determining the intra prediction mode, the intra prediction unit 103 also supplies the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 likewise sends the calculated motion vector data to the encoding unit 109. Furthermore, the inverse transform and inverse quantization unit 106 is used for reconstruction of the video coding block: a residual block is reconstructed in the pixel domain, blocking artifacts of the reconstructed residual block are removed through the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in the frame of the decoded picture buffer unit 110 to generate a reconstructed video coding block. The encoding unit 109 encodes the various encoding parameters and the quantized transform coefficients; in a CABAC-based encoding algorithm, the context may be based on neighboring encoded blocks, and the unit may encode information indicating the determined intra prediction mode and output the code stream of the video signal. The decoded picture buffer unit 110 stores the reconstructed video coding blocks for prediction reference; as video coding proceeds, new reconstructed video coding blocks are continuously generated and stored in the decoded picture buffer unit 110.
Referring to fig. 2B, a block diagram of an example of a video decoding system provided in an embodiment of the present application is shown. As shown in fig. 2B, the video decoding system 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded picture buffer unit 206, and the like. The decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement deblocking filtering and SAO filtering. After the input video signal undergoes the encoding process of fig. 2A, the code stream of the video signal is output; the code stream is input into the video decoding system 20 and first passes through the decoding unit 201 to obtain the decoded transform coefficients. The transform coefficients are processed by the inverse transform and inverse quantization unit 202 to produce a residual block in the pixel domain. The intra prediction unit 203 can generate prediction data for the current decoded video block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture. The motion compensation unit 204 determines prediction information for the decoded video block by parsing motion vectors and other associated syntax elements, and uses the prediction information to generate the predictive block of the video block being decoded. A decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 with the corresponding predictive block generated by the intra prediction unit 203 or the motion compensation unit 204. The decoded video signal passes through the filtering unit 205 to remove blocking artifacts, which improves video quality. The decoded video blocks are then stored in the decoded picture buffer unit 206, which stores reference pictures for subsequent intra prediction or motion compensation and also serves for the output of the video signal, i.e. the restored original video signal is obtained.
The image prediction method in the embodiment of the present application may be applied to the inter prediction portions of the motion compensation unit 104 and the motion estimation unit 105 shown in fig. 2A, or may be applied to the inter prediction portion of the motion compensation unit 204 shown in fig. 2B. That is to say, the image prediction method in the embodiment of the present application may be applied to a video coding system, a video decoding system, or even applied to both the video coding system and the video decoding system, but the embodiment of the present application is not limited specifically. Here, when the method is applied to a video coding system, an "image block" specifically refers to a current block to be coded in inter prediction; when the method is applied to a video decoding system, the "image block" specifically refers to a current block to be decoded in inter prediction.
Based on the application scenario example of fig. 2A, refer to fig. 3, which shows a flowchart of an image prediction method provided in an embodiment of the present application. As shown in fig. 3, applied to an encoder, the method may include:
s301: carrying out motion estimation on an image block to be coded, and determining unidirectional predicted values of the image block in two prediction directions respectively;
it should be noted that a frame of video image may be divided into a plurality of image blocks, and each image block to be currently encoded may be a CU. Here, the image block to be encoded specifically refers to a current image block to be subjected to first image component, second image component, or third image component encoding prediction in a video image.
Assuming that first image component prediction is performed on the image block to be encoded and the first image component is a luminance component, this may be referred to as performing luminance prediction on the image block to be encoded, and the obtained prediction value may be referred to as a luminance value. Alternatively, assuming that second image component prediction is performed on the image block to be encoded and the second image component is a chrominance component, this may be referred to as performing chrominance prediction on the image block to be encoded, and the obtained prediction value may be referred to as a chrominance value. In the embodiment of the present application, luminance prediction is preferably performed on the image block to be encoded, but the present application is not limited thereto.
Specifically, in some embodiments, the performing motion estimation on the image block to be encoded, and determining the unidirectional prediction values of the image block in two prediction directions may include:
acquiring predicted image blocks of the image blocks in two prediction directions;
performing motion estimation according to the image block and the two predicted image blocks, and determining motion vectors corresponding to the two prediction directions respectively;
and determining unidirectional predicted values respectively corresponding to the two prediction directions according to the two predicted image blocks and the two motion vectors.
It should be noted that the video sequence includes multiple frames of video images, such as a current frame, a first reference frame (reference frame 0), a second reference frame (reference frame 1), and the like. Assuming that the image block to be encoded is located in the current frame, the two predicted image blocks are located in reference frame 0 and reference frame 1 of the video sequence, respectively. In addition, reference frame 0 and reference frame 1 may be located in the forward and backward directions of the current frame respectively, may both be located in the backward direction of the current frame, or may even both be located in the forward direction of the current frame; that is, the two prediction directions may include a forward direction and a backward direction, two forward directions, two backward directions, etc. In the embodiment of the present application, the two prediction directions preferably include a forward direction and a backward direction, but this is not limited in any way.
It should be further noted that there may be one unidirectional prediction value in each prediction direction, so that there are two unidirectional prediction values in total; there may also be at least one unidirectional prediction value per prediction direction, so that there are more than two unidirectional prediction values in total. Preferably, there is one unidirectional prediction value in each prediction direction, such as a first unidirectional prediction value determined in the first prediction direction and a second unidirectional prediction value determined in the second prediction direction.
That is, assuming that the two prediction directions include a first prediction direction and a second prediction direction, for the first prediction direction, a first unidirectional prediction value (denoted by Pred0) may be determined from the corresponding predicted image block and motion vector; for the second prediction direction, a second unidirectional prediction value (denoted by Pred1) may be determined from the corresponding predicted image block and motion vector.
Therefore, after the unidirectional prediction values respectively corresponding to the two prediction directions are obtained, on the one hand, the motion vector and the reference image index corresponding to each predicted image block can be written into the code stream, so that the decoder side can obtain them by parsing the code stream; on the other hand, a preset algorithm can be used to calculate, from the unidirectional prediction values respectively corresponding to the two prediction directions, the initial bidirectional prediction value (denoted by Pred_BI). The preset algorithm may be a weighted-sum algorithm, but the embodiment of the present application is not limited in any way.
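As a minimal sketch of the weighted-sum step described above: the patent only states that a weighted-sum algorithm may be used, so the equal 1/2 weighting with rounding below is an assumption, not the mandated formula.

```python
def initial_bi_prediction(pred0: int, pred1: int) -> int:
    """Form the initial bidirectional predictor from two unidirectional ones.

    Assumes equal weights; (Pred0 + Pred1 + 1) >> 1 averages with
    rounding to nearest, the common integer form of a 1/2-1/2 weighted sum.
    """
    return (pred0 + pred1 + 1) >> 1
```

A weighted sum with unequal weights would replace the shift with an explicit multiply-and-normalize; the shift form is shown because it matches the integer arithmetic used throughout the codec formulas.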
It should be understood that, in addition to the calculation using formula (1) or formula (2), the inter bidirectional prediction algorithm may employ a bidirectional optical-flow prediction algorithm. Specifically, the BIO technique uses the optical-flow principle to perform motion compensation on top of bidirectional prediction, and is applicable only to the bidirectional prediction case: it calculates gradient values in the x direction and the y direction for each pixel in the forward and backward predicted image blocks, and calculates a computation factor for each pixel according to the pixel value and the gradient value corresponding to that pixel.
To reduce computational complexity, all pixels within each cluster can be considered to have the same motion vector, and using a window larger than the cluster can improve the accuracy of the computed motion vector. Here, a cluster denotes an image block of size 4×4, and a window denotes an image block of size greater than 4×4; a cluster or a window can be considered a sub-block (subBlock). That is, in the BIO technique, the size of a cluster is 4×4, i.e., the motion vector value (vx, vy) of a 4×4 cluster is calculated from a window centered on the cluster. For each cluster, the motion vector value (vx, vy) is calculated according to the computation factors of all pixel positions in the window where the cluster is located, and finally each pixel in the cluster is processed to obtain the bidirectional prediction value. The specific calculation formula is as follows,
pred_BI(x,y) = (I^(0)(x,y) + I^(1)(x,y) + b + 1) >> 1    (3)

where pred_BI(x,y) is the corrected bidirectional prediction value, I^(0)(x,y) denotes the unidirectional prediction value at pixel position (x,y) within reference frame 0, I^(1)(x,y) denotes the unidirectional prediction value at pixel position (x,y) within reference frame 1, "=" denotes the assignment operator, ">>" denotes the right-shift operator, and b is a correction value.
Specifically, the calculation formula of b is as follows,

b = (v_x/2)·(∂I^(1)(x,y)/∂x − ∂I^(0)(x,y)/∂x) + (v_y/2)·(∂I^(1)(x,y)/∂y − ∂I^(0)(x,y)/∂y)    (4)

where ∂/∂x and ∂/∂y denote partial-derivative operators, and v_x, v_y are the motion vector values obtained by bidirectional optical-flow prediction, as shown below,
v_x = (S_1 + r) > m ? clip3(−th_BIO, th_BIO, (S_3 << 5)/(S_1 + r)) : 0    (5)

v_y = (S_5 + r) > m ? clip3(−th_BIO, th_BIO, ((S_6 << 6) − v_x·S_2)/((S_5 + r) << 1)) : 0    (6)

where "<<" denotes the left-shift operator, clip3 denotes the clamping operator, −th_BIO denotes the lower bound value, and th_BIO denotes the upper bound value. For example, clip3(i, j, x): when x is smaller than i, the value is i; when x is larger than j, the value is j; when x is greater than or equal to i and less than or equal to j, the value is x.
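The clip3 clamping operator described above can be sketched directly from its definition:

```python
def clip3(i: int, j: int, x: int) -> int:
    """clip3(i, j, x): clamp x to the inclusive range [i, j].

    When x < i the result is i; when x > j the result is j;
    otherwise the result is x itself, exactly as defined in the text.
    """
    if x < i:
        return i
    if x > j:
        return j
    return x
```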
Here, S_1 to S_6 are the gradient values, calculated respectively as follows,
S_1 = Σ_{(i,j)∈Ω} ψ_x(i,j)·ψ_x(i,j)    (7)

S_2 = Σ_{(i,j)∈Ω} ψ_x(i,j)·ψ_y(i,j)    (8)

S_3 = Σ_{(i,j)∈Ω} θ(i,j)·ψ_x(i,j)    (9)

S_5 = Σ_{(i,j)∈Ω} ψ_y(i,j)·ψ_y(i,j)    (10)

S_6 = Σ_{(i,j)∈Ω} θ(i,j)·ψ_y(i,j)    (11)
where the calculation formulas of ψ_x(i,j), ψ_y(i,j), and θ(i,j) are respectively as follows,

ψ_x(i,j) = ∂I^(1)(i,j)/∂x + ∂I^(0)(i,j)/∂x    (12)

ψ_y(i,j) = ∂I^(1)(i,j)/∂y + ∂I^(0)(i,j)/∂y    (13)

θ(i,j) = I^(1)(i,j) − I^(0)(i,j)    (14)
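The window accumulation of equations (7) to (14) can be sketched as follows. The gradient arrays gx0/gx1/gy0/gy1 (horizontal and vertical gradients of I^(0) and I^(1)) and the window Ω as a list of coordinates are hypothetical inputs for illustration; they are not data structures named in the text.

```python
def bio_correlations(i0, i1, gx0, gx1, gy0, gy1, omega):
    """Accumulate S1, S2, S3, S5, S6 over window Omega per eqs. (7)-(14).

    i0/i1 are sample arrays of the two prediction blocks; gx*/gy* are
    their per-pixel gradients; omega is an iterable of (i, j) positions.
    """
    s1 = s2 = s3 = s5 = s6 = 0
    for (i, j) in omega:
        psi_x = gx1[i][j] + gx0[i][j]   # eq. (12)
        psi_y = gy1[i][j] + gy0[i][j]   # eq. (13)
        theta = i1[i][j] - i0[i][j]     # eq. (14)
        s1 += psi_x * psi_x             # eq. (7)
        s2 += psi_x * psi_y             # eq. (8)
        s3 += theta * psi_x             # eq. (9)
        s5 += psi_y * psi_y             # eq. (10)
        s6 += theta * psi_y             # eq. (11)
    return s1, s2, s3, s5, s6
```

The returned values feed directly into the motion vector formulas (5) and (6) for the 4×4 cluster centered in the window.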
In addition, VVC includes a very similar technique, BDOF. There, S_1 to S_6 are calculated using the spatial and temporal gradients, with the calculation formulas respectively as follows,
S_1 = Σ_{(i,j)∈Ω} Abs(ψ_x(i,j))    (15)

S_2 = Σ_{(i,j)∈Ω} ψ_x(i,j)·Sign(ψ_y(i,j))    (16)

S_3 = Σ_{(i,j)∈Ω} θ(i,j)·Sign(ψ_x(i,j))    (17)

S_5 = Σ_{(i,j)∈Ω} Abs(ψ_y(i,j))    (18)

S_6 = Σ_{(i,j)∈Ω} θ(i,j)·Sign(ψ_y(i,j))    (19)
where the calculation formulas of ψ_x(i,j), ψ_y(i,j), and θ(i,j) are respectively as follows,

ψ_x(i,j) = (∂I^(1)(i,j)/∂x + ∂I^(0)(i,j)/∂x) >> n_a    (20)

ψ_y(i,j) = (∂I^(1)(i,j)/∂y + ∂I^(0)(i,j)/∂y) >> n_a    (21)

θ(i,j) = (I^(1)(i,j) >> n_b) − (I^(0)(i,j) >> n_b)    (22)

where n_a and n_b are the gradient and sample shift parameters, respectively.
From the above equations (15) to (19), the compensated motion vector values v_x and v_y can be calculated; the calculation formulas are as follows,

v_x = S_1 > 0 ? clip3(−th′_BIO, th′_BIO, −((S_3 · 2^(n_b − n_a)) >> ⌊log2(S_1)⌋)) : 0    (23)

v_y = S_5 > 0 ? clip3(−th′_BIO, th′_BIO, −(((S_6 · 2^(n_b − n_a)) − ((v_x · S_2,m) << n_S2 + v_x · S_2,s)/2) >> ⌊log2(S_5)⌋)) : 0    (24)

where S_2,m = S_2 >> n_S2, S_2,s = S_2 & (2^(n_S2) − 1), th′_BIO = 2^(max(5, BD − 7)), BD denotes the bit depth, and ⌊·⌋ denotes the rounding-down (floor) operation.
Then, the correction value b can be calculated from v_x and v_y obtained by equations (23) and (24),

b(x,y) = rnd((v_x · (∂I^(1)(x,y)/∂x − ∂I^(0)(x,y)/∂x))/2) + rnd((v_y · (∂I^(1)(x,y)/∂y − ∂I^(0)(x,y)/∂y))/2)    (25)

where rnd(·) denotes a rounding operation.
The calculation formula of the finally obtained bidirectional prediction value is as follows,

pred_BDOF(x,y) = (I^(0)(x,y) + I^(1)(x,y) + b(x,y) + o_offset) >> shift    (26)

where pred_BDOF(x,y) denotes the corrected bidirectional prediction value in the BDOF technique, o_offset denotes a preset compensation value, and shift denotes the number of bits to shift right.
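A per-pixel sketch of the BDOF correction and final prediction, equations (25) and (26). The rounding in (25) is simplified to integer division, and all parameter names (o_offset, shift, the gradient arguments) are illustrative placeholders rather than the codec's exact variables.

```python
def bdof_pixel(i0: int, i1: int, gx0: int, gx1: int, gy0: int, gy1: int,
               vx: int, vy: int, o_offset: int, shift: int) -> int:
    """Correct one pixel of the bidirectional prediction per eqs. (25)-(26).

    i0/i1 are the unidirectional prediction samples, gx*/gy* the per-pixel
    gradients of I^(0)/I^(1), and (vx, vy) the refined motion vector.
    """
    # eq. (25): correction from gradient differences (rounding simplified)
    b = (vx * (gx1 - gx0)) // 2 + (vy * (gy1 - gy0)) // 2
    # eq. (26): normalize the sum with the preset offset and right shift
    return (i0 + i1 + b + o_offset) >> shift
```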
It should be noted that the method of the embodiment of the present application is applied to the inter bidirectional prediction technique, mainly targets the correction process of the initial bidirectional prediction value, and can be applied after bidirectional prediction and/or BIO. That is, in HPM6.0, the initial bidirectional prediction value can be calculated directly by equation (1), or can be calculated by the above equations (3) to (14). Illustratively, if the bidirectional prediction satisfies the preset conditions of the BIO technique, equations (3) to (14) may be selected to calculate the initial bidirectional prediction value Pred_BI; otherwise, equation (1) can be used directly to calculate the initial bidirectional prediction value Pred_BI.
Taking the luminance values of samples in an image block as an example, refer to fig. 4, which shows a graph of a luminance value variation trend provided by an embodiment of the present application. As shown in fig. 4, it is assumed that the image block to be predicted is located in the current frame, the first predicted image block is located in reference frame 0, the second predicted image block is located in reference frame 1, the luminance values of the samples change continuously between the reference frames and the current frame, reference frame 0 and reference frame 1 are located on both sides of the current frame, and the luminance values of the samples at the corresponding positions are R0 and R1, respectively. The luminance change trend between R0 and R1 may follow curve 1, curve 2, or curve 3. If the current frame is located between reference frame 0 and reference frame 1 and the change trend is close to linear, i.e. curve 2, the bidirectional prediction value obtained by averaging is relatively accurate; when the change trend approaches curve 1 or curve 3, the accuracy of the bidirectional prediction value obtained by averaging is low.
In order to improve the accuracy of the bidirectional predictive value, the gradient parameters corresponding to the image block need to be determined at this time to determine whether the variation trend is close to the curve 1 or close to the curve 3, and then the subsequent steps are executed to correct the bidirectional predictive value so as to improve the accuracy of the bidirectional predictive value.
S302: determining gradient parameters corresponding to the image block, wherein the gradient parameters at least comprise a gradient mark value and a gradient direction index value;
it should be noted that after the initial bidirectional prediction value is determined, the gradient parameter corresponding to the image block may also be determined. Wherein, the gradient flag value may be denoted by grad _ flag, and the gradient direction index value may be denoted by grad _ idx. Here, the gradient flag value is used to indicate whether the initial bidirectional prediction value is subjected to gradient correction, and the gradient direction index value is used to indicate the corrected gradient direction; and the gradient flag value and the gradient direction index value are both binary variables, namely only two values of '0' and '1' are included. For example, the value of grad _ flag is equal to 0, which indicates that no gradient correction is performed; the value of the grad _ flag is equal to 1, which indicates that gradient correction is carried out; the value of grad _ idx is equal to 0, and the positive gradient direction is represented; the value of grad _ idx equals 1, indicating the direction of the inverse gradient.
It should be noted that, on the encoder side, the image blocks to be predicted are subjected to precoding processing by using multiple prediction modes. Here, the plurality of prediction modes may generally include a first prediction mode, a second prediction mode, a third prediction mode, and the like, wherein gradient parameters corresponding to the first prediction mode, the second prediction mode, and the third prediction mode are different; specifically, in the first prediction mode, the gradient flag value in the gradient parameter may be set equal to 0; in the second prediction mode, a gradient flag value in the gradient parameter may be set equal to 1, and a gradient direction index value in the gradient parameter may be set equal to 0; in the third prediction mode, a gradient flag value in the gradient parameter may be set equal to 1, and a gradient direction index value in the gradient parameter may be set equal to 1.
In some embodiments, for S302, before the determining the gradient parameter corresponding to the image block, the method may further include:
carrying out pre-coding processing on the image block by using multiple prediction modes to obtain multiple pre-coding results; wherein the gradient parameters corresponding to different prediction modes are different, and the gradient parameters at least comprise a gradient flag value and a gradient direction index value;
Selecting an optimal coding result from the plurality of pre-coding results according to a preset strategy;
and determining the prediction mode corresponding to the optimal coding result as a target prediction mode.
That is to say, after the image blocks to be predicted are respectively precoded by using multiple prediction modes, precoding results corresponding to each prediction mode can be obtained, that is, multiple precoding results can be obtained; then, a preferred pre-coding result is decided from the plurality of pre-coding results, and a prediction mode corresponding to the preferred pre-coding result is determined as a target prediction mode, so that gradient parameters corresponding to an image block to be predicted can be determined; in this way, subsequent encoding prediction is performed based on the determined gradient parameter, the prediction residual can be made small, and encoding efficiency can be improved.
Further, for the decision on the optimal pre-coding result, a simple decision strategy can be adopted, for example, deciding according to the magnitude of the distortion value; a complex decision strategy, such as a Rate Distortion Optimization (RDO) result, can also be used for the decision, and the embodiment of the present application is not limited in any way.
Optionally, in some embodiments, the selecting a preferred encoding result from the plurality of precoding results according to a preset policy may include:
determining a rate distortion cost value corresponding to each pre-coding result according to the plurality of pre-coding results;
and selecting the minimum rate distortion cost value from the plurality of determined rate distortion cost values, and determining the pre-coding result corresponding to the minimum rate distortion cost value as the optimal coding result.
Optionally, in some embodiments, the selecting a preferred encoding result from the plurality of precoding results according to a preset policy may include:
determining a distortion value corresponding to each pre-coding result according to the plurality of pre-coding results;
and selecting a minimum distortion value from the plurality of determined distortion values, and determining a pre-coding result corresponding to the minimum distortion value as the optimal coding result.
Here, preferably, an RDO decision is taken as an example, and a rate distortion cost value corresponding to each precoding result is determined according to the obtained multiple precoding results; and then selecting the minimum rate distortion cost value from the plurality of determined rate distortion cost values, and determining the pre-coding result corresponding to the minimum rate distortion cost value as the optimal coding result.
Further, after the target prediction mode is determined, the setting condition of the gradient parameter can be obtained according to the target prediction mode. Specifically, in some embodiments, the determining the prediction mode corresponding to the preferred encoding result as the target prediction mode may include:
if the target prediction mode is a first prediction mode, setting a gradient flag value in the gradient parameter to be equal to 0;
if the target prediction mode is a second prediction mode, setting a gradient flag value in the gradient parameter equal to 1, and setting a gradient direction index value in the gradient parameter equal to 0;
and if the target prediction mode is a third prediction mode, setting a gradient flag value in the gradient parameter equal to 1, and setting a gradient direction index value in the gradient parameter equal to 1.
That is, if it is determined that the target prediction mode is the first prediction mode, it may be obtained that the gradient flag value in the gradient parameter is equal to 0; if the target prediction mode is determined to be the second prediction mode, obtaining that the gradient sign value in the gradient parameter is equal to 1, and the gradient direction index value in the gradient parameter is equal to 0; if the target prediction mode is determined to be the third prediction mode, it can be obtained that the gradient flag value in the gradient parameter is equal to 1, and the gradient direction index value in the gradient parameter is equal to 1.
Further, after the gradient flag value is obtained, the gradient flag value needs to be written into a code stream; and if the gradient flag value is equal to 1, the gradient direction index values are all required to be written into the code stream. Specifically, in some embodiments, the method may further comprise:
if the gradient flag value is equal to 0, writing the gradient flag value into a code stream;
and if the gradient mark value is equal to 1, writing the gradient mark value and the gradient direction index value into a code stream.
Therefore, at the encoder side, the gradient flag value or the gradient flag value and the gradient direction index value in the gradient parameter need to be written into the code stream, so that the subsequent analysis processing at the decoder side is facilitated, and the gradient flag value or the gradient flag value and the gradient direction index value are directly obtained.
S303: calculating gradient values of the image blocks by using the determined gradient parameters and the unidirectional predicted values respectively corresponding to the two prediction directions;
It should be noted that after the gradient parameters are obtained, if the set gradient flag value is equal to 0, it indicates that gradient correction does not need to be performed on the initial bidirectional prediction value, that is, gradient correction is turned off; in this case, step S303 does not need to be executed, and the initial bidirectional prediction value may be directly used as the bidirectional prediction value of the image block. Thus, in some embodiments, the method may further comprise: when the set gradient flag value is equal to 0, determining the initial bidirectional prediction value as the bidirectional prediction value of the image block.
It should also be noted that by default, the gradient flag value is equal to 1, which can be determined by a configuration (configure) file on the encoder side. When the gradient flag value is equal to 1, it indicates that the initial bi-directional prediction value needs to be subjected to gradient correction, and at this time, step S303 needs to be performed to calculate the gradient value of the image block to be encoded.
Specifically, in some embodiments, when the set gradient flag value is equal to 1, for S303, the calculating the gradient value of the image block by using the determined gradient parameter and the unidirectional prediction values respectively corresponding to the two prediction directions may include:
acquiring the set gradient mark value and the gradient direction index value;
if the gradient flag value is equal to 1 and the gradient direction index value is equal to 0, subtracting the first unidirectional predicted value from the second unidirectional predicted value to obtain a gradient value of the image block;
and if the gradient flag value is equal to 1 and the gradient direction index value is equal to 1, subtracting a second unidirectional predicted value from a first unidirectional predicted value to obtain a gradient value of the image block.
Here, the first unidirectional prediction value (Pred 0) represents a unidirectional prediction value corresponding to the image block in a first prediction direction, and the second unidirectional prediction value (Pred 1) represents a unidirectional prediction value corresponding to the image block in a second prediction direction.
It should be noted that, when the gradient flag value is equal to 1, it indicates that the initial bidirectional prediction value needs to be corrected. At this time, if the gradient direction index value is equal to 0, indicating that the gradient direction is the forward gradient direction, the gradient value of the image block is Pred1 − Pred0; if the gradient direction index value is equal to 1, indicating that the gradient direction is the inverse gradient direction, the gradient value of the image block is Pred0 − Pred1.
Still taking fig. 4 as an example, when it is determined that the change trend is close to curve 1, indicating that the gradient direction is the forward gradient direction, i.e. the gradient direction index value is equal to 0, a gradient value (R1 − R0) from reference frame 0 to reference frame 1 can be determined; when it is determined that the change trend is close to curve 3, indicating that the gradient direction is the inverse gradient direction, i.e. the gradient direction index value is equal to 1, a gradient value (R0 − R1) from reference frame 1 to reference frame 0 can be determined. Then the initial bidirectional prediction value can be corrected according to the determined gradient value, i.e. step S304 is performed.
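Step S303 as described above reduces to a small selection rule; a sketch, with the function name chosen for illustration:

```python
def block_gradient(pred0: int, pred1: int, grad_flag: int, grad_idx: int):
    """Derive the gradient value of the image block from the two
    unidirectional predictors and the gradient parameters.

    Returns None when grad_flag == 0 (gradient correction disabled).
    """
    if grad_flag == 0:
        return None            # no gradient correction for this block
    if grad_idx == 0:
        return pred1 - pred0   # forward gradient direction
    return pred0 - pred1       # inverse gradient direction
```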
S304: and correcting the initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block.
The initial bidirectional predictor is a weighted sum of unidirectional predictors corresponding to the two prediction directions, respectively. In terms of implementation steps, the initial bidirectional prediction value may be calculated after the unidirectional prediction value is obtained in step S301, may be calculated before the correction operation is performed in step S304, and may be calculated synchronously even in the processes of step S302 and step S303, which is not limited in any way.
It should be further noted that, after the gradient value of the image block is determined, the initial bidirectional prediction value may be corrected by combining a preset correction strength value to obtain a final bidirectional prediction value. Specifically, the modifying the initial bidirectional prediction value according to the gradient value of the image block and a preset modification strength value to obtain the bidirectional prediction value of the image block may include:
determining a correction gradient value of the image block according to the gradient value of the image block and a preset correction intensity value;
and correcting the initial bidirectional prediction value by using the determined correction gradient value to obtain the bidirectional prediction value of the image block.
Alternatively, a modified gradient value of the image block may be calculated by using a shifting manner, and then the initial bidirectional prediction value may be modified according to the modified gradient value. Specifically, in some embodiments, the determining a modified gradient value of the image block according to the gradient value of the image block and a preset modified intensity value may include:
and performing displacement calculation on the gradient values of the image blocks by using a preset correction intensity value to obtain the correction gradient values.
Further, the modifying the initial bidirectional prediction value by using the determined modification gradient value to obtain the bidirectional prediction value of the image block may include:
and superposing the corrected gradient value and the initial bidirectional prediction value to obtain the bidirectional prediction value of the image block.
Here, the preset correction strength value may be denoted by k. In general, the default value of k may be set to a fixed value, preferably 3, or may be defined using slice-level or higher-level syntax.
Assuming that the final bidirectional prediction value is denoted by Pred, the initial bidirectional prediction value by Pred_BI, the first unidirectional prediction value by Pred0, and the second unidirectional prediction value by Pred1, the bidirectional prediction value is expressed as follows,

Pred = Pred_BI + ((Pred1 − Pred0) >> k), if the gradient direction index value is equal to 0
Pred = Pred_BI + ((Pred0 − Pred1) >> k), if the gradient direction index value is equal to 1    (27)

where "=" denotes the assignment operator and ">>" denotes the right-shift operator, i.e. ">> k" denotes a right shift by k bits.
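The shift-based correction of step S304 can be sketched as below; it assumes the block gradient (Pred1 − Pred0 or Pred0 − Pred1, depending on the gradient direction index) has already been computed, and uses Python's arithmetic right shift, which handles negative gradients as well.

```python
def correct_bi_prediction(pred_bi: int, grad: int, k: int = 3) -> int:
    """Superpose the corrected gradient value onto the initial
    bidirectional predictor: Pred = Pred_BI + (grad >> k).

    k is the preset correction strength value (default 3 per the text);
    a larger k yields a weaker correction.
    """
    return pred_bi + (grad >> k)
```

Note that in Python `>>` on a negative integer is an arithmetic (sign-preserving) shift, so an inverse-direction gradient subtracts from the initial predictor as intended.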
Further, the value of k may be a preset constant value, for example, the value of k is 3. In addition, the optimal k value can be selected from a mapping list in a self-adaptive determination mode according to the existing parameters, such as an RDO decision mode; and then writing the index sequence number of the k value in the mapping list into the code stream, so that the index sequence number can be conveniently acquired by analyzing the code stream at the decoder side subsequently, and the k value can be determined. That is, in some embodiments, the method may further comprise:
obtaining a mapping list corresponding to the corrected intensity value; wherein the mapping list represents a correspondence between the correction intensity value and the index number value;
calculating the rate distortion cost value corresponding to each correction intensity value in the mapping list;
selecting a minimum rate distortion cost value from the plurality of rate distortion cost values obtained through calculation, and determining a correction strength value corresponding to the minimum rate distortion cost value as the preset correction strength value;
and acquiring an index serial number value corresponding to the preset correction intensity value, and writing the acquired index serial number value into a code stream.
It should be noted that the mapping list reflects the correspondence between the correction strength value and the index number value. Filtering of different intensities can be achieved by using different modified intensity values. In general, k may be a default value or an optional item in the mapping list, and the embodiment of the present application is not particularly limited. For example, table 1 gives an example of a mapping list defining a correspondence between the modified strength value and the index number value.
TABLE 1
Index sequence number value:   0   1   2   3
Correction intensity value:    0   2   3   4
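The adaptive selection of k from the Table 1 mapping list by an RDO-style decision can be sketched as follows. The callable `rd_cost` stands in for the encoder's rate-distortion evaluation of the block under a given strength value; it is a hypothetical interface, not an API from the text.

```python
# Table 1: index sequence number value -> correction intensity value
STRENGTH_LIST = [0, 2, 3, 4]

def select_strength(rd_cost):
    """Pick the correction strength with minimum RD cost.

    Returns (index, k): the index sequence number value is what gets
    written into the code stream; k is the chosen correction strength.
    """
    costs = [rd_cost(k) for k in STRENGTH_LIST]
    idx = costs.index(min(costs))
    return idx, STRENGTH_LIST[idx]
```

The decoder then recovers k by parsing the index from the code stream and looking it up in the same mapping list.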
Alternatively, a modified gradient value of the image block may be calculated by multiplication, and then the initial bidirectional prediction value may be modified according to the modified gradient value. Specifically, in some embodiments, the determining a modified gradient value of the image block according to the gradient value of the image block and a preset modified intensity value may include:
and multiplying a preset correction intensity value by the gradient value of the image block to obtain the correction gradient value.
Further, the correcting the initial bidirectional prediction value according to the correction gradient value to obtain the bidirectional prediction value of the image block may include:
and superposing the corrected gradient value and the initial bidirectional prediction value to obtain the bidirectional prediction value of the image block.
At this time, multiplication can be used instead of shifting, enabling more flexible correction. The correction strength value may also be called a multiplication factor, denoted by s; the bidirectional prediction value is expressed as follows,

Pred = Pred_BI + s·(Pred1 − Pred0), if the gradient direction index value is equal to 0
Pred = Pred_BI + s·(Pred0 − Pred1), if the gradient direction index value is equal to 1    (28)
it should be further noted that the value of s may be a preset constant value; or, the optimal s value may be selected according to the existing parameters, for example, by using the mapping list shown in table 1 and using the ROD decision method, and then the corresponding index number value is written into the code stream, so as to facilitate the subsequent parsing on the decoder side.
The embodiment of the application provides an image prediction method which is applied to an encoder. The method comprises the steps of determining unidirectional predicted values of an image block in two prediction directions respectively corresponding to the image block by performing motion estimation on the image block to be coded; determining gradient parameters corresponding to the image block, wherein the gradient parameters at least comprise a gradient mark value and a gradient direction index value; calculating gradient values of the image blocks by using the determined gradient parameters and the unidirectional predicted values respectively corresponding to the two prediction directions; and correcting an initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions. Therefore, gradient parameters corresponding to the image blocks can be used for obtaining a gradient value between two unidirectional predicted values, and then the initial bidirectional predicted value is corrected according to the gradient value and the preset correction intensity value, so that the bidirectional predicted value is more accurate, the accuracy of a prediction result can be improved, the coding and decoding efficiency can be improved, and the video image quality is improved.
Further, for the image prediction method in the embodiment of the present application, the image block to be encoded may be one CU. Illustratively, the conditions restricting the use of the method may include:
(1) Only for bi-directionally predicted CUs, i.e. only when both reference frame 0 and reference frame 1 are valid;
(2) Only for CUs having a number of samples equal to or greater than 256;
(3) Only for CUs transmitting MVDs;
(4) Only under non-low-delay (low delay) conditions;
(5) Only for luminance values.
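The restriction conditions above can be sketched as a single eligibility predicate; the function name and parameters are hypothetical, and the sample-count threshold follows condition (2):

```python
def gradient_correction_allowed(ref0_valid, ref1_valid, width, height,
                                transmits_mvd, low_delay, is_luma):
    return (ref0_valid and ref1_valid    # (1) both reference frames valid
            and width * height >= 256    # (2) at least 256 samples in the CU
            and transmits_mvd            # (3) the CU transmits an MVD
            and not low_delay            # (4) non-low-delay configuration
            and is_luma)                 # (5) luminance component only
```

For instance, an 8x8 CU (64 samples) fails condition (2) even when all other conditions hold.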
It should be noted that these conditions are given only as examples of the use restrictions for applying the method and are not specifically limited. In addition, for the image prediction method in the embodiment of the present application, corresponding syntax elements may also be set at the slice level or a higher level; the relevant syntax elements are described below.
First, for a video sequence, it needs to be described whether the video sequence can use the inter-frame bidirectional gradient. Specifically, a syntax element may be introduced in the sequence header: the inter-frame bidirectional gradient enable flag, which may be denoted by grad_enable_flag. The inter-frame bidirectional gradient enable flag is a binary variable; when the value is '1', it indicates that the video sequence can use the inter-frame bidirectional gradient; when the value is '0', it indicates that the video sequence cannot use the inter-frame bidirectional gradient. Table 2 gives a syntax description of this part of the content, with the grey part being the newly added syntax content.
TABLE 2
Figure PCTCN2020077491-APPB-000014
Figure PCTCN2020077491-APPB-000015
Then, for the newly added gradient parameters, such as the gradient flag value and the gradient direction index value, two CU-level syntax elements are introduced: (1) The inter-frame bidirectional gradient CU-level permission flag, which may be denoted by grad_flag, is a binary variable; when the value is '1', it indicates that the CU uses inter-frame bidirectional gradient correction; when the value is '0', the CU does not use the inter-frame bidirectional gradient; (2) The inter-frame bidirectional gradient CU-level direction flag, which may be denoted by grad_idx, is a binary variable; when the value is '1', the CU uses reverse gradient correction; when the value is '0', the CU uses forward gradient correction. Table 3 gives a syntax description of this part of the content, with the grey part being the newly added syntax content.
TABLE 3
Figure PCTCN2020077491-APPB-000016
Further, syntax elements may also be set in the sequence header definition, picture header definition, slice definition, and coding tree unit definition, where a lower layer may override a higher layer.
The meaning of the modified CU-level syntax element is as follows: the inter-frame bidirectional gradient CU-level permission flag, denoted by grad_flag, is a binary variable; when the value is '1', it indicates that whether the CU uses gradient correction is opposite to the CTU layer; when the value is '0', it indicates consistency with the CTU layer.
The meanings of the syntax elements that may be added are as follows:
(1) The inter-frame bidirectional gradient sequence-level permission flag, denoted by grad_seq_flag; it is a binary variable; when the value is '1', it indicates that the video sequence uses inter-frame bidirectional gradient correction; when the value is '0', the video sequence does not use the inter-frame bidirectional gradient;
(2) The inter-frame bidirectional gradient picture-level permission flag, denoted by grad_pic_flag; it is a binary variable; when the value is '1', it indicates that whether the picture uses gradient correction is opposite to the sequence layer; when the value is '0', it indicates consistency with the sequence layer;
(3) The inter-frame bidirectional gradient slice-level permission flag, denoted by grad_pat_flag; it is a binary variable; when the value is '1', it indicates that whether the slice uses gradient correction is opposite to the picture layer; when the value is '0', it indicates consistency with the picture layer;
(4) The inter-frame bidirectional gradient CTU-level permission flag, denoted by grad_ctu_flag; it is a binary variable; when the value is '1', it indicates that whether the CTU uses gradient correction is opposite to the slice layer; when the value is '0', it indicates consistency with the slice layer.
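Under the "opposite to the upper layer" / "consistent with the upper layer" semantics just described, the effective on/off decision at the CU level can be derived as a chain of XORs starting from the sequence-level flag. This is an illustrative interpretation of the text above, not normative:

```python
def gradient_enabled(seq_flag, pic_flag, pat_flag, ctu_flag, cu_flag):
    # Start from the sequence-level decision; each lower-level flag of 1
    # flips the decision inherited from the level above ("opposite"),
    # while a flag of 0 keeps it ("consistent").
    enabled = bool(seq_flag)
    for flag in (pic_flag, pat_flag, ctu_flag, cu_flag):
        enabled ^= bool(flag)
    return enabled
```

So a picture-level flag of 1 under an enabled sequence disables correction for that picture, and a further CU-level flag of 1 re-enables it for that CU.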
Tables 4-1, 4-2, 4-3, and 4-4 respectively give syntax descriptions of this part of the content in the sequence header definition, inter prediction picture header definition, slice definition, and coding tree unit definition, with the newly added syntax elements placed in the corresponding parts of the syntax descriptions.
TABLE 4-1
Figure PCTCN2020077491-APPB-000017
TABLE 4-2
Figure PCTCN2020077491-APPB-000018
TABLE 4-3
Figure PCTCN2020077491-APPB-000019
TABLE 4-4
Figure PCTCN2020077491-APPB-000020
Further, syntax elements may also be set in the sequence header definition, picture header definition, slice definition, and coding tree unit definition, where a lower layer may override a higher layer.
The meanings of the modified CU-level syntax elements are as follows: (1) The inter-frame bidirectional gradient CU-level permission flag, denoted by grad_flag, is a binary variable; when the value is '1', it indicates that whether the CU uses gradient correction is opposite to the CTU layer; when the value is '0', it indicates consistency with the CTU layer; (2) The inter-frame bidirectional gradient CU-level direction flag, denoted by grad_idx, is a binary variable; when the value is '1', the CU uses reverse gradient correction; when the value is '0', the CU uses forward gradient correction.
The meanings of the syntax elements that may be added are as follows:
(1) The inter-frame bidirectional gradient sequence-level permission flag, denoted by grad_seq_flag; it is a binary variable; when the value is '1', it indicates that the video sequence uses inter-frame bidirectional gradient correction; when the value is '0', the video sequence does not use the inter-frame bidirectional gradient;
(2) The inter-frame bidirectional gradient sequence-level direction flag, denoted by grad_seq_idx; it is a binary variable; when the value is '1', the video sequence uses reverse gradient correction; when the value is '0', the video sequence uses forward gradient correction;
(3) The inter-frame bidirectional gradient picture-level permission flag, denoted by grad_pic_flag; it is a binary variable; when the value is '1', it indicates that whether the picture uses gradient correction is opposite to the sequence layer; when the value is '0', it indicates consistency with the sequence layer;
(4) The inter-frame bidirectional gradient picture-level direction flag, denoted by grad_pic_idx; it is a binary variable; when the value is '1', the picture uses reverse gradient correction; when the value is '0', the picture uses forward gradient correction;
(5) The inter-frame bidirectional gradient slice-level permission flag, denoted by grad_pat_flag; it is a binary variable; when the value is '1', it indicates that whether the slice uses gradient correction is opposite to the picture layer; when the value is '0', it indicates consistency with the picture layer;
(6) The inter-frame bidirectional gradient slice-level direction flag, denoted by grad_pat_idx; it is a binary variable; when the value is '1', the slice uses reverse gradient correction; when the value is '0', the slice uses forward gradient correction;
(7) The inter-frame bidirectional gradient CTU-level permission flag, denoted by grad_ctu_flag; it is a binary variable; when the value is '1', it indicates that whether the CTU uses gradient correction is opposite to the slice layer; when the value is '0', it indicates consistency with the slice layer;
(8) The inter-frame bidirectional gradient CTU-level direction flag, denoted by grad_ctu_idx; it is a binary variable; when the value is '1', the CTU uses reverse gradient correction; when the value is '0', the CTU uses forward gradient correction.
Tables 5-1, 5-2, 5-3, and 5-4 respectively give syntax descriptions of this part of the content in the sequence header definition, inter prediction picture header definition, slice definition, and coding tree unit definition, with the newly added syntax elements placed in the corresponding parts of the syntax descriptions.
TABLE 5-1
Figure PCTCN2020077491-APPB-000021
TABLE 5-2
Figure PCTCN2020077491-APPB-000022
TABLE 5-3
Figure PCTCN2020077491-APPB-000023
TABLE 5-4
Figure PCTCN2020077491-APPB-000024
Further, for the preset correction intensity value, the index number of the correction intensity value in the mapping list shown in Table 1 may be described by setting the syntax element grad_k_idx, so as to implement filtering with different intensities.
In this case, the following syntax elements may be added:
(1) The inter-frame bidirectional gradient sequence-level correction intensity value, denoted by k_idx_grad_seq; it is a multi-valued variable whose value is an index number value in Table 1.
(2) The inter-frame bidirectional gradient picture-level correction intensity modification flag, denoted by k_idx_grad_pic_flag; it is a binary variable indicating whether the picture modifies the correction intensity value of the sequence layer; when the value is '0', it indicates consistency with the sequence level; otherwise k_idx_grad_pic will be transmitted to modify the correction intensity value used inside the present picture.
(3) The inter-frame bidirectional gradient picture-level correction intensity value, denoted by k_idx_grad_pic; it is a multi-valued variable whose value is an index number value in Table 1 above.
(4) The inter-frame bidirectional gradient slice-level correction intensity modification flag, denoted by k_idx_grad_pat_flag; it is a binary variable indicating whether the slice modifies the correction intensity k value of the picture layer; when the value is '0', it indicates consistency with the picture layer; otherwise k_idx_grad_pat will be transmitted to modify the correction intensity value used inside the present slice.
(5) The inter-frame bidirectional gradient slice-level correction intensity value, denoted by k_idx_grad_pat; it is a multi-valued variable whose value is an index number value in Table 1 above.
(6) The inter-frame bidirectional gradient CTU-level correction intensity modification flag, denoted by k_idx_grad_ctu_flag; it is a binary variable indicating whether the CTU modifies the correction intensity k value of the slice layer; when the value is '0', it indicates consistency with the slice; otherwise k_idx_grad_ctu will be transmitted to modify the correction intensity value used inside the present CTU.
(7) The inter-frame bidirectional gradient CTU-level correction intensity value, denoted by k_idx_grad_ctu; it is a multi-valued variable whose value is an index number value in Table 1 above.
(8) The inter-frame bidirectional gradient CU-level correction intensity modification flag, denoted by k_idx_grad_cu_flag; it is a binary variable indicating whether the CU modifies the correction intensity k value of the CTU layer; when the value is '0', it indicates consistency with the CTU layer; otherwise k_idx_grad_cu will be transmitted to modify the correction intensity value used inside the present CU.
(9) The inter-frame bidirectional gradient CU-level correction intensity value, denoted by k_idx_grad_cu; it is a multi-valued variable whose value is an index number value in Table 1 above.
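A hedged sketch of how such a hierarchy of modification flags could resolve the effective index number value. The names and the semantics of a modification flag of '1' meaning "a new index is transmitted at this level" are assumptions based on the descriptions above:

```python
def resolve_k_index(seq_k_idx, level_overrides):
    # level_overrides: (modify_flag, k_idx) pairs for the picture, slice,
    # CTU, and CU levels, in that order. A modify flag of 1 means a new
    # index number value was transmitted at that level; 0 inherits the
    # value from the level above.
    k_idx = seq_k_idx
    for modify_flag, new_idx in level_overrides:
        if modify_flag:
            k_idx = new_idx
    return k_idx
```

The lowest level at which a new index is transmitted wins, matching the "a lower layer may override a higher layer" rule.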
Tables 6-1, 6-2, 6-3, 6-4, and 6-5 respectively give syntax descriptions of this part of the content in the sequence header definition, inter prediction picture header definition, slice definition, coding tree unit definition, and coding unit definition, where the correction intensity value is represented by k, with the newly added syntax elements placed in the corresponding parts of the syntax descriptions.
TABLE 6-1
Figure PCTCN2020077491-APPB-000025
TABLE 6-2
Figure PCTCN2020077491-APPB-000026
TABLE 6-3
Figure PCTCN2020077491-APPB-000027
TABLE 6-4
Figure PCTCN2020077491-APPB-000028
Figure PCTCN2020077491-APPB-000029
TABLE 6-5
Figure PCTCN2020077491-APPB-000030
Here, among the syntax elements, ae(v) denotes a context-adaptive arithmetic entropy-coded syntax element; u(n) denotes an unsigned integer represented using n bits. The newly added CU-level syntax elements are ae(v); the flags in other header information, and any other syntax elements taking only two values, are u(1); the remaining multi-valued syntax elements are u(n).
In the embodiment of the present application, an implicit correction scheme may also be adopted. Specifically, grad_flag and grad_idx may be decided according to the characteristics of the CU and not transmitted in the code stream at all; or only grad_flag is transmitted in the code stream and grad_idx is predicted according to the characteristics of the CU, so that coding decisions and the code rate can be reduced. Alternatively, for a CU in direct/skip mode in AVS3, grad_flag and grad_idx may be inherited or predicted, so that the prediction is more accurate.
In addition, in this embodiment of the present application, according to the variation trend of the brightness values of the sampling points, when the current frame is not located between the two reference frames, assuming that the unidirectional predicted values obtained in the two directions are pred0 and pred1 respectively, one of the following calculation formulas may be used according to the relative difference between the reference frame positions:
Pred = pred_BI + (pred1 - pred0) >> k, or Pred = pred_BI + (pred0 - pred1) >> k,
and the final bidirectional predicted value is obtained by calculation according to one of these two formulas.
In addition, in the embodiment of the present application, when the method is applied after bidirectional prediction or after BIO, substantially the same correction calculation may be performed by deriving a merged processing form through operation. Alternatively, an implicit application condition may be set: for example, gradient correction may be disabled when the Sum of Absolute Differences (SAD) of the two unidirectional predicted values satisfies a certain condition, when the reference frame positions satisfy a certain condition, or when BIO is used. The existence conditions of the syntax elements may even be changed; for example, the CU sizes, coding modes, reference frame positional relationships, and so on, that allow gradient correction to be used may be adjusted.
Illustratively, test results of the test sequences in the Random Access (RA) mode, based on HPM6, the reference software of AVS3, are shown in Table 7 below. Specifically, Table 7 shows the gain effects on the three image components (Y/U/V) measured using the image prediction method according to the embodiment of the present application; it can be seen that the method makes the prediction result more accurate, thereby improving the coding efficiency.
TABLE 7
Figure PCTCN2020077491-APPB-000031
The embodiment provides an image prediction method applied to an encoder. The specific implementation of the foregoing embodiment is elaborated in detail through this embodiment, and it can be seen that, for an image block to be encoded, a gradient value between two unidirectional predicted values can be obtained by using a gradient parameter corresponding to the image block, and then an initial bidirectional predicted value is corrected according to the gradient value and a preset correction strength value, so that the bidirectional predicted value is more accurate, the accuracy of a prediction result can be improved, the encoding efficiency can be improved, and the video image quality is improved.
Based on the application scenario example of fig. 2B, refer to fig. 5, which shows a flowchart of another image prediction method provided in an embodiment of the present application. As shown in fig. 5, the method may include:
s501: analyzing the code stream to obtain a prediction mode parameter of an image block to be decoded;
it should be noted that the method is applied to a decoder. On the decoder side, a frame of video image may also be divided into a plurality of image blocks, and each current image block to be decoded may be a CU. Here, the image block to be decoded specifically refers to a current image block to be subjected to decoding prediction of the first image component, the second image component, or the third image component in the video image.
It should be noted that the prediction mode parameter indicates the encoding mode of the image block and the mode-related parameter. The coding modes of the image block generally include a unidirectional prediction mode, a bidirectional prediction mode, and the like. That is to say, on the encoder side, prediction encoding is performed on an image block, and in this process, the encoding mode of the current block can be determined, and corresponding encoding mode parameters are written into a code stream and transmitted to a decoder by the encoder.
In this way, on the decoder side, the prediction mode parameters of the image block can be obtained by analyzing the code stream, and whether the image block uses the bidirectional prediction mode or not is determined according to the obtained prediction mode parameters.
S502: when the prediction mode parameters indicate that the image block uses a bidirectional prediction mode, analyzing motion parameters of the image block;
it should be noted that the motion parameters include a motion vector and a reference picture index. In this way, if the prediction mode parameter indicates that the image block uses the bidirectional prediction mode, the code stream needs to be continuously analyzed to obtain the motion vector and the reference picture index of the image block, so as to determine the unidirectional prediction values of the image block in the two prediction directions.
S503: according to the motion parameters, unidirectional predicted values of the image blocks in two prediction directions are determined, wherein the unidirectional predicted values correspond to the two prediction directions respectively;
it should be noted that after the motion parameters (the motion vector and the reference picture index) of the image block are obtained, the unidirectional prediction values corresponding to the image block in the two prediction directions may be determined according to the motion vector and the reference picture index. Specifically, in some embodiments, the determining, according to the motion parameter, unidirectional prediction values that respectively correspond to the image block in two prediction directions may include:
determining the prediction image blocks of the image blocks in two prediction directions according to the reference image indexes in the motion parameters;
and determining unidirectional predicted values respectively corresponding to the image blocks in two prediction directions according to the two determined predicted image blocks and the motion vectors in the motion parameters.
It should be noted that the video sequence includes multiple frames of video images, such as a current frame, a first reference frame (reference frame 0), a second reference frame (reference frame 1), and the like; assuming that the image block to be decoded is located on the current frame, the two predicted image blocks will be located on reference frame 0 and reference frame 1, respectively, in the video sequence.
In addition, the reference frame 0 and the reference frame 1 may be respectively located in a forward direction and a backward direction of the current frame, may also be respectively located in two backward directions of the current frame, and may even be respectively located in two forward directions of the current frame; that is, the two prediction directions may include a forward direction and a backward direction, or two forward directions or two backward directions, and so on. In the embodiment of the present application, preferably, the two prediction directions include a forward direction and a backward direction, but are not limited in any way.
It should be further noted that the unidirectional predicted values may be one unidirectional predicted value in each prediction direction, so that there are two unidirectional predicted values in total; it is also possible that there is at least one unidirectional predictor per prediction direction, so that there are more than two unidirectional predictors in total. Preferably, there is a unidirectional predictor in each prediction direction, such as a first unidirectional predictor determined in a first prediction direction and a second unidirectional predictor determined in a second prediction direction. Here, the determination of the unidirectional predictor is the same as the encoder-side procedure, and is not described in detail here.
S504: determining a gradient parameter corresponding to the image block, and calculating a gradient value of the image block by using the gradient parameter and a unidirectional prediction value corresponding to the image block in two prediction directions;
it should be noted that the gradient parameters at least include a gradient flag value and a gradient direction index value. After the encoder side sets the gradient parameters, the gradient parameters are written into a code stream at the same time, and then the code stream is transmitted to the decoder side from the encoder side; therefore, on the decoder side, the gradient parameters corresponding to the image blocks can be obtained directly by analyzing the code stream without determining the gradient parameters.
Specifically, in some embodiments, the determining the gradient parameter of the image block may include:
analyzing the code stream, and acquiring the gradient parameters of the image block, wherein the gradient parameters at least comprise a gradient mark value and a gradient direction index value.
Further, in some embodiments, the analyzing the code stream to obtain the gradient parameter of the image block may include:
analyzing the code stream to obtain a gradient mark value in the gradient parameter;
judging whether the obtained gradient flag value is equal to 1;
if the obtained gradient mark value is equal to 1, the code stream is continuously analyzed, and a gradient direction index value in the gradient parameter is obtained.
That is to say, after the code stream is analyzed, if the gradient flag value in the obtained gradient parameter is equal to 0, the gradient direction index value in the gradient parameter is not obtained any more; if the obtained gradient flag value in the gradient parameter is equal to 1, then the code stream needs to be continuously analyzed, and the gradient direction index value in the gradient parameter is obtained.
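The conditional parsing just described can be sketched as follows; here next_element stands in for whatever entropy-decoding call actually reads the next syntax-element value from the code stream (an assumption of this sketch):

```python
def parse_gradient_params(next_element):
    # next_element: a callable returning the next parsed syntax-element value.
    grad_flag = next_element()
    if grad_flag == 1:
        grad_idx = next_element()  # direction index is only present when flag == 1
    else:
        grad_idx = 0               # not parsed; gradient correction is switched off
    return grad_flag, grad_idx
```

A flag of 0 consumes nothing further from the stream, which is exactly why the direction index must only be coded when the flag equals 1.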
It can be understood that after the gradient parameters are obtained, if the obtained gradient flag value is equal to 0, it indicates that the gradient correction is not required to be performed on the initial bidirectional prediction value, that is, the gradient correction is turned off, and at this time, the step S504 is not required to be performed, and the initial bidirectional prediction value may be directly the bidirectional prediction value of the image block. Thus, in some embodiments, the method may further comprise: and if the gradient flag value is equal to 0, determining the initial bidirectional prediction value as the bidirectional prediction value of the image block.
It should be noted that, by default, the gradient flag value is equal to 1, which indicates that the initial bidirectional prediction value needs to be subjected to gradient correction, and step S504 needs to be executed at this time to calculate the gradient value of the image block to be predicted.
Specifically, in some embodiments, when the gradient flag value is equal to 1, for S504, the calculating the gradient value of the image block by using the gradient parameters and the unidirectional prediction values respectively corresponding to the image block in two prediction directions may include:
if the gradient flag value is equal to 1 and the gradient direction index value is equal to 0, subtracting the first unidirectional predicted value from the second unidirectional predicted value to obtain a gradient value of the image block;
and if the gradient flag value is equal to 1 and the gradient direction index value is equal to 1, subtracting a second unidirectional prediction value from the first unidirectional prediction value to obtain the gradient value of the image block.
Here, the first unidirectional prediction value (Pred 0) represents a unidirectional prediction value corresponding to the image block in a first prediction direction, and the second unidirectional prediction value (Pred 1) represents a unidirectional prediction value corresponding to the image block in a second prediction direction.
It should be noted that, when the gradient flag value is equal to 1, it indicates that the initial bidirectional prediction value needs to be corrected; at this time, if the gradient direction index value is equal to 0, indicating that the gradient direction is a forward gradient direction, the gradient values of the image blocks are Pred1-Pred0; if the gradient direction index value is equal to 1, indicating that the gradient direction is the inverse gradient direction, the gradient values Pred0-Pred1 for the image block can be obtained.
Still taking fig. 4 as an example, when it is determined that the variation trend is close to curve 1, indicating that the gradient direction is the forward gradient direction, i.e. the gradient direction index value is equal to 0, the gradient value (R1-R0) between reference frame 1 and reference frame 0 can be determined; when it is determined that the variation trend is close to curve 3, indicating that the gradient direction is the reverse gradient direction, i.e. the gradient direction index value is equal to 1, the gradient value (R0-R1) between reference frame 0 and reference frame 1 can be determined. The initial bidirectional predicted value may then be corrected according to the determined gradient value, that is, step S505 is performed.
S505: and correcting the initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block.
It should be noted that the initial bidirectional predictor is a weighted sum of the unidirectional predictors corresponding to the two prediction directions, respectively. In terms of implementation steps, the initial bidirectional prediction value may be calculated after the unidirectional prediction value is obtained in step S503, may be calculated before the correction operation is performed in step S505, and may be calculated synchronously even during step S504, which is not limited in any way.
After the gradient value of the image block is determined, the initial bidirectional prediction value can be corrected by combining a preset correction strength value to obtain a final bidirectional prediction value. Specifically, the modifying the initial bidirectional prediction value according to the gradient value of the image block and a preset modification strength value to obtain the bidirectional prediction value of the image block may include:
determining a correction gradient value of the image block according to the gradient value of the image block and a preset correction intensity value;
and correcting the initial bidirectional predicted value by using the determined correction gradient value to obtain the bidirectional predicted value of the image block.
Optionally, in some embodiments, the determining a modified gradient value of the image block according to the gradient value of the image block and a preset modified intensity value may include:
and performing displacement calculation on the gradient values of the image blocks by using a preset correction intensity value to obtain the correction gradient values.
Optionally, in some embodiments, the determining a modified gradient value of the image block according to the gradient value of the image block and a preset modified intensity value may include:
and multiplying a preset correction intensity value by the gradient value of the image block to obtain the correction gradient value.
Further, the correcting the initial bidirectional prediction value according to the correction gradient value to obtain the bidirectional prediction value of the image block may include:
and superposing the corrected gradient value and the initial bidirectional prediction value to obtain the bidirectional prediction value of the image block.
That is, the modified gradient value of the image block may be calculated by a shift method and then the initial bidirectional predictive value may be modified according to the modified gradient value, or the modified gradient value of the image block may be calculated by a multiplication method and then the initial bidirectional predictive value may be modified according to the modified gradient value, which is not limited in this embodiment of the present application.
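A small illustration of the two formulations (shift versus multiplication). For a weight of 1/2^k and a non-negative gradient the two agree; note that for negative gradients truncating multiplication and arithmetic right shift can round differently, which is one practical reason a codec fixes a single formulation:

```python
def corrected_gradient_shift(grad, k):
    # Shift form: arithmetic right shift, i.e. floor division by 2**k.
    return grad >> k

def corrected_gradient_multiply(grad, weight):
    # Multiplicative form: weight plays the role of the preset correction
    # intensity value, e.g. weight = 1 / 2**k mirrors the shift form.
    return int(grad * weight)
```

For a gradient of 16 and k = 3, both forms yield a corrected gradient value of 2, which is then superposed on the initial bidirectional predicted value.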
Taking the shift approach as an example, the preset correction intensity value may be represented by k. In general, the default value of k may be set to a preset constant value, preferably 3, or may be defined using slice-level or higher-level syntax; the bidirectional prediction value of the image block is then expressed as shown in equation (27) above.
Further, for the value of k, besides being set as a preset constant value, the code stream may be analyzed to obtain an index sequence number value, and then a k value corresponding to the index sequence number value is selected from the mapping list. Specifically, in some embodiments, the method may further comprise:
obtaining a mapping list corresponding to the corrected intensity value; wherein the mapping list represents a correspondence between the correction intensity value and the index number value;
analyzing the code stream to obtain an index serial number value;
and selecting a correction intensity value corresponding to the index sequence number value from the mapping list, and determining the selected correction intensity value as the preset correction intensity value.
It should be noted that the mapping list reflects the correspondence between the correction strength value and the index number value, and filtering with different strengths can be implemented by using different correction strength values. The mapping list is the same as the encoder side, such as shown in table 1 above. Therefore, after the code stream is analyzed to obtain the index serial number value, the correction strength value corresponding to the index serial number value can be selected from the mapping list, and the final correction of the initial bidirectional predicted value is realized.
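A sketch of this decoder-side selection of the correction strength from a parsed index number value. K_MAPPING is a hypothetical stand-in, since the actual Table 1 contents are not reproduced in this text:

```python
# Hypothetical stand-in for the Table 1 mapping: index number value -> k.
K_MAPPING = [1, 2, 3, 4, 5]

def decode_correction(pred_bi, grad, parsed_k_idx, mapping=K_MAPPING):
    k = mapping[parsed_k_idx]     # select the correction intensity by the parsed index
    return pred_bi + (grad >> k)  # superpose the corrected gradient on the initial value
```

Because the same mapping list is used on the encoder side, transmitting only the small index keeps the code rate low while still allowing filtering with different intensities.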
It should be noted that some steps on the decoder side are the same as those on the encoder side; for details, refer to the description of the encoder side, which is not repeated here.
The present embodiment provides a prediction method applied to a decoder: the code stream is parsed to obtain a prediction mode parameter of an image block to be decoded; when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, motion parameters of the image block are parsed, the motion parameters comprising a motion vector and a reference image index; according to the motion parameters, unidirectional prediction values of the image block in two prediction directions are determined; gradient parameters of the image block are determined, and a gradient value of the image block is calculated using the gradient parameters and the unidirectional prediction values corresponding to the two prediction directions; and an initial bidirectional prediction value is corrected according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional prediction value of the image block, where the initial bidirectional prediction value is the weighted sum of the unidirectional prediction values corresponding to the two prediction directions. In this way, the gradient parameters corresponding to the image block can be used to obtain a gradient value between the two unidirectional prediction values, and the initial bidirectional prediction value is then corrected according to the gradient value and the preset correction strength value, so that the bidirectional prediction value is more accurate, the accuracy of the prediction result can be improved, the decoding efficiency can be improved, and the quality of the video image is improved.
Based on the same inventive concept as the foregoing embodiments, refer to fig. 6, which shows a schematic structural diagram of an encoder 60 provided in an embodiment of the present application. As shown in fig. 6, the encoder 60 may include a first determining unit 601, a first calculating unit 602, and a first correcting unit 603, wherein:
a first determining unit 601, configured to perform motion estimation on an image block to be encoded, and determine unidirectional prediction values corresponding to the image block in two prediction directions, respectively;
a first determining unit 601, further configured to determine gradient parameters corresponding to the image block, where the gradient parameters include at least a gradient flag value and a gradient direction index value;
a first calculating unit 602, configured to calculate gradient values of the image block by using the determined gradient parameters and the unidirectional prediction values respectively corresponding to the two prediction directions;
a first correcting unit 603, configured to correct an initial bidirectional prediction value according to the gradient value of the image block and a preset correction strength value, to obtain a bidirectional prediction value of the image block, where the initial bidirectional prediction value is a weighted sum of the unidirectional prediction values respectively corresponding to the two prediction directions.
In the above scheme, referring to fig. 6, the encoder 60 may further include a pre-coding unit 604 and a first selecting unit 605, wherein:
a pre-coding unit 604 configured to perform pre-coding processing on the image block by using multiple prediction modes to obtain multiple pre-coding results; the gradient parameters corresponding to different prediction modes are different;
a first selecting unit 605, configured to select a preferred encoding result from the plurality of pre-encoding results according to a preset policy; and determining the prediction mode corresponding to the preferred coding result as a target prediction mode.
In the above scheme, the first selecting unit 605 is specifically configured to determine, according to the multiple precoding results, a rate-distortion cost value corresponding to each precoding result; and selecting the minimum rate distortion cost value from the multiple determined rate distortion cost values, and determining the pre-coding result corresponding to the minimum rate distortion cost value as the optimal coding result.
In the above scheme, the plurality of prediction modes include a first prediction mode, a second prediction mode, and a third prediction mode. Referring to fig. 6, the encoder 60 may further include a setting unit 606 configured to: set the gradient flag value in the gradient parameters equal to 0 if the target prediction mode is the first prediction mode; set the gradient flag value equal to 1 and the gradient direction index value equal to 0 if the target prediction mode is the second prediction mode; and set the gradient flag value equal to 1 and the gradient direction index value equal to 1 if the target prediction mode is the third prediction mode.
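The mode-to-parameter rule above can be sketched directly. The mode names are placeholders; the flag/index assignments follow the rule stated in the text.

```python
def gradient_params_for_mode(target_mode):
    """Map the target prediction mode to (gradient flag value,
    gradient direction index value). Mode names are placeholders."""
    if target_mode == "first":
        return 0, None      # flag 0: no correction, no direction index needed
    if target_mode == "second":
        return 1, 0         # correct using (second - first) direction
    if target_mode == "third":
        return 1, 1         # correct using (first - second) direction
    raise ValueError("unknown prediction mode")
```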
In the above solution, the first determining unit 601 is further configured to determine the initial bidirectional prediction value as the bidirectional prediction value of the image block when the set gradient flag value is equal to 0.
In the above scheme, referring to fig. 6, the encoder 60 may further include a first obtaining unit 607 configured to obtain the set gradient flag value and gradient direction index value;
a first calculating unit 602, configured to subtract the first unidirectional prediction value from the second unidirectional prediction value to obtain a gradient value of the image block if the gradient flag value is equal to 1 and the gradient direction index value is equal to 0; if the gradient flag value is equal to 1 and the gradient direction index value is equal to 1, subtracting a second unidirectional prediction value from a first unidirectional prediction value to obtain a gradient value of the image block; the first unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a first prediction direction, and the second unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a second prediction direction.
In the above solution, the first determining unit 601 is further configured to determine a corrected gradient value of the image block according to the gradient value of the image block and a preset corrected intensity value;
the first correcting unit 603 is specifically configured to correct the initial bidirectional prediction value by using the determined correction gradient value, so as to obtain the bidirectional prediction value of the image block.
In the above solution, the first calculating unit 602 is further configured to perform shift calculation on the gradient values of the image block by using a preset correction strength value, so as to obtain the correction gradient values.
In the above solution, the first calculating unit 602 is further configured to multiply a preset correction intensity value with the gradient value of the image block to obtain the correction gradient value.
In the above scheme, the preset correction intensity value is a preset constant value.
In the above scheme, referring to fig. 6, the encoder 60 may further include a writing unit 608, wherein:
a first obtaining unit 607, further configured to obtain a mapping list corresponding to the correction strength value, wherein the mapping list represents a correspondence between correction strength values and index number values;
a first calculating unit 602, further configured to calculate a rate-distortion cost value corresponding to each modified intensity value in the mapping list;
a first selecting unit 605, configured to select a minimum rate distortion cost value from the calculated multiple rate distortion cost values, and determine a correction strength value corresponding to the minimum rate distortion cost value as the preset correction strength value;
the writing unit 608 is configured to obtain the index number value corresponding to the preset correction strength value, and write the obtained index number value into the code stream.
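The encoder-side selection across units 607/602/605/608 can be sketched as a single search. The mapping values are hypothetical, and `rd_cost` stands in for the encoder's rate-distortion evaluation (an assumption here); only the minimum-cost selection rule comes from the text.

```python
def select_strength(mapping, rd_cost):
    """Evaluate the rate-distortion cost for each candidate correction
    strength in the mapping list and keep the (index, strength) pair
    with the minimum cost; the index is what gets written to the
    code stream. rd_cost is a caller-supplied cost function."""
    best_index, best_k = min(mapping.items(), key=lambda kv: rd_cost(kv[1]))
    return best_index, best_k
```

For example, with a hypothetical list `{0: 2, 1: 3, 2: 4}` and a cost function minimized at k = 3, the encoder would signal index 1.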
In the above solution, the writing unit 608 is further configured to write the gradient flag value into the code stream if the gradient flag value is equal to 0; and if the gradient mark value is equal to 1, writing both the gradient mark value and the gradient direction index value into a code stream.
In the above solution, the first obtaining unit 607 is further configured to obtain predicted image blocks of the image block in two prediction directions;
a first determining unit 601, further configured to perform motion estimation according to the image block and the two predicted image blocks, and determine motion vectors corresponding to two prediction directions respectively; and further configured to determine unidirectional prediction values respectively corresponding in two prediction directions from the two prediction image blocks and the two motion vectors.
It is understood that in this embodiment, a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., and may also be a module, or may also be non-modular. Moreover, each component in the embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware or a form of a software functional module.
Based on such understanding, the technical solution of the present embodiment in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method of the present embodiment. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
Accordingly, the present embodiment provides a computer storage medium applied to the encoder 60, the computer storage medium storing an image prediction program, which when executed by the first processor, implements the method of any one of the preceding embodiments.
Based on the composition of the encoder 60 and the computer storage medium described above, refer to fig. 7, which shows a specific hardware structure of the encoder 60 provided in an embodiment of the present application. The encoder 60 may include: a first communication interface 701, a first memory 702, and a first processor 703, with the various components coupled together by a first bus system 704. It can be understood that the first bus system 704 is used to enable connection and communication between these components. In addition to a data bus, the first bus system 704 includes a power bus, a control bus, and a status signal bus; however, for clarity of illustration, the various buses are all labeled as the first bus system 704 in fig. 7, wherein:
a first communication interface 701, configured to receive and transmit signals during information transmission and reception with other external network elements;
a first memory 702 for storing a computer program capable of running on the first processor 703;
a first processor 703, configured to execute, when running the computer program:
carrying out motion estimation on an image block to be coded, and determining unidirectional predicted values of the image block in two prediction directions respectively;
determining gradient parameters corresponding to the image block, wherein the gradient parameters at least comprise a gradient mark value and a gradient direction index value;
calculating the gradient value of the image block by using the determined gradient parameter and the unidirectional predicted values respectively corresponding to the two prediction directions;
and correcting an initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
It will be appreciated that the first memory 702 in embodiments of the present application can be either volatile memory or non-volatile memory, or can include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The first memory 702 of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
The first processor 703 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the method may be implemented by integrated logic circuits of hardware in, or instructions in the form of software executed by, the first processor 703. The first processor 703 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the first memory 702, and the first processor 703 reads the information in the first memory 702 and completes the steps of the method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field-Programmable Gate Arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof. For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software code may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Optionally, as another embodiment, the first processor 703 is further configured to, when running the computer program, perform the method of any one of the foregoing embodiments.
The present embodiment provides an encoder, which may include a first determining unit, a first calculating unit, and a first correcting unit, where the first determining unit is configured to perform motion estimation on an image block to be encoded, and determine unidirectional prediction values corresponding to the image block in two prediction directions, respectively; the first determining unit is further configured to determine gradient parameters corresponding to the image block, where the gradient parameters at least include a gradient flag value and a gradient direction index value; a first calculating unit, configured to calculate gradient values of the image block by using the determined gradient parameters and the unidirectional prediction values respectively corresponding to the two prediction directions; and the first correction unit is configured to correct an initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is a weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions. Therefore, for the image blocks to be coded, gradient parameters corresponding to the image blocks can be used for obtaining gradient values between two unidirectional predicted values, and then the initial bidirectional predicted values are corrected according to the gradient values and the preset correction intensity values, so that the bidirectional predicted values are more accurate, the accuracy of a prediction result can be improved, the coding efficiency can be improved, and the quality of a video image is improved.
Based on the same inventive concept of the foregoing embodiment, refer to fig. 8, which shows a schematic structural diagram of a decoder 80 provided in an embodiment of the present application. As shown in fig. 8, the decoder 80 may include a parsing unit 801, a second determining unit 802, a second calculating unit 803, and a second correcting unit 804, wherein,
the analysis unit 801 is configured to parse the code stream to obtain a prediction mode parameter of the image block to be decoded; and, when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, to parse motion parameters of the image block, wherein the motion parameters comprise a motion vector and a reference image index;
a second determining unit 802, configured to determine, according to the motion parameter, unidirectional prediction values corresponding to the image block in two prediction directions, respectively;
a second calculating unit 803, configured to determine a gradient parameter of the image block, and calculate a gradient value of the image block by using the gradient parameter and unidirectional prediction values of the image block corresponding to two prediction directions, respectively;
a second correcting unit 804, configured to correct an initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value, to obtain a bidirectional predicted value of the image block, where the initial bidirectional predicted value is a weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
In the above scheme, the parsing unit 801 is further configured to parse the code stream to obtain gradient parameters of the image block, where the gradient parameters at least include a gradient flag value and a gradient direction index value.
In the above scheme, the parsing unit 801 is specifically configured to parse the code stream to obtain a gradient flag value in the gradient parameter; and judging whether the obtained gradient flag value is equal to 1, if so, continuing to analyze the code stream to obtain a gradient direction index value in the gradient parameter.
In the above solution, the second determining unit 802 is further configured to determine the initial bidirectional prediction value as the bidirectional prediction value of the image block if the gradient flag value is equal to 0.
In the above solution, the second calculating unit 803 is specifically configured to, if the gradient flag value is equal to 1 and the gradient direction index value is equal to 0, subtract the first unidirectional prediction value from the second unidirectional prediction value to obtain a gradient value of the image block; if the gradient flag value is equal to 1 and the gradient direction index value is equal to 1, subtracting a second unidirectional prediction value from a first unidirectional prediction value to obtain a gradient value of the image block; the first unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a first prediction direction, and the second unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a second prediction direction.
In the above solution, the second determining unit 802 is further configured to determine a corrected gradient value of the image block according to the gradient value of the image block and a preset corrected intensity value;
the second correcting unit 804 is specifically configured to correct the initial bidirectional prediction value by using the determined correction gradient value, so as to obtain a bidirectional prediction value of the image block.
In the above solution, the second calculating unit 803 is further configured to perform a shift calculation on the gradient values of the image blocks by using a preset correction strength value to obtain the correction gradient values.
In the above solution, the second calculating unit 803 is further configured to multiply a preset correction intensity value with the gradient value of the image block to obtain the correction gradient value.
In the above scheme, the preset correction intensity value is a preset constant value.
In the above scheme, referring to fig. 8, the decoder 80 may further include a second obtaining unit 805 and a second selecting unit 806, wherein:
a second obtaining unit 805 configured to obtain a mapping list corresponding to the correction strength value, wherein the mapping list represents a correspondence between correction strength values and index number values;
the parsing unit 801 is further configured to parse the code stream to obtain an index number value;
a second selecting unit 806, configured to select a correction strength value corresponding to the index number value from the mapping list, and determine the selected correction strength value as the preset correction strength value.
In the above solution, the second determining unit 802 is further configured to determine, according to a reference image index in the motion parameters, predicted image blocks of the image block in two prediction directions; and the method is also configured to determine unidirectional predicted values respectively corresponding to the image blocks in two prediction directions according to the determined two predicted image blocks and the motion vector in the motion parameter.
It is understood that, in this embodiment, a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc., and may also be a module, or may be non-modular. Moreover, each component in the embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a hardware mode, and can also be realized in a software functional module mode.
The integrated unit, if implemented in the form of a software functional module and not sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the present embodiment provides a computer storage medium applied to the decoder 80, the computer storage medium storing an image prediction program which, when executed by the second processor, implements the method of any one of the preceding embodiments.
Based on the composition of the decoder 80 and the computer storage medium described above, refer to fig. 9, which shows a specific hardware structure of the decoder 80 provided in an embodiment of the present application. The decoder 80 may include: a second communication interface 901, a second memory 902, and a second processor 903, with the various components coupled together by a second bus system 904. It will be appreciated that the second bus system 904 is used to enable communication among these components. In addition to a data bus, the second bus system 904 includes a power bus, a control bus, and a status signal bus; however, for clarity of illustration, the various buses are all labeled as the second bus system 904 in fig. 9, wherein:
a second communication interface 901, configured to receive and send signals in a process of receiving and sending information with other external network elements;
a second memory 902 for storing a computer program capable of running on the second processor 903;
a second processor 903 for, when running the computer program, performing:
analyzing the code stream to obtain a prediction mode parameter of an image block to be decoded;
when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, parsing motion parameters of the image block, wherein the motion parameters comprise a motion vector and a reference image index;
according to the motion parameters, unidirectional predicted values of the image blocks in two prediction directions are determined;
determining gradient parameters of the image blocks, and calculating gradient values of the image blocks by using the gradient parameters and unidirectional predicted values of the image blocks corresponding to the two prediction directions respectively;
and correcting the initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
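The decoder steps listed above can be sketched end to end. As before, the equal-weight rounding and the shift-based correction form are assumptions (the exact formula is given by equation (27), not reproduced here), and the operation is shown on scalar prediction values for brevity.

```python
def decode_bipred(p0, p1, flag, direction_index, k):
    """Decoder-side sketch: compute the gradient between the two
    unidirectional predictions, shift it by the correction strength
    value k, and add it to the equal-weight initial bidirectional
    value. Rounding conventions are assumptions."""
    initial = (p0 + p1 + 1) >> 1       # initial bidirectional prediction
    if flag == 0:
        return initial                 # gradient flag 0: no correction
    grad = (p1 - p0) if direction_index == 0 else (p0 - p1)
    return initial + (grad >> k)       # corrected bidirectional prediction
```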
Optionally, as another embodiment, the second processor 903 is further configured to execute the method in any one of the foregoing embodiments when running the computer program.
It is to be understood that the second memory 902 has hardware functionality similar to that of the first memory 702, and the second processor 903 has hardware functionality similar to that of the first processor 703; and will not be described in detail herein.
The embodiment provides a decoder, which may include an analysis unit, a second determining unit, a second calculating unit, and a second correcting unit, wherein the analysis unit is configured to parse the code stream to obtain a prediction mode parameter of an image block to be decoded, and, when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, to parse motion parameters of the image block, the motion parameters comprising a motion vector and a reference image index; the second determining unit is configured to determine, according to the motion parameters, unidirectional prediction values corresponding to the image block in two prediction directions; the second calculating unit is configured to determine gradient parameters of the image block, and to calculate a gradient value of the image block using the gradient parameters and the unidirectional prediction values of the image block corresponding to the two prediction directions; and the second correcting unit is configured to correct an initial bidirectional prediction value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional prediction value of the image block, wherein the initial bidirectional prediction value is a weighted sum of the unidirectional prediction values corresponding to the two prediction directions. In this way, for an image block to be decoded, the gradient parameters corresponding to the image block can be used to obtain a gradient value between the two unidirectional prediction values, and the initial bidirectional prediction value is then corrected according to the gradient value and the preset correction strength value, so that the bidirectional prediction value is more accurate, the accuracy of the prediction result can be improved, the decoding efficiency can be improved, and the quality of the video image is improved.
It should be noted that, in the present application, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily without conflict to obtain new method embodiments.
Features disclosed in several of the product embodiments provided in the present application may be combined in any combination to yield new product embodiments without conflict.
The features disclosed in the several method or apparatus embodiments provided herein may be combined in any combination to arrive at a new method or apparatus embodiment without conflict.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Industrial applicability
In the embodiments of the application, the method is applied to an encoder or a decoder. For the encoder side, after unidirectional prediction values of the image block in two prediction directions are determined by performing motion estimation on the image block to be encoded, a gradient value of the image block is calculated according to the gradient parameters determined for the image block and the unidirectional prediction values in the two prediction directions; the initial bidirectional prediction value is then corrected according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional prediction value of the image block, where the initial bidirectional prediction value is the weighted sum of the unidirectional prediction values corresponding to the two prediction directions. For the decoder side, after the code stream is parsed to obtain a prediction mode parameter of an image block to be decoded, when the prediction mode parameter indicates that the image block uses a bidirectional prediction mode, motion parameters of the image block are parsed; according to the motion parameters, unidirectional prediction values of the image block in the two prediction directions are determined; a gradient value of the image block is calculated according to the gradient parameters determined for the image block and the unidirectional prediction values corresponding to the two prediction directions; and the initial bidirectional prediction value is then corrected according to the gradient value of the image block and the preset correction strength value to obtain the bidirectional prediction value of the image block.
In this way, the gradient parameters corresponding to the image block can be used to obtain a gradient value between the two unidirectional prediction values, and the initial bidirectional prediction value can then be corrected according to that gradient value and the preset correction intensity value. This makes the bidirectional prediction value more accurate, which improves the accuracy of the prediction result, the coding and decoding efficiency, and the video image quality.
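The correction described above can be illustrated with a short sketch. This is a minimal illustration only: it assumes integer sample values, equal weights for the two prediction directions, and the correction intensity value interpreted as a right-shift amount (one of the options in the claims below); the function and parameter names are hypothetical and not taken from the patent text.

```python
def bidirectional_predict(pred0, pred1, gradient_flag, gradient_dir_idx,
                          correction_strength, w0=0.5, w1=0.5):
    """Sketch of gradient-corrected bi-prediction for one block.

    pred0, pred1 are the unidirectional prediction samples for the
    first and second prediction directions; correction_strength is
    used here as a right-shift amount (division by a power of two).
    """
    out = []
    for p0, p1 in zip(pred0, pred1):
        # Initial bidirectional value: weighted sum of the two
        # unidirectional predictions.
        initial = w0 * p0 + w1 * p1
        if gradient_flag == 0:
            # Gradient flag 0: no correction, the initial value is
            # the final bidirectional prediction value.
            out.append(round(initial))
            continue
        # Gradient value: signed difference of the two unidirectional
        # predictions; the direction index selects the sign.
        gradient = (p1 - p0) if gradient_dir_idx == 0 else (p0 - p1)
        # Correction gradient: shift the gradient by the preset
        # correction strength, then add it to the initial value.
        corrected = gradient >> correction_strength
        out.append(round(initial + corrected))
    return out
```

For example, with samples `[100, 120]` and `[110, 100]`, flag 1, direction index 0 and strength 1, the gradients are `[10, -20]`, the shifted corrections `[5, -10]`, and the corrected predictions `[110, 100]`; with flag 0 the result is just the average `[105, 110]`.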

Claims (29)

  1. An image prediction method applied to an encoder, the method comprising:
    carrying out motion estimation on an image block to be coded, and determining unidirectional predicted values of the image block in two prediction directions respectively;
    determining gradient parameters corresponding to the image block, wherein the gradient parameters at least comprise a gradient mark value and a gradient direction index value;
    calculating the gradient value of the image block by using the determined gradient parameter and the unidirectional predicted values respectively corresponding to the two prediction directions;
    and correcting an initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
  2. The method of claim 1, wherein before the determining of the gradient parameters corresponding to the image block, the method further comprises:
    carrying out pre-coding processing on the image blocks by utilizing multiple prediction modes to obtain multiple pre-coding results; wherein, the gradient parameters corresponding to different prediction modes are different;
    selecting an optimal coding result from the plurality of pre-coding results according to a preset strategy;
    and determining the prediction mode corresponding to the optimal coding result as a target prediction mode.
  3. The method according to claim 2, wherein the selecting an optimal coding result from the plurality of pre-coding results according to a preset strategy comprises:
    determining a rate distortion cost value corresponding to each pre-coding result according to the plurality of pre-coding results;
    and selecting the minimum rate distortion cost value from the multiple determined rate distortion cost values, and determining a pre-coding result corresponding to the minimum rate distortion cost value as the optimal coding result.
  4. The method of claim 2, wherein the plurality of prediction modes include a first prediction mode, a second prediction mode, and a third prediction mode, and the determining the prediction mode corresponding to the optimal coding result as the target prediction mode comprises:
    if the target prediction mode is a first prediction mode, setting a gradient flag value in the gradient parameter to be equal to 0;
    if the target prediction mode is a second prediction mode, setting a gradient flag value in the gradient parameter equal to 1, and setting a gradient direction index value in the gradient parameter equal to 0;
    and if the target prediction mode is a third prediction mode, setting a gradient flag value in the gradient parameter equal to 1, and setting a gradient direction index value in the gradient parameter equal to 1.
  5. The method of claim 4, wherein the method further comprises:
    and when the set gradient flag value is equal to 0, determining the initial bidirectional prediction value as the bidirectional prediction value of the image block.
  6. The method according to claim 4, wherein when the set gradient flag value is equal to 1, the calculating the gradient value of the image block using the determined gradient parameter and the unidirectional prediction values respectively corresponding to the two prediction directions comprises:
    acquiring the set gradient mark value and the gradient direction index value;
    if the gradient flag value is equal to 1 and the gradient direction index value is equal to 0, subtracting the first unidirectional predicted value from the second unidirectional predicted value to obtain a gradient value of the image block;
    if the gradient flag value is equal to 1 and the gradient direction index value is equal to 1, subtracting a second unidirectional prediction value from a first unidirectional prediction value to obtain a gradient value of the image block; the first unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a first prediction direction, and the second unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a second prediction direction.
  7. The method according to claim 1, wherein the correcting the initial bidirectional prediction value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional prediction value of the image block comprises:
    determining a correction gradient value of the image block according to the gradient value of the image block and a preset correction intensity value;
    and correcting the initial bidirectional prediction value by using the determined correction gradient value to obtain the bidirectional prediction value of the image block.
  8. The method of claim 7, wherein determining the modified gradient values of the image blocks according to the gradient values of the image blocks and preset modified intensity values comprises:
    and performing a shift operation on the gradient values of the image blocks by using a preset correction intensity value to obtain the correction gradient values.
  9. The method of claim 7, wherein determining the modified gradient values of the image blocks according to the gradient values of the image blocks and preset modified intensity values comprises:
    and multiplying a preset correction intensity value by the gradient value of the image block to obtain the correction gradient value.
  10. The method of claim 1, wherein the preset modified intensity value is a preset constant value.
  11. The method of claim 1, wherein the method further comprises:
    obtaining a mapping list corresponding to the correction intensity value; wherein the mapping list represents a correspondence between the correction intensity value and the index sequence number value;
    calculating the rate distortion cost value corresponding to each correction intensity value in the mapping list;
    selecting a minimum rate distortion cost value from the multiple rate distortion cost values obtained through calculation, and determining a correction strength value corresponding to the minimum rate distortion cost value as the preset correction strength value;
    and acquiring an index sequence number value corresponding to the preset correction intensity value, and writing the acquired index sequence number value into a code stream.
  12. The method of claim 4, wherein the method further comprises:
    if the gradient flag value is equal to 0, writing the gradient flag value into a code stream;
    and if the gradient flag value is equal to 1, writing both the gradient flag value and the gradient direction index value into a code stream.
  13. The method according to any one of claims 1 to 12, wherein the performing motion estimation on the image block to be encoded and determining the uni-directional prediction values respectively corresponding to the image block in two prediction directions comprises:
    acquiring predicted image blocks of the image blocks in two prediction directions;
    performing motion estimation according to the image blocks and the two predicted image blocks, and determining motion vectors respectively corresponding to the two prediction directions;
    and determining unidirectional predicted values respectively corresponding to the two prediction directions according to the two predicted image blocks and the two motion vectors.
  14. An image prediction method applied to a decoder, the method comprising:
    analyzing the code stream to obtain a prediction mode parameter of an image block to be decoded;
    when the prediction mode parameters indicate that the image block uses a bidirectional prediction mode, parsing motion parameters of the image block, wherein the motion parameters comprise a motion vector and a reference image index;
    according to the motion parameters, unidirectional predicted values of the image blocks in two prediction directions are determined, wherein the unidirectional predicted values correspond to the two prediction directions respectively;
    determining gradient parameters of the image blocks, and calculating gradient values of the image blocks by using the gradient parameters and unidirectional predicted values of the image blocks corresponding to the two prediction directions respectively;
    and correcting an initial bidirectional predicted value according to the gradient value of the image block and a preset correction intensity value to obtain the bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is the weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
  15. The method of claim 14, wherein the determining gradient parameters of the image block comprises:
    analyzing the code stream, and acquiring the gradient parameters of the image block, wherein the gradient parameters at least comprise a gradient mark value and a gradient direction index value.
  16. The method of claim 15, wherein the determining gradient parameters of the image block comprises:
    analyzing the code stream to obtain a gradient mark value in the gradient parameter;
    determining whether the obtained gradient flag value is equal to 1;
    if the obtained gradient mark value is equal to 1, the code stream is continuously analyzed, and a gradient direction index value in the gradient parameter is obtained.
  17. The method of claim 16, wherein after the determining whether the obtained gradient flag value is equal to 1, the method further comprises:
    and if the gradient flag value is equal to 0, determining the initial bidirectional prediction value as the bidirectional prediction value of the image block.
  18. The method according to claim 16, wherein when the gradient flag value is equal to 1, the calculating the gradient value of the image block using the gradient parameters and the uni-directional prediction values respectively corresponding to the image block in two prediction directions comprises:
    if the gradient flag value is equal to 1 and the gradient direction index value is equal to 0, subtracting the first unidirectional predicted value from the second unidirectional predicted value to obtain a gradient value of the image block;
    if the gradient flag value is equal to 1 and the gradient direction index value is equal to 1, subtracting a second unidirectional prediction value from a first unidirectional prediction value to obtain a gradient value of the image block; the first unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a first prediction direction, and the second unidirectional prediction value represents a unidirectional prediction value corresponding to the image block in a second prediction direction.
  19. The method according to claim 14, wherein the correcting the initial bidirectional prediction value according to the gradient value of the image block and a preset correction strength value to obtain the bidirectional prediction value of the image block comprises:
    determining a correction gradient value of the image block according to the gradient value of the image block and a preset correction intensity value;
    and correcting the initial bidirectional prediction value by using the determined correction gradient value to obtain the bidirectional prediction value of the image block.
  20. The method of claim 19, wherein the determining the modified gradient values of the image blocks according to the gradient values of the image blocks and a preset modified intensity value comprises:
    and performing a shift operation on the gradient values of the image blocks by using a preset correction intensity value to obtain the correction gradient values.
  21. The method of claim 19, wherein determining the modified gradient values of the image blocks according to the gradient values of the image blocks and preset modified intensity values comprises:
    and multiplying a preset correction intensity value by the gradient value of the image block to obtain the correction gradient value.
  22. The method of claim 14, wherein the preset modified intensity value is a preset constant value.
  23. The method of claim 14, wherein the method further comprises:
    obtaining a mapping list corresponding to the correction intensity value; wherein the mapping list represents a correspondence between the correction intensity value and the index sequence number value;
    analyzing the code stream to obtain an index sequence number value;
    and selecting a correction intensity value corresponding to the index sequence number value from the mapping list, and determining the selected correction intensity value as the preset correction intensity value.
  24. The method according to any one of claims 14 to 23, wherein said determining, according to the motion parameter, unidirectional prediction values respectively corresponding to the image blocks in two prediction directions comprises:
    determining the prediction image blocks of the image blocks in two prediction directions according to the reference image indexes in the motion parameters;
    and determining unidirectional predicted values respectively corresponding to the image blocks in two prediction directions according to the two determined prediction image blocks and the motion vectors in the motion parameters.
  25. An encoder comprising a first determining unit, a first calculating unit and a first correcting unit, wherein,
    the first determining unit is configured to perform motion estimation on an image block to be encoded, and determine unidirectional prediction values respectively corresponding to the image block in two prediction directions;
    the first determining unit is further configured to determine a gradient parameter corresponding to the image block, where the gradient parameter at least includes a gradient flag value and a gradient direction index value;
    the first calculation unit is configured to calculate gradient values of the image blocks by using the determined gradient parameters and the unidirectional prediction values respectively corresponding to the two prediction directions;
    the first correcting unit is configured to correct an initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain a bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is a weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
  26. An encoder comprising a first memory and a first processor, wherein,
    the first memory for storing a computer program operable on the first processor;
    the first processor, when executing the computer program, is configured to perform the method of any of claims 1 to 13.
  27. A decoder comprising a parsing unit, a second determining unit, a second calculating unit and a second modifying unit, wherein,
    the analysis unit is configured to parse the code stream to obtain a prediction mode parameter of the image block to be decoded, and, when the prediction mode parameters indicate that the image block uses a bidirectional prediction mode, to parse motion parameters of the image block, wherein the motion parameters comprise a motion vector and a reference image index;
    the second determining unit is configured to determine, according to the motion parameter, unidirectional prediction values corresponding to the image blocks in two prediction directions respectively;
    the second calculating unit is configured to determine a gradient parameter of the image block, and calculate a gradient value of the image block by using the gradient parameter and unidirectional prediction values respectively corresponding to the image block in two prediction directions;
    the second correcting unit is configured to correct an initial bidirectional predicted value according to the gradient value of the image block and a preset correction strength value to obtain a bidirectional predicted value of the image block, wherein the initial bidirectional predicted value is a weighted sum of the unidirectional predicted values respectively corresponding to the two prediction directions.
  28. A decoder comprising a second memory and a second processor, wherein,
    the second memory for storing a computer program operable on the second processor;
    the second processor, when executing the computer program, is configured to perform the method of any of claims 14 to 24.
  29. A computer storage medium, wherein the computer storage medium stores a computer program which, when executed by a first processor, implements the method of any of claims 1 to 13, or which, when executed by a second processor, implements the method of any of claims 14 to 24.
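The encoder-side mode selection and signalling of claims 2–4 and 12 can likewise be sketched. The mode names, the rate-distortion-cost dictionary, and the list used to stand in for the code stream below are illustrative assumptions, not the patent's actual bitstream syntax.

```python
def select_gradient_parameters(rd_costs):
    """Sketch of selecting a target prediction mode by minimum
    rate-distortion cost and deriving the gradient parameters.

    rd_costs maps a prediction mode name ("first", "second",
    "third") to the RD cost of its pre-coding result.
    """
    # Claim 3: the optimal coding result is the one whose
    # rate-distortion cost value is minimal.
    target_mode = min(rd_costs, key=rd_costs.get)

    # Claim 4: map the target mode to (gradient_flag, gradient_dir_idx).
    params = {
        "first":  (0, None),  # no gradient correction
        "second": (1, 0),     # gradient taken as pred1 - pred0
        "third":  (1, 1),     # gradient taken as pred0 - pred1
    }
    gradient_flag, gradient_dir_idx = params[target_mode]

    # Claim 12: the flag is always written; the direction index is
    # written into the code stream only when the flag equals 1.
    bitstream = [gradient_flag]
    if gradient_flag == 1:
        bitstream.append(gradient_dir_idx)
    return target_mode, bitstream
```

Note that this signalling is asymmetric by design: a decoder (claim 16) mirrors it by reading the flag first and parsing the direction index only when the flag is 1.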
CN202080097798.XA 2020-03-02 2020-03-02 Image prediction method, encoder, decoder, and storage medium Pending CN115211116A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310064906.9A CN116647698A (en) 2020-03-02 2020-03-02 Image prediction method, encoder, decoder, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/077491 WO2021174396A1 (en) 2020-03-02 2020-03-02 Image prediction method, encoder, decoder and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202310064906.9A Division CN116647698A (en) 2020-03-02 2020-03-02 Image prediction method, encoder, decoder, and storage medium

Publications (1)

Publication Number Publication Date
CN115211116A true CN115211116A (en) 2022-10-18

Family

ID=77614434

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202080097798.XA Pending CN115211116A (en) 2020-03-02 2020-03-02 Image prediction method, encoder, decoder, and storage medium
CN202310064906.9A Pending CN116647698A (en) 2020-03-02 2020-03-02 Image prediction method, encoder, decoder, and storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202310064906.9A Pending CN116647698A (en) 2020-03-02 2020-03-02 Image prediction method, encoder, decoder, and storage medium

Country Status (4)

Country Link
CN (2) CN115211116A (en)
MX (1) MX2022010825A (en)
WO (1) WO2021174396A1 (en)
ZA (1) ZA202209981B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130083851A1 (en) * 2010-04-06 2013-04-04 Samsung Electronics Co., Ltd. Method and apparatus for video encoding and method and apparatus for video decoding
US20180249172A1 (en) * 2015-09-02 2018-08-30 Mediatek Inc. Method and apparatus of motion compensation for video coding based on bi prediction optical flow techniques
KR102332526B1 (en) * 2016-07-14 2021-11-29 삼성전자주식회사 Video decoding method and apparatus thereof, video encoding method and apparatus thereof
WO2018169099A1 (en) * 2017-03-13 2018-09-20 엘지전자(주) Method for processing inter prediction mode-based image and device therefor
US11109062B2 (en) * 2017-03-16 2021-08-31 Mediatek Inc. Method and apparatus of motion refinement based on bi-directional optical flow for video coding

Also Published As

Publication number Publication date
ZA202209981B (en) 2023-05-31
CN116647698A (en) 2023-08-25
WO2021174396A1 (en) 2021-09-10
MX2022010825A (en) 2022-11-08

Similar Documents

Publication Publication Date Title
US10715827B2 (en) Multi-hypotheses merge mode
CN110199521B (en) Low complexity mixed domain collaborative in-loop filter for lossy video coding
US8553779B2 (en) Method and apparatus for encoding/decoding motion vector information
TWI793438B (en) A filter
CN106034235B (en) Method and system for calculating coding distortion degree and controlling coding mode
US10432961B2 (en) Video encoding optimization of extended spaces including last stage processes
JP2021523604A (en) Motion compensation for video coding and decoding
WO2008148272A1 (en) Method and apparatus for sub-pixel motion-compensated video coding
JP2012142886A (en) Image coding device and image decoding device
US20160353107A1 (en) Adaptive quantization parameter modulation for eye sensitive areas
CN112655215A (en) Image component prediction method, encoder, decoder, and storage medium
CN113784128A (en) Image prediction method, encoder, decoder, and storage medium
CN116569554A (en) Inter-frame prediction method, video encoding and decoding method, device and medium
EP3935861A1 (en) Local illumination compensation for video encoding or decoding
CN115428451A (en) Video encoding method, encoder, system, and computer storage medium
CN113709498A (en) Inter-frame prediction method, encoder, decoder, and computer storage medium
US20140056348A1 (en) Methods and device for reconstructing and coding an image block
EP3737099A1 (en) Local illumination compensation for video encoding or decoding
CN113273194A (en) Image component prediction method, encoder, decoder, and storage medium
KR20200136407A (en) Method and apparatus for decoder-side prediction based on weighted distortion
CN115211116A (en) Image prediction method, encoder, decoder, and storage medium
CN113196762A (en) Image component prediction method, device and computer storage medium
CN113766233A (en) Image prediction method, encoder, decoder, and storage medium
KR20230067653A (en) Deep prediction refinement
EP3706419A1 (en) Multi-model local illumination compensation for video encoding or decoding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination