WO2019169949A1 - Method for generating predicted motion vector and related apparatus - Google Patents

Method for generating predicted motion vector and related apparatus

Info

Publication number
WO2019169949A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
prediction
candidate motion
block
candidate
Prior art date
Application number
PCT/CN2019/070629
Other languages
French (fr)
Chinese (zh)
Inventor
陈旭
郑建铧
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2019169949A1

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 Selection of coding mode or of prediction mode
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N 19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H04N 19/517 Processing of motion vectors by encoding
    • H04N 19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N 19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H04N 19/567 Motion estimation based on rate distortion criteria
    • H04N 19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction

Definitions

  • The present invention relates to the field of video coding and decoding, and more particularly to a method for generating a predicted motion vector and to related devices.
  • hybrid coding structures are commonly used for encoding and decoding video sequences.
  • the coding end of the hybrid coding structure generally includes: a prediction module, a transformation module, a quantization module, and an entropy coding module;
  • the decoding end of the hybrid coding structure generally includes: an entropy decoding module, an inverse quantization module, an inverse transform module, and a prediction compensation module.
  • Images of a video sequence are typically divided into image blocks for encoding: each frame of the image is divided into image blocks that are encoded and decoded using the above modules. The combination of these encoding and decoding modules can effectively remove redundant information from the video sequence and ensure that the coded images of the video sequence are recovered at the decoding end.
  • The prediction module is used by the encoding end to obtain prediction block information for an image block of a coded image of the video sequence and to further determine, according to the specific mode, whether the residual of the image block needs to be obtained; the prediction compensation module is used by the decoding end to obtain the prediction block information of the current decoded image block and to further determine, according to the specific mode, whether the current decoded image block is obtained using the decoded image block residual.
  • The prediction module usually involves two techniques: intra prediction and inter prediction.
  • the intra prediction technique uses the spatial pixel information of the current image block to remove redundant information of the current image block to obtain a residual.
  • The advanced motion vector prediction (AMVP) mode of the inter prediction technique, and the non-SKIP modes within the merge (Merge) mode, use the pixel information of coded images adjacent to the current image to remove the redundant information of the current image block (referred to as the current block for short) and obtain a residual, whereas the SKIP mode within the merge mode does not depend on an image block residual, and the current decoded image block can be obtained directly from the prediction block information.
  • the coded image adjacent to the current image is referred to as a reference image.
  • In these modes, the motion vectors of temporally or spatially neighboring blocks are directly used as candidate MVs in the candidate list they construct.
  • However, the motion of a neighboring block does not necessarily coincide with that of the current block; that is, the motion vector of the neighboring block may differ from the actual motion vector of the current block, so the current block may not obtain the best reference image block directly from these candidate MVs.
  • The embodiments of the present invention provide a method for generating a predicted motion vector and a related device.
  • The embodiments of the present invention facilitate obtaining the best reference image block of the current block during encoding and decoding and then constructing the reconstructed block of the current block.
  • According to a first aspect, an embodiment of the present invention provides a method for generating a predicted motion vector, the method comprising: constructing a candidate motion vector set of a block to be processed; determining at least two first reference motion vectors according to a first candidate motion vector in the candidate motion vector set, where the first reference motion vectors are used to determine reference blocks of the to-be-processed block in a first reference image of the to-be-processed block; respectively calculating a pixel difference value or a rate-distortion cost between one or more first neighboring reconstructed blocks of each of the at least two determined reference blocks and one or more second neighboring reconstructed blocks of the block to be processed, where the first neighboring reconstructed block and the second neighboring reconstructed block have the same shape and the same size; and obtaining a first predicted motion vector of the to-be-processed block according to the first reference motion vector, among the at least two first reference motion vectors, whose pixel difference value or rate-distortion cost is the smallest (an illustrative cost-computation sketch follows below).
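  • As a non-authoritative illustration of the cost computation described above, the Python sketch below measures the "pixel difference value" as a sum of absolute differences (SAD) between the top/left reconstructed templates of the current block and of a reference block addressed by an integer-pel motion vector, and then picks the reference motion vector with the smallest cost. The function names, the template width `tpl`, the integer-pel restriction, and the choice of SAD instead of a rate-distortion cost are assumptions made only for illustration.

```python
import numpy as np

def template_sad(ref_pic, cur_pic, bx, by, bw, bh, mv, tpl=4):
    """SAD between the current block's top/left reconstructed template and the
    same-shaped template around the reference block addressed by integer-pel `mv`."""
    rx, ry = bx + mv[0], by + mv[1]
    cur_tpl = np.concatenate([
        cur_pic[by - tpl:by, bx:bx + bw].ravel(),   # rows above the block
        cur_pic[by:by + bh, bx - tpl:bx].ravel()])  # columns left of the block
    ref_tpl = np.concatenate([
        ref_pic[ry - tpl:ry, rx:rx + bw].ravel(),
        ref_pic[ry:ry + bh, rx - tpl:rx].ravel()])
    return int(np.abs(cur_tpl.astype(np.int64) - ref_tpl.astype(np.int64)).sum())

def best_reference_mv(ref_pic, cur_pic, bx, by, bw, bh, ref_mvs):
    """Return the first reference MV whose template cost (pixel difference) is smallest."""
    return min(ref_mvs, key=lambda mv: template_sad(ref_pic, cur_pic, bx, by, bw, bh, mv))
```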
  • The candidate motion vector set may include a plurality of candidate motion vectors of the current block (a candidate motion vector may also be referred to as a candidate predicted motion vector), and may further include reference image information associated with these candidate motion vectors. The candidate motion vector set may be a Merge candidate list constructed based on the Merge mode, or an AMVP candidate list constructed based on the AMVP mode.
  • In the specific implementation, the candidate motion vector set of the current block may be constructed according to a preset rule: for example, the predicted motion vectors of spatially neighboring blocks of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference blocks corresponding to blocks adjacent to the current block are used as candidate motion vectors, and the candidate list is then constructed from these candidate motion vectors based on the Merge mode or the AMVP mode.
  • In the embodiments of the present invention, the candidate motion vectors in the candidate list are further updated based on a template matching manner.
  • The template matching process in the embodiments of the present invention specifically includes: selecting a first candidate motion vector in the candidate list, where the first candidate motion vector represents a predicted motion vector from the current block in the current image to a reference image block in the first reference image (referred to as the first reference block for short); determining a search range in the first reference image and performing a search around the first reference block within that range to obtain at least one further first reference image block (which could also simply be called a reference block but, to distinguish it from the reference block determined by the candidate motion vector, is referred to below as an image block); and determining, from the current block to each such image block, a motion vector, each of which is a reference motion vector. That is, based on the first candidate motion vector, at least two first reference motion vectors may be determined: one of them is the first candidate motion vector itself, determined by the first reference block, and the other(s) are the reference motion vectors determined by the at least one image block. Among the at least one (specifically, two or more) image blocks, the image block with the smallest pixel difference value or rate-distortion cost is determined, and the reference motion vector corresponding to that image block is used as a new candidate motion vector (a search-and-update sketch follows below). In the specific implementation, the decoding end may directly use the new candidate motion vector as the predicted motion vector to obtain the prediction block of the current block (for unidirectional prediction, the prediction block is the reconstructed block of the current block; for bidirectional prediction, the prediction block is used to finally form the reconstructed block of the current block). Alternatively, the new candidate motion vector may replace, in the candidate list, the candidate motion vector that was selected in the template matching stage, thereby updating the candidate list.
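  • Continuing the sketch above, the hypothetical helpers below show the search-and-replace step: integer-pel offsets around the selected candidate MV are enumerated (a small square search range is assumed), the reference MV with the smallest template cost is kept, and it replaces the selected candidate in the list. `template_cost` stands for any callable mapping an MV to a cost, such as the template SAD from the previous sketch.

```python
def refine_candidate_mv(cand_mv, template_cost, search_range=2):
    """Search offsets around the block addressed by `cand_mv` and return the
    reference MV with the smallest template cost (the candidate itself included)."""
    ref_mvs = [(cand_mv[0] + dx, cand_mv[1] + dy)
               for dy in range(-search_range, search_range + 1)
               for dx in range(-search_range, search_range + 1)]
    return min(ref_mvs, key=template_cost)

def update_candidate_list(cand_list, idx, template_cost):
    """Replace the candidate selected for template matching with its refinement,
    implementing the candidate-list update described above."""
    updated = list(cand_list)
    updated[idx] = refine_candidate_mv(updated[idx], template_cost)
    return updated
```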
  • On the encoding end, based on the updated candidate list, all the candidate motion vectors in the list may be traversed according to their rate-distortion costs, and the optimal candidate motion vector is determined as the predicted motion vector of the current block (for example, the predicted motion vector obtained based on the Merge candidate list is the optimal MV, and the predicted motion vector obtained based on the AMVP candidate list is the optimal MVP). The prediction block of the current block is then obtained based on the predicted motion vector (for unidirectional prediction, the prediction block is the reconstructed block of the current block; for bidirectional prediction, the prediction block is used to finally form the reconstructed block of the current block).
  • It can be seen that, by means of the template matching manner, the video codec system can check whether the neighboring reconstructed blocks of the reference image blocks within a certain range of the reference image of the current block (or even within the entire reference image) match the neighboring reconstructed blocks of the current block well, and can update the candidate motion vectors of the candidate list constructed based on the Merge or AMVP mode accordingly. The updated candidate list can ensure that the best reference image block of the current block is obtained during encoding and decoding and is used to construct an accurate reconstructed block of the current block.
  • In a possible embodiment, on the decoding end, in addition to directly using as the predicted motion vector the motion vector that minimizes the pixel difference value or the rate-distortion cost (either the candidate motion vector selected for template matching or the reference motion vector corresponding to a searched image block), the following is possible: if the minimum pixel difference value or rate-distortion cost corresponds to the reference motion vector of a certain image block, that reference motion vector may replace, in the candidate list, the candidate motion vector selected for template matching, and the replacing candidate motion vector (that is, the reference motion vector) is then determined as the predicted motion vector based on the index value.
  • In a possible embodiment, the method may be used by a decoding end. Before constructing the candidate motion vector set of the to-be-processed block, the decoding end parses the code stream to obtain identification information and/or an index value of the candidate list, and determines, based on the identification information and/or the index value, which prediction mode (e.g., Merge mode or AMVP mode) the candidate list is to be built on and which candidate motion vector in the candidate list is to be updated by template matching.
  • In a possible embodiment, the decoding end may obtain identification information (such as an identifier bit in the code stream) and an index value of the candidate list by parsing the code stream. The identification information is used to indicate whether the candidate motion vector set is constructed based on the Merge mode or the AMVP mode, and the index value is used to indicate a specific candidate motion vector in the candidate list. Therefore, the decoding end can quickly determine the prediction mode based on the identification information and quickly select, based on the index value, the candidate motion vector to be updated by template matching (or directly select that candidate motion vector as the predicted motion vector to calculate the prediction block), thereby improving decoding efficiency.
  • In another possible embodiment, the identification information (for example, an identifier bit in the code stream) may be used both to indicate the prediction mode used to decode the current block and to indicate the index value of the candidate list constructed based on that prediction mode. Specifically, when the identification information indicates that the prediction mode is the Merge mode it also indicates the index information of the Merge mode, and when it indicates that the prediction mode is the AMVP mode it also indicates the index information of the AMVP mode. The decoding end can therefore quickly determine the prediction mode and quickly select the candidate motion vector to be updated by template matching (or directly select the indicated candidate motion vector as the predicted motion vector to calculate the prediction block), thereby improving decoding efficiency.
  • In yet another possible embodiment, the identification information (for example, an identifier bit in the code stream) indicates the prediction mode used to decode the current block; when the indicated prediction mode is the AMVP mode, the index value of the AMVP candidate list is indicated at the same time, and when the indicated prediction mode is the Merge mode, the decoding end uses a preset index value for the Merge mode. The decoding end can therefore quickly determine the prediction mode and the candidate motion vector, and in the Merge mode quickly update the first candidate motion vector of the candidate list (or another pre-specified candidate motion vector) by template matching (or directly select the specified candidate motion vector as the predicted motion vector to calculate the prediction block), thereby improving decoding efficiency.
  • In a possible embodiment, for bidirectional prediction the decoding end may obtain at least two pieces of identification information: one piece indicates that one direction adopts the Merge mode, together with an index value into the Merge candidate list, and the other piece indicates that the other direction adopts the AMVP mode, together with an index value into the AMVP candidate list. The decoding end can therefore quickly determine the two prediction modes in bidirectional prediction based on the two pieces of identification information and quickly select the candidate motion vectors to be updated by template matching (or directly select the candidate motion vectors indicated by the index values as the predicted motion vectors to calculate the prediction blocks), thereby improving decoding efficiency.
  • In a possible embodiment, the decoding end may obtain at least two combinations, {first identification information, first index value} and {second identification information, second index value}. In {first identification information, first index value}, the first identification information indicates the prediction mode in the first direction and the first index value indicates the index into the candidate list of the first direction; in {second identification information, second index value}, the second identification information indicates the prediction mode in the second direction and the second index value indicates the index into the candidate list of the second direction. The decoding end can therefore quickly determine the prediction modes of the two directions in bidirectional prediction based on these two combinations and select the candidate motion vectors in the candidate lists to be updated by template matching (or directly select the candidate motion vector indicated by the index value as the predicted motion vector to calculate the prediction block of the corresponding direction), thereby improving decoding efficiency.
  • In a possible embodiment, the decoding end may also obtain only an index value of the candidate list by parsing the code stream; this index value may be used both to indicate the prediction mode used to decode the current block and to indicate a specific candidate motion vector of the candidate list constructed based on that prediction mode. The decoding end can therefore quickly determine the prediction mode based on the index value of the candidate list and quickly select the candidate motion vector to be updated by template matching (or directly select it as the predicted motion vector to calculate the prediction block), thereby improving decoding efficiency. A minimal parsing sketch is given below.
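  • A purely illustrative sketch of one of the signalling variants above (an identifier bit selecting the mode, an explicitly coded AMVP index, and a preset Merge index): `read_flag` and `read_uvlc` stand in for the entropy decoder's primitives and are not defined by this summary.

```python
from dataclasses import dataclass

@dataclass
class PredictionSignal:
    mode: str        # "MERGE" or "AMVP"
    cand_index: int  # index into the candidate list built for that mode

def parse_prediction_signal(read_flag, read_uvlc, merge_default_index=0):
    """Hypothetical parsing of one identifier bit plus an index value."""
    is_merge = read_flag()                         # identifier bit: 1 -> Merge, 0 -> AMVP
    if is_merge:
        # variant in which the Merge index is preset rather than signalled
        return PredictionSignal("MERGE", merge_default_index)
    return PredictionSignal("AMVP", read_uvlc())   # AMVP index signalled explicitly
```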
  • In the embodiments of the present invention, bidirectional prediction includes first direction prediction and second direction prediction, where the first direction prediction is prediction based on a first reference frame list and the second direction prediction is prediction based on a second reference frame list; the first reference frame list includes the first reference image and the second reference frame list includes a second reference image. The prediction in one direction may also be referred to as forward prediction and the prediction in the other direction as backward prediction.
  • The bidirectional prediction involved in the embodiments of the present invention includes at least two types: bidirectional prediction based on a hybrid prediction mode, and bidirectional prediction based on a single prediction mode. The hybrid prediction mode combines the Merge mode and the AMVP mode, that is, one direction of the bidirectional prediction adopts the Merge mode and the other direction adopts the AMVP mode. In the specific implementation, each direction obtains a predicted motion vector, a prediction block is obtained in each direction based on that predicted motion vector, and the prediction blocks of the two directions are finally combined by a preset algorithm (for example, weighted averaging, as sketched below) to obtain the reconstructed block of the current block.
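  • A minimal sketch of the combination step, assuming equal weights and 8-bit samples (the summary only says "a preset algorithm, for example weighted averaging"); the function name is an assumption for illustration.

```python
import numpy as np

def combine_bi_prediction(pred_first, pred_second, w_first=0.5):
    """Weighted average of the prediction blocks from the two directions."""
    blended = (w_first * pred_first.astype(np.float64)
               + (1.0 - w_first) * pred_second.astype(np.float64))
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```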
  • In one implementation, the two directions each adopt their own prediction mode; one direction (the first direction) independently performs the update of the candidate motion vector / candidate list based on the template matching manner provided by the embodiments of the present invention, thereby obtaining the predicted motion vector of the first direction, while the other direction (the second direction) independently obtains the predicted motion vector of the second direction directly from its constructed candidate list (that is, no template matching is performed).
  • In another implementation, the two directions each adopt their own prediction mode, and each direction independently performs the update of the candidate motion vector / candidate list based on the template matching manner provided by the embodiments of the present invention, thereby obtaining the predicted motion vector of the first direction and the predicted motion vector of the second direction respectively.
  • In yet another implementation, the two directions each adopt their own prediction mode; one direction (for example, the first direction) independently performs the update of the candidate motion vector / candidate list based on the template matching manner provided by the embodiments of the present invention, thereby obtaining the predicted motion vector of the first direction, while the other direction (for example, the second direction) cooperatively uses the change of the first direction candidate motion vector to update the candidate motion vector in the candidate list constructed for the second direction, and thereby obtains the predicted motion vector of the second direction.
  • Specifically, the implementation process of the second direction prediction includes: calculating the difference between the first direction candidate motion vector after replacement and the first direction candidate motion vector before replacement; obtaining a new second direction candidate motion vector according to this difference and the second direction candidate motion vector selected from the second direction candidate motion vector set; and replacing the original second direction candidate motion vector in the candidate motion vector set with the new second direction candidate motion vector, so that the candidate motion vectors in the candidate list constructed for the second direction are updated and the predicted motion vector of the second direction is obtained (a sketch follows below).
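  • The sketch below illustrates one possible reading of this cooperative update: the refinement delta of the first-direction candidate is simply added to the selected second-direction candidate. The additive transfer is an assumption made only for illustration; the summary does not specify how the difference is applied, and in practice it may need to be scaled or mirrored according to the reference picture distances.

```python
def derive_second_direction_mv(first_mv_before, first_mv_after, second_mv):
    """Apply the first-direction refinement delta to the second-direction candidate."""
    delta_x = first_mv_after[0] - first_mv_before[0]
    delta_y = first_mv_after[1] - first_mv_before[1]
    return (second_mv[0] + delta_x, second_mv[1] + delta_y)
```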
  • It can be seen that the hybrid prediction mode provided by the embodiments of the present invention can ensure that the best reference block of the current block is obtained during encoding and decoding, and improves coding and decoding efficiency while doing so.
  • For bidirectional prediction based on a single prediction mode, the prediction mode may be the Merge mode or the AMVP mode, and a candidate list for bidirectional prediction is constructed based on that mode, the candidate list including candidate motion vectors for the first direction prediction and candidate motion vectors for the second direction prediction. In this case, the processes of obtaining the predicted motion vectors of the two directions may be the same or different, and may be independent of each other or coordinated.
  • In one implementation, one direction (the first direction) independently performs the update of the first direction candidate motion vector / candidate list based on the template matching manner provided by the embodiments of the present invention, thereby obtaining the predicted motion vector of the first direction, while the other direction obtains the predicted motion vector of the second direction directly from the constructed candidate list (that is, no template matching is performed).
  • In another implementation, one direction independently performs the update of the first direction candidate motion vector / candidate list based on the template matching manner provided by the embodiments of the present invention, thereby obtaining the predicted motion vector of the first direction, and the other direction also independently performs the update of the second direction candidate motion vector / candidate list based on the template matching manner, thereby obtaining the predicted motion vector of the second direction.
  • In yet another implementation, one direction (for example, the first direction) independently performs the update of the candidate motion vector / candidate list based on the template matching manner provided by the embodiments of the present invention, thereby obtaining the predicted motion vector of the first direction, while the other direction (for example, the second direction) cooperatively uses the difference between the first direction candidate motion vector before and after the update to update the motion vector selected in the second direction and obtain the predicted motion vector of the second direction. Specifically, the implementation process of the second direction prediction includes: calculating the difference between the first direction candidate motion vector after replacement and the first direction candidate motion vector before replacement; obtaining a new second direction candidate motion vector according to this difference and the second direction candidate motion vector in the candidate motion vector set; and replacing the original second direction candidate motion vector in the candidate motion vector set with the new second direction candidate motion vector, so that the candidate motion vectors in the candidate list constructed for the second direction are updated and the predicted motion vector of the second direction is obtained (as in the sketch above).
  • the update of the candidate motion vector in another direction can be implemented based on the update result in one direction, thereby greatly improving the efficiency in the encoding and decoding process.
  • In the embodiments of the present invention, the decoding end (or the encoding end) may select the first candidate motion vector (for the first direction) from the candidate motion vector set in various manners, including the following.
  • Manner 1: when a fourth candidate motion vector in the candidate motion vector set was generated by the second direction prediction, the fourth candidate motion vector is scaled down or up according to a proportional relationship to obtain the first candidate motion vector. The proportional relationship is the ratio of a first temporal difference to a second temporal difference, where the first temporal difference is the difference between the picture order count (image sequence number) of the reference image frame determined by the first candidate motion vector and that of the image frame in which the to-be-processed block is located, and the second temporal difference is the difference between the picture order count of the reference image frame determined by the fourth candidate motion vector and that of the image frame in which the to-be-processed block is located. In other words, when the selected candidate motion vector was obtained by prediction in the opposite direction, a candidate motion vector usable in the current direction can be obtained by this scaling and applied in the subsequent steps (a sketch follows below).
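  • A small sketch of this scaling, using plain rational arithmetic (real codecs typically use fixed-point approximations); the function name and the rounding are assumptions for illustration.

```python
from fractions import Fraction

def scale_mv_by_poc(mv, cur_poc, ref_poc_target, ref_poc_source):
    """Scale an MV by the ratio of POC distances:
    (target reference POC - current POC) / (source reference POC - current POC)."""
    ratio = Fraction(ref_poc_target - cur_poc, ref_poc_source - cur_poc)
    return (int(round(mv[0] * ratio)), int(round(mv[1] * ratio)))

# Example: a second-direction MV pointing 2 pictures ahead, mapped to a
# first-direction reference 1 picture behind the current picture:
# scale_mv_by_poc((8, -4), cur_poc=10, ref_poc_target=9, ref_poc_source=12) -> (-4, 2)
```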
  • Manner 2: when the first candidate motion vector was generated by bidirectional prediction, the at least two first reference motion vectors are determined according to a fifth candidate motion vector, where the reference block determined by the first candidate motion vector is obtained by weighting a first direction reference block determined according to the fifth candidate motion vector and a second direction reference block determined according to a sixth candidate motion vector, the fifth candidate motion vector was generated by the first direction prediction, and the sixth candidate motion vector was generated by the second direction prediction. In other words, when the selected candidate motion vector was obtained by bidirectional prediction, the component of the candidate motion vector belonging to the current direction is taken as the selected candidate motion vector value and applied in the subsequent steps; and when the selected candidate motion vector was generated by prediction in the current direction, it is used directly as the selected candidate motion vector value in the subsequent steps.
  • The second candidate motion vector (for the second direction) may likewise be selected from the candidate motion vector set in multiple manners, which can be implemented with reference to the foregoing manners and are not described again here.
  • the implementation of the above embodiment of the present invention can improve the accuracy and fault tolerance of the candidate motion vector selection process, thereby improving the accuracy and fault tolerance in the template matching process.
  • In a possible embodiment, determining the at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set includes: determining, according to the first candidate motion vector, a reference block of the to-be-processed block in the first reference image; and searching at neighboring positions of the determined reference block with a target precision to obtain at least two candidate reference blocks, where each candidate reference block corresponds to one first reference motion vector and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision (a sketch of generating such search positions follows below).
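  • One possible way to enumerate such search positions is sketched below: offsets one step away from the starting MV are generated at the chosen precision, expressed in an assumed internal MV unit of 1/8 pixel. The function name, the single-step square pattern, and the internal unit are illustrative assumptions only.

```python
def search_offsets(precision_in_pels, steps=1, mv_unit=0.125):
    """Enumerate MV offsets around a starting MV with the given target precision
    (4, 2, 1, 0.5, 0.25 or 0.125 pixels), in 1/8-pel internal units."""
    step = int(round(precision_in_pels / mv_unit))
    rng = range(-steps * step, steps * step + 1, step)
    return [(dx, dy) for dy in rng for dx in rng if (dx, dy) != (0, 0)]

# search_offsets(0.5) -> the 8 half-pel neighbours in 1/8-pel units:
# [(-4, -4), (0, -4), (4, -4), (-4, 0), (4, 0), (-4, 4), (0, 4), (4, 4)]
```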
  • In the embodiments of the present invention, the at least one neighboring reconstructed block of the current block, the at least one neighboring reconstructed block of the reference block, and the at least one neighboring reconstructed block of each searched image block have the same shape and equal size. Suppose the at least one neighboring reconstructed block of the current block has a positional relationship 1 with the current block (e.g., adjacent, or close to adjacent within a certain range), the at least one neighboring reconstructed block of the reference block has a positional relationship 2 with the reference block (e.g., adjacent, or close to adjacent within a certain range), and the at least one neighboring reconstructed block of the image block has a positional relationship 3 with the image block (e.g., adjacent, or close to adjacent within a certain range); then positional relationship 1, positional relationship 2, and positional relationship 3 may be the same. However, in a possible embodiment, positional relationship 1, positional relationship 2, and positional relationship 3 may also differ.
  • According to a second aspect, an embodiment of the present invention provides a method for generating a predicted motion vector, where the method may be used for bidirectional prediction in a hybrid prediction mode, the hybrid prediction mode combining the Merge mode and the AMVP mode. The bidirectional prediction includes first direction prediction based on a first reference frame list and second direction prediction based on a second reference frame list. The method includes: acquiring a first prediction mode and generating a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction predicted motion vector in the first direction prediction; and acquiring a second prediction mode and generating a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction predicted motion vector in the second direction prediction, and where, when the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or, when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode.
  • The hybrid prediction mode provided by the embodiments of the present invention is a novel coding and decoding mode; it can likewise ensure that the best reference block of the current block is obtained during encoding and decoding, and improves coding and decoding efficiency while ensuring that the best reference block of the current block is obtained.
  • In a possible embodiment, acquiring the first prediction mode and generating the first candidate motion vector set includes generating the first candidate motion vector set from the candidate motion vectors used by the Merge mode in the first direction prediction.
  • In another possible embodiment, acquiring the first prediction mode and generating the first candidate motion vector set includes generating the first candidate motion vector set from the candidate motion vectors used by the AMVP mode in the first direction prediction.
  • In a possible embodiment, acquiring the second prediction mode and generating the second candidate motion vector set includes generating the second candidate motion vector set from the candidate motion vectors used by the Merge mode in the second direction prediction.
  • In another possible embodiment, acquiring the second prediction mode and generating the second candidate motion vector set includes generating the second candidate motion vector set from the candidate motion vectors used by the AMVP mode in the second direction prediction.
  • In a possible embodiment, when the codec end adopts the hybrid prediction mode, the decoding end may obtain at least two pieces of identification information: one piece indicates that one direction adopts the Merge mode, together with an index value into the Merge candidate list, and the other piece indicates that the other direction adopts the AMVP mode, together with an index value into the AMVP candidate list. The decoding end can therefore quickly determine the prediction modes in bidirectional prediction based on the two pieces of identification information and quickly select the candidate motion vectors to be updated by template matching (or directly select the candidate motion vectors indicated by the index values as the predicted motion vectors to calculate the prediction blocks), thereby improving decoding efficiency.
  • In a possible embodiment, when the codec end adopts the hybrid prediction mode (that is, the Merge mode and the AMVP mode are adopted in the two different directions), the decoding end obtains at least two combinations, {first identification information, first index value} and {second identification information, second index value}. In {first identification information, first index value}, the first identification information indicates the prediction mode in the first direction and the first index value indicates the index into the candidate list of the first direction; in {second identification information, second index value}, the second identification information indicates the prediction mode in the second direction and the second index value indicates the index into the candidate list of the second direction. The decoding end can therefore quickly determine the prediction modes of the two directions in bidirectional prediction based on these two combinations and select the candidate motion vectors in the candidate lists to be updated by template matching (or directly select the candidate motion vector indicated by the index value as the predicted motion vector to calculate the prediction block of the corresponding direction), thereby improving decoding efficiency.
  • In a possible embodiment, when the first identification information indicates that the first prediction mode is the Merge mode, the first identification information is further used to indicate the index information of the Merge mode; or, when the first identification information indicates that the first prediction mode is the AMVP mode, the first identification information is further used to indicate the index information of the AMVP mode.
  • In a possible embodiment, the identification information (the first identification information or the second identification information) is used to indicate the prediction mode used to decode the current block; when the prediction mode indicated by the identification information is the AMVP mode, the index value of the AMVP candidate list is indicated at the same time, and when the prediction mode indicated by the identification information is the Merge mode, the decoding end may use a preset index value for the Merge mode. The decoding end can therefore quickly determine the prediction mode and the candidate motion vector based on the identification information, and in the Merge mode quickly update the first candidate motion vector of the candidate list (or another pre-specified candidate motion vector) by template matching (or directly select the specified candidate motion vector as the predicted motion vector to calculate the prediction block), thereby improving decoding efficiency.
  • In a possible embodiment, before the acquiring of the second prediction mode and the generating of the second candidate motion vector set, the method further includes: after determining that the first prediction mode is the Merge mode, parsing the code stream to obtain the second identification information, where the second identification information is used to indicate the index information of the AMVP mode.
  • In a possible embodiment, before the acquiring of the second prediction mode and the generating of the second candidate motion vector set, the method further includes: after determining that the first prediction mode is the AMVP mode, parsing the code stream to obtain the second identification information, where the second identification information is used to indicate the index information of the Merge mode.
  • In a possible embodiment, after determining that the first prediction mode is the AMVP mode, the decoding end parses the code stream to obtain the reference frame index and the motion vector difference information of the first direction prediction; or, after determining that the second prediction mode is the AMVP mode, the decoding end parses the code stream to obtain the reference frame index and the motion vector difference information of the second direction prediction (a parsing sketch follows below).
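  • A hedged sketch of parsing and applying this AMVP-side information for one direction: `read_uvlc`, `read_svlc` and `read_flag` are placeholders for the entropy decoder's primitives, the single-flag MVP index assumes a two-entry AMVP list, and the exact binarization is not specified by this summary.

```python
def parse_amvp_direction(read_uvlc, read_svlc, read_flag):
    """Hypothetical parsing of one AMVP-coded direction: reference frame index,
    MVP index into the AMVP candidate list, and the motion vector difference."""
    ref_idx = read_uvlc()
    mvp_idx = read_flag()              # assumes an AMVP list with two candidates
    mvd = (read_svlc(), read_svlc())
    return ref_idx, mvp_idx, mvd

def reconstruct_amvp_mv(mvp, mvd):
    """The final MV is the selected predictor plus the parsed difference."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```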
  • In the specific implementation of the embodiments of the present invention, the codec end may use the template matching manner provided by the embodiments of the present invention to update the candidate lists in bidirectional prediction; for the specific process, refer to the description of the first aspect, and details are not described again here.
  • An embodiment of the present invention further provides an apparatus for generating a predicted motion vector, where the apparatus includes: a set generation module, configured to construct a candidate motion vector set of a block to be processed; a template matching module, configured to determine at least two first reference motion vectors according to a first candidate motion vector in the candidate motion vector set, where the first reference motion vectors are used to determine reference blocks of the to-be-processed block in a first reference image of the to-be-processed block, and further configured to respectively calculate a pixel difference value or a rate-distortion cost between one or more first neighboring reconstructed blocks of each of the at least two determined reference blocks and one or more second neighboring reconstructed blocks of the to-be-processed block, where the first neighboring reconstructed block and the second neighboring reconstructed block have the same shape and the same size; and a predicted motion vector generation module, configured to obtain a first predicted motion vector of the to-be-processed block according to the first reference motion vector, among the at least two first reference motion vectors, whose pixel difference value or rate-distortion cost is the smallest (see the structural sketch below).
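  • Purely as a structural illustration of how these modules might fit together (the method bodies are left abstract, and the class and method names are assumptions rather than part of the disclosure):

```python
class PredictedMvGenerator:
    """Set-generation, template-matching and predicted-MV-generation modules."""

    def build_candidate_set(self, block):
        """Set generation module: construct the candidate motion vector set."""
        raise NotImplementedError

    def refine_by_template_matching(self, block, candidate_mv):
        """Template matching module: derive and score the first reference MVs."""
        raise NotImplementedError

    def generate_predicted_mv(self, block, index=0):
        """Predicted-MV generation module: return the lowest-cost reference MV."""
        candidates = self.build_candidate_set(block)
        return self.refine_by_template_matching(block, candidates[index])
```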
  • An embodiment of the present invention further provides another apparatus for generating a predicted motion vector, where the apparatus is configured to perform bidirectional prediction of a to-be-processed block, the bidirectional prediction includes first direction prediction and second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list. The apparatus includes: a first set generation module, configured to acquire a first prediction mode and generate a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction predicted motion vector in the first direction prediction; and a second set generation module, configured to acquire a second prediction mode and generate a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction predicted motion vector in the second direction prediction, and where, when the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or, when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode.
  • The modules in the apparatus are specifically configured to implement the corresponding method described above.
  • An embodiment of the present invention further provides an apparatus for generating a predicted motion vector; the apparatus may be applied to the encoding side or to the decoding side. The apparatus includes a processor and a memory, and the processor and the memory are connected (e.g., to each other through a bus). The apparatus may further include a transceiver, coupled to the processor and the memory, for receiving/sending data. The memory is used to store program code and video data, and the processor can be configured to read the program code stored in the memory and perform the method described in the first aspect.
  • An embodiment of the present invention further provides another apparatus for generating a predicted motion vector; the apparatus may be applied to the encoding side or to the decoding side. The apparatus includes a processor and a memory, and the processor and the memory are connected (e.g., to each other through a bus). The apparatus may further include a transceiver, coupled to the processor and the memory, for receiving/sending data. The memory is used to store program code and video data, and the processor can be configured to read the program code stored in the memory and perform the method described in the second aspect.
  • An embodiment of the present invention further provides a video codec system, where the video codec system includes a source device and a destination device that can be in communicative connection with each other. The source device produces encoded video data and may therefore be referred to as a video encoding device or a video encoding apparatus. The destination device can decode the encoded video data produced by the source device and may therefore be referred to as a video decoding device or a video decoding apparatus. The source device and the destination device may be instances of video codec devices or video codec apparatuses. The method described in the first aspect and/or the second aspect may be applied to such a video codec device or apparatus; that is, the video codec system may be used to implement the method described in the first aspect and/or the second aspect.
  • an embodiment of the present invention provides a computer readable storage medium, wherein the computer readable storage medium stores instructions that, when run on a computer, cause the computer to perform the method described in the first aspect above.
  • an embodiment of the present invention provides a computer readable storage medium, wherein the computer readable storage medium stores instructions that, when run on a computer, cause the computer to perform the method described in the second aspect above.
  • an embodiment of the present invention provides a computer program product comprising instructions, which when executed on a computer, cause the computer to perform the method described in the first aspect above.
  • an embodiment of the present invention provides a computer program product comprising instructions that, when run on a computer, cause the computer to perform the method of the second aspect described above.
  • It can be seen that, by implementing the embodiments of the present invention, the video codec system can use the template matching manner to check whether the neighboring reconstructed blocks of the reference image blocks within a certain range of the reference image of the current block (or even within the entire reference image) match the neighboring reconstructed blocks of the current block well, and can update the candidate motion vectors of the candidate list constructed based on the Merge or AMVP mode accordingly. The updated candidate list can ensure that the best reference image block of the current block is obtained during encoding and decoding and is used to construct an accurate reconstructed block of the current block. The embodiments of the present invention further provide a hybrid prediction mode; based on the hybrid prediction mode, it is likewise possible to obtain the best reference block of the current block during encoding and decoding and to improve coding and decoding efficiency while ensuring that the best reference block of the current block is obtained.
  • FIG. 1 is a schematic structural diagram of a codec system according to an embodiment of the present invention.
  • FIG. 2 is a schematic block diagram of a video codec device or an electronic device according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a terminal according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an AMVP candidate list according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a merge candidate list according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of still another AMVP candidate list according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of still another merge candidate list according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a scenario of a template matching manner according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of another scenario of a template matching manner according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic diagram of a codec process of a Merge mode according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a codec process of another Merge mode according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of a codec process of an AMVP mode according to an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of another encoding and decoding process of an AMVP mode according to an embodiment of the present invention.
  • FIG. 25 to FIG. 28 are schematic structural diagrams of some devices according to an embodiment of the present invention.
  • FIG. 1 is a schematic block diagram of a video codec system 10 according to an embodiment of the present invention.
  • video codec system 10 includes source device 12 and destination device 14.
  • Source device 12 produces encoded video data.
  • Source device 12 may be referred to as a video encoding device or a video encoding apparatus.
  • Destination device 14 may decode the encoded video data produced by source device 12.
  • Destination device 14 may be referred to as a video decoding device or a video decoding apparatus.
  • Source device 12 and destination device 14 may be examples of video codec devices or video codec apparatuses.
  • Source device 12 and destination device 14 may comprise a wide range of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, smartphones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
  • Channel 16 may include one or more media and/or devices capable of moving encoded video data from source device 12 to destination device 14.
  • channel 16 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
  • source device 12 may modulate the encoded video data in accordance with a communication standard (eg, a wireless communication protocol) and may transmit the modulated video data to destination device 14.
  • the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the one or more communication media may form part of a packet-based network (eg, a local area network, a wide area network, or a global network (eg, the Internet)).
  • the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from source device 12 to destination device 14.
  • channel 16 can include a storage medium that stores encoded video data generated by source device 12.
  • destination device 14 can access the storage medium via disk access or card access.
  • the storage medium may include a variety of locally accessible data storage media, such as Blu-ray Disc, DVD, CD-ROM, flash memory, or other suitable digital storage medium for storing encoded video data.
  • channel 16 can include a file server or another intermediate storage device that stores encoded video data generated by source device 12.
  • destination device 14 may access the encoded video data stored at a file server or other intermediate storage device via streaming or download.
  • The file server may be a type of server capable of storing encoded video data and transmitting the encoded video data to destination device 14.
  • the instance file server includes a web server (eg, for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, and a local disk drive.
  • Destination device 14 can access the encoded video data via a standard data connection (e.g., an internet connection).
  • Example types of data connections include wireless channels (e.g., a Wi-Fi connection), wired connections (e.g., DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server.
  • the transmission of the encoded video data from the file server may be streaming, downloading, or a combination of both.
  • the technology of the present invention is not limited to a wireless application scenario.
  • The technology can be applied to video coding and decoding supporting a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
  • video codec system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • source device 12 includes a video source 18, a video encoder 20, and an output interface 22.
  • output interface 22 can include a modulator/demodulator (modem) and/or a transmitter.
  • Video source 18 may include a video capture device (eg, a video camera), a video archive containing previously captured video data, a video input interface to receive video data from a video content provider, and/or a computer for generating video data.
  • the video encoder 20 may encode video data from the video source 18. Specifically, the video encoder 20 is provided with a prediction module, a transform module, a quantization module, an entropy encoding module, and the like. In some examples, source device 12 transmits the encoded video data directly to destination device 14 via output interface 22. The encoded video data may also be stored on a storage medium or file server for later access by the destination device 14 for decoding and/or playback.
  • destination device 14 includes an input interface 28, a video decoder 30, and a display device 32.
  • input interface 28 includes a receiver and/or a modem.
  • Input interface 28 can receive the encoded video data via channel 16.
  • the video decoder 30 is configured to decode the code stream (video data) received by the input interface 28.
  • the video decoder 30 is provided with an entropy decoding module, an inverse quantization module, an inverse transform module, a prediction compensation module, and the like.
  • Display device 32 may be integral with destination device 14 or may be external to destination device 14. In general, display device 32 displays the decoded video data.
  • Display device 32 may include a variety of display devices such as liquid crystal displays (LCDs), plasma displays, organic light emitting diode (OLED) displays, or other types of display devices.
  • Video encoder 20 and video decoder 30 may operate in accordance with a video compression standard (eg, the High Efficiency Video Codec H.265 standard) and may conform to the HEVC Test Model (HM).
  • A textual description of the H.265 standard, ITU-T H.265 (V3) (04/2015), was published on April 29, 2015 and is available for download from http://handle.itu.int/11.1002/1000/12455; the entire contents of that document are incorporated herein by reference.
  • FIG. 2 is a schematic block diagram of a video codec device or electronic device 50 according to an embodiment of the present invention.
  • the device or electronic device 50 may incorporate a codec according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of an apparatus for video encoding according to an embodiment of the present invention. The units in Figures 2 and 3 will be explained below.
  • the electronic device 50 can be, for example, a mobile terminal or user equipment of a wireless communication system. It should be understood that embodiments of the invention may be practiced in any electronic device or device that may require encoding and decoding, or encoding, or decoding of a video image.
  • Device 50 can include a housing 30 for incorporating and protecting the device.
  • Device 50 may also include display 32 in the form of a liquid crystal display.
  • the display may be any suitable display technology suitable for displaying images or video.
  • Device 50 may also include a keypad 34.
  • any suitable data or user interface mechanism may be utilized.
  • the user interface can be implemented as a virtual keyboard or data entry system as part of a touch sensitive display.
  • the device may include a microphone 36 or any suitable audio input, which may be a digital or analog signal input.
  • the apparatus 50 may also include an audio output device, which in an embodiment of the invention may be any of the following: an earphone 38, a speaker, or an analog audio or digital audio output connection.
  • Device 50 may also include battery 40, and in other embodiments of the invention, the device may be powered by any suitable mobile energy device, such as a solar cell, fuel cell, or clock mechanism generator.
  • the device may also include an infrared port 42 for short-range line of sight communication with other devices.
  • device 50 may also include any suitable short range communication solution, such as a Bluetooth wireless connection or a USB/Firewire wired connection.
  • Device 50 may include a controller 56 or processor for controlling device 50.
  • Controller 56 may be coupled to memory 58, which may store data in the form of images and audio in an embodiment of the invention, and/or may also store instructions for execution on controller 56.
  • Controller 56 may also be coupled to codec circuitry 54 suitable for implementing encoding and decoding of audio and/or video data or assisted encoding and decoding by controller 56.
  • the apparatus 50 may also include a card reader 48 and a smart card 46, such as a UICC and a UICC reader, for providing user information and for providing authentication information for authenticating and authorizing users on the network.
  • Apparatus 50 may also include a radio interface circuit 52 coupled to the controller and adapted to generate, for example, a wireless communication signal for communicating with a cellular communication network, a wireless communication system, or a wireless local area network. Apparatus 50 may also include an antenna 44 coupled to radio interface circuitry 52 for transmitting radio frequency signals generated at radio interface circuitry 52 to other apparatus(s) and for receiving radio frequency signals from other apparatus(s).
  • device 50 includes a camera capable of recording or detecting a single frame, and codec 54 or controller receives the individual frames and processes them.
  • the device may receive video image data to be processed from another device prior to transmission and/or storage.
  • device 50 may receive images for encoding/decoding via a wireless or wired connection. The method described in the embodiments of the present invention is mainly applied to inter prediction in the corresponding codec process of the video encoder 20 and the video decoder 30.
  • The essence of inter prediction is to look, in a reference image, for the block that is most similar to the current block of the current image (the current block is the image block currently to be encoded/decoded, and may also be referred to as the to-be-processed block); the most similar block is called the reference block. In order to obtain the reference block that best matches the current block, the Advanced Motion Vector Prediction (AMVP) mode and the Merge mode among the inter prediction modes of the codec are implemented in different ways.
  • For the AMVP mode, an MV is first predicted for the current block. This predicted motion vector is also called a Motion Vector Prediction (MVP).
  • The MVP may be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block. Because there are multiple neighboring blocks, there are multiple MVPs, and each MVP is essentially a candidate motion vector (candidate MV); the AMVP mode groups these MVPs into a candidate list. Herein, the candidate list constructed in the AMVP mode is called an AMVP candidate list.
  • The encoding end selects an optimal MVP from the AMVP candidate list, determines the starting point of the search in the reference image according to the optimal MVP (the MVP is also a candidate MV), then searches in a specific manner within a specific range near the starting point and performs rate-distortion cost calculation, and finally obtains an optimal MV. The optimal MV determines the position of the actual reference block (prediction block) in the reference image. A motion vector difference (MVD) is obtained from the difference between the optimal MV and the optimal MVP, the index value of the optimal MVP in the AMVP candidate list is encoded, and the index of the reference image is encoded.
  • The encoding end only needs to send the MVD, the index of the AMVP candidate list, and the index of the reference image to the decoding end in the code stream, thereby achieving the purpose of video data compression.
  • The decoding end, on the one hand, decodes the MVD, the index value of the candidate list, and the index of the reference image from the code stream, and on the other hand establishes an AMVP candidate list by itself. It obtains the optimal MVP through the index value, obtains the optimal MV according to the MVD and the optimal MVP, obtains the reference image according to the index of the reference image, finds the actual reference block (prediction block) in the reference image by using the optimal MV, and then obtains the reconstructed block of the current block by performing motion compensation on the actual reference block (prediction block).
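  • The relationship between the optimal MV, the optimal MVP, and the MVD described above can be illustrated with the following minimal sketch (Python is used only for illustration; the helper names and the simplified MVP selection by distance to the searched MV are assumptions and not the reference implementation, since the text above selects the MVP by rate-distortion cost):

```python
# Sketch of the AMVP signalling: the encoder sends {MVP index, MVD, reference
# index}; the decoder rebuilds the optimal MV as MVP + MVD.

def amvp_encode(amvp_candidates, optimal_mv, ref_idx):
    # Simplified: pick the MVP closest to the searched optimal MV.
    best_idx = min(range(len(amvp_candidates)),
                   key=lambda i: abs(optimal_mv[0] - amvp_candidates[i][0])
                               + abs(optimal_mv[1] - amvp_candidates[i][1]))
    mvp = amvp_candidates[best_idx]
    mvd = (optimal_mv[0] - mvp[0], optimal_mv[1] - mvp[1])
    return best_idx, mvd, ref_idx          # written to the code stream

def amvp_decode(amvp_candidates, mvp_idx, mvd):
    mvp = amvp_candidates[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])   # optimal MV of the current block

# Example: AMVP list of length 2, searched optimal MV (5, -2), reference index 0.
idx, mvd, ref = amvp_encode([(4, -1), (0, 0)], (5, -2), 0)
assert amvp_decode([(4, -1), (0, 0)], idx, mvd) == (5, -2)
```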
  • For the Merge mode, the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block is also used as a candidate motion vector (candidate MV).
  • Because there are multiple neighboring blocks, there are multiple candidate MVs.
  • The Merge mode builds a candidate list based on these candidate MVs.
  • The candidate list constructed in the Merge mode is called a Merge candidate list (its length is different from that of the AMVP mode).
  • In the Merge mode, the MV of the neighboring block is directly used as the predicted motion vector of the current block, that is, the current block shares an MV with the neighboring block (so there is no MVD in this case), and the reference image of the neighboring block is used as the reference image of the current block.
  • The Merge mode traverses all candidate MVs in the Merge candidate list and performs rate-distortion cost calculation; finally, the candidate MV with the lowest rate-distortion cost is selected as the optimal MV of the Merge mode, and the index value of the optimal MV in the Merge candidate list is encoded. The encoding end only needs to send the index of the Merge candidate list to the decoding end in the code stream, thereby achieving the purpose of video data compression.
  • The decoding end, on the one hand, decodes the index of the Merge candidate list from the code stream, and on the other hand establishes a Merge candidate list by itself. It determines the candidate MV indicated by the index value in the Merge candidate list as the optimal MV, uses the reference image of the neighboring block as its own reference image, finds the actual reference block (prediction block) in the reference image by using the optimal MV, and then finally obtains the reconstructed block of the current block by performing motion compensation on the actual reference block (prediction block).
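  • As a minimal sketch of this selection (Python for illustration only; rd_cost is a hypothetical caller-supplied function standing in for the rate-distortion cost calculation mentioned above):

```python
# Merge mode: only the candidate index is signalled; the chosen neighbour's MV
# and reference image are reused directly, and no MVD is transmitted.

def merge_encode(merge_candidates, rd_cost):
    costs = [rd_cost(mv) for mv in merge_candidates]
    return costs.index(min(costs))          # index written to the code stream

def merge_decode(merge_candidates, merge_idx):
    return merge_candidates[merge_idx]      # shared MV of the current block
```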
  • the traditional AMVP mode or Merge mode is based on the candidate motion vectors of adjacent blocks to obtain the best reference image block.
  • However, the candidate motion vector of a neighboring block of the current block may not be the optimal predicted motion vector; that is, the actual reference block obtained by directly using such a candidate motion vector in the conventional AMVP mode or Merge mode is not necessarily the best reference block for the current block.
  • Therefore, an embodiment of the present invention provides a predicted motion vector generation method that improves (updates) the candidate list constructed in the traditional AMVP mode or Merge mode; based on the updated candidate list, it can be ensured that the best reference block of the current block is obtained in the codec process.
  • the embodiment of the present invention further provides some hybrid prediction modes, and based on the hybrid prediction mode, it is also possible to ensure that the best reference block of the current block is obtained in the codec process.
  • Unidirectional prediction/bidirectional prediction, the constructed candidate list, the method for updating the candidate motion vector and the candidate list based on template matching, the template-matching-based codec process, and the manner of constructing a candidate list and selecting a candidate motion vector based on decoding information are first described below.
  • Hereinafter, inter-frame unidirectional prediction is referred to as unidirectional prediction, and inter-frame bidirectional prediction is referred to as bidirectional prediction.
  • the unidirectional prediction refers to determining a first prediction motion vector of a current block based on a reference image of a single direction, thereby obtaining a prediction block of a current block in a single direction.
  • the unidirectional prediction may be referred to as forward prediction or backward prediction according to the relative relationship between the image sequence number of the reference image frame and the image sequence number of the current image frame.
  • The bidirectional prediction includes first direction prediction and second direction prediction. The first direction prediction determines a first predicted motion vector of the current block based on a reference image in the first direction, thereby obtaining a prediction block of the current block in the first direction, where the reference image in the first direction is one of a first reference image frame set and the first reference image frame set includes a certain number of reference images. The second direction prediction determines a second predicted motion vector of the current block based on a reference image in the second direction, thereby obtaining a prediction block of the current block in the second direction, where the reference image in the second direction is one of a second reference image frame set and the second reference image frame set includes a certain number of reference images.
  • inter-frame bidirectional prediction can also be called forward backward prediction, that is, inter-frame bidirectional prediction includes forward prediction and backward prediction.
  • When the first direction prediction is forward prediction, the second direction prediction corresponds to backward prediction; when the first direction prediction is backward prediction, the second direction prediction corresponds to forward prediction.
  • FIG. 4 is an AMVP candidate list used in the AMVP mode according to an embodiment of the present invention.
  • the AMVP candidate list may be applied to forward prediction and may also be applied to backward prediction.
  • the AMVP candidate list can be applied to one-way prediction (forward prediction or backward prediction), to forward prediction in bidirectional prediction, or to backward prediction in bidirectional prediction.
  • The AMVP candidate list includes a set of multiple MVPs (each MVP is also a candidate MV), and each MVP may be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block.
  • The AMVP candidate list includes MVP0, MVP1, ..., MVPn, where the position of each MVP in the candidate list corresponds to a specific candidate list index (which may be referred to as an index of the AMVP candidate list); that is, each index is used to indicate the candidate MV at the corresponding position in the list, and the candidate list indexes corresponding to MVP0, MVP1, ..., MVPn in the figure are index 0, index 1, ..., index n, respectively.
  • the length of the AMVP candidate list constructed based on the AMVP mode is 2.
  • The AMVP candidate list includes two MVPs (MVP0 and MVP1), one of which may be a candidate motion vector obtained from a spatially neighboring block, and correspondingly, the other MVP may be a candidate motion vector obtained from a temporally neighboring block.
  • an index value of 0 of the candidate list may be defined to indicate MVP0
  • an index value of 1 of the candidate list may be defined to indicate MVP1.
  • the embodiment of the present invention does not limit the specific value of the index value, that is, MVP0 and MVP1 may also be indicated by other defined index values.
  • FIG. 5 is a Merge candidate list used in the Merge mode according to an embodiment of the present invention.
  • the Merge candidate list may be applied to forward prediction or may be applied to backward prediction.
  • the Merge candidate list may be applied to one-way prediction (forward prediction or backward prediction), to forward prediction in bidirectional prediction, or to backward prediction in bidirectional prediction.
  • The Merge candidate list includes a set of multiple candidate MVs, and each candidate MV may be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block.
  • The Merge candidate list shown in (a) of FIG. 5 includes forward (and/or backward) candidate MV0, candidate MV1, ..., candidate MVn, where the position of each candidate MV in the candidate list corresponds to a specific candidate list index (which may be referred to as an index of the Merge candidate list); that is, each index is used to indicate the candidate MV at the corresponding position in the list, and the candidate list indexes corresponding to candidate MV0, candidate MV1, ..., candidate MVn in the figure are index 0, index 1, ..., index n, respectively.
  • In addition, candidate MV0, candidate MV1, ..., candidate MVn may also be used to indicate the reference image used when reconstructing the current block (for example, the reference image of a neighboring block is directly used as the reference image of the current block).
  • the Merge candidate list length based on the Merge mode is 5.
  • The Merge candidate list includes five candidate MVs (candidate MV0, candidate MV1, candidate MV2, candidate MV3, candidate MV4), of which four candidate MVs may be candidate motion vectors obtained from spatially neighboring blocks and the remaining one is a candidate motion vector obtained from a temporally neighboring block.
  • For example, index value 2, index value 3, index value 4, index value 5, and index value 6 of the candidate list may be defined to indicate candidate MV0, candidate MV1, candidate MV2, candidate MV3, and candidate MV4, respectively.
  • Each candidate MV may also be indicated by other defined index values.
  • FIG. 6 is an AMVP candidate list used in the AMVP mode according to an embodiment of the present invention.
  • The AMVP candidate list can be applied to bidirectional prediction, that is, applied simultaneously to forward prediction and backward prediction in bidirectional prediction.
  • The AMVP candidate list includes a set of MVPs for first direction prediction (including MVP10, MVP11, ..., MVP1n) and a set of MVPs for second direction prediction (including MVP20, MVP21, ..., MVP2n).
  • Each MVP may be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block.
  • The candidate list indexes index 0, index 1, ..., index n may be used to indicate MVP10, MVP11, ..., MVP1n, respectively, and the candidate list indexes index 0, index 1, ..., index n may also be used to indicate MVP20, MVP21, ..., MVP2n, respectively.
  • the index value used to indicate the first direction MVP and the index value used to indicate the second direction MVP may be the same or different.
  • For example, the index value 0 may be used to simultaneously indicate MVP10 and MVP20, or the index value 0 and the index value 1 may be used to indicate MVP10 and MVP20, respectively.
  • The candidate motion vector used for forward (or backward) prediction in the AMVP candidate list may be simply referred to as the forward (or backward) candidate motion vector, and the candidate motion vector used for backward (or forward) prediction in the AMVP candidate list may be simply referred to as the backward (or forward) candidate motion vector.
  • FIG. 7 is another Merge candidate list used in the Merge mode according to an embodiment of the present invention.
  • The Merge candidate list can be applied to bidirectional prediction, that is, applied simultaneously to forward prediction and backward prediction in bidirectional prediction.
  • The Merge candidate list includes a set of candidate MVs for first direction prediction (including candidate MV10, candidate MV11, ..., candidate MV1n) and a set of candidate MVs for second direction prediction (including candidate MV20, candidate MV21, ..., candidate MV2n).
  • Each candidate MV may be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block.
  • The candidate list indexes index 0, index 1, ..., index n may be used to indicate candidate MV10, candidate MV11, ..., candidate MV1n, respectively, and the candidate list indexes index 0, index 1, ..., index n may also be used to indicate candidate MV20, candidate MV21, ..., candidate MV2n, respectively.
  • the index value used to indicate the first direction candidate MV and the index value used to indicate the second direction candidate MV may be the same or different.
  • For example, the index value 0 may be used to simultaneously indicate candidate MV10 and candidate MV20, or the index value 0 and the index value 1 may be used to indicate candidate MV10 and candidate MV20, respectively.
  • The candidate motion vector used for forward (or backward) prediction in the Merge candidate list may be simply referred to as the forward (or backward) candidate motion vector, and the candidate motion vector used for backward (or forward) prediction in the Merge candidate list may also be simply referred to as the backward (or forward) candidate motion vector.
  • The template matching manner provided by the embodiment of the present invention can be used to update the candidate motion vectors used in the AMVP mode or the Merge mode, and even to update the candidate list of the AMVP mode or the Merge mode. Based on the updated candidate motion vector/candidate list, it can be ensured in the codec process of the AMVP mode or the Merge mode that the obtained actual reference block is the best reference block of the current block, thereby ensuring the quality of the reconstructed block of the current block that is finally obtained.
  • The process of updating a candidate motion vector by template matching is described as follows: (1) In the candidate list constructed in the AMVP mode or the Merge mode, one candidate motion vector is selected; the candidate motion vector determines a reference image block (abbreviated as reference block) in the reference image of the current block. (2) A search range is determined in the reference image based on the candidate motion vector, and at least one reference motion vector different from the candidate motion vector is determined within the search range; each of the at least one reference motion vector corresponds to a reference image block in the reference image indicated by the candidate motion vector (referred to as an image block, so as to be distinguished from the reference block determined by the candidate motion vector).
  • (3) At least one adjacent reconstructed block of the current block, at least one adjacent reconstructed block of the reference block, and at least one adjacent reconstructed block of each image block have the same shape and are equal in size. The at least one adjacent reconstructed block of the current block has a positional relationship 1 with the current block (e.g., adjacent or close within a certain range), the at least one adjacent reconstructed block of the reference block has a positional relationship 2 with the reference block (e.g., adjacent or close within a certain range), and the at least one adjacent reconstructed block of the image block has a positional relationship 3 with the image block (e.g., adjacent or close within a certain range). The positional relationship 2 and the positional relationship 3 can be the same, and the positional relationship 1, the positional relationship 2, and the positional relationship 3 may also differ. (4) From the pixel difference value or rate-distortion cost between the at least one adjacent reconstructed block of each of the at least one image block and the at least one adjacent reconstructed block of the current block, and the pixel difference value or rate-distortion cost between the at least one adjacent reconstructed block of the reference block and the at least one adjacent reconstructed block of the current block, a minimum pixel difference value or rate-distortion cost is determined; the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is either the candidate motion vector or one of the at least one reference motion vector.
  • the motion vector corresponding to the minimum pixel difference value or the rate distortion cost value is used as the new candidate motion vector. Specifically, if the motion vector is one of the at least one reference motion vector, the reference motion vector is used as a new candidate motion vector, and the original candidate motion vector is updated with the new candidate motion vector. If the motion vector is the original candidate motion vector, there is no need to update the candidate motion vector.
  • the above (1)(2)(3)(4) is the update process of the candidate motion vector using template matching.
  • If the motion vector obtained after performing (1)(2)(3)(4) is one of the at least one reference motion vector, that reference motion vector is taken as a new candidate motion vector.
  • The original candidate motion vector in the constructed candidate list is replaced with the new candidate motion vector, thereby realizing the update of the candidate list by template matching.
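  • A minimal sketch of steps (1)-(4) is given below (Python for illustration only; template(mv) is an assumed helper that returns the pixels of the adjacent reconstructed region located, by the agreed positional relationship, next to the image block that mv points to in the reference image, and cur_template is the same-shaped region adjacent to the current block):

```python
# Template matching: keep the motion vector whose template has the smallest
# pixel difference (SAD here) with the template of the current block.

def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def update_candidate_mv(candidate_mv, search_offsets, template, cur_template):
    best_mv = candidate_mv
    best_cost = sad(template(candidate_mv), cur_template)
    for dx, dy in search_offsets:            # positions inside the search range
        ref_mv = (candidate_mv[0] + dx, candidate_mv[1] + dy)
        cost = sad(template(ref_mv), cur_template)
        if cost < best_cost:
            best_mv, best_cost = ref_mv, cost
    return best_mv   # replaces the original candidate MV if a better one is found
```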
  • FIG. 8 shows a schematic diagram of template matching. A candidate motion vector is selected from the candidate list constructed in the AMVP mode or the Merge mode, and the candidate motion vector determines a reference block in the reference image. Within the search range determined based on the candidate motion vector, a reference motion vector (i.e., reference motion vector 1 in the figure) is determined, and reference motion vector 1 determines image block 1 in the reference image.
  • the adjacent reconstructed blocks of the current block are A1 and A2
  • the adjacent reconstructed blocks of the reference block are B1 and B2
  • the adjacent reconstructed blocks of image block 1 are C1 and C2.
  • The shapes of {A1, A2}, {B1, B2}, and {C1, C2} are the same and equal in size, and they have the same positional relationship with the current block, the reference block, and image block 1, respectively.
  • The pixel difference value or rate-distortion cost between {C1, C2} of image block 1 and {A1, A2} of the current block is calculated, and the pixel difference value or rate-distortion cost between {B1, B2} of the reference block and {A1, A2} of the current block is calculated.
  • The motion vector corresponding to the smaller of the two pixel difference values or rate-distortion costs is used as the new candidate motion vector. For example, if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is reference motion vector 1, reference motion vector 1 is used as the new candidate motion vector, and the candidate motion vector in the candidate list is replaced with reference motion vector 1. It can be understood that if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is the original candidate motion vector, the candidate motion vector in the candidate list does not need to be updated.
  • FIG. 9 shows another schematic diagram of template matching.
  • A candidate motion vector is selected from the candidate list constructed in the AMVP mode or the Merge mode, and the candidate motion vector determines a reference block in the reference image. Two reference motion vectors (i.e., reference motion vector 1 and reference motion vector 2 in the figure) are determined within the search range determined based on the candidate motion vector, and reference motion vector 1 and reference motion vector 2 determine image block 1 and image block 2 in the reference image, respectively.
  • the adjacent reconstructed blocks of the current block are A1 and A2
  • the adjacent reconstructed blocks of the reference block are B1 and B2
  • the adjacent reconstructed blocks of image block 1 are C1 and C2
  • The adjacent reconstructed blocks of image block 2 are D1 and D2; the shapes of {A1, A2}, {B1, B2}, {C1, C2}, and {D1, D2} are the same and equal in size, and they have the same positional relationship with the current block, the reference block, image block 1, and image block 2, respectively.
  • The pixel difference value or rate-distortion cost between {C1, C2} of image block 1 and {A1, A2} of the current block, the pixel difference value or rate-distortion cost between {D1, D2} of image block 2 and {A1, A2} of the current block, and the pixel difference value or rate-distortion cost between {B1, B2} of the reference block and {A1, A2} of the current block are calculated.
  • The motion vector corresponding to the smallest of these pixel difference values or rate-distortion costs is selected as the new candidate motion vector. For example, if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is reference motion vector 1 or reference motion vector 2, that reference motion vector is used as the new candidate motion vector, and the candidate motion vector in the candidate list is replaced with it. It can be understood that if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is the original candidate motion vector, the candidate motion vector in the candidate list does not need to be updated.
  • updating the candidate list by using template matching may be updating one candidate motion vector in the candidate list, and may update multiple candidate motion vectors in the candidate list. It is also possible to update all candidate motion vectors in the candidate list.
  • a codec process for obtaining a reconstructed block of a current block based on a template matching manner in the Merge mode is described below.
  • the process may be divided into an encoding process and a decoding process.
  • FIG. 10 shows a specific encoding process of the Merge mode in the embodiment of the present invention.
  • On the one hand, the encoding end constructs a candidate list of the Merge mode (i.e., a Merge candidate list); the candidate list includes candidate MV1, candidate MV2, and so on. The encoding end selects candidate MV2 in the candidate list and updates candidate MV2 by template matching (refer to the description of the process above) to obtain a reference motion vector; the reference motion vector, as a new candidate motion vector, replaces candidate MV2 in the candidate list, thereby implementing an update of the candidate list, and the updated candidate list includes candidate MV1 and the reference motion vector.
  • On the other hand, the predicted motion vector is obtained in a conventional manner (the predicted motion vector here is the optimal MV); that is, the Merge mode traverses all candidate MVs in the candidate list (including candidate MV1 and the reference motion vector) and performs rate-distortion cost calculation, selects the candidate MV with the lowest rate-distortion cost as the optimal MV of the Merge mode, constructs the prediction block of the current block based on the optimal MV (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block), and encodes the index value of the optimal MV.
  • If the optimal MV is the reference motion vector, the prediction block of the current block is constructed based on the reference motion vector, and the index value corresponding to the reference motion vector is encoded.
  • the encoding end transmits the index value of the Merge candidate list to the decoding end in the code stream.
  • On the one hand, the decoding end constructs a Merge candidate list based on the same rules as the encoding end, and the candidate list includes candidate MV1, candidate MV2, and so on; on the other hand, it parses the code stream sent from the encoding end to obtain the index value of the Merge candidate list, and updates the candidate motion vector indicated by the index value by template matching. For example, if the index value indicates candidate MV2 in the candidate list, candidate MV2 is updated by template matching (refer to the description of the process above) to obtain the reference motion vector.
  • Candidate MV2 in the candidate list is replaced with the reference motion vector as a new candidate motion vector, thereby implementing an update of the candidate list; the updated candidate list includes candidate MV1 and the reference motion vector.
  • Then, the reference motion vector directly determined based on the index value is used as the predicted motion vector of the current block (the predicted motion vector is the optimal MV), and the prediction block of the current block is constructed based on the predicted motion vector (optimal MV) (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block).
  • the reference motion vector may also be directly used as the predicted motion vector of the current block, and the prediction block of the current block is constructed based on the predicted motion vector.
  • It should be noted that when the encoding end updates the Merge candidate list by template matching, it may update one candidate motion vector in the candidate list, update multiple candidate motion vectors in the candidate list, or update all candidate motion vectors in the candidate list; at the decoding end, however, only the candidate MV indicated by the index value of the Merge candidate list is updated by template matching, thereby improving the decoding efficiency of the decoding end.
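  • A minimal sketch of this decoder-side behaviour (Python for illustration; refine stands for the template-matching update sketched earlier, and all names are illustrative):

```python
# Merge decoding with template matching: only the candidate pointed to by the
# parsed index is refined, which keeps the decoder-side work small.

def merge_decode_with_tm(merge_candidates, parsed_idx, refine):
    refined_mv = refine(merge_candidates[parsed_idx])   # template-matching update
    merge_candidates[parsed_idx] = refined_mv           # candidate list update
    return refined_mv                # predicted MV (optimal MV) of the current block
```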
  • FIG. 11 shows another specific encoding process of the Merge mode in the embodiment of the present invention. As shown in FIG. 11, the encoding end updates multiple candidate MVs (including candidate MV1 and candidate MV2) in the candidate list, determines reference motion vector 2 as the predicted motion vector (optimal MV) of the current block based on the updated candidate list, constructs the prediction block of the current block based on reference motion vector 2, and encodes the candidate list index corresponding to reference motion vector 2.
  • In this case, at the decoding end, the candidate motion vector in the candidate list can be directly selected according to the index of the candidate list as the predicted motion vector of the decoding end, without updating the candidate motion vector.
  • a codec process for obtaining a reconstructed block of a current block based on a template matching manner in an AMVP mode is described below.
  • the process may be divided into an encoding process and a decoding process.
  • FIG. 12 shows a specific encoding process of the AMVP mode in the embodiment of the present invention.
  • On the one hand, the encoding end constructs a candidate list of the AMVP mode (i.e., an AMVP candidate list); the candidate list includes MVP1, MVP2, and so on. The encoding end selects MVP2 in the candidate list and updates MVP2 by template matching (refer to the description of the process above) to obtain a reference motion vector; the reference motion vector, as a new candidate motion vector, replaces MVP2 in the candidate list, thereby implementing an update of the candidate list, and the updated candidate list includes MVP1 and the reference motion vector.
  • On the other hand, the predicted motion vector is obtained in a conventional manner (the predicted motion vector here is the optimal MVP); that is, an optimal MVP is selected from the AMVP candidate list, the starting point of the search in the reference image is determined according to the optimal MVP, a search is then performed in a specific manner within a specific range near the search starting point together with rate-distortion cost calculation, and finally an optimal MV is obtained. The optimal MV determines the position of the prediction block of the current block in the reference image (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block). The motion vector difference MVD is obtained from the difference between the optimal MV and the optimal MVP, the index value corresponding to the optimal MVP in the AMVP candidate list is encoded, and the index of the reference image is encoded.
  • If the optimal MVP is the reference motion vector (i.e., the reference motion vector is the predicted motion vector), the prediction block of the current block is constructed based on the reference motion vector, the index value corresponding to the reference motion vector is encoded, the MVD is encoded, and the index of the reference image corresponding to the reference motion vector is encoded. Then, the encoding end sends the encoded index value of the AMVP candidate list, the MVD, and the index of the reference image to the decoding end in the code stream.
  • On the one hand, the decoding end constructs an AMVP candidate list based on the same rules as the encoding end, and the candidate list includes MVP1, MVP2, and so on; on the other hand, it parses the code stream sent from the encoding end to obtain the index value of the AMVP candidate list, the MVD, and the index of the reference image, and updates the candidate motion vector indicated by the index value by template matching. For example, if the index value indicates MVP2 in the candidate list, MVP2 is updated by template matching (refer to the description of the process above) to obtain a reference motion vector.
  • MVP2 in the candidate list is replaced with the reference motion vector as a new candidate motion vector, thereby implementing an update of the candidate list; the updated candidate list includes MVP1 and the reference motion vector.
  • Then, the reference motion vector directly determined based on the index value is used as the optimal MVP of the current block (the optimal MVP is the predicted motion vector of the current block), the optimal MV of the current block is obtained by combining the optimal MVP with the MVD, the reference image corresponding to the optimal MV is determined based on the index of the reference image, and the prediction block of the current block is constructed (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block).
  • Optionally, after the reference motion vector is obtained based on template matching, the reference motion vector may be directly used as the optimal MV (predicted motion vector) of the current block, and the prediction block of the current block is constructed based on the predicted motion vector.
  • It should be noted that when the encoding end updates the AMVP candidate list by template matching, it may update one candidate motion vector in the candidate list, update multiple candidate motion vectors in the candidate list, or update all candidate motion vectors in the candidate list; at the decoding end, however, only the candidate MV indicated by the index value of the AMVP candidate list is updated by template matching, thereby improving the decoding efficiency of the decoding end.
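  • The corresponding decoder-side AMVP flow can be sketched as follows (Python for illustration; refine again stands for the template-matching update sketched earlier, and all names are illustrative):

```python
# AMVP decoding with template matching: only the MVP pointed to by the parsed
# index is refined, then the optimal MV is rebuilt as refined MVP + MVD.

def amvp_decode_with_tm(amvp_candidates, parsed_idx, mvd, refine):
    optimal_mvp = refine(amvp_candidates[parsed_idx])   # refine only this MVP
    amvp_candidates[parsed_idx] = optimal_mvp           # candidate list update
    return (optimal_mvp[0] + mvd[0], optimal_mvp[1] + mvd[1])   # optimal MV
```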
  • FIG. 13 shows another specific encoding process of the AMVP mode in the embodiment of the present invention. As shown in FIG. 13, the encoding end updates multiple MVPs (including MVP1 and MVP2) in the candidate list, determines reference motion vector 2 as the predicted motion vector (optimal MVP) of the current block based on the updated candidate list, finally constructs the prediction block of the current block based on reference motion vector 2, and encodes the candidate list index corresponding to reference motion vector 2, the MVD, and the index of the reference image.
  • In this case, the decoding end may update, by template matching, the MVP2 indicated by the received candidate list index. Alternatively, if the predicted motion vector obtained by the encoding end based on the updated candidate list is not one of the updated candidate motion vectors, then at the decoding end, after the index of the candidate list is received, the candidate motion vector in the candidate list is directly selected according to the index of the candidate list as the predicted motion vector of the decoding end, without updating the candidate motion vector.
  • A manner of constructing a candidate list based on decoding information in a decoding process (for example, an entropy decoding process) and of selecting a candidate motion vector based on the decoding information is described below.
  • The decoding end may determine, according to the decoded identification information of the candidate list and/or the index value of the candidate list, which prediction mode is used for the current codec and/or which candidate motion vector in the candidate list is selected.
  • In a possible embodiment, the decoding end may obtain the identification information (e.g., an identification bit encoded in the code stream) and the index value of the candidate list by parsing the code stream.
  • the identifier information is used to indicate that the candidate motion vector set is constructed based on a Merge mode or an AMVP mode, for example, an identifier bit of 0 indicates a Merge mode, and an identifier bit of 1 indicates an AMVP mode.
  • The index value of the candidate list is used to indicate a specific candidate motion vector in the candidate list; for example, index value 0 represents the first candidate motion vector in the candidate list, index value 1 represents the second candidate motion vector in the candidate list, and so on.
  • For example, if the current decoding obtains the combination {identification bit 0, index value 1}, it indicates that the prediction mode used for decoding the current block is the Merge mode, and the subsequent decoding step selects the second candidate motion vector in the Merge candidate list.
  • In a possible embodiment, the identification information (for example, an identification bit encoded in the code stream) can be used to indicate the prediction mode used for decoding the current block, and can also be used to indicate the index value of the candidate list constructed based on that prediction mode. Specifically, the identification information is used to indicate that the prediction mode is the Merge mode and to indicate the index information of the Merge mode, or the identification information is used to indicate that the prediction mode is the AMVP mode and to indicate the index information of the AMVP mode.
  • For example, when the value of the identification bit is 0 or 1, it is used to indicate the first MVP or the second MVP in the candidate list of the AMVP mode; when the value of the identification bit is 2, 3, 4, 5, or 6, it is used to indicate the first, second, third, fourth, or fifth candidate MV in the candidate list of the Merge mode, respectively.
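  • This combined interpretation of the identification bit can be sketched as follows (Python for illustration; the concrete value-to-candidate mapping is just the example given above):

```python
# Map the parsed identification bit to a prediction mode and a candidate index.

def parse_mode_and_candidate(identification_bit):
    if identification_bit in (0, 1):
        return "AMVP", identification_bit        # 1st or 2nd MVP of the AMVP list
    if identification_bit in (2, 3, 4, 5, 6):
        return "Merge", identification_bit - 2   # 1st..5th candidate of the Merge list
    raise ValueError("unexpected identification bit")
```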
  • In another possible embodiment, the identification information (for example, an identification bit encoded in the code stream) is used to indicate the prediction mode used for decoding the current block; when the prediction mode indicated by the identification information is the AMVP mode, the index value of the AMVP candidate list is indicated at the same time, and when the prediction mode indicated by the identification information is the Merge mode, the decoding end uses a preset index value for the Merge mode. For example, when the value of the identification bit is 0 or 1, it indicates the first MVP or the second MVP in the candidate list of the AMVP mode; when the value of the identification bit is a value other than 0 and 1 (for example, 2), it indicates the Merge mode, in which case, in the subsequent decoding step, the decoding end directly selects the first candidate MV in the candidate list established in the Merge mode (or the candidate MV at any other preset index position).
  • In a possible embodiment, the decoding end may obtain at least two pieces of identification information: one piece of identification information is used to indicate that one direction adopts the Merge mode, and the other piece of identification information is used to indicate that the other direction adopts the AMVP mode.
  • Optionally, one piece of identification information is used to indicate that one direction adopts the Merge mode and to indicate an index value of the Merge candidate list, and the other piece of identification information is used to indicate that the other direction adopts the AMVP mode and to indicate an index value of the AMVP candidate list.
  • In this case, the decoding end may obtain at least two combinations {first identification information, first index value} and {second identification information, second index value}, where in {first identification information, first index value} the first identification information represents the prediction mode in the first direction and the first index value represents the index value of the candidate list in the first direction, and in {second identification information, second index value} the second identification information represents the prediction mode in the second direction and the second index value represents the index value of the candidate list in the second direction.
  • In another possible embodiment, the decoding end may obtain only an index value of the candidate list by parsing the code stream; the index value of the candidate list may be used to indicate the prediction mode used for decoding the current block and also to indicate a specific candidate motion vector in the candidate list constructed based on that prediction mode. For example, when the index value is 0 or 1, it indicates the first MVP or the second MVP in the candidate list of the AMVP mode; when the index value is 2, 3, 4, 5, or 6, it indicates the first, second, third, fourth, or fifth candidate MV in the candidate list of the Merge mode, respectively.
  • In a possible embodiment, the decoding end may obtain at least two index values, where one index value is used to indicate that one direction adopts the Merge mode and to indicate a specific candidate motion vector in the Merge candidate list, and the other index value is used to indicate that the other direction adopts the AMVP mode and to indicate a specific candidate motion vector in the AMVP candidate list.
  • Based on the unidirectional prediction/bidirectional prediction, the constructed candidate list, the method of updating the candidate motion vector and the candidate list based on template matching, and the template-matching-based codec process described above, the predicted motion vector generation method provided by the embodiment of the present invention is described below.
  • FIG. 14 is a schematic flowchart diagram of a method for generating a motion vector predictor according to an embodiment of the present invention, where the method includes but is not limited to the following steps:
  • Step S101 Construct a candidate motion vector set of the current block.
  • The candidate motion vector set may include a plurality of candidate motion vectors of the current block (which may also be referred to as candidate predicted motion vectors), and may further include information on the reference images corresponding to the multiple candidate motion vectors of the current block.
  • the candidate motion vector set is a Merge candidate list constructed based on the Merge mode, or an AMVP candidate list constructed based on the AMVP mode.
  • For the encoding end, the candidate motion vector set of the current block may be constructed according to a preset rule (for example, in a traditional manner); for example, the motion vectors of spatially neighboring blocks of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference blocks corresponding to neighboring blocks of the current block are used as candidate motion vectors, and the candidate list is constructed based on these candidate motion vectors in the preset Merge mode or AMVP mode.
  • the prediction information or the index value obtained by decoding may be used to determine which prediction mode is used in the current codec, and then the corresponding candidate list is constructed.
  • In a possible embodiment, the identification information is used to indicate the prediction mode used for decoding the current block; when the prediction mode indicated by the identification information is the AMVP mode, the index value of the AMVP candidate list is indicated at the same time, and when the prediction mode indicated by the identification information is the Merge mode, the decoding end uses a preset index value for the Merge mode.
  • the identification information may be used to indicate a prediction mode used by decoding the current block, and may also be used to indicate an index value of a candidate list constructed based on the prediction mode.
  • In a possible embodiment, the decoding end can obtain the identification information and the index value of the candidate list by parsing the code stream.
  • the identifier information is used to indicate that the candidate motion vector set is constructed based on a Merge mode or an AMVP mode, and an index value of the candidate list is used to indicate a specific candidate motion vector in the candidate list.
  • In another possible embodiment, the decoding end may obtain only an index value of the candidate list by parsing the code stream; the index value of the candidate list may be used to indicate the prediction mode used for decoding the current block and also to indicate a specific candidate motion vector in the candidate list constructed based on that prediction mode.
  • Step S102: Select a candidate motion vector in the candidate motion vector set, determine at least one reference motion vector based on the candidate motion vector, and use the reference motion vector to determine a reference block of the current block in a reference image of the current block.
  • In a specific embodiment, determining the reference block of the current block in the reference image according to the candidate motion vector includes: determining a search range in the reference image based on the value of the candidate motion vector, and searching neighboring positions of the determined reference block with a target precision to obtain candidate reference blocks, where each candidate reference block corresponds to a reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
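  • A minimal sketch of generating the reference motion vectors around the position given by the candidate MV at one of the listed target precisions (Python for illustration; expressing MVs in 1/8-pixel units is an assumed internal convention, not something stated above):

```python
# Enumerate reference MVs on a grid of the chosen precision inside the search range.

PRECISION_STEP = {"4pel": 32, "2pel": 16, "1pel": 8,      # steps in 1/8-pel units
                  "half": 4, "quarter": 2, "eighth": 1}

def reference_mvs(candidate_mv, search_range, precision):
    step = PRECISION_STEP[precision]
    mvs = []
    for dy in range(-search_range, search_range + 1, step):
        for dx in range(-search_range, search_range + 1, step):
            if (dx, dy) != (0, 0):          # skip the candidate MV itself
                mvs.append((candidate_mv[0] + dx, candidate_mv[1] + dy))
    return mvs
```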
  • Step S103 Calculate a pixel difference value or a rate distortion value between at least one adjacent reconstructed block of at least one of the determined reference blocks and at least one adjacent reconstructed block of the current block, respectively.
  • the embodiment of the present invention implements the update of the candidate motion vector by using a template matching manner.
  • For the specific implementation of template matching, reference may be made to the related description above, and details are not described herein again.
  • Step S104: Obtain a predicted motion vector of the current block according to the reference motion vector with the smallest pixel difference value or the lowest rate distortion cost value among the at least one reference motion vector.
  • Specifically, the candidate motion vector in the candidate motion vector set may be replaced with the reference motion vector corresponding to the smallest pixel difference value or rate distortion cost value, and the predicted motion vector of the current block is then obtained.
  • If the predicted motion vector is the actual motion vector (actual MV) of the current block, the encoding end or the decoding end may obtain the prediction block of the current block according to the predicted motion vector (in the unidirectional prediction, the prediction block is the final reconstructed block; in the bidirectional prediction, the prediction block is used to obtain the reconstructed block of the current block based on a preset algorithm);
  • If the predicted motion vector is the optimal MVP of the current block, the decoding end may obtain the actual MV according to the predicted motion vector, thereby obtaining the prediction block of the current block (in the unidirectional prediction, the prediction block is the final reconstructed block; in the bidirectional prediction, the prediction block is used to obtain the reconstructed block of the current block based on a preset algorithm).
  • In the bidirectional prediction, the post (pre) predicted candidate motion vector may also be updated directly based on the update information of the pre (post) prediction. The specific process includes: calculating a difference between the replaced pre (post) predicted candidate motion vector in the candidate list and the candidate motion vector before the replacement; combining the difference with the post (pre) predicted candidate motion vector to obtain a new post (pre) predicted candidate motion vector; and replacing the original candidate motion vector with the new post (pre) predicted candidate motion vector, thereby implementing the update of the post (pre) predicted candidate motion vector.
  • the video codec system can verify whether the image block in a certain range (or even the entire reference image) in the reference image of the current block has a good match with the current block by using template matching.
  • the candidate motion vector of the candidate list constructed based on the Merge or AMVP mode is updated, and the updated candidate list can ensure that the best reference block of the current block is obtained in the encoding and decoding process, and finally the optimal reconstructed block is obtained.
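  • As a non-normative illustration of the template matching update described above, the following Python sketch searches the neighboring positions of the reference block at integer-pixel precision and keeps the reference motion vector with the smallest template cost; the SAD cost, the square search pattern, and the helper fetch_template are assumptions made only for this sketch.

      def sad(a, b):
          # Sum of absolute differences between two equally sized pixel sequences.
          return sum(abs(x - y) for x, y in zip(a, b))

      def template_matching_update(cand_mv, cur_template, fetch_template, search_range=1):
          # cand_mv: (x, y) candidate MV taken from the Merge/AMVP candidate list.
          # cur_template: pixels of the reconstructed blocks adjacent to the current block.
          # fetch_template: assumed helper, fetch_template(mv) -> pixels adjacent to the
          #                 block that mv points to in the reference image.
          # Search the neighboring positions at integer-pixel precision and return the
          # reference MV whose template cost is smallest (other precisions work analogously).
          best_mv = cand_mv
          best_cost = sad(cur_template, fetch_template(cand_mv))
          for dx in range(-search_range, search_range + 1):
              for dy in range(-search_range, search_range + 1):
                  ref_mv = (cand_mv[0] + dx, cand_mv[1] + dy)
                  cost = sad(cur_template, fetch_template(ref_mv))
                  if cost < best_cost:
                      best_mv, best_cost = ref_mv, cost
          return best_mv  # replaces cand_mv in the candidate list when it differs

  • The returned motion vector would then replace the original candidate in the list, as in steps S104, S205, and S305 below; the rate distortion cost mentioned in the text could be used in place of the SAD without changing the structure of the search.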
  • FIG. 15 is a method for generating a motion vector predictor according to an embodiment of the present invention, which is described from the perspective of an encoding end; the method may be used in one-way prediction (forward prediction or backward prediction), and the method includes but is not limited to the following steps:
  • Step S201 The encoding end constructs an AMVP candidate list or a Merge candidate list.
  • If the constructed candidate list is a candidate list using unidirectional prediction, refer to the related description of the embodiment of FIG. 4 or FIG. 5; if the constructed candidate list is a candidate list using bidirectional prediction, refer to the related description of the embodiment of FIG. 6 or FIG. 7, and details are not described herein again.
  • Step S202 The encoding end selects one candidate motion vector (ie, MVP or candidate MV) in the constructed AMVP candidate list or the Merge candidate list.
  • Step S203: The encoding end determines a search range in the reference image according to the selected candidate motion vector, where the search range includes at least one motion vector value, and obtains at least one image block in the reference image of the current block according to the at least one motion vector value in the search range. Specifically, the at least one image block is obtained by searching neighboring positions of the reference block determined by the candidate motion vector at a target precision, where each image block corresponds to one reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
  • Step S204: The encoding end respectively calculates a pixel difference value or a rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of each image block, and also calculates a pixel difference value or rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of the reference block corresponding to the candidate motion vector. Among these pixel difference values or rate distortion cost values, the minimum pixel difference value or rate distortion cost value is selected, and the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is used as the candidate motion vector of the current block.
  • Step S205: If the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is different from the candidate motion vector selected in step S202, the encoding end uses the motion vector corresponding to the minimum pixel difference value or rate distortion cost value as a new candidate motion vector of the current block, and updates the constructed AMVP candidate list or Merge candidate list. Specifically, the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is used to replace the corresponding candidate motion vector in the AMVP candidate list or the Merge candidate list.
  • steps S201-S205 may be repeatedly performed to implement updating of multiple (or even all) candidate motion vectors in the AMVP candidate list or the Merge candidate list.
  • In the case of bidirectional prediction, both the forward candidate motion vector and the backward candidate motion vector indicated by the index value are updated in the candidate list update process.
  • Step S206: The encoding end obtains the predicted motion vector of the current block based on the updated AMVP candidate list or Merge candidate list, further obtains the reconstructed block of the current block based on the predicted motion vector of the current block, and sends the code stream to the decoding end.
  • If the AMVP mode is used, the predicted motion vector (optimal MVP) is determined based on the updated AMVP candidate list, thereby obtaining the reconstructed block of the current block; the encoding end encodes the motion vector difference information (MVD) corresponding to the current block, encodes the index of the reference image corresponding to the reconstructed block, and encodes the index value of the AMVP candidate list corresponding to the optimal MVP.
  • the code stream sent by the encoding end to the decoding end includes the MVD, an index value of the AMVP candidate list, and an index of the reference image.
  • If the Merge mode is used, the predicted motion vector (optimal MV) is determined based on the updated Merge candidate list, thereby obtaining the reconstructed block of the current block, and the index value of the Merge candidate list corresponding to the optimal MV is encoded. It can be understood that the code stream sent by the encoding end to the decoding end includes the index value of the Merge candidate list.
  • FIG. 16 is a schematic diagram of a method for generating a predicted motion vector according to an embodiment of the present invention, which is described from the perspective of a decoding end; the decoding end may correspond to the encoding end in the embodiment of FIG. 15.
  • the method can be used in one-way prediction (forward prediction or backward prediction), and the method includes but is not limited to the following steps:
  • Step S301 The decoding end constructs an AMVP candidate list or a Merge candidate list.
  • The decoding end adopts a prediction mode consistent with the encoding end. That is to say, if the encoding end constructs the candidate list based on the AMVP mode, the decoding end also constructs the AMVP candidate list based on the AMVP mode, and the method of establishing the list is consistent with that of the encoding end; if the encoding end constructs the Merge candidate list based on the Merge mode, the decoding end also constructs the Merge candidate list based on the Merge mode, and the method of establishing the list is consistent with that of the encoding end. Specifically, if the constructed candidate list is a candidate list using unidirectional prediction, refer to the related description of the embodiment of FIG. 4 or FIG. 5; if the constructed candidate list is a candidate list using bidirectional prediction, refer to the related description of the embodiment of FIG. 6 or FIG. 7, and details are not described herein again.
  • Step S302 The decoding end parses the code stream to obtain an index value of the AMVP candidate list or the Merge candidate list.
  • A candidate motion vector (i.e., an MVP or a candidate MV) indicated by the index value in the AMVP candidate list or the Merge candidate list is then selected based on the index value.
  • If the AMVP mode is used, the decoding end decodes the index value of the AMVP candidate list in the process of parsing the code stream, decodes the index of the reference image, obtains the motion vector difference information (MVD), and selects the candidate motion vector indicated by the index value in the AMVP candidate list based on the index value.
  • If the Merge mode is used, the decoding end decodes the index value of the Merge candidate list during decoding of the code stream, and selects the candidate motion vector indicated by the index value in the Merge candidate list based on the index value.
  • For example, if the index value obtained by decoding is 0 or 1, the 0 or 1 is used to indicate the first MVP or the second MVP of the candidate list of the AMVP mode; if the decoded index value is 2, 3, 4, 5, or 6, the 2, 3, 4, 5, and 6 are used to indicate the first, second, third, fourth, and fifth candidate MVs of the candidate list of the Merge mode, respectively.
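  • A minimal sketch of how a decoder might interpret such a joint index under the example mapping above (0-1 indicate entries of the AMVP candidate list, 2-6 indicate entries of the Merge candidate list) is given below; the function name and its return value are illustrative only and are not part of this embodiment.

      def interpret_joint_index(idx):
          # Example mapping from the text: 0..1 -> first/second MVP of the AMVP list,
          # 2..6 -> first..fifth candidate MV of the Merge list.
          if idx in (0, 1):
              return ("AMVP", idx)       # position within the AMVP candidate list
          if 2 <= idx <= 6:
              return ("Merge", idx - 2)  # position within the Merge candidate list
          raise ValueError("index value outside the example mapping")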
  • Step S303: The decoding end determines a search range in the reference image in combination with the selected candidate motion vector, where the search range includes at least one motion vector value, and obtains at least one image block in the reference image of the current block according to the at least one motion vector value in the search range. Specifically, the at least one image block is obtained by searching neighboring positions of the reference block determined by the candidate motion vector at a target precision, where each image block corresponds to one reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
  • Step S304: The decoding end respectively calculates a pixel difference value or a rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of each image block, and also calculates a pixel difference value or rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of the reference block corresponding to the candidate motion vector. Among these pixel difference values or rate distortion cost values, the minimum pixel difference value or rate distortion cost value is selected, and the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is used as the candidate motion vector of the current block.
  • Step S305: If the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is different from the candidate motion vector selected in step S302, the decoding end uses the motion vector corresponding to the minimum pixel difference value or rate distortion cost value as a new candidate motion vector of the current block, and updates the constructed AMVP candidate list or Merge candidate list. Specifically, the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is used to replace the corresponding candidate motion vector in the AMVP candidate list or the Merge candidate list.
  • steps S301-S305 may be repeatedly performed to implement updating of multiple (or even all) candidate motion vectors in the AMVP candidate list or the Merge candidate list.
  • In the case of bidirectional prediction, both the forward candidate motion vector and the backward candidate motion vector indicated by the index value are updated in the candidate list update process.
  • Step S306: The decoding end determines, according to the index value of the candidate list, the candidate motion vector indicated by the index value in the updated AMVP candidate list or Merge candidate list as the predicted motion vector, and further obtains the reconstructed block of the current block based on the predicted motion vector of the current block.
  • For example, the Merge mode is adopted, and the Merge index value obtained by the decoding end is 0. The decoding end then constructs a Merge candidate list. After the construction is completed, if the current block selects, in the Merge mode, the candidate motion vector corresponding to the Merge candidate list index value 0 for unidirectional prediction, and the value of the candidate motion vector is (3, 5), a reference image block (reference block) is obtained in the reference image according to the selected candidate motion vector. Centering on the reference block and searching at integer-pixel precision within the range of the surrounding 1-pixel region, a plurality of reference image blocks (image blocks) of the same size as the reference block can be obtained.
  • Using the left adjacent reconstructed block and/or the upper adjacent reconstructed block of the current block as a template, the template is matched against the left adjacent and/or upper adjacent reconstructed blocks of the reference block and of each image block to obtain rate distortion cost values. The image block with the smallest rate distortion cost value is taken as the updated reference image block, and the motion vector (new candidate MV) corresponding to the updated reference image block is (4, 5), that is, the predicted motion vector of the current block is (4, 5).
  • the current block is reconstructed with the updated reference image block, thereby obtaining a reconstructed block of the current block.
  • For another example, the AMVP mode is adopted, and the decoding end decodes to obtain the index value of the candidate list. The decoding end constructs an AMVP candidate list. After the construction is completed, if the current block selects, in the AMVP mode, the candidate motion vector corresponding to the index value for unidirectional prediction, and the value of the candidate motion vector is (3, 5), a reference image block (reference block) is obtained in the reference image according to the selected candidate motion vector. Centering on the reference block and searching at integer-pixel precision within the range of the surrounding 1-pixel region, a plurality of reference image blocks (image blocks) of the same size as the reference block can be obtained. The image block with the lowest rate distortion cost value is taken as the updated reference image block, and the motion vector (new MVP) corresponding to the updated reference image block is (4, 5), that is, the predicted motion vector of the current block is (4, 5). Based on (4, 5), the current block is reconstructed with the updated reference image block, thereby obtaining a reconstructed block of the current block.
  • The decoding end may also skip the update of the candidate list, directly use the motion vector corresponding to the minimum pixel difference value or rate distortion cost value as the predicted motion vector, and then obtain the reconstructed block of the current block based on the predicted motion vector of the current block.
  • a hybrid prediction mode is introduced which is applied to bidirectional prediction.
  • the hybrid prediction mode refers to adopting the Merge mode and the AMVP mode simultaneously in the bidirectional prediction process of the codec.
  • one direction prediction adopts the Merge mode for encoding and decoding
  • the other direction prediction adopts the AMVP mode.
  • A predicted motion vector generation method according to an embodiment of the present invention is described below based on the hybrid prediction mode; the method is described from the perspective of an encoding end. In this method, one direction prediction uses the Merge mode for encoding and decoding to obtain the first prediction block of the current block, and this process includes steps S401-S404; the other direction prediction uses the AMVP mode for encoding and decoding to obtain the second prediction block of the current block, and this process includes steps S405-S406. Finally, in step S407, the reconstructed block of the current block is obtained from the first prediction block and the second prediction block of the current block by using a preset algorithm.
  • steps S401-S404 are described.
  • The Merge mode uses the template matching manner to update the candidate list in the pre (or post, the same below) prediction to obtain the first prediction block, as follows:
  • Step S401 The front (or back) prediction of the encoding end adopts the Merge mode, and the encoding end constructs a Merge candidate list based on the Merge mode, and the Merge candidate list is used for performing pre (or backward) prediction in the bidirectional prediction.
  • Step S402 The encoding end selects a candidate motion vector in the Merge candidate list.
  • If the selected candidate motion vector is obtained by post (pre) prediction, the candidate motion vector for the pre (post) prediction may be obtained by mapping, and the mapped value is applied as the selected candidate motion vector value to the subsequent step S403.
  • For example, the encoding end selects the candidate motion vector (-2, -2) in the Merge candidate list, and the (-2, -2) is obtained based on the backward prediction; the image sequence number of the current block in the image frame sequence is 4, the backward predicted reference block corresponds to the image sequence number 6, and the forward predicted reference block corresponds to the image sequence number 3. Then, according to the ratio of the distance between the current block and the forward predicted reference block (3 - 4 = -1) to the distance between the current block and the backward predicted reference block (6 - 4 = 2), the backward candidate motion vector (-2, -2) can be mapped to the forward predicted candidate motion vector (1, 1), that is, (-2, -2) × (-1/2) = (1, 1). The (1, 1) is applied as the selected candidate motion vector value to the subsequent step S403.
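  • The mapping used in this example is a scaling of the motion vector by the ratio of POC (image sequence number) distances; a small sketch under that assumption follows, with the rounding behavior chosen only for illustration.

      def scale_mv(mv, poc_cur, poc_src_ref, poc_dst_ref):
          # Scale an MV given against reference picture poc_src_ref so that it points toward
          # reference picture poc_dst_ref, using the ratio of POC distances.
          # Example from the text: scale_mv((-2, -2), 4, 6, 3) == (1, 1), i.e. (-2, -2) * (-1/2).
          ratio = (poc_dst_ref - poc_cur) / (poc_src_ref - poc_cur)
          return (round(mv[0] * ratio), round(mv[1] * ratio))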
  • Alternatively, the selection of the candidate motion vector may be interrupted, that is, another candidate motion vector is re-selected from the Merge candidate list.
  • If the selected candidate motion vector is obtained by bidirectional prediction, the value of the candidate motion vector for the pre (or post) prediction portion is selected and applied as the selected candidate motion vector value.
  • For example, the candidate motion vector pre-selected by the encoding end in the Merge candidate list is obtained by bidirectional prediction and includes a motion vector (1, 1) for the pre (post) prediction and a motion vector for the post (pre) prediction.
  • the encoding end finally applies the pre- (post)-predicted motion vector (1, 1) as the selected candidate motion vector value to the subsequent step S403.
  • If the selected candidate motion vector is obtained by pre (post) prediction, the candidate motion vector is directly used as the selected candidate motion vector value and applied to the subsequent step S403.
  • Step S403 The encoding end updates the candidate motion vector by using template matching.
  • The specific process includes: the encoding end selects a candidate motion vector of the Merge candidate list as an input, where the candidate motion vector corresponds to a reference block in the reference image; a search range is determined in combination with the candidate motion vector, and the search range includes at least one reference motion vector; at least one image block of the pre (post) prediction of the current block is obtained in the reference image according to the reference motion vectors within the search range; a pixel difference value or rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of each image block is calculated respectively, and a pixel difference value or rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of the reference block corresponding to the candidate motion vector is also calculated.
  • Among these values, the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is used as the candidate motion vector of the current block, thereby implementing the update of the candidate motion vector.
  • Step S404: The encoding end may replace the corresponding candidate motion vector in the Merge candidate list with the motion vector corresponding to the minimum pixel difference value or rate distortion cost value, thereby implementing the update of the Merge candidate list. Based on the updated Merge candidate list, a predicted motion vector (also referred to as a first predicted motion vector) is obtained, and the pre (post) prediction block of the current block (also referred to as the first prediction block) is obtained in combination with the first predicted motion vector.
  • The AMVP mode directly uses the constructed candidate list in the post (or pre, the same below) prediction to obtain the second prediction block, as follows:
  • Step S405 The encoding end constructs an AMVP candidate list based on the AMVP mode.
  • the AMVP candidate list is used for post- (or forward) prediction in bi-directional prediction. For details, refer to the description of the embodiment of FIG. 6 , and details are not described herein again.
  • Step S406 The encoding end obtains a post (previous) prediction block (also referred to as a second prediction block) of the current block by using a conventional AMVP method based on the AMVP candidate list.
  • The specific process includes: the encoding end directly selects an optimal MVP from the AMVP candidate list, determines the starting point of the search in the reference image according to the optimal MVP, then searches in a specific manner within a specific range near the search starting point, calculates the rate distortion cost values, and finally obtains an optimal MV. The optimal MV determines the position of the second prediction block; the motion vector difference MVD is obtained as the difference between the optimal MV and the optimal MVP, the index value of the optimal MVP in the AMVP candidate list is encoded, and the index of the reference image is encoded.
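  • The relation between the optimal MV, the optimal MVP, and the MVD written into the code stream can be sketched as below; this is only an illustration of the arithmetic, not of the actual bitstream syntax.

      def compute_mvd(optimal_mv, optimal_mvp):
          # Encoder side: the MVD is the component-wise difference optimal MV - optimal MVP.
          return (optimal_mv[0] - optimal_mvp[0], optimal_mv[1] - optimal_mvp[1])

      def recover_mv(optimal_mvp, mvd):
          # Decoder side: the actual MV is recovered as optimal MVP + MVD.
          return (optimal_mvp[0] + mvd[0], optimal_mvp[1] + mvd[1])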
  • Finally, in step S407, the encoding end obtains the reconstructed block and transmits the code stream to the decoding end. Specifically, after obtaining the pre (post) direction first prediction block and the post (pre) direction second prediction block, the encoding end processes the first prediction block and the second prediction block according to a preset algorithm, thereby obtaining the reconstructed block of the current block; for example, the first prediction block and the second prediction block are averaged by a weighted halving calculation, thereby obtaining the reconstructed block of the current block.
  • the encoding end transmits a code stream to the decoding end, and the code stream correspondingly includes an index value of the Merge candidate list, an index value of the AMVP candidate list, an MVD, an index of the reference image, and the like.
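  • One simple instance of the preset algorithm mentioned in step S407 is the weighted halving (averaging) of the two prediction blocks; the sketch below assumes equal weights of 1/2 and flat sample lists, both of which are illustrative choices rather than requirements of this embodiment.

      def combine_predictions(first_pred, second_pred, w0=0.5, w1=0.5):
          # Combine the pre (post) direction first prediction block and the post (pre)
          # direction second prediction block sample by sample; with w0 = w1 = 0.5 this is
          # the weighted halving mentioned in the text.
          return [int(round(w0 * a + w1 * b)) for a, b in zip(first_pred, second_pred)]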
  • In other embodiments, S401-S404 may be replaced by the AMVP mode, in which the candidate list is updated in the template matching manner in the pre (post) prediction and the first prediction block is obtained; and S405-S406 may be replaced with the Merge mode, in which the second prediction block is obtained by directly using the constructed candidate list in the post (pre) prediction.
  • FIG. 18 is a prediction motion vector generation method based on a hybrid prediction mode, which is applied to bidirectional prediction and described from the perspective of a decoding end; the decoding end may correspond to the encoding end in FIG. 17.
  • First, the decoding end parses the code stream to obtain an index value of the candidate list. It is then judged, based on whether the index value is a specific value, whether the following steps S501-S504 or S505-S506 are performed.
  • If the index value is a specific value, the index value is used to indicate that the pre (post) prediction adopts the Merge mode, and the decoding end further performs steps S501-S504; if the index value is not a specific value, the index value is used to indicate that the AMVP mode is adopted, and the decoding end further decodes the index of the reference image of the corresponding prediction direction and the motion vector difference information (MVD), and then performs steps S505-S506.
  • For example, if the index value is 0 or 1, the index value is used to indicate that the backward prediction adopts the AMVP mode (specifically, to indicate the first or the second MVP of the candidate list constructed by the AMVP mode), the decoder also decodes the index of the backward predicted reference image and the MVD, and then the subsequent steps are performed; if the index value is a specific value of 2, the index value is used to indicate that the forward prediction adopts the Merge mode (specifically, to indicate a specific candidate MV in the candidate list constructed by the Merge mode, such as the first candidate MV), and the subsequent steps are performed.
  • steps S501-S504 are described.
  • the Merge mode uses the template matching manner to update the candidate motion vector indicated by the index value in the candidate list in the pre- (or later, the same below) prediction, thereby obtaining the first prediction block, as follows:
  • Step S501 In the case that the decoded index value indicates that the Merge mode is used before (or after) the prediction, the decoding end determines that the pre- (or post-) prediction adopts the Merge mode, thereby constructing the Merge candidate list.
  • The specific construction of the list can be consistent with step S401 of the embodiment of FIG. 17.
  • The decoding end may also determine the prediction direction corresponding to the Merge mode or the AMVP mode by decoding the candidate motion vector value. For example, the decoding end selects the candidate motion vector corresponding to the Merge candidate list index value of 3, finds that the selected candidate motion vector is a forward predicted candidate motion vector, and determines that the forward prediction adopts the Merge mode. For another example, the decoding end selects the candidate motion vector corresponding to the AMVP candidate list index value of 0, finds that the selected candidate motion vector is a backward predicted candidate motion vector, and determines that the backward prediction adopts the AMVP mode.
  • Step S502 The decoding end selects a candidate motion vector corresponding to the index value in the Merge candidate list based on the index value obtained by the decoding. For example, if the index value is 2, and the 2 is used to indicate the first candidate MV in the Merge candidate list, then the first candidate MV in the Merge candidate list is selected.
  • Step S503 The decoding end updates the candidate motion vector by using template matching. For details, refer to the related description of step S403 in the embodiment of FIG. 17 , and details are not described herein again.
  • Step S504: The decoding end implements the update of the candidate MV selected in the Merge candidate list, uses the updated candidate MV as the pre (post) direction predicted motion vector (also referred to as a first predicted motion vector), and obtains the pre (post) direction prediction block of the current block (also referred to as a first prediction block) in combination with the pre (post) predicted motion vector.
  • Steps S505-S506 are described below:
  • Step S505: In the case that the decoded index value indicates that the post (or pre) prediction adopts the AMVP mode, the decoding end determines that the post (or pre) prediction adopts the AMVP mode, continues to decode to obtain the index of the reference image, the MVD, and the like, and then constructs an AMVP candidate list.
  • The specific construction of the list can be consistent with step S405 of the embodiment of FIG. 17.
  • Step S506: The decoding end obtains the post (pre) prediction block of the current block (also referred to as a second prediction block) by using a conventional AMVP method based on the AMVP candidate list, the index of the decoded reference image, and the MVD information.
  • Finally, in step S507, after the decoding end obtains the pre (post) direction first prediction block and the post (pre) direction second prediction block, the first prediction block and the second prediction block are processed based on a preset algorithm to obtain the reconstructed block of the current block.
  • the first prediction block and the second prediction block are subjected to weighted halving calculation, thereby obtaining a reconstructed block of the current block.
  • In other embodiments, S501-S504 may be replaced by the AMVP mode, in which the candidate list is updated by using the template matching manner in the pre (post) prediction to obtain the first prediction block (for example, the index value indicates that the AMVP mode is adopted in the forward prediction); and S505-S506 may be replaced with the Merge mode, in which the second prediction block is obtained by directly using the constructed candidate list in the post (pre) prediction (for example, the index value indicates that the backward prediction adopts the Merge mode). For the specific implementation process, reference may be made to the foregoing description, and details are not described herein again.
  • FIG. 19 is a method for generating a predicted motion vector of a hybrid prediction mode according to an embodiment of the present invention, which is described from the perspective of an encoding end.
  • The differences between this method embodiment and the embodiment of FIG. 17 include that the forward prediction and the backward prediction use different prediction modes (Merge mode and AMVP mode), and that both the forward prediction and the backward prediction use the template matching manner to update the candidate motion vector/update the candidate list.
  • steps S601-S604 are described.
  • The Merge mode uses the template matching manner to update the candidate list in the pre (or post, the same below) prediction to obtain the first prediction block, as follows:
  • Step S601 The pre- (or post-) prediction of the encoding end adopts the Merge mode, and the encoding end constructs a Merge candidate list based on the Merge mode, and the Merge candidate list is used for performing pre- (or backward) prediction in the bidirectional prediction.
  • Step S602 The encoding end selects a candidate motion vector in the Merge candidate list.
  • For a specific implementation process, refer to the related description in step S402 of FIG. 17, and details are not described herein again.
  • Step S603 The encoding end updates the candidate motion vector by means of template matching.
  • Step S604 The encoding end obtains a pre (post) prediction block (also referred to as a first prediction block) based on the updated Merge candidate list before (behind) the current block.
  • Steps S605-S608 are described below.
  • the AMVP mode uses the template matching method to update the candidate list and obtain the second prediction block in the backward (or before, the same below) prediction, as follows:
  • Step S605 The encoding end constructs an AMVP candidate list based on the AMVP mode.
  • the AMVP candidate list is used for post- (or forward) prediction in bi-directional prediction.
  • Step S606 The encoding end selects a candidate motion vector in the AMVP candidate list.
  • If the selected candidate motion vector is obtained by pre (post) prediction, the candidate motion vector for the post (pre) prediction may be obtained by mapping, and the mapped value is applied as the selected candidate motion vector value to the subsequent step S607.
  • For example, the encoding end selects the forward candidate motion vector (-2, -2) in the AMVP candidate list, and the (-2, -2) is obtained based on the forward prediction; the Picture Order Count (POC) of the current block in the image frame sequence is 4, the forward predicted reference block corresponds to the image sequence number 2, and the backward predicted reference block corresponds to the image sequence number 5. Then, according to the ratio of the distance between the current block and the backward predicted reference block (5 - 4 = 1) to the distance between the current block and the forward predicted reference block (2 - 4 = -2), the forward candidate motion vector (-2, -2) can be mapped to the backward predicted candidate motion vector (1, 1), that is, (-2, -2) × (-1/2) = (1, 1). The (1, 1) is applied as the selected candidate motion vector value to the subsequent step S607.
  • Alternatively, the selection of the candidate motion vector may be interrupted, that is, another candidate motion vector is re-selected from the AMVP candidate list.
  • If the candidate motion vector pre-selected for the post (pre) prediction is obtained by bidirectional prediction, the value of the bidirectional prediction for the post (pre) prediction portion is finally selected and applied as the selected candidate motion vector value to the subsequent step S607.
  • For example, the candidate motion vector preselected by the encoding end in the AMVP candidate list for the post (pre) prediction is obtained by bidirectional prediction and includes the motion vector (1, 1) for the pre (post) prediction and the motion vector (-2, -2) for the post (pre) prediction; the encoding end finally applies the post (pre) predicted motion vector (-2, -2) as the selected candidate motion vector value to the subsequent step S607.
  • If the selected candidate motion vector is obtained by post (pre) prediction, the candidate motion vector is directly used as the selected candidate motion vector value and applied to the subsequent step S607.
  • Step S607 The encoding end updates the candidate motion vector by using template matching.
  • The specific process includes: the encoding end selects a candidate motion vector of the AMVP candidate list as an input, where the candidate motion vector corresponds to a reference block in the reference image; a search range is determined in combination with the candidate motion vector, and the search range includes at least one reference motion vector; at least one image block of the post (pre) prediction of the current block is obtained in the reference image according to the reference motion vectors within the search range; a pixel difference value or rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of each image block is calculated respectively, and a pixel difference value or rate distortion cost value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of the reference block corresponding to the candidate motion vector is also calculated.
  • Among these values, the motion vector corresponding to the minimum pixel difference value or rate distortion cost value is used as the candidate motion vector of the current block, thereby implementing the update of the candidate motion vector.
  • Step S608: The encoding end may replace the corresponding candidate motion vector in the AMVP candidate list with the motion vector corresponding to the minimum pixel difference value or rate distortion cost value, thereby implementing the update of the AMVP candidate list. Based on the updated AMVP candidate list, a predicted motion vector (also referred to as a second predicted motion vector) is obtained, and the post (pre) prediction block of the current block (also referred to as a second prediction block) is obtained in combination with the post (pre) predicted motion vector; the index value corresponding to the predicted motion vector in the AMVP candidate list is encoded, the index of the reference image corresponding to the prediction block is encoded, and the MVD is encoded. For the specific process, reference may be made to the description of the encoding end in the embodiments of FIG. 12 to FIG., and details are not described herein again.
  • Finally, in step S609, the encoding end obtains the reconstructed block and transmits the code stream to the decoding end. Specifically, after obtaining the pre (post) direction first prediction block and the post (pre) direction second prediction block, the encoding end processes the first prediction block and the second prediction block according to a preset algorithm to obtain the reconstructed block of the current block. Thereafter, the encoding end transmits a code stream to the decoding end, and the code stream correspondingly includes the index value of the Merge candidate list, the index value of the AMVP candidate list, the MVD, the index of the reference image, and the like.
  • FIG. 20 is a schematic diagram of a method for generating a predicted motion vector in a hybrid prediction mode according to an embodiment of the present invention, which is described from the perspective of a decoding end. The decoding end of this method embodiment may correspond to the encoding end of the embodiment of FIG. 19.
  • The differences between this method embodiment and the embodiment of FIG. 18 include that different prediction modes (Merge mode and AMVP mode) are used for the forward prediction and the backward prediction, and that the template matching manner is used to update the candidate motion vector/update the candidate list in both the forward prediction and the backward prediction.
  • steps S701-S704 are described.
  • the Merge mode uses the template matching manner to update the candidate motion vector indicated by the index value in the candidate list in the pre- (or later, the same below) prediction, thereby obtaining the first prediction block, as follows:
  • Step S701 The decoded index value indicates that the pre- (or post) prediction adopts the Merge mode, and the decoding end determines that the pre- (or post-) prediction adopts the Merge mode, thereby constructing the Merge candidate list.
  • Step S702 The decoding end selects a candidate motion vector corresponding to the index value in the Merge candidate list based on the decoding information (identification bit and/or index value, specifically referring to the foregoing description).
  • If the candidate motion vector selected based on the decoded index value is obtained by post (or pre) prediction, the candidate motion vector for the pre (post) prediction may be obtained by mapping, and the mapped value is applied as the selected candidate motion vector value to the subsequent step S703.
  • For example, the decoding end selects the candidate motion vector (-2, -2) in the Merge candidate list, and the (-2, -2) is obtained based on the backward prediction; the image sequence number of the current block in the image frame sequence is 4, the backward predicted reference block corresponds to the image sequence number 6, and the forward predicted reference block corresponds to the image sequence number 3. Then, according to the ratio of the distance between the current block and the forward predicted reference block (3 - 4 = -1) to the distance between the current block and the backward predicted reference block (6 - 4 = 2), the backward candidate motion vector (-2, -2) can be mapped to the forward predicted candidate motion vector (1, 1), that is, (-2, -2) × (-1/2) = (1, 1). The (1, 1) is applied as the selected candidate motion vector value to the subsequent step S703.
  • Alternatively, the selection of the candidate motion vector may be interrupted, that is, another candidate motion vector is re-selected from the Merge candidate list.
  • If the candidate motion vector selected based on the decoded index value is obtained by bidirectional prediction, the value of the candidate motion vector for the pre (or post) prediction portion is selected and applied as the selected candidate motion vector value to the subsequent step S703.
  • For example, the candidate motion vector pre-selected by the decoding end in the Merge candidate list is obtained by bidirectional prediction and includes the motion vector (1, 1) for the pre (post) prediction and a motion vector for the post (pre) prediction; the decoding end finally applies the pre (post) predicted motion vector (1, 1) as the selected candidate motion vector value to the subsequent step S703.
  • If the candidate motion vector selected based on the decoded index value is obtained by pre (post) prediction, the candidate motion vector is directly used as the selected candidate motion vector value and applied to the subsequent step S703.
  • Step S703 The decoding end updates the candidate motion vector by means of template matching.
  • Step S704: The decoding end implements the update of the candidate MV selected in the Merge candidate list, uses the updated candidate MV as the pre (post) direction predicted motion vector (also referred to as a first predicted motion vector), and obtains the pre (post) direction prediction block of the current block (also referred to as a first prediction block) in combination with the pre (post) predicted motion vector.
  • Steps S705-S708 are described below.
  • the AMVP mode uses the template matching manner to update the candidate list and obtain the second prediction block in the backward (or before, the same below) prediction, as follows:
  • Step S705 The decoding end determines the AMVP mode based on the decoding information (identification bit and/or index value, specifically referring to the foregoing description), and constructs an AMVP candidate list based on the AMVP mode.
  • the AMVP candidate list is used for post- (or forward) prediction in bi-directional prediction.
  • Step S706 The decoding end selects the candidate motion vector in the AMVP candidate list based on the decoding information (identification bit and/or index value, specifically referring to the foregoing description).
  • If the candidate motion vector selected based on the decoded index value is obtained by pre (or post) prediction, the candidate motion vector for the post (pre) prediction may be obtained by mapping, and the mapped value is applied as the selected candidate motion vector value to the subsequent step S707.
  • For example, the decoding end selects the forward candidate motion vector (-2, -2) in the AMVP candidate list, and the (-2, -2) is obtained based on the forward prediction; the image sequence number of the current block in the image frame sequence is 4, the forward predicted reference block corresponds to the image sequence number 2, and the backward predicted reference block corresponds to the image sequence number 5. Then, according to the ratio of the distance between the current block and the backward predicted reference block (5 - 4 = 1) to the distance between the current block and the forward predicted reference block (2 - 4 = -2), the forward candidate motion vector (-2, -2) can be mapped to the backward predicted candidate motion vector (1, 1), that is, (-2, -2) × (-1/2) = (1, 1). The (1, 1) is applied as the selected candidate motion vector value to the subsequent step S707.
  • Alternatively, the selection of the candidate motion vector may be interrupted, that is, another candidate motion vector is re-selected from the AMVP candidate list.
  • If the candidate motion vector selected for the post (pre) prediction is obtained by bidirectional prediction, the value of the bidirectional prediction for the post (pre) prediction portion is finally selected and applied as the selected candidate motion vector value to the subsequent step S707.
  • For example, the candidate motion vector preselected by the decoding end in the AMVP candidate list for the post (pre) prediction is obtained by bidirectional prediction and includes the motion vector (1, 1) for the pre (post) prediction and the motion vector (-2, -2) for the post (pre) prediction; the decoding end finally applies the post (pre) predicted motion vector (-2, -2) as the selected candidate motion vector value to the subsequent step S707.
  • If the candidate motion vector selected based on the decoded index value is obtained by post (pre) prediction, the candidate motion vector is directly used as the selected candidate motion vector value and applied to the subsequent step S707.
  • Step S707 The decoding end updates the candidate motion vector by means of template matching.
  • Step S708 The decoding end obtains a post (previous) prediction block (also referred to as a second prediction block) of the current block based on the AMVP candidate list, the index of the decoded reference image, and the MVD and the like.
  • Finally, in step S709, after the decoding end obtains the pre (post) direction first prediction block and the post (pre) direction second prediction block, the first prediction block and the second prediction block are processed based on a preset algorithm to obtain the reconstructed block of the current block.
  • FIG. 21 is yet another method for generating a motion vector predictor according to an embodiment of the present invention.
  • The method is described from the perspective of an encoding end.
  • the method may be used in an encoding end in bidirectional prediction, and the method includes but is not limited to the following steps:
  • Step S801 The encoding end constructs a Merge candidate list or an AMVP candidate list, and may refer to the related description of the embodiment of FIG. 6 or FIG. 7 respectively, and details are not described herein again.
  • After the Merge candidate list or the AMVP candidate list is constructed, the candidate list includes a forward candidate motion vector for forward prediction and a backward candidate motion vector for backward prediction. The subsequent steps S802-S804 are performed for the pre (post) candidate motion vector, and steps S805-S807 are performed for the post (pre) candidate motion vector, which are described separately below.
  • steps S802-S804 are described:
  • Step S802: The encoding end selects the pre (or post, the same below) candidate motion vector in the constructed candidate list.
  • Step S803 The encoding end updates the front (back) candidate motion vector by means of template matching.
  • Step S804 The encoding end obtains a front (back) prediction block (also referred to as a first prediction block) based on the front (rear) candidate motion vector in the updated candidate list before (behind) the current block.
  • Steps S805-S807 are described below:
  • Step S805 The encoding end determines the post- (or before, the same below) candidate motion vector in the constructed candidate list.
  • Step S806: The encoding end updates the post (pre) candidate motion vector by using the pre (post) candidate motion vectors before and after the update.
  • Specifically, the encoding end calculates a difference between the replaced pre (post) candidate motion vector (i.e., the new pre (post) candidate motion vector obtained in step S803) and the pre (post) candidate motion vector before the replacement (i.e., the candidate motion vector selected in step S802); combines the difference with the post (pre) candidate motion vector selected in step S805 to obtain a new post (pre) candidate motion vector; and replaces the post (pre) candidate motion vector determined in step S805 with the new post (pre) candidate motion vector, so that the candidate list is further updated.
  • Step S807: The encoding end obtains the post (pre) direction prediction block of the current block (also referred to as a second prediction block) based on the post (pre) candidate motion vector in the updated candidate list.
  • Step S808: After obtaining the pre (post) direction first prediction block and the post (pre) direction second prediction block, the encoding end processes the first prediction block and the second prediction block according to a preset algorithm, thereby obtaining the reconstructed block of the current block, and then transmits the code stream to the decoding end.
  • If the Merge mode encoding is currently used, the code stream includes the index value of the Merge candidate list; if the AMVP mode encoding is currently used, the code stream includes the index value of the AMVP candidate list, the index of the reference image, and the MVD information.
  • FIG. 22 is still another method for generating a motion vector predictor according to an embodiment of the present invention, which is described from a decoding end.
  • the method may be used in a decoding end in bidirectional prediction.
  • The decoding end in this method may correspond to the encoding end in FIG. 21.
  • Step S901 The decoding end parses the code stream, and constructs a Merge candidate list or an AMVP candidate list according to the index value of the candidate list in the code stream. Reference may be made to the related description of the embodiment of FIG. 6 or FIG. 7 , and details are not described herein again.
  • After the Merge candidate list or the AMVP candidate list is constructed, the candidate list includes a forward candidate motion vector for forward prediction and a backward candidate motion vector for backward prediction. The subsequent steps S902-S904 are performed for the pre (post) candidate motion vector, and steps S905-S907 are performed for the post (pre) candidate motion vector, which are described separately below.
  • Step S902: The decoding end selects the pre (or post, the same below) candidate motion vector in the constructed candidate list in combination with the decoded index value.
  • Step S903 The decoding end updates the front (back) candidate motion vector by means of template matching.
  • Step S904: The decoding end, in combination with the decoding information, obtains the pre (post) direction prediction block of the current block (also referred to as a first prediction block) based on the pre (post) candidate motion vector in the updated candidate list.
  • Step S905: The decoding end determines the post (or pre, the same below) candidate motion vector in the constructed candidate list.
  • Step S906: The decoding end updates the post (pre) candidate motion vector by using the pre (post) candidate motion vectors before and after the update.
  • Specifically, the decoding end calculates a difference between the replaced pre (post) candidate motion vector in the candidate list (i.e., the new pre (post) candidate motion vector obtained in step S903) and the pre (post) candidate motion vector before the replacement (i.e., the candidate motion vector selected in step S902); combines the difference with the post (pre) candidate motion vector selected in step S905 to obtain a new post (pre) candidate motion vector; and replaces the post (pre) candidate motion vector determined in step S905 with the new post (pre) candidate motion vector, so that the candidate list is further updated.
  • In one implementation, the decoding end may calculate the difference between the replaced pre (post) candidate motion vector in the candidate list and the pre (post) candidate motion vector before the replacement; calculate the sum of the difference and the post (pre) candidate motion vector in the candidate motion vector set to obtain a new post (pre) candidate motion vector; and replace the original post (pre) candidate motion vector in the candidate motion vector set with the new post (pre) candidate motion vector.
  • For example, the index value of the Merge candidate list obtained by decoding is 0, and the decoding end constructs a Merge candidate list. After the construction is completed, the current block selects, in the Merge mode, the pre (post) candidate motion vector corresponding to the Merge candidate list index value 0 for performing the pre (post) prediction in the bidirectional prediction, and the pre (post) candidate motion vector is (3, 5); a reference image block (i.e., reference block) in the reference image of the pre (post) prediction is obtained according to the selected (3, 5). Centering on the pre (post) prediction reference image block and searching at integer-pixel precision within the range of the surrounding 1-pixel region, a number of new reference image blocks (i.e., image blocks) of the same size as the reference block are obtained. The image block with the smallest template matching rate distortion cost value is taken as the updated reference image block, and the motion vector corresponding to the smallest rate distortion cost value is (4, 5). The decoding end uses the reference image block with the smallest rate distortion cost value as the pre (post) direction prediction block of the current block, and sets the pre (post) direction predicted motion vector of the current block to (4, 5). If the post (pre) candidate motion vector corresponding to the index value 0 is (1, 4), the difference (1, 0) is obtained by combining the candidate motion vector (3, 5) before the replacement and the predicted motion vector (4, 5) after the replacement; the post (pre) candidate motion vector (1, 4) is combined with the difference (1, 0), and the predicted motion vector corresponding to the updated post (pre) prediction reference block is (2, 4).
  • For another example, the index value of the Merge candidate list obtained by decoding is 0, and the decoding end constructs a Merge candidate list. After the construction is completed, the current block selects, in the Merge mode, the pre (post) candidate motion vector corresponding to the Merge candidate list index value 0 for performing the pre (post) prediction in the bidirectional prediction; the image sequence number of the pre (post) reference image is 3, the image sequence number of the current image is 4, and the image sequence number of the post (pre) reference image is 5. The pre (post) candidate motion vector is (3, 5), and the reference image block (i.e., reference block) in the reference image of the pre (post) prediction is obtained according to the selected (3, 5). Centering on the pre (post) prediction reference image block and searching at integer-pixel precision within the range of the surrounding 1-pixel region, a number of new reference image blocks (i.e., image blocks) of the same size as the reference block are obtained. The image block with the smallest template matching rate distortion cost value is taken as the updated reference image block, and the motion vector corresponding to the smallest rate distortion cost value is (4, 5). The decoding end uses the reference image block with the smallest rate distortion cost value as the pre (post) direction prediction block of the current block, and sets the pre (post) direction predicted motion vector of the current block to (4, 5). If the post (pre) candidate motion vector corresponding to the index value 0 is (1, 4), the difference (1, 0) between the candidate motion vector (3, 5) before the replacement and the predicted motion vector (4, 5) after the replacement is mapped by the proportional calculation to obtain the difference (-1, 0); the post (pre) candidate motion vector (1, 4) is combined with the difference (-1, 0), and the predicted motion vector corresponding to the updated post (pre) prediction reference block is (0, 4).
  • In another implementation, the decoding end may calculate a difference between the replaced pre (post) candidate motion vector in the candidate list and the pre (post) candidate motion vector before the replacement; the difference is first multiplied by a preset coefficient and then summed with the post (pre) candidate motion vector in the candidate motion vector set to obtain a new post (pre) candidate motion vector, and the new post (pre) candidate motion vector replaces the original post (pre) candidate motion vector in the candidate motion vector set.
  • In another implementation, the decoding end may calculate a pre (post) difference between the replaced pre (post) candidate motion vector in the candidate list and the pre (post) candidate motion vector before the replacement; the pre (post) difference is mapped to a post (pre) difference, and the post (pre) difference is summed with the post (pre) candidate motion vector in the candidate motion vector set to obtain a new post (pre) candidate motion vector, which replaces the original post (pre) candidate motion vector in the candidate motion vector set.
  • for example, the difference between the replaced front (back) candidate motion vector in the candidate list and the front (back) candidate motion vector before replacement may be calculated as (2, 4); the image sequence number of the image frame containing the current block is 4, and the image sequence number of the image frame containing the forward prediction reference block is 2; the front (back) difference (2, 4) is mapped to the back (front) difference (-1, -2), which is then summed with the back (front) candidate motion vector to obtain a new back (front) candidate motion vector, and the new back (front) candidate motion vector replaces the original back (front) candidate motion vector in the candidate motion vector set.
  • the decoding end may calculate the front (back) difference between the replaced front (back) candidate motion vector and the pre-replacement front (back) candidate motion vector in the candidate list; the front (back) difference is mapped to a back (front) difference, multiplied by a preset coefficient, and then summed with the back (front) candidate motion vector in the candidate motion vector set to obtain a new back (front) candidate motion vector, which replaces the original back (front) candidate motion vector in the candidate motion vector set.
  • if the proportional relationship between the obtained first time difference and the second time difference is greater than 0, the preset coefficient is defined as 1 (or another preset positive value); if the proportional relationship is less than 0, the preset coefficient is defined as -1 (or another preset negative value).
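The sign-only variant just described can be sketched as follows (again a hedged illustration rather than the normative procedure): only a preset coefficient, chosen from the sign of the ratio of the two time differences, multiplies the difference before it is added to the other direction's candidate.

```python
def preset_coefficient(poc_cur, poc_ref_first, poc_ref_second,
                       pos_value=1, neg_value=-1):
    """Pick the preset coefficient from the sign of the ratio between the
    first and second time (POC) differences."""
    d1 = poc_ref_first - poc_cur
    d2 = poc_ref_second - poc_cur
    return pos_value if d1 * d2 > 0 else neg_value

def update_with_preset_coefficient(other_cand_mv, mv_before, mv_after,
                                   poc_cur, poc_ref_first, poc_ref_second):
    c = preset_coefficient(poc_cur, poc_ref_first, poc_ref_second)
    diff = (mv_after[0] - mv_before[0], mv_after[1] - mv_before[1])
    return (other_cand_mv[0] + c * diff[0], other_cand_mv[1] + c * diff[1])

# With forward reference POC 3, current POC 4 and backward reference POC 5
# the ratio is negative, so the coefficient is -1 and (1, 4) becomes (0, 4).
assert update_with_preset_coefficient((1, 4), (3, 5), (4, 5), 4, 3, 5) == (0, 4)
```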
  • Step S907: Combining the decoding information, the decoding end obtains a back (front) prediction block (also referred to as a second prediction block) of the current block based on the back (front) candidate motion vector in the updated candidate list.
  • Step S908: After the decoding end obtains the front (back) first prediction block and the back (front) second prediction block, the first prediction block and the second prediction block are processed according to a preset algorithm, thereby obtaining the reconstructed block of the current block.
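The "preset algorithm" is not spelled out in this passage; a common choice, shown here purely as an assumed example, is a (weighted) average of the two prediction blocks:

```python
import numpy as np

def combine_bi_prediction(first_pred_block, second_pred_block,
                          w_first=0.5, w_second=0.5, bit_depth=8):
    """Weighted average of the first (front/back) and second (back/front)
    prediction blocks, clipped to the valid sample range."""
    combined = (w_first * first_pred_block.astype(np.float64)
                + w_second * second_pred_block.astype(np.float64))
    max_val = (1 << bit_depth) - 1
    # uint16 is wide enough for bit depths up to 16
    return np.clip(np.rint(combined), 0, max_val).astype(np.uint16)

# e.g. reconstruction_basis = combine_bi_prediction(first_pred, second_pred)
```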
  • FIG. 23 is still another method for generating a motion vector predictor according to an embodiment of the present invention, which is described from the perspective of an encoding end.
  • the method can be used by the encoding end in bidirectional prediction; the difference between this method embodiment and the foregoing encoding-end embodiment is that the forward prediction and the backward prediction adopt different prediction modes (i.e., a hybrid coding mode): when the front (or back) direction adopts the Merge mode, the back (or front) direction correspondingly adopts the AMVP mode.
  • the specific description is as follows:
  • Steps S1001-S1004 are described below:
  • Step S1001: The front (or back, the same below) prediction at the encoding end adopts the Merge mode, and the Merge candidate list is constructed based on the Merge mode.
  • Step S1002: The encoding end selects a front (back) candidate motion vector from the constructed Merge candidate list.
  • Step S1003: The encoding end updates the front (back) candidate motion vector by means of template matching.
  • Step S1004: Based on the front (back) candidate motion vector in the updated Merge candidate list, the encoding end obtains a front (back) prediction block (also referred to as a first prediction block) of the current block, and encodes a candidate list index corresponding to the first prediction block.
  • Steps S1005-S1007 are described below:
  • Step S1005: The back (or front, the same below) prediction at the encoding end adopts the AMVP mode, and the AMVP candidate list is constructed based on the AMVP mode.
  • Step S1006: The encoding end selects a back (front) candidate motion vector from the constructed AMVP candidate list.
  • Step S1007: The encoding end updates the back (front) candidate motion vector by using the front (back) candidate motion vectors before and after the update.
  • specifically, the encoding end calculates the difference between the replaced front (back) candidate motion vector in the Merge candidate list (i.e., the new front (back) candidate motion vector obtained in step S1003) and the pre-replacement front (back) candidate motion vector (i.e., the candidate motion vector selected in step S1002); combining the difference with the back (front) candidate motion vector selected in step S1006, a new back (front) candidate motion vector is obtained; then the back (front) candidate motion vector selected in step S1006 in the AMVP candidate list is replaced with the new back (front) candidate motion vector, so that the AMVP candidate list is updated.
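A compact sketch of this step (S1007), with a simple unmapped sum assumed; a POC-based mapping or preset coefficient, as described for the earlier embodiments, could be applied to the difference instead:

```python
def update_amvp_candidate(amvp_list, amvp_index, merge_mv_before, merge_mv_after):
    """Replace the selected AMVP candidate with a new candidate obtained by
    adding the refinement difference found on the Merge side."""
    diff = (merge_mv_after[0] - merge_mv_before[0],
            merge_mv_after[1] - merge_mv_before[1])
    old = amvp_list[amvp_index]
    amvp_list[amvp_index] = (old[0] + diff[0], old[1] + diff[1])
    return amvp_list

# e.g. update_amvp_candidate([(1, 4), (2, 2)], 0, (3, 5), (4, 5))
#      -> [(2, 4), (2, 2)]
```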
  • Step S1008: Based on the back (front) candidate motion vector in the updated AMVP candidate list, the encoding end obtains a back (front) prediction block (also referred to as a second prediction block) of the current block, and encodes a candidate list index corresponding to the second prediction block, an index of the reference image corresponding to the second prediction block, and MVD information.
  • Step S1009: After obtaining the front (back) first prediction block and the back (front) second prediction block, the encoding end processes the first prediction block and the second prediction block according to a preset algorithm, thereby obtaining a reconstructed block of the current block, and then transmits the code stream to the decoding end.
  • the code stream includes an index value of the Merge candidate list, an index value of the AMVP candidate list, an index value of the reference image, and the MVD information.
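The signalled information for this hybrid mode can be pictured as the following set of per-block syntax elements (the field names are illustrative assumptions, not the patent's actual syntax):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class HybridModeSignalling:
    """Per-block information written to the code stream when one direction is
    coded in Merge mode and the other in AMVP mode (FIG. 23 / FIG. 24)."""
    merge_candidate_index: int   # index into the Merge candidate list
    amvp_candidate_index: int    # index into the AMVP candidate list
    reference_index: int         # reference image index for the AMVP direction
    mvd: Tuple[int, int]         # motion vector difference for the AMVP direction

# The decoder of FIG. 24 parses these fields back from the code stream, e.g.
# HybridModeSignalling(merge_candidate_index=0, amvp_candidate_index=1,
#                      reference_index=0, mvd=(2, -1))
```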
  • FIG. 24 is still another method for generating a motion vector predictor according to an embodiment of the present invention, which is described from a decoding end.
  • the method may be used at the decoding end in bidirectional prediction, and the decoding end in this method may correspond to the encoding end in the embodiment of FIG. 23.
  • Step S1101: The decoding end parses the code stream, determines, according to the index value of the Merge candidate list in the code stream, that the front (or back, the same below) prediction adopts the Merge mode, and constructs the Merge candidate list based on the Merge mode.
  • Step S1102: The decoding end selects the front (back) candidate motion vector from the constructed Merge candidate list in combination with the decoded index value.
  • Step S1103 The decoding end updates the front (back) candidate motion vector by means of template matching.
  • Step S1104: Combining the decoding information, the decoding end obtains a front (back) prediction block (also referred to as a first prediction block) of the current block based on the front (back) candidate motion vector in the updated candidate list.
  • Steps S1105-S1107 are described below:
  • Step S1105: The decoding end parses the code stream, determines, according to the index value of the AMVP candidate list in the code stream, that the back (or front, the same below) prediction adopts the AMVP mode, and constructs the AMVP candidate list based on the AMVP mode.
  • Step S1106: The decoding end selects the back (front) candidate motion vector from the constructed AMVP candidate list in combination with the decoded index value.
  • Step S1107: The decoding end updates the back (front) candidate motion vector by using the front (back) candidate motion vectors before and after the update.
  • specifically, the decoding end calculates the difference between the replaced front (back) candidate motion vector in the Merge candidate list (i.e., the new front (back) candidate motion vector obtained in step S1103) and the pre-replacement front (back) candidate motion vector (i.e., the candidate motion vector selected in step S1102); combining the difference with the back (front) candidate motion vector selected in step S1106, a new back (front) candidate motion vector is obtained; then the back (front) candidate motion vector selected in step S1106 is replaced with the new back (front) candidate motion vector, so that the AMVP candidate list is updated.
  • Step S1108: Combining the decoding information, the decoding end obtains a back (front) prediction block (also referred to as a second prediction block) of the current block based on the back (front) candidate motion vector in the updated AMVP candidate list.
  • Step S1109: After the decoding end obtains the front (back) first prediction block and the back (front) second prediction block, the first prediction block and the second prediction block are processed according to a preset algorithm, thereby obtaining the reconstructed block of the current block.
  • by using template matching, the video codec system can verify whether an image block within a certain range of the reference image of the current block (or even within the entire reference image) matches the current block well.
  • the candidate motion vectors of the candidate list constructed based on the Merge or AMVP mode are updated, and the updated candidate list helps ensure that the best reference block of the current block is obtained during the encoding and decoding process.
  • the embodiment of the present invention further provides a hybrid prediction mode; based on the hybrid prediction mode, the optimal reference block of the current block can also be obtained in the coding and decoding process, and coding and decoding efficiency is improved while ensuring that the best reference block of the current block is obtained.
  • an embodiment of the present invention provides an apparatus 1200 for generating a predicted motion vector, where the apparatus 1200 may be applied to an encoding side or may be applied to a decoding side.
  • the device 1200 includes a processor 1201 and a memory 1202.
  • the processor 1201 and the memory 1202 are connected to each other (e.g., connected to each other through a bus 1204).
  • the device 1200 may further include a transceiver 1203; the transceiver 1203 is connected to the processor 1201 and the memory 1202, and is configured to receive/transmit data.
  • the memory 1202 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM), and is configured to store related program code and video data.
  • the processor 1201 may be one or more central processing units (CPUs). In the case where the processor 1201 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.
  • the processor 1201 is configured to read the program code stored in the memory 1202 and perform the operations of the foregoing method embodiments, where the prediction mode is a Merge mode or an AMVP mode.
  • the processor 1201 may be used to perform the related methods described in the foregoing embodiments of FIG. 10 to FIG. 24, and details are not described herein for brevity of the description.
  • an embodiment of the present invention provides another apparatus 1300 for generating a predicted motion vector.
  • the apparatus 1300 may be applied to an encoding side or may be applied to a decoding side.
  • the device 1300 includes a processor 1301 and a memory 1302.
  • the processor 1301 and the memory 1302 are connected to each other (e.g., connected to each other through a bus 1304).
  • the device 1300 may further include a transceiver 1303; the transceiver 1303 is connected to the processor 1301 and the memory 1302, and is configured to receive/send data.
  • the memory 1302 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM), and is configured to store related program code and video data.
  • the processor 1301 may be one or more central processing units (CPUs). In the case where the processor 1301 is a CPU, the CPU may be a single core CPU or a multi-core CPU.
  • the processor 1301 is configured to read the program code stored in the memory 1302 to perform bidirectional prediction of a to-be-processed block, where the bidirectional prediction includes a first direction prediction and a second direction prediction, the first direction prediction being a prediction based on a first reference frame list and the second direction prediction being a prediction based on a second reference frame list.
  • the processor 1301 specifically performs the following operations:
  • acquiring a first prediction mode and generating a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction prediction motion vector in the first direction prediction;
  • acquiring a second prediction mode and generating a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction prediction motion vector in the second direction prediction; when the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode.
  • the processor 1301 may be used to perform the related methods described in the foregoing embodiments of FIG. 10 to FIG. 24, and for brevity of the description, details are not described herein again.
  • the embodiment of the present invention provides another apparatus 1400 for generating a predicted motion vector.
  • the apparatus 1400 includes:
  • a set generation module 1401, configured to construct a candidate motion vector set of the to-be-processed block
  • a template matching module 1402, configured to determine, according to a first candidate motion vector in the candidate motion vector set, at least two first reference motion vectors, where the first reference motion vector is used to determine a reference block of the to-be-processed block in a first reference image of the to-be-processed block;
  • the template matching module 1402 is further configured to respectively calculate a pixel difference value or a rate-distortion cost value between at least one first adjacent reconstructed block of each of the at least two determined reference blocks and at least one second adjacent reconstructed block of the to-be-processed block, where the at least one first adjacent reconstructed block and the at least one second adjacent reconstructed block have the same shape and are equal in size;
  • a predicted motion vector generation module 1403, configured to obtain a first predicted motion vector of the to-be-processed block according to the first reference motion vector corresponding to the smallest pixel difference value or rate-distortion cost value among the at least two first reference motion vectors.
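One way to read how these three modules cooperate is the following sketch (class, method and parameter names are assumptions for illustration only):

```python
class PredictedMvGenerator:
    """Illustrative composition of modules 1401-1403 of apparatus 1400."""

    def __init__(self, build_candidate_set, derive_reference_mvs, template_cost):
        self.build_candidate_set = build_candidate_set    # set generation module 1401
        self.derive_reference_mvs = derive_reference_mvs  # template matching module 1402
        self.template_cost = template_cost                # pixel-difference or RD cost (module 1402)

    def first_predicted_mv(self, block, first_reference_picture):
        candidates = self.build_candidate_set(block)      # candidate motion vector set
        first_candidate = candidates[0]
        # at least two first reference motion vectors derived from the first candidate
        reference_mvs = self.derive_reference_mvs(first_candidate, first_reference_picture)
        # module 1403: keep the reference MV whose adjacent-template cost is smallest
        return min(reference_mvs,
                   key=lambda mv: self.template_cost(block, mv, first_reference_picture))
```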
  • the embodiment of the present invention provides another apparatus 1500 for generating a predicted motion vector.
  • the apparatus 1500 is configured to perform bidirectional prediction of a to-be-processed block, where the bidirectional prediction includes a first direction prediction and a second direction prediction, the first direction prediction being a prediction based on a first reference frame list and the second direction prediction being a prediction based on a second reference frame list; the apparatus 1500 specifically includes:
  • a first set generation module 1501 configured to acquire a first prediction mode, and generate a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction prediction motion vector in the first direction prediction;
  • a second set generation module 1502 configured to acquire a second prediction mode, and generate a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction prediction motion vector in the second direction prediction,
  • where, when the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode.
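The pairing rule between the two prediction modes can be captured by a short helper (a sketch; it assumes the hybrid case in which the two directions always use complementary modes):

```python
def second_prediction_mode(first_prediction_mode):
    """Complementary mode selection for the two prediction directions:
    AMVP in one direction implies Merge in the other, and vice versa."""
    if first_prediction_mode == "AMVP":
        return "Merge"
    if first_prediction_mode == "Merge":
        return "AMVP"
    raise ValueError("unknown prediction mode: %r" % (first_prediction_mode,))

# second_prediction_mode("Merge") -> "AMVP"
```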
  • the computer program product comprises one or more computer instructions which, when loaded and executed on a computer, produce, in whole or in part, a process or function according to an embodiment of the invention.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions can be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, from one network site, computer, server, or data center to another network site, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, microwave) means.
  • the computer readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state drive).

Abstract

Embodiments of the present invention provide a method for generating a predicted motion vector and a related apparatus. The method comprises: constructing a candidate motion vector set of a block to be processed; determining at least two first reference motion vectors according to a first candidate motion vector in the candidate motion vector set; respectively calculating a pixel difference value or a rate distortion cost value between a first adjacent reconstructed block of the at least two determined reference blocks and a second adjacent reconstructed block of the block to be processed; and obtaining, according to a first reference motion vector corresponding to the smallest pixel difference value or rate distortion cost value in the at least two first reference motion vectors, a first predicted motion vector of the block to be processed. The implementation of the embodiments of the present invention is advantageous for obtaining an optimal reference image block of a current block in a codec process, thereby constructing an accurately reconstructed block of the current block.

Description

Prediction motion vector generation method and related device

Technical field
The present invention relates to the field of video coding and decoding, and in particular, to a prediction motion vector generation method and related devices.
Background
在视频编码和解码框架中,混合编码结构通常用于视频序列的编码和解码。混合编码结构的编码端通常包括:预测模块、变换模块、量化模块和熵编码模块;混合编码结构的解码端通常包括:熵解码模块、反量化模块、反变换模块和预测补偿模块。在视频编码和解码框架中,视频序列的图像通常划分成图像块进行编码。一帧图像被划分成若干图像块,这些图像块使用上述模块进行编码和解码。这些编码和解码模块的组合可以有效去除视频序列的冗余信息,并能保证在解码端得到视频序列的编码图像。In video coding and decoding frameworks, hybrid coding structures are commonly used for encoding and decoding video sequences. The coding end of the hybrid coding structure generally includes: a prediction module, a transformation module, a quantization module, and an entropy coding module; the decoding end of the hybrid coding structure generally includes: an entropy decoding module, an inverse quantization module, an inverse transform module, and a prediction compensation module. In video coding and decoding frameworks, images of a video sequence are typically divided into image blocks for encoding. A frame of image is divided into blocks of images that are encoded and decoded using the above modules. The combination of these encoding and decoding modules can effectively remove redundant information of the video sequence and ensure that the encoded image of the video sequence is obtained at the decoding end.
在上述模块中,预测模块用于编码端获得视频序列编码图像的图像块的预测块信息,进而根据具体模式确定是否需要得到图像块的残差,预测补偿模块用于解码端获得当前解码图像块的预测块信息,再根据具体模式确定是否根据解码得到的图像块残差来获得当前解码图像块。预测模块通常包含帧内预测和帧间预测两种技术。其中,帧内预测技术利用当前图像块的空间像素信息去除当前图像块的冗余信息以获得残差。帧间预测技术的先进运动矢量预测(advanced motion vector prediction,AMVP)模式及合并(Merge)模式中的非SKIP模式利用当前图像邻近的已编解码图像的像素信息去除当前图像块(简称当前块)的冗余信息以获得残差,而合并模式中SKIP模式则不依赖于图像块残差,直接根据预测块信息即能获得当前解码图像块。其中,当前图像邻近的已编解码图像被称为参考图像。In the above module, the prediction module is used by the encoding end to obtain the prediction block information of the image block of the video sequence coded image, and further determines whether the residual of the image block needs to be obtained according to the specific mode, and the prediction compensation module is used by the decoding end to obtain the current decoded image block. The prediction block information is further determined according to a specific mode whether the current decoded image block is obtained according to the decoded image block residual. The prediction module usually includes two techniques of intra prediction and inter prediction. The intra prediction technique uses the spatial pixel information of the current image block to remove redundant information of the current image block to obtain a residual. The advanced motion vector prediction (AMVP) mode of the inter prediction technique and the non-SKIP mode in the merge mode (Merge) mode use the pixel information of the coded image adjacent to the current image to remove the current image block (referred to as the current block for short). The redundant information obtains the residual, and the SKIP mode in the merge mode does not depend on the image block residual, and the current decoded image block can be obtained directly according to the predicted block information. The coded image adjacent to the current image is referred to as a reference image.
AMVP模式或Merge模式实现过程中,都是直接利用时域或空域上相邻块的运动矢量作为其所建立候选列表中的候选MV。然而,相邻块的运动情况未必与当前块一致,也就是说相邻块的运动矢量与当前块的实际运动矢量可能会有所差别,所以当前块直接基于这些候选mv未必能获得最佳参考图像块。In the AMVP mode or the Merge mode implementation process, the motion vectors of neighboring blocks in the time domain or the air domain are directly used as candidate MVs in the candidate list created by them. However, the motion of the neighboring block does not necessarily coincide with the current block, that is, the motion vector of the neighboring block may be different from the actual motion vector of the current block, so the current block may not obtain the best reference based directly on these candidate mvs. Image block.
Summary of the invention
Embodiments of the present invention provide a prediction motion vector generation method and related devices. Implementing the embodiments of the present invention facilitates obtaining the best reference image block of the current block in the coding and decoding process, thereby constructing an accurate reconstructed block of the current block.
第一方面,本发明实施例提供了一种预测运动矢量生成方法,该方法包括:构建待处理块的候选运动矢量集合;根据所述候选运动矢量集合中的第一候选运动矢量,确定至少两个第一参考运动矢量,所述第一参考运动矢量用来确定所述待处理块在所述待处理块的第一参考图像中的参考块;分别计算至少两个所述确定的参考块的一个或多个第一邻近已重构块和所述待处理块的一个或多个第二邻近已重构块之间的像素差异值或者率失真(rate-distortion)代价值,其中,所述第一邻近已重构块和所述第二邻近已重构块的形状相同且尺寸相等;根据所述至少两个第一参考运动矢量中对应的所述像素差异值或者率失真代价值最小的一个第一参考运动矢量,获得所述待处理块的第一预测运动矢量。In a first aspect, an embodiment of the present invention provides a method for generating a motion vector predictor, the method comprising: constructing a candidate motion vector set of a block to be processed; and determining at least two according to a first candidate motion vector in the candidate motion vector set. First reference motion vectors, the first reference motion vectors are used to determine a reference block of the to-be-processed block in a first reference image of the to-be-processed block; respectively calculating at least two of the determined reference blocks a pixel difference value or rate-distortion value between one or more first neighbor reconstructed blocks and one or more second neighbor reconstructed blocks of the block to be processed, wherein The first adjacent reconstructed block and the second adjacent reconstructed block have the same shape and the same size; according to the pixel difference value or the rate distortion value of the at least two first reference motion vectors is the smallest A first reference motion vector obtains a first predicted motion vector of the to-be-processed block.
其中,所述候选运动矢量集合可包括当前块的多个候选运动矢量,候选运动矢量又可 称为候选预测运动矢量,在可能实施例中,还可以包括当前块的多个候选运动矢量对应的参考图像信息。具体的,所述候选运动矢量集合为基于Merge模式构建的Merge候选列表,或基于AMVP模式构建的AMVP候选列表。对于编码端和解码端,均可根据预设规则构建当前块的候选运动矢量集合,例如根据当前块空域上周边相邻块、或者当前块所对应时域参考块、或者当前块周边相邻块所对应时域参考块的预测运动矢量作为候选运动矢量,基于这些候选运动矢量基于预设的Merge模式或者AMVP模式构建候选列表。The candidate motion vector set may include a plurality of candidate motion vectors of the current block, and the candidate motion vector may be referred to as a candidate predicted motion vector. In a possible embodiment, the candidate motion vector may further include a plurality of candidate motion vectors of the current block. Reference image information. Specifically, the candidate motion vector set is a Merge candidate list constructed based on the Merge mode, or an AMVP candidate list constructed based on the AMVP mode. For the encoding end and the decoding end, the candidate motion vector set of the current block may be constructed according to a preset rule, for example, according to a neighboring block in the current block airspace, or a time domain reference block corresponding to the current block, or a neighboring block adjacent to the current block. The predicted motion vectors of the corresponding time domain reference blocks are used as candidate motion vectors, and the candidate list is constructed based on the candidate Merge mode or the AMVP mode based on the candidate motion vectors.
本发明实施例中,在构建Merge候选列表或者AMVP候选列表之后,还基于模板匹配的方式对候选列表中的候选运动矢量进行更新。本发明实施例中的模板匹配过程具体包括:选取候选列表中的第一候选运动矢量,第一候选运动矢量表示当前图像中的当前块到第一参考图像中的参考图像块(简称为参考块)的预测运动矢量。然后确定在所述第一参考图像中的搜索范围;根据该搜索范围在所述第一参考块周围进行搜索,得到至少一个第一参考图像块(可简称为参考块,但是为了与候选运动矢量所确定的参考块做区别,也可以简称为图像块),所述当前块到每个所述第一图像块分别确定了一个运动矢量,这种运动矢量为一种参考运动矢量。也就是说,基于第一候选运动矢量,可确定至少两个第一参考运动矢量:其中的一个参考运动矢量为由所述参考块所确定的第一候选运动矢量本身,另外的至少一个第一参考运动矢量为由所述至少一个图像块所确定的参考运动矢量。然后,分别计算所述当前块的至少一个邻近已重构块和所述至少一个第一图像块的至少一个邻近已重构块之间的像素差异值或率失真代价值,以及计算所述当前块的至少一个邻近已重构块和所述参考块的至少一个邻近已重构块之间的像素差异值或率失真代价值;从得到的这些像素差异值或率失真代价值中,选出像素差异值或率失真代价值最小的一个,使用所述像素差异值或代价值最小的一个所对应的运动矢量作为所述当前块的新的候选运动矢量。In the embodiment of the present invention, after constructing the Merge candidate list or the AMVP candidate list, the candidate motion vectors in the candidate list are also updated based on the template matching manner. The template matching process in the embodiment of the present invention specifically includes: selecting a first candidate motion vector in the candidate list, where the first candidate motion vector represents a reference block in the current image to the reference image block in the first reference image (referred to as a reference block for short) Predicted motion vector. Determining a search range in the first reference image; performing a search around the first reference block according to the search range, obtaining at least one first reference image block (which may be simply referred to as a reference block, but for the candidate motion vector) The determined reference block is differentiated, which may also be referred to simply as an image block, and the current block determines a motion vector to each of the first image blocks, and the motion vector is a reference motion vector. That is, based on the first candidate motion vector, at least two first reference motion vectors may be determined: one of the reference motion vectors is the first candidate motion vector itself determined by the reference block, and the other at least one first The reference motion vector is a reference motion vector determined by the at least one image block. And calculating, respectively, a pixel difference value or a rate distortion value between the at least one adjacent reconstructed block of the current block and the at least one adjacent reconstructed block of the at least one first image block, and calculating the current Selecting pixel difference values or rate distortion values between at least one adjacent reconstructed block of the block and at least one adjacent reconstructed block of the reference block; selecting from the obtained pixel difference values or rate distortion generation values One of the pixel difference value or the rate distortion generation value is the smallest, and the corresponding motion vector with the pixel difference value or the lowest value is used as the new candidate motion vector of the current block.
具体实施例中,如果所述像素差异值或代价值最小的一个对应的运动矢量并不是所述候选运动矢量,则在所述至少一个(具体可以是两个或者两个以上)第一图像块中确定一个所述像素差异值或者率失真代价值最小的图像块,该图像块所对应的参考运动矢量作为新的候选运动矢量。对于解码端,如果解析的索引值指示的就是所述新的候选运动矢量,那么解码端可直接将所述新的候选运动矢量作为预测运动矢量来获得当前块的预测块(对于单向预测,预测块即为当前块的重构块;对于双向预测,预测块可用于最终组成当前块的重构块)。对于编码端,可将所述新的候选运动矢量替换替换候选列表中在模板匹配阶段所选取的那个候选运动矢量,从而实现候选列表的更新。这样,编码端可基于更新后的候选列表,根据率失真代价值遍历列表中所有的候选运动矢量,从中确定一个最佳的候选运动矢量作为当前块的预测运动矢量(例如基于Merge候选列表所得的预测运动矢量为最优MV,基于AMVP候选列表所得的预测运动矢量为最优MVP),进而基于所述预测运动矢量得到当前块的预测块(对于单向预测,预测块即为当前块的重构块;对于双向预测,预测块可用于最终组成当前块的重构块)。In a specific embodiment, if the pixel difference value or a corresponding motion vector with the lowest value is not the candidate motion vector, the at least one (specifically, two or more) first image blocks may be used. An image block having the smallest pixel value or rate distortion generation value is determined, and a reference motion vector corresponding to the image block is used as a new candidate motion vector. For the decoding end, if the parsed index value indicates the new candidate motion vector, the decoding end may directly use the new candidate motion vector as the predicted motion vector to obtain the prediction block of the current block (for unidirectional prediction, The prediction block is the reconstructed block of the current block; for bidirectional prediction, the prediction block can be used for the reconstructed block that ultimately constitutes the current block). For the encoding end, the new candidate motion vector may be replaced with the candidate motion vector selected in the template matching stage in the replacement candidate list, thereby implementing the update of the candidate list. In this way, the encoding end may traverse all the candidate motion vectors in the list according to the rate distortion generation value based on the updated candidate list, and determine an optimal candidate motion vector as the predicted motion vector of the current block (for example, based on the Merge candidate list). The predicted motion vector is the optimal MV, and the predicted motion vector obtained based on the AMVP candidate list is the optimal MVP, and the predicted block of the current block is obtained based on the predicted motion vector (for the unidirectional prediction, the prediction block is the weight of the current block) Construction block; for bidirectional prediction, the prediction block can be used to finally form a reconstructed block of the current block).
可以看出,本发明实施例中,视频编解码系统可利用模板匹配的方式,验证当前块的参考图像中一定范围内(甚至整个参考图像内)的参考图像块的邻近已重构块是否与当前块的邻近已重构块有较好的匹配,对基于Merge或AMVP模式所构建的候选列表的候选运动矢量进行更新,基于更新后的候选列表能够保证编解码过程中获得当前块的最佳参考图像 块,进而构建出当前块准确的重构块。It can be seen that, in the embodiment of the present invention, the video codec system can verify whether the adjacent reconstructed block of the reference image block in a certain range (or even the entire reference image) in the reference image of the current block is compared with the template matching manner. The neighboring reconstructed block of the current block has a good match, and the candidate motion vector of the candidate list constructed based on the Merge or AMVP mode is updated, and the updated candidate list can ensure the best of the current block in the encoding and decoding process. The reference image block is used to construct an accurate block of the current block.
基于第一方面,在可能的实施方式中,对于解码端,也除了可直接将像素差异值或者率失真代价值最小的一个运动矢量(模板匹配时所选取的候选运动矢量或者图像块对应的参考运动矢量)直接作为预测运动矢量外,如果像素差异值或者率失真代价值最小的一个运动矢量就是某个图像块对应的参考运动矢量,那么也可以将该参考运动矢量替换候选列表中模板匹配时所选取的候选运动矢量,并(基于索引值)确定该替换后的候选运动矢量(即该参考运动矢量)作为预测运动矢量。Based on the first aspect, in a possible implementation, for the decoding end, in addition to a motion vector that can directly minimize the pixel difference value or the rate distortion generation value (the candidate motion vector selected by the template matching or the reference corresponding to the image block) The motion vector is directly used as the predicted motion vector. If one of the pixel difference values or the rate distortion generation value is the reference motion vector corresponding to a certain image block, the reference motion vector may be replaced by the template matching candidate list. The selected candidate motion vector, and (based on the index value), determines the replaced candidate motion vector (ie, the reference motion vector) as the predicted motion vector.
基于第一方面,在本发明可能的实施方式中,所述方法可用于解码端,在所述构建待处理块的候选运动矢量集合之前,解码端解析码流获得标识信息和/或候选列表的索引值,基于所述标识信息和/或索引值来确定基于哪种预测模式(例如Merge模式或AMVP模式)建立候选列表,以及在模板匹配中对候选列表中哪一个候选运动矢量进行更新。Based on the first aspect, in a possible implementation manner of the present invention, the method may be used by a decoding end, where the decoding end parses the code stream to obtain identification information and/or a candidate list before constructing the candidate motion vector set of the to-be-processed block. An index value is determined based on the identification information and/or the index value to establish a candidate list based on which prediction mode (eg, Merge mode or AMVP mode), and which candidate motion vector in the candidate list is updated in template matching.
在一可能实施方式中,解码端可以通过解析码流获得标识信息(例如码流代码中的标识位)和候选列表的索引值。所述标识信息用于指示所述候选运动矢量集合为基于Merge模式或者AMVP模式来构建,所述索引值用于指示候选列表中的具体的候选运动矢量。所以,解码端基于所述标识信息就能快速确定预测模式,基于索引值快速选取候选运动矢量使用模板匹配的方式来进行更新(或者直接基于索引值选取该候选运动矢量作为预测运动矢量来计算预测块),提高解码效率。In a possible implementation manner, the decoding end may obtain identification information (such as an identifier bit in the code stream code) and an index value of the candidate list by parsing the code stream. The identifier information is used to indicate that the candidate motion vector set is constructed based on a Merge mode or an AMVP mode, and the index value is used to indicate a specific candidate motion vector in the candidate list. Therefore, the decoding end can quickly determine the prediction mode based on the identification information, and quickly select the candidate motion vector based on the index value to update using the template matching manner (or directly select the candidate motion vector as the predicted motion vector based on the index value to calculate the prediction. Block) to improve decoding efficiency.
在一种可能实施方式中,所述标识信息(例如码流代码中的标识位)既可以用来指示解码当前块所采用的预测模式,同时还可以用来指示基于该预测模式所构建的候选列表的索引值。具体的,当所述标识信息用于指示所述预测模式为Merge模式时,还用于指示Merge模式的索引信息,或,当所述标识信息用于指示所述预测模式为AVMP模式时,还用于指示AMVP模式的索引信息。所以,解码端基于所述标识信息就能快速确定预测模式,以及快速选取候选运动矢量使用模板匹配的方式来进行更新(或者直接基于所指示的索引值选取该候选运动矢量作为预测运动矢量来计算预测块),提高解码效率。In a possible implementation manner, the identification information (for example, the identifier bit in the code stream code) can be used to indicate the prediction mode used by the decoding current block, and can also be used to indicate the candidate constructed based on the prediction mode. The index value of the list. Specifically, when the identifier information is used to indicate that the prediction mode is the Merge mode, and is used to indicate the index information of the Merge mode, or when the identifier information is used to indicate that the prediction mode is the AVMP mode, Index information used to indicate the AMVP mode. Therefore, the decoding end can quickly determine the prediction mode based on the identification information, and quickly select the candidate motion vector to update by using template matching (or directly select the candidate motion vector as the predicted motion vector based on the indicated index value to calculate Prediction block) to improve decoding efficiency.
在一种可能实施方式中,所述标识信息(例如码流代码中的标识位)用于指示解码当前块所采用的预测模式,并且,当所述标识信息指示的预测模式为AMVP模式时,同时指示AMVP候选列表的索引值;当所述标识信息指示的预测模式为Merge模式时,解码端使用预设的Merge模式的索引值。所以,解码端基于所述标识信息就能快速确定预测模式以及确定候选运动矢量,并在Merge模式中快速候选列表的第一个选取候选运动矢量(或者其他预先指定的候选运动矢量)使用模板匹配的方式来进行更新(或者直接选取指定的候选运动矢量作为预测运动矢量来计算预测块),提高解码效率。In a possible implementation manner, the identifier information (for example, an identifier bit in the code stream code) is used to indicate a prediction mode used by decoding the current block, and when the prediction mode indicated by the identifier information is an AMVP mode, The index value of the AMVP candidate list is simultaneously indicated; when the prediction mode indicated by the identifier information is the Merge mode, the decoding end uses the index value of the preset Merge mode. Therefore, the decoding end can quickly determine the prediction mode and determine the candidate motion vector based on the identification information, and use the template matching in the first candidate motion vector (or other pre-specified candidate motion vector) of the quick candidate list in the Merge mode. The way to update (or directly select the specified candidate motion vector as the predicted motion vector to calculate the prediction block), improve the decoding efficiency.
在一种可能实施方式中,在双向预测中,如果编解码端均采用混合预测模式,那么解码可获得至少两个标识信息,一个标识信息用于指示一个方向采用Merge模式,另一个标识信息用于指示另一个方向采用AMVP模式。或者,一个标识信息用于指示一个方向采用Merge模式以及指示Merge候选列表的索引值,另一个标识信息用于指示另一个方向采用AMVP模式以及指示AMVP候选列表的索引值。所以,解码端只需基于双向预测中的两个标识信息就能快速确定预测模式,以及快速选取候选运动矢量使用模板匹配的方式来进行更新(或者直接基于所指示的索引值选取该候选运动矢量作为预测运动矢量来计算预测块), 提高解码效率。In a possible implementation manner, in the bidirectional prediction, if the codec end adopts the hybrid prediction mode, the decoding may obtain at least two pieces of identification information, one identification information is used to indicate that one direction adopts the Merge mode, and another identification information is used. The AMVP mode is used to indicate the other direction. Alternatively, one identification information is used to indicate that one direction adopts the Merge mode and an index value indicating the Merge candidate list, and the other identification information is used to indicate that the other direction adopts the AMVP mode and the index value indicating the AMVP candidate list. Therefore, the decoding end can quickly determine the prediction mode based on the two identification information in the bidirectional prediction, and quickly select the candidate motion vector to update by using template matching (or directly select the candidate motion vector based on the indicated index value). The prediction block is calculated as a predicted motion vector), and the decoding efficiency is improved.
在一种可能实施方式中,在双向预测中,如果编解码端均采用混合预测模式(即不同方向分别采用Merge模式和AMVP模式),那么解码可获得至少两个组合{第一标识信息,第一索引值}、{第二标识信息,第二索引值},其中{第一标识信息,第一索引值}中,第一标识信息表示第一方向的预测模式,第一索引值表示该第一方向的候选列表的索引值;{第二标识信息,第二索引值}中,第二标识信息表示第二方向的预测模式,第二索引值表示该第二方向的候选列表的索引值。所以,解码端基于两个组合{第一标识信息,第一索引值}、{第二标识信息,第二索引值}就能快速确定双向预测中不同方向的预测模式,以及选取候选列表中的候选运动矢量使用模板匹配的方式来进行更新(或者直接基于索引值选取该候选运动矢量作为预测运动矢量来计算对应方向的预测块),提高解码效率。In a possible implementation manner, in the bidirectional prediction, if the codec side adopts the hybrid prediction mode (that is, the Merge mode and the AMVP mode are respectively adopted in different directions), the decoding may obtain at least two combinations {first identification information, An index value}, {second identification information, a second index value}, wherein in the {first identification information, the first index value}, the first identification information represents a prediction mode in a first direction, and the first index value represents the first In the index value of the candidate list in one direction; in the second identifier information, the second index value, the second identifier information indicates the prediction mode in the second direction, and the second index value indicates the index value of the candidate list in the second direction. Therefore, the decoding end can quickly determine the prediction modes in different directions in the bidirectional prediction based on the two combinations {first identification information, the first index value}, the {second identification information, the second index value}, and select the candidates in the candidate list. The candidate motion vector is updated by using template matching (or the candidate motion vector is directly selected as the predicted motion vector based on the index value to calculate the prediction block of the corresponding direction), thereby improving the decoding efficiency.
在一种可能实施方式中,解码端也可以通过解析码流得到候选列表的索引值,所述候选列表的索引值既可以用来指示解码当前块所采用的预测模式,同时还可以用来指示基于该预测模式所构建的候选列表的具体候选运动矢量。所以,解码端基于候选列表的索引值就能快速确定预测模式,以及快速选取候选运动矢量使用模板匹配的方式来进行更新(或者直接基于索引值选取该候选运动矢量作为预测运动矢量来计算预测块),提高解码效率。In a possible implementation manner, the decoding end may also obtain an index value of the candidate list by parsing the code stream, and the index value of the candidate list may be used to indicate the prediction mode used for decoding the current block, and may also be used to indicate A specific candidate motion vector of the candidate list constructed based on the prediction mode. Therefore, the decoding end can quickly determine the prediction mode based on the index value of the candidate list, and quickly select the candidate motion vector to update using the template matching manner (or directly select the candidate motion vector as the predicted motion vector based on the index value to calculate the prediction block. ), improve decoding efficiency.
其中,本发明实施例中,所述双向预测包括第一方向预测和第二方向预测,所述第一方向预测是基于第一参考帧列表的预测,所述第二方向预测是基于第二参考帧列表的预测。所述第一参考帧列表包括所述第一参考图像,所述第二参考帧列表包括第二参考图像,通常的,也可以将一个方向的预测称为前向预测,将另一个方向的预测简称为后向预测。In the embodiment of the present invention, the bidirectional prediction includes a first direction prediction and a second direction prediction, where the first direction prediction is based on a prediction of a first reference frame list, and the second direction prediction is based on a second reference. Prediction of the frame list. The first reference frame list includes the first reference image, and the second reference frame list includes a second reference image. Generally, the prediction in one direction may also be referred to as forward prediction, and the prediction in another direction. Referred to as backward prediction.
The bidirectional prediction involved in the embodiments of the present invention includes at least two types. One type performs bidirectional prediction based on a hybrid prediction mode, and the other type performs bidirectional prediction based on a single prediction mode.
For the first type, the hybrid prediction mode includes the Merge mode and the AMVP mode; that is, one direction of the bidirectional prediction adopts the Merge mode, and the other direction adopts the AMVP mode.
在这种类型中,每个方向分别可得到一个预测运动矢量,基于预测运动矢量进而分别得到一个预测块,最终将两个方向的预测块经过预设算法(例如进行加权平均的算法)进行组合,从而得到当前块的重构块。其中每个方向得到预测运动矢量的过程可以是相同的,也可以不同;可以是独立的,也可以互相协调的。In this type, each direction can obtain a predicted motion vector, and then a prediction block is obtained based on the predicted motion vector, and finally the prediction blocks in the two directions are combined by a preset algorithm (for example, an algorithm for performing weighted averaging). To get the reconstructed block of the current block. The process of obtaining the predicted motion vector in each direction may be the same or different; it may be independent or coordinated.
例如,两个方向分别采用一种预测模式,其中一个方向(第一方向)独立地基于本发明实施例提供的模板匹配方式来实现候选运动矢量的更新/候选列表的更新,从而得到该第一方向的预测运动矢量,另一个方向(第二方向)独立地直接基于所构建的候选列表得到该第二方向的预测运动矢量(即不进行模板匹配)。For example, the two directions respectively adopt a prediction mode, wherein one direction (first direction) independently implements update of the candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the first The predicted motion vector of the direction, the other direction (the second direction) independently obtains the predicted motion vector of the second direction directly based on the constructed candidate list (ie, no template matching is performed).
例如,两个方向分别采用一种预测模式,其中一个方向(第一方向)独立地基于本发明实施例提供的模板匹配方式来实现候选运动矢量的更新/候选列表的更新,从而得到该第一方向的预测运动矢量,另一个方向(第二方向)也独立地基于本发明实施例提供的模板匹配方式来实现候选运动矢量的更新/候选列表的更新,从而得到该第二方向的预测运动矢量。For example, the two directions respectively adopt a prediction mode, wherein one direction (first direction) independently implements update of the candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the first The predicted motion vector of the direction, the other direction (the second direction) also independently implements the update of the candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the predicted motion vector of the second direction. .
例如,两个方向分别采用一种预测模式,其中一个方向(如第一方向)独立地基于本发明实施例提供的模板匹配方式来实现候选运动矢量的更新/候选列表的更新,从而得到该 第一方向的预测运动矢量,另一个方向(如第二方向)协调地利用第一方向的候选运动矢量更新前后的差值,对第二方向所构建的候选列表中的候选运动矢量进行更新,进而得到第二方向的预测运动矢量。第二方向预测的实现过程包括:计算所述替换后的第一方向候选运动矢量与所述替换前的第一方向候选运动矢量的差值;根据所述差值和第二方向候选运动矢量集合中的第二方向候选运动矢量,获得第二方向新的候选运动矢量,将所述第二方向新的候选运动矢量替换候选运动矢量集合中第二方向原先的候选运动矢量,实现对第二方向所构建的候选列表中的候选运动矢量进行更新,进而得到第二方向的预测运动矢量。For example, the two directions respectively adopt a prediction mode, and one direction (such as the first direction) independently implements update of the candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the first The predicted motion vector in one direction, and the other direction (such as the second direction) cooperatively uses the difference between the candidate motion vector in the first direction to update the candidate motion vector in the candidate list constructed in the second direction, and further A predicted motion vector in the second direction is obtained. The implementation process of the second direction prediction includes: calculating a difference between the replaced first direction candidate motion vector and the first direction candidate motion vector before the replacement; and selecting, according to the difference value and the second direction candidate motion vector set The second direction candidate motion vector in the second direction obtains a new candidate motion vector in the second direction, and replaces the new candidate motion vector in the second direction with the original candidate motion vector in the second direction in the candidate motion vector set to implement the second direction The candidate motion vectors in the constructed candidate list are updated to obtain a predicted motion vector in the second direction.
It can be seen that the hybrid prediction mode provided by the embodiments of the present invention can also ensure that the best reference block of the current block is obtained in the coding and decoding process, and can improve coding and decoding efficiency while ensuring that the best reference block of the current block is obtained.
对于第二种类型,一种单向预测模式可以是Merge模式或AMVP模式,基于该预测模式建立用于双向预测的候选列表,所述候选列表中包括用于第一方向预测的候选运动矢量和用于第二方向预测的候选运动矢量。其中每个方向得到预测运动矢量的过程可以是相同的,也可以不同;可以是独立的,也可以互相协调的。For the second type, a unidirectional prediction mode may be a Merge mode or an AMVP mode, based on which a candidate list for bidirectional prediction is included, the candidate list including candidate motion vector sums for first direction prediction Candidate motion vector for second direction prediction. The process of obtaining the predicted motion vector in each direction may be the same or different; it may be independent or coordinated.
例如,其中一个方向(第一方向)独立地基于本发明实施例提供的模板匹配方式来实现第一方向候选运动矢量的更新/候选列表的更新,从而得到该第一方向的预测运动矢量,另一个方向(第二方向)独立地直接基于所构建的候选列表得到该第二方向的预测运动矢量(即不进行模板匹配)。For example, one of the directions (the first direction) independently implements the update of the first direction candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the predicted motion vector of the first direction, and One direction (the second direction) independently obtains the predicted motion vector of the second direction based on the constructed candidate list (ie, no template matching is performed).
例如,其中一个方向(第一方向)独立地基于本发明实施例提供的模板匹配方式来实现第一方向候选运动矢量的更新/候选列表的更新,从而得到该第一方向的预测运动矢量,另一个方向(第二方向)也独立地基于本发明实施例提供的模板匹配方式来实现第二候选运动矢量的更新/候选列表的更新,从而得到该第二方向的预测运动矢量。For example, one of the directions (the first direction) independently implements the update of the first direction candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the predicted motion vector of the first direction, and One direction (the second direction) also independently implements the update of the second candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the predicted motion vector of the second direction.
例如,两个方向分别采用一种预测模式,其中一个方向(如第一方向)独立地基于本发明实施例提供的模板匹配方式来实现候选运动矢量的更新/候选列表的更新,从而得到该第一方向的预测运动矢量,另一个方向(如第二方向)协调地利用第一方向候选运动矢量更新前后的差值,对第二方向选运动矢量进行更新,进而得到第二方向的预测运动矢量。第二方向预测的实现过程包括:计算所述替换后的第一方向候选运动矢量与所述替换前的第一方向候选运动矢量的差值;根据所述差值和所述候选运动矢量集合中的第二方向候选运动矢量,获得第二方向新的候选运动矢量,将所述第二方向新的候选运动矢量替换候选运动矢量集合中第二方向原先的候选运动矢量,实现对第二方向所构建的候选列表中的候选运动矢量进行更新,进而得到第二方向的预测运动矢量。For example, the two directions respectively adopt a prediction mode, and one direction (such as the first direction) independently implements update of the candidate motion vector/update of the candidate list based on the template matching manner provided by the embodiment of the present invention, thereby obtaining the first The predicted motion vector in one direction, the other direction (such as the second direction) cooperatively uses the difference before and after the update of the first direction candidate motion vector, and updates the motion vector selected in the second direction to obtain the predicted motion vector in the second direction. . The implementation process of the second direction prediction includes: calculating a difference between the replaced first direction candidate motion vector and the first direction candidate motion vector before the replacement; according to the difference and the candidate motion vector set a second direction candidate motion vector, obtaining a new candidate motion vector in the second direction, and replacing the new candidate motion vector in the second direction with the original candidate motion vector in the second direction of the candidate motion vector set, to implement the second direction The candidate motion vectors in the constructed candidate list are updated to obtain a predicted motion vector in the second direction.
It can be seen that, in bidirectional prediction, the embodiments of the present invention can update the candidate motion vector of one direction based on the update result of the other direction, thereby greatly improving the efficiency of the encoding and decoding process.
Based on the first aspect, in a possible implementation, the decoding end (or the encoding end) may select the first candidate motion vector (for the first direction) from the candidate motion vector set in multiple manners, including:
方式一:当所述候选运动矢量集合中的第四候选运动矢量由所述第二方向预测生成时,根据比例关系缩小或放大所述第四候选运动矢量,以得到所述第一候选运动矢量;其中,所述比例关系包括第一时序差和第二时序差的比值,所述第一时序差为所述第一候选运动矢量所确定的参考图像帧的图像序列号和所述待处理块所在的图像帧的图像序列号的差 值,所述第二时序差为所述第四候选运动矢量所确定的参考图像帧的图像序列号和所述待处理块所在的图像帧的图像序列号的差值。具体的:当从所述候选运动矢量集合所选取的第四候选运动矢量由所述第二方向预测生成时,将所述第四候选运动矢量按第一比例关系映射为所述第一候选运动矢量;其中,所述第一候选运动矢量与所述第四候选运动矢量之间构成第一比例关系(所述第一比例关系是一种标量);第一候选运动矢量所确定的参考图像帧的时序和当前块所在的图像帧的时序之间构成第一时序差,所述第四候选运动矢量所确定的参考图像帧的时序和当前块所在的图像帧的时序之间构成第二时序差,所述第一时序差与所述第二时序差之间构成第二比例关系(所述第二比例关系是一种标量);所述第一比例关系和所述第二比例关系相同。Manner 1: When the fourth candidate motion vector in the candidate motion vector set is generated by the second direction prediction, the fourth candidate motion vector is reduced or enlarged according to a proportional relationship to obtain the first candidate motion vector. Wherein the proportional relationship includes a ratio of a first timing difference and a second timing difference, the first timing difference being an image sequence number of the reference image frame determined by the first candidate motion vector and the to-be-processed block a difference between image sequence numbers of the image frames in which the second time difference is an image sequence number of the reference image frame determined by the fourth candidate motion vector and an image sequence number of the image frame in which the to-be-processed block is located The difference. Specifically: when the fourth candidate motion vector selected from the candidate motion vector set is generated by the second direction prediction, mapping the fourth candidate motion vector to the first candidate motion according to a first proportional relationship a vector; wherein the first candidate motion vector and the fourth candidate motion vector form a first proportional relationship (the first proportional relationship is a scalar); the reference image frame determined by the first candidate motion vector Forming a first timing difference between the timing of the image frame in which the current block is located, and the timing difference between the timing of the reference image frame determined by the fourth candidate motion vector and the timing of the image frame in which the current block is located constitutes a second timing difference And forming a second proportional relationship between the first timing difference and the second timing difference (the second proportional relationship is a scalar); the first proportional relationship and the second proportional relationship are the same.
That is to say, if the candidate motion vector selected for the front (back) prediction was obtained by back (or front) prediction, a candidate motion vector for the front (back) prediction can be obtained by mapping and is used as the selected candidate motion vector value in the subsequent steps.
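A hedged sketch of Manner 1: the candidate generated by the second direction prediction is scaled by the ratio of the two time differences (image sequence number distances) before it is used as the first-direction candidate; the numbers in the closing comment are illustrative only.

```python
def scale_candidate_mv(mv_from_second_dir, poc_cur, poc_ref_first, poc_ref_second):
    """Scale a candidate MV generated by the second direction prediction so it
    can serve as a first-direction candidate (first aspect, Manner 1)."""
    d1 = poc_ref_first - poc_cur     # first time difference
    d2 = poc_ref_second - poc_cur    # second time difference
    ratio = d1 / d2
    return (mv_from_second_dir[0] * ratio, mv_from_second_dir[1] * ratio)

# e.g. with current POC 4, first-direction reference POC 2 and second-direction
# reference POC 6, a second-direction MV of (8, -4) scales to (-8.0, 4.0).
```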
方式二:当所述第一候选运动矢量由所述双向预测生成时,根据第五候选运动矢量,确定所述至少两个第一参考运动矢量,其中,所述第一候选运动矢量确定的参考块由根据所述第五候选运动矢量确定的第一方向参考块和根据所述第六候选运动矢量确定的第二方向参考块加权获得,所述第五候选运动矢量由所述第一方向预测生成,所述第六候选运动矢量由所述第二方向预测生成。Manner 2: determining, when the first candidate motion vector is generated by the bidirectional prediction, the at least two first reference motion vectors according to the fifth candidate motion vector, wherein the reference of the first candidate motion vector determination The block is obtained by weighting a first direction reference block determined according to the fifth candidate motion vector and a second direction reference block determined according to the sixth candidate motion vector, the fifth candidate motion vector being predicted by the first direction Generating, the sixth candidate motion vector is generated by the second direction prediction.
That is, if the candidate motion vector selected for the front (or back) prediction was obtained by bidirectional prediction, the part of that candidate motion vector used for the front (or back) prediction is selected and used as the selected candidate motion vector value in the subsequent steps.
If the selected candidate motion vector was itself obtained by forward (or backward) prediction, that candidate motion vector is used directly as the selected candidate motion vector in the subsequent steps.
Manner 3: If the candidate motion vector selected for the forward (or backward) prediction was itself obtained by forward (or backward) prediction, that candidate motion vector is used directly as the selected candidate motion vector value in the subsequent steps.
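Purely for illustration, the three manners can be combined into one selection routine. The candidate representation below (a dict with `direction`, `mv_l0`, `mv_l1`, `poc_ref_l0`, `poc_ref_l1`) is a hypothetical data layout, not one defined by this application:

```python
def select_first_candidate(cand, target_dir, poc_cur, poc_target_ref):
    """Derive the candidate MV for the target direction ('l0' or 'l1') from a
    candidate produced by the same direction (Manner 3), by bidirectional
    prediction (Manner 2), or by the opposite direction (Manner 1)."""
    if cand["direction"] == target_dir:          # Manner 3: use directly
        return cand["mv_" + target_dir]
    if cand["direction"] == "bi":                # Manner 2: take the component
        return cand["mv_" + target_dir]          # belonging to the target direction
    # Manner 1: map the opposite-direction MV by the ratio of temporal differences.
    other = "l1" if target_dir == "l0" else "l0"
    mv = cand["mv_" + other]
    td_first = poc_target_ref - poc_cur
    td_second = cand["poc_ref_" + other] - poc_cur
    scale = td_first / td_second if td_second else 1.0
    return (mv[0] * scale, mv[1] * scale)

bi_cand = {"direction": "bi", "mv_l0": (4, 0), "mv_l1": (-4, 0),
           "poc_ref_l0": 4, "poc_ref_l1": 12}
print(select_first_candidate(bi_cand, "l0", poc_cur=8, poc_target_ref=4))  # (4, 0)

l1_cand = {"direction": "l1", "mv_l1": (6, -2), "poc_ref_l1": 12}
print(select_first_candidate(l1_cand, "l0", poc_cur=8, poc_target_ref=4))  # (-6.0, 2.0)
```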
It should be noted that, in a possible implementation, there may likewise be multiple manners of selecting the second candidate motion vector (for the second direction) from the candidate motion vector set; they can be implemented with reference to the foregoing manners, and details are not described herein again.
It can be seen that implementing the foregoing embodiments of the present invention improves the accuracy and fault tolerance of the candidate motion vector selection process, and thereby improves the accuracy and fault tolerance of the template matching process.
Based on the first aspect, in a possible implementation, determining the at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set includes: determining, according to the first candidate motion vector, a reference block of the to-be-processed block in the first reference image; and searching positions neighboring the determined reference block at a target precision to obtain at least two candidate reference blocks, where each candidate reference block corresponds to one first reference motion vector, and the target precision is one of 4-pel precision, 2-pel precision, integer-pel precision, half-pel precision, quarter-pel precision, and 1/8-pel precision. Implementing this embodiment of the present invention helps improve the search precision in the template matching process, and thereby improves the accuracy of the template matching result.
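A minimal sketch of the neighborhood search at a target precision is shown below. The helper and the choice of an eight-neighbor pattern are illustrative assumptions; motion vectors are expressed in 1/8-pel units so that every listed precision maps to an integer step:

```python
def reference_mv_candidates(center_mv, precision_in_pel):
    """Return first reference motion vectors for the center position and its
    eight neighbors, spaced precision_in_pel apart (4, 2, 1, 0.5, 0.25 or
    0.125 pel).  MVs are in 1/8-pel units, so one pel equals 8 units."""
    step = int(round(precision_in_pel * 8))   # search step in 1/8-pel units
    cx, cy = center_mv
    return [(cx + dx * step, cy + dy * step)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

# Example: half-pel search around the MV (16, -8) given in 1/8-pel units
# yields nine candidate reference MVs spaced 4 units (half a pel) apart.
print(reference_mv_candidates((16, -8), 0.5))
```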
Based on the first aspect, in a possible implementation, the at least one neighboring reconstructed block of the current block, the at least one neighboring reconstructed block of the reference block, and the at least one neighboring reconstructed block of the image block have the same shape and equal size. For example, suppose the at least one neighboring reconstructed block of the current block has a positional relationship 1 with the current block (for example, adjacency or proximity within a certain range), the at least one neighboring reconstructed block of the reference block has a positional relationship 2 with the reference block (for example, adjacency or proximity within a certain range), and the at least one neighboring reconstructed block of the image block has a positional relationship 3 with that image block (for example, adjacency or proximity within a certain range); then positional relationship 1, positional relationship 2, and positional relationship 3 may be the same. In a possible embodiment, however, positional relationship 1, positional relationship 2, and positional relationship 3 may also differ.
According to a second aspect, an embodiment of the present invention provides another method for generating a predicted motion vector. The method may be used for bidirectional prediction in a hybrid prediction mode, where the hybrid prediction mode includes a Merge mode and an AMVP mode. The bidirectional prediction includes a first direction prediction and a second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list. The method includes: obtaining a first prediction mode and generating a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction predicted motion vector in the first direction prediction; and obtaining a second prediction mode and generating a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction predicted motion vector in the second direction prediction, and where, when the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or, when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode.
The hybrid prediction mode provided in this embodiment of the present invention is a new coding mode. Based on the hybrid prediction mode, the best reference block of the current block can likewise be obtained during encoding and decoding, and coding and decoding efficiency is improved while obtaining the best reference block of the current block is ensured.
Based on the second aspect, in a possible implementation, when the first prediction mode is the Merge mode, the obtaining a first prediction mode and generating a first candidate motion vector set includes: generating the first candidate motion vector set from the candidate motion vectors used by the Merge mode in the first direction prediction.
Based on the second aspect, in a possible implementation, when the first prediction mode is the AMVP mode, the obtaining a first prediction mode and generating a first candidate motion vector set includes: generating the first candidate motion vector set from the candidate motion vectors used by the AMVP mode in the first direction prediction.
Based on the second aspect, in a possible implementation, when the second prediction mode is the Merge mode, the obtaining a second prediction mode and generating a second candidate motion vector set includes: generating the second candidate motion vector set from the candidate motion vectors used by the Merge mode in the second direction prediction.
Based on the second aspect, in a possible implementation, when the second prediction mode is the AMVP mode, the obtaining a second prediction mode and generating a second candidate motion vector set includes: generating the second candidate motion vector set from the candidate motion vectors used by the AMVP mode in the second direction prediction.
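Sketching the per-direction candidate-set construction under the hybrid prediction mode: the two builder callables below are placeholders for whatever Merge/AMVP list construction the codec already performs, and the string mode names are purely illustrative.

```python
def build_candidate_sets(first_mode, build_merge_list, build_amvp_list):
    """first_mode is 'merge' or 'amvp'; the second direction always uses the
    other mode, as the hybrid prediction mode requires."""
    if first_mode == "merge":
        first_set = build_merge_list(direction=0)
        second_set = build_amvp_list(direction=1)
    else:
        first_set = build_amvp_list(direction=0)
        second_set = build_merge_list(direction=1)
    return first_set, second_set

# Toy usage with stub builders standing in for the real list construction.
merge_stub = lambda direction: [f"merge_cand_{direction}_{i}" for i in range(5)]
amvp_stub = lambda direction: [f"amvp_cand_{direction}_{i}" for i in range(2)]
print(build_candidate_sets("merge", merge_stub, amvp_stub))
```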
In a possible implementation, in bidirectional prediction, both the encoder and the decoder use the hybrid prediction mode. The decoder can then obtain at least two pieces of identification information: one piece indicates that one direction uses the Merge mode, and the other indicates that the other direction uses the AMVP mode. Alternatively, one piece of identification information indicates that one direction uses the Merge mode and indicates an index value of the Merge candidate list, and the other piece indicates that the other direction uses the AMVP mode and indicates an index value of the AMVP candidate list. Therefore, based only on the two pieces of identification information in the bidirectional prediction, the decoder can quickly determine the prediction modes and quickly select candidate motion vectors to be updated by template matching (or directly select, based on the indicated index value, the candidate motion vector as the predicted motion vector to compute the prediction block), which improves decoding efficiency.
In a possible implementation, in bidirectional prediction, both the encoder and the decoder use the hybrid prediction mode (that is, the two directions use the Merge mode and the AMVP mode respectively). The decoder can then obtain at least two tuples, {first identification information, first index value} and {second identification information, second index value}, where, in {first identification information, first index value}, the first identification information indicates the prediction mode of the first direction and the first index value indicates an index value of the candidate list for the first direction, and, in {second identification information, second index value}, the second identification information indicates the prediction mode of the second direction and the second index value indicates an index value of the candidate list for the second direction. Therefore, based on the two tuples {first identification information, first index value} and {second identification information, second index value}, the decoder can quickly determine the prediction modes of the different directions in the bidirectional prediction and select candidate motion vectors from the candidate lists to be updated by template matching (or directly select, based on the index value, the candidate motion vector as the predicted motion vector to compute the prediction block of the corresponding direction), which improves decoding efficiency.
Specifically, when the first identification information is used to indicate that the first prediction mode is the Merge mode, the first identification information is further used to indicate index information of the Merge mode; or, when the first identification information is used to indicate that the first prediction mode is the AMVP mode, the first identification information is further used to indicate index information of the AMVP mode.
Specifically, for one direction of the bidirectional prediction, the identification information (the first identification information or the second identification information) is used to indicate the prediction mode used for decoding the current block. When the prediction mode indicated by the identification information is the AMVP mode, the identification information also indicates an index value of the AMVP candidate list; when the prediction mode indicated by the identification information is the Merge mode, the decoder may use a preset index value for the Merge mode. Therefore, based on the identification information, the decoder can quickly determine the prediction mode and the candidate motion vector, and, in the Merge mode, quickly take the first candidate motion vector of the candidate list (or another pre-specified candidate motion vector) and update it by template matching (or directly select the specified candidate motion vector as the predicted motion vector to compute the prediction block), which improves decoding efficiency.
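A hedged sketch of how a decoder might interpret one piece of identification information; the mode strings, the preset Merge index of 0, and the way the index is carried are illustrative assumptions, not the actual bitstream syntax of this application:

```python
def parse_direction_info(mode_flag, index, preset_merge_index=0):
    """Interpret the identification information for one prediction direction.
    mode_flag selects 'merge' or 'amvp'; index is the candidate-list index
    carried with it, or None when a preset Merge index is to be used."""
    if mode_flag == "merge":
        return {"mode": "merge",
                "index": index if index is not None else preset_merge_index}
    return {"mode": "amvp", "index": index}

# Two directions of one bi-predicted block, one {flag, index} pair per direction.
print(parse_direction_info("merge", None))   # Merge direction, preset index 0
print(parse_direction_info("amvp", 1))       # AMVP direction, candidate index 1
```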
Specifically, at the decoder, before the obtaining a second prediction mode and generating a second candidate motion vector set, the method further includes: after determining that the first prediction mode is the Merge mode, parsing the bitstream to obtain second identification information, where the second identification information is used to indicate index information of the AMVP mode.
Specifically, at the decoder, before the obtaining a second prediction mode and generating a second candidate motion vector set, the method further includes: after determining that the first prediction mode is the AMVP mode, parsing the bitstream to obtain second identification information, where the second identification information is used to indicate index information of the Merge mode.
Based on the second aspect, in a possible implementation, after determining that the first prediction mode is the AMVP mode, the decoder parses the bitstream to obtain a reference frame index and motion vector difference information of the first direction prediction; or, after determining that the second prediction mode is the AMVP mode, the decoder parses the bitstream to obtain a reference frame index and motion vector difference information of the second direction prediction.
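For the direction coded in the AMVP mode, the parsed motion vector difference is combined with the selected predictor in the usual way (the predictor plus the difference gives the motion vector); a toy sketch with illustrative names:

```python
def reconstruct_amvp_mv(amvp_list, mvp_index, mvd):
    """MV of the AMVP-coded direction = selected predictor + parsed MVD."""
    mvp = amvp_list[mvp_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# Example: AMVP list of length 2, predictor index 1, parsed MVD (3, -1).
print(reconstruct_amvp_mv([(0, 0), (8, 4)], 1, (3, -1)))  # (11, 3)
```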
In addition, in a possible embodiment of the present invention, the encoder and decoder performing bidirectional prediction may use the template matching manner provided in the embodiments of the present invention to update the candidate lists. For the specific process, refer to the description of the first aspect; details are not described herein again.
According to a third aspect, an embodiment of the present invention provides a device for generating a predicted motion vector. The device includes: a set generation module, configured to construct a candidate motion vector set of a to-be-processed block; a template matching module, configured to determine at least two first reference motion vectors according to a first candidate motion vector in the candidate motion vector set, where the first reference motion vectors are used to determine reference blocks of the to-be-processed block in a first reference image of the to-be-processed block, and further configured to separately calculate a pixel difference value or a rate-distortion cost between one or more first neighboring reconstructed blocks of each of the at least two determined reference blocks and one or more second neighboring reconstructed blocks of the to-be-processed block, where the first neighboring reconstructed blocks and the second neighboring reconstructed blocks have the same shape and equal size; and a predicted motion vector generation module, configured to obtain a first predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors whose corresponding pixel difference value or rate-distortion cost is the smallest. Specifically, the modules of the device are configured to implement the method described in the first aspect.
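A minimal sketch of the template matching step performed by the template matching module: the pixel difference value here is a plain sum of absolute differences between the neighboring reconstructed samples (templates) of each candidate reference block and those of the to-be-processed block, with all templates assumed to have the same shape and size as required above. The data layout is illustrative.

```python
def template_sad(template_a, template_b):
    """Pixel difference value between two equally sized templates (2-D lists)."""
    return sum(abs(a - b) for row_a, row_b in zip(template_a, template_b)
                          for a, b in zip(row_a, row_b))

def best_reference_mv(candidates, current_template):
    """candidates: list of (first_reference_mv, reference_block_template).
    Returns the first reference MV whose template cost is smallest."""
    return min(candidates,
               key=lambda c: template_sad(c[1], current_template))[0]

# Toy 1x4 templates for two candidate reference blocks.
cands = [((4, 0), [[10, 12, 11, 9]]), ((5, 0), [[10, 13, 11, 8]])]
print(best_reference_mv(cands, [[10, 12, 10, 9]]))  # -> (4, 0)
```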
According to a fourth aspect, an embodiment of the present invention provides another device for generating a predicted motion vector. The device is configured to perform bidirectional prediction of a to-be-processed block, where the bidirectional prediction includes a first direction prediction and a second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list. The device includes: a first set generation module, configured to obtain a first prediction mode and generate a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction predicted motion vector in the first direction prediction; and a second set generation module, configured to obtain a second prediction mode and generate a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction predicted motion vector in the second direction prediction, and where, when the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or, when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode. The modules of the device are specifically configured to implement the method described in the second aspect.
According to a fifth aspect, an embodiment of the present invention provides a device for generating a predicted motion vector. The device may be applied on the encoding side or on the decoding side. The device includes a processor and a memory that are connected to each other (for example, through a bus). In a possible implementation, the device may further include a transceiver connected to the processor and the memory and configured to receive and send data. The memory is configured to store program code and video data. The processor is configured to read the program code stored in the memory and perform the method described in the first aspect.
According to a sixth aspect, an embodiment of the present invention provides another device for generating a predicted motion vector. The device may be applied on the encoding side or on the decoding side. The device includes a processor and a memory that are connected to each other (for example, through a bus). In a possible implementation, the device may further include a transceiver connected to the processor and the memory and configured to receive and send data. The memory is configured to store program code and video data. The processor is configured to read the program code stored in the memory and perform the method described in the second aspect.
According to a seventh aspect, an embodiment of the present invention provides a video coding and decoding system. The video coding and decoding system includes a source device and a destination device that can be communicatively connected. The source device generates encoded video data; therefore, the source device may be referred to as a video encoding device or video encoding equipment. The destination device can decode the encoded video data generated by the source device; therefore, the destination device may be referred to as a video decoding device or video decoding equipment. The source device and the destination device may be instances of a video coding and decoding device or video coding and decoding equipment. The methods described in the first aspect and/or the second aspect are applied to such a video coding and decoding device or equipment; that is, the video coding and decoding system may be used to implement the methods described in the first aspect and/or the second aspect.
According to an eighth aspect, an embodiment of the present invention provides a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to perform the method described in the first aspect.
According to a ninth aspect, an embodiment of the present invention provides a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to perform the method described in the second aspect.
According to a tenth aspect, an embodiment of the present invention provides a computer program product including instructions which, when run on a computer, cause the computer to perform the method described in the first aspect.
According to an eleventh aspect, an embodiment of the present invention provides a computer program product including instructions which, when run on a computer, cause the computer to perform the method described in the second aspect.
It can be seen that, in the embodiments of the present invention, the video coding and decoding system can use template matching to verify whether the neighboring reconstructed blocks of reference image blocks within a certain range of the reference image of the current block (or even within the entire reference image) match the neighboring reconstructed blocks of the current block well, and thereby update the candidate motion vectors of the candidate list constructed in the Merge or AMVP mode. Based on the updated candidate list, the best reference image block of the current block can be obtained during encoding and decoding, so that an accurate reconstructed block of the current block is constructed. In addition, the embodiments of the present invention further provide a hybrid prediction mode; based on the hybrid prediction mode, the best reference block of the current block can also be obtained during encoding and decoding, and coding and decoding efficiency is improved while obtaining the best reference block of the current block is ensured.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is a schematic structural diagram of a coding and decoding system according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a video coding and decoding apparatus or electronic device according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a terminal according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an AMVP candidate list according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a merge candidate list according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another AMVP candidate list according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another merge candidate list according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a template matching scenario according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of another template matching scenario according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of a coding and decoding process in the Merge mode according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of another coding and decoding process in the Merge mode according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of a coding and decoding process in the AMVP mode according to an embodiment of the present invention;
FIG. 13 is a schematic diagram of another coding and decoding process in the AMVP mode according to an embodiment of the present invention;
FIG. 14 to FIG. 24 are schematic flowcharts of methods for generating a predicted motion vector according to embodiments of the present invention;
FIG. 25 to FIG. 28 are schematic structural diagrams of devices according to embodiments of the present invention.
DETAILED DESCRIPTION OF EMBODIMENTS
The following describes the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. The terms used in the embodiments of the present invention are only intended to explain specific embodiments of the present invention and are not intended to limit the present invention.
The system framework to which the embodiments of the present invention are applied is first introduced. Referring to FIG. 1, FIG. 1 is a schematic block diagram of a video coding and decoding system 10 according to an embodiment of the present invention. As shown in FIG. 1, the video coding and decoding system 10 includes a source device 12 and a destination device 14. The source device 12 generates encoded video data; therefore, the source device 12 may be referred to as a video encoding device or video encoding equipment. The destination device 14 can decode the encoded video data generated by the source device 12; therefore, the destination device 14 may be referred to as a video decoding device or video decoding equipment. The source device 12 and the destination device 14 may be instances of a video coding and decoding device or video coding and decoding equipment. The source device 12 and the destination device 14 may include a wide range of devices, including desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, handheld devices such as smartphones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, and the like.
The destination device 14 can receive the encoded video data from the source device 12 via a channel 16. The channel 16 may include one or more media and/or devices capable of moving the encoded video data from the source device 12 to the destination device 14. In one example, the channel 16 may include one or more communication media that enable the source device 12 to transmit the encoded video data directly to the destination device 14 in real time. In this example, the source device 12 may modulate the encoded video data according to a communication standard (for example, a wireless communication protocol) and may transmit the modulated video data to the destination device 14. The one or more communication media may include wireless and/or wired communication media, for example, a radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network (for example, a local area network, a wide area network, or a global network such as the Internet). The one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from the source device 12 to the destination device 14.
In another example, the channel 16 may include a storage medium that stores the encoded video data generated by the source device 12. In this example, the destination device 14 can access the storage medium via disk access or card access. The storage medium may include a variety of locally accessible data storage media, such as Blu-ray discs, DVDs, CD-ROMs, flash memory, or other suitable digital storage media for storing encoded video data.
In another example, the channel 16 may include a file server or another intermediate storage device that stores the encoded video data generated by the source device 12. In this example, the destination device 14 may access the encoded video data stored at the file server or the other intermediate storage device via streaming or download. The file server may be of a server type capable of storing the encoded video data and transmitting the encoded video data to the destination device 14. Example file servers include web servers (for example, for websites), File Transfer Protocol (FTP) servers, network-attached storage (NAS) devices, and local disk drives.
The destination device 14 can access the encoded video data via a standard data connection (for example, an Internet connection). Example types of data connections include wireless channels (for example, Wi-Fi connections), wired connections (for example, DSL or cable modems), or combinations of both that are suitable for accessing encoded video data stored on a file server. The transmission of the encoded video data from the file server may be streaming, download transmission, or a combination of both.
The technology of the present invention is not limited to wireless application scenarios. For example, the technology can be applied to video coding and decoding that supports a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (for example, via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, the video coding and decoding system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
In the example of FIG. 1, the source device 12 includes a video source 18, a video encoder 20, and an output interface 22. In some examples, the output interface 22 may include a modulator/demodulator (modem) and/or a transmitter. The video source 18 may include a video capture device (for example, a video camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of the foregoing video data sources.
The video encoder 20 may encode the video data from the video source 18. Specifically, the video encoder 20 is provided with a prediction module, a transform module, a quantization module, an entropy encoding module, and the like. In some examples, the source device 12 transmits the encoded video data directly to the destination device 14 via the output interface 22. The encoded video data may also be stored on a storage medium or a file server for later access by the destination device 14 for decoding and/or playback.
In the example of FIG. 1, the destination device 14 includes an input interface 28, a video decoder 30, and a display device 32. In some examples, the input interface 28 includes a receiver and/or a modem. The input interface 28 can receive the encoded video data via the channel 16. The video decoder 30 is configured to decode the bitstream (video data) received by the input interface 28. Specifically, the video decoder 30 is provided with an entropy decoding module, an inverse quantization module, an inverse transform module, a prediction compensation module, and the like. The display device 32 may be integrated with the destination device 14 or may be external to the destination device 14. In general, the display device 32 displays the decoded video data. The display device 32 may include a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display device.
The video encoder 20 and the video decoder 30 may operate according to a video compression standard (for example, the High Efficiency Video Coding H.265 standard) and may conform to the HEVC Test Model (HM). The text description of the H.265 standard, ITU-T H.265 (V3) (04/2015), was published on April 29, 2015, and can be downloaded from http://handle.itu.int/11.1002/1000/12455; the entire content of that document is incorporated herein by reference.
Referring to FIG. 2, FIG. 2 is a schematic block diagram of a video coding and decoding apparatus or electronic device 50 according to an embodiment of the present invention. The apparatus or electronic device 50 may incorporate a codec according to an embodiment of the present invention. FIG. 3 is a schematic structural diagram of an apparatus for video coding according to an embodiment of the present invention. The units in FIG. 2 and FIG. 3 are described below.
The electronic device 50 may be, for example, a mobile terminal or user equipment of a wireless communication system. It should be understood that the embodiments of the present invention may be implemented in any electronic device or apparatus that may need to encode and decode, or encode, or decode video images.
The apparatus 50 may include a housing 30 for incorporating and protecting the device. The apparatus 50 may further include a display 32 in the form of a liquid crystal display. In other embodiments of the present invention, the display may be any suitable display technology for displaying images or video. The apparatus 50 may further include a keypad 34. In other embodiments of the present invention, any suitable data or user interface mechanism may be used; for example, the user interface may be implemented as a virtual keyboard or a data entry system as part of a touch-sensitive display. The apparatus may include a microphone 36 or any suitable audio input, which may be a digital or analog signal input. The apparatus 50 may further include an audio output device, which in embodiments of the present invention may be any one of the following: an earphone 38, a speaker, or an analog audio or digital audio output connection. The apparatus 50 may also include a battery 40; in other embodiments of the present invention, the device may be powered by any suitable mobile energy device, such as a solar cell, a fuel cell, or a clockwork generator. The apparatus may further include an infrared port 42 for short-range line-of-sight communication with other devices. In other embodiments, the apparatus 50 may further include any suitable short-range communication solution, such as a Bluetooth wireless connection or a USB/FireWire wired connection.
The apparatus 50 may include a controller 56 or a processor for controlling the apparatus 50. The controller 56 may be connected to a memory 58, which in embodiments of the present invention may store data in the form of images and audio data, and/or may also store instructions to be executed on the controller 56. The controller 56 may further be connected to a codec circuit 54 suitable for implementing encoding and decoding of audio and/or video data, or for assisting the encoding and decoding implemented by the controller 56.
The apparatus 50 may further include a card reader 48 and a smart card 46, for example, a UICC and a UICC reader, for providing user information and for providing authentication information for authenticating and authorizing the user on a network.
The apparatus 50 may further include a radio interface circuit 52 connected to the controller and suitable for generating wireless communication signals, for example, for communication with a cellular communication network, a wireless communication system, or a wireless local area network. The apparatus 50 may further include an antenna 44 connected to the radio interface circuit 52 and configured to send the radio frequency signals generated by the radio interface circuit 52 to other apparatuses and to receive radio frequency signals from other apparatuses.
In some embodiments of the present invention, the apparatus 50 includes a camera capable of recording or detecting individual frames, and the codec 54 or the controller receives and processes these frames. In some embodiments of the present invention, the apparatus may receive to-be-processed video image data from another device before transmission and/or storage. In some embodiments of the present invention, the apparatus 50 may receive images for encoding/decoding via a wireless or wired connection. The methods described in the embodiments of the present invention are mainly applied to inter prediction in the coding and decoding processes of the video encoder 20 and the video decoder 30.
In the coding and decoding involved in the embodiments of the present invention, the essence of inter prediction is to find, in a reference image, the block most similar to the current block of the current image (the current block is the image block currently being encoded or decoded, and may also be referred to as the to-be-processed block); this most similar block is the reference block. To obtain the reference block that best matches the current block, the Advanced Motion Vector Prediction (AMVP) mode and the Merge mode of the inter prediction modes at the encoder and decoder are implemented in different ways.
For the AMVP mode, an MV is first predicted for the current block. This predicted motion vector is also called a motion vector predictor (MVP). An MVP can be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block. Because there are multiple neighboring blocks, there are multiple MVPs; an MVP is essentially a candidate motion vector (candidate MV). The AMVP mode assembles these MVPs into a candidate list; herein, the candidate list constructed in the AMVP mode is called the AMVP candidate list. After building the AMVP candidate list, the encoder selects an optimal MVP from the AMVP candidate list, determines the starting point of the search in the reference image according to the optimal MVP (the MVP is also a candidate MV), then searches within a specific range around the starting point in a specific manner and computes rate-distortion costs, and finally obtains an optimal MV. The optimal MV determines the position of the actual reference block (prediction block) in the reference image. The motion vector difference (MVD) is obtained as the difference between the optimal MV and the optimal MVP, the index value of the optimal MVP in the AMVP candidate list is encoded, and the index of the reference image is encoded. The encoder only needs to send the MVD, the index in the AMVP candidate list, and the index of the reference image to the decoder in the bitstream, which achieves compression of the video data. The decoder, on the one hand, decodes the MVD, the index value of the candidate list, and the index of the reference image from the bitstream; on the other hand, it builds the AMVP candidate list itself, obtains the optimal MVP from the index value, obtains the optimal MV from the MVD and the optimal MVP, obtains the reference image from the index of the reference image, finds the actual reference block (prediction block) in the reference image using the optimal MV, and finally obtains the reconstructed block of the current block by performing motion compensation on the actual reference block (prediction block).
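As an illustrative sketch of the AMVP relationship described above (encoder side), the MVD is simply the difference between the optimal MV found by the search and the chosen predictor; the cost proxy used to pick the predictor and the helper names are assumptions, not the encoder's real rate-cost computation:

```python
def encode_amvp(best_mv, amvp_list):
    """Choose the MVP that yields the cheapest MVD (here: smallest absolute
    sum, a stand-in for the real rate cost) and return (mvp_index, mvd)."""
    costs = [abs(best_mv[0] - mvp[0]) + abs(best_mv[1] - mvp[1])
             for mvp in amvp_list]
    idx = costs.index(min(costs))
    mvp = amvp_list[idx]
    return idx, (best_mv[0] - mvp[0], best_mv[1] - mvp[1])

amvp_list = [(0, 0), (8, 4)]
print(encode_amvp((9, 3), amvp_list))   # -> (1, (1, -1)): index of MVP1 plus the MVD
```

The decoder inverts this by adding the parsed MVD back to the predictor indicated by the index, as in the sketch given earlier for the AMVP-coded direction.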
For the Merge mode, the motion vectors of spatially neighboring blocks of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference blocks corresponding to neighboring blocks of the current block are likewise used as candidate motion vectors (candidate MVs). Because there are multiple neighboring blocks, there are multiple candidate MVs. The Merge mode constructs a candidate list based on these candidate MVs; herein, the candidate list constructed in the Merge mode is called the Merge candidate list (its length differs from that of the AMVP mode). In the Merge mode, the MV of a neighboring block is used directly as the predicted motion vector of the current block, that is, the current block shares an MV with the neighboring block (so there is no MVD in this case), and the reference image of the neighboring block is used as the current block's reference image. The Merge mode traverses all candidate MVs in the Merge candidate list, computes rate-distortion costs, and finally selects the candidate MV with the smallest rate-distortion cost as the optimal MV of the Merge mode; the index value of that MV in the Merge candidate list is then encoded. The encoder only needs to send the index of the Merge candidate list to the decoder in the bitstream, which achieves compression of the video data. The decoder, on the one hand, decodes the index of the Merge candidate list from the bitstream; on the other hand, it builds the Merge candidate list itself, determines from the index value which candidate MV in the Merge candidate list is the optimal MV, uses the reference image of the neighboring block as its own reference image, finds the actual reference block (prediction block) in the reference image using the optimal MV, and finally obtains the reconstructed block of the current block by performing motion compensation on the actual reference block (prediction block).
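A toy sketch of the Merge-mode selection loop: every candidate MV in the Merge candidate list is tried, a cost is computed for each (an arbitrary callable below stands in for the encoder's real rate-distortion computation), and only the winning index needs to be signalled:

```python
def select_merge_candidate(merge_list, rd_cost):
    """merge_list: candidate MVs in Merge-list order.  rd_cost: callable
    returning the rate-distortion cost of predicting the current block with
    a given candidate MV.  Returns (best_index, best_mv); only best_index is
    written to the bitstream."""
    costs = [rd_cost(mv) for mv in merge_list]
    best_index = costs.index(min(costs))
    return best_index, merge_list[best_index]

# Toy cost: prefer the candidate closest to a "true" motion of (7, 2).
toy_cost = lambda mv: abs(mv[0] - 7) + abs(mv[1] - 2)
print(select_merge_candidate([(0, 0), (8, 4), (7, 1)], toy_cost))  # -> (2, (7, 1))
```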
The conventional AMVP mode or Merge mode obtains the best reference image block directly based on the candidate motion vectors of neighboring blocks. However, a candidate motion vector of a neighboring block of the current block may not be the best predicted motion vector; that is, in the conventional AMVP mode or Merge mode, the actual reference block obtained directly from a candidate motion vector of that mode is not necessarily the best reference block of the current block. To overcome this technical defect of the conventional AMVP mode or Merge mode, the embodiments of the present invention provide methods for generating a predicted motion vector that improve (update) the candidate list constructed in the conventional AMVP mode or Merge mode; based on the updated candidate list, the best reference block of the current block can be obtained during encoding and decoding. In addition, the embodiments of the present invention further provide hybrid prediction modes; based on the hybrid prediction modes, the best reference block of the current block can also be obtained during encoding and decoding.
To facilitate understanding of the technical solutions of the present invention, the following first describes the unidirectional prediction and bidirectional prediction involved in the embodiments of the present invention, the constructed candidate lists, the method for updating candidate motion vectors and candidate lists based on template matching, the coding and decoding process based on template matching, and the manner of constructing a candidate list and selecting a candidate motion vector based on decoded information.
First, inter-frame unidirectional prediction (unidirectional prediction for short) and inter-frame bidirectional prediction (bidirectional prediction for short) involved in the embodiments of the present invention are described.
Unidirectional prediction refers to determining a first predicted motion vector of the current block based on a reference image of a single direction, thereby obtaining a prediction block of the current block in that single direction. Generally, depending on the relative relationship between the image sequence number of the reference image frame and the image sequence number of the current image frame, unidirectional prediction may correspondingly be referred to as forward prediction or backward prediction.
Bidirectional prediction includes a first direction prediction and a second direction prediction. The first direction prediction determines a first predicted motion vector of the current block based on a reference image of the first direction, thereby obtaining a prediction block of the current block in the first direction, where the reference image of the first direction is one of a first reference image frame set and the first reference image frame set includes a certain number of reference images. The second direction prediction determines a second predicted motion vector of the current block based on a reference image of the second direction, thereby obtaining a prediction block of the current block in the second direction, where the reference image of the second direction is one of a second reference image frame set and the second reference image frame set includes a certain number of reference images. The prediction block obtained based on the first predicted motion vector and the prediction block obtained based on the second predicted motion vector are processed according to a preset algorithm to finally obtain the reconstructed block of the current block, for example, by weighted averaging of the prediction block obtained from the first predicted motion vector and the prediction block obtained from the second predicted motion vector. Generally, inter-frame bidirectional prediction may also be called forward-backward prediction; that is, inter-frame bidirectional prediction includes forward prediction and backward prediction. In this case, when the first direction prediction is forward prediction, the second direction prediction is correspondingly backward prediction; when the first direction prediction is backward prediction, the second direction prediction is correspondingly forward prediction.
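A small sketch of how the two direction-specific prediction blocks could be combined; equal weights are assumed here, which is only one possible choice for the preset algorithm mentioned above:

```python
def combine_bi_prediction(pred_first, pred_second, w_first=0.5, w_second=0.5):
    """Weighted average of the first- and second-direction prediction blocks
    (given as equally sized 2-D lists of samples)."""
    return [[int(round(w_first * a + w_second * b))
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_first, pred_second)]

print(combine_bi_prediction([[100, 104]], [[96, 100]]))  # -> [[98, 102]]
```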
Next, the candidate lists involved in the embodiments of the present invention are described.
Referring to FIG. 4, FIG. 4 shows an AMVP candidate list used by the AMVP mode according to an embodiment of the present invention. The AMVP candidate list may be applied to forward prediction or to backward prediction. Specifically, the AMVP candidate list may be applied to unidirectional prediction (forward prediction or backward prediction), to the forward prediction in bidirectional prediction, or to the backward prediction in bidirectional prediction.
The AMVP candidate list includes a set of multiple MVPs (each MVP is also a candidate MV). An MVP may be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block. As shown in (a) of FIG. 4, the AMVP candidate list includes MVP0, MVP1, ..., MVPn, where the position of each MVP in the candidate list corresponds to a specific candidate list index (which may be called an index of the AMVP candidate list); that is, each index indicates the candidate MV at the corresponding position in the list. In the figure, the candidate list indexes corresponding to MVP0, MVP1, ..., MVPn are index 0, index 1, ..., index n, respectively.
In one application scenario, the length of the AMVP candidate list constructed in the AMVP mode is 2. As shown in (b) of FIG. 4, the AMVP candidate list contains two MVPs (MVP0 and MVP1), where one MVP may be a candidate motion vector obtained from a spatially neighboring block and, correspondingly, the other MVP may be a candidate motion vector obtained from a temporally neighboring block. In addition, an index value 0 of the candidate list may be defined to indicate MVP0, and an index value 1 of the candidate list may be defined to indicate MVP1. Certainly, this embodiment of the present invention does not limit the specific values of the index values; that is, MVP0 and MVP1 may also be indicated by other defined index values.
Referring to FIG. 5, FIG. 5 shows a Merge candidate list used by the Merge mode according to an embodiment of the present invention. The Merge candidate list may be applied to forward prediction or to backward prediction. Specifically, the Merge candidate list may be applied to unidirectional prediction (forward prediction or backward prediction), to the forward prediction in bidirectional prediction, or to the backward prediction in bidirectional prediction.
The Merge candidate list includes a set of multiple candidate MVs. A candidate MV may be obtained directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block. As shown in (a) of FIG. 5, the Merge candidate list includes forward (and/or backward) candidate MV0, candidate MV1, ..., candidate MVn, where the position of each candidate MV in the candidate list corresponds to a specific candidate list index (which may be called an index of the Merge candidate list); that is, each index indicates the candidate MV at the corresponding position in the list. In the figure, the candidate list indexes corresponding to candidate MV0, candidate MV1, ..., candidate MVn are index 0, index 1, ..., index n, respectively. In addition, in this embodiment of the present invention, candidate MV0, candidate MV1, ..., candidate MVn may be used to indicate the reference image used when constructing the current block (for example, the reference image of a neighboring block is used directly as the reference image of the current block).
在一种应用场景中,基于Merge模式构建的Merge候选列表长度为5。如图5中的(b)所示,Merge候选列表中包含5个MVP(候选MV0、候选MV1、候选MV2、候选MV3、候选MV4),其中的4个候选MV可以是根据空域上相邻块所得到的候选运动矢量,剩下的1个候选MV根据时域上相邻块得到的候选运动矢量。另外,可定义候选列表的索引值2、索引值3、索引值、4、索引值5、索引值6来指示候选MV0、候选MV1、候选MV2、候选MV3、候选MV4。当然,本发明实施例并不限定各个索引值的具体数值,也就是说各个MVP也可以用定义的其他索引值进行指示。In an application scenario, the Merge candidate list length based on the Merge mode is 5. As shown in (b) of FIG. 5, the Merge candidate list includes five MVPs (candidate MV0, candidate MV1, candidate MV2, candidate MV3, candidate MV4), wherein the four candidate MVs may be based on neighboring blocks on the airspace. The obtained candidate motion vector, the remaining one candidate MV is a candidate motion vector obtained from neighboring blocks in the time domain. In addition, an index value 2, an index value 3, an index value, 4, an index value 5, and an index value 6 of the candidate list may be defined to indicate the candidate MV0, the candidate MV1, the candidate MV2, the candidate MV3, and the candidate MV4. Certainly, the embodiment of the present invention does not limit the specific value of each index value, that is, each MVP can also be indicated by other defined index values.
Referring to FIG. 6, FIG. 6 shows an AMVP candidate list used in the AMVP mode according to an embodiment of the present invention. The AMVP candidate list may be applied to bidirectional prediction, that is, to both the forward prediction and the backward prediction in bidirectional prediction.

As shown in FIG. 6, the AMVP candidate list includes a set of MVPs used for prediction in a first direction (including MVP10, MVP11, ..., MVP1n) and a set of MVPs used for prediction in a second direction (including MVP20, MVP21, ..., MVP2n). Each MVP may be derived directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block. In this case, the candidate list indexes index 0, index 1, ..., index n may be used to indicate MVP10, MVP11, ..., MVP1n, and the candidate list indexes index 0, index 1, ..., index n may likewise be used to indicate MVP20, MVP21, ..., MVP2n. In a specific embodiment, the index value used to indicate an MVP in the first direction and the index value used to indicate an MVP in the second direction may be the same or different. For example, the index value 0 may be used to indicate both MVP10 and MVP20, or the index value 0 and the index value 1 may be used to indicate MVP10 and MVP20, respectively.

In the following embodiments of the present invention, when the first direction is the forward (or backward) direction, the candidate motion vectors in the AMVP candidate list used for forward (or backward) prediction may be referred to simply as forward (or backward) candidate motion vectors; when the second direction is the backward (or forward) direction, the candidate motion vectors in the AMVP candidate list used for backward (or forward) prediction may be referred to simply as backward (or forward) candidate motion vectors.
Referring to FIG. 7, FIG. 7 shows another Merge candidate list used in the Merge mode according to an embodiment of the present invention. The Merge candidate list may be applied to bidirectional prediction, that is, to both the forward prediction and the backward prediction in bidirectional prediction.

As shown in FIG. 7, the Merge candidate list includes a set of candidate MVs used for prediction in a first direction (including candidate MV10, candidate MV11, ..., candidate MV1n) and a set of candidate MVs used for prediction in a second direction (including candidate MV20, candidate MV21, ..., candidate MV2n). Each candidate MV may be derived directly from the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block. In this case, the candidate list indexes index 0, index 1, ..., index n may be used to indicate candidate MV10, candidate MV11, ..., candidate MV1n, and the candidate list indexes index 0, index 1, ..., index n may likewise be used to indicate candidate MV20, candidate MV21, ..., candidate MV2n. In a specific embodiment, the index value used to indicate a candidate MV in the first direction and the index value used to indicate a candidate MV in the second direction may be the same or different. For example, the index value 0 may be used to indicate both candidate MV10 and candidate MV20, or the index value 0 and the index value 1 may be used to indicate candidate MV10 and candidate MV20, respectively.

In the following embodiments of the present invention, when the first direction is the forward (or backward) direction, the candidate motion vectors in the Merge candidate list used for forward (or backward) prediction may be referred to simply as forward (or backward) candidate motion vectors; when the second direction is the backward (or forward) direction, the candidate motion vectors in the Merge candidate list used for backward (or forward) prediction may also be referred to simply as backward (or forward) candidate motion vectors.
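As an illustrative sketch only, a bidirectional candidate list of the kind described for FIG. 6 and FIG. 7 may be held as two per-direction candidate sets. Whether the two directions share index values is left open above; the sketch below assumes a shared index, and all names are assumptions made for the example.

```python
# Illustrative sketch only: one candidate set per prediction direction.
from typing import Dict, List, Tuple

MotionVector = Tuple[int, int]

class BiCandidateList:
    def __init__(self, first_dir: List[MotionVector], second_dir: List[MotionVector]):
        self.lists: Dict[str, List[MotionVector]] = {"first": first_dir, "second": second_dir}

    def get(self, direction: str, index: int) -> MotionVector:
        """Look up a candidate MV by prediction direction and candidate list index."""
        return self.lists[direction][index]

bi_list = BiCandidateList(first_dir=[(3, 1), (0, -2)], second_dir=[(-3, -1), (1, 2)])
assert bi_list.get("first", 0) == (3, 1)     # e.g. index 0 indicates candidate MV10
assert bi_list.get("second", 0) == (-3, -1)  # the same index 0 may indicate candidate MV20
```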
Next, the template matching manner involved in the embodiments of the present invention is described.
To overcome the technical defects of the conventional AMVP mode or Merge mode, the template matching manner provided in the embodiments of the present invention may be used to update the candidate motion vectors used in the AMVP mode or the Merge mode, and even to update the candidate list constructed in the AMVP mode or the Merge mode. Based on the updated candidate motion vector/candidate list, the actual reference block obtained in the encoding or decoding process in the AMVP mode or the Merge mode is the best reference block of the current block, thereby ensuring the correctness of the obtained reconstructed block of the current block. The process of updating a candidate motion vector by template matching is described as follows:

(1) In the candidate list constructed based on the AMVP mode or the Merge mode, select one candidate motion vector. The candidate motion vector determines a reference image block (referred to simply as a reference block) of the current block in the reference image.

(2) Determine a search range in the reference image determined by the candidate motion vector, and within the search range, determine at least one reference motion vector different from the candidate motion vector. Each of the at least one reference motion vector corresponds to one reference image block in the reference image indicated by the candidate motion vector (referred to simply as an image block, to distinguish it from the reference block determined by the candidate motion vector).

(3) Separately calculate the pixel difference value or the rate-distortion cost between at least one neighboring reconstructed block of each of the at least one image block and at least one neighboring reconstructed block of the current block, and calculate the pixel difference value or the rate-distortion cost between at least one neighboring reconstructed block of the reference block and the at least one neighboring reconstructed block of the current block. The at least one neighboring reconstructed block of the current block, the at least one neighboring reconstructed block of the reference block, and the at least one neighboring reconstructed block of each image block have the same shape and the same size. Suppose the at least one neighboring reconstructed block of the current block has a positional relationship 1 with the current block (for example, adjacent to or close to it within a certain range), the at least one neighboring reconstructed block of the reference block has a positional relationship 2 with the reference block (for example, adjacent to or close to it within a certain range), and the at least one neighboring reconstructed block of the image block has a positional relationship 3 with the image block (for example, adjacent to or close to it within a certain range); then positional relationship 1, positional relationship 2, and positional relationship 3 may be the same. However, in possible embodiments, positional relationship 1, positional relationship 2, and positional relationship 3 may also differ.

(4) From the pixel difference values or rate-distortion costs obtained between the at least one neighboring reconstructed block of the at least one image block and the at least one neighboring reconstructed block of the current block, and the pixel difference value or rate-distortion cost between the at least one neighboring reconstructed block of the reference block and the at least one neighboring reconstructed block of the current block, determine the minimum pixel difference value or rate-distortion cost. The motion vector corresponding to the minimum pixel difference value or rate-distortion cost is either the candidate motion vector or one of the at least one reference motion vector. Specifically, a rate-distortion cost may be obtained by calculating the pixel difference value between the reconstructed blocks, calculating the product of a preset coefficient and the number of bits required to obtain the motion vector information (for example, the number of coding/decoding bits consumed to obtain the motion vector difference of the current block), and summing the two. The motion vector corresponding to the minimum pixel difference value or rate-distortion cost is then used as the new candidate motion vector. Specifically, if that motion vector is one of the at least one reference motion vector, this reference motion vector is used as the new candidate motion vector, and the originally selected candidate motion vector is updated with the new candidate motion vector; if that motion vector is the original candidate motion vector, the candidate motion vector does not need to be updated.

In this way, the foregoing (1), (2), (3), and (4) constitute the process of updating a candidate motion vector by template matching. It can be understood that if, after (1), (2), (3), and (4), the obtained motion vector is one of the at least one reference motion vector, this reference motion vector is used as the new candidate motion vector, and the new candidate motion vector replaces the original candidate motion vector in the constructed candidate list, thereby updating the candidate list by template matching.
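A minimal sketch of steps (1) to (4) follows, assuming integer-pixel motion vectors, a template formed by the reconstructed row above and column to the left of each block, and a small square search window. None of these choices are mandated by the embodiments above; the function names, the SAD distortion measure, and the lambda rate term are assumptions made for this example.

```python
# Illustrative sketch only: template-matching update of one candidate motion vector.
import numpy as np

def sad(a, b):
    """Pixel difference value: sum of absolute differences between two templates."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def get_template(pic, x, y, w, h):
    """Neighboring reconstructed samples of the w x h block at (x, y): the row above
    and the column to the left. Returns None if the template falls outside the picture."""
    if x < 1 or y < 1 or y + h > pic.shape[0] or x + w > pic.shape[1]:
        return None
    top = pic[y - 1, x:x + w]
    left = pic[y:y + h, x - 1]
    return np.concatenate([top, left])

def refine_candidate_mv(cur_pic, ref_pic, x, y, w, h, cand_mv,
                        search_range=2, lam=0.0, bits_for_mv=lambda mv: 0):
    """Compare the current block's template with the templates of the reference block
    (offset cand_mv) and of every image block inside the search range, and return the
    MV with the minimum cost (pixel difference plus an optional rate term lam * bits)."""
    cur_tpl = get_template(cur_pic, x, y, w, h)
    if cur_tpl is None:
        return cand_mv
    best_mv, best_cost = cand_mv, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            mv = (cand_mv[0] + dx, cand_mv[1] + dy)
            tpl = get_template(ref_pic, x + mv[0], y + mv[1], w, h)
            if tpl is None:
                continue
            cost = sad(cur_tpl, tpl) + lam * bits_for_mv(mv)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = mv, cost
    return best_mv
```

For example, `refine_candidate_mv(cur_pic, ref_pic, 64, 64, 8, 8, (5, -3))` returns `(5, -3)` unchanged unless some nearby reference motion vector yields a strictly smaller template cost, in which case that reference motion vector would replace the candidate in the list.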
For example, referring to FIG. 8, FIG. 8 shows a schematic diagram of template matching. In this template matching process, one candidate motion vector is selected from the candidate list constructed based on the AMVP mode or the Merge mode, and the candidate motion vector determines the reference block in the reference image. The search range determined in combination with the candidate motion vector includes one reference motion vector (reference motion vector 1 in the figure), and reference motion vector 1 determines image block 1 in the reference image. In the template matching, the neighboring reconstructed blocks of the current block are A1 and A2, the neighboring reconstructed blocks of the reference block are B1 and B2, and the neighboring reconstructed blocks of image block 1 are C1 and C2. {A1, A2}, {B1, B2}, and {C1, C2} have the same shape and the same size, and have the same positional relationship with the current block, the reference block, and image block 1, respectively. The pixel difference value or rate-distortion cost between {C1, C2} of image block 1 and {A1, A2} of the current block is calculated, and the pixel difference value or rate-distortion cost between {B1, B2} of the reference block and {A1, A2} of the current block is calculated. The motion vector corresponding to the smaller of these two pixel difference values or rate-distortion costs is used as the candidate motion vector. For example, if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is reference motion vector 1, reference motion vector 1 is used as the new candidate motion vector, and the candidate motion vector in the candidate list is replaced with reference motion vector 1. It can be understood that if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is the candidate motion vector itself, the candidate motion vector in the candidate list does not need to be updated.

For another example, referring to FIG. 9, FIG. 9 shows another schematic diagram of template matching. In this template matching process, one candidate motion vector is selected from the candidate list constructed based on the AMVP mode or the Merge mode, and the candidate motion vector determines the reference block in the reference image. The search range determined in combination with the candidate motion vector includes two reference motion vectors (reference motion vector 1 and reference motion vector 2 in the figure), and reference motion vector 1 and reference motion vector 2 determine image block 1 and image block 2 in the reference image, respectively. In the template matching, the neighboring reconstructed blocks of the current block are A1 and A2, the neighboring reconstructed blocks of the reference block are B1 and B2, the neighboring reconstructed blocks of image block 1 are C1 and C2, and the neighboring reconstructed blocks of image block 2 are D1 and D2. {A1, A2}, {B1, B2}, {C1, C2}, and {D1, D2} have the same shape and the same size, and have the same positional relationship with the current block, the reference block, image block 1, and image block 2, respectively. The pixel difference value or rate-distortion cost between {C1, C2} of image block 1 and {A1, A2} of the current block is calculated, the pixel difference value or rate-distortion cost between {D1, D2} of image block 2 and {A1, A2} of the current block is calculated, and the pixel difference value or rate-distortion cost between {B1, B2} of the reference block and {A1, A2} of the current block is calculated. Among these pixel difference values or rate-distortion costs, the motion vector corresponding to the minimum one is selected as the candidate motion vector. For example, if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is reference motion vector 1 or reference motion vector 2, then reference motion vector 1 or reference motion vector 2 is used as the new candidate motion vector, and the candidate motion vector in the candidate list is replaced with reference motion vector 1 or reference motion vector 2. It can be understood that if the motion vector corresponding to the minimum pixel difference value or rate-distortion cost is the candidate motion vector itself, the candidate motion vector in the candidate list does not need to be updated.
It should be noted that, in specific embodiments of the present invention, updating the candidate list by template matching may mean updating one candidate motion vector in the candidate list, updating multiple candidate motion vectors in the candidate list, or updating all candidate motion vectors in the candidate list.

The following describes the encoding and decoding process of obtaining the reconstructed block of the current block based on template matching in the Merge mode according to the embodiments of the present invention. The process may be divided into an encoding process and a decoding process.
Referring to FIG. 10, FIG. 10 shows a specific encoding process of the Merge mode in this embodiment of the present invention. As shown in FIG. 10, on the encoder side, a candidate list of the Merge mode (that is, a Merge candidate list) is constructed, and the candidate list includes candidate MV1, candidate MV2, and so on. Candidate MV2 in the candidate list is selected and updated by template matching (for the specific process, refer to the foregoing description) to obtain a reference motion vector, and the reference motion vector is used as a new candidate motion vector to replace candidate MV2 in the candidate list, thereby updating the candidate list; the updated candidate list includes candidate MV1 and the reference motion vector. Then, for the updated candidate list, the predicted motion vector is obtained in the conventional manner (the predicted motion vector here is the optimal MV). That is, in the Merge mode, all candidate MVs in the candidate list (including candidate MV1 and the reference motion vector) are traversed and a rate-distortion cost is calculated for each; finally, the candidate MV with the minimum rate-distortion cost is selected as the optimal MV of the Merge mode, the prediction block of the current block is constructed based on the optimal MV (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block), and the index value of the optimal MV is encoded. For example, if the optimal MV is the reference motion vector, the prediction block of the current block is constructed based on the reference motion vector, and the index value corresponding to the reference motion vector is encoded. The encoder side sends the index value of the Merge candidate list to the decoder side in the bitstream.

On the decoder side, on the one hand, the decoder constructs the Merge candidate list based on the same rule as the encoder side, and the candidate list includes candidate MV1, candidate MV2, and so on; on the other hand, the decoder parses the bitstream sent from the encoder side to obtain the index value of the Merge candidate list, and updates the candidate motion vector indicated by the index value by template matching. For example, if the index value indicates candidate MV2 in the candidate list, candidate MV2 is updated by template matching (for the specific process, refer to the foregoing description) to obtain a reference motion vector. After that, in a possible embodiment, the reference motion vector is used as a new candidate motion vector to replace candidate MV2 in the candidate list, thereby updating the candidate list; the updated candidate list includes candidate MV1 and the reference motion vector. Then, for the updated candidate list, the reference motion vector directly determined based on the index value is used as the predicted motion vector of the current block (the predicted motion vector is the optimal MV), and the prediction block of the current block is constructed based on the predicted motion vector (the optimal MV) (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block). In another possible embodiment, after the reference motion vector is obtained by template matching, the reference motion vector may also be used directly as the predicted motion vector of the current block, and the prediction block of the current block is constructed based on the predicted motion vector.
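A minimal sketch of the decoder-side Merge flow just described follows, reusing the refine_candidate_mv() helper from the earlier template-matching sketch. The candidate list and the parsed index are taken as inputs, since list construction and bitstream parsing are outside the scope of this sketch; bounds checks in the motion compensation are also omitted.

```python
# Illustrative sketch only: decoder-side Merge flow with template-matching refinement.
import numpy as np

def motion_compensate(ref_pic, x, y, w, h, mv):
    """Copy the w x h block that mv points to in the reference picture."""
    return ref_pic[y + mv[1]: y + mv[1] + h, x + mv[0]: x + mv[0] + w].copy()

def decode_merge_block(merge_list, merge_idx, cur_pic, ref_pic, x, y, w, h):
    """Refine only the candidate MV signalled by the Merge index, then predict.
    In unidirectional prediction the prediction block is the reconstructed block."""
    cand_mv = merge_list[merge_idx]
    best_mv = refine_candidate_mv(cur_pic, ref_pic, x, y, w, h, cand_mv)
    merge_list[merge_idx] = best_mv   # optionally keep the candidate list consistent
    return motion_compensate(ref_pic, x, y, w, h, best_mv)
```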
It should be noted that, in specific embodiments of the present invention, when the encoder side updates the Merge candidate list by template matching, it may update one candidate motion vector in the candidate list, multiple candidate motion vectors in the candidate list, or all candidate motion vectors in the candidate list. On the decoder side, however, only the candidate MV indicated by the index value of the Merge candidate list needs to be updated by template matching, which improves the decoding efficiency of the decoder side. Referring to FIG. 11, FIG. 11 shows another specific encoding process of the Merge mode in this embodiment of the present invention. As shown in FIG. 11, the encoder side updates multiple candidate MVs (including candidate MV1 and candidate MV2) in the candidate list, determines, based on the updated candidate list, reference motion vector 2 as the predicted motion vector (optimal MV) of the current block, constructs the prediction block of the current block based on reference motion vector 2, and encodes the candidate list index corresponding to reference motion vector 2. On the decoder side, only the candidate MV2 indicated by the received candidate list index needs to be updated by template matching to obtain reference motion vector 2, and the prediction block of the current block is constructed based on reference motion vector 2.

It should also be noted that, in possible embodiments of the present invention, if the predicted motion vector obtained by the encoder side based on the updated candidate list is not one of the updated candidate motion vectors, then on the decoder side, after the index of the candidate list is received, the candidate motion vector in the candidate list may be selected directly according to the index of the candidate list as the predicted motion vector on the decoder side, without updating that candidate motion vector.

The following describes the encoding and decoding process of obtaining the reconstructed block of the current block based on template matching in the AMVP mode according to the embodiments of the present invention. The process may be divided into an encoding process and a decoding process.
Referring to FIG. 12, FIG. 12 shows a specific encoding process of the AMVP mode in this embodiment of the present invention. As shown in FIG. 12, on the encoder side, a candidate list of the AMVP mode (that is, an AMVP candidate list) is constructed, and the candidate list includes MVP1, MVP2, and so on. MVP2 in the candidate list is selected and updated by template matching (for the specific process, refer to the foregoing description) to obtain a reference motion vector, and the reference motion vector is used as a new candidate motion vector to replace MVP2 in the candidate list, thereby updating the candidate list; the updated candidate list includes MVP1 and the reference motion vector. Then, for the updated candidate list, the predicted motion vector is obtained in the conventional manner (the predicted motion vector here is the optimal MVP). That is, an optimal MVP is selected from the AMVP candidate list, the starting point of the search in the reference image is determined according to the optimal MVP, a search is then performed in a specific manner within a specific range near the search starting point and rate-distortion costs are calculated, and finally an optimal MV is obtained. The optimal MV determines the position of the prediction block of the current block in the reference image (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block). The motion vector difference MVD is obtained from the difference between the optimal MV and the optimal MVP, the index value of the optimal MVP in the AMVP candidate list is encoded, and the index of the reference image is encoded. For example, if the optimal MVP is the reference motion vector (that is, the reference motion vector is the predicted motion vector), the prediction block of the current block is constructed based on the reference motion vector, the index value corresponding to the reference motion vector is encoded, the MVD is encoded, and the index of the reference image corresponding to the reference motion vector is encoded. Then, the encoder side sends the encoded index value of the AMVP candidate list, the MVD, and the index of the reference image to the decoder side in the bitstream.

On the decoder side, on the one hand, the decoder constructs the AMVP candidate list based on the same rule as the encoder side, and the candidate list includes MVP1, MVP2, and so on; on the other hand, the decoder parses the bitstream sent from the encoder side to obtain the index value of the AMVP candidate list, the MVD, and the index of the reference image, and updates the candidate motion vector indicated by the index value by template matching. For example, if the index value indicates MVP2 in the candidate list, MVP2 is updated by template matching (for the specific process, refer to the foregoing description) to obtain a reference motion vector. After that, in a possible embodiment, the reference motion vector is used as a new candidate motion vector to replace MVP2 in the candidate list, thereby updating the candidate list; the updated candidate list includes MVP1 and the reference motion vector. Then, for the updated candidate list, the reference motion vector directly determined based on the index value is used as the optimal MVP of the current block (the optimal MVP is the predicted motion vector of the current block); the optimal MVP is then combined with the MVD to obtain the optimal MV of the current block, the reference image corresponding to the optimal MV is determined based on the index of the reference image, and the prediction block of the current block is constructed (in unidirectional prediction, the prediction block of the current block is the reconstructed block of the current block). In another possible embodiment, after the reference motion vector is obtained by template matching, the reference motion vector may also be used directly as the optimal MV (predicted motion vector) of the current block, and the prediction block of the current block is finally constructed based on the predicted motion vector.
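In the AMVP flow above, the arithmetic linking the signalled syntax to the final motion vector is MVD = optimal MV - optimal MVP on the encoder side and optimal MV = optimal MVP + MVD on the decoder side. A minimal sketch follows, assuming two-component integer motion vectors:

```python
# Illustrative sketch only: the MVD relation used in the AMVP mode.
def encode_mvd(optimal_mv, optimal_mvp):
    """Encoder side: MVD = optimal MV - optimal MVP (per component)."""
    return (optimal_mv[0] - optimal_mvp[0], optimal_mv[1] - optimal_mvp[1])

def reconstruct_mv(optimal_mvp, mvd):
    """Decoder side: optimal MV = optimal MVP + MVD (per component)."""
    return (optimal_mvp[0] + mvd[0], optimal_mvp[1] + mvd[1])

# Example: an MVP refined by template matching and an MV found by the encoder search.
mvp = (4, -2)
mv = (6, -1)
assert reconstruct_mv(mvp, encode_mvd(mv, mvp)) == mv
```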
Similarly, in specific embodiments of the present invention, when the encoder side updates the AMVP candidate list by template matching, it may update one candidate motion vector in the candidate list, multiple candidate motion vectors in the candidate list, or all candidate motion vectors in the candidate list. On the decoder side, however, only the candidate MV indicated by the index value of the AMVP candidate list needs to be updated by template matching, which improves the decoding efficiency of the decoder side. Referring to FIG. 13, FIG. 13 shows another specific encoding process of the AMVP mode in this embodiment of the present invention. As shown in FIG. 13, the encoder side updates multiple MVPs (including MVP1 and MVP2) in the candidate list, determines, based on the updated candidate list, reference motion vector 2 as the predicted motion vector (optimal MVP) of the current block, finally constructs the prediction block of the current block based on reference motion vector 2, and encodes the candidate list index corresponding to reference motion vector 2, the MVD, and the index of the reference image. On the decoder side, only the MVP2 indicated by the received candidate list index needs to be updated by template matching to obtain reference motion vector 2, and the prediction block of the current block is finally constructed based on reference motion vector 2, the MVD, and the reference image.

Similarly, in possible embodiments of the present invention, if the predicted motion vector obtained by the encoder side based on the updated candidate list is not one of the updated candidate motion vectors, then on the decoder side, after the index of the candidate list is received, the candidate motion vector in the candidate list may be selected directly according to the index of the candidate list as the predicted motion vector on the decoder side, without updating that candidate motion vector.

The following describes how, in the decoding process (for example, the entropy decoding process), the candidate list is constructed based on the decoded information and the candidate motion vector is selected based on the decoded information.

In possible embodiments of the present invention, the decoder side may determine, according to the decoded identification information of the candidate list and/or the decoded index value of the candidate list, which prediction mode is used in the current encoding/decoding, and/or which candidate motion vector in the candidate list is selected.
In a possible implementation, the decoder side may obtain identification information (for example, a flag in the bitstream) and the index value of the candidate list by parsing the bitstream. The identification information is used to indicate whether the candidate motion vector set is constructed based on the Merge mode or the AMVP mode; for example, a flag value of 0 indicates the Merge mode, and a flag value of 1 indicates the AMVP mode. The index value of the candidate list is used to indicate a specific candidate motion vector in the candidate list; for example, an index value of 0 indicates the first candidate motion vector in the candidate list, an index value of 1 indicates the second candidate motion vector in the candidate list, and so on. It can be understood that, in this case, if the combination {flag 0, index value 1} is obtained by decoding, the prediction mode used for decoding the current block is the Merge mode, and the second candidate MV in the Merge candidate list is selected in the subsequent decoding steps.

In a possible implementation, the identification information (for example, a flag in the bitstream) may be used both to indicate the prediction mode used for decoding the current block and to indicate the index value of the candidate list constructed based on that prediction mode. Specifically, when the identification information is used to indicate that the prediction mode is the Merge mode, it is also used to indicate the index information of the Merge mode; or, when the identification information is used to indicate that the prediction mode is the AMVP mode, it is also used to indicate the index information of the AMVP mode. For example, when the flag value is 0 or 1, it indicates the first MVP or the second MVP of the candidate list of the AMVP mode; when the flag value is 2, 3, 4, 5, or 6, it indicates the first, second, third, fourth, or fifth candidate MV of the candidate list of the Merge mode, respectively.

In a possible implementation, the identification information (for example, a flag in the bitstream) is used to indicate the prediction mode used for decoding the current block. In addition, when the prediction mode indicated by the identification information is the AMVP mode, the identification information also indicates the index value of the AMVP candidate list; when the prediction mode indicated by the identification information is the Merge mode, the decoder side uses a preset index value of the Merge mode. For example, when the flag value is 0 or 1, it indicates the first MVP or the second MVP of the candidate list of the AMVP mode; when the flag value is a specific value other than 0 and 1 (for example, 2), it indicates the Merge mode. In this case, in the subsequent decoding steps, the decoder side directly selects the first candidate MV in the candidate list established in the Merge mode (or a candidate MV at any other preset index position).
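A minimal sketch of the combined mode/index mapping described above follows. The numeric mapping (0 or 1 selecting an AMVP MVP, 2 to 6 selecting a Merge candidate) follows the example values in the text; as noted above, any other defined mapping could be used instead.

```python
# Illustrative sketch only: decoding a flag that jointly signals mode and index.
def parse_mode_and_index(flag):
    if flag in (0, 1):
        return "AMVP", flag            # first or second MVP of the AMVP candidate list
    if flag in (2, 3, 4, 5, 6):
        return "Merge", flag - 2       # first to fifth candidate MV of the Merge candidate list
    raise ValueError("flag value outside the defined mapping")

assert parse_mode_and_index(1) == ("AMVP", 1)
assert parse_mode_and_index(4) == ("Merge", 2)
```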
In a possible implementation, in bidirectional prediction, if both the encoder side and the decoder side use a hybrid prediction mode, at least two pieces of identification information may be obtained by decoding: one piece of identification information is used to indicate that the Merge mode is used in one direction, and the other piece of identification information is used to indicate that the AMVP mode is used in the other direction. Alternatively, one piece of identification information is used to indicate that the Merge mode is used in one direction and to indicate the index value of the Merge candidate list, and the other piece of identification information is used to indicate that the AMVP mode is used in the other direction and to indicate the index value of the AMVP candidate list.

In a possible implementation, in bidirectional prediction, if both the encoder side and the decoder side use a hybrid prediction mode (that is, the Merge mode and the AMVP mode are used in different directions), at least two combinations {first identification information, first index value} and {second identification information, second index value} may be obtained by decoding. In {first identification information, first index value}, the first identification information indicates the prediction mode in the first direction, and the first index value indicates the index value of the candidate list in the first direction; in {second identification information, second index value}, the second identification information indicates the prediction mode in the second direction, and the second index value indicates the index value of the candidate list in the second direction.

In a possible implementation, the decoder side may also obtain the index value of the candidate list by parsing the bitstream. The index value of the candidate list may be used both to indicate the prediction mode used for decoding the current block and to indicate the specific candidate motion vector in the candidate list constructed based on that prediction mode. For example, when the index value is 0 or 1, it indicates the first MVP or the second MVP of the candidate list of the AMVP mode; when the index value is 2, 3, 4, 5, or 6, it indicates the first, second, third, fourth, or fifth candidate MV of the candidate list of the Merge mode, respectively. In a possible implementation, in bidirectional prediction, if both the encoder side and the decoder side use a hybrid prediction mode, at least two index values may be obtained by decoding, where one index value is used to indicate that the Merge mode is used in one direction and to indicate the specific candidate motion vector in the Merge candidate list, and the other index value is used to indicate that the AMVP mode is used in the other direction and to indicate the specific candidate motion vector in the AMVP candidate list.

It should be noted that the foregoing specific implementations are merely reference examples and do not constitute limitations on the embodiments of the present invention.
Based on the unidirectional prediction/bidirectional prediction described above, the constructed candidate list, the method for updating candidate motion vectors and the candidate list by template matching, and the template-matching-based encoding and decoding process, the following describes the predicted motion vector generation method provided in the embodiments of the present invention.

Referring to FIG. 14, FIG. 14 is a schematic flowchart of a predicted motion vector generation method according to an embodiment of the present invention. The method includes, but is not limited to, the following steps:

Step S101: Construct a candidate motion vector set of the current block. The candidate motion vector set may include multiple candidate motion vectors of the current block; a candidate motion vector may also be referred to as a candidate predicted motion vector. In possible embodiments, the set may further include reference image information corresponding to the multiple candidate motion vectors of the current block. Specifically, the candidate motion vector set is a Merge candidate list constructed based on the Merge mode, or an AMVP candidate list constructed based on the AMVP mode.
Specifically, for the encoder side, the encoder side may construct the candidate motion vector set of the current block according to a preset rule (for example, in the conventional manner). For example, the motion vector of a spatially neighboring block of the current block, of the temporal reference block corresponding to the current block, or of the temporal reference block corresponding to a neighboring block of the current block is used as a candidate motion vector, and the candidate list is constructed from these candidate motion vectors based on the preset Merge mode or AMVP mode.

On the decoder side, the prediction mode used in the current encoding/decoding may be determined according to the decoded identification information or index value, and the corresponding candidate list is then constructed.

For example, the identification information is used to indicate the prediction mode used for decoding the current block; in addition, when the prediction mode indicated by the identification information is the AMVP mode, it also indicates the index value of the AMVP candidate list, and when the prediction mode indicated by the identification information is the Merge mode, the decoder side uses a preset index value of the Merge mode.

For example, the identification information may be used both to indicate the prediction mode used for decoding the current block and to indicate the index value of the candidate list constructed based on that prediction mode.

For example, the decoder side may obtain the identification information and the index value of the candidate list by parsing the bitstream. The identification information is used to indicate whether the candidate motion vector set is constructed based on the Merge mode or the AMVP mode, and the index value of the candidate list is used to indicate a specific candidate motion vector in the candidate list.

For example, the decoder side may also obtain the index value of the candidate list by parsing the bitstream. The index value of the candidate list may be used both to indicate the prediction mode used for decoding the current block and to indicate the specific candidate motion vector in the candidate list constructed based on that prediction mode.

For the related implementations of the foregoing embodiments and the implementations of other possible embodiments, reference may also be made to the foregoing related descriptions; details are not described one by one here.
Step S102: Select a candidate motion vector in the candidate motion vector set, and determine at least one reference motion vector based on the candidate motion vector, where a reference motion vector is used to determine a reference block of the current block in the reference image of the current block.

Specifically, determining the reference block of the current block in the reference image according to the candidate motion vector includes: determining, in combination with the candidate motion vector value, a search range in the reference image, and searching the neighboring positions of the determined reference block with a target precision to obtain candidate reference blocks, where each candidate reference block corresponds to one reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
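A minimal sketch of enumerating search positions around a candidate motion vector at one of the listed target precisions follows. It assumes motion vectors are stored in 1/8-pixel units and uses an arbitrary square window; both are assumptions made for this example, not requirements of the embodiments.

```python
# Illustrative sketch only: search positions around a candidate MV at a target precision.
PRECISION_IN_EIGHTH_PEL = {
    "4pel": 32, "2pel": 16, "1pel": 8, "halfpel": 4, "quarterpel": 2, "eighthpel": 1,
}

def search_positions(cand_mv, precision, radius_steps=2):
    """Yield reference MVs spaced `precision` apart around `cand_mv` (1/8-pel units)."""
    step = PRECISION_IN_EIGHTH_PEL[precision]
    for dy in range(-radius_steps, radius_steps + 1):
        for dx in range(-radius_steps, radius_steps + 1):
            yield (cand_mv[0] + dx * step, cand_mv[1] + dy * step)

# Example: half-pixel search around a candidate MV of (16, -8) given in 1/8-pel units.
positions = list(search_positions((16, -8), "halfpel"))
```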
Step S103: Separately calculate the pixel difference value or rate-distortion cost between at least one neighboring reconstructed block of at least one of the determined reference blocks and at least one neighboring reconstructed block of the current block. Specifically, in this embodiment of the present invention, the candidate motion vector is updated by template matching; for the specific implementation, refer to the foregoing related descriptions, and details are not described here again.

Step S104: Obtain the predicted motion vector of the current block according to the reference motion vector, among the at least one reference motion vector, whose corresponding pixel difference value or rate-distortion cost is minimal.

In a specific embodiment, the reference motion vector whose corresponding pixel difference value or rate-distortion cost is minimal may be used to replace the candidate motion vector in the candidate motion vector set, and the predicted motion vector of the current block is obtained from the candidate motion vector set containing the replaced candidate motion vector. For the Merge mode, the predicted motion vector is the actual motion vector (actual MV) of the current block, and the encoder side or the decoder side may obtain the prediction block of the current block according to the predicted motion vector (in unidirectional prediction, the prediction block is the final reconstructed block; in bidirectional prediction, the prediction block is used to obtain the reconstructed block of the current block based on a preset algorithm). For the AMVP mode, the predicted motion vector is the optimal MVP of the current block, and the encoder side or the decoder side may obtain the actual MV according to the predicted motion vector and then obtain the prediction block of the current block (in unidirectional prediction, the prediction block is the final reconstructed block; in bidirectional prediction, the prediction block is used to obtain the reconstructed block of the current block based on a preset algorithm).

In a bidirectional prediction scenario of this embodiment of the present invention, after the candidate motion vector for forward (or backward) prediction is updated (for example, updated by template matching), the candidate motion vector for backward (or forward) prediction may also be updated directly based on the update information of the forward (or backward) prediction. The specific process includes: calculating the difference between the candidate motion vector after replacement in the candidate list for forward (or backward) prediction and the candidate motion vector before replacement; combining the difference with the candidate motion vector for backward (or forward) prediction to obtain a new candidate motion vector for backward (or forward) prediction; and replacing the original candidate motion vector for backward (or forward) prediction with the new candidate motion vector, so that the candidate motion vector for backward (or forward) prediction is updated.
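A minimal sketch of this difference-propagation step follows, assuming two-component integer motion vectors and assuming the backward candidate is shifted by the same unscaled difference; how the difference is combined (for example, whether it is scaled or mirrored) would depend on the specific embodiment and is not specified above.

```python
# Illustrative sketch only: propagating a forward-direction update to the backward candidate.
def propagate_update(fwd_before, fwd_after, bwd_before):
    """diff = refined forward MV - original forward MV; new backward = old backward + diff."""
    diff = (fwd_after[0] - fwd_before[0], fwd_after[1] - fwd_before[1])
    return (bwd_before[0] + diff[0], bwd_before[1] + diff[1])

# Example: the forward candidate is refined from (10, 4) to (11, 3);
# the backward candidate (-9, -5) is shifted by the same difference.
new_bwd = propagate_update((10, 4), (11, 3), (-9, -5))   # -> (-8, -6)
```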
It can be learned that, in the embodiments of the present invention, the video encoding and decoding system can use template matching to verify whether an image block within a certain range in the reference image of the current block (or even within the entire reference image) matches the current block better, and update the candidate motion vectors of the candidate list constructed based on the Merge or AMVP mode. Based on the updated candidate list, the best reference block of the current block can be obtained in the encoding and decoding process, and the best reconstructed block is finally obtained.

The following describes in detail specific implementations of the predicted motion vector generation method provided in the embodiments of the present invention.

Referring to FIG. 15, FIG. 15 shows a predicted motion vector generation method according to an embodiment of the present invention, described from the perspective of the encoder side. The method may be used in unidirectional prediction (forward prediction or backward prediction), and includes, but is not limited to, the following steps:
Step S201: The encoder side constructs an AMVP candidate list or a Merge candidate list.

Specifically, if the constructed candidate list is a candidate list used for unidirectional prediction, reference may be made to the related descriptions of the embodiments in FIG. 4 or FIG. 5; if the constructed candidate list is a candidate list used for bidirectional prediction, reference may be made to the related descriptions of the embodiments in FIG. 6 or FIG. 7. Details are not described here again.

Step S202: The encoder side selects one candidate motion vector (that is, an MVP or a candidate MV) in the constructed AMVP candidate list or Merge candidate list.

Step S203: The encoder side determines a search range in the reference image in combination with the selected candidate motion vector, where the search range includes at least one motion vector value, and obtains at least one image block in the reference image of the current block according to the at least one motion vector value within the search range.

Specifically, a search is performed with a target precision at the neighboring positions of the reference block determined by the candidate motion vector to obtain at least one image block, where each image block corresponds to one reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
步骤S204:编码端分别计算所述当前块的至少一个邻近已重构块和每个图像块的至少一个邻近已重构块之间的像素差异值或率失真代价值,以及计算当前块的至少一个邻近已重构块和候选运动矢量对应的参考块的至少一个邻近已重构块之间的像素差异值或率失真代价值。在这些像素差异值或率失真代价值中,选出像素差异值或率失真代价值最小的一个,将该最小像素差异值或率失真代价值对应的运动矢量作为当前块的候选运动矢量。Step S204: The encoding end respectively calculates a pixel difference value or a rate distortion value between at least one adjacent reconstructed block of the current block and at least one adjacent reconstructed block of each image block, and calculates at least a current block. A pixel difference value or rate distortion cost value between at least one adjacent reconstructed block of a reference block corresponding to the reconstructed block and the candidate motion vector. Among these pixel difference values or rate distortion generation values, one of the pixel difference value or the rate distortion generation value is selected, and the motion vector corresponding to the minimum pixel difference value or the rate distortion cost value is used as the candidate motion vector of the current block.
Step S205: If the motion vector corresponding to the smallest pixel difference value or rate-distortion cost differs from the candidate motion vector selected in step S202, the encoding end uses the motion vector corresponding to the smallest pixel difference value or rate-distortion cost as a new candidate motion vector of the current block and updates the constructed AMVP candidate list or Merge candidate list. Specifically, the motion vector corresponding to the smallest pixel difference value or rate-distortion cost replaces the corresponding candidate motion vector in the AMVP candidate list or Merge candidate list.
It should be noted that for the specific implementation of steps S201-S205, reference may be made to the related descriptions of the embodiments of FIG. 8 and FIG. 9. Details are not repeated here.
It should also be noted that, in this embodiment of the present invention, steps S201-S205 may be performed repeatedly, so that multiple (or even all) candidate motion vectors in the AMVP candidate list or Merge candidate list are updated.
In a possible embodiment of the present invention, if the constructed list is an AMVP candidate list or Merge candidate list for bidirectional prediction, both the forward candidate motion vector and the backward candidate motion vector indicated by the index value are updated during the candidate list update.
Step S206: The encoding end obtains the predicted motion vector of the current block based on the updated AMVP candidate list or Merge candidate list, then obtains the reconstructed block of the current block based on the predicted motion vector of the current block, and sends a bitstream to the decoding end.
Specifically, if the prediction mode used by the encoding end is the AMVP mode, the predicted motion vector (the optimal MVP) is determined based on the updated AMVP candidate list, the reconstructed block of the current block is then obtained, and the motion vector difference (MVD) corresponding to the current block, the index of the reference image corresponding to the reconstructed block, and the index value of the optimal MVP in the AMVP candidate list are encoded. It can be understood that the bitstream sent by the encoding end to the decoding end contains the MVD, the index value of the AMVP candidate list, and the index of the reference image.
Specifically, if the prediction mode used by the encoding end is the Merge mode, the predicted motion vector (the optimal MV) is determined based on the updated Merge candidate list, the reconstructed block of the current block is then obtained, and the index value of the optimal MV in the Merge candidate list is encoded. It can be understood that the bitstream sent by the encoding end to the decoding end contains the index value of the Merge candidate list.
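As a rough illustration of the difference in signaled information between the two modes described above, the following sketch collects the elements the encoder would place into the bitstream in each case; the function name build_inter_signaling and the field names list_index, mvd and ref_idx are illustrative, not normative, and the actual syntax elements and entropy coding are outside the scope of this sketch.

```python
def build_inter_signaling(mode, list_index, mvd=None, ref_idx=None):
    """Collect the fields signaled for one prediction direction (illustrative only)."""
    if mode == "AMVP":
        # AMVP signals the MVP index, the motion vector difference and the
        # index of the reference image.
        return {"mode": "AMVP", "list_index": list_index, "mvd": mvd, "ref_idx": ref_idx}
    # Merge signals only the index into the Merge candidate list.
    return {"mode": "Merge", "list_index": list_index}

amvp_fields = build_inter_signaling("AMVP", list_index=1, mvd=(1, 0), ref_idx=0)
merge_fields = build_inter_signaling("Merge", list_index=2)
```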
It should be noted that, for the implementation of the foregoing steps S201-S206 in this embodiment of the present invention, reference may also be made to the related descriptions of the encoding end in the embodiments of FIG. 10 and FIG. 11 or the embodiments of FIG. 12 and FIG. 13. Details are not repeated here.
It should also be noted that, for related steps of the encoding end that are not described in detail, reference may be made to the foregoing related descriptions. For brevity of the specification, they are not described one by one here.
Referring to FIG. 16, FIG. 16 shows a predicted motion vector generation method provided by an embodiment of the present invention, described from the perspective of the decoding end, where the decoding end may correspond to the encoding end in the embodiment of FIG. 15. The method may be used in unidirectional prediction (forward prediction or backward prediction), and includes but is not limited to the following steps:
Step S301: The decoding end constructs an AMVP candidate list or a Merge candidate list.
It can be understood that, in this embodiment of the present invention, the decoding end uses the same prediction mode as the encoding end. That is, if the encoding end constructs the candidate list based on the AMVP mode, the decoding end also constructs the candidate list based on the AMVP mode and builds it in the same way as the encoding end; if the encoding end constructs the candidate list based on the Merge mode, the decoding end also constructs the candidate list based on the Merge mode and builds it in the same way as the encoding end. Specifically, if the constructed candidate list is a candidate list for unidirectional prediction, reference may be made to the related description of the embodiment of FIG. 4 or FIG. 5; if the constructed candidate list is a candidate list for bidirectional prediction, reference may be made to the related description of the embodiment of FIG. 6 or FIG. 7. Details are not repeated here.
Step S302: The decoding end parses the bitstream to obtain the index value of the AMVP candidate list or Merge candidate list, and based on the index value, selects the candidate motion vector (that is, the MVP or candidate MV) indicated by the index value in the AMVP candidate list or Merge candidate list.
Specifically, if both the encoding end and the decoding end use the AMVP mode, then in the process of parsing the bitstream, the decoding end decodes the index value of the AMVP candidate list, the index of the reference image, and the motion vector difference (MVD), and based on the index value, selects the candidate motion vector indicated by the index value in the AMVP candidate list.
If both the encoding end and the decoding end use the Merge mode, then in the process of decoding the bitstream, the decoding end decodes the index value of the Merge candidate list, and based on the index value, selects the candidate motion vector indicated by the index value in the Merge candidate list.
For example, when the decoded index value is 0 or 1, the 0 or 1 indicates the first or second MVP of the AMVP-mode candidate list; when the decoded index value is 2, 3, 4, 5, or 6, the 2, 3, 4, 5, and 6 respectively indicate the first, second, third, fourth, and fifth candidate MV of the Merge-mode candidate list.
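A minimal sketch of how a decoder could interpret such an index value is given below; the mapping (0 and 1 addressing AMVP MVPs, 2 to 6 addressing Merge candidate MVs) is taken from the example above and is only one possible convention, and the function name interpret_candidate_index is hypothetical.

```python
def interpret_candidate_index(idx):
    """Map a decoded index value to (mode, position in the list) under the
    example convention: 0-1 address AMVP MVPs, 2-6 address Merge candidate MVs."""
    if idx in (0, 1):
        return "AMVP", idx        # first or second MVP of the AMVP candidate list
    if 2 <= idx <= 6:
        return "Merge", idx - 2   # first to fifth candidate MV of the Merge candidate list
    raise ValueError("index value outside the example convention")
```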
Step S303: The decoding end determines a search range in the reference image based on the selected candidate motion vector, where the search range contains at least one motion vector value, and obtains at least one image block in the reference image of the current block according to the at least one motion vector value within the search range.
Specifically, at least one image block is obtained by searching neighboring positions of the reference block determined by the candidate motion vector at a target precision, where each image block corresponds to one reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
Step S304: The pixel difference value or rate-distortion cost between at least one neighboring reconstructed block of the current block and at least one neighboring reconstructed block of each image block is calculated, and the pixel difference value or rate-distortion cost between the at least one neighboring reconstructed block of the current block and at least one neighboring reconstructed block of the reference block corresponding to the candidate motion vector is calculated. Among these pixel difference values or rate-distortion costs, the smallest one is selected, and the motion vector corresponding to the smallest pixel difference value or rate-distortion cost is used as a candidate motion vector of the current block.
Step S305: If the motion vector corresponding to the smallest pixel difference value or rate-distortion cost differs from the candidate motion vector selected in step S302, the decoding end uses the motion vector corresponding to the smallest pixel difference value or rate-distortion cost as a new candidate motion vector of the current block and updates the constructed AMVP candidate list or Merge candidate list. Specifically, the motion vector corresponding to the smallest pixel difference value or rate-distortion cost replaces the corresponding candidate motion vector in the AMVP candidate list or Merge candidate list.
It should be noted that for the specific implementation of steps S301-S305, reference may be made to the related descriptions of the embodiments of FIG. 8 and FIG. 9. Details are not repeated here.
It should also be noted that, in this embodiment of the present invention, steps S301-S305 may be performed repeatedly, so that multiple (or even all) candidate motion vectors in the AMVP candidate list or Merge candidate list are updated.
In a possible embodiment of the present invention, if the constructed list is an AMVP candidate list or Merge candidate list for bidirectional prediction, both the forward candidate motion vector and the backward candidate motion vector indicated by the index value are updated during the candidate list update.
Step S306: Based on the index value of the candidate list, the decoding end determines the candidate motion vector indicated by the index value in the updated AMVP candidate list or Merge candidate list as the predicted motion vector, and then obtains the reconstructed block of the current block based on the predicted motion vector of the current block.
For example, this embodiment of the present invention uses the Merge mode, and the Merge index value obtained by the decoding end is 0. The decoding end then constructs the Merge candidate list. After the construction is complete, if the candidate motion vector corresponding to Merge candidate list index value 0, selected for the current block in the Merge mode, is unidirectionally predicted and has the value (3, 5), the reference image block (reference block) in the reference image is obtained according to the selected candidate motion vector. Taking the reference block as the center and searching within a surrounding 1-pixel area at integer-pixel precision, several reference image blocks (image blocks) of the same size as the reference block are obtained. Taking the left-neighboring and/or above-neighboring reconstructed blocks of the current block as the template, the template is matched against the left-neighboring and/or above-neighboring reconstructed blocks of the reference block and of each of the image blocks, and the block with the smallest rate-distortion cost is taken as the updated reference image block. The motion vector (the new candidate MV) corresponding to the updated reference image block is (4, 5); that is, the predicted motion vector of the current block is (4, 5). Based on (4, 5), the current block is reconstructed using the updated reference image block, thereby obtaining the reconstructed block of the current block.
As another example, this embodiment of the present invention uses the AMVP mode, and the decoding end decodes the index value of the candidate list. The decoding end constructs the AMVP candidate list. After the construction is complete, if the candidate motion vector corresponding to the index value, selected for the current block in the AMVP mode, is unidirectionally predicted and has the value (3, 5), the reference image block (reference block) in the reference image is obtained according to the selected candidate motion vector. Taking the reference block as the center and searching within a surrounding 1-pixel area at integer-pixel precision, several reference image blocks (image blocks) of the same size as the reference block are obtained. Taking the left-neighboring and/or above-neighboring reconstructed blocks of the current block as the template, the template is matched against the left-neighboring and/or above-neighboring reconstructed blocks of the reference block and of each of the image blocks, and the block with the smallest rate-distortion cost is taken as the updated reference image block. The motion vector (the new MVP) corresponding to the updated reference image block is (4, 5); that is, the predicted motion vector of the current block is (4, 5). Based on (4, 5), the current block is reconstructed using the updated reference image block, thereby obtaining the reconstructed block of the current block.
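The two examples above can be reproduced with a sketch like the one given after step S204: starting from the candidate motion vector (3, 5) and searching a surrounding 1-pixel area at integer-pixel precision yields nine candidate positions (the reference block itself plus its eight integer-pel neighbors), and the position with the smallest template cost, assumed here to be (4, 5), becomes the updated motion vector. A minimal, purely illustrative enumeration of the searched positions is:

```python
candidate_mv = (3, 5)
search_radius, step = 1, 1  # 1-pixel area, integer-pixel precision
searched_mvs = [(candidate_mv[0] + dx * step, candidate_mv[1] + dy * step)
                for dy in range(-search_radius, search_radius + 1)
                for dx in range(-search_radius, search_radius + 1)]
# searched_mvs contains (3, 5) itself and its eight integer-pel neighbours,
# including (4, 5); the template cost of each position decides which one is kept.
print(searched_mvs)
```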
In a possible embodiment of the present invention, the decoding end may also skip the update of the candidate list and directly use the motion vector corresponding to the smallest pixel difference value or rate-distortion cost as the predicted motion vector, and then obtain the reconstructed block of the current block based on the predicted motion vector of the current block.
It should be noted that, for the implementation of the foregoing steps S301-S306 in this embodiment of the present invention, reference may also be made to the related descriptions of the decoding end in the embodiments of FIG. 10 and FIG. 11 or the embodiments of FIG. 12 and FIG. 13. Details are not repeated here.
It should also be noted that, for related steps of the decoding end that are not described in detail, reference may be made to the descriptions of the corresponding steps at the encoding end and to the foregoing related descriptions. For brevity of the specification, they are not described one by one here.
Referring to FIG. 17, a hybrid prediction mode is introduced in FIG. 17, and the hybrid prediction mode is applied to bidirectional prediction. The hybrid prediction mode means that the Merge mode and the AMVP mode are used simultaneously in the bidirectional prediction process of encoding and decoding. Specifically, in the bidirectional prediction, prediction in one direction is coded in the Merge mode, and prediction in the other direction uses the AMVP mode. It can be understood that when prediction in one direction is referred to as forward prediction for short, prediction in the other direction may correspondingly be referred to as backward prediction for short. The following describes, based on the hybrid prediction mode, a predicted motion vector generation method provided by an embodiment of the present invention from the perspective of the encoding end. In this method, the process of obtaining the first prediction block of the current block by coding the prediction in one direction in the Merge mode includes steps S401-S404, and the process of obtaining the second prediction block of the current block by coding the prediction in the other direction in the AMVP mode includes steps S405-S406; finally, in step S407, the reconstructed block of the current block is obtained from the first prediction block of the current block and the second prediction block of the current block by a preset algorithm. These steps are described in detail below.
Steps S401-S404 are described first. In steps S401-S404, the Merge mode updates the candidate list by template matching in the forward (or backward, the same below) prediction to obtain the first prediction block, as follows:
Step S401: The forward (or backward) prediction at the encoding end uses the Merge mode, and the encoding end constructs a Merge candidate list based on the Merge mode, where the Merge candidate list is used for the forward (or backward) prediction in the bidirectional prediction. For details, reference may be made to the description of the embodiment of FIG. 5, which is not repeated here.
Step S402: The encoding end selects a candidate motion vector from the Merge candidate list.
In a specific application scenario, if the selected candidate motion vector is obtained by backward (or forward) prediction, a candidate motion vector for forward (backward) prediction may be obtained by mapping and is applied to the subsequent step S403 as the selected candidate motion vector value. For example, the encoding end selects the candidate motion vector (-2, -2) from the Merge candidate list, where (-2, -2) is obtained based on backward prediction; the current block corresponds to picture order count 4 in the image frame sequence, the reference block of the backward prediction corresponds to picture order count 6, and the reference block of the forward prediction corresponds to picture order count 3. Then, the temporal difference between the current block and the reference block of the forward prediction (referred to as the second temporal difference for short) is 1 (that is, 4 - 3 = 1), and the temporal difference between the current block and the reference block of the backward prediction (referred to as the first temporal difference for short) is -2 (that is, 4 - 6 = -2). Therefore, according to the ratio between the first temporal difference and the second temporal difference (-2 / 1 = -2), the (-2, -2) used for backward prediction can be mapped to the candidate motion vector (1, 1) used for forward prediction, that is, (-2, -2) / -2 = (1, 1). The (1, 1) is applied to the subsequent step S403 as the selected candidate motion vector value.
In addition, in a possible embodiment, if the selected candidate motion vector is obtained by backward (or forward) prediction, the selection of that candidate motion vector may instead be aborted; that is, another candidate motion vector is selected from the Merge candidate list.
In a specific application scenario, if the selected candidate motion vector is obtained by bidirectional prediction, the part of the candidate motion vector used for forward (or backward) prediction is selected and applied to the subsequent step S403 as the selected candidate motion vector value. For example, in the forward (or backward) prediction, the candidate motion vector preselected by the encoding end from the Merge candidate list includes the motion vector (1, 1) used for forward (backward) prediction and the motion vector (-2, -2) used for backward (forward) prediction; the encoding end then finally uses the forward (backward) predicted motion vector (1, 1) as the selected candidate motion vector value applied to the subsequent step S403.
In a specific application scenario, if the selected candidate motion vector is itself obtained by forward (backward) prediction, that candidate motion vector is directly applied to the subsequent step S403 as the selected candidate motion vector value.
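The three scenarios above can be summarized, purely as an illustrative sketch and not as the method itself, by the following function. It assumes that each candidate carries its per-direction motion vectors together with the picture order count (POC) of the reference picture each vector points to; the function name select_mv_for_direction, the keys 'fwd'/'bwd' and the field poc_ref are hypothetical, and the alternative of aborting the selection and choosing another candidate is not modeled.

```python
def select_mv_for_direction(cand, direction, poc_cur, poc_ref_dst):
    """Pick the motion vector fed to the template-matching step for one
    prediction direction ('fwd' or 'bwd'). cand maps a direction to a dict
    holding the MV and the POC of the reference picture that MV points to."""
    if direction in cand:                 # same-direction (or bi-predicted) candidate:
        return cand[direction]["mv"]      # use that direction's MV directly
    other = "bwd" if direction == "fwd" else "fwd"
    if other in cand:                     # opposite-direction candidate: map it by the
        d_src = poc_cur - cand[other]["poc_ref"]   # ratio of temporal differences
        d_dst = poc_cur - poc_ref_dst
        scale = d_dst / d_src
        mv = cand[other]["mv"]
        return (round(mv[0] * scale), round(mv[1] * scale))
    raise ValueError("candidate carries no motion vector")

# Reproduces the example in the text: a candidate (-2, -2) obtained by backward
# prediction whose reference picture has POC 6, with current POC 4 and forward
# reference POC 3, maps to the forward candidate motion vector (1, 1).
assert select_mv_for_direction({"bwd": {"mv": (-2, -2), "poc_ref": 6}},
                               "fwd", poc_cur=4, poc_ref_dst=3) == (1, 1)
```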
Step S403: The encoding end updates the candidate motion vector by template matching.
The specific process includes: the encoding end takes the selected candidate motion vector of the Merge candidate list as input, where the candidate motion vector corresponds to a reference block in the reference image; a search range is determined based on the candidate motion vector, where the search range contains at least one reference motion vector; image blocks of the forward (backward) prediction of the current block in the reference image are obtained according to the reference motion vectors within the search range; the pixel difference value or rate-distortion cost between at least one neighboring reconstructed block of the current block and at least one neighboring reconstructed block of each image block is calculated, and the pixel difference value or rate-distortion cost between the at least one neighboring reconstructed block of the current block and at least one neighboring reconstructed block of the reference block corresponding to the candidate motion vector is calculated. Among these pixel difference values or rate-distortion costs, the smallest one is selected, and the motion vector corresponding to the smallest pixel difference value or rate-distortion cost is used as the candidate motion vector of the current block, thereby updating the candidate motion vector.
For the specific implementation, reference may also be made to the descriptions of the embodiments of FIG. 8 and FIG. 9, which are not repeated here.
Step S404: The encoding end may replace the corresponding candidate motion vector in the Merge candidate list with the motion vector corresponding to the smallest pixel difference value or rate-distortion cost, thereby updating the Merge candidate list. Based on the updated Merge candidate list, the forward (backward) predicted motion vector of the current block (also referred to as the first predicted motion vector) is obtained, the forward (backward) prediction block (also referred to as the first prediction block) is obtained based on the forward (backward) predicted motion vector, and the index value of the predicted motion vector in the Merge candidate list is encoded. For the specific process, reference may be made to the related descriptions of the encoding end in the embodiments of FIG. 10 and FIG. 11, which are not repeated here.
Next, steps S405-S406 are described. The AMVP mode directly uses the constructed candidate list in the backward (or forward, the same below) prediction to obtain the second prediction block, as follows:
Step S405: The encoding end constructs an AMVP candidate list based on the AMVP mode, where the AMVP candidate list is used for the backward (or forward) prediction in the bidirectional prediction. For details, reference may be made to the description of the embodiment of FIG. 6, which is not repeated here.
Step S406: Based on the AMVP candidate list, the encoding end obtains the backward (forward) prediction block of the current block (also referred to as the second prediction block) in the conventional AMVP manner.
The specific process includes: the encoding end directly selects an optimal MVP from the AMVP candidate list, determines the starting point of the search in the reference image according to the optimal MVP, then searches in a specific manner within a specific range around the search starting point and calculates the rate-distortion cost, and finally obtains an optimal MV; the optimal MV determines the position of the second prediction block; the motion vector difference MVD is obtained as the difference between the optimal MV and the optimal MVP; and the index value of the optimal MVP in the AMVP candidate list and the index of the reference image are encoded.
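Purely as an illustrative sketch of this conventional AMVP flow at the encoding end, the following fragment assumes a caller-supplied cost function block_cost(mv) (for example, the rate-distortion cost of predicting the current block with motion vector mv) and a simple square search window around the starting point; the function and parameter names are hypothetical, and a real encoder may use a different search pattern.

```python
def amvp_encode_direction(mvp_list, block_cost, search_radius=4):
    """Select the optimal MVP, search around it, and return the elements that
    would be signaled: the MVP index, the MVD and the optimal MV."""
    # 1. Select the MVP that gives the lowest cost as the search starting point.
    best_idx = min(range(len(mvp_list)), key=lambda i: block_cost(mvp_list[i]))
    mvp = mvp_list[best_idx]
    # 2. Search a specific range around the starting point and keep the motion
    #    vector with the minimum cost (the optimal MV).
    best_mv, best_cost = mvp, block_cost(mvp)
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            mv = (mvp[0] + dx, mvp[1] + dy)
            cost = block_cost(mv)
            if cost < best_cost:
                best_mv, best_cost = mv, cost
    # 3. The MVD is the difference between the optimal MV and the optimal MVP.
    mvd = (best_mv[0] - mvp[0], best_mv[1] - mvp[1])
    return best_idx, mvd, best_mv
```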
Finally, in step S407, the encoding end obtains the reconstructed block and sends the bitstream to the decoding end. Specifically, after the forward (backward) first prediction block and the backward (forward) second prediction block are obtained, the encoding end processes the first prediction block and the second prediction block based on a preset algorithm to obtain the reconstructed block of the current block, for example, by performing a weighted averaging of the first prediction block and the second prediction block. The encoding end then sends the bitstream to the decoding end, where the bitstream correspondingly contains the index value of the Merge candidate list, the index value of the AMVP candidate list, the MVD, the index of the reference image, and so on.
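One possible preset algorithm, shown here only as a sketch under the assumptions of 8-bit samples and equal weights of 1/2 for the two directions (the function name combine_predictions is hypothetical), combines the two prediction blocks sample by sample:

```python
import numpy as np

def combine_predictions(pred_first, pred_second, w_first=0.5, w_second=0.5):
    """Weighted average of the first and second prediction blocks; with the
    default weights this is a plain bi-prediction average of the two blocks."""
    out = w_first * pred_first.astype(np.float64) + w_second * pred_second.astype(np.float64)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)  # assumes 8-bit samples
```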
It should be noted that there is no fixed order between steps S401-S404 and steps S405-S406; that is, S401-S404 may be performed before or after S405-S406, and S401-S404 and S405-S406 may also be performed at the same time.
It should also be noted that, in another embodiment of the present invention, S401-S404 may be replaced with the AMVP mode updating the candidate list by template matching in the forward (backward) prediction to obtain the first prediction block, and S405-S406 may be replaced with the Merge mode directly using the constructed candidate list in the backward (forward) prediction to obtain the second prediction block. For the specific implementation, reference may be made to the foregoing descriptions, which is not repeated here.
It should also be noted that, for related steps of the encoding end that are not described in detail, reference may be made to the foregoing related descriptions. For brevity of the specification, they are not described one by one here.
Referring to FIG. 18, FIG. 18 shows a predicted motion vector generation method based on a hybrid prediction mode, applied to bidirectional prediction and described from the perspective of the decoding end, where the decoding end may correspond to the encoding end in FIG. 17.
In the entropy decoding process, the decoding end parses the bitstream to obtain the index value of the candidate list, and then determines whether to perform the following steps S501-S504 or steps S505-S506 according to whether the index value is a specific value. In a specific embodiment, if the index value is a specific value, the index value indicates that the forward (backward) prediction at the decoding end uses the Merge mode, and the decoding end then performs steps S501-S504; if the index value is not a specific value, the index value indicates that the AMVP mode is used, and the decoding end additionally decodes the index of the reference image and the motion vector difference (MVD) for that prediction direction and then performs steps S505-S506. For example, in a possible application scenario, if the index value is 0 or 1, the index value indicates that the backward prediction uses the AMVP mode (specifically, it indicates the first or second MVP in the candidate list constructed in the AMVP mode), and the decoding end additionally decodes the index of the reference image of the backward prediction and the MVD and then performs the subsequent steps; if the index value is the specific value 2, the index value indicates that the forward prediction uses the Merge mode (specifically, it indicates a specific candidate MV, such as the first candidate MV, in the candidate list constructed in the Merge mode), and the subsequent steps are then performed. The above examples are merely intended to explain the solution of the present invention and are not limiting. These steps are described in detail below.
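Under the example convention of the preceding paragraph (index values 0 or 1 selecting the AMVP branch and the specific value 2 selecting the Merge branch), the dispatch could be sketched as follows; the function name dispatch_on_index is hypothetical, and ref_idx and mvd stand for the additionally decoded reference image index and motion vector difference of the AMVP branch.

```python
def dispatch_on_index(idx, ref_idx=None, mvd=None):
    """Decide which branch of the hybrid mode a decoded index value selects,
    under the example convention (0/1 -> AMVP MVP index, 2 -> first Merge candidate)."""
    if idx == 2:                                   # specific value: Merge branch (S501-S504)
        return {"mode": "Merge", "merge_idx": 0}
    if idx in (0, 1):                              # otherwise: AMVP branch (S505-S506)
        return {"mode": "AMVP", "mvp_idx": idx, "ref_idx": ref_idx, "mvd": mvd}
    raise ValueError("index value outside the example convention")
```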
Steps S501-S504 are described first. In steps S501-S504, the Merge mode updates, by template matching in the forward (or backward, the same below) prediction, the candidate motion vector indicated by the index value in the candidate list, to obtain the first prediction block, as follows:
Step S501: When the decoded index value indicates that the forward (or backward) prediction uses the Merge mode, the decoding end determines that the forward (or backward) prediction uses the Merge mode and constructs a Merge candidate list accordingly. The specific way of constructing the list may be consistent with step S401 of the embodiment of FIG. 17.
In a possible embodiment of the present invention, the decoding end may also determine the prediction direction corresponding to the Merge mode or the AMVP mode by decoding the candidate motion vector value. For example, the decoding end selects the candidate motion vector corresponding to Merge candidate list index value 3 and finds that the selected candidate motion vector is a forward-predicted candidate motion vector, and therefore determines that the forward prediction uses the Merge mode. As another example, the decoding end selects the candidate motion vector corresponding to AMVP candidate list index value 0 and finds that the selected candidate motion vector is a backward-predicted candidate motion vector, and therefore determines that the backward prediction uses the AMVP mode.
Step S502: Based on the decoded index value, the decoding end selects the candidate motion vector corresponding to the index value from the Merge candidate list. For example, if the index value is 2 and the 2 indicates the first candidate MV in the Merge candidate list, the first candidate MV in the Merge candidate list is selected.
Step S503: The decoding end updates the candidate motion vector by template matching. For details, reference may be made to the related description of step S403 of the embodiment of FIG. 17, which is not repeated here.
Step S504: The decoding end updates the candidate MV selected from the Merge candidate list, uses the updated candidate MV as the forward (backward) predicted motion vector of the current block (also referred to as the first predicted motion vector), and obtains the forward (backward) prediction block (also referred to as the first prediction block) based on the forward (backward) predicted motion vector. For the specific process, reference may be made to the related descriptions of the decoding end in the embodiments of FIG. 10 and FIG. 11, which are not repeated here.
Steps S505-S506 are described below:
Step S505: When the decoded index value indicates that the backward (or forward) prediction uses the AMVP mode, the decoding end determines that the backward (or forward) prediction uses the AMVP mode, continues decoding to obtain information such as the index of the reference image and the MVD, and then constructs an AMVP candidate list. The specific way of constructing the list may be consistent with step S405 of the embodiment of FIG. 17.
Step S506: Based on the AMVP candidate list, the decoded index of the reference image, the MVD, and other information, the decoding end obtains the backward (forward) prediction block of the current block (also referred to as the second prediction block) by decoding in the conventional AMVP manner.
Finally, in step S507, after obtaining the forward (backward) first prediction block and the backward (forward) second prediction block, the decoding end processes the first prediction block and the second prediction block based on a preset algorithm to obtain the reconstructed block of the current block, for example, by performing a weighted averaging of the first prediction block and the second prediction block.
It should be noted that there is no fixed order between steps S501-S504 and steps S505-S506; that is, S501-S504 may be performed before or after S505-S506, and S501-S504 and S505-S506 may also be performed at the same time.
It should also be noted that, in another embodiment of the present invention, S501-S504 may correspondingly be replaced with the AMVP mode updating the candidate list by template matching in the forward (backward) prediction to obtain the first prediction block (for example, the index value indicates that the AMVP mode is used in the forward prediction), and S505-S506 may be replaced with the Merge mode directly using the constructed candidate list in the backward (forward) prediction to obtain the second prediction block (for example, the index value indicates that the Merge mode is used in the backward prediction). For the specific implementation, reference may be made to the foregoing descriptions, which is not repeated here.
It should also be noted that, for related steps of the decoding end that are not described in detail, reference may be made to the descriptions of the corresponding steps at the encoding end and to the foregoing related descriptions. For brevity of the specification, they are not described one by one here.
Referring to FIG. 19, FIG. 19 shows yet another predicted motion vector generation method based on a hybrid prediction mode provided by an embodiment of the present invention, described from the perspective of the encoding end. The differences between this method embodiment and the embodiment of FIG. 17 include that different prediction modes (the Merge mode and the AMVP mode) are used for the forward prediction and the backward prediction, and that template matching is used to update the candidate motion vector/candidate list in both the forward prediction and the backward prediction. These steps are briefly described below.
Steps S601-S604 are described first. In steps S601-S604, the Merge mode updates the candidate list by template matching in the forward (or backward, the same below) prediction to obtain the first prediction block, as follows:
Step S601: The forward (or backward) prediction at the encoding end uses the Merge mode, and the encoding end constructs a Merge candidate list based on the Merge mode, where the Merge candidate list is used for the forward (or backward) prediction in the bidirectional prediction.
Step S602: The encoding end selects a candidate motion vector from the Merge candidate list.
For the specific implementation, reference may be made to the related description of step S402 in FIG. 17, which is not repeated here.
Step S603: The encoding end updates the candidate motion vector by template matching.
Step S604: The encoding end obtains the forward (backward) prediction block (also referred to as the first prediction block) based on the updated Merge candidate list for the forward (backward) direction of the current block.
Steps S605-S608 are described below. In steps S605-S608, the AMVP mode updates the candidate list by template matching in the backward (or forward, the same below) prediction to obtain the second prediction block, as follows:
Step S605: The encoding end constructs an AMVP candidate list based on the AMVP mode, where the AMVP candidate list is used for the backward (or forward) prediction in the bidirectional prediction.
Step S606: The encoding end selects a candidate motion vector from the AMVP candidate list.
In a specific application scenario, if the selected candidate motion vector is obtained by forward (or backward) prediction, a candidate motion vector for backward (forward) prediction may be obtained by mapping and is applied to the subsequent step S607 as the selected candidate motion vector value. For example, the encoding end selects the forward candidate motion vector (-2, -2) from the AMVP candidate list, where (-2, -2) is obtained based on forward prediction; the current block corresponds to picture order count (POC) 4 in the image frame sequence, the reference block of the forward prediction corresponds to picture order count 2, and the reference block of the backward prediction corresponds to picture order count 5. Then, the temporal difference between the current block and the reference block of the backward prediction (which may be referred to as the second temporal difference for short) is -1 (that is, 4 - 5 = -1), and the temporal difference between the current block and the reference block of the forward prediction (which may be referred to as the first temporal difference for short) is 2 (that is, 4 - 2 = 2). Therefore, according to the ratio between the first temporal difference and the second temporal difference (2 / -1 = -2), the (-2, -2) used for forward prediction can be mapped to the candidate motion vector (1, 1) used for backward prediction, that is, (-2, -2) / -2 = (1, 1). The (1, 1) is applied to the subsequent step S607 as the selected candidate motion vector value.
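The numbers of this example can be checked with the same temporal-difference arithmetic used earlier; the snippet below is only a verification of the stated mapping, not part of the method.

```python
# Forward MV (-2, -2); current POC 4, forward reference POC 2, backward reference POC 5.
d_first = 4 - 2             # first temporal difference: 2
d_second = 4 - 5            # second temporal difference: -1
scale = d_second / d_first  # -0.5, i.e. dividing by the ratio 2 / -1 = -2
mapped = (round(-2 * scale), round(-2 * scale))
assert mapped == (1, 1)     # the mapped backward candidate motion vector is (1, 1)
```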
In addition, in a possible embodiment, if the selected candidate motion vector is obtained by forward (or backward) prediction, the selection of that candidate motion vector may instead be aborted; that is, another candidate motion vector is selected from the AMVP candidate list.
In the bidirectional prediction process of a specific application scenario, if the candidate motion vector preselected for the backward (forward) prediction is obtained by bidirectional prediction, the part of that bidirectional prediction used for backward (forward) prediction is finally selected and applied to the subsequent step S607 as the selected candidate motion vector value. For example, for the backward (forward) prediction, the candidate motion vector preselected by the encoding end from the AMVP candidate list is obtained by bidirectional prediction, and the candidate motion vector includes the motion vector (1, 1) used for forward (backward) prediction and the motion vector (-2, -2) used for backward (forward) prediction; the encoding end then finally uses the backward (forward) predicted motion vector (-2, -2) as the selected candidate motion vector value applied to the subsequent step S607.
In a specific application scenario, if the selected candidate motion vector is itself obtained by backward (forward) prediction, that candidate motion vector is directly applied to the subsequent step S607 as the selected candidate motion vector value.
Step S607: The encoding end updates the candidate motion vector by template matching.
The specific process includes: the encoding end takes the selected candidate motion vector of the AMVP candidate list as input, where the candidate motion vector corresponds to a reference block in the reference image; a search range is determined based on the candidate motion vector, where the search range contains at least one reference motion vector; image blocks of the backward (forward) prediction of the current block in the reference image are obtained according to the reference motion vectors within the search range; the pixel difference value or rate-distortion cost between at least one neighboring reconstructed block of the current block and at least one neighboring reconstructed block of each image block is calculated, and the pixel difference value or rate-distortion cost between the at least one neighboring reconstructed block of the current block and at least one neighboring reconstructed block of the reference block corresponding to the candidate motion vector is calculated. Among these pixel difference values or rate-distortion costs, the smallest one is selected, and the motion vector corresponding to the smallest pixel difference value or rate-distortion cost is used as the candidate motion vector of the current block, thereby updating the candidate motion vector.
For the specific implementation, reference may also be made to the descriptions of the embodiments of FIG. 8 and FIG. 9, which are not repeated here.
Step S608: The encoding end may replace the corresponding candidate motion vector in the AMVP candidate list with the motion vector corresponding to the smallest pixel difference value or rate-distortion cost, thereby updating the AMVP candidate list. Based on the updated AMVP candidate list, the backward (forward) predicted motion vector of the current block (also referred to as the second predicted motion vector) is obtained, and the backward (forward) prediction block (also referred to as the second prediction block) is obtained based on the backward (forward) predicted motion vector; the index value of the predicted motion vector in the AMVP candidate list is encoded, the index of the reference image corresponding to the prediction block is encoded, and the MVD is encoded. For the specific process, reference may be made to the related descriptions of the encoding end in the embodiments of FIG. 12 and FIG. 13, which are not repeated here.
Finally, in step S609, the encoding end obtains the reconstructed block and sends the bitstream to the decoding end. Specifically, after the forward (backward) first prediction block and the backward (forward) second prediction block are obtained, the encoding end processes the first prediction block and the second prediction block based on a preset algorithm to obtain the reconstructed block of the current block. The encoding end then sends the bitstream to the decoding end, where the bitstream correspondingly contains the index value of the Merge candidate list, the index value of the AMVP candidate list, the MVD, the index of the reference image, and so on.
It should also be noted that, for related steps of the encoding end that are not described in detail, reference may be made to the foregoing related descriptions. For brevity of the specification, they are not described one by one here.
Referring to FIG. 20, FIG. 20 shows yet another predicted motion vector generation method based on a hybrid prediction mode provided by an embodiment of the present invention, described from the perspective of the decoding end, where the decoding end of this method embodiment may correspond to the encoding end of the embodiment of FIG. 19. The differences between this method embodiment and the embodiment of FIG. 18 include that different prediction modes (the Merge mode and the AMVP mode) are used for the forward prediction and the backward prediction, and that template matching is used to update the candidate motion vector/candidate list in both the forward prediction and the backward prediction. These steps are briefly described below.
Steps S701-S704 are described first. In steps S701-S704, the Merge mode updates, by template matching in the forward (or backward, the same below) prediction, the candidate motion vector indicated by the index value in the candidate list, to obtain the first prediction block, as follows:
Step S701: When the decoded index value indicates that the forward (or backward) prediction uses the Merge mode, the decoding end determines that the forward (or backward) prediction uses the Merge mode and constructs a Merge candidate list accordingly.
Step S702: Based on the decoded information (a flag and/or an index value; for details, refer to the foregoing descriptions), the decoding end selects the candidate motion vector corresponding to the index value from the Merge candidate list.
In a specific application scenario, if the candidate motion vector selected based on the decoded index value is obtained by backward (or forward) prediction, a candidate motion vector for forward (backward) prediction may be obtained by mapping and is applied to the subsequent step S703 as the selected candidate motion vector value. For example, the decoding end selects the candidate motion vector (-2, -2) from the Merge candidate list, where (-2, -2) is obtained based on backward prediction; the current block corresponds to picture order count 4 in the image frame sequence, the reference block of the backward prediction corresponds to picture order count 6, and the reference block of the forward prediction corresponds to picture order count 3. Then, the temporal difference between the current block and the reference block of the forward prediction (referred to as the second temporal difference for short) is 1 (that is, 4 - 3 = 1), and the temporal difference between the current block and the reference block of the backward prediction (referred to as the first temporal difference for short) is -2 (that is, 4 - 6 = -2). Therefore, according to the ratio between the first temporal difference and the second temporal difference (-2 / 1 = -2), the (-2, -2) used for backward prediction can be mapped to the candidate motion vector (1, 1) used for forward prediction, that is, (-2, -2) / -2 = (1, 1). The (1, 1) is applied to the subsequent step S703 as the selected candidate motion vector value.
In addition, in a possible embodiment, if the candidate motion vector selected based on the decoded index value is obtained by backward (or forward) prediction, the selection of that candidate motion vector may instead be aborted; that is, another candidate motion vector is selected from the Merge candidate list.
In a specific application scenario, if the candidate motion vector selected based on the decoded index value is obtained by bidirectional prediction, the part of the candidate motion vector used for forward (or backward) prediction is selected and applied to the subsequent step S703 as the selected candidate motion vector value. For example, in the forward (or backward) prediction, the candidate motion vector preselected by the decoding end from the Merge candidate list includes the motion vector (1, 1) used for forward (backward) prediction and the motion vector (-2, -2) used for backward (forward) prediction; the decoding end then finally uses the forward (backward) predicted motion vector (1, 1) as the selected candidate motion vector value applied to the subsequent step S703.
In a specific application scenario, if the candidate motion vector selected based on the decoded index value is itself obtained by forward (backward) prediction, that candidate motion vector is directly applied to the subsequent step S703 as the selected candidate motion vector value.
步骤S703:解码端利用模板匹配的方式对该候选运动矢量进行更新。Step S703: The decoding end updates the candidate motion vector by means of template matching.
步骤S704:解码端实现对Merge候选列表中所选择的候选MV的更新,将所述更新后的候选MV作为当前块前(后)向的预测运动矢量(又可称为第一预测运动矢量),结合所述前(后)向的预测运动矢量得到前(后)向的预测块(又可称为第一预测块)。Step S704: The decoding end implements an update of the candidate MV selected in the Merge candidate list, and uses the updated candidate MV as a pre- (post) forward predicted motion vector (also referred to as a first predicted motion vector). A pre- (post-) directed prediction block (also referred to as a first prediction block) is obtained in conjunction with the pre- (rear) predicted motion vector.
Steps S705-S708 are described below. In steps S705-S706, the AMVP mode uses template matching in the backward (or forward, hereinafter the same) prediction to update the candidate list and then obtain the second prediction block, as follows:
Step S705: The decoding end determines the AMVP mode based on the decoding information (the identification bit and/or the index value; see the foregoing description), and constructs the AMVP candidate list based on the AMVP mode. The AMVP candidate list is used for the backward (or forward) prediction in bidirectional prediction.
步骤S706:解码端基于解码信息(标识位和/或索引值,具体参考前文描述)在AMVP候选列表中选取候选运动矢量。Step S706: The decoding end selects the candidate motion vector in the AMVP candidate list based on the decoding information (identification bit and/or index value, specifically referring to the foregoing description).
In a specific application scenario, if the candidate motion vector selected based on the decoded index value was obtained by forward (or backward) prediction, a candidate motion vector for backward (forward) prediction may be derived by mapping and used as the selected candidate motion vector value in the subsequent step S707. For example, the decoding end selects the forward candidate motion vector (-2, -2) from the AMVP candidate list, where (-2, -2) was obtained by forward prediction, the current block corresponds to image sequence number 4 in the image frame sequence, the forward-prediction reference block corresponds to image sequence number 2, and the backward-prediction reference block corresponds to image sequence number 5. The timing difference between the current block and the backward-prediction reference block (which may be referred to as the second timing difference) is -1 (that is, 4-5=-1), and the timing difference between the current block and the forward-prediction reference block (which may be referred to as the first timing difference) is 2 (that is, 4-2=2). Therefore, based on the ratio between the first timing difference and the second timing difference (2/-1=-2), the forward-prediction motion vector (-2, -2) can be mapped to the backward-prediction candidate motion vector (1, 1), that is, (-2, -2)/-2=(1, 1). The vector (1, 1) is then used as the selected candidate motion vector value in the subsequent step S707.
In addition, in a possible embodiment, if the candidate motion vector selected based on the decoded index value was obtained by forward (or backward) prediction, the selection of that candidate motion vector may instead be aborted; that is, another candidate motion vector is selected from the AMVP candidate list.
In a specific application scenario of bidirectional prediction, if the candidate motion vector preselected for the backward (forward) prediction based on the decoded index value was obtained by bidirectional prediction, the part of that bidirectional prediction used for backward (forward) prediction is finally selected and used as the selected candidate motion vector value in the subsequent step S707. For example, in the backward (forward) prediction, the candidate motion vector preselected by the decoding end from the AMVP candidate list was obtained by bidirectional prediction and includes a motion vector (1, 1) for forward (backward) prediction and a motion vector (-2, -2) for backward (forward) prediction; the decoding end then uses the backward (forward) prediction motion vector (-2, -2) as the selected candidate motion vector value in the subsequent step S707.
在一具体应用场景中,如果选取的候选运动矢量就是通过后(前)向预测而得到,那么就直接采用该候选运动矢量作为所选择的候选运动矢量值应用到后续步骤S707。In a specific application scenario, if the selected candidate motion vector is obtained by post- (pre-) prediction, the candidate motion vector is directly used as the selected candidate motion vector value to be applied to the subsequent step S707.
步骤S707:解码端利用模板匹配的方式对该候选运动矢量进行更新。Step S707: The decoding end updates the candidate motion vector by means of template matching.
Step S708: The decoding end obtains the backward (forward) prediction block of the current block (also referred to as the second prediction block) based on the AMVP candidate list, the decoded index of the reference image, the MVD, and other information.
Finally, in step S709: after obtaining the forward (backward) first prediction block and the backward (forward) second prediction block, the decoding end processes the first prediction block and the second prediction block based on a preset algorithm to obtain the reconstructed block of the current block.
还需要说明的是,解码端中没有详细描述的相关步骤可参考编码端中相关步骤的描述,以及参考前文的相关描述。为了说明书的简洁,这里不一一展开描述。It should also be noted that the relevant steps not described in detail in the decoding end may refer to the description of the relevant steps in the encoding end, and refer to the related description in the foregoing. For the sake of brevity of the description, the description will not be repeated here.
参见图21,图21是本发明实施例提供的又一种预测运动矢量生成方法,从编码端的角度进行描述,该方法可用于双向预测中的编码端,该方法包括但不限于如下步骤:Referring to FIG. 21, FIG. 21 is yet another method for generating a motion vector predictor according to an embodiment of the present invention. The method is described in the perspective of an encoding end. The method may be used in an encoding end in bidirectional prediction, and the method includes but is not limited to the following steps:
步骤S801:编码端构建Merge候选列表或者AMVP候选列表,可相应参考图6或图7实施例的相关描述,这里不再赘述。Step S801: The encoding end constructs a Merge candidate list or an AMVP candidate list, and may refer to the related description of the embodiment of FIG. 6 or FIG. 7 respectively, and details are not described herein again.
After the Merge candidate list or the AMVP candidate list is constructed, the candidate list includes a forward candidate motion vector used for forward prediction and a backward motion vector used for backward prediction. Subsequent steps S802-S804 are performed for the forward (backward) candidate motion vector, and steps S805-S807 are performed for the backward (forward) candidate motion vector; the two processes are described separately below.
首先描述步骤S802-S804:First, steps S802-S804 are described:
Step S802: The encoding end selects a forward (or backward, hereinafter the same) candidate motion vector from the constructed candidate list.
步骤S803:编码端利用模板匹配的方式对该前(后)向候选运动矢量进行更新。Step S803: The encoding end updates the front (back) candidate motion vector by means of template matching.
Step S804: The encoding end obtains the forward (backward) prediction block (also referred to as the first prediction block) based on the forward (backward) candidate motion vector in the candidate list updated for the forward (backward) prediction of the current block.
下面描述步骤S805-S807:Steps S805-S807 are described below:
Step S805: The encoding end determines a backward (or forward, hereinafter the same) candidate motion vector from the constructed candidate list.
Step S806: The encoding end updates the backward (forward) candidate motion vector by using the forward (backward) candidate motion vectors before and after the update.
Specifically, the encoding end calculates the difference between the replaced forward (backward) candidate motion vector in the candidate list (that is, the new forward (backward) candidate motion vector obtained in step S803) and the forward (backward) candidate motion vector before replacement (that is, the candidate motion vector selected in step S802); combines the difference with the backward (forward) candidate motion vector selected in step S805 to obtain a new backward (forward) candidate motion vector; and then replaces the backward (forward) candidate motion vector determined in step S805 with the new backward (forward) candidate motion vector, so that the candidate list is further updated.
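In other words, the refinement found by template matching in one prediction direction is propagated to the opposite direction as an additive offset. Below is a minimal sketch of this update under the assumption that the difference is added without scaling (a scaled variant is described in the decoding embodiments further below); the function name is illustrative, and the example values match the worked example given later in the text.

def propagate_update(old_fwd, new_fwd, old_bwd):
    # Propagate the refinement of the forward candidate MV to the backward
    # candidate MV by adding the same difference to it.
    diff = (new_fwd[0] - old_fwd[0], new_fwd[1] - old_fwd[1])
    return (old_bwd[0] + diff[0], old_bwd[1] + diff[1])

# Forward candidate (3, 5) refined to (4, 5), backward candidate (1, 4).
print(propagate_update((3, 5), (4, 5), (1, 4)))  # -> (2, 4)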
Step S807: The encoding end obtains the backward (forward) prediction block (also referred to as the second prediction block) based on the backward (forward) candidate motion vector in the candidate list updated for the backward (forward) prediction of the current block.
Step S808: After obtaining the forward (backward) first prediction block and the backward (forward) second prediction block, the encoding end processes the first prediction block and the second prediction block based on a preset algorithm to obtain the reconstructed block of the current block, and then sends a code stream to the decoding end. It can be understood that, if the Merge mode is currently used for encoding, the code stream includes the index value of the Merge candidate list; if the AMVP mode is currently used for encoding, the code stream includes the index value of the AMVP candidate list, the index value of the reference image, and the MVD information.
还需要说明的是,编码端中没有详细描述的相关步骤还可参考前文的相关描述。为了说明书的简洁,这里不一一展开描述。It should also be noted that related steps not detailed in the encoding end may also refer to the related descriptions above. For the sake of brevity of the description, the description will not be repeated here.
Referring to FIG. 22, FIG. 22 shows yet another method for generating a predicted motion vector according to an embodiment of the present invention, described from the decoding end. The method may be used at the decoding end in bidirectional prediction, and the decoding end in this method may correspond to the encoding end in FIG. 21.
步骤S901:解码端解析码流,根据码流中候选列表的索引值,构建Merge候选列表或者AMVP候选列表。可相应参考图6或图7实施例的相关描述,这里不再赘述。Step S901: The decoding end parses the code stream, and constructs a Merge candidate list or an AMVP candidate list according to the index value of the candidate list in the code stream. Reference may be made to the related description of the embodiment of FIG. 6 or FIG. 7 , and details are not described herein again.
After the Merge candidate list or the AMVP candidate list is constructed, the candidate list includes a forward candidate motion vector used for forward prediction and a backward motion vector used for backward prediction. Subsequent steps S902-S904 are performed for the forward (backward) candidate motion vector, and steps S905-S907 are performed for the backward (forward) candidate motion vector; the two processes are described separately below.
首先描述步骤S902-S904:First, steps S902-S904 are described:
Step S902: The decoding end selects a forward (or backward, hereinafter the same) candidate motion vector from the constructed candidate list in combination with the decoded index value.
步骤S903:解码端利用模板匹配的方式对该前(后)向候选运动矢量进行更新。Step S903: The decoding end updates the front (back) candidate motion vector by means of template matching.
Step S904: The decoding end, in combination with the decoding information, obtains the forward (backward) prediction block (also referred to as the first prediction block) based on the forward (backward) candidate motion vector in the candidate list updated for the forward (backward) prediction of the current block.
下面描述步骤S905-S907:Steps S905-S907 are described below:
Step S905: The decoding end determines a backward (or forward, hereinafter the same) candidate motion vector from the constructed candidate list.
Step S906: The decoding end updates the backward (forward) candidate motion vector by using the forward (backward) candidate motion vectors before and after the update.
Specifically, the decoding end calculates the difference between the replaced forward (backward) candidate motion vector in the candidate list (that is, the new forward (backward) candidate motion vector obtained in step S903) and the forward (backward) candidate motion vector before replacement (that is, the candidate motion vector selected in step S902); combines the difference with the backward (forward) candidate motion vector selected in step S905 to obtain a new backward (forward) candidate motion vector; and then replaces the backward (forward) candidate motion vector determined in step S905 with the new backward (forward) candidate motion vector, so that the candidate list is further updated.
In a possible embodiment, the decoding end may calculate the difference between the replaced forward (backward) candidate motion vector in the candidate list and the forward (backward) candidate motion vector before replacement; calculate the sum of the difference and the backward (forward) candidate motion vector in the candidate motion vector set to obtain a new backward (forward) candidate motion vector; and replace the original backward (forward) candidate motion vector in the candidate motion vector set with the new backward (forward) candidate motion vector.
For example, in one application scenario, the index value of the Merge candidate list obtained by decoding is 0. The decoding end constructs the Merge list. After the construction is completed, the current block, in Merge mode, selects the forward (backward) candidate motion vector corresponding to index value 0 of the Merge candidate list for the forward (backward) prediction in bidirectional prediction, and this forward (backward) candidate motion vector is (3, 5). The reference image block (that is, the reference block) in the reference image of the forward (backward) prediction process is obtained from the selected (3, 5). Taking the forward (backward) prediction reference image block as the center, a search is performed with integer-pixel precision within a surrounding 1-pixel range, and several new reference image blocks (that is, image blocks) of the same size as the reference block are obtained. Using the left-adjacent and/or above-adjacent reconstructed blocks of the current block as a template, the template is matched against the left-adjacent and/or above-adjacent reconstructed blocks of the forward (backward) prediction reference block and of each forward (backward) prediction image block; the block with the smallest rate-distortion cost is the updated reference image block, and the motion vector corresponding to the smallest rate-distortion cost is (4, 5). The decoding end uses the reference image block with the smallest rate-distortion cost as the forward (backward) prediction block of the current block, and sets the forward (backward) predicted motion vector of the current block to (4, 5). In the candidate list, the backward (forward) candidate motion vector corresponding to index value 0 is (1, 4); combining the forward (backward) candidate motion vector before replacement (3, 5) and the predicted motion vector after replacement (4, 5) yields the difference (1, 0), and combining the backward (forward) candidate motion vector (1, 4) with the difference (1, 0) yields (2, 4) as the predicted motion vector corresponding to the updated backward (forward) prediction reference block.
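The template-matching refinement used in this example can be sketched as follows. This is a simplified illustration only: a plain sum of absolute differences (SAD) between rectangular templates stands in for the rate-distortion cost, the reference picture and template are assumed to be available as 2-D sample arrays, and the helper names are hypothetical.

def refine_mv_by_template(cur_template, ref_pic, ref_pos, mv, search_range=1):
    # Refine a candidate MV by matching the current block's template (its
    # left/above reconstructed samples) against the template of candidate
    # reference blocks around the initial reference position.
    h, w = len(cur_template), len(cur_template[0])

    def cost(dx, dy):
        x0, y0 = ref_pos[0] + dx, ref_pos[1] + dy
        return sum(abs(cur_template[j][i] - ref_pic[y0 + j][x0 + i])
                   for j in range(h) for i in range(w))

    # Integer-pixel search within the surrounding 1-pixel range.
    best = min(((dx, dy) for dx in range(-search_range, search_range + 1)
                         for dy in range(-search_range, search_range + 1)),
               key=lambda d: cost(*d))
    return (mv[0] + best[0], mv[1] + best[1])   # e.g. (3, 5) refined to (4, 5)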
For example, in one application scenario, the index value of the Merge candidate list obtained by decoding is 0. The decoding end constructs the Merge list. After the construction is completed, the current block, in Merge mode, selects the forward (backward) candidate motion vector corresponding to index value 0 of the Merge candidate list for the forward (backward) prediction in bidirectional prediction; the image sequence number of the forward (backward) reference image is 3, the image sequence number of the current image is 4, and the image sequence number of the backward (forward) reference image is 5. The forward (backward) candidate motion vector is (3, 5), and the reference image block (that is, the reference block) in the reference image of the forward (backward) prediction process is obtained from the selected (3, 5). Taking the forward (backward) prediction reference image block as the center, a search is performed with integer-pixel precision within a surrounding 1-pixel range, and several new reference image blocks (that is, image blocks) of the same size as the reference block are obtained. Using the left-adjacent and/or above-adjacent reconstructed blocks of the current block as a template, the template is matched against the left-adjacent and/or above-adjacent reconstructed blocks of the forward (backward) prediction reference block and of each forward (backward) prediction image block; the block with the smallest rate-distortion cost is the updated reference image block, and the motion vector corresponding to the smallest rate-distortion cost is (4, 5). The decoding end uses the reference image block with the smallest rate-distortion cost as the forward (backward) prediction block of the current block, and sets the forward (backward) predicted motion vector of the current block to (4, 5). In the candidate list, the backward (forward) candidate motion vector corresponding to index value 0 is (1, 4); combining the forward (backward) candidate motion vector before replacement (3, 5) and the predicted motion vector after replacement (4, 5), the difference (-1, 0) is obtained through proportional mapping, and combining the backward (forward) candidate motion vector (1, 4) with the difference (-1, 0) yields (0, 4) as the predicted motion vector corresponding to the updated backward (forward) prediction reference block.
In a possible embodiment, the decoding end may calculate the difference between the replaced forward (backward) candidate motion vector in the candidate list and the forward (backward) candidate motion vector before replacement; multiply the difference by a preset coefficient and then add it to the backward (forward) candidate motion vector in the candidate motion vector set to obtain a new backward (forward) candidate motion vector; and replace the original backward (forward) candidate motion vector in the candidate motion vector set with the new backward (forward) candidate motion vector.
In a possible embodiment, the decoding end may calculate the forward (backward) difference between the replaced forward (backward) candidate motion vector in the candidate list and the forward (backward) candidate motion vector before replacement; map the forward (backward) difference to a backward (forward) difference; add the backward (forward) difference to the backward (forward) candidate motion vector in the candidate motion vector set to obtain a new backward (forward) candidate motion vector; and replace the original backward (forward) candidate motion vector in the candidate motion vector set with the new backward (forward) candidate motion vector. For example, the forward (backward) difference between the replaced forward (backward) candidate motion vector in the candidate list and the forward (backward) candidate motion vector before replacement may be calculated as (2, 4); the current block corresponds to image sequence number 4 in the image frame sequence, the forward-prediction reference block corresponds to image sequence number 2, and the backward-prediction reference block corresponds to image sequence number 5. The timing difference between the current block and the backward-prediction reference block (which may be referred to as the second timing difference) is -1 (that is, 4-5=-1), and the timing difference between the current block and the forward-prediction reference block (which may be referred to as the first timing difference) is 2 (that is, 4-2=2). Therefore, based on the ratio between the first timing difference and the second timing difference (2/-1=-2), the forward (backward) difference (2, 4) is mapped to the backward (forward) difference (-1, -2); the backward (forward) difference (-1, -2) is then added to the backward (forward) candidate motion vector to obtain a new backward (forward) candidate motion vector, which replaces the original backward (forward) candidate motion vector in the candidate motion vector set.
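A minimal sketch of this scaled propagation is given below: the forward difference is mapped by the ratio of the timing differences and then added to the backward candidate. As before, the function name, the floating-point arithmetic, and the assumed backward candidate (1, 4) in the usage line are illustrative only.

def propagate_update_scaled(old_fwd, new_fwd, old_bwd,
                            poc_cur, poc_fwd_ref, poc_bwd_ref):
    # Propagate the forward refinement to the backward candidate MV,
    # scaling the difference by the ratio of the timing differences.
    diff = (new_fwd[0] - old_fwd[0], new_fwd[1] - old_fwd[1])
    td_fwd = poc_cur - poc_fwd_ref        # first timing difference
    td_bwd = poc_cur - poc_bwd_ref        # second timing difference
    scale = td_bwd / td_fwd
    mapped = (diff[0] * scale, diff[1] * scale)
    return (old_bwd[0] + mapped[0], old_bwd[1] + mapped[1])

# Forward difference (2, 4) as in the example, image sequence numbers 4 / 2 / 5;
# the mapped backward difference is (-1.0, -2.0).
print(propagate_update_scaled((0, 0), (2, 4), (1, 4), 4, 2, 5))  # -> (0.0, 2.0)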
In a possible embodiment, the decoding end may calculate the forward (backward) difference between the replaced forward (backward) candidate motion vector in the candidate list and the forward (backward) candidate motion vector before replacement; map the forward (backward) difference to a backward (forward) difference and then multiply it by a preset coefficient; add the result to the backward (forward) candidate motion vector in the candidate motion vector set to obtain a new backward (forward) candidate motion vector; and replace the original backward (forward) candidate motion vector in the candidate motion vector set with the new backward (forward) candidate motion vector. In a possible implementation, if the ratio between the first timing difference and the second timing difference obtained in the process of mapping the forward (backward) difference to the backward (forward) difference is greater than 0, the preset coefficient is defined as 1 or another preset positive value; if the ratio is less than 0, the preset coefficient is defined as -1 or another preset negative value.
Step S907: The decoding end, in combination with the decoding information, obtains the backward (forward) prediction block (also referred to as the second prediction block) based on the backward (forward) candidate motion vector in the candidate list updated for the backward (forward) prediction of the current block.
步骤S908:解码端得到前(后)向的第一预测块和后(前)向的第二预测块后,基于预设算法对所述第一预测块和所述第二预测块进行处理,从而得到当前块的重构块。Step S908: After the decoding end obtains the first (back) direction first prediction block and the back (front) direction second prediction block, the first prediction block and the second prediction block are processed according to a preset algorithm, Thereby the reconstructed block of the current block is obtained.
还需要说明的是,解码端中没有详细描述的相关步骤可参考编码端中相关步骤的描述,还可参考前文的相关描述。为了说明书的简洁,这里不一一展开描述。It should be noted that the relevant steps that are not described in detail in the decoding end may refer to the description of related steps in the encoding end, and may also refer to the related description in the foregoing. For the sake of brevity of the description, the description will not be repeated here.
Referring to FIG. 23, FIG. 23 shows yet another method for generating a predicted motion vector according to an embodiment of the present invention, described from the perspective of the encoding end. The method may be used at the encoding end in bidirectional prediction. The difference between this method embodiment and the embodiment of FIG. 21 is that the forward prediction and the backward prediction in the method use different prediction modes (that is, a hybrid coding mode): when the forward (or backward) direction uses the Merge mode, the backward (or forward) direction correspondingly uses the AMVP mode. The specific description is as follows:
首先描述步骤S1001-S1004:First, steps S1001-S1004 are described:
Step S1001: The forward (or backward, hereinafter the same) prediction at the encoding end uses the Merge mode, and the Merge candidate list is constructed based on the Merge mode.
步骤S1002:编码端在所构建的Merge候选列表中,选取前(后)向候选运动矢量。Step S1002: The encoding end selects the front (back) candidate motion vector in the constructed Merge candidate list.
步骤S1003:编码端利用模板匹配的方式对该前(后)向候选运动矢量进行更新。Step S1003: The encoding end updates the front (back) candidate motion vector by means of template matching.
Step S1004: The encoding end obtains the forward (backward) prediction block (also referred to as the first prediction block) based on the forward (backward) candidate motion vector in the updated Merge candidate list for the forward (backward) prediction of the current block, and encodes the candidate list index corresponding to the first prediction block.
下面描述步骤S1005-S1007:Steps S1005-S1007 are described below:
Step S1005: The backward (or forward, hereinafter the same) prediction at the encoding end uses the AMVP mode, and the AMVP candidate list is constructed based on the AMVP mode.
步骤S1006:编码端在所构建的AMVP候选列表中,选取后(或前,下同)向候选运动矢量。Step S1006: The encoding end selects the (or before, the same below) candidate motion vector in the constructed AMVP candidate list.
Step S1007: The encoding end updates the backward (forward) candidate motion vector by using the forward (backward) candidate motion vectors before and after the update.
Specifically, the encoding end calculates the difference between the replaced forward (backward) candidate motion vector in the Merge candidate list (that is, the new forward (backward) candidate motion vector obtained in step S1003) and the forward (backward) candidate motion vector before replacement (that is, the candidate motion vector selected in step S1002); combines the difference with the backward (forward) candidate motion vector selected in step S1006 to obtain a new backward (forward) candidate motion vector; and then replaces the backward (forward) candidate motion vector selected in step S1006 in the AMVP candidate list with the new backward (forward) candidate motion vector, so that the AMVP candidate list is further updated.
Step S1008: The encoding end obtains the backward (forward) prediction block (also referred to as the second prediction block) based on the backward (forward) candidate motion vector in the updated AMVP candidate list for the backward (forward) prediction of the current block, and encodes the candidate list index corresponding to the second prediction block, the index of the reference image corresponding to the second prediction block, and the MVD information.
Step S1009: After obtaining the forward (backward) first prediction block and the backward (forward) second prediction block, the encoding end processes the first prediction block and the second prediction block based on a preset algorithm to obtain the reconstructed block of the current block, and then sends a code stream to the decoding end. It can be understood that the code stream includes the index value of the Merge candidate list, the index value of the AMVP candidate list, the index value of the reference image, and the MVD information.
还需要说明的是,编码端中没有详细描述的相关步骤还可参考前文的相关描述。为了说明书的简洁,这里不一一展开描述。It should also be noted that related steps not detailed in the encoding end may also refer to the related descriptions above. For the sake of brevity of the description, the description will not be repeated here.
Referring to FIG. 24, FIG. 24 shows yet another method for generating a predicted motion vector according to an embodiment of the present invention, described from the decoding end. The method may be used at the decoding end in bidirectional prediction, and the decoding end in this method may correspond to the encoding end in FIG. 23.
首先描述步骤S1101-S1104:First, steps S1101-S1104 are described:
Step S1101: The decoding end parses the code stream, determines, based on the index value of the Merge candidate list in the code stream, that the forward (or backward, hereinafter the same) prediction uses the Merge mode, and constructs the Merge candidate list based on the Merge mode.
步骤S1102:解码端结合解码所得的索引值在所构建的Merge候选列表中选取前(或后,下同)向候选运动矢量。Step S1102: The decoding end selects the pre- (or following, the same below) candidate motion vector in the constructed Merge candidate list in combination with the decoded index value.
步骤S1103:解码端利用模板匹配的方式对该前(后)向候选运动矢量进行更新。Step S1103: The decoding end updates the front (back) candidate motion vector by means of template matching.
Step S1104: The decoding end, in combination with the decoding information, obtains the forward (backward) prediction block (also referred to as the first prediction block) based on the forward (backward) candidate motion vector in the candidate list updated for the forward (backward) prediction of the current block.
Steps S1105-S1107 are described below:
Step S1105: The decoding end parses the code stream, determines, based on the index value of the AMVP candidate list in the code stream, that the backward (or forward, hereinafter the same) prediction uses the AMVP mode, and constructs the AMVP candidate list based on the AMVP mode.
步骤S1106:解码端结合解码所得的索引值在所构建的AMVP候选列表中选取后(前)向候选运动矢量。Step S1106: The decoding end selects the (previous) candidate motion vector in the constructed AMVP candidate list in combination with the decoded index value.
Step S1107: The decoding end updates the backward (forward) candidate motion vector by using the forward (backward) candidate motion vectors before and after the update.
Specifically, the decoding end calculates the difference between the replaced forward (backward) candidate motion vector in the Merge candidate list (that is, the new forward (backward) candidate motion vector obtained in step S1103) and the forward (backward) candidate motion vector before replacement (that is, the candidate motion vector selected in step S1102); combines the difference with the backward (forward) candidate motion vector selected in step S1106 to obtain a new backward (forward) candidate motion vector; and then replaces the backward (forward) candidate motion vector determined in step S1106 with the new backward (forward) candidate motion vector, so that the candidate list is further updated.
Step S1108: The decoding end, in combination with the decoding information, obtains the backward (forward) prediction block (also referred to as the second prediction block) based on the backward (forward) candidate motion vector in the updated AMVP candidate list for the backward (forward) prediction of the current block.
步骤S1109:解码端得到前(后)向的第一预测块和后(前)向的第二预测块后,基于预设算法对所述第一预测块和所述第二预测块进行处理,从而得到当前块的重构块。Step S1109: After the decoding end obtains the first (back) direction first prediction block and the back (front) direction second prediction block, the first prediction block and the second prediction block are processed according to a preset algorithm, Thereby the reconstructed block of the current block is obtained.
还需要说明的是,解码端中没有详细描述的相关步骤可参考编码端中相关步骤的描述,还可参考前文的相关描述。为了说明书的简洁,这里不一一展开描述。It should be noted that the relevant steps that are not described in detail in the decoding end may refer to the description of related steps in the encoding end, and may also refer to the related description in the foregoing. For the sake of brevity of the description, the description will not be repeated here.
可以看出,本发明实施例中,视频编解码系统可利用模板匹配的方式,验证当前块的参考图像中一定范围内(甚至整个参考图像内)的图像块是否与当前块有较好的匹配,对基于Merge或AMVP模式所构建的候选列表的候选运动矢量进行更新,基于更新后的候选列表能够保证编解码过程中获得当前块的最佳参考块。另外,本发明实施例还提供了混合预测模式,基于所述混合预测模式也可以能够保证编解码过程中获得当前块的最佳参考块,在确保获得当前块的最佳参考块的情况下提高编解码的效率。It can be seen that, in the embodiment of the present invention, the video codec system can verify whether the image block in a certain range (or even the entire reference image) in the reference image of the current block has a good match with the current block by using template matching. The candidate motion vector of the candidate list constructed based on the Merge or AMVP mode is updated, and the updated candidate list can be used to ensure that the best reference block of the current block is obtained during the encoding and decoding process. In addition, the embodiment of the present invention further provides a hybrid prediction mode, and based on the hybrid prediction mode, it may also be able to obtain an optimal reference block of the current block in the coding and decoding process, and improve in the case of ensuring that the best reference block of the current block is obtained. The efficiency of codec.
上文描述了本发明实施例的编解码系统以及相关方法,下面进一步描述本发明实施例所涉及的相关设备。The codec system and related methods of the embodiments of the present invention are described above, and related devices related to the embodiments of the present invention are further described below.
Referring to FIG. 25, an embodiment of the present invention provides an apparatus 1200 for generating a predicted motion vector. The apparatus 1200 may be applied to the encoding side or to the decoding side. The apparatus 1200 includes a processor 1201 and a memory 1202, which are connected to each other (for example, via a bus 1204). In a possible implementation, the apparatus 1200 may further include a transceiver 1203 connected to the processor 1201 and the memory 1202 and configured to receive/send data.
存储器1202包括但不限于是随机存储记忆体(random access memory,RAM)、只读存储器(read-only memory,ROM)、可擦除可编程只读存储器(erasable programmable read only memory,EPROM)、或便携式只读存储器(compact disc read-only memory,CD-ROM),该存储器1202用于存储相关程序代码及视频数据。The memory 1202 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read only memory (EPROM), or A compact disc read-only memory (CD-ROM) for storing related program codes and video data.
处理器1201可以是一个或多个中央处理器(central processing unit,CPU),在处理器1201是一个CPU的情况下,该CPU可以是单核CPU,也可以是多核CPU。The processor 1201 may be one or more central processing units (CPUs). In the case where the processor 1201 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.
该处理器1201用于读取所存储器1202中存储的程序代码,执行以下操作:The processor 1201 is configured to read the program code stored in the memory 1202 and perform the following operations:
基于预测模式构建待处理块的候选运动矢量集合;所述预测模式为Merge模式或AMVP模式;Constructing a candidate motion vector set of the to-be-processed block based on the prediction mode; the prediction mode is a Merge mode or an AMVP mode;
determining, according to a first candidate motion vector in the candidate motion vector set, at least two first reference motion vectors, where the first reference motion vector is used to determine a reference block of the to-be-processed block in a first reference image of the to-be-processed block;
separately calculating a pixel difference value or rate-distortion cost value between at least one first adjacent reconstructed block of each of at least two of the determined reference blocks and at least one second adjacent reconstructed block of the to-be-processed block, where the at least one first adjacent reconstructed block and the at least one second adjacent reconstructed block have the same shape and equal size; and
obtaining a first predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors that corresponds to the smallest pixel difference value or rate-distortion cost value. A simplified sketch of these operations is given below.
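The sketch below summarizes these operations. It assumes that a candidate motion vector has already been selected from the candidate motion vector set and that a template-cost function (the pixel difference or rate-distortion cost between the neighbouring reconstructed blocks of a reference block and those of the to-be-processed block) is available; both helpers are hypothetical.

def derive_first_predicted_mv(first_candidate_mv, template_cost):
    # Determine at least two first reference motion vectors around the
    # candidate, here the candidate itself plus its 1-pixel neighbours.
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    reference_mvs = [(first_candidate_mv[0] + dx, first_candidate_mv[1] + dy)
                     for dx, dy in offsets]
    # The reference MV with the smallest template cost gives the first
    # predicted motion vector of the to-be-processed block.
    return min(reference_mvs, key=template_cost)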
需要说明的是,在具体的实施例中,处理器1201可用于执行上述图10-图24实施例描述的相关方法,为了说明书的简洁,这里将不再赘述。It should be noted that, in a specific embodiment, the processor 1201 may be used to perform the related methods described in the foregoing embodiments of FIG. 10 to FIG. 24, and details are not described herein for brevity of the description.
Referring to FIG. 26, an embodiment of the present invention provides yet another apparatus 1300 for generating a predicted motion vector. The apparatus 1300 may be applied to the encoding side or to the decoding side. The apparatus 1300 includes a processor 1301 and a memory 1302, which are connected to each other (for example, via a bus 1304). In a possible implementation, the apparatus 1300 may further include a transceiver 1303 connected to the processor 1301 and the memory 1302 and configured to receive/send data.
存储器1302包括但不限于是随机存储记忆体(random access memory,RAM)、只读存储器(read-only memory,ROM)、可擦除可编程只读存储器(erasable programmable read only memory,EPROM)、或便携式只读存储器(compact disc read-only memory,CD-ROM),该存储器1302用于存储相关程序代码及视频数据。The memory 1302 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read only memory (EPROM), or A compact disc read-only memory (CD-ROM) for storing related program codes and video data.
处理器1301可以是一个或多个中央处理器(central processing unit,CPU),在处理器1301是一个CPU的情况下,该CPU可以是单核CPU,也可以是多核CPU。The processor 1301 may be one or more central processing units (CPUs). In the case where the processor 1301 is a CPU, the CPU may be a single core CPU or a multi-core CPU.
The processor 1301 is configured to read the program code stored in the memory 1302 to perform bidirectional prediction of a to-be-processed block, where the bidirectional prediction includes first direction prediction and second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list. The processor 1301 specifically performs the following operations:
获取第一预测模式,生成第一候选运动矢量集合,所述第一候选运动矢量集合用于在所述第一方向预测中生成第一方向预测运动矢量;Obtaining a first prediction mode, generating a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction prediction motion vector in the first direction prediction;
获取第二预测模式,生成第二候选运动矢量集合,所述第二候选运动矢量集合用于在所述第二方向预测中生成第二方向预测运动矢量,Obtaining a second prediction mode, generating a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction prediction motion vector in the second direction prediction,
其中,当所述第一预测模式为AMVP模式时,所述第二预测模式为Merge模式,或者,当所述第一预测模式为Merge模式时,所述第二预测模式为AMVP模式。When the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode.
需要说明的是,在具体的实施例中,处理器1301可用于执行上述图10-图24实施例描述的相关方法,为了说明书的简洁,这里将不再赘述。It should be noted that, in a specific embodiment, the processor 1301 may be used to perform the related methods described in the foregoing embodiments of FIG. 10 to FIG. 24, and for brevity of the description, details are not described herein again.
基于同样的发明构思,本发明实施例提供了又一种用于生成预测运动矢量的设备1400,参见图27,设备1400包括:Based on the same inventive concept, the embodiment of the present invention provides another apparatus 1400 for generating a predicted motion vector. Referring to FIG. 27, the apparatus 1400 includes:
集合生成模块1401,用于构建待处理块的候选运动矢量集合;a set generation module 1401, configured to construct a candidate motion vector set of the to-be-processed block;
a template matching module 1402, configured to determine, according to a first candidate motion vector in the candidate motion vector set, at least two first reference motion vectors, where the first reference motion vector is used to determine a reference block of the to-be-processed block in a first reference image of the to-be-processed block;
the template matching module 1402 is further configured to separately calculate a pixel difference value or rate-distortion cost value between at least one first adjacent reconstructed block of each of at least two of the determined reference blocks and at least one second adjacent reconstructed block of the to-be-processed block, where the at least one first adjacent reconstructed block and the at least one second adjacent reconstructed block have the same shape and equal size; and
a predicted motion vector generation module 1403, configured to obtain a first predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors that corresponds to the smallest pixel difference value or rate-distortion cost value.
需要说明的是,通过前述图10-图24实施例的相关描述,本领域技术人员可以知道设备1400所包含的各个模块的实现方法,所以为了说明书的简洁,这里将不再赘述。It should be noted that, through the related descriptions of the foregoing embodiments of FIG. 10 to FIG. 24, those skilled in the art may know the implementation methods of the modules included in the device 1400, and therefore, for brevity of the description, details are not described herein again.
基于同样的发明构思,本发明实施例提供了又一种用于生成预测运动矢量的设备1500,参见图28,设备1500用于进行待处理块的双向预测,所述双向预测包括第一方向预测和第二方向预测,所述第一方向预测是基于第一参考帧列表的预测,所述第二方向预测是基于第二参考帧列表的预测,所述设备1500具体包括:Based on the same inventive concept, the embodiment of the present invention provides another apparatus 1500 for generating a predicted motion vector. Referring to FIG. 28, the apparatus 1500 is configured to perform bidirectional prediction of a to-be-processed block, where the bidirectional prediction includes a first direction prediction. And the second direction prediction, the first direction prediction is based on the prediction of the first reference frame list, and the second direction prediction is based on the prediction of the second reference frame list, where the device 1500 specifically includes:
第一集合生成模块1501,用于获取第一预测模式,生成第一候选运动矢量集合,所述第一候选运动矢量集合用于在所述第一方向预测中生成第一方向预测运动矢量;a first set generation module 1501, configured to acquire a first prediction mode, and generate a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction prediction motion vector in the first direction prediction;
第二集合生成模块1502,用于获取第二预测模式,生成第二候选运动矢量集合,所述第二候选运动矢量集合用于在所述第二方向预测中生成第二方向预测运动矢量,a second set generation module 1502, configured to acquire a second prediction mode, and generate a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction prediction motion vector in the second direction prediction,
其中,当所述第一预测模式为AMVP模式时,所述第二预测模式为Merge模式,或者,当所述第一预测模式为Merge模式时,所述第二预测模式为AMVP模式。When the first prediction mode is the AMVP mode, the second prediction mode is the Merge mode, or when the first prediction mode is the Merge mode, the second prediction mode is the AMVP mode.
需要说明的是,通过前述图10-图24实施例的相关描述,本领域技术人员可以知道设备1500所包含的各个模块的实现方法,所以为了说明书的简洁,这里将不再赘述。It should be noted that, through the related descriptions of the foregoing embodiments of FIG. 10 to FIG. 24, those skilled in the art may know the implementation methods of the modules included in the device 1500, and therefore, for brevity of the description, details are not described herein again.
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the various examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present application.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。A person skilled in the art can clearly understand that for the convenience and brevity of the description, the specific working process of the system, the device and the unit described above can refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者任意组合来实现。当使用软件实现时,可以全部或者部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令,在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本发明实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络或其他可编程装置。所述计算机指令可存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网络站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线)或无线(例如红外、微波等)方式向另一个网络站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质,也可以是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如软盘、硬盘、磁带等)、光介质(例如DVD等)、或者半导体介质(例如固态硬盘)等等。In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions which, when loaded and executed on a computer, produce, in whole or in part, a process or function according to an embodiment of the invention. The computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer readable storage medium or transferred from one computer readable storage medium to another computer readable storage medium, for example, the computer instructions can be from a network site, computer, server or data center Transmission to another network site, computer, server, or data center via wired (eg, coaxial cable, fiber optic, digital subscriber line) or wireless (eg, infrared, microwave, etc.). The computer readable storage medium can be any available media that can be accessed by a computer, or can be a data storage device such as a server, data center, or the like that includes one or more available media. The usable medium may be a magnetic medium (such as a floppy disk, a hard disk, a magnetic tape, etc.), an optical medium (such as a DVD, etc.), or a semiconductor medium (such as a solid state hard disk) or the like.
在上述实施例中,对各个实施例的描述各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。In the above embodiments, the descriptions of the various embodiments are focused on, and the parts that are not detailed in a certain embodiment may be referred to the related descriptions of other embodiments.

Claims (48)

  1. 一种预测运动矢量的生成方法,其特征在于,包括:A method for generating a motion vector predicting, comprising:
    构建待处理块的候选运动矢量集合;Constructing a candidate motion vector set of the block to be processed;
    determining, according to a first candidate motion vector in the candidate motion vector set, at least two first reference motion vectors, wherein the first reference motion vector is used to determine a reference block of the to-be-processed block in a first reference image of the to-be-processed block;
    separately calculating a pixel difference value or rate-distortion cost value between one or more first adjacent reconstructed blocks of each of at least two of the determined reference blocks and one or more second adjacent reconstructed blocks of the to-be-processed block, wherein the first adjacent reconstructed blocks and the second adjacent reconstructed blocks have the same shape and equal size; and
    obtaining a first predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors that corresponds to the smallest pixel difference value or rate-distortion cost value.
  2. The method according to claim 1, wherein the method is used to encode the to-be-processed block, and the obtaining a predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors that corresponds to the smallest pixel difference value or rate-distortion cost value comprises:
    replacing the first candidate motion vector in the candidate motion vector set with the one first reference motion vector that corresponds to the smallest pixel difference value or rate-distortion cost value, to obtain an updated candidate motion vector set; and
    从所述更新后的候选运动矢量集合中,获得所述待处理块的第一预测运动矢量。Obtaining, from the updated candidate motion vector set, a first predicted motion vector of the to-be-processed block.
  3. 根据权利要求2所述的方法,其特征在于,所述从所述更新后的候选运动矢量集合中,获得所述待处理块的预测运动矢量,包括:The method according to claim 2, wherein the obtaining the predicted motion vector of the to-be-processed block from the updated candidate motion vector set comprises:
    从所述更新后的候选运动矢量集合中,根据率失真代价值,选择出所述待处理块的第一预测运动矢量。From the updated candidate motion vector set, a first predicted motion vector of the to-be-processed block is selected according to a rate distortion cost value.
  4. The method according to claim 1, wherein the method is used to decode the to-be-processed block, and the obtaining a first predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors that corresponds to the smallest pixel difference value or rate-distortion cost value comprises:
    将所述对应的所述像素差异值或者率失真代价值最小的一个第一参考运动矢量作为所述待处理块的第一预测运动矢量。A first reference motion vector that minimizes the corresponding pixel difference value or rate distortion value is used as the first predicted motion vector of the to-be-processed block.
  5. 根据权利要求4所述的方法,其特征在于,在所述构建待处理块的候选运动矢量集合之前,所述方法还包括:The method according to claim 4, wherein before the constructing the set of candidate motion vectors of the block to be processed, the method further comprises:
    解析码流获得标识信息,所述标识信息用于指示所述候选运动矢量集合为Merge模式所采用的候选运动矢量集合或者AMVP模式所采用的候选运动矢量集合;The analytic code stream obtains identification information, where the identifier information is used to indicate that the candidate motion vector set is a candidate motion vector set adopted by the Merge mode or a candidate motion vector set adopted by the AMVP mode;
    对应的,所述构建待处理块的候选运动矢量集合,包括:Correspondingly, the candidate motion vector set for constructing the to-be-processed block includes:
    当所述标识信息指示所述候选运动矢量集合为Merge模式所采用的候选运动矢量集合时,构建所述待处理块在Merge模式下所采用的候选运动矢量集合;And when the identifier information indicates that the candidate motion vector set is a candidate motion vector set adopted by the Merge mode, constructing a candidate motion vector set adopted by the to-be-processed block in the Merge mode;
    当所述标识信息指示所述候选运动矢量集合为AMVP模式所采用的候选运动矢量集合时,构建所述待处理块在AMVP模式下所采用的候选运动矢量集合。And when the identifier information indicates that the candidate motion vector set is a candidate motion vector set adopted by the AMVP mode, constructing a candidate motion vector set adopted by the to-be-processed block in the AMVP mode.
  6. The method according to claim 4, wherein the method is used for decoding the to-be-processed block, and the method further comprises:
    parsing the bitstream to obtain index information, and determining the first candidate motion vector according to the index information.
  7. The method according to any one of claims 1 to 6, wherein the candidate motion vector set is used for bidirectional prediction of the to-be-processed block, the bidirectional prediction comprises first direction prediction and second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list.
  8. The method according to claim 7, wherein the first reference frame list comprises the first reference image, the second reference frame list comprises a second reference image, and the method further comprises:
    calculating a difference between the first reference motion vector having the smallest corresponding pixel difference value or rate-distortion cost and the first candidate motion vector; and
    obtaining a third candidate motion vector according to the difference and a second candidate motion vector in the candidate motion vector set, where the second candidate motion vector is used to determine a reference block of the to-be-processed block in the second reference image.
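As an illustration of claim 8, the sketch below assumes that the refinement offset found in the first direction is simply added to the second-direction candidate; the claim only says the third candidate is obtained "according to" the difference, so this particular combination rule is an assumption, not the claimed method.

    def derive_third_candidate(best_first_ref_mv, first_candidate_mv, second_candidate_mv):
        """Claim 8 (sketch): combine the first-direction refinement offset with the
        second-direction candidate to obtain the third candidate motion vector."""
        delta = (best_first_ref_mv[0] - first_candidate_mv[0],
                 best_first_ref_mv[1] - first_candidate_mv[1])
        # Assumed combination: apply the same offset to the second-direction candidate.
        return (second_candidate_mv[0] + delta[0], second_candidate_mv[1] + delta[1])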
  9. The method according to claim 8, wherein the method is used for encoding the to-be-processed block, and after the obtaining a third candidate motion vector, the method further comprises:
    replacing the second candidate motion vector in the candidate motion vector set with the third candidate motion vector.
  10. The method according to claim 8, wherein the method is used for decoding the to-be-processed block, and after the obtaining a third candidate motion vector, the method further comprises:
    using the third candidate motion vector as a second predicted motion vector of the to-be-processed block.
  11. The method according to claim 7, wherein the first reference frame list comprises the first reference image, the second reference frame list comprises a second reference image, and the method further comprises:
    determining at least two second reference motion vectors according to a second candidate motion vector in the candidate motion vector set, where the second reference motion vector is used to determine a reference block of the to-be-processed block in the second reference image;
    separately calculating pixel difference values or rate-distortion cost values between one or more third neighboring reconstructed blocks of at least two of the determined reference blocks and one or more fourth neighboring reconstructed blocks of the to-be-processed block, where the third neighboring reconstructed block and the fourth neighboring reconstructed block have the same shape and the same size; and
    obtaining a second predicted motion vector of the to-be-processed block according to the one of the at least two second reference motion vectors that has the smallest corresponding pixel difference value or rate-distortion cost.
  12. The method according to claim 9, wherein the obtaining the predicted motion vector of the to-be-processed block according to the one of the at least two second reference motion vectors that has the smallest corresponding pixel difference value or rate-distortion cost comprises:
    replacing the second candidate motion vector in the candidate motion vector set with the second reference motion vector having the smallest corresponding pixel difference value or rate-distortion cost, to obtain a further updated candidate motion vector set; and
    obtaining the second predicted motion vector of the to-be-processed block from the further updated candidate motion vector set.
  13. The method according to any one of claims 7 to 12, wherein before the determining at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set, the method comprises:
    when a fourth candidate motion vector in the candidate motion vector set is generated by the second direction prediction, scaling down or scaling up the fourth candidate motion vector according to a proportional relationship, to obtain the first candidate motion vector;
    where the proportional relationship comprises a ratio of a first temporal difference to a second temporal difference, the first temporal difference is a difference between an image sequence number of the reference image frame determined by the first candidate motion vector and an image sequence number of the image frame in which the to-be-processed block is located, and the second temporal difference is a difference between an image sequence number of the reference image frame determined by the fourth candidate motion vector and the image sequence number of the image frame in which the to-be-processed block is located.
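Claim 13 scales a second-direction candidate by the ratio of two temporal (image sequence number) differences. A minimal sketch follows; using plain floating-point scaling, rather than the fixed-point arithmetic a real codec would use, is an assumption made only for readability.

    def scale_candidate(fourth_mv, cur_poc, ref_poc_first, ref_poc_fourth):
        """Claim 13 (sketch): scale the fourth candidate MV by td_first / td_fourth, where
        each td is the sequence-number difference between a reference frame and the current frame."""
        td_first = ref_poc_first - cur_poc     # first temporal difference
        td_fourth = ref_poc_fourth - cur_poc   # second temporal difference
        if td_fourth == 0:
            return fourth_mv                   # degenerate case; leaving the MV unscaled is an assumption
        scale = td_first / td_fourth
        return (fourth_mv[0] * scale, fourth_mv[1] * scale)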
  14. The method according to any one of claims 7 to 12, wherein the determining at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set comprises:
    when the first candidate motion vector is generated by the bidirectional prediction, determining the at least two first reference motion vectors according to a fifth candidate motion vector, where the reference block determined by the first candidate motion vector is obtained by weighting a first direction reference block determined according to the fifth candidate motion vector and a second direction reference block determined according to a sixth candidate motion vector, the fifth candidate motion vector is generated by the first direction prediction, and the sixth candidate motion vector is generated by the second direction prediction.
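For claim 14, the reference block of a bi-predicted candidate is a weighted combination of the two uni-directional reference blocks. The sketch below assumes equal 1/2 weights (the weights themselves are not specified by the claim) and uses NumPy arrays for the pixel blocks.

    import numpy as np

    def weighted_reference_block(ref_block_dir0, ref_block_dir1, w0=0.5, w1=0.5):
        """Claim 14 (sketch): combine the first-direction and second-direction reference
        blocks; equal weights are an assumption, not mandated by the claim."""
        blended = (w0 * np.asarray(ref_block_dir0, dtype=np.float64)
                   + w1 * np.asarray(ref_block_dir1, dtype=np.float64))
        return np.round(blended).astype(np.int32)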
  15. The method according to any one of claims 1 to 14, wherein the determining at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set comprises:
    determining a reference block of the to-be-processed block in the first reference image according to the first candidate motion vector; and
    searching neighboring positions of the determined reference block at a target precision to obtain at least two candidate reference blocks, where each of the candidate reference blocks corresponds to one first reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
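Claim 15 searches positions around the reference block at one of several precisions, and each searched position yields one first reference motion vector. The sketch below assumes motion vectors are stored in 1/8-pixel units (so every listed precision is representable) and uses a simple 8-neighbour search pattern; both choices are assumptions, since the claim fixes neither the storage unit nor the pattern.

    def candidate_reference_mvs(base_mv, precision_in_pixels):
        """Claim 15 (sketch): offset the base MV to neighbouring positions at the target
        precision; each offset corresponds to one candidate reference block / reference MV."""
        step = int(round(precision_in_pixels * 8))  # assumption: MVs stored in 1/8-pel units
        offsets = [(dx, dy) for dx in (-step, 0, step)
                            for dy in (-step, 0, step) if (dx, dy) != (0, 0)]
        return [(base_mv[0] + dx, base_mv[1] + dy) for dx, dy in offsets]

    # Example: half-pixel search around an eighth-pel MV (12, -3)
    # candidate_reference_mvs((12, -3), 0.5) -> 8 neighbouring candidate MVs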
  16. The method according to any one of claims 1 to 15, wherein there is a first positional relationship between the at least one first neighboring reconstructed block and the reference block determined by the first reference motion vector, there is a second positional relationship between the second neighboring block and the to-be-processed block, and the first positional relationship is the same as the second positional relationship.
  17. A method for generating a predicted motion vector, wherein the method is used for bidirectional prediction of a to-be-processed block, the bidirectional prediction comprises first direction prediction and second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list, and the method comprises:
    obtaining a first prediction mode, and generating a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction predicted motion vector in the first direction prediction; and
    obtaining a second prediction mode, and generating a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction predicted motion vector in the second direction prediction,
    where when the first prediction mode is AMVP mode, the second prediction mode is Merge mode, or when the first prediction mode is Merge mode, the second prediction mode is AMVP mode.
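Claim 17 pairs the two prediction directions with opposite modes (one AMVP, one Merge). A minimal sketch, with build_amvp_list and build_merge_list as hypothetical candidate-list constructors assumed for illustration:

    def build_bidirectional_candidate_sets(block, first_mode, build_amvp_list, build_merge_list):
        """Claim 17 (sketch): the first and second directions always use opposite modes."""
        if first_mode == "amvp":
            first_set = build_amvp_list(block, ref_list=0)    # first direction: AMVP
            second_set = build_merge_list(block, ref_list=1)  # second direction: Merge
        elif first_mode == "merge":
            first_set = build_merge_list(block, ref_list=0)   # first direction: Merge
            second_set = build_amvp_list(block, ref_list=1)   # second direction: AMVP
        else:
            raise ValueError("first prediction mode must be 'amvp' or 'merge'")
        return first_set, second_set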
  18. The method according to claim 17, wherein when the first prediction mode is Merge mode, the obtaining a first prediction mode and generating a first candidate motion vector set comprises:
    generating the first candidate motion vector set using candidate motion vectors used by Merge mode in the first direction prediction.
  19. The method according to claim 17, wherein when the second prediction mode is Merge mode, the obtaining a second prediction mode and generating a second candidate motion vector set comprises:
    generating the second candidate motion vector set using candidate prediction vectors used by Merge mode in the second direction prediction.
  20. The method according to claim 17, wherein the method is used for decoding the to-be-processed block, and before the obtaining a first prediction mode and generating a first candidate motion vector set, the method further comprises:
    parsing a bitstream to obtain first identification information, where the first identification information is used to indicate that the first prediction mode is Merge mode or AMVP mode.
  21. The method according to claim 20, wherein the first identification information being used to indicate that the first prediction mode is Merge mode or AMVP mode comprises: the first identification information is used to indicate that the first prediction mode is Merge mode; or the first identification information is used to indicate index information of the first candidate motion vector set generated based on Merge mode; or the first identification information is used to indicate that the first prediction mode is AMVP mode; or the first identification information is used to indicate index information of the first candidate motion vector set generated based on AMVP mode.
  22. The method according to claim 20, comprising: when the first identification information is used to indicate that the first prediction mode is Merge mode, using preset index information of the Merge mode.
  23. The method according to claim 22, wherein the preset index information of the Merge mode is the first candidate motion vector of the Merge mode.
  24. The method according to claim 20, wherein before the obtaining a second prediction mode and generating a second candidate motion vector set, the method further comprises:
    after determining that the first prediction mode is Merge mode, parsing the bitstream to obtain second identification information, where the second identification information is used to indicate index information of the second prediction mode, and the second prediction mode is AMVP mode.
  25. The method according to claim 20, wherein before the obtaining a second prediction mode and generating a second candidate motion vector set, the method further comprises:
    after determining that the first prediction mode is AMVP mode, parsing the bitstream to obtain second identification information, where the second identification information is used to indicate index information of the second prediction mode, and the second prediction mode is Merge mode.
  26. The method according to any one of claims 17 to 25, comprising: after determining that the first prediction mode is AMVP mode, parsing the bitstream to obtain a reference frame index and motion vector difference information of the first direction prediction; or, after determining that the second prediction mode is AMVP mode, parsing the bitstream to obtain a reference frame index and motion vector difference information of the second direction prediction.
  27. A device for generating a predicted motion vector, wherein the device comprises:
    a set generation module, configured to construct a candidate motion vector set of a to-be-processed block;
    a template matching module, configured to determine at least two first reference motion vectors according to a first candidate motion vector in the candidate motion vector set, where the first reference motion vector is used to determine a reference block of the to-be-processed block in a first reference image of the to-be-processed block;
    the template matching module is further configured to separately calculate pixel difference values or rate-distortion cost values between one or more first neighboring reconstructed blocks of at least two of the determined reference blocks and one or more second neighboring reconstructed blocks of the to-be-processed block, where the first neighboring reconstructed block and the second neighboring reconstructed block have the same shape and the same size; and
    a predicted motion vector generation module, configured to obtain a first predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors that has the smallest corresponding pixel difference value or rate-distortion cost.
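The template matching module in claim 27 compares already-reconstructed neighbouring samples of each candidate reference block against those of the current block (an L-shaped template of top and left neighbours is one common choice; the claim only requires same-shaped, same-sized neighbouring blocks). A minimal NumPy-based sketch follows; using a plain sum of absolute differences instead of a full rate-distortion cost, and the helper fetch_ref_template, are assumptions made for illustration.

    import numpy as np

    def template_cost(ref_template, cur_template):
        """Sum of absolute differences between the reference-block template and the
        current-block template (same shape and size, per claim 27)."""
        return int(np.abs(np.asarray(ref_template, dtype=np.int64)
                          - np.asarray(cur_template, dtype=np.int64)).sum())

    def best_reference_mv(candidate_ref_mvs, fetch_ref_template, cur_template):
        """Pick the first reference MV whose template cost is smallest; fetch_ref_template
        is a hypothetical helper returning the neighbouring reconstructed samples
        addressed by a given MV in the first reference image."""
        return min(candidate_ref_mvs,
                   key=lambda mv: template_cost(fetch_ref_template(mv), cur_template))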
  28. The device according to claim 27, wherein the device is used for decoding the to-be-processed block, and that the predicted motion vector generation module is configured to obtain the first predicted motion vector of the to-be-processed block according to the one of the at least two first reference motion vectors that has the smallest corresponding pixel difference value or rate-distortion cost comprises:
    the predicted motion vector generation module is configured to use the first reference motion vector having the smallest corresponding pixel difference value or rate-distortion cost as the first predicted motion vector of the to-be-processed block.
  29. The device according to claim 26 or 27, wherein the device is used for decoding the to-be-processed block, and the device further comprises a parsing module;
    the parsing module is configured to parse a bitstream to obtain identification information, where the identification information is used to indicate whether the candidate motion vector set is a candidate motion vector set used in Merge mode or a candidate motion vector set used in AMVP mode;
    when the identification information indicates that the candidate motion vector set is the candidate motion vector set used in Merge mode, the set generation module is configured to construct the candidate motion vector set used by the to-be-processed block in Merge mode; and
    when the identification information indicates that the candidate motion vector set is the candidate motion vector set used in AMVP mode, the set generation module is configured to construct the candidate motion vector set used by the to-be-processed block in AMVP mode.
  30. The device according to any one of claims 27 to 29, wherein the parsing module is further configured to parse the bitstream to obtain index information, and determine the first candidate motion vector according to the index information.
  31. The device according to any one of claims 27 to 30, wherein the candidate motion vector set is used for bidirectional prediction of the to-be-processed block, the bidirectional prediction comprises first direction prediction and second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list.
  32. The device according to claim 31, wherein the first reference frame list comprises the first reference image, the second reference frame list comprises a second reference image, and the predicted motion vector generation module is further configured to:
    calculate a difference between the first candidate motion vector after replacement by the first reference motion vector having the smallest corresponding pixel difference value or rate-distortion cost and the first candidate motion vector before the replacement;
    obtain a third candidate motion vector according to the difference and a second candidate motion vector in the candidate motion vector set, where the second candidate motion vector is used to determine a reference block of the to-be-processed block in the second reference image; and
    use the third candidate motion vector as a second predicted motion vector of the to-be-processed block.
  33. The device according to claim 31, wherein the first reference frame list comprises the first reference image, and the second reference frame list comprises a second reference image;
    the template matching module is further configured to determine at least two second reference motion vectors according to a second candidate motion vector in the candidate motion vector set, where the second reference motion vector is used to determine a reference block of the to-be-processed block in the second reference image;
    the template matching module is further configured to separately calculate pixel difference values or rate-distortion cost values between one or more third neighboring reconstructed blocks of at least two of the determined reference blocks and one or more fourth neighboring reconstructed blocks of the to-be-processed block, where the third neighboring reconstructed block and the fourth neighboring reconstructed block have the same shape and the same size; and
    the predicted motion vector generation module is further configured to obtain a second predicted motion vector of the to-be-processed block according to the one of the at least two second reference motion vectors that has the smallest corresponding pixel difference value or rate-distortion cost.
  34. The device according to claim 33, wherein that the predicted motion vector generation module obtains the predicted motion vector of the to-be-processed block according to the one of the at least two second reference motion vectors that has the smallest corresponding pixel difference value or rate-distortion cost comprises:
    the predicted motion vector generation module is configured to replace the second candidate motion vector in the candidate motion vector set with the second reference motion vector having the smallest corresponding pixel difference value or rate-distortion cost, to obtain a further updated candidate motion vector set; and
    obtain the second predicted motion vector of the to-be-processed block from the further updated candidate motion vector set.
  35. The device according to any one of claims 31 to 34, wherein before the template matching module determines the at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set:
    the template matching module is configured to, when a fourth candidate motion vector in the candidate motion vector set is generated by the second direction prediction, scale down or scale up the fourth candidate motion vector according to a proportional relationship, to obtain the first candidate motion vector;
    where the proportional relationship comprises a ratio of a first temporal difference to a second temporal difference, the first temporal difference is a difference between an image sequence number of the reference image frame determined by the first candidate motion vector and an image sequence number of the image frame in which the to-be-processed block is located, and the second temporal difference is a difference between an image sequence number of the reference image frame determined by the fourth candidate motion vector and the image sequence number of the image frame in which the to-be-processed block is located.
  36. The device according to any one of claims 31 to 34, wherein that the template matching module determines the at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set comprises:
    the template matching module is configured to, when the first candidate motion vector is generated by the bidirectional prediction, determine the at least two first reference motion vectors according to a fifth candidate motion vector, where the reference block determined by the first candidate motion vector is obtained by weighting a first direction reference block determined according to the fifth candidate motion vector and a second direction reference block determined according to a sixth candidate motion vector, the fifth candidate motion vector is generated by the first direction prediction, and the sixth candidate motion vector is generated by the second direction prediction.
  37. The device according to any one of claims 27 to 36, wherein that the template matching module determines the at least two first reference motion vectors according to the first candidate motion vector in the candidate motion vector set comprises:
    the template matching module is configured to determine a reference block of the to-be-processed block in the first reference image according to the first candidate motion vector; and
    search neighboring positions of the determined reference block at a target precision to obtain at least two candidate reference blocks, where each of the candidate reference blocks corresponds to one first reference motion vector, and the target precision is one of 4-pixel precision, 2-pixel precision, integer-pixel precision, half-pixel precision, 1/4-pixel precision, and 1/8-pixel precision.
  38. The device according to any one of claims 27 to 37, wherein there is a first positional relationship between the at least one first neighboring reconstructed block and the reference block determined by the first reference motion vector, there is a second positional relationship between the second neighboring block and the to-be-processed block, and the first positional relationship is the same as the second positional relationship.
  39. A device for generating a predicted motion vector, wherein the device is used for bidirectional prediction of a to-be-processed block, the bidirectional prediction comprises first direction prediction and second direction prediction, the first direction prediction is prediction based on a first reference frame list, and the second direction prediction is prediction based on a second reference frame list, and the device comprises:
    a first set generation module, configured to obtain a first prediction mode and generate a first candidate motion vector set, where the first candidate motion vector set is used to generate a first direction predicted motion vector in the first direction prediction; and
    a second set generation module, configured to obtain a second prediction mode and generate a second candidate motion vector set, where the second candidate motion vector set is used to generate a second direction predicted motion vector in the second direction prediction,
    where when the first prediction mode is AMVP mode, the second prediction mode is Merge mode, or when the first prediction mode is Merge mode, the second prediction mode is AMVP mode.
  40. The device according to claim 39, wherein when the first prediction mode is Merge mode, that the first set generation module obtains the first prediction mode and generates the first candidate motion vector set comprises:
    the first set generation module is configured to generate the first candidate motion vector set using candidate motion vectors used by Merge mode in the first direction prediction.
  41. The device according to claim 39, wherein when the second prediction mode is Merge mode, that the second set generation module obtains the second prediction mode and generates the second candidate motion vector set comprises:
    the second set generation module is configured to generate the second candidate motion vector set using candidate prediction vectors used by Merge mode in the second direction prediction.
  42. The device according to claim 39, wherein the device is used for decoding the to-be-processed block, and the device further comprises a parsing module, where the parsing module is configured to parse a bitstream to obtain first identification information, and the first identification information is used to indicate that the first prediction mode is Merge mode or AMVP mode.
  43. The device according to claim 42, wherein the first identification information being used to indicate that the first prediction mode is Merge mode or AMVP mode comprises: the first identification information is used to indicate that the first prediction mode is Merge mode; or the first identification information is used to indicate index information of the first candidate motion vector set generated based on Merge mode; or the first identification information is used to indicate that the first prediction mode is AMVP mode; or the first identification information is used to indicate index information of the first candidate motion vector set generated based on AMVP mode.
  44. The device according to claim 42, wherein the parsing module is further configured to, when the first identification information is used to indicate that the first prediction mode is Merge mode, use preset index information of the Merge mode.
  45. The device according to claim 44, wherein the preset index information of the Merge mode is the first candidate motion vector of the Merge mode.
  46. The device according to claim 42, wherein the parsing module is further configured to, after determining that the first prediction mode is Merge mode, parse the bitstream to obtain second identification information, where the second identification information is used to indicate index information of the second prediction mode, and the second prediction mode is AMVP mode.
  47. The device according to claim 42, wherein the parsing module is further configured to, after determining that the first prediction mode is AMVP mode, parse the bitstream to obtain second identification information, where the second identification information is used to indicate index information of the second prediction mode, and the second prediction mode is Merge mode.
  48. The device according to any one of claims 39 to 47, wherein the parsing module is further configured to: after determining that the first prediction mode is AMVP mode, parse the bitstream to obtain a reference frame index and motion vector difference information of the first direction prediction; or, after determining that the second prediction mode is AMVP mode, parse the bitstream to obtain a reference frame index and motion vector difference information of the second direction prediction.
PCT/CN2019/070629 2018-03-07 2019-01-07 Method for generating predicted motion vector and related apparatus WO2019169949A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810188482.6A CN110248188A (en) 2018-03-07 2018-03-07 Predicted motion vector generation method and relevant device
CN201810188482.6 2018-03-07

Publications (1)

Publication Number Publication Date
WO2019169949A1 true WO2019169949A1 (en) 2019-09-12

Family

ID=67845540

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/070629 WO2019169949A1 (en) 2018-03-07 2019-01-07 Method for generating predicted motion vector and related apparatus

Country Status (2)

Country Link
CN (1) CN110248188A (en)
WO (1) WO2019169949A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111163322B (en) * 2020-01-08 2022-08-30 绍兴文理学院 Encoding and decoding method for mapping index based on historical motion vector
WO2022061573A1 (en) * 2020-09-23 2022-03-31 深圳市大疆创新科技有限公司 Motion search method, video coding device, and computer-readable storage medium
CN112702601B (en) * 2020-12-17 2023-03-10 北京达佳互联信息技术有限公司 Method and apparatus for determining motion vector for inter prediction
WO2023131045A1 (en) * 2022-01-05 2023-07-13 维沃移动通信有限公司 Inter-frame prediction method and device, and readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104363449A (en) * 2014-10-31 2015-02-18 华为技术有限公司 Method and relevant device for predicting pictures
CN107113424A (en) * 2014-11-18 2017-08-29 联发科技股份有限公司 Bidirectional predictive video coding method based on the motion vector from single directional prediction and merging candidate

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101686393B (en) * 2008-09-28 2012-10-17 华为技术有限公司 Fast-motion searching method and fast-motion searching device applied to template matching
CN101931803B (en) * 2009-06-26 2013-01-09 华为技术有限公司 Method, device and equipment for acquiring motion information of video image and template constructing method
WO2017048008A1 (en) * 2015-09-17 2017-03-23 엘지전자 주식회사 Inter-prediction method and apparatus in video coding system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104363449A (en) * 2014-10-31 2015-02-18 华为技术有限公司 Method and relevant device for predicting pictures
CN107113424A (en) * 2014-11-18 2017-08-29 联发科技股份有限公司 Bidirectional predictive video coding method based on the motion vector from single directional prediction and merging candidate

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN, CHUNCHI ET AL.: "EE 3: Generalized bi-prediction", JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3, 15 October 2016 (2016-10-15), Chengdu, CN *
CHEN, XU ET AL.: "Decoder-Side Motion Vector Refinement Based on Bilateral Template Matching", JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3, JVET-D0029, 1 October 2016 (2016-10-01) - 15 October 2016 (2016-10-15), Chengdu, CN, pages 1-2, XP030150254 *
LIN, YONGBING: "Enhanced Template Matching in FRUC Mode", JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3, 12 January 2017 (2017-01-12), Meeting: Geneva, CH, pages 1 - 3, XP055615329 *

Also Published As

Publication number Publication date
CN110248188A (en) 2019-09-17

Similar Documents

Publication Publication Date Title
WO2019169949A1 (en) Method for generating predicted motion vector and related apparatus
JP6874947B2 (en) Motion vector prediction method and device
JP6881788B2 (en) Video coding method, video decoding method, and terminal
WO2018058622A1 (en) Method and device for image coding and decoding for inter-frame prediction
RU2577181C2 (en) Method and device for video signal encoding
CN109218733B (en) Method for determining prediction motion vector prediction and related equipment
CN110545425B (en) Inter-frame prediction method, terminal equipment and computer storage medium
US11412210B2 (en) Inter prediction method and apparatus for video coding
CN113709458B (en) Displacement vector prediction method, device and equipment in video coding and decoding
KR20230155621A (en) Motion vector prediction method and apparatus, encoder, and decoder
CN109565601B (en) Template matching-based prediction method and device
JP2023100767A (en) Picture encoding and decoding method and device for video sequence
JP7388610B2 (en) Video encoding method, video decoding method, and terminal
JP6968228B2 (en) Methods and equipment for image coding and decoding via inter-prediction
WO2018120290A1 (en) Prediction method and device based on template matching
AU2018440297B2 (en) Method and apparatus for coding image of video sequence, and terminal device
CN116347082A (en) Method and device for encoding multimedia resources, electronic equipment and storage medium
BR112019006146B1 (en) VIDEO ENCODING METHOD, VIDEO DECODING METHOD, AND TERMINAL

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19764366

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19764366

Country of ref document: EP

Kind code of ref document: A1