WO2021056215A1 - Motion estimation method and system, and storage medium - Google Patents

Motion estimation method and system, and storage medium

Info

Publication number
WO2021056215A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
accuracy
coding unit
motion
affine
Prior art date
Application number
PCT/CN2019/107601
Other languages
French (fr)
Chinese (zh)
Inventor
马思伟
孟学苇
郑萧桢
王苫社
Original Assignee
深圳市大疆创新科技有限公司
北京大学
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司, 北京大学
Priority to PCT/CN2019/107601 (WO2021056215A1)
Priority to CN201980066902.6A (CN112868234A)
Publication of WO2021056215A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation

Definitions

  • the present invention relates to the technical field of video coding and decoding, in particular to a motion estimation method, system and storage medium.
  • The basic principle of video coding is to exploit spatial, temporal, and codeword correlations to remove as much redundancy as possible.
  • Current video coding schemes mainly include intra-frame prediction, inter-frame prediction, transformation, quantization, entropy coding, and loop filtering.
  • The inter-frame prediction technology exploits the temporal correlation between adjacent frames of the video: it uses a previously encoded and reconstructed frame as a reference frame, and predicts the current frame (that is, the frame currently being encoded) through motion estimation (ME) and motion compensation (MC), thereby removing the temporal redundancy of the video.
  • The image can be divided into several coding units; for each coding unit, its position in an adjacent frame is searched for, and the relative spatial offset between the two positions is obtained. This relative offset is usually referred to as a motion vector (MV), and the process of obtaining the motion vector is called motion estimation.
  • Motion compensation is the process of using MV and reference frames to obtain the predicted frame.
  • the predicted frame obtained by this process may be different from the original current frame. Therefore, the difference between the predicted frame and the current frame needs to be transformed and quantized.
  • the MV information is passed to the decoder, so that the decoder can reconstruct the current frame through the MV, the reference frame, and the difference between the predicted frame and the current frame.
  • Motion estimation is an important link that affects the efficiency of video coding. Therefore, how to optimize the motion estimation method has always been a concern of those skilled in the art.
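As an illustration of the motion estimation process described above, the following is a minimal sketch of integer-pixel block matching. The function names, the exhaustive search pattern, and the sum-of-absolute-differences (SAD) cost are assumptions chosen for clarity; the source does not prescribe a particular search strategy or cost measure.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a displaced block in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Return (mv, cost): the integer-pixel displacement (dx, dy) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block falls outside the frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Motion compensation then amounts to copying (or interpolating) the reference block addressed by the returned displacement; the remaining difference is what gets transformed and quantized.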
  • the first aspect of the embodiments of the present invention provides a motion estimation method, the method includes:
  • For the affine coding unit in the current frame, select one of at least four kinds of motion vector accuracy to perform motion estimation in the reference frame, thereby determining the motion vector of the control point of the affine coding unit;
  • the motion vector of the sub-unit in the affine coding unit is calculated according to the motion vector of the control point.
  • the second aspect of the embodiments of the present invention provides another motion estimation method, and the method includes:
  • For the affine coding unit in the current frame, select one of multiple motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vector of the control point of the affine coding unit, wherein the multiple motion vector precisions include 1/2 pixel precision;
  • the motion vector of the sub-unit in the affine coding unit is calculated according to the motion vector of the control point.
  • a third aspect of the embodiments of the present invention provides a motion estimation system.
  • the system includes a storage device and a processor.
  • the storage device stores a computer program run by the processor.
  • the processor executes the above-mentioned motion estimation method while it is running.
  • a fourth aspect of the embodiments of the present invention provides a storage medium on which a computer program is stored, and the computer program executes the above-mentioned motion estimation method when running.
  • the motion estimation method, system and storage medium of the present invention unify the design of the motion vector accuracy in the affine mode with the motion vector accuracy in the conventional mode, and improve the coding performance.
  • Fig. 1 shows a flowchart of a motion estimation method according to an embodiment of the present invention
  • Fig. 2 shows a schematic diagram of a motion vector of a control point of an affine coding unit according to an embodiment of the present invention
  • Fig. 3 shows a schematic diagram of motion vectors of subunits of an affine coding unit according to an embodiment of the present invention
  • Fig. 4 shows a flowchart of a motion estimation method according to another embodiment of the present invention.
  • Fig. 5 shows a structural block diagram of a motion estimation system according to an embodiment of the present invention.
  • the motion estimation method of the embodiment of the present invention can be applied to the inter-frame prediction part of the video coding and decoding technology.
  • Video is generally composed of multiple frames of images in a certain order. A single frame often contains many identical or similar spatial structures, that is, there is a lot of spatially redundant information in the video file. In addition, since the sampling interval between two adjacent frames of the video is extremely short, adjacent frames are usually highly similar, that is, there is a large amount of temporally redundant information in the video. Furthermore, from the perspective of the visual sensitivity of the human eye, part of the video information can also be exploited for compression, that is, visually redundant information.
  • video image information also has a series of redundant information such as information entropy redundancy, structural redundancy, knowledge redundancy, importance redundancy and so on.
  • the purpose of video coding is to remove redundant information in a video sequence, so as to reduce storage space and save transmission bandwidth.
  • video coding mainly includes intra-frame prediction, inter-frame prediction, transformation, quantization, entropy coding, and loop filtering.
  • the embodiment of the present invention mainly aims at improving the inter-frame prediction part.
  • The inter-frame prediction technology uses the time-domain correlation between adjacent frames of the video, uses the previously encoded reconstructed frame as a reference frame, and predicts the current frame (the frame currently being encoded) through motion estimation and motion compensation, thereby removing the temporal redundancy of the video.
  • the motion estimation method, system, and storage medium described in the embodiments of the present invention use the HEVC standard or its extension.
  • the present invention is also applicable to other coding standards, such as the H.264 standard, the next generation video coding standard VVC, AVS3, or any other suitable coding standard.
  • Fig. 1 shows a flowchart of a motion estimation method 100 according to an embodiment of the present invention. As shown in FIG. 1, the method 100 includes the following steps:
  • In step S110, for the affine coding unit in the current frame, one of at least four kinds of motion vector accuracy is selected to perform motion estimation in the reference frame, thereby determining the motion vector of the control point of the affine coding unit.
  • the current frame is the video frame currently to be encoded.
  • the current frame can be a video frame collected in real time, or a video frame extracted from a storage medium.
  • the reference frame is the video frame to be referred to when encoding the current frame.
  • the reference frame may be a reconstructed video frame obtained by reconstructing the encoded data corresponding to the video frame that can be used as the reference frame.
  • the reference frame can be a forward reference frame, a backward reference frame, or a bidirectional reference frame.
  • inter-frame prediction techniques include forward prediction, backward prediction, bidirectional prediction, and so on.
  • Forward prediction uses the previous frame (historical frame) of the current frame as a reference frame to predict the current frame.
  • Backward prediction uses the frame after the current frame (future frame) as a reference frame to predict the current frame.
  • Bidirectional prediction uses not only historical frames but also future frames to predict the current frame.
  • a bidirectional prediction mode is adopted, that is, the reference frame includes both historical frames and future frames.
  • the affine coding unit in the current frame is a coding unit (CU) divided in the current frame based on the affine motion compensation prediction (Affine) technology.
  • The traditional motion model only includes translational motion, but in reality there are many other forms of motion, such as zooming, rotation, perspective motion, and other irregular motions, which is why the Affine technology was introduced.
  • the processing unit in the Affine technology is no longer the entire coding unit, but divides the entire coding unit into multiple sub-units. In the process of motion compensation, motion compensation is performed in the unit of sub-units.
  • the affine coding unit in the Affine mode no longer has only one motion vector, but each subunit in the affine coding unit has its own motion vector.
  • The motion vector of each subunit in the affine coding unit is derived from the motion vectors of the two control points of the affine coding unit (i.e., the four-parameter model, see the left figure in Fig. 2) or of the three control points (i.e., the six-parameter model, see the right figure in Fig. 2). Only the motion vector information of the control points needs to be written into the code stream, not the motion vector information of each subunit.
  • the motion vector of the control point is first determined.
  • The embodiment of the present invention adopts the adaptive motion vector accuracy (AMVR) technology to adaptively determine the accuracy of the motion vector at the encoder side.
  • The determination of the motion vector of the control point is based on the Inter mode (also known as the AMVP mode) in the Affine mode. In this mode, the motion vector accuracy is selected at the encoder side, and the MVD (motion vector difference) is calculated.
  • the selectable motion vector precision includes four kinds, and for each coding unit, one of the four kinds of motion vector precision is selected for motion estimation.
  • the at least four motion vector precisions include any four of 4 pixels, 2 pixels, whole pixels, 1/2 pixels, 1/4 pixels, 1/8 pixels, and 1/16 pixels.
  • the four kinds of motion vector precisions may be integer pixel precision, 1/2 pixel precision, 1/4 pixel precision, and 1/16 pixel precision.
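To make the relationship between these precisions concrete, the sketch below expresses each candidate precision as a step size in 1/16-pixel units and rounds a motion vector component to a chosen precision. The table name, the string keys, and the round-half-away-from-zero rule are illustrative assumptions; the source only lists the precisions themselves.

```python
# Step size of each candidate precision, expressed in 1/16-pixel units.
PRECISION_STEP = {
    "4": 64, "2": 32, "1": 16, "1/2": 8, "1/4": 4, "1/8": 2, "1/16": 1,
}

# One possible set of four precisions, matching the example in the text.
AFFINE_PRECISIONS = ("1", "1/2", "1/4", "1/16")


def round_to_precision(mv_component, precision):
    """Round an MV component (in 1/16-pixel units) to the nearest multiple of the step."""
    step = PRECISION_STEP[precision]
    # Round half away from zero so positive and negative values behave symmetrically
    # (an assumed convention, not taken from the source).
    sign = -1 if mv_component < 0 else 1
    return sign * ((abs(mv_component) + step // 2) // step) * step
```

For example, a component of 21 (that is, 21/16 pixel) becomes 24 at 1/2-pixel precision and 16 at integer-pixel precision.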
  • The conventional AMVP mode includes four AMVR precisions. Compared with the previous Affine mode, which had three precisions, the embodiment of the present invention adds one more precision, so that the number of motion vector precisions available to the affine coding unit equals the number available to the conventional coding unit. The design of adaptive motion vector accuracy in the Affine mode is thereby unified with that in the conventional AMVP mode. In one embodiment, the newly added precision is 1/2 pixel precision.
  • The motion vector accuracy referred to here is that of the control-point motion vectors in the Affine mode, not the motion vector accuracy actually used during sub-unit motion compensation.
  • the method for determining the accuracy of the motion vector includes: selecting the accuracy of the motion vector according to the selected motion vector accuracy of the neighboring coding unit.
  • the method for determining the accuracy of a motion vector may further include: attempting to perform motion estimation based on at least two of the four kinds of motion vector accuracy, and selecting the motion vector accuracy based on the effect of the motion estimation.
  • Two kinds of motion vector precisions can be selected from the four optional motion vector precisions, motion estimation is attempted with each, and the effects of the two motion estimations are compared. For example, 1/2 pixel precision and integer pixel precision can be selected to perform motion estimation separately.
  • If the motion estimation effect with the lower motion vector accuracy is better, stop trying and directly use the lower motion vector accuracy as the selected motion vector accuracy. For example, if motion estimation with integer pixel precision gives a better result than motion estimation with 1/2 pixel precision, no other precisions are attempted and integer pixel precision is selected directly. If the motion estimation effect with the higher motion vector accuracy is better, continue trying motion estimation with still higher motion vector accuracy until the best motion estimation effect is obtained. For example, if 1/2 pixel precision gives a better result than integer pixel precision, then 1/4 pixel precision is tried next.
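The early-terminating trial procedure described above can be sketched as follows. The function `select_precision` and its `cost_of` callback are hypothetical: `cost_of` stands in for running motion estimation at one precision and returning a cost (lower is better), and keeping the coarser precision on a tie is an assumption.

```python
def select_precision(cost_of, precisions):
    """Pick a precision from `precisions`, which must be ordered coarse to fine.

    cost_of(precision) is a caller-supplied function that runs motion
    estimation at that precision and returns a cost; lower is better.
    """
    best = precisions[0]
    best_cost = cost_of(best)
    for finer in precisions[1:]:
        cost = cost_of(finer)
        if cost >= best_cost:
            # The coarser precision is at least as good: stop trying.
            break
        best, best_cost = finer, cost
    return best
```

With this ordering, finer precisions are only evaluated while they keep improving the result, which limits the number of motion estimation passes the encoder has to run.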
  • Determining the motion vector of the control point of the affine coding unit includes: first, obtaining the motion vectors of spatially or temporally adjacent coding units, and constructing a candidate list from combinations of those motion vectors.
  • the motion vector obtained in this process may be the motion vector of the control point of the coding unit in the Affine mode, or the motion vector of the conventional coding unit in the traditional mode.
  • the obtained motion vectors are combined to construct a candidate list of control point motion vectors, and the number of motion vectors in each combination depends on the number of control points of the affine coding unit.
  • A combination is then selected from the candidate list as the motion vector predictor (MVP).
  • the corresponding reference block can be determined in the reference frame according to the predicted motion vector.
  • interpolation processing is performed on the reference block to generate fractional pixels, and then the actual motion vector is determined.
  • the encoding end can also calculate the difference MVD (Motion Vector Difference) between the actual motion vector and the predicted motion vector, encode the MVD, and send the encoded MVD and the index of the predicted motion vector in the candidate list to the decoding end.
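The predictor-plus-difference signalling described above can be sketched as follows. The function names are hypothetical, and choosing the candidate that minimises the total MVD magnitude is a simplifying assumption; a real encoder would typically weigh the bit cost against distortion.

```python
def encode_mvd(actual_mvs, candidate_list):
    """Choose the predictor closest to the actual control-point MVs; return (index, mvds).

    actual_mvs: list of (x, y) control-point MVs found by motion estimation.
    candidate_list: list of candidate combinations, each a list of (x, y) MVs
    with one entry per control point.
    """
    def mvd_magnitude(candidate):
        return sum(abs(ax - px) + abs(ay - py)
                   for (ax, ay), (px, py) in zip(actual_mvs, candidate))

    index = min(range(len(candidate_list)),
                key=lambda i: mvd_magnitude(candidate_list[i]))
    mvp = candidate_list[index]
    mvds = [(ax - px, ay - py) for (ax, ay), (px, py) in zip(actual_mvs, mvp)]
    return index, mvds


def decode_mvs(index, mvds, candidate_list):
    """Decoder side: rebuild the control-point MVs from the predictor index and the MVDs."""
    mvp = candidate_list[index]
    return [(px + dx, py + dy) for (px, py), (dx, dy) in zip(mvp, mvds)]
```

Because the decoder builds the same candidate list, only the index and the (usually small) differences need to be transmitted.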
  • The accuracy of the motion vector includes integer pixel accuracy and fractional pixel accuracy. Since pixels at fractional positions do not exist, the reference block must be interpolated to obtain the pixels at sub-pixel positions. Interpolation uses the values of integer pixels to generate fractional pixels between the integer samples. The more fractional pixels are generated between integer pixels, the higher the resolution of the reference block becomes, and the more accurately a displacement of fractional pixel precision can be compensated. As interpolation accuracy improves, the efficiency of motion estimation and motion compensation improves to a certain extent.
  • The accuracy of the motion vector in the Affine mode can be an integer pixel accuracy, such as 1 pixel or 2 pixels; it can also be a sub-pixel accuracy, such as 1/2, 1/4, or 1/8 pixel.
  • the pixel at 1/2 precision position needs to be obtained by interpolation of the pixel at the whole pixel position.
  • the pixel values of other precision positions need to be obtained by further interpolation using integer-pixel precision pixels or 1/2-precision pixels.
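As a concrete illustration of generating half-pixel samples from integer-pixel samples, the sketch below applies a 6-tap filter along one row. The coefficients (1, -5, 20, 20, -5, 1)/32 are those of the H.264 half-sample luma filter and are used here only as an example; the source does not specify filter coefficients, and the edge-padding behaviour is likewise an assumption.

```python
# Half-sample luma filter taps from H.264, used here purely as an illustration.
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)


def half_pel_row(samples, max_value=255):
    """Return the half-pixel samples lying between neighbouring integer samples of a row.

    Samples outside the row are taken from the nearest edge sample (edge padding).
    """
    n = len(samples)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            # Taps cover integer positions i-2 .. i+3 around the half position i + 1/2.
            pos = min(max(i - 2 + k, 0), n - 1)
            acc += tap * samples[pos]
        # Taps sum to 32: add 16 for rounding, shift by 5, then clip to the sample range.
        out.append(min(max((acc + 16) >> 5, 0), max_value))
    return out
```

Quarter-pixel and finer positions would be produced by further filtering of the integer and half-pixel samples, as the text describes.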
  • an interpolation filter can be selected according to the selected motion vector accuracy to perform interpolation processing on the reference block.
  • the same interpolation filter may be used for all motion vector accuracy.
  • the existing six-tap interpolation filter is used by default.
  • the identification bit that characterizes the type of the interpolation filter may not be set in the code stream, thereby saving one bit of data.
  • different interpolation filters can be selected according to different motion vector accuracy.
  • 1/2 precision uses a 6-tap interpolation filter, and all other precisions use an 8-tap interpolation filter. Therefore, in an embodiment of the present invention, when 1/2 pixel precision is selected as the motion vector precision, a first interpolation filter is selected to perform interpolation processing on the reference block; when a precision other than 1/2 pixel precision is selected as the motion vector precision, a second interpolation filter is selected to perform interpolation processing on the reference block, wherein the first interpolation filter and the second interpolation filter have different numbers of taps.
  • the first interpolation filter may be a 6-tap interpolation filter
  • the second interpolation filter may be an 8-tap interpolation filter.
  • the filter type identification bit can be set in the code stream. For example, 1 can be used to indicate that a 6-tap interpolation filter is used; 0 can be used to indicate that a 6-tap interpolation filter is not used, that is, the default 8-tap interpolation filter is used.
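The mapping between motion vector precision, filter length, and the filter type identification bit in this example can be sketched as below. The function names and the string representation of precisions are hypothetical; the tap counts and flag values follow the example in the text.

```python
def interpolation_filter_taps(precision):
    """Return the tap count used for a given motion vector precision.

    Follows the example in the text: 6 taps for 1/2-pixel precision, 8 taps otherwise.
    """
    return 6 if precision == "1/2" else 8


def filter_type_flag(precision):
    """Flag value written to the code stream: 1 = 6-tap filter, 0 = default 8-tap filter."""
    return 1 if interpolation_filter_taps(precision) == 6 else 0


def taps_from_flag(flag):
    """Decoder side: map the parsed flag back to a tap count."""
    return 6 if flag == 1 else 8
```

When a single filter is used for all precisions, as in the alternative described earlier, the flag carries no information and can be omitted from the code stream.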
  • When the motion estimation method 100 is applied to the decoding end, if different interpolation filters are selected for different motion vector accuracies, then before selecting the interpolation filter according to the motion vector accuracy, the method further includes: acquiring a code stream that carries an identification bit indicating the filter type corresponding to the motion vector.
  • motion estimation can include both Affine mode and regular AMVP mode.
  • motion estimation is performed with the entire coding unit as a unit.
  • For each conventional coding unit, when adaptive motion vector accuracy (AMVR) is applied, one of four motion vector accuracies is likewise adaptively selected for motion estimation.
  • The four motion vector accuracies of the conventional coding unit may be the same as or different from the four motion vector accuracies of the affine coding unit.
  • the four kinds of motion vector precisions may include integer pixel, 4 pixel, 1/4 pixel and 1/2 pixel precision.
  • the accuracy of the motion vector is not limited to the above four types, for example, it may also include 1/8 pixel, 1/16 pixel, and so on.
  • the corresponding motion vector accuracy is adaptively decided at the coding end, and the result of the decision is written into the code stream and passed to the decoding end.
  • the identifier indicating the accuracy of the motion vector of the affine coding unit is consistent with the identifier indicating the accuracy of the motion vector of the conventional coding unit, so that the two modes are more unified.
  • When the motion estimation method 100 is applied to the decoding end, before selecting one of the at least four motion vector precisions to perform motion estimation in the reference frame, the method further includes: acquiring a code stream in which an identification bit records the selected motion vector accuracy of the affine coding unit; the identifier representing the motion vector accuracy of the affine coding unit is consistent with the identifier representing the motion vector accuracy of the conventional coding unit.
  • In step S120, the affine coding unit is divided into several subunits.
  • The size of the sub-units may be fixed; for example, each sub-unit has a size of 4×4 pixels.
  • the size of the subunit may also be determined in other ways. For example, a subunit of an appropriate size may be selected to reduce the complexity of coding and decoding.
  • In step S130, the motion vector of the subunit in the affine coding unit is calculated according to the motion vector of the control point.
  • The motion field of the Affine mode can be derived from the motion vectors of two control points (four parameters) or three control points (six parameters).
  • For the four-parameter model, the motion vector of the subunit located at the (x, y) position is calculated by formula (1), in which:
  • (mv_0x, mv_0y) is the motion vector of the control point in the upper left corner;
  • (mv_1x, mv_1y) is the motion vector of the control point in the upper right corner;
  • x and y are the coordinates of the center point of the subunit;
  • w is the width of the affine coding unit.
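Assuming formula (1) is the four-parameter affine motion model commonly used in VVC-style codecs, it can be written with the symbols defined here as follows, where (mv_x, mv_y) denotes the motion vector of the subunit (a name introduced for this illustration):

```latex
\[
\begin{cases}
mv_x = \dfrac{mv_{1x} - mv_{0x}}{w}\,x - \dfrac{mv_{1y} - mv_{0y}}{w}\,y + mv_{0x} \\[6pt]
mv_y = \dfrac{mv_{1y} - mv_{0y}}{w}\,x + \dfrac{mv_{1x} - mv_{0x}}{w}\,y + mv_{0y}
\end{cases}
\tag{1}
\]
```

The same two gradients appear in both rows with a sign change, which is what restricts this model to translation, rotation, and uniform zoom.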
  • For the six-parameter model, the motion vector of the sub-unit at the position (x, y) is calculated by formula (2), in which:
  • (mv_0x, mv_0y) is the motion vector of the control point in the upper left corner;
  • (mv_1x, mv_1y) is the motion vector of the control point in the upper right corner;
  • (mv_2x, mv_2y) is the motion vector of the control point in the lower left corner;
  • w is the width of the affine coding unit.
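Assuming formula (2) is the six-parameter affine motion model commonly used in VVC-style codecs, it can be written as follows. The symbol h, the height of the affine coding unit, is an assumption: it is needed by the standard form of the model but is not among the symbols defined in the text, and (mv_x, mv_y) again denotes the motion vector of the sub-unit.

```latex
\[
\begin{cases}
mv_x = \dfrac{mv_{1x} - mv_{0x}}{w}\,x + \dfrac{mv_{2x} - mv_{0x}}{h}\,y + mv_{0x} \\[6pt]
mv_y = \dfrac{mv_{1y} - mv_{0y}}{w}\,x + \dfrac{mv_{2y} - mv_{0y}}{h}\,y + mv_{0y}
\end{cases}
\tag{2}
\]
```

Here the vertical gradients come from the lower-left control point and are independent of the horizontal ones, so shear and non-uniform scaling can also be represented.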
  • A schematic diagram of the motion vectors in an affine coding unit is shown in Fig. 3, where each square represents a 4×4 subunit. All motion vectors calculated by the above formulas are rounded to a 1/16 pixel precision representation.
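The derivation of the sub-unit motion field can be sketched as below, under the assumption that the standard four-parameter and six-parameter affine models apply. The function name, the dictionary layout, and the use of Python's built-in rounding (ties go to the even integer) are illustrative choices, and control-point motion vectors are assumed to be given in 1/16-pixel units already.

```python
from fractions import Fraction


def subunit_motion_vectors(cpmvs, width, height, sub=4):
    """Derive one MV per sub-unit from the control-point MVs (CPMVs).

    cpmvs: [(mv0x, mv0y), (mv1x, mv1y)] for the four-parameter model, or with a
    third entry (mv2x, mv2y) for the six-parameter model; all in 1/16-pixel units.
    Returns a dict mapping each sub-unit's top-left (x, y) to its rounded MV.
    """
    (mv0x, mv0y), (mv1x, mv1y) = cpmvs[0], cpmvs[1]
    # Horizontal gradients of the motion field.
    dxx = Fraction(mv1x - mv0x, width)
    dyx = Fraction(mv1y - mv0y, width)
    if len(cpmvs) == 3:
        mv2x, mv2y = cpmvs[2]
        # Six-parameter model: vertical gradients come from the lower-left CPMV.
        dxy = Fraction(mv2x - mv0x, height)
        dyy = Fraction(mv2y - mv0y, height)
    else:
        # Four-parameter model: vertical gradients are tied to the horizontal ones.
        dxy, dyy = -dyx, dxx
    field = {}
    for top in range(0, height, sub):
        for left in range(0, width, sub):
            # Evaluate the model at the centre of the sub-unit.
            cx, cy = left + Fraction(sub, 2), top + Fraction(sub, 2)
            mvx = dxx * cx + dxy * cy + mv0x
            mvy = dyx * cx + dyy * cy + mv0y
            # Round to whole 1/16-pixel units.
            field[(left, top)] = (round(mvx), round(mvy))
    return field
```

When both control points carry the same motion vector, every sub-unit receives that vector, so the affine model reduces to ordinary translational motion.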
  • The subunits of the chrominance component and of the luminance component are both 4×4 in size, and the motion vector of a 4×4 chrominance subunit can be obtained by averaging the motion vectors of the corresponding four 4×4 luminance subunits.
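The chroma averaging step can be sketched as follows. The function name and the integer rounding rule (adding 2 before dividing by 4, so halves round upward) are assumptions; the text only states that the four luma motion vectors are averaged.

```python
def chroma_subunit_mv(luma_mvs):
    """Average the MVs of the four co-located 4x4 luma sub-units (1/16-pixel units)."""
    if len(luma_mvs) != 4:
        raise ValueError("expected exactly four luma motion vectors")
    sum_x = sum(mv[0] for mv in luma_mvs)
    sum_y = sum(mv[1] for mv in luma_mvs)
    # Integer average with rounding to nearest, halves rounded upward
    # (an assumed convention, not taken from the source).
    return ((sum_x + 2) >> 2, (sum_y + 2) >> 2)
```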
  • the prediction block of each subunit in the reference frame can be obtained through a motion compensation process. After that, the prediction frame can be obtained by using the motion vector and the prediction block.
  • The encoding end passes the difference between the prediction frame and the actual current frame to the decoding end after transformation, quantization, etc., and the decoding end can reconstruct the current frame from the motion vector, the reference frame, and the difference between the predicted frame and the current frame.
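The final reconstruction step at the decoder can be sketched as below. The function name and the 8-bit sample range are assumptions, and the residual is taken as already inverse-quantized and inverse-transformed.

```python
def reconstruct_block(prediction, residual, max_value=255):
    """Add the decoded residual to the prediction and clip to the valid sample range."""
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

Because quantization is lossy, the reconstructed block generally approximates rather than equals the original block; the same reconstruction is used as reference data by both encoder and decoder so that their predictions stay in step.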
  • the motion estimation method unifies the design of the motion vector accuracy in the affine mode with the motion vector accuracy in the normal mode, and improves the coding performance.
  • Fig. 4 shows a flowchart of a motion estimation method 400 according to another embodiment of the present invention. As shown in FIG. 4, the method 400 includes the following steps:
  • In step S410, for the affine coding unit in the current frame, one of multiple motion vector precisions is selected to perform motion estimation in the reference frame, thereby determining the motion vector of the control point of the affine coding unit, wherein the multiple motion vector precisions include 1/2 pixel precision;
  • In step S420, the affine coding unit is divided into several subunits;
  • In step S430, the motion vector of the sub-unit in the affine coding unit is calculated according to the motion vector of the control point.
  • the current frame is the video frame currently to be encoded.
  • the reference frame is the video frame to be referred to when encoding the current frame.
  • the reference frame in this embodiment includes both historical frames and future frames.
  • the affine coding unit in the current frame is a coding unit (CU) divided in the current frame based on the affine motion compensation prediction (Affine) technology.
  • the processing unit in the Affine technology is no longer the entire coding unit, but divides the entire coding unit into multiple sub-units. In the process of motion compensation, motion compensation is performed in the unit of sub-units.
  • the affine coding unit in the Affine mode no longer has only one motion vector, but each subunit in the affine coding unit has its own motion vector.
  • The motion vector of each subunit in the affine coding unit is derived from the motion vectors of the two control points of the affine coding unit (i.e., the four-parameter model, see the left figure in Fig. 2) or of the three control points (i.e., the six-parameter model, see the right figure in Fig. 2). Only the motion vector information of the control points needs to be written into the code stream, not the motion vector information of each subunit.
  • the motion vector of the control point needs to be determined first.
  • The motion vector of an object between two adjacent frames may not be exactly an integer number of pixels. Therefore, the embodiment of the present invention adopts the adaptive motion vector accuracy (AMVR) technology to adaptively determine the accuracy of the motion vector at the encoder side.
  • The determination of the motion vector of the control point is based on the Inter mode (also known as the AMVP mode) in the Affine mode. In this mode, the motion vector accuracy is selected at the encoder side, and the MVD (motion vector difference) is calculated.
  • one is selected from multiple types of motion vector accuracy to perform motion estimation in the reference frame, where the multiple types of motion vector accuracy include 1/2 pixel accuracy.
  • In addition to 1/2 pixel precision, the multiple motion vector precisions may include any of 4 pixels, 2 pixels, whole pixels, 1/4 pixels, 1/8 pixels, and 1/16 pixels.
  • one can be selected from integer pixel accuracy, 1/2 pixel accuracy, 1/4 pixel accuracy, and 1/16 pixel accuracy for motion estimation.
  • the conventional AMVP mode adds 1/2 pixel AMVR accuracy. Therefore, the embodiment of the present invention adds 1/2 pixel precision to the optional motion vector precision, so that the design of the motion vector precision of the affine coding unit matches the design of the motion vector precision of the conventional coding unit.
  • The motion vector accuracy referred to here is that of the control-point motion vectors in the Affine mode, not the motion vector accuracy actually used during sub-unit motion compensation.
  • the method for determining the accuracy of the motion vector includes: selecting the accuracy of the motion vector according to the selected motion vector accuracy of the neighboring coding unit.
  • the method for determining the accuracy of a motion vector may further include: attempting to perform motion estimation based on at least two of the four kinds of motion vector accuracy, and selecting the motion vector accuracy based on the effect of the motion estimation.
  • Two kinds of motion vector precisions can be selected from the four optional motion vector precisions, motion estimation is attempted with each, and the effects of the two motion estimations are compared. If the motion estimation effect with the lower motion vector accuracy is better, stop trying and directly use the lower motion vector accuracy as the selected motion vector accuracy. If the motion estimation effect with the higher motion vector accuracy is better, continue trying motion estimation with still higher motion vector accuracy until the best motion estimation effect is obtained.
  • Determining the motion vector of the control point of the affine coding unit includes: first, obtaining the motion vectors of spatially or temporally adjacent coding units. After that, the obtained motion vectors are combined to construct a candidate list of control point motion vectors, and the number of motion vectors in each combination depends on the number of control points of the affine coding unit.
  • A combination is then selected from the candidate list as the motion vector predictor (MVP).
  • the corresponding reference block can be determined in the reference frame according to the predicted motion vector.
  • Interpolation processing is performed on the reference block to generate fractional pixels, and then the actual motion vector is determined. Interpolation uses the values of integer pixels to generate fractional pixels between the integer samples. The more fractional pixels are generated between integer pixels, the higher the resolution of the reference frame becomes, and the more accurately a displacement of fractional pixel accuracy can be compensated. As interpolation accuracy improves, the efficiency of motion estimation and motion compensation improves to a certain extent.
  • The accuracy of the motion vector in the Affine mode can be an integer pixel accuracy, such as 1 pixel or 2 pixels; it can also be a sub-pixel accuracy, such as 1/2, 1/4, or 1/8 pixel.
  • the pixel at 1/2 precision position needs to be obtained by interpolation of the pixel at the whole pixel position.
  • the pixel values of other precision positions need to be obtained by further interpolation using integer-pixel precision pixels or 1/2-precision pixels.
  • an interpolation filter can be selected according to the selected motion vector accuracy to perform interpolation processing on the reference block.
  • the same interpolation filter may be used for all motion vector accuracy.
  • the existing six-tap interpolation filter is used by default.
  • the identification bit that characterizes the type of the interpolation filter may not be set in the code stream, thereby saving one bit of data.
  • different interpolation filters can be selected according to different motion vector accuracy.
  • 1/2 precision uses a 6-tap interpolation filter, and all other precisions use an 8-tap interpolation filter. Therefore, in an embodiment of the present invention, when 1/2 pixel precision is selected as the motion vector precision, a first interpolation filter is selected to perform interpolation processing on the reference block; when a precision other than 1/2 pixel precision is selected as the motion vector precision, a second interpolation filter is selected to perform interpolation processing on the reference block, wherein the first interpolation filter and the second interpolation filter have different numbers of taps.
  • the first interpolation filter may be a 6-tap interpolation filter
  • the second interpolation filter may be an 8-tap interpolation filter.
  • the filter type identification bit can be set in the code stream. For example, 1 can be used to indicate that a 6-tap interpolation filter is used; 0 can be used to indicate that a 6-tap interpolation filter is not used, that is, the default 8-tap interpolation filter is used.
  • when the motion estimation method 400 is applied to the decoding end and different interpolation filters are selected for different motion vector accuracies, the method further includes, before selecting the interpolation filter according to the motion vector accuracy: acquiring a code stream that carries an identification bit indicating the filter type corresponding to the motion vector.
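The filter selection described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text gives no tap values, so the coefficients below are borrowed from the well-known H.264 6-tap and HEVC 8-tap half-sample luma filters as stand-ins, and the function names `select_filter`, `filter_flag`, and `half_pel_sample` are hypothetical.

```python
from fractions import Fraction

# Assumed coefficients (not given in the patent): H.264 6-tap and HEVC 8-tap
# half-sample luma filters, each paired with its normalisation divisor.
SIX_TAP_HALF = ([1, -5, 20, 20, -5, 1], 32)
EIGHT_TAP_HALF = ([-1, 4, -11, 40, 40, -11, 4, -1], 64)


def select_filter(mv_precision):
    """Return (taps, divisor): the 6-tap filter for 1/2-pel, the 8-tap filter otherwise."""
    if Fraction(mv_precision) == Fraction(1, 2):
        return SIX_TAP_HALF
    return EIGHT_TAP_HALF


def filter_flag(mv_precision):
    """Filter-type bit as described in the text: 1 means 6-tap, 0 means the default 8-tap."""
    return 1 if Fraction(mv_precision) == Fraction(1, 2) else 0


def half_pel_sample(row, i, taps, divisor):
    """Interpolate the half-sample lying between row[i] and row[i + 1] (8-bit samples)."""
    start = i - (len(taps) // 2 - 1)
    acc = 0
    for k, coeff in enumerate(taps):
        j = min(max(start + k, 0), len(row) - 1)  # clamp indices at the row borders
        acc += coeff * row[j]
    value = (acc + divisor // 2) // divisor  # rounded integer division
    return min(max(value, 0), 255)
```

For instance, `half_pel_sample(row, 3, *select_filter("1/2"))` interpolates the sample halfway between `row[3]` and `row[4]` with the 6-tap filter. A real codec filters in two dimensions and at higher intermediate bit depth; the one-dimensional form is kept here only for brevity.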
  • motion estimation may include both Affine mode and regular AMVP mode.
  • in the regular AMVP mode, motion estimation is performed with the entire coding unit as a unit.
  • for each conventional coding unit, when adaptive motion vector precision (AMVR) is applied, the method also includes adaptively selecting one of multiple motion vector precisions for motion estimation, the multiple motion vector precisions including 1/2 pixel precision. Apart from 1/2 pixel precision, the optional motion vector precisions of the conventional coding unit may be the same as or different from those of the affine coding unit. In one embodiment, the conventional coding unit also has four optional motion vector precisions.
  • with adaptive motion vector precision (AMVR), the corresponding motion vector accuracy is adaptively decided at the coding end, and the decision is written into the code stream and passed to the decoding end.
  • the identifier indicating the accuracy of the motion vector of the affine coding unit is consistent with the identifier indicating the accuracy of the motion vector of the conventional coding unit, so that the two modes are more unified.
  • when the motion estimation method 400 is applied to the decoding end, before selecting one of the at least four motion vector precisions to perform motion estimation in the reference frame, the method further includes: obtaining a code stream, wherein an identification bit of the code stream records the selected motion vector accuracy of the affine coding unit, and the identifier representing the motion vector accuracy of the affine coding unit is consistent with the identifier representing the motion vector accuracy of the conventional coding unit.
  • step S420 the affine coding unit is divided into several sub-units, and in step S430, the motion vector of the sub-unit in the affine coding unit is calculated according to the motion vector of the control point.
  • step S420 and step S430 reference may be made to the related description of step S120 and step S130 of the method 100, which will not be repeated here.
  • the motion estimation method adds 1/2 pixel precision to the optional motion vector precisions in affine mode, so that the motion vector precision design in affine mode is unified with that in normal mode, and the coding performance is improved.
  • the following describes a motion estimation system 500 according to an embodiment of the present invention with reference to FIG. 5.
  • FIG. 5 is a schematic block diagram of a motion estimation system 500 according to an embodiment of the present invention.
  • the motion estimation system 500 shown in FIG. 5 includes a processor 510, a storage device 520, and a computer program stored on the storage device 520 and running on the processor 510.
  • when executing the program, the processor implements the steps of the motion estimation method 100 shown in FIG. 1 or the motion estimation method 400 shown in FIG. 4.
  • the processor 510 may be a central processing unit (CPU), a graphics processing unit (GPU), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the motion estimation system 500 to perform the desired functions.
  • the processor 510 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSM), digital signal processors (DSP), or combinations thereof.
  • the storage device 520 includes one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 510 may run the program instructions to implement the motion estimation method of the embodiments of the present invention (implemented by the processor) described herein and/or other desired functions.
  • Various application programs and various data such as various data used and/or generated by the application program, can also be stored in the computer-readable storage medium.
  • the system 500 further includes an input device (not shown).
  • the input device may be a device used by the user to input instructions, and may include one or more of operation keys, a keyboard, a mouse, a microphone, and a touch screen.
  • the input device may also be any interface for receiving information.
  • the system 500 further includes an output device that can output various information (such as images or sounds) to the outside (such as to a user), and may include one or more of a display (for example, for displaying a video image to the user), speakers, and the like.
  • the output device may also be any other device with output function.
  • system 500 further includes a communication interface, which is used to communicate with other devices, including wired or wireless communication.
  • the processor implements the following steps when executing the program: for an affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vector of a control point of the affine coding unit; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the processor implements the following steps when executing the program: for an affine coding unit in the current frame, selecting one of a variety of motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vector of a control point of the affine coding unit, wherein the variety of motion vector precisions includes 1/2 pixel precision; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the embodiment of the present invention also provides a storage medium on which a computer program is stored.
  • when the computer program is executed by the processor, the steps of the method shown in FIG. 1 or FIG. 4 can be implemented.
  • the storage medium is a computer-readable storage medium.
  • the computer-readable storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the above storage media.
  • the computer-readable storage medium may be any combination of one or more computer-readable storage media.
  • the computer program instructions, when run by a computer or processor, cause the computer or processor to perform the following steps: for an affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vector of a control point of the affine coding unit; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the computer program instructions, when run by a computer or processor, cause the computer or processor to perform the following steps: for an affine coding unit in the current frame, selecting one of a variety of motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vector of a control point of the affine coding unit, wherein the variety of motion vector precisions includes 1/2 pixel precision; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the motion estimation method, system and storage medium of the present invention unify the design of the motion vector accuracy in the affine mode with that in the conventional mode, improve the coding performance, and can be used to improve the quality of compressed video and enhance the hardware friendliness of the codec, which is of great significance to video compression for broadcast television, video conferencing, network video, and the like.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present invention.
  • the aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
  • the various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by a combination of them.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some modules according to the embodiments of the present invention.
  • the present invention can also be implemented as a device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein.
  • Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals.
  • Such a signal can be downloaded from an Internet website, or provided on a carrier signal, or provided in any other form.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A motion estimation method and system, and a storage medium. The method comprises: for an affine coding unit in a current frame, selecting one of at least four types of motion vector accuracy to perform motion estimation in a reference frame, so as to determine the motion vector of a control point of the affine coding unit (S110); dividing the affine coding unit into a plurality of subunits (S120); and calculating the motion vectors of the subunits in the affine coding unit according to the motion vector of the control point (S130). The motion estimation method and system and the storage medium unify the design of the motion vector accuracy in an affine mode with the motion vector accuracy in the conventional mode, thus improving the coding performance.

Description

Motion estimation method, system and storage medium
Technical field
The present invention relates to the technical field of video coding and decoding, and in particular to a motion estimation method, system and storage medium.
Background
The basic principle of video coding is to exploit spatial, temporal, and codeword correlations to remove as much redundancy as possible. Current video coding schemes mainly comprise intra-frame prediction, inter-frame prediction, transform, quantization, entropy coding, and loop filtering.
Inter-frame prediction exploits the temporal correlation between adjacent video frames: it uses previously encoded and reconstructed frames as reference frames and predicts the current frame (that is, the frame currently being encoded) through motion estimation (ME) and motion compensation (MC), thereby removing temporal redundancy from the video. Because adjacent frames in a video are correlated, an image can be divided into several coding units; the position of each coding unit in an adjacent frame is searched for, and the relative spatial offset between the two positions is obtained. This relative offset is what is usually called the motion vector (MV), and the process of obtaining it is called motion estimation. Motion compensation is the process of obtaining a predicted frame from the MV and the reference frame. The predicted frame may differ from the original current frame, so the difference between the predicted frame and the current frame is transformed, quantized, and so on, and then transmitted to the decoding end together with the MV information. The decoding end can then reconstruct the current frame from the MV, the reference frame, and the difference between the predicted frame and the current frame.
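The search for a coding unit's position in an adjacent frame can be illustrated with a minimal integer-pixel full search. This is only a sketch under stated assumptions: the text does not name a matching criterion or search strategy, so the sum of absolute differences (SAD) and the exhaustive search window are illustrative choices, and the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Return ((dx, dy), cost) minimising SAD over a +/- search_range window.

    cur and ref are 2-D lists of samples; (bx, by) is the top-left corner of
    the block in the current frame.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The returned offset plays the role of the motion vector, and the residual that remains after subtracting the displaced reference block is what would go on to be transformed and quantized.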
Motion estimation is an important factor in video coding efficiency, so how to optimize motion estimation has long been a concern of those skilled in the art.
Summary of the invention
This summary introduces a series of concepts in simplified form, which are described in further detail in the detailed description. The summary is not intended to identify the key features or essential technical features of the claimed technical solution, nor to determine its scope of protection.
In view of the shortcomings of the prior art, a first aspect of the embodiments of the present invention provides a motion estimation method, the method comprising:
for an affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in a reference frame, thereby determining the motion vector of a control point of the affine coding unit;
dividing the affine coding unit into several sub-units; and
calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
A second aspect of the embodiments of the present invention provides another motion estimation method, the method comprising:
for an affine coding unit in the current frame, selecting one of a variety of motion vector precisions to perform motion estimation in a reference frame, thereby determining the motion vector of a control point of the affine coding unit, wherein the variety of motion vector precisions includes 1/2 pixel precision;
dividing the affine coding unit into several sub-units; and
calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
A third aspect of the embodiments of the present invention provides a motion estimation system. The system includes a storage device and a processor; the storage device stores a computer program to be run by the processor, and the computer program, when run by the processor, executes the motion estimation method described above.
A fourth aspect of the embodiments of the present invention provides a storage medium on which a computer program is stored, the computer program executing the motion estimation method described above when run.
The motion estimation method, system and storage medium of the present invention unify the design of the motion vector accuracy in the affine mode with that in the conventional mode, and improve the coding performance.
Description of the drawings
The following drawings form part of the present invention and are provided for understanding it. The drawings show embodiments of the present invention and, together with their description, serve to explain the principles of the present invention.
In the drawings:
Fig. 1 shows a flowchart of a motion estimation method according to an embodiment of the present invention;
Fig. 2 shows a schematic diagram of the motion vectors of the control points of an affine coding unit according to an embodiment of the present invention;
Fig. 3 shows a schematic diagram of the motion vectors of the sub-units of an affine coding unit according to an embodiment of the present invention;
Fig. 4 shows a flowchart of a motion estimation method according to another embodiment of the present invention;
Fig. 5 shows a structural block diagram of a motion estimation system according to an embodiment of the present invention.
Detailed description
In order to make the objectives, technical solutions, and advantages of the present invention more apparent, exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them, and it should be understood that the present invention is not limited by the exemplary embodiments described herein. Based on the embodiments described herein, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present invention.
In the following description, numerous specific details are given to provide a more thorough understanding of the present invention. However, it is obvious to those skilled in the art that the present invention can be implemented without one or more of these details. In other instances, some technical features well known in the art are not described, in order to avoid obscuring the present invention.
It should be understood that the present invention can be implemented in different forms and should not be construed as being limited to the embodiments presented here. Rather, these embodiments are provided so that the disclosure is thorough and complete and fully conveys the scope of the present invention to those skilled in the art.
The terms used herein are intended only to describe specific embodiments and not to limit the present invention. As used herein, the singular forms "a", "an" and "the" are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the terms "comprising" and/or "including", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups. As used herein, the term "and/or" includes any and all combinations of the related listed items.
For a thorough understanding of the present invention, detailed steps and detailed structures are set out in the following description to explain the technical solution proposed by the present invention. The preferred embodiments of the present invention are described in detail below; however, beyond these detailed descriptions, the present invention may also have other embodiments.
The motion estimation method of the embodiments of the present invention can be applied to the inter-frame prediction part of video coding and decoding. To aid understanding of the method, video coding and decoding are first introduced below.
A video generally consists of multiple frames of images arranged in a certain order. Within one frame there are often many identical or similar spatial structures; that is, a video file contains a large amount of spatially redundant information. In addition, because the sampling interval between two adjacent frames is extremely short, adjacent frames are usually highly similar; that is, a video also contains a large amount of temporally redundant information. Furthermore, from the perspective of the visual sensitivity of the human eye, video information also contains a part that can be exploited for compression, namely visually redundant information.
Besides the spatial, temporal, and visual redundancy described above, video image information also contains a series of other redundancies, such as information entropy redundancy, structural redundancy, knowledge redundancy, and importance redundancy. The purpose of video coding is to remove the redundant information in a video sequence, so as to reduce storage space and save transmission bandwidth.
At present, video coding mainly comprises intra-frame prediction, inter-frame prediction, transform, quantization, entropy coding, and loop filtering; the embodiments of the present invention mainly improve the inter-frame prediction part. Inter-frame prediction exploits the temporal correlation between adjacent video frames: it uses previously encoded and reconstructed frames as reference frames and predicts the current frame (the frame currently being encoded) through motion estimation and motion compensation, thereby removing temporal redundancy from the video.
The motion estimation method, system and storage medium of the present application are described in detail below with reference to the accompanying drawings. Provided there is no conflict, the following embodiments and the features in the implementations can be combined with each other. The motion estimation method, system and storage medium described in the embodiments of the present invention use the HEVC standard or its extensions. However, the present invention is also applicable to other coding standards, such as the H.264 standard, the next-generation video coding standard VVC, AVS3, or any other suitable coding standard.
Fig. 1 shows a flowchart of a motion estimation method 100 according to an embodiment of the present invention. As shown in Fig. 1, the method 100 includes the following steps:
In step S110, for an affine coding unit in the current frame, one of at least four motion vector precisions is selected to perform motion estimation in the reference frame, thereby determining the motion vector of a control point of the affine coding unit.
The current frame is the video frame currently to be encoded. It can be a video frame captured in real time or a video frame retrieved from a storage medium.
A reference frame is a video frame that is referred to when encoding the current frame. It may be a reconstructed video frame obtained by reconstructing the encoded data of a video frame that can serve as a reference. Depending on the type of inter-frame prediction, the reference frame can be a forward reference frame, a backward reference frame, or a bidirectional reference frame. Specifically, inter-frame prediction techniques include forward prediction, backward prediction, bidirectional prediction, and so on. Forward prediction predicts the current frame using a frame before it (a historical frame) as the reference frame. Backward prediction predicts the current frame using a frame after it (a future frame) as the reference frame. Bidirectional prediction predicts the current frame using both historical and future frames. This embodiment adopts the bidirectional prediction mode; that is, the reference frames include both historical frames and future frames.
The affine coding unit in the current frame is a coding unit (CU) divided in the current frame on the basis of the affine motion compensation prediction (Affine) technique.
Specifically, the traditional motion model covers only translational motion, whereas real scenes contain many other forms of motion, such as zooming, rotation, perspective motion, and other irregular motions; the Affine technique was introduced for this reason. In the Affine technique, the processing unit is no longer the entire coding unit: the coding unit is divided into multiple sub-units, and motion compensation is performed sub-unit by sub-unit.
Compared with a conventional coding unit, an affine coding unit in Affine mode no longer has a single motion vector; instead, each sub-unit of the affine coding unit has its own motion vector. Once the motion vectors of the control points of the affine coding unit have been determined, the motion vector of each sub-unit is derived from the motion vectors of two control points (the four-parameter model, see the left side of Fig. 2) or three control points (the six-parameter model, see the right side of Fig. 2). Only the motion vector information of the control points needs to be written into the code stream, not that of each sub-unit.
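The derivation of sub-unit motion vectors from the control points can be sketched as follows. The text here does not spell out the formulas, so this sketch assumes the commonly used four-parameter and six-parameter affine models (as in VVC-style designs), with control points at the top-left, top-right and, for six parameters, bottom-left corners; evaluating each motion vector at the sub-unit centre and the 4x4 sub-unit size are likewise assumptions. The function name `affine_subunit_mvs` is hypothetical.

```python
def affine_subunit_mvs(cpmvs, width, height, sub=4):
    """Derive one motion vector per sub x sub sub-unit from control-point MVs.

    cpmvs holds two MVs [(v0x, v0y), (v1x, v1y)] for the four-parameter model
    (top-left, top-right corners) or three MVs, adding (v2x, v2y) for the
    bottom-left corner, for the six-parameter model.
    Returns a dict mapping each sub-unit's top-left (x, y) to its (mvx, mvy).
    """
    (v0x, v0y), (v1x, v1y) = cpmvs[0], cpmvs[1]
    a = (v1x - v0x) / width   # change of mvx per pixel along x
    b = (v1y - v0y) / width   # change of mvy per pixel along x
    if len(cpmvs) == 3:
        v2x, v2y = cpmvs[2]
        c = (v2x - v0x) / height  # change of mvx per pixel along y
        d = (v2y - v0y) / height  # change of mvy per pixel along y
    else:
        # Four-parameter model: rotation and zoom only, so c = -b and d = a.
        c, d = -b, a
    mvs = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            cx, cy = x0 + sub / 2, y0 + sub / 2  # sub-unit centre
            mvs[(x0, y0)] = (a * cx + c * cy + v0x, b * cx + d * cy + v0y)
    return mvs
```

With identical control-point vectors the field degenerates to pure translation, which matches the conventional single-vector case; only the two or three control-point vectors would need to be signalled.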
如上所述,为了确定子单元的运动矢量,首先确定控制点的运动矢量。在运动估计的过程,由于自然物体运动的连续性,物体在相邻两帧之间的运动矢量不一定刚好是整数个像素单位,因此本发明实施例采用自适应运动矢量精度(AMVR)技术在编码端自适应地确定运动矢量的精度。在本发明实施例中,控制点的运动矢量的确定是基于 Affine模式下的Inter模式(又称AMVP模式),在该模式下,会在编码端进行运动矢量精度的选择,以及进行MVD(Motion Vector Difference,运动矢量差值)的计算。As described above, in order to determine the motion vector of the subunit, the motion vector of the control point is first determined. In the process of motion estimation, due to the continuity of natural object motion, the motion vector of the object between two adjacent frames may not be exactly an integer number of pixels. Therefore, the embodiment of the present invention adopts the adaptive motion vector accuracy (AMVR) technology in The encoder side adaptively determines the accuracy of the motion vector. In the embodiment of the present invention, the determination of the motion vector of the control point is based on the Inter mode (also known as the AMVP mode) in the Affine mode. In this mode, the motion vector accuracy is selected on the encoder side, and MVD (Motion Vector Difference, motion vector difference) calculation.
In one embodiment, there are four selectable motion vector precisions, and for each coding unit one of the four is selected for motion estimation. The at least four motion vector precisions include any four of 4-pixel, 2-pixel, integer-pixel, 1/2-pixel, 1/4-pixel, 1/8-pixel and 1/16-pixel precision. For example, the four precisions may be integer-pixel, 1/2-pixel, 1/4-pixel and 1/16-pixel precision.
In the current video coding software VTM-6.0, the conventional AMVP mode offers four AMVR precisions. Compared with the earlier Affine mode, which used three precisions, the embodiment of the present invention adds one more precision, so that the number of motion vector precisions available to an affine coding unit equals the number available to a conventional coding unit. This unifies the adaptive motion vector precision design of the Affine mode with that of the conventional AMVP mode. In one embodiment, the newly added precision is 1/2-pixel precision.
Note that the 1/16, 1/4 and integer-pixel precisions mentioned for the Affine mode refer to the precision of the control point motion vectors, not to the precision of the motion vectors actually used when the subunits are motion compensated.
In one embodiment, the method for determining the motion vector precision includes: selecting the motion vector precision according to the motion vector precision already selected for neighboring coding units.
In another embodiment, the method for determining the motion vector precision may further include: attempting motion estimation with at least two of the four motion vector precisions, and selecting the motion vector precision based on the results of the motion estimation.
Specifically, two of the four selectable motion vector precisions may be chosen, motion estimation attempted with each, and the results of the two attempts compared. For example, 1/2-pixel precision and integer-pixel precision may be chosen, and motion estimation performed with each.
The results are then compared. If the lower precision gives the better result, the search stops and the lower precision is taken as the selected motion vector precision. For example, if integer-pixel precision performs better than 1/2-pixel precision, no further precisions are tried and integer-pixel precision is selected. If the higher precision gives the better result, motion estimation is attempted with successively higher precisions until the best result is obtained. For example, if 1/2-pixel precision performs better than integer-pixel precision, 1/4-pixel precision is tried next. If 1/2-pixel precision performs better than 1/4-pixel precision, 1/2-pixel precision is selected. If 1/4-pixel precision performs better than 1/2-pixel precision, the comparison may continue with 1/8-pixel precision.
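The coarse-to-fine search with early termination described above can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: the function name `choose_mv_precision` and the `cost` callback (standing in for "run motion estimation at this precision and report how good the result is", lower being better) are assumptions introduced for the example.

```python
from fractions import Fraction
from typing import Callable, Sequence


def choose_mv_precision(
    precisions: Sequence[Fraction],
    cost: Callable[[Fraction], float],
) -> Fraction:
    """Pick a motion vector precision by coarse-to-fine search.

    `precisions` is ordered from lowest precision (largest step) to highest.
    `cost(p)` is a hypothetical callback that performs motion estimation at
    precision `p` and returns a cost, where lower means a better result.
    """
    best = precisions[0]
    best_cost = cost(best)
    for finer in precisions[1:]:
        finer_cost = cost(finer)
        if finer_cost >= best_cost:
            # The lower precision is at least as good: stop trying.
            break
        # The higher precision is better: keep it and try the next one.
        best, best_cost = finer, finer_cost
    return best


# Example with made-up costs: 1/2-pixel beats integer-pixel, 1/4-pixel does not
# beat 1/2-pixel, so 1/16-pixel is never evaluated.
example_order = [Fraction(1), Fraction(1, 2), Fraction(1, 4), Fraction(1, 16)]
example_costs = {Fraction(1): 10.0, Fraction(1, 2): 8.0,
                 Fraction(1, 4): 9.0, Fraction(1, 16): 7.0}
selected = choose_mv_precision(example_order, example_costs.__getitem__)
```

The early exit trades optimality for fewer motion estimation passes: in the example, the untested 1/16-pixel precision would have had the lowest cost.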
In one embodiment, determining the motion vectors of the control points of the affine coding unit includes: first, obtaining the motion vectors of spatially or temporally neighboring coding units, and constructing a candidate list from combinations of those motion vectors.
The motion vectors obtained in this step may be control point motion vectors of coding units in Affine mode, or motion vectors of conventional coding units in the traditional mode. The obtained motion vectors are then combined to build a candidate list of control point motion vectors; the number of motion vectors in each combination depends on the number of control points of the affine coding unit.
Next, a group of motion vectors is selected from the candidate list as the motion vector predictors (MVP) of the control points of the affine coding unit, and motion estimation is performed in the reference frame according to the predictors to determine the actual motion vectors of the control points. For example, the corresponding reference block can be located in the reference frame according to the motion vector predictor. The reference block is then interpolated to generate fractional pixels, from which the actual motion vector is determined.
The encoder may also compute the difference between the actual motion vector and the motion vector predictor, the MVD (Motion Vector Difference), encode the MVD, and send the encoded MVD together with the index of the motion vector predictor in the candidate list to the decoder.
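The MVD computation for the control points can be sketched as below. Several details are assumptions made for illustration and are not stated in the text: vectors are stored as integers in 1/16-pixel units, the predictor is rounded to the grid of the selected precision before subtraction, and the MVD is expressed in units of that precision. The helper names are hypothetical.

```python
from typing import List, Sequence, Tuple

MV = Tuple[int, int]  # (x, y) in 1/16-pixel units (assumed storage unit)


def round_to_step(value: int, step: int) -> int:
    """Round to the nearest multiple of `step`, ties away from zero."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + step // 2) // step) * step


def control_point_mvds(
    actual: Sequence[MV], predicted: Sequence[MV], step: int
) -> List[MV]:
    """Return one MVD per control point, in units of `step`.

    `step` is the selected precision in 1/16-pixel units (8 for 1/2-pixel,
    16 for integer-pixel). `actual` is assumed to lie on that grid already.
    """
    mvds = []
    for (ax, ay), (px, py) in zip(actual, predicted):
        # Assumption: align the predictor to the selected precision first.
        px, py = round_to_step(px, step), round_to_step(py, step)
        mvds.append(((ax - px) // step, (ay - py) // step))
    return mvds


# Two control points (four-parameter model) at 1/2-pixel precision.
example_mvds = control_point_mvds([(24, -8), (32, 0)], [(19, -5), (30, 3)], 8)
```

Expressing the MVD in units of the selected precision is what makes a coarser precision cheaper to signal: the same displacement is written as a smaller number.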
As described above, motion vector precision includes integer-pixel and fractional-pixel precision. Since pixels at fractional positions do not physically exist, they have to be obtained by interpolating the reference block. Interpolation uses the values of integer pixels to generate fractional pixels between the integer samples. The more fractional pixels are generated between integer pixels, the higher the resolution of the reference block, and the more accurately displacements of fractional-pixel precision can be compensated. As interpolation precision increases, the efficiency of motion estimation and motion compensation improves to some extent.
Specifically, the motion vector precision in Affine mode may be integer-pixel precision, for example 1 pixel or 2 pixels, or sub-pixel precision, for example 1/2, 1/4 or 1/8 pixel. As an example, pixels at 1/2-precision positions are obtained by interpolating pixels at integer positions. Pixel values at other precision positions are obtained by further interpolation from integer-pixel or 1/2-pixel samples.
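To make the half-pixel interpolation step concrete, the sketch below interpolates one row of 8-bit samples. The coefficients (1, -5, 20, 20, -5, 1)/32 are the well-known 6-tap half-sample luma filter from H.264, used here purely as an illustration; the text does not give the coefficients of the filters it refers to, and the edge handling (sample replication) is likewise an assumption.

```python
from typing import List, Sequence

# Illustrative 6-tap half-sample filter (H.264 luma); not taken from the text.
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)


def half_pel_row(samples: Sequence[int]) -> List[int]:
    """Return the half-pixel samples lying between neighboring integer samples.

    Output element i lies between samples[i] and samples[i + 1]. Positions
    outside the row are filled by replicating the edge sample, and results
    are clipped to the 8-bit range.
    """
    n = len(samples)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            idx = min(max(i - 2 + k, 0), n - 1)  # replicate edges
            acc += tap * samples[idx]
        value = (acc + 16) >> 5  # divide by 32 with rounding
        out.append(min(max(value, 0), 255))
    return out


# On a linear ramp the sample between 20 and 30 comes out as 25.
ramp_half = half_pel_row([0, 10, 20, 30, 40, 50])
```

A longer filter with more taps draws on more integer samples per output, which is the trade-off behind choosing between 6-tap and 8-tap filters.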
Exemplarily, an interpolation filter may be selected according to the selected motion vector precision and used to interpolate the reference block.
In one embodiment, the same interpolation filter is used for all motion vector precisions. For example, the existing six-tap interpolation filter is used by default for every precision. In this case no flag indicating the interpolation filter type needs to be set in the bitstream, which saves one bit.
In another embodiment, different interpolation filters are selected for different motion vector precisions. For example, in the conventional AMVP mode only 1/2-pixel precision uses a 6-tap interpolation filter, and all other precisions use an 8-tap interpolation filter. Accordingly, in one embodiment of the present invention, when 1/2-pixel precision is selected as the motion vector precision, a first interpolation filter is selected to interpolate the reference block; when any other precision is selected, a second interpolation filter is selected, the first and second interpolation filters having different numbers of taps. Further, the first interpolation filter may be a 6-tap filter and the second an 8-tap filter. This brings the interpolation filter design of the Affine mode closer to that of the traditional AMVP mode.
Further, if different interpolation filters are selected for different motion vector precisions, a filter type flag may be set in the bitstream. For example, 1 may indicate that the 6-tap interpolation filter is used, and 0 that it is not used, that is, the default 8-tap interpolation filter is used.
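The precision-dependent filter choice and the one-bit flag can be sketched as below. The function names are hypothetical, and representing precisions as fractions of a pixel is an assumption made for readability; only the mapping itself (6 taps for 1/2-pixel, 8 taps otherwise, flag 1 for the 6-tap filter) comes from the text.

```python
from fractions import Fraction

HALF_PEL = Fraction(1, 2)


def select_filter_taps(precision: Fraction) -> int:
    """Return the tap count of the interpolation filter for `precision`."""
    # 1/2-pixel precision uses the first (6-tap) filter; every other
    # precision uses the second (8-tap) filter.
    return 6 if precision == HALF_PEL else 8


def filter_type_flag(precision: Fraction) -> int:
    """Return the bitstream flag: 1 for the 6-tap filter, 0 for the default."""
    return 1 if select_filter_taps(precision) == 6 else 0
```

Because the flag in this sketch is a pure function of the precision, a decoder that already knows the precision could derive it; the text nevertheless allows it to be signaled explicitly.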
Accordingly, in one embodiment, when the method is applied at the decoder and different interpolation filters are selected for different motion vector precisions, the motion estimation method 200 further includes, before selecting the interpolation filter according to the motion vector precision: obtaining a bitstream in which a flag indicating the filter type corresponding to the motion vector is set.
As described above, the motion vectors obtained from neighboring coding units to build the candidate list may be control point motion vectors of coding units in Affine mode or motion vectors of conventional coding units in the traditional mode. In other words, motion estimation may involve both the Affine mode and the conventional AMVP mode. In conventional AMVP mode, motion estimation for a conventional coding unit partitioned in the current frame is performed on the coding unit as a whole.
For each conventional coding unit, applying adaptive motion vector precision (AMVR) likewise involves adaptively selecting one of four motion vector precisions for motion estimation. The four motion vector precisions of the conventional coding unit may be the same as or different from those of the affine coding unit. For example, the four precisions may be integer-pixel, 4-pixel, 1/4-pixel and 1/2-pixel precision. Note, however, that the motion vector precisions are not limited to these four; they may also include, for example, 1/8-pixel and 1/16-pixel precision.
For each conventional coding unit that uses AMVR, the encoder adaptively decides the corresponding motion vector precision and writes the decision into the bitstream for the decoder. In the embodiment of the present invention, the identifier that signals the motion vector precision of an affine coding unit is consistent with the identifier that signals the motion vector precision of a conventional coding unit, which makes the two modes more uniform.
Accordingly, in one embodiment, when the motion estimation method 200 is applied at the decoder, the method further includes, before selecting one of at least four motion vector precisions for motion estimation in the reference frame: obtaining a bitstream whose flag bits record the selected motion vector precision of the affine coding unit, the identifier signaling the motion vector precision of the affine coding unit being consistent with the identifier signaling that of the conventional coding unit.
In step S120, the affine coding unit is divided into several subunits.
The size of the subunits may be fixed, for example 4×4 pixels each. Alternatively, the subunit size may be determined in other ways; for example, a suitable subunit size may be chosen to reduce encoding and decoding complexity.
Then, in step S130, the motion vectors of the subunits in the affine coding unit are calculated from the motion vectors of the control points.
Exemplarily, the motion field of the Affine mode can be derived from the motion vectors of two control points (four parameters) or three control points (six parameters). After the control point motion vectors have been determined, for a four-parameter (two control point) affine coding unit, the motion vector of the subunit located at position (x, y) is calculated by the following formula (1):
[Formula (1): image PCTCN2019107601-appb-000001, not reproduced in this text]
where (mv0x, mv0y) is the motion vector of the top-left control point, (mv1x, mv1y) is the motion vector of the top-right control point, x and y are the coordinates of the center of the subunit, and w is the width of the affine coding unit.
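Since the image of formula (1) does not survive in this text, the block below gives the four-parameter affine model in the form in which it is commonly written for VVC-style affine prediction, using the symbols defined above with subscripts. It is offered as a reference reconstruction under that assumption, not as a transcription of the original image.

```latex
\[
\begin{aligned}
mv_x &= \frac{mv_{1x} - mv_{0x}}{w}\,x - \frac{mv_{1y} - mv_{0y}}{w}\,y + mv_{0x} \\
mv_y &= \frac{mv_{1y} - mv_{0y}}{w}\,x + \frac{mv_{1x} - mv_{0x}}{w}\,y + mv_{0y}
\end{aligned}
\]
```

With equal control point vectors both fractions vanish and the model reduces to pure translation, mv = (mv0x, mv0y).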
For a six-parameter (three control point) affine coding unit, the motion vector of the subunit located at position (x, y) is calculated by the following formula (2):
[Formula (2): image PCTCN2019107601-appb-000002, not reproduced in this text]
where (mv0x, mv0y) is the motion vector of the top-left control point, (mv1x, mv1y) is the motion vector of the top-right control point, (mv2x, mv2y) is the motion vector of the bottom-left control point, and w is the width of the affine coding unit.
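The image of formula (2) is likewise missing from this text. The block below gives the six-parameter affine model in its commonly used form, with the symbols defined above written with subscripts. The symbol h, taken here to be the height of the affine coding unit, is an assumption: the surrounding text defines only w.

```latex
\[
\begin{aligned}
mv_x &= \frac{mv_{1x} - mv_{0x}}{w}\,x + \frac{mv_{2x} - mv_{0x}}{h}\,y + mv_{0x} \\
mv_y &= \frac{mv_{1y} - mv_{0y}}{w}\,x + \frac{mv_{2y} - mv_{0y}}{h}\,y + mv_{0y}
\end{aligned}
\]
```

The third control point lets the vertical gradient of the motion field be set independently of the horizontal one, which the four-parameter model cannot do.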
The motion vectors of an affine coding unit computed with the above formulas are illustrated in Figure 3, where each square represents a 4×4 subunit. All motion vectors computed with the formulas are rounded to a 1/16-pixel precision representation. Subunits are 4×4 for both the chroma and luma components, and the motion vector of a 4×4 chroma subunit can be obtained by averaging the motion vectors of its four corresponding 4×4 luma subunits. Once the motion vector of each subunit has been calculated, motion compensation yields the prediction block of each subunit in the reference frame. The prediction frame is then obtained from the motion vectors and prediction blocks. The encoder transforms and quantizes the difference between the prediction frame and the actual current frame and passes it to the decoder, which reconstructs the current frame from the motion vectors, the reference frame, and the difference between the prediction frame and the current frame.
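The derivation of the subunit motion field can be sketched as follows. The sketch assumes the commonly used four- and six-parameter affine models, control point vectors stored as integers in 1/16-pixel units (so that rounding to an integer gives the 1/16-pixel representation mentioned above), and Python's built-in `round` as the rounding rule. None of these details, nor the function name, are taken from the text.

```python
from typing import List, Sequence, Tuple

MV = Tuple[int, int]  # (x, y) in 1/16-pixel units (assumed storage unit)


def affine_subblock_mvs(
    cpmvs: Sequence[MV], width: int, height: int, sub: int = 4
) -> List[List[MV]]:
    """Derive one motion vector per sub x sub subunit of a width x height CU.

    `cpmvs` holds the top-left and top-right control point vectors, plus the
    bottom-left one for the six-parameter model.
    """
    (v0x, v0y), (v1x, v1y) = cpmvs[0], cpmvs[1]
    a = (v1x - v0x) / width  # change of mv_x per pixel in x
    b = (v1y - v0y) / width  # change of mv_y per pixel in x
    if len(cpmvs) == 3:
        v2x, v2y = cpmvs[2]
        c = (v2x - v0x) / height  # change of mv_x per pixel in y
        d = (v2y - v0y) / height  # change of mv_y per pixel in y
    else:
        # Four-parameter model: the vertical terms follow from a and b.
        c, d = -b, a
    field = []
    for y0 in range(0, height, sub):
        row = []
        for x0 in range(0, width, sub):
            # Evaluate the model at the center of the subunit.
            cx, cy = x0 + sub / 2, y0 + sub / 2
            row.append((round(a * cx + c * cy + v0x),
                        round(b * cx + d * cy + v0y)))
        field.append(row)
    return field


# A 16x16 CU whose top-right control point moves one pixel further right
# than the top-left one (a uniform zoom in the four-parameter model).
zoom_field = affine_subblock_mvs([(0, 0), (16, 0)], 16, 16)
```

Only the two or three control point vectors are signaled; a decoder can rebuild the whole field with the same computation.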
Based on the above description, the motion estimation method according to the embodiment of the present invention unifies the motion vector precision design of the affine mode with that of the conventional mode, which improves coding performance.
Figure 4 shows a flowchart of a motion estimation method 400 according to another embodiment of the present invention. As shown in Figure 4, the method 400 includes the following steps:
In step S410, for an affine coding unit in the current frame, one of multiple motion vector precisions is selected for motion estimation in a reference frame, thereby determining the motion vectors of the control points of the affine coding unit, where the multiple motion vector precisions include 1/2-pixel precision;
In step S420, the affine coding unit is divided into several subunits;
In step S430, the motion vectors of the subunits in the affine coding unit are calculated from the motion vectors of the control points.
In step S410, the current frame is the video frame currently to be encoded. The reference frame is the video frame referred to when encoding the current frame. In this embodiment the reference frames include both historical and future frames.
The affine coding unit in the current frame is a coding unit (CU) partitioned in the current frame on the basis of the affine motion compensated prediction (Affine) technique. In the Affine technique the processing unit is no longer the whole coding unit. Instead, the coding unit is divided into multiple subunits, and motion compensation is performed subunit by subunit.
Unlike a conventional coding unit, an affine coding unit in Affine mode does not have a single motion vector; each subunit of the affine coding unit has its own motion vector. Once the motion vectors of the control points of the affine coding unit have been determined, the motion vector of each subunit is derived from the motion vectors of two control points (the four-parameter model, see the left part of Figure 2) or three control points (the six-parameter model, see the right part of Figure 2). Only the motion vector information of the control points needs to be written into the bitstream, not that of every subunit.
As described above, to determine the motion vectors of the subunits, the motion vectors of the control points must be determined first. Because the motion of natural objects is continuous, the motion vector of an object between two adjacent frames is not necessarily an integer number of pixels. The embodiment of the present invention therefore uses the adaptive motion vector precision (AMVR) technique to determine the motion vector precision adaptively at the encoder. In the embodiment of the present invention, the control point motion vectors are determined in the Inter mode (also called AMVP mode) of the Affine mode, in which the encoder selects the motion vector precision and computes the MVD (Motion Vector Difference).
In this embodiment, one of multiple motion vector precisions is selected for motion estimation in the reference frame, where the multiple precisions include 1/2-pixel precision. Exemplarily, for each coding unit, one of four motion vector precisions may be selected for motion estimation. Besides the fixed 1/2-pixel precision, the multiple precisions include any several of 4-pixel, 2-pixel, integer-pixel, 1/4-pixel, 1/8-pixel and 1/16-pixel precision. For example, one of integer-pixel, 1/2-pixel, 1/4-pixel and 1/16-pixel precision may be selected for motion estimation.
In the current video coding software VTM-6.0, the conventional AMVP mode has added a 1/2-pixel AMVR precision. The embodiment of the present invention therefore adds 1/2-pixel precision to the selectable motion vector precisions, so that the motion vector precision design of affine coding units matches that of conventional coding units.
Note that the 1/16, 1/4 and integer-pixel precisions mentioned for the Affine mode refer to the precision of the control point motion vectors, not to the precision of the motion vectors actually used when the subunits are motion compensated.
In one embodiment, the method for determining the motion vector precision includes: selecting the motion vector precision according to the motion vector precision already selected for neighboring coding units.
In another embodiment, the method for determining the motion vector precision may further include: attempting motion estimation with at least two of the four motion vector precisions, and selecting the motion vector precision based on the results of the motion estimation.
Specifically, two of the four selectable motion vector precisions may be chosen, motion estimation attempted with each, and the results of the two attempts compared. If the lower precision gives the better result, the search stops and the lower precision is taken as the selected motion vector precision. If the higher precision gives the better result, motion estimation is attempted with successively higher precisions until the best result is obtained.
In one embodiment, determining the motion vectors of the control points of the affine coding unit includes: first, obtaining the motion vectors of spatially or temporally neighboring coding units, and constructing a candidate list from combinations of those motion vectors. The obtained motion vectors are combined to build a candidate list of control point motion vectors; the number of motion vectors in each combination depends on the number of control points of the affine coding unit.
Next, a group of motion vectors is selected from the candidate list as the motion vector predictors (MVP) of the control points of the affine coding unit, and motion estimation is performed in the reference frame according to the predictors to determine the actual motion vectors of the control points. For example, the corresponding reference block can be located in the reference frame according to the motion vector predictor. The reference block is then interpolated to generate fractional pixels, from which the actual motion vector is determined. Interpolation uses the values of integer pixels to generate fractional pixels between the integer samples. The more fractional pixels are generated between integer pixels, the higher the resolution of the reference frame, and the more accurately displacements of fractional-pixel precision can be compensated. As interpolation precision increases, the efficiency of motion estimation and motion compensation improves to some extent.
Specifically, the motion vector precision in Affine mode may be integer-pixel precision, for example 1 pixel or 2 pixels, or sub-pixel precision, for example 1/2, 1/4 or 1/8 pixel. As an example, pixels at 1/2-precision positions are obtained by interpolating pixels at integer positions. Pixel values at other precision positions are obtained by further interpolation from integer-pixel or 1/2-pixel samples.
示例性地,可以根据所选定的运动矢量精度选择插值滤波器,以对所述参考块进行插值处理。Exemplarily, an interpolation filter can be selected according to the selected motion vector accuracy to perform interpolation processing on the reference block.
在一个实施例中,可以对于所有运动矢量精度均采用同一种插值滤波器。例如,对于所有的运动矢量精度,均默认采用现有的六抽头的插值滤波器。在这种情况下,可以不在码流中设置表征插值滤波器类型的标识位,从而节省一位数据位。In one embodiment, the same interpolation filter may be used for all motion vector accuracy. For example, for all motion vector accuracy, the existing six-tap interpolation filter is used by default. In this case, the identification bit that characterizes the type of the interpolation filter may not be set in the code stream, thereby saving one bit of data.
在另一实施例中,可以根据不同的运动矢量精度选择不同的插值滤波器。例如,由于对于常规的AMVP模式,仅1/2精度采用6抽头插值滤波器,其它精度均采用8抽头插值滤波器。因而在本发明一个实施例中,当选择1/2像素精度作为所述运动矢量精度时,选择第一插值滤波器,以对所述参考块进行插值处理;当选择除1/2像素精度以外的其他精度作为所述运动矢量精度时,选择第二插值滤波器,以对所述参考块进行插值处理,其中,所述第一插值滤波器和所述第二插值滤波器的抽头数量不同。进一步地,所述第一插值滤波器可以为6抽头插值滤波器,所述第二插值滤波器可以为8抽头插值滤波器。由此,使Affine模式下的插值滤波器设计与传统AMVP模式下的插值滤波器设计更为匹配。In another embodiment, different interpolation filters can be selected according to different motion vector accuracy. For example, for the conventional AMVP mode, only 1/2 precision uses a 6-tap interpolation filter, and other precisions all use an 8-tap interpolation filter. Therefore, in an embodiment of the present invention, when 1/2 pixel precision is selected as the motion vector precision, the first interpolation filter is selected to perform interpolation processing on the reference block; when the precision other than 1/2 pixel precision is selected When the other precision of is used as the motion vector precision, a second interpolation filter is selected to perform interpolation processing on the reference block, wherein the number of taps of the first interpolation filter and the second interpolation filter are different. Further, the first interpolation filter may be a 6-tap interpolation filter, and the second interpolation filter may be an 8-tap interpolation filter. As a result, the interpolation filter design in the Affine mode is more matched with the interpolation filter design in the traditional AMVP mode.
进一步地,若根据不同的运动矢量精度选择不同的插值滤波器,则可以在码流中设置滤波器类型的标识位。例如,可以用1表示使用 6抽头的插值滤波器;用0表示不使用6抽头的插值滤波器,即使用默认的8抽头的插值滤波器。Further, if different interpolation filters are selected according to different motion vector accuracy, then the filter type identification bit can be set in the code stream. For example, 1 can be used to indicate that a 6-tap interpolation filter is used; 0 can be used to indicate that a 6-tap interpolation filter is not used, that is, the default 8-tap interpolation filter is used.
Therefore, in one embodiment, when the method is applied at the decoding end and different interpolation filters are selected for different motion vector precisions, the motion estimation method 400 further includes, before selecting the interpolation filter according to the motion vector precision: obtaining a bitstream in which a flag indicating the filter type corresponding to the motion vector is set.
In one embodiment, motion estimation may include both the Affine mode and the conventional AMVP mode. In the conventional AMVP mode, motion estimation is performed on each conventional coding unit partitioned from the current frame, with the entire coding unit as the unit.
For each conventional coding unit, when adaptive motion vector precision (AMVR) is applied, one of multiple motion vector precisions is likewise selected adaptively for motion estimation, and the multiple motion vector precisions include 1/2-pixel precision. Apart from 1/2-pixel precision, the motion vector precisions available to the conventional coding unit may be the same as or different from those available to the affine coding unit. In one embodiment, the conventional coding unit likewise has four selectable motion vector precisions.
For each conventional coding unit that uses AMVR, the encoding end adaptively decides the corresponding motion vector precision and writes the decision into the bitstream, which is passed to the decoding end. In embodiments of the present invention, the identifier indicating the motion vector precision of an affine coding unit is consistent with the identifier indicating the motion vector precision of a conventional coding unit, which makes the two modes more unified.
Therefore, in one embodiment, when the motion estimation method 400 is applied at the decoding end, the method further includes, before selecting one of the at least four motion vector precisions to perform motion estimation in the reference frame: obtaining a bitstream whose flag bits record the selected motion vector precision of the affine coding unit, where the identifier indicating the motion vector precision of the affine coding unit is consistent with the identifier indicating the motion vector precision of the conventional coding unit.
Then, in step S420, the affine coding unit is divided into several sub-units, and in step S430, the motion vectors of the sub-units in the affine coding unit are calculated from the motion vectors of the control points. For details of steps S420 and S430, refer to the description of steps S120 and S130 of method 100; they are not repeated here.
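Steps S420 and S430 can be illustrated with the sketch below. It is not taken from this document: it assumes the commonly used four-parameter affine model with two control points (top-left `v0` and top-right `v1`), 4x4 sub-units, motion vectors evaluated at sub-unit centers, and floating-point arithmetic. The function name and parameters are hypothetical; the formulas described for steps S120 and S130 of method 100 take precedence.

```python
def subunit_motion_vectors(v0, v1, width, height, sub=4):
    """Derive one motion vector per sub-unit from two control-point vectors.

    v0, v1: (x, y) motion vectors of the top-left and top-right control points.
    width, height: size of the affine coding unit in pixels.
    sub: side length of each square sub-unit (assumed to be 4).
    Returns a dict mapping each sub-unit's top-left corner to its motion vector.
    """
    # Four-parameter model: a captures scaling, b captures rotation.
    a = (v1[0] - v0[0]) / width
    b = (v1[1] - v0[1]) / width
    vectors = {}
    for y0 in range(0, height, sub):        # S420: split into sub-units
        for x0 in range(0, width, sub):
            cx = x0 + sub / 2               # sub-unit center
            cy = y0 + sub / 2
            # S430: evaluate the affine motion field at the center.
            mv_x = v0[0] + a * cx - b * cy
            mv_y = v0[1] + b * cx + a * cy
            vectors[(x0, y0)] = (mv_x, mv_y)
    return vectors


# A 16x8 unit whose control points differ yields a different vector per sub-unit.
for corner, mv in subunit_motion_vectors((0.0, 0.0), (4.0, 0.0), 16, 8).items():
    print(corner, mv)
```

When both control points carry the same vector, every sub-unit receives that vector, so the model reduces to ordinary translational motion.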
Based on the above description, the motion estimation method according to embodiments of the present invention adds 1/2-pixel precision to the motion vector precisions available in the affine mode, so that the motion vector precision design in the affine mode is unified with that in the conventional mode, which improves coding performance.
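To make the effect of choosing a precision concrete, the sketch below rounds a motion vector to a selected precision. Everything in it is an assumption made for illustration: motion vectors are stored as integers in 1/16-pixel units, the example precision set is 1/16, 1/4, 1/2, and 1 pixel, and rounding is to the nearest multiple with ties rounded upward. The document does not specify a storage unit or rounding rule.

```python
from fractions import Fraction

# Assumed internal storage unit: 1/16 pixel per integer step.
STORAGE_UNITS_PER_PIXEL = 16


def step_in_storage_units(precision: Fraction) -> int:
    """Convert a precision in pixels to a step size in storage units."""
    step = precision * STORAGE_UNITS_PER_PIXEL
    if step.denominator != 1 or step < 1:
        raise ValueError(f"precision {precision} is finer than the storage unit")
    return int(step)


def round_component(value: int, step: int) -> int:
    """Round one component to the nearest multiple of step (ties go upward)."""
    return ((value + step // 2) // step) * step


def round_motion_vector(mv, precision: Fraction):
    """Round both components of a motion vector to the selected precision."""
    step = step_in_storage_units(precision)
    return (round_component(mv[0], step), round_component(mv[1], step))


mv = (13, -13)  # 13/16 pixel right, 13/16 pixel up, in storage units
for precision in (Fraction(1, 16), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    print(f"{precision}-pixel precision: {round_motion_vector(mv, precision)}")
```

Because both the affine and the conventional paths can share the same precision table and rounding helper, adding 1/2-pixel precision to the affine mode needs only one extra table entry in such a design.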
A motion estimation system 500 according to an embodiment of the present invention is described below with reference to FIG. 5.
FIG. 5 is a schematic block diagram of a motion estimation system 500 according to an embodiment of the present invention. The motion estimation system 500 shown in FIG. 5 includes a processor 510, a storage device 520, and a computer program stored on the storage device 520 and running on the processor 510. When the processor executes the program, it implements the steps of the motion estimation method 100 shown in FIG. 1 or the motion estimation method 400 shown in FIG. 4.
The processor 510 may be a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the motion estimation system 500 to perform desired functions. For example, the processor 510 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSM), digital signal processors (DSP), or combinations thereof.
The storage device 520 includes one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 510 may run the program instructions to implement the motion estimation method of the embodiments of the present invention described herein (as implemented by the processor) and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
In one implementation, the system 500 further includes an input device (not shown). The input device may be a device used by a user to input instructions, and may include one or more of operation keys, a keyboard, a mouse, a microphone, a touch screen, and the like. The input device may also be any interface that receives information.
In one implementation, the system 500 further includes an output device, which may output various information (such as images or sounds) to the outside (for example, to a user), and may include one or more of a display (for example, for showing video images to the user), a speaker, and the like. The output device may also be any other device with an output function.
In one implementation, the system 500 further includes a communication interface, which is used to communicate with other devices by wired or wireless means.
Specifically, in one embodiment, the processor implements the following steps when executing the program: for an affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in a reference frame, thereby determining the motion vectors of the control points of the affine coding unit; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit from the motion vectors of the control points.
In another embodiment, the processor implements the following steps when executing the program: for an affine coding unit in the current frame, selecting one of multiple motion vector precisions to perform motion estimation in a reference frame, thereby determining the motion vectors of the control points of the affine coding unit, where the multiple motion vector precisions include 1/2-pixel precision; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit from the motion vectors of the control points.
In addition, an embodiment of the present invention provides a storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of the method shown in FIG. 1 or FIG. 4 can be implemented.
For example, the storage medium is a computer-readable storage medium. The computer-readable storage medium may include, for example, a memory card of a smartphone, a storage component of a tablet computer, a hard disk of a personal computer, read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, or any combination of these storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media.
In one embodiment, the computer program instructions, when run by a computer or processor, cause the computer or processor to perform the following steps: for an affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in a reference frame, thereby determining the motion vectors of the control points of the affine coding unit; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit from the motion vectors of the control points.
In another embodiment, the computer program instructions, when run by a computer or processor, cause the computer or processor to perform the following steps: for an affine coding unit in the current frame, selecting one of multiple motion vector precisions to perform motion estimation in a reference frame, thereby determining the motion vectors of the control points of the affine coding unit, where the multiple motion vector precisions include 1/2-pixel precision; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit from the motion vectors of the control points.
In summary, the motion estimation method, system, and storage medium of the present invention unify the motion vector precision design in the affine mode with that in the conventional mode, which improves coding performance. They can be used to improve the quality of compressed video and the hardware friendliness of the codec, and are of significance for video compression in broadcast television, video conferencing, network video, and similar applications.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect coupling or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above is only a specific implementation of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Although example embodiments have been described herein with reference to the accompanying drawings, it should be understood that these example embodiments are merely illustrative and are not intended to limit the scope of the present invention thereto. A person of ordinary skill in the art may make various changes and modifications therein without departing from the scope and spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as claimed in the appended claims.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not executed.
Numerous specific details are set forth in the description provided here. It will be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, to streamline the present invention and aid understanding of one or more of the various inventive aspects, individual features of the present invention are sometimes grouped together into a single embodiment, figure, or description thereof in the description of the exemplary embodiments. This manner of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the corresponding claims reflect, the inventive point is that the corresponding technical problem can be solved with fewer than all the features of a single disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will understand that all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination, except where the features are mutually exclusive. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features from different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some modules according to embodiments of the present invention. The present invention may also be implemented as a device program (for example, a computer program or a computer program product) for performing part or all of the methods described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The present invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.
The above is only a specific implementation of the present invention or a description of a specific implementation, and the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (36)

  1. A motion estimation method, characterized in that the method comprises:
    for an affine coding unit in a current frame, selecting one of at least four motion vector precisions to perform motion estimation in a reference frame, thereby determining motion vectors of control points of the affine coding unit;
    dividing the affine coding unit into several sub-units;
    calculating motion vectors of the sub-units in the affine coding unit according to the motion vectors of the control points.
  2. The method according to claim 1, characterized in that the four motion vector precisions comprise any four of 4-pixel, 2-pixel, integer-pixel, 1/2-pixel, 1/4-pixel, 1/8-pixel, and 1/16-pixel precision.
  3. The method according to claim 1, characterized in that the determining the motion vectors of the control points of the affine coding unit comprises:
    obtaining motion vectors of spatially neighboring coding units or temporally neighboring coding units, and constructing a candidate list from combinations of the motion vectors of the spatially neighboring coding units or temporally neighboring coding units;
    selecting a group of motion vectors from the candidate list as predicted motion vectors of the control points of the affine coding unit;
    performing motion estimation in the reference frame according to the predicted motion vectors to determine actual motion vectors of the control points of the affine coding unit.
  4. The method according to claim 1, characterized in that the method further comprises: for a conventional coding unit in the current frame, performing motion estimation with the entire coding unit as the unit.
  5. The method according to claim 4, characterized in that the number of motion vector precisions available to the affine coding unit is the same as the number of motion vector precisions available to the conventional coding unit.
  6. The method according to claim 5, characterized in that the method further comprises: for each conventional coding unit, adaptively selecting one of four motion vector precisions to perform motion estimation for the conventional coding unit, wherein the four motion vector precisions of the conventional coding unit are the same as or different from the four motion vector precisions of the affine coding unit.
  7. The method according to claim 6, characterized in that the method further comprises: recording the selected motion vector precision of the affine coding unit in a flag bit of the bitstream, wherein the identifier indicating the motion vector precision of the affine coding unit is consistent with the identifier indicating the motion vector precision of the conventional coding unit.
  8. The method according to claim 6, characterized in that, before the selecting one of at least four motion vector precisions to perform motion estimation in a reference frame, the method further comprises:
    obtaining a bitstream, wherein flag bits of the bitstream record the selected motion vector precision of the affine coding unit, and the identifier indicating the motion vector precision of the affine coding unit is consistent with the identifier indicating the motion vector precision of the conventional coding unit.
  9. The method according to claim 1, characterized in that each affine coding unit comprises two control points or three control points.
  10. The method according to claim 1, characterized in that the method further comprises: selecting an interpolation filter according to the motion vector precision to interpolate a reference block.
  11. The method according to claim 10, characterized in that the selecting an interpolation filter according to the motion vector precision comprises:
    selecting different interpolation filters for different motion vector precisions; or
    using the same interpolation filter for all motion vector precisions.
  12. The method according to claim 11, characterized in that the method further comprises: if different interpolation filters are selected for different motion vector precisions, setting a filter type flag in the bitstream.
  13. The method according to claim 11, characterized in that different interpolation filters are selected for different motion vector precisions;
    before the selecting an interpolation filter according to the motion vector precision, the method further comprises:
    obtaining a bitstream in which a flag indicating the filter type corresponding to the motion vector is set.
  14. The method according to claim 11, characterized in that the selecting different interpolation filters for different motion vector precisions comprises:
    when 1/2-pixel precision is selected as the motion vector precision, selecting a first interpolation filter to interpolate the reference block;
    when a precision other than 1/2-pixel precision is selected as the motion vector precision, selecting a second interpolation filter to interpolate the reference block, wherein
    the first interpolation filter and the second interpolation filter have different numbers of taps.
  15. The method according to claim 14, characterized in that the first interpolation filter is a 6-tap interpolation filter.
  16. The method according to claim 14, characterized in that the second interpolation filter is an 8-tap interpolation filter.
  17. The method according to claim 1, wherein the selecting of one of four motion vector precisions to perform motion estimation in a reference frame comprises:
    selecting the motion vector precision according to the motion vector precision already selected for a neighboring coding unit.
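Claim 17 does not say how the neighboring precisions are combined. One plausible reading, sketched below purely as an assumption, is to reuse the precision chosen most often by the neighbors and fall back to a default when no neighbor is available. The function name, the majority rule, the tie-break, and the 1/4-pixel default are all hypothetical.

```python
from collections import Counter
from fractions import Fraction


def precision_from_neighbors(neighbor_precisions, default=Fraction(1, 4)):
    """Pick a motion vector precision from already-coded neighbors.

    Assumed rule (not stated in the claims): take the most frequent
    neighbor precision; on a tie, prefer the neighbor listed first.
    """
    if not neighbor_precisions:
        return default
    counts = Counter(neighbor_precisions)
    best_count = max(counts.values())
    for precision in neighbor_precisions:
        if counts[precision] == best_count:
            return precision
```

An encoder using such a rule could skip the precision search entirely for the current coding unit, trading some coding efficiency for speed.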
  18. The method according to claim 1, wherein the adaptively selecting of one of four motion vector precisions to perform motion estimation in a reference frame comprises:
    attempting motion estimation with at least two of the four motion vector precisions, and selecting the motion vector precision based on the results of the motion estimation.
  19. The method according to claim 18, wherein the attempting of motion estimation with at least two of the four motion vector precisions and the selecting of the motion vector precision based on the results of the motion estimation comprise:
    selecting two motion vector precisions from the four motion vector precisions, attempting motion estimation with each, and comparing the results of the two motion estimations;
    if the motion estimation with the lower motion vector precision gives the better result, stopping the attempts and taking the lower motion vector precision as the selected motion vector precision; if the motion estimation with the higher motion vector precision gives the better result, continuing to attempt motion estimation with a still higher motion vector precision until the best motion estimation result is obtained.
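The early-terminating search of claim 19 can be sketched as follows. The sketch assumes the precisions are ordered from lowest (coarsest) to highest (finest), that the search starts with the two coarsest, and that the "result" of a motion estimation is a cost where lower is better (for instance a rate-distortion cost). `cost_of` is a hypothetical callback, and treating a tie as a win for the lower precision is also an assumption.

```python
from fractions import Fraction


def progressive_precision_search(precisions, cost_of):
    """Return the selected precision using early termination.

    precisions: ordered from lowest to highest precision (assumed).
    cost_of:    hypothetical callback that runs motion estimation at a
                given precision and returns its cost (lower is better).
    """
    best = precisions[0]
    best_cost = cost_of(best)
    for finer in precisions[1:]:
        cost = cost_of(finer)
        if cost >= best_cost:
            # The lower precision is at least as good: stop trying.
            break
        # The higher precision is better: keep it and try a finer one.
        best, best_cost = finer, cost
    return best
```

The benefit of this structure is that finer precisions are never evaluated once a coarser one has already won, so the encoder avoids running motion estimation for every available precision.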
  20. The method according to claim 1, wherein the reference frame includes a video frame preceding the current frame and a video frame following the current frame.
  21. A motion estimation method, comprising:
    for an affine coding unit in a current frame, selecting one of a plurality of motion vector precisions to perform motion estimation in a reference frame, thereby determining motion vectors of control points of the affine coding unit, wherein the plurality of motion vector precisions includes 1/2-pixel precision;
    dividing the affine coding unit into a number of sub-units;
    calculating the motion vectors of the sub-units in the affine coding unit from the motion vectors of the control points.
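The last two steps of claim 21 can be illustrated with the widely used four-parameter affine model, in which two control points (top-left `v0` and top-right `v1`) define the motion field. The claim itself does not fix the model, the number of control points, or the sub-unit size, so the model, the 4x4 sub-unit size, and the evaluation at each sub-unit centre are assumptions of this sketch.

```python
def affine_subunit_mvs(v0, v1, width, height, sub=4):
    """Derive one motion vector per sub-unit from two control points.

    v0, v1: (x, y) motion vectors at the top-left and top-right corners.
    Assumed four-parameter model:
        mv_x = a * x - b * y + v0_x
        mv_y = b * x + a * y + v0_y
    with a = (v1_x - v0_x) / width and b = (v1_y - v0_y) / width.
    """
    a = (v1[0] - v0[0]) / width
    b = (v1[1] - v0[1]) / width
    mvs = {}
    for y in range(0, height, sub):
        for x in range(0, width, sub):
            # Evaluate the model at the centre of the sub-unit.
            cx, cy = x + sub / 2, y + sub / 2
            mvs[(x, y)] = (a * cx - b * cy + v0[0], b * cx + a * cy + v0[1])
    return mvs
```

When both control points carry the same vector, every sub-unit receives that vector, which is the expected degenerate case of pure translation.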
  22. The method according to claim 21, wherein the plurality of motion vector precisions further includes at least one of 4-pixel precision, 2-pixel precision, integer-pixel precision, 1/4-pixel precision, 1/8-pixel precision, and 1/16-pixel precision.
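To make the precisions listed in claim 22 concrete, the sketch below snaps a motion vector component to a chosen precision. It assumes, for illustration only, that components are stored as integers in units of 1/16 pixel (the finest precision listed) and that rounding is to the nearest multiple, with halves rounded away from zero. Neither the storage unit nor the rounding rule comes from the claims.

```python
from fractions import Fraction


def round_mv_component(mv_sixteenths: int, precision: Fraction) -> int:
    """Round a component stored in 1/16-pixel units to a precision.

    precision is given in pixels, e.g. Fraction(1, 2) or Fraction(4).
    Hypothetical helper; rounds half away from zero.
    """
    step = int(precision * 16)  # e.g. 1/2 pixel -> 8 sixteenths
    sign = -1 if mv_sixteenths < 0 else 1
    return sign * ((abs(mv_sixteenths) + step // 2) // step) * step
```

For instance, a component of 13/16 pixel becomes 16/16 (one full pixel) at 1/2-pixel precision and 0 at 4-pixel precision.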
  23. The method according to claim 21, wherein the determining of the motion vectors of the control points of the affine coding unit comprises:
    obtaining motion vectors of spatially neighboring coding units or temporally neighboring coding units, and constructing a candidate list from combinations of those motion vectors;
    selecting a group of motion vectors from the candidate list as predicted motion vectors of the control points of the affine coding unit;
    performing motion estimation in the reference frame according to the predicted motion vectors to determine actual motion vectors of the control points of the affine coding unit.
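The first two steps of claim 23 can be sketched as below. The sketch assumes two control points, that neighbor motion vectors have already been grouped per control point, that duplicate combinations are pruned, and that the list is capped at `max_candidates` entries. These choices, the function names, and the cost callback `cost_of` are hypothetical and not taken from the claims.

```python
from itertools import product


def build_candidate_list(top_left_mvs, top_right_mvs, max_candidates=2):
    """Combine neighbor motion vectors into control-point candidates.

    Each candidate is a pair (mv_for_top_left, mv_for_top_right).
    """
    candidates = []
    for combo in product(top_left_mvs, top_right_mvs):
        if combo not in candidates:  # prune duplicate combinations
            candidates.append(combo)
        if len(candidates) == max_candidates:
            break
    return candidates


def pick_predictor(candidates, cost_of):
    """Select the candidate group with the lowest (assumed) cost."""
    return min(candidates, key=cost_of)
```

The selected group then serves as the starting point for the motion search in the reference frame, which refines it into the actual control-point motion vectors.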
  24. The method according to claim 21, further comprising: for a conventional coding unit in the current frame, performing motion estimation on the coding unit as a whole.
  25. The method according to claim 24, wherein the number of motion vector precisions available to the affine coding unit is the same as the number of motion vector precisions available to the conventional coding unit.
  26. The method according to claim 25, further comprising: recording the selected motion vector precision of the affine coding unit in a flag in the bitstream, wherein the flag indicating the motion vector precision of the affine coding unit is consistent with the flag indicating the motion vector precision of the conventional coding unit.
  27. The method according to claim 26, wherein, before the selecting of one of at least four motion vector precisions to perform motion estimation in a reference frame, the method further comprises:
    obtaining a bitstream, wherein the selected motion vector precision of the affine coding unit is recorded in a flag of the bitstream, and the flag indicating the motion vector precision of the affine coding unit is consistent with the flag indicating the motion vector precision of the conventional coding unit.
  28. The method according to claim 21, further comprising: selecting an interpolation filter according to the motion vector precision to interpolate the reference block.
  29. The method according to claim 28, wherein the selecting of an interpolation filter according to the motion vector precision comprises:
    selecting different interpolation filters for different motion vector precisions; or
    using the same interpolation filter for all motion vector precisions.
  30. The method according to claim 29, wherein the selecting of different interpolation filters for different motion vector precisions comprises:
    when the 1/2-pixel precision is selected as the motion vector precision, selecting a first interpolation filter to interpolate the reference block;
    when a precision other than the 1/2-pixel precision is selected as the motion vector precision, selecting a second interpolation filter to interpolate the reference block, wherein
    the first interpolation filter and the second interpolation filter have different numbers of taps.
  31. The method according to claim 30, wherein the first interpolation filter is a 6-tap interpolation filter.
  32. The method according to claim 30, wherein the second interpolation filter is an 8-tap interpolation filter.
  33. The method according to claim 29, further comprising: if different interpolation filters are selected for different motion vector precisions, setting a flag indicating the filter type in the bitstream.
  34. The method according to claim 33, wherein different interpolation filters are selected for different motion vector precisions;
    and wherein, before the selecting of an interpolation filter according to the motion vector precision, the method further comprises:
    obtaining a bitstream in which a flag indicating the filter type corresponding to the motion vector is set.
  35. A motion estimation system, comprising a storage device and a processor, wherein the storage device stores a computer program to be run by the processor, and the computer program, when run by the processor, performs the motion estimation method according to any one of claims 1-34.
  36. A storage medium storing a computer program, wherein the computer program, when run, performs the motion estimation method according to any one of claims 1-34.
PCT/CN2019/107601 2019-09-24 2019-09-24 Motion estimation method and system, and storage medium WO2021056215A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/107601 WO2021056215A1 (en) 2019-09-24 2019-09-24 Motion estimation method and system, and storage medium
CN201980066902.6A CN112868234A (en) 2019-09-24 2019-09-24 Motion estimation method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/107601 WO2021056215A1 (en) 2019-09-24 2019-09-24 Motion estimation method and system, and storage medium

Publications (1)

Publication Number Publication Date
WO2021056215A1 true WO2021056215A1 (en) 2021-04-01

Family

ID=75165894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/107601 WO2021056215A1 (en) 2019-09-24 2019-09-24 Motion estimation method and system, and storage medium

Country Status (2)

Country Link
CN (1) CN112868234A (en)
WO (1) WO2021056215A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113630602B (en) * 2021-06-29 2024-07-02 杭州未名信科科技有限公司 Affine motion estimation method and device of coding unit, storage medium and terminal
CN113630601B (en) * 2021-06-29 2024-04-02 杭州未名信科科技有限公司 Affine motion estimation method, affine motion estimation device, affine motion estimation equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107277506A (en) * 2017-08-15 2017-10-20 中南大学 A kind of motion vector accuracy fast selecting method and device based on adaptive motion vector precision
CN108781284A (en) * 2016-03-15 2018-11-09 联发科技股份有限公司 The method and device of coding and decoding video with affine motion compensation
CN109155854A (en) * 2016-05-27 2019-01-04 松下电器(美国)知识产权公司 Code device, decoding apparatus, coding method and coding/decoding method
CN109792532A (en) * 2016-10-04 2019-05-21 高通股份有限公司 Adaptive motion vector precision for video coding
CN110620932A (en) * 2018-06-19 2019-12-27 北京字节跳动网络技术有限公司 Mode dependent motion vector difference accuracy set

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011021914A2 (en) * 2009-08-21 2011-02-24 에스케이텔레콤 주식회사 Method and apparatus for encoding/decoding images using adaptive motion vector resolution
CN115002458A (en) * 2015-06-05 2022-09-02 杜比实验室特许公司 Image encoding and decoding method and image decoding apparatus
CN116193110A (en) * 2017-01-16 2023-05-30 世宗大学校产学协力团 Image coding/decoding method
WO2019072187A1 (en) * 2017-10-13 2019-04-18 Huawei Technologies Co., Ltd. Pruning of motion model candidate list for inter-prediction
CN109729352B (en) * 2017-10-27 2020-07-21 华为技术有限公司 Method and device for determining motion vector of affine coding block
US20190246134A1 (en) * 2018-02-06 2019-08-08 Panasonic Intellectual Property Corporation Of America Encoding method, decoding method, encoder, and decoder
KR102424189B1 (en) * 2018-02-14 2022-07-21 후아웨이 테크놀러지 컴퍼니 리미티드 Adaptive interpolation filter

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG ZHAO, WANG SHIQI, ZHANG JIAN, MA SIWEI: "Adaptive Progressive Motion Vector Resolution Selection Based on Rate–Distortion Optimization", IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 26, no. 1, 1 January 2017 (2017-01-01), US, pages 400 - 413, XP055795908, ISSN: 1057-7149, DOI: 10.1109/TIP.2016.2627814 *

Also Published As

Publication number Publication date
CN112868234A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
TWI729422B (en) Sub-block mv inheritance between color components
CN112889269B (en) Video decoding method and device
US11178419B2 (en) Picture prediction method and related apparatus
CN112470474B (en) Video encoding and decoding method and device
US11856220B2 (en) Reducing computational complexity when video encoding uses bi-predictively encoded frames
CN104363451B (en) Image prediction method and relevant apparatus
JP2021182752A (en) Image prediction method and related apparatus
WO2017005146A1 (en) Video encoding and decoding method and device
JP6905093B2 (en) Optical flow estimation of motion compensation prediction in video coding
WO2019242563A1 (en) Video encoding and decoding method and device, storage medium and computer device
JP2022508074A (en) Methods and equipment for multi-hypothesis signaling for skip and merge modes and distance offset table signaling for merge by motion vector differences.
JP6945654B2 (en) Methods and Devices for Encoding or Decoding Video Data in FRUC Mode with Reduced Memory Access
WO2020140331A1 (en) Video image processing method and device
TW201526617A (en) Method and system for image processing, decoding method, encoder and decoder
KR102059066B1 (en) Motion vector field coding method and decoding method, and coding and decoding apparatuses
JP2022515031A (en) Methods, equipment and computer programs for video coding
WO2016065872A1 (en) Image prediction method and relevant device
WO2017201678A1 (en) Image prediction method and related device
KR20200125698A (en) Method and apparatus for sub-block motion vector prediction
TWI790662B (en) Encoding and decoding method, apparatus and device thereof
WO2021056215A1 (en) Motion estimation method and system, and storage medium
TW201937924A (en) Methods and devices for improvement in obtaining linear component sample prediction parameters
CN110719467B (en) Prediction method of chrominance block, encoder and storage medium
WO2022227622A1 (en) Weight-configurable inter-frame and intra-frame joint prediction coding and decoding methods and devices
JP7437426B2 (en) Inter prediction method and device, equipment, storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19946975

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19946975

Country of ref document: EP

Kind code of ref document: A1