WO2021056215A1 - Motion estimation method and system, and storage medium - Google Patents

Motion estimation method and system, and storage medium

Info

Publication number
WO2021056215A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
accuracy
coding unit
motion
affine
Prior art date
Application number
PCT/CN2019/107601
Other languages
English (en)
Chinese (zh)
Inventor
马思伟
孟学苇
郑萧桢
王苫社
Original Assignee
深圳市大疆创新科技有限公司
北京大学
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 and 北京大学
Priority to CN201980066902.6A (publication CN112868234A)
Priority to PCT/CN2019/107601 (publication WO2021056215A1)
Publication of WO2021056215A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation

Definitions

  • The present invention relates to the technical field of video coding and decoding, and in particular to a motion estimation method, system, and storage medium.
  • The basic principle of video coding is to exploit spatial correlation, temporal correlation, and codeword correlation to remove as much redundancy as possible.
  • Current video coding schemes mainly include intra-frame prediction, inter-frame prediction, transformation, quantization, entropy coding, and loop filtering.
  • Inter-frame prediction exploits the time-domain correlation between adjacent frames of the video: the previously encoded reconstructed frame is used as a reference frame, and motion estimation (ME) and motion compensation (MC) are performed to predict the current frame (that is, the frame currently being encoded), thereby removing the temporal redundant information of the video.
  • The image can be divided into several coding units, and the position of each coding unit in an adjacent frame can be searched out, yielding the relative spatial offset between the two. The obtained relative offset is usually referred to as a motion vector (MV), and the process of obtaining a motion vector is called motion estimation.
  • Motion compensation is the process of using MV and reference frames to obtain the predicted frame.
  • the predicted frame obtained by this process may be different from the original current frame. Therefore, the difference between the predicted frame and the current frame needs to be transformed and quantized.
  • the MV information is passed to the decoder, so that the decoder can reconstruct the current frame through the MV, the reference frame, and the difference between the predicted frame and the current frame.
  • Motion estimation is a key step that affects the efficiency of video coding; how to optimize the motion estimation method has therefore long been a concern of those skilled in the art.
  • the first aspect of the embodiments of the present invention provides a motion estimation method, the method includes:
  • For an affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vectors of the control points of the affine coding unit;
  • the motion vector of the sub-unit in the affine coding unit is calculated according to the motion vector of the control point.
  • the second aspect of the embodiments of the present invention provides another motion estimation method, and the method includes:
  • For an affine coding unit in the current frame, selecting one of multiple motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vectors of the control points of the affine coding unit, wherein the multiple motion vector precisions include 1/2-pixel precision;
  • the motion vector of the sub-unit in the affine coding unit is calculated according to the motion vector of the control point.
  • a third aspect of the embodiments of the present invention provides a motion estimation system.
  • the system includes a storage device and a processor.
  • The storage device stores a computer program to be run by the processor; when the program runs, the processor executes the above-mentioned motion estimation method.
  • A fourth aspect of the embodiments of the present invention provides a storage medium on which a computer program is stored; when the computer program runs, it executes the above-mentioned motion estimation method.
  • the motion estimation method, system and storage medium of the present invention unify the design of the motion vector accuracy in the affine mode with the motion vector accuracy in the conventional mode, and improve the coding performance.
  • Fig. 1 shows a flowchart of a motion estimation method according to an embodiment of the present invention
  • Fig. 2 shows a schematic diagram of a motion vector of a control point of an affine coding unit according to an embodiment of the present invention
  • Fig. 3 shows a schematic diagram of motion vectors of subunits of an affine coding unit according to an embodiment of the present invention
  • Fig. 4 shows a flowchart of a motion estimation method according to another embodiment of the present invention.
  • Fig. 5 shows a structural block diagram of a motion estimation system according to an embodiment of the present invention.
  • the motion estimation method of the embodiment of the present invention can be applied to the inter-frame prediction part of the video coding and decoding technology.
  • Video is generally composed of multiple frames of images in a certain order. There are often many identical or similar spatial structures within one frame of image; that is to say, a video file contains a large amount of spatially redundant information. In addition, since the sampling time interval between two adjacent frames of a video is extremely short, adjacent frames are usually highly similar, meaning the video contains a large amount of temporally redundant information. Furthermore, from the perspective of the visual sensitivity of the human eye, part of the video information can also be compressed, namely visually redundant information.
  • video image information also has a series of redundant information such as information entropy redundancy, structural redundancy, knowledge redundancy, importance redundancy and so on.
  • the purpose of video coding is to remove redundant information in a video sequence, so as to reduce storage space and save transmission bandwidth.
  • video coding mainly includes intra-frame prediction, inter-frame prediction, transformation, quantization, entropy coding, and loop filtering.
  • the embodiment of the present invention mainly aims at improving the inter-frame prediction part.
  • the inter-frame prediction technology uses the time-domain correlation between adjacent frames of the video, uses the previously encoded reconstructed frame as a reference frame, and predicts the current frame (the frame currently being encoded) through motion estimation and motion compensation, thereby removing Time redundant information of the video.
  • The motion estimation method, system, and storage medium described in the embodiments of the present invention are described using the HEVC standard or its extensions as an example.
  • the present invention is also applicable to other coding standards, such as the H.264 standard, the next generation video coding standard VVC, AVS3, or any other suitable coding standard.
  • Fig. 1 shows a flowchart of a motion estimation method 100 according to an embodiment of the present invention. As shown in FIG. 1, the method 100 includes the following steps:
  • step S110 for the affine coding unit in the current frame, one of at least four kinds of motion vector accuracy is selected to perform motion estimation in the reference frame, thereby determining the motion vector of the control point of the affine coding unit.
  • the current frame is the video frame currently to be encoded.
  • the current frame can be a video frame collected in real time, or a video frame extracted from a storage medium.
  • the reference frame is the video frame to be referred to when encoding the current frame.
  • the reference frame may be a reconstructed video frame obtained by reconstructing the encoded data corresponding to the video frame that can be used as the reference frame.
  • the reference frame can be a forward reference frame, a backward reference frame, or a bidirectional reference frame.
  • inter-frame prediction techniques include forward prediction, backward prediction, bidirectional prediction, and so on.
  • Forward prediction uses the previous frame (historical frame) of the current frame as a reference frame to predict the current frame.
  • Backward prediction uses the frame after the current frame (future frame) as a reference frame to predict the current frame.
  • Bidirectional prediction uses not only historical frames but also future frames to predict the current frame.
  • a bidirectional prediction mode is adopted, that is, the reference frame includes both historical frames and future frames.
  • the affine coding unit in the current frame is a coding unit (CU) divided in the current frame based on the affine motion compensation prediction (Affine) technology.
  • The traditional motion model includes only translational motion, but in reality there are many other forms of motion, such as zooming, rotation, perspective motion, and other irregular motions; this motivated the introduction of the Affine technology.
  • In the Affine technology, the processing unit is no longer the entire coding unit; instead, the coding unit is divided into multiple sub-units, and motion compensation is performed sub-unit by sub-unit.
  • the affine coding unit in the Affine mode no longer has only one motion vector, but each subunit in the affine coding unit has its own motion vector.
  • The motion vector of each subunit in the affine coding unit is derived from the motion vectors of either the two control points of the affine coding unit (the four-parameter model; see the left part of Figure 2) or its three control points (the six-parameter model; see the right part of Figure 2). Only the motion vector information of the control points needs to be written into the bitstream, not the motion vector information of each subunit.
  • the motion vector of the control point is first determined.
  • The embodiment of the present invention adopts the adaptive motion vector accuracy (AMVR) technology, in which the encoder side adaptively determines the precision of the motion vector.
  • The determination of the motion vector of the control point is based on the Inter mode (also known as the AMVP mode) of the Affine mode. In this mode, the motion vector precision is selected on the encoder side, and the MVD (Motion Vector Difference) is calculated.
  • the selectable motion vector precision includes four kinds, and for each coding unit, one of the four kinds of motion vector precision is selected for motion estimation.
  • The at least four motion vector precisions include any four of 4-pixel, 2-pixel, integer-pixel, 1/2-pixel, 1/4-pixel, 1/8-pixel, and 1/16-pixel precision.
  • the four kinds of motion vector precisions may be integer pixel precision, 1/2 pixel precision, 1/4 pixel precision, and 1/16 pixel precision.
  • The conventional AMVP mode includes four AMVR precisions. Compared with the previous Affine mode, which offered three precisions, the embodiment of the present invention therefore adds one more, so that the number of motion vector precisions available to an affine coding unit matches the number available to a conventional coding unit. This unifies the adaptive motion vector precision design of the Affine mode with that of the conventional AMVP mode. In one embodiment, the newly added precision is 1/2-pixel precision.
  • Note that the motion vector precision of the control points referred to in the Affine mode is not the precision actually used in the process of sub-unit motion compensation.
  • the method for determining the accuracy of the motion vector includes: selecting the accuracy of the motion vector according to the selected motion vector accuracy of the neighboring coding unit.
  • the method for determining the accuracy of a motion vector may further include: attempting to perform motion estimation based on at least two of the four kinds of motion vector accuracy, and selecting the motion vector accuracy based on the effect of the motion estimation.
  • For example, two of the four optional motion vector precisions can be selected and motion estimation attempted with each, comparing the results of the two attempts; for instance, 1/2-pixel precision and integer-pixel precision can each be used for motion estimation.
  • If the lower motion vector precision gives the better motion estimation result, stop trying and directly use the lower precision as the selected motion vector precision. For example, if integer-pixel precision gives a better motion estimation result than 1/2-pixel precision, no other precisions are tried, and integer-pixel precision is selected directly. If the higher motion vector precision gives the better result, continue trying higher precisions until the best motion estimation result is obtained. For example, if 1/2-pixel precision gives a better result than integer-pixel precision, then 1/4-pixel precision is tried next.
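The coarse-to-fine search just described can be sketched as follows. This is a minimal illustration, not the patent's encoder: `estimate_cost` is a hypothetical stand-in for whatever measure the encoder uses to judge "the effect of the motion estimation".

```python
# Illustrative sketch (not the patent's actual encoder logic): try coarser
# precisions first and stop as soon as a finer precision stops helping.
# `estimate_cost` is a hypothetical rate-distortion cost callback.

def select_precision(estimate_cost, precisions=(1, 2, 4, 16)):
    """Pick a motion vector precision, trying coarser precisions first.

    `precisions` lists denominators: 1 = integer-pel, 2 = 1/2-pel,
    4 = 1/4-pel, 16 = 1/16-pel (one possible four-precision set).
    """
    best_prec = precisions[0]
    best_cost = estimate_cost(best_prec)
    for prec in precisions[1:]:
        cost = estimate_cost(prec)
        if cost >= best_cost:
            # Finer precision did not improve the result: keep the coarser one.
            break
        best_prec, best_cost = prec, cost
    return best_prec

# Toy cost model in which 1/4-pel happens to be the sweet spot.
costs = {1: 100.0, 2: 90.0, 4: 85.0, 16: 88.0}
print(select_precision(lambda p: costs[p]))  # -> 4
```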
  • Determining the motion vectors of the control points of the affine coding unit includes: first, obtaining the motion vectors of spatially or temporally adjacent coding units.
  • The motion vector obtained in this process may be the motion vector of a control point of a coding unit in the Affine mode, or the motion vector of a conventional coding unit in the traditional mode.
  • The obtained motion vectors are then combined to construct a candidate list of control-point motion vectors, where the number of motion vectors in each combination depends on the number of control points of the affine coding unit.
  • A motion vector predictor (MVP) is selected from the candidate list, and the corresponding reference block can be determined in the reference frame according to the predicted motion vector.
  • interpolation processing is performed on the reference block to generate fractional pixels, and then the actual motion vector is determined.
  • the encoding end can also calculate the difference MVD (Motion Vector Difference) between the actual motion vector and the predicted motion vector, encode the MVD, and send the encoded MVD and the index of the predicted motion vector in the candidate list to the decoding end.
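The MVP/MVD relationship described above can be sketched as follows; the candidate-selection strategy and the tuple representation are illustrative assumptions, not the patent's bitstream syntax.

```python
# Sketch of MVD signalling: the encoder writes the candidate index and the
# difference between the actual MV and the predictor; the decoder adds the
# difference back to the predictor. Names here are illustrative only.

def encode_mv(actual_mv, candidate_list):
    # Pick the predictor closest to the actual MV (one simple strategy).
    idx = min(range(len(candidate_list)),
              key=lambda i: abs(actual_mv[0] - candidate_list[i][0])
                          + abs(actual_mv[1] - candidate_list[i][1]))
    mvp = candidate_list[idx]
    mvd = (actual_mv[0] - mvp[0], actual_mv[1] - mvp[1])
    return idx, mvd  # these two values go into the bitstream

def decode_mv(idx, mvd, candidate_list):
    mvp = candidate_list[idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

cands = [(4, -2), (6, 0)]
idx, mvd = encode_mv((5, -1), cands)
assert decode_mv(idx, mvd, cands) == (5, -1)  # round-trips exactly
```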
  • The precision of a motion vector includes integer-pixel precision and fractional-pixel precision. Since pixels at fractional positions do not exist, the reference block must be interpolated to obtain pixels at sub-pixel positions. Interpolation uses the values of integer pixels to generate fractional pixels between the integer samples. The more fractional pixels generated between integer pixels, the higher the resolution of the reference block becomes, and the more finely and accurately displacements of fractional-pixel precision can be compensated. As interpolation precision improves, the efficiency of motion estimation and motion compensation improves to a certain extent.
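Half-pel interpolation of the kind described above can be sketched with a 1-D FIR filter. The 6-tap coefficients below are the well-known H.264 half-pel luma filter (1, -5, 20, 20, -5, 1)/32, used purely as a concrete example; the patent does not fix particular coefficients at this point.

```python
# Sketch of half-pel interpolation: a fractional sample is generated from
# neighbouring integer samples with an FIR filter. The coefficients are the
# H.264 half-pel luma filter, used here only as a concrete example.

TAPS = (1, -5, 20, 20, -5, 1)  # coefficients sum to 32

def half_pel(samples, i):
    """Half-pel sample between integer positions i and i+1 of a 1-D row.

    Edge handling: clamp indices to the row (simple border extension).
    """
    acc = 0
    for k, c in enumerate(TAPS):
        j = min(max(i - 2 + k, 0), len(samples) - 1)
        acc += c * samples[j]
    return (acc + 16) >> 5  # rounded division by 32

row = [10, 10, 10, 50, 50, 50]
# Between two flat regions the filter lands on their midpoint.
print(half_pel(row, 2))  # -> 30
```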
  • The motion vector precision in the Affine mode can be an integer, that is, integer-pixel precision, such as integer-pixel or 2-pixel precision; it can also be non-integer, that is, sub-pixel precision, such as 1/2, 1/4, or 1/8 pixel precision.
  • the pixel at 1/2 precision position needs to be obtained by interpolation of the pixel at the whole pixel position.
  • the pixel values of other precision positions need to be obtained by further interpolation using integer-pixel precision pixels or 1/2-precision pixels.
  • an interpolation filter can be selected according to the selected motion vector accuracy to perform interpolation processing on the reference block.
  • the same interpolation filter may be used for all motion vector accuracy.
  • the existing six-tap interpolation filter is used by default.
  • the identification bit that characterizes the type of the interpolation filter may not be set in the code stream, thereby saving one bit of data.
  • different interpolation filters can be selected according to different motion vector accuracy.
  • For example, 1/2-pixel precision uses a 6-tap interpolation filter, while all other precisions use an 8-tap interpolation filter. Accordingly, in an embodiment of the present invention, when 1/2-pixel precision is selected as the motion vector precision, a first interpolation filter is selected to interpolate the reference block; when any precision other than 1/2-pixel precision is selected as the motion vector precision, a second interpolation filter is selected to interpolate the reference block, where the first and second interpolation filters have different numbers of taps.
  • the first interpolation filter may be a 6-tap interpolation filter
  • the second interpolation filter may be an 8-tap interpolation filter.
  • The filter type identification bit can be set in the bitstream. For example, 1 can indicate that the 6-tap interpolation filter is used, and 0 that it is not used, i.e., that the default 8-tap interpolation filter is used.
  • When the motion estimation method 200 is applied to the decoding end and different interpolation filters are selected for different motion vector precisions, before selecting the interpolation filter according to the motion vector precision, the method further includes: acquiring the bitstream, in which an identification bit of the filter type corresponding to the motion vector is set.
  • motion estimation can include both Affine mode and regular AMVP mode.
  • motion estimation is performed with the entire coding unit as a unit.
  • For each conventional coding unit, when adaptive motion vector accuracy (AMVR) is applied, one of four motion vector precisions is likewise adaptively selected for motion estimation.
  • The four motion vector precisions of the conventional coding unit may be the same as or different from the four motion vector precisions of the affine coding unit.
  • the four kinds of motion vector precisions may include integer pixel, 4 pixel, 1/4 pixel and 1/2 pixel precision.
  • the accuracy of the motion vector is not limited to the above four types, for example, it may also include 1/8 pixel, 1/16 pixel, and so on.
  • the corresponding motion vector accuracy is adaptively decided at the coding end, and the result of the decision is written into the code stream and passed to the decoding end.
  • the identifier indicating the accuracy of the motion vector of the affine coding unit is consistent with the identifier indicating the accuracy of the motion vector of the conventional coding unit, so that the two modes are more unified.
  • When the motion estimation method 200 is applied to the decoding end, before selecting one of the at least four motion vector precisions to perform motion estimation in the reference frame, the method further includes: acquiring the bitstream, whose identification bit records the selected motion vector precision of the affine coding unit; the identifier representing the motion vector precision of the affine coding unit is consistent with the identifier representing the motion vector precision of the conventional coding unit.
  • step S120 the affine coding unit is divided into several subunits.
  • The size of the sub-units may be fixed; for example, each sub-unit has a size of 4×4 pixels.
  • the size of the subunit may also be determined in other ways. For example, a subunit of an appropriate size may be selected to reduce the complexity of coding and decoding.
  • step S130 the motion vector of the subunit in the affine coding unit is calculated according to the motion vector of the control point.
  • The motion field of the Affine mode can be derived from the motion vectors of two control points (four parameters) or three control points (six parameters).
  • the motion vector of the subunit located at the (x, y) position is calculated by the following formula (1):
  • (mv0x, mv0y) is the motion vector of the control point in the upper-left corner;
  • (mv1x, mv1y) is the motion vector of the control point in the upper-right corner;
  • x and y are the coordinates of the center point of the subunit;
  • w is the width of the affine coding unit.
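The body of formula (1) did not survive this text extraction. Reconstructed from the standard four-parameter affine motion model (as used in VVC affine prediction), consistent with the symbols defined above, it reads:

```latex
% Four-parameter affine model: motion vector at subunit centre (x, y)
\begin{cases}
mv_x = \dfrac{mv_{1x} - mv_{0x}}{w}\, x - \dfrac{mv_{1y} - mv_{0y}}{w}\, y + mv_{0x} \\[6pt]
mv_y = \dfrac{mv_{1y} - mv_{0y}}{w}\, x + \dfrac{mv_{1x} - mv_{0x}}{w}\, y + mv_{0y}
\end{cases}
\tag{1}
```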
  • the motion vector of the sub-unit at the position (x, y) is calculated by the following formula (2):
  • (mv0x, mv0y) is the motion vector of the control point in the upper-left corner;
  • (mv1x, mv1y) is the motion vector of the control point in the upper-right corner;
  • (mv2x, mv2y) is the motion vector of the control point in the lower-left corner;
  • w and h are the width and height of the affine coding unit, respectively.
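The body of formula (2) is likewise missing. Reconstructed from the standard six-parameter affine motion model (as used in VVC), where h denotes the height of the affine coding unit, it reads:

```latex
% Six-parameter affine model: motion vector at subunit centre (x, y)
\begin{cases}
mv_x = \dfrac{mv_{1x} - mv_{0x}}{w}\, x + \dfrac{mv_{2x} - mv_{0x}}{h}\, y + mv_{0x} \\[6pt]
mv_y = \dfrac{mv_{1y} - mv_{0y}}{w}\, x + \dfrac{mv_{2y} - mv_{0y}}{h}\, y + mv_{0y}
\end{cases}
\tag{2}
```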
  • a schematic diagram of the motion vector in an affine coding unit is shown in Fig. 3, where each square represents a 4 ⁇ 4 size subunit. All motion vectors after the calculation of the above formula will be rounded to a 1/16 pixel precision representation.
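As a concrete illustration of step S130, the four-parameter derivation and the rounding to 1/16-pel described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation; a real codec would use fixed-point integer arithmetic rather than Python fractions.

```python
# Illustrative sketch of step S130: derive per-subunit motion vectors from
# two control-point MVs using the standard four-parameter affine model,
# then round each MV to an integer number of 1/16-pel units. Exact rational
# arithmetic stands in for a codec's fixed-point math.
from fractions import Fraction as F

def subunit_mvs(mv0, mv1, w, h, sub=4):
    """mv0/mv1: top-left and top-right control-point MVs in pixels;
    w, h: coding-unit width/height; sub: subunit size (4x4 here)."""
    ax = F(mv1[0] - mv0[0], w)  # (mv1x - mv0x) / w
    ay = F(mv1[1] - mv0[1], w)  # (mv1y - mv0y) / w
    field = {}
    for y0 in range(0, h, sub):
        for x0 in range(0, w, sub):
            cx, cy = x0 + F(sub, 2), y0 + F(sub, 2)  # subunit centre
            mvx = ax * cx - ay * cy + mv0[0]
            mvy = ay * cx + ax * cy + mv0[1]
            # Round to 1/16-pel units.
            field[(x0, y0)] = (round(mvx * 16), round(mvy * 16))
    return field

# An 8x8 CU whose top-right control point moves 1 px right: a zoom-like field.
field = subunit_mvs((0, 0), (1, 0), 8, 8)
print(field[(0, 0)], field[(4, 4)])  # -> (4, 4) (12, 12)
```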
  • the size of the subunits of the chrominance component and the luminance component are both 4 ⁇ 4, and the motion vector of the chrominance component 4 ⁇ 4 subunit can be obtained by averaging the motion vectors of the corresponding four 4 ⁇ 4 luminance components.
  • the prediction block of each subunit in the reference frame can be obtained through a motion compensation process. After that, the prediction frame can be obtained by using the motion vector and the prediction block.
  • The encoding end transforms and quantizes the difference between the prediction frame and the actual current frame and transmits it to the decoding end; using the motion vector, the reference frame, and the difference between the predicted frame and the current frame, the decoding end can reconstruct the current frame.
  • the motion estimation method unifies the design of the motion vector accuracy in the affine mode with the motion vector accuracy in the normal mode, and improves the coding performance.
  • Fig. 4 shows a flowchart of a motion estimation method 400 according to another embodiment of the present invention. As shown in FIG. 4, the method 400 includes the following steps:
  • step S410 for the affine coding unit in the current frame, one of multiple motion vector precisions is selected to perform motion estimation in the reference frame, thereby determining the motion vectors of the control points of the affine coding unit, where the multiple motion vector precisions include 1/2-pixel precision;
  • step S420 the affine coding unit is divided into several subunits
  • step S430 the motion vector of the sub-unit in the affine coding unit is calculated according to the motion vector of the control point.
  • the current frame is the video frame currently to be encoded.
  • the reference frame is the video frame to be referred to when encoding the current frame.
  • the reference frame in this embodiment includes both historical frames and future frames.
  • the affine coding unit in the current frame is a coding unit (CU) divided in the current frame based on the affine motion compensation prediction (Affine) technology.
  • In the Affine technology, the processing unit is no longer the entire coding unit; instead, the coding unit is divided into multiple sub-units, and motion compensation is performed sub-unit by sub-unit.
  • the affine coding unit in the Affine mode no longer has only one motion vector, but each subunit in the affine coding unit has its own motion vector.
  • The motion vector of each subunit in the affine coding unit is derived from the motion vectors of either the two control points of the affine coding unit (the four-parameter model; see the left part of Figure 2) or its three control points (the six-parameter model; see the right part of Figure 2). Only the motion vector information of the control points needs to be written into the bitstream, not the motion vector information of each subunit.
  • the motion vector of the control point needs to be determined first.
  • The motion vector of an object between two adjacent frames may not be exactly an integer number of pixels. Therefore, the embodiment of the present invention adopts the adaptive motion vector accuracy (AMVR) technology, in which the encoder side adaptively determines the precision of the motion vector.
  • The determination of the motion vector of the control point is based on the Inter mode (also known as the AMVP mode) of the Affine mode. In this mode, the motion vector precision is selected on the encoder side, and the MVD (Motion Vector Difference) is calculated.
  • one is selected from multiple types of motion vector accuracy to perform motion estimation in the reference frame, where the multiple types of motion vector accuracy include 1/2 pixel accuracy.
  • In addition to 1/2-pixel precision, the multiple motion vector precisions may include any of 4-pixel, 2-pixel, integer-pixel, 1/4-pixel, 1/8-pixel, and 1/16-pixel precision.
  • one can be selected from integer pixel accuracy, 1/2 pixel accuracy, 1/4 pixel accuracy, and 1/16 pixel accuracy for motion estimation.
  • the conventional AMVP mode adds 1/2 pixel AMVR accuracy. Therefore, the embodiment of the present invention adds 1/2 pixel precision to the optional motion vector precision, so that the design of the motion vector precision of the affine coding unit matches the design of the motion vector precision of the conventional coding unit.
  • Note that the motion vector precision of the control points referred to in the Affine mode is not the precision actually used in the process of sub-unit motion compensation.
  • the method for determining the accuracy of the motion vector includes: selecting the accuracy of the motion vector according to the selected motion vector accuracy of the neighboring coding unit.
  • the method for determining the accuracy of a motion vector may further include: attempting to perform motion estimation based on at least two of the four kinds of motion vector accuracy, and selecting the motion vector accuracy based on the effect of the motion estimation.
  • For example, two of the optional motion vector precisions can be selected and motion estimation attempted with each, comparing the results of the two attempts. If the lower motion vector precision gives the better motion estimation result, stop trying and directly use the lower precision as the selected motion vector precision. If the higher motion vector precision gives the better result, continue trying higher precisions until the best motion estimation result is obtained.
  • Determining the motion vectors of the control points of the affine coding unit includes: first, obtaining the motion vectors of spatially or temporally adjacent coding units; then combining the obtained motion vectors to construct a candidate list of control-point motion vectors, where the number of motion vectors in each combination depends on the number of control points of the affine coding unit.
  • A motion vector predictor (MVP) is selected from the candidate list, and the corresponding reference block can be determined in the reference frame according to the predicted motion vector.
  • Interpolation processing is performed on the reference block to generate fractional pixels, after which the actual motion vector is determined. Interpolation uses the values of integer pixels to generate fractional pixels between the integer samples. The more fractional pixels generated between integer pixels, the higher the resolution of the reference frame becomes, and the more finely and accurately displacements of fractional-pixel precision can be compensated. As interpolation precision improves, the efficiency of motion estimation and motion compensation improves to a certain extent.
  • The motion vector precision in the Affine mode can be an integer, that is, integer-pixel precision, such as integer-pixel or 2-pixel precision; it can also be non-integer, that is, sub-pixel precision, such as 1/2, 1/4, or 1/8 pixel precision.
  • the pixel at 1/2 precision position needs to be obtained by interpolation of the pixel at the whole pixel position.
  • the pixel values of other precision positions need to be obtained by further interpolation using integer-pixel precision pixels or 1/2-precision pixels.
  • an interpolation filter can be selected according to the selected motion vector accuracy to perform interpolation processing on the reference block.
  • the same interpolation filter may be used for all motion vector accuracy.
  • the existing six-tap interpolation filter is used by default.
  • the identification bit that characterizes the type of the interpolation filter may not be set in the code stream, thereby saving one bit of data.
  • different interpolation filters can be selected according to different motion vector accuracy.
  • For example, 1/2-pixel precision uses a 6-tap interpolation filter, while all other precisions use an 8-tap interpolation filter. Accordingly, in an embodiment of the present invention, when 1/2-pixel precision is selected as the motion vector precision, a first interpolation filter is selected to interpolate the reference block; when any precision other than 1/2-pixel precision is selected as the motion vector precision, a second interpolation filter is selected to interpolate the reference block, where the first and second interpolation filters have different numbers of taps.
  • the first interpolation filter may be a 6-tap interpolation filter
  • the second interpolation filter may be an 8-tap interpolation filter.
  • the filter type identification bit can be set in the code stream. For example, 1 can be used to indicate that a 6-tap interpolation filter is used; 0 can be used to indicate that a 6-tap interpolation filter is not used, that is, the default 8-tap interpolation filter is used.
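The selection rule and the one-bit filter-type flag described above can be sketched as follows; the function name and the use of 0.5 to denote 1/2-pixel precision are illustrative assumptions, not syntax defined by the embodiments.

```python
def select_interpolation_filter(mv_precision):
    """Pick the interpolation filter for a given motion-vector precision,
    following the rule above: 1/2-pel precision uses the 6-tap filter,
    every other precision uses the default 8-tap filter.
    Returns (num_taps, flag_bit), where flag_bit is the one-bit filter-type
    identifier written into the code stream (1 = 6-tap, 0 = default 8-tap)."""
    return (6, 1) if mv_precision == 0.5 else (8, 0)

taps, flag = select_interpolation_filter(0.5)   # -> (6, 1)
taps, flag = select_interpolation_filter(0.25)  # -> (8, 0)
```

Because the flag is fully determined by the chosen precision, a decoder that already knows the precision could also infer the filter without the flag, which is the bit-saving variant mentioned earlier.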
  • when the motion estimation method 400 is applied to the decoding end, if different interpolation filters are selected for different motion vector accuracies, then before selecting the interpolation filter according to the motion vector accuracy, the motion estimation method 400 further includes: acquiring a code stream, the code stream being provided with an identification bit of the filter type corresponding to the motion vector.
  • motion estimation may include both Affine mode and regular AMVP mode.
  • AMVP: regular AMVP (advanced motion vector prediction) mode
  • motion estimation is performed with the entire coding unit as a unit.
  • for each conventional coding unit, when adaptive motion vector precision (AMVR) is applied, the method also includes adaptively selecting one of multiple motion vector precisions for motion estimation, the multiple motion vector precisions including 1/2-pixel precision. Apart from 1/2-pixel precision, the optional motion vector precisions of the conventional coding unit may be the same as or different from those of the affine coding unit. In one embodiment, the conventional coding unit also has four optional motion vector precisions.
  • AMVR: adaptive motion vector precision
  • the corresponding motion vector accuracy is adaptively decided at the coding end, and the result of the decision is written into the code stream and passed to the decoding end.
  • the identifier indicating the accuracy of the motion vector of the affine coding unit is consistent with the identifier indicating the accuracy of the motion vector of the conventional coding unit, so that the two modes are more unified.
  • when the motion estimation method 400 is applied to the decoding end, before selecting one of the at least four motion vector precisions to perform motion estimation in the reference frame, the method further includes: obtaining a code stream, wherein an identification bit of the code stream records the selected motion vector accuracy of the affine coding unit, and the identifier representing the motion vector accuracy of the affine coding unit is consistent with the identifier representing the motion vector accuracy of the conventional coding unit.
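The adaptive precision selection above amounts to rounding candidate motion vectors to each available precision and signalling the index of the winner with an identifier shared by affine and conventional coding units. The sketch below stores motion vectors in 1/16-pel units; the particular index-to-precision table is an illustrative assumption, not the mapping mandated by the embodiments.

```python
# Shared index-to-precision table (step sizes in 1/16-pel units) so affine
# and conventional coding units use a consistent identifier. The concrete
# index assignment here is illustrative.
MV_PRECISIONS = {0: 1, 1: 4, 2: 8, 3: 16}  # 1/16-, 1/4-, 1/2-, 1-pel

def round_mv(mv_sixteenth_pel, precision_idx):
    """Round one MV component, stored in 1/16-pel units, to the selected
    precision (rounding half away from zero)."""
    step = MV_PRECISIONS[precision_idx]
    sign = -1 if mv_sixteenth_pel < 0 else 1
    return sign * ((abs(mv_sixteenth_pel) + step // 2) // step * step)

mv = 13                    # 13/16 pel
rounded = round_mv(mv, 2)  # to 1/2-pel precision -> 16, i.e. exactly 1 pel
```

At the encoding end, each candidate precision would be evaluated by rate-distortion cost and the winning index written to the code stream; the decoder simply reads the index and applies the same table.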
  • in step S420, the affine coding unit is divided into several sub-units, and in step S430, the motion vectors of the sub-units in the affine coding unit are calculated according to the motion vector of the control point.
  • for step S420 and step S430, reference may be made to the related description of step S120 and step S130 of the method 100, which will not be repeated here.
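The subdivision and per-sub-unit motion vector derivation of steps S420 and S430 can be sketched with the standard 4-parameter affine model, which uses two control points at the top-left and top-right corners of the coding unit. The 4×4 sub-block size, the floating-point arithmetic, and the function name are illustrative simplifications of this sketch, not details fixed by the embodiments.

```python
def affine_subblock_mvs(v0, v1, w, h, sub=4):
    """Derive per-sub-block motion vectors of an affine coding unit from
    the motion vectors of its control points (4-parameter affine model;
    the 6-parameter, three-control-point variant is analogous).
    v0, v1: (mvx, mvy) at the top-left / top-right corners; w, h: coding
    unit size in pixels; sub: sub-block edge length. Each sub-block takes
    the model's motion vector evaluated at the sub-block centre."""
    a = (v1[0] - v0[0]) / w  # horizontal gradient of mvx
    b = (v1[1] - v0[1]) / w  # horizontal gradient of mvy
    mvs = []
    for y in range(sub // 2, h, sub):  # sub-block centre positions
        row = []
        for x in range(sub // 2, w, sub):
            mvx = a * x - b * y + v0[0]
            mvy = b * x + a * y + v0[1]
            row.append((mvx, mvy))
        mvs.append(row)
    return mvs

# A 16x8 unit whose right edge moves 0.5 pel/pixel faster than its left edge:
mvs = affine_subblock_mvs((0.0, 0.0), (8.0, 0.0), 16, 8)
```

Each sub-block then performs ordinary block-based motion compensation with its own motion vector, which is where the selected motion vector precision and interpolation filter from the preceding sections come into play.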
  • the motion estimation method adds 1/2-pixel precision to the optional motion vector precisions in affine mode, so that the motion vector precision design in affine mode is unified with that in normal mode, and the coding performance is improved.
  • the following describes a motion estimation system 500 according to an embodiment of the present invention with reference to FIG. 5.
  • FIG. 5 is a schematic block diagram of a motion estimation system 500 according to an embodiment of the present invention.
  • the motion estimation system 500 shown in FIG. 5 includes a processor 510, a storage device 520, and a computer program stored on the storage device 520 and running on the processor 510.
  • when the processor executes the program, it implements the steps of the motion estimation method 100 shown in FIG. 1 or the motion estimation method 400 shown in FIG. 4.
  • the processor 510 may be a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another form of processing unit with data processing capabilities and/or instruction execution capabilities.
  • the processor 510 may be a central processing unit (CPU) or other form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the motion estimation system 500 to execute the desired functions.
  • the processor 510 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSM), digital signal processors (DSP), or combinations thereof.
  • the storage device 520 includes one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like.
  • one or more computer program instructions may be stored on the computer-readable storage medium, and the processor 510 may run the program instructions to implement the motion estimation method in the embodiments of the present invention described herein and/or other desired functions.
  • various application programs and various data, such as data used and/or generated by the application programs, can also be stored in the computer-readable storage medium.
  • the system 500 further includes an input device (not shown).
  • the input device may be a device used by the user to input instructions, and may include one or more of operation keys, a keyboard, a mouse, a microphone, and a touch screen.
  • the input device may also be any interface for receiving information.
  • the system 500 further includes an output device that can output various information (such as images or sounds) to the outside (such as to a user), and may include one or more of a display (for example, to display a video image to the user), speakers, and the like.
  • the output device may also be any other device with output function.
  • system 500 further includes a communication interface, which is used to communicate with other devices, including wired or wireless communication.
  • the processor implements the following steps when executing the program: for the affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vector of the control point of the affine coding unit; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the processor implements the following steps when executing the program: for the affine coding unit in the current frame, selecting one of a variety of motion vector precisions to perform motion estimation in the reference frame, thereby determining the motion vector of the control point of the affine coding unit, wherein the variety of motion vector precisions includes 1/2-pixel precision; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the embodiment of the present invention also provides a storage medium on which a computer program is stored.
  • the computer program is executed by the processor, the steps of the method shown in FIG. 1 or FIG. 4 can be implemented.
  • the storage medium is a computer-readable storage medium.
  • the computer-readable storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the above storage media.
  • the computer-readable storage medium may be any combination of one or more computer-readable storage media.
  • the computer program instructions, when run by the computer or processor, cause the computer or processor to perform the following steps: for the affine coding unit in the current frame, selecting one of at least four motion vector precisions to perform motion estimation in the reference frame to determine the motion vector of the control point of the affine coding unit; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the computer program instructions, when run by the computer or processor, cause the computer or processor to perform the following steps: for the affine coding unit in the current frame, selecting one of a variety of motion vector precisions to perform motion estimation in a reference frame to determine the motion vector of the control point of the affine coding unit, wherein the variety of motion vector precisions includes 1/2-pixel precision; dividing the affine coding unit into several sub-units; and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point.
  • the motion estimation method, system and storage medium of the present invention unify the design of the motion vector accuracy in the affine mode with that in the conventional mode, improve the coding performance, can be used to improve the quality of compressed video and enhance the hardware friendliness of the codec, and are of great significance to the video compression processing of broadcast television, video conferencing, network video, and the like.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative; for example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • the technical solution of the present invention, in essence, or the part that contributes to the prior art, or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present invention.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disc, and other media that can store program code.
  • the various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by a combination of them.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some modules according to the embodiments of the present invention.
  • DSP: digital signal processor
  • the present invention can also be implemented as a device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein.
  • Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals.
  • Such a signal can be downloaded from an Internet website, or provided on a carrier signal, or provided in any other form.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Motion estimation method and system, and storage medium. The method comprises the following steps: for an affine coding unit in a current frame, selecting one of at least four motion vector precisions to perform motion estimation in a reference frame, so as to determine the motion vector of a control point of the affine coding unit (S110); dividing the affine coding unit into a plurality of sub-units (S120); and calculating the motion vectors of the sub-units in the affine coding unit according to the motion vector of the control point (S130). The motion estimation method and system and the storage medium unify the design of the motion vector precision in affine mode with the motion vector precision in the conventional mode, thereby improving coding performance.
PCT/CN2019/107601 2019-09-24 2019-09-24 Procédé et système d'estimation de mouvement, et support de stockage WO2021056215A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980066902.6A CN112868234A (zh) 2019-09-24 2019-09-24 运动估计方法、系统和存储介质
PCT/CN2019/107601 WO2021056215A1 (fr) 2019-09-24 2019-09-24 Procédé et système d'estimation de mouvement, et support de stockage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/107601 WO2021056215A1 (fr) 2019-09-24 2019-09-24 Procédé et système d'estimation de mouvement, et support de stockage

Publications (1)

Publication Number Publication Date
WO2021056215A1 true WO2021056215A1 (fr) 2021-04-01

Family

ID=75165894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/107601 WO2021056215A1 (fr) 2019-09-24 2019-09-24 Procédé et système d'estimation de mouvement, et support de stockage

Country Status (2)

Country Link
CN (1) CN112868234A (fr)
WO (1) WO2021056215A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113630602A (zh) * 2021-06-29 2021-11-09 杭州未名信科科技有限公司 编码单元的仿射运动估计方法、装置、存储介质及终端
CN113630601B (zh) * 2021-06-29 2024-04-02 杭州未名信科科技有限公司 一种仿射运动估计方法、装置、设备及存储介质

Citations (5)

Publication number Priority date Publication date Assignee Title
CN107277506A (zh) * 2017-08-15 2017-10-20 中南大学 一种基于自适应运动矢量精度的运动矢量精度快速选择方法及装置
CN108781284A (zh) * 2016-03-15 2018-11-09 联发科技股份有限公司 具有仿射运动补偿的视频编解码的方法及装置
CN109155854A (zh) * 2016-05-27 2019-01-04 松下电器(美国)知识产权公司 编码装置、解码装置、编码方法及解码方法
CN109792532A (zh) * 2016-10-04 2019-05-21 高通股份有限公司 用于视频译码的适应性运动向量精准度
CN110620932A (zh) * 2018-06-19 2019-12-27 北京字节跳动网络技术有限公司 依赖模式的运动矢量差精度集

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
WO2011021914A2 (fr) * 2009-08-21 2011-02-24 에스케이텔레콤 주식회사 Procédé et appareil de codage/décodage d'images utilisant une résolution de vecteur de mouvement adaptative
EP4221202A1 (fr) * 2015-06-05 2023-08-02 Dolby Laboratories Licensing Corporation Procédé de codage et de décodage d'image et dispositif de décodage d'image
US20190364284A1 (en) * 2017-01-16 2019-11-28 Industry Academy Cooperation Foundation Of Sejong University Image encoding/decoding method and device
WO2019072187A1 (fr) * 2017-10-13 2019-04-18 Huawei Technologies Co., Ltd. Élagage de liste de candidats de modèle de mouvement pour une inter-prédiction
CN109729352B (zh) * 2017-10-27 2020-07-21 华为技术有限公司 确定仿射编码块的运动矢量的方法和装置
US20190246134A1 (en) * 2018-02-06 2019-08-08 Panasonic Intellectual Property Corporation Of America Encoding method, decoding method, encoder, and decoder
WO2019160860A1 (fr) * 2018-02-14 2019-08-22 Futurewei Technologies, Inc. Filtre d'interpolation adaptatif

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN108781284A (zh) * 2016-03-15 2018-11-09 联发科技股份有限公司 具有仿射运动补偿的视频编解码的方法及装置
CN109155854A (zh) * 2016-05-27 2019-01-04 松下电器(美国)知识产权公司 编码装置、解码装置、编码方法及解码方法
CN109792532A (zh) * 2016-10-04 2019-05-21 高通股份有限公司 用于视频译码的适应性运动向量精准度
CN107277506A (zh) * 2017-08-15 2017-10-20 中南大学 一种基于自适应运动矢量精度的运动矢量精度快速选择方法及装置
CN110620932A (zh) * 2018-06-19 2019-12-27 北京字节跳动网络技术有限公司 依赖模式的运动矢量差精度集

Non-Patent Citations (1)

Title
WANG ZHAO, WANG SHIQI, ZHANG JIAN, MA SIWEI: "Adaptive Progressive Motion Vector Resolution Selection Based on Rate–Distortion Optimization", IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE Service Center, Piscataway, NJ, US, vol. 26, no. 1, 1 January 2017 (2017-01-01), pages 400-413, XP055795908, ISSN: 1057-7149, DOI: 10.1109/TIP.2016.2627814 *

Also Published As

Publication number Publication date
CN112868234A (zh) 2021-05-28

Similar Documents

Publication Publication Date Title
TWI729422B (zh) 色彩分量間的子區塊移動向量繼承
CN112889269B (zh) 视频解码方法及装置
US11178419B2 (en) Picture prediction method and related apparatus
CN112470474B (zh) 视频编解码的方法和装置
US11856220B2 (en) Reducing computational complexity when video encoding uses bi-predictively encoded frames
JP6490203B2 (ja) 画像予測方法および関連装置
JP2021182752A (ja) 画像予測方法および関連装置
WO2017005146A1 (fr) Procédé et dispositif d'encodage et de décodage vidéo
JP6905093B2 (ja) 映像コーディングにおける動き補償予測のオプティカルフロー推定
WO2019242563A1 (fr) Procédé et dispositif de codage et de décodage vidéo, support de stockage et dispositif informatique
TW201813396A (zh) 用於視訊編解碼的基於模型的運動向量推導
JP6945654B2 (ja) 低減されたメモリアクセスを用いてfrucモードでビデオデータを符号化又は復号する方法及び装置
WO2020140331A1 (fr) Procédé et dispositif de traitement d'image de vidéo
TW201526617A (zh) 影像處理方法與系統、解碼方法、編碼器與解碼器
JP2022508074A (ja) スキップ及びマージモードのためのマルチ仮説のシグナリング及び動きベクトル差分によるマージの距離オフセットテーブルのシグナリングのための方法及び装置
JP2022515031A (ja) ビデオコーディングのための方法、機器及びコンピュータ・プログラム
KR102059066B1 (ko) 모션 벡터 필드 코딩 방법 및 디코딩 방법, 및 코딩 및 디코딩 장치들
WO2016065872A1 (fr) Procédé de prédiction d'image, et dispositif correspondant
WO2017201678A1 (fr) Procédé de prédiction d'image et dispositif associé
TWI790662B (zh) 一種編解碼方法、裝置及其設備
KR20200125698A (ko) 서브-블록 모션 벡터 예측을 위한 방법 및 장치
WO2021056215A1 (fr) Procédé et système d'estimation de mouvement, et support de stockage
TW201937924A (zh) 用於改進獲得線性分量樣本預測參數的方法以及裝置
CN110719467B (zh) 色度块的预测方法、编码器及存储介质
WO2022227622A1 (fr) Procédés et dispositifs de codage et de décodage de prédiction conjointe inter-trame et intra-trame configurables en poids

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19946975

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19946975

Country of ref document: EP

Kind code of ref document: A1