WO2020181507A1 - Image processing method and apparatus - Google Patents

Image processing method and apparatus

Info

Publication number
WO2020181507A1
WO2020181507A1 PCT/CN2019/077894 CN2019077894W WO2020181507A1 WO 2020181507 A1 WO2020181507 A1 WO 2020181507A1 CN 2019077894 W CN2019077894 W CN 2019077894W WO 2020181507 A1 WO2020181507 A1 WO 2020181507A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
motion vector
sub
cpmv
prediction
Prior art date
Application number
PCT/CN2019/077894
Other languages
French (fr)
Chinese (zh)
Inventor
孟学苇
郑萧桢
王苫社
马思伟
Original Assignee
北京大学
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学, 深圳市大疆创新科技有限公司 filed Critical 北京大学
Priority to PCT/CN2019/077894 priority Critical patent/WO2020181507A1/en
Priority to CN201980005232.7A priority patent/CN111247804B/en
Publication of WO2020181507A1 publication Critical patent/WO2020181507A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/107Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/56Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop

Definitions

  • This application relates to the field of image processing, and more specifically, to an image processing method and device.
  • Inter-frame prediction exploits the temporal correlation between adjacent frames of a video: it uses a previously coded and reconstructed frame as the reference frame and predicts the current frame through motion estimation and motion compensation, thereby removing temporal redundancy from the video.
  • the general process of inter-frame prediction includes Motion Estimation (ME) and Motion Compensation (MC).
  • For the current coding block of the current frame, the most similar block is searched for in the reference frame and used as the prediction block of the current block; the relative displacement between the current block and that similar block is the motion vector (MV).
  • the process of motion estimation is the process of obtaining a motion vector after searching and comparing the current coding block of the current frame in the reference frame.
  • Motion compensation is the process of obtaining prediction frames using MV and reference frames.
  • The predicted frame obtained by motion compensation may differ from the original current frame. Therefore, in addition to the MV and the reference frame information, the difference (residual) between the predicted frame and the current frame must be transmitted to the decoder after transformation, quantization, and so on.
  • the decoder can reconstruct the current frame through the MV, the reference frame, and the difference between the predicted frame and the current frame.
  • The motion vector of an object between two adjacent frames may not be exactly an integer number of pixel units.
  • Therefore, motion estimation with sub-pixel accuracy was proposed.
  • For example, in the High Efficiency Video Coding (HEVC) standard, motion vectors with 1/4-pixel precision are used for motion estimation of the luminance component.
  • However, digital video has no sample values at fractional pixel positions.
  • The values at these fractional positions must therefore be approximated by interpolation: the reference frame is interpolated K-fold in the row direction and in the column direction, and the prediction block is then searched for in the interpolated reference frame.
  • During interpolation, the pixels in the current block and the pixels in the adjacent area need to be used.
  • In affine motion compensated prediction (Affine), the affine motion field of an image block can be derived from the motion vectors of two control points (four parameters) or three control points (six parameters).
  • The processing unit of the Affine technique is a sub-CU (also referred to as a sub-block) of size 4×4 (unit: pixels), which causes the Affine technique to generate considerable bandwidth pressure.
  • the present application provides an image processing method and device, which can reduce the bandwidth pressure caused by the Affine prediction technology to a certain extent.
  • In a first aspect, an image processing method is provided. The method includes: obtaining the control point motion vector (CPMV) of an image block; and obtaining, according to the CPMV of the image block, the motion vector of a sub-image block in the image block, where the motion vector has integer pixel accuracy.
  • In a second aspect, an image processing device is provided, comprising: a first acquisition unit, configured to acquire the CPMV of an image block; and a second acquisition unit, configured to acquire, according to the CPMV obtained by the first acquisition unit, the motion vector of a sub-image block in the image block, where the motion vector has integer pixel accuracy.
  • In a third aspect, an image processing device is provided. The device includes a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of those instructions causes the processor to perform the method provided in the first aspect.
  • In a fourth aspect, a chip is provided. The chip includes a processing module and a communication interface; the processing module is configured to control the communication interface to communicate with the outside, and the processing module is also configured to implement the method provided in the first aspect.
  • In a fifth aspect, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a computer, the computer implements the method in the first aspect or in any possible implementation of the first aspect.
  • In a sixth aspect, a computer program product containing instructions is provided; when executed by a computer, the instructions cause the computer to implement the method provided in the first aspect.
  • Because the motion vector of the sub-image block has integer pixel accuracy, the motion compensation process of the sub-image block does not involve sub-pixels, which can reduce, to a certain extent, the bandwidth pressure caused by the Affine prediction technology.
  • Figure 1 is a schematic diagram of a video coding architecture.
  • Figure 2 is a schematic diagram of 1/4 pixel interpolation.
  • Figures 3(a) and 3(b) are schematic diagrams of the four-parameter Affine model and the six-parameter Affine model, respectively.
  • Figure 4 is a schematic diagram of the Affine motion vector field.
  • Fig. 5 is a comparison diagram of reference pixels required by the Affine mode and the HEVC mode in the prior art.
  • Fig. 6 is a schematic flowchart of an image processing method according to an embodiment of the present application.
  • Fig. 7 is another schematic flowchart of an image processing method according to an embodiment of the present application.
  • Fig. 8 is another schematic flowchart of the image processing method according to an embodiment of the present application.
  • Fig. 9 is a schematic flowchart of an image processing apparatus according to an embodiment of the present application.
  • Fig. 10 is another schematic flowchart of an image processing apparatus according to an embodiment of the present application.
  • the video coding framework mainly includes intra-frame prediction, inter-frame prediction, transformation, quantization, entropy coding, and loop filtering.
  • This application mainly aims to improve the inter-frame prediction (inter prediction) part.
  • Inter-frame prediction uses the temporal correlation between adjacent frames of a video: it takes a reconstructed frame as the reference frame and predicts the current frame through Motion Estimation (ME) and Motion Compensation (MC), so as to remove the temporal redundancy of the video.
  • In the encoding scene, the current frame mentioned in this article means the frame currently being encoded; in the decoding scene, it means the frame currently being decoded.
  • In the encoding scene, the reconstructed frame mentioned in this article means a previously encoded frame; in the decoding scene, it means a previously decoded frame.
  • In the encoding process, a whole frame is usually not processed directly; instead, it is divided into image blocks for processing.
  • For example, a frame is first divided into Coding Tree Units (CTUs); the size of a CTU is, for example, 64×64 or 128×128 (unit: pixels).
  • A CTU can be further divided into square or rectangular Coding Units (CUs).
  • The sizes of all image blocks mentioned in this article are given in pixels.
  • Motion estimation refers to the process of obtaining a motion vector after searching and comparing the current block of the current frame in the reference frame.
  • Motion compensation refers to the process of obtaining a prediction block using a reference block and a motion vector obtained by motion estimation.
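  • The two definitions above can be illustrated with a minimal sketch. It assumes an exhaustive integer-pixel search with the sum of absolute differences (SAD) as the matching cost; the cost measure, the search strategy, and all function names are illustrative assumptions and are not specified by this application.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_estimation(cur, ref, bx, by, size, search_range):
    """Full search over integer displacements; returns the (dx, dy)
    with the lowest SAD (hypothetical helper, not from the source)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv


def motion_compensation(ref, bx, by, mv, size):
    """Fetch the prediction block that the motion vector points to."""
    dx, dy = mv
    return [[ref[by + dy + y][bx + dx + x] for x in range(size)]
            for y in range(size)]
```

  • In this sketch the encoder would call motion_estimation once per block and transmit the resulting vector, while both encoder and decoder would call motion_compensation to build the prediction block.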
  • the prediction block obtained by the inter-frame prediction process may be different from the original current block. Therefore, it is necessary to calculate the difference between the prediction block and the current block, and the difference may be called the residual. After performing transformation, quantization, entropy coding and other processing on the residual, the coded bit stream is obtained.
  • the bit stream and encoding mode information are stored or sent to the decoding end.
  • After obtaining the entropy-coded bitstream, the decoding end first performs entropy decoding on the bitstream to obtain the corresponding residual; it then obtains the prediction block according to the decoded coding mode information, such as the motion vector; finally, from the residual and the prediction block it obtains the value of each pixel in the current block, that is, it reconstructs the current block and, block by block, the current frame.
  • steps such as inverse quantization and inverse transformation may also be included.
  • Inverse quantization (dequantization) refers to the process opposite to the quantization process.
  • Inverse transformation refers to the process opposite to the transformation process.
  • Inter-frame prediction mainly includes forward prediction, backward prediction, bi-prediction and so on.
  • forward prediction is to use the previous reconstructed frame of the current frame (may be called the historical frame) to predict the current frame.
  • Backward prediction is to use frames after the current frame (may be called a future frame) to predict the current frame.
  • Bi-prediction may be bi-directional prediction, that is, both "historical frames” and "future frames” are used to predict the current frame.
  • Bi-prediction can also be prediction in two directions, for example, using two "historical frames” to predict the current frame, or using two "future frames” to predict the current frame.
  • the motion vector of the object between two adjacent frames may not be exactly an integer number of pixels. Therefore, the accuracy of motion estimation needs to be improved to the sub-pixel level (also called 1/K pixel accuracy). For example, in the HEVC standard, motion vectors with 1/4 pixel accuracy are used for motion estimation of the luminance component.
  • the process of 1/4 pixel interpolation is shown in Figure 2.
  • To generate the pixel values of the interpolation points, the 3 pixels to the left of the image block and the 4 pixels to the right of it are used.
  • $a_{0,0}$ and $d_{0,0}$ are 1/4-pixel positions,
  • $b_{0,0}$ and $h_{0,0}$ are half-pixel positions,
  • $c_{0,0}$ and $n_{0,0}$ are 3/4-pixel positions.
  • Suppose the current block is a 2×2 block,
  • namely the 2×2 block enclosed by $A_{0,0}$ to $A_{1,0}$ and $A_{0,0}$ to $A_{0,1}$.
  • To compute all interpolation points in this 2×2 block, some points outside the 2×2 block need to be used, including 3 on the left, 4 on the right, 3 on the top, and 4 on the bottom.
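  • As a rough illustration of why fractional motion vectors are costly for small blocks, the sketch below counts the reference samples a block needs. It assumes that the margins stated above (3 left, 4 right, 3 top, 4 bottom) apply to any block size, as they would with an 8-tap separable filter; the function names are hypothetical.

```python
def reference_samples(width, height, fractional=True,
                      left=3, right=4, top=3, bottom=4):
    """Number of reference samples fetched to predict a width x height block.

    With an integer-pixel motion vector only the block itself is read;
    with a fractional motion vector the interpolation margins are added.
    """
    if not fractional:
        return width * height
    return (width + left + right) * (height + top + bottom)


def samples_per_predicted_pixel(width, height, fractional=True):
    """Fetched reference samples divided by the number of predicted pixels."""
    return reference_samples(width, height, fractional) / (width * height)
```

  • Under these assumptions a 4×4 block with a fractional motion vector reads 11×11 = 121 samples (about 7.6 per predicted pixel), an 8×8 block reads 15×15 = 225 samples (about 3.5 per predicted pixel), and a 4×4 block with an integer-pixel motion vector reads only 16 samples.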
  • Affine motion compensated prediction (hereinafter referred to as Affine) is an inter-frame prediction technology.
  • The motion field of a block in Affine mode can be derived from the motion vectors of two control points (four parameters), as shown in Figure 3(a), or of three control points (six parameters), as shown in Figure 3(b).
  • The motion vector (MV) of a control point is called a control point motion vector (CPMV).
  • The processing unit of Affine is not a CU but a sub-block (sub-CU) obtained by dividing the CU, and the size of each sub-CU is 4×4.
  • Each sub-CU has one MV. Unlike an ordinary CU, a CU in Affine mode therefore does not have only one MV: it has as many MVs as it has sub-CUs.
  • the MV of the sub-CU in one CU is derived through the CPMV calculation of two control points or three control points as shown in FIG. 3.
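  • For the four-parameter Affine model, the MV of the sub-CU at the (x, y) position is calculated by the following formula (1):
    $$\begin{cases} mv_x = \dfrac{mv_{1x}-mv_{0x}}{W}\,x - \dfrac{mv_{1y}-mv_{0y}}{W}\,y + mv_{0x} \\[2mm] mv_y = \dfrac{mv_{1y}-mv_{0y}}{W}\,x + \dfrac{mv_{1x}-mv_{0x}}{W}\,y + mv_{0y} \end{cases} \tag{1}$$
  • For the six-parameter Affine model, the MV of the sub-CU at the (x, y) position is calculated by the following formula (2):
    $$\begin{cases} mv_x = \dfrac{mv_{1x}-mv_{0x}}{W}\,x + \dfrac{mv_{2x}-mv_{0x}}{H}\,y + mv_{0x} \\[2mm] mv_y = \dfrac{mv_{1y}-mv_{0y}}{W}\,x + \dfrac{mv_{2y}-mv_{0y}}{H}\,y + mv_{0y} \end{cases} \tag{2}$$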
  • $(mv_{0x}, mv_{0y})$ is the MV of the upper-left control point,
  • $(mv_{1x}, mv_{1y})$ is the MV of the upper-right control point,
  • $(mv_{2x}, mv_{2y})$ is the MV of the lower-left control point.
  • $W$ in the above formula represents the width of the CU where the sub-CU is located,
  • $H$ represents the height of the CU where the sub-CU is located.
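  • The derivation of a sub-CU MV from the CPMVs can be sketched as follows. The sketch assumes the commonly used four-parameter and six-parameter affine forms (the four-parameter model reuses the horizontal gradient, rotated by 90 degrees, as the vertical gradient); the function name and the use of floating-point arithmetic are illustrative assumptions, and a real codec would use fixed-point arithmetic.

```python
def affine_sub_block_mv(cpmvs, width, height, x, y):
    """Motion vector at position (x, y) inside a width x height CU.

    cpmvs holds two control point MVs (upper left, upper right) for the
    four-parameter model, or three (plus lower left) for the
    six-parameter model. Each MV is an (mvx, mvy) pair.
    """
    (mv0x, mv0y), (mv1x, mv1y) = cpmvs[0], cpmvs[1]
    # Change of the MV per pixel in the horizontal direction.
    ax = (mv1x - mv0x) / width
    ay = (mv1y - mv0y) / width
    if len(cpmvs) == 2:
        # Four parameters: the vertical gradient follows from the horizontal one.
        bx, by = -ay, ax
    else:
        # Six parameters: the vertical gradient comes from the lower-left CPMV.
        mv2x, mv2y = cpmvs[2]
        bx = (mv2x - mv0x) / height
        by = (mv2y - mv0y) / height
    return (ax * x + bx * y + mv0x, ay * x + by * y + mv0y)
```

  • When all CPMVs are equal, every position receives the same MV, so the model degenerates to ordinary translational motion. In general the result is fractional even when the CPMVs are integers, which is the situation the method of this application addresses.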
  • each square represents a sub-CU with a size of 4×4.
  • the MVs of all sub-CUs will be converted into 1/16 pixel precision representation, that is to say, the highest precision of the sub-CU MV is 1/16 pixel.
  • the prediction block of each sub-CU is obtained through the process of motion compensation.
  • The size of the sub-CU is 4×4 for both the chrominance component and the luminance component, and the motion vector of a 4×4 chrominance block is obtained by averaging the motion vectors of the corresponding four 4×4 luminance blocks.
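  • A minimal sketch of this averaging step is given below. The source does not state how the average is rounded, so the sketch returns the plain arithmetic mean; the function name is hypothetical.

```python
def chroma_mv_from_luma(luma_mvs):
    """Average the MVs of the four luma sub-blocks that correspond to
    one chroma sub-block. No rounding is applied (an assumption)."""
    if len(luma_mvs) != 4:
        raise ValueError("expected the MVs of exactly four luma sub-blocks")
    sum_x = sum(mv[0] for mv in luma_mvs)
    sum_y = sum(mv[1] for mv in luma_mvs)
    return (sum_x / 4, sum_y / 4)
```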
  • CPMV information is written in the code stream, and there is no need to write the MV information of each sub-CU.
  • Adaptive Motion Vector Resolution (AMVR) technology allows a CU to have motion vectors with integer-pixel precision or sub-pixel precision.
  • the integer pixel accuracy can be, for example, 1-pixel accuracy, 2-pixel accuracy, or the like.
  • the sub-pixel accuracy can be, for example, 1/2 pixel accuracy, 1/4 pixel accuracy, 1/8 pixel accuracy, or 1/16 pixel accuracy.
  • the corresponding MV accuracy is adaptively decided at the encoding end, and the result of the decision is written into the code stream and passed to the decoding end.
  • The integer-pixel or sub-pixel accuracy referred to in Affine AMVR technology is the pixel accuracy of the CPMV, not the pixel accuracy of the sub-CU MV.
  • Consequently, even if the motion estimation of the CU is an integer-pixel process, the MV of a sub-CU obtained from formula (1) or formula (2) above may have 1/4-pixel accuracy or another sub-pixel accuracy.
  • The motion compensation process of the sub-CU then involves sub-pixels, and since the size of the sub-CU is 4×4, the Affine prediction process generates considerable bandwidth pressure.
  • the simulation result is shown in Figure 5.
  • the box on the right represents the 4×4 bidirectional inter-frame prediction CU in the worst case (MVs with 1/16-pixel and 1/4-pixel precision) in the Affine mode of VVC.
  • this application proposes an image processing method and device, which can reduce the bandwidth pressure generated by the Affine technology to a certain extent.
  • This application is suitable for the field of digital video coding technology, and is specifically used for the inter-frame prediction part of a video codec.
  • This application can be applied to codecs that comply with the international video coding standard H.264/HEVC and the Chinese AVS2 standard, as well as codecs that comply with the next-generation video coding standard VVC or AVS3.
  • This application can be applied to the inter-frame prediction part of a video codec, that is to say, the image processing method according to the embodiment of this application can be executed by an encoding device or a decoding device.
  • FIG. 6 is a schematic flowchart of an image processing method 600 provided by this application.
  • The method 600 includes the following steps.
  • Step 610: obtain the control point motion vector (CPMV) of the image block.
  • Step 620: according to the CPMV of the image block, obtain the motion vector of the sub-image block in the image block, such that the pixel accuracy of the motion vector of the sub-image block is integer pixel accuracy.
  • the sub-image block mentioned in this application represents a processing unit of image processing or video processing.
  • the width and/or height of the sub-image block may be less than 8 pixels.
  • the size of the sub-image block is 4×4 (pixels).
  • the sub-image block may be a block obtained by dividing the image block. It can be understood that if the size of the image block and the sub-image block are the same, the sub-image block can be regarded as the image block itself.
  • the sub-image block may be a square block, for example, a block with a size of 4×4 or 8×8, or a rectangular block, for example, a block with a size of 2×4 or 4×8.
  • the size of the image block mentioned in this application can be 16×16, 16×8, 16×4, 8×16, 4×8, 8×8, 8×4 and other sizes.
  • the motion vector of the sub-image block as the processing unit has an integer pixel accuracy. Therefore, the motion compensation process of the sub-image block does not involve sub-pixels, which can reduce the bandwidth pressure generated by the video inter-frame prediction process.
  • The process of obtaining the motion vector of the sub-image block in the image block may include: calculating the motion vector of the sub-image block according to the motion vectors of two or three control points of the image block, and making the pixel accuracy of the obtained motion vector of the sub-image block integer pixel accuracy.
  • For example, the motion vector of the sub-image block can be calculated according to formula (1) or formula (2) described above.
  • If the motion vector calculated in this way already has integer pixel accuracy, it is the motion vector of the sub-image block to be obtained in this application.
  • As another example, an algorithm is used to calculate the motion vector of the sub-image block according to the CPMV of the image block,
  • and the algorithm can ensure that the calculated motion vector of the sub-image block has integer pixel accuracy.
  • If the calculated motion vector of the sub-image block has sub-pixel accuracy, for example 1/4-pixel, 1/8-pixel, or 1/16-pixel accuracy, the calculated motion vector must additionally be processed to change it from sub-pixel accuracy to integer pixel accuracy.
  • step 620 includes the following steps 1) and 2).
  • In step 1), the first motion vector of the sub-image block is calculated based on the CPMV, and the pixel accuracy of the calculated first motion vector is sub-pixel.
  • In step 2), the second motion vector is obtained from the first motion vector of the sub-image block, such that the end point of the second motion vector is the integer pixel point closest to the end point of the first motion vector.
  • the closest whole pixel point may be the whole pixel point above, below, left or right of the end point of the first motion vector.
  • For example, the following formula (3) is used to calculate the second motion vector (MV2x, MV2y) of the sub-image block according to the first motion vector (MV1x, MV1y) of the sub-image block:
  • MV2x = ((MV1x + (1 << (shift - 1))) >> shift) << shift;
  • MV2y = ((MV1y + (1 << (shift - 1))) >> shift) << shift;
  • the value of shift is related to the storage accuracy of the motion vector in the coding software platform.
  • the storage accuracy of the motion vector is 1/16 accuracy, and the value of shift can be set to 4.
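  • The rounding formula above (add half a pixel, shift right, then shift left by the same amount) can be sketched as follows. The function names are hypothetical, and the default shift of 4 assumes motion vectors stored in 1/16-pixel units as in the example above.

```python
def round_component_to_integer_pel(mv, shift=4):
    """Round one MV component, stored in units of 1/(1 << shift) pixel,
    to the nearest integer-pixel position (still in the same units)."""
    half = 1 << (shift - 1)
    # Python's >> is an arithmetic shift, so negative values round toward
    # negative infinity after the half-pixel offset is added.
    return ((mv + half) >> shift) << shift


def round_mv_to_integer_pel(mv, shift=4):
    """Apply the rounding to both components of an (mvx, mvy) pair."""
    return (round_component_to_integer_pel(mv[0], shift),
            round_component_to_integer_pel(mv[1], shift))
```

  • For instance, with shift = 4 the component 21 (1 and 5/16 pixels) becomes 16 (1 pixel) and the component 24 (1.5 pixels) becomes 32 (2 pixels). With this formula an exact half-pixel value is always rounded toward positive infinity, so -24 becomes -16 rather than -32.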
  • the following formula is used to obtain the second motion vector (MV2x, MV2y) of the sub-image block according to the first motion vector (MV1x, MV1y) of the sub-image block.
  • This application does not limit the manner in which the pixel accuracy of the motion vector is converted from sub-pixel to integer pixel.
  • The pixel accuracy of the CPMV of the image block may be integer pixel or sub-pixel.
  • If the pixel accuracy of the CPMV of the image block is sub-pixel, the pixel accuracy of the motion vector of the sub-image block calculated according to the CPMV is also sub-pixel; if the pixel accuracy of the CPMV of the image block is integer pixel, the pixel accuracy of the motion vector of the sub-image block calculated according to the CPMV may still be sub-pixel.
  • For example, the pixel accuracy of the motion vector of the sub-image block calculated according to formula (1) or formula (2) may be sub-pixel.
  • As a result, the motion vector of the sub-image block, that is, of the processing unit, may have sub-pixel accuracy, which causes the motion compensation process to involve sub-pixels and increases the bandwidth pressure of the Affine technology.
  • In this application, the motion vector of the sub-image block is processed to integer pixel accuracy, so the motion compensation process of the sub-image block does not involve sub-pixels, which can reduce, to a certain extent, the bandwidth pressure of the Affine prediction technology.
  • the bandwidth pressure problem can also be relieved to a certain extent, but this will reduce the image compression performance.
  • By processing the motion vector of the sub-image block, which is the processing unit, into integer pixel accuracy, motion compensation with integer pixel accuracy is ensured, so that on the one hand the problem of bandwidth pressure can be solved and on the other hand good image compression performance can be maintained.
  • the existing Affine technology can be improved according to the solution provided by the present application, that is, the motion vector of the Sub-CU in the Affine mode is processed to an integer pixel accuracy, so that the bandwidth pressure generated by the Affine technology can be reduced.
  • The solution provided in this application can also be applied to other similar technologies that may appear in the future,
  • for example, technologies in which the pixel accuracy of the motion vector includes both integer pixel accuracy and sub-pixel accuracy and in which the image processing unit is small, for example 4×4.
  • the method provided in the embodiment of the present application further includes: processing the CPMV of the image block to integer pixel accuracy.
  • This embodiment can ensure that the CPMV of the image block has an integer pixel accuracy.
  • Step 610 includes the following steps 611, 612, and 613.
  • In step 611, the motion vectors of the spatial and/or temporal neighboring blocks of the image block are acquired, and a motion information candidate list of the image block is constructed based on the motion vectors of these neighboring blocks.
  • In step 612, the motion vectors in the motion information candidate list are processed into integer pixel accuracy; for example, the aforementioned formula (3) or formula (4) can be used.
  • In step 613, the CPMV of the image block is obtained from the motion information candidate list.
  • the neighboring block refers to the neighboring block used to construct the motion information candidate list of the image block, for example, the neighboring block in the temporal and/or spatial domain. This application does not limit the manner of determining neighboring blocks.
  • Affine inter prediction modes can be divided into Affine merge mode and Affine inter mode.
  • the embodiment shown in FIG. 7 can be applied to the Affine inter mode and can also be applied to the Affine merge mode.
  • Optionally, the inter-frame prediction mode of the image block is the Affine merge mode.
  • In this case, a CPMV can be selected from the motion information candidate list directly as the CPMV of the image block. That is, step 613 includes: selecting a CPMV from the motion information candidate list of the image block as the CPMV of the image block.
  • Because the motion vectors in the candidate list have already been processed to integer pixel accuracy, selecting a CPMV from the list directly as the CPMV of the image block ensures that the CPMV of the image block has integer pixel accuracy.
  • The general process of inter prediction in Affine merge mode includes the following steps,
  • taking the case where the image block is a CU as an example.
  • Step 1-1: Obtain the motion vectors (MVs) of neighboring blocks from the spatial neighboring blocks and/or the temporal neighboring blocks.
  • Specifically, the MVs of neighboring blocks in Affine mode and the MVs of neighboring blocks in traditional mode are obtained, CPMVs are obtained by combining the MVs of these neighboring blocks, and the motion information candidate list of the CU is constructed from these CPMVs.
  • Step 1-2: Process the motion vectors in the motion information candidate list of the CU into integer pixel accuracy.
  • Step 1-3: Select a combination from the motion information candidate list (a combination may contain two or three CPMVs, corresponding to two or three control points) as the CPMVs of the CU.
  • In the Affine merge mode, the CPMVs selected from the motion information candidate list are used directly as the CPMVs of the current CU; no motion estimation is required, and the MVD concept of the Affine inter mode (described below) does not apply. That is, only the index of the CPMVs selected from the motion information candidate list is written into the code stream (one index per CU), and no MVD needs to be transmitted.
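  • Steps 1-2 and 1-3 can be sketched as follows. The sketch assumes that the candidate list is already built, that each candidate is a list of two or three (mvx, mvy) pairs in 1/16-pixel units, and that rounding uses the add-half-and-shift rule; the function names and the data layout are hypothetical.

```python
def round_to_integer_pel(value, shift=4):
    """Round one MV component in 1/(1 << shift)-pixel units to a whole pixel."""
    return ((value + (1 << (shift - 1))) >> shift) << shift


def round_candidate_list(candidates, shift=4):
    """Step 1-2: process every CPMV in every candidate to integer-pixel accuracy."""
    return [[(round_to_integer_pel(mvx, shift), round_to_integer_pel(mvy, shift))
             for (mvx, mvy) in combination]
            for combination in candidates]


def select_merge_cpmvs(rounded_candidates, merge_index):
    """Step 1-3: the CU takes the candidate addressed by the signalled index;
    no motion estimation is run and no MVD is transmitted."""
    return rounded_candidates[merge_index]
```

  • Because rounding happens before selection, whichever index is signalled yields integer-pixel CPMVs. The sub-CU MVs derived from them may still be fractional, so the sub-block rounding of step 620 is still needed.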
  • The inter prediction mode of a neighboring block can be the traditional inter prediction mode or the Affine mode. Therefore, the MV obtained from a neighboring block may have integer pixel accuracy or sub-pixel accuracy.
  • the embodiment shown in FIG. 7 can also be applied to the Affine inter mode.
  • the general flow of the Affine Inter mode will be described first.
  • The general process of Affine Inter mode includes the following steps,
  • taking the case where the image block is a CU as an example.
  • Step 2-1: Obtain the motion vectors of neighboring blocks from the spatial neighboring blocks and/or the temporal neighboring blocks.
  • Specifically, the motion vectors of neighboring blocks in Affine mode and the motion vectors of neighboring blocks in traditional mode are obtained;
  • CPMVs are obtained by combining the obtained motion vectors, and the motion information candidate list of the CU is constructed from these CPMVs.
  • Step 2-2: Select a combination from the motion information candidate list constructed in step 2-1 (a combination may contain two or three CPMVs, corresponding to two or three control points) as the motion vector prediction (MVP) of the current CU, that is, as the predicted CPMVs of the current CU.
  • Step 2-3: Perform motion estimation with the entire current CU as a unit, and obtain the CPMVs of the current CU.
  • Step 2-4: Calculate the difference between the CPMVs selected in step 2-2 and the CPMVs obtained by the motion estimation of step 2-3 to obtain the motion vector difference (MVD).
  • The index of the selected CPMVs and the MVD need to be written into the code stream.
  • The motion estimation process is performed in units of CUs (corresponding to the image block in the embodiments of this application), whereas the motion compensation process is performed in units of 4×4 sub-CUs (corresponding to the sub-image block in the embodiments of this application).
  • The encoder selects among different pixel accuracies for the motion vector of the CU. This process can be called the adaptive motion vector resolution (AMVR) decision.
  • The pixel accuracy decided by AMVR is essentially the pixel accuracy of the MVD, that is, the pixel accuracy of the CPMVs of the CU, not the pixel accuracy of the MVs of the sub-CUs.
  • The range of pixel accuracies for the AMVR decision includes but is not limited to: 1/16-pixel, 1/8-pixel, 1/4-pixel, 1/2-pixel, 1-pixel, 2-pixel, and 4-pixel accuracy.
  • The CU can have multiple CPMVs with different pixel accuracies.
  • The CU can have three different CPMVs with integer-pixel, 1/4-pixel, and 1/16-pixel accuracy, respectively.
  • The inter prediction mode of the image block is the Affine Inter mode.
  • Step 611 includes obtaining the motion information candidate list of the image block.
  • Step 612 includes processing the motion vectors in the motion information candidate list to integer-pixel accuracy.
  • Step 613 includes: selecting the predicted CPMV of the image block from the motion information candidate list of the image block, obtaining the MVD of the image block, and obtaining the CPMV of the image block according to the predicted CPMV and the MVD of the image block.
  • Step 610 may further include step 614: performing a motion vector accuracy decision of N pixels for the image block, where N is a positive integer.
  • The motion vector accuracy decision at integer-pixel accuracy may be referred to as an integer-pixel AMVR decision.
  • The pixel accuracy of the MVD of the image block can be guaranteed to be integer pixels, and the pixel accuracy of the CPMV of the image block can also be guaranteed to be integer pixels. In this way, it can be ensured that no sub-pixels are involved in the motion estimation process of the image block, thereby reducing the bandwidth pressure to a certain extent.
  • When Affine AMVR is used to make the motion vector accuracy decision, it does not evaluate all pixel accuracies but skips the 1/M-pixel (M>1) accuracies; that is, only the N-pixel accuracy decision is made.
  • The number of bits written into the code stream is reduced correspondingly because there are fewer pixel accuracy options, and it may even be unnecessary to write any bits indicating the motion vector accuracy index.
  • When the pixel accuracy options include three types (integer pixel, 1/4 pixel, and 1/16 pixel), up to 2 bits are required to indicate the selected accuracy. For example, "0" indicates 1/4 pixel, "10" indicates 1/16 pixel, and "11" indicates integer pixel.
  • When only integer-pixel accuracy is available, "0" can be used to represent integer pixel, so only 1 bit needs to be written into the code stream. Alternatively, integer-pixel accuracy can be fixed by agreement in the protocol, so the motion vector accuracy index need not be written into the code stream at all. This saves signaling overhead while also reducing bandwidth pressure.
  • The implementation of performing the N-pixel (N is a positive integer) motion vector accuracy decision on the image block can be combined with the embodiment shown in FIG. 8, or implemented independently of the embodiment shown in FIG. 8.
  • The inter prediction mode of the image block is the Affine Inter mode.
  • Step 610 includes: obtaining the CPMV of the image block, and performing the motion vector accuracy decision of N pixels on the image block, where N is a positive integer.
  • The CPMV of the image block can be guaranteed to have integer-pixel accuracy.
  • One implementation of processing the pixel accuracy of the CPMV of the image block to integer-pixel accuracy is: processing the motion vectors in the motion information candidate list to integer-pixel accuracy.
  • Another implementation of processing the pixel accuracy of the CPMV of the image block to integer-pixel accuracy is: processing the motion vectors in the motion information candidate list to integer-pixel accuracy, and performing the integer-pixel accuracy AMVR decision on the image block.
  • A further implementation of processing the pixel accuracy of the CPMV of the image block to integer-pixel accuracy is: performing the integer-pixel accuracy AMVR decision on the image block.
  • The method shown in formula (3) or formula (4) above can be used to process the motion vector of a neighboring block to integer-pixel accuracy. Other feasible algorithms or methods that convert sub-pixel accuracy to integer-pixel accuracy can also be used to process the motion vectors of neighboring blocks. This application does not limit this.
  • The CPMV of the image block is processed to integer-pixel accuracy.
  • The threshold can be determined according to actual needs.
  • The threshold is 16 pixels.
  • The CPMV of the image block is processed to integer-pixel accuracy.
  • The Affine Inter mode motion estimation is performed in units of image blocks. For example, when the height and width of the image block are both equal to or greater than 16 pixels, even a motion estimation process with sub-pixel accuracy does not cause large bandwidth pressure. In this case, the CPMV of the image block need not be processed to integer-pixel accuracy.
  • The motion estimation process with sub-pixel accuracy may cause large bandwidth pressure.
  • The CPMV of the image block can be processed to integer-pixel accuracy.
  • The prediction mode of the image block is the Affine Inter mode, and the height and/or width of the image block is less than 16 pixels.
  • The method according to the embodiment of the present application further includes: performing the integer-pixel accuracy AMVR decision on the image block.
  • This embodiment can ensure that the motion estimation process is performed with integer-pixel accuracy, thereby avoiding large bandwidth pressure.
  • When the motion vector accuracy index of an image block whose height and/or width is less than 16 pixels is written into the code stream, the number of bits written can be reduced because there are fewer pixel accuracy options.
  • The AMVR pixel accuracy can be selected from three options: integer, 1/4, and 1/16 pixel. For example, "0" represents 1/4 pixel, "10" represents 1/16 pixel, and "11" represents integer pixel. For CUs with a height and/or width less than 16 pixels, because there is only one AMVR pixel accuracy option, there is no need to write the AMVR pixel accuracy index into the code stream. For example, integer-pixel accuracy can be adopted by agreement.
  • The embodiments of the present application can be applied to different kinds of inter-frame prediction methods, for example, forward prediction, backward prediction, or bi-prediction.
  • The inter-frame prediction mode of the sub-image block mentioned in the embodiments of the present application may be any of the following: forward prediction, backward prediction, or bi-prediction.
  • The motion vector of the sub-image block obtained in the forward prediction process is processed to integer-pixel accuracy.
  • The motion vector of the sub-image block obtained in the backward prediction process is processed to integer-pixel accuracy.
  • The motion vector of the sub-image block obtained in the bi-prediction process is processed to integer-pixel accuracy.
  • The inter-frame prediction mode of the sub-image block is bi-prediction, but the method provided in the embodiments of the present application is used to process the motion vector of the sub-image block to integer-pixel accuracy for only one of the prediction processes in the bi-prediction.
  • The CPMV of the image block is the CPMV of the image block obtained by forward prediction in the bi-prediction process, or the CPMV of the image block obtained by backward prediction in the bi-prediction process.
  • The motion vector of the sub-image block obtained in one prediction process of the bi-prediction process is processed to integer-pixel accuracy.
  • This prediction process may be the forward prediction process in the bi-prediction or the backward prediction process in the bi-prediction.
  • Motion estimation with integer-pixel accuracy can be guaranteed, which helps reduce bandwidth pressure.
  • The solution provided by the present application can reduce the bandwidth pressure caused by the inter-frame prediction process while still ensuring a certain compression performance.
  • An embodiment of the present application provides an image processing apparatus 900, which includes the following units.
  • The first acquiring unit 910 is configured to acquire the control point motion vector (CPMV) of the image block.
  • The second acquiring unit 920 is configured to acquire the motion vector of a sub-image block in the image block according to the CPMV of the image block acquired by the first acquiring unit 910, the motion vector having integer-pixel accuracy.
  • The motion compensation process of the sub-image block does not involve sub-pixels, which can reduce, to a certain extent, the bandwidth pressure generated by the Affine prediction technology.
  • The second acquiring unit 920 is configured to: calculate a first motion vector of the sub-image block according to the CPMV of the image block, the first motion vector having sub-pixel accuracy; and process the first motion vector into a second motion vector with integer-pixel accuracy.
  • The second acquiring unit 920 is configured to obtain the second motion vector according to the first motion vector of the sub-image block, so that the end point of the second motion vector is the integer pixel closest to the end point of the first motion vector.
  • The second acquiring unit 920 is configured to process the first motion vector into a second motion vector with integer-pixel accuracy through formula (3) or formula (4).
  • The height and/or width of the sub-image block is 4 pixels.
  • The first acquiring unit 910 is configured to: obtain a motion information candidate list of the image block, and process the motion vectors in the motion information candidate list to integer-pixel accuracy; and obtain the CPMV of the image block according to the motion vectors in the motion information candidate list that have been processed to integer-pixel accuracy.
  • The apparatus 900 further includes a processing unit 930, configured to make a motion vector accuracy decision of N pixels for the image block, where N is a positive integer.
  • The height and/or width of the image block is less than 16 pixels.
  • The inter-frame prediction mode of the sub-image block is any one of the following: forward prediction, backward prediction, or bi-prediction.
  • The inter-frame prediction mode of the sub-image block is bi-prediction, wherein the CPMV of the image block is the CPMV of the image block obtained by forward prediction in the bi-prediction process, or the CPMV of the image block obtained by backward prediction in the bi-prediction process.
  • The image processing apparatus 900 of this embodiment may be an encoder, and the apparatus 900 may also include functional modules for implementing video encoding related processes.
  • The image processing apparatus 900 of this embodiment may be a decoder, and the apparatus 900 may further include functional modules for implementing video decoding related processes.
  • An embodiment of the present invention also provides an image processing apparatus 1000.
  • The apparatus 1000 includes a processor 1010 and a memory 1020.
  • The memory 1020 is used to store instructions.
  • The processor 1010 is used to execute the instructions stored in the memory 1020. Execution of the instructions stored in the memory 1020 causes the processor 1010 to perform the method of the above method embodiments.
  • The encoding device 1000 further includes a communication interface 1030 for transmitting signals with external devices.
  • The image processing apparatus 1000 in this embodiment is an encoder, and the communication interface 1030 is used to receive image or video data to be processed from an external device.
  • The communication interface 1030 is also used to send a coded stream to the decoding end.
  • The image processing apparatus 1000 in this embodiment is a decoder, and the communication interface 1030 is used to receive an encoded bitstream from an encoding end.
  • An embodiment of the present invention also provides a computer storage medium on which a computer program is stored.
  • When the computer program is executed by a computer, the computer executes the method in the above method embodiments.
  • An embodiment of the present invention also provides a computer program product containing instructions; when the instructions are executed by a computer, the computer executes the method of the above method embodiments.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented in software, they can be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions.
  • When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part.
  • The computer can be a general-purpose computer, a dedicated computer, a computer network, or another programmable device.
  • Computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • Computer instructions can be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, wireless, or microwave).
  • A computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. Available media may be magnetic media (for example, floppy disk, hard disk, tape), optical media (for example, digital video disc (DVD)), or semiconductor media (for example, solid state disk (SSD)), etc.
  • The disclosed system, device, and method may be implemented in other ways.
  • The device embodiments described above are merely illustrative.
  • The division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • The displayed or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • Each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
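
The signaling saving described in the list above for the motion vector accuracy index can be illustrated with a minimal sketch. The codewords "0", "10", and "11" are taken from the text; the function name `amvr_index_bits` and its parameters are hypothetical and chosen only for illustration, and the sketch does not reproduce any actual bitstream syntax.

```python
# Codewords quoted in the text for the three-option case.
FULL_CODEBOOK = {"1/4": "0", "1/16": "10", "1": "11"}


def amvr_index_bits(accuracy, integer_only=False, by_convention=True):
    """Return the bit string written for the motion vector accuracy index.

    accuracy:      "1/4", "1/16" or "1" (integer pixel).
    integer_only:  True when only the integer-pixel decision is made.
    by_convention: True when integer-pixel accuracy is agreed in advance,
                   so no index is written at all (hypothetical flag).
    """
    if not integer_only:
        return FULL_CODEBOOK[accuracy]
    if accuracy != "1":
        raise ValueError("only integer-pixel accuracy is allowed")
    # Either nothing is written, or a single "0" bit stands for integer pixel.
    return "" if by_convention else "0"
```

With three options the index costs one or two bits per block; with the integer-only restriction it costs one bit or nothing, which is the saving the text describes.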

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Provided are an image processing method and apparatus. The method comprises: obtaining a control point motion vector (CPMV) of an image block; and obtaining, according to the CPMV, motion vectors (MVs) of sub-image blocks in the image block, the MVs having integer-pixel accuracy. Because the MVs of the sub-image blocks, which serve as the image processing units, have integer-pixel accuracy, the motion compensation process of the sub-image blocks need not involve sub-pixels, which reduces, to a certain extent, the bandwidth pressure generated by the Affine prediction technology.

Description

Image processing method and device
Copyright statement
The content disclosed in this patent document contains copyrighted material. The copyright belongs to the copyright owner. The copyright owner does not object to anyone copying this patent document or this patent disclosure as it exists in the official records and files of the Patent and Trademark Office.
Technical field
This application relates to the field of image processing, and more specifically, to an image processing method and device.
Background
The general idea of inter-frame prediction in video coding is to exploit the temporal correlation between adjacent video frames: a previously encoded, reconstructed frame is used as a reference frame, and the current frame is predicted through motion estimation and motion compensation, thereby removing the temporal redundancy of the video. The general process of inter-frame prediction includes motion estimation (ME) and motion compensation (MC). For the current coding block of the current frame, the most similar block is searched for in the reference frame and used as the prediction block of the current block; the relative displacement between the current block and its similar block is the motion vector (MV). Motion estimation is the process of obtaining the motion vector by searching and comparing in the reference frame for the current coding block of the current frame. Motion compensation is the process of obtaining the predicted frame using the MV and the reference frame. The predicted frame obtained by motion compensation may differ from the original current frame, so the difference (residual) between the predicted frame and the current frame is transformed, quantized, and so on, and then transmitted to the decoder; in addition, the MV and the reference frame information also need to be transmitted to the decoder. The decoder can reconstruct the current frame from the MV, the reference frame, and the difference between the predicted frame and the current frame.
Because the motion of natural objects is continuous, the motion vector of an object between two adjacent frames is not necessarily an integer number of pixel units. Sub-pixel accuracy was proposed to improve the accuracy of motion vectors. For example, in the High Efficiency Video Coding (HEVC) standard, motion vectors with 1/4-pixel accuracy are used for motion estimation of the luminance component. However, digital video contains no samples at fractional pixel positions. In general, to achieve motion estimation with 1/K-pixel accuracy, the values at these fractional pixel positions must be approximated by interpolation; that is, K-fold interpolation is performed on the reference frame in the row and column directions, and the prediction block is searched for in the interpolated reference frame. Interpolating for the current block requires the pixels in the current block as well as the pixels in its adjacent area.
Usually, only the traditional motion model (for example, translational motion) is considered in the inter prediction process. In the real world, however, there are many other forms of motion, such as zooming, rotation, perspective motion, and other irregular motion. To account for these forms of motion, the affine motion compensation prediction technique (Affine for short) was introduced in VTM-3.0. In the Affine mode, the affine motion field of an image block can be derived from the motion vectors of two control points (four parameters) or three control points (six parameters).
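
As an illustrative sketch of how a motion field can be derived from two control points, the code below evaluates the four-parameter affine model in the form commonly used in VVC-style codecs at the centre of each 4×4 sub-block. The function name `affine_subblock_mvs`, the floating-point arithmetic, and the choice of the sub-block centre as the sampling position are assumptions made for illustration; the formulas of this application itself are given elsewhere in the description.

```python
def affine_subblock_mvs(cpmv0, cpmv1, width, height, sub=4):
    """Four-parameter affine motion field (illustrative sketch only).

    cpmv0: (x, y) motion vector of the top-left control point.
    cpmv1: (x, y) motion vector of the top-right control point.
    Returns a dict mapping each sub-block's top-left position to its MV.
    """
    a = (cpmv1[0] - cpmv0[0]) / width  # change of mv_x per pixel along x
    b = (cpmv1[1] - cpmv0[1]) / width  # change of mv_y per pixel along x
    field = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            # Sample the model at the sub-block centre (assumed convention).
            cx, cy = x0 + sub / 2, y0 + sub / 2
            mv_x = a * cx - b * cy + cpmv0[0]
            mv_y = b * cx + a * cy + cpmv0[1]
            field[(x0, y0)] = (mv_x, mv_y)
    return field
```

When both control points carry the same motion vector, the field degenerates to pure translation; otherwise the resulting sub-block MVs are generally fractional, which is where sub-pixel interpolation comes in.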
In the Affine mode, motion vectors with 1/4-pixel accuracy, 1/16-pixel accuracy, or other sub-pixel accuracy can be used for the motion estimation of the image processing unit. The image processing unit of the Affine technique is the sub-CU (also called a sub-block), whose size is 4×4 (unit: pixels), which causes the Affine technique to generate relatively large bandwidth pressure.
Summary of the invention
The present application provides an image processing method and device that can reduce, to a certain extent, the bandwidth pressure caused by the Affine prediction technology.
In a first aspect, an image processing method is provided. The method includes: obtaining the control point motion vector (CPMV) of an image block; and obtaining, according to the CPMV of the image block, the motion vector of a sub-image block in the image block, the motion vector having integer-pixel accuracy.
In a second aspect, an image processing device is provided. The device includes: a first acquisition unit, configured to acquire the control point motion vector (CPMV) of an image block; and a second acquisition unit, configured to acquire, according to the CPMV of the image block acquired by the first acquisition unit, the motion vector of a sub-image block in the image block, the motion vector having integer-pixel accuracy.
In a third aspect, an image processing device is provided. The encoding device includes a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of the instructions stored in the memory causes the processor to perform the method provided in the first aspect.
In a fourth aspect, a chip is provided. The chip includes a processing module and a communication interface; the processing module is configured to control the communication interface to communicate with the outside, and the processing module is also configured to implement the method provided in the first aspect.
In a fifth aspect, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a computer, it causes the computer to implement the method in the first aspect or any possible implementation of the first aspect.
In a sixth aspect, a computer program product containing instructions is provided. When the instructions are executed by a computer, they cause the computer to implement the method provided in the first aspect.
In the solution provided by this application, the motion vector of the sub-image block, which serves as the image processing unit, has integer-pixel accuracy, so the motion compensation process of the sub-image block does not involve sub-pixels. This can reduce, to a certain extent, the bandwidth pressure generated by the Affine prediction technology.
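
A minimal sketch of processing a sub-pixel motion vector to integer-pixel accuracy is shown below. It assumes that motion vector components are stored as integers in 1/16-pixel units and that exact half-pixel ties round toward positive infinity; both are assumptions for illustration, and the sketch is not the rounding formula of this application. The names `round_mv_to_integer_pel` and `needs_interpolation` are hypothetical.

```python
MV_FRAC_BITS = 4  # assumption: MV components are stored in 1/16-pixel units


def round_mv_to_integer_pel(mv, frac_bits=MV_FRAC_BITS):
    """Round each MV component to the nearest integer-pixel position.

    The result stays in 1/16-pixel units, so its low `frac_bits` bits are
    zero. Exact half-pixel ties round toward positive infinity (assumed).
    """
    half = 1 << (frac_bits - 1)
    return tuple(((c + half) >> frac_bits) << frac_bits for c in mv)


def needs_interpolation(mv, frac_bits=MV_FRAC_BITS):
    """Return True if the MV points at a fractional sample position."""
    mask = (1 << frac_bits) - 1
    return any(c & mask for c in mv)
```

After rounding, `needs_interpolation` is always false, so the prediction block can be copied directly from integer sample positions of the reference frame.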
Description of the drawings
Figure 1 is a schematic diagram of a video coding architecture.
Figure 2 is a schematic diagram of 1/4-pixel interpolation.
Figures 3(a) and 3(b) are schematic diagrams of the four-parameter Affine model and the six-parameter Affine model, respectively.
Figure 4 is a schematic diagram of the Affine motion vector field.
Figure 5 is a comparison diagram of the reference pixels required by the prior-art Affine mode and the HEVC mode.
Figure 6 is a schematic flowchart of an image processing method according to an embodiment of the present application.
Figure 7 is another schematic flowchart of an image processing method according to an embodiment of the present application.
Figure 8 is yet another schematic flowchart of an image processing method according to an embodiment of the present application.
Figure 9 is a schematic flowchart of an image processing apparatus according to an embodiment of the present application.
Figure 10 is another schematic flowchart of an image processing apparatus according to an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described below with reference to the drawings.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of this application. The terms used in the description of this application are only for the purpose of describing specific embodiments and are not intended to limit this application.
To facilitate understanding of the solutions according to the embodiments of the present application, several related concepts are described first.
1. Inter prediction
As shown in Figure 1, the video coding framework mainly includes intra-frame prediction, inter-frame prediction, transform, quantization, entropy coding, and loop filtering.
This application mainly improves the inter prediction part.
The general idea of inter-frame prediction is to exploit the temporal correlation between adjacent video frames: a reconstructed frame is used as the reference frame, and the current frame is predicted through motion estimation (ME) and motion compensation (MC), thereby removing the temporal redundancy of the video.
The current frame mentioned herein refers to the frame currently being encoded in the encoding scenario, and to the frame currently being decoded in the decoding scenario.
The reconstructed frame mentioned herein refers to a previously encoded frame in the encoding scenario, and to a previously decoded frame in the decoding scenario.
During encoding, a frame is not processed directly as a whole; the frame is usually divided into image blocks for processing.
As an example, the entire frame is first divided into coding tree units (CTUs), for example of size 64×64 or 128×128 (unit: pixels); each CTU can then be further divided into square or rectangular coding units (CUs). The CUs are processed during encoding.
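
The division of a frame into CTUs can be sketched as follows. The function name `ctu_grid` is hypothetical, and counting a partial CTU at the right or bottom frame boundary as a full grid cell is an assumption made for illustration.

```python
def ctu_grid(frame_width, frame_height, ctu_size=64):
    """Return (columns, rows) of the CTU grid covering a frame.

    A frame dimension that is not a multiple of `ctu_size` still needs a
    (partial) CTU at the boundary, hence the ceiling division.
    """
    cols = -(-frame_width // ctu_size)   # ceiling division on integers
    rows = -(-frame_height // ctu_size)
    return cols, rows
```

For a 1920×1080 frame this gives a 30×17 grid with 64×64 CTUs and a 15×9 grid with 128×128 CTUs.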
The sizes of the image blocks mentioned herein are all in units of pixels.
The general flow of inter prediction is as follows.
For the current image block in the current frame (hereinafter referred to as the current block), the most similar block is searched for in the reference frame and used as the prediction block of the current block. The relative displacement between the current block and the similar block is called the motion vector (MV). Motion estimation refers to the process of obtaining the motion vector by searching and comparing in the reference frame for the current block of the current frame. Motion compensation refers to the process of obtaining the prediction block using the reference block and the motion vector obtained by motion estimation.
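
The search-and-compare step can be illustrated with a minimal integer-pixel full search. The sum of absolute differences (SAD) as the matching cost, the exhaustive search window, and the function names are assumptions made for illustration; practical encoders use faster search strategies and other cost measures.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a candidate."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Motion estimation: return (best_mv, best_cost) over an integer window."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def motion_compensate(ref, bx, by, mv, size):
    """Motion compensation: fetch the prediction block that the MV points to."""
    dx, dy = mv
    return [row[bx + dx:bx + dx + size] for row in ref[by + dy:by + dy + size]]
```

Here `cur` and `ref` are frames stored as lists of rows, `(bx, by)` is the top-left corner of the current block, and the returned motion vector is expressed in whole pixels.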
The prediction block obtained by the inter prediction process may differ from the original current block. Therefore, the difference between the prediction block and the current block needs to be calculated; this difference is called the residual. After the residual is transformed, quantized, entropy coded, and so on, the coded bitstream is obtained.
At the encoding end, after image encoding is complete, that is, after the bitstream is obtained by entropy coding, the bitstream and the coding mode information, such as the inter prediction mode and motion vector information, are stored or sent to the decoding end.
At the decoding end, after the entropy-coded bitstream is obtained, it is first entropy decoded to obtain the corresponding residual; the prediction block is then obtained according to the decoded coding mode information such as the motion vector; finally, the value of each pixel in the current block is obtained from the residual and the prediction block, that is, the current block is reconstructed, and in this way the current frame is reconstructed.
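
The relationship between the residual and the reconstruction can be sketched as follows. Transform, quantization, and entropy coding are deliberately omitted, so this round trip is lossless, which is a simplification; the function names and the 8-bit clipping range are assumptions made for illustration.

```python
def residual_block(cur_block, pred_block):
    """Encoder side: residual = current block minus prediction block."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur_block, pred_block)]


def reconstruct_block(pred_block, resid_block, bit_depth=8):
    """Decoder side: reconstruction = prediction plus residual, clipped."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred_block, resid_block)]
```

In a real codec the residual that reaches the decoder has passed through quantization, so the reconstructed block only approximates the original block.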
As shown in Figure 1, the encoding process may also include steps such as inverse quantization and inverse transform. Inverse quantization is the reverse of the quantization process, and inverse transform is the reverse of the transform process.
Inter-frame prediction mainly includes forward prediction, backward prediction, bi-prediction, and so on. Forward prediction uses a previous reconstructed frame of the current frame (which may be called a historical frame) to predict the current frame. Backward prediction uses a frame after the current frame (which may be called a future frame) to predict the current frame. Bi-prediction may be bidirectional prediction, in which both a "historical frame" and a "future frame" are used to predict the current frame. Bi-prediction may also be prediction in two directions, for example, using two "historical frames" to predict the current frame, or using two "future frames" to predict the current frame.
2. Sub-pixel precision motion estimation
In real scenes, because natural objects move continuously, the motion vector of an object between two adjacent frames is not necessarily an integer number of pixels. The precision of motion estimation therefore needs to be raised to the sub-pixel level (also called 1/K pixel precision). For example, in the HEVC standard, motion estimation of the luma component uses motion vectors with 1/4 pixel precision.
However, digital video contains no sample values at 1/K pixel positions. To achieve motion estimation with 1/K pixel precision, the values at those positions are usually approximated by interpolation: the reference frame is interpolated K-fold in the row and column directions, and the search is performed in the interpolated image. Interpolating the current block requires the pixels in the current block and the pixels in its neighboring region.
As an example, the 1/4 pixel interpolation process is shown in FIG. 2. For an image block of size 8×8, 4×8, 4×4, or 8×4, the 3 pixels to the left of the block and the 4 pixels to its right are used to generate the pixel values of the interpolated points. As shown in FIG. 2, for an image block of size 4×4, a0,0 and d0,0 are 1/4 pixel points, b0,0 and h0,0 are half pixel points, and c0,0 and n0,0 are 3/4 pixel points. Suppose the current block is the 2×2 block spanned by A0,0 to A1,0 and A0,0 to A0,1. Computing all the interpolated points in this 2×2 block requires points outside the block: 3 on the left, 4 on the right, 3 above, and 4 below.
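The need for 3 samples on one side and 4 on the other follows from an 8-tap interpolation filter. The sketch below interpolates one fractional sample along a single row. It assumes the HEVC luma filter taps, which the text does not list, and the names `HEVC_LUMA_TAPS` and `interpolate_row` are hypothetical.

```python
# HEVC luma interpolation taps (each set sums to 64); assumed here for
# illustration, since the text itself does not give the coefficients.
HEVC_LUMA_TAPS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),    # 1/4-pel position
    2: (-1, 4, -11, 40, 40, -11, 4, -1),  # 1/2-pel position
    3: (0, 1, -5, 17, 58, -10, 4, -1),    # 3/4-pel position
}


def interpolate_row(samples, x, frac):
    """Return the sample at horizontal position x + frac/4 in a 1-D row.

    Uses samples[x-3] .. samples[x+4]: 3 integer samples to the left of x
    and 4 to its right, which is why a block needs extra reference pixels.
    """
    if frac == 0:
        return samples[x]
    if x - 3 < 0 or x + 4 >= len(samples):
        raise ValueError("not enough neighboring samples for the 8-tap filter")
    window = samples[x - 3 : x + 5]
    acc = sum(c * s for c, s in zip(HEVC_LUMA_TAPS[frac], window))
    return (acc + 32) >> 6  # normalize by 64 with rounding


row = [10 * i for i in range(16)]          # a linear ramp: 0, 10, 20, ...
half_pel = interpolate_row(row, 5, 2)      # between samples 50 and 60
```

A two-dimensional interpolation would apply the same filter along rows and then along columns, which is where the 3 rows above and 4 rows below come from.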
3. Affine motion compensated prediction (hereinafter referred to as Affine).
Affine is an inter prediction technique.
In the HEVC standard, inter prediction considers only the traditional motion model (for example, translational motion). In the real world, however, there are many other forms of motion, such as zooming, rotation, perspective motion, and other irregular motion. To account for these forms of motion, the Affine technique was introduced in VTM-3.0.
As shown in FIG. 3, the motion field of an Affine-mode block can be derived from the motion vectors of two control points (four parameters, FIG. 3(a)) or of three control points (six parameters, FIG. 3(b)).
Hereinafter, the MV of a control point (control point motion vector) is abbreviated as CPMV.
The processing unit of Affine is not the CU but the sub-blocks (sub-CUs) obtained by dividing the CU, and each sub-CU has a size of 4×4. In Affine mode, each sub-CU has one MV. Unlike an ordinary CU, an Affine-mode CU therefore has not just one MV but as many MVs as it has sub-CUs.
As an example, the MVs of the sub-CUs in a CU are derived from the CPMVs of the two or three control points shown in FIG. 3. For example, for the four-parameter Affine motion model, the MV of the sub-CU at position (x, y) is calculated by the following formula:
Figure PCTCN2019077894-appb-000001 (formula (1))
As another example, for the six-parameter Affine motion model, the MV of the sub-CU at position (x, y) is calculated by the following formula:
Figure PCTCN2019077894-appb-000002 (formula (2))
Here, (mv0x, mv0y) is the MV of the top-left control point, (mv1x, mv1y) is the MV of the top-right control point, and (mv2x, mv2y) is the MV of the bottom-left control point. In the formulas above, W is the width and H is the height of the CU containing the sub-CU.
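Formulas (1) and (2) survive only as image references in this text. For orientation, the four-parameter and six-parameter affine models as commonly written for VVC, expressed with the symbols defined above, are given here. This is an assumption about what the two images contain, not a transcription of them.

```latex
% Assumed form of formula (1): four-parameter affine model
\[
\begin{cases}
mv_x = \dfrac{mv_{1x} - mv_{0x}}{W}\,x - \dfrac{mv_{1y} - mv_{0y}}{W}\,y + mv_{0x} \\[2ex]
mv_y = \dfrac{mv_{1y} - mv_{0y}}{W}\,x + \dfrac{mv_{1x} - mv_{0x}}{W}\,y + mv_{0y}
\end{cases}
\]

% Assumed form of formula (2): six-parameter affine model
\[
\begin{cases}
mv_x = \dfrac{mv_{1x} - mv_{0x}}{W}\,x + \dfrac{mv_{2x} - mv_{0x}}{H}\,y + mv_{0x} \\[2ex]
mv_y = \dfrac{mv_{1y} - mv_{0y}}{W}\,x + \dfrac{mv_{2y} - mv_{0y}}{H}\,y + mv_{0y}
\end{cases}
\]
```

In both models, setting (x, y) = (0, 0) gives the top-left CPMV and (x, y) = (W, 0) gives the top-right CPMV; in the six-parameter model, (x, y) = (0, H) gives the bottom-left CPMV.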
After calculation with formula (1), the motion vectors in a CU are as illustrated in FIG. 4, where each square represents a 4×4 sub-CU. All sub-CU MVs computed with the formulas above are converted to a 1/16 pixel precision representation; that is, the highest precision of a sub-CU MV is 1/16 pixel.
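The derivation of one MV per 4×4 sub-CU can be sketched as follows. The sketch assumes the commonly used VVC form of the four-parameter and six-parameter models, evaluates each sub-CU at its top-left corner, and uses exact fractions instead of the fixed-point arithmetic of a real codec. All function names are hypothetical.

```python
from fractions import Fraction


def affine_mv_4param(mv0, mv1, width, x, y):
    """Four-parameter model: MV at (x, y) from the top-left and top-right CPMVs."""
    a = Fraction(mv1[0] - mv0[0], width)
    b = Fraction(mv1[1] - mv0[1], width)
    return (a * x - b * y + mv0[0], b * x + a * y + mv0[1])


def affine_mv_6param(mv0, mv1, mv2, width, height, x, y):
    """Six-parameter model: adds the bottom-left CPMV mv2."""
    mvx = (Fraction(mv1[0] - mv0[0], width) * x
           + Fraction(mv2[0] - mv0[0], height) * y + mv0[0])
    mvy = (Fraction(mv1[1] - mv0[1], width) * x
           + Fraction(mv2[1] - mv0[1], height) * y + mv0[1])
    return (mvx, mvy)


def sub_cu_mvs(mv0, mv1, width, height, sub=4):
    """One MV per sub x sub sub-CU, keyed by the sub-CU's top-left corner.

    Evaluating at the top-left corner is an assumption of this sketch.
    """
    return {
        (x, y): affine_mv_4param(mv0, mv1, width, x, y)
        for y in range(0, height, sub)
        for x in range(0, width, sub)
    }


# A 16x16 CU has 16 sub-CUs and therefore 16 MVs.
field = sub_cu_mvs((1, 2), (5, 6), 16, 16)
```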
After the MV of each sub-CU is calculated, the prediction block of each sub-CU is obtained through motion compensation. The sub-CU size is 4×4 for both the chroma and luma components; the motion vector of a 4×4 chroma block is the average of the motion vectors of its four corresponding 4×4 luma blocks.
When encoding in Affine mode, the CPMV information is written into the bitstream; the MV information of each sub-CU does not need to be written.
4. Adaptive Motion Vector Resolution (AMVR)
AMVR allows a CU to have motion vectors with integer pixel precision or sub-pixel precision. Integer pixel precision may be, for example, 1 pixel or 2 pixels. Sub-pixel precision may be, for example, 1/2 pixel, 1/4 pixel, 1/8 pixel, or 1/16 pixel.
For example, for each CU that uses Affine AMVR (in some cases a CU may not use Affine AMVR), the encoding end adaptively decides the corresponding MV precision and writes the decision into the bitstream sent to the decoding end.
The integer pixel precision or sub-pixel precision referred to in Affine AMVR is the pixel precision of the CPMVs, not the pixel precision of the sub-CU MVs.
With integer pixel CPMVs, motion estimation of the CU is an integer pixel process, but the sub-CU MVs obtained from formula (1) or formula (2) may still have 1/4 pixel precision or another sub-pixel precision.
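A small numeric case shows how integer-pixel CPMVs can still produce a fractional sub-CU MV. The sketch assumes MVs stored in 1/16 pixel units and the commonly used four-parameter model; the names `sub_cu_mv` and `is_integer_pel` are hypothetical.

```python
from fractions import Fraction

SHIFT = 4  # assumed storage precision: 1/16 pixel, so 16 units = 1 pixel


def sub_cu_mv(mv0, mv1, width, x, y):
    """Four-parameter affine MV at (x, y), in 1/16-pel units."""
    a = Fraction(mv1[0] - mv0[0], width)
    b = Fraction(mv1[1] - mv0[1], width)
    return (mv0[0] + a * x - b * y, mv0[1] + b * x + a * y)


def is_integer_pel(mv, shift=SHIFT):
    """True if both components are whole multiples of one pixel."""
    return all(c % (1 << shift) == 0 for c in mv)


cpmv0 = (0, 0)    # 0 pixels: integer-pel
cpmv1 = (16, 0)   # exactly 1 pixel to the right: integer-pel
mv = sub_cu_mv(cpmv0, cpmv1, 16, 4, 0)   # sub-CU at x = 4 in a 16-wide CU
```

Here the sub-CU MV is 4 units, that is, 4/16 = 1/4 pixel, even though both CPMVs are integer-pel.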
If the MV of a sub-CU has sub-pixel precision, motion compensation of the sub-CU involves sub-pixels, and because the sub-CU is only 4×4, this puts considerable bandwidth pressure on the Affine prediction process.
The applicant ran a simulation on VTM-4.0, the latest VVC reference software, using the official common test data as test sequences. The result is shown in FIG. 5.
In FIG. 5, the left box shows the worst case for HEVC (MV with 1/4 pixel precision), an 8×8 bi-predicted inter CU, which requires (8+7)×(8+7)×2 = 450 reference pixels. The right box shows the worst case for the VVC Affine mode (MVs with 1/16 and 1/4 pixel precision), 4×4 bi-predicted inter blocks, which require (4+7)×(4+7)×2×4 = 968 reference pixels.
As FIG. 5 shows, the existing Affine mode needs 115% more reference pixels than HEVC, which causes considerable bandwidth pressure.
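The two counts and the 115% figure can be reproduced with a few lines. The sketch assumes an 8-tap interpolation filter (hence the +7 per dimension) and reads the factor of 4 as the four 4×4 blocks that cover the same area as one 8×8 block; that reading, and the name `reference_samples`, are assumptions.

```python
def reference_samples(block_w, block_h, taps=8, num_ref_lists=2, num_blocks=1):
    """Reference pixels fetched for sub-pixel motion compensation of a block.

    A taps-long filter needs taps - 1 extra samples per dimension; bi-prediction
    fetches from two reference lists.
    """
    return (block_w + taps - 1) * (block_h + taps - 1) * num_ref_lists * num_blocks


hevc_worst = reference_samples(8, 8)                   # (8+7)*(8+7)*2
affine_worst = reference_samples(4, 4, num_blocks=4)   # (4+7)*(4+7)*2*4
increase_percent = (affine_worst / hevc_worst - 1) * 100
```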
To address this problem, this application proposes an image processing method and apparatus that can reduce, to a certain extent, the bandwidth pressure caused by the Affine technique.
This application applies to the field of digital video coding, specifically to the inter prediction part of a video codec. It can be applied to codecs that comply with the international video coding standards H.264/HEVC, the Chinese AVS2 standard, and the like, as well as to codecs that comply with next-generation video coding standards such as VVC or AVS3.
Because this application applies to the inter prediction part of a video codec, the image processing method according to the embodiments of this application may be performed by an encoding apparatus or by a decoding apparatus.
FIG. 6 is a schematic flowchart of an image processing method 600 provided by this application. The method 600 includes the following steps.
610. Obtain the control point motion vector (CPMV) of an image block.
The manner of obtaining the CPMV of the image block is described later.
620. Obtain, according to the CPMV of the image block, the motion vector of a sub-image block in the image block, where the motion vector has integer pixel precision.
In other words, the motion vector of the sub-image block in the image block is obtained based on the CPMV of the image block in such a way that the motion vector of the sub-image block has integer pixel precision.
The sub-image block referred to in this application is the processing unit of image processing or video processing. The width and/or height of the sub-image block may be less than 8 pixels. For example, the size of the sub-image block is 4×4 (pixels).
The sub-image block may be a block obtained by dividing the image block. If the image block and the sub-image block have the same size, the sub-image block can be regarded as the image block itself.
The sub-image block may be a square block, for example a block of size 4×4 or 8×8, or a rectangular block, for example a block of size 2×4 or 4×8.
The size of the image block referred to in this application may be 16×16, 16×8, 16×4, 8×16, 4×8, 8×8, 8×4, or another size.
Because the motion vector of the sub-image block, which is the processing unit, has integer pixel precision, motion compensation of the sub-image block does not involve sub-pixels, which reduces the bandwidth pressure of video inter prediction.
Obtaining the motion vector of the sub-image block according to the CPMV of the image block may include: calculating the motion vector of the sub-image block from the motion vectors of two or three control points of the image block, such that the resulting motion vector of the sub-image block has integer pixel precision.
As an example, the motion vector of the sub-image block may be calculated according to formula (1) or formula (2) described above.
Optionally, in some embodiments, if the motion vector of the sub-image block calculated directly from the CPMV of the image block already has integer pixel precision, that motion vector is the motion vector of the sub-image block to be obtained in this application.
For example, in one possible implementation, the motion vector of the sub-image block is calculated from the CPMV of the image block by an algorithm that guarantees the calculated motion vector has integer pixel precision.
Optionally, in some embodiments, if the motion vector of the sub-image block calculated directly from the CPMV of the image block has sub-pixel precision, for example 1/4 pixel, 1/8 pixel, or 1/16 pixel precision, the calculated motion vector needs further processing to convert it from sub-pixel precision to integer pixel precision.
Optionally, step 620 includes the following steps 1) and 2).
1) Calculate a first motion vector of the sub-image block according to the CPMV of the image block, where the first motion vector has sub-pixel precision.
For example, the first motion vector of the sub-image block is calculated from the CPMV according to formula (1) or formula (2) described above, and the calculated first motion vector has sub-pixel precision.
2) Process the first motion vector into a second motion vector with integer pixel precision.
In one possible implementation of step 2), the second motion vector is obtained from the first motion vector of the sub-image block such that the end point of the second motion vector is the integer pixel point closest to the end point of the first motion vector.
For example, the closest integer pixel point may be the integer pixel point above, below, to the left of, or to the right of the end point of the first motion vector.
As an example, the second motion vector (MV2x, MV2y) of the sub-image block is calculated from the first motion vector (MV1x, MV1y) of the sub-image block by the following formula.
If MV1x >= 0, MV2x = ((MV1x + (1 << (shift-1))) >> shift) << shift;
If MV1x < 0, MV2x = -((-MV1x + (1 << (shift-1))) >> shift) << shift;
If MV1y >= 0, MV2y = ((MV1y + (1 << (shift-1))) >> shift) << shift;
If MV1y < 0, MV2y = -((-MV1y + (1 << (shift-1))) >> shift) << shift.
                                                         Formula (3)
Here, the value of shift depends on the precision at which motion vectors are stored in the coding software platform. For example, in the current VTM-4.0 reference software, motion vectors are stored at 1/16 pixel precision, so shift can be set to 4.
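Formula (3) rounds each MV component to the nearest integer pixel position, with the sign handled separately so that positive and negative values round symmetrically. A direct transcription, with the hypothetical function name `round_to_integer_pel` and shift = 4 assumed as the default, could look like this:

```python
def round_to_integer_pel(mv, shift=4):
    """Round one MV component to the nearest integer pixel, per formula (3).

    With shift = 4 the component is in 1/16-pel units, so the result is a
    multiple of 16. Half-way values round away from zero.
    """
    offset = 1 << (shift - 1)
    if mv >= 0:
        return ((mv + offset) >> shift) << shift
    return -(((-mv + offset) >> shift) << shift)


def round_mv(mv, shift=4):
    """Apply the rounding to both components of an (x, y) motion vector."""
    return (round_to_integer_pel(mv[0], shift), round_to_integer_pel(mv[1], shift))


second_mv = round_mv((23, -8))   # 23/16 pel rounds to 1 pel, -8/16 pel to -1 pel
```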
As another example, the second motion vector (MV2x, MV2y) of the sub-image block is obtained from the first motion vector (MV1x, MV1y) of the sub-image block by the following formula.
If MV1x >= 0, MV2x = (MV1x >> shift) << shift;
If MV1x < 0, MV2x = -(((-MV1x) >> shift) << shift);
If MV1y >= 0, MV2y = (MV1y >> shift) << shift;
If MV1y < 0, MV2y = -(((-MV1y) >> shift) << shift).
                                                         Formula (4)
Here, shift has the same meaning as described above.
In formula (3) and formula (4), "<<" denotes a left shift and ">>" denotes a right shift.
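Formula (4) drops the fractional part instead of rounding, so each component is truncated toward zero. A minimal sketch follows; the function name `truncate_to_integer_pel` is hypothetical and shift = 4 is assumed as the default.

```python
def truncate_to_integer_pel(mv, shift=4):
    """Truncate one MV component toward zero to an integer pixel, per formula (4)."""
    if mv >= 0:
        return (mv >> shift) << shift
    return -(((-mv) >> shift) << shift)


positive = truncate_to_integer_pel(31)    # 31/16 pel becomes 1 pel
negative = truncate_to_integer_pel(-31)   # -31/16 pel becomes -1 pel
```

The separate branch for negative values matters: an arithmetic right shift applied directly to a negative number rounds toward negative infinity (in Python, `(-31 >> 4) << 4` is -32), whereas formula (4) yields -16.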
It should be noted that this application does not limit how the pixel precision of a motion vector is converted from the sub-pixel level to the integer pixel level. For example, the second motion vector with integer pixel precision may also be obtained from the first motion vector by any other feasible sub-pixel to integer-pixel conversion algorithm.
When the smallest CU processed by the current Affine technique (corresponding to the image block in the embodiments of this application) is 16×16, motion estimation does not cause bandwidth pressure, so the motion estimation process does not need to be modified. In this case, the CPMV of the image block may have integer pixel precision or sub-pixel precision. If the CPMV of the image block has sub-pixel precision, the motion vector of the sub-image block calculated from it also has sub-pixel precision. If the CPMV of the image block has integer pixel precision, the motion vector of the sub-image block calculated from it may still have sub-pixel precision; for example, a motion vector calculated according to formula (1) or formula (2) may have sub-pixel precision.
It follows that in the existing Affine technique, the motion vector of the sub-image block, that is, of the processing unit, may have sub-pixel precision. Motion compensation then involves sub-pixels, which increases the bandwidth pressure of the Affine technique.
The solution provided by this application gives the sub-image block, as the image processing unit, a motion vector with integer pixel precision, so that motion compensation of the sub-image block does not involve sub-pixels. This can reduce, to a certain extent, the bandwidth pressure caused by Affine prediction.
Enlarging the sub-image block used as the processing unit can also relieve bandwidth pressure to some extent, but at the cost of image compression performance. By processing the motion vector of the sub-image block to integer pixel precision, this application ensures integer-pixel motion compensation, which addresses the bandwidth pressure while preserving good image compression performance.
The existing Affine technique can be improved according to the solution provided by this application, namely by processing the motion vectors of the sub-CUs in Affine mode to integer pixel precision, thereby reducing the bandwidth pressure caused by the Affine technique.
Besides the Affine technique, the solution provided by this application can also be applied to similar techniques that may appear in the future, for example techniques in which the motion vector precision includes both integer pixel precision and sub-pixel precision and the image processing unit is small, for example 4×4.
The solution provided by this application can be used to improve the quality of compressed video and the hardware friendliness of codecs, and is of great significance for the compression of video for broadcast television, video conferencing, online video, and the like.
Optionally, in some embodiments, the method provided in the embodiments of this application further includes: processing the CPMV of the image block to integer pixel precision.
This embodiment ensures that the CPMV of the image block has integer pixel precision.
Implementations of processing the CPMV of the image block to integer pixel precision are described below.
Optionally, as shown in FIG. 7, in some embodiments, step 610 includes the following steps 611, 612, and 613.
611. Obtain a motion information candidate list of the image block.
For example, the motion vectors of spatially and/or temporally neighboring blocks of the image block are obtained, and the motion information candidate list of the image block is constructed based on the motion vectors of these neighboring blocks.
612. Process the motion vectors in the motion information candidate list to integer pixel precision.
For example, formula (3) or formula (4) described above may be used to process the motion vectors in the motion information candidate list to integer pixel precision.
A neighboring block here is a block used to construct the motion information candidate list of the image block, for example a temporally and/or spatially neighboring block. This application does not limit how neighboring blocks are determined.
613. Obtain the CPMV of the image block according to the motion vectors in the motion information candidate list that have been processed to integer pixel precision.
Affine inter prediction modes can be divided into the Affine merge mode and the Affine inter mode.
The embodiment shown in FIG. 7 can be applied to the Affine inter mode and also to the Affine merge mode.
Optionally, in the embodiment shown in FIG. 7, the inter prediction mode of the image block is the Affine merge mode.
In Affine merge mode, a CPMV can be selected from the motion information candidate list and used directly as the CPMV of the image block. That is, step 613 includes: selecting a CPMV from the motion information candidate list of the image block as the CPMV of the image block.
Because the motion vectors of the neighboring blocks used to construct the motion information candidate list have been processed to integer pixel precision, selecting a CPMV from the list directly as the CPMV of the image block ensures that the CPMV of the image block has integer pixel precision.
As an example, the general flow of inter prediction in Affine merge mode includes the following steps. In this example, the image block is a CU.
Step 1-1: Obtain the motion vectors (MVs) of neighboring blocks from spatially neighboring blocks and/or temporally neighboring blocks. This yields the MVs of neighboring blocks coded in Affine mode as well as the MVs of neighboring blocks coded in traditional mode. CPMVs are obtained by combining the MVs of these neighboring blocks, and the motion information candidate list of the CU is constructed from these CPMVs.
Step 1-2: Process the motion vectors in the motion information candidate list of the CU to integer pixel precision.
Step 1-3: Select one combination from the motion information candidate list as the CPMVs of the CU. The combination may contain two or three CPMVs, corresponding to two or three control points.
In Affine merge mode, the CPMVs selected from the motion information candidate list are used as the CPMVs of the current CU; no motion estimation is needed, and there is no MVD as in Affine inter mode (described below). That is, in Affine merge mode, only the index of the CPMVs selected from the motion information candidate list needs to be written into the bitstream (one index per CU), and no MVD needs to be transmitted.
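Steps 1-2 and 1-3 can be sketched as follows. The candidate list is modeled as a list of CPMV combinations, each a tuple of two or three (x, y) vectors in 1/16-pel units; this data layout, the rounding by formula (3), and the function names are assumptions made for illustration. Building the list from neighboring blocks (step 1-1) is not shown.

```python
def round_component(v, shift=4):
    """Round one MV component to the nearest integer pixel (formula (3))."""
    offset = 1 << (shift - 1)
    if v >= 0:
        return ((v + offset) >> shift) << shift
    return -(((-v + offset) >> shift) << shift)


def round_candidate_list(candidates, shift=4):
    """Step 1-2: process every CPMV in every candidate combination."""
    return [
        tuple((round_component(mx, shift), round_component(my, shift))
              for mx, my in combo)
        for combo in candidates
    ]


def merge_cpmvs(candidates, merge_index, shift=4):
    """Step 1-3: the combination at merge_index becomes the CU's CPMVs.

    Only merge_index would need to be signaled; there is no MVD.
    """
    return round_candidate_list(candidates, shift)[merge_index]


# Hypothetical candidate list: one 2-control-point and one 3-control-point combo.
candidates = [
    ((5, -9), (20, 3)),
    ((16, 0), (33, -40), (7, 8)),
]
cpmvs = merge_cpmvs(candidates, 1)
```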
Regarding the neighboring blocks mentioned in step 1-1, the inter prediction mode of a neighboring block may be a traditional inter prediction mode or the Affine mode, so an MV obtained from a neighboring block may have integer pixel precision or sub-pixel precision.
By processing the motion vectors of the neighboring blocks of the current image block to integer pixel precision, this embodiment ensures that the CPMV of the image block has integer pixel precision.
As noted above, the embodiment shown in FIG. 7 can also be applied to the Affine inter mode. For a better understanding of the embodiments of this application, the general flow of the Affine inter mode is described first, before the application of the embodiment shown in FIG. 7 to that mode.
As an example, the general flow of the Affine inter mode includes the following steps. In this example, the image block is a CU.
Step 2-1: Obtain the motion vectors of neighboring blocks from spatially neighboring blocks and/or temporally neighboring blocks. This yields the motion vectors of neighboring blocks coded in Affine mode as well as those of neighboring blocks coded in traditional mode. CPMVs are obtained by combining the obtained motion vectors, and the motion information candidate list of the CU is constructed from these CPMVs.
Step 2-2: Select one combination from the motion information candidate list constructed in step 2-1 as the motion vector prediction (MVP) of the current CU, that is, the predicted CPMVs of the current CU. The combination may contain two or three CPMVs, corresponding to two or three control points.
Step 2-3: Perform motion estimation with the entire current CU as the unit to obtain the CPMVs of the current CU.
Step 2-4: Calculate the difference between the CPMVs selected in step 2-2 and the CPMVs obtained by motion estimation in step 2-3 to obtain the motion vector difference (MVD).
In Affine inter mode, the index of the selected CPMVs and the MVD need to be written into the bitstream.
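The relationship between the predicted CPMVs, the estimated CPMVs, and the MVD in steps 2-2 to 2-4 can be sketched as follows. The sign convention (MVD = estimated minus predicted) and the function names are assumptions; motion estimation itself is not shown.

```python
def compute_mvd(estimated_cpmvs, predicted_cpmvs):
    """Step 2-4 at the encoder: one difference vector per control point.

    The convention MVD = estimated - predicted is assumed for this sketch.
    """
    return [
        (ex - px, ey - py)
        for (ex, ey), (px, py) in zip(estimated_cpmvs, predicted_cpmvs)
    ]


def reconstruct_cpmvs(predicted_cpmvs, mvd):
    """At the decoder: predicted CPMVs (selected by index) plus the MVD."""
    return [
        (px + dx, py + dy)
        for (px, py), (dx, dy) in zip(predicted_cpmvs, mvd)
    ]


# Hypothetical values in 1/16-pel units, all multiples of 16 (integer-pel).
predicted = [(16, 0), (32, -16)]
estimated = [(48, 16), (32, -48)]
mvd = compute_mvd(estimated, predicted)
recovered = reconstruct_cpmvs(predicted, mvd)
```

If the predicted CPMVs and the MVD both have integer pixel precision, their sum does as well, so the reconstructed CPMVs stay at integer pixel precision.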
在Affine Inter模式中,运动估计过程以CU(对应于本申请实施例中的图像块)为单位进行,运动补偿过程则以4×4的sub-CU(对应于本申请实施例中的子图像块)为单位进行。In the Affine Inter mode, the motion estimation process is performed in units of CU (corresponding to the image block in the embodiment of this application), and the motion compensation process is performed in a 4×4 sub-CU (corresponding to the sub-image in the embodiment of this application). Block) as a unit.
关于步骤2-1中提及的邻近块,该临近块的帧间预测模式可以是传统的帧间预测模式也可能是affine模式,因此从临近块获取到的MV可能是整像素精度也可能是亚像素精度。Regarding the neighboring block mentioned in step 2-1, the inter prediction mode of the neighboring block can be the traditional inter prediction mode or the affine mode. Therefore, the MV obtained from the neighboring block may be of integer pixel accuracy or Sub-pixel accuracy.
在Affine Inter模式中,编码端会进行CU的运动矢量的不同像素精度的选择,这个过程可以称为自适应运动矢量精度(Adaptive Motion Vector Resolution,AMVR)决策。In the Affine Inter mode, the encoder will select different pixel precisions of the motion vector of the CU. This process can be called adaptive motion vector resolution (AMVR) decision-making.
AMVR决策的像素精度本质上是MVD的像素精度,也就是CU的CPMVs的像素精度,而不是sub-CU的MV的像素精度。The pixel accuracy of AMVR decision is essentially the pixel accuracy of MVD, that is, the pixel accuracy of CPMVs of CU, not the pixel accuracy of MV of sub-CU.
在现有的Affine Inter模式中,AMVR决策的像素精度的范围包括但不限于:1/16像素精度、1/8像素精度、1/4像素精度、1/2像素精度、1像素精度、2像素精度、4像素精度等。换句话说,CU可以有多种不同像素精度的CPMVs。例如,CU可以有整像素、1/4像素精度和1/16像素精度三种不同的CPMVs。In the existing Affine Inter mode, the range of pixel accuracy for AMVR decisions includes but is not limited to: 1/16 pixel accuracy, 1/8 pixel accuracy, 1/4 pixel accuracy, 1/2 pixel accuracy, 1 pixel accuracy, 2 Pixel accuracy, 4-pixel accuracy, etc. In other words, the CU can have multiple CPMVs with different pixel accuracy. For example, the CU can have three different CPMVs of integer pixels, 1/4 pixel accuracy, and 1/16 pixel accuracy.
Optionally, in the embodiment shown in FIG. 7, the inter prediction mode of the image block is the Affine Inter mode. Step 611 includes obtaining the motion information candidate list of the image block. Step 612 includes processing the motion vectors in the motion information candidate list to integer-pixel precision. Step 613 includes selecting the predicted CPMV of the image block from the motion information candidate list, obtaining the MVD of the image block, and obtaining the CPMV of the image block from the predicted CPMV and the MVD.
As shown in FIG. 8, in this embodiment, step 610 may further include step 614: performing an N-pixel motion vector precision decision on the image block, where N is a positive integer.
That is, an integer-pixel motion vector precision decision (AMVR decision) is performed on the image block.
It can be understood that performing the AMVR decision at integer-pixel precision ensures that both the MVD and the CPMV of the image block have integer-pixel precision. The motion estimation process for the image block then involves no sub-pixels, which reduces bandwidth pressure to some extent.
In this embodiment, when Affine AMVR is used for the motion vector precision decision, not all pixel precisions are evaluated. The 1/M-pixel precisions (M > 1) are skipped; that is, only N-pixel precisions are evaluated.
It should be understood that, in this embodiment, because fewer pixel precision options remain, fewer bits are written to the bitstream for the motion vector precision index, and the index may not need to be written at all. For example, suppose the original options are integer-pixel, 1/4-pixel, and 1/16-pixel precision. At least 2 bits are then needed to distinguish the three, for example "0" for 1/4-pixel, "10" for 1/16-pixel, and "11" for integer-pixel. In this embodiment, "0" can represent integer-pixel precision, so only 1 bit needs to be written to the bitstream. Alternatively, integer-pixel precision can be agreed by protocol, so the motion vector precision index need not be written to the bitstream at all. This saves signaling overhead and also reduces bandwidth pressure.
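The signaling saving described above can be illustrated with a minimal sketch. The codewords are the ones given as examples in the text; the table names and the helper functions `encode_precision` and `decode_precision` are hypothetical and are not part of the application or of any codec API.

```python
# Hypothetical binarization tables; the codewords follow the example in the text.
FULL_SET = {"1/4": "0", "1/16": "10", "1": "11"}  # three precision options
INTEGER_ONLY = {"1": "0"}                          # single option, 1 bit

def encode_precision(precision, table):
    """Return the bit string written to the bitstream for a precision choice."""
    return table[precision]

def decode_precision(bits, table):
    """Read one prefix-free codeword from the front of `bits`.

    Returns (precision, number_of_bits_consumed).
    """
    for precision, code in table.items():
        if bits.startswith(code):
            return precision, len(code)
    raise ValueError("no codeword matches the start of the bit string")
```

With the full table, integer-pixel precision costs 2 bits; with the reduced table it costs 1 bit, and if the precision is fixed by protocol no table is needed at all.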
It should be noted that, when the inter prediction mode of the image block is the Affine Inter mode, the embodiment in which an N-pixel (N being a positive integer) motion vector precision decision is performed on the image block may be implemented in combination with the embodiment shown in FIG. 8, or decoupled from it and implemented independently.
Optionally, as shown in FIG. 8, in some embodiments the inter prediction mode of the image block is the Affine Inter mode, and step 610 includes obtaining the CPMV of the image block and performing an N-pixel motion vector precision decision on the image block, where N is a positive integer.
It should be understood that performing the N-pixel motion vector precision decision ensures that the CPMV of the image block has integer-pixel precision, regardless of whether the motion vectors of its neighboring blocks are processed to integer-pixel precision.
It should also be understood that, in the Affine Inter mode, processing the CPMV of the image block to integer-pixel precision ensures integer-pixel motion estimation, which helps reduce bandwidth pressure.
As described above, in the Affine Merge mode, the CPMV of the image block is brought to integer-pixel precision by processing the motion vectors in the motion information candidate list to integer-pixel precision.
In the Affine Inter mode, the CPMV of the image block is brought to integer-pixel precision by processing the motion vectors in the motion information candidate list to integer-pixel precision and performing the AMVR decision at integer-pixel precision for the image block.
Alternatively, in the Affine Inter mode, the CPMV of the image block is brought to integer-pixel precision by performing the AMVR decision at integer-pixel precision for the image block.
In the above embodiments that involve processing the motion vectors of neighboring blocks to integer-pixel precision, the processing may follow formula (3) or formula (4) above. Any other feasible algorithm or method for converting sub-pixel precision to integer-pixel precision may also be used. This application does not limit this.
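One possible sub-pixel to integer-pixel conversion can be sketched as follows. This is not a reproduction of formula (3) or formula (4); it is an illustration under the assumption that motion vector components are stored as integers in 1/16-pixel units, and the names `SHIFT`, `round_component`, and `round_mv_to_integer_pel` are hypothetical.

```python
SHIFT = 4  # assumption: MVs are stored in 1/16-pixel units, so 16 units = 1 pixel

def round_component(v, shift=SHIFT):
    """Round one MV component to the nearest whole pixel (ties away from zero).

    The result stays in the same 1/16-pixel units, so it is a multiple of 16.
    """
    offset = 1 << (shift - 1)
    if v >= 0:
        return ((v + offset) >> shift) << shift
    return -(((-v + offset) >> shift) << shift)

def round_mv_to_integer_pel(mv, shift=SHIFT):
    """Apply the rounding to both components of an (x, y) motion vector."""
    return tuple(round_component(c, shift) for c in mv)
```

For instance, a vector of (23, -9) in 1/16-pixel units, which is about (1.44, -0.56) pixels, becomes (16, -16), that is (1, -1) pixels.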
Optionally, in some embodiments, the CPMV of the image block is processed to integer-pixel precision when the size of the image block is smaller than a threshold.
The threshold can be determined according to actual requirements. For example, the threshold is 16 pixels.
For example, when the height and/or width of the image block is less than 16 pixels, the CPMV of the image block is processed to integer-pixel precision.
As described above for the Affine Inter mode, motion estimation in this mode is performed in units of image blocks. For example, when the height and width of the image block are both greater than or equal to 16 pixels, even sub-pixel motion estimation does not cause significant bandwidth pressure. In this case, the CPMV of the image block need not be processed to integer-pixel precision.
However, if the height and/or width of the image block is less than 16 pixels, for example when the image block is 4×8, 8×4, 4×16, or 16×4, sub-pixel motion estimation may cause significant bandwidth pressure. In this case, the CPMV of the image block can be processed to integer-pixel precision.
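The size condition above can be sketched as a simple predicate followed by conditional rounding. The threshold of 16 pixels is the example value from the text; the function names, the 1/16-pixel storage unit, and the round-half-up rule are assumptions made only for this illustration.

```python
THRESHOLD = 16  # example threshold from the text, in pixels

def needs_integer_cpmv(width, height, threshold=THRESHOLD):
    """True when the height and/or width of the block is below the threshold."""
    return width < threshold or height < threshold

def to_integer_pel(v):
    """Round a component in 1/16-pixel units to a whole pixel (half rounds up)."""
    return ((v + 8) // 16) * 16

def process_cpmvs(cpmvs, width, height):
    """Round the CPMVs of small blocks; leave those of larger blocks unchanged."""
    if not needs_integer_cpmv(width, height):
        return list(cpmvs)
    return [(to_integer_pel(x), to_integer_pel(y)) for x, y in cpmvs]
```

A 16×16 block keeps its sub-pixel CPMVs, while the 4×8, 8×4, 4×16, and 16×4 blocks named in the text all have theirs rounded.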
Optionally, in some embodiments, the prediction mode of the image block is the Affine Inter mode and the height and/or width of the image block is less than 16 pixels. The method according to the embodiments of this application then further includes performing the AMVR decision at integer-pixel precision for the image block.
This embodiment ensures that motion estimation is performed at integer-pixel precision, which avoids significant bandwidth pressure.
In addition, when the motion vector precision index is written to the bitstream for an image block whose height and/or width is less than 16 pixels, fewer bits are needed because fewer pixel precision options remain.
For example, for a CU whose height and width are both greater than or equal to 16 pixels, the AMVR pixel precision is selected from integer-pixel, 1/4-pixel, and 1/16-pixel precision, with, for example, "0" representing 1/4-pixel, "10" representing 1/16-pixel, and "11" representing integer-pixel precision. For a CU whose height and/or width is less than 16 pixels, only one AMVR pixel precision is available, so the AMVR pixel precision index does not need to be written to the bitstream. For example, integer-pixel precision can be agreed by protocol.
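A decoder-side view of this size-dependent signaling might look like the sketch below. The function `parse_amvr_precision` and its string-based bit representation are hypothetical; only the codewords and the 16-pixel condition come from the example in the text.

```python
def parse_amvr_precision(bits, width, height):
    """Return (precision, bits_consumed) for a CU of the given size.

    Small CUs (height and/or width below 16) read nothing from the bitstream:
    integer-pixel precision is assumed to be fixed by convention.
    """
    if width < 16 or height < 16:
        return "1", 0
    if bits[:1] == "0":
        return "1/4", 1
    if bits[:2] == "10":
        return "1/16", 2
    if bits[:2] == "11":
        return "1", 2
    raise ValueError("truncated or invalid precision index")
```

The small-CU branch consumes zero bits, which is where the saving in signaling overhead comes from.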
The embodiments of this application can be applied to different inter prediction modes, for example forward prediction, backward prediction, or bi-prediction. In other words, the inter prediction mode of the sub-image block mentioned in the embodiments of this application may be any one of forward prediction, backward prediction, and bi-prediction.
For example, if the inter prediction mode of the sub-image block is forward prediction, the motion vector of the sub-image block obtained in the forward prediction process is processed to integer-pixel precision.
As another example, if the inter prediction mode of the sub-image block is backward prediction, the motion vector of the sub-image block obtained in the backward prediction process is processed to integer-pixel precision.
As another example, if the inter prediction mode of the sub-image block is bi-prediction, the motion vectors of the sub-image block obtained in the bi-prediction process are processed to integer-pixel precision.
Optionally, the inter prediction mode of the sub-image block is bi-prediction, but the method provided in the embodiments of this application is applied to only one of the two prediction processes, so that the motion vector of the sub-image block from that process is processed to integer-pixel precision.
For example, the CPMV of the image block is the CPMV obtained by forward prediction in the bi-prediction process, or the CPMV obtained by backward prediction in the bi-prediction process.
In other words, when the inter prediction mode of the sub-image block is bi-prediction, the motion vector of the sub-image block obtained from one prediction process of the bi-prediction is processed to integer-pixel precision. That prediction process may be either the forward prediction or the backward prediction.
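The option of rounding only one direction of a bi-predicted sub-image block can be sketched as follows. The function names, the flags `round_forward` and `round_backward`, the 1/16-pixel storage unit, and the round-half-up rule are all assumptions for illustration.

```python
def to_integer_pel(v):
    """Round a component in 1/16-pixel units to a whole pixel (half rounds up)."""
    return ((v + 8) // 16) * 16

def process_bi_prediction(mv_forward, mv_backward,
                          round_forward=True, round_backward=True):
    """Round the forward and/or backward MV of a bi-predicted sub-image block."""
    fwd = tuple(to_integer_pel(c) for c in mv_forward) if round_forward else tuple(mv_forward)
    bwd = tuple(to_integer_pel(c) for c in mv_backward) if round_backward else tuple(mv_backward)
    return fwd, bwd
```

Passing `round_backward=False` corresponds to the case where only the forward prediction process is restricted to integer-pixel precision.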
As described above, the solution provided by this application makes the motion vector of the sub-image block, which serves as the image processing unit, have integer-pixel precision. The motion compensation process of the sub-image block then involves no sub-pixels, which reduces, to some extent, the bandwidth pressure caused by Affine prediction.
Further, processing the CPMV of the image block to integer-pixel precision ensures integer-pixel motion estimation in the Affine Inter mode, which helps reduce bandwidth pressure.
Therefore, the solution provided by this application can both reduce the bandwidth pressure caused by the inter prediction process and maintain a certain level of compression performance.
The method embodiments of this application are described above, and the apparatus embodiments are described below. It should be understood that the descriptions of the apparatus embodiments correspond to those of the method embodiments. For content not described in detail, refer to the method embodiments above; for brevity, it is not repeated here.
As shown in FIG. 9, an embodiment of this application provides an image processing apparatus 900, which includes the following units.
A first obtaining unit 910, configured to obtain the control point motion vector (CPMV) of an image block.
A second obtaining unit 920, configured to obtain, according to the CPMV of the image block obtained by the first obtaining unit 910, the motion vector of a sub-image block in the image block, where the motion vector has integer-pixel precision.
The solution provided by this application makes the motion vector of the sub-image block, which serves as the image processing unit, have integer-pixel precision. The motion compensation process of the sub-image block then involves no sub-pixels, which reduces, to some extent, the bandwidth pressure caused by Affine prediction.
Optionally, in some embodiments, the second obtaining unit 920 is configured to: calculate a first motion vector of the sub-image block according to the CPMV of the image block, where the first motion vector has sub-pixel precision; and process the first motion vector into a second motion vector with integer-pixel precision.
Optionally, in some embodiments, the second obtaining unit 920 is configured to obtain the second motion vector according to the first motion vector of the sub-image block, such that the end point of the second motion vector is the integer pixel closest to the end point of the first motion vector.
For example, the second obtaining unit 920 is configured to process the first motion vector into the second motion vector with integer-pixel precision by using formula (3) or formula (4).
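The two-stage behavior of the second obtaining unit can be sketched as follows. The sketch assumes the widely used four-parameter affine model with two control points (top-left and top-right) and evaluates it at the center of each 4×4 sub-image block; the application's own formulas are given earlier in the document and may differ in detail. Motion vectors are expressed in pixels as floats, and the function name `sub_block_mvs` is hypothetical.

```python
import math

def sub_block_mvs(cpmv0, cpmv1, width, height, sub=4):
    """Derive an integer-pixel MV for every sub x sub block of an image block.

    cpmv0, cpmv1: (x, y) control point MVs at the top-left and top-right
    corners, in pixels (assumed four-parameter affine model).
    Returns a dict mapping each sub-block origin (x, y) to its rounded MV.
    """
    a = (cpmv1[0] - cpmv0[0]) / width
    b = (cpmv1[1] - cpmv0[1]) / width
    result = {}
    for y in range(0, height, sub):
        for x in range(0, width, sub):
            cx, cy = x + sub / 2, y + sub / 2  # sub-block center
            # First motion vector: sub-pixel precision from the affine model.
            first = (a * cx - b * cy + cpmv0[0], b * cx + a * cy + cpmv0[1])
            # Second motion vector: nearest integer pixel (half rounds up).
            second = (math.floor(first[0] + 0.5), math.floor(first[1] + 0.5))
            result[(x, y)] = second
    return result
```

For an 8×8 block with control point MVs (1.0, 0.0) and (3.0, 0.0), the top-left sub-block has a first motion vector of (1.5, 0.5), which is rounded to the second motion vector (2, 1).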
Optionally, in some embodiments, the height and/or width of the sub-image block is 4 pixels.
Optionally, in some embodiments, the first obtaining unit 910 is configured to: obtain the motion information candidate list of the image block; process the motion vectors in the motion information candidate list to integer-pixel precision; and obtain the CPMV of the image block according to the motion vectors in the motion information candidate list that have been processed to integer-pixel precision.
Optionally, in some embodiments, the apparatus 900 further includes a processing unit 930, configured to perform an N-pixel motion vector precision decision on the image block, where N is a positive integer.
Optionally, in some embodiments, the height and/or width of the image block is less than 16 pixels.
Optionally, in some embodiments, the inter prediction mode of the sub-image block is any one of forward prediction, backward prediction, and bi-prediction.
Optionally, in some embodiments, the inter prediction mode of the sub-image block is bi-prediction, where the CPMV of the image block is the CPMV obtained by forward prediction in the bi-prediction process, or the CPMV obtained by backward prediction in the bi-prediction process.
Optionally, the image processing apparatus 900 of this embodiment may be an encoder, and the apparatus 900 may further include functional modules for implementing video encoding processes.
Optionally, the image processing apparatus 900 of this embodiment may be a decoder, and the apparatus 900 may further include functional modules for implementing video decoding processes.
As shown in FIG. 10, an embodiment of the present invention further provides an image processing apparatus 1000. The apparatus 1000 includes a processor 1010 and a memory 1020. The memory 1020 is configured to store instructions, and the processor 1010 is configured to execute the instructions stored in the memory 1020. Executing the instructions stored in the memory 1020 causes the processor 1010 to perform the methods of the method embodiments above.
Specifically, the apparatus 1000 further includes a communication interface 1030 for exchanging signals with external devices.
Optionally, the image processing apparatus 1000 of this embodiment is an encoder, and the communication interface 1030 is used to receive image or video data to be processed from an external device. The communication interface 1030 may also be used to send the encoded bitstream to the decoding end.
Optionally, the image processing apparatus 1000 of this embodiment is a decoder, and the communication interface 1030 is used to receive the encoded bitstream from the encoding end.
An embodiment of the present invention further provides a computer storage medium on which a computer program is stored. When executed by a computer, the computer program causes the computer to perform the methods of the method embodiments above.
An embodiment of the present invention further provides a computer program product containing instructions, characterized in that the instructions, when executed by a computer, cause the computer to perform the methods of the method embodiments above.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example a floppy disk, hard disk, or magnetic tape), an optical medium (for example a digital video disc (DVD)), or a semiconductor medium (for example a solid state disk (SSD)).
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
The above are only specific implementations of this application, but the scope of protection of this application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application. Therefore, the scope of protection of this application shall be subject to the scope of protection of the claims.

Claims (21)

  1. An image processing method, characterized by comprising:
    obtaining a control point motion vector (CPMV) of an image block; and
    obtaining, according to the CPMV of the image block, a motion vector of a sub-image block in the image block, wherein the motion vector has integer-pixel precision.
  2. The method according to claim 1, characterized in that obtaining, based on the CPMV of the image block, the motion vector of the sub-image block in the image block comprises:
    calculating a first motion vector of the sub-image block according to the CPMV of the image block, wherein the first motion vector has sub-pixel precision; and
    processing the first motion vector into a second motion vector with integer-pixel precision.
  3. The method according to claim 1 or 2, characterized in that the height and/or width of the sub-image block is 4 pixels.
  4. The method according to any one of claims 1 to 3, characterized in that obtaining the control point motion vector (CPMV) of the image block comprises:
    obtaining a motion information candidate list of the image block;
    processing the motion vectors in the motion information candidate list to integer-pixel precision; and
    obtaining the CPMV of the image block according to the motion vectors in the motion information candidate list that have been processed to integer-pixel precision.
  5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
    performing an N-pixel motion vector precision decision on the image block, wherein N is a positive integer.
  6. The method according to any one of claims 1 to 5, characterized in that the height and/or width of the image block is less than 16 pixels.
  7. The method according to any one of claims 1 to 6, characterized in that the inter prediction mode of the sub-image block is any one of: forward prediction, backward prediction, and bi-prediction.
  8. The method according to any one of claims 1 to 6, characterized in that the inter prediction mode of the sub-image block is bi-prediction, wherein the CPMV of the image block is the CPMV of the image block obtained by forward prediction in the bi-prediction process, or the CPMV of the image block obtained by backward prediction in the bi-prediction process.
  9. The method according to claim 2, characterized in that processing the first motion vector into the second motion vector with integer-pixel precision comprises:
    obtaining the second motion vector according to the first motion vector, such that the end point of the second motion vector is the integer pixel closest to the end point of the first motion vector.
  10. An image processing apparatus, characterized by comprising:
    a first obtaining unit, configured to obtain a control point motion vector (CPMV) of an image block; and
    a second obtaining unit, configured to obtain, according to the CPMV of the image block obtained by the first obtaining unit, a motion vector of a sub-image block in the image block, wherein the motion vector has integer-pixel precision.
  11. The apparatus according to claim 10, characterized in that the second obtaining unit is configured to:
    calculate a first motion vector of the sub-image block according to the CPMV of the image block, wherein the first motion vector has sub-pixel precision; and
    process the first motion vector into a second motion vector with integer-pixel precision.
  12. The apparatus according to claim 10 or 11, characterized in that the height and/or width of the sub-image block is 4 pixels.
  13. The apparatus according to any one of claims 10 to 12, characterized in that the first obtaining unit is configured to:
    obtain a motion information candidate list of the image block;
    process the motion vectors in the motion information candidate list to integer-pixel precision; and
    obtain the CPMV of the image block according to the motion vectors in the motion information candidate list that have been processed to integer-pixel precision.
  14. The apparatus according to any one of claims 10 to 13, characterized in that the apparatus further comprises:
    a processing unit, configured to perform an N-pixel motion vector precision decision on the image block, wherein N is a positive integer.
  15. The apparatus according to any one of claims 10 to 14, characterized in that the height and/or width of the image block is less than 16 pixels.
  16. The apparatus according to any one of claims 10 to 15, characterized in that the inter prediction mode of the sub-image block is any one of: forward prediction, backward prediction, and bi-prediction.
  17. The apparatus according to any one of claims 10 to 16, characterized in that the inter prediction mode of the sub-image block is bi-prediction, wherein the CPMV of the image block is the CPMV of the image block obtained by forward prediction in the bi-prediction process, or the CPMV of the image block obtained by backward prediction in the bi-prediction process.
  18. The apparatus according to claim 11, characterized in that the second obtaining unit is configured to obtain the second motion vector according to the first motion vector, such that the end point of the second motion vector is the integer pixel closest to the end point of the first motion vector.
  19. An image processing apparatus, characterized by comprising a memory and a processor, wherein the memory is configured to store instructions, the processor is configured to execute the instructions stored in the memory, and executing the instructions stored in the memory causes the processor to perform the method according to any one of claims 1 to 9.
  20. A computer storage medium, characterized in that a computer program is stored thereon, and the computer program, when executed by a computer, causes the computer to perform the method according to any one of claims 1 to 9.
  21. A computer program product containing instructions, characterized in that the instructions, when executed by a computer, cause the computer to perform the method according to any one of claims 1 to 9.
PCT/CN2019/077894 2019-03-12 2019-03-12 Image processing method and apparatus WO2020181507A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/077894 WO2020181507A1 (en) 2019-03-12 2019-03-12 Image processing method and apparatus
CN201980005232.7A CN111247804B (en) 2019-03-12 2019-03-12 Image processing method and device

Publications (1)

Publication Number Publication Date
WO2020181507A1 (en) 2020-09-17

Family

ID=70865988

Country Status (2)

Country Link
CN (1) CN111247804B (en)
WO (1) WO2020181507A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106303544A (en) * 2015-05-26 2017-01-04 华为技术有限公司 A kind of video coding-decoding method, encoder
CN106534858A (en) * 2015-09-10 2017-03-22 展讯通信(上海)有限公司 Real motion estimation method and device
CN109005407A (en) * 2015-05-15 2018-12-14 华为技术有限公司 Encoding video pictures and decoded method, encoding device and decoding device
CN109218733A (en) * 2017-06-30 2019-01-15 华为技术有限公司 A kind of method and relevant device of determining motion vector predictor
WO2019032765A1 (en) * 2017-08-09 2019-02-14 Vid Scale, Inc. Frame-rate up conversion with reduced complexity

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3414900A4 (en) * 2016-03-15 2019-12-11 Mediatek Inc. Method and apparatus of video coding with affine motion compensation
CN109391814B (en) * 2017-08-11 2023-06-06 华为技术有限公司 Method, device and equipment for encoding and decoding video image
CN107277506B (en) * 2017-08-15 2019-12-03 中南大学 Motion vector accuracy selection method and device based on adaptive motion vector precision

Also Published As

Publication number Publication date
CN111247804B (en) 2023-10-13
CN111247804A (en) 2020-06-05

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19919041; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19919041; Country of ref document: EP; Kind code of ref document: A1)