WO2020252707A1 - Video processing method and device - Google Patents

Video processing method and device

Info

Publication number
WO2020252707A1
WO2020252707A1 PCT/CN2019/091955 CN2019091955W WO2020252707A1 WO 2020252707 A1 WO2020252707 A1 WO 2020252707A1 CN 2019091955 W CN2019091955 W CN 2019091955W WO 2020252707 A1 WO2020252707 A1 WO 2020252707A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
component
mode
interpolation
filter
Prior art date
Application number
PCT/CN2019/091955
Other languages
French (fr)
Chinese (zh)
Inventor
孟学苇
郑萧桢
王苫社
马思伟
Original Assignee
北京大学
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京大学 and 深圳市大疆创新科技有限公司
Priority to CN201980009161.8A (CN111656782A)
Priority to PCT/CN2019/091955 (WO2020252707A1)
Publication of WO2020252707A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/124 Quantisation
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/174 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop

Definitions

  • This application relates to the field of image processing, and more specifically, to a video processing method and device.
  • Prediction is an important module of the mainstream video coding framework. Prediction can include intra-frame prediction and inter-frame prediction.
  • the general process of inter prediction may include motion estimation (ME) and motion compensation (MC).
  • the process of motion estimation is the process of obtaining a motion vector (MV) after searching and comparing the current coding block of the current frame in the reference frame.
  • Motion compensation is the process of obtaining the prediction block of the current block by using the MV and the reference block.
  • The prediction block obtained by motion compensation may differ from the original current block. Therefore, in addition to the MV and reference frame information, the difference (residual) between the prediction block and the current block must be transmitted to the decoding end after transformation, quantization, etc., so that the decoding end can reconstruct the current frame.
  • the motion vector of the object between two adjacent frames may not be exactly an integer number of pixel units.
  • sub-pixel accuracy is proposed.
  • HEVC High Efficiency Video Coding
  • a motion vector with 1/4 pixel accuracy is used for motion estimation of the luminance component.
  • There are no samples at sub-pixel positions in digital video.
  • To achieve 1/K pixel accuracy, the values of these sub-pixels must be approximated by interpolation; that is, K-fold interpolation is performed on the reference frame in the row direction and the column direction, and the prediction block is searched for in the interpolated reference frame.
  • the pixels in the current block and the pixels in the adjacent area need to be used.
  • the embodiments of the present application provide a video processing method and device, which can effectively implement the interpolation process in the motion estimation and/or motion compensation process.
  • a video processing method which includes: using an interpolation filter among a variety of interpolation filters to perform motion estimation and/or motion compensation on an image block with multiple motion vectors MV of a target frame.
  • a video processing device including a processor, and the processor is configured to call codes stored in a memory to perform the following operations:
  • motion estimation and/or motion compensation are performed on the image block with multiple MVs of the target frame.
  • a computer system including: a memory, configured to store computer-executable instructions; a processor, configured to access the memory and execute the computer-executable instructions to perform the operations of the method of the first aspect.
  • a computer storage medium stores program code, and the program code can be used to instruct the execution of the method of the first aspect.
  • a computer program product includes program code, and the program code can be used to instruct to execute the method of the first aspect.
  • For image blocks with multiple MVs, there may be multiple interpolation filters to choose from, and the interpolation filter can be selected flexibly, so that storage bandwidth pressure can be reduced while encoding performance is maintained.
  • Fig. 1 is a frame diagram of video coding according to an embodiment of the present application.
  • Fig. 2 is a schematic diagram of a prediction method according to an embodiment of the present application.
  • Fig. 3 is a schematic diagram of an image block interpolation process according to an embodiment of the present application.
  • Fig. 4 is a schematic diagram of the control points of the Affine mode according to an embodiment of the present application.
  • Fig. 5 is a schematic diagram of a motion vector of a CU according to an embodiment of the present application.
  • Fig. 6 is a schematic flowchart of a video processing method according to an embodiment of the present application.
  • Fig. 7 is a schematic block diagram of a video processing device according to an embodiment of the present application.
  • the video coding framework mainly includes intra-frame prediction, inter-frame prediction, transformation, quantization, entropy coding, and loop filtering.
  • This application mainly aims to improve the inter-frame prediction (inter prediction) part.
  • Inter-frame prediction exploits the temporal correlation between adjacent frames of the video: it uses a reconstructed frame as the reference frame and predicts the current frame through Motion Estimation (ME) and Motion Compensation (MC), removing the temporal redundancy of the video.
  • the current frame (or target frame) mentioned in this article refers to the frame currently being encoded in the encoding scene, and refers to the frame currently being decoded in the decoding scene.
  • the reconstructed frame mentioned in this article, in the encoding scene, means the previously encoded frame, in the decoding scene, means the previously decoded frame.
  • the entire frame of image is not directly processed in the encoding process, and the entire frame of image is usually divided into image blocks for processing.
  • CTU Coding Tree Unit
  • the size of the CTU is 64×64 or 128×128 (unit: pixels)
  • the CTU can be further divided into square or rectangular Coding Units (CUs).
  • CU Coding Unit
  • the unit of the size of the image block mentioned in this article may all be pixels.
  • Motion estimation refers to the process of obtaining a motion vector after searching and comparing the current block of the current frame in the reference frame.
  • Motion compensation refers to the process of obtaining a prediction block using a reference block and a motion vector obtained by motion estimation.
  • the prediction block obtained in the process of inter prediction may be different from the original current block. Therefore, the difference between the prediction block and the current block can be calculated, and the difference may be called the residual. After performing transformation, quantization, entropy coding and other processing on the residual, the coded bit stream is obtained.
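The relationship between motion estimation, motion compensation, and the residual can be illustrated with a minimal integer-pixel sketch. The full-search strategy, the SAD cost, and all function names here are assumptions made for illustration; the text does not prescribe a search method or a matching criterion.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_estimate(cur, ref, bx, by, size, search_range):
    """Full search over integer displacements; returns the MV (dx, dy) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block would fall outside the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv


def motion_compensate(ref, bx, by, mv, size):
    """Fetch the prediction block that the MV points to in the reference frame."""
    dx, dy = mv
    return [[ref[by + dy + y][bx + dx + x] for x in range(size)] for y in range(size)]


def residual_block(cur, pred, bx, by, size):
    """Residual = current block minus prediction block."""
    return [[cur[by + y][bx + x] - pred[y][x] for x in range(size)] for y in range(size)]


# Toy frames: the current frame is the reference frame shifted by (2, 1).
ref_frame = [[y * 16 + x for x in range(12)] for y in range(12)]
cur_frame = [[ref_frame[y + 1][x + 2] if y + 1 < 12 and x + 2 < 12 else 0
              for x in range(12)] for y in range(12)]
mv = motion_estimate(cur_frame, ref_frame, 4, 4, 4, 2)
pred = motion_compensate(ref_frame, 4, 4, mv, 4)
res = residual_block(cur_frame, pred, 4, 4, 4)
```

In this toy case the search recovers the shift exactly, so the residual is all zeros; with real content the residual is what goes on to transformation, quantization, and entropy coding.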
  • The bitstream and the coding mode information, such as the inter-frame prediction mode and motion vector information, can be stored or sent to the decoding end.
  • At the decoding end, after the entropy-coded bitstream is obtained, entropy decoding is first performed on the bitstream to obtain the corresponding residual; the prediction block is then obtained according to the decoded coding mode information such as the motion vector; finally, the value of each pixel in the current block is obtained from the residual and the prediction block, that is, the current block is reconstructed. Repeating this for each block reconstructs the current frame.
  • steps such as inverse quantization and inverse transformation may also be included.
  • Inverse quantization refers to the process opposite to the quantization process.
  • Inverse transformation refers to the process opposite to the transformation process.
  • Inter prediction may include forward prediction, backward prediction, bi-prediction, and so on.
  • Forward prediction uses a previously reconstructed frame (which may be called a "historical frame") to predict the current frame (for example, the frame labeled t in FIG. 2).
  • Backward prediction uses a frame after the current frame (which may be called a "future frame") to predict the current frame.
  • Bi-prediction can be bidirectional prediction, that is, using both "historical frames" (for example, the frames labeled t-2 and t-1 in FIG. 2) and "future frames" (for example, the frames labeled t+2 and t+1 in FIG. 2) to predict the current frame.
  • Bi-prediction can also be prediction in the same direction, for example, using two "historical frames" to predict the current frame, or using two "future frames" to predict the current frame.
  • the frame types can include three types: "I frame", "B frame" and "P frame".
  • B frame is a bidirectional predictive frame.
  • the image blocks in the B frame may use intra-frame coding or inter-frame coding mode.
  • the inter prediction mode of its image block can be forward prediction, backward prediction, or bidirectional prediction. Therefore, the inter prediction block MV of the B frame can be a single MV or a dual MV.
  • Generalized B frame (Generalized P and B picture, referred to as GPB) is a structure in HEVC that combines the characteristics of traditional B and P frames.
  • Generalized B-frames can adopt a dual forward prediction method, that is, there are two reference frames and all of them are "historical frames".
  • A coding block of a generalized B frame may use the intra mode, the forward prediction mode, or the dual forward prediction mode.
  • the inter prediction block MV of a generalized B frame can be a single MV or a dual MV.
  • P frame is a forward prediction frame, and it is a unidirectional prediction.
  • A coding block in a P frame may use either the intra prediction mode or the forward prediction mode. Since the P frame is a unidirectional prediction frame, every inter prediction block of a P frame has a single MV.
  • HEVC (High Efficiency Video Coding) coding configurations include AI (All Intra), RA (Random Access), LDB (Low Delay B frame coding), and LDP (Low Delay P frame coding).
  • In the AI coding mode, all frames are I frames (I I I I I I I I I).
  • In the RA coding mode, the frames are mainly B frames, and I frames are inserted periodically (approximately every second), which means that in this coding mode the frame structure is I B B B B B B B B B B B...B B B I B B B B B B B B B...B B B I....
  • In the LDB coding mode, only the first frame is an I frame, and the rest are coded as B frames; the frame structure is I B B B B B....
  • In the LDP coding mode, only the first frame is an I frame, and the rest are coded as P frames; the frame structure is I P P P P...
  • the inter-frame prediction technology in HEVC can include three modes, namely inter mode (also called AMVP mode), merge mode and skip mode.
  • MVP motion vector prediction
  • the starting point of motion estimation can be determined according to the MVP, and the motion search is performed near the starting point.
  • After the optimal MV is obtained, the position of the reference block in the reference image is determined by the MV; the reference block is subtracted from the current block to obtain the residual block, and the MVP is subtracted from the MV to obtain the Motion Vector Difference (MVD), which is transmitted to the decoding end in the code stream.
  • MVD Motion Vector Difference
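The MVD relationship described above amounts to a component-wise subtraction at the encoder and the matching addition at the decoder. The sketch below assumes MVs are stored as integer pairs in some fixed sub-pixel unit (for example quarter-pixel units); the function names are illustrative, not taken from any codec.

```python
def encode_mvd(mv, mvp):
    """Encoder side: MVD = MV - MVP, per component."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvp, mvd):
    """Decoder side: MV = MVP + MVD, per component."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Example in quarter-pixel units: the search found MV (9, -6) starting from MVP (8, -4).
mvd = encode_mvd((9, -6), (8, -4))
recovered = decode_mv((8, -4), mvd)
```

Only the MVD is written to the code stream; since the decoder derives the same MVP, it recovers the MV exactly.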
  • In Merge mode, the MVP can be determined first and then used directly as the MV.
  • an MVP candidate list (merge candidate list) can be constructed first.
  • The MVP candidate list can include at least one candidate MVP.
  • Each candidate MVP can correspond to an index.
  • After selecting an MVP from the candidate list, the encoder can write the MVP index into the code stream, and the decoder can use that index to find the corresponding MVP in the MVP candidate list and thereby decode the image block.
  • Step 1: Obtain the MVP candidate list;
  • Step 2: Select an optimal MVP from the MVP candidate list, and at the same time obtain the index of the MVP in the MVP candidate list;
  • Step 3: Use the MVP as the MV of the current block;
  • Step 4: Determine the position of the reference block (also called the prediction block) in the reference frame image according to the MV;
  • Step 5: Subtract the reference block from the current block to obtain the residual data;
  • Step 6: Pass the residual data and the index of the MVP to the decoder.
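The six steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the candidate list is supplied by the caller (its construction is not described in this text), candidates are integer-pixel displacements that stay inside the frame, and the "optimal" MVP is chosen by SAD, whereas a real encoder would typically use a rate-distortion cost. All names are hypothetical.

```python
def fetch_block(frame, x, y, w, h):
    """Copy a w-by-h block whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def merge_encode(cur_block, ref_frame, bx, by, candidates):
    """Steps 1-6 at the encoder: returns (MVP index, residual block)."""
    h, w = len(cur_block), len(cur_block[0])
    best = None
    for index, (dx, dy) in enumerate(candidates):       # Step 1: candidate list is given
        pred = fetch_block(ref_frame, bx + dx, by + dy, w, h)   # Steps 3-4: MVP used as MV
        cost = sum(abs(c - p)
                   for cur_row, pred_row in zip(cur_block, pred)
                   for c, p in zip(cur_row, pred_row))
        if best is None or cost < best[0]:              # Step 2: keep the best candidate
            best = (cost, index, pred)
    _, index, pred = best
    residual = [[c - p for c, p in zip(cur_row, pred_row)]      # Step 5: current - reference
                for cur_row, pred_row in zip(cur_block, pred)]
    return index, residual                              # Step 6: what is sent to the decoder


def merge_decode(ref_frame, bx, by, candidates, index, residual):
    """Decoder: rebuild the same candidate list, look up the MVP by index, add the residual."""
    dx, dy = candidates[index]
    h, w = len(residual), len(residual[0])
    pred = fetch_block(ref_frame, bx + dx, by + dy, w, h)
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred, residual)]


ref_frame = [[y * 8 + x for x in range(8)] for y in range(8)]
cur_block = fetch_block(ref_frame, 3, 2, 2, 2)
cur_block[0][0] += 5                                    # make the match imperfect
candidates = [(0, 0), (1, 0), (0, 1)]
index, residual = merge_encode(cur_block, ref_frame, 2, 2, candidates)
reconstructed = merge_decode(ref_frame, 2, 2, candidates, index, residual)
```

No MVD is transmitted in this mode: the index alone identifies the MV, which is why the decoder must be able to rebuild an identical candidate list.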
  • Merge mode can also have other implementations.
  • Skip mode is a special case of Merge mode. After the MV is obtained according to the Merge mode, if the encoding end determines that the current block is essentially the same as the reference block, no residual data needs to be transmitted; only the index of the MV is transmitted, together with a flag indicating that the current block can be obtained directly from the reference block.
  • the motion vector of the object between two adjacent frames may not be exactly an integer number of pixels. Therefore, the accuracy of motion estimation can be improved to the sub-pixel level (also called 1/K pixel accuracy). For example, in the HEVC standard, motion vectors with 1/4 pixel accuracy are used for motion estimation of the luminance component.
  • the 1/4 pixel interpolation process is shown in FIG. 3, and the 3 pixels on the left side and the 4 pixels on the right side of the image block to be encoded can be used to generate the pixel value of the interpolation point.
  • a_{0,0} and d_{0,0} are 1/4-pixel positions
  • b_{0,0} and h_{0,0} are half-pixel positions
  • c_{0,0} and n_{0,0} are 3/4-pixel positions. If the current block is a 2×2 block, it is the 2×2 block enclosed by A_{0,0} to A_{1,0} and A_{0,0} to A_{0,1}.
  • image blocks mentioned here may be 8×8, 4×8, 4×4, or 8×4 image blocks, and may also be image blocks of other sizes, which are not specifically limited in the embodiment of the present application.
  • the interpolation process in the embodiment of the present application can be implemented by an interpolation filter.
  • the number of taps of the interpolation filter may refer to the maximum number of pixel values that can be used to calculate one interpolated sample.
  • the coefficients of the 8-tap interpolation filter corresponding to the luminance component and the coefficients of the 4-tap interpolation filter of the chrominance component can be as shown in Tables 1 and 2.
  • filter1 is the set of interpolation filter coefficients used at the 1/8 pixel position
  • filter2 is the set of interpolation filter coefficients used at the 2/8 pixel position
  • filter3 is the set of interpolation filter coefficients used at the 3/8 pixel position
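As an illustration of how an 8-tap filter produces a sub-pixel sample from 3 pixels on the left and 4 pixels on the right, here is a minimal one-dimensional sketch. Tables 1 and 2 are not reproduced in this text, so the coefficients below are the HEVC luminance filter taps as commonly specified and are an assumption here; the clamping at the row edges and the function name are also illustrative choices, and clipping to the sample bit depth is omitted.

```python
# Assumed 8-tap luma coefficients (sum = 64), indexed by the fractional position in 1/4 pixels.
LUMA_TAPS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),     # 1/4-pixel position
    2: (-1, 4, -11, 40, 40, -11, 4, -1),   # 1/2-pixel position
    3: (0, 1, -5, 17, 58, -10, 4, -1),     # 3/4-pixel position
}


def interpolate_row(samples, pos, frac):
    """Value at integer position `pos` plus `frac`/4 of a pixel along one row."""
    if frac == 0:
        return samples[pos]                # integer position: no filtering needed
    acc = 0
    for k, coeff in enumerate(LUMA_TAPS[frac]):
        # Taps cover pos-3 .. pos+4; indices outside the row are clamped to the edge.
        idx = min(max(pos - 3 + k, 0), len(samples) - 1)
        acc += coeff * samples[idx]
    return (acc + 32) >> 6                 # divide by 64 with rounding


flat_row = [100] * 12
ramp_row = [10 * i for i in range(12)]
flat_half = interpolate_row(flat_row, 5, 2)
ramp_half = interpolate_row(ramp_row, 4, 2)
```

A two-dimensional sub-pixel sample is obtained by applying the same kind of filter along rows and then along columns, which is why the reference area that must be read grows in both directions.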
  • the Adaptive Motion Vector Resolution (AMVR) technology can enable the CU to have a motion vector with integer-pixel precision or sub-pixel precision.
  • the integer pixel accuracy can be, for example, 1-pixel accuracy, 2-pixel accuracy, or the like.
  • the sub-pixel accuracy can be, for example, 1/2 pixel accuracy, 1/4 pixel accuracy, 1/8 pixel accuracy, or 1/16 pixel accuracy.
  • AMVR can include AMVR in inter mode and AMVR in Affine mode.
  • In Affine mode, the motion field can be derived from the motion vectors of two control points (four parameters) (as shown in Figure 4(a)) or three control points (six parameters) (as shown in Figure 4(b)).
  • CPMV Control Point Motion Vector
  • the processing unit of Affine is not a CU, but a sub-block (sub-CU) obtained after dividing the CU, and the size of each sub-CU may be 4×4.
  • each sub-CU has one MV. It can be understood that, unlike ordinary CUs, an Affine mode CU does not have only one MV: it has as many MVs as it has sub-CUs.
  • the MV of the sub-CU in one CU is derived through the CPMV calculation of two control points or three control points as shown in FIG. 4.
  • the MV of the sub-CU at the (x, y) position is calculated by formula 1) when two control points are used and by formula 2) when three control points are used, where:
  • (mv_0x, mv_0y) is the MV of the upper left control point
  • (mv_1x, mv_1y) is the MV of the upper right control point
  • (mv_2x, mv_2y) is the MV of the lower left control point.
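The derivation of sub-CU MVs from the control point MVs can be sketched as below. The formulas are not reproduced in this text, so the sketch assumes the widely used four-parameter and six-parameter affine models (stated in the comments), evaluates them at the centre of each 4x4 sub-CU, and rounds to 1/16 pixel; those choices and the function name are assumptions, not confirmed details of this application.

```python
def affine_subblock_mvs(cpmvs, width, height, sub=4):
    """Derive one MV per sub-CU from 2 or 3 control point MVs (in pixels).

    Assumed four-parameter model (2 control points):
        mv_x = (mv1x - mv0x)/W * x - (mv1y - mv0y)/W * y + mv0x
        mv_y = (mv1y - mv0y)/W * x + (mv1x - mv0x)/W * y + mv0y
    Assumed six-parameter model (3 control points):
        mv_x = (mv1x - mv0x)/W * x + (mv2x - mv0x)/H * y + mv0x
        mv_y = (mv1y - mv0y)/W * x + (mv2y - mv0y)/H * y + mv0y
    """
    (mv0x, mv0y), (mv1x, mv1y) = cpmvs[0], cpmvs[1]
    a = (mv1x - mv0x) / width
    b = (mv1y - mv0y) / width
    if len(cpmvs) == 3:
        mv2x, mv2y = cpmvs[2]
        c = (mv2x - mv0x) / height
        d = (mv2y - mv0y) / height
    else:
        c, d = -b, a                       # four-parameter model: rotation/zoom only
    mvs = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            x, y = x0 + sub / 2, y0 + sub / 2          # sub-CU centre (assumption)
            mv_x = a * x + c * y + mv0x
            mv_y = b * x + d * y + mv0y
            # Store with 1/16-pixel precision.
            mvs[(x0, y0)] = (round(mv_x * 16) / 16, round(mv_y * 16) / 16)
    return mvs


translation = affine_subblock_mvs([(1.5, -0.25), (1.5, -0.25)], 16, 16)
zoom = affine_subblock_mvs([(0.0, 0.0), (16.0, 0.0)], 16, 16)
six_param = affine_subblock_mvs([(0.0, 0.0), (16.0, 0.0), (0.0, 16.0)], 16, 16)
```

When all control point MVs are equal, every sub-CU receives the same MV and the model reduces to ordinary translational motion.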
  • the motion vector in a CU can be as shown in Fig. 5, and each square represents a sub-CU with a size of 4x4. All MVs after the above formula calculation will be converted into 1/16 precision representation, which means that the highest precision of sub-CU MV is 1/16.
  • the prediction block of each sub-CU is obtained through the process of motion compensation.
  • the sub-CU size is 4x4 for both the chrominance component and the luminance component, and the MV of a 4x4 chrominance block is obtained by averaging the MVs of its four corresponding 4x4 luminance blocks.
  • The Affine merge mode can only process CUs whose width and height are both not less than 8, and is similar to the normal merge mode described above. In this mode, MVs are first obtained from spatially and temporally neighboring blocks; these include the CPMVs of Affine mode CUs and the MVs of traditional-mode CUs. Combinations of these MVs form candidate CPMVs, from which a candidate list is constructed. One combination is then selected from the candidate list (a combination may contain two or three CPMVs, corresponding to two or three control points) as the CPMVs of the current block. No motion estimation is required, and only the index of the finally selected CPMVs is written into the code stream (one index per CU).
  • the inter prediction mode of adjacent blocks can be the traditional inter prediction mode or the Affine mode. Therefore, the MV obtained from the adjacent blocks may be whole pixels or sub-pixels.
  • the Affine merge mode does not perform AMVR, that is, it does not carry out the adaptive motion vector accuracy decision; the MVs selected from the neighboring blocks keep whatever accuracy they already have.
  • the Affine Inter mode can only process CUs whose width and height are not less than 16, which is similar to the AMVP mode mentioned above.
  • the candidate list can be constructed by first obtaining MVs from adjacent blocks in the spatial or temporal domain, and then performing the motion estimation process.
  • the motion estimation process is performed in units of the entire CU to obtain CPMVs.
  • the motion compensation process is performed in units of 4x4 sub-CUs, and finally the index of the selected CPMVs and the difference (MVD, motion vector difference) between the selected CPMVs and the actual CPMVs of the current CU can be written into the code stream.
  • the accuracy of AMVR is essentially the accuracy of the MVD, that is, the accuracy of the CPMVs, not the MV accuracy of the sub-CU.
  • the encoder can adaptively decide the corresponding MV accuracy, write the result of the decision into the code stream, and pass it to the decoder.
  • the whole pixel accuracy or sub-pixel accuracy mentioned in Affine AMVR technology refers to the pixel accuracy of CPMV, not the pixel accuracy of sub-CU.
  • the 1/16 accuracy, 1/4 accuracy, and integer pixel accuracy mentioned in Affine AMVR can refer to the accuracy of the CPMV in Figure 4, not the accuracy of the MV actually used in the sub-CU motion compensation process.
  • Even when motion estimation is performed with integer-pixel accuracy, the MV of a sub-CU obtained from formulas 1) and 2) above may have 1/4-pixel accuracy, so the motion compensation process will involve sub-pixels.
  • the sub-pixel precision interpolation process mentioned above may put pressure on memory read bandwidth.
  • the interpolation process also needs to read the data of its neighboring points to obtain the pixel value of the sub-pixel.
  • the interpolation process needs 7 pixels in the horizontal direction and 7 pixels in the vertical direction in addition to the current block. If the current block has width W and height H, the interpolation process needs to read an area of (W+7)x(H+7).
  • the area of (W+3)x(H+3) needs to be read.
  • the area of (W+5)x(H+5) needs to be read.
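The read areas quoted above are consistent with an N-tap filter needing N-1 extra pixels in each direction. The sketch below makes that rule explicit; reading (W+7), (W+3), and (W+5) as the 8-tap, 4-tap, and 6-tap cases is an inference from the numbers, not something this text states, and the function names are illustrative.

```python
def reference_read_area(width, height, taps):
    """Pixels read from the reference frame to interpolate a width-by-height block."""
    extra = taps - 1                       # an N-tap filter reaches N-1 pixels beyond the block
    return (width + extra) * (height + extra)


def reads_per_predicted_pixel(width, height, taps, num_mvs=1):
    """Reference pixels fetched per predicted pixel; each MV needs its own reference area."""
    return num_mvs * reference_read_area(width, height, taps) / (width * height)


area_8tap = reference_read_area(8, 8, 8)               # (8+7) x (8+7)
small_block_cost = reads_per_predicted_pixel(4, 4, 8)
small_block_bi = reads_per_predicted_pixel(4, 4, 8, num_mvs=2)
small_block_4tap_bi = reads_per_predicted_pixel(4, 4, 4, num_mvs=2)
```

The overhead is proportionally largest for small blocks and doubles with a second MV, which is the situation in which a shorter filter saves the most bandwidth.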
  • the storage bandwidth consumption is larger. If LDP is used and only single MV is allowed, it will lead to a larger gap between the coding performance and LDB.
  • this application proposes an image processing method and device, which can reduce bandwidth pressure to a certain extent while ensuring compression performance.
  • This application is suitable for the field of digital video coding technology, and is specifically used for the inter-frame prediction part of a video codec.
  • This application can be applied to codecs that comply with the international video coding standard H.264/HEVC and the Chinese AVS2 standard, as well as codecs that comply with the next-generation video coding standard VVC or AVS3.
  • This application can be applied to the inter-frame prediction part of a video codec, that is to say, the image processing method according to the embodiment of this application can be executed by an encoding device or a decoding device.
  • Fig. 6 is a schematic flowchart of a video processing method according to an embodiment of the present application. The method includes at least part of the following content.
  • an interpolation filter among a variety of interpolation filters is used to perform motion estimation and/or motion compensation on an image block having at least one MV (specifically, multiple MVs) of the target frame.
  • There may be multiple interpolation filters available to the video processing device.
  • When the video processing device performs motion estimation and/or motion compensation on the current image block, it can select an interpolation filter from the multiple interpolation filters for that motion estimation and/or motion compensation.
  • the interpolation filter used for motion estimation and the interpolation filter used for motion compensation may be the same or different.
  • the interpolation filter can be selected once for the current image block, which is used for motion estimation and motion compensation. Or, for the current image block, an interpolation filter may be selected for motion estimation and an interpolation filter may be selected for motion compensation.
  • the multiple interpolation filters used for motion estimation may be the same, partially the same, or completely different from the multiple interpolation filters used for motion compensation.
  • Different interpolation filters may differ in at least one of the following aspects: the number of taps of the interpolation filter, the coefficients of the interpolation filter, and the shape of the interpolation filter (also referred to as the reference pixel positions of the interpolation filter).
  • the number of taps of the interpolation filter may be 2, 4, 6, or 8, etc.
  • different interpolation filters of the multiple interpolation filters correspond to different preset conditions.
  • each of the multiple interpolation filters corresponds to a preset condition, and when the preset condition of a certain interpolation filter is satisfied, the interpolation filter can be used to perform motion estimation and/or motion compensation.
  • When the preset condition corresponding to the first interpolation filter (which may be any of the multiple interpolation filters) is satisfied, the first interpolation filter is used to perform motion estimation and/or motion compensation on the image block.
  • the preset conditions corresponding to different interpolation filters may be different in at least one of the following aspects:
  • the encoding mode of the image block, the interval in which the size of the image block is located, the components to be encoded of the image block, and the number of MVs of the image block.
  • the coding mode of the image block mentioned in the embodiment of this application may include: Inter mode, Affine mode, Merge mode, and so on.
  • the interval where the size of the image block mentioned in the embodiment of the present application is located can be divided into two or more than two types of intervals.
  • For example, the size of the image block can be divided into two intervals: greater than a preset value, and less than or equal to the preset value. As another example, it can be divided into three intervals: greater than a first preset value; less than or equal to the first preset value and greater than a second preset value; and less than or equal to the second preset value.
  • the components to be coded of the image block mentioned in the embodiments of the present application may include: luminance components and chrominance components.
  • the number of MVs of image blocks mentioned in the embodiment of the present application may be one, two, three or more.
  • one interpolation filter may correspond to one or more preset conditions, and when any one of the one or more preset conditions is satisfied, the interpolation filter may be used for motion estimation and/or motion compensation.
  • different preset conditions among the multiple preset conditions corresponding to one interpolation filter are different in at least one of the following aspects:
  • the encoding mode of the image block, the interval in which the size of the image block is located, the components to be encoded of the image block, and the number of MVs of the image block.
  • the preset conditions corresponding to the first interpolation filter in the plurality of interpolation filters include at least two of the following:
  • 1) the coding mode of the image block is the inter mode, and the component to be coded is the luminance component;
  • 2) the coding mode of the image block is the inter mode, and the component to be coded is the chrominance component;
  • 3) the coding mode of the image block is the affine motion-compensated prediction (Affine) mode, and the component to be coded is the chrominance component;
  • 4) the coding mode of the image block is the Affine mode, and the component to be coded is the luminance component.
  • the at least two preset conditions corresponding to the first interpolation filter may differ in the encoding mode, or in the component to be encoded.
  • the factors included in a single preset condition mentioned in the embodiments of the present application may be open-ended; that is, in addition to the factors mentioned (the factors included in the preset condition, i.e., the factors limited by the preset condition), other factors may also be included or limited.
  • when a preset condition includes multiple factors, it may mean that all of these factors are limited; when other factors are not included, it may mean that those factors are not limited.
  • for example, the above preset condition 1) may also include that the size of the image block is less than or equal to the preset value.
  • the preset conditions corresponding to different interpolation filters are different in at least one of the following aspects:
  • the encoding mode of the image block, the interval in which the size of the image block is located, the components to be encoded of the image block, and the number of MVs of the image block.
  • when one interpolation filter corresponds to a plurality of preset conditions, the aspect in which each of these preset conditions differs from the preset conditions of other interpolation filters may itself be different.
  • for example, the interpolation filter 1 corresponds to the preset condition A and the preset condition B, and the interpolation filter 2 corresponds to the preset condition C; the preset condition A and the preset condition C may differ in the encoding mode of the image block, while the preset condition B and the preset condition C may differ in the component to be encoded.
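  • the relationship between interpolation filters and preset conditions described above can be sketched as a small lookup. This is only an illustration under stated assumptions: the class `PresetCondition`, the names `filter_1` and `filter_2`, and the concrete modes and components assigned to conditions A, B and C are hypothetical and are not taken from the application. A factor left as `None` models a factor that the preset condition does not limit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PresetCondition:
    """One preset condition; a factor left as None is not limited."""
    mode: Optional[str] = None        # e.g. "inter" or "affine"
    component: Optional[str] = None   # e.g. "luma" or "chroma"

    def matches(self, mode: str, component: str) -> bool:
        # A limited factor must match; an unlimited factor always passes.
        return self.mode in (None, mode) and self.component in (None, component)


# Hypothetical assignment: A and C differ in the encoding mode,
# B and C differ in the component to be encoded.
CONDITION_A = PresetCondition(mode="inter", component="luma")
CONDITION_B = PresetCondition(mode="affine", component="chroma")
CONDITION_C = PresetCondition(mode="affine", component="luma")

# One filter may correspond to several preset conditions.
FILTER_CONDITIONS = {
    "filter_1": (CONDITION_A, CONDITION_B),
    "filter_2": (CONDITION_C,),
}


def filters_for(mode: str, component: str) -> list:
    """Return the names of the filters with at least one satisfied condition."""
    return [
        name
        for name, conditions in FILTER_CONDITIONS.items()
        if any(c.matches(mode, component) for c in conditions)
    ]
```

  • with this hypothetical assignment, an inter-coded luminance block selects `filter_1`, an Affine-coded luminance block selects `filter_2`, and a combination that no preset condition covers yields an empty list.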
  • different preset conditions correspond to different interpolation filters of the multiple kinds of interpolation filters.
  • the interpolation filter corresponding to the preset condition can be used to perform motion estimation and/or motion compensation on the image block.
  • the interpolation filters corresponding to each preset condition may be different.
  • the encoding mode of the image block, the interval in which the size of the image block is located, the components to be encoded of the image block, and the number of MVs of the image block.
  • that preset conditions are different in a certain aspect may mean that they impose different limitations on this aspect.
  • for example, if the preset condition A limits the encoding mode of the image block to the inter mode and the preset condition B limits the coding mode of the image block to the Affine mode, the two preset conditions are different in the coding mode of the image block.
  • the following preset conditions respectively correspond to different interpolation filters:
  • 1) the encoding mode of the image block is the inter mode, the component to be encoded is the luminance component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
  • 2) the encoding mode of the image block is the inter mode, the component to be encoded is the luminance component, and the size of the image block is less than or equal to the second preset value.
  • the above preset condition 1) and preset condition 2) have the same limitation on the encoding mode of the image block and on the component to be encoded, but their limitations on the size of the image block are different.
  • different preset conditions may have different limitations on the size of the image block while the limited encoding mode or component to be encoded is instead the Affine mode or the chrominance component; or, different preset conditions may have the same limitation on the size of the image block but different limitations on the coding mode or the component to be coded; or, different preset conditions may have different limitations on the size of the image block and also different limitations on the coding mode or the component to be coded.
  • 1) the coding mode of the image block is Affine mode, the component to be coded is a luminance component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
  • 2) the coding mode of the image block is Affine mode, the component to be coded is a chrominance component, and the size of the image block is less than or equal to the second preset value.
  • the following preset conditions correspond to different interpolation filters:
  • 1) the encoding mode of the image block is the inter mode, the component to be encoded is the luminance component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
  • 2) the coding mode of the image block is Affine mode, the component to be coded is a luminance component, and the size of the image block is less than or equal to the second preset value.
  • the following preset conditions correspond to different interpolation filters:
  • 1) the coding mode of the image block is Affine mode, the component to be coded is a luminance component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
  • 2) the encoding mode of the image block is the inter mode, the component to be encoded is the luminance component, and the size of the image block is less than or equal to the second preset value.
  • the following preset conditions correspond to different interpolation filters:
  • 1) the coding mode of the image block is Affine mode, the component to be coded is a chrominance component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
  • 2) the coding mode of the image block is inter mode, the component to be coded is a chrominance component, and the size of the image block is less than or equal to the second preset value.
  • the following preset conditions respectively correspond to different interpolation filters:
  • 1) the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
  • 2) the coding mode of the image block is inter mode, and the component to be coded is a chrominance component.
  • the above preset condition 1) and preset condition 2) have the same limitation on the encoding mode of the image block, but have different limitations on the component to be encoded. The above preset conditions 1) and 2) may also have the same or different limitations in other aspects.
  • the following preset conditions respectively correspond to different interpolation filters:
  • 1) the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
  • 2) the coding mode of the image block is Affine mode, and the component to be coded is a luminance component or a chrominance component.
  • the above preset condition 1) and preset condition 2) have different restrictions on the encoding mode of the image block and the component to be encoded, and may have the same or different restrictions in other aspects.
  • the preset condition 1) may further include: the size of the image block is greater than a preset value.
  • the preset condition 2) may not limit the size of the image block (that is, any size is fine), or the size may be limited.
  • the preset condition a) includes: the encoding mode of the image block is the Affine mode, the component to be encoded is the luminance component; and the number of taps of the corresponding filter is 4. Under the preset condition a), the number of taps of the filter is 4 rather than a larger number such as 6 or 8, which can reduce bandwidth pressure.
  • the preset condition b) includes: the encoding mode of the image block is the inter mode, the component to be encoded is the luminance component; and the number of taps of the corresponding filter is 4 or 6.
  • the first preset condition may further include: the size of the image block is less than or equal to a preset value. Under the preset condition b), the number of taps of the filter is 4 or 6 instead of 8, etc., which can reduce the bandwidth pressure.
  • for example, in the intermediate size interval (less than or equal to the first preset value and greater than the second preset value), the number of taps of the interpolation filter used may be 6; when the size is less than or equal to the second preset value, the number of taps of the interpolation filter used is 4; and when the size is greater than the first preset value, the number of taps of the interpolation filter used is 8.
  • the number of taps of the interpolation filter used is 4, and when the size is greater than the first preset value, the number of taps of the interpolation filter used is 6 or 8.
  • the preset condition c) includes: the encoding mode of the image block is the inter mode, the component to be encoded is the luminance component; and the number of taps of the corresponding filter is 8.
  • the first preset condition may further include: the size of the image block is greater than a preset value.
  • the preset condition d) includes: the encoding mode of the image block is the inter mode, the component to be encoded is the chrominance component; and the number of taps of the corresponding filter is 4.
  • the preset condition e) includes: the encoding mode of the image block is the Affine mode, the component to be encoded is the chrominance component; and the number of taps of the corresponding filter is 4.
  • the number of MVs is limited to two, or the image frame is limited to double forward B-frames.
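  • the preset conditions a) to e) above can be read as a rule that maps the encoding mode, the component to be encoded and the block size to a tap count. The following is a minimal sketch of one such rule, assuming the three size intervals mentioned above; the function name `select_tap_count`, the string labels and the threshold values 64 and 16 are hypothetical, since the application does not fix the preset values or how size is measured.

```python
def select_tap_count(mode, component, size, first_preset=64, second_preset=16):
    """Pick an interpolation-filter tap count for one image block.

    mode:      "inter" or "affine"
    component: "luma" or "chroma"
    size:      block size in whatever unit the presets use (an assumption)
    """
    if mode not in ("inter", "affine"):
        raise ValueError(f"unsupported mode: {mode!r}")
    if component not in ("luma", "chroma"):
        raise ValueError(f"unsupported component: {component!r}")

    # Conditions d) and e): chrominance uses 4 taps in both modes.
    if component == "chroma":
        return 4
    # Condition a): Affine-coded luminance uses 4 taps.
    if mode == "affine":
        return 4
    # Conditions b) and c): inter-coded luminance depends on the size interval.
    if size > first_preset:
        return 8
    if size > second_preset:
        return 6
    return 4
```

  • for instance, with these hypothetical thresholds an inter-coded luminance block of size 128 would use 8 taps, one of size 32 would use 6 taps, and one of size 16 would use 4 taps.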
  • the size of the image block may be positively correlated with the number of taps of the interpolation filter used; that is, the smaller the image block, the fewer the taps. This is because the smaller the size of the image block, the greater the number of image blocks obtained by dividing the image frame, and hence, for the entire image frame, the larger the number of pixels required for interpolation processing and the greater the pressure on the bandwidth. Therefore, when the size of the image block is small, an interpolation filter with a smaller number of taps can be used, which can reduce bandwidth pressure.
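  • the bandwidth argument above can be made concrete with a small calculation. The sketch below assumes a separable N-tap filter applied in both directions, a motion vector with a fractional part both horizontally and vertically, a single reference block, and no reuse of fetched samples between blocks; under these assumptions a W x H block needs (W + N - 1) x (H + N - 1) reference samples. The function names are hypothetical.

```python
def reference_samples(width, height, taps):
    """Reference samples fetched to interpolate one width x height block."""
    return (width + taps - 1) * (height + taps - 1)


def samples_per_pixel(width, height, taps):
    """Fetched reference samples per predicted pixel (bandwidth overhead)."""
    return reference_samples(width, height, taps) / (width * height)
```

  • under these assumptions a 4x4 block with an 8-tap filter fetches 121 samples (about 7.56 per predicted pixel), the same block with a 4-tap filter fetches 49 samples (about 3.06 per pixel), and a 16x16 block with an 8-tap filter fetches 529 samples (about 2.07 per pixel), which illustrates why small blocks benefit most from fewer taps.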
  • the interpolation filter is taken as an example above, and it is mentioned that different interpolation filters may correspond to different preset conditions or different preset conditions may correspond to different interpolation filters.
  • different interpolation methods may also correspond to different preset conditions, or different preset conditions may correspond to different interpolation methods.
  • the specific implementation can refer to the above description of the interpolation filter, and specifically, the above interpolation filter can be replaced with an interpolation method.
  • different interpolation methods may include different interpolation filters.
  • the interpolation methods used for motion estimation and/or motion compensation may also be the same.
  • these preset conditions may be different in at least one of the following aspects:
  • the encoding mode of the image block, the interval in which the size of the image block is located, the components to be encoded of the image block, and the number of MVs of the image block.
  • for example, the preset condition in which the component to be coded is a luminance component (referred to as the preset condition A) and the preset condition in which the component to be coded is a chrominance component (referred to as the preset condition B) limit different components to be encoded, but the interpolation methods corresponding to the two preset conditions may be the same.
  • the same interpolation method can be the same interpolation filter.
  • the number of taps of the interpolation filter can be 4, and the filter is used to interpolate at 1/16-pixel precision.
  • in addition to respectively limiting the components to be encoded, the preset condition A) and the preset condition B) may have the same limitation or different limitations in other aspects.
  • for example, the preset condition A) and the preset condition B) both limit the encoding mode of the image block to the inter mode.
  • for example, the preset condition A) and the preset condition B) both limit the coding mode of the image block to the Affine mode.
  • for example, the preset condition A) limits the encoding mode of the image block to the inter mode, and the preset condition B) limits the encoding mode of the image block to the Affine mode.
  • for example, the preset condition A) limits the encoding mode of the image block to the Affine mode, and the preset condition B) limits the encoding mode of the image block to the inter mode.
  • the image block includes a luminance component and a chrominance component; the luminance component and chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation.
  • the same interpolation method may be used for motion estimation and/or motion compensation.
  • the use of the same interpolation method mentioned here may mean that the number of taps and/or interpolation coefficients of the interpolation filter used are the same.
  • the number of taps of the interpolation filter used for motion estimation and/or motion compensation of the luminance component and chrominance component of the image block is 4, and the filter is used to interpolate at 1/16-pixel precision.
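  • to illustrate what applying the same 4-tap interpolation to both components at 1/16-pixel positions can look like, the sketch below uses Catmull-Rom cubic weights. This is a swapped-in, well-known 4-tap kernel chosen only for illustration; the application does not specify filter coefficients, and these are not the coefficients of any video coding standard. The function names, the one-dimensional sample rows and the edge-replication padding are likewise assumptions.

```python
def catmull_rom_weights(phase, denom=16):
    """Four Catmull-Rom weights for the fractional offset phase/denom."""
    if not 0 <= phase < denom:
        raise ValueError("phase must satisfy 0 <= phase < denom")
    t = phase / denom
    return (
        (-t**3 + 2 * t**2 - t) / 2,
        (3 * t**3 - 5 * t**2 + 2) / 2,
        (-3 * t**3 + 4 * t**2 + t) / 2,
        (t**3 - t**2) / 2,
    )


def interpolate(samples, pos, phase):
    """Value at position pos + phase/16 in a 1-D row of samples.

    The four taps cover indices pos-1 .. pos+2; indices outside the row
    are clamped (edge replication, an assumption of this sketch).
    """
    weights = catmull_rom_weights(phase)
    last = len(samples) - 1
    taps = [samples[min(max(pos + k, 0), last)] for k in (-1, 0, 1, 2)]
    return sum(w * s for w, s in zip(weights, taps))


# The same function serves both components, which is the point of the sketch.
luma_row = [0, 10, 20, 30]
chroma_row = [128, 128, 128, 128]
luma_half = interpolate(luma_row, 1, 8)      # halfway between 10 and 20
chroma_half = interpolate(chroma_row, 1, 8)  # flat row stays flat
```

  • because the weights sum to 1 for every phase, a flat row is reproduced exactly, and a linear ramp is interpolated to the expected midpoint.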
  • the luminance component and the chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation:
  • the specific identification bit mentioned here may be the first identification bit mentioned below, or the first identification bit with a specific value.
  • the luminance component and the chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation.
  • the second identification bit can be used to indicate whether the first identification bit exists in the code stream.
  • when the second identification bit indicates that the current frame is a B frame, the first identification bit exists in the code stream.
  • the luminance component and the chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation.
  • the first identification bit may have two values: longtype and shorttype.
  • the longtype indicates that the number of taps of the interpolation filter used may be greater than the number of taps of the interpolation filter indicated by the shorttype.
  • for example, when the value of the first identification bit is longtype, the number of taps of the interpolation filter is 8; when the value of the first identification bit is shorttype, the number of taps of the interpolation filter is 4 or 6.
  • the first flag indicates the long type or the short type
  • the luminance component and the chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation.
  • the coding mode is inter mode or Affine mode.
  • the coding mode here may mean that the coding mode of the luminance component is inter mode or Affine mode, or that the coding mode of the chrominance component is inter mode or Affine mode; or that the coding mode of the luminance component and the coding mode of the chrominance component are each inter mode or Affine mode, where the coding mode of the luminance component and the coding mode of the chrominance component may be the same or different.
  • the size of the image block is greater than a preset value.
  • when the coding modes of the luminance component and the chrominance component of the image block are both inter mode, the luminance component and the chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation.
  • the same interpolation method for the luminance component and the chrominance component of the image block for motion estimation and/or motion compensation may also have other restrictions, which are not specifically limited in the embodiment of the present application.
  • when the coding modes of the luminance component and the chrominance component of the image block are both Affine mode, the luminance component and the chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation.
  • the same interpolation method for the luminance component and the chrominance component of the image block for motion estimation and/or motion compensation may also have other restrictions, which are not specifically limited in the embodiment of the present application.
  • the encoding end may write an identification bit in the code stream, and the identification bit is used to indicate that the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
  • correspondingly, the decoding end obtains the identification bit from the code stream, and the identification bit is used to indicate that the luminance component and the chrominance component of the image block adopt the same interpolation method for motion estimation and/or motion compensation.
  • the code stream may not have an identification bit, but the encoding end and the decoding end adopt the same method to select the interpolation filter.
  • the encoding end may write an identification bit in the code stream, and the identification bit may indicate whether the interpolation mode (or interpolation filter) corresponding to each preset condition is the same.
  • the decoding end can obtain the identification bit from the code stream to determine whether the interpolation methods (or interpolation filters) corresponding to each preset condition are the same.
  • the identification bit is used to indicate whether the interpolation method corresponding to the preset condition in which the component to be coded is a luminance component is the same as the interpolation method corresponding to the preset condition in which the component to be coded is a chrominance component.
  • the identification bit can be carried in the sequence header, the frame header, or the slice header.
  • the encoding end may add a first identification bit to the code stream, and the first identification bit is used to indicate whether one of the multiple interpolation filters is to be selected for motion estimation and/or motion compensation (that is, whether the solution of this application is applied).
  • the decoding end can obtain the first identification bit from the code stream to determine that an interpolation filter needs to be selected from the various interpolation filters for motion estimation and/or motion compensation.
  • the first identification bit is used to indicate that a filter with one tap quantity is selected from filters with multiple tap quantities for use in motion estimation and/or motion compensation.
  • the filters with multiple tap quantities include a first filter and a second filter; the first filter has 8 taps, and the second filter has 6 or 4 taps.
  • the number of taps of the second filter is the same as the number of taps of the filter used for the chrominance component; the first identification bit is used to indicate the selection of the first filter or the second filter.
  • the first interpolation filter and the second interpolation filter may be candidate interpolation filters for encoding the luminance component, but the embodiment of the present application is not limited thereto.
  • the first identification bit may have two values: longtype and shorttype.
  • the longtype indicates that the number of taps of the interpolation filter used may be greater than the number of taps of the interpolation filter indicated by the shorttype.
  • for example, when the value of the first identification bit is longtype, the selected interpolation filter can be the first interpolation filter, whose number of taps is 8; when the value of the first identification bit is shorttype, the selected interpolation filter may be the second interpolation filter, whose number of taps is 4 or 6.
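  • the mapping from the value of the first identification bit to a tap count can be sketched as follows. The enum `FilterType`, the function `luma_tap_count` and the parameter `short_taps` are hypothetical names; how longtype and shorttype are actually binarized in the code stream is not specified in the text and is not modelled here.

```python
from enum import Enum


class FilterType(Enum):
    """The two values of the first identification bit named in the text."""
    LONGTYPE = "longtype"
    SHORTTYPE = "shorttype"


def luma_tap_count(first_flag, short_taps=4):
    """Tap count of the selected filter for a given first identification bit.

    short_taps is the tap count of the second (short) filter; the text
    allows 4 or 6, and which one applies is a configuration assumption.
    """
    if short_taps not in (4, 6):
        raise ValueError("the short filter has 4 or 6 taps")
    if first_flag is FilterType.LONGTYPE:
        return 8  # first interpolation filter
    if first_flag is FilterType.SHORTTYPE:
        return short_taps  # second interpolation filter
    raise ValueError(f"unknown flag value: {first_flag!r}")
```

  • choosing `short_taps=4` matches the case in which the second filter has the same number of taps as the filter used for the chrominance component in the examples above.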
  • the first identification bit can be carried in the sequence header, the frame header, and the Slice header, and specifically can be carried in the Slice_type in the Slice header.
  • the second identification bit can be carried in the sequence header, the frame header, and the slice header.
  • a frame of image can have one or more slices, and each slice has its own slice header.
  • the slice header can use "slice type" to identify whether the current slice is I_SLICE, B_SLICE or P_SLICE.
  • I_SLICE can use only intra prediction;
  • P_SLICE can use intra prediction or forward prediction;
  • B_SLICE can use intra prediction, forward prediction, bidirectional prediction, backward prediction or dual forward prediction.
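  • the slice types and their permitted prediction modes listed above can be captured in a small table. The string labels and the function name `is_prediction_allowed` are hypothetical and chosen only for this sketch.

```python
# Prediction modes each slice type may use, as listed in the text.
ALLOWED_PREDICTIONS = {
    "I_SLICE": {"intra"},
    "P_SLICE": {"intra", "forward"},
    "B_SLICE": {"intra", "forward", "bidirectional", "backward", "dual_forward"},
}


def is_prediction_allowed(slice_type, prediction):
    """Return True if the given slice type may use the prediction mode."""
    if slice_type not in ALLOWED_PREDICTIONS:
        raise ValueError(f"unknown slice type: {slice_type!r}")
    return prediction in ALLOWED_PREDICTIONS[slice_type]
```

  • for example, `is_prediction_allowed("P_SLICE", "bidirectional")` evaluates to `False`, while the same query for `"B_SLICE"` evaluates to `True`.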
  • the first identification bit may be an identification bit independent of the slice type.
  • the Slice header indicates that the slice is B_SLICE
  • the first flag indicates to select an interpolation filter from a variety of interpolation filters
  • the interpolation filter may not be selected from the multiple interpolation filters according to the solution of this application (that is, the solution of this application is not applied); for example, a preset interpolation filter can be used.
  • the Slice header indicates that the slice is P_SLICE
  • the first flag indicates to select an interpolation filter from a variety of interpolation filters
  • the slice is processed in the manner of B_SLICE (for example, bidirectional prediction, dual forward prediction, or dual backward prediction is used; that is, the current slice is processed as B_SLICE)
  • the first flag indicates that no interpolation filter is selected from a variety of interpolation filters
  • a second identification bit may also exist in the code stream, and the second identification bit may be used to indicate whether the code stream has the first identification bit.
  • the second identification bit here may be slice type.
  • when the slice type indicates that the current frame is a B frame, the first identification bit exists in the code stream; otherwise, the first identification bit does not exist.
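  • the conditional presence of the first identification bit can be sketched as a tiny header parser. Everything about the bit layout here is hypothetical: the 2-bit slice type code, the code values in `SLICE_TYPES`, and the position of the flag are assumptions made only to show that the first identification bit is read when, and only when, the slice type indicates a B frame.

```python
# Hypothetical 2-bit slice type codes; not taken from any standard.
SLICE_TYPES = {0: "I_SLICE", 1: "P_SLICE", 2: "B_SLICE"}


def parse_slice_header(bits):
    """Parse (slice_type, first_flag) from an iterable of 0/1 values.

    first_flag is None when the first identification bit is absent.
    """
    reader = iter(bits)
    try:
        code = (next(reader) << 1) | next(reader)
        if code not in SLICE_TYPES:
            raise ValueError(f"reserved slice type code: {code}")
        slice_type = SLICE_TYPES[code]
        # The slice type acts as the second identification bit: the first
        # identification bit is present only for B slices.
        first_flag = next(reader) if slice_type == "B_SLICE" else None
    except StopIteration:
        raise ValueError("bitstream ended before the header was complete") from None
    return slice_type, first_flag
```

  • with this layout, the bits `1, 0, 1` describe a B slice whose first identification bit is 1, while for the bits `0, 1` (a P slice) no flag is read.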
  • the first identification bit can also be multiplexed with slice type, especially the slice type indicating P_SLICE.
  • for example, when the slice type is P_SLICE, an interpolation filter can be selected from the multiple interpolation filters according to the solution of this application, and the slice is processed in the manner of B_SLICE (for example, using bidirectional prediction, dual forward prediction, or dual backward prediction; that is, the current slice is processed as B_SLICE).
  • for B_SLICE, in addition to the slice_type, there may also be a first identification bit in the slice header.
  • the interpolation filter may not be selected from the multiple interpolation filters according to the solution of this application; for example, a preset interpolation filter can be used.
  • the first identification bit exists in at least two of the sequence header, the frame header, and the slice header.
  • when the first identification bit in the sequence header indicates that the solution of this application must be applied or must not be applied, there may be no first identification bit in the frame header or slice header, and the solution of this application is applied or not applied to all frames or slices of the sequence.
  • when the first identification bit in the sequence header indicates that the solution of this application may be applied (that is, application is optional, and each frame or slice may or may not apply it), a first identification bit may exist in the frame header or slice header to indicate whether the current frame or slice applies the solution of this application.
  • the first identification bit in the frame header or slice header indicates whether the current frame or slice applies the solution of this application.
  • for example, when the first identification bit in the sequence header indicates that no frame needs to apply the solution of this application, the first identification bit no longer exists in the frame header or the slice header, and the solution of this application is not applied.
  • when the first identification bit in the sequence header indicates that each frame needs to apply the solution of this application, the first identification bit no longer exists in the frame header or the slice header, and the solution of this application is applied.
  • when the first identification bit in the sequence header indicates that each frame may or may not apply the solution of this application, the first identification bit in the frame header or slice header indicates whether the current frame or slice applies the solution of this application.
  • the first identification bit may exist in both the frame header and the slice header, or only the first identification bit may exist in the frame header.
  • when the first identification bit in the frame header indicates that the solution of this application must be applied or must not be applied, the first identification bit may be absent from the slice header, and the solution of this application is applied or not applied to all slices of the frame.
  • there may be a first identification bit in the slice header indicating whether the slice is applicable to the solution of this application.
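  • the layered signalling across the sequence header, frame header and slice header described above can be sketched as a resolution function. The three-valued enum `Signal`, the function `solution_applies` and the use of `None` for an absent flag are hypothetical modelling choices; the text does not define a concrete syntax for these flags.

```python
from enum import Enum
from typing import Optional


class Signal(Enum):
    MUST = "must"          # the solution is applied to everything below
    MUST_NOT = "must_not"  # the solution is applied to nothing below
    OPTIONAL = "optional"  # the decision is deferred to the next level


def solution_applies(
    seq_flag: Signal,
    frame_flag: Optional[Signal] = None,
    slice_flag: Optional[bool] = None,
) -> bool:
    """Decide whether the current slice applies the solution.

    A decisive flag at a higher level makes lower-level flags unnecessary;
    None means the flag is absent at that level.
    """
    for flag in (seq_flag, frame_flag):
        if flag is Signal.MUST:
            return True
        if flag is Signal.MUST_NOT:
            return False
    # Sequence level is optional and frame level is optional or absent:
    # the slice header must carry the decision.
    if slice_flag is None:
        raise ValueError("a slice-level flag is required when higher levels defer")
    return slice_flag
```

  • for example, `solution_applies(Signal.OPTIONAL, Signal.MUST_NOT)` evaluates to `False` without any slice-level flag, while `solution_applies(Signal.OPTIONAL, None, True)` models the case in which only the slice header carries the first identification bit.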
  • the embodiments of this application can be used in the LDB mode.
  • the memory bandwidth consumption of LDB can be reduced, so that the LDB mode brings no additional bandwidth pressure compared with LDP while having better compression performance than LDP.
  • the above describes how to select an interpolation filter or an interpolation method.
  • the embodiments of the present application can also be used to select between unidirectional prediction and bidirectional prediction, and to select whether motion estimation and/or motion compensation uses integer-pixel precision (no interpolation filter is required) or sub-pixel precision (an interpolation filter is required).
  • the two preset conditions may also be different in at least one of the following aspects:
  • the encoding mode of the image block, the interval in which the size of the image block is located, the components to be encoded of the image block, and the number of MVs of the image block.
  • the preset conditions corresponding to the motion estimation and/or motion compensation mode with integer pixel accuracy may include: the encoding mode is inter mode, the size of the image block is less than or equal to the preset value, and the number of MVs of the image block is greater than or equal to two.
  • the preset condition may also limit other factors, which are not limited in the embodiment of the present application.
  • the preset conditions corresponding to the sub-pixel precision motion estimation and/or motion compensation mode may further include multiple types, corresponding to multiple interpolation filters; in other words, the sub-pixel precision motion estimation and/or motion compensation mode may have multiple interpolation filters, and the preset conditions corresponding to each interpolation filter may be different.
  • the prediction mode of unidirectional prediction and a prediction mode of bidirectional prediction may be selected according to a preset condition.
  • the two preset conditions may also be different in at least one of the following aspects:
  • the encoding mode of the image block, the interval in which the size of the image block is located, the components to be encoded of the image block, and the number of MVs of the image block.
  • the preset conditions corresponding to the prediction mode of unidirectional prediction may include: the coding mode is inter mode, the size of the image block is less than or equal to the preset value, and the number of MVs of the image block is greater than or equal to 2.
  • the preset condition may also limit other factors, which are not limited in the embodiment of the present application.
  • the preset conditions corresponding to the prediction mode of the bidirectional prediction may include multiple types, corresponding to multiple interpolation filters; in other words, the prediction mode of the bidirectional prediction may have multiple interpolation filters, and the preset conditions corresponding to each interpolation filter may be different.
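  • the two selections described above share the same example preset condition (inter mode, block size at most a preset value, and at least two MVs), so they can be sketched with one predicate. The function names and the default threshold 16 are hypothetical; the application leaves the preset value open, and treats the two selections as separate embodiments rather than requiring both at once.

```python
def matches_example_condition(mode, size, num_mvs, preset_size=16):
    """Example preset condition from the text: inter, small block, >= 2 MVs."""
    return mode == "inter" and size <= preset_size and num_mvs >= 2


def use_integer_pixel_precision(mode, size, num_mvs, preset_size=16):
    """True if the block would use integer-pixel precision (no interpolation)."""
    return matches_example_condition(mode, size, num_mvs, preset_size)


def use_unidirectional_prediction(mode, size, num_mvs, preset_size=16):
    """True if the block would be restricted to unidirectional prediction."""
    return matches_example_condition(mode, size, num_mvs, preset_size)
```

  • both restrictions target the same worst case for memory bandwidth: a small block that would otherwise fetch interpolation margins for two reference blocks.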
  • FIG. 7 shows a schematic block diagram of a video processing device 200 according to an embodiment of the present application.
  • the device 200 may include a processor 210, and may further include a memory 220.
  • the device 200 may also include components commonly included in computer systems, such as input and output devices, communication interfaces, etc., which are not limited in the embodiment of the present application.
  • the memory 220 is used to store computer executable instructions.
  • the memory 220 may be various types of memory; for example, it may include a high-speed random access memory (Random Access Memory, RAM), and may also include a non-volatile memory, such as at least one disk memory, which is not limited in the embodiment of the present application.
  • the processor 210 is configured to access the memory 220 and execute the computer-executable instructions to perform operations in the method for video processing in the foregoing embodiment of the present application.
  • the processor 210 may include a microprocessor, a field-programmable gate array (Field-Programmable Gate Array, FPGA), a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), etc., which is not limited in the embodiment of the present application.
  • the video processing device 200 of the embodiment of the present application may correspond to the execution subject of the video processing method of the embodiment of the present application, and the foregoing and other operations and/or functions of the various modules of the video processing device 200 are used to implement the corresponding procedures of the foregoing methods; for brevity, details are not repeated here.
  • An embodiment of the present application also provides an electronic device, which may include the video processing device of the foregoing various embodiments of the present application.
  • the embodiment of the present application also provides a computer storage medium, and the computer storage medium stores program code, and the program code may be used to instruct the execution of the video processing method in the foregoing embodiment of the present application.
  • The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible.
  • For example, "A and/or B" can mean: A exists alone, both A and B exist, or B exists alone.
  • The character "/" herein generally indicates an "or" relationship between the objects before and after it.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • For example, the division of the units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Embodiments of the present application provide a video processing method and device, capable of effectively implementing an interpolation process in a motion estimation and/or motion compensation process. The method comprises: performing motion estimation and/or motion compensation on an image block of a target frame having multiple motion vectors (MVs) using an interpolation filter in a plurality of interpolation filters.

Description

Video processing method and device
Copyright notice
The content disclosed in this patent document contains material subject to copyright protection. The copyright belongs to the copyright owner. The copyright owner has no objection to the reproduction by anyone of the patent document or the patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical field
This application relates to the field of image processing, and more specifically, to a video processing method and device.
Background
Prediction is an important module of mainstream video coding frameworks. Prediction may include intra prediction and inter prediction.
Inter prediction broadly consists of motion estimation (ME) and motion compensation (MC). Motion estimation is the process of searching a reference frame for the current coding block of the current frame and, by comparison, obtaining a motion vector (MV). Motion compensation is the process of obtaining the prediction block of the current block from the MV and the reference block. The prediction block obtained by motion compensation may differ from the original current block, so the difference between the prediction block and the current block (the residual) is transformed, quantized, and so on, and then transmitted to the decoder. In addition, the MV and the reference frame information must be transmitted to the decoder so that the decoder can reconstruct the current frame.
Because the motion of natural objects is continuous, the motion vector of an object between two adjacent frames is not necessarily an integer number of pixels. Sub-pixel precision was introduced to improve motion vector precision. For example, in the High Efficiency Video Coding (HEVC) standard, motion estimation for the luminance component uses motion vectors with 1/4-pixel precision. However, digital video contains no sample values at sub-pixel positions. In general, to achieve 1/K-pixel precision, the values at these sub-pixel positions must be approximated by interpolation; that is, the reference frame is interpolated by a factor of K in the row and column directions, and the prediction block is searched for in the interpolated reference frame. Interpolating the current block requires the pixels in the current block and the pixels in its adjacent region.
How to implement this interpolation process effectively is a problem that urgently needs to be solved.
Summary
Embodiments of the present application provide a video processing method and device that can effectively implement the interpolation process in motion estimation and/or motion compensation.
In a first aspect, a video processing method is provided, including: performing motion estimation and/or motion compensation on an image block of a target frame that has multiple motion vectors (MVs), using an interpolation filter selected from among multiple kinds of interpolation filters.
In a second aspect, a video processing device is provided, including a processor configured to call code stored in a memory to perform the following operation:
performing motion estimation and/or motion compensation on an image block of a target frame that has multiple MVs, using an interpolation filter selected from among multiple kinds of interpolation filters.
In a third aspect, a computer system is provided, including: a memory configured to store computer-executable instructions; and a processor configured to access the memory and execute the computer-executable instructions to perform the operations of the method of the first aspect.
In a fourth aspect, a computer storage medium is provided. The computer storage medium stores program code, and the program code can be used to instruct execution of the method of the first aspect.
In a fifth aspect, a computer program product is provided. The program product includes program code, and the program code can be used to instruct execution of the method of the first aspect.
Therefore, in the embodiments of the present application, multiple kinds of interpolation filters are available for an image block having multiple MVs, and the interpolation filter can be selected flexibly, which reduces memory bandwidth pressure while maintaining coding performance.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a framework diagram of video coding according to an embodiment of the present application.
Fig. 2 is a schematic diagram of prediction modes according to an embodiment of the present application.
Fig. 3 is a schematic diagram of the interpolation process of an image block according to an embodiment of the present application.
Fig. 4 is a schematic diagram of the control points of the Affine mode according to an embodiment of the present application.
Fig. 5 is a schematic diagram of the motion vectors of a CU according to an embodiment of the present application.
Fig. 6 is a schematic flowchart of a video processing method according to an embodiment of the present application.
Fig. 7 is a schematic block diagram of a video processing device according to an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described below with reference to the drawings. Evidently, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
Unless otherwise specified, all technical and scientific terms used in the embodiments of the present application have the meanings commonly understood by those skilled in the technical field of the present application. The terms used in this application are only for describing specific embodiments and are not intended to limit the scope of this application.
As shown in FIG. 1, the video coding framework mainly includes intra prediction, inter prediction, transform, quantization, entropy coding, and loop filtering.
This application mainly improves the inter prediction part.
The general idea of inter prediction is to exploit the temporal correlation between adjacent video frames: a reconstructed frame is used as a reference frame, and the current frame is predicted through motion estimation (ME) and motion compensation (MC), thereby removing temporal redundancy from the video.
The current frame (or target frame) mentioned herein is the frame currently being encoded in the encoding scenario and the frame currently being decoded in the decoding scenario.
A reconstructed frame mentioned herein is a previously encoded frame in the encoding scenario and a previously decoded frame in the decoding scenario.
During encoding, a frame is not processed directly as a whole; it is usually divided into image blocks that are processed individually.
As an example, the frame is first divided into coding tree units (CTUs), for example of size 64×64 or 128×128 (in pixels), and each CTU can then be further divided into square or rectangular coding units (CUs). The encoding process operates on CUs.
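As a small illustration of the partitioning just described, the sketch below counts the CTUs needed to cover a frame. The function name `ctu_grid`, the 1920×1080 frame size, and the assumption that partial CTUs at the right and bottom edges still count as one CTU are illustrative and not taken from the source.

```python
import math

def ctu_grid(width, height, ctu_size=128):
    """Return (columns, rows) of CTUs needed to cover a width x height frame."""
    # Assumption: a CTU that only partly overlaps the frame still counts as one CTU.
    return math.ceil(width / ctu_size), math.ceil(height / ctu_size)

cols, rows = ctu_grid(1920, 1080, ctu_size=64)
print(cols, rows, cols * rows)  # 30 17 510
```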
Image block sizes mentioned herein are all in units of pixels.
The general flow of inter prediction is as follows.
For the current image block in the current frame (hereinafter, the current block), the most similar block in the reference frame is found and used as the prediction block of the current block. The relative displacement between the current block and the similar block is called the motion vector (MV). Motion estimation refers to the process of searching the reference frame for the current block of the current frame and, by comparison, obtaining the motion vector. Motion compensation refers to the process of obtaining the prediction block from the reference block and the motion vector obtained by motion estimation.
The prediction block obtained by inter prediction may differ from the original current block, so the difference between the prediction block and the current block can be computed; this difference is called the residual. The residual is transformed, quantized, and entropy coded to produce the coded bitstream.
At the encoder, after image encoding is complete, that is, after entropy coding has produced the bitstream, the bitstream and the coding mode information, such as the inter prediction mode and the motion vector information, can be stored or sent to the decoder.
At the decoder, after the entropy-coded bitstream is obtained, the bitstream is first entropy decoded to obtain the corresponding residual. The prediction block is then obtained from the decoded coding mode information, such as the motion vector. Finally, the value of each pixel in the current block is obtained from the residual and the prediction block, which reconstructs the current block, and so on until the current frame is reconstructed.
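The relationship between the residual and the reconstruction can be sketched as follows. This is a minimal illustration that leaves out transform, quantization, and entropy coding, so the round trip is lossless. The function names and the sign convention (current minus prediction) are assumptions.

```python
def compute_residual(current, prediction):
    """Element-wise residual; the sign (current - prediction) is an assumed convention."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, prediction)]

def reconstruct(prediction, residual):
    """Decoder side: add the residual back onto the prediction block."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]

current = [[52, 55], [61, 59]]
prediction = [[50, 54], [60, 60]]
residual = compute_residual(current, prediction)
print(residual)  # [[2, 1], [1, -1]]
assert reconstruct(prediction, residual) == current
```

In a real codec the residual is quantized, so the reconstruction only approximates the current block.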
As shown in FIG. 1, the encoding process may also include inverse quantization and inverse transform. Inverse quantization is the reverse of quantization, and inverse transform is the reverse of the transform.
Inter prediction may include forward prediction, backward prediction, bi-prediction, and so on.
In forward prediction, the current frame (for example, the frame labeled t in FIG. 2) is predicted from a previously reconstructed frame (which may be called a historical frame). In backward prediction, the current frame is predicted from a frame after it (which may be called a future frame). Bi-prediction may be bidirectional prediction, in which the current frame is predicted from both "historical frames" (for example, the frames labeled t-2 and t-1 in FIG. 2) and "future frames" (for example, the frames labeled t+2 and t+1 in FIG. 2). Bi-prediction may also be prediction in a single direction, for example predicting the current frame from two "historical frames" or from two "future frames".
In video coding and decoding, different frame types can be defined, and different frame types can support different kinds of inter prediction modes. There may be three frame types: "I frames", "B frames", and "P frames".
All image blocks in an "I frame" use intra coding and do not refer to information from other frames.
"B frames" are bidirectionally predicted frames. An image block in a B frame may use intra coding or inter coding. For a bidirectionally predicted B frame, the inter prediction mode of an image block may be forward prediction, backward prediction, or bidirectional prediction, so an inter-predicted block of a B frame may have a single MV or two MVs.
"Generalized B frames" (Generalized P and B picture, GPB) are a structure in HEVC that combines the characteristics of conventional B frames and P frames. A generalized B frame can use dual forward prediction, that is, two reference frames that are both "historical frames". A coding block of a generalized B frame may use intra mode, forward prediction mode, or dual forward prediction mode. An inter-predicted block of a generalized B frame may have a single MV or two MVs.
"P frames" are forward-predicted frames and use unidirectional prediction. A coding block in a P frame may use intra prediction mode or forward prediction mode. Because a P frame is a unidirectionally predicted frame, every inter-predicted block of a P frame has a single MV.
The above frame types can be combined in specific ways to obtain several different coding configurations.
For example, HEVC may use four coding configurations: All Intra (AI), Random Access (RA), Low Delay B (LDB), and Low Delay P (LDP).
In the AI configuration, all frames are I frames (I I I I I I I I ...).
In the RA configuration, most frames are B frames and I frames are inserted periodically (roughly once per second), so the frame structure is I B B B ... B B B I B B B ... B B B I ....
In the LDB configuration, only the first frame is an I frame and the rest are coded as generalized B frames; the frame structure is I B B B B B .... In the LDP configuration, only the first frame is an I frame and the rest are coded as P frames; the frame structure is I P P P P P ....
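The four frame structures above can be written out with a short sketch. The function `frame_types` and its `intra_period` parameter are hypothetical. The text only says that I frames are inserted roughly once per second in RA, so the period value used here is an assumption.

```python
def frame_types(config, num_frames, intra_period=32):
    """Return the frame-type sequence for one of the four configurations."""
    if config == "AI":                       # every frame is an I frame
        return ["I"] * num_frames
    if config == "RA":                       # periodic I frames, B frames elsewhere
        return ["I" if n % intra_period == 0 else "B" for n in range(num_frames)]
    if config == "LDB":                      # first frame I, rest generalized B
        return (["I"] + ["B"] * num_frames)[:num_frames]
    if config == "LDP":                      # first frame I, rest P
        return (["I"] + ["P"] * num_frames)[:num_frames]
    raise ValueError(f"unknown configuration: {config}")

print(" ".join(frame_types("RA", 6, intra_period=4)))  # I B B B I B
print(" ".join(frame_types("LDP", 5)))                 # I P P P P
```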
Inter prediction in HEVC may include three modes: inter mode (also called AMVP mode), merge mode, and skip mode.
In inter mode, a motion vector prediction (MVP) is determined first. The starting point of motion estimation is then determined from the MVP, and a motion search is performed around the starting point. When the search finishes, the optimal MV is obtained. The MV determines the position of the reference block in the reference image. The difference between the reference block and the current block gives the residual block, and subtracting the MVP from the MV gives the motion vector difference (MVD), which is transmitted to the decoder in the bitstream.
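The MVD relationship in inter mode can be sketched as follows: the encoder sends MVD = MV - MVP, and the decoder recovers MV = MVP + MVD. The function names and the numeric values are illustrative assumptions.

```python
def mv_difference(mv, mvp):
    """Encoder side: MVD = MV - MVP, per component."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])

def recover_mv(mvp, mvd):
    """Decoder side: MV = MVP + MVD, per component."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

mvp = (12, -4)   # predictor (illustrative values, e.g. in quarter-pixel units)
mv = (14, -7)    # MV found by the motion search
mvd = mv_difference(mv, mvp)
print(mvd)  # (2, -3)
assert recover_mv(mvp, mvd) == mv
```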
In Merge mode, the MVP is determined first and is used directly as the MV. To obtain the MVP, an MVP candidate list (merge candidate list) is first constructed. The list may include at least one candidate MVP, and each candidate MVP may have a corresponding index. After the encoder selects an MVP from the candidate list, it writes the index of that MVP into the bitstream, and the decoder uses the index to find the corresponding MVP in the candidate list and decode the image block.
To make Merge mode easier to understand, the operation flow of encoding in Merge mode is described below.
Step 1: Obtain the MVP candidate list.
Step 2: Select the optimal MVP from the MVP candidate list and obtain its index in the list.
Step 3: Use that MVP as the MV of the current block.
Step 4: Determine the position of the reference block (also called the prediction block) in the reference frame according to the MV.
Step 5: Obtain the residual data from the difference between the reference block and the current block.
Step 6: Pass the residual data and the index of the MVP to the decoder.
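The six steps above can be sketched as follows. This is a simplified illustration that assumes integer-pixel candidate MVs that stay inside the reference frame and a sum of absolute differences (SAD) as the selection criterion. Neither the criterion nor the helper names `fetch_block`, `sad`, and `merge_encode` come from the source.

```python
def fetch_block(frame, x, y, w, h):
    """Copy the w x h block whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def merge_encode(current, ref_frame, x, y, candidates):
    """Return (MVP index, residual) for the block located at (x, y)."""
    h, w = len(current), len(current[0])

    def cost(i):
        dx, dy = candidates[i]
        return sad(current, fetch_block(ref_frame, x + dx, y + dy, w, h))

    best = min(range(len(candidates)), key=cost)              # steps 1-2
    dx, dy = candidates[best]                                  # step 3: MV = MVP
    ref_block = fetch_block(ref_frame, x + dx, y + dy, w, h)   # step 4
    residual = [[c - r for c, r in zip(c_row, r_row)]          # step 5 (assumed sign)
                for c_row, r_row in zip(current, ref_block)]
    return best, residual                                      # step 6: what is signalled

ref = [[10 * yy + xx for xx in range(8)] for yy in range(8)]
cur = fetch_block(ref, 3, 2, 2, 2)          # content shifted one pixel to the right
print(merge_encode(cur, ref, 2, 2, [(0, 0), (1, 0), (0, 1)]))  # (1, [[0, 0], [0, 0]])
```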
It should be understood that the above flow is only one specific implementation of Merge mode. Merge mode may also be implemented in other ways.
For example, Skip mode is a special case of Merge mode. After the MV is obtained in Merge mode, if the encoder determines that the current block and the reference block are essentially the same, the residual data does not need to be transmitted. Only the index of the MV needs to be transmitted, and a flag may additionally be transmitted to indicate that the current block can be obtained directly from the reference block.
In other words, Merge mode is characterized by MV = MVP (MVD = 0), and Skip mode has one additional characteristic: the reconstructed value rec equals the predicted value pred (the residual resi = 0).
In real scenes, because the motion of natural objects is continuous, the motion vector of an object between two adjacent frames is not necessarily an integer number of pixels. The precision of motion estimation can therefore be raised to the sub-pixel level (also called 1/K-pixel precision). For example, in the HEVC standard, motion estimation for the luminance component uses motion vectors with 1/4-pixel precision.
However, digital video contains no sample values at 1/K-pixel positions. To achieve motion estimation with 1/K-pixel precision, the values at the 1/K-pixel positions are usually approximated by interpolation. In other words, the reference frame is interpolated by a factor of K in the row and column directions, and the search is performed in the interpolated image. Interpolating the current block requires the pixels in the current block and the pixels in its adjacent region.
As an example, the 1/4-pixel interpolation process is shown in FIG. 3. The 3 pixels to the left and the 4 pixels to the right outside the image block to be encoded can be used to generate the pixel values of the interpolated points. As shown in FIG. 3, for an image block of size 4×4, a_{0,0} and d_{0,0} are 1/4-pixel points, b_{0,0} and h_{0,0} are half-pixel points, and c_{0,0} and n_{0,0} are 3/4-pixel points. Suppose the current block is the 2×2 block bounded by A_{0,0} to A_{1,0} and A_{0,0} to A_{0,1}. Computing all the interpolated points in this 2×2 block requires some points outside the block: 3 to the left, 4 to the right, 3 above, and 4 below. The image block mentioned here may be an 8×8, 4×8, 4×4, or 8×4 image block, or an image block of another size, which is not specifically limited in the embodiments of the present application.
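The extra border pixels directly determine how many reference samples must be fetched for a block. The sketch below assumes an N-tap filter needs N - 1 extra samples per dimension, which matches the 3 + 4 border for 8 taps described above. The function name and the bandwidth interpretation are illustrative.

```python
def reference_samples(w, h, taps=8):
    """Reference samples needed to interpolate a w x h block with an N-tap filter."""
    extra = taps - 1          # e.g. 3 on one side plus 4 on the other for 8 taps
    return (w + extra) * (h + extra)

for w, h in [(2, 2), (4, 4), (8, 8)]:
    n = reference_samples(w, h)
    print(f"{w}x{h}: {n} samples, {n / (w * h):.2f} per predicted pixel")
# 2x2: 81 samples, 20.25 per predicted pixel
# 4x4: 121 samples, 7.56 per predicted pixel
# 8x8: 225 samples, 3.52 per predicted pixel
```

Under this assumption, smaller blocks fetch more reference samples per predicted pixel, and a filter with fewer taps fetches fewer (for example, `reference_samples(4, 4, taps=6)` gives 81).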
The interpolated pixel values at each point can be obtained from Formulas 1 to 21 below:
$a_{0,j} = \left(\sum_{i=-3}^{3} A_{i,j}\,\mathrm{qfilter}[i]\right) \gg (B-8)$  (Formula 1)
$b_{0,j} = \left(\sum_{i=-3}^{4} A_{i,j}\,\mathrm{hfilter}[i]\right) \gg (B-8)$  (Formula 2)
$c_{0,j} = \left(\sum_{i=-2}^{4} A_{i,j}\,\mathrm{qfilter}[1-i]\right) \gg (B-8)$  (Formula 3)
$d_{0,0} = \left(\sum_{j=-3}^{3} A_{0,j}\,\mathrm{qfilter}[j]\right) \gg (B-8)$  (Formula 4)
$h_{0,0} = \left(\sum_{j=-3}^{4} A_{0,j}\,\mathrm{hfilter}[j]\right) \gg (B-8)$  (Formula 5)
$n_{0,0} = \left(\sum_{j=-2}^{4} A_{0,j}\,\mathrm{qfilter}[1-j]\right) \gg (B-8)$  (Formula 6)
[Formulas 7 to 21 appear only as images in the source: Figure PCTCN2019091955-appb-000001 to Figure PCTCN2019091955-appb-000015.]
The interpolation process in the embodiments of the present application can be implemented by interpolation filters. The number of taps of an interpolation filter is the maximum number of pixel values that may be used to compute one interpolated sample. The coefficients of the 8-tap interpolation filters for the luminance component and of the 4-tap interpolation filters for the chrominance component may be as shown in Tables 1 and 2 below.
Table 1. Coefficients of the 8-tap interpolation filters
Position index i    -3    -2    -1     0     1     2     3     4
hfilter[i]          -1     4   -11    40    40   -11     4    -1
qfilter[i]          -1     4   -10    58    17    -5     1
Table 2. Coefficients of the 4-tap interpolation filters for the chrominance component
Position index i    -1     0     1     2
filter1[i]          -2    58    10    -2
filter2[i]          -4    54    16    -2
filter3[i]          -6    46    28    -4
filter4[i]          -4    36    36    -4
filter5[i]          -4    28    46    -6
filter6[i]          -2    16    54    -4
filter7[i]          -2    10    58    -2
In Table 2, filter1 holds the interpolation filter coefficients used at the 1/8-pixel position, filter2 those used at the 2/8-pixel position, filter3 those used at the 3/8-pixel position, and so on.
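As a hedged sketch of how the tabulated coefficients are applied, the following computes horizontal sub-pixel samples along one row in the manner of Formulas 1 to 3. It assumes that B denotes the sample bit depth. The helper names and the final normalization by 64 with rounding (`to_pixel`) are assumptions added so that the results are readable as pixel values.

```python
# Coefficients copied from Tables 1 and 2; each set sums to 64.
HFILTER = [-1, 4, -11, 40, 40, -11, 4, -1]   # half-pixel, i = -3..4
QFILTER = [-1, 4, -10, 58, 17, -5, 1]        # quarter-pixel, i = -3..3
CHROMA = {                                   # k/8-pixel positions, i = -1..2
    1: [-2, 58, 10, -2], 2: [-4, 54, 16, -2], 3: [-6, 46, 28, -4],
    4: [-4, 36, 36, -4], 5: [-4, 28, 46, -6], 6: [-2, 16, 54, -4],
    7: [-2, 10, 58, -2],
}

def filter_at(samples, x, coeffs, first, bit_depth=8):
    """Weighted sum over samples[x + first ...], shifted right by (B - 8)."""
    acc = sum(c * samples[x + first + k] for k, c in enumerate(coeffs))
    return acc >> (bit_depth - 8)

def to_pixel(value):
    """Assumed normalization: divide by 64 with rounding (not from the source)."""
    return (value + 32) >> 6

row = [10 * k for k in range(12)]                       # a linear ramp
quarter = filter_at(row, 4, QFILTER, first=-3)          # a: 1/4 position
half = filter_at(row, 4, HFILTER, first=-3)             # b: 1/2 position
three_q = filter_at(row, 4, QFILTER[::-1], first=-2)    # c: uses qfilter[1 - i]
print(to_pixel(quarter), to_pixel(half), to_pixel(three_q))  # 42 45 48
print(to_pixel(filter_at(row, 4, CHROMA[4], first=-1)))      # 45
```

On the ramp, the true values between samples 40 and 50 would be 42.5, 45, and 47.5, so the filtered results land close to them.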
Adaptive Motion Vector Resolution (AMVR) allows a CU to have motion vectors with integer-pixel precision or sub-pixel precision. Integer-pixel precision may be, for example, 1-pixel or 2-pixel precision. Sub-pixel precision may be, for example, 1/2-pixel, 1/4-pixel, 1/8-pixel, or 1/16-pixel precision.
AMVR may include AMVR in inter mode and AMVR in Affine mode.
In the HEVC standard, inter prediction considers only the conventional motion model (for example, translational motion). In the real world, however, there are many other kinds of motion, such as zooming, rotation, perspective motion, and other irregular motion. To account for these, the Affine technique was introduced in VTM-3.0.
As shown in FIG. 4, the motion field of an Affine-mode block can be derived from the motion vectors of two control points (four parameters, FIG. 4(a)) or three control points (six parameters, FIG. 4(b)).
Hereinafter, the MV of a control point (control point motion vector) is abbreviated as CPMV.
The processing unit in Affine mode is not the CU but the sub-blocks (sub-CUs) obtained by dividing the CU; each sub-CU may be 4×4 in size. In Affine mode, each sub-CU has one MV. Thus, unlike an ordinary CU, a CU in Affine mode has more than one MV: it has as many MVs as it has sub-CUs.
As an example, the MVs of the sub-CUs in a CU are derived from the CPMVs of the two or three control points shown in FIG. 4. For example, for the four-parameter Affine motion model, the MV of the sub-CU at position (x, y) is calculated by the following formula:
Figure PCTCN2019091955-appb-000016
As another example, for the six-parameter Affine motion model, the MV of the sub-CU located at (x, y) is calculated by the following formula:
Figure PCTCN2019091955-appb-000017
Here (mv0x, mv0y) is the MV of the top-left control point, (mv1x, mv1y) is the MV of the top-right control point, and (mv2x, mv2y) is the MV of the bottom-left control point.
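The two formulas are present in the source only as figure references. The sketch below uses the four- and six-parameter affine forms commonly given in descriptions of VTM, on the assumption that they correspond to those figures. The function names and the parameters `width` and `height` (the CU dimensions) are introduced for this example and do not come from the text above.

```python
def affine_mv_4param(x, y, mv0, mv1, width):
    """MV at (x, y) from the top-left (mv0) and top-right (mv1) CPMVs.

    Assumed form:
      mvx = (mv1x - mv0x)/W * x - (mv1y - mv0y)/W * y + mv0x
      mvy = (mv1y - mv0y)/W * x + (mv1x - mv0x)/W * y + mv0y
    """
    a = (mv1[0] - mv0[0]) / width
    b = (mv1[1] - mv0[1]) / width
    return (a * x - b * y + mv0[0], b * x + a * y + mv0[1])


def affine_mv_6param(x, y, mv0, mv1, mv2, width, height):
    """MV at (x, y) from the top-left, top-right and bottom-left CPMVs.

    Assumed form:
      mvx = (mv1x - mv0x)/W * x + (mv2x - mv0x)/H * y + mv0x
      mvy = (mv1y - mv0y)/W * x + (mv2y - mv0y)/H * y + mv0y
    """
    mvx = (mv1[0] - mv0[0]) / width * x + (mv2[0] - mv0[0]) / height * y + mv0[0]
    mvy = (mv1[1] - mv0[1]) / width * x + (mv2[1] - mv0[1]) / height * y + mv0[1]
    return (mvx, mvy)


# Integer-pixel CPMVs on a 16-wide CU, evaluated inside the block.
inner = affine_mv_4param(2, 2, (0.0, 0.0), (1.0, 0.0), 16)
print(inner)
```

Under these assumed formulas, the model reproduces the CPMVs at the control-point positions, and integer-pixel CPMVs can still yield fractional MVs at interior positions.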
After the above calculation, the motion vectors within a CU may be as shown in Figure 5, where each square represents a 4x4 sub-CU. All MVs computed with the above formulas are converted to a 1/16-precision representation; that is, the highest precision of a sub-CU MV is 1/16. Once the MV of each sub-CU has been calculated, the prediction block of each sub-CU is obtained through motion compensation. The sub-CUs of both the chroma component and the luma component are 4x4 in size, and the motion vector of a 4x4 chroma block is obtained by averaging the motion vectors of its four corresponding 4x4 luma blocks.
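The 1/16-precision representation and the chroma averaging described above can be sketched as follows. The function names and the exact rounding rules (Python's `round` for the conversion and `(sum + 2) >> 2` for the average) are assumptions for illustration; the text does not specify the rounding used.

```python
def to_sixteenth(mv_pixels):
    """Convert an MV given in pixels to integer units of 1/16 pixel."""
    return tuple(round(c * 16) for c in mv_pixels)


def average_chroma_mv(luma_mvs):
    """Average four luma sub-CU MVs (1/16-pixel units) into one chroma MV.

    The rounding rule (add 2, shift right by 2) is an assumption.
    """
    assert len(luma_mvs) == 4
    sum_x = sum(mv[0] for mv in luma_mvs)
    sum_y = sum(mv[1] for mv in luma_mvs)
    return ((sum_x + 2) >> 2, (sum_y + 2) >> 2)


print(to_sixteenth((0.125, -0.25)))
print(average_chroma_mv([(16, 0), (16, 0), (32, 0), (32, 0)]))
```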
Affine merge mode may be applied only to CUs whose width and height are both at least 8, and is similar to the ordinary merge mode described above. In this mode, MVs are first obtained from spatially and temporally neighboring blocks; this yields both the CPMVs of Affine-mode CUs and the MVs of traditional-mode blocks. These MVs are then combined into CPMVs to build a candidate list, and one combination is selected from the list as the CPMVs of the current block (a combination may contain two or three CPMVs, corresponding to two or three control points). No motion estimation is required, and only the index of the finally selected CPMVs is written into the bitstream (one index per CU). The inter prediction mode of a neighboring block may be the traditional inter prediction mode or Affine mode, so the MVs obtained from neighboring blocks may have integer-pixel or sub-pixel precision. Affine merge mode does not perform AMVR; that is, there is no adaptive MV precision decision, and an MV selected from a neighboring block keeps whatever precision it already has.
Affine Inter mode may be applied only to CUs whose width and height are both at least 16, and is similar to the AMVP mode described above. MVs are first obtained from spatially or temporally neighboring blocks to build a candidate list, and motion estimation is then performed on the CU as a whole to obtain the CPMVs. Motion compensation is performed per 4x4 sub-CU. Finally, the index of the selected CPMVs and the difference between them and the actual CPMVs of the current CU (the MVD, motion vector difference) are written into the bitstream. The precision of AMVR is essentially the precision of the MVD, that is, the precision of the CPMVs, not the MV precision of the sub-CUs.
For each CU that uses Affine AMVR (or AMVR; in some cases a CU may not use Affine AMVR), the encoder can adaptively decide the corresponding MV precision and write the decision into the bitstream to pass it to the decoder.
The integer-pixel or sub-pixel precision referred to in Affine AMVR is the precision of the CPMVs, not that of the sub-CUs. For example, the 1/16 precision, 1/4 precision, and integer-pixel precision mentioned for Affine AMVR refer to the precision of the CPMVs in Figure 4, not to the precision of the MVs actually used when the sub-CUs undergo motion compensation. With integer-pixel CPMVs, motion estimation is an integer-pixel process, yet the sub-CU MVs obtained from formulas 1) and 2) above may have 1/4 precision, so motion compensation will still involve sub-pixel positions.
The sub-pixel interpolation described above can put pressure on memory reads. The main reason is that, besides the samples of the current coding block, the interpolation process must also read neighboring samples in order to compute the sub-pixel values. With an 8-tap interpolation filter, for example, 7 additional samples outside the current block are needed in the horizontal direction and 7 in the vertical direction; if the current block has width W and height H, interpolation must read a (W+7)x(H+7) region. A 4-tap filter requires a (W+3)x(H+3) region, and a 6-tap interpolation filter requires a (W+5)x(H+5) region.
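The region sizes above follow the pattern (W + taps - 1) x (H + taps - 1). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def reference_area(width, height, taps):
    """Number of samples read to interpolate a width x height block
    with a separable filter of the given tap count."""
    return (width + taps - 1) * (height + taps - 1)


for taps in (8, 6, 4):
    print(taps, reference_area(8, 8, taps))
```

For an 8x8 block, going from 8 taps to 4 taps roughly halves the number of samples fetched (225 versus 121).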
This is especially true in LDB mode: compared with LDP mode, the presence of two MVs makes memory bandwidth consumption larger. If LDP is used instead, which allows only a single MV, coding performance falls well short of LDB.
To address this problem, this application proposes an image processing method and device that can reduce bandwidth pressure to some extent while preserving compression performance.
This application relates to the field of digital video coding, specifically to the inter prediction part of a video codec. It can be applied to codecs conforming to the international video coding standards H.264/HEVC and the Chinese AVS2 standard, as well as to codecs conforming to next-generation video coding standards such as VVC or AVS3.
Because this application concerns the inter prediction part of a video codec, the image processing method according to the embodiments of this application may be performed by an encoding device or by a decoding device.
Figure 6 is a schematic flowchart of a video processing method according to an embodiment of this application. The method includes at least part of the following.
In 110, an interpolation filter selected from multiple interpolation filters is used to perform motion estimation and/or motion compensation on an image block of the target frame that has at least one MV (specifically, possibly multiple MVs).
Specifically, multiple interpolation filters may be available to the video processing device. When performing motion estimation and/or motion compensation on the current image block, the device may select an interpolation filter from among them for that purpose.
In the embodiments of this application, for the same image block, the interpolation filter used for motion estimation and the one used for motion compensation may be the same or different.
Specifically, an interpolation filter may be selected once for the current image block and used for both motion estimation and motion compensation. Alternatively, for the current image block, one interpolation filter may be selected for motion estimation and another selection made for motion compensation.
In the embodiments of this application, the set of interpolation filters available for motion estimation may be identical to, partially identical to, or entirely different from the set available for motion compensation.
In the embodiments of this application, interpolation filters may differ in at least one of the following: the number of taps, the coefficients, and the shape of the filter (that is, the pixel positions the filter references).
In the embodiments of this application, the number of taps of an interpolation filter may be 2, 4, 6, 8, and so on.
In the embodiments of this application, different interpolation filters among the multiple interpolation filters correspond to different preset conditions.
Specifically, each of the multiple interpolation filters has a corresponding preset condition; when the preset condition of an interpolation filter is satisfied, that filter may be used for motion estimation and/or motion compensation.
For example, when the first preset condition, which corresponds to a first interpolation filter (any one of the multiple interpolation filters), is satisfied, the first interpolation filter is used to perform motion estimation and/or motion compensation on the image block.
In the embodiments of this application, the preset conditions corresponding to different interpolation filters may differ in at least one of the following aspects:
the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
The coding modes of an image block mentioned in the embodiments of this application may include inter mode, Affine mode, Merge mode, and so on.
The image block sizes mentioned in the embodiments of this application may be divided into two or more intervals. For example, they may be divided into two intervals: greater than a preset value, and less than or equal to the preset value. As another example, they may be divided into three intervals: greater than a first preset value; less than or equal to the first preset value and greater than a second preset value; and less than or equal to the second preset value.
The components of an image block to be coded mentioned in the embodiments of this application may include the luma component, the chroma component, and so on.
The number of MVs of an image block mentioned in the embodiments of this application may be one, two, three, or more.
Optionally, in the embodiments of this application, one interpolation filter may correspond to one or more preset conditions; when any one of them is satisfied, the interpolation filter may be used for motion estimation and/or motion compensation. The preset conditions corresponding to one interpolation filter differ from one another in at least one of the following aspects:
the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
For example, the preset conditions corresponding to the first interpolation filter among the multiple interpolation filters include at least two of the following:
1) the coding mode of the image block is inter mode, and the component to be coded is the luma component;
2) the coding mode of the image block is inter mode, and the component to be coded is the chroma component;
3) the coding mode of the image block is Affine mode (affine motion compensated prediction), and the component to be coded is the chroma component;
4) the coding mode of the image block is Affine mode, and the component to be coded is the luma component.
The at least two conditions included in the preset conditions of the first interpolation filter may differ in coding mode or in the component to be coded.
It should be understood that the factors included in a single preset condition in the embodiments of this application are open-ended: besides the factors mentioned (the factors a preset condition includes are the factors it restricts), it may include or restrict other factors. When a preset condition includes multiple factors, all of those factors are restricted; factors it does not include are not restricted.
For example, preset condition 1) may further include that the size of the image block is less than or equal to a preset value.
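One way to picture an open-ended preset condition is as a record in which any unspecified factor is left unrestricted. The sketch below is an illustration only: the class and field names are hypothetical, and representing the block size as a single integer compared against `max_size` is an assumption, since the text does not fix how size is measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockInfo:
    mode: str        # e.g. "inter" or "affine"
    component: str   # "luma" or "chroma"
    size: int        # assumed scalar size measure
    mv_count: int


@dataclass(frozen=True)
class PresetCondition:
    # None means the factor is not restricted by this condition.
    mode: Optional[str] = None
    component: Optional[str] = None
    max_size: Optional[int] = None   # restricts size <= max_size
    mv_count: Optional[int] = None

    def matches(self, block: BlockInfo) -> bool:
        if self.mode is not None and block.mode != self.mode:
            return False
        if self.component is not None and block.component != self.component:
            return False
        if self.max_size is not None and block.size > self.max_size:
            return False
        if self.mv_count is not None and block.mv_count != self.mv_count:
            return False
        return True


# Condition 1) alone, then condition 1) with an added size restriction.
cond = PresetCondition(mode="inter", component="luma")
cond_small = PresetCondition(mode="inter", component="luma", max_size=16)
big_block = BlockInfo("inter", "luma", 64, 2)
print(cond.matches(big_block), cond_small.matches(big_block))
```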
Optionally, in the embodiments of this application, the preset conditions corresponding to different interpolation filters differ in at least one of the following aspects:
the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
If one interpolation filter corresponds to multiple preset conditions, the factor in which each of those conditions differs from the preset conditions of another interpolation filter need not be the same.
For example, interpolation filter 1 corresponds to preset condition A and preset condition B, and interpolation filter 2 corresponds to preset condition C. Preset condition A may differ from preset condition C in the coding mode of the image block, while preset condition B may differ from preset condition C in the component to be coded.
Optionally, in the embodiments of this application, different preset conditions correspond to different interpolation filters among the multiple interpolation filters. When one of the preset conditions is satisfied, the interpolation filter corresponding to it may be used to perform motion estimation and/or motion compensation on the image block.
Specifically, there may be multiple preset conditions, and the interpolation filters corresponding to the respective preset conditions may differ.
The different preset conditions differ in at least one of the following aspects:
the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
It should be understood that, in the embodiments of this application, saying that preset conditions differ in some aspect means that they restrict that aspect differently. For example, if preset condition A restricts the coding mode of the image block to inter mode and preset condition B restricts it to Affine mode, the two preset conditions differ in the coding mode of the image block.
Optionally, in the embodiments of this application, the following preset conditions correspond to different interpolation filters:
1) the coding mode of the image block is inter mode, the component to be coded is the luma component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
2) the coding mode of the image block is inter mode, the component to be coded is the luma component, and the size of the image block is less than or equal to the second preset value.
Preset conditions 1) and 2) above restrict the coding mode and the component to be coded identically, but restrict the size of the image block differently.
It should be understood that different preset conditions may restrict the image block size differently while restricting the coding mode or the component to be coded identically (for example, both to Affine mode, or both to the chroma component); or they may restrict the image block size identically while restricting the coding mode or the component to be coded differently; or they may restrict the image block size differently and also restrict the coding mode or the component to be coded differently.
For example, the following preset conditions correspond to different interpolation filters:
1) the coding mode of the image block is Affine mode, the component to be coded is the luma component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
2) the coding mode of the image block is Affine mode, the component to be coded is the chroma component, and the size of the image block is less than or equal to the second preset value.
As another example, the following preset conditions correspond to different interpolation filters:
1) the coding mode of the image block is inter mode, the component to be coded is the luma component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
2) the coding mode of the image block is Affine mode, the component to be coded is the luma component, and the size of the image block is less than or equal to the second preset value.
As another example, the following preset conditions correspond to different interpolation filters:
1) the coding mode of the image block is Affine mode, the component to be coded is the luma component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
2) the coding mode of the image block is inter mode, the component to be coded is the luma component, and the size of the image block is less than or equal to the second preset value.
As another example, the following preset conditions correspond to different interpolation filters:
1) the coding mode of the image block is Affine mode, the component to be coded is the chroma component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
2) the coding mode of the image block is inter mode, the component to be coded is the chroma component, and the size of the image block is less than or equal to the second preset value.
Optionally, in the embodiments of this application, the following preset conditions correspond to different interpolation filters:
1) the coding mode of the image block is inter mode, and the component to be coded is the luma component;
2) the coding mode of the image block is inter mode, and the component to be coded is the chroma component.
Preset conditions 1) and 2) above restrict the coding mode of the image block identically and the component to be coded differently; they may also carry identical or different restrictions in other aspects.
Optionally, in the embodiments of this application, the following preset conditions correspond to different interpolation filters:
1) the coding mode of the image block is inter mode, and the component to be coded is the luma component;
2) the coding mode of the image block is Affine mode, and the component to be coded is the luma component or the chroma component.
Preset conditions 1) and 2) above restrict the coding mode of the image block and the component to be coded differently; in other aspects their restrictions may be the same or different.
For example, preset condition 1) may further include that the size of the image block is greater than a preset value. In that case, preset condition 2) may leave the size of the image block unrestricted (any size is allowed) or may restrict it.
The above compares only some example preset conditions; the embodiments of this application are not limited to these, and other preset conditions are possible.
For a clearer understanding of this application, the following describes the number of interpolation filter taps under various preset conditions.
In one implementation, preset condition a) includes: the coding mode of the image block is Affine mode, and the component to be coded is the luma component; the corresponding filter has 4 taps. Using 4 taps under preset condition a), rather than a larger number such as 6 or 8, reduces bandwidth pressure.
In one implementation, preset condition b) includes: the coding mode of the image block is inter mode, and the component to be coded is the luma component; the corresponding filter has 4 or 6 taps. This preset condition may further include that the size of the image block is less than or equal to a preset value. Using 4 or 6 taps under preset condition b), rather than 8, reduces bandwidth pressure.
For example, when the size is less than or equal to a first preset value and greater than a second preset value, the interpolation filter used may have 6 taps, and when the size is less than or equal to the second preset value, the interpolation filter used has 4 taps. When the size is greater than the first preset value, the interpolation filter used has 8 taps.
As another example, when the size is less than or equal to the first preset value, the interpolation filter used has 4 taps, and when the size is greater than the first preset value, the interpolation filter used has 6 or 8 taps.
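The first of the two examples above (three size intervals mapped to 8, 6, and 4 taps) can be sketched as a small selection function. The function name, the scalar size measure, and the threshold values used in the calls are assumptions for illustration; the text leaves the preset values open.

```python
def luma_inter_taps(size, first_preset, second_preset):
    """Tap count for an inter-mode luma block, by size interval.

    Assumes first_preset > second_preset.
    """
    if size > first_preset:
        return 8
    if size > second_preset:   # second_preset < size <= first_preset
        return 6
    return 4                   # size <= second_preset


# Hypothetical thresholds: first preset value 16, second preset value 8.
print([luma_inter_taps(s, 16, 8) for s in (32, 16, 9, 8)])
```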
In one implementation, preset condition c) includes: the coding mode of the image block is inter mode, and the component to be coded is the luma component; the corresponding filter has 8 taps. This preset condition may further include that the size of the image block is greater than a preset value.
In one implementation, preset condition d) includes: the coding mode of the image block is inter mode, and the component to be coded is the chroma component; the corresponding filter has 4 taps.
In one implementation, preset condition e) includes: the coding mode of the image block is Affine mode, and the component to be coded is the chroma component; the corresponding filter has 4 taps.
Preset conditions a) to e) above may also carry other restrictions; for example, each may restrict the number of MVs to two, or restrict the image frame to a double-forward B frame.
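Taken together, preset conditions a) to e) can be read as a lookup from coding mode, component, and size to a tap count. The sketch below is one possible reading, not the definitive mapping: the function name and the single `size_preset` threshold are hypothetical, and choosing 6 taps (rather than 4) for small inter-mode luma blocks under condition b) is an assumption.

```python
def select_taps(mode, component, size=None, size_preset=16):
    """Tap count under preset conditions a) to e) as read above."""
    if mode == "affine":
        return 4                      # a) luma and e) chroma
    if mode == "inter" and component == "chroma":
        return 4                      # d)
    if mode == "inter" and component == "luma":
        if size is not None and size > size_preset:
            return 8                  # c) larger blocks
        return 6                      # b) smaller blocks (4 is also allowed)
    raise ValueError("no preset condition matches")


print(select_taps("affine", "luma", 64))
print(select_taps("inter", "luma", 64))
print(select_taps("inter", "luma", 8))
print(select_taps("inter", "chroma", 8))
```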
Optionally, in the embodiments of this application, the number of taps of the interpolation filter may decrease with the size of the image block, so that smaller blocks use fewer taps. The smaller the image blocks, the more blocks the image frame is divided into, and the more samples interpolation needs to read over the whole frame, which puts greater pressure on bandwidth. When the image block is small, an interpolation filter with fewer taps can therefore be used to reduce bandwidth pressure.
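The effect can be made concrete with a rough count of samples read over a whole frame. The sketch below assumes square blocks, a frame that divides evenly into blocks, and that each block fetches its reference region independently with no sharing between neighboring blocks; the helper name is hypothetical.

```python
def frame_reference_samples(frame_w, frame_h, block, taps):
    """Total samples read to interpolate every block of a frame once."""
    blocks = (frame_w // block) * (frame_h // block)
    return blocks * (block + taps - 1) ** 2


print(frame_reference_samples(64, 64, 16, 8))  # 16x16 blocks, 8 taps
print(frame_reference_samples(64, 64, 8, 8))   # 8x8 blocks, 8 taps
print(frame_reference_samples(64, 64, 8, 4))   # 8x8 blocks, 4 taps
```

Under these assumptions, halving the block side with the same 8-tap filter raises the total from 8464 to 14400 samples, and switching the small blocks to 4 taps brings it down to 7744.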
可选地,在本申请实施例中,以上以插值滤波器为例,提到了不同的插值滤波器可以对应于不同的预设条件或不同的预设条件可以对应于不同的插值滤波器。但在本申请实施例中,也可以是不同的插值方式对应于不同的预设条件,或不同的预设条件可以对应于不同的插值滤波器。此时,具体的实现可以参考上文关于插值滤波器的描述,具体可以将上文的插值滤波器替换为插值方式。Optionally, in the embodiment of the present application, the interpolation filter is taken as an example above, and it is mentioned that different interpolation filters may correspond to different preset conditions or different preset conditions may correspond to different interpolation filters. However, in the embodiments of the present application, different interpolation methods may also correspond to different preset conditions, or different preset conditions may correspond to different interpolation filters. At this time, the specific implementation can refer to the above description of the interpolation filter, and specifically, the above interpolation filter can be replaced with an interpolation method.
本申请实施例中,不同的插值方式可以包括插值滤波器的不同。In the embodiment of the present application, different interpolation methods may include different interpolation filters.
In the embodiments of the present application, the interpolation method used for motion estimation and/or motion compensation may also be the same under several preset conditions. These preset conditions may differ in at least one of the following aspects:
the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be encoded, and the number of MVs of the image block.
For example, the interpolation method used for motion estimation and/or motion compensation is the same under a preset condition in which the component to be encoded is the luminance component (referred to as preset condition A)) and under a preset condition in which the component to be encoded is the chrominance component (referred to as preset condition B)).
Here, preset condition A) and preset condition B) specify different components to be encoded, yet the interpolation method corresponding to the two preset conditions may be the same. The same interpolation method may mean the same interpolation filter. The interpolation filter may have 4 taps and be used to interpolate pixels at 1/16-pixel positions.
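The idea of one interpolation method shared by both components can be sketched as follows. This is a minimal illustration, assuming a 4-tap filter with 16 fractional phases. The coefficients are derived from a Catmull-Rom cubic kernel scaled to 64, which is an assumption made for this example and not a coefficient set given in the application. The names `FILTER_TABLE` and `interpolate_row` are hypothetical.

```python
from fractions import Fraction

TAPS = 4      # tap count named in the text
PHASES = 16   # 1/16-pixel positions
SCALE = 64    # hypothetical fixed-point scale

def catmull_rom_weights(t):
    """Weights for the samples at offsets -1, 0, 1, 2 (assumed kernel)."""
    return [
        (-t**3 + 2 * t**2 - t) / 2,
        (3 * t**3 - 5 * t**2 + 2) / 2,
        (-3 * t**3 + 4 * t**2 + t) / 2,
        (t**3 - t**2) / 2,
    ]

def build_table():
    table = []
    for p in range(PHASES):
        w = [round(x * SCALE) for x in catmull_rom_weights(Fraction(p, PHASES))]
        w[1] += SCALE - sum(w)  # keep unity gain after rounding
        table.append(w)
    return table

# One table, used for the luminance and the chrominance component alike.
FILTER_TABLE = build_table()

def interpolate_row(samples, x_int, phase):
    """Interpolate at x_int + phase/16; sample indices are clamped at the edges."""
    acc = 0
    for k, c in enumerate(FILTER_TABLE[phase]):
        idx = min(max(x_int - 1 + k, 0), len(samples) - 1)
        acc += c * samples[idx]
    return (acc + SCALE // 2) // SCALE  # round to nearest
```

Calling `interpolate_row` with a luminance row or a chrominance row goes through the same table, which is the sense in which the two components share one interpolation filter in this sketch.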
Optionally, in the embodiments of the present application, preset condition A) and preset condition B) each specify the component to be encoded; in other aspects (for example, the number of MVs, the size of the image block, or the coding mode) they may have the same restriction or different restrictions.
For example, preset condition A) and preset condition B) each specify that the coding mode of the image block is the inter mode.
For example, preset condition A) and preset condition B) each specify that the coding mode of the image block is the Affine mode.
For example, preset condition A) specifies that the coding mode of the image block is the inter mode, and preset condition B) specifies that the coding mode of the image block is the Affine mode.
For example, preset condition A) specifies that the coding mode of the image block is the Affine mode, and preset condition B) specifies that the coding mode of the image block is the inter mode.
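The relationship between preset conditions and interpolation methods can be pictured as a lookup table in which two different conditions point at the same entry. The sketch below is only an illustration; the class names, field names, and string values are hypothetical, and only the component and the coding mode are modeled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PresetCondition:
    component: str                     # "luma" or "chroma"
    coding_mode: Optional[str] = None  # "inter", "affine", or None if unconstrained

@dataclass(frozen=True)
class InterpolationMethod:
    taps: int
    precision_denominator: int  # 16 means 1/16-pixel positions

# A single method object shared by both conditions.
SHARED = InterpolationMethod(taps=4, precision_denominator=16)

# Condition A) restricts luma to inter mode, condition B) restricts chroma to Affine mode.
CONDITION_A = PresetCondition("luma", "inter")
CONDITION_B = PresetCondition("chroma", "affine")

METHOD_FOR = {CONDITION_A: SHARED, CONDITION_B: SHARED}

def method_for(component, coding_mode):
    """Return the method mapped to the condition, or None if no condition matches."""
    return METHOD_FOR.get(PresetCondition(component, coding_mode))
```

Swapping the coding modes of `CONDITION_A` and `CONDITION_B`, or giving both the same mode, reproduces the other combinations listed above without changing the shared method.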
Optionally, in the embodiments of the present application, the image block includes a luminance component and a chrominance component, and the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
Specifically, for the luminance component and the chrominance component of the same image block, the same interpolation method may be used for motion estimation and/or motion compensation.
Using the same interpolation method here may mean that the interpolation filters used have the same number of taps and/or the same interpolation coefficients. Optionally, the interpolation filter used for motion estimation and/or motion compensation of the luminance component and the chrominance component of the image block has 4 taps and is used to interpolate pixels at 1/16-pixel positions.
In the embodiments of the present application, the luminance component and the chrominance component of the image block may use the same interpolation method for motion estimation and/or motion compensation when at least one of the following conditions is met:
1) The bitstream contains a specific flag; this condition is applicable to the decoding end. The specific flag may be the first flag described below, or the first flag with a specific value.
For example, when the bitstream contains the first flag, the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation. Whether the first flag is present in the bitstream may be indicated by a second flag: when the second flag indicates that the current frame is a B frame, the first flag is present in the bitstream.
For example, when the first flag takes a specific value, the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation. The first flag may take one of two values, long type and short type; the interpolation filter indicated by long type may have more taps than the one indicated by short type. For example, when the first flag is long type, the interpolation filter has 8 taps; when the first flag is short type, the interpolation filter has 4 or 6 taps. When the first flag indicates either long type or short type, the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
2) The coding mode is the inter mode or the Affine mode. This may mean that the coding mode of the luminance component is the inter mode or the Affine mode, or that the coding mode of the chrominance component is the inter mode or the Affine mode, or that the coding modes of both components are the inter mode or the Affine mode, in which case the coding mode of the luminance component and that of the chrominance component may be the same or different.
3) The size of the image block is greater than a preset value.
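Read literally, conditions 1) to 3) combine as a logical OR. The sketch below illustrates that reading only; the function name, the use of a single number for the block size, and the threshold parameter are assumptions, and an actual embodiment may combine the conditions differently.

```python
def share_interpolation(flag_present, coding_mode, block_size, size_threshold):
    """True if luma and chroma use the same interpolation method.

    flag_present   -- condition 1): the specific flag is found in the bitstream
    coding_mode    -- condition 2): "inter" or "affine" satisfies it
    block_size     -- condition 3): compared with size_threshold (hypothetical units)
    """
    return (
        flag_present
        or coding_mode in ("inter", "affine")
        or block_size > size_threshold
    )
```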
Optionally, in the embodiments of the present application, the coding modes of the luminance component and the chrominance component of the image block are both the inter mode.
Specifically, when the coding modes of the luminance component and the chrominance component of the image block are both the inter mode, the two components may use the same interpolation method for motion estimation and/or motion compensation. Using the same interpolation method in this case may also be subject to other restrictions, which are not specifically limited in the embodiments of the present application.
Optionally, in the embodiments of the present application, the coding modes of the luminance component and the chrominance component of the image block are both the Affine mode.
Specifically, when the coding modes of the luminance component and the chrominance component of the image block are both the Affine mode, the two components may use the same interpolation method for motion estimation and/or motion compensation. Using the same interpolation method in this case may also be subject to other restrictions, which are not specifically limited in the embodiments of the present application.
Optionally, in the embodiments of the present application, the encoding end may write a flag into the bitstream, the flag indicating that the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
The decoding end obtains the flag from the bitstream, the flag indicating that the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
Alternatively, the bitstream may carry no such flag, and the encoding end and the decoding end instead select the interpolation filter in the same way.
Optionally, the encoding end may write into the bitstream a flag indicating whether the interpolation methods (or interpolation filters) corresponding to the respective preset conditions are the same. The decoding end may obtain the flag from the bitstream to determine whether the interpolation methods (or interpolation filters) corresponding to the respective preset conditions are the same.
As an example, the flag indicates whether the interpolation method corresponding to the preset condition in which the component to be encoded is the luminance component is the same as the interpolation method corresponding to the preset condition in which the component to be encoded is the chrominance component.
The flag may be carried in the sequence header, the frame header, or the slice header.
Optionally, in the embodiments of the present application, the encoding end may add a first flag to the bitstream, the first flag indicating that one interpolation filter is to be selected from the multiple interpolation filters for motion estimation and/or motion compensation (that is, whether the solution of this application applies). Correspondingly, the decoding end may obtain the first flag from the bitstream to determine that an interpolation filter needs to be selected from the multiple interpolation filters for motion estimation and/or motion compensation.
Optionally, the first flag indicates that a filter with one particular number of taps is to be selected, from filters with different numbers of taps, for motion estimation and/or motion compensation.
As an example, the filters with different numbers of taps include a first filter and a second filter. The first filter has 8 taps; the second filter has 6 or 4 taps, or has the same number of taps as the filter used for the chrominance component. The first flag indicates whether the first filter or the second filter is selected.
In this case, the first filter and the second filter may be candidate interpolation filters for encoding the luminance component, but the embodiments of the present application are not limited thereto.
Optionally, the first flag may take one of two values, long type and short type; the interpolation filter indicated by long type may have more taps than the one indicated by short type. For example, when the first flag is long type, the selected interpolation filter may be the first filter, which has 8 taps; when the first flag is short type, the selected interpolation filter may be the second filter, which has 4 or 6 taps.
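The mapping from the first flag to a tap count can be sketched as below. The enum and its numeric values are hypothetical (the text does not state how long type and short type are coded), and whether the short filter has 4 or 6 taps is left as a parameter because the text allows either.

```python
from enum import Enum

class FilterType(Enum):
    SHORT = 0  # assumed coding of the flag value
    LONG = 1

def taps_for(flag, short_taps=4):
    """Return the tap count selected by the first flag."""
    if short_taps not in (4, 6):
        raise ValueError("the short filter is described as having 4 or 6 taps")
    return 8 if flag is FilterType.LONG else short_taps
```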
The first flag may be carried in the sequence header, the frame header, or the slice header; specifically, it may be carried in Slice_type in the slice header. The second flag may be carried in the sequence header, the frame header, or the slice header.
A frame may contain one or more slices, each with its own slice header. The slice header may use the slice type to identify whether the current slice is an I_SLICE, a B_SLICE, or a P_SLICE. An I_SLICE can use only intra prediction; a P_SLICE can use intra prediction or forward prediction; a B_SLICE can use intra prediction, forward prediction, bidirectional prediction, backward prediction, or dual forward prediction.
Carrying the first flag in the slice header means that the first flag may be a flag independent of the slice type. When the slice header indicates that the slice is a B_SLICE: if the first flag indicates that an interpolation filter is to be selected from the multiple interpolation filters, the interpolation filter may be selected from the multiple interpolation filters according to the solution of this application (that is, the solution applies); if the first flag indicates that no interpolation filter is to be selected from the multiple interpolation filters, the selection according to the solution of this application may be skipped (that is, the solution does not apply), and a preset interpolation filter may be used, for example.
When the slice header indicates that the slice is a P_SLICE: if the first flag indicates that an interpolation filter is to be selected from the multiple interpolation filters, the interpolation filter may be selected according to the solution of this application, and the slice is processed in the manner of a B_SLICE (for example, using bidirectional prediction, dual forward prediction, or dual backward prediction; that is, the current slice is treated as a B_SLICE); if the first flag indicates that no interpolation filter is to be selected from the multiple interpolation filters, the selection according to the solution of this application may be skipped, and the slice is processed in the manner of a P_SLICE (for example, using intra prediction or forward prediction).
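The four combinations of slice type and first flag described above can be summarized in a small decision function. This is a sketch under the assumption that the slice type is available as the string "P" or "B" and the first flag as a boolean; the function name and return shape are hypothetical.

```python
def plan_slice(slice_type, first_flag):
    """Return (how the slice is processed, whether a filter is selected from the candidates)."""
    if slice_type == "B":
        # The flag only decides between selecting a filter and using a preset one.
        return ("B", first_flag)
    if slice_type == "P":
        # A P slice with the flag set is handled like a B slice.
        return ("B", True) if first_flag else ("P", False)
    raise ValueError("only P and B slices are covered by this sketch")
```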
Optionally, in the embodiments of the present application, a second flag may also be present in the bitstream. When the second flag indicates that the target frame is a B frame, the bitstream contains the first flag.
The second flag here may be the slice type. That is, when the slice type indicates that the current frame is a B frame, the first flag is also present in the bitstream; otherwise the first flag is not present.
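The conditional presence of the first flag can be sketched as a parsing step that reads the flag only for a B slice. The one-bit coding of the flag and the iterator-of-bits interface are assumptions made for illustration; the slice type is taken as already parsed.

```python
def parse_first_flag(slice_type, bits):
    """Return the first flag as a bool if it is present, otherwise None.

    bits is an iterator over the remaining header bits (0 or 1); it is
    advanced only when the slice type signals that the flag exists.
    """
    if slice_type == "B":
        return next(bits) == 1
    return None
```

For a slice that is not a B slice the iterator is left untouched, so the bits that follow are interpreted as the next syntax element.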
In the embodiments of the present application, the first flag may also reuse the slice type, in particular the slice type indicating P_SLICE. For example, when the slice type is P_SLICE, an interpolation filter may be selected from the multiple interpolation filters according to the solution of this application, and the slice is processed in the manner of a B_SLICE (for example, using bidirectional prediction, dual forward prediction, or dual backward prediction; that is, the current slice is treated as a B_SLICE). When the slice type is B_SLICE, a first flag may additionally be present in the slice header, apart from the slice type. If the first flag indicates that an interpolation filter is to be selected from the multiple interpolation filters, the interpolation filter may be selected according to the solution of this application; if the first flag indicates that no interpolation filter is to be selected from the multiple interpolation filters, the selection according to the solution of this application may be skipped, and a preset interpolation filter may be used, for example.
In the embodiments of the present application, two or more first flags may exist at the same time. For example, a first flag is present in at least two of the sequence header, the frame header, and the slice header.
In one implementation, when the first flag in the sequence header indicates that the solution of this application must be applied or must not be applied, the frame header and the slice header may carry no first flag, and the solution is applied to all frames or slices of the sequence, or to none of them. When the first flag in the sequence header indicates that the solution of this application may or may not be applied (that is, each frame or slice is free to apply it or not), a first flag may be present in the frame header or the slice header to indicate whether the current frame or slice applies the solution.
For example, when the first flag in the sequence header indicates that each frame may apply the solution of this application, a first flag is present in the frame header or the slice header to indicate whether the current frame or slice applies the solution. When the first flag in the sequence header indicates that the frames must not apply the solution of this application, no first flag is present in the frame header or the slice header, and the solution is not applied when each frame or slice is processed.
For example, when the first flag in the sequence header indicates that each frame must apply the solution of this application, no first flag is present in the frame header or the slice header, and the solution is applied when each frame or slice is processed. When the first flag in the sequence header indicates that each frame may or may not apply the solution of this application, a first flag is present in the frame header or the slice header to indicate whether the current frame or slice applies the solution.
Similarly, a first flag may be present in both the frame header and the slice header, or only in the frame header. When the first flag in the frame header indicates that the solution of this application must be applied or must not be applied, the slice header may carry no first flag, and the solution is applied to all slices of the frame, or to none of them. When the first flag in the frame header indicates that the solution of this application may or may not be applied, a first flag may be present in the slice header to indicate whether the current slice applies the solution.
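The layered signalling across sequence, frame, and slice headers can be sketched as a resolution function that walks from the outermost header inward and stops at the first level that is not optional. The three-valued `Level` enum and the function name are hypothetical; the text describes the behavior but not a concrete coding.

```python
from enum import Enum

class Level(Enum):
    REQUIRED = "required"    # the solution must be applied
    FORBIDDEN = "forbidden"  # the solution must not be applied
    OPTIONAL = "optional"    # left to the next level down

def applies(seq, frame=None, slice_flag=None):
    """Decide whether the current slice applies the solution.

    seq and frame are Level values, or None when that header carries no flag;
    slice_flag is a bool, or None when the slice header carries no flag.
    """
    for level in (seq, frame):
        if level is None:
            continue
        if level is not Level.OPTIONAL:
            return level is Level.REQUIRED
    if slice_flag is None:
        raise ValueError("a slice-level flag is expected when upper levels are optional")
    return slice_flag
```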
The embodiments of the present application may be used in the LDB (low-delay B) mode. By modifying the interpolation applied to the two MVs in LDB (that is, by selecting the interpolation filter or interpolation method), the memory bandwidth consumption of LDB can be reduced, so that the LDB mode brings no additional bandwidth pressure compared with LDP (low-delay P) while achieving better compression performance than LDP.
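A rough way to see the bandwidth effect is to count the reference samples needed per block. For a separable filter with T taps, a W x H block needs (W + T - 1) x (H + T - 1) reference samples per MV. The sketch below applies that count to a 4x4 block, which is an illustrative size chosen here and not one taken from the application; it ignores memory alignment and caching, and the numbers change with block size.

```python
def ref_samples(width, height, taps, num_mvs):
    """Reference samples fetched for one block with a separable filter of the given tap count."""
    return num_mvs * (width + taps - 1) * (height + taps - 1)

ldp_8tap = ref_samples(4, 4, taps=8, num_mvs=1)  # one MV, long filter
ldb_8tap = ref_samples(4, 4, taps=8, num_mvs=2)  # two MVs, long filter
ldb_4tap = ref_samples(4, 4, taps=4, num_mvs=2)  # two MVs, short filter
```

Under these assumptions the counts are 121, 242, and 98 samples respectively, so two MVs with the shorter filter need fewer reference samples than one MV with the longer filter.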
The above describes how to select an interpolation filter or interpolation method. The embodiments of the present application may also be used to select between unidirectional prediction and bidirectional prediction, and to select whether motion estimation and/or motion compensation uses integer-pixel precision (no interpolation filter needed) or sub-pixel precision (interpolation filter needed).
In one implementation, there may be an integer-pixel-precision manner and a sub-pixel-precision manner of motion estimation and/or motion compensation, and which of the two is used may be selected according to preset conditions. The preset conditions of the two manners may differ in at least one of the following aspects:
the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be encoded, and the number of MVs of the image block.
The preset condition corresponding to integer-pixel-precision motion estimation and/or motion compensation may include: the coding mode is the inter mode, the size of the image block is less than or equal to a preset value, and the number of MVs of the image block is greater than or equal to 2. The preset condition may also restrict other factors, which is not limited in the embodiments of the present application.
In this case, there may further be multiple preset conditions corresponding to sub-pixel-precision motion estimation and/or motion compensation, each corresponding to one of multiple interpolation filters; in other words, the sub-pixel-precision manner may have multiple interpolation filters, and the preset condition corresponding to each interpolation filter may be different.
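The choice between integer-pixel and sub-pixel precision can be sketched as a predicate over the three factors named above. The helper that rounds an MV to an integer-pixel position is an added assumption (the text does not say how an integer-precision MV is obtained), as are the 1/16-pixel MV units and all names.

```python
FRAC_BITS = 4  # assumed MV storage in 1/16-pixel units

def use_integer_precision(coding_mode, block_size, num_mvs, size_threshold):
    """True when the example preset condition for integer-pixel precision holds."""
    return coding_mode == "inter" and block_size <= size_threshold and num_mvs >= 2

def to_integer_pel(mv):
    """Round an MV component to the nearest integer-pixel position (hypothetical step)."""
    half = 1 << (FRAC_BITS - 1)
    return ((mv + half) >> FRAC_BITS) << FRAC_BITS
```

With an integer-pixel MV the prediction block is copied directly from the reference frame, so no interpolation filter is involved.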
In another implementation, there may be a unidirectional prediction mode and a bidirectional prediction mode, and which of the two is used may be selected according to preset conditions. The preset conditions of the two modes may differ in at least one of the following aspects:
the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be encoded, and the number of MVs of the image block.
The preset condition corresponding to the unidirectional prediction mode may include: the coding mode is the inter mode, the size of the image block is less than or equal to a preset value, and the number of MVs of the image block is greater than or equal to 2. The preset condition may also restrict other factors, which is not limited in the embodiments of the present application.
In this case, there may be multiple preset conditions corresponding to the bidirectional prediction mode, each corresponding to one of multiple interpolation filters; in other words, the bidirectional prediction mode may have multiple interpolation filters, and the preset condition corresponding to each interpolation filter may be different.
The video processing method according to the embodiments of the present application has been described above. The following describes a video processing device for implementing the method.
FIG. 7 shows a schematic block diagram of a video processing device 200 according to an embodiment of the present application.
As shown in FIG. 7, the device 200 may include a processor 210 and may further include a memory 220.
It should be understood that the device 200 may also include components commonly found in computer systems, such as input/output devices and communication interfaces, which is not limited in the embodiments of the present application.
The memory 220 is configured to store computer-executable instructions.
The memory 220 may be any of various types of memory. For example, it may include high-speed random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory, which is not limited in the embodiments of the present application.
The processor 210 is configured to access the memory 220 and execute the computer-executable instructions to perform the operations of the video processing method of the embodiments of the present application described above.
The processor 210 may include a microprocessor, a field-programmable gate array (FPGA), a central processing unit (CPU), a graphics processing unit (GPU), and the like, which is not limited in the embodiments of the present application.
The video processing device 200 of the embodiments of the present application may correspond to the entity that executes the video processing method of the embodiments of the present application, and the above and other operations and/or functions of the modules of the video processing device 200 respectively implement the corresponding procedures of the foregoing methods. For brevity, they are not repeated here.
The embodiments of the present application further provide an electronic device, which may include the video processing device of the various embodiments of the present application described above.
The embodiments of the present application further provide a computer storage medium storing program code, and the program code may be used to instruct execution of the video processing method of the embodiments of the present application described above.
It should be understood that, in the embodiments of the present application, the term “and/or” merely describes an association between associated objects and indicates that three relationships are possible. For example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the character “/” in this document generally indicates an “or” relationship between the objects before and after it.
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two, in order to clearly illustrate the hardware and software Interchangeability. In the above description, the composition and steps of each example have been generally described in terms of function. Whether these functions are executed by hardware or software depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that, for the convenience and conciseness of description, the specific working process of the system, device and unit described above can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist separately as a physical unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and all such modifications or replacements shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
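As a non-limiting illustration, one possible combination of the preset conditions and interpolation filters recited in the claims could be sketched in Python as follows. The names `Block`, `select_tap_count`, and `SIZE_THRESHOLD`, the threshold value of 64, and the reading of "size" as width times height are assumptions made for this example and do not appear in the application. Only the tap counts are taken from the claims: 4 taps for Affine mode and for chrominance components, 4 or 6 taps for inter-mode luminance blocks no larger than the preset value, and 8 taps for larger inter-mode luminance blocks.

```python
from dataclasses import dataclass

# Hypothetical threshold: the application only refers to a "preset value".
# Block size is assumed here to be measured as width * height in samples.
SIZE_THRESHOLD = 64


@dataclass(frozen=True)
class Block:
    """Minimal stand-in for an image block (illustrative only)."""
    mode: str       # "inter" or "affine"
    component: str  # "luma" or "chroma"
    width: int
    height: int


def select_tap_count(block: Block, small_block_taps: int = 6) -> int:
    """Return the tap count of the interpolation filter for a block.

    small_block_taps is the tap count used for small inter luma blocks;
    the claims allow either 4 or 6 for that case.
    """
    if small_block_taps not in (4, 6):
        raise ValueError("small_block_taps must be 4 or 6")
    if block.mode not in ("inter", "affine"):
        raise ValueError(f"unsupported coding mode: {block.mode}")
    if block.component not in ("luma", "chroma"):
        raise ValueError(f"unsupported component: {block.component}")

    # Affine mode: 4 taps for both luminance and chrominance.
    if block.mode == "affine":
        return 4
    # Inter mode, chrominance component: 4 taps.
    if block.component == "chroma":
        return 4
    # Inter mode, luminance component: the choice depends on block size.
    if block.width * block.height > SIZE_THRESHOLD:
        return 8
    return small_block_taps
```

Under these assumptions, an 8x8 inter-mode luminance block would be assigned the shorter filter and a 16x16 one the 8-tap filter; an encoder and a decoder would need to apply the same rule so that both derive the same filter without extra signaling.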

Claims (70)

  1. A video processing method, characterized by comprising:
    using an interpolation filter among multiple interpolation filters to perform motion estimation and/or motion compensation on an image block of a target frame that has multiple motion vectors (MVs).
  2. The method according to claim 1, wherein different interpolation filters among the multiple interpolation filters correspond to different preset conditions;
    the using an interpolation filter among multiple interpolation filters to perform motion estimation and/or motion compensation on an image block of a target frame that has multiple MVs comprises:
    when a first preset condition corresponding to a first interpolation filter is satisfied, using the first interpolation filter to perform motion estimation and/or motion compensation on the image block.
  3. The method according to claim 2, wherein the preset conditions corresponding to different interpolation filters differ in at least one of the following aspects:
    the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
  4. The method according to claim 1, wherein different preset conditions correspond to different interpolation filters among the multiple interpolation filters;
    the using an interpolation filter among multiple interpolation filters to perform motion estimation and/or motion compensation on an image block of a target frame that has multiple MVs comprises:
    when a first preset condition among the multiple preset conditions is satisfied, using the first interpolation filter corresponding to the first preset condition to perform motion estimation and/or motion compensation on the image block.
  5. The method according to claim 4, wherein the different preset conditions differ in at least one of the following aspects:
    the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
  6. The method according to claim 2 or 3, wherein the preset condition corresponding to the first interpolation filter includes at least two of the following:
    the coding mode of the image block is inter-frame (inter) mode, and the component to be coded is a luminance component;
    the coding mode of the image block is inter mode, and the component to be coded is a chrominance component;
    the coding mode of the image block is affine motion compensation prediction (Affine) mode, and the component to be coded is a chrominance component;
    the coding mode of the image block is Affine mode, and the component to be coded is a luminance component.
  7. The method according to claim 6, wherein the preset condition that includes the coding mode of the image block being inter mode and the component to be coded being a luminance component further includes: the size of the image block is less than or equal to a preset value.
  8. The method according to any one of claims 2 to 5, wherein the following preset conditions respectively correspond to different interpolation filters:
    the coding mode of the image block is inter mode, the component to be coded is a luminance component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
    the coding mode of the image block is inter mode, the component to be coded is a luminance component, and the size of the image block is less than or equal to the second preset value.
  9. The method according to any one of claims 2 to 5, wherein the following preset conditions respectively correspond to different interpolation filters:
    the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the coding mode of the image block is inter mode, and the component to be coded is a chrominance component.
  10. The method according to any one of claims 2 to 5, wherein the following preset conditions respectively correspond to different interpolation filters:
    the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the coding mode of the image block is Affine mode, and the component to be coded is a luminance component or a chrominance component.
  11. The method according to claim 9 or 10, wherein the preset condition that includes the coding mode of the image block being inter mode and the component to be coded being a luminance component further includes: the size of the image block is greater than a preset value.
  12. The method according to any one of claims 2 to 5, wherein:
    the first preset condition includes: the coding mode of the image block is Affine mode, and the component to be coded is a luminance component;
    the number of taps of the first filter is 4.
  13. The method according to any one of claims 2 to 5, wherein:
    the first preset condition includes: the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the number of taps of the first filter is 4 or 6.
  14. The method according to claim 13, wherein the first preset condition further includes: the size of the image block is less than or equal to a preset value.
  15. The method according to any one of claims 2 to 5, wherein:
    the first preset condition includes: the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the number of taps of the first filter is 8.
  16. The method according to claim 15, wherein the first preset condition further includes: the size of the image block is greater than a preset value.
  17. The method according to any one of claims 2 to 5, wherein:
    the first preset condition includes: the coding mode of the image block is inter mode, and the component to be coded is a chrominance component;
    the number of taps of the first filter is 4.
  18. The method according to any one of claims 2 to 5, wherein:
    the first preset condition includes: the coding mode of the image block is Affine mode, and the component to be coded is a chrominance component;
    the number of taps of the first filter is 4.
  19. The method according to any one of claims 1 to 3, wherein the image block includes a luminance component and a chrominance component;
    the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
  20. The method according to claim 19, wherein:
    when at least one of the following conditions is met, the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation:
    the bitstream contains a specific identification bit; the coding mode is inter mode or Affine mode; the size of the image block is greater than a preset value.
  21. The method according to claim 19 or 20, wherein the luminance component and the chrominance component of the image block using the same interpolation method for motion estimation and/or motion compensation comprises:
    the interpolation filters used for motion estimation and/or motion compensation of the luminance component and the chrominance component of the image block have the same number of taps; and/or,
    the interpolation filters used for motion estimation and/or motion compensation of the luminance component and the chrominance component of the image block have the same interpolation coefficients.
  22. The method according to any one of claims 19 to 21, wherein the coding modes of the luminance component and the chrominance component of the image block are both inter mode.
  23. The method according to any one of claims 19 to 21, wherein the coding modes of the luminance component and the chrominance component of the image block are both Affine mode.
  24. The method according to any one of claims 19 to 23, wherein the interpolation filter used for motion estimation and/or motion compensation of the luminance component and the chrominance component of the image block has 4 taps and is used to interpolate pixels at 1/16-pixel positions.
  25. The method according to any one of claims 19 to 24, wherein the method is used at the encoding end, and the method further comprises:
    writing an identification bit into the bitstream, the identification bit being used to indicate that the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
  26. The method according to any one of claims 19 to 24, wherein the method is used at the decoding end, and the method further comprises:
    obtaining an identification bit from the bitstream, the identification bit being used to indicate that the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
  27. The method according to any one of claims 1 to 26, wherein the method is used at the encoding end, and the method further comprises:
    adding a first identification bit to the bitstream, the first identification bit being used to indicate the selection of one interpolation filter from the multiple interpolation filters for motion estimation and/or motion compensation.
  28. The method according to any one of claims 1 to 26, wherein the method is used at the decoding end, and the method further comprises:
    obtaining a first identification bit from the bitstream, the first identification bit being used to indicate the selection of one interpolation filter from the multiple interpolation filters for motion estimation and/or motion compensation.
  29. The method according to claim 27 or 28, wherein the first identification bit is used to indicate the selection, from filters with multiple numbers of taps, of a filter with one of those numbers of taps for motion estimation and/or motion compensation.
  30. The method according to claim 29, wherein the filters with multiple numbers of taps include a first filter and a second filter, the number of taps of the first filter is 8, and the number of taps of the second filter is 6 or 4, or the number of taps of the second filter is indicated to be the same as the number of taps of the filter for the chrominance component;
    the first identification bit is used to indicate the selection of the first filter or the second filter.
  31. The method according to any one of claims 27 to 30, wherein the method is used at the encoding end, and the method further comprises:
    adding a second identification bit to the bitstream, wherein when the second identification bit indicates that the target frame is a B frame, the bitstream contains the first identification bit.
  32. The method according to any one of claims 27 to 30, wherein the method is used at the decoding end, and the method further comprises:
    obtaining a second identification bit from the bitstream, wherein when the second identification bit indicates that the target frame is a B frame, the bitstream contains the first identification bit.
  33. The method according to any one of claims 27 to 32, wherein the first identification bit is carried in a sequence header, a frame header, and a Slice header.
  34. The method according to claim 31 or 32, wherein the second identification bit is carried in a sequence header, a frame header, and a Slice header.
  35. The method according to any one of claims 31 to 34, wherein the second identification bit is Slice_type in the Slice header.
  36. A video processing device, characterized by comprising a processor, the processor being configured to call code stored in a memory to perform the following operations:
    using an interpolation filter among multiple interpolation filters to perform motion estimation and/or motion compensation on an image block of a target frame that has multiple MVs.
  37. The device according to claim 36, wherein different interpolation filters among the multiple interpolation filters correspond to different preset conditions;
    the using an interpolation filter among multiple interpolation filters to perform motion estimation and/or motion compensation on an image block of a target frame that has multiple MVs comprises:
    when a first preset condition corresponding to a first interpolation filter is satisfied, using the first interpolation filter to perform motion estimation and/or motion compensation on the image block.
  38. The device according to claim 37, wherein the preset conditions corresponding to different interpolation filters differ in at least one of the following aspects:
    the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
  39. The device according to claim 36, wherein different preset conditions correspond to different interpolation filters among the multiple interpolation filters;
    the using an interpolation filter among multiple interpolation filters to perform motion estimation and/or motion compensation on an image block of a target frame that has multiple MVs comprises:
    when a first preset condition among the multiple preset conditions is satisfied, using the first interpolation filter corresponding to the first preset condition to perform motion estimation and/or motion compensation on the image block.
  40. The device according to claim 39, wherein the different preset conditions differ in at least one of the following aspects:
    the coding mode of the image block, the interval in which the size of the image block falls, the component of the image block to be coded, and the number of MVs of the image block.
  41. The device according to claim 37 or 38, wherein the preset condition corresponding to the first interpolation filter includes at least two of the following:
    the coding mode of the image block is inter-frame (inter) mode, and the component to be coded is a luminance component;
    the coding mode of the image block is inter mode, and the component to be coded is a chrominance component;
    the coding mode of the image block is affine motion compensation prediction (Affine) mode, and the component to be coded is a chrominance component;
    the coding mode of the image block is Affine mode, and the component to be coded is a luminance component.
  42. The device according to claim 41, wherein the preset condition that includes the coding mode of the image block being inter mode and the component to be coded being a luminance component further includes: the size of the image block is less than or equal to a preset value.
  43. The device according to any one of claims 37 to 40, wherein the following preset conditions respectively correspond to different interpolation filters:
    the coding mode of the image block is inter mode, the component to be coded is a luminance component, and the size of the image block is less than or equal to a first preset value and greater than a second preset value;
    the coding mode of the image block is inter mode, the component to be coded is a luminance component, and the size of the image block is less than or equal to the second preset value.
  44. The device according to any one of claims 37 to 40, wherein the following preset conditions respectively correspond to different interpolation filters:
    the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the coding mode of the image block is inter mode, and the component to be coded is a chrominance component.
  45. The device according to any one of claims 37 to 40, wherein the following preset conditions respectively correspond to different interpolation filters:
    the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the coding mode of the image block is Affine mode, and the component to be coded is a luminance component or a chrominance component.
  46. The device according to claim 44 or 45, wherein the preset condition that includes the coding mode of the image block being inter mode and the component to be coded being a luminance component further includes: the size of the image block is greater than a preset value.
  47. The device according to any one of claims 37 to 40, wherein:
    the first preset condition includes: the coding mode of the image block is Affine mode, and the component to be coded is a luminance component;
    the number of taps of the first filter is 4.
  48. The device according to any one of claims 37 to 40, wherein:
    the first preset condition includes: the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the number of taps of the first filter is 4 or 6.
  49. The device according to claim 48, wherein the first preset condition further includes: the size of the image block is less than or equal to a preset value.
  50. The device according to any one of claims 37 to 40, wherein:
    the first preset condition includes: the coding mode of the image block is inter mode, and the component to be coded is a luminance component;
    the number of taps of the first filter is 8.
  51. The device according to claim 50, wherein the first preset condition further includes: the size of the image block is greater than a preset value.
  52. The device according to any one of claims 37 to 40, wherein:
    the first preset condition includes: the coding mode of the image block is inter mode, and the component to be coded is a chrominance component;
    the number of taps of the first filter is 4.
  53. The device according to any one of claims 37 to 40, wherein:
    the first preset condition includes: the coding mode of the image block is Affine mode, and the component to be coded is a chrominance component;
    the number of taps of the first filter is 4.
  54. The device according to any one of claims 36 to 38, wherein the image block includes a luminance component and a chrominance component;
    the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
  55. The device according to claim 54, wherein:
    when at least one of the following conditions is met, the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation:
    the bitstream contains a specific identification bit; the coding mode is inter mode or Affine mode; the size of the image block is greater than a preset value.
  56. The device according to claim 54 or 55, wherein the luminance component and the chrominance component of the image block using the same interpolation method for motion estimation and/or motion compensation comprises:
    the interpolation filters used for motion estimation and/or motion compensation of the luminance component and the chrominance component of the image block have the same number of taps; and/or,
    the interpolation filters used for motion estimation and/or motion compensation of the luminance component and the chrominance component of the image block have the same interpolation coefficients.
  57. The device according to any one of claims 54 to 56, wherein the coding modes of the luminance component and the chrominance component of the image block are both inter mode.
  58. The device according to any one of claims 54 to 56, wherein the coding modes of the luminance component and the chrominance component of the image block are both Affine mode.
  59. The device according to any one of claims 54 to 58, wherein the interpolation filter used for motion estimation and/or motion compensation of the luminance component and the chrominance component of the image block has 4 taps and is used to interpolate pixels at 1/16-pixel positions.
  60. The device according to any one of claims 54 to 59, wherein the device is used at the encoding end, and the processor is further configured to:
    write an identification bit into the bitstream, the identification bit being used to indicate that the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
  61. The device according to any one of claims 54 to 59, wherein the device is used at the decoding end, and the processor is further configured to:
    obtain an identification bit from the bitstream, the identification bit being used to indicate that the luminance component and the chrominance component of the image block use the same interpolation method for motion estimation and/or motion compensation.
  62. 根据权利要求36至61中任一项所述的设备,其特征在于,所述设备用于编码端;所述处理器进一步用于:The device according to any one of claims 36 to 61, wherein the device is used for an encoding end; the processor is further used for:
    在码流中加入第一标识位,所述第一标识位用于指示从多种插值滤波器中选择其中一种插值滤波器以用于运动估计和/或运动补偿。A first identification bit is added to the code stream, and the first identification bit is used to indicate that one of the interpolation filters is selected from a variety of interpolation filters for use in motion estimation and/or motion compensation.
  63. 根据权利要求36至61中任一项所述的设备,其特征在于,所述设备用于解码端;所述处理器进一步用于:The device according to any one of claims 36 to 61, wherein the device is used for a decoding end; the processor is further used for:
    在码流中获取第一标识位,所述第一标识位用于指示从多种插值滤波器中选择其中一种插值滤波器以用于运动估计和/或运动补偿。The first identification bit is acquired in the code stream, and the first identification bit is used to indicate that one of the interpolation filters is selected from a variety of interpolation filters for use in motion estimation and/or motion compensation.
  64. The device according to claim 62 or 63, wherein the first identification bit indicates that a filter with one number of taps is selected, from filters with a plurality of different numbers of taps, for motion estimation and/or motion compensation.
  65. The device according to claim 64, wherein the filters with a plurality of different numbers of taps comprise a first filter and a second filter, the number of taps of the first filter is 8, and the number of taps of the second filter is 6 or 4, or the number of taps of the second filter is indicated to be the same as the number of taps of the filter for the chrominance component;
    the first identification bit indicates selection of the first filter or the second filter.
  66. The device according to any one of claims 62 to 65, wherein the device is used at an encoding end, and the processor is further configured to:
    add a second identification bit to the code stream, wherein when the second identification bit indicates that the target frame is a B frame, the code stream contains the first identification bit.
  67. The device according to any one of claims 62 to 65, wherein the device is used at a decoding end, and the processor is further configured to:
    obtain a second identification bit from the code stream, wherein when the second identification bit indicates that the target frame is a B frame, the code stream contains the first identification bit.
  68. The device according to any one of claims 62 to 67, wherein the first identification bit is carried in a sequence header, a frame header, and a Slice header.
  69. The device according to claim 67 or 68, wherein the second identification bit is carried in a sequence header, a frame header, and a Slice header.
  70. The device according to claim 69, wherein the second identification bit is Slice_type in the Slice header.
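
The signaling described in claims 62 to 70 can be illustrated with a minimal sketch. It is not the applicants' implementation and not the syntax of any real codec: the slice-type codes, the 2-bit field width, the flag-to-tap mapping (0 for 8 taps, 1 for 6 taps), the 8-tap default for non-B slices, and all function names are assumptions made for illustration only. The sketch shows an encoder writing Slice_type (the second identification bit) and, only for B slices, the filter flag (the first identification bit), and a decoder recovering both.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical slice-type codes; real codecs define their own values.
SLICE_TYPE_I, SLICE_TYPE_P, SLICE_TYPE_B = 0, 1, 2

# Hypothetical mapping from the first identification bit to a tap count:
# 0 selects the first filter (8 taps), 1 selects the second filter (6 taps).
TAPS_BY_FLAG = {0: 8, 1: 6}


@dataclass
class SliceHeader:
    slice_type: int                    # plays the role of the second identification bit
    filter_flag: Optional[int] = None  # first identification bit, present only for B slices


def write_slice_header(header: SliceHeader) -> List[int]:
    """Encoder side: emit 2 bits of slice_type, then the flag if the slice is B."""
    bits = [(header.slice_type >> 1) & 1, header.slice_type & 1]
    if header.slice_type == SLICE_TYPE_B:
        if header.filter_flag not in TAPS_BY_FLAG:
            raise ValueError("a B slice must carry a filter flag of 0 or 1")
        bits.append(header.filter_flag)
    return bits


def read_slice_header(bits: List[int]) -> SliceHeader:
    """Decoder side: read slice_type first, then the flag only if it is present."""
    slice_type = (bits[0] << 1) | bits[1]
    filter_flag = bits[2] if slice_type == SLICE_TYPE_B else None
    return SliceHeader(slice_type, filter_flag)


def interpolation_taps(header: SliceHeader, default_taps: int = 8) -> int:
    """Tap count for motion estimation/compensation; the default is an assumption."""
    if header.filter_flag is None:
        return default_taps
    return TAPS_BY_FLAG[header.filter_flag]
```

In this sketch the decoder must parse Slice_type before it knows whether a filter flag follows, which is why the second identification bit is written first.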
PCT/CN2019/091955 2019-06-19 2019-06-19 Video processing method and device WO2020252707A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980009161.8A CN111656782A (en) 2019-06-19 2019-06-19 Video processing method and device
PCT/CN2019/091955 WO2020252707A1 (en) 2019-06-19 2019-06-19 Video processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/091955 WO2020252707A1 (en) 2019-06-19 2019-06-19 Video processing method and device

Publications (1)

Publication Number Publication Date
WO2020252707A1 (en) 2020-12-24

Family

ID=72345887

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/091955 WO2020252707A1 (en) 2019-06-19 2019-06-19 Video processing method and device

Country Status (2)

Country Link
CN (1) CN111656782A (en)
WO (1) WO2020252707A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113242427B (en) * 2021-04-14 2024-03-12 中南大学 Rapid method and device based on adaptive motion vector precision in VVC

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1666429A (en) * 2002-07-09 2005-09-07 诺基亚有限公司 Method and system for selecting interpolation filter type in video coding
WO2010087620A2 (en) * 2009-01-28 2010-08-05 삼성전자 주식회사 Method and apparatus for encoding and decoding images by adaptively using an interpolation filter
CN101841701A (en) * 2009-03-20 2010-09-22 华为技术有限公司 Encoding and decoding method and device based on macroblock pair
WO2012125452A1 (en) * 2011-03-16 2012-09-20 General Instrument Corporation Interpolation filter selection using prediction unit (pu) size
CN108702509A (en) * 2016-02-25 2018-10-23 株式会社Kt Method and apparatus for handling vision signal
CN109845265A (en) * 2016-10-19 2019-06-04 数字洞察力有限公司 Use the method for video coding and device of adaptive interpolation filters

Also Published As

Publication number Publication date
CN111656782A (en) 2020-09-11

Similar Documents

Publication Publication Date Title
TWI711299B (en) Decoding method and apparatus utilizing partial cost calculation
US20230140112A1 (en) Method and apparatus for video signal processing using sub-block based motion compensation
US8503532B2 (en) Method and apparatus for inter prediction encoding/decoding an image using sub-pixel motion estimation
KR101991074B1 (en) Video encoding and decoding
US20060222074A1 (en) Method and system for motion estimation in a video encoder
JP2022123067A (en) Video decoding method, program and decoder readable storage medium
JP7313533B2 (en) Method and Apparatus in Predictive Refinement by Optical Flow
WO2021055643A1 (en) Methods and apparatus for prediction refinement with optical flow
JP7269371B2 (en) Method and Apparatus for Prediction Improvement Using Optical Flow
WO2020257629A1 (en) Methods and apparatus for prediction refinement with optical flow
CN114827623A (en) Boundary extension for video coding and decoding
WO2020223552A1 (en) Methods and apparatus of prediction refinement with optical flow
JP7281602B2 (en) Method and Apparatus for Predictive Refinement with Optical Flow, Bidirectional Optical Flow and Decoder Side Motion Vector Refinement
US9420308B2 (en) Scaled motion search section with parallel processing and method for use therewith
WO2020252707A1 (en) Video processing method and device
CN111247804B (en) Image processing method and device
TW202041002A (en) Constraints on decoder-side motion vector refinement

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19933639; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19933639; Country of ref document: EP; Kind code of ref document: A1)