WO2021056220A1 - 视频编解码的方法与装置 (Method and Apparatus for Video Encoding and Decoding) - Google Patents

视频编解码的方法与装置 (Method and Apparatus for Video Encoding and Decoding)

Info

Publication number
WO2021056220A1
WO2021056220A1 (PCT/CN2019/107607)
Authority
WO
WIPO (PCT)
Prior art keywords
sub
coding block
coding
block
motion vector
Prior art date
Application number
PCT/CN2019/107607
Other languages
English (en)
French (fr)
Inventor
马思伟 (Ma Siwei)
孟学苇 (Meng Xuewei)
郑萧桢 (Zheng Xiaozhen)
王苫社 (Wang Shanshe)
Original Assignee
北京大学 (Peking University)
深圳市大疆创新科技有限公司 (SZ DJI Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学 (Peking University) and 深圳市大疆创新科技有限公司 (SZ DJI Technology Co., Ltd.)
Priority to CN201980032177.0A (published as CN112204973A)
Priority to PCT/CN2019/107607
Publication of WO2021056220A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Definitions

  • This application relates to the technical field of digital video coding and decoding, and more specifically, to a method and device for video coding and decoding.
  • The video coding compression process includes block division, prediction, transform, quantization, and entropy coding, which together form a hybrid video coding framework.
  • The video coding and decoding technology standards include: the international video coding standards H.264/MPEG-AVC and H.265/MPEG-HEVC, the domestic audio and video coding standard AVS2, and the H.266/VVC international standard and AVS3 domestic standard that are under development.
  • The inter-frame prediction mode introduces an affine motion compensated prediction mode (Affine motion compensated prediction), referred to as the Affine mode, which has a good prediction effect for scenes involving rotation and zooming.
  • the present application provides a method and device for video coding and decoding, which can reduce the bandwidth pressure of the Affine mode while reducing the complexity of the codec.
  • A video encoding and decoding method is provided, including: obtaining a control point motion vector of a coding block in the affine motion compensation prediction (Affine) mode, where the control point motion vector is used to calculate the motion vectors of multiple sub-coding blocks in the coding block; and, when the coding block is unidirectionally predicted, performing motion compensation on a first sub-coding block based on the motion vector of the first sub-coding block among the multiple sub-coding blocks.
  • In unidirectional prediction, motion compensation of the first sub-coding block is performed directly based on the motion vector of the first sub-coding block among the multiple sub-coding blocks, which reduces the complexity of the codec system and improves coding efficiency without bringing greater bandwidth pressure, thereby improving the performance of the codec system.
  • A video encoding and decoding apparatus is provided, including a processor configured to: obtain a control point motion vector of a coding block in the affine motion compensation prediction (Affine) mode, where the control point motion vector is used to calculate the motion vectors of multiple sub-coding blocks in the coding block; and, when the coding block is unidirectionally predicted, perform motion compensation on a first sub-coding block based on the motion vector of the first sub-coding block among the multiple sub-coding blocks.
  • an electronic device including the video encoding and decoding apparatus provided in the second aspect.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a computer, the computer executes the method provided in the first aspect.
  • a computer program product containing instructions is provided, which when executed by a computer causes the computer to execute the method provided in the first aspect.
  • Fig. 1 is a structural diagram of a technical solution applying an embodiment of the present application.
  • Fig. 2 is a schematic diagram of a video coding framework according to an embodiment of the present application.
  • Fig. 3 is a schematic diagram of a video decoding framework according to an embodiment of the present application.
  • Fig. 4 is a schematic diagram of a sub-pixel interpolation according to an embodiment of the present application.
  • Fig. 5 is a schematic flowchart of a video encoding and decoding method according to an embodiment of the present application.
  • Figs. 6a to 6c are schematic diagrams of control point motion vectors of coding blocks and motion vectors of sub-coding blocks in the Affine mode according to an embodiment of the present application.
  • Fig. 7 is a schematic flowchart of a specific video encoding method according to an embodiment of the present application.
  • Fig. 8 is a schematic flowchart of a specific video decoding method according to an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of another video encoding and decoding method according to an embodiment of the present application.
  • Fig. 10 is a schematic flowchart of another specific video encoding method according to an embodiment of the present application.
  • Fig. 11 is a schematic flowchart of another specific video decoding method according to an embodiment of the present application.
  • Fig. 12 is a schematic block diagram of a video encoding and decoding device according to an embodiment of the present application.
  • The embodiments of this application can be applied to standard or non-standard image or video codecs, for example, a codec of the VVC standard.
  • The size of the sequence numbers of the processes does not imply an execution order; the execution order of each process should be determined by its function and internal logic, and does not constitute any limitation on the implementation of the embodiments of the present application.
  • Fig. 1 is a structural diagram of a technical solution applying an embodiment of the present application.
  • the system 100 can receive the data 102 to be processed, process the data 102 to be processed, and generate processed data 108.
  • the system 100 may receive the data to be encoded and encode the data to be encoded to generate encoded data, or the system 100 may receive the data to be decoded and decode the data to be decoded to generate decoded data.
  • the components in the system 100 may be implemented by one or more processors.
  • the processor may be a processor in a computing device or a processor in a mobile device (such as a drone).
  • the processor may be any type of processor, which is not limited in the embodiment of the present application.
  • the processor may include an encoder, decoder, or codec.
  • the system 100 may also include one or more memories.
  • the memory can be used to store instructions and data, for example, computer-executable instructions that implement the technical solutions of the embodiments of the present application, to-be-processed data 102, processed data 108, and so on.
  • the memory can be any type of memory, which is not limited in the embodiment of the present application.
  • the data to be encoded may include text, images, graphic objects, animation sequences, audio, video, or any other data that needs to be encoded.
  • The data to be encoded may include sensor data from sensors, which may be vision sensors (for example, cameras or infrared sensors), microphones, near-field sensors (for example, ultrasonic sensors or radars), position sensors, temperature sensors, touch sensors, and so on.
  • the data to be encoded may include information from the user, for example, biological information, which may include facial features, fingerprint scans, retinal scans, voice recordings, DNA sampling, and the like.
  • Fig. 2 is a schematic diagram of a video coding framework 2 according to an embodiment of the present application.
  • As shown in FIG. 2, after receiving the video to be encoded, starting from the first frame of the video to be encoded, each frame in the video to be encoded is encoded in turn.
  • the current coded frame mainly undergoes processing such as prediction (Prediction), transformation (Transform), quantization (Quantization), and entropy coding (Entropy Coding), and finally the bit stream of the current coded frame is output.
  • The decoding process usually decodes the received bitstream according to the inverse of the above process to recover the video frame information.
  • The video encoding framework 2 includes an encoding control module 201, which performs decision-making control actions and parameter selection in the encoding process.
  • The encoding control module 201 controls the parameters used in transform, quantization, inverse quantization, and inverse transform, controls the selection of intra or inter mode, and controls the parameters of motion estimation and filtering.
  • The control parameters of the encoding control module 201 are also input to the entropy encoding module and encoded to form a part of the encoded bitstream.
  • The frame to be encoded is partitioned 202: specifically, it is first divided into slices, which are then divided into blocks.
  • In the partitioning, the frame to be encoded is divided into a plurality of non-overlapping coding tree units (Coding Tree Units, CTUs), and each CTU can be iteratively divided, in a quadtree, binary tree, or ternary tree manner, into a series of smaller coding units (Coding Unit, CU).
  • the CU may also include a prediction unit (Prediction Unit, PU) and a transformation unit (Transform Unit, TU) associated with it.
  • The PU is the basic unit of prediction, and the TU is the basic unit of transform and quantization.
  • the PU and the TU are respectively obtained by dividing into one or more blocks on the basis of the CU, where one PU includes multiple prediction blocks (Prediction Block, PB) and related syntax elements.
  • the PU and TU may be the same, or they may be obtained by the CU through different division methods.
  • at least two of the CU, PU, and TU are the same.
  • CU, PU, and TU are not distinguished, and prediction, quantization, and transformation are all performed in units of CU.
  • the CTU, CU, or other formed data units are all referred to as coding blocks in the following.
  • the data unit for video encoding may be a frame, a slice, a coding tree unit, a coding unit, a coding block, or any group of the above.
  • the size of the data unit can vary.
  • a prediction process is performed to remove the spatial and temporal redundant information of the current coded frame.
  • predictive coding methods include intra-frame prediction and inter-frame prediction.
  • Intra-frame prediction uses only the reconstructed information in the current frame to predict the current coding block
  • inter-frame prediction uses the information in other previously reconstructed frames (also called reference frames) to predict the current coding block.
  • Specifically, in this embodiment of the present application, the encoding control module 201 is used to decide whether to select intra-frame prediction or inter-frame prediction.
  • The process of intra-frame prediction 203 includes: obtaining the reconstructed blocks of coded neighboring blocks around the current coding block as reference blocks; calculating a predicted value from the pixel values of the reference blocks according to the prediction mode to generate the prediction block; and subtracting the corresponding pixel values of the current coding block and the prediction block to obtain the residual of the current coding block. The residual of the current coding block is transformed 204, quantized 205, and entropy coded 210 to form the code stream of the current coding block. After all the coding blocks of the current frame undergo the above coding process, they form a part of the coded stream of the frame. In addition, the control and reference data generated in the intra-frame prediction 203 are also encoded by the entropy encoding 210 to form a part of the encoded bitstream.
  • the transform 204 is used to remove the correlation of the residual of the image block, so as to improve the coding efficiency.
  • The transformation of the residual data of the current coding block usually adopts a two-dimensional discrete cosine transform (DCT) or a two-dimensional discrete sine transform (DST): for example, the residual information of the coding block is multiplied by an N×M transformation matrix and its transpose, and the transform coefficients of the current coding block are obtained after the multiplication.
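  • As an illustration of the separable two-dimensional transform just described, the sketch below multiplies a residual block by a DCT basis matrix and its transpose. The orthonormal DCT-II basis and the function names are illustrative assumptions, not taken from this application.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n (an assumed basis)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] /= np.sqrt(2.0)           # DC row normalization
    return m

def transform_residual(residual: np.ndarray) -> np.ndarray:
    """Separable 2-D transform: coefficients = T_rows @ residual @ T_cols^T."""
    tn = dct_matrix(residual.shape[0])
    tm = dct_matrix(residual.shape[1])
    return tn @ residual @ tm.T

# A flat 4x4 residual concentrates all of its energy in the DC coefficient.
coeff = transform_residual(np.full((4, 4), 3.0))
print(np.round(coeff, 2))  # only coeff[0, 0] (= 12.0) is non-zero
```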
  • quantization 205 is used to further improve the compression efficiency.
  • The transform coefficients are quantized to obtain quantized coefficients, and the quantized coefficients are then entropy coded 210 to obtain the residual code stream of the current coding block, for example by context-adaptive binary arithmetic coding (Context Adaptive Binary Arithmetic Coding, CABAC), although the entropy coding is not limited to CABAC.
  • The coded neighboring block in the intra prediction 203 process is a neighboring block that has been coded before the current coding block; the residual generated in the coding of that neighboring block is transformed 204, quantized 205, inverse quantized 206, and inverse transformed 207, and the reconstructed block is obtained by adding the result to the prediction block of the neighboring block.
  • The inverse quantization 206 and the inverse transform 207 are the inverse processes of the quantization 205 and the transform 204, and are used to restore the residual data before quantization and transformation.
  • the inter-frame prediction process includes motion estimation (ME) 208 and motion compensation (MC) 209.
  • Motion estimation 208 is performed according to the reference frame images among the reconstructed video frames: the image block most similar to the current coding block is searched for in one or more reference frame images according to a certain matching criterion and serves as the matching block. The relative displacement between the matching block and the current coding block is the motion vector (Motion Vector, MV) of the current coding block.
  • The pixel values of the corresponding prediction block are subtracted from the original pixel values of the coding block to obtain the residual of the coding block.
  • the residual of the current coding block is transformed 204, quantized 205, and entropy coding 210 to form a part of the code stream of the coded frame.
  • the control and reference data generated in the motion compensation 209 are also encoded by the entropy encoding 210 to form a part of the encoded bitstream.
  • the reconstructed video frame is a video frame obtained after filtering 211.
  • Filtering 211 is used to reduce compression distortions such as blocking effects and ringing effects generated in the encoding process.
  • the reconstructed video frame is used to provide reference frames for inter-frame prediction during the encoding process.
  • At the decoding end, the reconstructed video frame is output after post-processing as the final decoded video.
  • Fig. 3 is a schematic diagram of a video decoding framework 3 according to an embodiment of the present application.
  • video decoding executes operation steps corresponding to video encoding.
  • The residual data undergoes inverse quantization 302 and inverse transform 303 to obtain the original residual information.
  • If intra-frame prediction is used, the prediction information is constructed from reconstructed image blocks in the current frame according to the intra-frame prediction method; if inter-frame prediction is used, the reference block is determined in the reconstructed image according to the decoded motion compensation syntax to obtain the prediction information. The prediction information and the residual information are then superimposed and filtered 311 to obtain the reconstructed video frame, and the decoded video is obtained after the reconstructed video frame undergoes post-processing 306.
  • In inter-frame prediction, an affine motion compensation prediction (Affine) mode is added to realize the prediction of irregular motions such as zooming, rotation, and perspective motion, improving the performance of the video codec.
  • In the Affine mode, the coding block is divided into multiple sub-coding blocks; for example, the coding unit CU is divided into multiple sub-coding units (sub-CU), where each sub-CU corresponds to its own motion vector MV and is predicted based on that MV to obtain multiple prediction blocks. Therefore, although the Affine mode can realize the prediction of a variety of irregular motions, it also brings a large bandwidth.
  • the sub-pixel interpolation filter in the Affine mode is adjusted, and the MV of the coding block is limited to reduce the bandwidth pressure caused by the Affine mode.
  • the aforementioned non-Affine mode may include three inter-frame prediction modes in the current HEVC or other video coding and decoding standards: inter mode, merge mode, and skip mode.
  • the non-Affine mode includes but is not limited to the above inter mode, merge mode, and skip mode, and other inter prediction modes except the Affine mode are called non-Affine modes.
  • For coding blocks of the same size, the Affine mode divides the block into multiple sub-coding blocks for prediction, whereas the non-Affine mode predicts the block directly.
  • A sub-pixel is a virtual pixel obtained by interpolation calculation between the integer pixels in an image frame. For example, if one sub-pixel is inserted between two integer pixels, the pixel accuracy is 1/2, and the sub-pixel is the 1/2 pixel; if three sub-pixels are inserted between two integer pixels, the pixel accuracy is 1/4, and the three sub-pixels are called the 1/4 pixel, the 1/2 pixel, and the 3/4 pixel.
  • A_{i,j} are the integer pixels in the video frame, where i and j are integers; apart from A_{i,j}, the pixels between the integer pixels, such as a_{i,j}, b_{i,j}, c_{i,j}, d_{i,j} and so on, are sub-pixels.
  • Three sub-pixels are interpolated between two integer pixels. For example, a_{0,0}, b_{0,0} and c_{0,0} are interpolated between A_{0,0} and A_{1,0}, and three sub-pixels d_{0,0}, h_{0,0} and n_{0,0} are interpolated between A_{0,0} and A_{0,1}. Among them, a_{0,0} and d_{0,0} are 1/4 pixels, b_{0,0} and h_{0,0} are half pixels (1/2 pixels), and c_{0,0} and n_{0,0} are 3/4 pixels.
  • If the size of the coding block is 2×2, as shown by the black box in Figure 4, then in addition to the 4 integer pixels A_{0,0}, A_{1,0}, A_{0,1} and A_{1,1} in the coding block, some integer pixels outside the coding block are also needed for the sub-pixel interpolation.
  • a_{0,0}, b_{0,0} and c_{0,0} can be calculated using the integer pixels in the horizontal direction, optionally using the 8 integer pixels from A_{-3,0} to A_{4,0}.
  • d_{0,0}, h_{0,0} and n_{0,0} can be calculated using the integer pixels in the vertical direction, optionally using the 8 integer pixels from A_{0,-3} to A_{0,4}.
  • qfilter denotes the filter coefficients of the 7-tap interpolation filter used to calculate the 1/4 pixels and 3/4 pixels, and hfilter denotes the filter coefficients of the 8-tap interpolation filter used to calculate the 1/2 pixels; applying these filters to the integer pixels yields a_{0,0}, b_{0,0}, c_{0,0}, d_{0,0}, h_{0,0} and n_{0,0}.
  • For the other sub-pixel points, the calculation method is similar to that of a_{0,0}, b_{0,0}, c_{0,0}, d_{0,0}, h_{0,0} and n_{0,0} above; those skilled in the art can calculate their pixel values with reference to the above calculation formulas and the prior art, which will not be repeated here.
  • The filter coefficients of the 7-tap interpolation filter and the 8-tap interpolation filter described above may refer to the filter coefficients in HEVC or other related technologies, or may be any other filter coefficients, which is not limited in this embodiment of the application.
  • The 8-tap interpolation filter and the 7-tap interpolation filter can also be used to continue the interpolation calculation on the basis of the 1/4 pixels, so that higher pixel accuracy can be obtained, such as sub-pixel points with 1/16 pixel accuracy.
  • Different interpolation filters have different filter coefficients; the smaller the number of taps, the fewer integer pixels the interpolation filter requires for the interpolation calculation.
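  • The sketch below illustrates the horizontal interpolation just described, computing the 1/4, 1/2 and 3/4 pixels between two integer pixels of one row. The coefficients are the HEVC luma filter coefficients, used here only as one plausible choice; as noted above, this application does not fix the coefficients, and the function name and tap-anchor convention are our own.

```python
# HEVC-style luma filter coefficients (sum = 64), used purely for illustration.
QFILTER = [-1, 4, -10, 58, 17, -5, 1]        # 7 taps: 1/4-pixel position
HFILTER = [-1, 4, -11, 40, 40, -11, 4, -1]   # 8 taps: 1/2-pixel position

def interp_1d(pixels, pos, coeffs, anchor):
    """Weighted sum of the integer pixels around `pos`, rounded and
    normalized by the coefficient sum 64.  `anchor` is the offset of the
    first tap relative to `pos`; anchor = -3 with 8 taps covers the pixels
    A_{-3,0} .. A_{4,0} mentioned in the text."""
    acc = sum(c * pixels[pos + anchor + i] for i, c in enumerate(coeffs))
    return (acc + 32) >> 6

# One row of integer pixels A_{i,0}; index 3 plays the role of A_{0,0}.
row = [100, 102, 104, 110, 130, 132, 131, 129, 128, 127]
a00 = interp_1d(row, 3, QFILTER, -3)         # 1/4 pixel
b00 = interp_1d(row, 3, HFILTER, -3)         # 1/2 pixel
c00 = interp_1d(row, 3, QFILTER[::-1], -2)   # 3/4 pixel (mirrored taps)
print(a00, b00, c00)                         # e.g. 115 120 126
```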
  • the encoding block includes a luma encoding block and a chroma encoding block, or in other words, the above encoding block includes a luma component (Luma) and a chroma component (Chroma).
  • one coding block includes one luminance coding block and two chrominance coding blocks, where the two chrominance coding blocks are respectively a red chrominance coding block (Cr) and a blue chrominance coding block (Cb).
  • the size of one chroma coding block is the same as the size of 4 luma coding blocks, and one chroma coding block corresponds to 4 luma coding blocks.
  • In a unidirectional prediction mode, the current coding block constructs only one motion vector candidate list, selects a motion vector from it, and finds the reference block in the reference frame according to that motion vector; if the motion vector does not have integer-pixel accuracy, sub-pixel interpolation is performed on the reference block to obtain the prediction block of the current coding block.
  • For a luminance coding block of N×M size, only one luminance reference block is needed. If an 8-tap interpolation filter is used to perform the interpolation calculation on the luminance reference block, the average number of integer pixels of the reference block required per pixel of the luminance coding block, that is, the pixel bandwidth, is: (N+7) × (M+7)/(N × M).
  • The corresponding chrominance coding block size is N/2×M/2. If a 4-tap interpolation filter is used to interpolate the chrominance reference block, the average number of integer pixels of the reference block required per pixel of the chrominance coding block, that is, the pixel bandwidth, is: (N/2+3) × (M/2+3)/((N/2) × (M/2)).
  • In the bidirectional prediction mode, the current coding block constructs two motion vector candidate lists, selects two motion vectors from them, and finds two reference blocks in the two reference frames according to the two motion vectors; if a motion vector does not have integer-pixel accuracy, sub-pixel interpolation is performed on the two reference blocks, and the two interpolated reference blocks are weighted to obtain the prediction block of the current coding block.
  • the two reference frames are the video frame before the currently encoded frame (historical frame) and the video frame after the currently encoded frame (future frame), respectively.
  • the bidirectional prediction mode is one of the dual motion vector prediction modes.
  • the dual motion vector prediction mode includes a dual forward prediction mode, a dual backward prediction mode, and the foregoing bidirectional prediction mode.
  • the dual forward prediction mode includes two forward motion vectors
  • the dual backward prediction mode includes two backward motion vectors.
  • In the dual forward prediction mode, the two reference frames are both video frames before the current coded frame (historical frames); in the dual backward prediction mode, the two reference frames are both video frames after the current coded frame (future frames).
  • The following uses the bidirectional prediction mode as an example to illustrate the bandwidth calculation of the coding block. It should be understood that, in this application, the bandwidth calculation of the coding block in the dual forward prediction mode and the dual backward prediction mode can refer to the relevant description of the bidirectional prediction mode, and will not be repeated here.
  • In the bidirectional prediction mode, the pixel bandwidth of the luminance coding block is twice that of the unidirectional prediction mode, and the pixel bandwidth of the chroma coding block is also twice that of the unidirectional prediction mode; that is: 2 × (N+7) × (M+7)/(N × M) for the luminance coding block and 2 × (N/2+3) × (M/2+3)/((N/2) × (M/2)) for the chrominance coding block.
  • In the worst case, the total bandwidth is 11.34, and when the coding block is of other sizes, the total bandwidth is less than 11.34. It should be understood that, for coding blocks of the same size, the pixel bandwidth in the unidirectional prediction mode (Uni) is smaller than that in the bidirectional prediction mode (Bi).
  • Therefore, in the non-Affine mode, the required bandwidth is at most 11.34.
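  • A minimal sketch of the per-pixel bandwidth formulas above, assuming an 8-tap luminance filter, a 4-tap chrominance filter, and 4:2:0 sampling; the function names are ours, and the doubling for bidirectional prediction follows the statement above.

```python
def luma_bandwidth(n, m, taps=8, directions=1):
    """Integer reference pixels fetched per pixel of an N x M luma block."""
    return directions * (n + taps - 1) * (m + taps - 1) / (n * m)

def chroma_bandwidth(n, m, taps=4, directions=1):
    """Per pixel of the N/2 x M/2 chroma block (4:2:0 assumed)."""
    cn, cm = n // 2, m // 2
    return directions * (cn + taps - 1) * (cm + taps - 1) / (cn * cm)

for n, m in [(8, 8), (8, 16), (16, 16)]:
    print(f"{n}x{m}: luma uni={luma_bandwidth(n, m):.2f}, "
          f"luma bi={luma_bandwidth(n, m, directions=2):.2f}, "
          f"chroma uni={chroma_bandwidth(n, m):.2f}")
```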
  • FIG. 5 shows a schematic flowchart of a video encoding and decoding method 200.
  • the video encoding and decoding method can be applied to the encoding end, in which case the method is specifically referred to as a video encoding method, or may be applied to the decoding end, in which case the method is specifically referred to as a video decoding method.
  • the video encoding method 200 includes:
  • In the non-Affine mode, a coding block is used as a whole to perform motion estimation and motion compensation to obtain the corresponding predicted value and coding information.
  • In the Affine mode, a coding block can be divided into multiple sub-coding blocks, and motion compensation is performed according to the motion vector of each of the multiple sub-coding blocks to obtain multiple prediction values and multiple pieces of coding information.
  • the coding block may be a coding unit CU, or may be another type of image block, which is not specifically limited in the embodiment of the present application.
  • The aforementioned coding block may be 8×8 pixels or larger and smaller than 128×128 pixels; for example, the size of the coding block may be 8×8, 8×16, 16×16, 8×128, and so on. The specific coding block size is not specifically limited in the embodiments of the present application.
  • The aforementioned sub-coding block may be referred to as a sub-coding unit (sub-CU), and the sub-coding block may have a size of 4×4 pixels or another pixel size; the specific size of the sub-coding block is also not specifically limited in the embodiments of this application.
  • S220 Calculate the motion vectors of multiple sub-coding blocks in the coding block according to the motion vector of the control point.
  • the motion vector of each sub-coding block in the coding block can be calculated by obtaining the control point motion vector (CPMV) of the current coding block in the Affine mode.
  • CPMV control point motion vector
  • control point motion vector CPMV may be a motion vector of two control points, and the Affine mode in this case is also called the four-parameter Affine mode.
  • control point motion vector CPMV may be a motion vector of three control points, and the Affine mode in this case is also called the six-parameter Affine mode.
  • In the four-parameter Affine mode, the motion vector MV of the sub-coding block can be calculated through the CPMVs of two control points, where the MV of the sub-coding block at position (x, y) is given by calculation formula (1):
    mv_x = (mv_1x - mv_0x)/W × x - (mv_1y - mv_0y)/W × y + mv_0x
    mv_y = (mv_1y - mv_0y)/W × x + (mv_1x - mv_0x)/W × y + mv_0y    (1)
  • W is the pixel width of the coding block
  • x, y are the relative position coordinates of the sub-coding block in the coding block
  • mv_0x and mv_0y are the motion vector of the zeroth control point, which is the control point in the upper left corner of Fig. 6a;
  • mv_1x and mv_1y are the motion vector of the first control point, which is the control point in the upper right corner of Fig. 6a.
  • In the six-parameter Affine mode, the motion vector of the sub-coding block can also be calculated through the CPMVs of three control points, where the MV of the sub-coding block at position (x, y) is given by calculation formula (2):
    mv_x = (mv_1x - mv_0x)/W × x + (mv_2x - mv_0x)/H × y + mv_0x
    mv_y = (mv_1y - mv_0y)/W × x + (mv_2y - mv_0y)/H × y + mv_0y    (2)
  • W and H are the pixel width and pixel height of the coding block
  • x, y are the relative position coordinates of the sub-coding block in the CU
  • mv_0x and mv_0y are the motion vector of the zeroth control point, mv_1x and mv_1y are the motion vector of the first control point, and mv_2x and mv_2y are the motion vector of the second control point;
  • the second control point is the control point in the lower left corner of Fig. 6b.
  • the motion vector MV of each sub-coding block in the coding block can be calculated by the above formula.
  • For example, the coding block can be divided into 16 sub-coding blocks, and the MVs of the 16 sub-coding blocks can be as shown in Fig. 6c; therefore, one coding block corresponds to multiple motion vectors, and image prediction can be performed on the coding block more accurately.
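  • The sketch below evaluates calculation formulas (1) and (2) for every sub-coding block of a coding block. The 4×4 sub-block size matches the example above; the sample CPMV values and the use of floating point (rather than fixed-point 1/16-pixel arithmetic) are illustrative assumptions.

```python
def affine_subblock_mvs(w, h, cpmvs, sub=4):
    """Sub-block MVs from control-point MVs per formulas (1)/(2).

    cpmvs holds (mv_0x, mv_0y) and (mv_1x, mv_1y) for the four-parameter
    model, plus (mv_2x, mv_2y) for the six-parameter model; (x, y) is the
    position of the sub-block relative to the top-left of the coding block.
    """
    (mv0x, mv0y), (mv1x, mv1y) = cpmvs[0], cpmvs[1]
    mvs = {}
    for y in range(0, h, sub):
        for x in range(0, w, sub):
            if len(cpmvs) == 2:   # four-parameter Affine mode, formula (1)
                mvx = (mv1x - mv0x) / w * x - (mv1y - mv0y) / w * y + mv0x
                mvy = (mv1y - mv0y) / w * x + (mv1x - mv0x) / w * y + mv0y
            else:                 # six-parameter Affine mode, formula (2)
                mv2x, mv2y = cpmvs[2]
                mvx = (mv1x - mv0x) / w * x + (mv2x - mv0x) / h * y + mv0x
                mvy = (mv1y - mv0y) / w * x + (mv2y - mv0y) / h * y + mv0y
            mvs[(x, y)] = (mvx, mvy)
    return mvs

# A 16x16 coding block with two control points yields 16 sub-block MVs.
mvs = affine_subblock_mvs(16, 16, [(1.0, 0.5), (2.0, 1.0)])
print(len(mvs), mvs[(0, 0)], mvs[(12, 12)])   # 16 (1.0, 0.5) (1.375, 1.625)
```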
  • the MV accuracy of each sub-coding block may be the same or different.
  • the embodiment of the present application does not specifically limit the MV accuracy of the sub-coding block. For example, it can be 4, 2, 1, 1/2, 1/4, 1/8, or 1/16 pixel accuracy.
  • the motion vector of each sub-encoding block calculated by the above formula can reach a maximum of 1/16 pixel accuracy after being rounded.
  • In the motion compensation, the corresponding reference block in the reference frame is determined according to the MV of the sub-coding block, and sub-pixel interpolation is performed on the reference block to obtain the sub-prediction block of the sub-coding block, where the pixel precision of the sub-pixel interpolation is 1/16.
  • the MVs calculated according to the CPMV and the aforementioned formula (1) or formula (2) are different for different sub-coding blocks.
  • the foregoing coding block may be a luminance coding block, which calculates the MVs of multiple sub-luminance coding blocks in the luminance coding block through a four-parameter Affine mode or a six-parameter Affine mode.
  • Since the size of a chroma coding block is the same as the size of four luma coding blocks, and a chroma coding block corresponds to 4 luma coding blocks, after the MVs of all sub-luminance coding blocks in the luminance coding block are obtained by the above calculation formulas, four first sub-luminance coding blocks among the plurality of sub-luminance coding blocks correspond to one first sub-chrominance coding block in the chrominance coding block.
  • The MV of the first sub-chrominance coding block is the average value of the MVs of the four first sub-luminance coding blocks.
  • the MVs of all sub-chrominance coding blocks in the chrominance coding block can be calculated.
  • the sub-luminance coding block and the sub-chrominance coding block both have a size of 4 ⁇ 4 pixels.
  • In the 4:2:0 format, the average value of the motion vectors of the four first sub-luminance coding blocks is calculated; in the 4:2:2 format, the average value of the motion vectors of the two first sub-luminance coding blocks is calculated, as sketched below.
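  • A small sketch of this averaging, assuming floating-point MVs; the helper name is ours.

```python
def chroma_mv(sub_luma_mvs):
    """MV of one sub-chrominance block as the mean of the MVs of the
    co-located sub-luminance blocks (four in 4:2:0, two in 4:2:2)."""
    n = len(sub_luma_mvs)
    return (sum(mv[0] for mv in sub_luma_mvs) / n,
            sum(mv[1] for mv in sub_luma_mvs) / n)

# Four first sub-luminance blocks covering one 8x8 luma area (4:2:0 case).
print(chroma_mv([(1.0, 0.5), (1.25, 0.5), (1.0, 0.75), (1.25, 0.75)]))
```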
  • At the encoding end, rate-distortion optimization (Rate Distortion Optimization, RDO) technology or other technologies are used to make the mode decision. If the decision is to use the Affine mode for predictive coding of the coding block, only the MVs of the control points, namely the CPMV, are written into the code stream; it is not necessary to write the MV of each sub-coding block in the CU into the code stream. The decoder can obtain the CPMV from the code stream, and thus calculate the MV of each sub-coding block in the CU according to the CPMV.
  • the Affine mode includes the Affine_AMVP mode and the Affine_merge mode.
  • If the Affine_AMVP mode is used to perform predictive coding on the coding block, the predicted value of the coding block's CPMV, namely the CPMVP, is obtained; the residual between the CPMV and the CPMVP, namely the CPMVD, is calculated; and the CPMVD and CPMVP-related information are written into the code stream.
  • In the Affine_AMVP mode, the size of the coding block is greater than or equal to 16×16 pixels.
  • In the Affine_merge mode, the size of the coding block is greater than or equal to 8×8 pixels.
  • motion compensation is performed on the multiple sub-coding blocks in the coding block.
  • a 6-tap interpolation filter may be used to perform sub-pixel interpolation processing to perform motion compensation on the sub-coding block, and an 8-tap or other number of taps interpolation filters may also be used to perform interpolation processing.
  • the number of taps is also the number of filter coefficients of the interpolation filter. The larger the number of taps, the more pixels the interpolation filter needs and the larger the transmission bandwidth required.
  • a 6-tap interpolation filter is used to perform sub-pixel interpolation to perform motion compensation on the sub-luminance coding block
  • A 4-tap interpolation filter is used to perform sub-pixel interpolation to perform motion compensation on the sub-chrominance coding block.
  • A 6-tap interpolation filter performs sub-pixel interpolation processing on the reference block of the first sub-luminance coding block to obtain the first sub-luminance prediction block corresponding to the first sub-luminance coding block; and/or,
  • a 4-tap interpolation filter is used to perform sub-pixel interpolation processing on the reference block of the first sub-chrominance coding block to obtain the first sub-chrominance prediction block corresponding to the first sub-chrominance coding block.
  • The filter coefficients of the 6-tap interpolation filter and the 4-tap interpolation filter in the embodiment of the present application may be filter coefficients from the prior art, or any other filter coefficients; this is not limited in the embodiment of the present application.
  • If a unidirectional prediction (Uni) mode is used for prediction, a reference frame list is constructed, and one frame of image is selected from the reference frame list for image prediction.
  • When the previous reconstructed frame (historical frame) of the current frame is selected to predict the coding block in the current frame, the process is called "forward prediction"; when a frame after the current frame (future frame) is selected, the process is called "backward prediction".
  • both forward prediction and backward prediction are unidirectional prediction.
  • In the Affine mode, the 4×4 sub-coding block in the coding block is used as the unit for inter-frame prediction.
  • For a 4×4 sub-luminance coding block, if a 6-tap interpolation filter is used to perform the interpolation calculation on the luminance reference block in the reference frame, the average number of integer pixels of the reference block required per pixel of the sub-luminance coding block, that is, the pixel bandwidth, is: (4+5) × (4+5)/(4 × 4) ≈ 5.06.
  • For the chrominance component, the 4×4 sub-chrominance coding block in the coding block is likewise used as the unit for inter-frame prediction.
  • A 4×4 sub-chrominance coding block corresponds to an 8×8 luminance area. If a 4-tap interpolation filter is used to interpolate the chrominance reference block in the reference frame, the average number of integer pixels of the reference block required per pixel of the sub-chrominance coding block, that is, the pixel bandwidth, is: (4+3) × (4+3)/(4 × 4) ≈ 3.06.
  • For an N×M coding block, including one N×M luminance coding block and two N/2×M/2 chrominance coding blocks (N and M are positive integers), in the unidirectional prediction mode the total pixel bandwidth, normalized to the luminance pixels, is: (4+5) × (4+5)/(4 × 4) + 2 × (4+3) × (4+3)/(8 × 8) ≈ 6.59.
  • When the bidirectional prediction mode (Bi), the dual forward prediction mode, or the dual backward prediction mode is adopted, for the coding block including one N×M luminance coding block and two N/2×M/2 chrominance coding blocks, the total pixel bandwidth is twice that of the unidirectional case: 2 × [(4+5) × (4+5)/(4 × 4) + 2 × (4+3) × (4+3)/(8 × 8)] ≈ 13.18.
  • Therefore, when the prediction mode is unidirectional, the total pixel bandwidth in the Affine prediction mode is 6.59, which is less than the maximum total pixel bandwidth of 11.34 in the non-Affine prediction mode; when the prediction mode is bidirectional, the total pixel bandwidth in the Affine prediction mode is 13.18, which is greater than the maximum total pixel bandwidth of 11.34 in the non-Affine prediction mode.
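  • The totals quoted above can be checked with the short calculation below. Normalizing the chrominance term to the 8×8 luminance area covered by one 4×4 sub-chrominance block is our assumption; it is the normalization that reproduces the 6.59 figure.

```python
luma = (4 + 5) * (4 + 5) / (4 * 4)        # 4x4 sub-luma block, 6-tap filter
chroma = 2 * (4 + 3) * (4 + 3) / (8 * 8)  # Cb and Cr, 4-tap, per 8x8 luma area
uni_total = luma + chroma
print(round(uni_total, 2), round(2 * uni_total, 2))   # 6.59 13.18
```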
  • An encoding block includes multiple sub-encoding blocks, and the motion vector of each sub-encoding block can reach up to 1/16-pixel accuracy; multiple motion vectors and high-precision pixels also cause greater bandwidth pressure. Therefore, in order to reduce the bandwidth pressure in the Affine mode, the motion vectors in the Affine mode need to be restricted: in addition to the above-mentioned 6-tap interpolation filter, which reduces the number of pixels fetched in the motion compensation process, the motion vectors of the coding block in the Affine mode are limited.
  • FIG. 7 shows a schematic flowchart of a specific video encoding method 300, which is suitable for the video encoding end; the method includes a process of restricting the motion vectors of the coding block in the Affine mode, so as to reduce the bandwidth in the Affine mode.
  • the video encoding method 300 includes:
  • S311 Construct a candidate list of the control point motion vector of the coding block in the Affine mode, and obtain the control point motion vector of the coding block from the candidate list through rate-distortion cost (Rate Distortion Cost, RD Cost) calculation;
  • step S311 in the embodiment of the present application may be a specific implementation of the above step S210.
  • control point motion vector CPMV of the current coding block may be a motion vector of three control points, or a motion vector of two control points.
  • the two control points are respectively located at the upper left corner and the upper right corner of the current coding block.
  • the three control points are respectively located at the upper left corner, the upper right corner and the lower left corner of the current coding block.
  • the candidate list includes CPMV candidate motion vectors for two control points, or CPMV candidate motion vectors for three control points.
  • the candidate motion vector in the candidate list may be obtained based on the motion vector of the adjacent coding block.
  • The motion vectors of adjacent coded blocks may include multiple types: the candidate may be a CPMV inferred and calculated from the CPMV of an adjacent coded block, a CPMV constructed from the translational motion vectors of adjacent coded blocks, or a CPMV calculated through other types of motion vectors of adjacent coded blocks, which is not limited in the embodiment of the present application.
  • the CPMV candidate list in the Affine_merge mode and the Affine_AMVP mode are different.
  • In the Affine_merge mode, the coding block is greater than or equal to 8×8 pixels.
  • The candidate list in the Affine_merge mode is constructed such that the candidate CPMVs are calculated from the CPMVs of neighboring coding blocks that are adjacent to the current coding block and are also coded in the Affine mode.
  • the optimal CPMV in the candidate list is obtained, which is used as the predicted value of the CPMV of the current coding block, that is, CPMVP, and the index of the CPMVP in the candidate list is written into the code stream.
  • In the Affine_AMVP mode, the coding block is greater than or equal to 16×16 pixels.
  • the CPMV in the candidate list in the Affine_AMVP mode can be inferred from the CPMV of the neighboring block, can also be constructed by using the translation MV of the neighboring block, or can be the converted MV of the neighboring block, and so on.
  • the optimal CPMV in the candidate list is obtained, and it is used as the CPMVP of the current coding block.
  • motion estimation is performed in the reference frame to obtain the CPMV of the current coding block.
  • The residual between the CPMV and the CPMVP of the coding block, also called the CPMVD, and the index of the CPMVP in the candidate list are written into the code stream.
  • the CPMV of the current coding block is obtained.
  • the CPMV may include MVs with two control points, or may include MVs with three control points.
  • the foregoing process of constructing the candidate list of coding blocks and obtaining the CPMV of the coding block from the candidate list may be a process of obtaining the CPMV of the luminance coding block, where the CPMV is the CPMV of the luminance coding block.
  • S320 Calculate the motion vectors of multiple sub-coding blocks in the coding block according to the motion vector of the control point.
  • this step S320 may be the same as step S220 in FIG. 5, and will not be repeated here.
  • The control point motion vector CPMV of the current coding block is used to calculate the MVs of the multiple sub-coding blocks in the coding block using the above calculation formula (1) or calculation formula (2).
  • Then, based on the MVs of the multiple sub-coding blocks, the motion vectors of the coding block undergo a restriction process. The specific restriction process can include:
  • S341 When performing unidirectional prediction on the coded block, perform restriction calculation on multiple first restricted blocks in the coded block according to the motion vectors of the multiple sub-coded blocks.
  • where a, b, c, d, e and f are constants of the affine motion model (the MV at position (x, y) is mv_x = a × x + c × y + e, mv_y = b × x + d × y + f), and x and y are the relative position coordinates of the sub-coding block in the coding block.
  • The luminance component of the coding block, that is, the multiple first restricted blocks in the luminance coding block, is subjected to the restriction calculation.
  • The first restricted block has a size of 4×8 or 8×4.
  • A 4×8 first restricted block includes two 4×4 sub-coding blocks, whose relative position coordinates are (0,0) and (0,4).
  • The relative horizontal components mv_x of the two sub-coding blocks' MVs can be expressed as (0, 4c), and the relative vertical components, including the position offset, as (0, 4d+4); the horizontal width bxW1 and the vertical height bxH1 of the area pointed to by all MVs inside the 4×8 block are then given by calculation formula (3) for the 4×8 first restricted block.
  • An 8×4 first restricted block also includes two 4×4 sub-coding blocks, whose relative position coordinates are (0,0) and (4,0).
  • The relative horizontal components of the two sub-coding blocks' MVs, including the position offset, can be expressed as (0, 4a+4), and the relative vertical components as (0, 4b); the horizontal width bxW2 and the vertical height bxH2 of the area pointed to by all MVs inside the 8×4 block are then given by calculation formula (4) for the 8×4 first restricted block.
  • If the restriction condition is satisfied, the motion vectors of all sub-coding blocks in the coding block are not modified, that is, the motion vectors calculated by the calculation formula in the four-parameter Affine mode (formula (1)) or the calculation formula in the six-parameter Affine mode (formula (2)) are maintained.
  • Otherwise, the motion vectors of all sub-coding blocks in the current coding block are set to the same motion vector.
  • The same motion vector may be the mean value of the multiple motion vectors of the sub-coding blocks calculated by the calculation formula in the four-parameter Affine mode (formula (1)), or the mean value of the multiple motion vectors calculated by the calculation formula in the six-parameter Affine mode (formula (2)).
  • The same motion vector may also be the value of any other motion vector, which is not limited in the embodiment of the present application.
  • S342 When performing bidirectional prediction on the coding block, perform restriction calculation on multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
  • two reference frame lists are constructed, and two frames of images are selected from the reference frame lists for image prediction.
  • the two frames of images may be historical frames and future frames respectively.
  • In this case, the motion vectors of the multiple sub-coding blocks are likewise used to perform predictive coding on the multiple sub-coding blocks, and the restriction calculation is performed on the second restricted blocks.
  • The following uses the bidirectional prediction mode as an example to illustrate the restriction calculation process of the multiple second restricted blocks in the coding block; the restriction calculation in the dual forward prediction mode or the dual backward prediction mode is the same as in the bidirectional prediction mode, and will not be repeated here.
  • The second restricted block has a size of 8×8.
  • In an 8×8 second restricted block there are four 4×4 sub-coding blocks, whose relative position coordinates are (0,0), (0,4), (4,0) and (4,4).
  • The relative horizontal components of the 4 sub-coding blocks' MVs, including the position offsets, can be expressed as (0, 4c, 4a+4, 4a+4c+4), and the relative vertical components as (0, 4d+4, 4b, 4b+4d+4); the horizontal width bxW and the vertical height bxH of the area pointed to by all MVs inside the 8×8 second restricted block are then given by calculation formula (5) for the 8×8 second restricted block.
  • If bxW × bxH does not exceed a preset threshold, the motion vectors of all sub-coding blocks in the coding block are not modified, that is, they are maintained; otherwise, the motion vectors of all sub-coding blocks in the current coding block are set to the same motion vector (a sketch of this check follows below).
  • The same motion vector may be the mean value of the multiple motion vectors of the sub-coding blocks calculated by the calculation formula in the four-parameter Affine mode (formula (1)), or the mean value of the multiple motion vectors calculated by the calculation formula in the six-parameter Affine mode (formula (2)).
  • The same motion vector may also be the value of any other motion vector, which is not limited in the embodiment of the present application.
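  • The following sketch reconstructs the restriction check from the description above, as a non-authoritative illustration: the bounding box of the positions pointed to by the sub-block MVs is extended by the sub-block size plus the filter taps minus one, and its area is compared with a preset threshold. The extension and the thresholds are assumptions (165 is the figure quoted later for the 8×4/4×8 first restricted blocks; 225 for the 8×8 case is only a placeholder).

```python
def referenced_area(positions, mvs, sub=4, taps=6):
    """Width and height (bxW, bxH) of the area referenced by the sub-block
    MVs of one restricted block.  The (sub + taps - 1) extension added to
    the MV spread is our assumption about formulas (3)-(5)."""
    xs = [p[0] + mv[0] for p, mv in zip(positions, mvs)]
    ys = [p[1] + mv[1] for p, mv in zip(positions, mvs)]
    return (max(xs) - min(xs) + sub + taps - 1,
            max(ys) - min(ys) + sub + taps - 1)

def restrict(positions, mvs, threshold):
    """Keep the affine sub-block MVs if the referenced area is within the
    threshold; otherwise fall back to one shared MV (here: their mean)."""
    bx_w, bx_h = referenced_area(positions, mvs)
    if bx_w * bx_h <= threshold:
        return mvs
    n = len(mvs)
    mean = (sum(m[0] for m in mvs) / n, sum(m[1] for m in mvs) / n)
    return [mean] * n

# Unidirectional case: a 4x8 first restricted block, two 4x4 sub-blocks at
# (0, 0) and (0, 4); the MVs spread too far, so one mean MV is used instead.
print(restrict([(0, 0), (0, 4)], [(0.0, 0.0), (4.0, 6.0)], threshold=165))
# Bidirectional case: an 8x8 second restricted block, four sub-blocks; the
# MV spread stays small enough, so the affine MVs are kept.
print(restrict([(0, 0), (0, 4), (4, 0), (4, 4)],
               [(0.0, 0.0), (0.5, 0.5), (0.5, 0.5), (1.0, 1.0)],
               threshold=225))
```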
  • In some embodiments, the coding block is subjected to both unidirectional prediction and bidirectional prediction: the restriction calculation is performed on the coding block in each prediction mode, and on this basis the RD Cost calculation is performed to select the optimal prediction mode.
  • Alternatively, only the above step S341 or only the above step S342 may be performed.
  • For example, when the current coding block is located in a unidirectional prediction slice, only unidirectional prediction is performed on the coding block: the restriction calculation is performed in the unidirectional prediction mode, and the RD Cost calculation is performed on this basis to select the optimal prediction mode.
  • The motion vectors of the chrominance coding block in the coding block may be calculated through the motion vectors of the luminance coding block. Therefore, after the luminance coding block undergoes the above restriction process, if the MVs of all sub-luminance coding blocks in the luminance coding block are the same MV, the MVs of all sub-chrominance coding blocks in the chrominance coding block are correspondingly also the same MV; if the MVs of all the sub-luminance coding blocks are different MVs, the MVs of the sub-chrominance coding blocks are also different.
  • the bandwidth pressure of the current encoding block in the Affine mode is basically the same as the bandwidth pressure of the current encoding block in the non-Affine mode.
  • In some embodiments, the MVs of the sub-coding blocks in the coding block are not restricted; each sub-coding block is used as a unit, and motion compensation is performed sequentially on each sub-coding block in the current coding block.
  • When the coding block is unidirectionally predicted, the coding block corresponds to one reference frame list, and the MV of each sub-coding block is a single MV; according to the MV, the prediction block or sub-prediction block corresponding to the coding block or sub-coding block can be directly determined.
  • When the coding block is bidirectionally predicted, the coding block corresponds to two reference frame lists, and the MV of each sub-coding block is a dual MV, where the dual MV consists of two MVs obtained through motion estimation; the two MVs may be the same or different.
  • According to the dual MV, two initial prediction blocks or two initial sub-prediction blocks corresponding to the coding block or sub-coding block are determined in the two reference frames respectively, and the two initial prediction blocks or initial sub-prediction blocks are then weighted to obtain the final prediction block or sub-prediction block.
  • When determining the prediction block or sub-prediction block in the reference frame according to the MV, it is necessary to perform sub-pixel interpolation on the reference block corresponding to the MV in the reference frame to obtain the prediction block or sub-prediction block, where the pixel accuracy of the prediction block or sub-prediction block is the same as the pixel accuracy of the MV.
  • An 8-tap interpolation filter, a 6-tap interpolation filter, or an interpolation filter with any other number of taps can be used to perform the sub-pixel interpolation processing, which is not limited in this embodiment of the application.
  • For example, for the luminance component a 6-tap interpolation filter is used for the sub-pixel interpolation processing, and for the chrominance component a 4-tap interpolation filter is used for the sub-pixel interpolation processing.
  • After motion compensation, the residual value of the coding block or sub-coding block is calculated. With the CPMV and the residual value obtained by the above video encoding method 300, the RD Cost of the current coding block is calculated and compared with the RD Cost of the current coding block in other modes, to confirm whether the Affine mode is used to predictively encode the current coding block.
  • If the Affine_merge mode is determined to be the prediction mode of the current coding block, the index of the CPMVP in the candidate list is written into the code stream; if the Affine_AMVP mode is determined to be the prediction mode of the current coding block, the index of the CPMVP and the CPMVD are written into the code stream together.
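  • A minimal sketch of this rate-distortion decision, using the usual cost J = D + λ·R; the mode names mirror the text, while the numbers and the λ value are illustrative only.

```python
def rd_cost(distortion, bits, lam):
    """Rate-distortion cost J = D + lambda * R used for the mode decision."""
    return distortion + lam * bits

LAM = 10.0  # illustrative Lagrange multiplier
modes = {
    "Affine_merge": rd_cost(1200, 40, LAM),   # signals the CPMVP index only
    "Affine_AMVP": rd_cost(1100, 55, LAM),    # signals CPMVP index + CPMVD
    "non-Affine inter": rd_cost(1500, 30, LAM),
}
print(min(modes, key=modes.get))   # the mode with the smallest RD cost
```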
  • After the above step S341 and step S342, when the prediction mode is unidirectional prediction, if the number of pixels of all the 8×4 first restricted blocks in the luminance coding block is less than or equal to 165, and the number of pixels of all the 4×8 first restricted blocks in the luminance coding block is also less than or equal to 165, then the average number of integer pixels of the reference block required per pixel of the luminance coding block, that is, the pixel bandwidth, is correspondingly bounded.
  • The pixel bandwidth of the chroma coding block, that is, the average number of integer pixels of the reference block used per pixel, is bounded in the same way.
  • In this case, the pixel bandwidth of the luminance coding block is less than 6.69, that is, less than the maximum pixel bandwidth of 11.34 in the non-Affine mode; the bandwidth of the Affine mode is then smaller than the bandwidth of the non-Affine mode.
  • the bandwidth of the Affine mode is basically equivalent to the bandwidth of the non-Affine mode at this time.
  • Similarly, when the prediction mode is bidirectional prediction, if the pixel count of every 8×8 second restricted block in the luma coding block is at most 225, the pixel bandwidth of the luma coding block satisfies L_A ≤ 2×225 / (8×8) ≈ 7.03.
  • Correspondingly, the pixel bandwidth of the chroma coding block is C_A = 2×((4+3)×(4+3)) / (8×8) ≈ 1.53.
  • The total is then at most about 7.03 + 2×1.53 ≈ 10.09, below the non-Affine maximum of 11.34, so the bandwidth of the Affine mode is smaller than that of the non-Affine mode; if the sub-block MVs are modified by the restriction, the bandwidth of the Affine mode is essentially equivalent to that of the non-Affine mode.
  • When the prediction mode is the dual forward or dual backward prediction mode, the bandwidth situation is the same as in the bidirectional prediction mode above and is not repeated here.
  • Specifically, if the luma component uses a 6-tap interpolation filter, the chroma component uses a 4-tap interpolation filter, and the MV restriction process is performed, the average number of integer pixels needed per integer pixel of the coding block in the Affine mode, in other words the required total bandwidth, is shown in Table 3.
  • According to the restriction of steps S341 and S342, it can be seen from Table 3 and Table 1 that, whether in the unidirectional, bidirectional, dual forward, or dual backward prediction mode, the bandwidth in the Affine mode is less than or equal to the bandwidth in the non-Affine mode, so no additional bandwidth pressure is introduced. A short calculation, sketched below, reproduces these bounds.
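  • The sketch below (Python) checks the bounds just quoted. It assumes 4×4 sub-chroma blocks interpolated with a 4-tap filter, each covering an 8×8 luma area under YCbCr 4:2:0, and uses the thresholds 165 and 225 from steps S341 and S342; it is a minimal illustration, not part of the described method.

```python
# Worst-case pixel bandwidth in Affine mode after the MV restriction of
# steps S341/S342: <=165 pixels per 8x4/4x8 restricted block (uni),
# <=225 pixels per 8x8 restricted block (dual-MV prediction).

CHROMA = (4 + 3) * (4 + 3) / (8 * 8)   # ~0.77 per luma pixel, one chroma plane

def uni_bound() -> float:
    luma = 165 / (8 * 4)               # <=165 pixels per 32-pixel restricted block
    return luma + 2 * CHROMA           # two chroma planes

def bi_bound() -> float:
    luma = 2 * 225 / (8 * 8)           # two reference lists, <=225 per 8x8 block
    return luma + 2 * (2 * CHROMA)

print(round(uni_bound(), 2))           # 5.16 + 1.53 -> 6.69 < 11.34
print(round(bi_bound(), 2))            # 7.03 + 3.06 -> 10.09 < 11.34
```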
  • On the basis of the video encoding and decoding method in FIG. 5, FIG. 8 shows a schematic flowchart of a specific video decoding method 400, which is applied at the video decoding end. The method also includes the motion vector restriction process for the coding block in the Affine mode, so as to reduce the bandwidth of the Affine mode.
  • The decoding process in this embodiment corresponds to the encoding process in FIG. 7: the video encoding method 300 is used to encode a coding block in the frame to be encoded to form the code stream of the coding block, and the video decoding method 400 is used to decode that code stream. For content not described in detail in this embodiment, reference may be made to the related description of the video encoding method 300 above.
  • Specifically, as shown in FIG. 8, the video decoding method 400 includes:
  • S412: Determine, according to the code stream of the coding block, that the coding mode of the coding block is the Affine mode, and obtain index information of the control point motion vector of the coding block;
  • S413: Construct a candidate list of control point motion vectors of the coding block, and obtain the control point motion vector according to the candidate list and the index information;
  • Specifically, the code stream of the coding block received by the video decoding end includes a flag bit identifying whether the coding block is in the Affine mode, and more specifically, whether it is in the Affine_merge mode or the Affine_AMVP mode. Through this flag bit it can be determined whether the coding mode of the coding block is the Affine mode, or more specifically the Affine_merge mode or the Affine_AMVP mode.
  • In addition, the code stream of the coding block includes index information of the CPMV of the current coding block in the candidate list.
  • If the coding mode is determined to be the Affine mode, a candidate list of CPMVs in the Affine mode is constructed. If the flag bit identifies the Affine_merge mode, a CPMV candidate list in the Affine_merge mode is constructed; the index value of the CPMV of the current coding block in that list is obtained from the code stream, and the CPMV of the current coding block is determined directly by the index value.
  • If the flag bit identifies the Affine_AMVP mode, a CPMV candidate list in the Affine_AMVP mode is constructed; the CPMVP index value and the CPMVD of the current coding block are obtained from the code stream, the CPMVP of the current coding block is located in the candidate list through the index value, and the CPMVP and the corresponding CPMVD are added to obtain the CPMV of the current coding block. A sketch of this reconstruction follows.
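  • A minimal sketch (Python) of this decoder-side reconstruction; the candidate-list contents and the parsing are assumptions, and only the CPMV = CPMVP + CPMVD relation for the Affine_AMVP mode is taken from the text.

```python
from typing import List, Tuple

MV = Tuple[int, int]

def cpmv_affine_merge(cand_list: List[List[MV]], index: int) -> List[MV]:
    # Affine_merge: the index alone selects the CPMV from the candidate list.
    return cand_list[index]

def cpmv_affine_amvp(cand_list: List[List[MV]], index: int,
                     cpmvd: List[MV]) -> List[MV]:
    # Affine_AMVP: the index selects the predictor CPMVP; adding the decoded
    # residual CPMVD per control point yields the CPMV.
    cpmvp = cand_list[index]
    return [(px + dx, py + dy) for (px, py), (dx, dy) in zip(cpmvp, cpmvd)]

cands = [[(0, 0), (4, 2)], [(8, 8), (12, 10)]]   # illustrative two-CP candidates
print(cpmv_affine_merge(cands, 1))               # [(8, 8), (12, 10)]
print(cpmv_affine_amvp(cands, 0, [(1, 1), (1, -1)]))  # [(1, 1), (5, 1)]
```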
  • The CPMV of the current coding block obtained in this way may be the motion vector of two control points or of three control points. When it is the motion vector of two control points, the calculation formula of the four-parameter Affine mode (formula (1)) is used to compute the MVs of the multiple sub-coding blocks in the current coding block; when it is the motion vector of three control points, the calculation formula of the six-parameter Affine mode (formula (2)) is used.
  • This step S420 may be the same as step S220 in FIG. 5 or step S320 in FIG. 7 and is not repeated here.
  • Specifically, whether the current coding block is in the bidirectional prediction (Bi) mode or the unidirectional prediction (Uni) mode is determined through the flag bit in the code stream, and different restriction methods are applied to the MVs of the coding block according to the prediction mode: the current coding block may adopt the unidirectional prediction mode (for example the forward prediction mode or the backward prediction mode) or the bidirectional prediction mode.
  • The process of restricting the MVs of the coding block during decoding is similar to that during encoding. Specifically, step S441 may be the same as step S341 in FIG. 7, and step S442 may be the same as step S342 in FIG. 7, which are not repeated here.
  • S450: Perform motion compensation on the coding block according to the result of the restriction.
  • Specifically, the MVs of the multiple sub-coding blocks in the coding block may be stored in a buffer, and motion compensation is then performed based on those MVs. This step S450 may also be the same as step S350 in FIG. 7; for the specific implementation, reference may be made to its related description, which is not repeated here.
  • With the restriction process and the 6-tap interpolation filter applied during decoding, the bandwidth in the Affine mode likewise remains less than or equal to the bandwidth in the non-Affine mode.
  • On the basis of the above embodiments, this application further proposes a more optimized encoding and decoding method which, while reducing the bandwidth of the Affine mode, also streamlines the restriction process in the Affine mode, thereby reducing the complexity of the encoding and decoding system and improving video coding and decoding performance.
  • FIG. 9 shows a flowchart of this further video encoding and decoding method according to an embodiment of the present application. As shown in FIG. 9, the video encoding and decoding method 500 includes:
  • S510: Obtain the control point motion vector of a coding block in the Affine mode, where the control point motion vector is used to calculate the motion vectors of multiple sub-coding blocks in the coding block.
  • Specifically, in the Affine mode a coding block can be divided into multiple sub-coding blocks, and motion compensation is performed according to the motion vector of each of the multiple sub-coding blocks to obtain multiple prediction values and multiple pieces of coding information.
  • Optionally, the coding block may be a coding unit CU, or another type of image block, which is not specifically limited in the embodiments of this application.
  • Optionally, the coding block may be greater than or equal to 8×8 pixels and smaller than 128×128 pixels, e.g. 8×8, 8×16, 16×16, 8×128, etc.; the specific size of the coding block is not limited in the embodiments of this application.
  • Optionally, the sub-coding block may be referred to as a sub-coding unit (sub-CU) and may be 4×4 pixels in size or another size; the specific size of the sub-coding block is likewise not limited in the embodiments of this application.
  • Specifically, the motion vector of each sub-coding block in the coding block can be calculated from the control point motion vector (CPMV) of the current coding block in the Affine mode. The CPMV may be the motion vector of two control points, in which case the Affine mode is also called the four-parameter Affine mode, or the motion vector of three control points, in which case it is also called the six-parameter Affine mode.
  • The process of calculating the motion vector MV of a sub-coding block from the CPMV may be the same as step S320 in the video encoding method 300; the calculation formulas for the sub-coding block MV in the four-parameter and six-parameter Affine modes are formula (1) and formula (2), respectively, and are not repeated here.
  • Optionally, the pixel accuracy of the CPMV of the coding block may be 4, 2, 1, 1/2, 1/4, 1/8, or 1/16, or any other pixel accuracy, which is not specifically limited in the embodiments of this application.
  • The MV accuracy of the individual sub-coding blocks in the coding block may be the same or different and is likewise not limited; for example, it may be 4, 2, 1, 1/2, 1/4, 1/8, or 1/16 pixel accuracy or another pixel accuracy.
  • Optionally, in one possible implementation, the motion vector of each sub-coding block calculated by the above formulas can reach at most 1/16 pixel accuracy after rounding.
  • In other words, when the MV accuracy of a sub-coding block is 1/16 pixel, the corresponding reference block in the reference frame is determined according to the MV of the sub-coding block, and sub-pixel interpolation with 1/16 pixel accuracy is performed on the reference block to obtain the sub-prediction block of the sub-coding block.
  • Since the sub-coding blocks occupy different coordinate positions within the coding block, the MVs calculated from the CPMV with formula (1) or formula (2) differ between sub-coding blocks.
  • Optionally, the foregoing coding block may be a luma coding block, for which the MVs of the multiple sub-luma coding blocks are calculated through the four-parameter or six-parameter Affine mode. For the chroma coding block, when the sampling format is YCbCr 4:2:0, one chroma coding block has the same size as four luma coding blocks and corresponds to those four luma coding blocks. Therefore, after the MVs of all sub-luma coding blocks are obtained with the above formulas, four first sub-luma coding blocks among the multiple sub-luma coding blocks correspond to one first sub-chroma coding block in the chroma coding block, and the MV of the first sub-chroma coding block is the average of the MVs of the four first sub-luma coding blocks. In this way the MVs of all sub-chroma coding blocks in the chroma coding block can be calculated; both the sub-luma and sub-chroma coding blocks are 4×4 pixels in size. A sketch of this averaging is given below.
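  • A minimal sketch (Python) of this averaging; the rounding used is an assumption, since the text states only that the average of the four MVs is taken.

```python
from typing import List, Tuple

def chroma_mv(luma_mvs: List[Tuple[int, int]]) -> Tuple[int, int]:
    # One 4x4 sub-chroma block corresponds to four 4x4 sub-luma blocks
    # under YCbCr 4:2:0; its MV is the average of their MVs.
    assert len(luma_mvs) == 4
    sx = sum(mv[0] for mv in luma_mvs)
    sy = sum(mv[1] for mv in luma_mvs)
    return ((sx + 2) >> 2, (sy + 2) >> 2)   # rounded average (assumed rounding)

print(chroma_mv([(4, 0), (6, 2), (4, 2), (6, 0)]))   # (5, 1)
```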
  • In addition, rate-distortion optimization (RDO) or another technique is used for mode decision. If the decision is to use the Affine mode for predictive coding of the coding block, only the MVs of the control points, i.e. the CPMV, are written into the code stream rather than the MV of every sub-coding block in the CU; the decoding end obtains the CPMV from the code stream and computes the MV of each sub-coding block in the CU from it.
  • Specifically, the Affine mode includes the Affine_AMVP mode and the Affine_merge mode. When the Affine_AMVP mode is used for predictive coding of the coding block, the prediction value of the CPMV of the coding block, namely the CPMVP, is obtained, the residual CPMVD between the CPMV and the CPMVP is calculated, and the information related to the CPMVD and CPMVP is written into the code stream.
  • Optionally, in the Affine_AMVP mode the size of the coding block is greater than or equal to 16×16 pixels, and in the Affine_merge mode it is greater than or equal to 8×8 pixels.
  • Through the above steps, the MVs of the multiple sub-coding blocks in the coding block can be calculated, and motion compensation is performed on the multiple sub-coding blocks respectively according to their MVs. The multiple sub-coding blocks include a first sub-coding block, whose MV is obtained through the above calculation.
  • When motion compensation is performed on the multiple sub-coding blocks in the coding block, a 6-tap interpolation filter may be used for the sub-pixel interpolation, or a 4-tap interpolation filter or one with another number of taps; the number of taps of the interpolation filter is not specifically limited in the embodiments of this application.
  • In one possible implementation, a 6-tap interpolation filter is used for sub-pixel interpolation to perform motion compensation on the sub-luma coding blocks, and a 4-tap interpolation filter is used for sub-pixel interpolation to perform motion compensation on the sub-chroma coding blocks.
  • Specifically, based on the pixel accuracy of the motion vector of the first sub-luma coding block, a 6-tap interpolation filter performs sub-pixel interpolation on the reference block of the first sub-luma coding block to obtain the first sub-luma prediction block corresponding to it; and/or, based on the pixel accuracy of the motion vector of the first sub-chroma coding block, a 4-tap interpolation filter performs sub-pixel interpolation on the reference block of the first sub-chroma coding block to obtain the corresponding first sub-chroma prediction block.
  • It should be understood that the filter coefficients of the 6-tap and 4-tap interpolation filters in the embodiments of this application may be the coefficients in the prior art or any other filter coefficients, which is not limited in the embodiments of this application.
  • Specifically, when unidirectional prediction is performed on the coding block, one reference frame list is constructed; when a frame is selected from it for image prediction, motion compensation is performed on the first sub-coding block directly based on the MV of the first sub-coding block.
  • Optionally, the first sub-coding block is 4×4 pixels in size. It should be understood that it may also be of any other size, which is not limited in the embodiments of this application.
  • Based on the pixel accuracy of the motion vector of the first sub-coding block, an interpolation filter performs sub-pixel interpolation on its reference block to obtain the first sub-prediction block corresponding to the first sub-coding block. Similarly, based on the pixel accuracy of the motion vectors of the multiple sub-coding blocks, the interpolation filter performs sub-pixel interpolation on their respective reference blocks to obtain the sub-prediction blocks of the multiple sub-coding blocks.
  • In other words, in the unidirectional prediction mode the motion vectors of the sub-coding blocks calculated in step S510 are not restricted; that is, the motion vector restriction process of the current coding block (steps S341 and S441 above) is not performed, and the motion vectors calculated in step S510 are used directly for motion compensation.
  • When motion compensation is performed, the sub-coding block is the unit of compensation: the multiple sub-coding blocks in the coding block are compensated separately, and the coding block is not compensated as a whole.
  • For the restriction process of the motion vector, reference may be made to step S341 in FIG. 7; compared with the method 300, this embodiment specifically does not perform step S341, and in step S350 motion compensation during unidirectional prediction is not performed according to a restriction result.
  • As shown above, in the unidirectional prediction mode the bandwidth of the Affine mode is smaller than that of the non-Affine mode even without the restriction, so skipping the motion vector restriction process of the current coding block still ensures that the Affine mode brings no greater bandwidth pressure. Therefore, through the solution of this embodiment, the complexity of the encoding and decoding system is reduced and the coding efficiency is improved without greater bandwidth pressure, so that the performance of the codec system is better.
  • On the basis of the video encoding and decoding method in FIG. 9, FIG. 10 shows a schematic flowchart of a specific video encoding method 600, which is applied at the video encoding end. As shown in FIG. 10, the video encoding method 600 includes:
  • S611: Construct a candidate list of control point motion vectors of the coding block in the Affine mode, and obtain the control point motion vector of the coding block from the candidate list through RD cost calculation.
  • Optionally, step S611 in this embodiment may be a specific implementation of the foregoing step S510.
  • In this embodiment, the control point motion vector CPMV of the current coding block may be the motion vector of three control points or of two control points. With two control points, they are located at the upper-left and upper-right corners of the current coding block; with three control points, at the upper-left, upper-right, and lower-left corners.
  • The candidate list accordingly includes candidate CPMVs for two control points or for three control points. The candidate motion vectors in the list may be obtained from the motion vectors of adjacent coding blocks, which may be of several types: a CPMV inferred from the CPMV of an adjacent coding block, a CPMV constructed from the translational motion vectors of adjacent coding blocks, or a CPMV calculated from other types of motion vectors of adjacent coding blocks, which is not limited in the embodiments of this application.
  • Optionally, the CPMV candidate lists in the Affine_merge mode and the Affine_AMVP mode are different.
  • When the Affine_merge mode is used, the coding block is greater than or equal to 8×8 pixels. The candidate list in the Affine_merge mode is constructed, in which each candidate CPMV is calculated from the CPMV of an adjacent coding block that is adjacent to the current coding block and is itself coded in the Affine mode. After RD cost calculation, the optimal CPMV in the candidate list is obtained and used as the prediction value of the CPMV of the current coding block, namely the CPMVP, and the index of the CPMVP in the candidate list is written into the code stream.
  • When the Affine_AMVP mode is used, the coding block is greater than or equal to 16×16 pixels. The CPMVs in the Affine_AMVP candidate list may be inferred from the CPMVs of neighboring blocks, constructed from the translational MVs of neighboring blocks, or be converted MVs of neighboring blocks, and so on. After RD cost calculation, the optimal CPMV in the candidate list is obtained and used as the CPMVP of the current coding block; based on this CPMVP, motion estimation is performed in the reference frame to obtain the CPMV of the current coding block, and the residual between the CPMV and the CPMVP of the coding block, also called the CPMVD, together with the index of the CPMVP in the reference list, is written into the code stream.
  • Through the above list construction and RD cost calculation, the CPMV of the current coding block is obtained; it may include the MVs of two control points or of three control points.
  • Optionally, the foregoing process of constructing the candidate list and obtaining the CPMV of the coding block from it may be the process of obtaining the CPMV of the luma coding block, in which case the CPMV is the CPMV of the luma coding block.
  • S612: Calculate the motion vectors of the multiple sub-coding blocks in the coding block according to the control point motion vector.
  • Specifically, using the CPMV of the current coding block, the MVs of the multiple sub-coding blocks are calculated with the foregoing formula (1) or formula (2).
  • S620: When performing unidirectional prediction on the coding block, perform motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block among the multiple sub-coding blocks.
  • S631: When performing bidirectional prediction on the coding block, perform restriction calculation on the multiple second restricted blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
  • Specifically, when bidirectional prediction is performed on the current coding block, two reference frame lists are constructed and two frames are selected from them for image prediction; the two frames may be a historical frame and a future frame, respectively.
  • When the dual forward or dual backward prediction mode is adopted, in other words whenever a dual motion vector prediction mode is used for predictive coding of the current coding block, restriction calculation is likewise performed on the multiple second restricted blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
  • This step S631 may be the same as step S342 in FIG. 7; for the related technical features and solutions, reference may be made to the description of step S342, which is not repeated here.
  • S632: Perform motion compensation on the coding block according to the result of the restriction.
  • Specifically, through the restriction in step S631, if the horizontal width and vertical height of a restricted block are not within the threshold range, the MVs of the sub-coding blocks in the coding block are set to the same MV, the current coding block is treated as a whole, and motion compensation is performed on the entire coding block. Otherwise, the MVs of the sub-coding blocks are not restricted, and motion compensation is performed on each sub-coding block of the current coding block in turn, with the sub-coding block as the unit.
  • Specifically, when a coding block is unidirectionally predicted, it corresponds to one reference frame list, each sub-coding block has a single MV, and the sub-prediction blocks corresponding to the multiple sub-coding blocks can be determined directly in the reference frame according to those MVs.
  • When a coding block is predicted with dual motion vectors, it corresponds to two reference frame lists, and for each sub-coding block two initial sub-prediction blocks are determined in the two reference frames according to its dual MV and then combined by weighting.
  • In the process of determining a prediction block or sub-prediction block in the reference frame according to the MV, sub-pixel interpolation must be performed on the reference block that the MV points to, where the pixel accuracy of the prediction block or sub-prediction block equals that of the MV.
  • An 8-tap, 6-tap, or interpolation filter with any other number of taps may be used for this sub-pixel interpolation, which is not limited in the embodiments of this application; in one possible implementation, a 6-tap interpolation filter is used for the luma coding block and a 4-tap interpolation filter for the chroma coding block.
  • After the prediction blocks are obtained, the residual of the coding block or sub-coding block is calculated; through the above video encoding method, the CPMV and the residual in the Affine mode are obtained, the RD cost of the current coding block is computed from the residual and the CPMV, and it is compared with the RD cost of the current coding block in other modes to confirm whether the Affine mode is used for predictive coding of the current coding block.
  • If the Affine_merge mode is determined to be the prediction mode of the current coding block, the CPMVP index in the candidate list is written into the code stream; if the Affine_AMVP mode is determined to be the prediction mode, the CPMVP index and the CPMVD are written into the code stream together.
  • In the unidirectional prediction mode, the bandwidth of the Affine mode is smaller than that of the non-Affine mode, so the motion vector restriction process of the current coding block can be skipped while still ensuring that the Affine mode brings no greater bandwidth pressure.
  • In the bidirectional prediction mode, a 6-tap interpolation filter is used for pixel interpolation of the reference blocks and the motion vector restriction process of the current coding block is performed, which also relieves the bandwidth pressure of the Affine mode. Therefore, by adopting different processing for the unidirectional and dual prediction modes, the solutions of the embodiments of this application reduce the complexity of the encoding and decoding system and improve coding efficiency without introducing greater bandwidth pressure, so that the performance of the codec system is better. The dispatch sketched below summarizes this control flow.
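  • A sketch (Python) of this control flow; the data structure and the two stubs are illustrative placeholders, not the structures of any standard.

```python
from dataclasses import dataclass
from typing import List, Tuple

MV = Tuple[int, int]

@dataclass
class AffineBlock:
    sub_mvs: List[MV]              # one MV per 4x4 sub-block (from the CPMV)

def restriction_passes(block: AffineBlock) -> bool:
    return True                    # stand-in for the 8x8 restricted-block test (S631)

def compensate(block: AffineBlock, per_sub_block: bool) -> str:
    return "per-sub-block MC" if per_sub_block else "whole-block MC"

def affine_mc(block: AffineBlock, mode: str) -> str:
    if mode == "uni":                       # S620/S731: no restriction step at all
        return compensate(block, per_sub_block=True)
    ok = restriction_passes(block)          # S631/S732: dual-MV prediction only
    if not ok:                              # over threshold: all sub-MVs unified
        block.sub_mvs = [block.sub_mvs[0]] * len(block.sub_mvs)
    return compensate(block, per_sub_block=ok)

blk = AffineBlock(sub_mvs=[(1, 0), (2, 1)])
print(affine_mc(blk, "uni"))                # per-sub-block MC
```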
  • On the basis of the video encoding and decoding method in FIG. 9, FIG. 11 shows a schematic flowchart of a specific video decoding method 700, which is applied at the video decoding end.
  • The decoding process in this embodiment corresponds to the encoding process in FIG. 10: the video encoding method 600 is used to encode a coding block in the frame to be encoded to form the code stream of the coding block, and the video decoding method 700 is used to decode that code stream. For content not described in detail in this embodiment, reference may be made to the related description of the video encoding method 600 above.
  • Specifically, as shown in FIG. 11, the video decoding method 700 includes:
  • S712: Determine, according to the code stream of the coding block, that the coding mode of the coding block is the Affine mode, and obtain index information of the control point motion vector of the coding block;
  • S713: Construct a candidate list of control point motion vectors of the coding block, and obtain the control point motion vector according to the candidate list and the index information;
  • Specifically, the code stream of the coding block received by the video decoding end includes a flag bit identifying whether the coding block is in the Affine mode, and more specifically, whether it is in the Affine_merge mode or the Affine_AMVP mode. Through this flag bit it can be determined whether the coding mode of the coding block is the Affine mode, or more specifically the Affine_merge mode or the Affine_AMVP mode.
  • In addition, the code stream of the coding block includes index information of the CPMV of the current coding block in the candidate list.
  • If the coding mode is determined to be the Affine mode, a candidate list of CPMVs in the Affine mode is constructed. If the flag bit identifies the Affine_merge mode, a CPMV candidate list in the Affine_merge mode is constructed; the index value of the CPMV of the current coding block in that list is obtained from the code stream, and the CPMV of the current coding block is determined directly by the index value.
  • If the flag bit identifies the Affine_AMVP mode, a CPMV candidate list in the Affine_AMVP mode is constructed; the CPMVP index value and the CPMVD of the current coding block are obtained from the code stream, the CPMVP of the current coding block is located in the candidate list through the index value, and the CPMVP and the corresponding CPMVD are added to obtain the CPMV of the current coding block.
  • The CPMV of the current coding block obtained in this way may be the motion vector of two control points or of three control points. When it is the motion vector of two control points, the calculation formula of the four-parameter Affine mode (formula (1)) is used to compute the MVs of the multiple sub-coding blocks in the current coding block; when it is the motion vector of three control points, the calculation formula of the six-parameter Affine mode (formula (2)) is used.
  • This step S720 may be the same as step S220 in FIG. 5 or step S320 in FIG. 7 and is not repeated here.
  • Specifically, whether the current coding block is in the bidirectional prediction (Bi) mode or the unidirectional prediction (Uni) mode is determined through the flag bit in the code stream, and the MVs of the coding block are handled differently according to the prediction mode: the current coding block may adopt the unidirectional prediction mode (for example the forward prediction mode or the backward prediction mode) or the bidirectional prediction mode.
  • If the current coding block adopts the unidirectional prediction mode, the following step S731 is executed; if it adopts the bidirectional prediction mode, the following steps S732 and S733 are executed.
  • Optionally, step S731 may be the same as step S620 in FIG. 10 and is not repeated here.
  • In the dual motion vector prediction modes, restriction calculation is likewise performed on the multiple second restricted blocks in the coding block according to the motion vectors of the multiple sub-coding blocks. Optionally, the MVs of the multiple sub-coding blocks may be stored in a buffer, and motion compensation is then performed based on them.
  • Step S732 may be the same as step S631 in FIG. 10, and step S733 may be the same as step S632 in FIG. 10, which are not repeated here.
  • FIG. 12 is a schematic block diagram of a video encoding and decoding device 10 according to an embodiment of the present application. It should be understood that when the device is used for video encoding it may specifically be a video encoding device, and when used for video decoding it may specifically be a video decoding device.
  • As shown in FIG. 12, the video encoding and decoding device 10 includes a processor 11 and a memory 12. The memory 12 may be used to store a program, and the processor 11 may be used to execute the program stored in the memory to perform the following operations: obtaining the control point motion vector of a coding block in the Affine mode, the control point motion vector being used to calculate the motion vectors of multiple sub-coding blocks in the coding block; and, when unidirectional prediction is performed on the coding block, performing motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block among the multiple sub-coding blocks.
  • In some embodiments, the processor 11 is further configured to: when performing unidirectional prediction on the coding block, not perform the motion vector restriction process of the coding block.
  • In some embodiments, the processor 11 is further configured to: when performing dual motion vector prediction on the coding block, perform the motion vector restriction process of the coding block.
  • In some embodiments, the processor 11 is specifically configured to: based on the control point motion vector, calculate the number of pixels required for sub-pixel interpolation of the multiple restricted blocks in the coding block; when the number of pixels of any restricted block exceeds the preset threshold, set the motion vectors of the multiple sub-coding blocks in the coding block to the same motion vector; otherwise, leave the motion vectors of the multiple sub-coding blocks unmodified, in which case the motion vectors of the multiple sub-coding blocks may differ.
  • In some embodiments, when performing unidirectional prediction on the coding block, the processor 11 is specifically configured to: not perform motion compensation on the coding block as a whole.
  • In some embodiments, the processor 11 is specifically configured to: when performing unidirectional prediction on the coding block, use an interpolation filter to perform sub-pixel interpolation on the reference block of the first sub-coding block, based on the pixel accuracy of the motion vector of the first sub-coding block, to obtain the first sub-prediction block corresponding to the first sub-coding block.
  • the pixel accuracy of the motion vector of the first sub-coding block is less than or equal to 1/16, and the number of taps of the interpolation filter is less than or equal to 6.
  • the pixel bandwidth of the coding block motion compensation in the Affine mode is smaller than the pixel bandwidth of the coding block motion compensation in the non-Affine mode.
  • the processor 11 is further configured to calculate the motion vectors of multiple sub-coding blocks in the coding block according to the control point motion vector.
  • In some embodiments, the coding block includes a luma coding block and a chroma coding block, the luma coding block includes multiple sub-luma coding blocks, and the chroma coding block includes multiple sub-chroma coding blocks. The processor 11 is specifically configured to: calculate the motion vectors of the multiple sub-luma coding blocks according to the control point motion vector and the position coordinates of the multiple sub-luma coding blocks in the luma coding block; wherein a first sub-chroma coding block in the chroma coding block corresponds to 4 sub-luma coding blocks among the multiple sub-luma coding blocks.
  • In some embodiments, the first sub-coding block includes a first sub-luma coding block and a first sub-chroma coding block, and the processor 11 is specifically configured to: use a 6-tap interpolation filter to perform sub-pixel interpolation on the reference block of the first sub-luma coding block to obtain the first sub-luma prediction block corresponding to the first sub-luma coding block; and use a 4-tap interpolation filter to perform sub-pixel interpolation on the reference block of the first sub-chroma coding block to obtain the first sub-chroma prediction block corresponding to the first sub-chroma coding block.
  • In some embodiments, the processor 11 is specifically configured to: during video encoding, construct a candidate list of control point motion vectors of the coding block; calculate the rate-distortion cost for the multiple candidate control point motion vectors in the candidate list; and take the candidate with the smallest rate-distortion cost as the control point motion vector.
  • In some embodiments, the processor 11 is specifically configured to: during video decoding, obtain the code stream of the coding block; determine, according to the code stream, that the prediction mode of the coding block is unidirectional prediction; and perform motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block.
  • In some embodiments, the control point motion vector is the motion vector of three control points or of two control points. Optionally, the pixel accuracy of the control point motion vector is 4, 2, 1, 1/2, 1/4, 1/8, or 1/16, and the sub-coding block is 4×4 pixels.
  • In the Affine_AMVP mode, the coding block is greater than or equal to 16×16 pixels; in the Affine_merge mode, the coding block is greater than or equal to 8×8 pixels.
  • the embodiments of the present application also provide an electronic device, which may include the video coding and decoding apparatuses of the various embodiments of the present application described above.
  • the electronic device may include, but is not limited to, mobile phones, drones, cameras, etc.
  • An embodiment of the present application also provides a video encoding and decoding device, including a processor and a memory, where the memory is used to store program instructions and the processor is used to call the program instructions to execute the video encoding and decoding methods in the various embodiments of this application described above.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • the embodiment of the present application also provides a computer program product containing instructions, which when executed by a computer causes the computer to execute the method of the foregoing method embodiment.
  • The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), etc.
  • the disclosed system, device, and method may be implemented in other ways.
  • The device embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.


Abstract

Provided are a video encoding and decoding method and apparatus, which can reduce the bandwidth pressure of the Affine mode while lowering the complexity of the codec. The video encoding and decoding method includes: obtaining the control point motion vector of a coding block in the affine motion compensated prediction (Affine) mode, the control point motion vector being used to calculate the motion vectors of multiple sub-coding blocks in the coding block; and, when unidirectional prediction is performed on the coding block, performing motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block among the multiple sub-coding blocks.

Description

Method and apparatus for video encoding and decoding
Copyright notice
The disclosure of this patent document contains material subject to copyright protection. The copyright is owned by the copyright owner, who has no objection to the reproduction by anyone of the patent document or the patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical field
This application relates to the technical field of digital video encoding and decoding, and more specifically, to a method and apparatus for video encoding and decoding.
Background
At present, in order to reduce the bandwidth occupied by video storage and transmission, video data needs to be encoded and compressed. In commonly used coding technology, the video compression process includes block partitioning, prediction, transform, quantization, and entropy coding, forming a hybrid video coding framework. On the basis of this framework, video coding standards have gradually taken shape over several decades of development; current mainstream standards include the international video coding standards H.264/MPEG-AVC and H.265/MPEG-HEVC, the domestic audio-video coding standard AVS2, and the H.266/VVC international standard and AVS3 domestic standard under development.
In the encoding process of block partitioning, prediction, transform, quantization, and entropy coding, prediction falls into two modes: intra prediction and inter prediction. In the latest coding standards, for example the H.266 standard, inter prediction introduces the affine motion compensated prediction mode (Affine mode for short), which predicts scenes with rotation, zooming, and similar motion well. Specifically, when the Affine mode is used for predictive coding, one coding unit (CU) has multiple motion vectors (MVs), which places considerable bandwidth pressure on video encoding and decoding.
Therefore, how to reduce the bandwidth pressure of the Affine mode while reducing its computation time and lowering the complexity of the codec is an urgent problem to be solved.
Summary
This application provides a method and apparatus for video encoding and decoding that can reduce the bandwidth pressure of the Affine mode while lowering the complexity of the codec.
In a first aspect, a video encoding and decoding method is provided, including: obtaining the control point motion vector of a coding block in the affine motion compensated prediction (Affine) mode, the control point motion vector being used to calculate the motion vectors of multiple sub-coding blocks in the coding block; and, when unidirectional prediction is performed on the coding block, performing motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block among the multiple sub-coding blocks.
Through the technical solution of this application, when unidirectional prediction is performed on the coding block, motion compensation is performed on the first sub-coding block directly based on its motion vector, which reduces the complexity of the encoding and decoding system and improves coding efficiency without bringing greater bandwidth pressure, improving the performance of the codec system.
In a second aspect, a video encoding and decoding apparatus is provided, including a processor configured to: obtain the control point motion vector of a coding block in the Affine mode, the control point motion vector being used to calculate the motion vectors of multiple sub-coding blocks in the coding block; and, when unidirectional prediction is performed on the coding block, perform motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block among the multiple sub-coding blocks.
In a third aspect, an electronic device is provided, including the video encoding and decoding apparatus of the second aspect.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a computer, the computer program causes the computer to execute the method of the first aspect.
In a fifth aspect, a computer program product containing instructions is provided; when executed by a computer, the instructions cause the computer to execute the method of the first aspect.
Brief description of the drawings
FIG. 1 is an architecture diagram to which the technical solutions of the embodiments of this application are applied.
FIG. 2 is a schematic diagram of a video encoding framework according to an embodiment of this application.
FIG. 3 is a schematic diagram of a video decoding framework according to an embodiment of this application.
FIG. 4 is a schematic diagram of sub-pixel interpolation according to an embodiment of this application.
FIG. 5 is a schematic flowchart of a video encoding and decoding method according to an embodiment of this application.
FIGS. 6a to 6c are schematic diagrams of the control point motion vectors of a coding block and the motion vectors of its sub-coding blocks in the Affine mode according to an embodiment of this application.
FIG. 7 is a schematic flowchart of a specific video encoding method according to an embodiment of this application.
FIG. 8 is a schematic flowchart of a specific video decoding method according to an embodiment of this application.
FIG. 9 is a schematic flowchart of another video encoding and decoding method according to an embodiment of this application.
FIG. 10 is a schematic flowchart of another specific video encoding method according to an embodiment of this application.
FIG. 11 is a schematic flowchart of another specific video decoding method according to an embodiment of this application.
FIG. 12 is a schematic block diagram of a video encoding and decoding apparatus according to an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described below with reference to the drawings.
The embodiments of this application are applicable to standard or non-standard image or video codecs, for example codecs of the VVC standard.
It should be understood that the specific examples herein are only intended to help those skilled in the art better understand the embodiments of this application, not to limit their scope.
It should also be understood that the formulas in the embodiments of this application are only examples and do not limit the scope of the embodiments; each formula may be transformed, and such transformations shall also fall within the protection scope of this application.
It should also be understood that, in the various embodiments of this application, the sequence numbers of the processes do not imply an execution order; the execution order of the processes shall be determined by their functions and internal logic, and shall not constitute any limitation on the implementation of the embodiments of this application.
It should also be understood that the various implementations described in this specification may be implemented individually or in combination, which is not limited in the embodiments of this application.
Unless otherwise stated, all technical and scientific terms used in the embodiments of this application have the same meanings as commonly understood by those skilled in the technical field of this application. The terms used in this application are only for the purpose of describing specific embodiments and are not intended to limit its scope. The term "and/or" used in this application includes any and all combinations of one or more of the associated listed items.
FIG. 1 is an architecture diagram to which the technical solutions of the embodiments of this application are applied.
As shown in FIG. 1, a system 100 may receive data 102 to be processed, process it, and produce processed data 108. For example, the system 100 may receive data to be encoded and encode it to produce encoded data, or it may receive data to be decoded and decode it to produce decoded data. In some embodiments, the components of the system 100 may be implemented by one or more processors, which may be processors in a computing device or in a mobile device (for example, an unmanned aerial vehicle). The processor may be any kind of processor, which is not limited in the embodiments of this application; in some possible designs it may include an encoder, a decoder, or a codec. The system 100 may also include one or more memories, which may be used to store instructions and data, for example computer-executable instructions implementing the technical solutions of the embodiments of this application, the data 102 to be processed, and the processed data 108. The memory may be any kind of memory, which is likewise not limited in the embodiments of this application.
The data to be encoded may include text, images, graphic objects, animation sequences, audio, video, or any other data that needs to be encoded. In some cases the data to be encoded may include sensor data from a sensor, which may be a vision sensor (for example, a camera or infrared sensor), a microphone, a near-field sensor (for example, an ultrasonic sensor or radar), a position sensor, a temperature sensor, or a touch sensor. In some cases the data to be encoded may include information from a user, for example biological information, which may include facial features, fingerprint scans, retina scans, voice recordings, DNA samples, and the like.
FIG. 2 is a schematic diagram of a video encoding framework 2 according to an embodiment of this application. As shown in FIG. 2, after the video to be encoded is received, each frame of it is encoded in turn, starting from the first frame. The current frame mainly passes through prediction, transform, quantization, and entropy coding, and the code stream of the current frame is finally output. Correspondingly, the decoding process usually decodes the received code stream according to the inverse of the above process to recover the video frame information.
Specifically, as shown in FIG. 2, the video encoding framework 2 includes a coding control module 201 for decision control and parameter selection in the encoding process. For example, the coding control module 201 controls the parameters used in transform, quantization, inverse quantization, and inverse transform, controls the selection of the intra or inter mode, and controls the parameters of motion estimation and filtering; its control parameters are also input into the entropy coding module and encoded to form part of the code stream.
When encoding of the current frame starts, the frame is partitioned (202): specifically, it is first divided into slices, which are then divided into blocks. Optionally, in one example, the frame is divided into a number of non-overlapping largest coding tree units (CTUs), each of which may be iteratively divided, in a quadtree, binary-tree, or ternary-tree fashion, into a series of smaller coding units (CUs). In some examples, a CU may also contain associated prediction units (PUs) and transform units (TUs), where the PU is the basic unit of prediction and the TU is the basic unit of transform and quantization. In some examples, a PU and a TU are each obtained by dividing a CU into one or more blocks, and one PU contains multiple prediction blocks (PBs) and the related syntax elements. In some examples the PU and TU may be identical, or they may be obtained from the CU by different division methods. In some examples at least two of the CU, PU, and TU are identical; for example, CU, PU, and TU are not distinguished, and prediction, quantization, and transform are all performed with the CU as the unit. For convenience, CTUs, CUs, and other data units so formed are all referred to below as coding blocks.
It should be understood that, in the embodiments of this application, the data unit targeted by video encoding may be a frame, a slice, a coding tree unit, a coding unit, a coding block, or a group of any of the above; in different embodiments the size of the data unit may vary.
Specifically, as shown in FIG. 2, after the frame is divided into coding blocks, prediction is performed to remove spatial and temporal redundancy of the current frame. The commonly used predictive coding methods are intra prediction and inter prediction. Intra prediction uses only information already reconstructed within the current frame to predict the current coding block, while inter prediction uses information from other frames that have been reconstructed previously (also called reference frames) to predict the current coding block. Specifically, in the embodiments of this application, the coding control module 201 is used to decide between intra and inter prediction.
When the intra prediction mode is selected, the intra prediction 203 process includes: taking the reconstructed blocks of encoded adjacent blocks around the current coding block as reference blocks; computing a prediction block from the pixel values of those reference blocks by a prediction-mode method; and subtracting the corresponding pixel values of the prediction block from those of the current coding block to obtain the residual of the current coding block. The residual of the current coding block passes through transform 204, quantization 205, and entropy coding 210 to form the code stream of the current coding block. Further, after all coding blocks of the current frame go through this encoding process, they form part of the code stream of the frame. The control and reference data produced in intra prediction 203 are also encoded by entropy coding 210 and become part of the code stream.
Specifically, transform 204 is used to remove correlation from the residual of an image block so as to improve coding efficiency. The residual of the current coding block is usually transformed with the two-dimensional discrete cosine transform (DCT) or the two-dimensional discrete sine transform (DST); for example, at the encoding end the residual information of the coding block is multiplied by an N×M transform matrix and its transpose, producing the transform coefficients of the current coding block.
After the transform coefficients are produced, quantization 205 further improves compression efficiency: the transform coefficients are quantized, and the quantized coefficients are then entropy-coded 210 to obtain the residual code stream of the current coding block, where the entropy coding methods include, but are not limited to, context-adaptive binary arithmetic coding (CABAC).
Specifically, the encoded adjacent blocks used in intra prediction 203 are adjacent blocks encoded before the current coding block; the residual produced when encoding each adjacent block passes through transform 204, quantization 205, inverse quantization 206, and inverse transform 207, and is added to the prediction block of that adjacent block to obtain its reconstructed block. Correspondingly, inverse quantization 206 and inverse transform 207 are the inverse processes of quantization 205 and transform 204, used to recover the residual data before quantization and transform.
As shown in FIG. 2, when the inter prediction mode is selected, the inter prediction process includes motion estimation (ME) 208 and motion compensation (MC) 209. Specifically, motion estimation 208 is performed with the reference frames among the reconstructed video frames: in one or more reference frames, the image block most similar to the current coding block is searched for according to a matching criterion, giving the matching block; the relative displacement between the matching block and the current coding block is the motion vector (MV) of the current coding block. Motion compensation 209 is then performed on the current coding block based on the motion vector and the reference frame to obtain its prediction block, and subtracting the corresponding prediction pixel values from the original pixel values of the coding block gives the residual of the coding block. The residual of the current coding block passes through transform 204, quantization 205, and entropy coding 210 to form part of the code stream of the frame; the control and reference data produced in motion compensation 209 are also entropy-coded 210 into the code stream.
As shown in FIG. 2, the reconstructed video frame is the video frame obtained after filtering 211. Filtering 211 is used to reduce compression distortions such as blocking and ringing effects introduced by encoding. During encoding, the reconstructed video frame provides reference frames for inter prediction; during decoding, the reconstructed video frame is post-processed and output as the final decoded video.
FIG. 3 is a schematic diagram of a video decoding framework 3 according to an embodiment of this application. As shown in FIG. 3, video decoding performs the operations corresponding to video encoding. First, entropy decoding 301 extracts one or more of the residual data and the prediction, intra-prediction, motion-compensation, and filtering syntax from the code stream. The residual data passes through inverse quantization 302 and inverse transform 303 to recover the original residual data. In addition, whether the current decoded block uses intra or inter prediction is determined from the prediction syntax. For intra prediction 304, prediction information is constructed from the reconstructed image blocks in the current frame according to the decoded intra-prediction syntax and the intra prediction method; for inter prediction, reference blocks are determined in the already reconstructed images according to the decoded motion-compensation syntax, giving the prediction information. The prediction information is then superimposed on the residual information and passed through the filtering operation to obtain the reconstructed video frame, which is post-processed 306 to give the decoded video.
As an important link in encoding and decoding, different inter prediction modes create different bandwidth pressure. In VTM-6.0, the reference software of the current video coding standard VVC, the affine motion compensated prediction (Affine) mode is added on top of the HEVC inter prediction modes to enable prediction of irregular motion such as zooming, rotation, and perspective motion and to improve video coding performance.
In the Affine mode, a coding block is divided into multiple sub-coding blocks; for example, a coding unit CU is divided into multiple sub-coding units (sub-CUs), each of which corresponds to its own motion vector MV, and prediction based on these MVs yields multiple prediction blocks. Although the Affine mode can therefore predict many kinds of irregular motion, it also brings a large bandwidth.
In order to reduce the bandwidth occupied by the Affine mode, in one implementation of this application the sub-pixel interpolation filter in the Affine mode is adjusted and the MVs of the coding block are restricted, so as to reduce the bandwidth pressure caused by the Affine mode.
It should be noted here that the non-Affine modes mentioned above may include the three inter prediction modes in current HEVC or other video coding standards: the inter mode, the merge mode, and the skip mode. Optionally, in this application the non-Affine modes include but are not limited to these three; all inter prediction other than the Affine mode is called non-Affine. For a coding block of the same size, the Affine mode divides it into multiple sub-coding blocks for prediction, whereas a non-Affine mode predicts it directly.
In the following, the sub-pixel interpolation process and the bandwidth during motion compensation of a coding block in the non-Affine modes are explained first.
Specifically, in motion estimation and motion compensation, in order to identify the motion of objects more precisely, sub-pixel interpolation is applied to the coding block and the reference frame to obtain a more accurate prediction block. A sub-pixel is a virtual pixel obtained by interpolation between the integer pixels of an image frame. For example, if 1 sub-pixel is inserted between two integer pixels, the pixel accuracy is 1/2 and the sub-pixel is a 1/2 pixel; if 3 sub-pixels are inserted between two integer pixels, the pixel accuracy is 1/4 and the three sub-pixels are called the 1/4 pixel, the 1/2 pixel, and the 3/4 pixel, respectively.
The sub-pixel interpolation process is described below, taking 1/4-pixel-accuracy interpolation as an example.
As shown in FIG. 4, A_{i,j} are the integer pixels of the video frame, where i and j are integers; the pixels between the integer pixels, e.g. a_{i,j}, b_{i,j}, c_{i,j}, d_{i,j}, are sub-pixels. In FIG. 4, three sub-pixels are interpolated between two integer pixels; for example, a_{0,0}, b_{0,0}, c_{0,0} are interpolated between A_{0,0} and A_{1,0}, and d_{0,0}, h_{0,0}, n_{0,0} between A_{0,0} and A_{0,1}, where a_{0,0} and d_{0,0} are 1/4 pixels, b_{0,0} and h_{0,0} are half pixels (1/2 pixels), and c_{0,0} and n_{0,0} are 3/4 pixels.
When the coding block is of size 2×2, shown by the black box in FIG. 4, sub-pixel interpolation needs not only the 4 integer pixels A_{0,0}, A_{1,0}, A_{0,1}, and A_{1,1} inside the coding block but also some integer pixels outside it.
Specifically, a_{0,0}, b_{0,0}, c_{0,0} can be calculated from integer pixels in the horizontal direction; optionally, the 8 integer pixels A_{-3,0} to A_{4,0} are used. d_{0,0}, h_{0,0}, n_{0,0} can be calculated from integer pixels in the vertical direction; optionally, the 8 integer pixels A_{0,-3} to A_{0,4} are used.
Specifically, the calculation formulas (3) for a_{0,0}, b_{0,0}, c_{0,0}, d_{0,0}, h_{0,0}, and n_{0,0} are:

a_{0,j} = ( Σ_{i=-3..3} A_{i,j} · qfilter[i] ) >> (B-8)
b_{0,j} = ( Σ_{i=-3..4} A_{i,j} · hfilter[i] ) >> (B-8)
c_{0,j} = ( Σ_{i=-2..4} A_{i,j} · qfilter[1-i] ) >> (B-8)
d_{0,0} = ( Σ_{j=-3..3} A_{0,j} · qfilter[j] ) >> (B-8)
h_{0,0} = ( Σ_{j=-3..4} A_{0,j} · hfilter[j] ) >> (B-8)
n_{0,0} = ( Σ_{j=-2..4} A_{0,j} · qfilter[1-j] ) >> (B-8)
where qfilter contains the filter coefficients of the 7-tap interpolation filter used for the 1/4 and 3/4 pixels, hfilter contains the filter coefficients of the 8-tap interpolation filter used for the 1/2 pixels, and B is the pixel bit depth of the reference frame, optionally B = 8.
Then, from the computed a_{0,0}, b_{0,0}, c_{0,0}, d_{0,0}, h_{0,0}, and n_{0,0}, the remaining sub-pixels, e.g. e_{0,0}, f_{0,0}, g_{0,0}, i_{0,0}, are calculated in a similar way; those skilled in the art can obtain the pixel values of the other sub-pixels with reference to the above formulas and the prior art, which is not repeated here.
It should be understood that the coefficients of the 7-tap and 8-tap interpolation filters above may be those in HEVC or other related technology, or any other filter coefficients, which is not limited in the embodiments of this application.
Optionally, by continuing to interpolate on the basis of the 1/4 pixels with the 8-tap and 7-tap interpolation filters in the above manner, sub-pixels of higher accuracy, e.g. 1/16 pixel accuracy, can also be obtained.
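The following sketch (Python) illustrates the one-dimensional interpolation of formulas (3). The coefficients are the HEVC luma filters (hfilter: 8-tap half-pel; qfilter: 7-tap quarter-pel), used here for illustration only; unlike formulas (3), which keep a high-precision intermediate via the >>(B-8) shift, this sketch normalizes each sample directly by the filter gain of 64 for readability.

```python
from typing import List

HFILTER = [-1, 4, -11, 40, 40, -11, 4, -1]   # 8-tap half-pel, taps at offsets -3..4
QFILTER = [-1, 4, -10, 58, 17, -5, 1]        # 7-tap quarter-pel, taps at offsets -3..3

def interp_at(row: List[int], x: int, taps: List[int], first: int) -> int:
    # Weighted sum of the integer pixels around position x; 'first' is the
    # offset of taps[0] relative to x (edge pixels are clamped for brevity).
    acc = 0
    for k, c in enumerate(taps):
        i = min(max(x + first + k, 0), len(row) - 1)
        acc += c * row[i]
    return (acc + 32) >> 6                    # round and divide by the gain 64

row = [100, 102, 104, 106, 108, 110, 112, 114]
print(interp_at(row, 3, QFILTER, -3))         # a_{3,j}: 1/4-pel sample (~106)
print(interp_at(row, 3, HFILTER, -3))         # b_{3,j}: 1/2-pel sample (107)
```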
Specifically, when an X-tap interpolation filter is used for sub-pixel interpolation of an image frame or coding block, an N×M coding block additionally needs X-1 rows and X-1 columns of integer pixels outside it; that is, (N+X-1)×(M+X-1) integer pixels are used, where X, N, and M are positive integers. For convenience, the average number of integer pixels used per integer pixel of the coding block during sub-pixel interpolation is referred to below as the "pixel bandwidth".
For example, when an 8-tap interpolation filter is used, X = 8, and an N×M coding block uses on average ((N+7)×(M+7))/(N×M) integer pixels per integer pixel; likewise, when a 4-tap filter is used, X = 4, and the average is ((N+3)×(M+3))/(N×M).
In the embodiments of this application, different interpolation filters have different coefficients; the fewer the taps, the fewer pixels the interpolation calculation needs.
More specifically, a coding block includes a luma coding block and chroma coding blocks; in other words, the coding block includes a luma component (Luma) and chroma components (Chroma). Specifically, one coding block includes 1 luma coding block and 2 chroma coding blocks, the latter being the red chroma coding block (Cr) and the blue chroma coding block (Cb). When the video sampling format is YCbCr 4:2:0, one chroma coding block has the same size as 4 luma coding blocks and corresponds to 4 luma coding blocks.
If the unidirectional prediction (Uni) mode is used, the current coding block constructs only one motion vector candidate list and selects one motion vector from it; one reference block is found in the reference frame according to that motion vector, and if the motion vector has non-integer pixel accuracy, sub-pixel interpolation is performed on the reference block to obtain the prediction block of the current coding block.
For an N×M luma coding block, only one luma reference block is needed. If an 8-tap interpolation filter is used for the luma reference block, the average number of integer reference pixels needed per integer pixel of the luma coding block, i.e. its pixel bandwidth, is:

L1 = ((N+7)×(M+7)) / (N×M)

Corresponding to this N×M luma coding block, each chroma coding block is of size N/2×M/2. If a 4-tap interpolation filter is used for the chroma reference block, the pixel bandwidth of the chroma coding block, normalized to the N×M luma grid, is:

C1 = ((N/2+3)×(M/2+3)) / (N×M)

Therefore, for an N×M coding block (one N×M luma coding block and two N/2×M/2 chroma coding blocks), the total pixel bandwidth is:

S1 = L1 + 2×C1
If the bidirectional prediction (Bi) mode is used, the current coding block constructs two motion vector candidate lists and selects two motion vectors from them; two reference blocks are found in two reference frames according to the two motion vectors; if the motion vectors have non-integer pixel accuracy, sub-pixel interpolation is performed on the two reference blocks, and the two interpolated reference blocks are weighted to obtain the prediction block of the current coding block. Specifically, in the bidirectional prediction mode the two reference frames are a video frame before the current frame (a historical frame) and a video frame after it (a future frame). The bidirectional prediction mode is one of the dual motion vector prediction modes, which include the dual forward prediction mode, the dual backward prediction mode, and the bidirectional prediction mode. The dual forward prediction mode has two forward motion vectors and the dual backward prediction mode two backward motion vectors; in the dual forward prediction mode both reference frames are historical frames, and in the dual backward prediction mode both are future frames.
The bandwidth calculation of a coding block in the bidirectional prediction mode is explained below as an example; it should be understood that, in this application, the bandwidth calculation in the dual forward and dual backward prediction modes is the same as in the bidirectional mode and is not repeated here.
Similarly, an N×M luma coding block then needs two luma reference blocks. With an 8-tap interpolation filter for the luma components of the two reference blocks, the pixel bandwidth of the luma coding block is twice that of the unidirectional mode:

L2 = 2×((N+7)×(M+7)) / (N×M)

Likewise, in the bidirectional prediction mode the pixel bandwidth of the chroma coding block is also twice that of the unidirectional mode:

C2 = 2×((N/2+3)×(M/2+3)) / (N×M)

Therefore, in the bidirectional prediction mode, the total pixel bandwidth of an N×M coding block (one N×M luma coding block and two N/2×M/2 chroma coding blocks) is:

S2 = L2 + 2×C2
In the non-Affine modes, according to the above formulas, for coding blocks of different sizes, with an 8-tap interpolation filter for the luma component and a 4-tap filter for the chroma components, the average number of integer pixels needed per integer pixel, in other words the required total bandwidth, is shown in Table 1.

[Table 1: total pixel bandwidth of coding blocks of different sizes in the non-Affine modes, for unidirectional (Uni) and bidirectional (Bi) prediction; reproduced as an image in the original.]

As shown in Table 1, when the coding block is 4×16 or 16×4, the total bandwidth in the bidirectional prediction mode (Bi) is 11.34; for other sizes it is below 11.34. It should be understood that for a coding block of a given size, the pixel bandwidth in the unidirectional prediction mode (Uni) is smaller than in the bidirectional mode (Bi).
Therefore, in the non-Affine modes, for coding blocks of different sizes, with an 8-tap filter for the luma coding block and a 4-tap filter for the chroma coding blocks, the required bandwidth is at most 11.34. The short sketch below reproduces these numbers.
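A minimal sketch (Python) of the bandwidth formulas above; it assumes the chroma terms are normalized to the N×M luma grid, as required for S1 and S2 to reproduce the quoted maximum of 11.34.

```python
def non_affine_bandwidth(n: int, m: int, bi: bool) -> float:
    refs = 2 if bi else 1
    luma = refs * (n + 7) * (m + 7) / (n * m)               # L1 or L2 (8-tap)
    chroma = refs * (n // 2 + 3) * (m // 2 + 3) / (n * m)   # C1 or C2 (4-tap), one plane
    return luma + 2 * chroma                                # S1 or S2

for (n, m) in [(4, 16), (16, 4), (8, 8), (16, 16), (64, 64)]:
    print((n, m), round(non_affine_bandwidth(n, m, bi=True), 2))
# (4, 16) and (16, 4) give the 11.34 maximum quoted for Table 1.
```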
The encoding and decoding process in the Affine mode is described in detail below with reference to FIGS. 5 to 11.
FIG. 5 shows a schematic flowchart of a video encoding and decoding method 200. It should be understood that this method can be applied at the encoding end, where it is specifically called a video encoding method, or at the decoding end, where it is specifically called a video decoding method.
As shown in FIG. 5, the video encoding and decoding method 200 includes:
S210: Obtain the control point motion vector of a coding block in the Affine mode.
As introduced above, in the non-Affine modes a coding block is treated as a whole for motion estimation and motion compensation, yielding the corresponding prediction value and coding information. In the Affine mode, a coding block can be divided into multiple sub-coding blocks, and motion compensation is performed according to the motion vector of each of the multiple sub-coding blocks, yielding multiple prediction values and multiple pieces of coding information.
Optionally, the coding block may be a coding unit CU or another type of image block, which is not specifically limited in the embodiments of this application.
Optionally, in the Affine mode the coding block may be greater than or equal to 8×8 pixels and smaller than 128×128 pixels, e.g. 8×8, 8×16, 16×16, 8×128, etc.; the specific size of the coding block is likewise not limited in the embodiments of this application.
Optionally, in the Affine mode the sub-coding block may be called a sub-coding unit (sub-CU) and may be 4×4 pixels in size or another pixel size; the specific size of the sub-coding block is likewise not limited in the embodiments of this application.
S220: Calculate the motion vectors of multiple sub-coding blocks in the coding block according to the control point motion vector.
Specifically, the motion vector of each sub-coding block in the coding block can be calculated from the control point motion vector (CPMV) of the current coding block in the Affine mode.
Optionally, the CPMV may be the motion vector of two control points, in which case the Affine mode is also called the four-parameter Affine mode; or it may be the motion vector of three control points, in which case the Affine mode is also called the six-parameter Affine mode.
Specifically, as shown in FIG. 6a, in the four-parameter Affine mode the motion vector MV of a sub-coding block can be calculated from the CPMVs of two control points; for the sub-coding block at position (x, y), the calculation formula (1) of its MV is:

mv_x = ((mv_1x − mv_0x)/W)·x − ((mv_1y − mv_0y)/W)·y + mv_0x
mv_y = ((mv_1y − mv_0y)/W)·x + ((mv_1x − mv_0x)/W)·y + mv_0y

where W is the pixel width of the coding block, (x, y) are the relative position coordinates of the sub-coding block in the coding block, (mv_0x, mv_0y) is the motion vector of the zeroth control point, e.g. the control point at the upper-left corner in FIG. 6a, and (mv_1x, mv_1y) is the motion vector of the first control point, e.g. the control point at the upper-right corner in FIG. 6a.
Optionally, as shown in FIG. 6b, in the six-parameter Affine mode the motion vector of a sub-coding block can also be calculated from the CPMVs of three control points; for the sub-coding block at position (x, y), the calculation formula (2) of its MV is:

mv_x = ((mv_1x − mv_0x)/W)·x + ((mv_2x − mv_0x)/H)·y + mv_0x
mv_y = ((mv_1y − mv_0y)/W)·x + ((mv_2y − mv_0y)/H)·y + mv_0y

where W and H are the pixel width and pixel height of the coding block, (x, y) are the relative position coordinates of the sub-coding block in the CU, (mv_0x, mv_0y) is the motion vector of the zeroth control point, (mv_1x, mv_1y) is the motion vector of the first control point, and (mv_2x, mv_2y) is the motion vector of the second control point, e.g. the control point at the lower-left corner in FIG. 6b.
With the above formulas, the motion vector MV of every sub-coding block in the coding block can be calculated. For example, when the coding block is 16×16 pixels it can be divided into 16 sub-coding blocks, whose MVs may be as shown in FIG. 6c; one coding block thus corresponds to multiple motion vectors, enabling more accurate image prediction of the coding block. A sketch of this calculation follows.
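A minimal sketch (Python) of formulas (1) and (2). Evaluating each MV at the sub-block centre (x+2, y+2) is an assumption borrowed from common practice; the formulas above are stated for a generic position (x, y).

```python
from typing import List, Tuple

MV = Tuple[float, float]

def sub_block_mvs(cpmv: List[MV], w: int, h: int, sub: int = 4) -> List[List[MV]]:
    mv0x, mv0y = cpmv[0]                      # zeroth (top-left) control point
    mv1x, mv1y = cpmv[1]                      # first (top-right) control point
    ax, ay = (mv1x - mv0x) / w, (mv1y - mv0y) / w
    if len(cpmv) == 3:                        # six-parameter Affine mode, formula (2)
        mv2x, mv2y = cpmv[2]                  # second (bottom-left) control point
        bx, by = (mv2x - mv0x) / h, (mv2y - mv0y) / h
    else:                                     # four-parameter Affine mode, formula (1)
        bx, by = -ay, ax
    grid = []
    for y in range(0, h, sub):
        row = []
        for x in range(0, w, sub):
            cx, cy = x + sub / 2, y + sub / 2
            row.append((ax * cx + bx * cy + mv0x, ay * cx + by * cy + mv0y))
        grid.append(row)
    return grid

# 16x16 coding block -> a 4x4 grid of sub-block MVs, as in FIG. 6c
mvs = sub_block_mvs([(0.0, 0.0), (4.0, 2.0)], 16, 16)
print(len(mvs), len(mvs[0]))                  # 4 4
```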
Optionally, the MV accuracy of each sub-coding block may be the same or different; the embodiments of this application place no specific limit on it, e.g. it may be 4, 2, 1, 1/2, 1/4, 1/8, or 1/16 pixel accuracy.
Optionally, in one possible implementation, the motion vector of each sub-coding block calculated by the above formulas can reach at most 1/16 pixel accuracy after rounding.
In other words, when the MV accuracy of a sub-coding block is 1/16 pixel, the corresponding reference block in the reference frame is determined according to the MV of the sub-coding block, and sub-pixel interpolation with 1/16 pixel accuracy is performed on the reference block to obtain the sub-prediction block of the sub-coding block.
Optionally, since the sub-coding blocks occupy different coordinate positions in the coding block, the MVs calculated from the CPMV with formula (1) or formula (2) differ between sub-coding blocks.
Optionally, the coding block may be a luma coding block, for which the MVs of its multiple sub-luma coding blocks are calculated through the four-parameter or six-parameter Affine mode.
For the chroma coding block, as explained above, when the video sampling format is YCbCr 4:2:0, one chroma coding block has the same size as 4 luma coding blocks and corresponds to 4 luma coding blocks. Therefore, after the MVs of all sub-luma coding blocks are obtained with the above formulas, 4 first sub-luma coding blocks among them correspond to one first sub-chroma coding block in the chroma coding block, and the MV of the first sub-chroma coding block is the mean of the MVs of the 4 first sub-luma coding blocks. In this way the MVs of all sub-chroma coding blocks in the chroma coding block can be calculated; both the sub-luma and sub-chroma coding blocks are 4×4 pixels in size. Illustratively, in the 4:2:0 format the mean of the motion vectors of 4 first sub-luma coding blocks is taken, and in the 4:2:2 format the mean of 2 first sub-luma coding blocks.
In addition, rate-distortion optimization (RDO) or another technique is used for mode decision. If the decision is to use the Affine mode for predictive coding of the coding block, only the MVs of the control points, i.e. the CPMV, need be written into the code stream rather than the MV of every sub-coding block in the CU; the decoding end obtains the CPMV from the code stream and computes the MV of each sub-coding block in the CU from it.
Specifically, the Affine mode includes the Affine_AMVP mode and the Affine_merge mode. When the Affine_AMVP mode is used for predictive coding of the coding block, the prediction value of the CPMV of the coding block, namely the CPMVP, is obtained; the residual CPMVD between the CPMV and the CPMVP is calculated; and the information related to the CPMVD and CPMVP is written into the code stream. Optionally, in the Affine_AMVP mode the size of the coding block is greater than or equal to 16×16 pixels.
When the Affine_merge mode is used for predictive coding of the coding block, after the prediction value CPMVP of the CPMV of the coding block is obtained, the information related to the CPMVP is written into the code stream directly. Optionally, in the Affine_merge mode the size of the coding block is greater than or equal to 8×8 pixels.
S230: Perform motion compensation on the multiple sub-coding blocks in the coding block.
Specifically, motion compensation is performed on the multiple sub-coding blocks based on their MVs.
Optionally, a 6-tap interpolation filter may be used for the sub-pixel interpolation in motion compensation of a sub-coding block, or an 8-tap filter or one with another number of taps. The number of taps equals the number of filter coefficients; the more taps, the more pixels the filter needs and the larger the required transmission bandwidth.
In one possible implementation, in the Affine mode a 6-tap interpolation filter is used for sub-pixel interpolation to perform motion compensation on the sub-luma coding blocks, and a 4-tap interpolation filter on the sub-chroma coding blocks.
Specifically, if the MVs of the multiple sub-luma coding blocks and multiple sub-chroma coding blocks of the current coding block differ, then based on the pixel accuracy of the motion vector of the first sub-luma coding block among the multiple sub-luma coding blocks, a 6-tap interpolation filter performs sub-pixel interpolation on the reference block of the first sub-luma coding block to obtain its corresponding first sub-luma prediction block; and/or,
based on the pixel accuracy of the motion vector of the first sub-chroma coding block among the multiple sub-chroma coding blocks, a 4-tap interpolation filter performs sub-pixel interpolation on the reference block of the first sub-chroma coding block to obtain its corresponding first sub-chroma prediction block.
It should be understood that the coefficients of the 6-tap and 4-tap interpolation filters in the embodiments of this application are those in the prior art, or may be any other filter coefficients, which is not limited in the embodiments of this application.
Specifically, in the Affine mode, if the unidirectional prediction (Uni) mode is used, one reference frame list is constructed and one frame is selected from it for image prediction. When a reconstructed frame before the current frame (a historical frame) is selected to predict the coding blocks of the current frame, the process is called "forward prediction"; when a frame after the current frame (a future frame) is selected, it is called "backward prediction". Both forward and backward prediction are unidirectional prediction.
For an N×M luma coding block, prediction is performed with the 4×4 sub-coding blocks of the coding block as the unit. For one 4×4 sub-luma coding block, with a 6-tap interpolation filter applied to the luma reference block in the reference frame, the average number of integer reference pixels needed per integer pixel of the sub-luma coding block, i.e. the pixel bandwidth, is:

L_A = ((4+5)×(4+5)) / (4×4) ≈ 5.06

For an N×M chroma coding block, prediction is likewise performed with 4×4 sub-chroma coding blocks as the unit; one 4×4 sub-chroma coding block corresponds to an 8×8 luma area. With a 4-tap interpolation filter applied to the chroma reference block in the reference frame, the pixel bandwidth of the sub-chroma coding block, normalized to the luma grid, is:

C_A = ((4+3)×(4+3)) / (8×8) ≈ 0.77

Since one sub-luma coding block corresponds to 2 sub-chroma coding blocks, in the Affine mode with unidirectional prediction the total pixel bandwidth of an N×M coding block (one N×M luma coding block and two N/2×M/2 chroma coding blocks), with N and M positive integers, is:

S_A = L_A + 2×C_A ≈ 6.59
Similarly, in the Affine mode, when the bidirectional prediction mode (Bi), the dual forward prediction mode, or the dual backward prediction mode is used, in other words in the dual motion vector prediction modes, the total pixel bandwidth of an N×M coding block (one N×M luma coding block and two N/2×M/2 chroma coding blocks) is:

S_A = 2×L_A + 2×2×C_A ≈ 13.18
According to the above formulas, for coding blocks of different sizes, with a 6-tap interpolation filter for the luma coding block and a 4-tap filter for the chroma coding blocks, the average number of integer pixels needed per integer pixel in the Affine mode, in other words the required total bandwidth, is shown in Table 2.

[Table 2: total pixel bandwidth of coding blocks of different sizes in the Affine mode; reproduced as an image in the original.]
Comparing Tables 1 and 2, when the prediction mode is unidirectional, the total pixel bandwidth in the Affine mode is 6.59, below the maximum total pixel bandwidth of 11.34 in the non-Affine modes; when the prediction mode is bidirectional, the total pixel bandwidth in the Affine mode is 13.18, above the non-Affine maximum of 11.34.
It should be understood that when the prediction mode is the dual forward or dual backward prediction mode, i.e. any of the dual motion vector prediction modes, the total pixel bandwidth in the Affine mode is likewise 13.18, above the non-Affine maximum of 11.34. The short calculation below reproduces both Affine figures.
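A short sketch (Python) reproducing the Affine-mode figures above (6-tap luma interpolation on 4×4 sub-blocks; 4-tap chroma interpolation on 4×4 sub-chroma blocks, each covering an 8×8 luma area under YCbCr 4:2:0).

```python
def affine_bandwidth(bi: bool) -> float:
    refs = 2 if bi else 1
    luma = refs * (4 + 5) * (4 + 5) / (4 * 4)     # L_A: 6-tap needs 5 extra pixels
    chroma = refs * (4 + 3) * (4 + 3) / (8 * 8)   # C_A: one plane, per luma pixel
    return luma + 2 * chroma                      # S_A

print(round(affine_bandwidth(bi=False), 2))       # 6.59
print(round(affine_bandwidth(bi=True), 2))        # 13.19 (quoted as ~13.18 above)
```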
Since pixel interpolation of the reference block is needed during motion compensation to obtain the prediction block of the current coding block, a certain bandwidth is required to transfer the pixels needed in motion compensation. In the Affine mode, one coding block contains multiple sub-coding blocks, the motion vector of each sub-coding block reaches up to 1/16 pixel accuracy, and the many motion vectors together with the high-accuracy pixels also create considerable bandwidth pressure. Therefore, to reduce the bandwidth pressure of the Affine mode, the motion vectors in the Affine mode need to be restricted: besides using the 6-tap interpolation filter in the Affine mode to reduce the number of pixels in motion compensation, the motion vectors of the coding block in the Affine mode are also restricted.
On the basis of the video encoding and decoding method in FIG. 5, FIG. 7 shows a schematic flowchart of a specific video encoding method 300, applied at the video encoding end; the method includes the restriction process of the motion vectors of the coding block in the Affine mode, so as to reduce the bandwidth of the Affine mode.
Specifically, as shown in FIG. 7, the video encoding method 300 includes:
S311: Construct a candidate list of control point motion vectors of the coding block in the Affine mode, and obtain the control point motion vector of the coding block from the candidate list through rate-distortion cost (RD cost) calculation.
Optionally, step S311 in this embodiment may be a specific implementation of the foregoing step S210.
In this embodiment, the control point motion vector CPMV of the current coding block may be the motion vector of three control points or of two control points.
In one possible implementation, when the CPMV of the current coding block includes the motion vectors of two control points, the two control points are located at the upper-left and upper-right corners of the current coding block.
In another possible implementation, when the CPMV of the current coding block includes the motion vectors of three control points, the three control points are located at the upper-left, upper-right, and lower-left corners of the current coding block.
Optionally, the candidate list may include candidate CPMVs of two control points or candidate CPMVs of three control points.
Optionally, the candidate motion vectors in the candidate list may be obtained from the motion vectors of adjacent coding blocks, which may be of several types: a CPMV inferred from the CPMV of an adjacent coding block, a CPMV constructed from the translational motion vectors of adjacent coding blocks, or a CPMV calculated from other types of motion vectors of adjacent coding blocks; this is not limited in the embodiments of this application.
Optionally, the CPMV candidate lists in the Affine_merge and Affine_AMVP modes are different.
Optionally, when the Affine_merge mode is used, the coding block is greater than or equal to 8×8 pixels. The Affine_merge candidate list is constructed, in which each candidate CPMV is calculated from the CPMV of an adjacent coding block that is adjacent to the current coding block and is itself coded in the Affine mode.
Optionally, after RD cost calculation, the optimal CPMV in the candidate list is obtained and used as the prediction value of the CPMV of the current coding block, namely the CPMVP, and the index of the CPMVP in the candidate list is written into the code stream.
Optionally, when the Affine_AMVP mode is used, the coding block is greater than or equal to 16×16 pixels. The CPMVs in the Affine_AMVP candidate list may be inferred from the CPMVs of neighboring blocks, constructed from the translational MVs of neighboring blocks, or be converted MVs of neighboring blocks, and so on.
Optionally, after RD cost calculation, the optimal CPMV in the candidate list is obtained and used as the CPMVP of the current coding block; based on this CPMVP, motion estimation is performed in the reference frame to obtain the CPMV of the current coding block, and the residual between the CPMV and the CPMVP of the current coding block, also called the CPMVD, together with the index of the CPMVP in the reference list, is written into the code stream.
Through the above list construction and RD cost calculation, the CPMV of the current coding block is obtained; optionally, it may include the MVs of two control points or of three control points.
Optionally, the above process of constructing the candidate list of the coding block and obtaining its CPMV from the candidate list may be the process by which the luma coding block obtains its CPMV, in which case the CPMV is that of the luma coding block.
S320: Calculate the motion vectors of the multiple sub-coding blocks in the coding block according to the control point motion vector.
Optionally, this step S320 may be the same as step S220 in FIG. 5 and is not repeated here.
Specifically, after the MVs of the multiple sub-coding blocks are calculated from the CPMV of the current coding block with formula (1) or formula (2), the restriction process is applied to the motion vectors of the current coding block according to the MVs of the multiple sub-coding blocks. The specific restriction process may include:
S341: When unidirectional prediction is performed on the coding block, perform restriction calculation on the multiple first restricted blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
Specifically, in the Affine mode, suppose the motion vector (mv_x, mv_y) of a sub-coding block can be written in the form:

mv_x = a·x + c·y + e
mv_y = b·x + d·y + f

where a, b, c, d, e, and f are constants and (x, y) are the relative position coordinates of the sub-coding block in the coding block.
Optionally, when unidirectional prediction is performed on the coding block, restriction calculation is performed on the luma component of the coding block, i.e. on the multiple first restricted blocks in the luma coding block. Optionally, in this embodiment a first restricted block is of size 4×8 or 8×4.
A 4×8 first restricted block contains two 4×4 sub-coding blocks, at position coordinates (0, 0) and (0, 4). Relative to the sub-coding block at (0, 0), the mv_x of the two sub-coding blocks can be written as (0, 4c) and the mv_y as (0, 4d+4). Therefore the horizontal width bxW1 and vertical height bxH1 of the area pointed to by all MVs inside the 4×8 block, i.e. of the 4×8 first restricted block, are given by formula (3):

bxW1 = max(0, 4c) − min(0, 4c) + 4
bxH1 = max(0, 4d+4) − min(0, 4d+4) + 4

The number of pixels needed for sub-pixel interpolation of all sub-coding blocks inside the 4×8 first restricted block, i.e. of the 4×8 first restricted block itself, is Pix_num1 = (bxW1+5) × (bxH1+5).
An 8×4 first restricted block likewise contains two 4×4 sub-coding blocks, at position coordinates (0, 0) and (4, 0). Relative to the sub-coding block at (0, 0), the mv_x of the two sub-coding blocks can be written as (0, 4a+4) and the mv_y as (0, 4b). Therefore the horizontal width bxW2 and vertical height bxH2 of the area pointed to by all MVs inside the 8×4 block, i.e. of the 8×4 first restricted block, are given by formula (4):

bxW2 = max(0, 4a+4) − min(0, 4a+4) + 4
bxH2 = max(0, 4b) − min(0, 4b) + 4

The number of pixels needed for sub-pixel interpolation of all sub-coding blocks inside the 8×4 first restricted block, i.e. of the 8×4 first restricted block itself, is Pix_num2 = (bxW2+5) × (bxH2+5).
For every 8×4 or 4×8 first restricted block in the current coding block, the horizontal width and vertical height are calculated with formulas (3) and (4) above, and the number of pixels Pix_num needed for its sub-pixel interpolation is obtained. For convenience, the number of pixels needed for sub-pixel interpolation of a restricted block is also written below as the pixel count of the restricted block.
If the pixel count of every 4×8 first restricted block in the current coding block is at most M (an example of a preset threshold), and the pixel count of every 8×4 first restricted block is at most N (another example of a preset threshold), the motion vectors of the sub-coding blocks of the coding block are not modified; that is, the motion vectors calculated with the four-parameter formula (formula (1)) or the six-parameter formula (formula (2)) are kept.
If the pixel count of at least one 4×8 first restricted block exceeds M, or that of at least one 8×4 first restricted block exceeds N, the motion vectors of all sub-coding blocks in the current coding block are set to the same motion vector.
Optionally, this same motion vector may be the mean of the motion vectors of the multiple sub-coding blocks calculated with formula (1) in the four-parameter Affine mode or with formula (2) in the six-parameter Affine mode, or any other motion vector value, which is not limited in the embodiments of this application.
Optionally, in one possible implementation, M = N = 15×11 = 165.
S342: When bi-directional prediction is performed on the coding block, perform restriction calculation on multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
Specifically, when the bi-directional prediction mode is used to predictively code the current coding block, two reference frame lists are constructed, and two frames are selected from the reference frame lists for image prediction. The two frames may be a historical frame and a future frame, respectively.
In addition, it should be noted here that when the dual-forward prediction mode or the dual-backward prediction mode is used to predictively code the current coding block, restriction calculation is likewise performed on the multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks. The following takes the bi-directional prediction mode as an example to describe the restriction calculation process for the multiple second restriction blocks in the coding block; the restriction calculation in the dual-forward or dual-backward prediction mode is the same as in the bi-directional prediction mode and is not repeated here.
Optionally, in this embodiment of the present application, a second restriction block has a size of 8×8.
An 8×8 second restriction block contains four 4×4 sub-coding blocks, located at position coordinates (0, 0), (0, 4), (4, 0) and (4, 4), respectively. Relative to the sub-coding block at (0, 0), the mv_x of the four sub-coding blocks can be expressed as (0, 4c, 4a+4, 4a+4c+4) and the mv_y as (0, 4d+4, 4b, 4b+4d+4). Therefore, the horizontal width bxW and vertical height bxH of the region pointed to by all the mvs inside this 8×8 second restriction block — that is, the horizontal width and vertical height of the 8×8 second restriction block — are given by formula (5):

$$\begin{cases} bxW = \max(0,\,4c,\,4a+4,\,4a+4c+4) - \min(0,\,4c,\,4a+4,\,4a+4c+4) + 4 \\ bxH = \max(0,\,4d+4,\,4b,\,4b+4d+4) - \min(0,\,4d+4,\,4b,\,4b+4d+4) + 4 \end{cases} \qquad (5)$$

The number of pixels needed for sub-pixel interpolation of all sub-coding blocks inside this 8×8 second restriction block is Pix_num = (bxW+5)×(bxH+5).
If the pixel counts of all 8×8 second restriction blocks in the current coding block are less than or equal to W (an example of a preset threshold), the motion vectors of all sub-coding blocks in the coding block are not modified; that is, the motion vectors computed with the four-parameter Affine formula (formula (1)) or the six-parameter Affine formula (formula (2)) are kept.
If the pixel count of at least one 8×8 second restriction block in the current coding block is greater than W, the motion vectors of all sub-coding blocks in the current coding block are set to one and the same motion vector.
Optionally, this common motion vector may be the mean of the multiple motion vectors computed for the sub-coding blocks with the four-parameter Affine formula (formula (1)), or the mean of the multiple motion vectors computed with the six-parameter Affine formula (formula (2)).
Optionally, this common motion vector may also be any other motion vector value, which is not limited in the embodiments of the present application.
Optionally, in a possible implementation, W = 15×15 = 225.
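Continuing the sketch, the bi-directional check on an 8×8 second restriction block could look as follows (again a hypothetical illustration, reusing span() and pix_num() from the previous sketch):

```python
def bi_restriction_ok(a, b, c, d, W=225):
    # 8x8 second restriction block: sub-blocks at (0,0), (0,4),
    # (4,0) and (4,4); offsets as in formula (5).
    xs = [4 * c, 4 * a + 4, 4 * a + 4 * c + 4]
    ys = [4 * d + 4, 4 * b, 4 * b + 4 * d + 4]
    return pix_num(span(xs), span(ys)) <= W
```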
Optionally, during encoding, both step S341 and step S342 above may be executed. For example, when the current coding block is located in a bi-directionally predicted slice, bi-directional prediction is performed on the coding block with the restriction calculation of the bi-directional prediction mode, and uni-directional prediction is also performed on the coding block with the restriction calculation of the uni-directional prediction mode; RD Cost calculation is then performed on this basis to select the optimal prediction mode.
Optionally, during encoding, it is also possible to execute only step S341 or only step S342. For example, when the current coding block is located in a uni-directionally predicted slice, only uni-directional prediction is performed on the coding block, with the restriction calculation of the uni-directional prediction mode, and RD Cost calculation is performed on this basis to select the optimal prediction mode.
The above describes in detail the restriction process for the motion vectors of the luma coding block in the uni-directional and bi-directional prediction modes. Optionally, in this embodiment of the present application, the motion vectors of the chroma coding block in the coding block may be computed from the motion vectors of the luma coding block. Therefore, after the luma coding block goes through the above restriction process, if the MVs of all sub-luma coding blocks in the luma coding block are the same MV, then correspondingly the MVs of all sub-chroma coding blocks in the chroma coding block are also the same MV. If the MVs of the sub-luma coding blocks in the luma coding block differ, the MVs of the sub-chroma coding blocks in the chroma coding block also differ.
S350: Perform motion compensation on the coding block according to the result of the restriction.
Specifically, through the above restriction process, if the horizontal width and vertical height of a restriction block are not within the given threshold range, the MVs of the sub-coding blocks in the coding block are set to the same MV; the current coding block is then treated as a whole, and motion compensation is performed on the entire coding block. In this case, the bandwidth pressure of the current coding block in Affine mode is essentially the same as in non-Affine mode.
If, on the other hand, the horizontal width and vertical height of the restriction blocks are within the threshold range, the MVs of the sub-coding blocks in the coding block are not restricted; each sub-coding block is taken as the unit, and motion compensation is performed on each sub-coding block of the current coding block in turn.
Specifically, when uni-directional prediction is performed on the coding block, the coding block corresponds to one reference frame list and the MV of each sub-coding block is a single MV, so the prediction block or sub-prediction block corresponding to the coding block or sub-coding block can be determined directly in the reference frame from that MV.
When bi-directional, dual-forward or dual-backward prediction is performed on the coding block, the coding block corresponds to two reference frame lists and the MV of each sub-coding block is a dual MV, i.e., two MVs obtained through motion estimation, which may or may not be equal. According to the dual MV, two initial prediction blocks or two initial sub-prediction blocks corresponding to the coding block or sub-coding block are determined in the two reference frames, and the two initial (sub-)prediction blocks are then combined by weighted calculation to obtain the final prediction block or sub-prediction block.
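A minimal sketch of that weighted combination, assuming equal 1/2 weights and 8-bit samples (the text does not fix the weights):

```python
import numpy as np

def weighted_bi_prediction(pred0: np.ndarray, pred1: np.ndarray,
                           w0: float = 0.5, w1: float = 0.5) -> np.ndarray:
    """Combine the two initial (sub-)prediction blocks obtained from
    the two reference frames into the final (sub-)prediction block."""
    return np.clip(w0 * pred0 + w1 * pred1, 0, 255).astype(np.uint8)
```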
Specifically, in the process of determining the prediction block or sub-prediction block in the reference frame from the MV, sub-pixel interpolation is performed on the reference block at the position indicated by the MV in the reference frame to obtain the prediction block or sub-prediction block, whose pixel precision equals the pixel precision of the MV.
Optionally, in the sub-pixel interpolation process, an 8-tap interpolation filter, a 6-tap interpolation filter, or an interpolation filter with any other number of taps may be used, which is not limited in the embodiments of the present application.
In a possible implementation, a 6-tap interpolation filter is used for sub-pixel interpolation of the luma coding block, and a 4-tap interpolation filter for sub-pixel interpolation of the chroma coding block.
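For illustration, a one-dimensional sub-pixel interpolation step might be sketched as follows; the 6-tap coefficients below are made-up placeholders, not the actual filter coefficients of any codec:

```python
import numpy as np

def interp_1d(row: np.ndarray, pos: float, taps: np.ndarray) -> float:
    """Horizontal sub-pixel interpolation at fractional position `pos`
    using an n-tap filter centered on the two nearest integer samples."""
    n = len(taps)
    base = int(np.floor(pos)) - (n // 2 - 1)  # leftmost input sample
    return float(np.dot(row[base:base + n], taps))

taps6 = np.array([-1, 5, 52, 12, -5, 1]) / 64.0  # hypothetical, sums to 1
row = np.arange(20, dtype=float)
print(interp_1d(row, 8.5, taps6))  # a value between samples 8 and 9
```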
After the prediction block or sub-prediction block of the coding block or sub-coding block is obtained, the residual of the coding block or sub-coding block is calculated. Through the above video encoding method 300, the CPMV and the residual in Affine mode are obtained; the RD cost of the current coding block is calculated from this residual and the CPMV, and compared with the RD Cost of the current coding block in other modes to decide whether to use Affine mode for predictive coding of the current coding block.
If the RD Cost calculation determines that the Affine_merge mode is the prediction mode of the current coding block, the index of the CPMVP in the candidate list is written into the bitstream; if it determines that the Affine_AMVP mode is the prediction mode of the current coding block, the index of the CPMVP is written into the bitstream together with the CPMVD.
According to the restriction processes in steps S341 and S342 above, when the prediction mode is uni-directional prediction, if the pixel counts of all 8×4 first restriction blocks in the luma coding block are less than or equal to 165, and the pixel counts of all 4×8 first restriction blocks in the luma coding block are also less than or equal to 165, then the number of integer pixels of the reference block needed on average per integer pixel of the luma coding block, i.e., the pixel bandwidth, is given by:
$$BW_{luma} = \frac{\mathrm{Pix\_num}}{8 \times 4} \le \frac{165}{8 \times 4} \approx 5.16$$
Correspondingly, the number of integer pixels of the reference block needed on average per integer pixel of the chroma coding block, i.e., the pixel bandwidth, is given by:
$$BW_{chroma} = \frac{(4+3) \times (4+3)}{4 \times 4} \approx 3.06$$
Therefore, in the uni-directional prediction mode, if the MVs of the sub-luma coding blocks in the luma coding block are not modified by the restriction, the pixel bandwidth of the coding block does not exceed 6.69, which is smaller than the maximum pixel bandwidth of 11.34 in non-Affine mode. In this case, the bandwidth of Affine mode is smaller than that of non-Affine mode.
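As a consistency check on these bounds (an inference from the numbers stated above, assuming 4:2:0 sampling so that each luma pixel carries half a chroma pixel on average):

$$\frac{165}{8 \times 4} + \frac{1}{2} \cdot \frac{(4+3)^2}{4 \times 4} \approx 5.16 + 1.53 = 6.69, \qquad \frac{2 \times 225}{8 \times 8} + \frac{1}{2} \cdot \frac{2 \times (4+3)^2}{4 \times 4} \approx 7.03 + 3.06 = 10.09$$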
If the MVs of the sub-luma coding blocks in the luma coding block are modified by the restriction, the bandwidth of Affine mode is essentially the same as that of non-Affine mode.
Similarly, when the prediction mode is bi-directional prediction, if the pixel counts of all 8×8 second restriction blocks in the luma coding block are less than or equal to 225, then the number of integer pixels of the reference block needed on average per integer pixel of the luma coding block, i.e., the pixel bandwidth, is given by:
$$BW_{luma} = \frac{2 \times \mathrm{Pix\_num}}{8 \times 8} \le \frac{2 \times 225}{8 \times 8} \approx 7.03$$
Correspondingly, the number of integer pixels of the reference block needed on average per integer pixel of the chroma coding block, i.e., the pixel bandwidth, is given by:
$$BW_{chroma} = \frac{2 \times (4+3) \times (4+3)}{4 \times 4} \approx 6.13$$
Therefore, in the bi-directional prediction mode, after the restriction calculation, if the MVs of the sub-luma coding blocks in the luma coding block are not modified by the restriction, the bandwidth does not exceed 10.09, which is smaller than the maximum bandwidth of 11.34 in non-Affine mode. In this case, the bandwidth of Affine mode is smaller than that of non-Affine mode.
Likewise, if the MVs of the sub-luma coding blocks in the luma coding block are modified by the restriction, the bandwidth of Affine mode is essentially the same as that of non-Affine mode.
It should be understood that when the prediction mode is the dual-forward or dual-backward prediction mode — that is, any one of the dual motion vector prediction modes — the bandwidth situation is the same as in the bi-directional prediction mode above and is not repeated here.
Specifically, if the luma component uses a 6-tap interpolation filter, the chroma component uses a 4-tap interpolation filter, and the MV restriction process is performed on the coding block, then in Affine mode the number of integer pixels needed on average per integer pixel of the coding block — in other words, the required total bandwidth — is shown in Table 3.
Table 3
[Table image not reproduced. Per the surrounding text, with the MV restriction applied, the Affine-mode total pixel bandwidth does not exceed 6.69 for uni-directional prediction and 10.09 for bi-directional (dual motion vector) prediction, both no greater than the non-Affine maximum of 11.34.]
Therefore, after the restriction processes in steps S341 and S342 above, it can be seen from Table 3 together with Table 1 that, whether in the uni-directional, bi-directional, dual-forward or dual-backward prediction mode, the bandwidth in Affine mode is less than or equal to the bandwidth in non-Affine mode, and no extra bandwidth pressure is introduced.
On the basis of the video coding/decoding method in Fig. 5, Fig. 8 shows a schematic flowchart of a specific video decoding method 400, applicable to the video decoding end. The method likewise includes a restriction process on the motion vectors of a coding block in Affine mode, so as to reduce the bandwidth in Affine mode.
It should be understood that the decoding process in this embodiment corresponds to the encoding process in Fig. 7: the video encoding method 300 is used to encode a coding block in a frame to be encoded into a bitstream of the coding block, and the video decoding method 400 is used to decode the bitstream of the coding block. For the same or similar technical features, reference may be made to the related description of the video encoding method 300; details are not repeated in this embodiment.
As shown in Fig. 8, the video decoding method 400 includes:
S411: Obtain the bitstream of the coding block.
S412: Determine from the bitstream of the coding block that the coding mode of the coding block is Affine mode, and obtain index information of the control point motion vectors of the coding block.
S413: Construct a candidate list of control point motion vectors of the coding block, and obtain the control point motion vectors according to the candidate list and the index information of the control point motion vectors.
Specifically, the bitstream of the coding block received by the video decoding end includes a flag identifying whether the coding block is in Affine mode. More specifically, the bitstream of the coding block includes a flag identifying whether the coding block is in Affine_merge mode or Affine_AMVP mode. From this flag it can be determined whether the coding mode of the coding block is Affine mode or, more specifically, whether it is Affine_merge mode or Affine_AMVP mode.
In addition, if the current coding block is predictively coded in Affine mode, the bitstream of the coding block further includes the index information of the CPMV of the current coding block in the candidate list.
Optionally, when it is determined that the coding mode of the current coding block is Affine mode, a candidate list of CPMVs in Affine mode is constructed.
Specifically, when it is determined that the coding mode of the current coding block is Affine_merge mode, a CPMV candidate list for Affine_merge mode is constructed. The index value of the CPMV of the current coding block in the candidate list is obtained from the bitstream, and the CPMV of the current coding block is determined directly from that index value.
When the coding mode of the current coding block is Affine_AMVP, a CPMV candidate list for Affine_AMVP mode is constructed. The index value of the CPMVP of the current coding block in the candidate list and the CPMVD of the current coding block are obtained from the bitstream; the CPMVP of the current coding block is determined in the candidate list from that index value, and the CPMVP is added to the corresponding CPMVD to obtain the CPMV of the current coding block.
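A minimal sketch of this decoder-side CPMV reconstruction (names hypothetical; a CPMV is represented as a list of (x, y) control point MVs):

```python
def reconstruct_cpmv(candidate_list, index, cpmvd=None):
    """Affine_merge: the indexed candidate is the CPMV itself.
    Affine_AMVP: the indexed candidate is the CPMVP; add the CPMVD."""
    cpmvp = candidate_list[index]
    if cpmvd is None:               # Affine_merge
        return cpmvp
    return [(px + dx, py + dy)      # Affine_AMVP: CPMV = CPMVP + CPMVD
            for (px, py), (dx, dy) in zip(cpmvp, cpmvd)]
```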
Optionally, during decoding, the obtained CPMV of the current coding block may be the motion vectors of two control points or of three control points.
S420: Calculate the motion vectors of multiple sub-coding blocks in the coding block according to the control point motion vectors.
Specifically, when the control point motion vectors are those of two control points, the four-parameter Affine formula (formula (1)) is used to compute the MVs of the multiple sub-coding blocks in the current coding block.
When the control point motion vectors are those of three control points, the six-parameter Affine formula (formula (2)) is used to compute the MVs of the multiple sub-coding blocks in the current coding block.
Optionally, in this embodiment of the present application, the specific implementation of step S420 may be the same as step S220 in Fig. 5 or step S320 in Fig. 7, and details are not repeated here.
Optionally, in this embodiment of the present application, a flag in the bitstream may also be used to determine whether the current coding block is in bi-directional prediction (Bi) mode or uni-directional prediction (Uni) mode, and the MVs of the coding block are restricted in different ways according to the prediction mode.
Optionally, in a possible implementation, when the flag is 1 or 2, the current coding block uses the uni-directional prediction mode; in particular, when the flag is 1 the current coding block uses the forward prediction mode, and when the flag is 2 it uses the backward prediction mode. When the flag is 3, the current coding block uses the bi-directional prediction mode.
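A sketch of this flag interpretation (the mapping follows the values stated in this possible implementation; the function name is hypothetical):

```python
def prediction_mode(flag: int) -> str:
    # Flag semantics as stated above: 1 -> forward (uni),
    # 2 -> backward (uni), 3 -> bi-directional.
    return {1: "uni_forward", 2: "uni_backward", 3: "bi"}[flag]
```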
Specifically, the process of restricting the MVs of the coding block during decoding is similar to the process of restricting the MVs of the coding block during encoding, and may specifically include:
S441: If the coding block uses the uni-directional prediction mode, perform restriction calculation on multiple first restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
S442: If the coding block uses the bi-directional prediction mode, perform restriction calculation on multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
Optionally, step S441 may be the same as step S341 in Fig. 7, and step S442 may be the same as step S342 in Fig. 7; details are not repeated here.
S450: Perform motion compensation on the coding block according to the result of the restriction.
Optionally, after the restriction calculation, the MVs of the multiple sub-coding blocks in the coding block may be stored in a buffer, and motion compensation is then performed based on the MVs of the multiple sub-coding blocks.
Optionally, step S450 may also be the same as step S350 in Fig. 7; for specific implementations, refer to the related description, which is not repeated here.
Since the motion compensation process in Affine mode is similar in decoding and encoding, during decoding, after the restriction process and the processing by the 6-tap interpolation filter, the bandwidth in Affine mode is likewise less than or equal to the bandwidth in non-Affine mode.
From the above description and analysis of the Affine-mode encoding and decoding processes, it can be seen that restricting the MVs of the coding block through the restriction process can reduce the bandwidth in Affine mode. At the same time, however, the uni-directional, bi-directional, dual-forward and dual-backward prediction modes all have to go through the complex restriction calculation — steps S341 and S342 in the encoding process and steps S441 and S442 in the decoding process above — and motion compensation of the coding block is performed based on the restriction result. While reducing the Affine-mode bandwidth, this adds extra processing, increases the complexity of the coding/decoding system, and affects coding/decoding performance.
Therefore, the present application proposes another, more optimized coding/decoding method which, while reducing the Affine-mode bandwidth, also optimizes the restriction process in Affine mode, thereby reducing the complexity of the coding/decoding system and improving video coding/decoding performance.
Fig. 9 shows a flowchart of another video coding/decoding method according to an embodiment of the present application.
As shown in Fig. 9, the video coding/decoding method 500 includes:
S510: Obtain the control point motion vectors of a coding block in the affine motion compensation prediction (Affine) mode, the control point motion vectors being used to calculate the motion vectors of multiple sub-coding blocks in the coding block.
Specifically, in Affine mode, a coding block can be divided into multiple sub-coding blocks, and motion compensation is performed according to the motion vector of each of the multiple sub-coding blocks to obtain multiple prediction values and multiple pieces of coding information.
Optionally, the coding block may be a coding unit CU or another type of image block, which is not specifically limited in the embodiments of the present application.
Optionally, in Affine mode, the coding block may be a coding block greater than or equal to 8×8 pixels and smaller than 128×128 pixels, for example of size 8×8, 8×16, 16×16, 8×128, and so on; the specific size of the coding block is likewise not limited in the embodiments of the present application.
Optionally, in Affine mode, the sub-coding block may be called a sub-coding unit (sub-CU); it may be 4×4 pixels or another size, and the specific size of the sub-coding block is likewise not limited in the embodiments of the present application.
Optionally, the motion vector of each sub-coding block in the coding block may be calculated by obtaining the control point motion vectors (Control Point Motion Vector, CPMV) of the current coding block in Affine mode.
Optionally, the control point motion vectors CPMV may be the motion vectors of two control points, in which case the Affine mode is also called the four-parameter Affine mode; alternatively, the CPMV may be the motion vectors of three control points, in which case the Affine mode is also called the six-parameter Affine mode.
Specifically, the process of calculating a sub-coding block's motion vector MV from the CPMV may be the same as step S320 in the video encoding method 300; for the sub-coding block MV formulas in the four-parameter and six-parameter Affine modes, see formula (1) and formula (2), which are not repeated here.
Optionally, in this embodiment of the present application, the pixel precision of the CPMV of the coding block may be 4, 2, 1, 1/2, 1/4, 1/8 or 1/16. It should be understood that the pixel precision of the CPMV may also be any other pixel precision, which is not specifically limited in the embodiments of the present application.
Optionally, the MV precision of the sub-coding blocks in the coding block may be the same or different; the embodiments of the present application do not specifically limit the MV precision of the sub-coding blocks. For example, it may likewise be 4, 2, 1, 1/2, 1/4, 1/8, 1/16 pixel precision or another pixel precision.
Optionally, in a possible implementation, the motion vector of each sub-coding block computed with the above formulas can, after rounding, reach at most 1/16 pixel precision.
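For illustration, rounding an exact MV component to 1/16-pel precision might look like this sketch (not a normative rounding rule):

```python
import math

def round_to_sixteenth(mv_component: float) -> float:
    # Round to the nearest multiple of 1/16 pixel, half away from zero.
    scaled = mv_component * 16
    return math.floor(abs(scaled) + 0.5) * (1 if scaled >= 0 else -1) / 16

print(round_to_sixteenth(0.3))    # 0.3125  (= 5/16)
print(round_to_sixteenth(-0.04))  # -0.0625 (= -1/16)
```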
In other words, when the MV precision of a sub-coding block is 1/16 pixel, the corresponding reference block in the reference frame is determined from the MV of the sub-coding block, and sub-pixel interpolation is performed on the reference block to obtain the sub-prediction block of the sub-coding block, the pixel precision of the sub-pixel interpolation being 1/16 pixel.
Optionally, since the coordinate positions of the sub-coding blocks within the coding block differ, different sub-coding blocks obtain different MVs from the CPMV and formula (1) or formula (2) above.
Optionally, the above coding block may be a luma coding block, for which the MVs of the multiple sub-luma coding blocks are computed through the four-parameter or six-parameter Affine mode.
For a chroma coding block, when the video sampling format is YCbCr 4:2:0, one chroma coding block is the same size as 4 luma coding blocks, i.e., one chroma coding block corresponds to 4 luma coding blocks. Therefore, after the MVs of all sub-luma coding blocks in the luma coding block are obtained with the above formulas, 4 first sub-luma coding blocks among the multiple sub-luma coding blocks correspond to one first sub-chroma coding block in the chroma coding block, and the MV of that first sub-chroma coding block is the mean of the MVs of the 4 first sub-luma coding blocks. In this way, the MVs of all sub-chroma coding blocks in the chroma coding block can be computed. Here, both the sub-luma and sub-chroma coding blocks are 4×4 pixels.
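A minimal sketch of this chroma MV derivation (names hypothetical):

```python
def chroma_mv_from_luma(luma_mvs):
    """MV of one 4x4 sub-chroma block: the mean of the MVs of the four
    co-located 4x4 sub-luma blocks (YCbCr 4:2:0), as described above."""
    assert len(luma_mvs) == 4
    mvx = sum(mv[0] for mv in luma_mvs) / 4
    mvy = sum(mv[1] for mv in luma_mvs) / 4
    return (mvx, mvy)

print(chroma_mv_from_luma([(1, 2), (1, 2), (3, 2), (3, 6)]))  # (2.0, 3.0)
```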
In addition, rate-distortion optimization (RDO) or another technique is used for mode decision. If the decision is to predictively code the coding block in Affine mode, only the MVs of the control points, i.e., the CPMV, need to be written into the bitstream, rather than the MV of every sub-coding block in the CU; the decoding end can obtain the CPMV from the bitstream and then compute the MV of each sub-coding block in the CU from the CPMV.
Specifically, Affine mode includes the Affine_AMVP mode and the Affine_merge mode. When the Affine_AMVP mode is used to predictively code the coding block, the predictor of the coding block's CPMV, i.e., the CPMVP, is obtained, the residual CPMVD between the CPMV and the CPMVP is computed, and the CPMVD and information about the CPMVP are written into the bitstream.
Optionally, in Affine_AMVP mode, the size of the coding block is greater than or equal to 16×16 pixels.
When the Affine_merge mode is used to predictively code the coding block, after the predictor CPMVP of the coding block's CPMV is obtained, the information about the CPMVP is written into the bitstream directly.
Optionally, in Affine_merge mode, the size of the coding block is greater than or equal to 8×8 pixels.
S520: When uni-directional prediction is performed on the coding block, perform motion compensation on a first sub-coding block among the multiple sub-coding blocks based on the motion vector of the first sub-coding block.
Specifically, with the CPMV of the current coding block obtained as above, the MVs of the multiple sub-coding blocks in the coding block can be calculated, and motion compensation is performed on the multiple sub-coding blocks separately according to their MVs. The multiple sub-coding blocks include the first sub-coding block, whose MV is obtained through the above calculation.
Specifically, motion compensation is performed on the multiple sub-coding blocks of the coding block based on their MVs.
Optionally, a 6-tap interpolation filter may be used for the sub-pixel interpolation that performs motion compensation of a sub-coding block, or a 4-tap interpolation filter or one with another number of taps may be used; the embodiments of the present application do not specifically limit the number of taps of the interpolation filter.
In a possible implementation, in Affine mode, a 6-tap interpolation filter is used for the sub-pixel interpolation that performs motion compensation of sub-luma coding blocks, and a 4-tap interpolation filter for sub-chroma coding blocks.
Specifically, if the MVs of the multiple sub-luma coding blocks and of the multiple sub-chroma coding blocks in the current coding block differ, then, based on the pixel precision of the motion vector of a first sub-luma coding block among the multiple sub-luma coding blocks, a 6-tap interpolation filter is used to perform sub-pixel interpolation on the reference block of the first sub-luma coding block to obtain the first sub-luma prediction block corresponding to the first sub-luma coding block; and/or,
based on the pixel precision of the motion vector of a first sub-chroma coding block among the multiple sub-chroma coding blocks, a 4-tap interpolation filter is used to perform sub-pixel interpolation on the reference block of the first sub-chroma coding block to obtain the first sub-chroma prediction block corresponding to the first sub-chroma coding block.
It should be understood that the filter coefficients of the 6-tap and 4-tap interpolation filters in the embodiments of the present application may be filter coefficients from the prior art, or any other filter coefficients, which is not limited in the embodiments of the present application.
In this embodiment of the present application, in Affine mode, when uni-directional prediction (Uni) is performed on the coding block — i.e., one reference frame list is constructed and one frame is selected from it for image prediction — motion compensation is performed on the first sub-coding block directly based on the MV of the first sub-coding block.
Optionally, the first sub-coding block is 4×4 pixels. It should be understood that the first sub-coding block may also be of any other size, which is not limited in the embodiments of the present application.
Specifically, based on the pixel precision of the motion vector of the first sub-coding block, an interpolation filter is used to perform sub-pixel interpolation on the reference block of the first sub-coding block to obtain the first sub-prediction block corresponding to the first sub-coding block.
Likewise, for the other sub-coding blocks of the current coding block besides the first sub-coding block, sub-pixel interpolation is performed on their respective reference blocks with an interpolation filter, based on the pixel precision of their respective motion vectors, to obtain multiple sub-prediction blocks.
In this embodiment of the present application, in Affine mode with uni-directional prediction, the sub-coding block motion vectors computed in step S510 are not restricted; that is, the restriction process on the motion vectors of the current coding block (steps S341 and S441 mentioned in the foregoing embodiments) is not executed, and motion compensation is performed directly with the sub-coding block motion vectors computed in step S510.
In motion compensation, the sub-coding block serves as the unit of motion compensation: motion compensation is performed separately on the multiple sub-coding blocks of the coding block, not on the coding block as a whole.
Optionally, for the motion vector restriction process, see step S341 in Fig. 7. Compared with the video encoding method 300 in Fig. 7, this embodiment specifically does not execute step S341; and in step S350, in uni-directional prediction, motion compensation of the coding block is not based on a restriction result.
Therefore, with the solution of this embodiment of the present application, in Affine mode with uni-directional prediction, the restriction process on the motion vectors of the current coding block is simply omitted, which reduces the motion vector computation for the current block in uni-directional prediction, lowers the complexity of the coding/decoding system, saves coding time, and improves coding efficiency.
In addition, as can be seen from Tables 1 and 2 and the related description above, when uni-directional prediction is used, the bandwidth in Affine mode is smaller than in non-Affine mode. Therefore, not executing the restriction process on the current coding block's motion vectors in the uni-directional prediction mode still guarantees that Affine mode brings no extra bandwidth pressure. Thus the solution of this embodiment reduces the complexity of the coding/decoding system and improves coding efficiency without adding bandwidth pressure, giving the coding/decoding system better performance.
On the basis of the video coding/decoding method in Fig. 9, Fig. 10 shows a schematic flowchart of a specific video encoding method 600, applicable to the video encoding end.
As shown in Fig. 10, the video encoding method 600 includes:
S611: Construct, in Affine mode, a candidate list of control point motion vectors of the coding block, and obtain the control point motion vector of the coding block from the candidate list through RD Cost calculation.
Optionally, step S611 in this embodiment of the present application may be a specific implementation of the above step S510.
In this embodiment of the present application, the control point motion vectors CPMV of the current coding block may be the motion vectors of three control points, or the motion vectors of two control points.
In a possible implementation, when the CPMV of the current coding block includes the motion vectors of two control points, the two control points are located at the top-left corner and the top-right corner of the current coding block, respectively.
In another possible implementation, when the CPMV of the current coding block includes the motion vectors of three control points, the three control points are located at the top-left corner, the top-right corner and the bottom-left corner of the current coding block, respectively.
Optionally, the candidate list may include candidate motion vectors for a two-control-point CPMV, and may also include candidate motion vectors for a three-control-point CPMV.
Optionally, the candidate motion vectors in the candidate list may be derived from the motion vectors of neighboring coding blocks.
Optionally, the motion vectors of neighboring coding blocks may be of multiple types: a CPMV inferred from the CPMV of a neighboring coding block, a CPMV constructed from the translational motion vectors of neighboring coding blocks, or a CPMV computed from other types of motion vectors of neighboring coding blocks; this is not limited in the embodiments of the present application.
Optionally, the CPMV candidate lists in the Affine_merge mode and the Affine_AMVP mode are different.
Optionally, when the Affine_merge mode is used, the coding block is greater than or equal to 8×8 pixels. A candidate list for Affine_merge mode is constructed, in which the candidate CPMVs are computed from the CPMVs of neighboring coding blocks that are adjacent to the current coding block and are likewise coded in Affine mode.
Optionally, after the RD Cost calculation, the optimal CPMV in the candidate list is obtained and used as the predictor of the CPMV of the current coding block, i.e., the CPMVP, and the index of this CPMVP in the candidate list is written into the bitstream.
Optionally, when the Affine_AMVP mode is used, the coding block is greater than or equal to 16×16 pixels. The CPMVs in the Affine_AMVP candidate list may be inferred from the CPMVs of neighboring blocks, constructed from the translational MVs of neighboring blocks, or derived from converted MVs of neighboring blocks, and so on.
Optionally, after the RD Cost calculation, the optimal CPMV in the candidate list is obtained and used as the CPMVP of the current coding block; motion estimation is performed in the reference frame according to this CPMVP to obtain the CPMV of the current coding block, and the residual between the CPMV and the CPMVP of the current coding block, also called the CPMVD, together with the index of the CPMVP in the reference list, is written into the bitstream.
Through the above list construction and RD Cost calculation, the CPMV of the current coding block is obtained. Optionally, the CPMV may include the MVs of two control points, or the MVs of three control points.
Optionally, the above processes of constructing the candidate list of the coding block and obtaining the CPMV of the coding block from the candidate list may be the process by which a luma coding block obtains its CPMV, the CPMV being that of the luma coding block.
S612: Calculate the motion vectors of multiple sub-coding blocks in the coding block according to the control point motion vectors.
Specifically, the MVs of the multiple sub-coding blocks in the coding block are calculated from the control point motion vectors CPMV of the current coding block using the foregoing calculation formula (1) or formula (2).
S620: When uni-directional prediction is performed on the coding block, perform motion compensation on a first sub-coding block among the multiple sub-coding blocks based on the motion vector of the first sub-coding block.
S631: When bi-directional prediction is performed on the coding block, perform restriction calculation on multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
Specifically, when the bi-directional prediction mode is used to predictively code the current coding block, two reference frame lists are constructed, and two frames are selected from the reference frame lists for image prediction. The two frames may be a historical frame and a future frame, respectively.
In addition, it should be noted here that when the dual-forward or dual-backward prediction mode — in other words, a dual motion vector prediction mode — is used to predictively code the current coding block, restriction calculation is likewise performed on the multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
Specifically, step S631 may be the same as step S342 in Fig. 7; for the related technical features and solutions, refer to the description of step S342 above, which is not repeated here.
S632: Perform motion compensation on the coding block according to the result of the restriction.
Specifically, in bi-directional, dual-forward or dual-backward prediction, through the restriction process in step S631 above, if the horizontal width and vertical height of a restriction block are not within the given threshold range, the MVs of the sub-coding blocks in the coding block are set to the same MV; the current coding block is then treated as a whole, and motion compensation is performed on the entire coding block.
If, on the other hand, the horizontal width and vertical height of the restriction blocks are within the threshold range, the MVs of the sub-coding blocks in the coding block are not restricted; each sub-coding block is taken as the unit, and motion compensation is performed on each sub-coding block of the current coding block in turn.
Specifically, when uni-directional prediction is performed on the coding block, the coding block corresponds to one reference frame list and each sub-coding block has only one MV, so the sub-prediction blocks corresponding to the multiple sub-coding blocks can be determined directly in the reference frame from that MV.
When bi-directional, dual-forward or dual-backward prediction is performed on the coding block, the coding block corresponds to two reference frame lists; for each sub-coding block, two initial prediction blocks or two initial sub-prediction blocks corresponding to the coding block or sub-coding block are determined in the two reference frames according to its MVs, and the two initial (sub-)prediction blocks are then combined by weighted calculation to obtain the final prediction block or sub-prediction block.
Specifically, in the process of determining the prediction block or sub-prediction block in the reference frame from the MV, sub-pixel interpolation is performed on the reference block at the position indicated by the MV in the reference frame to obtain the prediction block or sub-prediction block, whose pixel precision equals the pixel precision of the MV.
Optionally, in the sub-pixel interpolation process, an 8-tap interpolation filter, a 6-tap interpolation filter, or an interpolation filter with any other number of taps may be used, which is not limited in the embodiments of the present application.
In a possible implementation, a 6-tap interpolation filter is used for sub-pixel interpolation of the luma coding block, and a 4-tap interpolation filter for sub-pixel interpolation of the chroma coding block.
After the prediction block or sub-prediction block of the coding block or sub-coding block is obtained, the residual of the coding block or sub-coding block is calculated. Through the above video encoding method 600, the CPMV and the residual in Affine mode are obtained; the RD cost of the current coding block is calculated from this residual and the CPMV, and compared with the RD Cost of the current coding block in other modes to decide whether to use Affine mode for predictive coding of the current coding block.
If the RD Cost calculation determines that the Affine_merge mode is the prediction mode of the current coding block, the index of the CPMVP in the candidate list is written into the bitstream; if it determines that the Affine_AMVP mode is the prediction mode, the index of the CPMVP is written into the bitstream together with the CPMVD.
When uni-directional prediction is used, the bandwidth in Affine mode is smaller than in non-Affine mode, so not executing the restriction process on the current coding block's motion vectors in the uni-directional prediction mode still guarantees that Affine mode brings no extra bandwidth pressure. Moreover, in the bi-directional prediction mode, using a 6-tap interpolation filter for pixel interpolation of the reference block and executing the restriction process on the current coding block's motion vectors also relieves the bandwidth pressure in Affine mode. Therefore, with the solution of this embodiment of the present application, by handling the uni-directional and bi-directional prediction modes differently, the complexity of the coding/decoding system can be reduced and coding efficiency improved without adding bandwidth pressure, giving the coding/decoding system better performance.
On the basis of the video coding/decoding method in Fig. 9, Fig. 11 shows a schematic flowchart of a specific video decoding method 700, applicable to the video decoding end.
It should be understood that the decoding process in this embodiment corresponds to the encoding process in Fig. 10: the video encoding method 600 is used to encode a coding block in a frame to be encoded into a bitstream of the coding block, and the video decoding method 700 is used to decode the bitstream of the coding block. For the same or similar technical features, reference may be made to the related description of the video encoding method 600; details are not repeated in this embodiment.
As shown in Fig. 11, the video decoding method 700 includes:
S711: Obtain the bitstream of the coding block.
S712: Determine from the bitstream of the coding block that the coding mode of the coding block is Affine mode, and obtain index information of the control point motion vectors of the coding block.
S713: Construct a candidate list of control point motion vectors of the coding block, and obtain the control point motion vectors according to the candidate list and the index information of the control point motion vectors.
Specifically, the bitstream of the coding block received by the video decoding end includes a flag identifying whether the coding block is in Affine mode. More specifically, the bitstream of the coding block includes a flag identifying whether the coding block is in Affine_merge mode or Affine_AMVP mode. From this flag it can be determined whether the coding mode of the coding block is Affine mode or, more specifically, whether it is Affine_merge mode or Affine_AMVP mode.
In addition, if the current coding block is predictively coded in Affine mode, the bitstream of the coding block further includes the index information of the CPMV of the current coding block in the candidate list.
Optionally, when it is determined that the coding mode of the current coding block is Affine mode, a candidate list of CPMVs in Affine mode is constructed.
Specifically, when it is determined that the coding mode of the current coding block is Affine_merge mode, a CPMV candidate list for Affine_merge mode is constructed. The index value of the CPMV of the current coding block in the candidate list is obtained from the bitstream, and the CPMV of the current coding block is determined directly from that index value.
When the coding mode of the current coding block is Affine_AMVP, a CPMV candidate list for Affine_AMVP mode is constructed. The index value of the CPMVP of the current coding block in the candidate list and the CPMVD of the current coding block are obtained from the bitstream; the CPMVP of the current coding block is determined in the candidate list from that index value, and the CPMVP is added to the corresponding CPMVD to obtain the CPMV of the current coding block.
Optionally, during decoding, the obtained CPMV of the current coding block may be the motion vectors of two control points or of three control points.
S720: Calculate the motion vectors of multiple sub-coding blocks in the coding block according to the control point motion vectors.
Specifically, when the control point motion vectors are those of two control points, the four-parameter Affine formula (formula (1)) is used to compute the MVs of the multiple sub-coding blocks in the current coding block.
When the control point motion vectors are those of three control points, the six-parameter Affine formula (formula (2)) is used to compute the MVs of the multiple sub-coding blocks in the current coding block.
Optionally, in this embodiment of the present application, the specific implementation of step S720 may be the same as step S220 in Fig. 5 or step S320 in Fig. 7, and details are not repeated here.
Optionally, in this embodiment of the present application, a flag in the bitstream may also be used to determine whether the current coding block is in bi-directional prediction (Bi) mode or uni-directional prediction (Uni) mode, and the MVs of the coding block are restricted in different ways according to the prediction mode.
Optionally, in a possible implementation, when the flag is 1 or 2, the current coding block uses the uni-directional prediction mode; in particular, when the flag is 1 the current coding block uses the forward prediction mode, and when the flag is 2 it uses the backward prediction mode. When the flag is 3, the current coding block uses the bi-directional prediction mode.
If the current coding block uses the uni-directional prediction mode, step S731 below is executed; if the current coding block uses the bi-directional prediction mode, steps S732 and S733 below are executed, as sketched after this paragraph.
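A schematic sketch of this decoder-side dispatch (hypothetical names; the callables stand in for the motion compensation and restriction operations described above):

```python
def decode_affine_block(sub_mvs, mode, motion_compensate, restrict):
    """Decoder-side dispatch: uni-directional prediction skips the
    restriction process entirely (S731); bi-directional and other
    dual-MV modes run it first (S732) and then compensate (S733)."""
    if mode in ("uni_forward", "uni_backward"):
        mvs = sub_mvs                # no restriction in uni mode
    else:                            # "bi" and other dual-MV modes
        mvs = restrict(sub_mvs)      # may collapse to one common MV
    for mv in mvs:                   # per-sub-block motion compensation
        motion_compensate(mv)
```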
S731: If the coding block uses the uni-directional prediction mode, perform motion compensation on a first sub-coding block among the multiple sub-coding blocks based on the motion vector of the first sub-coding block.
Optionally, step S731 may be the same as step S620 in Fig. 10, and details are not repeated here.
S732: If the coding block uses the bi-directional prediction mode, perform restriction calculation on multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
Likewise, if the coding block uses the dual-forward or dual-backward prediction mode, restriction calculation is also performed on the multiple second restriction blocks in the coding block according to the motion vectors of the multiple sub-coding blocks.
S733: Perform motion compensation on the coding block according to the result of the restriction.
Optionally, after the restriction calculation, the MVs of the multiple sub-coding blocks in the coding block may be stored in a buffer, and motion compensation is then performed based on the MVs of the multiple sub-coding blocks.
Optionally, step S732 may be the same as step S631 in Fig. 10, and step S733 may be the same as step S632 in Fig. 10; for specific implementations, refer to the related descriptions, which are not repeated here.
As can be seen from Tables 1, 2 and 3 and the related descriptions above, in the bi-directional, dual-forward or dual-backward prediction mode — i.e., a dual motion vector prediction mode — if the restriction process on the coding block's motion vectors is not executed, the bandwidth in Affine mode may exceed that in non-Affine mode, creating considerable bandwidth pressure. Therefore, in the technical solution of this embodiment of the present application, the restriction process on the coding block's motion vectors is not executed for Affine mode under uni-directional prediction, and is executed only for Affine mode under dual motion vector prediction, taking into account both the bandwidth pressure and the coding complexity of the coding/decoding system and improving its performance.
The video coding/decoding method embodiments of the present application have been described in detail above with reference to Figs. 5 to 11; the video coding/decoding apparatus embodiments of the present application are described in detail below with reference to Fig. 12. It should be understood that the apparatus embodiments correspond to the method embodiments, and similar descriptions may refer to the method embodiments.
Fig. 12 is a schematic block diagram of a video coding/decoding apparatus 10 according to an embodiment of the present application. It should be understood that when the video coding/decoding apparatus is used for video encoding it may specifically be a video encoding apparatus, and when used for video decoding it may specifically be a video decoding apparatus.
As shown in Fig. 12, the video coding/decoding apparatus 10 includes: a processor 11 and a memory 12.
The memory 12 may be used to store a program, and the processor 11 may be used to execute the program stored in the memory to perform the following operations:
obtaining control point motion vectors of a coding block in the affine motion compensation prediction (Affine) mode, the control point motion vectors being used to calculate the motion vectors of multiple sub-coding blocks in the coding block;
when uni-directional prediction is performed on the coding block, performing motion compensation on a first sub-coding block among the multiple sub-coding blocks based on the motion vector of the first sub-coding block.
Optionally, the processor 11 is further configured to: when uni-directional prediction is performed on the coding block, not execute the restriction process on the motion vectors of the coding block.
Optionally, the processor 11 is further configured to: when dual motion vector prediction is performed on the coding block, execute the restriction process on the motion vectors of the coding block.
Optionally, the processor 11 is specifically configured to: based on the control point motion vectors, calculate the number of pixels needed by multiple restriction blocks in the coding block for sub-pixel interpolation;
if the pixel count of at least one of the multiple restriction blocks is greater than a preset threshold, set the motion vectors of the multiple sub-coding blocks in the coding block all to the same motion vector;
if the pixel counts of the multiple restriction blocks are all less than or equal to the preset threshold, not modify the motion vectors of the multiple sub-coding blocks in the coding block.
Optionally, when uni-directional prediction is performed on the coding block, the motion vectors of the multiple sub-coding blocks in the coding block are different.
Optionally, when uni-directional prediction is performed on the coding block, the processor 11 is specifically configured not to perform motion compensation on the coding block as a whole.
Optionally, the processor 11 is specifically configured to: when uni-directional prediction is performed on the coding block, based on the pixel precision of the motion vector of the first sub-coding block, perform sub-pixel interpolation on the reference block of the first sub-coding block with an interpolation filter to obtain the first sub-prediction block corresponding to the first sub-coding block.
Optionally, the pixel precision of the motion vector of the first sub-coding block is less than or equal to 1/16, and the number of taps of the interpolation filter is less than or equal to 6.
Optionally, when uni-directional prediction is performed on the coding block, the pixel bandwidth of the coding block's motion compensation in Affine mode is smaller than the pixel bandwidth of the coding block's motion compensation in non-Affine mode.
Optionally, the processor 11 is further configured to: calculate the motion vectors of the multiple sub-coding blocks in the coding block according to the control point motion vectors.
Optionally, the coding block includes a luma coding block and a chroma coding block, the luma coding block includes multiple sub-luma coding blocks, and the chroma coding block includes multiple sub-chroma coding blocks;
the processor 11 is specifically configured to: calculate the motion vectors of the multiple sub-luma coding blocks according to the control point motion vectors and the position coordinates of the multiple sub-luma coding blocks within the luma coding block;
calculate the mean of the motion vectors of 4 first sub-luma coding blocks to obtain the motion vector of a first sub-chroma coding block among the multiple sub-chroma coding blocks, the 4 first sub-luma coding blocks being the 4 sub-luma coding blocks, among the multiple sub-luma coding blocks, that correspond to the first sub-chroma coding block.
Optionally, the first sub-coding block includes the first sub-luma coding block and the first sub-chroma coding block, and the processor 11 is specifically configured to:
based on the pixel precision of the motion vector of the first sub-luma coding block, perform sub-pixel interpolation on the reference block of the first sub-luma coding block with a 6-tap interpolation filter to obtain the first sub-luma prediction block corresponding to the first sub-luma coding block; and/or,
based on the pixel precision of the motion vector of the first sub-chroma coding block, perform sub-pixel interpolation on the reference block of the first sub-chroma coding block with a 4-tap interpolation filter to obtain the first sub-chroma prediction block corresponding to the first sub-chroma coding block.
Optionally, the processor 11 is specifically configured to: during video encoding, construct a candidate list of control point motion vectors of the coding block;
calculate a rate-distortion cost for each of multiple candidate control point motion vectors in the candidate list, and set the candidate control point motion vector with the smallest rate-distortion cost as the control point motion vectors.
Optionally, the processor 11 is specifically configured to: during video decoding, obtain the bitstream of the coding block;
determine from the bitstream of the coding block that the coding mode of the coding block is Affine mode, and obtain the index information of the control point motion vectors;
construct a candidate list of control point motion vectors of the coding block;
obtain the control point motion vectors according to the candidate list and the index information of the control point motion vectors.
Optionally, the processor 11 is specifically configured to: during video decoding, determine from the bitstream of the coding block that the prediction mode of the coding block is uni-directional prediction;
when uni-directional prediction is performed on the first sub-coding block, perform motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block.
Optionally, the control point motion vectors are the motion vectors of three control points, or the motion vectors of two control points.
Optionally, the pixel precision of the control point motion vectors is 4, 2, 1, 1/2, 1/4, 1/8 or 1/16.
Optionally, the sub-coding blocks are 4×4 pixels.
Optionally, when the Affine mode is the Affine_AMVP mode, the coding block is greater than or equal to 16×16 pixels;
when the Affine mode is the Affine_merge mode, the coding block is greater than or equal to 8×8 pixels.
An embodiment of the present application further provides an electronic device, which may include the video coding/decoding apparatus of the various embodiments of the present application described above.
Optionally, the electronic device may include, but is not limited to, a mobile phone, an unmanned aerial vehicle, a camera, and the like.
An embodiment of the present application further provides a video coding/decoding apparatus, including a processor and a memory, the memory being used to store program instructions and the processor being used to call the program instructions to execute the video coding/decoding methods of the various embodiments of the present application described above.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a computer, the computer program causes the computer to perform the methods of the above method embodiments.
An embodiment of the present application further provides a computer program product containing instructions which, when executed by a computer, cause the computer to perform the methods of the above method embodiments.
In the above embodiments, implementation may be wholly or partly by software, hardware, firmware or any other combination thereof. When software is used, implementation may be wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., digital video disc (DVD)), or semiconductor media (e.g., solid state disk (SSD)), and the like.
A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is merely a logical functional division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
The above are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily think of variations or substitutions within the technical scope disclosed in the present application, which shall all be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (40)

  1. A video coding/decoding method, comprising:
    obtaining control point motion vectors of a coding block in an affine motion compensation prediction (Affine) mode, the control point motion vectors being used to calculate motion vectors of a plurality of sub-coding blocks in the coding block;
    when uni-directional prediction is performed on the coding block, performing motion compensation on a first sub-coding block of the plurality of sub-coding blocks based on a motion vector of the first sub-coding block.
  2. The method according to claim 1, further comprising:
    when uni-directional prediction is performed on the coding block, not executing a restriction process on the motion vectors of the coding block.
  3. The method according to claim 1 or 2, further comprising:
    when dual motion vector prediction is performed on the coding block, executing a restriction process on the motion vectors of the coding block.
  4. The method according to claim 2 or 3, wherein the restriction process on the motion vectors of the coding block comprises:
    based on the control point motion vectors, calculating the number of pixels needed by a plurality of restriction blocks in the coding block for sub-pixel interpolation;
    if the pixel count of at least one of the plurality of restriction blocks is greater than a preset threshold, setting the motion vectors of the plurality of sub-coding blocks in the coding block all to a same motion vector;
    if the pixel counts of the plurality of restriction blocks are all less than or equal to the preset threshold, not modifying the motion vectors of the plurality of sub-coding blocks in the coding block.
  5. The method according to any one of claims 1 to 4, wherein, when uni-directional prediction is performed on the coding block, the motion vectors of the plurality of sub-coding blocks in the coding block are different.
  6. The method according to any one of claims 1 to 5, wherein, when uni-directional prediction is performed on the coding block, motion compensation is not performed on the coding block as a whole.
  7. The method according to any one of claims 1 to 6, wherein the performing, when uni-directional prediction is performed on the coding block, motion compensation on the first sub-coding block of the plurality of sub-coding blocks based on the motion vector of the first sub-coding block comprises:
    when uni-directional prediction is performed on the coding block, based on the pixel precision of the motion vector of the first sub-coding block, performing sub-pixel interpolation on a reference block of the first sub-coding block with an interpolation filter to obtain a first sub-prediction block corresponding to the first sub-coding block.
  8. The method according to claim 7, wherein the pixel precision of the motion vector of the first sub-coding block is less than or equal to 1/16, and the number of taps of the interpolation filter is less than or equal to 6.
  9. The method according to any one of claims 1 to 8, wherein, when uni-directional prediction is performed on the coding block, the pixel bandwidth of motion compensation of the coding block in Affine mode is smaller than the pixel bandwidth of motion compensation of the coding block in non-Affine mode.
  10. The method according to any one of claims 1 to 9, wherein, before the performing motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block of the plurality of sub-coding blocks, the method further comprises:
    calculating the motion vectors of the plurality of sub-coding blocks in the coding block according to the control point motion vectors.
  11. The method according to claim 10, wherein the coding block comprises a luma coding block and a chroma coding block, the luma coding block comprises a plurality of sub-luma coding blocks, and the chroma coding block comprises a plurality of sub-chroma coding blocks;
    the calculating the motion vectors of the plurality of sub-coding blocks in the coding block according to the control point motion vectors comprises:
    calculating the motion vectors of the plurality of sub-luma coding blocks according to the control point motion vectors and position coordinates of the plurality of sub-luma coding blocks within the luma coding block;
    calculating the mean of the motion vectors of 4 first sub-luma coding blocks to obtain the motion vector of a first sub-chroma coding block of the plurality of sub-chroma coding blocks, the 4 first sub-luma coding blocks being the 4 sub-luma coding blocks, of the plurality of sub-luma coding blocks, that correspond to the first sub-chroma coding block.
  12. The method according to claim 11, wherein the first sub-coding block comprises the first sub-luma coding block and the first sub-chroma coding block, and the performing, when uni-directional prediction is performed on the coding block, motion compensation on the first sub-coding block of the plurality of sub-coding blocks based on the motion vector of the first sub-coding block comprises:
    based on the pixel precision of the motion vector of the first sub-luma coding block, performing sub-pixel interpolation on a reference block of the first sub-luma coding block with a 6-tap interpolation filter to obtain a first sub-luma prediction block corresponding to the first sub-luma coding block; and/or,
    based on the pixel precision of the motion vector of the first sub-chroma coding block, performing sub-pixel interpolation on a reference block of the first sub-chroma coding block with a 4-tap interpolation filter to obtain a first sub-chroma prediction block corresponding to the first sub-chroma coding block.
  13. The method according to any one of claims 1 to 12, wherein the obtaining control point motion vectors of the coding block in Affine mode comprises:
    during video encoding, constructing a candidate list of control point motion vectors of the coding block;
    calculating a rate-distortion cost for each of a plurality of candidate control point motion vectors in the candidate list, and setting the candidate control point motion vector with the smallest rate-distortion cost as the control point motion vectors.
  14. The method according to any one of claims 1 to 12, wherein the obtaining control point motion vectors of the coding block in Affine mode comprises:
    during video decoding, obtaining a bitstream of the coding block;
    determining, from the bitstream of the coding block, that the coding mode of the coding block is Affine mode, and obtaining index information of the control point motion vectors;
    constructing a candidate list of control point motion vectors of the coding block;
    obtaining the control point motion vectors according to the candidate list and the index information of the control point motion vectors.
  15. The method according to any one of claims 1 to 12 and 14, wherein the performing, when uni-directional prediction is performed on the coding block, motion compensation on the first sub-coding block of the plurality of sub-coding blocks based on the motion vector of the first sub-coding block comprises:
    during video decoding, determining from the bitstream of the coding block that the prediction mode of the coding block is uni-directional prediction;
    when uni-directional prediction is performed on the first sub-coding block, performing motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block.
  16. The method according to any one of claims 1 to 15, wherein the control point motion vectors are motion vectors of three control points, or motion vectors of two control points.
  17. The method according to any one of claims 1 to 16, wherein the pixel precision of the control point motion vectors is 4, 2, 1, 1/2, 1/4, 1/8 or 1/16.
  18. The method according to any one of claims 1 to 17, wherein the sub-coding blocks are 4×4 pixels.
  19. The method according to any one of claims 1 to 18, wherein, when the Affine mode is the Affine_AMVP mode, the coding block is greater than or equal to 16×16 pixels;
    when the Affine mode is the Affine_merge mode, the coding block is greater than or equal to 8×8 pixels.
  20. A video coding/decoding apparatus, comprising: a processor,
    the processor being configured to: obtain control point motion vectors of a coding block in an affine motion compensation prediction (Affine) mode, the control point motion vectors being used to calculate motion vectors of a plurality of sub-coding blocks in the coding block;
    when uni-directional prediction is performed on the coding block, perform motion compensation on a first sub-coding block of the plurality of sub-coding blocks based on a motion vector of the first sub-coding block.
  21. The apparatus according to claim 20, wherein the processor is further configured to:
    when uni-directional prediction is performed on the coding block, not execute a restriction process on the motion vectors of the coding block.
  22. The apparatus according to claim 20 or 21, wherein the processor is further configured to:
    when dual motion vector prediction is performed on the coding block, execute a restriction process on the motion vectors of the coding block.
  23. The apparatus according to claim 21 or 22, wherein the processor is configured to:
    based on the control point motion vectors, calculate the number of pixels needed by a plurality of restriction blocks in the coding block for sub-pixel interpolation;
    if the pixel count of at least one of the plurality of restriction blocks is greater than a preset threshold, set the motion vectors of the plurality of sub-coding blocks in the coding block all to a same motion vector;
    if the pixel counts of the plurality of restriction blocks are all less than or equal to the preset threshold, not modify the motion vectors of the plurality of sub-coding blocks in the coding block.
  24. The apparatus according to any one of claims 20 to 23, wherein, when uni-directional prediction is performed on the coding block, the motion vectors of the plurality of sub-coding blocks in the coding block are different.
  25. The apparatus according to any one of claims 20 to 24, wherein, when uni-directional prediction is performed on the coding block, the processor is configured not to perform motion compensation on the coding block as a whole.
  26. The apparatus according to any one of claims 20 to 25, wherein the processor is configured to:
    when uni-directional prediction is performed on the coding block, based on the pixel precision of the motion vector of the first sub-coding block, perform sub-pixel interpolation on a reference block of the first sub-coding block with an interpolation filter to obtain a first sub-prediction block corresponding to the first sub-coding block.
  27. The apparatus according to claim 26, wherein the pixel precision of the motion vector of the first sub-coding block is less than or equal to 1/16, and the number of taps of the interpolation filter is less than or equal to 6.
  28. The apparatus according to any one of claims 20 to 27, wherein, when uni-directional prediction is performed on the coding block, the pixel bandwidth of motion compensation of the coding block in Affine mode is smaller than the pixel bandwidth of motion compensation of the coding block in non-Affine mode.
  29. The apparatus according to any one of claims 20 to 28, wherein the processor is further configured to: calculate the motion vectors of the plurality of sub-coding blocks in the coding block according to the control point motion vectors.
  30. The apparatus according to claim 29, wherein the coding block comprises a luma coding block and a chroma coding block, the luma coding block comprises a plurality of sub-luma coding blocks, and the chroma coding block comprises a plurality of sub-chroma coding blocks;
    the processor is configured to: calculate the motion vectors of the plurality of sub-luma coding blocks according to the control point motion vectors and position coordinates of the plurality of sub-luma coding blocks within the luma coding block;
    calculate the mean of the motion vectors of 4 first sub-luma coding blocks to obtain the motion vector of a first sub-chroma coding block of the plurality of sub-chroma coding blocks, the 4 first sub-luma coding blocks being the 4 sub-luma coding blocks, of the plurality of sub-luma coding blocks, that correspond to the first sub-chroma coding block.
  31. The apparatus according to claim 30, wherein the first sub-coding block comprises the first sub-luma coding block and the first sub-chroma coding block, and the processor is configured to:
    based on the pixel precision of the motion vector of the first sub-luma coding block, perform sub-pixel interpolation on a reference block of the first sub-luma coding block with a 6-tap interpolation filter to obtain a first sub-luma prediction block corresponding to the first sub-luma coding block; and/or,
    based on the pixel precision of the motion vector of the first sub-chroma coding block, perform sub-pixel interpolation on a reference block of the first sub-chroma coding block with a 4-tap interpolation filter to obtain a first sub-chroma prediction block corresponding to the first sub-chroma coding block.
  32. The apparatus according to any one of claims 20 to 31, wherein the processor is configured to: during video encoding, construct a candidate list of control point motion vectors of the coding block;
    calculate a rate-distortion cost for each of a plurality of candidate control point motion vectors in the candidate list, and set the candidate control point motion vector with the smallest rate-distortion cost as the control point motion vectors.
  33. The apparatus according to any one of claims 20 to 31, wherein the processor is configured to: during video decoding, obtain a bitstream of the coding block;
    determine, from the bitstream of the coding block, that the coding mode of the coding block is Affine mode, and obtain index information of the control point motion vectors;
    construct a candidate list of control point motion vectors of the coding block;
    obtain the control point motion vectors according to the candidate list and the index information of the control point motion vectors.
  34. The apparatus according to any one of claims 20 to 31 and 33, wherein the processor is configured to:
    during video decoding, determine from the bitstream of the coding block that the prediction mode of the coding block is uni-directional prediction;
    when uni-directional prediction is performed on the first sub-coding block, perform motion compensation on the first sub-coding block based on the motion vector of the first sub-coding block.
  35. The apparatus according to any one of claims 20 to 34, wherein the control point motion vectors are motion vectors of three control points, or motion vectors of two control points.
  36. The apparatus according to any one of claims 20 to 35, wherein the pixel precision of the control point motion vectors is 4, 2, 1, 1/2, 1/4, 1/8 or 1/16.
  37. The apparatus according to any one of claims 20 to 36, wherein the sub-coding blocks are 4×4 pixels.
  38. The apparatus according to any one of claims 20 to 37, wherein, when the Affine mode is the Affine_AMVP mode, the coding block is greater than or equal to 16×16 pixels;
    when the Affine mode is the Affine_merge mode, the coding block is greater than or equal to 8×8 pixels.
  39. A computer-readable storage medium for storing program instructions which, when run by a computer, cause the computer to perform the method according to any one of claims 1 to 19.
  40. An electronic device, comprising:
    the video coding/decoding apparatus according to any one of claims 20 to 38.