WO2021056212A1 - Video encoding and decoding method and apparatus - Google Patents

Video encoding and decoding method and apparatus

Info

Publication number
WO2021056212A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
image block
interpolation filter
interpolation
image
Prior art date
Application number
PCT/CN2019/107598
Other languages
English (en)
Chinese (zh)
Inventor
郑萧桢
孟学苇
马思伟
王苫社
Original Assignee
深圳市大疆创新科技有限公司
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 (SZ DJI Technology Co., Ltd.) and 北京大学 (Peking University)
Priority to PCT/CN2019/107598 (WO2021056212A1)
Priority to CN201980033882.2A (CN112154666A)
Publication of WO2021056212A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117: Filters, e.g. for pre-processing or post-processing
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/513: Processing of motion vectors

Definitions

  • This application relates to the field of image processing, and more specifically, to a video coding and decoding method and device.
  • Prediction is an important module of the mainstream video coding framework.
  • Prediction can include intra-frame prediction and inter-frame prediction.
  • the inter prediction modes may include Advanced Motion Vector Prediction (AMVP) mode, Merge mode, and Skip mode.
  • For the Merge mode, the MVP can be determined from the motion vector predictor (MVP) candidate list and directly used as the MV; the MVP index and the reference frame index are transmitted to the decoder in the code stream for decoding on the decoder side.
  • For the Skip mode, only the index of the MVP needs to be passed; neither the MVD nor the residual needs to be transmitted.
  • The motion information of an encoded or decoded block is used to update the motion vector predictor candidate list of the next block to be encoded or decoded.
  • When pixel interpolation is performed on the reference block of the current image block or coding block, it often depends on reading an MVP from the motion vector predictor candidate list together with its corresponding interpolation filter, which makes the dependence on the motion vector and interpolation filter used by the adjacent image block strong. Therefore, the encoding and decoding methods in the prior art result in decreased coding efficiency and performance loss of the codec device.
  • The embodiments of the present application provide a video encoding and decoding method and device, which can prevent the current image block from relying too heavily on the interpolation filters used by neighboring blocks and improve coding and decoding efficiency.
  • In a first aspect, a video encoding and decoding method is provided, which includes: when pixel interpolation is performed on an image block of a current image, one of at least two interpolation filters can be used for pixel interpolation, the current image including a first image block and a second image block; using a first interpolation filter to perform pixel interpolation on the reference block of the first image block; and using a default interpolation filter to perform pixel interpolation on the reference block of the second image block, the first image block being an adjacent block of the second image block.
  • In a second aspect, a video encoding and decoding device is provided, which includes: a memory configured to store executable instructions; and a processor configured to execute the instructions stored in the memory so as to perform the operations of the method of the first aspect.
  • In a third aspect, a video codec is provided, which includes the video encoding and decoding device of the second aspect and a body, the device being installed on the body.
  • In a fourth aspect, a computer-readable storage medium is provided, which stores program instructions that can be used to instruct execution of the method of the first aspect.
  • In the embodiments of the present application, the first interpolation filter is used to perform pixel interpolation on the reference block of the first image block, and the default interpolation filter is used to perform pixel interpolation on the reference block of the second image block, where the first image block is an adjacent block of the second image block. Because the reference block of the second image block uses the default interpolation filter during interpolation, it neither inherits nor reads the interpolation filter used by the first image block, reducing its dependence on the encoding and decoding of the first image block. Therefore, coding and decoding efficiency is improved, and the performance of the codec device is improved, as the sketch below illustrates.
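  • As a non-normative illustration of this core rule, the following sketch (in Python, with all names invented for illustration; the concrete default filter is an assumption, since the embodiments leave the choice open) shows the second image block falling back to a default filter instead of inheriting its neighbor's choice:

      # Minimal sketch, assuming "6-tap" is the default filter and "8-tap" is a
      # non-default alternative; pick_filter() is an illustrative name.
      DEFAULT_FILTER = "6-tap"

      def pick_filter(is_second_block, neighbor_filter, own_filter):
          if is_second_block:
              return DEFAULT_FILTER               # never inherit from the neighbor
          return own_filter or neighbor_filter    # first block: filter it actually selected

      # The first image block may use a non-default filter...
      assert pick_filter(False, neighbor_filter=None, own_filter="8-tap") == "8-tap"
      # ...while the second image block still uses the default, breaking the dependency.
      assert pick_filter(True, neighbor_filter="8-tap", own_filter=None) == "6-tap"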
  • Fig. 1 is a structural diagram of a technical solution according to an embodiment of the present application.
  • Fig. 2 is a schematic diagram of a video coding framework 2 according to an embodiment of the present application.
  • Fig. 3 is a schematic diagram of adjacent blocks of an image block according to an embodiment of the present application.
  • Fig. 4 is a schematic flowchart of a video encoding and decoding method according to an embodiment of the present application.
  • Fig. 5 is a schematic block diagram of a video encoding and decoding device according to an embodiment of the present application.
  • Fig. 1 is a structural diagram of a technical solution applying an embodiment of the present application.
  • the system 100 can receive the data 102 to be processed, process the data 102 to be processed, and generate processed data 108.
  • the system 100 may receive the data to be encoded and encode the data to be encoded to generate encoded data, or the system 100 may receive the data to be decoded and decode the data to be decoded to generate decoded data.
  • the components in the system 100 may be implemented by one or more processors.
  • the processor may be a processor in a computing device or a processor in a mobile device (such as a drone).
  • the processor may be any type of processor, which is not limited in the embodiment of the present application.
  • the processor may include an encoder, a decoder, or a codec, etc.
  • the system 100 may also include one or more memories.
  • the memory can be used to store instructions and data, for example, computer-executable instructions that implement the technical solutions of the embodiments of the present application, to-be-processed data 102, processed data 108, and so on.
  • the memory can be any type of memory, which is not limited in the embodiment of the present application.
  • the data to be encoded may include text, images, graphic objects, animation sequences, audio, video, or any other data that needs to be encoded.
  • the data to be encoded may include sensor data from sensors, such as vision sensors (for example, cameras, infrared sensors), microphones, near-field sensors (for example, ultrasonic sensors, radars), position sensors, temperature sensors, touch sensors, etc.
  • the data to be encoded may include information from the user, for example, biological information, which may include facial features, fingerprint scans, retinal scans, voice recordings, DNA sampling, and the like.
  • Fig. 2 is a schematic diagram of a video coding framework 2 according to an embodiment of the present application.
  • As shown in FIG. 2, after receiving the video to be encoded, each frame of the video to be encoded is encoded in turn, starting from the first frame.
  • the current coded frame mainly undergoes processing such as prediction (Prediction), transformation (Transform), quantization (Quantization), and entropy coding (Entropy Coding), and finally the bit stream of the current coded frame is output.
  • the decoding process usually decodes the received bitstream according to the inverse of the above process to recover the video frame information.
  • The video encoding framework 2 includes an encoding control module 201, which is used to perform decision-making control actions and parameter selection in the encoding process.
  • The encoding control module 201 controls the parameters used in transformation, quantization, inverse quantization, and inverse transformation; controls the selection of intra or inter mode; and controls the parameters of motion estimation and filtering.
  • The control parameters of the encoding control module 201 are also input to the entropy encoding module and encoded to form part of the encoded bitstream.
  • The frame to be encoded is partitioned 202: it is first divided into slices, which are then divided into blocks.
  • Specifically, the coded frame is divided into a plurality of non-overlapping coding tree units (CTUs), and each CTU can be iteratively divided, in a quadtree, binary tree, or ternary tree manner, into a series of smaller coding units (CUs).
  • the CU may also include a prediction unit (Prediction Unit, PU) and a transformation unit (Transform Unit, TU) associated with it.
  • The PU is the basic unit of prediction, and the TU is the basic unit of transformation and quantization.
  • The PU and TU are each obtained by dividing the CU into one or more blocks, where one PU includes multiple prediction blocks (PBs) and the related syntax elements.
  • the PU and TU may be the same, or they may be obtained by the CU through different division methods.
  • at least two of the CU, PU, and TU are the same.
  • CU, PU, and TU are not distinguished, and prediction, quantization, and transformation are all performed in units of CU.
  • the CTU, CU, or other formed data units are all referred to as coding blocks in the following.
  • the data unit for video encoding may be a frame, a slice, a coding tree unit, a coding unit, a coding block, or any group of the above.
  • the size of the data unit can vary.
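  • To make the partitioning concrete, the following sketch (illustrative, not from the patent) splits a CTU into CUs with a quadtree; binary and ternary splits work analogously, and should_split() is a stand-in for the encoder's rate-distortion decision:

      def split_ctu(x, y, size, min_cu=8):
          """Yield the (x, y, size) leaf coding units of a CTU rooted at (x, y)."""
          if size <= min_cu or not should_split(size):
              yield (x, y, size)
              return
          half = size // 2
          for dy in (0, half):
              for dx in (0, half):
                  yield from split_ctu(x + dx, y + dy, half, min_cu)

      def should_split(size):
          # Placeholder decision; a real encoder decides by rate-distortion cost.
          return size > 32

      cus = list(split_ctu(0, 0, 128))   # one 128x128 CTU -> sixteen 32x32 CUs here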
  • a prediction process is performed to remove the spatial and temporal redundant information of the current coded frame.
  • predictive coding methods include intra-frame prediction and inter-frame prediction.
  • Intra-frame prediction uses only the reconstructed information in the current frame to predict the current coding block
  • inter-frame prediction uses the information in other previously reconstructed frames (also called reference frames) to predict the current coding block.
  • Specifically, in this embodiment of the present application, the encoding control module 201 decides whether to select intra-frame prediction or inter-frame prediction.
  • The process of intra-frame prediction 203 includes: obtaining the reconstructed block of a coded neighboring block around the current coding block as a reference block; using a prediction mode method to calculate a predicted value from the pixel values of the reference block and generate the prediction block; and subtracting the corresponding pixel values of the prediction block from those of the current coding block to obtain the residual of the current coding block. The residual of the current coding block is then transformed 204, quantized 205, and entropy coded 210 to form the code stream of the current coding block. After all coding blocks of the current frame undergo the above coding process, they form part of the coded stream of the frame. In addition, the control and reference data generated in intra-frame prediction 203 are also encoded by entropy coding 210 to form part of the encoded bitstream.
  • the transform 204 is used to remove the correlation of the residual of the image block, so as to improve the coding efficiency.
  • The transformation of the residual data of the current coding block usually adopts a two-dimensional discrete cosine transform (DCT) or a two-dimensional discrete sine transform (DST): the residual information of the coding block is multiplied by an N×M transformation matrix and its transposed matrix, and the transformation coefficients of the current coding block are obtained after the multiplication, as sketched below.
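  • The transform step can be written as Y = A X A^T, where X is the residual block and A is the transformation matrix. A small sketch (using an orthonormal DCT-II basis for illustration; video coding standards define their own integer-valued matrices):

      import numpy as np

      def dct_matrix(n):
          # Orthonormal DCT-II basis: A[j, k] = c_j * cos(pi * (2k + 1) * j / (2n))
          j = np.arange(n)[:, None]
          k = np.arange(n)[None, :]
          a = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k + 1) * j / (2 * n))
          a[0, :] = np.sqrt(1.0 / n)
          return a

      residual = np.random.randint(-64, 64, size=(4, 4)).astype(float)
      A = dct_matrix(4)
      coeffs = A @ residual @ A.T                      # transform coefficients
      assert np.allclose(A.T @ coeffs @ A, residual)   # inverse transform restores X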
  • quantization 205 is used to further improve the compression efficiency.
  • The transform coefficients can be quantized to obtain quantized coefficients, and the quantized coefficients are then entropy coded 210 to obtain the residual code stream of the current coding block; the entropy coding method includes, but is not limited to, context adaptive binary arithmetic coding (CABAC).
  • The coded neighboring block in the intra prediction 203 process is a neighboring block that was coded before the current coding block. The residual generated when coding that neighboring block is transformed 204, quantized 205, inverse quantized 206, and inverse transformed 207, and then added to the prediction block of the neighboring block to obtain the reconstructed block.
  • Inverse quantization 206 and inverse transformation 207 are the inverse processes of quantization 205 and transformation 204, and are used to restore the residual data before quantization and transformation.
  • the inter-frame prediction process includes motion estimation (ME) 208 and motion compensation (MC) 209.
  • Motion estimation 208 is performed according to the reference frame images among the reconstructed video frames: the image block most similar to the current coding block is searched for in one or more reference frame images according to a certain matching criterion and taken as the matching block.
  • The relative displacement between the matching block and the current coding block is the motion vector (MV) of the current coding block.
  • The pixel values of the corresponding prediction block are subtracted from the original pixel values of the coding block to obtain the residual of the coding block.
  • the residual of the current coding block is transformed 204, quantized 205, and entropy coding 210 to form a part of the code stream of the coded frame.
  • the control and reference data generated in the motion compensation 209 are also encoded by the entropy encoding 210 to form a part of the encoded bitstream.
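  • A minimal sketch of motion estimation under a SAD matching criterion (full search shown for clarity; real encoders use faster search patterns plus fractional-pel refinement):

      import numpy as np

      def motion_search(cur, ref, bx, by, bs=16, search_range=8):
          """Return the integer MV (dx, dy) minimizing SAD for the block at (bx, by)."""
          block = cur[by:by + bs, bx:bx + bs].astype(int)
          best_mv, best_sad = (0, 0), None
          for dy in range(-search_range, search_range + 1):
              for dx in range(-search_range, search_range + 1):
                  x, y = bx + dx, by + dy
                  if 0 <= x and 0 <= y and x + bs <= ref.shape[1] and y + bs <= ref.shape[0]:
                      sad = np.abs(block - ref[y:y + bs, x:x + bs].astype(int)).sum()
                      if best_sad is None or sad < best_sad:
                          best_sad, best_mv = sad, (dx, dy)
          return best_mv   # the residual is the block minus the matched reference block

      ref = np.random.randint(0, 255, (64, 64))
      cur = np.roll(ref, shift=(2, -3), axis=(0, 1))   # current frame = shifted reference
      print(motion_search(cur, ref, bx=16, by=16))     # -> (3, -2)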
  • the reconstructed video frame is a video frame obtained after filtering 211.
  • Filtering 211 is used to reduce compression distortions such as blocking effects and ringing effects generated in the encoding process.
  • the reconstructed video frame is used to provide reference frames for inter-frame prediction during the encoding process.
  • On the decoding side, the reconstructed video frame is output after post-processing as the final decoded video.
  • the inter prediction modes in the video coding standard may include AMVP mode, Merge mode and Skip mode.
  • For the AMVP mode, the MVP can be determined first. After the MVP is obtained, the starting point of motion estimation is determined according to the MVP, and a motion search is performed near the starting point; after the search is completed, the optimal MV is obtained.
  • The MV determines the position of the reference block in the reference image; the reference block is subtracted from the current block to obtain the residual block, and the MVP is subtracted from the MV to obtain the motion vector difference (MVD), which is transmitted to the decoder through the code stream.
  • For the Merge mode, the MVP can be determined first and directly used as the MV. To obtain the MVP, an MVP candidate list (merge candidate list) is first constructed, which includes at least one candidate MVP, each corresponding to an index. After selecting the MVP from the candidate list, the encoder writes the MVP index into the code stream, and the decoder finds the MVP corresponding to that index in the MVP candidate list according to the index, thereby decoding the image block.
  • the MVP candidate list may include temporal candidate motion vectors, spatial candidate motion vectors, pairwise motion vectors, or zero motion vectors.
  • the pairwise motion vector can be obtained by averaging or weighted averaging based on the existing motion vectors in the candidate list.
  • the spatial candidate motion vector is obtained from the position of the gray box from 1 to 5 in Figure 3, and the temporal candidate motion vector is obtained from the co-located CU in the coded image adjacent to the current CU.
  • The temporal candidate motion vector cannot be used directly as a candidate: the motion information of the co-located block needs to be scaled according to the positional relationship of the reference images. The specific scaling method is not repeated here.
  • For the Skip mode, it is a special Merge mode in which only the index of the MVP needs to be passed: in addition to not needing to transmit MVD information, there is also no need to transmit the residual.
  • That is, Skip mode is a special case of Merge mode: after obtaining the MV according to the Merge mode, if the encoder determines that the current block is essentially identical to the reference block, there is no need to transmit residual data; only the index of the MV needs to be transmitted, together with a flag indicating that the current block can be obtained directly from the reference block. A sketch of this index-only signaling follows.
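  • The index-only signaling of Merge/Skip can be sketched as follows (candidate construction order and list length are illustrative, not normative):

      def build_merge_list(spatial_mvs, temporal_mvs, max_len=6):
          candidates = list(spatial_mvs) + list(temporal_mvs)
          while len(candidates) < max_len:
              candidates.append((0, 0))        # pad with zero motion vectors
          return candidates[:max_len]

      merge_list = build_merge_list([(4, -2), (3, 0)], [(5, -1)])
      index = 2                 # chosen by the encoder and written to the bitstream
      mv = merge_list[index]    # decoder rebuilds the same list and reads the index
      # Merge mode: transmit index + residual.  Skip mode: transmit the index only.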
  • the motion vector of the object between two adjacent frames may not be exactly an integer number of pixel units.
  • a motion vector with 1/4 pixel accuracy is used for the motion estimation of the luminance component in HEVC.
  • The values of these fractional pixels must be obtained approximately by interpolation; that is, K-fold interpolation is performed on the reference frame in both the row and column directions, and the search is then performed in the interpolated image.
  • The common AMVP mode sets four adaptive motion vector resolution (AMVR) precisions: integer-pel, 4-pel, 1/4-pel, and 1/2-pel. It should be noted that these AMVR precisions are only examples; the embodiments of this application do not specifically limit the value of the pixel precision, and 1/8-pel, 1/16-pel, etc. are also possible. Integer-pel, 4-pel, 1/4-pel, and 1/2-pel are taken as examples here.
  • The corresponding MV precision (integer-pel, 4-pel, 1/4-pel, or 1/2-pel) is decided adaptively at the encoding end, and the result of the decision is written into the code stream and passed to the decoding end.
  • The number of taps of the interpolation filter may differ between AMVR precisions. For example, an eight-tap interpolation filter is used for 1/4-pel precision, while a Gaussian interpolation filter (a six-tap interpolation filter) is used for 1/2-pel. Because different interpolation filters are used, the interpolation filter currently used by the CU needs to be stored when the motion vector is stored. As an example, the interpolation filter can be represented by 1 bit: when a Gaussian filter (six-tap interpolation filter) is used it is stored as 1, and when a Gaussian filter is not used it is stored as 0. A sketch of this mapping and flag follows.
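  • A sketch of this precision-to-filter mapping and the 1-bit flag (the filters assigned to precisions other than 1/4-pel and 1/2-pel are assumptions for illustration):

      FILTER_FOR_PRECISION = {
          "integer": "8-tap",
          "4-pel":   "8-tap",
          "1/4-pel": "8-tap",
          "1/2-pel": "6-tap-gaussian",   # Gaussian (six-tap) filter for 1/2-pel
      }

      def filter_flag(precision):
          # 1 when the Gaussian (six-tap) filter is used, 0 otherwise.
          return 1 if FILTER_FOR_PRECISION[precision] == "6-tap-gaussian" else 0

      assert filter_flag("1/2-pel") == 1   # stored alongside the CU's motion vector
      assert filter_flag("1/4-pel") == 0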
  • the interpolation filter used by the current CU needs to be determined according to the identification bit of the interpolation filter.
  • the interpolation filter used by the current CU can be understood as an interpolation filter used when performing pixel interpolation on the reference block of the current CU.
  • the interpolation filter used by the image block mentioned below can be understood as the interpolation filter used when performing pixel interpolation on the reference block of the image block.
  • Because the MV information of spatial and temporal neighboring blocks needs to be used when constructing the MVP candidate list, the MV finally used must be stored after each CU is encoded, for reference by subsequent MVs. The stored MV information includes the value of the MV, the index of the reference frame, the prediction mode of the current CU, and so on.
  • In the prior art, when the current block performs motion compensation, it completely inherits the MV of the neighboring block and its corresponding interpolation filter. Therefore, after each CU completes encoding, the interpolation filter used by that CU also needs to be stored.
  • The specific storage method may likewise be a 1-bit identification bit; the storage method can refer to the foregoing description and is not repeated here.
  • For example, suppose the optimal MV selected by the current block from the motion vector candidate list is a temporal MV, say MV0, and the identification bit of the interpolation filter corresponding to MV0 is 1. This means that the temporal neighboring block of the current block used a Gaussian filter to perform pixel interpolation on its reference block.
  • Then, after the current block uses MV0 to find its reference block in the reference frame, it also uses a Gaussian filter, that is, a 6-tap interpolation filter, to perform pixel interpolation on that reference block; after pixel interpolation, the prediction block of the current block is obtained.
  • The residual block can then be obtained by subtracting the interpolated reference block from the current block.
  • the following solutions provided by the embodiments of the present application can reduce its dependence on the encoding and decoding process of adjacent blocks, thereby improving the encoding and decoding efficiency during the encoding and decoding process, and improving the performance of the encoding and decoding device.
  • storage costs can be saved.
  • The pixel interpolation of an image block referred to in this application means pixel interpolation of the reference block of that image block, and the interpolation filter used in the process is the interpolation filter used to perform pixel interpolation on the reference block of the image block.
  • FIG. 4 is a schematic flowchart of a video encoding and decoding method 400 according to an embodiment of the present application.
  • The method 400 includes at least part of the following content, and can be used on the encoding side as well as the decoding side.
  • one of at least two interpolation filters may be used for pixel interpolation, and the current image includes a first image block and a second image block;
  • The first interpolation filter is used to perform pixel interpolation on the reference block of the first image block; the default interpolation filter is used to perform pixel interpolation on the reference block of the second image block, and the first image block is an adjacent block of the second image block.
  • interpolation filter used by the image block mentioned in this application can be understood as the interpolation filter adopted/used when performing pixel interpolation on the reference block of the image block.
  • the interpolation filter of the neighboring block is not inherited, but the default interpolation filter is used to perform pixel interpolation on the reference block of the second image block .
  • the image block in the current image has at least two kinds of interpolation filters to choose from when performing pixel interpolation, for example, including at least a 6-tap interpolation filter and an 8-tap interpolation filter.
  • different pixel precisions may also correspond to interpolation filters with different taps.
  • Taking the AMVR-precision interpolation filters as an example, an eight-tap interpolation filter is used for 1/4-pel precision.
  • For 1/2-pel precision, a Gaussian interpolation filter (six-tap interpolation filter) is used.
  • the foregoing corresponding relationship is only an exemplary representation, and does not constitute a limitation to the present application.
  • the interpolation filter used by the current CU is not stored, and the identification bit is not set.
  • When a subsequent CU refers to the previously coded CU, it is not necessary to determine the interpolation filter used by that CU according to the identification bit of the interpolation filter; the default interpolation filter is used directly.
  • the pixel accuracy used for pixel interpolation of the reference block of the second image block is at the sub-pixel level.
  • the pixel precision used for pixel interpolation on the reference block of the second image block is 1/2 pixel precision.
  • The 1/2-pel precision used for pixel interpolation on the reference block of the second image block is for illustrative purposes only; other pixel precisions, such as 1/4-pel precision, 1/8-pel precision, and so on, may be used in other embodiments of this application.
  • the embodiments shown in this application can also be applied to pixel interpolation with integer pixel accuracy.
  • the first image block is a temporal neighboring block of the second image block.
  • the first image block is located on the reference frame
  • the second image block is located on the current frame
  • the correlation prediction between the first image block and the second image block is inter prediction
  • the direction of inter prediction can be forward prediction, backward prediction, bidirectional prediction, etc.
  • Forward prediction uses the previous reconstructed frame ("historical frame") to predict the current frame.
  • Backward prediction is to use frames after the current frame (“future frame”) to predict the current frame.
  • Bidirectional prediction uses not only "historical frames” but also "future frames” to predict the current frame. This application is not limited to any one of the above three prediction methods.
  • The temporal candidate is obtained from the co-located CU in a coded image adjacent to the current CU.
  • The motion information of the candidate block cannot be used directly in the temporal candidate list; it needs to be scaled according to the positional relationship of the reference images. The specific scaling method is not repeated here.
  • the bidirectional prediction mode is one of the dual motion vector modes.
  • the dual motion vector mode includes dual forward prediction mode, dual backward prediction mode and bidirectional prediction mode.
  • the dual forward prediction mode includes two forward motion vectors
  • the dual backward prediction mode includes two backward motion vectors.
  • the bidirectional prediction mode includes a forward prediction mode and a backward prediction mode.
  • the first image block is a spatial neighboring block of the second image block.
  • the first image block and the second image block are both located on the current frame.
  • The spatial candidate list in the merge mode is obtained from the positions of boxes 1 to 5 in Figure 3.
  • the first interpolation filter used by the first image block is a default interpolation filter or a non-default interpolation filter.
  • the first image block may be merge mode, AMVP mode or Skip mode.
  • the first interpolation filter is the default interpolation filter.
  • the first interpolation filter is the actually selected and determined interpolation filter. Exemplarily, if the pixel accuracy actually selected in AMVR is 1/2 pixel accuracy, a 6-tap interpolation filter is used, and if the actual selected pixel accuracy is 1/4 pixel accuracy, an 8-tap interpolation filter is used. When other pixel accuracy is determined, other interpolation filters can be selected accordingly.
  • The default interpolation filter is an interpolation filter with a default number of taps.
  • the interpolation filter with a default number of taps includes a 6-tap interpolation filter or an 8-tap interpolation filter.
  • the 6-tap and 8-tap in the embodiment of the present application are only used as an example, and do not constitute a limitation on the default interpolation filter.
  • the default interpolation filter is an interpolation filter with a default weight value.
  • For the understanding of the weight value, an explanation is given below. Taking a 6-tap interpolation filter with 1/2-pel precision as an example, one sub-pixel needs to be interpolated between every two whole pixels of the reference block, and that sub-pixel is the 1/2-pel position. Since there is no pixel value at the 1/2-pel position, the pixel values of the 3 integer pixels on its left and the 3 integer pixels on its right (six taps in total) are used to calculate the pixel value at the 1/2-pel position.
  • The weight value refers to the coefficient applied to each of these taps (A0 to A5), representing the weight that each pixel contributes to the final calculation result. For each pixel position, the weight has been determined as a specific value, which can be determined by setting or by a default value. The sketch below makes this concrete.
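  • A sketch of such a 1/2-pel interpolation (the tap weights are illustrative Gaussian-like values normalized to 64, not coefficients taken from the patent):

      COEFFS = [3, 9, 20, 20, 9, 3]       # weights for the six taps A0..A5

      def half_pel(pixels, i):
          """Interpolate the 1/2-pel sample between pixels[i] and pixels[i + 1]."""
          taps = pixels[i - 2:i + 4]      # A0..A5: 3 integer pixels on each side
          acc = sum(w * p for w, p in zip(COEFFS, taps))
          return (acc + 32) >> 6          # round and normalize by 64

      row = [10, 12, 14, 20, 30, 40, 44, 46]
      print(half_pel(row, 3))             # 1/2-pel sample between positions 3 and 4 -> 26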
  • the first image block may adopt Merge prediction mode or Skip mode.
  • Optionally, before using the default interpolation filter to perform pixel interpolation on the reference block of the second image block, the method further includes: obtaining a motion vector candidate list; selecting a motion vector from the motion vector candidate list; and determining the reference block of the second image block from the reference frame according to the motion vector. After using the default interpolation filter to perform pixel interpolation on the reference block of the second image block, the method further includes: determining a residual according to the interpolated reference block and the second image block.
  • the identifier of the first interpolation filter used by the reference block of the first image block is not stored.
  • the identifier of the default interpolation filter used by the reference block of the second image block is not stored.
  • the foregoing encoding and decoding method can reduce its dependence on the encoding and decoding process of adjacent blocks, thereby improving the encoding and decoding efficiency during the encoding and decoding process, and improving the performance of the encoding and decoding device.
  • Because the type of the interpolation filter is not stored, hardware storage resources can be saved.
  • The above embodiment is described for the case in which no block stores the interpolation filter type.
  • Alternatively, the corresponding interpolation filter may be stored only for the spatial MV, and not for the temporal MV.
  • the identifier of the first interpolation filter used by the reference block of the first image block is stored in the space domain.
  • the spatial storage shown in this application refers to storing the identification bit of the interpolation filter used by the reference block of the current block in the buffer of the spatial MV information.
  • the type of interpolation filter corresponding to the spatial MV can be directly read from the buffer.
  • the identification of the first interpolation filter used by the reference block of the first image block is not stored in the time domain.
  • the time-domain storage shown in this application means that the identification bit of the interpolation filter used by the reference block of the current block is not stored in the buffer of the time-domain MV information.
  • the corresponding interpolation filter can be stored only for the spatial MV, while the corresponding interpolation filter is not stored for the time domain MV, which can also relieve part of the storage pressure.
  • In that case, when a subsequent CU performs a motion search, if the optimal MV selected from the motion vector candidate list is a temporal MV, the default interpolation filter is used directly to perform pixel interpolation on its reference block; if the optimal MV selected from the motion vector candidate list is a spatial MV, the interpolation filter corresponding to that spatial MV is still used to perform pixel interpolation on the reference block.
  • Alternatively, no corresponding interpolation filter is stored at all. Then, when a subsequent CU performs a motion search, it directly uses the default interpolation filter to perform pixel interpolation on its reference block. This reduces the storage overhead of the hardware, saves the reading process, improves coding and decoding efficiency, and improves the performance of the codec device.
  • Conversely, the corresponding interpolation filter may be stored only for the temporal MV, and not for the spatial MV. Since storing the filter for spatial MVs also occupies storage space and increases storage pressure, storing it only for temporal MVs likewise relieves part of the storage pressure.
  • In that case, when a subsequent CU performs a motion search, if the optimal MV selected from the motion vector candidate list is a spatial MV, the default interpolation filter is used directly; if the optimal MV selected is a temporal MV, the interpolation filter corresponding to that temporal MV is still used to perform pixel interpolation on the reference block. These policies can be summarized in the sketch below.
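  • The storage policies above can be summarized in a single lookup (structures and names are illustrative):

      DEFAULT_FILTER = "6-tap"

      def filter_for_candidate(candidate, store_spatial=True, store_temporal=False):
          """Return the filter to use for the selected MV under a given storage policy."""
          stored = (candidate["domain"] == "spatial" and store_spatial) or \
                   (candidate["domain"] == "temporal" and store_temporal)
          return candidate["filter"] if stored else DEFAULT_FILTER

      # Policy of this example: filter flags are kept for spatial MVs only.
      assert filter_for_candidate({"domain": "spatial", "filter": "8-tap"}) == "8-tap"
      assert filter_for_candidate({"domain": "temporal", "filter": "8-tap"}) == "6-tap"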
  • the motion vector candidate list does not include the identifier of the first interpolation filter used by the reference block of the first image block.
  • the default interpolation filter is directly used, which can save storage space and ensure coding performance.
  • Optionally, the motion vector candidate list includes one or more of spatial candidate motion vectors, temporal candidate motion vectors, candidate motion vectors based on historical information, and pairwise candidate motion vectors, where a pairwise candidate motion vector is determined from one or more of the spatial candidate motion vectors, the temporal candidate motion vectors, or the candidate motion vectors based on historical information.
  • the paired candidate motion vector is determined based on the mean/weighted mean of the spatial candidate motion vector and/or the temporal candidate motion vector.
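  • For example, a pairwise candidate can be computed as the component-wise (optionally weighted) mean of two existing candidates:

      def pairwise(mv_a, mv_b, w_a=0.5, w_b=0.5):
          return (round(w_a * mv_a[0] + w_b * mv_b[0]),
                  round(w_a * mv_a[1] + w_b * mv_b[1]))

      assert pairwise((4, -2), (6, 0)) == (5, -1)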
  • Optionally, when pixel interpolation is performed on the luma component, the default interpolation filter is used to perform pixel interpolation on the reference block of the second image block; that is, the embodiments shown in this application can work in luma mode.
  • Likewise, when pixel interpolation is performed on the chroma component, the default interpolation filter is used to perform pixel interpolation on the reference block of the second image block; that is, the embodiments shown in this application can also work in chroma mode.
  • In summary, the solution of the embodiments of the present application can reduce the dependence on the encoding and decoding process of adjacent blocks, thereby improving coding and decoding efficiency and the performance of the codec device. It should be understood that the resulting reduction in system storage pressure can also be exploited for other purposes in coding and decoding scenarios.
  • FIG. 5 shows a schematic block diagram of a video encoding and decoding apparatus 500 according to an embodiment of the present application.
  • the video encoding and decoding apparatus 500 may include a processor 510, and may further include a memory 520.
  • video encoding and decoding apparatus 500 may also include components commonly included in other video encoding and decoding apparatuses, such as input and output devices, communication interfaces, etc., which are not limited in the embodiment of the present application.
  • the memory 520 is used to store computer-executable instructions.
  • The memory 520 may be any of various types of memory; for example, it may include high-speed random access memory (RAM) and may also include non-volatile memory, such as at least one magnetic disk memory. The embodiment of the present application does not limit this.
  • the processor 510 is configured to access the memory 520 and execute the computer-executable instructions to perform operations in the method for video processing in the foregoing embodiment of the present application.
  • the processor 510 may include a microprocessor, a field-programmable gate array (Field-Programmable Gate Array, FPGA), a central processing unit (CPU), a graphics processor (Graphics Processing Unit, GPU), etc.
  • The video processing device and the computer system in the embodiments of the application may correspond to the execution body of the video processing method in the embodiments of the application; the above and other operations and/or functions of the device and the computer system implement the corresponding procedures of the foregoing methods, which are not repeated here for brevity.
  • the video processor can implement the corresponding operations implemented by the codec device in the above method embodiments.
  • the video encoder may further include a body on which the encoder device is installed.
  • the body includes at least one of a mobile phone, a camera, or a drone.
  • The embodiment of the present application also provides a computer-readable storage medium that stores program instructions, and the program instructions may be used to instruct execution of the video encoding and decoding method of the embodiments of the present application described above.
  • the term "and/or” is merely an association relationship describing an associated object, indicating that there may be three relationships.
  • A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone.
  • the character "/" in this text generally indicates that the associated objects before and after are in an "or" relationship.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative, for example, the division of the units is only a logical function division, and there may be other divisions in actual implementation, for example, multiple units or components may be combined or It can be integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • The technical solution of this application, in essence, or the part contributing to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include media that can store program instructions, such as USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, and optical discs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video encoding and decoding method and apparatus, comprising the following steps: when performing pixel interpolation on an image block of a current image, the pixel interpolation can be performed using one of at least two types of interpolation filters, the current image comprising a first image block and a second image block; pixel interpolation is performed on a reference block of the first image block using a first interpolation filter; and pixel interpolation is performed on a reference block of the second image block using a default interpolation filter, the first image block being a neighboring block of the second image block. The described encoding and decoding method and apparatus prevent a current image block from depending too heavily on an interpolation filter used by a neighboring block, which improves encoding and decoding efficiency in the encoding and decoding process and improves the performance of an encoding and decoding apparatus.
PCT/CN2019/107598 2019-09-24 2019-09-24 Video encoding and decoding method and apparatus WO2021056212A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/107598 WO2021056212A1 (fr) 2019-09-24 2019-09-24 Video encoding and decoding method and apparatus
CN201980033882.2A CN112154666A (zh) 2019-09-24 2019-09-24 Video encoding and decoding method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/107598 WO2021056212A1 (fr) 2019-09-24 2019-09-24 Video encoding and decoding method and apparatus

Publications (1)

Publication Number Publication Date
WO2021056212A1 true WO2021056212A1 (fr) 2021-04-01

Family

ID=73891983

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/107598 WO2021056212A1 (fr) 2019-09-24 2019-09-24 Video encoding and decoding method and apparatus

Country Status (2)

Country Link
CN (1) CN112154666A (fr)
WO (1) WO2021056212A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114640845B (zh) * 2021-02-23 2023-02-28 杭州海康威视数字技术股份有限公司 Encoding and decoding method, apparatus, and device
CN113259669B (zh) * 2021-03-25 2023-07-07 浙江大华技术股份有限公司 Encoding method, apparatus, electronic device, and computer-readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101043621A (zh) * 2006-06-05 2007-09-26 华为技术有限公司 Adaptive interpolation processing method and codec module
CN101365137A (zh) * 2008-09-12 2009-02-11 华为技术有限公司 Motion compensation reference data loading method and apparatus, decoder, and codec system
CN103747269A (zh) * 2013-09-30 2014-04-23 北京大学深圳研究生院 Filter interpolation method and filter
US20180077423A1 (en) * 2016-09-15 2018-03-15 Google Inc. Dual filter type for motion compensated prediction in video coding
CN107925772A (zh) * 2015-09-25 2018-04-17 华为技术有限公司 Apparatus and method for video motion compensation using a selectable interpolation filter
CN109756737A (zh) * 2017-11-07 2019-05-14 华为技术有限公司 Image prediction method and apparatus

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2636622B2 (ja) * 1992-03-13 1997-07-30 松下電器産業株式会社 Video signal encoding and decoding methods, and video signal encoding and decoding apparatuses
CN102387360B (zh) * 2010-09-02 2016-05-11 乐金电子(中国)研究开发中心有限公司 Inter-frame image prediction method for video encoding/decoding and video codec
CN104702962B (zh) * 2015-03-03 2019-04-16 华为技术有限公司 Intra-frame encoding and decoding method, encoder, and decoder


Also Published As

Publication number Publication date
CN112154666A (zh) 2020-12-29

Similar Documents

Publication Publication Date Title
KR102130821B1 (ko) Image decoding method and computer-readable recording medium
US10666938B2 (en) Deriving reference mode values and encoding and decoding information representing prediction modes
JP7015255B2 (ja) Video coding using adaptive motion information refinement
TWI711300B (zh) Illumination compensation method and corresponding video processing apparatus
TWI705703B (zh) Deriving motion information for sub-blocks in video coding
WO2020015699A1 (fr) Merge candidates with multiple hypotheses
JP5875989B2 (ja) Method and apparatus for low-complexity template matching prediction for video encoding and decoding
KR20220000917A (ko) Motion vector refinement for multi-reference prediction
CN112534807A (zh) Method and device for multi-hypothesis mode reference and constraints
GB2519514A (en) Method and apparatus for displacement vector component prediction in video coding and decoding
CN111279701B (zh) Video processing method and device
WO2021163862A1 (fr) Video coding method and device
EP4037320A1 (fr) Boundary extension for video coding
JP2022535859A (ja) Method for constructing an MPM list, method for obtaining the intra prediction mode of a chroma block, and apparatus
WO2021056212A1 (fr) Video encoding and decoding method and apparatus
WO2021056220A1 (fr) Video encoding and decoding method and apparatus
CN114128263A (zh) Method and device for adaptive motion vector resolution in video coding
WO2021056210A1 (fr) Video encoding and decoding method and apparatus, and computer-readable storage medium
JP7198949B2 (ja) Motion vector prediction for video coding
JP7247345B2 (ja) Video decoding method, video decoding apparatus, and program
WO2021081905A1 (fr) Image prediction and video coding methods, apparatus, mobile platform, and storage medium
WO2020063598A1 (fr) Video encoder, video decoder, and corresponding methods
JP7210707B2 (ja) Method and apparatus for coding images of a video sequence, and terminal device
WO2020042990A1 (fr) Inter-frame prediction method and device, and encoding/decoding method and device applying them
WO2024086568A1 (fr) Video processing method, apparatus, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19947331

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19947331

Country of ref document: EP

Kind code of ref document: A1