WO2021134666A1 - Method and apparatus for video processing - Google Patents

Method and apparatus for video processing

Info

Publication number
WO2021134666A1
WO2021134666A1 PCT/CN2019/130901 CN2019130901W
Authority
WO
WIPO (PCT)
Prior art keywords
block
luminance
motion vector
chrominance
chrominance block
Prior art date
Application number
PCT/CN2019/130901
Other languages
English (en)
French (fr)
Inventor
马思伟
王苏红
郑萧桢
王苫社
Original Assignee
北京大学
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学, 深圳市大疆创新科技有限公司 filed Critical 北京大学
Priority to CN201980066718.1A priority Critical patent/CN112823520A/zh
Priority to PCT/CN2019/130901 priority patent/WO2021134666A1/zh
Publication of WO2021134666A1 publication Critical patent/WO2021134666A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors

Definitions

  • This application relates to the field of video processing, and more specifically, to a method and device for video processing.
  • the video signal is decomposed into luminance and chrominance components.
  • the resolution of the chrominance component is reduced by half or more through the "chrominance sampling" step.
  • the video encoding process includes an inter-frame prediction process.
  • the inter-frame prediction process includes obtaining a motion vector (MV) of the current block, and then, according to the motion vector of the current block, searching for similar blocks in the reference frame as the prediction block of the current block.
  • This application proposes a method and device for video processing.
  • by directly using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, the process of obtaining the motion vector of the chrominance block can be effectively simplified. Therefore, the coding and decoding process of the chrominance block can be simplified, the coding and decoding complexity can be reduced, and the coding and decoding efficiency can be improved.
  • a video processing method includes: determining a chrominance block to be encoded or decoded; and using a motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  • in a second aspect, a video processing device is provided, including a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of the instructions stored in the memory causes the processor to perform the following operations: determining the chrominance block to be encoded or decoded; and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  • a chip including a processing module and a communication interface, the processing module is used to implement the method of the first aspect, and the processing module is also used to control the communication interface to communicate with the outside.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a computer, the computer realizes the method of the first aspect.
  • the computer may be the video processing device provided by the second aspect.
  • a computer program product containing instructions which when executed by a computer causes the computer to implement the method of the first aspect.
  • the computer may be the video processing device provided by the second aspect.
  • a video processing system in a sixth aspect, includes an encoder and a decoder, and both the encoder and the decoder are used to execute the method of the first aspect.
  • based on the above description, by directly using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, the averaging operation of the prior art can be eliminated, which effectively simplifies the process of obtaining the motion vector of the chrominance block, thereby simplifying the coding and decoding process of the chrominance block, reducing coding and decoding complexity, and improving coding and decoding efficiency.
  • Figure 1 is a schematic diagram of the video encoding architecture.
  • Figure 2 is a schematic diagram of a four-parameter affine model (Affine model).
  • Figure 3 is a schematic diagram of the six-parameter Affine model.
  • Figure 4 is a schematic diagram of the Affine motion vector field.
  • Figure 5 is a schematic diagram of the correspondence between chrominance blocks and luminance blocks in a 4:2:0 video compression format.
  • Fig. 6 is a schematic diagram of a method for obtaining a motion vector of a chrominance block in a 4:2:0 video compression format.
  • Fig. 7 is a schematic diagram of a method for obtaining a motion vector of a chrominance block in a 4:2:2 video compression format.
  • Fig. 8 is a schematic diagram of another method for obtaining a motion vector of a chrominance block in a 4:2:2 video compression format.
  • FIG. 9 is a schematic flowchart of a video processing method provided by an embodiment of the application.
  • FIG. 10 is a schematic diagram of a method for obtaining a motion vector of a chrominance block in a 4:2:0 video compression format according to an embodiment of the application.
  • FIG. 11 is a schematic diagram of a method for obtaining a motion vector of a chrominance block in a 4:2:2 video compression format according to an embodiment of the application.
  • FIG. 12 is a schematic diagram of a method for obtaining a motion vector of a chrominance block in a video compression format of 4:4:4 according to an embodiment of the application.
  • FIG. 13 is a schematic block diagram of a video processing apparatus provided by an embodiment of the application.
  • This application can be applied to a variety of video coding standards, such as H.266, high efficiency video coding (HEVC, also known as H.265), versatile video coding (VVC), the audio video coding standards (AVS), AVS+, AVS2 and AVS3, as well as various audio and video coding and decoding standards that will evolve in the future.
  • the mainstream video coding framework includes prediction, transformation, quantization, entropy coding, loop filtering, etc., as shown in Figure 1.
  • Prediction is an important module of the mainstream video coding framework and is divided into intra-frame prediction and inter-frame prediction. Intra-frame prediction uses already-encoded blocks of the current image frame to generate the reference block (also called the prediction block) of the current image block (hereinafter referred to as the current block), while inter-frame prediction uses a reference frame (also called a reference image) to obtain the reference block of the current block. The reference block is then subtracted from the current block to obtain residual data. Through the residual data and a transformation matrix, the time-domain signal is transformed to the frequency domain to obtain transform coefficients. The transform coefficients are quantized to reduce their dynamic range and further compress the information.
  • the quantized transform coefficients take two paths: one is entropy coding, which produces the entropy-coded bitstream; the other is inverse quantization and inverse transformation followed by addition to the reference block and in-loop filtering, which yields a reconstructed frame, on the basis of which a better prediction mode can be determined.
  • at the encoding end, after the bitstream is obtained through entropy coding, the bitstream and the encoding mode information, such as the inter-frame prediction mode and motion vector information, are sent to the decoding end.
  • at the decoding end, the general decoding process includes: entropy-decoding the received bitstream to obtain the corresponding residual; obtaining the prediction block according to decoded coding mode information such as the motion vector; and reconstructing the current block from the residual and the prediction block. An illustrative sketch of this round trip follows.
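As a rough illustration of this pipeline, the toy Python sketch below mimics the residual/quantization/reconstruction path for a single block. It is an illustration only: the transform and entropy-coding stages named above are deliberately elided so the sketch stays self-contained, and the function name and quantization step are invented for this example.

```python
import numpy as np

def encode_block(cur, pred, q_step=8.0):
    """Toy residual -> quantize -> reconstruct path for one block.

    A real codec inserts a transform (e.g. an integer DCT) between the
    residual and the quantizer and entropy-codes the result; both stages
    are elided here for brevity.
    """
    residual = cur.astype(float) - pred.astype(float)
    levels = np.round(residual / q_step)   # quantized "coefficients"
    recon = pred + levels * q_step         # inverse path: dequantize + add prediction
    return levels, recon

# Example: a flat 4x4 block predicted with a small offset.
cur = np.full((4, 4), 130.0)
pred = np.full((4, 4), 120.0)
levels, recon = encode_block(cur, pred)
# The decoder only needs `levels` (entropy-coded) and `pred` to rebuild `recon`.
```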
  • Inter-frame prediction can be realized by means of motion compensation.
  • An example of the motion compensation process will be described below.
  • the coding area may also be referred to as a coding tree unit (CTU).
  • the size of the CTU may be, for example, 64×64 or 128×128 (the unit is pixels; similar descriptions below omit the unit).
  • Each CTU can be divided into square or rectangular image blocks.
  • the image block may also be referred to as a coding unit (CU), and the current CU to be encoded will be referred to as the current block in the following.
  • when performing inter-frame prediction on the current block, a similar block of the current block can be searched for in a reference frame (which may be a reconstructed frame nearby in the time domain) to serve as the prediction block of the current block.
  • the relative displacement between the current block and the similar block is called a motion vector (MV). The process of finding a similar block in the reference frame as the prediction block of the current block is motion compensation.
  • in the current H.266 international video coding standard, inter-frame prediction modes can be divided into the following two types: the merge mode (Merge mode) and the non-merge mode (for example, the advanced motion vector prediction mode (AMVP mode)).
  • the feature of the Merge mode is that the motion vector (MV) of an image block is its motion vector prediction (MVP). Therefore, for the Merge mode, it suffices to transmit the MVP index and the reference frame index in the code stream; there is no need to transmit a motion vector difference (MVD). In contrast, the non-Merge mode needs to transmit not only the MVP and reference frame indices in the code stream but also the MVD. The sketch below illustrates the difference.
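The signalling difference between the two modes can be summarised in a minimal sketch; the field names here are hypothetical and do not correspond to any standard's actual syntax elements.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class InterBlockSignalling:
    """Fields an inter-coded block carries in the code stream (illustrative)."""
    mvp_index: int                          # index into the MVP candidate list
    ref_index: int                          # reference-frame index
    mvd: Optional[Tuple[int, int]] = None   # transmitted only in non-Merge mode

# Merge mode: MV = MVP, so no MVD is written.
merge_block = InterBlockSignalling(mvp_index=2, ref_index=0)
# Non-Merge (e.g. AMVP): MV = MVP + MVD, so the MVD is also written.
amvp_block = InterBlockSignalling(mvp_index=1, ref_index=0, mvd=(3, -1))
```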
  • in the motion compensation prediction stage, previous mainstream video coding standards applied only the translational motion model, in which the motion vector of the current block represents the relative displacement between the current block and the reference block. In the real world, however, there are many more complex forms of motion, such as zooming in/out, rotation, perspective motion and other irregular motions.
  • in order to describe more complex motion, the affine transform motion compensation model (Affine model) was introduced in the new-generation coding standard VVC.
  • the Affine model uses the motion vectors of two control points (CP) of the current block (four-parameter model) or the motion vectors of three control points (six-parameter model) to describe the affine motion field of the current block.
  • in the four-parameter Affine model, the motion vectors of two control points of the current block are used to describe the affine motion field of the current block. The two control points may be, for example, the upper-left and upper-right corner points of the current block, as shown in Figure 2. For example, if the current block is a 16×16 CU, it can be divided into four 4×4 sub-blocks (Sub-CUs), and the MV of each of these four 4×4 Sub-CUs can be determined from the MVs of the two control points of the 16×16 CU.
  • in the six-parameter Affine model, the motion vectors of three control points of the current block are used to describe the affine motion field of the current block. The three control points may be, for example, the upper-left, upper-right and lower-left corner points of the current block, as shown in Figure 3. For example, if the current block is a 16×16 CU, it can be divided into four 4×4 sub-blocks (Sub-CUs), and the MV of each of these four 4×4 Sub-CUs can be determined from the MVs of the three control points of the 16×16 CU.
  • it should be noted that the 4×4 sub-block size is only an example in this application; in other implementations, the sub-block size can be another value, such as 8×8, which is not limited in this application.
  • in the four-parameter model, the motion vector of each sub-block in the current block can be calculated as follows:

    $$mv_x = \frac{mv_{1x}-mv_{0x}}{W}\,x - \frac{mv_{1y}-mv_{0y}}{W}\,y + mv_{0x}, \qquad mv_y = \frac{mv_{1y}-mv_{0y}}{W}\,x + \frac{mv_{1x}-mv_{0x}}{W}\,y + mv_{0y}$$

  • in the six-parameter model, the motion vector of each sub-block in the current block can be calculated as follows:

    $$mv_x = \frac{mv_{1x}-mv_{0x}}{W}\,x + \frac{mv_{2x}-mv_{0x}}{H}\,y + mv_{0x}, \qquad mv_y = \frac{mv_{1y}-mv_{0y}}{W}\,x + \frac{mv_{2y}-mv_{0y}}{H}\,y + mv_{0y}$$
  • (x, y) represents the coordinates of each sub-block in the current block.
  • mv0, mv1 and mv2 denote the control-point motion vectors (CPMV) of the current block: (mv0x, mv0y) is the motion vector of the upper-left control point, shown as mv0 in Figures 2 and 3; (mv1x, mv1y) is the motion vector of the upper-right control point, shown as mv1 in Figures 2 and 3; and (mv2x, mv2y) is the motion vector of the lower-left control point, shown as mv2 in Figure 3.
  • W represents the pixel width of the current block (may be simply referred to as the width of the current block)
  • H represents the pixel height of the current block (may be simply referred to as the height of the current block). This computation is illustrated by the sketch below.
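As an illustration of the four- and six-parameter formulas above, the following Python sketch computes the motion vector of a sub-block at coordinates (x, y). The function name is invented for this example, and the fractional/fixed-point arithmetic of a real codec is ignored.

```python
def affine_subblock_mv(x, y, mv0, mv1, W, H, mv2=None):
    """Motion vector of the sub-block at (x, y) inside the current block.

    mv0, mv1: (x, y) MVs of the upper-left and upper-right control points.
    mv2: (x, y) MV of the lower-left control point (six-parameter model only).
    W, H: pixel width and height of the current block.
    """
    mv0x, mv0y = mv0
    mv1x, mv1y = mv1
    if mv2 is None:
        # Four-parameter model: the same two parameters describe rotation
        # and zoom, so the y-direction gradient mirrors the x-direction one.
        vx = (mv1x - mv0x) / W * x - (mv1y - mv0y) / W * y + mv0x
        vy = (mv1y - mv0y) / W * x + (mv1x - mv0x) / W * y + mv0y
    else:
        # Six-parameter model: mv2 supplies an independent vertical gradient.
        mv2x, mv2y = mv2
        vx = (mv1x - mv0x) / W * x + (mv2x - mv0x) / H * y + mv0x
        vy = (mv1y - mv0y) / W * x + (mv2y - mv0y) / H * y + mv0y
    return vx, vy
```

For a 16×16 CU divided into 4×4 sub-blocks, evaluating the function once per sub-block yields a motion vector field of the kind sketched in Figure 4.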
  • the Affine model can also be divided into a Merge mode and a non-Merge mode (for example, the AMVP mode).
  • the combination of the Affine model and the Merge mode can be called the Affine Merge mode.
  • the combination of the Affine model and the AMVP mode can be called the Affine AMVP mode.
  • the motion vector candidate list of the normal Merge mode (merge candidate list) records the MVP of an image block, whereas the motion vector candidate list of the Affine Merge mode (affine merge candidate list) records control-point motion vector predictions (CPMVP). Similar to the normal Merge mode, the Affine Merge mode does not need to add an MVD to the code stream; it directly uses the CPMVP as the CPMV of the current block and encodes the index of the CPMVP into the code stream. Similar to the normal AMVP mode, the Affine AMVP mode needs to transmit not only the CPMVP index and the reference frame index in the code stream but also the MVD.
  • to simplify the computation of motion compensation prediction, VVC applies sub-block-based prediction: the current block is divided into sub-blocks and the motion vector of each sub-block is obtained separately. That is, the motion vector of each sub-block in the current block is derived from the motion vectors of the two or three control points of the current block.
  • each square represents a sub-block
  • the arrow in each square represents the motion vector of the sub-block.
  • the Affine model is a four-parameter model.
  • the current block is represented as the current CU, and the sub-blocks in the current block may be referred to as sub-CUs.
  • in the current standard, the sub-block size is 4×4.
  • the video signal is decomposed into luminance and chrominance components.
  • the resolution of the chrominance component is reduced by half or more through the "chrominance sampling" step (because the human eye is sensitive to the luminance resolution Higher than the sensitivity to color resolution).
  • the ratio of the resolution of the luminance component and the chrominance component is often used to describe various chrominance sampling methods. This ratio is usually based on the resolution of the luminance component and is described in the form of 4:X:Y. X and Y represent every two chrominance.
  • the relative number of values in the channel. 4:X:Y is also called video compression format. The following is an example of the video compression format 4:X:Y.
  • video compression formats include: 4:4:4, 4:2:0, 4:2:2, 4:1:1, etc.
  • 4:4:4 means that 1 chrominance block corresponds to 1 luminance block.
  • in other words, 4:4:4 means that 1 chrominance pixel corresponds to 1 luminance pixel; 4:4:4 denotes full sampling.
  • 4:2:0 means that 1 chrominance block corresponds to 4 luminance blocks.
  • in other words, 4:2:0 means that 1 chrominance pixel corresponds to 4 luminance pixels; as shown in Figure 5, each chroma pixel (Chroma Pixel) corresponds to four luma pixels (Luma Pixel).
  • 4:2:0 denotes 2:1 horizontal sampling and 2:1 vertical sampling.
  • 4:2:2 means that 1 chrominance block corresponds to 2 luma blocks.
  • 4:2:2 means that 1 chrominance pixel corresponds to 2 luminance pixels.
  • 4:2:2 denotes 2:1 horizontal sampling with full vertical sampling, as shown in Figure 7.
  • alternatively, 4:2:2 denotes 2:1 vertical sampling with full horizontal sampling, as shown in Figure 8.
  • 4:1:1 means that 1 chrominance block corresponds to 4 luma blocks.
  • 4:1:1 means that 1 chrominance pixel corresponds to 4 luminance pixels.
  • 4:1:1 denotes 4:1 horizontal sampling with full vertical sampling.
  • a luminance block can be understood as a pixel, or a set of pixels, carrying luminance values.
  • a chrominance block can be understood as a pixel, or a set of pixels, carrying chrominance values.
  • the definition of a block differs between video coding standards. For example, in the HEVC standard, a block is a coding unit (CU).
  • as another example, in the VVC standard, a block of size 4×4 may be called a sub-block; that is, the above-mentioned luminance block may be a 4×4 luminance sub-block, and the chrominance block may be a 4×4 chrominance sub-block.
  • it should be noted that 4×4 is only an example in this application; in other implementations, the size of the luminance block and the size of the chrominance block may be other values, such as 8×8, which is not limited in this application. The per-format correspondences are summarised in the sketch below.
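The block correspondences listed above can be collected into a small lookup table. This is an illustrative sketch only, assuming luminance and chrominance blocks of equal size (e.g. 4×4) as in the Affine mode; the tuples give the number of corresponding luminance blocks horizontally and vertically.

```python
# Luminance blocks per chrominance block: (horizontal, vertical).
LUMA_BLOCKS_PER_CHROMA = {
    "4:4:4": (1, 1),  # 1 luma block per chroma block (full sampling)
    "4:2:0": (2, 2),  # 4 luma blocks, arranged 2x2
    "4:2:2": (2, 1),  # 2 luma blocks side by side (or (1, 2) for the
                      # vertically sampled variant of Figure 8)
    "4:1:1": (4, 1),  # 4 luma blocks in a row
}
```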
  • in the motion compensation prediction stage, if the Affine model is used, the motion vector of each 4×4 luminance sub-block is obtained from the two or three control-point motion vectors of the current block, as described above in conjunction with Figures 2 to 4.
  • for each 4×4 chrominance sub-block, its motion vector is obtained from the motion vectors of the corresponding luminance sub-blocks.
  • for video in the 4:2:0 compression format, each 4×4 chrominance sub-block corresponds to one 8×8 luminance block. Assuming the luminance sub-block size in the Affine mode is 4×4, each 4×4 chrominance sub-block corresponds to four 4×4 luminance sub-blocks. In the current technology, the motion vector of a 4×4 chrominance sub-block is obtained by averaging the motion vector mv1 of the upper-left luminance sub-block and the motion vector mv2 of the lower-right luminance sub-block among the corresponding four 4×4 luminance sub-blocks, as shown in Figure 6.
  • for video in the 4:2:2 compression format, each 4×4 chrominance sub-block corresponds to one 4×8 luminance sub-block. Assuming the luminance sub-block size in the Affine mode is 4×4, each 4×4 chrominance sub-block corresponds to two 4×4 luminance sub-blocks. Depending on the sampling direction, each 4×4 chrominance sub-block may correspond to two 4×4 luminance sub-blocks in the horizontal direction or to two 4×4 luminance sub-blocks in the vertical direction. In the current technology, the motion vector of each 4×4 chrominance sub-block is obtained by averaging the motion vectors of the two corresponding 4×4 luminance sub-blocks, either in the horizontal direction, as shown in Figure 7, or in the vertical direction, as shown in Figure 8. This averaging flow is sketched below.
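The averaging-based derivation described in the two preceding paragraphs can be sketched in Python as follows; this is an illustration of the described flow, not actual codec source, with motion vectors as (x, y) pairs and rounding ignored.

```python
def chroma_mv_by_averaging(fmt, luma_mvs):
    """Prior derivation of a 4x4 chroma sub-block MV by averaging.

    luma_mvs: (x, y) MVs of the corresponding 4x4 luma sub-blocks in
    row-major order ([top-left, top-right, bottom-left, bottom-right]
    for 4:2:0; [first, second] along the sampled direction for 4:2:2).
    """
    if fmt == "4:2:0":
        # Average the top-left and bottom-right luma sub-block MVs.
        (x1, y1), (x2, y2) = luma_mvs[0], luma_mvs[3]
    elif fmt == "4:2:2":
        # Average the two corresponding luma sub-block MVs.
        (x1, y1), (x2, y2) = luma_mvs[0], luma_mvs[1]
    else:
        raise ValueError("averaging is described here for 4:2:0 and 4:2:2 only")
    return ((x1 + x2) / 2, (y1 + y2) / 2)
```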
  • FIG. 9 is a schematic flowchart of a video processing method provided by an embodiment of the application. The method in FIG. 9 can be applied to the encoding end as well as to the decoding end.
  • step S910 the chrominance block to be encoded or decoded is determined.
  • step S920 the motion vector of one of the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block.
  • Each chrominance block has the same size as each luminance block.
  • the size of the block is related to the inter-frame prediction mode. For example, when the inter-frame prediction mode is the Affine mode, the block size may be 4×4; that is, the size of the chrominance block and the size of the luminance block are 4×4. It should be understood that under different inter-frame prediction modes or video coding standards, the size of a block may be defined differently.
  • the correspondence rule between chrominance blocks and luminance blocks is determined by the video compression format. For video in the 4:2:0 compression format, each chrominance block corresponds to 4 luminance blocks; for 4:2:2, each chrominance block corresponds to 2 luminance blocks; for 4:4:4, each chrominance block corresponds to 1 luminance block; and for 4:1:1, each chrominance block corresponds to 4 luminance blocks.
  • in step S920, the motion vector of one of the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block; for ease of description and understanding, this luminance block is denoted as the target luminance block below.
  • for a chrominance block, the position of the target luminance block among the luminance blocks corresponding to the chrominance block can be specified by the protocol or pre-configured, or negotiated in advance by the encoding end and the decoding end; that is, the encoding end and the decoding end obtain the motion vector of the chrominance block based on the same rule.
  • for example, in step S920, the motion vector of the target luminance block at a preset position among the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block.
  • the preset position here means that the position of the target luminance block is stipulated by the protocol or pre-configured.
  • in the case where the chrominance block corresponds to multiple luminance blocks (for example, the compression format is 4:2:0, 4:2:2 or 4:1:1), in step S920 the motion vector of the target luminance block among the multiple luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block, where the target luminance block is the luminance block located at any one of the following positions among the multiple luminance blocks corresponding to the chrominance block: the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
  • that is to say, the aforementioned preset position may be any one of the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
  • for example, when the video compression format is 4:2:0, the motion vector of the luminance block located in the upper-left, upper-right, lower-left or lower-right corner of the four luminance blocks corresponding to the chrominance block can be used as the motion vector of the chrominance block.
  • as another example, when the video compression format is 4:2:2: in the case where the chrominance block corresponds to two luminance blocks in the horizontal direction, the motion vector of the left or right luminance block of the two can be used as the motion vector of the chrominance block; in the case where the chrominance block corresponds to two luminance blocks in the vertical direction, the motion vector of the upper or lower luminance block of the two can be used as the motion vector of the chrominance block.
  • as yet another example, when the video compression format is 4:1:1, the motion vector of the first, second, third or fourth luminance block from the left among the four luminance blocks corresponding to the chrominance block can be used as the motion vector of the chrominance block.
  • in the case where the chrominance block corresponds to one luminance block, i.e., the compression format is 4:4:4, in step S920 the motion vector of the one luminance block corresponding to the chrominance block is used as the motion vector of the chrominance block. A combined sketch of these per-format rules follows.
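Putting the per-format rules together, the derivation proposed here reduces to copying a single luminance MV. The sketch below assumes the preset positions used in the examples that follow (upper-left for 4:2:0, left/upper for 4:2:2, leftmost for 4:1:1); the text allows other preset positions as well.

```python
def chroma_mv_direct(luma_mvs):
    """Proposed derivation: copy the MV of one luma block at a preset
    position instead of averaging.

    luma_mvs lists the corresponding luma-block MVs in row-major order,
    so index 0 is the upper-left block for 4:2:0, the left/upper block
    for 4:2:2, the leftmost block for 4:1:1, and the only block for
    4:4:4. With that ordering, one rule covers every format and no
    averaging is needed.
    """
    return luma_mvs[0]

# 4:2:0 example: four corresponding 4x4 luma sub-blocks, upper-left wins.
assert chroma_mv_direct([(4, 1), (5, 1), (4, 2), (5, 2)]) == (4, 1)
```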
  • optionally, in step S920, the motion vector of a luminance block at a fixed position (denoted as the target luminance block) among the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block.
  • the luminance block at the fixed position may be the luminance block at any position among the luminance blocks corresponding to the chrominance block.
  • for example, for the 4:2:0 video compression format, the target luminance block at the fixed position is the luminance block in the upper-left corner of the four luminance blocks corresponding to the chrominance block.
  • as shown in Figure 10, in the Affine mode, the motion vector of a 4×4 chrominance sub-block is directly derived from the motion vector (MV) of the upper-left luminance sub-block among the four 4×4 luminance sub-blocks corresponding to that chrominance sub-block.
  • it should be understood that, in the 4:2:0 video compression format, using the motion vector of the upper-left luminance block among the four corresponding luminance blocks as the motion vector of the chrominance block can, comparatively speaking, improve the accuracy of the motion vector of the chrominance block.
  • for the 4:2:0 video compression format, the target luminance block at the fixed position can also be the luminance block located in the upper-right, lower-left or lower-right corner of the four luminance blocks corresponding to the chrominance block.
  • as another example, for the 4:2:2 video compression format: in the case where the chrominance block corresponds to two luminance blocks in the horizontal direction, the target luminance block at the fixed position is the left luminance block of the two; in the case where the chrominance block corresponds to two luminance blocks in the vertical direction, the target luminance block at the fixed position is the upper luminance block of the two.
  • as shown in Figure 11, chrominance block 1 corresponds to two luminance blocks 1-1 and 1-2 in the horizontal direction; the motion vector of luminance block 1-1 is mv1, and the motion vector of luminance block 1-2 is mv2. Chrominance block 2 corresponds to two luminance blocks 2-1 and 2-2 in the horizontal direction; the motion vector of luminance block 2-1 is mv3, and the motion vector of luminance block 2-2 is mv4.
  • the motion vector of chrominance block 1 is directly derived from the motion vector mv1 of luminance block 1-1; that is, mv1 is directly used as the motion vector of chrominance block 1.
  • the motion vector of chrominance block 2 is directly derived from the motion vector mv3 of luminance block 2-1; that is, mv3 is directly used as the motion vector of chrominance block 2.
  • it should be understood that, in the 4:2:2 video compression format, using the motion vector of the left or upper luminance block of the two corresponding luminance blocks as the motion vector of the chrominance block can, comparatively speaking, improve the accuracy of the motion vector of the chrominance block.
  • for the 4:2:2 video compression format, in the case where the chrominance block corresponds to two luminance blocks in the horizontal direction, the target luminance block at the fixed position may also be the right luminance block of the two; in the case where the chrominance block corresponds to two luminance blocks in the vertical direction, the target luminance block at the fixed position may also be the lower luminance block of the two.
  • as yet another example, for the 4:1:1 video compression format, the target luminance block at the fixed position is the leftmost luminance block among the four luminance blocks corresponding to the chrominance block, i.e., the first luminance block from the left.
  • for the 4:1:1 video compression format, the target luminance block at the fixed position can also be the second, third or fourth luminance block from the left among the four luminance blocks corresponding to the chrominance block.
  • as still another example, for the 4:4:4 video compression format, the target luminance block at the fixed position is simply the one luminance block corresponding to the chrominance block; that is, the motion vector of the chrominance block is directly derived from the motion vector of its one corresponding luminance block.
  • as shown in Figure 12, in the 4:4:4 video compression format, chrominance block 1 corresponds to luminance block 1-1, whose motion vector is mv1; chrominance block 2 corresponds to luminance block 2-1, whose motion vector is mv2; chrominance block 3 corresponds to luminance block 3-1, whose motion vector is mv3; and chrominance block 4 corresponds to luminance block 4-1, whose motion vector is mv4.
  • the motion vector of chrominance block 1 is directly derived from the motion vector mv1 of luminance block 1-1; that is, mv1 is directly used as the motion vector of chrominance block 1.
  • the motion vector of chrominance block 2 is directly derived from the motion vector mv2 of luminance block 2-1; that is, mv2 is directly used as the motion vector of chrominance block 2.
  • the descriptions of chrominance block 3 and chrominance block 4 follow by analogy and are not repeated.
  • it should be understood that, by using the motion vector of a luminance block at a fixed position among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, the fixed position can be specified by the protocol or pre-configured, so the encoding end does not need to signal the fixed position to the decoding end, which can further simplify the process of obtaining the motion vector of the chrominance block.
  • based on the above description, this application directly uses the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block; for any video compression format, no averaging operation needs to be performed when obtaining the motion vector of the chrominance block. Compared with the prior art, this effectively simplifies the process of obtaining the motion vector of the chrominance block, thereby simplifying the coding and decoding process of the chrominance block, reducing coding and decoding complexity, and improving coding and decoding efficiency.
  • the inter-frame prediction mode is the Affine model, that is, the motion vector of the luminance block is obtained according to two or three control point motion vectors.
  • the motion vector of each luminance block is obtained from the upper left corner control point motion vector and the upper right corner control point motion vector of the current block where the luminance block is located.
  • the motion vector of each luminance block is obtained from the upper left corner control point motion vector, the upper right corner control point motion vector, and the lower left corner control point motion vector of the current block where the luminance block is located.
  • for example, when the inter-frame prediction mode is the Affine model, the luminance block may be called a luminance sub-block and the chrominance block may be called a chrominance sub-block.
  • This application does not limit the inter-frame prediction mode; for example, it may be a prediction mode other than the Affine mode. It should be understood that the way the motion vector of the luminance block is obtained depends on the inter-frame prediction mode. Besides the Affine mode, this application is applicable to any scenario in which the motion vector of a chrominance block is derived from the motion vector of a luminance block.
  • when the method in FIG. 9 is applied to the encoding end, step S910 and step S920 are executed in the prediction stage of video encoding, after the motion vectors of the luminance blocks have been obtained.
  • after step S920 is performed, the method in FIG. 9 further includes: encoding the chrominance block by using the motion vector of the chrominance block.
  • for example, after step S920 is performed, the method in FIG. 9 includes the transform, quantization and entropy coding processes shown in FIG. 1, as well as the inverse quantization and inverse transform processes used to obtain the reconstructed frame.
  • when the method in FIG. 9 is applied to the decoding end, the method may further include the following operations before step S910: decoding the code stream to obtain the index of the control-point motion vector (CPMV) and, in the non-merge mode, also decoding the motion vector difference (MVD) of the control-point motion vector, so as to obtain the CPMV of each image block; and computing the motion vector of each luminance block from the CPMV according to the four-parameter or six-parameter model described above. An illustrative sketch of these operations follows.
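As an illustration of these decoder-side operations, the sketch below reuses the affine_subblock_mv function from the earlier example; the surrounding structure (candidate list, MVD handling) is hypothetical and merely stands in for the actual parsing a standard defines.

```python
def decode_luma_mvs(cpmvp_index, candidate_list, mvds, W, H, merge=True):
    """Illustrative decoder-side derivation of the luma sub-block MVs.

    candidate_list: CPMVP candidates, each a list of two or three (x, y)
    control-point MVs; mvds: decoded MV differences, used only in the
    non-merge case (CPMV = CPMVP + MVD).
    """
    cpmvs = list(candidate_list[cpmvp_index])
    if not merge:
        # Non-merge: add the decoded MVD to each control-point prediction.
        cpmvs = [(px + dx, py + dy)
                 for (px, py), (dx, dy) in zip(cpmvs, mvds)]
    mv2 = cpmvs[2] if len(cpmvs) == 3 else None
    # Reuse the affine_subblock_mv sketch above for each 4x4 sub-block.
    return {(x, y): affine_subblock_mv(x, y, cpmvs[0], cpmvs[1], W, H, mv2)
            for y in range(0, H, 4) for x in range(0, W, 4)}
```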
  • it should be understood that the sensitivity of the human eye to luminance resolution is higher than its sensitivity to chrominance resolution, and the luminance component contributes more to the video signal than the chrominance component; obtaining the motion vector of the chrominance block with the method provided by the embodiments of this application therefore does not significantly reduce coding performance.
  • the embodiments of the present application can also be applied to scenarios with low requirements on chrominance coding; in such cases, the process of obtaining the motion vector of the chrominance block can be effectively simplified without affecting coding performance.
  • an embodiment of the present application also provides a video processing device.
  • the device can be an encoder or a decoder.
  • the video processing device includes a processor 1310 and a memory 1320; the memory 1320 is used to store instructions, the processor 1310 is used to execute the instructions stored in the memory 1320, and execution of the instructions stored in the memory 1320 causes the processor 1310 to perform the method of the above method embodiments.
  • the processor 1310 executes the instructions stored in the memory 1320 to perform the following operations: determining the chrominance block to be encoded or decoded; and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  • optionally, using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block includes: using the motion vector of a luminance block at a fixed position among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  • optionally, the chrominance block corresponds to multiple luminance blocks, and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block includes: using the motion vector of the target luminance block among the multiple luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, where the target luminance block is the luminance block located at any one of the following positions among the multiple luminance blocks corresponding to the chrominance block: the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
  • optionally, the video compression format is 4:2:0, and the target luminance block is the luminance block located in the upper-left corner of the multiple luminance blocks corresponding to the chrominance block.
  • optionally, the video compression format is 4:2:2; in the case where the chrominance block corresponds to two horizontal luminance blocks, the target luminance block is the left luminance block of the two luminance blocks corresponding to the chrominance block; or, in the case where the chrominance block corresponds to two vertical luminance blocks, the target luminance block is the upper luminance block of the two luminance blocks corresponding to the chrominance block.
  • the video compression format is 4:1:1, and the target luminance block is the leftmost luminance block among the multiple luminance blocks corresponding to the chrominance block.
  • the motion vector of the luminance block is obtained based on two or three control point motion vectors.
  • the processor 1310 is further configured to perform the following operations: use the motion vector of the chroma block to encode or decode the chroma block.
  • the video processing apparatus further includes a communication interface 1330 for transmitting signals with external devices.
  • the communication interface 1330 is used to receive image or video data to be processed from an external device, and is also used to send an encoded bit stream to the decoding end.
  • the communication interface 1330 is used to send the decoded data to an external device.
  • An embodiment of the present application also provides a video processing system.
  • the system includes an encoder and a decoder, and both the encoder and the decoder are used to execute the method of the above embodiment.
  • an embodiment of the present invention also provides a computer storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer executes the method of the above method embodiments.
  • an embodiment of the present invention also provides a computer program product containing instructions; when the instructions are executed by a computer, the computer executes the method of the above method embodiments.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • when software is used, they can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part.
  • the computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • Computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • for example, computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wirelessly (such as infrared, radio or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.

Abstract

A method and apparatus for video processing are provided. The method includes: determining a chrominance block to be encoded or decoded; and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block. By directly using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, the process of obtaining the motion vector of the chrominance block can be effectively simplified, which in turn simplifies the coding and decoding process of the chrominance block, reduces coding and decoding complexity, and improves coding and decoding efficiency.

Description

Method and apparatus for video processing
Copyright Notice
The disclosure of this patent document contains material subject to copyright protection. The copyright is owned by the copyright owner. The copyright owner does not object to the reproduction by anyone of this patent document or this patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical Field
This application relates to the field of video processing and, more specifically, to a method and apparatus for video processing.
Background
In video compression, a video signal is decomposed into luminance components and chrominance components, and the resolution of the chrominance components is reduced by half or more through a "chrominance sampling" step.
The video encoding process includes an inter-frame prediction process. The inter-frame prediction process includes obtaining the motion vector (MV) of the current block and then, according to that motion vector, searching a reference frame for a similar block to serve as the prediction block of the current block.
In current video compression technology, the process of obtaining the motion vector of a chrominance block is rather cumbersome. How to simplify this process is a problem that urgently needs to be solved.
Summary
This application provides a method and apparatus for video processing. By directly using the motion vector of one of the luminance blocks corresponding to a chrominance block as the motion vector of the chrominance block, the process of obtaining the motion vector of the chrominance block can be effectively simplified, which in turn simplifies the coding and decoding process of the chrominance block, reduces coding and decoding complexity, and improves coding and decoding efficiency.
In a first aspect, a video processing method is provided, including: determining a chrominance block to be encoded or decoded; and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
In a second aspect, a video processing apparatus is provided, including a memory and a processor. The memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of the instructions stored in the memory causes the processor to perform the following operations: determining a chrominance block to be encoded or decoded; and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
In a third aspect, a chip is provided, including a processing module and a communication interface. The processing module is used to implement the method of the first aspect and is also used to control the communication interface to communicate with the outside.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a computer, the computer implements the method of the first aspect. The computer may be the video processing apparatus provided in the second aspect.
In a fifth aspect, a computer program product containing instructions is provided; when the instructions are executed by a computer, the computer implements the method of the first aspect. The computer may be the video processing apparatus provided in the second aspect.
In a sixth aspect, a video processing system is provided, including an encoder and a decoder, both of which are used to execute the method of the first aspect.
Based on the above description, by directly using the motion vector of one of the luminance blocks corresponding to a chrominance block as the motion vector of the chrominance block, the averaging operation of the prior art can be removed, which effectively simplifies the process of obtaining the motion vector of the chrominance block, thereby simplifying the coding and decoding process of the chrominance block, reducing coding and decoding complexity, and improving coding and decoding efficiency.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the video encoding architecture.
Figure 2 is a schematic diagram of the four-parameter affine model (Affine model).
Figure 3 is a schematic diagram of the six-parameter Affine model.
Figure 4 is a schematic diagram of an Affine motion vector field.
Figure 5 is a schematic diagram of the correspondence between chrominance blocks and luminance blocks in the 4:2:0 video compression format.
Figure 6 is a schematic diagram of a method for obtaining the motion vector of a chrominance block in the 4:2:0 video compression format.
Figure 7 is a schematic diagram of a method for obtaining the motion vector of a chrominance block in the 4:2:2 video compression format.
Figure 8 is a schematic diagram of another method for obtaining the motion vector of a chrominance block in the 4:2:2 video compression format.
Figure 9 is a schematic flowchart of a video processing method provided by an embodiment of this application.
Figure 10 is a schematic diagram of a method for obtaining the motion vector of a chrominance block in the 4:2:0 video compression format according to an embodiment of this application.
Figure 11 is a schematic diagram of a method for obtaining the motion vector of a chrominance block in the 4:2:2 video compression format according to an embodiment of this application.
Figure 12 is a schematic diagram of a method for obtaining the motion vector of a chrominance block in the 4:4:4 video compression format according to an embodiment of this application.
Figure 13 is a schematic block diagram of a video processing apparatus provided by an embodiment of this application.
Detailed Description
This application can be applied to a variety of video coding standards, such as H.266, high efficiency video coding (HEVC, also known as H.265), versatile video coding (VVC), the audio video coding standards (AVS), AVS+, AVS2 and AVS3, as well as various audio and video coding and decoding standards that will evolve in the future.
The mainstream video coding framework includes prediction, transform, quantization, entropy coding, loop filtering and other modules, as shown in Figure 1.
Prediction is an important module of the mainstream video coding framework and is divided into intra-frame prediction and inter-frame prediction. Intra-frame prediction uses already-encoded blocks of the current image frame to generate the reference block (also called the prediction block) of the current image block (hereinafter referred to as the current block), while inter-frame prediction uses a reference frame (also called a reference image) to obtain the reference block of the current block. The reference block is then subtracted from the current block to obtain residual data. Through the residual data and a transformation matrix, the time-domain signal is transformed to the frequency domain to obtain transform coefficients. The transform coefficients are quantized to reduce their dynamic range and further compress the information. The quantized transform coefficients take two paths: one is entropy coding, which produces the entropy-coded bitstream; the other is inverse quantization and inverse transformation followed by addition to the reference block and in-loop filtering, which yields a reconstructed frame, on the basis of which a better prediction mode can be determined.
As an example, at the encoding end, after the bitstream is obtained through entropy coding, the bitstream and the encoding mode information, such as the inter-frame prediction mode and motion vector information, are sent to the decoding end.
As an example, at the decoding end, the general decoding process includes: entropy-decoding the received bitstream to obtain the corresponding residual; obtaining the prediction block according to decoded encoding mode information such as the motion vector; and reconstructing the current block from the residual and the prediction block.
Inter-frame prediction can be realized by means of motion compensation. The motion compensation process is illustrated below.
For example, a frame of image can first be divided into one or more coding regions. A coding region may also be called a coding tree unit (CTU). The size of a CTU may be, for example, 64×64 or 128×128 (the unit is pixels; the unit is omitted in similar descriptions below). Each CTU can be divided into square or rectangular image blocks. An image block may also be called a coding unit (CU), and the current CU to be encoded is referred to below as the current block.
When performing inter-frame prediction on the current block, a similar block of the current block can be searched for in a reference frame (which may be a reconstructed frame nearby in the time domain) to serve as the prediction block of the current block. The relative displacement between the current block and the similar block is called the motion vector (MV). The process of finding a similar block in the reference frame as the prediction block of the current block is motion compensation.
In the current H.266 international video coding standard, inter-frame prediction modes can be divided into the following two types: the merge mode (Merge mode) and the non-merge mode (for example, the advanced motion vector prediction mode (AMVP mode)).
The feature of the Merge mode is that the motion vector (MV) of an image block is its motion vector prediction (MVP); therefore, for the Merge mode, it suffices to transmit the MVP index and the reference frame index in the code stream, and there is no need to transmit a motion vector difference (MVD). In contrast, the non-Merge mode needs to transmit not only the MVP and the reference frame index in the code stream but also the MVD.
In the motion compensation prediction stage, previous mainstream video coding standards applied only the translational motion model, in which the motion vector of the current block represents the relative displacement between the current block and the reference block. In the real world, however, there are many more complex forms of motion, such as zooming in/out, rotation, perspective motion and other irregular motions. In order to describe more complex motion, the affine transform motion compensation model (Affine model) was introduced in the new-generation coding standard VVC. The Affine model uses the motion vectors of two control points (CP) of the current block (the four-parameter model) or of three control points (the six-parameter model) to describe the affine motion field of the current block.
In the four-parameter Affine model, the motion vectors of two control points of the current block are used to describe the affine motion field of the current block. The two control points may be, for example, the upper-left and upper-right corner points of the current block, as shown in Figure 2. For example, if the current block is a 16×16 CU, it can be divided into four 4×4 sub-blocks (Sub-CUs), and the MV of each of these four 4×4 Sub-CUs can be determined from the MVs of the two control points of the 16×16 CU.
In the six-parameter Affine model, the motion vectors of three control points of the current block are used to describe the affine motion field of the current block. The three control points may be, for example, the upper-left, upper-right and lower-left corner points of the current block, as shown in Figure 3. For example, if the current block is a 16×16 CU, it can be divided into four 4×4 sub-blocks (Sub-CUs), and the MV of each of these four 4×4 Sub-CUs can be determined from the MVs of the three control points of the 16×16 CU.
It should be noted that the 4×4 sub-block (Sub-CU) size is only an example in this application; in other implementations, the sub-block size may be another value, such as 8×8, which is not limited in this application.
For example, in the four-parameter model, the motion vector of each sub-block in the current block can be calculated by the following formula:

$$mv_x = \frac{mv_{1x}-mv_{0x}}{W}\,x - \frac{mv_{1y}-mv_{0y}}{W}\,y + mv_{0x}, \qquad mv_y = \frac{mv_{1y}-mv_{0y}}{W}\,x + \frac{mv_{1x}-mv_{0x}}{W}\,y + mv_{0y}$$

In the six-parameter model, the motion vector of each sub-block in the current block can be calculated by the following formula:

$$mv_x = \frac{mv_{1x}-mv_{0x}}{W}\,x + \frac{mv_{2x}-mv_{0x}}{H}\,y + mv_{0x}, \qquad mv_y = \frac{mv_{1y}-mv_{0y}}{W}\,x + \frac{mv_{2y}-mv_{0y}}{H}\,y + mv_{0y}$$

Here, (x, y) denotes the coordinates of each sub-block in the current block. mv0, mv1 and mv2 denote the control-point motion vectors (CPMV) of the current block: (mv0x, mv0y) is the motion vector of the upper-left control point, shown as mv0 in Figures 2 and 3; (mv1x, mv1y) is the motion vector of the upper-right control point, shown as mv1 in Figures 2 and 3; and (mv2x, mv2y) is the motion vector of the lower-left control point, shown as mv2 in Figure 3. W denotes the pixel width of the current block (the width of the current block for short), and H denotes the pixel height of the current block (the height of the current block for short).
The Affine model can likewise be divided into a Merge mode and a non-Merge (for example, AMVP) mode. The combination of the Affine model with the Merge mode may be called the Affine Merge mode, and the combination of the Affine model with the AMVP mode may be called the Affine AMVP mode.
The motion vector candidate list of the normal Merge mode (merge candidate list) records the MVP of an image block, whereas the motion vector candidate list of the Affine Merge mode (affine merge candidate list) records control-point motion vector predictions (CPMVP). Similar to the normal Merge mode, the Affine Merge mode does not need to add an MVD to the code stream; it directly uses the CPMVP as the CPMV of the current block and encodes the index of the CPMVP into the code stream.
Similar to the normal AMVP mode, the Affine AMVP mode needs to transmit not only the CPMVP index and the reference frame index in the code stream but also the MVD.
To simplify the computation of motion compensation prediction, VVC applies block-based prediction: the current block is divided into sub-blocks and the motion vector of each sub-block is obtained separately. That is, the motion vector of each sub-block in the current block is derived from the motion vectors of the two or three control points of the current block.
As an example, Figure 4 shows a schematic diagram of the motion vectors of the sub-blocks in an image block; each square represents a sub-block, and the arrow in each square represents the motion vector of that sub-block. In Figure 4, the Affine model is the four-parameter model.
For example, the current block is denoted as the current CU, and the sub-blocks in the current block may be called sub-CUs.
For example, in the current standard, the sub-block size is 4×4.
In video compression, a video signal is decomposed into luminance components and chrominance components, and the resolution of the chrominance components is reduced by half or more through the "chrominance sampling" step (because the human eye is more sensitive to luminance resolution than to chrominance resolution). The ratio between the resolutions of the luminance and chrominance components is often used to describe the various chrominance sampling schemes; this ratio is usually based on the resolution of the luminance component and is written in the form 4:X:Y, where X and Y represent the relative number of values in each of the two chrominance channels. 4:X:Y is also called the video compression format. The video compression format 4:X:Y is illustrated below.
Using the standard naming rules for video compression formats, the video compression formats include 4:4:4, 4:2:0, 4:2:2, 4:1:1, and so on.
4:4:4 means that 1 chrominance block corresponds to 1 luminance block.
In other words, 4:4:4 means that 1 chrominance pixel corresponds to 1 luminance pixel.
4:4:4 denotes full sampling.
4:2:0 means that 1 chrominance block corresponds to 4 luminance blocks.
In other words, 4:2:0 means that 1 chrominance pixel corresponds to 4 luminance pixels. As shown in Figure 5, each chroma pixel (Chroma Pixel) corresponds to four luma pixels (Luma Pixel).
4:2:0 denotes 2:1 horizontal sampling and 2:1 vertical sampling.
4:2:2 means that 1 chrominance block corresponds to 2 luminance blocks.
In other words, 4:2:2 means that 1 chrominance pixel corresponds to 2 luminance pixels.
4:2:2 denotes 2:1 horizontal sampling with full vertical sampling, as shown in Figure 7; alternatively, 4:2:2 denotes 2:1 vertical sampling with full horizontal sampling, as shown in Figure 8.
4:1:1 means that 1 chrominance block corresponds to 4 luminance blocks.
In other words, 4:1:1 means that 1 chrominance pixel corresponds to 4 luminance pixels.
4:1:1 denotes 4:1 horizontal sampling with full vertical sampling.
For example, a luminance block can be understood as a pixel, or a set of pixels, carrying luminance values; a chrominance block can be understood as a pixel, or a set of pixels, carrying chrominance values.
The definition of a block differs between video coding standards. For example, in the HEVC standard, a block is a coding unit (CU). As another example, in the VVC standard, a block of size 4×4 may be called a sub-block; that is, the above-mentioned luminance block may be a 4×4 luminance sub-block, and the chrominance block may be a 4×4 chrominance sub-block.
It should be noted that 4×4 is only an example in this application; in other implementations, the size of the luminance block and the size of the chrominance block may be other values, such as 8×8, which is not limited in this application.
In the motion compensation prediction stage, if the Affine model is used, the motion vector of each 4×4 luminance sub-block is obtained from the two or three control-point motion vectors of the current block, as described above in conjunction with Figures 2 to 4. For each 4×4 chrominance sub-block, its motion vector is obtained from the motion vectors of the corresponding luminance sub-blocks.
For example, for video in the 4:2:0 compression format, each 4×4 chrominance sub-block corresponds to one 8×8 luminance block. Assuming that the luminance sub-block size in the Affine mode is 4×4, each 4×4 chrominance sub-block corresponds to four 4×4 luminance sub-blocks. In the current technology, for video in the 4:2:0 compression format, the motion vector of a 4×4 chrominance sub-block is obtained by averaging the motion vector mv1 of the upper-left luminance sub-block and the motion vector mv2 of the lower-right luminance sub-block among the corresponding four 4×4 luminance sub-blocks, as shown in Figure 6.
As another example, for video in the 4:2:2 compression format, each 4×4 chrominance sub-block corresponds to one 4×8 luminance sub-block. Assuming that the luminance sub-block size in the Affine mode is 4×4, each 4×4 chrominance sub-block corresponds to two 4×4 luminance sub-blocks. Depending on the sampling direction, each 4×4 chrominance sub-block may correspond to two 4×4 luminance sub-blocks in the horizontal direction or to two 4×4 luminance sub-blocks in the vertical direction. In the current technology, for video in the 4:2:2 compression format, the motion vector of each 4×4 chrominance sub-block is obtained by averaging the motion vectors of the two corresponding 4×4 luminance sub-blocks in the horizontal direction, as shown in Figure 7, or by averaging the motion vectors of the two corresponding 4×4 luminance sub-blocks in the vertical direction, as shown in Figure 8.
In the current technology, the process of deriving the motion vector of a chrominance block is rather cumbersome, and there is room for improvement.
The embodiments of this application are described below with reference to Figure 9.
Figure 9 is a schematic flowchart of a video processing method provided by an embodiment of this application. The method in Figure 9 can be applied to the encoding end as well as to the decoding end.
In step S910, the chrominance block to be encoded or decoded is determined.
In step S920, the motion vector of one of the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block.
Each chrominance block has the same size as each luminance block. The block size is related to the inter-frame prediction mode; for example, when the inter-frame prediction mode is the Affine mode, the block size may be 4×4, i.e., the size of the chrominance block and the size of the luminance block are 4×4. It should be understood that under different inter-frame prediction modes or video coding standards, the block size may be defined differently.
The correspondence rule between chrominance blocks and luminance blocks is determined by the video compression format. For example, for video in the 4:2:0 compression format, each chrominance block corresponds to 4 luminance blocks; for 4:2:2, each chrominance block corresponds to 2 luminance blocks; for 4:4:4, each chrominance block corresponds to 1 luminance block; and for 4:1:1, each chrominance block corresponds to 4 luminance blocks.
In step S920, the motion vector of one of the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block; for ease of description and understanding, this luminance block is denoted as the target luminance block below.
For a chrominance block, the position of the target luminance block among the luminance blocks corresponding to the chrominance block can be specified by the protocol or pre-configured, or negotiated in advance by the encoding end and the decoding end; that is, the encoding end and the decoding end can obtain the motion vector of the chrominance block based on the same rule.
For example, in step S920, the motion vector of the target luminance block at a preset position among the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block. The preset position here means that the position of the target luminance block is stipulated by the protocol or pre-configured.
In the case where the chrominance block corresponds to multiple luminance blocks (for example, the compression format is 4:2:0, 4:2:2 or 4:1:1), in step S920 the motion vector of the target luminance block among the multiple luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block, where the target luminance block is the luminance block located at any one of the following positions among the multiple luminance blocks corresponding to the chrominance block:
the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
That is to say, the aforementioned preset position may be any one of the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
For example, when the video compression format is 4:2:0, the motion vector of the luminance block located in the upper-left, upper-right, lower-left or lower-right corner of the four luminance blocks corresponding to the chrominance block can be used as the motion vector of the chrominance block.
As another example, when the video compression format is 4:2:2, in the case where the chrominance block corresponds to two luminance blocks in the horizontal direction, the motion vector of the left or right luminance block of the two can be used as the motion vector of the chrominance block; in the case where the chrominance block corresponds to two luminance blocks in the vertical direction, the motion vector of the upper or lower luminance block of the two can be used as the motion vector of the chrominance block.
As yet another example, when the video compression format is 4:1:1, the motion vector of the first, second, third or fourth luminance block from the left among the four luminance blocks corresponding to the chrominance block can be used as the motion vector of the chrominance block.
In the case where the chrominance block corresponds to one luminance block, i.e., the compression format is 4:4:4, in step S920 the motion vector of the one luminance block corresponding to the chrominance block is used as the motion vector of the chrominance block.
It should be understood that, by directly using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, compared with the prior art, the process of obtaining the motion vector of the chrominance block through further computation based on the motion vectors of multiple luminance blocks can be avoided, which effectively simplifies the process of obtaining the motion vector of the chrominance block, thereby simplifying the coding and decoding process of the chrominance block, reducing coding and decoding complexity, and improving coding and decoding efficiency.
Optionally, in step S920, the motion vector of a luminance block at a fixed position (denoted as the target luminance block) among the luminance blocks corresponding to the chrominance block is used as the motion vector of the chrominance block.
The luminance block at the fixed position may be the luminance block at any position among the luminance blocks corresponding to the chrominance block.
For example, for the 4:2:0 video compression format, the target luminance block at the fixed position is the luminance block in the upper-left corner of the four luminance blocks corresponding to the chrominance block. As shown in Figure 10, in the Affine mode, the motion vector of a 4×4 chrominance sub-block is directly derived from the motion vector (MV) of the upper-left luminance sub-block among the four 4×4 luminance sub-blocks corresponding to that chrominance sub-block.
It should be understood that, in the 4:2:0 video compression format, using the motion vector of the upper-left luminance block among the four corresponding luminance blocks as the motion vector of the chrominance block can, comparatively speaking, improve the accuracy of the motion vector of the chrominance block.
For the 4:2:0 video compression format, the target luminance block at the fixed position may also be the luminance block located in the upper-right, lower-left or lower-right corner of the four luminance blocks corresponding to the chrominance block.
As another example, for the 4:2:2 video compression format, in the case where the chrominance block corresponds to two luminance blocks in the horizontal direction, the target luminance block at the fixed position is the left luminance block of the two; in the case where the chrominance block corresponds to two luminance blocks in the vertical direction, the target luminance block at the fixed position is the upper luminance block of the two.
As shown in Figure 11, chrominance block 1 corresponds to two luminance blocks 1-1 and 1-2 in the horizontal direction; the motion vector of luminance block 1-1 is mv1, and the motion vector of luminance block 1-2 is mv2. Chrominance block 2 corresponds to two luminance blocks 2-1 and 2-2 in the horizontal direction; the motion vector of luminance block 2-1 is mv3, and the motion vector of luminance block 2-2 is mv4. The motion vector of chrominance block 1 is directly derived from the motion vector mv1 of luminance block 1-1, i.e., mv1 is directly used as the motion vector of chrominance block 1. The motion vector of chrominance block 2 is directly derived from the motion vector mv3 of luminance block 2-1, i.e., mv3 is directly used as the motion vector of chrominance block 2.
It should be understood that, in the 4:2:2 video compression format, using the motion vector of the left or upper luminance block of the two corresponding luminance blocks as the motion vector of the chrominance block can, comparatively speaking, improve the accuracy of the motion vector of the chrominance block.
For the 4:2:2 video compression format, in the case where the chrominance block corresponds to two luminance blocks in the horizontal direction, the target luminance block at the fixed position may also be the right luminance block of the two; in the case where the chrominance block corresponds to two luminance blocks in the vertical direction, the target luminance block at the fixed position may also be the lower luminance block of the two.
As yet another example, for the 4:1:1 video compression format, the target luminance block at the fixed position is the leftmost luminance block among the four luminance blocks corresponding to the chrominance block, i.e., the first luminance block from the left.
For the 4:1:1 video compression format, the target luminance block at the fixed position may also be the second, third or fourth luminance block from the left among the four luminance blocks corresponding to the chrominance block.
As still another example, for the 4:4:4 video compression format, the target luminance block at the fixed position is simply the one luminance block corresponding to the chrominance block; that is, the motion vector of the chrominance block is directly derived from the motion vector of its one corresponding luminance block.
As shown in Figure 12, in the 4:4:4 video compression format, chrominance block 1 corresponds to luminance block 1-1, whose motion vector is mv1; chrominance block 2 corresponds to luminance block 2-1, whose motion vector is mv2; chrominance block 3 corresponds to luminance block 3-1, whose motion vector is mv3; and chrominance block 4 corresponds to luminance block 4-1, whose motion vector is mv4. The motion vector of chrominance block 1 is directly derived from the motion vector mv1 of luminance block 1-1, i.e., mv1 is directly used as the motion vector of chrominance block 1. The motion vector of chrominance block 2 is directly derived from the motion vector mv2 of luminance block 2-1, i.e., mv2 is directly used as the motion vector of chrominance block 2. The descriptions of chrominance block 3 and chrominance block 4 follow by analogy and are not repeated.
It should be understood that, by using the motion vector of a luminance block at a fixed position among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, the fixed position can be specified by the protocol or pre-configured, so the encoding end does not need to signal the fixed position to the decoding end, which can further simplify the process of obtaining the motion vector of the chrominance block.
Based on the above description, this application directly uses the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block; for any video compression format, no averaging operation needs to be performed in the process of obtaining the motion vector of the chrominance block. Compared with the prior art, this effectively simplifies the process of obtaining the motion vector of the chrominance block, thereby simplifying the coding and decoding process of the chrominance block, reducing coding and decoding complexity, and improving coding and decoding efficiency.
Optionally, in the method of Figure 9, the inter-frame prediction mode is the Affine model; that is, the motion vector of the luminance block is obtained from two or three control-point motion vectors.
For example, in the four-parameter Affine model, the motion vector of each luminance block is obtained from the upper-left and upper-right control-point motion vectors of the current block in which the luminance block is located. In the six-parameter Affine model, the motion vector of each luminance block is obtained from the upper-left, upper-right and lower-left control-point motion vectors of the current block in which the luminance block is located.
For example, when the inter-frame prediction mode is the Affine model, the luminance block may be called a luminance sub-block and the chrominance block may be called a chrominance sub-block.
This application does not limit the inter-frame prediction mode; for example, it may be a prediction mode other than the Affine mode. It should be understood that the way the motion vector of the luminance block is obtained depends on the inter-frame prediction mode. Besides the Affine mode, this application is applicable to any scenario in which the motion vector of a chrominance block is derived from the motion vector of a luminance block.
When the method of Figure 9 is applied to the encoding end, step S910 and step S920 are executed in the prediction stage of video encoding, after the motion vectors of the luminance blocks have been obtained. After step S920 is performed, the method of Figure 9 further includes: encoding the chrominance block by using the motion vector of the chrominance block.
For example, after step S920 is performed, the method of Figure 9 includes the transform, quantization and entropy coding processes shown in Figure 1, as well as the inverse quantization and inverse transform processes used to obtain the reconstructed frame.
When the method of Figure 9 is applied to the decoding end, the method may further include the following operations before step S910:
decoding the code stream to obtain the index of the control-point motion vector (CPMV) and, in the non-merge mode, also decoding the motion vector difference (MVD) of the control-point motion vector, so as to obtain the CPMV of each image block; and computing the motion vector of each luminance block from the CPMV according to the four-parameter or six-parameter model described above.
It should be understood that the human eye is more sensitive to luminance resolution than to chrominance resolution, and the luminance component contributes more to the video signal than the chrominance component. Obtaining the motion vector of the chrominance block by the method provided in the embodiments of this application does not significantly reduce coding performance.
It should also be understood that the embodiments of this application can be applied to scenarios with low requirements on chrominance coding; in such cases, the process of obtaining the motion vector of the chrominance block can be effectively simplified without affecting coding performance.
It should also be understood that the embodiments of this application are applicable to all scenarios that involve obtaining the motion vector of a chrominance block from the motion vector of a luminance block.
The method embodiments of this application have been described above; the apparatus embodiments are described below. It should be understood that the descriptions of the apparatus embodiments correspond to those of the method embodiments; for content not described in detail, reference may be made to the preceding method embodiments, which, for brevity, are not repeated here.
As shown in Figure 13, an embodiment of this application also provides a video processing apparatus. The apparatus may be an encoder or a decoder. The video processing apparatus includes a processor 1310 and a memory 1320. The memory 1320 is used to store instructions, the processor 1310 is used to execute the instructions stored in the memory 1320, and execution of the instructions stored in the memory 1320 causes the processor 1310 to perform the method of the above method embodiments.
By executing the instructions stored in the memory 1320, the processor 1310 performs the following operations: determining a chrominance block to be encoded or decoded; and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
Optionally, using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block includes: using the motion vector of a luminance block at a fixed position among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
Optionally, the chrominance block corresponds to multiple luminance blocks, and using the motion vector of one of the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block includes: using the motion vector of the target luminance block among the multiple luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, where the target luminance block is the luminance block located at any one of the following positions among the multiple luminance blocks corresponding to the chrominance block: the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
Optionally, the video compression format is 4:2:0, and the target luminance block is the luminance block located in the upper-left corner of the multiple luminance blocks corresponding to the chrominance block.
Optionally, the video compression format is 4:2:2; in the case where the chrominance block corresponds to two horizontal luminance blocks, the target luminance block is the left luminance block of the two luminance blocks corresponding to the chrominance block; or, in the case where the chrominance block corresponds to two vertical luminance blocks, the target luminance block is the upper luminance block of the two luminance blocks corresponding to the chrominance block.
Optionally, the video compression format is 4:1:1, and the target luminance block is the leftmost luminance block among the multiple luminance blocks corresponding to the chrominance block.
Optionally, the motion vector of the luminance block is obtained from two or three control-point motion vectors.
Optionally, the processor 1310 is further configured to perform the following operation: encoding or decoding the chrominance block by using the motion vector of the chrominance block.
Optionally, as shown in Figure 13, the video processing apparatus further includes a communication interface 1330 for transmitting signals with external devices.
When the video processing apparatus provided by the embodiment of this application is an encoder, the communication interface 1330 is used to receive the image or video data to be processed from an external device and is also used to send the encoded bitstream to the decoding end.
When the video processing apparatus provided by the embodiment of this application is a decoder, the communication interface 1330 is used to send the decoded data to an external device.
An embodiment of this application also provides a video processing system. The system includes an encoder and a decoder, both of which are used to execute the method of the above embodiments.
An embodiment of the present invention also provides a computer storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer executes the method of the above method embodiments.
An embodiment of the present invention also provides a computer program product containing instructions; when the instructions are executed by a computer, the computer executes the method of the above method embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When software is used, they may be implemented in whole or in part in the form of a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wirelessly (such as infrared, radio or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or data center, integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), etc.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the various embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
The above is only the specific implementation of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed in this application, and these shall all be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (22)

  1. A video processing method, characterized by comprising:
    determining a chrominance block to be encoded or decoded; and
    using the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  2. The method according to claim 1, characterized in that using the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block comprises:
    using the motion vector of a luminance block at a fixed position among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  3. The method according to claim 1 or 2, characterized in that the chrominance block corresponds to multiple luminance blocks;
    wherein using the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block comprises:
    using the motion vector of a target luminance block among the multiple luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, wherein the target luminance block is the luminance block located at any one of the following positions among the multiple luminance blocks corresponding to the chrominance block:
    the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
  4. The method according to claim 3, characterized in that the video compression format is 4:2:0, and the target luminance block is the luminance block located in the upper-left corner of the multiple luminance blocks corresponding to the chrominance block.
  5. The method according to claim 3, characterized in that the video compression format is 4:2:2;
    in the case where the chrominance block corresponds to two horizontal luminance blocks, the target luminance block is the left luminance block of the two luminance blocks corresponding to the chrominance block; or
    in the case where the chrominance block corresponds to two vertical luminance blocks, the target luminance block is the upper luminance block of the two luminance blocks corresponding to the chrominance block.
  6. The method according to claim 3, characterized in that the video compression format is 4:1:1, and the target luminance block is the leftmost luminance block among the multiple luminance blocks corresponding to the chrominance block.
  7. The method according to claim 3, characterized in that the video compression format is 4:4:4, and the target luminance block is the one luminance block corresponding to the chrominance block.
  8. The method according to claim 4 or 6, characterized in that the chrominance block corresponds to four luminance blocks.
  9. The method according to any one of claims 1 to 8, characterized in that the motion vector of the luminance block is obtained from two or three control-point motion vectors.
  10. The method according to any one of claims 1 to 9, characterized in that the method further comprises:
    encoding or decoding the chrominance block by using the motion vector of the chrominance block.
  11. A video processing apparatus, characterized by comprising:
    a memory for storing instructions; and
    a processor for executing the instructions stored in the memory and performing the following operations:
    determining a chrominance block to be encoded or decoded; and
    using the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  12. The apparatus according to claim 11, characterized in that using the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block comprises:
    using the motion vector of a luminance block at a fixed position among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block.
  13. The apparatus according to claim 11 or 12, characterized in that the chrominance block corresponds to multiple luminance blocks;
    wherein using the motion vector of one luminance block among the luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block comprises:
    using the motion vector of a target luminance block among the multiple luminance blocks corresponding to the chrominance block as the motion vector of the chrominance block, wherein the target luminance block is the luminance block located at any one of the following positions among the multiple luminance blocks corresponding to the chrominance block:
    the upper-left corner, the left, the top, the lower-left corner, the upper-right corner, the lower-right corner, the right, or the bottom.
  14. The apparatus according to claim 13, characterized in that the video compression format is 4:2:0, and the target luminance block is the luminance block located in the upper-left corner of the multiple luminance blocks corresponding to the chrominance block.
  15. The apparatus according to claim 13, characterized in that the video compression format is 4:2:2;
    in the case where the chrominance block corresponds to two horizontal luminance blocks, the target luminance block is the left luminance block of the two luminance blocks corresponding to the chrominance block; or
    in the case where the chrominance block corresponds to two vertical luminance blocks, the target luminance block is the upper luminance block of the two luminance blocks corresponding to the chrominance block.
  16. The apparatus according to claim 13, characterized in that the video compression format is 4:1:1, and the target luminance block is the leftmost luminance block among the multiple luminance blocks corresponding to the chrominance block.
  17. The apparatus according to claim 13, characterized in that the video compression format is 4:4:4, and the target luminance block is the one luminance block corresponding to the chrominance block.
  18. The apparatus according to claim 14 or 16, characterized in that the chrominance block corresponds to four luminance blocks.
  19. The apparatus according to any one of claims 11 to 18, characterized in that the motion vector of the luminance block is obtained from two or three control-point motion vectors.
  20. The apparatus according to any one of claims 11 to 19, characterized in that the processor is further configured to perform the following operation:
    encoding or decoding the chrominance block by using the motion vector of the chrominance block.
  21. A computer storage medium, characterized in that a computer program is stored thereon, and when the computer program is executed by a computer, the computer executes the method according to any one of claims 1 to 10.
  22. A computer program product containing instructions, characterized in that, when the instructions are executed by a computer, the computer executes the method according to any one of claims 1 to 10.
PCT/CN2019/130901 2019-12-31 2019-12-31 Method and apparatus for video processing WO2021134666A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980066718.1A 2019-12-31 2019-12-31 Method and apparatus for video processing
PCT/CN2019/130901 2019-12-31 2019-12-31 Method and apparatus for video processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/130901 WO2021134666A1 (zh) 2019-12-31 2019-12-31 视频处理的方法与装置

Publications (1)

Publication Number Publication Date
WO2021134666A1 (zh)

Family

ID=75854409

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130901 WO2021134666A1 (zh) 2019-12-31 2019-12-31 视频处理的方法与装置

Country Status (2)

Country Link
CN (1) CN112823520A (zh)
WO (1) WO2021134666A1 (zh)

Citations (4)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070014956A (ko) * 2005-07-29 2007-02-01 엘지전자 주식회사 Method for encoding and decoding a video signal
CN101883286A (zh) * 2010-06-25 2010-11-10 北京中星微电子有限公司 Calibration method and device in motion estimation, and motion estimation method and device
CN102724511A (zh) * 2012-06-28 2012-10-10 北京华控软件技术有限公司 Cloud transcoding compression system and method
CN109076210A (zh) * 2016-05-28 2018-12-21 联发科技股份有限公司 Method and apparatus of current picture referencing for video coding and decoding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NIU LIJUN: "Scalable AVS Video Codec System Staying in Motion", CHINESE MASTER'S THESES FULL-TEXT DATABASE, TIANJIN POLYTECHNIC UNIVERSITY, CN, 15 February 2015 (2015-02-15), CN, XP055827978, ISSN: 1674-0246 *

Also Published As

Publication number Publication date
CN112823520A (zh) 2021-05-18

Similar Documents

Publication Publication Date Title
KR102431537B1 (ko) Encoder, decoder and corresponding methods using an IBC-dedicated buffer and default value refreshing for luma and chroma components
CN111819852B (zh) Method and apparatus for residual sign prediction in the transform domain
JP7436543B2 (ja) Image decoding method and device based on affine motion prediction using an affine MVP candidate list in an image coding system
TW202044830A (zh) Method, apparatus and system for determining prediction weights for merge mode
CN113498605A (zh) Encoder, decoder and corresponding methods using an adaptive loop filter
WO2017129023A1 (zh) Decoding method, encoding method, decoding device and encoding device
JP7209819B2 (ja) Method and apparatus for video encoding and decoding
WO2020140243A1 (zh) Video image processing method and apparatus
Abou-Elailah et al. Fusion of global and local motion estimation for distributed video coding
CN111837389A (zh) Block detection method and apparatus suitable for multi-sign-bit hiding
CN112970256A (zh) Video coding based on globally motion-compensated motion vectors
WO2019128716A1 (zh) Image prediction method, apparatus and codec
CN113785573A (zh) Encoder, decoder and corresponding methods using an adaptive loop filter
TW201907715A (zh) Division-free bilateral filter
JP2019534631A (ja) Peak sample adaptive offset
CN113383550A (zh) Early termination of optical flow refinement
CN114466192A (zh) Image/video super-resolution
CN112913236A (zh) Encoder, decoder and corresponding methods using compressed MV storage
TW202114417A (zh) Video decoding device and video encoding device
US20220224912A1 Image encoding/decoding method and device using affine TMVP, and method for transmitting bit stream
KR102407912B1 (ko) Bidirectional intra prediction signalling
WO2019233423A1 (zh) Method and apparatus for obtaining a motion vector
CN112997499A (zh) Video coding based on globally motion-compensated motion vector predictors
WO2021134666A1 (zh) Method and apparatus for video processing
CN116250240A (zh) Image encoding method, image decoding method and related apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19958249

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19958249

Country of ref document: EP

Kind code of ref document: A1