WO2020042604A1 - Video encoder, video decoder and corresponding method - Google Patents


Info

Publication number
WO2020042604A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
block
control point
current
candidate motion
Prior art date
Application number
PCT/CN2019/079955
Other languages
French (fr)
Chinese (zh)
Inventor
Chen Huanbang (陈焕浜)
Yang Haitao (杨海涛)
Chen Jianle (陈建乐)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2020042604A1


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/513: Processing of motion vectors
    • H04N19/517: Processing of motion vectors by encoding
    • H04N19/52: Processing of motion vectors by encoding by predictive encoding
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/567: Motion estimation based on rate distortion criteria

Definitions

  • the present application relates to the technical field of video encoding and decoding, and in particular, to an inter prediction method and device for a video image, and a corresponding encoder and decoder.
  • Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones (so-called "smart phones"), video teleconferencing devices, video streaming devices, and the like.
  • Digital video devices implement video compression techniques, for example those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, and ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), in the H.265/High Efficiency Video Coding (HEVC) standard, and in extensions of such standards.
  • Video devices can implement such video compression techniques to transmit, receive, encode, decode, and/or store digital video information more efficiently.
  • Video compression techniques perform spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove the redundancy inherent in video sequences.
  • In block-based video coding, a video slice (i.e., a video frame or a portion of a video frame) may be partitioned into image blocks, which may also be referred to as tree blocks, coding units (CUs), and/or coding nodes.
  • Image blocks in an intra-coded (I) slice of an image are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same image.
  • the image blocks in an inter-coded (P or B) slice of an image may use spatial prediction relative to reference samples in neighboring blocks in the same image or temporal prediction relative to reference samples in other reference images.
  • An image may be referred to as a frame, and a reference image may be referred to as a reference frame.
  • various video coding standards including the High-Efficiency Video Coding (HEVC) standard have proposed predictive coding modes for image blocks, that is, predicting a current block to be coded based on a video data block that has been coded.
  • In intra prediction mode, the current block is predicted based on one or more previously decoded neighboring blocks in the same image as the current block; in inter prediction mode, the current block is predicted based on already decoded blocks in different images.
  • Motion vector prediction is a key technique that affects encoding/decoding performance.
  • Motion vector prediction methods include methods based on translational motion models, for translationally moving objects in the picture, and, for non-translationally moving objects, methods based on non-translational motion models and methods based on combinations of control point motion vectors.
  • Among these, the motion vector prediction method based on the motion model requires more memory reads, which slows encoding/decoding. How to reduce the amount of memory read during motion vector prediction is a technical issue under study by those skilled in the art.
  • The embodiments of the present application provide an inter prediction method and device for a video image, and a corresponding encoder and decoder, which can reduce memory reads to a certain extent, thereby improving encoding and decoding performance.
  • an embodiment of the present application discloses an encoding method including: determining a target candidate motion vector group from a candidate motion vector list (for example, an affine transformation candidate motion vector list) according to a rate distortion cost criterion;
  • The target candidate motion vector group represents a motion vector prediction value of a set of control points of a current coding block (which may specifically be a current affine coding block), wherein if the first neighboring affine coding block is a four-parameter affine coding block, and the first neighboring affine coding block is located in the coding tree unit (CTU) above the current coding block, the candidate motion vector list includes a first group of candidate motion vector prediction values, and the first group of candidate motion vector prediction values is obtained based on the lower left control point and the lower right control point of the first neighboring affine coding block. Optionally, the candidate motion vector list may be constructed as follows: one or more neighboring affine coding blocks of the current coding block, including the first neighboring affine coding block, are determined in the order of neighboring block A, neighboring block B, neighboring block C, neighboring block D, and neighboring block E (as shown in FIG. 7A); and
  • if the first neighboring affine coding block is a four-parameter affine coding block, a first affine model is used to obtain, based on its lower left control point and lower right control point, a motion vector prediction value of a first set of control points of the current coding block, where the motion vector prediction value of the first set of control points of the current coding block is used as a first group of candidate motion vectors of the candidate motion vector list.
  • An index corresponding to the target candidate motion vector group is coded into a code stream to be transmitted (optionally, when the length of the candidate motion vector list is 1, no index is needed to indicate the target candidate motion vector group).
  • the target candidate motion vector group is an optimal candidate motion vector group selected from a candidate motion vector list according to a rate distortion cost criterion.
  • If the first group of candidate motion vector prediction values is optimal, the selected target candidate motion vector group is the first group of candidate motion vector prediction values; if the first group of candidate motion vector prediction values is not optimal, the selected target candidate motion vector group is not the first group of candidate motion vector prediction values.
  • The first adjacent affine coding block is a four-parameter affine coding block among the neighboring blocks of the current coding block; which specific one is not limited here. Taking FIG. 7A as an example, it may be neighboring block A, or neighboring block B, or another neighboring block.
  • The terms “first”, “second”, “third”, etc. appearing elsewhere in the embodiments of the present application are used only to distinguish objects: the object indicated by “first”, the object indicated by “second”, and the object indicated by “third” refer to different objects.
  • For example, the first group of control points and the second group of control points refer to different control points; in addition, “first”, “second”, and the like in the embodiments of the present application carry no sequential meaning.
  • In this solution, the first set of control points includes the lower left control point and the lower right control point of the first neighboring affine coding block, instead of fixing the upper left control point, upper right control point, and lower left control point of the first adjacent coding block as the first group of control points as in the prior art (or fixing the upper left control point and the upper right control point of the first adjacent coding block as the first group of control points).
  • The information (for example, position coordinates, motion vectors, etc.) of the first set of control points can therefore be reused directly, thereby reducing memory reads.
  • Because the first neighboring affine coding block is specifically restricted to a four-parameter affine coding block, when the candidate motion vector is constructed based on the set of control points of the first neighboring affine coding block, only the lower left control point and the lower right control point of the first neighboring affine coding block are needed, and no additional control points are required, so it is further ensured that the amount of memory read is not too high.
  • The first set of candidate motion vector prediction values is used to represent motion vector prediction values of the upper left control point and the upper right control point of the current coding block.
  • The first affine model is determined based on the motion vectors and position coordinates of the lower left control point and the lower right control point of the first adjacent affine coding block.
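The application contains no code; as an illustrative sketch only, a four-parameter affine model defined by two horizontally aligned control points (such as the lower left and lower right control points above) can be evaluated at an arbitrary position as follows. All function and variable names here are assumptions for illustration, not part of the application.

```python
def affine_4param(mv_left, mv_right, pos_left, pos_right, x, y):
    """Evaluate a four-parameter affine motion model defined by two
    control points (e.g. the lower left and lower right control points
    of the neighboring affine block) at position (x, y)."""
    (vx0, vy0), (vx1, vy1) = mv_left, mv_right
    (x0, y0), (x1, _y1) = pos_left, pos_right
    w = x1 - x0                # horizontal distance between the control points
    a = (vx1 - vx0) / w        # model parameters (scale/rotation terms)
    b = (vy1 - vy0) / w
    vx = vx0 + a * (x - x0) - b * (y - y0)
    vy = vy0 + b * (x - x0) + a * (y - y0)
    return (vx, vy)
```

Evaluating this model at the control point positions of the current block would yield the candidate motion vector prediction values described in the surrounding bullets.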
  • The first set of candidate motion vector prediction values is used to represent motion vector prediction values of the upper left control point, the upper right control point, and the lower left control point of the current coding block.
  • The position coordinates of the upper left control point, the upper right control point, and the lower left control point of the current coding block are substituted into the first affine model, so as to obtain the motion vector prediction values of the upper left control point, the upper right control point, and the lower left control point of the current coding block.
  • The method further includes: using the target candidate motion vector group as a search starting point, searching within a preset search range, according to a rate distortion cost criterion, for the motion vectors of a group of control points with the lowest cost; and then determining a motion vector difference (MVD) between the motion vectors of the group of control points and the target candidate motion vector group. For example, if the first group of control points includes a first control point and a second control point, it is necessary to determine the MVD between the motion vector of the first control point and the motion vector prediction value of the first control point in the set of control points represented by the target candidate motion vector group, and to determine the MVD between the motion vector of the second control point and the motion vector prediction value of the second control point in the set of control points represented by the target candidate motion vector group.
  • The encoding of the index corresponding to the target candidate motion vector group into a code stream to be transmitted may specifically include: coding the index corresponding to the target candidate motion vector group, the reference frame index, and the prediction direction into the code stream to be transmitted.
  • The position coordinates (x6, y6) of the lower left control point and the position coordinates (x7, y7) of the lower right control point of the first adjacent affine coding block are both calculated from the position coordinates (x4, y4) of the upper left control point of the first adjacent affine coding block: the position coordinates (x6, y6) of the lower left control point are (x4, y4 + cuH), and the position coordinates (x7, y7) of the lower right control point are (x4 + cuW, y4 + cuH), where cuW is the width and cuH is the height of the first neighboring affine coding block. In addition, the motion vector of the lower left control point of the first neighboring affine coding block is the motion vector of its lower left sub-block, and the motion vector of the lower right control point is the motion vector of its lower right sub-block. It can be seen that the position coordinates of the lower left control point and the position coordinates of the lower right control point of the first adjacent affine coding block are derived rather than read from memory, so using this method can further reduce memory reads and improve encoding performance. As an alternative, the position coordinates of the lower left control point and the lower right control point may also be computed in advance and stored in memory, and read from memory when needed later.
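The coordinate derivation above is simple arithmetic; a minimal sketch (names are illustrative assumptions, not from the application):

```python
def lower_control_points(x4, y4, cuW, cuH):
    """Derive the lower left (x6, y6) and lower right (x7, y7) control
    point coordinates of the neighboring affine block from its upper
    left corner (x4, y4), width cuW and height cuH, rather than reading
    them from memory."""
    x6, y6 = x4, y4 + cuH          # lower left:  (x4, y4 + cuH)
    x7, y7 = x4 + cuW, y4 + cuH    # lower right: (x4 + cuW, y4 + cuH)
    return (x6, y6), (x7, y7)
```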
  • The method further includes: obtaining, based on the target candidate motion vector group, a motion vector of one or more sub-blocks of the current coding block; and predicting a pixel prediction value of the current coding block based on the motion vectors of the one or more sub-blocks of the current coding block.
  • When the motion vector of one or more sub-blocks of the current coding block is obtained based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU in which the current coding block is located, the motion vector of the lower left sub-block of the current coding block is calculated according to the target candidate motion vector group and the position coordinate (0, H) of the lower left corner of the current coding block, and the motion vector of the lower right sub-block of the current coding block is calculated according to the target candidate motion vector group and the position coordinates (W, H) of the lower right corner of the current coding block.
  • Specifically, an affine model is constructed based on the target candidate motion vector group, and the position coordinates (0, H) of the lower left corner of the current coding block are substituted into the affine model to obtain the motion vector of the lower left corner sub-block of the current coding block (instead of substituting the coordinates of the center point of the lower left corner sub-block into the affine model), and the position coordinates (W, H) of the lower right corner of the current coding block are substituted into the affine model to obtain the motion vector of the lower right corner sub-block of the current coding block (instead of substituting the coordinates of the center point of the lower right corner sub-block into the affine model).
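The corner-substitution step above can be sketched as follows; `affine_model` is assumed to be a callable `(x, y) -> (vx, vy)`, and all names are illustrative assumptions rather than the application's own API:

```python
def boundary_corner_mvs(affine_model, W, H):
    """When the lower boundary of the current block coincides with the
    lower CTU boundary, evaluate the affine model at the exact corner
    coordinates (0, H) and (W, H) instead of at the sub-block centers,
    so that the stored lower control point motion vectors are exact
    values rather than estimates."""
    mv_lower_left = affine_model(0, H)    # lower left corner (0, H)
    mv_lower_right = affine_model(W, H)   # lower right corner (W, H)
    return mv_lower_left, mv_lower_right
```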
  • In this way, when the motion vectors of the lower left control point and the lower right control point of the current coding block are used later (for example, when the candidate motion vector lists of subsequent blocks are constructed based on the motion vectors of the lower left control point and the lower right control point of the current block), accurate values are used instead of estimated values, where W is the width of the current coding block and H is the height of the current coding block.
  • an embodiment of the present application provides a video encoder, including several functional units for implementing any one of the methods in the first aspect.
  • a video encoder can include:
  • An inter prediction unit configured to determine a target candidate motion vector group from a candidate motion vector list according to a rate distortion cost criterion; the target candidate motion vector group represents a motion vector prediction value of a set of control points of a current coding block;
  • An entropy coding unit configured to encode an index corresponding to the target candidate motion vector group into a code stream and transmit the code stream;
  • wherein, if the first neighboring affine coding block is a four-parameter affine coding block and is located in the coding tree unit (CTU) above the current coding block, the candidate motion vector list includes a first group of candidate motion vector prediction values, and the first group of candidate motion vector prediction values is obtained based on the lower left control point and the lower right control point of the first neighboring affine coding block.
  • an embodiment of the present application provides a device for encoding video data, where the device includes:
  • Memory for storing video data in the form of a stream
  • a video encoder for: determining a target candidate motion vector group from a candidate motion vector list according to a rate-distortion cost criterion, where the target candidate motion vector group represents a motion vector prediction value of a set of control points of a current coding block; and coding an index corresponding to the target candidate motion vector group into a code stream and transmitting the code stream; wherein, if the first neighboring affine coding block is a four-parameter affine coding block and the first neighboring affine coding block is located in the coding tree unit (CTU) above the current coding block, the candidate motion vector list includes a first set of candidate motion vector prediction values, and the first set of candidate motion vector prediction values is obtained based on the lower left control point and the lower right control point of the first neighboring affine coding block.
  • An embodiment of the present application provides an encoding device, including a non-volatile memory and a processor coupled to each other, the processor calling program code stored in the memory to execute some or all of the steps of any one of the methods of the first aspect.
  • An embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores program code, and the program code includes instructions for performing some or all of the steps of any one of the methods of the first aspect.
  • an embodiment of the present application provides a computer program product, and when the computer program product runs on a computer, the computer is caused to execute part or all of the steps of any one of the methods of the first aspect.
  • an embodiment of the present application discloses a decoding method.
  • The method includes: parsing a bitstream to obtain an index, where the index is used to indicate a target candidate motion vector group of a current decoding block (which may specifically be a current affine decoding block); and then determining the target candidate motion vector group from a candidate motion vector list (for example, an affine transformation candidate motion vector list) according to the index (optionally, when the length of the candidate motion vector list is 1, it is not necessary to parse the bitstream to obtain the index, and the target candidate motion vector group can be determined directly).
  • The target candidate motion vector group represents the motion vector prediction value of a set of control points of the current decoding block, wherein if the first adjacent affine decoding block is a four-parameter affine decoding block and the first adjacent affine decoding block is located in the coding tree unit (CTU) above the current decoding block, the candidate motion vector list includes a first set of candidate motion vector prediction values, and the first group of candidate motion vector prediction values is obtained based on the lower left control point and the lower right control point of the first neighboring affine decoding block. Optionally, the candidate motion vector list may be constructed as follows: one or more neighboring affine decoded blocks of the current decoding block, including the first adjacent affine decoded block, are determined in the order of neighboring block A, neighboring block B, neighboring block C, neighboring block D, and neighboring block E (as shown in FIG. 7A); and if the first adjacent affine decoded block is a four-parameter affine decoded block, a first affine model is used to obtain, based on the lower left control point and the lower right control point of the first adjacent affine decoding block, the motion vector prediction value of the first set of control points of the current decoding block, which is used as the first set of candidate motion vectors of the candidate motion vector list.
  • An index corresponding to the target candidate motion vector group is coded into the code stream to be transmitted (optionally, when the length of the candidate motion vector list is 1, no index is needed to indicate the target candidate motion vector group).
  • the target candidate motion vector group is an optimal candidate motion vector group selected from a candidate motion vector list according to a rate distortion cost criterion.
  • If the first group of candidate motion vector prediction values is optimal, the selected target candidate motion vector group is the first group of candidate motion vector prediction values; if the first group of candidate motion vector prediction values is not optimal, the selected target candidate motion vector group is not the first group of candidate motion vector prediction values.
  • The first adjacent affine decoding block is a four-parameter affine decoding block among the neighboring blocks of the current decoding block; which specific one is not limited here. Taking FIG. 7A as an example, it may be neighboring block A, or neighboring block B, or another neighboring block.
  • The terms “first”, “second”, “third”, etc. appearing elsewhere in the embodiments of the present application are used only to distinguish objects: the object indicated by “first”, the object indicated by “second”, and the object indicated by “third” refer to different objects.
  • For example, the first group of control points and the second group of control points refer to different control points; in addition, “first”, “second”, and the like in the embodiments of the present application carry no sequential meaning.
  • In this solution, the first set of control points includes the lower left control point and the lower right control point of the first neighboring affine decoding block, instead of fixing the upper left control point, upper right control point, and lower left control point of the first adjacent decoding block as the first group of control points as in the prior art (or fixing the upper left control point and the upper right control point of the first adjacent decoding block as the first group of control points).
  • The information (for example, position coordinates, motion vectors, etc.) of the first set of control points can therefore be reused directly, thereby reducing memory reads.
  • Because the first neighboring affine decoding block is specifically restricted to a four-parameter affine decoding block, when the candidate motion vector is constructed based on the set of control points of the first neighboring affine decoding block, only the lower left control point and the lower right control point of the first neighboring affine decoding block are needed, and no additional control points are required, so it is further ensured that the amount of memory read is not too high.
  • The first set of candidate motion vector prediction values is used to represent motion vector prediction values of the upper left control point and the upper right control point of the current decoding block.
  • The first affine model is determined based on the motion vectors and position coordinates of the lower left control point and the lower right control point of the first adjacent affine decoding block.
  • The first set of candidate motion vector prediction values is used to represent motion vector prediction values of the upper left control point, the upper right control point, and the lower left control point of the current decoding block.
  • The position coordinates of the upper left control point, the upper right control point, and the lower left control point of the current decoding block are substituted into the first affine model, so as to obtain the motion vector prediction values of the upper left control point, the upper right control point, and the lower left control point of the current decoding block.
  • The obtaining of the motion vector of one or more sub-blocks of the current decoding block based on the target candidate motion vector group is specifically: obtaining, based on a second affine model, a motion vector of one or more sub-blocks of the current decoding block (for example, substituting the coordinates of the center point of each of the one or more sub-blocks into the second affine model to obtain their motion vectors), where the second affine model is determined based on the target candidate motion vector group and the position coordinates of a set of control points of the current decoding block.
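The sub-block center evaluation described above can be sketched as follows. The 4x4 sub-block size and the callable `affine_model` (taking `(x, y)` and returning `(vx, vy)`) are illustrative assumptions, not values fixed by the application:

```python
def subblock_center_mvs(affine_model, W, H, sb=4):
    """Evaluate the (second) affine model at the center of every
    sb x sb sub-block of a W x H block, returning a per-sub-block
    motion vector keyed by sub-block index."""
    mvs = {}
    for y0 in range(0, H, sb):
        for x0 in range(0, W, sb):
            cx, cy = x0 + sb / 2.0, y0 + sb / 2.0  # sub-block center
            mvs[(x0 // sb, y0 // sb)] = affine_model(cx, cy)
    return mvs
```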
  • Optionally, the obtaining of a motion vector of one or more sub-blocks of the current decoding block based on the target candidate motion vector group may specifically include: obtaining a new candidate motion vector group from the motion vector difference (MVD) obtained from the code stream and the target candidate motion vector group indicated by the index; determining a second affine model based on the new candidate motion vector group and the position coordinates of a set of control points of the current decoded block; and then obtaining, based on the second affine model, a motion vector of one or more sub-blocks of the current decoding block.
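The combination of the parsed MVDs with the predicted control point motion vectors is element-wise addition; a minimal sketch (names are illustrative assumptions):

```python
def apply_mvd(mvp_group, mvd_group):
    """Add the motion vector differences (MVDs) parsed from the code
    stream to the predicted control point motion vectors (the target
    candidate group) to recover the control point motion vectors used
    to build the second affine model."""
    return [(px + dx, py + dy)
            for (px, py), (dx, dy) in zip(mvp_group, mvd_group)]
```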
  • The obtaining of a pixel prediction value of the current decoded block based on the motion vector of one or more sub-blocks of the current decoded block may specifically include: obtaining, by prediction, a pixel prediction value of the current decoding block according to the motion vector of one or more sub-blocks of the current decoding block and the reference frame index and prediction direction indicated by the index.
  • The position coordinates (x6, y6) of the lower left control point and the position coordinates (x7, y7) of the lower right control point of the first adjacent affine decoding block are both calculated from the position coordinates (x4, y4) of the upper left control point of the first adjacent affine decoding block: the position coordinates (x6, y6) of the lower left control point are (x4, y4 + cuH), and the position coordinates (x7, y7) of the lower right control point are (x4 + cuW, y4 + cuH), where cuW is the width and cuH is the height of the first neighboring affine decoding block. In addition, the motion vector of the lower left control point of the first neighboring affine decoding block is the motion vector of its lower left sub-block, and the motion vector of the lower right control point is the motion vector of its lower right sub-block.
  • It can be seen that the position coordinates of the lower left control point and the position coordinates of the lower right control point of the first adjacent affine decoding block are both derived rather than read from memory, so using this method can further reduce memory reads and improve decoding performance.
  • Alternatively, the position coordinates of the lower left control point and the lower right control point may also be computed in advance and stored in memory, and read from memory when needed later.
  • The motion vector of the lower left corner sub-block of the current decoding block is calculated according to the target candidate motion vector group and the position coordinate (0, H) of the lower left corner of the current decoding block, and the motion vector of the lower right corner sub-block of the current decoding block is calculated according to the target candidate motion vector group and the position coordinates (W, H) of the lower right corner of the current decoding block.
  • Specifically, an affine model is constructed based on the target candidate motion vector group, and the position coordinates (0, H) of the lower left corner of the current decoded block are substituted into the affine model to obtain the motion vector of the lower left corner sub-block of the current decoded block (instead of substituting the coordinates of the center point of the lower left corner sub-block into the affine model), and the position coordinates (W, H) of the lower right corner of the current decoded block are substituted into the affine model to obtain the motion vector of the lower right corner sub-block of the current decoded block (instead of substituting the coordinates of the center point of the lower right corner sub-block into the affine model).
  • In this way, when the motion vectors of the lower left control point and the lower right control point of the current decoding block are used later (for example, when the candidate motion vector lists of subsequent blocks are constructed based on the motion vectors of the lower left control point and the lower right control point of the current block), accurate values are used instead of estimated values.
  • W is the width of the current decoded block
  • H is the height of the current decoded block.
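To make the computation above concrete, the following is a minimal sketch (with made-up control-point motion vectors and block dimensions, not the fixed-point arithmetic of any real codec) of a four-parameter affine model evaluated directly at the corner coordinates (0, H) and (W, H), rather than at sub-block centers, so that exact control-point values are available for later blocks:

```python
def affine_4param_mv(cp0_mv, cp1_mv, w, x, y):
    """Evaluate a four-parameter affine motion model defined by the motion
    vectors cp0_mv at the upper-left corner (0, 0) and cp1_mv at the
    upper-right corner (w, 0); returns the motion vector at (x, y)."""
    mvx0, mvy0 = cp0_mv
    mvx1, mvy1 = cp1_mv
    a = (mvx1 - mvx0) / w  # horizontal change of the x-component
    b = (mvy1 - mvy0) / w  # horizontal change of the y-component
    return (mvx0 + a * x - b * y, mvy0 + b * x + a * y)

# Hypothetical current decoding block of width W = 16 and height H = 8, with a
# target candidate motion vector group for its upper-left/upper-right points:
W, H = 16, 8
cp_upper_left, cp_upper_right = (4.0, 2.0), (6.0, 2.0)

# Corner coordinates (0, H) and (W, H) are substituted directly (not the
# sub-block center coordinates), so exact control-point values are stored:
mv_lower_left = affine_4param_mv(cp_upper_left, cp_upper_right, W, 0, H)
mv_lower_right = affine_4param_mv(cp_upper_left, cp_upper_right, W, W, H)
```

With these example values, the lower-left and lower-right control-point motion vectors come out of the same model that produced the sub-block vectors, so no extra memory read is needed when a subsequent block references them.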
  • an embodiment of the present application provides a video decoder.
  • the video decoder includes:
  • An entropy decoding unit configured to parse a code stream to obtain an index, where the index is used to indicate a target candidate motion vector group of a current decoding block;
• An inter prediction unit configured to determine the target candidate motion vector group from a candidate motion vector list according to the index, where the target candidate motion vector group represents motion vector prediction values of a set of control points of the current decoding block; where, if the first adjacent affine decoding block is a four-parameter affine decoding block and the first adjacent affine decoding block is located above the coding tree unit (CTU) of the current decoding block, the candidate motion vector list includes a first set of candidate motion vector prediction values, the first set of candidate motion vector prediction values being obtained based on the lower left control point and the lower right control point of the first adjacent affine decoding block; the inter prediction unit is further configured to obtain motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group, and to predict a pixel prediction value of the current decoding block based on the motion vectors of the one or more sub-blocks of the current decoding block.
  • an embodiment of the present application provides a device for decoding video data, where the device includes:
• a memory configured to store video data in the form of a bitstream;
• a video decoder configured to parse a bitstream to obtain an index, the index indicating a target candidate motion vector group of a current decoding block; determine the target candidate motion vector group from a candidate motion vector list according to the index, where the target candidate motion vector group represents motion vector prediction values of a set of control points of the current decoding block, and where, if the first adjacent affine decoding block is a four-parameter affine decoding block and the first adjacent affine decoding block is located above the CTU of the current decoding block, the candidate motion vector list includes a first set of candidate motion vector prediction values obtained based on the lower left control point and the lower right control point of the first adjacent affine decoding block; obtain motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group; and, based on the motion vectors of the one or more sub-blocks of the current decoding block, predict a pixel prediction value of the current decoding block.
• an embodiment of the present application provides a decoding device, including: a non-volatile memory and a processor coupled to each other, the processor invoking program code stored in the memory to execute some or all of the steps of any one of the methods of the seventh aspect.
• an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores program code, and the program code includes instructions for performing some or all of the steps of any one of the methods of the seventh aspect.
• an embodiment of the present application provides a computer program product; when the computer program product runs on a computer, the computer is caused to execute some or all of the steps of any one of the methods of the seventh aspect.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system according to an embodiment of the present application
  • FIG. 2A is a schematic block diagram of a video encoder according to an embodiment of the present application.
• FIG. 2B is a schematic block diagram of a video decoder according to an embodiment of the present application.
  • FIG. 3 is a flowchart of a method for inter prediction of an encoded video image according to an embodiment of the present application
  • FIG. 4 is a flowchart of a method for decoding an inter prediction of a video image according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of motion information of a current image block and a reference block in an embodiment of the present application
  • FIG. 6 is a schematic flowchart of an encoding method according to an embodiment of the present application.
  • FIG. 7A is a schematic diagram of a scenario of an adjacent block provided by an embodiment of the present application.
  • FIG. 7B is a schematic diagram of a scenario of an adjacent block provided by an embodiment of the present application.
  • FIG. 8A is a schematic structural diagram of a motion compensation unit according to an embodiment of the present application.
  • FIG. 8B is a schematic structural diagram of still another motion compensation unit according to an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a decoding method according to an embodiment of the present application.
• FIG. 9A is a schematic flowchart of constructing a candidate motion vector list according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an encoding device or a decoding device according to an embodiment of the present invention.
• FIG. 11 is a schematic diagram of a video encoding system 1100 including the encoder 100 of FIG. 2A and/or the decoder 200 of FIG. 2B according to an exemplary embodiment.
• Non-translational motion model prediction refers to the use of the same motion model on the codec side (e.g., at the video encoder and video decoder ends) to derive the motion information (such as a motion vector) of each sub motion compensation unit (also known as a sub-block) in the current encoding/decoding block, and the performing of motion compensation according to the motion information of the sub motion compensation units to obtain a prediction block, thereby improving prediction efficiency.
  • the process of deriving motion information of a motion compensation unit (also called a sub-block) in the current encoding / decoding block involves motion vector prediction based on the motion model.
• the position coordinates and motion vectors of the upper left control point, the upper right control point, and the lower left control point of an adjacent affine decoding block of the current encoding/decoding block are usually used to derive an affine model; the motion vector prediction values of a set of control points of the current encoding/decoding block are then derived according to the affine model as a set of candidate motion vector prediction values in the candidate motion vector list.
  • the position coordinates and motion vectors of the upper-left control point, upper-right control point, lower-left control point of adjacent affine decoding blocks used in the motion vector prediction process need to be read from memory in real time, which will increase the memory read pressure.
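As an illustration of the derivation described above, the following sketch (hypothetical coordinates and motion vectors, in floating point rather than a codec's scaled integers) builds a six-parameter affine model from the three control points of a neighboring affine decoding block and evaluates it at the control-point positions of the current block to obtain candidate motion vector predictors:

```python
def affine_6param_mv(cps, x, y):
    """Evaluate a six-parameter affine model defined by three control points.
    cps = [((x0, y0), (mvx0, mvy0)),   # upper-left control point
           ((x1, y1), (mvx1, mvy1)),   # upper-right (same row as upper-left)
           ((x2, y2), (mvx2, mvy2))]   # lower-left (same column as upper-left)
    Returns the motion vector predicted at position (x, y)."""
    (x0, y0), (mvx0, mvy0) = cps[0]
    (x1, _), (mvx1, mvy1) = cps[1]
    (_, y2), (mvx2, mvy2) = cps[2]
    w, h = x1 - x0, y2 - y0
    mvx = mvx0 + (mvx1 - mvx0) / w * (x - x0) + (mvx2 - mvx0) / h * (y - y0)
    mvy = mvy0 + (mvy1 - mvy0) / w * (x - x0) + (mvy2 - mvy0) / h * (y - y0)
    return (mvx, mvy)

# Hypothetical 16x16 neighboring affine decoding block at (0, 0):
neighbour_cps = [((0, 0), (1.0, 1.0)),
                 ((16, 0), (3.0, 1.0)),
                 ((0, 16), (1.0, 5.0))]

# Candidate motion vector predictors for the control points of a current block
# whose upper-left corner is at (16, 0) and whose width is 8:
mvp_upper_left = affine_6param_mv(neighbour_cps, 16, 0)
mvp_upper_right = affine_6param_mv(neighbour_cps, 24, 0)
```

Note that every evaluation needs the neighbor's three control-point coordinates and motion vectors; this is exactly the data whose real-time memory reads the embodiments aim to reduce.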
  • the embodiment of the present application focuses on how to reduce the memory read pressure, which involves the optimization of the encoding and decoding ends.
  • the application scenario of the embodiment of the present application is first introduced below.
• Encoding a video stream, or a portion thereof such as a video frame or an image block, can use temporal and spatial similarities in the video stream to improve encoding performance.
• for the current image block of a video stream, motion information can be predicted based on previously encoded blocks in the video stream, and the difference (known as the residual) between the predicted block and the current image block (that is, the original block) can be identified, thereby encoding the current image block based on the previously encoded block.
  • the motion vector is an important parameter in the inter prediction process, which represents the spatial displacement of a previously coded block relative to the current coded block.
  • Motion vectors can be obtained using motion estimation methods, such as motion search.
• the bits representing the motion vector are included in the encoded bitstream to allow the decoder to reproduce the predicted block and then obtain the reconstructed block.
• it was later proposed to differentially encode the motion vector using a reference motion vector; that is, instead of encoding the entire motion vector, only the difference between the motion vector and the reference motion vector is encoded.
• the reference motion vector may be selected from previously used motion vectors in the video stream. Selecting a previously used motion vector to encode the current motion vector can further reduce the number of bits included in the encoded video bitstream.
  • FIG. 1 is a block diagram of a video decoding system 1 according to an example described in the embodiment of the present application.
• the term "video coder" generally refers to both video encoders and video decoders.
• the terms "video coding" or "coding" may generally refer to video encoding or video decoding.
• the video encoder 100 and the video decoder 200 of the video decoding system 1 are configured to predict the motion information of a current coded image block or its sub-blocks according to various method examples described for any of a variety of new inter prediction modes proposed in the present application, so that the predicted motion vector is as close as possible to the motion vector obtained using a motion estimation method; in this way, the motion vector difference does not need to be transmitted during encoding, thereby further improving the encoding and decoding performance.
  • the video decoding system 1 includes a source device 10 and a destination device 20.
  • the source device 10 generates encoded video data. Therefore, the source device 10 may be referred to as a video encoding device.
  • the destination device 20 may decode the encoded video data generated by the source device 10. Therefore, the destination device 20 may be referred to as a video decoding device.
  • Various implementations of the source device 10, the destination device 20, or both may include one or more processors and a memory coupled to the one or more processors.
  • the memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other media that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein.
• the source device 10 and the destination device 20 may include various devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, or the like.
  • the destination device 20 may receive the encoded video data from the source device 10 via the link 30.
  • the link 30 may include one or more media or devices capable of moving the encoded video data from the source device 10 to the destination device 20.
  • the link 30 may include one or more communication media enabling the source device 10 to directly transmit the encoded video data to the destination device 20 in real time.
  • the source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to the destination device 20.
  • the one or more communication media may include wireless and / or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
  • the one or more communication media may include a router, a switch, a base station, or other devices that facilitate communication from the source device 10 to the destination device 20.
  • the encoded data may be output from the output interface 140 to the storage device 40.
  • the encoded data can be accessed from the storage device 40 through the input interface 240.
  • the storage device 40 may include any of a variety of distributed or locally-accessed data storage media, such as a hard drive, Blu-ray disc, DVD, CD-ROM, flash memory, volatile or non-volatile memory, Or any other suitable digital storage medium for storing encoded video data.
  • the storage device 40 may correspond to a file server or another intermediate storage device that may hold the encoded video produced by the source device 10.
  • the destination device 20 may access the stored video data from the storage device 40 via streaming or download.
  • the file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to the destination device 20.
  • Example file servers include a web server (eg, for a website), an FTP server, a network attached storage (NAS) device, or a local disk drive.
  • the destination device 20 can access the encoded video data through any standard data connection, including an Internet connection.
  • This may include a wireless channel (eg, a Wi-Fi connection), a wired connection (eg, DSL, cable modem, etc.), or a combination of both suitable for accessing encoded video data stored on a file server.
  • the transmission of the encoded video data from the storage device 40 may be a streaming transmission, a download transmission, or a combination of the two.
  • the motion vector prediction technology of the present application can be applied to video codecs to support a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), for storage in data storage Encoding of video data on media, decoding of video data stored on data storage media, or other applications.
  • the video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and / or video telephony.
• the video decoding system 1 illustrated in FIG. 1 is merely an example, and the techniques of the present application can be applied to video decoding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device.
  • data is retrieved from local storage, streamed over a network, and so on.
  • the video encoding device may encode the data and store the data to a memory, and / or the video decoding device may retrieve the data from the memory and decode the data.
• encoding and decoding are performed by devices that do not communicate with each other, but simply encode data to memory and/or retrieve data from memory and decode it.
  • the source device 10 includes a video source 120, a video encoder 100, and an output interface 140.
• the output interface 140 may include a modulator/demodulator (modem) and/or a transmitter.
• Video source 120 may include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of these sources of video data.
  • the video encoder 100 may encode video data from the video source 120.
  • the source device 10 transmits the encoded video data directly to the destination device 20 via the output interface 140.
  • the encoded video data may also be stored on the storage device 40 for later access by the destination device 20 for decoding and / or playback.
  • the destination device 20 includes an input interface 240, a video decoder 200, and a display device 220.
  • the input interface 240 includes a receiver and / or a modem.
  • the input interface 240 may receive the encoded video data via the link 30 and / or from the storage device 40.
  • the display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. Generally, the display device 220 displays decoded video data.
  • the display device 220 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
• video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software to handle encoding of both audio and video in a common or separate data stream.
  • the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol, or other protocols such as the User Datagram Protocol (UDP), if applicable.
• Video encoder 100 and video decoder 200 may each be implemented as any of a variety of circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the present application is implemented partially in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may use one or more processors to execute the instructions in hardware, thereby implementing the technology of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered as one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, any of which may be integrated as part of a combined encoder/decoder (codec) in a corresponding device.
  • This application may generally refer to video encoder 100 as “signaling” or “transmitting” certain information to another device, such as video decoder 200.
• the terms "signaling" or "transmitting" may generally refer to the transmission of syntax elements and/or other data used to decode the compressed video data. This transfer can occur in real time or almost real time. Alternatively, this communication may occur after a period of time, such as when a syntax element is stored to a computer-readable storage medium in an encoded bitstream at the time of encoding; the decoding device may then retrieve the syntax element at any time after it has been stored to this medium.
  • the video encoder 100 and the video decoder 200 may operate according to a video compression standard such as High Efficiency Video Coding (HEVC) or an extension thereof, and may conform to the HEVC test model (HM).
  • video encoder 100 and video decoder 200 may also operate according to other industry standards, such as the ITU-T H.264, H.265 standards, or extensions of such standards.
  • the techniques of this application are not limited to any particular codec standard.
• the video encoder 100 is configured to encode syntax elements related to the image block to be currently encoded into a digital video output bitstream (referred to as a bitstream or a code stream); here, the syntax elements used for inter prediction of the current image block are referred to as inter prediction data for short. To determine the inter prediction mode used to encode the current image block, the video encoder 100 is further configured to determine (S301) the inter prediction mode in the candidate inter prediction mode set used to perform inter prediction on the current image block (for example, selecting, from a variety of new inter prediction modes, the inter prediction mode with the best trade-off between code rate and distortion, or the minimum rate-distortion cost, for the current image block).
• the encoding process herein may include predicting motion information of one or more sub-blocks in the current image block (specifically, motion information of each sub-block or of all sub-blocks) based on the determined inter prediction mode, and using the motion information of the one or more sub-blocks in the current image block to perform inter prediction on the current image block.
• the video encoder 100 only needs to encode the syntax elements related to the image block to be encoded into a bitstream (also known as a code stream); otherwise, in addition to the syntax elements, the corresponding residual also needs to be encoded into the bitstream.
• the video decoder 200 is configured to decode syntax elements related to the image block to be decoded from the bitstream (S401), and, when the inter prediction data indicates that a mode in the candidate inter prediction mode set (that is, a new inter prediction mode) is adopted, to determine (S403) the inter prediction mode in the candidate inter prediction mode set for performing inter prediction on the current image block.
• the decoding process herein may include predicting motion information of one or more sub-blocks in the current image block based on the determined inter prediction mode, and performing inter motion prediction on the current image block by using the motion information of the one or more sub-blocks in the current image block.
• if the inter prediction data includes a second identifier used to indicate which inter prediction mode is used by the current image block, the video decoder 200 is configured to determine that the inter prediction mode indicated by the second identifier is the inter prediction mode for performing inter prediction on the current image block; or, if the inter prediction data does not include such a second identifier, the video decoder 200 is configured to determine that a first inter prediction mode used for a non-directional motion field is the inter prediction mode used for inter prediction of the current image block.
  • FIG. 2A is a block diagram of a video encoder 100 according to an example described in the embodiment of the present application.
  • the video encoder 100 is configured to output a video to the post-processing entity 41.
  • the post-processing entity 41 represents an example of a video entity that can process the encoded video data from the video encoder 100, such as a media-aware network element (MANE) or a stitching / editing device.
  • the post-processing entity 41 may be an instance of a network entity.
• the post-processing entity 41 and the video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to the post-processing entity 41 may be performed by the same device that includes the video encoder 100.
  • the post-processing entity 41 is an example of the storage device 40 of FIG. 1.
• the video encoder 100 may perform encoding of a video image block, for example, perform inter prediction on a video image block, according to any new inter prediction mode in the candidate inter prediction mode set including modes 0, 1, 2, ... or 10 proposed in the present application.
• the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a decoded picture buffer (DPB) 107, a summing unit 112, a transform unit 101, a quantization unit 102, and an entropy encoding unit 103.
  • the prediction processing unit 108 includes an inter prediction unit 110 and an intra prediction unit 109.
  • the video encoder 100 further includes an inverse quantization unit 104, an inverse transform unit 105, and a summing unit 111.
  • the filter unit 106 is intended to represent one or more loop filtering units, such as a deblocking filtering unit, an adaptive loop filtering unit (ALF), and a sample adaptive offset (SAO) filtering unit.
  • the filtering unit 106 is shown as an in-loop filter in FIG. 2A, in other implementations, the filtering unit 106 may be implemented as a post-loop filter.
  • the video encoder 100 may further include a video data storage unit and a segmentation unit (not shown in the figure).
  • the video data storage unit may store video data to be encoded by the components of the video encoder 100.
  • the video data stored in the video data storage unit may be obtained from the video source 120.
  • the DPB 107 may be a reference image storage unit that stores reference video data used by the video encoder 100 to encode video data in an intra-frame or inter-frame decoding mode.
• the video data storage unit and the DPB 107 may be formed by any of a variety of memory devices, such as a dynamic random access memory (DRAM), a magnetoresistive RAM (MRAM), a resistive RAM (RRAM), or other types of memory devices.
  • the video data storage unit and the DPB 107 may be provided by the same storage unit device or a separate storage unit device.
  • the video data storage unit may be on-chip with other components of video encoder 100, or off-chip with respect to those components.
  • the video encoder 100 receives video data and stores the video data in a video data storage unit.
  • the segmentation unit divides the video data into several image blocks, and these image blocks can be further divided into smaller blocks, such as image block segmentation based on a quad tree structure or a binary tree structure. This segmentation may also include segmentation into slices, tiles, or other larger units.
• Video encoder 100 typically illustrates components that encode image blocks within a video slice to be encoded. The slice can be divided into multiple image blocks (and possibly into sets of image blocks called tiles).
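A quad-tree partition of the kind mentioned above can be sketched as follows; `should_split` is a hypothetical decision callback standing in for the encoder's actual rate-distortion-driven split decision:

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Recursively partition a square region into coding blocks, returning a
    list of (x, y, size) leaves; should_split decides whether to subdivide."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):        # visit the four quadrants in raster order
        for dx in (0, half):
            leaves += quadtree_split(x + dx, y + dy, half, min_size,
                                     should_split)
    return leaves

# Example: split any block larger than 8x8 into its four quadrants.
blocks = quadtree_split(0, 0, 16, 8, lambda x, y, s: s > 8)
```

The same recursion shape applies to binary-tree splitting, with two children per node instead of four.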
• the prediction processing unit 108 may select one of a plurality of possible decoding modes for the current image block, such as one of a plurality of intra-coding modes or one of a plurality of inter-coding modes, where the multiple inter-frame decoding modes include, but are not limited to, one or more of the modes 0, 1, 2, 3 ... 10 proposed in the present application.
• the prediction processing unit 108 may provide the obtained intra-coded or inter-coded block to the summing unit 112 to generate a residual block, and to the summing unit 111 to reconstruct an encoded block used as a reference image.
  • the intra prediction unit 109 within the prediction processing unit 108 may perform intra predictive encoding of the current image block with respect to one or more neighboring blocks in the same frame or slice as the current block to be encoded to remove spatial redundancy.
  • the inter-prediction unit 110 within the prediction processing unit 108 may perform inter-predictive encoding of the current image block with respect to one or more prediction blocks in the one or more reference images to remove temporal redundancy.
  • the inter prediction unit 110 may be configured to determine an inter prediction mode for encoding a current image block.
• the inter prediction unit 110 may use rate-distortion analysis to calculate rate-distortion values of the various inter prediction modes in the candidate inter prediction mode set, and select the inter prediction mode having the best rate-distortion characteristics from among them.
• Rate-distortion analysis generally determines the amount of distortion (or error) between the coded block and the original uncoded block that was coded to produce the coded block, as well as the bit rate (that is, the number of bits) used to generate the coded block.
• the inter prediction unit 110 may determine that the inter prediction mode with the lowest rate-distortion cost for encoding the current image block in the set of candidate inter prediction modes is the inter prediction mode used for inter prediction of the current image block.
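The mode selection just described can be sketched as a minimum-cost search over candidate modes, using the common Lagrangian cost J = D + λ·R; the mode names, distortions, and rates below are made up for illustration:

```python
def select_inter_mode(candidates, lam):
    """Return the mode with the lowest rate-distortion cost J = D + lam * R.
    candidates maps a mode name to a (distortion, rate_in_bits) pair."""
    return min(candidates,
               key=lambda m: candidates[m][0] + lam * candidates[m][1])

# Made-up (distortion, rate) pairs for three candidate inter prediction modes:
modes = {"mode0": (100.0, 40), "mode1": (80.0, 50), "mode2": (120.0, 20)}

best_low_lambda = select_inter_mode(modes, lam=1.0)   # favours low distortion
best_high_lambda = select_inter_mode(modes, lam=2.0)  # favours low bit rate
```

Raising λ shifts the winner toward cheaper-to-code modes, which is the trade-off the rate-distortion analysis in the surrounding text is balancing.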
• the following describes the inter-predictive coding process in detail, especially the process of predicting motion information of one or more sub-blocks (specifically, each sub-block or all sub-blocks) in the various inter prediction modes for non-directional or directional motion fields in this application.
  • the inter prediction unit 110 is configured to predict motion information (such as a motion vector) of one or more subblocks in the current image block based on the determined inter prediction mode, and use the motion information (such as the motion vector) of one or more subblocks in the current image block. Motion vector) to obtain or generate a prediction block of the current image block.
  • the inter prediction unit 110 may locate a prediction block pointed to by the motion vector in one of the reference image lists.
  • the inter prediction unit 110 may also generate syntax elements associated with image blocks and video slices for use by the video decoder 200 when decoding image blocks of the video slices.
• the inter prediction unit 110 uses the motion information of each sub-block to perform a motion compensation process to generate a prediction block of each sub-block, thereby obtaining a prediction block of the current image block. It should be understood that the inter prediction unit 110 performs the motion estimation and motion compensation processes.
• the inter prediction unit 110 may provide information indicating the selected inter prediction mode of the current image block to the entropy encoding unit 103, so that the entropy encoding unit 103 encodes the information indicating the selected inter prediction mode. The transmitted bitstream may include inter prediction data related to the current image block, which may include a first identifier block_based_enable_flag to indicate whether a new inter prediction mode proposed by the present application is adopted for inter prediction of the current image block; optionally, a second identifier block_based_index may also be included to indicate which new inter prediction mode is used by the current image block.
  • a process of predicting a motion vector of a current image block or a sub-block thereof using motion vectors of multiple reference blocks will be described in detail below.
• the intra prediction unit 109 may perform intra prediction on the current image block. Specifically, the intra prediction unit 109 may determine the intra prediction mode used to encode the current block. For example, the intra prediction unit 109 may use rate-distortion analysis to calculate rate-distortion values for the various intra prediction modes to be tested, and select the intra prediction mode with the best rate-distortion characteristics from the tested modes. In any case, after the intra prediction mode is selected for the image block, the intra prediction unit 109 may provide information indicating the selected intra prediction mode of the current image block to the entropy encoding unit 103, so that the entropy encoding unit 103 encodes the information indicating the selected intra prediction mode.
  • the video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded.
  • the summing unit 112 represents one or more components that perform this subtraction operation.
  • the residual video data in the residual block may be included in one or more TUs and applied to the transform unit 101.
  • the transform unit 101 transforms the residual video data into residual transform coefficients using a transform such as a discrete cosine transform (DCT) or a conceptually similar transform.
  • the transform unit 101 may transform the residual video data from a pixel value domain to a transform domain, such as a frequency domain.
  • the transformation unit 101 may send the obtained transformation coefficient to the quantization unit 102.
  • the quantization unit 102 quantizes the transform coefficients to further reduce the bit rate.
  • the quantization unit 102 may then perform a scan of a matrix containing the quantized transform coefficients.
  • the entropy encoding unit 103 may perform scanning.
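  • The forward path just described (transform unit 101, quantization unit 102, then a coefficient scan before entropy coding) can be sketched as follows. This is an illustrative sketch only: real encoders use integer approximations of the DCT and rate-controlled quantization parameters, and the function names, the floating-point DCT, and the fixed quantization step here are assumptions for illustration.

```python
# Illustrative forward path: 2-D DCT of a residual block, uniform
# quantization, and a zigzag-style scan ordering low frequencies first.
import math

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N residual block (orthonormal, float)."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def quantize(coeffs, step):
    """Uniform scalar quantization: divide by the step, truncate toward zero."""
    return [[int(c / step) for c in row] for row in coeffs]

def zigzag(matrix):
    """Scan an N x N matrix along anti-diagonals (low frequencies first)."""
    n = len(matrix)
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[1] if (p[0] + p[1]) % 2 else p[0]))
    return [matrix[u][v] for u, v in order]

residual = [[4, 4, 4, 4]] * 4          # a flat 4x4 residual block
q = quantize(dct_2d(residual), step=8)
scanned = zigzag(q)                    # DC coefficient first
```

For a flat residual, only the DC coefficient survives quantization; the scan then places that single nonzero coefficient first, which is what makes the subsequent entropy coding of the (mostly zero) tail efficient.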
  • After quantization, the entropy encoding unit 103 performs entropy encoding on the quantized transform coefficients. For example, the entropy encoding unit 103 may perform context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique. After entropy encoding by the entropy encoding unit 103, the encoded bitstream may be transmitted to the video decoder 200, or archived for later transmission or retrieval by the video decoder 200. The entropy encoding unit 103 may also perform entropy encoding on the syntax elements of the current image block to be encoded.
  • the inverse quantization unit 104 and the inverse transform unit 105 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, for example, for later use as a reference block of a reference image.
  • the summing unit 111 adds the reconstructed residual block to a prediction block generated by the inter prediction unit 110 or the intra prediction unit 109 to generate a reconstructed image block.
  • the filter unit 106 may be applied to reconstructed image blocks to reduce distortion, such as blocking artifacts. The reconstructed image block is then stored as a reference block in the decoded image buffer unit 107 and can be used as a reference block by the inter prediction unit 110 to perform inter prediction on blocks in subsequent video frames or images.
  • for some image blocks or image frames, the video encoder 100 may directly quantize the residual signal without processing by the transform unit 101, and correspondingly without processing by the inverse transform unit 105; alternatively, for some image blocks or image frames, the video encoder 100 may not generate residual data, and correspondingly needs no processing by the transform unit 101, the quantization unit 102, the inverse quantization unit 104, and the inverse transform unit 105; alternatively, the video encoder 100 may store the reconstructed image blocks directly as reference blocks without processing by the filter unit 106; alternatively, the quantization unit 102 and the inverse quantization unit 104 in the video encoder 100 may be merged together.
  • the loop filtering unit is optional, and in the case of lossless compression coding, the transform unit 101, the quantization unit 102, the inverse quantization unit 104, and the inverse transform unit 105 are optional. It should be understood that, according to different application scenarios, the inter prediction unit and the intra prediction unit may be selectively enabled, and in this case, the inter prediction unit is enabled.
  • FIG. 2B is a block diagram of a video decoder 200 according to an example described in the embodiment of the present application.
  • the video decoder 200 includes an entropy decoding unit 203, a prediction processing unit 208, an inverse quantization unit 204, an inverse transform unit 205, a summing unit 211, a filter unit 206, and a decoded image buffer unit 207.
  • the prediction processing unit 208 may include an inter prediction unit 210 and an intra prediction unit 209.
  • video decoder 200 may perform a decoding process that is substantially inverse to the encoding process described with respect to the video encoder 100 in FIG. 2A.
  • the video decoder 200 receives from the video encoder 100 an encoded video bitstream representing image blocks of the encoded video slice and associated syntax elements.
  • the video decoder 200 may receive video data from the network entity 42, optionally, the video data may also be stored in a video data storage unit (not shown in the figure).
  • the video data storage unit may store video data, such as an encoded video bit stream, to be decoded by the components of the video decoder 200.
  • the video data stored in the video data storage unit can be obtained, for example, from the storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium.
  • the video data storage unit may function as a coded picture buffer (CPB) for storing encoded video data from the encoded video bitstream. Therefore, although the video data storage unit is not illustrated in FIG. 2B, the video data storage unit and the DPB 207 may be the same storage unit, or may be separately disposed storage units.
  • the video data storage unit and DPB 207 can be formed by any of a variety of storage unit devices, such as: dynamic random access memory unit (DRAM) including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), and resistive RAM (RRAM), or other types of memory cell devices.
  • the video data storage unit may be integrated on a chip with other components of the video decoder 200, or disposed off-chip relative to those components.
  • the network entity 42 may be, for example, a server, a media-aware network element (MANE), a video editor/splicer, or other such device for implementing one or more of the techniques described above.
  • the network entity 42 may or may not include a video encoder, such as video encoder 100.
  • the network entity 42 may implement some of the techniques described in this application.
  • the network entity 42 and the video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to the network entity 42 may be performed by the same device including the video decoder 200.
  • the network entity 42 may be an example of the storage device 40 of FIG. 1.
  • the entropy decoding unit 203 of the video decoder 200 performs entropy decoding on the bit stream to generate quantized coefficients and some syntax elements.
  • the entropy decoding unit 203 forwards the syntax element to the prediction processing unit 208.
  • Video decoder 200 may receive syntax elements at a video slice level and / or an image block level.
  • the intra prediction unit 209 of the prediction processing unit 208 may generate prediction blocks for image blocks of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or image.
  • the inter prediction unit 210 of the prediction processing unit 208 may determine, based on the syntax elements received from the entropy decoding unit 203, an inter prediction mode for decoding the current image block of the current video slice, and decode the current image block (for example, perform inter prediction) based on the determined inter prediction mode.
  • specifically, the inter prediction unit 210 may determine whether a new inter prediction mode is used for predicting the current image block of the current video slice. If the syntax elements indicate that a new inter prediction mode is used to predict the current image block, the inter prediction unit 210 predicts the motion information of the current image block or a sub-block of the current image block based on the new inter prediction mode (for example, a new inter prediction mode specified by a syntax element, or a default new inter prediction mode), and then obtains or generates a prediction block for the current image block or its sub-block through a motion compensation process using the predicted motion information.
  • the motion information here may include reference image information and motion vectors, where the reference image information may include but is not limited to unidirectional / bidirectional prediction information, a reference image list number, and a reference image index corresponding to the reference image list.
  • a prediction block may be generated from one of reference pictures within one of the reference picture lists.
  • the video decoder 200 may construct a reference image list, that is, a list 0 and a list 1, based on the reference images stored in the DPB 207.
  • the reference frame index of the current image may be included in one or more of the reference frame list 0 and list 1.
  • the video encoder 100 may signal a specific syntax element to indicate whether a new inter prediction mode is used to decode a specific block, or may signal to indicate both whether a new inter prediction mode is used and which new inter prediction mode is used to decode a specific block.
  • the inter prediction unit 210 here performs a motion compensation process. In the following, the inter-prediction process of using the motion information of the reference block to predict the motion information of the current image block or a sub-block of the current image block under various new inter-prediction modes will be explained in detail.
  • the inverse quantization unit 204 inverse quantizes the quantized transform coefficients provided in the bitstream and decoded by the entropy decoding unit 203, that is, dequantization.
  • the inverse quantization process may include using a quantization parameter calculated by the video encoder 100 for each image block in the video slice to determine the degree of quantization that should be applied and similarly to determine the degree of inverse quantization that should be applied.
  • the inverse transform unit 205 applies an inverse transform to transform coefficients, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process in order to generate a residual block in the pixel domain.
  • After the inter prediction unit 210 generates the prediction block for the current image block or a sub-block of the current image block, the video decoder 200 sums the residual block from the inverse transform unit 205 and the corresponding prediction block generated by the inter prediction unit 210 to obtain the reconstructed block, that is, the decoded image block.
  • the summing unit 211 represents a component that performs this summing operation.
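  • The summing operation described above can be sketched as a minimal function. The function name and the clipping of the result to the valid sample range (which the text does not mention, but which practical decoders perform) are assumptions for illustration.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add the inverse-transformed residual to the prediction block
    (the role of summing unit 211) and clip each sample to the valid
    range for the given bit depth."""
    lo, hi = 0, (1 << bit_depth) - 1
    return [[min(hi, max(lo, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, resid)]
```

For example, a prediction sample of 250 plus a residual of 10 clips to 255 for 8-bit video, while 10 plus -20 clips to 0.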
  • where necessary, a loop filtering unit may also be used (in the decoding loop or after the decoding loop) to smooth pixel transitions or otherwise improve video quality.
  • the filter unit 206 may represent one or more loop filtering units, such as a deblocking filtering unit, an adaptive loop filtering unit (ALF), and a sample adaptive offset (SAO) filtering unit.
  • the filtering unit 206 is shown as an in-loop filtering unit in FIG. 2B, in other implementations, the filtering unit 206 may be implemented as a post-loop filtering unit.
  • the filter unit 206 is applied to the reconstructed block to reduce blocking distortion, and the result is output as a decoded video stream.
  • the decoded image block in a given frame or image may also be stored in the decoded image buffer unit 207, and the decoded image buffer unit 207 stores a reference image for subsequent motion compensation.
  • the decoded image buffer unit 207 may be part of a storage unit, which may also store the decoded video for later presentation on a display device, such as the display device 220 of FIG. 1, or may be separate from such a storage unit.
  • the video decoder 200 may generate an output video stream without processing by the filter unit 206; alternatively, for certain image blocks or image frames, the entropy decoding unit 203 of the video decoder 200 does not obtain quantized coefficients by decoding, and accordingly no processing by the inverse quantization unit 204 and the inverse transform unit 205 is needed.
  • the loop filtering unit is optional; and in the case of lossless compression, the inverse quantization unit 204 and the inverse transform unit 205 are optional.
  • the inter prediction unit and the intra prediction unit may be selectively enabled, and in this case, the inter prediction unit is enabled.
  • FIG. 5 is a schematic diagram illustrating motion information of an exemplary current image block 600 and a reference block in an embodiment of the present application.
  • W and H are the width and height of the current image block 600 and of its collocated block 600' located at the same position as the current image block 600.
  • the reference blocks of the current image block 600 include: the upper spatial neighboring blocks and the left spatial neighboring blocks of the current image block 600, and the lower spatial neighboring blocks and the right spatial neighboring blocks of the collocated block 600', where the collocated block 600' is an image block in the reference image having the same size, shape, and coordinates as the current image block 600.
  • the motion information of the lower spatial neighboring blocks and the right spatial neighboring blocks of the current image block does not exist, as those blocks have not been encoded yet.
  • the current image block 600 and the collocated block 600 ' may be any block size.
  • the current image block 600 and the collocated block 600 ' may include, but are not limited to, 16x16 pixels, 32x32 pixels, 32x16 pixels, 16x32 pixels, and the like.
  • each image frame can be divided into image blocks for encoding.
  • image blocks can be further divided into smaller blocks; for example, the current image block 600 and the collocated block 600' can each be divided into multiple MxN sub-blocks, that is, each sub-block is MxN pixels in size, and each reference block is also MxN pixels in size, the same as the sub-blocks of the current image block.
  • the coordinates in FIG. 5 are measured in MxN blocks.
  • M ⁇ N and M times N are used interchangeably to refer to the pixel size of an image block according to the horizontal and vertical dimensions, that is, there are M pixels in the horizontal direction and N pixels in the vertical direction. Where M and N represent non-negative integer values.
  • the block does not necessarily need to have the same number of pixels in the horizontal direction as in the vertical direction.
  • the image block described in this application can be understood as, but not limited to, a prediction unit (PU), a coding unit (CU), or a transformation unit (TU).
  • a CU may include one or more prediction units PU, or the PU and the CU have the same size.
  • Image blocks can have fixed or variable sizes and differ in size according to different video compression codec standards.
  • the current image block refers to an image block to be encoded or decoded currently, such as a prediction unit to be encoded or decoded.
  • it is sequentially determined along direction 1 whether each left spatial neighboring block of the current image block 600 is available, and sequentially determined along direction 2 whether each upper spatial neighboring block of the current image block 600 is available; for example, it is determined whether the neighboring blocks (also referred to as reference blocks; the two terms are used interchangeably) are inter-coded. If a neighboring block exists and is inter-coded, the neighboring block is available; if a neighboring block does not exist or is intra-coded, the neighboring block is unavailable. If a neighboring block is intra-coded, the motion information of another available neighboring reference block is copied as the motion information of that block. Whether the lower spatial neighboring blocks and the right spatial neighboring blocks of the collocated block 600' are available is determined in a similar manner, and is not described here.
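  • The availability rule described above can be sketched as follows. The dictionary representation of a neighboring block, the function names, and the scan-order copy direction for intra-coded neighbors are assumptions for illustration; the text does not specify which available neighbor donates its motion information.

```python
def is_available(neighbor):
    """A neighboring (reference) block is available only if it exists
    and was inter-coded, per the rule described above."""
    return neighbor is not None and neighbor.get("mode") == "inter"

def first_pass_availability(neighbors):
    """Walk neighbors in the given scan order and mark availability."""
    return [is_available(nb) for nb in neighbors]

def fill_motion_info(neighbors):
    """Collect motion vectors in scan order; for an unavailable neighbor,
    borrow the motion information of the previously seen available one
    (assumed copy direction)."""
    mv = [nb.get("mv") if is_available(nb) else None for nb in neighbors]
    last = None
    for i, v in enumerate(mv):
        if v is not None:
            last = v
        elif last is not None:
            mv[i] = last
    return mv
```

An intra-coded or absent neighbor is thus never a motion-information source itself; it only inherits from an available inter-coded neighbor.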
  • the motion information of an available reference block can be fetched directly; if the size of the available reference block is, for example, 8x4 or 8x8, the motion information of its center 4x4 block can be obtained and used as the motion information of the available reference block.
  • the coordinates of the top left vertex of the center 4x4 block relative to the top left vertex of the reference block are ((W / 4) / 2 * 4, (H / 4) / 2 * 4).
  • the division operation is an integer division operation.
  • for example, for an 8x4 reference block, the coordinates of the upper-left corner vertex of the center 4x4 block relative to the upper-left corner vertex of the reference block are (4, 0).
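  • The coordinate formula above can be checked with a few lines of code; here W and H are taken to be the width and height of the reference block, and the function name is an assumption for illustration.

```python
def center_4x4_offset(w, h):
    """Top-left corner of the center 4x4 block, relative to the reference
    block's top-left corner, using the integer division stated above:
    ((W/4)/2*4, (H/4)/2*4)."""
    return ((w // 4) // 2 * 4, (h // 4) // 2 * 4)

# the example from the text: an 8x4 reference block gives (4, 0)
assert center_4x4_offset(8, 4) == (4, 0)
```

For an 8x8 block the offset is (4, 4), and for a 16x16 block it is (8, 8), i.e. always the 4x4 block just right of and below the block center.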
  • the motion information of the 4x4 block in the upper left corner of the reference block may also be acquired as the motion information of the available reference block, but the application is not limited thereto.
  • in the following, an MxN sub-block is referred to simply as a sub-block, and a neighboring MxN block simply as a neighboring block.
  • FIG. 6 is a flowchart illustrating a process 700 of an encoding method according to an embodiment of the present application.
  • the process 700 may be performed by the video encoder 100.
  • the process 700 may be performed by the inter prediction unit 110 and the entropy coding unit (also referred to as an entropy encoder) 103 of the video encoder 100.
  • the process 700 is described as a series of steps or operations. It should be understood that the steps of process 700 may be performed in various orders and/or concurrently, and are not limited to the execution order shown in FIG. 6. Assume that a video data stream with multiple video frames is being encoded by the video encoder.
  • if the first neighboring affine coding block is located in the coding tree unit (CTU) above the current coding block, the lower-left control point and the lower-right control point of the first neighboring affine coding block are used to determine a set of candidate motion vector prediction values, corresponding to the process shown in FIG. 6; the related description is as follows:
  • Step S700: the video encoder determines an inter prediction mode of a current coding block.
  • the inter prediction mode may be an advanced motion vector prediction (AMVP) mode, or may be a merge mode.
  • if the inter prediction mode is the AMVP mode, steps S711-S713 are performed.
  • if the inter prediction mode is the merge mode, steps S721-S723 are performed.
  • Step S711: the video encoder constructs a candidate motion vector prediction value MVP list.
  • the video encoder uses an inter prediction unit (also referred to as an inter prediction module) to construct a candidate motion vector prediction value MVP list (also referred to as a candidate motion vector list).
  • the candidate motion vector prediction value MVP list can be a triplet candidate motion vector prediction value MVP list or a 2-tuple candidate motion vector prediction value MVP list; the two construction methods are as follows:
  • Method 1: a motion vector prediction method based on a motion model is used to construct the candidate motion vector prediction value MVP list.
  • all or part of neighboring blocks of the current coding block are traversed in a predetermined order to determine the neighboring affine coding blocks, and the number of the determined neighboring affine coding blocks may be one or more.
  • the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in order to determine neighboring affine coding blocks in the neighboring blocks A, B, C, D, and E.
  • the inter prediction unit determines at least one set of candidate motion vector prediction values (each set being a 2-tuple or a 3-tuple) according to at least one neighboring affine coding block; one neighboring affine coding block is taken as an example below, and this neighboring affine coding block is called the first neighboring affine coding block, as follows:
  • a first affine model is determined according to a motion vector of a control point of a first neighboring affine coding block, and then a motion vector of a control point of the current coding block is predicted according to the first affine model.
  • the method of predicting the motion vector of the control points of the current coding block based on the motion vectors of the control points of the first neighboring affine coding block also differs between cases, so the description below is made on a case-by-case basis.
  • the parameter model of the current coding block is a 4-parameter affine transformation model.
  • the derivation method can be: the position coordinates and motion vectors of the bottom two control points of the first neighboring affine coding block are obtained first; for example, the position coordinates (x 6 , y 6 ) and motion vector (vx 6 , vy 6 ) of the lower-left control point of the first neighboring affine coding block, and the position coordinates (x 7 , y 7 ) and motion vector (vx 7 , vy 7 ) of the lower-right control point can be obtained.
  • a first affine model is formed according to the motion vectors and coordinate positions of the bottom two control points of the first adjacent affine coding block (the first affine model obtained at this time is a 4-parameter affine model).
  • the motion vector of the control points of the current coding block is predicted according to the first affine model. For example, the position coordinates of the upper-left control point and the position coordinates of the upper-right control point of the current coding block may be substituted into the first affine model respectively, thereby predicting the motion vector of the upper-left control point and the motion vector of the upper-right control point of the current coding block, as shown in formulas (1) and (2).
  • among them, (x 0 , y 0 ) are the coordinates of the upper-left control point of the current coding block, and (x 1 , y 1 ) are the coordinates of the upper-right control point of the current coding block; in addition, (vx 0 , vy 0 ) is the predicted motion vector of the upper-left control point of the current coding block, and (vx 1 , vy 1 ) is the predicted motion vector of the upper-right control point of the current coding block.
  • it should be noted that the position coordinates (x 6 , y 6 ) of the lower-left control point and the position coordinates (x 7 , y 7 ) of the lower-right control point of the first neighboring affine coding block are both calculated from the position coordinates (x 4 , y 4 ) of the upper-left control point of the first neighboring affine coding block, where the position coordinates (x 6 , y 6 ) of the lower-left control point are (x 4 , y 4 + cuH), and the position coordinates (x 7 , y 7 ) of the lower-right control point are (x 4 + cuW, y 4 + cuH); cuW is the width of the first neighboring affine coding block, and cuH is the height of the first neighboring affine coding block.
  • in addition, the motion vector of the lower-left control point of the first neighboring affine coding block is the motion vector of the lower-left sub-block of the first neighboring affine coding block, and the motion vector of the lower-right control point of the first neighboring affine coding block is the motion vector of its lower-right sub-block.
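  • Formulas (1) and (2) are referenced above but not reproduced in this excerpt. The sketch below therefore assumes the standard 4-parameter affine form (one rotation/zoom pair shared by both vector components), built from the bottom two control points as described; the function names are illustrative, and since both control points lie on the neighbor's bottom edge, x 7 - x 6 equals cuW.

```python
def four_param_affine(p6, v6, p7, v7):
    """Build the first affine model from the lower-left (p6, v6) and
    lower-right (p7, v7) control points of the neighboring affine block,
    returning a function that maps a position to a predicted MV.
    Assumed standard 4-parameter form; not reproduced from the patent."""
    (x6, y6), (vx6, vy6) = p6, v6
    (x7, y7), (vx7, vy7) = p7, v7
    w = x7 - x6                  # equals cuW: both points on the bottom edge
    a = (vx7 - vx6) / w          # zoom/rotation parameters
    b = (vy7 - vy6) / w
    def predict(x, y):
        return (vx6 + a * (x - x6) - b * (y - y6),
                vy6 + b * (x - x6) + a * (y - y6))
    return predict

# e.g. predict the current block's top-left control-point MV from a
# purely translational neighbor (both control-point MVs identical):
model = four_param_affine((0, 8), (1.0, 0.0), (8, 8), (1.0, 0.0))
```

With identical control-point motion vectors the model degenerates to pure translation, so every substituted position yields the same predicted motion vector.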
  • if the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block and the first neighboring affine coding block is a six-parameter affine coding block, a candidate motion vector prediction value for the control points of the current block is not generated based on the first neighboring affine coding block.
  • the manner of predicting the motion vector of the control point of the current coding block is not limited here.
  • an optional determination method is also exemplified below:
  • the position coordinates and motion vectors of the three control points of the first neighboring affine coding block may be obtained, for example, the position coordinates (x 4 , y 4 ) and motion vector (vx 4 , vy 4 ) of the upper-left control point, the position coordinates (x 5 , y 5 ) and motion vector (vx 5 , vy 5 ) of the upper-right control point, and the position coordinates (x 6 , y 6 ) and motion vector (vx 6 , vy 6 ) of the lower-left control point.
  • a 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first adjacent affine coding block.
  • the position coordinates (x 0 , y 0 ) of the upper-left control point and the position coordinates (x 1 , y 1 ) of the upper-right control point of the current coding block are substituted into the 6-parameter affine model to predict the motion vector of the upper-left control point and the motion vector of the upper-right control point of the current coding block, as shown in formulas (4) and (5).
  • among them, (vx 0 , vy 0 ) is the predicted motion vector of the upper-left control point of the current coding block, and (vx 1 , vy 1 ) is the predicted motion vector of the upper-right control point of the current coding block.
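  • Formulas (4) and (5) are likewise referenced but not reproduced here. The sketch below assumes the standard 6-parameter affine form built from the three control points listed above; the function names are illustrative only. By construction, the model exactly reproduces the motion vector at each of its three control points.

```python
def six_param_affine(p4, v4, p5, v5, p6, v6):
    """Build a 6-parameter affine model from the upper-left (p4, v4),
    upper-right (p5, v5) and lower-left (p6, v6) control points of the
    neighboring affine block. Assumed standard 6-parameter form; not
    reproduced from the patent."""
    (x4, y4), (vx4, vy4) = p4, v4
    (x5, y5), (vx5, vy5) = p5, v5
    (x6, y6), (vx6, vy6) = p6, v6
    w, h = x5 - x4, y6 - y4          # neighbor block width and height
    def predict(x, y):
        return (vx4 + (vx5 - vx4) / w * (x - x4) + (vx6 - vx4) / h * (y - y4),
                vy4 + (vy5 - vy4) / w * (x - x4) + (vy6 - vy4) / h * (y - y4))
    return predict

# an 8x8 neighbor with three distinct control-point MVs:
model6 = six_param_affine((0, 0), (1, 2), (8, 0), (3, 2), (0, 8), (1, 6))
```

Substituting the current block's control-point coordinates into `predict` yields the candidate motion vector predictors for the current block's control points.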
  • the parameter model of the current coding block is a 6-parameter affine transformation model.
  • the derivation method can be:
  • if the first neighboring affine coding block is located in the CTU above the current coding block and the first neighboring affine coding block is a four-parameter affine coding block, the position coordinates and motion vectors of the bottom two control points of the first neighboring affine coding block are obtained; for example, the position coordinates (x 6 , y 6 ) and motion vector (vx 6 , vy 6 ) of the lower-left control point, and the position coordinates (x 7 , y 7 ) and motion vector (vx 7 , vy 7 ) of the lower-right control point can be obtained.
  • a first affine model is formed according to the motion vectors of the two control points at the bottom of the first adjacent affine coding block (the first affine model obtained at this time is a 4-parameter affine model).
  • the motion vector of the control point of the current coding block is predicted according to the first affine model.
  • for example, the position coordinates of the upper-left control point, the upper-right control point, and the lower-left control point of the current coding block may be substituted into the first affine model respectively, to predict the motion vectors of the upper-left control point, the upper-right control point, and the lower-left control point of the current coding block, as shown in formulas (1), (2), and (3).
  • Formulas (1) and (2) have been described above.
  • among them, (x 0 , y 0 ) are the coordinates of the upper-left control point of the current coding block, (x 1 , y 1 ) are the coordinates of the upper-right control point, and (x 2 , y 2 ) are the coordinates of the lower-left control point; (vx 0 , vy 0 ) is the predicted motion vector of the upper-left control point of the current coding block, (vx 1 , vy 1 ) is the predicted motion vector of the upper-right control point, and (vx 2 , vy 2 ) is the predicted motion vector of the lower-left control point.
  • if the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block and the first neighboring affine coding block is a six-parameter affine coding block, a candidate motion vector prediction value for the control points of the current block is not generated based on the first neighboring affine coding block.
  • the manner of predicting the motion vector of the control point of the current coding block is not limited here.
  • an optional determination method is also exemplified below:
  • the position coordinates and motion vectors of the three control points of the first neighboring affine coding block may be obtained, for example, the position coordinates (x 4 , y 4 ) and motion vector (vx 4 , vy 4 ) of the upper-left control point, the position coordinates (x 5 , y 5 ) and motion vector (vx 5 , vy 5 ) of the upper-right control point, and the position coordinates (x 6 , y 6 ) and motion vector (vx 6 , vy 6 ) of the lower-left control point.
  • a 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first adjacent affine coding block.
  • the position coordinates (x 0 , y 0 ) of the upper-left control point, the position coordinates (x 1 , y 1 ) of the upper-right control point, and the position coordinates (x 2 , y 2 ) of the lower-left control point of the current coding block are substituted into the 6-parameter affine model to predict the motion vector of the upper-left control point, the motion vector of the upper-right control point, and the motion vector of the lower-left control point of the current coding block, as shown in formulas (4), (5), and (6).
  • Formulas (4) and (5) have been described previously.
  • (vx 0 , vy 0 ) is the predicted motion vector of the upper-left control point of the current coding block
  • (vx 1 , vy 1 ) are the motion vectors of the predicted upper right control point of the current coding block
  • (vx 2 , vy 2 ) are the motion vectors of the predicted lower left control point of the current coding block.
  • Method 2: a motion vector prediction method based on a combination of control points is used to construct the candidate motion vector prediction value MVP list.
  • the way of constructing the candidate motion vector prediction value MVP list differs depending on the parameter model of the current coding block, as described below.
  • the parameter model of the current coding block is a 4-parameter affine transformation model.
  • the derivation method can be:
  • the motion information of the upper left vertex and the upper right vertex of the current coding block is estimated by using the motion information of the coded blocks adjacent to the current coding block.
  • the motion vector of a coded block A and/or B and/or C adjacent to the upper-left vertex is used as a candidate motion vector for the motion vector of the upper-left vertex of the current coding block; the motion vector of a coded block D and/or E adjacent to the upper-right vertex is used as a candidate motion vector for the motion vector of the upper-right vertex of the current coding block.
  • a candidate motion vector of the upper-left vertex and a candidate motion vector of the upper-right vertex are combined to obtain a set of candidate motion vector prediction values; the multiple sets obtained by combining in this manner can form the candidate motion vector prediction value MVP list.
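  • The combination step above can be sketched as follows for the 4-parameter (2-tuple) case. The tuple representation of motion vectors, the function name, and the simple duplicate pruning are assumptions for illustration; the neighbor labels A/B/C and D/E follow FIG. 7B.

```python
from itertools import product

def build_two_tuple_candidates(top_left_mvs, top_right_mvs, max_len=None):
    """Combine each candidate MV for the upper-left vertex (from coded
    neighbors A/B/C) with each candidate MV for the upper-right vertex
    (from D/E) into 2-tuple candidate motion vector prediction values."""
    candidates = []
    for mv0, mv1 in product(top_left_mvs, top_right_mvs):
        pair = (mv0, mv1)
        if pair not in candidates:   # simple pruning of exact duplicates
            candidates.append(pair)
    return candidates[:max_len] if max_len is not None else candidates
```

The 6-parameter case extends this to triples by also iterating over the lower-left-vertex candidates.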
  • the parameter model of the current coding block is a 6-parameter affine transformation model.
  • the derivation method can be:
  • the motion information of the upper-left vertex, the upper-right vertex, and the lower-left vertex of the current coding block is estimated by using the motion information of the coded blocks adjacent to the current coding block.
  • FIG. 7B First, the motion vector of the upper left vertex adjacent to the coded block A and / or B and / or C block is used as a candidate motion vector of the motion vector of the upper left vertex of the current coding block;
  • the motion vector of the coding block D and / or E block is used as the candidate motion vector of the motion vector of the top right vertex of the current coding block;
  • the motion vector of the coded block F and / or G block adjacent to the top right vertex is used as the top right vertex of the current coding block Candidate motion vector.
  • a combination of the candidate motion vector of the upper left vertex, the candidate motion vector of the upper right vertex, and the candidate motion vector of the lower left vertex can be used to obtain a set of candidate motion vector prediction values.
  • a plurality of sets of candidate motion vectors obtained by combining in this way The prediction value may constitute a candidate motion vector prediction value MVP list.
• The candidate motion vector prediction value MVP list may be constructed by using only the candidate motion vector prediction values obtained by Method 1, or by using only the candidate motion vector prediction values obtained by Method 2.
• Alternatively, the candidate motion vector prediction values obtained by Method 1 and those obtained by Method 2 may be used jointly to construct the candidate motion vector prediction value MVP list.
  • the candidate motion vector prediction value MVP list can be pruned and sorted according to a pre-configured rule, and then truncated or filled to a specific number.
• When each group of candidate motion vector prediction values in the candidate motion vector prediction value MVP list includes the motion vector prediction values of three control points, the candidate motion vector prediction value MVP list may be called a triple list; when each group of candidate motion vector prediction values in the list includes the motion vector prediction values of two control points, the candidate motion vector prediction value MVP list may be called a two-tuple list.
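The prune/truncate/fill step can be sketched as follows. The pre-configured pruning and sorting rule is not specified here, so this sketch assumes simple duplicate removal in order and padding with a caller-supplied fallback candidate (e.g. a zero-MV tuple).

```python
def finalize_candidate_list(candidates, target_len, padding_candidate):
    """Prune duplicates, then truncate or fill the list to target_len.

    padding_candidate is an assumed fallback (e.g. a zero-MV tuple);
    the patent's pre-configured rule may differ.
    """
    pruned = []
    for cand in candidates:
        if cand not in pruned:        # remove duplicates, keep order
            pruned.append(cand)
    pruned = pruned[:target_len]      # truncate if too long
    while len(pruned) < target_len:   # fill if too short
        pruned.append(padding_candidate)
    return pruned
```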
  • Step S712 The video encoder determines the target candidate motion vector group from the candidate motion vector prediction value MVP list according to the rate-distortion cost criterion. Specifically, for each candidate motion vector group in the candidate motion vector prediction value MVP list, the motion vector of each sub-block of the current block is calculated, and motion compensation is performed to obtain the prediction value of each sub-block, thereby obtaining the prediction value of the current block.
  • the candidate motion vector group with the smallest error between the predicted value and the original value is selected as a set of the best motion vector prediction values, that is, the target candidate motion vector group.
  • the determined target candidate motion vector group is used as an optimal candidate motion vector prediction value for a set of control points, and the target candidate motion vector group corresponds to a unique index number in the candidate motion vector prediction value MVP list.
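The selection step can be sketched as follows. `predict_block` is a hypothetical helper standing in for sub-block MV derivation plus motion compensation, and a plain SAD stands in for the full rate-distortion cost.

```python
def select_target_candidate(mvp_list, predict_block, original):
    """Pick the candidate whose prediction is closest to the original.

    predict_block(candidate) returns the predicted pixels of the
    current block for that candidate (hiding sub-block MV derivation
    and motion compensation); cost is SAD, a stand-in for RD cost.
    """
    best_index, best_cost = 0, None
    for index, candidate in enumerate(mvp_list):
        pred = predict_block(candidate)
        cost = sum(abs(p - o) for p, o in zip(pred, original))
        if best_cost is None or cost < best_cost:
            best_index, best_cost = index, cost
    return best_index, mvp_list[best_index]
```

The returned index is what step S713 encodes into the code stream.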
• Step S713: The video encoder encodes the index corresponding to the target candidate motion vector group and the motion vector difference MVD into the code stream to be transmitted.
• Specifically, the video encoder may use the target candidate motion vector group as a search starting point, and search, within a preset search range according to the rate-distortion cost criterion, for the set of control point motion vectors with the lowest cost; the motion vector difference MVD between that set of control point motion vectors and the target candidate motion vector group is then determined.
• For example, if the set of control points includes a first control point and a second control point, it is necessary to determine the MVD between the motion vector of the first control point and the motion vector prediction value of the first control point in the set of control points represented by the target candidate motion vector group, and the MVD between the motion vector of the second control point and the motion vector prediction value of the second control point in the set of control points represented by the target candidate motion vector group.
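The per-control-point MVD computation described above can be sketched as:

```python
def control_point_mvds(searched_mvs, predictor_mvs):
    """Per-control-point MVD: searched MV minus its predictor.

    Each argument is a tuple of (vx, vy) pairs, one per control point
    (two for a 4-parameter model, three for a 6-parameter model).
    """
    return tuple((sx - px, sy - py)
                 for (sx, sy), (px, py) in zip(searched_mvs, predictor_mvs))
```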
  • steps S714-S715 may also be performed.
• Step S714: The video encoder uses the affine transformation model to obtain the motion vector value of each sub-block in the current coding block according to the motion vector values of the control points of the current coding block determined above.
• The new candidate motion vector group obtained based on the target candidate motion vector group and the MVD includes the motion vectors of two control points (the upper left control point and the upper right control point) or three control points (for example, the upper left control point, the upper right control point, and the lower left control point).
• The motion information of a pixel at a preset position in each motion compensation unit can be used to represent the motion information of all pixels in that motion compensation unit.
• The preset-position pixel can be the center point (M/2, N/2) of the motion compensation unit, the upper left pixel (0, 0), the upper right pixel (M-1, 0), or another position.
  • FIG. 8A illustrates a 4 ⁇ 4 motion compensation unit
  • FIG. 8B illustrates an 8 ⁇ 8 motion compensation unit.
• The coordinates of the center point of each motion compensation unit relative to the top left pixel of the current coding block are calculated using formula (5), where i is the i-th motion compensation unit in the horizontal direction (from left to right), j is the j-th motion compensation unit in the vertical direction (from top to bottom), and (x (i, j) , y (i, j) ) represents the coordinates of the center point of the (i, j)-th motion compensation unit relative to the upper left control point pixel of the current coding block.
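One plausible reading of this center-point computation can be sketched as follows; it assumes formula (5) uses the usual convention x = i·M + M/2, y = j·N + N/2, which may differ from the patent's exact offsets.

```python
def compensation_unit_center(i, j, M, N):
    """Centre of the (i, j)-th M x N motion compensation unit,
    relative to the top-left pixel of the current coding block.

    Assumes the common convention x = i*M + M/2, y = j*N + N/2,
    one plausible reading of formula (5).
    """
    return (i * M + M / 2, j * N + N / 2)
```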
• If the current coding block is a 6-parameter coding block,
• then when the motion vectors of one or more sub-blocks of the current coding block are obtained based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU where the current coding block is located, the motion vector of the sub-block in the lower left corner of the current coding block is calculated from the 6-parameter affine model constructed according to the three control points and the position coordinates (0, H) of the lower left corner of the current coding block, and the motion vector of the sub-block in the lower right corner of the current coding block is calculated from the 6-parameter affine model constructed according to the three control points and the position coordinates (W, H) of the lower right corner of the current coding block.
• For example, substituting the position coordinates (0, H) of the lower left corner of the current coding block into the 6-parameter affine model yields the motion vector of the sub-block in the lower left corner of the current coding block (instead of substituting the coordinates of the center point of the lower left sub-block into the affine model for calculation), and substituting the position coordinates (W, H) of the lower right corner of the current coding block into the 6-parameter affine model yields the motion vector of the sub-block in the lower right corner of the current coding block (instead of substituting the coordinates of the center point of the lower right sub-block into the affine model for calculation).
• In this way, when the motion vectors of the lower left control point and the lower right control point of the current coding block are later used (for example, when the candidate motion vector prediction value MVP lists of subsequent other blocks are constructed based on them), accurate values instead of estimated values are used.
  • W is the width of the current coding block
  • H is the height of the current coding block.
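A minimal sketch of this boundary-coinciding case, assuming the three control points sit at (0, 0), (W, 0), and (0, H) of the current block (function names are illustrative):

```python
def affine_mv_6param(cp_mvs, W, H, x, y):
    """Evaluate a 6-parameter affine model at position (x, y).

    cp_mvs holds the MVs of the top-left, top-right and bottom-left
    control points, assumed at (0, 0), (W, 0) and (0, H).
    """
    (v0x, v0y), (v1x, v1y), (v2x, v2y) = cp_mvs
    vx = v0x + (v1x - v0x) * x / W + (v2x - v0x) * y / H
    vy = v0y + (v1y - v0y) * x / W + (v2y - v0y) * y / H
    return (vx, vy)

def bottom_corner_mvs(cp_mvs, W, H):
    """When the block's lower boundary coincides with the CTU's lower
    boundary, the lower-left and lower-right sub-block MVs use the
    exact corner positions (0, H) and (W, H) rather than sub-block
    centres, so later blocks reuse accurate control-point values."""
    return (affine_mv_6param(cp_mvs, W, H, 0, H),
            affine_mv_6param(cp_mvs, W, H, W, H))
```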
• If the current coding block is a 4-parameter coding block,
• then when the motion vectors of one or more sub-blocks of the current coding block are obtained based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU where the current coding block is located, the motion vector of the sub-block in the lower left corner of the current coding block is calculated from the 4-parameter affine model constructed according to the two control points and the position coordinates (0, H) of the lower left corner of the current coding block, and the motion vector of the sub-block in the lower right corner of the current coding block is calculated from the 4-parameter affine model constructed according to the two control points and the position coordinates (W, H) of the lower right corner of the current coding block.
• For example, substituting the position coordinates (0, H) of the lower left corner of the current coding block into the 4-parameter affine model yields the motion vector of the sub-block in the lower left corner of the current coding block (instead of substituting the coordinates of the center point of the lower left sub-block into the affine model for calculation), and substituting the position coordinates (W, H) of the lower right corner of the current coding block into the 4-parameter affine model yields the motion vector of the sub-block in the lower right corner of the current coding block (instead of substituting the coordinates of the center point of the lower right sub-block into the affine model for calculation).
• In this way, when the motion vectors of the lower left control point and the lower right control point of the current coding block are later used (for example, when the candidate motion vector prediction value MVP lists of subsequent other blocks are constructed based on them), accurate values instead of estimated values are used.
  • W is the width of the current coding block
  • H is the height of the current coding block.
• Step S715: The video encoder performs motion compensation according to the motion vector value of each sub-block in the current coding block to obtain the pixel prediction value of each sub-block. For example, the motion vector and reference frame index value of each sub-block are used to find the corresponding sub-block in the reference frame, and interpolation filtering is performed to obtain the pixel prediction value of each sub-block.
• Step S721: The video encoder constructs a candidate motion information list.
• Specifically, the video encoder constructs a candidate motion information list (also referred to as a candidate motion vector list) through an inter prediction unit (also referred to as an inter prediction module). The list may be constructed in either of the two manners provided below, or by a combination of the two, and the candidate motion information list constructed is a triplet candidate motion information list; the two manners are specifically as follows:
• Method 1: A motion vector prediction method based on a motion model is used to construct the candidate motion information list.
  • all or part of neighboring blocks of the current coding block are traversed in a predetermined order to determine the neighboring affine coding blocks, and the number of the determined neighboring affine coding blocks may be one or more.
  • the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in order to determine neighboring affine coding blocks in the neighboring blocks A, B, C, D, and E.
• The inter prediction unit determines a set of candidate motion vector prediction values (each set being a two-tuple or a three-tuple) according to each adjacent affine coding block; one adjacent affine coding block is taken below as an example for introduction.
• This adjacent affine coding block is called the first adjacent affine coding block, as follows:
  • the first affine model is determined according to the motion vector of the control points of the first neighboring affine coding block, and then the motion vector of the control point of the current coding block is predicted according to the first affine model, which is specifically described as follows:
• If the first adjacent affine coding block is located in the CTU above the current coding block and the first adjacent affine coding block is a 4-parameter affine coding block, the position coordinates and motion vectors of the two bottom control points of the first adjacent affine coding block are obtained.
• For example, the position coordinates (x 6 , y 6 ) and motion vector (vx 6 , vy 6 ) of the lower left control point of the first adjacent affine coding block can be obtained, together with the position coordinates (x 7 , y 7 ) and motion vector (vx 7 , vy 7 ) of the lower right control point.
• A first affine model is then formed according to the motion vectors of the two bottom control points of the first adjacent affine coding block (the first affine model obtained in this case is a 4-parameter affine model).
  • the motion vector of the control point of the current coding block is predicted according to the first affine model.
• Specifically, the position coordinates of the upper left control point, the position coordinates of the upper right control point, and the position coordinates of the lower left control point of the current coding block may be brought into the first affine model, so as to predict the motion vectors of the upper left control point, the upper right control point, and the lower left control point of the current coding block, which form a candidate motion vector triplet that is added to the candidate motion information list, as shown in formulas (1), (2), and (3).
• Alternatively, the motion vector of the control points of the current coding block is predicted according to the first affine model as follows: the position coordinates of the upper left control point and the position coordinates of the upper right control point of the current coding block may be brought into the first affine model to predict the motion vector of the upper left control point and the motion vector of the upper right control point of the current coding block, which form a candidate motion vector two-tuple that is added to the candidate motion information list, as shown in formulas (1) and (2).
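The prediction from the neighbour's two bottom control points can be sketched as follows, using the standard rotation-plus-scaling form of the 4-parameter affine model (assumed to match formulas (1)-(3); the function name is illustrative):

```python
def predict_cp_from_bottom_pair(p6, v6, p7, v7, cp_positions):
    """Predict current-block control-point MVs from the bottom two
    control points of a neighbouring 4-parameter affine block.

    p6/p7: (x, y) coordinates of the neighbour's lower-left and
    lower-right control points; v6/v7: their motion vectors.
    Uses the standard 4-parameter form vx = c + a*x - b*y,
    vy = d + b*x + a*y (an assumption about formulas (1)-(3)).
    """
    dx = p7[0] - p6[0]
    a = (v7[0] - v6[0]) / dx            # scaling term
    b = (v7[1] - v6[1]) / dx            # rotation term
    preds = []
    for (x, y) in cp_positions:
        vx = v6[0] + a * (x - p6[0]) - b * (y - p6[1])
        vy = v6[1] + b * (x - p6[0]) + a * (y - p6[1])
        preds.append((vx, vy))
    return preds
```

Passing two control-point positions of the current block yields a two-tuple candidate; passing three yields a triplet candidate.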
• (x 2 , y 2 ) are the coordinates of the lower left control point of the current coding block; in addition, (vx 0 , vy 0 ) is the predicted motion vector of the upper left control point of the current coding block, (vx 1 , vy 1 ) is the predicted motion vector of the upper right control point of the current coding block, and (vx 2 , vy 2 ) is the predicted motion vector of the lower left control point of the current coding block.
• If the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block and the first neighboring affine coding block is a 6-parameter affine coding block, a candidate motion vector prediction value of the control points of the current block is not generated based on the first neighboring affine coding block.
  • the manner of predicting the motion vector of the control point of the current coding block is not limited here.
  • an optional determination method is also exemplified below:
• the position coordinates and motion vectors of the three control points of the first adjacent affine coding block may be obtained, for example, the position coordinates (x 4 , y 4 ) and motion vector (vx 4 , vy 4 ) of the upper left control point, the position coordinates (x 5 , y 5 ) and motion vector (vx 5 , vy 5 ) of the upper right control point, and the position coordinates (x 6 , y 6 ) and motion vector (vx 6 , vy 6 ) of the lower left control point.
  • a 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first adjacent affine coding block.
• The position coordinates (x 0 , y 0 ) of the upper left control point, the position coordinates (x 1 , y 1 ) of the upper right control point, and the position coordinates (x 2 , y 2 ) of the lower left control point of the current coding block are then substituted into the 6-parameter affine model to predict the motion vector of the upper left control point, the motion vector of the upper right control point, and the motion vector of the lower left control point of the current coding block, as shown in formulas (4), (5), and (6).
  • Formulas (4) and (5) have been described previously.
  • (vx 0 , vy 0 ) is the predicted motion vector of the upper-left control point of the current coding block
• (vx 1 , vy 1 ) is the predicted motion vector of the upper right control point of the current coding block
• (vx 2 , vy 2 ) is the predicted motion vector of the lower left control point of the current coding block.
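The prediction from a 6-parameter neighbour's three control points can be sketched as follows (assumed to match formulas (4)-(6); the neighbour's control points are assumed axis-aligned, i.e. p5 level with p4 and p6 directly below p4):

```python
def predict_cp_from_6param_neighbour(p4, v4, p5, v5, p6, v6, cp_positions):
    """Predict current-block control-point MVs from the three control
    points of a neighbouring 6-parameter affine block: upper-left
    (p4, v4), upper-right (p5, v5), lower-left (p6, v6).
    """
    w = p5[0] - p4[0]                   # neighbour control-point span, x
    h = p6[1] - p4[1]                   # neighbour control-point span, y
    preds = []
    for (x, y) in cp_positions:
        vx = (v4[0]
              + (v5[0] - v4[0]) * (x - p4[0]) / w
              + (v6[0] - v4[0]) * (y - p4[1]) / h)
        vy = (v4[1]
              + (v5[1] - v4[1]) * (x - p4[0]) / w
              + (v6[1] - v4[1]) * (y - p4[1]) / h)
        preds.append((vx, vy))
    return preds
```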
• Manner 2: A motion vector prediction method based on combinations of control points is used to construct the candidate motion information list.
• The following two schemes, Scheme A and Scheme B, are exemplified:
• Scheme A: The motion information of two control points of the current coding block is combined to construct a 4-parameter affine transformation model.
• The combinations of two control points are {CP1, CP4}, {CP2, CP3}, {CP1, CP2}, {CP2, CP4}, {CP1, CP3}, and {CP3, CP4}.
• For example, the 4-parameter affine transformation model constructed using the control points CP1 and CP2 is denoted Affine (CP1, CP2).
• It should be noted that a model formed by combining different control points can also be converted into control points at the same position.
• For example, the 4-parameter affine transformation model obtained by combining {CP1, CP4}, {CP2, CP3}, {CP2, CP4}, {CP1, CP3}, or {CP3, CP4} is converted to be represented by the control points {CP1, CP2} or {CP1, CP2, CP3}.
• The conversion method is to substitute the motion vectors of the control points and their coordinate information into formula (9-1) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2} to obtain their motion vectors, which are used as a set of candidate motion vector prediction values.
  • a 0 , a 1 , a 2 , and a 3 are parameters in the parameter model, and (x, y) represents position coordinates.
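This fit-then-evaluate conversion can be sketched as follows. The parameterization below (vx = c + a·x − b·y, vy = d + b·x + a·y) is an equivalent standard form of the 4-parameter model, not necessarily the exact parameter layout of formula (9-1).

```python
def fit_4param(p0, v0, p1, v1):
    """Fit vx = c + a*x - b*y, vy = d + b*x + a*y from two control
    points at distinct positions p0, p1 with motion vectors v0, v1."""
    dx, dy = p0[0] - p1[0], p0[1] - p1[1]
    det = dx * dx + dy * dy
    a = ((v0[0] - v1[0]) * dx + (v0[1] - v1[1]) * dy) / det
    b = ((v0[1] - v1[1]) * dx - (v0[0] - v1[0]) * dy) / det
    c = v0[0] - a * p0[0] + b * p0[1]
    d = v0[1] - b * p0[0] - a * p0[1]
    return a, b, c, d

def mv_at(params, x, y):
    """Evaluate the fitted 4-parameter model at position (x, y)."""
    a, b, c, d = params
    return (c + a * x - b * y, d + b * x + a * y)
```

To convert, say, {CP1, CP3} into a {CP1, CP2} representation, fit the model from CP1 and CP3 and then evaluate `mv_at` at CP2's coordinates.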
• Alternatively, each combination can be converted directly into a set of motion vector prediction values represented by the control points {CP1, CP2, CP3} according to the following formulas, and added to the candidate motion information list:
• Formula (9-3) for converting {CP1, CP3} to {CP1, CP2, CP3}:
• Formula (11) for converting {CP1, CP4} to {CP1, CP2, CP3}:
• Formula (13) for converting {CP3, CP4} to {CP1, CP2, CP3}:
• Scheme B: The motion information of three control points of the current coding block is combined to construct a 6-parameter affine transformation model.
• The combinations of three control points are {CP1, CP2, CP4}, {CP1, CP2, CP3}, {CP2, CP3, CP4}, and {CP1, CP3, CP4}.
• For example, the 6-parameter affine transformation model constructed using the control points CP1, CP2, and CP3 is denoted Affine (CP1, CP2, CP3).
• It should be noted that a model formed by combining different control points can also be converted into control points at the same position.
• For example, the 6-parameter affine transformation model obtained by combining {CP1, CP2, CP4}, {CP2, CP3, CP4}, or {CP1, CP3, CP4} is converted to be represented by the control points {CP1, CP2, CP3}.
• The conversion method is to substitute the motion vectors of the control points and their coordinate information into formula (14) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2, CP3} to obtain their motion vectors, which are used as a set of candidate motion vector prediction values.
  • a 1 , a 2 , a 3 , a 4 , a 5 , and a 6 are parameters in the parameter model, and (x, y) represents position coordinates.
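The corresponding 6-parameter fit-then-evaluate conversion can be sketched as follows. The parameter names follow the patent's a 1 ..a 6 only loosely (vx = a1 + a3·x + a4·y, vy = a2 + a5·x + a6·y is one plausible reading of formula (14)), and the three control points may sit at arbitrary non-collinear positions.

```python
def fit_6param(points, mvs):
    """Fit vx = a1 + a3*x + a4*y and vy = a2 + a5*x + a6*y from three
    non-collinear control points with known motion vectors."""
    (x0, y0), (x1, y1), (x2, y2) = points
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)

    def solve(v0, v1, v2):
        # Cramer's rule on the 2x2 system for the gradient terms
        gx = ((v1 - v0) * (y2 - y0) - (v2 - v0) * (y1 - y0)) / det
        gy = ((x1 - x0) * (v2 - v0) - (x2 - x0) * (v1 - v0)) / det
        return v0 - gx * x0 - gy * y0, gx, gy

    a1, a3, a4 = solve(mvs[0][0], mvs[1][0], mvs[2][0])
    a2, a5, a6 = solve(mvs[0][1], mvs[1][1], mvs[2][1])
    return (a1, a2, a3, a4, a5, a6)

def mv_6param(params, x, y):
    """Evaluate the fitted 6-parameter model at position (x, y)."""
    a1, a2, a3, a4, a5, a6 = params
    return (a1 + a3 * x + a4 * y, a2 + a5 * x + a6 * y)
```

To convert, say, {CP1, CP2, CP4} into {CP1, CP2, CP3}, fit the model from those three points and evaluate `mv_6param` at CP3's coordinates.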
• Alternatively, a set of motion vector prediction values represented by the upper left control point, the upper right control point, and the lower left control point can be obtained by conversion according to the following formula and added to the candidate motion information list:
• The candidate motion information list may be constructed by using only the candidate motion vector prediction values obtained by Manner 1, or by using only the candidate motion vector prediction values obtained by Manner 2.
• Alternatively, the candidate motion vector prediction values obtained by Manner 1 and those obtained by Manner 2 may be used jointly to construct the candidate motion information list.
  • the candidate motion information list can be pruned and sorted according to pre-configured rules, and then truncated or filled to a specific number.
• When each group of candidate motion vector prediction values in the candidate motion information list includes the motion vector prediction values of three control points, the candidate motion information list may be referred to as a triple list; when each group of candidate motion vector prediction values in the candidate motion information list includes the motion vector prediction values of two control points, the candidate motion information list may be referred to as a two-tuple list.
  • Step S722 The video encoder determines a target candidate motion vector group from the candidate motion information list according to the rate-distortion cost criterion. Specifically, for each candidate motion vector group in the candidate motion information list, the motion vector of each sub-block of the current block is calculated, and motion compensation is performed to obtain the prediction value of each sub-block, thereby obtaining the prediction value of the current block.
  • the candidate motion vector group with the smallest error between the predicted value and the original value is selected as a set of the best motion vector prediction values, that is, the target candidate motion vector group.
  • the determined target candidate motion vector group is used as an optimal candidate motion vector prediction value for a group of control points, and the target candidate motion vector group corresponds to a unique index number in the candidate motion information list.
• Step S723: The video encoder encodes an index corresponding to the target candidate motion vector group, a reference frame index, and a prediction direction into a code stream to be transmitted.
  • steps S721-S723 may also be performed.
• Step S724: The video encoder uses the affine transformation model to obtain the motion vector value of each sub-block in the current coding block according to the motion vector values of the control points of the current coding block determined above.
• The motion information of a pixel at a preset position in each motion compensation unit can be used to represent the motion information of all pixels in that motion compensation unit.
• The preset-position pixel can be the center point (M/2, N/2) of the motion compensation unit, the upper left pixel (0, 0), the upper right pixel (M-1, 0), or another position.
  • FIG. 8A illustrates a 4 ⁇ 4 motion compensation unit
  • FIG. 8B illustrates an 8 ⁇ 8 motion compensation unit.
• The coordinates of the center point of each motion compensation unit relative to the top left pixel of the current coding block are calculated using formula (5), where i is the i-th motion compensation unit in the horizontal direction (from left to right), j is the j-th motion compensation unit in the vertical direction (from top to bottom), and (x (i, j) , y (i, j) ) represents the coordinates of the center point of the (i, j)-th motion compensation unit relative to the upper left control point pixel of the current coding block.
• If the current coding block is a 6-parameter coding block,
• then when the motion vectors of one or more sub-blocks of the current coding block are obtained based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU where the current coding block is located, the motion vector of the sub-block in the lower left corner of the current coding block is calculated from the 6-parameter affine model constructed according to the three control points and the position coordinates (0, H) of the lower left corner of the current coding block, and the motion vector of the sub-block in the lower right corner of the current coding block is calculated from the 6-parameter affine model constructed according to the three control points and the position coordinates (W, H) of the lower right corner of the current coding block.
• For example, substituting the position coordinates (0, H) of the lower left corner of the current coding block into the 6-parameter affine model yields the motion vector of the sub-block in the lower left corner of the current coding block (instead of substituting the coordinates of the center point of the lower left sub-block into the affine model for calculation), and substituting the position coordinates (W, H) of the lower right corner of the current coding block into the 6-parameter affine model yields the motion vector of the sub-block in the lower right corner of the current coding block (instead of substituting the coordinates of the center point of the lower right sub-block into the affine model for calculation).
• In this way, when the motion vectors of the lower left control point and the lower right control point of the current coding block are later used (for example, when the candidate motion information lists of subsequent other blocks are constructed based on them), accurate values instead of estimated values are used.
  • W is the width of the current coding block
  • H is the height of the current coding block.
• If the current coding block is a 4-parameter coding block,
• then when the motion vectors of one or more sub-blocks of the current coding block are obtained based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU where the current coding block is located, the motion vector of the sub-block in the lower left corner of the current coding block is calculated from the 4-parameter affine model constructed according to the two control points and the position coordinates (0, H) of the lower left corner of the current coding block, and the motion vector of the sub-block in the lower right corner of the current coding block is calculated from the 4-parameter affine model constructed according to the two control points and the position coordinates (W, H) of the lower right corner of the current coding block.
• For example, substituting the position coordinates (0, H) of the lower left corner of the current coding block into the 4-parameter affine model yields the motion vector of the sub-block in the lower left corner of the current coding block (instead of substituting the coordinates of the center point of the lower left sub-block into the affine model for calculation), and substituting the position coordinates (W, H) of the lower right corner of the current coding block into the 4-parameter affine model yields the motion vector of the sub-block in the lower right corner of the current coding block (instead of substituting the coordinates of the center point of the lower right sub-block into the affine model for calculation).
• In this way, when the motion vectors of the lower left control point and the lower right control point of the current coding block are later used (for example, when the candidate motion information lists of subsequent other blocks are constructed based on them), accurate values instead of estimated values are used.
  • W is the width of the current coding block
  • H is the height of the current coding block.
• Step S725: The video encoder performs motion compensation according to the motion vector value of each sub-block in the current coding block to obtain the pixel prediction value of each sub-block. Specifically, the pixel prediction value of the current coding block is obtained according to the motion vectors of one or more sub-blocks of the current coding block, together with the reference frame index and prediction direction indicated by the index.
• It can be seen that in the above method, the first set of control points includes the lower left control point and the lower right control point of the first neighboring affine coding block, instead of fixing the upper left control point, the upper right control point, and the lower left control point of the first adjacent coding block as the first set of control points as in the prior art.
• Therefore, the information of the first set of control points (for example, position coordinates and motion vectors) can directly reuse the information read from the memory, thereby reducing memory reads and improving encoding performance.
  • FIG. 9 is a flowchart illustrating a process 900 of a decoding method according to an embodiment of the present application.
  • the process 900 may be performed by the video decoder 100, and specifically, may be performed by the inter prediction unit 210 of the video decoder 200 and the entropy decoding unit (also referred to as an entropy decoder) 203.
• the process 900 is described as a series of steps or operations. It should be understood that the process 900 may be performed in various orders and/or concurrently, and is not limited to the execution order shown in FIG. 9. Assume that a video data stream with multiple video frames is being decoded by a video decoder.
• If the first neighboring affine decoding block is located in the coding tree unit (CTU) above the current decoding block, a set of candidate motion vector prediction values is determined based on the lower left control point and the lower right control point of the first neighboring affine decoding block, corresponding to the process shown in FIG. 9; the related description is as follows:
• Specifically, how a set of candidate motion vector prediction values is determined based on the lower left control point and the lower right control point of the first neighboring affine decoding block is described in detail as follows:
• Step S1200: The video decoder determines the inter prediction mode of the current decoding block.
• Specifically, the inter prediction mode may be the advanced motion vector prediction (AMVP) mode or the merge mode.
• If the inter prediction mode of the current decoding block is the AMVP mode, steps S1211-S1216 are performed.
• If the inter prediction mode of the current decoding block is the merge mode, steps S1221-S1225 are performed.
• Step S1211: The video decoder constructs a candidate motion vector prediction value MVP list.
  • the video decoder uses an inter prediction unit (also referred to as an inter prediction module) to construct a candidate motion vector prediction value MVP list (also referred to as a candidate motion vector list).
• The candidate motion vector prediction value MVP list may be a triplet candidate motion vector prediction value MVP list or a two-tuple candidate motion vector prediction value MVP list; the above two methods are specifically as follows:
• Method 1: A motion vector prediction method based on a motion model is used to construct the candidate motion vector prediction value MVP list.
  • all or part of neighboring blocks of the current decoding block are traversed in a predetermined order to determine the neighboring affine decoding blocks, and the number of the determined neighboring affine decoding blocks may be one or more.
  • the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in order to determine neighboring affine decoding blocks among the neighboring blocks A, B, C, D, and E.
• Specifically, the inter prediction unit determines at least one set of candidate motion vector prediction values (each set being a 2-tuple or a triplet) according to at least one neighboring affine decoding block. One adjacent affine decoding block is used below as an example for description; this adjacent affine decoding block is called the first adjacent affine decoding block. The details are as follows:
• A first affine model is determined according to the motion vectors of the control points of the first adjacent affine decoding block, and the motion vectors of the control points of the current decoding block are predicted according to the first affine model. Depending on the parameter model of the current decoding block, the method of predicting the motion vectors of the control points of the current decoding block based on the motion vectors of the control points of the first adjacent affine decoding block also differs, so the cases are described separately below.
• Case 1: The parameter model of the current decoding block is a 4-parameter affine transformation model, and the derivation method can be as follows (see FIG. 9A):
• The position coordinates and motion vectors of the two bottom control points of the first adjacent affine decoding block are obtained. For example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower left control point and the position coordinates (x7, y7) and motion vector (vx7, vy7) of the lower right control point of the first adjacent affine decoding block can be obtained (step S1201).
• A first affine model is formed according to the motion vectors and position coordinates of the two bottom control points of the first adjacent affine decoding block (the first affine model obtained at this point is a 4-parameter affine model) (step S1202).
• The motion vectors of the control points of the current decoding block are predicted according to the first affine model. For example, the position coordinates of the upper left control point and of the upper right control point of the current decoding block may be substituted into the first affine model respectively, thereby predicting the motion vector of the upper left control point and the motion vector of the upper right control point of the current decoding block, as shown in formulas (1) and (2) (step S1203).
• Here (x0, y0) are the coordinates of the upper left control point of the current decoding block, and (x1, y1) are the coordinates of the upper right control point; in addition, (vx0, vy0) is the predicted motion vector of the upper left control point of the current decoding block, and (vx1, vy1) is the predicted motion vector of the upper right control point of the current decoding block.
• It should be noted that the position coordinates (x6, y6) of the lower left control point and (x7, y7) of the lower right control point of the first adjacent affine decoding block are both calculated from the position coordinates (x4, y4) of its upper left control point: the position coordinates (x6, y6) of the lower left control point are (x4, y4 + cuH), and the position coordinates (x7, y7) of the lower right control point are (x4 + cuW, y4 + cuH), where cuW is the width of the first adjacent affine decoding block and cuH is its height.
• In addition, the motion vector of the lower left control point of the first adjacent affine decoding block is the motion vector of its lower left sub-block, and the motion vector of the lower right control point is the motion vector of its lower right sub-block.
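As an illustrative, non-normative sketch of the derivation in formulas (1) and (2), the 4-parameter affine model built from the two bottom control points of the first adjacent affine decoding block can be evaluated at any position of the current decoding block. Function and variable names are assumptions, and the fixed-point sub-pel precision and rounding of a real codec are omitted:

```python
def affine_4param_predict(x6, y6, vx6, vy6, x7, y7, vx7, vy7, x, y):
    """Evaluate a 4-parameter affine motion model, built from the lower left
    control point (x6, y6) -> (vx6, vy6) and the lower right control point
    (x7, y7) -> (vx7, vy7) of a neighboring block, at position (x, y)."""
    w = x7 - x6                 # horizontal distance between the two control points
    a = (vx7 - vx6) / w         # zoom/rotation parameters of the 4-parameter model
    b = (vy7 - vy6) / w
    vx = vx6 + a * (x - x6) - b * (y - y6)
    vy = vy6 + b * (x - x6) + a * (y - y6)
    return vx, vy
```

Substituting the coordinates of the current block's upper left and upper right control points for (x, y) yields their predicted motion vectors, matching the structure of formulas (1) and (2).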
• If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block and the first neighboring affine decoding block is a six-parameter affine decoding block, a candidate motion vector prediction value of the control points of the current block is not generated based on the first neighboring affine decoding block.
  • the manner of predicting the motion vector of the control point of the current decoding block is not limited here.
  • an optional determination method is also exemplified below:
• Optionally, the position coordinates and motion vectors of the three control points of the first adjacent affine decoding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower left control point.
  • a 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first adjacent affine decoding block.
• Then, the position coordinates (x0, y0) of the upper left control point and the position coordinates (x1, y1) of the upper right control point of the current decoding block are substituted into the 6-parameter affine model to predict the motion vector of the upper left control point and the motion vector of the upper right control point of the current decoding block, as shown in formulas (4) and (5).
• Here (vx0, vy0) is the predicted motion vector of the upper left control point of the current decoding block, and (vx1, vy1) is the predicted motion vector of the upper right control point of the current decoding block.
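The 6-parameter case in formulas (4) and (5) can be sketched similarly. This is an illustrative floating-point simplification with invented names; each control point is passed as a ((x, y), (vx, vy)) pair:

```python
def affine_6param_predict(cp_tl, cp_tr, cp_bl, x, y):
    """Evaluate a 6-parameter affine model built from the neighbor's upper
    left, upper right, and lower left control points at position (x, y).
    Each cp_* argument is ((px, py), (vx, vy))."""
    (x4, y4), (vx4, vy4) = cp_tl
    (x5, _), (vx5, vy5) = cp_tr
    (_, y6), (vx6, vy6) = cp_bl
    w = x5 - x4                 # width spanned by the top control points
    h = y6 - y4                 # height spanned by the left control points
    vx = vx4 + (vx5 - vx4) / w * (x - x4) + (vx6 - vx4) / h * (y - y4)
    vy = vy4 + (vy5 - vy4) / w * (x - x4) + (vy6 - vy4) / h * (y - y4)
    return vx, vy
```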
• Case 2: The parameter model of the current decoding block is a 6-parameter affine transformation model, and the derivation method can be as follows:
• If the first adjacent affine decoding block is located in the CTU above the current decoding block and the first adjacent affine decoding block is a four-parameter affine decoding block, the position coordinates and motion vectors of the two bottom control points of the first adjacent affine decoding block are obtained. For example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower left control point and the position coordinates (x7, y7) and motion vector (vx7, vy7) of the lower right control point can be obtained.
  • a first affine model is formed according to the motion vectors of the two control points at the bottom of the first adjacent affine decoding block (the first affine model obtained at this time is a 4-parameter affine model).
• The motion vectors of the control points of the current decoding block are predicted according to the first affine model. For example, the position coordinates of the upper left control point, the upper right control point, and the lower left control point of the current decoding block may be substituted into the first affine model respectively, thereby predicting the motion vectors of the upper left, upper right, and lower left control points of the current decoding block, as shown in formulas (1), (2), and (3).
  • Formulas (1) and (2) have been described above.
• In formula (3), (x0, y0) are the coordinates of the upper left control point of the current decoding block, (x1, y1) are the coordinates of the upper right control point, and (x2, y2) are the coordinates of the lower left control point; (vx0, vy0) is the predicted motion vector of the upper left control point of the current decoding block, (vx1, vy1) is the predicted motion vector of the upper right control point, and (vx2, vy2) is the predicted motion vector of the lower left control point.
• If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block and the first neighboring affine decoding block is a six-parameter affine decoding block, a candidate motion vector prediction value of the control points of the current block is not generated based on the first neighboring affine decoding block.
  • the manner of predicting the motion vector of the control point of the current decoding block is not limited here.
  • an optional determination method is also exemplified below:
• Optionally, the position coordinates and motion vectors of the three control points of the first adjacent affine decoding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower left control point.
  • a 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first adjacent affine decoding block.
• Then the position coordinates of the upper left control point, the upper right control point, and the lower left control point of the current decoding block are substituted into the 6-parameter affine model to predict the motion vectors of the upper left, upper right, and lower left control points of the current decoding block, as shown in formulas (4), (5), and (6).
  • Formulas (4) and (5) have been described previously.
• Here (vx0, vy0), (vx1, vy1), and (vx2, vy2) are the predicted motion vectors of the upper left, upper right, and lower left control points of the current decoding block, respectively.
• Method 2: A motion vector prediction method based on a combination of control points is used to construct the candidate motion vector prediction value MVP list.
• Different parameter models of the current decoding block correspond to different ways of constructing the candidate motion vector prediction value MVP list, which are described separately below.
• Case 1: The parameter model of the current decoding block is a 4-parameter affine transformation model, and the derivation method can be as follows:
  • the motion information of the upper left vertex and the upper right vertex of the current decoded block is estimated using the motion information of the decoded blocks adjacent to the current decoded block.
• Specifically, the motion vector of the decoded block A and/or B and/or C adjacent to the upper left vertex is used as a candidate motion vector for the motion vector of the upper left vertex of the current decoding block, and the motion vector of the decoded block D and/or E adjacent to the upper right vertex is used as a candidate motion vector for the motion vector of the upper right vertex of the current decoding block.
• A candidate motion vector of the upper left vertex and a candidate motion vector of the upper right vertex are combined to obtain a set of candidate motion vector prediction values. Multiple sets obtained by combining in this manner can form the candidate motion vector prediction value MVP list.
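The combination step can be sketched as a Cartesian product of the per-vertex candidate lists. The names below are illustrative, and the pruning and ordering rules of the real list construction are omitted:

```python
from itertools import product

def build_2tuple_candidates(tl_mvs, tr_mvs):
    """tl_mvs: candidate MVs for the upper left vertex (e.g. from blocks
    A/B/C); tr_mvs: candidate MVs for the upper right vertex (e.g. from
    D/E). Each returned pair is one set of candidate MV prediction values."""
    return [(mv_tl, mv_tr) for mv_tl, mv_tr in product(tl_mvs, tr_mvs)]
```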
• Case 2: The parameter model of the current decoding block is a 6-parameter affine transformation model, and the derivation method can be as follows:
• The motion information of the upper left vertex, the upper right vertex, and the lower left vertex of the current decoding block is estimated using the motion information of the decoded blocks adjacent to the current decoding block. Specifically, the motion vector of the decoded block A and/or B and/or C adjacent to the upper left vertex is used as a candidate motion vector for the motion vector of the upper left vertex of the current decoding block; the motion vector of the decoded block D and/or E adjacent to the upper right vertex is used as a candidate motion vector for the motion vector of the upper right vertex; and the motion vector of the decoded block F and/or G adjacent to the lower left vertex is used as a candidate motion vector for the motion vector of the lower left vertex. A candidate motion vector of the upper left vertex, a candidate motion vector of the upper right vertex, and a candidate motion vector of the lower left vertex are combined to obtain a set of candidate motion vector prediction values, and multiple sets of candidate motion vector prediction values obtained by combining in this manner may constitute the candidate motion vector prediction value MVP list.
• It should be noted that the candidate motion vector prediction value MVP list may be constructed using only the candidate motion vector prediction values obtained by Method 1, or using only the candidate motion vector prediction values obtained by Method 2, or the candidate motion vector prediction values obtained by Method 1 and Method 2 may be used jointly to construct the candidate motion vector prediction value MVP list.
  • the candidate motion vector prediction value MVP list can be pruned and sorted according to a pre-configured rule, and then truncated or filled to a specific number.
• When each set of candidate motion vector prediction values in the candidate motion vector prediction value MVP list includes the motion vector prediction values of three control points, the list may be called a triplet list; when each set includes the motion vector prediction values of two control points, the list may be called a 2-tuple list.
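A minimal sketch of the prune/sort/truncate/fill step described above. The pre-configured rule itself is not given in this text, so duplicate removal and zero-MV padding are assumptions for illustration:

```python
def finalize_mvp_list(candidates, target_len, fill=((0, 0), (0, 0))):
    """Remove duplicates (pruning), then truncate or pad the candidate list
    to a fixed length. The zero-MV fill entry is an illustrative choice."""
    pruned = []
    for cand in candidates:
        if cand not in pruned:        # simple duplicate pruning
            pruned.append(cand)
    pruned = pruned[:target_len]      # truncate if too long
    while len(pruned) < target_len:   # pad if too short
        pruned.append(fill)
    return pruned
```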
• Step S1212: The video decoder parses the bitstream to obtain an index and a motion vector difference MVD.
• Specifically, the video decoder may parse the bitstream through an entropy decoding unit. The index is used to indicate the target candidate motion vector group of the current decoding block, and the target candidate motion vector group represents the motion vector prediction values of a set of control points of the current decoding block.
• Step S1213: The video decoder determines the target motion vector group from the candidate motion vector prediction value MVP list according to the index.
• Specifically, the video decoder takes the target candidate motion vector group determined from the candidate motion vector prediction value MVP list according to the index as the optimal candidate motion vector prediction value. Optionally, when the length of the candidate motion vector prediction value MVP list is 1, the bitstream does not need to be parsed to obtain the index, and the target motion vector group can be determined directly.
• If the parameter model of the current decoding block is a 4-parameter affine transformation model, the optimal motion vector prediction values of 2 control points are selected from the candidate motion vector prediction value MVP list established above. For example, the video decoder parses the index number from the bitstream and determines the optimal motion vector prediction values of the 2 control points from the 2-tuple candidate motion vector prediction value MVP list according to the index number; each candidate motion vector prediction value in the list corresponds to its respective index number.
• If the parameter model of the current decoding block is a 6-parameter affine transformation model, the optimal motion vector prediction values of 3 control points are selected from the candidate motion vector prediction value MVP list established above. For example, the video decoder parses the index number from the bitstream and determines the optimal motion vector prediction values of the 3 control points from the triplet candidate motion vector prediction value MVP list according to the index number; each candidate motion vector prediction value in the list corresponds to its respective index number.
• Step S1214: The video decoder determines the motion vectors of the control points of the current decoding block according to the target candidate motion vector group and the motion vector difference MVD parsed from the bitstream.
• When the parameter model of the current decoding block is a 4-parameter affine transformation model, the motion vector differences of the two control points of the current decoding block are decoded from the bitstream, and a new candidate motion vector group is obtained according to the motion vector difference of each control point and the target candidate motion vector group indicated by the index. For example, the motion vector difference MVD of the upper left control point and the MVD of the upper right control point are decoded from the bitstream and added to the motion vectors of the upper left and upper right control points in the target candidate motion vector group respectively, thereby obtaining a new candidate motion vector group. The new candidate motion vector group thus includes the new motion vector values of the upper left and upper right control points of the current decoding block.
• Further, the motion vector value of a third control point may be obtained by using the 4-parameter affine transformation model based on the motion vector values of the two control points of the current decoding block in the new candidate motion vector group. For example, the motion vector (vx0, vy0) of the upper left control point and the motion vector (vx1, vy1) of the upper right control point of the current decoding block are obtained, and the motion vector (vx2, vy2) of the lower left control point (x2, y2) of the current decoding block is then derived.
• Here (x0, y0) are the position coordinates of the upper left control point, (x1, y1) are the position coordinates of the upper right control point, W is the width of the current decoding block, and H is the height of the current decoding block.
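The derivation of the lower left control point from the two decoded control points can be sketched with the standard 4-parameter relation. This is a floating-point illustration; the names and the omission of fixed-point rounding are assumptions:

```python
def derive_bottom_left_mv(vx0, vy0, vx1, vy1, W, H):
    """Given the top-left (vx0, vy0) and top-right (vx1, vy1) control-point
    MVs of a W x H block, derive the bottom-left control-point MV under a
    4-parameter (zoom/rotation) affine model."""
    vx2 = vx0 - (vy1 - vy0) * H / W
    vy2 = vy0 + (vx1 - vx0) * H / W
    return vx2, vy2
```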
• When the parameter model of the current decoding block is a 6-parameter affine transformation model, the motion vector differences of the three control points of the current decoding block are decoded from the bitstream, and a new candidate motion vector group is obtained according to the MVD of each control point and the target candidate motion vector group indicated by the index. For example, the MVD of the upper left control point, the MVD of the upper right control point, and the MVD of the lower left control point are decoded from the bitstream and added to the motion vectors of the upper left, upper right, and lower left control points in the target candidate motion vector group respectively, thereby obtaining a new candidate motion vector group. The new candidate motion vector group thus includes the motion vector values of the upper left, upper right, and lower left control points of the current decoding block.
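The MVD reconstruction in step S1214 amounts to a per-control-point vector addition, sketched below with illustrative names; the number of control points is two or three depending on the parameter model:

```python
def apply_mvds(target_group, mvds):
    """target_group: list of (vx, vy) control-point MV predictors indicated
    by the parsed index; mvds: per-control-point MVDs decoded from the
    bitstream. Returns the reconstructed control-point MVs."""
    return [(vx + dvx, vy + dvy)
            for (vx, vy), (dvx, dvy) in zip(target_group, mvds)]
```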
• Step S1215: The video decoder uses the affine transformation model to obtain the motion vector value of each sub-block in the current decoding block according to the motion vector values of the control points of the current decoding block determined above.
• Specifically, the new candidate motion vector group obtained based on the target candidate motion vector group and the MVDs includes the motion vectors of two control points (the upper left and upper right control points) or of three control points (for example, the upper left, upper right, and lower left control points).
• The motion information of a pixel at a preset position in a motion compensation unit can be used to represent the motion information of all pixels in that motion compensation unit.
• Assuming the size of the motion compensation unit is M × N, the preset position pixel can be the center point (M/2, N/2) of the motion compensation unit, the upper left pixel (0, 0), the upper right pixel (M-1, 0), or a pixel at another position.
  • FIG. 8A illustrates a 4 ⁇ 4 motion compensation unit
  • FIG. 8B illustrates an 8 ⁇ 8 motion compensation unit.
• The coordinates of the center point of a motion compensation unit relative to the pixel at the upper left vertex of the current decoding block are calculated using formula (8-1), where i is the index of the motion compensation unit in the horizontal direction (from left to right), j is its index in the vertical direction (from top to bottom), and (x(i,j), y(i,j)) represents the coordinates of the center of the (i, j)-th motion compensation unit relative to the pixel at the upper left control point of the current decoding block.
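A common form of formula (8-1) computes the center of the (i, j)-th M × N motion compensation unit as below; the exact precision handling in the specification may differ, so this is a sketch:

```python
def subblock_center(i, j, M=4, N=4):
    """Coordinates of the center of the (i, j)-th M x N motion compensation
    unit relative to the top-left pixel of the current block."""
    x = M * i + M / 2
    y = N * j + N / 2
    return x, y
```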
• In an optional solution, if the current decoding block is a 6-parameter decoding block, when the motion vectors of one or more sub-blocks of the current decoding block are obtained based on the target candidate motion vector group, if the lower boundary of the current decoding block coincides with the lower boundary of the CTU in which the current decoding block is located, then the motion vector of the sub-block in the lower left corner of the current decoding block is calculated from the 6-parameter affine model constructed from the three control points and the position coordinates (0, H) of the lower left corner of the current decoding block, and the motion vector of the sub-block in the lower right corner is calculated from the 6-parameter affine model and the position coordinates (W, H) of the lower right corner of the current decoding block. That is, the position coordinates (0, H) of the lower left corner of the current decoding block are substituted into the 6-parameter affine model to obtain the motion vector of the lower left corner sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model for calculation), and the position coordinates (W, H) of the lower right corner are substituted into the 6-parameter affine model to obtain the motion vector of the lower right corner sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model for calculation).
• In this way, when the motion vectors of the lower left and lower right control points of the current decoding block are used later (for example, when the candidate motion vector prediction value MVP list of a subsequent block is constructed based on the motion vectors of the lower left and lower right control points of the current block), exact values rather than estimated values are used.
• Here W is the width of the current decoding block, and H is the height of the current decoding block.
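The choice of evaluation position described above (the center point normally, the block corners for the bottom corner sub-blocks when the block's lower boundary coincides with the CTU lower boundary) can be sketched as follows; this is an illustrative helper, not the normative procedure:

```python
def eval_position(i, j, M, N, W, H, bottom_on_ctu_edge):
    """Position at which the affine model is evaluated for sub-block (i, j)
    of a W x H block split into M x N motion compensation units. When the
    block's lower boundary coincides with the CTU lower boundary, the
    bottom-left and bottom-right sub-blocks use the exact block corners
    (0, H) and (W, H) instead of their centers."""
    last_i, last_j = W // M - 1, H // N - 1
    if bottom_on_ctu_edge and j == last_j and i == 0:
        return (0, H)                       # bottom-left corner
    if bottom_on_ctu_edge and j == last_j and i == last_i:
        return (W, H)                       # bottom-right corner
    return (M * i + M / 2, N * j + N / 2)   # sub-block center
```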
• In an optional solution, if the current decoding block is a 4-parameter decoding block, when the motion vectors of one or more sub-blocks of the current decoding block are obtained based on the target candidate motion vector group, if the lower boundary of the current decoding block coincides with the lower boundary of the CTU in which the current decoding block is located, then the motion vector of the sub-block in the lower left corner of the current decoding block is calculated from the 4-parameter affine model constructed from the two control points and the position coordinates (0, H) of the lower left corner of the current decoding block, and the motion vector of the sub-block in the lower right corner is calculated from the 4-parameter affine model and the position coordinates (W, H) of the lower right corner of the current decoding block. That is, the position coordinates (0, H) of the lower left corner of the current decoding block are substituted into the 4-parameter affine model to obtain the motion vector of the lower left corner sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model for calculation), and the position coordinates (W, H) of the lower right corner are substituted into the 4-parameter affine model to obtain the motion vector of the lower right corner sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model for calculation).
• In this way, when the motion vectors of the lower left and lower right control points of the current decoding block are used later (for example, when the candidate motion vector prediction value MVP list of a subsequent block is constructed based on the motion vectors of the lower left and lower right control points of the current block), exact values rather than estimated values are used.
• Here W is the width of the current decoding block, and H is the height of the current decoding block.
• Step S1216: The video decoder performs motion compensation according to the motion vector value of each sub-block in the current decoding block to obtain the pixel prediction value of each sub-block. For example, the motion vector and the reference frame index value of each sub-block are used to find the corresponding sub-block in the reference frame, and interpolation filtering is performed to obtain the pixel prediction value of each sub-block.
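A whole-pel simplification of the motion compensation in step S1216 is sketched below; a real decoder additionally applies sub-pel interpolation filtering and selects the reference picture by its index. All names here are illustrative:

```python
def motion_compensate(ref_frame, x0, y0, mv, M, N):
    """Copy an M x N sub-block from ref_frame (a list of pixel rows) at the
    position (x0, y0) displaced by the integer motion vector mv = (dx, dy).
    Fractional-pel interpolation is deliberately omitted."""
    dx, dy = mv
    return [[ref_frame[y0 + dy + r][x0 + dx + c] for c in range(M)]
            for r in range(N)]
```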
• Step S1221: The video decoder constructs a candidate motion information list.
• Specifically, the video decoder constructs a candidate motion information list (also referred to as a candidate motion vector list) through an inter prediction unit (also referred to as an inter prediction module). The list may be constructed in one of the two ways provided below, or by a combination of the two, and the constructed candidate motion information list is a triplet candidate motion information list. The two ways are specifically as follows:
• Method 1: A motion vector prediction method based on a motion model is used to construct the candidate motion information list.
  • all or part of neighboring blocks of the current decoding block are traversed in a predetermined order to determine the neighboring affine decoding blocks, and the number of the determined neighboring affine decoding blocks may be one or more.
  • the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in order to determine neighboring affine decoding blocks among the neighboring blocks A, B, C, D, and E.
• Specifically, the inter prediction unit determines a set of candidate motion vector prediction values (each set being a 2-tuple or a triplet) according to each adjacent affine decoding block. One adjacent affine decoding block is used below as an example for description; this adjacent affine decoding block is called the first adjacent affine decoding block. The details are as follows:
  • a first affine model is determined according to a motion vector of a control point of a first adjacent affine decoding block, and then a motion vector of a control point of the current decoding block is predicted according to the first affine model, which is specifically described as follows:
• If the first adjacent affine decoding block is located in the CTU above the current decoding block and the first adjacent affine decoding block is a four-parameter affine decoding block, the position coordinates and motion vectors of the two bottom control points of the first adjacent affine decoding block are obtained. For example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower left control point and the position coordinates (x7, y7) and motion vector (vx7, vy7) of the lower right control point can be obtained.
  • a first affine model is formed according to the motion vectors of the two control points at the bottom of the first adjacent affine decoding block (the first affine model obtained at this time is a 4-parameter affine model).
• The motion vectors of the control points of the current decoding block are predicted according to the first affine model. For the triplet case, the position coordinates of the upper left control point, the upper right control point, and the lower left control point of the current decoding block are substituted into the first affine model respectively, thereby predicting the motion vectors of the upper left, upper right, and lower left control points of the current decoding block; these form a candidate motion vector triplet, which is added to the candidate motion information list, as shown in formulas (1), (2), and (3).
• Alternatively, the motion vectors of the control points of the current decoding block are predicted according to the first affine model as follows: the position coordinates of the upper left control point and of the upper right control point of the current decoding block may be substituted into the first affine model, so as to predict the motion vector of the upper left control point and the motion vector of the upper right control point of the current decoding block; these form a candidate motion vector 2-tuple, which is added to the candidate motion information list, as shown in formulas (1) and (2).
• Here (x2, y2) are the coordinates of the lower left control point of the current decoding block; in addition, (vx0, vy0) is the predicted motion vector of the upper left control point of the current decoding block, (vx1, vy1) is the predicted motion vector of the upper right control point, and (vx2, vy2) is the predicted motion vector of the lower left control point.
• If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block and the first neighboring affine decoding block is a six-parameter affine decoding block, a candidate motion vector prediction value of the control points of the current block is not generated based on the first neighboring affine decoding block.
  • the manner of predicting the motion vector of the control point of the current decoding block is not limited here.
  • an optional determination method is also exemplified below:
• Optionally, the position coordinates and motion vectors of the three control points of the first adjacent affine decoding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower left control point.
  • a 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first adjacent affine decoding block.
• Then the position coordinates of the upper left control point, the upper right control point, and the lower left control point of the current decoding block are substituted into the 6-parameter affine model to predict the motion vectors of the upper left, upper right, and lower left control points of the current decoding block, as shown in formulas (4), (5), and (6).
  • Formulas (4) and (5) have been described previously.
  • (vx0, vy0) is the predicted motion vector of the upper left control point of the current decoding block;
  • (vx1, vy1) is the predicted motion vector of the upper right control point of the current decoding block;
  • (vx2, vy2) is the predicted motion vector of the lower left control point of the current decoding block.
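The control-point prediction described above can be sketched as follows. The standard 6-parameter affine formulation below is assumed from the surrounding text; it is meant to illustrate the role of formulas (4)-(6), not to reproduce them verbatim, and all names are illustrative.

```python
def predict_control_points_6param(neighbor_ctrl, current_positions):
    """Predict motion vectors at the current block's control-point
    positions from a neighboring affine block's three control points
    (upper left, upper right, lower left) using a 6-parameter model."""
    (x4, y4, vx4, vy4), (x5, y5, vx5, vy5), (x6, y6, vx6, vy6) = neighbor_ctrl
    w = x5 - x4  # horizontal span: upper-left to upper-right control point
    h = y6 - y4  # vertical span: upper-left to lower-left control point
    predicted = []
    for x, y in current_positions:
        vx = vx4 + (vx5 - vx4) * (x - x4) / w + (vx6 - vx4) * (y - y4) / h
        vy = vy4 + (vy5 - vy4) * (x - x4) / w + (vy6 - vy4) * (y - y4) / h
        predicted.append((vx, vy))
    return predicted
```

For a purely translational neighbor (all three control-point motion vectors equal), every predicted vector equals that common motion vector, as expected.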
  • Manner 2: A motion vector prediction method based on a combination of control points is used to construct the candidate motion information list.
  • The following two solutions are exemplified as Solution A and Solution B:
  • Solution A: Combine the motion information of two control points of the current decoding block to construct a 4-parameter affine transformation model.
  • The combinations of two control points are {CP1, CP4}, {CP2, CP3}, {CP1, CP2}, {CP2, CP4}, {CP1, CP3}, and {CP3, CP4}.
  • For example, a 4-parameter affine transformation model constructed by using the control points CP1 and CP2 is denoted as Affine (CP1, CP2).
  • Different combinations of control points can also be converted into control points at the same positions. For example, the 4-parameter affine transformation model obtained from the combination {CP1, CP4}, {CP2, CP3}, {CP2, CP4}, {CP1, CP3}, or {CP3, CP4} is converted to be represented by the control points {CP1, CP2} or {CP1, CP2, CP3}.
  • The conversion method is to substitute the motion vectors of the control points and their coordinate information into formula (9-1) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2} to obtain their motion vectors, which are used as a set of candidate motion vector prediction values.
  • a0, a1, a2, and a3 are parameters in the parameter model, and (x, y) represents position coordinates.
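The fit-then-evaluate conversion in Solution A can be sketched as follows. The concrete parameterization (a 4-parameter similarity model with the sign convention shown) is an assumption for illustration and may differ from the patent's exact formula (9-1).

```python
def fit_4param_affine(p1, p2):
    """Fit a 4-parameter model  vx = a2 + a0*x + a1*y,
    vy = a3 - a1*x + a0*y  to two control points p = (x, y, vx, vy)."""
    x1, y1, v1x, v1y = p1
    x2, y2, v2x, v2y = p2
    dx, dy = x2 - x1, y2 - y1
    dvx, dvy = v2x - v1x, v2y - v1y
    d = dx * dx + dy * dy  # nonzero for two distinct control points
    a0 = (dvx * dx + dvy * dy) / d
    a1 = (dvx * dy - dvy * dx) / d
    a2 = v1x - a0 * x1 - a1 * y1
    a3 = v1y + a1 * x1 - a0 * y1
    return a0, a1, a2, a3

def eval_4param(model, x, y):
    """Evaluate the fitted model at a control-point position, e.g. at
    the coordinates of {CP1, CP2} to obtain the converted MV group."""
    a0, a1, a2, a3 = model
    return a2 + a0 * x + a1 * y, a3 - a1 * x + a0 * y
```

Substituting the coordinates of CP1 and CP2 into `eval_4param` yields the set of candidate motion vector prediction values represented by those two control points.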
  • a set of motion vector prediction values represented by the upper left control point and the upper right control point can also be converted according to the following formula, and added to the candidate motion information list:
  • {CP1, CP3} is converted into {CP1, CP2, CP3} by formula (9-3):
  • {CP1, CP4} is converted into {CP1, CP2, CP3} by formula (11):
  • {CP3, CP4} is converted into {CP1, CP2, CP3} by formula (13):
  • Solution B: Combine the motion information of three control points of the current decoding block to construct a 6-parameter affine transformation model.
  • The combinations of three control points are {CP1, CP2, CP4}, {CP1, CP2, CP3}, {CP2, CP3, CP4}, and {CP1, CP3, CP4}.
  • For example, a 6-parameter affine transformation model constructed by using the control points CP1, CP2, and CP3 is denoted as Affine (CP1, CP2, CP3).
  • Different combinations of control points can also be converted into control points at the same positions. For example, the 6-parameter affine transformation model of {CP1, CP2, CP4}, {CP2, CP3, CP4}, or {CP1, CP3, CP4} is converted to be represented by the control points {CP1, CP2, CP3}.
  • The transformation method is to substitute the motion vectors of the control points and their coordinate information into formula (14) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2, CP3} to obtain their motion vectors, which are used as a set of candidate motion vector prediction values.
  • a1, a2, a3, a4, a5, and a6 are parameters in the parameter model, and (x, y) represents position coordinates.
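A minimal sketch of fitting the 6-parameter model of formula (14) from three control points, assuming the generic form vx = a1 + a3·x + a4·y, vy = a2 + a5·x + a6·y (the exact parameter naming and form are assumptions):

```python
def fit_6param_affine(p1, p2, p3):
    """Fit vx = a1 + a3*x + a4*y, vy = a2 + a5*x + a6*y to three
    control points p = (x, y, vx, vy) via Cramer's rule on the
    3x3 system [1, x, y] * [offset, x-coef, y-coef]^T = v."""
    pts = [p1, p2, p3]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    A = [[1, p[0], p[1]] for p in pts]
    dA = det3(A)  # nonzero when the three control points are not collinear

    def solve(rhs):
        sol = []
        for c in range(3):
            m = [row[:] for row in A]
            for r in range(3):
                m[r][c] = rhs[r]
            sol.append(det3(m) / dA)
        return sol

    a1, a3, a4 = solve([p[2] for p in pts])  # fit the vx component
    a2, a5, a6 = solve([p[3] for p in pts])  # fit the vy component
    return a1, a2, a3, a4, a5, a6

def eval_6param(model, x, y):
    """Evaluate the model at a position, e.g. at {CP1, CP2, CP3}."""
    a1, a2, a3, a4, a5, a6 = model
    return a1 + a3 * x + a4 * y, a2 + a5 * x + a6 * y
```

Evaluating the fitted model at the coordinates of CP1, CP2, and CP3 yields the converted set of candidate motion vector prediction values.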
  • a set of motion vector prediction values represented by the upper left control point, the upper right control point, and the lower left control point can also be converted according to the following formula, and added to the candidate motion information list:
  • The candidate motion information list may be constructed by using only the candidate motion vector prediction values obtained in the first manner, or by using only the candidate motion vector prediction values obtained in the second manner.
  • Alternatively, the candidate motion vector prediction values obtained in the first manner and those obtained in the second manner may be used jointly to construct the candidate motion information list.
  • the candidate motion information list can be pruned and sorted according to pre-configured rules, and then truncated or filled to a specific number.
  • When each group of candidate motion vector prediction values in the candidate motion information list includes motion vector prediction values of three control points, the candidate motion information list may be referred to as a triple list; when each group of candidate motion vector prediction values in the candidate motion information list includes motion vector prediction values of two control points, the candidate motion information list may be referred to as a two-tuple list.
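The prune/sort/truncate/fill step can be sketched as follows. The concrete pruning rule (exact-duplicate removal) and the zero-MV filler are illustrative assumptions, since the patent leaves the pre-configured rules open:

```python
def build_candidate_list(candidates, max_len, fill=((0, 0),) * 3):
    """Prune duplicate candidate MV groups (keeping first-seen order),
    then truncate or pad the list to a fixed length max_len."""
    out = []
    for group in candidates:
        if group not in out:      # pruning: drop exact duplicates
            out.append(group)
        if len(out) == max_len:   # truncation to the target length
            break
    while len(out) < max_len:     # filling with a placeholder group
        out.append(fill)
    return out
```

Each entry is a tuple of per-control-point motion vectors, so a triple list holds 3-tuples and a two-tuple list holds 2-tuples.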
  • Step S1222: The video decoder parses the bitstream to obtain an index.
  • Specifically, the video decoder may parse the bitstream through an entropy decoding unit; the index is used to indicate a target candidate motion vector group of the current decoding block, where the target candidate motion vector group represents the motion vector prediction values of a set of control points of the current decoding block.
  • Step S1223: The video decoder determines a target motion vector group from the candidate motion information list according to the index. Specifically, the target motion vector group determined from the candidate motion information list according to the index is used as the optimal candidate motion vector prediction value (optionally, when the length of the candidate motion information list is 1, the bitstream does not need to be parsed to obtain the index, and the target motion vector group is determined directly), specifically the optimal motion vector prediction values of 2 or 3 control points. For example, the video decoder parses the index number from the bitstream and then determines the optimal motion vector prediction values of 2 or 3 control points from the candidate motion information list, where each group of candidate motion vector prediction values in the candidate motion information list corresponds to its own index number.
  • Step S1224: The video decoder obtains the motion vector value of each sub-block in the current decoding block according to the motion vector values of the control points of the current decoding block determined above and the affine transformation model.
  • The motion information of a pixel at a preset position in a motion compensation unit can be used to represent the motion information of all pixels in that motion compensation unit.
  • The preset-position pixel can be the center point (M/2, N/2) of the motion compensation unit, the upper left pixel (0, 0), the upper right pixel (M-1, 0), or another location.
  • FIG. 8A illustrates a 4 ⁇ 4 motion compensation unit
  • FIG. 8B illustrates an 8 ⁇ 8 motion compensation unit.
  • The coordinates of the center point of each motion compensation unit relative to the top left pixel of the current decoding block are calculated using formula (5), where i is the index of the i-th motion compensation unit in the horizontal direction (from left to right), j is the index of the j-th motion compensation unit in the vertical direction (from top to bottom), and (x(i,j), y(i,j)) represents the coordinates of the center point of the (i, j)-th motion compensation unit relative to the upper left control point pixel of the current decoding block.
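A sketch of the center-point computation, assuming the usual formulation x(i,j) = M·i + M/2, y(i,j) = N·j + N/2 for M×N motion compensation units (the patent's formula (5) is not reproduced here, so this formulation is an assumption):

```python
def subblock_centers(W, H, M, N):
    """Center of each MxN motion compensation unit of a WxH block,
    relative to the block's top-left pixel; row-major [j][i] layout."""
    return [[(M * i + M / 2, N * j + N / 2) for i in range(W // M)]
            for j in range(H // N)]
```

For an 8×8 block split into 4×4 units this yields centers (2, 2), (6, 2), (2, 6), and (6, 6), matching the FIG. 8A illustration style.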
  • If the current decoding block is a 6-parameter decoding block, when the motion vectors of one or more sub-blocks of the current decoding block are obtained based on the target candidate motion vector group, and the lower boundary of the current decoding block coincides with the lower boundary of the CTU where the current decoding block is located, the motion vector of the sub-block in the lower left corner of the current decoding block is calculated from the 6-parameter affine model constructed from the three control points and the position coordinates (0, H) of the lower left corner of the current decoding block, and the motion vector of the sub-block in the lower right corner is calculated from the 6-parameter affine model and the position coordinates (W, H) of the lower right corner of the current decoding block.
  • For example, the position coordinates (0, H) of the lower left corner of the current decoding block are substituted into the 6-parameter affine model to obtain the motion vector of the lower left corner of the current decoding block (instead of substituting the coordinates of the center point of the lower left sub-block into the affine model for calculation), and the position coordinates (W, H) of the lower right corner are substituted into the 6-parameter affine model to obtain the motion vector of the lower right corner of the current decoding block (instead of substituting the coordinates of the center point of the lower right sub-block into the affine model for calculation).
  • In this way, when the motion vectors of the lower left control point and the lower right control point of the current decoding block are used later (for example, when the candidate motion information lists of other subsequent blocks are constructed based on the motion vectors of the lower left control point and the lower right control point of the current block), accurate values instead of estimated values are used.
  • W is the width of the current decoding block, and H is the height of the current decoding block.
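The corner-substitution rule above can be sketched as follows. The function only selects which coordinate is substituted into the affine model (it applies equally to the 4-parameter case below); names and the sub-block grid layout are illustrative assumptions:

```python
def subblock_eval_point(i, j, W, H, M, N, bottom_on_ctu_boundary):
    """Coordinate substituted into the affine model for sub-block (i, j)
    of a WxH block split into MxN units. Normally the sub-block center;
    for the lower-left and lower-right sub-blocks of a block whose lower
    boundary coincides with the CTU lower boundary, the exact corner
    coordinates (0, H) / (W, H) are used instead, so that later blocks
    reuse accurate control-point motion vectors rather than estimates."""
    last_i, last_j = W // M - 1, H // N - 1
    if bottom_on_ctu_boundary and j == last_j and i == 0:
        return (0, H)          # lower-left corner, not the center point
    if bottom_on_ctu_boundary and j == last_j and i == last_i:
        return (W, H)          # lower-right corner, not the center point
    return (M * i + M / 2, N * j + N / 2)  # default: sub-block center
```

When the block's lower boundary does not touch the CTU boundary, every sub-block simply uses its center point.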
  • If the current decoding block is a 4-parameter decoding block, when the motion vectors of one or more sub-blocks of the current decoding block are obtained based on the target candidate motion vector group, and the lower boundary of the current decoding block coincides with the lower boundary of the CTU where the current decoding block is located, the motion vector of the sub-block in the lower left corner of the current decoding block is calculated from the 4-parameter affine model constructed from the two control points and the position coordinates (0, H) of the lower left corner of the current decoding block, and the motion vector of the sub-block in the lower right corner is calculated from the 4-parameter affine model and the position coordinates (W, H) of the lower right corner of the current decoding block.
  • For example, the position coordinates (0, H) of the lower left corner of the current decoding block are substituted into the 4-parameter affine model to obtain the motion vector of the lower left sub-block of the current decoding block (instead of substituting the coordinates of the center point of the lower left sub-block into the affine model for calculation), and the position coordinates (W, H) of the lower right corner are substituted into the 4-parameter affine model to obtain the motion vector of the sub-block in the lower right corner (instead of substituting the coordinates of the center point of the lower right sub-block into the affine model for calculation).
  • In this way, when the motion vectors of the lower left control point and the lower right control point of the current decoding block are used later (for example, when the candidate motion information lists of other subsequent blocks are constructed based on the motion vectors of the lower left control point and the lower right control point of the current block), accurate values instead of estimated values are used.
  • W is the width of the current decoding block, and H is the height of the current decoding block.
  • Step S1225: The video decoder performs motion compensation according to the motion vector value of each sub-block in the current decoding block to obtain the pixel prediction value of each sub-block. Specifically, the pixel prediction value of the current decoding block is obtained according to the motion vectors of the one or more sub-blocks of the current decoding block, as well as the reference frame index and the prediction direction indicated by the index.
  • With the above method, the first set of control points includes the lower left control point and the lower right control point of the first neighboring affine decoding block, instead of fixing the upper left control point, the upper right control point, and the lower left control point of the first neighboring decoding block as the first group of control points, as in the prior art (or fixing the upper left control point and the upper right control point of the first neighboring decoding block as the first group of control points).
  • In this way, the information of the first set of control points (for example, position coordinates, motion vectors, etc.) can directly reuse information already read from memory, thereby reducing memory reads and improving decoding performance.
  • FIG. 10 is a schematic block diagram of an implementation manner of an encoding device or a decoding device (hereinafter referred to as a decoding device 1000) according to an embodiment of the present application.
  • the decoding device 1000 may include a processor 1010, a memory 1030, and a bus system 1050.
  • the processor and the memory are connected through a bus system, the memory is used to store instructions, and the processor is used to execute the instructions stored in the memory.
  • The memory of the encoding device stores program code, and the processor can call the program code stored in the memory to perform the various video encoding or decoding methods described in this application, especially the video encoding or decoding methods in the various new inter prediction modes, and the methods for predicting motion information in the various new inter prediction modes. To avoid repetition, details are not described here again.
  • The processor 1010 may be a central processing unit (CPU), or the processor 1010 may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 1030 may include a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device may also be used as the memory 1030.
  • the memory 1030 may include code and data 1031 accessed by the processor 1010 using the bus 1050.
  • the memory 1030 may further include an operating system 1033 and an application program 1035.
  • the application program 1035 includes at least one program that allows the processor 1010 to perform the video encoding or decoding method (especially the encoding method or the decoding method described in this application).
  • the application program 1035 may include applications 1 to N, which further includes a video encoding or decoding application (referred to as a video decoding application) that executes the video encoding or decoding method described in this application.
  • the bus system 1050 may include a power bus, a control bus, and a status signal bus in addition to a data bus. However, for the sake of clarity, various buses are marked as the bus system 1050 in the figure.
  • the decoding device 1000 may further include one or more output devices, such as a display 1070.
  • The display 1070 may be a touch-sensitive display that combines the display with a tactile unit that operatively senses touch input.
  • the display 1070 may be connected to the processor 1010 via a bus 1050.
  • FIG. 11 is an explanatory diagram of an example of a video encoding system 1100 including the encoder 20 of FIG. 2A and / or the decoder 200 of FIG. 2B according to an exemplary embodiment.
  • the system 1100 may implement a combination of various techniques of the present application.
  • The video encoding system 1100 may include an imaging device 1101, a video encoder 100, a video decoder 200 (and/or a video encoder implemented by the logic circuit 1107 of the processing unit 1106), an antenna 1102, one or more processors 1103, one or more memories 1104, and/or a display device 1105.
  • the imaging device 1101, the antenna 1102, the processing unit 1106, the logic circuit 1107, the video encoder 100, the video decoder 200, the processor 1103, the memory 1104, and / or the display device 1105 can communicate with each other.
  • Although the video encoding system 1100 is shown with both the video encoder 100 and the video decoder 200, in different examples the video encoding system 1100 may include only the video encoder 100 or only the video decoder 200.
  • the video encoding system 1100 may include an antenna 1102.
  • the antenna 1102 may be used to transmit or receive an encoded bit stream of video data.
  • the video encoding system 1100 may include a display device 1105.
  • the display device 1105 may be used to present video data.
  • the logic circuit 1107 may be implemented by the processing unit 1106.
  • the processing unit 1106 may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like.
  • the video encoding system 1100 may also include an optional processor 1103, which may similarly include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like.
  • the logic circuit 1107 may be implemented by hardware, such as dedicated hardware for video encoding, and the processor 1103 may be implemented by general software, operating system, and the like.
  • The memory 1104 may be any type of memory, such as volatile memory (for example, static random access memory (SRAM), dynamic random access memory (DRAM), etc.) or non-volatile memory (for example, flash memory, etc.).
  • the memory 1104 may be implemented by a cache memory.
  • the logic circuit 1107 may access the memory 1104 (eg, for implementing an image buffer).
  • the logic circuit 1107 and / or the processing unit 1106 may include a memory (eg, a cache, etc.) for implementing an image buffer or the like.
  • video encoder 100 implemented by logic circuits may include an image buffer (eg, implemented by processing unit 1106 or memory 1104) and a graphics processing unit (eg, implemented by processing unit 1106).
  • the graphics processing unit may be communicatively coupled to the image buffer.
  • the graphics processing unit may include a video encoder 100 implemented by a logic circuit 1107 to implement the various modules discussed with reference to FIG. 2A and / or any other encoder system or subsystem described herein.
  • Logic circuits can be used to perform various operations discussed herein.
  • Video decoder 200 may be implemented in a similar manner through logic circuit 1107 to implement the various modules discussed with reference to decoder 200 of FIG. 2B and / or any other decoder system or subsystem described herein.
  • The video decoder 200 implemented by a logic circuit may include an image buffer (e.g., implemented by the processing unit 1106 or the memory 1104) and a graphics processing unit (e.g., implemented by the processing unit 1106).
  • the graphics processing unit may be communicatively coupled to the image buffer.
  • the graphics processing unit may include a video decoder 200 implemented by a logic circuit 1107 to implement various modules discussed with reference to FIG. 2B and / or any other decoder system or subsystem described herein.
  • the antenna 1102 of the video encoding system 1100 may be used to receive an encoded bit stream of video data.
  • The encoded bitstream may contain data, indicators, index values, mode selection data, etc. related to the encoded video frames discussed herein, such as data related to coding partitions (e.g., transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the coding partitions).
  • the video encoding system 1100 may also include a video decoder 200 coupled to the antenna 1102 and used to decode the encoded bitstream.
  • the display device 1105 is used to present a video frame.
  • The description order of the steps does not represent their execution order; the steps may be performed in the order described above or in a different order.
  • the above step S1211 may be performed after step S1212, or may be performed before step S1212; the above step S1221 may be performed after step S1222, or may be performed before step S1222; the remaining steps are not illustrated here one by one.
  • Computer-readable media may include computer-readable storage media, which corresponds to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another (e.g., according to a communication protocol).
  • computer-readable media may generally correspond to (1) tangible computer-readable storage media that is non-transitory, or (2) a communication medium such as a signal or carrier wave.
  • a data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and / or data structures used to implement the techniques described in this application.
  • the computer program product may include a computer-readable medium.
  • By way of example, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • any connection is properly termed a computer-readable medium.
  • For example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other temporary media, but are instead directed to non-transitory tangible storage media.
  • Magnetic disks and optical discs, as used herein, include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), and Blu-ray discs, where magnetic disks usually reproduce data magnetically, while optical discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • The instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
  • Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
  • The functions described with reference to the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
  • the techniques can be fully implemented in one or more circuits or logic elements.
  • The techniques of this application may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
  • Various components, modules, or units are described in this application to emphasize functional aspects of the apparatus for performing the disclosed techniques, but do not necessarily need to be implemented by different hardware units.
  • Rather, as described above, the various units may be combined in a codec hardware unit, or provided by a collection of interoperable hardware units (including one or more processors as described above), in conjunction with suitable software and/or firmware.


Abstract

The embodiment of the present invention provides a video encoder, a video decoder and a corresponding method. The method comprises: parsing code streams to obtain an index, the index is used for indicating a target candidate motion vector set of a current decoding block; determining, according to the index, the target candidate motion vector set from a candidate motion vector list; if the first neighboring affine decoding block is a four-parameter affine decoding block and is located in the CTU above the current decoding block, the candidate motion vector list includes a first group of candidate motion vector prediction values obtained on the basis of a bottom left control point and a bottom right control point of the first neighboring affine decoding block; and obtaining a pixel prediction value of the current decoding block on the basis of the target candidate motion vector set. According to the embodiment of the present invention, memory reading can be reduced so as to promote the encoding and decoding performance.

Description

Video encoder, video decoder and corresponding method

Technical Field
The present application relates to the technical field of video encoding and decoding, and in particular, to an inter prediction method and apparatus for a video image, and a corresponding encoder and decoder.
Background
Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones (so-called "smart phones"), video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, for example, the video compression techniques described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the video coding standard H.265/High Efficiency Video Coding (HEVC), and extensions of such standards. Video devices can transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video compression techniques.
Video compression techniques perform spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (that is, a video frame or a portion of a video frame) may be partitioned into image blocks, which may also be referred to as tree blocks, coding units (CUs), and/or coding nodes. Image blocks in a to-be-intra-coded (I) slice of an image are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same image. Image blocks in a to-be-inter-coded (P or B) slice of an image may use spatial prediction with respect to reference samples in neighboring blocks in the same image or temporal prediction with respect to reference samples in other reference images. An image may be referred to as a frame, and a reference image may be referred to as a reference frame.
Various video coding standards, including the High Efficiency Video Coding (HEVC) standard, have proposed predictive coding modes for image blocks, that is, predicting a current to-be-coded block based on already-coded blocks of video data. In the intra prediction mode, the current block is predicted based on one or more previously decoded neighboring blocks in the same image as the current block; in the inter prediction mode, the current block is predicted based on already decoded blocks in different images.
Motion vector prediction is a key technique that affects encoding/decoding performance. In the existing motion vector prediction process, there are motion vector prediction methods based on a translational motion model for translational objects in a picture, and motion vector prediction methods based on a motion model as well as motion vector prediction methods based on control point combinations for non-translational objects. Among them, the motion-model-based motion vector prediction method reads a large amount of memory, which results in a slower encoding/decoding speed. How to reduce the amount of memory read in the motion vector prediction process is a technical problem being studied by those skilled in the art.
Summary of the Invention
The embodiments of the present application provide an inter prediction method and apparatus for a video image, and a corresponding encoder and decoder, which reduce the amount of memory reading to a certain extent, thereby improving encoding and decoding performance.
In a first aspect, an embodiment of the present application discloses an encoding method, including: determining a target candidate motion vector group from a candidate motion vector list (for example, an affine transformation candidate motion vector list) according to a rate-distortion cost criterion, where the target candidate motion vector group represents motion vector prediction values of a set of control points of a current coding block (specifically, a current affine coding block); if a first neighboring affine coding block is a four-parameter affine coding block and the first neighboring affine coding block is located in the coding tree unit (CTU) above the current coding block, the candidate motion vector list includes a first group of candidate motion vector prediction values, where the first group of candidate motion vector prediction values is obtained based on the lower left control point and the lower right control point of the first neighboring affine coding block. Optionally, the candidate motion vector list may be constructed in the following manner: one or more neighboring affine coding blocks of the current coding block, including the first neighboring affine coding block, are determined in the order of neighboring block A, neighboring block B, neighboring block C, neighboring block D, and neighboring block E (as shown in FIG. 7A); then, if the first neighboring affine coding block is a four-parameter affine coding block, a first affine model based on the lower left control point and the lower right control point of the first neighboring affine coding block is used to obtain the motion vector prediction values of the first set of control points of the current coding block, and these serve as the first group of candidate motion vectors in the candidate motion vector list. After the target candidate motion vector group is determined in the above manner, an index corresponding to the target candidate motion vector group is encoded into the bitstream to be transmitted (optionally, when the length of the candidate motion vector list is 1, no index is needed to indicate the target motion vector group).
In the foregoing method, the candidate motion vector list may include only one candidate motion vector group or multiple candidate motion vector groups, where each candidate motion vector group may be a motion vector 2-tuple or a motion vector 3-tuple. When there are multiple candidate motion vector groups, the first group of candidate motion vector predictors is one of the multiple candidate motion vector groups, and the other candidate motion vector groups may be generated in the same manner as, or in a different manner from, the first group of candidate motion vector predictors. Further, the target candidate motion vector group is the optimal candidate motion vector group selected from the candidate motion vector list according to the rate-distortion cost criterion. If the first group of candidate motion vector predictors is optimal, the selected target candidate motion vector group is the first group of candidate motion vector predictors; otherwise, the selected target candidate motion vector group is not the first group of candidate motion vector predictors. The first neighboring affine coding block is a four-parameter affine coding block among the neighboring blocks of the current coding block; which one it is is not limited here. Using FIG. 7A as an example, it may be neighboring block A, neighboring block B, or another neighboring block. In addition, "first", "second", "third", and the like appearing elsewhere in the embodiments of this application each denote a particular object, and the objects denoted by "first", "second", and "third" are different. For example, if a first group of control points and a second group of control points appear, the first group of control points and the second group of control points refer to different control points. Moreover, "first", "second", and the like in the embodiments of this application do not imply any order.
It can be understood that, when the coding tree unit (CTU) in which the first neighboring affine coding block is located is above the position of the current coding block, the information of the bottom-most control points of the first neighboring affine coding block has already been read from memory. Therefore, in the foregoing solution, when candidate motion vectors are constructed based on a first group of control points of the first neighboring affine coding block, the first group of control points includes the bottom-left control point and the bottom-right control point of the first neighboring affine coding block, rather than, as in the prior art, always using the top-left control point, top-right control point, and bottom-left control point of the first neighboring coding block as the first group of control points (or always using the top-left control point and top-right control point of the first neighboring coding block as the first group of control points). Therefore, with the method for determining the first group of control points in this application, the information of the first group of control points (for example, position coordinates and motion vectors) can, with high probability, directly reuse information that has already been read from memory, thereby reducing memory reads and improving encoding performance. In addition, because the first neighboring affine coding block is specifically required to be a four-parameter affine coding block, only the bottom-left control point and the bottom-right control point of the first neighboring affine coding block are needed when candidate motion vectors are constructed based on its group of control points, and no additional control points are required, which further ensures that memory reads are not excessive.
In a possible implementation, if the current coding block is a four-parameter affine coding block, the first group of candidate motion vector predictors is used to represent motion vector predictors of the top-left control point and the top-right control point of the current coding block. For example, the position coordinates of the top-left control point and the top-right control point of the current coding block are substituted into a first affine model to obtain the motion vector predictors of the top-left control point and the top-right control point of the current coding block, where the first affine model is determined based on the motion vectors and position coordinates of the bottom-left control point and the bottom-right control point of the first neighboring affine coding block.
If the current coding block is a six-parameter affine coding block, the first group of candidate motion vector predictors is used to represent motion vector predictors of the top-left control point, the top-right control point, and the bottom-left control point of the current coding block. For example, the position coordinates of the top-left control point, the top-right control point, and the bottom-left control point of the current coding block are substituted into the first affine model to obtain the motion vector predictors of the top-left control point, the top-right control point, and the bottom-left control point of the current coding block.
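The derivation above can be sketched as follows. This is a minimal illustration, not the normative codec computation: the function names are assumptions and floating-point arithmetic is used (real codecs use fixed-point shifts), but the structure matches a four-parameter affine model built from two control points on the neighbor's bottom row and then evaluated at the current block's control-point positions.

```python
def affine_model_from_bottom_cps(cp_bl, cp_br):
    """Build a 4-parameter affine model from the neighboring block's
    bottom-left and bottom-right control points.  Each control point is
    ((x, y), (vx, vy)); both points lie on the same row, so the
    horizontal distance x7 - x6 equals the neighbor's width cuW."""
    (x6, y6), (vx6, vy6) = cp_bl
    (x7, _y7), (vx7, vy7) = cp_br
    w = x7 - x6
    a = (vx7 - vx6) / w   # horizontal gradient of vx
    b = (vy7 - vy6) / w   # horizontal gradient of vy

    def mv(x, y):
        # 4-parameter (rotation/zoom) model evaluated at (x, y)
        vx = vx6 + a * (x - x6) - b * (y - y6)
        vy = vy6 + b * (x - x6) + a * (y - y6)
        return (vx, vy)

    return mv

def current_block_mvps(model, x0, y0, W):
    """Predictors for the current block's top-left (x0, y0) and
    top-right (x0 + W, y0) control points."""
    return model(x0, y0), model(x0 + W, y0)
```

For a six-parameter current block, the same model would additionally be evaluated at the bottom-left position (x0, y0 + H).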
In yet another possible implementation, in the advanced motion vector prediction (AMVP) mode, the method further includes: using the target candidate motion vector group as a search start point, searching, within a preset search range according to the rate-distortion cost criterion, for motion vectors of a group of control points with the lowest cost; and then determining motion vector differences (MVDs) between the motion vectors of the group of control points and the target candidate motion vector group. For example, if the first group of control points includes a first control point and a second control point, an MVD between the motion vector of the first control point and the motion vector predictor of the first control point in the group of control points represented by the target candidate motion vector group needs to be determined, as well as an MVD between the motion vector of the second control point and the motion vector predictor of the second control point in that group. In this case, encoding the index corresponding to the target candidate motion vector group into the bitstream to be transmitted may specifically include: encoding the MVDs and the index corresponding to the target candidate motion vector group into the bitstream to be transmitted.
In yet another optional solution, in the merge mode, encoding the index corresponding to the target candidate motion vector group into the bitstream to be transmitted may specifically include: encoding an index corresponding to the target candidate motion vector group, the reference frame index, and the prediction direction into the bitstream to be transmitted.
In a possible implementation, the position coordinates (x6, y6) of the bottom-left control point and the position coordinates (x7, y7) of the bottom-right control point of the first neighboring affine coding block are both calculated based on the position coordinates (x4, y4) of the top-left control point of the first neighboring affine coding block, where the position coordinates (x6, y6) of the bottom-left control point of the first neighboring affine coding block are (x4, y4 + cuH), the position coordinates (x7, y7) of the bottom-right control point of the first neighboring affine coding block are (x4 + cuW, y4 + cuH), cuW is the width of the first neighboring affine coding block, and cuH is the height of the first neighboring affine coding block. In addition, the motion vector of the bottom-left control point of the first neighboring affine coding block is the motion vector of the bottom-left sub-block of the first neighboring affine coding block, and the motion vector of the bottom-right control point of the first neighboring affine coding block is the motion vector of the bottom-right sub-block of the first neighboring affine coding block. It can be seen that the position coordinates of the bottom-left control point and the bottom-right control point of the first neighboring affine coding block are both derived rather than read from memory, so this method can further reduce memory reads and improve encoding performance. As an alternative, the position coordinates of the bottom-left control point and the bottom-right control point may also be stored in memory in advance and read from memory when needed later.
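The coordinate derivation above is straightforward to express in code. A minimal sketch, assuming illustrative function and argument names:

```python
def neighbor_bottom_cps(x4, y4, cuW, cuH, mv_bl_sub, mv_br_sub):
    """Derive the neighbor's bottom control points from its top-left
    corner (x4, y4) and its dimensions, avoiding a memory read for the
    coordinates.  The motion vectors are those of the neighbor's
    bottom-left and bottom-right sub-blocks."""
    cp_bl = ((x4, y4 + cuH), mv_bl_sub)          # (x6, y6)
    cp_br = ((x4 + cuW, y4 + cuH), mv_br_sub)    # (x7, y7)
    return cp_bl, cp_br
```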
In yet another optional solution, after the target candidate motion vector group is determined from the candidate motion vector list according to the rate-distortion cost criterion, the method further includes: obtaining motion vectors of one or more sub-blocks of the current coding block based on the target candidate motion vector group; and predicting pixel values of the current coding block based on the motion vectors of the one or more sub-blocks of the current coding block. Optionally, when the motion vectors of the one or more sub-blocks of the current coding block are obtained based on the target candidate motion vector group, if the bottom boundary of the current coding block coincides with the bottom boundary of the CTU in which the current coding block is located, the motion vector of the bottom-left sub-block of the current coding block is calculated based on the target candidate motion vector group and the position coordinates (0, H) of the bottom-left corner of the current coding block, and the motion vector of the bottom-right sub-block of the current coding block is calculated based on the target candidate motion vector group and the position coordinates (W, H) of the bottom-right corner of the current coding block. For example, an affine model is constructed based on the target candidate motion vector group; the motion vector of the bottom-left sub-block of the current coding block is then obtained by substituting the position coordinates (0, H) of the bottom-left corner of the current coding block into the affine model (rather than the coordinates of the center point of the bottom-left sub-block), and the motion vector of the bottom-right sub-block is obtained by substituting the position coordinates (W, H) of the bottom-right corner of the current coding block into the affine model (rather than the coordinates of the center point of the bottom-right sub-block). In this way, when the motion vectors of the bottom-left control point and the bottom-right control point of the current coding block are used later (for example, when another block subsequently constructs its candidate motion vector list based on the motion vectors of the bottom-left control point and the bottom-right control point of the current block), accurate values rather than estimated values are used, where W is the width of the current coding block and H is the height of the current coding block.
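The corner-substitution rule can be sketched as follows. Here `affine_mv` stands for the model built from the target candidate motion vector group, and coordinates are relative to the current block's top-left corner; all names are illustrative assumptions, not the normative procedure.

```python
def subblock_mv(affine_mv, sub_x, sub_y, sub_w, sub_h, W, H, at_ctu_bottom):
    """Motion vector of the sub-block whose top-left corner is (sub_x, sub_y).

    Normally the sub-block's center point is plugged into the affine model.
    When the current block's bottom boundary coincides with the CTU's bottom
    boundary, the bottom-left and bottom-right sub-blocks instead use the
    exact block corners (0, H) and (W, H), so that a later block reusing
    these control points gets accurate rather than estimated values."""
    if at_ctu_bottom and sub_y + sub_h == H:
        if sub_x == 0:
            return affine_mv(0, H)            # bottom-left corner
        if sub_x + sub_w == W:
            return affine_mv(W, H)            # bottom-right corner
    # default case: evaluate the model at the sub-block center
    return affine_mv(sub_x + sub_w / 2.0, sub_y + sub_h / 2.0)
```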
In a second aspect, an embodiment of this application provides a video encoder, including several functional units configured to implement any one of the methods in the first aspect. For example, the video encoder may include:
an inter prediction unit, configured to determine a target candidate motion vector group from a candidate motion vector list according to a rate-distortion cost criterion, where the target candidate motion vector group represents motion vector predictors of a group of control points of a current coding block; and
an entropy coding unit, configured to encode an index corresponding to the target candidate motion vector group into a bitstream and transmit the bitstream,
where, if a first neighboring affine coding block is a four-parameter affine coding block and the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block, the candidate motion vector list includes a first group of candidate motion vector predictors, and the first group of candidate motion vector predictors is obtained based on a bottom-left control point and a bottom-right control point of the first neighboring affine coding block.
In a third aspect, an embodiment of this application provides a device for encoding video data, where the device includes:
a memory, configured to store video data in the form of a bitstream; and
a video encoder, configured to determine a target candidate motion vector group from a candidate motion vector list according to a rate-distortion cost criterion, where the target candidate motion vector group represents motion vector predictors of a group of control points of a current coding block, and configured to encode an index corresponding to the target candidate motion vector group into a bitstream and transmit the bitstream, where, if a first neighboring affine coding block is a four-parameter affine coding block and the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block, the candidate motion vector list includes a first group of candidate motion vector predictors, and the first group of candidate motion vector predictors is obtained based on a bottom-left control point and a bottom-right control point of the first neighboring affine coding block.
In a fourth aspect, an embodiment of this application provides an encoding device, including a non-volatile memory and a processor coupled to each other, where the processor invokes program code stored in the memory to perform some or all of the steps of any one of the methods in the first aspect.
In a fifth aspect, an embodiment of this application provides a computer-readable storage medium storing program code, where the program code includes instructions for performing some or all of the steps of any one of the methods in the first aspect.
In a sixth aspect, an embodiment of this application provides a computer program product; when the computer program product runs on a computer, the computer is caused to perform some or all of the steps of any one of the methods in the first aspect.
It should be understood that the second to sixth aspects of this application are consistent with the technical solution of the first aspect of this application; the beneficial effects achieved by each aspect and its corresponding feasible implementations are similar and are not described again.
In a seventh aspect, an embodiment of this application discloses a decoding method, including: parsing a bitstream to obtain an index, where the index is used to indicate a target candidate motion vector group of a current decoding block (specifically, a current affine decoding block); and then determining, according to the index, the target candidate motion vector group from a candidate motion vector list (for example, an affine transform candidate motion vector list) (optionally, when the length of the candidate motion vector list is 1, the target candidate motion vector group can be determined directly without parsing the bitstream for an index), where the target candidate motion vector group represents motion vector predictors of a group of control points of the current decoding block. If a first neighboring affine decoding block is a four-parameter affine decoding block and the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block, the candidate motion vector list includes a first group of candidate motion vector predictors, and the first group of candidate motion vector predictors is obtained based on a bottom-left control point and a bottom-right control point of the first neighboring affine decoding block. Optionally, the candidate motion vector list may be constructed as follows: one or more neighboring affine decoding blocks of the current decoding block are determined in the order of neighboring block A, neighboring block B, neighboring block C, neighboring block D, and neighboring block E (as shown in FIG. 7A), where the one or more neighboring affine decoding blocks include the first neighboring affine decoding block; then, if the first neighboring affine decoding block is a four-parameter affine decoding block, motion vector predictors of a first group of control points of the current decoding block are obtained by using a first affine model based on the bottom-left control point and the bottom-right control point of the first neighboring affine decoding block, where the motion vector predictors of the first group of control points of the current decoding block serve as the first group of candidate motion vectors in the candidate motion vector list.
In the foregoing method, the candidate motion vector list may include only one candidate motion vector group or multiple candidate motion vector groups, where each candidate motion vector group may be a motion vector 2-tuple or a motion vector 3-tuple. When there are multiple candidate motion vector groups, the first group of candidate motion vector predictors is one of the multiple candidate motion vector groups, and the other candidate motion vector groups may be generated in the same manner as, or in a different manner from, the first group of candidate motion vector predictors. Further, the target candidate motion vector group is the optimal candidate motion vector group selected from the candidate motion vector list according to a rate-distortion cost criterion. If the first group of candidate motion vector predictors is optimal, the selected target candidate motion vector group is the first group of candidate motion vector predictors; otherwise, the selected target candidate motion vector group is not the first group of candidate motion vector predictors. The first neighboring affine decoding block is a four-parameter affine decoding block among the neighboring blocks of the current decoding block; which one it is is not limited here. Using FIG. 7A as an example, it may be neighboring block A, neighboring block B, or another neighboring block. In addition, "first", "second", "third", and the like appearing elsewhere in the embodiments of this application each denote a particular object, and the objects denoted by "first", "second", and "third" are different. For example, if a first group of control points and a second group of control points appear, the first group of control points and the second group of control points refer to different control points. Moreover, "first", "second", and the like in the embodiments of this application do not imply any order.
It can be understood that, when the coding tree unit (CTU) in which the first neighboring affine decoding block is located is above the position of the current decoding block, the information of the bottom-most control points of the first neighboring affine decoding block has already been read from memory. Therefore, in the foregoing solution, when candidate motion vectors are constructed based on a first group of control points of the first neighboring affine decoding block, the first group of control points includes the bottom-left control point and the bottom-right control point of the first neighboring affine decoding block, rather than, as in the prior art, always using the top-left control point, top-right control point, and bottom-left control point of the first neighboring decoding block as the first group of control points (or always using the top-left control point and top-right control point of the first neighboring decoding block as the first group of control points). Therefore, with the method for determining the first group of control points in this application, the information of the first group of control points (for example, position coordinates and motion vectors) can, with high probability, directly reuse information that has already been read from memory, thereby reducing memory reads and improving decoding performance. In addition, because the first neighboring affine decoding block is specifically required to be a four-parameter affine decoding block, only the bottom-left control point and the bottom-right control point of the first neighboring affine decoding block are needed when candidate motion vectors are constructed based on its group of control points, and no additional control points are required, which further ensures that memory reads are not excessive.
In a possible implementation, if the current decoding block is a four-parameter affine decoding block, the first group of candidate motion vector predictors is used to represent motion vector predictors of the top-left control point and the top-right control point of the current decoding block. For example, the position coordinates of the top-left control point and the top-right control point of the current decoding block are substituted into a first affine model to obtain the motion vector predictors of the top-left control point and the top-right control point of the current decoding block, where the first affine model is determined based on the motion vectors and position coordinates of the bottom-left control point and the bottom-right control point of the first neighboring affine decoding block.
If the current decoding block is a six-parameter affine decoding block, the first group of candidate motion vector predictors is used to represent motion vector predictors of the top-left control point, the top-right control point, and the bottom-left control point of the current decoding block. For example, the position coordinates of the top-left control point, the top-right control point, and the bottom-left control point of the current decoding block are substituted into the first affine model to obtain the motion vector predictors of the top-left control point, the top-right control point, and the bottom-left control point of the current decoding block.
In yet another optional solution, obtaining the motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group is specifically: obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on a second affine model (for example, substituting the coordinates of the center points of the one or more sub-blocks into the second affine model to obtain the motion vectors of the one or more sub-blocks), where the second affine model is determined based on the target candidate motion vector group and the position coordinates of a group of control points of the current decoding block.
In an optional solution, in the advanced motion vector prediction (AMVP) mode, obtaining the motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group may specifically include: obtaining a new candidate motion vector group based on motion vector differences (MVDs) parsed from the bitstream and the target candidate motion vector group indicated by the index; and then obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on the new candidate motion vector group, for example, first determining a second affine model based on the new candidate motion vector group and the position coordinates of a group of control points of the current decoding block, and then obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on the second affine model.
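In AMVP mode, forming the new candidate motion vector group amounts to adding each parsed MVD to the corresponding control-point predictor. A minimal sketch, with illustrative names:

```python
def apply_mvds(mvp_group, mvds):
    """Add each parsed motion vector difference to the corresponding
    control-point motion vector predictor, yielding the new candidate
    motion vector group used to build the second affine model."""
    return [(px + dx, py + dy) for (px, py), (dx, dy) in zip(mvp_group, mvds)]
```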
In yet another possible implementation, in the merge mode, predicting the pixel values of the current decoding block based on the motion vectors of the one or more sub-blocks of the current decoding block may specifically include: predicting the pixel values of the current decoding block according to the motion vectors of the one or more sub-blocks of the current decoding block and the reference frame index and prediction direction indicated by the index.
In yet another optional solution, the position coordinates (x6, y6) of the bottom-left control point and the position coordinates (x7, y7) of the bottom-right control point of the first neighboring affine decoding block are both computed from the position coordinates (x4, y4) of the top-left control point of the first neighboring affine decoding block, where the position coordinates (x6, y6) of the bottom-left control point of the first neighboring affine decoding block are (x4, y4 + cuH), and the position coordinates (x7, y7) of the bottom-right control point of the first neighboring affine decoding block are (x4 + cuW, y4 + cuH); cuW is the width of the first neighboring affine decoding block, and cuH is the height of the first neighboring affine decoding block. In addition, the motion vector of the bottom-left control point of the first neighboring affine decoding block is the motion vector of its bottom-left sub-block, and the motion vector of the bottom-right control point of the first neighboring affine decoding block is the motion vector of its bottom-right sub-block. It can be seen that the position coordinates of the bottom-left control point and of the bottom-right control point of the first neighboring affine decoding block are both derived rather than read from memory, so this method further reduces memory reads and improves decoding performance. As another alternative, the position coordinates of the bottom-left and bottom-right control points may instead be stored in memory in advance and read from memory when needed later.
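An illustrative sketch (not part of the claimed embodiments) of the coordinate derivation described above: the bottom control-point coordinates of a neighboring affine block follow from its top-left corner and its dimensions alone, so no extra coordinates need to be fetched from memory. Function and variable names are invented for illustration.

```python
# Hypothetical sketch: derive the bottom-left (x6, y6) and bottom-right (x7, y7)
# control-point coordinates of a neighboring affine block of size cuW x cuH from
# its top-left control point (x4, y4), instead of reading them from memory.

def derive_bottom_control_points(x4, y4, cu_w, cu_h):
    """Return ((x6, y6), (x7, y7)) given the top-left control point (x4, y4)."""
    bottom_left = (x4, y4 + cu_h)           # (x6, y6) = (x4, y4 + cuH)
    bottom_right = (x4 + cu_w, y4 + cu_h)   # (x7, y7) = (x4 + cuW, y4 + cuH)
    return bottom_left, bottom_right

print(derive_bottom_control_points(64, 32, 16, 8))
# → ((64, 40), (80, 40))
```

Because the derivation is pure arithmetic on values already at hand, it trades two memory reads for two additions, which is the source of the memory-pressure reduction described above.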
In yet another optional solution, when the motion vectors of one or more sub-blocks of the current decoding block are obtained based on the target candidate motion vector group, if the bottom boundary of the current decoding block coincides with the bottom boundary of the CTU in which the current decoding block is located, the motion vector of the sub-block in the bottom-left corner of the current decoding block is computed from the target candidate motion vector group and the position coordinates (0, H) of the bottom-left corner of the current decoding block, and the motion vector of the sub-block in the bottom-right corner of the current decoding block is computed from the target candidate motion vector group and the position coordinates (W, H) of the bottom-right corner of the current decoding block. For example, an affine model is constructed from the target candidate motion vector group; substituting the position coordinates (0, H) of the bottom-left corner of the current decoding block into the affine model yields the motion vector of the bottom-left sub-block (rather than substituting the coordinates of the center point of the bottom-left sub-block into the affine model), and substituting the position coordinates (W, H) of the bottom-right corner into the affine model yields the motion vector of the bottom-right sub-block (rather than substituting the coordinates of the center point of the bottom-right sub-block into the affine model). In this way, when the motion vectors of the bottom-left and bottom-right control points of the current decoding block are later used (for example, when another block subsequently constructs its candidate motion vector list based on the motion vectors of the bottom-left and bottom-right control points of the current block), exact values are used instead of estimates. Here, W is the width of the current decoding block and H is the height of the current decoding block.
In an eighth aspect, an embodiment of this application provides a video decoder, including:

an entropy decoding unit, configured to parse a bitstream to obtain an index, where the index indicates a target candidate motion vector group of a current decoding block; and

an inter prediction unit, configured to: determine the target candidate motion vector group from a candidate motion vector list according to the index, where the target candidate motion vector group represents motion vector predictors of a group of control points of the current decoding block, and where, if a first neighboring affine decoding block is a four-parameter affine decoding block and is located in a decoding tree unit (CTU) above the current decoding block, the candidate motion vector list includes a first group of candidate motion vector predictors obtained based on the bottom-left control point and the bottom-right control point of the first neighboring affine decoding block; obtain motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group; and predict pixel prediction values of the current decoding block based on the motion vectors of the one or more sub-blocks of the current decoding block.
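The division of labor between the two units above can be sketched minimally: the entropy decoding unit yields an index, and the inter prediction unit uses it to pick the target candidate motion vector group out of the candidate list. This is an illustrative sketch only; the list contents are made-up placeholders, not values from any real bitstream.

```python
# Minimal sketch of the decoder-side selection step: an index parsed from the
# bitstream selects one group of control-point MV predictors from the candidate
# motion vector list. All values are invented for illustration.

candidate_list = [
    [(4, 2), (8, 2)],    # first group: predictors for a group of control points
    [(0, 0), (1, -1)],   # second group
]

def select_target_group(candidates, index):
    """Return the target candidate motion vector group indicated by index."""
    if not 0 <= index < len(candidates):
        raise ValueError("index out of range of candidate motion vector list")
    return candidates[index]

target = select_target_group(candidate_list, 1)
print(target)
# → [(0, 0), (1, -1)]
```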
In a ninth aspect, an embodiment of this application provides a device for decoding video data, including:

a memory, configured to store video data in the form of a bitstream; and

a video decoder, configured to: parse the bitstream to obtain an index, where the index indicates a target candidate motion vector group of a current decoding block; determine the target candidate motion vector group from a candidate motion vector list according to the index, where the target candidate motion vector group represents motion vector predictors of a group of control points of the current decoding block, and where, if a first neighboring affine decoding block is a four-parameter affine decoding block and is located in a decoding tree unit (CTU) above the current decoding block, the candidate motion vector list includes a first group of candidate motion vector predictors obtained based on the bottom-left control point and the bottom-right control point of the first neighboring affine decoding block; obtain motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group; and predict pixel prediction values of the current decoding block based on the motion vectors of the one or more sub-blocks of the current decoding block.
In a tenth aspect, an embodiment of this application provides a decoding device, including a non-volatile memory and a processor coupled to each other, where the processor calls program code stored in the memory to perform some or all of the steps of any method of the seventh aspect.
In an eleventh aspect, an embodiment of this application provides a computer-readable storage medium storing program code, where the program code includes instructions for performing some or all of the steps of any method of the seventh aspect.
In a twelfth aspect, an embodiment of this application provides a computer program product which, when run on a computer, causes the computer to perform some or all of the steps of any method of the seventh aspect.
It should be understood that the eighth to twelfth aspects of this application are consistent with the technical solution of the seventh aspect of this application; the beneficial effects obtained by each aspect and its corresponding feasible implementations are similar and are not described again.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of this application or in the background more clearly, the following describes the accompanying drawings required in the embodiments of this application or in the background.
FIG. 1 is a schematic block diagram of a video encoding and decoding system according to an embodiment of this application;

FIG. 2A is a schematic block diagram of a video encoder according to an embodiment of this application;

FIG. 2B is a schematic block diagram of a video decoder according to an embodiment of this application;

FIG. 3 is a flowchart of a method for inter prediction when encoding a video image according to an embodiment of this application;

FIG. 4 is a flowchart of a method for inter prediction when decoding a video image according to an embodiment of this application;

FIG. 5 is a schematic diagram of motion information of a current image block and a reference block according to an embodiment of this application;

FIG. 6 is a schematic flowchart of an encoding method according to an embodiment of this application;

FIG. 7A is a schematic diagram of a neighboring-block scenario according to an embodiment of this application;

FIG. 7B is a schematic diagram of another neighboring-block scenario according to an embodiment of this application;

FIG. 8A is a schematic structural diagram of a motion compensation unit according to an embodiment of this application;

FIG. 8B is a schematic structural diagram of another motion compensation unit according to an embodiment of this application;

FIG. 9 is a schematic flowchart of a decoding method according to an embodiment of this application;

FIG. 9A is a schematic flowchart of constructing a candidate motion vector list according to an embodiment of this application;

FIG. 10 is a schematic structural diagram of an encoding device or a decoding device according to an embodiment of this application;

FIG. 11 is a video coding system 1100 including the encoder 20 of FIG. 2A and/or the decoder 200 of FIG. 2B according to an exemplary embodiment.
DETAILED DESCRIPTION

The following describes the embodiments of this application with reference to the accompanying drawings in the embodiments of this application.
Non-translational motion model prediction means that the encoder side and the decoder side (for example, a video encoder and a video decoder) use the same motion model to derive the motion information (such as a motion vector) of each sub motion compensation unit (also called a sub-block) in the current encoding/decoding block, and perform motion compensation according to the motion information of the sub motion compensation units to obtain a prediction block, thereby improving prediction efficiency. Deriving the motion information of the motion compensation units (also called sub-blocks) in the current encoding/decoding block involves motion-model-based motion vector prediction. Currently, the position coordinates and motion vectors of the top-left, top-right, and bottom-left control points of a neighboring affine decoding block of the current encoding/decoding block are usually used to derive an affine model; motion vector predictors of a group of control points of the current encoding/decoding block are then derived from that affine model and used as one group of candidate motion vector predictors in a candidate motion vector list. However, the position coordinates and motion vectors of the top-left, top-right, and bottom-left control points of the neighboring affine decoding block used in this motion vector prediction process all need to be read from memory in real time, which increases memory read pressure. The embodiments of this application focus on how to reduce this memory read pressure, which involves optimization of both the encoder side and the decoder side. To better understand the idea of the embodiments of this application, their application scenarios are first introduced below.
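The derivation described above (fit an affine model to a neighboring block's control points, then evaluate it at the current block's control-point positions) can be sketched as follows. This is a hypothetical illustration using a standard four-parameter affine model fitted to two control points that share a y coordinate (such as a block's two bottom control points); all coordinates and motion vectors are invented.

```python
# Hypothetical sketch of motion-model-based MV prediction: fit a 4-parameter
# affine model to two control points of a neighboring affine block, then
# evaluate it at the current block's control-point positions to obtain one
# group of candidate MV predictors for the candidate motion vector list.

def affine_from_two_points(p0, mv0, p1, mv1):
    """Return an evaluator for the 4-parameter affine model through
    (p0, mv0) and (p1, mv1); p0 and p1 must share the same y coordinate."""
    w = p1[0] - p0[0]
    a = (mv1[0] - mv0[0]) / w   # d(vx)/dx
    b = (mv1[1] - mv0[1]) / w   # d(vy)/dx

    def mv_at(x, y):
        dx, dy = x - p0[0], y - p0[1]
        return (mv0[0] + a * dx - b * dy, mv0[1] + b * dx + a * dy)

    return mv_at

# Neighbor's bottom-left / bottom-right control points and MVs (assumed values):
mv_at = affine_from_two_points((0, 8), (2.0, 0.0), (16, 8), (4.0, 0.0))
top_left_pred = mv_at(0, 16)    # predictor at current block's top-left corner
top_right_pred = mv_at(16, 16)  # predictor at current block's top-right corner
print(top_left_pred, top_right_pred)
# → (2.0, 1.0) (4.0, 1.0)
```

Using the neighbor's bottom control points (whose coordinates can be derived arithmetically, as discussed later in this document) rather than its top-left/top-right/bottom-left triple is what avoids the extra real-time memory reads.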
Encoding a video stream, or a part thereof such as a video frame or an image block, can exploit temporal and spatial similarities in the video stream to improve coding performance. For example, for a current image block of a video stream, motion information for the current image block can be predicted based on a previously coded block in the video stream, and the difference (also called the residual) between the prediction block and the current image block (that is, the original block) can be identified, so that the current image block is coded based on the previously coded block. In this way, only the residual and some parameters used to generate the current image block, rather than the current image block in its entirety, are included in the digital video output bitstream. This technique may be called inter prediction.
A motion vector is an important parameter in the inter prediction process; it represents the spatial displacement of a previously coded block relative to the current coding block. A motion vector can be obtained by a motion estimation method such as a motion search. In early inter prediction techniques, the bits representing the motion vector were included in the encoded bitstream to allow the decoder to reproduce the prediction block and thus obtain the reconstructed block. To further improve coding efficiency, it was later proposed to code the motion vector differentially with respect to a reference motion vector, that is, to code only the difference between the motion vector and the reference motion vector instead of the motion vector as a whole. In some cases, the reference motion vector may be selected from motion vectors previously used in the video stream; coding the current motion vector relative to a previously used motion vector can further reduce the number of bits included in the encoded video bitstream.
FIG. 1 is a block diagram of an example video coding system 1 described in the embodiments of this application. As used herein, the term "video coder" generally refers to both video encoders and video decoders. In this application, the term "video coding" or "coding" may generally refer to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video coding system 1 are configured to predict the motion information, such as motion vectors, of the currently coded image block or its sub-blocks according to the various method examples described for any of the multiple new inter prediction modes proposed in this application, so that the predicted motion vector is as close as possible to the motion vector obtained by a motion estimation method. The motion vector difference therefore need not be transmitted during encoding, which further improves coding and decoding performance.
As shown in FIG. 1, the video coding system 1 includes a source device 10 and a destination device 20. The source device 10 generates encoded video data, and may therefore be referred to as a video encoding device. The destination device 20 may decode the encoded video data generated by the source device 10, and may therefore be referred to as a video decoding device. Various implementations of the source device 10, the destination device 20, or both may include one or more processors and a memory coupled to the one or more processors. The memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein.

The source device 10 and the destination device 20 may include a variety of devices, including desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.

The destination device 20 may receive the encoded video data from the source device 10 via a link 30. The link 30 may include one or more media or devices capable of moving the encoded video data from the source device 10 to the destination device 20. In one example, the link 30 may include one or more communication media that enable the source device 10 to transmit the encoded video data directly to the destination device 20 in real time. In this example, the source device 10 may modulate the encoded video data according to a communication standard (for example, a wireless communication protocol) and may transmit the modulated video data to the destination device 20. The one or more communication media may include wireless and/or wired communication media, such as the radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (for example, the Internet). The one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from the source device 10 to the destination device 20.
In another example, the encoded data may be output from an output interface 140 to a storage device 40. Similarly, the encoded data may be accessed from the storage device 40 through an input interface 240. The storage device 40 may include any of a variety of distributed or locally accessed data storage media, such as a hard disk drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded video data.

In another example, the storage device 40 may correspond to a file server or another intermediate storage device that can hold the encoded video generated by the source device 10. The destination device 20 may access the stored video data from the storage device 40 via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to the destination device 20. Example file servers include a web server (for example, for a website), an FTP server, a network-attached storage (NAS) device, or a local disk drive. The destination device 20 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (for example, a Wi-Fi connection), a wired connection (for example, DSL or a cable modem), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of the encoded video data from the storage device 40 may be a streaming transmission, a download transmission, or a combination of the two.

The motion vector prediction techniques of this application can be applied to video coding to support a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (for example, via the Internet), encoding of video data for storage on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, the video coding system 1 may be used to support one-way or two-way video transmission for applications such as video streaming, video playback, video broadcasting, and/or video telephony.

The video coding system 1 illustrated in FIG. 1 is merely an example, and the techniques of this application can be applied to video coding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device. In other examples, data is retrieved from local storage, streamed over a network, and so on. A video encoding device may encode data and store the data in memory, and/or a video decoding device may retrieve data from memory and decode the data. In many examples, encoding and decoding are performed by devices that do not communicate with each other but only encode data to memory and/or retrieve data from memory and decode the data.
In the example of FIG. 1, the source device 10 includes a video source 120, a video encoder 100, and an output interface 140. In some examples, the output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. The video source 120 may include a video capture device (for example, a camera), a video archive containing previously captured video data, a video feed interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.

The video encoder 100 may encode video data from the video source 120. In some examples, the source device 10 transmits the encoded video data directly to the destination device 20 via the output interface 140. In other examples, the encoded video data may also be stored on the storage device 40 for later access by the destination device 20 for decoding and/or playback.

In the example of FIG. 1, the destination device 20 includes an input interface 240, a video decoder 200, and a display device 220. In some examples, the input interface 240 includes a receiver and/or a modem. The input interface 240 may receive the encoded video data via the link 30 and/or from the storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. Generally, the display device 220 displays decoded video data. The display device 220 may include a variety of display devices, for example, a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display device.

Although not illustrated in FIG. 1, in some aspects the video encoder 100 and the video decoder 200 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle the encoding of both audio and video in a common data stream or in separate data streams. In some examples, if applicable, the MUX-DEMUX units may conform to the ITU H.223 multiplexer protocol or other protocols such as the User Datagram Protocol (UDP).
The video encoder 100 and the video decoder 200 may each be implemented as any of a variety of circuits, for example: one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If this application is implemented partially in software, a device may store the instructions for the software in a suitable non-volatile computer-readable storage medium and may execute the instructions in hardware using one or more processors to implement the techniques of this application. Any of the foregoing (including hardware, software, a combination of hardware and software, and so on) may be regarded as one or more processors. Each of the video encoder 100 and the video decoder 200 may be included in one or more encoders or decoders, any of which may be integrated as part of a combined encoder/decoder (codec) in the corresponding device.

This application may generally refer to the video encoder 100 as "signaling" or "transmitting" certain information to another device, such as the video decoder 200. The terms "signaling" and "transmitting" may generally refer to the transfer of syntax elements and/or other data used to decode the compressed video data. This transfer may occur in real time or almost in real time. Alternatively, this communication may occur after a period of time, for example when syntax elements are stored to a computer-readable storage medium in an encoded bitstream at encoding time; the decoding device may then retrieve the syntax elements at any time after they are stored to this medium.

The video encoder 100 and the video decoder 200 may operate according to a video compression standard such as High Efficiency Video Coding (HEVC) or an extension thereof, and may conform to the HEVC test model (HM). Alternatively, the video encoder 100 and the video decoder 200 may operate according to other industry standards, such as the ITU-T H.264 or H.265 standards, or extensions of such standards. However, the techniques of this application are not limited to any particular codec standard.
In one example, referring also to FIG. 3, the video encoder 100 is configured to encode syntax elements related to the current image block to be encoded into a digital video output bitstream (referred to as a bitstream or code stream for short); here, the syntax elements used for inter prediction of the current image block are referred to as inter prediction data for short. To determine the inter prediction mode used to encode the current image block, the video encoder 100 is further configured to determine or select (S301), from the candidate inter prediction mode set, the inter prediction mode used for inter prediction of the current image block (for example, to select, from among multiple new inter prediction modes, the one whose rate-distortion cost for encoding the current image block is the best trade-off or the minimum); and to encode the current image block based on the determined inter prediction mode (S303). The encoding process here may include predicting, based on the determined inter prediction mode, the motion information of one or more sub-blocks in the current image block (specifically, the motion information of each sub-block or of all sub-blocks), and performing inter prediction on the current image block using the motion information of the one or more sub-blocks in the current image block.

It should be understood that if the difference (that is, the residual) between the prediction block generated from the motion information predicted based on the new inter prediction modes proposed in this application and the current image block to be encoded (that is, the original block) is 0, the video encoder 100 only needs to encode the syntax elements related to the current image block into the bitstream (also called the code stream); otherwise, the corresponding residual needs to be encoded into the bitstream in addition to the syntax elements.
In another example, referring also to FIG. 4, the video decoder 200 is configured to: decode, from the bitstream, syntax elements related to the image block currently to be decoded (S401); when the inter prediction data indicates that the candidate inter prediction mode set (that is, the new inter prediction modes) is used to predict the current image block, determine the inter prediction mode in the candidate inter prediction mode set that is used to perform inter prediction on the current image block (S403); and decode the current image block based on the determined inter prediction mode (S405). The decoding process here may include: predicting, based on the determined inter prediction mode, motion information of one or more sub-blocks of the current image block, and performing inter prediction on the current image block by using that motion information.
Optionally, if the inter prediction data further includes a second identifier indicating which inter prediction mode is used for the current image block, the video decoder 200 is configured to determine that the inter prediction mode indicated by the second identifier is the inter prediction mode used to perform inter prediction on the current image block; or, if the inter prediction data does not include such a second identifier, the video decoder 200 is configured to determine that the first inter prediction mode, used for a non-directional motion field, is the inter prediction mode used to perform inter prediction on the current image block.
FIG. 2A is a block diagram of an example video encoder 100 described in an embodiment of this application. The video encoder 100 is configured to output video to a post-processing entity 41. The post-processing entity 41 represents an example of a video entity that can process the encoded video data from the video encoder 100, for example a media-aware network element (MANE) or a splicing/editing device. In some cases, the post-processing entity 41 may be an instance of a network entity. In some video encoding systems, the post-processing entity 41 and the video encoder 100 may be parts of separate devices, while in other cases the functionality described with respect to the post-processing entity 41 may be performed by the same device that includes the video encoder 100. In an example, the post-processing entity 41 is an instance of the storage device 40 of FIG. 1.
The video encoder 100 may encode a video image block, for example perform inter prediction on it, according to any one of the new inter prediction modes in the candidate inter prediction mode set proposed in this application, the set including modes 0, 1, 2, ..., or 10.
In the example of FIG. 2A, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a decoded picture buffer (DPB) 107, a summer 112, a transform unit 101, a quantization unit 102, and an entropy encoding unit 103. The prediction processing unit 108 includes an inter prediction unit 110 and an intra prediction unit 109. For image block reconstruction, the video encoder 100 further includes an inverse quantization unit 104, an inverse transform unit 105, and a summer 111. The filter unit 106 is intended to represent one or more loop filter units, for example a deblocking filter unit, an adaptive loop filter (ALF) unit, and a sample adaptive offset (SAO) filter unit. Although the filter unit 106 is shown in FIG. 2A as an in-loop filter, in other implementations it may be implemented as a post-loop filter. In an example, the video encoder 100 may further include a video data storage unit and a partitioning unit (not shown in the figure).
The video data storage unit may store video data to be encoded by the components of the video encoder 100. The video data stored in the video data storage unit may be obtained from the video source 120. The DPB 107 may be a reference picture storage unit that stores reference video data used by the video encoder 100 to encode video data in intra or inter coding modes. The video data storage unit and the DPB 107 may be formed by any of a variety of memory devices, for example dynamic random access memory (DRAM) including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. The video data storage unit and the DPB 107 may be provided by the same memory device or by separate memory devices. In various examples, the video data storage unit may be on-chip with the other components of the video encoder 100, or off-chip relative to those components.
As shown in FIG. 2A, the video encoder 100 receives video data and stores it in the video data storage unit. The partitioning unit partitions the video data into image blocks, and these image blocks may be further partitioned into smaller blocks, for example based on a quadtree structure or a binary-tree structure. The partitioning may also include partitioning into slices, tiles, or other larger units. The video encoder 100 generally illustrates the components that encode the image blocks within a video slice to be encoded. A slice may be divided into multiple image blocks (and possibly into sets of image blocks referred to as tiles). The prediction processing unit 108 may select one of multiple possible coding modes for the current image block, for example one of multiple intra coding modes or one of multiple inter coding modes, where the multiple inter coding modes include, but are not limited to, one or more of the modes 0, 1, 2, 3, ..., 10 proposed in this application. The prediction processing unit 108 may provide the resulting intra- or inter-coded block to the summer 112 to generate a residual block, and to the summer 111 to reconstruct the encoded block for use as part of a reference picture.
The intra prediction unit 109 within the prediction processing unit 108 may perform intra predictive encoding of the current image block relative to one or more neighboring blocks in the same frame or slice as the current block to be encoded, to remove spatial redundancy. The inter prediction unit 110 within the prediction processing unit 108 may perform inter predictive encoding of the current image block relative to one or more prediction blocks in one or more reference pictures, to remove temporal redundancy.
Specifically, the inter prediction unit 110 may be configured to determine the inter prediction mode used to encode the current image block. For example, the inter prediction unit 110 may use rate-distortion analysis to compute rate-distortion values for the various inter prediction modes in the candidate inter prediction mode set, and select the inter prediction mode with the best rate-distortion characteristics. Rate-distortion analysis generally determines the amount of distortion (or error) between an encoded block and the original, unencoded block from which it was produced, as well as the bit rate (that is, the number of bits) used to produce the encoded block. For example, the inter prediction unit 110 may determine that the inter prediction mode in the candidate set with the smallest rate-distortion cost for encoding the current image block is the inter prediction mode used to perform inter prediction on the current image block. The inter predictive encoding process, and in particular the process of predicting the motion information of one or more sub-blocks (specifically, each sub-block or all sub-blocks) of the current image block in the various inter prediction modes of this application for non-directional or directional motion fields, is described in detail below.
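The rate-distortion selection just described can be sketched as follows. This is an illustration only, not the actual implementation of unit 110; the function name and the Lagrange multiplier `lam` are hypothetical, and each candidate is assumed to have been trial-encoded to yield a distortion and a bit count:

```python
def select_inter_mode(candidates, lam):
    """Pick the mode with the smallest rate-distortion cost J = D + lam * R.

    candidates: iterable of (mode, distortion, bits) triples obtained by
    trial-encoding the current image block in each candidate mode.
    """
    best_mode, best_cost = None, float("inf")
    for mode, distortion, bits in candidates:
        cost = distortion + lam * bits  # rate-distortion cost of this mode
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode
```

With `lam = 1.0`, a mode costing (distortion 70, 20 bits) beats one costing (distortion 100, 10 bits), since 90 < 110.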
The inter prediction unit 110 is configured to predict, based on the determined inter prediction mode, motion information (for example, motion vectors) of one or more sub-blocks of the current image block, and to obtain or generate the prediction block of the current image block by using that motion information. The inter prediction unit 110 may locate the prediction block pointed to by a motion vector in one of the reference picture lists. The inter prediction unit 110 may also generate syntax elements associated with the image blocks and the video slice for use by the video decoder 200 when decoding the image blocks of the video slice. Alternatively, in an example, the inter prediction unit 110 performs a motion compensation process using the motion information of each sub-block to generate a prediction block for each sub-block, thereby obtaining the prediction block of the current image block. It should be understood that the inter prediction unit 110 here performs both the motion estimation and the motion compensation processes.
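A minimal sketch of the per-sub-block motion compensation described above, under simplifying assumptions (a single reference picture, integer-pixel motion vectors, and in-bounds displacements; not the implementation of unit 110):

```python
import numpy as np

def motion_compensate(ref, mvs, h, w, sub=4):
    """Build the prediction block of an h x w image block by copying, for
    each sub x sub sub-block, the reference samples displaced by that
    sub-block's motion vector (dx, dy).

    mvs maps sub-block grid coordinates (row, col) to integer (dx, dy).
    """
    pred = np.empty((h, w), dtype=ref.dtype)
    for y in range(0, h, sub):
        for x in range(0, w, sub):
            dx, dy = mvs[(y // sub, x // sub)]
            pred[y:y + sub, x:x + sub] = ref[y + dy:y + dy + sub,
                                             x + dx:x + dx + sub]
    return pred
```

With all motion vectors equal to (0, 0), the prediction block is simply the co-located region of the reference picture; a nonzero vector shifts only that sub-block's source region.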
Specifically, after selecting the inter prediction mode for the current image block, the inter prediction unit 110 may provide information indicating the selected inter prediction mode to the entropy encoding unit 103, so that the entropy encoding unit 103 encodes this information. In this application, the video encoder 100 may include, in the transmitted bitstream, inter prediction data related to the current image block, which may include a first identifier block_based_enable_flag to indicate whether the new inter prediction modes proposed in this application are used for inter prediction of the current image block; optionally, it may further include a second identifier block_based_index to indicate which of the new inter prediction modes is used for the current image block. The process of predicting, in the different modes 0, 1, 2, ..., 10, the motion vector of the current image block or of its sub-blocks from the motion vectors of multiple reference blocks is described in detail below.
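The decoder-side use of these two identifiers (and of the default mode described earlier for the case where the second identifier is absent) can be sketched as follows. This is illustrative only: the dict-of-parsed-syntax-elements representation and the function name are assumptions, while the identifier names block_based_enable_flag and block_based_index are those given in the text:

```python
def determine_inter_mode(inter_pred_data, default_mode=0):
    """Return the new inter prediction mode index for the current block,
    or None if the new modes are not enabled for it.

    When block_based_index (the second identifier) is absent, the first
    inter prediction mode -- used for a non-directional motion field --
    is taken as the default.
    """
    if not inter_pred_data.get("block_based_enable_flag", 0):
        return None  # fall back to conventional inter prediction
    return inter_pred_data.get("block_based_index", default_mode)
```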
The intra prediction unit 109 may perform intra prediction on the current image block. Specifically, the intra prediction unit 109 may determine the intra prediction mode used to encode the current block. For example, the intra prediction unit 109 may use rate-distortion analysis to compute rate-distortion values for the various intra prediction modes to be tested, and select the intra prediction mode with the best rate-distortion characteristics among them. In any case, after selecting an intra prediction mode for the image block, the intra prediction unit 109 may provide information indicating the selected intra prediction mode to the entropy encoding unit 103, so that the entropy encoding unit 103 encodes this information.
After the prediction processing unit 108 generates the prediction block of the current image block via inter or intra prediction, the video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. The summer 112 represents the component or components that perform this subtraction. The residual video data in the residual block may be included in one or more TUs and applied to the transform unit 101. The transform unit 101 transforms the residual video data into residual transform coefficients using a transform such as the discrete cosine transform (DCT) or a conceptually similar transform, converting the residual video data from the pixel domain to a transform domain, for example the frequency domain.
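The residual formation (summer 112) and transform (unit 101) steps can be sketched as follows. This is an illustration under assumptions: an orthonormal floating-point 2-D DCT-II stands in for whatever integer transform a real codec specifies, and the function names are hypothetical:

```python
import numpy as np

def residual_block(current, prediction):
    # Summer 112: residual = original block minus prediction block.
    return current.astype(np.int32) - prediction.astype(np.int32)

def dct2(block):
    """Separable orthonormal 2-D DCT-II of a square residual block."""
    n = block.shape[0]
    c = np.array([[np.cos(np.pi * (2 * j + 1) * i / (2 * n))
                   for j in range(n)] for i in range(n)])
    c[0] /= np.sqrt(2)      # DC basis normalization
    c *= np.sqrt(2 / n)     # orthonormal scaling
    return c @ block @ c.T  # apply the 1-D transform to rows and columns
```

For a constant residual of value v in a 4x4 block, all energy lands in the DC coefficient, which equals 4v; the AC coefficients are (numerically) zero, which is what makes the transform useful for compaction.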
The transform unit 101 may send the resulting transform coefficients to the quantization unit 102. The quantization unit 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, the quantization unit 102 may then scan the matrix containing the quantized transform coefficients. Alternatively, the entropy encoding unit 103 may perform the scan.
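Uniform scalar quantization, the simplest form of what unit 102 (and, inversely, units 104 and 204) does, can be sketched as follows. Illustrative only: real codecs derive the step size from a quantization parameter and may apply per-frequency scaling:

```python
import numpy as np

def quantize(coeffs, qstep):
    # Map each transform coefficient to an integer level (lossy step).
    return np.round(coeffs / qstep).astype(np.int32)

def dequantize(levels, qstep):
    # Inverse quantization: scale the integer levels back to coefficients.
    return levels * qstep
```

Note that quantization discards information: a coefficient of -7.0 with step 8.0 dequantizes to -8.0, not -7.0, which is exactly the distortion that the rate-distortion analysis above trades off against bit rate.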
After quantization, the entropy encoding unit 103 entropy-encodes the quantized transform coefficients. For example, the entropy encoding unit 103 may perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique. After entropy encoding by the entropy encoding unit 103, the encoded bitstream may be transmitted to the video decoder 200, or archived for later transmission or retrieval by the video decoder 200. The entropy encoding unit 103 may also entropy-encode the syntax elements of the current image block to be encoded.
The inverse quantization unit 104 and the inverse transform unit 105 apply inverse quantization and the inverse transform, respectively, to reconstruct the residual block in the pixel domain, for example for later use as part of a reference block of a reference picture. The summer 111 adds the reconstructed residual block to the prediction block generated by the inter prediction unit 110 or the intra prediction unit 109 to produce a reconstructed image block. The filter unit 106 may be applied to the reconstructed image block to reduce distortion such as block artifacts. The reconstructed image block is then stored as a reference block in the decoded picture buffer 107, and may be used by the inter prediction unit 110 as a reference block for inter prediction of blocks in subsequent video frames or pictures.
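The reconstruction performed by summer 111 (and by summer 211 on the decoder side) can be sketched as follows, assuming 8-bit samples; the clipping to the valid sample range is the reason reconstruction is exact only up to quantization error:

```python
import numpy as np

def reconstruct(prediction, residual, bit_depth=8):
    """Summer 111/211: add the reconstructed residual to the prediction
    block and clip the result to the valid sample range."""
    rec = prediction.astype(np.int32) + residual
    return np.clip(rec, 0, (1 << bit_depth) - 1).astype(np.uint8)
```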
It should be understood that other structural variations of the video encoder 100 may be used to encode the video stream. For example, for some image blocks or image frames, the video encoder 100 may quantize the residual signal directly, without processing by the transform unit 101 and correspondingly without processing by the inverse transform unit 105; or, for some image blocks or image frames, the video encoder 100 generates no residual data, and correspondingly no processing by the transform unit 101, the quantization unit 102, the inverse quantization unit 104, or the inverse transform unit 105 is needed; or, the video encoder 100 may store the reconstructed image block directly as a reference block without processing by the filter unit 106; or, the quantization unit 102 and the inverse quantization unit 104 in the video encoder 100 may be merged. The loop filter unit is optional, and in the case of lossless compression coding, the transform unit 101, the quantization unit 102, the inverse quantization unit 104, and the inverse transform unit 105 are optional. It should be understood that, depending on the application scenario, the inter prediction unit and the intra prediction unit may be selectively enabled; in this case, the inter prediction unit is enabled.
FIG. 2B is a block diagram of an example video decoder 200 described in an embodiment of this application. In the example of FIG. 2B, the video decoder 200 includes an entropy decoding unit 203, a prediction processing unit 208, an inverse quantization unit 204, an inverse transform unit 205, a summer 211, a filter unit 206, and a decoded picture buffer 207. The prediction processing unit 208 may include an inter prediction unit 210 and an intra prediction unit 209. In some examples, the video decoder 200 may perform a decoding process that is substantially the inverse of the encoding process described with respect to the video encoder 100 of FIG. 2A.
During the decoding process, the video decoder 200 receives, from the video encoder 100, an encoded video bitstream representing the image blocks of an encoded video slice and the associated syntax elements. The video decoder 200 may receive video data from a network entity 42 and, optionally, store the video data in a video data storage unit (not shown in the figure). The video data storage unit may store video data to be decoded by the components of the video decoder 200, for example the encoded video bitstream. The video data stored in the video data storage unit may be obtained, for example, from the storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium. The video data storage unit may serve as a coded picture buffer (CPB) that stores the encoded video data from the encoded video bitstream. Therefore, although the video data storage unit is not shown in FIG. 2B, the video data storage unit and the DPB 207 may be the same storage unit, or may be separately provided storage units. The video data storage unit and the DPB 207 may be formed by any of a variety of memory devices, for example dynamic random access memory (DRAM) including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. In various examples, the video data storage unit may be integrated on-chip with the other components of the video decoder 200, or disposed off-chip relative to those components.
The network entity 42 may be, for example, a server, a MANE, a video editor/splicer, or another such device for implementing one or more of the techniques described above. The network entity 42 may or may not include a video encoder, for example the video encoder 100. Before the network entity 42 sends the encoded video bitstream to the video decoder 200, the network entity 42 may implement some of the techniques described in this application. In some video decoding systems, the network entity 42 and the video decoder 200 may be parts of separate devices, while in other cases the functionality described with respect to the network entity 42 may be performed by the same device that includes the video decoder 200. In some cases, the network entity 42 may be an instance of the storage device 40 of FIG. 1.
The entropy decoding unit 203 of the video decoder 200 entropy-decodes the bitstream to produce quantized coefficients and some syntax elements. The entropy decoding unit 203 forwards the syntax elements to the prediction processing unit 208. The video decoder 200 may receive syntax elements at the video slice level and/or the image block level.
When the video slice is decoded as an intra-decoded (I) slice, the intra prediction unit 209 of the prediction processing unit 208 may generate a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture. When the video slice is decoded as an inter-decoded (that is, B or P) slice, the inter prediction unit 210 of the prediction processing unit 208 may determine, based on the syntax elements received from the entropy decoding unit 203, the inter prediction mode used to decode the current image block of the current video slice, and decode the current image block (for example, perform inter prediction) based on the determined inter prediction mode. Specifically, the inter prediction unit 210 may determine whether a new inter prediction mode is used to predict the current image block of the current video slice. If the syntax elements indicate that a new inter prediction mode is used to predict the current image block, the inter prediction unit 210 predicts, based on the new inter prediction mode (for example, a new inter prediction mode specified by a syntax element, or a default new inter prediction mode), the motion information of the current image block or of its sub-blocks, and then uses the predicted motion information to obtain or generate the prediction block of the current image block or of its sub-blocks through a motion compensation process. The motion information here may include reference picture information and motion vectors, where the reference picture information may include, but is not limited to, uni-/bi-prediction information, a reference picture list number, and the reference picture index corresponding to the reference picture list. For inter prediction, the prediction block may be generated from one of the reference pictures in one of the reference picture lists. The video decoder 200 may construct the reference picture lists, list 0 and list 1, based on the reference pictures stored in the DPB 207. The reference frame index of the current picture may be included in one or more of reference frame list 0 and list 1. In some examples, the video encoder 100 may signal a specific syntax element indicating whether a new inter prediction mode is used to decode a specific block, or may signal both whether a new inter prediction mode is used and which specific new inter prediction mode is used to decode the specific block. It should be understood that the inter prediction unit 210 here performs the motion compensation process. The inter prediction process of predicting, in the various new inter prediction modes, the motion information of the current image block or of its sub-blocks from the motion information of reference blocks is explained in detail below.
The inverse quantization unit 204 inverse-quantizes, that is, dequantizes, the quantized transform coefficients provided in the bitstream and decoded by the entropy decoding unit 203. The inverse quantization process may include: using the quantization parameter computed by the video encoder 100 for each image block in the video slice to determine the degree of quantization that was applied and, likewise, the degree of inverse quantization to apply. The inverse transform unit 205 applies an inverse transform to the transform coefficients, for example an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to produce the residual block in the pixel domain.
After the inter prediction unit 210 generates the prediction block for the current image block or for a sub-block of the current image block, the video decoder 200 sums the residual block from the inverse transform unit 205 with the corresponding prediction block generated by the inter prediction unit 210 to obtain the reconstructed block, that is, the decoded image block. The summer 211 represents the component that performs this summation. When needed, a loop filter unit (in the decoding loop or after it) may also be used to smooth pixel transitions or otherwise improve video quality. The filter unit 206 may represent one or more loop filter units, for example a deblocking filter unit, an adaptive loop filter (ALF) unit, and a sample adaptive offset (SAO) filter unit. Although the filter unit 206 is shown in FIG. 2B as an in-loop filter unit, in other implementations it may be implemented as a post-loop filter unit. In an example, the filter unit 206 is applied to the reconstructed block to reduce block distortion, and the result is output as the decoded video stream. The decoded image blocks of a given frame or picture may also be stored in the decoded picture buffer 207, which stores the reference pictures used for subsequent motion compensation. The decoded picture buffer 207 may be part of a storage unit that also stores the decoded video for later presentation on a display device (for example, the display device 220 of FIG. 1), or may be separate from such a storage unit.
It should be understood that other structural variations of the video decoder 200 may be used to decode the encoded video bitstream. For example, the video decoder 200 may generate the output video stream without processing by the filter unit 206; or, for some image blocks or image frames, the entropy decoding unit 203 of the video decoder 200 decodes no quantized coefficients, and correspondingly no processing by the inverse quantization unit 204 or the inverse transform unit 205 is needed. The loop filter unit is optional; and in the case of lossless compression, the inverse quantization unit 204 and the inverse transform unit 205 are optional. It should be understood that, depending on the application scenario, the inter prediction unit and the intra prediction unit may be selectively enabled; in this case, the inter prediction unit is enabled.
FIG. 5 is a schematic diagram illustrating motion information of an exemplary current image block 600 and its reference blocks in an embodiment of the present application. As shown in FIG. 5, W and H are the width and height of the current image block 600 and of its co-located block 600' (referred to as the collocated block for short). The reference blocks of the current image block 600 include: the upper spatial neighboring blocks and the left spatial neighboring blocks of the current image block 600, and the lower spatial neighboring blocks and the right spatial neighboring blocks of the collocated block 600', where the collocated block 600' is the image block in the reference image that has the same size, shape, and coordinates as the current image block 600. It should be noted that the motion information of the lower and right spatial neighboring blocks of the current image block does not exist, since those blocks have not yet been encoded. It should be understood that the current image block 600 and the collocated block 600' may be of any block size. For example, the current image block 600 and the collocated block 600' may include, but are not limited to, 16x16 pixels, 32x32 pixels, 32x16 pixels, 16x32 pixels, and so on. As described above, each image frame can be partitioned into image blocks for encoding. These image blocks can be further partitioned into smaller blocks; for example, the current image block 600 and the collocated block 600' can be partitioned into multiple MxN sub-blocks, that is, each sub-block is MxN pixels in size, and each reference block is also MxN pixels in size, the same size as a sub-block of the current image block. The coordinates in FIG. 5 are measured in units of MxN blocks. "MxN" and "M by N" are used interchangeably to refer to the pixel dimensions of an image block in the horizontal and vertical dimensions, that is, M pixels in the horizontal direction and N pixels in the vertical direction, where M and N are non-negative integer values. In addition, a block does not necessarily need to have the same number of pixels in the horizontal direction as in the vertical direction. For example, M = N = 4 here; of course, the sub-block size of the current image block and the size of the reference blocks may also be 8x8 pixels, 8x4 pixels, or 4x8 pixels, or the smallest prediction block size. In addition, an image block described in this application can be understood as, but is not limited to: a prediction unit (PU), a coding unit (CU), a transform unit (TU), or the like. Depending on the provisions of different video compression coding standards, a CU may include one or more prediction units PU, or the PU and the CU may have the same size. Image blocks may have fixed or variable sizes, and differ in size according to different video compression coding standards. In addition, the current image block refers to the image block currently to be encoded or decoded, for example a prediction unit to be encoded or decoded.
In one example, whether each left spatial neighboring block of the current image block 600 is available can be determined sequentially along direction 1, and whether each upper spatial neighboring block of the current image block 600 is available can be determined sequentially along direction 2, for example by determining whether a neighboring block (also referred to as a reference block; the terms are used interchangeably) is inter-coded. If the neighboring block exists and is inter-coded, the neighboring block is available; if the neighboring block does not exist or is intra-coded, the neighboring block is unavailable. If a neighboring block is intra-coded, the motion information of another adjacent reference block is copied as the motion information of that neighboring block. Whether the lower and right spatial neighboring blocks of the collocated block 600' are available is detected in a similar manner, which is not repeated here.
Further, if the size of an available reference block and the size of a sub-block of the current image block are both 4x4, the motion information of the available reference block can be fetched directly; if the size of the available reference block is, for example, 8x4 or 8x8, the motion information of its center 4x4 block can be fetched as the motion information of the available reference block, where the coordinates of the top-left vertex of the center 4x4 block relative to the top-left vertex of the reference block are ((W/4)/2*4, (H/4)/2*4), the division here being integer division. If M = 8 and N = 4, the coordinates of the top-left vertex of the center 4x4 block relative to the top-left vertex of the reference block are (4, 0). Optionally, the motion information of the top-left 4x4 block of the reference block may instead be fetched as the motion information of the available reference block, but this application is not limited thereto.
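As a sketch of the integer arithmetic above, where w and h stand for the width and height of the available reference block (the function name is illustrative, not from the text):

```python
def center_4x4_offset(w: int, h: int) -> tuple:
    """Top-left vertex of the center 4x4 block, relative to the
    reference block's top-left vertex. All divisions are integer
    divisions, as stated in the text."""
    return ((w // 4) // 2 * 4, (h // 4) // 2 * 4)
```

For an 8x4 reference block this yields (4, 0), matching the example in the text.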
To simplify the description, in the following a sub-block denotes an MxN sub-block and a neighboring block denotes a neighboring MxN block.
FIG. 6 is a flowchart illustrating a process 700 of an encoding method according to an embodiment of the present application. The process 700 may be performed by the video encoder 100; specifically, it may be performed by the inter prediction unit 110 and the entropy coding unit (also referred to as the entropy encoder) 103 of the video encoder 100. The process 700 is described as a series of steps or operations; it should be understood that the steps of process 700 may be performed in various orders and/or concurrently, and are not limited to the execution order shown in FIG. 6. Assume that a video data stream having multiple video frames is using the video encoder; if a first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block, a set of candidate motion vector predictors is determined based on the lower-left control point and the lower-right control point of the first neighboring affine coding block, corresponding to the flow shown in FIG. 6. The related description is as follows:
Step S700: The video encoder determines the inter prediction mode of the current coding block.
Specifically, the inter prediction mode may be the advanced motion vector prediction (AMVP) mode, or it may be the merge mode.
If it is determined that the inter prediction mode of the current coding block is the AMVP mode, steps S711-S713 are performed.
If it is determined that the inter prediction mode of the current coding block is the merge mode, steps S721-S723 are performed.
AMVP mode:
Step S711: The video encoder constructs a candidate motion vector predictor (MVP) list.
Specifically, the video encoder constructs the candidate MVP list (also referred to as a candidate motion vector list) through an inter prediction unit (also referred to as an inter prediction module). The list may be constructed in either of the two manners provided below, or in a combination of the two; the constructed candidate MVP list may be a triplet candidate MVP list or a 2-tuple candidate MVP list. The two manners are as follows:
Manner 1: the candidate MVP list is constructed using a motion-model-based motion vector prediction method.
First, all or some of the neighboring blocks of the current coding block are traversed in a predetermined order to determine the neighboring affine coding blocks among them; the number of neighboring affine coding blocks so determined may be one or more. For example, the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in order to determine the neighboring affine coding blocks among A, B, C, D, and E. The inter prediction unit determines at least one set of candidate motion vector predictors based on at least one neighboring affine coding block (each set of candidate motion vector predictors is a 2-tuple or a triplet). The following takes one neighboring affine coding block as an example; for ease of description, this neighboring affine coding block is called the first neighboring affine coding block. The details are as follows:
A first affine model is determined according to the motion vectors of the control points of the first neighboring affine coding block, and the motion vectors of the control points of the current coding block are then predicted according to the first affine model. The manner of predicting the motion vectors of the control points of the current coding block based on the motion vectors of the control points of the first neighboring affine coding block differs depending on the parameter model of the current coding block, so the cases are described separately below.
A. The parameter model of the current coding block is a 4-parameter affine transformation model, and the derivation may be as follows:
If the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block and the first neighboring affine coding block is a 4-parameter affine coding block, the motion vectors of the two lowermost control points of the first neighboring affine coding block are obtained. For example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point of the first neighboring affine coding block, and the position coordinates (x7, y7) and motion vector (vx7, vy7) of its lower-right control point, may be obtained.
The first affine model is formed according to the motion vectors and position coordinates of the two lowermost control points of the first neighboring affine coding block (the first affine model obtained in this case is a 4-parameter affine model).
The motion vectors of the control points of the current coding block are predicted according to the first affine model. For example, the position coordinates of the upper-left control point and the position coordinates of the upper-right control point of the current coding block may each be substituted into the first affine model, thereby predicting the motion vector of the upper-left control point and the motion vector of the upper-right control point of the current coding block, as shown in formulas (1) and (2).
vx0 = vx6 + (vx7 - vx6)/(x7 - x6) × (x0 - x6) - (vy7 - vy6)/(x7 - x6) × (y0 - y6)
vy0 = vy6 + (vy7 - vy6)/(x7 - x6) × (x0 - x6) + (vx7 - vx6)/(x7 - x6) × (y0 - y6)    (1)

vx1 = vx6 + (vx7 - vx6)/(x7 - x6) × (x1 - x6) - (vy7 - vy6)/(x7 - x6) × (y1 - y6)
vy1 = vy6 + (vy7 - vy6)/(x7 - x6) × (x1 - x6) + (vx7 - vx6)/(x7 - x6) × (y1 - y6)    (2)
In formulas (1) and (2), (x0, y0) are the coordinates of the upper-left control point of the current coding block, and (x1, y1) are the coordinates of the upper-right control point of the current coding block; in addition, (vx0, vy0) is the predicted motion vector of the upper-left control point of the current coding block, and (vx1, vy1) is the predicted motion vector of the upper-right control point of the current coding block.
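The 4-parameter inheritance just described can be sketched as follows. This is a simplified illustration, not the normative derivation: it assumes the two bottom control points lie on the same horizontal line y6 (so x7 - x6 equals the neighboring block's width cuW), and the function name is hypothetical:

```python
def predict_cp_mv_4param(x6, y6, vx6, vy6, x7, vx7, vy7, x, y):
    """Predict the motion vector at position (x, y) from a 4-parameter
    affine model defined by the neighbor's lower-left control point
    (x6, y6) -> (vx6, vy6) and lower-right control point
    (x7, y6) -> (vx7, vy7)."""
    a = (vx7 - vx6) / (x7 - x6)  # change of vx per horizontal pixel
    b = (vy7 - vy6) / (x7 - x6)  # change of vy per horizontal pixel
    vx = vx6 + a * (x - x6) - b * (y - y6)
    vy = vy6 + b * (x - x6) + a * (y - y6)
    return vx, vy
```

Evaluating this at the current block's upper-left corner (x0, y0) or upper-right corner (x1, y1) gives the predicted control point motion vectors.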
Optionally, the position coordinates (x6, y6) of the lower-left control point and the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine coding block are both computed from the position coordinates (x4, y4) of the upper-left control point of the first neighboring affine coding block, where the position coordinates (x6, y6) of the lower-left control point of the first neighboring affine coding block are (x4, y4 + cuH), and the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine coding block are (x4 + cuW, y4 + cuH), with cuW being the width of the first neighboring affine coding block and cuH being its height. In addition, the motion vector of the lower-left control point of the first neighboring affine coding block is the motion vector of its lower-left sub-block, and the motion vector of the lower-right control point of the first neighboring affine coding block is the motion vector of its lower-right sub-block. It can be seen that the position coordinates of the lower-left control point and of the lower-right control point of the first neighboring affine coding block are both derived rather than read from memory, so this method further reduces memory reads and improves coding performance. As another alternative, the position coordinates of the lower-left control point and the lower-right control point may also be stored in memory in advance and read from memory when needed later.
If the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block and the first neighboring affine coding block is a 6-parameter affine coding block, candidate motion vector predictors for the control points of the current block are not generated based on the first neighboring affine coding block.
If the first neighboring affine coding block is not located in the CTU above the current coding block, the manner of predicting the motion vectors of the control points of the current coding block is not limited here. Nevertheless, for ease of understanding, an optional determination manner is given below as an example:
The position coordinates and motion vectors of the three control points of the first neighboring affine coding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper-left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper-right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point.
A 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first neighboring affine coding block.
The position coordinates (x0, y0) of the upper-left control point and the position coordinates (x1, y1) of the upper-right control point of the current coding block are substituted into the 6-parameter affine model to predict the motion vector of the upper-left control point and the motion vector of the upper-right control point of the current coding block, as shown in formulas (4) and (5).
vx0 = vx4 + (vx5 - vx4)/(x5 - x4) × (x0 - x4) + (vx6 - vx4)/(y6 - y4) × (y0 - y4)
vy0 = vy4 + (vy5 - vy4)/(x5 - x4) × (x0 - x4) + (vy6 - vy4)/(y6 - y4) × (y0 - y4)    (4)

vx1 = vx4 + (vx5 - vx4)/(x5 - x4) × (x1 - x4) + (vx6 - vx4)/(y6 - y4) × (y1 - y4)
vy1 = vy4 + (vy5 - vy4)/(x5 - x4) × (x1 - x4) + (vy6 - vy4)/(y6 - y4) × (y1 - y4)    (5)
In formulas (4) and (5), (vx0, vy0) is the predicted motion vector of the upper-left control point of the current coding block, and (vx1, vy1) is the predicted motion vector of the upper-right control point of the current coding block.
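The 6-parameter prediction can be sketched in the same style. Again a simplified illustration rather than the normative derivation: it assumes the three neighbor control points form an axis-aligned corner (the upper-right point shares y4, the lower-left point shares x4), and the function name is hypothetical:

```python
def predict_cp_mv_6param(x4, y4, vx4, vy4, x5, vx5, vy5, y6, vx6, vy6, x, y):
    """Predict the motion vector at (x, y) from a 6-parameter affine
    model defined by three neighbor control points: upper-left (x4, y4),
    upper-right (x5, y4), and lower-left (x4, y6)."""
    dvx_dx = (vx5 - vx4) / (x5 - x4)  # change of vx per horizontal pixel
    dvy_dx = (vy5 - vy4) / (x5 - x4)  # change of vy per horizontal pixel
    dvx_dy = (vx6 - vx4) / (y6 - y4)  # change of vx per vertical pixel
    dvy_dy = (vy6 - vy4) / (y6 - y4)  # change of vy per vertical pixel
    vx = vx4 + dvx_dx * (x - x4) + dvx_dy * (y - y4)
    vy = vy4 + dvy_dx * (x - x4) + dvy_dy * (y - y4)
    return vx, vy
```

Unlike the 4-parameter case, the horizontal and vertical gradients here are independent, which is why three control points are needed.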
B. The parameter model of the current coding block is a 6-parameter affine transformation model, and the derivation may be as follows:
If the first neighboring affine coding block is located in the CTU above the current coding block and the first neighboring affine coding block is a 4-parameter affine coding block, the position coordinates and motion vectors of the two lowermost control points of the first neighboring affine coding block are obtained. For example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point of the first neighboring affine coding block, and the position coordinates (x7, y7) and motion vector (vx7, vy7) of its lower-right control point, may be obtained.
The first affine model is formed according to the motion vectors of the two lowermost control points of the first neighboring affine coding block (the first affine model obtained in this case is a 4-parameter affine model).
The motion vectors of the control points of the current coding block are predicted according to the first affine model. For example, the position coordinates of the upper-left control point, the position coordinates of the upper-right control point, and the position coordinates of the lower-left control point of the current coding block may each be substituted into the first affine model, thereby predicting the motion vector of the upper-left control point, the motion vector of the upper-right control point, and the motion vector of the lower-left control point of the current coding block, as shown in formulas (1), (2), and (3).
vx2 = vx6 + (vx7 - vx6)/(x7 - x6) × (x2 - x6) - (vy7 - vy6)/(x7 - x6) × (y2 - y6)
vy2 = vy6 + (vy7 - vy6)/(x7 - x6) × (x2 - x6) + (vx7 - vx6)/(x7 - x6) × (y2 - y6)    (3)
Formulas (1) and (2) have been described above. In formulas (1), (2), and (3), (x0, y0) are the coordinates of the upper-left control point of the current coding block, (x1, y1) are the coordinates of the upper-right control point of the current coding block, and (x2, y2) are the coordinates of the lower-left control point of the current coding block; in addition, (vx0, vy0) is the predicted motion vector of the upper-left control point of the current coding block, (vx1, vy1) is the predicted motion vector of the upper-right control point of the current coding block, and (vx2, vy2) is the predicted motion vector of the lower-left control point of the current coding block.
If the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block and the first neighboring affine coding block is a 6-parameter affine coding block, candidate motion vector predictors for the control points of the current block are not generated based on the first neighboring affine coding block.
If the first neighboring affine coding block is not located in the CTU above the current coding block, the manner of predicting the motion vectors of the control points of the current coding block is not limited here. Nevertheless, for ease of understanding, an optional determination manner is given below as an example:
The position coordinates and motion vectors of the three control points of the first neighboring affine coding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper-left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper-right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point.
A 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first neighboring affine coding block.
The position coordinates (x0, y0) of the upper-left control point, the position coordinates (x1, y1) of the upper-right control point, and the position coordinates (x2, y2) of the lower-left control point of the current coding block are substituted into the 6-parameter affine model to predict the motion vector of the upper-left control point, the motion vector of the upper-right control point, and the motion vector of the lower-left control point of the current coding block, as shown in formulas (4), (5), and (6).
vx2 = vx4 + (vx5 - vx4)/(x5 - x4) × (x2 - x4) + (vx6 - vx4)/(y6 - y4) × (y2 - y4)
vy2 = vy4 + (vy5 - vy4)/(x5 - x4) × (x2 - x4) + (vy6 - vy4)/(y6 - y4) × (y2 - y4)    (6)
Formulas (4) and (5) have been described above. In formulas (4), (5), and (6), (vx0, vy0) is the predicted motion vector of the upper-left control point of the current coding block, (vx1, vy1) is the predicted motion vector of the upper-right control point of the current coding block, and (vx2, vy2) is the predicted motion vector of the lower-left control point of the current coding block.
Manner 2: the candidate MVP list is constructed using a motion vector prediction method based on control point combination.
The manner of constructing the candidate MVP list differs depending on the parameter model of the current coding block, as described below.
A. The parameter model of the current coding block is a 4-parameter affine transformation model, and the derivation may be as follows:
The motion vectors of the upper-left vertex and the upper-right vertex of the current coding block are estimated using the motion information of coded blocks neighboring the current coding block. As shown in FIG. 7B: first, the motion vectors of the coded blocks A and/or B and/or C adjacent to the upper-left vertex are used as candidate motion vectors for the motion vector of the upper-left vertex of the current coding block, and the motion vectors of the coded blocks D and/or E adjacent to the upper-right vertex are used as candidate motion vectors for the motion vector of the upper-right vertex of the current coding block. A candidate motion vector of the upper-left vertex and a candidate motion vector of the upper-right vertex are combined to form a set of candidate motion vector predictors, and the multiple entries obtained through such combinations can constitute the candidate MVP list.
B. The parameter model of the current coding block is a 6-parameter affine transformation model, and the derivation may be as follows:
The motion vectors of the upper-left vertex, the upper-right vertex, and the lower-left vertex of the current coding block are estimated using the motion information of coded blocks neighboring the current coding block. As shown in FIG. 7B: first, the motion vectors of the coded blocks A and/or B and/or C adjacent to the upper-left vertex are used as candidate motion vectors for the motion vector of the upper-left vertex of the current coding block; the motion vectors of the coded blocks D and/or E adjacent to the upper-right vertex are used as candidate motion vectors for the motion vector of the upper-right vertex of the current coding block; and the motion vectors of the coded blocks F and/or G adjacent to the lower-left vertex are used as candidate motion vectors for the motion vector of the lower-left vertex of the current coding block. A candidate motion vector of the upper-left vertex, a candidate motion vector of the upper-right vertex, and a candidate motion vector of the lower-left vertex are combined to form a set of candidate motion vector predictors, and the multiple sets of candidate motion vector predictors obtained through such combinations can constitute the candidate MVP list.
It should be noted that the candidate MVP list may be constructed using only the candidate motion vector predictors obtained through Manner 1, or using only the candidate motion vector predictors obtained through Manner 2, or using the candidate motion vector predictors obtained through Manner 1 and Manner 2 together. In addition, the candidate MVP list may be pruned and sorted according to a preconfigured rule, and then truncated or padded to a specific number of entries. When each set of candidate motion vector predictors in the candidate MVP list includes the motion vector predictors of three control points, the candidate MVP list may be called a triplet list; when each set of candidate motion vector predictors in the candidate MVP list includes the motion vector predictors of two control points, the candidate MVP list may be called a 2-tuple list.
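As a rough sketch of the combine-prune-truncate/pad flow for the 2-tuple case (the actual pruning order and padding values are standard-specific; this helper and its names are only illustrative):

```python
from itertools import product

def build_mvp_list(ul_cands, ur_cands, max_len, pad):
    """Combine one upper-left vertex candidate with one upper-right
    vertex candidate into 2-tuples, prune duplicates, then truncate
    or pad the list to exactly max_len entries."""
    mvp_list = []
    for combo in product(ul_cands, ur_cands):
        if combo not in mvp_list:        # pruning: drop duplicate tuples
            mvp_list.append(combo)
        if len(mvp_list) == max_len:     # truncation to the target length
            return mvp_list
    while len(mvp_list) < max_len:       # padding up to the target length
        mvp_list.append(pad)
    return mvp_list
```

The triplet case is analogous, with a third candidate set for the lower-left vertex.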
Step S712: The video encoder determines a target candidate motion vector group from the candidate MVP list according to a rate-distortion cost criterion. Specifically, for each candidate motion vector group in the candidate MVP list, the motion vector of each sub-block of the current block is calculated, and motion compensation is performed to obtain the predicted value of each sub-block, thereby obtaining the predicted value of the current block. The candidate motion vector group with the smallest error between the predicted value and the original value is selected as the best set of motion vector predictors, that is, the target candidate motion vector group. In addition, the determined target candidate motion vector group serves as the optimal candidate motion vector predictors for a set of control points, and the target candidate motion vector group corresponds to a unique index number in the candidate MVP list.
Step S713: The video encoder encodes the index corresponding to the target candidate motion vector group and the motion vector differences (MVDs) into the bitstream to be transmitted.
Specifically, the video encoder may further use the target candidate motion vector group as a search starting point to search, within a preset search range and according to the rate-distortion cost criterion, for the set of control point motion vectors with the lowest cost; it then determines the motion vector differences (MVDs) between that set of control point motion vectors and the target candidate motion vector group. For example, if the first set of control points includes a first control point and a second control point, then the MVD between the motion vector of the first control point and the motion vector predictor of the first control point in the set of control points represented by the target candidate motion vector group needs to be determined, as well as the MVD between the motion vector of the second control point and the motion vector predictor of the second control point in the set of control points represented by the target candidate motion vector group.
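The per-control-point MVD computation above is a component-wise difference between each searched motion vector and its predictor; a minimal sketch (helper name illustrative):

```python
def control_point_mvds(mvs, mvps):
    """MVD for each control point: the searched motion vector minus the
    corresponding predictor from the target candidate group,
    component-wise for the horizontal and vertical components."""
    return [(vx - pvx, vy - pvy) for (vx, vy), (pvx, pvy) in zip(mvs, mvps)]
```

The decoder reverses this by adding the signalled MVDs back to the predictors selected by the index.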
Optionally, in AMVP mode, in addition to steps S711-S713 above, steps S714-S715 may also be performed.
Step S714: The video encoder obtains the motion vector of each sub-block in the current coding block by using an affine transformation model according to the motion vectors of the control points of the current coding block determined above.
Specifically, the new candidate motion vector group obtained based on the target candidate motion vector group and the MVDs includes the motion vectors of two control points (the upper-left and upper-right control points) or three control points (for example, the upper-left, upper-right, and lower-left control points). For each sub-block of the current coding block (a sub-block is also equivalent to a motion compensation unit), the motion information of a pixel at a preset position in the motion compensation unit may be used to represent the motion information of all pixels in that motion compensation unit. Assuming the size of the motion compensation unit is MxN (M is less than or equal to the width W of the current coding block, and N is less than or equal to the height H of the current coding block, where M, N, W, and H are positive integers, usually powers of 2, such as 4, 8, 16, 32, 64, 128, and so on), the preset-position pixel may be the center pixel (M/2, N/2) of the motion compensation unit, the upper-left pixel (0, 0), the upper-right pixel (M-1, 0), or a pixel at another position. FIG. 8A illustrates a 4x4 motion compensation unit, and FIG. 8B illustrates an 8x8 motion compensation unit.
运动补偿单元中心点相对于当前编码块左上顶点像素的坐标使用公式(5)计算得到,其中i为水平方向第i个运动补偿单元(从左到右),j为竖直方向第j个运动补偿单元(从上到下),(x(i,j),y(i,j))表示第(i,j)个运动补偿单元中心点相对于当前编码块左上控制点像素的坐标。再根据当前编码块的仿射模型类型(6参数或4参数),将(x(i,j),y(i,j))代入6参数仿射模型公式(6-1)或者将(x(i,j),y(i,j))代入4参数仿射模型公式(6-2),获得每个运动补偿单元中心点的运动信息,作为该运动补偿单元内所有像素点的运动矢量(vx(i,j),vy(i,j))。The coordinates of the center point of a motion compensation unit relative to the upper-left vertex pixel of the current coding block are calculated using formula (5), where i is the index of the i-th motion compensation unit in the horizontal direction (from left to right), j is the index of the j-th motion compensation unit in the vertical direction (from top to bottom), and (x(i,j), y(i,j)) represents the coordinates of the center point of the (i, j)-th motion compensation unit relative to the upper-left control-point pixel of the current coding block. Then, according to the affine model type of the current coding block (6-parameter or 4-parameter), (x(i,j), y(i,j)) is substituted into the 6-parameter affine model formula (6-1) or into the 4-parameter affine model formula (6-2) to obtain the motion information of the center point of each motion compensation unit, which serves as the motion vector (vx(i,j), vy(i,j)) of all pixels in that motion compensation unit.
公式(5)/Formula (5):

x(i,j) = M × i + M/2, i = 0, 1, …, W/M - 1

y(i,j) = N × j + N/2, j = 0, 1, …, H/N - 1

公式(6-1),6参数仿射模型/Formula (6-1), the 6-parameter affine model:

vx = (vx1 - vx0) × x / W + (vx2 - vx0) × y / H + vx0

vy = (vy1 - vy0) × x / W + (vy2 - vy0) × y / H + vy0

公式(6-2),4参数仿射模型/Formula (6-2), the 4-parameter affine model:

vx = (vx1 - vx0) × x / W - (vy1 - vy0) × y / W + vx0

vy = (vy1 - vy0) × x / W + (vx1 - vx0) × y / W + vy0
可选的,当前编码块为6参数编码块时,在基于所述目标候选运动矢量组得到所述当前编码块的一个或多个子块的运动矢量时,若所述当前编码块的下边界与所述当前编码块所在的CTU的下边界重合,则所述当前编码块的左下角的子块的运动矢量为根据所述三个控制点构造的6参数仿射模型和所述当前编码块的左下角的位置坐标(0,H)计算得到,所述当前编码块的右下角的子块的运动矢量为根据所述三个控制点构造的6参数仿射模型和所述当前编码块的右下角的位置坐标(W,H)计算得到。例如,将当前编码块的左下角的位置坐标(0,H)代入到该6参数仿射模型即可得到当前编码块的左下角的子块的运动矢量(而不是将左下角的子块的中心点坐标代入该仿射模型进行计算),将当前编码块的右下角的位置坐标(W,H)代入到该6参数仿射模型即可得到当前编码块的右下角的子块的运动矢量(而不是将右下角的子块的中心点坐标代入该仿射模型进行计算)。这样一来,该当前编码块的左下控制点的运动矢量和右下控制点的运动矢量被用到时(例如,后续其他块基于该当前块的左下控制点和右下控制点的运动矢量构建该其他块的候选运动矢量预测值MVP列表),用到的是准确的值而不是估算值。其中,W为该当前编码块的宽,H为该当前编码块的高。Optionally, when the current coding block is a 6-parameter coding block, in obtaining the motion vectors of one or more sub-blocks of the current coding block based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU in which the current coding block is located, the motion vector of the sub-block at the lower-left corner of the current coding block is calculated from the 6-parameter affine model constructed from the three control points and the position coordinates (0, H) of the lower-left corner of the current coding block, and the motion vector of the sub-block at the lower-right corner of the current coding block is calculated from that 6-parameter affine model and the position coordinates (W, H) of the lower-right corner of the current coding block. For example, substituting the position coordinates (0, H) of the lower-left corner of the current coding block into the 6-parameter affine model yields the motion vector of the sub-block at the lower-left corner (rather than substituting the center-point coordinates of that sub-block into the affine model), and substituting the position coordinates (W, H) of the lower-right corner into the 6-parameter affine model yields the motion vector of the sub-block at the lower-right corner (rather than substituting the center-point coordinates of that sub-block into the affine model). In this way, when the motion vectors of the lower-left and lower-right control points of the current coding block are used later (for example, when subsequent blocks construct their candidate motion vector predictor (MVP) lists based on the motion vectors of the lower-left and lower-right control points of the current block), accurate values rather than estimated values are used, where W is the width of the current coding block and H is its height.
可选的,当前编码块为4参数编码块时,在基于所述目标候选运动矢量组得到所述当前编码块的一个或多个子块的运动矢量时,若所述当前编码块的下边界与所述当前编码块所在的CTU的下边界重合,则所述当前编码块的左下角的子块的运动矢量为根据所述两个控制点构造的4参数仿射模型和所述当前编码块的左下角的位置坐标(0,H)计算得到,所述当前编码块的右下角的子块的运动矢量为根据所述两个控制点构造的4参数仿射模型和所述当前编码块的右下角的位置坐标(W,H)计算得到。例如,将当前编码块的左下角的位置坐标(0,H)代入到该4参数仿射模型即可得到当前编码块的左下角的子块的运动矢量(而不是将左下角的子块的中心点坐标代入该仿射模型进行计算),将当前编码块的右下角的位置坐标(W,H)代入到该四参数仿射模型即可得到当前编码块的右下角的子块的运动矢量(而不是将右下角的子块的中心点坐标代入该仿射模型进行计算)。这样一来,该当前编码块的左下控制点的运动矢量和右下控制点的运动矢量被用到时(例如,后续其他块基于该当前块的左下控制点和右下控制点的运动矢量构建该其他块的候选运动矢量预测值MVP列表),用到的是准确的值而不是估算值。其中,W为该当前编码块的宽,H为该当前编码块的高。Optionally, when the current coding block is a 4-parameter coding block, in obtaining the motion vectors of one or more sub-blocks of the current coding block based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU in which the current coding block is located, the motion vector of the sub-block at the lower-left corner of the current coding block is calculated from the 4-parameter affine model constructed from the two control points and the position coordinates (0, H) of the lower-left corner of the current coding block, and the motion vector of the sub-block at the lower-right corner of the current coding block is calculated from that 4-parameter affine model and the position coordinates (W, H) of the lower-right corner of the current coding block. For example, substituting the position coordinates (0, H) of the lower-left corner of the current coding block into the 4-parameter affine model yields the motion vector of the sub-block at the lower-left corner (rather than substituting the center-point coordinates of that sub-block into the affine model), and substituting the position coordinates (W, H) of the lower-right corner into the four-parameter affine model yields the motion vector of the sub-block at the lower-right corner (rather than substituting the center-point coordinates of that sub-block into the affine model). In this way, when the motion vectors of the lower-left and lower-right control points of the current coding block are used later (for example, when subsequent blocks construct their candidate motion vector predictor (MVP) lists based on the motion vectors of the lower-left and lower-right control points of the current block), accurate values rather than estimated values are used, where W is the width of the current coding block and H is its height.
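The sub-block motion vector derivation of step S714, including the corner handling at the CTU lower boundary described above, can be sketched as follows. This is a minimal illustration assuming the standard 4-parameter and 6-parameter affine model formulas; the function and variable names are illustrative and do not come from the patent text:

```python
def subblock_mvs(cps, W, H, M, N, at_ctu_bottom=False):
    """Derive one motion vector per MxN sub-block (motion compensation unit)
    of a WxH coding block from its control-point motion vectors.

    cps: [(vx0, vy0), (vx1, vy1)] for a 4-parameter model, or
         [(vx0, vy0), (vx1, vy1), (vx2, vy2)] for a 6-parameter model,
         given at CP1 (0, 0), CP2 (W, 0) and, if present, CP3 (0, H).
    """
    (vx0, vy0), (vx1, vy1) = cps[0], cps[1]
    if len(cps) == 3:  # 6-parameter affine model (formula (6-1))
        (vx2, vy2) = cps[2]
        def mv(x, y):
            return (vx0 + (vx1 - vx0) * x / W + (vx2 - vx0) * y / H,
                    vy0 + (vy1 - vy0) * x / W + (vy2 - vy0) * y / H)
    else:              # 4-parameter affine model (formula (6-2))
        def mv(x, y):
            return (vx0 + (vx1 - vx0) * x / W - (vy1 - vy0) * y / W,
                    vy0 + (vy1 - vy0) * x / W + (vx1 - vx0) * y / W)
    mvs = {}
    for j in range(H // N):          # j: vertical unit index, top to bottom
        for i in range(W // M):      # i: horizontal unit index, left to right
            # formula (5): center of the (i, j)-th motion compensation unit
            mvs[(i, j)] = mv(M * i + M / 2, N * j + N / 2)
    if at_ctu_bottom:
        # When the block's lower boundary coincides with the CTU's lower
        # boundary, the two bottom-corner sub-blocks use the exact corner
        # coordinates (0, H) and (W, H) instead of their center points.
        mvs[(0, H // N - 1)] = mv(0, H)
        mvs[(W // M - 1, H // N - 1)] = mv(W, H)
    return mvs
```

For example, with a purely translational candidate group all sub-block motion vectors are equal, and with `at_ctu_bottom=True` the two bottom-corner sub-blocks take the exact corner values instead of their center-point values.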
步骤S715:该视频编码器根据该当前编码块中每个子块的运动矢量值进行运动补偿,以得到每个子块的像素预测值,例如,通过每个子块的运动矢量和参考帧索引值,在参考帧中找到对应的子块,进行插值滤波,得到每个子块的像素预测值。Step S715: The video encoder performs motion compensation according to the motion vector value of each sub-block in the current coding block to obtain the pixel prediction value of each sub-block; for example, using the motion vector and the reference frame index value of each sub-block, the corresponding sub-block is found in the reference frame and interpolation filtering is performed to obtain the pixel prediction value of each sub-block.
Merge模式:Merge mode:
步骤S721:视频编码器构建候选运动信息列表。Step S721: The video encoder constructs a candidate motion information list.
具体地,视频编码器通过帧间预测单元(也称帧间预测模块)来构建候选运动信息列表(也称候选运动矢量列表),可以采用如下提供的两种方式中的一种方式来构建,或者采用两种方式结合的形式来构建,构建的候选运动信息列表为三元组的候选运动信息列表;以上两种方式具体如下:Specifically, the video encoder constructs a candidate motion information list (also called a candidate motion vector list) through an inter prediction unit (also called an inter prediction module). The list may be constructed in either of the two manners provided below, or in a form combining the two; the constructed candidate motion information list is a triplet candidate motion information list. The two manners are specifically as follows:
方式一,采用基于运动模型的运动矢量预测方法构建候选运动信息列表。Method 1: A motion vector prediction method based on a motion model is used to construct a candidate motion information list.
首先,按照预先规定的顺序遍历当前编码块的全部或部分相邻块,从而确定其中的相邻仿射编码块,确定出的相邻仿射编码块的数量可能为一个也可能为多个。例如,可以依次遍历图7A所示的相邻块A、B、C、D、E,以确定出相邻块A、B、C、D、E中的相邻仿射编码块。该帧间预测单元会根据每个相邻仿射编码块确定一组候选运动矢量预测值(每一组候选运动矢量预测值为一个二元组或者三元组),下面以一个相邻仿射编码块为例进行介绍,为了便于描述称该一个相邻仿射编码块为第一相邻仿射编码块,具体如下:First, all or some of the neighboring blocks of the current coding block are traversed in a predetermined order to determine the neighboring affine coding blocks among them; the number of neighboring affine coding blocks determined may be one or more. For example, the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in turn to determine the neighboring affine coding blocks among them. The inter prediction unit determines one group of candidate motion vector predictors from each neighboring affine coding block (each group being a 2-tuple or a 3-tuple). The following uses one neighboring affine coding block as an example; for ease of description, it is called the first neighboring affine coding block. The details are as follows:
根据第一相邻仿射编码块的控制点的运动矢量确定第一仿射模型,进而根据第一仿射模型预测该当前编码块的控制点的运动矢量,具体描述如下:The first affine model is determined according to the motion vector of the control points of the first neighboring affine coding block, and then the motion vector of the control point of the current coding block is predicted according to the first affine model, which is specifically described as follows:
若该第一相邻仿射编码块位于当前编码块的上方CTU且该第一相邻仿射编码块为四参数仿射编码块,则获取该第一相邻仿射编码块最下侧两个控制点的位置坐标和运动矢量,例如,可以获取该第一相邻仿射编码块左下控制点的位置坐标(x6,y6)和运动矢量(vx6,vy6),以及右下控制点的位置坐标(x7,y7)和运动矢量值(vx7,vy7)。If the first neighboring affine coding block is located in the CTU above the current coding block and the first neighboring affine coding block is a four-parameter affine coding block, the position coordinates and motion vectors of the two lowest control points of the first neighboring affine coding block are obtained; for example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point and the position coordinates (x7, y7) and motion vector (vx7, vy7) of the lower-right control point of the first neighboring affine coding block may be obtained.
根据该第一相邻仿射编码块最下侧两个控制点的运动矢量组成第一仿射模型(此时得到的第一仿射模型为一个4参数仿射模型)。A first affine model is formed according to the motion vectors of the two control points at the bottom of the first adjacent affine coding block (the first affine model obtained at this time is a 4-parameter affine model).
可选的,根据该第一仿射模型预测当前编码块的控制点的运动矢量,例如,可以将该当前编码块的左上控制点的位置坐标、右上控制点的位置坐标、左下控制点的位置坐标分别代入到该第一仿射模型,从而预测当前编码块的左上控制点的运动矢量、右上控制点的运动矢量和左下控制点的运动矢量,组成候选运动矢量三元组,加入候选运动信息列表,具体如公式(1)、(2)、(3)所示。Optionally, the motion vectors of the control points of the current coding block are predicted according to the first affine model. For example, the position coordinates of the upper-left, upper-right, and lower-left control points of the current coding block may each be substituted into the first affine model to predict the motion vectors of the upper-left, upper-right, and lower-left control points of the current coding block, which form a candidate motion vector triplet that is added to the candidate motion information list, as shown in formulas (1), (2), and (3).
可选的,根据该第一仿射模型预测当前编码块的控制点的运动矢量,例如,可以将该当前编码块的左上控制点的位置坐标、右上控制点的位置坐标分别代入到该第一仿射模型,从而预测当前编码块的左上控制点的运动矢量和右上控制点的运动矢量,组成候选运动矢量二元组,加入候选运动信息列表,具体如公式(1)、(2)所示。Optionally, the motion vectors of the control points of the current coding block are predicted according to the first affine model. For example, the position coordinates of the upper-left and upper-right control points of the current coding block may each be substituted into the first affine model to predict the motion vectors of the upper-left and upper-right control points of the current coding block, which form a candidate motion vector 2-tuple that is added to the candidate motion information list, as shown in formulas (1) and (2).
在公式(1)、(2)、(3)中,(x0,y0)为当前编码块的左上控制点的坐标,(x1,y1)为当前编码块的右上控制点的坐标,(x2,y2)为当前编码块的左下控制点的坐标;另外,(vx0,vy0)为预测的当前编码块的左上控制点的运动矢量,(vx1,vy1)为预测的当前编码块的右上控制点的运动矢量,(vx2,vy2)为预测的当前编码块的左下控制点的运动矢量。In formulas (1), (2), and (3), (x0, y0) are the coordinates of the upper-left control point of the current coding block, (x1, y1) are the coordinates of the upper-right control point, and (x2, y2) are the coordinates of the lower-left control point; in addition, (vx0, vy0) is the predicted motion vector of the upper-left control point of the current coding block, (vx1, vy1) is the predicted motion vector of the upper-right control point, and (vx2, vy2) is the predicted motion vector of the lower-left control point.
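The prediction of the current block's control-point motion vectors from the lowest two control points of the first neighboring affine coding block (the case above, where that block is a four-parameter block located in the CTU above) can be sketched as follows. Formulas (1) to (3) themselves appear earlier in the document; this sketch assumes the standard 4-parameter affine extrapolation, and the function and variable names are illustrative rather than taken from the patent:

```python
def inherit_from_bottom_cps(p6, v6, p7, v7, cur_cps):
    """Predict control-point MVs of the current block from the lowest two
    control points of a neighbouring 4-parameter affine block.

    p6, v6: position (x6, y6) and MV (vx6, vy6) of the neighbour's
            lower-left control point.
    p7, v7: position (x7, y7) and MV (vx7, vy7) of its lower-right one.
    cur_cps: positions of the current block's control points, e.g.
             [(x0, y0), (x1, y1)] or [(x0, y0), (x1, y1), (x2, y2)].
    """
    w = p7[0] - p6[0]            # horizontal distance between the two CPs
    a = (v7[0] - v6[0]) / w      # zoom/rotation parameters of the
    b = (v7[1] - v6[1]) / w      # 4-parameter affine model
    return [(v6[0] + a * (x - p6[0]) - b * (y - p6[1]),
             v6[1] + b * (x - p6[0]) + a * (y - p6[1]))
            for (x, y) in cur_cps]
```

With two equal bottom control-point motion vectors the model degenerates to a translation, so every predicted control point inherits the same vector.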
若第一相邻仿射编码块位于当前编码块上方的编码树单元(Coding Tree Unit,CTU)且所述第一相邻仿射编码块为六参数仿射编码块,则不基于第一相邻仿射编码块生成当前块的控制点的候选运动矢量预测值。If the first neighboring affine coding block is located in the coding tree unit (CTU) above the current coding block and the first neighboring affine coding block is a six-parameter affine coding block, candidate motion vector predictors for the control points of the current block are not generated based on the first neighboring affine coding block.
若该第一相邻仿射编码块不位于当前编码块的上方CTU,则预测当前编码块的控制点的运动矢量的方式此处不作限定。但是为了便于理解,下面也例举一种可选的确定方式:If the first neighboring affine coding block is not located in the CTU above the current coding block, the manner of predicting the motion vectors of the control points of the current coding block is not limited here. However, for ease of understanding, an optional determination manner is exemplified below:
可以获取该第一相邻仿射编码块的三个控制点的位置坐标和运动矢量,例如,左上控制点的位置坐标(x4,y4)和运动矢量值(vx4,vy4)、右上控制点的位置坐标(x5,y5)和运动矢量值(vx5,vy5)、左下控制点的位置坐标(x6,y6)和运动矢量(vx6,vy6)。The position coordinates and motion vectors of the three control points of the first neighboring affine coding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper-left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper-right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point.
根据该第一相邻仿射编码块的三个控制点的位置坐标和运动矢量组成6参数仿射模型。A 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first adjacent affine coding block.
将该当前编码块的左上控制点的位置坐标(x0,y0)、右上控制点的位置坐标(x1,y1)和左下控制点的位置坐标(x2,y2)代入6参数仿射模型预测当前编码块的左上控制点的运动矢量、右上控制点的运动矢量及左下控制点的运动矢量,如公式(4)、(5)、(6)所示。The position coordinates (x0, y0) of the upper-left control point, (x1, y1) of the upper-right control point, and (x2, y2) of the lower-left control point of the current coding block are substituted into the 6-parameter affine model to predict the motion vectors of the upper-left, upper-right, and lower-left control points of the current coding block, as shown in formulas (4), (5), and (6).
公式(6)/Formula (6):

vx2 = vx4 + (vx5 - vx4) × (x2 - x4) / (x5 - x4) + (vx6 - vx4) × (y2 - y4) / (y6 - y4)

vy2 = vy4 + (vy5 - vy4) × (x2 - x4) / (x5 - x4) + (vy6 - vy4) × (y2 - y4) / (y6 - y4)
公式(4)、(5)前面已有描述,在公式(4)、(5)、(6)中,(vx0,vy0)为预测的当前编码块的左上控制点的运动矢量,(vx1,vy1)为预测的当前编码块的右上控制点的运动矢量,(vx2,vy2)为预测的当前编码块的左下控制点的运动矢量。Formulas (4) and (5) have been described above. In formulas (4), (5), and (6), (vx0, vy0) is the predicted motion vector of the upper-left control point of the current coding block, (vx1, vy1) is the predicted motion vector of the upper-right control point, and (vx2, vy2) is the predicted motion vector of the lower-left control point.
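The 6-parameter prediction of formulas (4) to (6) can be sketched as follows. This is a minimal illustration assuming the standard 6-parameter affine extrapolation from the neighbour's three control points; the names are illustrative, not from the patent text:

```python
def inherit_6param(cp4, cp5, cp6, cur_cps):
    """Predict current-block control-point MVs from three control points of
    the first neighbouring affine coding block (6-parameter model).

    cp4, cp5, cp6: ((x, y), (vx, vy)) for the neighbour's upper-left,
    upper-right and lower-left control points; cur_cps lists the positions
    of the current block's control points.
    """
    (x4, y4), (vx4, vy4) = cp4
    (x5, _), (vx5, vy5) = cp5
    (_, y6), (vx6, vy6) = cp6
    dx, dy = x5 - x4, y6 - y4    # control-point spacing of the neighbour
    return [(vx4 + (vx5 - vx4) * (x - x4) / dx + (vx6 - vx4) * (y - y4) / dy,
             vy4 + (vy5 - vy4) * (x - x4) / dx + (vy6 - vy4) * (y - y4) / dy)
            for (x, y) in cur_cps]
```

Again, three equal neighbour motion vectors reduce the model to a translation, so the prediction returns that same vector at any position.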
方式二,采用基于控制点组合的运动矢量预测方法构建候选运动信息列表。Manner 2: A motion vector prediction method based on a combination of control points is used to construct a candidate motion information list.
下面例举两种方案,分别表示为方案A和方案B:The following two examples are exemplified as Option A and Option B:
方案A:将当前编码块的2个控制点的运动信息进行组合,用来构建4参数仿射变换模型。2个控制点的组合方式为{CP1,CP4},{CP2,CP3},{CP1,CP2},{CP2,CP4},{CP1,CP3},{CP3,CP4}。例如,采用CP1和CP2控制点构建的4参数仿射变换模型,记做Affine(CP1,CP2)。Solution A: The motion information of two control points of the current coding block is combined to construct a 4-parameter affine transformation model. The combinations of two control points are {CP1, CP4}, {CP2, CP3}, {CP1, CP2}, {CP2, CP4}, {CP1, CP3}, and {CP3, CP4}. For example, the 4-parameter affine transformation model constructed using control points CP1 and CP2 is denoted Affine (CP1, CP2).
需要说明的是,亦可将不同控制点的组合转换为同一位置的控制点。例如:将{CP1,CP4},{CP2,CP3},{CP2,CP4},{CP1,CP3},{CP3,CP4}组合得到的4参数仿射变换模型转换为控制点{CP1,CP2}或{CP1,CP2,CP3}来表示。转换方法为将控制点的运动矢量及其坐标信息,代入公式(9-1),得到模型参数,再将{CP1,CP2}的坐标信息代入,得到其运动矢量,作为一组候选运动矢量预测值。It should be noted that a combination of different control points may also be converted into control points at the same positions. For example, the 4-parameter affine transformation models obtained from the combinations {CP1, CP4}, {CP2, CP3}, {CP2, CP4}, {CP1, CP3}, and {CP3, CP4} are converted to be represented by the control points {CP1, CP2} or {CP1, CP2, CP3}. The conversion method is to substitute the motion vectors of the control points and their coordinate information into formula (9-1) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2} to obtain their motion vectors, which are used as a group of candidate motion vector predictors.
公式(9-1)(4参数仿射模型的参数化形式)/Formula (9-1) (the parametric form of the 4-parameter affine model):

vx = a0 + a2 × x - a3 × y

vy = a1 + a3 × x + a2 × y
在公式(9-1)中,a0、a1、a2、a3均为参数模型中的参数,(x,y)表示位置坐标。In formula (9-1), a0, a1, a2, and a3 are parameters of the parametric model, and (x, y) represents position coordinates.
更直接地,也可以按照以下公式进行转换得到以左上控制点、右上控制点、左下控制点表示的一组运动矢量预测值,并加入候选运动信息列表:More directly, the conversion may also be performed according to the following formulas to obtain a group of motion vector predictors represented by the upper-left, upper-right, and lower-left control points, which is added to the candidate motion information list:
{CP1,CP2}转换得到{CP1,CP2,CP3}的公式(9-2):Formula (9-2) for converting {CP1, CP2} to {CP1, CP2, CP3}:

vx2 = vx0 - (vy1 - vy0) × H / W

vy2 = vy0 + (vx1 - vx0) × H / W
{CP1,CP3}转换得到{CP1,CP2,CP3}的公式(9-3):Formula (9-3) for converting {CP1, CP3} to {CP1, CP2, CP3}:

vx1 = vx0 + (vy2 - vy0) × W / H

vy1 = vy0 - (vx2 - vx0) × W / H
{CP2,CP3}转换得到{CP1,CP2,CP3}的公式(10):Formula (10) for converting {CP2, CP3} to {CP1, CP2, CP3}:

Figure PCTCN2019079955-appb-000014
{CP1,CP4}转换得到{CP1,CP2,CP3}的公式(11):Formula (11) for converting {CP1, CP4} to {CP1, CP2, CP3}:

Figure PCTCN2019079955-appb-000015
{CP2,CP4}转换得到{CP1,CP2,CP3}的公式(12),其中(vx3,vy3)为CP4的运动矢量:Formula (12) for converting {CP2, CP4} to {CP1, CP2, CP3}, where (vx3, vy3) is the motion vector of CP4:

vx0 = vx1 - (vy3 - vy1) × W / H, vy0 = vy1 + (vx3 - vx1) × W / H

vx2 = vx3 - (vy3 - vy1) × W / H, vy2 = vy3 + (vx3 - vx1) × W / H
{CP3,CP4}转换得到{CP1,CP2,CP3}的公式(13),其中(vx3,vy3)为CP4的运动矢量:Formula (13) for converting {CP3, CP4} to {CP1, CP2, CP3}, where (vx3, vy3) is the motion vector of CP4:

vx0 = vx2 + (vy3 - vy2) × H / W, vy0 = vy2 - (vx3 - vx2) × H / W

vx1 = vx3 + (vy3 - vy2) × H / W, vy1 = vy3 - (vx3 - vx2) × H / W
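Conversions such as (9-2) and (9-3), which complete a two-control-point combination into the {CP1, CP2, CP3} representation under the 4-parameter model assumption, can be sketched as follows; the function names are illustrative:

```python
def cp12_to_cp123(v1, v2, W, H):
    """{CP1, CP2} -> {CP1, CP2, CP3}: derive the lower-left CP motion vector
    from the 4-parameter model through CP1 at (0, 0) and CP2 at (W, 0)."""
    v3 = (v1[0] - (v2[1] - v1[1]) * H / W,
          v1[1] + (v2[0] - v1[0]) * H / W)
    return [v1, v2, v3]

def cp13_to_cp123(v1, v3, W, H):
    """{CP1, CP3} -> {CP1, CP2, CP3}: derive the upper-right CP motion vector
    from the 4-parameter model through CP1 at (0, 0) and CP3 at (0, H)."""
    v2 = (v1[0] + (v3[1] - v1[1]) * W / H,
          v1[1] - (v3[0] - v1[0]) * W / H)
    return [v1, v2, v3]
```

The two conversions are mutually consistent: feeding the lower-left vector produced by the first function back into the second reproduces the original upper-right vector.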
方案B:将当前编码块的3个控制点的运动信息进行组合,用来构建6参数仿射变换模型。3个控制点的组合方式为{CP1,CP2,CP4},{CP1,CP2,CP3},{CP2,CP3,CP4},{CP1,CP3,CP4}。例如,采用CP1、CP2和CP3控制点构建的6参数仿射变换模型,记做Affine(CP1,CP2,CP3)。Solution B: The motion information of three control points of the current coding block is combined to construct a 6-parameter affine transformation model. The combinations of three control points are {CP1, CP2, CP4}, {CP1, CP2, CP3}, {CP2, CP3, CP4}, and {CP1, CP3, CP4}. For example, the 6-parameter affine transformation model constructed using control points CP1, CP2, and CP3 is denoted Affine (CP1, CP2, CP3).
需要说明的是,亦可将不同控制点的组合转换为同一位置的控制点。例如:将{CP1,CP2,CP4},{CP2,CP3,CP4},{CP1,CP3,CP4}组合的6参数仿射变换模型转换为控制点{CP1,CP2,CP3}来表示。转换方法为将控制点的运动矢量及其坐标信息,代入公式(14),得到模型参数,再将{CP1,CP2,CP3}的坐标信息代入,得到其运动矢量,作为一组候选运动矢量预测值。It should be noted that a combination of different control points may also be converted into control points at the same positions. For example, the 6-parameter affine transformation models of the combinations {CP1, CP2, CP4}, {CP2, CP3, CP4}, and {CP1, CP3, CP4} are converted to be represented by the control points {CP1, CP2, CP3}. The conversion method is to substitute the motion vectors of the control points and their coordinate information into formula (14) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2, CP3} to obtain their motion vectors, which are used as a group of candidate motion vector predictors.
公式(14)(6参数仿射模型的参数化形式)/Formula (14) (the parametric form of the 6-parameter affine model):

vx = a1 + a3 × x + a4 × y

vy = a2 + a5 × x + a6 × y
在公式(14)中,a1、a2、a3、a4、a5、a6为参数模型中的参数,(x,y)表示位置坐标。In formula (14), a1, a2, a3, a4, a5, and a6 are parameters of the parametric model, and (x, y) represents position coordinates.
更直接地,也可以按照以下公式进行转换得到以左上控制点、右上控制点、左下控制点表示的一组运动矢量预测值,并加入候选运动信息列表:More directly, the conversion may also be performed according to the following formulas to obtain a group of motion vector predictors represented by the upper-left, upper-right, and lower-left control points, which is added to the candidate motion information list:
{CP1,CP2,CP4}转换得到{CP1,CP2,CP3}的公式(15),其中(vx3,vy3)为CP4的运动矢量:Formula (15) for converting {CP1, CP2, CP4} to {CP1, CP2, CP3}, where (vx3, vy3) is the motion vector of CP4:

vx2 = vx0 + vx3 - vx1

vy2 = vy0 + vy3 - vy1
{CP2,CP3,CP4}转换得到{CP1,CP2,CP3}的公式(16),其中(vx3,vy3)为CP4的运动矢量:Formula (16) for converting {CP2, CP3, CP4} to {CP1, CP2, CP3}, where (vx3, vy3) is the motion vector of CP4:

vx0 = vx1 + vx2 - vx3

vy0 = vy1 + vy2 - vy3
{CP1,CP3,CP4}转换得到{CP1,CP2,CP3}的公式(17),其中(vx3,vy3)为CP4的运动矢量:Formula (17) for converting {CP1, CP3, CP4} to {CP1, CP2, CP3}, where (vx3, vy3) is the motion vector of CP4:

vx1 = vx0 + vx3 - vx2

vy1 = vy0 + vy3 - vy2
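For an affine motion field, the motion vectors at the four block corners are linearly dependent: the vector at CP4 equals the vector at CP2 plus the vector at CP3 minus the vector at CP1. Under that assumption, the three-control-point conversions (15) to (17) reduce to additions and subtractions, as the following sketch shows (function name and CP numbering 1 to 4 are our own conventions):

```python
def to_cp123(combo, v):
    """Convert a 3-control-point combination to the {CP1, CP2, CP3} form.

    combo: control-point ids in order, e.g. (1, 2, 4); v: their (vx, vy)
    motion vectors in the same order. Relies on the affine identity
    v(CP4) = v(CP2) + v(CP3) - v(CP1) between the four corner MVs.
    """
    mv = dict(zip(combo, v))
    add = lambda p, q: (p[0] + q[0], p[1] + q[1])
    sub = lambda p, q: (p[0] - q[0], p[1] - q[1])
    if combo == (1, 2, 4):       # conversion (15): v3 = v1 + v4 - v2
        mv[3] = sub(add(mv[1], mv[4]), mv[2])
    elif combo == (2, 3, 4):     # conversion (16): v1 = v2 + v3 - v4
        mv[1] = sub(add(mv[2], mv[3]), mv[4])
    elif combo == (1, 3, 4):     # conversion (17): v2 = v1 + v4 - v3
        mv[2] = sub(add(mv[1], mv[4]), mv[3])
    return [mv[1], mv[2], mv[3]]
```

Taking the sample affine field v(x, y) = (x + y, 2x) on a unit block, every admissible combination converts to the same {CP1, CP2, CP3} triplet, as expected.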
需要说明的是,可仅采用方式一预测得到的候选运动矢量预测值来构建候选运动信息列表,也可仅采用方式二预测得到的候选运动矢量预测值来构建候选运动信息列表,还可采用方式一预测得到的候选运动矢量预测值和方式二预测得到的候选运动矢量预测值来共同构建候选运动信息列表。另外,还可将候选运动信息列表按照预先配置的规则进行剪枝和排序,然后将其截断或填充至特定个数。当候选运动信息列表中的每一组候选运动矢量预测值包括三个控制点的运动矢量预测值时,可称该候选运动信息列表为三元组列表;当候选运动信息列表中的每一组候选运动矢量预测值包括两个控制点的运动矢量预测值时,可称该候选运动信息列表为二元组列表。It should be noted that the candidate motion information list may be constructed using only the candidate motion vector predictors obtained by prediction in manner 1, using only those obtained by prediction in manner 2, or using the candidate motion vector predictors obtained in manner 1 and manner 2 together. In addition, the candidate motion information list may be pruned and sorted according to preconfigured rules, and then truncated or padded to a specific number of entries. When each group of candidate motion vector predictors in the candidate motion information list includes the motion vector predictors of three control points, the list may be called a triplet list; when each group includes the motion vector predictors of two control points, the list may be called a 2-tuple list.
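The pruning, truncation, and padding of the candidate motion information list described above can be sketched as follows. The maximum list length and the zero-motion-vector padding are illustrative choices, since the text does not fix them, and the function name is our own:

```python
def build_candidate_list(model_based, construction_based, max_len=5):
    """Merge candidates from manner 1 (motion-model based) and manner 2
    (control-point-combination based), prune duplicates while keeping
    order, then truncate or pad the list to max_len entries.

    Each candidate is a tuple of control-point MVs, e.g. ((vx0, vy0),
    (vx1, vy1)). max_len and the zero-MV padding are illustrative.
    """
    merged, seen = [], set()
    for cand in list(model_based) + list(construction_based):
        if cand not in seen:     # pruning: drop exact duplicates
            seen.add(cand)
            merged.append(cand)
    merged = merged[:max_len]    # truncate to the target length
    n_cps = len(merged[0]) if merged else 2
    while len(merged) < max_len: # pad with zero-MV candidates
        merged.append(tuple((0, 0) for _ in range(n_cps)))
    return merged
```

A candidate produced by both manners thus appears only once, and the encoder-side index of step S722 can refer to a list of fixed length.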
步骤S722:视频编码器根据率失真代价准则,从候选运动信息列表中确定目标候选运动矢量组。具体地,针对候选运动信息列表中的每一个候选运动矢量组,计算得到当前块每个子块的运动矢量,进行运动补偿得到每个子块的预测值,从而得到当前块的预测值。选择出预测值与原始值误差最小的候选运动矢量组作为一组最佳的运动矢量预测值,即目标候选运动矢量组。另外,确定出的目标候选运动矢量组用于作为一组控制点的最优候选运动矢量预测值,该目标候选运动矢量组在该候选运动信息列表中对应有一个唯一的索引号。Step S722: The video encoder determines a target candidate motion vector group from the candidate motion information list according to the rate-distortion cost criterion. Specifically, for each candidate motion vector group in the candidate motion information list, the motion vector of each sub-block of the current block is calculated, and motion compensation is performed to obtain the prediction value of each sub-block, thereby obtaining the prediction value of the current block. The candidate motion vector group with the smallest error between the predicted value and the original value is selected as a set of the best motion vector prediction values, that is, the target candidate motion vector group. In addition, the determined target candidate motion vector group is used as an optimal candidate motion vector prediction value for a group of control points, and the target candidate motion vector group corresponds to a unique index number in the candidate motion information list.
步骤S723:视频编码器将与所述目标候选运动矢量组、参考帧索引和预测方向对应的索引编入待传输的码流。Step S723: The video encoder encodes an index corresponding to the target candidate motion vector group, a reference frame index, and a prediction direction into a code stream to be transmitted.
可选的,merge模式下除了执行上述步骤S721-S723之外,还可以执行步骤S724-S725。Optionally, in the merge mode, in addition to steps S721-S723 described above, steps S724-S725 may also be performed.
步骤S724:视频编码器根据以上确定出的当前编码块的控制点的运动矢量值采用参数仿射变换模型获得当前编码块中每个子块的运动矢量值。Step S724: The video encoder uses the parameter affine transformation model to obtain the motion vector value of each sub-block in the current coding block according to the motion vector value of the control point of the current coding block determined above.
具体地,目标候选运动矢量组中包括两个(左上控制点和右上控制点)或者三个控制点(例如,左上控制点、右上控制点和左下控制点)的运动矢量。对于当前编码块的每一个子块(一个子块也可以等效为一个运动补偿单元),可采用运动补偿单元中预设位置像素点的运动信息来表示该运动补偿单元内所有像素点的运动信息。假设运动补偿单元的尺寸为MxN(M小于等于当前编码块的宽度W,N小于等于当前编码块的高度H,其中M、N、W、H为正整数,通常为2的幂次方,如4、8、16、32、64、128等),则预设位置像素点可以为运动补偿单元中心点(M/2,N/2)、左上像素点(0,0),右上像素点(M-1,0),或其他位置的像素点。图8A示意了4x4的运动补偿单元,图8B示意了8x8的运动补偿单元。Specifically, the target candidate motion vector group includes the motion vectors of two control points (the upper-left and upper-right control points) or of three control points (for example, the upper-left, upper-right, and lower-left control points). For each sub-block of the current coding block (a sub-block is also equivalent to a motion compensation unit), the motion information of a pixel at a preset position in the motion compensation unit may be used to represent the motion information of all pixels in that motion compensation unit. Assuming that the size of the motion compensation unit is MxN (M is less than or equal to the width W of the current coding block, and N is less than or equal to the height H of the current coding block, where M, N, W, and H are positive integers, usually powers of 2 such as 4, 8, 16, 32, 64, or 128), the preset-position pixel may be the center point (M/2, N/2) of the motion compensation unit, its upper-left pixel (0, 0), its upper-right pixel (M-1, 0), or a pixel at another position. FIG. 8A illustrates a 4x4 motion compensation unit, and FIG. 8B illustrates an 8x8 motion compensation unit.
运动补偿单元中心点相对于当前编码块左上顶点像素的坐标使用公式(5)计算得到,其中i为水平方向第i个运动补偿单元(从左到右),j为竖直方向第j个运动补偿单元(从上到下),(x(i,j),y(i,j))表示第(i,j)个运动补偿单元中心点相对于当前编码块左上控制点像素的坐标。再根据当前编码块的仿射模型类型(6参数或4参数),将(x(i,j),y(i,j))代入6参数仿射模型公式(6-1)或者将(x(i,j),y(i,j))代入4参数仿射模型公式(6-2),获得每个运动补偿单元中心点的运动信息,作为该运动补偿单元内所有像素点的运动矢量(vx(i,j),vy(i,j))。The coordinates of the center point of a motion compensation unit relative to the upper-left vertex pixel of the current coding block are calculated using formula (5), where i is the index of the i-th motion compensation unit in the horizontal direction (from left to right), j is the index of the j-th motion compensation unit in the vertical direction (from top to bottom), and (x(i,j), y(i,j)) represents the coordinates of the center point of the (i, j)-th motion compensation unit relative to the upper-left control-point pixel of the current coding block. Then, according to the affine model type of the current coding block (6-parameter or 4-parameter), (x(i,j), y(i,j)) is substituted into the 6-parameter affine model formula (6-1) or into the 4-parameter affine model formula (6-2) to obtain the motion information of the center point of each motion compensation unit, which serves as the motion vector (vx(i,j), vy(i,j)) of all pixels in that motion compensation unit.
可选的,当前编码块为6参数编码块时,在基于所述目标候选运动矢量组得到所述当前编码块的一个或多个子块的运动矢量时,若所述当前编码块的下边界与所述当前编码块所在的CTU的下边界重合,则所述当前编码块的左下角的子块的运动矢量为根据所述三个控制点构造的6参数仿射模型和所述当前编码块的左下角的位置坐标(0,H)计算得到,所述当前编码块的右下角的子块的运动矢量为根据所述三个控制点构造的6参数仿射模型和所述当前编码块的右下角的位置坐标(W,H)计算得到。例如,将当前编码块的左下角的位置坐标(0,H)代入到该6参数仿射模型即可得到当前编码块的左下角的子块的运动矢量(而不是将左下角的子块的中心点坐标代入该仿射模型进行计算),将当前编码块的右下角的位置坐标(W,H)代入到该6参数仿射模型即可得到当前编码块的右下角的子块的运动矢量(而不是将右下角的子块的中心点坐标代入该仿射模型进行计算)。这样一来,该当前编码块的左下控制点的运动矢量和右下控制点的运动矢量被用到时(例如,后续其他块基于该当前块的左下控制点和右下控制点的运动矢量构建该其他块的候选运动信息列表),用到的是准确的值而不是估算值。其中,W为该当前编码块的宽,H为该当前编码块的高。Optionally, when the current coding block is a 6-parameter coding block, in obtaining the motion vectors of one or more sub-blocks of the current coding block based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU in which the current coding block is located, the motion vector of the sub-block at the lower-left corner of the current coding block is calculated from the 6-parameter affine model constructed from the three control points and the position coordinates (0, H) of the lower-left corner of the current coding block, and the motion vector of the sub-block at the lower-right corner of the current coding block is calculated from that 6-parameter affine model and the position coordinates (W, H) of the lower-right corner of the current coding block. For example, substituting the position coordinates (0, H) of the lower-left corner of the current coding block into the 6-parameter affine model yields the motion vector of the sub-block at the lower-left corner (rather than substituting the center-point coordinates of that sub-block into the affine model), and substituting the position coordinates (W, H) of the lower-right corner into the 6-parameter affine model yields the motion vector of the sub-block at the lower-right corner (rather than substituting the center-point coordinates of that sub-block into the affine model). In this way, when the motion vectors of the lower-left and lower-right control points of the current coding block are used later (for example, when subsequent blocks construct their candidate motion information lists based on the motion vectors of the lower-left and lower-right control points of the current block), accurate values rather than estimated values are used, where W is the width of the current coding block and H is its height.
Optionally, when the current coding block is a 4-parameter coding block and the motion vectors of one or more sub-blocks of the current coding block are obtained based on the target candidate motion vector group, if the lower boundary of the current coding block coincides with the lower boundary of the CTU in which the current coding block is located, the motion vector of the sub-block in the lower-left corner of the current coding block is calculated from the 4-parameter affine model constructed from the two control points and the position coordinates (0, H) of the lower-left corner of the current coding block, and the motion vector of the sub-block in the lower-right corner of the current coding block is calculated from the 4-parameter affine model constructed from the two control points and the position coordinates (W, H) of the lower-right corner of the current coding block. For example, substituting the position coordinates (0, H) of the lower-left corner of the current coding block into the 4-parameter affine model yields the motion vector of the sub-block in the lower-left corner (rather than substituting the center-point coordinates of the lower-left sub-block into the affine model for calculation), and substituting the position coordinates (W, H) of the lower-right corner into the 4-parameter affine model yields the motion vector of the sub-block in the lower-right corner (rather than substituting the center-point coordinates of the lower-right sub-block into the affine model for calculation). In this way, when the motion vectors of the lower-left control point and the lower-right control point of the current coding block are used later (for example, when other blocks subsequently construct their candidate motion information lists based on these motion vectors), accurate values are used instead of estimated values. Here, W is the width of the current coding block, and H is the height of the current coding block.
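The corner-exact computation described above can be illustrated with a minimal sketch (not the patent's reference implementation; the 4-parameter model here is anchored at the block's top two control points, and all function names and sample numbers are illustrative assumptions):

```python
# Hedged sketch: evaluating a 4-parameter affine model at the exact corner
# coordinates (0, H) and (W, H) instead of at the sub-block centers.

def affine_mv(cp0, cp1, W, x, y):
    """4-parameter affine model from the two top control points.
    cp0 = (vx0, vy0) at (0, 0), cp1 = (vx1, vy1) at (W, 0)."""
    vx0, vy0 = cp0
    vx1, vy1 = cp1
    a = (vx1 - vx0) / W   # horizontal gradient of vx
    b = (vy1 - vy0) / W   # horizontal gradient of vy
    # For a 4-parameter (rotation/zoom) model the vertical gradients are (-b, a).
    vx = vx0 + a * x - b * y
    vy = vy0 + b * x + a * y
    return vx, vy

W, H = 16, 8
cp0, cp1 = (4.0, 2.0), (8.0, 2.0)
# Lower boundary coincides with the CTU boundary: use the exact corners.
mv_bottom_left  = affine_mv(cp0, cp1, W, 0, H)   # position (0, H)
mv_bottom_right = affine_mv(cp0, cp1, W, W, H)   # position (W, H)
# An ordinary 4x4 sub-block would instead use its center, e.g. (2, H - 2).
mv_center = affine_mv(cp0, cp1, W, 2, H - 2)
```

The corner evaluations produce the exact control-point motion vectors that later blocks can reuse, while interior sub-blocks keep using their center coordinates.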
Step S725: The video encoder performs motion compensation according to the motion vector of each sub-block in the current coding block to obtain a pixel prediction value for each sub-block. Specifically, the pixel prediction values of the current coding block are obtained by prediction according to the motion vectors of the one or more sub-blocks of the current coding block and the reference frame index and prediction direction indicated by the index.
It can be understood that, when the coding tree unit (CTU) in which the first neighboring affine coding block is located is above the position of the current coding block, the information of the lowest control points of the first neighboring affine coding block has already been read from memory. Therefore, in the foregoing solution, when candidate motion vectors are constructed according to the first group of control points of the first neighboring affine coding block, the first group of control points includes the lower-left control point and the lower-right control point of the first neighboring affine coding block, rather than always taking the upper-left, upper-right, and lower-left control points of the first neighboring coding block as the first group of control points, as in the prior art. With the method for determining the first group of control points in this application, the information of the first group of control points (for example, position coordinates and motion vectors) can directly reuse information already read from memory, which reduces memory reads and improves coding performance.
FIG. 9 is a flowchart illustrating a process 900 of a decoding method according to an embodiment of this application. The process 900 may be performed by the video decoder 100, and specifically by the inter prediction unit 210 and the entropy decoding unit (also called an entropy decoder) 203 of the video decoder 200. The process 900 is described as a series of steps or operations; it should be understood that the process 900 may be performed in various orders and/or concurrently, and is not limited to the execution order shown in FIG. 9. Assume that a video data stream having a plurality of video frames is being decoded by the video decoder. If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block, a group of candidate motion vector predictors is determined based on the lower-left control point and the lower-right control point of the first neighboring affine decoding block, corresponding to the process shown in FIG. 9 and described as follows:
If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block, a group of candidate motion vector predictors is determined based on the lower-left control point and the lower-right control point of the first neighboring affine decoding block, as described in detail below:
Step S1200: The video decoder determines the inter prediction mode of the current decoding block.
Specifically, the inter prediction mode may be the advanced motion vector prediction (AMVP) mode, or may be the merge mode.
If it is determined that the inter prediction mode of the current decoding block is the AMVP mode, steps S1211 to S1216 are performed.
If it is determined that the inter prediction mode of the current decoding block is the merge mode, steps S1221 to S1225 are performed.
AMVP mode:
Step S1211: The video decoder constructs a candidate motion vector predictor (MVP) list.
Specifically, the video decoder constructs the candidate MVP list (also called a candidate motion vector list) through an inter prediction unit (also called an inter prediction module). The list may be constructed in either of the two manners provided below, or in a combination of the two manners. The constructed candidate MVP list may be a triplet candidate MVP list or a 2-tuple candidate MVP list. The two manners are as follows:
Manner 1: The candidate MVP list is constructed by using a motion-model-based motion vector prediction method.
First, all or some of the neighboring blocks of the current decoding block are traversed in a predetermined order to determine the neighboring affine decoding blocks among them; there may be one or more such neighboring affine decoding blocks. For example, the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in sequence to determine the neighboring affine decoding blocks among them. The inter prediction unit determines at least one group of candidate motion vector predictors based on at least one neighboring affine decoding block (each group of candidate motion vector predictors is a 2-tuple or a triplet). The following uses one neighboring affine decoding block as an example, which for ease of description is called the first neighboring affine decoding block:
A first affine model is determined according to the motion vectors of the control points of the first neighboring affine decoding block, and the motion vectors of the control points of the current decoding block are then predicted according to the first affine model. The manner of predicting the motion vectors of the control points of the current decoding block based on the motion vectors of the control points of the first neighboring affine decoding block differs depending on the parameter model of the current decoding block, so the cases are described separately below.
A. The parameter model of the current decoding block is a 4-parameter affine transformation model, and the derivation may be as follows (see FIG. 9A):
If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block and the first neighboring affine decoding block is a 4-parameter affine decoding block, the motion vectors of the two lowest control points of the first neighboring affine decoding block are obtained. For example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point and the position coordinates (x7, y7) and motion vector (vx7, vy7) of the lower-right control point of the first neighboring affine decoding block may be obtained (step S1201).
The first affine model is formed according to the motion vectors and position coordinates of the two lowest control points of the first neighboring affine decoding block (the first affine model obtained in this case is a 4-parameter affine model) (step S1202).
The motion vectors of the control points of the current decoding block are predicted according to the first affine model. For example, the position coordinates of the upper-left control point and the position coordinates of the upper-right control point of the current decoding block may each be substituted into the first affine model, thereby predicting the motion vector of the upper-left control point and the motion vector of the upper-right control point of the current decoding block, as shown in formulas (1) and (2) (step S1203).
vx0 = vx6 + (vx7 - vx6)/(x7 - x6) × (x0 - x6) - (vy7 - vy6)/(x7 - x6) × (y0 - y6)
vy0 = vy6 + (vy7 - vy6)/(x7 - x6) × (x0 - x6) + (vx7 - vx6)/(x7 - x6) × (y0 - y6)    (1)
vx1 = vx6 + (vx7 - vx6)/(x7 - x6) × (x1 - x6) - (vy7 - vy6)/(x7 - x6) × (y1 - y6)
vy1 = vy6 + (vy7 - vy6)/(x7 - x6) × (x1 - x6) + (vx7 - vx6)/(x7 - x6) × (y1 - y6)    (2)
In formulas (1) and (2), (x0, y0) are the coordinates of the upper-left control point of the current decoding block, and (x1, y1) are the coordinates of the upper-right control point of the current decoding block; (vx0, vy0) is the predicted motion vector of the upper-left control point of the current decoding block, and (vx1, vy1) is the predicted motion vector of the upper-right control point of the current decoding block.
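Formulas (1) and (2) can be sketched as follows (an illustrative Python rendition, not the patent's reference implementation; the function name and the sample coordinates and motion vectors are assumptions):

```python
def predict_cp(xq, yq, p6, mv6, p7, mv7):
    """Evaluate the 4-parameter affine model built from the neighbour's
    lower-left (p6, mv6) and lower-right (p7, mv7) control points at (xq, yq),
    following formulas (1)/(2): horizontal gradients come from the two bottom
    points, and the vertical gradients of a 4-parameter model are (-b, a)."""
    (x6, y6), (vx6, vy6) = p6, mv6
    (x7, y7), (vx7, vy7) = p7, mv7
    a = (vx7 - vx6) / (x7 - x6)
    b = (vy7 - vy6) / (x7 - x6)
    vx = vx6 + a * (xq - x6) - b * (yq - y6)
    vy = vy6 + b * (xq - x6) + a * (yq - y6)
    return vx, vy

# Neighbour's bottom control points (illustrative numbers); the current
# block's top edge touches the neighbour's bottom edge at y = 16.
p6, mv6 = (0, 16), (1.0, 0.0)
p7, mv7 = (16, 16), (3.0, 0.0)
mv0 = predict_cp(0, 16, p6, mv6, p7, mv7)    # current upper-left control point
mv1 = predict_cp(16, 16, p6, mv6, p7, mv7)   # current upper-right control point
```

Because the query points here coincide with the neighbour's control points, the predicted vectors reproduce them exactly; at any other (xq, yq) the model interpolates/extrapolates.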
Optionally, the position coordinates (x6, y6) of the lower-left control point and the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine decoding block are both calculated from the position coordinates (x4, y4) of the upper-left control point of the first neighboring affine decoding block: the position coordinates (x6, y6) of the lower-left control point of the first neighboring affine decoding block are (x4, y4 + cuH), and the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine decoding block are (x4 + cuW, y4 + cuH), where cuW is the width of the first neighboring affine decoding block and cuH is the height of the first neighboring affine decoding block. In addition, the motion vector of the lower-left control point of the first neighboring affine decoding block is the motion vector of the lower-left sub-block of the first neighboring affine decoding block, and the motion vector of the lower-right control point is the motion vector of the lower-right sub-block. It can be seen that the position coordinates of the lower-left control point and the lower-right control point of the first neighboring affine decoding block are both derived rather than read from memory, so this method further reduces memory reads and improves decoding performance. As another alternative, the position coordinates of the lower-left control point and the lower-right control point may be stored in memory in advance and read from memory when they are needed later.
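The coordinate derivation above is simple arithmetic and can be sketched as follows (the function name and sample numbers are illustrative assumptions):

```python
def bottom_control_point_coords(x4, y4, cuW, cuH):
    """Derive the neighbour's lower-left (x6, y6) and lower-right (x7, y7)
    control-point coordinates from its upper-left corner (x4, y4) plus its
    width cuW and height cuH, instead of reading them from memory."""
    lower_left  = (x4, y4 + cuH)          # (x6, y6)
    lower_right = (x4 + cuW, y4 + cuH)    # (x7, y7)
    return lower_left, lower_right

# A hypothetical 16x16 neighbouring block whose upper-left corner is (32, 0):
ll, lr = bottom_control_point_coords(32, 0, 16, 16)
```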
If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block and the first neighboring affine decoding block is a 6-parameter affine decoding block, candidate motion vector predictors for the control points of the current block are not generated based on the first neighboring affine decoding block.
If the first neighboring affine decoding block is not located in the CTU above the current decoding block, the manner of predicting the motion vectors of the control points of the current decoding block is not limited here. For ease of understanding, however, an optional determination manner is given below:
The position coordinates and motion vectors of the three control points of the first neighboring affine decoding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper-left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper-right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point.
A 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first neighboring affine decoding block.
The position coordinates (x0, y0) of the upper-left control point and the position coordinates (x1, y1) of the upper-right control point of the current decoding block are substituted into the 6-parameter affine model to predict the motion vector of the upper-left control point and the motion vector of the upper-right control point of the current decoding block, as shown in formulas (4) and (5).
vx0 = vx4 + (vx5 - vx4)/(x5 - x4) × (x0 - x4) + (vx6 - vx4)/(y6 - y4) × (y0 - y4)
vy0 = vy4 + (vy5 - vy4)/(x5 - x4) × (x0 - x4) + (vy6 - vy4)/(y6 - y4) × (y0 - y4)    (4)
vx1 = vx4 + (vx5 - vx4)/(x5 - x4) × (x1 - x4) + (vx6 - vx4)/(y6 - y4) × (y1 - y4)
vy1 = vy4 + (vy5 - vy4)/(x5 - x4) × (x1 - x4) + (vy6 - vy4)/(y6 - y4) × (y1 - y4)    (5)
In formulas (4) and (5), (vx0, vy0) is the predicted motion vector of the upper-left control point of the current decoding block, and (vx1, vy1) is the predicted motion vector of the upper-right control point of the current decoding block.
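Formulas (4) and (5) amount to evaluating a 6-parameter model, whose horizontal gradients come from the upper-left/upper-right control-point pair and whose vertical gradients come from the upper-left/lower-left pair. A hedged sketch (names and sample values are illustrative assumptions):

```python
def affine6_mv(xq, yq, cps):
    """Evaluate a 6-parameter affine model built from three control points.
    cps = [((x4, y4), (vx4, vy4)), ((x5, y5), (vx5, vy5)), ((x6, y6), (vx6, vy6))]
    with (x5, y5) to the right of (x4, y4) and (x6, y6) below it, as in
    formulas (4)/(5)."""
    ((x4, y4), (vx4, vy4)), ((x5, y5), (vx5, vy5)), ((x6, y6), (vx6, vy6)) = cps
    dvx_dx = (vx5 - vx4) / (x5 - x4)
    dvy_dx = (vy5 - vy4) / (x5 - x4)
    dvx_dy = (vx6 - vx4) / (y6 - y4)
    dvy_dy = (vy6 - vy4) / (y6 - y4)
    vx = vx4 + dvx_dx * (xq - x4) + dvx_dy * (yq - y4)
    vy = vy4 + dvy_dx * (xq - x4) + dvy_dy * (yq - y4)
    return vx, vy

# Illustrative neighbour control points for a 16x16 block at the origin:
cps = [((0, 0), (1.0, 1.0)), ((16, 0), (2.0, 1.0)), ((0, 16), (1.0, 3.0))]
mv_top_left  = affine6_mv(0, 0, cps)
mv_top_right = affine6_mv(16, 0, cps)
```

Unlike the 4-parameter case, the two vertical gradients are independent of the horizontal ones, which is why a third control point is required.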
B. The parameter model of the current decoding block is a 6-parameter affine transformation model, and the derivation may be as follows:
If the first neighboring affine decoding block is located in the CTU above the current decoding block and the first neighboring affine decoding block is a 4-parameter affine decoding block, the position coordinates and motion vectors of the two lowest control points of the first neighboring affine decoding block are obtained. For example, the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point and the position coordinates (x7, y7) and motion vector (vx7, vy7) of the lower-right control point of the first neighboring affine decoding block may be obtained.
The first affine model is formed according to the motion vectors of the two lowest control points of the first neighboring affine decoding block (the first affine model obtained in this case is a 4-parameter affine model).
The motion vectors of the control points of the current decoding block are predicted according to the first affine model. For example, the position coordinates of the upper-left control point, the upper-right control point, and the lower-left control point of the current decoding block may each be substituted into the first affine model, thereby predicting the motion vectors of the upper-left, upper-right, and lower-left control points of the current decoding block, as shown in formulas (1), (2), and (3).
vx2 = vx6 + (vx7 - vx6)/(x7 - x6) × (x2 - x6) - (vy7 - vy6)/(x7 - x6) × (y2 - y6)
vy2 = vy6 + (vy7 - vy6)/(x7 - x6) × (x2 - x6) + (vx7 - vx6)/(x7 - x6) × (y2 - y6)    (3)
Formulas (1) and (2) have been described above. In formulas (1), (2), and (3), (x0, y0) are the coordinates of the upper-left control point of the current decoding block, (x1, y1) are the coordinates of the upper-right control point, and (x2, y2) are the coordinates of the lower-left control point; (vx0, vy0) is the predicted motion vector of the upper-left control point of the current decoding block, (vx1, vy1) is the predicted motion vector of the upper-right control point, and (vx2, vy2) is the predicted motion vector of the lower-left control point.
If the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block and the first neighboring affine decoding block is a 6-parameter affine decoding block, candidate motion vector predictors for the control points of the current block are not generated based on the first neighboring affine decoding block.
If the first neighboring affine decoding block is not located in the CTU above the current decoding block, the manner of predicting the motion vectors of the control points of the current decoding block is not limited here. For ease of understanding, however, an optional determination manner is given below:
The position coordinates and motion vectors of the three control points of the first neighboring affine decoding block may be obtained, for example, the position coordinates (x4, y4) and motion vector (vx4, vy4) of the upper-left control point, the position coordinates (x5, y5) and motion vector (vx5, vy5) of the upper-right control point, and the position coordinates (x6, y6) and motion vector (vx6, vy6) of the lower-left control point.
A 6-parameter affine model is formed according to the position coordinates and motion vectors of the three control points of the first neighboring affine decoding block.
The position coordinates (x0, y0) of the upper-left control point, the position coordinates (x1, y1) of the upper-right control point, and the position coordinates (x2, y2) of the lower-left control point of the current decoding block are substituted into the 6-parameter affine model to predict the motion vectors of the upper-left, upper-right, and lower-left control points of the current decoding block, as shown in formulas (4), (5), and (6).
vx2 = vx4 + (vx5 - vx4)/(x5 - x4) × (x2 - x4) + (vx6 - vx4)/(y6 - y4) × (y2 - y4)
vy2 = vy4 + (vy5 - vy4)/(x5 - x4) × (x2 - x4) + (vy6 - vy4)/(y6 - y4) × (y2 - y4)    (6)
Formulas (4) and (5) have been described above. In formulas (4), (5), and (6), (vx0, vy0) is the predicted motion vector of the upper-left control point of the current decoding block, (vx1, vy1) is the predicted motion vector of the upper-right control point, and (vx2, vy2) is the predicted motion vector of the lower-left control point.
Manner 2: The candidate MVP list is constructed by using a motion vector prediction method based on control-point combination.
The manner of constructing the candidate MVP list differs depending on the parameter model of the current decoding block; the cases are described below.
A. The parameter model of the current decoding block is a 4-parameter affine transformation model, and the derivation may be as follows:
The motion vectors of the upper-left vertex and the upper-right vertex of the current decoding block are estimated by using the motion information of decoded blocks neighboring the current decoding block. As shown in FIG. 7B: first, the motion vectors of the decoded blocks A and/or B and/or C neighboring the upper-left vertex are used as candidate motion vectors for the motion vector of the upper-left vertex of the current decoding block, and the motion vectors of the decoded blocks D and/or E neighboring the upper-right vertex are used as candidate motion vectors for the motion vector of the upper-right vertex of the current decoding block. One candidate motion vector of the upper-left vertex and one candidate motion vector of the upper-right vertex are combined to obtain a group of candidate motion vector predictors, and the plurality of entries obtained through such combination can form the candidate MVP list.
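The combination step described above can be sketched as a cartesian product of the per-vertex candidate sets (the motion vector values are placeholders; reference-index checks and pruning are omitted here):

```python
from itertools import product

# Hypothetical candidate MVs gathered from spatial neighbours (FIG. 7B):
top_left_candidates  = [(1, 0), (1, 1)]   # from blocks A / B / C
top_right_candidates = [(2, 0), (2, 1)]   # from blocks D / E

# Each list entry is a 2-tuple of control-point MVs for a 4-parameter model.
mvp_list = [(tl, tr) for tl, tr in product(top_left_candidates,
                                           top_right_candidates)]
```

For the 6-parameter case in B below, the product would additionally range over the lower-left vertex candidates, yielding triplets instead of 2-tuples.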
B. The parameter model of the current decoding block is a 6-parameter affine transformation model, and the derivation may be as follows:
The motion vectors of the upper-left vertex, the upper-right vertex, and the lower-left vertex of the current decoding block are estimated by using the motion information of decoded blocks neighboring the current decoding block. As shown in FIG. 7B: first, the motion vectors of the decoded blocks A and/or B and/or C neighboring the upper-left vertex are used as candidate motion vectors for the motion vector of the upper-left vertex of the current decoding block; the motion vectors of the decoded blocks D and/or E neighboring the upper-right vertex are used as candidate motion vectors for the motion vector of the upper-right vertex of the current decoding block; and the motion vectors of the decoded blocks F and/or G neighboring the lower-left vertex are used as candidate motion vectors for the motion vector of the lower-left vertex of the current decoding block. One candidate motion vector of the upper-left vertex, one candidate motion vector of the upper-right vertex, and one candidate motion vector of the lower-left vertex are combined to obtain a group of candidate motion vector predictors, and the plurality of groups of candidate motion vector predictors obtained through such combination can form the candidate MVP list.
It should be noted that the candidate MVP list may be constructed by using only the candidate motion vector predictors obtained in Manner 1, or only those obtained in Manner 2, or by using the candidate motion vector predictors obtained in Manner 1 and Manner 2 together. In addition, the candidate MVP list may be pruned and sorted according to a preconfigured rule, and then truncated or padded to a specific number of entries. When each group of candidate motion vector predictors in the candidate MVP list includes the motion vector predictors of three control points, the list may be called a triplet list; when each group includes the motion vector predictors of two control points, the list may be called a 2-tuple list.
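The prune/truncate step might look like the following sketch (the padding policy shown here, repeating the last entry, is an assumption on our part; the text only states that the list is truncated or padded to a specific number):

```python
def prune_and_truncate(candidates, max_len):
    """Remove duplicate candidate groups while keeping first-seen order,
    then truncate to max_len, padding by repeating the last entry if the
    list is too short (padding policy is illustrative)."""
    seen, out = set(), []
    for cand in candidates:
        if cand not in seen:
            seen.add(cand)
            out.append(cand)
    out = out[:max_len]
    while out and len(out) < max_len:
        out.append(out[-1])
    return out

cands = [((1, 0), (2, 0)), ((1, 0), (2, 0)), ((1, 1), (2, 1))]
mvp_list = prune_and_truncate(cands, 2)
```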
Step S1212: The video decoder parses the bitstream to obtain an index and a motion vector difference (MVD).
Specifically, the video decoder may parse the bitstream through the entropy decoding unit. The index indicates the target candidate motion vector group of the current decoding block, and the target candidate motion vector group represents the motion vector predictors of a group of control points of the current decoding block.
Step S1213: The video decoder determines the target motion vector group from the candidate MVP list according to the index.
Specifically, the target candidate motion vector group that the video decoder determines from the candidates according to the index serves as the optimal candidate motion vector predictors (optionally, when the length of the candidate MVP list is 1, the target motion vector group can be determined directly, without parsing the bitstream to obtain the index). The optimal motion vector predictors are briefly described below.
If the parameter model of the current decoding block is a 4-parameter affine transformation model, the optimal motion vector predictors of two control points are selected from the candidate MVP list established above. For example, the video decoder parses the index number from the bitstream and then determines the optimal motion vector predictors of the two control points from the 2-tuple candidate MVP list according to the index number; each group of candidate motion vector predictors in the candidate MVP list corresponds to its own index number.
If the parameter model of the current decoding block is a 6-parameter affine transformation model, the optimal motion vector predictors of three control points are selected from the candidate MVP list established above. For example, the video decoder parses the index number from the bitstream and then determines the optimal motion vector predictors of the three control points from the triplet candidate MVP list according to the index number; each group of candidate motion vector predictors in the candidate MVP list corresponds to its own index number.
Step S1214: The video decoder determines the motion vectors of the control points of the current decoding block according to the target candidate motion vector group and the motion vector differences (MVDs) parsed from the bitstream.
If the parameter model of the current decoding block is a 4-parameter affine transformation model, the motion vector differences of the two control points of the current decoding block are decoded from the bitstream, and a new candidate motion vector group is obtained according to the MVD of each control point and the target candidate motion vector group indicated by the index. For example, the MVD of the upper-left control point and the MVD of the upper-right control point are decoded from the bitstream and added to the motion vectors of the upper-left and upper-right control points in the target candidate motion vector group, respectively, to obtain a new candidate motion vector group. The new candidate motion vector group therefore includes the new motion vector values of the upper-left and upper-right control points of the current decoding block.
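The predictor-plus-MVD reconstruction can be sketched as follows (function name and sample values are illustrative assumptions; the same shape works for the 6-parameter case with three entries per group):

```python
def apply_mvds(target_group, mvds):
    """AMVP reconstruction: add each control point's decoded MVD to its
    predictor from the target candidate motion vector group."""
    return tuple((vx + dx, vy + dy)
                 for (vx, vy), (dx, dy) in zip(target_group, mvds))

target = ((4, 2), (8, 2))     # predictors of the two control points
mvds   = ((1, -1), (0, 2))    # MVDs parsed from the bitstream
new_group = apply_mvds(target, mvds)
```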
Optionally, the motion vector value of a third control point may further be obtained by using the 4-parameter affine transformation model according to the motion vector values of the two control points of the current decoding block in the new candidate motion vector group. For example, the motion vector (vx0, vy0) of the upper-left control point and the motion vector (vx1, vy1) of the upper-right control point of the current decoding block are obtained, and the motion vector (vx2, vy2) of the lower-left control point (x2, y2) of the current decoding block is then calculated by using formula (7).
vx2 = vx0 - (vy1 - vy0) × H / W
vy2 = vy0 + (vx1 - vx0) × H / W        (7)
Here, (x0, y0) is the position coordinate of the upper-left control point, (x1, y1) is the position coordinate of the upper-right control point, W is the width of the current decoding block, and H is the height of the current decoding block.
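As an illustrative, non-normative sketch of this derivation (the function name is ours, and the relation used is the standard 4-parameter affine identity vx2 = vx0 - (vy1 - vy0)·H/W, vy2 = vy0 + (vx1 - vx0)·H/W):

```python
def derive_bottom_left_cp(vx0, vy0, vx1, vy1, W, H):
    """Derive the motion vector of the lower-left control point (0, H)
    from the upper-left (vx0, vy0) and upper-right (vx1, vy1)
    control-point motion vectors under a 4-parameter affine model."""
    vx2 = vx0 - (vy1 - vy0) * H / W
    vy2 = vy0 + (vx1 - vx0) * H / W
    return vx2, vy2

# Pure translation: all three control points share the same motion vector.
print(derive_bottom_left_cp(4, 2, 4, 2, 16, 16))  # (4.0, 2.0)
```

With a rotational component (e.g. vy1 differing from vy0), the derived lower-left vector differs from both inputs, as a 4-parameter model requires.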
If the parameter model of the current decoding block is a 6-parameter affine transformation model, the motion vector differences of the three control points of the current decoding block are decoded from the bitstream, and a new candidate motion vector group is obtained from the motion vector difference MVD of each control point and the target candidate motion vector group indicated by the index. For example, the MVD of the upper-left control point, the MVD of the upper-right control point, and the MVD of the lower-left control point are decoded from the bitstream and added to the motion vectors of the upper-left, upper-right, and lower-left control points in the target candidate motion vector group, respectively, to obtain a new candidate motion vector group. The new candidate motion vector group therefore includes the motion vector values of the upper-left, upper-right, and lower-left control points of the current decoding block.
Step S1215: The video decoder uses an affine transformation model to obtain the motion vector value of each subblock in the current decoding block according to the motion vector values of the control points of the current decoding block determined above.
Specifically, the new candidate motion vector group obtained from the target candidate motion vector group and the MVDs includes the motion vectors of two control points (the upper-left and upper-right control points) or three control points (for example, the upper-left, upper-right, and lower-left control points). For each subblock of the current decoding block (a subblock is equivalent to a motion compensation unit), the motion information of a pixel at a preset position in the motion compensation unit may be used to represent the motion information of all pixels in that motion compensation unit. Assuming that the size of the motion compensation unit is MxN (M is less than or equal to the width W of the current decoding block, N is less than or equal to the height H of the current decoding block, and M, N, W, and H are positive integers, usually powers of two such as 4, 8, 16, 32, 64, or 128), the preset-position pixel may be the center point (M/2, N/2) of the motion compensation unit, the upper-left pixel (0, 0), the upper-right pixel (M-1, 0), or a pixel at another position. FIG. 8A illustrates a 4x4 motion compensation unit, and FIG. 8B illustrates an 8x8 motion compensation unit.
The coordinates of the center point of a motion compensation unit relative to the pixel at the upper-left vertex of the current decoding block are calculated using formula (8-1), where i is the index of the motion compensation unit in the horizontal direction (from left to right), j is the index in the vertical direction (from top to bottom), and (x(i,j), y(i,j)) denotes the coordinates of the center point of the (i, j)-th motion compensation unit relative to the pixel at the upper-left control point of the current decoding block. Then, according to the affine model type of the current decoding block (6-parameter or 4-parameter), (x(i,j), y(i,j)) is substituted into the 6-parameter affine model formula (8-2) or into the 4-parameter affine model formula (8-3) to obtain the motion information of the center point of each motion compensation unit, which is used as the motion vector (vx(i,j), vy(i,j)) of all pixels in that motion compensation unit.
x(i,j) = M × i + M/2, i = 0, 1, …
y(i,j) = N × j + N/2, j = 0, 1, …        (8-1)

vx = (vx1 - vx0) / W × x + (vx2 - vx0) / H × y + vx0
vy = (vy1 - vy0) / W × x + (vy2 - vy0) / H × y + vy0        (8-2)

vx = (vx1 - vx0) / W × x - (vy1 - vy0) / W × y + vx0
vy = (vy1 - vy0) / W × x + (vx1 - vx0) / W × y + vy0        (8-3)
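A minimal sketch of the per-subblock derivation for the 4-parameter case, evaluating the affine model at each motion compensation unit's center (function name ours; a real decoder works in fixed-point MV precision rather than floats):

```python
def subblock_mvs_4param(vx0, vy0, vx1, vy1, W, H, M, N):
    """For each MxN motion compensation unit of a WxH block, evaluate the
    4-parameter affine model at the unit's center point and use that as
    the motion vector of every pixel in the unit."""
    mvs = {}
    for j in range(H // N):          # vertical index, top to bottom
        for i in range(W // M):      # horizontal index, left to right
            x = M * i + M / 2        # center relative to top-left pixel
            y = N * j + N / 2
            vx = (vx1 - vx0) / W * x - (vy1 - vy0) / W * y + vx0
            vy = (vy1 - vy0) / W * x + (vx1 - vx0) / W * y + vy0
            mvs[(i, j)] = (vx, vy)
    return mvs

mvs = subblock_mvs_4param(4, 2, 4, 2, 16, 16, 4, 4)  # pure translation
print(mvs[(0, 0)], mvs[(3, 3)])  # every unit gets (4.0, 2.0)
```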
Optionally, when the current decoding block is a 6-parameter decoding block and the motion vectors of one or more subblocks of the current decoding block are obtained based on the target candidate motion vector group, if the bottom boundary of the current decoding block coincides with the bottom boundary of the CTU in which the current decoding block is located, the motion vector of the subblock in the lower-left corner of the current decoding block is calculated from the 6-parameter affine model constructed from the three control points and the position coordinate (0, H) of the lower-left corner of the current decoding block, and the motion vector of the subblock in the lower-right corner is calculated from that 6-parameter affine model and the position coordinate (W, H) of the lower-right corner of the current decoding block. For example, substituting the position coordinate (0, H) of the lower-left corner of the current decoding block into the 6-parameter affine model yields the motion vector of the lower-left corner subblock (rather than substituting the coordinate of the center point of that subblock into the affine model), and substituting the position coordinate (W, H) of the lower-right corner into the 6-parameter affine model yields the motion vector of the lower-right corner subblock (rather than substituting the coordinate of the center point of that subblock into the affine model). In this way, when the motion vectors of the lower-left and lower-right control points of the current decoding block are later used (for example, when another block subsequently constructs its candidate motion vector predictor (MVP) list based on the motion vectors of the lower-left and lower-right control points of the current block), accurate values rather than estimated values are used. Here, W is the width of the current decoding block, and H is the height of the current decoding block.
Optionally, when the current decoding block is a 4-parameter decoding block and the motion vectors of one or more subblocks of the current decoding block are obtained based on the target candidate motion vector group, if the bottom boundary of the current decoding block coincides with the bottom boundary of the CTU in which the current decoding block is located, the motion vector of the subblock in the lower-left corner of the current decoding block is calculated from the 4-parameter affine model constructed from the two control points and the position coordinate (0, H) of the lower-left corner of the current decoding block, and the motion vector of the subblock in the lower-right corner is calculated from that 4-parameter affine model and the position coordinate (W, H) of the lower-right corner of the current decoding block. For example, substituting the position coordinate (0, H) of the lower-left corner of the current decoding block into the 4-parameter affine model yields the motion vector of the lower-left corner subblock (rather than substituting the coordinate of the center point of that subblock into the affine model), and substituting the position coordinate (W, H) of the lower-right corner into the 4-parameter affine model yields the motion vector of the lower-right corner subblock (rather than substituting the coordinate of the center point of that subblock into the affine model). In this way, when the motion vectors of the lower-left and lower-right control points of the current decoding block are later used (for example, when another block subsequently constructs its candidate motion vector predictor (MVP) list based on the motion vectors of the lower-left and lower-right control points of the current block), accurate values rather than estimated values are used. Here, W is the width of the current decoding block, and H is the height of the current decoding block.
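The corner-subblock special case can be sketched as follows for the 4-parameter model (helper names ours; 4x4 subblocks and float arithmetic are assumptions for illustration only):

```python
def affine_mv_4param(vx0, vy0, vx1, vy1, W, x, y):
    """Evaluate a 4-parameter affine model at position (x, y)."""
    vx = (vx1 - vx0) / W * x - (vy1 - vy0) / W * y + vx0
    vy = (vy1 - vy0) / W * x + (vx1 - vx0) / W * y + vy0
    return vx, vy

def corner_subblock_mvs(vx0, vy0, vx1, vy1, W, H, at_ctu_bottom):
    """Lower-left / lower-right subblock MVs: when the block's bottom
    edge lies on the CTU's bottom edge, use the exact corner coordinates
    (0, H) and (W, H); otherwise fall back to the subblock centers
    (4x4 subblocks assumed here)."""
    if at_ctu_bottom:
        bl, br = (0, H), (W, H)
    else:
        bl, br = (2, H - 2), (W - 2, H - 2)
    return (affine_mv_4param(vx0, vy0, vx1, vy1, W, *bl),
            affine_mv_4param(vx0, vy0, vx1, vy1, W, *br))
```

Using the true corners means a later block that inherits these corner motion vectors as control-point predictors receives exact rather than estimated values.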
Step S1216: The video decoder performs motion compensation according to the motion vector value of each subblock in the current decoding block to obtain the pixel predictor of each subblock. For example, the corresponding subblock is located in the reference frame using the motion vector and reference frame index of each subblock, and interpolation filtering is performed to obtain the pixel predictor of each subblock.
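A toy sketch of the reference-frame fetch performed in this step, restricted to integer-pel motion (function name ours; a real decoder applies sub-pel interpolation filtering at this point, which is omitted here):

```python
def motion_compensate_subblock(ref, x, y, mv, M, N):
    """Fetch the MxN prediction for a subblock whose top-left corner is
    at (x, y) in the current frame, displaced by mv = (vx, vy) into the
    reference frame ref (a list of pixel rows). Integer-pel only."""
    vx, vy = mv
    return [row[x + vx : x + vx + M] for row in ref[y + vy : y + vy + N]]

ref = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy reference frame
pred = motion_compensate_subblock(ref, 0, 0, (2, 1), 4, 4)
```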
Merge mode:
Step S1221: The video decoder constructs a candidate motion information list.
Specifically, the video decoder constructs the candidate motion information list (also referred to as a candidate motion vector list) through an inter prediction unit (also referred to as an inter prediction module). The list may be constructed in either of the two manners provided below, or in a combination of the two, and the constructed candidate motion information list is a triplet candidate motion information list. The two manners are as follows:
Manner 1: The candidate motion information list is constructed using a motion-model-based motion vector prediction method.
First, all or some of the neighboring blocks of the current decoding block are traversed in a predetermined order to identify the neighboring affine decoding blocks among them; the number of neighboring affine decoding blocks found may be one or more. For example, the neighboring blocks A, B, C, D, and E shown in FIG. 7A may be traversed in turn to identify the neighboring affine decoding blocks among them. The inter prediction unit determines one group of candidate motion vector predictors from each neighboring affine decoding block (each group of candidate motion vector predictors is a 2-tuple or a 3-tuple). The following description uses one neighboring affine decoding block as an example; for ease of description, it is referred to as the first neighboring affine decoding block. Details are as follows:
A first affine model is determined according to the motion vectors of the control points of the first neighboring affine decoding block, and the motion vectors of the control points of the current decoding block are then predicted according to the first affine model. Details are as follows:
If the first neighboring affine decoding block is located in the CTU above the current decoding block and the first neighboring affine decoding block is a 4-parameter affine decoding block, the position coordinates and motion vectors of the two lowest control points of the first neighboring affine decoding block are obtained. For example, the position coordinate (x6, y6) and motion vector (vx6, vy6) of the lower-left control point of the first neighboring affine decoding block, and the position coordinate (x7, y7) and motion vector (vx7, vy7) of its lower-right control point, may be obtained.
The first affine model is formed from the motion vectors of the two lowest control points of the first neighboring affine decoding block (the first affine model obtained in this case is a 4-parameter affine model).
Optionally, the motion vectors of the control points of the current decoding block are predicted according to the first affine model. For example, the position coordinates of the upper-left, upper-right, and lower-left control points of the current decoding block may each be substituted into the first affine model to predict the motion vectors of the upper-left, upper-right, and lower-left control points of the current decoding block, which form a candidate motion vector triplet that is added to the candidate motion information list, as shown in formulas (1), (2), and (3).
Optionally, the motion vectors of the control points of the current decoding block are predicted according to the first affine model. For example, the position coordinates of the upper-left and upper-right control points of the current decoding block may each be substituted into the first affine model to predict the motion vectors of the upper-left and upper-right control points of the current decoding block, which form a candidate motion vector 2-tuple that is added to the candidate motion information list, as shown in formulas (1) and (2).
In formulas (1), (2), and (3), (x0, y0) is the coordinate of the upper-left control point of the current decoding block, (x1, y1) is the coordinate of the upper-right control point, and (x2, y2) is the coordinate of the lower-left control point; in addition, (vx0, vy0) is the predicted motion vector of the upper-left control point of the current decoding block, (vx1, vy1) is the predicted motion vector of the upper-right control point, and (vx2, vy2) is the predicted motion vector of the lower-left control point.
If the first neighboring affine decoding block is located in the coding tree unit (CTU) above the current decoding block and the first neighboring affine decoding block is a 6-parameter affine decoding block, no candidate motion vector predictor for the control points of the current block is generated based on the first neighboring affine decoding block.
If the first neighboring affine decoding block is not located in the CTU above the current decoding block, the manner of predicting the motion vectors of the control points of the current decoding block is not limited here. Nevertheless, for ease of understanding, an optional determination manner is given below as an example:
The position coordinates and motion vectors of the three control points of the first neighboring affine decoding block may be obtained, for example, the position coordinate (x4, y4) and motion vector (vx4, vy4) of the upper-left control point, the position coordinate (x5, y5) and motion vector (vx5, vy5) of the upper-right control point, and the position coordinate (x6, y6) and motion vector (vx6, vy6) of the lower-left control point.
A 6-parameter affine model is formed from the position coordinates and motion vectors of the three control points of the first neighboring affine decoding block.
The position coordinate (x0, y0) of the upper-left control point, the position coordinate (x1, y1) of the upper-right control point, and the position coordinate (x2, y2) of the lower-left control point of the current decoding block are substituted into the 6-parameter affine model to predict the motion vectors of the upper-left, upper-right, and lower-left control points of the current decoding block, as shown in formulas (4), (5), and (6).
vx2 = (vx5 - vx4) / (x5 - x4) × (x2 - x4) + (vx6 - vx4) / (y6 - y4) × (y2 - y4) + vx4
vy2 = (vy5 - vy4) / (x5 - x4) × (x2 - x4) + (vy6 - vy4) / (y6 - y4) × (y2 - y4) + vy4        (6)
Formulas (4) and (5) have been described above. In formulas (4), (5), and (6), (vx0, vy0) is the predicted motion vector of the upper-left control point of the current decoding block, (vx1, vy1) is the predicted motion vector of the upper-right control point, and (vx2, vy2) is the predicted motion vector of the lower-left control point.
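The inherited-model prediction of Manner 1 can be sketched as follows (function name ours; control points are passed as (x, y, vx, vy) tuples, and the neighbor's upper-right and lower-left control points are assumed to share a row and a column with its upper-left control point, as in the formulas above):

```python
def predict_cp_from_neighbour(cp4, cp5, cp6, pos):
    """Predict one control-point MV of the current block from the three
    control points of a neighboring affine decoding block (6-parameter
    model).  cpN = (x, y, vx, vy); pos = (x, y) of the current block's
    control point."""
    x4, y4, vx4, vy4 = cp4   # neighbor upper-left
    x5, y5, vx5, vy5 = cp5   # neighbor upper-right (same row as cp4)
    x6, y6, vx6, vy6 = cp6   # neighbor lower-left (same column as cp4)
    x, y = pos
    vx = ((vx5 - vx4) / (x5 - x4) * (x - x4)
          + (vx6 - vx4) / (y6 - y4) * (y - y4) + vx4)
    vy = ((vy5 - vy4) / (x5 - x4) * (x - x4)
          + (vy6 - vy4) / (y6 - y4) * (y - y4) + vy4)
    return vx, vy
```

Evaluating this at the current block's upper-left, upper-right, and lower-left control-point positions yields one candidate 3-tuple for the list.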
Manner 2: The candidate motion information list is constructed using a control-point-combination-based motion vector prediction method.
Two schemes are given below as examples, denoted Scheme A and Scheme B:
Scheme A: The motion information of two control points of the current decoding block is combined to construct a 4-parameter affine transformation model. The two control points are combined as {CP1, CP4}, {CP2, CP3}, {CP1, CP2}, {CP2, CP4}, {CP1, CP3}, or {CP3, CP4}. For example, a 4-parameter affine transformation model constructed using the control points CP1 and CP2 is denoted Affine (CP1, CP2).
It should be noted that a combination of different control points may also be converted into control points at the same positions. For example, the 4-parameter affine transformation model obtained from the combination {CP1, CP4}, {CP2, CP3}, {CP2, CP4}, {CP1, CP3}, or {CP3, CP4} is converted to be represented by the control points {CP1, CP2} or {CP1, CP2, CP3}. The conversion method is to substitute the motion vectors of the control points and their coordinate information into formula (9-1) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2} to obtain their motion vectors, which serve as a group of candidate motion vector predictors.
vx = a0 + a2 × x - a3 × y
vy = a1 + a3 × x + a2 × y        (9-1)
In formula (9-1), a0, a1, a2, and a3 are the parameters of the parameter model, and (x, y) denotes a position coordinate.
More directly, the conversion may also be performed according to the following formulas to obtain a group of motion vector predictors represented by the upper-left and upper-right control points, which is added to the candidate motion information list:
Formula (9-2) for converting {CP1, CP2} to {CP1, CP2, CP3}:
vx2 = vx0 - (vy1 - vy0) × H / W
vy2 = vy0 + (vx1 - vx0) × H / W        (9-2)
Formula (9-3) for converting {CP1, CP3} to {CP1, CP2, CP3}:
vx1 = vx0 + (vy2 - vy0) × W / H
vy1 = vy0 - (vx2 - vx0) × W / H        (9-3)
Formula (10) for converting {CP2, CP3} to {CP1, CP2, CP3}:
vx0 = vx1 - W × (W × (vx1 - vx2) - H × (vy1 - vy2)) / (W^2 + H^2)
vy0 = vy1 - W × (H × (vx1 - vx2) + W × (vy1 - vy2)) / (W^2 + H^2)        (10)
Formula (11) for converting {CP1, CP4} to {CP1, CP2, CP3}:
vx1 = vx0 + W × (W × (vx3 - vx0) + H × (vy3 - vy0)) / (W^2 + H^2)
vy1 = vy0 + W × (W × (vy3 - vy0) - H × (vx3 - vx0)) / (W^2 + H^2)
vx2 = vx0 - H × (W × (vy3 - vy0) - H × (vx3 - vx0)) / (W^2 + H^2)
vy2 = vy0 + H × (W × (vx3 - vx0) + H × (vy3 - vy0)) / (W^2 + H^2)        (11)
Formula (12) for converting {CP2, CP4} to {CP1, CP2, CP3}:
vx0 = vx1 - (vy3 - vy1) × W / H
vy0 = vy1 + (vx3 - vx1) × W / H
vx2 = vx3 - (vy3 - vy1) × W / H
vy2 = vy3 + (vx3 - vx1) × W / H        (12)
Formula (13) for converting {CP3, CP4} to {CP1, CP2, CP3}:
vx0 = vx2 + (vy3 - vy2) × H / W
vy0 = vy2 - (vx3 - vx2) × H / W
vx1 = vx3 + (vy3 - vy2) × H / W
vy1 = vy3 - (vx3 - vx2) × H / W        (13)
Scheme B: The motion information of three control points of the current decoding block is combined to construct a 6-parameter affine transformation model. The three control points are combined as {CP1, CP2, CP4}, {CP1, CP2, CP3}, {CP2, CP3, CP4}, or {CP1, CP3, CP4}. For example, a 6-parameter affine transformation model constructed using the control points CP1, CP2, and CP3 is denoted Affine (CP1, CP2, CP3).
It should be noted that a combination of different control points may also be converted into control points at the same positions. For example, the 6-parameter affine transformation model of the combination {CP1, CP2, CP4}, {CP2, CP3, CP4}, or {CP1, CP3, CP4} is converted to be represented by the control points {CP1, CP2, CP3}. The conversion method is to substitute the motion vectors of the control points and their coordinate information into formula (14) to obtain the model parameters, and then substitute the coordinate information of {CP1, CP2, CP3} to obtain their motion vectors, which serve as a group of candidate motion vector predictors.
vx = a1 + a3 × x + a4 × y
vy = a2 + a5 × x + a6 × y        (14)
In formula (14), a1, a2, a3, a4, a5, and a6 are the parameters of the parameter model, and (x, y) denotes a position coordinate.
More directly, the conversion may also be performed according to the following formulas to obtain a group of motion vector predictors represented by the upper-left, upper-right, and lower-left control points, which is added to the candidate motion information list:
Formula (15) for converting {CP1, CP2, CP4} to {CP1, CP2, CP3}:
vx2 = vx0 + vx3 - vx1
vy2 = vy0 + vy3 - vy1        (15)
Formula (16) for converting {CP2, CP3, CP4} to {CP1, CP2, CP3}:
vx0 = vx1 + vx2 - vx3
vy0 = vy1 + vy2 - vy3        (16)
Formula (17) for converting {CP1, CP3, CP4} to {CP1, CP2, CP3}:
vx1 = vx0 + vx3 - vx2
vy1 = vy0 + vy3 - vy2        (17)
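Under any affine model, the motion vectors of the four corners satisfy the parallelogram identity CP1 + CP4 = CP2 + CP3, so the missing corner of any three-control-point combination containing CP4 follows by rearrangement. A sketch (function name ours; CP motion vectors are (vx, vy) tuples):

```python
def convert_to_cp123(cps):
    """Convert a 3-control-point combination containing CP4 into the
    {CP1, CP2, CP3} representation via CP1 + CP4 = CP2 + CP3.
    cps maps 'CP1'..'CP4' to (vx, vy) motion vectors."""
    add = lambda a, b: (a[0] + b[0], a[1] + b[1])
    sub = lambda a, b: (a[0] - b[0], a[1] - b[1])
    out = dict(cps)
    if 'CP3' not in cps:    # {CP1, CP2, CP4}: CP3 = CP1 + CP4 - CP2
        out['CP3'] = sub(add(cps['CP1'], cps['CP4']), cps['CP2'])
    elif 'CP1' not in cps:  # {CP2, CP3, CP4}: CP1 = CP2 + CP3 - CP4
        out['CP1'] = sub(add(cps['CP2'], cps['CP3']), cps['CP4'])
    elif 'CP2' not in cps:  # {CP1, CP3, CP4}: CP2 = CP1 + CP4 - CP3
        out['CP2'] = sub(add(cps['CP1'], cps['CP4']), cps['CP3'])
    return {k: out[k] for k in ('CP1', 'CP2', 'CP3')}
```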
It should be noted that the candidate motion information list may be constructed using only the candidate motion vector predictors obtained through Manner 1, using only those obtained through Manner 2, or using the candidate motion vector predictors obtained through Manner 1 and Manner 2 jointly. In addition, the candidate motion information list may be pruned and sorted according to preconfigured rules, and then truncated or padded to a specific number of entries. When each group of candidate motion vector predictors in the candidate motion information list includes the motion vector predictors of three control points, the candidate motion information list may be called a triplet list; when each group includes the motion vector predictors of two control points, it may be called a 2-tuple list.
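The merge, prune, and truncate/pad step can be sketched as follows (function name, duplicate criterion, and zero-MV padding are ours for illustration; the preconfigured pruning and sorting rules are not specified here):

```python
def build_candidate_list(model_candidates, combo_candidates, max_len):
    """Merge candidates from Manner 1 (model-based) and Manner 2
    (control-point combinations), drop exact duplicates in order, and
    truncate or pad the list to max_len entries."""
    out = []
    for cand in model_candidates + combo_candidates:
        if cand not in out:          # pruning: skip exact duplicates
            out.append(cand)
        if len(out) == max_len:
            break
    zero = tuple((0, 0) for _ in range(len(out[0]) if out else 2))
    while len(out) < max_len:        # padding with zero-MV groups
        out.append(zero)
    return out
```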
Step S1222: The video decoder parses the bitstream to obtain an index.
Specifically, the video decoder may parse the bitstream through an entropy decoding unit. The index indicates the target candidate motion vector group of the current decoding block, and the target candidate motion vector group represents the motion vector predictors of a group of control points of the current decoding block.
Step S1223: The video decoder determines the target motion vector group from the candidate motion information list according to the index. Specifically, the target candidate motion vector group that the video decoder determines from the candidate motion information list according to the index serves as the optimal candidate motion vector predictors (optionally, when the length of the candidate motion information list is 1, the target motion vector group can be determined directly without parsing the bitstream for an index), specifically the optimal motion vector predictors of two or three control points. For example, the video decoder parses the index number from the bitstream and then determines the optimal motion vector predictors of two or three control points from the candidate motion information list according to the index number, where each group of candidate motion vector predictors in the candidate motion information list corresponds to its own index number.
Step S1224: The video decoder uses a parametric affine transformation model to obtain the motion vector value of each subblock in the current decoding block according to the motion vector values of the control points of the current decoding block determined above.
Specifically, the target candidate motion vector group includes the motion vectors of two control points (the upper-left and upper-right control points) or three control points (for example, the upper-left, upper-right, and lower-left control points). For each subblock of the current decoding block (a subblock is equivalent to a motion compensation unit), the motion information of a pixel at a preset position in the motion compensation unit may be used to represent the motion information of all pixels in that motion compensation unit. Assuming that the size of the motion compensation unit is MxN (M is less than or equal to the width W of the current decoding block, N is less than or equal to the height H of the current decoding block, and M, N, W, and H are positive integers, usually powers of two such as 4, 8, 16, 32, 64, or 128), the preset-position pixel may be the center point (M/2, N/2) of the motion compensation unit, the upper-left pixel (0, 0), the upper-right pixel (M-1, 0), or a pixel at another position. FIG. 8A illustrates a 4x4 motion compensation unit, and FIG. 8B illustrates an 8x8 motion compensation unit.
The coordinates of the center point of a motion compensation unit relative to the pixel at the upper-left vertex of the current decoding block are calculated using formula (5), where i is the index of the motion compensation unit in the horizontal direction (from left to right), j is the index in the vertical direction (from top to bottom), and (x(i,j), y(i,j)) denotes the coordinates of the center point of the (i, j)-th motion compensation unit relative to the pixel at the upper-left control point of the current decoding block. Then, according to the affine model type of the current decoding block (6-parameter or 4-parameter), (x(i,j), y(i,j)) is substituted into the 6-parameter affine model formula (6-1) or into the 4-parameter affine model formula (6-2) to obtain the motion information of the center point of each motion compensation unit, which is used as the motion vector (vx(i,j), vy(i,j)) of all pixels in that motion compensation unit.
Optionally, when the current decoding block is a six-parameter decoding block and the motion vectors of one or more sub-blocks of the current decoding block are obtained based on the target candidate motion vector group, if the lower boundary of the current decoding block coincides with the lower boundary of the CTU in which the current decoding block is located, the motion vector of the sub-block in the lower-left corner of the current decoding block is calculated from the six-parameter affine model constructed from the three control points and the position coordinates (0, H) of the lower-left corner of the current decoding block, and the motion vector of the sub-block in the lower-right corner of the current decoding block is calculated from the six-parameter affine model constructed from the three control points and the position coordinates (W, H) of the lower-right corner of the current decoding block. For example, substituting the position coordinates (0, H) of the lower-left corner of the current decoding block into the six-parameter affine model yields the motion vector of the lower-left sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model), and substituting the position coordinates (W, H) of the lower-right corner of the current decoding block into the six-parameter affine model yields the motion vector of the lower-right sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model). In this way, when the motion vectors of the lower-left and lower-right control points of the current decoding block are used later (for example, when another block subsequently constructs its candidate motion information list based on the motion vectors of the lower-left and lower-right control points of the current block), accurate values rather than estimated values are used. Here, W is the width of the current decoding block and H is its height.
Optionally, when the current decoding block is a four-parameter decoding block and the motion vectors of one or more sub-blocks of the current decoding block are obtained based on the target candidate motion vector group, if the lower boundary of the current decoding block coincides with the lower boundary of the CTU in which the current decoding block is located, the motion vector of the sub-block in the lower-left corner of the current decoding block is calculated from the four-parameter affine model constructed from the two control points and the position coordinates (0, H) of the lower-left corner of the current decoding block, and the motion vector of the sub-block in the lower-right corner of the current decoding block is calculated from the four-parameter affine model constructed from the two control points and the position coordinates (W, H) of the lower-right corner of the current decoding block. For example, substituting the position coordinates (0, H) of the lower-left corner of the current decoding block into the four-parameter affine model yields the motion vector of the lower-left sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model), and substituting the position coordinates (W, H) of the lower-right corner of the current decoding block into the four-parameter affine model yields the motion vector of the lower-right sub-block (instead of substituting the coordinates of the center point of that sub-block into the affine model). In this way, when the motion vectors of the lower-left and lower-right control points of the current decoding block are used later (for example, when another block subsequently constructs its candidate motion information list based on the motion vectors of the lower-left and lower-right control points of the current block), accurate values rather than estimated values are used. Here, W is the width of the current decoding block and H is its height.
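The bottom-corner override described above can be sketched as follows. This is an illustrative sketch, not the application's normative procedure: the function names and the decision signature are invented, and the four-parameter model follows the standard form with the top-left and top-right control-point MVs given in `cp`.

```python
# Sketch: when the block's bottom edge coincides with its CTU's bottom
# edge, the bottom-left and bottom-right sub-block MVs are evaluated at
# the exact corners (0, H) and (W, H); otherwise at the sub-block centres.

def affine_mv_4param(cp, W, x, y):
    """4-parameter model from top-left/top-right control-point MVs."""
    (vx0, vy0), (vx1, vy1) = cp
    a = (vx1 - vx0) / W
    b = (vy1 - vy0) / W
    return (a * x - b * y + vx0, b * x + a * y + vy0)

def corner_subblock_mvs(cp, W, H, M, N, block_bottom_y, ctu_bottom_y):
    """MVs stored for the bottom-left and bottom-right MxN sub-blocks."""
    if block_bottom_y == ctu_bottom_y:
        # Exact corner positions, so that a later neighbouring block
        # reusing these control points gets accurate (not estimated) MVs.
        bl = affine_mv_4param(cp, W, 0, H)
        br = affine_mv_4param(cp, W, W, H)
    else:
        # Default rule: centre of each corner sub-block.
        bl = affine_mv_4param(cp, W, M / 2.0, H - N / 2.0)
        br = affine_mv_4param(cp, W, W - M / 2.0, H - N / 2.0)
    return bl, br
```

The same corner substitution applies with a six-parameter model; only the model evaluation changes, not the choice of (0, H) and (W, H) as the evaluation positions.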
Step S1225: The video decoder performs motion compensation based on the motion vector value of each sub-block in the current decoding block to obtain the pixel prediction value of each sub-block. Specifically, the pixel prediction value of the current decoding block is obtained by prediction based on the motion vectors of the one or more sub-blocks of the current decoding block and on the reference frame index and prediction direction indicated by the index.
It can be understood that, when the coding tree unit (CTU) in which the first neighboring affine decoding block is located is above the position of the current decoding block, the information of the lowest control points of the first neighboring affine decoding block has already been read from memory. Therefore, in the above solution, when candidate motion vectors are constructed based on the first group of control points of the first neighboring affine decoding block, the first group of control points includes the lower-left control point and the lower-right control point of the first neighboring affine decoding block, rather than always using the upper-left, upper-right, and lower-left control points of the first neighboring decoding block as the first group of control points as in the prior art (or always using the upper-left and upper-right control points as the first group of control points). Therefore, with the method of determining the first group of control points in this application, the information of the first group of control points (for example, position coordinates and motion vectors) can directly reuse information already read from memory, which reduces memory reads and improves decoding performance.
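The inheritance of candidate control-point predictors from a neighbor's bottom control points can be sketched as follows. All names are illustrative, coordinates are assumed to be absolute picture positions, and the four-parameter model is fixed by two horizontally aligned control points in the standard way; this is a sketch under those assumptions, not the application's normative derivation.

```python
# Sketch: build candidate control-point MVPs for the current block from
# the bottom-left and bottom-right control points of a neighbouring
# affine block located in the CTU row above, so that only values already
# read from memory (the neighbour's bottom row) are reused.

def affine_from_bottom_cps(mv_bl, mv_br, pos_bl, pos_br):
    """4-parameter affine model fixed by two horizontally aligned CPs."""
    cuW = pos_br[0] - pos_bl[0]  # width of the neighbouring block
    a = (mv_br[0] - mv_bl[0]) / cuW
    b = (mv_br[1] - mv_bl[1]) / cuW

    def mv_at(x, y):
        dx, dy = x - pos_bl[0], y - pos_bl[1]
        return (a * dx - b * dy + mv_bl[0], b * dx + a * dy + mv_bl[1])
    return mv_at

def candidate_cp_mvps(mv_bl, mv_br, pos_bl, pos_br, cur_x, cur_y, cur_w):
    """MVPs for the current block's top-left and top-right control points."""
    mv_at = affine_from_bottom_cps(mv_bl, mv_br, pos_bl, pos_br)
    return mv_at(cur_x, cur_y), mv_at(cur_x + cur_w, cur_y)
```

For a six-parameter current block, the same model would additionally be evaluated at the current block's lower-left control-point position.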
FIG. 10 is a schematic block diagram of an implementation of an encoding device or a decoding device (referred to as a coding device 1000 for short) according to an embodiment of this application. The coding device 1000 may include a processor 1010, a memory 1030, and a bus system 1050. The processor and the memory are connected through the bus system, the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory. The memory of the coding device stores program code, and the processor may invoke the program code stored in the memory to perform the various video encoding or decoding methods described in this application, especially the video encoding or decoding methods in the various new inter prediction modes, and the methods of predicting motion information in the various new inter prediction modes. To avoid repetition, details are not described here again.
In this embodiment of this application, the processor 1010 may be a central processing unit (CPU), or the processor 1010 may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 1030 may include a read-only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device may also be used as the memory 1030. The memory 1030 may include code and data 1031 that are accessed by the processor 1010 over the bus 1050. The memory 1030 may further include an operating system 1033 and application programs 1035. The application programs 1035 include at least one program that allows the processor 1010 to perform the video encoding or decoding methods described in this application (especially the encoding method or the decoding method described in this application). For example, the application programs 1035 may include applications 1 to N, which further include a video encoding or decoding application (referred to as a video coding application for short) that performs the video encoding or decoding methods described in this application.
In addition to a data bus, the bus system 1050 may further include a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, the various buses are all marked as the bus system 1050 in the figure.
Optionally, the coding device 1000 may further include one or more output devices, such as a display 1070. In one example, the display 1070 may be a touch-sensitive display that combines a display with a touch-sensing unit operable to sense touch input. The display 1070 may be connected to the processor 1010 through the bus 1050.
FIG. 11 is an illustrative diagram of an example of a video coding system 1100 including the encoder 20 of FIG. 2A and/or the decoder 200 of FIG. 2B according to an exemplary embodiment. The system 1100 may implement a combination of the various techniques of this application. In the illustrated implementation, the video coding system 1100 may include an imaging device 1101, a video encoder 100, a video decoder 200 (and/or a video encoder implemented by a logic circuit 1107 of a processing unit 1106), an antenna 1102, one or more processors 1103, one or more memories 1104, and/or a display device 1105.
As shown, the imaging device 1101, the antenna 1102, the processing unit 1106, the logic circuit 1107, the video encoder 100, the video decoder 200, the processor 1103, the memory 1104, and/or the display device 1105 can communicate with each other. As discussed, although the video coding system 1100 is illustrated with both the video encoder 100 and the video decoder 200, in different examples the video coding system 1100 may include only the video encoder 100 or only the video decoder 200.
In some examples, as shown, the video coding system 1100 may include an antenna 1102. For example, the antenna 1102 may be configured to transmit or receive an encoded bitstream of video data. In addition, in some examples, the video coding system 1100 may include a display device 1105. The display device 1105 may be configured to present video data. In some examples, as shown, the logic circuit 1107 may be implemented by the processing unit 1106. The processing unit 1106 may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, or the like. The video coding system 1100 may also include an optional processor 1103, which may similarly include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, or the like. In some examples, the logic circuit 1107 may be implemented by hardware, such as dedicated video coding hardware, and the processor 1103 may be implemented by general-purpose software, an operating system, or the like. In addition, the memory 1104 may be any type of memory, for example, a volatile memory (for example, a static random access memory (SRAM) or a dynamic random access memory (DRAM)) or a non-volatile memory (for example, a flash memory). In a non-limiting example, the memory 1104 may be implemented by a cache memory. In some examples, the logic circuit 1107 may access the memory 1104 (for example, to implement an image buffer). In other examples, the logic circuit 1107 and/or the processing unit 1106 may include a memory (for example, a cache) to implement an image buffer or the like.
In some examples, the video encoder 100 implemented by the logic circuit may include an image buffer (for example, implemented by the processing unit 1106 or the memory 1104) and a graphics processing unit (for example, implemented by the processing unit 1106). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include the video encoder 100 implemented by the logic circuit 1107 to implement the various modules discussed with reference to FIG. 2A and/or any other encoder system or subsystem described herein. The logic circuit may be configured to perform the various operations discussed herein.
The video decoder 200 may be implemented in a similar manner by the logic circuit 1107 to implement the various modules discussed with reference to the decoder 200 of FIG. 2B and/or any other decoder system or subsystem described herein. In some examples, the video decoder 200 implemented by the logic circuit may include an image buffer (implemented by the processing unit 2820 or the memory 1104) and a graphics processing unit (for example, implemented by the processing unit 1106). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include the video decoder 200 implemented by the logic circuit 1107 to implement the various modules discussed with reference to FIG. 2B and/or any other decoder system or subsystem described herein.
In some examples, the antenna 1102 of the video coding system 1100 may be configured to receive an encoded bitstream of video data. As discussed, the encoded bitstream may include data, indicators, index values, mode selection data, and the like that are related to the encoded video frames discussed herein, for example, data related to coding partitioning (for example, transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the coding partitioning). The video coding system 1100 may also include the video decoder 200, which is coupled to the antenna 1102 and configured to decode the encoded bitstream. The display device 1105 is configured to present video frames.
In the steps of the foregoing method procedures, the order in which the steps are described does not imply an execution order; the steps may be performed in the described order or in a different order. For example, step S1211 may be performed after step S1212 or before step S1212, and step S1221 may be performed after step S1222 or before step S1222; the remaining steps are not enumerated here one by one.
A person skilled in the art can appreciate that the functions described with reference to the various illustrative logical blocks, modules, and algorithm steps disclosed herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions described by the various illustrative logical blocks, modules, and steps may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. The computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium such as a data storage medium, or a communication medium including any medium that facilitates transfer of a computer program from one place to another (for example, according to a communication protocol). In this manner, the computer-readable medium may generally correspond to (1) a non-transitory tangible computer-readable storage medium, or (2) a communication medium such as a signal or a carrier wave. The data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in this application. A computer program product may include a computer-readable medium.
By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, a server, or another remote source through a coaxial cable, an optical fiber cable, a twisted pair, a digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, optical fiber cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. As used herein, disks and discs include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits. Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or to any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functions described with reference to the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec. Moreover, the techniques may be fully implemented in one or more circuits or logic elements.
The techniques of this application may be implemented in a wide variety of apparatuses or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset). Various components, modules, or units are described in this application to emphasize functional aspects of apparatuses configured to perform the disclosed techniques, but they do not necessarily need to be implemented by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit in conjunction with suitable software and/or firmware, or provided by a set of interoperable hardware units (including one or more processors as described above).
The foregoing descriptions are merely exemplary specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (32)

  1. An encoding method, comprising:
    determining a target candidate motion vector group from a candidate motion vector list according to a rate-distortion cost criterion, wherein the target candidate motion vector group represents motion vector predictors of a group of control points of a current coding block; and
    encoding an index corresponding to the target candidate motion vector group into a bitstream, and transmitting the bitstream;
    wherein, if a first neighboring affine coding block is a four-parameter affine coding block and the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block, the candidate motion vector list comprises a first group of candidate motion vector predictors, and the first group of candidate motion vector predictors is obtained based on a lower-left control point and a lower-right control point of the first neighboring affine coding block.
  2. The method according to claim 1, wherein:
    if the current coding block is a four-parameter affine coding block, the first group of candidate motion vector predictors is used to represent motion vector predictors of an upper-left control point and an upper-right control point of the current coding block; and
    if the current coding block is a six-parameter affine coding block, the first group of candidate motion vector predictors is used to represent motion vector predictors of an upper-left control point, an upper-right control point, and a lower-left control point of the current coding block.
  3. The method according to claim 1 or 2, wherein the first group of candidate motion vector predictors is obtained based on the lower-left control point and the lower-right control point of the first neighboring affine coding block as follows:
    if the current coding block is a four-parameter affine coding block, the first group of candidate motion vector predictors is obtained by substituting position coordinates of the upper-left control point and the upper-right control point of the current coding block into a first affine model; or
    if the current coding block is a six-parameter affine coding block, the first group of candidate motion vector predictors is obtained by substituting position coordinates of the upper-left control point, the upper-right control point, and the lower-left control point of the current coding block into the first affine model;
    wherein the first affine model is determined based on motion vectors and position coordinates of the lower-left control point and the lower-right control point of the first neighboring affine coding block.
  4. The method according to any one of claims 1 to 3, further comprising:
    using the target candidate motion vector group as a search start point, searching, within a preset search range according to the rate-distortion cost criterion, for motion vectors of a group of control points with the lowest cost; and
    determining a motion vector difference (MVD) between the motion vectors of the group of control points and the target candidate motion vector group;
    wherein the encoding an index corresponding to the target candidate motion vector group into a bitstream and transmitting the bitstream comprises:
    encoding the MVD and an index corresponding to the target candidate motion vector group into a bitstream to be transmitted, and transmitting the bitstream.
  5. The method according to any one of claims 1 to 3, wherein the encoding an index corresponding to the target candidate motion vector group into a bitstream and transmitting the bitstream comprises:
    encoding indexes corresponding to the target candidate motion vector group, a reference frame index, and a prediction direction into a bitstream, and transmitting the bitstream.
  6. The method according to any one of claims 1 to 5, wherein position coordinates (x6, y6) of the lower-left control point of the first neighboring affine coding block and position coordinates (x7, y7) of the lower-right control point are both derived from position coordinates (x4, y4) of the upper-left control point of the first neighboring affine coding block, wherein the position coordinates (x6, y6) of the lower-left control point of the first neighboring affine coding block are (x4, y4 + cuH), the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine coding block are (x4 + cuW, y4 + cuH), cuW is the width of the first neighboring affine coding block, and cuH is the height of the first neighboring affine coding block.
  7. The method according to claim 6, wherein the motion vector of the lower-left control point of the first neighboring affine coding block is the motion vector of the lower-left sub-block of the first neighboring affine coding block, and the motion vector of the lower-right control point of the first neighboring affine coding block is the motion vector of the lower-right sub-block of the first neighboring affine coding block.
  8. A decoding method, comprising:
    parsing a bitstream to obtain an index, wherein the index is used to indicate a target candidate motion vector group of a current decoding block;
    determining the target candidate motion vector group from a candidate motion vector list according to the index, wherein the target candidate motion vector group represents motion vector predictors of a group of control points of the current decoding block, and wherein, if a first neighboring affine decoding block is a four-parameter affine decoding block and the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block, the candidate motion vector list comprises a first group of candidate motion vector predictors, and the first group of candidate motion vector predictors is obtained based on a lower-left control point and a lower-right control point of the first neighboring affine decoding block;
    基于所述目标候选运动矢量组得到所述当前解码块的一个或多个子块的运动矢量;Obtaining a motion vector of one or more sub-blocks of the current decoding block based on the target candidate motion vector group;
    基于所述当前解码块的一个或多个子块的运动矢量,预测得到所述当前解码块的像素预测值。Based on the motion vectors of one or more sub-blocks of the current decoding block, a pixel prediction value of the current decoding block is obtained by prediction.
  9. The method according to claim 8, wherein:
    if the current decoding block is a four-parameter affine decoding block, the first group of candidate motion vector predictors represents motion vector predictors of an upper-left control point and an upper-right control point of the current decoding block; and
    if the current decoding block is a six-parameter affine decoding block, the first group of candidate motion vector predictors represents motion vector predictors of an upper-left control point, an upper-right control point, and a lower-left control point of the current decoding block.
  10. The method according to claim 8 or 9, wherein the first group of candidate motion vector predictors is obtained based on the lower-left control point and the lower-right control point of the first neighboring affine decoding block as follows:
    if the current decoding block is a four-parameter affine decoding block, the first group of candidate motion vector predictors is obtained by substituting the position coordinates of the upper-left control point and the upper-right control point of the current decoding block into a first affine model; or
    if the current decoding block is a six-parameter affine decoding block, the first group of candidate motion vector predictors is obtained by substituting the position coordinates of the upper-left control point, the upper-right control point, and the lower-left control point of the current decoding block into a first affine model;
    wherein the first affine model is determined based on the motion vectors and the position coordinates of the lower-left control point and the lower-right control point of the first neighboring affine decoding block.
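The first affine model of claim 10 can be sketched in code. The sketch below is illustrative only — the function name and the floating-point arithmetic are my own, and a real codec would use fixed-point arithmetic with shifts — but the model form is the standard four-parameter affine motion model built from two control points that lie on the same row (the neighboring block's lower-left and lower-right corners):

```python
def affine_model_from_bottom_cps(x6, y6, mv6, x7, y7, mv7):
    """Build a 4-parameter affine motion model from the lower-left (x6, y6)
    and lower-right (x7, y7) control points of a neighboring affine block.
    The two points lie on the same row (y7 == y6), and x7 - x6 equals the
    neighbor's width cuW. mv6 and mv7 are (vx, vy) motion vectors."""
    w = x7 - x6
    a = (mv7[0] - mv6[0]) / w  # horizontal gradient of the vx component
    b = (mv7[1] - mv6[1]) / w  # horizontal gradient of the vy component

    def mv_at(x, y):
        # Standard 4-parameter model: rotation/zoom + translation.
        dx, dy = x - x6, y - y6
        return (mv6[0] + a * dx - b * dy,
                mv6[1] + b * dx + a * dy)

    return mv_at
```

Evaluating the returned model at the current block's upper-left and upper-right control-point coordinates (and, in the six-parameter case, also at its lower-left control point) would yield the first group of candidate motion vector predictors described in the claim.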
  11. The method according to any one of claims 8-10, wherein the obtaining motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group comprises:
    obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on a second affine model, wherein the second affine model is determined based on the target candidate motion vector group and the position coordinates of a group of control points of the current decoding block.
  12. The method according to any one of claims 8-11, wherein the obtaining motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group comprises:
    obtaining a new candidate motion vector group based on a motion vector difference (MVD) parsed from the bitstream and the target candidate motion vector group indicated by the index; and
    obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on the new candidate motion vector group.
  13. The method according to any one of claims 8-11, wherein the predicting pixel values of the current decoding block based on the motion vectors of the one or more sub-blocks of the current decoding block comprises:
    predicting the pixel values of the current decoding block according to the motion vectors of the one or more sub-blocks of the current decoding block and a reference frame index and a prediction direction indicated by the index.
  14. The method according to any one of claims 8-13, wherein the position coordinates (x6, y6) of the lower-left control point and the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine decoding block are both derived from the position coordinates (x4, y4) of the upper-left control point of the first neighboring affine decoding block, wherein the position coordinates (x6, y6) of the lower-left control point of the first neighboring affine decoding block are (x4, y4 + cuH), the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine decoding block are (x4 + cuW, y4 + cuH), cuW is the width of the first neighboring affine decoding block, and cuH is the height of the first neighboring affine decoding block.
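The coordinate derivation in claim 14 is simple corner arithmetic on the neighboring block's geometry. A minimal sketch (the function name is illustrative; integer luma-sample coordinates are assumed):

```python
def derive_bottom_cp_coords(x4, y4, cuW, cuH):
    """Given the upper-left control point (x4, y4) of a neighboring affine
    block of width cuW and height cuH, derive its lower-left (x6, y6) and
    lower-right (x7, y7) control point coordinates."""
    x6, y6 = x4, y4 + cuH        # lower-left corner: same column, one block-height down
    x7, y7 = x4 + cuW, y4 + cuH  # lower-right corner: one block-width right, same row as (x6, y6)
    return (x6, y6), (x7, y7)
```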
  15. The method according to claim 14, wherein the motion vector of the lower-left control point of the first neighboring affine decoding block is the motion vector of the lower-left sub-block of the first neighboring affine decoding block, and the motion vector of the lower-right control point of the first neighboring affine decoding block is the motion vector of the lower-right sub-block of the first neighboring affine decoding block.
  16. The method according to any one of claims 8-15, wherein, in the process of obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on the target candidate motion vector group, if the lower boundary of the current decoding block coincides with the lower boundary of the CTU in which the current decoding block is located, the motion vector of the sub-block at the lower-left corner of the current decoding block is calculated from the target candidate motion vector group and the position coordinates (0, H) of the lower-left corner of the current decoding block, and the motion vector of the sub-block at the lower-right corner of the current decoding block is calculated from the target candidate motion vector group and the position coordinates (W, H) of the lower-right corner of the current decoding block, wherein W is the width of the current decoding block, H is the height of the current decoding block, and the coordinates of the upper-left corner of the current decoding block are (0, 0).
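The special case in claim 16 can be illustrated as follows. This is a hedged sketch (the function name and floating-point arithmetic are mine, and a four-parameter model built from the upper-left and upper-right control-point predictors is assumed): when the block's lower edge lies on the CTU's lower edge, the affine model is evaluated at the exact corner coordinates (0, H) and (W, H) rather than at the sub-block centers, so that the stored corner motion vectors coincide with the block's true bottom control points.

```python
def corner_subblock_mvs(mv_ul, mv_ur, W, H):
    """Compute the motion vectors of the lower-left and lower-right corner
    sub-blocks of a W x H block whose lower boundary coincides with the
    CTU's lower boundary, using a 4-parameter affine model defined by the
    upper-left predictor mv_ul at (0, 0) and upper-right predictor mv_ur
    at (W, 0). Returns (mv at (0, H), mv at (W, H))."""
    a = (mv_ur[0] - mv_ul[0]) / W  # horizontal gradient of vx
    b = (mv_ur[1] - mv_ul[1]) / W  # horizontal gradient of vy

    def mv_at(x, y):
        return (mv_ul[0] + a * x - b * y,
                mv_ul[1] + b * x + a * y)

    # Evaluate at the exact corner positions, not at sub-block centers.
    return mv_at(0, H), mv_at(W, H)
```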
  17. A video encoder, comprising:
    an inter prediction unit, configured to determine a target candidate motion vector group from a candidate motion vector list according to a rate-distortion cost criterion, wherein the target candidate motion vector group represents motion vector predictors of a group of control points of a current coding block; and
    an entropy coding unit, configured to encode an index corresponding to the target candidate motion vector group into a bitstream and transmit the bitstream;
    wherein, if a first neighboring affine coding block is a four-parameter affine coding block and the first neighboring affine coding block is located in a coding tree unit (CTU) above the current coding block, the candidate motion vector list includes a first group of candidate motion vector predictors, the first group of candidate motion vector predictors being obtained based on a lower-left control point and a lower-right control point of the first neighboring affine coding block.
  18. The video encoder according to claim 17, wherein:
    if the current coding block is a four-parameter affine coding block, the first group of candidate motion vector predictors represents motion vector predictors of an upper-left control point and an upper-right control point of the current coding block; and
    if the current coding block is a six-parameter affine coding block, the first group of candidate motion vector predictors represents motion vector predictors of an upper-left control point, an upper-right control point, and a lower-left control point of the current coding block.
  19. The video encoder according to claim 17 or 18, wherein the first group of candidate motion vector predictors is obtained based on the lower-left control point and the lower-right control point of the first neighboring affine coding block as follows:
    if the current coding block is a four-parameter affine coding block, the first group of candidate motion vector predictors is obtained by substituting the position coordinates of the upper-left control point and the upper-right control point of the current coding block into a first affine model; or
    if the current coding block is a six-parameter affine coding block, the first group of candidate motion vector predictors is obtained by substituting the position coordinates of the upper-left control point, the upper-right control point, and the lower-left control point of the current coding block into a first affine model;
    wherein the first affine model is determined based on the motion vectors and the position coordinates of the lower-left control point and the lower-right control point of the first neighboring affine coding block.
  20. The video encoder according to any one of claims 17-19, wherein:
    the inter prediction unit is further configured to search, using the target candidate motion vector group as a search starting point and within a preset search range, for the motion vectors of a group of control points with the lowest cost according to the rate-distortion cost criterion, and to determine a motion vector difference (MVD) between the motion vectors of the group of control points and the target candidate motion vector group; and
    the entropy coding unit is specifically configured to encode the MVD and the index corresponding to the target candidate motion vector group into a bitstream to be transmitted, and to transmit the bitstream.
  21. The video encoder according to any one of claims 17-19, wherein the entropy coding unit is specifically configured to encode an index corresponding to the target candidate motion vector group, a reference frame index, and a prediction direction into a bitstream, and to transmit the bitstream.
  22. The video encoder according to any one of claims 17-21, wherein the position coordinates (x6, y6) of the lower-left control point and the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine coding block are both derived from the position coordinates (x4, y4) of the upper-left control point of the first neighboring affine coding block, wherein the position coordinates (x6, y6) of the lower-left control point of the first neighboring affine coding block are (x4, y4 + cuH), the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine coding block are (x4 + cuW, y4 + cuH), cuW is the width of the first neighboring affine coding block, and cuH is the height of the first neighboring affine coding block.
  23. The video encoder according to claim 22, wherein the motion vector of the lower-left control point of the first neighboring affine coding block is the motion vector of the lower-left sub-block of the first neighboring affine coding block, and the motion vector of the lower-right control point of the first neighboring affine coding block is the motion vector of the lower-right sub-block of the first neighboring affine coding block.
  24. A video decoder, comprising:
    an entropy decoding unit, configured to parse a bitstream to obtain an index, wherein the index indicates a target candidate motion vector group of a current decoding block; and
    an inter prediction unit, configured to: determine the target candidate motion vector group from a candidate motion vector list according to the index, wherein the target candidate motion vector group represents motion vector predictors of a group of control points of the current decoding block, and wherein, if a first neighboring affine decoding block is a four-parameter affine decoding block and the first neighboring affine decoding block is located in a coding tree unit (CTU) above the current decoding block, the candidate motion vector list includes a first group of candidate motion vector predictors, the first group of candidate motion vector predictors being obtained based on a lower-left control point and a lower-right control point of the first neighboring affine decoding block; obtain motion vectors of one or more sub-blocks of the current decoding block based on the target candidate motion vector group; and predict pixel values of the current decoding block based on the motion vectors of the one or more sub-blocks of the current decoding block.
  25. The video decoder according to claim 24, wherein:
    if the current decoding block is a four-parameter affine decoding block, the first group of candidate motion vector predictors represents motion vector predictors of an upper-left control point and an upper-right control point of the current decoding block; and
    if the current decoding block is a six-parameter affine decoding block, the first group of candidate motion vector predictors represents motion vector predictors of an upper-left control point, an upper-right control point, and a lower-left control point of the current decoding block.
  26. The video decoder according to claim 24 or 25, wherein the first group of candidate motion vector predictors is obtained based on the lower-left control point and the lower-right control point of the first neighboring affine decoding block as follows:
    if the current decoding block is a four-parameter affine decoding block, the first group of candidate motion vector predictors is obtained by substituting the position coordinates of the upper-left control point and the upper-right control point of the current decoding block into a first affine model; or
    if the current decoding block is a six-parameter affine decoding block, the first group of candidate motion vector predictors is obtained by substituting the position coordinates of the upper-left control point, the upper-right control point, and the lower-left control point of the current decoding block into a first affine model;
    wherein the first affine model is determined based on the motion vectors and the position coordinates of the lower-left control point and the lower-right control point of the first neighboring affine decoding block.
  27. The video decoder according to any one of claims 24-26, wherein the inter prediction unit is configured to obtain the motion vectors of the one or more sub-blocks of the current decoding block based on the target candidate motion vector group by: obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on a second affine model, wherein the second affine model is determined based on the target candidate motion vector group and the position coordinates of a group of control points of the current decoding block.
  28. The video decoder according to any one of claims 24-27, wherein the inter prediction unit is configured to obtain the motion vectors of the one or more sub-blocks of the current decoding block based on the target candidate motion vector group by: obtaining a new candidate motion vector group based on a motion vector difference (MVD) parsed from the bitstream and the target candidate motion vector group indicated by the index; and obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on the new candidate motion vector group.
  29. The video decoder according to any one of claims 24-27, wherein the inter prediction unit is configured to predict the pixel values of the current decoding block based on the motion vectors of the one or more sub-blocks of the current decoding block by: predicting the pixel values of the current decoding block according to the motion vectors of the one or more sub-blocks of the current decoding block and a reference frame index and a prediction direction indicated by the index.
  30. The video decoder according to any one of claims 24-29, wherein the position coordinates (x6, y6) of the lower-left control point and the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine decoding block are both derived from the position coordinates (x4, y4) of the upper-left control point of the first neighboring affine decoding block, wherein the position coordinates (x6, y6) of the lower-left control point of the first neighboring affine decoding block are (x4, y4 + cuH), the position coordinates (x7, y7) of the lower-right control point of the first neighboring affine decoding block are (x4 + cuW, y4 + cuH), cuW is the width of the first neighboring affine decoding block, and cuH is the height of the first neighboring affine decoding block.
  31. The video decoder according to claim 30, wherein the motion vector of the lower-left control point of the first neighboring affine decoding block is the motion vector of the lower-left sub-block of the first neighboring affine decoding block, and the motion vector of the lower-right control point of the first neighboring affine decoding block is the motion vector of the lower-right sub-block of the first neighboring affine decoding block.
  32. The video decoder according to any one of claims 24-31, wherein, in the process of obtaining the motion vectors of the one or more sub-blocks of the current decoding block based on the target candidate motion vector group, if the lower boundary of the current decoding block coincides with the lower boundary of the CTU in which the current decoding block is located, the motion vector of the sub-block at the lower-left corner of the current decoding block is calculated from the target candidate motion vector group and the position coordinates (0, H) of the lower-left corner of the current decoding block, and the motion vector of the sub-block at the lower-right corner of the current decoding block is calculated from the target candidate motion vector group and the position coordinates (W, H) of the lower-right corner of the current decoding block, wherein W is the width of the current decoding block, H is the height of the current decoding block, and the coordinates of the upper-left corner of the current decoding block are (0, 0).
PCT/CN2019/079955 2018-08-27 2019-03-27 Video encoder, video decoder and corresponding method WO2020042604A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810992362.1A CN110868602B (en) 2018-08-27 2018-08-27 Video encoder, video decoder and corresponding methods
CN201810992362.1 2018-08-27

Publications (1)

Publication Number Publication Date
WO2020042604A1 true WO2020042604A1 (en) 2020-03-05

Family

ID=69643826

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/079955 WO2020042604A1 (en) 2018-08-27 2019-03-27 Video encoder, video decoder and corresponding method

Country Status (2)

Country Link
CN (1) CN110868602B (en)
WO (1) WO2020042604A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111327901B (en) * 2020-03-10 2023-05-30 北京达佳互联信息技术有限公司 Video encoding method, device, storage medium and encoding equipment
CN113709484B (en) * 2020-03-26 2022-12-23 杭州海康威视数字技术股份有限公司 Decoding method, encoding method, device, equipment and machine readable storage medium
CN113747172A (en) * 2020-05-29 2021-12-03 Oppo广东移动通信有限公司 Inter-frame prediction method, encoder, decoder, and computer storage medium
CN113630602A (en) * 2021-06-29 2021-11-09 杭州未名信科科技有限公司 Affine motion estimation method and device for coding unit, storage medium and terminal

Citations (5)

Publication number Priority date Publication date Assignee Title
US20080279478A1 (en) * 2007-05-09 2008-11-13 Mikhail Tsoupko-Sitnikov Image processing method and image processing apparatus
CN104935938A (en) * 2015-07-15 2015-09-23 哈尔滨工业大学 Inter-frame prediction method in hybrid video coding standard
US9438910B1 (en) * 2014-03-11 2016-09-06 Google Inc. Affine motion prediction in video coding
CN108271023A (en) * 2017-01-04 2018-07-10 华为技术有限公司 Image prediction method and relevant device
CN108432250A (en) * 2016-01-07 2018-08-21 联发科技股份有限公司 The method and device of affine inter-prediction for coding and decoding video

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN102595110B (en) * 2011-01-10 2015-04-29 华为技术有限公司 Video coding method, decoding method and terminal
KR101484171B1 (en) * 2011-01-21 2015-01-23 에스케이 텔레콤주식회사 Motion Information Generating Apparatus and Method using Motion Vector Predictor Index Coding, and Image Encoding/Decoding Apparatus and Method using the Same
US9083983B2 (en) * 2011-10-04 2015-07-14 Qualcomm Incorporated Motion vector predictor candidate clipping removal for video coding
CN106331722B (en) * 2015-07-03 2019-04-26 华为技术有限公司 Image prediction method and relevant device
WO2017147765A1 (en) * 2016-03-01 2017-09-08 Mediatek Inc. Methods for affine motion compensation


Also Published As

Publication number Publication date
CN110868602A (en) 2020-03-06
CN110868602B (en) 2024-04-12

Similar Documents

Publication Publication Date Title
US11252436B2 (en) Video picture inter prediction method and apparatus, and codec
JP7148612B2 (en) Video data inter prediction method, apparatus, video encoder, video decoder and program
WO2020042604A1 (en) Video encoder, video decoder and corresponding method
KR102606146B1 (en) Motion vector prediction method and related devices
WO2019154424A1 (en) Video decoding method, video decoder, and electronic device
US20230239494A1 (en) Video encoder, video decoder, and corresponding method
WO2020007093A1 (en) Image prediction method and apparatus
US20210185323A1 (en) Inter prediction method and apparatus, video encoder, and video decoder
CN110677645B (en) Image prediction method and device
TWI841033B (en) Method and apparatus of frame inter prediction of video data
WO2019237287A1 (en) Inter-frame prediction method for video image, device, and codec
WO2020007187A1 (en) Image block decoding method and device
WO2019227297A1 (en) Interframe prediction method, device, and codec for video image

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19855595

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19855595

Country of ref document: EP

Kind code of ref document: A1