WO2019129130A1 - Image prediction method and apparatus, and encoder/decoder - Google Patents

Image prediction method and apparatus, and encoder/decoder

Info

Publication number
WO2019129130A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
backward
reference block
current image
target
Prior art date
Application number
PCT/CN2018/124275
Other languages
English (en)
French (fr)
Inventor
马祥 (Ma Xiang)
杨海涛 (Yang Haitao)
陈焕浜 (Chen Huanbang)
高山 (Gao Shan)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to SG11202006258VA
Priority to EP18895955.5A
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to JP2020536667A
Priority to KR1020207022351A
Priority to BR112020012914-3A
Priority to CN201880084937.8A
Priority to KR1020237006148A
Priority to KR1020247001807A
Priority to RU2020125254A
Priority to AU2018395081A
Priority to CA3087405A
Priority to EP23219999.2A
Publication of WO2019129130A1
Priority to US16/915,678
Priority to US17/994,556
Priority to AU2023204122A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 Selection of coding mode or of prediction mode
    • H04N 19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H04N 19/517 Processing of motion vectors by encoding
    • H04N 19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present application relates to the field of video codec technology, and in particular, to an image prediction method, apparatus, and codec.
  • digital video information can be efficiently transmitted and received between devices by using the video compression techniques described in the ITU-T H.265 high efficiency video coding (HEVC) standard and in the extensions of that standard.
  • HEVC high efficiency video coding
  • an image of a video sequence is divided into image blocks for encoding or decoding.
  • inter prediction modes may include, but are not limited to, a merge mode (Merge mode) and a non-merge mode (for example, an advanced motion vector prediction (AMVP) mode), both of which perform inter prediction by means of competition among multiple sets of motion information.
  • merge Mode merge mode
  • AMVP mode advanced motion vector prediction mode
  • a candidate motion information list (referred to as a candidate list) including multiple sets of motion information (also referred to as multiple pieces of candidate motion information) is introduced; for example, the encoder may use a set of motion information selected from the candidate list as, or to predict, the motion information (e.g., the motion vector) of the current image block to be encoded, thereby obtaining a reference image block (i.e., a reference sample) of the current image block to be encoded.
  • the decoder can decode indication information from the bitstream to obtain a set of motion information. Because the coding overhead of the motion information is limited in the inter prediction process (that is, it occupies bit overhead in the bitstream), the accuracy of the motion information is affected to some extent, which in turn affects the accuracy of image prediction.
  • the existing decoder-side motion vector refinement (DMVR) technique can be used to correct the motion information.
  • in the DMVR scheme for image prediction, a template matching block must first be calculated, and the template matching block is then used to perform a search-and-match process in the forward reference image and in the backward reference image separately, resulting in high search complexity. Therefore, how to reduce the complexity of image prediction while improving image prediction accuracy is a problem that needs to be solved.
  • the embodiments of the present application provide an image prediction method and apparatus, and a corresponding encoder and decoder, which can reduce the complexity of image prediction to a certain extent while improving image prediction accuracy, thereby improving codec performance.
  • an embodiment of the present application provides an image prediction method, including: acquiring initial motion information of a current image block; determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current image block, where the N forward reference blocks are located in a forward reference image, the N backward reference blocks are located in a backward reference image, and N is an integer greater than 1; determining, according to a matching cost criterion, from positions of M pairs of reference blocks, the position of one pair of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, a first position offset and a second position offset are in a mirror relationship, the first position offset representing the offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset representing the offset of the position of the backward reference block relative to the position of the initial backward reference block.
  • the positions of the N forward reference blocks include the position of one initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the positions of the N backward reference blocks include the position of one initial backward reference block and the positions of (N-1) candidate backward reference blocks. The offset of the position of the initial forward reference block relative to itself and the offset of the position of the initial backward reference block relative to itself are both 0, and these two 0 offsets also satisfy the mirror relationship.
  • the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. For each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block are in a mirror relationship. On this basis, the position of one pair of reference blocks (for example, the pair with the least matching cost) is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block.
  • the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process; the complexity of image prediction is thus reduced while the accuracy of image prediction is improved.
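The mirrored bilateral search described above can be illustrated with the following sketch. This is not the patent's normative procedure, only a simplified model under assumed names (`sad`, `refine`, integer-pel offsets): each candidate offset applied to the forward reference block is negated for the backward reference block, and the pair with the smallest matching cost wins, with no template block ever formed.

```python
import numpy as np

def sad(a: np.ndarray, b: np.ndarray) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def refine(fwd_ref, bwd_ref, fwd_pos, bwd_pos, block, offsets):
    """Pick the candidate pair (offset, -offset) with minimal SAD.

    fwd_pos / bwd_pos are the (x, y) positions of the initial forward and
    backward reference blocks; `block` is the (w, h) block size.
    """
    w, h = block
    best = None
    for dx, dy in offsets:                        # (0, 0) covers the initial pair
        fx, fy = fwd_pos[0] + dx, fwd_pos[1] + dy
        bx, by = bwd_pos[0] - dx, bwd_pos[1] - dy  # mirrored second offset
        cost = sad(fwd_ref[fy:fy + h, fx:fx + w],
                   bwd_ref[by:by + h, bx:bx + w])
        if best is None or cost < best[0]:
            best = (cost, (fx, fy), (bx, by))
    return best  # (min cost, target forward position, target backward position)
```

Because the backward position is derived from the forward offset by negation, only N candidate pairs are evaluated rather than two independent searches.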
  • the current image block (abbreviated as the current block) herein can be understood as the image block currently being processed.
  • in the encoding process, it refers to the coding block currently being encoded; in the decoding process, it refers to the decoding block currently being decoded.
  • reference blocks herein refer to blocks that provide reference signals for the current block. During the search process, it is necessary to traverse multiple reference blocks to find the best reference block.
  • a reference block located in the forward reference picture is referred to as a forward reference block; a reference block located in the backward reference picture is referred to as a backward reference block.
  • the block that provides the prediction for the current block is referred to as a prediction block; once the best reference block is found through the search, it provides the prediction for the current block and may thus be referred to as the prediction block.
  • the pixel value or sampled value or sampled signal within the prediction block is called a prediction signal.
  • the matching cost criterion herein can be understood as a criterion that considers the matching cost between a paired forward reference block and backward reference block, where the matching cost can be understood as the difference between the two blocks; its value can be regarded as the accumulation of the difference values of the pixels at corresponding positions in the two blocks.
  • the difference is generally calculated based on the SAD (sum of absolute differences) criterion, or based on other criteria such as SATD (sum of absolute transformed differences), MR-SAD (mean-removed sum of absolute differences), or SSD (sum of squared differences).
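As a hedged illustration of the cost measures named above (not code from the patent), the following compares SAD, MR-SAD, and SSD on two equally sized pixel blocks given as NumPy arrays; SATD is omitted since it additionally requires a transform such as a Hadamard transform.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences."""
    d = a.astype(np.int64) - b.astype(np.int64)
    return int(np.abs(d).sum())

def mr_sad(a, b):
    """Mean-removed SAD: subtract each block's mean before the SAD,
    which discounts a uniform brightness offset between the blocks."""
    d = (a - a.mean()) - (b - b.mean())
    return float(np.abs(d).sum())

def ssd(a, b):
    """Sum of squared differences."""
    d = a.astype(np.int64) - b.astype(np.int64)
    return int((d * d).sum())
```

Note that for two blocks differing only by a constant brightness offset, MR-SAD is zero while SAD and SSD are not, which is why MR-SAD can be preferable when illumination changes between reference pictures.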
  • the initial motion information of the current image block of the embodiment of the present application may include a motion vector MV and reference image indication information.
  • the initial motion information may also include one or both of them.
  • the reference image indication information is used to indicate which one or more reconstructed images are used as reference images.
  • the motion vector represents the positional offset of the reference block position relative to the current block position in the reference image used, generally including a horizontal component offset and a vertical component offset.
  • (x, y) is used to represent the MV
  • x is the positional shift in the horizontal direction
  • y is the positional shift in the vertical direction.
  • the reference image indication information may include a reference image list and/or a reference image index corresponding to the reference image list.
  • the reference image index is used to identify a reference image corresponding to the used motion vector in the specified reference image list (RefPicList0 or RefPicList1).
  • An image may be referred to as a frame, and a reference image may be referred to as a reference frame.
  • the initial motion information of the current image block of the embodiment of the present application is initial bidirectional prediction motion information, that is, motion information for forward and backward prediction directions.
  • the forward and backward prediction directions are the two prediction directions of the bidirectional prediction mode; it can be understood that "forward" and "backward" correspond to reference picture list 0 (RefPicList0) and reference picture list 1 (RefPicList1) of the current image, respectively.
  • the position of the initial forward reference block in the embodiment of the present application refers to the position of the reference block obtained in the forward reference image by adding the offset of the initial forward motion vector to the position of the current block; the position of the initial backward reference block refers to the position of the reference block obtained in the backward reference image by adding the offset of the initial backward motion vector to the position of the current block.
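The relationship just described (initial reference block position = current block position + initial MV) can be sketched as follows; integer-pel motion vectors are assumed for simplicity, although real codecs also use fractional-pel MVs, and the helper name is illustrative only.

```python
def initial_ref_positions(cur_pos, mv_fwd, mv_bwd):
    """Positions of the initial forward/backward reference blocks.

    cur_pos is the (x, y) top-left position of the current block;
    mv_fwd / mv_bwd are the initial forward and backward motion vectors.
    """
    cx, cy = cur_pos
    fwd = (cx + mv_fwd[0], cy + mv_fwd[1])  # position in the forward reference image
    bwd = (cx + mv_bwd[0], cy + mv_bwd[1])  # position in the backward reference image
    return fwd, bwd
```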
  • the execution subject of the method of the embodiment of the present application may be an image prediction apparatus, such as a video encoder, a video decoder, or an electronic device having a video codec function; for example, it may be an inter prediction unit in a video encoder or a motion compensation unit in a video decoder.
  • when the first position offset and the second position offset are in a mirror relationship, the direction of the first position offset (also referred to as the vector direction) is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset.
  • the first position offset includes a first horizontal component offset and a first vertical component offset, and the second position offset includes a second horizontal component offset and a second vertical component offset, where the direction of the first horizontal component offset is opposite to the direction of the second horizontal component offset, the magnitude of the first horizontal component offset is the same as the magnitude of the second horizontal component offset, the direction of the first vertical component offset is opposite to the direction of the second vertical component offset, and the magnitude of the first vertical component offset is the same as the magnitude of the second vertical component offset.
  • the first position offset and the second position offset are both zero.
  • the method further comprises: obtaining updated motion information of the current image block, the updated motion information including an updated forward motion vector and an updated backward motion vector, wherein the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
  • the updated motion information of the current image block is obtained based on the position of the target forward reference block, the position of the target backward reference block, and the position of the current image block, or is obtained based on the first position offset and the second position offset corresponding to the determined pair of reference block positions.
  • the embodiment of the present application can obtain the corrected motion information of the current image block and improve the accuracy of the motion information of the current image block, which is also beneficial to the prediction of other image blocks, for example, by improving the accuracy of motion information prediction for other image blocks.
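The second derivation route above (updated MVs from the winning offsets) can be sketched as below. This is an assumed helper, not text from the patent; it uses the mirror relationship of the first aspect, where the second offset is the negated first offset.

```python
def update_mvs(mv_fwd, mv_bwd, best_offset):
    """Derive updated motion vectors from the winning first position offset.

    The updated forward MV points at the target forward reference block, so it
    is the initial forward MV plus the first position offset; the backward MV
    gets the mirrored (negated) second position offset.
    """
    dx, dy = best_offset
    new_fwd = (mv_fwd[0] + dx, mv_fwd[1] + dy)
    new_bwd = (mv_bwd[0] - dx, mv_bwd[1] - dy)  # mirrored second offset
    return new_fwd, new_bwd
```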
  • the locations of the N forward reference blocks include a location of an initial forward reference block and a location of (N-1) candidate forward reference blocks,
  • the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance;
  • the positions of the N backward reference blocks include the position of one initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
  • the positions of the N pairs of reference blocks include: the positions of the paired initial forward reference block and initial backward reference block, and the positions of paired candidate forward reference blocks and candidate backward reference blocks, wherein the offset of the position of the candidate forward reference block in the forward reference image relative to the position of the initial forward reference block and the offset of the position of the candidate backward reference block in the backward reference image relative to the position of the initial backward reference block are in a mirror relationship.
  • the initial motion information includes forward predicted motion information and backward predicted motion information
  • determining the positions of the N forward reference blocks and the positions of the N backward reference blocks based on the initial motion information and the position of the current image block includes: determining the positions of the N forward reference blocks in the forward reference image according to the forward predicted motion information and the position of the current image block, where the positions of the N forward reference blocks include the position of one initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the offset of each candidate forward reference block's position relative to the initial forward reference block's position is an integer pixel distance or a fractional pixel distance; and determining the positions of the N backward reference blocks in the backward reference image according to the backward predicted motion information and the position of the current image block, where the positions of the N backward reference blocks include the position of one initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the offset of each candidate backward reference block's position relative to the initial backward reference block's position is an integer pixel distance or a fractional pixel distance.
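Generating the (N-1) candidate positions around an initial reference block at integer-pixel or fractional-pixel distances can be sketched as below; the helper name and the square search window are illustrative assumptions, not the patent's prescribed pattern. With `step = 1.0` the offsets are whole-pel; `step = 0.5` gives half-pel offsets.

```python
def candidate_offsets(radius=1, step=1.0):
    """Offsets (dx, dy) within a square search window, (0, 0) first.

    radius is the window half-width in pixels; step is the offset
    granularity (1.0 = integer-pel, 0.5 = half-pel, etc.).
    """
    n = int(radius / step)
    offs = [(0.0, 0.0)]                  # the initial reference block itself
    for dy in range(-n, n + 1):
        for dx in range(-n, n + 1):
            if (dx, dy) != (0, 0):
                offs.append((dx * step, dy * step))
    return offs
```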
  • the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
  • determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the initial motion information and the position of the current image block includes:
  • the determining, according to the matching cost criterion, from the positions of M pairs of reference blocks, the position of one pair of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block includes: determining the position of a pair of reference blocks whose matching error is less than or equal to a matching error threshold as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
  • the matching cost criterion is a criterion that minimizes the matching cost. For example, for the positions of the M pairs of reference blocks, the difference between the pixel values of the forward reference block and the pixel values of the backward reference block in each pair is calculated; from the positions of the M pairs of reference blocks, the position of the pair with the smallest pixel value difference is determined as the position of the forward target reference block and the position of the backward target reference block of the current image block.
  • the matching cost criterion is a matching cost with early termination criterion. For example, for the positions of the nth pair of reference blocks (one forward reference block and one backward reference block), the difference between the pixel values of the forward reference block and the pixel values of the backward reference block is calculated, where n is an integer greater than or equal to 1 and less than or equal to N; when the pixel value difference is less than or equal to the matching error threshold, the positions of the nth pair of reference blocks are determined as the position of the forward target reference block and the position of the backward target reference block of the current image block.
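The early-termination variant just described can be sketched as follows. The names `pairs`, `cost_fn`, and `threshold` are hypothetical: candidate pairs are scanned in order, and the scan stops as soon as one pair's cost reaches the threshold, so later candidates are never evaluated.

```python
def search_with_early_termination(pairs, cost_fn, threshold):
    """Scan (forward_pos, backward_pos) pairs, stopping at a good-enough match.

    Returns (cost, forward_pos, backward_pos) for the accepted pair, or the
    minimum-cost pair if no candidate reaches the threshold.
    """
    best = None
    for fwd_pos, bwd_pos in pairs:
        c = cost_fn(fwd_pos, bwd_pos)
        if best is None or c < best[0]:
            best = (c, fwd_pos, bwd_pos)
        if c <= threshold:          # early termination: match is good enough
            break
    return best
```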
  • the method is for encoding the current image block, and the acquiring initial motion information of the current image block includes: from a candidate motion information list of a current image block Obtaining the initial motion information;
  • the method is used to decode the current image block, and before the acquiring initial motion information of the current image block, the method further includes: acquiring indication information from a code stream of the current image block, where the indication information is used for Indicates the initial motion information of the current image block.
  • the image prediction method in the embodiment of the present application is applicable not only to the merge prediction mode (Merge) and/or the advanced motion vector prediction mode (AMVP), but also to other modes that predict the motion information of the current image block by using the motion information of spatial reference blocks, temporal reference blocks, and/or inter-view reference blocks, thereby improving codec performance.
  • a second aspect of the present application provides an image prediction method, including: acquiring initial motion information of a current image block; determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current image block, where N is an integer greater than 1; and determining, according to a matching cost criterion, from positions of M pairs of reference blocks, the position of one pair of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on a time domain distance, the first position offset representing the offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset representing the offset of the position of the backward reference block relative to the position of the initial backward reference block.
  • the offset of the position of the initial forward reference block relative to the position of the initial forward reference block is 0, and the offset of the position of the initial backward reference block relative to the position of the initial backward reference block is 0; these two 0 offsets also satisfy the mirror relationship or the proportional relationship based on the time domain distance.
  • the positions of the (N-1) pairs of candidate reference blocks do not include the position of the initial forward reference block or the position of the initial backward reference block.
  • the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. For each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance (also referred to as a time-domain-distance-based mirror relationship). On this basis, the position of the pair of reference blocks with, for example, the smallest matching cost is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block (i.e., the best forward reference block/forward prediction block) and the position of the target backward reference block (i.e., the best backward reference block/backward prediction block) of the current image block, thereby obtaining a predicted value of the pixel values of the current image block.
  • the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process; the complexity of image prediction is thus reduced while the accuracy of image prediction is improved.
  • the first position offset and the second position offset have a proportional relationship based on a time domain distance, including:
  • the proportional relationship between the first position offset and the second position offset is determined based on a proportional relationship between a first time domain distance and a second time domain distance, wherein the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
  • the first position offset and the second position offset have a proportional relationship based on a time domain distance, including:
  • the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
  • the direction of the first position offset is opposite to the direction of the second position offset, and the ratio between the magnitude of the first position offset and the magnitude of the second position offset is based on the proportional relationship between the first time domain distance and the second time domain distance;
  • the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
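As a quick numeric sketch of this proportional relationship (a hypothetical helper; `round()` stands in for the fixed-point scaling a real codec would use, and `td0`/`td1` denote the first and second time domain distances):

```python
def scaled_backward_offset(fwd_offset, td0, td1):
    """Derive the second (backward) position offset from the first (forward)
    one: direction reversed, magnitude scaled by the ratio of the
    current-to-backward distance td1 to the current-to-forward distance td0."""
    dy, dx = fwd_offset
    return (-round(dy * td1 / td0), -round(dx * td1 / td0))
```

When `td0 == td1`, this degenerates to the pure mirror relationship listed first.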
  • the method further comprises: obtaining updated motion information of the current image block, the updated motion information including an updated forward motion vector and an updated backward motion vector, wherein the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
  • the embodiments of the present application can obtain corrected motion information of the current image block and improve the accuracy of the motion information of the current image block, which is also beneficial to the prediction of other image blocks, for example, by improving the prediction accuracy of the motion information of other image blocks.
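The update itself reduces to adding the winning position offsets to the initial motion vectors (a hypothetical sketch; representing a motion vector as a `(dy, dx)` tuple is an assumption of this illustration):

```python
def updated_motion_vectors(mv_fwd, mv_bwd, best_fwd_off, best_bwd_off):
    """Refine the initial forward/backward motion vectors so that they
    point to the target forward/backward reference blocks."""
    upd_fwd = (mv_fwd[0] + best_fwd_off[0], mv_fwd[1] + best_fwd_off[1])
    upd_bwd = (mv_bwd[0] + best_bwd_off[0], mv_bwd[1] + best_bwd_off[1])
    return upd_fwd, upd_bwd
```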
  • the positions of the N forward reference blocks include the position of an initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance;
  • the positions of the N backward reference blocks include the position of an initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
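Enumerating such candidate positions can be sketched as follows (illustrative only; the particular mix of integer and half-pixel steps is an assumption, and fractional-pixel positions would be produced by sub-pixel interpolation in practice):

```python
def candidate_positions(init_pos, steps=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """List candidate reference-block positions whose offsets from the
    initial position are integer or fractional pixel distances; the
    zero offset keeps the initial position itself in the set."""
    y0, x0 = init_pos
    return [(y0 + dy, x0 + dx) for dy in steps for dx in steps]
```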
  • the positions of the N pairs of reference blocks include: the position of the paired initial forward reference block and initial backward reference block, and the positions of paired candidate forward reference blocks and candidate backward reference blocks, wherein the positional offset of the position of a candidate forward reference block in the forward reference image relative to the position of the initial forward reference block and the positional offset of the position of the paired candidate backward reference block in the backward reference image relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance.
  • the initial motion information includes forward predicted motion information and backward predicted motion information
  • determining the positions of the N forward reference blocks and the positions of the N backward reference blocks based on the initial motion information and the position of the current image block includes:
  • the positions of the candidate backward reference blocks, where the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
  • the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
  • determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the initial motion information and the position of the current image block includes:
  • the determining, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block includes:
  • determining the position of the pair of reference blocks whose matching error is less than or equal to a matching error threshold as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
  • the matching cost criterion is a minimum matching cost criterion. For example, for the positions of the M pairs of reference blocks, the difference between the pixel values of the forward reference block and the pixel values of the backward reference block in each pair of reference blocks is calculated; from the positions of the M pairs of reference blocks, the position of the pair of reference blocks with the smallest pixel value difference is determined as the position of the forward target reference block of the current image block and the position of the backward target reference block.
  • the matching cost criterion is a matching cost with early termination criterion. For example, for the positions of the n-th pair of reference blocks (one forward reference block and one backward reference block), the difference between the pixel values of the forward reference block and the pixel values of the backward reference block is calculated, where n is an integer greater than or equal to 1 and less than or equal to N; when the pixel value difference is less than or equal to the matching error threshold, the positions of the n-th pair of reference blocks (a forward reference block and a backward reference block) are determined as the position of the forward target reference block of the current image block and the position of the backward target reference block.
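Both criteria can be sketched in one routine (a hedged illustration; `cost_fn` is any matching cost such as the SAD of a forward/backward pair, and the fallback to the minimum-cost pair when no pair beats the threshold is an assumption, since the text only defines the early exit):

```python
def find_with_early_termination(pair_positions, cost_fn, threshold):
    """Scan candidate reference-block pairs in order; stop at the first
    pair whose matching cost is within the threshold, otherwise return
    the minimum-cost pair seen. Returns (pair, cost, pairs_examined)."""
    best = None
    for n, pair in enumerate(pair_positions, start=1):
        cost = cost_fn(pair)
        if cost <= threshold:
            return pair, cost, n          # early termination
        if best is None or cost < best[1]:
            best = (pair, cost)
    return best[0], best[1], len(pair_positions)
```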
  • the method is used to encode the current image block, and the acquiring initial motion information of the current image block includes: obtaining the initial motion information from a candidate motion information list of the current image block;
  • the method is used to decode the current image block, and before the acquiring initial motion information of the current image block, the method further includes: acquiring indication information from a code stream of the current image block, where the indication information is used for Indicates the initial motion information of the current image block.
  • a third aspect of the present application provides an image prediction method, including: acquiring i-th round motion information of a current image block;
  • N is an integer greater than 1; based on the matching cost criterion, determining the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block.
  • the first position offset is in a mirror relationship with the second position offset, the first position offset indicating the positional offset of the position of the forward reference block relative to the position of the (i-1)-th round target forward reference block,
  • and the second position offset indicating the positional offset of the position of the backward reference block relative to the position of the (i-1)-th round target backward reference block.
  • the positional offset of the position of the initial forward reference block relative to the position of the initial forward reference block is 0, and the positional offset of the position of the initial backward reference block relative to the position of the initial backward reference block is 0;
  • the 0 offset and the 0 offset also satisfy the mirror relationship.
  • the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of the N pairs of reference blocks; for each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block are in a mirror relationship; on this basis, the position of the pair of reference blocks with, for example, the smallest matching cost is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block.
  • the method in the embodiments of the present application avoids pre-computing a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process: while the accuracy of image prediction is improved, the complexity of image prediction is reduced.
  • the method of the present application can further improve the accuracy of the modified motion vector MV by using an iterative method, thereby further improving the codec performance.
  • if i = 1, the i-th round motion information is the initial motion information of the current image block; correspondingly, the positions of the N forward reference blocks include the position of an initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or, the positions of the N backward reference blocks include the position of an initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
  • if i > 1, the i-th round motion information includes: a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block; correspondingly, the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the (i-1)-th round target forward reference block is an integer pixel distance or a fractional pixel distance; or, the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the (i-1)-th round target backward reference block is an integer pixel distance or a fractional pixel distance.
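The multi-round structure can be sketched as a small driver loop (a hypothetical illustration; `refine_once` stands for any single-round search such as a mirrored minimum-SAD search, and the convergence test is an assumption, since the text does not fix a stopping rule):

```python
def iterative_refinement(refine_once, pos_f, pos_b, rounds=3):
    """Round i searches around the round i-1 target positions: the targets
    found in one round become the zero-offset search centers of the next."""
    for _ in range(rounds):
        new_f, new_b = refine_once(pos_f, pos_b)
        if (new_f, new_b) == (pos_f, pos_b):
            break                         # no better pair found: converged
        pos_f, pos_b = new_f, new_b
    return pos_f, pos_b
```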
  • if the method is used to encode the current image block, the initial motion information of the current image block is obtained by determining the initial motion information from a candidate motion information list of the current image block; or, if the method is used to decode the current image block, the initial motion information of the current image block is obtained by acquiring indication information from a code stream of the current image block, wherein the indication information is used to indicate the initial motion information of the current image block.
  • the first position offset is in a mirror image relationship with the second position offset, including: the direction of the first position offset and the second position offset The direction is reversed and the magnitude of the first position offset is the same as the magnitude of the second position offset.
  • the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
  • determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the i-th round motion information and the position of the current image block includes:
  • the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks;
  • the determining, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block includes:
  • determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
  • a fourth aspect of the present application provides an image prediction method, including: acquiring i-th round motion information of a current image block;
  • N is an integer greater than 1; based on the matching cost criterion, determining the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block.
  • the first position offset and the second position offset have a proportional relationship based on a time domain distance, the first position offset indicating the positional offset of the position of the forward reference block in the forward reference image relative to the position of the (i-1)-th round target forward reference block, and the second position offset indicating the positional offset of the position of the backward reference block in the backward reference image relative to the position of the (i-1)-th round target backward reference block.
  • the positional offset of the position of the initial forward reference block relative to the position of the initial forward reference block is 0, and the positional offset of the position of the initial backward reference block relative to the position of the initial backward reference block is 0; the 0 offset and the 0 offset also satisfy the mirror relationship or the proportional relationship based on the time domain distance.
  • the positions of the (N-1) pairs of reference blocks do not include the position of the initial forward reference block and the position of the initial backward reference block.
  • the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of the N pairs of reference blocks; for each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance; on this basis, the position of the pair of reference blocks with, for example, the smallest matching cost is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block.
  • the method in the embodiments of the present application avoids pre-computing a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process: while the accuracy of image prediction is improved, the complexity of image prediction is reduced.
  • the method of the present application can further improve the accuracy of the modified motion vector MV by using an iterative method, thereby further improving the codec performance.
  • if i = 1, the i-th round motion information is the initial motion information of the current image block; if i > 1, the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
  • the first position offset and the second position offset have a proportional relationship based on a time domain distance, including:
  • the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
  • the direction of the first position offset is opposite to the direction of the second position offset, and the ratio between the magnitude of the first position offset and the magnitude of the second position offset is based on the proportional relationship between the first time domain distance and the second time domain distance;
  • the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
  • the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
  • determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the i-th round motion information and the position of the current image block includes:
  • the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks;
  • the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of the (N-1) candidate backward reference blocks.
  • the determining, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block includes:
  • determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
  • a fifth aspect of the present application provides an image prediction apparatus comprising a plurality of functional units for implementing any of the methods of the first aspect.
  • the image prediction apparatus may include: a first acquiring unit, configured to acquire initial motion information of the current image block; and a first searching unit, configured to determine N forward directions based on the initial motion information and a position of the current image block. Referring to the position of the block and the position of the N backward reference blocks, the N forward reference blocks are located in the forward reference picture, the N backward reference blocks are located in the backward reference picture, and N is an integer greater than 1.
  • a position of a pair of reference blocks is determined, based on the matching cost criterion, from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, wherein the position of each pair of reference blocks includes the position of a forward reference block and the position of a backward reference block, and for the position of each pair of reference blocks, the first position offset is in a mirror relationship with the second position offset, the first position offset indicating the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset indicating the positional offset of the position of the backward reference block relative to the position of the initial backward reference block, where M is an integer greater than or equal to 1, and M is less than or equal to N; and a first prediction unit, configured to obtain a predicted value of the pixel values of the current image block based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
  • the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
  • a sixth aspect of the present application provides an image predicting apparatus comprising a plurality of functional units for implementing any one of the methods of the second aspect.
  • the image prediction apparatus may include: a second acquiring unit, configured to acquire initial motion information of the current image block; and a second searching unit, configured to determine N forward directions based on the initial motion information and a position of the current image block Referring to the position of the block and the position of the N backward reference blocks, the N forward reference blocks are located in the forward reference picture, the N backward reference blocks are located in the backward reference picture, and N is an integer greater than 1.
  • a position of a pair of reference blocks is determined, based on the matching cost criterion, from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, wherein the position of each pair of reference blocks includes the position of a forward reference block and the position of a backward reference block, and for the position of each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on a time domain distance, the first position offset representing the positional offset of the position of the forward reference block relative to the position of an initial forward reference block, and the second position offset representing the positional offset of the position of the backward reference block relative to the position of an initial backward reference block, where M is an integer greater than or equal to 1, and M is less than or equal to N; and a second prediction unit, configured to obtain a predicted value of the pixel values of the current image block based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
  • the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
  • a seventh aspect of the present application provides an image predicting apparatus comprising a plurality of functional units for implementing any of the methods of the third aspect.
  • the image prediction apparatus may include: a third acquiring unit, configured to acquire the i-th round motion information of the current image block; and a third searching unit, configured to determine, according to the i-th round motion information and the position of the current image block, the positions of N forward reference blocks and the positions of N backward reference blocks, the N forward reference blocks being located in the forward reference picture, the N backward reference blocks being located in the backward reference picture, N being an integer greater than 1; and to determine, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, wherein the position of each pair of reference blocks includes the position of a forward reference block and the position of a backward reference block
  • the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
  • the image prediction apparatus may include: a fourth acquiring unit, configured to acquire the i-th round motion information of the current image block; and a fourth searching unit, configured to determine, according to the i-th round motion information and the position of the current image block
  • the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
  • a ninth aspect of the present application provides an image prediction apparatus, the apparatus comprising: a processor and a memory coupled to the processor; the processor for performing the first aspect or the second aspect or the third aspect Or the method of the fourth aspect or various implementations of the foregoing aspects.
  • a tenth aspect of the present application provides a video encoder, the video encoder for encoding an image block, comprising: an inter prediction module, wherein the inter prediction module includes the image prediction apparatus of the fifth, sixth, seventh or eighth aspect and is configured to predict a predicted value of the pixel values of the image block; an entropy encoding module, configured to encode indication information into the code stream, the indication information being used to indicate initial motion information of the image block; and a reconstruction module, configured to reconstruct the image block based on the predicted value of the pixel values of the image block.
  • An eleventh aspect of the present application provides a video decoder, where the video decoder is configured to decode an image block from a code stream, and includes: an entropy decoding module, configured to decode indication information from the code stream, where the indication information is used to indicate initial motion information of a currently decoded image block; an inter prediction module, comprising the image prediction apparatus according to the fifth, sixth, seventh or eighth aspect, wherein the inter prediction module is configured to predict a predicted value of the pixel values of the image block; and a reconstruction module, configured to reconstruct the image block based on the predicted value of the pixel values of the image block.
  • a twelfth aspect of the present application provides a video encoding apparatus, including a nonvolatile storage medium and a processor, the nonvolatile storage medium storing an executable program, wherein the processor and the nonvolatile storage medium are coupled to each other, and the processor executes the executable program to implement the methods of the first, second, third or fourth aspects or various implementations thereof.
  • a thirteenth aspect of the present application provides a video decoding apparatus, including a nonvolatile storage medium and a processor, the nonvolatile storage medium storing an executable program, wherein the processor and the nonvolatile storage medium are coupled to each other, and the processor executes the executable program to implement the methods of the first, second, third or fourth aspects or various implementations thereof.
  • a fourteenth aspect of the present application provides a computer readable storage medium having stored therein instructions that, when executed on a computer, cause the computer to perform the methods of the first, second, third or fourth aspects described above or various implementations thereof.
  • a fifteenth aspect of the present application provides a computer program product comprising instructions which, when executed on a computer, cause the computer to perform the methods of the first, second, third or fourth aspects or various implementations thereof.
  • a sixteenth aspect of the present application provides an electronic device comprising the video encoder according to the above tenth aspect, or the video decoder according to the eleventh aspect, or the fifth, sixth, seventh or eighth aspect The image prediction device.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system in an embodiment of the present application.
  • FIG. 2A is a schematic block diagram of a video encoder in an embodiment of the present application.
  • FIG. 2B is a schematic block diagram of a video decoder in an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an encoder acquiring initial motion information in a merge mode of inter prediction.
  • FIG. 5 is a schematic diagram of a decoding end acquiring initial motion information in a merge mode of inter prediction.
  • FIG. 6 is a schematic diagram of an initial reference block of a current image block.
  • FIG. 7 is a schematic diagram of integer pixel position pixels and sub-pixel position pixels.
  • FIG. 8 is a schematic diagram of a search starting point.
  • FIG. 9 is a schematic block diagram showing a mirror relationship between a first position offset and a second position offset in an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
  • FIG. 11 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
  • FIG. 12 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
  • FIG. 13 is a schematic block diagram showing a proportional relationship based on a time domain distance between a first position offset and a second position offset in an embodiment of the present application.
  • FIG. 14 is a schematic flowchart of another image prediction method 1400 according to an embodiment of the present application.
  • FIG. 15 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
  • FIG. 16 is a schematic flowchart of another image prediction method 1600 according to an embodiment of the present application.
  • FIG. 17 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
  • FIG. 18 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application.
  • FIG. 19 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application.
  • FIG. 20 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application.
  • FIG. 21 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application.
  • FIG. 22 is a schematic block diagram of an encoding device or a decoding device according to an embodiment of the present application.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system in an embodiment of the present application.
  • the video encoder 20 and the video decoder 30 in the system are used to predict predicted values of pixel values of image blocks according to the various image prediction method examples proposed herein, and to correct motion information, for example motion vectors, of currently encoded or decoded image blocks, to further improve codec performance.
  • the system includes a source device 12 and a destination device 14, which generates encoded video data that will be decoded by the destination device 14 at a later time.
  • Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" touchpads, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices, or the like.
  • Link 16 may include any type of media or device capable of moving encoded video data from source device 12 to destination device 14.
  • link 16 may include communication media that enables source device 12 to transmit encoded video data directly to destination device 14 in real time.
  • the encoded video data can be modulated and transmitted to destination device 14 in accordance with a communication standard (e.g., a wireless communication protocol).
  • Communication media can include any wireless or wired communication medium, such as a radio frequency spectrum or one or more physical transmission lines.
  • the communication medium may form part of a packet-based network (e.g., a local area network, a wide area network, or a global network such as the Internet).
  • Communication media can include routers, switches, base stations, or any other equipment that can be used to facilitate communication from source device 12 to destination device 14.
  • the encoded data may be output from output interface 22 to storage device 24.
  • encoded data can be accessed from storage device 24 by an input interface.
  • Storage device 24 may comprise any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray Disc, DVD, CD-ROM, flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded video data.
  • storage device 24 may correspond to a file server or another intermediate storage device that may maintain encoded video produced by source device 12. Destination device 14 may access the stored video data from storage device 24 via streaming or download.
  • the file server can be any type of server capable of storing encoded video data and transmitting this encoded video data to destination device 14.
  • a file server includes a web server, a file transfer protocol server, a network attached storage device, or a local disk drive.
  • Destination device 14 can access the encoded video data via any standard data connection that includes an Internet connection.
  • This data connection may include a wireless channel (eg, a Wi-Fi connection), a wired connection (eg, a cable modem, etc.), or a combination of both, suitable for accessing encoded video data stored on a file server.
  • the transmission of encoded video data from storage device 24 may be streaming, downloading, or a combination of both.
  • the techniques of this application are not necessarily limited to wireless applications or settings. The techniques may be applied to video decoding in support of any of a variety of multimedia applications, such as over-the-air television broadcast, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of digital video for storage on a data storage medium, decoding of digital video stored on a data storage medium, or other applications.
  • the system can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • source device 12 includes video source 18, video encoder 20, and output interface 22.
  • output interface 22 may include a modulator/demodulator (modem) and/or a transmitter.
  • video source 18 may include sources such as a video capture device (e.g., a camera), a video archive containing previously captured video, a video feed interface to receive video from a video content provider, and/or a computer graphics system for generating computer graphics data as the source video, or a combination of these sources.
  • if the video source 18 is a video camera, the source device 12 and the destination device 14 may form a so-called camera phone or video phone.
  • the techniques described in this application are illustratively applicable to video coding, and are applicable to wireless and/or wired applications.
  • Captured, pre-captured, or computer generated video may be encoded by video encoder 20.
  • the encoded video data can be transmitted directly to the destination device 14 via the output interface 22 of the source device 12.
  • the encoded video data may also (or alternatively) be stored on storage device 24 for later access by destination device 14 or other device for decoding and/or playback.
  • the destination device 14 includes an input interface 28, a video decoder 30, and a display device 32.
  • input interface 28 can include a receiver and/or a modem.
  • Input interface 28 of destination device 14 receives encoded video data via link 16.
  • the encoded video data communicated over link 16 or provided on storage device 24 may include various syntax elements generated by video encoder 20 for use by video decoder 30 in decoding the video data. These syntax elements may be included with the encoded video data that is transmitted over a communication medium, stored on a storage medium, or stored on a file server.
  • Display device 32 may be integrated with destination device 14 or external to destination device 14.
  • destination device 14 can include an integrated display device and is also configured to interface with an external display device.
  • the destination device 14 can be a display device.
  • display device 32 displays decoded video data to a user and may include any of a variety of display devices, such as a liquid crystal display, a plasma display, an organic light emitting diode display, or another type of display device.
  • Video encoder 20 and video decoder 30 may operate in accordance with, for example, the next generation video codec compression standard (H.266) currently under development and may conform to the H.266 Test Model (JEM).
  • video encoder 20 and video decoder 30 may operate according to, for example, the ITU-T H.265 standard, also referred to as the high efficiency video coding standard, or other proprietary or industry standards such as the ITU-T H.264 standard, or extensions of these standards.
  • the ITU-TH.264 standard is alternatively referred to as MPEG-4 Part 10, also known as advanced video coding (AVC).
  • the techniques of this application are not limited to any particular decoding standard.
  • Other possible implementations of the video compression standard include MPEG-2 and ITU-TH.263.
  • video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder and may include a suitable multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software to handle the encoding of both audio and video in a common data stream or in a separate data stream.
  • the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol or other protocols such as the User Datagram Protocol (UDP).
  • Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), Field Programmable Gate Array (FPGA), discrete logic, software, hardware, firmware, or any combination thereof.
  • the apparatus may store the instructions of the software in a suitable non-transitory computer readable medium and execute the instructions in hardware using one or more processors to perform the techniques of the present application.
  • Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, any of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device.
  • the present application may illustratively involve video encoder 20 "signaling" particular information to another device, such as video decoder 30.
  • video encoder 20 may signal information by associating particular syntax elements with various encoded portions of the video data. That is, video encoder 20 may "signal" the data by storing the particular syntax elements to the header information of the various encoded portions of the video data.
  • these syntax elements may be encoded and stored (eg, stored to storage system 34 or file server 36) prior to being received and decoded by video decoder 30.
  • the term "signaling" may illustratively refer to the communication of syntax or other data used to decode the compressed video data, whether this communication occurs in real time, in near real time, or over a span of time, such as when a syntax element is stored to a medium at encoding time and is then retrieved by the decoding device at any time after being stored to the medium.
  • the JCT-VC developed the H.265 (HEVC) standard.
  • HEVC standardization is based on an evolution model of a video decoding device called the HEVC Test Model (HM).
  • the latest standard documentation for H.265 is available at http://www.itu.int/rec/T-REC-H.265.
  • the latest version of the standard document is H.265 (12/16), and the full text of that standard document is incorporated herein by reference.
  • the HM assumes that the video decoding device has several additional capabilities with respect to existing algorithms of ITU-TH.264/AVC. For example, H.264 provides nine intra-prediction coding modes, while HM provides up to 35 intra-prediction coding modes.
  • JVET is committed to the development of the H.266 standard.
  • the H.266 standardization process is based on an evolution model of a video decoding device called the H.266 test model.
  • the algorithm description of H.266 is available from http://phenix.int-evry.fr/jvet, and the latest algorithm description is included in JVET-F1001-v2, which is incorporated herein by reference in its entirety.
  • the reference software for the JEM test model is available from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/, which is also incorporated herein by reference in its entirety.
  • HM may divide a video frame or image into a sequence of tree blocks, or largest coding units (LCUs), containing both luminance and chrominance samples; these are also referred to as coding tree units (CTUs).
  • Treeblocks have similar purposes to macroblocks of the H.264 standard.
  • a slice contains several consecutive tree blocks in decoding order.
  • a video frame or image can be segmented into one or more slices.
  • Each tree block can be split into coding units according to a quadtree. For example, a tree block that is the root node of a quadtree can be split into four child nodes, and each child node can be a parent node again and split into four other child nodes.
  • the final non-splittable child nodes, which are the leaf nodes of the quadtree, comprise decoding nodes, such as decoded image blocks.
  • the syntax data associated with the decoded code stream may define the maximum number of times the tree block can be split, and may also define the minimum size of the decoded node.
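The recursive quadtree splitting of a tree block described above can be sketched as follows (a minimal illustration; the `should_split` predicate stands in for the encoder's actual split decision, which is not specified here):

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Recursively split a square block at (x, y) with side `size` into four
    children until `should_split` returns False or `min_size` is reached.
    Returns the leaf blocks (decoding nodes) as (x, y, size) tuples."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]  # leaf node: a decoding node
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(x + dx, y + dy, half, min_size, should_split)
    return leaves

# Example: split a 64x64 tree block one level everywhere, then stop.
leaves = quadtree_split(0, 0, 64, 32, lambda x, y, s: s > 32)
# leaves -> four 32x32 child blocks
```

The `min_size` argument plays the role of the minimum decoding-node size that the syntax data may define.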
  • the coding unit includes a decoding node and a prediction unit (PU) and a transform unit (TU) associated with the decoding node.
  • the size of the CU corresponds to the size of the decoding node and the shape must be square.
  • the size of the CU may range from 8 x 8 pixels up to a maximum of 64 x 64 pixels or larger.
  • Each CU may contain one or more PUs and one or more TUs.
  • syntax data associated with a CU may describe a situation in which a CU is partitioned into one or more PUs.
  • the partitioning mode may differ depending on whether the CU is skip-mode coded, direct-mode coded, intra-prediction-mode coded, or inter-prediction-mode coded.
  • the PU can be divided into a shape that is non-square.
  • syntax data associated with a CU may also describe a situation in which a CU is partitioned into one or more TUs according to a quadtree.
  • the shape of the TU can be square or non-square.
  • the HEVC standard allows for transforms based on TUs, which can be different for different CUs.
  • the TU is typically sized based on the size of the PU within a given CU defined for the partitioned LCU, although this may not always be the case.
  • the size of the TU is usually the same as or smaller than the PU.
  • the residual samples corresponding to the CU may be subdivided into smaller units using a quadtree structure called a "residual quadtree" (RQT).
  • the leaf node of the RQT can be referred to as a TU.
  • the pixel difference values associated with the TU may be transformed to produce transform coefficients, which may be quantized.
  • TUs use transform and quantization processes.
  • a given CU with one or more PUs may also contain one or more TUs.
  • video encoder 20 may calculate a residual value corresponding to the PU.
  • the residual value includes pixel difference values, which can be transformed into transform coefficients, quantized, and scanned using TU to produce serialized transform coefficients for entropy decoding.
  • the present application generally refers to the decoding node of a CU using the term "image block.”
  • image block may also be used herein to refer to a tree block containing a decoding node as well as a PU and a TU, eg, an LCU or CU.
  • a video sequence usually contains a series of video frames or images.
  • a group of pictures (GOP) illustratively includes a series of one or more video images.
  • the GOP may include syntax data in the header information of the GOP, in the header information of one or more of the images, or elsewhere, the syntax data describing the number of images included in the GOP.
  • each slice of an image may contain slice syntax data describing the encoding mode of the corresponding image.
  • Video encoder 20 typically operates on image blocks within individual video slices to encode the video data.
  • An image block may correspond to a decoding node within a CU.
  • Image blocks may have fixed or varying sizes and may vary in size depending on the specified decoding criteria.
  • HM supports prediction with various PU sizes. Assuming that the size of a specific CU is 2N×2N, HM supports intra prediction with a PU size of 2N×2N or N×N, and inter prediction with symmetric PU sizes of 2N×2N, 2N×N, N×2N, or N×N. HM also supports asymmetric partitioning for inter prediction with PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N. In asymmetric partitioning, one direction of the CU is not divided, and the other direction is divided into 25% and 75%.
  • 2N×nU refers to a horizontally partitioned 2N×2N CU, where a 2N×0.5N PU is at the top and a 2N×1.5N PU is at the bottom.
  • N x M and N by M are used interchangeably to refer to the pixel size of an image block according to a horizontal dimension and a vertical dimension, for example, 16 x 8 pixels or 16 by 8 pixels.
  • a 16x8 block has 16 pixels in the horizontal direction (that is, the image block has a width of 16 pixels) and 8 pixels in the vertical direction (that is, the image block has a height of 8 pixels).
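The asymmetric partition geometries above can be illustrated with a small sketch that computes the two PU rectangles for a given mode (the function name `amp_partitions` and its `(x, y, w, h)` return convention are assumptions of this example, not part of the standard):

```python
def amp_partitions(mode, n):
    """Return the two PU rectangles (x, y, w, h) of a 2Nx2N CU under the
    asymmetric inter-prediction modes described above. `n` is N, so
    n = 16 corresponds to a 32x32 CU."""
    s = 2 * n   # CU side length (2N)
    q = n // 2  # 0.5N
    if mode == "2NxnU":   # 2Nx0.5N PU on top, 2Nx1.5N PU at the bottom
        return [(0, 0, s, q), (0, q, s, s - q)]
    if mode == "2NxnD":   # mirrored vertically
        return [(0, 0, s, s - q), (0, s - q, s, q)]
    if mode == "nLx2N":   # 0.5Nx2N PU on the left
        return [(0, 0, q, s), (q, 0, s - q, s)]
    if mode == "nRx2N":   # 0.5Nx2N PU on the right
        return [(0, 0, s - q, s), (s - q, 0, q, s)]
    raise ValueError(mode)

# A 32x32 CU (N = 16) in 2NxnU mode: a 32x8 PU on top, a 32x24 PU below.
top, bottom = amp_partitions("2NxnU", 16)
```

Note that in every mode one PU covers 25% of the CU and the other 75%, matching the description above.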
  • video encoder 20 may calculate residual data for the TU of the CU.
  • a PU may include pixel data in the spatial domain (also referred to as the pixel domain), and a TU may comprise coefficients in the transform domain after a transform (e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform) is applied to the residual video data.
  • the residual data may correspond to a pixel difference between a pixel of the uncoded image and a predicted value corresponding to the PU.
  • Video encoder 20 may form a TU that includes residual data for the CU, and then transform the TU to generate transform coefficients for the CU.
  • An image block refers to a two-dimensional array of sample points, which can be a square array or a rectangular array.
  • a 4×4 image block can be regarded as a square sample point array composed of 4×4 = 16 sample points.
  • the signal within the image block refers to the sampled value of the sample point within the image block.
  • sample points may also be referred to as pixels or pels, and these terms are used interchangeably in this document.
  • the value of a sample point can also be referred to as a pixel value, and these terms are likewise used interchangeably in this application.
  • the image can also be represented as a two-dimensional array of sample points, labeled in a similar way to the image block.
  • video encoder 20 may perform quantization of the transform coefficients.
  • Quantization illustratively refers to the process of quantizing the coefficients to possibly reduce the amount of data used to represent the coefficients to provide further compression.
  • the quantization process can reduce the bit depth associated with some or all of the coefficients. For example, the n-bit value can be rounded down to an m-bit value during quantization, where n is greater than m.
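The bit-depth reduction described above (rounding an n-bit value down to an m-bit value) can be sketched as a right shift (a simplification for illustration; a real quantizer also applies a quantization step size and a rounding offset, which are omitted here):

```python
def reduce_bit_depth(value, n, m):
    """Round an n-bit coefficient down to an m-bit representation by
    discarding the (n - m) least significant bits, where n > m."""
    assert n > m, "bit depth must actually be reduced"
    return value >> (n - m)

# A 9-bit coefficient 300 reduced to 6 bits: 300 >> 3 == 37
q = reduce_bit_depth(300, 9, 6)
```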
  • the JEM model further improves the coding structure of video images.
  • a block coding structure called "Quad Tree Combined Binary Tree" (QTBT) is introduced.
  • the QTBT structure does away with the concepts of CU, PU, and TU in HEVC, and supports more flexible CU partition shapes.
  • One CU can be square or rectangular.
  • a CTU is first partitioned using a quadtree, and the leaf nodes of the quadtree are further partitioned using a binary tree.
  • there are two partitioning modes in binary tree partitioning: symmetric horizontal partitioning and symmetric vertical partitioning.
  • the leaf nodes of the binary tree are called CUs, and the CUs of the JEM cannot be further divided during the prediction and transformation process, that is, the CUs, PUs, and TUs of the JEM have the same block size.
  • the maximum size of the CTU is 256×256 luma pixels.
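The symmetric binary-tree split applied to a quadtree leaf under QTBT can be sketched as follows (the mode labels "H" and "V" are an assumption of this illustration):

```python
def binary_split(x, y, w, h, mode):
    """Symmetric binary-tree split of a rectangular block, as in QTBT.
    Mode 'H' splits horizontally (two stacked halves); mode 'V' splits
    vertically (two side-by-side halves)."""
    if mode == "H":
        return [(x, y, w, h // 2), (x, y + h // 2, w, h - h // 2)]
    if mode == "V":
        return [(x, y, w // 2, h), (x + w // 2, y, w - w // 2, h)]
    raise ValueError(mode)

# A 32x16 quadtree leaf split vertically into two 16x16 CUs:
left, right = binary_split(0, 0, 32, 16, "V")
```

Because JEM CUs are not split further for prediction and transform, these leaves serve simultaneously as CU, PU, and TU.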
  • video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to produce an entropy encoded serialized vector.
  • video encoder 20 may perform an adaptive scan. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy encode the one-dimensional vector using context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method.
  • Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 to decode the video data.
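A predefined scan that serializes a coefficient matrix into a one-dimensional vector can be sketched as an anti-diagonal traversal (a simplified stand-in for illustration; actual codecs define several scan patterns, and this is not the exact HEVC scan order):

```python
def diagonal_scan(block):
    """Scan a square coefficient matrix along anti-diagonals to produce a
    one-dimensional vector for entropy coding. Low-frequency (typically
    nonzero) coefficients cluster at the front of the vector."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):      # s = row + col indexes each diagonal
        for r in range(n):
            c = s - r
            if 0 <= c < n:
                out.append(block[r][c])
    return out

coeffs = [[9, 3, 0, 0],
          [5, 1, 0, 0],
          [2, 0, 0, 0],
          [0, 0, 0, 0]]
vec = diagonal_scan(coeffs)  # nonzero values cluster at the front
```

Grouping the nonzero values early is what makes the serialized vector friendly to the entropy coders listed above.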
  • FIG. 2A is a schematic block diagram of a video encoder 20 in the embodiment of the present application.
  • video encoder 20 may perform an image prediction process, and in particular, motion compensation unit 44 in video encoder 20 may perform an image prediction process.
  • video encoder 20 may include a prediction module 41, a summer 50, a transform module 52, a quantization module 54, and an entropy encoding module 56.
  • the prediction module 41 may include a motion estimation unit 42, a motion compensation unit 44, and an intra prediction unit 46.
  • the internal structure of the prediction module 41 is not limited in this embodiment of the present application.
  • video encoder 20 may also include inverse quantization module 58, inverse transform module 60, and summer 62.
  • the video encoder 20 may further include a splitting unit (not shown) and a reference image memory 64; it should be understood that the splitting unit and the reference image memory 64 may also be disposed outside of video encoder 20;
  • video encoder 20 may also include a filter (not shown) to filter block boundaries to remove blockiness artifacts from the reconstructed video.
  • the filter will typically filter the output of summer 62 as needed.
  • the video encoder 20 receives video data, and the dividing unit divides the data into image blocks.
  • this segmentation may also include segmentation into slices, image blocks, or other larger units, such as image block segmentation based on the quadtree structure of LCUs and CUs.
  • a slice can be divided into multiple image blocks.
  • the prediction module 41 is configured to generate a prediction block of the currently encoded image block. Prediction module 41 may select one of a plurality of possible coding modes for the current image block based on the encoding quality and a cost calculation result (e.g., a rate-distortion cost, RDcost), such as one of a plurality of intra-coding modes or one of a plurality of inter-coding modes. Prediction module 41 may provide the resulting intra-coded or inter-coded block to summer 50 to generate residual block data, and provide the resulting intra-coded or inter-coded block to summer 62 to reconstruct the encoded block for use as part of a reference image.
  • Motion estimation unit 42 and motion compensation unit 44 within prediction module 41 perform inter-predictive coding of the current image block relative to one or more prediction blocks in one or more reference images to provide temporal compression.
  • Motion estimation unit 42 is configured to determine an inter-prediction mode for a video slice based on a predetermined pattern of the video sequence.
  • the predetermined pattern may designate the video slices in the sequence as P slices, B slices, or GPB slices.
  • Motion estimation unit 42 and motion compensation unit 44 may be highly integrated, but are separately illustrated for conceptual purposes.
  • motion estimation, performed by motion estimation unit 42, is the process of generating motion vectors that estimate the motion of image blocks.
  • the motion vector may indicate the displacement of the PU of the current video frame or image block within the image relative to the predicted block within the reference image.
  • the prediction block is a block that is found to closely match the PU of the image block to be coded in terms of pixel difference, where the pixel difference may be determined by a sum of absolute differences (SAD), a sum of squared differences (SSD), or another difference metric.
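The SAD and SSD difference metrics mentioned above follow directly from their definitions, sketched here for two equally sized pixel blocks:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized pixel blocks."""
    return sum((a - b) ** 2 for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

cur = [[10, 12], [14, 16]]   # current block
ref = [[11, 12], [13, 18]]   # candidate prediction block
# SAD = 1 + 0 + 1 + 2 = 4; SSD = 1 + 0 + 1 + 4 = 6
```

Motion estimation evaluates such a metric over many candidate positions in the reference image and keeps the candidate with the smallest difference.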
  • video encoder 20 may calculate a value of a sub-integer pixel location of a reference image stored in reference image memory 64.
  • the motion estimation unit 42 calculates a motion vector of the PU of the image block in the inter-decoded slice by comparing the position of the PU with the position of the prediction block of the reference image.
  • the reference images may be selected from a first reference image list (List 0) or a second reference image list (List 1), each of the lists identifying one or more reference images stored in the reference image memory 64.
  • Motion estimation unit 42 transmits the computed motion vector to entropy encoding module 56 and motion compensation unit 44.
  • Motion compensation performed by motion compensation unit 44 may involve extracting or generating a prediction block based on motion vectors determined by motion estimation, possibly performing interpolation to sub-pixel precision. After receiving the motion vector of the PU of the current image block, motion compensation unit 44 may locate the prediction block to which the motion vector is directed in one of the reference image lists.
  • the video encoder 20 forms a residual image block by subtracting the pixel value of the prediction block from the pixel value of the current image block being decoded, thereby forming a pixel difference value.
  • the pixel difference values form residual data for the block and may include both luminance and chrominance difference components.
  • Summer 50 represents one or more components that perform this subtraction.
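The subtraction performed by summer 50 can be sketched as a pixel-wise difference between the current block and its prediction block:

```python
def residual_block(current, prediction):
    """Pixel-wise difference forming the residual data for a block."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, prediction)]

res = residual_block([[100, 101], [102, 103]],
                     [[98, 101], [104, 100]])
# res == [[2, 0], [-2, 3]]
```

In practice this difference is computed separately for the luminance and chrominance components.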
  • Motion compensation unit 44 may also generate syntax elements associated with the image blocks and the video slice for use by video decoder 30 in decoding the image blocks of the video slice.
  • the image prediction process of the embodiment of the present application will be described in detail below with reference to FIG. 3, FIG. 10-12, and FIG. 14-17, and details are not described herein again.
  • Intra prediction unit 46 within prediction module 41 may perform intra-predictive coding of the current image block relative to one or more neighboring blocks in the same image or slice as the current block to be coded, to provide spatial compression.
  • intra-prediction unit 46 may intra-predict the current block.
  • intra prediction unit 46 may determine an intra prediction mode to encode the current block.
  • intra-prediction unit 46 may encode the current block using various intra-prediction modes, for example, during separate encoding passes, and intra-prediction unit 46 (or, in some possible implementations, mode selection unit 40) may select an appropriate intra-prediction mode to use from the tested modes.
  • the video encoder 20 forms a residual image block by subtracting the prediction block from the current image block.
  • the residual video data in the residual block may be included in one or more TUs and applied to transform module 52.
  • the transform module 52 is configured to transform the residual between the original block of the current coded image block and the predicted block of the current image block.
  • Transform module 52 transforms the residual data into residual transform coefficients using, for example, a discrete cosine transform (DCT) or a conceptually similar transform (e.g., a discrete sine transform (DST)).
  • Transform module 52 may convert the residual video data from the pixel domain to a transform domain (eg, a frequency domain).
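The conversion from the pixel domain to the transform domain can be illustrated with a naive floating-point 2-D DCT-II (for illustration only; real codecs use scaled integer approximations of this transform rather than this direct form):

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of a square residual block, mapping pixel-domain
    residuals to transform-domain coefficients."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

# A constant residual block concentrates all energy in the DC coefficient:
coeffs = dct2([[8, 8], [8, 8]])  # coeffs[0][0] == 16.0, others ~0
```

This energy compaction is why the subsequent quantization and scan stages are effective: most coefficients are near zero.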
  • Transform module 52 may send the resulting transform coefficients to quantization module 54.
  • Quantization module 54 quantizes the transform coefficients to further reduce the code rate.
  • quantization module 54 may then perform a scan of the matrix containing the quantized transform coefficients.
  • entropy encoding module 56 may perform a scan.
  • entropy encoding module 56 may entropy encode the quantized transform coefficients. For example, entropy encoding module 56 may perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique. Entropy encoding module 56 may also entropy encode the motion vectors and other syntax elements of the current video slice being encoded. After entropy encoding by entropy encoding module 56, the encoded code stream may be transmitted to video decoder 30, or archived for later transmission to or retrieval by video decoder 30.
  • Inverse quantization module 58 and inverse transform module 60 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain for later use as a reference block for the reference image.
  • the summer 62 adds the reconstructed residual block to the prediction block generated by the prediction module 41 to produce a reconstructed block and serves as a reference block for storage in the reference image memory 64.
  • These reference blocks may be used by motion estimation unit 42 and motion compensation unit 44 as reference blocks to inter-predict blocks in subsequent video frames or images.
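The reconstruction performed by summer 62 can be sketched as adding the reconstructed residual back to the prediction block (the clipping to the valid pixel range is added here for illustration and is an assumption of this sketch, not stated above):

```python
def reconstruct(residual, prediction, bit_depth=8):
    """Add the reconstructed residual to the prediction block and clip the
    result to the valid pixel range, producing a reconstructed block that
    can be stored as a reference block."""
    hi = (1 << bit_depth) - 1
    return [[min(max(r + p, 0), hi) for r, p in zip(rrow, prow)]
            for rrow, prow in zip(residual, prediction)]

rec = reconstruct([[2, 0], [-2, 300]],
                  [[98, 101], [104, 100]])
# rec == [[100, 101], [102, 255]]  (last value clipped to 8-bit range)
```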
  • in other possible implementations, video encoder 20 may directly quantize the residual signal without processing by transform module 52, and accordingly without processing by inverse transform module 60; or, for some image blocks or image frames, video encoder 20 does not generate residual data, and accordingly does not need processing by transform module 52, quantization module 54, inverse quantization module 58, or inverse transform module 60; or, video encoder 20 may store the reconstructed image block directly as a reference block without processing by a filter unit; or, quantization module 54 and inverse quantization module 58 in video encoder 20 may be merged together; or, transform module 52 and inverse transform module 60 in video encoder 20 may be merged together; or, summer 50 and summer 62 may be combined.
  • FIG. 2B is a schematic block diagram of a video decoder 30 in the embodiment of the present application.
  • video decoder 30 may perform an image prediction process, and in particular, motion compensation unit 82 in video decoder 30 may perform an image prediction process.
  • video decoder 30 may include an entropy decoding module 80, a prediction processing module 81, an inverse quantization module 86, an inverse transform module 88, and a reconstruction module 90.
  • the prediction module 81 may include a motion compensation unit 82 and an intra prediction unit 84, which are not limited in this embodiment of the present application.
  • video decoder 30 may also include reference image memory 92. It should be understood that the reference image memory 92 can also be disposed outside of the video decoder 30. In some possible implementations, video decoder 30 may perform an exemplary reciprocal decoding process with respect to the encoding flow described by video encoder 20 from FIG. 2A.
  • video decoder 30 receives, from video encoder 20, an encoded video code stream representing the image blocks of an encoded video slice and the associated syntax elements.
  • Video decoder 30 may receive syntax elements at the video slice level and/or the image block level.
  • Entropy decoding module 80 of video decoder 30 entropy decodes the bitstream/codestream to produce quantized coefficients and some syntax elements.
  • Entropy decoding module 80 forwards the syntax elements to prediction processing module 81.
  • the syntax elements herein may include inter-prediction data related to the current image block; the inter-prediction data may include an index identifier block_based_index to indicate which motion information is used by the current image block, and may also include a switch flag block_based_enable_flag to indicate whether image prediction is performed on the current image block using the method of FIG. 3 or FIG. 14 (in other words, whether inter prediction is performed on the current image block under the MVD image constraint proposed by the present application), or whether image prediction is performed on the current image block using the method of FIG. 12 or FIG. 16 (in other words, whether inter prediction is performed on the current image block under the proportional relationship of time-domain distances proposed by the present application).
  • the intra-prediction unit 84 of the prediction processing module 81 may generate a prediction block of an image block of the current video slice based on the signaled intra-prediction mode and data from previously decoded blocks of the current frame or image.
  • motion compensation unit 82 of prediction processing module 81 may determine, based on the syntax elements received from entropy decoding module 80, the inter-prediction mode used to decode the current image block of the current video slice, and decode the current image block (e.g., perform inter prediction) based on the determined inter-prediction mode.
  • Specifically, the motion compensation unit 82 may determine which image prediction method is used for prediction of the current image block of the current video slice; for example, the syntax element indicates that the current image block is predicted using the image prediction method based on the MVD mirror constraint, whereupon the motion information of the current image block of the current video slice is predicted or refined, so that the prediction block of the current image block is acquired or generated by the motion compensation process using the predicted motion information of the current image block.
  • the motion information herein may include reference image information and motion vectors, wherein the reference image information may include, but is not limited to, unidirectional/bidirectional prediction information, a reference image list number, and a reference image index corresponding to the reference image list.
  • a prediction block may be generated from one of the reference pictures within one of the reference picture lists.
  • the video decoder 30 may construct a reference image list, that is, list 0 and list 1, based on the reference image stored in the reference image memory 92.
  • the reference frame index of the current image may be included in one or more of the reference frame list 0 and list 1.
  • Video encoder 20 may signal an indication of which new image prediction method to employ.
  • The prediction processing module 81 is configured to generate a prediction block of the currently decoded image block; specifically, when the video slice is decoded into an intra-decoded (I) slice, the intra prediction unit 84 of prediction processing module 81 may generate a prediction block for the image block of the current video slice based on the signaled intra prediction mode and data from previously decoded image blocks of the current frame or image.
  • When the video slice is decoded into an inter-decoded slice, motion compensation unit 82 of prediction processing module 81 generates a prediction block for the image block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding module 80.
  • Inverse quantization module 86 inverse quantizes, ie, dequantizes, the quantized transform coefficients provided in the bitstream and decoded by entropy decoding module 80.
  • the inverse quantization process may include using the quantization parameters calculated by video encoder 20 for each of the video slices to determine the degree of quantization that should be applied and likewise determine the degree of inverse quantization that should be applied.
  • Inverse transform module 88 applies the inverse transform to transform coefficients, such as inverse DCT, inverse integer transform, or a conceptually similar inverse transform process, to generate residual blocks in the pixel domain.
  • The video decoder 30 obtains the reconstructed block, that is, the decoded image block, by summing the residual block from the inverse transform module 88 with the corresponding prediction block generated by the motion compensation unit 82.
  • Summer 90 represents the component that performs this summation operation.
  • a loop filter (either in the decoding loop or after the decoding loop) can also be used to smooth pixel transitions or otherwise improve video quality, if desired.
  • a filter unit may represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
  • decoded image blocks in a given frame or image may be stored in decoded image buffer 92, which stores reference images for subsequent motion compensation.
  • the decoded image buffer 92 can be part of a memory that can also store decoded video for later presentation on a display device (eg, display device 32 of FIG. 1), or can be separate from such memory.
  • video decoder 30 may be used to decode the encoded video bitstream.
  • Video decoder 30 may generate an output video stream without processing by a filter unit; or, for some image blocks or image frames, entropy decoding module 80 of video decoder 30 does not decode quantized coefficients, and correspondingly no processing by inverse quantization module 86 and inverse transform module 88 is required.
  • inverse quantization module 86 and inverse transform module 88 in video decoder 30 may be combined.
  • FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
  • the method shown in FIG. 3 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 3 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 3 can occur in the interframe prediction process at the time of encoding and decoding.
  • Process 300 may be performed by video encoder 20 or video decoder 30, and in particular by the motion compensation unit of video encoder 20 or video decoder 30. Assuming that a video data stream having multiple video frames is being processed by a video encoder or video decoder, process 300, comprising the following steps, is performed to predict the pixel values of a current image block of a current video frame;
  • the method shown in FIG. 3 includes steps 301 to 304, and steps 301 to 304 are described in detail below.
  • the image block here may be one image block in the image to be processed, or may be one sub-image in the image to be processed.
  • the image block herein may be an image block to be encoded in the encoding process, or may be an image block to be decoded in the decoding process.
  • The initial motion information may include indication information of a prediction direction (usually bidirectional prediction), a motion vector pointing to a reference image block (usually a motion vector of a neighboring block), and image information of the reference image block (generally understood as reference image information), wherein the motion vector includes a forward motion vector and a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and the backward prediction reference image block.
  • the method may be performed in various manners. For example, the following manners 1 and 2 may be used to obtain initial motion information of the image block.
  • a candidate motion information list is constructed according to motion information of neighboring blocks of the current image block, and a candidate motion information is selected from the candidate motion information list as The initial motion information of the current image block.
  • the candidate motion information list includes a motion vector, reference frame index information, and the like.
  • For example, the motion information of the neighboring block A0 (see the candidate motion information with index 0 in FIG. 5) is selected as the initial motion information of the current image block; specifically, the forward motion vector of A0 is used as the forward motion vector of the current block, and the backward motion vector of A0 is used as the backward motion vector of the current block.
  • a motion vector predictor list is constructed according to motion information of neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as motion vector prediction of the current image block. value.
  • The motion vector of the current image block may be the motion vector value of the selected neighboring block, or may be the sum of the motion vector of the selected neighboring block and the motion vector difference of the current image block, where the motion vector difference is the difference between the motion vector obtained by motion estimation of the current image block and the motion vector of the selected neighboring block.
  • the motion vectors corresponding to the indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block.
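  • As an illustration of Manner 2 (the AMVP-style derivation), the motion vector of the current block is the selected predictor plus the signaled difference. A minimal sketch follows, with purely illustrative names and values that are not part of the specification:

```python
# Sketch of Manner 2: MV = MVP + MVD, applied per component (x, y).
# All names and values are illustrative, not from the specification.

def reconstruct_mv(mvp, mvd):
    """Sum the motion vector predictor and the motion vector difference."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

mvp = (4, -2)   # predictor selected from the motion vector predictor list
mvd = (1, 3)    # motion vector difference parsed from the code stream
mv = reconstruct_mv(mvp, mvd)
print(mv)       # (5, 1)
```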
  • N is an integer greater than one.
  • the current image to which the current image block belongs has two reference images in tandem, that is, a forward reference image and a backward reference image.
  • the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
  • step 302 can include:
  • The position of the initial forward reference block is used as a first search starting point (indicated by (0, 0) in FIG. 8), and the positions of (N-1) candidate forward reference blocks are determined in the forward reference image;
  • the position of the initial backward reference block is used as a second search starting point, and the positions of (N-1) candidate backward reference blocks are determined in the backward reference image.
  • The positions of the N forward reference blocks include the position of the initial forward reference block (indicated by (0, 0)) and the positions of the (N-1) candidate forward reference blocks (indicated by (0, -1), (-1, -1), (-1, 1), (1, -1), (1, 1), etc.), where the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance (as shown in FIG. 8).
  • The accuracy of the MV may be fractional pixel precision (e.g., 1/2 pixel precision or 1/4 pixel precision). If the image contains only the pixel values of integer pixels and the accuracy of the current MV is fractional pixel precision, the pixel values at integer pixel positions of the reference image need to be interpolated by an interpolation filter to obtain the pixel values at sub-pixel positions as the values of the current block's prediction block.
  • the specific interpolation operation process is related to the interpolation filter used. Generally, the pixel value of the integer pixel point around the reference pixel point can be linearly weighted to obtain the value of the reference pixel point. Commonly used interpolation filters are 4 taps, 6 taps, 8 taps, and so on.
  • Ai,j is a pixel point at an integer pixel position, and its bit width is bitDepth.
  • a0,0, b0,0, c0,0, d0,0, h0,0, n0,0, e0,0, i0,0, p0,0, f0,0, j0,0, q0,0, g0,0, k0,0, and r0,0 are pixel points at sub-pixel positions. If an 8-tap interpolation filter is used, a0,0 can be calculated by the following formula:
  • a0,0 = (C0*A-3,0 + C1*A-2,0 + C2*A-1,0 + C3*A0,0 + C4*A1,0 + C5*A2,0 + C6*A3,0 + C7*A4,0) >> shift1
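  • The 8-tap filtering above can be sketched as follows. The description does not fix the coefficients C0..C7 or shift1, so the HEVC 8-tap quarter-pel luma filter is used purely as an example, with the shift chosen to normalize the filter gain:

```python
# Illustrative sub-pixel interpolation following
#   a0,0 = (C0*A[-3,0] + ... + C7*A[4,0]) >> shift1.
# The coefficients below are the HEVC quarter-pel luma filter, used only
# as an example; the patent text does not fix C0..C7 or shift1.

COEFFS = [-1, 4, -10, 58, 17, -5, 1, 0]  # filter gain = sum = 64
SHIFT1 = 6                               # normalizes the gain (2**6 = 64)

def interp_quarter_pel(row, x):
    """Filter the integer-pel samples row[x-3 .. x+4] into one sub-pel value."""
    acc = sum(c * row[x + k - 3] for k, c in enumerate(COEFFS))
    return acc >> SHIFT1

samples = [100] * 16                   # flat signal: interpolation keeps 100
print(interp_quarter_pel(samples, 8))  # 100
```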
  • Determining the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block; for each pair of reference blocks, the first position offset is in a mirror relationship with the second position offset, the first position offset indicating the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset indicating the positional offset of the position of the backward reference block relative to the position of the initial backward reference block, M being an integer greater than or equal to 1 and less than or equal to N.
  • The position of the candidate forward reference block 904 in the forward reference image Ref0 is shifted by MVD0 (delta0x, delta0y) relative to the position of the initial forward reference block 902 (i.e., the forward search base point).
  • The position of the candidate backward reference block 905 in the backward reference image Ref1 is shifted by MVD1 (delta1x, delta1y) relative to the position of the initial backward reference block 903 (i.e., the backward search base point).
  • MVD0 = -MVD1; that is, delta0x = -delta1x and delta0y = -delta1y.
  • step 303 can include:
  • From the positions of the M pairs of reference blocks, the positions of the pair of reference blocks with the smallest matching error are determined as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block; or, from the positions of the M pairs of reference blocks, the positions of a pair of reference blocks whose matching error is less than or equal to a matching error threshold are determined as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
  • step 304 the pixel value of the target forward reference block and the pixel value of the target backward reference block are weighted to obtain a predicted value of the pixel value of the current image block.
  • the method shown in FIG. 3 further includes: obtaining updated motion information of the current image block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, where The updated forward motion vector points to the location of the target forward reference block, and the updated backward motion vector points to the location of the target backward reference block.
  • The updated motion information of the current image block may be obtained based on the location of the target forward reference block, the location of the target backward reference block, and the location of the current image block, or may be obtained based on the first position offset and the second position offset corresponding to the determined position of the pair of reference blocks.
  • In the embodiment of the present application, the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks.
  • On the basis that the first position offset of the position of the forward reference block relative to the position of the initial forward reference block is in a mirror relationship with the second position offset of the position of the backward reference block relative to the position of the initial backward reference block, the position of a pair of reference blocks is determined (e.g., with the least matching cost) from the positions of the N pairs of reference blocks as the position of the target forward reference block (i.e., the best forward reference block/forward prediction block) of the current image block and the position of the target backward reference block (i.e., the best backward reference block/backward prediction block), such that the predicted value of the pixel value of the current image block is obtained based on the pixel value of the target forward reference block and the pixel value of the target backward reference block.
  • The method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using the template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process: the complexity of image prediction is reduced while its accuracy is improved.
  • FIG. 10 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
  • the method shown in FIG. 10 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 10 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 10 can occur in the interframe prediction process at the time of encoding and decoding.
  • the method shown in FIG. 10 includes steps 1001 to 1007, and the steps 1001 to 1007 are described in detail below.
  • a set of motion information is obtained from the merge candidate list according to the index of the merge, and the motion information is the initial motion information of the current block.
  • the MVP is obtained from the MVP candidate list according to the index of the AMVP
  • the MV of the current block is obtained by summing the MVP and the MVD included in the code stream.
  • The initial motion information includes reference image indication information and a motion vector; the forward reference image and the backward reference image are determined by the reference image indication information, and the position of the forward reference block and the position of the backward reference block are determined by the motion vector.
  • 1002 Determine, in a forward reference image, a location of a starting forward reference block of a current image block, where the location of the starting forward reference block is a search starting point (also referred to as a search base point) in the forward reference image;
  • a search base point (hereinafter referred to as a first search base point) in the forward reference image is obtained.
  • the forward MV information is (MV0x, MV0y).
  • the position information of the current block is (B0x, B0y).
  • the first search base point of the forward reference image is (MV0x+B0x, MV0y+B0y).
  • a search base point in the backward reference image (hereinafter referred to as a second search base point) is obtained.
  • the backward MV is (MV1x, MV1y).
  • the position information of the current block is (B0x, B0y).
  • the second search base point of the backward reference picture is (MV1x+B0x, MV1y+B0y).
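  • The base-point arithmetic above (search base point = motion vector + current block position, per component) can be sketched as follows, with illustrative values:

```python
# Sketch of deriving the forward/backward search base points from the
# initial MVs (MV0, MV1) and the current block position (B0x, B0y).
# All values are illustrative, not from the specification.

def search_base_point(mv, block_pos):
    """Base point = motion vector + current block position, per component."""
    return (mv[0] + block_pos[0], mv[1] + block_pos[1])

mv0, mv1 = (3, -1), (-3, 1)        # forward / backward motion vectors
b0 = (64, 32)                      # position of the current block
print(search_base_point(mv0, b0))  # (67, 31) -> first search base point
print(search_base_point(mv1, b0))  # (61, 33) -> second search base point
```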
  • The MVD mirror constraint here can be interpreted as follows: the block position in the forward reference image is shifted by MVD0 (delta0x, delta0y) relative to the forward search base point.
  • The block position in the backward reference image is shifted by MVD1 (delta1x, delta1y) relative to the backward search base point.
  • MVD0 = -MVD1; that is, delta0x = -delta1x and delta0y = -delta1y.
  • a motion search of an integer pixel step is performed starting from a search base point (indicated by (0, 0)).
  • The integer pixel step size means that the positional offset of the position of a candidate reference block relative to the search base point is an integer pixel distance. It should be pointed out that regardless of whether the search base point is at an integer pixel position (the starting point can be an integer pixel or a sub-pixel, such as 1/2, 1/4, 1/8, or 1/16 pixel), the motion search in integer pixel steps can be performed first to obtain the position of the forward reference block of the current image block. It should be understood that when searching in integer pixel steps, the search starting point can be either at an integer pixel or at a sub-pixel position, for example, an integer pixel, 1/2 pixel, 1/4 pixel, 1/8 pixel, or 1/16 pixel position, and so on.
  • the search point of 8 integer pixel steps around the search base point is searched with the (0, 0) point as the search base point, and the position of the corresponding candidate reference block is obtained.
  • Figure 7 illustrates eight candidate reference blocks; if the positional offset of the position of a forward candidate reference block relative to the position of the forward search base point in the forward reference image is (-1, -1), the positional offset of the position of the corresponding backward candidate reference block relative to the position of the backward search base point in the backward reference image is (1, 1). The positions of the paired forward and backward candidate reference blocks are obtained accordingly. For each obtained pair of reference block positions, the matching cost between the two corresponding candidate reference blocks is calculated. The forward reference block and the backward reference block with the smallest matching cost are taken as the optimal forward reference block and the optimal backward reference block, and the optimal forward motion vector and the optimal backward motion vector are obtained from them.
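  • The mirrored integer-pel search of step 1004 can be sketched as follows, using SAD as an illustrative matching cost. The block extraction, image sizes, candidate set, and all names are assumptions for illustration, not part of the specification:

```python
import numpy as np

# Sketch of the mirrored integer-pel search: for each offset (dx, dy)
# around the forward base point, the backward candidate uses the mirrored
# offset (-dx, -dy); the pair with the smallest SAD between the two
# candidate blocks wins. All names and sizes are illustrative.

OFFSETS = [(0, 0), (0, -1), (-1, -1), (-1, 1), (1, -1),
           (1, 1), (0, 1), (-1, 0), (1, 0)]   # base point + 8 neighbors

def get_block(img, pos, size=4):
    x, y = pos
    return img[y:y + size, x:x + size].astype(np.int64)

def mirrored_search(ref0, ref1, base0, base1):
    best = None
    for dx, dy in OFFSETS:
        fwd = get_block(ref0, (base0[0] + dx, base0[1] + dy))
        bwd = get_block(ref1, (base1[0] - dx, base1[1] - dy))  # mirror
        cost = int(np.abs(fwd - bwd).sum())                    # SAD
        if best is None or cost < best[0]:
            best = (cost, (dx, dy))
    return best  # (minimum SAD, winning forward offset)

rng = np.random.default_rng(0)
ref0 = rng.integers(0, 255, (32, 32))
ref1 = np.roll(ref0, (2, 2), axis=(0, 1))  # ref1 is ref0 shifted by (2, 2)
cost, offset = mirrored_search(ref0, ref1, (8, 8), (8, 8))
print(cost, offset)  # 0 (-1, -1): the mirrored pair matches exactly
```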
  • Steps 1005-1006: perform the motion compensation process using the optimal forward motion vector obtained in step 1004 to obtain the pixel value of the optimal forward reference block, and perform the motion compensation process using the optimal backward motion vector obtained in step 1004 to obtain the pixel value of the optimal backward reference block.
  • a predicted value of a pixel value of a current image block can be obtained according to formula (2).
  • predSamples'[x][y] = (predSamplesL0'[x][y] + predSamplesL1'[x][y] + 1) >> 1 (2)
  • predSamplesL0' is the optimal forward reference block
  • predSamplesL1' is the optimal backward reference block
  • predSamples' is the prediction block of the current image block
  • predSamplesL0'[x][y] is the pixel value of the optimal forward reference block at the pixel point (x, y)
  • predSamplesL1'[x][y] is the pixel value of the optimal backward reference block at the pixel point (x, y)
  • predSamples'[x][y] is the pixel value of the final prediction block at the pixel point (x, y)
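  • Formula (2) can be sketched directly: the final prediction block is the rounded average of the two optimal reference blocks. The 2x2 sample values below are illustrative:

```python
# Sketch of formula (2): per-pixel rounded average of the optimal forward
# (predSamplesL0') and backward (predSamplesL1') reference blocks.

def bi_average(pred_l0, pred_l1):
    h, w = len(pred_l0), len(pred_l0[0])
    return [[(pred_l0[y][x] + pred_l1[y][x] + 1) >> 1 for x in range(w)]
            for y in range(h)]

l0 = [[100, 102], [98, 96]]   # illustrative optimal forward block
l1 = [[104, 100], [97, 99]]   # illustrative optimal backward block
print(bi_average(l0, l1))     # [[102, 101], [98, 98]]
```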
  • The search method is not limited here; any search method may be used.
  • Calculate the difference between each backward candidate block and the corresponding forward candidate block in step 4, and select the pair with the minimum SAD: its backward candidate block and corresponding backward motion vector, and its forward candidate block and corresponding forward motion vector, are taken as the optimal backward reference block with the corresponding optimal backward motion vector and the optimal forward reference block with the corresponding optimal forward motion vector.
  • step 1004 only an example of an integer pixel step search method is given.
  • A fractional pixel step search can also be used. For example, after performing an integer pixel step search in step 1004, a fractional pixel step search is performed; or, a fractional pixel step search is performed directly.
  • the specific search method is not limited here.
  • the method for calculating the matching cost is not limited.
  • the SAD criterion may be used, or the MR-SAD criterion may be used, or other criteria may be used.
  • the traversal operation or the search operation may be terminated earlier.
  • the early termination condition of the search method is not limited.
  • The order of step 1005 and step 1006 is not limited, and they may be performed simultaneously or sequentially.
  • Rather than first calculating a template matching block and using it to perform the forward search and the backward search separately, the embodiment of the present application directly calculates the matching cost between the candidate blocks in the forward reference image and the candidate blocks in the backward reference image in the process of finding the matching blocks, and determines the two blocks with the smallest matching cost, which simplifies the image prediction process and reduces complexity while improving image prediction accuracy.
  • FIG. 11 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
  • the method shown in FIG. 11 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 11 includes steps 1101 to 1105, wherein steps 1101 to 1103, 1105 refer to the description of steps 1001 to 1003, 1007 in FIG. 10, and details are not described herein again.
  • the difference between the embodiment of the present application and the embodiment shown in FIG. 10 is that the pixel values of the current optimal forward reference block and the optimal backward reference block are retained and updated during the search process. After the search is completed, the predicted values of the pixel values of the current image block may be calculated using the pixel values of the current optimal forward reference block and the optimal backward reference block.
  • Costi is the matching cost obtained in the i-th search
  • MinCost represents the current minimum matching value
  • Bfi and Bbi are the pixel values of the forward reference block and the pixel values of the backward reference block obtained in the i-th search, respectively.
  • BestBf and BestBb are the pixel values of the current optimal forward reference block and the current optimal backward reference block, respectively.
  • CalCost(M,N) represents the matching cost of block M and block N.
  • Finally, BestBf and BestBb are used to obtain the predicted value of the pixel value of the current block.
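  • The keep-the-best bookkeeping described above (MinCost, BestBf, BestBb) can be sketched as follows, with SAD standing in for CalCost(M, N); all names and values are illustrative:

```python
# Sketch of the FIG. 11 variant: keep the pixel values of the current best
# forward/backward reference blocks while searching, so no extra motion
# compensation pass is needed afterwards. SAD stands in for CalCost(M, N).

def cal_cost(block_m, block_n):
    """Illustrative matching cost: sum of absolute differences."""
    return sum(abs(m - n) for m, n in zip(block_m, block_n))

def search_keep_best(candidate_pairs):
    min_cost, best_bf, best_bb = None, None, None
    for bf_i, bb_i in candidate_pairs:          # i-th forward/backward blocks
        cost_i = cal_cost(bf_i, bb_i)
        if min_cost is None or cost_i < min_cost:
            min_cost, best_bf, best_bb = cost_i, bf_i, bb_i
    # Predicted value: rounded average of BestBf and BestBb (formula (2)).
    return [(f + b + 1) >> 1 for f, b in zip(best_bf, best_bb)]

pairs = [([10, 20], [14, 26]), ([10, 20], [11, 21]), ([10, 20], [30, 0])]
print(search_keep_best(pairs))  # [11, 21]: second pair has the least SAD
```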
  • FIG. 12 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
  • the method shown in FIG. 12 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • The method shown in FIG. 12 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 12 can occur in the interframe prediction process at the time of encoding and decoding.
  • Process 1200 may be performed by video encoder 20 or video decoder 30, and in particular by the motion compensation unit of video encoder 20 or video decoder 30. Assuming that a video data stream having multiple video frames is being processed by a video encoder or video decoder, process 1200, comprising the following steps, is performed to predict the pixel values of a current image block of a current video frame;
  • steps 1201 to 1204 The method shown in FIG. 12 includes steps 1201 to 1204, wherein steps 1201, 1202, and 1204 refer to the description of steps 301, 302, and 304 in FIG. 3, and details are not described herein again.
  • Step 1203: based on the matching cost criterion, determine the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block; for each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on the time domain distance, the first position offset representing the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset representing the positional offset of the position of the backward reference block relative to the position of the initial backward reference block, M being an integer greater than or equal to 1 and less than or equal to N;
  • The position of the candidate forward reference block 1304 in the forward reference image Ref0 is shifted by MVD0 (delta0x, delta0y) relative to the position of the initial forward reference block 1302 (i.e., the forward search base point).
  • The position of the candidate backward reference block 1305 in the backward reference image Ref1 is shifted by MVD1 (delta1x, delta1y) relative to the position of the initial backward reference block 1303 (i.e., the backward search base point).
  • TC, T0, and T1 represent the time of the current frame, the time of the forward reference picture, and the time of the backward reference picture, respectively.
  • TD0, TD1 represents the time interval between two moments.
  • TD0 and TD1 can be calculated using picture order count (POC).
  • POC: picture order count
  • POCc, POC0, POC1 represent the POC of the current image, the POC of the forward reference image, and the POC of the backward reference image, respectively.
  • TD0 represents a picture order count (POC) distance between the current picture and the forward reference picture;
  • TD1 represents a POC distance between the current picture and the backward reference picture.
  • delta0x = (TD0/TD1) * delta1x and delta0y = (TD0/TD1) * delta1y; that is:
  • delta0x/delta1x = TD0/TD1
  • delta0y/delta1y = TD0/TD1
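  • The time-domain proportional relationship can be sketched as follows. The POC values are illustrative, and integer scaling is shown only for the case where the ratio divides exactly:

```python
# Sketch of the time-domain proportional constraint: the forward offset
# MVD0 is scaled from the backward offset MVD1 by the POC-distance ratio
# TD0/TD1. POC values below are illustrative.

def scale_mvd(mvd1, poc_cur, poc0, poc1):
    td0 = poc_cur - poc0   # POC distance to the forward reference image
    td1 = poc1 - poc_cur   # POC distance to the backward reference image
    # Integer scaling, shown for the exact-ratio case only.
    return (mvd1[0] * td0 // td1, mvd1[1] * td0 // td1)

# Current picture POC 8, forward reference POC 4, backward reference POC 10:
# TD0/TD1 = 4/2 = 2, so MVD1 = (1, -2) maps to MVD0 = (2, -4).
print(scale_mvd((1, -2), 8, 4, 10))  # (2, -4)
```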
  • In the embodiment of the present application, the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks.
  • On the basis that the position offset of the forward reference block relative to the position of the initial forward reference block and the position offset of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance, the position of a pair of reference blocks determined from the positions of the N pairs of reference blocks is taken as the position of the target forward reference block (i.e., the best forward reference block/forward prediction block) of the current image block and the position of the target backward reference block (i.e., the best backward reference block/backward prediction block), so that the predicted value of the pixel value of the current image block is obtained based on the pixel value of the target forward reference block and the pixel value of the target backward reference block.
  • The method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using the template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process: the complexity of image prediction is reduced while its accuracy is improved.
  • In the foregoing embodiments, the search process is performed once.
  • the method shown in FIG. 14 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 14 may occur in the encoding process, and may also occur in the decoding process. Specifically, the method shown in FIG. 14 may occur in an encoding process or an interframe prediction process at the time of decoding.
  • the method shown in FIG. 14 specifically includes steps 1401 to 1404, as follows:
  • the image block here may be one image block in the image to be processed, or may be one sub-image in the image to be processed.
  • the image block herein may be an image block to be encoded in the encoding process, or may be an image block to be decoded in the decoding process.
  • When i = 1, the i-th round motion information is the initial motion information of the current image block;
  • when i is greater than 1, the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
  • The initial motion information may include indication information of a prediction direction (usually bidirectional prediction), a motion vector pointing to a reference image block (usually a motion vector of a neighboring block), and image information of the reference image block (generally understood as reference image information), wherein the motion vector includes a forward motion vector and a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and the backward prediction reference image block.
  • the method may be performed in various manners. For example, the following manners 1 and 2 may be used to obtain initial motion information of the image block.
  • a candidate motion information list is constructed according to motion information of neighboring blocks of the current image block, and a candidate motion information is selected from the candidate motion information list as The initial motion information of the current image block.
  • the candidate motion information list includes a motion vector, reference frame index information, and the like.
• The motion information of the neighboring block A0 (see the candidate motion information with index 0 in FIG. 5) is selected as the initial motion information of the current image block; specifically, the forward motion vector of A0 is used as the forward motion vector of the current block, and the backward motion vector of A0 is used as the backward motion vector of the current block.
• Manner 2: a motion vector predictor list is constructed according to the motion information of neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as the motion vector predictor of the current image block.
• In this manner, the motion vector of the current image block may be the motion vector value of the selected neighboring block, or may be the sum of the motion vector of the selected neighboring block and the motion vector difference of the current image block, where the motion vector difference is the difference between the motion vector obtained by motion estimation of the current image block and the motion vector of the selected neighboring block.
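The relationship described above (predictor plus difference) can be sketched as follows. This is a hypothetical illustration, not part of this application; the function name, list contents, and values are assumptions made for the example.

```python
# Hypothetical sketch (not part of this application): in manner 2, the
# motion vector of the current block is the selected predictor (MVP)
# plus the motion vector difference (MVD). All names and values are
# illustrative.

def recover_mv(mvp_list, index, mvd):
    """Return the motion vector as MVP + MVD (component-wise)."""
    mvp = mvp_list[index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# A predictor list with two candidates; index 1 is selected and the
# code stream carries the motion vector difference (-1, 3).
mv = recover_mv([(4, -2), (6, 0)], 1, (-1, 3))  # (5, 3)
```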
  • the motion vectors corresponding to the indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block.
  • N is an integer greater than one.
• the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
  • step 1402 can include:
• The position of the (i-1)-th round target backward reference block is used as the backward search starting point to determine the positions of the (N-1) candidate backward reference blocks in the backward reference image.
• The positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block (indicated by (0, 0)) and the positions of (N-1) candidate forward reference blocks (indicated by (0, -1), (-1, -1), (-1, 1), (1, -1), (1, 1), etc.), where the positional offset of the position of each candidate forward reference block relative to the position of the (i-1)-th round target forward reference block is an integer pixel distance (as shown in the figure). The positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of (N-1) candidate backward reference blocks, where the positional offset of the position of each candidate backward reference block relative to the position of the (i-1)-th round target backward reference block is an integer pixel distance.
• A position of a pair of reference blocks is determined, based on a matching cost criterion, from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block; for each pair of reference block positions, the first position offset is in a mirror relationship with the second position offset, the first position offset indicating a positional offset of the position of the forward reference block relative to the position of the (i-1)-th round target forward reference block, and the second position offset indicating a positional offset of the position of the backward reference block relative to the position of the (i-1)-th round target backward reference block; M is an integer greater than or equal to 1, and M is less than or equal to N.
• That the first position offset is in a mirror relationship with the second position offset can be understood as follows: the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset.
• The position of the candidate backward reference block 905 in the backward reference image Ref1 is offset by MVD1 (delta1x, delta1y) with respect to the position of the (i-1)-th round target backward reference block 903 (i.e., the backward search base point).
• MVD0 = -MVD1; that is: delta0x = -delta1x, delta0y = -delta1y.
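The mirror constraint above can be sketched in code. This is a hypothetical illustration, not part of this application; the base points and offsets are made-up values used only to show that each pair of offsets has opposite direction and equal magnitude.

```python
# Hypothetical sketch (not part of this application): generating pairs
# of candidate reference-block positions that satisfy the mirror
# constraint MVD0 = -MVD1 around the two search base points.

def mirror_pairs(fwd_base, bwd_base, offsets):
    """Pair each forward offset (dx, dy) with the backward offset
    (-dx, -dy): opposite direction, equal magnitude."""
    return [((fwd_base[0] + dx, fwd_base[1] + dy),
             (bwd_base[0] - dx, bwd_base[1] - dy))
            for dx, dy in offsets]

pairs = mirror_pairs((10, 10), (20, 20), [(0, 0), (1, -1)])
# pairs[1] == ((11, 9), (19, 21))
```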
  • step 1403 can include:
• From the positions of the M pairs of reference blocks, the position of the pair of reference blocks with the smallest matching error is determined as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block; or, from the positions of the M pairs of reference blocks, the position of a pair of reference blocks whose matching error is less than or equal to the matching error threshold is determined as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block.
• In step 1404, the pixel value of the target forward reference block and the pixel value of the target backward reference block are weighted to obtain a predicted value of the pixel value of the current image block.
  • the predicted value of the pixel value of the current image block may be obtained according to other methods, which is not limited in this application.
• The initial motion information is updated to the second round motion information, wherein the second round motion information includes: a forward motion vector pointing to the position of the first round target forward reference block and a backward motion vector pointing to the position of the first round target backward reference block, ..., which makes it possible to effectively predict other image blocks according to this image block when performing the next image prediction.
• The positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. On the basis that the first position offset of the position of the forward reference block relative to the position of the initial forward reference block is in a mirror relationship with the second position offset of the position of the backward reference block relative to the position of the initial backward reference block, the position of a pair of reference blocks determined (for example, with the least matching cost) from the positions of the N pairs of reference blocks is the position of the target forward reference block (i.e., the best forward reference block/forward prediction block) of the current image block and the position of the target backward reference block (i.e., the best backward reference block/backward prediction block), such that the predicted value of the pixel value of the current image block is obtained based on the pixel value of the target forward reference block and the pixel value of the target backward reference block.
• The method in the embodiment of the present application avoids the calculation process of pre-computing a template matching block and avoids the process of using the template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process; while the accuracy of image prediction is improved, the complexity of image prediction is reduced.
  • increasing the number of iterations can further improve the accuracy of the modified MV, thereby further improving the codec performance.
  • the flow of the image prediction method in the embodiment of the present application will be described in detail below with reference to FIG. 15.
  • the method shown in FIG. 15 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 15 may occur in the encoding process, or may occur in the decoding process. Specifically, the method shown in FIG. 15 may occur in an encoding process or an interframe prediction process at the time of decoding.
  • the method shown in FIG. 15 specifically includes steps 1501 to 1508, and step 1501 to step 1508 are described in detail below.
• If it is the first round of the search, the initial motion information of the current block is used. For example, for an image block whose encoding mode is merge, the motion information is obtained from the merge candidate list according to the merge index, and that motion information is the initial motion information of the current block. For another example, for an image block whose encoding mode is AMVP, the MVP is obtained from the MVP candidate list according to the AMVP index, and the MV of the current block is obtained by summing the MVP and the MVD carried in the code stream. If it is not the first round of the search, the MV information updated in the previous round is used.
  • the motion information includes reference image indication information and motion vector information.
• The forward reference image and the backward reference image are determined according to the reference image indication information.
  • the position of the forward reference block and the position of the backward reference block are determined by the motion vector information.
  • the search base point in the forward reference image is determined based on the forward MV information and the position information of the current block. It is specifically similar to the process of the embodiment of FIG. 10 or 11. For example, if the forward MV information is (MV0x, MV0y) and the position information of the current block is (B0x, B0y), the search base point in the forward reference image is (MV0x+B0x, MV0y+B0y).
  • the search base point in the backward reference image is determined based on the backward MV information and the position information of the current block. It is specifically similar to the process of the embodiment of FIG. 10 or 11. For example, if the backward MV information is (MV1x, MV1y) and the position information of the current block is (B0x, B0y), the search base point of the backward reference picture is (MV1x+B0x, MV1y+B0y).
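The base-point computations above can be sketched as one small function. This is a hypothetical illustration, not part of this application; the motion vectors and block position are made-up values.

```python
# Hypothetical sketch (not part of this application): the search base
# point is the motion vector added component-wise to the position of
# the current block, as in (MV0x + B0x, MV0y + B0y) above.

def search_base_point(mv, block_pos):
    """Return (MVx + B0x, MVy + B0y)."""
    return (mv[0] + block_pos[0], mv[1] + block_pos[1])

fwd_base = search_base_point((3, -2), (16, 16))   # forward: (19, 14)
bwd_base = search_base_point((-3, 2), (16, 16))   # backward: (13, 18)
```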
  • the specific search step is similar to the process of the embodiment of FIG. 10 or 11, and will not be described again here.
• Step 1505: determine whether the iteration termination condition is reached. If not, steps 1502 and 1503 are performed again; otherwise, steps 1506 and 1507 are performed.
  • L is a preset value and L is an integer greater than 1.
• L can be set in advance before the image is predicted.
• The value of L can also be set according to the required accuracy of the image prediction and the complexity of searching for the prediction block.
• L can also be set according to historical experience values, or L can be determined based on verification of the results in the intermediate search process.
• For example, a total of 2 searches are performed in integer pixel steps. In the first search, the position of the initial forward reference block is used as the search base point in the forward reference image (also referred to as the forward reference region) to determine the positions of (N-1) candidate forward reference blocks, and the position of the initial backward reference block is used as the search base point in the backward reference image (also referred to as the backward reference region) to determine the positions of (N-1) candidate backward reference blocks. The matching cost of the corresponding two reference blocks is then calculated for one or more pairs of reference block positions among the positions of the N pairs of reference blocks; for example, the matching cost of the initial forward reference block and the initial backward reference block is calculated, and the matching cost of one candidate forward reference block and one candidate backward reference block satisfying the MVD mirror constraint is calculated. The positions of the first round target forward reference block and the first round target backward reference block of the first search are thereby obtained, and updated motion information is obtained accordingly, including: a forward motion vector indicating that the current image block points to the position of the first round target forward reference block and a backward motion vector indicating that the current image block points to the position of the first round target backward reference block.
• Other information such as the reference frame index in the updated motion information is the same as that in the initial motion information.
• Then a second search is performed: the position of the first round target forward reference block is used as the search base point to determine the positions of (N-1) candidate forward reference blocks in the forward reference image (also referred to as the forward reference region), and the position of the first round target backward reference block is used as the search base point to determine the positions of (N-1) candidate backward reference blocks in the backward reference image (also referred to as the backward reference region). The matching cost of the corresponding two reference blocks is calculated for one or more pairs of reference block positions among the positions of the N pairs of reference blocks; for example, the matching cost of the first round target forward reference block and the first round target backward reference block is calculated, and the matching cost of a candidate forward reference block and a candidate backward reference block satisfying the MVD mirror constraint is calculated. The positions of the second round target forward reference block and the second round target backward reference block of the second search are thereby obtained, and updated motion information is obtained accordingly, including: a forward motion vector indicating that the current image block points to the position of the second round target forward reference block and a backward motion vector indicating that the current image block points to the position of the second round target backward reference block.
• Other information in the updated motion information, such as the reference frame index, is the same as that in the initial motion information.
• Since L = 2, the second round target forward reference block and the second round target backward reference block are the final target forward reference block and target backward reference block, also known as the optimal forward reference block and the optimal backward reference block.
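The L-round search described above can be sketched as an iterative loop. This is a hypothetical illustration, not part of this application: the matching cost between a forward and a backward reference block is abstracted into a toy cost function (in the method above it would be the matching cost of the two blocks), and every name and value is an assumption.

```python
# Hypothetical sketch (not part of this application): an L-round
# integer-pel search over mirror-constrained candidate pairs.

def refine(fwd, bwd, pair_cost, rounds=2):
    """Each round evaluates mirror-constrained candidate pairs around
    the current base points and keeps the pair with the smallest cost
    as the next round's search base points."""
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    for _ in range(rounds):
        fwd, bwd = min(
            (((fwd[0] + dx, fwd[1] + dy), (bwd[0] - dx, bwd[1] - dy))
             for dx, dy in offsets),
            key=lambda pair: pair_cost(*pair))
    return fwd, bwd

# Toy cost that is smallest when the forward block sits at (3, 3).
toy_cost = lambda f, b: abs(f[0] - 3) + abs(f[1] - 3)
best_fwd, best_bwd = refine((1, 1), (5, 5), toy_cost)  # ((3, 3), (3, 3))
```

With two rounds and a step of one pixel, the base points move at most two pixels in total, which is why increasing the number of iterations can further refine the result.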
• Steps 1506 to 1507: the motion compensation process is performed using the optimal forward motion vector obtained in step 1504 to obtain the pixel value of the optimal forward reference block, and the motion compensation process is performed using the optimal backward motion vector obtained in step 1504 to obtain the pixel value of the optimal backward reference block.
• A search (also referred to as a motion search) may be performed in a full pixel step size when searching in the forward reference image or the backward reference image to obtain the position of at least one forward reference block and the position of at least one backward reference block.
  • the search starting point can be either full pixels or sub-pixels, for example, integer pixels, 1/2 pixels, 1/4 pixels, 1/8 pixels, and 1/16 pixels, and the like.
• The search may be performed directly in a sub-pixel step, or both the full pixel step search and the sub-pixel step search may be performed. This application does not limit the search method.
• In step 1504, for each pair of reference block positions, when calculating the difference between the pixel value of the corresponding forward reference block and the pixel value of the corresponding backward reference block, SAD, SATD, the sum of absolute squared differences, or the like may be used to measure the difference between the pixel value of each forward reference block and the pixel value of the corresponding backward reference block; however, the application is not limited thereto.
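As a hedged illustration of one of the measures named above, SAD between two equally sized blocks can be sketched as follows; the block contents are made-up values, and this is not part of this application.

```python
# Hypothetical sketch (not part of this application): SAD, the sum of
# absolute differences over all co-located samples of two blocks.

def sad(block_a, block_b):
    """Sum of |a - b| over every pair of co-located samples."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

cost = sad([[1, 2], [3, 4]], [[1, 1], [1, 1]])  # 0 + 1 + 2 + 3 = 6
```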
• The pixel value of the optimal forward reference block and the pixel value of the optimal backward reference block obtained in step 1506 and step 1507 may be weighted, and the pixel value obtained by the weighting process is used as the predicted value of the pixel value of the current image block.
  • the predicted value of the pixel value of the current image block can be obtained according to formula (8).
• predSamples'[x][y] = (predSamplesL0'[x][y] + predSamplesL1'[x][y] + 1) >> 1    (8)
• where predSamplesL0'[x][y] is the pixel value of the optimal forward reference block at pixel point (x, y), predSamplesL1'[x][y] is the pixel value of the optimal backward reference block at pixel point (x, y), and predSamples'[x][y] is the predicted pixel value of the current image block at pixel point (x, y).
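Formula (8) can be sketched sample by sample as follows; this is a hypothetical illustration, not part of this application, and the sample values are made up.

```python
# Hypothetical sketch (not part of this application): formula (8)
# applied per sample, averaging the forward and backward predictions
# with rounding via the right shift.

def bi_predict(pred_l0, pred_l1):
    """predSamples'[x][y] = (L0[x][y] + L1[x][y] + 1) >> 1"""
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(pred_l0, pred_l1)]

out = bi_predict([[10, 20]], [[13, 20]])  # [[12, 20]]
```

The `+ 1` before the shift implements round-half-up on the integer average.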
• The pixel value of the current optimal forward reference block and the pixel value of the current optimal backward reference block may also be retained and updated during the search; after the search ends, the predicted value of the pixel value of the current image block is directly calculated using the pixel values of the current optimal forward reference block and the current optimal backward reference block.
  • steps 1506 and 1507 are optional steps.
• where Costi is the matching cost of the i-th pair of reference blocks, MinCost represents the current minimum matching cost, Bfi and Bbi are respectively the pixel values of the forward reference block and of the backward reference block obtained in the i-th calculation, BestBf and BestBb are respectively the pixel values of the current optimal forward reference block and of the current optimal backward reference block, and CalCost(M, N) represents the matching cost of block M and block N.
• After the search ends, BestBf and BestBb are used to obtain the predicted value of the pixel value of the current block.
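The MinCost/BestBf/BestBb bookkeeping above can be sketched as follows. This is a hypothetical illustration, not part of this application; the toy blocks are scalars and CalCost is reduced to an absolute difference.

```python
# Hypothetical sketch (not part of this application): keep the pair of
# reference blocks with the smallest matching cost while the search runs.

def track_best(pairs, cal_cost):
    """Return (BestBf, BestBb, MinCost) after scanning all pairs."""
    min_cost, best_bf, best_bb = float('inf'), None, None
    for bf, bb in pairs:
        cost_i = cal_cost(bf, bb)     # Costi = CalCost(Bfi, Bbi)
        if cost_i < min_cost:         # Costi < MinCost
            min_cost, best_bf, best_bb = cost_i, bf, bb
    return best_bf, best_bb, min_cost

bf, bb, c = track_best([(5, 1), (3, 3), (7, 0)], lambda m, n: abs(m - n))
# bf == 3, bb == 3, c == 0
```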
  • the search process is performed once.
  • the flow of the image prediction method 1600 of the embodiment of the present application will be described in detail below with reference to FIG.
  • the method shown in FIG. 16 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 16 may occur in the encoding process, or may occur in the decoding process. Specifically, the method shown in FIG. 16 may occur in an encoding process or an interframe prediction process at the time of decoding.
  • the method 1600 shown in FIG. 16 includes steps 1601 through 1604, wherein steps 1601, 1602, and 1604 refer to the description of steps 1401, 1402, and 1404 in FIG. 14, and are not described herein again.
• Step 1603: determining, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block; for each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on the time domain distance, the first position offset representing a positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset representing a positional offset of the position of the backward reference block relative to the position of the initial backward reference block; M is an integer greater than or equal to 1, and M is less than or equal to N.
• The position of the candidate forward reference block 1304 in the forward reference image Ref0 is offset by MVD0 (delta0x, delta0y) with respect to the position of the initial forward reference block 1302 (i.e., the forward search base point), and the position of the candidate backward reference block 1305 in the backward reference image Ref1 is offset by MVD1 (delta1x, delta1y) with respect to the position of the initial backward reference block 1303 (i.e., the backward search base point).
  • TC, T0, and T1 represent the time of the current frame, the time of the forward reference picture, and the time of the backward reference picture, respectively.
• TD0 and TD1 represent the time intervals between the corresponding two moments.
  • TD0 and TD1 can be calculated using picture order count (POC).
  • POCc, POC0, POC1 represent the POC of the current image, the POC of the forward reference image, and the POC of the backward reference image, respectively.
  • TD0 represents a picture order count (POC) distance between the current picture and the forward reference picture;
  • TD1 represents a POC distance between the current picture and the backward reference picture.
• delta0x = (TD0/TD1) * delta1x and delta0y = (TD0/TD1) * delta1y; that is, delta0x/delta1x = TD0/TD1 and delta0y/delta1y = TD0/TD1.
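The proportional relationship above can be sketched as follows. This is a hypothetical illustration, not part of this application; the sign convention TD0 = POCc - POC0, TD1 = POC1 - POCc is an assumption made for the example.

```python
# Hypothetical sketch (not part of this application): scale the backward
# offset MVD1 into the forward offset MVD0 in proportion to the POC
# distances, delta0 = (TD0 / TD1) * delta1.

def scale_mvd(delta1, poc_c, poc0, poc1):
    td0 = poc_c - poc0   # POC distance: current image to forward reference
    td1 = poc1 - poc_c   # POC distance: current image to backward reference
    return (delta1[0] * td0 / td1, delta1[1] * td0 / td1)

mvd0 = scale_mvd((2, -4), 2, 0, 6)  # TD0 = 2, TD1 = 4 -> (1.0, -2.0)
```

When TD0 equals TD1 the scaling degenerates to equal magnitudes, which matches the purely mirror-based variant described earlier.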
  • step 1603 can include:
• From the positions of the M pairs of reference blocks, the position of the pair of reference blocks with the smallest matching error is determined as the position of the target forward reference block of the current image block and the position of the target backward reference block; or, from the positions of the M pairs of reference blocks, the position of a pair of reference blocks whose matching error is less than or equal to the matching error threshold is determined as the position of the target forward reference block of the current image block and the position of the target backward reference block.
• The positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. For each of the positions of the N pairs of reference blocks, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance. On this basis, the position of a pair of reference blocks determined (for example, with the smallest matching cost) from the positions of the N pairs of reference blocks is the position of the target forward reference block of the current image block and the position of the target backward reference block, such that the predicted value of the pixel value of the current image block is obtained based on the pixel value of the target forward reference block and the pixel value of the target backward reference block.
• The method in the embodiment of the present application avoids the calculation process of pre-computing a template matching block and avoids the process of using the template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process; while the accuracy of image prediction is improved, the complexity of image prediction is reduced.
  • increasing the number of iterations can further improve the accuracy of the modified MV, thereby further improving the codec performance.
  • the flow of the image prediction method in the embodiment of the present application will be described in detail below with reference to FIG.
  • the method shown in FIG. 17 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 17 may occur in the encoding process, or may occur in the decoding process. Specifically, the method shown in FIG. 17 may occur in an encoding process or an interframe prediction process at the time of decoding.
  • the method shown in FIG. 17 includes steps 1701 to 1708, wherein steps 1701 to 1703, 1705 to 1708 refer to the description of steps 1501 to 1503, 1505 to 1508 in FIG. 15, and details are not described herein again.
• The constraint on the MVD based on the time domain distance can be interpreted as follows: the positional offset MVD0 (delta0x, delta0y) of the block position in the forward reference image relative to the forward search base point and the positional offset MVD1 (delta1x, delta1y) of the block position in the backward reference image relative to the backward search base point satisfy the following relationship:
  • TC, T0, and T1 represent the time of the current image, the time of the forward reference image, and the time of the backward reference image, respectively.
• TD0 and TD1 represent the time intervals between the corresponding two moments.
  • TD0 and TD1 can be calculated using picture order count (POC).
  • POCc, POC0, POC1 represent the POC of the current image, the POC of the forward reference image, and the POC of the backward reference image, respectively.
  • TD0 represents a picture order count (POC) distance between the current picture and the forward reference picture;
  • TD1 represents a POC distance between the current picture and the backward reference picture.
• delta0x = (TD0/TD1) * delta1x and delta0y = (TD0/TD1) * delta1y; that is, delta0x/delta1x = TD0/TD1 and delta0y/delta1y = TD0/TD1.
  • the specific search step is similar to the process of the embodiment of FIG. 10 or 11, and will not be described again here.
• The mirror relationship may either take the time domain interval into account or not take it into account.
• Whether the mirror relationship considers the time domain interval can be adaptively selected for the current frame or the current block when performing motion vector correction.
• Indication information may be added in sequence level header information (SPS), picture level header information (PPS), a slice header, or block code stream information to indicate whether the mirror relationship used by the current sequence, the current picture, the current slice, or the current block considers the time interval.
  • the current block adaptively determines whether the mirror relationship used by the current block takes into account the time interval according to the POC of the forward reference image and the POC of the backward reference image.
  • Max(A, B) represents the larger value in A and B
  • Min(A, B) represents the smaller value in A and B.
• When the ratio of Max(TD0, TD1) to Min(TD0, TD1) is greater than R, the mirror relationship to be used needs to consider the time interval; otherwise, the time interval is not considered, where R is the preset threshold.
  • the specific R value is not limited herein.
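One possible adaptive rule built from Max, Min, and the preset threshold R above can be sketched as follows. This is a hypothetical illustration, not part of this application; the exact condition and the value of R are assumptions.

```python
# Hypothetical sketch (not part of this application): decide whether the
# mirror relationship should take the time-domain interval into account.

def consider_time_interval(td0, td1, r=2.0):
    """Consider the time-domain interval when the larger POC distance
    exceeds the smaller one by more than the factor R."""
    return max(td0, td1) / min(td0, td1) > r

flag_a = consider_time_interval(1, 4)   # very unequal distances
flag_b = consider_time_interval(2, 2)   # symmetric distances
```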
  • the image prediction method of the embodiments of the present application may be specifically performed by an encoder (eg, encoder 20) or a motion compensation module in a decoder (eg, decoder 30). Additionally, the image prediction method of embodiments of the present application can be implemented in any electronic device or device that requires encoding and/or decoding of a video image.
• FIG. 18 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 1800 is applicable both to inter prediction when decoding a video image and to inter prediction when encoding a video image. It should be understood that the prediction apparatus 1800 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction apparatus 1800 may include:
  • the first obtaining unit 1801 is configured to acquire initial motion information of the current image block.
• a first searching unit 1802, configured to determine the positions of N forward reference blocks and the positions of N backward reference blocks based on the initial motion information and the position of the current image block, where the N forward reference blocks are located in the forward reference image, the N backward reference blocks are located in the backward reference image, and N is an integer greater than 1; and to determine, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, where the first position offset is in a mirror relationship with the second position offset, the first position offset representing a positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset representing a positional offset of the position of the backward reference block relative to the position of the initial backward reference block;
  • the first prediction unit 1803 is configured to obtain a predicted value of a pixel value of the current image block based on a pixel value of the target forward reference block and a pixel value of the target backward reference block.
• That the first position offset is in a mirror relationship with the second position offset can be understood as follows: the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset.
• The first prediction unit 1803 is further configured to obtain updated motion information of the current image block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, wherein the updated forward motion vector points to the location of the target forward reference block and the updated backward motion vector points to the location of the target backward reference block.
  • the motion vector of the image block is updated, so that other image blocks can be effectively predicted according to the image block when the next image prediction is performed.
• The locations of the N forward reference blocks include the location of an initial forward reference block and the locations of (N-1) candidate forward reference blocks, where the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or
• the positions of the N backward reference blocks include the position of an initial backward reference block and the positions of (N-1) candidate backward reference blocks, where the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
  • the initial motion information includes a first motion vector and a first reference image index in a forward prediction direction, and a second motion vector and a second reference image index in a backward prediction direction;
  • the first searching unit is specifically configured to:
• The position of a pair of reference blocks is determined from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block based on the matching cost criterion.
  • the first search unit 1802 is specifically configured to:
• determining the position of a pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
  • the foregoing apparatus 1800 may perform the foregoing methods shown in FIG. 3, FIG. 10, and FIG. 11, and the apparatus 1800 may specifically be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
  • the apparatus 1800 can be used for both image prediction during encoding and image prediction during decoding.
• The positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. For each of the positions of the N pairs of reference blocks, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block is in a mirror relationship with the second position offset of the position of the backward reference block relative to the position of the initial backward reference block. On this basis, the position of a pair of reference blocks determined (for example, with the least matching cost) from the positions of the N pairs of reference blocks is the position of the target forward reference block of the current image block and the position of the target backward reference block, and the predicted value of the pixel value of the current image block is obtained by using the pixel value of the target forward reference block and the pixel value of the target backward reference block.
  • the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids performing forward search matching and backward search matching separately with the template matching block, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
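  • The mirror-relationship search described above can be illustrated with a minimal Python sketch. This is not the patented implementation; the function names, the use of SAD as the matching cost, and the candidate-offset list are all illustrative assumptions. The key point it shows is that each candidate pair is formed by offsetting the forward position by d and the backward position by -d, and the pair with the least matching cost is selected.

```python
# Illustrative sketch only: a mirror-offset pair search with SAD matching cost.
# All names (mirror_search, sad, offsets) are hypothetical, not from the application.
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences, used here as the matching cost."""
    return int(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64)).sum())

def mirror_search(fwd_ref, bwd_ref, fwd_pos, bwd_pos, block_size, offsets):
    """For each candidate offset d = (dx, dy), compare the forward block at
    fwd_pos + d with the backward block at bwd_pos - d (mirror relationship)
    and return the pair of positions with the least matching cost."""
    h, w = block_size
    best_cost, best_pair = None, None
    for dx, dy in offsets:
        fx, fy = fwd_pos[0] + dx, fwd_pos[1] + dy   # forward candidate position
        bx, by = bwd_pos[0] - dx, bwd_pos[1] - dy   # mirrored backward position
        fwd_block = fwd_ref[fy:fy + h, fx:fx + w]
        bwd_block = bwd_ref[by:by + h, bx:bx + w]
        cost = sad(fwd_block, bwd_block)
        if best_cost is None or cost < best_cost:
            best_cost, best_pair = cost, ((fx, fy), (bx, by))
    return best_pair, best_cost
```

  • Because a single cost is computed per candidate pair, this replaces the two independent template-matching searches (forward and backward) that the summary above says are avoided.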
  • FIG. 19 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 1900 is applicable to both inter prediction for decoding a video image and inter prediction for encoding a video image. It should be understood that the prediction apparatus 1900 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction apparatus 1900 may include:
  • a second acquiring unit 1901 configured to acquire initial motion information of a current image block
  • a second searching unit 1902, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current image block, where the N forward reference blocks are located in a forward reference image, the N backward reference blocks are located in a backward reference image, and N is an integer greater than 1; and, based on a matching cost criterion, determine the position of one pair of reference blocks from the positions of M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, a first position offset and a second position offset have a proportional relationship based on a time domain distance, the first position offset indicating the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset indicating the positional offset of the position of the backward reference block relative to the position of the initial backward reference block;
  • the second prediction unit 1903 is configured to obtain a predicted value of a pixel value of the current image block based on a pixel value of the target forward reference block and a pixel value of the target backward reference block.
  • the first position offset and the second position offset have a proportional relationship based on the time domain distance, which can be understood as:
  • the proportional relationship between the first position offset and the second position offset is determined based on a proportional relationship between a first time domain distance and a second time domain distance, where the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
  • the first position offset and the second position offset have a proportional relationship based on a time domain distance, and may include:
  • the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
  • the direction of the first position offset is opposite to the direction of the second position offset, and the ratio of the magnitude of the first position offset to the magnitude of the second position offset is based on the proportional relationship between the first time domain distance and the second time domain distance;
  • the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image; and the second time domain distance represents the time domain distance between the current image and the backward reference image.
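  • The time-domain proportional relationship above can be sketched in a few lines of Python. This is an illustrative assumption of how the scaling could be computed (using picture order counts, POC, as the time-domain distances, and exact fractions so that fractional-pixel offsets are representable); the function name and parameters are hypothetical.

```python
# Illustrative sketch only: derive the backward offset from a forward offset,
# with opposite direction and magnitude scaled by the ratio of time-domain
# (POC) distances. Names are hypothetical, not from the application.
from fractions import Fraction

def scaled_backward_offset(fwd_offset, poc_cur, poc_fwd, poc_bwd):
    """Given the forward position offset, return the backward position offset:
    opposite in direction, with magnitude scaled by td1/td0, where td0 is the
    current-to-forward-reference distance and td1 the current-to-backward one."""
    td0 = poc_cur - poc_fwd   # first time domain distance
    td1 = poc_bwd - poc_cur   # second time domain distance
    scale = Fraction(td1, td0)
    dx, dy = fwd_offset
    return (-dx * scale, -dy * scale)
```

  • When the two time-domain distances are equal, the scale is 1 and the relationship degenerates to the pure mirror case described earlier (opposite direction, same magnitude).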
  • the second prediction unit 1903 is further configured to obtain updated motion information of the current image block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
  • the embodiment of the present application can obtain corrected motion information of the current image block and improve the accuracy of the motion information of the current image block, which is also beneficial to the prediction of other image blocks, for example, improving the prediction accuracy of the motion information of other image blocks.
  • the positions of the N forward reference blocks include a position of an initial forward reference block and positions of (N-1) candidate forward reference blocks, and the positional offset of each candidate forward reference block's position relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or
  • the positions of the N backward reference blocks include a position of an initial backward reference block and positions of (N-1) candidate backward reference blocks, and the positional offset of each candidate backward reference block's position relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
  • the initial motion information includes forward predicted motion information and backward predicted motion information
  • the second search unit 1902 is specifically configured to:
  • determine the positions of the N backward reference blocks, which include the position of the initial backward reference block and the positions of the (N-1) candidate backward reference blocks, where the positional offset of each candidate backward reference block's position relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
  • the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
  • the second search unit is specifically configured to:
  • determine, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block.
  • the second searching unit 1902 is specifically configured to:
  • determine the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
  • the matching cost criterion is a criterion of minimizing the matching cost. For example, for the positions of the M pairs of reference blocks, the difference between the pixel values of the forward reference block and the pixel values of the backward reference block in each pair of reference blocks is calculated; from the positions of the M pairs of reference blocks, the position of the pair of reference blocks with the smallest pixel value difference is determined as the position of the forward target reference block of the current image block and the position of the backward target reference block.
  • the matching cost criterion is a matching cost and early termination criterion. For example, for the positions of the n-th pair of reference blocks (one forward reference block and one backward reference block), the difference between the pixel values of the forward reference block and the pixel values of the backward reference block is calculated, where n is an integer greater than or equal to 1 and less than or equal to N; when the pixel value difference is less than or equal to the matching error threshold, the positions of the n-th pair of reference blocks (one forward reference block and one backward reference block) are determined as the position of the forward target reference block of the current image block and the position of the backward target reference block.
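  • The two criteria above (minimum matching cost, and matching cost with early termination) can be combined in one small sketch. This is an illustrative Python rendering, not the application's implementation; the function name, the generic `cost_fn` callback, and the fallback-to-minimum behavior when no pair meets the threshold are assumptions.

```python
# Illustrative sketch only: scan candidate reference-block pairs in order and
# stop as soon as one pair's matching error is <= threshold (early
# termination); otherwise return the minimum-cost pair.
def search_with_early_termination(pairs, cost_fn, threshold):
    """pairs: iterable of candidate reference-block pairs (opaque objects).
    cost_fn: callable returning the matching error of a pair.
    Returns (pair, cost)."""
    best = None
    for pair in pairs:
        cost = cost_fn(pair)
        if cost <= threshold:
            return pair, cost          # early termination criterion
        if best is None or cost < best[1]:
            best = (pair, cost)        # track the minimum-cost pair
    return best
```

  • Early termination means that often only M of the N candidate pairs need their cost evaluated (M less than or equal to N), which matches the claim language above.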
  • the second acquiring unit 1901 is configured to acquire the initial motion information from a candidate motion information list of the current image block, or acquire the initial motion information according to indication information, where the indication information is used to indicate the initial motion information of the current image block. It should be understood that the motion information is "initial" relative to the corrected motion information.
  • the above apparatus 1900 can perform the above-described method shown in FIG. 12, and the apparatus 1900 can be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
  • the apparatus 1900 can be used for both image prediction during encoding and image prediction during decoding.
  • the positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks located in the backward reference image form the positions of N pairs of reference blocks. For each of the positions of the N pairs of reference blocks, the position of the forward reference block has a first position offset relative to the position of the initial forward reference block, the position of the backward reference block has a second position offset relative to the position of the initial backward reference block, and the first position offset and the second position offset have a proportional relationship based on the time domain distance. On this basis, the position of one pair of reference blocks determined from the positions of the N pairs of reference blocks (for example, the pair with the least matching cost) is used as the position of the target forward reference block (that is, the best forward reference block/forward prediction block) of the current image block and the position of the target backward reference block (that is, the best backward reference block/backward prediction block), so that a predicted value of the pixel values of the current image block is obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
  • the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids performing forward search matching and backward search matching separately with the template matching block, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
  • FIG. 20 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 2000 is applicable to both inter prediction for decoding a video image and inter prediction for encoding a video image. It should be understood that the prediction apparatus 2000 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction apparatus 2000 may include:
  • a third acquiring unit 2001, configured to acquire the i-th round motion information of the current image block;
  • a third search unit 2002, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and the position of the current image block, where the N forward reference blocks are located in the forward reference image, the N backward reference blocks are located in the backward reference image, and N is an integer greater than 1; and, based on a matching cost criterion, determine the position of one pair of reference blocks from the positions of M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, a first position offset is in a mirror relationship with a second position offset, the first position offset indicating the positional offset of the position of the forward reference block relative to the position of the (i-1)-th round target forward reference block, and the second position offset indicating the positional offset of the position of the backward reference block relative to the position of the (i-1)-th round target backward reference block;
  • a third prediction unit 2003, configured to obtain a predicted value of the pixel values of the current image block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are integers greater than or equal to 1.
  • the positions of the N forward reference blocks include a position of an initial forward reference block and positions of (N-1) candidate forward reference blocks, and the positional offset of each candidate forward reference block's position relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or
  • the positions of the N backward reference blocks include a position of an initial backward reference block and positions of (N-1) candidate backward reference blocks, and the positional offset of each candidate backward reference block's position relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
  • the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block; correspondingly, the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and positions of (N-1) candidate forward reference blocks, where the positional offset of each candidate forward reference block's position relative to the position of the (i-1)-th round target forward reference block is an integer pixel distance or a fractional pixel distance; or the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and positions of (N-1) candidate backward reference blocks, where the positional offset of each candidate backward reference block's position relative to the position of the (i-1)-th round target backward reference block is an integer pixel distance or a fractional pixel distance.
  • the third prediction unit 2003 is specifically configured to: when an iteration termination condition is satisfied, obtain a predicted value of the pixel values of the current image block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are integers greater than or equal to 1.
  • j is greater than or equal to i
  • i and j are integers greater than or equal to one.
  • the first position offset is in a mirror relationship with the second position offset, and the magnitude of the first position offset is the same as that of the second position offset; for example, the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset.
  • the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
  • in terms of determining the positions of the N forward reference blocks and the positions of the N backward reference blocks based on the i-th round motion information and the position of the current image block, the third search unit 2002 is specifically configured to:
  • determine the positions of the N forward reference blocks, which include the position of the (i-1)-th round target forward reference block and positions of the (N-1) candidate forward reference blocks; and
  • determine the positions of the N backward reference blocks, which include the position of the (i-1)-th round target backward reference block and positions of the (N-1) candidate backward reference blocks.
  • in terms of determining, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block, the third search unit 2002 is specifically configured to:
  • determine the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
  • the apparatus 2000 may perform the foregoing methods shown in FIG. 14 and FIG. 15.
  • the apparatus 2000 may specifically be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
  • the device 2000 can be used for both image prediction during encoding and image prediction during decoding.
  • the positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks located in the backward reference image form the positions of N pairs of reference blocks. For each of the positions of the N pairs of reference blocks, the position of the forward reference block has a first position offset relative to the position of the initial forward reference block, the position of the backward reference block has a second position offset relative to the position of the initial backward reference block, and the first position offset and the second position offset are in a mirror relationship. On this basis, the position of one pair of reference blocks determined from the positions of the N pairs of reference blocks (for example, the pair with the least matching cost) is used as the position of the target forward reference block of the current image block and the position of the target backward reference block, so that a predicted value of the pixel values of the current image block is obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
  • FIG. 21 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 2100 is applicable to both inter prediction for decoding a video image and inter prediction for encoding a video image. It should be understood that the prediction apparatus 2100 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction apparatus 2100 may include:
  • a fourth obtaining unit 2101, configured to acquire the i-th round motion information of the current image block;
  • a fourth searching unit 2102, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and the position of the current image block, where the N forward reference blocks are located in the forward reference image, the N backward reference blocks are located in the backward reference image, and N is an integer greater than 1; and, based on a matching cost criterion, determine the position of one pair of reference blocks from the positions of M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, a first position offset and a second position offset have a proportional relationship based on a time domain distance, the first position offset indicating the positional offset of the position of the forward reference block in the forward reference image relative to the position of the (i-1)-th round target forward reference block, and the second position offset indicating the positional offset of the position of the backward reference block in the backward reference image relative to the position of the (i-1)-th round target backward reference block;
  • a fourth prediction unit 2103, configured to obtain a predicted value of the pixel values of the current image block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are integers greater than or equal to 1.
  • the i-th round motion information is the initial motion information of the current image block
  • the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
  • the fourth prediction unit 2103 is specifically configured to: when an iteration termination condition is satisfied, obtain a predicted value of the pixel values of the current image block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are integers greater than or equal to 1.
  • the first position offset and the second position offset have a proportional relationship based on the time domain distance, which can be understood as:
  • the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
  • the direction of the first position offset is opposite to the direction of the second position offset, and the ratio of the magnitude of the first position offset to the magnitude of the second position offset is based on the proportional relationship between the first time domain distance and the second time domain distance;
  • the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image; and the second time domain distance represents the time domain distance between the current image and the backward reference image.
  • the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index; correspondingly, in terms of determining the positions of the N forward reference blocks and the positions of the N backward reference blocks based on the i-th round motion information and the position of the current image block,
  • the fourth search unit 2102 is specifically configured to:
  • determine the positions of the N forward reference blocks, which include the position of the (i-1)-th round target forward reference block and positions of the (N-1) candidate forward reference blocks; and
  • determine the positions of the N backward reference blocks, which include the position of the (i-1)-th round target backward reference block and positions of the (N-1) candidate backward reference blocks.
  • the fourth search unit 2102 is specifically configured to:
  • determine the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
  • the foregoing apparatus 2100 may perform the above-described method shown in FIG. 16 or 17, and the apparatus 2100 may be a video encoding apparatus, a video decoding apparatus, a video codec system, or other devices having a video codec function.
  • the device 2100 can be used for both image prediction during encoding and image prediction during decoding.
  • the positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks located in the backward reference image form the positions of N pairs of reference blocks. For each of the positions of the N pairs of reference blocks, the position of the forward reference block has a first position offset relative to the position of the initial forward reference block, the position of the backward reference block has a second position offset relative to the position of the initial backward reference block, and the first position offset and the second position offset have a proportional relationship based on the time domain distance. On this basis, the position of one pair of reference blocks determined from the positions of the N pairs of reference blocks (for example, the pair with the least matching cost) is used as the position of the target forward reference block of the current image block and the position of the target backward reference block, so that a predicted value of the pixel values of the current image block is obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
  • the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids performing forward search matching and backward search matching separately with the template matching block, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
  • increasing the number of iterations can further improve the accuracy of the modified MV, thereby further improving the codec performance.
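  • The iterative structure described above (round i searches around the (i-1)-th round's target positions, and iteration stops when a termination condition is met or a round limit is reached) can be sketched as follows. This is an illustrative Python skeleton only; `refine_round`, `converged`, and the round limit are hypothetical placeholders for the per-round search and the iteration termination condition, not names from the application.

```python
# Illustrative sketch only: iterative refinement of a forward/backward motion
# vector pair. Each round refines around the previous round's result.
def iterative_refinement(refine_round, mv_fwd, mv_bwd, max_rounds, converged):
    """refine_round(mv_fwd, mv_bwd) -> (new_fwd, new_bwd): one search round.
    converged(old_fwd, old_bwd, new_fwd, new_bwd) -> bool: termination test.
    Returns (final_fwd, final_bwd, rounds_used)."""
    for i in range(1, max_rounds + 1):
        new_fwd, new_bwd = refine_round(mv_fwd, mv_bwd)
        if converged(mv_fwd, mv_bwd, new_fwd, new_bwd):
            return new_fwd, new_bwd, i   # iteration termination condition met
        mv_fwd, mv_bwd = new_fwd, new_bwd  # round i's targets anchor round i+1
    return mv_fwd, mv_bwd, max_rounds
```

  • The round limit caps complexity, while the convergence test lets the loop exit early once further rounds no longer change the motion vectors, which is consistent with the trade-off noted above between iteration count and accuracy.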
  • FIG. 22 is a schematic block diagram of an implementation manner of a video encoding device or a video decoding device (abbreviated as decoding device 2200) according to an embodiment of the present disclosure.
  • the decoding device 2200 can include a processor 2210, a memory 2230, and a bus system 2250.
  • the processor and the memory are connected through the bus system; the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory.
  • the memory of the coding device stores program code, and the processor may invoke the program code stored in the memory to perform the various video encoding or decoding methods described herein, particularly the video encoding or decoding methods in various inter prediction modes or intra prediction modes, and the methods of predicting motion information in various inter or intra prediction modes. To avoid repetition, details are not described again here.
  • the processor 2210 may be a central processing unit (CPU), or the processor 2210 may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory 2230 can include a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device can also be used as the memory 2230.
  • Memory 2230 can include code and data 2231 that is accessed by processor 2210 using bus 2250.
  • the memory 2230 can further include an operating system 2233 and an application 2235 that includes at least one program that allows the processor 2210 to perform the video encoding or decoding methods described herein, particularly the image prediction methods described herein.
  • application 2235 can include applications 1 through N, which further include a video encoding or decoding application (referred to as a video coding application) that performs the video encoding or decoding methods described herein.
  • the bus system 2250 may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus. However, for clarity of description, various buses are labeled as bus system 2250 in the figure.
  • decoding device 2200 may also include one or more output devices, such as display 2270.
  • display 2270 can be a tactile display or a touch display that combines the display with a tactile unit that operatively senses a touch input.
  • Display 2270 can be coupled to processor 2210 via bus 2250.
  • the computer readable medium can comprise a computer readable storage medium corresponding to a tangible medium, such as a data storage medium, or any communication medium that facilitates transfer of a computer program from one place to another (e.g., according to a communication protocol).
  • a computer readable medium may generally correspond to (1) a non-transitory tangible computer readable storage medium, or (2) a communication medium, such as a signal or carrier.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in this application.
  • the computer program product can comprise a computer readable medium.
  • such computer readable storage media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • any connection is properly termed a computer-readable medium.
  • if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • the computer readable storage medium and data storage medium do not include connections, carrier waves, signals, or other temporary media, but rather are directed to non-transitory tangible storage media.
  • magnetic disks and optical discs include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), and Blu-ray discs, where disks usually reproduce data magnetically, whereas discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer readable media.
  • the instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
  • the term "processor,” as used herein, may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
  • the functions described in the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
  • the techniques may be fully implemented in one or more circuits or logic elements.
  • various illustrative logical blocks, units, modules in video encoder 20 and video decoder 30 may be understood as corresponding circuit devices or logic elements.
  • The techniques of the present application may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
  • IC integrated circuit
  • a group of ICs eg, a chipset
  • Various components, modules, or units are described herein to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit, or provided by a collection of interoperating hardware units (including one or more processors as described above), in conjunction with suitable software and/or firmware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Color Television Systems (AREA)
  • Image Analysis (AREA)

Abstract

This application provides an image prediction method, an apparatus, and a codec. The method includes: obtaining initial motion information of a current image block; determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current image block; determining, based on a matching cost criterion, the positions of one pair of reference blocks from among the positions of M pairs of reference blocks as the positions of a target forward reference block and a target backward reference block of the current image block, where the positions of each pair of reference blocks include the position of one forward reference block and the position of one backward reference block, and for the positions of each pair of reference blocks, a first position offset and a second position offset are in a mirror relationship; and obtaining a predictor of the pixel values of the current image block based on the pixel values of the target forward reference block and the pixel values of the target backward reference block. This application can reduce the complexity of image prediction while improving its accuracy.

Description

Image Prediction Method, Apparatus, and Codec — Technical Field
This application relates to the field of video coding technologies, and in particular, to an image prediction method, an apparatus, and a codec.
Background
Video compression technologies, such as those described in the MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC) standards and the extensions of those standards, enable devices to transmit and receive digital video information efficiently. Typically, the pictures of a video sequence are partitioned into image blocks for encoding or decoding.
To reduce or remove redundant information in a video sequence, video compression technologies introduce block-based spatial prediction (intra prediction) and/or temporal prediction (inter prediction). Inter prediction modes may include, but are not limited to, a merge mode and a non-merge mode (for example, the advanced motion vector prediction (AMVP) mode), both of which perform inter prediction by letting multiple sets of motion information compete.
In the inter prediction process, a candidate motion information list (candidate list for short) containing multiple sets of motion information (also called multiple candidate motion information) is introduced. For example, an encoder may use a set of motion information selected from the candidate list as, or to predict, the motion information (for example, a motion vector) of the image block currently being encoded, and thereby obtain the reference image block (that is, the reference samples) of that block. Correspondingly, a decoder may parse indication information from the bitstream to obtain a set of motion information. Because the inter prediction process limits the coding overhead of the motion information (that is, the bit overhead it occupies in the bitstream), the accuracy of the motion information is affected to some extent, which in turn affects the accuracy of image prediction.
To improve the accuracy of image prediction, the existing decoder-side motion vector refinement (DMVR) technique may be used to refine the motion information. However, when the DMVR scheme is used for image prediction, a template matching block must first be computed, and the template matching block must then be used to perform separate search-and-match processes in the forward reference picture and in the backward reference picture, resulting in high search complexity. Therefore, how to reduce the complexity of image prediction while improving its accuracy is a problem that needs to be solved.
Summary
Embodiments of this application provide an image prediction method, an apparatus, and corresponding encoders and decoders, which can reduce the complexity of image prediction to some extent while improving its accuracy, thereby improving coding performance.
第一方面,本申请实施例提供了一种图像预测方法,该方法包括:获取当前图像块的初始运动信息;基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数, 且所述M小于或等于N;基于所述目标前向参考块的像素值(sample)和所述目标后向参考块的像素值(sample),得到所述当前图像块的像素值的预测值。
尤其需要说明的是,在本申请实施例中,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,从而初始前向参考块的位置相对于初始前向参考块的位置的位置偏移为0,初始后向参考块的位置相对于初始后向参考块的位置的位置偏移为0的情况下,0偏移与0偏移也是满足镜像关系。
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对所述N对参考块的位置中的每一对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移成镜像关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。
此外,应当理解的是,这里的当前图像块(简称为当前块)可以理解为当前正在处理的图像块。例如在编码过程中,指当前正在编码的图像块(encoding block);在解码过程中,指当前正在解码的图像块(decoding block)。
此外,应当理解的是,这里的参考块指为当前块提供参考信号的块。在搜索过程中,需要遍历多个参考块,寻找最佳参考块。位于前向参考图像中的参考块,称为前向参考块;位于后向参考图像中的参考块,称为后向参考块。
此外,应当理解的是,为当前块提供预测的块称为预测块。例如,在遍历多个参考块以后,找到了最佳参考块,此最佳参考块将为当前块提供预测,此块可称为预测块。预测块内的像素值或者采样值或者采样信号,称为预测信号。
此外,应当理解的是,这里的匹配代价准则可以理解为考虑成对的前向参考块与后向参考块之间的匹配代价的准则,其中,匹配代价可以理解为两个块之间的差异值,可以看做是两个块内各个对应位置像素点差异值的累加。差异的计算方法一般基于SAD(sum of absolute difference,绝对差异和)准则,或者其他准则,例如SATD(Sum of Absolute Transform Difference,绝对变换差异和),MR-SAD(mean-removed sum of absolute difference,均值去除的绝对差异和),SSD(sum of squared differences,平方差异和)等进行计算。
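To make the SAD criterion above concrete, here is a minimal Python sketch (the function name `sad` and the sample blocks are ours, not from the patent; SATD, MR-SAD, or SSD would follow the same accumulation pattern with a different per-pair term):

```python
def sad(block_a, block_b):
    # Matching cost between two equally sized blocks: accumulate the absolute
    # difference of every pair of co-located samples (the SAD criterion).
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

fwd = [[10, 12], [14, 16]]   # samples of a forward reference block
bwd = [[11, 11], [13, 18]]   # samples of the paired backward reference block
cost = sad(fwd, bwd)         # 1 + 1 + 1 + 2 = 5
```

A pair of blocks with a smaller SAD is considered a better bilateral match.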
此外,需要说明的是,本申请实施例的当前图像块的初始运动信息可包括运动矢量MV和参考图像指示信息。当然,初始运动信息也可以包含两者之一或者全部包含,例如在编解码端共同约定参考图像的情况下,初始运动信息可以仅包含运动矢量MV。其中参考图像指示信息用于指示当前块使用到了哪一个或哪些重建图像作为参考图像,运动矢量表示在所用参考图像中参考块位置相对于当前块位置的位置偏移,一般包含水平分量偏移 和竖直分量偏移。例如使用(x,y)表示MV,x表示水平方向的位置偏移,y表示竖直方向的位置偏移。使用当前块的位置加上MV,便可以得到它的参考块在参考图像中的位置。其中参考图像指示信息可以包括参考图像列表和/或与参考图像列表对应的参考图像索引。参考图像索引用于识别指定参考图像列表(RefPicList0或RefPicList1)中的与所用运动矢量对应的参考图像。图像可被称作帧,且参考图像可被称作参考帧。
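A minimal sketch of the rule just described — the reference block's position equals the current block's position plus the MV (the function and variable names are illustrative, not from the patent):

```python
def ref_block_pos(cur_pos, mv):
    # Reference-block position = current-block position + MV, where the MV
    # (x, y) carries a horizontal and a vertical component offset.
    return (cur_pos[0] + mv[0], cur_pos[1] + mv[1])

init_fwd_pos = ref_block_pos((64, 32), (-3, 5))   # e.g. an initial forward reference block
```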
本申请实施例的当前图像块的初始运动信息是初始双向预测运动信息,即包括用于前向和后向预测方向的运动信息。此处,前向和后向预测方向是双向预测模式的两个预测方向,可以理解的是,“前向”和“后向”分别对应于当前图像的参考图像列表0(RefPicList0)和参考图像列表1(RefPicList1)。
此外,需要说明的是,本申请实施例的初始前向参考块的位置指的是使用当前块的位置加上初始运动MV偏移而得到的参考块在前向参考图像中的位置;本申请实施例的初始后向参考块的位置指的是使用当前块的位置加上初始运动MV偏移而得到的参考块在后向参考图像中的位置。
应当理解的是,本申请实施例的方法的执行主体可以是图像预测装置,例如可以是视频编码器或视频解码器或具有视频编解码功能的电子设备,具体例如可以是视频编码器中的帧间预测单元,或者视频解码器中的运动补偿单元。
结合第一方面,在第一方面的某些实现方式中,所述第一位置偏移与第二位置偏移成镜像关系可以理解为第一位置偏移量与第二位置偏移量相同,例如,所述第一位置偏移的方向(亦称为矢量方向)与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
在一种示例下,所述第一位置偏移包括第一水平分量偏移和第一竖直分量偏移,所述第二位置偏移包括第二水平分量偏移和第二竖直分量偏移,其中,所述第一水平分量偏移的方向与第二水平分量偏移的方向相反,且第一水平分量偏移的幅值与第二水平分量偏移的幅值相同;所述第一竖直分量偏移的方向与第二竖直分量偏移的方向相反,且第一竖直分量偏移的幅值与第二竖直分量偏移的幅值相同。
在另一种示例下,第一位置偏移与第二位置偏移均为0。
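Assuming, as above, that each offset is an (dx, dy) pair of horizontal and vertical components, the mirror relationship can be sketched as follows (a hedged illustration; `mirror_offset` is our name):

```python
def mirror_offset(first_offset):
    # Mirror relationship: the second (backward) offset has the same magnitude
    # as the first (forward) offset in each component but the opposite
    # direction; the all-zero offset of the initial pair trivially satisfies it.
    dx, dy = first_offset
    return (-dx, -dy)
```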
结合第一方面,在第一方面的某些实现方式中,所述方法还包括:获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。
在不同示例下,当前图像块的更新的运动信息是基于所述目标前向参考块的位置、目标后向参考块的位置和当前图像块的位置得到的,或者,是基于所述确定的一对参考块位置对应的第一位置偏移和第二位置偏移得到的。
可见,本申请实施例能获得经修正过的当前图像块的运动信息,提高当前图像块运动信息的准确度,这也将有利于其他图像块的预测,例如提升其它图像块的运动信息的预测准确性等。
结合第一方面,在第一方面的某些实现方式中,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,
所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
需要说明的是,所述N对参考块的位置包括:成对的初始前向参考块和初始后向参考块的位置,和成对的候选前向参考块和候选后向参考块的位置,其中所述前向参考图像中所述候选前向参考块的位置相对于初始前向参考块的位置的位置偏移,与,所述后向参考图像中所述候选后向参考块的位置相对于初始后向参考块的位置的位置偏移成镜像关系。
结合第一方面,在第一方面的某些实现方式中,所述初始运动信息包括前向预测运动信息和后向预测运动信息;
所述基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,包括:
根据所述前向预测运动信息和当前图像块的位置在前向参考图像中确定N个前向参考块的位置,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;
根据所述后向预测运动信息和当前图像块的位置在后向参考图像中确定N个后向参考块的位置,所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
结合第一方面,在第一方面的某些实现方式中,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,包括:
根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置;以所述初始后向参考块的位置作为第二搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
结合第一方面,在第一方面的某些实现方式中,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,包括:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,其中所述M小于或等于N。
在一种示例下,所述匹配代价准则为匹配代价最小化的准则。例如,针对M对参考块的位置,计算每对参考块中前向参考块的像素值与后向参考块的像素值的差异;从所述M对参考块的位置中,确定像素值差异最小的一对参考块的位置为所述当前图像块的前向目标参考块的位置以及后向目标参考块的位置。
在另一种示例下,所述匹配代价准则为匹配代价与提前终止准则。例如,针对第n对参考块(一个前向参考块与一个后向参考块)的位置,计算所述前向参考块的像素值与后向参考块的像素值的差异,n为大于或等于1,且小于或等于N的整数;当像素值差异小于或等于匹配误差阈值时,确定第n对参考块(一个前向参考块与一个后向参考块)的位置为所述当前图像块的前向目标参考块的位置以及后向目标参考块的位置。
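The two selection rules above — minimum matching cost, and early termination once a pair's cost falls at or below a threshold — can be sketched in Python (a minimal illustration under our own naming; `cost_fn` stands in for SAD or any of the other criteria):

```python
def pick_best_pair(pairs, cost_fn, threshold=None):
    # pairs: list of (forward_block, backward_block) candidates.
    # With a threshold, stop at the first pair whose cost is <= threshold
    # (early-termination rule); otherwise return the minimum-cost pair.
    best_idx, best_cost = 0, None
    for n, (fwd, bwd) in enumerate(pairs):
        cost = cost_fn(fwd, bwd)
        if threshold is not None and cost <= threshold:
            return n
        if best_cost is None or cost < best_cost:
            best_idx, best_cost = n, cost
    return best_idx
```

The early-termination variant trades a possibly sub-optimal match for fewer cost evaluations.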
结合第一方面,在第一方面的某些实现方式中,所述方法用于编码所述当前图像块,所述获取当前图像块的初始运动信息包括:从当前图像块的候选运动信息列表中获取所述初始运动信息;
或者,所述方法用于解码所述当前图像块,所述获取当前图像块的初始运动信息之前,所述方法还包括:从当前图像块的码流中获取指示信息,所述指示信息用于指示当前图像块的初始运动信息。
可见,本申请实施例的图像预测方法,不仅适用于合并预测模式(Merge)和/或高级运动矢量预测模式(advanced motion vector prediction,AMVP),而且也能适用于使用空域参考块,时域参考块和/或视间参考块的运动信息对当前图像块的运动信息进行预测的其它模式,从而提高编解码性能。
本申请的第二方面提供一种图像预测方法,包括:获取当前图像块的初始运动信息;
基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移;所述第二位置偏移表示 所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
尤其需要说明的是,在本申请实施例中,初始前向参考块的位置相对于初始前向参考块的位置的位置偏移为0,初始后向参考块的位置相对于初始后向参考块的位置的位置偏移为0的情况下,0偏移与0偏移也是满足镜像关系或满足基于时域距离的比例关系的。换一种角度来描述,可以是针对(N-1)对参考块位置中的每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系/镜像关系。这里的(N-1)对参考块的位置不包括初始前向参考块的位置和初始后向参考块的位置。
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对所述N对参考块的位置中的每一对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移具有基于时域距离的比例关系(亦可称为基于时域距离的镜像关系),在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。
结合第二方面,在第二方面的某些实现方式中,针对每对参考块,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,包括:
针对每对参考块,第一位置偏移与第二位置偏移的比例关系是基于第一时域距离与第二时域距离比例关系而确定的,其中第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
结合第二方面,在第二方面的某些实现方式中,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,包括:
如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
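The temporal-distance proportionality above can be sketched as follows (an integer-pel simplification under our own naming; td0 and td1 denote the first and second temporal distances):

```python
def scale_backward_offset(first_offset, td0, td1):
    # Opposite directions; the magnitude ratio follows the ratio of the
    # temporal distances: td0 = current picture -> forward reference picture,
    # td1 = current picture -> backward reference picture.
    dx, dy = first_offset
    return (-dx * td1 // td0, -dy * td1 // td0)
```

When td0 == td1 this reduces to the pure mirror relationship of the first aspect.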
结合第二方面,在第二方面的某些实现方式中,所述方法还包括:获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。
可见,本申请实施例能获得经修正过的当前图像块的运动信息,提高当前图像块运动信息的准确度,这也将有利于其他图像块的预测,例如提升其它图像块的运动信息的预测准确性等。
结合第二方面,在第二方面的某些实现方式中,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,
所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
结合第二方面,在第二方面的某些实现方式中,所述N对参考块的位置包括:成对的初始前向参考块和初始后向参考块的位置,和成对的候选前向参考块和候选后向参考块的位置,其中所述前向参考图像中所述候选前向参考块的位置相对于初始前向参考块的位置的位置偏移,与,所述后向参考图像中所述候选后向参考块的位置相对于初始后向参考块的位置的位置偏移具有基于时域距离的比例关系。
结合第二方面,在第二方面的某些实现方式中,所述初始运动信息包括前向预测运动信息和后向预测运动信息;
所述基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,包括:
根据所述前向预测运动信息和当前图像块的位置在前向参考图像中确定N个前向参考块的位置,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;
根据所述后向预测运动信息和当前图像块的位置在后向参考图像中确定N个后向参考块的位置,所述N个后向参考块的位置包括初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
结合第二方面,在第二方面的某些实现方式中,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N 个后向参考块的位置,包括:
根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置;以所述初始后向参考块的位置作为第二搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
结合第二方面,在第二方面的某些实现方式中,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,包括:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,所述M小于或等于N。
在一种示例下,所述匹配代价准则为匹配代价最小化的准则。例如,针对M对参考块的位置,计算每对参考块中前向参考块的像素值与后向参考块的像素值的差异;从所述M对参考块的位置中,确定像素值差异最小的一对参考块的位置为所述当前图像块的前向目标参考块的位置以及后向目标参考块的位置。
在另一种示例下,所述匹配代价准则为匹配代价与提前终止准则。例如,针对第n对参考块(一个前向参考块与一个后向参考块)的位置,计算所述前向参考块的像素值与后向参考块的像素值的差异,n为大于或等于1,且小于或等于N的整数;当像素值差异小于或等于匹配误差阈值时,确定第n对参考块(一个前向参考块与一个后向参考块)的位置为所述当前图像块的前向目标参考块的位置以及后向目标参考块的位置。
结合第二方面,在第二方面的某些实现方式中,所述方法用于编码所述当前图像块,所述获取当前图像块的初始运动信息包括:从当前图像块的候选运动信息列表中获取所述初始运动信息;
或者,所述方法用于解码所述当前图像块,所述获取当前图像块的初始运动信息之前,所述方法还包括:从当前图像块的码流中获取指示信息,所述指示信息用于指示当前图像块的初始运动信息。
本申请的第三方面提供一种图像预测方法,包括:获取当前图像块的第i轮运动信息;
根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
尤其需要说明的是,在本申请实施例中,初始前向参考块的位置相对于初始前向参考块的位置的位置偏移为0,初始后向参考块的位置相对于初始后向参考块的位置的位置偏移为0的情况下,0偏移与0偏移也是满足镜像关系。
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对所述N对参考块的位置中的每一对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移成镜像关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。此外,本申请实施例通过迭代的方法,可以进一步提高修正运动矢量MV的准确度,从而进一步提高编解码性能。
结合第三方面,在第三方面的某些实现方式中,如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;相应地,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量;相应地,所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于 第i-1轮目标后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
需要说明的是,如果所述方法用于编码所述当前图像块,所述当前图像块的初始运动信息是通过如下方法获取的:从当前图像块的候选运动信息列表中确定所述初始运动信息;或者,如果所述方法用于解码所述当前图像块,所述当前图像块的初始运动信息是通过如下方法获取的:从当前图像块的码流中获取指示信息,其中,所述指示信息用于指示当前图像块的初始运动信息。
结合第三方面,在第三方面的某些实现方式中,所述根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数,包括:
当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
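The round-by-round refinement described above can be sketched as a loop (a hedged, one-dimensional toy; `refine_round` stands in for one search round of the third/fourth aspects, and the round limit is one possible iteration-termination condition):

```python
def iterative_refine(init_mv, refine_round, max_rounds=2):
    # refine_round(mv) -> (new_mv, changed). Repeat the per-round search until
    # the refined MV stops changing or the round limit is reached.
    mv = init_mv
    for _ in range(max_rounds):
        mv, changed = refine_round(mv)
        if not changed:
            break
    return mv

def toy_round(mv):
    # Hypothetical one-dimensional "refinement" that nudges the MV toward 3.
    return (mv + 1, True) if mv < 3 else (mv, False)
```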
结合第三方面,在第三方面的某些实现方式中,所述第一位置偏移与第二位置偏移成镜像关系,包括:所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
结合第三方面,在第三方面的某些实现方式中,所述第i轮运动信息包括前向运动矢量和前向参考图像索引,以及后向运动矢量和后向参考图像索引;
所述根据所述第i轮运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,包括:
根据所述前向运动矢量和所述当前图像块的位置在所述前向参考图像索引对应的前向参考图像中确定当前图像块的第i-1轮目标前向参考块的位置;以所述第i-1轮目标前向参考块的位置作为第i_f搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述后向运动矢量和所述当前图像块的位置在所述后向参考图像索引对应的后向参考图像中确定当前图像块的第i-1轮目标后向参考块的位置;以所述第i-1轮目标后向参考块的位置作为第i_b搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和所述(N-1)个候选后向参考块的位置。
结合第三方面,在第三方面的某些实现方式中,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置,包括:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块的位置,其中所述M小于或等于N。
本申请的第四方面提供一种图像预测方法,包括:获取当前图像块的第i轮运动信息;
根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考图像中,所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考图像中,所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
尤其需要说明的是,在本申请实施例中,初始前向参考块的位置相对于初始前向参考块的位置的位置偏移为0,初始后向参考块的位置相对于初始后向参考块的位置的位置偏移为0的情况下,0偏移与0偏移也是满足镜像关系或满足基于时域距离的比例关系的。换一种角度来描述,可以是针对(N-1)对参考块位置中的每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系/镜像关系。这里的(N-1)对参考块的位置不包括初始前向参考块的位置和初始后向参考块的位置。
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对所述N对参考块的位置中的每一对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移具有基于时域距离的比例关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。此外,本申请实施例通过迭代的方法,可以进一步提高修正运动矢量MV的准确度,从而进一步提高编解码性能。
结合第四方面,在第四方面的某些实现方式中,如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量。
结合第四方面,在第四方面的某些实现方式中,所述根据所述第j轮目标前向参考块 的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数,包括:
当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
结合第四方面,在第四方面的某些实现方式中,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,包括:
如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
结合第四方面,在第四方面的某些实现方式中,所述第i轮运动信息包括前向运动矢量和前向参考图像索引,以及后向运动矢量和后向参考图像索引;
所述根据所述第i轮运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,包括:
根据所述前向运动矢量和所述当前图像块的位置在所述前向参考图像索引对应的前向参考图像中确定当前图像块的第i-1轮目标前向参考块的位置;以所述第i-1轮目标前向参考块的位置作为第i_f搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述后向运动矢量和所述当前图像块的位置在所述后向参考图像索引对应的后向参考图像中确定当前图像块的第i-1轮目标后向参考块的位置;以所述第i-1轮目标后向参考块的位置作为第i_b搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和所述(N-1)个候选后向参考块的位置。
结合第四方面,在第四方面的某些实现方式中,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置,包括:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块的位置,其中所述M小于或等于N。
本申请的第五方面提供一种图像预测装置,包括用于实施第一方面的任意一种方法的若干个功能单元。举例来说,图像预测装置可以包括:第一获取单元,用于获取当前图像块的初始运动信息;第一搜索单元,用于基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;第一预测单元,用于基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
在不同应用场景下,图像预测装置例如应用于视频编码装置(视频编码器)或视频解码装置(视频解码器)。
本申请的第六方面提供一种图像预测装置,包括用于实施第二方面的任意一种方法的若干个功能单元。举例来说,图像预测装置可以包括:第二获取单元,用于获取当前图像块的初始运动信息;第二搜索单元,用于基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;第二预测单元,用于基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
在不同应用场景下,图像预测装置例如应用于视频编码装置(视频编码器)或视频解码装置(视频解码器)。
本申请的第七方面提供一种图像预测装置,包括用于实施第三方面的任意一种方法的若干个功能单元。举例来说,图像预测装置可以包括:第三获取单元,用于获取当前图像块的第i轮运动信息;第三搜索单元,用于根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位 置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;第三预测单元,用于根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
在不同应用场景下,图像预测装置例如应用于视频编码装置(视频编码器)或视频解码装置(视频解码器)。
本申请的第八方面提供一种图像预测装置,包括用于实施第四方面的任意一种方法的若干个功能单元。举例来说,图像预测装置可以包括:第四获取单元,用于获取当前图像块的第i轮运动信息;第四搜索单元,用于根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考图像中,所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考图像中,所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;第四预测单元,用于根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
在不同应用场景下,图像预测装置例如应用于视频编码装置(视频编码器)或视频解码装置(视频解码器)。
本申请的第九方面提供了一种图像预测装置,所述装置包括:处理器和耦合于所述处理器的存储器;所述处理器用于执行所述第一方面或第二方面或第三方面或第四方面或前述各方面的各种实现方式中的方法。
本申请的第十方面提供一种视频编码器,所述视频编码器用于编码图像块,包括:帧间预测模块,其中所述帧间预测模块包括如第五方面或第六方面或第七方面或第八方面所述的图像预测装置,其中所述帧间预测模块用于预测得到所述图像块的像素值的预测值;熵编码模块,用于将指示信息编入码流,所述指示信息用于指示所述图像块的初始运动信息;重建模块,用于基于所述图像块的像素值的预测值重建所述图像块。
本申请的第十一方面提供一种视频解码器,所述视频解码器用于从码流中解码出图像块,包括:熵解码模块,用于从码流中解码出指示信息,所述指示信息用于指示当前解码图像块的初始运动信息;帧间预测模块,包括如第五方面或第六方面或第七方面或第八方面所述的图像预测装置,所述帧间预测模块用于预测得到所述图像块的像素值的预测值;重建模块,用于基于所述图像块的像素值的预测值重建所述图像块。
本申请的第十二方面提供一种视频编码设备,包括非易失性存储介质,以及处理器,所述非易失性存储介质存储有可执行程序,所述处理器与所述非易失性存储介质相互耦合,并执行所述可执行程序以实现所述第一、二、三或四方面或其各种实现方式中的方法。
本申请的第十三方面提供一种视频解码设备,包括非易失性存储介质,以及处理器,所述非易失性存储介质存储有可执行程序,所述处理器与所述非易失性存储介质相互耦合,并执行所述可执行程序以实现所述第一、二、三或四方面或其各种实现方式中的方法。
本申请的第十四方面提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述第一、二、三或四方面或其各种实现方式中的方法。
本申请的第十五方面提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述第一、二、三或四方面或其各种实现方式中的方法。
本申请的第十六方面提供了一种电子设备,包括上述第十方面所述的视频编码器,或上述第十一方面所述的视频解码器,或上述第五、六、七或八方面所述的图像预测装置。
应理解,各方面及对应的可实施的设计方式所取得的有益效果相似,不再赘述。
附图说明
图1为本申请实施例中一种视频编码及解码系统的示意性框图;
图2A为本申请实施例中一种视频编码器的示意性框图;
图2B为本申请实施例中一种视频解码器的示意性框图;
图3是本申请实施例的一种图像预测方法的示意性流程图;
图4是帧间预测的合并模式下编码端获取初始运动信息的示意图;
图5是帧间预测的合并模式下解码端获取初始运动信息的示意图;
图6是当前图像块的初始参考块的示意图;
图7是整像素位置像素与分像素位置像素的示意图;
图8是搜索起始点的示意图;
图9是本申请实施例中第一位置偏移与第二位置偏移成镜像关系的示意性框图;
图10是本申请实施例的另一种图像预测方法的示意性流程图;
图11是本申请实施例的另一种图像预测方法的示意性流程图;
图12是本申请实施例的另一种图像预测方法的示意性流程图;
图13是本申请实施例中第一位置偏移与第二位置偏移具有基于时域距离的比例关系的示意性框图;
图14是本申请实施例的另一种图像预测方法1400的示意性流程图;
图15是本申请实施例的另一种图像预测方法的示意性流程图;
图16是本申请实施例的另一种图像预测方法1600的示意性流程图;
图17是本申请实施例的另一种图像预测方法的示意性流程图;
图18是本申请实施例的一种图像预测装置的示意性框图;
图19是本申请实施例的另一种图像预测装置的示意性框图;
图20是本申请实施例的另一种图像预测装置的示意性框图;
图21是本申请实施例的另一种图像预测装置的示意性框图;
图22是本申请实施例的一种编码设备或解码设备的示意性框图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述。
图1为本申请实施例中视频编码及解码系统的一种示意性框图。系统中视频编码器20和视频解码器30用于根据本申请提出的各种图像预测方法实例来预测图像块的像素值的预测值,以及修正当前经编码或经解码图像块的运动信息,例如运动矢量,从而进一步的改善编解码性能。如图1所示,系统包含源装置12和目的地装置14,源装置12产生将在稍后时间由目的地装置14解码的经编码视频数据。源装置12及目的地装置14可包括广泛范围的装置中的任一者,包含桌上型计算机、笔记型计算机、平板计算机、机顶盒、例如所谓的“智能”电话的电话手机、所谓的“智能”触控板、电视、摄影机、显示装置、数字媒体播放器、视频游戏控制台、视频流式传输装置或类似者。
目的地装置14可经由链路16接收待解码的经编码视频数据。链路16可包括能够将经编码视频数据从源装置12移动到目的地装置14的任何类型的媒体或装置。在一个可行的实施方式中,链路16可包括使源装置12能够实时将经编码视频数据直接传输到目的地装置14的通信媒体。可根据通信标准(例如,无线通信协议)调制经编码视频数据且将其传输到目的地装置14。通信媒体可包括任何无线或有线通信媒体,例如射频频谱或一个或多个物理传输线。通信媒体可形成基于包的网络(例如,局域网、广域网或因特网的全球网络)的部分。通信媒体可包含路由器、交换器、基站或可有用于促进从源装置12到目的地装置14的通信的任何其它装备。
替代地,可将经编码数据从输出接口22输出到存储装置24。类似地,可由输入接口从存储装置24存取经编码数据。存储装置24可包含多种分散式或本地存取的数据存储媒体中的任一者,例如,硬盘驱动器、蓝光光盘、DVD、CD-ROM、快闪存储器、易失性或非易失性存储器或用于存储经编码视频数据的任何其它合适的数字存储媒体。在另一可行的实施方式中,存储装置24可对应于文件服务器或可保持由源装置12产生的经编码视频的另一中间存储装置。目的地装置14可经由流式传输或下载从存储装置24存取所存储视频数据。文件服务器可为能够存储经编码视频数据且将此经编码视频数据传输到目的地装置14的任何类型的服务器。可行的实施方式文件服务器包含网站服务器、文件传送协议服务器、网络附接存储装置或本地磁盘机。目的地装置14可经由包含因特网连接的任何标准数据连接存取经编码视频数据。此数据连接可包含适合于存取存储于文件服务器上的经编码视频数据的无线信道(例如,Wi-Fi连接)、有线连接(例如,缆线调制解调器等)或两者的组合。经编码视频数据从存储装置24的传输可为流式传输、下载传输或两者的组合。
本申请的技术不必限于无线应用或设定。技术可应用于视频解码以支持多种多媒体应用中的任一者,例如,空中电视广播、有线电视传输、卫星电视传输、流式传输视频传输(例如,经由因特网)、编码数字视频以用于存储于数据存储媒体上、解码存储于数据存储媒体上的数字视频或其它应用。在一些可行的实施方式中,系统可经配置以支持单向或双向视频传输以支持例如视频流式传输、视频播放、视频广播和/或视频电话的应用。
在图1的可行的实施方式中,源装置12包括视频源18、视频编码器20及输出接口22。在一些应用中,输出接口22可包括调制器/解调制器(调制解调器)和/或传输器。在源装置12中,视频源18可包括例如以下各者的源:视频捕获装置(例如,摄像机)、含有先前捕获的视频的视频存档、用以从视频内容提供者接收视频的视频馈入接口,和/或用于产生计算机图形数据作为源视频的计算机图形系统,或这些源的组合。作为一种可行的实施方式,如果视频源18为摄像机,那么源装置12及目的装置14可形成所谓的摄影机电话或视频电话。本申请中所描述的技术可示例性地适用于视频解码,且可适用于无线和/或有线应用。
可由视频编码器20来编码所捕获、预捕获或计算机产生的视频。经编码视频数据可经由源装置12的输出接口22直接传输到目的地装置14。经编码视频数据也可(或替代地)存储到存储装置24上以供稍后由目的地装置14或其它装置存取以用于解码和/或播放。
目的地装置14包含输入接口28、视频解码器30及显示装置32。在一些应用中,输入接口28可包含接收器和/或调制解调器。目的地装置14的输入接口28经由链路16接收经编码视频数据。经由链路16传达或提供于存储装置24上的经编码视频数据可包含由视频编码器20产生以供视频解码器30的视频解码器使用以解码视频数据的多种语法元素。这些语法元素可与在通信媒体上传输、存储于存储媒体上或存储于文件服务器上的经编码视频数据包含在一起。
显示装置32可与目的地装置14集成或在目的地装置14外部。在一些可行的实施方式中,目的地装置14可包含集成显示装置且也经配置以与外部显示装置接口连接。在其它可行的实施方式中,目的地装置14可为显示装置。一般来说,显示装置32向用户显示经解码视频数据,且可包括多种显示装置中的任一者,例如液晶显示器、等离子显示器、有机发光二极管显示器或另一类型的显示装置。
视频编码器20及视频解码器30可根据例如目前在开发中的下一代视频编解码压缩标准(H.266)操作且可遵照H.266测试模型(JEM)。替代地,视频编码器20及视频解码器30可根据例如ITU-TH.265标准,也称为高效率视频解码标准,或者,ITU-TH.264标准的其它专属或工业标准或这些标准的扩展而操作,ITU-TH.264标准替代地被称为MPEG-4第10部分,也称高级视频编码(advanced video coding,AVC)。然而,本申请的技术不限于任何特定解码标准。视频压缩标准的其它可行的实施方式包含MPEG-2和ITU-TH.263。
尽管未在图1中展示,但在一些方面中,视频编码器20及视频解码器30可各自与音频编码器及解码器集成,且可包含适当多路复用器-多路分用器(MUX-DEMUX)单元或其它硬件及软件以处置共同数据流或单独数据流中的音频及视频两者的编码。如果适用,那么在一些可行的实施方式中,MUX-DEMUX单元可遵照ITUH.223多路复用器协议或例如用户数据报协议(UDP)的其它协议。
视频编码器20及视频解码器30各自可实施为多种合适编码器电路中的任一者,例如,一个或多个微处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)、离散逻辑、软件、硬件、固件或其任何组合。在技术部分地以软件实施时,装置可将软件的指令存储于合适的非暂时性计算机可读媒体中且使用一个或多个处理器以硬件执行指令,以执行本申请的技术。视频编码器20及视频解码器30中的每一者可包含于 一个或多个编码器或解码器中,其中的任一者可在相应装置中集成为组合式编码器/解码器(CODEC)的部分。
本申请示例性地可涉及视频编码器20将特定信息“用信号发送”到例如视频解码器30的另一装置。然而,应理解,视频编码器20可通过将特定语法元素与视频数据的各种经编码部分相关联来用信号发送信息。即,视频编码器20可通过将特定语法元素存储到视频数据的各种经编码部分的头信息来“用信号发送”数据。在一些应用中,这些语法元素可在通过视频解码器30接收及解码之前经编码及存储(例如,存储到存储系统34或文件服务器36)。因此,术语“用信号发送”示例性地可指语法或用于解码经压缩视频数据的其它数据的传达,而不管此传达是实时或近实时地发生或在时间跨度内发生,例如可在编码时将语法元素存储到媒体时发生,语法元素接着可在存储到此媒体之后的任何时间通过解码装置检索。
JCT-VC开发了H.265(HEVC)标准。HEVC标准化基于称作HEVC测试模型(HM)的视频解码装置的演进模型。H.265的最新标准文档可从http://www.itu.int/rec/T-REC-H.265获得,最新版本的标准文档为H.265(12/16),该标准文档以全文引用的方式并入本文中。HM假设视频解码装置相对于ITU-TH.264/AVC的现有算法具有若干额外能力。例如,H.264提供9种帧内预测编码模式,而HM可提供多达35种帧内预测编码模式。
JVET致力于开发H.266标准。H.266标准化的过程基于称作H.266测试模型的视频解码装置的演进模型。H.266的算法描述可从http://phenix.int-evry.fr/jvet获得,其中最新的算法描述包含于JVET-F1001-v2中,该算法描述文档以全文引用的方式并入本文中。同时,可从https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/获得JEM测试模型的参考软件,同样以全文引用的方式并入本文中。
一般来说,HM的工作模型描述可将视频帧或图像划分成包含亮度及色度样本两者的树块或最大编码单元(largest coding unit,LCU)的序列,LCU也被称为CTU。树块具有与H.264标准的宏块类似的目的。条带包含按解码次序的数个连续树块。可将视频帧或图像分割成一个或多个条带。可根据四叉树将每一树块分裂成编码单元。例如,可将作为四叉树的根节点的树块分裂成四个子节点,且每一子节点可又为母节点且被分裂成另外四个子节点。作为四叉树的叶节点的最终不可分裂的子节点包括解码节点,例如,经解码图像块。与经解码码流相关联的语法数据可定义树块可分裂的最大次数,且也可定义解码节点的最小大小。
编码单元包含解码节点及预测单元(prediction unit,PU)以及与解码节点相关联的变换单元(transform unit,TU)。CU的大小对应于解码节点的大小且形状必须为正方形。CU的大小的范围可为8×8像素直到最大64×64像素或更大的树块的大小。每一CU可含有一个或多个PU及一个或多个TU。例如,与CU相关联的语法数据可描述将CU分割成一个或多个PU的情形。分割模式在CU是被跳过或经直接模式编码、帧内预测模式编码或帧间预测模式编码的情形之间可为不同的。PU可经分割成形状为非正方形。例如,与CU相关联的语法数据也可描述根据四叉树将CU分割成一个或多个TU的情形。TU的形状可为正方形或非正方形。
HEVC标准允许根据TU进行变换,TU对于不同CU来说可为不同的。TU通常基于 针对经分割LCU定义的给定CU内的PU的大小而设定大小,但情况可能并非总是如此。TU的大小通常与PU相同或小于PU。在一些可行的实施方式中,可使用称作“残差四叉树”(residual qualtree,RQT)的四叉树结构将对应于CU的残差样本再分成较小单元。RQT的叶节点可被称作TU。可变换与TU相关联的像素差值以产生变换系数,变换系数可被量化。
一般来说,TU使用变换及量化过程。具有一个或多个PU的给定CU也可包含一个或多个TU。在预测之后,视频编码器20可计算对应于PU的残差值。残差值包括像素差值,像素差值可变换成变换系数、经量化且使用TU扫描以产生串行化变换系数以用于熵解码。本申请通常使用术语“图像块”来指CU的解码节点。在一些特定应用中,本申请也可使用术语“图像块”来指包含解码节点以及PU及TU的树块,例如,LCU或CU。本申请实施例的下文将详细介绍视频编码或解码中自适应反量化方法所描述的各种方法实例来执行当前图像块(即当前变换块)对应的变换系数的反量化过程,以改善编解码性能。
视频序列通常包含一系列视频帧或图像。图像群组(group of picture,GOP)示例性地包括一系列、一个或多个视频图像。GOP可在GOP的头信息中、图像中的一者或多者的头信息中或在别处包含语法数据,语法数据描述包含于GOP中的图像的数目。图像的每一条带可包含描述相应图像的编码模式的条带语法数据。视频编码器20通常对个别视频条带内的图像块进行操作以便编码视频数据。图像块可对应于CU内的解码节点。图像块可具有固定或变化的大小,且可根据指定解码标准而在大小上不同。
作为一种可行的实施方式,HM支持各种PU大小的预测。假定特定CU的大小为2N×2N,HM支持2N×2N或N×N的PU大小的帧内预测,及2N×2N、2N×N、N×2N或N×N的对称PU大小的帧间预测。HM也支持2N×nU、2N×nD、nL×2N及nR×2N的PU大小的帧间预测的不对称分割。在不对称分割中,CU的一方向未分割,而另一方向分割成25%及75%。对应于25%区段的CU的部分由“n”后跟着“上(Up)”、“下(Down)”、“左(Left)”或“右(Right)”的指示来指示。因此,例如,“2N×nU”指水平分割的2N×2NCU,其中2N×0.5NPU在上部且2N×1.5NPU在底部。
在本申请中,“N×M”与“N乘M”可互换使用以指依照水平维度及竖直维度的图像块的像素尺寸,例如,16×8像素或16乘8像素。一般来说,16×8块将在水平方向上具有16个像素,即图像块的宽为16像素,且在竖直方向上具有8个像素,即图像块的高为8像素。
在使用CU的PU的帧内预测性或帧间预测性解码之后,视频编码器20可计算CU的TU的残差数据。PU可包括空间域(也称作像素域)中的像素数据,且TU可包括在将变换(例如,离散余弦变换(discrete cosine transform,DCT)、整数变换、小波变换或概念上类似的变换)应用于残差视频数据之后变换域中的系数。残差数据可对应于未经编码图像的像素与对应于PU的预测值之间的像素差。视频编码器20可形成包含CU的残差数据的TU,且接着变换TU以产生CU的变换系数。
本申请实施例的下文将详细介绍视频编码或解码中帧间预测过程的各种方法实例来得到当前图像块的最佳前向参考块的采样点的采样值和当前图像块的最佳后向参考块的采样点的采样值,进而预测当前图像块的采样点的采样值。图像块指一个二维采样点阵列, 可以是正方形阵列,也可以是矩形阵列,例如一个4x4大小的图像块可看做4x4共16个采样点构成的方形采样点阵列。图像块内信号指图像块内采样点的采样值。此外,采样点还可以称为像素点或者像素,在本发明文件中将不加区分的使用。相应的,采样点的值也可以称为像素值,在本申请中将不加区分的使用。图像也可以表示为一个二维采样点阵列,采用与图像块类似的方法标记。
在任何变换以产生变换系数之后,视频编码器20可执行变换系数的量化。量化示例性地指对系数进行量化以可能减少用以表示系数的数据的量从而提供进一步压缩的过程。量化过程可减少与系数中的一些或全部相关联的位深度。例如,可在量化期间将n位值降值舍位到m位值,其中n大于m。
JEM模型对视频图像的编码结构进行了进一步的改进,具体的,被称为“四叉树结合二叉树”(QTBT)的块编码结构被引入进来。QTBT结构摒弃了HEVC中的CU,PU,TU等概念,支持更灵活的CU划分形状,一个CU可以正方形,也可以是长方形。一个CTU首先进行四叉树划分,该四叉树的叶节点进一步进行二叉树划分。同时,在二叉树划分中存在两种划分模式,对称水平分割和对称竖直分割。二叉树的叶节点被称为CU,JEM的CU在预测和变换的过程中都不可以被进一步划分,也就是说JEM的CU,PU,TU具有相同的块大小。在现阶段的JEM中,CTU的最大尺寸为256×256亮度像素。
在一些可行的实施方式中,视频编码器20可利用预定义扫描次序来扫描经量化变换系数以产生可经熵编码的串行化向量。在其它可行的实施方式中,视频编码器20可执行自适应性扫描。在扫描经量化变换系数以形成一维向量之后,视频编码器20可根据上下文自适应性可变长度解码(CAVLC)、上下文自适应性二进制算术解码(CABAC)、基于语法的上下文自适应性二进制算术解码(SBAC)、概率区间分割熵(PIPE)解码或其他熵解码方法来熵解码一维向量。视频编码器20也可熵编码与经编码视频数据相关联的语法元素以供视频解码器30用于解码视频数据。
图2A为本申请实施例中视频编码器20的一种示意性框图。一并参阅图3,视频编码器20可执行图像预测过程,尤其是视频编码器20中的运动补偿单元44可执行图像预测过程。
如图2A所示,视频编码器20可以包括:预测模块41、求和器50、变换模块52、量化模块54和熵编码模块56。在一种示例下,预测模块41可以包括运动估计单元42、运动补偿单元44和帧内预测单元46,本申请实施例对预测模块41的内部结构不作限定。可选的,对于混合架构的视频编码器,视频编码器20也可以包括反量化模块58、反变换模块60和求和器62。
在图2A的一种可行的实施方式下,视频编码器20还可以包括分割单元(未示意)和参考图像存储器64,应当理解的是,分割单元和参考图像存储器64也可以设置在视频编码器20之外;
在另一种可行的实施方式下,视频编码器20还可以包括滤波器(未示意)以对块边界进行滤波从而从经重构建视频中去除块效应伪影。在需要时,滤波器将通常对求和器62的输出进行滤波。
如图2A所示,视频编码器20接收视频数据,且分割单元将数据分割成图像块。此分割也可包含分割成条带、图像块或其它较大单元,例如根据LCU及CU的四叉树结构进行图像块分割。一般来说,条带可划分成多个图像块。
预测模块41用于生成当前编码图像块的预测块。预测模块41可基于编码质量与代价计算结果(例如,码率-失真代价,RDcost)选择当前图像块的多个可能解码模式中的一者,例如多个帧内解码模式中的一者或多个帧间解码模式中的一者。预测模块41可将所得经帧内解码或经帧间解码块提供到求和器50以产生残差块数据且将所得经帧内译码或经帧间译码块提供到求和器62以重构建经编码块从而用作参考图像。
预测模块41内的运动估计单元42及运动补偿单元44执行相对于一个或多个参考图像中的一个或多个预测块的当前图像块的帧间预测性解码以提供时间压缩。运动估计单元42用于根据视频序列的预定模式确定视频条带的帧间预测模式。预定模式可将序列中的视频条带指定为P条带、B条带或GPB条带。运动估计单元42及运动补偿单元44可高度集成,但为概念目的而分别说明。通过运动估计单元42所执行的运动估计为产生估计图像块的运动矢量的过程。例如,运动矢量可指示当前视频帧或图像内的图像块的PU相对于参考图像内的预测块的位移。
预测块为依据像素差而被发现为紧密匹配待解码的图像块的PU的块,像素差可通过绝对差和(SAD)、平方差和(SSD)或其它差度量确定。在一些可行的实施方式中,视频编码器20可计算存储于参考图像存储器64中的参考图像的子整数(sub-integer)像素位置的值。
运动估计单元42通过比较PU的位置与参考图像的预测块的位置而计算经帧间解码条带中的图像块的PU的运动矢量。可从第一参考图像列表(列表0)或第二参考图像列表(列表1)选择参考图像,列表中的每一者识别存储于参考图像存储器64中的一个或多个参考图像。运动估计单元42将经计算运动矢量发送到熵编码模块56及运动补偿单元44。
由运动补偿单元44执行的运动补偿可涉及基于由运动估计所确定的运动矢量提取或产生预测块,可能执行到子像素精确度的内插。在接收当前图像块的PU的运动矢量后,运动补偿单元44即可在参考图像列表中的一者中定位运动矢量所指向的预测块。视频编码器20通过从正经解码的当前图像块的像素值减去预测块的像素值来形成残差图像块,从而形成像素差值。像素差值形成块的残差数据,且可包含亮度及色度差分量两者。求和器50表示执行此减法运算的一个或多个组件。运动补偿单元44也可产生与图像块及视频条带相关联的语法元素以供视频解码器30用于解码视频条带的图像块。下文将结合图3、图10-12、图14-17对本申请实施例的图像预测过程进行详细的介绍,这里不再赘述。
预测模块41内的帧内预测单元46可执行相对于在与待解码的当前块相同的图像或条带中的一个或多个相邻块的当前图像块的帧内预测性解码以提供空间压缩。因此,作为通过运动估计单元42及运动补偿单元44执行的帧间预测(如前文所描述)的替代,帧内预测单元46可帧内预测当前块。明确地说,帧内预测单元46可确定用以编码当前块的帧内预测模式。在一些可行的实施方式中,帧内预测单元46可(例如)在单独编码遍历期间使用各种帧内预测模式来编码当前块,且帧内预测单元46(或在一些可行的实施方式中,模式选择单元40)可从经测试模式选择使用的适当帧内预测模式。
在预测模块41经由帧间预测或帧内预测产生当前图像块的预测块之后,视频编码器20通过从当前图像块减去预测块而形成残差图像块。残差块中的残差视频数据可包含于 一个或多个TU中且应用于变换模块52。变换模块52用于对当前编码图像块的原始块和当前图像块的预测块之间的残差进行变换。变换模块52使用例如离散余弦变换(DCT)或概念上类似的变换(例如,离散正弦变换DST)将残差数据变换成残差变换系数。变换模块52可将残差视频数据从像素域转换到变换域(例如,频域)。
变换模块52可将所得变换系数发送到量化模块54。量化模块54对变换系数进行量化以进一步减小码率。在一些可行的实施方式中,量化模块54可接着执行包含经量化变换系数的矩阵的扫描。替代地,熵编码模块56可执行扫描。
在量化之后,熵编码模块56可熵编码经量化变换系数。例如,熵编码模块56可执行上下文自适应性可变长度解码(CAVLC)、上下文自适应性二进制算术解码(CABAC)、基于语法的上下文自适应性二进制算术解码(SBAC)、概率区间分割熵(PIPE)解码或另一熵编码方法或技术。熵编码模块56也可熵编码正经编码的当前视频条带的运动矢量及其它语法元素。在通过熵编码模块56进行熵编码之后,可将经编码码流传输到视频解码器30或存档以供稍后传输或由视频解码器30检索。
反量化模块58及反变换模块60分别应用反量化及反变换,以在像素域中重构建残差块以供稍后用作参考图像的参考块。求和器62将经重构建残差块与通过预测模块41所产生的预测块相加以产生重建块,并作为参考块以供存储于参考图像存储器64中。这些参考块可由运动估计单元42及运动补偿单元44用作参考块以帧间预测后续视频帧或图像中的块。
应当理解的是,视频编码器20的其它的结构变化可用于编码视频流。例如,对于某些图像块或者图像帧,视频编码器20可以直接地量化残差信号而不需要经变换模块52处理,相应地也不需要经反变换模块58处理;或者,对于某些图像块或者图像帧,视频编码器20没有产生残差数据,相应地不需要经变换模块52、量化模块54、反量化模块58和反变换模块60处理;或者,视频编码器20可以将经重构图像块作为参考块直接地进行存储而不需要经滤波器单元处理;或者,视频编码器20中量化模块54和反量化模块58可以合并在一起;或者,视频编码器20中变换模块52和反变换模块60可以合并在一起;或者,求和器50和求和器62可以合并在一起。
图2B为本申请实施例中视频解码器30的一种示意性框图。一并参阅图3、图10-12、图14-17,视频解码器30可执行图像预测过程,尤其是视频解码器30中的运动补偿单元82可执行图像预测过程。
如图2B所示,视频解码器30可以包括熵解码模块80、预测处理模块81、反量化模块86、反变换模块88和重建模块90。在一种示例下,预测模块81可以包括运动补偿单元82和帧内预测单元84,本申请实施例对此不作限定。
在一种可行的实施方式中,视频解码器30还可以包括参考图像存储器92。应当理解的是,参考图像存储器92也可以设置在视频解码器30之外。在一些可行的实施方式中,视频解码器30可执行与关于来自图2A的视频编码器20描述的编码流程的示例性地互逆的解码流程。
在解码过程期间,视频解码器30从视频编码器20接收表示经编码视频条带的图像块及相关联的语法元素的经编码视频码流。视频解码器30可在视频条带层级和/或图像块层 级处接收语法元素。视频解码器30的熵解码模块80对位流/码流进行熵解码以产生经量化的系数和一些语法元素。熵解码模块80将语法元素转发到预测处理模块81。本申请中,在一种示例下,这里的语法元素可以包括与当前图像块相关的帧间预测数据,该帧间预测数据可以包括索引标识block_based_index,以指示当前图像块使用的是哪一个运动信息(亦称为当前图像块的初始运动信息);可选的,还可以包括开关标志block_based_enable_flag,以表示是否对当前图像块采用图3或14进行图像预测(换言之,即以表示是否对当前图像块采用本申请提出的MVD镜像约束条件下进行帧间预测),或是否对当前图像块采用图12或16进行图像预测(换言之,即以表示是否对当前图像块采用本申请提出的基于时域距离的比例关系下进行帧间预测)。
当视频条带被解码为经帧内解码(I)条带时,预测处理模块81的帧内预测单元84可基于发信号通知的帧内预测模式和来自当前帧或图像的先前经解码块的数据而产生当前视频条带的图像块的预测块。当视频条带被解码为经帧间解码(即,B或P)条带时,预测处理模块81的运动补偿单元82可基于从熵解码模块80接收到的语法元素,确定用于对当前视频条带的当前图像块进行解码的帧间预测模式,基于确定的帧间预测模式,对所述当前图像块进行解码(例如执行帧间预测)。具体的,运动补偿单元82可确定对当前视频条带的当前图像块采用哪一种图像预测方法进行预测,例如语法元素指示采用基于MVD镜像约束条件的图像预测方法来对当前图像块进行预测,预测或修正当前视频条带的当前图像块的运动信息,从而通过运动补偿过程使用预测出的当前图像块的运动信息来获取或生成当前图像块的预测块。这里的运动信息可以包括参考图像信息和运动矢量,其中参考图像信息可以包括但不限于单向/双向预测信息,参考图像列表号和参考图像列表对应的参考图像索引。对于帧间预测,可从参考图像列表中的一者内的参考图像中的一者产生预测块。视频解码器30可基于存储在参考图像存储器92中的参考图像来建构参考图像列表,即列表0和列表1。当前图像的参考帧索引可包含于参考帧列表0和列表1中的一或多者中。在一些实例中,可以是视频编码器20发信号通知指示采用哪一种新的图像预测方法。
本实施例中,预测处理模块81用于生成当前解码图像块的预测块;具体的,在视频条带经解码为经帧内解码(I)条带时,预测模块81的帧内预测单元84可基于用信号发送的帧内预测模式及来自当前帧或图像的先前经解码图像块的数据而产生当前视频条带的图像块的预测块。在视频图像经解码为经帧间解码(例如,B、P或GPB)条带时,预测模块81的运动补偿单元82基于从熵编码单元80所接收的运动矢量及其它语法元素而产生当前视频图像的图像块的预测块。
反量化模块86将在位流中提供且由熵解码模块80解码的经量化变换系数逆量化,即去量化。逆量化过程可包括:使用由视频编码器20针对视频条带中的每个图像块计算的量化参数来确定应施加的量化程度以及同样地确定应施加的逆量化程度。反变换模块88将逆变换应用于变换系数,例如逆DCT、逆整数变换或概念上类似的逆变换过程,以便产生像素域中的残差块。
在运动补偿单元82产生用于当前图像块的预测块之后,视频解码器30通过将来自反变换模块88的残差块与由运动补偿单元82产生的对应预测块求和以得到重建的块,即经 解码图像块。求和器90表示执行此求和操作的组件。在需要时,还可使用环路滤波器(在解码环路中或在解码环路之后)来使像素转变平滑或者以其它方式改进视频质量。滤波器单元(未示意)可以表示一或多个环路滤波器,例如去块滤波器、自适应环路滤波器(ALF)以及样本自适应偏移(SAO)滤波器。并且,还可以将给定帧或图像中的经解码图像块存储在经解码图像缓冲器92中,经解码图像缓冲器92存储用于后续运动补偿的参考图像。经解码图像缓冲器92可为存储器的一部分,其还可以存储经解码视频,以供稍后在显示装置(例如图1的显示装置32)上呈现,或可与此类存储器分开。
应当理解的是,视频解码器30的其它结构变化可用于解码经编码视频位流。例如,视频解码器30可以不经滤波器单元处理而生成输出视频流;或者,对于某些图像块或者图像帧,视频解码器30的熵解码模块80没有解码出经量化的系数,相应地不需要经反量化模块86和反变换模块88处理。例如,视频解码器30中反量化模块86和反变换模块88可以合并在一起。
图3是本申请实施例的图像预测方法的示意性流程图。图3所示的方法可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图3所示的方法既可以发生在编码过程,也可以发生在解码过程,更具体地,图3所示的方法可以发生在编解码时的帧间预测过程。过程300可由视频编码器20或视频解码器30执行,具体的,可以由视频编码器20或视频解码器30的运动补偿单元来执行。假设具有多个视频帧的视频数据流正在使用视频编码器或者视频解码器,执行包括如下步骤的过程300来预测当前视频帧的当前图像块的像素值的预测值;
图3所示的方法包括步骤301至步骤304,下面对步骤301至步骤304进行详细的介绍。
301、获取当前图像块的初始运动信息。
这里的图像块可以是待处理图像中的一个图像块,也可以是待处理图像中的一个子图像。另外,这里的图像块可以是编码过程中待编码的图像块,也可以是解码过程中待解码的图像块。
另外,上述初始运动信息可以包括预测方向的指示信息(通常为双向预测),指向参考图像块的运动矢量(通常为相邻块的运动矢量)和参考图像块所在图像信息(通常理解为参考图像信息),其中,运动矢量包括前向运动矢量和后向运动矢量,参考图像信息包括前向预测参考图像块和后向预测参考图像块的参考帧索引信息。
在获取图像块的初始运动信息时,可以采用多种方式进行,例如,可以采用下面的方式一和方式二来获取图像块的初始运动信息。
方式一:
一并参阅图4和图5,在帧间预测的合并模式下,根据当前图像块的相邻块的运动信息构建候选运动信息列表,并从该候选运动信息列表中选择某个候选运动信息作为当前图像块的初始运动信息。其中,候选运动信息列表包含运动矢量、参考帧索引信息等等。例如,选择相邻块A0的运动信息(参见图5中index为0的候选运动信息)作为当前图像块的初始运动信息,具体地,将A0的前向运动矢量作为当前块的前向预测运动矢量,将A0的后向运动矢量作为当前块的后向预测运动矢量。
方式二:
在帧间预测的非合并模式下,根据当前图像块的相邻块的运动信息构建运动矢量预测值列表,并从该运动矢量预测值列表中选择某个运动矢量作为当前图像块的运动矢量预测值。在这种情况下,当前图像块的运动矢量可以为相邻块的运动矢量值,也可以为所选取的相邻块的运动矢量与当前图像块的运动矢量差的和,其中,运动矢量差通过对当前图像块进行运动估计所得到的运动矢量与所选取的相邻块的运动矢量的差。例如,选择运动矢量预测值列表中的索引1和2对应的运动矢量作为当前图像块的前向运动矢量和后向运动矢量。
应理解,上述方式一和方式二只是获取图像块的初始运动信息的具体两种方式,本申请对获取预测块的运动信息的方式不做限定,任何能够获取图像块的初始运动信息的方式都在本申请的保护范围内。
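上述方式一与方式二可以用如下Python代码示意(仅为示意性草图,函数名与数据结构均为本文示意所设,并非标准定义):合并模式按索引直接从候选运动信息列表中取初始运动信息,非合并(AMVP)模式则把所选运动矢量预测值与运动矢量差相加。

```python
def merge_initial_motion(candidate_list, merge_index):
    """方式一(合并模式):按 index 从候选运动信息列表中取初始运动信息。"""
    return candidate_list[merge_index]

def amvp_motion_vector(mvp, mvd):
    """方式二(非合并模式):运动矢量 = 所选相邻块的运动矢量预测值 + 运动矢量差。"""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

例如,运动矢量预测值为(4,-2)、运动矢量差为(1,3)时,重建出的当前块运动矢量为(5,1)。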
302、基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数。
一并参阅图6,本申请实施例涉及的当前图像块所属的当前图像存在一前一后的两个参考图像,即前向参考图像和后向参考图像。
在一种示例下,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
相应地,步骤302可以包括:
根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点(以图8中的(0,0)示意),在所述前向参考图像中确定(N-1)个候选前向参考块的位置;
根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置;以所述初始后向参考块的位置作为第二搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置。
在一种示例下,一并参阅图7,所述N个前向参考块的位置包括一个初始前向参考块的位置(以(0,0)示意)和(N-1)个候选前向参考块的位置(以(0,-1)、(-1,-1)、(-1,1)、(1,-1)和(1,1)等示意),每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离(如图8所示)或者分数像素距离,其中N=9;或者,所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离,其中N=9。
一并参阅图8,在运动估计或者运动补偿过程中,MV的精度可以是分数像素精度(例如,1/2像素精度,或者1/4像素精度)。如果图像中只有整数像素的像素值,并且当前MV的精度为分数像素精度,则需要通过参考图像的整像素位置的像素值,采用插值滤波器进行插值,得到分像素位置的像素值,作为当前块的预测块的值。具体插值操作过程与使用的插值滤波器有关,一般来说,可以对参考像素点周围的整数像素点的像素值做线性加权得到参考像素点的值。常用的插值滤波器有4抽头,6抽头,8抽头等。
如图7所示,A_{i,j}为整像素位置的像素点,其位宽为bitDepth。a_{0,0}、b_{0,0}、c_{0,0}、d_{0,0}、h_{0,0}、n_{0,0}、e_{0,0}、i_{0,0}、p_{0,0}、f_{0,0}、j_{0,0}、q_{0,0}、g_{0,0}、k_{0,0}和r_{0,0}为分像素位置的像素点。若采用8抽头插值滤波器,则a_{0,0}可以通过下面的公式计算得到:
a_{0,0}=(C_0*A_{-3,0}+C_1*A_{-2,0}+C_2*A_{-1,0}+C_3*A_{0,0}+C_4*A_{1,0}+C_5*A_{2,0}+C_6*A_{3,0}+C_7*A_{4,0})>>shift1
在上述公式中,C_k(k=0,1,…,7)为插值滤波器的系数,如果插值滤波器的系数和为2的N次方,那么插值滤波器的增益为N比特,例如,N为6表示插值滤波器增益为6比特。shift1为右移位数,shift1可以设置为bitDepth-8,其中,bitDepth为目标位宽,这样根据上述公式最终得到的预测块的像素值的位宽为bitDepth+6-shift1=14比特。
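以8抽头插值为例,上述公式可以按如下方式示意。这里的系数取HEVC中常用的半像素亮度插值系数(其和为64,即2的6次方,增益6比特);函数名与参数组织为本文示意所设:

```python
# 假设采用的 8 抽头半像素插值系数,系数和为 64(增益 6 比特)
COEFFS = (-1, 4, -11, 40, 40, -11, 4, -1)

def interp_a00(row_samples, bit_depth):
    """按 a_{0,0} = (sum(C_k * A_{k-3,0})) >> shift1 计算一个分像素位置的中间值。

    row_samples 为 A_{-3,0} 到 A_{4,0} 共 8 个整像素值,shift1 = bitDepth - 8。"""
    shift1 = bit_depth - 8
    acc = sum(c * s for c, s in zip(COEFFS, row_samples))
    return acc >> shift1
```

对于8比特输入(shift1=0),平坦区域像素值100得到中间值6400,位宽为14比特,与上文bitDepth+6-shift1=14的结论一致。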
303、基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N。
一并参阅图9,前向参考图像Ref0中的候选前向参考块904的位置相对于初始前向参考块902(即前向搜索基点)的位置的位置偏移MVD0(delta0x,delta0y)。后向参考图像Ref1中的候选后向参考块905的位置相对于初始后向参考块903(即后向搜索基点)的位置的位置偏移MVD1(delta1x,delta1y)。
MVD0=-MVD1;即:
delta0x=-delta1x;
delta0y=-delta1y;
在不同示例下,步骤303可以包括:
从M对参考块(一个前向参考块和一个后向参考块)的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,其中所述M小于或等于N。另外,在比较前向参考块的像素值与后向参考块的像素值的差异时,可以采用绝对误差和(Sum of absolute differences,SAD)、绝对变换误差和(Sum of absolute transformation differences,SATD)或者绝对平方差和等来衡量。
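步骤303中"匹配误差最小"的选择可以用SAD示意如下(示意性草图:参考块以一维像素序列表示,函数名为本文假设):

```python
def sad(block_a, block_b):
    """绝对误差和(SAD):逐像素求差的绝对值之和,用于衡量前后向参考块的差异。"""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def pick_best_pair(pairs):
    """从 M 对(前向参考块, 后向参考块)中选出匹配误差(SAD)最小的一对的下标。"""
    costs = [sad(fwd, bwd) for fwd, bwd in pairs]
    return costs.index(min(costs))
```

若改用SATD或绝对平方差和,只需替换这里的代价函数,选择逻辑不变。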
304、基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
在一种示例下,步骤304中,对所述目标前向参考块的像素值和所述目标后向参考块的像素值进行加权处理,得到所述当前图像块的像素值的预测值。
可选地,作为一个实施例,图3所示的方法还包括:获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新 的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。其中当前图像块的更新的运动信息可以是基于所述目标前向参考块的位置、目标后向参考块的位置和当前图像块的位置得到的,或者,可以是基于所述确定的一对参考块位置对应的第一位置偏移和第二位置偏移得到的。
通过对图像块的运动矢量进行了更新,这样就使得在进行下次图像预测时可以根据该图像块对其它图像块进行有效的预测。
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移成镜像关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。
下面结合图10对本申请实施例的图像预测方法进行详细的描述。
图10是本申请实施例的图像预测方法的示意性流程图。图10所示的方法可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图10所示的方法既可以发生在编码过程,也可以发生在解码过程,更具体地,图10所示的方法可以发生在编解码时的帧间预测过程。
图10所示的方法包括步骤1001至步骤1007,下面对步骤1001至步骤1007进行详细的介绍。
1001、获得当前块的初始运动信息。
例如,对于帧间预测/编码模式为merge的图像块,根据merge的index从merge candidate list中获取一组运动信息,此运动信息便为当前块的初始运动信息。例如,对于帧间预测/编码模式为AMVP的图像块,根据AMVP的index从MVP candidate list中获取MVP,对该MVP与码流中包含的MVD求和得到当前块的MV。初始运动信息包括参考图像指示信息以及运动矢量,通过参考图像指示信息确定前向参考图像以及后向参考图像。通过运动矢量确定前向参考块的位置以及后向参考块的位置。
1002、在前向参考图像中确定当前图像块的起始前向参考块的位置,所述起始前向参考块的位置为前向参考图像中的搜索起点(亦称为搜索基点);
具体地,根据前向MV以及当前块的位置信息,获得前向参考图像中的搜索基点(下文称为第一搜索基点)。例如,前向MV信息为(MV0x,MV0y)。当前块的位置信息为(B0x,B0y)。则前向参考图像的第一搜索基点为(MV0x+B0x,MV0y+B0y)。
1003、在后向参考图像中确定当前图像块的起始后向参考块的位置,所述起始后向参考块的位置为后向参考图像中的搜索起点;
具体地,根据后向MV以及当前块的位置信息,获得后向参考图像中的搜索基点(下 文称为第二搜索基点)。例如,后向MV为(MV1x,MV1y)。当前块的位置信息为(B0x,B0y)。则后向参考图像的第二搜索基点为(MV1x+B0x,MV1y+B0y)。
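步骤1002与1003中搜索基点的计算就是运动矢量与当前块位置的逐分量相加,可示意如下(变量组织为本文假设):

```python
def search_base(mv, block_pos):
    """搜索基点 = 运动矢量 + 当前块位置,例如前向基点为 (MV0x+B0x, MV0y+B0y)。"""
    return (mv[0] + block_pos[0], mv[1] + block_pos[1])
```

例如前向MV为(3,-1)、当前块位置为(64,32)时,第一搜索基点为(67,31)。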
1004、在MVD镜像约束条件下,确定最匹配的一对参考块(即一个前向参考块和一个后向参考块)的位置,并得到最佳前向运动矢量和最佳后向运动矢量;
这里的MVD镜像约束条件可以解释为,前向参考图像中的块位置相对于前向搜索基点的位置偏移MVD0(delta0x,delta0y)。后向参考图像中的块位置相对于后向搜索基点的位置偏移MVD1(delta1x,delta1y)。满足以下关系:
MVD0=-MVD1;即:
delta0x=-delta1x;
delta0y=-delta1y;
请参阅图7,在所述的前向参考图像中,以搜索基点(以(0,0)示意)为起点,进行整数像素步长的运动搜索。整像素步长是指候选参考块的位置相对于搜索基点的位置偏移为整数像素距离。需要指出的是,不管搜索基点是否为整数像素点(起始点可以是整像素,或亚像素,如:1/2,1/4,1/8,1/16等),都可以先进行整数像素步长运动搜索,以得到当前图像块的前向参考块的位置。应理解,在以整像素步长进行搜索时,搜索起始点既可以整像素也可以是分像素,例如,整像素,1/2像素,1/4像素,1/8像素以及1/16像素等等。
如图7所示,以(0,0)点为搜索基点,对搜索基点周围的8个整数像素步长的搜索点进行搜索,得到对应的候选参考块的位置。图7示意出了8个候选参考块;如果前向参考图像中前向候选参考块的位置相对于前向搜索基点的位置的位置偏移为(-1,-1),则后向参考图像中对应后向候选参考块的位置相对于后向搜索基点的位置的位置偏移为(1,1)。据此获得成对的前向候选参考块以及后向候选参考块的位置。针对所得一对参考块的位置,计算对应的两个候选参考块之间的匹配代价。取匹配代价最小的前向参考块和后向参考块作为最优前向参考块以及最优后向参考块,并得到最优前向运动矢量和最优后向运动矢量。
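上述MVD镜像约束下的整像素步长搜索可以用如下草图示意(get_fwd_block、get_bwd_block为本文假设的取块函数:输入相对搜索基点的偏移,返回对应候选参考块的像素序列;匹配代价以SAD示意):

```python
OFFSETS = [(0, 0), (0, -1), (-1, -1), (-1, 0), (-1, 1),
           (0, 1), (1, 1), (1, 0), (1, -1)]   # 搜索基点及其周围 8 个整像素偏移

def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def mirror_search(get_fwd_block, get_bwd_block):
    """在 MVD 镜像约束(MVD1 = -MVD0)下遍历成对候选块,返回最优偏移及其匹配代价。"""
    best_off, best_cost = None, None
    for dx, dy in OFFSETS:
        # 前向候选块取偏移 (dx, dy),后向候选块取镜像偏移 (-dx, -dy)
        cost = sad(get_fwd_block(dx, dy), get_bwd_block(-dx, -dy))
        if best_cost is None or cost < best_cost:
            best_off, best_cost = (dx, dy), cost
    return best_off, best_cost
```

返回的最优偏移即修正量MVD0,据此即可得到最优前向运动矢量与(取其相反数后的)最优后向运动矢量。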
1005-1006、使用步骤1004所得到的最优前向运动矢量进行运动补偿过程,得到最优前向参考块的像素值;使用步骤1004所得到的最优后向运动矢量进行运动补偿过程,得到最优后向参考块的像素值。
1007、对所得到的最优前向参考块的像素值和最优后向参考块的像素值进行加权,得到当前图像块的像素值的预测值。
具体地,例如,可以根据公式(2)得到当前图像块的像素值的预测值。
predSamples’[x][y]=(predSamplesL0’[x][y]+predSamplesL1’[x][y]+1)>>1 (2)
其中,predSamplesL0’为最优前向参考块,predSamplesL1’为最优后向参考块,predSamples’为当前图像块的预测块,predSamplesL0’[x][y]为最优前向参考块在像素点(x,y)的像素值,predSamplesL1’[x][y]为最优后向参考块在像素点(x,y)的像素值,predSamples’[x][y]为最终预测块在像素点(x,y)的像素值。
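公式(2)的逐像素加权(此处为带舍入的取整平均)可示意如下(块按一维像素序列示意):

```python
def bipred_average(pred_l0, pred_l1):
    """按公式 (2) 计算 predSamples'[x][y] = (L0 + L1 + 1) >> 1 的逐像素平均。"""
    return [(a + b + 1) >> 1 for a, b in zip(pred_l0, pred_l1)]
```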
需要注意的是,本申请实施例中,不限定使用哪一种搜索方法,可为任意搜索方法。针对搜索得到的每个前向候选块,计算所述前向候选块与步骤1004中对应的后向候选块之间的差异,选择SAD最小的前向候选块和其对应的前向运动矢量以及后向候选块和其对应的后向运动矢量,作为最优的前向预测块和对应的最优前向运动矢量以及最优的后向预测块和对应的最优后向运动矢量。或者,针对搜索得到的每个后向候选块,计算所述后向候选块与步骤1004中对应的前向候选块之间的差异,选择SAD最小的后向候选块和其对应的后向运动矢量以及前向候选块和其对应的前向运动矢量,作为最优的后向参考块和对应的最优后向运动矢量以及最优的前向参考块和对应的最优前向运动矢量。
需要注意的是,步骤1004中,仅给出了整数像素步长搜索方法的例子。实际上,除了进行整数像素步长搜索以外,还可以使用分数像素步长搜索。例如,在步骤1004进行整数像素步长搜索以后,再进行分数像素步长的搜索。或者,直接进行分数像素步长的搜索。此处并不对具体的搜索方法进行限定。
需要注意的是,本申请实施例中,不限定使用匹配代价计算的方法,例如可以使用SAD准则,也可以使用MR-SAD准则,也可以使用其他准则。另外,计算匹配代价的时候,可以仅使用亮度分量去计算,也可以同时使用亮度和色度分量计算。
需要注意的是,在搜索过程中,如果出现匹配代价为0或者达到预设的门限值时,则可以提前终止遍历操作或者搜索操作。在此,对搜索方法的提前终止条件不作限定。
应理解,步骤1005与步骤1006的顺序不做限制,可以同时进行,也可以先后进行。
可见,相对于现有方法中需要先计算模板匹配块,使用模板匹配块分别做前向搜索以及后向搜索,本申请实施例,在寻找匹配块的过程中,使用前向参考图像中的候选块以及后向参考图像中的候选块直接计算匹配代价,确定匹配代价最小的两个块,简化了图像预测流程,在提高图像预测准确性的同时降低了复杂度。
图11是本申请实施例的图像预测方法的示意性流程图。图11所示的方法可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图11所示的方法包括步骤1101至步骤1105,其中步骤1101至1103、1105参见图10中步骤1001至1003、1007的描述,这里不再赘述。
本申请实施例相比于图10所示的实施例的区别为,在搜索过程中,保留并更新当前最优前向参考块和最优后向参考块的像素值。在搜索完成以后,可使用当前的最优前向参考块和最优后向参考块的像素值计算当前图像块的像素值的预测值。
例如,需要遍历N对参考块的位置。Costi为第i次的匹配代价,MinCost表示当前最小的匹配代价值。Bfi,Bbi分别为第i次取得的前向参考块的像素值和后向参考块的像素值。BestBf,BestBb分别为当前最优的前向参考块的像素值和后向参考块的像素值。CalCost(M,N)表示块M和块N的匹配代价。
当开始搜索时(i=0),MinCost=Cost0=CalCost(Bf0,Bb0),BestBf=Bf0,BestBb=Bb0;
在后续遍历其他对参考块时,实时更新。例如,进行第i(i>0)次搜索时,如果Costi<MinCost,则BestBf=Bfi,BestBb=Bbi,MinCost=Costi;否则不更新。
搜索结束时,使用BestBf,BestBb得到当前块的像素值的预测值。
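上述"遍历中保留并更新当前最优参考块"的过程可示意如下(CalCost此处以SAD代替,仅为示意):

```python
def traverse_pairs(pairs):
    """遍历 N 对参考块,维护 MinCost 以及当前最优的 BestBf、BestBb。

    pairs 为 [(Bf_i, Bb_i), ...],返回 (BestBf, BestBb, MinCost)。"""
    def cal_cost(m, n):                       # CalCost(M, N),此处以 SAD 示意
        return sum(abs(a - b) for a, b in zip(m, n))
    best_bf, best_bb = pairs[0]
    min_cost = cal_cost(best_bf, best_bb)     # 开始搜索时用第 0 对初始化
    for bf, bb in pairs[1:]:
        cost = cal_cost(bf, bb)
        if cost < min_cost:                   # 仅当 Costi < MinCost 时更新
            best_bf, best_bb, min_cost = bf, bb, cost
    return best_bf, best_bb, min_cost
```

搜索结束后,直接用返回的BestBf、BestBb加权即可得到当前块的像素预测值,无需再做一次运动补偿取块。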
图12是本申请实施例的图像预测方法的示意性流程图。图12所示的方法可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图12所示的方法既可以发生在编码过程,也可以发生在解码过程,更具体地,图12所示的方法可以发生在编解码时的帧间预测过程。过程1200可由视频编码器20或视频解码器30执行,具体的,可以由视频编码器20或视频解码器30的运动补偿单元来执行。假设具有多个视频帧的视频数据流正在使用视频编码器或者视频解码器,执行包括如下步骤的过程1200来预测当前视频帧的当前图像块的像素值的预测值;
图12所示的方法包括步骤1201至步骤1204,其中步骤1201、1202和1204参见图3中步骤301、302和304的描述,这里不再赘述。
本申请实施例相比于图3所示的实施例的区别为,步骤1203、基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
一并参阅图13,前向参考图像Ref0中的候选前向参考块1304的位置相对于初始前向参考块1302(即前向搜索基点)的位置的位置偏移MVD0(delta0x,delta0y)。后向参考图像Ref1中的候选后向参考块1305的位置相对于初始后向参考块1303(即后向搜索基点)的位置的位置偏移MVD1(delta1x,delta1y)。
在搜索过程中,两个匹配块的位置偏移满足镜像关系,镜像关系需要考虑时域间隔。这里TC,T0,T1分别表示当前帧的时刻,前向参考图像的时刻,后向参考图像的时刻。TD0,TD1表示两个时刻之间的时间间隔。
TD0=TC-T0
TD1=TC-T1
具体编解码过程中,TD0,TD1可以使用图像序列计数(picture order count,POC)计算。例如:
TD0=POCc-POC0
TD1=POCc-POC1
这里,POCc,POC0,POC1分别表示当前图像的POC,前向参考图像的POC,以及后向参考图像的POC。TD0表示当前图像与前向参考图像之间的图像序列计数(picture order count,POC)距离;TD1表示当前图像与后向参考图像之间的POC距离。
delta0=(delta0x,delta0y)
delta1=(delta1x,delta1y)
考虑时域间隔的镜像关系描述如下:
delta0x=(TD0/TD1)*delta1x;
delta0y=(TD0/TD1)*delta1y;
或者
delta0x/delta1x=(TD0/TD1);
delta0y/delta1y=(TD0/TD1);
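上述基于时域距离的比例关系可以示意如下(用POC差计算TD0、TD1;为保持示意简单,这里直接使用浮点除法,实际编解码器中通常采用整数定点运算):

```python
def scale_delta0_from_delta1(delta1, poc_c, poc0, poc1):
    """按 delta0 = (TD0/TD1) * delta1 由后向偏移推得前向偏移。

    TD0 = POCc - POC0,TD1 = POCc - POC1。"""
    td0, td1 = poc_c - poc0, poc_c - poc1
    return (td0 * delta1[0] / td1, td0 * delta1[1] / td1)
```

例如当前图像POC为8、前向参考POC为4(TD0=4)、后向参考POC为10(TD1=-2)时,TD0/TD1=-2,后向偏移(1,-2)对应的前向偏移为(-2,4):方向相反,且幅值按时域距离的比例放大。当TD0=-TD1(前后向时域距离相等)时,该关系退化为前述纯镜像关系。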
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移具有基于时域距离的比例关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。
上面所述实施例中,搜索过程进行了一次。除此以外,也可以通过迭代的方法,进行多轮搜索。具体的,在每轮搜索获得前向参考块和后向参考块以后,可以根据当前经修正的MV,再进行一轮或者多轮搜索。
下面结合图14对本申请实施例的图像预测方法的流程进行详细的介绍。与图3所示的方法类似,图14所示的方法也可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图14所示的方法可以发生在编码过程,也可以发生在解码过程,具体地,图14所示的方法可以发生在编码过程或者解码时的帧间预测过程。
图14所示的方法具体包括步骤1401至步骤1404,如下:
1401、获取当前图像块的第i轮运动信息;
这里的图像块可以是待处理图像中的一个图像块,也可以是待处理图像中的一个子图像。另外,这里的图像块可以是编码过程中待编码的图像块,也可以是解码过程中待解码的图像块。
如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;
如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量。
另外,上述初始运动信息可以包括预测方向的指示信息(通常为双向预测),指向参考图像块的运动矢量(通常为相邻块的运动矢量)和参考图像块所在图像信息(通常理解为参考图像信息),其中,运动矢量包括前向运动矢量和后向运动矢量,参考图像信息包括前向预测参考图像块和后向预测参考图像块的参考帧索引信息。
在获取图像块的初始运动信息时,可以采用多种方式进行,例如,可以采用下面的方式一和方式二来获取图像块的初始运动信息。
方式一:
一并参阅图4和图5,在帧间预测的合并模式下,根据当前图像块的相邻块的运动信息构建候选运动信息列表,并从该候选运动信息列表中选择某个候选运动信息作为当前图 像块的初始运动信息。其中,候选运动信息列表包含运动矢量、参考帧索引信息等等。例如,选择相邻块A0的运动信息(参见图5中index为0的候选运动信息)作为当前图像块的初始运动信息,具体地,将A0的前向运动矢量作为当前块的前向预测运动矢量,将A0的后向运动矢量作为当前块的后向预测运动矢量。
方式二:
在帧间预测的非合并模式下,根据当前图像块的相邻块的运动信息构建运动矢量预测值列表,并从该运动矢量预测值列表中选择某个运动矢量作为当前图像块的运动矢量预测值。在这种情况下,当前图像块的运动矢量可以为相邻块的运动矢量值,也可以为所选取的相邻块的运动矢量与当前图像块的运动矢量差的和,其中,运动矢量差通过对当前图像块进行运动估计所得到的运动矢量与所选取的相邻块的运动矢量的差。例如,选择运动矢量预测值列表中的索引1和2对应的运动矢量作为当前图像块的前向运动矢量和后向运动矢量。
应理解,上述方式一和方式二只是获取图像块的初始运动信息的具体两种方式,本申请对获取预测块的运动信息的方式不做限定,任何能够获取图像块的初始运动信息的方式都在本申请的保护范围内。
1402、根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数。
在一种示例下,所述第i轮运动信息包括前向运动矢量和前向参考图像索引,以及后向运动矢量和后向参考图像索引;
相应地,步骤1402可以包括:
根据所述前向运动矢量和所述当前图像块的位置在所述前向参考图像索引对应的前向参考图像中确定当前图像块的第i-1轮目标前向参考块的位置;以所述第i-1轮目标前向参考块的位置作为第i_f搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置;
根据所述后向运动矢量和所述当前图像块的位置在所述后向参考图像索引对应的后向参考图像中确定当前图像块的第i-1轮目标后向参考块的位置;以所述第i-1轮目标后向参考块的位置作为第i_b搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置。
在一种示例下,一并参阅图7,所述N个前向参考块的位置包括第i-1轮目标前向参考块的位置(以(0,0)示意)和(N-1)个候选前向参考块的位置(以(0,-1)、(-1,-1)、(-1,1)、(1,-1)和(1,1)等示意),每个候选前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移为整数像素距离(如图8所示)或者分数像素距离,其中N=9;或者,所述N个后向参考块的位置包括第i-1轮目标后向参考块的位置和所述(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述第i-1轮目标后向参考块的位置的位置偏移为整数像素距离或者分数像素距离,其中N=9。
1403、基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个 前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N。
其中,所述第一位置偏移与第二位置偏移成镜像关系,可以理解为:所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
一并参阅图9,前向参考图像Ref0中的候选前向参考块904的位置相对于第i-1轮目标前向参考块902(即前向搜索基点)的位置的位置偏移MVD0(delta0x,delta0y)。后向参考图像Ref1中的候选后向参考块905的位置相对于第i-1轮目标后向参考块903(即后向搜索基点)的位置的位置偏移MVD1(delta1x,delta1y)。
MVD0=-MVD1;即:
delta0x=-delta1x;
delta0y=-delta1y;
在不同示例下,步骤1403可以包括:
从M对参考块(一个前向参考块和一个后向参考块)的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者,从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块的位置,其中所述M小于或等于N。另外,在比较前向参考块的像素值与后向参考块的像素值的差异时,可以采用绝对误差和(Sum of absolute differences,SAD)、绝对变换误差和(Sum of absolute transformation differences,SATD)或者绝对平方差和等来衡量。
1404、基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
在一种示例下,步骤1404中,对所述目标前向参考块的像素值和所述目标后向参考块的像素值进行加权处理,得到所述当前图像块的像素值的预测值。另外,在本申请中,还可以根据其它方法来得到当前图像块的像素值的预测值,本申请对此不做限定。
通过对图像块的运动矢量进行了更新,比如初始运动信息更新为第2轮运动信息,其中第2轮运动信息包括:指向第1轮目标前向参考块的位置的前向运动矢量和指向第1轮目标后向参考块的位置的后向运动矢量,…,这样就使得在进行下次图像预测时可以根据该图像块对其它图像块进行有效的预测。
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移成镜像关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预 测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。此外,增加迭代次数,可以进一步提高修正MV的准确度,从而进一步提高编解码性能。
下面结合图15对本申请实施例的图像预测方法的流程进行详细的介绍。图15所示的方法也可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图15所示的方法可以发生在编码过程,也可以发生在解码过程,具体地,图15所示的方法可以发生在编码过程或者解码时的帧间预测过程。
图15所示的方法具体包括步骤1501至步骤1508,下面分别对步骤1501至步骤1508进行详细的描述。
1501、获取当前图像块的初始运动信息;
例如,如果是首轮搜索,则使用当前块的初始运动信息。例如,对于编码模式为merge的图像块,根据merge的index从merge candidate list中获取运动信息,此运动信息便为当前块的初始运动信息。例如,对于编码模式为AMVP的图像块,根据AMVP的index从MVP candidate list中获取MVP,对该MVP与码流中包含的MVD求和得到当前块的MV。如果不是首轮搜索,则使用上一轮更新的MV信息。运动信息包括参考图像指示信息以及运动矢量信息。通过参考图像指示信息确定前向参考图像以及后向参考图像。通过运动矢量信息确定前向参考块的位置以及后向参考块的位置。
1502、确定前向参考图像中的搜索基点;
根据前向MV信息以及当前块的位置信息,确定前向参考图像中的搜索基点。具体与图10或11的实施例的过程类似。例如,前向MV信息为(MV0x,MV0y),当前块的位置信息为(B0x,B0y),则前向参考图像中的搜索基点为(MV0x+B0x,MV0y+B0y)。
1503、确定后向参考图像中的搜索基点
根据后向MV信息以及当前块的位置信息,确定后向参考图像中的搜索基点。具体与图10或11的实施例的过程类似。例如,后向MV信息为(MV1x,MV1y),当前块的位置信息为(B0x,B0y),则后向参考图像的搜索基点为(MV1x+B0x,MV1y+B0y)。
1504、在前向参考图像、后向参考图像中,确定MVD镜像约束条件下最匹配的一对参考块(即一个前向参考块和一个后向参考块)的位置,并得到当前图像块的经修正的前向运动矢量和经修正的后向运动矢量;
具体的搜索步骤与图10或11的实施例的过程类似,这里不再赘述。
1505、判断是否达到迭代终止条件,如果没有达到,则执行步骤1502和1503。否则,执行步骤1506和1507。
这里并不限定迭代搜索的终止条件的设计,例如可以遍历完设定的迭代次数L,或者达到其他的迭代终止条件。例如:如果当前迭代操作完成以后,出现MVD0接近或等于0且MVD1接近或等于0的情况,例如MVD0=(0,0),MVD1=(0,0),则可以终止迭代操作。其中,L为预设值,且L为大于1的整数。L可以是在对图像进行预测之前预先设置好的数值,也可以根据图像预测的精度以及搜索预测块的复杂度来设置,还可以根据历史经验值来设定,或者根据中间搜索过程中的结果的验证情况来确定。
例如,在本实施例中一共以整像素步长进行2次搜索,其中,在第一次搜索时,可以以初始前向参考块的位置作为搜索基点,在前向参考图像(亦称为前向参考区域)中确定(N-1)个候选前向参考块的位置;以及以初始后向参考块的位置作为搜索基点,在后向参考图像(亦称为后向参考区域)中确定(N-1)个候选后向参考块的位置,针对N对参考块的位置中的一对或多对参考块位置,计算对应的两个参考块的匹配代价,比如,计算初始前向参考块与初始后向参考块的匹配代价,以及计算满足MVD镜像约束条件的一个候选前向参考块和一个候选后向参考块的匹配代价,从而得到第一次搜索的第1轮目标前向参考块和第1轮目标后向参考块的位置,进而得到更新的运动信息,其包括:表示当前图像块的位置指向第1轮目标前向参考块的位置的前向运动矢量和表示当前图像块的位置指向第1轮目标后向参考块的位置的后向运动矢量。应当理解的是,更新的运动信息与初始运动信息中参考帧索引等信息相同。接下来再进行第二次搜索,以第1轮目标前向参考块的位置作为搜索基点,在前向参考图像(亦称为前向参考区域)中确定(N-1)个候选前向参考块的位置;以及以第1轮目标后向参考块的位置作为搜索基点,在后向参考图像(亦称为后向参考区域)中确定(N-1)个候选后向参考块的位置,针对N对参考块的位置中的一对或多对参考块位置,计算对应的两个参考块的匹配代价,比如,计算第1轮目标前向参考块与第1轮目标后向参考块的匹配代价,计算满足MVD镜像约束条件的一个候选前向参考块和一个候选后向参考块的匹配代价,从而得到第二次搜索的第2轮目标前向参考块和第2轮目标后向参考块的位置,进而得到更新的运动信息,其包括:表示当前图像块的位置指向第2轮目标前向参考块的位置的前向运动矢量和表示当前图像块的位置指向第2轮目标后向参考块的位置的后向运动矢量。应当理解的是,更新的运动信息与初始运动信息中参考帧索引等其它信息相同。当预设的迭代次数L=2时,这里的第二次搜索过程中,第2轮目标前向参考块和第2轮目标后向参考块就是最终得到的目标前向参考块和目标后向参考块(亦称为最优前向参考块和最优后向参考块)。
1506至1507、使用步骤1504所得到的最优前向运动矢量进行运动补偿过程,以得到最优前向参考块的像素值;使用步骤1504所得到的最优后向运动矢量进行运动补偿过程,以得到最优后向参考块的像素值。
1508、根据步骤1506和1507中所得到的最优前向参考块的像素值和最优后向参考块的像素值,获得当前图像块的像素值的预测值。
在步骤1504中,在前向参考图像或者后向参考图像中进行搜索时可以以整像素步长进行搜索(或者称为运动搜索),以得到至少一个前向参考块的位置以及至少一个后向参考块的位置。在以整像素步长进行搜索时,搜索起始点既可以整像素也可以是分像素,例如,整像素,1/2像素,1/4像素,1/8像素以及1/16像素等等。
另外,在步骤1504中搜索至少一个前向参考块的位置和至少一个后向参考块的位置时,也可以直接以分像素步长进行搜索,或者,既进行整像素步长搜索又进行分像素步长搜索。本申请对搜索方法不做限定。
在步骤1504中,针对每对参考块的地址,在计算成对应关系的前向参考块的像素值与对应的一个后向参考块的像素值的差异时,可以采用SAD、SATD或者绝对平方差和等 来衡量成每个前向参考块的像素值与对应的后向参考块的像素值的差异。但是本申请不限于此。
在根据最优前向参考块和最优后向参考块确定当前图像块的像素值的预测值时,可以对步骤1506和步骤1507得到的最优前向参考块的像素值和最优后向参考块的像素值进行加权处理,并将加权处理后得到的像素值作为当前图像块的像素值的预测值。
具体地,可以根据公式(8)得到当前图像块的像素值的预测值。
predSamples’[x][y]=(predSamplesL0’[x][y]+predSamplesL1’[x][y]+1)>>1 (8)
其中,predSamplesL0’[x][y]为最优前向参考块在像素点(x,y)的像素值,predSamplesL1’[x][y]为最优后向参考块在像素点(x,y)的像素值,predSamples’[x][y]为当前图像块在像素点(x,y)的像素预测值。
一并参阅图11,本申请实施例的迭代搜索过程中,还可以保留并更新当前最优前向参考块的像素值和最优后向参考块的像素值。在搜索完成以后,直接使用当前的最优前向参考块和最优后向参考块的像素值计算当前图像块的像素值的预测值。在这种实现方式下,步骤1506和1507是可选的步骤。
例如,需要遍历N对参考块的位置。Costi为第i次的匹配代价,MinCost表示当前最小的匹配代价值。Bfi,Bbi分别为第i次取得的前向参考块的像素值和后向参考块的像素值。BestBf,BestBb分别为当前最优的前向参考块的像素值和后向参考块的像素值。CalCost(M,N)表示块M和块N的匹配代价。
当开始搜索时(i=0),MinCost=Cost0=CalCost(Bf0,Bb0),BestBf=Bf0,BestBb=Bb0;
在后续遍历其他对参考块时,实时更新。例如,进行第i(i>0)次搜索时,如果Costi<MinCost,则BestBf=Bfi,BestBb=Bbi,MinCost=Costi;否则不更新。
搜索结束时,使用BestBf,BestBb得到当前块的像素值的预测值。
上面图12所示的实施例中,搜索过程进行了一次。除此以外,也可以通过迭代的方法,进行多轮搜索。具体的,在每轮搜索获得前向参考块和后向参考块以后,可以根据当前经修正的MV,再进行一轮或者多轮搜索。
下面结合图16对本申请实施例的图像预测方法1600的流程进行详细的介绍。图16所示的方法也可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图16所示的方法可以发生在编码过程,也可以发生在解码过程,具体地,图16所示的方法可以发生在编码过程或者解码时的帧间预测过程。
图16所示的方法1600包括步骤1601至步骤1604,其中步骤1601、1602和1604参见图14中步骤1401、1402和1404的描述,这里不再赘述。
本申请实施例相比于图14所示的实施例的区别为,步骤1603、基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置 的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
一并参阅图13,前向参考图像Ref0中的候选前向参考块1304的位置相对于初始前向参考块1302(即前向搜索基点)的位置的位置偏移MVD0(delta0x,delta0y)。后向参考图像Ref1中的候选后向参考块1305的位置相对于初始后向参考块1303(即后向搜索基点)的位置的位置偏移MVD1(delta1x,delta1y)。
在搜索过程中,两个匹配块的位置偏移满足镜像关系,镜像关系需要考虑时域间隔。这里TC,T0,T1分别表示当前帧的时刻,前向参考图像的时刻,后向参考图像的时刻。TD0,TD1表示两个时刻之间的时间间隔。
TD0=TC-T0
TD1=TC-T1
具体编解码过程中,TD0,TD1可以使用图像序列计数(picture order count,POC)计算。例如:
TD0=POCc-POC0
TD1=POCc-POC1
这里,POCc,POC0,POC1分别表示当前图像的POC,前向参考图像的POC,以及后向参考图像的POC。TD0表示当前图像与前向参考图像之间的图像序列计数(picture order count,POC)距离;TD1表示当前图像与后向参考图像之间的POC距离。
delta0=(delta0x,delta0y)
delta1=(delta1x,delta1y)
考虑时域间隔的镜像关系描述如下:
delta0x=(TD0/TD1)*delta1x;
delta0y=(TD0/TD1)*delta1y;
或者
delta0x/delta1x=(TD0/TD1);
delta0y/delta1y=(TD0/TD1);
在不同示例下,步骤1603可以包括:
从M对参考块(一个前向参考块和一个后向参考块)的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者,从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块的位置,其中所述M小于或等于N。另外,在比较前向参考块的像素值与后向参考块的像素值的差异时,可以采用绝对误差和(Sum of absolute differences,SAD)、绝对变换误差和(Sum of absolute transformation differences,SATD)或者绝对平方差和等来衡量。
可见,本申请实施例中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对所述N对参考块的位置中的每一对参考块的位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏 移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移具有基于时域距离的比例关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。此外,增加迭代次数,可以进一步提高修正MV的准确度,从而进一步提高编解码性能。
下面结合图17对本申请实施例的图像预测方法的流程进行详细的介绍。图17所示的方法也可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图17所示的方法可以发生在编码过程,也可以发生在解码过程,具体地,图17所示的方法可以发生在编码过程或者解码时的帧间预测过程。
图17所示的方法包括步骤1701至步骤1708,其中步骤1701至1703、1705至1708参见图15中步骤1501至1503、1505至1508的描述,这里不再赘述。
本申请实施例相比于图15所示的实施例的区别为,
1704、在MVD基于时域距离的镜像约束条件下,确定最匹配的一对参考块(即一个前向参考块和一个后向参考块)的位置,并得到当前图像块的经修正的前向运动矢量和经修正的后向运动矢量;
这里的MVD基于时域距离的镜像约束条件可以解释为,前向参考图像中的块位置相对于前向搜索基点的位置偏移MVD0(delta0x,delta0y)与后向参考图像中的块位置相对于后向搜索基点的位置偏移MVD1(delta1x,delta1y)满足以下关系:
两个匹配块的位置偏移满足基于时域距离的镜像关系。这里TC,T0,T1分别表示当前图像的时刻,前向参考图像的时刻,后向参考图像的时刻。TD0,TD1表示两个时刻之间的时间间隔。
TD0=TC-T0
TD1=TC-T1
具体编解码过程中,TD0,TD1可以使用图像序列计数(picture order count,POC)计算。例如:
TD0=POCc-POC0
TD1=POCc-POC1
这里,POCc,POC0,POC1分别表示当前图像的POC,前向参考图像的POC,以及后向参考图像的POC。TD0表示当前图像与前向参考图像之间的图像序列计数(picture order count,POC)距离;TD1表示当前图像与后向参考图像之间的POC距离。
delta0=(delta0x,delta0y)
delta1=(delta1x,delta1y)
考虑时域距离(亦可称为时域间隔)的镜像关系描述如下:
delta0x=(TD0/TD1)*delta1x;
delta0y=(TD0/TD1)*delta1y;
或者
delta0x/delta1x=(TD0/TD1);
delta0y/delta1y=(TD0/TD1)。
具体的搜索步骤与图10或11的实施例的过程类似,这里不再赘述。
应当理解的是,本申请实施例中,镜像关系要么考虑时域间隔,要么不考虑时域间隔。实际使用时,可以自适应的选择当前帧或者当前块进行运动矢量修正时,镜像关系是否考虑时域间隔。
例如,可以在序列级头信息(SPS),图像级头信息(PPS),或者条带头(slice header),或者块码流信息中添加指示信息以指示当前序列,或者当前图像,或者当前条带(Slice),或者当前块使用的镜像关系是否考虑时间间隔。
或者,当前块自适应的根据前向参考图像的POC以及后向参考图像的POC自适应的判断当前块使用的镜像关系是否考虑时间间隔。
例如:如果|POCc-POC0|-|POCc-POC1|>T,则使用的镜像关系需要考虑时间间隔,否则不考虑时间间隔,这里T为预设的门限值。例如T=2,或者T=3。具体的T值,在此不做限定。
又例如:如果|POCc-POC0|和|POCc-POC1|中的较大值和|POCc-POC0|和|POCc-POC1|中的较小值的比值大于门限值R,
即:
(Max(|POCc-POC0|,|POCc-POC1|)/Min(|POCc-POC0|,|POCc-POC1|))>R。
这里Max(A,B)表示A和B中的较大值,Min(A,B)表示A和B中的较小值。
则使用的镜像关系需要考虑时间间隔,否则不考虑时间间隔,这里R为预设的门限值。例如R=2,或者R=3。具体的R值,在此不做限定。
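上述两种自适应判据可示意如下(门限T、R的具体取值为假设值,函数名为本文示意所设):

```python
def consider_gap_by_diff(poc_c, poc0, poc1, t=2):
    """判据一:|POCc-POC0| - |POCc-POC1| > T 时,镜像关系考虑时间间隔。"""
    return abs(poc_c - poc0) - abs(poc_c - poc1) > t

def consider_gap_by_ratio(poc_c, poc0, poc1, r=2):
    """判据二:前后向 POC 距离中较大值与较小值之比 > R 时,考虑时间间隔。"""
    d0, d1 = abs(poc_c - poc0), abs(poc_c - poc1)
    return max(d0, d1) / min(d0, d1) > r
```

两种判据择一使用即可:当前后向POC距离相差不大时不考虑时间间隔(退化为纯镜像),相差较大时按时域距离比例缩放偏移。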
应理解,本申请实施例的图像预测方法可以具体由编码器(例如编码器20)或者解码器(例如解码器30)中的运动补偿模块来执行。另外,本申请实施例的图像预测方法可以在需要对视频图像进行编码和/或解码的任何电子设备或者装置内实施。
下面结合图18至21对本申请实施例的图像预测装置进行详细的描述。
图18是本申请实施例的一种图像预测装置的示意性框图。需要说明的是,预测装置1800既适用于解码视频图像的帧间预测,也适用于编码视频图像的帧间预测,应当理解的是,这里的预测装置1800可以对应于图2A中的运动补偿单元44,或者可以对应于图2B中的运动补偿单元82,该预测装置1800可以包括:
第一获取单元1801,用于获取当前图像块的初始运动信息;
第一搜索单元1802,用于基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M 对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
第一预测单元1803,用于基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
其中,所述第一位置偏移与第二位置偏移成镜像关系,可以理解为所述第一位置偏移量与第二位置偏移量相同,例如所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
较优地,本申请实施例的装置1800中,所述第一预测单元1803还用于获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。
可见,对图像块的运动矢量进行了更新,这样就使得在进行下次图像预测时可以根据该图像块对其它图像块进行有效的预测。
本申请实施例的装置1800中,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,
所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
本申请实施例的装置1800中,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
在所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,所述第一搜索单元具体用于:
根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置;以所述初始后向参考块的位置作为第二搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
本申请实施例的装置1800中,在所述基于匹配代价准则,从M对参考块的位置中确 定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置的方面,所述第一搜索单元1802具体用于:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,其中所述M小于或等于N。
应理解,上述装置1800可执行上述图3、图10和图11所示的方法,装置1800具体可以是视频编码装置、视频解码装置、视频编解码系统或者其他具有视频编解码功能的设备。装置1800既可以用于在编码过程中进行图像预测,也可以用于在解码过程中进行图像预测。
详细细节请参见本文中对图像预测方法的介绍,为简洁起见,这里不再赘述。
可见,本申请实施例的预测装置中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对所述N对参考块的位置中的每对参考块位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移成镜像关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。
图19是本申请实施例的另一种图像预测装置的示意性框图。需要说明的是,预测装置1900既适用于解码视频图像的帧间预测,也适用于编码视频图像的帧间预测,应当理解的是,这里的预测装置1900可以对应于图2A中的运动补偿单元44,或者可以对应于图2B中的运动补偿单元82,该预测装置1900可以包括:
第二获取单元1901,用于获取当前图像块的初始运动信息;
第二搜索单元1902,用于基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
第二预测单元1903,用于基于所述目标前向参考块的像素值和所述目标后向参考块 的像素值,得到所述当前图像块的像素值的预测值。
针对每对参考块,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,可以理解为:
针对每对参考块,第一位置偏移与第二位置偏移的比例关系是基于第一时域距离与第二时域距离比例关系而确定的,其中第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
作为一种实现方式,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,可以包括:
如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
较佳地,本实施例装置中,所述第二预测单元1903还用于获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。
可见,本申请实施例能获得经修正过的当前图像块的运动信息,提高当前图像块运动信息的准确度,这也将有利于其他图像块的而预测,例如提升其它图像块的运动信息的预测准确性等。
作为一种实现方式,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,
所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
作为一种实现方式,所述初始运动信息包括前向预测运动信息和后向预测运动信息;
在所述基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,所述第二搜索单元1902具体用于:
根据所述前向预测运动信息和当前图像块的位置在前向参考图像中确定N个前向参考块的位置,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;
根据所述后向预测运动信息和当前图像块的位置在后向参考图像中确定N个后向参 考块的位置,所述N个后向参考块的位置包括初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
作为另一种实现方式,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
在所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,所述第二搜索单元具体用于:
根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置;以所述初始后向参考块的位置作为第二搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
作为一种实现方式,在所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置的方面,所述第二搜索单元1902具体用于:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,所述M小于或等于N。
在一种示例下,所述匹配代价准则为匹配代价最小化的准则。例如,针对M对参考块的位置,计算每对参考块中前向参考块的像素值与后向参考块的像素值的差异;从所述M对参考块的位置中,确定像素值差异最小的一对参考块的位置为所述当前图像块的前向目标参考块的位置以及后向目标参考块的位置。
在另一种示例下,所述匹配代价准则为匹配代价与提前终止准则。例如,针对第n对参考块(一个前向参考块与一个后向参考块)的位置,计算所述前向参考块的像素值与后向参考块的像素值的差异,n为大于或等于1,且小于或等于N的整数;当像素值差异小于或等于匹配误差阈值时,确定第n对参考块(一个前向参考块与一个后向参考块)的位置为所述当前图像块的前向目标参考块的位置以及后向目标参考块的位置。
作为一种实现方式,第二获取单元1901用于从当前图像块的候选运动信息列表中获取所述初始运动信息,或者,根据指示信息获取所述初始运动信息,所述指示信息用于指示当前图像块的初始运动信息。应当理解的是,初始运动信息是相对于经修正的运动信息而言的。
应理解,上述装置1900可执行上述图12所示的方法,装置1900可以是视频编码装 置、视频解码装置、视频编解码系统或者其他具有视频编解码功能的设备。装置1900既可以用于在编码过程中进行图像预测,也可以用于在解码过程中进行图像预测。
详细细节请参见本文中对图像预测方法的介绍,为简洁起见,这里不再赘述。
可见,本申请实施例的预测装置中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对所述N对参考块的位置中的每对参考块位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移具有基于时域距离的比例关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。
图20是本申请实施例的另一种图像预测装置的示意性框图。需要说明的是,预测装置2000既适用于解码视频图像的帧间预测,也适用于编码视频图像的帧间预测,应当理解的是,这里的预测装置2000可以对应于图2A中的运动补偿单元44,或者可以对应于图2B中的运动补偿单元82,该预测装置2000可以包括:
第三获取单元2001,用于获取当前图像块的第i轮运动信息;
第三搜索单元2002,用于根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
第三预测单元2003,用于根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
需要说明的是,如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;相应地,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量;相应地,所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
作为本申请实施例,所述第三预测单元2003具体用于当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。其中,迭代终止条件的说明参见其它实施例,这里不再赘述。
本申请实施例的装置中,所述第一位置偏移与第二位置偏移成镜像关系,可以理解为:所述第一位置偏移量与第二位置偏移量相同,例如,所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
在一种实现方式下,所述第i轮运动信息包括前向运动矢量和前向参考图像索引,以及后向运动矢量和后向参考图像索引;
在所述根据所述第i轮运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,所述第三搜索单元2002具体用于:
根据所述前向运动矢量和所述当前图像块的位置在所述前向参考图像索引对应的前向参考图像中确定当前图像块的第i-1轮目标前向参考块的位置;以所述第i-1轮目标前向参考块的位置作为第i_f搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述后向运动矢量和所述当前图像块的位置在所述后向参考图像索引对应的后向参考图像中确定当前图像块的第i-1轮目标后向参考块的位置;以所述第i-1轮目标后向参考块的位置作为第i_b搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和所述(N-1)个候选后向参考块的位置。
在一种实现方式下,在所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置的方面,所述第三搜索单元2002具体用于:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块 的位置,其中所述M小于或等于N。
应理解,上述装置2000可执行上述图14和图15所示的方法,装置2000具体可以是视频编码装置、视频解码装置、视频编解码系统或者其他具有视频编解码功能的设备。装置2000既可以用于在编码过程中进行图像预测,也可以用于在解码过程中进行图像预测。
详细细节请参见本文中对图像预测方法的介绍,为简洁起见,这里不再赘述。
可见,本申请实施例的预测装置中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对N对参考块的位置中的每对参考块位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移成镜像关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。此外,增加迭代次数,可以进一步提高修正MV的准确度,从而进一步提高编解码性能。
图21是本申请实施例的另一种图像预测装置的示意性框图。需要说明的是,预测装置2100既适用于解码视频图像的帧间预测,也适用于编码视频图像的帧间预测,应当理解的是,这里的预测装置2100可以对应于图2A中的运动补偿单元44,或者可以对应于图2B中的运动补偿单元82,该预测装置2100可以包括:
第四获取单元2101,用于获取当前图像块的第i轮运动信息;
第四搜索单元2102,用于根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考图像中,所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考图像中,所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
第四预测单元2103,用于根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
在迭代搜索过程中,如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;
如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运 动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量。
作为一种实现方式,所述第四预测单元2103具体用于:当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
本实施例的装置中,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,可以理解为:
如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
作为一种实现方式,所述第i轮运动信息包括前向运动矢量和前向参考图像索引,以及后向运动矢量和后向参考图像索引;相应地,在所述根据所述第i轮运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,第四搜索单元2102具体用于:
根据所述前向运动矢量和所述当前图像块的位置在所述前向参考图像索引对应的前向参考图像中确定当前图像块的第i-1轮目标前向参考块的位置;以所述第i-1轮目标前向参考块的位置作为第i_f搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和所述(N-1)个候选前向参考块的位置;
根据所述后向运动矢量和所述当前图像块的位置在所述后向参考图像索引对应的后向参考图像中确定当前图像块的第i-1轮目标后向参考块的位置;以所述第i-1轮目标后向参考块的位置作为第i_b搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和所述(N-1)个候选后向参考块的位置。
作为一种实现方式,在所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置的方面,第四搜索单元2102具体用于:
从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者
从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块的位置,其中所述M小于或等于N。
应理解,上述装置2100可执行上述图16或17所示的方法,装置2100可以是视频编码装置、视频解码装置、视频编解码系统或者其他具有视频编解码功能的设备。装置2100既可以用于在编码过程中进行图像预测,也可以用于在解码过程中进行图像预测。
详细细节请参见本文中对图像预测方法的介绍,为简洁起见,这里不再赘述。
可见,本申请实施例的预测装置中,位于前向参考图像中的N个前向参考块的位置和位于后向参考图像中N个后向参考块的位置形成N对参考块的位置,针对N对参考块的位置中的每对参考块位置,前向参考块的位置相对于初始前向参考块的位置的第一位置偏移,与,后向参考块的位置相对于初始后向参考块的位置的第二位置偏移具有基于时域距离的比例关系,在此基础上,从N对参考块的位置中确定(例如匹配代价最小的)一对参考块的位置为当前图像块的目标前向参考块(亦即最佳前向参考块/前向预测块)的位置和目标后向参考块(亦即最佳后向参考块/后向预测块)的位置,从而基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。相对于现有技术,本申请实施例方法避免了预先计算模板匹配块的计算过程以及避免了使用模板匹配块分别进行前向搜索匹配以及后向搜索匹配的过程,简化了图像预测过程,从而在提高图像预测准确性的同时,降低了图像预测的复杂度。此外,增加迭代次数,可以进一步提高修正MV的准确度,从而进一步提高编解码性能。
图22为本申请实施例的视频编码设备或视频解码设备(简称为译码设备2200)的一种实现方式的示意性框图。其中,译码设备2200可以包括处理器2210、存储器2230和总线系统2250。其中,处理器和存储器通过总线系统相连,该存储器用于存储指令,该处理器用于执行该存储器存储的指令。编码设备的存储器存储程序代码,且处理器可以调用存储器中存储的程序代码执行本申请描述的各种视频编码或解码方法,尤其是在各种帧间预测模式或帧内预测模式下的视频编码或解码方法,以及在各种帧间或帧内预测模式下预测运动信息的方法。为避免重复,这里不再详细描述。
在本申请实施例中,该处理器2210可以是中央处理单元(Central Processing Unit,简称为“CPU”),该处理器2210还可以是其他通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现成可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
该存储器2230可以包括只读存储器(ROM)设备或者随机存取存储器(RAM)设备。任何其他适宜类型的存储设备也可以用作存储器2230。存储器2230可以包括由处理器2210使用总线2250访问的代码和数据2231。存储器2230可以进一步包括操作系统2233和应用程序2235,该应用程序2235包括允许处理器2210执行本申请描述的视频编码或解码方法(尤其是本申请描述的图像预测方法)的至少一个程序。例如,应用程序2235可以包括应用1至N,其进一步包括执行在本申请描述的视频编码或解码方法的视频编码或解码应用(简称视频译码应用)。
该总线系统2250除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都标为总线系统2250。
可选的,译码设备2200还可以包括一个或多个输出设备,诸如显示器2270。在一个示例中,显示器2270可以是触感显示器或触摸显示屏,其将显示器与可操作地感测触摸输入的触感单元合并。显示器2270可以经由总线2250连接到处理器2210。
需要说明的是，对相同步骤或者相同术语的解释和限定同样适用于不同实施例间，为了简洁，本文适当省略重复的描述。
本领域技术人员能够领会,结合本文公开描述的各种说明性逻辑框、模块和算法步骤所描述的功能可以硬件、软件、固件或其任何组合来实施。如果以软件来实施,那么各种说明性逻辑框、模块、和步骤描述的功能可作为一或多个指令或代码在计算机可读媒体上存储或传输,且由基于硬件的处理单元执行。计算机可读媒体可包含计算机可读存储媒体,其对应于有形媒体,例如数据存储媒体,或包括任何促进将计算机程序从一处传送到另一处的媒体(例如,根据通信协议)的通信媒体。以此方式,计算机可读媒体大体上可对应于(1)非暂时性的有形计算机可读存储媒体,或(2)通信媒体,例如信号或载波。数据存储媒体可为可由一或多个计算机或一或多个处理器存取以检索用于实施本申请中描述的技术的指令、代码和/或数据结构的任何可用媒体。计算机程序产品可包含计算机可读媒体。
作为实例而非限制,此类计算机可读存储媒体可包括RAM、ROM、EEPROM、CD-ROM或其它光盘存储装置、磁盘存储装置或其它磁性存储装置、快闪存储器或可用来存储指令或数据结构的形式的所要程序代码并且可由计算机存取的任何其它媒体。并且,任何连接被恰当地称作计算机可读媒体。举例来说,如果使用同轴缆线、光纤缆线、双绞线、数字订户线(DSL)或例如红外线、无线电和微波等无线技术从网站、服务器或其它远程源传输指令,那么同轴缆线、光纤缆线、双绞线、DSL或例如红外线、无线电和微波等无线技术包含在媒体的定义中。但是,应理解,所述计算机可读存储媒体和数据存储媒体并不包括连接、载波、信号或其它暂时媒体,而是实际上针对于非暂时性有形存储媒体。如本文中所使用,磁盘和光盘包含压缩光盘(CD)、激光光盘、光学光盘、数字多功能光盘(DVD)和蓝光光盘,其中磁盘通常以磁性方式再现数据,而光盘利用激光以光学方式再现数据。以上各项的组合也应包含在计算机可读媒体的范围内。
可通过例如一或多个数字信号处理器(DSP)、通用微处理器、专用集成电路(ASIC)、现场可编程逻辑阵列(FPGA)或其它等效集成或离散逻辑电路等一或多个处理器来执行相应的功能。因此,如本文中所使用的术语“处理器”可指前述结构或适合于实施本文中所描述的技术的任一其它结构中的任一者。另外,在一些方面中,本文中所描述的各种说明性逻辑框、模块、和步骤所描述的功能可以提供于经配置以用于编码和解码的专用硬件和/或软件模块内,或者并入在组合编解码器中。而且,所述技术可完全实施于一或多个电路或逻辑元件中。在一种示例下,视频编码器20及视频解码器30中的各种说明性逻辑框、单元、模块可以理解为对应的电路器件或逻辑元件。
本申请的技术可在各种各样的装置或设备中实施,包含无线手持机、集成电路(IC)或一组IC(例如,芯片组)。本申请中描述各种组件、模块或单元是为了强调用于执行所揭示的技术的装置的功能方面,但未必需要由不同硬件单元实现。实际上,如上文所描述,各种单元可结合合适的软件和/或固件组合在编码解码器硬件单元中,或者通过互操作硬件单元(包含如上文所描述的一或多个处理器)来提供。
以上所述,仅为本申请示例性的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应该以权利要求的保护范围为准。

Claims (48)

  1. 一种图像预测方法,其特征在于,包括:
    获取当前图像块的初始运动信息;
    基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;
    基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
    基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
  2. 如权利要求1所述的方法,其特征在于,所述第一位置偏移与第二位置偏移成镜像关系,包括:
    所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
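权利要求2所述的镜像关系（方向相反、幅值相同）可用如下示意性代码表达。其中 f0/b0 表示初始前向/后向参考块位置，函数为示意性辅助，非限定性实现：

```python
def mirrored_pairs(f0, b0, offsets):
    """对前向候选偏移 (dx, dy)，后向偏移取 (-dx, -dy)：
    方向相反、幅值相同，从而构成成对的参考块位置。"""
    return [((f0[0] + dx, f0[1] + dy), (b0[0] - dx, b0[1] - dy))
            for dx, dy in offsets]
```

例如，初始位置 f0=(5,5)、b0=(7,7) 时，前向偏移 (1,0) 对应的一对参考块位置为 ((6,5),(6,7))。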
  3. 如权利要求1或2所述的方法,其特征在于,所述方法还包括:
    获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。
  4. 如权利要求1至3任一项所述的方法,其特征在于,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,
    所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
  5. 如权利要求1至4任一项所述的方法,其特征在于,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
    所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置，包括：
    根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
    根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置;以所述初始后向参考块的位置作为第二搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
  6. 如权利要求1至5任一项所述的方法,其特征在于,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,包括:
    从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
    从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,其中所述M小于或等于N。
  7. 如权利要求1至6任一项所述的方法,其特征在于,所述方法用于编码所述当前图像块,所述获取当前图像块的初始运动信息包括:从当前图像块的候选运动信息列表中获取所述初始运动信息;
    或者,所述方法用于解码所述当前图像块,所述获取当前图像块的初始运动信息之前,所述方法还包括:从当前图像块的码流中获取指示信息,所述指示信息用于指示当前图像块的初始运动信息。
  8. 一种图像预测方法,其特征在于,包括:
    获取当前图像块的初始运动信息;
    基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;
    基于匹配代价准则，从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置，其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置，且针对每对参考块的位置，第一位置偏移与第二位置偏移具有基于时域距离的比例关系，所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移；所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移，所述M为大于或等于1的整数，且所述M小于或等于N；
    基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
  9. 如权利要求8所述的方法,其特征在于,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,包括:
    如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
    如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
    其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
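权利要求9中第一、第二位置偏移之间"基于时域距离的比例关系"可示意如下。其中 td_fwd/td_bwd 分别代表上述第一、第二时域距离，函数为假设性示例，非限定性实现：

```python
def scaled_backward_offset(fwd_offset, td_fwd, td_bwd):
    """由第一位置偏移推出第二位置偏移：方向相反；
    两时域距离相同时幅值相同（镜像），
    不同时幅值按 td_bwd/td_fwd 成比例。"""
    dx, dy = fwd_offset
    s = td_bwd / td_fwd
    return (-dx * s, -dy * s)
```

例如，时域距离相等时，前向偏移 (2,-4) 对应的后向偏移为 (-2,4)；前向时域距离为后向的两倍时，对应的后向偏移幅值减半。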
  10. 如权利要求8或9所述的方法,其特征在于,所述方法还包括:
    获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。
  11. 如权利要求8至10任一项所述的方法,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,
    所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
  12. 如权利要求8至11任一项所述的方法,其特征在于,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
    所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,包括:
    根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
    根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置；以所述初始后向参考块的位置作为第二搜索起点，在所述后向参考图像中确定(N-1)个候选后向参考块的位置，其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
  13. 如权利要求8至12任一项所述的方法,其特征在于,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,包括:
    从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
    从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,所述M小于或等于N。
  14. 如权利要求8至13任一项所述的方法,其特征在于,所述方法用于编码所述当前图像块,所述获取当前图像块的初始运动信息包括:从当前图像块的候选运动信息列表中获取所述初始运动信息;
    或者,所述方法用于解码所述当前图像块,所述获取当前图像块的初始运动信息之前,所述方法还包括:从当前图像块的码流中获取指示信息,所述指示信息用于指示当前图像块的初始运动信息。
  15. 一种图像预测方法,其特征在于,包括:
    获取当前图像块的第i轮运动信息;
    根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;
    基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移成镜像关系,所述第一位置偏移表示所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移,所述第二位置偏移表示所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
    根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  16. 如权利要求15所述的方法,其特征在于,如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;
    如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量。
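权利要求15、16所述的按轮迭代（第1轮以初始运动信息确定搜索起点，第i轮（i>1）以第i-1轮目标参考块的位置为起点）可草拟如下。其中 search_round 是代表单轮搜索的假设性辅助函数，"位置不再变化即停止"仅作为迭代终止条件的一个示例：

```python
def iterative_refine(search_round, init_fwd, init_bwd, max_rounds=3):
    """逐轮修正前向/后向目标参考块位置。
    search_round(fwd, bwd) 返回本轮选出的目标位置对。"""
    fwd, bwd = init_fwd, init_bwd
    for _ in range(max_rounds):
        new_fwd, new_bwd = search_round(fwd, bwd)
        if (new_fwd, new_bwd) == (fwd, bwd):
            break  # 迭代终止条件示例：目标位置收敛，不再变化
        fwd, bwd = new_fwd, new_bwd
    return fwd, bwd
```

如说明书所述，增加迭代轮数可进一步提高修正后运动矢量的准确度，但也相应增加搜索开销。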
  17. 如权利要求15或16所述的方法,其特征在于,所述根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数,包括:
    当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  18. 如权利要求15至17任一项所述的方法,其特征在于,所述第一位置偏移与第二位置偏移成镜像关系,包括:
    所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
  19. 如权利要求15至18任一项所述的方法,其特征在于,所述第i轮运动信息包括前向运动矢量和前向参考图像索引,以及后向运动矢量和后向参考图像索引;
    所述根据所述第i轮运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,包括:
    根据所述前向运动矢量和所述当前图像块的位置在所述前向参考图像索引对应的前向参考图像中确定当前图像块的第i-1轮目标前向参考块的位置；以所述第i-1轮目标前向参考块的位置作为第i_f搜索起点，在所述前向参考图像中确定(N-1)个候选前向参考块的位置，其中所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和所述(N-1)个候选前向参考块的位置；
    根据所述后向运动矢量和所述当前图像块的位置在所述后向参考图像索引对应的后向参考图像中确定当前图像块的第i-1轮目标后向参考块的位置；以所述第i-1轮目标后向参考块的位置作为第i_b搜索起点，在所述后向参考图像中确定(N-1)个候选后向参考块的位置，其中所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和所述(N-1)个候选后向参考块的位置。
  20. 如权利要求15至19任一项所述的方法,其特征在于,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置,包括:
    从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者
    从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块的位置,其中所述M小于或等于N。
  21. 一种图像预测方法,其特征在于,包括:
    获取当前图像块的第i轮运动信息;
    根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;
    基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置和第i轮目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考图像中,所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考图像中,所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
    根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  22. 如权利要求21所述的方法,其特征在于,
    如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;
    如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量。
  23. 如权利要求21或22所述的方法,其特征在于,所述根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数,包括:
    当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  24. 如权利要求21至23任一项所述的方法,其特征在于,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,包括:
    如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
    如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
    其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
  25. 一种图像预测装置,其特征在于,包括:
    第一获取单元,用于获取当前图像块的初始运动信息;
    第一搜索单元，用于基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置，所述N个前向参考块位于前向参考图像中，所述N个后向参考块位于后向参考图像中，N为大于1的整数；基于匹配代价准则，从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置，其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置，且针对每对参考块的位置，第一位置偏移与第二位置偏移成镜像关系，所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移，所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移，所述M为大于或等于1的整数，且所述M小于或等于N；
    第一预测单元,用于基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
  26. 如权利要求25所述的装置,其特征在于,所述第一位置偏移与第二位置偏移成镜像关系,包括:所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
  27. 如权利要求25或26所述的装置,其特征在于,所述第一预测单元还用于获得当前图像块的更新的运动信息,所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量,其中所述更新的前向运动矢量指向所述目标前向参考块的位置,所述更新的后向运动矢量指向所述目标后向参考块的位置。
  28. 如权利要求25至27任一项所述的装置,其特征在于,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,
    所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
  29. 如权利要求25至28任一项所述的装置,其特征在于,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
    在所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,所述第一搜索单元具体用于:
    根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
    根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置；以所述初始后向参考块的位置作为第二搜索起点，在所述后向参考图像中确定(N-1)个候选后向参考块的位置，其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
  30. 如权利要求25至29任一项所述的装置,其特征在于,在所述基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置的方面,所述第一搜索单元具体用于:
    从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
    从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,其中所述M小于或等于N。
  31. 一种图像预测装置,其特征在于,包括:
    第二获取单元,用于获取当前图像块的初始运动信息;
    第二搜索单元,用于基于所述初始运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考块的位置相对于初始前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考块的位置相对于初始后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
    第二预测单元,用于基于所述目标前向参考块的像素值和所述目标后向参考块的像素值,得到所述当前图像块的像素值的预测值。
  32. 如权利要求31所述的装置,其特征在于,所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系,包括:
    如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
    如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
    其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
  33. 如权利要求31或32所述的装置，其特征在于，所述第二预测单元还用于获得当前图像块的更新的运动信息，所述更新的运动信息包括更新的前向运动矢量和更新的后向运动矢量，其中所述更新的前向运动矢量指向所述目标前向参考块的位置，所述更新的后向运动矢量指向所述目标后向参考块的位置。
  34. 如权利要求31至33任一项所述的装置,所述N个前向参考块的位置包括一个初始前向参考块的位置和(N-1)个候选前向参考块的位置,每个候选前向参考块的位置相对于所述初始前向参考块的位置的位置偏移为整数像素距离或者分数像素距离;或者,所述N个后向参考块的位置包括一个初始后向参考块的位置和(N-1)个候选后向参考块的位置,每个候选后向参考块的位置相对于所述初始后向参考块的位置的位置偏移为整数像素距离或者分数像素距离。
  35. 如权利要求31至34任一项所述的装置,其特征在于,所述初始运动信息包括前向预测方向的第一运动矢量和第一参考图像索引,以及后向预测方向的第二运动矢量和第二参考图像索引;
    在所述根据所述初始运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,所述第二搜索单元具体用于:
    根据所述第一运动矢量和所述当前图像块的位置在所述第一参考图像索引对应的前向参考图像中确定当前图像块的初始前向参考块的位置;以所述初始前向参考块的位置作为第一搜索起点,在所述前向参考图像中确定(N-1)个候选前向参考块的位置,其中所述N个前向参考块的位置包括所述初始前向参考块的位置和所述(N-1)个候选前向参考块的位置;
    根据所述第二运动矢量和所述当前图像块的位置在所述第二参考图像索引对应的后向参考图像中确定当前图像块的初始后向参考块的位置;以所述初始后向参考块的位置作为第二搜索起点,在所述后向参考图像中确定(N-1)个候选后向参考块的位置,其中所述N个后向参考块的位置包括所述初始后向参考块的位置和所述(N-1)个候选后向参考块的位置。
  36. 如权利要求31至35任一项所述的装置,其特征在于,在所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置的方面,所述第二搜索单元具体用于:
    从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者
    从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,所述M小于或等于N。
  37. 一种图像预测装置,其特征在于,包括:
    第三获取单元,用于获取当前图像块的第i轮运动信息;
    第三搜索单元，用于根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置，所述N个前向参考块位于前向参考图像中，所述N个后向参考块位于后向参考图像中，N为大于1的整数；基于匹配代价准则，从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置，其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置，且针对每对参考块的位置，第一位置偏移与第二位置偏移成镜像关系，所述第一位置偏移表示所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移，所述第二位置偏移表示所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移，所述M为大于或等于1的整数，且所述M小于或等于N；
    第三预测单元,用于根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  38. 如权利要求37所述的装置,其特征在于,如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;
    如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量。
  39. 如权利要求37或38所述的装置,其特征在于,所述第三预测单元具体用于当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  40. 如权利要求37至39任一项所述的装置,其特征在于,所述第一位置偏移与第二位置偏移成镜像关系,包括:
    所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同。
  41. 如权利要求37至40任一项所述的装置,其特征在于,所述第i轮运动信息包括前向运动矢量和前向参考图像索引,以及后向运动矢量和后向参考图像索引;
    在所述根据所述第i轮运动信息和所述当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置的方面,所述第三搜索单元具体用于:
    根据所述前向运动矢量和所述当前图像块的位置在所述前向参考图像索引对应的前向参考图像中确定当前图像块的第i-1轮目标前向参考块的位置；以所述第i-1轮目标前向参考块的位置作为第i_f搜索起点，在所述前向参考图像中确定(N-1)个候选前向参考块的位置，其中所述N个前向参考块的位置包括一个第i-1轮目标前向参考块的位置和所述(N-1)个候选前向参考块的位置；
    根据所述后向运动矢量和所述当前图像块的位置在所述后向参考图像索引对应的后向参考图像中确定当前图像块的第i-1轮目标后向参考块的位置；以所述第i-1轮目标后向参考块的位置作为第i_b搜索起点，在所述后向参考图像中确定(N-1)个候选后向参考块的位置，其中所述N个后向参考块的位置包括一个第i-1轮目标后向参考块的位置和所述(N-1)个候选后向参考块的位置。
  42. 如权利要求37至41任一项所述的装置,其特征在于,在所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置的方面,所述第三搜索单元具体用于:
    从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及第i轮目标后向参考块的位置;或者
    从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的第i轮目标前向参考块的位置以及当前图像块的第i轮目标后向参考块的位置,其中所述M小于或等于N。
  43. 一种图像预测装置,其特征在于,包括:
    第四获取单元,用于获取当前图像块的第i轮运动信息;
    第四搜索单元,用于根据所述第i轮运动信息和当前图像块的位置确定N个前向参考块的位置和N个后向参考块的位置,所述N个前向参考块位于前向参考图像中,所述N个后向参考块位于后向参考图像中,N为大于1的整数;基于匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,其中每对参考块的位置包括一个前向参考块的位置和一个后向参考块的位置,且针对每对参考块的位置,第一位置偏移与第二位置偏移具有基于时域距离的比例关系,所述第一位置偏移表示所述前向参考图像中,所述前向参考块的位置相对于第i-1轮目标前向参考块的位置的位置偏移;所述第二位置偏移表示所述后向参考图像中,所述后向参考块的位置相对于第i-1轮目标后向参考块的位置的位置偏移,所述M为大于或等于1的整数,且所述M小于或等于N;
    第四预测单元,用于根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述当前图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  44. 如权利要求43所述的装置,其特征在于,如果i=1,则所述第i轮运动信息为当前图像块的初始运动信息;
    如果i>1,则所述第i轮运动信息包括:指向第i-1轮目标前向参考块的位置的前向运动矢量和指向第i-1轮目标后向参考块的位置的后向运动矢量。
  45. 如权利要求43或44所述的装置,其特征在于,所述第四预测单元具体用于:当满足迭代终止条件时,根据所述第j轮目标前向参考块的像素值和所述第j轮目标后向参考块的像素值,得到所述图像块的像素值的预测值,其中j大于或等于i,i和j均为大于或等于1的整数。
  46. 如权利要求43至45任一项所述的装置，其特征在于，所述第一位置偏移与第二位置偏移具有基于时域距离的比例关系，包括：
    如果第一时域距离与第二时域距离相同,则所述第一位置偏移的方向与第二位置偏移的方向相反,且第一位置偏移的幅值与第二位置偏移的幅值相同;或者,
    如果第一时域距离与第二时域距离不同,则所述第一位置偏移的方向与第二位置偏移的方向相反,第一位置偏移的幅值与第二位置偏移的幅值之间的比例关系是基于第一时域距离与第二时域距离的比例关系;
    其中,第一时域距离表示当前图像块所属的当前图像与所述前向参考图像之间的时域距离;第二时域距离表示所述当前图像与所述后向参考图像之间的时域距离。
  47. 一种视频编码器,其特征在于,所述视频编码器用于编码图像块,包括:
    帧间预测模块,包括如权利要求25至46任一项所述的图像预测装置,其中所述帧间预测模块用于预测得到所述图像块的像素值的预测值;
    熵编码模块,用于将指示信息编入码流,所述指示信息用于指示所述图像块的初始运动信息;
    重建模块,用于基于所述图像块的像素值的预测值重建所述图像块。
  48. 一种视频解码器,其特征在于,所述视频解码器用于从码流中解码出图像块,包括:
    熵解码模块,用于从码流中解码出指示信息,所述指示信息用于指示当前解码图像块的初始运动信息;
    帧间预测模块,包括如权利要求25至46中任一项所述的图像预测装置,所述帧间预测模块用于预测得到所述图像块的像素值的预测值;
    重建模块,用于基于所述图像块的像素值的预测值重建所述图像块。
PCT/CN2018/124275 2017-12-31 2018-12-27 图像预测方法、装置以及编解码器 WO2019129130A1 (zh)

Priority Applications (15)

Application Number Priority Date Filing Date Title
RU2020125254A RU2772639C2 (ru) 2017-12-31 2018-12-27 Кодек, устройство и способ предсказания изображения
KR1020247001807A KR20240011263A (ko) 2017-12-31 2018-12-27 픽처 예측 방법과 장치, 및 코덱
JP2020536667A JP2021508213A (ja) 2017-12-31 2018-12-27 画像予測の方法および装置、ならびにコーデック
EP18895955.5A EP3734976A4 (en) 2017-12-31 2018-12-27 PROCESS AND DEVICE FOR IMAGE PREDICTION AND CODEC
BR112020012914-3A BR112020012914A2 (pt) 2017-12-31 2018-12-27 Método e aparelho de predição de imagem, e codec
CN201880084937.8A CN111543059A (zh) 2017-12-31 2018-12-27 图像预测方法、装置以及编解码器
AU2018395081A AU2018395081B2 (en) 2017-12-31 2018-12-27 Picture prediction method and apparatus, and codec
SG11202006258VA SG11202006258VA (en) 2017-12-31 2018-12-27 Picture prediction method and apparatus, and codec
KR1020207022351A KR102503943B1 (ko) 2017-12-31 2018-12-27 픽처 예측 방법과 장치, 및 코덱
KR1020237006148A KR102627496B1 (ko) 2017-12-31 2018-12-27 픽처 예측 방법과 장치, 및 코덱
CA3087405A CA3087405A1 (en) 2017-12-31 2018-12-27 Picture prediction method and apparatus, and codec
EP23219999.2A EP4362464A3 (en) 2017-12-31 2018-12-27 Picture prediction method and apparatus, and codec
US16/915,678 US11528503B2 (en) 2017-12-31 2020-06-29 Picture prediction method and apparatus, and codec
US17/994,556 US20230232036A1 (en) 2017-12-31 2022-11-28 Picture prediction method and apparatus, and codec
AU2023204122A AU2023204122A1 (en) 2017-12-31 2023-06-28 Picture prediction method and apparatus, and codec

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711494274.0A CN109996081B (zh) 2017-12-31 2017-12-31 图像预测方法、装置以及编解码器
CN201711494274.0 2017-12-31

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/915,678 Continuation US11528503B2 (en) 2017-12-31 2020-06-29 Picture prediction method and apparatus, and codec

Publications (1)

Publication Number Publication Date
WO2019129130A1 true WO2019129130A1 (zh) 2019-07-04

Family

ID=67066616

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/124275 WO2019129130A1 (zh) 2017-12-31 2018-12-27 图像预测方法、装置以及编解码器

Country Status (11)

Country Link
US (2) US11528503B2 (zh)
EP (2) EP3734976A4 (zh)
JP (2) JP2021508213A (zh)
KR (3) KR20240011263A (zh)
CN (3) CN117336504A (zh)
AU (2) AU2018395081B2 (zh)
BR (1) BR112020012914A2 (zh)
CA (1) CA3087405A1 (zh)
SG (1) SG11202006258VA (zh)
TW (2) TWI828507B (zh)
WO (1) WO2019129130A1 (zh)


Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7282872B2 (ja) * 2018-08-13 2023-05-29 エルジー エレクトロニクス インコーポレイティド ヒストリベースの動きベクトルに基づくインター予測方法及びその装置
CN110545425B (zh) * 2019-08-21 2021-11-16 浙江大华技术股份有限公司 一种帧间预测方法、终端设备以及计算机存储介质
WO2021061023A1 (en) * 2019-09-23 2021-04-01 Huawei Technologies Co., Ltd. Signaling for motion vector refinement
WO2020251418A2 (en) * 2019-10-01 2020-12-17 Huawei Technologies Co., Ltd. Method and apparatus of slice-level signaling for bi-directional optical flow and decoder side motion vector refinement
CN112135127B (zh) * 2019-11-05 2021-09-21 杭州海康威视数字技术股份有限公司 一种编解码方法、装置、设备及机器可读存储介质
CN113452997B (zh) * 2020-03-25 2022-07-29 杭州海康威视数字技术股份有限公司 一种编解码方法、装置及其设备
CN112565753B (zh) * 2020-12-06 2022-08-16 浙江大华技术股份有限公司 运动矢量差的确定方法和装置、存储介质及电子装置
CN114640856B (zh) * 2021-03-19 2022-12-23 杭州海康威视数字技术股份有限公司 解码方法、编码方法、装置及设备
CN113938690B (zh) * 2021-12-03 2023-10-31 北京达佳互联信息技术有限公司 视频编码方法、装置、电子设备及存储介质
US20230199171A1 (en) * 2021-12-21 2023-06-22 Mediatek Inc. Search Memory Management For Video Coding
WO2024010338A1 (ko) * 2022-07-05 2024-01-11 한국전자통신연구원 영상 부호화/복호화를 위한 방법, 장치 및 기록 매체

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1658673A (zh) * 2005-03-23 2005-08-24 南京大学 视频压缩编解码方法
CN101557514A (zh) * 2008-04-11 2009-10-14 华为技术有限公司 一种帧间预测编解码方法、装置及系统
US20120027095A1 (en) * 2010-07-30 2012-02-02 Canon Kabushiki Kaisha Motion vector detection apparatus, motion vector detection method, and computer-readable storage medium
CN104427347A (zh) * 2013-09-02 2015-03-18 苏州威迪斯特光电科技有限公司 网络摄像机视频监控系统图像质量提高方法

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6195389B1 (en) * 1998-04-16 2001-02-27 Scientific-Atlanta, Inc. Motion estimation system and methods
TWI401972B (zh) * 2009-06-23 2013-07-11 Acer Inc 時間性錯誤隱藏方法
US8917769B2 (en) 2009-07-03 2014-12-23 Intel Corporation Methods and systems to estimate motion based on reconstructed reference frames at a video decoder
GB2493755B (en) 2011-08-17 2016-10-19 Canon Kk Method and device for encoding a sequence of images and method and device for decoding a sequence of images
CN104427345B (zh) * 2013-09-11 2019-01-08 华为技术有限公司 运动矢量的获取方法、获取装置、视频编解码器及其方法
US10958927B2 (en) * 2015-03-27 2021-03-23 Qualcomm Incorporated Motion information derivation mode determination in video coding
WO2017201678A1 (zh) 2016-05-24 2017-11-30 华为技术有限公司 图像预测方法和相关设备
EP3264769A1 (en) * 2016-06-30 2018-01-03 Thomson Licensing Method and apparatus for video coding with automatic motion information refinement
US10631002B2 (en) * 2016-09-30 2020-04-21 Qualcomm Incorporated Frame rate up-conversion coding mode
US10750203B2 (en) * 2016-12-22 2020-08-18 Mediatek Inc. Method and apparatus of adaptive bi-prediction for video coding
US20180192071A1 (en) * 2017-01-05 2018-07-05 Mediatek Inc. Decoder-side motion vector restoration for video coding
WO2019001741A1 (en) * 2017-06-30 2019-01-03 Huawei Technologies Co., Ltd. MOTION VECTOR REFINEMENT FOR MULTI-REFERENCE PREDICTION
CN111201795B (zh) * 2017-10-09 2022-07-26 华为技术有限公司 存储访问窗口和用于运动矢量修正的填充


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11575887B2 (en) 2019-05-11 2023-02-07 Beijing Bytedance Network Technology Co., Ltd. Selective use of coding tools in video processing
WO2021068954A1 (en) * 2019-10-12 2021-04-15 Beijing Bytedance Network Technology Co., Ltd. High level syntax for video coding tools
US11689747B2 (en) 2019-10-12 2023-06-27 Beijing Bytedance Network Technology Co., Ltd High level syntax for video coding tools
CN113691810A (zh) * 2021-07-26 2021-11-23 浙江大华技术股份有限公司 帧内帧间联合预测方法、编解码方法及相关设备
CN113691810B (zh) * 2021-07-26 2022-10-04 浙江大华技术股份有限公司 帧内帧间联合预测方法、编解码方法及相关设备、存储介质

Also Published As

Publication number Publication date
JP2021508213A (ja) 2021-02-25
SG11202006258VA (en) 2020-07-29
EP3734976A4 (en) 2021-02-03
TW202318876A (zh) 2023-05-01
RU2020125254A (ru) 2022-01-31
AU2018395081A1 (en) 2020-08-13
KR102503943B1 (ko) 2023-02-24
AU2023204122A1 (en) 2023-07-13
KR20240011263A (ko) 2024-01-25
RU2020125254A3 (zh) 2022-01-31
CA3087405A1 (en) 2019-07-04
TWI791723B (zh) 2023-02-11
TWI828507B (zh) 2024-01-01
CN109996081A (zh) 2019-07-09
JP2023103277A (ja) 2023-07-26
EP3734976A1 (en) 2020-11-04
EP4362464A3 (en) 2024-05-29
CN117336504A (zh) 2024-01-02
KR20200101986A (ko) 2020-08-28
TW201931857A (zh) 2019-08-01
US20230232036A1 (en) 2023-07-20
BR112020012914A2 (pt) 2020-12-08
EP4362464A2 (en) 2024-05-01
KR20230033021A (ko) 2023-03-07
US11528503B2 (en) 2022-12-13
US20200396478A1 (en) 2020-12-17
KR102627496B1 (ko) 2024-01-18
AU2018395081B2 (en) 2023-03-30
CN109996081B (zh) 2023-09-12
CN111543059A (zh) 2020-08-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18895955; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 3087405; Country of ref document: CA)
ENP Entry into the national phase (Ref document number: 2020536667; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2018895955; Country of ref document: EP; Effective date: 20200727; Ref document number: 20207022351; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2018395081; Country of ref document: AU; Date of ref document: 20181227; Kind code of ref document: A)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112020012914; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 112020012914; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20200624)