WO2019129130A1 - Image prediction method, apparatus, and codec - Google Patents
Image prediction method, apparatus, and codec
- Publication number
- WO2019129130A1 (PCT/CN2018/124275)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- backward
- reference block
- current image
- target
- Prior art date
Classifications
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/577—Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a pixel
- H04N19/513—Processing of motion vectors
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
- H04N19/523—Motion estimation or motion compensation with sub-pixel accuracy
- H04N19/56—Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the present application relates to the field of video codec technology, and in particular, to an image prediction method, apparatus, and codec.
- digital video information can be efficiently transmitted and received between devices using the video compression techniques described in the ITU-T H.265 high efficiency video coding (HEVC) standard and in the extensions of that standard.
- an image of a video sequence is divided into image blocks for encoding or decoding.
- inter prediction modes may include, but are not limited to, a merge mode (Merge mode) and a non-merge mode (for example, an advanced motion vector prediction mode (AMVP mode)), both of which perform inter prediction by means of competition among multiple sets of motion information.
- a candidate motion information list (referred to as a candidate list) including multiple sets of motion information (also referred to as multiple candidate motion information) is introduced; for example, the encoder may use a set of motion information selected from the candidate list as, or to predict, the motion information (e.g., a motion vector) of the current image block to be encoded, thereby obtaining a reference image block (i.e., a reference sample) of the current image block to be encoded.
- correspondingly, the decoder can decode indication information from the bitstream to obtain a set of motion information. Because the coding overhead of motion information in the inter prediction process is limited (that is, motion information occupies bit overhead in the bitstream), the accuracy of the motion information is constrained to some extent, which in turn affects the accuracy of image prediction.
- the existing decoder-side motion vector refinement (DMVR) technique can be used to refine the motion information. However, in the existing DMVR scheme for image prediction, a template matching block must first be calculated, and the template matching block is then used to perform a search-and-match process in the forward reference image and in the backward reference image separately, resulting in high search complexity. Therefore, how to reduce the complexity of image prediction while improving image prediction accuracy is a problem that needs to be solved.
- the embodiments of the present application provide an image prediction method and apparatus, and a corresponding encoder and decoder, which can reduce the complexity of image prediction to a certain extent while improving image prediction accuracy, thereby improving codec performance.
- an embodiment of the present application provides an image prediction method, including: acquiring initial motion information of a current image block; determining, based on the initial motion information and the position of the current image block, the positions of N forward reference blocks and the positions of N backward reference blocks, the N forward reference blocks being located in a forward reference image, the N backward reference blocks being located in a backward reference image, and N being an integer greater than 1; and determining, based on a matching cost criterion, from the positions of M pairs of reference blocks, the position of one pair of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and, for the position of each pair of reference blocks, a first position offset and a second position offset are in a mirror relationship, the first position offset representing the offset of the position of the forward reference block relative to the position of an initial forward reference block, and the second position offset representing the offset of the position of the backward reference block relative to the position of an initial backward reference block.
- the positions of the N forward reference blocks include the position of one initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the positions of the N backward reference blocks include the position of one initial backward reference block and the positions of (N-1) candidate backward reference blocks. It should be understood that the offset of the position of the initial forward reference block relative to itself is 0 and the offset of the position of the initial backward reference block relative to itself is 0; these two 0 offsets also satisfy the mirror relationship.
- it can be seen that the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. For each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block are in a mirror relationship. On this basis, the position of one pair of reference blocks (for example, the pair with the least matching cost) is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block.
- the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
- the current image block (abbreviated as the current block) herein can be understood as the image block currently being processed. In the encoding process, it refers to the coding block currently being encoded; in the decoding process, it refers to the decoding block currently being decoded.
- reference blocks herein refer to blocks that provide reference signals for the current block. During the search process, it is necessary to traverse multiple reference blocks to find the best reference block.
- a reference block located in the forward reference picture is referred to as a forward reference block; a reference block located in the backward reference picture is referred to as a backward reference block.
- after the traversal, the best reference block is found; this block provides the prediction for the current block and may be referred to as a prediction block.
- the pixel value or sampled value or sampled signal within the prediction block is called a prediction signal.
- the matching cost criterion herein can be understood as a criterion that considers the matching cost between a paired forward reference block and backward reference block, where the matching cost can be understood as the difference between the two blocks. The difference value can be regarded as the accumulation of the differences of the pixels at corresponding positions in the two blocks.
- the difference is generally calculated based on the SAD (sum of absolute differences) criterion, or based on other criteria such as SATD (sum of absolute transformed differences), MR-SAD (mean-removed sum of absolute differences), or SSD (sum of squared differences).
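As an illustration only (not part of the claimed method), the SAD, MR-SAD, and SSD criteria named above can be sketched as follows; the function names and the sample block values are assumptions, and blocks are flattened to 1-D lists for brevity:

```python
def sad(fwd, bwd):
    """Sum of absolute differences between paired reference blocks."""
    return sum(abs(a - b) for a, b in zip(fwd, bwd))

def mr_sad(fwd, bwd):
    """Mean-removed SAD: subtract each block's mean before comparing."""
    mf = sum(fwd) / len(fwd)
    mb = sum(bwd) / len(bwd)
    return sum(abs((a - mf) - (b - mb)) for a, b in zip(fwd, bwd))

def ssd(fwd, bwd):
    """Sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(fwd, bwd))

fwd_block = [100, 102, 98, 101]   # hypothetical forward reference block
bwd_block = [101, 100, 99, 103]   # hypothetical backward reference block
print(sad(fwd_block, bwd_block))  # 6
```

MR-SAD is useful when the two reference pictures have a brightness offset, since removing each block's mean cancels a constant illumination difference before the comparison.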
- the initial motion information of the current image block in the embodiments of the present application may include a motion vector (MV) and reference image indication information, or may include only one of the two.
- the reference image indication information is used to indicate which reconstructed image or images are used as the reference image(s); the motion vector represents the positional offset, in the used reference image, of the reference block position relative to the current block position, and generally includes a horizontal component offset and a vertical component offset.
- (x, y) is used to represent the MV, where x is the positional offset in the horizontal direction and y is the positional offset in the vertical direction.
- the reference image indication information may include a reference image list and/or a reference image index corresponding to the reference image list.
- the reference image index is used to identify a reference image corresponding to the used motion vector in the specified reference image list (RefPicList0 or RefPicList1).
- An image may be referred to as a frame, and a reference image may be referred to as a reference frame.
- the initial motion information of the current image block of the embodiment of the present application is initial bidirectional prediction motion information, that is, motion information for forward and backward prediction directions.
- the forward and backward prediction directions are the two prediction directions of the bidirectional prediction mode; it can be understood that "forward" and "backward" correspond to reference image list 0 (RefPicList0) and reference image list 1 (RefPicList1) of the current image, respectively.
- the position of the initial forward reference block in the embodiments of the present application refers to the position of the reference block obtained in the forward reference image by adding the initial forward MV offset to the position of the current block; the position of the initial backward reference block refers to the position of the reference block obtained in the backward reference image by adding the initial backward MV offset to the position of the current block.
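A minimal sketch of the step just described, under assumed names and coordinates (top-left corner positions, integer-pel MVs): the initial reference block position is simply the current block position plus the corresponding initial motion vector.

```python
def initial_ref_position(block_pos, mv):
    """Current block position plus MV offset gives the initial reference block position."""
    x, y = block_pos
    dx, dy = mv
    return (x + dx, y + dy)

cur = (64, 32)       # hypothetical current block position
mv_fwd = (-3, 2)     # hypothetical initial forward motion vector
mv_bwd = (3, -2)     # hypothetical initial backward motion vector
print(initial_ref_position(cur, mv_fwd))  # (61, 34)
print(initial_ref_position(cur, mv_bwd))  # (67, 30)
```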
- the execution subject of the method of the embodiments of the present application may be an image prediction apparatus, such as a video encoder, a video decoder, or an electronic device having a video codec function; for example, it may be an inter prediction unit in a video encoder, or a motion compensation unit in a video decoder.
- the first position offset and the second position offset being in a mirror relationship means that the direction of the first position offset (also referred to as the vector direction) is opposite to the direction of the second position offset, while the magnitude of the first position offset is the same as the magnitude of the second position offset. Specifically, the first position offset includes a first horizontal component offset and a first vertical component offset, and the second position offset includes a second horizontal component offset and a second vertical component offset, where the direction of the first horizontal component offset is opposite to the direction of the second horizontal component offset and their magnitudes are the same, and the direction of the first vertical component offset is opposite to the direction of the second vertical component offset and their magnitudes are the same.
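The mirror relationship described above can be sketched in a few lines (an illustration only; the function name is an assumption): each component of the paired backward offset is the negation of the forward offset, so direction is opposite and magnitude is identical.

```python
def mirror(offset):
    """Return the mirrored offset: opposite direction, same magnitude per component."""
    dx, dy = offset
    return (-dx, -dy)

first = (2, -1)         # offset of a candidate forward reference block
second = mirror(first)  # paired candidate backward offset
print(second)           # (-2, 1)
```

Note that the zero offset mirrors to itself, which is why the pair of initial reference blocks (both offsets 0) also satisfies the mirror relationship.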
- the first position offset and the second position offset are both zero.
- the method further comprises: obtaining updated motion information of the current image block, the updated motion information including an updated forward motion vector and an updated backward motion vector, where the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
- the updated motion information of the current image block is obtained based on the position of the target forward reference block, the position of the target backward reference block, and the position of the current image block, or is obtained based on the first position offset and the second position offset corresponding to the determined position of the pair of reference blocks.
- in this way, the embodiment of the present application can obtain refined motion information of the current image block and improve the accuracy of the current image block's motion information, which is also beneficial to the prediction of other image blocks, for example, improving the accuracy of motion information prediction for other image blocks.
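For illustration only, the first of the two derivations above (from the target positions and the current block position) can be sketched as follows; names and coordinates are assumptions:

```python
def updated_mv(cur_pos, target_pos):
    """Updated MV points from the current block position to the target reference block."""
    return (target_pos[0] - cur_pos[0], target_pos[1] - cur_pos[1])

cur = (64, 32)                         # hypothetical current block position
mv_fwd = updated_mv(cur, (60, 35))     # target forward reference block at (60, 35)
mv_bwd = updated_mv(cur, (68, 29))     # target backward reference block at (68, 29)
print(mv_fwd, mv_bwd)                  # (-4, 3) (4, -3)
```

Equivalently, each updated MV is the initial MV plus the corresponding position offset of the selected pair, which is the second derivation mentioned above.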
- the positions of the N forward reference blocks include the position of one initial forward reference block and the positions of (N-1) candidate forward reference blocks, where the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance;
- the positions of the N backward reference blocks include the position of one initial backward reference block and the positions of (N-1) candidate backward reference blocks, where the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
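One possible way to generate candidate positions around an initial reference block position, shown here with integer-pixel offsets over the eight neighbours (an assumed search pattern for illustration; a fractional-pel search would use sub-pixel steps with interpolation instead):

```python
def candidate_positions(initial_pos, step=1):
    """Generate (N-1) candidate positions around the initial reference block position."""
    x, y = initial_pos
    offsets = [(-step, 0), (step, 0), (0, -step), (0, step),
               (-step, -step), (step, step), (-step, step), (step, -step)]
    return [(x + dx, y + dy) for dx, dy in offsets]

init_fwd = (61, 34)                  # hypothetical initial forward reference position
cands = candidate_positions(init_fwd)
print(len(cands))  # 8, so N = 9 with this pattern
```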
- the positions of the N pairs of reference blocks include: the positions of the paired initial forward reference block and initial backward reference block, and the positions of paired candidate forward reference blocks and candidate backward reference blocks, where the offset of the position of each candidate forward reference block in the forward reference image relative to the position of the initial forward reference block and the offset of the position of the paired candidate backward reference block in the backward reference image relative to the position of the initial backward reference block are in a mirror relationship.
- the initial motion information includes forward predicted motion information and backward predicted motion information;
- determining the positions of the N forward reference blocks and the positions of the N backward reference blocks based on the initial motion information and the position of the current image block includes:
- determining the positions of the N forward reference blocks in the forward reference image according to the forward predicted motion information and the position of the current image block, where the positions of the N forward reference blocks include the position of an initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; and determining the positions of the N backward reference blocks in the backward reference image according to the backward predicted motion information and the position of the current image block, where the positions of the N backward reference blocks include the position of an initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
- Determining a location of the N forward reference blocks and a location of the N backward reference blocks according to the initial motion information and a location of the current image block including:
- determining, according to the matching cost criterion, from the positions of the M pairs of reference blocks, the position of one pair of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block includes:
- determining the position of a pair of reference blocks whose matching error is less than or equal to a matching error threshold as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
- the matching cost criterion is a criterion that minimizes the matching cost. For example, for the positions of the M pairs of reference blocks, the difference between the pixel values of the forward reference block and the pixel values of the backward reference block in each pair is calculated; from the positions of the M pairs of reference blocks, the position of the pair with the smallest pixel value difference is determined as the position of the forward target reference block and the position of the backward target reference block of the current image block.
- alternatively, the matching cost criterion is a matching cost plus early termination criterion. For example, for the position of the nth pair of reference blocks (one forward reference block and one backward reference block), the difference between the pixel values of the forward reference block and the pixel values of the backward reference block is calculated, where n is an integer greater than or equal to 1 and less than or equal to N; when the pixel value difference is less than or equal to the matching error threshold, the position of the nth pair of reference blocks is determined as the position of the forward target reference block and the position of the backward target reference block of the current image block.
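The two criteria above (minimum matching cost, and matching cost with early termination) can be sketched in one search loop; this is an illustration under assumed names, with SAD as the cost and blocks flattened to lists:

```python
def select_pair(pairs, cost_fn, threshold=None):
    """pairs: list of (fwd_block, bwd_block). Returns the index of the chosen pair.

    With threshold=None this is the minimum-cost criterion; with a threshold,
    the first pair whose cost is <= threshold terminates the search early.
    """
    best_idx, best_cost = 0, float("inf")
    for n, (fwd, bwd) in enumerate(pairs):
        cost = cost_fn(fwd, bwd)
        if threshold is not None and cost <= threshold:
            return n                       # early termination criterion
        if cost < best_cost:
            best_idx, best_cost = n, cost
    return best_idx                        # minimum-cost criterion

sad = lambda f, b: sum(abs(a - c) for a, c in zip(f, b))
pairs = [([10, 12], [14, 9]),   # cost 7
         ([10, 12], [11, 14]),  # cost 3
         ([10, 12], [10, 12])]  # cost 0
print(select_pair(pairs, sad))  # 2
```

With `threshold=3`, the same input returns index 1, since the search stops at the first pair whose cost reaches the threshold; early termination trades a possibly better match for fewer cost evaluations.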
- when the method is used to encode the current image block, acquiring the initial motion information of the current image block includes: obtaining the initial motion information from a candidate motion information list of the current image block;
- the method is used to decode the current image block, and before the acquiring initial motion information of the current image block, the method further includes: acquiring indication information from a code stream of the current image block, where the indication information is used for Indicates the initial motion information of the current image block.
- the image prediction method in the embodiments of the present application is applicable not only to the merge prediction mode (Merge) and/or the advanced motion vector prediction mode (AMVP), but also to other modes that predict the motion information of the current image block using the motion information of spatial reference blocks, temporal reference blocks, and/or inter-view reference blocks, thereby improving codec performance.
- a second aspect of the present application provides an image prediction method, including: acquiring initial motion information of a current image block; determining, based on the initial motion information and the position of the current image block, the positions of N forward reference blocks and the positions of N backward reference blocks, N being an integer greater than 1; and determining, based on the matching cost criterion, from the positions of M pairs of reference blocks, the position of one pair of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of a forward reference block and the position of a backward reference block, and, for the position of each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on a time domain distance, the first position offset representing the offset of the position of the forward reference block relative to the position of an initial forward reference block, and the second position offset representing the offset of the position of the backward reference block relative to the position of an initial backward reference block.
- it should be understood that the offset of the position of the initial forward reference block relative to itself is 0, and the offset of the position of the initial backward reference block relative to itself is 0; these 0 offsets also satisfy the mirror relationship or the proportional relationship based on the time domain distance.
- the positions of the other (N-1) pairs of reference blocks do not include the position of the initial forward reference block or the position of the initial backward reference block.
- it can be seen that the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. For each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance (also referred to as a time-domain-distance-based mirror relationship). On this basis, the position of one pair of reference blocks (e.g., the pair with the smallest matching cost) is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block (i.e., the best forward reference block / forward prediction block) and the position of the target backward reference block (i.e., the best backward reference block / backward prediction block) of the current image block, thereby obtaining a predicted value of the pixels of the current image block.
- the method in the embodiment of the present application avoids pre-computing a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process and reduces the complexity of image prediction while improving its accuracy.
- the first position offset and the second position offset have a proportional relationship based on a time domain distance, including:
- the proportional relationship between the first position offset and the second position offset is determined based on a proportional relationship between the first time domain distance and the second time domain distance, wherein the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
- the first location offset and the second location offset have a proportional relationship based on a time domain distance, including:
- the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
- the direction of the first position offset is opposite to the direction of the second position offset, and the proportional relationship between the magnitude of the first position offset and the magnitude of the second position offset is based on a proportional relationship between the first time domain distance and the second time domain distance;
- the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image; and the second time domain distance represents the time domain distance between the current image and the backward reference image.
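To make the proportional relationship concrete: the second (backward) position offset can be obtained from the first (forward) offset by reversing its direction and scaling its magnitude by the ratio of the second time domain distance to the first. The following Python sketch is illustrative only; the function and variable names (`mirrored_offset_by_td`, `td0`, `td1`) are assumptions, not terms from the application:

```python
def mirrored_offset_by_td(first_offset, td0, td1):
    """Derive the second (backward) position offset from the first
    (forward) offset: opposite direction, magnitude scaled by the ratio
    of the backward temporal distance td1 to the forward temporal
    distance td0. When td0 == td1 this reduces to the pure mirror
    relationship."""
    dx, dy = first_offset
    scale = td1 / td0
    return (-dx * scale, -dy * scale)

# Equal temporal distances: pure mirror of the forward offset (1, -2).
print(mirrored_offset_by_td((1, -2), td0=2, td1=2))  # (-1.0, 2.0)
# Backward reference twice as far in time as the forward reference.
print(mirrored_offset_by_td((1, -2), td0=1, td1=2))  # (-2.0, 4.0)
```

When the two temporal distances are equal, the proportional relationship degenerates into the mirror relationship of the first aspect, which is why the application treats the mirror case as a special case of the time-domain-distance-based relationship.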
- the method further comprises: obtaining updated motion information of the current image block, the updated motion information including an updated forward motion vector and an updated backward motion vector, wherein the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
- the embodiment of the present application can obtain the corrected motion information of the current image block and improve the accuracy of the motion information of the current image block, which is also beneficial for the prediction of other image blocks, for example, improving the prediction accuracy of the motion information of other image blocks.
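The motion-vector update itself amounts to adding the winning position offsets to the initial vectors so that the updated vectors point at the target reference block positions. A minimal sketch, with hypothetical names (`update_motion_vectors` is not a term from the application):

```python
def update_motion_vectors(mv_fwd, mv_bwd, fwd_offset, bwd_offset):
    """Re-point the forward/backward motion vectors at the positions of
    the target forward and target backward reference blocks by adding
    the winning position offsets to the initial vectors."""
    updated_fwd = (mv_fwd[0] + fwd_offset[0], mv_fwd[1] + fwd_offset[1])
    updated_bwd = (mv_bwd[0] + bwd_offset[0], mv_bwd[1] + bwd_offset[1])
    return updated_fwd, updated_bwd

# Mirror case: the backward offset is the negation of the forward offset.
fwd, bwd = update_motion_vectors((5, 3), (-4, -2), (1, 0), (-1, 0))
print(fwd, bwd)  # (6, 3) (-5, -2)
```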
- the locations of the N forward reference blocks include a location of an initial forward reference block and a location of (N-1) candidate forward reference blocks,
- the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance;
- the positions of the N backward reference blocks include the position of an initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
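One plausible way to realise these search positions is to enumerate the initial reference block's position plus neighbouring positions offset by multiples of an integer or fractional pixel step. This sketch is an assumption about one possible enumeration, not the application's mandated search pattern:

```python
def candidate_positions(initial_pos, step=1, radius=1):
    """Enumerate the position of the initial reference block plus
    candidate positions offset by multiples of `step` (1 for an integer
    pixel distance, or e.g. 0.5 for a fractional pixel distance) within
    a square search window of the given radius."""
    x0, y0 = initial_pos
    positions = [(x0, y0)]  # the initial position itself (offset 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if (dx, dy) != (0, 0):
                positions.append((x0 + dx * step, y0 + dy * step))
    return positions

pts = candidate_positions((10, 10), step=1, radius=1)
print(len(pts))  # 9 positions: the initial one plus 8 neighbours
```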
- the positions of the N pairs of reference blocks include: the positions of the paired initial forward reference block and initial backward reference block, and the positions of (N-1) pairs of a candidate forward reference block and a candidate backward reference block, wherein the positional offset of the position of the candidate forward reference block in the forward reference image relative to the position of the initial forward reference block and the positional offset of the position of the candidate backward reference block in the backward reference image relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance.
- the initial motion information includes forward predicted motion information and backward predicted motion information
- the determining the positions of the N forward reference blocks and the positions of the N backward reference blocks based on the initial motion information and the position of the current image block includes:
- the positions of the candidate backward reference blocks, wherein the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
- the determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the initial motion information and the position of the current image block includes:
- the determining, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block includes:
- determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block,
- the M is less than or equal to N.
- the matching cost criterion is a criterion that minimizes the matching cost. For example, for the positions of the M pairs of reference blocks, the difference between the pixel values of the forward reference block and the pixel values of the backward reference block in each pair of reference blocks is calculated; the position of the pair of reference blocks with the smallest pixel value difference is then determined, from the positions of the M pairs of reference blocks, as the position of the forward target reference block of the current image block and the position of the backward target reference block.
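The "smallest pixel value difference" selection can be illustrated with the sum of absolute differences (SAD), a common matching cost in motion search; the application itself only requires a pixel-value difference, so SAD here is an illustrative choice and all names are assumptions:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_pair_by_min_cost(pairs):
    """pairs: list of (forward_block, backward_block) pixel arrays.
    Returns the index of the pair whose forward/backward pixel-value
    difference (here measured as SAD) is smallest."""
    costs = [sad(f, b) for f, b in pairs]
    return costs.index(min(costs))

pairs = [
    ([[10, 10]], [[14, 10]]),  # SAD = 4
    ([[10, 10]], [[11, 10]]),  # SAD = 1  <- best match
    ([[10, 10]], [[10, 18]]),  # SAD = 8
]
print(best_pair_by_min_cost(pairs))  # 1
```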
- the matching cost criterion is a matching cost plus early termination criterion. For example, for the positions of the n-th pair of reference blocks (one forward reference block and one backward reference block), the difference between the pixel values of the forward reference block and the pixel values of the backward reference block is calculated, where n is an integer greater than or equal to 1 and less than or equal to N; when the pixel value difference is less than or equal to the matching error threshold, the positions of the n-th pair of reference blocks (a forward reference block and a backward reference block) are determined as the position of the forward target reference block of the current image block and the position of the backward target reference block.
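The early-termination variant stops scanning pairs as soon as one falls at or below the matching-error threshold, saving cost evaluations. A self-contained sketch under the same illustrative assumptions as before (names are not from the application):

```python
def block_cost(fwd, bwd):
    """Pixel-value difference between a forward and a backward block,
    measured here as a sum of absolute differences."""
    return sum(abs(a - b) for ra, rb in zip(fwd, bwd)
               for a, b in zip(ra, rb))

def best_pair_early_termination(candidates, threshold):
    """Scan the candidate pairs in order; as soon as one pair's cost is
    less than or equal to the threshold, stop and return its index.
    Fall back to the minimum-cost pair if no pair qualifies."""
    costs = []
    for n, (fwd, bwd) in enumerate(candidates):
        cost = block_cost(fwd, bwd)
        if cost <= threshold:
            return n  # early termination: this pair is good enough
        costs.append(cost)
    return costs.index(min(costs))

candidates = [
    ([[10, 10]], [[14, 10]]),  # cost 4
    ([[10, 10]], [[11, 10]]),  # cost 1, first to meet a threshold of 2
    ([[10, 10]], [[10, 18]]),  # not evaluated when terminating early
]
print(best_pair_early_termination(candidates, threshold=2))  # 1
```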
- the method is used to encode the current image block, and the acquiring initial motion information of the current image block includes: from a candidate motion information list of a current image block Obtaining the initial motion information;
- the method is used to decode the current image block, and before the acquiring initial motion information of the current image block, the method further includes: acquiring indication information from a code stream of the current image block, where the indication information is used for Indicates the initial motion information of the current image block.
- a third aspect of the present application provides an image prediction method, including: acquiring the i-th round motion information of a current image block;
- N is an integer greater than 1; based on the matching cost criterion, determining, from the positions of the M pairs of reference blocks, the position of one pair of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block.
- the first position offset is in a mirror relationship with the second position offset, the first position offset representing the positional offset of the position of the forward reference block relative to the position of the (i-1)-th round target forward reference block, and the second position offset representing the positional offset of the position of the backward reference block relative to the position of the (i-1)-th round target backward reference block,
- the positional offset of the position of the initial forward reference block relative to the position of the initial forward reference block is 0, and the positional offset of the position of the initial backward reference block relative to the position of the initial backward reference block is 0; these two 0 offsets also satisfy the mirror relationship.
- the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of the N pairs of reference blocks; for each of the N pairs of reference blocks, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block are in a mirror relationship. On this basis, the position of one pair of reference blocks (for example, the pair with the smallest matching cost) is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block.
- the method in the embodiment of the present application avoids pre-computing a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process and reduces the complexity of image prediction while improving its accuracy.
- the method of the present application can further improve the accuracy of the corrected motion vector (MV) by using an iterative method, thereby further improving the codec performance.
- if i=1, the i-th round motion information is the initial motion information of the current image block; correspondingly, the positions of the N forward reference blocks include the position of an initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or, the positions of the N backward reference blocks include the position of an initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- if i>1, the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block; correspondingly, the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the (i-1)-th round target forward reference block is an integer pixel distance or a fractional pixel distance; or, the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the (i-1)-th round target backward reference block is an integer pixel distance or a fractional pixel distance.
- if the method is used to encode the current image block, the initial motion information of the current image block is obtained by determining the initial motion information from a candidate motion information list of the current image block; or, if the method is used to decode the current image block, the initial motion information of the current image block is obtained by acquiring indication information from a code stream of the current image block, wherein the indication information is used to indicate the initial motion information of the current image block.
- the first position offset being in a mirror relationship with the second position offset includes: the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset.
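In coordinate terms the mirror relationship simply means the second offset is the componentwise negation of the first. An illustrative check, with a hypothetical helper name:

```python
def is_mirror(first_offset, second_offset):
    """True when the two offsets point in opposite directions with
    equal magnitude, i.e. second = -first componentwise."""
    return (second_offset[0] == -first_offset[0] and
            second_offset[1] == -first_offset[1])

print(is_mirror((2, -1), (-2, 1)))  # True: opposite direction, equal magnitude
print(is_mirror((2, -1), (-2, 2)))  # False: magnitudes differ
```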
- the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
- the determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the i-th round motion information and the position of the current image block includes:
- the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks;
- the determining, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block includes:
- determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
- a fourth aspect of the present application provides an image prediction method, including: acquiring the i-th round motion information of a current image block;
- N is an integer greater than 1; based on the matching cost criterion, determining, from the positions of the M pairs of reference blocks, the position of one pair of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block.
- the first position offset and the second position offset have a proportional relationship based on a time domain distance, the first position offset representing the positional offset of the position of the forward reference block relative to the position of the (i-1)-th round target forward reference block in the forward reference image; the second position offset representing the positional offset of the position of the backward reference block relative to the position of the (i-1)-th round target backward reference block in the backward reference image,
- the positional offset of the position of the initial forward reference block relative to the position of the initial forward reference block is 0, and the positional offset of the position of the initial backward reference block relative to the position of the initial backward reference block is 0; these two 0 offsets also satisfy the mirror relationship or the proportional relationship based on the time domain distance.
- the positions of the (N-1) pairs of reference blocks do not include the position of the initial forward reference block and the position of the initial backward reference block.
- the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of the N pairs of reference blocks; for each of the N pairs of reference blocks, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance. On this basis, the position of one pair of reference blocks (for example, the pair with the smallest matching cost) is determined from the positions of the N pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block.
- the method in the embodiment of the present application avoids pre-computing a template matching block and avoids using a template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process and reduces the complexity of image prediction while improving its accuracy.
- the method of the present application can further improve the accuracy of the corrected motion vector (MV) by using an iterative method, thereby further improving the codec performance.
- if i=1, the i-th round motion information is the initial motion information of the current image block; if i>1, the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
- the first position offset and the second position offset have a proportional relationship based on a time domain distance, including:
- the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
- the direction of the first position offset is opposite to the direction of the second position offset, and the proportional relationship between the magnitude of the first position offset and the magnitude of the second position offset is based on a proportional relationship between the first time domain distance and the second time domain distance;
- the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image; and the second time domain distance represents the time domain distance between the current image and the backward reference image.
- the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
- the determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the i-th round motion information and the position of the current image block includes:
- the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks;
- the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of the (N-1) candidate backward reference blocks.
- the determining, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block includes:
- determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
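The iterative (multi-round) refinement of the third and fourth aspects can be sketched as a loop in which each round searches a small mirrored window around the previous round's target positions and keeps the lowest-cost pair. Everything below is an illustrative skeleton under the mirror relationship, not the application's normative procedure; all names are assumptions:

```python
def iterative_refine(start_fwd, start_bwd, cost_fn, rounds=2, step=1):
    """Each round i searches a mirrored window of offsets around the
    (i-1)-th round's target forward/backward positions and keeps the
    pair with the smallest matching cost. The backward position always
    moves by the negated (mirrored) offset."""
    fwd, bwd = start_fwd, start_bwd
    offsets = [(dx, dy) for dy in (-step, 0, step) for dx in (-step, 0, step)]
    for _ in range(rounds):
        best = min(
            offsets,
            key=lambda o: cost_fn((fwd[0] + o[0], fwd[1] + o[1]),
                                  (bwd[0] - o[0], bwd[1] - o[1])))
        # Apply the winning offset; the backward position moves mirrored.
        fwd = (fwd[0] + best[0], fwd[1] + best[1])
        bwd = (bwd[0] - best[0], bwd[1] - best[1])
    return fwd, bwd

# Toy cost that is minimised when the forward position reaches (3, 3):
cost = lambda f, b: abs(f[0] - 3) + abs(f[1] - 3)
print(iterative_refine((1, 1), (9, 9), cost, rounds=2))  # ((3, 3), (7, 7))
```

With a real matching cost (e.g. the pixel-value difference between the two blocks), each extra round can only keep or lower the cost of the selected pair, which is the sense in which iteration further refines the motion vectors.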
- a fifth aspect of the present application provides an image prediction apparatus comprising a plurality of functional units for implementing any of the methods of the first aspect.
- the image prediction apparatus may include: a first acquiring unit, configured to acquire initial motion information of the current image block; a first searching unit, configured to determine the positions of N forward reference blocks and the positions of N backward reference blocks based on the initial motion information and the position of the current image block, the N forward reference blocks being located in the forward reference image, the N backward reference blocks being located in the backward reference image, N being an integer greater than 1, and to determine, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, wherein the position of each pair of reference blocks includes a position of a forward reference block and a position of a backward reference block, and for the position of each pair of reference blocks, the first position offset is in a mirror relationship with the second position offset, the first position offset representing the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, the second position offset representing the positional offset of the position of the backward reference block relative to the position of the initial backward reference block, M being an integer greater than or equal to 1 and less than or equal to N; and a first prediction unit, configured to obtain a predicted value of the pixel values of the current image block based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
- the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
- a sixth aspect of the present application provides an image predicting apparatus comprising a plurality of functional units for implementing any one of the methods of the second aspect.
- the image prediction apparatus may include: a second acquiring unit, configured to acquire initial motion information of the current image block; a second searching unit, configured to determine the positions of N forward reference blocks and the positions of N backward reference blocks based on the initial motion information and the position of the current image block, the N forward reference blocks being located in the forward reference image, the N backward reference blocks being located in the backward reference image, N being an integer greater than 1, and to determine, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, wherein the position of each pair of reference blocks includes a position of a forward reference block and a position of a backward reference block, and for the position of each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on a time domain distance, the first position offset representing the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, the second position offset representing the positional offset of the position of the backward reference block relative to the position of the initial backward reference block, M being an integer greater than or equal to 1 and less than or equal to N; and a second prediction unit, configured to obtain a predicted value of the pixel values of the current image block based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
- the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
- a seventh aspect of the present application provides an image predicting apparatus comprising a plurality of functional units for implementing any of the methods of the third aspect.
- the image prediction apparatus may include: a third acquiring unit, configured to acquire the i-th round motion information of the current image block; a third searching unit, configured to determine the positions of N forward reference blocks and the positions of N backward reference blocks based on the i-th round motion information and the position of the current image block, the N forward reference blocks being located in the forward reference image, the N backward reference blocks being located in the backward reference image, N being an integer greater than 1, and to determine, based on the matching cost criterion, the position of one pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, wherein the position of each pair of reference blocks includes a position of a forward reference block and a position of a backward reference block
- the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
- the image prediction apparatus may include: a fourth acquiring unit, configured to acquire the i-th round motion information of the current image block; and a fourth searching unit, configured to determine, according to the i-th round motion information and the position of the current image block,
- the image prediction device is applied, for example, to a video encoding device (video encoder) or a video decoding device (video decoder).
- a ninth aspect of the present application provides an image prediction apparatus, the apparatus comprising: a processor and a memory coupled to the processor, the processor being configured to perform the method of the first, second, third or fourth aspect, or of various implementations of the foregoing aspects.
- a tenth aspect of the present application provides a video encoder for encoding an image block, comprising: an inter prediction module, wherein the inter prediction module includes the image prediction apparatus of the fifth, sixth, seventh or eighth aspect, and the inter prediction module is configured to predict a predicted value of the pixel values of the image block; an entropy encoding module, configured to encode indication information into the code stream, the indication information being used to indicate the initial motion information of the image block; and a reconstruction module, configured to reconstruct the image block based on the predicted value of the pixel values of the image block.
- An eleventh aspect of the present application provides a video decoder for decoding an image block from a code stream, comprising: an entropy decoding module, configured to decode indication information from the code stream, the indication information being used to indicate the initial motion information of the currently decoded image block; an inter prediction module, including the image prediction apparatus according to the fifth, sixth, seventh or eighth aspect, the inter prediction module being configured to predict a predicted value of the pixel values of the image block; and a reconstruction module, configured to reconstruct the image block based on the predicted value of the pixel values of the image block.
- a twelfth aspect of the present application provides a video encoding apparatus including a nonvolatile storage medium and a processor, the nonvolatile storage medium storing an executable program, the processor and the nonvolatile storage medium being coupled to each other, the processor executing the executable program to implement the method of the first, second, third or fourth aspect or of various implementations thereof.
- a thirteenth aspect of the present application provides a video decoding apparatus including a nonvolatile storage medium and a processor, the nonvolatile storage medium storing an executable program, the processor and the nonvolatile storage medium being coupled to each other, the processor executing the executable program to implement the method of the first, second, third or fourth aspect or of various implementations thereof.
- a fourteenth aspect of the present application provides a computer readable storage medium having instructions stored therein that, when executed on a computer, cause the computer to perform the method of the first, second, third or fourth aspect or of various implementations thereof.
- a fifteenth aspect of the present application provides a computer program product comprising instructions which, when executed on a computer, cause the computer to perform the methods of the first, second, third or fourth aspects or various implementations thereof.
- a sixteenth aspect of the present application provides an electronic device comprising the video encoder according to the tenth aspect, the video decoder according to the eleventh aspect, or the image prediction apparatus according to the fifth, sixth, seventh or eighth aspect.
- FIG. 1 is a schematic block diagram of a video encoding and decoding system in an embodiment of the present application
- FIG. 2A is a schematic block diagram of a video encoder in an embodiment of the present application.
- FIG. 2B is a schematic block diagram of a video decoder in an embodiment of the present application.
- FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- FIG. 4 is a schematic diagram of an encoder acquiring initial motion information in a merge mode of inter prediction
- FIG. 5 is a schematic diagram of a decoding end acquiring initial motion information in a merge mode of inter prediction
- FIG. 6 is a schematic diagram of an initial reference block of a current image block
- FIG. 7 is a schematic diagram of an integer pixel position pixel and a sub-pixel position pixel
- FIG. 8 is a schematic diagram of a search starting point
- FIG. 9 is a schematic block diagram showing a mirror relationship between a first position offset and a second position offset in the embodiment of the present application.
- FIG. 10 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
- FIG. 11 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
- FIG. 12 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
- FIG. 13 is a schematic block diagram showing a proportional relationship between a first position offset and a second position offset according to a time domain distance in the embodiment of the present application;
- FIG. 14 is a schematic flowchart of another image prediction method 1400 according to an embodiment of the present application.
- FIG. 15 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
- FIG. 16 is a schematic flowchart of another image prediction method 1600 according to an embodiment of the present application.
- 17 is a schematic flowchart of another image prediction method according to an embodiment of the present application.
- FIG. 18 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application.
- FIG. 19 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application.
- FIG. 20 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application.
- 21 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application.
- FIG. 22 is a schematic block diagram of an encoding device or a decoding device according to an embodiment of the present application.
- FIG. 1 is a schematic block diagram of a video encoding and decoding system in an embodiment of the present application.
- the video encoder 20 and the video decoder 30 in the system are used to predict the pixel values of image blocks according to the various image prediction method examples proposed herein, and to refine the motion information, for example motion vectors, of the image block currently being encoded or decoded, so as to further improve codec performance.
- the system includes a source device 12 and a destination device 14; the source device 12 generates encoded video data that will be decoded by the destination device 14 at a later time.
- Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook computers, tablet computers, set top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices, or the like.
- Link 16 may include any type of media or device capable of moving encoded video data from source device 12 to destination device 14.
- link 16 may include communication media that enables source device 12 to transmit encoded video data directly to destination device 14 in real time.
- the encoded video data can be modulated and transmitted to destination device 14 in accordance with a communication standard (e.g., a wireless communication protocol).
- Communication media can include any wireless or wired communication medium, such as a radio frequency spectrum or one or more physical transmission lines.
- the communication medium can form part of a packet-based network (eg, a local area network, a wide area network, or a global network such as the Internet).
- Communication media can include routers, switches, base stations, or any other equipment that can be used to facilitate communication from source device 12 to destination device 14.
- the encoded data may be output from output interface 22 to storage device 24.
- encoded data can be accessed from storage device 24 by an input interface.
- Storage device 24 may comprise any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray Disc, DVD, CD-ROM, flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded video data.
- storage device 24 may correspond to a file server or another intermediate storage device that may maintain encoded video produced by source device 12. Destination device 14 may access the stored video data from storage device 24 via streaming or download.
- the file server can be any type of server capable of storing encoded video data and transmitting this encoded video data to destination device 14.
- a file server includes a web server, a file transfer protocol server, a network attached storage device, or a local disk unit.
- Destination device 14 can access the encoded video data via any standard data connection that includes an Internet connection.
- This data connection may include a wireless channel (eg, a Wi-Fi connection), a wired connection (eg, a cable modem, etc.), or a combination of both, suitable for accessing encoded video data stored on a file server.
- the transmission of encoded video data from storage device 24 may be streaming, downloading, or a combination of both.
- the techniques of this application are not necessarily limited to wireless applications or settings. The techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcast, cable television transmission, satellite television transmission, streaming video transmission (eg, via the Internet), encoding of digital video for storage on a data storage medium, decoding of digital video stored on a data storage medium, or other applications.
- the system can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
- source device 12 includes video source 18, video encoder 20, and output interface 22.
- output interface 22 may include a modulator/demodulator (modem) and/or a transmitter.
- video source 18 may include sources such as a video capture device (eg, a camera), a video archive containing previously captured video, a video feed interface to receive video from a video content provider, and/or a computer graphics system for generating computer graphics data as the source video, or a combination of these sources.
- if the video source 18 is a video camera, the source device 12 and the destination device 14 may form a so-called camera phone or video phone.
- the techniques described in this application are illustratively applicable to video coding and may be applied to wireless and/or wired applications.
- Captured, pre-captured, or computer generated video may be encoded by video encoder 20.
- the encoded video data can be transmitted directly to the destination device 14 via the output interface 22 of the source device 12.
- the encoded video data may also (or alternatively) be stored on storage device 24 for later access by destination device 14 or other device for decoding and/or playback.
- the destination device 14 includes an input interface 28, a video decoder 30, and a display device 32.
- input interface 28 can include a receiver and/or a modem.
- Input interface 28 of destination device 14 receives encoded video data via link 16.
- the encoded video data communicated over link 16 or provided on storage device 24 may include various syntax elements generated by video encoder 20 for use by a video decoder, such as video decoder 30, in decoding the video data. These syntax elements can be included with encoded video data that is transmitted over a communication medium, stored on a storage medium, or stored on a file server.
- Display device 32 may be integrated with destination device 14 or external to destination device 14.
- destination device 14 can include an integrated display device and is also configured to interface with an external display device.
- the destination device 14 can be a display device.
- display device 32 displays decoded video data to a user and may include any of a variety of display devices, such as a liquid crystal display, a plasma display, an organic light emitting diode display, or another type of display device.
- Video encoder 20 and video decoder 30 may operate in accordance with, for example, the next generation video codec compression standard (H.266) currently under development, and may conform to the H.266 test model, the Joint Exploration Model (JEM).
- video encoder 20 and video decoder 30 may also operate according to, for example, the ITU-T H.265 standard, also referred to as the High Efficiency Video Coding standard, or the ITU-T H.264 standard, or other proprietary or industry standards, or extensions of these standards.
- the ITU-TH.264 standard is alternatively referred to as MPEG-4 Part 10, also known as advanced video coding (AVC).
- the techniques of this application are not limited to any particular decoding standard.
- Other possible implementations of the video compression standard include MPEG-2 and ITU-TH.263.
- video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder and may include a suitable multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software to handle the encoding of both audio and video in a common data stream or in a separate data stream.
- the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol or other protocols such as the User Datagram Protocol (UDP).
- Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), Field Programmable Gate Array (FPGA), discrete logic, software, hardware, firmware, or any combination thereof.
- the apparatus may store the instructions of the software in a suitable non-transitory computer readable medium and execute the instructions in hardware using one or more processors to perform the techniques of the present application.
- Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device.
- the present application may illustratively involve video encoder 20 "signaling" particular information to another device, such as video decoder 30.
- video encoder 20 may signal information by associating particular syntax elements with various encoded portions of the video data. That is, video encoder 20 may "signal" the data by storing the particular syntax elements to the header information of the various encoded portions of the video data.
- these syntax elements may be encoded and stored (eg, stored to storage system 34 or file server 36) prior to being received and decoded by video decoder 30.
- the term "signaling" may illustratively refer to the communication of syntax or other data used to decode the compressed video data, whether this communication occurs in real time, in near real time, or over a span of time, such as may occur when a syntax element is stored to a medium at encoding time; the syntax element can then be retrieved by the decoding device at any time after being stored to the medium.
- the JCT-VC developed the H.265 (HEVC) standard.
- HEVC standardization is based on an evolution model of a video decoding device called the HEVC Test Model (HM).
- the latest standard documentation for H.265 is available at http://www.itu.int/rec/T-REC-H.265.
- the latest version of the standard document is H.265 (12/16), and the full text of that standard document is incorporated herein by reference.
- the HM assumes that the video decoding device has several additional capabilities with respect to existing algorithms of ITU-TH.264/AVC. For example, H.264 provides nine intra-prediction coding modes, while HM provides up to 35 intra-prediction coding modes.
- JVET is committed to the development of the H.266 standard.
- the H.266 standardization process is based on an evolution model of a video decoding device called the H.266 test model.
- the algorithm description of H.266 is available from http://phenix.int-evry.fr/jvet, and the latest algorithm description is included in JVET-F1001-v2, which is incorporated herein by reference in its entirety.
- the reference software for the JEM test model is available from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/, which is also incorporated herein by reference in its entirety.
- HM can divide a video frame or image into a sequence of treeblocks, or largest coding units (LCUs), that contain both luma and chroma samples; an LCU is also referred to as a coding tree unit (CTU).
- Treeblocks have similar purposes to macroblocks of the H.264 standard.
- a slice contains several consecutive treeblocks in decoding order.
- a video frame or image can be segmented into one or more slices.
- Each tree block can be split into coding units according to a quadtree. For example, a tree block that is the root node of a quadtree can be split into four child nodes, and each child node can be a parent node again and split into four other child nodes.
- the final non-splittable child nodes, which are the leaf nodes of the quadtree, comprise decoding nodes, such as decoded image blocks.
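The quadtree recursion described above can be sketched in a few lines of Python. This is a hypothetical toy model for illustration only, not the HM implementation; the `should_split` callback is an assumed stand-in for the encoder's actual rate-distortion decision logic:

```python
def quadtree_leaves(x, y, size, min_size, should_split):
    """Recursively split a square block into four equal children until
    should_split() declines or the minimum size is reached.
    Returns the leaf blocks (the 'decoding nodes') as (x, y, size)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):          # each child can become a parent again
            for dx in (0, half):
                leaves.extend(quadtree_leaves(x + dx, y + dy, half,
                                              min_size, should_split))
        return leaves
    return [(x, y, size)]

# Example: split a 64x64 tree block only at the root -> four 32x32 leaves.
leaves = quadtree_leaves(0, 0, 64, 8, lambda x, y, s: s == 64)
print(leaves)
```

The maximum split count and minimum node size mentioned in the syntax data correspond here to the recursion depth implied by `min_size`.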
- the syntax data associated with the decoded code stream may define the maximum number of times the tree block can be split, and may also define the minimum size of the decoded node.
- a coding unit (CU) includes a decoding node as well as the prediction units (PUs) and transform units (TUs) associated with the decoding node.
- the size of the CU corresponds to the size of the decoding node and the shape must be square.
- the size of the CU may range from 8 x 8 pixels up to a maximum of 64 x 64 pixels or larger.
- Each CU may contain one or more PUs and one or more TUs.
- syntax data associated with a CU may describe a situation in which a CU is partitioned into one or more PUs.
- the partition mode may differ depending on whether the CU is coded in skip or direct mode, intra prediction mode, or inter prediction mode.
- the PU can be divided into a shape that is non-square.
- syntax data associated with a CU may also describe a situation in which a CU is partitioned into one or more TUs according to a quadtree.
- the shape of the TU can be square or non-square.
- the HEVC standard allows for transforms based on TUs, which can be different for different CUs.
- the TU is typically sized based on the size of the PU within a given CU defined for the partitioned LCU, although this may not always be the case.
- the size of the TU is usually the same as or smaller than the PU.
- the residual samples corresponding to the CU may be subdivided into smaller units using a quadtree structure called a "residual quadtree" (RQT).
- the leaf node of the RQT can be referred to as a TU.
- the pixel difference values associated with the TU may be transformed to produce transform coefficients, which may be quantized.
- TUs use transform and quantization processes.
- a given CU with one or more PUs may also contain one or more TUs.
- video encoder 20 may calculate a residual value corresponding to the PU.
- the residual values comprise pixel difference values, which may be transformed into transform coefficients, quantized, and scanned using the TUs to produce serialized transform coefficients for entropy coding.
- the present application generally refers to the decoding node of a CU using the term "image block".
- image block may also be used herein to refer to a tree block containing a decoding node as well as a PU and a TU, eg, an LCU or CU.
- a video sequence usually contains a series of video frames or images.
- a group of pictures (GOP) illustratively includes a series of one or more video images.
- the GOP may include syntax data in the header information of the GOP, in the header information of one or more of the images, or elsewhere, the syntax data describing the number of images included in the GOP.
- Each slice of an image may contain slice syntax data describing the encoding mode of the corresponding image.
- Video encoder 20 typically operates on image blocks within individual video slices to encode the video data.
- An image block may correspond to a decoding node within a CU.
- Image blocks may have fixed or varying sizes and may vary in size depending on the specified decoding criteria.
- HM supports prediction with various PU sizes. Assuming that the size of a specific CU is 2N×2N, HM supports intra prediction with a PU size of 2N×2N or N×N, and inter prediction with symmetric PU sizes of 2N×2N, 2N×N, N×2N or N×N. HM also supports asymmetric partitioning for inter prediction with PU sizes of 2N×nU, 2N×nD, nL×2N and nR×2N. In asymmetric partitioning, one direction of the CU is not divided, while the other direction is divided into 25% and 75%.
- "2N×nU" refers to a horizontally partitioned 2N×2N CU with a 2N×0.5N PU at the top and a 2N×1.5N PU at the bottom.
- "N×M" and "N by M" are used interchangeably to refer to the pixel size of an image block in terms of its horizontal and vertical dimensions, for example 16×8 pixels or 16 by 8 pixels.
- a 16×8 block has 16 pixels in the horizontal direction (a width of 16 pixels) and 8 pixels in the vertical direction (a height of 8 pixels).
- video encoder 20 may calculate residual data for the TU of the CU.
- a PU may comprise pixel data in the spatial domain (also referred to as the pixel domain), and a TU may comprise coefficients in the transform domain after a transform (eg, a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform) is applied to the residual video data.
- the residual data may correspond to a pixel difference between a pixel of the uncoded image and a predicted value corresponding to the PU.
- Video encoder 20 may form a TU that includes residual data for the CU, and then transform the TU to generate transform coefficients for the CU.
- An image block refers to a two-dimensional array of sample points, which can be a square array or a rectangular array.
- a 4×4 image block can be regarded as a square array composed of 4×4 = 16 sample points.
- the signal within the image block refers to the sampled value of the sample point within the image block.
- sample points may also be referred to as pixels or pels; these terms are used interchangeably in this document.
- the value of a sample point may also be referred to as a pixel value; these terms are likewise used interchangeably in this application.
- the image can also be represented as a two-dimensional array of sample points, labeled in a similar way to the image block.
- video encoder 20 may perform quantization of the transform coefficients.
- Quantization illustratively refers to the process of quantizing the coefficients to possibly reduce the amount of data used to represent the coefficients to provide further compression.
- the quantization process can reduce the bit depth associated with some or all of the coefficients. For example, the n-bit value can be rounded down to an m-bit value during quantization, where n is greater than m.
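The n-bit to m-bit reduction can be illustrated with a toy example. This is a plain right shift that only shows the bit-depth idea; the actual HEVC quantizer additionally depends on a quantization parameter and scaling, so treat this as an assumption-laden sketch:

```python
def reduce_bit_depth(value, n, m):
    """Round an n-bit non-negative value down to an m-bit value by
    dropping the (n - m) least significant bits (requires n > m)."""
    assert n > m and 0 <= value < (1 << n)
    return value >> (n - m)

# A 9-bit coefficient reduced to 5 bits: 300 // 16 = 18.
print(reduce_bit_depth(300, 9, 5))  # 18
```

The inverse quantization step performed at the decoder would multiply back by the dropped factor, recovering only an approximation of the original value, which is the source of quantization loss.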
- the JEM model further improves the coding structure of video images.
- a block coding structure called "Quad Tree Combined Binary Tree" (QTBT) is introduced.
- the QTBT structure abandons the separate CU, PU and TU concepts of HEVC, and supports more flexible CU partition shapes.
- One CU can be square or rectangular.
- a CTU first performs quadtree partitioning, and the leaf nodes of the quadtree further perform binary tree partitioning.
- there are two division modes in binary tree division: symmetric horizontal division and symmetric vertical division.
- the leaf nodes of the binary tree are called CUs, and the CUs of the JEM cannot be further divided during the prediction and transformation process, that is, the CUs, PUs, and TUs of the JEM have the same block size.
- the maximum size of the CTU is 256 ⁇ 256 luma pixels.
- video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to produce a serialized vector that can be entropy encoded.
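One common predefined scan is a diagonal zigzag that serializes low-frequency coefficients first. The following is a minimal sketch for a square block; the actual scan pattern an encoder uses depends on the standard and the coding mode, so this is illustrative only:

```python
def zigzag_scan(block):
    """Serialize a square coefficient matrix along anti-diagonals,
    alternating direction, so low-frequency coefficients come first."""
    n = len(block)
    order = []
    for s in range(2 * n - 1):            # s indexes the anti-diagonals
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 1:                    # alternate traversal direction
            diag.reverse()
        order.extend(diag)
    return [block[i][j] for i, j in order]

# Typical quantized block: large values cluster in the top-left corner.
block = [[9, 8, 6, 3],
         [8, 7, 4, 2],
         [6, 4, 2, 1],
         [3, 2, 1, 0]]
print(zigzag_scan(block))  # [9, 8, 8, 6, 7, 6, 3, 4, 4, 3, 2, 2, 2, 1, 1, 0]
```

Because the trailing zeros end up contiguous at the tail of the vector, run-length and entropy coding of the serialized coefficients become more efficient.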
- video encoder 20 may perform an adaptive scan. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy encode the one-dimensional vector using context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method.
- Video encoder 20 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 30 to decode the video data.
- FIG. 2A is a schematic block diagram of a video encoder 20 in the embodiment of the present application.
- video encoder 20 may perform an image prediction process, and in particular, motion compensation unit 44 in video encoder 20 may perform an image prediction process.
- video encoder 20 may include a prediction module 41, a summer 50, a transform module 52, a quantization module 54, and an entropy encoding module 56.
- the prediction module 41 may include a motion estimation unit 42, a motion compensation unit 44, and an intra prediction unit 46.
- the internal structure of the prediction module 41 is not limited in this embodiment of the present application.
- video encoder 20 may also include inverse quantization module 58, inverse transform module 60, and summer 62.
- the video encoder 20 may further include a partitioning unit (not shown) and a reference image memory 64; it should be understood that the partitioning unit and the reference image memory 64 may also be disposed outside of the video encoder 20;
- video encoder 20 may also include a filter (not shown) to filter block boundaries to remove blockiness artifacts from the reconstructed video.
- the filter will typically filter the output of summer 62 as needed.
- the video encoder 20 receives video data, and the dividing unit divides the data into image blocks.
- This segmentation may also include segmentation into slices, image blocks, or other larger units, for example image block segmentation based on the quadtree structure of LCUs and CUs.
- a slice can be divided into multiple image blocks.
- the prediction module 41 is configured to generate a prediction block of the image block currently being coded. Prediction module 41 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes, based on the encoding quality and cost calculation results (eg, the rate-distortion cost, RDcost). Prediction module 41 may provide the resulting intra-coded or inter-coded block to summer 50 to generate residual block data, and to summer 62 to reconstruct the coded block for use as part of a reference image.
- Motion estimation unit 42 and motion compensation unit 44 within prediction module 41 perform inter-predictive coding of the current image block relative to one or more prediction blocks in one or more reference images to provide temporal compression.
- Motion estimation unit 42 is configured to determine an inter prediction mode for a video slice based on a predetermined pattern of the video sequence.
- the predetermined pattern may designate the video slices in the sequence as P slices, B slices, or GPB slices.
- Motion estimation unit 42 and motion compensation unit 44 may be highly integrated, but are separately illustrated for conceptual purposes.
- motion estimation, performed by motion estimation unit 42, is the process of generating motion vectors that estimate the motion of image blocks.
- a motion vector may indicate the displacement of a PU of an image block within the current video frame or image relative to a prediction block within a reference image.
- a prediction block is a block that is found to closely match the PU of the image block to be coded in terms of pixel difference, where the pixel difference may be determined by the sum of absolute differences (SAD), the sum of squared differences (SSD), or other difference metrics.
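The SAD and SSD metrics can be sketched directly from their definitions. This is a simplified illustration over nested-list blocks, not tied to any particular encoder's optimized implementation:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

cur = [[10, 12], [14, 16]]
ref = [[11, 10], [14, 20]]
print(sad(cur, ref))  # 1 + 2 + 0 + 4 = 7
print(ssd(cur, ref))  # 1 + 4 + 0 + 16 = 21
```

During motion estimation, the candidate position in the reference image with the smallest such metric is taken as the best match.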
- video encoder 20 may calculate a value of a sub-integer pixel location of a reference image stored in reference image memory 64.
- the motion estimation unit 42 calculates a motion vector for a PU of an image block in an inter-coded slice by comparing the position of the PU with the position of a prediction block of a reference image.
- the reference images may be selected from a first reference image list (List 0) or a second reference image list (List 1), each of the lists identifying one or more reference images stored in the reference image memory 64.
- Motion estimation unit 42 transmits the computed motion vector to entropy encoding module 56 and motion compensation unit 44.
- Motion compensation performed by motion compensation unit 44 may involve extracting or generating a prediction block based on motion vectors determined by motion estimation, possibly performing interpolation to sub-pixel precision. After receiving the motion vector of the PU of the current image block, motion compensation unit 44 may locate the prediction block to which the motion vector is directed in one of the reference image lists.
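The interpolation to sub-pixel precision mentioned above can be illustrated with a two-tap bilinear half-sample filter. This is an assumed simplification for illustration; real codecs such as HEVC use longer filters (eg, 8-tap for luma half-sample positions):

```python
def half_pel_horizontal(row):
    """Bilinear half-sample interpolation along one row of integer-position
    pixels: each output sample lies midway between two neighbors.
    The '+ 1' implements rounding before the integer division."""
    return [(row[i] + row[i + 1] + 1) // 2 for i in range(len(row) - 1)]

print(half_pel_horizontal([100, 104, 96]))  # [102, 100]
```

A motion vector with a fractional horizontal component would select these interpolated half-sample values instead of the stored integer-position pixels when forming the prediction block.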
- the video encoder 20 forms a residual image block by subtracting the pixel value of the prediction block from the pixel value of the current image block being decoded, thereby forming a pixel difference value.
- the pixel difference values form residual data for the block and may include both luminance and chrominance difference components.
- Summer 50 represents one or more components that perform this subtraction.
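The subtraction performed by summer 50 amounts to an element-wise pixel difference, which can be sketched as follows (a minimal illustration; per-component luma/chroma handling is omitted):

```python
def residual_block(current, prediction):
    """Element-wise pixel difference between the current image block and
    its prediction block (the operation attributed to summer 50)."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]

res = residual_block([[10, 12], [14, 16]], [[11, 10], [14, 20]])
print(res)  # [[-1, 2], [0, -4]]
```

At the decoder, the reconstruction step (summer 62 / reconstruction module 90) performs the inverse: adding the decoded residual back onto the prediction block.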
- Motion compensation unit 44 may also generate syntax elements associated with the image blocks and the video slice for use by video decoder 30 in decoding the image blocks of the video slice.
- the image prediction process of the embodiment of the present application will be described in detail below with reference to FIG. 3, FIG. 10-12, and FIG. 14-17, and details are not described herein again.
- Intra prediction unit 46 within prediction module 41 may perform intra-predictive coding of the current image block relative to one or more neighboring blocks in the same image or slice as the current block to be coded, to provide spatial compression.
- intra-prediction unit 46 may intra-predict the current block.
- intra prediction unit 46 may determine an intra prediction mode to encode the current block.
- intra-prediction unit 46 may encode the current block using various intra prediction modes, for example during separate encoding passes, and intra-prediction unit 46 (or, in some possible implementations, a mode selection unit 40) may select an appropriate intra prediction mode to use from the tested modes.
- the video encoder 20 forms a residual image block by subtracting the prediction block from the current image block.
- the residual video data in the residual block may be included in one or more TUs and applied to transform module 52.
- the transform module 52 is configured to transform the residual between the original block of the current coded image block and the predicted block of the current image block.
- Transform module 52 transforms the residual data into residual transform coefficients using, for example, a discrete cosine transform (DCT) or a conceptually similar transform (eg, a discrete sine transform (DST)).
- Transform module 52 may convert the residual video data from the pixel domain to a transform domain (eg, a frequency domain).
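The pixel-domain to transform-domain conversion can be sketched with a textbook floating-point separable 2-D DCT-II. Real encoders use integer approximations of this transform, so this block is a mathematical illustration rather than any codec's implementation:

```python
import math

def dct_2d(block):
    """Orthonormal separable 2-D DCT-II of a square block."""
    n = len(block)
    def basis(k, i):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        return scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
    return [[sum(basis(u, i) * basis(v, j) * block[i][j]
                 for i in range(n) for j in range(n))
             for v in range(n)] for u in range(n)]

# For a flat 4x4 block of value 8, all energy lands in the DC coefficient.
coeffs = dct_2d([[8] * 4 for _ in range(4)])
print(round(coeffs[0][0]), round(coeffs[1][1]))  # 32 0
```

This energy compaction into a few low-frequency coefficients is what makes the subsequent quantization and scan steps effective at reducing the code rate.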
- Transform module 52 may send the resulting transform coefficients to quantization module 54.
- Quantization module 54 quantizes the transform coefficients to further reduce the code rate.
- quantization module 54 may then perform a scan of the matrix containing the quantized transform coefficients.
- entropy encoding module 56 may perform a scan.
- entropy encoding module 56 may entropy encode the quantized transform coefficients. For example, entropy encoding module 56 may perform context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique. Entropy encoding module 56 may also entropy encode the motion vectors and other syntax elements of the current video slice being encoded. After entropy encoding by entropy encoding module 56, the encoded code stream may be transmitted to video decoder 30, or archived for later transmission to or retrieval by video decoder 30.
- Inverse quantization module 58 and inverse transform module 60 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain for later use as a reference block for the reference image.
- the summer 62 adds the reconstructed residual block to the prediction block generated by the prediction module 41 to produce a reconstructed block and serves as a reference block for storage in the reference image memory 64.
- These reference blocks may be used by motion estimation unit 42 and motion compensation unit 44 as reference blocks to inter-predict blocks in subsequent video frames or images.
- video encoder 20 may directly quantize the residual signal without processing by transform module 52, and accordingly without processing by inverse transform module 60; alternatively, for some image blocks or image frames, video encoder 20 may not generate residual data, and accordingly no processing by transform module 52, quantization module 54, inverse quantization module 58 or inverse transform module 60 is needed; alternatively, video encoder 20 may store the reconstructed image block directly as a reference block without processing by a filter unit; alternatively, quantization module 54 and inverse quantization module 58 in video encoder 20 may be merged together; alternatively, transform module 52 and inverse transform module 60 in video encoder 20 may be merged together; alternatively, summer 50 and summer 62 may be merged together.
- FIG. 2B is a schematic block diagram of a video decoder 30 in the embodiment of the present application.
- video decoder 30 may perform an image prediction process, and in particular, motion compensation unit 82 in video decoder 30 may perform an image prediction process.
- video decoder 30 may include an entropy decoding module 80, a prediction processing module 81, an inverse quantization module 86, an inverse transform module 88, and a reconstruction module 90.
- the prediction module 81 may include a motion compensation unit 82 and an intra prediction unit 84, which are not limited in this embodiment of the present application.
- video decoder 30 may also include reference image memory 92. It should be understood that the reference image memory 92 can also be disposed outside of video decoder 30. In some possible implementations, video decoder 30 may perform a decoding process that is illustratively reciprocal to the encoding flow of video encoder 20 described with respect to FIG. 2A.
- video decoder 30 receives from video encoder 20 an encoded video code stream representing the image blocks of an encoded video slice and the associated syntax elements.
- Video decoder 30 may receive syntax elements at the video slice level and/or the image block level.
- Entropy decoding module 80 of video decoder 30 entropy decodes the bitstream/codestream to produce quantized coefficients and some syntax elements.
- Entropy decoding module 80 forwards the syntax elements to prediction processing module 81.
- the syntax elements herein may include inter prediction data related to the current image block, and the inter prediction data may include an index identifier block_based_index indicating which motion information is used by the current image block;
- a switch flag block_based_enable_flag may also be included to indicate whether image prediction is performed on the current image block using the method of FIG. 3 or FIG. 14 (in other words, whether inter prediction is performed on the current image block under the MVD mirror constraint proposed by the present application), or whether image prediction is performed on the current image block using the method of FIG. 12 or FIG. 16 (in other words, whether inter prediction is performed on the current image block under the temporal-distance proportional relationship proposed by the present application).
- when the video slice is decoded as an intra-coded (I) slice, the intra prediction unit 84 of prediction processing module 81 may generate a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or image.
- motion compensation unit 82 of prediction processing module 81 may determine, based on the syntax elements received from entropy decoding module 80, the inter prediction mode used to decode the current image block of the current video slice, and decode the current image block (e.g., perform inter prediction) based on the determined inter prediction mode.
- specifically, motion compensation unit 82 may determine which image prediction method is used to predict the current image block of the current video slice; for example, the syntax elements may indicate that the current image block is predicted using an image prediction method based on the MVD mirror constraint, in which case the motion information of the current image block of the current video slice is predicted or refined, so that the prediction block of the current image block is acquired or generated by the motion compensation process using the predicted motion information of the current image block.
- the motion information herein may include reference image information and motion vectors, wherein the reference image information may include, but is not limited to, unidirectional/bidirectional prediction information, a reference image list number, and a reference image index corresponding to the reference image list.
- a prediction block may be generated from one of the reference pictures within one of the reference picture lists.
- the video decoder 30 may construct reference image lists, that is, list 0 and list 1, based on the reference images stored in reference image memory 92.
- The reference frame index of the current image may be included in one or both of reference frame list 0 and list 1.
- video encoder 20 may signal an indication of which new image prediction method is employed.
- the prediction processing module 81 is configured to generate a prediction block of the currently decoded image block; specifically, when the video slice is decoded as an intra-coded (I) slice, the intra prediction unit 84 of the prediction module 81 may generate a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded image blocks of the current frame or image.
- when the video frame is decoded as an inter-coded slice, motion compensation unit 82 of prediction module 81 generates a prediction block for the image block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding module 80.
- Inverse quantization module 86 inverse quantizes, ie, dequantizes, the quantized transform coefficients provided in the bitstream and decoded by entropy decoding module 80.
- the inverse quantization process may include using the quantization parameters calculated by video encoder 20 for each of the video slices to determine the degree of quantization that should be applied and likewise determine the degree of inverse quantization that should be applied.
- Inverse transform module 88 applies the inverse transform to transform coefficients, such as inverse DCT, inverse integer transform, or a conceptually similar inverse transform process, to generate residual blocks in the pixel domain.
- the video decoder 30 obtains the reconstructed block, that is, the decoded image block, by summing the residual block from inverse transform module 88 with the corresponding prediction block generated by motion compensation unit 82.
- Summer 90 represents the component that performs this summation operation.
- a loop filter (either in the decoding loop or after the decoding loop) can also be used to smooth pixel transitions or otherwise improve video quality, if desired.
- a filter unit may represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
- decoded image blocks in a given frame or image may be stored in decoded image buffer 92, which stores reference images for subsequent motion compensation.
- the decoded image buffer 92 can be part of a memory that can also store decoded video for later presentation on a display device (eg, display device 32 of FIG. 1), or can be separate from such memory.
- video decoder 30 may be used to decode the encoded video bitstream.
- video decoder 30 may generate an output video stream without processing by a filter unit; or, for some image blocks or image frames, entropy decoding module 80 of video decoder 30 does not decode quantized coefficients, and accordingly processing by inverse quantization module 86 and inverse transform module 88 is not required.
- inverse quantization module 86 and inverse transform module 88 in video decoder 30 may be combined.
- FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- the method shown in FIG. 3 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 3 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 3 can occur in the interframe prediction process at the time of encoding and decoding.
- Process 300 may be performed by video encoder 20 or video decoder 30, and in particular by the motion compensation unit of video encoder 20 or video decoder 30. Assuming that a video data stream having multiple video frames is being encoded by a video encoder or decoded by a video decoder, process 300, comprising the following steps, is performed to predict the pixel values of the current image block of the current video frame.
- the method shown in FIG. 3 includes steps 301 to 304, and steps 301 to 304 are described in detail below.
- the image block here may be one image block in the image to be processed, or may be one sub-image in the image to be processed.
- the image block herein may be an image block to be encoded in the encoding process, or may be an image block to be decoded in the decoding process.
- the initial motion information may include indication information of a prediction direction (usually bidirectional prediction), motion vectors pointing to the reference image blocks (usually motion vectors of neighboring blocks), and image information of the reference image blocks (generally understood as reference image information), where the motion vectors include a forward motion vector and a backward motion vector, and the reference image information includes reference frame index information of the forward-prediction reference image block and the backward-prediction reference image block.
- the method may be performed in various manners. For example, the following manners 1 and 2 may be used to obtain initial motion information of the image block.
- a candidate motion information list is constructed according to the motion information of the neighboring blocks of the current image block, and one candidate motion information entry is selected from the candidate motion information list as the initial motion information of the current image block.
- the candidate motion information list includes a motion vector, reference frame index information, and the like.
- the motion information of the neighboring block A0 (see the candidate motion information with index 0 in FIG. 5) is selected as the initial motion information of the current image block; specifically, the forward motion vector of A0 is used as the forward motion vector of the current block, and the backward motion vector of A0 is used as the backward motion vector of the current block.
- a motion vector predictor list is constructed according to the motion information of the neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as the motion vector predictor of the current image block.
- The motion vector of the current image block may be the motion vector of the selected neighboring block, or may be the sum of the motion vector of the selected neighboring block and the motion vector difference of the current image block, where the motion vector difference is the difference between the motion vector obtained by motion estimation of the current image block and the motion vector of the selected neighboring block.
- the motion vectors corresponding to the indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block.
- N is an integer greater than one.
- the current image to which the current image block belongs has two reference images, one before it and one after it in time, that is, a forward reference image and a backward reference image.
- the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
- step 302 can include:
- the position of the initial forward reference block is used as the first search starting point (indicated by (0, 0) in FIG. 8), and the positions of (N-1) candidate forward reference blocks are determined around it in the forward reference image;
- the position of the initial backward reference block is used as the second search starting point, and the positions of (N-1) candidate backward reference blocks are determined around it in the backward reference image.
- the positions of the N forward reference blocks include the position of the initial forward reference block (indicated by (0, 0)) and the positions of the (N-1) candidate forward reference blocks (indicated by (0, -1), (-1, -1), (-1, 1), (1, -1), (1, 1), and so on), where the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance (as shown in FIG. 8).
- the accuracy of the MV may be fractional pixel precision (e.g., 1/2 pixel precision or 1/4 pixel precision). If the image contains only pixel values at integer pixel positions and the accuracy of the current MV is fractional pixel precision, the pixel values at integer pixel positions of the reference image need to be interpolated by an interpolation filter to obtain the pixel values at sub-pixel positions, which are used as the pixel values of the prediction block of the current block.
- the specific interpolation operation process is related to the interpolation filter used. Generally, the pixel value of the integer pixel point around the reference pixel point can be linearly weighted to obtain the value of the reference pixel point. Commonly used interpolation filters are 4 taps, 6 taps, 8 taps, and so on.
- Ai,j is the pixel at an integer pixel position, and its bit width is bitDepth.
- a0,0, b0,0, c0,0, d0,0, h0,0, n0,0, e0,0, i0,0, p0,0, f0,0, j0,0, q0,0, g0,0, k0,0, and r0,0 are the pixels at sub-pixel positions. If an 8-tap interpolation filter is used, a0,0 can be calculated by the following formula:
- a0,0 = (C0*A-3,0 + C1*A-2,0 + C2*A-1,0 + C3*A0,0 + C4*A1,0 + C5*A2,0 + C6*A3,0 + C7*A4,0) >> shift1
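As an illustrative sketch of this 8-tap filtering (the coefficients C0..C7 are codec-specific; the HEVC half-pel luma filter taps are borrowed here purely as an example, with shift1 chosen to normalize the filter sum rather than following any particular codec's intermediate bit-depth rules):

```python
# Horizontal 8-tap interpolation at a half-pel position, as a sketch.
# These taps are HEVC's half-pel luma filter, used only for illustration;
# the C0..C7 of the formula above depend on the codec.
TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]  # coefficients sum to 64
SHIFT1 = 6                               # log2(64), normalizes the result

def interp_half_pel(row, x):
    """Half-pel sample between row[x] and row[x+1]; needs row[x-3..x+4]."""
    acc = sum(c * row[x - 3 + i] for i, c in enumerate(TAPS))
    return acc >> SHIFT1
```

On a flat region the filter reproduces the constant (e.g. `interp_half_pel([10]*16, 7)` gives 10), and on a linear ramp it yields the floored midpoint, which is the expected behavior of a normalized interpolation filter.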
- determining a position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, the first position offset is in a mirror relationship with the second position offset; the first position offset indicates the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset indicates the positional offset of the position of the backward reference block relative to the position of the initial backward reference block; M is an integer greater than or equal to 1, and M is less than or equal to N.
- the position of the candidate forward reference block 904 in the forward reference image Ref0 is offset by MVD0 (delta0x, delta0y) relative to the position of the initial forward reference block 902 (i.e., the forward search base point);
- the position of the candidate backward reference block 905 in the backward reference image Ref1 is offset by MVD1 (delta1x, delta1y) relative to the position of the initial backward reference block 903 (i.e., the backward search base point).
- MVD0 = -MVD1; that is: delta0x = -delta1x, delta0y = -delta1y.
- step 303 can include:
- from the positions of the M pairs of reference blocks, the position of the pair of reference blocks with the smallest matching error is determined as the position of the target forward reference block and the position of the target backward reference block of the current image block; or, from the positions of the M pairs of reference blocks, the position of a pair of reference blocks whose matching error is less than or equal to a matching error threshold is determined as the position of the target forward reference block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
- Step 304: the pixel values of the target forward reference block and the pixel values of the target backward reference block are weighted to obtain the predicted values of the pixel values of the current image block.
- the method shown in FIG. 3 further includes: obtaining updated motion information of the current image block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, where The updated forward motion vector points to the location of the target forward reference block, and the updated backward motion vector points to the location of the target backward reference block.
- the updated motion information of the current image block may be determined based on the position of the target forward reference block, the position of the target backward reference block, and the position of the current image block, or may be determined based on the first position offset and the second position offset corresponding to the determined position of the pair of reference blocks.
- the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. On the basis that the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block are in a mirror relationship, the position of a pair of reference blocks determined (e.g., with the smallest matching cost) from the positions of the N pairs of reference blocks is taken as the position of the target forward reference block (i.e., the best forward reference block/forward prediction block) and the position of the target backward reference block (i.e., the best backward reference block/backward prediction block) of the current image block, so that the predicted values of the pixel values of the current image block are obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
- the method in the embodiment of the present application avoids pre-computing a template matching block and avoids using the template matching block to perform the forward search matching and the backward search matching separately, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
- FIG. 10 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- the method shown in FIG. 10 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 10 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 10 can occur in the interframe prediction process at the time of encoding and decoding.
- the method shown in FIG. 10 includes steps 1001 to 1007, and the steps 1001 to 1007 are described in detail below.
- a set of motion information is obtained from the merge candidate list according to the merge index, and this motion information is used as the initial motion information of the current block.
- the MVP is obtained from the MVP candidate list according to the AMVP index, and the MV of the current block is obtained by summing the MVP and the MVD included in the code stream.
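The MV reconstruction just described is plain component-wise addition; a minimal sketch (the tuple layout of the vectors is illustrative, not from the specification):

```python
def reconstruct_mv(mvp, mvd):
    """MV of the current block = motion vector predictor + decoded MVD,
    added component-wise for the horizontal and vertical components."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# e.g. predictor (5, -3) plus signalled difference (1, 2) gives (6, -1)
```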
- the initial motion information includes reference image indication information and motion vectors; the forward reference image and the backward reference image are determined by the reference image indication information, and the position of the forward reference block and the position of the backward reference block are determined by the motion vectors.
- Step 1002: Determine, in the forward reference image, the location of the starting forward reference block of the current image block, where the location of the starting forward reference block is the search starting point (also referred to as the search base point) in the forward reference image;
- a search base point (hereinafter referred to as a first search base point) in the forward reference image is obtained.
- the forward MV information is (MV0x, MV0y).
- the position information of the current block is (B0x, B0y).
- the first search base point of the forward reference image is (MV0x+B0x, MV0y+B0y).
- a search base point in the backward reference image (hereinafter referred to as a second search base point) is obtained.
- the backward MV is (MV1x, MV1y).
- the position information of the current block is (B0x, B0y).
- the second search base point of the backward reference picture is (MV1x+B0x, MV1y+B0y).
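The two search base points follow directly from the block position and the initial MVs; a sketch, assuming the MVs and block coordinates share the same integer-pixel precision (a real codec would track sub-pel units):

```python
def search_base_points(block_pos, mv0, mv1):
    """First (forward) and second (backward) search base points:
    (MV0x+B0x, MV0y+B0y) and (MV1x+B0x, MV1y+B0y)."""
    b0x, b0y = block_pos
    return (mv0[0] + b0x, mv0[1] + b0y), (mv1[0] + b0x, mv1[1] + b0y)

# block at (100, 50) with MV0 = (2, -1), MV1 = (-2, 1):
# base points (102, 49) and (98, 51)
```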
- the MVD mirror constraint here can be interpreted as follows: the positional offset MVD0 (delta0x, delta0y) of the position of the candidate block in the forward reference image relative to the forward search base point, and the positional offset MVD1 (delta1x, delta1y) of the position of the candidate block in the backward reference image relative to the position of the backward search base point, satisfy MVD0 = -MVD1; that is: delta0x = -delta1x, delta0y = -delta1y.
- a motion search of an integer pixel step is performed starting from a search base point (indicated by (0, 0)).
- the integer pixel step size means that the positional offset of the position of the candidate reference block relative to the search base point is an integer pixel distance. It should be pointed out that regardless of whether the search base point is at an integer pixel position (the starting point can be an integer pixel or a sub-pixel, such as 1/2, 1/4, 1/8, or 1/16 pixel), the motion search in integer pixel steps can be performed first to obtain the position of the forward reference block of the current image block. It should be understood that when searching in integer pixel steps, the search starting point can be either at an integer pixel or at a sub-pixel, for example, an integer pixel, 1/2 pixel, 1/4 pixel, 1/8 pixel, 1/16 pixel, and so on.
- with the (0, 0) point as the search base point, the 8 search points at integer pixel steps around the search base point are searched, and the positions of the corresponding candidate reference blocks are obtained.
- Figure 7 illustrates eight candidate reference blocks. If the positional offset of the position of a forward candidate reference block relative to the position of the forward search base point in the forward reference image is (-1, -1), the positional offset of the position of the corresponding backward candidate reference block in the backward reference image relative to the position of the backward search base point is (1, 1). The positions of the paired forward and backward candidate reference blocks are obtained accordingly. For each obtained pair of reference block positions, the matching cost between the two corresponding candidate reference blocks is calculated. The forward reference block and the backward reference block with the smallest matching cost are taken as the optimal forward reference block and the optimal backward reference block, and the optimal forward motion vector and the optimal backward motion vector are obtained accordingly.
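The paired search of step 1004 can be sketched as follows, assuming the reference pictures are available as 2-D sample arrays and using SAD as the matching cost (the helper names and data layout are illustrative, not from the specification):

```python
# Mirror-constrained integer-pel search: each forward offset (dx, dy) is
# paired with the backward offset (-dx, -dy); the pair of candidate blocks
# with the smallest SAD between them wins.
OFFSETS = [(0, 0), (0, -1), (-1, -1), (-1, 1), (1, -1),
           (1, 1), (0, 1), (-1, 0), (1, 0)]

def get_block(pic, x, y, w, h):
    """w x h block whose top-left corner is (x, y) in picture pic."""
    return [row[x:x + w] for row in pic[y:y + h]]

def sad(b0, b1):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for r0, r1 in zip(b0, b1) for p, q in zip(r0, r1))

def mirrored_search(ref0, ref1, base0, base1, w, h):
    """Return the forward offset MVD0 with the smallest matching cost;
    the backward offset MVD1 is its negation by the mirror constraint."""
    best = None
    for dx, dy in OFFSETS:
        fwd = get_block(ref0, base0[0] + dx, base0[1] + dy, w, h)
        bwd = get_block(ref1, base1[0] - dx, base1[1] - dy, w, h)  # mirrored
        cost = sad(fwd, bwd)
        if best is None or cost < best[0]:
            best = (cost, (dx, dy))
    return best[1]
```

When the two reference pictures are consistent with a forward shift of (1, 0), the search recovers exactly that offset, and its mirror (-1, 0) for the backward side.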
- Steps 1005 and 1006: the motion compensation process is performed using the optimal forward motion vector obtained in step 1004 to obtain the pixel values of the optimal forward reference block, and the motion compensation process is performed using the optimal backward motion vector obtained in step 1004 to obtain the pixel values of the optimal backward reference block.
- a predicted value of a pixel value of a current image block can be obtained according to formula (2).
- predSamples'[x][y] = (predSamplesL0'[x][y] + predSamplesL1'[x][y] + 1) >> 1 (2)
- predSamplesL0' is the optimal forward reference block
- predSamplesL1' is the optimal backward reference block
- predSamples' is the prediction block of the current image block
- predSamplesL0'[x][y] is the pixel value of the optimal forward reference block at the pixel point (x, y)
- predSamplesL1'[x][y] is the pixel value of the optimal backward reference block at the pixel point (x, y)
- predSamples'[x][y] is the pixel value of the final prediction block at the pixel point (x, y)
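Formula (2) is a per-sample average with upward rounding on ties; as a sketch over whole blocks:

```python
def bipred_average(pred_l0, pred_l1):
    """predSamples'[x][y] = (predSamplesL0'[x][y] + predSamplesL1'[x][y] + 1) >> 1,
    applied to every sample of two equally sized prediction blocks."""
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(pred_l0, pred_l1)]

# averaging 10 with 20 gives 15; 11 with 12 rounds the tie 11.5 up to 12
```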
- as for the search method, which search method is used is not limited, and any search method may be used.
- alternatively, the difference between each backward candidate block and the corresponding forward candidate block in step 1004 is calculated, and the backward candidate block with the minimum SAD together with its backward motion vector, and the corresponding forward candidate block together with its forward motion vector, are taken as the optimal backward reference block with the corresponding optimal backward motion vector and the optimal forward reference block with the corresponding optimal forward motion vector.
- in step 1004, only an example of the integer pixel step search method is given.
- a fractional pixel step search can also be used. For example, after performing an integer pixel step search in step 1004, a search for the fractional pixel step size is performed. Or, search directly for the fractional pixel step size.
- the specific search method is not limited here.
- the method for calculating the matching cost is not limited.
- the SAD criterion may be used, or the MR-SAD criterion may be used, or other criteria may be used.
- the traversal operation or the search operation may be terminated early.
- the early termination condition of the search method is not limited.
- the order of step 1005 and step 1006 is not limited, and they may be performed simultaneously or sequentially.
- compared with first calculating a template matching block and then using the template matching block to perform the forward search and the backward search separately, in this embodiment the matching cost is calculated directly between the candidate blocks in the forward reference image and the candidate blocks in the backward reference image during the search for matching blocks, and the two blocks with the smallest matching cost are determined; this simplifies the image prediction process and reduces the complexity while improving the image prediction accuracy.
- FIG. 11 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- the method shown in FIG. 11 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 11 includes steps 1101 to 1105, wherein steps 1101 to 1103, 1105 refer to the description of steps 1001 to 1003, 1007 in FIG. 10, and details are not described herein again.
- the difference between the embodiment of the present application and the embodiment shown in FIG. 10 is that the pixel values of the current optimal forward reference block and the optimal backward reference block are retained and updated during the search process. After the search is completed, the predicted values of the pixel values of the current image block may be calculated using the pixel values of the current optimal forward reference block and the optimal backward reference block.
- Costi is the matching cost of the i-th search
- MinCost represents the current minimum matching cost
- Bfi and Bbi are the pixel values of the forward reference block and the pixel values of the backward reference block obtained by the i-th search, respectively
- BestBf and BestBb are the pixel values of the current optimal forward reference block and the current optimal backward reference block, respectively
- CalCost(M,N) represents the matching cost of block M and block N.
- after the search is completed, BestBf and BestBb are used to obtain the predicted values of the pixel values of the current block.
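The bookkeeping described above (retaining Costi, MinCost, BestBf, and BestBb during the search) can be sketched as follows; `candidate_pairs` and `cal_cost` stand in for the search loop and the matching-cost function and are illustrative:

```python
def search_keep_best(candidate_pairs, cal_cost):
    """Track the lowest-cost (BestBf, BestBb) pair while searching, so the
    prediction can be formed without a second motion-compensation pass."""
    min_cost = None                      # MinCost
    best_bf = best_bb = None             # BestBf, BestBb
    for bf_i, bb_i in candidate_pairs:   # Bfi, Bbi from the i-th search point
        cost_i = cal_cost(bf_i, bb_i)    # Costi = CalCost(Bfi, Bbi)
        if min_cost is None or cost_i < min_cost:
            min_cost, best_bf, best_bb = cost_i, bf_i, bb_i
    return best_bf, best_bb
```

This is the design point of the FIG. 11 variant: the best blocks' pixel values are updated in place during the search, so no extra motion compensation is needed afterwards.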
- FIG. 12 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- the method shown in FIG. 12 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 12 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 12 can occur in the inter prediction process during encoding and decoding.
- Process 1200 may be performed by video encoder 20 or video decoder 30, and in particular by the motion compensation unit of video encoder 20 or video decoder 30. Assuming that a video data stream having multiple video frames is being encoded by a video encoder or decoded by a video decoder, process 1200, comprising the following steps, is performed to predict the pixel values of the current image block of the current video frame.
- The method shown in FIG. 12 includes steps 1201 to 1204, wherein for steps 1201, 1202, and 1204, refer to the description of steps 301, 302, and 304 in FIG. 3; details are not described herein again.
- Step 1203: based on a matching cost criterion, the position of a pair of reference blocks is determined from the positions of the M pairs of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on the temporal distance; the first position offset represents the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset represents the positional offset of the position of the backward reference block relative to the position of the initial backward reference block; M is an integer greater than or equal to 1, and M is less than or equal to N.
- the position of the candidate forward reference block 1304 in the forward reference image Ref0 is offset by MVD0 (delta0x, delta0y) relative to the position of the initial forward reference block 1302 (i.e., the forward search base point);
- the position of the candidate backward reference block 1305 in the backward reference image Ref1 is offset by MVD1 (delta1x, delta1y) relative to the position of the initial backward reference block 1303 (i.e., the backward search base point).
- TC, T0, and T1 represent the time of the current frame, the time of the forward reference picture, and the time of the backward reference picture, respectively.
- TD0 and TD1 represent the time intervals between the current frame and the forward reference image, and between the current frame and the backward reference image, respectively.
- TD0 and TD1 can be calculated using the picture order count (POC).
- POCc, POC0, POC1 represent the POC of the current image, the POC of the forward reference image, and the POC of the backward reference image, respectively.
- TD0 represents a picture order count (POC) distance between the current picture and the forward reference picture;
- TD1 represents a POC distance between the current picture and the backward reference picture.
- delta0x = (TD0/TD1)*delta1x, delta0y = (TD0/TD1)*delta1y; that is:
- delta0x/delta1x = TD0/TD1
- delta0y/delta1y = TD0/TD1
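Under this proportional constraint, the backward offset can be derived from the forward offset and the POC distances. A sketch using signed POC differences (so that equal forward and backward distances degenerate to the mirror case), with simple truncating division as a stand-in for the codec's exact rounding rules, which are not specified here:

```python
def scaled_mvd1(mvd0, poc_c, poc_0, poc_1):
    """Derive MVD1 from MVD0 so that delta0/delta1 == TD0/TD1.
    TD0 = POCc - POC0 and TD1 = POCc - POC1, both taken as signed
    differences (an assumption of this sketch)."""
    td0 = poc_c - poc_0
    td1 = poc_c - poc_1
    # delta1 = delta0 * TD1 / TD0, truncated toward zero for simplicity
    return tuple(int(d * td1 / td0) for d in mvd0)
```

With POCs 0, 2, 4 (equal distances) the result is the exact mirror of MVD0; with unequal distances the backward offset shrinks or grows by the distance ratio.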
- the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks. On the basis that the position offset of the position of the forward reference block relative to the position of the initial forward reference block and the position offset of the position of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the temporal distance, the position of a pair of reference blocks determined from the positions of the N pairs of reference blocks is taken as the position of the target forward reference block (i.e., the best forward reference block/forward prediction block) and the position of the target backward reference block (i.e., the best backward reference block/backward prediction block) of the current image block, so that the predicted values of the pixel values of the current image block are obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
- the method in the embodiment of the present application avoids pre-computing a template matching block and avoids using the template matching block to perform the forward search matching and the backward search matching separately, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
- in the method described above, the search process is performed once.
- the method shown in FIG. 14 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 14 may occur in the encoding process, and may also occur in the decoding process. Specifically, the method shown in FIG. 14 may occur in an encoding process or an interframe prediction process at the time of decoding.
- the method shown in FIG. 14 specifically includes steps 1401 to 1404, as follows:
- the image block here may be one image block in the image to be processed, or may be one sub-image in the image to be processed.
- the image block herein may be an image block to be encoded in the encoding process, or may be an image block to be decoded in the decoding process.
- the first-round motion information is the initial motion information of the current image block; for subsequent rounds, the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
- the initial motion information may include indication information of a prediction direction (usually bidirectional prediction), a motion vector pointing to a reference image block (usually a motion vector of a neighboring block), and image information of a reference image block (generally understood as reference image information), where the motion vector includes a forward motion vector and a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and the backward prediction reference image block.
- the method may be performed in various manners. For example, the following manners 1 and 2 may be used to obtain initial motion information of the image block.
- a candidate motion information list is constructed according to motion information of neighboring blocks of the current image block, and a candidate motion information is selected from the candidate motion information list as The initial motion information of the current image block.
- the candidate motion information list includes a motion vector, reference frame index information, and the like.
- the motion information of the neighboring block A0 (see the candidate motion information with index 0 in FIG. 5) is selected as the initial motion information of the current image block; specifically, the forward motion vector of A0 is used as the forward motion vector of the current block, and the backward motion vector of A0 is used as the backward motion vector of the current block.
- a motion vector predictor list is constructed according to motion information of neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as motion vector prediction of the current image block. value.
- the motion vector of the current image block may be the motion vector value of the adjacent block, or may be the sum of the motion vector of the selected neighboring block and the motion vector difference of the current image block, where the motion vector difference is the difference between the motion vector obtained by motion estimation of the current image block and the motion vector of the selected neighboring block.
- the motion vectors corresponding to the indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block.
- N is an integer greater than one.
- the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
- step 1402 can include:
- the position of the (i-1)-th round target backward reference block is used as the search starting point to determine the positions of (N-1) candidate backward reference blocks in the backward reference picture.
- the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block (indicated by (0, 0)) and the positions of (N-1) candidate forward reference blocks (indicated by (0, -1), (-1, -1), (-1, 1), (1, -1), (1, 1), etc.); the positional offset of the position of each candidate forward reference block relative to the position of the (i-1)-th round target forward reference block is an integer pixel distance (as shown in FIG. 9). Similarly, the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the (i-1)-th round target backward reference block is an integer pixel distance.
- based on the matching cost criterion, the position of a pair of reference blocks is determined from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, the first position offset is in a mirror relationship with the second position offset; the first position offset indicates the positional offset of the position of the forward reference block relative to the position of the (i-1)-th round target forward reference block, and the second position offset indicates the positional offset of the position of the backward reference block relative to the position of the (i-1)-th round target backward reference block; M is an integer greater than or equal to 1, and M is less than or equal to N.
- the first position offset is in a mirror relationship with the second position offset. This can be understood as: the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset.
- the position of the candidate backward reference block 905 in the backward reference picture Ref1 is shifted by MVD1 (delta1x, delta1y) with respect to the position of the (i-1)-th round target backward reference block 903 (ie, the backward search base point).
- MVD0 = -MVD1; ie: delta0x = -delta1x and delta0y = -delta1y.
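As an illustration only, the mirror constraint on candidate reference-block positions can be sketched in Python; the function and variable names are hypothetical, not part of the described method:

```python
def mirrored_candidates(fwd_base, bwd_base, offsets):
    """Generate (forward, backward) candidate position pairs under the
    MVD mirror constraint MVD0 = -MVD1.

    fwd_base, bwd_base: (x, y) search base points in the forward and
    backward reference pictures; offsets: integer-pixel MVD0 offsets
    such as (0, 0), (-1, -1), (1, 1).
    """
    pairs = []
    for dx, dy in offsets:
        fwd = (fwd_base[0] + dx, fwd_base[1] + dy)   # shifted by MVD0
        bwd = (bwd_base[0] - dx, bwd_base[1] - dy)   # shifted by MVD1 = -MVD0
        pairs.append((fwd, bwd))
    return pairs

pairs = mirrored_candidates((10, 10), (20, 20),
                            [(0, 0), (-1, -1), (1, 1)])
```

Each pair keeps the forward and backward offsets opposite in direction and equal in magnitude, as the mirror relationship requires.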
- step 1403 can include:
- from the positions of the M pairs of reference blocks, the position of the pair of reference blocks with the smallest matching error is determined as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block; or, from the positions of the M pairs of reference blocks, the position of a pair of reference blocks whose matching error is less than or equal to a matching error threshold is determined as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block.
- in step 1404, the pixel value of the target forward reference block and the pixel value of the target backward reference block are weighted to obtain a predicted value of the pixel value of the current image block.
- the predicted value of the pixel value of the current image block may be obtained according to other methods, which is not limited in this application.
- the initial motion information is updated to the second round of motion information, wherein the second round of motion information includes: a forward motion vector pointing to the position of the first round target forward reference block and a backward motion vector pointing to the position of the first round target backward reference block, and so on, which makes it possible to effectively predict other image blocks according to this image block when performing the next image prediction.
- the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks; the first position offset, of the position of the forward reference block relative to the position of the initial forward reference block, is in a mirror relationship with the second position offset, of the position of the backward reference block relative to the position of the initial backward reference block. On this basis, the position of a pair of reference blocks determined from the positions of the N pairs of reference blocks (eg, with the least matching cost) is the position of the target forward reference block (ie, the best forward reference block/forward prediction block) of the current image block and the position of the target backward reference block (ie, the best backward reference block/backward prediction block), so that a predicted value of the pixel value of the current image block is obtained based on the pixel value of the target forward reference block and the pixel value of the target backward reference block.
- the method in the embodiment of the present application avoids the process of pre-calculating a template matching block and avoids the process of using the template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
- increasing the number of iterations can further improve the accuracy of the modified MV, thereby further improving the codec performance.
- the flow of the image prediction method in the embodiment of the present application will be described in detail below with reference to FIG. 15.
- the method shown in FIG. 15 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 15 may occur in the encoding process, or may occur in the decoding process. Specifically, the method shown in FIG. 15 may occur in an encoding process or an interframe prediction process at the time of decoding.
- the method shown in FIG. 15 specifically includes steps 1501 to 1508, and step 1501 to step 1508 are described in detail below.
- the initial motion information of the current block is used. For example, for an image block whose encoding mode is merge, motion information is obtained from the merge candidate list according to the index of the merge, and the motion information is the initial motion information of the current block. For example, for an image block whose encoding mode is AMVP, the MVP is obtained from the MVP candidate list according to the index of the AMVP, and the MV of the current block is obtained by summing the MVP and the MVD included in the code stream. If it is not the first round search, the last round of updated MV information is used.
- the motion information includes reference image indication information and motion vector information.
- the forward reference image and the backward reference image are determined by referring to the image indication information.
- the position of the forward reference block and the position of the backward reference block are determined by the motion vector information.
- the search base point in the forward reference image is determined based on the forward MV information and the position information of the current block. It is specifically similar to the process of the embodiment of FIG. 10 or 11. For example, if the forward MV information is (MV0x, MV0y) and the position information of the current block is (B0x, B0y), the search base point in the forward reference image is (MV0x+B0x, MV0y+B0y).
- the search base point in the backward reference image is determined based on the backward MV information and the position information of the current block. It is specifically similar to the process of the embodiment of FIG. 10 or 11. For example, if the backward MV information is (MV1x, MV1y) and the position information of the current block is (B0x, B0y), the search base point of the backward reference picture is (MV1x+B0x, MV1y+B0y).
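The base-point arithmetic of steps 1502 and 1503 above can be sketched as follows; this is a minimal illustration and the function name is hypothetical:

```python
def search_base_point(mv, block_pos):
    """Search base point in a reference picture: the motion vector
    (MVx, MVy) added component-wise to the current block position
    (B0x, B0y), as described for the forward and backward searches."""
    mvx, mvy = mv
    bx, by = block_pos
    return (mvx + bx, mvy + by)

# forward MV (MV0x, MV0y) = (3, -2), current block at (B0x, B0y) = (64, 32)
fwd_base = search_base_point((3, -2), (64, 32))   # (67, 30)
```

The same computation with the backward MV (MV1x, MV1y) yields the backward search base point.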
- the specific search step is similar to the process of the embodiment of FIG. 10 or 11, and will not be described again here.
- step 1505: determine whether an iteration termination condition is reached. If not, steps 1502 and 1503 are performed again. Otherwise, steps 1506 and 1507 are performed.
- L is a preset value and L is an integer greater than 1.
- L can be set in advance before the image is predicted.
- the value of L can also be set according to the required accuracy of image prediction and the complexity of searching for the prediction block.
- L can also be set according to historical experience values, or L can be determined based on verification of the results in the intermediate search process.
- a total of 2 searches are performed in integer pixel steps. In the first search, the position of the initial forward reference block is used as the search base point in the forward reference image (also referred to as the forward reference region) to determine the positions of (N-1) candidate forward reference blocks, and the position of the initial backward reference block is used as the search base point in the backward reference image (also referred to as the backward reference region) to determine the positions of (N-1) candidate backward reference blocks. The matching cost of the corresponding two reference blocks is calculated for one or more pairs of reference block positions among the positions of the N pairs of reference blocks, for example, the matching cost of the initial forward reference block and the initial backward reference block, and the matching cost of one candidate forward reference block and one candidate backward reference block satisfying the MVD mirror constraint, thereby obtaining the position of the first round target forward reference block and the position of the first round target backward reference block of the first search, and thereby obtaining updated motion information, including: a forward motion vector indicating that the current image block points to the position of the first round target forward reference block and a backward motion vector indicating that the current image block points to the position of the first round target backward reference block.
- other information such as the reference frame index in the updated motion information is the same as in the initial motion information.
- a second search is then performed: the position of the first round target forward reference block is used as the search base point to determine the positions of (N-1) candidate forward reference blocks in the forward reference image (also referred to as the forward reference region), and the position of the first round target backward reference block is used as the search base point to determine the positions of (N-1) candidate backward reference blocks in the backward reference picture (also referred to as the backward reference region). The matching cost of the corresponding two reference blocks is calculated for one or more pairs of reference block positions among the positions of the N pairs of reference blocks, for example, the matching cost of the first round target forward reference block and the first round target backward reference block, and the matching cost of a candidate forward reference block and a candidate backward reference block satisfying the MVD mirror constraint, thereby obtaining the positions of the second round target forward reference block and the second round target backward reference block of the second search, which in turn yields updated motion information, comprising: a forward motion vector indicating that the current image block points to the position of the second round target forward reference block and a backward motion vector indicating that the current image block points to the position of the second round target backward reference block.
- other information such as the reference frame index in the updated motion information is the same as in the initial motion information.
- the second round target forward reference block and the second round target backward reference block are the final target forward reference block and target backward reference block, also known as the optimal forward reference block and the optimal backward reference block.
- steps 1506 to 1507: perform the motion compensation process using the optimal forward motion vector obtained in step 1504 to obtain the pixel value of the optimal forward reference block, and perform the motion compensation process using the optimal backward motion vector obtained in step 1504 to obtain the pixel value of the optimal backward reference block.
- a search (or referred to as a motion search) may be performed in a full pixel step size when searching in the forward reference image or the backward reference image to obtain the position of at least one forward reference block and the position of at least one backward reference block.
- the search starting point can be either full pixels or sub-pixels, for example, integer pixels, 1/2 pixels, 1/4 pixels, 1/8 pixels, and 1/16 pixels, and the like.
- the search may also be performed directly in a sub-pixel step size, or both a full pixel step search and a sub-pixel step search may be performed. This application does not limit the search method.
- in step 1504, for each pair of reference blocks, when calculating the difference between the pixel value of the forward reference block and the pixel value of the corresponding backward reference block, SAD, SATD, absolute square difference, or the like may be used to measure the difference between the pixel value of each forward reference block and the pixel value of the corresponding backward reference block.
- the application is not limited thereto.
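As one example of the matching-cost measures named above, a SAD (sum of absolute differences) computation can be sketched as follows; the function name and block representation are illustrative assumptions:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized pixel
    blocks, each given as a list of rows of pixel values. This is one
    of the matching-cost measures mentioned in the text (others, such
    as SATD, transform the residual before summing)."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

cost = sad([[1, 2], [3, 4]], [[2, 2], [1, 4]])   # |1-2|+|2-2|+|3-1|+|4-4| = 3
```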
- the pixel value of the optimal forward reference block obtained in step 1506 and the pixel value of the optimal backward reference block obtained in step 1507 may be weighted, and the pixel value obtained by the weighting process is used as a predicted value of the pixel value of the current image block.
- the predicted value of the pixel value of the current image block can be obtained according to formula (8).
- predSamples'[x][y] = (predSamplesL0'[x][y] + predSamplesL1'[x][y] + 1) >> 1 (8)
- predSamplesL0'[x][y] is the pixel value of the optimal forward reference block at the pixel point (x, y), predSamplesL1'[x][y] is the pixel value of the optimal backward reference block at the pixel point (x, y), and predSamples'[x][y] is the pixel prediction value of the current image block at the pixel point (x, y).
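Formula (8) amounts to a rounded average of the two reference blocks; a minimal sketch (function name hypothetical):

```python
def bi_predict(pred_l0, pred_l1):
    """Formula (8): combine the optimal forward reference block
    (predSamplesL0') and the optimal backward reference block
    (predSamplesL1') per pixel via (a + b + 1) >> 1, i.e. an average
    with rounding. Blocks are lists of rows of pixel values."""
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(pred_l0, pred_l1)]

pred = bi_predict([[10, 7]], [[13, 7]])   # [[12, 7]]
```

The `+ 1` before the right shift rounds the average to the nearest integer instead of truncating.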
- the pixel value of the current optimal forward reference block and the pixel value of the optimal backward reference block may also be retained and updated.
- the predicted values of the pixel values of the current image block are directly calculated using the pixel values of the current optimal forward reference block and the optimal backward reference block.
- steps 1506 and 1507 are optional steps.
- Costi is the matching cost obtained in the i-th search.
- MinCost represents the current minimum matching cost.
- Bfi and Bbi are the pixel values of the forward reference block and the pixel values of the backward reference block obtained in the i-th search, respectively.
- BestBf and BestBb are the pixel values of the current optimal forward reference block and the current optimal backward reference block, respectively.
- CalCost(M,N) represents the matching cost of block M and block N.
- BestBf and BestBb are used to obtain the predicted value of the pixel value of the current block.
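The bookkeeping described by Costi, MinCost, BestBf, and BestBb can be sketched as a minimum-cost tracking loop; the names and the pluggable cost function mirror the text's CalCost(M, N), but the structure is an illustrative assumption:

```python
def refine(candidate_pairs, cal_cost):
    """Track the minimum matching cost over candidate reference-block
    pairs. `candidate_pairs` yields (Bfi, Bbi) pixel blocks for each
    search i; `cal_cost(M, N)` returns the matching cost of blocks M
    and N. Returns the retained (BestBf, BestBb, MinCost)."""
    min_cost = float("inf")
    best_bf = best_bb = None
    for bf, bb in candidate_pairs:
        cost = cal_cost(bf, bb)        # Costi = CalCost(Bfi, Bbi)
        if cost < min_cost:            # update only on improvement
            min_cost, best_bf, best_bb = cost, bf, bb
    return best_bf, best_bb, min_cost

# toy example: "blocks" are scalars, cost is their absolute difference
bf, bb, c = refine([(1, 5), (2, 3), (4, 4)], lambda m, n: abs(m - n))
```

Retaining BestBf and BestBb during the loop is what allows the predicted value to be computed directly afterward, as the surrounding text notes.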
- the search process is performed once.
- the flow of the image prediction method 1600 of the embodiment of the present application will be described in detail below with reference to FIG. 16.
- the method shown in FIG. 16 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 16 may occur in the encoding process, or may occur in the decoding process. Specifically, the method shown in FIG. 16 may occur in an encoding process or an interframe prediction process at the time of decoding.
- the method 1600 shown in FIG. 16 includes steps 1601 through 1604, wherein steps 1601, 1602, and 1604 refer to the description of steps 1401, 1402, and 1404 in FIG. 14, and are not described herein again.
- step 1603: based on the matching cost criterion, determining the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, wherein the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on a time domain distance; the first position offset represents the positional offset of the position of the forward reference block relative to the position of the initial forward reference block; the second position offset represents the positional offset of the position of the backward reference block relative to the position of the initial backward reference block; M is an integer greater than or equal to 1, and M is less than or equal to N;
- the position of the candidate forward reference block 1304 in the forward reference image Ref0 is shifted by MVD0 (delta0x, delta0y) with respect to the position of the initial forward reference block 1302 (ie, the forward search base point).
- the position of the candidate backward reference block 1305 in the backward reference picture Ref1 is shifted by MVD1 (delta1x, delta1y) with respect to the position of the initial backward reference block 1303 (ie, the backward search base point).
- TC, T0, and T1 represent the time of the current frame, the time of the forward reference picture, and the time of the backward reference picture, respectively.
- TD0 and TD1 represent the time intervals between TC and T0 and between TC and T1, respectively.
- TD0 and TD1 can be calculated using picture order count (POC).
- POCc, POC0, POC1 represent the POC of the current image, the POC of the forward reference image, and the POC of the backward reference image, respectively.
- TD0 represents a picture order count (POC) distance between the current picture and the forward reference picture;
- TD1 represents a POC distance between the current picture and the backward reference picture.
- delta0x = (TD0/TD1) * delta1x
- delta0x/delta1x = TD0/TD1
- delta0y/delta1y = TD0/TD1
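The POC-distance-based scaling above can be sketched as follows; the function name is hypothetical, and computing TD0 and TD1 as absolute POC differences is an assumption consistent with their description as POC distances:

```python
def scale_mvd(delta1, poc_c, poc0, poc1):
    """Scale the backward offset MVD1 = (delta1x, delta1y) to the
    forward offset MVD0 by the POC-distance ratio:
    delta0 = (TD0 / TD1) * delta1."""
    td0 = abs(poc_c - poc0)   # POC distance: current picture to forward reference
    td1 = abs(poc1 - poc_c)   # POC distance: current picture to backward reference
    d1x, d1y = delta1
    return (td0 * d1x / td1, td0 * d1y / td1)

# current picture POC 4, forward reference POC 0, backward reference POC 6
mvd0 = scale_mvd((2, -4), 4, 0, 6)   # TD0/TD1 = 4/2 = 2
```

When TD0 equals TD1, the scaling factor is 1 and the offsets have equal magnitude, which matches the mirror case discussed earlier.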
- step 1603 can include:
- from the positions of the M pairs of reference blocks, the position of the pair of reference blocks with the smallest matching error is determined as the position of the target forward reference block of the current image block and the position of the target backward reference block; or, from the positions of the M pairs of reference blocks, the position of a pair of reference blocks whose matching error is less than or equal to a matching error threshold is determined as the position of the target forward reference block of the current image block and the position of the target backward reference block.
- the positions of the N forward reference blocks in the forward reference image and the positions of the N backward reference blocks in the backward reference image form the positions of N pairs of reference blocks; for the N pairs, the first position offset, of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset, of the position of the backward reference block relative to the position of the initial backward reference block, have a proportional relationship based on the time domain distance. On this basis, the position of a pair of reference blocks determined from the positions of the N pairs of reference blocks (for example, with the smallest matching cost) is the position of the target forward reference block of the current image block and the position of the target backward reference block, so that a predicted value of the pixel value of the current image block is obtained based on the pixel values of these reference blocks.
- the method in the embodiment of the present application avoids the process of pre-calculating a template matching block and avoids the process of using the template matching block to perform forward search matching and backward search matching separately, thereby simplifying the image prediction process and reducing the complexity of image prediction while improving its accuracy.
- increasing the number of iterations can further improve the accuracy of the modified MV, thereby further improving the codec performance.
- the flow of the image prediction method in the embodiment of the present application will be described in detail below with reference to FIG. 17.
- the method shown in FIG. 17 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 17 may occur in the encoding process, or may occur in the decoding process. Specifically, the method shown in FIG. 17 may occur in an encoding process or an interframe prediction process at the time of decoding.
- the method shown in FIG. 17 includes steps 1701 to 1708, wherein steps 1701 to 1703, 1705 to 1708 refer to the description of steps 1501 to 1503, 1505 to 1508 in FIG. 15, and details are not described herein again.
- the time-domain-distance-based constraint on the MVD can be interpreted as follows: the positional offset MVD0 (delta0x, delta0y) of the block position in the forward reference image relative to the forward search base point and the positional offset MVD1 (delta1x, delta1y) of the block position in the backward reference image relative to the backward search base point satisfy the following relationship:
- TC, T0, and T1 represent the time of the current image, the time of the forward reference image, and the time of the backward reference image, respectively.
- TD0 and TD1 represent the time intervals between TC and T0 and between TC and T1, respectively.
- TD0 and TD1 can be calculated using picture order count (POC).
- POCc, POC0, POC1 represent the POC of the current image, the POC of the forward reference image, and the POC of the backward reference image, respectively.
- TD0 represents a picture order count (POC) distance between the current picture and the forward reference picture;
- TD1 represents a POC distance between the current picture and the backward reference picture.
- delta0x = (TD0/TD1) * delta1x
- delta0x/delta1x = TD0/TD1
- delta0y/delta1y = TD0/TD1.
- the specific search step is similar to the process of the embodiment of FIG. 10 or 11, and will not be described again here.
- the mirror relationship either considers the time domain interval or does not consider the time domain interval.
- whether the mirror relationship considers the time domain interval can be adaptively selected for the current frame or the current block when performing motion vector correction.
- indication information may be added in sequence level header information (SPS), picture level header information (PPS), a slice header, or block code stream information to indicate whether the mirror relationship used by the current sequence, the current picture, the current slice, or the current block considers the time interval.
- the current block adaptively determines whether the mirror relationship used by the current block takes into account the time interval according to the POC of the forward reference image and the POC of the backward reference image.
- Max(A, B) represents the larger value in A and B
- Min(A, B) represents the smaller value in A and B.
- if so, the mirror relationship to be used needs to consider the time interval; otherwise the time interval is not considered, where R is a preset threshold.
- the specific value of R is not limited herein.
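A hedged sketch of this adaptive choice follows. The exact comparison is not fixed by the text; comparing the ratio of the larger to the smaller POC distance against R is an assumption, and the function name is hypothetical:

```python
def use_time_interval(poc_c, poc0, poc1, r):
    """Adaptively decide, from the POCs of the forward and backward
    reference images, whether the mirror relationship should consider
    the time interval. The decision rule shown, comparing
    Max(TD0, TD1) / Min(TD0, TD1) against the preset threshold R, is
    an illustrative assumption."""
    td0 = abs(poc_c - poc0)   # POC distance to the forward reference
    td1 = abs(poc1 - poc_c)   # POC distance to the backward reference
    return max(td0, td1) / min(td0, td1) > r

# current POC 4, forward reference POC 0, backward reference POC 6, R = 1.5
decision = use_time_interval(4, 0, 6, 1.5)
```

Under this rule, strongly asymmetric reference distances trigger the time-interval-aware relationship, while nearly symmetric ones fall back to the plain mirror relationship.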
- the image prediction method of the embodiments of the present application may be specifically performed by an encoder (eg, encoder 20) or a motion compensation module in a decoder (eg, decoder 30). Additionally, the image prediction method of embodiments of the present application can be implemented in any electronic device or device that requires encoding and/or decoding of a video image.
- FIG. 18 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 1800 is applicable to both inter prediction of decoded video images and inter prediction of encoded video images. It should be understood that the prediction apparatus 1800 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction apparatus 1800 may include:
- the first obtaining unit 1801 is configured to acquire initial motion information of the current image block.
- a first searching unit 1802, configured to determine the positions of N forward reference blocks and the positions of N backward reference blocks based on the initial motion information and the position of the current image block, where the N forward reference blocks are located in a forward reference image, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; and, based on the matching cost criterion, determine the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block of the current image block and the position of the target backward reference block, wherein the first position offset is in a mirror relationship with the second position offset; the first position offset represents the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset represents the positional offset of the position of the backward reference block relative to the position of the initial backward reference block.
- the first prediction unit 1803 is configured to obtain a predicted value of a pixel value of the current image block based on a pixel value of the target forward reference block and a pixel value of the target backward reference block.
- the first position offset is in a mirror relationship with the second position offset. This can be understood as: the magnitude of the first position offset is the same as the magnitude of the second position offset, while the direction of the first position offset is opposite to the direction of the second position offset.
- the first prediction unit 1803 is further configured to obtain updated motion information of the current image block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, wherein the updated forward motion vector points to the location of the target forward reference block, and the updated backward motion vector points to the location of the target backward reference block.
- the motion vector of the image block is updated, so that other image blocks can be effectively predicted according to the image block when the next image prediction is performed.
- the positions of the N forward reference blocks include the position of an initial forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or
- the positions of the N backward reference blocks include the position of an initial backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- the initial motion information includes a first motion vector and a first reference image index in a forward prediction direction, and a second motion vector and a second reference image index in a backward prediction direction;
- the first searching unit is specifically configured to:
- the position of a pair of reference blocks is determined from the positions of the M pairs of reference blocks, based on the matching cost criterion, as the position of the target forward reference block of the current image block and the position of the target backward reference block.
- the first search unit 1802 is specifically configured to:
- the position of a pair of reference blocks whose matching error is less than or equal to the matching error threshold is determined as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
- the foregoing apparatus 1800 may perform the foregoing methods shown in FIG. 3, FIG. 10, and FIG. 11, and the apparatus 1800 may specifically be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
- the apparatus 1800 can be used for both image prediction during encoding and image prediction during decoding.
- the positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks located in the backward reference image form the positions of N pairs of reference blocks. For each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block is in a mirror relationship with the second position offset of the position of the backward reference block relative to the position of the initial backward reference block. On this basis, the position of the pair of reference blocks determined from the positions of the N pairs of reference blocks (for example, the pair with the least matching cost) is taken as the position of the target forward reference block and the position of the target backward reference block of the current image block, so that a predicted value of the pixel values of the current image block is obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
- the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using the template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process and thus reduces the complexity of image prediction while improving its accuracy.
- FIG. 19 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 1900 is applicable to both inter prediction of decoded video images and inter prediction of encoded video images. It should be understood that the prediction apparatus 1900 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction apparatus 1900 may include:
- a second acquiring unit 1901 configured to acquire initial motion information of a current image block
- a second searching unit 1902, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current image block, where the N forward reference blocks are located in a forward reference image, the N backward reference blocks are located in a backward reference image, and N is an integer greater than 1; and determine, based on the matching cost criterion, the position of a pair of reference blocks from the positions of M pairs of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block, where each pair of reference blocks includes one forward reference block and one backward reference block, and for each pair, the first position offset and the second position offset have a proportional relationship based on a time domain distance; the first position offset represents the positional offset of the position of the forward reference block relative to the position of the initial forward reference block, and the second position offset represents the positional offset of the position of the backward reference block relative to the position of the initial backward reference block.
- the second prediction unit 1903 is configured to obtain a predicted value of a pixel value of the current image block based on a pixel value of the target forward reference block and a pixel value of the target backward reference block.
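The final prediction step above can be sketched in code. This is an illustrative sketch, not the normative derivation: a simple rounded average of the two target reference blocks is shown, while the text leaves the exact combination open (weighted variants are equally possible), and the function name and block shapes are hypothetical.

```python
import numpy as np

# Obtain the predicted pixel values of the current image block from the
# pixel values of the target forward and target backward reference blocks.
# A rounded average is one possible combination (an assumption here).
def bi_predict(target_fwd_block, target_bwd_block):
    # Widen to avoid uint8 overflow, add 1 for rounding, then halve.
    return (target_fwd_block.astype(np.int32)
            + target_bwd_block.astype(np.int32) + 1) // 2
```

For example, averaging a block of 10s with a block of 20s yields a block of 15s.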
- the first position offset and the second position offset have a proportional relationship based on the time domain distance, which can be understood as:
- the proportional relationship between the first position offset and the second position offset is determined based on the proportional relationship between the first time domain distance and the second time domain distance, where the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
- the first position offset and the second position offset have a proportional relationship based on a time domain distance, and may include:
- the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
- the direction of the first position offset is opposite to the direction of the second position offset, and the ratio of the magnitude of the first position offset to the magnitude of the second position offset is based on the proportional relationship between the first time domain distance and the second time domain distance;
- the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
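The time-domain proportional relationship just described can be sketched as follows. This is a hedged illustration: the picture-order-count (POC) style inputs and the function name are assumptions for demonstration, not notation from the text.

```python
# Derive the second (backward) position offset from the first (forward)
# offset: opposite direction, magnitude scaled by the ratio of the second
# time domain distance to the first.
def backward_offset(forward_offset, poc_cur, poc_fwd_ref, poc_bwd_ref):
    td0 = poc_cur - poc_fwd_ref      # first time domain distance
    td1 = poc_bwd_ref - poc_cur      # second time domain distance
    dx, dy = forward_offset
    return (-dx * td1 / td0, -dy * td1 / td0)
```

When the two time domain distances are equal, the relationship reduces to the mirror case: same magnitude, opposite direction.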
- the second prediction unit 1903 is further configured to obtain updated motion information of the current image block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
- the embodiment of the present application can obtain corrected motion information of the current image block and improve the accuracy of the motion information of the current image block, which is also beneficial to the prediction of other image blocks, for example by improving the accuracy of motion information prediction for other image blocks.
- the positions of the N forward reference blocks include a position of an initial forward reference block and positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or
- the positions of the N backward reference blocks include a position of an initial backward reference block and positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
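The candidate positions described above can be sketched as follows. This is an assumption-laden illustration: the 8-neighbour search pattern and the half-pel step are illustrative choices, and the text only requires integer or fractional pixel offsets around the initial reference block position.

```python
# Form the N search positions: the initial reference block position plus
# (N-1) candidates offset by an integer or fractional pixel distance.
def candidate_positions(initial_pos, step=1.0):
    x0, y0 = initial_pos
    offsets = [(0, 0)] + [(dx, dy)
                          for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                          if (dx, dy) != (0, 0)]
    return [(x0 + dx * step, y0 + dy * step) for dx, dy in offsets]

# With step=0.5 (a fractional pixel distance), this yields N = 9 positions.
positions = candidate_positions((16, 16), step=0.5)
```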
- the initial motion information includes forward predicted motion information and backward predicted motion information
- the second search unit 1902 is specifically configured to:
- determine the positions of the candidate backward reference blocks, where the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- the initial motion information includes a first motion vector and a first reference image index of a forward prediction direction, and a second motion vector and a second reference image index of a backward prediction direction;
- the second search unit is specifically configured to:
- in the aspect of determining, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the target forward reference block and the position of the target backward reference block of the current image block,
- the second searching unit 1902 is specifically configured to:
- determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the target forward reference block of the current image block and the position of the target backward reference block of the current image block, where M is less than or equal to N.
- the matching cost criterion is a criterion that minimizes the matching cost. For example, for the positions of the M pairs of reference blocks, the difference between the pixel values of the forward reference block and the pixel values of the backward reference block in each pair is calculated; from the positions of the M pairs of reference blocks, the position of the pair with the smallest pixel value difference is determined as the position of the forward target reference block and the position of the backward target reference block of the current image block.
- the matching cost criterion is a matching cost and early termination criterion. For example, for the positions of the n-th pair of reference blocks (one forward reference block and one backward reference block), the difference between the pixel values of the forward reference block and the pixel values of the backward reference block is calculated, where n is an integer greater than or equal to 1 and less than or equal to N; when the pixel value difference is less than or equal to the matching error threshold, the positions of the n-th pair of reference blocks are determined as the position of the forward target reference block and the position of the backward target reference block of the current image block.
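Both variants of the matching cost criterion can be sketched together. This is an illustrative sketch under assumptions: the sum of absolute differences (SAD) stands in for the "pixel value difference", and the function name and block contents are hypothetical.

```python
import numpy as np

# Choose the best pair of reference blocks: with no threshold, the pair
# with the minimum matching cost; with a threshold, the first pair whose
# cost is at or below it (early termination).
def select_best_pair(fwd_blocks, bwd_blocks, threshold=None):
    best_idx, best_cost = 0, float("inf")
    for n, (f, b) in enumerate(zip(fwd_blocks, bwd_blocks)):
        cost = int(np.abs(f.astype(int) - b.astype(int)).sum())  # SAD
        if threshold is not None and cost <= threshold:
            return n, cost            # early termination criterion
        if cost < best_cost:
            best_idx, best_cost = n, cost
    return best_idx, best_cost        # minimum matching cost criterion
```

The early-termination path trades a possibly sub-optimal pair for fewer cost evaluations, which is the complexity saving the criterion targets.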
- the second acquiring unit 1901 is configured to acquire the initial motion information from a candidate motion information list of the current image block, or acquire the initial motion information according to indication information, where the indication information is used to indicate the initial motion information of the current image block. It should be understood that the initial motion information is relative to the corrected motion information.
- the above apparatus 1900 can perform the above-described method shown in FIG. 12, and the apparatus 1900 can be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
- the apparatus 1900 can be used for both image prediction during encoding and image prediction during decoding.
- the positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks located in the backward reference image form the positions of N pairs of reference blocks. For each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block and the second position offset of the position of the backward reference block relative to the position of the initial backward reference block have a proportional relationship based on the time domain distance. On this basis, the position of the pair of reference blocks determined from the positions of the N pairs of reference blocks (for example, the pair with the least matching cost) is taken as the position of the target forward reference block (i.e., the best forward reference block/forward prediction block) and the position of the target backward reference block (i.e., the best backward reference block/backward prediction block) of the current image block, so that a predicted value of the pixel values of the current image block is obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
- the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using the template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process and thus reduces the complexity of image prediction while improving its accuracy.
- FIG. 20 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 2000 is applicable to both inter prediction of decoded video images and inter prediction of encoded video images. It should be understood that the prediction apparatus 2000 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction apparatus 2000 may include:
- the third acquiring unit 2001 is configured to acquire the i-th round motion information of the current image block.
- a third search unit 2002, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks according to the i-th round motion information and the position of the current image block, where the N forward reference blocks are located in the forward reference image, the N backward reference blocks are located in the backward reference image, and N is an integer greater than 1; and determine, based on the matching cost criterion, the position of a pair of reference blocks from the positions of M pairs of reference blocks as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, the first position offset is in a mirror relationship with the second position offset, the first position offset represents the positional offset of the position of the forward reference block relative to the position of the (i-1)-th round target forward reference block, and the second position offset represents the positional offset of the position of the backward reference block relative to the position of the (i-1)-th round target backward reference block.
- a third prediction unit 2003 configured to obtain, according to a pixel value of the j-th target forward reference block and a pixel value of the j-th target backward reference block, a predicted value of a pixel value of the current image block, Where j is greater than or equal to i, and i and j are integers greater than or equal to one.
- the positions of the N forward reference blocks include a position of an initial forward reference block and positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or
- the positions of the N backward reference blocks include a position of an initial backward reference block and positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block; accordingly, the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of (N-1) candidate forward reference blocks, and the positional offset of the position of each candidate forward reference block relative to the position of the (i-1)-th round target forward reference block is an integer pixel distance or a fractional pixel distance; or the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of (N-1) candidate backward reference blocks, and the positional offset of the position of each candidate backward reference block relative to the position of the (i-1)-th round target backward reference block is an integer pixel distance or a fractional pixel distance.
- the third prediction unit 2003 is specifically configured to: when an iterative termination condition is satisfied, obtain a predicted value of the pixel values of the current image block according to the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are integers greater than or equal to 1.
- j is greater than or equal to i, and i and j are integers greater than or equal to 1.
- the first position offset is in a mirror relationship with the second position offset; for example, the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset.
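The mirror relationship above can be sketched directly. This is a hedged illustration with hypothetical names and coordinates: each forward candidate offset determines its backward partner (opposite direction, same magnitude), so N offsets yield N pairs rather than N×N combinations, which is central to the complexity saving.

```python
# Pair up search positions under the mirror relationship: the second
# (backward) offset is the negation of the first (forward) offset.
def paired_positions(fwd_init, bwd_init, offsets):
    pairs = []
    for dx, dy in offsets:
        fwd = (fwd_init[0] + dx, fwd_init[1] + dy)   # first position offset
        bwd = (bwd_init[0] - dx, bwd_init[1] - dy)   # mirrored second offset
        pairs.append((fwd, bwd))
    return pairs
```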
- the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index;
- the third search unit 2002 is specifically configured to determine the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the i-th round motion information and the position of the current image block, where:
- the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks; and
- the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of the (N-1) candidate backward reference blocks.
- in the aspect of determining, based on the matching cost criterion, the position of a pair of reference blocks from the positions of the M pairs of reference blocks as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current image block,
- the third search unit 2002 is specifically configured to:
- determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
- the apparatus 2000 may perform the foregoing methods shown in FIG. 14 and FIG. 15.
- the apparatus 2000 may specifically be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
- the device 2000 can be used for both image prediction during encoding and image prediction during decoding.
- the positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks located in the backward reference image form the positions of N pairs of reference blocks. For each of the N pairs, the first position offset of the position of the forward reference block relative to the position of the initial forward reference block is in a mirror relationship with the second position offset of the position of the backward reference block relative to the position of the initial backward reference block. On this basis, the position of the pair of reference blocks determined from the positions of the N pairs of reference blocks (for example, the pair with the least matching cost) is taken as the position of the target forward reference block and the position of the target backward reference block of the current image block, so that a predicted value of the pixel values of the current image block is obtained based on the pixel values of the target forward reference block and the pixel values of the target backward reference block.
- FIG. 21 is a schematic block diagram of another image prediction apparatus according to an embodiment of the present application. It should be noted that the prediction apparatus 2100 is applicable to both inter prediction of decoded video images and inter prediction of encoded video images. It should be understood that the prediction apparatus 2100 herein may correspond to the motion compensation unit 44 in FIG. 2A, or may correspond to the motion compensation unit 82 in FIG. 2B. The prediction device 2100 may include:
- the fourth obtaining unit 2101 is configured to acquire the i-th round motion information of the current image block.
- a fourth searching unit 2102, configured to: determine, according to the i-th round motion information and the position of the current image block, positions of N forward reference blocks and positions of N backward reference blocks, where the N forward reference blocks are located in the forward reference image, the N backward reference blocks are located in the backward reference image, and N is an integer greater than 1; and determine, based on the matching cost criterion, the position of a pair of reference blocks from the positions of M pairs of reference blocks as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current image block, where the position of each pair of reference blocks includes the position of one forward reference block and the position of one backward reference block, and for the position of each pair of reference blocks, the first position offset and the second position offset have a proportional relationship based on a time domain distance, the first position offset represents the positional offset of the position of the forward reference block in the forward reference image relative to the position of the (i-1)-th round target forward reference block, and the second position offset represents the positional offset of the position of the backward reference block in the backward reference image relative to the position of the (i-1)-th round target backward reference block.
- a fourth prediction unit 2103 configured to obtain, according to the pixel value of the j-th target forward reference block and the pixel value of the j-th target backward reference block, a predicted value of a pixel value of the current image block, Where j is greater than or equal to i, and i and j are integers greater than or equal to one.
- the i-th round motion information is the initial motion information of the current image block
- the i-th round motion information includes a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
- the fourth prediction unit 2103 is specifically configured to: when an iterative termination condition is met, obtain a predicted value of the pixel values of the current image block according to the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are integers greater than or equal to 1.
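The round-by-round refinement with an iterative termination condition can be sketched as follows. This is a hedged sketch under assumptions: stopping when no candidate improves on the centre is one possible termination condition (the text leaves the exact condition open), the 4-neighbour integer-pel pattern is illustrative, and `cost(fwd, bwd)` is a hypothetical helper returning the matching cost of a pair of positions.

```python
# Each round searches around the previous round's target positions,
# pairing candidates via the mirror relationship, and stops when the
# centre is already the best pair or the round budget is exhausted.
def iterative_refine(fwd_pos, bwd_pos, cost, max_rounds=2):
    for _ in range(max_rounds):
        best = (fwd_pos, bwd_pos)
        best_cost = cost(fwd_pos, bwd_pos)
        for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            f = (fwd_pos[0] + dx, fwd_pos[1] + dy)
            b = (bwd_pos[0] - dx, bwd_pos[1] - dy)   # mirror partner
            c = cost(f, b)
            if c < best_cost:
                best, best_cost = (f, b), c
        if best == (fwd_pos, bwd_pos):
            break                                    # iterative termination
        fwd_pos, bwd_pos = best
    return fwd_pos, bwd_pos
```

Increasing `max_rounds` corresponds to the observation below that more iterations can further improve the corrected MV at the price of more cost evaluations.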
- the first position offset and the second position offset have a proportional relationship based on the time domain distance, which can be understood as:
- the direction of the first position offset is opposite to the direction of the second position offset, and the magnitude of the first position offset is the same as the magnitude of the second position offset; or,
- the direction of the first position offset is opposite to the direction of the second position offset, and the ratio of the magnitude of the first position offset to the magnitude of the second position offset is based on the proportional relationship between the first time domain distance and the second time domain distance;
- the first time domain distance represents the time domain distance between the current image to which the current image block belongs and the forward reference image, and the second time domain distance represents the time domain distance between the current image and the backward reference image.
- the i-th round motion information includes a forward motion vector and a forward reference image index, and a backward motion vector and a backward reference image index; correspondingly, in the aspect of determining the positions of the N forward reference blocks and the positions of the N backward reference blocks according to the i-th round motion information and the position of the current image block,
- the fourth search unit 2102 is specifically configured to:
- the positions of the N forward reference blocks include the position of the (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks; and
- the positions of the N backward reference blocks include the position of the (i-1)-th round target backward reference block and the positions of the (N-1) candidate backward reference blocks.
- the fourth search unit 2102 is specifically configured to:
- determining the position of the pair of reference blocks whose matching error is less than or equal to the matching error threshold as the position of the i-th round target forward reference block of the current image block and the position of the i-th round target backward reference block of the current image block, where M is less than or equal to N.
- the foregoing apparatus 2100 may perform the above-described method shown in FIG. 16 or 17, and the apparatus 2100 may be a video encoding apparatus, a video decoding apparatus, a video codec system, or other devices having a video codec function.
- the device 2100 can be used for both image prediction during encoding and image prediction during decoding.
- the positions of the N forward reference blocks located in the forward reference image and the positions of the N backward reference blocks in the backward reference image form a position of the N pairs of reference blocks, N pairs of reference block positions in the position of the reference block, the position of the forward reference block is offset from the first position of the position of the initial forward reference block, and the position of the backward reference block is relative to the initial backward reference
- the second positional offset of the position of the block has a proportional relationship based on the time domain distance, on the basis of which the position of the pair of reference blocks determined from the position of the N pair of reference blocks (eg, the least matching cost) is the current image block.
- the pixel value of the target forward reference block and the pixel value of the target backward reference block obtain a predicted value of the pixel value of the current image block.
- the method in the embodiment of the present application avoids pre-calculating a template matching block and avoids using the template matching block to perform forward search matching and backward search matching separately, which simplifies the image prediction process and thus reduces the complexity of image prediction while improving its accuracy.
- increasing the number of iterations can further improve the accuracy of the modified MV, thereby further improving the codec performance.
- FIG. 22 is a schematic block diagram of an implementation manner of a video encoding device or a video decoding device (abbreviated as decoding device 2200) according to an embodiment of the present disclosure.
- the decoding device 2200 can include a processor 2210, a memory 2230, and a bus system 2250.
- the processor and the memory are connected by the bus system; the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory.
- the memory of the encoding device stores program code, and the processor can invoke the program code stored in the memory to perform the various video encoding or decoding methods described herein, in particular the video encoding or decoding methods in various inter prediction modes or intra prediction modes, and the methods of predicting motion information in various inter or intra prediction modes. To avoid repetition, details are not described here again.
- the processor 2210 may be a central processing unit (CPU), or the processor 2210 may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
- the memory 2230 can include a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device can also be used as the memory 2230.
- Memory 2230 can include code and data 2231 that is accessed by processor 2210 using bus 2250.
- the memory 2230 can further include an operating system 2233 and an application 2235 that includes at least one program that allows the processor 2210 to perform the video encoding or decoding methods described herein, particularly the image prediction methods described herein.
- application 2235 can include applications 1 through N, which further include a video encoding or decoding application (referred to as a video coding application) that performs the video encoding or decoding methods described herein.
- the bus system 2250 may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus. However, for clarity of description, various buses are labeled as bus system 2250 in the figure.
- decoding device 2200 may also include one or more output devices, such as display 2270.
- display 2270 can be a tactile display or a touch display that combines the display with a tactile unit that operatively senses a touch input.
- Display 2270 can be coupled to processor 2210 via bus 2250.
- the computer readable medium can comprise a computer readable storage medium corresponding to a tangible medium, such as a data storage medium, or any communication medium that facilitates transfer of a computer program from one location to another (e.g., according to a communication protocol).
- a computer readable medium may generally correspond to (1) a non-transitory tangible computer readable storage medium, or (2) a communication medium, such as a signal or carrier.
- Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in this application.
- the computer program product can comprise a computer readable medium.
- such computer readable storage media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
- any connection is properly termed a computer-readable medium.
- if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
- the computer readable storage medium and data storage medium do not include connections, carrier waves, signals, or other temporary media, but rather are directed to non-transitory tangible storage media.
- magnetic disks and optical discs include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer readable media.
- the instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
- the term "processor," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
- the functions described in the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
- the techniques may be fully implemented in one or more circuits or logic elements.
- various illustrative logical blocks, units, modules in video encoder 20 and video decoder 30 may be understood as corresponding circuit devices or logic elements.
- the techniques of the present application can be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
- IC integrated circuit
- a group of ICs eg, a chipset
- Various components, modules, or units are described herein to emphasize functional aspects of the apparatus for performing the disclosed techniques, but they do not necessarily need to be implemented by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit, in conjunction with suitable software and/or firmware, or provided by a collection of interoperating hardware units, including one or more processors as described above.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Color Television Systems (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (48)
- A picture prediction method, comprising: obtaining initial motion information of a current picture block; determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; determining, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of a target forward reference block and a position of a target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset are in a mirror relationship, the first position offset represents an offset of the position of the forward reference block relative to a position of an initial forward reference block, the second position offset represents an offset of the position of the backward reference block relative to a position of an initial backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and obtaining a predictor of pixel values of the current picture block based on pixel values of the target forward reference block and pixel values of the target backward reference block.
- The method according to claim 1, wherein the first position offset and the second position offset being in a mirror relationship comprises: a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset.
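The mirror relationship between the two position offsets (opposite direction, equal magnitude) can be illustrated with a small sketch. The coordinates and helper names here are hypothetical, for illustration only:

```python
def mirrored_offset(first_offset):
    """Given the first position offset (dx, dy) of a forward reference
    block relative to the initial forward reference block, return the
    second position offset of the paired backward reference block:
    opposite direction, same magnitude."""
    dx, dy = first_offset
    return (-dx, -dy)

def paired_positions(init_fwd, init_bwd, first_offset):
    """Positions of one forward/backward reference-block pair under
    the mirror constraint: the forward block moves by first_offset,
    the backward block moves by the mirrored offset."""
    dx, dy = first_offset
    mdx, mdy = mirrored_offset(first_offset)
    fwd = (init_fwd[0] + dx, init_fwd[1] + dy)
    bwd = (init_bwd[0] + mdx, init_bwd[1] + mdy)
    return fwd, bwd
```

For example, if the forward block shifts by (2, -1), the paired backward block shifts by (-2, 1).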
- The method according to claim 1 or 2, further comprising: obtaining updated motion information of the current picture block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
- The method according to any one of claims 1 to 3, wherein the positions of the N forward reference blocks include a position of one initial forward reference block and positions of (N-1) candidate forward reference blocks, and an offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or the positions of the N backward reference blocks include a position of one initial backward reference block and positions of (N-1) candidate backward reference blocks, and an offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- The method according to any one of claims 1 to 4, wherein the initial motion information includes a first motion vector and a first reference picture index in a forward prediction direction, and a second motion vector and a second reference picture index in a backward prediction direction; and the determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current picture block comprises: determining a position of an initial forward reference block of the current picture block in the forward reference picture corresponding to the first reference picture index based on the first motion vector and the position of the current picture block; determining positions of (N-1) candidate forward reference blocks in the forward reference picture by using the position of the initial forward reference block as a first search start point, where the positions of the N forward reference blocks include the position of the initial forward reference block and the positions of the (N-1) candidate forward reference blocks; determining a position of an initial backward reference block of the current picture block in the backward reference picture corresponding to the second reference picture index based on the second motion vector and the position of the current picture block; and determining positions of (N-1) candidate backward reference blocks in the backward reference picture by using the position of the initial backward reference block as a second search start point, where the positions of the N backward reference blocks include the position of the initial backward reference block and the positions of the (N-1) candidate backward reference blocks.
- 如权利要求1至5任一项所述的方法,其特征在于,所述根据匹配代价准则,从M对参考块的位置中确定一对参考块的位置为当前图像块的目标前向参考块的位置和目标后向参考块的位置,包括:从M对参考块的位置中,确定匹配误差最小的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置;或者从M对参考块的位置中,确定匹配误差小于或等于匹配误差阈值的一对参考块的位置为当前图像块的目标前向参考块的位置和当前图像块的目标后向参考块的位置,其中所述M小于或等于N。
- The method according to any one of claims 1 to 6, wherein the method is used to encode the current picture block, and the obtaining initial motion information of the current picture block comprises: obtaining the initial motion information from a candidate motion information list of the current picture block; or the method is used to decode the current picture block, and before the obtaining initial motion information of the current picture block, the method further comprises: obtaining indication information from a bitstream of the current picture block, where the indication information is used to indicate the initial motion information of the current picture block.
- A picture prediction method, comprising: obtaining initial motion information of a current picture block; determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; determining, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of a target forward reference block and a position of a target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset have a proportional relationship based on temporal distances, the first position offset represents an offset of the position of the forward reference block relative to a position of an initial forward reference block, the second position offset represents an offset of the position of the backward reference block relative to a position of an initial backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and obtaining a predictor of pixel values of the current picture block based on pixel values of the target forward reference block and pixel values of the target backward reference block.
- The method according to claim 8, wherein the first position offset and the second position offset having a proportional relationship based on temporal distances comprises: if a first temporal distance is the same as a second temporal distance, a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset; or if the first temporal distance is different from the second temporal distance, the direction of the first position offset is opposite to the direction of the second position offset, and a ratio between the magnitude of the first position offset and the magnitude of the second position offset is based on a ratio between the first temporal distance and the second temporal distance; where the first temporal distance represents a temporal distance between a current picture to which the current picture block belongs and the forward reference picture, and the second temporal distance represents a temporal distance between the current picture and the backward reference picture.
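The temporal-distance scaling above can be sketched as a minimal function. The sign convention and the use of exact rational arithmetic are assumptions for illustration:

```python
from fractions import Fraction

def scaled_backward_offset(first_offset, td0, td1):
    """Derive the second (backward) position offset from the first
    (forward) one.  td0 is the temporal distance between the current
    picture and the forward reference picture, td1 the distance
    between the current picture and the backward reference picture.
    Direction is opposite; the magnitude ratio follows td1/td0
    (computed exactly with Fraction to avoid rounding drift)."""
    dx, dy = first_offset
    r = Fraction(td1, td0)
    return (-dx * r, -dy * r)
```

When the two temporal distances are equal, the relationship reduces to the mirror case of claims 1 and 2.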
- The method according to claim 8 or 9, further comprising: obtaining updated motion information of the current picture block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
- The method according to any one of claims 8 to 10, wherein the positions of the N forward reference blocks include a position of one initial forward reference block and positions of (N-1) candidate forward reference blocks, and an offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or the positions of the N backward reference blocks include a position of one initial backward reference block and positions of (N-1) candidate backward reference blocks, and an offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- The method according to any one of claims 8 to 11, wherein the initial motion information includes a first motion vector and a first reference picture index in a forward prediction direction, and a second motion vector and a second reference picture index in a backward prediction direction; and the determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current picture block comprises: determining a position of an initial forward reference block of the current picture block in the forward reference picture corresponding to the first reference picture index based on the first motion vector and the position of the current picture block; determining positions of (N-1) candidate forward reference blocks in the forward reference picture by using the position of the initial forward reference block as a first search start point, where the positions of the N forward reference blocks include the position of the initial forward reference block and the positions of the (N-1) candidate forward reference blocks; determining a position of an initial backward reference block of the current picture block in the backward reference picture corresponding to the second reference picture index based on the second motion vector and the position of the current picture block; and determining positions of (N-1) candidate backward reference blocks in the backward reference picture by using the position of the initial backward reference block as a second search start point, where the positions of the N backward reference blocks include the position of the initial backward reference block and the positions of the (N-1) candidate backward reference blocks.
- The method according to any one of claims 8 to 12, wherein the determining, based on the matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as the position of the target forward reference block and the position of the target backward reference block of the current picture block comprises: determining, from among the M pairs of reference block positions, a pair of reference block positions with a minimum matching error as the position of the target forward reference block and the position of the target backward reference block of the current picture block; or determining, from among the M pairs of reference block positions, a pair of reference block positions whose matching error is less than or equal to a matching error threshold as the position of the target forward reference block and the position of the target backward reference block of the current picture block, where M is less than or equal to N.
- The method according to any one of claims 8 to 13, wherein the method is used to encode the current picture block, and the obtaining initial motion information of the current picture block comprises: obtaining the initial motion information from a candidate motion information list of the current picture block; or the method is used to decode the current picture block, and before the obtaining initial motion information of the current picture block, the method further comprises: obtaining indication information from a bitstream of the current picture block, where the indication information is used to indicate the initial motion information of the current picture block.
- A picture prediction method, comprising: obtaining i-th round motion information of a current picture block; determining positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; determining, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of an i-th round target forward reference block and a position of an i-th round target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset are in a mirror relationship, the first position offset represents an offset of the position of the forward reference block relative to a position of an (i-1)-th round target forward reference block, the second position offset represents an offset of the position of the backward reference block relative to a position of an (i-1)-th round target backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and obtaining a predictor of pixel values of the current picture block based on pixel values of a j-th round target forward reference block and pixel values of a j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
- The method according to claim 15, wherein if i = 1, the i-th round motion information is initial motion information of the current picture block; or if i > 1, the i-th round motion information includes: a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
- The method according to claim 15 or 16, wherein the obtaining a predictor of pixel values of the picture block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1, comprises: when an iteration termination condition is met, obtaining the predictor of pixel values of the picture block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
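The iterative scheme of claims 15 to 17 can be sketched as a loop in which each round re-centres the search on the previous round's target positions. The convergence test and round cap used here are one possible termination condition, chosen for illustration; `search_round` is a placeholder for the per-round matching search:

```python
def iterative_refine(init_fwd, init_bwd, search_round, max_rounds=3):
    """search_round(fwd, bwd) returns the round's target forward
    position, target backward position, and matching error.  Round i
    starts from round (i-1)'s targets; iterate until the positions
    stop moving or max_rounds is reached, then return the final
    (j-th round) target positions."""
    fwd, bwd = init_fwd, init_bwd
    for _ in range(max_rounds):
        new_fwd, new_bwd, _err = search_round(fwd, bwd)
        if (new_fwd, new_bwd) == (fwd, bwd):  # converged: stop iterating
            break
        fwd, bwd = new_fwd, new_bwd
    return fwd, bwd
```

The final positions then provide the pixel values from which the predictor of the current block is obtained.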
- The method according to any one of claims 15 to 17, wherein the first position offset and the second position offset being in a mirror relationship comprises: a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset.
- The method according to any one of claims 15 to 18, wherein the i-th round motion information includes a forward motion vector and a forward reference picture index, and a backward motion vector and a backward reference picture index; and the determining positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and the position of the current picture block comprises: determining a position of an (i-1)-th round target forward reference block of the current picture block in the forward reference picture corresponding to the forward reference picture index based on the forward motion vector and the position of the current picture block; determining positions of (N-1) candidate forward reference blocks in the forward reference picture by using the position of the (i-1)-th round target forward reference block as an i_f-th search start point, where the positions of the N forward reference blocks include the position of one (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks; determining a position of an (i-1)-th round target backward reference block of the current picture block in the backward reference picture corresponding to the backward reference picture index based on the backward motion vector and the position of the current picture block; and determining positions of (N-1) candidate backward reference blocks in the backward reference picture by using the position of the (i-1)-th round target backward reference block as an i_b-th search start point, where the positions of the N backward reference blocks include the position of one (i-1)-th round target backward reference block and the positions of the (N-1) candidate backward reference blocks.
- The method according to any one of claims 15 to 19, wherein the determining, based on the matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current picture block comprises: determining, from among the M pairs of reference block positions, a pair of reference block positions with a minimum matching error as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current picture block; or determining, from among the M pairs of reference block positions, a pair of reference block positions whose matching error is less than or equal to a matching error threshold as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current picture block, where M is less than or equal to N.
- A picture prediction method, comprising: obtaining i-th round motion information of a current picture block; determining positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; determining, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of an i-th round target forward reference block and a position of an i-th round target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset have a proportional relationship based on temporal distances, the first position offset represents, in the forward reference picture, an offset of the position of the forward reference block relative to a position of an (i-1)-th round target forward reference block, the second position offset represents, in the backward reference picture, an offset of the position of the backward reference block relative to a position of an (i-1)-th round target backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and obtaining a predictor of pixel values of the current picture block based on pixel values of a j-th round target forward reference block and pixel values of a j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
- The method according to claim 21, wherein if i = 1, the i-th round motion information is initial motion information of the current picture block; or if i > 1, the i-th round motion information includes: a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
- The method according to claim 21 or 22, wherein the obtaining a predictor of pixel values of the picture block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1, comprises: when an iteration termination condition is met, obtaining the predictor of pixel values of the picture block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
- The method according to any one of claims 21 to 23, wherein the first position offset and the second position offset having a proportional relationship based on temporal distances comprises: if a first temporal distance is the same as a second temporal distance, a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset; or if the first temporal distance is different from the second temporal distance, the direction of the first position offset is opposite to the direction of the second position offset, and a ratio between the magnitude of the first position offset and the magnitude of the second position offset is based on a ratio between the first temporal distance and the second temporal distance; where the first temporal distance represents a temporal distance between a current picture to which the current picture block belongs and the forward reference picture, and the second temporal distance represents a temporal distance between the current picture and the backward reference picture.
- A picture prediction apparatus, comprising: a first obtaining unit, configured to obtain initial motion information of a current picture block; a first search unit, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; and determine, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of a target forward reference block and a position of a target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset are in a mirror relationship, the first position offset represents an offset of the position of the forward reference block relative to a position of an initial forward reference block, the second position offset represents an offset of the position of the backward reference block relative to a position of an initial backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and a first prediction unit, configured to obtain a predictor of pixel values of the current picture block based on pixel values of the target forward reference block and pixel values of the target backward reference block.
- The apparatus according to claim 25, wherein the first position offset and the second position offset being in a mirror relationship comprises: a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset.
- The apparatus according to claim 25 or 26, wherein the first prediction unit is further configured to obtain updated motion information of the current picture block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
- The apparatus according to any one of claims 25 to 27, wherein the positions of the N forward reference blocks include a position of one initial forward reference block and positions of (N-1) candidate forward reference blocks, and an offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or the positions of the N backward reference blocks include a position of one initial backward reference block and positions of (N-1) candidate backward reference blocks, and an offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- The apparatus according to any one of claims 25 to 28, wherein the initial motion information includes a first motion vector and a first reference picture index in a forward prediction direction, and a second motion vector and a second reference picture index in a backward prediction direction; and in the aspect of determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current picture block, the first search unit is specifically configured to: determine a position of an initial forward reference block of the current picture block in the forward reference picture corresponding to the first reference picture index based on the first motion vector and the position of the current picture block; determine positions of (N-1) candidate forward reference blocks in the forward reference picture by using the position of the initial forward reference block as a first search start point, where the positions of the N forward reference blocks include the position of the initial forward reference block and the positions of the (N-1) candidate forward reference blocks; determine a position of an initial backward reference block of the current picture block in the backward reference picture corresponding to the second reference picture index based on the second motion vector and the position of the current picture block; and determine positions of (N-1) candidate backward reference blocks in the backward reference picture by using the position of the initial backward reference block as a second search start point, where the positions of the N backward reference blocks include the position of the initial backward reference block and the positions of the (N-1) candidate backward reference blocks.
- The apparatus according to any one of claims 25 to 29, wherein in the aspect of determining, based on the matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as the position of the target forward reference block and the position of the target backward reference block of the current picture block, the first search unit is specifically configured to: determine, from among the M pairs of reference block positions, a pair of reference block positions with a minimum matching error as the position of the target forward reference block and the position of the target backward reference block of the current picture block; or determine, from among the M pairs of reference block positions, a pair of reference block positions whose matching error is less than or equal to a matching error threshold as the position of the target forward reference block and the position of the target backward reference block of the current picture block, where M is less than or equal to N.
- A picture prediction apparatus, comprising: a second obtaining unit, configured to obtain initial motion information of a current picture block; a second search unit, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; and determine, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of a target forward reference block and a position of a target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset have a proportional relationship based on temporal distances, the first position offset represents an offset of the position of the forward reference block relative to a position of an initial forward reference block, the second position offset represents an offset of the position of the backward reference block relative to a position of an initial backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and a second prediction unit, configured to obtain a predictor of pixel values of the current picture block based on pixel values of the target forward reference block and pixel values of the target backward reference block.
- The apparatus according to claim 31, wherein the first position offset and the second position offset having a proportional relationship based on temporal distances comprises: if a first temporal distance is the same as a second temporal distance, a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset; or if the first temporal distance is different from the second temporal distance, the direction of the first position offset is opposite to the direction of the second position offset, and a ratio between the magnitude of the first position offset and the magnitude of the second position offset is based on a ratio between the first temporal distance and the second temporal distance; where the first temporal distance represents a temporal distance between a current picture to which the current picture block belongs and the forward reference picture, and the second temporal distance represents a temporal distance between the current picture and the backward reference picture.
- The apparatus according to claim 31 or 32, wherein the second prediction unit is further configured to obtain updated motion information of the current picture block, where the updated motion information includes an updated forward motion vector and an updated backward motion vector, the updated forward motion vector points to the position of the target forward reference block, and the updated backward motion vector points to the position of the target backward reference block.
- The apparatus according to any one of claims 31 to 33, wherein the positions of the N forward reference blocks include a position of one initial forward reference block and positions of (N-1) candidate forward reference blocks, and an offset of the position of each candidate forward reference block relative to the position of the initial forward reference block is an integer pixel distance or a fractional pixel distance; or the positions of the N backward reference blocks include a position of one initial backward reference block and positions of (N-1) candidate backward reference blocks, and an offset of the position of each candidate backward reference block relative to the position of the initial backward reference block is an integer pixel distance or a fractional pixel distance.
- The apparatus according to any one of claims 31 to 34, wherein the initial motion information includes a first motion vector and a first reference picture index in a forward prediction direction, and a second motion vector and a second reference picture index in a backward prediction direction; and in the aspect of determining positions of N forward reference blocks and positions of N backward reference blocks based on the initial motion information and the position of the current picture block, the second search unit is specifically configured to: determine a position of an initial forward reference block of the current picture block in the forward reference picture corresponding to the first reference picture index based on the first motion vector and the position of the current picture block; determine positions of (N-1) candidate forward reference blocks in the forward reference picture by using the position of the initial forward reference block as a first search start point, where the positions of the N forward reference blocks include the position of the initial forward reference block and the positions of the (N-1) candidate forward reference blocks; determine a position of an initial backward reference block of the current picture block in the backward reference picture corresponding to the second reference picture index based on the second motion vector and the position of the current picture block; and determine positions of (N-1) candidate backward reference blocks in the backward reference picture by using the position of the initial backward reference block as a second search start point, where the positions of the N backward reference blocks include the position of the initial backward reference block and the positions of the (N-1) candidate backward reference blocks.
- The apparatus according to any one of claims 31 to 35, wherein in the aspect of determining, based on the matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as the position of the target forward reference block and the position of the target backward reference block of the current picture block, the second search unit is specifically configured to: determine, from among the M pairs of reference block positions, a pair of reference block positions with a minimum matching error as the position of the target forward reference block and the position of the target backward reference block of the current picture block; or determine, from among the M pairs of reference block positions, a pair of reference block positions whose matching error is less than or equal to a matching error threshold as the position of the target forward reference block and the position of the target backward reference block of the current picture block, where M is less than or equal to N.
- A picture prediction apparatus, comprising: a third obtaining unit, configured to obtain i-th round motion information of a current picture block; a third search unit, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; and determine, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of a target forward reference block and a position of a target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset are in a mirror relationship, the first position offset represents an offset of the position of the forward reference block relative to a position of an (i-1)-th round target forward reference block, the second position offset represents an offset of the position of the backward reference block relative to a position of an (i-1)-th round target backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and a third prediction unit, configured to obtain a predictor of pixel values of the current picture block based on pixel values of a j-th round target forward reference block and pixel values of a j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
- The apparatus according to claim 37, wherein if i = 1, the i-th round motion information is initial motion information of the current picture block; or if i > 1, the i-th round motion information includes: a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
- The apparatus according to claim 37 or 38, wherein the third prediction unit is specifically configured to: when an iteration termination condition is met, obtain the predictor of pixel values of the picture block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
- The apparatus according to any one of claims 37 to 39, wherein the first position offset and the second position offset being in a mirror relationship comprises: a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset.
- The apparatus according to any one of claims 37 to 40, wherein the i-th round motion information includes a forward motion vector and a forward reference picture index, and a backward motion vector and a backward reference picture index; and in the aspect of determining positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and the position of the current picture block, the third search unit is specifically configured to: determine a position of an (i-1)-th round target forward reference block of the current picture block in the forward reference picture corresponding to the forward reference picture index based on the forward motion vector and the position of the current picture block; determine positions of (N-1) candidate forward reference blocks in the forward reference picture by using the position of the (i-1)-th round target forward reference block as an i_f-th search start point, where the positions of the N forward reference blocks include the position of one (i-1)-th round target forward reference block and the positions of the (N-1) candidate forward reference blocks; determine a position of an (i-1)-th round target backward reference block of the current picture block in the backward reference picture corresponding to the backward reference picture index based on the backward motion vector and the position of the current picture block; and determine positions of (N-1) candidate backward reference blocks in the backward reference picture by using the position of the (i-1)-th round target backward reference block as an i_b-th search start point, where the positions of the N backward reference blocks include the position of one (i-1)-th round target backward reference block and the positions of the (N-1) candidate backward reference blocks.
- The apparatus according to any one of claims 37 to 41, wherein in the aspect of determining, based on the matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current picture block, the third search unit is specifically configured to: determine, from among the M pairs of reference block positions, a pair of reference block positions with a minimum matching error as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current picture block; or determine, from among the M pairs of reference block positions, a pair of reference block positions whose matching error is less than or equal to a matching error threshold as the position of the i-th round target forward reference block and the position of the i-th round target backward reference block of the current picture block, where M is less than or equal to N.
- A picture prediction apparatus, comprising: a fourth obtaining unit, configured to obtain i-th round motion information of a current picture block; a fourth search unit, configured to: determine positions of N forward reference blocks and positions of N backward reference blocks based on the i-th round motion information and a position of the current picture block, where the N forward reference blocks are located in a forward reference picture, the N backward reference blocks are located in a backward reference picture, and N is an integer greater than 1; and determine, based on a matching cost criterion, one pair of reference block positions from among M pairs of reference block positions as a position of a target forward reference block and a position of a target backward reference block of the current picture block, where each pair of reference block positions includes a position of one forward reference block and a position of one backward reference block, and for each pair of reference block positions, a first position offset and a second position offset have a proportional relationship based on temporal distances, the first position offset represents, in the forward reference picture, an offset of the position of the forward reference block relative to a position of an (i-1)-th round target forward reference block, the second position offset represents, in the backward reference picture, an offset of the position of the backward reference block relative to a position of an (i-1)-th round target backward reference block, M is an integer greater than or equal to 1, and M is less than or equal to N; and a fourth prediction unit, configured to obtain a predictor of pixel values of the current picture block based on pixel values of a j-th round target forward reference block and pixel values of a j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
- The apparatus according to claim 43, wherein if i = 1, the i-th round motion information is initial motion information of the current picture block; or if i > 1, the i-th round motion information includes: a forward motion vector pointing to the position of the (i-1)-th round target forward reference block and a backward motion vector pointing to the position of the (i-1)-th round target backward reference block.
- The apparatus according to claim 43 or 44, wherein the fourth prediction unit is specifically configured to: when an iteration termination condition is met, obtain the predictor of pixel values of the picture block based on the pixel values of the j-th round target forward reference block and the pixel values of the j-th round target backward reference block, where j is greater than or equal to i, and i and j are both integers greater than or equal to 1.
- The apparatus according to any one of claims 43 to 45, wherein the first position offset and the second position offset having a proportional relationship based on temporal distances comprises: if a first temporal distance is the same as a second temporal distance, a direction of the first position offset is opposite to a direction of the second position offset, and a magnitude of the first position offset is the same as a magnitude of the second position offset; or if the first temporal distance is different from the second temporal distance, the direction of the first position offset is opposite to the direction of the second position offset, and a ratio between the magnitude of the first position offset and the magnitude of the second position offset is based on a ratio between the first temporal distance and the second temporal distance; where the first temporal distance represents a temporal distance between a current picture to which the current picture block belongs and the forward reference picture, and the second temporal distance represents a temporal distance between the current picture and the backward reference picture.
- A video encoder, configured to encode a picture block, comprising: an inter prediction module, including the picture prediction apparatus according to any one of claims 25 to 46, where the inter prediction module is configured to predict a predictor of pixel values of the picture block; an entropy encoding module, configured to encode indication information into a bitstream, where the indication information is used to indicate initial motion information of the picture block; and a reconstruction module, configured to reconstruct the picture block based on the predictor of pixel values of the picture block.
- A video decoder, configured to decode a picture block from a bitstream, comprising: an entropy decoding module, configured to decode indication information from the bitstream, where the indication information is used to indicate initial motion information of a currently decoded picture block; an inter prediction module, including the picture prediction apparatus according to any one of claims 25 to 46, where the inter prediction module is configured to predict a predictor of pixel values of the picture block; and a reconstruction module, configured to reconstruct the picture block based on the predictor of pixel values of the picture block.
Priority Applications (15)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2020125254A RU2772639C2 (ru) | 2017-12-31 | 2018-12-27 | Кодек, устройство и способ предсказания изображения |
KR1020247001807A KR20240011263A (ko) | 2017-12-31 | 2018-12-27 | 픽처 예측 방법과 장치, 및 코덱 |
JP2020536667A JP2021508213A (ja) | 2017-12-31 | 2018-12-27 | 画像予測の方法および装置、ならびにコーデック |
EP18895955.5A EP3734976A4 (en) | 2017-12-31 | 2018-12-27 | PROCESS AND DEVICE FOR IMAGE PREDICTION AND CODEC |
BR112020012914-3A BR112020012914A2 (pt) | 2017-12-31 | 2018-12-27 | Método e aparelho de predição de imagem, e codec |
CN201880084937.8A CN111543059A (zh) | 2017-12-31 | 2018-12-27 | 图像预测方法、装置以及编解码器 |
AU2018395081A AU2018395081B2 (en) | 2017-12-31 | 2018-12-27 | Picture prediction method and apparatus, and codec |
SG11202006258VA SG11202006258VA (en) | 2017-12-31 | 2018-12-27 | Picture prediction method and apparatus, and codec |
KR1020207022351A KR102503943B1 (ko) | 2017-12-31 | 2018-12-27 | 픽처 예측 방법과 장치, 및 코덱 |
KR1020237006148A KR102627496B1 (ko) | 2017-12-31 | 2018-12-27 | 픽처 예측 방법과 장치, 및 코덱 |
CA3087405A CA3087405A1 (en) | 2017-12-31 | 2018-12-27 | Picture prediction method and apparatus, and codec |
EP23219999.2A EP4362464A3 (en) | 2017-12-31 | 2018-12-27 | Picture prediction method and apparatus, and codec |
US16/915,678 US11528503B2 (en) | 2017-12-31 | 2020-06-29 | Picture prediction method and apparatus, and codec |
US17/994,556 US20230232036A1 (en) | 2017-12-31 | 2022-11-28 | Picture prediction method and apparatus, and codec |
AU2023204122A AU2023204122A1 (en) | 2017-12-31 | 2023-06-28 | Picture prediction method and apparatus, and codec |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711494274.0A CN109996081B (zh) | 2017-12-31 | 2017-12-31 | 图像预测方法、装置以及编解码器 |
CN201711494274.0 | 2017-12-31 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/915,678 Continuation US11528503B2 (en) | 2017-12-31 | 2020-06-29 | Picture prediction method and apparatus, and codec |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019129130A1 true WO2019129130A1 (zh) | 2019-07-04 |
Family
ID=67066616
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/124275 WO2019129130A1 (zh) | 2017-12-31 | 2018-12-27 | 图像预测方法、装置以及编解码器 |
Country Status (11)
Country | Link |
---|---|
US (2) | US11528503B2 (zh) |
EP (2) | EP3734976A4 (zh) |
JP (2) | JP2021508213A (zh) |
KR (3) | KR20240011263A (zh) |
CN (3) | CN117336504A (zh) |
AU (2) | AU2018395081B2 (zh) |
BR (1) | BR112020012914A2 (zh) |
CA (1) | CA3087405A1 (zh) |
SG (1) | SG11202006258VA (zh) |
TW (2) | TWI828507B (zh) |
WO (1) | WO2019129130A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021068954A1 (en) * | 2019-10-12 | 2021-04-15 | Beijing Bytedance Network Technology Co., Ltd. | High level syntax for video coding tools |
CN113691810A (zh) * | 2021-07-26 | 2021-11-23 | 浙江大华技术股份有限公司 | 帧内帧间联合预测方法、编解码方法及相关设备 |
US11575887B2 (en) | 2019-05-11 | 2023-02-07 | Beijing Bytedance Network Technology Co., Ltd. | Selective use of coding tools in video processing |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7282872B2 (ja) * | 2018-08-13 | 2023-05-29 | エルジー エレクトロニクス インコーポレイティド | ヒストリベースの動きベクトルに基づくインター予測方法及びその装置 |
CN110545425B (zh) * | 2019-08-21 | 2021-11-16 | 浙江大华技术股份有限公司 | 一种帧间预测方法、终端设备以及计算机存储介质 |
WO2021061023A1 (en) * | 2019-09-23 | 2021-04-01 | Huawei Technologies Co., Ltd. | Signaling for motion vector refinement |
WO2020251418A2 (en) * | 2019-10-01 | 2020-12-17 | Huawei Technologies Co., Ltd. | Method and apparatus of slice-level signaling for bi-directional optical flow and decoder side motion vector refinement |
CN112135127B (zh) * | 2019-11-05 | 2021-09-21 | 杭州海康威视数字技术股份有限公司 | 一种编解码方法、装置、设备及机器可读存储介质 |
CN113452997B (zh) * | 2020-03-25 | 2022-07-29 | 杭州海康威视数字技术股份有限公司 | 一种编解码方法、装置及其设备 |
CN112565753B (zh) * | 2020-12-06 | 2022-08-16 | 浙江大华技术股份有限公司 | 运动矢量差的确定方法和装置、存储介质及电子装置 |
CN114640856B (zh) * | 2021-03-19 | 2022-12-23 | 杭州海康威视数字技术股份有限公司 | 解码方法、编码方法、装置及设备 |
CN113938690B (zh) * | 2021-12-03 | 2023-10-31 | 北京达佳互联信息技术有限公司 | 视频编码方法、装置、电子设备及存储介质 |
US20230199171A1 (en) * | 2021-12-21 | 2023-06-22 | Mediatek Inc. | Search Memory Management For Video Coding |
WO2024010338A1 (ko) * | 2022-07-05 | 2024-01-11 | 한국전자통신연구원 | 영상 부호화/복호화를 위한 방법, 장치 및 기록 매체 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1658673A (zh) * | 2005-03-23 | 2005-08-24 | 南京大学 | 视频压缩编解码方法 |
CN101557514A (zh) * | 2008-04-11 | 2009-10-14 | 华为技术有限公司 | 一种帧间预测编解码方法、装置及系统 |
US20120027095A1 (en) * | 2010-07-30 | 2012-02-02 | Canon Kabushiki Kaisha | Motion vector detection apparatus, motion vector detection method, and computer-readable storage medium |
CN104427347A (zh) * | 2013-09-02 | 2015-03-18 | 苏州威迪斯特光电科技有限公司 | 网络摄像机视频监控系统图像质量提高方法 |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6195389B1 (en) * | 1998-04-16 | 2001-02-27 | Scientific-Atlanta, Inc. | Motion estimation system and methods |
TWI401972B (zh) * | 2009-06-23 | 2013-07-11 | Acer Inc | 時間性錯誤隱藏方法 |
US8917769B2 (en) | 2009-07-03 | 2014-12-23 | Intel Corporation | Methods and systems to estimate motion based on reconstructed reference frames at a video decoder |
GB2493755B (en) | 2011-08-17 | 2016-10-19 | Canon Kk | Method and device for encoding a sequence of images and method and device for decoding a sequence of images |
CN104427345B (zh) * | 2013-09-11 | 2019-01-08 | 华为技术有限公司 | 运动矢量的获取方法、获取装置、视频编解码器及其方法 |
US10958927B2 (en) * | 2015-03-27 | 2021-03-23 | Qualcomm Incorporated | Motion information derivation mode determination in video coding |
WO2017201678A1 (zh) | 2016-05-24 | 2017-11-30 | 华为技术有限公司 | 图像预测方法和相关设备 |
EP3264769A1 (en) * | 2016-06-30 | 2018-01-03 | Thomson Licensing | Method and apparatus for video coding with automatic motion information refinement |
US10631002B2 (en) * | 2016-09-30 | 2020-04-21 | Qualcomm Incorporated | Frame rate up-conversion coding mode |
US10750203B2 (en) * | 2016-12-22 | 2020-08-18 | Mediatek Inc. | Method and apparatus of adaptive bi-prediction for video coding |
US20180192071A1 (en) * | 2017-01-05 | 2018-07-05 | Mediatek Inc. | Decoder-side motion vector restoration for video coding |
WO2019001741A1 (en) * | 2017-06-30 | 2019-01-03 | Huawei Technologies Co., Ltd. | MOTION VECTOR REFINEMENT FOR MULTI-REFERENCE PREDICTION |
CN111201795B (zh) * | 2017-10-09 | 2022-07-26 | 华为技术有限公司 | 存储访问窗口和用于运动矢量修正的填充 |
-
2017
- 2017-12-31 CN CN202311090778.1A patent/CN117336504A/zh active Pending
- 2017-12-31 CN CN201711494274.0A patent/CN109996081B/zh active Active
-
2018
- 2018-12-26 TW TW112100139A patent/TWI828507B/zh active
- 2018-12-26 TW TW107147080A patent/TWI791723B/zh active
- 2018-12-27 KR KR1020247001807A patent/KR20240011263A/ko active Application Filing
- 2018-12-27 EP EP18895955.5A patent/EP3734976A4/en not_active Ceased
- 2018-12-27 WO PCT/CN2018/124275 patent/WO2019129130A1/zh unknown
- 2018-12-27 KR KR1020207022351A patent/KR102503943B1/ko active IP Right Grant
- 2018-12-27 AU AU2018395081A patent/AU2018395081B2/en active Active
- 2018-12-27 EP EP23219999.2A patent/EP4362464A3/en active Pending
- 2018-12-27 CA CA3087405A patent/CA3087405A1/en active Pending
- 2018-12-27 JP JP2020536667A patent/JP2021508213A/ja active Pending
- 2018-12-27 CN CN201880084937.8A patent/CN111543059A/zh active Pending
- 2018-12-27 KR KR1020237006148A patent/KR102627496B1/ko active IP Right Grant
- 2018-12-27 SG SG11202006258VA patent/SG11202006258VA/en unknown
- 2018-12-27 BR BR112020012914-3A patent/BR112020012914A2/pt unknown
-
2020
- 2020-06-29 US US16/915,678 patent/US11528503B2/en active Active
-
2022
- 2022-11-28 US US17/994,556 patent/US20230232036A1/en active Pending
-
2023
- 2023-04-24 JP JP2023070921A patent/JP2023103277A/ja active Pending
- 2023-06-28 AU AU2023204122A patent/AU2023204122A1/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1658673A (zh) * | 2005-03-23 | 2005-08-24 | 南京大学 | 视频压缩编解码方法 |
CN101557514A (zh) * | 2008-04-11 | 2009-10-14 | 华为技术有限公司 | 一种帧间预测编解码方法、装置及系统 |
US20120027095A1 (en) * | 2010-07-30 | 2012-02-02 | Canon Kabushiki Kaisha | Motion vector detection apparatus, motion vector detection method, and computer-readable storage medium |
CN104427347A (zh) * | 2013-09-02 | 2015-03-18 | 苏州威迪斯特光电科技有限公司 | 网络摄像机视频监控系统图像质量提高方法 |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11575887B2 (en) | 2019-05-11 | 2023-02-07 | Beijing Bytedance Network Technology Co., Ltd. | Selective use of coding tools in video processing |
WO2021068954A1 (en) * | 2019-10-12 | 2021-04-15 | Beijing Bytedance Network Technology Co., Ltd. | High level syntax for video coding tools |
US11689747B2 (en) | 2019-10-12 | 2023-06-27 | Beijing Bytedance Network Technology Co., Ltd | High level syntax for video coding tools |
CN113691810A (zh) * | 2021-07-26 | 2021-11-23 | 浙江大华技术股份有限公司 | 帧内帧间联合预测方法、编解码方法及相关设备 |
CN113691810B (zh) * | 2021-07-26 | 2022-10-04 | 浙江大华技术股份有限公司 | 帧内帧间联合预测方法、编解码方法及相关设备、存储介质 |
Also Published As
Publication number | Publication date |
---|---|
JP2021508213A (ja) | 2021-02-25 |
SG11202006258VA (en) | 2020-07-29 |
EP3734976A4 (en) | 2021-02-03 |
TW202318876A (zh) | 2023-05-01 |
RU2020125254A (ru) | 2022-01-31 |
AU2018395081A1 (en) | 2020-08-13 |
KR102503943B1 (ko) | 2023-02-24 |
AU2023204122A1 (en) | 2023-07-13 |
KR20240011263A (ko) | 2024-01-25 |
RU2020125254A3 (zh) | 2022-01-31 |
CA3087405A1 (en) | 2019-07-04 |
TWI791723B (zh) | 2023-02-11 |
TWI828507B (zh) | 2024-01-01 |
CN109996081A (zh) | 2019-07-09 |
JP2023103277A (ja) | 2023-07-26 |
EP3734976A1 (en) | 2020-11-04 |
EP4362464A3 (en) | 2024-05-29 |
CN117336504A (zh) | 2024-01-02 |
KR20200101986A (ko) | 2020-08-28 |
TW201931857A (zh) | 2019-08-01 |
US20230232036A1 (en) | 2023-07-20 |
BR112020012914A2 (pt) | 2020-12-08 |
EP4362464A2 (en) | 2024-05-01 |
KR20230033021A (ko) | 2023-03-07 |
US11528503B2 (en) | 2022-12-13 |
US20200396478A1 (en) | 2020-12-17 |
KR102627496B1 (ko) | 2024-01-18 |
AU2018395081B2 (en) | 2023-03-30 |
CN109996081B (zh) | 2023-09-12 |
CN111543059A (zh) | 2020-08-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019129130A1 (zh) | 图像预测方法、装置以及编解码器 | |
US10652571B2 (en) | Advanced motion vector prediction speedups for video coding | |
JP6783788B2 (ja) | ビデオコーディングにおけるサブブロックの動き情報の導出 | |
WO2019120305A1 (zh) | 图像块的运动信息的预测方法、装置及编解码器 | |
JP2018530246A (ja) | ビデオコーディングのために位置依存の予測組合せを使用する改善されたビデオイントラ予測 | |
US11765378B2 (en) | Video coding method and apparatus | |
US11563949B2 (en) | Motion vector obtaining method and apparatus, computer device, and storage medium | |
WO2019154424A1 (zh) | 视频解码方法、视频解码器以及电子设备 | |
JP2017513346A (ja) | 低複雑度符号化および背景検出のためのシステムおよび方法 | |
WO2020047807A1 (zh) | 帧间预测方法、装置以及编解码器 | |
US11394996B2 (en) | Video coding method and apparatus | |
WO2020114356A1 (zh) | 帧间预测方法和相关装置 | |
WO2020043111A1 (zh) | 基于历史候选列表的图像编码、解码方法以及编解码器 | |
WO2019084776A1 (zh) | 图像块的候选运动信息的获取方法、装置及编解码器 | |
RU2772639C2 (ru) | Кодек, устройство и способ предсказания изображения |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18895955 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 3087405 Country of ref document: CA Ref document number: 2020536667 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2018895955 Country of ref document: EP Effective date: 20200727 Ref document number: 20207022351 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2018395081 Country of ref document: AU Date of ref document: 20181227 Kind code of ref document: A |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112020012914 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 112020012914 Country of ref document: BR Kind code of ref document: A2 Effective date: 20200624 |