WO2019091372A1 - Image prediction method and apparatus - Google Patents
Image prediction method and apparatus
- Publication number
- WO2019091372A1 (PCT/CN2018/114146)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- prediction block
- image
- reference image
- prediction
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
Definitions
- The present application relates to the field of video coding and decoding technology, and more particularly to an image prediction method and apparatus.
- The present application proposes an image prediction method and apparatus that unify the precision of the various prediction blocks used during prediction and determine the predicted value of the pixel values of the image block from the finally obtained target prediction block, thereby reducing the complexity of image prediction.
- A bit width larger than that of the reconstructed pixel values of the image block may be used during prediction to improve the accuracy of the finally obtained predicted value of the pixel values of the image block.
- Alternatively, the same bit width as that of the reconstructed pixel values of the image block may be used during prediction to further reduce the complexity of image prediction.
- An image prediction method, comprising: acquiring predicted motion information of an image block; obtaining, according to the predicted motion information and by using an interpolation filter, a first prediction block and a second prediction block corresponding to the image block in a reference image, wherein a gain of the interpolation filter is greater than 1; obtaining an initial prediction block according to the first prediction block and the second prediction block, wherein the bit widths of the pixel values of the initial prediction block, the first prediction block, and the second prediction block are all the same; searching in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, where M is a preset value and an integer greater than 1; determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block, wherein the bit widths of the pixel values of the target prediction block and the initial prediction block are the same; and obtaining a predicted value of the pixel values of the image block according to the pixel values of the target prediction block.
- The predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to a reference image block (e.g., a motion vector of a neighboring block), and information about the image in which the reference image block is located (generally understood as reference image information), wherein the motion vector comprises a forward motion vector and/or a backward motion vector, and the reference image information comprises reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
- The bit width of the pixel values of the first prediction block and the second prediction block is greater than the bit width of the reconstructed pixel values of the finally obtained image block. Because the bit widths of the pixel values of the first prediction block, the second prediction block, the initial prediction block, and the target prediction block are the same, the bit width of the pixel values of the finally obtained target prediction block is also greater than the bit width of the reconstructed pixel values of the image block. The predicted value of the pixel values of the image block can therefore be determined directly from the pixel values of the target prediction block with the higher bit width, without first obtaining a high-bit-width prediction block by motion compensation and then determining the predicted value from it. This saves a motion compensation operation and reduces the complexity of image prediction.
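The link between a filter gain greater than 1 and a larger bit width can be illustrated with a small sketch. The coefficients below are the HEVC luma half-sample taps (sum 64), used purely as an assumed example; the text above does not name a specific filter, and the helper names are hypothetical.

```python
# Assumed example taps: the HEVC luma half-sample filter (coefficients sum to 64).
# The patent text only requires a gain greater than 1; this choice is illustrative.
HALF_PEL_TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]


def filter_gain(taps):
    """Gain of an FIR interpolation filter: the sum of its coefficients."""
    return sum(taps)


def interpolate_row(samples, taps):
    """Apply the taps at every valid position without normalising the result."""
    n = len(taps)
    return [
        sum(t * s for t, s in zip(taps, samples[i:i + n]))
        for i in range(len(samples) - n + 1)
    ]


def bits_needed(value):
    """Number of bits needed to represent a non-negative integer."""
    return max(1, value.bit_length())


# A flat row of maximum-valued 8-bit samples.
flat_row = [255] * 8
unnormalised = interpolate_row(flat_row, HALF_PEL_TAPS)
```

With 8-bit input and a gain of 64, the unnormalised output needs 14 bits, which is why the prediction blocks carry a larger bit width than the reconstructed pixels unless a shift is applied.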
- Obtaining an initial prediction block according to the first prediction block and the second prediction block may refer to obtaining a pixel value of the initial prediction block according to the pixel value of the first prediction block and the pixel value of the second prediction block.
- a bit width of a pixel value of each of the M prediction blocks is the same as a bit width of a pixel value of the initial prediction block.
- Obtaining the initial prediction block according to the first prediction block and the second prediction block includes: weighting the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
- Specifically, the result of weighting the pixel values of the first prediction block and the second prediction block is taken as the pixel values of the initial prediction block, so that the bit width of the pixel values of the initial prediction block is consistent with that of the first prediction block and the second prediction block.
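A minimal sketch of the weighting step, assuming integer weights and round-to-nearest normalisation (neither is specified in the text above). Blocks are modelled as lists of rows, and `weighted_initial_block` is a hypothetical name.

```python
def weighted_initial_block(block0, block1, w0=1, w1=1):
    """Combine two prediction blocks by a normalised weighted sum.

    Because the result is divided by (w0 + w1), every output sample lies
    between the two input samples, so the bit width does not grow.
    The weights and the rounding offset are assumptions for illustration.
    """
    total = w0 + w1
    rounding = total // 2
    return [
        [(w0 * a + w1 * b + rounding) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(block0, block1)
    ]


first_block = [[100, 200]]
second_block = [[102, 203]]
initial_block = weighted_initial_block(first_block, second_block)
```

Equal weights reduce this to a rounded average of the two blocks.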
- Obtaining, according to the predicted motion information and by using the interpolation filter, the first prediction block and the second prediction block corresponding to the image block in the reference image includes: the reference image is a forward reference image, and the first prediction block and the second prediction block are obtained in the forward reference image by the interpolation filter according to the predicted motion information; or the reference image is a backward reference image, and the first prediction block and the second prediction block are obtained in the backward reference image by the interpolation filter according to the predicted motion information; or the reference image includes a forward reference image and a backward reference image, and the first prediction block and the second prediction block are obtained in the forward reference image and the backward reference image, respectively, by the interpolation filter according to the predicted motion information.
- By acquiring different prediction blocks in the forward reference image and/or the backward reference image, the initial prediction block can be determined from those different prediction blocks. Compared with directly using a prediction block found by searching the forward reference image or the backward reference image as the initial prediction block, this determines the initial prediction block more accurately.
- Searching in the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block includes: the reference image is a forward reference image, and a search is performed in the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or the reference image is a backward reference image, and a search is performed in the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
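The selection rule can be sketched as follows. The text only says "difference"; the sum of absolute differences (SAD) is used here as an assumed cost measure, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(x - y)
        for row_a, row_b in zip(block_a, block_b)
        for x, y in zip(row_a, row_b)
    )


def select_target_block(candidates, initial_block):
    """Return the candidate whose pixel values differ least from the initial block."""
    return min(candidates, key=lambda block: sad(block, initial_block))


initial = [[10, 20], [30, 40]]
candidates = [
    [[0, 0], [0, 0]],      # SAD 100
    [[11, 19], [30, 42]],  # SAD 4
    [[10, 25], [30, 40]],  # SAD 5
]
target = select_target_block(candidates, initial)
```

Because all M candidates and the initial block share one bit width, the cost can be computed directly with no per-candidate conversion.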
- Searching in the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block includes: the reference image includes a forward reference image and a backward reference image; a search is performed in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and a search is performed in the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, where A and B are integers greater than 0 and A + B = M. Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
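A sketch of the two-direction case under stated assumptions: the difference measure is SAD, and the two per-direction winners are combined by a rounded average. The text above does not specify how the target prediction block is formed from the first and second target prediction blocks, so the averaging step is illustrative only.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(x - y)
        for row_a, row_b in zip(block_a, block_b)
        for x, y in zip(row_a, row_b)
    )


def best_match(candidates, initial_block):
    """Candidate with the smallest SAD against the initial prediction block."""
    return min(candidates, key=lambda block: sad(block, initial_block))


def bidirectional_target(forward_candidates, backward_candidates, initial_block):
    """Pick the best of the A forward and B backward candidates, then average them.

    The rounded average is an assumed combination rule, not taken from the source.
    """
    first_target = best_match(forward_candidates, initial_block)
    second_target = best_match(backward_candidates, initial_block)
    return [
        [(x + y + 1) >> 1 for x, y in zip(row0, row1)]
        for row0, row1 in zip(first_target, second_target)
    ]


initial = [[10, 10]]
forward = [[[0, 0]], [[9, 11]]]      # A = 2 candidates
backward = [[[12, 10]], [[30, 30]]]  # B = 2 candidates, so M = A + B = 4
target = bidirectional_target(forward, backward, initial)
```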
- Searching in the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block includes: the reference image is a first direction reference image, and a search is performed in the first direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, according to the second motion vector, a second target prediction block corresponding to the image block in a second direction reference image, wherein the first direction reference image and the second direction reference image are respectively a forward reference image and a backward reference image, or respectively a backward reference image and a forward reference image; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- Deriving the prediction block in the reference image in the other direction from the prediction block found by searching the reference image in one direction saves a large number of search operations and reduces the complexity of image prediction. At the same time, because the target prediction block is determined using both the prediction block of the image block in the forward reference image and the prediction block of the image block in the backward reference image, the accuracy of image prediction is maintained while its complexity is reduced.
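The derivation of the second motion vector can be sketched as below. The "preset rule" is not defined in the text above; this example assumes simple mirroring (negating both components), which is plausible when the two reference images are equally distant in time from the current image. It also assumes integer-pixel motion vectors and performs no boundary handling. All names are hypothetical.

```python
def mirror_motion_vector(mv):
    """Assumed preset rule: negate both components of the first motion vector."""
    dx, dy = mv
    return (-dx, -dy)


def block_at(image, top, left, height, width):
    """Extract a height x width block whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def second_target_block(second_image, block_pos, first_mv, size):
    """Locate the second target block in the other-direction reference image.

    No search is performed: the position follows from the derived motion vector.
    """
    top, left = block_pos
    height, width = size
    dx, dy = mirror_motion_vector(first_mv)
    return block_at(second_image, top + dy, left + dx, height, width)


# 4x4 reference image whose sample at (row, col) equals row * 4 + col.
backward_image = [[r * 4 + c for c in range(4)] for r in range(4)]
derived = second_target_block(backward_image, (1, 1), (1, -1), (2, 2))
```

Only one direction is searched; the other block is a single fetch, which is where the saving in search operations comes from.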
- Obtaining, according to the predicted motion information and by using the interpolation filter, the first prediction block and the second prediction block corresponding to the image block in the reference image includes: obtaining, according to the predicted motion information and by using the interpolation filter, the first prediction block corresponding to the image block in a first reference image; and obtaining, according to the predicted motion information and by using the interpolation filter, the second prediction block corresponding to the image block in a second reference image, wherein the first reference image is a reference image in a first reference image list, the second reference image is a reference image in a second reference image list, and the first reference image list and the second reference image list are different reference image lists used when predicting the image block.
- Searching in the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block includes: searching in the first reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and searching in the second reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, wherein the first reference image is a reference image in the first reference image list and the second reference image is a reference image in the second reference image list. Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The first reference image and the second reference image may each be either a forward reference image or a backward reference image. Specifically, the following cases are included: the first reference image and the second reference image are both forward reference images; the first reference image and the second reference image are both backward reference images; or the first reference image is a forward reference image and the second reference image is a backward reference image.
- The first reference image may be one reference image or a plurality of reference images.
- The second reference image may likewise be one reference image or a plurality of reference images.
- Before acquiring the predicted motion information of the image block, the method further includes: obtaining indication information from a code stream of the image block, wherein the indication information is used to indicate that the predicted motion information of the image block is to be acquired, and the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block.
- The indication information can flexibly indicate whether to acquire the predicted motion information of the image block and then predict the image according to it. Specifically, the indication information can indicate whether image prediction is performed using the method of the embodiments of the present application: if the indication information is obtained from the code stream, the image is predicted according to the method of the embodiments of the present application; if it is not obtained from the code stream, the image can be predicted according to a conventional method. The indication information thus flexibly indicates which method is used to predict the image.
- Before the initial prediction block is obtained according to the first prediction block and the second prediction block, the method further includes: obtaining indication information from the code stream of the image block, wherein the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block; and obtaining the initial prediction block according to the first prediction block and the second prediction block includes: when the value of the identifier bit of the indication information is a first value, obtaining the initial prediction block according to the first prediction block and the second prediction block.
- The method further includes: obtaining indication information from the code stream of the image block, wherein the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block; and weighting the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block includes: when the value of the identifier bit of the indication information is the first value, weighting the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
- The value of the identifier bit of the indication information may be a first value or a second value.
- When the identifier bit of the indication information is the first value, it may indicate that the image block is predicted according to the prediction method of the present application.
- When the identifier bit of the indication information is the second value, it may indicate that the image block is predicted according to a conventional prediction method.
- The first value and the second value may be 1 and 0, respectively, or 0 and 1, respectively.
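The role of the identifier bit can be sketched as a simple dispatch. The default mapping (first value 1, second value 0) is one of the two options the text allows; the function name and the returned labels are hypothetical, and no real bitstream syntax is implied.

```python
FIRST_VALUE = 1   # assumed mapping; the text also allows first value 0, second value 1
SECOND_VALUE = 0


def select_prediction_method(identifier_bit, first_value=FIRST_VALUE):
    """Choose the prediction path from the identifier bit of the indication information.

    Returns "proposed" when the bit equals the first value, otherwise "conventional".
    """
    return "proposed" if identifier_bit == first_value else "conventional"


with_default_mapping = select_prediction_method(1)
with_swapped_mapping = select_prediction_method(1, first_value=0)
```

Passing `first_value` explicitly keeps the sketch valid for either of the two mappings.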
- The method further comprises: obtaining a motion vector of the image block pointing to the target prediction block; and obtaining a motion vector of the image block according to the motion vector of the image block pointing to the target prediction block, wherein the motion vector of the image block is used to predict other image blocks.
- Determining the motion vector of the image block according to the motion vector pointing to the target prediction block may specifically mean taking the motion vector of the target prediction block as the motion vector of the image block, that is, updating the motion vector of the image block, so that other image blocks can be effectively predicted from this image block in subsequent image prediction.
- An image prediction method, comprising: acquiring predicted motion information of an image block; obtaining, according to the predicted motion information and by using an interpolation filter, a first prediction block and a second prediction block corresponding to the image block in a reference image, wherein a gain of the interpolation filter is greater than 1; performing a shift operation on the pixel values of the first prediction block and the second prediction block, such that the bit width of the pixel values of the first prediction block and the second prediction block is reduced to a target bit width, wherein the target bit width is the bit width of the reconstructed pixel values of the image block; obtaining an initial prediction block according to the first prediction block and the second prediction block, wherein the bit widths of the pixel values of the initial prediction block, the first prediction block, and the second prediction block are the same; searching in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, where M is a preset value and an integer greater than 1; determining the image
- The predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to a reference image block (usually a motion vector of a neighboring block), and information about the image in which the reference image block is located (generally understood as reference image information), wherein the motion vector comprises a forward motion vector and/or a backward motion vector, and the reference image information comprises reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
- Because the bit widths of the pixel values of the initial prediction block and the target prediction block are both the target bit width, back-and-forth conversion of pixel values between different bit widths during image prediction is reduced. The predicted value of the pixel values of the image block is determined from the target prediction block whose pixel values have the target bit width, rather than by performing motion compensation again to obtain a high-bit-width prediction block and then determining the predicted value from it. This saves a motion compensation operation, simplifies the image prediction process, and reduces the complexity of image prediction.
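The shift operation that brings prediction samples down to the target bit width can be sketched as follows. The rounding offset and the clipping to the valid range are assumptions added for illustration; the text above only states that a shift is applied. The 14-bit source width in the example is likewise an assumed value.

```python
def reduce_bit_width(block, source_bits, target_bits):
    """Right-shift every sample so that it fits in target_bits.

    The rounding offset and the final clip are assumptions, not from the source.
    """
    shift = source_bits - target_bits
    if shift <= 0:
        # Already at or below the target bit width: return a copy unchanged.
        return [row[:] for row in block]
    offset = 1 << (shift - 1)
    max_value = (1 << target_bits) - 1
    return [
        [min(max_value, max(0, (value + offset) >> shift)) for value in row]
        for row in block
    ]


# Assumed example: 14-bit interpolation output reduced to 8-bit reconstruction depth.
high_precision = [[16320, 8192, 0]]
reduced = reduce_bit_width(high_precision, 14, 8)
```

After this step every later stage (weighting, search, selection) works at the reconstruction bit width, so no further conversion is needed before the predicted value is output.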
- Obtaining an initial prediction block according to the first prediction block and the second prediction block may refer to obtaining a pixel value of the initial prediction block according to the pixel value of the first prediction block and the pixel value of the second prediction block.
- a bit width of a pixel value of each of the M prediction blocks is the same as a bit width of a pixel value of the initial prediction block.
- Obtaining the initial prediction block according to the first prediction block and the second prediction block includes: weighting the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
- Specifically, the result of weighting the pixel values of the first prediction block and the second prediction block is taken as the pixel values of the initial prediction block, so that the bit width of the pixel values of the initial prediction block is consistent with that of the first prediction block and the second prediction block.
- Obtaining, according to the predicted motion information and by using the interpolation filter, the first prediction block and the second prediction block corresponding to the image block in the reference image includes: the reference image is a forward reference image, and the first prediction block and the second prediction block are obtained in the forward reference image by the interpolation filter according to the predicted motion information; or the reference image is a backward reference image, and the first prediction block and the second prediction block are obtained in the backward reference image by the interpolation filter according to the predicted motion information; or the reference image includes a forward reference image and a backward reference image, and the first prediction block and the second prediction block are obtained in the forward reference image and the backward reference image, respectively, by the interpolation filter according to the predicted motion information.
- By acquiring different prediction blocks in the forward reference image and/or the backward reference image, the initial prediction block can be determined from those different prediction blocks. Compared with directly using a prediction block found by searching the forward reference image or the backward reference image as the initial prediction block, this determines the initial prediction block more accurately.
- Searching in the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block includes: the reference image is a forward reference image, and a search is performed in the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or the reference image is a backward reference image, and a search is performed in the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- Searching in the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block includes: the reference image includes a forward reference image and a backward reference image; a search is performed in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and a search is performed in the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, where A and B are integers greater than 0 and A + B = M. Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- the performing the searching in the reference image according to the predicted motion information, to obtain the M prediction blocks corresponding to the image block including: the reference The image is a first direction reference image, and is searched in the first direction reference image according to the predicted motion information, to obtain M prediction blocks corresponding to the image block; and according to the M prediction blocks corresponding to the image block.
- Determining, by the initial prediction block, the target prediction block of the image block comprising: determining, as the first prediction block, a difference between a pixel value of the M prediction blocks corresponding to the image block and a pixel value of the initial prediction block a target prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector according to the first motion vector according to a preset rule; according to the second motion vector Determining, in the two-direction reference image, a second target prediction block corresponding to the image block, wherein the first direction reference image and the second direction reference image are forward reference respectively And the backward reference image, or the first direction reference image and the second direction reference image are respectively a backward reference image and a forward reference image; and the first target prediction block and the second target according to the first target A prediction block that determines the target prediction block.
- Deriving the prediction block in the reference image in the other direction from the prediction block found by searching the reference image in one direction saves a large number of search operations and reduces the complexity of image prediction. At the same time, because the target prediction block is determined using both the prediction block corresponding to the image block in the forward reference image and the prediction block corresponding to the image block in the backward reference image, the accuracy of image prediction is maintained while the complexity is reduced.
- Obtaining, according to the predicted motion information and by using an interpolation filter, the first prediction block and the second prediction block corresponding to the image block includes: obtaining, according to the predicted motion information and by using the interpolation filter, a first prediction block corresponding to the image block in a first reference image; and obtaining, according to the predicted motion information and by using the interpolation filter, a second prediction block corresponding to the image block in a second reference image, wherein the first reference image is a reference image in a first reference image list, the second reference image is a reference image in a second reference image list, and the first reference image list and the second reference image list are different reference image lists used when predicting the image block.
- Performing the search in the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block includes: searching in the first reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and searching in the second reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block.
- the first reference image is a reference image in a first reference image list
- the second reference image is a reference image in a second reference image list
- The first reference image and the second reference image may each be either a forward reference image or a backward reference image. Specifically, the following cases are possible: the first reference image and the second reference image are both forward reference images; the first reference image and the second reference image are both backward reference images; or the first reference image is a forward reference image and the second reference image is a backward reference image.
- The first reference image may be one reference image or multiple reference images.
- Likewise, the second reference image may be one reference image or multiple reference images.
- Before acquiring the predicted motion information of the image block, the method further includes: obtaining indication information from a code stream of the image block, wherein the indication information is used to indicate that the predicted motion information of the image block is to be acquired, and the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block.
- The indication information can flexibly indicate whether to acquire the predicted motion information of the image block and then predict the image according to that information. Specifically, the indication information can indicate whether image prediction is performed using the method of the embodiments of the present application: if the indication information is obtained from the code stream, the image is predicted according to the method of the embodiments of the present application; if it is not obtained from the code stream, the image can be predicted according to a conventional method. In this way the indication information flexibly indicates which method is used to predict the image.
- Before the initial prediction block is obtained according to the first prediction block and the second prediction block, the method further includes: obtaining indication information from the code stream of the image block, wherein the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block. Obtaining the initial prediction block according to the first prediction block and the second prediction block includes: when the value of the identifier bit of the indication information is a first value, obtaining the initial prediction block according to the first prediction block and the second prediction block.
- The method further includes: obtaining indication information from a code stream of the image block, wherein the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block. Performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block includes: when the value of the identifier bit of the indication information is the first value, performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
- The value of the identifier bit of the indication information may be a first value or a second value.
- When the identifier bit of the indication information takes the first value, it may indicate that the image block is predicted according to the prediction method of the present application.
- When the identifier bit of the indication information takes the second value, it may indicate that the image block is predicted according to a conventional prediction method.
- the first value and the second value may be 1 and 0, respectively, or the first value and the second value may also be 0 and 1, respectively.
- The method further includes: obtaining a motion vector of the image block pointing to the target prediction block; and obtaining a motion vector of the image block according to the motion vector of the image block pointing to the target prediction block, wherein the motion vector of the image block is used to predict other image blocks.
- Determining the motion vector of the image block according to the motion vector pointing to the target prediction block may specifically mean determining the motion vector pointing to the target prediction block as the motion vector of the image block, that is, updating the motion vector of the image block, so that other image blocks can be effectively predicted based on this image block in subsequent image prediction.
- an image prediction apparatus comprising means for performing the method of the first aspect or various implementations thereof.
- an image prediction apparatus comprising means for performing the method of the second aspect or various implementations thereof.
- In a fifth aspect, a terminal device is provided, including: a memory for storing a program; and a processor configured to execute the program stored in the memory, wherein when the program is executed, the processor is configured to perform the method of the first aspect or various implementations thereof.
- In a sixth aspect, a terminal device is provided, including: a memory for storing a program; and a processor configured to execute the program stored in the memory, wherein when the program is executed, the processor is configured to perform the method of the second aspect or various implementations thereof.
- A video encoder including a nonvolatile storage medium and a central processing unit, the nonvolatile storage medium storing an executable program, wherein the central processing unit is coupled to the nonvolatile storage medium and executes the executable program to implement the method of the first aspect or various implementations thereof.
- A video encoder including a nonvolatile storage medium and a central processing unit, the nonvolatile storage medium storing an executable program, wherein the central processing unit is coupled to the nonvolatile storage medium and executes the executable program to implement the method of the second aspect or various implementations thereof.
- A video decoder including a nonvolatile storage medium and a central processing unit, the nonvolatile storage medium storing an executable program, wherein the central processing unit is coupled to the nonvolatile storage medium and executes the executable program to implement the method of the first aspect or various implementations thereof.
- A video decoder including a nonvolatile storage medium and a central processing unit, the nonvolatile storage medium storing an executable program, wherein the central processing unit is coupled to the nonvolatile storage medium and executes the executable program to implement the method of the second aspect or various implementations thereof.
- A video encoding system including a nonvolatile storage medium and a central processing unit, the nonvolatile storage medium storing an executable program, wherein the central processing unit is coupled to the nonvolatile storage medium and executes the executable program to implement the method of the first aspect or various implementations thereof.
- A video encoding system including a nonvolatile storage medium and a central processing unit, the nonvolatile storage medium storing an executable program, wherein the central processing unit is coupled to the nonvolatile storage medium and executes the executable program to implement the method of the second aspect or various implementations thereof.
- In a thirteenth aspect, a computer readable medium is provided, storing program code for execution by a device, the program code comprising instructions for performing the method of the first aspect or various implementations thereof.
- A computer readable medium storing program code for execution by a device, the program code comprising instructions for performing the method of the second aspect or various implementations thereof.
- A decoder comprising the image prediction apparatus of the third aspect or the fourth aspect and a reconstruction module, wherein the reconstruction module is configured to obtain a reconstructed pixel value of the image block according to the predicted value of the pixel value of the image block obtained by the image prediction apparatus.
- An encoder comprising the image prediction apparatus of the third aspect or the fourth aspect and a reconstruction module, wherein the reconstruction module is configured to obtain a reconstructed pixel value of the image block according to the predicted value of the pixel value of the image block obtained by the image prediction apparatus.
- Figure 1 is a schematic diagram of a video encoding process
- FIG. 2 is a schematic diagram of a video decoding process
- FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- FIG. 4 is a schematic diagram of selecting a motion vector of a prediction block of a current block in a merge mode of inter prediction
- FIG. 5 is a schematic diagram of selecting a motion vector of a prediction block of a current block in a non-merging mode of inter prediction
- FIG. 6 is a schematic diagram of an integer pixel position pixel and a sub-pixel position pixel
- FIG. 7 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- Figure 8 is a schematic diagram of a search starting point
- FIG. 9 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- FIG. 10 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- FIG. 11 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application.
- FIG. 12 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application.
- FIG. 13 is a schematic block diagram of a video encoder according to an embodiment of the present application.
- FIG. 14 is a schematic block diagram of a video decoder according to an embodiment of the present application.
- FIG. 15 is a schematic block diagram of a video transmission system according to an embodiment of the present application.
- FIG. 16 is a schematic block diagram of a video codec apparatus according to an embodiment of the present application.
- FIG. 17 is a schematic block diagram of a video codec system according to an embodiment of the present application.
- the image prediction method in the present application can be applied to the field of video codec technology.
- the video codec is first introduced below.
- a video generally consists of a number of frame images in a certain order.
- For example, a single frame of image often contains many regions with the same or similar spatial structure; that is, a video file contains a large amount of spatial redundancy.
- A video file also contains a large amount of temporal redundancy, which results from the way video is composed.
- For example, the frame rate of video sampling is generally 25 frames/second to 60 frames/second, that is, the sampling interval between adjacent frames is 1/60 second to 1/25 second. Within such a short interval, the sampled images contain a large amount of similar information, and the images are strongly correlated.
- Visual redundancy refers to appropriately compressing the video bit stream by exploiting the fact that the human eye is sensitive to changes in luminance and relatively less sensitive to changes in chrominance.
- In high-brightness regions, the sensitivity of human vision to brightness changes tends to decrease, and the eye is more sensitive to the edges of objects; in addition, the human eye is relatively insensitive to the interior of regions and more sensitive to overall structure. Since video images are ultimately viewed by people, these characteristics of the human eye can be fully exploited to compress the original video image and achieve a better compression effect.
- Video image information also contains other kinds of redundancy, such as information entropy redundancy, structural redundancy, knowledge redundancy, and importance redundancy.
- the purpose of video coding (also referred to as video compression coding) is to use various technical methods to remove redundant information in a video sequence to reduce storage space and save transmission bandwidth.
- Chroma sampling: this method makes full use of the psychovisual characteristics of the human eye and tries, starting from the underlying data representation, to minimize the amount of data used to describe a single element.
- YUV luminance-chrominance-chrominance
- the YUV color space includes a luminance signal Y and two color difference signals U and V, and the three components are independent of each other.
- the YUV color space is more flexible in representation, and the transmission occupies less bandwidth, which is superior to the traditional red, green and blue (RGB) color model.
- The YUV 4:2:0 format indicates that the two chrominance components U and V each have only half as many samples as the luminance component Y in both the horizontal direction and the vertical direction; that is, for every four sampled pixels there are four luminance samples Y but only one U sample and one V sample.
- Represented in this way, the amount of data is further reduced to about 50% of the original (six samples instead of twelve for every four pixels, relative to unsubsampled 4:4:4 data). Chroma sampling thus makes full use of the physiological visual characteristics of the human eye, and achieving video compression through chroma sampling is one of the widely used video data compression methods.
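The sample layout of the 4:2:0 format can be made concrete with a small count. The following is a minimal sketch, assuming even frame dimensions; the function name `yuv420_plane_sizes` is hypothetical and used only for illustration.

```python
def yuv420_plane_sizes(width, height):
    """Return the (Y, U, V) sample counts of one 4:2:0 frame."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    luma = width * height
    # Each chroma plane is halved horizontally and vertically.
    chroma = (width // 2) * (height // 2)
    return luma, chroma, chroma


y, u, v = yuv420_plane_sizes(1920, 1080)
print(y, u, v)                  # 2073600 518400 518400
print((y + u + v) / (3 * y))    # 0.5, relative to three full-resolution planes
```

Every 2x2 group of pixels therefore carries four Y samples and one sample each of U and V.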
- Predictive coding uses the data information of the previously encoded frame to predict the frame currently to be encoded.
- a predicted value is obtained by prediction, which is not completely equivalent to the actual value, and there is a certain residual value between the predicted value and the actual value.
- The more accurate the prediction, the closer the predicted value is to the actual value and the smaller the residual value, so encoding the residual value greatly reduces the amount of data. When decoding, the decoding end adds the residual value to the predicted value to restore and reconstruct the matching image. This is the basic idea of predictive coding.
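The relationship between actual value, predicted value, and residual can be sketched as follows. This is an illustrative example only: the helper names and the sample values are hypothetical, and the residual is left unquantized, so the reconstruction here is exact.

```python
def residual_block(actual, predicted):
    """Encoder side: residual = actual - predicted, per pixel."""
    return [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(actual, predicted)]


def reconstruct_block(predicted, residual):
    """Decoder side: reconstruction = predicted + residual, per pixel."""
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(predicted, residual)]


actual = [[100, 102], [101, 99]]
predicted = [[101, 101], [100, 100]]

res = residual_block(actual, predicted)
print(res)                                # [[-1, 1], [1, -1]]
print(reconstruct_block(predicted, res))  # [[100, 102], [101, 99]]
```

The residual values are much smaller in magnitude than the pixel values themselves, which is what makes them cheaper to encode.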
- predictive coding is divided into two basic types: intra prediction and inter prediction.
- Intra prediction predicts the pixel values of the pixels in the current coding unit by using the pixel values of the pixels in the reconstructed region of the current image.
- Inter prediction searches a reconstructed image for a reference block that matches the current coding unit in the current image, uses the pixel values of the pixels in the reference block as the prediction information or predicted values of the pixel values of the pixels in the current coding unit, and transmits the motion information of the current coding unit.
- Transform coding: this coding method does not directly encode the original spatial domain information. Instead, it converts the information sample values from the current domain into another, artificially defined domain (commonly called the transform domain) according to some form of transform function, and then performs compression coding according to the distribution characteristics of the information in the transform domain. Video image data tends to be highly correlated in the spatial domain and therefore contains a large amount of redundant information; encoding it directly would require a large number of bits. After the information sample values are converted into the transform domain, the correlation of the data is greatly reduced, so less redundant information remains and the amount of data required for encoding drops accordingly, which yields a high compression ratio and a better compression effect.
- Typical transform coding methods include the Karhunen-Loève (K-L) transform, the Fourier transform, and the like.
- Quantization coding: the transform coding described above does not itself compress the data; it is the quantization process that effectively compresses the data.
- the quantization process is also the main reason for the loss of data in the lossy compression.
- Quantization is the process of forcibly mapping input values with a large dynamic range onto a smaller set of output values. Because the quantizer input covers a large range, many bits are needed to represent it, whereas the output after this forced mapping covers a small range and can be represented with only a few bits.
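The mapping of a large input range onto a few output levels can be sketched with a uniform scalar quantizer. This is a minimal illustration, not the quantizer of any particular codec; the step size of 8 and the function names are assumptions made for the example.

```python
def quantize(value, step):
    """Map an integer value to a quantization level (round to nearest)."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + step // 2) // step)


def dequantize(level, step):
    """Map a quantization level back to an approximate value."""
    return level * step


step = 8  # hypothetical step size
for value in (37, -37, 3):
    level = quantize(value, step)
    print(value, level, dequantize(level, step))
# 37 5 40
# -37 -5 -40
# 3 0 0
```

The difference between the input and the dequantized value (for example 37 versus 40) is the information lost by quantization.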
- the encoder control module selects the coding mode adopted by the image block according to the local characteristics of different image blocks in the video frame.
- The intra-prediction coded block is subjected to frequency domain or spatial domain prediction, and the inter-prediction coded block is subjected to motion compensation prediction; the prediction residual is then transformed and quantized to form residual coefficients, and the final code stream is generated by the entropy encoder.
- the intra or inter prediction reference signals are obtained by the decoding module at the encoding end.
- the transformed and quantized residual coefficients are reconstructed by inverse quantization and inverse transform, and then added to the predicted reference signal to obtain a reconstructed image.
- the loop filtering performs pixel correction on the reconstructed image to improve the encoding quality of the reconstructed image.
- Figure 1 is a schematic diagram of a video encoding process.
- When performing prediction on the current image block in the current frame Fn, either intra prediction or inter prediction may be used. Specifically, intra coding or inter coding can be selected according to the type of the current frame Fn; for example, intra prediction is used when the current frame Fn is an I frame, and inter prediction is used when the current frame Fn is a P frame or a B frame.
- When intra prediction is adopted, the pixel values of the pixels of the current image block may be predicted by using the pixel values of the pixels of the reconstructed area in the current frame Fn; when inter prediction is adopted, the pixel values of the pixels of the current image block may be predicted by using the pixel values of the pixels of a reference block in the reference frame F'n-1 that matches the current image block.
- The pixel values of the pixels of the current image block are differenced with the pixel values of the pixels of the prediction block to obtain residual information, and the residual information is transformed, quantized, and entropy coded to obtain the encoded code stream.
- The residual information of the current frame Fn and the prediction information of the current frame Fn are superimposed, and a filtering operation is performed to obtain the reconstructed frame F'n of the current frame, which is used as a reference frame for subsequent encoding.
- FIG. 2 is a schematic diagram of a video decoding process.
- The video decoding process shown in FIG. 2 is equivalent to the inverse of the video encoding process shown in FIG. 1.
- During decoding, the residual information is obtained by entropy decoding, inverse quantization, and inverse transform, and whether the current image block uses intra prediction or inter prediction is determined from the decoded code stream. If intra prediction is used, the prediction information is constructed according to the intra prediction method from the pixel values of the pixels in the reconstructed region of the current frame; if inter prediction is used, the motion information needs to be parsed, the parsed motion information is used to determine the reference block in the reconstructed image, and the pixel values of the pixels in the reference block are used as the prediction information.
- the prediction information is superimposed with the residual information, and the reconstruction information is obtained through the filtering operation.
- FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- the method shown in FIG. 3 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 3 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 3 can occur in the interframe prediction process at the time of encoding and decoding.
- The method shown in FIG. 3 includes steps 101 to 106, which are described in detail below.
- the image block here may be one image block in the image to be processed, or may be one sub-image in the image to be processed.
- the image block herein may be an image block to be encoded in the encoding process, or may be an image block to be decoded in the decoding process.
- The foregoing predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to the reference image block (usually a motion vector of a neighboring block), and information about the image in which the reference image block is located (generally understood as reference image information), wherein the motion vector comprises a forward motion vector and/or a backward motion vector, and the reference image information comprises reference frame index information of a forward prediction reference image block and/or a backward prediction reference image block.
- The predicted motion information of the image block may be obtained in either mode 1 or mode 2 below.
- A candidate predicted motion information list is constructed according to the motion information of the neighboring blocks of the current image block, and one piece of candidate predicted motion information is selected from the list as the predicted motion information of the current image block.
- the candidate predicted motion information list includes a motion vector, reference frame index information of a reference image block, and the like.
- the motion information of the neighboring block A0 is selected as the predicted motion information of the current image block.
- The forward motion vector of A0 is used as the forward predicted motion vector of the current block, and the backward motion vector of A0 is used as the backward predicted motion vector of the current block.
- A motion vector predictor list is constructed according to the motion information of the neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as the motion vector predictor of the current image block.
- In this case, the motion vector of the current image block may be the motion vector of the selected neighboring block, or may be the sum of the motion vector of the selected neighboring block and the motion vector difference of the current image block, where the motion vector difference is the difference between the motion vector obtained by motion estimation of the current image block and the motion vector of the selected neighboring block.
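The relation between the motion vector, the selected neighboring block's motion vector (the predictor), and the motion vector difference can be sketched as follows. The function names and the sample vectors are hypothetical; motion vectors are represented as plain (x, y) integer tuples for illustration.

```python
def motion_vector_difference(mv, mvp):
    """Encoder side: difference between the estimated MV and the predictor."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def motion_vector_from_predictor(mvp, mvd):
    """Decoder side: MV = predictor + transmitted difference."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


mv_estimated = (13, -7)   # obtained by motion estimation (illustrative)
mv_neighbor = (12, -4)    # motion vector of the selected neighboring block

mvd = motion_vector_difference(mv_estimated, mv_neighbor)
print(mvd)                                             # (1, -3)
print(motion_vector_from_predictor(mv_neighbor, mvd))  # (13, -7)
```

When the difference is zero, the motion vector of the current image block is simply the motion vector of the selected neighboring block.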
- the motion vectors corresponding to indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block.
- The gain of the interpolation filter is greater than 1. Because of this, the bit width of the pixel values of the first prediction block and the second prediction block obtained in the reference image is higher than the bit width of the finally obtained predicted values of the pixel values of the image block (pixel values with a higher bit width can also be regarded as having higher precision).
- the above reference image is a reference image of an image block, or the reference image is a reference image of an image to be processed in which the image block is located.
- the first prediction block and the second prediction block may be specifically determined according to the first motion vector and the second motion vector included in the prediction motion information.
- the position in the reference image of the image to be processed may be specifically determined according to the first motion vector and the second motion vector included in the prediction motion information.
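- When the position pointed to by the motion vector is a sub-pixel position (for example, a 1/2 pixel position), the pixel values at integer pixel positions in the reference image need to be interpolated with the interpolation filter to obtain the pixel value at the sub-pixel position, which is used as the pixel value of the prediction block.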
- the bit width of the pixel value of the prediction block is made higher than the bit width of the reconstructed pixel value of the finally obtained image block due to the interpolation filter gain.
- the reconstructed pixel value of the image block herein may be the pixel value of the reconstructed block obtained by reconstructing the image block.
- For example, if the bit width of the pixel values of the reference image is 8 bits and the interpolation filter gain is 6 bits, then the pixel values of the first prediction block and the second prediction block obtained according to the predicted motion information and by the interpolation filter have a bit width of 14 bits.
- Similarly, if the bit width of the pixel values of the reference image is 10 bits and the interpolation filter gain is 6 bits, then the pixel values of the first prediction block and the second prediction block obtained according to the predicted motion information and by the interpolation filter have a bit width of 16 bits.
- A shift operation may also be performed after the interpolation operation. For example, if the pixel values of the reference image have a bit width of 10 bits and the interpolation filter gain is 6 bits, then to keep the bit width of the pixel values of the prediction block at 14 bits, the pixel values obtained after the interpolation operation are shifted right by 2 bits.
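The bit-width arithmetic in these examples can be checked with a few lines. This is a small sketch; the function name and the default of 14 kept bits are taken from the example in the text and are otherwise assumptions.

```python
def shift_after_interpolation(ref_bits, gain_bits, kept_bits=14):
    """Return (raw_bits, right_shift).

    raw_bits is the bit width directly after interpolation, and
    right_shift is the shift needed to bring it down to kept_bits.
    """
    raw_bits = ref_bits + gain_bits
    return raw_bits, max(0, raw_bits - kept_bits)


print(shift_after_interpolation(8, 6))    # (14, 0)
print(shift_after_interpolation(10, 6))   # (16, 2)
```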
- $A_{i,j}$ is a pixel at an integer pixel position, and its bit width is bitDepth.
- $a_{0,0}$, $b_{0,0}$, $c_{0,0}$, $d_{0,0}$, $h_{0,0}$, $n_{0,0}$, $e_{0,0}$, $i_{0,0}$, $p_{0,0}$, $f_{0,0}$, $j_{0,0}$, $q_{0,0}$, $g_{0,0}$, $k_{0,0}$, and $r_{0,0}$ are pixels at sub-pixel positions. If an 8-tap interpolation filter is used, $a_{0,0}$ can be calculated by the following formula:
- $a_{0,0} = (C_0 A_{-3,0} + C_1 A_{-2,0} + C_2 A_{-1,0} + C_3 A_{0,0} + C_4 A_{1,0} + C_5 A_{2,0} + C_6 A_{3,0} + C_7 A_{4,0}) \gg \mathrm{shift1}$
- the bitDepth is the target bit width, and the target bit width is the bit width of the reconstructed pixel value of the image block.
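The 8-tap formula can be sketched in code as follows. The tap values below are an illustrative symmetric set that sums to 64 (a gain of 6 bits); they are an assumption, not coefficients stated in this text. The choice shift1 = bitDepth - 8 is likewise an assumption, picked so that the result keeps 14 bits as in the 10-bit example above.

```python
# Illustrative 8-tap coefficients C0..C7 (assumed, sum = 64).
TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def interpolate_8tap(row, x, taps, bit_depth):
    """Compute a sub-pixel sample from row[x-3] .. row[x+4].

    row holds integer-position samples A[i,0]; x is the index of A[0,0].
    """
    shift1 = bit_depth - 8  # assumed rule, see lead-in
    acc = sum(c * row[x - 3 + k] for k, c in enumerate(taps))
    return acc >> shift1


# A flat 8-bit row: the output is the input scaled by the filter gain (64).
print(interpolate_8tap([100] * 8, 3, TAPS, 8))    # 6400
# A flat 10-bit row: the extra 2 bits are removed by shift1.
print(interpolate_8tap([400] * 8, 3, TAPS, 10))   # 6400
```

Both results fit in 14 bits, which matches the intermediate bit width used in the examples of the text.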
- The first prediction block and the second prediction block may be acquired in the reference image according to the motion vectors included in the predicted motion information.
- Obtaining an initial prediction block according to the first prediction block and the second prediction block may refer to obtaining a pixel value of the initial prediction block according to the pixel value of the first prediction block and the pixel value of the second prediction block.
- the obtaining the initial prediction block according to the first prediction block and the second prediction block specifically includes: obtaining a pixel value of the initial prediction block according to the pixel value of the first prediction block and the pixel value of the second prediction block.
- The bit widths of the pixel values of the initial prediction block, the first prediction block, and the second prediction block are all the same; that is, the pixel values of any two of the initial prediction block, the first prediction block, and the second prediction block have the same bit width.
- The bit width of the pixel values of the first prediction block and the second prediction block is greater than the target bit width, and similarly, the bit width of the pixel values of the initial prediction block and the target prediction block is also greater than the target bit width.
- When the predicted value of the pixel value of the image block is finally determined, the bit width is adjusted to the target bit width, so that the bit width of the finally obtained predicted value of the pixel value of the image block is the target bit width.
- For example, if the target bit width is 10 bits, the bit width of the pixel values of the first prediction block and the second prediction block is 14 bits, and the bit width of the pixel values of the initial prediction block and the target prediction block is also 14 bits, then when the predicted value of the pixel value of the image block is determined according to the pixel values of the target prediction block, the bit width of the pixel values is reduced from 14 bits to 10 bits.
- In this way, high bit width pixel values are used in the intermediate process and are converted to the target bit width only when the predicted value of the pixel value of the image block is finally obtained, which improves the accuracy of image prediction.
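The final conversion from the intermediate bit width to the target bit width can be sketched as a right shift. The rounding offset and the clipping to the valid range are assumptions added for the example; the text itself only states that the bit width is reduced, for example from 14 bits to 10 bits.

```python
def to_target_bit_width(value, high_bits=14, target_bits=10):
    """Reduce one sample from high_bits to target_bits.

    Rounding to nearest and clipping are assumptions of this sketch.
    """
    shift = high_bits - target_bits
    if shift > 0:
        value = (value + (1 << (shift - 1))) >> shift
    return max(0, min((1 << target_bits) - 1, value))


print(to_target_bit_width(6400))    # 400
print(to_target_bit_width(16383))   # 1023 (clipped to the 10-bit maximum)
```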
- When the initial prediction block is obtained, the pixel values of the first prediction block and the pixel values of the second prediction block may be weighted, the weighted pixel values may be shifted, and the shifted pixel values may be taken as the pixel values of the initial prediction block, so that the bit width of the pixel values of the initial prediction block is the same as the bit width of the pixel values of the first prediction block and the second prediction block.
- For example, if the bit width of the pixel values of the first prediction block and the second prediction block is 14 bits, then after the pixel values of the first prediction block and the second prediction block are weighted, the bit width of the resulting pixel values is also kept at 14 bits.
- when the initial prediction block is obtained according to the first prediction block and the second prediction block, the initial prediction block may also be obtained in manners other than weighting, which is not limited in this application.
- the weighting coefficients of the pixel values of different prediction blocks may be different or the same; when the weighting coefficients are the same, the weighting is equivalent to averaging the pixel values of the different prediction blocks.
- one prediction block may also be obtained from the reference image and directly determined as the initial prediction block.
- when the reference image is a forward reference image, the prediction block obtained from the forward reference image may be directly determined as the initial prediction block, and when the reference image is a backward reference image, the prediction block obtained from the backward reference image may be directly determined as the initial prediction block.
- a plurality of prediction blocks may also be acquired from the reference image, and the initial prediction block is then determined according to the plurality of prediction blocks, so that the pixel value of the initial prediction block is equal to the pixel value obtained by weighting the pixel values of the plurality of prediction blocks.
- M is a preset value, and M is an integer greater than 1.
- M may be set in advance before the image is predicted, and the value of M may be set according to the accuracy of the image prediction and the complexity of the search prediction block. It should be understood that each of the above M prediction blocks has the same bit width as the pixel value of the initial prediction block.
- the bit width of the pixel value of the target prediction block is the same as the bit width of the pixel value of the initial prediction block.
- the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block may be determined as the target prediction block.
- sum of absolute differences (SAD)
- sum of absolute transformed differences (SATD)
- sum of squared differences
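The selection of the target prediction block by smallest difference can be sketched as follows. This is a minimal illustration that uses SAD as the difference measure; the function names `sad` and `select_target_block` and the sample values are hypothetical and do not come from the application.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def select_target_block(candidates, initial_block):
    """Return the index of the candidate whose SAD to the initial block is smallest."""
    return min(range(len(candidates)), key=lambda i: sad(candidates[i], initial_block))


# Illustrative 2x2 blocks holding 14-bit sample values (made-up numbers).
initial = [[8000, 8100], [8200, 8300]]
candidates = [
    [[8010, 8090], [8190, 8320]],
    [[8001, 8101], [8199, 8300]],
    [[7900, 8100], [8200, 8400]],
]
best = select_target_block(candidates, initial)
```

SATD or the sum of squared differences could be substituted for `sad` without changing the selection logic.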
- when searching for a target prediction block of the image block in the reference image, the search (also referred to as a motion search) may be performed in an integer-pixel step or in a sub-pixel step, and in either case the starting point of the search may be an integer pixel or a sub-pixel, for example an integer pixel, 1/2 pixel, 1/4 pixel, 1/8 pixel, or 1/16 pixel.
- an integer-pixel step search means that the step size of each search is one whole pixel or an integer multiple of a whole pixel when searching for a prediction block.
- a sub-pixel step search means that the step size of each search is less than a whole pixel when searching for a prediction block.
- the search step size can be, for example, 1/2 pixel, 1/4 pixel, 1/8 pixel, or 1/16 pixel.
- the step size may be determined by the sub-pixel position currently pointed to; for example, if the current motion vector points to a 1/2 pixel position, the search may be performed in steps of 1/2 pixel.
- alternatively, the sub-pixel step search may be performed in a preset sub-pixel step size.
- the pixel value of the target prediction block may be directly limited so that the bit width of the limited pixel value reaches the target bit width of the image prediction; the predicted value of the pixel value of the image block is then obtained according to the limited pixel value, or the limited pixel value is directly determined as the predicted value of the pixel value of the image block.
- the bit width of the predicted value of the pixel value of the obtained image block is the target bit width.
- for example, if the pixel value of the obtained target prediction block has a bit width of 14 bits and the target bit width of the image prediction is 10 bits, the pixel value of the target prediction block is limited (or shifted) so that its bit width changes from 14 bits to 10 bits, and the limited pixel value is then used as the predicted value of the pixel value of the image block, whose bit width thus becomes 10 bits.
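The 14-bit to 10-bit reduction described above might look like the sketch below. It assumes unsigned sample values, a right shift with a rounding offset, and a clip to the 10-bit range; the function name `reduce_bit_width` and the rounding offset are assumptions for illustration, since the text only says the value is "limited (or shifted)".

```python
def reduce_bit_width(block, src_bits=14, dst_bits=10):
    """Shift samples from src_bits down to dst_bits with rounding, then clip to range."""
    shift = src_bits - dst_bits
    offset = 1 << (shift - 1) if shift > 0 else 0   # assumed rounding offset
    max_val = (1 << dst_bits) - 1
    return [
        [min(max((v + offset) >> shift, 0), max_val) for v in row]
        for row in block
    ]


target_block_14bit = [[0, 16383], [8192, 8]]
predicted_10bit = reduce_bit_width(target_block_14bit)
```

The clip matters for values near the top of the range: rounding 16383 up before the shift would otherwise produce 1024, which does not fit in 10 bits.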
- the bit width of the pixel values of the first prediction block and the second prediction block is greater than the bit width of the reconstructed pixel value of the finally obtained image block, and the bit widths of the pixel values of the first prediction block, the second prediction block, the initial prediction block, and the target prediction block are the same, so that the bit width of the pixel value of the finally obtained target prediction block is also greater than the bit width of the reconstructed pixel value of the image block.
- the predicted value of the pixel value of the image block can therefore be determined directly from the pixel value of the target prediction block having the higher bit width, without having to obtain a prediction block with high bit width pixel values by a further motion compensation operation before determining the predicted value; this saves the motion compensation operation and reduces the complexity of image prediction.
- the reference image may include only the forward reference image, or only the backward reference image, or both the forward reference image and the backward reference image.
- the first prediction block and the second prediction block corresponding to the image block are obtained in the reference image by using the interpolation filter according to the prediction motion information, and specifically include the following three cases.
- the first prediction block and the second prediction block are respectively obtained in the forward reference image and the backward reference image by the interpolation filter according to the predicted motion information.
- by acquiring different prediction blocks in the forward reference image and/or the backward reference image, the initial prediction block can be determined according to the different prediction blocks; compared with directly using a prediction block searched in the forward reference image or the backward reference image as the initial prediction block, this determines the initial prediction block more accurately.
- the search is performed in the reference image according to the predicted motion information, and the M prediction blocks corresponding to the image block are obtained, which specifically includes the following two cases:
- the search is performed in the backward reference image according to the predicted motion information, and M prediction blocks corresponding to the image block are obtained.
- the above M prediction blocks are searched either from the forward reference image or from the backward reference image; then, when the target prediction block of the image block is determined according to the M prediction blocks corresponding to the image block and the initial prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block may be determined as the target prediction block.
- the specific process of obtaining the M prediction blocks corresponding to the image block is as follows:
- Determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block including:
- the target prediction block is determined according to the first target prediction block and the second target prediction block.
- performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: the reference image is a first direction reference image, and a search is performed in the first direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, according to the second motion vector, a second target prediction block corresponding to the image block in a second direction reference image; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- the first direction reference image and the second direction reference image are respectively a forward reference image and a backward reference image, or the first direction reference image and the second direction reference image are a backward reference image and a forward reference image, respectively.
- deriving the prediction block of the image block in the reference image of the other direction from the prediction block searched in the reference image of one direction saves a large number of search operations and reduces the complexity of image prediction; at the same time, because determining the target prediction block uses both the prediction block of the image block in the forward reference image and the prediction block of the image block in the backward reference image, the accuracy of image prediction is maintained while its complexity is reduced.
- the search may be performed only in the forward reference image to obtain a forward target prediction block; the backward motion vector is then derived from the forward motion vector pointing to the forward target prediction block (e.g., using a mirror hypothesis method), and the backward target prediction block is determined based on the derived backward motion vector.
- the target prediction block is then determined based on the forward target prediction block and the backward target prediction block.
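One simple reading of the mirror hypothesis is sketched below. It assumes that the forward and backward reference images are at equal temporal distance from the current image, so the backward motion vector is the negated forward motion vector; the application only names the method, so this rule and the function name `mirror_motion_vector` are assumptions.

```python
def mirror_motion_vector(forward_mv):
    """Derive a backward motion vector by mirroring the forward one.

    Assumes equal temporal distance to both reference images; unequal
    distances would need an additional scaling step not shown here.
    """
    mvx, mvy = forward_mv
    return (-mvx, -mvy)


forward_mv = (5, -3)            # illustrative motion vector components
backward_mv = mirror_motion_vector(forward_mv)
```

Because the backward vector is computed rather than searched, no second search pass over the backward reference image is needed.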
- the method shown in FIG. 3 further includes: acquiring indication information from a code stream of the image block, where the indication information is used to indicate the predicted motion information of the acquired image block.
- the indication information is carried in any one of a sequence parameter set, an image parameter set or a slice header of the image block.
- the indication information can flexibly indicate whether the predicted motion information of the image block is acquired and the image is then predicted based on it. Specifically, the indication information indicates whether image prediction is performed using the method of the embodiments of the present application: if the indication information is obtained from the code stream, the image is predicted according to the method of the embodiments of the present application; if the indication information is not obtained from the code stream, the image may be predicted according to a conventional method. The indication information can thus flexibly indicate which method is used to predict the image.
- when the trigger information is carried in the sequence parameter set of the image to be processed, the trigger information may be represented in the form shown in Table 1.
- Table 1 (syntax element, descriptor): seq_parameter_set_rbsp() { ... sps_dmvr_precision_flag u(1) ... }
- seq_parameter_set_rbsp() represents all parameter information of an image sequence
- sps_dmvr_precision_flag is used to indicate trigger information.
- the value of sps_dmvr_precision_flag can be obtained by decoding the code stream; when sps_dmvr_precision_flag is 0, the image may be predicted according to a conventional prediction method, and when sps_dmvr_precision_flag is 1, the image may be predicted according to the method of the present application.
- when the trigger information is carried in the image parameter set of the image to be processed, the trigger information may be represented in the form shown in Table 2.
- Table 2 (syntax element, descriptor): pic_parameter_set_rbsp() { ... pps_dmvr_precision_flag u(1) ... }
- pic_parameter_set_rbsp() represents all parameter information of an image
- pps_dmvr_precision_flag is used to indicate trigger information.
- the value of pps_dmvr_precision_flag can be obtained by decoding the code stream; when pps_dmvr_precision_flag is 0, the image may be predicted according to a conventional prediction method, and when pps_dmvr_precision_flag is 1, the image may be predicted according to the method of the present application.
- when the trigger information is carried in the slice header of the image to be processed, the trigger information may be represented in the form shown in Table 3.
- Table 3 (syntax element, descriptor): slice_segment_header() { ... slice_dmvr_precision_flag u(1) ... }
- slice_segment_header() represents all parameter information of a slice of an image
- slice_dmvr_precision_flag is used to indicate trigger information
- the value of slice_dmvr_precision_flag can be obtained by decoding the code stream; when slice_dmvr_precision_flag is 0, the image may be predicted according to a conventional prediction method, and when slice_dmvr_precision_flag is 1, the image may be predicted according to the method of the present application.
- the method shown in FIG. 3 further includes: determining a motion vector of the image block according to the motion vector of the image block pointing to the target prediction block.
- the motion vector of the target prediction block herein is the motion vector of the image block pointing to the target prediction block.
- determining the motion vector of the image block according to the motion vector of the target prediction block may specifically mean determining the motion vector of the target prediction block as the motion vector of the image block, that is, updating the motion vector of the image block, so that other image blocks can be effectively predicted based on this image block when the next image prediction is performed.
- the motion vector of the target prediction block may also be determined as the predicted value of the motion vector of the image block, and the motion vector of the image block is then obtained according to this predicted value.
- the method shown in FIG. 7 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 7 may occur in the encoding process or in the decoding process; specifically, it may occur in the inter prediction process during encoding or decoding.
- the method shown in FIG. 7 specifically includes steps 201 to 209, and step 201 to step 209 are described in detail below.
- the predicted motion information of the current image block may be determined according to the motion information of the adjacent image blocks of the current image block. Further, method 1 and method 2 in step 101 may be used to obtain the predicted motion information.
- the predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to the reference image block (usually the motion vector of a neighboring block), and image information of the reference image block (generally understood as reference image information), where the motion vector comprises a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
- a high bit width pixel value refers to a pixel value whose bit width is larger than the final target bit width of the image prediction.
- for example, if the pixel values of the forward prediction block and the backward prediction block in step 202 and step 203 have a bit width of 14 bits and the target bit width is 10 bits, then, because the bit widths of the pixel values of the forward prediction block and the backward prediction block are greater than the target bit width, these pixel values may be referred to as high bit width pixel values.
- the predicted motion information in step 201 may specifically include a forward motion vector and a backward motion vector, so that in step 202 the forward prediction block of the current image block may be acquired in the forward reference image by the motion compensation method according to the forward motion vector, and in step 203 the backward prediction block of the current image block may be acquired in the backward reference image by the motion compensation method according to the backward motion vector.
- the pixel values of the forward prediction block and the backward prediction block are not subjected to bit width shift and limit operations, so that the pixel values of the forward prediction block and the backward prediction block remain at the high bit width.
- the forward prediction block and the backward prediction block in step 204 are obtained in step 202 and step 203, respectively, and the pixel value of the forward prediction block and the pixel value of the backward prediction block are both high bit width pixel values.
- the pixel value of the forward prediction block and the pixel value of the backward prediction block may be weighted, and the obtained pixel value is determined as the pixel value of the initial prediction block (which may also be referred to as a matching prediction block). It should be understood that, after the weighting, the bit width shift and limit operations are not performed on the weighted pixel values, so that the pixel value of the obtained initial prediction block is also of high bit width.
- the pixel value of each pixel of the initial prediction block can be obtained according to formula (2).
- predSamples[x][y] = (predSamplesL0[x][y] + predSamplesL1[x][y] + 1) >> 1 (2)
- predSamplesL0 is the forward prediction block
- predSamplesL1 is the backward prediction block
- predSamples is the initial prediction block
- predSamplesL0[x][y] is the pixel value of the pixel point (x, y) in the forward prediction block
- predSamplesL1[x][y] is the pixel value of the pixel point (x, y) in the backward prediction block
- predSamples[x][y] is the pixel value of the pixel point (x, y) in the initial prediction block.
- the bit width of the pixel values of the initial prediction block obtained according to formula (2) is also 14 bits; that is, calculating the pixel values of the initial prediction block according to formula (2) keeps their bit width consistent with the bit width of the forward prediction block and the backward prediction block (both being the high bit width).
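Formula (2) can be written out as a short sketch to check that the averaged samples stay within 14 bits. The function name `initial_prediction_block` and the sample values are illustrative only and assume unsigned 14-bit inputs.

```python
def initial_prediction_block(pred_l0, pred_l1):
    """Apply formula (2): rounded average of forward and backward samples."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred_l0, pred_l1)
    ]


# Illustrative 14-bit samples, including the largest value 2**14 - 1.
pred_l0 = [[16383, 0], [8001, 8002]]
pred_l1 = [[16383, 1], [8000, 8002]]
pred = initial_prediction_block(pred_l0, pred_l1)
```

Since each sum is at most 2 * (2**14 - 1), the result after adding 1 and shifting right by one bit never exceeds 2**14 - 1, so the initial prediction block keeps the same bit width as its inputs.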
- the difference between the pixel value of each forward prediction block in the at least one forward prediction block and the pixel value of the initial prediction block may be determined first, and the forward prediction block among the at least one forward prediction block whose pixel values differ least from the pixel values of the initial prediction block is then determined as the optimal forward prediction block.
- 207. Search for at least one backward prediction block in the backward reference image, where the pixel values of each backward prediction block in the at least one backward prediction block are high bit width pixel values.
- when searching in the forward reference image, a search (also referred to as a motion search) may be performed in an integer-pixel step to obtain at least one forward prediction block.
- when searching in the backward reference image, a search (also referred to as a motion search) may likewise be performed in an integer-pixel step to obtain at least one backward prediction block.
- the search starting point can be either full pixels or sub-pixels, for example, integer pixels, 1/2 pixels, 1/4 pixels, 1/8 pixels, and 1/16 pixels, and the like.
- for example, when searching in an integer-pixel step, (0, 0) can be used as the search starting point to obtain one forward prediction block; the 8 pixel points around (0, 0) are then used as search points and the search is continued, yielding 8 more forward prediction blocks.
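The nine search positions in this example (the starting point plus its 8 neighbours) can be enumerated as in the sketch below. The function name `integer_search_points` and its `step` parameter are hypothetical, and the order in which the neighbours are visited is an arbitrary choice.

```python
def integer_search_points(center=(0, 0), step=1):
    """Return the starting point followed by its 8 neighbours at the given step."""
    cx, cy = center
    points = [center]
    points += [
        (cx + dx * step, cy + dy * step)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    ]
    return points


points = integer_search_points()
```

Each returned offset would yield one candidate forward prediction block, giving 9 candidates in total for a single search round.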
- the search may also be performed directly in a sub-pixel step, or both an integer-pixel step search and a sub-pixel step search may be performed.
- the search starting point can be either a full pixel or a sub-pixel.
- the distribution of the search starting point can also be as shown in FIG.
- the difference between the pixel value of each backward prediction block in the at least one backward prediction block and the pixel value of the initial prediction block may also be determined first, and the backward prediction block among the at least one backward prediction block whose pixel values differ least from the pixel values of the initial prediction block is then determined as the optimal backward prediction block.
- SAD, SATD, or the sum of squared differences may be used to measure the difference between the pixel values of different prediction blocks.
- the pixel value of the optimal forward prediction block and the pixel value of the optimal backward prediction block may be weighted. Because both are high bit width pixel values, the pixel value obtained after the weighting is still of high bit width; a bit width shift and a limit operation therefore need to be performed on the weighted pixel value, and the pixel value after the bit width shift and limit operation is determined as the predicted value of the pixel value of the current image block.
- the predicted value of the pixel value of the current image block can be obtained according to formula (3).
- predSamples'[x][y] = Clip3(0, (1 << bitDepth) - 1, (predSamplesL0'[x][y] + predSamplesL1'[x][y] + offset2) >> shift2) (3)
- predSamplesL0' is the optimal forward prediction block
- predSamplesL1' is the optimal backward prediction block
- predSamples' is the current image block
- predSamplesL0'[x][y] is the pixel value of the pixel point (x, y) in the optimal forward prediction block
- predSamplesL1'[x][y] is the pixel value of the pixel point (x, y) in the optimal backward prediction block
- predSamples'[x][y] is the predicted value of the pixel value of the pixel point (x, y) in the current image block, shift2 represents the bit width difference, and offset2 is equal to 1 << (shift2 - 1) and is used for rounding during the calculation.
- for example, if the bit width of the pixel value of the optimal forward prediction block is 14 bits, the bit width of the pixel value of the optimal backward prediction block is also 14 bits, and bitDepth is the target bit width, then shift2 in formula (3) is 15 - bitDepth.
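The final combination step can be sketched with the quantities defined above: shift2 = 15 - bitDepth and offset2 = 1 << (shift2 - 1). The sketch assumes that the limit operation is a clip to the range [0, 2^bitDepth - 1]; the helper names `clip3` and `final_prediction` and the sample values are hypothetical.

```python
def clip3(low, high, value):
    """Clamp value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def final_prediction(best_l0, best_l1, bit_depth=10):
    """Combine two 14-bit blocks into one block at the target bit width."""
    shift2 = 15 - bit_depth           # 14-bit inputs plus one bit from the sum
    offset2 = 1 << (shift2 - 1)       # rounding offset
    max_val = (1 << bit_depth) - 1    # assumed upper limit of the clip
    return [
        [clip3(0, max_val, (a + b + offset2) >> shift2) for a, b in zip(r0, r1)]
        for r0, r1 in zip(best_l0, best_l1)
    ]


best_l0 = [[8192, 16383], [0, 100]]
best_l1 = [[8192, 16383], [0, 60]]
pred_out = final_prediction(best_l0, best_l1)
```

With bitDepth equal to 10, shift2 is 5: one bit accounts for summing two blocks and four bits for the reduction from 14 to 10 bits.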
- the predicted value of the pixel value of the current image block may be obtained according to other methods, which is not limited in this application.
- the target bit width can be uniformly used for the initial prediction block and the target prediction block in the process of image prediction.
- FIG. 9 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- the method shown in FIG. 9 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 9 can occur both in the encoding process and in the decoding process; more specifically, it can occur in the inter prediction process during encoding or decoding.
- the method shown in FIG. 9 includes steps 301 to 306, and steps 301 to 306 are described in detail below.
- the above image block may be one image block in the image to be processed, or may be one sub image in the image to be processed.
- the predicted motion information includes indication information of a prediction direction (typically forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to the reference image block (usually the motion vector of a neighboring block), and image information of the reference image block (generally understood as reference image information), where the motion vector comprises a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
- the predicted motion information may be obtained by using method 1 and method 2 in step 101.
- the gain of the above interpolation filter is greater than 1.
- the above reference image is a reference image of an image block, or the reference image is a reference image of an image to be processed in which the image block is located.
- a target bit width when predicting an image which is the bit width to which the pixel reconstruction value of the image block is to be reached after the image prediction ends.
- the bit width of the pixel values of the first prediction block and the second prediction block obtained in the reference image is higher than the target bit width; performing a shift operation on the pixel values of the first prediction block and the second prediction block reduces their bit width to the target bit width.
- the pixel values of the initial prediction block, the first prediction block, and the second prediction block have the same bit width; that is, the bit widths of the pixel values of the initial prediction block, the first prediction block, and the second prediction block are all the target bit width.
- the pixel value of the first prediction block and the pixel value of the second prediction block may be weighted to obtain the pixel value of the initial prediction block.
- for example, if the bit width of the pixel values of the first prediction block and the second prediction block is 10 bits, then the bit width of the pixel values obtained after weighting the pixel values of the first prediction block and the second prediction block is also kept at 10 bits.
- the weighting process is only one way of obtaining the pixel value of the initial prediction block; other manners of obtaining the pixel value of the initial prediction block may also be used, which is not limited in this application.
- one prediction block may also be obtained from the reference image and directly determined as the initial prediction block.
- when the reference image is a forward reference image, the prediction block obtained from the forward reference image may be directly determined as the initial prediction block, and when the reference image is a backward reference image, the prediction block obtained from the backward reference image may be directly determined as the initial prediction block.
- M is a preset value, and M is an integer greater than 1.
- M may be a value set in advance before the image is predicted, and the value of M may be set according to the accuracy of the image prediction and the complexity of the search prediction block.
- each of the above M prediction blocks has the same bit width as the pixel value of the initial prediction block.
- the bit width of the pixel value of the target prediction block is the same as the bit width of the pixel value of the initial prediction block.
- the bit width of the pixel values of the first prediction block and the second prediction block is initially greater than the target bit width, but after the shift operation it becomes the target bit width, so the bit widths of the pixel values of the initial prediction block obtained from the first prediction block and the second prediction block and of the finally obtained target prediction block are also the target bit width.
- for example, if the target bit width is 10 bits and the bit width of the pixel values of the first prediction block and the second prediction block is 14 bits, the bit width of the first prediction block and the second prediction block after the shift operation is 10 bits, and the bit widths of the initial prediction block and the target prediction block obtained from them are also 10 bits; the shifted first prediction block and second prediction block, the initial prediction block, and the target prediction block thus all have the target bit width, which reduces the complexity of image prediction.
- the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block may be determined as the target prediction block.
- the difference between the pixel values of each prediction block and the pixel values of the initial prediction block may be measured by SAD, SATD, the sum of squared differences, or the like.
- when searching for a target prediction block of the image block in the reference image, the search (also referred to as a motion search) may be performed in an integer-pixel step or in a sub-pixel step; in either case, the starting point of the search may be an integer pixel or a sub-pixel, for example an integer pixel, 1/2 pixel, 1/4 pixel, 1/8 pixel, or 1/16 pixel.
- the pixel value of the target prediction block can be directly determined as the predicted value of the pixel value of the image block.
- because the bit widths of the pixel values of the initial prediction block and the target prediction block are both the target bit width, back-and-forth conversion of pixel values between different bit widths is reduced during image prediction. In addition, the predicted value of the pixel value of the image block is determined according to the target prediction block whose pixel value bit width is the target bit width, so it is no longer necessary to perform motion compensation to obtain a prediction block with high bit width pixel values before determining the predicted value; this saves the motion compensation operation, simplifies the process of image prediction, and reduces its complexity.
- the reference image is a forward reference image or a backward reference image.
- When the reference image is only one of the forward reference image and the backward reference image, the prediction blocks need to be searched for in only one type of reference image, which reduces the complexity of the search.
- Two forward prediction blocks may be acquired in the forward reference image by the interpolation filter according to the predicted motion information, and the two forward prediction blocks are taken as the first prediction block and the second prediction block, respectively.
- Two backward prediction blocks may be acquired in the backward reference image by the interpolation filter according to the predicted motion information, and the two backward prediction blocks are taken as the first prediction block and the second prediction block, respectively.
- In step 302, a forward prediction block and a backward prediction block may be acquired in the forward reference image and the backward reference image, respectively, by the interpolation filter according to the predicted motion information, and the forward prediction block and the backward prediction block are taken as the first prediction block and the second prediction block, respectively.
- The order in which the forward prediction block and the backward prediction block are acquired is not limited: they may be acquired at the same time, the forward prediction block may be acquired before the backward prediction block, or the backward prediction block may be acquired before the forward prediction block.
- By acquiring different prediction blocks in the forward reference image and/or the backward reference image, the initial prediction block can be determined from those different prediction blocks.
- Compared with directly using a prediction block searched in the forward reference image or the backward reference image as the initial prediction block, determining the initial prediction block from different prediction blocks is more accurate.
- Step 304 and step 305 specifically include: searching in the forward reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block; and determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- Step 304 and step 305 specifically include: searching in the backward reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block; and determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- When the reference image includes both the forward reference image and the backward reference image, the accuracy of the image prediction can be improved.
- step 305 and step 306 may specifically include steps 1 to 6, and step 1 to step 6 are described in detail below.
- Step 1: Search in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block;
- Step 2: Search in the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block;
- Step 3: Determine a target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block, including:
- Step 4: Determine, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block;
- Step 5: Determine, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block;
- Step 6: Determine the target prediction block according to the first target prediction block and the second target prediction block.
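- The steps above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, SAD is assumed as the difference measure, and a rounded average is assumed for combining the two target prediction blocks in step 6, since the steps themselves do not fix the combination rule.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def best_block(candidates, initial_block):
    """Candidate whose pixel values differ least (by SAD) from the initial block."""
    return min(candidates, key=lambda block: sad(block, initial_block))


def combine(first_target, second_target):
    """Assumed combination rule: rounded average of the two target blocks."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_target, second_target)
    ]


def bidirectional_target(forward_candidates, backward_candidates, initial_block):
    first_target = best_block(forward_candidates, initial_block)    # step 4
    second_target = best_block(backward_candidates, initial_block)  # step 5
    return combine(first_target, second_target)                     # step 6


# Illustrative data: A = 2 forward candidates, B = 2 backward candidates.
initial = [[100, 100]]
forward = [[[90, 90]], [[101, 99]]]
backward = [[[100, 102]], [[120, 120]]]
target = bidirectional_target(forward, backward, initial)
```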
- Searching in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: when the reference image is a first direction reference image, searching in the first direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block includes: determining, as a first target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, according to the second motion vector, a second target prediction block corresponding to the image block in the second direction reference image; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- the first direction reference image and the second direction reference image are respectively a forward reference image and a backward reference image, or the first direction reference image and the second direction reference image are a backward reference image and a forward reference image, respectively.
- Deriving the prediction block of the image block in the reference image of the other direction from the prediction block found by searching in the reference image of one direction saves a large number of search operations and reduces the complexity of image prediction.
- At the same time, because the target prediction block is determined using both the prediction block of the image block in the forward reference image and the prediction block of the image block in the backward reference image, the accuracy of image prediction is maintained while its complexity is reduced.
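- The following sketch illustrates deriving the second target prediction block from the first motion vector. The preset rule is not specified in this passage; the sketch assumes, purely for illustration, a mirroring rule in which the second motion vector is the negation of the first. The function names, the integer-pixel motion vectors, and the tiny reference image are all hypothetical, and no boundary handling is shown.

```python
def mirror_motion_vector(first_mv):
    """Hypothetical preset rule: the second vector points the opposite way."""
    dx, dy = first_mv
    return (-dx, -dy)


def fetch_block(image, top, left, height, width):
    """Copy a height x width block whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def derive_second_target(second_ref, block_pos, block_size, first_mv):
    """Locate the second target block without searching the second reference."""
    top, left = block_pos
    height, width = block_size
    dx, dy = mirror_motion_vector(first_mv)
    return fetch_block(second_ref, top + dy, left + dx, height, width)


# Illustrative 4x4 second-direction reference image (value = 10 * row + column).
second_ref = [[10 * r + c for c in range(4)] for r in range(4)]
second_target = derive_second_target(
    second_ref, block_pos=(1, 1), block_size=(2, 2), first_mv=(1, 0)
)
```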
- The method shown in FIG. 3 further includes: acquiring indication information from a code stream of the image block, where the indication information is used to indicate that the predicted motion information of the image block is to be acquired.
- the indication information is carried in any one of a sequence parameter set, an image parameter set or a slice header of the image block.
- The indication information can flexibly indicate whether to acquire the predicted motion information of the image block and then predict the image according to that predicted motion information. Specifically, the indication information can indicate whether image prediction is performed by using the method of the embodiments of the present application: if the indication information is obtained from the code stream, the image is predicted according to the method of the embodiments of the present application; if the indication information is not obtained from the code stream, the image may be predicted according to a conventional method. The indication information can thus flexibly indicate which method is used to predict the image.
- the specific expression forms of the trigger information may be respectively shown in Table 1 to Table 3.
- the method shown in FIG. 9 further includes: determining a motion vector of the image block according to the motion vector of the image block pointing to the target prediction block.
- the motion vector of the target prediction block herein is the motion vector of the image block pointing to the target prediction block.
- Determining the motion vector of the image block according to the motion vector of the target prediction block may specifically mean determining the motion vector of the target prediction block as the motion vector of the image block, that is, updating the motion vector of the image block, so that other image blocks can be effectively predicted from this image block in subsequent image prediction.
- Alternatively, the motion vector of the target prediction block may be determined as the predicted value of the motion vector of the image block, and the motion vector of the image block is then obtained according to that predicted value.
- the method shown in FIG. 10 can also be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- The method shown in FIG. 10 may occur in the encoding process or in the decoding process. Specifically, it may occur in the inter prediction process during encoding or decoding.
- the method shown in FIG. 10 specifically includes steps 401 to 409, and steps 401 to 409 are respectively described in detail below.
- the predicted motion information of the current image block may be determined according to motion information of adjacent image blocks of the current image block. Specifically, the predicted motion information may be acquired by using the first method and the second method in the step 101.
- the above predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to a reference image block (usually a motion vector of a neighboring block), and image information of a reference image block.
- The motion vector includes a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
- a motion search may be performed in the forward reference image according to the forward motion vector in the predicted motion information to obtain a forward prediction block of the current image block.
- A motion search may be performed in the backward reference image according to the backward motion vector in the predicted motion information to obtain a backward prediction block of the current image block.
- The target bit width in step 402 and step 403 may refer to the bit width of the finally obtained reconstructed pixel values of the image block; that is, the bit width of the pixel values of the forward prediction block and the backward prediction block obtained here is the same as the bit width of the finally obtained reconstructed pixel values of the image block.
- The bit width of the pixel values of the forward prediction block and the backward prediction block obtained directly by searching according to the motion vectors may be greater than the target bit width. In that case, a shift operation is performed on the searched forward prediction block and backward prediction block so that the bit width of their pixel values is reduced to the target bit width.
- For example, the bit width of the pixel values of the searched forward prediction block and backward prediction block may be shifted from 14 bits to 10 bits.
- forward prediction block and the backward prediction block in step 404 are obtained in steps 402 and 403, respectively.
- The pixel values of the forward prediction block and the pixel values of the backward prediction block may be weighted, and bit width shift and limit operations may then be performed on the weighted pixel values so that the resulting pixel values have the target bit width.
- the pixel value of each pixel of the initial predicted block can be obtained according to formula (4) when determining the pixel value of the initial prediction block.
- predSamples[x][y] = Clip3(0, (1 << bitDepth) - 1, (predSamplesL0[x][y] + predSamplesL1[x][y] + offset2) >> shift2) (4)
- where predSamplesL0 is the forward prediction block, predSamplesL1 is the backward prediction block, and predSamples is the initial prediction block;
- predSamplesL0[x][y] is the pixel value of the pixel (x, y) in the forward prediction block, predSamplesL1[x][y] is the pixel value of the pixel (x, y) in the backward prediction block, and predSamples[x][y] is the pixel value of the pixel (x, y) in the initial prediction block;
- shift2 represents the bit width difference, and offset2 is equal to 1 << (shift2 - 1) and is used for rounding in the calculation.
- The Clip3 function ensures that the final predicted pixel value stays within the range allowed by the bit width of the image prediction, and is defined by formula (5): Clip3(x, y, z) = x if z < x; y if z > y; z otherwise (5)
- shift2 may be set to 15 - bitDepth, where bitDepth is the target bit width, so that the initial prediction block can finally be obtained according to formula (4).
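- A worked sketch of the weighting, shift, and limit operations of formula (4) follows. It assumes, for illustration only, that the input samples are 14-bit intermediate values, for which shift2 = 15 - bitDepth brings the sum of two samples back to bitDepth bits; the function names and sample values are hypothetical.

```python
def clip3(low, high, value):
    """Limit value to the closed range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def initial_prediction(pred_l0, pred_l1, bit_depth):
    """Combine forward and backward blocks into the initial prediction block."""
    shift2 = 15 - bit_depth       # bit width difference, as stated in the text
    offset2 = 1 << (shift2 - 1)   # rounding offset
    max_value = (1 << bit_depth) - 1
    return [
        [
            clip3(0, max_value, (a + b + offset2) >> shift2)
            for a, b in zip(row_l0, row_l1)
        ]
        for row_l0, row_l1 in zip(pred_l0, pred_l1)
    ]


# Assumed 14-bit inputs; 8192 corresponds to 512 at 10 bits.
pred_l0 = [[8192, 16383]]
pred_l1 = [[8192, 16383]]
initial = initial_prediction(pred_l0, pred_l1, bit_depth=10)
```

- In this example the second sample would round to 1024 and is limited by Clip3 to the 10-bit maximum of 1023.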
- The difference between the pixel values of each of the at least one forward prediction block and the pixel values of the initial prediction block may be determined, and the prediction block among the at least one forward prediction block whose pixel values differ least from the pixel values of the initial prediction block is determined as the optimal forward prediction block.
- The difference between the pixel values of each of the at least one backward prediction block and the pixel values of the initial prediction block may be determined, and the prediction block among the at least one backward prediction block whose pixel values differ least from the pixel values of the initial prediction block is determined as the optimal backward prediction block.
- When searching in the forward reference image or the backward reference image, the search (also referred to as a motion search) may be performed in an integer-pixel step size to obtain at least one forward prediction block and at least one backward prediction block.
- The search starting point may be either an integer pixel or a sub-pixel position, for example an integer pixel, a 1/2 pixel, a 1/4 pixel, a 1/8 pixel, a 1/16 pixel, and the like.
- For example, when searching in an integer-pixel step size, one forward prediction block may be obtained with (0, 0) as the search starting point; the search may then continue with the 8 pixels surrounding (0, 0) as search points, which yields 8 more forward prediction blocks.
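- The integer-pixel search pattern in this example (the starting point plus its 8 surrounding pixels) can be enumerated as in the sketch below. The function name and the ordering of the points are hypothetical choices made for illustration.

```python
def integer_search_points(start=(0, 0)):
    """Return the starting point followed by its 8 integer-pixel neighbours."""
    sx, sy = start
    neighbours = [
        (sx + dx, sy + dy)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    ]
    return [start] + neighbours


points = integer_search_points((0, 0))
```

- Each of the 9 points corresponds to one candidate prediction block, so this pattern produces 9 forward prediction blocks in total.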
- The search may also be performed directly in a sub-pixel step size, or both an integer-pixel step search and a sub-pixel step search may be performed.
- High-bit-width pixel values may be used in the search process, so that the pixel values of the at least one prediction block obtained by the search are high-bit-width pixel values. Bit width shift and limit operations are then performed on the pixel values of the at least one prediction block so that they become pixel values of the target bit width.
- the pixel value of the forward prediction block obtained by the search may be subjected to a bit width shift and a limit operation according to the formula (6).
- predSamplesL0'[x][y] = Clip3(0, (1 << bitDepth) - 1, (predSamplesL0[x][y] + offset2) >> shift2) (6)
- where predSamplesL0 is the searched forward prediction block, and predSamplesL0' is the forward prediction block obtained after the bit width shift and limit operations are performed on predSamplesL0;
- predSamplesL0[x][y] is the pixel value of the pixel (x, y) in the searched forward prediction block, and predSamplesL0'[x][y] is the pixel value of the pixel (x, y) in the forward prediction block after the bit width shift and limit operations;
- shift2 indicates the bit width difference, and offset2 is equal to 1 << (shift2 - 1) and is used for rounding in the calculation.
- The bit width shift and limit operations on the backward prediction block may also be performed by using formula (6). In this case, predSamplesL0 represents the searched backward prediction block, and predSamplesL0' is the backward prediction block obtained after the bit width shift and limit operations are performed on predSamplesL0.
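- The bit width shift and limit operations of formula (6) can be sketched as follows. The sketch assumes that shift2 is the difference between a 14-bit source bit width and the target bit width (for example 14 bits to 10 bits, as mentioned earlier); the function names, the `source_bits` parameter, and the sample values are hypothetical.

```python
def clip3(low, high, value):
    """Limit value to the closed range [low, high]."""
    return low if value < low else high if value > high else value


def reduce_bit_width(block, bit_depth, source_bits=14):
    """Shift a high-bit-width block down to bit_depth bits with rounding."""
    shift2 = source_bits - bit_depth  # assumed meaning of "bit width difference"
    if shift2 < 1:
        raise ValueError("source_bits must exceed bit_depth")
    offset2 = 1 << (shift2 - 1)       # rounding offset
    max_value = (1 << bit_depth) - 1
    return [
        [clip3(0, max_value, (sample + offset2) >> shift2) for sample in row]
        for row in block
    ]


searched = [[8192, 8200, 16383]]  # illustrative 14-bit samples
reduced = reduce_bit_width(searched, bit_depth=10)
```

- The same function applies unchanged to a searched backward prediction block.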
- In steps 406 and 408, the difference between the pixel values of each forward prediction block and the pixel values of the matching prediction block, and the difference between the pixel values of each backward prediction block and the pixel values of the matching prediction block, may be measured by SAD, SATD, the sum of squared differences, or the like.
- However, the present application is not limited thereto, and other parameters that can describe the similarity between two prediction blocks may also be used.
- The pixel values of the optimal forward prediction block and the pixel values of the optimal backward prediction block obtained in steps 407 and 408 may be weighted, and the pixel values obtained by the weighting are used as the predicted values of the pixel values of the current image block.
- the predicted value of the pixel value of the current image block can be obtained according to formula (7).
- predSamples'[x][y] = (predSamplesL0'[x][y] + predSamplesL1'[x][y] + 1) >> 1 (7)
- where predSamplesL0' is the optimal forward prediction block, predSamplesL1' is the optimal backward prediction block, and predSamples' is the final prediction block of the current image block;
- predSamplesL0'[x][y] is the pixel value of the optimal forward prediction block at the pixel (x, y), predSamplesL1'[x][y] is the pixel value of the optimal backward prediction block at the pixel (x, y), and predSamples'[x][y] is the pixel value of the final prediction block at the pixel (x, y); Clip3() is a limit function.
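- A short worked example of formula (7) is given below. The function name and the sample values are hypothetical; both inputs are assumed to already have the target bit width, so the rounded average needs no further shift.

```python
def final_prediction(best_l0, best_l1):
    """Rounded average of the optimal forward and backward prediction blocks."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_l0, row_l1)]
        for row_l0, row_l1 in zip(best_l0, best_l1)
    ]


best_forward = [[512, 513], [0, 1023]]
best_backward = [[513, 513], [1, 1023]]
final_block = final_prediction(best_forward, best_backward)
```

- Because each input sample is at most (1 << bitDepth) - 1, the rounded average cannot exceed that maximum either.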
- Compared with the method shown in FIG. 7, in the method shown in FIG. 10 the pixel values of the forward prediction block, the backward prediction block, the initial prediction block, the at least one forward prediction block, the at least one backward prediction block, the optimal forward prediction block, and the optimal backward prediction block obtained in steps 402 to 408 are all pixel values of the target bit width, whereas the pixel values of the corresponding prediction blocks in steps 202 to 208 are all pixel values of high bit width.
- the method shown in Figure 7 guarantees the accuracy of image prediction, while the method shown in Figure 10 reduces the complexity of image prediction.
- The image prediction method of the embodiments of the present application has been described in detail above with reference to FIG. 3 to FIG. 10. It should be understood that the image prediction method of the embodiments of the present application may correspond to the inter prediction shown in FIG. 1 and FIG. 2, may be performed in the inter prediction process shown in FIG. 1 and FIG. 2, and may specifically be performed by an inter prediction module in an encoder or a decoder. In addition, the image prediction method of the embodiments of the present application can be implemented in any electronic device or apparatus that may need to encode and/or decode video images.
- the image prediction apparatus of the embodiment of the present application will be described in detail below with reference to FIGS. 11 and 12.
- The image prediction apparatus shown in FIG. 11 corresponds to the methods shown in FIG. 3 and FIG. 7 and can perform each of the steps shown in FIG. 3 and FIG. 7; the image prediction apparatus shown in FIG. 12 corresponds to the methods shown in FIG. 9 and FIG. 10 and can perform each of the steps shown in FIG. 9 and FIG. 10.
- For brevity, repeated description is appropriately omitted below.
- FIG. 11 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application.
- the apparatus 600 shown in Figure 11 includes:
- An obtaining module 601, configured to: acquire predicted motion information of an image block; and obtain, according to the predicted motion information and by using an interpolation filter, a first prediction block and a second prediction block corresponding to the image block, where the gain of the interpolation filter is greater than 1;
- a processing module 602, configured to: obtain an initial prediction block according to the first prediction block and the second prediction block, where the bit widths of the pixel values of the initial prediction block, the first prediction block, and the second prediction block are the same; search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, where M is a preset value and is an integer greater than 1; and determine a target prediction block of the image block according to the M prediction blocks and the initial prediction block, where the bit width of the pixel values of the target prediction block is the same as that of the initial prediction block;
- the prediction module 603 is configured to obtain a predicted value of a pixel value of the image block according to a pixel value of the target prediction block.
- Because the bit width of the pixel values of the first prediction block and the second prediction block is greater than the bit width of the finally obtained reconstructed pixel values of the image block, and the bit widths of the pixel values of the first prediction block, the second prediction block, the initial prediction block, and the target prediction block are the same, the bit width of the pixel values of the finally obtained target prediction block is also greater than the bit width of the reconstructed pixel values of the image block.
- The predicted value of the pixel value of the image block can therefore be determined directly from the pixel values of the target prediction block having the higher bit width, without first obtaining a prediction block with high-bit-width pixel values by motion compensation and only then determining the predicted value. This saves a motion compensation operation and reduces the complexity of image prediction.
- The obtaining module 601 is specifically configured to:
- when the reference image is a forward reference image, obtain the first prediction block and the second prediction block in the forward reference image by using an interpolation filter according to the predicted motion information; or
- when the reference image is a backward reference image, obtain the first prediction block and the second prediction block in the backward reference image by using an interpolation filter according to the predicted motion information; or
- when the reference image includes a forward reference image and a backward reference image, obtain the first prediction block and the second prediction block in the forward reference image and the backward reference image, respectively, by using an interpolation filter according to the predicted motion information.
- processing module 602 is specifically configured to:
- when the reference image is a forward reference image, search in the forward reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block; or
- when the reference image is a backward reference image, search in the backward reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block;
- Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block includes:
- determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- processing module 602 is specifically configured to:
- the reference image includes a forward reference image and a backward reference image, and performs a search in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block;
- Determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block including:
- the target prediction block is determined according to the first target prediction block and the second target prediction block.
- processing module 602 is specifically configured to:
- the reference image is a first direction reference image, and is searched in the first direction reference image according to the predicted motion information, to obtain M prediction blocks corresponding to the image block;
- Determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block including:
- determining, in the second direction reference image, a second target prediction block corresponding to the image block according to the second motion vector, where the first direction reference image and the second direction reference image are a forward reference image and a backward reference image, respectively, or a backward reference image and a forward reference image, respectively;
- Before acquiring the predicted motion information of the image block, the obtaining module 601 is further configured to obtain indication information from the code stream of the image block, where the indication information is used to indicate that the predicted motion information of the image block is to be acquired, and the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block.
- the obtaining module 601 is further configured to:
- the processing module 602 is configured to obtain a motion vector of the image block according to a motion vector of the image block that is directed to the target prediction block, where a motion vector of the image block is used to predict other image blocks.
- the foregoing apparatus 600 may perform the foregoing method for image prediction shown in FIG. 3 and FIG. 7.
- the apparatus 600 may specifically be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
- the apparatus 600 can be used for both image prediction during encoding and image prediction during decoding.
- FIG. 12 is a schematic block diagram of an image prediction apparatus according to an embodiment of the present application.
- the apparatus 800 shown in Figure 12 includes:
- An obtaining module 801, configured to: acquire predicted motion information of an image block; and obtain, according to the predicted motion information and by using an interpolation filter, a first prediction block and a second prediction block corresponding to the image block, where the gain of the interpolation filter is greater than 1;
- a processing module 802, configured to: perform a shift operation on the pixel values of the first prediction block and the second prediction block so that the bit width of the pixel values of the first prediction block and the second prediction block is reduced to a target bit width, where the target bit width is the bit width of the reconstructed pixel values of the image block; obtain an initial prediction block according to the first prediction block and the second prediction block, where the bit widths of the pixel values of the initial prediction block, the first prediction block, and the second prediction block are the same; search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, where M is a preset value and is an integer greater than 1; and determine a target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block, where the bit width of the pixel values of the target prediction block is the same as that of the initial prediction block;
- the prediction module 803 is configured to obtain a predicted value of a pixel value of the image block according to a pixel value of the target prediction block.
- Because the bit widths of the pixel values of the initial prediction block and the target prediction block are both the target bit width, back-and-forth conversion of pixel values between different bit widths is reduced during image prediction. In addition, the predicted value of the pixel value of the image block is determined from the target prediction block whose pixel values have the target bit width, instead of performing motion compensation again to obtain a prediction block with high-bit-width pixel values and only then determining the predicted value.
- This saves a motion compensation operation, simplifies the image prediction process, and reduces the complexity of image prediction.
- The obtaining module 801 is specifically configured to:
- when the reference image is a forward reference image, obtain the first prediction block and the second prediction block in the forward reference image by using an interpolation filter according to the predicted motion information; or
- when the reference image is a backward reference image, obtain the first prediction block and the second prediction block in the backward reference image by using an interpolation filter according to the predicted motion information; or
- when the reference image includes a forward reference image and a backward reference image, obtain the first prediction block and the second prediction block in the forward reference image and the backward reference image, respectively, by using an interpolation filter according to the predicted motion information.
- processing module 802 is specifically configured to:
- when the reference image is a forward reference image, search in the forward reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block; or
- when the reference image is a backward reference image, search in the backward reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block;
- Determining the target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block includes:
- determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- processing module 802 is specifically configured to:
- the reference image includes a forward reference image and a backward reference image, and performs a search in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block;
- Determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block including:
- the target prediction block is determined according to the first target prediction block and the second target prediction block.
- processing module 802 is specifically configured to:
- the reference image is a first direction reference image, and a search is performed in the first direction reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block;
- Determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block corresponding to the image block including:
- determining, in the second direction reference image, a second target prediction block corresponding to the image block according to the second motion vector, wherein the first direction reference image and the second direction reference image are respectively a forward reference image and a backward reference image, or the first direction reference image and the second direction reference image are respectively a backward reference image and a forward reference image;
- before acquiring the predicted motion information of the image block, the acquiring module 801 is further configured to obtain indication information from the code stream of the image block, where the indication information is used to indicate that the predicted motion information of the image block is to be acquired, and the indication information is carried in any one of a sequence parameter set, an image parameter set, or a slice header of the image block.
- the obtaining module 801 is further configured to:
- the processing module 802 is configured to obtain a motion vector of the image block according to the motion vector of the image block that points to the target prediction block, where the motion vector of the image block is used to predict other image blocks.
- the foregoing apparatus 800 may perform the foregoing method for image prediction shown in FIG. 9 and FIG. 10, and the apparatus 800 may specifically be a video encoding apparatus, a video decoding apparatus, a video codec system, or other device having a video codec function.
- the apparatus 800 can be used for both image prediction during encoding and image prediction during decoding.
- the present application further provides a terminal device, the terminal device including: a memory for storing a program; and a processor configured to execute the program stored in the memory, where, when the program is executed, the processor performs the image prediction method of the embodiment of the present application.
- the terminal devices here may be video display devices, smart phones, portable computers, and other devices that can process video or play video.
- the present application also provides a video encoder, including a non-volatile storage medium, and a central processing unit, the non-volatile storage medium storing an executable program, the central processing unit and the non-volatile storage The medium is connected, and the executable program is executed to implement the image prediction method of the embodiment of the present application.
- the present application also provides a video decoder, including a non-volatile storage medium, and a central processing unit, the non-volatile storage medium storing an executable program, the central processing unit and the non-volatile storage The medium is connected, and the executable program is executed to implement the image prediction method of the embodiment of the present application.
- the present application also provides a video encoding system including a non-volatile storage medium, and a central processing unit, the non-volatile storage medium storing an executable program, the central processing unit and the non-volatile storage The medium is connected, and the executable program is executed to implement the image prediction method of the embodiment of the present application.
- the application further provides a computer readable medium storing program code for device execution, the program code comprising instructions for performing an image prediction method of an embodiment of the present application.
- the present application further provides a decoder, which includes an image prediction device (for example, device 600 or device 800) in the embodiment of the present application and a reconstruction module, wherein the reconstruction module is configured to obtain a reconstructed pixel value of the image block according to the predicted value of the pixel value of the image block obtained by the image prediction device.
- the present application further provides an encoder, which includes an image prediction device (for example, device 600 or device 800) in the embodiment of the present application and a reconstruction module, wherein the reconstruction module is configured to obtain a reconstructed pixel value of the image block according to the predicted value of the pixel value of the image block obtained by the image prediction device.
- FIG. 13 is a schematic block diagram of a video encoder according to an embodiment of the present application.
- the video encoder 1000 shown in FIG. 13 includes an encoding end prediction module 1001, a transform quantization module 1002, an entropy encoding module 1003, an encoding reconstruction module 1004, and an encoding end filtering module.
- the video encoder 1000 shown in FIG. 13 can encode a video. Specifically, the video encoder 1000 can perform the video encoding process shown in FIG. 1 to implement encoding of a video. In addition, the video encoder 1000 may also perform the image prediction method of the embodiment of the present application, and the video encoder 1000 may perform the respective steps of the image prediction methods illustrated in FIGS. 3, 7, 9, and 10.
- the image prediction apparatus in the embodiment of the present application may also be the encoding end prediction module 1001 in the video encoder 1000. Specifically, the apparatus 600 and the apparatus 800 shown in FIG. 11 and FIG. 12 are equivalent to the encoding end prediction module 1001 in the video encoder 1000.
- the video decoder 2000 shown in FIG. 14 includes an entropy decoding module 2001, an inverse transform inverse quantization module 2002, a decoding end prediction module 2003, a decoding reconstruction module 2004, and a decoding end filtering module 2005.
- the video decoder 2000 shown in FIG. 14 can decode a video. Specifically, the video decoder 2000 can perform the video decoding process shown in FIG. 2 to implement decoding of the video. In addition, the video decoder 2000 may also perform the image prediction method of the embodiment of the present application, and the video decoder 2000 may perform the respective steps of the image prediction methods illustrated in FIGS. 3, 7, 9, and 10.
- the image prediction apparatus in the embodiment of the present application may also be the decoding end prediction module 2003 in the video decoder 2000. Specifically, the apparatus 600 and the apparatus 800 shown in FIG. 11 and FIG. 12 are equivalent to the decoding end prediction module 2003 in the video decoder 2000.
- the application scenario of the image prediction method in the embodiment of the present application is described below with reference to FIG. 15 to FIG. 17.
- the image prediction method in the embodiment of the present application may be performed by the video transmission system, the codec device, and the codec system shown in FIG. 15 to FIG. 17.
- FIG. 15 is a schematic block diagram of a video transmission system according to an embodiment of the present application.
- the video transmission system includes an acquisition module 3001, an encoding module 3002, a transmitting module 3003, a network transmission 3004, a receiving module 3005, a decoding module 3006, a rendering module 3007, and a display module 208.
- each module in the video transmission system is as follows:
- the acquisition module 3001 includes a camera or a camera group for collecting video images, and performing pre-encoding processing on the collected video images to convert the optical signals into digitized video sequences;
- the encoding module 3002 is configured to encode the video sequence to obtain a code stream
- the sending module 3003 is configured to send the coded code stream.
- the receiving module 3005 is configured to receive the code stream sent by the sending module 3003.
- the network 3004 is configured to transmit the code stream sent by the sending module 3003 to the receiving module 3005;
- the decoding module 3006 is configured to decode the code stream received by the receiving module 3005 to reconstruct a video sequence.
- the rendering module 3007 is configured to render the reconstructed video sequence decoded by the decoding module 3006 to improve the display effect of the video.
- the video transmission system shown in FIG. 15 can perform the image prediction method in the embodiment of the present application.
- the encoding module 3002 and the decoding module 3006 in the video transmission system shown in FIG. 15 can perform the image prediction method in the embodiment of the present application.
- the acquisition module 3001, the encoding module 3002, and the transmitting module 3003 in the video transmission system shown in FIG. 15 correspond to the video encoder 1000 shown in FIG. 13.
- the receiving module 3005, the decoding module 3006, and the rendering module 3007 in the video transmission system shown in FIG. 15 correspond to the video decoder 2000 shown in FIG. 14.
- a codec device and a codec system composed of codec devices are described in detail below with reference to FIGS. 16 and 17. It should be understood that the codec device and the codec system shown in FIGS. 16 and 17 are capable of executing the image prediction method of the embodiment of the present application.
- FIG. 16 is a schematic diagram of a video codec apparatus according to an embodiment of the present application.
- the video codec device 50 may be a device dedicated to encoding and/or decoding video images, or may be an electronic device having a video codec function. Further, the codec device 50 may be a mobile terminal or user equipment of a wireless communication system.
- Codec device 50 may include the following modules or units: controller 56, codec 54, radio interface 52, antenna 44, smart card 46, card reader 48, keypad 34, memory 58, infrared port 42, display 32.
- the codec device 50 may also include a microphone or any suitable audio input module, which may be a digital or analog signal input, and the codec device 50 may also include an audio output.
- the audio output module can be a headset, a speaker or an analog audio or digital audio output connection.
- the codec device 50 may also include a battery, which may be a solar cell, a fuel cell, or the like.
- the codec device 50 may also include an infrared port for short-range line-of-sight communication with other devices, and the codec device 50 may also communicate with other devices using any suitable short-range communication method, for example, a Bluetooth wireless connection, USB / Firewire wired connection.
- the memory 58 can store data in the form of images and data in the form of audio, as well as instructions for execution on the controller 56.
- Codec 54 may implement encoding and decoding of audio and/or video data, or assist in the encoding and decoding of audio and/or video data under the control of controller 56.
- the smart card 46 and the card reader 48 can provide user information as well as authentication information for network authentication and for authorizing the user.
- the specific implementation form of the smart card 46 and the card reader 48 may be a Universal Integrated Circuit Card (UICC) and a UICC reader.
- the radio interface circuit 52 can generate a wireless communication signal, which can be a communication signal generated during a cellular communication network, a wireless communication system, or a wireless local area network communication.
- the antenna 44 is used to transmit radio frequency signals generated by the radio interface circuit 52 to other devices (the number of devices may be one or more), and may also be used to receive radio frequency signals from other devices (the number of devices may be one or more).
- codec device 50 may receive video image data to be processed from another device prior to transmission and/or storage. In still other embodiments of the present application, the codec device 50 may receive images over a wireless or wired connection and encode/decode the received images.
- FIG. 17 is a schematic block diagram of a video codec system 7000 according to an embodiment of the present application.
- the video codec system 7000 includes a source device 4000 and a destination device 5000.
- the source device 4000 generates encoded video data
- the source device 4000 may also be referred to as a video encoding apparatus or a video encoding device.
- the destination device 5000 may decode the encoded video data generated by the source device 4000
- the destination device 5000 may also be referred to as a video decoding apparatus or a video decoding device.
- the specific implementation form of the source device 4000 and the destination device 5000 may be any one of the following devices: a desktop computer, a mobile computing device, a notebook (e.g., laptop) computer, a tablet computer, a set-top box, a smartphone, a handset, a television, a camera, a display device, a digital media player, a video game console, an on-board computer, or another similar device.
- Destination device 5000 can receive video data encoded by source device 4000 via channel 6000.
- Channel 6000 can include one or more media and/or devices capable of moving encoded video data from source device 4000 to destination device 5000.
- channel 6000 can include one or more communication media that enable source device 4000 to transmit encoded video data directly to destination device 5000 in real time, in which case source device 4000 can modulate the encoded video data according to a communication standard (for example, a wireless communication protocol) and transmit the modulated video data to the destination device 5000.
- the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- the one or more communication media described above may form part of a packet-based network (e.g., a local area network, a wide area network, or a global network such as the Internet).
- the one or more communication media described above may include a router, a switch, a base station, or other device that enables communication from the source device 4000 to the destination device 5000.
- channel 6000 can include a storage medium that stores encoded video data generated by source device 4000.
- destination device 5000 can access the storage medium via disk access or card access.
- the storage medium may include a variety of locally accessible data storage media, such as a Blu-ray Disc, a High Density Digital Video Disc (DVD), a Compact Disc Read-Only Memory (CD-ROM), flash memory, or another suitable digital storage medium for storing encoded video data.
- channel 6000 can include a file server or another intermediate storage device that stores encoded video data generated by source device 4000.
- destination device 5000 can access the encoded video data stored at a file server or other intermediate storage device via streaming or download.
- the file server may be a server type capable of storing encoded video data and transmitting the encoded video data to the destination device 5000.
- the file server may include a World Wide Web (Web) server (for example, for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, or a local disk drive.
- Destination device 5000 can access the encoded video data via a standard data connection (e.g., an internet connection).
- the instance type of the data connection includes a wireless channel, a wired connection (e.g., a cable modem, etc.), or a combination of both, suitable for accessing the encoded video data stored on the file server.
- the transmission of the encoded video data from the file server may be streaming, downloading, or a combination of both.
- the image prediction method of the present application is not limited to a wireless application scenario.
- the image prediction method of the present application can be applied to video coding and decoding in support of a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
- video codec system 7000 can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
- the source device 4000 includes a video source 4001, a video encoder 4002, and an output interface 4003.
- output interface 4003 can include a modulator/demodulator (modem) and/or a transmitter.
- Video source 4001 can include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of the above video data sources.
- Video encoder 4002 can encode video data from video source 4001.
- source device 4000 transmits the encoded video data directly to destination device 5000 via output interface 4003.
- the encoded video data may also be stored on a storage medium or file server for later access by the destination device 5000 for decoding and/or playback.
- destination device 5000 includes an input interface 5003, a video decoder 5002, and a display device 5001.
- input interface 5003 includes a receiver and/or a modem.
- the input interface 5003 can receive the encoded video data via the channel 6000.
- Display device 5001 may be integrated with destination device 5000 or may be external to destination device 5000. Generally, the display device 5001 displays the decoded video data.
- Display device 5001 can include a variety of display devices, such as liquid crystal displays, plasma displays, organic light emitting diode displays, or other types of display devices.
- the video encoder 4002 and the video decoder 5002 can operate according to a video compression standard (for example, the High Efficiency Video Coding standard, H.265), and can follow the High Efficiency Video Coding (HEVC) test model (HM).
- a textual description of the H.265 standard, ITU-T H.265 (V3) (04/2015), was published on April 29, 2015 and is available for download from http://handle.itu.int/11.1002/7000/12455. The entire contents of the document are incorporated herein by reference.
- the disclosed systems, devices, and methods may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the unit is only a logical function division.
- there may be another division manner in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
- the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present application.
- the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
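The present application provides an image prediction method and apparatus. The method includes: acquiring predicted motion information of an image block; obtaining, in a reference image by means of an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block; obtaining an initial prediction block according to the first prediction block and the second prediction block; performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, where M is a preset value and is an integer greater than 1; determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block, where the pixel values of the target prediction block and of the initial prediction block have the same bit width; and obtaining a predicted value of the pixel values of the image block according to the pixel values of the target prediction block. The present application can reduce the complexity of image prediction.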
Description
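This application claims priority to Chinese Patent Application No. 201711086618.4, filed with the Chinese Patent Office on November 7, 2017 and entitled "Image Prediction Method and Apparatus", which is incorporated herein by reference in its entirety.
The present application relates to the field of video coding and decoding technologies, and more specifically, to an image prediction method and apparatus.
When a video image is encoded or decoded, the pixel values of the image need to be predicted in order to reduce redundancy in the transmitted data. However, conventional image prediction methods involve many steps and have high complexity. How to reduce the complexity of image prediction is therefore a problem to be solved.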
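Summary
To reduce the complexity of image prediction, the present application proposes an image prediction method and apparatus that unify the precision of the various prediction blocks during prediction and determine the predicted value of the pixel values of an image block from the finally obtained target prediction block, thereby simplifying image prediction. Specifically, a bit width larger than the bit width of the reconstructed pixel values of the image block may be used during prediction to improve the accuracy of the finally obtained predicted value of the pixel values of the image block. Alternatively, the same bit width as that of the reconstructed pixel values of the image block may be used during prediction to further reduce the complexity of image prediction.
According to a first aspect, an image prediction method is provided. The method includes: acquiring predicted motion information of an image block; obtaining, in a reference image by means of an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block, where the gain of the interpolation filter is greater than 1; obtaining an initial prediction block according to the first prediction block and the second prediction block, where the pixel values of the initial prediction block, the first prediction block, and the second prediction block all have the same bit width; performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, where M is a preset value and is an integer greater than 1; determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block, where the pixel values of the target prediction block and of the initial prediction block have the same bit width; and obtaining a predicted value of the pixel values of the image block according to the pixel values of the target prediction block.
Optionally, the predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to a reference image block (for example, the motion vector of a neighboring block), and information about the image in which the reference image block is located (usually understood as reference image information), where the motion vector includes a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
In the present application, because the gain of the interpolation filter is greater than 1, the bit width of the pixel values of the first prediction block and the second prediction block is greater than the bit width of the finally obtained reconstructed pixel values of the image block. In addition, because the pixel values of the first prediction block, the second prediction block, the initial prediction block, and the target prediction block all have the same bit width, the bit width of the pixel values of the finally obtained target prediction block is also greater than the bit width of the reconstructed pixel values of the image block. The predicted value of the pixel values of the image block can therefore be determined directly from the pixel values of the target prediction block, which already has the higher bit width, without first performing motion compensation to obtain a prediction block with high-bit-width pixel values. This saves the motion compensation operation and reduces the complexity of image prediction.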
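As a rough illustration of why a filter gain greater than 1 raises the bit width, the sketch below applies an unnormalized 8-tap filter to 8-bit samples. The tap values are those of the HEVC half-sample luma filter and are used only as an assumed example; this section does not specify the filter, and the function names are hypothetical.

```python
# Assumed example taps (HEVC half-sample luma filter); their sum, 64, is the filter gain.
HALF_PEL_TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]


def filter_gain(taps):
    """Gain of an FIR filter for a constant input: the sum of its taps."""
    return sum(taps)


def interpolate_row(samples, taps=HALF_PEL_TAPS):
    """Apply the taps without normalization, so outputs keep the extra precision."""
    n = len(taps)
    return [
        sum(t * samples[i + k] for k, t in enumerate(taps))
        for i in range(len(samples) - n + 1)
    ]


flat_8bit = [255] * 8                # brightest flat region of 8-bit samples
out = interpolate_row(flat_8bit)     # a single interpolated sample
bits_needed = out[0].bit_length()    # 255 * 64 = 16320, which needs 14 bits
```

With a gain of 64 (6 extra bits), 8-bit inputs in a flat region yield values that need 14 bits, which is the kind of higher intermediate bit width the paragraph above refers to.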
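Obtaining the initial prediction block according to the first prediction block and the second prediction block may mean obtaining the pixel values of the initial prediction block according to the pixel values of the first prediction block and the pixel values of the second prediction block.
With reference to the first aspect, in some implementations of the first aspect, the bit width of the pixel values of each of the M prediction blocks is the same as the bit width of the pixel values of the initial prediction block.
With reference to the first aspect, in some implementations of the first aspect, the obtaining an initial prediction block according to the first prediction block and the second prediction block includes: performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
It should be understood that performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block specifically means that, after the weighting processing, the resulting pixel values are used as the pixel values of the initial prediction block, and the bit width of the pixel values of the initial prediction block is kept consistent with that of the first prediction block and the second prediction block.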
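The weighting step can be sketched as a rounded weighted average, which keeps the result in the same value range (and hence bit width) as its inputs. This is a minimal sketch: the function name, the default equal weights, and the round-to-nearest rule are assumptions, not details given in this section.

```python
def weighted_initial_block(block0, block1, w0=1, w1=1):
    """Rounded weighted average of two equally sized blocks (lists of rows).

    Dividing by the total weight keeps the output within the range of the
    inputs, so the bit width of the initial block matches that of the inputs.
    """
    total = w0 + w1
    return [
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(block0, block1)
    ]


# Two 14-bit-range prediction blocks averaged into the initial prediction block.
initial = weighted_initial_block([[6400, 6464]], [[6401, 6400]])
```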
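With reference to the first aspect, in some implementations of the first aspect, the obtaining, in a reference image by means of an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block includes: when the reference image is a forward reference image, obtaining the first prediction block and the second prediction block in the forward reference image by means of the interpolation filter according to the predicted motion information; or, when the reference image is a backward reference image, obtaining the first prediction block and the second prediction block in the backward reference image by means of the interpolation filter according to the predicted motion information; or, when the reference image includes a forward reference image and a backward reference image, obtaining the first prediction block and the second prediction block in the forward reference image and the backward reference image, respectively, by means of the interpolation filter according to the predicted motion information.
Obtaining different prediction blocks from the forward reference image and/or the backward reference image allows the initial prediction block to be determined from those different prediction blocks. Compared with directly using a prediction block found by searching the forward reference image or the backward reference image as the initial prediction block, this determines the initial prediction block more accurately.
With reference to the first aspect, in some implementations of the first aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: when the reference image is a forward reference image, performing a search in the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or, when the reference image is a backward reference image, performing a search in the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
Searching in only one reference image (the forward reference image or the backward reference image) to obtain the M prediction blocks reduces the complexity of searching for prediction blocks. In addition, comparing the pixel values of each of the M prediction blocks with those of the initial prediction block yields a prediction block that is closer to the image block, which improves the image prediction result.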
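The selection of the target prediction block among the M candidates can be sketched as follows. The text only requires the "smallest difference" in pixel values; using the sum of absolute differences (SAD) as the difference measure is an assumption, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def select_target_block(candidates, initial_block):
    """Return the candidate whose pixel values differ least from the initial block."""
    return min(candidates, key=lambda cand: sad(cand, initial_block))


initial_block = [[10, 20], [30, 40]]
candidates = [                      # M = 3 searched prediction blocks
    [[0, 0], [0, 0]],
    [[11, 19], [30, 42]],
    [[50, 50], [50, 50]],
]
target = select_target_block(candidates, initial_block)
```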
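With reference to the first aspect, in some implementations of the first aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: when the reference image includes a forward reference image and a backward reference image, performing a search in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block, and performing a search in the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, where A and B are both integers greater than 0 and A+B=M. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
Searching separately in the forward reference image and the backward reference image allows the final target block to be determined jointly from the prediction blocks found in both. Because both the forward reference image and the backward reference image are taken into account when the prediction block is obtained, the finally obtained target prediction block is closer to the image block, which improves the image prediction result.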
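A minimal sketch of the bidirectional case is given below: the best match is chosen separately among the A forward candidates and the B backward candidates, and the two results are then combined. The SAD difference measure and the rounded average used for the final combination are assumptions, since the text leaves both unspecified.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def best_match(candidates, initial_block):
    """Candidate with the smallest SAD against the initial prediction block."""
    return min(candidates, key=lambda cand: sad(cand, initial_block))


def bidirectional_target(forward_cands, backward_cands, initial_block):
    """Pick one target per direction, then average them (assumed combination rule)."""
    first_target = best_match(forward_cands, initial_block)
    second_target = best_match(backward_cands, initial_block)
    target = [
        [(a + b + 1) >> 1 for a, b in zip(row_f, row_b)]
        for row_f, row_b in zip(first_target, second_target)
    ]
    return first_target, second_target, target


initial_block = [[100, 100]]
forward_cands = [[[90, 90]], [[101, 99]]]       # A = 2
backward_cands = [[[104, 100]], [[120, 120]]]   # B = 2, so M = A + B = 4
first, second, target = bidirectional_target(forward_cands, backward_cands, initial_block)
```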
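With reference to the first aspect, in some implementations of the first aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: when the reference image is a first direction reference image, performing a search in the first direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, in a second direction reference image according to the second motion vector, a second target prediction block corresponding to the image block, where the first direction reference image and the second direction reference image are respectively a forward reference image and a backward reference image, or are respectively a backward reference image and a forward reference image; and determining the target prediction block according to the first target prediction block and the second target prediction block.
Deriving the prediction block of the image block in the reference image of one direction from the prediction block found by searching the reference image of the other direction saves a large number of search operations and reduces the complexity of image prediction. At the same time, because both the prediction block of the image block in the forward reference image and the prediction block in the backward reference image are used when the target prediction block is determined, the accuracy of image prediction is maintained while its complexity is reduced.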
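This section does not state what the "preset rule" for deriving the second motion vector is. Purely as an assumed illustration, the sketch below mirrors the first motion vector into the opposite direction and scales it by the ratio of temporal distances; the function name and parameters are hypothetical.

```python
def derive_second_mv(first_mv, dist_first, dist_second):
    """Mirror first_mv into the opposite direction, scaled by temporal distance.

    dist_first and dist_second are the positive picture distances from the
    current picture to the first- and second-direction reference pictures.
    Note that Python's round() rounds exact halves to the nearest even integer.
    """
    mvx, mvy = first_mv
    return (
        round(-mvx * dist_second / dist_first),
        round(-mvy * dist_second / dist_first),
    )


# Equal distances give a pure mirror; unequal distances also scale the vector.
mirrored = derive_second_mv((4, -2), dist_first=1, dist_second=1)
scaled = derive_second_mv((4, -2), dist_first=2, dist_second=1)
```

Under this assumed rule, no search is needed in the second direction: the second target prediction block is simply the block that the derived vector points to.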
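With reference to the first aspect, in some implementations of the first aspect, the obtaining, in a reference image by means of an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block includes: obtaining, in a first reference image by means of the interpolation filter according to the predicted motion information, the first prediction block corresponding to the image block; and obtaining, in a second reference image by means of the interpolation filter according to the predicted motion information, the second prediction block corresponding to the image block, where the first reference image is a reference image in a first reference image list, the second reference image is a reference image in a second reference image list, and the first reference image list and the second reference image list are different reference image lists used when the image block is predicted.
With reference to the first aspect, in some implementations of the first aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: performing a search in a first reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and performing a search in a second reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, where the first reference image is a reference image in a first reference image list, the second reference image is a reference image in a second reference image list, the first reference image list and the second reference image list are different reference image lists used when the image block is predicted, A and B are both integers greater than 0, and A+B=M. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
The first reference image and the second reference image may each be either a forward reference image or a backward reference image. Specifically, the following cases are possible: the first reference image and the second reference image are both forward reference images; the first reference image and the second reference image are both backward reference images; or the first reference image is a forward reference image and the second reference image is a backward reference image.
In addition, the first reference image may be one reference image or multiple reference images, and likewise the second reference image may be one reference image or multiple reference images.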
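With reference to the first aspect, in some implementations of the first aspect, before the acquiring predicted motion information of an image block, the method further includes: acquiring indication information from the bitstream of the image block, where the indication information is used to indicate that the predicted motion information of the image block is to be acquired, and the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block.
The indication information makes it possible to flexibly indicate whether to acquire the predicted motion information of the image block and then predict the image according to the predicted motion information and so on. Specifically, the indication information can indicate whether the method of the embodiments of the present application is used for image prediction: when the indication information is obtained from the bitstream, the image is predicted according to the method of the embodiments of the present application; if the indication information is not obtained from the bitstream, the image may be predicted according to a conventional method. The indication information thus flexibly indicates which method is used to predict the image.
With reference to the first aspect, in some implementations of the first aspect, before the obtaining an initial prediction block according to the first prediction block and the second prediction block, the method further includes: acquiring indication information from the bitstream of the image block, where the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block. The obtaining an initial prediction block according to the first prediction block and the second prediction block includes: when the flag bit of the indication information takes a first value, obtaining the initial prediction block according to the first prediction block and the second prediction block.
With reference to the first aspect, in some implementations of the first aspect, before the performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block, the method further includes: acquiring indication information from the bitstream of the image block, where the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block. The performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block includes: when the flag bit of the indication information takes a first value, performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
The flag bit of the indication information may take a first value or a second value. When the flag bit takes the first value, it may indicate that the image block is predicted according to the prediction method of the present application; when the flag bit takes the second value, it may indicate that the image block is predicted according to a conventional prediction method. The first value and the second value may be 1 and 0, respectively, or may be 0 and 1, respectively.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: obtaining the motion vector of the image block pointing to the target prediction block; and obtaining the motion vector of the image block according to the motion vector of the image block pointing to the target prediction block, where the motion vector of the image block is used to predict other image blocks.
Determining the motion vector of the image block according to the motion vector pointing to the target prediction block may specifically mean directly using the motion vector of the target prediction block as the motion vector of the image block, that is, updating the motion vector of the image block, so that other image blocks can be effectively predicted from this image block in the next image prediction.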
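According to a second aspect, an image prediction method is provided. The method includes: acquiring predicted motion information of an image block; obtaining, in a reference image by means of an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block, where the gain of the interpolation filter is greater than 1; performing a shift operation on the pixel values of the first prediction block and the second prediction block so that the bit width of the pixel values of the first prediction block and the second prediction block is reduced to a target bit width, where the target bit width is the bit width of the reconstructed pixel values of the image block; obtaining an initial prediction block according to the first prediction block and the second prediction block, where the pixel values of the initial prediction block, the first prediction block, and the second prediction block all have the same bit width; performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, where M is a preset value and is an integer greater than 1; determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block, where the pixel values of the target prediction block and of the initial prediction block have the same bit width; and obtaining a predicted value of the pixel values of the image block according to the pixel values of the target prediction block.
Optionally, the predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to a reference image block (usually the motion vector of a neighboring block), and information about the image in which the reference image block is located (usually understood as reference image information), where the motion vector includes a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
In the present application, because the bit width of the pixel values of both the initial prediction block and the target prediction block is the target bit width, back-and-forth conversion of pixel values between different bit widths is reduced during image prediction, and the predicted value of the pixel values of the image block is determined from the target prediction block whose pixel values have the target bit width, without performing motion compensation to obtain a prediction block with high-bit-width pixel values. This saves the motion compensation operation, simplifies the image prediction procedure, and reduces the complexity of image prediction.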
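The shift operation of the second aspect can be sketched as a right shift from the interpolated bit width down to the target bit width. The 14-bit source width, the rounding offset, and the clipping to the valid range are assumptions added for illustration; the text itself only states that a shift reduces the bit width to that of the reconstructed pixel values.

```python
def reduce_bit_width(block, src_bits=14, target_bits=8):
    """Right-shift each sample from src_bits to target_bits.

    A rounding offset is added before the shift and the result is clipped to
    [0, 2**target_bits - 1]; both steps are assumptions, not taken from the text.
    """
    shift = src_bits - target_bits
    offset = 1 << (shift - 1) if shift > 0 else 0
    max_val = (1 << target_bits) - 1
    return [
        [min(max((v + offset) >> shift, 0), max_val) for v in row]
        for row in block
    ]


# 14-bit-range interpolated samples reduced to the 8-bit reconstruction range.
reduced = reduce_bit_width([[6400, 16320, -64, 20000]])
```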
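Obtaining the initial prediction block according to the first prediction block and the second prediction block may mean obtaining the pixel values of the initial prediction block according to the pixel values of the first prediction block and the pixel values of the second prediction block.
With reference to the second aspect, in some implementations of the second aspect, the bit width of the pixel values of each of the M prediction blocks is the same as the bit width of the pixel values of the initial prediction block.
With reference to the second aspect, in some implementations of the second aspect, the obtaining an initial prediction block according to the first prediction block and the second prediction block includes: performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
It should be understood that performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block specifically means that, after the weighting processing, the resulting pixel values are used as the pixel values of the initial prediction block, and the bit width of the pixel values of the initial prediction block is kept consistent with that of the first prediction block and the second prediction block.
With reference to the second aspect, in some implementations of the second aspect, the obtaining, in a reference image by means of an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block includes: when the reference image is a forward reference image, obtaining the first prediction block and the second prediction block in the forward reference image by means of the interpolation filter according to the predicted motion information; or, when the reference image is a backward reference image, obtaining the first prediction block and the second prediction block in the backward reference image by means of the interpolation filter according to the predicted motion information; or, when the reference image includes a forward reference image and a backward reference image, obtaining the first prediction block and the second prediction block in the forward reference image and the backward reference image, respectively, by means of the interpolation filter according to the predicted motion information.
Obtaining different prediction blocks from the forward reference image and/or the backward reference image allows the initial prediction block to be determined from those different prediction blocks. Compared with directly using a prediction block found by searching the forward reference image or the backward reference image as the initial prediction block, this determines the initial prediction block more accurately.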
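With reference to the second aspect, in some implementations of the second aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: when the reference image is a forward reference image, performing a search in the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or, when the reference image is a backward reference image, performing a search in the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
Searching in only one reference image (the forward reference image or the backward reference image) to obtain the M prediction blocks reduces the complexity of searching for prediction blocks. In addition, comparing the pixel values of each of the M prediction blocks with those of the initial prediction block yields a prediction block that is closer to the image block, which improves the image prediction result.
With reference to the second aspect, in some implementations of the second aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: when the reference image includes a forward reference image and a backward reference image, performing a search in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block, and performing a search in the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, where A and B are both integers greater than 0 and A+B=M. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
Searching separately in the forward reference image and the backward reference image allows the final target block to be determined jointly from the prediction blocks found in both. Because both the forward reference image and the backward reference image are taken into account when the prediction block is obtained, the finally obtained target prediction block is closer to the image block, which improves the image prediction result.
With reference to the second aspect, in some implementations of the second aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: when the reference image is a first direction reference image, performing a search in the first direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, in a second direction reference image according to the second motion vector, a second target prediction block corresponding to the image block, where the first direction reference image and the second direction reference image are respectively a forward reference image and a backward reference image, or are respectively a backward reference image and a forward reference image; and determining the target prediction block according to the first target prediction block and the second target prediction block.
Deriving the prediction block of the image block in the reference image of one direction from the prediction block found by searching the reference image of the other direction saves a large number of search operations and reduces the complexity of image prediction. At the same time, because both the prediction block of the image block in the forward reference image and the prediction block in the backward reference image are used when the target prediction block is determined, the accuracy of image prediction is maintained while its complexity is reduced.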
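With reference to the second aspect, in some implementations of the second aspect, the obtaining, in a reference image by means of an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block includes: obtaining, in a first reference image by means of the interpolation filter according to the predicted motion information, the first prediction block corresponding to the image block; and obtaining, in a second reference image by means of the interpolation filter according to the predicted motion information, the second prediction block corresponding to the image block, where the first reference image is a reference image in a first reference image list, the second reference image is a reference image in a second reference image list, and the first reference image list and the second reference image list are different reference image lists used when the image block is predicted.
With reference to the second aspect, in some implementations of the second aspect, the performing a search in the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block includes: performing a search in a first reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and performing a search in a second reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, where the first reference image is a reference image in a first reference image list, the second reference image is a reference image in a second reference image list, the first reference image list and the second reference image list are different reference image lists used when the image block is predicted, A and B are both integers greater than 0, and A+B=M. The determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block includes: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
The first reference image and the second reference image may each be either a forward reference image or a backward reference image. Specifically, the following cases are possible: the first reference image and the second reference image are both forward reference images; the first reference image and the second reference image are both backward reference images; or the first reference image is a forward reference image and the second reference image is a backward reference image.
In addition, the first reference image may be one reference image or multiple reference images, and likewise the second reference image may be one reference image or multiple reference images.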
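With reference to the second aspect, in some implementations of the second aspect, before the acquiring predicted motion information of an image block, the method further includes: acquiring indication information from the bitstream of the image block, where the indication information is used to indicate that the predicted motion information of the image block is to be acquired, and the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block.
The indication information makes it possible to flexibly indicate whether to acquire the predicted motion information of the image block and then predict the image according to the predicted motion information and so on. Specifically, the indication information can indicate whether the method of the embodiments of the present application is used for image prediction: when the indication information is obtained from the bitstream, the image is predicted according to the method of the embodiments of the present application; if the indication information is not obtained from the bitstream, the image may be predicted according to a conventional method. The indication information thus flexibly indicates which method is used to predict the image.
With reference to the second aspect, in some implementations of the second aspect, before the obtaining an initial prediction block according to the first prediction block and the second prediction block, the method further includes: acquiring indication information from the bitstream of the image block, where the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block. The obtaining an initial prediction block according to the first prediction block and the second prediction block includes: when the flag bit of the indication information takes a first value, obtaining the initial prediction block according to the first prediction block and the second prediction block.
With reference to the second aspect, in some implementations of the second aspect, before the performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block, the method further includes: acquiring indication information from the bitstream of the image block, where the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block. The performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block includes: when the flag bit of the indication information takes a first value, performing weighting processing on the pixel values of the first prediction block and the second prediction block to obtain the pixel values of the initial prediction block.
The flag bit of the indication information may take a first value or a second value. When the flag bit takes the first value, it may indicate that the image block is predicted according to the prediction method of the present application; when the flag bit takes the second value, it may indicate that the image block is predicted according to a conventional prediction method. The first value and the second value may be 1 and 0, respectively, or may be 0 and 1, respectively.
With reference to the second aspect, in some implementations of the second aspect, the method further includes: obtaining the motion vector of the image block pointing to the target prediction block; and obtaining the motion vector of the image block according to the motion vector of the image block pointing to the target prediction block, where the motion vector of the image block is used to predict other image blocks.
Determining the motion vector of the image block according to the motion vector pointing to the target prediction block may specifically mean directly using the motion vector of the target prediction block as the motion vector of the image block, that is, updating the motion vector of the image block, so that other image blocks can be effectively predicted from this image block in the next image prediction.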
第三方面,提供了一种图像预测装置,所述装置包括用于执行所述第一方面或其各种实现方式中的方法的模块。
第四方面,提供了一种图像预测装置,所述装置包括用于执行所述第二方面或其各种实现方式中的方法的模块。
第五方面,提供一种终端设备,所述终端设备包括:存储器,用于存储程序;处理器,用于执行所述存储器存储的程序,当所述程序被执行时,所述处理器用于执行所述第一方面或其各种实现方式中的方法。
第六方面,提供一种终端设备,所述终端设备包括:存储器,用于存储程序;处理器,用于执行所述存储器存储的程序,当所述程序被执行时,所述处理器用于执行所述第二方面或其各种实现方式中的方法。
第七方面,提供一种视频编码器,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现所述第一方面或其各种实现方式中的方法。
第八方面,提供一种视频编码器,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现所述第二方面或其各种实现方式中的方法。
第九方面,提供一种视频解码器,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现所述第一方面或其各种实现方式中的方法。
第十方面,提供一种视频解码器,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现所述第二方面或其各种实现方式中的方法。
第十一方面,提供一种视频编码系统,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现所述第一方面或其各种实现方式中的方法。
第十二方面,提供一种视频编码系统,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现所述第二方面或其各种实现方式中的方法。
第十三方面,提供一种计算机可读介质,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码包括用于执行第一方面或其各种实现方式中的方法的指令。
第十四方面,提供一种计算机可读介质,所述计算机可读介质存储用于设备执行的程序代码,所述程序代码包括用于执行第二方面或其各种实现方式中的方法的指令。
第十五方面,提供一种解码器,所述解码器包括上述第三方面或者第四方面中的图像预测装置以及重建模块,其中,所述重建模块用于根据所述图像预测装置得到的所述图像块的像素值的预测值得到所述图像块的重建像素值。
第十六方面,提供一种编码器,所述编码器包括上述第三方面或者第四方面中的图像预测装置以及重建模块,其中,所述重建模块用于根据所述图像预测装置得到的所述图像块的像素值的预测值得到所述图像块的重建像素值。
图1是视频编码过程的示意图;
图2是视频解码过程的示意图;
图3是本申请实施例的图像预测方法的示意性流程图;
图4是帧间预测的合并模式下选择当前块的预测块的运动矢量的示意图;
图5是帧间预测的非合并模式下选择当前块的预测块的运动矢量的示意图;
图6是整像素位置像素与分像素位置像素的示意图;
图7是本申请实施例的图像预测方法的示意性流程图;
图8是搜索起始点的示意图;
图9是本申请实施例的图像预测方法的示意性流程图;
图10是本申请实施例的图像预测方法的示意性流程图;
图11是本申请实施例的图像预测装置的示意性框图;
图12是本申请实施例的图像预测装置的示意性框图;
图13是本申请实施例的视频编码器的示意性框图;
图14是本申请实施例的视频解码器的示意性框图;
图15是本申请实施例的视频传输系统的示意性框图;
图16是本申请实施例的视频编解码装置的示意性框图;
图17是本申请实施例的视频编解码系统的示意性框图。
下面将结合附图,对本申请中的技术方案进行描述。
本申请中的图像预测方法可以应用到视频编解码技术领域中。为了更好地理解本申请的图像预测方法,下面先对视频编解码进行介绍。
一段视频一般由很多帧图像按照一定的次序组成,一般来说,一帧图像中或者不同帧图像之间存在着大量的重复信息(冗余信息),例如,一帧图像内往往存在着大量空间结构相同或者相似的地方,也就是说视频文件中存在大量的空间冗余信息。另外,视频文件中还存在大量的时间冗余信息,这是由视频的组成结构导致的。例如,视频采样的帧速率一般为25帧/秒至60帧/秒,也就是说,相邻两帧间的采样时间间隔为1/60秒到1/25秒,在这么短的时间内,采样得到的图像画面中基本上都存在大量的相似信息,画面之间存在巨大关联性。
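In addition, research shows that, from the perspective of the visual sensitivity of the human eye, video information also contains a part that can be exploited for compression, namely visual redundancy. Visual redundancy refers to compressing the video bitstream appropriately by exploiting the fact that the human eye is relatively sensitive to changes in luminance but relatively insensitive to changes in chrominance. For example, in high-luminance regions the sensitivity of human vision to luminance changes tends to decrease, and the eye becomes more sensitive to object edges instead; moreover, the human eye is relatively insensitive to interior regions and more sensitive to overall structure. Since video images ultimately serve human viewers, these characteristics of the human eye can be exploited to compress the original video images and achieve a better compression result. Besides the spatial, temporal and visual redundancy mentioned above, video image information also contains other kinds of redundancy, such as information-entropy redundancy, structural redundancy, knowledge redundancy and importance redundancy. The purpose of video coding (also called video compression coding) is to remove the redundant information from a video sequence by various technical means, so as to reduce storage space and save transmission bandwidth.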
目前,在国际通用范围内,视频压缩编码标准中主流的压缩编码方式有四种:色度抽样、预测编码、变换编码和量化编码。下面分别对这几种编码方式进行详细介绍。
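Chroma sampling: This approach exploits the psychovisual characteristics of the human eye and reduces, as far as possible, the amount of data used to describe each element, starting from the underlying data representation. For example, most television systems use luma-chroma-chroma (YUV) color coding, a standard widely adopted in European television systems. The YUV color space comprises one luma signal Y and two color-difference signals U and V, and the three components are independent of one another. Representing the components separately is more flexible and occupies less transmission bandwidth, which gives YUV an advantage over the traditional red-green-blue (RGB) color model. For example, in the YUV 4:2:0 format the two chroma components U and V each have only half the resolution of the luma component Y in both the horizontal and the vertical direction; that is, every 4 sampled pixels carry 4 luma samples Y but only one U sample and one V sample. In this format the data volume is reduced further, to about 50% of the original (6 samples instead of 12 for every 4 pixels). Chroma sampling thus achieves video compression by exploiting the physiological visual characteristics of the human eye, and is one of the most widely used video data compression methods today.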
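The saving from chroma sampling can be checked with simple sample counting. The following is a minimal sketch, not part of the application; the function name `sample_ratio` and its parameters are hypothetical, and equal bit depth for all components is assumed.

```python
def sample_ratio(h_sub: int, v_sub: int) -> float:
    """Ratio of stored samples versus full-resolution 4:4:4.

    h_sub / v_sub are the horizontal and vertical chroma subsampling
    factors (2 and 2 for 4:2:0, 2 and 1 for 4:2:2, 1 and 1 for 4:4:4).
    """
    luma = 1.0                        # one Y sample per pixel
    chroma = 2.0 / (h_sub * v_sub)    # U and V, each subsampled
    return (luma + chroma) / 3.0      # 4:4:4 stores 3 samples per pixel


ratio_420 = sample_ratio(2, 2)  # 4 Y + 1 U + 1 V per 4 pixels -> 6 of 12 samples
```

Under these assumptions 4:2:0 keeps half of the samples of a full-resolution three-component representation.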
预测编码:预测编码时利用之前已编码帧的数据信息来预测当前将要编码的帧。通过预测得到一个预测值,它不完全等同与实际值,预测值和实际值之间存在着一定的残差值。预测的越准确,则预测值就会越接近实际值,残差值就越小,这样对残差值进行编码就能大大减小数据量,在解码端解码时运用残差值加上预测值就能还原重构出匹配图像,这就是预测编码的基本思想方法。在主流编码标准中预测编码分为帧内预测和帧间预测两种基本类型。其中,帧内预测(Intra Prediction)是指利用当前图像内已重建区域内像素点的像素值对当前编码单元内像素点的像素值进行预测;帧间预测(Inter Prediction)是在已重建的图像中,为当前图像中的当前编码单元寻找匹配的参考块,将参考块中的像素点的像素值作为当前编码单元中像素点的像素值的预测信息或者预测值,并传输当前编码单元的运动信息。
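Transform coding: This coding approach does not encode the original spatial-domain information directly. Instead, it converts the sample values from the current domain into another, artificially defined domain (usually called the transform domain) according to some transform function, and then performs compression coding according to the distribution of the information in the transform domain. Video image data are usually highly correlated in the spatial domain and contain a large amount of redundant information, so encoding them directly would require a large number of bits. After the sample values are converted into the transform domain, the correlation of the data is greatly reduced, so the amount of data required for encoding decreases accordingly and a higher compression ratio can be obtained. Typical transform coding methods include the Karhunen-Loève (K-L) transform and the Fourier transform.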
量化编码:上述提到的变换编码其实本身并不压缩数据,量化过程才能有效地实现对数据的压缩,量化过程也是有损压缩中数据“损失”的主要原因。量化的过程就是将动态范围较大的输入值“强行规划”成较少的输出值的过程。由于量化输入值范围较大,需要较多的比特数表示,而“强行规划”后的输出值范围较小,从而只需要少量的比特数即可表示。
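As an illustration of how quantization maps a wide input range onto fewer output levels and loses information in the process, the following minimal sketch uses a uniform quantizer with rounding. The quantizer design and the names `quantize`, `dequantize` and `qstep` are assumptions made for illustration; the application does not specify a particular quantizer.

```python
def quantize(coeff: int, qstep: int) -> int:
    """Map a coefficient to a quantization level (round to nearest)."""
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) + qstep // 2) // qstep)


def dequantize(level: int, qstep: int) -> int:
    """Reconstruct an approximate coefficient from its level."""
    return level * qstep


level = quantize(37, 8)               # small integer, cheap to code
reconstructed = dequantize(level, 8)  # close to 37, but not identical
loss = 37 - reconstructed             # the quantization error
```

The level needs fewer bits than the original coefficient, and the difference between the original and the reconstructed value is the "loss" referred to above.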
在基于混合编码架构的编码算法中,上述几种压缩编码方式可以混合使用,编码器控制模块根据视频帧中不同图像块的局部特性,选择该图像块所采用的编码模式。对帧内预测编码的块进行频域或空域预测,对帧间预测编码的块进行运动补偿预测,预测的残差再通过变换和量化处理形成残差系数,最后通过熵编码器生成最终的码流。为避免预测误差的累积,帧内或帧间预测的参考信号是通过编码端的解码模块得到。变换和量化后的残差系数经过反量化和反变换重建残差信号,再与预测的参考信号相加得到重建的图像。另外,环路滤波会对重建后的图像进行像素修正,以提高重建图像的编码质量。
下面结合图1和图2对视频编解码的整个过程进行简单的介绍。
图1是视频编码过程的示意图。
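As shown in Fig. 1, when a current image block in the current frame Fn is predicted, either intra prediction or inter prediction may be used. Specifically, intra coding or inter coding may be selected according to the type of the current frame Fn: for example, intra prediction is used when Fn is an I frame, and inter prediction is used when Fn is a P frame or a B frame. When intra prediction is used, the pixel values of pixels in the already reconstructed region of the current frame Fn may be used to predict the pixel values of the pixels of the current image block. When inter prediction is used, the pixel values of the pixels of a reference block in the reference frame F'n-1 that matches the current image block may be used to predict the pixel values of the pixels of the current image block.
After the prediction block of the current image block is obtained by inter prediction or intra prediction, the pixel values of the pixels of the prediction block are subtracted from the pixel values of the pixels of the current image block to obtain residual information, and the residual information is transformed, quantized and entropy-coded to obtain the coded bitstream. In addition, during encoding the residual information of the current frame Fn is added to the prediction information of the current frame Fn and a filtering operation is applied, yielding the reconstructed frame F'n of the current frame, which serves as a reference frame for subsequent encoding.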
图2是视频解码过程的示意图。
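The video decoding process shown in Fig. 2 is essentially the inverse of the video encoding process shown in Fig. 1. During decoding, residual information is obtained by entropy decoding, inverse quantization and inverse transformation, and whether the current image block uses intra prediction or inter prediction is determined from the decoded bitstream. In the case of intra prediction, prediction information is constructed by the intra prediction method from the pixel values of pixels in the already reconstructed region of the current frame. In the case of inter prediction, motion information is parsed, a reference block is determined in an already reconstructed image using the parsed motion information, and the pixel values of the pixels in the reference block are used as the prediction information. The prediction information is then added to the residual information, and the reconstruction information is obtained after a filtering operation.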
图3是本申请实施例的图像预测方法的示意性流程图。图3所示的方法可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图3所示的方法既可以发生在编码过程,也可以发生在解码过程,更具体地,图3所示的方法可以发生在编解码时的帧间预测过程。
图3所示的方法包括步骤101至步骤106,下面对步骤101至步骤106进行详细的介绍。
101、获取图像块的预测运动信息。
这里的图像块可以是待处理图像中的一个图像块,也可以是待处理图像中的一个子图像。另外,这里的图像块可以是编码过程中待编码的图像块,也可以是解码过程中待解码的图像块。
可选地,上述预测运动信息包括预测方向的指示信息(通常为前向预测、后向预测、或者双向预测),指向参考图像块的运动矢量(通常为相邻块的运动矢量)和参考图像块所在图像信息(通常理解为参考图像信息),其中,运动矢量包括前向运动矢量和/或后向运动矢量,参考图像信息包括前向预测参考图像块和/或后向预测参考图像块的参考帧索引信息。
获取图像块的预测运动信息的方式有多种,例如,可以采用下面的方式一和方式二来获取图像块的预测运动信息。
方式一:
在帧间预测的合并模式下,根据当前图像块的相邻块的运动信息构建候选预测运动信息列表,并从该候选预测运动信息列表中选择某个候选预测运动信息作为当前图像块的预测运动信息。其中,候选预测运动信息列表包含运动矢量、参考图像块的参考帧索引信息等等。如图4所示,选择相邻块A0的运动信息作为当前图像块的预测运动信息,具体地,将A0的前向运动矢量作为当前块的前向预测运动矢量,将A0的后向运动矢量作为当前块的后向预测运动矢量。
方式二:
在帧间预测的非合并模式下,根据当前图像块的相邻块的运动信息构建运动矢量预测值列表,并从该运动矢量预测值列表中选择某个运动矢量作为当前图像块的运动矢量预测值。在这种情况下,当前图像块的运动矢量可以为相邻块的运动矢量值,也可以为所选取的相邻块的运动矢量与当前图像块的运动矢量差的和,其中,运动矢量差通过对当前图像块进行运动估计所得到的运动矢量与所选取的相邻块的运动矢量的差。如图5所示,选择 运动矢量预测值列表中的索引1和2对应的运动矢量作为当前图像块的前向运动矢量和后向运动矢量。
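The relationship between the motion vector, the selected motion vector predictor and the motion vector difference in non-merge mode can be sketched as follows. This is an illustrative sketch only; the function names and the tuple representation of motion vectors are assumptions, not taken from the application.

```python
from typing import Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical), hypothetical units


def motion_vector_difference(mv: MotionVector, mvp: MotionVector) -> MotionVector:
    """Encoder side: MVD = MV found by motion estimation - selected predictor."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_motion_vector(mvp: MotionVector, mvd: MotionVector) -> MotionVector:
    """Decoder side: MV = selected predictor + transmitted MVD."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


mvp = (12, -4)   # predictor taken from a neighbouring block
mv = (14, -5)    # vector found by motion estimation
mvd = motion_vector_difference(mv, mvp)
```

Only the predictor index and the (usually small) difference need to be signalled, and adding them back yields the original motion vector.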
应理解,上述方式一和方式二只是获取图像块的预测运动信息的具体两种方式,本申请对获取预测块的运动信息的方式不做限定,任何可以获取图像块的预测运动信息的方式都在本申请的保护范围内。
102、根据预测运动信息通过插值滤波器在参考图像中获得图像块对应的第一预测块和第二预测块。
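The gain of the interpolation filter is greater than 1. As a result, the bit width of the pixel values of the first prediction block and the second prediction block obtained from the reference image is higher than the bit width of the finally obtained predicted values of the pixel values of the image block (in this document, a pixel value with a higher bit width may also be regarded as having a higher precision).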
应理解,上述参考图像为图像块的参考图像,或者,上述参考图像为图像块所在的待处理图像的参考图像。
在根据预测运动信息在参考图像中获取第一预测块和第二预测块时,具体可以根据预测运动信息中包含的第一运动矢量和第二运动矢量确定第一预测块和第二预测块在待处理图像的参考图像中的位置。
由于参考图像中只有整像素位置的像素值,若运动矢量指向的位置为分像素位置(例如,1/2像素位置),则需要通过参考图像的整像素位置的像素值,采用插值滤波器进行插值,得到分像素位置的像素值,作为预测块的像素值。在进行插值操作中,由于插值滤波增益,使得预测块的像素值的位宽高于最终得到的图像块的重建像素值的位宽。这里的图像块的重建像素值可以是对图像块进行重建后得到的重建块的像素值。
例如,当参考图像的像素值的位宽为8比特,插值滤波器增益为6比特时,那么,根据预测运动信息并通过插值滤波器可以得到第一预测块和第二预测块的像素值的位宽均为14比特。
另外,当参考图像的像素值的位宽为10比特,插值滤波器增益为6比特时,那么根据预测运动信息并通过插值滤波器可以得到第一预测块和第二预测块的像素值的位宽均为16比特。为了使得插值后得到的预测块的像素值的保持一定的位宽,还可以在进行插值操作之后再进行移位操作,例如,参考图像的像素值的位宽为10比特,插值滤波器增益为6比特,为了使得得到的预测块的像素值的位宽保持14比特,将插值操作后得到的像素值再右移2位,这样就使得预测块的像素值的位宽为14比特。
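The bit-width bookkeeping in the two examples above is plain arithmetic and can be checked with a short sketch. The function name `interpolated_bit_width` is hypothetical.

```python
def interpolated_bit_width(ref_bits: int, gain_bits: int, right_shift: int = 0) -> int:
    """Bit width of a prediction-block sample after interpolation.

    ref_bits:    bit width of the reference image samples
    gain_bits:   interpolation filter gain expressed in bits
    right_shift: optional right shift applied after interpolation
    """
    return ref_bits + gain_bits - right_shift


width_8bit = interpolated_bit_width(8, 6)                    # 8-bit reference
width_10bit = interpolated_bit_width(10, 6)                  # 10-bit reference, no shift
width_10bit_shifted = interpolated_bit_width(10, 6, right_shift=2)
```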
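As shown in Fig. 6, $A_{i,j}$ denotes the pixels at integer-pixel positions, whose bit width is bitDepth, and $a_{0,0}$, $b_{0,0}$, $c_{0,0}$, $d_{0,0}$, $h_{0,0}$, $n_{0,0}$, $e_{0,0}$, $i_{0,0}$, $p_{0,0}$, $f_{0,0}$, $j_{0,0}$, $q_{0,0}$, $g_{0,0}$, $k_{0,0}$ and $r_{0,0}$ denote the pixels at fractional-pixel positions. If an 8-tap interpolation filter is used, $a_{0,0}$ can be calculated by the following formula:
$$a_{0,0} = \left(C_0 A_{-3,0} + C_1 A_{-2,0} + C_2 A_{-1,0} + C_3 A_{0,0} + C_4 A_{1,0} + C_5 A_{2,0} + C_6 A_{3,0} + C_7 A_{4,0}\right) \gg \mathrm{shift1}$$
In the above formula, $C_k$ ($k = 0, 1, \ldots, 7$) are the coefficients of the interpolation filter. If the sum of the coefficients of the interpolation filter is $2^N$, the gain of the interpolation filter is N bits; for example, N = 6 means that the interpolation filter gain is 6 bits. shift1 is the number of bits of the right shift and may be set to bitDepth − 8, where bitDepth is the target bit width, so that the bit width of the pixel values of the prediction block finally obtained from the above formula is bitDepth + 6 − shift1 = 14 bits.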
上述bitDepth为目标位宽,该目标位宽为图像块的重建像素值的位宽。
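The 8-tap interpolation described above can be sketched in a few lines. The application does not list concrete filter coefficients; the sketch below assumes the quarter-sample luma coefficients used in HEVC, which sum to 64 (a gain of 6 bits). The names `QPEL_COEFFS` and `interp_a00` are hypothetical.

```python
# Assumed coefficients C_0..C_7 (HEVC quarter-sample luma filter); sum = 64 = 2**6.
QPEL_COEFFS = (-1, 4, -10, 58, 17, -5, 1, 0)


def interp_a00(row, x, bit_depth, coeffs=QPEL_COEFFS):
    """Compute the fractional sample a_{0,0} next to integer sample row[x].

    row:       integer-position samples A_{i,0} of one image row
    x:         index of A_{0,0}; needs 3 samples to the left, 4 to the right
    bit_depth: bit width of the integer samples (the target bit width)
    """
    shift1 = bit_depth - 8
    acc = sum(c * row[x - 3 + k] for k, c in enumerate(coeffs))
    return acc >> shift1  # result has bit_depth + 6 - shift1 = 14 bits


flat_8bit = interp_a00([100] * 8, 3, bit_depth=8)     # 100 * 64
flat_10bit = interp_a00([1023] * 8, 3, bit_depth=10)  # (1023 * 64) >> 2
```

For a flat region the output is simply the input scaled by the filter gain and then shifted, which makes the 14-bit intermediate range easy to verify.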
可选地,步骤102中在参考图像中获得第一预测块和第二预测块时,可以根据预测运动信息中包含的运动矢量以运动补偿的方式,在参考图像中获取第一预测块和第二预测块。
103、根据第一预测块和所述第二预测块,得到初始预测块。
上述根据第一预测块和第二预测块,得到初始预测块,可以是指根据第一预测块的像素值和第二预测块的像素值,得到初始预测块的像素值。
可选地,上述根据第一预测块和所述第二预测块,得到初始预测块具体包括:根据第一预测块的像素值和第二预测块的像素值,得到初始预测块的像素值。
另外,上述初始预测块、第一预测块以及第二预测块的像素值的位宽均相同,也就是说,上述初始预测块、第一预测块和第二预测块中的任意两个预测块的像素值的位宽相同。
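It should be understood that a target bit width may exist when an image is predicted; the target bit width is the bit width that the reconstructed pixel values of the image block are to have after image prediction ends. Because the gain of the interpolation filter is greater than 1 when the first prediction block and the second prediction block are obtained, the bit width of their pixel values is greater than the target bit width; likewise, the bit width of the pixel values of the initial prediction block and the target prediction block is also greater than the target bit width. In other words, a larger bit width is used to determine the prediction blocks in the intermediate stages of image prediction, and the bit width is adjusted to the target bit width only at the end, when the predicted values of the pixel values of the image block are determined from the pixel values of the target prediction block, so that the finally obtained predicted values have the target bit width. For example, if the target bit width is 10 bits and the bit width of the pixel values of the first prediction block and the second prediction block is 14 bits, the bit width of the pixel values of the initial prediction block and the target prediction block is also 14 bits, and the bit width is reduced from 14 bits to 10 bits only when the predicted values of the pixel values of the image block are finally determined from the pixel values of the target prediction block.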
也就是说,在对图像进行预测时,可以在中间过程中采用高位宽的像素值,最后在得到图像块的像素值的预测值时再将高位宽的像素值转化为目标位宽,这样能够提高图像预测的准确性。
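Optionally, when the initial prediction block is obtained from the first prediction block and the second prediction block, the pixel values of the first prediction block and the pixel values of the second prediction block may be weighted, the weighted pixel values may be shifted, and the shifted pixel values may be used as the pixel values of the initial prediction block, so that the bit width of the pixel values of the initial prediction block is the same as that of the first prediction block and the second prediction block.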
例如,第一预测块和第二预测块的像素值的位宽为14bit,那么,在对第一预测块和第二预测块的像素值进行加权处理后,将加权处理后得到的像素值的位宽也保留为14bit。
应理解,在根据第一预测块和第二预测块,得到初始预测块时,除了采用加权处理之外,还可以采用其他方式得到初始预测块,本申请对此不做限定。
另外,在本申请中,在对不同预测块的像素值进行加权处理时,不同预测块的像素值的加权系数可以不同也可以相同,当不同预测块的加权系数相同时,相当于是对不同预测块的像素值进行了平均处理。
可选地,也可以从参考图像中获取一个预测块,并将该预测块直接确定为初始预测块,例如,当参考图像为前向参考图像时,可以直接将从前向参考图像中获取的预测块确定为初始预测块,当参考图像为后向参考图像时,可以直接将从后向参考图像中获取的预测块确定为初始预测块。
可选地,也可以从参考图像中获取多个预测块,然后根据该多个预测块确定初始预测块,使得初始预测块的像素值等于对多个预测块的像素值进行加权处理后的像素值。
104、根据预测运动信息在参考图像中进行搜索,得到图像块对应的M个预测块。
其中,上述M为预设值,M为大于1的整数。M可以在对图像进行预测之前预先设置好的数值,另外,也可以根据图像预测的精度以及搜索预测块的复杂度来设置M的数值。应理解,上述M个预测块中的每个预测块与初始预测块的像素值的位宽相同。
105、根据M个预测块和初始预测块,确定图像块的目标预测块。
其中,上述目标预测块的像素值的位宽与初始预测块的像素值的位宽相同。
可选地,在确定目标预测块时,可以将M个预测块中与初始预测块的像素值的差异(或者差异值)最小的预测块确定为目标预测块。通过比较每个预测块与初始预测块的差异,能够得到像素值与图像块的像素值更接近的预测块。
在比较多个预测块中的每个预测块的像素值与初始预测块的像素值的差异时,可以采用绝对误差和(Sum of absolute differences,SAD)、绝对变换误差和(Sum of absolute transformation differences,SATD)或者绝对平方差和等来衡量每个预测块的像素值与初始预测块的像素值之间的差异。
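Selecting the candidate with the smallest difference from the initial prediction block can be sketched with SAD as the cost measure. This is an illustrative sketch; blocks are represented as lists of rows, and the names `sad` and `best_candidate` are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q)
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


def best_candidate(candidates, initial):
    """Index of the candidate block closest to the initial prediction block."""
    return min(range(len(candidates)), key=lambda i: sad(candidates[i], initial))


initial = [[10, 10], [10, 10]]
candidates = [
    [[12, 10], [10, 10]],  # SAD 2
    [[10, 11], [10, 10]],  # SAD 1
    [[0, 0], [0, 0]],      # SAD 40
]
best = best_candidate(candidates, initial)
```

SATD or a sum of squared differences could be substituted for `sad` without changing the selection logic.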
可选地,在参考图像中搜索得到图像块的目标预测块时,既可以以整像素步长进行搜索(或者称为运动搜索),也可以以分像素步长(也可以称为亚像素步长)进行搜索,并且在以整像素步长或者分像素步长进行搜索时,搜索的起始点既可以是整像素也可以是分像素,例如,整像素,1/2像素,1/4像素,1/8像素以及1/16像素等等。
在本文中,整像素步长是指是指搜索预测块时每次搜索的步长为整个像素或者整个像素的整数倍。分像素步长搜索是指搜索预测块时每次搜索的步长小于整个像素,例如,进行分像素步长搜索时,搜索步长可以为1/2像素,1/4像素,1/8像素以及1/16像素等等。另外,在进行分像素步长搜索时可以以当前指向的分像素来确定步长进行搜索,例如,当前运动矢量指向的是1/2像素位置,那么,可以以1/2像素为步长进行分像素步长搜索。另外,在进行分像素步长搜索时还可以以预设的分像素步长进行搜索。
106、根据目标预测块的像素值,得到图像块的像素值的预测值。
具体地,可以直接对目标预测块的像素值进行限位操作,使得限位后的像素值的位宽达到图像预测时的目标位宽,然后,根据限位操作后的像素值来得到图像块的像素值的预测值或限位操作后的像素值直接确定为图像块的像素值的预测值,此时,得到的图像块的像素值的预测值的位宽为目标位宽。
例如,得到的目标预测块的像素值的位宽为14bit,而图像预测时的目标位宽为10bit,那么就对目标预测块的像素值进行限位(或者移位)操作,使得像素值的位宽从14bit变成10bit,然后将限位后的像素值作为图像块的像素值的预测值,这时,图像块的像素值的预测值的位宽就变成了10bit。
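The reduction from a high bit width to the target bit width can be sketched as a rounding right shift followed by clipping. The rounding offset and the function name `to_target_bit_width` are assumptions made for illustration; non-negative sample values are assumed.

```python
def to_target_bit_width(value: int, src_bits: int, dst_bits: int) -> int:
    """Reduce a sample from src_bits to dst_bits with rounding and clipping."""
    shift = src_bits - dst_bits
    offset = (1 << (shift - 1)) if shift > 0 else 0  # rounding offset
    shifted = (value + offset) >> shift
    return max(0, min((1 << dst_bits) - 1, shifted))


mid_sample = to_target_bit_width(8192, 14, 10)   # mid-range 14-bit sample
max_sample = to_target_bit_width(16383, 14, 10)  # rounds up to 1024, clipped to 1023
```

The clipping step matters at the top of the range, where rounding would otherwise push the result one step beyond the largest 10-bit value.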
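In this application, because the gain of the interpolation filter is greater than 1, the bit width of the pixel values of the first prediction block and the second prediction block is greater than the bit width of the finally obtained reconstructed pixel values of the image block. Furthermore, because the pixel values of the first prediction block, the second prediction block, the initial prediction block and the target prediction block all have the same bit width, the bit width of the pixel values of the finally obtained target prediction block is also greater than the bit width of the reconstructed pixel values of the image block. The predicted values of the pixel values of the image block can therefore be determined directly from the pixel values of the target prediction block, which already have the higher bit width, without having to perform motion compensation again to obtain a prediction block with high-bit-width pixel values first. This saves a motion compensation operation and reduces the complexity of image prediction.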
可选地,上述参考图像既可以仅包含前向参考图像,也可以仅包含后向参考图像,也可以是既包含前向参考图像又包含后向参考图像。
可选地,作为一个实施例,根据预测运动信息通过插值滤波器在参考图像中获得图像块对应的第一预测块和第二预测块,具体包含以下三种情况。
(1)、当参考图像为前向参考图像时,根据预测运动信息通过插值滤波器在前向参考图像中获得第一预测块和第二预测块;
(2)、当参考图像为后向参考图像时,根据预测运动信息通过插值滤波器在参考图像中获得第一预测块和第二预测块;
(3)、当参考图像包含前向参考图像和后向参考图像时,根据预测运动信息通过插值滤波器分别在前向参考图像和后向参考图像中获得第一预测块和第二预测块。
通过在前向参考图像和/或后向参考图像获取不同的预测块,进而可以根据不同的预测块来确定初始预测块,与直接将在前向参考图像或者后向参考图像中搜索到的预测块作为初始预测块的方式相比,根据不同的预测块能够来更准确地确定初始预测块。
可选地,当参考图像仅包含前向参考图像或者仅包含后向参考图像时,根据预测运动信息在参考图像中进行搜索,得到图像块对应的M个预测块,具体包含以下两种情况:
(4)、当参考图像为前向参考图像时,根据预测运动信息在前向参考图像中进行搜索,得到图像块对应的M个预测块;
(5)、当参考图像为后向参考图像时,根据预测运动信息在后向参考图像中进行搜索,得到图像块对应的M个预测块。
通过仅一个参考图像(前向参考图像或者后向参考图像)中进行搜索,从而得到M个预测块,能够减少搜索预测块的复杂度,另外,通过比较M个预测块中的每个预测块与初始预测块的像素值的差异,能够得到与图像块更接近的预测块,从而提高图像预测的效果。
当参考图像仅包含前向参考图像或者仅包含后向参考图像时,上述M个预测块或者是从前向参考图像中搜索得到的,或者是从后向参考图像中搜索得到的,接下来,在根据图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块时,可以将M个预测块中像素值与初始预测块的像素值的差异最小的预测块确定为目标预测块。
可选地,当参考图像既包含前向参考图像又包含后向参考图像时,根据预测运动信息在参考图像中进行搜索,得到图像块对应的M个预测块的具体过程如下:
根据预测运动信息在前向参考图像中进行搜索,得到图像块对应的A个预测块;
根据预测运动信息在后向参考图像中进行搜索,得到图像块对应的B个预测块;
根据图像块对应的M个预测块和初始预测块,确定图像块的目标预测块,包括:
将A个预测块中像素值与初始预测块的像素值的差异最小的预测块确定为第一目标预测块;
将B个预测块中像素值与初始预测块的像素值的差异最小的预测块确定为第二目标预测块;
根据第一目标预测块和第二目标预测块确定目标预测块。
其中,上述A和B均为大于0的整数,A+B=M,A和B既可以相同,也可以不同。
通过在前向参考图像和后向参考图像中分别进行搜索,能够根据前向参考图像和后向参考图像中搜索出来的预测块来综合确定最终的目标块,这样在获得预测块的时候既考虑了前向参考图像,又考虑了后向参考图像,能够使得最终得到的目标预测块与图像块更接近,从而提高图像预测的效果。
另外,为了进一步减少搜索的复杂度,可以只在一个方向的参考图像中进行搜索,得到M个预测块,而在另一个方向中参考图像中不进行搜索,而是根据已经搜索得到的预测块在推导图像块在另一个方向中的预测块。
可选地,根据预测运动信息在参考图像中进行搜索,得到图像块对应的M个预测块,包括:参考图像为第一方向参考图像,根据预测运动信息在第一方向参考图像中进行搜索,得到图像块对应的M个预测块;根据图像块对应的M个预测块和初始预测块,确定图像块的目标预测块,包括:将图像块对应的M个预测块中像素值与初始预测块的像素值的差异最小的预测块确定为第一目标预测块;确定图像块指向第一目标预测块的第一运动矢量;根据第一运动矢量按照预设规则确定第二运动矢量;根据第二运动矢量在第二方向参考图像中确定图像块对应的第二目标预测块;根据第一目标预测块和第二目标预测块,确定目标预测块。
其中,上述第一方向参考图像和第二方向参考图像分别为前向参考图像和后向参考图像,或者,第一方向参考图像和第二方向参考图像分别为后向参考图像和前向参考图像
通过在一个方向的参考图像中搜索到的预测块来推导图像块在另一个方向的参考图像中的预测块,能够节省大量搜索操作,简化图像预测时的复杂度,同时,由于在确定目标预测块的同时既采用了图像块对应在前向参考图像的预测块也采用了图像块对应在后向参考图像中的预测块,可以在简化图像预测复杂度的同时,保证图像预测的准确性。
可选地,上述根据第一运动矢量按照预设规则确定第二运动矢量,可以是根据公式MV1’=MV1–(MV0’–MV0)来推导第二运动矢量,MV0’为第一运动矢量,MV1’为第二运动矢量,MV0为图像块指向上述第一预测块的初始前向运动矢量,MV1为图像块指向上述第二预测块的初始后向运动矢量。
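The formula MV1’ = MV1 − (MV0’ − MV0) applies the refinement found in one direction to the other direction with the opposite sign. A minimal sketch follows; the function name `derive_second_mv` and the tuple representation are hypothetical.

```python
def derive_second_mv(mv0, mv1, mv0_refined):
    """Derive MV1' from the refined MV0' using MV1' = MV1 - (MV0' - MV0).

    mv0, mv1:     initial forward / backward motion vectors
    mv0_refined:  motion vector pointing to the first target prediction block
    """
    delta_x = mv0_refined[0] - mv0[0]
    delta_y = mv0_refined[1] - mv0[1]
    return (mv1[0] - delta_x, mv1[1] - delta_y)


mv1_refined = derive_second_mv(mv0=(4, -2), mv1=(-4, 2), mv0_refined=(5, -2))
```

In this example the forward vector moved by (+1, 0), so the backward vector moves by (−1, 0); no search is needed in the second reference image.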
可选地,当上述参考图像包括前向参考图像和后向参考图像时,在获取目标预测块时,可以只在前向参考图像中进行搜索,得到前向目标预测块,然后根据指向该前向参考图像目标预测块的前向运动矢量推导出后向运动矢量(例如,可以采用镜像假设方法推导运动矢量),然后根据推到出来的后向运动矢量来确定后向目标预测块。然后根据前向目标预测块和后向目标预测块来确定目标预测块。
另外,也可以只在后向参考图像中进行搜索,得到后向目标预测块,然后根据指向该后向参考图像目标预测块的后向运动矢量推导出前向运动矢量(可以采用镜像假设方法),然后根据推到出来的前向运动矢量来确定前向目标预测块。最后再根据前向目标预测块和后向目标预测块来确定目标预测块。
可选地,作为一个实施例,在上述步骤101之前,图3所示的方法还包括:从图像块的码流中获取指示信息,其中,该指示信息用于指示获取图像块的预测运动信息,该指示信息携带在所述图像块的序列参数集、图像参数集或者条带头中的任意一种。
通过指示信息能够灵活地指示是否获取图像块的预测运动信息,进而根据图像块的预 测运动信息等等对图像进行预测。具体地,通过指示信息能够指示是否采用本申请实施例的方法进行图像预测,当从码流中获取到该指示信息后按照本申请实施例的方法对图像进行预测,如果没有从码流中获取到该指示信息的话,可以按照传统的方法对图像进行预测,通过指示信息能够灵活指示具体采用何种方法对图像进行预测。
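When the above indication information (trigger information) is carried in the sequence parameter set of the image to be processed, it may specifically take the form shown in Table 1.
Table 1
seq_parameter_set_rbsp() { | Descriptor |
… | |
sps_dmvr_precision_flag | u(1) |
… | |
} | |
In Table 1, seq_parameter_set_rbsp() represents all parameter information of an image sequence, and sps_dmvr_precision_flag represents the indication information. The value of sps_dmvr_precision_flag can be obtained by decoding the bitstream: when sps_dmvr_precision_flag is 0, the image may be predicted by a conventional prediction method; when sps_dmvr_precision_flag is 1, the image may be predicted by the method of this application.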
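When the above indication information (trigger information) is carried in the picture parameter set of the image to be processed, it may specifically take the form shown in Table 2.
Table 2
pic_parameter_set_rbsp() { | Descriptor |
… | |
pps_dmvr_precision_flag | u(1) |
… | |
} | |
In Table 2, pic_parameter_set_rbsp() represents all parameter information of an image, and pps_dmvr_precision_flag represents the indication information. The value of pps_dmvr_precision_flag can be obtained by decoding the bitstream: when pps_dmvr_precision_flag is 0, the image may be predicted by a conventional prediction method; when pps_dmvr_precision_flag is 1, the image may be predicted by the method of this application.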
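When the above indication information (trigger information) is carried in the slice header of the image to be processed, it may specifically take the form shown in Table 3.
Table 3
slice_segment_header() { | Descriptor |
… | |
slice_dmvr_precision_flag | u(1) |
… | |
} | |
In Table 3, slice_segment_header() represents all parameter information of one slice of an image, and slice_dmvr_precision_flag represents the indication information. The value of slice_dmvr_precision_flag can be obtained by decoding the bitstream: when slice_dmvr_precision_flag is 0, the image may be predicted by a conventional prediction method; when slice_dmvr_precision_flag is 1, the image may be predicted by the method of this application.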
可选地,作为一个实施例,图3所示的方法还包括:根据图像块指向目标预测块的运动矢量,确定图像块的运动矢量。
应理解,这里的目标预测块的运动矢量是图像块指向该目标预测块的运动矢量。
根据指向目标预测块的运动矢量来确定图像块的运动矢量,具体可以是将目标运动块的运动矢量直接确定为图像块的运动矢量,也就是对图像块的运动矢量进行了更新,这样就使得在进行下次图像预测时可以根据该图像块对其它图像块进行有效的预测。
另外,还可以将目标运动块的运动矢量确定为图像块的运动矢量的预测值,接下来再根据图像块的运动矢量的预测值得到图像块的运动矢量。
下面结合图7对本申请实施例的图像预测方法的流程进行详细的介绍。与图3所示的方法类似,图7所示的方法也可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图7所示的方法可以发生在编码过程,也可以发生在解码过程,具体地,图7所示的方法可以发生在编码过程或者解码时的帧间预测过程。
图7所示的方法具体包括步骤201至步骤209,下面分别对步骤201至步骤209进行详细的描述。
201、获得当前图像块的预测运动信息。
在获取预测运动信息时,具体可以根据当前图像块的相邻图像块的运动信息来确定当前图像块的预测运动信息。进一步地,还可以采用步骤101下方的方式一和方式二来获取预测运动信息。
在当前图像块的参考图像包括前向参考图像和后向参考图像时,上述预测运动信息包括预测方向的指示信息(通常为前向预测、后向预测、或者双向预测),指向参考图像块的运动矢量(通常为相邻块的运动矢量)和参考图像块所在图像信息(通常理解为参考图像信息),其中,运动矢量包括前向运动矢量和/或后向运动矢量,参考图像信息包括前向预测参考图像块和/或后向预测参考图像块的参考帧索引信息。
202、在前向参考图像中获取当前图像块的前向预测块,其中,该前向预测块的像素值为高位宽的像素值。
203、在后向参考图像中获取当前图像块的后向预测块,其中,该后向预测块的像素值为高位宽的像素值。
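It should be understood that, in this document, a high-bit-width pixel value is a pixel value whose bit width is greater than the final target bit width of image prediction. For example, if the bit width of the pixel values of the forward prediction block and the backward prediction block in steps 202 and 203 is 14 bits and the target bit width is 10 bits, the pixel values of the forward prediction block and the backward prediction block may be called high-bit-width pixel values because their bit width is greater than the target bit width.
It should be understood that the predicted motion information in step 201 may specifically include a forward motion vector and a backward motion vector. In step 202, the forward prediction block of the current image block can then be obtained from the forward reference image by motion compensation according to the forward motion vector, and in step 203 the backward prediction block of the current image block can be obtained from the backward reference image by motion compensation according to the backward motion vector.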
应理解,在步骤202和步骤203中获取到前向预测块和后向预测块之后,不对前向预测块和后向预测块的像素值进行位宽移位和限位操作,使得前向预测块和后向预测块的像素值保持高位宽。
204、根据前向预测块和后向预测块,获取初始预测块。
应理解,步骤204中的前向预测块和后向预测块是分别在步骤202和步骤203中得到的,前向预测块的像素值和后向预测块的像素值均为高位宽的像素值。
在根据前向预测块和后向预测块的获取初始预测块时,具体可以对前向预测块的像素值和后向预测块的像素值进行加权处理,将得到的像素值确定为初始预测块(也可以称为匹配预测块)的像素值。应理解,在对前向预测块的像素值和后向预测块的像素值进行加权处理后,不对加权处理后得到的像素值进行位宽移位和限位操作,使得得到的初始预测块的像素值也为高位宽。
在对前向预测块的像素值和后向预测块的像素值进行加权处理时,可以根据公式(2)来获得初始预测块的每个像素点的像素值。
predSamples[x][y]=(predSamplesL0[x][y]+predSamplesL1[x][y]+1)>>1 (2)
在公式(2)中,predSamplesL0为前向预测块,predSamplesL1为后向预测块,predSamples为初始预测块,predSamplesL0[x][y]为前向预测块中像素点(x,y)的像素值,predSamplesL1[x][y]为后向预测块中像素点(x,y)的像素值,predSamples[x][y]为初始预测块中像素点(x,y)的像素值。
当前向预测块的像素值和后向预测块的像素值的位宽均为14比特时,根据公式(2)得到的初始预测块的像素的位宽也为14比特,也就是说根据公式(2)计算初始预测块的像素点的像素值能够使得初始预测块的像素点的像素值的位宽与前向预测块和后向预测块的位宽保持一致(均是高位宽)。
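Formula (2) can be sketched directly; the point of interest is that averaging two 14-bit samples with a one-bit right shift yields a 14-bit result again. Blocks are represented as lists of rows, and the function name `average_blocks` is hypothetical.

```python
def average_blocks(pred_l0, pred_l1):
    """Per-pixel (L0 + L1 + 1) >> 1, as in formula (2)."""
    return [[(a + b + 1) >> 1 for a, b in zip(row0, row1)]
            for row0, row1 in zip(pred_l0, pred_l1)]


pred_l0 = [[16383, 0, 100]]   # 14-bit forward samples
pred_l1 = [[16383, 1, 103]]   # 14-bit backward samples
initial_block = average_blocks(pred_l0, pred_l1)
```

Even with both inputs at the 14-bit maximum, the output stays below 2^14, so no clipping or additional shift is needed at this stage.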
205、在前向参考图像中搜索至少一个前向预测块,其中,该至少一个前向预测块中的每个前向预测块的像素值均为高位宽的像素值。
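206. Determine an optimal forward prediction block from the at least one forward prediction block, where the pixel values of the optimal forward prediction block are high-bit-width pixel values.
Specifically, when the optimal forward prediction block is determined from the at least one forward prediction block, the difference between the pixel values of each of the at least one forward prediction block and the pixel values of the initial prediction block may first be determined, and the forward prediction block whose pixel values differ least from those of the initial prediction block may then be determined as the optimal forward prediction block.
207. Search for at least one backward prediction block in the backward reference image, where the pixel values of each of the at least one backward prediction block are high-bit-width pixel values.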
在步骤205中,在前向参考图像中进行搜索时可以以整像素步长进行搜索(或者称为运动搜索),以得到至少一个前向预测块。
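In step 207, the search in the backward reference image may likewise be performed with an integer-pixel step size (also called a motion search) to obtain the at least one backward prediction block.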
在以整像素步长进行搜索时,搜索起始点既可以整像素也可以是分像素,例如,整像素,1/2像素,1/4像素,1/8像素以及1/16像素等等。
例如,如图8所示,在以整像素步长进行搜索时,可以以(0,0)为搜索起始点,得到一个前向预测块,接下来,以(0,0)的周围的8个像素点为搜索点,继续进行搜索,再得到8个前向预测块。
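The search pattern described above, a start point plus its 8 surrounding points, can be enumerated as follows. This is a sketch under the assumption of a square 3×3 neighbourhood; the function name `search_points` and its parameters are hypothetical.

```python
def search_points(center=(0, 0), step=1):
    """Start point followed by its 8 neighbours at the given step size."""
    cx, cy = center
    points = [center]
    points += [(cx + dx * step, cy + dy * step)
               for dy in (-1, 0, 1)
               for dx in (-1, 0, 1)
               if (dx, dy) != (0, 0)]
    return points


points = search_points()  # 9 offsets, each yielding one candidate prediction block
```

Each returned offset corresponds to one candidate prediction block, so this pattern produces nine candidates per reference image.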
另外,在步骤205和步骤207中搜索至少一个前向预测块和至少一个后向预测块时,也可以直接以分像素步长进行搜索,或者,既进行整像素步长搜索又进行分像素步长搜索。
应理解,以分像素步长进行搜索时,搜索起始点既可以整像素也可以是分像素。搜索起始点的分布也可以如图8所示。
208、从至少一个后向预测块中确定最优后向预测块,其中,该最优后向预测块的像素值为高位宽的像素值。
与确定最优前向预测块类似,在确定最优后向预测块时,也可以先确定至少一个后向预测块中的每个后向预测块的像素值与初始预测块的差异,然后将至少一个后向预测块中像素值与初始预测块的像素值差异最小的后向预测块确定为最优后向预测块。
在比较前向预测块的像素值与初始预测块的像素值的差异,以及后向预测块的像素值与初始预测块的像素值的差异时,可以采用SAD、SATD或者绝对平方差和等来衡量不同预测块的像素值之间的差异。
209、根据最优前向预测块和最优后向预测块,确定当前图像块的像素值的预测值。
在确定当前图像块的像素值的预测值时,可以对最优前向预测块的像素值和最优后向预测块的像素值进行加权处理,由于最优前向预测块的像素值和最优后向预测块的像素值均为高位宽的像素值,因此,在进行加权处理后,得到的像素值仍然是高位宽,这时需要对加权处理后得到的像素值进行位宽移位和限位操作,然后再将位宽移位和限位操作处理后的像素值确定为当前图像块的像素值的预测值。
具体地,可以根据公式(3)得到当前图像块的像素值的预测值。
predSamples’[x][y]=Clip3(0,(1<<bitDepth)-1,(A+offset2)>>shift2) (3)
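Here, A = predSamplesL0’[x][y] + predSamplesL1’[x][y]; predSamplesL0’ is the optimal forward prediction block, predSamplesL1’ is the optimal backward prediction block, and predSamples’ corresponds to the current image block; predSamplesL0’[x][y] is the pixel value of pixel (x, y) in the optimal forward prediction block, predSamplesL1’[x][y] is the pixel value of pixel (x, y) in the optimal backward prediction block, and predSamples’[x][y] is the predicted value of the pixel value of pixel (x, y) in the current image block; shift2 denotes the bit-width difference, and offset2 equals 1<<(shift2-1) and is used for rounding during the calculation.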
例如,前向最优预测块的像素值的位宽为14bit,后向最优预测块的像素值的位宽也为14bit,bitDepth为目标位宽,那么,shift2为15-bitDepth,根据公式(3)得到的预测块像素值位宽为14+1-shift2=bitDepth。
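Formula (3) and the example above can be sketched as follows, assuming 14-bit intermediate samples. The names `clip3` and `bi_pred_sample` and the parameter `inter_bits` are hypothetical; `clip3` follows the usual clamp-to-range meaning.

```python
def clip3(low, high, value):
    """Clamp value to the range [low, high]."""
    return max(low, min(high, value))


def bi_pred_sample(l0, l1, bit_depth, inter_bits=14):
    """Combine two high-bit-width samples into one target-bit-width sample."""
    shift2 = inter_bits + 1 - bit_depth   # 15 - bitDepth for 14-bit inputs
    offset2 = 1 << (shift2 - 1)           # rounding offset
    return clip3(0, (1 << bit_depth) - 1, (l0 + l1 + offset2) >> shift2)


mid_value = bi_pred_sample(8192, 8192, bit_depth=10)    # mid-range inputs
top_value = bi_pred_sample(16383, 16383, bit_depth=10)  # clipped to the 10-bit maximum
```

The sum of two 14-bit samples needs 15 bits, which is why shift2 is one larger than the plain bit-width difference.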
另外,在本申请中,还可以根据其它方法来得到当前图像块的像素值的预测值,本申请对此不做限定。
为了进一步减少图像预测时的复杂度,可以在图像预测的过程中对初始预测块和目标预测块统一采用目标位宽。下面结合图9对本申请实施例的图像预测方法进行详细的介绍。
图9是本申请实施例的图像预测方法的示意性流程图。图9所示的方法可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图9所示的方法既可以发生在编码过程,也可以发生在解码过程,更具体地,图9所示的 方法可以发生在编解码时的帧间预测过程。
The method shown in Fig. 9 includes steps 301 to 307, which are described in detail below.
301、获取图像块的预测运动信息。
上述图像块可以是待处理图像中的一个图像块,也可以是待处理图像中的一个子图像。
当该图像块的参考图像包括前向参考图像和后向参考图像时,上述预测运动信息包括预测方向的指示信息(通常为前向预测、后向预测、或者双向预测),指向参考图像块的运动矢量(通常为相邻块的运动矢量)和参考图像块所在图像信息(通常理解为参考图像信息),其中,运动矢量包括前向运动矢量和/或后向运动矢量,参考图像信息包括前向预测参考图像块和/或后向预测参考图像块的参考帧索引信息。
在获取预测运动信息时,具体可以采用步骤101下方的方式一和方式二来获取。
302、根据预测运动信息通过插值滤波器在参考图像中获得图像块对应的第一预测块和第二预测块。
其中,上述插值滤波器的增益大于1。
应理解,上述参考图像为图像块的参考图像,或者,上述参考图像为图像块所在的待处理图像的参考图像。
303、对第一预测块的像素值和第二预测块的像素值进行移位操作,使得第一预测块的像素值和第二预测块的像素值的位宽减小到目标位宽。
应理解,在对图像进行预测时可以存在一个目标位宽,该目标位宽就是在图像预测结束后,图像块的像素重建值要达到的位宽。
由于上述插值滤波器的增益大于1,使得在参考图像中获得的第一预测块和第二预测块的像素值的位宽高于目标位宽,通过对第一预测块的像素值和第二预测块的像素值的进行移位操作,可以使得移位操作后的第一预测块和第二预测块的像素值的位宽减小到目标位宽。
304、根据第一预测块和第二预测块,得到初始预测块。
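The pixel values of the initial prediction block, the first prediction block and the second prediction block have the same bit width; that is, the bit width of the pixel values of the initial prediction block, the first prediction block and the second prediction block is the target bit width.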
可选地,在根据第一预测块和第二预测块得到初始预测块时,可以对第一预测块的像素值和第二预测块的像素值进行加权处理,并对加权处理后的像素值进行移位,将移位后的像素值作为初始预测块的像素值,并且使得初始预测块的像素值的位宽与第一预测块和第二预测值的位宽保持一致,也就是使得初始预测块的像素值的位宽达到目标位宽。
例如,第一预测块和第二预测块的像素值的位宽为10bit,那么,在对第一预测块和第二预测块的像素值进行加权处理后,将加权处理后得到的像素值的位宽也保留为10bit。
应理解,加权处理,只是得到初始预测块的像素值的一种方式,在本申请中还可以采用其它方式来得到初始预测块的像素值,本申请对此不做限定。
可选地,也可以从参考图像中获取一个预测块,并将该预测块直接确定为初始预测块,例如,当参考图像为前向参考图像时,可以直接将从前向参考图像中获取的预测块确定为初始预测块,当参考图像为后向参考图像时,可以直接将从后向参考图像中获取的预测块 确定为初始预测块。
305、根据初始预测块在参考图像中进行搜索,得到图像块对应的M个预测块。
上述M为预设值,M为大于1的整数。M可以是在对图像进行预测之前预先设置好的数值,另外,也可以根据图像预测的精度以及搜索预测块的复杂度来设置M的数值。另外,上述M个预测块中的每个预测块与初始预测块的像素值的位宽相同。
306、根据M个预测块和初始预测块,确定图像块的目标预测块。
其中,上述目标预测块的像素值的位宽与初始预测块的像素值的位宽相同。
由于在获取第一预测块和第二预测块时,插值滤波器的增益大于1,该第一预测块和第二预测块的像素值的位宽是大于目标位宽的,但是,经过移位操作之后使得第一预测块和第二预测块的像素值的位宽又变成了目标位宽,接下来,根据第一预测块和第二预测块得到的初始预测块,以及最后得到的目标预测块的像素值的位宽也为目标位宽。例如,目标位宽为10bit,第一预测块和第二预测块的像素值的位宽为14bit,经过移位操作后得到的第一预测块和第二预测块的位宽为10bit,接下来,根据第一预测块和第二预测块得到的初始预测块以及目标预测块的位宽也为10bit,使得移位后的第一预测块,第二预测块,以及初始预测块和目标预测块的位宽均为目标位宽,减少了图像预测的复杂度。
可选地,在确定目标预测块时,可以将M个预测块中与初始预测块的像素值的差异(或者差异值)最小的预测块确定为目标预测块。通过比较每个预测块与初始预测块的差异,能够得到像素值与图像块的像素值更接近的预测块。
在比较多个预测块中的每个预测块的像素值与初始预测块的像素值的差异时,可以采用SAD、SATD或者绝对平方差和等来衡量每个预测块的像素值与初始预测块的像素值之间的差异。
可选地,在参考图像中搜索得到图像块的目标预测块时,既可以以整像素步长进行搜索(或者称为运动搜索),也可以以分像素步长进行搜索,并且在以整像素步长或者分像素步长进行搜索时,搜索的起始点既可以是整像素也可以是分像素,例如,整像素,1/2像素,1/4像素,1/8像素以及1/16像素等等。
307、根据目标预测块的像素值,得到图像块的像素值的预测值。
具体而言,在根据目标预测块的像素值,得到图像块的像素值的预测值时,由于目标预测块的像素值的位宽为目标位宽,因此,可以直接将目标预测块的像素值确定为图像块的像素值的预测值。
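For example, if the target bit width is 10 bits and the bit width of the pixel values of the target prediction block obtained in step 306 is also 10 bits, the pixel values of the target prediction block can be determined directly as the predicted values of the pixel values of the image block.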
本申请中,由于初始预测块和目标预测块的像素值的位宽均为目标位宽,因此,在对图像进行预测的过程中,能够减少像素值的在不同位宽之间的来回转换,并且根据像素值位宽为目标位宽的目标预测块来确定图像块的像素值的预测值,而不再进行运动补偿获取具有高位宽的像素值的预测块之后再确定图像块的像素值的预测值,节省了运动补偿的操作,简化了图像预测的流程,降低了图像预测的复杂度。
可选地,作为一个实施例,上述参考图像为前向参考图像或者后向参考图像。
在本申请中,当参考图像仅为前向参考图像和后向参考图像的一种时,只需要在一种类型的参考图像进行搜索预测块,降低了搜索的复杂度。
当参考图像为前向参考图像时,在步骤302中可以根据预测运动信息通过插值滤波器在前向参考图像中获取两个前向预测块,并将这两个前向预测块分别作为第一预测块和第二预测块。
当参考图像为后向参考图像时,在步骤302中可以根据预测运动信息通过插值滤波器在后向参考图像中获取两个后向预测块,并将这两个后向预测块分别作为第一预测块和第二预测块。
当参考图像包含前向参考图像和后向参考图像时,在步骤302中可以根据预测运动信息通过插值滤波器在前向参考图像和后参考图像中获取一个前向预测块和一个后向预测块,并将前向预测块和后向预测块分别作为第一预测块和第二预测块。这里对获取前向预测块和后向预测块的顺序不作限定,既可以是同时获取,也可以是先获取前向预测块再获取后向预测块,或者,先获取后向预测块再获取前向预测块。
通过在前向参考图像和/或后向参考图像获取不同的预测块,进而可以根据不同的预测块来确定初始预测块,与直接将在前向参考图像或者后向参考图像中搜索到的预测块作为初始预测块的方式相比,根据不同的预测块能够来更准确地确定初始预测块。
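Optionally, when the reference image is a forward reference image, steps 305 and 306 specifically include: searching in the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
Optionally, when the reference image is a backward reference image, steps 305 and 306 specifically include: searching in the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.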
通过仅一个参考图像(前向参考图像或者后向参考图像)中进行搜索,从而得到M个预测块,能够减少搜索预测块的复杂度,另外,通过比较M个预测块中的每个预测块与初始预测块的像素值的差异,能够得到与图像块更接近的预测块,从而提高图像预测的效果。
在本申请中,当参考图像包括前向参考图像和后向参考图像时,通过在前向参考图像和后向参考图进行搜索来共同确定目标预测块,能够提高图像预测的准确性。
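When the reference image includes a forward reference image and a backward reference image, steps 305 and 306 may specifically include steps 1 to 5, which are described in detail below. Steps 1 and 2 obtain the M prediction blocks, and steps 3 to 5 determine the target prediction block of the image block according to the M prediction blocks and the initial prediction block.
Step 1. Search in the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block;
Step 2. Search in the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block;
Step 3. Determine, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block;
Step 4. Determine, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block;
Step 5. Determine the target prediction block according to the first target prediction block and the second target prediction block.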
其中,上述A和B均为大于0的整数,A+B=M,A和B既可以相同,也可以不同。
通过在前向参考图像和后向参考图像中分别进行搜索,能够根据前向参考图像和后向参考图像中搜索出来的预测块来综合确定最终的目标块,这样在获得预测块的时候既考虑了前向参考图像,又考虑了后向参考图像,能够使得最终得到的目标预测块与图像块更接近,从而提高图像预测的效果。
另外,为了进一步减少搜索的复杂度,可以只在一个方向的参考图像中进行搜索,得到M个预测块,而在另一个方向中参考图像中不进行搜索,而是根据已经搜索得到的预测块在推导图像块在另一个方向中的预测块。
可选地,根据预测运动信息在参考图像中进行搜索,得到图像块对应的M个预测块,包括:参考图像为第一方向参考图像,根据预测运动信息在第一方向参考图像中进行搜索,得到图像块对应的M个预测块;根据图像块对应的M个预测块和初始预测块,确定图像块的目标预测块,包括:将图像块对应的M个预测块中像素值与初始预测块的像素值的差异最小的预测块确定为第一目标预测块;确定图像块指向第一目标预测块的第一运动矢量;根据第一运动矢量按照预设规则确定第二运动矢量;根据第二运动矢量在第二方向参考图像中确定图像块对应的第二目标预测块;根据第一目标预测块和第二目标预测块,确定目标预测块。
其中,上述第一方向参考图像和第二方向参考图像分别为前向参考图像和后向参考图像,或者,第一方向参考图像和第二方向参考图像分别为后向参考图像和前向参考图像
通过在一个方向的参考图像中搜索到的预测块来推导图像块在另一个方向的参考图像中的预测块,能够节省大量搜索操作,简化图像预测时的复杂度,同时,由于在确定目标预测块的同时既采用了图像块对应在前向参考图像的预测块也采用了图像块对应在后向参考图像中的预测块,可以在简化图像预测复杂度的同时,保证图像预测的准确性。
可选地,上述根据第一运动矢量按照预设规则确定第二运动矢量,可以是根据公式MV1’=MV1–(MV0’–MV0)来推导第二运动矢量,MV0’为第一运动矢量,MV1’为第二运动矢量,MV0为图像块指向上述第一预测块的初始前向运动矢量,MV1为图像块指向上述第二预测块的初始后向运动矢量。
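Optionally, as an embodiment, before step 301, the method shown in Fig. 9 further includes: obtaining indication information from the bitstream of the image block, where the indication information is used to indicate that the predicted motion information of the image block is to be obtained, and the indication information is carried in any one of the sequence parameter set, the picture parameter set or the slice header of the image block.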
通过指示信息能够灵活地指示是否获取图像块的预测运动信息,进而根据图像块的预测运动信息等等对图像进行预测。具体地,通过指示信息能够指示是否采用本申请实施例的方法进行图像预测,当从码流中获取到该指示信息后按照本申请实施例的方法对图像进行预测,如果没有从码流中获取到该指示信息的话,可以按照传统的方法对图像进行预测,通过指示信息能够灵活指示具体采用何种方法对图像进行预测。
当触发信息分别携带在图像块的序列参数集、图像参数集或者条带头中,触发信息的具体表现形式可以分别如表1至表3所示。
可选地,作为一个实施例,图9所示的方法还包括:根据图像块指向目标预测块的运动矢量,确定图像块的运动矢量。
应理解,这里的目标预测块的运动矢量是图像块指向该目标预测块的运动矢量。
根据指向目标预测块的运动矢量来确定图像块的运动矢量,具体可以是将目标运动块的运动矢量直接确定为图像块的运动矢量,也就是对图像块的运动矢量进行了更新,这样就使得在进行下次图像预测时可以根据该图像块对其它图像块进行有效的预测。另外,还可以将目标运动块的运动矢量确定为图像块的运动矢量的预测值,接下来再根据图像块的运动矢量的预测值得到图像块的运动矢量。
下面结合图10对本申请实施例的图像预测方法的流程进行详细的介绍。与图9所示的方法类似,图10所示的方法也可以由视频编解码装置、视频编解码器、视频编解码系统以及其它具有视频编解码功能的设备来执行。图10所示的方法可以发生在编码过程,也可以发生在解码过程,具体地,图10所示的方法可以发生在编码过程或者解码时的帧间预测过程。
图10所示的方法具体包括步骤401至步骤409,下面分别对步骤401至步骤409进行详细的描述。
401、获取当前图像块的预测运动信息。
可以根据当前图像块的相邻图像块的运动信息来确定当前图像块的预测运动信息。具体地,可以采用步骤101下方的方式一和方式二来获取预测运动信息。
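The predicted motion information includes indication information of the prediction direction (usually forward prediction, backward prediction or bidirectional prediction), a motion vector pointing to a reference image block (usually the motion vector of a neighbouring block) and information about the image in which the reference image block is located (usually understood as reference image information), where the motion vector includes a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of the forward prediction reference image block and/or the backward prediction reference image block.
402. Obtain a forward prediction block of the current image block from the forward reference image, where the bit width of the pixel values of the forward prediction block is the target bit width.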
在步骤402中,可以根据预测运动信息中的前向运动矢量在前向参考图像中进行运动搜索,以得到当前图像块的前向预测块。
403、在后向参考图像中获取当前图像块的后向预测块,该后向预测块的像素值的位宽为目标位宽。
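In step 403, a motion search may be performed in the backward reference image according to the backward motion vector in the predicted motion information, so as to obtain the backward prediction block of the current image block.
The target bit width in steps 402 and 403 may refer to the bit width of the finally obtained reconstructed pixel values of the image block; that is, the bit width of the pixel values of the forward prediction block and the backward prediction block obtained here is the same as the bit width of the finally obtained reconstructed pixel values of the image block.
It should be understood that, in steps 402 and 403, the bit width of the pixel values of the forward prediction block and the backward prediction block found directly according to the motion vectors may be greater than the target bit width; a shift operation is then performed on the found forward prediction block and backward prediction block so that the bit width of their pixel values is reduced to the target bit width.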
例如,根据运动矢量搜索到的前向预测块和后向预测块的像素值的位宽为14bit,目标位宽为10bit,那么可以对初始搜索到的前向预测块和后向预测块的像素值的位宽从14bit移位到10bit。
404、根据前向预测块和后向预测块,获取初始预测块,该初始预测块的位宽为目标位宽。
应理解,步骤404中的前向预测块和后向预测块分别是在步骤402和步骤403中获取 的。
在根据前向预测块和后向预测块获取初始预测块时,可以对前向预测块的像素值和后向预测块的像素值进行加权处理,然后再对加权处理得到的像素值进行位宽移位和限位操作,使得经过位宽移位和限位操作之后得到的像素值的位宽为目标位宽。
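When the pixel values of the initial prediction block are determined, the pixel value of each pixel of the initial prediction block may be obtained according to formula (4).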
predSamples[x][y]=Clip3(0,(1<<bitDepth)-1,(B+offset2)>>shift2) (4)
其中,B=predSamplesL0[x][y]+predSamplesL1[x][y],predSamplesL0为前向预测块,predSamplesL1为后向预测块,predSamples为初始预测块,predSamplesL0[x][y]为前向预测块中像素点(x,y)的像素值,predSamplesL1[x][y]为后向预测块中像素点(x,y)的像素值,predSamples[x][y]为初始预测块中像素点(x,y)的像素值,shift2表示位宽差,offset2等于1<<(shift2-1),用于在计算过程中进行四舍五入。
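The Clip3 function ensures that the final predicted pixel value lies within the bit-width range of image prediction. It is defined as shown in formula (5):
$$\mathrm{Clip3}(x, y, z) = \begin{cases} x, & z < x \\ y, & z > y \\ z, & \text{otherwise} \end{cases} \quad (5)$$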
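For example, when the bit width of the pixel values of both the forward prediction block and the backward prediction block is 14 bits, shift2 may be set to 15 − bitDepth, where bitDepth is the target bit width. In this way, the bit width of the pixel values of the initial prediction block finally obtained according to formula (4) is 14 + 1 − shift2 = bitDepth; that is, the bit width of the pixel values of the finally obtained initial prediction block is the same as the target bit width.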
405、在前向参考图像中搜索至少一个前向预测块,该至少一个前向预测块的像素值的位宽为目标位宽。
406、从至少一个前向预测块中确定最优前向预测块,其中,该最优前向预测块的位宽为目标位宽。
在从至少一个前向预测块中确定最优前向预测块时,可以确定至少一个前向预测块中的每个前向预测块的像素值与初始预测块的差异,并将至少一个前向预测块中的像素值与初始预测块的像素值差异最小的预测块确定为最优前向预测块。
407、在后向参考图像中搜索至少一个后向预测块,该至少一个后向预测块的像素值的位宽为目标位宽。
408、从至少一个后向预测块中确定最优后向预测块,其中,该最优后向预测块的位宽为目标位宽。
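When the optimal backward prediction block is determined from the at least one backward prediction block, the difference between the pixel values of each of the at least one backward prediction block and the pixel values of the initial prediction block may be determined, and the prediction block among the at least one backward prediction block whose pixel values differ least from the pixel values of the initial prediction block may be determined as the optimal backward prediction block.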
在步骤405和步骤407中,在前向参考图像或者后向参考图像中进行搜索时可以以整像素步长进行搜索(或者称为运动搜索),以得到至少一个前向预测块以及至少一个后向预测块。在以整像素步长进行搜索时,搜索起始点既可以整像素也可以是分像素,例如,整像素,1/2像素,1/4像素,1/8像素以及1/16像素等等。
例如,如图8所示,在以整像素步长进行搜索时,可以以(0,0)为搜索起始点,得到一个前向预测块,接下来,还可以再以(0,0)的周围的8个像素点为搜索点,继续进行搜索,再得到8个前向预测块。
另外,在步骤405和步骤407中搜索至少一个前向预测块和至少一个后向预测块时,也可以直接以分像素步长进行搜索,或者,既进行整像素步长搜索又进行分像素步长搜索。
在步骤405和步骤407中,可以在搜索的过程中使用高位宽的像素值,从而使得搜索得到的至少一个预测块的像素值为高位宽的像素值,接下来,再对至少一个预测块的像素值进行位宽移位和限位操作,使得搜索得到的至少一个预测块的像素值变为目标位宽的像素值。
具体地,可以根据公式(6)对搜索得到的前向预测块的像素值进行位宽移位和限位操作。
predSamplesL0’[x][y]=Clip3(0,(1<<bitDepth)-1,(predSamplesL0[x][y]+offset2)>>shift2) (6)
其中,predSamplesL0为搜索到的前向预测块,predSamplesL0’为对predSamplesL0进行位宽移位和限位操作处理后的前向预测块,predSamplesL0[x][y]为搜索到的前向预测块中像素点(x,y)的像素值,predSamplesL0’[x][y]为进行位宽移位和限位操作处理后的前向预测块中的像素点(x,y)的像素值,shift2表示位宽差,offset2等于1<<(shift2-1),用于在计算过程中进行四舍五入。
对于搜索到的后向预测块,也可以采用公式(6)对搜索得到后向预测块进行位宽移位和限位操作,此时,predSamplesL0表示搜索到的后向预测块,predSamplesL0’为对predSamplesL0进行位宽移位和限位操作处理后的后向预测块。
It should be understood that any search method may be used when the integer-pixel search is performed in steps 405 and 407.
在步骤406和步骤408中,在计算每个前向预测块的像素值与匹配预测块的像素值的差异,以及每个后向预测块的像素值与匹配预测块的像素值的差异时,可以采用SAD、SATD或者绝对平方差和等来衡量每个前向预测块的像素值与匹配预测块的像素值的差异。但是本申请不限于此,还可以采用其他一些可以用于描述两个预测块之间相似性的参数。
409、根据最优前向预测块和最优后向预测块确定当前图像块的像素值的预测值,其中,当前图像块的像素值的预测值为目标位宽。
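When the predicted values of the pixel values of the current image block are determined according to the optimal forward prediction block and the optimal backward prediction block, the pixel values of the optimal forward prediction block obtained in step 406 and the pixel values of the optimal backward prediction block obtained in step 408 may be weighted, and the weighted pixel values may be used as the predicted values of the pixel values of the current image block.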
具体地,可以根据公式(7)得到当前图像块的像素值的预测值。
predSamples’[x][y]=(predSamplesL0’[x][y]+predSamplesL1’[x][y]+1)>>1 (7)
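Here, predSamplesL0’ is the optimal forward prediction block, predSamplesL1’ is the optimal backward prediction block, and predSamples’ is the final prediction block of the current image block; predSamplesL0’[x][y] is the pixel value of the optimal forward prediction block at pixel (x, y), predSamplesL1’[x][y] is the pixel value of the optimal backward prediction block at pixel (x, y), and predSamples’[x][y] is the pixel value of the final prediction block at pixel (x, y).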
应理解,图10所示的方法与图7所示的方法相比,步骤402至步骤408中得到的前向预测块,后向预测块,初始预测块,至少一个前向预测块,至少一个后向预测块以及最优前向预测块以及最优后向预测块的像素值均为目标位宽的像素值。而在图7所示的方法中,步骤202至步骤208中相对应的预测块的像素值的位宽均为高位宽的像素值。图7所示的方法保证了图像预测的准确性,而图10所示的方法减少了图像预测的复杂度。
上文结合图3至图10对本申请实施例的图像预测方法进行了详细的描述,应理解,本申请实施例的图像预测方法可以对应于图1和图2所示的帧间预测,本申请实施例的图像预测方法可以发生在图1和图2所示的帧间预测过程中,本申请实施例的图像预测方法可以具体由编码器或者解码器中的帧间预测模块来执行。另外,本申请实施例的图像预测方法可以在可能需要对视频图像进行编码和/或解码的任何电子设备或者装置内实施。
下面结合图11和图12对本申请实施例的图像预测装置进行详细的描述。其中,图11所示的图像预测装置与图3和图7所示的方法相对应,能够执行图3和图7所示的方法中的各个步骤;图12所示的图像预测装置与图9和图10所示的方法相对应,能够执行图9和图10所示的方法中的各个步骤。为了简洁,下面适当省略重复的描述。
图11是本申请实施例的图像预测装置的示意性框图。图11所示的装置600包括:
获取模块601,所述获取模块601用于:获取图像块的预测运动信息;根据所述预测运动信息通过插值滤波器在参考图像中获得所述图像块对应的第一预测块和第二预测块,其中,所述插值滤波器的增益大于1;
处理模块602,所述处理模块602用于:根据所述第一预测块和所述第二预测块,得到初始预测块,其中,所述初始预测块、所述第一预测块以及所述第二预测块的像素值的位宽均相同;根据所述预测运动信息在所述参考图像中进行搜索,得到所述图像块对应的M个预测块,其中,M为预设值,M为大于1的整数;根据所述M个预测块和所述初始预测块,确定所述图像块的目标预测块,其中,所述目标预测块与所述初始预测块的像素值的位宽相同;
预测模块603,用于根据所述目标预测块的像素值,得到所述图像块的像素值的预测值。
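In this application, because the gain of the interpolation filter is greater than 1, the bit width of the pixel values of the first prediction block and the second prediction block is greater than the bit width of the finally obtained reconstructed pixel values of the image block. Furthermore, because the pixel values of the first prediction block, the second prediction block, the initial prediction block and the target prediction block all have the same bit width, the bit width of the pixel values of the finally obtained target prediction block is also greater than the bit width of the reconstructed pixel values of the image block. The predicted values of the pixel values of the image block can therefore be determined directly from the pixel values of the target prediction block, which already have the higher bit width, without having to perform motion compensation again to obtain a prediction block with high-bit-width pixel values first. This saves a motion compensation operation and reduces the complexity of image prediction.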
可选地,作为一个实施例,所述获取模块601具体用于:
所述参考图像为前向参考图像,根据所述预测运动信息通过插值滤波器在所述前向参考图像中获得所述第一预测块和所述第二预测块;或者,
所述参考图像为后向参考图像,根据所述预测运动信息通过插值滤波器在所述后向参考图像中获得所述第一预测块和所述第二预测块;或者,
所述参考图像包含前向参考图像和后向参考图像,根据所述预测运动信息通过插值滤 波器分别在所述前向参考图像和所述后向参考图像中获得所述第一预测块和所述第二预测块。
可选地,作为一个实施例,所述处理模块602具体用于:
所述参考图像为前向参考图像,根据所述预测运动信息在所述前向参考图像中进行搜索,得到所述图像块对应的M个预测块;或者,
所述参考图像为后向参考图像,根据所述预测运动信息在所述后向参考图像中进行搜索,得到所述图像块对应的M个预测块;
所述根据所述图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块,包括:
将所述M个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为所述目标预测块。
可选地,作为一个实施例,所述处理模块602具体用于:
所述参考图像包含前向参考图像和后向参考图像,根据所述预测运动信息在前向参考图像中进行搜索,得到所述图像块对应的A个预测块;
根据所述预测运动信息在后向参考图像中进行搜索,得到所述图像块对应的B个预测块,其中,A和B均为大于0的整数,A+B=M;
根据所述图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块,包括:
将所述A个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为第一目标预测块;
将所述B个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为第二目标预测块;
根据所述第一目标预测块和所述第二目标预测块确定所述目标预测块。
可选地,作为一个实施例,所述处理模块602具体用于:
所述参考图像为第一方向参考图像,根据所述预测运动信息在所述第一方向参考图像中进行搜索,得到所述图像块对应的M个预测块;
根据所述图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块,包括:
将所述图像块对应的M个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为第一目标预测块;
确定所述图像块指向所述第一目标预测块的第一运动矢量;
根据所述第一运动矢量按照预设规则确定第二运动矢量;
根据所述第二运动矢量在第二方向参考图像中确定所述图像块对应的第二目标预测块,其中,所述第一方向参考图像和所述第二方向参考图像分别为前向参考图像和后向参考图像,或者,所述第一方向参考图像和所述第二方向参考图像分别为后向参考图像和前向参考图像;
根据所述第一目标预测块和所述第二目标预测块,确定所述目标预测块。
可选地,作为一个实施例,在获取图像块的预测运动信息之前,所述获取模块601还用于从所述图像块的码流中获取指示信息,其中,所述指示信息用于指示获取图像块的预 测运动信息,所述指示信息携带在所述图像块的序列参数集、图像参数集或者条带头中的任意一种。
可选地,作为一个实施例,所述获取模块601还用于:
获得所述图像块指向所述目标预测块的运动矢量;
所述处理模块602用于根据所述图像块指向所述目标预测块的运动矢量,得到所述图像块的运动矢量,其中,所述图像块的运动矢量用于对其它图像块进行预测。
应理解,上述装置600可执行上述图3和图7所示的图像预测的方法,装置600具体可以是视频编码装置、视频解码装置、视频编解码系统或者其他具有视频编解码功能的设备。装置600既可以用于在编码过程中进行图像预测,也可以用于在解码过程中进行图像预测。
图12是本申请实施例的图像预测装置的示意性框图。图12所示的装置800包括:
获取模块801,所述获取模块801用于:获取图像块的预测运动信息;根据所述预测运动信息通过插值滤波器在参考图像中获得所述图像块对应的第一预测块和第二预测块,其中,所述插值滤波器的增益大于1;
处理模块802,所述处理模块用于:对所述第一预测块和所述第二预测块的像素值进行移位操作,使得所述第一预测块和所述第二预测块的像素值的位宽减小到目标位宽,其中,所述目标位宽为所述图像块的重建像素值的位宽;根据所述第一预测块和所述第二预测块,得到初始预测块,其中,所述初始预测块、所述第一预测块以及所述第二预测块的像素值的位宽均相同;根据所述预测运动信息在所述参考图像中进行搜索,得到所述图像块对应的M个预测块,M为预设值,其中,M为大于1的整数;根据所述图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块,其中,所述目标预测块与所述初始预测块的像素值的位宽相同;
预测模块803,用于根据所述目标预测块的像素值,得到所述图像块的像素值的预测值。
本申请中,由于初始预测块和目标预测块的像素值的位宽均为目标位宽,因此,在对图像进行预测的过程中,能够减少像素值的在不同位宽之间的来回转换,并且根据像素值位宽为目标位宽的目标预测块来确定图像块的像素值的预测值,而不再进行运动补偿获取具有高位宽的像素值的预测块之后再确定图像块的像素值的预测值,节省了运动补偿的操作,简化了图像预测的流程,降低了图像预测的复杂度。
可选地,作为一个实施例,所述获取模块801具体用于:
所述参考图像为前向参考图像,根据所述预测运动信息通过插值滤波器在所述前向参考图像中获得所述第一预测块和所述第二预测块;或者,
所述参考图像为后向参考图像,根据所述预测运动信息通过插值滤波器在所述后向参考图像中获得所述第一预测块和所述第二预测块;或者,
所述参考图像包含前向参考图像和后向参考图像,根据所述预测运动信息通过插值滤波器分别在所述前向参考图像和所述后向参考图像中获得所述第一预测块和所述第二预测块。
可选地,作为一个实施例,所述处理模块802具体用于:
所述参考图像为前向参考图像,根据所述预测运动信息在所述前向参考图像中进行搜 索,得到所述图像块对应的M个预测块;或者,
所述参考图像为后向参考图像,根据所述预测运动信息在所述后向参考图像中进行搜索,得到所述图像块对应的M个预测块;
所述根据所述图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块,包括:
将所述M个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为所述目标预测块。
可选地,作为一个实施例,所述处理模块802具体用于:
所述参考图像包含前向参考图像和后向参考图像,根据所述预测运动信息在前向参考图像中进行搜索,得到所述图像块对应的A个预测块;
根据所述预测运动信息在后向参考图像中进行搜索,得到所述图像块对应的B个预测块,其中,A和B均为大于0的整数,A+B=M;
根据所述图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块,包括:
将所述A个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为第一目标预测块;
将所述B个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为第二目标预测块;
根据所述第一目标预测块和所述第二目标预测块确定所述目标预测块。
可选地,作为一个实施例,所述处理模块802具体用于:
所述参考图像为第一方向参考图像,根据所述预测运动信息在所述第一方向参考图像中进行搜索,得到所述图像块对应的M个预测块;
根据所述图像块对应的M个预测块和所述初始预测块,确定所述图像块的目标预测块,包括:
将所述图像块对应的M个预测块中像素值与所述初始预测块的像素值的差异最小的预测块确定为第一目标预测块;
确定所述图像块指向所述第一目标预测块的第一运动矢量;
根据所述第一运动矢量按照预设规则确定第二运动矢量;
根据所述第二运动矢量在第二方向参考图像中确定所述图像块对应的第二目标预测块,其中,所述第一方向参考图像和所述第二方向参考图像分别为前向参考图像和后向参考图像,或者,所述第一方向参考图像和所述第二方向参考图像分别为后向参考图像和前向参考图像;
根据所述第一目标预测块和所述第二目标预测块,确定所述目标预测块。
可选地,作为一个实施例,在获取图像块的预测运动信息之前,所述获取模块801还用于从所述图像块的码流中获取指示信息,其中,所述指示信息用于指示获取图像块的预测运动信息,所述指示信息携带在所述图像块的序列参数集、图像参数集或者条带头中的任意一种。
可选地,作为一个实施例,所述获取模块801还用于:
获得所述图像块指向所述目标预测块的运动矢量;
所述处理模块802用于根据所述图像块指向所述目标预测块的运动矢量,得到所述图像块的运动矢量,其中,所述图像块的运动矢量用于对其它图像块进行预测。
应理解,上述装置800可执行上述图9和图10所示的图像预测的方法,装置800具体可以是视频编码装置、视频解码装置、视频编解码系统或者其他具有视频编解码功能的设备。装置800既可以用于在编码过程中进行图像预测,也可以用于在解码过程中进行图像预测。
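This application further provides a terminal device, including: a memory configured to store a program; and a processor configured to execute the program stored in the memory, where when the program is executed, the processor is configured to perform the image prediction method of the embodiments of this application.
The terminal device here may be a video display device, a smartphone, a portable computer, or another device that can process or play video.
This application further provides a video encoder, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiments of this application.
This application further provides a video decoder, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiments of this application.
This application further provides a video encoding system, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiments of this application.
This application further provides a computer-readable medium, where the computer-readable medium stores program code to be executed by a device, and the program code includes instructions for performing the image prediction method of the embodiments of this application.
This application further provides a decoder, where the decoder includes the image prediction apparatus of the embodiments of this application (for example, the apparatus 600 or the apparatus 800) and a reconstruction module, and the reconstruction module is configured to obtain reconstructed pixel values of the image block according to the predicted values of the pixel values of the image block obtained by the image prediction apparatus.
This application further provides an encoder, where the encoder includes the image prediction apparatus of the embodiments of this application (for example, the apparatus 600 or the apparatus 800) and a reconstruction module, and the reconstruction module is configured to obtain reconstructed pixel values of the image block according to the predicted values of the pixel values of the image block obtained by the image prediction apparatus.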
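What a reconstruction module typically does can be sketched as follows. The text states only that reconstructed pixel values are derived from the predicted values; adding a decoded residual and clipping to the valid pixel range, as done here, is the conventional approach in block-based codecs and is an assumption in this context. The function name `reconstruct` and the `bit_depth` parameter are hypothetical.

```python
def reconstruct(pred, residual, bit_depth=8):
    """Add the residual to the prediction and clip to [0, 2**bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residual)
    ]

pred = [[250, 10], [128, 0]]
residual = [[10, -3], [0, -5]]
recon = reconstruct(pred, residual)  # [[255, 7], [128, 0]]
```

The clipping step keeps the reconstructed pixel values at the bit width of the reconstructed image even when prediction plus residual falls outside that range.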
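FIG. 13 is a schematic block diagram of a video encoder according to an embodiment of this application. The video encoder 1000 shown in FIG. 13 includes an encoder-side prediction module 1001, a transform and quantization module 1002, an entropy encoding module 1003, an encoding reconstruction module 1004, and an encoder-side filtering module.
The video encoder 1000 shown in FIG. 13 can encode video. Specifically, the video encoder 1000 can perform the video encoding process shown in FIG. 1 to encode video. In addition, the video encoder 1000 can perform the image prediction method of the embodiments of this application, including each step of the image prediction methods shown in FIG. 3, FIG. 7, FIG. 9, and FIG. 10. The image prediction apparatus of the embodiments of this application may also be the encoder-side prediction module 1001 in the video encoder 1000. Specifically, the apparatus 600 and the apparatus 800 shown in FIG. 11 and FIG. 12 correspond to the encoder-side prediction module 1001 in the video encoder 1000.
FIG. 14 is a schematic block diagram of a video decoder according to an embodiment of this application. The video decoder 2000 shown in FIG. 14 includes an entropy decoding module 2001, an inverse transform and inverse quantization module 2002, a decoder-side prediction module 2003, a decoding reconstruction module 2004, and a decoder-side filtering module 2005.
The video decoder 2000 shown in FIG. 14 can decode video. Specifically, the video decoder 2000 can perform the video decoding process shown in FIG. 2 to decode video. In addition, the video decoder 2000 can perform the image prediction method of the embodiments of this application, including each step of the image prediction methods shown in FIG. 3, FIG. 7, FIG. 9, and FIG. 10. The image prediction apparatus of the embodiments of this application may also be the decoder-side prediction module 2003 in the video decoder 2000. Specifically, the apparatus 600 and the apparatus 800 shown in FIG. 11 and FIG. 12 correspond to the decoder-side prediction module 2003 in the video decoder 2000.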
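The application scenarios of the image prediction method of the embodiments of this application are described below with reference to FIG. 15 to FIG. 17. The image prediction method of the embodiments of this application can be performed by the video transmission system, the codec apparatus, and the codec system shown in FIG. 15 to FIG. 17.
FIG. 15 is a schematic block diagram of a video transmission system according to an embodiment of this application.
As shown in FIG. 15, the video transmission system includes a capture module 3001, an encoding module 3002, a sending module 3003, network transmission 3004, a receiving module 3005, a decoding module 3006, a rendering module 3007, and a display module 208.
The specific functions of the modules in the video transmission system are as follows:
The capture module 3001 includes a camera or a camera group and is configured to capture video images and to process the captured video images before encoding, converting optical signals into a digitized video sequence;
the encoding module 3002 is configured to encode the video sequence to obtain a bitstream;
the sending module 3003 is configured to send out the encoded bitstream;
the receiving module 3005 is configured to receive the bitstream sent by the sending module 3003;
the network 3004 is configured to transmit the bitstream sent by the sending module 3003 to the receiving module 3005;
the decoding module 3006 is configured to decode the bitstream received by the receiving module 3005 and reconstruct the video sequence;
the rendering module 3007 is configured to render the reconstructed video sequence decoded by the decoding module 3006 to improve the display quality of the video.
The video transmission system shown in FIG. 15 can perform the image prediction method of the embodiments of this application. Specifically, both the encoding module 3002 and the decoding module 3006 in the video transmission system shown in FIG. 15 can perform the image prediction method of the embodiments of this application. In addition, the capture module 3001, the encoding module 3002, and the sending module 3003 in the video transmission system shown in FIG. 15 correspond to the video encoder 1000 shown in FIG. 13. The receiving module 3005, the decoding module 3006, and the rendering module 3007 in the video transmission system shown in FIG. 15 correspond to the video decoder 2000 shown in FIG. 14.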
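The codec apparatus and the codec system formed by codec apparatuses are described in detail below with reference to FIG. 16 and FIG. 17. It should be understood that the codec apparatus and the codec system shown in FIG. 16 and FIG. 17 can perform the image prediction method of the embodiments of this application.
FIG. 16 is a schematic diagram of a video codec apparatus according to an embodiment of this application. The video codec apparatus 50 may be an apparatus dedicated to encoding and/or decoding video images, or an electronic device with video encoding and decoding functions. Further, the codec apparatus 50 may be a mobile terminal or user equipment of a wireless communication system.
The codec apparatus 50 may include the following modules or units: a controller 56, a codec 54, a radio interface 52, an antenna 44, a smart card 46, a card reader 48, a keypad 34, a memory 58, an infrared port 42, and a display 32. In addition to the modules and units shown in FIG. 16, the codec apparatus 50 may include a microphone or any suitable audio input module, which may be a digital or analog signal input. The codec apparatus 50 may also include an audio output module, which may be an earphone, a speaker, or an analog or digital audio output connection. The codec apparatus 50 may also include a battery, which may be a solar cell, a fuel cell, or the like. The infrared port of the codec apparatus 50 may be used for short-range line-of-sight communication with other devices, and the codec apparatus 50 may also communicate with other devices using any suitable short-range communication method, for example, a Bluetooth wireless connection or a USB/FireWire wired connection.
The memory 58 can store data in the form of images and audio data, and can also store instructions to be executed on the controller 56.
The codec 54 can encode and decode audio and/or video data, or can assist in encoding and decoding audio and/or video data under the control of the controller 56.
The smart card 46 and the card reader 48 can provide user information, and can also provide authentication information for network authentication and for authorizing the user. The smart card 46 and the card reader 48 may be implemented as a Universal Integrated Circuit Card (UICC) and a UICC reader.
The radio interface circuit 52 can generate wireless communication signals, which may be communication signals generated during communication with a cellular communication network, a wireless communication system, or a wireless local area network.
The antenna 44 is configured to send the radio frequency signals generated by the radio interface circuit 52 to one or more other apparatuses, and can also be used to receive radio frequency signals from one or more other apparatuses.
In some embodiments of this application, the codec apparatus 50 may receive video image data to be processed from another device before transmission and/or storage. In other embodiments of this application, the codec apparatus 50 may receive images over a wireless or wired connection and encode/decode the received images.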
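FIG. 17 is a schematic block diagram of a video codec system 7000 according to an embodiment of this application.
As shown in FIG. 17, the video codec system 7000 includes a source apparatus 4000 and a destination apparatus 5000. The source apparatus 4000 generates encoded video data and may also be called a video encoding apparatus or video encoding device. The destination apparatus 5000 can decode the encoded video data generated by the source apparatus 4000 and may also be called a video decoding apparatus or video decoding device.
The source apparatus 4000 and the destination apparatus 5000 may each be implemented as any one of the following devices: a desktop computer, a mobile computing apparatus, a notebook (for example, laptop) computer, a tablet computer, a set-top box, a smartphone, a handset, a television, a camera, a display apparatus, a digital media player, a video game console, an in-vehicle computer, or another similar device.
The destination apparatus 5000 can receive the encoded video data from the source apparatus 4000 via a channel 6000. The channel 6000 may include one or more media and/or apparatuses capable of moving the encoded video data from the source apparatus 4000 to the destination apparatus 5000. In one example, the channel 6000 may include one or more communication media that enable the source apparatus 4000 to transmit the encoded video data directly to the destination apparatus 5000 in real time. In this example, the source apparatus 4000 may modulate the encoded video data according to a communication standard (for example, a wireless communication protocol) and may transmit the modulated video data to the destination apparatus 5000. The one or more communication media may include wireless and/or wired communication media, for example, the radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network (for example, a local area network, a wide area network, or a global network such as the Internet). The one or more communication media may include routers, switches, base stations, or other devices that enable communication from the source apparatus 4000 to the destination apparatus 5000.
In another example, the channel 6000 may include a storage medium that stores the encoded video data generated by the source apparatus 4000. In this example, the destination apparatus 5000 can access the storage medium via disk access or card access. The storage medium may include a variety of locally accessible data storage media, such as a Blu-ray disc, a Digital Video Disc (DVD), a Compact Disc Read-Only Memory (CD-ROM), flash memory, or other suitable digital storage media for storing encoded video data.
In another example, the channel 6000 may include a file server or another intermediate storage apparatus that stores the encoded video data generated by the source apparatus 4000. In this example, the destination apparatus 5000 can access, via streaming or download, the encoded video data stored on the file server or other intermediate storage apparatus. The file server may be a type of server capable of storing the encoded video data and transmitting it to the destination apparatus 5000. For example, file servers may include World Wide Web (Web) servers (for example, for websites), File Transfer Protocol (FTP) servers, Network Attached Storage (NAS) apparatuses, and local disk drives.
The destination apparatus 5000 can access the encoded video data via a standard data connection (for example, an Internet connection). Example types of data connections include wireless channels suitable for accessing encoded video data stored on a file server, wired connections (for example, a cable modem), or a combination of both. The transmission of the encoded video data from the file server may be a streaming transmission, a download transmission, or a combination of both.
The image prediction method of this application is not limited to wireless application scenarios. For example, the image prediction method of this application can be applied to video encoding and decoding that supports a variety of multimedia applications such as the following: over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (for example, via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, the video codec system 7000 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.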
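In FIG. 17, the source apparatus 4000 includes a video source 4001, a video encoder 4002, and an output interface 4003. In some examples, the output interface 4003 may include a modulator/demodulator (modem) and/or a transmitter. The video source 4001 may include a video capture apparatus (for example, a video camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of these video data sources.
The video encoder 4002 can encode video data from the video source 4001. In some examples, the source apparatus 4000 transmits the encoded video data directly to the destination apparatus 5000 via the output interface 4003. The encoded video data may also be stored on a storage medium or a file server for later access by the destination apparatus 5000 for decoding and/or playback.
In the example of FIG. 17, the destination apparatus 5000 includes an input interface 5003, a video decoder 5002, and a display apparatus 5001. In some examples, the input interface 5003 includes a receiver and/or a modem. The input interface 5003 can receive the encoded video data via the channel 6000. The display apparatus 5001 may be integrated with the destination apparatus 5000 or may be external to it. In general, the display apparatus 5001 displays the decoded video data. The display apparatus 5001 may be any of a variety of display apparatuses, such as a liquid crystal display, a plasma display, an organic light-emitting diode display, or another type of display apparatus.
The video encoder 4002 and the video decoder 5002 may operate according to a video compression standard (for example, the High Efficiency Video Coding H.265 standard) and may conform to the High Efficiency Video Coding (HEVC) Test Model (HM). The text of the H.265 standard, ITU-T H.265 (V3) (04/2015), was published on April 29, 2015 and can be downloaded from http://handle.itu.int/11.1002/7000/12455; the entire contents of that document are incorporated herein by reference.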
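A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of this application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.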
Claims (26)
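- An image prediction method, comprising: obtaining predicted motion information of an image block; obtaining, through an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block in a reference image, wherein a gain of the interpolation filter is greater than 1; obtaining an initial prediction block according to the first prediction block and the second prediction block, wherein pixel values of the initial prediction block, the first prediction block, and the second prediction block all have the same bit width; searching the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, wherein M is a preset value and M is an integer greater than 1; determining a target prediction block of the image block according to the M prediction blocks and the initial prediction block, wherein pixel values of the target prediction block and of the initial prediction block have the same bit width; and obtaining predicted values of pixel values of the image block according to the pixel values of the target prediction block.
- The method according to claim 1, wherein obtaining, through the interpolation filter according to the predicted motion information, the first prediction block and the second prediction block corresponding to the image block in the reference image comprises: when the reference image is a forward reference image, obtaining the first prediction block and the second prediction block from the forward reference image through the interpolation filter according to the predicted motion information; or when the reference image is a backward reference image, obtaining the first prediction block and the second prediction block from the backward reference image through the interpolation filter according to the predicted motion information; or when the reference image includes a forward reference image and a backward reference image, obtaining, through the interpolation filter according to the predicted motion information, the first prediction block from the forward reference image and the second prediction block from the backward reference image.
- The method according to claim 1 or 2, wherein searching the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block comprises: when the reference image is a forward reference image, searching the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or when the reference image is a backward reference image, searching the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- The method according to claim 1 or 2, wherein searching the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block comprises: when the reference image includes a forward reference image and a backward reference image, searching the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and searching the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, wherein A and B are both integers greater than 0, and A+B=M; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The method according to claim 1 or 2, wherein searching the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block comprises: when the reference image is a first-direction reference image, searching the first-direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the M prediction blocks corresponding to the image block whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, according to the second motion vector, a second target prediction block corresponding to the image block in a second-direction reference image, wherein the first-direction reference image and the second-direction reference image are a forward reference image and a backward reference image, respectively, or the first-direction reference image and the second-direction reference image are a backward reference image and a forward reference image, respectively; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The method according to any one of claims 1 to 5, wherein before obtaining the predicted motion information of the image block, the method further comprises: obtaining indication information from a bitstream of the image block, wherein the indication information indicates that the predicted motion information of the image block is to be obtained, and the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block.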
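- An image prediction method, comprising: obtaining predicted motion information of an image block; obtaining, through an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block in a reference image, wherein a gain of the interpolation filter is greater than 1; performing a shift operation on pixel values of the first prediction block and the second prediction block, so that the bit width of the pixel values of the first prediction block and the second prediction block is reduced to a target bit width, wherein the target bit width is the bit width of reconstructed pixel values of the image block; obtaining an initial prediction block according to the first prediction block and the second prediction block, wherein pixel values of the initial prediction block, the first prediction block, and the second prediction block all have the same bit width; searching the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, wherein M is a preset value and M is an integer greater than 1; determining a target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block, wherein pixel values of the target prediction block and of the initial prediction block have the same bit width; and obtaining predicted values of pixel values of the image block according to the pixel values of the target prediction block.
- The method according to claim 7, wherein obtaining, through the interpolation filter according to the predicted motion information, the first prediction block and the second prediction block corresponding to the image block in the reference image comprises: when the reference image is a forward reference image, obtaining the first prediction block and the second prediction block from the forward reference image through the interpolation filter according to the predicted motion information; or when the reference image is a backward reference image, obtaining the first prediction block and the second prediction block from the backward reference image through the interpolation filter according to the predicted motion information; or when the reference image includes a forward reference image and a backward reference image, obtaining the first prediction block and the second prediction block from the forward reference image and the backward reference image, respectively, through the interpolation filter according to the predicted motion information.
- The method according to claim 7 or 8, wherein searching the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block comprises: when the reference image is a forward reference image, searching the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or when the reference image is a backward reference image, searching the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- The method according to claim 7 or 8, wherein searching the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block comprises: when the reference image includes a forward reference image and a backward reference image, searching the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and searching the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, wherein A and B are both integers greater than 0, and A+B=M; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The method according to claim 7 or 8, wherein searching the reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block comprises: when the reference image is a first-direction reference image, searching the first-direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the M prediction blocks corresponding to the image block whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, according to the second motion vector, a second target prediction block corresponding to the image block in a second-direction reference image, wherein the first-direction reference image and the second-direction reference image are a forward reference image and a backward reference image, respectively, or the first-direction reference image and the second-direction reference image are a backward reference image and a forward reference image, respectively; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The method according to any one of claims 7 to 11, wherein before obtaining the predicted motion information of the image block, the method further comprises: obtaining indication information from a bitstream of the image block, wherein the indication information indicates that the predicted motion information of the image block is to be obtained, and the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block.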
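- An image prediction apparatus, comprising: an obtaining module configured to: obtain predicted motion information of an image block; and obtain, through an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block in a reference image, wherein a gain of the interpolation filter is greater than 1; a processing module configured to: obtain an initial prediction block according to the first prediction block and the second prediction block, wherein pixel values of the initial prediction block, the first prediction block, and the second prediction block all have the same bit width; search the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, wherein M is a preset value and M is an integer greater than 1; and determine a target prediction block of the image block according to the M prediction blocks and the initial prediction block, wherein pixel values of the target prediction block and of the initial prediction block have the same bit width; and a prediction module configured to obtain predicted values of pixel values of the image block according to the pixel values of the target prediction block.
- The apparatus according to claim 13, wherein the obtaining module is specifically configured to: when the reference image is a forward reference image, obtain the first prediction block and the second prediction block from the forward reference image through the interpolation filter according to the predicted motion information; or when the reference image is a backward reference image, obtain the first prediction block and the second prediction block from the backward reference image through the interpolation filter according to the predicted motion information; or when the reference image includes a forward reference image and a backward reference image, obtain the first prediction block and the second prediction block from the forward reference image and the backward reference image, respectively, through the interpolation filter according to the predicted motion information.
- The apparatus according to claim 13 or 14, wherein the processing module is specifically configured to: when the reference image is a forward reference image, search the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or when the reference image is a backward reference image, search the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- The apparatus according to claim 13 or 14, wherein the processing module is specifically configured to: when the reference image includes a forward reference image and a backward reference image, search the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and search the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, wherein A and B are both integers greater than 0, and A+B=M; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The apparatus according to claim 13 or 14, wherein the processing module is specifically configured to: when the reference image is a first-direction reference image, search the first-direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the M prediction blocks corresponding to the image block whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, according to the second motion vector, a second target prediction block corresponding to the image block in a second-direction reference image, wherein the first-direction reference image and the second-direction reference image are a forward reference image and a backward reference image, respectively, or the first-direction reference image and the second-direction reference image are a backward reference image and a forward reference image, respectively; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The apparatus according to any one of claims 13 to 17, wherein before the predicted motion information of the image block is obtained, the obtaining module is further configured to obtain indication information from a bitstream of the image block, wherein the indication information indicates that the predicted motion information of the image block is to be obtained, and the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block.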
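- An image prediction apparatus, comprising: an obtaining module configured to: obtain predicted motion information of an image block; and obtain, through an interpolation filter according to the predicted motion information, a first prediction block and a second prediction block corresponding to the image block in a reference image, wherein a gain of the interpolation filter is greater than 1; a processing module configured to: perform a shift operation on pixel values of the first prediction block and the second prediction block, so that the bit width of the pixel values of the first prediction block and the second prediction block is reduced to a target bit width, wherein the target bit width is the bit width of reconstructed pixel values of the image block; obtain an initial prediction block according to the first prediction block and the second prediction block, wherein pixel values of the initial prediction block, the first prediction block, and the second prediction block all have the same bit width; search the reference image according to the predicted motion information to obtain M prediction blocks corresponding to the image block, wherein M is a preset value and M is an integer greater than 1; and determine a target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block, wherein pixel values of the target prediction block and of the initial prediction block have the same bit width; and a prediction module configured to obtain predicted values of pixel values of the image block according to the pixel values of the target prediction block.
- The apparatus according to claim 19, wherein the obtaining module is specifically configured to: when the reference image is a forward reference image, obtain the first prediction block and the second prediction block from the forward reference image through the interpolation filter according to the predicted motion information; or when the reference image is a backward reference image, obtain the first prediction block and the second prediction block from the backward reference image through the interpolation filter according to the predicted motion information; or when the reference image includes a forward reference image and a backward reference image, obtain the first prediction block and the second prediction block from the forward reference image and the backward reference image, respectively, through the interpolation filter according to the predicted motion information.
- The apparatus according to claim 19 or 20, wherein the processing module is specifically configured to: when the reference image is a forward reference image, search the forward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; or when the reference image is a backward reference image, search the backward reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as the target prediction block, the prediction block among the M prediction blocks whose pixel values differ least from the pixel values of the initial prediction block.
- The apparatus according to claim 19 or 20, wherein the processing module is specifically configured to: when the reference image includes a forward reference image and a backward reference image, search the forward reference image according to the predicted motion information to obtain A prediction blocks corresponding to the image block; and search the backward reference image according to the predicted motion information to obtain B prediction blocks corresponding to the image block, wherein A and B are both integers greater than 0, and A+B=M; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the A prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; determining, as a second target prediction block, the prediction block among the B prediction blocks whose pixel values differ least from the pixel values of the initial prediction block; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The apparatus according to claim 19 or 20, wherein the processing module is specifically configured to: when the reference image is a first-direction reference image, search the first-direction reference image according to the predicted motion information to obtain the M prediction blocks corresponding to the image block; and determining the target prediction block of the image block according to the M prediction blocks corresponding to the image block and the initial prediction block comprises: determining, as a first target prediction block, the prediction block among the M prediction blocks corresponding to the image block whose pixel values differ least from the pixel values of the initial prediction block; determining a first motion vector of the image block pointing to the first target prediction block; determining a second motion vector from the first motion vector according to a preset rule; determining, according to the second motion vector, a second target prediction block corresponding to the image block in a second-direction reference image, wherein the first-direction reference image and the second-direction reference image are a forward reference image and a backward reference image, respectively, or the first-direction reference image and the second-direction reference image are a backward reference image and a forward reference image, respectively; and determining the target prediction block according to the first target prediction block and the second target prediction block.
- The apparatus according to any one of claims 19 to 23, wherein before the predicted motion information of the image block is obtained, the obtaining module is further configured to obtain indication information from a bitstream of the image block, wherein the indication information indicates that the predicted motion information of the image block is to be obtained, and the indication information is carried in any one of a sequence parameter set, a picture parameter set, or a slice header of the image block.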
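- A terminal device, comprising: a memory configured to store a program; and a processor configured to execute the program stored in the memory, wherein when the program is executed, the processor is configured to perform the method according to any one of claims 1 to 6.
- A terminal device, comprising: a memory configured to store a program; and a processor configured to execute the program stored in the memory, wherein when the program is executed, the processor is configured to perform the method according to any one of claims 7 to 12.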
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711086618.4A CN109756737B (zh) | 2017-11-07 | 2017-11-07 | 图像预测方法和装置 |
CN201711086618.4 | 2017-11-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019091372A1 true WO2019091372A1 (zh) | 2019-05-16 |
Family
ID=66401269
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/114146 WO2019091372A1 (zh) | 2017-11-07 | 2018-11-06 | 图像预测方法和装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109756737B (zh) |
WO (1) | WO2019091372A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112135129B (zh) * | 2019-06-25 | 2024-06-04 | 华为技术有限公司 | 一种帧间预测方法及装置 |
WO2021056212A1 (zh) * | 2019-09-24 | 2021-04-01 | 深圳市大疆创新科技有限公司 | 视频编解码方法和装置 |
CN113033424B (zh) * | 2021-03-29 | 2021-09-28 | 广东众聚人工智能科技有限公司 | 一种基于多分支视频异常检测方法和系统 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103650495A (zh) * | 2011-07-01 | 2014-03-19 | 高通股份有限公司 | 分辨率减小的像素内插 |
WO2014166360A1 (en) * | 2013-04-10 | 2014-10-16 | Mediatek Inc. | Method and apparatus for bi-prediction of illumination compensation |
CN105637866A (zh) * | 2013-10-28 | 2016-06-01 | 高通股份有限公司 | 自适应色彩分量间残差预测 |
CN106331722A (zh) * | 2015-07-03 | 2017-01-11 | 华为技术有限公司 | 图像预测方法和相关设备 |
WO2017082698A1 (ko) * | 2015-11-11 | 2017-05-18 | 삼성전자 주식회사 | 비디오 복호화 방법 및 그 장치 및 비디오 부호화 방법 및 그 장치 |
-
2017
- 2017-11-07 CN CN201711086618.4A patent/CN109756737B/zh active Active
-
2018
- 2018-11-06 WO PCT/CN2018/114146 patent/WO2019091372A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN109756737B (zh) | 2020-11-17 |
CN109756737A (zh) | 2019-05-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18876236 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18876236 Country of ref document: EP Kind code of ref document: A1 |