WO2019128716A1 - Image prediction method, apparatus, and codec - Google Patents
- Publication number
- WO2019128716A1 (PCT/CN2018/120681)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reference block
- block
- image
- pixel
- precision
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
- H04N19/139—Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/523—Motion estimation or motion compensation with sub-pixel accuracy
Definitions
- the present application relates to the field of video codec technology, and in particular, to an interframe prediction method and apparatus for video images, and a corresponding encoder and decoder.
- Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones (so-called "smart phones"), video teleconferencing devices, video streaming devices, and the like.
- Digital video devices implement video compression techniques, for example, those defined in the MPEG-2, MPEG-4, ITU-T H.263, and ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC) standards, the H.265/High Efficiency Video Coding (HEVC) standard, and the extensions of such standards.
- Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video compression techniques.
- Video compression techniques perform spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove redundancy inherent in video sequences.
- For block-based video coding, a video slice (i.e., a video frame or a portion of a video frame) may be partitioned into image blocks.
- the image block in the intra-coded (I) slice of the image is encoded using spatial prediction with respect to reference samples in neighboring blocks in the same image.
- An image block in an inter-coded (P or B) slice of an image may use spatial prediction with respect to reference samples in neighboring blocks in the same image or temporal prediction with respect to reference samples in other reference images.
- An image may be referred to as a frame, and a reference image may be referred to as a reference frame.
- Various video coding standards, including the High Efficiency Video Coding (HEVC) standard, propose a predictive coding mode for an image block, that is, predicting the block currently to be coded based on already encoded blocks of video data.
- In the intra prediction mode, the current image block is predicted based on one or more previously decoded neighboring blocks in the same image as the current image block; in the inter prediction mode, the current image block is predicted based on already decoded blocks in different images.
- Inter prediction modes include, for example, the Merge mode, the Skip mode, and the Advanced Motion Vector Prediction (AMVP) mode.
- An embodiment of the present application provides an image prediction method and apparatus, and a corresponding encoder and decoder, in particular, an inter-frame prediction method for video images, which improves the prediction accuracy of motion information of an image block to a certain extent, thereby improving coding and decoding performance.
- According to a first aspect, an embodiment of the present application provides an image prediction method, which includes: acquiring initial predicted motion information of a current image block; determining, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and determining, in a second reference image, a second reference block corresponding to the current image block, where the first reference block includes a first search base point and the second reference block includes a second search base point; determining N third reference blocks in the first reference image; for any one of the N third reference blocks, determining, in the second reference image, a fourth reference block according to the first search base point, the location of the any one third reference block, and the second search base point, to obtain N reference block groups, where one reference block group includes a third reference block and a fourth reference block, and N is greater than or equal to 1; increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision, and calculating an image block matching cost of the N reference block groups at the first pixel precision; determining, in the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, where the target reference block group includes a target third reference block and a target fourth reference block; and obtaining a pixel prediction value of the current image block according to the pixel value of the target third reference block at the first precision and the pixel value of the target fourth reference block at the first precision, where the pixel prediction value of the current image block has a second pixel precision that is lower than the first pixel precision.
- an embodiment of the present application provides an image prediction apparatus, including a plurality of functional units for implementing any one of the methods of the first aspect.
- The apparatus may include: an acquiring unit, configured to acquire initial predicted motion information of a current image block; a determining unit, configured to determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in the first reference image, and determine, in the second reference image, a second reference block corresponding to the current image block, where the first reference block includes a first search base point and the second reference block includes a second search base point; a searching unit, configured to determine N third reference blocks in the first reference image; a mapping unit, configured to determine, for any one of the N third reference blocks, a fourth reference block in the second reference image according to the first search base point, the location of the any one third reference block, and the second search base point, to obtain N reference block groups, where one reference block group includes a third reference block and a fourth reference block, and N is greater than or equal to 1; a calculating unit, configured to increase the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision and calculate an image block matching cost of the N reference block groups at the first pixel precision; a selecting unit, configured to determine, in the N reference block groups, a target reference block group that satisfies an image block matching cost criterion; and a prediction unit, configured to obtain a pixel prediction value of the current image block according to the pixel values of the target third reference block and the target fourth reference block at the first precision.
- In a possible design, the initial predicted motion information includes a reference image index indicating two reference images, which include one forward reference image and one backward reference image.
- In a possible design, the N third reference blocks include the first reference block, and the obtained N fourth reference blocks include the second reference block, where the first reference block and the second reference block belong to one reference block group, that is, they have a corresponding relationship in space. It can also be understood that determining, for any one of the N third reference blocks, a fourth reference block in the second reference image according to the first search base point, the location of the any one third reference block, and the second search base point includes: if the first reference block is a third reference block, the second reference block is correspondingly a fourth reference block.
- In a possible design, determining, for any one of the N third reference blocks, a fourth reference block in the second reference image according to the first search base point, the location of the any one third reference block, and the second search base point includes: determining an ith vector according to the any one third reference block and the first search base point; determining a jth vector according to a time domain interval t1 of the current image block relative to the first reference image, a time domain interval t2 of the current image block relative to the second reference image, and the ith vector, where the jth vector is opposite in direction to the ith vector, and i and j are both positive integers not greater than N; and determining a fourth reference block according to the second search base point and the jth vector. Accordingly, the method can be performed by the mapping unit.
- In a possible design, determining, for any one of the N third reference blocks, a fourth reference block in the second reference image according to the first search base point, the location of the any one third reference block, and the second search base point includes: determining an ith vector according to the any one third reference block and the first search base point; determining a jth vector according to the ith vector, where the jth vector is opposite in direction to the ith vector, and i and j are positive integers not greater than N; and determining a fourth reference block according to the second search base point and the jth vector. Accordingly, the method can be performed by the mapping unit.
- In a possible design, increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision and calculating the image block matching cost of the N reference block groups at the first pixel precision includes: for at least one of the N reference block groups, increasing the pixel values of the obtained third reference block and fourth reference block to the first pixel precision by interpolation or shifting, and calculating an image block matching cost at the first pixel precision. Determining, in the N reference block groups, a target reference block group that satisfies the image block matching cost criterion then includes: determining, as the target reference block group, the first reference block group in the at least one reference block group whose image block matching cost is less than a preset threshold.
- For example, if the image block matching costs of the first reference block groups calculated are not less than the preset threshold, and the image block matching cost of a subsequent reference block group is less than the preset threshold, that reference block group is used as the target reference block group and no further reference block groups are calculated. Accordingly, the method can be performed jointly by the calculating unit and the selecting unit.
- In another possible design, increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision and calculating the image block matching cost of the N reference block groups at the first pixel precision includes: increasing the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision by interpolation or shifting, and calculating an image block matching cost for each of the N reference block groups. Determining, in the N reference block groups, the target reference block group that satisfies the image block matching cost criterion then includes: determining the reference block group with the smallest image block matching cost among the N reference block groups as the target reference block group. For example, six reference block groups are calculated, the image block matching cost of the fourth reference block group is the smallest, and the fourth reference block group is used as the target reference block group. Accordingly, the method can be performed jointly by the calculating unit and the selecting unit.
- In a possible design, obtaining the pixel prediction value of the current image block according to the pixel value of the target third reference block at the first precision and the pixel value of the target fourth reference block at the first precision includes: combining (for example, averaging) the two pixel values and shifting the result down to the second pixel precision. Accordingly, the method can be performed by the prediction unit.
- In a possible design, the initial predicted motion information includes a first motion vector and a second motion vector; determining, according to the initial predicted motion information, the first reference block corresponding to the current image block in the first reference image and the second reference block corresponding to the current image block in the second reference image includes: obtaining the first reference block according to the location of the current image block and the first motion vector, and obtaining the second reference block according to the location of the current image block and the second motion vector. Accordingly, the method can be performed by the determining unit.
- In a possible design, the motion search may be performed with a preset step size by using the search base point of the first reference block as a reference, to search for the N third reference blocks.
- In a possible design, the method further includes: determining the motion vectors corresponding to the target third reference block and the target fourth reference block as the forward optimal motion vector and the backward optimal motion vector, respectively, which provide a motion vector reference for the prediction of subsequent image blocks.
- the above methods and apparatus can be implemented by a processor calling a program and instructions in a memory.
- An embodiment of the present application provides a video encoder, where the video encoder is used to encode an image block and includes any one of the foregoing image prediction apparatuses and a code reconstruction module, where the image prediction apparatus is configured to obtain a prediction value of a pixel value of a current image block, and the code reconstruction module is configured to obtain the reconstructed pixel value of the current image block according to the predicted value of the pixel value of the current image block. Accordingly, the video encoder can perform any of the possible design methods described above.
- An embodiment of the present application provides a video decoder, where the video decoder is used to decode an image block and includes any one of the foregoing image prediction apparatuses and a decoding reconstruction module, where the image prediction apparatus is configured to obtain a prediction value of a pixel value of a current image block, and the decoding reconstruction module is configured to obtain the reconstructed pixel value of the current image block according to the predicted value of the pixel value of the current image block.
- the video decoder can perform any of the possible design methods described above.
- an embodiment of the present application provides an apparatus for encoding video data, where the apparatus includes:
- a memory for storing video data, the video data comprising one or more image blocks;
- a video encoder for encoding an image, and the inter prediction method in the encoding process may adopt any of the above possible design methods.
- an embodiment of the present application provides an apparatus for decoding video data, where the device includes:
- a memory for storing video data, the video data comprising one or more image blocks;
- a video decoder for decoding an image, and the inter prediction method in the decoding process may adopt any of the above possible design methods.
- An embodiment of the present application provides an encoding device, including a non-volatile memory and a processor coupled to each other, where the processor invokes program code stored in the memory to perform some or all of the steps of any method of the first aspect.
- An embodiment of the present application provides a decoding device, including a non-volatile memory and a processor coupled to each other, where the processor invokes program code stored in the memory to perform some or all of the steps of any method of the first aspect.
- An embodiment of the present application provides a computer readable storage medium, which stores program code, where the program code includes instructions for performing some or all of the steps of any method of the first aspect.
- An embodiment of the present application provides a computer program product; when the computer program product is run on a computer, the computer is caused to perform some or all of the steps of any method of the first aspect.
- FIG. 1 is a schematic diagram of a video encoding process in an embodiment of the present application.
- FIG. 2 is a schematic diagram of a video decoding process in an embodiment of the present application.
- FIG. 3 is a schematic diagram of an image prediction method according to an embodiment of the present application.
- FIG. 4 is a schematic diagram of an inter prediction mode in an embodiment of the present application.
- FIG. 5 is a schematic diagram of another inter prediction mode in the embodiment of the present application.
- FIG. 6 is a schematic diagram of a search reference block in an embodiment of the present application.
- FIG. 7 is a schematic diagram of an image prediction apparatus according to an embodiment of the present application.
- FIG. 8 is a schematic block diagram of a video encoder in an embodiment of the present application.
- FIG. 9 is a schematic block diagram of a video decoder in an embodiment of the present application.
- FIG. 10 is a schematic block diagram of a video transmission system in an embodiment of the present application.
- FIG. 11 is a schematic diagram of a video codec apparatus in an embodiment of the present application.
- FIG. 12 is a schematic block diagram of a video codec system in an embodiment of the present application.
- the image prediction method in the present application can be applied to the field of video codec technology.
- the video codec is first introduced below.
- a video generally consists of a number of frame images in a certain order.
- In one frame of image there is often a large amount of space with the same or similar structure; that is to say, there is a large amount of spatially redundant information in a video file.
- There is also temporally redundant information in a video file, which is caused by the composition of the video.
- The frame rate of video sampling is generally 25 to 60 frames per second, that is, the sampling interval between adjacent frames is 1/60 to 1/25 of a second. In such a short period of time, the sampled images basically contain a large amount of similar information, and there is a strong correlation between the images.
- Visual redundancy refers to compressing the video bit stream appropriately by exploiting the fact that the human eye is sensitive to changes in luminance but relatively less sensitive to changes in chrominance.
- The sensitivity of human vision to brightness changes tends to decrease, and the eye is more sensitive to the edges of objects; in addition, the human eye is relatively insensitive to internal areas and sensitive to the overall structure. Since the final receiver of the video image is the human eye, these characteristics of the human eye can be fully exploited to compress the original video image and achieve better compression.
- In addition, video image information also has other forms of redundancy, such as information entropy redundancy, structural redundancy, knowledge redundancy, and importance redundancy.
- the purpose of video coding (also referred to as video compression coding) is to use various technical methods to remove redundant information in a video sequence to reduce storage space and save transmission bandwidth.
- Chroma sampling: this method makes full use of the visual and psychological characteristics of the human eye and tries to minimize, starting from the underlying data representation, the amount of data used to describe a single element.
- A commonly used color space is YUV (luminance-chrominance-chrominance). The YUV color space includes a luminance signal Y and two color difference signals U and V, and the three components are independent of each other.
- the YUV color space is more flexible in representation, and the transmission occupies less bandwidth, which is superior to the traditional red, green and blue (RGB) color model.
- The YUV 4:2:0 format indicates that the two chrominance components U and V have only half the samples of the luminance component Y in both the horizontal and vertical directions, that is, among four sampled pixels there are four luminance components Y but only one U and one V chrominance component.
- With this representation, the amount of data is further reduced to only about 33% of the original. Chroma sampling thus makes full use of the physiological visual characteristics of the human eye, and achieving video compression by means of such chroma sampling is one of the widely used video data compression methods.
- Predictive coding uses the data information of the previously encoded frame to predict the frame currently to be encoded.
- a predicted value is obtained by prediction, which is not completely equivalent to the actual value, and there is a certain residual value between the predicted value and the actual value.
- The more accurate the prediction, the closer the predicted value is to the actual value and the smaller the residual value, so encoding the residual value can greatly reduce the amount of data; at the decoding end, the residual value is added to the predicted value to restore and reconstruct the matching image. This is the basic idea of predictive coding.
- predictive coding is divided into two basic types: intra prediction and inter prediction.
- Intra prediction refers to predicting the pixel values of the pixels in the current coding unit by using the pixel values of pixels in the reconstructed region of the current image.
- Inter prediction searches a reconstructed reference image for a reference block that matches the current coding unit in the current image, uses the pixel values of the pixels in the reference block as the prediction information or predicted values of the pixel values of the pixels in the current coding unit, and transmits the motion information of the current coding unit.
- Transform coding: this coding method does not directly encode the original spatial-domain information; instead, it converts the information sample values from the current domain into another, artificially defined domain (commonly called the transform domain) according to some form of transform function, and then performs compression coding according to the distribution characteristics of the information in the transform domain. Video image data tend to have very large data correlation in the spatial domain and therefore a large amount of redundant information, so direct encoding would require a large number of bits. After the information sample values are converted into the transform domain, the correlation of the data is greatly reduced, so that the amount of data required for encoding is greatly reduced, a high compression ratio can be obtained, and better compression can be achieved.
- Typical transform coding methods include the Karhunen-Loève (K-L) transform, the Fourier transform, and the like.
- Quantization coding: the above-mentioned transform coding does not itself compress the data; it is the quantization process that effectively achieves data compression.
- the quantization process is also the main reason for the loss of data in the lossy compression.
- The process of quantization is the process of forcibly mapping a large dynamic range of input values onto a smaller set of output values. Since the range of the quantization input values is large, more bits are needed to represent them, whereas the range of output values after the mapping is small, so they can be represented with only a few bits.
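- As a minimal illustration of this forced mapping, the sketch below shows a uniform scalar quantizer; the step size is a hypothetical value chosen for the example, not one mandated by any standard or by the present application.

```python
def quantize(coeff, step=10.0):
    """Map a coefficient with a wide dynamic range onto a small set of integer levels."""
    return round(coeff / step)

def dequantize(level, step=10.0):
    """Reconstruct an approximation of the original coefficient (lossy)."""
    return level * step

# A wide range of inputs collapses onto a few output levels, which is
# where the (lossy) reduction in the amount of data comes from.
for c in [3.2, 7.9, 12.4, 48.0, 53.5]:
    level = quantize(c)
    print(c, "->", level, "->", dequantize(level))
```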
- the encoder control module selects the coding mode adopted by the image block according to the local characteristics of different image blocks in the video frame.
- the intra-predictive coded block is subjected to frequency domain or spatial domain prediction
- the inter-predictive coded block is subjected to motion compensation prediction
- the predicted residual is further transformed and quantized to form a residual coefficient
- The final code stream is generated by the entropy encoder.
- the intra or inter prediction reference signals are obtained by the decoding module at the encoding end.
- the transformed and quantized residual coefficients are reconstructed by inverse quantization and inverse transform, and then added to the predicted reference signal to obtain a reconstructed image.
- the loop filtering performs pixel correction on the reconstructed image to improve the encoding quality of the reconstructed image.
- Figure 1 is a schematic diagram of a video encoding process.
- When performing prediction on the current image block in the current frame Fn, either intra prediction or inter prediction may be used. Specifically, intra-frame coding or inter-frame coding can be selected according to the type of the current frame Fn: for example, intra prediction is used when the current frame Fn is an I frame, and inter prediction is used when the current frame Fn is a P frame or a B frame.
- When intra prediction is adopted, the pixel values of the pixels of the current image block may be predicted by using the pixel values of pixels in the reconstructed area of the current frame Fn; when inter prediction is adopted, the pixel values of the pixels of the current image block may be predicted by using the pixel values of the pixels of the reference block in the reference frame F'n-1 that matches the current image block.
- After the prediction block of the current image block is obtained, the pixel values of the pixels of the current image block are compared with the pixel values of the pixels of the prediction block to obtain residual information, and transform, quantization, and entropy coding are performed on the residual information to obtain an encoded code stream.
- In the encoding process, the residual information of the current frame Fn is superimposed on the prediction information of the current frame Fn, and a filtering operation is performed to obtain a reconstructed frame F'n of the current frame, which is used as a reference frame for subsequent encoding.
- FIG. 2 is a schematic diagram of a video decoding process.
- The video decoding process shown in FIG. 2 is equivalent to the inverse of the video encoding process shown in FIG. 1.
- Specifically, the residual information is obtained by entropy decoding, inverse quantization, and inverse transform, and whether the current image block uses intra prediction or inter prediction is determined according to the decoded code stream.
- If intra prediction is used, the prediction information is constructed according to the intra prediction method by using the pixel values of pixels in the reconstructed region of the current frame; if inter prediction is used, the motion information needs to be parsed, the reference block is determined in the reconstructed image by using the parsed motion information, and the pixel values of the pixels in the reference block are used as the prediction information.
- the prediction information is superimposed with the residual information, and the reconstruction information is obtained through the filtering operation.
- FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
- the method shown in FIG. 3 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
- the method shown in FIG. 3 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 3 can occur in the interframe prediction process at the time of encoding and decoding.
- the method shown in FIG. 3 includes steps 301 to 308, and steps 301 to 308 are described in detail below.
- 301. Acquire initial predicted motion information of a current image block.
- 302. Determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and determine a second reference block corresponding to the current image block in a second reference image, where the first reference block includes a first search base point, the second reference block includes a second search base point, and the pixel value of the first reference block and the pixel value of the second reference block have a first pixel precision.
- the image block here may be one image block in the image to be processed, or may be one sub-image in the image to be processed.
- the image block herein may be an image block to be encoded in the encoding process, or may be an image block to be decoded in the decoding process.
- The foregoing initial predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to the reference image block (usually a motion vector of a neighboring block), and indication information of the reference image (generally understood as reference image information used to determine the reference image), where the motion vector includes a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of a forward reference image block and/or a backward reference image block.
- the position of the forward reference block and the position of the backward reference block can be determined by the motion vector information.
- For example, the first reference image is a forward reference image, and the second reference image is a backward reference image.
- The initial predicted motion information includes a first motion vector and a second motion vector. The position of the first reference block may be obtained according to the position of the current image block and the first motion vector, that is, the first reference block is determined; and the second reference block is obtained according to the position of the current image block and the second motion vector, that is, the second reference block is determined.
- The location of the first reference block and/or the second reference block may be the co-located position of the current image block, or may be obtained according to the co-located position and the motion vector.
- Optionally, the following Mode 1 or Mode 2 may be used to obtain the initial predicted motion information of the image block.
- Mode 1: a candidate predicted motion information list is constructed according to the motion information of the neighboring blocks of the current image block, and one piece of candidate predicted motion information is selected from the candidate predicted motion information list as the initial predicted motion information of the current image block.
- The candidate predicted motion information list includes motion vectors, reference frame index information of reference image blocks, and the like.
- For example, the motion information of the neighboring block A0 is selected as the initial predicted motion information of the current image block: the forward motion vector of A0 is used as the forward predicted motion vector of the current image block, and the backward motion vector of A0 is used as the backward predicted motion vector of the current image block.
- Mode 2: a motion vector predictor list is constructed according to the motion information of the neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as the motion vector predictor of the current image block.
- The motion vector of the current image block may be the motion vector value of the neighboring block, or may be the sum of the motion vector of the selected neighboring block and the motion vector difference of the current image block, where the motion vector difference is the difference between the motion vector obtained by motion estimation of the current image block and the motion vector of the selected neighboring block.
- For example, the motion vectors corresponding to indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block, respectively.
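- A minimal sketch of Mode 2 is shown below; the candidate list, the selected indices, and the motion vector difference values are hypothetical and chosen purely for illustration. The initial motion vector is the selected predictor plus the signalled motion vector difference.

```python
# Hypothetical motion vector predictor list built from neighboring blocks.
mvp_list = [(4, -2), (6, 0), (5, 1)]   # (mv_x, mv_y) candidates

def initial_mv(mvp_index, mvd):
    """Initial predicted MV = selected predictor + signalled motion vector difference."""
    px, py = mvp_list[mvp_index]
    dx, dy = mvd
    return (px + dx, py + dy)

# e.g. index 1 selected for the forward direction and index 2 for the backward
# direction, each with a small (hypothetical) motion vector difference.
forward_mv = initial_mv(1, (1, -1))    # -> (7, -1)
backward_mv = initial_mv(2, (0, 0))    # -> (5, 1)
```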
- The base point can be represented by a coordinate point; it is a kind of position information that can be used to indicate the position of an image block and can also be used as a reference in subsequent image block searches. It may be the top-left corner of an image block, the center point of an image block, or a relative position point specified by other rules, which is not limited in this application.
- The base point of a reference block can be used as a search base point in subsequent search processes, so once the position of a reference block is determined, its search base point is determined.
- Because of the subsequent search operations related to the base points, the base points contained in the first reference block and the second reference block may also be referred to as the first search base point and the second search base point, respectively; they may be predetermined, or may be specified during encoding and decoding.
- For example, if the forward motion vector is (MV0x, MV0y) and the base point of the current image block is (B0x, B0y), then the base point of the forward reference block is (MV0x+B0x, MV0y+B0y).
- The base point of the backward reference block is obtained in the same way, and is not described again in the present application.
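- The base-point arithmetic above can be sketched as follows; the coordinate values are hypothetical, and the backward reference block is handled the same way, only with the backward motion vector.

```python
def reference_base_point(block_base, mv):
    """Base point of the reference block = base point of the current block + motion vector."""
    bx, by = block_base
    mvx, mvy = mv
    return (bx + mvx, by + mvy)

# Forward reference block, as in the text: (MV0x + B0x, MV0y + B0y).
forward_base = reference_base_point((64, 32), (3, -2))    # -> (67, 30)
# Backward reference block uses the backward motion vector in the same way.
backward_base = reference_base_point((64, 32), (-3, 2))   # -> (61, 34)
```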
- The first reference image may refer to the forward reference image, the second reference image may refer to the backward reference image, the first reference block may refer to a forward reference block, and the second reference block may refer to a backward reference block.
- Step 303 includes a search method, and the specific search method can be as follows:
- For example, a motion search with an integer pixel step is performed around the first reference block, with the first reference block (or the first search base point) as the reference.
- The integer pixel step means that the position offset of a candidate search block relative to the first reference block is an integer number of pixels, where the size of the candidate search block may be the same as that of the first reference block; the search process thus determines the locations of candidate search blocks, and the third reference blocks are then determined according to the search rule. It should be pointed out that no matter whether the search base point is at an integer-pixel position or a sub-pixel position (such as 1/2, 1/4, 1/8, or 1/16 pixel), the integer-pixel-step motion search can be performed to obtain the positions of forward reference blocks of the current image block, that is, third reference blocks are determined correspondingly. After some third reference blocks have been found with integer pixel steps, a sub-pixel search can optionally be performed to obtain further third reference blocks, and if there is still a search requirement, an even finer sub-pixel search can be continued.
- For the search method, see FIG. 6, where (0,0) is the search base point. A cross search may be used, searching (0,-1), (0,1), (-1,0), and (1,0) in sequence; or a square search may be used, searching (-1,-1), (1,-1), (-1,1), and (1,1) in sequence. These points are the top-left vertices of the candidate search blocks; once these base points are determined, the reference blocks corresponding to them, that is, the third reference blocks, are also determined.
- The search method is not limited, and any prior-art search method may be adopted; for example, a fractional-pixel-step search can be used in addition to the integer-pixel-step search, or a fractional-pixel-step search can be performed directly. The specific search method is not limited herein.
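- The two integer-pixel patterns mentioned for FIG. 6 can be sketched as follows, generating candidate base points around the search base point; the patterns and their ordering follow the text, while the step size parameter is a hypothetical generalization added for illustration.

```python
def cross_offsets(step=1):
    """Cross search: points directly above, below, left and right of the base point."""
    return [(0, -step), (0, step), (-step, 0), (step, 0)]

def square_offsets(step=1):
    """Square search: the four diagonal neighbours of the base point."""
    return [(-step, -step), (step, -step), (-step, step), (step, step)]

def candidate_base_points(search_base, offsets):
    """Each offset gives the base point of one candidate third reference block."""
    bx, by = search_base
    return [(bx + dx, by + dy) for dx, dy in offsets]

# Around search base point (0, 0), as in the example in the text.
print(candidate_base_points((0, 0), cross_offsets()))
print(candidate_base_points((0, 0), square_offsets()))
```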
- N reference block groups are obtained, wherein one reference block group includes a third reference block and a fourth reference block.
- Optionally, the reference block groups can be found by using a motion vector difference (MVD) mirroring constraint.
- The MVD mirroring constraint here means that if the positional offset of a third reference block (base point) relative to the first reference block (first search base point) is Offset0 (deltaX0, deltaY0), then the image block in the backward reference image whose positional offset relative to the second reference block (second search base point) is Offset1 (deltaX1, deltaY1) = (-deltaX0, -deltaY0) is found as the corresponding fourth reference block.
- Optionally, if the time domain intervals between the image in which the current image block is located and the forward reference image and the backward reference image are different, the motion vector difference (MVD) mirroring constraint can still be used to find the reference block groups.
- In that case, the ith vector and the jth vector are opposite in direction but differ in magnitude.
- Specifically, if the time intervals between the image in which the current image block is located and the forward reference image and the backward reference image are t1 and t2 respectively, the following constraint can be adopted: if the positional offset of the block position (base point) of a third reference block relative to the first reference block (first search base point) is Offset00 (deltaX00, deltaY00), then the image block in the backward reference image whose positional offset relative to the second reference block (second search base point) is Offset01 (deltaX01, deltaY01) is determined as the corresponding fourth reference block, where deltaX01 = -deltaX00 * t2 / t1 and deltaY01 = -deltaY00 * t2 / t1.
- The first image block, the first base point, and the first reference image play equivalent roles here; likewise, the current image block, the base point of the current image block, and the image in which the current image block is located play equivalent roles.
- In essence, what is calculated is the time domain interval between the first reference image and the image in which the current image block is located, and likewise the time domain interval between the second reference image and the image in which the current image block is located, that is, the time interval between frames.
- N reference block groups can be obtained, wherein one reference block group includes one third reference block and one fourth reference block.
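- A sketch of the mapping from a third reference block to its fourth reference block under the mirroring constraint is given below, using the scaling deltaX01 = -deltaX00 * t2 / t1 described above; the rounding of the scaled offset to an integer position is an illustrative assumption, not something specified by the text.

```python
def mirrored_offset(offset0, t1, t2):
    """Offset of the fourth reference block relative to the second search base point,
    given the offset of the third reference block relative to the first search base point."""
    dx0, dy0 = offset0
    return (-dx0 * t2 / t1, -dy0 * t2 / t1)

def fourth_base_point(second_search_base, offset0, t1, t2):
    bx, by = second_search_base
    dx1, dy1 = mirrored_offset(offset0, t1, t2)
    # Rounding to the pixel grid here is an illustrative assumption.
    return (round(bx + dx1), round(by + dy1))

# Equal temporal distances (t1 == t2): the offset is simply mirrored.
print(mirrored_offset((2, -1), t1=1, t2=1))      # -> (-2, 1)
# Unequal distances: the mirrored offset is scaled by t2 / t1.
print(mirrored_offset((2, -1), t1=2, t2=1))      # -> (-1.0, 0.5)
print(fourth_base_point((40, 40), (2, -1), 2, 1))
```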
- The positional offset may refer to the offset between base points, or to the offset between image blocks; it represents a relative position.
- The N reference block groups may include the foregoing first reference block and the foregoing second reference block; that is, the first reference block may be a third reference block, and, correspondingly, the second reference block may be a fourth reference block.
- In this case, both the ith vector and the jth vector are 0: the first reference block is a third reference block, and the second reference block is its corresponding fourth reference block.
- It should be noted that the third reference block and/or the fourth reference block mentioned in the present application are not limited to image blocks at a specific location, but may represent a type of reference block, which may be a specific image block or a plurality of image blocks. For example, the third reference block may be any one of the image blocks found by searching around the first search base point, and the fourth reference block may be the image block corresponding to any one of those image blocks; thus the fourth reference block may also be a specific image block or a plurality of image blocks.
- Step 305: Increase the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision, and calculate the image block matching cost of the N reference block groups at the first pixel precision.
- a reference block group is taken as a specific example to illustrate how to calculate an image block matching cost of a reference block group.
- the reference block group includes a third reference block and a fourth reference block determined corresponding thereto.
- The pixel values of the third reference block and the fourth reference block need to be increased to the higher first pixel precision. The third reference block and the fourth reference block are both image blocks that have already been coded, so their pixel values have the precision of the code stream; for example, if the code stream precision is 8 bits, the pixel precision of the pixel values of the third reference block and the fourth reference block is 8 bits. In order to find reference blocks that are more similar in image content, the precision of the pixels of the third reference block and the fourth reference block needs to be increased; that is, the precision of the image blocks whose image block matching cost is to be calculated needs to be increased to the same, higher precision, for example, 14 bits.
- For example, the 14-bit pixel values of the third reference block are denoted as pi[x, y], and the 14-bit pixel values of the fourth reference block are denoted as pj[x, y], where x and y represent the coordinates.
- An image block matching cost eij is calculated from pi[x, y] and pj[x, y], which may also be referred to as an image block matching error eij.
- There are many ways to calculate the image block matching error, such as the SAD criterion, the MR-SAD criterion, and other evaluation criteria in the prior art; the calculation method of the image block matching error is not limited in the present invention.
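- A minimal sketch of the matching cost eij for one reference block group is shown below, using the SAD criterion over the high-precision samples pi[x, y] and pj[x, y]; the choice of plain SAD (rather than, say, MR-SAD), the block size, and the sample values are assumptions made purely for illustration.

```python
def sad_cost(pi, pj):
    """Sum of absolute differences between two equally sized blocks of
    high-precision (e.g. 14-bit) samples, given as 2-D lists pi[y][x], pj[y][x]."""
    return sum(abs(a - b)
               for row_i, row_j in zip(pi, pj)
               for a, b in zip(row_i, row_j))

# Two tiny 2x2 reference blocks with hypothetical 14-bit sample values.
third_block = [[8192, 8200], [8190, 8210]]
fourth_block = [[8191, 8198], [8195, 8204]]
e_ij = sad_cost(third_block, fourth_block)   # -> 1 + 2 + 5 + 6 = 14
```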
- the image block matching cost calculation described above can be performed for a plurality of reference block groups.
- When the first reference block serves as a third reference block and the second reference block serves as its corresponding fourth reference block, the pixel values of the first reference block and the second reference block are obtained by a motion compensation method.
- Motion compensation refers to pointing to a reconstructed reference image (with pixel precision of the code stream) according to the motion vector, and obtaining the pixel value (having the first pixel precision) of the reference block of the current image block.
- If the position pointed to by the motion vector is a sub-pixel position, the pixel values at integer-pixel positions of the reference image need to be interpolated by an interpolation filter to obtain the pixel values at the sub-pixel position as the pixel values of the reference block of the current image block; if the position pointed to by the motion vector is an integer-pixel position, a shifting operation can be employed.
- The sum of the coefficients of the interpolation filter, that is, the interpolation filter gain, is 2 to the power of N; if N is 6, the interpolation filter gain is 6 bits. In the interpolation operation, since the interpolation filter gain is usually greater than 1, the precision of the pixel values of the obtained forward reference block and backward reference block is higher than that of the code stream.
- For example, if the pixel value precision bitDepth of the predicted image is 8 bits and the interpolation filter gain is 6 bits, a predicted pixel value with a precision of 14 bits is obtained; if the pixel value precision bitDepth of the predicted image is 10 bits and the interpolation filter gain is 6 bits, a predicted pixel value with a precision of 16 bits is obtained; and if the pixel value precision bitDepth of the predicted image is 10 bits, the interpolation filter gain is 6 bits, and the result is then shifted right by 2 bits, a predicted pixel value with a precision of 14 bits is obtained.
- Commonly used interpolation filters have 4 taps, 6 taps, 8 taps, and so on. There are many motion compensation methods in the prior art, which are not described in detail in this application.
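- The precision arithmetic in the examples above can be captured by a small helper, under the assumption (stated in the text) that the bit depth, the filter gain, and any right shift combine purely additively; the helper itself is only an illustration of the bookkeeping, not part of any filter implementation.

```python
def interpolated_precision(bit_depth, filter_gain_bits, right_shift=0):
    """Precision of the motion-compensated prediction samples after interpolation."""
    return bit_depth + filter_gain_bits - right_shift

print(interpolated_precision(8, 6))       # 8-bit stream, 6-bit gain  -> 14 bits
print(interpolated_precision(10, 6))      # 10-bit stream, 6-bit gain -> 16 bits
print(interpolated_precision(10, 6, 2))   # then shift right by 2     -> 14 bits
```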
- the pixels of the image block referred to in the present application may include a luminance component sample, or a luma sample; correspondingly, the pixel point is a luminance component sampling point; and the pixel value is a luminance component sampling value.
- the image block matching cost criterion comprises: determining a reference block group with the smallest image block matching cost as the target reference block group.
- the image block matching cost criterion further includes: determining, as the target reference block group, the first occurrence of the reference block group that satisfies the image block matching cost less than the preset threshold.
- step 304, step 305, and step 306 may be performed after step 303, or may be performed in synchronization with step 303.
- the step numbers do not constitute any limitation on the order in which the methods are executed.
- In one implementation, each time a third reference block is determined, a fourth reference block is correspondingly determined, and the image block matching cost of that third reference block and fourth reference block is calculated. When the Nth reference block group is calculated, if its image block matching cost satisfies a preset condition, for example it is less than a preset threshold or is even 0, the Nth reference block group is used as the target reference block group, and it is not necessary to determine and calculate further third reference blocks and fourth reference blocks, which reduces the computational complexity, where N is greater than or equal to 1.
- In another implementation, N third reference blocks are determined first, N fourth reference blocks are determined in one-to-one correspondence, and N reference block groups are formed; the image block matching error corresponding to each of the N reference block groups is then calculated and compared, and the reference block group whose image block matching cost satisfies a preset condition, for example the reference block group with the smallest image block matching cost, is selected (if there are several such groups, any one of them may be chosen) as the target reference block group.
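- The two selection strategies can be sketched as follows: early termination as soon as a group's cost drops below a threshold, or exhaustive evaluation followed by picking the minimum. Here `cost_of` stands in for the matching-cost computation of step 305, the threshold value is hypothetical, and the behaviour when no group falls below the threshold is left open, since the text does not specify it.

```python
def select_early_termination(groups, cost_of, threshold):
    """Return the first reference block group whose matching cost is below the threshold.
    (What to do when no group qualifies is not specified here and is left to the caller.)"""
    for group in groups:
        if cost_of(group) < threshold:
            return group    # stop searching: target reference block group found
    return None

def select_minimum_cost(groups, cost_of):
    """Evaluate every reference block group and return the one with the smallest cost."""
    return min(groups, key=cost_of)
```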
- The third reference block and the fourth reference block in the target reference block group may also be called the optimal forward reference block and the optimal backward reference block of the current image block, respectively.
- The third reference blocks are determined based on the first reference block, which has the first pixel precision, and the fourth reference blocks are determined based on the second reference block, which has the first pixel precision; thus the pixel precision of the third reference blocks and the fourth reference blocks is also the first pixel precision, that is, higher than the pixel precision of the code stream.
- the second pixel precision is the same as the pixel precision (bitDepth) of the code stream.
- Here, x and y are the horizontal and vertical coordinates of each pixel in the image block; for each pixel in the image block, the corresponding pixel values of the target third reference block and the target fourth reference block are combined and shifted right by shift2 to obtain the pixel prediction value.
- For example, if the precision of the pixel values of the target third reference block is 14 bits, the precision of the pixel values of the target fourth reference block is 14 bits, and shift2 is 15 - bitDepth, then the precision of the pixel prediction value of the current image block is 14 + 1 - shift2 = bitDepth.
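- A sketch of this final combination step, consistent with the precision bookkeeping above (14 + 1 - shift2 = bitDepth), is given below; the rounding offset and the clipping to the valid sample range are common practice in bi-prediction but are assumptions here, not taken verbatim from the text.

```python
def bi_predict_pixel(pi, pj, bit_depth=8):
    """Combine one 14-bit sample from the target third reference block (pi) and one
    from the target fourth reference block (pj) into a bitDepth-precision prediction."""
    shift2 = 15 - bit_depth                      # as defined in the text
    offset = 1 << (shift2 - 1)                   # rounding offset (assumption)
    value = (pi + pj + offset) >> shift2         # 14 + 1 - shift2 = bitDepth bits
    return max(0, min((1 << bit_depth) - 1, value))  # clip to valid range (assumption)

# Hypothetical 14-bit samples; with bitDepth = 8, shift2 = 7.
print(bi_predict_pixel(8192, 8320))   # (8192 + 8320 + 64) >> 7 = 129
```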
- Since the first reference block and the second reference block obtained from the initial motion information are not necessarily able to accurately predict the current image block, a new method is adopted in the present application to find a more suitable target third reference block and target fourth reference block, and the current image block is predicted by the pixel values of the target third reference block and the target fourth reference block.
- It should be noted that the image prediction method in the embodiment of the present application may occur in the inter prediction process shown in FIG. 1 and FIG. 2, and may be specifically performed by the inter prediction module in the encoder or the decoder. In addition, the image prediction method of the embodiments of the present application can be implemented in any electronic device or apparatus that may require encoding and/or decoding of a video image.
- an embodiment of the present invention provides an image prediction apparatus.
- the image prediction apparatus of the embodiment of the present application is described below with reference to FIG.
- The image prediction apparatus shown in FIG. 7 corresponds to the method shown in FIG. 3 and can perform each step in the method shown in FIG. 3.
- the repeated description is appropriately omitted below.
- FIG. 7 shows an image prediction apparatus 700. The apparatus 700 includes:
- the obtaining unit 701 is configured to acquire initial predicted motion information of the current image block. This unit can be implemented by the processor invoking code in memory.
- a determining unit 702, configured to determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in the first reference image, and determine, in the second reference image, a second reference block corresponding to the current image block, where the first reference block includes a first search base point and the second reference block includes a second search base point.
- This unit can be implemented by the processor invoking code in memory.
- the searching unit 703 is configured to determine N third reference blocks in the first reference image. This unit can be implemented by the processor invoking code in memory.
- the mapping unit 704 is configured to: according to the first search base point, the location of the any one of the third reference blocks, and the second search base point, for any one of the N third reference blocks, Correspondingly determining a fourth reference block in the second reference image; obtaining N reference block groups, wherein one reference block group includes a third reference block and a fourth reference block; N is greater than or equal to 1.
- This unit can be implemented by the processor invoking code in memory.
- the calculating unit 705 is configured to increase the obtained pixel values of the third reference block and the fourth reference block to a first pixel precision, and calculate an image block matching cost of the N reference block groups at the first pixel precision.
- This unit can be implemented by the processor invoking code in memory.
- the selecting unit 706 is configured to determine, in the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, where the target reference block group includes a target third reference block and a target fourth reference block.
- This unit can be implemented by the processor invoking code in memory.
- a prediction unit 707 configured to obtain a pixel prediction value of the current image block according to a pixel value of the target third reference block at a first precision and a pixel value of the target fourth reference block at a first precision, where The pixel prediction value of the current image block has a second pixel precision; the second pixel precision is less than the first pixel precision.
- This unit can be implemented by the processor invoking code in memory.
- The obtaining unit 701 is specifically configured to perform the method mentioned in the foregoing step 301 and methods that can be equivalently substituted for it; the determining unit 702 is specifically configured to perform the method mentioned in the foregoing step 302 and methods that can be equivalently substituted for it; the searching unit 703 is specifically configured to perform the method mentioned in the foregoing step 303 and methods that can be equivalently substituted for it; the mapping unit 704 is specifically configured to perform the method mentioned in the foregoing step 304 and methods that can be equivalently substituted for it; the calculating unit 705 is specifically configured to perform the method mentioned in step 305 and methods that can be equivalently substituted for it; the selecting unit 706 is specifically configured to perform the method mentioned in step 306 and methods that can be equivalently substituted for it; and the prediction unit 707 is specifically configured to perform the method mentioned in step 307 and methods that can be equivalently substituted for it.
- The explanations, representations, refinements, and alternative embodiments in the corresponding method embodiments are also applicable to the corresponding units in the apparatus.
- the device 700 may specifically be a video encoding device, a video decoding device, a video codec system, or other device having a video codec function.
- the apparatus 700 can be used for both image prediction in the encoding process and image prediction in the decoding process, especially inter-frame prediction in video images.
- Apparatus 700 includes a number of functional units for implementing any of the foregoing methods
- The present application further provides a terminal device. The terminal device includes: a memory for storing a program; and a processor, configured to execute the program stored in the memory. When the program is executed, the processor is configured to perform the image prediction method of the embodiment of the present application, including steps 301-307.
- the terminal devices here may be video display devices, smart phones, portable computers, and other devices that can process video or play video.
- the present application also provides a video encoder, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiment of the present application, including steps 301-307.
- the present application also provides a video decoder, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiment of the present application, including steps 301-307.
- the present application also provides a video encoding system, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiment of the present application, including steps 301-307.
- the present application also provides a computer readable medium storing program code for execution by a device, the program code including instructions for performing the image prediction method of the embodiment of the present application, that is, program code for implementing steps 301-307.
- the present application also provides a decoder, which includes the image prediction apparatus of the embodiment of the present application, such as the apparatus 700, and a decoding reconstruction module, where the decoding reconstruction module is configured to obtain reconstructed pixel values of an image block according to the predicted values of the pixel values of the image block obtained by the image prediction apparatus.
- the present application also provides an encoder, which includes the image prediction apparatus of the embodiment of the present application, such as the apparatus 700, and a coding reconstruction module, where the coding reconstruction module is configured to obtain reconstructed pixel values of an image block according to the predicted values of the pixel values of the image block obtained by the image prediction apparatus.
- FIG. 8 is a schematic block diagram of a video encoder according to an embodiment of the present application.
- the video encoder 1000 shown in FIG. 8 includes an encoding end prediction module 1001, a transform quantization module 1002, an entropy encoding module 1003, a code reconstruction module 1004, and an encoding end filtering module.
- the video encoder 1000 shown in FIG. 8 can encode a video. Specifically, the video encoder 1000 can perform the video encoding process shown in FIG. 1 to encode the video. In addition, the video encoder 1000 can also perform the image prediction method of the embodiment of the present application; that is, the video encoder 1000 can perform the steps of the image prediction method shown in FIG. 3, including the refinements and alternative implementations of each step.
- the image prediction apparatus in the embodiment of the present application may also be the encoding end prediction module 1001 in the video encoder 1000.
- FIG. 9 is a schematic block diagram of a video decoder of an embodiment of the present application.
- the video decoder 2000 shown in FIG. 9 includes an entropy decoding module 2001, an inverse transform inverse quantization module 2002, a decoding end prediction module 2003, a decoding reconstruction module 2004, and a decoding end filtering module 2005.
- the video decoder 2000 shown in FIG. 9 can decode a video. Specifically, the video decoder 2000 can perform the video decoding process shown in FIG. 2 to decode the video. In addition, the video decoder 2000 can also perform the image prediction method of the embodiment of the present application; that is, the video decoder 2000 can perform the steps of the image prediction method shown in FIG. 3, including the refinements and alternative implementations of each step.
- the image prediction apparatus 700 in the embodiment of the present application may also be the decoding side prediction module 2003 in the video decoder 2000.
- the application scenario of the image prediction method in the embodiment of the present application is described below with reference to FIG. 10 to FIG. 12 .
- the image prediction method in the embodiment of the present application may be performed by the video transmission system, the codec device, and the codec system composed of the codec device shown in FIG. 10 to FIG. 12.
- FIG. 10 is a schematic block diagram of a video transmission system according to an embodiment of the present application.
- the video transmission system includes an acquisition module 3001, an encoding module 3002, a transmitting module 3003, a network transmission 3004, a receiving module 3005, a decoding module 3006, and a rendering module 3007.
- the function of each module in the video transmission system is as follows:
- the acquisition module 3001 includes a camera or a camera group, and is configured to collect video images and perform pre-encoding processing on the collected video images to convert the optical signal into a digitized video sequence;
- the encoding module 3002 is configured to encode the video sequence to obtain a code stream;
- the sending module 3003 is configured to send the coded code stream.
- the receiving module 3005 is configured to receive the code stream sent by the sending module 3003.
- the network transmission module 3004 is configured to transmit the code stream sent by the sending module 3003 to the receiving module 3005;
- the decoding module 3006 is configured to decode the code stream received by the receiving module 3005 to reconstruct a video sequence.
- the rendering module 3007 is configured to render the reconstructed video sequence decoded by the decoding module 3006 to improve the display effect of the video.
- the video transmission system shown in FIG. 10 can perform the image prediction method in the embodiment of the present application.
- the encoding module 3002 and the decoding module 3006 in the video transmission system shown in FIG. 10 can perform the image prediction method of the embodiment of the present application, including steps 301-307 as well as the refinements and alternative implementations of each step.
- the acquisition module 3001, the encoding module 3002, and the sending module 3003 in the video transmission system shown in FIG. 10 correspond to the video encoder 1000 shown in FIG. 8.
- the receiving module 3005, the decoding module 3006, and the rendering module 3007 in the video transmission system shown in FIG. 10 correspond to the video decoder 2000 shown in FIG. 9.
- the codec device and the codec system composed of the codec device are described in detail below with reference to FIG. 11 and FIG. 12. It should be understood that the codec device and the codec system shown in FIG. 11 and FIG. 12 are capable of performing the image prediction method of the embodiment of the present application.
- FIG. 11 is a schematic diagram of a video codec apparatus according to an embodiment of the present application.
- the video codec device 50 may be a device dedicated to encoding and/or decoding video images, or may be an electronic device having a video codec function. Further, the codec device 50 may be a terminal or user equipment of a mobile communication system.
- Codec device 50 may include the following modules or units: controller 56, codec 54, radio interface 52, antenna 44, smart card 46, card reader 48, keypad 34, memory 58, infrared port 42, display 32.
- the codec device 50 may also include a microphone or any suitable audio input module, which may receive a digital or analog signal input, and the codec device 50 may also include an audio output module.
- the audio output module may be a headset, a speaker, or an analog audio or digital audio output connection.
- the codec device 50 may also include a battery, which may be a solar cell, a fuel cell, or the like.
- the codec device 50 may also include an infrared port for short-range line-of-sight communication with other devices; the codec device 50 may also communicate with other devices using any suitable short-range communication method, for example, a Bluetooth wireless connection or a USB/FireWire wired connection.
- the memory 58 can store data in the form of images and data in the form of audio, as well as instructions for execution on the controller 56.
- Codec 54 may implement encoding and decoding of audio and/or video data, or implement assisted encoding and decoding of audio and/or video data, under the control of controller 56.
- the smart card 46 and the card reader 48 can provide user information, as well as authentication information for network authentication and authorization of the user.
- the specific implementation form of the smart card 46 and the card reader 48 may be a Universal Integrated Circuit Card (UICC) and a UICC reader.
- the radio interface circuit 52 can generate wireless communication signals, which may be communication signals generated for communication with a cellular communication network, a wireless communication system, or a wireless local area network.
- the antenna 44 is configured to transmit radio frequency signals generated by the radio interface circuit 52 to one or more other devices, and may also be configured to receive radio frequency signals from one or more other devices.
- codec device 50 may receive video image data to be processed from another device prior to transmission and/or storage. In still other embodiments of the present application, the codec device 50 may receive images over a wireless or wired connection and encode/decode the received images.
- FIG. 12 is a schematic block diagram of a video codec system 7000 according to an embodiment of the present application.
- the video codec system 7000 includes a source device 4000 and a destination device 5000.
- the source device 4000 generates encoded video data;
- the source device 4000 may also be referred to as a video encoding device or a video encoding apparatus;
- the destination device 5000 may decode the encoded video data generated by the source device 4000;
- the destination device 5000 may also be referred to as a video decoding device or a video decoding apparatus.
- the specific implementation form of the source device 4000 and the destination device 5000 may be any one of the following devices: a desktop computer, a mobile computing device, a notebook (eg, a laptop) computer, a tablet computer, a set top box, a smart phone, a handset, TV, camera, display device, digital media player, video game console, on-board computer, or other similar device.
- Destination device 5000 can receive video data encoded by source device 4000 via channel 6000.
- Channel 6000 can include one or more media and/or devices capable of moving encoded video data from source device 4000 to destination device 5000.
- channel 6000 can include one or more communication media that enable source device 4000 to transmit encoded video data directly to destination device 5000 in real time; in this case, source device 4000 can modulate the encoded video data according to a communication standard (for example, a wireless communication protocol), and the modulated video data can be transmitted to destination device 5000.
- the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- the one or more communication media described above may form part of a packet-based network (eg, a local area network, a wide area network, or a global network (eg, the Internet)).
- the one or more communication media described above may include a router, a switch, a base station, or other device that enables communication from the source device 4000 to the destination device 5000.
- channel 6000 can include a storage medium that stores encoded video data generated by source device 4000.
- destination device 5000 can access the storage medium via disk access or card access.
- the storage medium may include a variety of locally accessible data storage media, such as a Blu-ray disc, a high-density digital video disc (DVD), a compact disc read-only memory (CD-ROM), a flash memory, or another suitable digital storage medium for storing encoded video data.
- channel 6000 can include a file server or another intermediate storage device that stores encoded video data generated by source device 4000.
- destination device 5000 can access the encoded video data stored at a file server or other intermediate storage device via streaming or download.
- the file server may be a type of server capable of storing encoded video data and transmitting the encoded video data to the destination device 5000.
- the file server may include a World Wide Web (Web) server (for example, for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, or a local disk drive.
- Destination device 5000 can access the encoded video data via a standard data connection (e.g., an internet connection).
- example types of data connections include a wireless channel, a wired connection (e.g., a cable modem), or a combination of both, suitable for accessing encoded video data stored on a file server.
- the transmission of the encoded video data from the file server may be streaming, downloading, or a combination of both.
- the image prediction method of the present application is not limited to a wireless application scenario.
- the image prediction method of the present application can be applied to video coding and decoding in support of a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
- video codec system 7000 can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
- the source device 4000 includes a video source 4001, a video encoder 4002, and an output interface 4003.
- output interface 4003 can include a modulator/demodulator (modem) and/or a transmitter.
- Video source 4001 can include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of the above video data sources.
- Video encoder 4002 can encode video data from video source 4001.
- source device 4000 transmits the encoded video data directly to destination device 5000 via output interface 4003.
- the encoded video data may also be stored on a storage medium or file server for later access by the destination device 5000 for decoding and/or playback.
- the destination device 5000 includes an input interface 5003, a video decoder 5002, and a display device 5001.
- input interface 5003 includes a receiver and/or a modem.
- the input interface 5003 can receive the encoded video data via the channel 6000.
- Display device 5001 may be integrated with destination device 5000 or may be external to destination device 5000. Generally, the display device 5001 displays the decoded video data.
- Display device 5001 can include a variety of display devices, such as liquid crystal displays, plasma displays, organic light emitting diode displays, or other types of display devices.
- the disclosed systems, devices, and methods may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of units is only a logical function division; in actual implementation, there may be another division manner. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
- the technical solution of the present application, or the part that is essential or that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
- the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention provides an image prediction method. The method comprises: obtaining initial predicted motion information of a current image block, determining a first reference block in a forward reference image, and determining a second reference block in a backward reference image; performing a novel mirroring-type search around the first reference block and the second reference block, so as to determine whether there is a pair of target reference blocks with a smaller image block matching cost, the pair of target reference blocks being spatially correlated; and obtaining a pixel prediction value of the current image block according to the pixel values of the target reference blocks at a first precision, the pixel prediction value of the current image block having the code stream precision. By means of the present invention, the image block matching cost is calculated at high precision and an optimal reference block pair is found, which reduces the complexity of inter-frame prediction of video images in the prior art and improves the accuracy.
Description
The present application relates to the field of video coding and decoding technologies, and in particular, to an inter prediction method and apparatus for video images, and to a corresponding encoder and decoder.

Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones (so-called "smart phones"), video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, for example, the video compression techniques described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, and ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), in the video coding standard H.265/High Efficiency Video Coding (HEVC), and in extensions of such standards. By implementing such video compression techniques, video devices can transmit, receive, encode, decode, and/or store digital video information more efficiently.

Video compression techniques perform spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove the redundancy inherent in video sequences. For block-based video coding, a video slice (that is, a video frame or a portion of a video frame) may be partitioned into several image blocks, which may also be referred to as tree blocks, coding units (CUs), and/or coding nodes. Image blocks in an intra-coded (I) slice of an image are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same image. Image blocks in an inter-coded (P or B) slice of an image may use spatial prediction with respect to reference samples in neighboring blocks in the same image, or temporal prediction with respect to reference samples in other reference images. An image may be referred to as a frame, and a reference image may be referred to as a reference frame.

Various video coding standards, including the High Efficiency Video Coding (HEVC) standard, provide predictive coding modes for image blocks, in which a block to be coded is predicted based on already coded blocks of video data. In the intra prediction mode, the current image block is predicted based on one or more previously decoded neighboring blocks in the same image as the current image block; in the inter prediction mode, the current image block is predicted based on already decoded blocks in other images.

Several inter prediction modes exist, such as the merge mode (Merge mode), the skip mode (Skip mode), and the advanced motion vector prediction mode (AMVP mode). However, conventional image prediction methods involve many processing steps, are relatively complex, and are not highly accurate.
Summary of the invention

The embodiments of the present application provide an image prediction method and apparatus, and a corresponding encoder and decoder, and in particular an inter prediction method for video images, which improve the prediction accuracy of the motion information of an image block to a certain extent, thereby improving coding and decoding performance.
In a first aspect, an embodiment of the present application provides an image prediction method, including: acquiring initial predicted motion information of a current image block; determining, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and a second reference block corresponding to the current image block in a second reference image, where the first reference block includes a first search base point and the second reference block includes a second search base point; determining N third reference blocks in the first reference image; for any one of the N third reference blocks, correspondingly determining a fourth reference block in the second reference image according to the first search base point, the location of that third reference block, and the second search base point, so as to obtain N reference block groups, where one reference block group includes one third reference block and one fourth reference block, and N is greater than or equal to 1; increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision, and calculating image block matching costs of the N reference block groups at the first pixel precision; determining, among the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, where the target reference block group includes a target third reference block and a target fourth reference block; and obtaining a pixel prediction value of the current image block according to the pixel value of the target third reference block at the first precision and the pixel value of the target fourth reference block at the first precision, where the pixel prediction value of the current image block has a second pixel precision, and the second pixel precision is less than the first pixel precision.
In a second aspect, an embodiment of the present application provides an image prediction apparatus, including several functional units for implementing any method of the first aspect. For example, the apparatus may include: an obtaining unit, configured to acquire initial predicted motion information of a current image block; a determining unit, configured to determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image and a second reference block corresponding to the current image block in a second reference image, where the first reference block includes a first search base point and the second reference block includes a second search base point; a search unit, configured to determine N third reference blocks in the first reference image; a mapping unit, configured to: for any one of the N third reference blocks, correspondingly determine a fourth reference block in the second reference image according to the first search base point, the location of that third reference block, and the second search base point, so as to obtain N reference block groups, where one reference block group includes one third reference block and one fourth reference block, and N is greater than or equal to 1; a calculating unit, configured to increase the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision and calculate image block matching costs of the N reference block groups at the first pixel precision; a selecting unit, configured to determine, among the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, where the target reference block group includes a target third reference block and a target fourth reference block; and a prediction unit, configured to obtain a pixel prediction value of the current image block according to the pixel value of the target third reference block at the first precision and the pixel value of the target fourth reference block at the first precision, where the pixel prediction value of the current image block has a second pixel precision, and the second pixel precision is less than the first pixel precision.
According to the first aspect or the second aspect, in a possible implementation, the initial predicted motion information includes reference image indices, which indicate that the two reference images include one forward reference image and one backward reference image.

According to the first aspect or the second aspect, in a possible implementation, the N third reference blocks include the first reference block, and the obtained N fourth reference blocks include the second reference block, where the first reference block and the second reference block belong to one reference block group, that is, they correspond to each other in space. This can also be understood as follows: the determining, for any one of the N third reference blocks, a corresponding fourth reference block in the second reference image according to the first search base point, the location of that third reference block, and the second search base point includes: if the first reference block is taken as a third reference block, correspondingly taking the second reference block as a fourth reference block.
According to the first aspect or the second aspect, in a possible implementation, the determining, for any one of the N third reference blocks, a corresponding fourth reference block in the second reference image according to the first search base point, the location of that third reference block, and the second search base point includes: determining an i-th vector according to that third reference block and the first search base point; determining a j-th vector according to the time domain interval t1 between the current image block and the first reference image, the time domain interval t2 between the current image block and the second reference image, and the i-th vector, where the j-th vector is opposite in direction to the i-th vector, and i and j are both positive integers not greater than N; and determining a fourth reference block according to the second search base point and the j-th vector. Correspondingly, this method may be performed by the mapping unit.
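For illustration only, the following C++ sketch shows one way this mapping could be computed. The Offset type and function name are introduced here purely for explanation; the paragraph above states only that the j-th vector is opposite in direction to the i-th vector and depends on t1, t2, and the i-th vector, so the linear scaling by t2/t1 used below is an assumption of this sketch rather than a requirement of the embodiment.

```cpp
// Illustrative only: a search offset on the reference-picture grid.
struct Offset {
    int x;
    int y;
};

// Derive the j-th offset from the i-th offset using the temporal distances
// t1 (current picture to first reference picture) and t2 (current picture to
// second reference picture). Opposite direction follows the text above; the
// proportional scaling by t2/t1 is an assumption of this sketch. t1 is
// assumed to be non-zero.
Offset mirrorOffsetScaled(Offset vi, int t1, int t2) {
    Offset vj;
    vj.x = -(vi.x * t2) / t1;
    vj.y = -(vi.y * t2) / t1;
    return vj;
}
```

The fourth reference block is then the block located at the second search base point displaced by the resulting offset.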
According to the first aspect or the second aspect, in a possible implementation, the determining, for any one of the N third reference blocks, a corresponding fourth reference block in the second reference image according to the first search base point, the location of that third reference block, and the second search base point includes: determining an i-th vector according to that third reference block and the first search base point; determining a j-th vector according to the i-th vector, where the j-th vector is equal in magnitude and opposite in direction to the i-th vector, and i and j are both positive integers not greater than N; and determining a fourth reference block according to the second search base point and the j-th vector. Correspondingly, this method may be performed by the mapping unit.
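A corresponding sketch of this second mapping variant, in which the j-th vector is simply the negation of the i-th vector, is given below; the hypothetical Offset type is redeclared so that the snippet stands alone.

```cpp
// Illustrative only: a search offset on the reference-picture grid.
struct Offset {
    int x;
    int y;
};

// Second mapping variant: the j-th offset has the same magnitude as the i-th
// offset and the opposite direction, so it is simply the negation.
Offset mirrorOffsetEqual(Offset vi) {
    return Offset{ -vi.x, -vi.y };
}
```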
According to the first aspect or the second aspect, in a possible implementation, the increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision and calculating the image block matching costs of the N reference block groups at the first pixel precision includes: for at least one of the N reference block groups, increasing the pixel values of the obtained third reference block and fourth reference block to the first pixel precision through interpolation or shifting, and calculating the image block matching cost at the first pixel precision; and the determining, among the N reference block groups, a target reference block group that satisfies the image block matching cost criterion includes: determining, as the target reference block group, the first reference block group among the at least one reference block group whose image block matching cost is less than a preset threshold. For example, if two reference block groups have been evaluated and neither image block matching cost is less than the preset threshold, and the image block matching cost of the third reference block group is then found to be less than the preset threshold, the third reference block group is taken as the target reference block group and no further reference block groups are evaluated. Correspondingly, this method may be performed jointly by the calculating unit and the selecting unit.
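The following sketch illustrates, with hypothetical helper names, raising block samples to the first pixel precision by shifting (interpolation being the other option mentioned above) and computing one common choice of image block matching cost, the sum of absolute differences (SAD); the embodiment does not mandate a particular cost function.

```cpp
#include <cstdlib>
#include <cstddef>
#include <vector>

// Raise samples from the bit-stream precision to the first pixel precision by
// a left shift; interpolation is the other option mentioned above.
std::vector<int> raisePrecision(const std::vector<int>& samples, int extraBits) {
    std::vector<int> out(samples.size());
    for (std::size_t i = 0; i < samples.size(); ++i)
        out[i] = samples[i] << extraBits;
    return out;
}

// One common matching cost: the sum of absolute differences between the
// paired third and fourth reference blocks at the first pixel precision.
// Both blocks are assumed to contain the same number of samples.
long sadCost(const std::vector<int>& ref3, const std::vector<int>& ref4) {
    long cost = 0;
    for (std::size_t i = 0; i < ref3.size(); ++i)
        cost += std::abs(ref3[i] - ref4[i]);
    return cost;
}
```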
According to the first aspect or the second aspect, in a possible implementation, the increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision and calculating the image block matching costs of the N reference block groups at the first pixel precision includes: increasing the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision through interpolation or shifting, and calculating an image block matching cost for each of the N reference block groups; and the determining, among the N reference block groups, a target reference block group that satisfies the image block matching cost criterion includes: determining, as the target reference block group, the reference block group with the smallest image block matching cost among the N reference block groups. For example, if six reference block groups are evaluated and the fourth reference block group has the smallest image block matching cost, the fourth reference block group is taken as the target reference block group. Correspondingly, this method may be performed jointly by the calculating unit and the selecting unit.
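The two selection criteria described in the two preceding paragraphs can be sketched as follows. Here groupCost(i) stands for the image block matching cost of the i-th reference block group; the behavior when no group falls below the threshold (returning the last evaluated group) is an assumption of this sketch, since the text does not specify it.

```cpp
#include <cstddef>
#include <functional>

// Early-termination criterion: evaluate groups in order and stop at the first
// one whose cost is below the preset threshold. If none qualifies, the last
// evaluated group is returned (assumption of this sketch).
std::size_t selectEarlyTermination(std::size_t n, long threshold,
                                   const std::function<long(std::size_t)>& groupCost) {
    std::size_t best = 0;
    for (std::size_t i = 0; i < n; ++i) {
        best = i;
        if (groupCost(i) < threshold)
            break;  // first group below the threshold becomes the target group
    }
    return best;
}

// Exhaustive criterion: evaluate all N groups and keep the smallest cost.
std::size_t selectMinimumCost(std::size_t n,
                              const std::function<long(std::size_t)>& groupCost) {
    std::size_t best = 0;
    long bestCost = groupCost(0);
    for (std::size_t i = 1; i < n; ++i) {
        long c = groupCost(i);
        if (c < bestCost) { bestCost = c; best = i; }
    }
    return best;
}
```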
According to the first aspect or the second aspect, in a possible implementation, the obtaining the pixel prediction value of the current image block according to the pixel value of the target third reference block at the first precision and the pixel value of the target fourth reference block at the first precision includes:

obtaining the pixel value predSamplesL0'[x][y] of the target third reference block at the first precision;

obtaining the pixel value predSamplesL1'[x][y] of the target fourth reference block at the first precision;

computing the pixel prediction value of the current image block as predSamples'[x][y] = Clip3(0, (1 << bitDepth) - 1, (predSamplesL0'[x][y] + predSamplesL1'[x][y] + offset2) >> shift2), where bitDepth is the second pixel precision, shift2 is a shift parameter, and offset2 is equal to 1 << (shift2 - 1); the second pixel precision may be the code stream precision. Correspondingly, this method may be performed by the prediction unit.
According to the first aspect or the second aspect, in a possible implementation, the initial predicted motion information includes a first motion vector and a second motion vector, and the determining, according to the initial predicted motion information, the first reference block corresponding to the current image block in the first reference image and the second reference block corresponding to the current image block in the second reference image includes: obtaining the first reference block according to the position of the current image block and the first motion vector, and obtaining the second reference block according to the position of the current image block and the second motion vector. Correspondingly, this method may be performed by the determining unit.
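A minimal sketch of this determination is shown below, assuming integer-pel motion vectors for simplicity (fractional motion vectors would additionally require interpolation); the Point type and function name are illustrative only.

```cpp
// Illustrative only: integer positions and integer-pel motion vectors.
struct Point { int x; int y; };

// The first reference block is located at the current block position displaced
// by the first motion vector; the second reference block is obtained in the
// same way using the second motion vector.
Point referenceBlockPosition(Point currentBlock, Point motionVector) {
    return Point{ currentBlock.x + motionVector.x, currentBlock.y + motionVector.y };
}
```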
According to the first aspect or the second aspect, in a possible implementation, a motion search may be performed with a preset step, using the search base point of the first reference block as a reference, to obtain the N third reference blocks.
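For illustration, the following sketch generates candidate positions for the third reference blocks around the first search base point with a preset step; the 8-neighbour pattern plus the base point itself is only one possible pattern, since the text fixes neither the search pattern nor the value of N.

```cpp
#include <vector>

// Illustrative only: a position on the reference-picture grid.
struct Position { int x; int y; };

// Candidate positions around the first search base point with a preset step.
std::vector<Position> candidatePositions(Position base, int step) {
    std::vector<Position> out;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            out.push_back(Position{ base.x + dx * step, base.y + dy * step });
    return out;
}
```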
According to the first aspect or the second aspect, in a possible implementation, the method further includes: determining the motion vectors corresponding to the target third reference block and the target fourth reference block as the forward optimal motion vector and the backward optimal motion vector, to provide a motion vector reference for the prediction of subsequent image blocks.

Both the foregoing method and the foregoing apparatus may be implemented by a processor invoking programs and instructions in a memory.

In a third aspect, an embodiment of the present application provides a video encoder for encoding an image block, including any one of the foregoing possible image prediction apparatuses and a coding reconstruction module, where the image prediction apparatus is configured to obtain a predicted value of the pixel values of a current image block, and the coding reconstruction module is configured to obtain reconstructed pixel values of the current image block according to the predicted value of the pixel values of the current image block. Correspondingly, the video encoder may perform any one of the foregoing possible design methods.

In a fourth aspect, an embodiment of the present application provides a video decoder for decoding an image block, including any one of the foregoing possible image prediction apparatuses and a decoding reconstruction module, where the image prediction apparatus is configured to obtain a predicted value of the pixel values of a current image block, and the decoding reconstruction module is configured to obtain reconstructed pixel values of the current image block according to the predicted value of the pixel values of the current image block. Correspondingly, the video decoder may perform any one of the foregoing possible design methods.
In a fifth aspect, an embodiment of the present application provides a device for encoding video data, the device including:

a memory, configured to store video data, the video data including one or more image blocks;

a video encoder, configured to encode images, where the inter prediction method in the encoding process may adopt any one of the foregoing possible design methods.

In a sixth aspect, an embodiment of the present application provides a device for decoding video data, the device including:

a memory, configured to store video data, the video data including one or more image blocks;

a video decoder, configured to decode images, where the inter prediction method in the decoding process may adopt any one of the foregoing possible design methods.
In a seventh aspect, an embodiment of the present application provides an encoding device, including a non-volatile memory and a processor coupled to each other, where the processor invokes program code stored in the memory to perform some or all of the steps of any method of the first aspect.

In an eighth aspect, an embodiment of the present application provides a decoding device, including a non-volatile memory and a processor coupled to each other, where the processor invokes program code stored in the memory to perform some or all of the steps of any method of the first aspect.

In a ninth aspect, an embodiment of the present application provides a computer readable storage medium storing program code, where the program code includes instructions for performing some or all of the steps of any method of the first aspect.

In a tenth aspect, an embodiment of the present application provides a computer program product which, when run on a computer, causes the computer to perform some or all of the steps of any method of the first aspect.

It should be understood that the foregoing solutions are merely possible implementation forms of the present application, and the implementations may be freely combined with one another provided that no natural law is violated.
FIG. 1 is a schematic diagram of a video encoding process in an embodiment of the present application;

FIG. 2 is a schematic diagram of a video decoding process in an embodiment of the present application;

FIG. 3 is a schematic diagram of an image prediction method in an embodiment of the present application;

FIG. 4 is a schematic diagram of an inter prediction mode in an embodiment of the present application;

FIG. 5 is a schematic diagram of another inter prediction mode in an embodiment of the present application;

FIG. 6 is a schematic diagram of searching for reference blocks in an embodiment of the present application;

FIG. 7 is a schematic diagram of an image prediction apparatus in an embodiment of the present application;

FIG. 8 is a schematic block diagram of a video encoder in an embodiment of the present application;

FIG. 9 is a schematic block diagram of a video decoder in an embodiment of the present application;

FIG. 10 is a schematic block diagram of a video transmission system in an embodiment of the present application;

FIG. 11 is a schematic diagram of a video codec apparatus in an embodiment of the present application;

FIG. 12 is a schematic block diagram of a video codec system in an embodiment of the present application.
The technical solutions in the present application are described below with reference to the accompanying drawings.

The image prediction method in the present application can be applied to the field of video coding and decoding technologies. For a better understanding of the image prediction method of the present application, video coding and decoding are first introduced below.
A video generally consists of many frames of images arranged in a certain order. Generally, a large amount of repeated information (redundant information) exists within one frame of image or between different frames. For example, one frame of image often contains many regions with the same or similar spatial structure, that is, a video file contains a large amount of spatially redundant information. In addition, a video file also contains a large amount of temporally redundant information, which results from the composition of the video. For example, the frame rate of video sampling is generally 25 to 60 frames per second, that is, the sampling interval between two adjacent frames is 1/60 to 1/25 of a second. Within such a short time, the sampled images contain a large amount of similar information, and there is a strong correlation between adjacent images.

In addition, related research shows that, from the perspective of the visual sensitivity of the human eye, there is also a part of the video information that can be used for compression, namely visual redundancy. Visual redundancy refers to appropriately compressing the video bit stream by exploiting the fact that the human eye is relatively sensitive to changes in luminance and relatively insensitive to changes in chrominance. For example, in high-luminance regions, the sensitivity of human vision to luminance changes tends to decrease, while the eye is more sensitive to the edges of objects; in addition, the human eye is relatively insensitive to internal regions but sensitive to the overall structure. Since the final consumers of video images are humans, these characteristics of the human eye can be fully exploited to compress the original video images and achieve a better compression effect. Besides the spatial redundancy, temporal redundancy, and visual redundancy mentioned above, video image information also contains a series of other redundancies, such as information entropy redundancy, structural redundancy, knowledge redundancy, and importance redundancy. The purpose of video coding (also referred to as video compression coding) is to remove redundant information from a video sequence by various technical methods, so as to reduce storage space and save transmission bandwidth.
At present, within internationally adopted practice, there are four mainstream compression coding approaches in video compression coding standards: chroma sampling, predictive coding, transform coding, and quantization coding. These coding approaches are described in detail below.

Chroma sampling: this approach makes full use of the visual and psychological characteristics of the human eye and attempts to minimize the amount of data needed to describe a single element, starting from the underlying data representation. For example, most television systems use luminance-chrominance-chrominance (YUV) color coding, which is a standard widely adopted by European television systems. The YUV color space includes a luminance signal Y and two color difference signals U and V, and the three components are independent of one another. The separate representation of the YUV color space is more flexible, occupies less transmission bandwidth, and has advantages over the traditional red-green-blue (RGB) color model. For example, the YUV 4:2:0 format indicates that the two chrominance components U and V each have only half as many samples as the luminance component Y in both the horizontal and vertical directions, that is, among four sampled pixels there are four luminance components Y but only one U component and one V component. When represented in this form, the amount of data is further reduced to only about 33% of the original. Chroma sampling thus makes full use of the physiological visual characteristics of the human eye, and achieving video compression through chroma sampling is one of the widely used video data compression methods.
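As a small worked example of the sample counts implied by 4:2:0 sampling (the 1920x1080 frame size is chosen arbitrarily for illustration):

```cpp
#include <cstdio>

// Number of stored samples per frame under YUV 4:2:0 sampling: every 2x2 block
// of luma samples shares one U and one V sample, so each chroma plane holds a
// quarter of the luma samples.
long samplesPerFrame420(long width, long height) {
    long luma = width * height;
    long chroma = 2 * (width / 2) * (height / 2);  // U plane + V plane
    return luma + chroma;
}

int main() {
    // For a 1920x1080 frame: 2,073,600 luma samples + 1,036,800 chroma samples.
    std::printf("%ld\n", samplesPerFrame420(1920, 1080));
    return 0;
}
```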
Predictive coding: in predictive coding, the data of previously encoded frames is used to predict the frame currently to be encoded. A predicted value is obtained by prediction; it is not exactly equal to the actual value, and a residual exists between the predicted value and the actual value. The more accurate the prediction, the closer the predicted value is to the actual value and the smaller the residual, so that encoding the residual can greatly reduce the amount of data; at the decoding end, the matching image is restored and reconstructed by adding the residual to the predicted value. This is the basic idea of predictive coding. In mainstream coding standards, predictive coding is divided into two basic types: intra prediction and inter prediction. Intra prediction (Intra Prediction) refers to predicting the pixel values of pixels within the current coding unit by using the pixel values of pixels in the reconstructed region of the current image; inter prediction (Inter Prediction) searches a reconstructed image for a reference block that matches the current coding unit in the current image, uses the pixel values of the pixels in the reference block as the prediction information or predicted values of the pixel values of the pixels in the current coding unit, and transmits the motion information of the current coding unit.
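The per-sample core of this idea can be sketched in two functions, ignoring the transform, quantization, and entropy coding stages described below; the names are illustrative only.

```cpp
// The encoder transmits only the residual between the actual sample and its
// prediction; the decoder recovers the sample by adding the (decoded) residual
// back to the same prediction.
int residual(int actual, int predicted) {
    return actual - predicted;
}

int reconstruct(int predicted, int decodedResidual) {
    return predicted + decodedResidual;
}
```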
Transform coding: this coding approach does not directly encode the original spatial-domain information; instead, it converts the information sample values from the current domain into another, artificially defined domain (usually called the transform domain) according to some form of transform function, and then performs compression coding according to the distribution characteristics of the information in the transform domain. Because video image data tends to have very strong data correlation in the spatial domain and contains a large amount of redundant information, direct encoding would require a very large number of bits. After the information sample values are converted into the transform domain, the correlation of the data is greatly reduced, so that the amount of data required for encoding also decreases greatly owing to the reduction of redundant information; a higher compression ratio and a better compression effect can thus be obtained. Typical transform coding methods include the Karhunen-Loeve (K-L) transform, the Fourier transform, and the like.

Quantization coding: the transform coding mentioned above does not itself compress data; it is the quantization process that effectively compresses the data, and quantization is also the main cause of the data "loss" in lossy compression. Quantization is the process of "forcibly mapping" input values with a large dynamic range onto a smaller set of output values. Because the range of quantization input values is large, a larger number of bits is needed to represent them, whereas the range of output values after this "forcible mapping" is small and can therefore be represented with only a small number of bits.
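A uniform scalar quantizer is a minimal illustration of this "forcible mapping"; the step size is arbitrary, and the mismatch between a value and dequantize(quantize(value, step), step) is the quantization loss.

```cpp
#include <cmath>

// Map a wide-range input value onto a small set of integer levels.
int quantize(double value, double step) {
    return static_cast<int>(std::lround(value / step));
}

// Map a level back to a representative value; the difference from the original
// input is the quantization error.
double dequantize(int level, double step) {
    return level * step;
}
```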
In a coding algorithm based on a hybrid coding architecture, the foregoing compression coding approaches can be used in combination. The encoder control module selects the coding mode used for an image block according to the local characteristics of different image blocks in a video frame. Frequency-domain or spatial-domain prediction is performed on intra-prediction-coded blocks, and motion-compensated prediction is performed on inter-prediction-coded blocks; the prediction residual is then transformed and quantized to form residual coefficients, and finally the final code stream is generated by an entropy encoder. To avoid the accumulation of prediction errors, the reference signal for intra or inter prediction is obtained by the decoding module at the encoding end. The transformed and quantized residual coefficients are inverse-quantized and inverse-transformed to reconstruct the residual signal, which is then added to the predicted reference signal to obtain a reconstructed image. In addition, loop filtering performs pixel correction on the reconstructed image to improve the coding quality of the reconstructed image.

The entire video encoding and decoding process is briefly introduced below with reference to FIG. 1 and FIG. 2.
图1是视频编码过程的示意图。Figure 1 is a schematic diagram of a video encoding process.
如图1所示,在对当前帧Fn中的当前图像块进行预测时,既可以采用帧内预测也可以采用帧间预测,具体地,可以根据当前帧Fn的类型,选择采用帧内编码还是帧间编码,例如,当前帧Fn为I帧时采用帧内预测,当前帧Fn为P帧或者B帧时采用帧间预测。当采用帧内预测时可以采用当前帧Fn中已经重建区域的像素点的像素值对当前图像块的像素点的像素值进行预测,当采用帧间预测时可以采用参考帧F’
n-1中与当前图像块匹配的参考块的像素点的像素值对当前图像块的像素点的像素值进行预测。
As shown in FIG. 1 , when performing prediction on the current image block in the current frame Fn, either intra prediction or inter prediction may be used. Specifically, whether intra coding or intraframe coding can be selected according to the type of the current frame Fn. Inter-frame coding, for example, intra prediction is used when the current frame Fn is an I frame, and inter prediction is used when the current frame Fn is a P frame or a B frame. When the intra prediction is adopted, the pixel value of the pixel of the current image block may be predicted by using the pixel value of the pixel of the reconstructed area in the current frame Fn, and the reference frame F'n -1 may be used when inter prediction is adopted. The pixel value of the pixel of the reference block that matches the current image block predicts the pixel value of the pixel of the current image block.
After a prediction block of the current image block is obtained through inter prediction or intra prediction, the pixel values of the pixels of the current image block are subtracted from the pixel values of the pixels of the prediction block to obtain residual information, and transform, quantization, and entropy coding are performed on the residual information to obtain an encoded bitstream. In addition, during encoding, the residual information of the current frame Fn is superimposed on the prediction information of the current frame Fn, and a filtering operation is performed to obtain a reconstructed frame F'n of the current frame, which is used as a reference frame for subsequent encoding.
FIG. 2 is a schematic diagram of a video decoding process.
The video decoding process shown in FIG. 2 is essentially the inverse of the video encoding process shown in FIG. 1. During decoding, residual information is obtained through entropy decoding, inverse quantization, and inverse transform, and whether intra prediction or inter prediction is used for the current image block is determined from the decoded bitstream. If intra prediction is used, prediction information is constructed according to the intra prediction method by using the pixel values of pixels in the already reconstructed region of the current frame. If inter prediction is used, motion information needs to be parsed, a reference block is determined in the already reconstructed image by using the parsed motion information, and the pixel values of the pixels in the reference block are used as the prediction information. The prediction information is then superimposed on the residual information, and the reconstructed information is obtained after a filtering operation.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of this application. The method shown in FIG. 3 may be performed by a video coding and decoding apparatus, a video codec, a video coding and decoding system, or another device having video coding and decoding functions. The method shown in FIG. 3 may occur in an encoding process or in a decoding process; more specifically, it may occur in the inter prediction process during encoding and decoding.
The method shown in FIG. 3 includes steps 301 to 307, which are described in detail below.
301. Obtain initial predicted motion information of a current image block.
302. Determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and determine a second reference block corresponding to the current image block in a second reference image, where the first reference block includes a first search base point, the second reference block includes a second search base point, and the pixel values of the first reference block and the pixel values of the second reference block have a first pixel precision.
The image block here may be an image block in a to-be-processed image, or may be a sub-image of the to-be-processed image. In addition, the image block here may be an image block to be encoded in an encoding process, or an image block to be decoded in a decoding process.
Optionally, the foregoing initial predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to a reference image block (usually a motion vector of a neighboring block), and indication information of a reference image (usually understood as reference image information used to determine the reference image). The motion vector includes a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of a forward reference image block and/or a backward reference image block. The position of the forward reference block and the position of the backward reference block can be determined from the motion vector information.
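For illustration only, the following C sketch shows one possible in-memory representation of the initial predicted motion information described above; the type and field names are assumptions made for this example and are not taken from the described method.

    /* One possible layout of the initial predicted motion information:
     * a prediction direction indication, a forward and/or backward motion
     * vector, and the reference frame indices used to locate the reference
     * images. All names are illustrative assumptions. */
    typedef enum { PRED_FORWARD, PRED_BACKWARD, PRED_BI } PredDir;

    typedef struct {
        int x;   /* horizontal motion vector component */
        int y;   /* vertical motion vector component   */
    } MotionVector;

    typedef struct {
        PredDir dir;          /* indication information of the prediction direction */
        MotionVector mv_fwd;  /* forward motion vector (if present)                 */
        MotionVector mv_bwd;  /* backward motion vector (if present)                */
        int ref_idx_fwd;      /* reference frame index of the forward reference     */
        int ref_idx_bwd;      /* reference frame index of the backward reference    */
    } InitialPredMotionInfo;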
Optionally, the first reference image is a forward reference image and the second reference image is a backward reference image, or vice versa.
In a possible implementation, the initial predicted motion information includes a first motion vector and a second motion vector. The position of the first reference block may be obtained according to the position of the current image block and the first motion vector, that is, the first reference block is determined; and the second reference block is obtained according to the position of the current image block and the second motion vector, that is, the second reference block is determined.
In a possible implementation, the position of the first reference block and/or the second reference block may be the co-located position of the current image block, or may be derived jointly from the co-located position and a motion vector.
There are multiple manners of obtaining the initial predicted motion information of an image block. For example, the following Manner 1 and Manner 2 may be used.
Manner 1:
In the merge mode of inter prediction, a candidate predicted motion information list is constructed according to the motion information of neighboring blocks of the current image block, and a piece of candidate predicted motion information is selected from the candidate predicted motion information list as the initial predicted motion information of the current image block. The candidate predicted motion information list includes motion vectors, reference frame index information of reference image blocks, and the like. As shown in FIG. 4, the motion information of the neighboring block A0 is selected as the initial predicted motion information of the current image block. Specifically, the forward motion vector of A0 is used as the forward predicted motion vector of the current image block, and the backward motion vector of A0 is used as the backward predicted motion vector of the current image block.
Manner 2:
In the non-merge mode of inter prediction, a motion vector predictor list is constructed according to the motion information of neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as the motion vector predictor of the current image block. In this case, the motion vector of the current image block may be the motion vector of a neighboring block, or may be the sum of the motion vector of the selected neighboring block and a motion vector difference of the current image block, where the motion vector difference is the difference between a motion vector obtained by performing motion estimation on the current image block and the motion vector of the selected neighboring block. As shown in FIG. 5, the motion vectors corresponding to indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block.
It should be understood that Manner 1 and Manner 2 are merely two specific ways of obtaining the initial predicted motion information of the current image block. This application does not limit the manner of obtaining the predicted motion information of a block; any manner in which the initial predicted motion information of an image block can be obtained falls within the protection scope of this application.
A base point may be represented by a coordinate point and is a kind of position information. It may be used to indicate the position of an image block, and may also serve as a reference in a subsequent image block search. It may be the top-left vertex of an image block, the center point of an image block, or a relative position point specified by another rule; this is not limited in this application. The base point of a reference image may serve as a search base point in a subsequent search process; therefore, once the position of a reference block is determined, the search base point is determined. Because subsequent search operations are related to these base points, the base points contained in the first reference block and the second reference block may also be referred to as the first search base point and the second search base point, respectively. They may be predetermined, or may be specified during encoding and decoding.
For example, if the forward motion vector is (MV0x, MV0y) and the base point of the current image block is (B0x, B0y), the base point of the forward reference block is (MV0x+B0x, MV0y+B0y). A similar manner may be applied to the backward reference block, and details are not described in this application.
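The base-point arithmetic in the example above can be written as the following minimal C sketch; the function and parameter names are illustrative assumptions.

    /* Base point of the forward reference block from the current block's
     * base point (b0x, b0y) and the forward motion vector (mv0x, mv0y),
     * i.e. (MV0x + B0x, MV0y + B0y); the backward case is analogous. */
    static void reference_base_point(int b0x, int b0y, int mv0x, int mv0y,
                                     int *ref_x, int *ref_y) {
        *ref_x = mv0x + b0x;
        *ref_y = mv0y + b0y;
    }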
For convenience and clarity of description, in the following steps the top-left vertex of an image block is used as the base point, the first reference image may refer to the forward reference image, and the second reference image may refer to the backward reference image; correspondingly, the first reference block may refer to the forward reference block, and the second reference block may refer to the backward reference block. It should be understood that this is merely an optional example for ease of description and does not constitute any limitation on the implementation of the present invention.
303. Determine N third reference blocks in the first reference image, where the value of N is greater than or equal to 1.
Step 303 includes a search method, and a specific search manner may be as follows:
In the forward reference image, a motion search with an integer-pixel step is performed around the first reference block by using the first reference block (or the first search base point) as a reference. An integer-pixel step means that the position of a candidate search block is offset from the position of the first reference block by an integer-pixel distance, where the size of the candidate search block may be the same as that of the first reference block; the positions of candidate search blocks can therefore be determined during the search, and third reference blocks are then determined according to the search rule. It should be noted that, regardless of whether the search base point is at an integer-pixel position (the starting point may be an integer pixel or a fractional pixel such as 1/2, 1/4, 1/8, or 1/16), an integer-pixel-step motion search can be performed to obtain positions of forward reference blocks of the current image block, that is, third reference blocks are correspondingly determined. After some third reference blocks are found with an integer-pixel step, optionally, a fractional-pixel search may further be performed to obtain additional third reference blocks; if there is still a search requirement, the search may continue with finer fractional pixels. For the search pattern, refer to FIG. 6, where (0,0) is the search base point. A cross search may be used, in which (0,-1), (0,1), (-1,0), and (1,0) are searched in sequence; or a square search may be used, in which (-1,-1), (1,-1), (-1,1), and (1,1) are searched in sequence. These points are the top-left vertices of candidate search blocks; once these base points are determined, the reference blocks corresponding to them, that is, the third reference blocks, are also determined. It should be noted that the present invention does not limit the search method; any search method in the prior art may be used. For example, in addition to an integer-pixel-step search, a fractional-pixel-step search may be used, for example, a fractional-pixel-step search may be performed directly. The specific search method is not limited herein.
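The cross and square patterns mentioned above can be enumerated as in the following C sketch, which lists candidate top-left vertices around a search base point; the pattern tables and function names are assumptions made for illustration, and any other search pattern could be substituted.

    #include <stddef.h>

    typedef struct { int dx, dy; } Offset;

    /* Offsets, in integer pixels relative to the search base point (0,0),
     * of the candidate blocks visited by the cross and square searches. */
    static const Offset kCrossPattern[]  = { {0,-1}, {0,1}, {-1,0}, {1,0} };
    static const Offset kSquarePattern[] = { {-1,-1}, {1,-1}, {-1,1}, {1,1} };

    /* Each output coordinate is the top-left vertex of one candidate
     * (third) reference block derived from the given search base point. */
    static void enumerate_candidates(int base_x, int base_y,
                                     const Offset *pattern, size_t count,
                                     int out_x[], int out_y[]) {
        for (size_t i = 0; i < count; ++i) {
            out_x[i] = base_x + pattern[i].dx;
            out_y[i] = base_y + pattern[i].dy;
        }
    }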
304. For any one of the N third reference blocks, correspondingly determine a fourth reference block in the second reference image according to the first search base point, the position of the third reference block, and the second search base point, to obtain N reference block groups, where one reference block group includes one third reference block and one fourth reference block.
Optionally, if the temporal distances from the image in which the current image block is located to the forward reference image and to the backward reference image are the same, for example, the forward reference image and the backward reference image are each ±0.04 s away from the image in which the current image block is located, the reference block groups may be found by using a motion vector difference (MVD) mirroring constraint. The MVD mirroring constraint here is as follows: if the position of a third reference block (base point) is offset from the first reference block (first search base point) by Offset0 (deltaX0, deltaY0), then the candidate image block in the backward image whose position is offset from the second reference block (second search base point) by Offset1 (deltaX1, deltaY1) is determined as the corresponding fourth reference block, where deltaX1 = -deltaX0 and deltaY1 = -deltaY0. Correspondingly, Offset0 (deltaX0, deltaY0) may represent the i-th vector, and Offset1 (deltaX1, deltaY1) may represent the j-th vector. In another optional implementation, even if the temporal distances from the image in which the current image block is located to the forward reference image and to the backward reference image are different, the MVD mirroring constraint may still be used to find the reference block groups. In this implementation, the i-th vector and the j-th vector are equal in magnitude and opposite in direction.
It should be understood that the "group" mentioned in this application is essentially intended to express a correspondence and does not constitute any form of limitation.
As an extension, if the temporal distances from the image in which the current image block is located to the forward reference image and to the backward reference image are different, for example, the forward reference image and the backward reference image are at temporal distances t1 and t2, respectively, from the image in which the current image block is located, the following constraint may be used: if the position of a third reference block (base point) is offset from the first reference block (first search base point) by Offset00 (deltaX00, deltaY00), then the image block in the backward image whose position is offset from the second reference block (second search base point) by Offset01 (deltaX01, deltaY01) is determined as the corresponding fourth reference block, where deltaX01 = -deltaX00*t2/t1 and deltaY01 = -deltaY00*t2/t1. Correspondingly, Offset00 (deltaX00, deltaY00) may represent the i-th vector, and Offset01 (deltaX01, deltaY01) may represent the j-th vector.
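The MVD mirroring constraint and its scaled extension can be sketched in C as follows, assuming t1 and t2 are positive integer temporal distances (for example, picture-order-count differences); the integer scaling, the rounding behaviour, and the names used are illustrative assumptions.

    /* Given the offset (delta_x0, delta_y0) of a third reference block
     * relative to the first search base point, derive the offset of the
     * corresponding fourth reference block relative to the second search
     * base point. */
    static void mirror_offset(int delta_x0, int delta_y0,
                              int t1, int t2,
                              int *delta_x1, int *delta_y1) {
        if (t1 == t2) {
            /* Equal temporal distances: offsets are equal and opposite. */
            *delta_x1 = -delta_x0;
            *delta_y1 = -delta_y0;
        } else {
            /* Unequal distances: scale by the ratio of the distances
             * (truncating integer division, kept simple for the sketch). */
            *delta_x1 = -delta_x0 * t2 / t1;
            *delta_y1 = -delta_y0 * t2 / t1;
        }
    }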
It should be understood that, in the process of computing the temporal distance, the first reference block, the first search base point, and the first reference image serve an equivalent function, and the current image block, the base point of the current image block, and the image in which the current image block is located serve an equivalent function; in essence, what is computed is the temporal distance between the first reference image and the image in which the current image block is located. The same applies to computing the temporal distance between the second reference image and the image in which the current image block is located, that is, the time interval between frames.
It can be concluded from the foregoing implementations that, whenever a third reference block is found, a fourth reference block is correspondingly determined; that is, the determined third reference blocks and fourth reference blocks are equal in number and are in a one-to-one spatial correspondence. Finally, N reference block groups can be obtained, where one reference block group includes one third reference block and one fourth reference block.
It should be understood that the position offset referred to above may refer to an offset between base points or an offset between image blocks, and represents a relative position.
Optionally, the N reference block groups may include the foregoing first reference block and the foregoing second reference block; that is, the first reference block may be a third reference block and, correspondingly, the second reference block may be a fourth reference block. In this implementation, both the i-th vector and the j-th vector are 0. In particular, when the first reference block is a third reference block, the second reference block is its corresponding fourth reference block.
As a supplementary note, the third reference block and/or the fourth reference block mentioned in this application are not limited to an image block at one specific position but may represent a class of reference blocks, which may be one specific image block or multiple image blocks. For example, a third reference block may be any image block found by searching around the first base point, and a fourth reference block may be an image block correspondingly determined from any such image block; therefore, the fourth reference block may be one specific image block or multiple image blocks.
Step 305: Increase the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision, and compute the image block matching costs of the N reference block groups at the first pixel precision.
How the image block matching cost of a reference block group is computed is first illustrated by using one reference block group as a specific example; this reference block group includes a third reference block and the fourth reference block determined corresponding to it. First, the pixel values of the third reference block and the fourth reference block are raised to the first pixel precision. The third reference block and the fourth reference block are both image blocks that have already been encoded or decoded, so their pixel values have the precision of the code stream; for example, if the code stream precision is 8 bits, the pixel precision of the pixel values of the third reference block and the fourth reference block is 8 bits. To find reference blocks whose images are more similar, the precision of the pixels of the third reference block and the fourth reference block needs to be increased. The precision can be increased by existing interpolation or shifting techniques, which are not described in detail in this application. To facilitate the subsequent computation of the image block matching cost, the precision of the image blocks used in the cost computation needs to be raised to the same precision, for example, 14 bits. After the foregoing operations, the 14-bit pixel values of the third reference block, denoted pi[x, y], and the 14-bit pixel values of the fourth reference block, denoted pj[x, y], are obtained, where x and y are coordinates. An image block matching cost eij, which may also be called an image block matching error eij, is computed from pi[x, y] and pj[x, y]. There are many ways to compute the image block matching error, such as the SAD criterion and the MR-SAD criterion; other evaluation criteria in the prior art may also be used, and the computation method of the image block matching error is not limited in the present invention.
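As an illustration of the 8-bit code-stream and 14-bit matching-precision example above, a SAD-based matching cost might be computed as in the following C sketch; the buffer layout, strides, and shift amount are assumptions for this example, and MR-SAD or another criterion could be used instead.

    #include <stdint.h>
    #include <stdlib.h>

    /* Raise the 8-bit reconstructed samples of one third and one fourth
     * reference block to 14-bit precision and accumulate the sum of
     * absolute differences as the matching cost eij. */
    static uint64_t block_matching_cost_sad(const uint8_t *p3, int stride3,
                                            const uint8_t *p4, int stride4,
                                            int width, int height) {
        const int shift = 14 - 8;   /* 8-bit code-stream samples to 14 bits */
        uint64_t sad = 0;
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                int a = p3[y * stride3 + x] << shift;   /* pi[x, y] at 14 bits */
                int b = p4[y * stride4 + x] << shift;   /* pj[x, y] at 14 bits */
                sad += (uint64_t)abs(a - b);
            }
        }
        return sad;
    }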
If there are two or more reference block groups, the foregoing image block matching cost computation may be performed for each of the reference block groups.
In one implementation, the foregoing first reference block is a third reference block and the foregoing second reference block is a fourth reference block, and the pixel values of the first reference block and the second reference block may be obtained by a motion compensation method. Motion compensation means that, according to a motion vector, a reconstructed reference image (with the pixel precision of the code stream) is pointed to, and the pixel values of the reference block of the current image block (with the first pixel precision) are obtained. For example, if the position pointed to by the motion vector is a fractional-pixel position, the pixel values at integer-pixel positions of the reference image need to be interpolated by using an interpolation filter to obtain the pixel values at the fractional-pixel position as the pixel values of the reference block of the current image block; if the position pointed to by the motion vector is an integer-pixel position, a shifting operation may be used. The sum of the coefficients of the interpolation filter, that is, the interpolation filter gain, is a power of 2; for example, if the exponent N is 6, the interpolation filter gain is 6 bits. In the interpolation operation, because the interpolation filter gain is usually greater than 1, the precision of the obtained pixel values of the forward reference block and the backward reference block is higher than the pixel precision of the code stream. To reduce precision loss, no shifting and/or clipping operation is performed at this point, so as to retain the high-precision pixel values of the forward reference block and the backward reference block. For example, if the pixel value precision bitDepth of the predicted image is 8 bits and the interpolation filter gain is 6 bits, predicted pixel values with a precision of 14 bits are obtained; if bitDepth is 10 bits and the interpolation filter gain is 6 bits, predicted pixel values with a precision of 16 bits are obtained; if bitDepth is 10 bits and the interpolation filter gain is 6 bits, a further right shift by 2 bits yields predicted pixel values with a precision of 14 bits. Commonly used interpolation filters have 4 taps, 6 taps, 8 taps, and so on. There are many motion compensation methods in the prior art, which are not described in detail in this application.
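The precision bookkeeping in the examples above (bit depth plus interpolation filter gain, with an optional right shift down to 14 bits) can be summarized by the following C sketch; it is not an interpolation filter implementation, and the function names are illustrative.

    /* Intermediate precision after interpolation: the N-bit filter gain
     * adds N bits to the sample bit depth, e.g. 8 + 6 = 14 or 10 + 6 = 16. */
    static int intermediate_precision(int bit_depth, int filter_gain_bits) {
        return bit_depth + filter_gain_bits;
    }

    /* Right shift needed to bring the intermediate value back to 14 bits,
     * e.g. 16 - 14 = 2; no shift is applied at 14 bits or below. */
    static int shift_to_14_bits(int intermediate_bits) {
        int shift = intermediate_bits - 14;
        return shift > 0 ? shift : 0;
    }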
In addition, as a supplementary note, the pixels of an image block referred to in this application may include luminance component samples (luma samples); correspondingly, a pixel is a luminance component sampling point, and a pixel value is a luminance component sample value.
306. Determine, from the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, where the target reference block group includes a target third reference block and a target fourth reference block.
Optionally, the image block matching cost criterion includes: determining the reference block group with the smallest image block matching cost as the target reference block group.
Optionally, the image block matching cost criterion may also include: determining, as the target reference block group, the first reference block group whose image block matching cost is less than a preset threshold.
It should be understood that step 304, step 305, and step 306 may be performed after step 303, or may be performed simultaneously with step 303. The step numbers do not constitute any limitation on the order in which the method is performed.
For example, each time a third reference block is determined, a fourth reference block is correspondingly determined, and the image block matching cost of this group consisting of the third reference block and the fourth reference block is computed. If, when the N-th reference block group is computed, the image block matching cost result satisfies a preset condition, for example, it is less than a preset threshold or is even 0, the N-th reference block group is used as the target reference block group. It is then unnecessary to determine and compute further third reference blocks and fourth reference blocks, which reduces the computational complexity; here N is greater than or equal to 1.
Alternatively, N third reference blocks may be determined first, and N fourth reference blocks are determined in one-to-one correspondence to form N reference block groups; then, for these N reference block groups, the image block matching error corresponding to each reference block group is computed and compared, and the group whose image block matching cost result satisfies the preset condition is taken. For example, if the condition is the smallest image block matching error, the reference block group with the smallest image block matching cost is selected as the target reference block group (if more than one group has the smallest cost, one of them is selected arbitrarily).
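The two selection strategies described above (taking the first group whose cost falls below a threshold, or evaluating all N groups and keeping the minimum) might be combined as in the following C sketch; the callback signature and parameter names are assumptions made for illustration.

    #include <stdint.h>

    /* Caller-supplied cost function: returns the matching cost of the
     * reference block group with the given index. */
    typedef uint64_t (*GroupCostFn)(int group_index, void *ctx);

    /* Returns the index of the target reference block group among
     * n_groups candidates (n_groups >= 1). */
    static int select_target_group(int n_groups, GroupCostFn cost_of, void *ctx,
                                   uint64_t threshold, int use_early_termination) {
        int best = 0;
        uint64_t best_cost = cost_of(0, ctx);
        if (use_early_termination && best_cost < threshold)
            return 0;                     /* first group already satisfies the threshold */
        for (int i = 1; i < n_groups; ++i) {
            uint64_t c = cost_of(i, ctx);
            if (use_early_termination && c < threshold)
                return i;                 /* first group meeting the threshold */
            if (c < best_cost) {          /* otherwise track the minimum cost  */
                best_cost = c;
                best = i;
            }
        }
        return best;
    }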
Correspondingly, the third reference block and the fourth reference block in the target reference block group, that is, the target third reference block and the target fourth reference block, may also be called the optimal forward reference block and the optimal backward reference block of the current image block, respectively, and are used in the calculation in step 307.
As a supplementary note, because the third reference blocks are determined based on the first reference block having the first pixel precision and the fourth reference blocks are determined based on the second reference block having the first pixel precision, the pixel precision of the third reference blocks and the fourth reference blocks is also the first pixel precision, that is, higher than the pixel precision of the code stream.
307. Obtain a pixel prediction value of the current image block according to the pixel values of the target third reference block at the first precision and the pixel values of the target fourth reference block at the first precision, where the pixel prediction value of the current image block has a second pixel precision, and the second pixel precision is less than the first pixel precision.
"Weighted averaging + shifting" is performed on the obtained pixel values of the target third reference block (at the first pixel precision) and the pixel values of the target fourth reference block (at the first pixel precision) to obtain the pixel prediction value of the current image block (at the second pixel precision).
Optionally, the second pixel precision is the same as the pixel precision (bitDepth) of the code stream.
In a specific implementation process, the pixel values of the target third reference block at the first precision, predSamplesL0'[x][y], and the pixel values of the target fourth reference block at the first precision, predSamplesL1'[x][y], are obtained; the pixel prediction value of the current image block is predSamples'[x][y] = Clip3(0, (1<<bitDepth)-1, (predSamplesL0'[x][y]+predSamplesL1'[x][y]+offset2)>>shift2), where bitDepth is the pixel precision of the code stream, shift2 is a shift parameter, and offset2 is equal to 1<<(shift2-1). x and y are the horizontal and vertical coordinates of each pixel in the image block, and the operation shown in the foregoing formula is performed for each pixel in the image block. For example, if the precision of the pixel values of the target third reference block is 14 bits, the precision of the pixel values of the target fourth reference block is 14 bits, and shift2 is 15-bitDepth, then the precision of the pixel prediction value of the current image block is 14+1-shift2 = bitDepth.
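The "weighted averaging + shifting" formula above can be written as the following C sketch; the buffer types and the assumption that both inputs are already at the 14-bit first precision are illustrative.

    #include <stdint.h>

    static int clip3(int lo, int hi, int v) {
        return v < lo ? lo : (v > hi ? hi : v);
    }

    /* Average the 14-bit pixel values of the target third and fourth
     * reference blocks, round, shift down and clip to the code-stream bit
     * depth, following predSamples'[x][y] = Clip3(0, (1<<bitDepth)-1,
     * (predSamplesL0'[x][y] + predSamplesL1'[x][y] + offset2) >> shift2). */
    static void predict_current_block(const int16_t *pred_l0, const int16_t *pred_l1,
                                      uint16_t *dst, int width, int height,
                                      int bit_depth) {
        const int shift2  = 15 - bit_depth;      /* e.g. 7 for an 8-bit code stream */
        const int offset2 = 1 << (shift2 - 1);   /* rounding offset                 */
        const int max_val = (1 << bit_depth) - 1;
        for (int i = 0; i < width * height; ++i) {
            int sum = pred_l0[i] + pred_l1[i] + offset2;
            dst[i] = (uint16_t)clip3(0, max_val, sum >> shift2);
        }
    }

For an 8-bit code stream this gives shift2 = 7 and offset2 = 64, matching the precision example in the text.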
In summary, because the first reference block and the second reference block obtained from the initial motion information cannot necessarily predict the current image block accurately, this application adopts a new method to find a more suitable target third reference block and target fourth reference block, and the current image block is predicted by using the pixel values of the target third reference block and the target fourth reference block. With the prediction method provided by the present invention, high-precision pixel values can be maintained throughout the matching process, and repeated clipping operations and motion compensation operations are not required, which reduces the complexity of encoding and decoding.
It should be understood that the image prediction method in the embodiments of this application may occur in the inter prediction processes shown in FIG. 1 and FIG. 2, and may specifically be performed by the inter prediction module in an encoder or a decoder. In addition, the image prediction method in the embodiments of this application may be implemented in any electronic device or apparatus that may need to encode and/or decode video images.
Based on the prediction method provided by the foregoing embodiments, an embodiment of the present invention provides an image prediction apparatus. The image prediction apparatus of the embodiments of this application is described below with reference to FIG. 7. The image prediction apparatus shown in FIG. 7 corresponds to the method shown in FIG. 3 and can perform the steps of the method shown in FIG. 3. For brevity, repeated descriptions are appropriately omitted below.
Referring to FIG. 7, an image prediction apparatus 700 is provided. The apparatus 700 includes:
an obtaining unit 701, configured to obtain initial predicted motion information of a current image block, where this unit may be implemented by a processor invoking code in a memory;
a determining unit 702, configured to determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and determine a second reference block corresponding to the current image block in a second reference image, where the first reference block includes a first search base point and the second reference block includes a second search base point, and this unit may be implemented by a processor invoking code in a memory;
a searching unit 703, configured to determine N third reference blocks in the first reference image, where this unit may be implemented by a processor invoking code in a memory;
a mapping unit 704, configured to: for any one of the N third reference blocks, correspondingly determine a fourth reference block in the second reference image according to the first search base point, the position of the third reference block, and the second search base point, to obtain N reference block groups, where one reference block group includes one third reference block and one fourth reference block, N is greater than or equal to 1, and this unit may be implemented by a processor invoking code in a memory;
a calculating unit 705, configured to increase the obtained pixel values of the third reference blocks and the fourth reference blocks to a first pixel precision, and compute image block matching costs of the N reference block groups at the first pixel precision, where this unit may be implemented by a processor invoking code in a memory;
a selecting unit 706, configured to determine, from the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, where the target reference block group includes a target third reference block and a target fourth reference block, and this unit may be implemented by a processor invoking code in a memory; and
a prediction unit 707, configured to obtain a pixel prediction value of the current image block according to the pixel values of the target third reference block at the first precision and the pixel values of the target fourth reference block at the first precision, where the pixel prediction value of the current image block has a second pixel precision, the second pixel precision is less than the first pixel precision, and this unit may be implemented by a processor invoking code in a memory.
In a specific implementation process, the obtaining unit 701 is specifically configured to perform the method mentioned in step 301 above and methods that can be equivalently substituted for it; the determining unit 702 is specifically configured to perform the method mentioned in step 302 above and methods that can be equivalently substituted for it; the searching unit 703 is specifically configured to perform the method mentioned in step 303 above and methods that can be equivalently substituted for it; the mapping unit 704 is specifically configured to perform the method mentioned in step 304 above and methods that can be equivalently substituted for it; the calculating unit 705 is specifically configured to perform the method mentioned in step 305 and methods that can be equivalently substituted for it; the selecting unit 706 is specifically configured to perform the method mentioned in step 306 and methods that can be equivalently substituted for it; and the prediction unit 707 is specifically configured to perform the method mentioned in step 307 and methods that can be equivalently substituted for it. The corresponding explanations, descriptions, refinements, and optional alternative implementations in the foregoing specific method embodiments also apply to the method execution in the apparatus.
The apparatus 700 may specifically be a video encoding apparatus, a video decoding apparatus, a video coding and decoding system, or another device having video coding and decoding functions. The apparatus 700 may be used for image prediction in an encoding process or for image prediction in a decoding process, especially inter prediction in video images. The apparatus 700 includes several functional units for implementing any one of the possible manners of the foregoing method.
This application further provides a terminal device. The terminal device includes: a memory, configured to store a program; and a processor, configured to execute the program stored in the memory. When the program is executed, the processor is configured to perform the image prediction method of the embodiments of this application, including steps 301-307.
The terminal device here may be a video display device, a smartphone, a portable computer, or another device that can process or play video.
This application further provides a video encoder, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiments of this application, including steps 301-307.
This application further provides a video decoder, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiments of this application, including steps 301-307.
This application further provides a video encoding system, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the image prediction method of the embodiments of this application, including steps 301-307.
This application further provides a computer-readable medium. The computer-readable medium stores program code to be executed by a device, and the program code includes instructions for performing the image prediction method of the embodiments of this application, including program code for implementing steps 301-307.
This application further provides a decoder. The decoder includes the image prediction apparatus in the embodiments of this application, such as the apparatus 700, and a decoding reconstruction module, where the decoding reconstruction module is configured to obtain reconstructed pixel values of an image block according to the prediction values of the pixel values of the image block obtained by the image prediction apparatus.
This application further provides an encoder. The encoder includes the image prediction apparatus in the embodiments of this application, such as the apparatus 700, and an encoding reconstruction module, where the encoding reconstruction module is configured to obtain reconstructed pixel values of an image block according to the prediction values of the pixel values of the image block obtained by the image prediction apparatus.
FIG. 8 is a schematic block diagram of a video encoder according to an embodiment of this application. The video encoder 1000 shown in FIG. 8 includes an encoding-side prediction module 1001, a transform and quantization module 1002, an entropy encoding module 1003, an encoding reconstruction module 1004, and an encoding-side filtering module.
The video encoder 1000 shown in FIG. 8 can encode a video. Specifically, the video encoder 1000 can perform the video encoding process shown in FIG. 1 to encode the video. In addition, the video encoder 1000 can also perform the image prediction method of the embodiments of this application; the video encoder 1000 can perform the steps of the image prediction method shown in FIG. 3, including the refinements and alternative implementations of each step. The image prediction apparatus in the embodiments of this application may also be the encoding-side prediction module 1001 in the video encoder 1000.
FIG. 9 is a schematic block diagram of a video decoder according to an embodiment of this application. The video decoder 2000 shown in FIG. 9 includes an entropy decoding module 2001, an inverse transform and inverse quantization module 2002, a decoding-side prediction module 2003, a decoding reconstruction module 2004, and a decoding-side filtering module 2005.
The video decoder 2000 shown in FIG. 9 can decode a video. Specifically, the video decoder 2000 can perform the video decoding process shown in FIG. 2 to decode the video. In addition, the video decoder 2000 can also perform the image prediction method of the embodiments of this application; the video decoder 2000 can perform the steps of the image prediction method shown in FIG. 3, including the refinements and alternative implementations of each step. The image prediction apparatus 700 in the embodiments of this application may also be the decoding-side prediction module 2003 in the video decoder 2000.
Application scenarios of the image prediction method in the embodiments of this application are described below with reference to FIG. 10 to FIG. 12. The image prediction method in the embodiments of this application may be performed by the video transmission system, the coding and decoding apparatus, and the coding and decoding system shown in FIG. 10 to FIG. 12.
FIG. 10 is a schematic block diagram of a video transmission system according to an embodiment of this application.
As shown in FIG. 10, the video transmission system includes a collection module 3001, an encoding module 3002, a sending module 3003, network transmission 3004, a receiving module 3005, a decoding module 3006, and a rendering module 3007.
The specific functions of the modules in the video transmission system are as follows:
The collection module 3001 includes a camera or a camera group and is configured to collect video images and perform pre-encoding processing on the collected video images to convert optical signals into a digitized video sequence.
The encoding module 3002 is configured to encode the video sequence to obtain a bitstream.
The sending module 3003 is configured to send the encoded bitstream.
The receiving module 3005 is configured to receive the bitstream sent by the sending module 3003.
The network 3004 is configured to transmit the bitstream sent by the sending module 3003 to the receiving module 3005.
The decoding module 3006 is configured to decode the bitstream received by the receiving module 3005 to reconstruct the video sequence.
The rendering module 3007 is configured to render the reconstructed video sequence decoded by the decoding module 3006 to improve the display effect of the video.
The video transmission system shown in FIG. 10 can perform the image prediction method of the embodiments of this application; specifically, both the encoding module 3002 and the decoding module 3006 in the video transmission system shown in FIG. 10 can perform the image prediction method of the embodiments of this application, including steps 301-307 and the refinements and alternative implementations of each step. In addition, the collection module 3001, the encoding module 3002, and the sending module 3003 in the video transmission system shown in FIG. 10 correspond to the video encoder 1000 shown in FIG. 8, and the receiving module 3005, the decoding module 3006, and the rendering module 3007 in the video transmission system shown in FIG. 10 correspond to the video decoder 2000 shown in FIG. 9.
A coding and decoding apparatus, and a coding and decoding system composed of coding and decoding apparatuses, are described in detail below with reference to FIG. 11 and FIG. 12. It should be understood that the coding and decoding apparatus and the coding and decoding system shown in FIG. 11 and FIG. 12 can perform the image prediction method of the embodiments of this application.
FIG. 11 is a schematic diagram of a video coding and decoding apparatus according to an embodiment of this application. The video coding and decoding apparatus 50 may be an apparatus dedicated to encoding and/or decoding video images, or an electronic device having video coding and decoding functions; further, the coding and decoding apparatus 50 may be a mobile terminal or user equipment of a wireless communication system.
The coding and decoding apparatus 50 may include the following modules or units: a controller 56, a codec 54, a radio interface 52, an antenna 44, a smart card 46, a card reader 48, a keypad 34, a memory 58, an infrared port 42, and a display 32. In addition to the modules and units shown in FIG. 11, the coding and decoding apparatus 50 may further include a microphone or any appropriate audio input module, which may be a digital or analog signal input; the coding and decoding apparatus 50 may further include an audio output module, which may be a headset, a speaker, or an analog or digital audio output connection. The coding and decoding apparatus 50 may also include a battery, which may be a solar cell, a fuel cell, or the like. The coding and decoding apparatus 50 may further include an infrared port for short-range line-of-sight communication with other devices; the coding and decoding apparatus 50 may also communicate with other devices by using any appropriate short-range communication mode, for example, a Bluetooth wireless connection or a USB/FireWire wired connection.
The memory 58 may store data in the form of images and audio data, and may also store instructions to be executed on the controller 56.
The codec 54 may implement encoding and decoding of audio and/or video data, or implement, under the control of the controller 56, assisted encoding and assisted decoding of audio and/or video data.
The smart card 46 and the card reader 48 may provide user information, and may also provide authentication information for network authentication and authorization of the user. Specific implementation forms of the smart card 46 and the card reader 48 may be a universal integrated circuit card (UICC) and a UICC reader.
The radio interface circuit 52 may generate a wireless communication signal, which may be a communication signal generated during communication in a cellular communication network, a wireless communication system, or a wireless local area network.
The antenna 44 is configured to send, to one or more other apparatuses, the radio frequency signal generated by the radio interface circuit 52, and may also be configured to receive radio frequency signals from one or more other apparatuses.
In some embodiments of this application, the coding and decoding apparatus 50 may receive to-be-processed video image data from another device before transmission and/or storage. In other embodiments of this application, the coding and decoding apparatus 50 may receive images over a wireless or wired connection and encode/decode the received images.
FIG. 12 is a schematic block diagram of a video coding and decoding system 7000 according to an embodiment of this application.
As shown in FIG. 12, the video coding and decoding system 7000 includes a source apparatus 4000 and a destination apparatus 5000. The source apparatus 4000 generates encoded video data, and may also be referred to as a video encoding apparatus or a video encoding device; the destination apparatus 5000 may decode the encoded video data generated by the source apparatus 4000, and may also be referred to as a video decoding apparatus or a video decoding device.
Specific implementation forms of the source apparatus 4000 and the destination apparatus 5000 may be any one of the following devices: a desktop computer, a mobile computing apparatus, a notebook (for example, laptop) computer, a tablet computer, a set-top box, a smartphone, a handset, a television, a camera, a display apparatus, a digital media player, a video game console, an in-vehicle computer, or another similar device.
目的地装置5000可以经由信道6000接收来自源装置4000编码后的视频数据。信道6000可包括能够将编码后的视频数据从源装置4000移动到目的地装置5000的一个或多个媒体及/或装置。在一个实例中,信道6000可以包括使源装置4000能够实时地将编码后的视频数据直接发射到目的地装置5000的一个或多个通信媒体,在此实例中,源装置4000可以根据通信标准(例如,无线通信协议)来调制编码后的视频数据,并且可以将调制后的视频数据发射到目的地装置5000。上述一个或多个通信媒体可以包含无线及/或有线通信媒体,例如射频(Radio Frequency,RF)频谱或一根或多根物理传输线。上述一个或多个通信媒体可以形成基于包的网络(例如,局域网、广域网或全球网络(例如,因特网))的部分。上述一个或多个通信媒体可以包含路由器、交换器、基站,或者实现从源装置4000到目的地装置5000的通信的其它设备。Destination device 5000 can receive video data encoded by source device 4000 via channel 6000. Channel 6000 can include one or more media and/or devices capable of moving encoded video data from source device 4000 to destination device 5000. In one example, channel 6000 can include one or more communication media that enable source device 4000 to transmit encoded video data directly to destination device 5000 in real time, in which case source device 4000 can be based on communication standards ( For example, a wireless communication protocol) modulates the encoded video data, and the modulated video data can be transmitted to the destination device 5000. The one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media described above may form part of a packet-based network (eg, a local area network, a wide area network, or a global network (eg, the Internet)). The one or more communication media described above may include a router, a switch, a base station, or other device that enables communication from the source device 4000 to the destination device 5000.
在另一实例中,信道6000可包含存储由源装置4000产生的编码后的视频数据的存储媒体。在此实例中,目的地装置5000可经由磁盘存取或卡存取来存取存储媒体。存储媒体可包含多种本地存取式数据存储媒体,例如蓝光光盘、高密度数字视频光盘(Digital Video Disc,DVD)、只读光盘(Compact Disc Read-Only Memory,CD-ROM)、快闪存储器,或用于存储经编码视频数据的其它合适数字存储媒体。In another example, channel 6000 can include a storage medium that stores encoded video data generated by source device 4000. In this example, destination device 5000 can access the storage medium via disk access or card access. The storage medium may include a variety of locally accessible data storage media, such as Blu-ray Disc, High Density Digital Video Disc (DVD), Compact Disc Read-Only Memory (CD-ROM), flash memory. Or other suitable digital storage medium for storing encoded video data.
在另一实例中,信道6000可包含文件服务器或存储由源装置4000产生的编码后的视频数据的另一中间存储装置。在此实例中,目的地装置5000可经由流式传输或下载来存取存储于文件服务器或其它中间存储装置处的编码后的视频数据。文件服务器可以是能够存储编码后的视频数据且将所述编码后的视频数据发射到目的地装置5000的服务器类型。例如,文件服务器可以包含全球广域网(World Wide Web,Web)服务器(例如,用于网站)、文件传送协议(File Transfer Protocol,FTP)服务器、网络附加存储(Network Attached Storage,NAS)装置以及本地磁盘驱动器。In another example, channel 6000 can include a file server or another intermediate storage device that stores encoded video data generated by source device 4000. In this example, destination device 5000 can access the encoded video data stored at a file server or other intermediate storage device via streaming or download. The file server may be a server type capable of storing encoded video data and transmitting the encoded video data to the destination device 5000. For example, the file server may include a World Wide Web (Web) server (for example, for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, and a local disk. driver.
目的地装置5000可经由标准数据连接(例如,因特网连接)来存取编码后的视频数据。数据连接的实例类型包含适合于存取存储于文件服务器上的编码后的视频数据的无线信道、有线连接(例如,缆线调制解调器等),或两者的组合。编码后的视频数据从文件服务器的发射可为流式传输、下载传输或两者的组合。Destination device 5000 can access the encoded video data via a standard data connection (e.g., an internet connection). The instance type of the data connection includes a wireless channel, a wired connection (e.g., a cable modem, etc.), or a combination of both, suitable for accessing the encoded video data stored on the file server. The transmission of the encoded video data from the file server may be streaming, downloading, or a combination of both.
本申请的图像预测方法不限于无线应用场景,示例性的,本申请的图像预测方法可以应用于支持以下应用等多种多媒体应用的视频编解码:空中电视广播、有线电视发射、卫星电视发射、流式传输视频发射(例如,经由因特网)、存储于数据存储媒体上的视频数据的编码、存储于数据存储媒体上的视频数据的解码,或其它应用。在一些实例中,视频编解码系统7000可经配置以支持单向或双向视频发射,以支持例如视频流式传输、视频播放、视频广播及/或视频电话等应用。The image prediction method of the present application is not limited to a wireless application scenario. Illustratively, the image prediction method of the present application can be applied to video codec supporting multiple multimedia applications such as the following applications: aerial television broadcasting, cable television transmission, satellite television transmission, Streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other application. In some examples, video codec system 7000 can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
在图12中,源装置4000包含视频源4001、视频编码器4002及输出接口4003。在一些实例中,输出接口4003可包含调制器/解调器(调制解调器)及/或发射器。视频源4001可包含视频俘获装置(例如,视频相机)、含有先前俘获的视频数据的视频存档、用以从视频内容提供者接收视频数据的视频输入接口,及/或用于产生视频数据的计算机图形系统,或上述视频数据源的组合。In FIG. 12, the source device 4000 includes a video source 4001, a video encoder 4002, and an output interface 4003. In some examples, output interface 4003 can include a modulator/demodulator (modem) and/or a transmitter. Video source 4001 can include a video capture device (eg, a video camera), a video archive containing previously captured video data, a video input interface to receive video data from a video content provider, and/or a computer for generating video data A graphics system, or a combination of the above video data sources.
视频编码器4002可编码来自视频源4001的视频数据。在一些实例中,源装置4000经由输出接口4003将编码后的视频数据直接发射到目的地装置5000。编码后的视频数据还可存储于存储媒体或文件服务器上以供目的地装置5000稍后存取以用于解码及/或播放。Video encoder 4002 can encode video data from video source 4001. In some examples, source device 4000 transmits the encoded video data directly to destination device 5000 via output interface 4003. The encoded video data may also be stored on a storage medium or file server for later access by the destination device 5000 for decoding and/or playback.
在图12的实例中,目的地装置5000包含输入接口5003、视频解码器5002及显示装置5001。在一些实例中,输入接口5003包含接收器及/或调制解调器。输入接口5003可经由信道6000接收编码后的视频数据。显示装置5001可与目的地装置5000整合或可在目的地装置5000外部。一般来说,显示装置5001显示解码后的视频数据。显示装置5001可包括多种显示装置,例如液晶显示器、等离子体显示器、有机发光二极管显示器或其它类型的显示装置。In the example of FIG. 12, the destination device 5000 includes an input interface 5003, a video decoder 5002, and a display device 5001. In some examples, input interface 5003 includes a receiver and/or a modem. The input interface 5003 can receive the encoded video data via the channel 6000. Display device 5001 may be integrated with destination device 5000 or may be external to destination device 5000. Generally, the display device 5001 displays the decoded video data. Display device 5001 can include a variety of display devices, such as liquid crystal displays, plasma displays, organic light emitting diode displays, or other types of display devices.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described herein again.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division of logical functions; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific embodiments of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
Claims (20)
- An image prediction method, wherein the method comprises: obtaining initial predicted motion information of a current image block; determining, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and determining a second reference block corresponding to the current image block in a second reference image, wherein the first reference block comprises a first search base point and the second reference block comprises a second search base point; determining N third reference blocks in the first reference image; for any one of the N third reference blocks, correspondingly determining one fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point, so as to obtain N reference block groups, wherein one reference block group comprises one third reference block and one fourth reference block, and N is greater than or equal to 1; increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision, and calculating image block matching costs of the N reference block groups at the first pixel precision; determining, among the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, the target reference block group comprising a target third reference block and a target fourth reference block; and obtaining a pixel prediction value of the current image block according to the pixel values of the target third reference block at the first precision and the pixel values of the target fourth reference block at the first precision, wherein the pixel prediction value of the current image block has a second pixel precision, and the second pixel precision is lower than the first pixel precision.
- The method according to claim 1, wherein the initial predicted motion information comprises a reference image index used to indicate that the two reference images comprise one forward reference image and one backward reference image.
- The method according to claim 1 or 2, wherein, for any one of the N third reference blocks, correspondingly determining one fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point comprises: if the first reference block is one of the third reference blocks, taking the second reference block as the corresponding fourth reference block, wherein the first reference block and the second reference block belong to one reference block group.
- The method according to claim 1 or 2, wherein, for any one of the N third reference blocks, correspondingly determining one fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point comprises: determining an i-th vector according to that third reference block and the first search base point; determining a j-th vector according to a temporal interval t1 of the current image block relative to the first reference image, a temporal interval t2 of the current image block relative to the second reference image, and the i-th vector, wherein the direction of the j-th vector is opposite to that of the i-th vector, and i and j are both positive integers not greater than N; and determining one fourth reference block according to the second search base point and the j-th vector.
- The method according to claim 1 or 2, wherein, for any one of the N third reference blocks, correspondingly determining one fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point comprises: determining an i-th vector according to that third reference block and the first search base point; determining a j-th vector according to the i-th vector, wherein the j-th vector is equal in magnitude and opposite in direction to the i-th vector, and i and j are both positive integers not greater than N; and determining one fourth reference block according to the second search base point and the j-th vector.
- The method according to any one of claims 1 to 5, wherein increasing the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision and calculating the image block matching costs of the N reference block groups at the first pixel precision comprises: for at least one reference block group among the N reference block groups, increasing the pixel values of the third reference block and the fourth reference block to the first pixel precision by interpolation or shifting, and calculating the image block matching cost at the first pixel precision; and determining, among the N reference block groups, the target reference block group that satisfies the image block matching cost criterion comprises: determining, as the target reference block group, the first reference block group found among the at least one reference block group whose image block matching cost is less than a preset threshold.
- The method according to any one of claims 1 to 5, wherein increasing the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision and calculating the image block matching costs of the N reference block groups at the first pixel precision comprises: increasing the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision by interpolation or shifting; and calculating an image block matching cost for each of the N reference block groups; and determining, among the N reference block groups, the target reference block group that satisfies the image block matching cost criterion comprises: determining, as the target reference block group, the reference block group with the smallest image block matching cost among the N reference block groups.
- The method according to any one of claims 1 to 7, wherein obtaining the pixel prediction value of the current image block according to the pixel values of the target third reference block at the first precision and the pixel values of the target fourth reference block at the first precision comprises: obtaining the pixel values predSamplesL0'[x][y] of the target third reference block at the first precision; obtaining the pixel values predSamplesL1'[x][y] of the target fourth reference block at the first precision; and obtaining the pixel prediction value of the current image block as predSamples'[x][y] = Clip3(0, (1 << bitDepth) - 1, (predSamplesL0'[x][y] + predSamplesL1'[x][y] + offset2) >> shift2), wherein bitDepth is the second pixel precision, shift2 is a shift parameter, and offset2 is equal to 1 << (shift2 - 1).
- An image prediction apparatus, wherein the apparatus comprises: an obtaining unit, configured to obtain initial predicted motion information of a current image block; a determining unit, configured to determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and determine a second reference block corresponding to the current image block in a second reference image, wherein the first reference block comprises a first search base point and the second reference block comprises a second search base point; a search unit, configured to determine N third reference blocks in the first reference image; a mapping unit, configured to, for any one of the N third reference blocks, correspondingly determine one fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point, so as to obtain N reference block groups, wherein one reference block group comprises one third reference block and one fourth reference block, and N is greater than or equal to 1; a calculation unit, configured to increase the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision, and calculate image block matching costs of the N reference block groups at the first pixel precision; a selection unit, configured to determine, among the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, the target reference block group comprising a target third reference block and a target fourth reference block; and a prediction unit, configured to obtain a pixel prediction value of the current image block according to the pixel values of the target third reference block at the first precision and the pixel values of the target fourth reference block at the first precision, wherein the pixel prediction value of the current image block has a second pixel precision, and the second pixel precision is lower than the first pixel precision.
- The apparatus according to claim 9, wherein the initial predicted motion information comprises a reference image index used to indicate that the two reference images comprise one forward reference image and one backward reference image.
- The apparatus according to claim 9 or 10, wherein the mapping unit is specifically configured to: if the first reference block is one of the third reference blocks, take the second reference block as the corresponding fourth reference block, wherein the first reference block and the second reference block belong to one reference block group.
- The apparatus according to claim 9 or 10, wherein the mapping unit is specifically configured to: determine an i-th vector according to any one of the N third reference blocks and the first search base point; determine a j-th vector according to a temporal interval t1 of the current image block relative to the first reference image, a temporal interval t2 of the current image block relative to the second reference image, and the i-th vector, wherein the direction of the j-th vector is opposite to that of the i-th vector, and i and j are both positive integers not greater than N; and determine one fourth reference block according to the second search base point and the j-th vector.
- The apparatus according to claim 9 or 10, wherein the mapping unit is specifically configured to: determine an i-th vector according to any one of the N third reference blocks and the first search base point; determine a j-th vector according to the i-th vector, wherein the j-th vector is equal in magnitude and opposite in direction to the i-th vector, and i and j are both positive integers not greater than N; and determine one fourth reference block according to the second search base point and the j-th vector.
- The apparatus according to any one of claims 9 to 13, wherein the calculation unit is specifically configured to: for at least one reference block group among the N reference block groups, increase the pixel values of the obtained third reference block and fourth reference block to the first pixel precision by interpolation or shifting, and calculate the image block matching cost at the first pixel precision; and the selection unit is specifically configured to: determine, as the target reference block group, the first reference block group found among the at least one reference block group whose image block matching error is less than a preset threshold.
- The apparatus according to any one of claims 9 to 13, wherein the calculation unit is specifically configured to: increase the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision by interpolation or shifting, and calculate an image block matching cost for each of the N reference block groups; and the selection unit is specifically configured to: determine, as the target reference block group, the reference block group with the smallest image block matching error among the N reference block groups.
- The apparatus according to any one of claims 9 to 15, wherein the prediction unit is specifically configured to: obtain the pixel values predSamplesL0'[x][y] of the target third reference block at the first precision; obtain the pixel values predSamplesL1'[x][y] of the target fourth reference block at the first precision; and obtain the pixel prediction value of the current image block as predSamples'[x][y] = Clip3(0, (1 << bitDepth) - 1, (predSamplesL0'[x][y] + predSamplesL1'[x][y] + offset2) >> shift2), wherein bitDepth is the second pixel precision, shift2 is a shift parameter, and offset2 is equal to 1 << (shift2 - 1).
- An encoder, configured to encode image blocks, wherein the encoder comprises: the image prediction apparatus according to any one of claims 9 to 16, wherein the image prediction apparatus is configured to obtain a predicted value of the pixel values of the current image block; and an encoding reconstruction module, configured to obtain reconstructed pixel values of the current image block according to the predicted value of the pixel values of the current image block.
- A decoder, configured to decode image blocks, wherein the decoder comprises: the image prediction apparatus according to any one of claims 9 to 16, wherein the image prediction apparatus is configured to obtain a predicted value of the pixel values of the current image block; and a decoding reconstruction module, configured to obtain reconstructed pixel values of the current image block according to the predicted value of the pixel values of the current image block.
- A computer-readable storage medium, wherein the computer-readable storage medium stores program code, and the program code comprises instructions for performing the method according to any one of claims 1 to 8.
- A terminal, wherein the terminal comprises a memory and a processor; the memory stores program instructions; and the processor is configured to invoke the program instructions to perform the method according to any one of claims 1 to 8.
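The following non-normative sketches illustrate, for readability only, the computations recited in the claims above; they are not part of the claims. The first sketch concerns the vector derivation of claims 4, 5, 12, and 13: the function names (mirror_offset, fourth_block_position) and the linear t2/t1 scaling used in the temporally scaled branch are assumptions chosen for illustration; the claims only require that the j-th vector point in the direction opposite to the i-th vector (claim 4) or be its exact equal-magnitude opposite (claim 5).

```python
# Non-normative sketch of the j-th vector derivation (claims 4/5, 12/13).
# All names and the t2/t1 scaling rule are illustrative assumptions.

def mirror_offset(v_i, t1=None, t2=None):
    """Return the j-th offset vector for the second reference image.

    v_i : (dx, dy) offset of a third reference block from the first search base point.
    t1, t2 : temporal intervals of the current block to the first and second
             reference images. If both are given, the offset is reversed and
             scaled (claim 4 behaviour, scaling assumed); otherwise it is simply
             reversed with equal magnitude (claim 5).
    """
    dx, dy = v_i
    if t1 is not None and t2 is not None and t1 != 0:
        scale = t2 / t1
        return (-dx * scale, -dy * scale)   # opposite direction, temporally scaled
    return (-dx, -dy)                        # equal magnitude, opposite direction

def fourth_block_position(second_base, v_j):
    """Position of the fourth reference block: second search base point plus v_j."""
    return (second_base[0] + v_j[0], second_base[1] + v_j[1])
```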
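Claims 6, 7, 14, and 15 leave the concrete matching cost open. The sketch below is a minimal illustration assuming the cost is a sum of absolute differences (SAD) between the two blocks of a group after their samples have been raised to the first pixel precision; the threshold-based early exit corresponds to claim 6 and the global minimum to claim 7. All function and parameter names are hypothetical.

```python
import numpy as np

# Assumed SAD-based matching cost; the claims do not mandate a particular cost.
def sad_cost(block_l0, block_l1):
    # Sum of absolute differences between two equally sized sample arrays.
    return int(np.abs(block_l0.astype(np.int64) - block_l1.astype(np.int64)).sum())

def select_target_group(groups, threshold=None):
    """groups: iterable of (third_block, fourth_block) arrays at the first precision.

    With a threshold, return the first group whose cost falls below it (claim 6);
    otherwise return the group with the minimum cost (claim 7).
    """
    best, best_cost = None, None
    for third, fourth in groups:
        cost = sad_cost(third, fourth)
        if threshold is not None and cost < threshold:
            return third, fourth              # first group below the threshold
        if best_cost is None or cost < best_cost:
            best, best_cost = (third, fourth), cost
    return best                               # minimum-cost group
```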
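The combination step of claims 8 and 16 is stated explicitly, so it can be transcribed almost directly. In the sketch below, pred_l0 and pred_l1 stand for predSamplesL0'[x][y] and predSamplesL1'[x][y]; only the choice of shift2 is left open by the claims, which require merely that offset2 = 1 << (shift2 - 1).

```python
import numpy as np

def clip3(lo, hi, x):
    # Clip3(lo, hi, x) as used in the claims: clamp x to the range [lo, hi].
    return np.clip(x, lo, hi)

def bi_predict(pred_l0, pred_l1, bit_depth, shift2):
    # predSamples'[x][y] = Clip3(0, (1 << bitDepth) - 1,
    #     (predSamplesL0'[x][y] + predSamplesL1'[x][y] + offset2) >> shift2)
    offset2 = 1 << (shift2 - 1)
    combined = (pred_l0.astype(np.int64) + pred_l1.astype(np.int64) + offset2) >> shift2
    return clip3(0, (1 << bit_depth) - 1, combined)
```

For an 8-bit output combined from 14-bit intermediate samples, HEVC-style designs would use shift2 = 7 and hence offset2 = 64; the claims themselves do not fix these values.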
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711494258.1 | 2017-12-31 | ||
CN201711494258.1A CN109996080B (en) | 2017-12-31 | 2017-12-31 | Image prediction method and device and coder-decoder |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019128716A1 true WO2019128716A1 (en) | 2019-07-04 |
Family
ID=67066492
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/120681 WO2019128716A1 (en) | 2017-12-31 | 2018-12-12 | Image prediction method, apparatus, and codec |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109996080B (en) |
WO (1) | WO2019128716A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116916043A (en) | 2019-09-24 | 2023-10-20 | Oppo广东移动通信有限公司 | Prediction value determination method, encoder, decoder, and computer storage medium |
CN113709500B (en) * | 2019-12-23 | 2022-12-23 | 杭州海康威视数字技术股份有限公司 | Encoding and decoding method, device and equipment |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100551073C (en) * | 2006-12-05 | 2009-10-14 | 华为技术有限公司 | Decoding method and device, image element interpolation processing method and device |
EP4425925A2 (en) * | 2011-01-07 | 2024-09-04 | Nokia Technologies Oy | Motion prediction in video coding |
- 2017-12-31 CN CN201711494258.1A patent/CN109996080B/en active Active
- 2018-12-12 WO PCT/CN2018/120681 patent/WO2019128716A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1525762A (en) * | 2003-09-12 | 2004-09-01 | 中国科学院计算技术研究所 | A coding/decoding end bothway prediction method for video coding |
GB2521349A (en) * | 2013-12-05 | 2015-06-24 | Sony Corp | Data encoding and decoding |
WO2017057947A1 (en) * | 2015-10-01 | 2017-04-06 | 엘지전자(주) | Image processing method on basis of inter prediction mode and apparatus therefor |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12003764B2 (en) | 2019-09-27 | 2024-06-04 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Prediction method for current block and electronic device |
CN110992399A (en) * | 2019-11-11 | 2020-04-10 | 北京空间机电研究所 | High-precision target atmospheric disturbance detection method |
CN114040209A (en) * | 2021-10-21 | 2022-02-11 | 百果园技术(新加坡)有限公司 | Motion estimation method, motion estimation device, electronic equipment and storage medium |
CN116847088A (en) * | 2023-08-24 | 2023-10-03 | 深圳传音控股股份有限公司 | Image processing method, processing apparatus, and storage medium |
CN116847088B (en) * | 2023-08-24 | 2024-04-05 | 深圳传音控股股份有限公司 | Image processing method, processing apparatus, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109996080A (en) | 2019-07-09 |
CN109996080B (en) | 2023-01-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3672249B1 (en) | Inter frame prediction method and device for video images | |
WO2019128716A1 (en) | Image prediction method, apparatus, and codec | |
AU2023200956B2 (en) | Video data inter prediction method and apparatus | |
WO2017129023A1 (en) | Decoding method, encoding method, decoding apparatus, and encoding apparatus | |
CN115941942A (en) | Video encoder, video decoder and corresponding encoding and decoding methods | |
CN110121073B (en) | Bidirectional interframe prediction method and device | |
WO2019109955A1 (en) | Interframe prediction method and apparatus, and terminal device | |
US20220094947A1 (en) | Method for constructing mpm list, method for obtaining intra prediction mode of chroma block, and apparatus | |
US20240040113A1 (en) | Video picture decoding and encoding method and apparatus | |
US11412210B2 (en) | Inter prediction method and apparatus for video coding | |
US12010293B2 (en) | Picture prediction method and apparatus, and computer-readable storage medium | |
CA3137980A1 (en) | Picture prediction method and apparatus, and computer-readable storage medium | |
US11109060B2 (en) | Image prediction method and apparatus | |
US20220109830A1 (en) | Method for constructing merge candidate motion information list, apparatus, and codec | |
CN111327907B (en) | Method, device and equipment for inter-frame prediction and storage medium | |
WO2019233423A1 (en) | Motion vector acquisition method and device | |
WO2019091372A1 (en) | Image prediction method and device | |
US11902506B2 (en) | Video encoder, video decoder, and corresponding methods | |
WO2023051156A1 (en) | Video image processing method and apparatus | |
WO2020135615A1 (en) | Video image decoding method and apparatus | |
RU2787885C2 (en) | Method and equipment for mutual prediction, bit stream and non-volatile storage carrier | |
RU2822447C2 (en) | Method and equipment for mutual prediction | |
RU2798316C2 (en) | Method and equipment for external prediction | |
CN110677645B (en) | Image prediction method and device | |
CN110971899A (en) | Method for determining motion information, and inter-frame prediction method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18894892; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 18894892; Country of ref document: EP; Kind code of ref document: A1 |