WO2019128716A1 - Image prediction method, apparatus, and codec

Image prediction method, apparatus, and codec

Info

Publication number: WO2019128716A1
Application number: PCT/CN2018/120681
Authority: WIPO (PCT)
Prior art keywords: reference block, block, image, pixel, precision
Other languages: English (en), Chinese (zh)
Inventors: 高山, 马祥, 陈焕浜, 杨海涛
Original assignee: 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司
Publication of WO2019128716A1


Classifications

    • H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N 19/176 Adaptive coding characterised by the coding unit, the unit being an image region that is a block, e.g. a macroblock
    • H04N 19/503 Predictive coding involving temporal prediction
    • H04N 19/52 Processing of motion vectors by predictive encoding
    • H04N 19/523 Motion estimation or motion compensation with sub-pixel accuracy

Definitions

  • the present application relates to the field of video codec technology, and in particular to an inter-frame prediction method and apparatus for video images, and a corresponding encoder and decoder.
  • Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones (so-called "smartphones"), video teleconferencing devices, video streaming devices, and the like.
  • Digital video devices implement video compression techniques, for example, those defined in the MPEG-2, MPEG-4, ITU-T H.263, and ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC) standards, the H.265/High Efficiency Video Coding (HEVC) standard, and the extensions of such standards.
  • Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video compression techniques.
  • Video compression techniques perform spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove redundancy inherent in video sequences.
  • In block-based video coding, a video slice (i.e., a video frame or a portion of a video frame) may be partitioned into image blocks. Image blocks in an intra-coded (I) slice of an image are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same image.
  • An image block in an inter-coded (P or B) slice of an image may use spatial prediction with respect to reference samples in neighboring blocks in the same image or temporal prediction with respect to reference samples in other reference images.
  • An image may be referred to as a frame, and a reference image may be referred to as a reference frame.
  • various video coding standards, including the High Efficiency Video Coding (HEVC) standard, propose predictive coding modes for an image block, that is, predicting the block currently to be coded based on already encoded video data.
  • in the intra prediction mode, the current image block is predicted based on one or more previously decoded neighboring blocks in the same image as the current image block; in the inter prediction mode, the current image block is predicted based on already decoded blocks in different images.
  • inter prediction modes include, for example, the Merge mode, the Skip mode, and the Advanced Motion Vector Prediction (AMVP) mode.
  • An embodiment of the present application provides an image prediction method and apparatus, and a corresponding encoder and decoder, in particular an inter-frame prediction method for video images, which improves the prediction accuracy of the motion information of an image block to a certain extent, thereby improving coding and decoding performance.
  • an embodiment of the present application provides an image prediction method, which includes: acquiring initial predicted motion information of a current image block; determining, according to the initial predicted motion information, a first reference block corresponding to the current image block in a first reference image, and a second reference block corresponding to the current image block in a second reference image, where the first reference block includes a first search base point and the second reference block includes a second search base point; determining N third reference blocks in the first reference image; for any one of the N third reference blocks, determining a corresponding fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point, thereby obtaining N reference block groups, where each reference block group includes one third reference block and one fourth reference block, and N is greater than or equal to 1; increasing the pixel values of the obtained third reference blocks and fourth reference blocks to a first pixel precision, and calculating the image block matching cost of the N reference block groups at the first pixel precision; determining, among the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, the target reference block group including a target third reference block and a target fourth reference block; and obtaining a pixel prediction value of the current image block according to the pixel values of the target third reference block and the target fourth reference block at the first precision, where the pixel prediction value of the current image block has a second pixel precision that is less than the first pixel precision.
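  • To make the flow above concrete, the following is a minimal Python/NumPy sketch of the whole procedure under simplifying assumptions: an integer-pixel square search, equal temporal distances (so the fourth block is the pure mirror of the third), the SAD cost, and reference images padded so every candidate block stays in bounds. All names and parameters are illustrative, not the patent's implementation.

```python
import numpy as np

def refine_and_predict(cur_pos, mv0, mv1, ref0, ref1, bsize=8, radius=1, bit_depth=8):
    lift = 14 - bit_depth                              # lift samples to a 14-bit working precision

    def block(img, y, x):                              # crop one bsize x bsize block, lifted
        return img[y:y + bsize, x:x + bsize].astype(np.int32) << lift

    y0, x0 = cur_pos[0] + mv0[1], cur_pos[1] + mv0[0]  # first search base point (step 302)
    y1, x1 = cur_pos[0] + mv1[1], cur_pos[1] + mv1[0]  # second search base point

    best = None
    for dy in range(-radius, radius + 1):              # N candidate third blocks (step 303)
        for dx in range(-radius, radius + 1):
            third = block(ref0, y0 + dy, x0 + dx)
            fourth = block(ref1, y1 - dy, x1 - dx)     # mirrored offset, t1 == t2 (step 304)
            cost = int(np.abs(third - fourth).sum())   # SAD at the first precision (step 305)
            if best is None or cost < best[0]:         # minimum-cost criterion (step 306)
                best = (cost, third, fourth)

    _, t3, t4 = best
    shift2 = 15 - bit_depth                            # back to the second precision (step 307)
    pred = (t3 + t4 + (1 << (shift2 - 1))) >> shift2
    return np.clip(pred, 0, (1 << bit_depth) - 1).astype(np.uint16)
```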
  • an embodiment of the present application provides an image prediction apparatus, including a plurality of functional units for implementing any one of the methods of the first aspect.
  • the apparatus may include: an acquiring unit, configured to acquire initial predicted motion information of a current image block; a determining unit, configured to determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in the first reference image and a second reference block corresponding to the current image block in the second reference image, where the first reference block includes a first search base point and the second reference block includes a second search base point; a searching unit, configured to determine N third reference blocks in the first reference image; and a mapping unit, configured to determine, for any one of the N third reference blocks, a corresponding fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point, thereby obtaining N reference block groups, where each reference block group includes one third reference block and one fourth reference block, and N is greater than or equal to 1. The apparatus further includes a calculating unit, a selecting unit, and a prediction unit that perform the corresponding steps of the method, as detailed below.
  • the initial predicted motion information includes a reference image index for indicating that the two reference images include one forward reference image and one backward reference image.
  • In a possible design, the N third reference blocks include the first reference block, and the N fourth reference blocks obtained include the second reference block, where the first reference block and the second reference block belong to one reference block group, that is, they correspond to each other in space. It can also be understood that determining, for any one of the N third reference blocks, a corresponding fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point includes: if the first reference block is a third reference block, the second reference block is the corresponding fourth reference block.
  • In a possible design, determining, for any one of the third reference blocks, a corresponding fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point includes: determining an i-th vector according to that third reference block and the first search base point; determining a j-th vector according to the time-domain interval t1 of the current image block relative to the first reference image, the time-domain interval t2 of the current image block relative to the second reference image, and the i-th vector, where the j-th vector is opposite in direction to the i-th vector, and i and j are both positive integers not greater than N; and determining a fourth reference block according to the second search base point and the j-th vector. Accordingly, the method can be performed by the mapping unit.
  • In another possible design, determining, for any one of the N third reference blocks, a corresponding fourth reference block in the second reference image according to the first search base point, the position of that third reference block, and the second search base point includes: determining an i-th vector according to that third reference block and the first search base point; determining a j-th vector according to the i-th vector, where the j-th vector is equal in magnitude and opposite in direction to the i-th vector, and i and j are positive integers not greater than N; and determining a fourth reference block according to the second search base point and the j-th vector. Accordingly, the method can be performed by the mapping unit.
  • In a possible design, increasing the pixel values of the obtained third reference block and fourth reference block to a first pixel precision, and calculating the image block matching cost of the N reference block groups at the first pixel precision, includes: for at least one of the N reference block groups, raising the pixel values of the obtained third reference block and fourth reference block to the first pixel precision by interpolation or shifting, and calculating the image block matching cost at the first pixel precision. Determining, among the N reference block groups, a target reference block group that satisfies the image block matching cost criterion then includes: determining, as the target reference block group, the first reference block group among the at least one reference block group whose image block matching cost is less than a preset threshold. For example, if the image block matching costs of the earlier reference block groups are not less than the preset threshold and the image block matching cost of the third reference block group is less than the preset threshold, the third reference block group is used as the target reference block group, and no further reference block groups are calculated. Accordingly, the method can be performed jointly by the calculating unit and the selecting unit.
  • In a possible design, increasing the pixel values of the obtained third reference block and fourth reference block to a first pixel precision, and calculating the image block matching cost of the N reference block groups at the first pixel precision, includes: raising the pixel values of the obtained third reference blocks and fourth reference blocks to the first pixel precision by interpolation or shifting, and calculating the image block matching cost for each of the N reference block groups. Determining, among the N reference block groups, the target reference block group that satisfies the image block matching cost criterion then includes: determining, as the target reference block group, the reference block group with the smallest image block matching cost among the N reference block groups. For example, six reference block groups are calculated, the fourth reference block group has the smallest image block matching cost, and the fourth reference block group is used as the target reference block group. Accordingly, the method can be performed jointly by the calculating unit and the selecting unit. The two selection criteria are contrasted in the sketch after this item.
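  • A hedged sketch of the two criteria; `costs` is assumed to hold the per-group matching costs in search order, and the fallback of the early-exit variant when no group beats the threshold is an assumption, since the text does not specify it.

```python
def select_early_exit(costs, threshold):
    # First group whose cost drops below the threshold wins; later groups
    # are never evaluated. Fallback to the overall minimum is an assumption.
    for i, cost in enumerate(costs):
        if cost < threshold:
            return i
    return min(range(len(costs)), key=costs.__getitem__)

def select_minimum(costs):
    # Exhaustive variant: every group is evaluated, the smallest cost wins.
    return min(range(len(costs)), key=costs.__getitem__)

costs = [120, 95, 40, 30, 85, 60]
print(select_early_exit(costs, 50))  # 2: the third group is the first below the threshold
print(select_minimum(costs))         # 3: the fourth group has the smallest cost
```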
  • In a possible design, obtaining the pixel prediction value of the current image block according to the pixel value of the target third reference block at the first precision and the pixel value of the target fourth reference block at the first precision includes combining the two pixel values and reducing the result to the second pixel precision, as described in the detailed steps below.
  • In a possible design, the initial predicted motion information includes a first motion vector and a second motion vector; determining, according to the initial predicted motion information, the first reference block corresponding to the current image block in the first reference image and the second reference block corresponding to the current image block in the second reference image includes: obtaining the first reference block according to the position of the current image block and the first motion vector, and obtaining the second reference block according to the position of the current image block and the second motion vector. Accordingly, the method can be performed by the determining unit.
  • In a possible design, a motion search may be performed in a preset step size with reference to the search base point of the first reference block, and the N third reference blocks are found by the search.
  • In a possible design, the method further includes: determining the motion vectors corresponding to the target third reference block and the target fourth reference block as the forward optimal motion vector and the backward optimal motion vector, which provide a motion vector reference for the prediction of subsequent image blocks.
  • the above methods and apparatus can be implemented by a processor calling a program and instructions in a memory.
  • an embodiment of the present application provides a video encoder, where the video encoder is used to encode an image block and includes the image prediction apparatus of any possible design above and a coding reconstruction module, where the image prediction apparatus is configured to obtain a prediction value of the pixel values of a current image block, and the coding reconstruction module is configured to obtain the reconstructed pixel values of the current image block according to the predicted pixel values of the current image block. Accordingly, the video encoder can perform any of the possible design methods described above.
  • an embodiment of the present application provides a video decoder, where the video decoder is used to decode an image block and includes the image prediction apparatus of any possible design above and a decoding reconstruction module, where the image prediction apparatus is configured to obtain a prediction value of the pixel values of a current image block, and the decoding reconstruction module is configured to obtain the reconstructed pixel values of the current image block according to the predicted pixel values of the current image block. Accordingly, the video decoder can perform any of the possible design methods described above.
  • an embodiment of the present application provides an apparatus for encoding video data, where the apparatus includes:
  • a memory for storing video data, the video data comprising one or more image blocks;
  • a video encoder for encoding an image, and the inter prediction method in the encoding process may adopt any of the above possible design methods.
  • an embodiment of the present application provides an apparatus for decoding video data, where the device includes:
  • a memory for storing video data, the video data comprising one or more image blocks;
  • a video decoder for decoding an image, and the inter prediction method in the decoding process may adopt any of the above possible design methods.
  • an embodiment of the present application provides an encoding device, including: a non-volatile memory and a processor coupled to each other, the processor calling program code stored in the memory to perform part or all of the steps of any one of the methods of the first aspect.
  • an embodiment of the present application provides a decoding apparatus, including: a non-volatile memory and a processor coupled to each other, the processor calling program code stored in the memory to perform part or all of the steps of any one of the methods of the first aspect.
  • the embodiment of the present application provides a computer readable storage medium that stores program code, where the program code includes instructions for performing part or all of the steps of any one of the methods of the first aspect.
  • the embodiment of the present application provides a computer program product which, when run on a computer, causes the computer to perform some or all of the steps of any one of the methods of the first aspect.
  • FIG. 1 is a schematic diagram of a video encoding process in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a video decoding process in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an image prediction method according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an inter prediction mode in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another inter prediction mode in the embodiment of the present application.
  • FIG. 6 is a schematic diagram of a search reference block in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an image prediction apparatus according to an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of a video encoder in an embodiment of the present application.
  • FIG. 9 is a schematic block diagram of a video decoder in an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a video transmission system in an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a video codec apparatus in an embodiment of the present application.
  • FIG. 12 is a schematic block diagram of a video codec system in an embodiment of the present application.
  • the image prediction method in the present application can be applied to the field of video codec technology.
  • the video codec is first introduced below.
  • a video generally consists of a number of frame images in a certain order.
  • For example, within one frame of an image there are often many regions with the same or similar structure; that is to say, a video file contains a large amount of spatially redundant information.
  • There is also temporally redundant information in a video file, which is caused by the composition of the video.
  • the frame rate of video sampling is generally 25 to 60 frames per second, that is, the sampling interval between adjacent frames is 1/60 to 1/25 of a second; within such a short period of time, the sampled images contain a large amount of similar information, and there is a huge correlation between adjacent images.
  • Visual redundancy refers to appropriately compressing the video bit stream by exploiting the fact that the human eye is sensitive to changes in luminance and relatively less sensitive to changes in chrominance.
  • The sensitivity of human vision to brightness changes tends to decrease, and vision is more sensitive to the edges of objects; in addition, the human eye is relatively insensitive to internal areas and sensitive to the overall structure. Since the final consumer of a video image is the human viewer, these characteristics of the human eye can be fully exploited to compress the original video image and achieve a better compression effect.
  • In addition, video image information also contains redundancy in the form of information-entropy redundancy, structural redundancy, knowledge redundancy, importance redundancy, and so on.
  • the purpose of video coding (also referred to as video compression coding) is to use various technical methods to remove redundant information in a video sequence to reduce storage space and save transmission bandwidth.
  • Chroma sampling: this method makes full use of the visual and psychological characteristics of the human eye and, starting from the underlying data representation, tries to minimize the amount of data consumed by a single element.
  • The luminance-chrominance-chrominance (YUV) color space includes a luminance signal Y and two color-difference signals U and V, and the three components are independent of each other.
  • the YUV color space is more flexible in representation, and the transmission occupies less bandwidth, which is superior to the traditional red, green and blue (RGB) color model.
  • The YUV 4:2:0 format indicates that the two chrominance components U and V are each only half of the luminance component Y in both the horizontal and vertical directions, that is, among four sampled pixels there are four luminance components Y but only one U and one V chrominance component.
  • Represented this way, the amount of data is further reduced to only about 33% of the original. Chroma sampling thus makes full use of the physiological visual characteristics of the human eye, and video compression by means of such chroma sampling is one of the widely used video data compression methods; a sample count for one 2x2 patch is worked through below.
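  • As a back-of-the-envelope check of the 4:2:0 description (illustrative arithmetic, not text from the patent):

```python
# Samples needed for one 2x2 patch of pixels.
luma_444, chroma_444 = 4, 4 + 4   # 4:4:4 -> every pixel carries Y, U and V
luma_420, chroma_420 = 4, 1 + 1   # 4:2:0 -> four Y samples share one U and one V
print(luma_444 + chroma_444)      # 12 samples per patch
print(luma_420 + chroma_420)      # 6 samples per patch: chroma cut to a quarter
```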
  • Predictive coding uses the data information of the previously encoded frame to predict the frame currently to be encoded.
  • A predicted value is obtained by prediction; it is not completely equal to the actual value, and there is a certain residual between the two.
  • The more accurate the prediction, the closer the predicted value is to the actual value and the smaller the residual, so encoding the residual can greatly reduce the amount of data; at the decoding end, the matching image can be restored and reconstructed by adding the residual to the predicted value. This is the basic idea of predictive coding, illustrated by the toy round trip below.
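  • The residual round trip in a few lines (an illustrative NumPy toy with made-up sample values):

```python
import numpy as np

actual    = np.array([100, 102,  98, 101], dtype=np.int16)  # pixels to be encoded
predicted = np.array([ 99, 101,  99, 100], dtype=np.int16)  # taken from a reference block
residual  = actual - predicted                # small values, cheap to encode
reconstructed = predicted + residual          # decoder side: prediction + residual
assert np.array_equal(reconstructed, actual)  # lossless round trip before quantization
```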
  • predictive coding is divided into two basic types: intra prediction and inter prediction.
  • Intra prediction refers to predicting the pixel values of pixels in the current coding unit using the pixel values of pixels in the reconstructed region of the current image.
  • Inter prediction searches a reconstructed reference image for a reference block that matches the current coding unit, uses the pixel values of the pixels in the reference block as the prediction information or predicted values of the pixel values of the pixels in the current coding unit, and transmits the motion information of the current coding unit.
  • Transform coding: this coding method does not directly encode the original spatial-domain information; instead, it converts the information sample values from the current domain into another artificial domain (commonly called the transform domain) according to some transformation function, and then performs compression coding according to the distribution characteristics of the information in the transform domain. Since video image data tend to have very strong correlation in the spatial domain, a large amount of redundant information exists, and direct encoding would require a large number of bits. After the information sample values are converted into the transform domain, the correlation of the data is greatly reduced, so the amount of data required for encoding drops sharply as the redundant information shrinks, a high compression ratio is obtained, and a better compression effect can be achieved.
  • Typical transform coding methods include the Karhunen-Loeve (K-L) transform, the Fourier transform, and the like.
  • Quantization coding: transform coding itself does not compress the data; it is the quantization process that effectively achieves compression, and quantization is also the main source of data loss in lossy compression.
  • The quantization process "forcibly maps" a large dynamic range of input values onto fewer output values. Since the quantized input values span a large range, many bits are needed to represent them, while the range of output values after this "forced mapping" is small and can be expressed with only a few bits; a toy scalar quantizer follows this item.
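  • A toy scalar quantizer makes the "forced mapping" concrete (the step size and reconstruction rule are arbitrary choices for illustration):

```python
step = 8                          # quantization step size (illustrative)

def quantize(value):              # many input values map to one output level
    return value // step

def dequantize(level):            # reconstruction picks one representative value
    return level * step + step // 2

value = 37
level = quantize(value)
print(level, dequantize(level))   # 4 36 -> the error of 1 is the lossy part
```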
  • the encoder control module selects the coding mode adopted by the image block according to the local characteristics of different image blocks in the video frame.
  • the intra-predictive coded block is subjected to frequency domain or spatial domain prediction
  • the inter-predictive coded block is subjected to motion compensation prediction
  • the predicted residual is further transformed and quantized to form residual coefficients, and finally the entropy encoder generates the final code stream.
  • the intra or inter prediction reference signals are obtained by the decoding module at the encoding end.
  • the transformed and quantized residual coefficients are reconstructed by inverse quantization and inverse transform, and then added to the predicted reference signal to obtain a reconstructed image.
  • the loop filtering performs pixel correction on the reconstructed image to improve the encoding quality of the reconstructed image.
  • Figure 1 is a schematic diagram of a video encoding process.
  • When performing prediction on the current image block in the current frame Fn, either intra prediction or inter prediction may be used; specifically, intra-frame or inter-frame coding can be selected according to the type of the current frame Fn. For example, intra prediction is used when the current frame Fn is an I frame, and inter prediction is used when the current frame Fn is a P frame or a B frame.
  • When intra prediction is adopted, the pixel values of the current image block may be predicted using the pixel values of the reconstructed area in the current frame Fn; when inter prediction is adopted, the pixel values of the current image block are predicted using the pixel values of the reference block in the reference frame F'n-1 that matches the current image block.
  • the pixel values of the pixels of the current image block are compared with the pixel values of the pixels of the prediction block to obtain residual information, and transform, quantization, and entropy coding are performed on the residual information to obtain the encoded code stream.
  • during encoding, the residual information of the current frame Fn is superimposed with the prediction information of the current frame Fn, and a filtering operation is performed to obtain the reconstructed frame F'n of the current frame, which is used as a reference frame for subsequent encoding.
  • FIG. 2 is a schematic diagram of a video decoding process.
  • the video decoding process shown in FIG. 2 is essentially the inverse of the video encoding process shown in FIG. 1.
  • during decoding, residual information is obtained through entropy decoding, inverse quantization, and inverse transform, and whether the current image block uses intra prediction or inter prediction is determined from the decoded code stream. If it is intra prediction, the prediction information is constructed according to the intra prediction method using the pixel values of pixels in the reconstructed region of the current frame; if it is inter prediction, the motion information needs to be parsed, the reference block is determined in a reconstructed image using the parsed motion information, and the pixel values of the pixels in the reference block are used as the prediction information.
  • the prediction information is superimposed with the residual information, and the reconstruction information is obtained through the filtering operation.
  • FIG. 3 is a schematic flowchart of an image prediction method according to an embodiment of the present application.
  • the method shown in FIG. 3 can be performed by a video codec device, a video codec, a video codec system, and other devices having video codec functions.
  • the method shown in FIG. 3 can occur both in the encoding process and in the decoding process. More specifically, the method shown in FIG. 3 can occur in the interframe prediction process at the time of encoding and decoding.
  • the method shown in FIG. 3 includes steps 301 to 308, and steps 301 to 308 are described in detail below.
  • 301. Acquire initial predicted motion information of a current image block. 302. Determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in the first reference image, and determine a second reference block corresponding to the current image block in the second reference image, where the first reference block includes a first search base point and the second reference block includes a second search base point; the pixel value of the first reference block and the pixel value of the second reference block have a first pixel precision.
  • the image block here may be one image block in the image to be processed, or may be one sub-image in the image to be processed.
  • the image block herein may be an image block to be encoded in the encoding process, or may be an image block to be decoded in the decoding process.
  • the foregoing initial predicted motion information includes indication information of a prediction direction (usually forward prediction, backward prediction, or bidirectional prediction), a motion vector pointing to the reference image block (usually the motion vector of a neighboring block), and indication information of the reference image (generally understood as reference image information used to determine the reference image), where the motion vector includes a forward motion vector and/or a backward motion vector, and the reference image information includes reference frame index information of the forward reference image block and/or the backward reference image block.
  • the position of the forward reference block and the position of the backward reference block can be determined by the motion vector information.
  • the first reference image is a forward reference image
  • the second reference image is a backward reference image
  • For example, the initial predicted motion information includes a first motion vector and a second motion vector; the position of the first reference block may be obtained according to the position of the current image block and the first motion vector, that is, the first reference block is determined; and the second reference block is obtained according to the position of the current image block and the second motion vector, that is, the second reference block is determined.
  • Optionally, the location of the first reference block and/or the second reference block may be the position equivalent to that of the current image block, or may be obtained from that equivalent position and the motion vector.
  • the following manner 1 and manner 2 may be used to obtain the initial predicted motion information of the image block.
  • Manner 1: a candidate predicted motion information list is constructed according to the motion information of the neighboring blocks of the current image block, and one candidate is selected from the candidate predicted motion information list as the initial predicted motion information of the current image block.
  • the candidate predicted motion information list includes a motion vector, reference frame index information of a reference image block, and the like.
  • the motion information of the neighboring block A0 is selected as the initial predicted motion information of the current image block.
  • the forward motion vector of A0 is used as the forward predicted motion vector of the current image block, and the backward motion vector of A0 is used as the backward predicted motion vector of the current image block.
  • Manner 2: a motion vector predictor list is constructed according to the motion information of the neighboring blocks of the current image block, and a motion vector is selected from the motion vector predictor list as the motion vector prediction value of the current image block.
  • the motion vector of the current image block may be the motion vector value of the neighboring block directly, or may be the sum of the motion vector of the selected neighboring block and the motion vector difference of the current image block, where the motion vector difference is the difference between the motion vector obtained by performing motion estimation on the current image block and the motion vector of the selected neighboring block.
  • the motion vectors corresponding to the indices 1 and 2 in the motion vector predictor list are selected as the forward motion vector and the backward motion vector of the current image block.
  • the base point can be represented by a coordinate point, which is a kind of position information, which can be used to indicate the position of the image block, and can also be used as a reference in the subsequent image block search. It may be the top left corner of an image block, or the center point of an image block, or a relative position point specified by other rules, which is not limited in this application.
  • the base point of the reference image can be used as a search base point in subsequent search processes. So once the position of a reference block is determined, the search base point is determined.
  • the base points contained in the first reference block and the second reference block may also be referred to as the first search base point and the second search base point, respectively, because of the subsequent search operations related to them; they may be predetermined, or specified during encoding and decoding.
  • For example, if the forward motion vector is (MV0x, MV0y) and the base point of the current image block is (B0x, B0y), then the base point of the forward reference block is (MV0x+B0x, MV0y+B0y).
  • The base point of the backward reference block is obtained in the same way, which is not further described in the present application; the arithmetic is spelled out in the snippet below.
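  • The base-point arithmetic, including the backward case that the text leaves implicit, with made-up coordinate values:

```python
B0  = (64, 32)   # base point of the current image block (x, y)
MV0 = (-3,  5)   # forward motion vector (MV0x, MV0y)
MV1 = ( 2, -4)   # backward motion vector (name assumed by analogy)

fwd_base = (MV0[0] + B0[0], MV0[1] + B0[1])   # (61, 37): first search base point
bwd_base = (MV1[0] + B0[0], MV1[1] + B0[1])   # (66, 28): second search base point
print(fwd_base, bwd_base)
```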
  • the first reference image may refer to the forward reference image
  • the second reference image may refer to the backward reference image
  • the first reference block may refer to a forward reference block and the second reference block may refer to a backward reference block.
  • Step 303 includes a search method, and the specific search method can be as follows:
  • a motion search with an integer pixel step is performed around the first reference block, with the first reference block (or the first search base point) as the reference.
  • the integer pixel step means that the position offset of a candidate search block relative to the first reference block is an integer-pixel distance, where the size of the candidate search block may be the same as that of the first reference block; the search process thus determines the locations of candidate search blocks, and the third reference blocks are then determined according to the search rule. It should be pointed out that regardless of whether the search base point is at an integer-pixel position (the starting point can be an integer pixel or a sub-pixel, such as 1/2, 1/4, 1/8, 1/16, etc.), the integer-pixel-step motion search obtains positions of forward reference blocks of the current image block, that is, third reference blocks are determined correspondingly. After some third reference blocks have been found in integer pixel steps, a sub-pixel search can optionally be performed to obtain further third reference blocks, and if there is still a search requirement, ever finer sub-pixel searches can continue.
  • For the search method, see Figure 6, where (0,0) is the search base point. A cross search can be used, searching (0,-1), (0,1), (-1,0) and (1,0) in sequence; or a square search, searching (-1,-1), (1,-1), (-1,1) and (1,1) in sequence. These points are the top-left vertices of the candidate search blocks; once these base points are determined, the reference blocks corresponding to them, that is, the third reference blocks, are also determined.
  • The search method is not limited, and any prior-art search method may be adopted; for example, in addition to the integer-pixel-step search, a fractional-pixel-step search can be used, or a search with a fractional pixel step size may be performed directly. The specific search method is not limited herein; the patterns of Figure 6 are sketched below.
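  • The Figure 6 patterns as offset lists (a hedged sketch; the helper name and the fractional-step usage are illustrative):

```python
CROSS  = [(0, -1), (0, 1), (-1, 0), (1, 0)]     # cross search of Figure 6
SQUARE = [(-1, -1), (1, -1), (-1, 1), (1, 1)]   # square search of Figure 6

def candidate_base_points(base, pattern, step=1):
    # step=1 gives the integer-pixel search; a fractional step such as 0.5
    # or 0.25 would give the finer sub-pixel passes mentioned above.
    bx, by = base
    return [(bx + dx * step, by + dy * step) for dx, dy in pattern]

print(candidate_base_points((0, 0), CROSS))             # [(0, -1), (0, 1), (-1, 0), (1, 0)]
print(candidate_base_points((0, 0), SQUARE, step=0.5))  # sub-pixel square candidates
```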
  • N reference block groups are obtained, wherein one reference block group includes a third reference block and a fourth reference block.
  • the reference block group can be found using the motion vector difference (MVD) mirroring constraint.
  • the MVD mirroring constraint here is: if the positional offset of a third image block (base point) relative to the first reference block (first search base point) is Offset0 = (deltaX0, deltaY0), then the image block found in the backward reference image whose positional offset relative to the second reference block (second search base point) is Offset1 = (deltaX1, deltaY1) = (-deltaX0, -deltaY0) is determined as the corresponding fourth reference block.
  • If the time-domain intervals between the image containing the current image block and the forward reference image and the backward reference image are different, the motion vector difference (MVD) mirroring constraint can still be used to find the reference block group.
  • in this case, the i-th vector and the j-th vector are opposite in direction, and their magnitudes are related by the ratio of the time-domain intervals.
  • Specifically, if the time-domain intervals between the image containing the current image block and the forward and backward reference images are t1 and t2 respectively, the following constraint can be adopted: if the positional offset of the block position (base point) of a third image block relative to the first reference block (first search base point) is Offset00 = (deltaX00, deltaY00), then the image block in the backward reference image whose positional offset relative to the second reference block (second search base point) is Offset01 = (deltaX01, deltaY01) is determined as a fourth reference block, where deltaX01 = -deltaX00*t2/t1 and deltaY01 = -deltaY00*t2/t1.
  • In the formulas above, the first image block, the first base point, and the first reference image play equivalent roles, as do the current image block, the base point of the current image block, and the image containing the current image block.
  • The essence is to calculate the time-domain interval between the first reference image and the image containing the current image block, and likewise between the second reference image and the image containing the current image block, that is, the time interval between frames; the constraint is spelled out in the snippet below.
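  • The mirroring constraint reduces to two lines of arithmetic; the sketch below (function name assumed) covers both the equal-interval and the scaled case:

```python
def mirrored_offset(offset00, t1, t2):
    # offset00 = (deltaX00, deltaY00): offset of a third reference block from the
    # first search base point. t1, t2: time-domain intervals of the current image
    # to the forward and backward reference images. Returns (deltaX01, deltaY01).
    dx, dy = offset00
    return (-dx * t2 / t1, -dy * t2 / t1)

print(mirrored_offset((4, -2), 1, 1))   # (-4.0, 2.0): pure mirroring when t1 == t2
print(mirrored_offset((4, -2), 1, 2))   # (-8.0, 4.0): mirrored and scaled by t2/t1
```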
  • N reference block groups can be obtained, wherein one reference block group includes one third reference block and one fourth reference block.
  • the positional offset may refer to the offset between one base point and another, or to the offset between one image block and another; it represents a relative position.
  • the N reference block groups may include the foregoing first reference block and the foregoing second reference block; that is, the first reference block may be a third reference block, and correspondingly, the second reference block may be a fourth Reference block.
  • For example, when both the i-th vector and the j-th vector are 0, the first reference block is the third reference block, and the second reference block is its corresponding fourth reference block.
  • It should be noted that the third reference block and/or the fourth reference block mentioned in the present application are not limited to image blocks at a specific location; rather, they may represent a type of reference block, which may be one specific image block or a plurality of image blocks. For example, the third reference block may be any one of the image blocks found by searching around the first base point, and the fourth reference block is the image block corresponding to that block; thus the fourth reference block may likewise be one specific image block or a plurality of image blocks.
  • Step 305 Increase the pixel values of the obtained third reference block and the fourth reference block to a first pixel precision, and calculate an image block matching cost of the N reference block groups under the first pixel precision.
  • a reference block group is taken as a specific example to illustrate how to calculate an image block matching cost of a reference block group.
  • the reference block group includes a third reference block and a fourth reference block determined corresponding thereto.
  • Raising the pixel values of the third reference block and the fourth reference block to the first pixel precision works as follows: the third reference block and the fourth reference block are image blocks that have already been coded, so their pixels have the code-stream precision; for example, if the code-stream precision is 8 bits, the pixel precision of the pixel values of the third reference block and the fourth reference block is 8 bits. In order to find reference blocks whose images are more similar, the precision of the pixels of the third reference block and the fourth reference block needs to be increased; specifically, the precision of every image block for which the image block matching cost is to be calculated needs to be raised to the same precision, for example 14 bits.
  • In this way the 14-bit pixel values of the third reference block, denoted pi[x, y], and of the fourth reference block, denoted pj[x, y], are obtained, where x and y are coordinates. An image block matching cost eij, which may also be referred to as an image block matching error eij, is calculated from pi[x, y] and pj[x, y].
  • There are many ways to calculate the image block matching error, such as the SAD (sum of absolute differences) criterion, the MR-SAD (mean-removed SAD) criterion, and other evaluation criteria in the prior art; the calculation method of the image block matching error is not limited in the present invention. A SAD sketch follows this item.
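  • A hedged sketch of the SAD variant on two blocks already lifted to 14 bits (the patent does not mandate SAD; MR-SAD or another criterion would slot in the same way):

```python
import numpy as np

def sad(pi, pj):
    # Image block matching error e_ij under the SAD criterion, computed at the
    # lifted precision; MR-SAD would first subtract each block's mean.
    return int(np.abs(pi - pj).sum())

rng = np.random.default_rng(0)
lift = 6                                               # 8-bit samples -> 14-bit values
pi = rng.integers(0, 256, (8, 8), dtype=np.int32) << lift
pj = rng.integers(0, 256, (8, 8), dtype=np.int32) << lift
print(sad(pi, pj))
```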
  • the image block matching cost calculation described above can be performed for a plurality of reference block groups.
  • For example, if the first reference block is a third reference block and the second reference block is a fourth reference block, the pixel values of the first reference block and the second reference block are obtained by a motion compensation method.
  • Motion compensation refers to pointing into a reconstructed reference image (with the pixel precision of the code stream) according to the motion vector, and obtaining the pixel values (having the first pixel precision) of the reference block of the current image block.
  • If the position pointed to by the motion vector is a sub-pixel position, the pixel values at the integer-pixel positions of the reference image need to be interpolated with an interpolation filter to obtain the pixel values at the sub-pixel position as the pixel values of the reference block of the current image block; if the position pointed to by the motion vector is an integer-pixel position, a shifting operation can be employed.
  • the coefficient sum of the interpolation filter, that is, the interpolation filter gain, is 2 to the power of N; if N is 6, the interpolation filter gain is 6 bits. In the interpolation operation, since the interpolation filter gain is usually greater than 1, the precision of the pixel values of the obtained forward reference block and backward reference block is higher than the precision of the code stream.
  • For example, if the pixel value precision bitDepth of the predicted image is 8 bits and the interpolation filter gain is 6 bits, a predicted pixel value with a precision of 14 bits is obtained; if the pixel value precision bitDepth of the predicted image is 10 bits and the interpolation filter gain is 6 bits, a predicted pixel value with a precision of 16 bits is obtained, and after a right shift of 2 bits, a predicted pixel value with a precision of 14 bits is obtained.
  • Commonly used interpolation filters have 4 taps, 6 taps, 8 taps, and so on. There are many motion compensation methods in the prior art, which are not described in this application; the precision bookkeeping of the two examples above is summarized in the snippet below.
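  • The precision bookkeeping of the two bitDepth examples, as a small helper (the names and the fixed 14-bit target are assumptions drawn from the examples above):

```python
def interpolation_precision(bit_depth, gain_bits=6, target=14):
    # A filter whose coefficients sum to 2**gain_bits raises the sample
    # precision by gain_bits; a right shift returns it to the working precision.
    raw = bit_depth + gain_bits
    right_shift = max(raw - target, 0)
    return raw, right_shift

print(interpolation_precision(8))    # (14, 0): already at the 14-bit working precision
print(interpolation_precision(10))   # (16, 2): shift right by 2 bits to reach 14 bits
```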
  • the pixels of the image block referred to in the present application may include a luminance component sample, or a luma sample; correspondingly, the pixel point is a luminance component sampling point; and the pixel value is a luminance component sampling value.
  • the image block matching cost criterion comprises: determining a reference block group with the smallest image block matching cost as the target reference block group.
  • the image block matching cost criterion further includes: determining, as the target reference block group, the first occurrence of the reference block group that satisfies the image block matching cost less than the preset threshold.
  • step 304, step 305, and step 306 may be performed after step 303, or may be performed in synchronization with step 303.
  • the step numbers do not constitute any limitation on the order in which the methods are executed.
  • For example, each time a third reference block is determined, a fourth reference block is correspondingly determined, and the image block matching cost of this pair of third and fourth reference blocks is calculated. When the N-th reference block group is calculated, if the image block matching cost result satisfies a preset condition, for example being less than a preset threshold or even equal to 0, the N-th reference block group is used as the target reference block group; it is then unnecessary to determine and calculate further third and fourth reference blocks, which reduces the computational complexity, where N is greater than or equal to 1.
  • Alternatively, N third reference blocks are determined first, N fourth reference blocks are determined in one-to-one correspondence, and N reference block groups are formed; then the image block matching error corresponding to each of the N reference block groups is calculated and compared, and the reference block group whose image block matching cost satisfies a preset condition, for example the group with the smallest image block matching error, is selected as the target reference block group (if several groups tie for the smallest cost, any one of them may be chosen).
  • the third reference block and the fourth reference block in the target reference block group may also be called the optimal forward reference block and the optimal backward reference block of the current image block, respectively.
  • The third reference block is determined based on the first reference block having the first pixel precision, and the fourth reference block is determined based on the second reference block having the first pixel precision; thus the pixel precision of the third reference block and the fourth reference block is also the first pixel precision, that is, higher than the pixel precision of the code stream. For example, the second pixel precision is the same as the pixel precision (bitDepth) of the code stream.
  • In the prediction formula, x and y are the horizontal and vertical coordinates of each pixel in the image block; the operation is performed for each pixel in the image block.
  • For example, the precision of the pixel values of the target third reference block and the target fourth reference block is 14 bits, and shift2 is 15 - bitDepth, so the precision of the pixel prediction value of the current image block is 14 + 1 - shift2 = bitDepth.
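  • A weighted-average form consistent with this precision bookkeeping (14 + 1 - shift2 = bitDepth); the rounding offset and the clipping are conventional assumptions, since the exact formula is not reproduced here:

```python
import numpy as np

def bi_average(p0, p1, bit_depth=8):
    # pred = clip((p0 + p1 + offset) >> shift2), with shift2 = 15 - bitDepth,
    # takes two 14-bit blocks to one bitDepth-precision prediction block.
    shift2 = 15 - bit_depth
    offset = 1 << (shift2 - 1)
    pred = (p0.astype(np.int32) + p1.astype(np.int32) + offset) >> shift2
    return np.clip(pred, 0, (1 << bit_depth) - 1)

p0 = np.full((4, 4), 100 << 6, dtype=np.int32)   # 14-bit target third reference block
p1 = np.full((4, 4), 102 << 6, dtype=np.int32)   # 14-bit target fourth reference block
print(bi_average(p0, p1)[0, 0])                  # 101, back at 8-bit precision
```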
  • Since the first reference block and the second reference block obtained from the initial motion information are not necessarily able to predict the current image block accurately, a completely new method is adopted in the present application to find a more suitable target third reference block and target fourth reference block, and the current image block is predicted from the pixel values of the target third reference block and the target fourth reference block.
  • the image prediction method in the embodiments of the present application may occur in the inter prediction process shown in FIG. 1 and FIG. 2, and may specifically be performed by the inter prediction module in an encoder or a decoder. Additionally, the image prediction method of the embodiments of the present application can be implemented in any electronic device or apparatus that may require encoding and/or decoding of a video image.
  • an embodiment of the present invention provides an image prediction apparatus.
  • the image prediction apparatus of the embodiment of the present application is described below with reference to FIG. 7. The image prediction apparatus shown in FIG. 7 corresponds to the method shown in FIG. 3 and can perform each step of the method shown in FIG. 3.
  • the repeated description is appropriately omitted below.
  • FIG. 7 shows an image prediction apparatus 700; the apparatus 700 includes:
  • the obtaining unit 701 is configured to acquire initial predicted motion information of the current image block. This unit can be implemented by the processor invoking code in memory.
  • a determining unit 702 configured to determine, according to the initial predicted motion information, a first reference block corresponding to the current image block in the first reference image, and determine, in the second reference image, the current image block a second reference block; wherein the first reference block includes a first search base point and the second reference block includes a second search base point.
  • This unit can be implemented by the processor invoking code in memory.
  • the searching unit 703 is configured to determine N third reference blocks in the first reference image. This unit can be implemented by the processor invoking code in memory.
  • the mapping unit 704 is configured to: according to the first search base point, the location of the any one of the third reference blocks, and the second search base point, for any one of the N third reference blocks, Correspondingly determining a fourth reference block in the second reference image; obtaining N reference block groups, wherein one reference block group includes a third reference block and a fourth reference block; N is greater than or equal to 1.
  • This unit can be implemented by the processor invoking code in memory.
  • the calculating unit 705 is configured to increase the obtained pixel values of the third reference block and the fourth reference block to a first pixel precision, and calculate an image block matching cost of the N reference block groups at the first pixel precision.
  • This unit can be implemented by the processor invoking code in memory.
  • the selecting unit 706 is configured to determine, in the N reference block groups, a target reference block group that satisfies an image block matching cost criterion, where the target reference block group includes a target third reference block and a target fourth reference block.
  • This unit can be implemented by the processor invoking code in memory.
  • a prediction unit 707 configured to obtain a pixel prediction value of the current image block according to a pixel value of the target third reference block at a first precision and a pixel value of the target fourth reference block at a first precision, where The pixel prediction value of the current image block has a second pixel precision; the second pixel precision is less than the first pixel precision.
  • This unit can be implemented by the processor invoking code in memory.
  • The obtaining unit 701 is specifically configured to perform the method mentioned in the foregoing step 301 and methods that can equivalently replace it; the determining unit 702, the method mentioned in step 302 and its equivalents; the searching unit 703, the method mentioned in step 303 and its equivalents; the mapping unit 704, the method mentioned in step 304 and its equivalents; the calculating unit 705, the method mentioned in step 305 and its equivalents; the selecting unit 706, the method mentioned in step 306 and its equivalents; and the prediction unit 707, the method mentioned in step 307 and its equivalents.
  • the corresponding method embodiments, together with their explanations, expressions, refinements, and alternative embodiments, are also applicable to the method performed in the apparatus.
  • the device 700 may specifically be a video encoding device, a video decoding device, a video codec system, or other device having a video codec function.
  • the apparatus 700 can be used for both image prediction in the encoding process and image prediction in the decoding process, especially inter-frame prediction in video images.
  • Apparatus 700 includes a number of functional units for implementing any of the foregoing methods; an end-to-end sketch that combines the unit sketches above is given below.
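  • Purely as an editorial illustration of how the sketched helpers compose (it assumes the hypothetical functions candidate_positions, mirror_fourth_position, upshift, matching_cost, and predict_current_block defined above are in scope, that all candidate positions lie inside the reference images, and that integer-position cropping stands in for the fractional-sample interpolation a real codec would perform):

    def refine_and_predict(first_base, second_base, ref_img1, ref_img2,
                           block_w, block_h, extra_bits=4):
        """Minimal end-to-end sketch of steps 303-307: search, mirror,
        cost, select, predict (reference images are numpy arrays)."""
        def crop(img, pos):
            x, y = pos
            return img[y:y + block_h, x:x + block_w]

        groups = []
        for third_pos in candidate_positions(first_base):
            fourth_pos = mirror_fourth_position(first_base, third_pos,
                                                second_base)
            third = crop(ref_img1, third_pos)
            fourth = crop(ref_img2, fourth_pos)
            groups.append((matching_cost(third, fourth, extra_bits),
                           third, fourth))

        # Selecting unit: here the cost criterion is simply the minimum SAD.
        _, t3, t4 = min(groups, key=lambda g: g[0])

        # Prediction unit: average at the first precision, output at the
        # second (lower) precision.
        return predict_current_block(upshift(t3, extra_bits),
                                     upshift(t4, extra_bits),
                                     extra_bits=extra_bits)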
  • the present application further provides a terminal device, where the terminal device includes: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program is executed, the processor performs the image prediction method of the embodiments of the present application, including steps 301-307.
  • the terminal devices here may be video display devices, smart phones, portable computers, and other devices that can process video or play video.
  • the present application also provides a video encoder, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and the executable program, when executed, implements the image prediction method of the embodiments of the present application, including steps 301-307.
  • the present application also provides a video decoder, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and the executable program, when executed, implements the image prediction method of the embodiments of the present application, including steps 301-307.
  • the present application also provides a video encoding system, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and the executable program, when executed, implements the image prediction method of the embodiments of the present application, including steps 301-307.
  • the present application also provides a computer readable medium storing program code for execution by a device, the program code comprising instructions for performing the image prediction method of the embodiments of the present application, including steps 301-307.
  • the present application also provides a decoder, which includes the image prediction apparatus of the embodiments of the present application, such as the apparatus 700, and a decoding reconstruction module, where the decoding reconstruction module is configured to obtain a reconstructed pixel value of an image block according to the pixel prediction value of the image block obtained by the image prediction apparatus.
  • the present application also provides an encoder, which includes the image prediction apparatus of the embodiments of the present application, such as the apparatus 700, and an encoding reconstruction module, where the encoding reconstruction module is configured to obtain a reconstructed pixel value of an image block according to the pixel prediction value of the image block obtained by the image prediction apparatus.
  • FIG. 8 is a schematic block diagram of a video encoder according to an embodiment of the present application.
  • the video encoder 1000 shown in FIG. 8 includes an encoding end prediction module 1001, a transform quantization module 1002, an entropy encoding module 1003, an encoding reconstruction module 1004, and an encoding end filtering module.
  • the video encoder 1000 shown in FIG. 8 can encode a video; specifically, the video encoder 1000 can perform the video encoding process shown in FIG. 1 to encode a video. In addition, the video encoder 1000 can also perform the image prediction method of the embodiments of the present application, and can perform the various steps of the image prediction method shown in FIG. 3, including the refinements and alternative implementations of each step.
  • the image prediction apparatus in the embodiment of the present application may also be the encoding end prediction module 1001 in the video encoder 1000.
  • FIG. 9 is a schematic block diagram of a video decoder of an embodiment of the present application.
  • the video decoder 2000 shown in FIG. 9 includes an entropy decoding module 2001, an inverse transform inverse quantization module 2002, a decoding end prediction module 2003, a decoding reconstruction module 2004, and a decoding end filtering module 2005.
  • the video decoder 2000 shown in FIG. 9 can decode a video; specifically, the video decoder 2000 can perform the video decoding process shown in FIG. 2 to decode a video. In addition, the video decoder 2000 can also perform the image prediction method of the embodiments of the present application, and can perform the various steps of the image prediction method shown in FIG. 3, including the refinements and alternative implementations of each step.
  • the image prediction apparatus 700 in the embodiment of the present application may also be the decoding side prediction module 2003 in the video decoder 2000.
  • the application scenarios of the image prediction method in the embodiments of the present application are described below with reference to FIG. 10 to FIG. 12.
  • the image prediction method in the embodiments of the present application may be performed by the video transmission system, the codec device, and the codec system shown in FIG. 10 to FIG. 12.
  • FIG. 10 is a schematic block diagram of a video transmission system according to an embodiment of the present application.
  • the video transmission system includes an acquisition module 3001, an encoding module 3002, a sending module 3003, a network 3004, a receiving module 3005, a decoding module 3006, and a rendering module 3007.
  • the functions of the modules in the video transmission system are as follows:
  • the acquisition module 3001 includes a camera or a camera group, and is configured to collect video images and perform pre-encoding processing on the collected video images to convert the optical signal into a digitized video sequence;
  • the encoding module 3002 is configured to encode the video sequence to obtain a code stream;
  • the sending module 3003 is configured to send the coded code stream.
  • the receiving module 3005 is configured to receive the code stream sent by the sending module 3003.
  • the network 3004 is configured to transmit the code stream sent by the sending module 3003 to the receiving module 3005;
  • the decoding module 3006 is configured to decode the code stream received by the receiving module 3005 to reconstruct a video sequence.
  • the rendering module 3007 is configured to render the reconstructed video sequence decoded by the decoding module 3006 to improve the display effect of the video.
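  • As a purely illustrative sketch (all function parameters are hypothetical stand-ins for modules 3002-3007, not part of the original disclosure), the module chain above can be wired as a simple pipeline:

    def transmit_video(digitized_sequence, encode, send, network, receive,
                       decode, render):
        """Acquisition-to-rendering chain of FIG. 10 expressed as calls;
        digitized_sequence plays the role of the output of module 3001."""
        code_stream = encode(digitized_sequence)         # encoding module 3002
        received = receive(network(send(code_stream)))   # 3003 -> 3004 -> 3005
        return render(decode(received))                  # 3006 -> 3007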
  • the video transmission system shown in FIG. 10 can perform the image prediction method in the embodiment of the present application.
  • the encoding module 3002 and the decoding module 3006 in the video transmission system shown in FIG. 10 can perform the image prediction method of the embodiments of the present application, including steps 301-307 as well as the refinements and alternative implementations of each step.
  • the acquisition module 3001, the encoding module 3002, and the sending module 3003 in the video transmission system shown in FIG. 10 correspond to the video encoder 1000 shown in FIG. 8.
  • the receiving module 3005, the decoding module 3006, and the rendering module 3007 in the video transmission system shown in FIG. 10 correspond to the video decoder 2000 shown in FIG. 9.
  • the codec device and the codec system composed of the codec device will be described in detail below with reference to FIG. 11 and FIG. 12. It should be understood that the codec device and the codec system shown in FIG. 11 and FIG. 12 are capable of performing the image prediction method of the embodiments of the present application.
  • FIG. 11 is a schematic diagram of a video codec apparatus according to an embodiment of the present application.
  • the video codec device 50 may be a device dedicated to encoding and/or decoding video images, or may be an electronic device having a video codec function; further, the codec device 50 may be a terminal or user equipment of a mobile communication system.
  • Codec device 50 may include the following modules or units: controller 56, codec 54, radio interface 52, antenna 44, smart card 46, card reader 48, keypad 34, memory 58, infrared port 42, display 32.
  • the codec device 50 may also include a microphone or any suitable audio input module, which may accept a digital or analog signal input, and the codec device 50 may also include an audio output module, which may be a headset, a speaker, or an analog or digital audio output connection.
  • the codec device 50 may also include a battery, which may be a solar cell, a fuel cell, or the like.
  • the codec device 50 may also include an infrared port for short-range line-of-sight communication with other devices, and the codec device 50 may also communicate with other devices using any suitable short-range communication method, for example, a Bluetooth wireless connection, USB / Firewire wired connection.
  • the memory 58 can store data in the form of images and data in the form of audio, as well as instructions to be executed on the controller 56.
  • the codec 54 may implement encoding and decoding of audio and/or video data, or implement, under the control of the controller 56, assisted encoding and decoding of audio and/or video data.
  • the smart card 46 and the card reader 48 can provide user information, as well as authentication information for network authentication and authorization of the user.
  • the specific implementation form of the smart card 46 and the card reader 48 may be a Universal Integrated Circuit Card (UICC) and a UICC reader.
  • the radio interface circuit 52 can generate a wireless communication signal, which may be a communication signal generated for communication with a cellular communication network, a wireless communication system, or a wireless local area network.
  • the antenna 44 is used to transmit radio frequency signals generated by the radio interface circuit 52 to other devices (the number of which may be one or more), and may also be used to receive radio frequency signals from other devices (the number of which may be one or more).
  • codec device 50 may receive video image data to be processed from another device prior to transmission and/or storage. In still other embodiments of the present application, the codec device 50 may receive images over a wireless or wired connection and encode/decode the received images.
  • FIG. 12 is a schematic block diagram of a video codec system 7000 according to an embodiment of the present application.
  • the video codec system 7000 includes a source device 4000 and a destination device 5000.
  • the source device 4000 generates encoded video data;
  • the source device 4000 may also be referred to as a video encoding device or a video encoding apparatus;
  • the destination device 5000 may decode the encoded video data generated by the source device 4000;
  • the destination device 5000 may also be referred to as a video decoding device or a video decoding apparatus.
  • the specific implementation form of the source device 4000 and the destination device 5000 may be any one of the following: a desktop computer, a mobile computing device, a notebook (e.g., laptop) computer, a tablet computer, a set top box, a smart phone, a telephone handset, a television, a camera, a display device, a digital media player, a video game console, an on-board computer, or other similar devices.
  • Destination device 5000 can receive video data encoded by source device 4000 via channel 6000.
  • Channel 6000 can include one or more media and/or devices capable of moving encoded video data from source device 4000 to destination device 5000.
  • channel 6000 can include one or more communication media that enable the source device 4000 to transmit the encoded video data directly to the destination device 5000 in real time; in this case, the source device 4000 can modulate the encoded video data according to a communication standard (for example, a wireless communication protocol), and the modulated video data can be transmitted to the destination device 5000.
  • the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the one or more communication media described above may form part of a packet-based network (e.g., a local area network, a wide area network, or a global network such as the Internet).
  • the one or more communication media described above may include a router, a switch, a base station, or other device that enables communication from the source device 4000 to the destination device 5000.
  • channel 6000 can include a storage medium that stores encoded video data generated by source device 4000.
  • destination device 5000 can access the storage medium via disk access or card access.
  • the storage medium may include a variety of locally accessible data storage media, such as a Blu-ray disc, a high-density digital video disc (DVD), a compact disc read-only memory (CD-ROM), a flash memory, or another suitable digital storage medium for storing encoded video data.
  • channel 6000 can include a file server or another intermediate storage device that stores encoded video data generated by source device 4000.
  • destination device 5000 can access the encoded video data stored at a file server or other intermediate storage device via streaming or download.
  • the file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to the destination device 5000.
  • the file server may include a World Wide Web (Web) server (for example, for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, or a local disk drive.
  • Destination device 5000 can access the encoded video data via a standard data connection (e.g., an internet connection).
  • example types of the data connection include a wireless channel, a wired connection (e.g., a cable modem), or a combination of both, suitable for accessing the encoded video data stored on the file server.
  • the transmission of the encoded video data from the file server may be streaming, downloading, or a combination of both.
  • the image prediction method of the present application is not limited to a wireless application scenario.
  • the image prediction method of the present application can be applied to video coding and decoding supporting a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
  • video codec system 7000 can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • the source device 4000 includes a video source 4001, a video encoder 4002, and an output interface 4003.
  • output interface 4003 can include a modulator/demodulator (modem) and/or a transmitter.
  • the video source 4001 can include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of the foregoing video data sources.
  • Video encoder 4002 can encode video data from video source 4001.
  • source device 4000 transmits the encoded video data directly to destination device 5000 via output interface 4003.
  • the encoded video data may also be stored on a storage medium or file server for later access by the destination device 5000 for decoding and/or playback.
  • the destination device 5000 includes an input interface 5003, a video decoder 5002, and a display device 5001.
  • input interface 5003 includes a receiver and/or a modem.
  • the input interface 5003 can receive the encoded video data via the channel 6000.
  • Display device 5001 may be integrated with destination device 5000 or may be external to destination device 5000. Generally, the display device 5001 displays the decoded video data.
  • Display device 5001 can include a variety of display devices, such as liquid crystal displays, plasma displays, organic light emitting diode displays, or other types of display devices.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units described above is merely a logical function division; in actual implementation, there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • if implemented in the form of a software functional unit and sold or used as a standalone product, the functions may be stored in a computer readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to an image prediction method. The method comprises: obtaining initial predicted motion information of a current image block; determining a first reference block from a forward reference image and a second reference block from a backward reference image; performing, by means of a brand-new class of mirroring, a search around the first reference block and the second reference block, so as to determine whether there is a pair of target reference blocks with a lower image block matching cost, the pair of target reference blocks being spatially correlated; and obtaining, according to the pixel values of the target reference blocks at a first precision, a pixel prediction value of the current image block, the pixel prediction value of the current image block having code-stream precision. By means of the present invention, the image block matching cost calculation is performed at a high precision and an optimal pair of reference blocks is found, reducing the complexity of inter-frame prediction of a video image in the prior art and improving precision.
PCT/CN2018/120681 2017-12-31 2018-12-12 Image prediction method, apparatus and codec WO2019128716A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711494258.1A CN109996080B (zh) 2017-12-31 2017-12-31 Image prediction method, apparatus and codec
CN201711494258.1 2017-12-31

Publications (1)

Publication Number Publication Date
WO2019128716A1 true WO2019128716A1 (fr) 2019-07-04

Family

ID=67066492

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/120681 WO2019128716A1 (fr) 2017-12-31 2018-12-12 Image prediction method, apparatus and codec

Country Status (2)

Country Link
CN (1) CN109996080B (fr)
WO (1) WO2019128716A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • KR20220061908A (ko) * 2019-09-24 2022-05-13 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method for determining prediction value, encoder, decoder and computer storage medium
  • CN113709501B (zh) * 2019-12-23 2022-12-23 Hangzhou Hikvision Digital Technology Co., Ltd. Encoding and decoding method, apparatus, and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN100551073C (zh) * 2006-12-05 2009-10-14 Huawei Technologies Co., Ltd. Encoding and decoding method and apparatus, and sub-pixel interpolation processing method and apparatus
  • EP4099700A1 (fr) * 2011-01-07 2022-12-07 Nokia Technologies Oy Motion prediction in video coding

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN1525762A (zh) * 2003-09-12 2004-09-01 Institute of Computing Technology, Chinese Academy of Sciences Bidirectional prediction method for encoding end/decoding end used in video coding
  • GB2521349A (en) * 2013-12-05 2015-06-24 Sony Corp Data encoding and decoding
  • WO2017057947A1 (fr) * 2015-10-01 2017-04-06 LG Electronics Inc. Method for processing image on the basis of inter prediction mode and apparatus therefor

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12003764B2 (en) 2019-09-27 2024-06-04 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Prediction method for current block and electronic device
  • CN110992399A (zh) * 2019-11-11 2020-04-10 High-precision target atmospheric disturbance detection method
  • CN114040209A (zh) * 2021-10-21 2022-02-11 Motion estimation method and apparatus, electronic device, and storage medium
  • CN116847088A (zh) * 2023-08-24 2023-10-03 Image processing method, processing device, and storage medium
  • CN116847088B (zh) * 2023-08-24 2024-04-05 Image processing method, processing device, and storage medium

Also Published As

Publication number Publication date
CN109996080A (zh) 2019-07-09
CN109996080B (zh) 2023-01-06

Similar Documents

Publication Publication Date Title
  • EP3672249B1 (fr) Inter-frame prediction method and device for video images
  • WO2019128716A1 (fr) Image prediction method, apparatus and codec
AU2023200956B2 (en) Video data inter prediction method and apparatus
  • WO2017129023A1 (fr) Decoding method, encoding method, decoding apparatus, and encoding apparatus
  • CN115941942A (zh) Video encoder, video decoder, and corresponding encoding and decoding methods
  • CN110121073B (zh) Bidirectional inter-frame prediction method and apparatus
  • WO2019109955A1 (fr) Inter-frame prediction method and apparatus, and terminal device
US20220094947A1 (en) Method for constructing mpm list, method for obtaining intra prediction mode of chroma block, and apparatus
US20240040113A1 (en) Video picture decoding and encoding method and apparatus
US11412210B2 (en) Inter prediction method and apparatus for video coding
US12010293B2 (en) Picture prediction method and apparatus, and computer-readable storage medium
  • CA3137980A1 (fr) Image prediction method and apparatus, and computer-readable storage medium
US11109060B2 (en) Image prediction method and apparatus
US20220109830A1 (en) Method for constructing merge candidate motion information list, apparatus, and codec
  • CN111327907B (zh) Inter-frame prediction method, apparatus and device, and storage medium
  • WO2019233423A1 (fr) Motion vector acquisition method and device
  • WO2019091372A1 (fr) Image prediction method and device
US11902506B2 (en) Video encoder, video decoder, and corresponding methods
  • WO2023051156A1 (fr) Video image processing method and apparatus
  • WO2020135615A1 (fr) Video image decoding method and apparatus
  • RU2787885C2 (ru) Inter prediction method and equipment, bitstream, and non-volatile storage medium
  • RU2822447C2 (ru) Inter prediction method and equipment
  • RU2798316C2 (ru) Inter prediction method and apparatus
  • CN110677645B (zh) Image prediction method and apparatus
  • CN110971899A (zh) Method for determining motion information, and inter-frame prediction method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18894892

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18894892

Country of ref document: EP

Kind code of ref document: A1