WO2017201678A1 - Image prediction method and related device - Google Patents

Image prediction method and related device

Info

Publication number
WO2017201678A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion information
pixel
image block
current image
motion
Prior art date
Application number
PCT/CN2016/083203
Other languages
English (en)
French (fr)
Inventor
陈焕浜
杨海涛
李厚强
Original Assignee
华为技术有限公司
中国科学技术大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司, 中国科学技术大学
Priority to CN201680085451.7A (CN109076234A)
Priority to EP16902668.9A (EP3457694A4)
Priority to PCT/CN2016/083203 (WO2017201678A1)
Publication of WO2017201678A1
Priority to US16/197,585 (US20190098312A1)
Priority to HK18114854.6A (HK1255704A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/537 Motion estimation other than block-based
    • H04N19/54 Motion estimation other than block-based using feature points or meshes

Definitions

  • The present invention relates to the field of video encoding and decoding, and in particular to an image prediction method and a related device.
  • The basic principle of video coding compression is to exploit spatial, temporal, and codeword correlation to remove as much redundancy as possible.
  • The currently prevailing approach is a block-based hybrid video coding framework, which achieves video compression through prediction (including intra prediction and inter prediction), transform, quantization, and entropy coding.
  • This coding framework has proven durable, and HEVC still uses this block-based hybrid video coding framework.
  • Motion estimation/motion compensation is a key technique that affects encoding/decoding performance.
  • Existing motion estimation/motion compensation algorithms are basically block motion compensation algorithms based on a translational motion model.
  • Irregular motion such as scaling, rotation, and parabolic motion is ubiquitous.
  • Video coding experts recognized how common irregular motion is and hoped to improve video coding efficiency by introducing irregular motion models (such as affine motion models), but the computational complexity of existing image prediction based on affine motion models is usually very high.
  • Embodiments of the present invention provide an image prediction method and a related device, so as to reduce the computational complexity of image prediction based on an affine motion model and improve coding efficiency.
  • An image prediction method, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the method includes: parsing first code stream information, where the first code stream information is used to indicate the motion information unit corresponding to each first pixel sample and each second pixel sample; obtaining, according to the parsed first code stream information, the motion information of each first pixel sample and the predicted motion information of each second pixel sample, where the predicted motion information is prediction information of motion information; parsing second code stream information, where the second code stream information is used to represent the difference motion information of each second pixel sample, the difference motion information being the difference between the motion information and the predicted motion information; obtaining the motion information of each second pixel sample according to the parsed second code stream information and the corresponding predicted motion information of each second pixel sample; and obtaining a predicted value of the current image block according to the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
  • In this embodiment of the present invention, the motion information of a first pixel sample is obtained simply by taking the corresponding predicted motion information as its motion information; the code stream does not need to be parsed further to obtain a residual of the predicted motion information. This saves the bits that would otherwise be needed to transmit that residual, reducing bit consumption and improving efficiency.
  • the first code stream information includes an index, where the index is used to indicate a motion information unit corresponding to each of the first pixel samples and each of the second pixel samples.
  • the second code stream information includes a difference value, where the difference is a motion vector residual between a motion vector of any of the second pixel samples and a predicted motion vector.
  • The obtaining, according to the parsed first code stream information, of the motion information of each first pixel sample and the predicted motion information of each second pixel sample includes: determining the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, where any one of the candidate motion information unit sets includes at least one motion information unit; determining a merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is at least part of the motion information units in the candidate motion information unit sets corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; determining, according to the parsed first code stream information, the motion information unit corresponding to each first pixel sample and each second pixel sample from the merged motion information unit set; using the motion information of the motion information unit corresponding to a first pixel sample as the motion information of that first pixel sample; and using the motion information of the motion information unit corresponding to a second pixel sample as the predicted motion information of that second pixel sample.
  • The determining of the merged motion information unit set of the current block includes: determining, from N candidate merged motion information unit sets, the merged motion information unit set containing the motion information units corresponding to each first pixel sample and each second pixel sample, where each motion information unit included in each of the N candidate merged motion information unit sets is selected from at least part of the constraint-compliant motion information units in the candidate motion information unit sets corresponding to each first pixel sample and each second pixel sample, N is a positive integer, and the N candidate merged motion information unit sets are different from each other.
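As a rough illustration of how an index carried in the first code stream information could select one motion information unit per pixel sample, the sketch below enumerates candidate merged sets as the Cartesian product of the per-sample candidate sets and picks one by index. The names `MotionUnit`, `build_candidate_merged_sets`, and `select_by_index`, the field layout, and the enumeration order are all assumptions made for illustration; the text does not specify them.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class MotionUnit:
    """Hypothetical motion information unit: one motion vector plus metadata."""
    mv: Tuple[int, int]      # (horizontal, vertical) motion vector components
    ref_idx: int = 0         # reference frame index
    direction: str = "fwd"   # prediction direction: "fwd" or "bwd"


def build_candidate_merged_sets(
    per_sample_candidates: Sequence[Sequence[MotionUnit]],
) -> List[Tuple[MotionUnit, ...]]:
    """Take one unit from each pixel sample's candidate set, in every combination."""
    return list(product(*per_sample_candidates))


def select_by_index(merged_sets: List[Tuple[MotionUnit, ...]],
                    index: int) -> Tuple[MotionUnit, ...]:
    """Return the merged set that a parsed index points at."""
    return merged_sets[index]


# Candidate sets for one first pixel sample and one second pixel sample.
first_cands = [MotionUnit((4, 0)), MotionUnit((6, 2))]
second_cands = [MotionUnit((5, 1)), MotionUnit((8, 0))]
merged_sets = build_candidate_merged_sets([first_cands, second_cands])
chosen = select_by_index(merged_sets, 2)
```

In this sketch, `chosen[0]` would supply the motion information of the first pixel sample directly, while `chosen[1]` would supply only the predicted motion information of the second pixel sample.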
  • The N candidate merged motion information unit sets satisfy at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition, where:
  • the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion;
  • the second condition includes that the prediction directions corresponding to two motion information units in any one of the N candidate merged motion information unit sets are the same;
  • the third condition includes that the reference frame indexes corresponding to two motion information units in any one of the N candidate merged motion information unit sets are the same;
  • the fourth condition includes that the absolute value of the difference between the motion vector horizontal components of two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the motion vector horizontal component of one motion information unit in any one of the N candidate merged motion information unit sets and that of a pixel sample Z is less than or equal to the horizontal component threshold, where the pixel sample Z of the current image block is different from any one of the first pixel samples and the second pixel samples;
  • the fifth condition includes that the absolute value of the difference between the motion vector vertical components of two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the motion vector vertical component of one motion information unit in any one of the N candidate merged motion information unit sets and that of the pixel sample Z is less than or equal to the vertical component threshold, where the pixel sample Z of the current image block is different from any one of the first pixel samples and the second pixel samples.
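The five conditions can be read as simple pairwise checks on two motion information units. The following is a minimal sketch under stated assumptions: `MotionUnit` and `check_conditions` are hypothetical names, the first condition is interpreted as "the two motion vectors differ" (identical vectors would describe a pure translation), and the variant of the fourth and fifth conditions that compares against a pixel sample Z is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class MotionUnit:
    """Hypothetical motion information unit."""
    mv: Tuple[int, int]   # (horizontal, vertical) motion vector components
    ref_idx: int          # reference frame index
    direction: str        # prediction direction, "fwd" or "bwd"


def check_conditions(a: MotionUnit, b: MotionUnit,
                     h_thresh: int, v_thresh: int) -> Dict[str, bool]:
    """Evaluate each of the five conditions for a pair of motion information units."""
    return {
        "first": a.mv != b.mv,                          # non-translational (assumed reading)
        "second": a.direction == b.direction,           # same prediction direction
        "third": a.ref_idx == b.ref_idx,                # same reference frame index
        "fourth": abs(a.mv[0] - b.mv[0]) <= h_thresh,   # horizontal components close
        "fifth": abs(a.mv[1] - b.mv[1]) <= v_thresh,    # vertical components close
    }


a = MotionUnit((4, 0), 0, "fwd")
b = MotionUnit((6, 1), 0, "fwd")
result = check_conditions(a, b, h_thresh=2, v_thresh=2)
```

Returning one flag per condition leaves the caller free to require any subset, which matches the "at least one of" wording above.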
  • The obtaining of the motion information of each second pixel sample according to the parsed second code stream information and the corresponding predicted motion information of each second pixel sample includes: obtaining the difference motion information of each second pixel sample according to the parsed second code stream information; and adding the difference motion information of each second pixel sample to the corresponding predicted motion information to obtain the motion information of each second pixel sample.
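The decoder-side step described here amounts to a component-wise addition for second pixel samples and a plain copy for first pixel samples. A minimal sketch, assuming motion information is reduced to a single integer motion vector and using the hypothetical names `reconstruct_mv` and `decode_sample_mvs`:

```python
from typing import Tuple

MV = Tuple[int, int]  # (horizontal, vertical) motion vector components


def reconstruct_mv(predicted: MV, difference: MV) -> MV:
    """Motion vector = predicted motion vector + parsed difference."""
    return (predicted[0] + difference[0], predicted[1] + difference[1])


def decode_sample_mvs(first_pred: MV, second_pred: MV,
                      second_diff: MV) -> Tuple[MV, MV]:
    """First sample reuses its prediction; second sample adds the parsed difference."""
    first_mv = first_pred  # no residual is parsed for a first pixel sample
    second_mv = reconstruct_mv(second_pred, second_diff)
    return first_mv, second_mv


first_mv, second_mv = decode_sample_mvs((4, 0), (5, 1), (1, -2))
```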
  • The motion model is a non-translational motion model; specifically, the non-translational motion model is an affine motion model in the following form:
  • where the motion vectors of the first pixel sample and the second pixel sample are (vx_0, vy_0) and (vx_1, vy_1), respectively, vx is the motion vector horizontal component of the pixel sample at coordinates (x, y) in the current image block, vy is the motion vector vertical component of the pixel sample at coordinates (x, y) in the current image block, and w is the length or width of the current image block; correspondingly, the obtaining of the predicted value of the current image block according to the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample includes: calculating a motion vector of each pixel in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determining a predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel; or,
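The per-pixel motion vector calculation can be sketched as follows. The formula itself is not reproduced in the text above, so this sketch assumes the commonly used four-parameter affine form vx = (vx_1 - vx_0)/w * x - (vy_1 - vy_0)/w * y + vx_0 and vy = (vy_1 - vy_0)/w * x + (vx_1 - vx_0)/w * y + vy_0, with the first pixel sample at (0, 0) and the second at (w, 0). The function names `affine_mv` and `motion_field` are hypothetical.

```python
from typing import List, Tuple

MV = Tuple[float, float]  # (horizontal, vertical) motion vector components


def affine_mv(x: float, y: float, mv0: MV, mv1: MV, w: float) -> MV:
    """Motion vector at (x, y) under the assumed four-parameter affine model."""
    a = (mv1[0] - mv0[0]) / w   # scaling/rotation term from horizontal components
    b = (mv1[1] - mv0[1]) / w   # scaling/rotation term from vertical components
    vx = a * x - b * y + mv0[0]
    vy = b * x + a * y + mv0[1]
    return (vx, vy)


def motion_field(width: int, height: int, mv0: MV, mv1: MV) -> List[List[MV]]:
    """Motion vector for every pixel of a width x height block, indexed [y][x]."""
    return [[affine_mv(x, y, mv0, mv1, width) for x in range(width)]
            for y in range(height)]


field = motion_field(4, 4, (2.0, 1.0), (4.0, 1.0))
```

Each motion vector in `field` would then be used to fetch the predicted pixel value from the reference frame; that interpolation step is outside the scope of this sketch.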
  • The motion model is a non-translational motion model; specifically, the non-translational motion model is an affine motion model in the following form:
  • where vy is the motion vector vertical component of the pixel sample at coordinates (x, y) in the current image block, and w is the length or width of the current image block; correspondingly, according to
  • the at least one first pixel sample and the at least one second pixel sample include an upper left pixel sample, an upper right pixel sample, a lower left pixel sample, and a central pixel sample a1 of the current image block.
  • The upper left pixel sample of the current image block is the upper left vertex of the current image block or a pixel block in the current image block that contains the upper left vertex of the current image block.
  • The lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that contains the lower left vertex of the current image block.
  • The upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that contains the upper right vertex of the current image block.
  • The central pixel sample a1 of the current image block is the central pixel of the current image block or a pixel block in the current image block that contains the central pixel of the current image block.
  • The candidate motion information unit set corresponding to the upper left pixel sample of the current image block includes the motion information units of x1 pixel samples, where the x1 pixel samples include at least one pixel sample spatially adjacent to the upper left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper left pixel sample of the current image block, and x1 is a positive integer; the x1 pixel samples include at least one of: a pixel sample located at the same position as the upper left pixel sample of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs, a spatially adjacent pixel sample on the left of the current image block, a spatially adjacent pixel sample on the upper left of the current image block, and a spatially adjacent pixel sample on the upper side of the current image block.
  • The candidate motion information unit set corresponding to the upper right pixel sample of the current image block includes the motion information units of x2 pixel samples, where the x2 pixel samples include at least one pixel sample spatially adjacent to the upper right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper right pixel sample of the current image block, and x2 is a positive integer; the x2 pixel samples include at least one of: a pixel sample located at the same position as the upper right pixel sample of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs, a spatially adjacent pixel sample on the right of the current image block, a spatially adjacent pixel sample on the upper right of the current image block, and a spatially adjacent pixel sample on the upper side of the current image block.
  • The candidate motion information unit set corresponding to the lower left pixel sample of the current image block includes the motion information units of x3 pixel samples, where the x3 pixel samples include at least one pixel sample spatially adjacent to the lower left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower left pixel sample of the current image block, and x3 is a positive integer; the x3 pixel samples include at least one of: a pixel sample located at the same position as the lower left pixel sample of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs, a spatially adjacent pixel sample on the left of the current image block, a spatially adjacent pixel sample on the lower left of the current image block, and a spatially adjacent pixel sample on the lower side of the current image block.
  • The candidate motion information unit set corresponding to the central pixel sample a1 of the current image block includes the motion information units of x5 pixel samples, where one of the x5 pixel samples is a pixel sample a2, the position of the central pixel sample a1 in the video frame to which the current image block belongs is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame to which the current image block belongs, and x5 is a positive integer.
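Building a candidate set for one pixel sample from its spatial and temporal neighbours could look like the sketch below. The neighbour names, the use of `None` for an unavailable neighbour, the scan order, and the removal of duplicates are assumptions for illustration only; the text states which neighbours may contribute but not how they are ordered or pruned.

```python
from typing import Dict, List, Optional, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) motion vector components


def collect_candidates(neighbours: Dict[str, Optional[MV]],
                       order: List[str]) -> List[MV]:
    """Gather available, distinct motion vectors in the given neighbour order."""
    out: List[MV] = []
    for name in order:
        mv = neighbours.get(name)
        if mv is not None and mv not in out:  # skip unavailable and duplicate entries
            out.append(mv)
    return out


# Hypothetical neighbours of the upper left pixel sample.
upper_left_neighbours = {
    "left": (4, 0),
    "upper_left": None,            # e.g. not inter coded, so no motion vector
    "upper": (4, 0),               # duplicate of "left"
    "temporal_colocated": (3, 1),  # same position in a temporally adjacent frame
}
cands = collect_candidates(
    upper_left_neighbours,
    ["left", "upper_left", "upper", "temporal_colocated"],
)
```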
  • An image prediction method, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the method includes: determining the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, where any one of the candidate motion information unit sets includes at least one motion information unit; determining a merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is at least part of the motion information units in the candidate motion information unit sets corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; determining, from the merged motion information unit set, the motion information unit corresponding to each first pixel sample and each second pixel sample; encoding first code stream information, where the first code stream information is used to indicate the motion information unit corresponding to each first pixel sample and each second pixel sample;
  • In this embodiment of the present invention, the motion information of a first pixel sample is obtained simply by taking the corresponding predicted motion information as its motion information; no further code stream needs to be encoded to transmit a residual of the predicted motion information. This saves the bits that would otherwise be needed to transmit that residual, reducing bit consumption and improving coding efficiency.
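On the encoder side, the saving described here shows up as an asymmetry in what is written for each kind of pixel sample. The following minimal sketch assumes a single first pixel sample and a single second pixel sample; `encode_block` and the dictionary keys are hypothetical names, and real entropy coding of the values is not modelled.

```python
from typing import Dict, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) motion vector components


def encode_block(index: int, second_actual: MV,
                 second_pred: MV) -> Dict[str, object]:
    """Collect the side information for one block.

    Only the second pixel sample contributes a motion vector difference;
    the first pixel sample is fully described by the index alone.
    """
    mvd = (second_actual[0] - second_pred[0],
           second_actual[1] - second_pred[1])
    return {"first_code_stream": index, "second_code_stream": mvd}


payload = encode_block(index=2, second_actual=(6, -1), second_pred=(5, 1))
```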
  • An image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the apparatus includes: a first parsing unit, configured to parse first code stream information, where the first code stream information is used to indicate the motion information unit corresponding to each first pixel sample and each second pixel sample; a first acquiring unit, configured to obtain, according to the parsed first code stream information, the motion information of each first pixel sample and the predicted motion information of each second pixel sample, where the predicted motion information is prediction information of motion information; a second parsing unit, configured to parse second code stream information, where the second code stream information is used to represent the difference motion information of each second pixel sample, the difference motion information being the difference between the motion information and the predicted motion information; a second acquiring unit, configured to obtain the motion information of each second pixel sample according to the parsed second code stream information and the corresponding predicted motion information of each second pixel sample; and a third acquiring unit, configured to obtain a predicted value of the current image block according to the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
  • the first code stream information includes an index, where the index is used to indicate a motion information unit corresponding to each of the first pixel samples and each of the second pixel samples.
  • the second code stream information includes a difference value, where the difference is a motion vector residual between a motion vector of any of the second pixel samples and a predicted motion vector.
  • The first acquiring unit is specifically configured to: determine the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, where any one of the candidate motion information unit sets includes at least one motion information unit; determine a merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is at least part of the motion information units in the candidate motion information unit sets corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; determine, according to the parsed first code stream information, the motion information unit corresponding to each first pixel sample and each second pixel sample from the merged motion information unit set; use the motion information of the motion information unit corresponding to a first pixel sample as the motion information of that first pixel sample; and use the motion information of the motion information unit corresponding to a second pixel sample as the predicted motion information of that second pixel sample.
  • The first acquiring unit is specifically configured to: determine, from N candidate merged motion information unit sets, the merged motion information unit set containing the motion information units corresponding to each first pixel sample and each second pixel sample, where each motion information unit included in each of the N candidate merged motion information unit sets is selected from at least part of the constraint-compliant motion information units in the candidate motion information unit sets corresponding to each first pixel sample and each second pixel sample, N is a positive integer, and the N candidate merged motion information unit sets are different from each other.
  • The N candidate merged motion information unit sets satisfy at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition, where:
  • the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion;
  • the second condition includes that the prediction directions corresponding to two motion information units in any one of the N candidate merged motion information unit sets are the same;
  • the third condition includes that the reference frame indexes corresponding to two motion information units in any one of the N candidate merged motion information unit sets are the same;
  • the fourth condition includes that the absolute value of the difference between the motion vector horizontal components of two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the motion vector horizontal component of one motion information unit in any one of the N candidate merged motion information unit sets and that of a pixel sample Z is less than or equal to the horizontal component threshold, where the pixel sample Z of the current image block is different from any one of the first pixel samples and the second pixel samples;
  • the fifth condition includes that the absolute value of the difference between the motion vector vertical components of two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the motion vector vertical component of one motion information unit in any one of the N candidate merged motion information unit sets and that of the pixel sample Z is less than or equal to the vertical component threshold, where the pixel sample Z of the current image block is different from any one of the first pixel samples and the second pixel samples.
  • The second acquiring unit is specifically configured to: obtain the difference motion information of each second pixel sample according to the parsed second code stream information; and add the difference motion information of each second pixel sample to the corresponding predicted motion information to obtain the motion information of each second pixel sample.
  • The motion model is a non-translational motion model; specifically, the non-translational motion model is an affine motion model in the following form:
  • where the motion vectors of the first pixel sample and the second pixel sample are (vx_0, vy_0) and (vx_1, vy_1), respectively, vx is the motion vector horizontal component of the pixel sample at coordinates (x, y) in the current image block, vy is the motion vector vertical component of the pixel sample at coordinates (x, y) in the current image block, and w is the length or width of the current image block;
  • correspondingly, the third acquiring unit is specifically configured to: calculate a motion vector of each pixel in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determine a predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel; or calculate a motion vector of each pixel block in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determine a predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block.
  • The motion model is a non-translational motion model; specifically, the non-translational motion model is an affine motion model in the following form:
  • where vy is the motion vector vertical component of the pixel sample at coordinates (x, y) in the current image block, and w is the length or width of the current image block; correspondingly, the third acquiring unit is specifically configured to: calculate a motion vector of each pixel in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determine a predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel.
  • the at least one first pixel sample and the at least one second pixel sample include an upper left pixel sample, an upper right pixel sample, a lower left pixel sample, and a central pixel sample a1 of the current image block.
  • The upper left pixel sample of the current image block is the upper left vertex of the current image block or a pixel block in the current image block that contains the upper left vertex of the current image block.
  • The lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that contains the lower left vertex of the current image block.
  • The upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that contains the upper right vertex of the current image block.
  • The central pixel sample a1 of the current image block is the central pixel of the current image block or a pixel block in the current image block that contains the central pixel of the current image block.
  • the candidate motion information unit set corresponding to the upper left pixel sample of the current image block includes motion information units of x1 pixel samples, where the x1 pixel samples include at least one pixel sample spatially adjacent to the upper left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper left pixel sample of the current image block, x1 being a positive integer; wherein the x1 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the upper left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the upper left of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the upper right pixel sample of the current image block includes motion information units of x2 pixel samples, where the x2 pixel samples include at least one pixel sample spatially adjacent to the upper right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper right pixel sample of the current image block, x2 being a positive integer; wherein the x2 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the upper right pixel sample of the current image block; a spatially adjacent pixel sample on the right of the current image block; a spatially adjacent pixel sample on the upper right of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the lower left pixel sample of the current image block includes motion information units of x3 pixel samples, where the x3 pixel samples include at least one pixel sample spatially adjacent to the lower left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower left pixel sample of the current image block, x3 being a positive integer; wherein the x3 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the lower left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the lower left of the current image block; and a spatially adjacent pixel sample on the lower side of the current image block.
  • the candidate motion information unit set corresponding to the central pixel sample a1 of the current image block includes motion information units of x5 pixel samples, where one of the x5 pixel samples is a pixel sample a2, wherein the position of the central pixel sample a1 in the video frame to which the current image block belongs is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame to which the current image block belongs, and x5 is a positive integer.
  • an image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample,
  • the apparatus includes: a first determining unit, configured to determine a candidate motion information unit set corresponding to each of the first pixel samples and each of the second pixel samples, wherein any one of the candidate motion information unit sets includes at least one motion information unit; a second determining unit, configured to determine a merged motion information unit set of the current block, wherein each motion information unit in the merged motion information unit set is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the first pixel samples and each of the second pixel samples, and wherein the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; and a third determining unit, configured to determine, from the merged motion information unit set, a motion information unit corresponding to each of
  • a calculation unit, configured to calculate difference motion information of the second pixel sample, where the difference motion information is the difference between the motion information and the predicted motion information;
  • a second coding unit, configured to encode second code stream information, where the second code stream information is used to represent the difference motion information of each of the second pixel samples;
  • an obtaining unit, configured to obtain a predicted value of the current image block according to the motion model of the current image block, the motion information of each of the first pixel samples, and the motion information of each of the second pixel samples.
  • an image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, the apparatus comprising: a processor and a memory coupled to the processor; the memory is configured to store code or instructions; the processor is configured to invoke the code or instructions to perform: parsing first code stream information, the first code stream information being used to indicate a motion information unit corresponding to each of the first pixel samples and each of the second pixel samples; acquiring, according to the parsed first code stream information, motion information of each of the first pixel samples and predicted motion information of each of the second pixel samples, the predicted motion information being prediction information of motion information; parsing second code stream information, the second code stream information being used to represent difference motion information of each of the second pixel samples, the difference motion information being the difference between the motion information and the predicted motion information; obtaining motion information of each of the second pixel samples according to the parsed second code stream information and the predicted motion information of the corresponding second pixel sample; obtaining the motion information
  • an image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample,
  • the apparatus includes: a processor and a memory coupled to the processor; the memory is configured to store code or instructions; the processor is configured to invoke the code or instructions to perform: determining a candidate motion information unit set corresponding to each of the first pixel samples and each of the second pixel samples, wherein any one of the candidate motion information unit sets includes at least one motion information unit; determining a merged motion information unit set of the current block, wherein each motion information unit in the merged motion information unit set is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the first pixel samples and each of the second pixel samples, and wherein the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
  • determining, from the merged motion information unit set, a motion information unit corresponding to each of the first pixel samples and each of the second pixel samples; encoding first code stream information, where the first code stream information is used to represent the motion information unit, determined from the merged motion information unit set, corresponding to each of the first pixel samples and each of the second pixel samples; using motion information of the motion information unit corresponding to the first pixel sample as motion information of the first pixel sample; using motion information of the motion information unit corresponding to the second pixel sample as predicted motion information of the second pixel sample; calculating difference motion information of the second pixel sample,
  • the difference motion information being the difference between the motion information and the predicted motion information; encoding second code stream information, the second code stream information being used to represent the difference motion information of each of the second pixel samples; and obtaining a predicted value of the current image block according to the motion model of the current image block, the motion information of each of the first pixel samples, and the motion information of each of the second pixel samples.
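The different handling of first and second pixel samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `MotionVector`, `encode_sample` and `decode_sample` are hypothetical, and motion information is reduced here to a single integer motion vector.

```python
from typing import NamedTuple, Optional


class MotionVector(NamedTuple):
    """Hypothetical stand-in for motion information: one integer motion vector."""
    x: int
    y: int


def encode_sample(actual: MotionVector, predicted: MotionVector,
                  is_second: bool) -> Optional[MotionVector]:
    """Return the difference motion information to signal, or None.

    A first pixel sample takes the motion information of its motion
    information unit directly, so nothing extra is signalled.
    A second pixel sample signals (actual - predicted).
    """
    if not is_second:
        return None
    return MotionVector(actual.x - predicted.x, actual.y - predicted.y)


def decode_sample(predicted: MotionVector,
                  difference: Optional[MotionVector]) -> MotionVector:
    """Rebuild motion information from the prediction and the optional difference."""
    if difference is None:
        return predicted
    return MotionVector(predicted.x + difference.x, predicted.y + difference.y)
```

In this sketch the encoder and decoder only agree if both derive the same predicted motion information, which is why the first code stream information identifies the selected motion information units.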
  • a seventh aspect of the present invention provides an image prediction method, which may include:
  • determining 2 pixel samples in the current image block, and determining a candidate motion information unit set corresponding to each of the 2 pixel samples, wherein the candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit;
  • determining a merged motion information unit set i including 2 motion information units, wherein each motion information unit in the merged motion information unit set i is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples, where
  • the motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
  • performing pixel value prediction on the current image block by using an affine motion model and the merged motion information unit set i.
  • determining the merged motion information unit set i including 2 motion information units includes: determining, from among N candidate merged motion information unit sets, the merged motion information unit set i including 2 motion information units,
  • wherein each motion information unit included in each of the N candidate merged motion information unit sets is selected from at least part of the motion information units that meet a constraint condition in the candidate motion information unit set corresponding to each of the 2 pixel samples,
  • wherein N is a positive integer, the N candidate merged motion information unit sets are different from each other, and each of the N candidate merged motion information unit sets includes 2 motion information units.
  • the N candidate merged motion information unit sets meet at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition,
  • the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion;
  • the second condition includes that the 2 motion information units in any one of the N candidate merged motion information unit sets have the same prediction direction;
  • the third condition includes that the reference frame indexes corresponding to the 2 motion information units in any one of the N candidate merged motion information unit sets are the same;
  • the fourth condition includes that the absolute value of the difference between the motion vector horizontal components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the motion vector horizontal component of one motion information unit in any one of the N candidate merged motion information unit sets and the motion vector horizontal component of a pixel sample Z is less than or equal to the horizontal component threshold, the pixel sample Z of the current image block being different from either of the 2 pixel samples;
  • the fifth condition includes that the absolute value of the difference between the motion vector vertical components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the motion vector vertical component of one motion information unit in any one of the N candidate merged motion information unit sets and the motion vector vertical component of the pixel sample Z is less than or equal to the horizontal component threshold, the pixel sample Z of the current image block being different from either of the 2 pixel samples.
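The construction of the N candidate merged motion information unit sets can be sketched as pairing one motion information unit from each pixel sample's candidate set and then filtering the pairs. The sketch below is illustrative only and rests on several assumptions: `MotionUnit`, `is_valid_pair` and `candidate_merged_sets` are hypothetical names, each unit carries a single motion vector, "non-translational" is taken to mean that the two motion vectors differ, only the two-unit alternative of the fourth and fifth conditions is checked (the pixel sample Z alternative is omitted), and all five conditions are applied together although the text requires only at least one.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class MotionUnit:
    """Hypothetical motion information unit holding one motion vector."""
    mvx: int
    mvy: int
    direction: str  # "forward" or "backward"
    ref_idx: int    # reference frame index


def is_valid_pair(a: MotionUnit, b: MotionUnit,
                  h_thresh: int, v_thresh: int) -> bool:
    """Check the five conditions for one candidate pair (assumed reading)."""
    non_translational = (a.mvx, a.mvy) != (b.mvx, b.mvy)  # first condition
    same_direction = a.direction == b.direction           # second condition
    same_reference = a.ref_idx == b.ref_idx               # third condition
    horizontal_ok = abs(a.mvx - b.mvx) <= h_thresh        # fourth condition
    vertical_ok = abs(a.mvy - b.mvy) <= v_thresh          # fifth condition
    return (non_translational and same_direction and same_reference
            and horizontal_ok and vertical_ok)


def candidate_merged_sets(set_a: Sequence[MotionUnit],
                          set_b: Sequence[MotionUnit],
                          h_thresh: int,
                          v_thresh: int) -> List[Tuple[MotionUnit, MotionUnit]]:
    """Pair units from the two candidate sets and keep the pairs that pass."""
    return [(a, b) for a, b in product(set_a, set_b)
            if is_valid_pair(a, b, h_thresh, v_thresh)]
```

The length of the returned list plays the role of N, and an index into it is one possible form for the identifier of the merged motion information unit set i.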
  • the 2 pixel samples include 2 of the following pixel samples of the current image block: the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1;
  • the upper left pixel sample of the current image block is the upper left vertex of the current image block, or a pixel block in the current image block that includes the upper left vertex of the current image block; the lower left pixel sample of the current image block is the lower left vertex of the current image block, or a pixel block in the current image block that includes the lower left vertex of the current image block; the upper right pixel sample of the current image block is the upper right vertex of the current image block, or a pixel block in the current image block that includes the upper right vertex of the current image block; the central pixel sample a1 of the current image block is the central pixel of the current image block, or a pixel block in the current image block that includes the central pixel of the current image block.
  • the candidate motion information unit set corresponding to the upper left pixel sample of the current image block includes motion information units of x1 pixel samples, wherein the x1 pixel samples include at least one pixel sample spatially adjacent to the upper left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper left pixel sample of the current image block, x1 being a positive integer;
  • the x1 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the upper left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the upper left of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the upper right pixel sample of the current image block includes motion information units of x2 pixel samples, wherein the x2 pixel samples include at least one pixel sample spatially adjacent to the upper right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper right pixel sample of the current image block, x2 being a positive integer;
  • the x2 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the upper right pixel sample of the current image block; a spatially adjacent pixel sample on the right of the current image block; a spatially adjacent pixel sample on the upper right of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the lower left pixel sample of the current image block includes motion information units of x3 pixel samples, wherein the x3 pixel samples include at least one pixel sample spatially adjacent to the lower left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower left pixel sample of the current image block, x3 being a positive integer;
  • the x3 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the lower left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the lower left of the current image block; and a spatially adjacent pixel sample on the lower side of the current image block.
  • the candidate motion information unit set corresponding to the central pixel sample a1 of the current image block includes motion information units of x5 pixel samples, wherein one of the x5 pixel samples is a pixel sample a2,
  • the position of the central pixel sample a1 in the video frame to which the current image block belongs is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame to which the current image block belongs, and x5 is a positive integer.
  • performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i includes: when the reference frame index corresponding to a motion vector whose prediction direction is a first prediction direction in the merged motion information unit set i is different from the reference frame index of the current image block, performing scaling processing on the merged motion information unit set i, so that the motion vector whose prediction direction is the first prediction direction in the merged motion information unit set i is scaled to the reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i, wherein the first prediction direction is forward or backward;
  • performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i includes: when the reference frame index corresponding to a motion vector whose prediction direction is forward in the merged motion information unit set i is different from the forward reference frame index of the current image block, and the reference frame index corresponding to a motion vector whose prediction direction is backward in the merged motion information unit set i is different from the backward reference frame index of the current image block, performing scaling processing on the merged motion information unit set i, so that the motion vector whose prediction direction is forward in the merged motion information unit set i is scaled to the forward reference frame of the current image block and the motion vector whose prediction direction is backward in the merged motion information unit set i is scaled to the backward reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i.
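The text above does not state how a motion vector is scaled to a different reference frame. A common approach in block-based video codecs, assumed here purely for illustration, is to scale linearly by the ratio of temporal distances, with frames identified by a picture order count. The function name `scale_motion_vector` and its parameters are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def scale_motion_vector(mv: Tuple[int, int], cur_poc: int,
                        src_ref_poc: int, dst_ref_poc: int) -> Tuple[int, int]:
    """Scale mv, which points at src_ref_poc, so that it points at dst_ref_poc.

    Assumes linear motion over time: the vector is multiplied by the ratio
    of the temporal distance to the target reference frame over the
    temporal distance to the original reference frame.
    """
    src_dist = cur_poc - src_ref_poc
    dst_dist = cur_poc - dst_ref_poc
    if src_dist == 0:
        raise ValueError("original reference frame must differ from the current frame")
    factor = Fraction(dst_dist, src_dist)
    # round() on a Fraction returns an int and rounds exact halves to even.
    return (round(mv[0] * factor), round(mv[1] * factor))
```

For example, a vector of (8, -4) that spans two frames becomes (16, -8) when rescaled to a reference four frames away. A real codec would typically use fixed-point arithmetic with clipping rather than exact fractions.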
  • performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i includes: obtaining a motion vector of any pixel sample in the current image block by using the ratio of the difference between the motion vector horizontal components of the 2 pixel samples to the length or width of the current image block, and the ratio of the difference between the motion vector vertical components of the 2 pixel samples to the length or width of the current image block, wherein the motion vectors of the 2 pixel samples are obtained based on the motion vectors of the two motion information units in the merged motion information unit set i.
  • the horizontal coordinate coefficient of the motion vector horizontal component of the 2 pixel samples is equal to the vertical coordinate coefficient of the motion vector vertical component, and the vertical coordinate coefficient of the motion vector horizontal component of the 2 pixel samples is opposite to the horizontal coordinate coefficient of the motion vector vertical component.
  • the affine motion model is an affine motion model of the form:
  • where the motion vectors of the 2 pixel samples are (vx0, vy0) and (vx1, vy1), respectively, vx is the motion vector horizontal component of the pixel sample with coordinates (x, y) in the current image block, vy is the motion vector vertical component of the pixel sample with coordinates (x, y) in the current image block, and w is the length or width of the current image block.
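One form consistent with the coefficient constraints and symbol definitions above is the following four-parameter model. It is written here under the assumption that the 2 pixel samples are the upper left pixel sample at (0, 0) and the upper right pixel sample at (w, 0); other choices of the 2 pixel samples would change the expressions.

```latex
\begin{cases}
vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\[2ex]
vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0
\end{cases}
```

Substituting (x, y) = (0, 0) gives (vx_0, vy_0) and (x, y) = (w, 0) gives (vx_1, vy_1). The coefficient of x in vx equals the coefficient of y in vy, and the coefficient of y in vx is the negative of the coefficient of x in vy, as the text requires.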
  • the image prediction method is applied to a video encoding process or the image prediction method is applied to a video decoding process.
  • determining, from among the N candidate merged motion information unit sets, the merged motion information unit set i including 2 motion information units includes: determining, based on an identifier of the merged motion information unit set i obtained from a video code stream, the merged motion information unit set i including 2 motion information units from among the N candidate merged motion information unit sets.
  • the method further includes: decoding motion vector residuals of the 2 pixel samples from the video code stream, obtaining motion vector predictors of the 2 pixel samples by using motion vectors of spatially adjacent or temporally adjacent pixel samples of the 2 pixel samples, and obtaining motion vectors of the 2 pixel samples based on the motion vector predictors of the 2 pixel samples and the motion vector residuals of the 2 pixel samples, respectively.
  • the method further includes: obtaining motion vector predictors of the 2 pixel samples by using motion vectors of spatially adjacent or temporally adjacent pixel samples of the 2 pixel samples, obtaining motion vector residuals of the 2 pixel samples according to the motion vector predictors of the 2 pixel samples, and writing the motion vector residuals of the 2 pixel samples into the video code stream.
  • in the case where the image prediction method is applied to a video encoding process, the method further includes writing the identifier of the merged motion information unit set i into the video code stream.
  • An eighth aspect of the embodiments of the present invention provides an image prediction apparatus, including:
  • a first determining unit, configured to determine 2 pixel samples in the current image block, and determine a candidate motion information unit set corresponding to each of the 2 pixel samples, wherein the candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit;
  • a second determining unit, configured to determine a merged motion information unit set i including 2 motion information units,
  • wherein each motion information unit in the merged motion information unit set i is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples, where
  • the motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
  • a prediction unit, configured to perform pixel value prediction on the current image block by using an affine motion model and the merged motion information unit set i.
  • the second determining unit is specifically configured to determine, from among N candidate merged motion information unit sets, the merged motion information unit set i including 2 motion information units, wherein each motion information unit included in each of the N candidate merged motion information unit sets is selected from at least part of the motion information units that meet a constraint condition in the candidate motion information unit set corresponding to each of the 2 pixel samples, wherein N is a positive integer, the N candidate merged motion information unit sets are different from each other, and each of the N candidate merged motion information unit sets includes 2 motion information units.
  • the N candidate merged motion information unit sets meet at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition,
  • the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion;
  • the second condition includes that the 2 motion information units in any one of the N candidate merged motion information unit sets have the same prediction direction;
  • the third condition includes that the reference frame indexes corresponding to the 2 motion information units in any one of the N candidate merged motion information unit sets are the same;
  • the fourth condition includes that the absolute value of the difference between the motion vector horizontal components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the motion vector horizontal component of one motion information unit in any one of the N candidate merged motion information unit sets and the motion vector horizontal component of a pixel sample Z is less than or equal to the horizontal component threshold, the pixel sample Z of the current image block being different from either of the 2 pixel samples;
  • the fifth condition includes that the absolute value of the difference between the motion vector vertical components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the motion vector vertical component of one motion information unit in any one of the N candidate merged motion information unit sets and the motion vector vertical component of the pixel sample Z is less than or equal to the horizontal component threshold, the pixel sample Z of the current image block being different from either of the 2 pixel samples.
  • the 2 pixel samples include 2 of the following pixel samples of the current image block: the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1;
  • the upper left pixel sample of the current image block is the upper left vertex of the current image block, or a pixel block in the current image block that includes the upper left vertex of the current image block; the lower left pixel sample of the current image block is the lower left vertex of the current image block, or a pixel block in the current image block that includes the lower left vertex of the current image block; the upper right pixel sample of the current image block is the upper right vertex of the current image block, or a pixel block in the current image block that includes the upper right vertex of the current image block; the central pixel sample a1 of the current image block is the central pixel of the current image block, or a pixel block in the current image block that includes the central pixel of the current image block.
  • the candidate motion information unit set corresponding to the upper left pixel sample of the current image block includes motion information units of x1 pixel samples, wherein the x1 pixel samples include at least one pixel sample spatially adjacent to the upper left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper left pixel sample of the current image block, x1 being a positive integer;
  • the x1 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the upper left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the upper left of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the upper right pixel sample of the current image block includes motion information units of x2 pixel samples, wherein the x2 pixel samples include at least one pixel sample spatially adjacent to the upper right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper right pixel sample of the current image block, x2 being a positive integer;
  • the x2 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the upper right pixel sample of the current image block; a spatially adjacent pixel sample on the right of the current image block; a spatially adjacent pixel sample on the upper right of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the lower left pixel sample of the current image block includes motion information units of x3 pixel samples, wherein the x3 pixel samples include at least one pixel sample spatially adjacent to the lower left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower left pixel sample of the current image block, x3 being a positive integer;
  • the x3 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, at the same position as the lower left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the lower left of the current image block; and a spatially adjacent pixel sample on the lower side of the current image block.
  • the candidate motion information unit set corresponding to the central pixel sample a1 of the current image block includes motion information units of x5 pixel samples, wherein one of the x5 pixel samples is a pixel sample a2,
  • the position of the central pixel sample a1 in the video frame to which the current image block belongs is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame to which the current image block belongs, and x5 is a positive integer.
  • the prediction unit is specifically configured to: when the reference frame index corresponding to a motion vector whose prediction direction is a first prediction direction in the merged motion information unit set i is different from the reference frame index of the current image block, perform scaling processing on the merged motion information unit set i, so that the motion vector whose prediction direction is the first prediction direction in the merged motion information unit set i is scaled to the reference frame of the current image block, and perform pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i, where the first prediction direction is forward or backward;
  • the prediction unit is specifically configured to: when the reference frame index corresponding to a motion vector whose prediction direction is forward in the merged motion information unit set i is different from the forward reference frame index of the current image block, and the reference frame index corresponding to a motion vector whose prediction direction is backward in the merged motion information unit set i is different from the backward reference frame index of the current image block, perform scaling processing on the merged motion information unit set i, so that the motion vector whose prediction direction is forward in the merged motion information unit set i is scaled to the forward reference frame of the current image block and the motion vector whose prediction direction is backward in the merged motion information unit set i is scaled to the backward reference frame of the current image block, and perform pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i.
  • the prediction unit is specifically configured to calculate a motion vector of each pixel in the current image block by using the affine motion model and the merged motion information unit set i, and determine a predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel in the current image block;
  • the prediction unit is specifically configured to calculate a motion vector of each pixel block in the current image block by using the affine motion model and the merged motion information unit set i, and determine a predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block in the current image block.
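The per-pixel-block variant can be sketched as evaluating the affine model once per pixel block instead of once per pixel. The sketch below is illustrative and rests on assumptions not stated in this passage: it uses a four-parameter model vx = a·x − b·y + vx0, vy = b·x + a·y + vy0 with a = (vx1 − vx0)/w and b = (vy1 − vy0)/w, takes the 2 pixel samples to be the upper left and upper right corners, and evaluates the model at each block's centre. The names `affine_mv` and `block_motion_field` are hypothetical.

```python
from typing import List, Tuple

Vector = Tuple[float, float]


def affine_mv(x: float, y: float, mv0: Vector, mv1: Vector, w: float) -> Vector:
    """Motion vector at (x, y) from the upper left (mv0) and upper right (mv1) vectors."""
    a = (mv1[0] - mv0[0]) / w  # horizontal coefficient of vx, vertical coefficient of vy
    b = (mv1[1] - mv0[1]) / w  # horizontal coefficient of vy, negated for vx
    return (a * x - b * y + mv0[0], b * x + a * y + mv0[1])


def block_motion_field(mv0: Vector, mv1: Vector, w: int, h: int,
                       block: int = 4) -> List[List[Vector]]:
    """One motion vector per block x block pixel block, sampled at the block centre."""
    field = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            row.append(affine_mv(bx + block / 2, by + block / 2, mv0, mv1, w))
        field.append(row)
    return field
```

Setting `block=1` approximates the per-pixel variant, at the cost of one model evaluation per pixel. Every pixel inside a pixel block would then be predicted by motion compensation with that block's single motion vector.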
  • the prediction unit is specifically configured to obtain a motion vector of any pixel sample in the current image block by using the ratio of the difference between the motion vector horizontal components of the 2 pixel samples to the length or width of the current image block, and the ratio of the difference between the motion vector vertical components of the 2 pixel samples to the length or width of the current image block, wherein the motion vectors of the 2 pixel samples are obtained based on the motion vectors of the two motion information units in the merged motion information unit set i.
  • the horizontal coordinate coefficient of the motion vector horizontal component of the 2 pixel samples is equal to the vertical coordinate coefficient of the motion vector vertical component, and the vertical coordinate coefficient of the motion vector horizontal component of the 2 pixel samples is opposite to the horizontal coordinate coefficient of the motion vector vertical component.
  • the affine motion model is an affine motion model of the form:
  • wherein the motion vectors of the two pixel samples are $(vx_0, vy_0)$ and $(vx_1, vy_1)$, respectively; vx is the motion vector horizontal component of the pixel sample with coordinates (x, y) in the current image block; vy is the motion vector vertical component of the pixel sample with coordinates (x, y) in the current image block; and w is the length or width of the current image block.
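The two-sample model described above can be illustrated with a minimal Python sketch. It assumes that the two pixel samples are the upper-left and upper-right corners of the block, separated horizontally by w, and that (x, y) is measured from the upper-left sample. The names `affine_mv` and `motion_field` are hypothetical and are not taken from the application.

```python
def affine_mv(x, y, mv0, mv1, w):
    """Motion vector at (x, y) derived from two control-point motion vectors.

    Assumption: mv0 = (vx0, vy0) belongs to the sample at (0, 0) and
    mv1 = (vx1, vy1) belongs to the sample at (w, 0).
    """
    vx0, vy0 = mv0
    vx1, vy1 = mv1
    dvx = (vx1 - vx0) / w  # ratio of the horizontal-component difference to w
    dvy = (vy1 - vy0) / w  # ratio of the vertical-component difference to w
    # The x-coefficient of vx equals the y-coefficient of vy (dvx), and the
    # y-coefficient of vx is the negation of the x-coefficient of vy (dvy).
    vx = dvx * x - dvy * y + vx0
    vy = dvy * x + dvx * y + vy0
    return vx, vy


def motion_field(mv0, mv1, w, h):
    """Per-pixel motion vectors for a w-by-h block, indexed as field[y][x]."""
    return [[affine_mv(x, y, mv0, mv1, w) for x in range(w)] for y in range(h)]
```

Evaluating the function at the two control points returns the two input motion vectors, which is a quick consistency check on the coefficient signs.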
  • the image prediction device is applied to a video encoding device or the image prediction device is applied to a video decoding device.
  • the second determining unit is specifically configured to determine, based on the identifier of the merged motion information unit set i obtained from the video code stream, the merged motion information unit set i including the two motion information units from among the N candidate combined motion information unit sets.
  • the apparatus further includes a decoding unit, configured to decode the video code stream to obtain motion vector residuals of the two pixel samples, obtain motion vector predictors of the two pixel samples by using motion vectors of spatially adjacent or temporally adjacent pixel samples of the two pixel samples, and obtain the motion vectors of the two pixel samples based on the motion vector predictors of the two pixel samples and the motion vector residuals of the two pixel samples, respectively.
  • the prediction unit is further configured to: obtain motion vector predictors of the two pixel samples by using motion vectors of spatially adjacent or temporally adjacent pixel samples of the two pixel samples, obtain motion vector residuals of the two pixel samples according to the motion vector predictors of the two pixel samples, and write the motion vector residuals of the two pixel samples into the video code stream.
  • in the case where the image prediction apparatus is applied to a video encoding device, the device further comprises an encoding unit for writing the identifier of the combined motion information unit set i into the video code stream.
  • a ninth aspect of the embodiments of the present invention provides an image prediction apparatus, including:
  • the processor is configured, by invoking code or instructions stored in the memory, to: determine two pixel samples in a current image block, and determine a candidate motion information unit set corresponding to each of the two pixel samples, wherein the candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit; determine a combined motion information unit set i including two motion information units, wherein each motion information unit in the combined motion information unit set i is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples, and the motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; and perform pixel value prediction on the current image block by using the affine motion model and the combined motion information unit set i.
  • in the aspect of determining the combined motion information unit set i including the two motion information units, the processor is configured to determine, from among N candidate combined motion information unit sets, the combined motion information unit set i including the two motion information units; wherein each motion information unit included in each of the N candidate combined motion information unit sets is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the two pixel samples, and N is a positive integer.
  • the N candidate combined motion information unit sets are different from each other, and each of the N candidate combined motion information unit sets includes two motion information units.
  • the N candidate combined motion information unit sets meet at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition,
  • the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate combined motion information unit sets is non-translational motion;
  • the second condition includes that the two motion information units in any one of the N candidate combined motion information unit sets have the same prediction direction;
  • the third condition includes that the reference frame indexes corresponding to the two motion information units in any one of the N candidate combined motion information unit sets are the same;
  • the fourth condition includes that the absolute value of the difference between the motion vector horizontal components of the two motion information units in any one of the N candidate combined motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the motion vector horizontal component of one motion information unit in any one of the N candidate combined motion information unit sets and the motion vector horizontal component of a pixel sample Z is less than or equal to a horizontal component threshold, the pixel sample Z of the current image block being different from either of the 2 pixel samples;
  • the fifth condition includes that the absolute value of the difference between the motion vector vertical components of the two motion information units in any one of the N candidate combined motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the motion vector vertical component of one motion information unit in any one of the N candidate combined motion information unit sets and the motion vector vertical component of a pixel sample Z is less than or equal to a vertical component threshold, the pixel sample Z of the current image block being different from either of the 2 pixel samples.
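The five conditions can be read as a filter over pairs of candidate motion information units. The sketch below is only an illustration: the text requires at least one of the conditions, whereas this sketch applies all five at once, and it reads the first condition as "the two motion vectors are not identical", which is an assumption. The `MotionInfo` type, its fields, and the function `candidate_pairs` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MotionInfo:
    """Hypothetical motion information unit carrying a single motion vector."""
    vx: int
    vy: int
    direction: str  # "forward" or "backward"
    ref_idx: int    # reference frame index


def candidate_pairs(cands0, cands1, h_thresh, v_thresh):
    """Pair candidates of the two pixel samples and keep the pairs that pass
    all five checks (a stricter reading than "at least one")."""
    kept = []
    for m0, m1 in product(cands0, cands1):
        if (m0.vx, m0.vy) == (m1.vx, m1.vy):
            continue  # condition 1 (assumed reading): equal vectors mean translation
        if m0.direction != m1.direction:
            continue  # condition 2: same prediction direction
        if m0.ref_idx != m1.ref_idx:
            continue  # condition 3: same reference frame index
        if abs(m0.vx - m1.vx) > h_thresh:
            continue  # condition 4: horizontal component difference within threshold
        if abs(m0.vy - m1.vy) > v_thresh:
            continue  # condition 5: vertical component difference within threshold
        kept.append((m0, m1))
    return kept
```

Each surviving pair plays the role of one candidate combined motion information unit set, from which the set i would then be chosen.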
  • the two pixel samples include two of the following pixel samples of the current image block: the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1;
  • the upper left pixel sample of the current image block is the upper left vertex of the current image block or a pixel block in the current image block that includes the upper left vertex of the current image block; the lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that includes the lower left vertex of the current image block; the upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that includes the upper right vertex of the current image block; the central pixel sample a1 of the current image block is the central pixel of the current image block or a pixel block in the current image block that includes the central pixel of the current image block.
  • the candidate motion information unit set corresponding to the upper left pixel sample of the current image block includes motion information units of x1 pixel samples, wherein the x1 pixel samples include at least one pixel sample spatially adjacent to the upper left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper left pixel sample of the current image block, and x1 is a positive integer;
  • the x1 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, that has the same position as the upper left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the upper left of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the upper right pixel sample of the current image block includes motion information units of x2 pixel samples, wherein the x2 pixel samples include at least one pixel sample spatially adjacent to the upper right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper right pixel sample of the current image block, and x2 is a positive integer;
  • the x2 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, that has the same position as the upper right pixel sample of the current image block; a spatially adjacent pixel sample on the right of the current image block; a spatially adjacent pixel sample on the upper right of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • the candidate motion information unit set corresponding to the lower left pixel sample of the current image block includes motion information units of x3 pixel samples, wherein the x3 pixel samples include at least one pixel sample spatially adjacent to the lower left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower left pixel sample of the current image block, and x3 is a positive integer;
  • the x3 pixel samples include at least one of: a pixel sample, in a video frame temporally adjacent to the video frame to which the current image block belongs, that has the same position as the lower left pixel sample of the current image block; a spatially adjacent pixel sample on the left of the current image block; a spatially adjacent pixel sample on the lower left of the current image block; and a spatially adjacent pixel sample on the lower side of the current image block.
  • the candidate motion information unit set corresponding to the central pixel sample a1 of the current image block includes motion information units of x5 pixel samples, wherein one of the x5 pixel samples is a pixel sample a2, the position of the central pixel sample a1 in the video frame to which the current image block belongs is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame to which the current image block belongs, and x5 is a positive integer.
  • the processor is configured to: when the reference frame index corresponding to the motion vector whose prediction direction is a first prediction direction in the merged motion information unit set i is different from the reference frame index of the current image block, perform scaling processing on the merged motion information unit set i, such that the motion vector whose prediction direction is the first prediction direction in the merged motion information unit set i is scaled to the reference frame of the current image block, and perform pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i, wherein the first prediction direction is forward or backward;
  • the processor is configured to: when the reference frame index corresponding to the forward motion vector in the merged motion information unit set i is different from the forward reference frame index of the current image block, and the reference frame index corresponding to the backward motion vector in the merged motion information unit set i is different from the backward reference frame index of the current image block, perform scaling processing on the merged motion information unit set i, such that the forward motion vector in the merged motion information unit set i is scaled to the forward reference frame of the current image block and the backward motion vector in the merged motion information unit set i is scaled to the backward reference frame of the current image block, and perform pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i.
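The scaling step can be sketched as follows. This section does not state the scaling rule, so the sketch assumes linear scaling by temporal distance measured in picture order count (POC), a common convention in block-based video coding. The function name `scale_mv` and its parameters are hypothetical.

```python
def scale_mv(mv, cur_poc, src_ref_poc, dst_ref_poc):
    """Rescale mv, which points from the current frame to the frame at
    src_ref_poc, so that it points to the frame at dst_ref_poc instead.

    Assumption: motion is linear in time, so the vector is multiplied by the
    ratio of the two temporal distances.
    """
    if src_ref_poc == cur_poc:
        raise ValueError("source reference must differ from the current frame")
    factor = (cur_poc - dst_ref_poc) / (cur_poc - src_ref_poc)
    return (mv[0] * factor, mv[1] * factor)
```

For bidirectional prediction, the same function would be applied once to the forward motion vector with the forward reference frame and once to the backward motion vector with the backward reference frame.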
  • in the aspect of performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i, the processor is configured to calculate a motion vector of each pixel in the current image block by using the affine motion model and the combined motion information unit set i, and determine a predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel in the current image block;
  • the processor is configured to calculate a motion vector of each pixel block in the current image block by using the affine motion model and the combined motion information unit set i, and determine a predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block in the current image block.
  • the processor is configured to obtain a motion vector of an arbitrary pixel sample in the current image block by using a ratio of the difference between the horizontal components of the motion vectors of the two pixel samples to the length or width of the current image block, and a ratio of the difference between the vertical components of the motion vectors of the 2 pixel samples to the length or width of the current image block, wherein the motion vectors of the 2 pixel samples are obtained based on the motion vectors of the two motion information units in the combined motion information unit set i.
  • the horizontal coordinate coefficient of the motion vector horizontal component of the 2 pixel samples and the vertical coordinate coefficient of the motion vector vertical component are equal, and the vertical coordinate coefficient of the motion vector horizontal component of the 2 pixel samples and the horizontal coordinate coefficient of the motion vector vertical component are opposite.
  • the affine motion model is an affine motion model of the form:
  • wherein the motion vectors of the two pixel samples are $(vx_0, vy_0)$ and $(vx_1, vy_1)$, respectively; vx is the motion vector horizontal component of the pixel sample with coordinates (x, y) in the current image block; vy is the motion vector vertical component of the pixel sample with coordinates (x, y) in the current image block; and w is the length or width of the current image block.
  • the image prediction device is applied to a video encoding device or the image prediction device is applied to a video decoding device.
  • in the aspect of determining the merged motion information unit set i including the two motion information units, the processor is configured to determine, based on the identifier of the merged motion information unit set i obtained from the video code stream, the merged motion information unit set i including the two motion information units from among the N candidate combined motion information unit sets.
  • the processor is further configured to: decode the video code stream to obtain motion vector residuals of the two pixel samples, obtain motion vector predictors of the two pixel samples by using motion vectors of spatially adjacent or temporally adjacent pixel samples of the two pixel samples, and obtain the motion vectors of the two pixel samples based on the motion vector predictors of the two pixel samples and the motion vector residuals of the two pixel samples, respectively.
  • in conjunction with the thirteenth possible implementation of the ninth aspect, in a sixteenth possible implementation of the ninth aspect, in the case where the image prediction apparatus is applied to a video encoding apparatus, the processor is further configured to: obtain motion vector predictors of the two pixel samples by using motion vectors of spatially adjacent or temporally adjacent pixel samples of the two pixel samples, obtain motion vector residuals of the two pixel samples according to the motion vector predictors of the two pixel samples, and write the motion vector residuals of the two pixel samples into the video code stream.
  • the processor is further configured to write the identifier of the combined motion information unit set i into the video code stream.
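The predictor and residual handling on the encoder and decoder sides can be sketched as a round trip. The text does not say how the predictor is derived from neighboring motion vectors, so the component-wise median used in `predict_mv` is an assumption; all three function names are hypothetical.

```python
from statistics import median_low


def predict_mv(neighbor_mvs):
    """Motion vector predictor from neighboring motion vectors.

    Assumption: a component-wise median; the text only says the predictor is
    obtained from spatially or temporally adjacent pixel samples.
    """
    return (median_low([mv[0] for mv in neighbor_mvs]),
            median_low([mv[1] for mv in neighbor_mvs]))


def encode_residual(mv, mvp):
    """Encoder side: the residual written to the code stream is mv - mvp."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvp, mvd):
    """Decoder side: the motion vector is rebuilt as mvp + mvd."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

Because both sides derive the same predictor from already-coded neighbors, only the residual of each of the two pixel samples needs to be transmitted.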
  • a tenth aspect of the embodiments of the present invention provides an image processing method, including:
  • the affine motion model is in the following form:
    $$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
  • wherein (x, y) is the coordinate of the arbitrary pixel sample, vx is the horizontal component of the motion vector of the arbitrary pixel sample, and vy is the vertical component of the motion vector of the arbitrary pixel sample;
  • in the equation vx = ax + by, a is the horizontal coordinate coefficient of the horizontal component of the affine motion model, and b is the vertical coordinate coefficient of the horizontal component of the affine motion model;
  • in the equation vy = -bx + ay, a is the vertical coordinate coefficient of the vertical component of the affine motion model, and -b is the horizontal coordinate coefficient of the vertical component of the affine motion model.
  • the affine motion model further includes a horizontal displacement coefficient c of the horizontal component of the affine motion model and a vertical displacement coefficient d of the vertical component of the affine motion model, so that the affine motion model is of the form:
    $$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
  • the calculating, by using the affine motion model and the motion vector 2-tuple, a motion vector of an arbitrary pixel sample in the current image block includes:
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • obtaining the values of the coefficients of the affine motion model by using a ratio of the difference between the horizontal components of the respective motion vectors of the two pixel samples to the distance between the two pixel samples, and a ratio of the difference between the vertical components of the respective motion vectors of the two pixel samples to the distance between the two pixel samples;
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • the calculating, by using the affine motion model and the motion vector 2-tuple, a motion vector of an arbitrary pixel sample in the current image block includes:
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_1, vy_1)$ is the motion vector of the right region pixel sample, and w is the distance between the two pixel samples.
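Read together with the general form vx = ax + by + c, vy = -bx + ay + d, these definitions are consistent with the model below. It is a reconstruction, not the published formula, and it assumes that the upper left pixel sample lies at the origin and the right region pixel sample at (w, 0):

$$\begin{cases}
vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\[6pt]
vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0
\end{cases}$$

Under that assumption, $a = (vx_1 - vx_0)/w$, $b = -(vy_1 - vy_0)/w$, $c = vx_0$ and $d = vy_0$; substituting (x, y) = (w, 0) returns $(vx_1, vy_1)$.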
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_2, vy_2)$ is the motion vector of the lower region pixel sample, and h is the distance between the two pixel samples.
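For the vertically separated pair, a model consistent with these definitions and with the general form vx = ax + by + c, vy = -bx + ay + d is given below. It is a reconstruction, not the published formula, and it assumes that the upper left pixel sample lies at the origin and the lower region pixel sample at (0, h):

$$\begin{cases}
vx = \dfrac{vy_2 - vy_0}{h}\,x + \dfrac{vx_2 - vx_0}{h}\,y + vx_0 \\[6pt]
vy = -\dfrac{vx_2 - vx_0}{h}\,x + \dfrac{vy_2 - vy_0}{h}\,y + vy_0
\end{cases}$$

Here $a = (vy_2 - vy_0)/h$ and $b = (vx_2 - vx_0)/h$; substituting (x, y) = (0, h) returns $(vx_2, vy_2)$.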
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_3, vy_3)$ is the motion vector of the lower right region pixel sample, $h_1$ is the distance between the two pixel samples in the vertical direction, $w_1$ is the distance between the two pixel samples in the horizontal direction, and $w_1^2 + h_1^2$ is the square of the distance between the two pixel samples.
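For the diagonally separated pair, solving vx = ax + by + c, vy = -bx + ay + d at the two sample positions gives the model below. It is a reconstruction, not the published formula, and it assumes that the upper left pixel sample lies at the origin and the lower right region pixel sample at $(w_1, h_1)$:

$$\begin{cases}
vx = \dfrac{(vx_3 - vx_0)\,w_1 + (vy_3 - vy_0)\,h_1}{w_1^2 + h_1^2}\,x + \dfrac{(vx_3 - vx_0)\,h_1 - (vy_3 - vy_0)\,w_1}{w_1^2 + h_1^2}\,y + vx_0 \\[8pt]
vy = -\dfrac{(vx_3 - vx_0)\,h_1 - (vy_3 - vy_0)\,w_1}{w_1^2 + h_1^2}\,x + \dfrac{(vx_3 - vx_0)\,w_1 + (vy_3 - vy_0)\,h_1}{w_1^2 + h_1^2}\,y + vy_0
\end{cases}$$

Each coefficient is a weighted sum of motion vector component differences divided by the squared distance, and substituting (x, y) = $(w_1, h_1)$ returns $(vx_3, vy_3)$.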
  • the method further includes:
  • An eleventh aspect of the present invention provides an image processing apparatus, including:
  • an obtaining unit configured to obtain a motion vector 2-tuple of the current image block, where the motion vector 2-tuple includes a motion vector of each of the 2 pixel samples in the video frame to which the current image block belongs;
  • a calculating unit configured to calculate, by using an affine motion model and a motion vector 2-tuple obtained by the obtaining unit, a motion vector of an arbitrary pixel sample in the current image block;
  • the affine motion model is in the following form:
    $$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
  • wherein (x, y) is the coordinate of the arbitrary pixel sample, vx is the horizontal component of the motion vector of the arbitrary pixel sample, and vy is the vertical component of the motion vector of the arbitrary pixel sample;
  • in the equation vx = ax + by, a is the horizontal coordinate coefficient of the horizontal component of the affine motion model, and b is the vertical coordinate coefficient of the horizontal component of the affine motion model;
  • in the equation vy = -bx + ay, a is the vertical coordinate coefficient of the vertical component of the affine motion model, and -b is the horizontal coordinate coefficient of the vertical component of the affine motion model.
  • the affine motion model further includes a horizontal displacement coefficient c of the horizontal component of the affine motion model and a vertical displacement coefficient d of the vertical component of the affine motion model, so that the affine motion model is of the form:
    $$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
  • the calculating unit is specifically configured to:
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • the calculating unit is specifically configured to:
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • the calculating unit is specifically configured to:
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_1, vy_1)$ is the motion vector of the right region pixel sample, and w is the distance between the two pixel samples.
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_2, vy_2)$ is the motion vector of the lower region pixel sample, and h is the distance between the two pixel samples.
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_3, vy_3)$ is the motion vector of the lower right region pixel sample, $h_1$ is the distance between the two pixel samples in the vertical direction, $w_1$ is the distance between the two pixel samples in the horizontal direction, and $w_1^2 + h_1^2$ is the square of the distance between the two pixel samples.
  • the device further includes an encoding unit, configured to perform motion compensated predictive coding on the arbitrary pixel sample in the current image block by using the motion vector of the arbitrary pixel sample in the current image block calculated by the calculating unit.
  • the device further includes a decoding unit, configured to perform motion compensation decoding on the arbitrary pixel sample by using the motion vector of the arbitrary pixel sample in the current image block calculated by the calculating unit, to obtain a pixel reconstruction value of the arbitrary pixel sample.
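Motion compensation with a per-sample motion vector can be sketched as follows. The sketch makes several simplifying assumptions that are not stated in the text: motion vectors are rounded to whole pixels instead of using sub-pixel interpolation, reference coordinates are clamped at the frame border, and the decoder reconstructs by adding a decoded residual to the prediction. The names `predict_block`, `reconstruct`, and `mv_at` are hypothetical.

```python
def predict_block(ref, x0, y0, w, h, mv_at):
    """Predict a w-by-h block whose top-left corner is (x0, y0).

    ref is a 2-D list indexed as ref[row][col]; mv_at(x, y) returns the
    motion vector (vx, vy) of the sample at block-relative position (x, y).
    """
    rows, cols = len(ref), len(ref[0])
    pred = []
    for y in range(h):
        row = []
        for x in range(w):
            vx, vy = mv_at(x, y)
            # Whole-pixel displacement, clamped to the reference frame.
            sx = min(max(x0 + x + round(vx), 0), cols - 1)
            sy = min(max(y0 + y + round(vy), 0), rows - 1)
            row.append(ref[sy][sx])
        pred.append(row)
    return pred


def reconstruct(pred, residual):
    """Decoder side (assumed): reconstruction = prediction + residual."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]
```

Any function that evaluates the affine motion model at (x, y) can be passed as `mv_at`, so the same routine serves both per-pixel and per-pixel-block motion vectors.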
  • a twelfth aspect of the embodiments of the present invention provides an image processing apparatus, including:
  • the processor is configured, by invoking code or instructions stored in the memory, to obtain a motion vector 2-tuple of a current image block, where the motion vector 2-tuple includes the respective motion vectors of 2 pixel samples in the video frame to which the current image block belongs;
  • the affine motion model is in the following form:
    $$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
  • wherein (x, y) is the coordinate of the arbitrary pixel sample, vx is the horizontal component of the motion vector of the arbitrary pixel sample, and vy is the vertical component of the motion vector of the arbitrary pixel sample;
  • in the equation vx = ax + by, a is the horizontal coordinate coefficient of the horizontal component of the affine motion model, and b is the vertical coordinate coefficient of the horizontal component of the affine motion model;
  • in the equation vy = -bx + ay, a is the vertical coordinate coefficient of the vertical component of the affine motion model, and -b is the horizontal coordinate coefficient of the vertical component of the affine motion model.
  • the affine motion model further includes a horizontal displacement coefficient c of the horizontal component of the affine motion model and a vertical displacement coefficient d of the vertical component of the affine motion model, so that the affine motion model is of the form:
    $$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
  • the processor is configured to obtain the values of the coefficients of the affine motion model by using the respective motion vectors of the two pixel samples and the positions of the two pixel samples;
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • in the aspect of calculating a motion vector of an arbitrary pixel sample in the current image block by using the affine motion model and the motion vector 2-tuple, the processor is configured to obtain the values of the coefficients of the affine motion model by using a ratio of the difference between the horizontal components of the respective motion vectors of the two pixel samples to the distance between the two pixel samples, and a ratio of the difference between the vertical components of the respective motion vectors of the 2 pixel samples to the distance between the 2 pixel samples;
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
  • in the aspect of calculating a motion vector of an arbitrary pixel sample in the current image block by using the affine motion model and the motion vector 2-tuple, the processor is configured to obtain the values of the coefficients of the affine motion model by using a ratio of a weighted sum of the components of the respective motion vectors of the two pixel samples to the distance between the two pixel samples or to the square of the distance between the two pixel samples;
  • a motion vector of an arbitrary pixel sample in the current image block is obtained using the affine motion model and values of coefficients of the affine motion model.
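The weighted-sum variant can be illustrated for two pixel samples that are separated both horizontally and vertically. The sketch assumes the first sample lies at the origin and the second at (w1, h1), and it solves vx = ax + by + c, vy = -bx + ay + d for the four coefficients; the function names are hypothetical.

```python
def coeffs_from_diagonal(mv0, mv3, w1, h1):
    """Coefficients (a, b, c, d) from a sample at (0, 0) with motion vector
    mv0 and a sample at (w1, h1) with motion vector mv3."""
    dvx = mv3[0] - mv0[0]
    dvy = mv3[1] - mv0[1]
    dist_sq = w1 * w1 + h1 * h1            # squared distance between the samples
    a = (dvx * w1 + dvy * h1) / dist_sq    # weighted sum over squared distance
    b = (dvx * h1 - dvy * w1) / dist_sq
    return a, b, mv0[0], mv0[1]


def mv_from_coeffs(x, y, coeffs):
    """Evaluate vx = a*x + b*y + c and vy = -b*x + a*y + d."""
    a, b, c, d = coeffs
    return (a * x + b * y + c, -b * x + a * y + d)
```

When h1 is 0 the expressions reduce to ratios of component differences to the distance w1, which is the simpler ratio-based variant described earlier.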
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_1, vy_1)$ is the motion vector of the right region pixel sample, and w is the distance between the two pixel samples.
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_2, vy_2)$ is the motion vector of the lower region pixel sample, and h is the distance between the two pixel samples.
  • the affine motion model is specifically:
  • wherein $(vx_0, vy_0)$ is the motion vector of the upper left pixel sample, $(vx_3, vy_3)$ is the motion vector of the lower right region pixel sample, $h_1$ is the distance between the two pixel samples in the vertical direction, $w_1$ is the distance between the two pixel samples in the horizontal direction, and $w_1^2 + h_1^2$ is the square of the distance between the two pixel samples.
  • the processor is further configured to: after calculating the motion vector of an arbitrary pixel sample in the current image block by using the affine motion model and the motion vector 2-tuple, perform motion compensated predictive coding on the arbitrary pixel sample in the current image block by using the calculated motion vector of the arbitrary pixel sample in the current image block.
  • the processor is further configured to: after determining the predicted pixel value of the pixel of the arbitrary pixel sample in the current image block, perform motion compensation decoding on the arbitrary pixel sample by using the calculated motion vector of the arbitrary pixel sample in the current image block, to obtain a pixel reconstruction value of the arbitrary pixel sample.
  • An image processing method comprising:
  • the affine motion model is in the following form:
    $$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
  • wherein (x, y) is the coordinate of the arbitrary pixel sample, vx is the horizontal component of the motion vector of the arbitrary pixel sample, and vy is the vertical component of the motion vector of the arbitrary pixel sample;
  • in the equation vx = ax + by, a is the horizontal coordinate coefficient of the horizontal component of the affine motion model, and b is the vertical coordinate coefficient of the horizontal component of the affine motion model;
  • in the equation vy = -bx + ay, a is the vertical coordinate coefficient of the vertical component of the affine motion model, -b is the horizontal coordinate coefficient of the vertical component of the affine motion model, and the coefficients of the affine motion model include a and b;
  • the coefficients of the affine motion model further include a horizontal displacement coefficient c of the horizontal component of the affine motion model and a vertical displacement coefficient d of the vertical component of the affine motion model, so that the affine motion model is of the form:
    $$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
  • a fourteenth aspect of the embodiments of the present invention provides an image processing apparatus, including:
  • a calculating unit configured to calculate a motion vector of an arbitrary pixel sample in the current image block by using a coefficient of the affine motion model obtained by the obtaining unit and the affine model;
  • a prediction unit configured to determine a predicted pixel value of the pixel of the arbitrary pixel sample by using the motion vector of the arbitrary pixel sample calculated by the calculating unit;
  • the affine motion model is in the following form:
    $$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
  • wherein (x, y) is the coordinate of the arbitrary pixel sample, vx is the horizontal component of the motion vector of the arbitrary pixel sample, and vy is the vertical component of the motion vector of the arbitrary pixel sample;
  • in the equation vx = ax + by, a is the horizontal coordinate coefficient of the horizontal component of the affine motion model, and b is the vertical coordinate coefficient of the horizontal component of the affine motion model;
  • in the equation vy = -bx + ay, a is the vertical coordinate coefficient of the vertical component of the affine motion model, -b is the horizontal coordinate coefficient of the vertical component of the affine motion model, and the coefficients of the affine motion model include a and b;
  • the coefficients of the affine motion model further include a horizontal displacement coefficient c of the horizontal component of the affine motion model and a vertical displacement coefficient d of the vertical component of the affine motion model, so that the affine motion model is of the form:
    $$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
  • pixel value prediction is performed on the current image block by using an affine motion model and a merged motion information unit set i, wherein each motion information unit in the merged motion information unit set i is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples. Because the selection range of the merged motion information unit set i becomes relatively small, the mechanism adopted in the conventional technology, in which motion information units of a plurality of pixel samples are selected through a large amount of computation from all possible candidate motion information unit sets of the plurality of pixel samples, is abandoned. This helps improve coding efficiency and also helps reduce the computational complexity of image prediction based on an affine motion model, which makes it feasible to introduce the affine motion model into a video coding standard. In addition, because the affine motion model is introduced, the motion of an object can be described more accurately, which helps improve prediction accuracy.
  • FIG. 1-a and FIG. 1-b are schematic diagrams of partitioning of several image blocks according to an embodiment of the present invention;
  • FIG. 1-c is a schematic flowchart of an image prediction method according to an embodiment of the present invention;
  • FIG. 1-d is a schematic diagram of an image block according to an embodiment of the present invention;
  • FIG. 2 is a schematic flowchart diagram of another image prediction method according to an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of a plurality of candidate motion information unit sets for determining pixel samples according to an embodiment of the present disclosure
  • FIG. 2-e is a schematic diagram of vertex coordinates of an image block x according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the affine motion of a pixel according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a pixel point rotation motion according to an embodiment of the present invention
  • FIG. 3 is a schematic flowchart diagram of another image prediction method according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart diagram of another image prediction method according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart diagram of another image prediction method according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of another image processing apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of another image processing apparatus according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of another image processing apparatus according to an embodiment of the present invention.
  • Embodiments of the present invention provide an image prediction method and a related device, in order to reduce the computational complexity of image prediction based on an affine motion model.
  • A video sequence consists of a series of pictures; a picture is further divided into slices, and a slice is further divided into blocks.
  • Video coding is performed in units of blocks; encoding may start from the upper-left corner of the picture and proceed row by row, from left to right and from top to bottom.
  • The concept of a block is further extended.
  • A macroblock (MB) can be further divided into a plurality of prediction blocks (partitions) that can be used for predictive coding.
  • The basic concepts of coding unit (abbreviation: CU), prediction unit (abbreviation: PU), and transform unit (abbreviation: TU) are also used; whether CU, PU, or TU, each is in essence a kind of block.
  • the PU can correspond to a prediction block and is the basic unit of predictive coding.
  • the CU is further divided into a plurality of PUs according to a division mode.
  • the TU can correspond to a transform block and is a basic unit for transforming the prediction residual.
  • high efficiency video coding (abbreviation: HEVC)
  • coding tree block (abbreviation: CTB)
  • The size of a coding unit may be one of four levels, 64×64, 32×32, 16×16, and 8×8, and coding units of each level may be divided into prediction units of different sizes according to intra prediction and inter prediction. FIG. 1-a and FIG. 1-b show the corresponding prediction unit division methods.
  • The skip mode and the direct mode are effective tools for improving coding efficiency; at low bit rates, blocks coded in these two modes can account for more than half of the entire coded sequence.
  • When the skip mode is used, only a skip mode flag needs to be transmitted in the bitstream. The motion vector of the current image block can then be derived from neighbouring motion vectors, and the value of the reference block is copied directly, according to that motion vector, as the reconstruction value of the current image block.
  • When the direct mode is used, the encoder can derive the motion vector of the current image block from neighbouring motion vectors, copy the value of the reference block directly, according to that motion vector, as the predicted value of the current image block, and use the predicted value to perform predictive coding on the current image block at the encoding end.
  • In high efficiency video coding (HEVC), the merge mode and the adaptive motion vector prediction (abbreviation: AMVP) mode are two inter-frame prediction tools.
  • The merge mode constructs a candidate motion information set from the motion information of coded blocks surrounding the current coding block (the motion information may include a motion vector (abbreviation: MV), a prediction direction, a reference frame index, and the like). By comparison, the candidate motion information with the highest coding efficiency may be selected as the motion information of the current coding block; the prediction value of the current coding block is found in the reference frame, the current coding block is predictively coded, and an index value indicating which neighbouring coded block the motion information was selected from is written to the bitstream.
  • When the adaptive motion vector prediction mode is used, the motion vector of a neighbouring coded block serves as the predictor of the motion vector of the current coding block. The motion vector with the highest coding efficiency may be selected to predict the motion vector of the current coding block, and an index value indicating which neighbouring motion vector was selected is written to the video bitstream.
  • the image prediction method provided by the embodiment of the present invention is described below.
  • The image prediction method provided by the embodiments of the present invention is performed by a video encoding device or a video decoding device, which may be any device that needs to output or store video, such as a laptop, tablet, PC, mobile phone, or video server.
  • In an embodiment of the image prediction method of the present invention, an image prediction method includes: determining 2 pixel samples in a current image block, and determining a candidate motion information unit set corresponding to each of the 2 pixel samples, where the candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit; determining a merged motion information unit set i including 2 motion information units, where each motion information unit in the merged motion information unit set i is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples, and a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; and performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i.
  • FIG. 1-c is a schematic flowchart of an image prediction method according to an embodiment of the present invention.
  • an image prediction method provided by the first embodiment of the present invention may include:
  • S101. Determine 2 pixel samples in the current image block, and determine a candidate motion information unit set corresponding to each of the 2 pixel samples.
  • The candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit.
  • the pixel samples mentioned in the embodiments of the present invention may be pixel points or pixel blocks including at least two pixel points.
  • The motion information unit mentioned in the embodiments of the present invention may include a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward. That is, one motion information unit may include one motion vector, or may include two motion vectors with different prediction directions.
  • If the prediction direction corresponding to a motion information unit is forward, the motion information unit includes a motion vector whose prediction direction is forward but does not include a motion vector whose prediction direction is backward. If the prediction direction corresponding to a motion information unit is backward, the motion information unit includes a motion vector whose prediction direction is backward but does not include a motion vector whose prediction direction is forward. If the prediction direction corresponding to a motion information unit is unidirectional, the motion information unit includes either a forward motion vector but no backward motion vector, or a backward motion vector but no forward motion vector. If the prediction direction corresponding to a motion information unit is bidirectional, the motion information unit includes both a motion vector whose prediction direction is forward and a motion vector whose prediction direction is backward.
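  • The relationship between the motion vectors a motion information unit carries and its prediction direction can be sketched as a small data structure. The type and field names below are hypothetical; the sketch only assumes that a unit holds an optional forward motion vector and an optional backward motion vector.

```python
from typing import NamedTuple, Optional, Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical) components


class MotionInfoUnit(NamedTuple):
    """Hypothetical container: at most one forward and one backward vector."""
    forward_mv: Optional[MotionVector] = None
    backward_mv: Optional[MotionVector] = None

    def prediction_direction(self) -> str:
        has_fwd = self.forward_mv is not None
        has_bwd = self.backward_mv is not None
        if has_fwd and has_bwd:
            return "bidirectional"
        if has_fwd:
            return "forward"
        if has_bwd:
            return "backward"
        raise ValueError("a motion information unit needs at least one motion vector")

    def is_unidirectional(self) -> bool:
        # Unidirectional covers both the forward-only and backward-only cases.
        return self.prediction_direction() in ("forward", "backward")

    def motion_vector_count(self) -> int:
        return sum(mv is not None for mv in (self.forward_mv, self.backward_mv))
```

  • With this representation, a set of 2 units can carry 2, 3, or 4 motion vectors, depending on how many of the units are bidirectional.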
  • The 2 pixel samples include two of the following pixel samples of the current image block: the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1.
  • The upper left pixel sample of the current image block may be the upper left vertex of the current image block or a pixel block in the current image block that includes the upper left vertex of the current image block; the lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that includes the lower left vertex of the current image block; the upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that includes the upper right vertex of the current image block; the central pixel sample a1 of the current image block is the central pixel point of the current image block or a pixel block in the current image block that includes the central pixel point of the current image block.
  • the size of the pixel block is, for example, 2*2, 1*2, 4*2, 4*4, or other size.
  • An image block may include a plurality of pixel blocks.
  • For an image block of size w*w, when w is an odd number (for example, w is equal to 3, 5, 7, or 11), the central pixel point of the image block is unique; when w is an even number (for example, w is equal to 4, 6, 8, or 16), the image block may have multiple central pixel points. In that case, the central sample of the image block may be any central pixel point or a designated central pixel point of the image block.
  • Alternatively, the central sample of the image block may be a pixel block in the image block that contains any one of the central pixel points, or a pixel block in the image block that contains the designated central pixel point.
  • For example, the image block of size 4*4 shown in FIG. 1-d has four central pixel points A1, A2, A3, and A4; the designated central pixel point may be pixel point A1 (upper left central pixel point), pixel point A2 (lower left central pixel point), pixel point A3 (upper right central pixel point), or pixel point A4 (lower right central pixel point), and so on.
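  • The odd and even cases can be sketched as follows for a square block of side w. The function name and the zero-based (x, y) coordinate convention, with x increasing to the right and y increasing downward, are assumptions made for illustration.

```python
from typing import List, Tuple


def central_pixels(w: int) -> List[Tuple[int, int]]:
    """Return the central pixel point(s) of a w*w block as (x, y) pairs.

    Odd w: a single central pixel point.
    Even w: four central pixel points, listed in the order
    upper left, lower left, upper right, lower right.
    """
    if w <= 0:
        raise ValueError("block size must be positive")
    if w % 2 == 1:
        m = w // 2
        return [(m, m)]
    lo, hi = w // 2 - 1, w // 2
    return [(lo, lo), (lo, hi), (hi, lo), (hi, hi)]
```

  • For an even w, a codec would still have to designate one of the four returned points (or a pixel block containing it) as the central sample.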
  • Determine a merged motion information unit set i including 2 motion information units.
  • Each motion information unit in the merged motion information unit set i is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples. A motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward.
  • For example, assume that the 2 pixel samples are a pixel sample 001 and a pixel sample 002, that the candidate motion information unit set corresponding to the pixel sample 001 is the candidate motion information unit set 011, and that the candidate motion information unit set corresponding to the pixel sample 002 is the candidate motion information unit set 022. The merged motion information unit set i then includes a motion information unit C01 and a motion information unit C02, where the motion information unit C01 may be selected from the candidate motion information unit set 011 and the motion information unit C02 may be selected from the candidate motion information unit set 022, and so on.
  • Assuming that the merged motion information unit set i includes the motion information unit C01 and the motion information unit C02, either of which may include a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward, the merged motion information unit set i may include 2 motion vectors (both forward, both backward, or one forward and one backward), 4 motion vectors (2 forward and 2 backward), or 3 motion vectors (1 forward and 2 backward, or 2 forward and 1 backward).
  • the current image block may be a current coding block or a current decoding block.
  • It can be seen that, in the technical solution of this embodiment, pixel value prediction is performed on the current image block by using the affine motion model and the merged motion information unit set i, where each motion information unit in the merged motion information unit set i is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples. Because the selection range of the merged motion information unit set i becomes relatively small, the conventional mechanism of screening out a motion information unit of multiple pixel samples through a large amount of computation over all possible candidate motion information unit sets of the multiple pixel samples is abandoned. This helps improve coding efficiency and reduce the computational complexity of image prediction based on the affine motion model, which makes it feasible to introduce the affine motion model into video coding standards.
  • Because the affine motion model is introduced, object motion can be described more accurately, which helps improve prediction accuracy.
  • Moreover, because the number of referenced pixel samples can be two, the computational complexity of image prediction based on the affine motion model is further reduced after the model is introduced, and the amount of affine parameter information or the number of motion vector residuals transmitted by the encoding end is also reduced.
  • the image prediction method provided by this embodiment may be applied to a video encoding process or may be applied to a video decoding process.
  • The merged motion information unit set i including 2 motion information units may be determined in various ways.
  • For example, determining the merged motion information unit set i including 2 motion information units includes: determining, from among N candidate merged motion information unit sets, the merged motion information unit set i including 2 motion information units, where each motion information unit included in each of the N candidate merged motion information unit sets is selected from at least part of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples, N is a positive integer, the N candidate merged motion information unit sets are different from each other, and each of the N candidate merged motion information unit sets includes 2 motion information units.
  • Two candidate merged motion information unit sets being different may mean that the motion information units they include are not completely the same.
  • Two motion information units being different may mean that the motion vectors they include are different, or that the prediction directions corresponding to those motion vectors are different, or that the reference frame indexes corresponding to those motion vectors are different.
  • Two motion information units being the same may mean that the motion vectors they include are the same, the prediction directions corresponding to those motion vectors are the same, and the reference frame indexes corresponding to those motion vectors are the same.
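  • One way to picture the N candidate merged motion information unit sets is as the distinct pairs formed by taking one unit from each of the 2 candidate sets. The sketch below is only an illustration: the names are hypothetical, each unit is simplified to a single motion vector, and no further filtering conditions are applied. Two units compare equal only when motion vector, prediction direction, and reference frame index all match.

```python
from itertools import product
from typing import List, NamedTuple, Sequence, Tuple


class MotionInfoUnit(NamedTuple):
    """Hypothetical single-vector unit; tuple equality compares every field."""
    mv: Tuple[int, int]
    direction: str      # "forward" or "backward"
    ref_idx: int


MergedSet = Tuple[MotionInfoUnit, MotionInfoUnit]


def candidate_merged_sets(set_a: Sequence[MotionInfoUnit],
                          set_b: Sequence[MotionInfoUnit]) -> List[MergedSet]:
    """Pair every unit of sample 1 with every unit of sample 2, dropping repeats."""
    seen = set()
    result: List[MergedSet] = []
    for pair in product(set_a, set_b):
        if pair not in seen:        # keep the candidate sets mutually different
            seen.add(pair)
            result.append(pair)
    return result
```

  • The length of the returned list plays the role of N; in practice the list would be pruned further before one set is chosen.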
  • Determining, from among the N candidate merged motion information unit sets, the merged motion information unit set i including 2 motion information units may include: determining the merged motion information unit set i including 2 motion information units from among the N candidate merged motion information unit sets based on an identifier of the merged motion information unit set i obtained from the video bitstream.
  • The method may further include: writing the identifier of the merged motion information unit set i into the video bitstream.
  • the identifier of the merged motion information unit set i may be any information that can identify the merged motion information unit set i.
  • the identifier of the merged motion information unit set i may be a merged motion information unit set i in the merged motion information.
  • The method further includes: obtaining motion vector predictors of the 2 pixel samples by using motion vectors of pixel samples that are spatially or temporally adjacent to the 2 pixel samples, obtaining motion vector residuals of the 2 pixel samples according to the motion vector predictors of the 2 pixel samples, and writing the motion vector residuals of the 2 pixel samples into the video bitstream.
  • The method further includes: decoding motion vector residuals of the 2 pixel samples from the video bitstream, obtaining motion vector predictors of the 2 pixel samples by using motion vectors of pixel samples that are spatially or temporally adjacent to the 2 pixel samples, and obtaining the motion vectors of the 2 pixel samples based on the motion vector predictors and the motion vector residuals of the 2 pixel samples, respectively.
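  • The encoder-side and decoder-side steps are mirror images of each other, which the following sketch illustrates. It assumes integer motion vectors and takes the predictor as an input, because how the predictor is derived from the adjacent pixel samples is not specified here; the function names are hypothetical.

```python
from typing import Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical) components


def motion_vector_residual(mv: MotionVector, predictor: MotionVector) -> MotionVector:
    """Encoder side: residual = actual motion vector - predictor."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


def reconstruct_motion_vector(predictor: MotionVector,
                              residual: MotionVector) -> MotionVector:
    """Decoder side: motion vector = predictor + decoded residual."""
    return (predictor[0] + residual[0], predictor[1] + residual[1])


# Hypothetical values for one of the 2 pixel samples.
predictor = (6, -2)
residual = motion_vector_residual((7, -4), predictor)
restored = reconstruct_motion_vector(predictor, residual)
```

  • The round trip is only lossless if encoder and decoder derive the same predictor from the same adjacent pixel samples.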
  • Determining the merged motion information unit set i including 2 motion information units from among the N candidate merged motion information unit sets may include: determining, based on distortion or rate-distortion cost, the merged motion information unit set i including 2 motion vectors from among the N candidate merged motion information unit sets.
  • The rate-distortion cost corresponding to the merged motion information unit set i is less than or equal to the rate-distortion cost corresponding to any other merged motion information unit set among the N candidate merged motion information unit sets.
  • The distortion corresponding to the merged motion information unit set i is less than or equal to the distortion corresponding to any other merged motion information unit set among the N candidate merged motion information unit sets.
  • The rate-distortion cost corresponding to a candidate merged motion information unit set may be, for example, the rate-distortion cost corresponding to the predicted pixel values of an image block (for example, the current image block) obtained by performing pixel value prediction on the image block using that candidate merged motion information unit set.
  • The distortion corresponding to a candidate merged motion information unit set among the N candidate merged motion information unit sets (for example, the merged motion information unit set i) may be, for example, the distortion of an image block (for example, the current image block), that is, the distortion between the original pixel values of the image block and the predicted pixel values obtained by performing pixel value prediction on the image block using that candidate merged motion information unit set.
  • Specifically, this distortion between the original pixel values and the predicted pixel values may be, for example, the sum of squared differences (SSD), the sum of absolute differences (SAD), the sum of errors, and/or another distortion parameter capable of measuring distortion.
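  • A distortion-based selection among the N candidate merged motion information unit sets can be sketched as below. The `predict` callback stands in for affine prediction with a given candidate set and is a hypothetical placeholder; pixel values are flattened into plain sequences for brevity.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def sad(original: Sequence[int], predicted: Sequence[int]) -> int:
    """Sum of absolute differences between original and predicted pixels."""
    return sum(abs(o - p) for o, p in zip(original, predicted))


def ssd(original: Sequence[int], predicted: Sequence[int]) -> int:
    """Sum of squared differences between original and predicted pixels."""
    return sum((o - p) ** 2 for o, p in zip(original, predicted))


def select_merged_set(original: Sequence[int],
                      candidates: Sequence[T],
                      predict: Callable[[T], Sequence[int]],
                      distortion: Callable[[Sequence[int], Sequence[int]], int] = sad) -> int:
    """Return the index of the candidate set whose prediction has least distortion."""
    if not candidates:
        raise ValueError("at least one candidate set is required")
    costs = [distortion(original, predict(c)) for c in candidates]
    return costs.index(min(costs))   # first minimum wins on ties
```

  • The returned index could serve as the identifier written to the bitstream; a rate-distortion variant would add a bit-cost term to each entry of `costs`.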
  • N is a positive integer.
  • N described above may be, for example, equal to 1, 2, 3, 4, 5, 6, 8, or other values.
  • The motion information units in any one of the N candidate merged motion information unit sets may be different from each other.
  • The N candidate merged motion information unit sets satisfy at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition.
  • The first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion.
  • For example, when all the motion vectors corresponding to a first prediction direction in a candidate merged motion information unit set are equal, the motion mode of the current image block indicated by the motion information units in that candidate merged motion information unit set may be considered translational motion; otherwise, it may be considered non-translational motion, where the first prediction direction is forward or backward.
  • For another example, when all the motion vectors whose prediction direction is forward in a candidate merged motion information unit set are equal and all the motion vectors whose prediction direction is backward are equal, the motion mode of the current image block indicated by the motion information units in that candidate merged motion information unit set may be considered translational motion; otherwise, it may be considered non-translational motion.
  • The second condition includes that the 2 motion information units in any one of the N candidate merged motion information unit sets correspond to the same prediction direction.
  • For example, when both motion information units include a motion vector whose prediction direction is forward and a motion vector whose prediction direction is backward, the 2 motion information units correspond to the same prediction direction.
  • When one of the 2 motion information units includes both a forward motion vector and a backward motion vector, and the other motion information unit includes a forward motion vector but no backward motion vector, or includes a backward motion vector but no forward motion vector, the prediction directions corresponding to the 2 motion information units are different.
  • When one of the 2 motion information units includes a forward motion vector but no backward motion vector, and the other motion information unit includes a backward motion vector but no forward motion vector, the prediction directions corresponding to the 2 motion information units are different.
  • When both motion information units include a forward motion vector but neither includes a backward motion vector, the 2 motion information units correspond to the same prediction direction.
  • When both motion information units include a backward motion vector but neither includes a forward motion vector, the 2 motion information units correspond to the same prediction direction.
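  • The cases for the second condition reduce to comparing which motion vectors each unit carries. The sketch below assumes a hypothetical unit type with optional forward and backward vectors.

```python
from typing import NamedTuple, Optional, Tuple

MotionVector = Tuple[int, int]


class MotionInfoUnit(NamedTuple):
    """Hypothetical unit with an optional vector per prediction direction."""
    forward_mv: Optional[MotionVector] = None
    backward_mv: Optional[MotionVector] = None


def same_prediction_direction(a: MotionInfoUnit, b: MotionInfoUnit) -> bool:
    """Second condition: both units carry the same combination of directions."""
    a_dirs = (a.forward_mv is not None, a.backward_mv is not None)
    b_dirs = (b.forward_mv is not None, b.backward_mv is not None)
    return a_dirs == b_dirs
```

  • Note that the vectors themselves may differ; only the presence of a forward or backward vector is compared.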
  • The third condition includes that the reference frame indexes corresponding to the 2 motion information units in any one of the N candidate merged motion information unit sets are the same.
  • For example, when both motion information units include a forward motion vector and a backward motion vector, the reference frame indexes corresponding to the forward motion vectors in the 2 motion information units are the same, and the reference frame indexes corresponding to the backward motion vectors in the 2 motion information units are the same, the reference frame indexes corresponding to the 2 motion information units are the same.
  • When one of the 2 motion information units includes both a forward motion vector and a backward motion vector, and the other motion information unit includes a forward motion vector but no backward motion vector, or includes a backward motion vector but no forward motion vector, the prediction directions corresponding to the 2 motion information units are different, and the reference frame indexes corresponding to the 2 motion information units may be considered different.
  • When one of the 2 motion information units includes a forward motion vector but no backward motion vector, and the other motion information unit includes a backward motion vector but no forward motion vector, the reference frame indexes corresponding to the 2 motion information units may be considered different.
  • When both motion information units include a forward motion vector but no backward motion vector, and the reference frame indexes corresponding to the forward motion vectors in the 2 motion information units are the same, the reference frame indexes corresponding to the 2 motion information units are the same.
  • When both motion information units include a backward motion vector but no forward motion vector, and the reference frame indexes corresponding to the backward motion vectors in the 2 motion information units are the same, the reference frame indexes corresponding to the 2 motion information units are the same.
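  • The third condition can be sketched by comparing reference frame indexes per prediction direction. The type below is hypothetical and stores `None` for a direction that is absent. The sketch assumes that two units with the same single prediction direction and equal reference frame index for that direction count as having the same reference frame index.

```python
from typing import NamedTuple, Optional


class ReferenceIndexes(NamedTuple):
    """Hypothetical per-unit record; None means no vector in that direction."""
    forward_ref: Optional[int] = None
    backward_ref: Optional[int] = None


def same_reference_frame_index(a: ReferenceIndexes, b: ReferenceIndexes) -> bool:
    """Third condition: the indexes match in every prediction direction.

    A direction present in one unit but absent in the other (None versus an
    integer) makes the comparison fail, so units with different prediction
    directions are never treated as having the same reference frame index.
    """
    return a.forward_ref == b.forward_ref and a.backward_ref == b.backward_ref
```

  • Encoding absence as `None` lets a single comparison cover the bidirectional, forward-only, and backward-only cases.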
  • The fourth condition includes that the absolute value of the difference between the motion vector horizontal components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the motion vector horizontal component of any one motion information unit in any one of the N candidate merged motion information unit sets and the motion vector horizontal component of a pixel sample Z is less than or equal to the horizontal component threshold,
  • where the pixel sample Z of the current image block is different from either of the 2 pixel samples.
  • The horizontal component threshold may be, for example, equal to 1/3, 1/2, 2/3, or 3/4 of the width of the current image block, or another size.
  • The fifth condition includes that the absolute value of the difference between the motion vector vertical components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold,
  • or that the absolute value of the difference between the motion vector vertical component of any one motion information unit in any one of the N candidate merged motion information unit sets and the motion vector vertical component of the pixel sample Z is less than or equal to the vertical component threshold, where the pixel sample Z of the current image block is different from either of the 2 pixel samples.
  • The vertical component threshold may be, for example, equal to 1/3, 1/2, 2/3, or 3/4 of the height of the current image block, or another size.
  • The pixel sample Z may be, for example, the lower left pixel sample, the central pixel sample, or another pixel sample of the current image block. Other cases can be deduced by analogy.
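  • The fourth and fifth conditions are simple threshold checks on the component differences. In the sketch below the function names are hypothetical, and the default fraction of 1/2 is just one of the example threshold values; the same functions apply when the second vector belongs to the pixel sample Z.

```python
from typing import Tuple

MotionVector = Tuple[float, float]  # (horizontal, vertical) components


def satisfies_fourth_condition(mv_a: MotionVector, mv_b: MotionVector,
                               block_width: float, fraction: float = 0.5) -> bool:
    """Horizontal component difference within fraction * block width."""
    return abs(mv_a[0] - mv_b[0]) <= fraction * block_width


def satisfies_fifth_condition(mv_a: MotionVector, mv_b: MotionVector,
                              block_height: float, fraction: float = 0.5) -> bool:
    """Vertical component difference within fraction * block height."""
    return abs(mv_a[1] - mv_b[1]) <= fraction * block_height
```

  • Tying the thresholds to the block dimensions bounds how much the motion vectors of different samples within one block may diverge.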
  • the candidate motion information unit set corresponding to the upper left pixel sample of the current image block includes motion information units of x1 pixel samples, where the x1 pixel samples include At least one pixel sample adjacent to an upper left pixel sample spatial domain of the current image block and/or at least one pixel sample adjacent to an upper left pixel sample time domain of the current image block, the x1 being a positive integer.
  • the x1 pixel samples include only at least one pixel sample adjacent to the upper left pixel sample spatial domain of the current image block and/or at least one pixel sample adjacent to the upper left pixel sample time domain of the current image block.
  • x1 above may be equal to 1, 2, 3, 4, 5, 6, or other values, for example.
  • the x1 pixel samples are adjacent to a time domain of a video frame to which the current image block belongs a pixel sample having the same position as the upper left pixel sample of the current image block, a spatial adjacent pixel sample of the left side of the current image block, and a spatially adjacent pixel sample of the upper left space of the current image block. At least one of the spatially adjacent pixel samples of the upper side of the current image block.
  • The candidate motion information unit set corresponding to the upper right pixel sample of the current image block includes motion information units of x2 pixel samples, where the x2 pixel samples include at least one pixel sample spatially adjacent to the upper right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper right pixel sample of the current image block, and x2 is a positive integer.
  • For example, x2 above may be equal to 1, 2, 3, 4, 5, 6, or another value.
  • The x2 pixel samples include at least one of: a pixel sample that has the same position as the upper right pixel sample of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs; a spatially adjacent pixel sample on the right side of the current image block; a spatially adjacent pixel sample on the upper right of the current image block; and a spatially adjacent pixel sample on the upper side of the current image block.
  • The candidate motion information unit set corresponding to the lower left pixel sample of the current image block includes motion information units of x3 pixel samples, where the x3 pixel samples include at least one pixel sample spatially adjacent to the lower left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower left pixel sample of the current image block, and x3 is a positive integer.
  • The x3 pixel samples include only at least one pixel sample spatially adjacent to the lower left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower left pixel sample of the current image block.
  • For example, x3 above may be equal to 1, 2, 3, 4, 5, 6, or another value.
  • The x3 pixel samples include at least one of: a pixel sample that has the same position as the lower left pixel sample of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs; a spatially adjacent pixel sample on the left side of the current image block; a spatially adjacent pixel sample on the lower left of the current image block; and a spatially adjacent pixel sample on the lower side of the current image block.
  • The candidate motion information unit set corresponding to the central pixel sample a1 of the current image block includes motion information units of x5 pixel samples, where one of the x5 pixel samples is the pixel sample a2.
  • For example, the x5 pixel samples include only the pixel sample a2.
  • The position of the central pixel sample a1 in the video frame to which the current image block belongs is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame to which the current image block belongs, and x5 is a positive integer.
  • Performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to a motion vector whose prediction direction is a first prediction direction in the merged motion information unit set i is different from the reference frame index of the current image block, performing scaling processing on the merged motion information unit set i so that the motion vector whose prediction direction is the first prediction direction in the merged motion information unit set i is scaled to the reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i after the scaling processing, where the first prediction direction is forward or backward;
  • or, performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to the motion vector whose prediction direction is forward in the merged motion information unit set i is different from the forward reference frame index of the current image block, and the reference frame index corresponding to the motion vector whose prediction direction is backward in the merged motion information unit set i is different from the backward reference frame index of the current image block, performing scaling processing on the merged motion information unit set i so that the forward motion vector in the merged motion information unit set i is scaled to the forward reference frame of the current image block and the backward motion vector in the merged motion information unit set i is scaled to the backward reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i after the scaling processing.
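  • The text does not state how a motion vector is scaled to another reference frame. The sketch below assumes linear scaling by the ratio of temporal distances, a technique commonly used in video codecs; the picture order count (POC) values, the dictionary layout, and all function names are hypothetical.

```python
def scale_motion_vector(mv, cur_poc, src_ref_poc, dst_ref_poc):
    """Rescale `mv`, which points from the current frame to the frame at
    `src_ref_poc`, so that it points to the frame at `dst_ref_poc`.

    Assumption: motion is linear in time, so the vector is multiplied by
    the ratio of the two temporal distances.
    """
    src_dist = cur_poc - src_ref_poc
    dst_dist = cur_poc - dst_ref_poc
    if src_dist == 0:
        raise ValueError("source reference frame must differ from the current frame")
    factor = dst_dist / src_dist
    return (mv[0] * factor, mv[1] * factor)


def scale_unit_to_reference(unit, cur_poc, ref_pocs, target_ref_idx):
    """Return a copy of `unit` whose motion vector refers to `target_ref_idx`.

    `unit` is a dict {"mv": (x, y), "ref_idx": int}; `ref_pocs[i]` is the
    POC of the reference frame with index i (illustrative layout).
    """
    if unit["ref_idx"] == target_ref_idx:
        return dict(unit)  # already points at the block's reference frame
    mv = scale_motion_vector(unit["mv"], cur_poc,
                             ref_pocs[unit["ref_idx"]],
                             ref_pocs[target_ref_idx])
    return {"mv": mv, "ref_idx": target_ref_idx}


# Example: the unit refers to index 1 (POC 4); the block uses index 0 (POC 6).
ref_pocs = [6, 4]
unit = {"mv": (4.0, -2.0), "ref_idx": 1}
scaled = scale_unit_to_reference(unit, cur_poc=8, ref_pocs=ref_pocs, target_ref_idx=0)
print(scaled)
```

  • In this example the temporal distance halves (from 4 to 2 frames), so the motion vector is halved as well.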
  • Performing pixel value prediction on the current image block by using a non-translational motion model and the merged motion information unit set i after the scaling processing may include: performing motion estimation processing on the motion vectors in the merged motion information unit set i after the scaling processing to obtain the merged motion information unit set i after the motion estimation processing, and performing pixel value prediction on the current image block by using the non-translational motion model and the merged motion information unit set i after the motion estimation processing.
  • Performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i includes: calculating the motion vector of each pixel in the current image block by using the affine motion model and the merged motion information unit set i, and determining the predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel; or calculating the motion vector of each pixel block in the current image block by using the affine motion model and the merged motion information unit set i, and determining the predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block.
  • Tests show that if the affine motion model and the merged motion information unit set i are first used to calculate the motion vector of each pixel block in the current image block, and the calculated motion vector of each pixel block is then used to determine the predicted pixel value of each pixel of each pixel block, then using the pixel block as the granularity for motion vector calculation helps to greatly reduce computational complexity.
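  • The difference in granularity can be illustrated with a short sketch. It assumes a four-parameter affine form driven by the motion vectors of the upper left and upper right pixel samples, and it assumes that each pixel block is represented by its center point; both choices, the 4*4 block size, and the function names are illustrative assumptions rather than details confirmed by the text.

```python
def affine_mv(x, y, mv0, mv1, w):
    """Motion vector at (x, y) from the upper left (mv0) and upper right
    (mv1) control motion vectors of a block of width w (assumed form)."""
    a = (mv1[0] - mv0[0]) / w  # ratio of horizontal-component difference to w
    b = (mv1[1] - mv0[1]) / w  # ratio of vertical-component difference to w
    return (a * x - b * y + mv0[0], b * x + a * y + mv0[1])


def block_motion_vectors(width, height, mv0, mv1, block=4):
    """Compute one motion vector per block x block pixel block.

    The block center is used as the representative position (assumption).
    Keys are the upper left corner coordinates of each pixel block.
    """
    mvs = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            cx = bx + block / 2
            cy = by + block / 2
            mvs[(bx, by)] = affine_mv(cx, cy, mv0, mv1, width)
    return mvs


field = block_motion_vectors(16, 16, mv0=(0.0, 0.0), mv1=(4.0, 0.0))
print(len(field))     # number of affine evaluations at 4*4 granularity
print(field[(0, 0)])  # motion vector shared by all pixels of the first block
```

  • For a 16*16 block, per-pixel granularity would need 256 evaluations of the model, whereas 4*4 pixel blocks need 16.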
  • Performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: performing motion estimation processing on the motion vectors in the merged motion information unit set i to obtain the merged motion information unit set i after the motion estimation processing, and performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i after the motion estimation processing.
  • Performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i includes: obtaining the motion vector of an arbitrary pixel sample in the current image block by using the ratio of the difference between the motion vector horizontal components of the two motion information units in the merged motion information unit set i to the length or width of the current image block, and the ratio of the difference between the motion vector vertical components of the two motion information units in the merged motion information unit set i to the length or width of the current image block.
  • Alternatively, performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: obtaining the motion vector of an arbitrary pixel sample in the current image block by using the ratio of the difference between the motion vector horizontal components of the 2 pixel samples to the length or width of the current image block, and the ratio of the difference between the motion vector vertical components of the 2 pixel samples to the length or width of the current image block, where the motion vectors of the 2 pixel samples are obtained based on the motion vectors of the two motion information units in the merged motion information unit set i (for example, the motion vectors of the 2 pixel samples are the motion vectors of the two motion information units in the merged motion information unit set i, or the motion vectors of the 2 pixel samples are obtained based on the motion vectors of the two motion information units in the merged motion information unit set i and prediction residuals).
  • The horizontal coordinate coefficient of the motion vector horizontal component of the 2 pixel samples is equal to the vertical coordinate coefficient of the motion vector vertical component, and the vertical coordinate coefficient of the motion vector horizontal component of the 2 pixel samples is opposite to the horizontal coordinate coefficient of the motion vector vertical component.
  • the affine motion model may be, for example, an affine motion model of the form:
  • where the motion vectors of the two pixel samples are (vx0, vy0) and (vx1, vy1), respectively, vx is the motion vector horizontal component of the pixel sample with coordinates (x, y) in the current image block, vy is the motion vector vertical component of the pixel sample with coordinates (x, y) in the current image block, and w is the length or width of the current image block.
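  • The formula itself is not reproduced in this text. One form that is consistent with the coefficient relations and symbol definitions given here, assuming the two pixel samples are the upper left pixel sample at coordinates (0, 0) and the upper right pixel sample at coordinates (w, 0), would be:

```latex
\begin{cases}
vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\[2ex]
vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0
\end{cases}
```

  • Under this assumption, substituting (x, y) = (0, 0) yields (vx0, vy0) and substituting (x, y) = (w, 0) yields (vx1, vy1). The coefficient of x in vx equals the coefficient of y in vy, and the coefficient of y in vx is the negative of the coefficient of x in vy, as the description requires.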
  • (vx2, vy2) is the motion vector of another pixel sample in the current image block that is different from the above two pixel samples.
  • For example, (vx2, vy2) may be the motion vector of the lower left pixel sample or the central pixel sample of the current image block.
  • As another example, (vx2, vy2) may be the motion vector of the upper right pixel sample or the central pixel sample of the current image block.
  • The coordinates of a pixel sample may be the coordinates of any one pixel in the pixel sample, or may be the coordinates of a specified pixel in the pixel sample (for example, the coordinates of the upper left pixel, the lower left pixel, the upper right pixel, or the center pixel in the pixel sample).
  • For other image blocks in the current video frame, pixel value prediction may be performed in a manner similar to the pixel value prediction manner corresponding to the current image block.
  • Some image blocks in the current video frame may also undergo pixel value prediction in a manner different from the pixel value prediction manner corresponding to the current image block.
  • FIG. 2-a is a schematic flowchart diagram of another image prediction method according to another embodiment of the present invention.
  • an image prediction method implemented in a video encoding apparatus is mainly described as an example.
  • another image prediction method provided by the second embodiment of the present invention may include:
  • the video encoding device determines two pixel samples in the current image block.
  • In this embodiment, the case in which the two pixel samples are two of the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1 of the current image block is mainly used as an example.
  • For example, the 2 pixel samples include the upper left pixel sample and the upper right pixel sample of the current image block.
  • Scenarios in which the two pixel samples are other pixel samples of the current image block can be deduced by analogy.
  • The upper left pixel sample of the current image block may be the upper left vertex of the current image block or a pixel block in the current image block that includes the upper left vertex of the current image block; the lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that includes the lower left vertex of the current image block; the upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that includes the upper right vertex of the current image block; and the central pixel sample a1 of the current image block is the central pixel of the current image block or a pixel block in the current image block that includes the central pixel of the current image block.
  • the size of the pixel block is, for example, 2*2, 1*2, 4*2, 4*4, or other sizes.
  • the video encoding apparatus determines a candidate motion information unit set corresponding to each of the two pixel samples.
  • The candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit.
  • the pixel samples mentioned in the embodiments of the present invention may be pixel points or pixel blocks including at least two pixel points.
  • the candidate motion information unit set S1 corresponding to the upper left pixel sample of the current image block may include motion information units of x1 pixel samples.
  • The x1 pixel samples include at least one of: the pixel sample Col-LT that has the same position as the upper left pixel sample LT of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs; the spatially adjacent image block C on the left of the current image block; the spatially adjacent image block A on the upper left of the current image block; and the spatially adjacent image block B on the upper side of the current image block.
  • For example, the motion information unit of the spatially adjacent image block C on the left of the current image block, the motion information unit of the spatially adjacent image block A on the upper left of the current image block, and the motion information unit of the spatially adjacent image block B on the upper side of the current image block may be acquired first, and the acquired motion information units of image blocks C, A, and B are added to the candidate motion information unit set corresponding to the upper left pixel sample of the current image block. If some or all of the motion information units of image blocks C, A, and B are the same, deduplication processing is further performed on the candidate motion information unit set S1.
  • If the motion information unit of the pixel sample Col-LT, which has the same position as the upper left pixel sample LT of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs, is different from every motion information unit in the candidate motion information unit set S1 after the deduplication processing, the motion information unit of the pixel sample Col-LT is added to the candidate motion information unit set S1 after the deduplication processing. If the number of motion information units in the candidate motion information unit set S1 is still fewer than three at this time, zero motion information units may be added to the candidate motion information unit set S1 until the number of motion information units in the candidate motion information unit set S1 is equal to three.
  • If the video frame to which the current image block belongs is a forward predicted frame, the zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is forward but may not include a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward predicted frame, the zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is backward but may not include a zero motion vector whose prediction direction is forward. In addition, if the video frame to which the current image block belongs is a bidirectional predicted frame, the zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is forward and a zero motion vector whose prediction direction is backward.
  • The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S1 may be different, and the corresponding reference frame index may be, for example, 0, 1, 2, 3, or another value.
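  • The construction of the candidate motion information unit set S1 can be sketched as follows. This is an illustration under stated assumptions: the `MotionInfo` type, the use of `None` for an unavailable neighbor, the forward-only zero units, and the rule of skipping a zero unit that already exists in the set are all hypothetical choices, not details taken from the text.

```python
from typing import List, NamedTuple, Optional, Tuple


class MotionInfo(NamedTuple):
    """Illustrative motion information unit with a single motion vector."""
    mv: Tuple[int, int]
    direction: str  # "forward" or "backward"
    ref_idx: int


def build_candidate_set(spatial: List[Optional[MotionInfo]],
                        temporal: Optional[MotionInfo],
                        target_size: int) -> List[MotionInfo]:
    """Add spatial units with deduplication, add the temporal (Col) unit
    if it differs from all of them, then pad with zero motion units."""
    candidates: List[MotionInfo] = []
    for unit in spatial:  # e.g. image blocks C, A, B for S1
        if unit is not None and unit not in candidates:
            candidates.append(unit)
    if temporal is not None and temporal not in candidates:
        candidates.append(temporal)
    ref_idx = 0
    while len(candidates) < target_size:
        zero = MotionInfo((0, 0), "forward", ref_idx)  # forward frame assumed
        ref_idx += 1
        if zero not in candidates:
            candidates.append(zero)
    return candidates


c = MotionInfo((1, 2), "forward", 0)
a = MotionInfo((1, 2), "forward", 0)    # same as C, removed by deduplication
b = MotionInfo((3, 0), "forward", 0)
col_lt = MotionInfo((3, 0), "forward", 0)  # same as B, so not added
s1 = build_candidate_set([c, a, b], col_lt, target_size=3)
print(s1)
```

  • The same helper could be called with the units of image blocks E and D and a target size of two for S2, or with the units of image blocks G and F for S3.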
  • the candidate motion information unit set S2 corresponding to the upper right pixel sample of the current image block may include motion information units of x2 image blocks.
  • The x2 image blocks may include at least one of: the pixel sample Col-RT that has the same position as the upper right pixel sample RT of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs; the spatially adjacent image block E on the upper right of the current image block; and the spatially adjacent image block D on the upper side of the current image block.
  • For example, the motion information unit of the spatially adjacent image block E on the upper right of the current image block and the motion information unit of the spatially adjacent image block D on the upper side of the current image block may be acquired first, and the acquired motion information units of image blocks E and D are added to the candidate motion information unit set S2 corresponding to the upper right pixel sample of the current image block. If the motion information unit of image block E is the same as the motion information unit of image block D, deduplication processing may be performed on the candidate motion information unit set S2 (in this case, the number of motion information units in the candidate motion information unit set S2 after the deduplication processing is 1). If the motion information unit of the pixel sample Col-RT, which has the same position as the upper right pixel sample RT of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs, is the same as one of the motion information units in the candidate motion information unit set S2 after the deduplication processing, zero motion information units may be further added to the candidate motion information unit set S2 until the number of motion information units in the candidate motion information unit set S2 is equal to 2.
  • In addition, if the motion information unit of the pixel sample Col-RT is different from every motion information unit in the candidate motion information unit set S2 after the deduplication processing, the motion information unit of the pixel sample Col-RT may be added to the candidate motion information unit set S2 after the deduplication processing. If the number of motion information units in the candidate motion information unit set S2 is still fewer than two at this time, zero motion information units are further added to the candidate motion information unit set S2 until the number of motion information units in the candidate motion information unit set S2 is equal to two.
  • If the video frame to which the current image block belongs is a forward predicted frame, the zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is forward but may not include a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward predicted frame, the zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is backward but may not include a zero motion vector whose prediction direction is forward. In addition, if the video frame to which the current image block belongs is a bidirectional predicted frame, the zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is forward and a zero motion vector whose prediction direction is backward.
  • The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S2 may be different, and the corresponding reference frame index may be, for example, 0, 1, 2, 3, or another value.
  • the candidate motion information unit set S3 corresponding to the lower left pixel sample of the current image block may include motion information units of x3 image blocks.
  • The x3 image blocks may include at least one of: the pixel sample Col-LB that has the same position as the lower left pixel sample LB of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs; the spatially adjacent image block G on the lower left of the current image block; and the spatially adjacent image block F on the left of the current image block.
  • For example, the motion information unit of the spatially adjacent image block G on the lower left of the current image block and the motion information unit of the spatially adjacent image block F on the left of the current image block may be acquired first, and the acquired motion information units of image blocks G and F are added to the candidate motion information unit set S3 corresponding to the lower left pixel sample of the current image block. If the motion information unit of image block G is the same as the motion information unit of image block F, deduplication processing is performed on the candidate motion information unit set S3 (in this case, the number of motion information units in the candidate motion information unit set S3 after the deduplication processing is 1). If the motion information unit of the pixel sample Col-LB, which has the same position as the lower left pixel sample LB of the current image block in a video frame temporally adjacent to the video frame to which the current image block belongs, is the same as one of the motion information units in the candidate motion information unit set S3 after the deduplication processing, zero motion information units may be further added to the candidate motion information unit set S3 until the number of motion information units in the candidate motion information unit set S3 is equal to two.
  • In addition, if the motion information unit of the pixel sample Col-LB is different from every motion information unit in the candidate motion information unit set S3 after the deduplication processing, the motion information unit of the pixel sample Col-LB may be added to the candidate motion information unit set S3 after the deduplication processing. If the number of motion information units in the candidate motion information unit set S3 is still fewer than two at this time, zero motion information units are further added to the candidate motion information unit set S3 until the number of motion information units in the candidate motion information unit set S3 is equal to two.
  • If the video frame to which the current image block belongs is a forward predicted frame, the zero motion information unit added to the candidate motion information unit set S3 includes a zero motion vector whose prediction direction is forward but may not include a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward predicted frame, the zero motion information unit added to the candidate motion information unit set S3 includes a zero motion vector whose prediction direction is backward but may not include a zero motion vector whose prediction direction is forward. In addition, if the video frame to which the current image block belongs is a bidirectional predicted frame, the zero motion information unit added to the candidate motion information unit set S3 includes a zero motion vector whose prediction direction is forward and a zero motion vector whose prediction direction is backward.
  • The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S3 may be different, and the corresponding reference frame index may be, for example, 0, 1, 2, 3, or another value.
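  • The dependence of the zero motion information unit on the frame type, which is stated in the same way for S1, S2, and S3, can be sketched as below. The dictionary layout, the frame type strings, and the function name are hypothetical.

```python
def zero_motion_unit(frame_type, ref_idx=0):
    """Build a zero motion information unit for the given frame type.

    Returns a dict mapping each prediction direction to a
    (motion_vector, reference_frame_index) pair (illustrative layout).
    """
    zero = ((0, 0), ref_idx)
    if frame_type == "forward":
        return {"forward": zero}            # no backward zero vector
    if frame_type == "backward":
        return {"backward": zero}           # no forward zero vector
    if frame_type == "bidirectional":
        return {"forward": zero, "backward": zero}
    raise ValueError(f"unknown frame type: {frame_type}")


# Different zero units may carry different reference frame indexes.
print(zero_motion_unit("forward", ref_idx=0))
print(zero_motion_unit("bidirectional", ref_idx=1))
```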
  • Two motion information units being different may mean that the motion vectors included in the two motion information units are different, or that the prediction directions corresponding to the motion vectors included in the two motion information units are different, or that the reference frame indexes corresponding to the motion vectors included in the two motion information units are different.
  • Two motion information units being the same may mean that the motion vectors included in the two motion information units are the same, the prediction directions corresponding to the motion vectors included in the two motion information units are the same, and the reference frame indexes corresponding to the motion vectors included in the two motion information units are the same.
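  • The comparison rule above can be written as a small predicate. The `MotionInfoUnit` type below is a simplified, hypothetical representation that holds a single motion vector per unit.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MotionInfoUnit:
    """Illustrative motion information unit with one motion vector."""
    mv: Tuple[int, int]
    direction: str  # "forward" or "backward"
    ref_idx: int


def units_same(a: MotionInfoUnit, b: MotionInfoUnit) -> bool:
    """Same only if motion vector, prediction direction and reference
    frame index all match."""
    return a.mv == b.mv and a.direction == b.direction and a.ref_idx == b.ref_idx


def units_different(a: MotionInfoUnit, b: MotionInfoUnit) -> bool:
    """Different if any one of the three attributes differs."""
    return not units_same(a, b)


u1 = MotionInfoUnit((2, -1), "forward", 0)
u2 = MotionInfoUnit((2, -1), "forward", 1)  # only the reference index differs
print(units_same(u1, u1), units_different(u1, u2))
```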
  • a candidate motion information unit set of the corresponding pixel sample can be obtained in a similar manner.
  • The 2 pixel samples may include two of the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1 of the current image block.
  • The upper left pixel sample of the current image block is the upper left vertex of the current image block or a pixel block in the current image block that includes the upper left vertex of the current image block; the lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that includes the lower left vertex of the current image block; the upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that includes the upper right vertex of the current image block; and the central pixel sample a1 of the current image block is the central pixel of the current image block or a pixel block in the current image block that includes the central pixel of the current image block.
  • The video encoding apparatus determines N candidate merged motion information unit sets based on the candidate motion information unit set corresponding to each of the two pixel samples.
  • Each motion information unit included in each of the N candidate merged motion information unit sets is selected from at least some of the motion information units that meet the constraint in the candidate motion information unit sets corresponding to each of the two pixel samples.
  • The N candidate merged motion information unit sets are different from each other, and each of the N candidate merged motion information unit sets includes two motion information units.
  • The N candidate merged motion information unit sets are filtered out from the initial candidate merged motion information unit sets. If the numbers of motion information units included in the candidate motion information unit set S1 and the candidate motion information unit set S2 are not limited to the above example, the number of initial candidate merged motion information unit sets is not necessarily six.
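  • The count of six can be reproduced with a short sketch, assuming that each initial candidate set pairs one motion information unit from S1 (three units) with one from S2 (two units). The pairing rule and the placeholder unit names are assumptions made for illustration.

```python
from itertools import product


def initial_merged_sets(s1, s2):
    """Pair every unit of S1 with every unit of S2 (Cartesian product)."""
    return [(u1, u2) for u1, u2 in product(s1, s2)]


s1 = ["S1_a", "S1_b", "S1_c"]  # placeholder motion information units
s2 = ["S2_a", "S2_b"]
pairs = initial_merged_sets(s1, s2)
print(len(pairs))  # 3 * 2 pairs
```

  • With other set sizes the number of initial pairs changes accordingly, which matches the remark that the number is not necessarily six.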
  • the set of N candidate combined motion information units may, for example, also satisfy other unlisted conditions.
  • For example, the initial candidate merged motion information unit sets may first be filtered by using at least one of the first condition, the second condition, and the third condition to select N01 candidate merged motion information unit sets from the initial candidate merged motion information unit sets; scaling processing is then performed on the N01 candidate merged motion information unit sets; and at least one of the fourth condition and the fifth condition is then used to filter out N candidate merged motion information unit sets from the N01 candidate merged motion information unit sets that have undergone the scaling processing.
  • The fourth condition and the fifth condition may also not be referenced. Instead, the initial candidate merged motion information unit sets are directly filtered by using at least one of the first condition, the second condition, and the third condition, so that N candidate merged motion information unit sets are filtered out from the initial candidate merged motion information unit sets.
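  • The two-stage ordering (filter, scale, filter again) can be sketched generically. The conditions themselves are not modeled here: the predicates and the scaling function in the example are toy stand-ins chosen only to make the sketch runnable, and the function name is hypothetical.

```python
def select_candidates(initial_sets, stage1_conditions, scale=None,
                      stage2_conditions=()):
    """Filter with the stage-1 predicates, then optionally scale the
    survivors and filter them with the stage-2 predicates.

    The caller decides which conditions to pass; an empty
    `stage2_conditions` corresponds to not referencing them at all.
    """
    n01 = [s for s in initial_sets if all(cond(s) for cond in stage1_conditions)]
    if not stage2_conditions:
        return n01
    scaled = [scale(s) for s in n01] if scale is not None else n01
    return [s for s in scaled if all(cond(s) for cond in stage2_conditions)]


# Toy stand-ins: each "set" is a pair of numbers.
initial = [(1, 1), (1, 2), (1, 5)]
not_equal = lambda s: s[0] != s[1]             # hypothetical stage-1 predicate
double = lambda s: (s[0] * 2, s[1] * 2)        # hypothetical scaling step
close = lambda s: abs(s[0] - s[1]) <= 4        # hypothetical stage-2 predicate

print(select_candidates(initial, [not_equal], double, [close]))
print(select_candidates(initial, [not_equal]))
```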
  • A motion vector in video coding and decoding reflects the distance by which an object is offset in one direction (the prediction direction) relative to the same time instant (the same time instant corresponds to the same reference frame). Therefore, when the motion information units of different pixel samples correspond to different prediction directions and/or different reference frame indexes, the motion offset of each pixel/pixel block of the current image block relative to a reference frame may not be directly obtainable. When the pixel samples correspond to the same prediction direction and the same reference frame index, the combination of the merged motion vectors can be used to obtain the motion vector of each pixel/pixel block in the image block.
  • the candidate combined motion information unit set may be subjected to scaling processing.
  • performing scaling processing on the candidate combined motion information unit set may involve modifying, adding, and/or deleting motion vectors in one or more motion information units in the candidate combined motion information unit set.
  • Performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to a motion vector whose prediction direction is a first prediction direction in the merged motion information unit set i differs from the reference frame index of the current image block, performing scaling processing on the merged motion information unit set i so that the motion vector whose prediction direction is the first prediction direction is scaled to the reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i, where the first prediction direction is forward or backward.
  • Alternatively, performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to the forward motion vector in the merged motion information unit set i differs from the forward reference frame index of the current image block, and the reference frame index corresponding to the backward motion vector in the merged motion information unit set i differs from the backward reference frame index of the current image block, performing scaling processing on the merged motion information unit set i so that the forward motion vector is scaled to the forward reference frame of the current image block and the backward motion vector is scaled to the backward reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i.
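  • The scaling of a motion vector to a different reference frame can be sketched as follows. This is only an illustration, assuming that the vector is scaled linearly by the ratio of temporal distances, measured here by picture order count (POC). The function name `scale_mv`, its parameters, and the POC-based distances are assumptions made for the example and are not specified in the text.

```python
from fractions import Fraction

def scale_mv(mv, cur_poc, src_ref_poc, dst_ref_poc):
    """Rescale mv, which points from the current frame to the frame at
    src_ref_poc, so that it points to the frame at dst_ref_poc instead.
    Assumes linear motion over time (an assumption of this sketch)."""
    if src_ref_poc == cur_poc:
        raise ValueError("source reference frame must differ from the current frame")
    ratio = Fraction(cur_poc - dst_ref_poc, cur_poc - src_ref_poc)
    return (mv[0] * ratio, mv[1] * ratio)

# A vector toward a reference two frames back, rescaled to one frame back.
scaled = scale_mv((8, -4), cur_poc=4, src_ref_poc=2, dst_ref_poc=3)
print(scaled)
```

  • Exact fractions are used so that no rounding rule has to be assumed; a real codec would round to its motion vector precision.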
  • the video encoding device determines, from among the N candidate combined motion information unit sets, a combined motion information unit set i including two motion information units.
  • the video encoding apparatus may further write the identifier of the combined motion information unit set i into the video code stream.
  • the video decoding device determines the combined motion information unit set i including the two motion information units from among the N candidate combined motion information unit sets based on the identification of the combined motion information unit set i obtained from the video code stream.
  • Determining, by the video encoding apparatus, the merged motion information unit set i including two motion information units from the N candidate merged motion information unit sets may include: determining the merged motion information unit set i including two motion information units from the N candidate merged motion information unit sets based on distortion or rate-distortion cost.
  • The rate-distortion cost corresponding to the merged motion information unit set i is less than or equal to the rate-distortion cost corresponding to any other merged motion information unit set among the N candidate merged motion information unit sets.
  • The distortion corresponding to the merged motion information unit set i is less than or equal to the distortion corresponding to any other merged motion information unit set among the N candidate merged motion information unit sets.
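  • The selection rule above can be sketched as picking the candidate with the smallest cost. The sketch assumes the common Lagrangian form of the rate-distortion cost, J = D + λ·R, which the text does not spell out; the function name `select_min_cost`, the `lam` parameter, and the sample numbers are hypothetical.

```python
def select_min_cost(candidates, lam):
    """Return the index of the candidate set with the smallest cost.

    candidates: list of (distortion, rate_bits) pairs, one per candidate
    merged motion information unit set.
    lam: Lagrange multiplier; lam = 0 selects on distortion alone.
    """
    costs = [d + lam * r for d, r in candidates]
    # min() returns the first index on ties, so the result is a set whose
    # cost is less than or equal to that of every other candidate.
    return min(range(len(costs)), key=costs.__getitem__)

candidate_stats = [(120.0, 10), (100.0, 30), (110.0, 12)]
best_by_rd = select_min_cost(candidate_stats, lam=1.0)
best_by_distortion = select_min_cost(candidate_stats, lam=0.0)
```

  • The returned index plays the role of the identifier of the merged motion information unit set i that the encoder would write into the video code stream.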
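  • The rate-distortion cost corresponding to a candidate merged motion information unit set may be, for example, the rate-distortion cost of the predicted pixel values of an image block (for example, the current image block) obtained by performing pixel value prediction on the image block by using that candidate set.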
  • The distortion corresponding to a candidate merged motion information unit set among the N candidate merged motion information unit sets (for example, the merged motion information unit set i) may be, for example, the distortion of an image block (for example, the current image block), that is, the distortion between the original pixel values of the image block and the predicted pixel values obtained by performing pixel value prediction on the image block by using that candidate set.
  • This distortion may be, for example, the sum of squared differences (SSD), the sum of absolute differences (SAD), or the sum of differences between the original pixel values and the predicted pixel values of the image block, or another distortion parameter capable of measuring distortion.
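  • The two distortion measures named above can be written down directly. The sketch below treats an image block as a list of rows of integer pixel values; the function names and the sample block are illustrative only.

```python
def sad(orig, pred):
    """Sum of absolute differences between original and predicted pixels."""
    return sum(abs(o - p) for ro, rp in zip(orig, pred) for o, p in zip(ro, rp))

def ssd(orig, pred):
    """Sum of squared differences between original and predicted pixels."""
    return sum((o - p) ** 2 for ro, rp in zip(orig, pred) for o, p in zip(ro, rp))

original_block = [[10, 12], [14, 16]]
predicted_block = [[11, 12], [12, 19]]
block_sad = sad(original_block, predicted_block)
block_ssd = ssd(original_block, predicted_block)
```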
  • When N is greater than n1, n1 candidate merged motion information unit sets may be selected from the N candidate merged motion information unit sets, and the merged motion information unit set i including two motion information units is then determined from the n1 candidate merged motion information unit sets based on distortion or rate-distortion cost.
  • The D(V) corresponding to any candidate merged motion information unit set among the n1 selected sets is less than or equal to the D(V) corresponding to any candidate merged motion information unit set among the N sets that is not one of the n1 selected sets, where n1 is, for example, 3, 4, 5, 6, or another value.
  • The n1 candidate merged motion information unit sets, or their identifiers, may be added to a candidate merged motion information unit set queue. If N is less than or equal to n1, the N candidate merged motion information unit sets, or their identifiers, may be added to the queue instead.
  • The candidate merged motion information unit sets in the queue may be sorted, for example, in ascending or descending order of D(V).
  • The Euclidean distance parameter D(V) of any candidate merged motion information unit set among the N candidate merged motion information unit sets (for example, the merged motion information unit set i) may be calculated, for example, from the following motion vectors.
  • Two of them are the motion vectors of the two pixel samples included in the candidate merged motion information unit set, and a third is the motion vector of another pixel sample of the current image block, which is different from the two pixel samples above.
  • For example, the first two may be the motion vectors of the upper left pixel sample and the upper right pixel sample of the current image block, and the third may be the motion vector of the lower left pixel sample of the current image block; of course, the third motion vector may also be that of the central pixel sample or of another pixel sample of the current image block.
  • Sorting the D(V) values of the N candidate merged motion information unit sets in ascending or descending order yields the candidate merged motion information unit set queue.
  • The merged motion information unit sets in the queue are different from each other, and an index number may be used to indicate a particular merged motion information unit set in the queue.
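  • The construction of the queue can be sketched as follows. The D(V) values are taken as given inputs, since the sketch does not assume any particular formula for D(V); the function name `build_candidate_queue` and the sample values are hypothetical, and ascending order is chosen as one of the two orders the text allows.

```python
def build_candidate_queue(dv_values, n1):
    """Return identifiers of the candidate sets kept in the queue.

    dv_values: D(V) of each of the N candidate sets; the list index is
    used as the identifier of the set.
    n1: maximum queue length. If N <= n1, all N identifiers are kept.
    """
    order = sorted(range(len(dv_values)), key=lambda k: dv_values[k])
    return order[:n1]

queue = build_candidate_queue([7.5, 1.0, 4.0, 9.0, 2.5], n1=3)
short_queue = build_candidate_queue([3.0, 2.0], n1=3)
```

  • A position in the returned list then serves as the index number that identifies one merged motion information unit set in the queue.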
  • The video encoding apparatus performs motion vector prediction on the current image block by using an affine motion model and the merged motion information unit set i.
  • The size of the current image block is w × h, where w may or may not be equal to h.
  • FIG. 2-e shows the coordinates of the four vertices of the current image block.
  • Schematic diagrams of affine motion are shown in FIG. 2-f and FIG. 2-g.
  • The motion vectors of the two pixel samples, located at coordinates (0, 0) and (w, 0), are $(vx_0, vy_0)$ and $(vx_1, vy_1)$, respectively. Substituting the coordinates and motion vectors of the two pixel samples into the affine motion model exemplified below gives the motion vector of any pixel in the current image block x:
$$vx = \frac{vx_1 - vx_0}{w}\,x - \frac{vy_1 - vy_0}{w}\,y + vx_0, \qquad vy = \frac{vy_1 - vy_0}{w}\,x + \frac{vx_1 - vx_0}{w}\,y + vy_0 \qquad \text{(Equation 1)}$$
  • Here $(vx_0, vy_0)$ and $(vx_1, vy_1)$ are the motion vectors of the two pixel samples, $vx$ and $vy$ are respectively the horizontal and vertical components of the motion vector of the pixel sample whose coordinates are (x, y) in the current image block, and w is the length or width of the current image block.
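  • The per-pixel motion vector computation can be sketched as follows. The sketch assumes the four-parameter affine model with the two pixel samples at coordinates (0, 0) and (w, 0), as described in the derivation later in this section; the function name `affine_mv` and the sample values are illustrative and not part of the text.

```python
def affine_mv(x, y, mv0, mv1, w):
    """Motion vector (vx, vy) of the pixel at (x, y).

    mv0: motion vector (vx0, vy0) of the pixel sample at (0, 0)
    mv1: motion vector (vx1, vy1) of the pixel sample at (w, 0)
    w:   distance between the two pixel samples (block width here)
    """
    vx0, vy0 = mv0
    vx1, vy1 = mv1
    a = (vx1 - vx0) / w  # shared by the x term of vx and the y term of vy
    b = (vy1 - vy0) / w  # shared by the y term of vx and the x term of vy
    vx = a * x - b * y + vx0
    vy = b * x + a * y + vy0
    return vx, vy

mv_top_left = affine_mv(0, 0, (2.0, 1.0), (4.0, 3.0), 8)
mv_top_right = affine_mv(8, 0, (2.0, 1.0), (4.0, 3.0), 8)
mv_bottom_left = affine_mv(0, 8, (2.0, 1.0), (4.0, 3.0), 8)
```

  • Evaluating the model at the two sample positions returns the two input motion vectors, which is a quick sanity check on the coefficients.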
  • the video encoding apparatus may perform pixel value prediction on the current image block based on the calculated motion vector of each pixel point or each pixel block of the current image block.
  • the video encoding apparatus may obtain the prediction residual of the current image block by using the original pixel value of the current image block and the current image block prediction pixel value obtained by performing pixel value prediction on the current image block.
  • the video encoding device can write the prediction residual of the current image block to the video code stream.
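  • The prediction residual mentioned above is the per-pixel difference between the original and predicted values. A minimal sketch, with image blocks represented as lists of rows and with hypothetical function names:

```python
def prediction_residual(orig, pred):
    """Per-pixel residual: original pixel value minus predicted pixel value."""
    return [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(orig, pred)]

def reconstruct(pred, residual):
    """Decoder-side counterpart: predicted pixel value plus residual."""
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(pred, residual)]

original_block = [[10, 12], [14, 16]]
predicted_block = [[11, 12], [12, 19]]
residual = prediction_residual(original_block, predicted_block)
```

  • The sketch omits transform, quantization, and entropy coding, which a real encoder applies before writing the residual to the code stream.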
  • In the technical solution of this embodiment, the video encoding apparatus performs pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i, where each motion information unit in the merged motion information unit set i is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the two pixel samples. Because the selection range of the merged motion information unit set i becomes relatively small, this abandons the mechanism used in conventional technology of filtering out the motion information units of multiple pixel samples through a large amount of calculation over all possible candidate motion information unit sets of those pixel samples. This helps improve coding efficiency and reduce the computational complexity of image prediction based on the affine motion model, which in turn makes it possible to introduce the affine motion model into video coding standards.
  • Moreover, because the affine motion model is introduced, the motion of an object can be described more accurately, which helps improve prediction accuracy. Because the number of referenced pixel samples can be two, this further helps reduce the computational complexity of image prediction based on the affine motion model after the model is introduced, and also helps reduce the amount of affine parameter information or the number of motion vector residuals transmitted by the encoder.
  • a derivation process of the affine motion model shown in Equation 1 is exemplified below.
  • a rotational motion model can be utilized to derive an affine motion model.
  • The rotational motion is illustrated, for example, in FIG. 2-h or FIG. 2-i.
  • the rotational motion model is shown in formula (2).
  • Here (x', y') is the coordinate in the reference frame corresponding to the pixel at coordinate (x, y), θ is the rotation angle, and $(a_0, a_1)$ is the translation component. If the transform coefficients are known, the motion vector (vx, vy) of the pixel (x, y) can be obtained.
  • the rotation matrix adopted is:
  • The simplified affine motion model can be described as in Equation 3.
  • Compared with the general affine motion model, the simplified affine motion model can be represented with only 4 parameters.
  • For an image block of size w × h (such as CUR), the right and bottom boundaries are each extended by one row, and the motion vectors $(vx_0, vy_0)$ and $(vx_1, vy_1)$ of the vertices at coordinates (0, 0) and (w, 0) are obtained. Taking these two vertices as the pixel samples (other points, such as the central pixel sample, may of course also serve as reference pixel samples) and substituting their coordinates and motion vectors into formula (3) yields Equation 1.
  • Here $(vx_0, vy_0)$ and $(vx_1, vy_1)$ are the motion vectors of the two pixel samples, $vx$ is the horizontal component of the motion vector of the pixel sample whose coordinates are (x, y) in the current image block, $vy$ is the vertical component of the motion vector of that pixel sample, and w is the length or width of the current image block.
  • Equation 1 is highly practical. Practice shows that, because the number of referenced pixel samples can be two, this further helps reduce the computational complexity of image prediction based on the affine motion model after the model is introduced, and reduces the amount of affine parameter information or the number of motion vector differences.
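  • The derivation outlined above can be reconstructed as a worked example. It rests on assumptions that the text does not state explicitly: a zoom factor ρ is combined with the rotation, the rotation sign convention is the one shown, and the motion vector is defined as $vx = x - x'$, $vy = y - y'$. The intermediate forms may therefore differ in notation from formulas (2) and (3); the symbols a, b, c, d, and ρ are introduced only for this sketch.

```latex
% Rotation by \theta with zoom \rho and translation (a_0, a_1):
\begin{aligned}
x' &= \rho\,(\cos\theta\, x + \sin\theta\, y) + a_0 \\
y' &= \rho\,(-\sin\theta\, x + \cos\theta\, y) + a_1
\end{aligned}

% With vx = x - x' and vy = y - y', only four free parameters remain:
\begin{aligned}
vx &= a\,x - b\,y + c \\
vy &= b\,x + a\,y + d
\end{aligned}
\qquad
a = 1 - \rho\cos\theta,\quad b = \rho\sin\theta,\quad c = -a_0,\quad d = -a_1

% Substituting the sample at (0, 0) with motion vector (vx_0, vy_0):
c = vx_0, \qquad d = vy_0

% Substituting the sample at (w, 0) with motion vector (vx_1, vy_1):
a = \frac{vx_1 - vx_0}{w}, \qquad b = \frac{vy_1 - vy_0}{w}

% Result, matching the two-sample model referred to as Equation 1:
\begin{aligned}
vx &= \frac{vx_1 - vx_0}{w}\,x - \frac{vy_1 - vy_0}{w}\,y + vx_0 \\
vy &= \frac{vy_1 - vy_0}{w}\,x + \frac{vx_1 - vx_0}{w}\,y + vy_0
\end{aligned}
```

  • The same coefficient a multiplies x in vx and y in vy, and b appears with opposite signs, which is why two motion vectors are enough to fix the model.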
  • FIG. 3 is a schematic flowchart diagram of another image prediction method according to another embodiment of the present invention.
  • an image prediction method implemented in a video decoding apparatus is mainly described as an example.
  • another image prediction method provided by the third embodiment of the present invention may include:
  • the video decoding device determines two pixel samples in the current image block.
  • This embodiment mainly uses the example in which the two pixel samples are two of the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1 of the current image block.
  • For example, the two pixel samples include the upper left pixel sample and the upper right pixel sample of the current image block.
  • the scenario in which the two pixel samples are other pixel samples of the current image block may be analogized.
  • The upper left pixel sample of the current image block may be the upper left vertex of the current image block or a pixel block in the current image block that contains the upper left vertex; the lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that contains the lower left vertex; the upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that contains the upper right vertex; and the central pixel sample a1 of the current image block is the central pixel of the current image block or a pixel block in the current image block that contains the central pixel.
  • the size of the pixel block is, for example, 2*2, 1*2, 4*2, 4*4, or other sizes.
  • the video decoding apparatus determines a candidate motion information unit set corresponding to each of the two pixel samples.
  • The candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit.
  • the pixel samples mentioned in the embodiments of the present invention may be pixel points or pixel blocks including at least two pixel points.
  • the candidate motion information unit set S1 corresponding to the upper left pixel sample of the current image block may include motion information units of x1 pixel samples.
  • The x1 pixel samples include at least one of: a pixel sample Col-LT that is at the same position as the upper left pixel sample LT of the current image block and lies in a video frame temporally adjacent to the video frame to which the current image block belongs, a spatially adjacent image block C to the left of the current image block, a spatially adjacent image block A at the upper left of the current image block, and a spatially adjacent image block B above the current image block.
  • For example, the motion information units of the spatially adjacent image blocks C, A, and B may be obtained first and added to the candidate motion information unit set S1 corresponding to the upper left pixel sample of the current image block. If some or all of the motion information units of image blocks C, A, and B are the same, deduplication processing is further performed on the candidate motion information unit set S1. If the motion information unit of the pixel sample Col-LT is the same as one of the motion information units in the deduplicated S1, zero motion information units may be added to S1 until the number of motion information units in S1 equals 3.
  • If the motion information unit of the pixel sample Col-LT differs from every motion information unit in the deduplicated S1, the motion information unit of Col-LT is added to the deduplicated S1; if the number of motion information units in S1 is then still less than 3, zero motion information units may be added to S1 until the number of motion information units in S1 equals 3.
  • If the video frame to which the current image block belongs is a forward predicted frame, the zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is forward but may not include a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward predicted frame, the zero motion information unit added to S1 includes a zero motion vector whose prediction direction is backward but may not include a zero motion vector whose prediction direction is forward. In addition, if the video frame to which the current image block belongs is a bidirectional predicted frame, the zero motion information unit added to S1 includes a zero motion vector whose prediction direction is forward and a zero motion vector whose prediction direction is backward.
  • The reference frame indexes corresponding to the motion vectors in different zero motion information units added to S1 may differ; the corresponding reference frame index may be, for example, 0, 1, 2, 3, or another value.
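  • The construction of a candidate motion information unit set (spatial candidates, deduplication, the temporal candidate, then zero padding) can be sketched as follows. A motion information unit is modelled here as a tuple (prediction direction, motion vector, reference frame index) carrying a single vector, and a forward predicted frame is assumed for the zero units; both are simplifications made for this example. The names `build_candidate_set` and `zero_unit`, and the use of `None` for an unavailable neighbour, are hypothetical.

```python
def zero_unit(ref_idx):
    """Zero motion information unit for a forward predicted frame."""
    return ("forward", (0, 0), ref_idx)

def build_candidate_set(spatial, temporal, target_size):
    """Build a candidate set such as S1.

    spatial: units of the spatially adjacent blocks (None if unavailable)
    temporal: unit of the co-located sample (e.g. Col-LT), or None
    target_size: number of units to pad up to (3 for S1 in the text)
    """
    result = []
    for unit in spatial:                      # add spatial units, deduplicated
        if unit is not None and unit not in result:
            result.append(unit)
    if temporal is not None and temporal not in result:
        result.append(temporal)               # add only if it is a new unit
    ref_idx = 0
    while len(result) < target_size:          # pad with zero units
        result.append(zero_unit(ref_idx))
        ref_idx += 1
    return result

unit_c = ("forward", (3, 1), 0)
unit_a = ("forward", (3, 1), 0)   # same as block C, removed by deduplication
unit_b = ("forward", (2, 0), 0)
col_lt = ("forward", (2, 0), 0)   # same as block B, so not added
s1 = build_candidate_set([unit_c, unit_a, unit_b], col_lt, target_size=3)
```

  • Successive zero units use increasing reference frame indexes, which is one way of realising the statement that different zero motion information units may correspond to different reference frame indexes.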
  • the candidate motion information unit set S2 corresponding to the upper right pixel sample of the current image block may include motion information units of x2 image blocks.
  • The x2 image blocks may include at least one of: a pixel sample Col-RT that is at the same position as the upper right pixel sample RT of the current image block and lies in a video frame temporally adjacent to the video frame to which the current image block belongs, a spatially adjacent image block E at the upper right of the current image block, and a spatially adjacent image block D above the current image block.
  • For example, the motion information unit of the spatially adjacent image block E at the upper right of the current image block and the motion information unit of the spatially adjacent image block D above the current image block may be obtained first and added to the candidate motion information unit set S2 corresponding to the upper right pixel sample of the current image block. If the motion information unit of image block E is the same as that of image block D, deduplication processing may be performed on S2 (after which the number of motion information units in S2 is 1). If the motion information unit of the pixel sample Col-RT is the same as one of the motion information units in the deduplicated S2, zero motion information units may be further added to S2 until the number of motion information units in S2 equals 2.
  • If the motion information unit of the pixel sample Col-RT differs from every motion information unit in the deduplicated S2, the motion information unit of Col-RT may be added to the deduplicated S2; if the number of motion information units in S2 is then still less than 2, zero motion information units are further added to S2 until the number of motion information units in S2 equals 2.
  • If the video frame to which the current image block belongs is a forward predicted frame, the zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is forward but may not include a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward predicted frame, the zero motion information unit added to S2 includes a zero motion vector whose prediction direction is backward but may not include a zero motion vector whose prediction direction is forward. In addition, if the video frame to which the current image block belongs is a bidirectional predicted frame, the zero motion information unit added to S2 includes a zero motion vector whose prediction direction is forward and a zero motion vector whose prediction direction is backward.
  • The reference frame indexes corresponding to the motion vectors in different zero motion information units added to S2 may differ; the corresponding reference frame index may be, for example, 0, 1, 2, 3, or another value.
  • The candidate motion information unit set S3 corresponding to the lower left pixel sample of the current image block may include motion information units of x3 image blocks.
  • The x3 image blocks may include at least one of: a pixel sample Col-LB that is at the same position as the lower left pixel sample LB of the current image block and lies in a video frame temporally adjacent to the video frame to which the current image block belongs, a spatially adjacent image block G at the lower left of the current image block, and a spatially adjacent image block F to the left of the current image block.
  • For example, the motion information unit of the spatially adjacent image block G at the lower left of the current image block and the motion information unit of the spatially adjacent image block F to the left of the current image block may be obtained first and added to the candidate motion information unit set S3 corresponding to the lower left pixel sample of the current image block. If the motion information unit of image block G is the same as that of image block F, deduplication processing is performed on S3 (after which the number of motion information units in S3 is 1). If the motion information unit of the pixel sample Col-LB is the same as one of the motion information units in the deduplicated S3, zero motion information units may be further added to S3 until the number of motion information units in S3 equals 2.
  • If the motion information unit of the pixel sample Col-LB differs from every motion information unit in the deduplicated S3, the motion information unit of Col-LB may be added to the deduplicated S3; if the number of motion information units in S3 is then still less than 2, zero motion information units are further added to S3 until the number of motion information units in S3 equals 2.
  • If the video frame to which the current image block belongs is a forward predicted frame, the zero motion information unit added to the candidate motion information unit set S3 includes a zero motion vector whose prediction direction is forward but may not include a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward predicted frame, the zero motion information unit added to S3 includes a zero motion vector whose prediction direction is backward but may not include a zero motion vector whose prediction direction is forward. In addition, if the video frame to which the current image block belongs is a bidirectional predicted frame, the zero motion information unit added to S3 includes a zero motion vector whose prediction direction is forward and a zero motion vector whose prediction direction is backward.
  • The reference frame indexes corresponding to the motion vectors in different zero motion information units added to S3 may differ; the corresponding reference frame index may be, for example, 0, 1, 2, 3, or another value.
  • That two motion information units are different may mean that the motion vectors included in the two motion information units are different, or that the prediction directions corresponding to those motion vectors are different, or that the reference frame indexes corresponding to those motion vectors are different.
  • That two motion information units are the same may mean that the motion vectors included in the two motion information units are the same, the prediction directions corresponding to those motion vectors are the same, and the reference frame indexes corresponding to those motion vectors are the same.
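  • The comparison rule above maps naturally onto a value type whose equality covers all three fields. The sketch below assumes, for simplicity, that a motion information unit carries a single motion vector; the class name `MotionInfoUnit` and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class MotionInfoUnit:
    direction: str          # prediction direction: "forward" or "backward"
    mv: Tuple[int, int]     # motion vector (horizontal, vertical)
    ref_idx: int            # reference frame index

# The generated __eq__ compares direction, mv and ref_idx together, so two
# units are the same only if all three match and different if any one differs.
unit_a = MotionInfoUnit("forward", (3, -1), 0)
unit_b = MotionInfoUnit("forward", (3, -1), 0)
unit_c = MotionInfoUnit("forward", (3, -1), 1)   # different reference index
unit_d = MotionInfoUnit("backward", (3, -1), 0)  # different direction
unique_units = {unit_a, unit_b, unit_c, unit_d}
```

  • Because the class is frozen and hashable, a set can perform the deduplication step described for the candidate motion information unit sets.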
  • a candidate motion information unit set of the corresponding pixel sample can be obtained in a similar manner.
  • The two pixel samples may include two of the upper left pixel sample, the upper right pixel sample, the lower left pixel sample, and the central pixel sample a1 of the current image block.
  • The upper left pixel sample of the current image block is the upper left vertex of the current image block or a pixel block in the current image block that contains the upper left vertex; the lower left pixel sample of the current image block is the lower left vertex of the current image block or a pixel block in the current image block that contains the lower left vertex; the upper right pixel sample of the current image block is the upper right vertex of the current image block or a pixel block in the current image block that contains the upper right vertex; and the central pixel sample a1 of the current image block is the central pixel of the current image block or a pixel block in the current image block that contains the central pixel.
  • the video decoding apparatus determines, according to the candidate motion information unit set corresponding to each of the two pixel samples, the N candidate combined motion information unit sets.
  • Each motion information unit included in each of the N candidate merged motion information unit sets is selected from at least some of the constraint-compliant motion information units in the candidate motion information unit set corresponding to each of the two pixel samples.
  • The N candidate merged motion information unit sets are different from each other, and each of the N candidate merged motion information unit sets includes two motion information units.
  • For example, with three motion information units in S1 and two in S2, there are 3 × 2 = 6 initial candidate merged motion information unit sets, and at least one condition is used to filter the N candidate merged motion information unit sets out of these six initial sets.
  • the set of N candidate combined motion information units may, for example, also satisfy other unlisted conditions.
  • In a specific implementation, for example, the initial candidate merged motion information unit sets may first be filtered by using at least one of the first condition, the second condition, and the third condition to obtain N01 candidate merged motion information unit sets; scaling processing is then performed on the N01 candidate merged motion information unit sets; and at least one of the fourth condition and the fifth condition is then used to filter the N candidate merged motion information unit sets out of the scaled N01 candidate merged motion information unit sets.
  • Of course, the fourth condition and the fifth condition may also be left unused; in that case, the initial candidate merged motion information unit sets are filtered directly by using at least one of the first condition, the second condition, and the third condition to obtain the N candidate merged motion information unit sets.
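  • Forming the initial candidate merged sets and filtering them can be sketched as a Cartesian product followed by predicate checks. Motion information units are modelled as tuples (prediction direction, motion vector, reference frame index). The two predicates below are placeholders chosen for illustration; they stand in for the numbered conditions, whose definitions are given elsewhere in the document, and all function names are hypothetical.

```python
from itertools import product

def initial_merged_sets(s1, s2):
    """All pairs taking one unit from S1 and one from S2."""
    return list(product(s1, s2))

def filter_merged_sets(merged_sets, conditions):
    """Keep the pairs that satisfy every supplied condition."""
    return [pair for pair in merged_sets if all(c(pair) for c in conditions)]

# Placeholder conditions (assumptions of this sketch).
def same_direction(pair):
    return pair[0][0] == pair[1][0]

def same_ref_idx(pair):
    return pair[0][2] == pair[1][2]

s1 = [("forward", (1, 0), 0), ("forward", (2, 0), 1), ("backward", (0, 1), 0)]
s2 = [("forward", (1, 1), 0), ("forward", (0, 0), 1)]
initial = initial_merged_sets(s1, s2)
kept = filter_merged_sets(initial, [same_direction, same_ref_idx])
```

  • With three units in S1 and two in S2 the product has six pairs; the two placeholder conditions leave the pairs whose units share a prediction direction and a reference frame index.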
  • A motion vector in video coding reflects the distance by which an object is offset in one direction (the prediction direction) relative to the same time instant (the same time instant corresponds to the same reference frame). Therefore, when the motion information units of different pixel samples correspond to different prediction directions and/or different reference frame indexes, the motion offset of each pixel or pixel block of the current image block relative to a reference frame may not be directly obtainable. When the pixel samples correspond to the same prediction direction and the same reference frame index, the combination of their motion vectors can be used to obtain the motion vector of each pixel or pixel block in the image block.
  • In the former case, the candidate merged motion information unit set may be subjected to scaling processing.
  • Performing scaling processing on the candidate merged motion information unit set may involve modifying, adding, and/or deleting motion vectors in one or more motion information units in the candidate merged motion information unit set.
  • Performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to a motion vector whose prediction direction is the first prediction direction in the merged motion information unit set i differs from the reference frame index of the current image block, performing scaling processing on the merged motion information unit set i so that the motion vector whose prediction direction is the first prediction direction is scaled to the reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i, where the first prediction direction is forward or backward;
  • or, performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to the forward motion vector in the merged motion information unit set i differs from the forward reference frame index of the current image block, and the reference frame index corresponding to the backward motion vector in the merged motion information unit set i differs from the backward reference frame index of the current image block, performing scaling processing on the merged motion information unit set i so that the forward motion vector is scaled to the forward reference frame of the current image block and the backward motion vector is scaled to the backward reference frame of the current image block, and performing pixel value prediction on the current image block by using the affine motion model and the scaled merged motion information unit set i.
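  • The text does not spell out how a motion vector is scaled to a different reference frame. The following is a minimal sketch, assuming the common practice of scaling in proportion to temporal distance, measured here by picture order count (POC), which this text does not mention. The function names, the (mv, ref_idx) tuple layout, and the POC arguments are all hypothetical.

```python
from fractions import Fraction


def scale_motion_vector(mv, cur_poc, src_ref_poc, dst_ref_poc):
    """Rescale mv, which points from the current frame to the source
    reference frame, so that it points to the destination reference frame.
    Assumes constant-velocity motion (an assumption, not from the source)."""
    src_dist = cur_poc - src_ref_poc
    dst_dist = cur_poc - dst_ref_poc
    if src_dist == 0:
        raise ValueError("source reference frame equals the current frame")
    factor = Fraction(dst_dist, src_dist)
    # Exact rational arithmetic, then rounding to the nearest integer.
    return (round(mv[0] * factor), round(mv[1] * factor))


def scale_merged_set(units, cur_poc, ref_pocs, target_ref_idx):
    """units: list of (mv, ref_idx) pairs for one prediction direction.
    Returns the units with every motion vector scaled to target_ref_idx."""
    scaled = []
    for mv, ref_idx in units:
        if ref_idx != target_ref_idx:
            mv = scale_motion_vector(
                mv, cur_poc, ref_pocs[ref_idx], ref_pocs[target_ref_idx]
            )
        scaled.append((mv, target_ref_idx))
    return scaled
```

  • After scaling, all motion information units in the set share one reference frame index, which is the precondition stated above for applying the affine motion model.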
  • The video decoding apparatus decodes the video code stream to obtain an identifier of the combined motion information unit set i and a prediction residual of the current image block, and, based on the identifier, determines the combined motion information unit set i, which includes two motion information units, from among the N candidate combined motion information unit sets.
  • the video encoding device can write the identifier of the combined motion information unit set i to the video code stream.
  • the video decoding apparatus performs motion vector prediction on the current image block by using an affine motion model and the combined motion information unit set i.
  • The video decoding apparatus may perform motion estimation processing on the motion vectors in the combined motion information unit set i to obtain a motion-estimated combined motion information unit set i, and then perform motion vector prediction on the current image block by using the affine motion model and the motion-estimated combined motion information unit set i.
  • the size of the current image block is w × h, where w may or may not be equal to h.
  • Figure 2-e shows the coordinates of the four vertices of the current image block.
  • Assume that the motion vectors of the two pixel samples are (vx0, vy0) and (vx1, vy1), respectively. Substituting the coordinates and motion vectors of the two pixel samples into the affine motion model exemplified in Equation 1 gives the motion vector of any pixel in the current image block x.
  • In Equation 1, vx and vy are, respectively, the horizontal component and the vertical component of the motion vector of the pixel sample at coordinates (x, y) in the current image block, and w may be the length or width of the current image block.
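  • The formula referred to as Equation 1 did not survive in this text. A four-parameter affine model consistent with the definitions above can be written as follows, assuming that the two pixel samples are the upper-left vertex at (0, 0) and the upper-right vertex at (w, 0) of the current image block. The choice of vertices is an assumption, since it is not stated here.

```latex
\begin{aligned}
vx &= \frac{vx_1 - vx_0}{w}\,x - \frac{vy_1 - vy_0}{w}\,y + vx_0 \\
vy &= \frac{vy_1 - vy_0}{w}\,x + \frac{vx_1 - vx_0}{w}\,y + vy_0
\end{aligned}
```

  • As a check, setting (x, y) = (0, 0) returns (vx0, vy0) and setting (x, y) = (w, 0) returns (vx1, vy1), so this form reproduces the motion vectors of both pixel samples.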
  • The video decoding apparatus performs pixel value prediction on the current image block according to the calculated motion vector of each pixel or each pixel block of the current image block, to obtain the predicted pixel values of the current image block.
  • the video decoding apparatus reconstructs the current image block by using the predicted pixel value of the current image block and the prediction residual of the current image block.
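  • The reconstruction step can be sketched as a sample-wise sum of prediction and residual. Clipping to the valid sample range and the `bit_depth` parameter are assumptions based on common codec practice and are not stated in this text. The function name is hypothetical.

```python
def reconstruct_block(pred, residual, bit_depth=8):
    """Add the prediction residual to the predicted pixel values and clip
    the result to [0, 2**bit_depth - 1] (clipping is an assumption)."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residual)
    ]
```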
  • The video decoding apparatus performs pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i, where each motion information unit in the merged motion information unit set i is selected from at least some of the motion information units in the candidate motion information unit set corresponding to one of the 2 pixel samples. Because the selection range of the merged motion information unit set i becomes relatively small, the conventional mechanism of selecting one motion information unit for each of a plurality of pixel samples through extensive computation over all possible candidate motion information unit sets is abandoned. This helps improve coding efficiency and reduce the computational complexity of image prediction based on an affine motion model, which makes it feasible to introduce affine motion models into video coding standards. Because an affine motion model is introduced, object motion can be described more accurately, which helps improve prediction accuracy. Because the number of referenced pixel samples can be two, the computational complexity of image prediction based on the affine motion model is further reduced, and the amount of affine parameter information or the number of motion vector residuals transmitted by the encoder is also reduced.
  • FIG. 4 is a schematic flowchart diagram of still another image prediction method according to an embodiment of the present invention.
  • An image prediction method provided by the fourth embodiment of the present invention is used for a decoding device, where any image block includes at least one pixel sample of a first type and at least one pixel sample of a second type. Without loss of generality, let the first type of pixel sample include the first pixel sample, and the second type of pixel sample include the second pixel sample.
  • The difference between the two types is that the motion information of a first-type sample comes only from its corresponding motion information unit, whereas the motion information of a second-type sample comes only partly from its corresponding motion information unit. The method may include:
  • Each image block to be predicted corresponds to a part of the code stream in the code stream.
  • By parsing the code stream, the decoding device can obtain the auxiliary information (side information) used to construct the predicted image and the residual values between the predicted image and the image to be decoded; the image to be decoded can then be reconstructed from the predicted image and the residual values.
  • the first code stream information is used to represent the motion information unit corresponding to the first pixel sample and the second pixel sample respectively.
  • The parsed first code stream information is an index value. The first code stream information may indicate the motion information units corresponding to the first pixel sample and the second pixel sample separately, or may jointly indicate a combination of the motion information units corresponding to the first pixel sample and the second pixel sample; this is not limited.
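  • The two signalling options for the index value can be illustrated as follows. This is only a sketch: the function names are hypothetical, and enumerating combinations in Cartesian-product order is an assumption made for illustration, since the text does not define how a joint index maps to a combination.

```python
from itertools import product


def units_from_separate_indices(candidate_sets, indices):
    """One index per pixel sample, each selecting a unit from that
    sample's own candidate motion information unit set."""
    return [cands[i] for cands, i in zip(candidate_sets, indices)]


def units_from_joint_index(candidate_sets, joint_index):
    """A single index selecting one combination of units. Combinations are
    enumerated in Cartesian-product order (hypothetical ordering)."""
    combinations = list(product(*candidate_sets))
    return list(combinations[joint_index])
```

  • With candidate sets of sizes 2 and 3, separate signalling needs two indices, while joint signalling needs one index in the range 0 to 5.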
  • the predicted motion information refers to a predicted value of the motion information.
  • the predicted motion information refers to a predicted value of the motion vector.
  • the predicted value of the motion vector in the video codec field is generally derived from the motion information unit corresponding to the current image block, that is, the motion vector of the predicted image block.
  • the step specifically includes:
  • S4021 Determine a candidate motion information unit set corresponding to the first pixel sample and the second pixel sample, where any candidate motion information unit set includes at least one motion information unit.
  • S101 summarizes the method in general, and S301-S302 summarize it in combination with the decoding device. For the description and exemplary implementations of S4021, refer to S101 and S301-S302; details are not repeated here.
  • S4022 Determine a merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is selected from at least some of the motion information units in the candidate motion information unit set corresponding to the first pixel sample or the second pixel sample, respectively, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward.
  • S102 summarizes the method in general, and S303 summarizes it in combination with the decoding device. For the description and exemplary implementations of S4022, refer to S102 and S303; details are not repeated here.
  • S304 summarizes the method in combination with the decoding device. For the description and exemplary implementations of S4023, refer to S304; details are not repeated here.
  • S4024 Use motion information of the motion information unit corresponding to the first pixel sample as motion information of the first pixel sample.
  • the motion information unit indicated by the first code stream information that is, the motion vector of the predicted image block corresponding to the first pixel sample, is used as the motion vector of the first pixel sample.
  • S4025 Use motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample.
  • the motion information unit indicated by the first code stream information that is, the motion vector of the predicted image block corresponding to the second pixel sample, is used as the predicted motion vector of the second pixel sample.
  • the second code stream information is used to characterize difference motion information of the second pixel sample, the difference motion information being a difference between the motion information and the predicted motion information.
  • the second code stream information is used to represent a residual value between a motion vector of the second pixel sample and the predicted motion vector. It should be understood that the motion vector of each second pixel sample corresponds to a residual value, and the residual value may be zero.
  • the parsed second code stream information may include a residual value of a motion vector of each second pixel sample, and may also include a set of residual values of motion vectors of all second pixel samples, which are not limited.
  • This step specifically includes:
  • The residual value of the motion vector of the second pixel sample, obtained by parsing the second code stream information, is added to the corresponding predicted motion vector to obtain the motion vector of the second pixel sample.
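  • The different treatment of the two sample types at the decoder can be sketched as below. The function name and the tuple representation of motion vectors are hypothetical; the arithmetic follows the description above.

```python
def decode_sample_mvs(first_pred_mv, second_pred_mv, second_mvd):
    """Return (first_mv, second_mv).

    The first pixel sample takes its predictor as its motion vector, so no
    residual is parsed for it. The second pixel sample adds the parsed
    residual (second_mvd) to its predictor."""
    first_mv = first_pred_mv
    second_mv = (
        second_pred_mv[0] + second_mvd[0],
        second_pred_mv[1] + second_mvd[1],
    )
    return first_mv, second_mv
```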
  • The motion model of the current image block may be an affine motion model or another translational or non-translational motion model; the affine motion model may be a four-parameter affine motion model or another affine motion model such as a six-parameter one. This is not limited.
  • the motion model includes:
  • Here, the motion vectors of the first pixel sample and the second pixel sample are (vx0, vy0) and (vx1, vy1), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x, y) in the current image block, vy is the vertical component of the motion vector of that pixel sample, and w is the length or width of the current image block.
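  • The model formula itself is not reproduced in this text. The sketch below assumes the widely used four-parameter form, with the first pixel sample at the upper-left vertex (0, 0) and the second at the upper-right vertex (w, 0); both the form and the vertex choice are assumptions. The names `affine_mv` and `mv_field`, and the choice to evaluate each sub-block at its centre, are hypothetical.

```python
def affine_mv(x, y, mv0, mv1, w):
    """Motion vector at (x, y) under an assumed four-parameter affine model
    with control motion vectors mv0 at (0, 0) and mv1 at (w, 0)."""
    vx0, vy0 = mv0
    vx1, vy1 = mv1
    a = (vx1 - vx0) / w  # combined zoom/rotation term along x
    b = (vy1 - vy0) / w  # combined zoom/rotation term along y
    return (a * x - b * y + vx0, b * x + a * y + vy0)


def mv_field(mv0, mv1, w, h, block=1):
    """One motion vector per block x block sub-block, evaluated at the
    sub-block centre. block=1 gives one motion vector per pixel."""
    offset = (block - 1) / 2
    return [
        [affine_mv(x + offset, y + offset, mv0, mv1, w)
         for x in range(0, w, block)]
        for y in range(0, h, block)
    ]
```

  • Calling `mv_field(mv0, mv1, w, h)` yields a per-pixel field, while `mv_field(mv0, mv1, w, h, block=4)` yields one vector per 4 × 4 pixel block, which trades accuracy for fewer evaluations.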
  • the motion model also includes:
  • Here, the motion vectors of the first pixel sample and any two second pixel samples, or of the second pixel sample and any two first pixel samples, are (vx0, vy0), (vx1, vy1), and (vx2, vy2), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x, y) in the current image block, vy is the vertical component of the motion vector of that pixel sample, and w is the length or width of the current image block.
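  • The formula for this second model is also missing from the text. One common six-parameter form uses three motion vectors (vx0, vy0), (vx1, vy1), and (vx2, vy2), and is shown below under the assumption that the three pixel samples sit at (0, 0), (w, 0), and (0, h). The symbol h (block height) is introduced here for illustration and does not appear in the text; for a square block, h = w. The patent's own formula may differ.

```latex
\begin{aligned}
vx &= \frac{vx_1 - vx_0}{w}\,x + \frac{vx_2 - vx_0}{h}\,y + vx_0 \\
vy &= \frac{vy_1 - vy_0}{w}\,x + \frac{vy_2 - vy_0}{h}\,y + vy_0
\end{aligned}
```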
  • This step specifically includes:
  • In one feasible implementation, the motion vector of each pixel in the current image block is calculated by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and the predicted pixel value of each pixel in the current image block is determined by using the calculated motion vector of that pixel. In another feasible implementation, the motion vector of each pixel block in the current image block is calculated by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and the predicted pixel value of each pixel in each pixel block is determined by using the calculated motion vector of that pixel block.
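  • Once a motion vector is available for each pixel or pixel block, the predicted pixel values can be fetched from the reference frame. The sketch below is deliberately simplified: it rounds motion vectors to whole pixels and clamps coordinates at the frame border, whereas practical codecs use sub-pixel interpolation filters. The function name, the sign convention (reference position = current position + motion vector), and the data layout are assumptions.

```python
def predict_block(ref, x0, y0, mv_field, block=1):
    """ref: 2-D list of reference-frame samples.
    (x0, y0): top-left position of the current image block.
    mv_field[r][c]: motion vector (vx, vy), in pixel units, of the
    sub-block in row r, column c; each sub-block is block x block pixels."""
    height, width = len(ref), len(ref[0])
    rows = len(mv_field) * block
    cols = len(mv_field[0]) * block
    pred = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            vx, vy = mv_field[y // block][x // block]
            # Nearest-integer displacement, clamped to the frame area.
            rx = min(max(x0 + x + round(vx), 0), width - 1)
            ry = min(max(y0 + y + round(vy), 0), height - 1)
            pred[y][x] = ref[ry][rx]
    return pred
```

  • With `block=1` every pixel uses its own motion vector; with a larger `block` all pixels of a sub-block share one motion vector, matching the two implementations described above.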
  • S305 summarizes the method in combination with the decoding device. For the description and exemplary implementations of S405, refer to S305; details are not repeated here.
  • the method further includes: decoding the code stream to obtain residual information of the current image block, and reconstructing the current image block to be decoded according to the residual information and the predicted image.
  • S306-S307 summarize the method in combination with the decoding device. For the description and exemplary implementations of this step, refer to S306-S307; details are not repeated here.
  • In this embodiment of the present invention, obtaining the motion information of the first pixel sample only requires taking its corresponding predicted motion information as its motion information; the code stream does not need to be parsed further to obtain a residual value of the predicted motion information. This saves the bits that would otherwise be needed to transmit that residual value, reducing bit consumption and improving efficiency.
  • FIG. 5 is a schematic flowchart diagram of still another image prediction method according to an embodiment of the present invention.
  • An image prediction method according to a fifth embodiment of the present invention is provided for an encoding device, where any image block includes at least one pixel sample of a first type and at least one pixel sample of a second type. Without loss of generality, let the first type of pixel sample include the first pixel sample, and the second type of pixel sample include the second pixel sample.
  • The difference between the two types is that the motion information of a first-type sample comes only from its corresponding motion information unit, whereas the motion information of a second-type sample comes only partly from its corresponding motion information unit. The method may include:
  • S101 summarizes the method in general, and S201-S202 summarize it in combination with the encoding device. For the description and exemplary implementations of S501, refer to S101 and S201-S202; details are not repeated here.
  • Each motion information unit in the merged motion information unit set is selected from at least some of the motion information units in the candidate motion information unit set corresponding to the first pixel sample or the second pixel sample, respectively.
  • The motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward.
  • S102 summarizes the method in general, and S203 summarizes it in combination with the encoding device. For the description and exemplary implementations of S502, refer to S102 and S203; details are not repeated here.
  • S204 summarizes the method in combination with the encoding device. For the description and exemplary implementations of S503, refer to S204; details are not repeated here.
  • the first code stream information is used to represent the motion information unit corresponding to the first pixel sample and the second pixel sample respectively.
  • The first code stream information is an index value. It may indicate the motion information units corresponding to the first pixel sample and the second pixel sample separately, or may jointly indicate a combination of the motion information units corresponding to the first pixel sample and the second pixel sample; this is not limited. It should be understood that the position at which the first code stream information is encoded in the code stream in this step needs to correspond to the position at which it is parsed from the code stream in the related decoder-side step (for example, step S401 in the fourth embodiment of the present invention).
  • the motion information unit indicated by the first code stream information that is, the motion vector of the predicted image block corresponding to the first pixel sample, is used as the motion vector of the first pixel sample.
  • the motion information unit indicated by the first code stream information that is, the motion vector of the predicted image block corresponding to the second pixel sample, is used as the predicted motion vector of the second pixel sample.
  • the second code stream information is used to represent a residual value between a motion vector of the second pixel sample and the predicted motion vector. It should be understood that the motion vector of each second pixel sample corresponds to a residual value, and the residual value may be zero.
  • the parsed second code stream information may include a residual value of a motion vector of each second pixel sample, and may also include a set of residual values of motion vectors of all second pixel samples, which are not limited.
  • The corresponding predicted motion vector is subtracted from the motion vector of the second pixel sample to obtain the residual value of the motion vector of the second pixel sample.
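  • The encoder-side residual and its inverse at the decoder can be sketched as follows. The function names are hypothetical; the point is that the residual is the motion vector minus its predictor, so that adding the residual back to the predictor at the decoder recovers the motion vector exactly.

```python
def encode_mvd(mv, pred_mv):
    """Encoder side: residual = motion vector - predicted motion vector."""
    return (mv[0] - pred_mv[0], mv[1] - pred_mv[1])


def decode_mv(pred_mv, mvd):
    """Decoder side: motion vector = predicted motion vector + residual."""
    return (pred_mv[0] + mvd[0], pred_mv[1] + mvd[1])
```

  • A residual of (0, 0) is valid and simply means that the predictor was exact.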
  • The second code stream information is used to represent the difference motion information of the second pixel sample. It should be understood that the position at which the second code stream information is encoded in the code stream in this step needs to correspond to the position at which it is parsed from the code stream in the related decoder-side step (for example, step S403 in the fourth embodiment of the present invention).
  • steps S504 to S508 have no limitation on the order relationship, and may be performed in parallel.
  • S405 summarizes the method in combination with the decoding device. For the description and exemplary implementations of S509, refer to S405; details are not repeated here.
  • In this embodiment of the present invention, obtaining the motion information of the first pixel sample only requires taking its corresponding predicted motion information as its motion information; no residual value of the predicted motion information needs to be encoded into the code stream. This saves the bits that would otherwise be needed to transmit that residual value, reducing bit consumption and improving coding efficiency.
  • a sixth embodiment of the present invention further provides an image prediction apparatus 600, which may include:
  • a first parsing unit 601 configured to parse the first code stream information, where the first code stream information is used to indicate a motion information unit corresponding to each of the first pixel samples and each of the second pixel samples;
  • a first obtaining unit 602, configured to acquire, according to the parsed first code stream information, motion information of each of the first pixel samples and predicted motion information of each of the second pixel samples, where the predicted motion information is prediction information of the motion information;
  • a second parsing unit 603, configured to parse the second code stream information, where the second code stream information is used to represent the difference motion information of each of the second pixel samples, and the difference motion information is the difference between the motion information and the predicted motion information;
  • a second acquiring unit 604, configured to acquire motion information of each of the second pixel samples according to the parsed second code stream information and the predicted motion information of each of the second pixel samples;
  • a third obtaining unit 605, configured to obtain a predicted value of the current image block according to a motion model of the current image block, motion information of each of the first pixel samples, and motion information of each of the second pixel samples.
  • the image prediction apparatus 600 in this embodiment may be used to perform the method and various exemplary embodiments described in Embodiment 4 of the present invention.
  • the image prediction device 600 can be any device that needs to output and play video, such as a notebook computer, a tablet computer, a personal computer, a mobile phone, and the like.
  • FIG. 7 is a schematic diagram of an image prediction apparatus 700 according to a seventh embodiment of the present invention.
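  • The image prediction apparatus 700 may include at least one bus 701, at least one processor 702 connected to the bus 701, and at least one memory 703 connected to the bus 701.
  • The processor 702 calls, through the bus 701, code or instructions stored in the memory 703 to: parse the first code stream information, where the first code stream information is used to indicate the motion information unit corresponding to each of the first pixel samples and each of the second pixel samples; acquire, according to the parsed first code stream information, motion information of each of the first pixel samples and predicted motion information of each of the second pixel samples, where the predicted motion information is prediction information of the motion information; parse the second code stream information, where the second code stream information is used to represent the difference motion information of each of the second pixel samples, and the difference motion information is the difference between the motion information and the predicted motion information; acquire motion information of each of the second pixel samples according to the parsed second code stream information and the corresponding predicted motion information of each of the second pixel samples; and obtain a predicted value of the current image block according to the motion model of the current image block, the motion information of each of the first pixel samples, and the motion information of each of the second pixel samples.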
  • the image prediction apparatus 700 in this embodiment may be used to perform the method and the exemplary embodiments described in Embodiment 4 of the present invention.
  • the beneficial effects can be referred to the beneficial effects in the fourth embodiment of the present invention, and details are not described herein.
  • the image prediction device 700 can be any device that needs to output and play video, such as a notebook computer, a tablet computer, a personal computer, a mobile phone, and the like.
  • The eighth embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program, when executed, includes some or all of the steps of any one of the image prediction methods described in the foregoing method embodiments.
  • a ninth embodiment of the present invention further provides an image prediction apparatus 800, which may include:
  • a first determining unit 801 configured to determine a candidate motion information unit set corresponding to each of the first pixel samples and each of the second pixel samples, where any one of the candidate motion information unit sets includes at least one motion information unit;
  • a second determining unit 802, configured to determine a merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the first pixel samples or each of the second pixel samples, respectively, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
  • a third determining unit 803, configured to determine, from the combined motion information unit set, a motion information unit corresponding to each of the first pixel samples and each of the second pixel samples;
  • a first encoding unit 804 configured to encode first code stream information, where the first code stream information is used to represent each of the first pixel samples and each of the second pixels determined by the combined motion information unit set The motion information unit corresponding to the sample;
  • a first assignment unit 805, configured to use motion information of a motion information unit corresponding to the first pixel sample as motion information of the first pixel sample;
  • a second assignment unit 806, configured to use motion information of a motion information unit corresponding to the second pixel sample as predicted motion information of the second pixel sample;
  • a calculating unit 807 configured to calculate difference motion information of the second pixel sample, where the difference motion information is a difference between the motion information and the predicted motion information;
  • a second encoding unit 808, configured to encode second code stream information, where the second code stream information is used to represent difference motion information of each of the second pixel samples;
  • the obtaining unit 809 is configured to obtain a predicted value of the current image block according to a motion model of the current image block, motion information of each of the first pixel samples, and motion information of each of the second pixel samples.
  • the image prediction apparatus 800 in this embodiment may be used to perform the method and various exemplary embodiments described in Embodiment 5 of the present invention.
  • the image prediction device 800 can be any device that needs to output and play video, such as a notebook computer, a tablet computer, a personal computer, a mobile phone, and the like.
  • FIG. 9 is a schematic diagram of an image prediction apparatus 900 according to a tenth embodiment of the present invention.
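  • The image prediction apparatus 900 may include at least one bus 901, at least one processor 902 connected to the bus 901, and at least one memory 903 connected to the bus 901.
  • The processor 902 calls, through the bus 901, code or instructions stored in the memory 903 to: determine a candidate motion information unit set corresponding to each of the first pixel samples and each of the second pixel samples, where any one of the candidate motion information unit sets includes at least one motion information unit; determine a merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the first pixel samples or each of the second pixel samples, respectively, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; determine, from the merged motion information unit set, the motion information unit corresponding to each of the first pixel samples and each of the second pixel samples; encode the first code stream information, where the first code stream information is used to represent the motion information unit, determined from the merged motion information unit set, corresponding to each of the first pixel samples and each of the second pixel samples; use the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample; use the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample; calculate the difference motion information of the second pixel sample, where the difference motion information is the difference between the motion information and the predicted motion information; encode the second code stream information, where the second code stream information is used to represent the difference motion information of each of the second pixel samples; and obtain a predicted value of the current image block according to the motion model of the current image block, the motion information of each of the first pixel samples, and the motion information of each of the second pixel samples.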
  • the image prediction apparatus 900 in this embodiment may be used to perform the method and various exemplary embodiments in the fifth embodiment of the present invention.
  • the beneficial effects can be referred to the beneficial effects in the fifth embodiment of the present invention, and details are not described herein.
  • the image prediction device 900 can be any device that needs to output and play video, such as a notebook computer, a tablet computer, a personal computer, a mobile phone, and the like.
  • An eleventh embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium may store a program, where the program includes some or all of the steps of any one of the image prediction methods described in the foregoing method embodiments.
  • the disclosed apparatus may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • The division of the above units is only a division by logical function. In actual implementation, other divisions are possible; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical or otherwise.
  • the units described above as separate components may or may not be physically separated.
  • the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the above-described integrated unit if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium.
  • the instructions include a plurality of instructions for causing a computer device (which may be a personal computer, server or network device, etc., and in particular a processor in a computer device) to perform all or part of the steps of the above-described methods of various embodiments of the present invention.
  • the foregoing storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

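An image prediction method and a related device. The method includes: parsing first bitstream information, which indicates motion information units, to obtain the motion information of each first pixel sample and the predicted motion information of each second pixel sample; parsing second bitstream information, which represents the differential motion information of each second pixel sample, to obtain the motion information of each second pixel sample; and obtaining a predicted value of the current image block according to the motion model of the current image block and the motion information of the first pixel samples and the second pixel samples.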

Description

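Image Prediction Method and Related Device
Technical Field
The present invention relates to the field of video encoding and decoding, and in particular to an image prediction method and a related device.
Background
With the development of photoelectric acquisition technology and the growing demand for high-definition digital video, the volume of video data keeps increasing. Limited, heterogeneous transmission bandwidth and diversified video applications continually place higher demands on video coding efficiency, and work on the High Efficiency Video Coding (HEVC) standard was started in response.
The basic principle of video compression coding is to exploit spatial, temporal, and codeword correlation to remove as much redundancy as possible. The prevailing approach is a block-based hybrid video coding framework, which achieves compression through prediction (both intra-frame and inter-frame prediction), transform, quantization, and entropy coding. This framework has proved highly durable, and HEVC still uses it.
In video encoding and decoding schemes, motion estimation/motion compensation is a key technique affecting codec performance. Existing schemes assume that an object's motion is always translational, so that every part of the object moves identically. Existing motion estimation/motion compensation algorithms are essentially block motion compensation algorithms built on a translational motion model. Real-world motion, however, is diverse: irregular motion such as scaling, rotation, and parabolic motion is ubiquitous. Since the 1990s, video coding experts have recognized how common irregular motion is and have sought to improve coding efficiency by introducing irregular motion models (such as the affine motion model), but existing image prediction based on an affine motion model usually has very high computational complexity.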
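Summary
Embodiments of the present invention provide an image prediction method and a related device, with the aim of reducing the computational complexity of image prediction based on an affine motion model and improving coding efficiency.
A first aspect of the embodiments of the present invention provides an image prediction method, where a current image block includes at least one first pixel sample and at least one second pixel sample, and the method includes: parsing first bitstream information, where the first bitstream information indicates the motion information unit corresponding to each first pixel sample and each second pixel sample; obtaining, according to the parsed first bitstream information, the motion information of each first pixel sample and the predicted motion information of each second pixel sample, where the predicted motion information is a prediction of the motion information; parsing second bitstream information, where the second bitstream information represents the differential motion information of each second pixel sample, and the differential motion information is the difference between the motion information and the predicted motion information; obtaining the motion information of each second pixel sample according to the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample; and obtaining a predicted value of the current image block according to the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
It can be seen that, when obtaining the motion information of a first pixel sample, the embodiment only needs to obtain the corresponding predicted motion information and use it as the motion information; it does not need to parse the bitstream further to obtain a residual for the predicted motion information. This saves the bits that would otherwise be spent transmitting that residual, reduces bit consumption, and improves efficiency.
In a possible implementation of the first aspect, the first bitstream information includes an index, and the index indicates the motion information unit corresponding to each first pixel sample and each second pixel sample.
In a possible implementation of the first aspect, the second bitstream information includes a difference, and the difference is the motion vector residual between the motion vector and the predicted motion vector of any second pixel sample.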
在第一方面的一种可实施方式中,所述根据解析后的所述第一码流信息,获取每个所述第一像素样本的运动信息和每个所述第二像素样本的预测运动信息,包括:确定每个所述第一像素样本和每个所述第二像素样本对应的候选运动信息单元集,其中,任一所述候选运动信息单元集包括至少一个运动信息单元;确定所述当前块的合并运动信息单元集,其中,所述合并运动信息单元集中的每个运动信息单元分别为每个所述第一像素样本和每个所述第二像素样本中对应的候选运动信息单元集中的至少部分运动信息单元,其中,所述运动信息单元的运动信息包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量;根据所述解析后的第一码流信息从所述合并运动信息单元集中确定每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;使用 所述第一像素样本对应的运动信息单元的运动信息作为所述第一像素样本的运动信息;使用所述第二像素样本对应的运动信息单元的运动信息作为所述第二像素样本的预测运动信息。
在第一方面的一种可实施方式中,所述确定所述当前块的合并运动信息单元集,包括:从N个候选合并运动信息单元集中确定出包含每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元的合并运动信息单元集,其中,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集所包含的每个运动信息单元,分别选自每个所述第一像素样本和每个所述第二像素样本对应的候选运动信息单元集中的符合约束条件的至少部分运动信息单元,其中,所述N为正整数,所述N个候选合并运动信息单元集互不相同。
在第一方面的一种可实施方式中,所述N个候选合并运动信息单元集满足第一条件、第二条件、第三条件、第四条件和第五条件之中的至少一个条件,其中,所述第一条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为非平动运动;所述第二条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元对应的预测方向相同;所述第三条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元对应的参考帧索引相同;所述第四条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元的运动矢量水平分量的差值的绝对值小于或等于水平分量阈值,或者,所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的其中一个运动信息单元和像素样本Z的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,所述当前图像块的所述像素样本Z不同于所述第一像素样本和所述第二像素样本中的任意一个像素样本;所述第五条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元的运动矢量竖直分量的差值的绝对值小于或等于竖直分量阈值,或者,所述N个候选合并运动信息单元集中的其中一个候选合并运动信息单元集中的任意一个运动信息单元和像素样本Z的运动矢 量竖直分量之间的差值的绝对值小于或等于竖直分量阈值,所述当前图像块的所述像素样本Z不同于所述第一像素样本和所述第二像素样本中的任意一个像素样本。
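In a possible implementation of the first aspect, obtaining the motion information of each second pixel sample according to the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample includes: obtaining the differential motion information of each second pixel sample according to the parsed second bitstream information; and adding the differential motion information of each second pixel sample to the corresponding predicted motion information to obtain the motion information of each second pixel sample.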
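The decoder-side rule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `MotionVector` type and both function names are hypothetical, and motion information is reduced to a single integer motion vector per sample.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MotionVector:
    x: int  # horizontal component
    y: int  # vertical component


def first_sample_motion(predicted: MotionVector) -> MotionVector:
    """First pixel sample: the predictor is used directly, no residual is parsed."""
    return predicted


def second_sample_motion(predicted: MotionVector, diff: MotionVector) -> MotionVector:
    """Second pixel sample: predictor plus the parsed differential motion information."""
    return MotionVector(predicted.x + diff.x, predicted.y + diff.y)


def reconstruct_all(first_preds: List[MotionVector],
                    second_preds: List[MotionVector],
                    second_diffs: List[MotionVector]) -> List[MotionVector]:
    """Return the motion vectors of all first samples followed by all second samples."""
    firsts = [first_sample_motion(p) for p in first_preds]
    seconds = [second_sample_motion(p, d) for p, d in zip(second_preds, second_diffs)]
    return firsts + seconds
```

Only the second samples consume bits for a residual, which is where the claimed bit saving comes from.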
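In a possible implementation of the first aspect, the motion model is a non-translational motion model, specifically an affine motion model of the following form: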
Figure PCTCN2016083203-appb-000001
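where the motion vectors of the first pixel sample and the second pixel sample are (vx0, vy0) and (vx1, vy1) respectively, vx is the horizontal component of the motion vector of the pixel sample at coordinates (x, y) in the current image block, vy is the vertical component of the motion vector of that pixel sample, and w is the length or width of the current image block. Correspondingly, obtaining the predicted value of the current image block according to the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample includes: computing the motion vector of each pixel in the current image block using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determining the predicted pixel value of each pixel in the current image block from the computed motion vectors; or computing the motion vector of each pixel block in the current image block using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determining the predicted pixel value of each pixel of each pixel block in the current image block from the computed motion vectors of the pixel blocks.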
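A per-pixel evaluation of such a two-sample affine model might look like the sketch below. It assumes the common four-parameter form in which the first sample sits at (0, 0) and the second at (w, 0), with the coefficient symmetry the document describes for the model vx = ax + by, vy = -bx + ay; the function name `affine_mv` is hypothetical.

```python
from typing import Tuple

MV = Tuple[float, float]


def affine_mv(x: float, y: float, mv0: MV, mv1: MV, w: float) -> MV:
    """Motion vector at (x, y) from two control samples.

    Assumption: mv0 belongs to the sample at (0, 0) and mv1 to the sample
    at (w, 0); the model is vx = a*x + b*y + vx0, vy = -b*x + a*y + vy0.
    """
    vx0, vy0 = mv0
    vx1, vy1 = mv1
    a = (vx1 - vx0) / w      # horizontal coefficient of vx, vertical coefficient of vy
    b = -(vy1 - vy0) / w     # vertical coefficient of vx; -b is the horizontal coefficient of vy
    vx = a * x + b * y + vx0
    vy = -b * x + a * y + vy0
    return vx, vy
```

Evaluating the function at the two control positions returns the two control motion vectors, which is a quick sanity check on the coefficients.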
在第一方面的一种可实施方式中,所述运动模型为非平动运动模型,具体包括:所述非平动运动模型为如下形式的仿射运动模型:
Figure PCTCN2016083203-appb-000002
其中,任意一个所述第一像素样本和任意两个所述第二像素样本的运动矢量,或者,任意两个所述第一像素样本和任意一个所述第二像素样本的运动矢量,分别为(vx0,vy0),(vx1,vy1)和(vx2,vy2),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽;对应的,所述根据所述当前图像块的运动模型、每个所述第一像素样本的运动信息和每个所述第二像素样本的运动信息,获得所述当前图像块的预测值,包括:利用所述仿射运动模型、所述第一像素样本和所述第二像素样本的运动矢量计算得到所述当前图像块中的各像素点的运动矢量,利用计算得到的所述当前图像块中的各像素点的运动矢量确定所述当前图像块中的各像素点的预测像素值;或者,利用所述仿射运动模型、所述第一像素样本和所述第二像素样本的运动矢量计算得到所述当前图像块中的各像素块的运动矢量,利用计算得到的所述当前图像块中的各像素块的运动矢量确定所述当前图像块中的各像素块的各像素点的预测像素值。
在第一方面的一种可实施方式中,所述至少一个第一像素样本和至少一个第二像素样本包括所述当前图像块的左上像素样本、右上像素样本、左下像素样本和中心像素样本a1中的其中两个像素样本;其中,所述当前图像块的左上像素样本为所述当前图像块的左上顶点或所述当前图像块中的包含所述当前图像块的左上顶点的像素块;所述当前图像块的左下像素样本为所述当前图像块的左下顶点或所述当前图像块中的包含所述当前图像块的左下顶点的像素块;所述当前图像块的右上像素样本为所述当前图像块的右上顶点或所述当前图像块中的包含所述当前图像块的右上顶点的像素块;所述当前图像块的中心素样本a1为所述当前图像块的中心像素点或所述当前图像块中的包含所述当前图像块的中心像素点的像素块。
在第一方面的一种可实施方式中,所述当前图像块的左上像素样本所对应 的候选运动信息单元集包括x1个像素样本的运动信息单元,其中,所述x1个像素样本包括至少一个与所述当前图像块的左上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左上像素样本时域相邻的像素样本,所述x1为正整数;其中,所述x1个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
在第一方面的一种可实施方式中,所述当前图像块的右上像素样本所对应的候选运动信息单元集包括x2个像素样本的运动信息单元,其中,所述x2个像素样本包括至少一个与所述当前图像块的右上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的右上像素样本时域相邻的像素样本,所述x2为正整数;其中,所述x2个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本位置相同的像素样本、所述当前图像块的右边的空域相邻像素样本、所述当前图像块的右上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
在第一方面的一种可实施方式中,所述当前图像块的左下像素样本所对应的候选运动信息单元集包括x3个像素样本的运动信息单元,其中,所述x3个像素样本包括至少一个与所述当前图像块的左下像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左下像素样本时域相邻的像素样本,所述x3为正整数;其中,所述x3个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左下的空域相邻像素样本和所述当前图像块的下边的空域相邻像素样本中的至少一个。
在第一方面的一种可实施方式中,所述当前图像块的中心像素样本a1所对应的候选运动信息单元集包括x5个像素样本的运动信息单元,其中,所述x5个像素样本中的其中一个像素样本为像素样本a2,其中,所述中心像素样本a1在所述当前图像块所属视频帧中的位置,与所述像素样本a2在所述当前图像块所属视频帧的相邻视频帧中的位置相同,所述x5为正整数。
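A second aspect of the embodiments of the present invention provides an image prediction method, where a current image block includes at least one first pixel sample and at least one second pixel sample, and the method includes: determining the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, where any candidate motion information unit set includes at least one motion information unit; determining a merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; determining, from the merged motion information unit set, the motion information unit corresponding to each first pixel sample and each second pixel sample; encoding first bitstream information, where the first bitstream information represents the motion information unit determined from the merged motion information unit set for each first pixel sample and each second pixel sample; using the motion information of the motion information unit corresponding to a first pixel sample as the motion information of that first pixel sample; using the motion information of the motion information unit corresponding to a second pixel sample as the predicted motion information of that second pixel sample; computing the differential motion information of the second pixel sample, where the differential motion information is the difference between the motion information and the predicted motion information; encoding second bitstream information, where the second bitstream information represents the differential motion information of each second pixel sample; and obtaining a predicted value of the current image block according to the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
It can be seen that, when obtaining the motion information of a first pixel sample, the embodiment only needs to obtain the corresponding predicted motion information and use it as the motion information; it does not need to encode a residual for the predicted motion information into the bitstream. This saves the bits that would otherwise be spent transmitting that residual, reduces bit consumption, and improves coding efficiency.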
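On the encoder side, the differential motion information for a second pixel sample is simply the actual motion vector minus its predictor. The sketch below pairs that with the matching decoder step to show the round trip; both function names are hypothetical, and motion vectors are plain integer tuples for illustration.

```python
from typing import Tuple

MV = Tuple[int, int]


def encode_second_sample(actual_mv: MV, predicted_mv: MV) -> MV:
    """Differential motion information written to the second bitstream information."""
    return (actual_mv[0] - predicted_mv[0], actual_mv[1] - predicted_mv[1])


def decode_second_sample(predicted_mv: MV, diff: MV) -> MV:
    """Decoder-side inverse: predictor plus the transmitted difference."""
    return (predicted_mv[0] + diff[0], predicted_mv[1] + diff[1])
```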
在本发明实施例的第三个方面,提出了一种图像预测装置,其特征在于,当前图像块包括至少一个第一像素样本和至少一个第二像素样本,所述装置包括:第一解析单元,用于解析第一码流信息,所述第一码流信息用于指示每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;第一获取单元,用于根据解析后的所述第一码流信息,获取每个所述第一像素样本的运动信息和每个所述第二像素样本的预测运动信息,所述预测运动信息为运动信息 的预测信息;第二解析单元,用于解析第二码流信息,所述第二码流信息用于表征每个所述第二像素样本的差异运动信息,所述差异运动信息为运动信息和预测运动信息的差异;第二获取单元,用于根据解析后的所述第二码流信息和对应的每个所述第二像素样本的预测运动信息,获取每个所述第二像素样本的运动信息;第三获取单元,用于根据所述当前图像块的运动模型、每个所述第一像素样本的运动信息和每个所述第二像素样本的运动信息,获得所述当前图像块的预测值。
在第三方面的一种可实施方式中,所述第一码流信息包括索引,所述索引用于指示每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元。
在第三方面的一种可实施方式中,所述第二码流信息包括差值,所述差值为任一所述第二像素样本的运动矢量和预测运动矢量间的运动矢量残差。
在第三方面的一种可实施方式中,所述第一获取单元具体用于:确定每个所述第一像素样本和每个所述第二像素样本对应的候选运动信息单元集,其中,任一所述候选运动信息单元集包括至少一个运动信息单元;确定所述当前块的合并运动信息单元集,其中,所述合并运动信息单元集中的每个运动信息单元分别为每个所述第一像素样本和每个所述第二像素样本中对应的候选运动信息单元集中的至少部分运动信息单元,其中,所述运动信息单元的运动信息包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量;根据所述解析后的第一码流信息从所述合并运动信息单元集中确定每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;使用所述第一像素样本对应的运动信息单元的运动信息作为所述第一像素样本的运动信息;使用所述第二像素样本对应的运动信息单元的运动信息作为所述第二像素样本的预测运动信息。
在第三方面的一种可实施方式中,所述第一获取单元具体用于:从N个候选合并运动信息单元集中确定出包含每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元的合并运动信息单元集,其中,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集所包含的每个运动信息 单元,分别选自每个所述第一像素样本和每个所述第二像素样本对应的候选运动信息单元集中的符合约束条件的至少部分运动信息单元,其中,所述N为正整数,所述N个候选合并运动信息单元集互不相同。
在第三方面的一种可实施方式中,所述N个候选合并运动信息单元集满足第一条件、第二条件、第三条件、第四条件和第五条件之中的至少一个条件,其中,所述第一条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为非平动运动;所述第二条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元对应的预测方向相同;所述第三条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元对应的参考帧索引相同;所述第四条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元的运动矢量水平分量的差值的绝对值小于或等于水平分量阈值,或者,所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的其中一个运动信息单元和像素样本Z的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,所述当前图像块的所述像素样本Z不同于所述第一像素样本和所述第二像素样本中的任意一个像素样本;所述第五条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的两个运动信息单元的运动矢量竖直分量的差值的绝对值小于或等于竖直分量阈值,或者,所述N个候选合并运动信息单元集中的其中一个候选合并运动信息单元集中的任意一个运动信息单元和像素样本Z的运动矢量竖直分量之间的差值的绝对值小于或等于竖直分量阈值,所述当前图像块的所述像素样本Z不同于所述第一像素样本和所述第二像素样本中的任意一个像素样本。
在第三方面的一种可实施方式中,第二获取单元具体用于:根据解析后的所述第二码流信息,获得每个所述第二像素样本的差异运动信息;将每个所述第二像素样本的差异运动信息和对应的所述预测运动信息相加,获得每个所述第二像素样本的运动信息。
在第三方面的一种可实施方式中,所述运动模型为非平动运动模型,具体包括:所述非平动运动模型为如下形式的仿射运动模型:
Figure PCTCN2016083203-appb-000003
其中,所述第一像素样本和所述第二像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽;对应的,所述第三获取单元具体用于:利用所述仿射运动模型、所述第一像素样本和所述第二像素样本的运动矢量计算得到所述当前图像块中的各像素点的运动矢量,利用计算得到的所述当前图像块中的各像素点的运动矢量确定所述当前图像块中的各像素点的预测像素值;或者,利用所述仿射运动模型、所述第一像素样本和所述第二像素样本的运动矢量计算得到所述当前图像块中的各像素块的运动矢量,利用计算得到的所述当前图像块中的各像素块的运动矢量确定所述当前图像块中的各像素块的各像素点的预测像素值。
在第三方面的一种可实施方式中,所述运动模型为非平动运动模型,具体包括:所述非平动运动模型为如下形式的仿射运动模型:
Figure PCTCN2016083203-appb-000004
其中,任意一个所述第一像素样本和任意两个所述第二像素样本的运动矢量,或者,任意两个所述第一像素样本和任意一个所述第二像素样本的运动矢量,分别为(vx0,vy0),(vx1,vy1)和(vx2,vy2),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽;对应的,所述第三获取单元具体用于:利用所述仿射运动模型、所述第一像素样本和所述第二像素样本的运动矢量计算得到所述当前图像块中的各像素点的运动矢量,利用计算得到的所述当前图像块中的各像素点的运动矢量确定所述 当前图像块中的各像素点的预测像素值;或者,利用所述仿射运动模型、所述第一像素样本和所述第二像素样本的运动矢量计算得到所述当前图像块中的各像素块的运动矢量,利用计算得到的所述当前图像块中的各像素块的运动矢量确定所述当前图像块中的各像素块的各像素点的预测像素值。
在第三方面的一种可实施方式中,所述至少一个第一像素样本和至少一个第二像素样本包括所述当前图像块的左上像素样本、右上像素样本、左下像素样本和中心像素样本a1中的其中两个像素样本;其中,所述当前图像块的左上像素样本为所述当前图像块的左上顶点或所述当前图像块中的包含所述当前图像块的左上顶点的像素块;所述当前图像块的左下像素样本为所述当前图像块的左下顶点或所述当前图像块中的包含所述当前图像块的左下顶点的像素块;所述当前图像块的右上像素样本为所述当前图像块的右上顶点或所述当前图像块中的包含所述当前图像块的右上顶点的像素块;所述当前图像块的中心素样本a1为所述当前图像块的中心像素点或所述当前图像块中的包含所述当前图像块的中心像素点的像素块。
在第三方面的一种可实施方式中,所述当前图像块的左上像素样本所对应的候选运动信息单元集包括x1个像素样本的运动信息单元,其中,所述x1个像素样本包括至少一个与所述当前图像块的左上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左上像素样本时域相邻的像素样本,所述x1为正整数;其中,所述x1个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
在第三方面的一种可实施方式中,所述当前图像块的右上像素样本所对应的候选运动信息单元集包括x2个像素样本的运动信息单元,其中,所述x2个像素样本包括至少一个与所述当前图像块的右上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的右上像素样本时域相邻的像素样本,所述x2为正整数;其中,所述x2个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本位置相同的像素样 本、所述当前图像块的右边的空域相邻像素样本、所述当前图像块的右上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
在第三方面的一种可实施方式中,所述当前图像块的左下像素样本所对应的候选运动信息单元集包括x3个像素样本的运动信息单元,其中,所述x3个像素样本包括至少一个与所述当前图像块的左下像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左下像素样本时域相邻的像素样本,所述x3为正整数;其中,所述x3个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左下的空域相邻像素样本和所述当前图像块的下边的空域相邻像素样本中的至少一个。
在第三方面的一种可实施方式中,所述当前图像块的中心像素样本a1所对应的候选运动信息单元集包括x5个像素样本的运动信息单元,其中,所述x5个像素样本中的其中一个像素样本为像素样本a2,其中,所述中心像素样本a1在所述当前图像块所属视频帧中的位置,与所述像素样本a2在所述当前图像块所属视频帧的相邻视频帧中的位置相同,所述x5为正整数。
在本发明实施例的第四个方面,提出了一种图像预测装置,当前图像块包括至少一个第一像素样本和至少一个第二像素样本,所述装置包括:第一确定单元,用于确定每个所述第一像素样本和每个所述第二像素样本对应的候选运动信息单元集,其中,任一所述候选运动信息单元集包括至少一个运动信息单元;第二确定单元,用于确定所述当前块的合并运动信息单元集,其中,所述合并运动信息单元集中的每个运动信息单元分别为每个所述第一像素样本和每个所述第二像素样本中对应的候选运动信息单元集中的至少部分运动信息单元,其中,所述运动信息单元的运动信息包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量;第三确定单元,用于从所述合并运动信息单元集中确定每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;第一编码单元,用于编码第一码流信息,所述第一码流信息用于表征所述合并运动信息单元集中确定的每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;第一赋值单元,用于使用所述第一像素样本对应的 运动信息单元的运动信息作为所述第一像素样本的运动信息;第二赋值单元,用于使用所述第二像素样本对应的运动信息单元的运动信息作为所述第二像素样本的预测运动信息;计算单元,用于计算所述第二像素样本的差异运动信息,所述差异运动信息为运动信息和预测运动信息的差异;第二编码单元,用于编码第二码流信息,所述第二码流信息用于表征每个所述第二像素样本的差异运动信息;获取单元,用于根据所述当前图像块的运动模型、每个所述第一像素样本的运动信息和每个所述第二像素样本的运动信息,获得所述当前图像块的预测值。
在本发明实施例的第五个方面,提出了一种图像预测装置,当前图像块包括至少一个第一像素样本和至少一个第二像素样本,所述装置包括:处理器和耦合于处理器的存储器;所述存储器用于存储代码或者指令;所述处理器用于调用所述代码或指令以执行:解析第一码流信息,所述第一码流信息用于指示每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;根据解析后的所述第一码流信息,获取每个所述第一像素样本的运动信息和每个所述第二像素样本的预测运动信息,所述预测运动信息为运动信息的预测信息;解析第二码流信息,所述第二码流信息用于表征每个所述第二像素样本的差异运动信息,所述差异运动信息为运动信息和预测运动信息的差异;根据解析后的所述第二码流信息和对应的每个所述第二像素样本的预测运动信息,获取每个所述第二像素样本的运动信息;根据所述当前图像块的运动模型、每个所述第一像素样本的运动信息和每个所述第二像素样本的运动信息,获得所述当前图像块的预测值。
在本发明实施例的第六个方面,提出了一种图像预测装置,当前图像块包括至少一个第一像素样本和至少一个第二像素样本,所述装置包括:处理器和耦合于处理器的存储器;所述存储器用于存储代码或者指令;所述处理器用于调用所述代码或指令以执行:确定每个所述第一像素样本和每个所述第二像素样本对应的候选运动信息单元集,其中,任一所述候选运动信息单元集包括至少一个运动信息单元;确定所述当前块的合并运动信息单元集,其中,所述合并运动信息单元集中的每个运动信息单元分别为每个所述第一像素样本和每 个所述第二像素样本中对应的候选运动信息单元集中的至少部分运动信息单元,其中,所述运动信息单元的运动信息包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量;从所述合并运动信息单元集中确定每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;编码第一码流信息,所述第一码流信息用于表征所述合并运动信息单元集中确定的每个所述第一像素样本和每个所述第二像素样本对应的运动信息单元;使用所述第一像素样本对应的运动信息单元的运动信息作为所述第一像素样本的运动信息;使用所述第二像素样本对应的运动信息单元的运动信息作为所述第二像素样本的预测运动信息;计算所述第二像素样本的差异运动信息,所述差异运动信息为运动信息和预测运动信息的差异;编码第二码流信息,所述第二码流信息用于表征每个所述第二像素样本的差异运动信息;根据所述当前图像块的运动模型、每个所述第一像素样本的运动信息和每个所述第二像素样本的运动信息,获得所述当前图像块的预测值。
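A seventh aspect of the present invention provides an image prediction method, which may include:
determining 2 pixel samples in a current image block, and determining the candidate motion information unit set corresponding to each of the 2 pixel samples, where the candidate motion information unit set corresponding to each pixel sample includes at least one candidate motion information unit;
determining a merged motion information unit set i that includes 2 motion information units,
where each motion information unit in the merged motion information unit set i is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples, and a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; and
performing pixel value prediction on the current image block using an affine motion model and the merged motion information unit set i.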
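With reference to the seventh aspect, in a first possible implementation of the seventh aspect, determining the merged motion information unit set i that includes 2 motion information units includes:
determining, from N candidate merged motion information unit sets, the merged motion information unit set i that includes 2 motion information units, where each motion information unit in each of the N candidate merged motion information unit sets is selected from at least some of the constraint-compliant motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples, N is a positive integer, the N candidate merged motion information unit sets are different from one another, and each of the N candidate merged motion information unit sets includes 2 motion information units.
With reference to the first possible implementation of the seventh aspect, in a second possible implementation of the seventh aspect, the N candidate merged motion information unit sets satisfy at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition,
where the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion;
the second condition includes that the 2 motion information units in any one of the N candidate merged motion information unit sets have the same prediction direction;
the third condition includes that the 2 motion information units in any one of the N candidate merged motion information unit sets have the same reference frame index;
the fourth condition includes that the absolute value of the difference between the horizontal motion vector components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the horizontal motion vector components of one of the motion information units in any one of the N candidate merged motion information unit sets and of a pixel sample Z is less than or equal to the horizontal component threshold, where the pixel sample Z of the current image block is different from either of the 2 pixel samples;
the fifth condition includes that the absolute value of the difference between the vertical motion vector components of the 2 motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the vertical motion vector components of any one motion information unit in one of the N candidate merged motion information unit sets and of the pixel sample Z is less than or equal to the vertical component threshold, where the pixel sample Z of the current image block is different from either of the 2 pixel samples.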
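One way to picture how the N candidate merged sets arise is as a filtered Cartesian product of the two per-sample candidate sets. The sketch below is illustrative only: the `MotionInfoUnit` type and the function name are hypothetical, each unit carries a single motion vector, the text requires only at least one of the five conditions whereas this sketch applies all five, and "non-translational" is approximated as "the two motion vectors differ".

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class MotionInfoUnit(NamedTuple):
    direction: str  # "forward" or "backward"
    ref_idx: int    # reference frame index
    vx: int         # horizontal motion vector component
    vy: int         # vertical motion vector component


def candidate_merged_sets(set_a: List[MotionInfoUnit],
                          set_b: List[MotionInfoUnit],
                          h_thresh: int,
                          v_thresh: int) -> List[Tuple[MotionInfoUnit, MotionInfoUnit]]:
    """Pair one unit per pixel sample and keep the pairs that pass every check."""
    merged = []
    for u0, u1 in product(set_a, set_b):
        if (u0.vx, u0.vy) == (u1.vx, u1.vy):
            continue  # first condition: identical vectors would mean pure translation
        if u0.direction != u1.direction:
            continue  # second condition: same prediction direction
        if u0.ref_idx != u1.ref_idx:
            continue  # third condition: same reference frame index
        if abs(u0.vx - u1.vx) > h_thresh:
            continue  # fourth condition: horizontal difference within threshold
        if abs(u0.vy - u1.vy) > v_thresh:
            continue  # fifth condition: vertical difference within threshold
        merged.append((u0, u1))
    return merged
```

Pruning the product this way keeps N small, so fewer candidate sets need to be evaluated or indexed in the bitstream.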
结合第七方面或第七方面的第一种至第二种可能的实施方式中的任意一 种可能的实施方式,在第七方面的第三种可能的实施方式中,所述2个像素样本包括所述当前图像块的左上像素样本、右上像素样本、左下像素样本和中心像素样本a1中的其中2个像素样本;
其中,所述当前图像块的左上像素样本为所述当前图像块的左上顶点或所述当前图像块中的包含所述当前图像块的左上顶点的像素块;所述当前图像块的左下像素样本为所述当前图像块的左下顶点或所述当前图像块中的包含所述当前图像块的左下顶点的像素块;所述当前图像块的右上像素样本为所述当前图像块的右上顶点或所述当前图像块中的包含所述当前图像块的右上顶点的像素块;所述当前图像块的中心素样本a1为所述当前图像块的中心像素点或所述当前图像块中的包含所述当前图像块的中心像素点的像素块。
结合第七方面的第三种可能的实施方式,在第七方面的第四种可能的实施方式中,
所述当前图像块的左上像素样本所对应的候选运动信息单元集包括x1个像素样本的运动信息单元,其中,所述x1个像素样本包括至少一个与所述当前图像块的左上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左上像素样本时域相邻的像素样本,所述x1为正整数;
其中,所述x1个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
结合第七方面的第三种至第四种可能的实施方式中的任意一种可能的实施方式,在第七方面的第五种可能的实施方式中,所述当前图像块的右上像素样本所对应的候选运动信息单元集包括x2个像素样本的运动信息单元,其中,所述x2个像素样本包括至少一个与所述当前图像块的右上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的右上像素样本时域相邻的像素样本,所述x2为正整数;
其中,所述x2个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本位置相同的像素样本、所述当 前图像块的右边的空域相邻像素样本、所述当前图像块的右上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
结合第七方面的第三种至第五种可能的实施方式中的任意一种可能的实施方式,在第七方面的第六种可能的实施方式中,
所述当前图像块的左下像素样本所对应的候选运动信息单元集包括x3个像素样本的运动信息单元,其中,所述x3个像素样本包括至少一个与所述当前图像块的左下像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左下像素样本时域相邻的像素样本,所述x3为正整数;
其中,所述x3个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左下的空域相邻像素样本和所述当前图像块的下边的空域相邻像素样本中的至少一个。
结合第七方面的第三种至第六种可能的实施方式中的任意一种可能的实施方式,在第七方面的第七种可能的实施方式中,
所述当前图像块的中心像素样本a1所对应的候选运动信息单元集包括x5个像素样本的运动信息单元,其中,所述x5个像素样本中的其中一个像素样本为像素样本a2,
其中,所述中心像素样本a1在所述当前图像块所属视频帧中的位置,与所述像素样本a2在所述当前图像块所属视频帧的相邻视频帧中的位置相同,所述x5为正整数。
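With reference to the seventh aspect or any one of the first to seventh possible implementations of the seventh aspect, in an eighth possible implementation of the seventh aspect,
performing pixel value prediction on the current image block using the affine motion model and the merged motion information unit set i includes: when the reference frame index corresponding to a motion vector whose prediction direction is a first prediction direction in the merged motion information unit set i differs from the reference frame index of the current image block, scaling the merged motion information unit set i so that the motion vectors whose prediction direction is the first prediction direction in the merged motion information unit set i are scaled to the reference frame of the current image block, and performing pixel value prediction on the current image block using the affine motion model and the scaled merged motion information unit set i, where the first prediction direction is forward or backward;
or,
performing pixel value prediction on the current image block using the affine motion model and the merged motion information unit set i includes: when the reference frame index corresponding to the forward motion vectors in the merged motion information unit set i differs from the forward reference frame index of the current image block, and the reference frame index corresponding to the backward motion vectors in the merged motion information unit set i differs from the backward reference frame index of the current image block, scaling the merged motion information unit set i so that the forward motion vectors in it are scaled to the forward reference frame of the current image block and the backward motion vectors in it are scaled to the backward reference frame of the current image block, and performing pixel value prediction on the current image block using the affine motion model and the scaled merged motion information unit set i.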
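The passage above does not state how a motion vector is scaled to another reference frame. A common convention in block-based codecs, used here purely as an assumption, is to scale linearly by the ratio of temporal distances measured in picture order count (POC). The function name and parameters below are hypothetical, and the fixed-point rounding a real codec would apply is omitted.

```python
from typing import Tuple


def scale_mv(mv: Tuple[float, float],
             cur_poc: int,
             src_ref_poc: int,
             dst_ref_poc: int) -> Tuple[float, float]:
    """Rescale mv, which points from the current frame to src_ref_poc,
    so that it points to dst_ref_poc instead (linear motion assumed)."""
    src_dist = cur_poc - src_ref_poc
    dst_dist = cur_poc - dst_ref_poc
    if src_dist == 0:
        raise ValueError("source reference frame coincides with the current frame")
    factor = dst_dist / src_dist
    return (mv[0] * factor, mv[1] * factor)
```

For bidirectional units, the same function would be applied separately to the forward and backward vectors with their respective target reference frames.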
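With reference to the seventh aspect or any one of the first to eighth possible implementations of the seventh aspect, in a ninth possible implementation of the seventh aspect,
performing pixel value prediction on the current image block using the affine motion model and the merged motion information unit set i includes:
computing the motion vector of each pixel in the current image block using the affine motion model and the merged motion information unit set i, and determining the predicted pixel value of each pixel in the current image block from the computed motion vectors;
or,
computing the motion vector of each pixel block in the current image block using the affine motion model and the merged motion information unit set i, and determining the predicted pixel value of each pixel of each pixel block in the current image block from the computed motion vectors of the pixel blocks.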
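The per-block alternative trades a little accuracy for far fewer motion vector evaluations. The sketch below derives one vector per pixel block; it assumes the two control samples sit at (0, 0) and (w, 0), that the model has the form vx = ax + by + vx0, vy = -bx + ay + vy0, and that each block is represented by its center point. The block size of 4 and the function name are illustrative choices, not taken from the text.

```python
from typing import Dict, Tuple

MV = Tuple[float, float]


def block_motion_vectors(mv0: MV, mv1: MV, w: int, h: int,
                         block: int = 4) -> Dict[Tuple[int, int], MV]:
    """Map each block's top-left corner to the affine motion vector at its center."""
    vx0, vy0 = mv0
    vx1, vy1 = mv1
    a = (vx1 - vx0) / w
    b = -(vy1 - vy0) / w
    field = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cx = bx + block / 2  # block center, horizontal
            cy = by + block / 2  # block center, vertical
            field[(bx, by)] = (a * cx + b * cy + vx0, -b * cx + a * cy + vy0)
    return field
```

All pixels of a block then share that block's vector during motion compensation.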
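With reference to the seventh aspect or any one of the first to ninth possible implementations of the seventh aspect, in a tenth possible implementation of the seventh aspect,
performing pixel value prediction on the current image block using the affine motion model and the merged motion information unit set i includes: obtaining the motion vector of any pixel sample in the current image block using the ratio of the difference between the horizontal motion vector components of the 2 pixel samples to the length or width of the current image block, and the ratio of the difference between the vertical motion vector components of the 2 pixel samples to the length or width of the current image block, where the motion vectors of the 2 pixel samples are obtained from the motion vectors of the two motion information units in the merged motion information unit set i.
With reference to the tenth possible implementation of the seventh aspect, in an eleventh possible implementation of the seventh aspect,
the horizontal coordinate coefficient of the horizontal motion vector component of the 2 pixel samples is equal to the vertical coordinate coefficient of the vertical motion vector component, and the vertical coordinate coefficient of the horizontal motion vector component of the 2 pixel samples is the opposite of the horizontal coordinate coefficient of the vertical motion vector component.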
结合第七方面或第七方面的第一种至第十一种可能的实施方式中的任意一种可能的实施方式,在第七方面的第十二种可能的实施方式中,
所述仿射运动模型为如下形式的仿射运动模型:
Figure PCTCN2016083203-appb-000005
其中,所述2个像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽。
结合第七方面或第七方面的第一种至第十二种可能的实施方式中的任意一种可能的实施方式,在第七方面的第十三种可能的实施方式中,
所述图像预测方法应用于视频编码过程中或所述图像预测方法应用于视频解码过程中。
结合第七方面的第十三种可能的实施方式,在第七方面的第十四种可能的实施方式中,在所述图像预测方法应用于视频解码过程中的情况下,从N个候选合并运动信息单元集之中确定包含2个运动信息单元的合并运动信息单元集i,包括:基于从视频码流中获得的合并运动信息单元集i的标识,从N个候选合并运动信息单元集之中确定包含2个运动信息单元的合并运动信息单元集i。
结合第七方面的第十三种可能的实施方式或第七方面的第十四种可能的 实施方式,在第七方面的第十五种可能的实施方式中,在所述图像预测方法应用于视频解码过程中的情况下,所述方法还包括:从视频码流中解码得到所述2个像素样本的运动矢量残差,利用所述2个像素样本的空域相邻或时域相邻的像素样本的运动矢量得到所述2个像素样本的运动矢量预测值,基于所述2个像素样本的运动矢量预测值和所述2个像素样本的运动矢量残差分别得到所述2个像素样本的运动矢量。
结合第七方面的第十三种可能的实施方式,在第七方面的第十六种可能的实施方式中,在所述图像预测方法应用于视频编码过程中的情况下,所述方法还包括:利用所述2个像素样本的空域相邻或者时域相邻的像素样本的运动矢量,得到所述2个像素样本的运动矢量预测值,根据所述2个像素样本的运动矢量预测值得到所述2个像素样本的运动矢量残差,将所述2个像素样本的运动矢量残差写入视频码流。
结合第七方面的第十三种可能的实施方式或第七方面的第十六种可能的实施方式,在第七方面的第十七种可能的实施方式中,在所述图像预测方法应用于视频编码过程中的情况下,所述方法还包括:将所述合并运动信息单元集i的标识写入视频码流。
本发明实施例第八方面提供一种图像预测装置,包括:
第一确定单元,用于确定当前图像块中的2个像素样本,确定所述2个像素样本之中的每个像素样本所对应的候选运动信息单元集;其中,所述每个像素样本所对应的候选运动信息单元集包括候选的至少一个运动信息单元;
第二确定单元,用于确定包括2个运动信息单元的合并运动信息单元集i;
其中,所述合并运动信息单元集i中的每个运动信息单元分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的至少部分运动信息单元,其中,所述运动信息单元包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量;
预测单元,用于利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测。
结合第八方面,在第八方面的第一种可能的实施方式中,所述第二确定单 元具体用于,从N个候选合并运动信息单元集之中确定出包含2个运动信息单元的合并运动信息单元集i;其中,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集所包含的每个运动信息单元,分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的符合约束条件的至少部分运动信息单元,其中,所述N为正整数,所述N个候选合并运动信息单元集互不相同,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集包括2个运动信息单元。
结合第八方面的第一种可能的实施方式,在第八方面的第二种可能的实施方式中,所述N个候选合并运动信息单元集满足第一条件、第二条件、第三条件、第四条件和第五条件之中的至少一个条件,
其中,所述第一条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为非平动运动;
所述第二条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元对应的预测方向相同;
所述第三条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元对应的参考帧索引相同;
所述第四条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,或者,所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的其中1个运动信息单元和像素样本Z的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,所述当前图像块的所述像素样本Z不同于所述2个像素样本中的任意一个像素样本;
所述第五条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元的运动矢量竖直分量之间的差值的绝对值小于或等于竖直分量阈值,或者,所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的其中1个运动信息单元和像素样本Z的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,所述当前 图像块的所述像素样本Z不同于所述2个像素样本中的任意一个像素样本。
结合第八方面或第八方面的第一种至第二种可能的实施方式中的任意一种可能的实施方式,在第八方面的第三种可能的实施方式中,所述2个像素样本包括所述当前图像块的左上像素样本、右上像素样本、左下像素样本和中心像素样本a1中的其中2个像素样本;
其中,所述当前图像块的左上像素样本为所述当前图像块的左上顶点或所述当前图像块中的包含所述当前图像块的左上顶点的像素块;所述当前图像块的左下像素样本为所述当前图像块的左下顶点或所述当前图像块中的包含所述当前图像块的左下顶点的像素块;所述当前图像块的右上像素样本为所述当前图像块的右上顶点或所述当前图像块中的包含所述当前图像块的右上顶点的像素块;所述当前图像块的中心素样本a1为所述当前图像块的中心像素点或所述当前图像块中的包含所述当前图像块的中心像素点的像素块。
结合第八方面的第三种可能的实施方式,在第八方面的第四种可能的实施方式中,所述当前图像块的左上像素样本所对应的候选运动信息单元集包括x1个像素样本的运动信息单元,其中,所述x1个像素样本包括至少一个与所述当前图像块的左上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左上像素样本时域相邻的像素样本,所述x1为正整数;
其中,所述x1个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
结合第八方面的第三种至第四种可能的实施方式中的任意一种可能的实施方式,在第八方面的第五种可能的实施方式中,所述当前图像块的右上像素样本所对应的候选运动信息单元集包括x2个像素样本的运动信息单元,其中,所述x2个像素样本包括至少一个与所述当前图像块的右上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的右上像素样本时域相邻的像素样本,所述x2为正整数;
其中,所述x2个像素样本包括与所述当前图像块所属的视频帧时域相邻的 视频帧之中的与所述当前图像块的右上像素样本位置相同的像素样本、所述当前图像块的右边的空域相邻像素样本、所述当前图像块的右上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
结合第八方面的第三种至第五种可能的实施方式中的任意一种可能的实施方式,在第八方面的第六种可能的实施方式中,
所述当前图像块的左下像素样本所对应的候选运动信息单元集包括x3个像素样本的运动信息单元,其中,所述x3个像素样本包括至少一个与所述当前图像块的左下像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左下像素样本时域相邻的像素样本,所述x3为正整数;
其中,所述x3个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左下的空域相邻像素样本和所述当前图像块的下边的空域相邻像素样本中的至少一个。
结合第八方面的第三种至第六种可能的实施方式中的任意一种可能的实施方式,在第八方面的第七种可能的实施方式中,
所述当前图像块的中心像素样本a1所对应的候选运动信息单元集包括x5个像素样本的运动信息单元,其中,所述x5个像素样本中的其中一个像素样本为像素样本a2,
其中,所述中心像素样本a1在所述当前图像块所属视频帧中的位置,与所述像素样本a2在所述当前图像块所属视频帧的相邻视频帧中的位置相同,所述x5为正整数。
结合第八方面或第八方面的第一种至第七种可能的实施方式中的任意一种可能的实施方式,在第八方面的第八种可能的实施方式中,
所述预测单元具体用于,当所述合并运动信息单元集i中的预测方向为第一预测方向的运动矢量对应的参考帧索引不同于所述当前图像块的参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为第一预测方向的运动矢量被缩放到所述当前图像块的参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i 对所述当前图像块进行像素值预测,其中,所述第一预测方向为前向或后向;
或者,所述预测单元具体用于,当所述合并运动信息单元集i中的预测方向为前向的运动矢量对应的参考帧索引不同于所述当前图像块的前向参考帧索引,并且所述合并运动信息单元集i中的预测方向为后向的运动矢量对应的参考帧索引不同于所述当前图像块的后向参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为前向的运动矢量被缩放到所述当前图像块的前向参考帧且使得所述合并运动信息单元集i中的预测方向为后向的运动矢量被缩放到所述当前图像块的后向参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测。
结合第八方面或第八方面的第一种至第八种可能的实施方式中的任意一种可能的实施方式,在第八方面的第九种可能的实施方式中,
所述预测单元具体用于,利用仿射运动模型和所述合并运动信息单元集i计算得到所述当前图像块中的各像素点的运动矢量,利用计算得到的所述当前图像块中的各像素点的运动矢量确定所述当前图像块中的各像素点的预测像素值;
或者,
所述预测单元具体用于,利用仿射运动模型和所述合并运动信息单元集i计算得到所述当前图像块中的各像素块的运动矢量,利用计算得到的所述当前图像块中的各像素块的运动矢量确定所述当前图像块中的各像素块的各像素点的预测像素值。
结合第八方面或第八方面的第一种至第九种可能的实施方式中的任意一种可能的实施方式,在第八方面的第十种可能的实施方式中,
所述预测单元具体用于,利用所述2个像素样本的运动矢量水平分量之间的差值与所述当前图像块的长或宽的比值,以及所述2个像素样本的运动矢量竖直分量之间的差值与所述当前图像块的长或宽的比值,得到所述当前图像块中的任意像素样本的运动矢量,其中,所述2个像素样本的运动矢量基于所述合并运动信息单元集i中的两个运动信息单元的运动矢量得到。
结合第八方面的第十种可能的实施方式,在第八方面的第十一种可能的实施方式中,所述2个像素样本的运动矢量水平分量的水平坐标系数和运动矢量竖直分量的竖直坐标系数相等,且所述2个像素样本的运动矢量水平分量的竖直坐标系数和运动矢量竖直分量的水平坐标系数相反。
结合第八方面或第八方面的第一种至第十一种可能的实施方式中的任意一种可能的实施方式,在第八方面的第十二种可能的实施方式中,
所述仿射运动模型为如下形式的仿射运动模型:
Figure PCTCN2016083203-appb-000006
其中,所述2个像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽。
结合第八方面或第八方面的第一种至第十二种可能的实施方式中的任意一种可能的实施方式,在第八方面的第十三种可能的实施方式中,
所述图像预测装置应用于视频编码装置中或所述图像预测装置应用于视频解码装置中。
结合第八方面的第十三种可能的实施方式,在第八方面的第十四种可能的实施方式中,在当所述图像预测装置应用于视频解码装置中的情况下,所述第二确定单元具体用于,基于从视频码流中获得的合并运动信息单元集i的标识,从N个候选合并运动信息单元集之中确定包含2个运动信息单元的合并运动信息单元集i。
结合第八方面的第十三种可能的实施方式或第八方面的第十四种可能的实施方式,在第八方面的第十五种可能的实施方式中,在当所述图像预测装置应用于视频解码装置中的情况下,
所述装置还包括解码单元,用于从视频码流中解码得到所述2个像素样本的运动矢量残差,利用所述2个像素样本的空域相邻或时域相邻的像素样本的 运动矢量得到所述2个像素样本的运动矢量预测值,基于所述2个像素样本的运动矢量预测值和所述2个像素样本的运动矢量残差分别得到所述2个像素样本的运动矢量。
结合第八方面的第十三种可能的实施方式,在第八方面的第十六种可能的实施方式中,在当所述图像预测装置应用于视频编码装置中的情况下,所述预测单元还用于:利用所述2个像素样本的空域相邻或者时域相邻的像素样本的运动矢量,得到所述2个像素样本的运动矢量预测值,根据所述2个像素样本的运动矢量预测值得到所述2个像素样本的运动矢量残差,将所述2个像素样本的运动矢量残差写入视频码流。
结合第八方面的第十三种可能的实施方式或第八方面的第十六种可能的实施方式,在第八方面的第十七种可能的实施方式中,在当所述图像预测装置应用于视频编码装置中的情况下,所述装置还包括编码单元,用于将所述合并运动信息单元集i的标识写入视频码流。
本发明实施例第九方面提供一种图像预测装置,包括:
处理器和存储器;
其中,所述处理器通过调用所述存储器中存储的代码或指令以用于,确定当前图像块中的2个像素样本,确定所述2个像素样本之中的每个像素样本所对应的候选运动信息单元集;其中,所述每个像素样本所对应的候选运动信息单元集包括候选的至少一个运动信息单元;确定包括2个运动信息单元的合并运动信息单元集i;其中,所述合并运动信息单元集i中的每个运动信息单元分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的至少部分运动信息单元,其中,所述运动信息单元包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量;利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测。
结合第九方面,在第九方面的第一种可能的实施方式中,在确定包括2个运动信息单元的合并运动信息单元集i的方面,所述处理器用于,从N个候选合并运动信息单元集之中确定出包含2个运动信息单元的合并运动信息单元集i;其中,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集所 包含的每个运动信息单元,分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的符合约束条件的至少部分运动信息单元,其中,所述N为正整数,所述N个候选合并运动信息单元集互不相同,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集包括2个运动信息单元。
结合第九方面的第一种可能的实施方式,在第九方面的第二种可能的实施方式中,所述N个候选合并运动信息单元集满足第一条件、第二条件、第三条件、第四条件和第五条件之中的至少一个条件,
其中,所述第一条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为非平动运动;
所述第二条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元对应的预测方向相同;
所述第三条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元对应的参考帧索引相同;
所述第四条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,或者,所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的其中1个运动信息单元和像素样本Z的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,所述当前图像块的所述像素样本Z不同于所述2个像素样本中的任意一个像素样本;
所述第五条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元的运动矢量竖直分量的差值的绝对值小于或等于竖直分量阈值,或者,所述N个候选合并运动信息单元集中的其中一个候选合并运动信息单元集中的任意1个运动信息单元和像素样本Z的运动矢量竖直分量之间的差值的绝对值小于或等于水平分量阈值,所述当前图像块的所述像素样本Z不同于所述2个像素样本中的任意一个像素样本。
结合第九方面或第九方面的第一种至第二种可能的实施方式中的任意一种可能的实施方式,在第九方面的第三种可能的实施方式中,所述2个像素样 本包括所述当前图像块的左上像素样本、右上像素样本、左下像素样本和中心像素样本a1中的其中2个像素样本;
其中,所述当前图像块的左上像素样本为所述当前图像块的左上顶点或所述当前图像块中的包含所述当前图像块的左上顶点的像素块;所述当前图像块的左下像素样本为所述当前图像块的左下顶点或所述当前图像块中的包含所述当前图像块的左下顶点的像素块;所述当前图像块的右上像素样本为所述当前图像块的右上顶点或所述当前图像块中的包含所述当前图像块的右上顶点的像素块;所述当前图像块的中心素样本a1为所述当前图像块的中心像素点或所述当前图像块中的包含所述当前图像块的中心像素点的像素块。
结合第九方面的第三种可能的实施方式,在第九方面的第四种可能的实施方式中,所述当前图像块的左上像素样本所对应的候选运动信息单元集包括x1个像素样本的运动信息单元,其中,所述x1个像素样本包括至少一个与所述当前图像块的左上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左上像素样本时域相邻的像素样本,所述x1为正整数;
其中,所述x1个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
结合第九方面的第三种至第四种可能的实施方式中的任意一种可能的实施方式,在第九方面的第五种可能的实施方式中,所述当前图像块的右上像素样本所对应的候选运动信息单元集包括x2个像素样本的运动信息单元,其中,所述x2个像素样本包括至少一个与所述当前图像块的右上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的右上像素样本时域相邻的像素样本,所述x2为正整数;
其中,所述x2个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本位置相同的像素样本、所述当前图像块的右边的空域相邻像素样本、所述当前图像块的右上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
结合第九方面的第三种至第五种可能的实施方式中的任意一种可能的实施方式,在第九方面的第六种可能的实施方式中,
所述当前图像块的左下像素样本所对应的候选运动信息单元集包括x3个像素样本的运动信息单元,其中,所述x3个像素样本包括至少一个与所述当前图像块的左下像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左下像素样本时域相邻的像素样本,所述x3为正整数;
其中,所述x3个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左下的空域相邻像素样本和所述当前图像块的下边的空域相邻像素样本中的至少一个。
结合第九方面的第三种至第六种可能的实施方式中的任意一种可能的实施方式,在第九方面的第七种可能的实施方式中,
所述当前图像块的中心像素样本a1所对应的候选运动信息单元集包括x5个像素样本的运动信息单元,其中,所述x5个像素样本中的其中一个像素样本为像素样本a2,
其中,所述中心像素样本a1在所述当前图像块所属视频帧中的位置,与所述像素样本a2在所述当前图像块所属视频帧的相邻视频帧中的位置相同,所述x5为正整数。
结合第九方面或第九方面的第一种至第七种可能的实施方式中的任意一种可能的实施方式,在第九方面的第八种可能的实施方式中,
在利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测的方面,所述处理器用于,当所述合并运动信息单元集i中的预测方向为第一预测方向的运动矢量对应的参考帧索引不同于所述当前图像块的参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为第一预测方向的运动矢量被缩放到所述当前图像块的参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测,其中,所述第一预测方向为前向或后向;
或者,在利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测的方面,所述处理器用于,当所述合并运动信息单元集i中的预测方向为前向的运动矢量对应的参考帧索引不同于所述当前图像块的前向参考帧索引,并且所述合并运动信息单元集i中的预测方向为后向的运动矢量对应的参考帧索引不同于所述当前图像块的后向参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为前向的运动矢量被缩放到所述当前图像块的前向参考帧且使得所述合并运动信息单元集i中的预测方向为后向的运动矢量被缩放到所述当前图像块的后向参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测。
结合第九方面或第九方面的第一种至第八种可能的实施方式中的任意一种可能的实施方式,在第九方面的第九种可能的实施方式中,在利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测的方面,所述处理器用于,利用仿射运动模型和所述合并运动信息单元集i计算得到所述当前图像块中的各像素点的运动矢量,利用计算得到的所述当前图像块中的各像素点的运动矢量确定所述当前图像块中的各像素点的预测像素值;
或者,
在利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测的方面,所述处理器用于,利用仿射运动模型和所述合并运动信息单元集i计算得到所述当前图像块中的各像素块的运动矢量,利用计算得到的所述当前图像块中的各像素块的运动矢量确定所述当前图像块中的各像素块的各像素点的预测像素值。
结合第九方面或第九方面的第一种至第九种可能的实施方式中的任意一种可能的实施方式,在第九方面的第十种可能的实施方式中,
在利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测的方面,所述处理器用于,利用所述2个像素样本的运动矢量水平分量之间的差值与所述当前图像块的长或宽的比值,以及所述2个像素样本的运动矢量竖直分量之间的差值与所述当前图像块的长或宽的比值,得到所述当 前图像块中的任意像素样本的运动矢量,其中,所述2个像素样本的运动矢量基于所述合并运动信息单元集i中的两个运动信息单元的运动矢量得到。
结合第九方面的第十种可能的实施方式,在第九方面的第十一种可能的实施方式中,
所述2个像素样本的运动矢量水平分量的水平坐标系数和运动矢量竖直分量的竖直坐标系数相等,且所述2个像素样本的运动矢量水平分量的竖直坐标系数和运动矢量竖直分量的水平坐标系数相反。
结合第九方面或第九方面的第一种至第十一种可能的实施方式中的任意一种可能的实施方式,在第九方面的第十二种可能的实施方式中,
所述仿射运动模型为如下形式的仿射运动模型:
Figure PCTCN2016083203-appb-000007
其中,所述2个像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽。
结合第九方面或第九方面的第一种至第十二种可能的实施方式中的任意一种可能的实施方式,在第九方面的第十三种可能的实施方式中,
所述图像预测装置应用于视频编码装置中或所述图像预测装置应用于视频解码装置中。
结合第九方面的第十三种可能的实施方式,在第九方面的第十四种可能的实施方式中,在当所述图像预测装置应用于视频解码装置中的情况下,在确定包括2个运动信息单元的合并运动信息单元集i的方面,所述处理器用于,基于从视频码流中获得的合并运动信息单元集i的标识,从N个候选合并运动信息单元集之中确定包含2个运动信息单元的合并运动信息单元集i。
结合第九方面的第十三种可能的实施方式或第九方面的第十四种可能的实施方式,在第九方面的第十五种可能的实施方式中,在当所述图像预测装置 应用于视频解码装置中的情况下,所述处理器还用于,从视频码流中解码得到所述2个像素样本的运动矢量残差,利用所述2个像素样本的空域相邻或时域相邻的像素样本的运动矢量得到所述2个像素样本的运动矢量预测值,基于所述2个像素样本的运动矢量预测值和所述2个像素样本的运动矢量残差分别得到所述2个像素样本的运动矢量。
结合第九方面的第十三种可能的实施方式,在第九方面的第十六种可能的实施方式中,在当所述图像预测装置应用于视频编码装置中的情况下,所述处理器还用于,利用所述2个像素样本的空域相邻或者时域相邻的像素样本的运动矢量,得到所述2个像素样本的运动矢量预测值,根据所述2个像素样本的运动矢量预测值得到所述2个像素样本的运动矢量残差,将所述2个像素样本的运动矢量残差写入视频码流。
结合第九方面的第十三种可能的实施方式或第九方面的第十六种可能的实施方式,在第九方面的第十七种可能的实施方式中,在当所述图像预测装置应用于视频编码装置中的情况下,所述处理器还用于,将所述合并运动信息单元集i的标识写入视频码流。
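A tenth aspect of the embodiments of the present invention provides an image processing method, including:
obtaining a motion vector 2-tuple of a current image block, where the motion vector 2-tuple includes the respective motion vectors of 2 pixel samples in the video frame to which the current image block belongs; and
computing the motion vector of any pixel sample in the current image block using an affine motion model and the motion vector 2-tuple;
where the affine motion model has the following form: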
$$\begin{cases} v_x = ax + by \\ v_y = -bx + ay \end{cases}$$
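where (x, y) are the coordinates of the pixel sample, vx is the horizontal component of the motion vector of the pixel sample, and vy is the vertical component of the motion vector of the pixel sample;
where, in the equation vx = ax + by, a is the horizontal coordinate coefficient of the horizontal component of the affine motion model and b is the vertical coordinate coefficient of the horizontal component of the affine motion model; in the equation vy = -bx + ay, a is the vertical coordinate coefficient of the vertical component of the affine motion model and -b is the horizontal coordinate coefficient of the vertical component of the affine motion model.
With reference to the tenth aspect, in a first possible implementation of the tenth aspect, the affine motion model further includes a horizontal displacement coefficient c of the horizontal component of the affine motion model and a vertical displacement coefficient d of the vertical component of the affine motion model, so that the affine motion model has the following form: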
$$\begin{cases} v_x = ax + by + c \\ v_y = -bx + ay + d \end{cases}$$
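With reference to the tenth aspect or the first possible implementation of the tenth aspect, in a second possible implementation of the tenth aspect, computing the motion vector of any pixel sample in the current image block using the affine motion model and the motion vector 2-tuple includes:
obtaining the values of the coefficients of the affine motion model using the respective motion vectors of the 2 pixel samples and the positions of the 2 pixel samples; and
obtaining the motion vector of any pixel sample in the current image block using the affine motion model and the values of the coefficients of the affine motion model.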
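The two steps above (solve for the coefficients, then evaluate the model) can be sketched for two samples at arbitrary positions. Substituting both samples into vx = ax + by + c, vy = -bx + ay + d gives four linear equations in a, b, c, d with a closed-form solution. The function names below are hypothetical, and floating-point arithmetic stands in for whatever fixed-point precision a codec would use.

```python
from typing import Tuple

Point = Tuple[float, float]
MV = Tuple[float, float]
Coeffs = Tuple[float, float, float, float]


def solve_affine(p0: Point, mv0: MV, p1: Point, mv1: MV) -> Coeffs:
    """Return (a, b, c, d) of vx = a*x + b*y + c, vy = -b*x + a*y + d."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    dvx, dvy = mv1[0] - mv0[0], mv1[1] - mv0[1]
    dist_sq = dx * dx + dy * dy
    if dist_sq == 0:
        raise ValueError("the two pixel samples must be at different positions")
    a = (dvx * dx + dvy * dy) / dist_sq
    b = (dvx * dy - dvy * dx) / dist_sq
    c = mv0[0] - a * p0[0] - b * p0[1]
    d = mv0[1] + b * p0[0] - a * p0[1]
    return a, b, c, d


def apply_affine(coeffs: Coeffs, x: float, y: float) -> MV:
    """Evaluate the model at pixel position (x, y)."""
    a, b, c, d = coeffs
    return (a * x + b * y + c, -b * x + a * y + d)
```

The numerators are weighted sums of motion vector component differences and the denominator is the squared distance between the samples, which matches the shape of the coefficient formulas described for the later implementations.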
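With reference to the tenth aspect or the first or second possible implementation of the tenth aspect, in a third possible implementation of the tenth aspect, the values of the coefficients of the affine motion model are obtained using the ratio of the difference between the horizontal components of the respective motion vectors of the 2 pixel samples to the distance between the 2 pixel samples, and the ratio of the difference between the vertical components of the respective motion vectors of the 2 pixel samples to the distance between the 2 pixel samples; and
the motion vector of any pixel sample in the current image block is obtained using the affine motion model and the values of the coefficients of the affine motion model.
With reference to the tenth aspect or the first or second possible implementation of the tenth aspect, in a fourth possible implementation of the tenth aspect, computing the motion vector of any pixel sample in the current image block using the affine motion model and the motion vector 2-tuple includes:
obtaining the values of the coefficients of the affine motion model using the ratio of a weighted sum of the components of the respective motion vectors of the 2 pixel samples to the distance between the 2 pixel samples or to the square of the distance between the 2 pixel samples; and
obtaining the motion vector of any pixel sample in the current image block using the affine motion model and the values of the coefficients of the affine motion model.
With reference to the tenth aspect or any one of the first to third possible implementations of the tenth aspect, in a fifth possible implementation of the tenth aspect, when the 2 pixel samples include the top-left pixel sample of the current image block and a right-region pixel sample located to the right of the top-left pixel sample, the affine motion model is specifically: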
Figure PCTCN2016083203-appb-000010
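where (vx0, vy0) is the motion vector of the top-left pixel sample, (vx1, vy1) is the motion vector of the right-region pixel sample, and w is the distance between the 2 pixel samples.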
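As a hedged sketch of how this case follows from the general model $v_x = ax + by + c$, $v_y = -bx + ay + d$, assume the top-left pixel sample sits at $(0, 0)$ and the right-region pixel sample at $(w, 0)$ (the coordinate placement is an assumption). Substituting both samples gives:

```latex
\begin{aligned}
c &= v_{x0}, \qquad d = v_{y0}, \qquad
a = \frac{v_{x1}-v_{x0}}{w}, \qquad b = -\frac{v_{y1}-v_{y0}}{w} \\[4pt]
v_x &= \frac{v_{x1}-v_{x0}}{w}\,x - \frac{v_{y1}-v_{y0}}{w}\,y + v_{x0} \\[4pt]
v_y &= \frac{v_{y1}-v_{y0}}{w}\,x + \frac{v_{x1}-v_{x0}}{w}\,y + v_{y0}
\end{aligned}
```

Each coefficient is a ratio of a motion vector component difference to the distance $w$, in line with the third possible implementation.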
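With reference to the tenth aspect or any one of the first to third possible implementations of the tenth aspect, in a sixth possible implementation of the tenth aspect, when the 2 pixel samples include the top-left pixel sample of the current image block and a lower-region pixel sample located below the top-left pixel sample, the affine motion model is specifically: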
Figure PCTCN2016083203-appb-000011
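where (vx0, vy0) is the motion vector of the top-left pixel sample, (vx2, vy2) is the motion vector of the lower-region pixel sample, and h is the distance between the 2 pixel samples.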
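Under the same assumptions (general model $v_x = ax + by + c$, $v_y = -bx + ay + d$, top-left pixel sample at $(0, 0)$), placing the lower-region pixel sample at $(0, h)$ gives the following sketch of the coefficients:

```latex
\begin{aligned}
c &= v_{x0}, \qquad d = v_{y0}, \qquad
a = \frac{v_{y2}-v_{y0}}{h}, \qquad b = \frac{v_{x2}-v_{x0}}{h} \\[4pt]
v_x &= \frac{v_{y2}-v_{y0}}{h}\,x + \frac{v_{x2}-v_{x0}}{h}\,y + v_{x0} \\[4pt]
v_y &= -\frac{v_{x2}-v_{x0}}{h}\,x + \frac{v_{y2}-v_{y0}}{h}\,y + v_{y0}
\end{aligned}
```

Compared with the horizontal case, the roles of the two component differences swap because the samples are now separated along the vertical axis.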
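With reference to the tenth aspect or any one of the first, second, and fourth possible implementations of the tenth aspect, in a seventh possible implementation of the tenth aspect, when the 2 pixel samples include the top-left pixel sample of the current image block and a lower-right-region pixel sample located to the lower right of the top-left pixel sample, the affine motion model is specifically: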
Figure PCTCN2016083203-appb-000012
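where (vx0, vy0) is the motion vector of the top-left pixel sample, (vx3, vy3) is the motion vector of the lower-right-region pixel sample, h1 is the vertical distance between the 2 pixel samples, w1 is the horizontal distance between the 2 pixel samples, and w1^2 + h1^2 is the square of the distance between the 2 pixel samples.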
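For this diagonal case, a sketch under the assumption that the top-left pixel sample sits at $(0, 0)$ and the lower-right-region pixel sample at $(w_1, h_1)$: substituting into $v_x = ax + by + c$, $v_y = -bx + ay + d$ and solving the resulting pair of linear equations for $a$ and $b$ gives:

```latex
\begin{aligned}
c &= v_{x0}, \qquad d = v_{y0} \\[4pt]
a &= \frac{(v_{x3}-v_{x0})\,w_1 + (v_{y3}-v_{y0})\,h_1}{w_1^2 + h_1^2} \\[4pt]
b &= \frac{(v_{x3}-v_{x0})\,h_1 - (v_{y3}-v_{y0})\,w_1}{w_1^2 + h_1^2}
\end{aligned}
```

Here each coefficient is a weighted sum of motion vector component differences divided by the squared distance between the samples, which is the shape described in the fourth possible implementation. Setting $h_1 = 0$ recovers the horizontal case and setting $w_1 = 0$ recovers the vertical case.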
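With reference to the tenth aspect or any one of the first to seventh possible implementations of the tenth aspect, in an eighth possible implementation of the tenth aspect, after computing the motion vector of any pixel sample in the current image block using the affine motion model and the motion vector 2-tuple, the method further includes:
performing motion-compensated predictive encoding on the pixel sample in the current image block using the computed motion vector of the pixel sample.
With reference to the tenth aspect or any one of the first to seventh possible implementations of the tenth aspect, in a ninth possible implementation of the tenth aspect, after the predicted pixel value of the pixels of the pixel sample in the current image block is determined, the method further includes:
performing motion-compensated decoding on the pixel sample using the computed motion vector of the pixel sample in the current image block, to obtain a reconstructed pixel value of the pixel sample.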
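Motion-compensated decoding of a single pixel can be sketched as "fetch the predictor from the reference frame at the displaced position, then add the decoded residual". The sketch below makes several simplifying assumptions not taken from the text: motion vectors are integer-pel (no sub-pixel interpolation), positions outside the reference frame are clamped to the border, the residual is already decoded, and samples are 8-bit. The function name is hypothetical.

```python
from typing import List, Tuple


def reconstruct_pixel(ref: List[List[int]], x: int, y: int,
                      mv: Tuple[int, int], residual: int,
                      max_val: int = 255) -> int:
    """Reconstructed value of pixel (x, y) from the reference frame `ref`."""
    height, width = len(ref), len(ref[0])
    rx = min(max(x + mv[0], 0), width - 1)   # clamp horizontally to the frame
    ry = min(max(y + mv[1], 0), height - 1)  # clamp vertically to the frame
    predicted = ref[ry][rx]                  # motion-compensated prediction
    return min(max(predicted + residual, 0), max_val)  # clip to the sample range
```

With an affine model, `mv` would be the per-pixel or per-block vector computed from the model rather than one vector for the whole block.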
本发明实施例第十一方面提供一种图像处理装置,包括:
获得单元,用于获得当前图像块的运动矢量2元组,所述运动矢量2元组包括所述当前图像块所属的视频帧中的2个像素样本各自的运动矢量;
计算单元,用于利用仿射运动模型和所述获得单元获得的运动矢量2元组,计算得到所述当前图像块中任意像素样本的运动矢量;
其中,所述仿射运动模型为如下形式:
$$\begin{cases} v_x = ax + by \\ v_y = -bx + ay \end{cases}$$
其中,(x,y)为所述任意像素样本的坐标,所述vx为所述任意像素样本的运动矢量的水平分量,所述vy为所述任意像素样本的运动矢量的竖直分量;
其中,在等式vx=ax+by中,a为所述仿射运动模型的水平分量的水平坐标系数,b为所述仿射运动模型的水平分量的竖直坐标系数;在等式vy=-bx+ay中,a为所述仿射运动模型的竖直分量的竖直坐标系数,-b为所述仿射运动模型的竖直分量的水平坐标系数。
结合第十一方面,在第十一方面第一种可能的实现方式中,所述仿射运动模型还包括所述仿射运动模型的水平分量的水平位移系数c,以及所述仿射运动模型的竖直分量的竖直位移系数d,从而所述仿射运动模型为如下形式:
$$\begin{cases} v_x = ax + by + c \\ v_y = -bx + ay + d \end{cases}$$
结合第十一方面或第十一方面第一种可能的实现方式,在第十一方面第二种可能的实现方式中,所述计算单元具体用于:
利用所述2个像素样本各自的运动矢量与所述2个像素样本的位置,获得所述仿射运动模型的系数的值;
利用所述仿射运动模型以及所述仿射运动模型的系数的值,获得所述当前图像块中的任意像素样本的运动矢量。
结合第十一方面或第十一方面第一种或第二种可能的实现方式,在第十一方面第三可能的实现方式中,所述计算单元具体用于:
利用所述2个像素样本各自的运动矢量的水平分量之间的差值与所述2个像素样本之间距离的比值,以及所述2个像素样本各自的运动矢量的竖直分量之间的差值与所述2个像素样本之间距离的比值,获得所述仿射运动模型的系数的值;
利用所述仿射运动模型以及所述仿射运动模型的系数的值,获得所述当前图像块中的任意像素样本的运动矢量。
结合第十一方面或第十一方面第一种或第二种可能的实现方式,在第十一方面第四可能的实现方式中,所述计算单元具体用于:
利用所述2个像素样本各自的运动矢量的分量之间的加权和与所述2个像素样本之间距离或所述2个像素样本之间距离的平方的比值,获得所述仿射运动模型的系数的值;
利用所述仿射运动模型以及所述仿射运动模型的系数的值,获得所述当前图像块中的任意像素样本的运动矢量。
结合第十一方面或第十一方面第一种至第三种可能的实现方式中任意一种可能的实现方式,在第十一方面第五可能的实现方式中,在所述2个像素样本包括所述当前图像块的左上像素样本、位于所述左上像素样本右侧的右区域像素样本时,所述仿射运动模型具体为:
Figure PCTCN2016083203-appb-000015
其中,(vx0,vy0)为所述左上像素样本的运动矢量,(vx1,vy1)为所述右区域像素样本的运动矢量,w为所述所述2个像素样本之间的距离。
结合第十一方面或第十一方面第一种至第三种可能的实现方式中任意一种可能的实现方式,在第十一方面第六可能的实现方式中,在所述2个像素样本包括所述当前图像块的左上像素样本、位于所述左上像素样本下方的下区域像素样本时,所述仿射运动模型具体为:
Figure PCTCN2016083203-appb-000016
其中,(vx0,vy0)为所述左上像素样本的运动矢量,(vx2,vy2)为所述下区域像素样本的运动矢量,h为所述所述2个像素样本之间的距离。
结合第十一方面或第十一方面第一种、第二种和第四种可能的实现方式中任意一种可能的实现方式,在第十一方面第七可能的实现方式中,在所述2个像素样本包括所述当前图像块的左上像素样本、位于所述左上像素样本右下方的右下区域像素样本时,所述仿射运动模型具体为:
Figure PCTCN2016083203-appb-000017
其中,(vx0,vy0)为所述左上像素样本的运动矢量,(vx3,vy3)为所述右下区域像素样本的运动矢量,h1为所述所述2个像素样本之间的竖直方向距离,w1为所述2个像素样本之间的水平方向距离,w1 2+h1 2为所述所述2个像素样本之间的距离的平方。
结合第十一方面或第十一方面第一种至第七种可能的实现方式中任意一种可能的实现方式,在第十一方面第八可能的实现方式中,在当所述图像处理装置应用于视频编码装置中的情况下,所述装置还包括编码单元,用于利用所述计算单元计算得到的所述当前图像块中任意像素样本的运动矢量,对所述当前图像块中的所述任意像素样本进行运动补偿预测编码。
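With reference to the eleventh aspect or any one of the first to seventh possible implementations of the eleventh aspect, in a ninth possible implementation of the eleventh aspect, when the image processing apparatus is applied to a video decoding apparatus, the apparatus further includes a decoding unit, configured to perform motion-compensated decoding on the pixel sample using the motion vector of the pixel sample in the current image block computed by the computing unit, to obtain a reconstructed pixel value of the pixel sample.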
本发明实施例第十二方面提供一种图像处理装置,包括:
处理器和存储器;
其中,所述处理器通过调用所述存储器中存储的代码或指令以用于,获得当前图像块的运动矢量2元组,所述运动矢量2元组包括所述当前图像块所属的视频帧中的2个像素样本各自的运动矢量;
利用仿射运动模型和所述运动矢量2元组,计算得到所述当前图像块中任意像素样本的运动矢量;
其中,所述仿射运动模型为如下形式:
$$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
其中,(x,y)为所述任意像素样本的坐标,所述vx为所述任意像素样本的运动矢量的水平分量,所述vy为所述任意像素样本的运动矢量的竖直分量;
其中,在等式vx=ax+by中,a为所述仿射运动模型的水平分量的水平坐标系数,b为所述仿射运动模型的水平分量的竖直坐标系数;在等式vy=-bx+ay中,a为所述仿射运动模型的竖直分量的竖直坐标系数,-b为所述仿射运动模型的竖直分量的水平坐标系数。
结合第十二方面,在第十二方面第一种可能的实现方式中,所述仿射运动模型还包括所述仿射运动模型的水平分量的水平位移系数c,以及所述仿射运动模型的竖直分量的竖直位移系数d,从而所述仿射运动模型为如下形式:
$$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
结合第十二方面或第十二方面第一种可能的实现方式,在第十二方面第二种可能的实现方式中,在所述利用仿射运动模型和所述运动矢量2元组,计算 得到所述当前图像块中任意像素样本的运动矢量方面,所述处理器用于,利用所述2个像素样本各自的运动矢量与所述2个像素样本的位置,获得所述仿射运动模型的系数的值;
利用所述仿射运动模型以及所述仿射运动模型的系数的值,获得所述当前图像块中的任意像素样本的运动矢量。
结合第十二方面或第十二方面第一种或第二种可能的实现方式,在第十二方面第三可能的实现方式中,在利用仿射运动模型和所述运动矢量2元组,计算得到所述当前图像块中任意像素样本的运动矢量方面,所述处理器用于,利用所述2个像素样本各自的运动矢量的水平分量之间的差值与所述2个像素样本之间距离的比值,以及所述2个像素样本各自的运动矢量的竖直分量之间的差值与所述2个像素样本之间距离的比值,获得所述仿射运动模型的系数的值;
利用所述仿射运动模型以及所述仿射运动模型的系数的值,获得所述当前图像块中的任意像素样本的运动矢量。
结合第十二方面或第十二方面第一种或第二种可能的实现方式,在第十二方面第四可能的实现方式中,在利用仿射运动模型和所述运动矢量2元组,计算得到所述当前图像块中任意像素样本的运动矢量方面,所述处理器用于,利用所述2个像素样本各自的运动矢量的分量之间的加权和与所述2个像素样本之间距离或所述2个像素样本之间距离的平方的比值,获得所述仿射运动模型的系数的值;
利用所述仿射运动模型以及所述仿射运动模型的系数的值,获得所述当前图像块中的任意像素样本的运动矢量。
结合第十二方面或第十二方面第一种至第三种可能的实现方式中任意一种可能的实现方式,在第十二方面第五可能的实现方式中,在所述2个像素样本包括所述当前图像块的左上像素样本、位于所述左上像素样本右侧的右区域像素样本时,所述仿射运动模型具体为:
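$$\begin{cases} vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\ vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0 \end{cases}$$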
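where (vx0, vy0) is the motion vector of the upper-left pixel sample, (vx1, vy1) is the motion vector of the right-region pixel sample, and w is the distance between the 2 pixel samples.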
结合第十二方面或第十二方面第一种至第三种可能的实现方式中任意一种可能的实现方式,在第十二方面第六可能的实现方式中,在所述2个像素样本包括所述当前图像块的左上像素样本、位于所述左上像素样本下方的下区域像素样本时,所述仿射运动模型具体为:
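$$\begin{cases} vx = \dfrac{vy_2 - vy_0}{h}\,x + \dfrac{vx_2 - vx_0}{h}\,y + vx_0 \\ vy = -\dfrac{vx_2 - vx_0}{h}\,x + \dfrac{vy_2 - vy_0}{h}\,y + vy_0 \end{cases}$$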
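where (vx0, vy0) is the motion vector of the upper-left pixel sample, (vx2, vy2) is the motion vector of the lower-region pixel sample, and h is the distance between the 2 pixel samples.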
结合第十二方面或第十二方面第一种、第二种和第四种可能的实现方式中任意一种可能的实现方式,在第十二方面第七可能的实现方式中,在所述2个像素样本包括所述当前图像块的左上像素样本、位于所述左上像素样本右下方的右下区域像素样本时,所述仿射运动模型具体为:
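$$\begin{cases} vx = \dfrac{(vx_3 - vx_0)w_1 + (vy_3 - vy_0)h_1}{w_1^2 + h_1^2}\,x + \dfrac{(vx_3 - vx_0)h_1 - (vy_3 - vy_0)w_1}{w_1^2 + h_1^2}\,y + vx_0 \\ vy = -\dfrac{(vx_3 - vx_0)h_1 - (vy_3 - vy_0)w_1}{w_1^2 + h_1^2}\,x + \dfrac{(vx_3 - vx_0)w_1 + (vy_3 - vy_0)h_1}{w_1^2 + h_1^2}\,y + vy_0 \end{cases}$$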
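where (vx0, vy0) is the motion vector of the upper-left pixel sample, (vx3, vy3) is the motion vector of the lower-right-region pixel sample, h1 is the vertical distance between the 2 pixel samples, w1 is the horizontal distance between the 2 pixel samples, and $w_1^2 + h_1^2$ is the square of the distance between the 2 pixel samples.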
结合第十二方面或第十二方面第一种至第七种可能的实现方式中任意一种可能的实现方式,在第十二方面第八可能的实现方式中,在当所述图像处理装置应用于视频编码装置中的情况下,所述处理器还用于,在所述利用仿射运动模型和所述运动矢量2元组,计算得到所述当前图像块中任意像素样本的运动矢量之后,利用计算得到的所述当前图像块中任意像素样本的运动矢量,对所述当前图像块中的所述任意像素样本进行运动补偿预测编码。
结合第十二方面或第十二方面第一种至第七种可能的实现方式中任意一 种可能的实现方式,在第十二方面第九可能的实现方式中,所述处理器还用于,在所述确定所述当前图像块中的所述任意像素样本的像素点的预测像素值之后,利用计算得到的所述当前图像块中任意像素样本的运动矢量,对所述任意像素样本进行运动补偿解码,得到所述任意像素样本的像素重建值。
A thirteenth aspect of the embodiments of the present invention provides an image processing method, including:
获得仿射运动模型的系数,利用所述仿射运动模型的系数以及所述仿射模型,计算得到所述当前图像块中任意像素样本的运动矢量;
利用计算得到的所述任意像素样本的运动矢量,确定所述任意像素样本的像素点的预测像素值;
其中,所述仿射运动模型为如下形式:
$$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
其中,(x,y)为所述任意像素样本的坐标,所述vx为所述任意像素样本的运动矢量的水平分量,所述vy为所述任意像素样本的运动矢量的竖直分量;
其中,在等式vx=ax+by中,a为所述仿射运动模型的水平分量的水平坐标系数,b为所述仿射运动模型的水平分量的竖直坐标系数;在等式vy=-bx+ay中,a为所述仿射运动模型的竖直分量的竖直坐标系数,-b为所述仿射运动模型的竖直分量的水平坐标系数,所述仿射运动模型的系数包括a和b;
所述仿射运动模型的系数还包括所述仿射运动模型的水平分量的水平位移系数c,以及所述仿射运动模型的竖直分量的竖直位移系数d,从而所述仿射运动模型为如下形式:
$$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
本发明实施例第十四方面提供一种图像处理装置,包括:
获得单元,用于获得仿射运动模型的系数;
计算单元,用于利用所述获得单元获得的所述仿射运动模型的系数以及所述仿射模型,计算得到所述当前图像块中任意像素样本的运动矢量;
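a prediction unit, configured to determine the predicted pixel value of the pixel of the arbitrary pixel sample by using the motion vector of the arbitrary pixel sample computed by the calculation unit;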
其中,所述仿射运动模型为如下形式:
$$\begin{cases} vx = ax + by \\ vy = -bx + ay \end{cases}$$
其中,(x,y)为所述任意像素样本的坐标,所述vx为所述任意像素样本的运动矢量的水平分量,所述vy为所述任意像素样本的运动矢量的竖直分量;
其中,在等式vx=ax+by中,a为所述仿射运动模型的水平分量的水平坐标系数,b为所述仿射运动模型的水平分量的竖直坐标系数;在等式vy=-bx+ay中,a为所述仿射运动模型的竖直分量的竖直坐标系数,-b为所述仿射运动模型的竖直分量的水平坐标系数,所述仿射运动模型的系数包括a和b;
所述仿射运动模型的系数还包括所述仿射运动模型的水平分量的水平位移系数c,以及所述仿射运动模型的竖直分量的竖直位移系数d,从而所述仿射运动模型为如下形式:
$$\begin{cases} vx = ax + by + c \\ vy = -bx + ay + d \end{cases}$$
可以看出,本发明的一些实施例提供的技术方案中,利用仿射运动模型和合并运动信息单元集i对当前图像块进行像素值预测,其中,合并运动信息单元集i中的每个运动信息单元分别选自2个像素样本中的每个像素样本所对应的候选运动信息单元集中的至少部分运动信息单元,其中,由于合并运动信息单元集i选择范围变得相对较小,摒弃了传统技术采用的在多个像素样本的全部可能候选运动信息单元集合中通过大量计算才筛选出多个像素样本的一种运动信息单元的机制,有利于提高编码效率,并且也有利于降低基于仿射运动模型进行图像预测的计算复杂度,进而使得仿射运动模型引入视频编码标准变得可能。并且由于引入了仿射运动模型,有利于更准确描述物体运动,故而有利于提高预测准确度。并且,由于所参考的像素样本的数量可为2个,这样有 利于进一步降低引入仿射运动模型之后,基于仿射运动模型进行图像预测的计算复杂度,并且,也有利于减少编码端传递仿射参数信息或者运动矢量残差的个数等。
附图说明
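To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments and the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.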
图1-a~图1-b为本发明实施例提供的几种图像块的划分示意图;
图1-c为本发明实施例提供的一种图像预测方法的流程示意图;
图1-d为本发明实施例提供的一种图像块的示意图;
图2-a为本发明实施例提供的另一种图像预测方法的流程示意图;
图2-b~图2-d是本发明实施例提供的几种确定像素样本的候选运动信息单元集的示意图;
图2-e是本发明实施例提供的图像块x的顶点坐标的示意图;
图2-f~图2-g是本发明实施例提供的像素点仿射运动的示意图;
图2-h~图2-i是本发明实施例提供的像素点旋转运动的示意图;
图3为本发明实施例提供的另一种图像预测方法的流程示意图;
图4是本发明实施例提供的另一种图像预测方法的流程示意图;
图5是本发明实施例提供的另一种图像预测方法的流程示意图;
图6为本发明实施例提供的一种图像处理装置的示意图;
图7为本发明实施例提供的另一种图像处理装置的示意图;
图8为本发明实施例提供的另一种图像处理装置的示意图;
图9是本发明实施例提供的另一种图像处理装置的示意图。
具体实施方式
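Embodiments of the present invention provide an image prediction method and a related device, with the aim of reducing the computational complexity of image prediction based on an affine motion model.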
为使得本发明的发明目的、特征、优点能够更加明显和易懂,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚地描述,显然下面所描述的实施例仅是本发明的一部分实施例,而非全部实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本发明保护的范围。
本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”和“第四”等是用于区别不同对象,而不是用于描述特定顺序。此外,术语“包括”和“具有”以及它们任何变形,意图在于覆盖不排他的包括。例如包括了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其它步骤或单元。
下面先对本发明实施例可能涉及的一些概念进行介绍。
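In most coding frameworks, a video sequence consists of a series of pictures, a picture is further divided into slices, and a slice is further divided into blocks. Video coding is performed block by block, and may proceed row by row, from left to right and from top to bottom, starting from the upper-left corner of a picture. In some newer video coding standards, the concept of a block is further extended. The H.264 standard has the macroblock (MB), which can be further divided into multiple prediction blocks (partitions) that can be used for predictive coding. The HEVC standard uses basic concepts such as the coding unit (CU), the prediction unit (PU), and the transform unit (TU); it divides units into several types by function and describes them with a new tree-based structure. For example, a CU can be split into smaller CUs according to a quadtree, and the smaller CUs can be split further, forming a quadtree structure. PUs and TUs have similar tree structures. CUs, PUs, and TUs all essentially belong to the concept of a block. A CU is similar to a macroblock (MB) or a coding block, and is the basic unit for partitioning and coding a picture. A PU may correspond to a prediction block and is the basic unit of predictive coding; a CU is further divided into multiple PUs according to a partitioning mode. A TU may correspond to a transform block and is the basic unit for transforming the prediction residual. In the High Efficiency Video Coding (HEVC) standard, they may be collectively referred to as coding tree blocks (CTB), and so on.
In the HEVC standard, the coding unit size may have four levels: 64×64, 32×32, 16×16, and 8×8. A coding unit at each level can in turn be divided into prediction units of different sizes according to intra prediction and inter prediction. For example, as shown in Figure 1-a and Figure 1-b, Figure 1-a illustrates a prediction unit partitioning manner corresponding to intra prediction, and Figure 1-b illustrates several prediction unit partitioning manners corresponding to inter prediction.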
在视频编码技术发展演进过程中,视频编码专家们想了各种方法来利用相邻编解码块之间的时空相关性来努力提高编码效率。在H264/高级视频编码(英文:advanced video coding,缩写:AVC)标准中,跳过模式(skip mode)和直接模式(direct mode)成为提高编码效率的有效工具,在低码率时使用这两种编码模式的块能占到整个编码序列的一半以上。当使用跳过模式时,只需要在码流中传递一个跳过模式标记,就可以利用周边运动矢量推导得到当前图像块的运动矢量,根据该运动矢量来直接拷贝参考块的值作为当前图像块的重建值。此外,当使用直接模式时,编码器可以利用周边运动矢量推导得到当前图像块的运动矢量,根据该运动矢量直接拷贝参考块的值作为当前图像块的预测值,在编码端利用该预测值对当前图像块进行编码预测。目前最新的高性能视频编码(英文:high efficiency video coding,缩写:HEVC)标准中,通过引进一些新编码工具,进一步提高视频编码性能。融合编码(merge)模式和自适应运动矢量预测(英文:advanced motion vector prediction,缩写:AMVP)模式是两个重要的帧间预测工具。融合编码(merge)利用当前编码块周边已编码块的运动信息(可包括运动矢量(英文:motion vector,缩写:MV)和预测方向和参考帧索引等)构造一个候选运动信息集合,通过比较,可选择出编码效率最高的候选运动信息作为当前编码块的运动信息,在参考帧中找到当前编码块的预测值,对当前编码块进行预测编码,同时,可把表示选择来自哪个周边已编码块的运动信息的索引值写入码流。当使用自适应运动矢量预测模式时,利用周边已编码块的运动矢量作为当前编码块运动矢量的预测值,可以选定一个编码效率最高的运动矢量来预测当前编码块的运动矢量,并可把表示 选定哪个周边运动矢量的索引值写入视频码流。
下面先介绍本发明实施例提供的图像预测方法,本发明实施例提供的图像预测方法的执行主体是视频编码装置或视频解码装置,其中,该视频编码装置或视频解码装置可以是任何需要输出或存储视频的装置,如笔记本电脑、平板电脑、个人电脑、手机或视频服务器等设备。
本发明图像预测方法的一个实施例,一种图像预测方法包括:确定当前图像块中的2个像素样本,确定所述2个像素样本之中的每个像素样本所对应的候选运动信息单元集;其中,所述每个像素样本所对应的候选运动信息单元集包括候选的至少一个运动信息单元;确定包括2个运动信息单元的合并运动信息单元集i;其中,所述合并运动信息单元集i中的每个运动信息单元分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的至少部分运动信息单元,其中,所述运动信息单元包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量;利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测。
请参见图1-c,图1-c为本发明的一个实施例提供的一种图像预测方法的流程示意图。其中,图1-c举例所示,本发明的第一个实施例提供的一种图像预测方法可包括:
S101、确定当前图像块中的2个像素样本,确定所述2个像素样本之中的每个像素样本所对应的候选运动信息单元集。
其中,所述每个像素样本所对应的候选运动信息单元集包括候选的至少一个运动信息单元。
其中,本发明各实施例中提及的像素样本可以是像素点或包括至少两个像素点的像素块。
其中,本发明各实施例中提及运动信息单元可包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量。也就是说,一个运动信息单元可能包括一个运动矢量或可能包括预测方向不同的两个运动矢量。
其中,若运动信息单元对应的预测方向为前向,表示该运动信息单元包括预测方向为前向的运动矢量但不包括预测方向为后向的运动矢量。若运动信息 单元对应的预测方向为后向,表示该运动信息单元包括预测方向为后向的运动矢量但不包括预测方向为前向的运动矢量。若运动信息单元对应的预测方向为单向,表示该运动信息单元包括预测方向为前向的运动矢量但不包括预测方向为后向的运动矢量,或者表示该运动信息单元包括预测方向为后向的运动矢量但不包括预测方向为前向的运动矢量。其中,若运动信息单元对应的预测方向为双向,表示运动信息单元包括预测方向为前向的运动矢量和预测方向为后向的运动矢量。
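One way to picture a motion information unit and its prediction direction is the small data structure below. It is only a sketch: the class name `MotionInfoUnit`, its field names, and the string labels are assumptions introduced for illustration and do not come from the source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]  # (horizontal component, vertical component)


@dataclass(frozen=True)
class MotionInfoUnit:
    """A motion information unit holding a forward MV, a backward MV, or both."""
    forward_mv: Optional[MV] = None
    forward_ref: Optional[int] = None    # reference frame index of the forward MV
    backward_mv: Optional[MV] = None
    backward_ref: Optional[int] = None   # reference frame index of the backward MV

    def direction(self) -> str:
        has_fwd = self.forward_mv is not None
        has_bwd = self.backward_mv is not None
        if has_fwd and has_bwd:
            return "bidirectional"
        if has_fwd:
            return "forward"
        if has_bwd:
            return "backward"
        raise ValueError("a motion information unit needs at least one motion vector")

    def is_unidirectional(self) -> bool:
        return self.direction() in ("forward", "backward")


if __name__ == "__main__":
    unit = MotionInfoUnit(forward_mv=(3, -1), forward_ref=0)
    print(unit.direction(), unit.is_unidirectional())
```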
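Optionally, in some possible implementations of the present invention, the 2 pixel samples include 2 of the following pixel samples of the current image block: the upper-left pixel sample, the upper-right pixel sample, the lower-left pixel sample, and the center pixel sample a1. The upper-left pixel sample of the current image block may be the upper-left vertex of the current image block, or a pixel block in the current image block that contains the upper-left vertex; the lower-left pixel sample of the current image block is the lower-left vertex of the current image block, or a pixel block in the current image block that contains the lower-left vertex; the upper-right pixel sample of the current image block is the upper-right vertex of the current image block, or a pixel block in the current image block that contains the upper-right vertex; the center pixel sample a1 of the current image block is the center pixel of the current image block, or a pixel block in the current image block that contains the center pixel.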
若像素样本为像素块,则该像素块的大小例如为2*2,1*2、4*2、4*4或其他大小。图像块可包括多个像素块。
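It should be noted that, for an image block of size w*w, when w is odd (for example, w equals 3, 5, 7, or 11), the image block has a unique center pixel; when w is even (for example, w equals 4, 6, 8, or 16), the image block may have multiple center pixels. The center pixel sample of the image block may be any one of these center pixels or a designated center pixel, or it may be a pixel block in the image block that contains any one of the center pixels, or a pixel block in the image block that contains the designated center pixel. For example, the 4*4 image block shown in Figure 1-d has 4 center pixels, A1, A2, A3, and A4; the designated center pixel may then be pixel A1 (the upper-left center pixel), pixel A2 (the lower-left center pixel), pixel A3 (the upper-right center pixel), or pixel A4 (the lower-right center pixel). Other cases follow by analogy.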
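The odd and even cases can be made concrete with a short sketch. The function name `center_pixels`, the zero-based (x, y) coordinates, and the ordering of the four even-case pixels (upper-left, lower-left, upper-right, lower-right center pixel) are assumptions for this example.

```python
def center_pixels(w):
    """Return the center pixel coordinates (x, y) of a w*w block, zero-based.

    An odd w gives one center pixel; an even w gives four, listed as
    upper-left, lower-left, upper-right, lower-right center pixel.
    """
    if w % 2 == 1:
        mid = w // 2
        return [(mid, mid)]
    lo, hi = w // 2 - 1, w // 2
    return [(lo, lo), (lo, hi), (hi, lo), (hi, hi)]


if __name__ == "__main__":
    print(center_pixels(3))
    print(center_pixels(4))
```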
S102、确定包括2个运动信息单元的合并运动信息单元集i。
其中,所述合并运动信息单元集i中的每个运动信息单元分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的至少部分运动信息单元。其中,所述运动信息单元包括预测方向为前向的运动矢量和/或预测方向为后向的运动矢量。
举例来说,假设2个像素样本包括像素样本001和像素样本002。像素样本001对应的候选运动信息单元集为候选运动信息单元集011。像素样本002对应的候选运动信息单元集为候选运动信息单元集022。其中,合并运动信息单元集i包括运动信息单元C01和运动信息单元C02,其中,运动信息单元C01可选自候选运动信息单元集011,其中,运动信息单元C02可选自候选运动信息单元集022,以此类推。
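It can be understood that, assuming the merged motion information unit set i includes motion information unit C01 and motion information unit C02, either of which may include a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward, the merged motion information unit set i may include 2 motion vectors (the prediction direction of these 2 motion vectors may be forward or backward, or the 2 motion vectors may consist of 1 forward motion vector and 1 backward motion vector), 4 motion vectors (2 forward motion vectors and 2 backward motion vectors), or 3 motion vectors (1 forward and 2 backward motion vectors, or 2 forward and 1 backward motion vector).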
S103、利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测。
其中,当前图像块可为当前编码块或当前解码块。
可以看出,本实施例的技术方案中,利用仿射运动模型和合并运动信息单元集i对当前图像块进行像素值预测,其中,合并运动信息单元集i中的每个运动信息单元分别选自2个像素样本中的每个像素样本所对应的候选运动信息单 元集中的至少部分运动信息单元,由于合并运动信息单元集i选择范围变得相对较小,摒弃了传统技术采用的在多个像素样本的全部可能候选运动信息单元集合中通过大量计算才筛选出多个像素样本的一种运动信息单元的机制,有利于提高编码效率,并且也有利于降低基于仿射运动模型进行图像预测的计算复杂度,进而使得仿射运动模型引入视频编码标准变得可能。并且由于引入了仿射运动模型,有利于更准确描述物体运动,故而有利于提高预测准确度。并且由于所参考的像素样本的数量可为2个,这样有利于进一步降低引入仿射运动模型之后,基于仿射运动模型进行图像预测的计算复杂度,并且,也有利于减少编码端传递仿射参数信息或者运动矢量残差的个数等。
其中,本实施例提供的所述图像预测方法可应用于视频编码过程中或可应用于视频解码过程中。
在实际应用中,确定包括2个运动信息单元的合并运动信息单元集i的方式可能是多种多样的。
可选的,在本发明的一些可能的实施方式中,确定包括2个运动信息单元的合并运动信息单元集i,包括:从N个候选合并运动信息单元集之中确定出包含2个运动信息单元的合并运动信息单元集i;其中,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集所包含的每个运动信息单元,分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的符合约束条件的至少部分运动信息单元,其中,所述N为正整数,所述N个候选合并运动信息单元集互不相同,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集包括2个运动信息单元。
其中,两个候选合并运动信息单元集不相同,可指候选合并运动信息单元集包括的运动信息单元不完全相同。
其中,两个运动信息单元不相同,可指两个运动信息单元所包括的运动矢量不同,或两个运动信息单元所包括的运动矢量对应的预测方向不同,或者两个运动信息单元所包括的运动矢量对应的参考帧索引不同。其中,两个运动信息单元相同,可指两个运动信息单元所包括的运动矢量相同,且两个运动信息单元所包括的运动矢量对应的预测方向相同,且两个运动信息单元所包括的运 动矢量对应的参考帧索引相同。
可选的,在本发明的一些可能的实施方式中,在所述图像预测方法应用于视频解码过程中的情况下,从N个候选合并运动信息单元集之中确定包含2个运动信息单元的合并运动信息单元集i,可以包括:基于从视频码流中获得的合并运动信息单元集i的标识,从N个候选合并运动信息单元集之中确定包含2个运动信息单元的合并运动信息单元集i。
可选的,在本发明的一些可能的实施方式中,在所述图像预测方法应用于视频编码过程中的情况下,所述方法还可包括:将所述合并运动信息单元集i的标识写入视频码流。所述合并运动信息单元集i的标识可以是任何能够标识出所述合并运动信息单元集i的信息,例如所述合并运动信息单元集i的标识可为合并运动信息单元集i在合并运动信息单元集列表中的索引号等。
可选的,在本发明的一些可能的实施方式中,在所述图像预测方法应用于视频编码过程中的情况下,所述方法还包括:利用所述2个像素样本的空域相邻或者时域相邻的像素样本的运动矢量,得到所述2个像素样本的运动矢量预测值,根据所述2个像素样本的运动矢量预测值得到所述2个像素样本的运动矢量残差,将所述2个像素样本的运动矢量残差写入视频码流。
可选的,在本发明的一些可能的实施方式中,在所述图像预测方法应用于视频解码过程中的情况下,所述方法还包括:从视频码流中解码得到所述2个像素样本的运动矢量残差,利用所述2个像素样本的空域相邻或时域相邻的像素样本的运动矢量得到所述2个像素样本的运动矢量预测值,基于所述2个像素样本的运动矢量预测值和所述2个像素样本的运动矢量残差分别得到所述2个像素样本的运动矢量。
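The encoder-side residual and the decoder-side reconstruction mirror each other, as the following sketch shows. The function names are hypothetical, and how the motion vector predictor is chosen from the spatially or temporally adjacent samples is left out because the source does not fix it here.

```python
def mv_residual(mv, mvp):
    """Encoder side: residual = actual motion vector - predicted motion vector."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_mv(mvp, residual):
    """Decoder side: motion vector = predicted motion vector + decoded residual."""
    return (mvp[0] + residual[0], mvp[1] + residual[1])


if __name__ == "__main__":
    mv, mvp = (5, -3), (4, -1)        # hypothetical values for one pixel sample
    mvd = mv_residual(mv, mvp)        # written to the bitstream by the encoder
    print(mvd, reconstruct_mv(mvp, mvd))
```

Since this is done once per pixel sample, referencing only 2 pixel samples means only 2 residuals need to be transmitted.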
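Optionally, in some possible implementations of the present invention, determining, from the N candidate merged motion information unit sets, the merged motion information unit set i that contains 2 motion information units may include: determining, based on distortion or rate-distortion cost, the merged motion information unit set i that contains 2 motion information units from the N candidate merged motion information unit sets.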
可选的,合并运动信息单元集i对应的率失真代价,小于或等于上述N个候选合并运动信息单元集中除合并运动信息单元集i之外的任意一个合并运动信息单元集对应的率失真代价。
可选的,合并运动信息单元集i对应的失真,小于或者等于上述N个候选合并运动信息单元集中除合并运动信息单元集i之外的任意一个合并运动信息单元集对应的失真。
其中,上述N个候选合并运动信息单元集之中的某个候选合并运动信息单元集(例如上述N个候选合并运动信息单元集中的合并运动信息单元集i)对应的率失真代价例如可以为利用该某个候选合并运动信息单元集(例如合并运动信息单元集i)对图像块(例如当前图像块)进行像素值预测而得到的该图像块的预测像素值所对应的率失真代价。
其中,上述N个候选合并运动信息单元集之中的某个候选合并运动信息单元集(例如上述N个候选合并运动信息单元集中的合并运动信息单元集i)对应的失真,例如可为图像块(如当前图像块)的原始像素值与利用该某个候选合并运动信息单元集(如合并运动信息单元集i)对该图像块进行像素值预测而得到的该图像块的预测像素值之间的失真(即,图像块的原始像素值与预测像素值之间的失真)。
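In some possible implementations of the present invention, the distortion between the original pixel values of an image block (such as the current image block) and the predicted pixel values obtained by predicting that image block with a given candidate merged motion information unit set (such as the merged motion information unit set i) may specifically be, for example, the sum of squared differences (SSD), the sum of absolute differences (SAD), or the sum of differences between the original and predicted pixel values, or another distortion parameter capable of measuring distortion.
N is a positive integer; for example, N may be equal to 1, 2, 3, 4, 5, 6, 8, or another value.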
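The two most common distortion measures named above can be sketched directly. The helper `select_best`, which picks the candidate prediction with the lowest distortion, is a hypothetical name; a rate-distortion cost would additionally weigh the bits spent, which this example does not model.

```python
def sad(original, predicted):
    """Sum of absolute differences between two equally sized 2-D pixel arrays."""
    return sum(abs(o - p)
               for row_o, row_p in zip(original, predicted)
               for o, p in zip(row_o, row_p))


def ssd(original, predicted):
    """Sum of squared differences between two equally sized 2-D pixel arrays."""
    return sum((o - p) ** 2
               for row_o, row_p in zip(original, predicted)
               for o, p in zip(row_o, row_p))


def select_best(original, predictions, metric=sad):
    """Return the index of the prediction with the smallest distortion."""
    return min(range(len(predictions)),
               key=lambda i: metric(original, predictions[i]))


if __name__ == "__main__":
    block = [[10, 12], [8, 9]]
    candidates = [[[9, 12], [10, 6]], [[10, 12], [8, 8]]]
    print(sad(block, candidates[0]), ssd(block, candidates[0]))
    print(select_best(block, candidates))
```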
可选的,在本发明的一些可能的实施方式中,上述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的各运动信息单元可以互不相同。
可选的,在本发明的一些可能的实施方式中,所述N个候选合并运动信 息单元集满足第一条件、第二条件、第三条件、第四条件和第五条件之中的至少一个条件。
其中,所述第一条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为非平动运动。例如,在候选合并运动信息单元集中的对应第一预测方向的所有运动矢量相等的情况下,可认为该候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为平动运动,反之,则可认为该候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为非平动运动,其中,第一预测方向为前向或后向。又例如在候选合并运动信息单元集中的对应预测方向为前向的所有运动矢量相等,且候选合并运动信息单元集中的对应预测方向为后向的所有运动矢量相等的情况下,可认为该候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为平动运动,反之,则可认为该候选合并运动信息单元集中的运动信息单元所指示出的所述当前图像块的运动方式为非平动运动。
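The translational test in the second example above (all forward motion vectors equal, and all backward motion vectors equal) can be sketched as follows. Representing each motion information unit as a dictionary keyed by `"forward"` and `"backward"` is an assumption made for brevity, as is the function name.

```python
def is_translational(unit_a, unit_b):
    """Return True if, for each prediction direction, all motion vectors
    present in the two units are equal.

    Each unit is a dict that maps "forward" and/or "backward" to an (vx, vy) tuple.
    """
    for direction in ("forward", "backward"):
        mvs = {unit[direction] for unit in (unit_a, unit_b) if direction in unit}
        if len(mvs) > 1:      # two different MVs in the same direction
            return False
    return True


if __name__ == "__main__":
    print(is_translational({"forward": (1, 2)}, {"forward": (1, 2)}))
    print(is_translational({"forward": (1, 2)}, {"forward": (1, 3)}))
```

A candidate merged motion information unit set satisfies the first condition when this check returns `False`, that is, when the indicated motion is non-translational.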
所述第二条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元对应的预测方向相同。
举例来说,当两个运动信息单元均包括预测方向为前向的运动矢量和预测方向为后向的运动矢量,表示这两个运动信息单元对应的预测方向相同。又例如,当两个运动信息单元中的其中一个运动信息单元包括预测方向为前向的运动矢量和预测方向为后向的运动矢量,另一个运动信息单元包括预测方向为前向的运动矢量但不包括预测方向为后向的运动矢量,或者该另一个运动信息单元包括预测方向为后向的运动矢量但不包括预测方向为前向的运动矢量,可表示这两个运动信息单元对应的预测方向不相同。又例如,当两个运动信息单元中的其中一个运动信息单元包括预测方向为前向的运动矢量但不包括预测方向为后向的运动矢量,而另一个运动信息单元包括预测方向为后向的运动矢量但不包括预测方向为前向的运动矢量,可表示这两个运动信息单元对应的预测方向不相同。又例如,当两个运动信息单元均包括预测方向为前向的运动矢量但这两个运动信息单元均不包括预测方向为后向的运动矢量,表示这两个运动 信息单元对应的预测方向相同。又例如,当两个运动信息单元均包括预测方向为后向的运动矢量但是这两个运动信息单元均不包括预测方向为前向的运动矢量,表示这两个运动信息单元对应的预测方向相同。
所述第三条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元对应的参考帧索引相同。
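For example, when two motion information units both include a motion vector whose prediction direction is forward and a motion vector whose prediction direction is backward, the reference frame indexes corresponding to the forward motion vectors in the two units are the same, and the reference frame indexes corresponding to the backward motion vectors in the two units are the same, this may indicate that the reference frame indexes corresponding to the two motion information units are the same. As another example, when one of the two motion information units includes both a forward and a backward motion vector, and the other includes a forward motion vector but no backward motion vector, or a backward motion vector but no forward motion vector, the prediction directions corresponding to the two units are different, and this may indicate that the reference frame indexes corresponding to the two units are different. As another example, when one of the two motion information units includes a forward motion vector but no backward motion vector, and the other includes a backward motion vector but no forward motion vector, this may indicate that the reference frame indexes corresponding to the two units are different. As another example, when each of the two motion information units includes a forward motion vector but no backward motion vector, and the reference frame indexes corresponding to the forward motion vectors in the two units are the same, this may indicate that the reference frame indexes corresponding to the two units are the same. As another example, when each of the two motion information units includes a backward motion vector but no forward motion vector, and the reference frame indexes corresponding to the backward motion vectors in the two units are the same, this may indicate that the reference frame indexes corresponding to the two units are the same.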
所述第四条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元的运动矢量水平分量的差值的绝对值小于或等于水平分量阈值,或者,所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的其中1个运动信息单元和像素样本Z的运动矢量水平分量之间的差值的绝对值小于或等于水平分量阈值,所述当前图像块的所述像素样本Z不同于所述2个像素样本中的任意一个像素样本。其中,上述水平分量阈值例如可以等于当前图像块的宽度的1/3、当前图像块的宽度的1/2、当前图像块的宽度的2/3或当前图像块的宽度的3/4或其他大小。
所述第五条件包括所述N个候选合并运动信息单元集中的任意一个候选合并运动信息单元集中的2个运动信息单元的运动矢量竖直分量的差值的绝对值小于或等于竖直分量阈值,或者,所述N个候选合并运动信息单元集中的其中一个候选合并运动信息单元集中的任意1个运动信息单元和像素样本Z的运动矢量竖直分量之间的差值的绝对值小于或等于竖直分量阈值,所述当前图像块的所述像素样本Z不同于所述2个像素样本中的任意一个像素样本。其中,上述垂直分量阈值例如可以等于当前图像块的高度的1/3、当前图像块的高度的1/2、当前图像块的高度的2/3或当前图像块的高度的3/4或其他大小。
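The fourth and fifth conditions are simple threshold checks on component differences. The sketch below tests both at once for a pair of motion vectors; the function name and the default `ratio` of 1/2 are assumptions (the text lists 1/3, 1/2, 2/3, and 3/4 of the block width or height as example thresholds).

```python
def within_component_thresholds(mv_a, mv_b, width, height, ratio=0.5):
    """Check the horizontal (fourth condition) and vertical (fifth condition)
    component differences of two motion vectors against block-size thresholds.
    """
    horizontal_ok = abs(mv_a[0] - mv_b[0]) <= width * ratio
    vertical_ok = abs(mv_a[1] - mv_b[1]) <= height * ratio
    return horizontal_ok and vertical_ok


if __name__ == "__main__":
    # 8x16 block: thresholds are 4 (horizontal) and 8 (vertical) at ratio 1/2.
    print(within_component_thresholds((0, 0), (4, 8), 8, 16))
    print(within_component_thresholds((0, 0), (4, 9), 8, 16))
```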
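Assuming the two pixel samples are the upper-left pixel sample and the upper-right pixel sample of the current image block, the pixel sample Z may be the lower-left pixel sample, the center pixel sample, or another pixel sample of the current image block. Other cases follow by analogy.
Optionally, in some possible implementations of the present invention, the candidate motion information unit set corresponding to the upper-left pixel sample of the current image block includes the motion information units of x1 pixel samples, where the x1 pixel samples include at least one pixel sample spatially adjacent to the upper-left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper-left pixel sample of the current image block, and x1 is a positive integer. For example, the x1 pixel samples include only at least one pixel sample spatially adjacent to the upper-left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper-left pixel sample of the current image block.
For example, x1 may be equal to 1, 2, 3, 4, 5, 6, or another value.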
例如,所述x1个像素样本包括与所述当前图像块所属的视频帧时域相邻的 视频帧之中的与所述当前图像块的左上像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
可选的,在本发明一些可能的实施方式中,所述当前图像块的右上像素样本所对应的候选运动信息单元集包括x2个像素样本的运动信息单元,其中,所述x2个像素样本包括至少一个与所述当前图像块的右上像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的右上像素样本时域相邻的像素样本,所述x2为正整数。
For example, x2 may be equal to 1, 2, 3, 4, 5, 6, or another value.
例如,所述x2个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本位置相同的像素样本、所述当前图像块的右边的空域相邻像素样本、所述当前图像块的右上的空域相邻像素样本和所述当前图像块的上边的空域相邻像素样本中的至少一个。
可选的,在本发明一些可能的实施方式中,所述当前图像块的左下像素样本所对应的候选运动信息单元集包括x3个像素样本的运动信息单元,其中,所述x3个像素样本包括至少一个与所述当前图像块的左下像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左下像素样本时域相邻的像素样本,所述x3为正整数。例如所述x3个像素样本只包括至少一个与所述当前图像块的左下像素样本空域相邻的像素样本和/或至少一个与所述当前图像块的左下像素样本时域相邻的像素样本。
For example, x3 may be equal to 1, 2, 3, 4, 5, 6, or another value.
例如,所述x3个像素样本包括与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本位置相同的像素样本、所述当前图像块的左边的空域相邻像素样本、所述当前图像块的左下的空域相邻像素样本和所述当前图像块的下边的空域相邻像素样本中的至少一个。
可选的,在本发明一些可能实施方式中,所述当前图像块的中心像素样本a1所对应的候选运动信息单元集包括x5个像素样本的运动信息单元,其中,所述x5个像素样本中的其中一个像素样本为像素样本a2。例如所述x5个像素样本 只包括像素样本a2。其中,所述中心像素样本a1在所述当前图像块所属视频帧中的位置,与所述像素样本a2在所述当前图像块所属视频帧的相邻视频帧中的位置相同,所述x5为正整数。
可选的,在本发明一些可能实施方式之中,所述利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测,可包括:当所述合并运动信息单元集i中的预测方向为第一预测方向的运动矢量对应的参考帧索引不同于所述当前图像块的参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为第一预测方向的运动矢量被缩放到所述当前图像块的参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测,所述第一预测方向为前向或后向;
或者,所述利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测,可以包括:当所述合并运动信息单元集i中的预测方向为前向的运动矢量对应的参考帧索引不同于所述当前图像块的前向参考帧索引,并且所述合并运动信息单元集i中的预测方向为后向的运动矢量对应的参考帧索引不同于所述当前图像块的后向参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为前向的运动矢量被缩放到所述当前图像块的前向参考帧且使得所述合并运动信息单元集i中的预测方向为后向的运动矢量被缩放到所述当前图像块的后向参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测。
可选的,在本发明的一些可能的实施方式之中,利用非平动运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测,例如可包括:对进行缩放处理后的合并运动信息单元集i中的运动矢量进行运动估计处理,以得到运动估计处理后的合并运动信息单元集i,利用非平动运动模型和运动估计处理后的合并运动信息单元集i对所述当前图像块进行像素值预测。
可选的,在本发明的一些可能的实施方式之中,所述利用仿射运动模型和 所述合并运动信息单元集i对所述当前图像块进行像素值预测,包括:利用仿射运动模型和所述合并运动信息单元集i计算得到所述当前图像块中的各像素点的运动矢量,利用计算得到的所述当前图像块中的各像素点的运动矢量确定所述当前图像块中的各像素点的预测像素值;或利用仿射运动模型和所述合并运动信息单元集i计算得到所述当前图像块中的各像素块的运动矢量,利用计算得到的所述当前图像块中的各像素块的运动矢量确定所述当前图像块中的各像素块的各像素点的预测像素值。
测试发现,若先利用仿射运动模型和所述合并运动信息单元集i计算得到所述当前图像块中的各像素块的运动矢量,而后再利用计算得到的所述当前图像块中的各像素块的运动矢量确定所述当前图像块中的各像素块的各像素点的预测像素值,由于计算运动矢量时以当前图像块中的像素块为粒度,这样有利于较大的降低计算复杂度。
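The saving from working at pixel-block granularity is easy to see in a sketch: one motion vector is computed per sub-block instead of per pixel. The function name, the 4×4 default sub-block size, and the choice of each sub-block's upper-left pixel as its representative coordinate are assumptions for illustration.

```python
def block_motion_vectors(width, height, a, b, c, d, block=4):
    """Compute one affine motion vector per block x block sub-block.

    Returns a dict mapping each sub-block's upper-left coordinate (x, y)
    to (vx, vy) with vx = a*x + b*y + c and vy = -b*x + a*y + d.
    """
    mvs = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            mvs[(bx, by)] = (a * bx + b * by + c, -b * bx + a * by + d)
    return mvs


if __name__ == "__main__":
    mvs = block_motion_vectors(8, 8, a=0.5, b=0.25, c=0.0, d=0.0)
    # 4 motion vectors for an 8x8 block, instead of 64 per-pixel evaluations.
    print(len(mvs), mvs[(4, 4)])
```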
可选的,在本发明的一些可能的实施方式之中,利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测可包括:对所述合并运动信息单元集i中的运动矢量进行运动估计处理,以得到运动估计处理后的合并运动信息单元集i,利用仿射运动模型和运动估计处理后的合并运动信息单元集i对所述当前图像块进行像素值预测。
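Optionally, in some possible implementations of the present invention, performing pixel value prediction on the current image block by using the affine motion model and the merged motion information unit set i includes: obtaining the motion vector of any pixel sample in the current image block by using the ratio of the difference between the horizontal components of the motion vectors of the two motion information units in the merged motion information unit set i to the length or width of the current image block, and the ratio of the difference between the vertical components of the motion vectors of the two motion information units in the merged motion information unit set i to the length or width of the current image block.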
或者,所述利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测,可包括:利用所述2个像素样本的运动矢量水平分量之间的差值与所述当前图像块的长或宽的比值,以及所述2个像素样本的运动矢量竖直分量之间的差值与所述当前图像块的长或宽的比值,得到所述当前图像块中的任意像素样本的运动矢量,其中,所述2个像素样本的运动矢量基于所 述合并运动信息单元集i中的两个运动信息单元的运动矢量得到(例如,所述2个像素样本的运动矢量为所述合并运动信息单元集i中的两个运动信息单元的运动矢量,或者,基于所述合并运动信息单元集i中的两个运动信息单元的运动矢量和预测残差得到所述2个像素样本的运动矢量)。
可选的,在本发明的一些可能的实施方式之中,所述2个像素样本的运动矢量水平分量的水平坐标系数和运动矢量竖直分量的竖直坐标系数相等,且所述2个像素样本的运动矢量水平分量的竖直坐标系数和运动矢量竖直分量的水平坐标系数相反。
可选的,在本发明的一些可能的实施方式之中,
所述仿射运动模型例如可为如下形式的仿射运动模型:
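$$\begin{cases} vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\ vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0 \end{cases}$$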
其中,所述2个像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽。
其中,
Figure PCTCN2016083203-appb-000028
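where (vx2, vy2) is the motion vector of another pixel sample of the current image block, different from the 2 pixel samples above. For example, if the 2 pixel samples are the upper-left and upper-right pixel samples of the current image block, (vx2, vy2) may be the motion vector of the lower-left pixel sample or the center pixel sample of the current image block. As another example, if the 2 pixel samples are the upper-left and lower-left pixel samples of the current image block, (vx2, vy2) may be the motion vector of the upper-right pixel sample or the center pixel sample of the current image block.
When a pixel sample is a pixel block that includes multiple pixels, the coordinates of the pixel sample may be the coordinates of any pixel in the pixel sample, or the coordinates of a designated pixel in the pixel sample (for example, the upper-left, lower-left, upper-right, or center pixel of the pixel sample).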
可以理解的是,对于当前视频帧中的每个图像块,均可以按照与当前图像块对应的像素值预测方式相类似的方式进行像素值预测,当然,当前视频帧中的某些图像块也可能按照与当前图像块对应的像素值预测方式不同的方式进行像素值预测。
为便于更好的理解和实施本发明实施例的上述方案,下面结合更具体的应用场景进行进一步说明。
请参见图2-a,图2-a为本发明的另一个实施例提供的另一种图像预测方法的流程示意图。本实施例中主要以在视频编码装置中实施图像预测方法为例进行描述。其中,图2-a举例所示,本发明的第二个实施例提供的另一种图像预测方法可包括:
S201、视频编码装置确定当前图像块中的2个像素样本。
其中,本实施例中主要以所述2个像素样本包括所述当前图像块的左上像素样本、右上像素样本、左下像素样本和中心像素样本a1中的其中2个像素样本为例。例如,所述2个像素样本包括所述当前图像块的左上像素样本和右上像素样本。其中,所述2个像素样本为所述当前图像块的其他像素样本的场景可以此类推。
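The upper-left pixel sample of the current image block may be the upper-left vertex of the current image block, or a pixel block in the current image block that contains the upper-left vertex; the lower-left pixel sample of the current image block is the lower-left vertex of the current image block, or a pixel block in the current image block that contains the lower-left vertex; the upper-right pixel sample of the current image block is the upper-right vertex of the current image block, or a pixel block in the current image block that contains the upper-right vertex; the center pixel sample a1 of the current image block is the center pixel of the current image block, or a pixel block in the current image block that contains the center pixel.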
若像素样本为像素块,则该像素块的大小例如为2*2,1*2、4*2、4*4或者其他大小。
S202、视频编码装置确定出所述2个像素样本之中的每个像素样本所对应的候选运动信息单元集。
其中,所述每个像素样本所对应的候选运动信息单元集包括候选的至少一 个运动信息单元。
其中,本发明各实施例中提及的像素样本可以是像素点或包括至少两个像素点的像素块。
其中,例如图2-b和图2-c所示,所述当前图像块的左上像素样本对应的候选运动信息单元集S1可包括x1个像素样本的运动信息单元。其中,所述x1个像素样本包括:与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本LT位置相同的像素样本Col-LT、所述当前图像块的左边的空域相邻图像块C、所述当前图像块的左上的空域相邻图像块A、所述当前图像块的上边的空域相邻图像块B中的至少一个。例如可先获取所述当前图像块的左边的空域相邻图像块C的运动信息单元、所述当前图像块的左上的空域相邻图像块A的运动信息单元和所述当前图像块的上边的空域相邻图像块B的运动信息单元,将获取到的所述当前图像块的左边的空域相邻图像块C的运动信息单元、所述当前图像块的左上的空域相邻图像块A的运动信息单元和所述当前图像块的上边的空域相邻图像块B的运动信息单元添加到所述当前图像块的左上像素样本对应的候选运动信息单元集中,若所述当前图像块的左边的空域相邻图像块C的运动信息单元、所述当前图像块的左上的空域相邻图像块A的运动信息单元和所述当前图像块的上边的空域相邻图像块B的运动信息单元中的部分或全部运动信息单元相同,则进一步对所述候选运动信息单元集S1进行去重处理(此时去重处理后的所述候选运动信息单元集S1中的运动信息单元的数量可能是1或2),若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本LT位置相同的像素样本Col-LT的运动信息单元,与去重处理后的所述候选运动信息单元集S1中的其中一个运动信息单元相同,则可向所述候选运动信息单元集S1中加入零运动信息单元,直到候选运动信息单元集S1中的运动信息单元数量等于3。此外,若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本LT位置相同的像素样本Col-LT的运动信息单元,不同于去重处理后的所述候选运动信息单元集S1中的任意一个运动信息单元,则将与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素 样本LT位置相同的像素样本Col-LT的运动信息单元添加到去重处理后的所述候选运动信息单元集S1中,若此时所述候选运动信息单元集S1中的运动信息单元数量仍然少于3个,则可以向所述候选运动信息单元集S1中加入零运动信息单元,直到所述候选运动信息单元集S1中的运动信息单元数量等于3。
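If the video frame to which the current image block belongs is a forward-predicted frame, a zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is forward but may omit a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward-predicted frame, a zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is backward but may omit a zero motion vector whose prediction direction is forward. If the video frame to which the current image block belongs is a bidirectionally predicted frame, a zero motion information unit added to the candidate motion information unit set S1 includes both a forward zero motion vector and a backward zero motion vector. The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S1 may differ, and may be, for example, 0, 1, 2, 3, or another value.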
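The construction of a candidate set (collect spatial neighbours, remove duplicates, consider the co-located temporal unit, pad with zero motion information units) can be sketched as follows. This is a simplified illustration under stated assumptions: each unit is reduced to a `(motion_vector, reference_index)` tuple, padding uses zero units with increasing reference indexes, and the temporal unit is added only while the set is below its target size, a case the source does not spell out. The function name is hypothetical.

```python
def build_candidate_set(spatial_units, temporal_unit, target_size):
    """Build a candidate motion information unit set of exactly target_size.

    spatial_units: units of the spatially adjacent image blocks, in order.
    temporal_unit: unit of the co-located sample in the adjacent frame, or None.
    Each unit is a ((vx, vy), ref_idx) tuple.
    """
    candidates = []
    for unit in spatial_units:                 # de-duplicate spatial units
        if unit not in candidates:
            candidates.append(unit)
    if (temporal_unit is not None
            and temporal_unit not in candidates
            and len(candidates) < target_size):
        candidates.append(temporal_unit)       # temporal unit only if it is new
    ref_idx = 0
    while len(candidates) < target_size:       # pad with zero motion units
        zero_unit = ((0, 0), ref_idx)
        if zero_unit not in candidates:
            candidates.append(zero_unit)
        ref_idx += 1
    return candidates


if __name__ == "__main__":
    spatial = [((1, 2), 0), ((1, 2), 0), ((3, 4), 0)]   # blocks C, A, B (hypothetical MVs)
    print(build_candidate_set(spatial, ((1, 2), 0), 3))
```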
类似的,例如图2-b和图2-c所示,所述当前图像块的右上像素样本对应的候选运动信息单元集S2可以包括x2个图像块的运动信息单元。其中,所述x2个图像块可以包括:与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT、所述当前图像块的右上的空域相邻图像块E、所述当前图像块的上边的空域相邻图像块D之中的至少一个。例如,可以先获取所述当前图像块的右上的空域相邻图像块E的运动信息单元和所述当前图像块的上边的空域相邻图像块D的运动信息单元,将获取的所述当前图像块的右上的空域相邻图像块E的运动信息单元和所述当前图像块的上边的空域相邻图像块D的运动信息单元添加到所述当前图像块的右上像素样本对应的候选运动信息单元集S2中,若所述当前图像块的右上的空域相邻图像块E的运动信息单元和所述当前图像块的上边的空域相邻图像块D的运动信息单元相同,则可对所述候选运动信息单元集S2进行去重处理(此时去重处理后的所述候选运动信息单元集S2中的运动信息单元的数量是1),若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT的运动信息单元,与去重 处理后的所述候选运动信息单元集S2中的其中一个运动信息单元相同,可进一步向所述候选运动信息单元集S2中加入零运动信息单元,直到所述候选运动信息单元集S2中运动信息单元数量等于2。此外,若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT的运动信息单元,不同于去重处理之后的所述候选运动信息单元集S2中的任意一个运动信息单元,则可以将与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT的运动信息单元添加到去重处理后的所述候选运动信息单元集S2中,若此时所述候选运动信息单元集S2之中的运动信息单元数量仍然少于2个,则进一步向所述候选运动信息单元集S2中加入零运动信息单元,直到所述候选运动信息单元集S2中运动信息单元的数量等于2。
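If the video frame to which the current image block belongs is a forward-predicted frame, a zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is forward but may omit a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward-predicted frame, a zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is backward but may omit a zero motion vector whose prediction direction is forward. If the video frame to which the current image block belongs is a bidirectionally predicted frame, a zero motion information unit added to the candidate motion information unit set S2 includes both a forward zero motion vector and a backward zero motion vector. The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S2 may differ, and may be, for example, 0, 1, 2, 3, or another value.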
类似的,例如图2-b和图2-c所示,所述当前图像块的左下像素样本对应的候选运动信息单元集S3可以包括x3个图像块的运动信息单元。其中,所述x3个图像块可包括:与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB、所述当前图像块的左下的空域相邻图像块G、所述当前图像块的左边的空域相邻图像块F中的至少一个。例如先获取所述当前图像块的左下的空域相邻图像块G的运动信息单元和所述当前图像块的左边的空域相邻图像块F的运动信息单元,可将获取的所述当前图像块的左下的空域相邻图像块G的运动信息单元和所述当 前图像块的左边的空域相邻图像块F的运动信息单元添加到所述当前图像块的左下像素样本对应的候选运动信息单元集S3中,若所述当前图像块的左下的空域相邻图像块G的运动信息单元和所述当前图像块的左边的空域相邻图像块F的运动信息单元相同,则对所述候选运动信息单元集S3进行去重处理(此时去重处理后的所述候选运动信息单元集S3中的运动信息单元的数量是1),若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB的运动信息单元,与去重处理后的所述候选运动信息单元集S3中的其中一个运动信息单元相同,则可进一步向所述候选运动信息单元集S3中加入零运动信息单元,直到所述候选运动信息单元集S3中运动信息单元数量等于2。此外,若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB的运动信息单元,不同于去重处理后的所述候选运动信息单元集S3中的任意一个运动信息单元,则可将与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB的运动信息单元添加到去重处理后的候选运动信息单元集S3中,若此时所述候选运动信息单元集S3之中的运动信息单元数量仍然少于2个,则进一步向所述候选运动信息单元集S3中加入零运动信息单元,直到所述候选运动信息单元集S3中运动信息单元数量等于2。
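If the video frame to which the current image block belongs is a forward-predicted frame, a zero motion information unit added to the candidate motion information unit set S3 includes a zero motion vector whose prediction direction is forward but may omit a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward-predicted frame, a zero motion information unit added to the candidate motion information unit set S3 includes a zero motion vector whose prediction direction is backward but may omit a zero motion vector whose prediction direction is forward. If the video frame to which the current image block belongs is a bidirectionally predicted frame, a zero motion information unit added to the candidate motion information unit set S3 includes both a forward zero motion vector and a backward zero motion vector. The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S3 may differ, and may be, for example, 0, 1, 2, 3, or another value.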
其中,两个运动信息单元不相同,可指该两个运动信息单元包括的运动矢 量不同,或该两个运动信息单元所包括的运动矢量对应的预测方向不同,或者该两个运动信息单元所包括的运动矢量对应的参考帧索引不同。其中,两个运动信息单元相同,可指该两个运动信息单元所包括的运动矢量相同,且该两个运动信息单元所包括的运动矢量对应的预测方向相同,且该两个运动信息单元所包括的运动矢量对应的参考帧索引相同。
可以理解,对于存在更多像素样本的场景,可以按照类似方式得到相应像素样本的候选运动信息单元集。
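For example, as shown in Figure 2-d, in the example of Figure 2-d the 2 pixel samples may include two of the following pixel samples of the current image block: the upper-left pixel sample, the upper-right pixel sample, the lower-left pixel sample, and the center pixel sample a1. The upper-left pixel sample of the current image block is the upper-left vertex of the current image block, or a pixel block in the current image block that contains the upper-left vertex; the lower-left pixel sample of the current image block is the lower-left vertex of the current image block, or a pixel block in the current image block that contains the lower-left vertex; the upper-right pixel sample of the current image block is the upper-right vertex of the current image block, or a pixel block in the current image block that contains the upper-right vertex; the center pixel sample a1 of the current image block is the center pixel of the current image block, or a pixel block in the current image block that contains the center pixel.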
S203、视频编码装置基于所述2个像素样本之中的每个像素样本所对应的候选运动信息单元集确定N个候选合并运动信息单元集。其中,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集所包含的每个运动信息单元,分别选自所述2个像素样本中的每个像素样本所对应的候选运动信息单元集中的符合约束条件的至少部分运动信息单元。所述N个候选合并运动信息单元集互不相同,所述N个候选合并运动信息单元集中的每个候选合并运动信息单元集包括2个运动信息单元。
可以理解的是,假设基于候选运动信息单元集S1(假设包括3个运动信息单元)和所述候选运动信息单元集S2(假设包括2个运动信息单元)来确定候选合并运动信息单元集,则理论上可确定出3*2=6个初始的候选合并运动信息单元集,然而为了提高可用性,例如可以利用第一条件、第二条件、第三条件、第四条件和第五条件中的至少一个条件来从这6个初始的候选合并运动 信息单元集中筛选出N个候选合并运动信息单元集。其中,如果候选运动信息单元集S1和所述候选运动信息单元集S2所包括的运动信息单元的数量不限于上述举例,那么,初始的候选合并运动信息单元集的数量不一定是6。
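The 3*2=6 count is the size of the Cartesian product of the two candidate sets, which is then filtered by the chosen conditions. The sketch below shows this with two illustrative predicates standing in for the third condition (same reference frame index) and the first condition (non-translational motion); the unit representation `((vx, vy), ref_idx)` and all function names are assumptions.

```python
from itertools import product


def initial_merged_sets(s1, s2):
    """Every pairing of one unit from S1 with one unit from S2."""
    return list(product(s1, s2))


def same_reference_index(unit_a, unit_b):
    """Stand-in for the third condition."""
    return unit_a[1] == unit_b[1]


def non_translational(unit_a, unit_b):
    """Stand-in for the first condition: the two motion vectors differ."""
    return unit_a[0] != unit_b[0]


def filter_merged_sets(merged_sets, conditions):
    """Keep the pairs that satisfy every supplied condition."""
    return [pair for pair in merged_sets
            if all(cond(*pair) for cond in conditions)]


if __name__ == "__main__":
    s1 = [((1, 0), 0), ((2, 0), 0), ((0, 0), 1)]   # hypothetical S1 with 3 units
    s2 = [((1, 0), 0), ((3, 1), 0)]                # hypothetical S2 with 2 units
    initial = initial_merged_sets(s1, s2)
    kept = filter_merged_sets(initial, [same_reference_index, non_translational])
    print(len(initial), len(kept))
```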
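For the specific restrictions of the first, second, third, fourth, and fifth conditions, refer to the examples in the foregoing embodiment; details are not repeated here. Of course, the N candidate merged motion information unit sets may also satisfy other conditions not listed.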
在具体实现过程中,例如可先利用第一条件、第二条件和第三条件中的至少一个条件对初始的候选合并运动信息单元集进行筛选,从初始的候选合并运动信息单元集中筛选出N01个候选合并运动信息单元集,而后对N01个候选合并运动信息单元集进行缩放处理,而后再利用第四条件和第五条件中的至少一个条件从进行缩放处理的N01个候选合并运动信息单元集中筛选出N个候选合并运动信息单元集。当然,第四条件和第五条件也可能不参考,而是直接利用第一条件、第二条件和第三条件中的至少一个条件对初始的候选合并运动信息单元集进行筛选,从初始的候选合并运动信息单元集中筛选出N个候选合并运动信息单元集。
可以理解的是,视频编解码中运动矢量反映的是一个物体在一个方向(预测方向)上相对于同一时刻(同一时刻对应同一参考帧)偏移的距离。因此在不同像素样本的运动信息单元对应不同预测方向和/或对应不同参考帧索引的情况下,可能无法直接得到当前图像块的每个像素点/像素块相对于一参考帧的运动偏移。而当这些像素样本对应相同预测方向和对应相同参考帧索引的情况下,可利用这些合并运动矢量组合得到该图像块中每个像素点/像素块的运动矢量。
因此,在候选合并运动信息单元集中的不同像素样本的运动信息单元对应不同预测方向和/或对应不同参考帧索引的情况下,可以对候选合并运动信息单元集进行缩放处理。其中,对候选合并运动信息单元集进行缩放处理可能涉及到对该候选合并运动信息单元集中的一个或多个运动信息单元中的运动矢量进行修改、添加和/或删除等。
例如,在本发明一些可能实施方式之中,所述利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测,可包括:当所述合并 运动信息单元集i中的预测方向为第一预测方向的运动矢量对应的参考帧索引不同于所述当前图像块的参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为第一预测方向的运动矢量被缩放到所述当前图像块的参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测,所述第一预测方向为前向或后向;
或者,所述利用仿射运动模型和所述合并运动信息单元集i对所述当前图像块进行像素值预测,可以包括:当所述合并运动信息单元集i中的预测方向为前向的运动矢量对应的参考帧索引不同于所述当前图像块的前向参考帧索引,并且所述合并运动信息单元集i中的预测方向为后向的运动矢量对应的参考帧索引不同于所述当前图像块的后向参考帧索引的情况下,对所述合并运动信息单元集i进行缩放处理,以使得所述合并运动信息单元集i中的预测方向为前向的运动矢量被缩放到所述当前图像块的前向参考帧且使得所述合并运动信息单元集i中的预测方向为后向的运动矢量被缩放到所述当前图像块的后向参考帧,利用仿射运动模型和进行缩放处理后的合并运动信息单元集i对所述当前图像块进行像素值预测。
S204、视频编码装置从N个候选合并运动信息单元集之中确定出包含2个运动信息单元的合并运动信息单元集i。
可选的,在本发明一些可能的实施方式中,视频编码装置还可将所述合并运动信息单元集i的标识写入视频码流。相应的,视频解码装置基于从视频码流中获得的合并运动信息单元集i的标识,从N个候选合并运动信息单元集之中确定包含2个运动信息单元的合并运动信息单元集i。
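Optionally, in some possible implementations of the present invention, the video encoding apparatus determining, from the N candidate merged motion information unit sets, the merged motion information unit set i that contains 2 motion information units may include: determining, based on distortion or rate-distortion cost, the merged motion information unit set i that contains 2 motion information units from the N candidate merged motion information unit sets.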
可选的,合并运动信息单元集i对应的率失真代价,小于或等于上述N个候选合并运动信息单元集中除合并运动信息单元集i之外的任意一个合并运动信息单元集对应的率失真代价。
可选的,合并运动信息单元集i对应的失真,小于或者等于上述N个候选合并运动信息单元集中除合并运动信息单元集i之外的任意一个合并运动信息单元集对应的失真。
其中,上述N个候选合并运动信息单元集之中的某个候选合并运动信息单元集(例如上述N个候选合并运动信息单元集中的合并运动信息单元集i)对应的率失真代价例如可以为利用该某个候选合并运动信息单元集(例如合并运动信息单元集i)对图像块(例如当前图像块)进行像素值预测而得到的该图像块的预测像素值所对应的率失真代价。
其中,上述N个候选合并运动信息单元集之中的某个候选合并运动信息单元集(例如上述N个候选合并运动信息单元集中的合并运动信息单元集i)对应的失真,例如可为图像块(如当前图像块)的原始像素值与利用该某个候选合并运动信息单元集(如合并运动信息单元集i)对该图像块进行像素值预测而得到的该图像块的预测像素值之间的失真(即,图像块的原始像素值与预测像素值之间的失真)。
在本发明一些可能的实施方式中,图像块(如当前图像块)的原始像素值与利用该某个候选合并运动信息单元集(例如合并运动信息单元集i)对该图像块进行像素值预测而得到的该图像块的预测像素值之间的失真,具体例如可以为该图像块(例如当前图像块)的原始像素值与利用该某个候选合并运动信息单元集(例如合并运动信息单元集i)对该图像块进行像素值预测而得到的该图像块的预测像素值之间的平方误差和(SSD)或绝对误差和(SAD)或误差和或能够衡量失真的其他失真参量。
进一步的,为进一步降低运算复杂度,当上述N大于n1,可从N个候选合并运动信息单元集中筛选出n1个候选合并运动信息单元集,基于失真或率失真代价从n1个候选合并运动信息单元集中确定出包含2个运动信息单元的合并运动信息单元集i。上述n1个候选合并运动信息单元集中的任意一个候选合并运动信息单元集对应的D(V)小于或等于上述N个候选合并运动信息单元集中的除n1个候选合并运动信息单元集之外的任意一个候选合并运动信息单元集对应的D(V),其中,n1例如等于3、4、5、6或其他值。
进一步的,可将上述n1个候选合并运动信息单元集或n1个候选合并运动信息单元集的标识加入候选合并运动信息单元集队列,其中,若上述N小于或者等于n1,则可将上述N个候选合并运动信息单元集或N个候选合并运动信息单元集的标识加入候选合并运动信息单元集队列。其中,候选合并运动信息单元集队列中的各候选合并运动信息单元集例如可按照D(V)大小进行升序或降序排列。
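The Euclidean distance parameter D(V) of any candidate merged motion information unit set among the N candidate merged motion information unit sets (for example, the merged motion information unit set i) may, for example, be calculated as follows:

$$D(V) = \left|(v_{1,x} - v_{0,x}) \times h - (v_{2,y} - v_{0,y}) \times w\right| + \left|(v_{1,y} - v_{0,y}) \times h + (v_{2,x} - v_{0,x}) \times w\right|$$

where $v_{p,x}$ denotes the horizontal component of the motion vector $\vec{v}_p$, and $v_{p,y}$ denotes the vertical component of the motion vector $\vec{v}_p$; $\vec{v}_1$ and $\vec{v}_0$ are the 2 motion vectors of the two pixel samples included in a candidate merged motion information unit set among the N candidate merged motion information unit sets, and the motion vector $\vec{v}_2$ is the motion vector of another pixel sample of the current image block, different from the two pixel samples. For example, as shown in Figure 2-e, $\vec{v}_0$ and $\vec{v}_1$ are the motion vectors of the upper-left pixel sample and the upper-right pixel sample of the current image block, and $\vec{v}_2$ is the motion vector of the lower-left pixel sample of the current image block; of course, $\vec{v}_2$ may also be the motion vector of the center pixel sample or another pixel sample of the current image block.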
可选的,|v1,x-v0,x|≤w/2或者|v1,y-v0,y|≤h/2或者|v2,x-v0,x|≤w/2或者|v2,y-v0,y|≤h/2。
进一步的,根据上述N个候选合并运动信息单元集的D(V)值的大小按升序或降序排序,可以得到候选合并运动信息单元集队列。其中,候选合并运动信息单元集队列中的每个合并运动信息单元集互不相同,可用索引号指示候选合并运动信息单元集队列中的某个合并运动信息单元集。
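The D(V) ranking can be sketched as below. D(V) is zero when the third motion vector is exactly what the 4-parameter model predicts from the first two, so a small D(V) marks a candidate that is consistent with rotation and zoom. The function names, and representing each candidate as a `(v0, v1)` pair ranked against one shared `v2`, are assumptions made for this example.

```python
def d_of_v(v0, v1, v2, w, h):
    """D(V) for motion vectors v0, v1 (the two pixel samples) and v2 (a third
    pixel sample) of a w x h block; each vector is an (x, y) tuple."""
    return (abs((v1[0] - v0[0]) * h - (v2[1] - v0[1]) * w)
            + abs((v1[1] - v0[1]) * h + (v2[0] - v0[0]) * w))


def rank_candidates(candidates, v2, w, h, n1):
    """Sort (v0, v1) candidates by ascending D(V) and keep the first n1."""
    ranked = sorted(candidates,
                    key=lambda pair: d_of_v(pair[0], pair[1], v2, w, h))
    return ranked[:n1]


if __name__ == "__main__":
    candidates = [((0, 0), (4, 4)), ((0, 0), (2, 1)), ((0, 0), (0, 0))]
    print(rank_candidates(candidates, v2=(-1, 2), w=8, h=8, n1=2))
```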
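S205. The video encoding apparatus performs motion vector prediction on the current image block by using the affine motion model and the merged motion information unit set i.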
其中,假设当前图像块的大小为w×h,所述w等于或不等于h。
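Assume that the coordinates of the two pixel samples are (0,0) and (w,0); here, the coordinates of the upper-left pixel of each pixel sample are used in the calculation as an example. See Figure 2-e, which shows the coordinates of the four vertices of the current image block. Figure 2-f and Figure 2-g show schematic diagrams of affine motion.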
所述2个像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),将2个像素样本的坐标及运动矢量代入如下举例的仿射运动模型,便可计算出当前图像块x内的任意像素点的运动矢量。
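$$\begin{cases} vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\ vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0 \end{cases}$$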
其中,所述2个像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),其中,所述vx和vy分别是当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量(vx)和运动矢量竖直分量(vy),所述w为所述当前图像块的长或宽。
进一步,视频编码装置可基于计算出的所述当前图像块的各像素点或各像素块的运动矢量对所述当前图像块进行像素值预测。视频编码装置可利用当前图像块的原始像素值和对当前图像块进行像素值预测而得到的当前图像块预测像素值得到当前图像块的预测残差。视频编码装置可将当前图像块的预测残差写入视频码流。
可以看出,本实施例的技术方案中,视频编码装置利用仿射运动模型和合并运动信息单元集i对当前图像块进行像素值预测,合并运动信息单元集i中的每个运动信息单元分别选自2个像素样本中的每个像素样本所对应的候选运动信息单元集中的至少部分运动信息单元,由于合并运动信息单元集i选择范围变得相对较小,摒弃了传统技术采用的在多个像素样本的全部可能候选运动信息单元集合中通过大量计算才筛选出多个像素样本的一种运动信息单元的机制,有利于提高编码效率,并且也有利于降低基于仿射运动模型进行图像预测的计算复杂度,进而使得仿射运动模型引入视频编码标准变得可能。并且由于引入了仿射运动模型,有利于更准确描述物体运动,故而有利于提高预测准确度。由于所参考的像素样本的数量可为2个,这样有利于进一步降低引入 仿射运动模型之后,基于仿射运动模型进行图像预测的计算复杂度,并且也有利于减少编码端传递仿射参数信息或者运动矢量残差的个数等。
下面举例公式1所示的仿射运动模型的一种推导过程。其中,例如可利用旋转运动模型来推导仿射运动模型。
其中,旋转运动例如图2-h或图2-i举例所示。
其中,旋转运动模型如公式(2)所示。其中(x′,y′)为坐标为(x,y)的像素点在参考帧中对应的坐标,其中,θ为旋转角度,(a0,a1)为平移分量。若已知变换系数,即可求得像素点(x,y)的运动矢量(vx,vy)。
Figure PCTCN2016083203-appb-000039
其中,采用的旋转矩阵为:
Figure PCTCN2016083203-appb-000040
若在旋转的基础上再进行一次系数为ρ的缩放变换,同时,为了避免旋转运动中的三角运算,得到如下简化的仿射运动矩阵。
Figure PCTCN2016083203-appb-000041
这样,有利于降低计算的复杂度,可以简化每个像素点的运动矢量的计算过程,而且该模型可以像一般的仿射运动模型一样应用于旋转和缩放等复杂运动场景。其中,简化的仿射运动模型描述可如公式3。其中,和一般仿射运动模型相比简化的仿射运动模型可只需要4个参数表示。
Figure PCTCN2016083203-appb-000042
对于尺寸为w×h的图像块(如CUR),将其右边及下边界各扩展一行并求得坐标点(0,0),(w,0)的顶点的运动矢量(vx0,vy0),(vx1,vy1)。以这两个顶点为像素样本(当然,也可以以其它点作为参考的像素样本,如中心像素样本等等),将它们的坐标及运动矢量代入公式(3),可以推导出公式1,
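$$\begin{cases} vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\ vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0 \end{cases}$$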
其中,
Figure PCTCN2016083203-appb-000044
其中,所述2个像素样本的运动矢量分别为(vx0,vy0)和(vx1,vy1),所述vx为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量水平分量,所述vy为所述当前图像块中的坐标为(x,y)的像素样本的运动矢量竖直分量,所述w为所述当前图像块的长或宽。
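The substitution step can be written out explicitly. The sketch below uses the form of the model stated earlier in this document, $vx = ax + by + c$ and $vy = -bx + ay + d$, and assumes the two pixel samples sit at $(0,0)$ and $(w,0)$. It is offered as a worked check and not as a reproduction of the original equation images.

```latex
\begin{aligned}
(x,y)=(0,0):&\quad vx_0 = c,\qquad vy_0 = d\\
(x,y)=(w,0):&\quad vx_1 = aw + c,\qquad vy_1 = -bw + d\\
\Rightarrow&\quad a = \frac{vx_1 - vx_0}{w},\qquad b = -\frac{vy_1 - vy_0}{w}\\
\Rightarrow&\quad vx = \frac{vx_1 - vx_0}{w}\,x - \frac{vy_1 - vy_0}{w}\,y + vx_0,\\
&\quad vy = \frac{vy_1 - vy_0}{w}\,x + \frac{vx_1 - vx_0}{w}\,y + vy_0
\end{aligned}
```

Two motion vectors supply four scalar equations, which is exactly enough to fix the four coefficients a, b, c, and d.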
可以理解,从上面的推导过程可以看出公式1具有较强的可用性,实践过程发现,由于所参考的像素样本的数量可为2个,这样有利于进一步降低引入仿射运动模型之后,基于仿射运动模型进行图像预测的计算复杂度和减少编码传递仿射参数信息或运动矢量差值的个数。
请参见图3,图3为本发明的另一个实施例提供的另一种图像预测方法的流程示意图。本实施例中主要以在视频解码装置中实施图像预测方法为例进行描述。其中,图3举例所示,本发明的第三个实施例提供的另一种图像预测方法可包括:
S301、视频解码装置确定当前图像块中的2个像素样本。
其中,本实施例中主要以所述2个像素样本包括所述当前图像块的左上像素样本、右上像素样本、左下像素样本和中心像素样本a1中的其中2个像素样本为例。例如,所述2个像素样本包括所述当前图像块的左上像素样本和右上像素样本。其中,所述2个像素样本为所述当前图像块的其他像素样本的场景可以此类推。
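The upper-left pixel sample of the current image block may be the upper-left vertex of the current image block, or a pixel block in the current image block that contains the upper-left vertex; the lower-left pixel sample of the current image block is the lower-left vertex of the current image block, or a pixel block in the current image block that contains the lower-left vertex; the upper-right pixel sample of the current image block is the upper-right vertex of the current image block, or a pixel block in the current image block that contains the upper-right vertex; the center pixel sample a1 of the current image block is the center pixel of the current image block, or a pixel block in the current image block that contains the center pixel.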
若像素样本为像素块,则该像素块的大小例如为2*2,1*2、4*2、4*4或者其他大小。
S302、视频解码装置确定出所述2个像素样本之中的每个像素样本所对应的候选运动信息单元集。
其中,所述每个像素样本所对应的候选运动信息单元集包括候选的至少一个运动信息单元。
其中,本发明各实施例中提及的像素样本可以是像素点或包括至少两个像素点的像素块。
其中,例如图2-b和图2-c所示,所述当前图像块的左上像素样本对应的候选运动信息单元集S1可包括x1个像素样本的运动信息单元。其中,所述x1个像素样本包括:与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本LT位置相同的像素样本Col-LT、所述当前图像块的左边的空域相邻图像块C、所述当前图像块的左上的空域相邻图像块A、所述当前图像块的上边的空域相邻图像块B中的至少一个。例如可先获取所述当前图像块的左边的空域相邻图像块C的运动信息单元、所述当前图像块的左上的空域相邻图像块A的运动信息单元和所述当前图像块的上边的空域相邻图像块B的运动信息单元,将获取到的所述当前图像块的左边的空域相邻图像块C的运动信息单元、所述当前图像块的左上的空域相邻图像块A的运动信息单元和所述当前图像块的上边的空域相邻图像块B的运动信息单元添加到所述当前图像块的左上像素样本对应的候选运动信息单元集中,若所述当前图像块的左边的空域相邻图像块C的运动信息单元、所述当前图像块的左上的空域相邻图像块A的运动信息单元和所述当前图像块的上边的空域相邻图像块B的运动信息单元中的部分或全部运动信息单元相同,则进一步对所述候选运动信息单元集S1进行去重处理(此时去重处理后的所述候选运动信息单元集S1中的运动信息单元的数量可能是1或2),若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本LT位置相同的像素样本 Col-LT的运动信息单元,与去重处理后的所述候选运动信息单元集S1中的其中一个运动信息单元相同,则可向所述候选运动信息单元集S1中加入零运动信息单元,直到候选运动信息单元集S1中的运动信息单元数量等于3。此外,若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本LT位置相同的像素样本Col-LT的运动信息单元,不同于去重处理后的所述候选运动信息单元集S1中的任意一个运动信息单元,则将与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左上像素样本LT位置相同的像素样本Col-LT的运动信息单元添加到去重处理后的所述候选运动信息单元集S1中,若此时所述候选运动信息单元集S1中的运动信息单元数量仍然少于3个,则可以向所述候选运动信息单元集S1中加入零运动信息单元,直到所述候选运动信息单元集S1中的运动信息单元数量等于3。
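If the video frame to which the current image block belongs is a forward-predicted frame, a zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is forward but may omit a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward-predicted frame, a zero motion information unit added to the candidate motion information unit set S1 includes a zero motion vector whose prediction direction is backward but may omit a zero motion vector whose prediction direction is forward. If the video frame to which the current image block belongs is a bidirectionally predicted frame, a zero motion information unit added to the candidate motion information unit set S1 includes both a forward zero motion vector and a backward zero motion vector. The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S1 may differ, and may be, for example, 0, 1, 2, 3, or another value.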
类似的,例如图2-b和图2-c所示,所述当前图像块的右上像素样本对应的候选运动信息单元集S2可以包括x2个图像块的运动信息单元。其中,所述x2个图像块可以包括:与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT、所述当前图像块的右上的空域相邻图像块E、所述当前图像块的上边的空域相邻图像块D之中的至少一个。例如,可以先获取所述当前图像块的右上的空域相邻图像块E的运动信息单元和所述当前图像块的上边的空域相邻图像块D的运动信息单元,将获取的所述当前图像块的右上的空域相邻图像块E的运动信息单元和 所述当前图像块的上边的空域相邻图像块D的运动信息单元添加到所述当前图像块的右上像素样本对应的候选运动信息单元集S2中,若所述当前图像块的右上的空域相邻图像块E的运动信息单元和所述当前图像块的上边的空域相邻图像块D的运动信息单元相同,则可对所述候选运动信息单元集S2进行去重处理(此时去重处理后的所述候选运动信息单元集S2中的运动信息单元的数量是1),若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT的运动信息单元,与去重处理后的所述候选运动信息单元集S2中的其中一个运动信息单元相同,可进一步向所述候选运动信息单元集S2中加入零运动信息单元,直到所述候选运动信息单元集S2中运动信息单元数量等于2。此外,若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT的运动信息单元,不同于去重处理之后的所述候选运动信息单元集S2中的任意一个运动信息单元,则可以将与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的右上像素样本RT位置相同的像素样本Col-RT的运动信息单元添加到去重处理后的所述候选运动信息单元集S2中,若此时所述候选运动信息单元集S2之中的运动信息单元数量仍然少于2个,则进一步向所述候选运动信息单元集S2中加入零运动信息单元,直到所述候选运动信息单元集S2中运动信息单元的数量等于2。
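If the video frame to which the current image block belongs is a forward-predicted frame, a zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is forward but may omit a zero motion vector whose prediction direction is backward. If the video frame to which the current image block belongs is a backward-predicted frame, a zero motion information unit added to the candidate motion information unit set S2 includes a zero motion vector whose prediction direction is backward but may omit a zero motion vector whose prediction direction is forward. If the video frame to which the current image block belongs is a bidirectionally predicted frame, a zero motion information unit added to the candidate motion information unit set S2 includes both a forward zero motion vector and a backward zero motion vector. The reference frame indexes corresponding to the motion vectors in different zero motion information units added to the candidate motion information unit set S2 may differ, and may be, for example, 0, 1, 2, 3, or another value.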
类似的,例如图2-b和图2-c所示,所述当前图像块的左下像素样本对应的 候选运动信息单元集S3可以包括x3个图像块的运动信息单元。其中,所述x3个图像块可包括:与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB、所述当前图像块的左下的空域相邻图像块G、所述当前图像块的左边的空域相邻图像块F中的至少一个。例如先获取所述当前图像块的左下的空域相邻图像块G的运动信息单元和所述当前图像块的左边的空域相邻图像块F的运动信息单元,可将获取的所述当前图像块的左下的空域相邻图像块G的运动信息单元和所述当前图像块的左边的空域相邻图像块F的运动信息单元添加到所述当前图像块的左下像素样本对应的候选运动信息单元集S3中,若所述当前图像块的左下的空域相邻图像块G的运动信息单元和所述当前图像块的左边的空域相邻图像块F的运动信息单元相同,则对所述候选运动信息单元集S3进行去重处理(此时去重处理后的所述候选运动信息单元集S3中的运动信息单元的数量是1),若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB的运动信息单元,与去重处理后的所述候选运动信息单元集S3中的其中一个运动信息单元相同,则可进一步向所述候选运动信息单元集S3中加入零运动信息单元,直到所述候选运动信息单元集S3中运动信息单元数量等于2。此外,若与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB的运动信息单元,不同于去重处理后的所述候选运动信息单元集S3中的任意一个运动信息单元,则可将与所述当前图像块所属的视频帧时域相邻的视频帧之中的与所述当前图像块的左下像素样本LB位置相同的像素样本Col-LB的运动信息单元添加到去重处理后的候选运动信息单元集S3中,若此时所述候选运动信息单元集S3之中的运动信息单元数量仍然少于2个,则进一步向所述候选运动信息单元集S3中加入零运动信息单元,直到所述候选运动信息单元集S3中运动信息单元数量等于2。
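If the video frame containing the current image block is a forward-predicted frame, the zero motion information units added to S3 include a zero motion vector whose prediction direction is forward but may omit a zero motion vector whose prediction direction is backward. If the frame is a backward-predicted frame, the zero motion information units added to S3 include a zero motion vector whose prediction direction is backward but may omit one whose prediction direction is forward. If the frame is a bidirectionally predicted frame, the zero motion information units added to S3 include both a forward and a backward zero motion vector. The motion vectors in different zero motion information units added to S3 may correspond to different reference frame indices, for example 0, 1, 2, 3, or another value.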
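Two motion information units are different if the motion vectors they include differ, or the prediction directions corresponding to those motion vectors differ, or the reference frame indices corresponding to those motion vectors differ. Two motion information units are the same if the motion vectors they include are the same, the corresponding prediction directions are the same, and the corresponding reference frame indices are the same.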
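The construction of a candidate motion information unit set (add the spatial neighbours, deduplicate, add the co-located temporal unit only if it is new, then pad with zero motion information units) can be illustrated with a minimal Python sketch. The names MotionUnit, zero_unit, and build_candidate_set are hypothetical and do not appear in the specification, and the sketch assumes that the reference frame indices of successive zero motion information units simply count up from 0.

```python
from typing import List, NamedTuple, Optional, Tuple

# One directional entry: ((vx, vy), reference frame index). Illustrative layout.
DirMv = Tuple[Tuple[int, int], int]


class MotionUnit(NamedTuple):
    """Hypothetical motion information unit with optional forward/backward parts.

    Tuple equality compares motion vector, direction, and reference index,
    which matches the "same unit" definition in the text.
    """
    fwd: Optional[DirMv]
    bwd: Optional[DirMv]


def zero_unit(frame_type: str, ref_idx: int) -> MotionUnit:
    """Zero motion information unit matching the frame's prediction type."""
    zero: DirMv = ((0, 0), ref_idx)
    return MotionUnit(
        fwd=zero if frame_type in ("forward", "bidirectional") else None,
        bwd=zero if frame_type in ("backward", "bidirectional") else None,
    )


def build_candidate_set(spatial: List[MotionUnit],
                        col: Optional[MotionUnit],
                        frame_type: str,
                        min_size: int = 2) -> List[MotionUnit]:
    cands: List[MotionUnit] = []
    # Add spatial neighbours (e.g. blocks E and D), dropping duplicates.
    for unit in spatial:
        if unit not in cands:
            cands.append(unit)
    # Add the co-located temporal unit only if it differs from every entry.
    if col is not None and col not in cands:
        cands.append(col)
    # Pad with zero motion information units until min_size is reached
    # (assumption: reference indices 0, 1, 2, ... are tried in order).
    ref_idx = 0
    while len(cands) < min_size:
        zero = zero_unit(frame_type, ref_idx)
        if zero not in cands:
            cands.append(zero)
        ref_idx += 1
    return cands
```

When blocks E and D carry the same unit and Col-RT repeats it, the sketch returns that unit followed by one zero motion information unit, which corresponds to the "until the number equals 2" case in the text.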
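It can be understood that, for scenarios with more pixel samples, the candidate motion information unit set of each pixel sample can be obtained in a similar manner.
For example, as shown in FIG. 2-d, the 2 pixel samples may include two of the upper-left pixel sample, the upper-right pixel sample, the lower-left pixel sample, and the center pixel sample a1 of the current image block. The upper-left pixel sample of the current image block is the upper-left vertex of the current image block or a pixel block in the current image block that contains the upper-left vertex; the lower-left pixel sample is the lower-left vertex of the current image block or a pixel block in the current image block that contains the lower-left vertex; the upper-right pixel sample is the upper-right vertex of the current image block or a pixel block in the current image block that contains the upper-right vertex; and the center pixel sample a1 is the center pixel of the current image block or a pixel block in the current image block that contains the center pixel.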
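S303. The video decoding apparatus determines N candidate merged motion information unit sets based on the candidate motion information unit set corresponding to each of the 2 pixel samples. Each motion information unit contained in each of the N candidate merged motion information unit sets is selected from at least some constraint-compliant motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples. The N candidate merged motion information unit sets differ from one another, and each of them includes 2 motion information units.
It can be understood that, if candidate merged motion information unit sets are determined from candidate motion information unit set S1 (assumed to include 3 motion information units) and candidate motion information unit set S2 (assumed to include 2 motion information units), then in theory 3*2=6 initial candidate merged motion information unit sets can be determined. To improve usability, however, at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition may be used to select N candidate merged motion information unit sets from these 6 initial sets. If S1 and S2 contain numbers of motion information units other than those in this example, the number of initial candidate merged motion information unit sets is not necessarily 6.
For the specific constraints of the first to fifth conditions, refer to the examples in the foregoing embodiments; details are not repeated here. The N candidate merged motion information unit sets may of course also satisfy other conditions not listed.
In a specific implementation, for example, at least one of the first, second, and third conditions may first be used to filter the initial candidate merged motion information unit sets, yielding N01 candidate merged motion information unit sets; the N01 sets are then scaled; and at least one of the fourth and fifth conditions is then used to select N candidate merged motion information unit sets from the scaled N01 sets. Alternatively, the fourth and fifth conditions may be left unused, and at least one of the first, second, and third conditions may be used directly to select the N candidate merged motion information unit sets from the initial sets.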
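The enumeration and filtering described in S303 can be sketched as follows. This is an illustrative Python sketch, not the specification's implementation: the names are hypothetical, it applies the first three conditions together (the text allows any subset), and it reads the first condition as "the two units do not carry identical motion vectors", the second as "same prediction direction", and the third as "same reference frame index".

```python
from itertools import product
from typing import NamedTuple, Optional, Tuple

DirMv = Tuple[Tuple[int, int], int]  # ((vx, vy), reference frame index)


class MotionUnit(NamedTuple):
    """Hypothetical motion information unit with optional forward/backward parts."""
    fwd: Optional[DirMv]
    bwd: Optional[DirMv]


def _pairs(a, b):
    # Pair up the forward parts and the backward parts of two units.
    return ((a.fwd, b.fwd), (a.bwd, b.bwd))


def same_direction(a, b) -> bool:
    # Second condition (as read here): both units use the same direction(s).
    return all((x is None) == (y is None) for x, y in _pairs(a, b))


def same_ref_idx(a, b) -> bool:
    # Third condition: reference frame indices agree in each shared direction.
    return all(x[1] == y[1] for x, y in _pairs(a, b)
               if x is not None and y is not None)


def non_translational(a, b) -> bool:
    # First condition (as read here): the motion vectors are not all identical.
    return any(x[0] != y[0] for x, y in _pairs(a, b)
               if x is not None and y is not None)


def candidate_merged_sets(s1, s2):
    """Return (initial pairs, pairs kept after applying conditions 1 to 3)."""
    initial = list(product(s1, s2))  # len(s1) * len(s2) initial sets
    kept = []
    for a, b in initial:
        if same_direction(a, b) and same_ref_idx(a, b) and non_translational(a, b):
            if (a, b) not in kept:  # the N kept sets must differ from one another
                kept.append((a, b))
    return initial, kept
```

With 3 units in S1 and 2 units in S2, the sketch enumerates 6 initial sets, matching the 3*2=6 example in the text, and N is the length of the filtered list.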
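It can be understood that, in video coding, a motion vector reflects the distance by which an object is offset in one direction (the prediction direction) relative to a given moment (a given moment corresponds to a given reference frame). Therefore, when the motion information units of different pixel samples correspond to different prediction directions and/or different reference frame indices, the motion offset of each pixel or pixel block of the current image block relative to a single reference frame may not be directly obtainable. When the pixel samples correspond to the same prediction direction and the same reference frame index, the merged motion vectors can be combined to obtain the motion vector of each pixel or pixel block in the image block.
Therefore, when the motion information units of different pixel samples in a candidate merged motion information unit set correspond to different prediction directions and/or different reference frame indices, the candidate merged motion information unit set may be scaled. Scaling a candidate merged motion information unit set may involve modifying, adding, and/or deleting motion vectors in one or more of its motion information units.
For example, in some possible implementations of the present invention, predicting the pixel values of the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to a motion vector whose prediction direction is a first prediction direction in the merged motion information unit set i differs from the reference frame index of the current image block, scaling the merged motion information unit set i so that the motion vector whose prediction direction is the first prediction direction is scaled to the reference frame of the current image block, and predicting the pixel values of the current image block by using the affine motion model and the scaled merged motion information unit set i, where the first prediction direction is forward or backward;
or, predicting the pixel values of the current image block by using the affine motion model and the merged motion information unit set i may include: when the reference frame index corresponding to the forward motion vector in the merged motion information unit set i differs from the forward reference frame index of the current image block, and the reference frame index corresponding to the backward motion vector in the merged motion information unit set i differs from the backward reference frame index of the current image block, scaling the merged motion information unit set i so that the forward motion vector is scaled to the forward reference frame of the current image block and the backward motion vector is scaled to the backward reference frame of the current image block, and predicting the pixel values of the current image block by using the affine motion model and the scaled merged motion information unit set i.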
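The text does not state how a motion vector is scaled to a different reference frame. One common approach, assumed here purely for illustration, scales the vector by the ratio of temporal distances between frames (measured in picture order count). The function name scale_mv and its parameters are hypothetical.

```python
from typing import Tuple


def scale_mv(mv: Tuple[int, int], poc_cur: int, poc_ref: int,
             poc_target: int) -> Tuple[int, int]:
    """Scale mv, which points from the current frame to poc_ref, so that it
    points to poc_target instead (assumption: linear motion over time)."""
    td_orig = poc_cur - poc_ref        # temporal distance of the original vector
    td_target = poc_cur - poc_target   # temporal distance to the target frame
    if td_orig == 0:
        raise ValueError("original reference frame equals the current frame")
    return (round(mv[0] * td_target / td_orig),
            round(mv[1] * td_target / td_orig))
```

For instance, a vector that spans 4 frames is halved when the target reference frame is only 2 frames away, and is left unchanged when the target equals the original reference frame.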
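S304. The video decoding apparatus decodes the video bitstream to obtain the identifier of the merged motion information unit set i and the prediction residual of the current image block, and, based on the identifier, determines the merged motion information unit set i, which contains 2 motion information units, from the N candidate merged motion information unit sets.
Correspondingly, the video encoding apparatus may write the identifier of the merged motion information unit set i into the video bitstream.
S305. The video decoding apparatus performs motion vector prediction on the current image block by using the affine motion model and the merged motion information unit set i.
For example, the video decoding apparatus may first perform motion estimation on the motion vectors in the merged motion information unit set i to obtain a motion-estimated merged motion information unit set i, and then perform motion vector prediction on the current image block by using the affine motion model and the motion-estimated merged motion information unit set i.
Assume the size of the current image block is w×h, where w may or may not equal h.
Assume the coordinates of the two pixel samples are (0,0) and (w,0); here the coordinates of the top-left pixel of each pixel sample are used in the calculation as an example. See FIG. 2-e, which shows the coordinates of the four vertices of the current image block.
The motion vectors of the 2 pixel samples are (vx0,vy0) and (vx1,vy1). Substituting the coordinates and motion vectors of the 2 pixel samples into the following example affine motion model yields the motion vector of any pixel in the current image block x.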
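$$
\begin{cases}
vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\
vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0
\end{cases}
\qquad \text{(Formula 1)}
$$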
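Here, the motion vectors of the 2 pixel samples are (vx0,vy0) and (vx1,vy1); vx and vy are, respectively, the horizontal component and the vertical component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; and w in Formula 1 may be the length or the width of the current image block.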
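As a worked illustration, the following Python sketch evaluates a four-parameter affine model driven by the two pixel samples at (0,0) and (w,0). It assumes the usual rotation-zoom-translation form, vx = a·x − b·y + vx0 and vy = b·x + a·y + vy0 with a = (vx1 − vx0)/w and b = (vy1 − vy0)/w, which is the only four-parameter model that reproduces both control-point vectors at those coordinates; the function name affine_mv is hypothetical.

```python
from typing import Tuple


def affine_mv(x: float, y: float,
              v0: Tuple[float, float], v1: Tuple[float, float],
              w: float) -> Tuple[float, float]:
    """Motion vector at (x, y) from control-point vectors v0 at (0, 0)
    and v1 at (w, 0), under an assumed four-parameter affine model."""
    a = (v1[0] - v0[0]) / w  # combined zoom/rotation term
    b = (v1[1] - v0[1]) / w  # combined rotation/zoom term
    vx = a * x - b * y + v0[0]
    vy = b * x + a * y + v0[1]
    return vx, vy
```

Evaluating the sketch at (0,0) and (w,0) returns the two control-point vectors themselves, which is a quick consistency check on the model.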
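S306. The video decoding apparatus predicts the pixel values of the current image block based on the calculated motion vector of each pixel or each pixel block of the current image block, to obtain the predicted pixel values of the current image block.
S307. The video decoding apparatus reconstructs the current image block by using the predicted pixel values of the current image block and the prediction residual of the current image block.
It can be seen that, in the technical solution of this embodiment, the video decoding apparatus predicts the pixel values of the current image block by using the affine motion model and the merged motion information unit set i, where each motion information unit in the merged motion information unit set i is selected from at least some of the motion information units in the candidate motion information unit set corresponding to each of the 2 pixel samples. Because the selection range of the merged motion information unit set i becomes relatively small, the conventional mechanism of selecting one set of motion information units for multiple pixel samples through extensive computation over all possible candidate motion information unit sets of those pixel samples is abandoned. This helps improve coding efficiency and reduce the computational complexity of image prediction based on the affine motion model, which in turn makes it feasible to introduce the affine motion model into video coding standards. Introducing the affine motion model also helps describe object motion more accurately and therefore helps improve prediction accuracy. Because the number of referenced pixel samples may be 2, the computational complexity of image prediction based on the affine motion model is further reduced, as is the amount of affine parameter information or the number of motion vector residuals that the encoder transmits.
It should be understood that steps S101 to S102 of the first embodiment, steps S201 to S202 of the second embodiment, and steps S301 to S302 of the third embodiment of the present invention all use two pixel samples as an example to illustrate how any pixel sample is determined and how its corresponding candidate motion information unit set is obtained; the number of pixel samples may be any positive integer, such as 1, 2, or 3.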
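Refer to FIG. 4, which is a schematic flowchart of yet another image prediction method according to an embodiment of the present invention. As shown by way of example in FIG. 4, the fourth embodiment of the present invention provides an image prediction method for a decoder-side apparatus. Any image block includes at least one pixel sample of a first type and at least one pixel sample of a second type; assume that the first type includes a first pixel sample and the second type includes a second pixel sample. By way of example, the two types differ in that the motion information of a first-type sample comes solely from its corresponding motion information unit, whereas the motion information of a second-type sample comes only partly from its corresponding motion information unit. The method may include:
S401. Parse first bitstream information.
Each image block to be predicted corresponds to a portion of the bitstream. By parsing the bitstream, the decoding apparatus can obtain the side information that guides construction of the predicted image, together with the residual between the predicted image and the image to be decoded; the image to be decoded can be reconstructed from the predicted image and the residual.
The first bitstream information represents the motion information units corresponding to the first pixel sample and the second pixel sample, respectively. By way of example, the parsed first bitstream information is an index value; it may indicate the motion information units of the first pixel sample and the second pixel sample separately, or it may jointly indicate the combination of the two motion information units. This is not limited.
S402. Obtain the motion information of the first pixel sample and the predicted motion information of the second pixel sample based on the parsed first bitstream information.
It should be understood that predicted motion information is a predicted value of motion information. By way of example, when the motion information is a motion vector, the predicted motion information is the predicted value of that motion vector. By way of example, in the field of video coding, the predicted value of a motion vector generally comes from the motion information unit corresponding to the current image block, that is, the motion vector of the prediction image block.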
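By way of example, this step specifically includes:
S4021. Determine the candidate motion information unit sets corresponding to the first pixel sample and the second pixel sample, where each candidate motion information unit set includes at least one motion information unit.
S101 describes the method in general terms, and S301-S302 describe it for the decoding apparatus. For the content and example implementations of S4021, refer to S101 and S301-S302; details are not repeated here.
S4022. Determine the merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each of the first pixel sample and the second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward.
S102 describes the method in general terms, and S303 describes it for the decoding apparatus. For the content and example implementations of S4022, refer to S102 and S303; details are not repeated here.
S4023. Based on the parsed first bitstream information, determine from the merged motion information unit set the motion information unit corresponding to each of the first pixel sample and the second pixel sample.
S304 describes the method for the decoding apparatus. For the content and example implementations of S4023, refer to S304; details are not repeated here.
S4024. Use the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample.
By way of example, the motion information unit indicated by the first bitstream information, that is, the motion vector of the prediction image block corresponding to the first pixel sample, is used as the motion vector of the first pixel sample.
S4025. Use the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample.
By way of example, the motion information unit indicated by the first bitstream information, that is, the motion vector of the prediction image block corresponding to the second pixel sample, is used as the predicted motion vector of the second pixel sample.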
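S403. Parse second bitstream information.
The second bitstream information represents the differential motion information of the second pixel sample, where differential motion information is the difference between motion information and predicted motion information. By way of example, the second bitstream information represents the residual between the motion vector of the second pixel sample and its predicted motion vector. It should be understood that the motion vector of every second pixel sample has a corresponding residual, which may be 0. The parsed second bitstream information may include the motion vector residual of each second pixel sample individually, or the set of motion vector residuals of all second pixel samples. This is not limited.
S404. Obtain the motion information of the second pixel sample based on the parsed second bitstream information and the predicted motion information of the second pixel sample.
This step specifically includes:
obtaining the differential motion information of the second pixel sample from the parsed second bitstream information, and adding it to the corresponding predicted motion information to obtain the motion information of the second pixel sample.
By way of example, adding the motion vector residual of the second pixel sample, obtained by parsing the second bitstream information, to the corresponding predicted motion vector yields the motion vector of the second pixel sample.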
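Steps S402 to S404 can be summarized in a short Python sketch: the first pixel sample takes its predictor directly as its motion vector, while the second pixel sample adds the parsed residual to its predictor. The function name reconstruct_mvs and its argument layout are hypothetical.

```python
from typing import Tuple

Mv = Tuple[int, int]


def reconstruct_mvs(pred_first: Mv, pred_second: Mv,
                    mvd_second: Mv) -> Tuple[Mv, Mv]:
    """Decoder-side motion vectors of the first and second pixel samples."""
    # First pixel sample: the predictor is used as-is, no residual is parsed.
    mv_first = pred_first
    # Second pixel sample: predictor plus the residual from the bitstream.
    mv_second = (pred_second[0] + mvd_second[0],
                 pred_second[1] + mvd_second[1])
    return mv_first, mv_second
```

A residual of (0, 0) leaves the second pixel sample's motion vector equal to its predictor, which is the "residual may be 0" case mentioned in the text.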
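S405. Obtain the predicted value of the current image block based on the motion model of the current image block and the motion information of the first pixel sample and the second pixel sample.
By way of example, the motion model of the current image block may be an affine motion model or another translational or non-translational motion model; it may be a four-parameter affine motion model or another affine motion model, such as a six-parameter one. This is not limited.
By way of example, the motion model includes: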
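$$
\begin{cases}
vx = \dfrac{vx_1 - vx_0}{w}\,x - \dfrac{vy_1 - vy_0}{w}\,y + vx_0 \\
vy = \dfrac{vy_1 - vy_0}{w}\,x + \dfrac{vx_1 - vx_0}{w}\,y + vy_0
\end{cases}
$$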
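where the motion vectors of the first pixel sample and the second pixel sample are (vx0,vy0) and (vx1,vy1), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; vy is the vertical component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; and w is the length or width of the current image block.
By way of example, the motion model further includes: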
Figure PCTCN2016083203-appb-000047
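where the motion vectors of the first pixel sample and any two second pixel samples, or of the second pixel sample and any two first pixel samples, are (vx0,vy0), (vx1,vy1), and (vx2,vy2), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; vy is the vertical component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; and w is the length or width of the current image block.
This step specifically includes:
In one feasible implementation, the motion vector of each pixel in the current image block is calculated by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and the predicted pixel value of each pixel in the current image block is determined by using the calculated motion vectors. In another feasible implementation, the motion vector of each pixel block in the current image block is calculated by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and the predicted pixel value of each pixel of each pixel block in the current image block is determined by using the calculated motion vectors of the pixel blocks.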
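The pixel-block variant can be sketched in Python as follows. The sketch assumes a four-parameter affine model with control points at (0,0) and (w,0), a hypothetical sub-block size n, and the top-left pixel of each sub-block as its representative coordinate; none of these choices is fixed by the text, and the name block_mvs is illustrative.

```python
from typing import Dict, Tuple

Mv = Tuple[float, float]


def block_mvs(w: int, h: int, v0: Mv, v1: Mv,
              n: int = 4) -> Dict[Tuple[int, int], Mv]:
    """One motion vector per n-by-n pixel block of a w-by-h image block."""
    a = (v1[0] - v0[0]) / w
    b = (v1[1] - v0[1]) / w
    mvs: Dict[Tuple[int, int], Mv] = {}
    for by in range(0, h, n):
        for bx in range(0, w, n):
            # Every pixel in the block shares the vector of its top-left pixel.
            mvs[(bx, by)] = (a * bx - b * by + v0[0],
                             b * bx + a * by + v0[1])
    return mvs
```

Setting n to 1 reduces the sketch to the per-pixel variant; a larger n trades motion accuracy for fewer motion vector computations.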
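S305 describes the method for the decoding apparatus. For the content and example implementations of S405, refer to S305; details are not repeated here.
After the predicted image of the current image block is obtained by decoding, some embodiments further include decoding the bitstream to obtain the residual information of the current image block and reconstructing the current to-be-decoded image block from the residual information and the predicted image. S306-S307 describe the method for the decoding apparatus. For the content and example implementations of this step, refer to S306-S307; details are not repeated here.
It can be seen that, in this embodiment of the present invention, obtaining the motion information of the first pixel sample requires only obtaining its corresponding predicted motion information and using that as its motion information; the bitstream need not be parsed further to obtain a residual for the predicted motion information. This saves the bits that would otherwise be needed to transmit that residual, reduces bit consumption, and improves efficiency.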
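Refer to FIG. 5, which is a schematic flowchart of yet another image prediction method according to an embodiment of the present invention. As shown by way of example in FIG. 5, the fifth embodiment of the present invention provides an image prediction method for an encoder-side apparatus. Any image block includes at least one pixel sample of a first type and at least one pixel sample of a second type; assume that the first type includes a first pixel sample and the second type includes a second pixel sample. By way of example, the two types differ in that the motion information of a first-type sample comes solely from its corresponding motion information unit, whereas the motion information of a second-type sample comes only partly from its corresponding motion information unit. The method may include:
S501. Determine the candidate motion information unit sets corresponding to the first pixel sample and the second pixel sample, where each candidate motion information unit set includes at least one motion information unit.
S101 describes the method in general terms, and S201-S202 describe it for the encoding apparatus. For the content and example implementations of S501, refer to S101 and S201-S202; details are not repeated here.
S502. Determine the merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each of the first pixel sample and the second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward.
S102 describes the method in general terms, and S203 describes it for the encoding apparatus. For the content and example implementations of S502, refer to S102 and S203; details are not repeated here.
S503. Determine, from the merged motion information unit set, the motion information units corresponding to the first pixel sample and the second pixel sample.
S204 describes the method for the encoding apparatus. For the content and example implementations of S503, refer to S204; details are not repeated here.
S504. Encode first bitstream information.
The first bitstream information represents the motion information units corresponding to the first pixel sample and the second pixel sample, respectively. By way of example, the encoded first bitstream information is an index value; it may indicate the motion information units of the first pixel sample and the second pixel sample separately, or it may jointly indicate the combination of the two motion information units. This is not limited. It should be understood that the position at which the first bitstream information is written into the bitstream in this step must correspond to the position at which it is parsed from the bitstream in the corresponding decoder-side step (for example, step S401 in the fourth embodiment of the present invention).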
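S505. Use the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample.
By way of example, the motion information unit indicated by the first bitstream information, that is, the motion vector of the prediction image block corresponding to the first pixel sample, is used as the motion vector of the first pixel sample.
S506. Use the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample.
By way of example, the motion information unit indicated by the first bitstream information, that is, the motion vector of the prediction image block corresponding to the second pixel sample, is used as the predicted motion vector of the second pixel sample.
S507. Calculate the differential motion information of the second pixel sample, where the differential motion information is the difference between motion information and predicted motion information.
By way of example, the second bitstream information represents the residual between the motion vector of the second pixel sample and its predicted motion vector. It should be understood that the motion vector of every second pixel sample has a corresponding residual, which may be 0. The second bitstream information may include the motion vector residual of each second pixel sample individually, or the set of motion vector residuals of all second pixel samples. This is not limited.
By way of example, subtracting the corresponding predicted motion vector from the motion vector of the second pixel sample yields the motion vector residual of the second pixel sample.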
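The encoder-side subtraction in S507 and its decoder-side inverse can be checked with a small Python round trip. The function names encode_mvd and decode_mv are hypothetical, and entropy coding of the residual is deliberately left out.

```python
from typing import Tuple

Mv = Tuple[int, int]


def encode_mvd(mv: Mv, pred: Mv) -> Mv:
    """Encoder side: residual = motion vector minus predicted motion vector."""
    return (mv[0] - pred[0], mv[1] - pred[1])


def decode_mv(pred: Mv, mvd: Mv) -> Mv:
    """Decoder side: motion vector = predicted motion vector plus residual."""
    return (pred[0] + mvd[0], pred[1] + mvd[1])
```

Because both sides derive the same predictor from the first bitstream information, decoding the residual produced by the encoder recovers the original motion vector exactly.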
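S508. Encode second bitstream information.
The second bitstream information represents the differential motion information of the second pixel sample. It should be understood that the position at which the second bitstream information is written into the bitstream in this step must correspond to the position at which it is parsed from the bitstream in the corresponding decoder-side step (for example, step S403 in the fourth embodiment of the present invention).
It should be understood that steps S504 to S508 are not restricted to any particular order and may also be performed in parallel.
S509. Obtain the predicted value of the current image block based on the motion model of the current image block, the motion information of the first pixel sample, and the motion information of the second pixel sample.
S405 describes the method for the decoding apparatus. For the content and example implementations of S509, refer to S405; details are not repeated here.
It can be seen that, in this embodiment of the present invention, obtaining the motion information of the first pixel sample requires only obtaining its corresponding predicted motion information and using that as its motion information; no residual for the predicted motion information needs to be encoded into the bitstream. This saves the bits that would otherwise be needed to transmit that residual, reduces bit consumption, and improves coding efficiency.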
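Related apparatuses for implementing the foregoing solutions are further provided below.
Referring to FIG. 6, the sixth embodiment of the present invention further provides an image prediction apparatus 600, which may include:
a first parsing unit 601, configured to parse first bitstream information, where the first bitstream information indicates the motion information unit corresponding to each first pixel sample and each second pixel sample;
a first obtaining unit 602, configured to obtain the motion information of each first pixel sample and the predicted motion information of each second pixel sample based on the parsed first bitstream information, where the predicted motion information is prediction information of motion information;
a second parsing unit 603, configured to parse second bitstream information, where the second bitstream information represents the differential motion information of each second pixel sample, and the differential motion information is the difference between motion information and predicted motion information;
a second obtaining unit 604, configured to obtain the motion information of each second pixel sample based on the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample; and
a third obtaining unit 605, configured to obtain the predicted value of the current image block based on the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
It should be understood that, by way of example, the image prediction apparatus 600 in this embodiment may be used to perform the method and the example implementations described in Embodiment 4 of the present invention. For the specific functions of functional modules 601-605 in this embodiment, refer to the corresponding implementations in Embodiment 4; for the beneficial effects, refer to those of Embodiment 4; details are not repeated here. The image prediction apparatus 600 may be any apparatus that needs to output or play video, such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.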
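Referring to FIG. 7, FIG. 7 is a schematic diagram of an image prediction apparatus 700 according to the seventh embodiment of the present invention. The image prediction apparatus 700 may include at least one bus 701, at least one processor 702 connected to the bus 701, and at least one memory 703 connected to the bus 701.
The processor 702 invokes, through the bus 701, the code or instructions stored in the memory 703 to: parse first bitstream information, where the first bitstream information indicates the motion information unit corresponding to each first pixel sample and each second pixel sample; obtain the motion information of each first pixel sample and the predicted motion information of each second pixel sample based on the parsed first bitstream information, where the predicted motion information is prediction information of motion information; parse second bitstream information, where the second bitstream information represents the differential motion information of each second pixel sample, and the differential motion information is the difference between motion information and predicted motion information; obtain the motion information of each second pixel sample based on the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample; and obtain the predicted value of the current image block based on the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
It should be understood that, by way of example, the image prediction apparatus 700 in this embodiment may be used to perform the method and the example implementations described in Embodiment 4 of the present invention. For the specific functions, refer to the corresponding implementations in Embodiment 4; for the beneficial effects, refer to those of Embodiment 4; details are not repeated here. The image prediction apparatus 700 may be any apparatus that needs to output or play video, such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
The eighth embodiment of the present invention further provides a computer storage medium. The computer storage medium may store a program, and when executed the program performs some or all of the steps of any image prediction method described in the foregoing method embodiments. For the specific functions, refer to the corresponding implementations in Embodiment 4 of the present invention; for the beneficial effects, refer to those of Embodiment 4; details are not repeated here.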
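Referring to FIG. 8, the ninth embodiment of the present invention further provides an image prediction apparatus 800, which may include:
a first determining unit 801, configured to determine the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, where each candidate motion information unit set includes at least one motion information unit;
a second determining unit 802, configured to determine the merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
a third determining unit 803, configured to determine, from the merged motion information unit set, the motion information unit corresponding to each first pixel sample and each second pixel sample;
a first encoding unit 804, configured to encode first bitstream information, where the first bitstream information represents the motion information unit, determined from the merged motion information unit set, that corresponds to each first pixel sample and each second pixel sample;
a first assignment unit 805, configured to use the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample;
a second assignment unit 806, configured to use the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample;
a calculation unit 807, configured to calculate the differential motion information of the second pixel sample, where the differential motion information is the difference between motion information and predicted motion information;
a second encoding unit 808, configured to encode second bitstream information, where the second bitstream information represents the differential motion information of each second pixel sample; and
an obtaining unit 809, configured to obtain the predicted value of the current image block based on the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
It should be understood that, by way of example, the image prediction apparatus 800 in this embodiment may be used to perform the method and the example implementations described in Embodiment 5 of the present invention. For the specific functions of functional modules 801-809 in this embodiment, refer to the corresponding implementations in Embodiment 5; for the beneficial effects, refer to those of Embodiment 5; details are not repeated here. The image prediction apparatus 800 may be any apparatus that needs to output or play video, such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.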
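Referring to FIG. 9, FIG. 9 is a schematic diagram of an image prediction apparatus 900 according to the tenth embodiment of the present invention. The image prediction apparatus 900 may include at least one bus 901, at least one processor 902 connected to the bus 901, and at least one memory 903 connected to the bus 901.
The processor 902 invokes, through the bus 901, the code or instructions stored in the memory 903 to: determine the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, where each candidate motion information unit set includes at least one motion information unit; determine the merged motion information unit set of the current block, where each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; determine, from the merged motion information unit set, the motion information unit corresponding to each first pixel sample and each second pixel sample; encode first bitstream information, where the first bitstream information represents the motion information unit, determined from the merged motion information unit set, that corresponds to each first pixel sample and each second pixel sample; use the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample; use the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample; calculate the differential motion information of the second pixel sample, where the differential motion information is the difference between motion information and predicted motion information; encode second bitstream information, where the second bitstream information represents the differential motion information of each second pixel sample; and obtain the predicted value of the current image block based on the motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
It should be understood that, by way of example, the image prediction apparatus 900 in this embodiment may be used to perform the method and the example implementations described in Embodiment 5 of the present invention. For the specific functions, refer to the corresponding implementations in Embodiment 5; for the beneficial effects, refer to those of Embodiment 5; details are not repeated here. The image prediction apparatus 900 may be any apparatus that needs to output or play video, such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
The eleventh embodiment of the present invention further provides a computer storage medium. The computer storage medium may store a program, and when executed the program performs some or all of the steps of any image prediction method described in the foregoing method embodiments. For the specific functions, refer to the corresponding implementations in Embodiment 5 of the present invention; for the beneficial effects, refer to those of Embodiment 5; details are not repeated here.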
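It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of actions. A person skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. A person skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical or take other forms.
The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to perform all or some of the steps of the methods in the embodiments of the present invention. The storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).
The foregoing embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.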

Claims (32)

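  1. An image prediction method, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the method comprises:
    parsing first bitstream information, wherein the first bitstream information indicates the motion information unit corresponding to each first pixel sample and each second pixel sample;
    obtaining motion information of each first pixel sample and predicted motion information of each second pixel sample based on the parsed first bitstream information, wherein the predicted motion information is prediction information of motion information;
    parsing second bitstream information, wherein the second bitstream information represents differential motion information of each second pixel sample, and the differential motion information is the difference between motion information and predicted motion information;
    obtaining motion information of each second pixel sample based on the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample; and
    obtaining a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
  2. The method according to claim 1, wherein the first bitstream information includes an index, and the index indicates the motion information unit corresponding to each first pixel sample and each second pixel sample.
  3. The method according to claim 1 or 2, wherein the second bitstream information includes a difference, and the difference is the motion vector residual between the motion vector and the predicted motion vector of any second pixel sample.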
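  4. The method according to any one of claims 1 to 3, wherein the obtaining motion information of each first pixel sample and predicted motion information of each second pixel sample based on the parsed first bitstream information comprises:
    determining the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, wherein each candidate motion information unit set includes at least one motion information unit;
    determining a merged motion information unit set of the current block, wherein each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
    determining, from the merged motion information unit set and based on the parsed first bitstream information, the motion information unit corresponding to each first pixel sample and each second pixel sample;
    using the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample; and
    using the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample.
  5. The method according to claim 4, wherein the determining a merged motion information unit set of the current block comprises:
    determining, from N candidate merged motion information unit sets, the merged motion information unit set that contains the motion information unit corresponding to each first pixel sample and each second pixel sample, wherein each motion information unit contained in each of the N candidate merged motion information unit sets is selected from at least some constraint-compliant motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, N is a positive integer, and the N candidate merged motion information unit sets differ from one another.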
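  6. The method according to claim 5, wherein the N candidate merged motion information unit sets satisfy at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition,
    wherein the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion;
    the second condition includes that the two motion information units in any one of the N candidate merged motion information unit sets correspond to the same prediction direction;
    the third condition includes that the two motion information units in any one of the N candidate merged motion information unit sets correspond to the same reference frame index;
    the fourth condition includes that the absolute value of the difference between the horizontal components of the motion vectors of the two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the horizontal component of the motion vector of one motion information unit in any one of the N candidate merged motion information unit sets and the horizontal component of the motion vector of a pixel sample Z is less than or equal to the horizontal component threshold, wherein the pixel sample Z of the current image block differs from each of the first pixel sample and the second pixel sample; and
    the fifth condition includes that the absolute value of the difference between the vertical components of the motion vectors of the two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the vertical component of the motion vector of any motion information unit in one of the N candidate merged motion information unit sets and the vertical component of the motion vector of a pixel sample Z is less than or equal to the vertical component threshold, wherein the pixel sample Z of the current image block differs from each of the first pixel sample and the second pixel sample.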
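  7. The method according to any one of claims 1 to 6, wherein the obtaining motion information of each second pixel sample based on the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample comprises:
    obtaining the differential motion information of each second pixel sample from the parsed second bitstream information; and
    adding the differential motion information of each second pixel sample to the corresponding predicted motion information to obtain the motion information of each second pixel sample.
  8. The method according to any one of claims 1 to 7, wherein the motion model is a non-translational motion model, specifically:
    the non-translational motion model is an affine motion model of the following form: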
    Figure PCTCN2016083203-appb-100001
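    wherein the motion vectors of the first pixel sample and the second pixel sample are (vx0,vy0) and (vx1,vy1), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; vy is the vertical component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; and w is the length or width of the current image block;
    and correspondingly, the obtaining a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample comprises:
    calculating the motion vector of each pixel in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determining the predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel in the current image block;
    or,
    calculating the motion vector of each pixel block in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determining the predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block in the current image block.
  9. The method according to any one of claims 1 to 7, wherein the motion model is a non-translational motion model, specifically:
    the non-translational motion model is an affine motion model of the following form: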
    Figure PCTCN2016083203-appb-100002
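    wherein the motion vectors of any one first pixel sample and any two second pixel samples, or of any two first pixel samples and any one second pixel sample, are (vx0,vy0), (vx1,vy1), and (vx2,vy2), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; vy is the vertical component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; and w is the length or width of the current image block;
    and correspondingly, the obtaining a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample comprises:
    calculating the motion vector of each pixel in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determining the predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel in the current image block;
    or,
    calculating the motion vector of each pixel block in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determining the predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block in the current image block.
  10. The method according to any one of claims 1 to 9, wherein the at least one first pixel sample and the at least one second pixel sample include two of the upper-left pixel sample, the upper-right pixel sample, the lower-left pixel sample, and the center pixel sample a1 of the current image block;
    wherein the upper-left pixel sample of the current image block is the upper-left vertex of the current image block or a pixel block in the current image block that contains the upper-left vertex of the current image block; the lower-left pixel sample of the current image block is the lower-left vertex of the current image block or a pixel block in the current image block that contains the lower-left vertex of the current image block; the upper-right pixel sample of the current image block is the upper-right vertex of the current image block or a pixel block in the current image block that contains the upper-right vertex of the current image block; and the center pixel sample a1 of the current image block is the center pixel of the current image block or a pixel block in the current image block that contains the center pixel of the current image block.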
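  11. The method according to claim 10, wherein
    the candidate motion information unit set corresponding to the upper-left pixel sample of the current image block includes the motion information units of x1 pixel samples, wherein the x1 pixel samples include at least one pixel sample spatially adjacent to the upper-left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper-left pixel sample of the current image block, and x1 is a positive integer;
    wherein the x1 pixel samples include at least one of: a pixel sample that, in a video frame temporally adjacent to the video frame containing the current image block, has the same position as the upper-left pixel sample of the current image block; the spatially adjacent pixel sample to the left of the current image block; the spatially adjacent pixel sample at the upper left of the current image block; and the spatially adjacent pixel sample above the current image block.
  12. The method according to claim 10 or 11, wherein
    the candidate motion information unit set corresponding to the upper-right pixel sample of the current image block includes the motion information units of x2 pixel samples, wherein the x2 pixel samples include at least one pixel sample spatially adjacent to the upper-right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper-right pixel sample of the current image block, and x2 is a positive integer;
    wherein the x2 pixel samples include at least one of: a pixel sample that, in a video frame temporally adjacent to the video frame containing the current image block, has the same position as the upper-right pixel sample of the current image block; the spatially adjacent pixel sample to the right of the current image block; the spatially adjacent pixel sample at the upper right of the current image block; and the spatially adjacent pixel sample above the current image block.
  13. The method according to any one of claims 10 to 12, wherein
    the candidate motion information unit set corresponding to the lower-left pixel sample of the current image block includes the motion information units of x3 pixel samples, wherein the x3 pixel samples include at least one pixel sample spatially adjacent to the lower-left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower-left pixel sample of the current image block, and x3 is a positive integer;
    wherein the x3 pixel samples include at least one of: a pixel sample that, in a video frame temporally adjacent to the video frame containing the current image block, has the same position as the lower-left pixel sample of the current image block; the spatially adjacent pixel sample to the left of the current image block; the spatially adjacent pixel sample at the lower left of the current image block; and the spatially adjacent pixel sample below the current image block.
  14. The method according to any one of claims 10 to 13, wherein
    the candidate motion information unit set corresponding to the center pixel sample a1 of the current image block includes the motion information units of x5 pixel samples, wherein one of the x5 pixel samples is a pixel sample a2,
    wherein the position of the center pixel sample a1 in the video frame containing the current image block is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame containing the current image block, and x5 is a positive integer.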
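  15. An image prediction method, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the method comprises:
    determining the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, wherein each candidate motion information unit set includes at least one motion information unit;
    determining a merged motion information unit set of the current block, wherein each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
    determining, from the merged motion information unit set, the motion information unit corresponding to each first pixel sample and each second pixel sample;
    encoding first bitstream information, wherein the first bitstream information represents the motion information unit, determined from the merged motion information unit set, that corresponds to each first pixel sample and each second pixel sample;
    using the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample;
    using the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample;
    calculating the differential motion information of the second pixel sample, wherein the differential motion information is the difference between motion information and predicted motion information;
    encoding second bitstream information, wherein the second bitstream information represents the differential motion information of each second pixel sample; and
    obtaining a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.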
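  16. An image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the apparatus comprises:
    a first parsing unit, configured to parse first bitstream information, wherein the first bitstream information indicates the motion information unit corresponding to each first pixel sample and each second pixel sample;
    a first obtaining unit, configured to obtain motion information of each first pixel sample and predicted motion information of each second pixel sample based on the parsed first bitstream information, wherein the predicted motion information is prediction information of motion information;
    a second parsing unit, configured to parse second bitstream information, wherein the second bitstream information represents differential motion information of each second pixel sample, and the differential motion information is the difference between motion information and predicted motion information;
    a second obtaining unit, configured to obtain motion information of each second pixel sample based on the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample; and
    a third obtaining unit, configured to obtain a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
  17. The apparatus according to claim 16, wherein the first bitstream information includes an index, and the index indicates the motion information unit corresponding to each first pixel sample and each second pixel sample.
  18. The apparatus according to claim 16 or 17, wherein the second bitstream information includes a difference, and the difference is the motion vector residual between the motion vector and the predicted motion vector of any second pixel sample.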
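  19. The apparatus according to any one of claims 16 to 18, wherein the first obtaining unit is specifically configured to:
    determine the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, wherein each candidate motion information unit set includes at least one motion information unit;
    determine a merged motion information unit set of the current block, wherein each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
    determine, from the merged motion information unit set and based on the parsed first bitstream information, the motion information unit corresponding to each first pixel sample and each second pixel sample;
    use the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample; and
    use the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample.
  20. The apparatus according to claim 19, wherein the first obtaining unit is specifically configured to:
    determine, from N candidate merged motion information unit sets, the merged motion information unit set that contains the motion information unit corresponding to each first pixel sample and each second pixel sample, wherein each motion information unit contained in each of the N candidate merged motion information unit sets is selected from at least some constraint-compliant motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, N is a positive integer, and the N candidate merged motion information unit sets differ from one another.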
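  21. The apparatus according to claim 20, wherein the N candidate merged motion information unit sets satisfy at least one of a first condition, a second condition, a third condition, a fourth condition, and a fifth condition,
    wherein the first condition includes that the motion mode of the current image block indicated by the motion information units in any one of the N candidate merged motion information unit sets is non-translational motion;
    the second condition includes that the two motion information units in any one of the N candidate merged motion information unit sets correspond to the same prediction direction;
    the third condition includes that the two motion information units in any one of the N candidate merged motion information unit sets correspond to the same reference frame index;
    the fourth condition includes that the absolute value of the difference between the horizontal components of the motion vectors of the two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a horizontal component threshold, or that the absolute value of the difference between the horizontal component of the motion vector of one motion information unit in any one of the N candidate merged motion information unit sets and the horizontal component of the motion vector of a pixel sample Z is less than or equal to the horizontal component threshold, wherein the pixel sample Z of the current image block differs from each of the first pixel sample and the second pixel sample; and
    the fifth condition includes that the absolute value of the difference between the vertical components of the motion vectors of the two motion information units in any one of the N candidate merged motion information unit sets is less than or equal to a vertical component threshold, or that the absolute value of the difference between the vertical component of the motion vector of any motion information unit in one of the N candidate merged motion information unit sets and the vertical component of the motion vector of a pixel sample Z is less than or equal to the vertical component threshold, wherein the pixel sample Z of the current image block differs from each of the first pixel sample and the second pixel sample.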
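  22. The apparatus according to any one of claims 16 to 21, wherein the second obtaining unit is specifically configured to:
    obtain the differential motion information of each second pixel sample from the parsed second bitstream information; and
    add the differential motion information of each second pixel sample to the corresponding predicted motion information to obtain the motion information of each second pixel sample.
  23. The apparatus according to any one of claims 16 to 22, wherein the motion model is a non-translational motion model, specifically:
    the non-translational motion model is an affine motion model of the following form: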
    Figure PCTCN2016083203-appb-100003
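    wherein the motion vectors of the first pixel sample and the second pixel sample are (vx0,vy0) and (vx1,vy1), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; vy is the vertical component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; and w is the length or width of the current image block;
    and correspondingly, the third obtaining unit is specifically configured to:
    calculate the motion vector of each pixel in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determine the predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel in the current image block;
    or,
    calculate the motion vector of each pixel block in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determine the predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block in the current image block.
  24. The apparatus according to any one of claims 16 to 22, wherein the motion model is a non-translational motion model, specifically:
    the non-translational motion model is an affine motion model of the following form: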
    Figure PCTCN2016083203-appb-100004
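    wherein the motion vectors of any one first pixel sample and any two second pixel samples, or of any two first pixel samples and any one second pixel sample, are (vx0,vy0), (vx1,vy1), and (vx2,vy2), respectively; vx is the horizontal component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; vy is the vertical component of the motion vector of the pixel sample at coordinates (x,y) in the current image block; and w is the length or width of the current image block;
    and correspondingly, the third obtaining unit is specifically configured to:
    calculate the motion vector of each pixel in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determine the predicted pixel value of each pixel in the current image block by using the calculated motion vector of each pixel in the current image block;
    or,
    calculate the motion vector of each pixel block in the current image block by using the affine motion model and the motion vectors of the first pixel sample and the second pixel sample, and determine the predicted pixel value of each pixel of each pixel block in the current image block by using the calculated motion vector of each pixel block in the current image block.
  25. The apparatus according to any one of claims 16 to 24, wherein the at least one first pixel sample and the at least one second pixel sample include two of the upper-left pixel sample, the upper-right pixel sample, the lower-left pixel sample, and the center pixel sample a1 of the current image block;
    wherein the upper-left pixel sample of the current image block is the upper-left vertex of the current image block or a pixel block in the current image block that contains the upper-left vertex of the current image block; the lower-left pixel sample of the current image block is the lower-left vertex of the current image block or a pixel block in the current image block that contains the lower-left vertex of the current image block; the upper-right pixel sample of the current image block is the upper-right vertex of the current image block or a pixel block in the current image block that contains the upper-right vertex of the current image block; and the center pixel sample a1 of the current image block is the center pixel of the current image block or a pixel block in the current image block that contains the center pixel of the current image block.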
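  26. The apparatus according to claim 25, wherein
    the candidate motion information unit set corresponding to the upper-left pixel sample of the current image block includes the motion information units of x1 pixel samples, wherein the x1 pixel samples include at least one pixel sample spatially adjacent to the upper-left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper-left pixel sample of the current image block, and x1 is a positive integer;
    wherein the x1 pixel samples include at least one of: a pixel sample that, in a video frame temporally adjacent to the video frame containing the current image block, has the same position as the upper-left pixel sample of the current image block; the spatially adjacent pixel sample to the left of the current image block; the spatially adjacent pixel sample at the upper left of the current image block; and the spatially adjacent pixel sample above the current image block.
  27. The apparatus according to claim 25 or 26, wherein
    the candidate motion information unit set corresponding to the upper-right pixel sample of the current image block includes the motion information units of x2 pixel samples, wherein the x2 pixel samples include at least one pixel sample spatially adjacent to the upper-right pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the upper-right pixel sample of the current image block, and x2 is a positive integer;
    wherein the x2 pixel samples include at least one of: a pixel sample that, in a video frame temporally adjacent to the video frame containing the current image block, has the same position as the upper-right pixel sample of the current image block; the spatially adjacent pixel sample to the right of the current image block; the spatially adjacent pixel sample at the upper right of the current image block; and the spatially adjacent pixel sample above the current image block.
  28. The apparatus according to any one of claims 25 to 27, wherein
    the candidate motion information unit set corresponding to the lower-left pixel sample of the current image block includes the motion information units of x3 pixel samples, wherein the x3 pixel samples include at least one pixel sample spatially adjacent to the lower-left pixel sample of the current image block and/or at least one pixel sample temporally adjacent to the lower-left pixel sample of the current image block, and x3 is a positive integer;
    wherein the x3 pixel samples include at least one of: a pixel sample that, in a video frame temporally adjacent to the video frame containing the current image block, has the same position as the lower-left pixel sample of the current image block; the spatially adjacent pixel sample to the left of the current image block; the spatially adjacent pixel sample at the lower left of the current image block; and the spatially adjacent pixel sample below the current image block.
  29. The apparatus according to any one of claims 25 to 28, wherein
    the candidate motion information unit set corresponding to the center pixel sample a1 of the current image block includes the motion information units of x5 pixel samples, wherein one of the x5 pixel samples is a pixel sample a2,
    wherein the position of the center pixel sample a1 in the video frame containing the current image block is the same as the position of the pixel sample a2 in a video frame adjacent to the video frame containing the current image block, and x5 is a positive integer.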
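  30. An image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the apparatus comprises:
    a first determining unit, configured to determine the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, wherein each candidate motion information unit set includes at least one motion information unit;
    a second determining unit, configured to determine a merged motion information unit set of the current block, wherein each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward;
    a third determining unit, configured to determine, from the merged motion information unit set, the motion information unit corresponding to each first pixel sample and each second pixel sample;
    a first encoding unit, configured to encode first bitstream information, wherein the first bitstream information represents the motion information unit, determined from the merged motion information unit set, that corresponds to each first pixel sample and each second pixel sample;
    a first assignment unit, configured to use the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample;
    a second assignment unit, configured to use the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample;
    a calculation unit, configured to calculate the differential motion information of the second pixel sample, wherein the differential motion information is the difference between motion information and predicted motion information;
    a second encoding unit, configured to encode second bitstream information, wherein the second bitstream information represents the differential motion information of each second pixel sample; and
    an obtaining unit, configured to obtain a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.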
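  31. An image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the apparatus comprises a processor and a memory coupled to the processor;
    the memory is configured to store code or instructions; and
    the processor is configured to invoke the code or instructions to perform:
    parsing first bitstream information, wherein the first bitstream information indicates the motion information unit corresponding to each first pixel sample and each second pixel sample; obtaining motion information of each first pixel sample and predicted motion information of each second pixel sample based on the parsed first bitstream information, wherein the predicted motion information is prediction information of motion information; parsing second bitstream information, wherein the second bitstream information represents differential motion information of each second pixel sample, and the differential motion information is the difference between motion information and predicted motion information; obtaining motion information of each second pixel sample based on the parsed second bitstream information and the corresponding predicted motion information of each second pixel sample; and obtaining a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.
  32. An image prediction apparatus, wherein a current image block includes at least one first pixel sample and at least one second pixel sample, and the apparatus comprises a processor and a memory coupled to the processor;
    the memory is configured to store code or instructions; and
    the processor is configured to invoke the code or instructions to perform:
    determining the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, wherein each candidate motion information unit set includes at least one motion information unit; determining a merged motion information unit set of the current block, wherein each motion information unit in the merged motion information unit set is one of at least some of the motion information units in the candidate motion information unit set corresponding to each first pixel sample and each second pixel sample, and the motion information of a motion information unit includes a motion vector whose prediction direction is forward and/or a motion vector whose prediction direction is backward; determining, from the merged motion information unit set, the motion information unit corresponding to each first pixel sample and each second pixel sample; encoding first bitstream information, wherein the first bitstream information represents the motion information unit, determined from the merged motion information unit set, that corresponds to each first pixel sample and each second pixel sample; using the motion information of the motion information unit corresponding to the first pixel sample as the motion information of the first pixel sample; using the motion information of the motion information unit corresponding to the second pixel sample as the predicted motion information of the second pixel sample; calculating the differential motion information of the second pixel sample, wherein the differential motion information is the difference between motion information and predicted motion information; encoding second bitstream information, wherein the second bitstream information represents the differential motion information of each second pixel sample; and obtaining a predicted value of the current image block based on a motion model of the current image block, the motion information of each first pixel sample, and the motion information of each second pixel sample.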
PCT/CN2016/083203, filed 2016-05-24 (priority date 2016-05-24): Image prediction method and related device; published as WO2017201678A1 (zh) on 2017-11-30.
