WO2020034921A1 - Method, device, electronic device, and computer-readable storage medium for motion estimation - Google Patents

Info

Publication number
WO2020034921A1
Authority
WO
WIPO (PCT)
Prior art keywords
reference frame
candidate
predicted
candidate reference
matching block
Prior art date
Application number
PCT/CN2019/100236
Other languages
English (en)
French (fr)
Inventor
范娟婷
樊鸿飞
Original Assignee
北京金山云网络技术有限公司
北京金山云科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京金山云网络技术有限公司, 北京金山云科技有限公司
Publication of WO2020034921A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/567 Motion estimation based on rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to a method, a device, an electronic device, and a computer-readable storage medium for motion estimation.
  • each video frame in the video can be divided into multiple image blocks.
  • This image block is also called a coding block.
  • a coding block may be divided into multiple prediction units.
  • the prediction of the coding block may include: performing intra prediction on the coding block and performing inter prediction on the coding block.
  • inter prediction searches the reference image frames for an image block similar to the prediction unit, which is used as a matching block.
  • Inter-prediction may include motion estimation and motion compensation, where motion estimation is a process of searching for a matching block with the lowest rate-distortion cost among candidate reference frames in a candidate reference frame set, and the matching block is the best matching block of the prediction unit.
  • the reference frame where the matching block is located is the best reference frame.
  • in each candidate reference frame, the matching block with the lowest rate-distortion cost in that candidate reference frame may be determined (referred to as the preferred matching block). Then, among the preferred matching blocks corresponding to the candidate reference frames, the preferred matching block with the lowest rate-distortion cost is determined as the best matching block of the prediction unit.
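The two-stage selection just described can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the source names the rate-distortion cost but does not define it, so the usual Lagrangian form J = D + λ·R, the `Match` record, and the parameter `lam` are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Match:
    frame: int         # index of the candidate reference frame (hypothetical id)
    mv: tuple          # motion vector (dx, dy)
    distortion: float  # e.g. SAD between the prediction unit and the matching block
    bits: float        # estimated bits needed to code the motion information

def rd_cost(m, lam):
    # Assumed Lagrangian cost J = D + lambda * R.
    return m.distortion + lam * m.bits

def best_match(matches_per_frame, lam):
    # Stage 1: preferred matching block = lowest-cost block within each frame.
    preferred = [min(ms, key=lambda m: rd_cost(m, lam))
                 for ms in matches_per_frame.values() if ms]
    # Stage 2: best matching block = lowest-cost preferred block overall.
    return min(preferred, key=lambda m: rd_cost(m, lam))

matches = {
    0: [Match(0, (1, 0), 120.0, 4.0), Match(0, (0, 0), 150.0, 1.0)],
    1: [Match(1, (2, 1), 100.0, 9.0), Match(1, (0, 1), 110.0, 3.0)],
}
best = best_match(matches, lam=4.0)
```

With these made-up numbers the preferred block of frame 0 costs 136 and that of frame 1 costs 122, so frame 1 becomes the best reference frame.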
  • the purpose of the embodiments of the present application is to provide a method, a device, an electronic device, and a computer-readable storage medium for motion estimation, which can improve the coding efficiency of a video.
  • the technical scheme is as follows:
  • an embodiment of the present application discloses a method for motion estimation, the method includes: obtaining a reference frame to be predicted from a candidate reference frame set corresponding to a target prediction unit;
  • the reference frame to be predicted is a candidate reference frame in the candidate reference frame set that meets a first preset condition;
  • a pixel search is performed on the reference frame to be predicted according to a pixel search rule corresponding to the reference frame to be predicted, to obtain a candidate matching block; and the matching block with the lowest rate-distortion cost among the candidate matching blocks is determined as the best matching block of the target prediction unit.
  • the first preset condition includes at least one of the following: the number of reference frames in the candidate reference frame set is less than or equal to a first preset threshold and the candidate reference frame is not a specified reference frame in the candidate reference frame set; the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold and the candidate reference frame is in the candidate reference frame set; the candidate reference frame is in the candidate reference frame set; the candidate reference frame is in the candidate reference frame set and the candidate reference frame is not a specified reference frame in the candidate reference frame set; wherein a reference frame included in the candidate reference frame set is a reference frame in which the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block is located, and the image block is an image block that satisfies a preset neighboring condition with the target prediction unit.
  • the method further includes: skipping the motion estimation of the candidate reference frame if the candidate reference frame meets a second preset condition; wherein the second preset condition includes at least one of the following: the number of reference frames in the candidate reference frame set is greater than a first preset threshold and the candidate reference frame is not in the candidate reference frame set; the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a specified reference frame in the candidate reference frame set and the candidate reference frame is not in the candidate reference frame set;
  • wherein a reference frame included in the candidate reference frame set is a reference frame in which the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block is located, and the image block is an image block that satisfies a preset neighboring condition with the target prediction unit.
  • performing a pixel search on the reference frame to be predicted according to the pixel search rule corresponding to the reference frame to be predicted to obtain a candidate matching block includes at least one of the following: in a case where the reference frame to be predicted is not in the candidate reference frame set, performing an integer-pixel search on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block; in a case where the reference frame to be predicted is in the candidate reference frame set, performing an integer-pixel search and a sub-pixel search on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block.
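A small Python sketch of how such a per-frame rule might be chosen. The translated text uses the same name, "candidate reference frame set", for both the list being filtered and the set used in the membership test; this sketch assumes the membership test refers to the set of best reference frames of neighboring image blocks, and the name `neighbor_refs` is hypothetical.

```python
from enum import Enum

class SearchRule(Enum):
    INTEGER_ONLY = "integer-pixel"
    INTEGER_AND_SUBPIXEL = "integer-pixel + sub-pixel"

def pixel_search_rule(frame, neighbor_refs):
    # neighbor_refs: assumed set of reference frames already chosen as best
    # reference frames by neighboring image blocks (hypothetical name).
    if frame in neighbor_refs:
        return SearchRule.INTEGER_AND_SUBPIXEL  # likely match: refine further
    return SearchRule.INTEGER_ONLY              # unlikely match: coarse search only

rules = {f: pixel_search_rule(f, neighbor_refs={0, 2}) for f in [0, 1, 2]}
```

Here frames 0 and 2 would receive the finer search and frame 1 the integer-pixel search only.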
  • Performing a pixel search on the reference frame to be predicted to obtain candidate matching blocks includes: performing a pixel search on a first reference frame to be predicted according to the pixel search rule corresponding to the first reference frame to be predicted, to obtain a first candidate matching block; after the first candidate matching block is obtained, the method further includes: determining that the first candidate matching block is the matching block with the lowest rate-distortion cost among the currently obtained candidate matching blocks, and in a case where the rate-distortion cost of the first candidate matching block is less than a second preset threshold, according to the arrangement order of the first to-
  • an embodiment of the present application further discloses an apparatus for motion estimation.
  • the apparatus includes an acquisition module configured to acquire a reference frame to be predicted from a candidate reference frame set corresponding to a target prediction unit.
  • the reference frame to be predicted is a candidate reference frame in the candidate reference frame set that meets a first preset condition;
  • a first processing module is configured to perform a pixel search on the reference frame to be predicted according to a pixel search rule corresponding to the reference frame to be predicted, to obtain a candidate matching block;
  • a determining module is configured to determine a matching block with the lowest rate distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
  • the first preset condition includes at least one of the following: the number of reference frames in the candidate reference frame set is less than or equal to a first preset threshold and the candidate reference frame is not a specified reference frame in the candidate reference frame set; the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold and the candidate reference frame is in the candidate reference frame set; the candidate reference frame is in the candidate reference frame set; the candidate reference frame is in the candidate reference frame set and the candidate reference frame is not a specified reference frame in the candidate reference frame set; wherein a reference frame included in the candidate reference frame set is a reference frame in which the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block is located, and the image block is an image block that satisfies a preset neighboring condition with the target prediction unit.
  • the apparatus further includes: a second processing module configured to skip motion estimation of the candidate reference frame if the candidate reference frame meets a second preset condition;
  • the second preset condition includes at least one of the following: the number of reference frames in the candidate reference frame set is greater than a first preset threshold and the candidate reference frame is not in the candidate reference frame set; the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a specified reference frame in the candidate reference frame set and the candidate reference frame is not in the candidate reference frame set; wherein a reference frame included in the candidate reference frame set is a reference frame in which the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block is located, and the image block is an image block that satisfies a preset neighboring condition with the target prediction unit.
  • the first processing module is specifically configured to: when the reference frame to be predicted is not in the candidate reference frame set, perform an integer-pixel search on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block; and/or, when the reference frame to be predicted is in the candidate reference frame set, perform an integer-pixel search and a sub-pixel search on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block.
  • when there are multiple reference frames to be predicted, for a first reference frame to be predicted among the multiple reference frames to be predicted, the first processing module is configured to perform a pixel search on the first reference frame to be predicted according to the pixel search rule corresponding to the first reference frame to be predicted, to obtain a first candidate matching block; the apparatus further includes: a third processing module configured to determine that the first candidate matching block is the matching block with the lowest rate-distortion cost among the currently obtained candidate matching blocks, and, when the rate-distortion cost of the first candidate matching block is less than a second preset threshold, to update, according to the arrangement order of the first reference frame to be predicted in the candidate reference frame set, the pixel search rule of each reference frame located after the first reference frame to be predicted to an integer-pixel search.
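The rule update performed by the third processing module can be illustrated with the following Python sketch. It assumes the check is applied to each reference frame to be predicted in turn as it is searched; the function names, the string rule labels, and the demo costs are all hypothetical.

```python
INTEGER, INTEGER_SUB = "integer", "integer+sub-pixel"

def search_in_order(frames, rules, search, second_threshold):
    # frames: reference frames to be predicted, in their arrangement order
    # rules:  dict mapping frame -> pixel search rule (copied, not mutated)
    # search: callback (frame, rule) -> rate-distortion cost of the match found
    rules = dict(rules)
    best = None  # (frame, cost) of the lowest-cost candidate matching block so far
    for pos, frame in enumerate(frames):
        cost = search(frame, rules[frame])
        lowest_so_far = best is None or cost <= best[1]
        if best is None or cost < best[1]:
            best = (frame, cost)
        if lowest_so_far and cost < second_threshold:
            # A good enough match was found: downgrade all later frames
            # to an integer-pixel search only.
            for later in frames[pos + 1:]:
                rules[later] = INTEGER
    return best, rules

# Made-up costs for illustration only.
DEMO_COSTS = {(0, INTEGER_SUB): 90, (1, INTEGER_SUB): 20,
              (2, INTEGER_SUB): 30, (2, INTEGER): 35}

def demo_search(frame, rule):
    return DEMO_COSTS[(frame, rule)]

best, final_rules = search_in_order(
    [0, 1, 2], {0: INTEGER_SUB, 1: INTEGER_SUB, 2: INTEGER_SUB},
    demo_search, second_threshold=25)
```

In this toy run frame 1 yields a cost of 20, below the threshold of 25, so frame 2 is searched at integer-pixel precision only.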
  • an embodiment of the present application further discloses an electronic device; the electronic device includes a memory and a processor; the memory is configured to store a computer program; and the processor is configured to implement the method steps of motion estimation according to the first aspect when executing the program stored in the memory.
  • an embodiment of the present application further discloses a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • when the computer program is executed by a processor, the method steps of motion estimation according to the first aspect are implemented.
  • an embodiment of the present application further discloses a computer program product containing instructions, which when run on a computer, causes the computer to execute the method steps of the motion estimation described in the first aspect.
  • the method, device, electronic device, and computer-readable storage medium for motion estimation provided in the embodiments of the present application obtain, from the candidate reference frame set corresponding to a target prediction unit, a reference frame to be predicted that meets a first preset condition.
  • a pixel search is performed on the reference frame to be predicted according to the pixel search rule corresponding to the reference frame to be predicted to obtain candidate matching blocks, and the matching block with the lowest rate-distortion cost among the candidate matching blocks is determined as the best matching block of the target prediction unit.
  • in this way, a pixel search is performed only on candidate reference frames that meet the first preset condition.
  • FIG. 1 is a flowchart of a motion estimation method in the prior art.
  • FIG. 2 is a schematic diagram of obtaining a matching block of a prediction unit according to an embodiment of the present application.
  • FIG. 3 is a flowchart of a method for motion estimation provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of dividing a coding block according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of different hierarchical coding blocks according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of spatially adjacent blocks according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a search process of a pixel search according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of sub-pixel search according to an embodiment of the present application.
  • FIG. 9 is a flowchart of an example of a method for motion estimation according to an embodiment of the present application.
  • FIG. 10 is a structural diagram of a motion estimation device according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 1 is a flowchart of a motion estimation method in the prior art.
  • a candidate reference frame set corresponding to a target prediction unit is first obtained (S101).
  • a pixel search is performed on each candidate reference frame in the candidate reference frame set to obtain a matching block in the candidate reference frame (S102).
  • Referring to FIG. 2, FIG. 2 is a schematic diagram of obtaining a matching block of a prediction unit according to an embodiment of the present application.
  • P_n, P_{n-1}, P_{n-2}, P_{n-3}, and P_{n-4} represent the serial numbers of the five video frames in FIG. 2
  • P_n is the current video frame to be encoded
  • an image block in P_n is a prediction unit
  • P_{n-1}, P_{n-2}, P_{n-3}, and P_{n-4} are candidate reference frames.
  • the matching block that best matches the prediction unit in the current video frame P_n is searched for in the candidate reference frames P_{n-1}, P_{n-2}, P_{n-3}, and P_{n-4}.
  • the matching block with the lowest rate distortion cost among the matching blocks in each candidate reference frame is determined as the best matching block (S103).
  • that is, a pixel search is performed on each candidate reference frame in the candidate reference frame set to obtain a matching block in each candidate reference frame, and then the matching block with the lowest rate-distortion cost is selected from the plurality of matching blocks as the best matching block of the target prediction unit.
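For reference, the prior-art flow S101 to S103 can be sketched in Python as an exhaustive block-matching search. This is only an illustration under simplifying assumptions: the sum of absolute differences (SAD) stands in for the rate-distortion cost, frames are plain lists of pixel rows, and all function names are hypothetical.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    # Sum of absolute differences between the n x n block of `cur` at (bx, by)
    # and the n x n block of `ref` at (rx, ry).
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))

def full_search(cur, ref, bx, by, n, radius):
    # Integer-pixel full search in a (2*radius+1)^2 window; returns (cost, mv).
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = bx + dx, by + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:
                cost = sad(cur, ref, bx, by, rx, ry, n)
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy))
    return best

def exhaustive_motion_estimation(cur, candidate_frames, bx, by, n, radius):
    # S102: search every candidate frame; S103: keep the lowest-cost match.
    results = [(full_search(cur, ref, bx, by, n, radius), idx)
               for idx, ref in enumerate(candidate_frames)]
    (cost, mv), idx = min(results, key=lambda r: r[0][0])
    return idx, mv, cost

# Toy data: `shifted` is `cur` moved one pixel to the right, `flat` is all zeros.
cur = [[(x * 7 + y * 13) % 32 for x in range(8)] for y in range(8)]
shifted = [[cur[y][x - 1] if x > 0 else 0 for x in range(8)] for y in range(8)]
flat = [[0] * 8 for _ in range(8)]
idx, mv, cost = exhaustive_motion_estimation(cur, [flat, shifted], 2, 2, 4, 2)
```

Every candidate frame is searched here regardless of how likely it is to contain a good match, which is the cost the application sets out to reduce.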
  • the inventors have found that the prior art ignores the relevance of image content, and performs pixel searches for all candidate reference frames, which is cumbersome to operate, resulting in a higher complexity of motion estimation, which in turn reduces the coding efficiency of the video.
  • this application provides a method for motion estimation that can be applied to an electronic device that is used to encode a video.
  • the electronic device may obtain a reference frame to be predicted from the candidate reference frame set corresponding to a target prediction unit, where the reference frame to be predicted is a candidate reference frame in the candidate reference frame set that meets a first preset condition;
  • a pixel search is performed on a reference frame to be predicted according to a pixel search rule corresponding to the reference frame to be predicted to obtain candidate matching blocks, and the matching block with the lowest rate distortion cost among the candidate matching blocks is determined as the best matching block of the target prediction unit.
  • the electronic device performs pixel search only on candidate reference frames that satisfy the first preset condition, instead of traversing all candidate reference frames and performing pixel search on each candidate reference frame, thereby improving the coding efficiency of the video.
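A minimal Python sketch of this filter-then-search flow, assuming the first preset condition and the pixel search are supplied as callbacks. The names `select_and_search`, `meets_first_condition`, and `search`, and the cost table, are hypothetical.

```python
def select_and_search(candidates, meets_first_condition, search):
    # Returns ((frame, cost) of the best match, number of frames searched).
    best, searched = None, 0
    for frame in candidates:
        if not meets_first_condition(frame):
            continue            # not a reference frame to be predicted: skip it
        searched += 1
        cost = search(frame)    # pixel search -> cost of the candidate matching block
        if best is None or cost < best[1]:
            best = (frame, cost)
    return best, searched

# Made-up costs: frame 3 would be cheapest but is filtered out.
costs = {0: 50, 1: 40, 2: 60, 3: 30}
best, searched = select_and_search([0, 1, 2, 3], lambda f: f in {0, 1}, costs.get)
```

Only two of the four frames are searched in this toy run. The skipped frame 3 happens to have the lowest cost, which shows the kind of coding-performance loss that the filter trades for speed.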
  • FIG. 3 is a flowchart of a method for motion estimation according to an embodiment of the present application.
  • the method may include the following steps:
  • S301: Obtain a reference frame to be predicted from the candidate reference frame set corresponding to a target prediction unit.
  • the reference frame to be predicted is a candidate reference frame in the candidate reference frame set that meets the first preset condition.
  • the first preset condition may be set in advance by a technician based on experience.
  • the first preset condition may include at least one of the following:
  • the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold and the candidate reference frame is not a designated reference frame in the candidate reference frame set.
  • the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold and the candidate reference frames are in the candidate reference frame set.
  • the candidate reference frames are in a set of candidate reference frames.
  • the candidate reference frame is in the candidate reference frame set and the candidate reference frame is not a designated reference frame in the candidate reference frame set.
  • the reference frame included in the candidate reference frame set may be a reference frame in which the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block is located, where the image block is an image block that satisfies a preset neighboring condition with the target prediction unit.
  • It can be seen that as long as a candidate reference frame is in the candidate reference frame set, the electronic device can use the candidate reference frame as a reference frame to be predicted. It can be understood that the reference frames to be predicted include image frames that have already been referenced during inter prediction, which further improves the accuracy of the reference frames to be predicted.
  • the reference frame in the candidate reference frame set may be a reference frame of an image block that satisfies a preset neighboring condition with the target prediction unit, so the image block and the target prediction unit have a certain correlation in image content. That is, the present application makes use of the correlation of image content and filters the reference frames in the candidate reference frame set according to the reference frames of the image block, which can further improve the coding performance of the video.
  • An image block that satisfies a preset neighboring condition with a target prediction unit may include: an upper-layer coding block of the coding block where the target prediction unit is located, a lower-layer coding block of the coding block where the target prediction unit is located, and a spatially adjacent block of the target prediction unit.
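One way to collect the best reference frames of these neighboring image blocks is sketched below in Python. The function name and the use of `None` for a block that has no best reference frame (for example because it has not been inter-predicted) are assumptions made for illustration.

```python
def neighbor_reference_frames(upper_block_ref, lower_block_refs, spatial_refs):
    # Each argument holds the best reference frame chosen by a neighboring
    # block: the upper-layer coding block, the lower-layer coding blocks, and
    # the spatially adjacent blocks. None means "no best reference frame".
    refs = set()
    for r in [upper_block_ref, *lower_block_refs, *spatial_refs]:
        if r is not None:
            refs.add(r)
    return refs

neighbor_refs = neighbor_reference_frames(0, [0, 1, None, 0], [2, None, 0, 0, 0])
```

The result is a set, so a reference frame used by several neighboring blocks is counted once.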
  • FIG. 4 is a schematic diagram of dividing a coding block according to an embodiment of the present application.
  • the left side of FIG. 4 is a 2N×2N (N is an integer greater than 1) coding block.
  • the coding block can be divided according to the eight classification examples shown on the right to obtain the corresponding prediction unit.
  • FIG. 5 is a schematic diagram of coding blocks of different levels provided in an embodiment of the present application.
  • the video frame is first divided into coding tree units (CTUs) of equal size, and the coding tree unit is then used as the basic unit for coding.
  • the coding tree unit is generally a 64×64 block.
  • the coding tree unit can be further divided into coding blocks of different sizes.
  • a 64×64 block represents a coding tree unit having a width of 64 pixels and a height of 64 pixels obtained by dividing a video frame.
  • a 64×64 coding tree unit can be encoded as one 64×64 coding block, or it can be divided into 4 equally-sized 32×32 coding blocks, with each 32×32 coding block encoded separately.
  • the rate-distortion cost of the 64×64 coding block can be compared with the sum of the rate-distortion costs of the four 32×32 coding blocks, and the way to divide with the smaller rate-distortion cost is chosen.
  • each 32×32 coding block can in turn be divided into 4 equal-size 16×16 coding blocks; that is, for each 32×32 coding block, the rate-distortion cost of the 32×32 coding block needs to be compared with the sum of the rate-distortion costs of its four 16×16 coding blocks.
  • each 16×16 coding block can also be divided into 4 equal-size 8×8 coding blocks, and whether a 16×16 coding block needs to be divided again is determined by comparing the rate-distortion cost of the 16×16 coding block with the sum of the rate-distortion costs of its four 8×8 coding blocks.
  • the ending condition of the foregoing division may be default or user-defined.
  • the division may be ended by dividing into four 8×8 coding blocks, but it is not limited thereto.
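The recursive comparison of a block's cost against the summed cost of its four sub-blocks can be sketched in Python as follows. `cost_fn` is a hypothetical callback that returns the rate-distortion cost of coding a block without splitting it, and `demo_cost` is a made-up toy model, not a real encoder cost.

```python
def best_partition(x, y, size, cost_fn, min_size=8):
    # Returns (total_cost, leaves); each leaf is an (x, y, size) coding block.
    whole = cost_fn(x, y, size)
    if size <= min_size:                  # ending condition: 8x8 by default
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, leaves = 0.0, []
    for oy in (0, half):
        for ox in (0, half):
            c, sub = best_partition(x + ox, y + oy, half, cost_fn, min_size)
            split_cost += c
            leaves += sub
    if split_cost < whole:                # dividing is cheaper: keep the split
        return split_cost, leaves
    return whole, [(x, y, size)]

def demo_cost(x, y, size):
    # Toy model: a flat cost of 10 per block, plus a penalty for blocks larger
    # than 16x16 that cover a "detailed" pixel at (5, 5).
    covers_detail = x <= 5 < x + size and y <= 5 < y + size
    return 10 + (size * size / 8 if covers_detail and size > 16 else 0)

total, leaves = best_partition(0, 0, 64, demo_cost)
```

In this toy run the 64×64 block is split, its top-left 32×32 block is split again into four 16×16 blocks, and the other three 32×32 blocks are kept whole.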
  • the four 32×32 coding blocks marked in FIG. 5 can be obtained.
  • the four 32×32 coding blocks are called the lower-layer coding blocks of the 64×64 coding block in the figure; correspondingly, the 64×64 coding block in the figure is called the upper-layer coding block of the four 32×32 coding blocks in the figure.
  • the four 16×16 coding blocks obtained by dividing each 32×32 coding block are called the lower-layer coding blocks of that 32×32 coding block, and the 32×32 coding block is called the upper-layer coding block of the four 16×16 coding blocks obtained by the division.
  • the relationship between an upper-layer coding block and its lower-layer coding blocks is also referred to as the relationship between adjacent layers. This relationship exists only between two adjacent layers; the case where the two layers are separated by another layer is not considered.
  • a coding block obtained by a partitioning method may have only lower-layer coding blocks (for example, a 64×64 coding block), only an upper-layer coding block (for example, an 8×8 coding block), or both an upper-layer coding block and lower-layer coding blocks (for example, a 32×32 coding block or a 16×16 coding block).
  • When actually encoding a video frame, the coding tree units in the video frame may be divided from the upper-layer coding blocks to the lower-layer coding blocks, from the lower-layer coding blocks to the upper-layer coding blocks, or starting from a middle layer, to obtain coding blocks; the coding blocks are then predicted to obtain prediction units.
  • FIG. 6 is a schematic diagram of spatially adjacent blocks according to an embodiment of the present application.
  • C is the current prediction unit
  • prediction unit A0, prediction unit A1, prediction unit B0, prediction unit B1, and prediction unit B2 are prediction units that are in the same video frame as the current prediction unit C and are located adjacent to the current prediction unit C.
  • the prediction unit A0, the prediction unit A1, the prediction unit B0, the prediction unit B1, and the prediction unit B2 are called spatially adjacent blocks of the current prediction unit C.
  • the first preset threshold may be set by a technician according to experience, and the first preset threshold may be greater than 1.
  • the first preset condition depends on the first preset threshold: the smaller the first preset threshold, the lower the probability that a candidate reference frame meets the first preset condition, and the fewer the reference frames to be predicted. Coding efficiency is then higher, but coding performance is reduced. Therefore, the first preset threshold may also be determined by weighing coding efficiency against coding performance.
  • the specified reference frame in the candidate reference frame set may be a reference frame located after the preset position in the candidate reference frame set.
  • the preset position can be set by a technician based on experience.
  • for example, the preset position may be the position of the second candidate reference frame in the candidate reference frame set, in which case the specified reference frames are the third through the last candidate reference frames in the set; the preset position may also be the position of the fourth candidate reference frame in the candidate reference frame set, in which case the specified reference frames are the fifth through the last candidate reference frames in the set.
  • the electronic device may determine whether the candidate reference frame meets any one of the above four conditions.
  • when the candidate reference frame meets any one of the four conditions, the electronic device may determine that the candidate reference frame meets the first preset condition, that is, it may take the candidate reference frame as a reference frame to be predicted for subsequent processing.
  • S302 Perform a pixel search on the reference frame to be predicted according to a pixel search rule corresponding to the reference frame to be predicted to obtain a candidate matching block.
  • the pixel search rule may be set by a technician according to business requirements; for example, only an integer-pixel (whole-pixel) search is performed on a reference frame to be predicted, or a sub-pixel search is additionally performed on the basis of the integer-pixel search.
  • the sub-pixel search may include a half-pixel search, a quarter-pixel search, and an eighth-pixel search.
  • FIG. 7 is a schematic diagram of the search process of a pixel search according to an embodiment of the present application.
  • the search range is (2d + 1 + M) × (2d + 1 + N), where the filled block is the prediction unit, the blank block is a matching block of the prediction unit, M is the width of the prediction unit, N is the height of the prediction unit, d is the search window size, and (k + u, l + v) are the coordinates of the upper-left corner of the matching block of the prediction unit.
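The exhaustive search described for FIG. 7 can be sketched as follows. This is a minimal illustration, not the encoder's implementation: the function names `sad` and `full_search` are hypothetical, frames are plain lists of rows, and the sum of absolute differences (SAD) is used as a stand-in for the rate-distortion cost, which the text does not define in closed form. Displacements (u, v) range over [-d, d], and positions that fall outside the reference frame are simply skipped, which is also an assumption.

```python
def sad(cur, ref, k, l, x, y, M, N):
    """Sum of absolute differences between the M x N prediction unit at
    (k, l) in the current frame and the block at (x, y) in the reference."""
    return sum(
        abs(cur[l + j][k + i] - ref[y + j][x + i])
        for j in range(N)
        for i in range(M)
    )


def full_search(cur, ref, k, l, M, N, d):
    """Exhaustive integer-pixel search over displacements in [-d, d].

    Returns (u, v, cost) for the cheapest matching block, whose upper-left
    corner is (k + u, l + v), or None if no position fits in the frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for v in range(-d, d + 1):
        for u in range(-d, d + 1):
            x, y = k + u, l + v
            # Assumption: candidates outside the reference frame are ignored.
            if x < 0 or y < 0 or x + M > width or y + N > height:
                continue
            cost = sad(cur, ref, k, l, x, y, M, N)
            if best is None or cost < best[2]:
                best = (u, v, cost)
    return best
```

The cost function is deliberately isolated in `sad` so that a real rate-distortion cost could be substituted without touching the traversal.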
  • FIG. 8 is a schematic diagram of sub-pixel search according to an embodiment of the present application.
  • the solid dots represent integer pixels, and the solid dot in the center is the best matching point found by the integer-pixel search. Before the sub-pixel search is performed, the sub-pixels are first interpolated, yielding the hollow dots (half-pixel points). With the central solid dot as the center, a full search is performed on the eight surrounding half-pixel points (a full search is an exhaustive search that traverses every pixel point in the search range), and the half-pixel point with the lowest rate-distortion cost is selected as the best matching point (here, the half-pixel point directly above is the best matching point). If quarter-pixel search is supported, the quarter-pixel points (solid triangles) around the current best matching point are interpolated first; then, with the current best matching point as the center, a full search is performed on the eight surrounding quarter-pixel points, and the quarter-pixel point with the lowest rate-distortion cost is selected as the best matching point.
  • in the figure, the quarter-pixel point at the lower right is the best matching point, shown as the hollow triangle. The hollow triangle is the best matching point obtained by the sub-pixel search, that is, it gives the candidate matching block in the current reference frame.
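The half-pixel and quarter-pixel refinement of FIG. 8 can be sketched as below. The sketch assumes positions are expressed in quarter-pixel units (one integer pixel = 4 units) and that a caller-supplied `cost` function returns the rate-distortion cost at a position, which hides the interpolation step; `refine` and `sub_pixel_search` are hypothetical names. The center point is kept among the candidates so that a refinement never makes the result worse, which is an assumption beyond the eight surrounding points named in the text.

```python
def refine(center, step, cost):
    """Evaluate the center and its eight neighbours at distance `step`
    (in quarter-pixel units) and return the cheapest position."""
    cx, cy = center
    candidates = [
        (cx + dx * step, cy + dy * step)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ]
    return min(candidates, key=cost)


def sub_pixel_search(integer_best, cost, quarter=True):
    """Refine an integer-pixel best match to half- and quarter-pixel accuracy.

    `integer_best` is (x, y) in integer pixels; the result is in
    quarter-pixel units.
    """
    position = (integer_best[0] * 4, integer_best[1] * 4)
    position = refine(position, 2, cost)      # half-pixel stage
    if quarter:
        position = refine(position, 1, cost)  # quarter-pixel stage
    return position
```

Each stage evaluates at most nine positions, so the refinement cost is fixed regardless of the integer search window size.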
  • the electronic device may perform a pixel search according to the pixel search rule corresponding to a reference frame to be predicted, and obtain the matching block corresponding to the target prediction unit in the reference frame to be predicted as a candidate matching block. Because there may be multiple reference frames to be predicted, there may also be multiple corresponding candidate matching blocks.
  • S303 Determine the matching block with the lowest rate distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
  • the electronic device may select, from the obtained candidate matching blocks, the candidate matching block with the lowest rate distortion cost as the best matching block of the target prediction unit.
  • the embodiment of the present application performs pixel search only on candidate reference frames that satisfy the first preset condition, rather than traversing all candidate reference frames and performing pixel search on each one, which can reduce the complexity of video coding and improve coding efficiency.
  • the method may further include the step of skipping the motion estimation of the candidate reference frame if the candidate reference frame meets the second preset condition.
  • the second preset condition includes at least one of the following:
  • the number of reference frames in the candidate reference frame set is greater than the first preset threshold and the candidate reference frame is not in the candidate reference frame set;
  • the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold, the candidate reference frame is a designated reference frame in the candidate reference frame set, and the candidate reference frame is not in the candidate reference frame set.
  • the electronic device may determine whether the candidate reference frame satisfies any one of the above two conditions.
  • when the candidate reference frame satisfies either of the two conditions, the electronic device may determine that the candidate reference frame satisfies the second preset condition, that is, it may skip the motion estimation of the candidate reference frame, and then determine the reference frames to be predicted for which motion estimation is required.
  • the electronic device may determine the reference frame to be predicted according to the first preset condition; the electronic device may also determine the reference frame to be predicted according to the second preset condition. In addition, the electronic device may also determine the reference frame to be predicted by combining the first preset condition and the second preset condition.
  • the electronic device may first determine, according to the second preset condition, the candidate reference frames in the candidate reference frame set for which motion estimation is not skipped, and then, among those candidate reference frames, determine each candidate reference frame satisfying the first preset condition as a reference frame to be predicted.
  • alternatively, the electronic device may first determine, according to the first preset condition, the candidate reference frames in the candidate reference frame set that require motion estimation, and then, among those candidate reference frames, take the candidate reference frames that do not satisfy the second preset condition as reference frames to be predicted.
  • the embodiment of the present application does not limit the execution order of the steps in the method for determining a reference frame to be predicted according to the first preset condition and the second preset condition.
  • the electronic device may determine whether the candidate reference frame is in the candidate reference frame set. When the electronic device determines that the candidate reference frame is in a set of candidate reference frames, the electronic device may determine that the candidate reference frame does not satisfy a preset skip condition.
  • the electronic device may determine whether the candidate reference frame satisfies a preset skip condition according to the following manner.
  • the electronic device obtains a first number of reference frames in the set of candidate reference frames. If the first number is greater than a first preset threshold, the electronic device determines that the candidate reference frame meets a preset skip condition.
  • if the first number is not greater than the first preset threshold, the electronic device may further determine whether the candidate reference frame is a designated reference frame in the candidate reference frame set. If the candidate reference frame is a designated reference frame, the electronic device determines that the candidate reference frame satisfies the preset skip condition. If the candidate reference frame is not a designated reference frame, the electronic device determines that the candidate reference frame does not satisfy the preset skip condition.
  • the electronic device may determine candidate reference frames that do not meet the preset skip condition, and then the electronic device may perform pixel search only on the candidate reference frames that do not meet the preset skip condition to improve the coding efficiency of the video.
  • the above-mentioned preset skip condition may be a condition that no pixel search or motion estimation is performed on the candidate reference frame.
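The skip decision described above can be sketched as a single predicate. The sketch rests on one reading of the text, stated here as an assumption: the set used for the membership test and for the count is the set built from the best reference frames of the neighboring image blocks (called `neighbor_best_frames` below, a hypothetical name), while "designated" means the frame sits after a preset position in the ordered candidate list of the target prediction unit. The function name and the 1-based `position` argument are also illustrative.

```python
def meets_skip_condition(frame, position, neighbor_best_frames,
                         first_threshold, preset_position):
    """Return True if motion estimation on `frame` should be skipped.

    frame                -- identifier of the candidate reference frame
    position             -- 1-based position of the frame in the candidate list
    neighbor_best_frames -- set of best reference frames of neighboring blocks
    first_threshold      -- the first preset threshold
    preset_position      -- frames located after this position are "designated"
    """
    # A frame already chosen by a neighboring block is never skipped.
    if frame in neighbor_best_frames:
        return False
    # Many distinct neighbor choices: skip every frame outside the set.
    if len(neighbor_best_frames) > first_threshold:
        return True
    # Otherwise skip only the designated (late-positioned) frames.
    return position > preset_position
```

For example, with `neighbor_best_frames = {0, 2}` and a threshold of 1, frame 1 is skipped while frames 0 and 2 are searched.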
  • the electronic device may determine a pixel search rule of the reference frame to be predicted according to the belonging relationship between the reference frame to be predicted and the set of candidate reference frames, so as to further improve the coding efficiency of the video.
  • step S302 may include the following processing step: when the reference frame to be predicted is not in the set of candidate reference frames, an integer-pixel search is performed on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block.
  • the electronic device may determine whether the reference frame to be predicted is in a set of candidate reference frames.
  • when the electronic device determines that the reference frame to be predicted is not in the set of candidate reference frames, the electronic device performs only an integer-pixel search on the reference frame to be predicted, without a sub-pixel search, and uses the matching block obtained by the integer-pixel search as a candidate matching block.
  • when the reference frame to be predicted is in the set of candidate reference frames, the integer-pixel search and the sub-pixel search are performed on the reference frame to be predicted, and a matching block in the reference frame to be predicted is obtained as a candidate matching block.
  • that is, when the electronic device determines that the reference frame to be predicted is in the set of candidate reference frames, the electronic device performs an integer-pixel search and a sub-pixel search on the reference frame to be predicted, and uses the matching block obtained by the sub-pixel search as a candidate matching block.
  • in this way, for a reference frame to be predicted that is not in the set of candidate reference frames, the electronic device performs only an integer-pixel search and no sub-pixel search, which reduces the complexity of motion estimation and can further improve the coding efficiency of the video.
  • the electronic device can further skip the sub-pixel search processing of some candidate reference frames to further improve the coding efficiency of the video.
  • step S302 may include the following process: perform a pixel search on the first reference frame to be predicted according to the pixel search rule corresponding to the first reference frame to be predicted, to obtain a first candidate matching block.
  • the method may further include the following process: when it is determined that the first candidate matching block is the matching block with the lowest rate-distortion cost among the currently obtained candidate matching blocks, and the rate-distortion cost of the first candidate matching block is less than the second preset threshold, the pixel search rule of each reference frame to be predicted located after the first reference frame to be predicted, according to the arrangement order in the candidate reference frame set, is updated to integer-pixel search.
  • the arrangement order of the reference frames to be predicted may be determined according to the position of each reference frame to be predicted in the candidate reference frame set.
  • the second preset threshold may be set by a technician based on experience.
  • the second preset threshold may be denoted Cost. Cost is a constant related to information such as the quantization parameter; generally speaking, the larger the quantization parameter, the larger Cost.
  • the first reference frame to be predicted may be any one of the multiple reference frames to be predicted; for example, it may be the first reference frame or the second reference frame, which is not limited here.
  • the electronic device may perform a pixel search on the first reference frame to be predicted according to the pixel search rule corresponding to the first reference frame to be predicted, to obtain a matching block in the first reference frame to be predicted (that is, the first candidate matching block). Then, the electronic device can determine whether the first candidate matching block is the matching block with the lowest rate-distortion cost among the currently obtained candidate matching blocks.
  • for example, candidate reference frame P1 is the first candidate reference frame in the candidate reference frame set, candidate reference frame P2 is the second candidate reference frame, and candidate reference frame P3 is the third candidate reference frame; none of P1, P2, and P3 satisfies the preset skip condition. The matching block in P1 is Z1, the matching block in P2 is Z2, and the matching block in P3 is Z3.
  • when the first reference frame to be predicted is P1, the only matching block currently obtained is Z1, so Z1 is the matching block with the lowest rate-distortion cost among the currently obtained matching blocks;
  • when the first reference frame to be predicted is P2, the currently obtained matching blocks are Z1 and Z2, and the electronic device needs to determine whether the rate-distortion cost of Z2 is less than that of Z1;
  • when the first reference frame to be predicted is P3, the currently obtained matching blocks are Z1, Z2, and Z3, and the electronic device needs to determine whether the rate-distortion cost of Z3 is less than those of Z1 and Z2.
  • the electronic device may further determine whether the rate-distortion cost of the first candidate matching block is less than the second preset threshold. If it is, the electronic device may determine the position of the first reference frame to be predicted in the arrangement order of the candidate reference frame set, and update the pixel search rule of each reference frame to be predicted located after it to integer-pixel search. That is, when the electronic device performs motion estimation on a reference frame to be predicted located after the first reference frame to be predicted (which may be referred to as a second reference frame to be predicted), it performs only an integer-pixel search and no sub-pixel search, thereby improving the coding efficiency of the video.
  • after the electronic device has determined that the first candidate matching block is the matching block with the lowest rate-distortion cost among the currently obtained candidate matching blocks and that its rate-distortion cost is less than the second preset threshold, when the electronic device performs an integer-pixel search on the second reference frame to be predicted and obtains a matching block in it (which may be referred to as a second candidate matching block), it does not need to determine whether the second candidate matching block is the matching block with the lowest rate-distortion cost among the currently obtained candidate matching blocks, nor whether the rate-distortion cost of the second candidate matching block is less than the second preset threshold.
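The threshold-triggered rule update can be sketched as a loop over the reference frames to be predicted in their arrangement order. The names `search_with_early_downgrade`, `INTEGER_ONLY`, and `INTEGER_AND_SUB` are hypothetical, and `search` stands for any caller-supplied routine that returns a matching block and its rate-distortion cost for a given frame and rule. Once the condition fires, the two checks are no longer evaluated for later frames, as described above.

```python
INTEGER_ONLY = "integer"
INTEGER_AND_SUB = "integer+sub"


def search_with_early_downgrade(frames, rules, search, second_threshold):
    """Search each frame in order; after a sufficiently good match is found,
    restrict all later frames to integer-pixel search.

    frames -- reference frames to be predicted, in arrangement order
    rules  -- dict mapping each frame to its initial pixel search rule
    search -- callable (frame, rule) -> (block, cost)
    Returns a list of (frame, block, cost) tuples, one per frame.
    """
    rules = dict(rules)  # work on a copy so the caller's rules are untouched
    results = []
    best_cost = None
    downgraded = False
    for index, frame in enumerate(frames):
        block, cost = search(frame, rules[frame])
        results.append((frame, block, cost))
        if downgraded:
            continue  # no further checks once the rules have been updated
        is_lowest = best_cost is None or cost < best_cost
        if is_lowest:
            best_cost = cost
        if is_lowest and cost < second_threshold:
            for later in frames[index + 1:]:
                rules[later] = INTEGER_ONLY
            downgraded = True
    return results
```

The final best matching block would still be chosen afterwards as the lowest-cost entry of the returned list.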
  • FIG. 9 is a flowchart of an example of a motion estimation method according to an embodiment of the present application.
  • the method may include the following steps:
  • S901 Obtain the optimal reference frames used for inter prediction by the image blocks that meet a preset neighboring condition with the target prediction unit, to obtain a set of candidate reference frames.
  • S902 For each candidate reference frame in the candidate reference frame set corresponding to the target prediction unit, determine whether the candidate reference frame is in the candidate reference frame set obtained in S901. If it is, perform S903; if it is not, perform S904.
  • S903 Determine that the candidate reference frame does not satisfy a preset skip condition.
  • S904 Determine whether the first number of candidate reference frames included in the candidate reference frame set is greater than a first preset threshold. If the first number is greater than the first preset threshold, execute S905, and if the first number is not greater than the first preset threshold, execute S906.
  • S905 Determine that the candidate reference frame meets a preset skip condition.
  • S906 Determine whether the candidate reference frame is a designated reference frame in the candidate reference frame set. If the candidate reference frame is not a designated reference frame, perform S903. If the candidate reference frame is a designated reference frame, perform S905.
  • S907 Determine a candidate reference frame that does not satisfy a preset skip condition as a reference frame to be predicted.
  • S908 Determine whether the reference frame to be predicted is in the candidate reference frame set obtained in S901. If the reference frame to be predicted is not in that set, perform S909; if it is in that set, perform S910.
  • S909 Perform an integer-pixel search on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block.
  • S910 Perform an integer-pixel search and a sub-pixel search on the reference frame to be predicted, and obtain a matching block in the reference frame to be predicted as a candidate matching block.
  • S911 Use the matching block with the lowest rate distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
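Steps S908 to S911 can be sketched as follows, assuming the reference frames to be predicted have already been selected in S907. The function name `best_matching_block` and the two callables are hypothetical: `integer_search(frame)` and `sub_pixel_search(frame, block)` each return a `(block, cost)` pair, and `neighbor_best_frames` denotes the set obtained in S901.

```python
def best_matching_block(frames_to_predict, neighbor_best_frames,
                        integer_search, sub_pixel_search):
    """Pick the lowest-cost candidate matching block across all frames.

    Returns (cost, frame, block), or None if there is nothing to search.
    """
    candidates = []
    for frame in frames_to_predict:
        # S909, and the first half of S910: integer-pixel search.
        block, cost = integer_search(frame)
        # S910: only frames in the S901 set also get a sub-pixel search.
        if frame in neighbor_best_frames:
            block, cost = sub_pixel_search(frame, block)
        candidates.append((cost, frame, block))
    if not candidates:
        return None
    # S911: the candidate with the lowest rate-distortion cost wins.
    return min(candidates, key=lambda c: c[0])
```

Passing the searches in as callables keeps the selection logic independent of how the rate-distortion cost is computed.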
  • An embodiment of the present application further provides an optional method for motion estimation.
  • the method may include the following steps:
  • For each candidate reference frame in the candidate reference frame set corresponding to the target prediction unit, determine whether condition 1 is satisfied. If condition 1 is satisfied, skip the motion estimation of the candidate reference frame and move directly to the next candidate reference frame. If condition 1 is not satisfied, determine whether condition 2 is satisfied; if condition 2 is satisfied, skip the motion estimation of the candidate reference frame and move directly to the next candidate reference frame. If condition 2 is not satisfied, determine whether condition 3 is satisfied; if condition 3 is satisfied, skip the sub-pixel search and perform only the integer-pixel search when performing motion estimation on the current candidate reference frame; if condition 3 is not satisfied, all motion search steps need to be completed.
  • condition 1 is that the number of reference frames in the candidate reference frame set is greater than a certain threshold T (T> 1) (equivalent to the first preset threshold) and the candidate reference frame is not in the candidate reference frame set; condition 2
  • Table (1) compares encoding that uses the motion estimation method of an embodiment of the present application with encoding that uses the existing technique.
  • the Resolution column gives the resolution of each image sequence, and the Image sequence column lists image sequences of different video content.
  • the Y (BD-rate), U (BD-rate), V (BD-rate), and YUV (BD-rate) columns give the bit-rate change at equal quality for Y, U, V, and YUV combined (a negative value means a saving, a positive value means an increase). Y denotes luminance (luma), that is, the gray-level value; U and V denote chrominance (chroma), which describes the color and saturation of the image and specifies the color of a pixel.
  • Δfps represents the encoding speed-up, for example, as shown in formula (1), where FPS_anchor is the frame rate (fps) at which the original encoder encodes the image sequence, and FPS_proposed is the frame rate (fps) at which the same encoder encodes the image sequence after adopting the motion estimation method of this embodiment. A positive value of Δfps indicates acceleration, and a negative value indicates deceleration.
  • in the motion estimation method provided by the embodiments of the present application, a reference frame to be predicted that meets the first preset condition is obtained from the set of candidate reference frames corresponding to the target prediction unit; a pixel search is performed on the reference frame to be predicted according to its corresponding pixel search rule to obtain candidate matching blocks; and the matching block with the lowest rate-distortion cost among the candidate matching blocks is determined as the best matching block of the target prediction unit. Based on the above processing, pixel search is performed only on candidate reference frames that satisfy the first preset condition; compared with the conventional method of traversing all candidate reference frames and performing pixel search on each one, this reduces the complexity of video coding and saves encoding time, which can improve video encoding efficiency.
  • FIG. 10 is a structural diagram of a motion estimation device according to an embodiment of the present application.
  • the device includes:
  • the obtaining module 1001 is configured to obtain a to-be-predicted reference frame in a candidate reference frame set corresponding to a target prediction unit; wherein the to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that meets a first preset condition;
  • the first processing module 1002 is configured to perform a pixel search on a reference frame to be predicted according to a pixel search rule corresponding to the reference frame to be predicted to obtain a candidate matching block;
  • the determining module 1003 is configured to determine the matching block with the lowest rate distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
  • the first preset condition includes at least one of the following:
  • the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold and the candidate reference frame is not a designated reference frame in the candidate reference frame set;
  • the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold and the candidate reference frames are in the candidate reference frame set;
  • Candidate reference frames are in a set of candidate reference frames
  • the candidate reference frame is in the candidate reference frame set and the candidate reference frame is not a designated reference frame in the candidate reference frame set;
  • the reference frames included in the candidate reference frame set are the reference frames in which the matching blocks with the lowest rate-distortion cost, among the matching blocks obtained by inter prediction of image blocks, are located, where the image blocks are image blocks that satisfy a preset adjacent condition with the target prediction unit.
  • the device further includes:
  • a second processing module configured to skip motion estimation of the candidate reference frame if the candidate reference frame meets a second preset condition
  • the second preset condition includes at least one of the following: the number of reference frames in the candidate reference frame set is greater than the first preset threshold and the candidate reference frame is not in the candidate reference frame set; the number of reference frames in the candidate reference frame set is less than or equal to the first preset threshold, the candidate reference frame is a specified reference frame in the candidate reference frame set, and the candidate reference frame is not in the candidate reference frame set; wherein the reference frames included in the candidate reference frame set are the reference frames in which the matching blocks with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of image blocks are located, and the image blocks are image blocks that satisfy a preset adjacent condition with the target prediction unit.
  • the first processing module 1002 is specifically configured to: when the reference frame to be predicted is not in the set of candidate reference frames, perform an integer-pixel search on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block; and when the reference frame to be predicted is in the set of candidate reference frames, perform the integer-pixel search and the sub-pixel search on the reference frame to be predicted to obtain a matching block in the reference frame to be predicted as a candidate matching block.
  • when there are multiple reference frames to be predicted, for a first reference frame to be predicted among the multiple reference frames to be predicted, the first processing module 1002 is specifically configured to perform a pixel search on the first reference frame to be predicted according to the pixel search rule corresponding to the first reference frame to be predicted, to obtain a first candidate matching block;
  • the device further includes a third processing module configured to, when it is determined that the first candidate matching block is the matching block with the lowest rate-distortion cost among the currently obtained candidate matching blocks and the rate-distortion cost of the first candidate matching block is less than the second preset threshold, update the pixel search rule of each reference frame to be predicted located after the first reference frame to be predicted, according to the arrangement order in the candidate reference frame set, to integer-pixel search.
  • the motion estimation device obtains a reference frame to be predicted that meets the first preset condition from the set of candidate reference frames corresponding to the target prediction unit, performs a pixel search on the reference frame to be predicted according to its corresponding pixel search rule to obtain candidate matching blocks, and determines the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit. Based on the above processing, pixel search is performed only on candidate reference frames that satisfy the first preset condition; compared with the conventional method of traversing all candidate reference frames and performing pixel search on each one, this reduces the complexity of video coding and saves encoding time, which can improve video encoding efficiency.
  • the above device may be located in a device, such as a terminal, a server, etc., but is not limited thereto.
  • An embodiment of the present application further provides an electronic device, as shown in FIG. 11, including a memory 1101 and a processor 1102.
  • the memory 1101 is configured to store a computer program
  • the processor 1102 is configured to implement the method of motion estimation provided in the embodiment of the present application when the program stored in the memory 1101 is executed.
  • the above motion estimation method includes:
  • the matching block with the lowest rate distortion cost among the candidate matching blocks is determined as the best matching block of the target prediction unit.
  • the electronic device may be provided with a communication interface that enables communication between the electronic device and another device.
  • the aforementioned processor 1102, communication interface, and memory 1101 communicate with each other through a communication bus.
  • the communication bus mentioned here may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the communication bus can be divided into an address bus, a data bus, a control bus, and the like.
  • the memory 1101 may include random access memory (Random Access Memory, RAM for short), and may also include non-volatile memory (Non-Volatile Memory, NVM for short), such as at least one disk memory.
  • the memory may also be at least one storage device located far from the foregoing processor.
  • the above-mentioned processor 1102 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • An embodiment of the present application further provides a computer-readable storage medium.
  • the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the method of motion estimation provided by the embodiments of the present application.
  • the above motion estimation method includes:
  • the matching block with the lowest rate distortion cost among the candidate matching blocks is determined as the best matching block of the target prediction unit.
  • the embodiment of the present application further provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the method of motion estimation provided by the embodiment of the present application.
  • the above motion estimation method includes:
  • the matching block with the lowest rate distortion cost among the candidate matching blocks is determined as the best matching block of the target prediction unit.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media.
  • the available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a Solid State Disk (SSD)), and the like.
  • pixel search is performed only on candidate reference frames that satisfy the first preset condition, whereas the prior art must traverse all candidate reference frames and perform pixel search on each of them.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

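A motion estimation method and apparatus, an electronic device, and a computer-readable storage medium. A to-be-predicted reference frame is obtained from the candidate reference frame set corresponding to a target prediction unit, the to-be-predicted reference frame being a candidate reference frame in the set that satisfies a first preset condition; pixel search is performed on the to-be-predicted reference frame according to the pixel search rule corresponding to it, yielding candidate matching blocks; and the candidate matching block with the lowest rate-distortion cost is determined as the best matching block of the target prediction unit. With this processing, pixel search is performed only on candidate reference frames that satisfy the first preset condition, rather than traversing every candidate reference frame and searching each one, which improves video coding efficiency.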

Description

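Motion estimation method and apparatus, electronic device, and computer-readable storage medium
This application claims priority to Chinese Patent Application No. 201810940267.7, filed with the Chinese Patent Office on August 17, 2018 and entitled "Motion estimation method and apparatus, electronic device, and computer-readable storage medium", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of image processing, and in particular to a motion estimation method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the rapid development of computer network technology, video needs to be encoded before transmission in order to reduce the bandwidth and storage space it occupies. When a video is encoded, each video frame can be divided into multiple image blocks, also called coding blocks. Encoding a coding block requires predicting it, and for this purpose a coding block can be divided into multiple prediction units. Prediction of a coding block may include intra prediction and inter prediction; inter prediction searches a reference frame for an image block similar to the prediction unit and uses it as the matching block. Inter prediction may include motion estimation and motion compensation, where motion estimation is the process of searching the candidate reference frames in a candidate reference frame set for the matching block with the lowest rate-distortion cost. That matching block is the best matching block of the prediction unit, and the reference frame containing it is the best reference frame.
In the related art, the candidate reference frames in the candidate reference frame set are processed in their order in the set. For each candidate reference frame, the matching block with the lowest rate-distortion cost in that frame (which may be called the preferred matching block) is obtained according to a preset pixel search rule. Then, among the preferred matching blocks of all candidate reference frames, the one with the lowest rate-distortion cost is determined as the best matching block of the prediction unit.
As can be seen, the related art must traverse all candidate reference frames and perform pixel search on each of them before the best matching block of the prediction unit can be determined. This makes motion estimation complex and in turn reduces video coding efficiency.
Summary
The purpose of the embodiments of this application is to provide a motion estimation method and apparatus, an electronic device, and a computer-readable storage medium that can improve video coding efficiency. The technical solutions are as follows: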
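In a first aspect, to achieve the above purpose, an embodiment of this application discloses a motion estimation method, the method including: obtaining a to-be-predicted reference frame from the candidate reference frame set corresponding to a target prediction unit, where the to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies a first preset condition; performing pixel search on the to-be-predicted reference frame according to the pixel search rule corresponding to the to-be-predicted reference frame, to obtain candidate matching blocks; and determining the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
In one implementation, the first preset condition includes at least one of the following: the number of reference frames in an alternative reference frame set is less than or equal to a first preset threshold and the candidate reference frame is not a designated reference frame in the candidate reference frame set; the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold and the candidate reference frame is in the alternative reference frame set; the candidate reference frame is in the alternative reference frame set; the candidate reference frame is in the alternative reference frame set and is not a designated reference frame in the candidate reference frame set; where the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being one that satisfies a preset adjacency condition with the target prediction unit.
In one implementation, the method further includes: skipping motion estimation for the candidate reference frame when the candidate reference frame satisfies a second preset condition; where the second preset condition includes at least one of the following: the number of reference frames in an alternative reference frame set is greater than a first preset threshold and the candidate reference frame is not in the alternative reference frame set; the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a designated reference frame in the candidate reference frame set, and the candidate reference frame is not in the alternative reference frame set; where the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being one that satisfies a preset adjacency condition with the target prediction unit.
In one implementation, performing pixel search on the to-be-predicted reference frame according to the pixel search rule corresponding to the to-be-predicted reference frame, to obtain candidate matching blocks, includes at least one of the following: when the to-be-predicted reference frame is not in the alternative reference frame set, performing integer-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block; when the to-be-predicted reference frame is in the alternative reference frame set, performing integer-pixel search and sub-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block.
In one implementation, where there are multiple to-be-predicted reference frames, for a first to-be-predicted reference frame among them, performing pixel search on the to-be-predicted reference frame according to the corresponding pixel search rule, to obtain candidate matching blocks, includes: performing pixel search on the first to-be-predicted reference frame according to the pixel search rule corresponding to the first to-be-predicted reference frame, to obtain a first candidate matching block; and after the first candidate matching block is obtained, the method further includes: when the first candidate matching block is determined to have the lowest rate-distortion cost among the candidate matching blocks obtained so far, and the rate-distortion cost of the first candidate matching block is less than a second preset threshold, updating, according to the position of the first to-be-predicted reference frame in the order of the candidate reference frame set, the pixel search rule of each reference frame located after the first to-be-predicted reference frame to integer-pixel search.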
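In a second aspect, to achieve the above purpose, an embodiment of this application further discloses a motion estimation apparatus, the apparatus including: an obtaining module, configured to obtain a to-be-predicted reference frame from the candidate reference frame set corresponding to a target prediction unit, where the to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies a first preset condition; a first processing module, configured to perform pixel search on the to-be-predicted reference frame according to the pixel search rule corresponding to the to-be-predicted reference frame, to obtain candidate matching blocks; and a determining module, configured to determine the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
In one implementation, the first preset condition includes at least one of the following: the number of reference frames in an alternative reference frame set is less than or equal to a first preset threshold and the candidate reference frame is not a designated reference frame in the candidate reference frame set; the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold and the candidate reference frame is in the alternative reference frame set; the candidate reference frame is in the alternative reference frame set; the candidate reference frame is in the alternative reference frame set and is not a designated reference frame in the candidate reference frame set; where the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being one that satisfies a preset adjacency condition with the target prediction unit.
In one implementation, the apparatus further includes: a second processing module, configured to skip motion estimation for the candidate reference frame when the candidate reference frame satisfies a second preset condition; where the second preset condition includes at least one of the following: the number of reference frames in an alternative reference frame set is greater than a first preset threshold and the candidate reference frame is not in the alternative reference frame set; the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a designated reference frame in the candidate reference frame set, and the candidate reference frame is not in the alternative reference frame set; where the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being one that satisfies a preset adjacency condition with the target prediction unit.
In one implementation, the first processing module is specifically configured to: when the to-be-predicted reference frame is not in the alternative reference frame set, perform integer-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block; and/or, when the to-be-predicted reference frame is in the alternative reference frame set, perform integer-pixel search and sub-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block.
In one implementation, where there are multiple to-be-predicted reference frames, for a first to-be-predicted reference frame among them, the first processing module is configured to perform pixel search on the first to-be-predicted reference frame according to the pixel search rule corresponding to the first to-be-predicted reference frame, to obtain a first candidate matching block; and the apparatus further includes: a third processing module, configured to, when the first candidate matching block is determined to have the lowest rate-distortion cost among the candidate matching blocks obtained so far, and the rate-distortion cost of the first candidate matching block is less than a second preset threshold, update, according to the position of the first to-be-predicted reference frame in the order of the candidate reference frame set, the pixel search rule of each reference frame located after the first to-be-predicted reference frame to integer-pixel search.
In a third aspect, to achieve the above purpose, an embodiment of this application further discloses an electronic device including a memory and a processor; the memory is configured to store a computer program; and the processor is configured to implement the steps of the motion estimation method of the first aspect when executing the program stored in the memory.
In a fourth aspect, to achieve the above purpose, an embodiment of this application further discloses a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the motion estimation method of the first aspect.
In a fifth aspect, to achieve the above purpose, an embodiment of this application further discloses a computer program product containing instructions that, when run on a computer, cause the computer to execute the steps of the motion estimation method of the first aspect.
As the above technical solutions show, the motion estimation method and apparatus, electronic device, and computer-readable storage medium provided by the embodiments of this application obtain, from the candidate reference frame set corresponding to a target prediction unit, the to-be-predicted reference frames that satisfy a first preset condition; perform pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule to obtain candidate matching blocks; and determine the candidate matching block with the lowest rate-distortion cost as the best matching block of the target prediction unit. Because pixel search is performed only on candidate reference frames that satisfy the first preset condition, whereas the prior art must traverse all candidate reference frames and search each one, not every candidate reference frame needs to be searched. This reduces the complexity of video coding, saves coding time, and can improve video coding efficiency.
Of course, a product or method implementing this application does not necessarily need to achieve all of the advantages described above at the same time.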
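Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application and of the prior art more clearly, the drawings needed for the embodiments and the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a motion estimation method in the prior art;
FIG. 2 is a schematic diagram of obtaining the matching block of a prediction unit, provided by an embodiment of this application;
FIG. 3 is a flowchart of a motion estimation method provided by an embodiment of this application;
FIG. 4 is a schematic diagram of partitioning a coding block, provided by an embodiment of this application;
FIG. 5 is a schematic diagram of coding blocks at different levels, provided by an embodiment of this application;
FIG. 6 is a schematic diagram of spatially adjacent blocks, provided by an embodiment of this application;
FIG. 7 is a schematic diagram of the pixel search process, provided by an embodiment of this application;
FIG. 8 is a schematic diagram of sub-pixel search, provided by an embodiment of this application;
FIG. 9 is a flowchart of an example of a motion estimation method provided by an embodiment of this application;
FIG. 10 is a structural diagram of a motion estimation apparatus provided by an embodiment of this application;
FIG. 11 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed Description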
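To make the purpose, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The description begins by comparing the prior-art motion estimation method with the motion estimation method provided by the embodiments of this application.
See FIG. 1, a flowchart of a motion estimation method in the prior art.
In the prior art, the candidate reference frame set corresponding to the target prediction unit is obtained first (S101).
Then, according to a preset precision requirement, pixel search is performed on each candidate reference frame in the candidate reference frame set to obtain the matching block in that candidate reference frame (S102). As an example, see FIG. 2, a schematic diagram of obtaining the matching block of a prediction unit, provided by an embodiment of this application. Here P_n, P_{n-1}, P_{n-2}, P_{n-3}, and P_{n-4} denote the five video frames in FIG. 2. P_n is the video frame currently being encoded, and the image block in P_n is the prediction unit; P_{n-1}, P_{n-2}, P_{n-3}, and P_{n-4} are candidate reference frames. As FIG. 2 shows, the matching blocks that best match the prediction unit of the current video frame P_n are found in candidate reference frames P_{n-1}, P_{n-2}, P_{n-3}, and P_{n-4}.
The matching block with the lowest rate-distortion cost among the matching blocks of all candidate reference frames is determined as the best matching block (S103). The prior art needs to perform pixel search on every candidate reference frame in the candidate reference frame set to obtain the matching block in each one, and then selects the matching block with the lowest rate-distortion cost from these as the best matching block of the target prediction unit.
The inventors found that the prior art ignores the correlation of image content and performs pixel search on all candidate reference frames. This is cumbersome, makes motion estimation complex, and in turn reduces video coding efficiency.
Based on these considerations, this application provides a motion estimation method that can be applied to an electronic device used to encode video. Building on the prior art, the electronic device can obtain the to-be-predicted reference frames from the candidate reference frame set corresponding to the target prediction unit, where a to-be-predicted reference frame is a candidate reference frame in the set that satisfies a first preset condition; perform pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule to obtain candidate matching blocks; and determine the candidate matching block with the lowest rate-distortion cost as the best matching block of the target prediction unit. The electronic device performs pixel search only on candidate reference frames that satisfy the first preset condition, rather than traversing all candidate reference frames and searching each one, and can thereby improve video coding efficiency.
This application is described in detail below through specific embodiments.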
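See FIG. 3, a flowchart of a motion estimation method provided by an embodiment of this application. The method may include the following steps:
S301: Obtain the to-be-predicted reference frames from the candidate reference frame set corresponding to the target prediction unit.
A to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies the first preset condition. The first preset condition may be set in advance by a technician based on experience.
Optionally, the first preset condition may include at least one of the following:
1. The number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, and the candidate reference frame is not a designated reference frame in the candidate reference frame set.
2. The number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, and the candidate reference frame is in the alternative reference frame set.
3. The candidate reference frame is in the alternative reference frame set.
4. The candidate reference frame is in the alternative reference frame set, and the candidate reference frame is not a designated reference frame in the candidate reference frame set.
Here, the reference frames in the alternative reference frame set may be the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being one that satisfies a preset adjacency condition with the target prediction unit. As can be seen, as long as a candidate reference frame is in the alternative reference frame set, the electronic device can use it as a to-be-predicted reference frame. It can be understood that the to-be-predicted reference frames then include image frames already referenced during inter prediction, which improves the accuracy of the to-be-predicted reference frames.
In addition, the reference frames in the alternative reference frame set may be the reference frames of image blocks that satisfy the preset adjacency condition with the target prediction unit, so these image blocks are correlated with the target prediction unit in image content. In other words, this application exploits the correlation of image content: the reference frames in the candidate reference frame set are filtered according to the reference frames of these image blocks, which can further improve video coding performance.
Image blocks that satisfy the preset adjacency condition with the target prediction unit may include: the upper-level coding block of the coding block containing the target prediction unit, the lower-level coding blocks of the coding block containing the target prediction unit, and the spatially adjacent blocks of the target prediction unit.
A target prediction unit is obtained by partitioning the coding block in which it is located. See FIG. 4, a schematic diagram of partitioning a coding block, provided by an embodiment of this application. The left side of FIG. 4 shows a 2N×2N coding block (N being an integer greater than 1), which can be partitioned according to the eight partition examples shown on the right to obtain the corresponding prediction units.
See FIG. 5, a schematic diagram of coding blocks at different levels, provided by an embodiment of this application. When a video frame is encoded, it is first divided into equal-sized coding tree units (CTUs), and encoding then proceeds with the CTU as the basic unit. A CTU is typically a 64×64 block, and during encoding it can be further divided into coding blocks of different sizes. In FIG. 5, the 64×64 block represents a CTU, 64 pixels wide and 64 pixels high, obtained by dividing the video frame. A 64×64 CTU can be encoded as a single 64×64 coding block, or divided into four equal 32×32 coding blocks that are each encoded. Following the rate-distortion criterion, for each 64×64 coding block, the rate-distortion cost of the 64×64 block is compared with the sum of the rate-distortion costs of the four 32×32 blocks, and the partition with the lower cost is chosen. Each 32×32 coding block can in turn be divided into four equal 16×16 coding blocks: the cost of the 32×32 block is compared with the summed cost of the four 16×16 blocks, and the cheaper partition is chosen. Likewise, each 16×16 coding block can be divided into four equal 8×8 coding blocks, and whether to divide is decided by comparing the cost of the 16×16 block with the summed cost of the four 8×8 blocks.
Note that the condition for ending the partitioning may be a default or user-defined; for example, partitioning may be set to stop once four 8×8 coding blocks are reached, but it is not limited to this.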
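The level-by-level comparison described above can be sketched as a small recursion. This is only an illustration under stated assumptions: `rd_cost` is a hypothetical callable that returns the rate-distortion cost of coding one square block whole, the recursion evaluates the cheapest partition of each sub-block before comparing (one possible evaluation order, not necessarily the encoder's), and `min_size=8` mirrors the example stopping point of 8×8.

```python
def best_partition(x, y, size, rd_cost, min_size=8):
    """Return (cost, layout) for the square block whose top-left corner is (x, y).

    rd_cost(x, y, size) is a hypothetical cost function supplied by the caller.
    layout is a list of (x, y, size) blocks that are coded without further splitting.
    """
    whole = rd_cost(x, y, size)
    if size <= min_size:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, split_layout = 0, []
    for dy in (0, half):
        for dx in (0, half):
            cost, layout = best_partition(x + dx, y + dy, half, rd_cost, min_size)
            split_cost += cost
            split_layout += layout
    # Split only when the four sub-blocks together are strictly cheaper.
    if split_cost < whole:
        return split_cost, split_layout
    return whole, [(x, y, size)]
```

Calling `best_partition(0, 0, 64, rd_cost)` on one CTU returns the total cost and the list of coding blocks that result from the chosen partition.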
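For example, if the CTU is partitioned into 32×32 blocks, the four 32×32 coding blocks marked in FIG. 5 are obtained. These four 32×32 coding blocks are called the lower-level coding blocks of the 64×64 coding block in the figure, and correspondingly the 64×64 coding block is called the upper-level coding block of the four 32×32 coding blocks. By analogy, the four 16×16 coding blocks obtained by partitioning a 32×32 coding block are its lower-level coding blocks, and the 32×32 coding block is their upper-level coding block; the four 8×8 coding blocks obtained by partitioning a 16×16 coding block are its lower-level coding blocks, and the 16×16 coding block is their upper-level coding block. This upper-level and lower-level relationship is also called an adjacent-level relationship. It exists only between two adjacent levels; two levels separated by an intermediate level are not considered.
As can be seen from the above, a coding block obtained by a given partition may have only lower-level coding blocks (for example, a 64×64 coding block), only an upper-level coding block (for example, an 8×8 coding block), or both (for example, 32×32 and 16×16 coding blocks). In actual encoding of a video frame, the coding tree units in the frame are partitioned from the upper level down, from the lower level up, or starting from a middle level, to obtain coding blocks, which are then predicted to obtain prediction units.
See FIG. 6, a schematic diagram of spatially adjacent blocks, provided by an embodiment of this application. C is the current prediction unit, and prediction units A0, A1, B0, B1, and B2 are prediction units in the same video frame as C and located adjacent to C. Prediction units A0, A1, B0, B1, and B2 are called the spatially adjacent blocks of the current prediction unit C.
The first preset threshold may be set by a technician based on experience and may be greater than 1. Where the first preset condition involves the first preset threshold, it can be seen that the smaller the threshold, the lower the probability that the first preset condition is met and the fewer the to-be-predicted reference frames; coding efficiency is correspondingly higher, but coding performance decreases. The first preset threshold can therefore be chosen by weighing coding efficiency against coding performance.
The designated reference frames in the candidate reference frame set may be the reference frames located after a preset position in the set. The preset position may be set by a technician based on experience. For example, the preset position may be the position of the second candidate reference frame in the set, in which case the designated reference frames are the third through the last candidate reference frames; the preset position may also be the position of the fourth candidate reference frame, in which case the designated reference frames are the fifth through the last candidate reference frames. The later the preset position, the higher the probability that the first preset condition is met and the more to-be-predicted reference frames there are; coding efficiency is correspondingly lower, but coding performance improves. The preset position can therefore be chosen by weighing coding efficiency against coding performance.
In a specific embodiment of this application, for each candidate reference frame, the electronic device may determine whether the candidate reference frame satisfies any one of the four conditions above. When it does, the electronic device may determine that the candidate reference frame satisfies the first preset condition, that is, it may determine the candidate reference frame as a to-be-predicted reference frame for subsequent processing.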
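As a rough illustration of step S301, the four alternatives of the first preset condition can be written as one predicate. The sketch assumes frames are hashable identifiers, that `preset_position` counts candidate frames from 1, and that the default values of `threshold` and `preset_position` are arbitrary; all function and parameter names here are hypothetical.

```python
def is_designated(index, preset_position):
    """A designated frame lies after the preset position (index is 0-based)."""
    return index + 1 > preset_position


def meets_first_condition(frame, index, alternative_set, threshold, preset_position):
    in_alt = frame in alternative_set
    small = len(alternative_set) <= threshold
    designated = is_designated(index, preset_position)
    return any((
        small and not designated,   # alternative 1
        small and in_alt,           # alternative 2
        in_alt,                     # alternative 3
        in_alt and not designated,  # alternative 4
    ))


def frames_to_predict(candidates, alternative_set, threshold=2, preset_position=2):
    """Keep, in order, the candidate frames that satisfy the first preset condition."""
    return [frame for index, frame in enumerate(candidates)
            if meets_first_condition(frame, index, alternative_set,
                                     threshold, preset_position)]
```

With candidates `["P1", "P2", "P3", "P4"]` and alternative set `{"P1", "P4"}`, the sketch keeps P1 and P4 (they are in the alternative set) and P2 (the set is small and P2 is not designated), and drops P3.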
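S302: Perform pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule, to obtain candidate matching blocks.
The pixel search rule may be set by a technician according to service requirements: for example, performing only integer-pixel search on the to-be-predicted reference frame, or performing sub-pixel search on top of integer-pixel search. Sub-pixel search may include half-pixel, quarter-pixel, and eighth-pixel search.
FIG. 7 is a schematic diagram of the pixel search process provided by an embodiment of this application. The search range is (2d+1+M)×(2d+1+N). Within this range, the filled block is the prediction unit and the blank block is its matching block; M is the width of the prediction unit, N is its height, and d is the search window size. The top-left corner of the prediction unit is at (k, l) and the top-left corner of its matching block is at (k+u, l+v), which gives the motion vector (u, v).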
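A minimal full-search sketch of the integer-pixel step follows, using the same symbols (k, l, M, N, d, u, v). Two things are assumptions rather than part of the description: the sum of absolute differences (SAD) stands in for the rate-distortion cost, and frames are plain lists of rows indexed as `frame[y][x]`, with k taken as the horizontal and l as the vertical coordinate.

```python
def sad(cur, ref, k, l, u, v, M, N):
    """Sum of absolute differences between the M x N prediction unit at (k, l)
    in cur and the block displaced by (u, v) in ref."""
    return sum(abs(cur[l + j][k + i] - ref[l + v + j][k + u + i])
               for j in range(N) for i in range(M))


def integer_search(cur, ref, k, l, M, N, d):
    """Try every displacement in [-d, d] x [-d, d]; return (cost, (u, v))."""
    height, width = len(ref), len(ref[0])
    best = None
    for v in range(-d, d + 1):
        for u in range(-d, d + 1):
            inside = (0 <= k + u and k + u + M <= width and
                      0 <= l + v and l + v + N <= height)
            if not inside:
                continue  # displaced block would fall outside the reference frame
            cost = sad(cur, ref, k, l, u, v, M, N)
            if best is None or cost < best[0]:
                best = (cost, (u, v))
    return best
```

The returned pair holds the lowest cost found and the motion vector (u, v) that produced it; a real encoder would add the bit cost of the motion vector to obtain a rate-distortion cost.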
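See FIG. 8, a schematic diagram of sub-pixel search provided by an embodiment of this application. Solid dots represent integer pixel positions, and the solid dot at the center is the best match found by integer-pixel search. Before sub-pixel search, the sub-pixel positions are interpolated first, namely the hollow dots (half-pixel positions). Centered on the central solid dot, a full search is performed over the eight surrounding half-pixel positions (a full search is an exhaustive search that visits every position in the search range). The half-pixel position with the lowest rate-distortion cost is selected as the best match (here, the half-pixel position at the upper right). If quarter-pixel search is supported, the quarter-pixel positions around the current best match are interpolated (solid triangles), a full search is performed over the eight quarter-pixel positions surrounding the current best match, and the quarter-pixel position with the lowest rate-distortion cost is selected as the best match. Here the quarter-pixel position at the lower right, the hollow triangle in the figure, is the best match. This hollow triangle is then the best match obtained by sub-pixel search, that is, the candidate matching block in the current reference frame.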
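The two refinement rounds (half-pixel, then quarter-pixel) can be sketched as repeated searches over a 3×3 neighbourhood with a shrinking step. The sketch abstracts interpolation away: `cost` is a hypothetical callable that returns the rate-distortion cost at a fractional position. As a further assumption, the current centre stays in the comparison, so a round never returns a position worse than its starting point.

```python
def refine(center, step, cost):
    """Evaluate the centre and its eight neighbours at the given step size."""
    cx, cy = center
    positions = [(cx + dx * step, cy + dy * step)
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return min(positions, key=cost)


def subpixel_search(integer_best, cost, steps=(0.5, 0.25)):
    """Refine the integer-pixel best match at half-pixel, then quarter-pixel precision."""
    best = integer_best
    for step in steps:
        best = refine(best, step, cost)
    return best
```

Passing `steps=(0.5, 0.25, 0.125)` would extend the same loop to eighth-pixel precision.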
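In a specific embodiment of this application, the electronic device may perform pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule, and obtain the matching block in that frame corresponding to the target prediction unit as a candidate matching block. Because there may be multiple to-be-predicted reference frames, there may also be multiple candidate matching blocks.
S303: Determine the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
In a specific embodiment of this application, the electronic device may select, from the candidate matching blocks obtained, the one with the lowest rate-distortion cost as the best matching block of the target prediction unit.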
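Step S303 reduces to taking a minimum over the candidate matching blocks. The sketch below assumes each candidate is represented as a `(frame_id, motion_vector, rd_cost)` tuple, which is a hypothetical layout chosen for illustration, and returns `None` when no candidate exists.

```python
def select_best_matching_block(candidate_blocks):
    """candidate_blocks: list of (frame_id, motion_vector, rd_cost) tuples."""
    if not candidate_blocks:
        return None
    # The best matching block is the candidate with the lowest rate-distortion cost.
    return min(candidate_blocks, key=lambda block: block[2])
```

The frame identifier of the returned tuple then names the best reference frame for the target prediction unit.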
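As can be seen from the above, the embodiments of this application perform pixel search only on candidate reference frames that satisfy the first preset condition, and do not traverse all candidate reference frames and search each one, which reduces the complexity of video coding and improves coding efficiency.
Optionally, the method may further include the following step: skipping motion estimation for a candidate reference frame when the candidate reference frame satisfies a second preset condition.
The second preset condition includes at least one of the following:
1. The number of reference frames in the alternative reference frame set is greater than the first preset threshold, and the candidate reference frame is not in the alternative reference frame set.
2. The number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a designated reference frame in the candidate reference frame set, and the candidate reference frame is not in the alternative reference frame set.
For the alternative reference frame set and the first preset threshold, refer to the detailed description in the embodiments above.
In a specific embodiment of this application, for each candidate reference frame in the candidate reference frame set, the electronic device may determine whether the candidate reference frame satisfies either of the two conditions above. When it does, the electronic device may determine that the candidate reference frame satisfies the second preset condition, that is, it may decide to skip motion estimation for that candidate reference frame, and thereby determine the to-be-predicted reference frames on which motion estimation is needed.
Note that the electronic device may determine the to-be-predicted reference frames according to the first preset condition, or according to the second preset condition. The electronic device may also combine the first and second preset conditions to determine the to-be-predicted reference frames.
For example, the electronic device may use the second preset condition to determine, within the candidate reference frame set, the candidate reference frames whose motion estimation will not be skipped, and then determine, among those, the candidate reference frames that satisfy the first preset condition as the to-be-predicted reference frames.
Alternatively, the electronic device may use the first preset condition to determine, within the candidate reference frame set, the candidate reference frames that need motion estimation, and then determine, among those, the candidate reference frames that do not satisfy the second preset condition as the to-be-predicted reference frames.
The embodiments of this application do not limit the execution order of the steps in the above methods of determining the to-be-predicted reference frames from the first and second preset conditions.
Note that, for each candidate reference frame in the candidate reference frame set corresponding to the target prediction unit, the electronic device may determine whether the candidate reference frame is in the alternative reference frame set. When it is, the electronic device may determine that the candidate reference frame does not satisfy the preset skip condition.
When the electronic device determines that the candidate reference frame is not in the alternative reference frame set, it may determine whether the candidate reference frame satisfies the preset skip condition in the following manner.
Manner 1: the electronic device obtains a first number, the number of reference frames in the alternative reference frame set. If the first number is greater than the first preset threshold, the electronic device determines that the candidate reference frame satisfies the preset skip condition.
Manner 2: if the first number is less than or equal to the first preset threshold, the electronic device may further determine whether the candidate reference frame is a designated reference frame in the candidate reference frame set. If it is a designated reference frame, the electronic device determines that the candidate reference frame satisfies the preset skip condition; if it is not a designated reference frame, the electronic device determines that the candidate reference frame does not satisfy the preset skip condition.
Based on the above processing, the electronic device can determine the candidate reference frames that do not satisfy the preset skip condition, and can then perform pixel search only on those frames, improving video coding efficiency. Note that the preset skip condition may be a condition under which neither pixel search nor motion estimation is performed on a candidate reference frame.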
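The membership check followed by Manner 1 and Manner 2 forms a three-step decision, sketched below. The function name and parameters are hypothetical, `index` is 0-based, and `preset_position` counts frames from 1, so a designated frame is one with `index + 1 > preset_position`.

```python
def should_skip(frame, index, alternative_set, threshold, preset_position):
    """Return True when the candidate frame satisfies the preset skip condition."""
    if frame in alternative_set:
        return False                        # frames in the alternative set are never skipped
    if len(alternative_set) > threshold:
        return True                         # Manner 1: large alternative set, frame not in it
    return index + 1 > preset_position      # Manner 2: skip only designated frames
```

Filtering the candidate reference frame set with `not should_skip(...)` leaves the frames on which pixel search is still performed.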
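In addition, the electronic device may determine the pixel search rule of a to-be-predicted reference frame according to whether that frame belongs to the alternative reference frame set, to further improve video coding efficiency.
Optionally, step S302 may include the following processing: when the to-be-predicted reference frame is not in the alternative reference frame set, performing integer-pixel search on the to-be-predicted reference frame to obtain the matching block in it as a candidate matching block.
In a specific embodiment of this application, for each to-be-predicted reference frame, the electronic device may determine whether the frame is in the alternative reference frame set. When it is not, the electronic device performs only integer-pixel search on the frame, without sub-pixel search, and uses the matching block obtained by integer-pixel search as a candidate matching block.
When the to-be-predicted reference frame is in the alternative reference frame set, integer-pixel search and sub-pixel search are performed on the to-be-predicted reference frame to obtain the matching block in it as a candidate matching block.
In a specific embodiment of this application, when the electronic device determines that the to-be-predicted reference frame is in the alternative reference frame set, it performs integer-pixel search and sub-pixel search on the frame and uses the matching block obtained by sub-pixel search as a candidate matching block.
As can be seen, if a to-be-predicted reference frame is not in the alternative reference frame set, the electronic device performs only integer-pixel search on it, without sub-pixel search, which lowers the complexity of motion estimation and can further improve video coding efficiency.
In addition, the electronic device may skip sub-pixel search for some further candidate reference frames, to further improve video coding efficiency.
Optionally, when there are multiple to-be-predicted reference frames, for a first to-be-predicted reference frame among them, step S302 may include the following processing: performing pixel search on the first to-be-predicted reference frame according to its corresponding pixel search rule, to obtain a first candidate matching block.
Correspondingly, after the first candidate matching block is obtained, the method may further include the following processing: when the first candidate matching block is determined to have the lowest rate-distortion cost among the candidate matching blocks obtained so far, and its rate-distortion cost is less than a second preset threshold, updating, according to the position of the first to-be-predicted reference frame in the order of the candidate reference frame set, the pixel search rule of each reference frame located after the first to-be-predicted reference frame to integer-pixel search.
The position of a to-be-predicted reference frame in the order of the candidate reference frame set may be determined from the order of the to-be-predicted reference frames within the candidate reference frame set. The second preset threshold may be set by a technician based on experience. It may be denoted Cost; Cost is a constant related to information such as the quantization parameter, and in general the larger the quantization parameter, the larger Cost.
In terms of order, the first to-be-predicted reference frame may be any one of the multiple to-be-predicted reference frames, for example the first or the second; the embodiments of this application do not limit this.
In a specific embodiment of this application, the electronic device may perform pixel search on the first to-be-predicted reference frame according to its corresponding pixel search rule to obtain the matching block in it (the first candidate matching block), and then determine whether the first candidate matching block has the lowest rate-distortion cost among the candidate matching blocks obtained so far.
The following example shows how the electronic device may determine whether the first candidate matching block has the lowest rate-distortion cost among the candidate matching blocks obtained so far.
For example, candidate reference frame P_1 is the first candidate reference frame in the candidate reference frame set, P_2 is the second, and P_3 is the third. None of P_1, P_2, and P_3 satisfies the preset skip condition; the matching block in P_1 is Z_1, the matching block in P_2 is Z_2, and the matching block in P_3 is Z_3. When the electronic device performs pixel search on P_1, the only matching block obtained so far is Z_1, so Z_1 has the lowest rate-distortion cost among the matching blocks obtained so far. After the electronic device performs pixel search on P_2, the matching blocks obtained so far are Z_1 and Z_2, and the electronic device needs to determine whether the rate-distortion cost of Z_2 is lower than that of Z_1. After the electronic device performs pixel search on P_3, the matching blocks obtained so far are Z_1, Z_2, and Z_3, and the electronic device needs to determine whether Z_3 has the lowest rate-distortion cost among Z_1, Z_2, and Z_3.
When the electronic device determines that the first candidate matching block has the lowest rate-distortion cost among the candidate matching blocks obtained so far, it may further determine whether that cost is less than the second preset threshold. If it is, the electronic device may determine the position of the first to-be-predicted reference frame in the order of the candidate reference frame set and update the pixel search rule of each reference frame located after it to integer-pixel search. That is, when the electronic device performs motion estimation on a to-be-predicted reference frame located after the first to-be-predicted reference frame (which may be called a second to-be-predicted reference frame), it performs only integer-pixel search, without sub-pixel search, and thereby improves video coding efficiency.
Note that, once the electronic device has determined that the first candidate matching block has the lowest rate-distortion cost among the candidate matching blocks obtained so far and that this cost is less than the second preset threshold, then after it performs integer-pixel search on a second to-be-predicted reference frame and obtains the matching block in it (which may be called a second candidate matching block), it does not need to determine whether the second candidate matching block has the lowest rate-distortion cost among the candidate matching blocks obtained so far, nor whether its rate-distortion cost is less than the second preset threshold.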
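The per-frame search rule and the threshold-triggered downgrade can be combined in one loop, sketched below under stated assumptions. `search(frame, use_sub)` is a hypothetical callable that returns the rate-distortion cost of the matching block found in `frame`, with or without sub-pixel search; the running minimum is still tracked after the downgrade because the final selection in S303 needs it, but the threshold test is not repeated.

```python
def run_search(frames, alternative_set, search, second_threshold):
    """Search the to-be-predicted frames in order.

    Returns ((best_frame, best_cost), log), where log records for each frame
    whether sub-pixel search was used.
    """
    integer_only = False   # becomes True once a good enough match has been found
    best = None
    log = []
    for frame in frames:
        # Frames outside the alternative set get integer-pixel search only.
        use_sub = (frame in alternative_set) and not integer_only
        cost = search(frame, use_sub)
        log.append((frame, use_sub))
        if best is None or cost < best[1]:
            best = (frame, cost)
            if not integer_only and cost < second_threshold:
                integer_only = True   # later frames: integer-pixel search only
    return best, log
```

In the usage below, P2 produces the lowest cost so far and falls under the threshold, so P3 is searched at integer precision even though it is in the alternative set.

```python
costs = {("P1", True): 80, ("P2", True): 40, ("P3", False): 45}
best, log = run_search(["P1", "P2", "P3"], {"P1", "P2", "P3"},
                       lambda frame, use_sub: costs[(frame, use_sub)], 50)
```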
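See FIG. 9, a flowchart of an example of a motion estimation method provided by an embodiment of this application. The method may include the following steps:
S901: Obtain the best reference frames used for inter prediction by the image blocks that satisfy the preset adjacency condition with the target prediction unit, to obtain the alternative reference frame set.
S902: For each candidate reference frame in the candidate reference frame set corresponding to the target prediction unit, determine whether the candidate reference frame is in the alternative reference frame set. If it is, perform S903; if it is not, perform S904.
S903: Determine that the candidate reference frame does not satisfy the preset skip condition.
S904: Determine whether the first number, the number of alternative reference frames in the alternative reference frame set, is greater than the first preset threshold. If the first number is greater than the first preset threshold, perform S905; otherwise, perform S906.
S905: Determine that the candidate reference frame satisfies the preset skip condition.
S906: Determine whether the candidate reference frame is a designated reference frame in the candidate reference frame set. If it is not a designated reference frame, perform S903; if it is a designated reference frame, perform S905.
S907: Determine the candidate reference frames that do not satisfy the preset skip condition as the to-be-predicted reference frames.
S908: Determine whether the to-be-predicted reference frame is in the alternative reference frame set. If it is not, perform S909; if it is, perform S910.
S909: Perform integer-pixel search on the to-be-predicted reference frame to obtain the matching block in it as a candidate matching block.
S910: Perform integer-pixel search and sub-pixel search on the to-be-predicted reference frame to obtain the matching block in it as a candidate matching block.
S911: Use the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
An embodiment of this application further provides an optional motion estimation method, which may include the following steps:
Obtain the best reference frames used for inter prediction by the image blocks that satisfy the preset adjacency condition with the target prediction unit, to obtain the alternative reference frame set;
For each candidate reference frame in the candidate reference frame set corresponding to the target prediction unit, determine whether condition 1 is satisfied. If condition 1 is satisfied, skip motion estimation for the candidate reference frame and move directly to the next candidate reference frame. If condition 1 is not satisfied, determine whether condition 2 is satisfied; if it is, skip motion estimation for the candidate reference frame and move directly to the next candidate reference frame. If condition 2 is not satisfied, determine whether condition 3 is satisfied; if it is, skip sub-pixel search when performing motion estimation on the current candidate reference frame and perform only integer-pixel search. If condition 3 is not satisfied, all motion search steps must be completed. Condition 1 is that the number of reference frames in the alternative reference frame set is greater than a threshold T (T > 1) (equivalent to the first preset threshold above) and the candidate reference frame is not in the alternative reference frame set. Condition 2 is that the current candidate reference frame is the N-th candidate reference frame (equivalent to the designated reference frame above), where N is greater than 2, and the current candidate reference frame is not in the alternative reference frame set. Condition 3 is that the current candidate reference frame is not in the alternative reference frame set.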
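The cascade of conditions 1 to 3 maps onto a short chain of checks that yields one of three plans per candidate frame. The names below are hypothetical; `position` is the 1-based ordinal of the frame in the candidate reference frame set, and condition 2 is read as "the ordinal is greater than `preset_position`", with the default of 2 taken from "N is greater than 2".

```python
def plan_for_candidate(frame, position, alternative_set, T, preset_position=2):
    """Return 'skip', 'integer_only', or 'full' for one candidate reference frame."""
    in_alt = frame in alternative_set
    if len(alternative_set) > T and not in_alt:     # condition 1
        return "skip"
    if position > preset_position and not in_alt:   # condition 2
        return "skip"
    if not in_alt:                                  # condition 3
        return "integer_only"
    return "full"                                   # integer-pixel and sub-pixel search
```

With `T=2` and alternative set `{"P1"}`, the first candidate P1 gets the full search, the second candidate P2 gets integer-pixel search only, and the third candidate P3 is skipped.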
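Based on the motion estimation method of the above embodiments, for each class in the image sequence classification shown in Table (1), corresponding image sequences were selected and coding performance was tested. Table (1) compares encoding with the motion estimation method of the embodiments of this application against encoding with the prior art.
Table (1)
Figure PCTCN2019100236-appb-000001
Figure PCTCN2019100236-appb-000002
The resolution column indicates image sequences of different resolutions, and the image sequence column indicates image sequences of different video content.
For each image sequence class, a different number of image sequences was selected for testing. Each result in the table is the average, over all image sequences in the class, of the comparison between encoding with the motion estimation method of the embodiments of this application and encoding with the prior art. The Y (BD-rate), U (BD-rate), V (BD-rate), and YUV (BD-rate) columns give the bit-rate saving at equal Y, U, V, and combined YUV quality respectively (a negative value indicates a saving, a positive value an increase). Y denotes luminance (luma), that is, the gray-scale value; U and V denote chrominance (chroma), which describes the color and saturation of the image and specifies the color of a pixel. Δfps denotes the encoding speedup, for example as shown in formula (1).
Figure PCTCN2019100236-appb-000003
Here Δfps denotes the encoding speedup, FPS_anchor denotes the frame rate (fps) at which the original encoder encodes the image sequence, and FPS_proposed denotes the frame rate (fps) at which the same encoder encodes the image sequence after adopting the motion estimation method of this embodiment. A positive Δfps indicates a speedup and a negative Δfps a slowdown.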
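Formula (1) survives here only as an image reference. One form consistent with the definitions above (a relative change in frame rate that is positive when the proposed method is faster) would be the following; it is an assumption, not a transcription of the original figure.

```latex
\Delta fps = \frac{FPS_{proposed} - FPS_{anchor}}{FPS_{anchor}} \times 100\%
```

Under this assumed form, an encoder that runs at 50 fps originally and 51 fps with the proposed method would have a Δfps of 2%.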
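The data in Table (1) show that, for every class of image sequence, motion estimation using the method of the embodiments of this application saves a significant amount of coding time, bringing an average gain of about 2.02%.
As can be seen from the above, the motion estimation method of the embodiments of this application obtains, from the candidate reference frame set corresponding to the target prediction unit, the to-be-predicted reference frames that satisfy the first preset condition; performs pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule to obtain candidate matching blocks; and determines the candidate matching block with the lowest rate-distortion cost as the best matching block of the target prediction unit. Because pixel search is performed only on candidate reference frames that satisfy the first preset condition, whereas the prior art traverses all candidate reference frames and searches each one, the complexity of video coding is reduced and coding time is saved, which can improve video coding efficiency.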
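Corresponding to the method embodiment of FIG. 3, see FIG. 10, a structural diagram of a motion estimation apparatus provided by an embodiment of this application. The apparatus includes:
an obtaining module 1001, configured to obtain the to-be-predicted reference frames from the candidate reference frame set corresponding to the target prediction unit, where a to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies the first preset condition;
a first processing module 1002, configured to perform pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule, to obtain candidate matching blocks;
a determining module 1003, configured to determine the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
In a specific embodiment of this application, the first preset condition includes at least one of the following:
the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold and the candidate reference frame is not a designated reference frame in the candidate reference frame set;
the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold and the candidate reference frame is in the alternative reference frame set;
the candidate reference frame is in the alternative reference frame set;
the candidate reference frame is in the alternative reference frame set and is not a designated reference frame in the candidate reference frame set;
where the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being one that satisfies the preset adjacency condition with the target prediction unit.
In a specific embodiment of this application, the apparatus further includes:
a second processing module, configured to skip motion estimation for a candidate reference frame when the candidate reference frame satisfies the second preset condition;
where the second preset condition includes at least one of the following: the number of reference frames in the alternative reference frame set is greater than the first preset threshold and the candidate reference frame is not in the alternative reference frame set; the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a designated reference frame in the candidate reference frame set, and the candidate reference frame is not in the alternative reference frame set; where the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being one that satisfies the preset adjacency condition with the target prediction unit.
In a specific embodiment of this application, the first processing module 1002 is specifically configured to: when the to-be-predicted reference frame is not in the alternative reference frame set, perform integer-pixel search on the to-be-predicted reference frame to obtain the matching block in it as a candidate matching block;
and/or, when the to-be-predicted reference frame is in the alternative reference frame set, perform integer-pixel search and sub-pixel search on the to-be-predicted reference frame to obtain the matching block in it as a candidate matching block.
In a specific embodiment of this application, where there are multiple to-be-predicted reference frames, for a first to-be-predicted reference frame among them, the first processing module 1002 is specifically configured to perform pixel search on the first to-be-predicted reference frame according to its corresponding pixel search rule, to obtain a first candidate matching block;
the apparatus further includes: a third processing module, configured to, when the first candidate matching block is determined to have the lowest rate-distortion cost among the candidate matching blocks obtained so far, and the rate-distortion cost of the first candidate matching block is less than the second preset threshold, update, according to the position of the first to-be-predicted reference frame in the order of the candidate reference frame set, the pixel search rule of each reference frame located after the first to-be-predicted reference frame to integer-pixel search.
As can be seen from the above, the motion estimation apparatus of the embodiments of this application obtains, from the candidate reference frame set corresponding to the target prediction unit, the to-be-predicted reference frames that satisfy the first preset condition; performs pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule to obtain candidate matching blocks; and determines the candidate matching block with the lowest rate-distortion cost as the best matching block of the target prediction unit. Because pixel search is performed only on candidate reference frames that satisfy the first preset condition, whereas the prior art traverses all candidate reference frames and searches each one, the complexity of video coding is reduced and coding time is saved, which can improve video coding efficiency.
Note that the above apparatus may be located in a device such as a terminal or a server, but is not limited to these.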
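An embodiment of this application further provides an electronic device which, as shown in FIG. 11, includes a memory 1101 and a processor 1102;
the memory 1101 is configured to store a computer program;
the processor 1102 is configured to implement the motion estimation method provided by the embodiments of this application when executing the program stored in the memory 1101.
The motion estimation method includes:
obtaining the to-be-predicted reference frames from the candidate reference frame set corresponding to the target prediction unit, where a to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies the first preset condition;
performing pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule, to obtain candidate matching blocks;
determining the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
Note that other implementations of the motion estimation method are the same as in the method embodiments above and are not repeated here.
The electronic device may have a communication interface for communication between the electronic device and other devices.
The processor 1102, the communication interface, and the memory 1101 communicate with one another through a communication bus. The communication bus mentioned here may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on.
The memory 1101 may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor 1102 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
When performing motion estimation, the electronic device provided by the embodiments of this application performs pixel search only on candidate reference frames that satisfy the first preset condition, which reduces the complexity of video coding, saves coding time, and can thereby improve video coding efficiency.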
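An embodiment of this application further provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the motion estimation method provided by the embodiments of this application.
The motion estimation method includes:
obtaining the to-be-predicted reference frames from the candidate reference frame set corresponding to the target prediction unit, where a to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies the first preset condition;
performing pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule, to obtain candidate matching blocks;
determining the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
Note that other implementations of the motion estimation method are the same as in the method embodiments above and are not repeated here.
By running the instructions stored in the computer-readable storage medium provided by the embodiments of this application, pixel search during motion estimation is performed only on candidate reference frames that satisfy the first preset condition, which reduces the complexity of video coding, saves coding time, and can thereby improve video coding efficiency.
An embodiment of this application further provides a computer program product containing instructions that, when run on a computer, cause the computer to execute the motion estimation method provided by the embodiments of this application.
The motion estimation method includes:
obtaining the to-be-predicted reference frames from the candidate reference frame set corresponding to the target prediction unit, where a to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies the first preset condition;
performing pixel search on each to-be-predicted reference frame according to its corresponding pixel search rule, to obtain candidate matching blocks;
determining the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
Note that other implementations of the motion estimation method are the same as in the method embodiments above and are not repeated here.
By running the computer program product provided by the embodiments of this application, pixel search during motion estimation is performed only on candidate reference frames that satisfy the first preset condition, which reduces the complexity of video coding, saves coding time, and can thereby improve video coding efficiency.
The above embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination of these. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a Solid State Disk (SSD)), or the like.
Note that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus, electronic device, computer-readable storage medium, and computer program product embodiments are described relatively briefly because they are substantially similar to the method embodiments; for related details, refer to the corresponding description of the method embodiments.
The above are only preferred embodiments of this application and are not intended to limit its protection scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application falls within the protection scope of this application.
Industrial Applicability
Based on the technical solutions provided by the embodiments of this application, pixel search is performed only on candidate reference frames that satisfy the first preset condition. Compared with the prior art, which must traverse all candidate reference frames and perform pixel search on each one, not every candidate reference frame needs to be searched. This reduces the complexity of video coding, saves coding time, and can improve video coding efficiency.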

Claims (12)

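  1. A motion estimation method, the method comprising:
    obtaining a to-be-predicted reference frame from the candidate reference frame set corresponding to a target prediction unit, wherein the to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies a first preset condition;
    performing pixel search on the to-be-predicted reference frame according to the pixel search rule corresponding to the to-be-predicted reference frame, to obtain candidate matching blocks;
    determining the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
  2. The method according to claim 1, wherein the first preset condition comprises at least one of the following:
    the number of reference frames in an alternative reference frame set is less than or equal to a first preset threshold and the candidate reference frame is not a designated reference frame in the candidate reference frame set;
    the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold and the candidate reference frame is in the alternative reference frame set;
    the candidate reference frame is in the alternative reference frame set;
    the candidate reference frame is in the alternative reference frame set and the candidate reference frame is not a designated reference frame in the candidate reference frame set;
    wherein the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being an image block that satisfies a preset adjacency condition with the target prediction unit.
  3. The method according to claim 1, wherein the method further comprises:
    skipping motion estimation for the candidate reference frame when the candidate reference frame satisfies a second preset condition;
    wherein the second preset condition comprises at least one of the following: the number of reference frames in an alternative reference frame set is greater than a first preset threshold and the candidate reference frame is not in the alternative reference frame set; the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a designated reference frame in the candidate reference frame set, and the candidate reference frame is not in the alternative reference frame set; wherein the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being an image block that satisfies a preset adjacency condition with the target prediction unit.
  4. The method according to claim 2, wherein performing pixel search on the to-be-predicted reference frame according to the pixel search rule corresponding to the to-be-predicted reference frame, to obtain candidate matching blocks, comprises at least one of the following:
    when the to-be-predicted reference frame is not in the alternative reference frame set, performing integer-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block;
    when the to-be-predicted reference frame is in the alternative reference frame set, performing integer-pixel search and sub-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block.
  5. The method according to claim 1, wherein, where there are multiple to-be-predicted reference frames, for a first to-be-predicted reference frame among the multiple to-be-predicted reference frames, performing pixel search on the to-be-predicted reference frame according to the pixel search rule corresponding to the to-be-predicted reference frame, to obtain candidate matching blocks, comprises:
    performing pixel search on the first to-be-predicted reference frame according to the pixel search rule corresponding to the first to-be-predicted reference frame, to obtain a first candidate matching block;
    after performing pixel search on the first to-be-predicted reference frame according to the pixel search rule corresponding to the first to-be-predicted reference frame to obtain the first candidate matching block, the method further comprises:
    when the first candidate matching block is determined to be the matching block with the lowest rate-distortion cost among the candidate matching blocks obtained so far, and the rate-distortion cost of the first candidate matching block is less than a second preset threshold, updating, according to the position of the first to-be-predicted reference frame in the order of the candidate reference frame set, the pixel search rule of each reference frame located after the first to-be-predicted reference frame to integer-pixel search.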
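  6. A motion estimation apparatus, the apparatus comprising:
    an obtaining module, configured to obtain a to-be-predicted reference frame from the candidate reference frame set corresponding to a target prediction unit, wherein the to-be-predicted reference frame is a candidate reference frame in the candidate reference frame set that satisfies a first preset condition;
    a first processing module, configured to perform pixel search on the to-be-predicted reference frame according to the pixel search rule corresponding to the to-be-predicted reference frame, to obtain candidate matching blocks;
    a determining module, configured to determine the matching block with the lowest rate-distortion cost among the candidate matching blocks as the best matching block of the target prediction unit.
  7. The apparatus according to claim 6, wherein the first preset condition comprises at least one of the following:
    the number of reference frames in an alternative reference frame set is less than or equal to a first preset threshold and the candidate reference frame is not a designated reference frame in the candidate reference frame set;
    the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold and the candidate reference frame is in the alternative reference frame set;
    the candidate reference frame is in the alternative reference frame set;
    the candidate reference frame is in the alternative reference frame set and the candidate reference frame is not a designated reference frame in the candidate reference frame set;
    wherein the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being an image block that satisfies a preset adjacency condition with the target prediction unit.
  8. The apparatus according to claim 6, wherein the apparatus further comprises:
    a second processing module, configured to skip motion estimation for the candidate reference frame when the candidate reference frame satisfies a second preset condition;
    wherein the second preset condition comprises at least one of the following: the number of reference frames in an alternative reference frame set is greater than a first preset threshold and the candidate reference frame is not in the alternative reference frame set; the number of reference frames in the alternative reference frame set is less than or equal to the first preset threshold, the candidate reference frame is not a designated reference frame in the candidate reference frame set, and the candidate reference frame is not in the alternative reference frame set; wherein the reference frames included in the alternative reference frame set are the reference frames containing the matching block with the lowest rate-distortion cost among the matching blocks obtained by inter prediction of an image block, the image block being an image block that satisfies a preset adjacency condition with the target prediction unit.
  9. The apparatus according to claim 7, wherein the first processing module is configured to: when the to-be-predicted reference frame is not in the alternative reference frame set, perform integer-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block; and/or, when the to-be-predicted reference frame is in the alternative reference frame set, perform integer-pixel search and sub-pixel search on the to-be-predicted reference frame to obtain the matching block in the to-be-predicted reference frame as a candidate matching block.
  10. The apparatus according to claim 6, wherein, where there are multiple to-be-predicted reference frames, for a first to-be-predicted reference frame among the multiple to-be-predicted reference frames, the first processing module is configured to perform pixel search on the first to-be-predicted reference frame according to the pixel search rule corresponding to the first to-be-predicted reference frame, to obtain a first candidate matching block;
    the apparatus further comprises: a third processing module, configured to, when the first candidate matching block is determined to be the matching block with the lowest rate-distortion cost among the candidate matching blocks obtained so far, and the rate-distortion cost of the first candidate matching block is less than a second preset threshold, update, according to the position of the first to-be-predicted reference frame in the order of the candidate reference frame set, the pixel search rule of each reference frame located after the first to-be-predicted reference frame to integer-pixel search.
  11. An electronic device, comprising a memory and a processor;
    the memory is configured to store a computer program;
    the processor is configured to implement the method steps of any one of claims 1-5 when executing the program stored in the memory.
  12. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the method steps of any one of claims 1-5.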
PCT/CN2019/100236 2018-08-17 2019-08-12 Motion estimation method and apparatus, electronic device, and computer-readable storage medium WO2020034921A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810940267.7A CN110839155B (zh) 2018-08-17 2018-08-17 Motion estimation method and apparatus, electronic device, and computer-readable storage medium
CN201810940267.7 2018-08-17

Publications (1)

Publication Number Publication Date
WO2020034921A1 true WO2020034921A1 (zh) 2020-02-20

Family

ID=69524706

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/100236 WO2020034921A1 (zh) 2018-08-17 2019-08-12 Motion estimation method and apparatus, electronic device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN110839155B (zh)
WO (1) WO2020034921A1 (zh)

Also Published As

Publication number Publication date
CN110839155A (zh) 2020-02-25
CN110839155B (zh) 2021-12-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19850356

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.06.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19850356

Country of ref document: EP

Kind code of ref document: A1