WO2019085636A1 - Method and apparatus for processing video images - Google Patents

Method and apparatus for processing video images

Publication number
WO2019085636A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
estimated
optimal position
optimal
rate distortion
Prior art date
Application number
PCT/CN2018/104073
Other languages
English (en)
French (fr)
Inventor
张宏顺
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP18873854.6A (EP3706420A4)
Priority to JP2019563038A (JP6921461B2)
Priority to KR1020197035773A (KR102276264B1)
Publication of WO2019085636A1
Priority to US16/657,226 (US10944985B2)
Priority to US17/168,086 (US11589073B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/53 Multi-resolution motion estimation; Hierarchical motion estimation
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/533 Motion estimation using multistep search, e.g. 2D-log search or one-at-a-time search [OTS]
    • H04N19/567 Motion estimation based on rate distortion criteria
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks

Definitions

  • the present disclosure relates to the field of computer technology, and more particularly to video image processing.
  • In H.264 the basic coding block (the macroblock) is fixed at 16x16, whereas the HEVC coding unit supports multiple block sizes, ranging from 64x64 down to 8x8.
  • the H.264 coding unit can only adopt the rectangular split mode, but the HEVC coding unit can also adopt the asymmetric split mode. Overall, HEVC can increase the compression ratio by 40% compared to H.264.
  • However, the HEVC encoding protocol places higher performance requirements on the machine. An ordinary machine cannot encode HEVC in real time, which inevitably leads to a decline in video compression performance.
  • Embodiments of the present disclosure provide a method and apparatus for processing a video image, to improve encoding speed and reduce computational complexity while ensuring video compression performance.
  • an embodiment of the present disclosure provides a method for processing a video image, including:
  • performing motion compensation using the optimal position obtained by quarter-pixel estimation as the motion estimation result.
  • an embodiment of the present disclosure further provides a video image processing apparatus, including:
  • an image acquisition module configured to acquire a target image frame from the video image to be encoded;
  • an integer-pixel estimation module configured to perform integer-pixel motion estimation on the target image frame, to obtain the optimal position of the integer-pixel estimation;
  • a first sub-pixel estimation module configured to perform half-pixel estimation on the optimal position obtained by integer-pixel estimation, to obtain the optimal position of the half-pixel estimation;
  • an image partitioning module configured to divide the area surrounding the optimal position obtained by half-pixel estimation into four partitions, wherein each partition includes two half-pixel positions obtained by half-pixel interpolation;
  • a first partition obtaining module configured to obtain, from the four partitions, the position corresponding to the minimum rate distortion cost, and to determine that the partition to which that position belongs is the first partition;
  • a second sub-pixel estimation module configured to perform quarter-pixel estimation in the first partition according to the optimal position obtained by half-pixel estimation, to obtain the optimal position of the quarter-pixel estimation; and
  • a motion compensation module configured to perform motion compensation using the optimal position obtained by quarter-pixel estimation as the motion estimation result.
  • an embodiment of the present disclosure provides a video processing device, where the video processing device includes:
  • a processor, a communication interface, a memory, and a communication bus;
  • the processor, the communication interface, and the memory complete communication with each other through the communication bus;
  • the communication interface is an interface of a communication module;
  • the memory is configured to store program code and to transmit the program code to the processor;
  • the processor is configured to invoke the instructions of the program code in the memory to perform the method of the first aspect.
  • an embodiment of the present disclosure provides a storage medium for storing program code, the program code for performing the method of the first aspect.
  • an embodiment of the present disclosure provides a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the method of the first aspect.
  • the embodiments of the present disclosure have the following advantages:
  • In the embodiments of the present disclosure, quarter-pixel estimation is completed within the first partition according to the optimal position obtained by half-pixel estimation, and the first partition is the partition to which the position corresponding to the minimum rate distortion cost belongs. Quarter-pixel interpolation therefore does not have to be performed over the whole surrounding area, and the rate distortion cost does not have to be calculated for every quarter-pixel position in that area; it only has to be calculated for the quarter-pixel positions in the first partition. This effectively improves encoding speed and reduces computational complexity while ensuring video compression performance, thereby lowering the machine performance required for real-time video encoding.
  • FIG. 1 is a schematic block diagram of a method for processing a video image according to an embodiment of the present disclosure
  • FIG. 2 is a schematic structural diagram of an HEVC coding framework according to an embodiment of the present disclosure
  • FIG. 3a is a schematic diagram of the relationship between the center point position and the surrounding positions in the half-pixel motion estimation process according to an embodiment of the present disclosure
  • FIG. 3b is a schematic diagram of the position identifiers of the calculation points in the half-pixel motion estimation process according to an embodiment of the present disclosure
  • FIG. 3c is a schematic diagram of the partition identifiers of the four blocks when the center point position is the optimal position according to an embodiment of the present disclosure
  • FIG. 3d is a schematic diagram of the position identifiers of the supplementary points when the optimal position in the half-pixel motion estimation process is one of the first four point positions according to an embodiment of the present disclosure
  • FIG. 3e is a schematic diagram of the position identifiers of the supplementary points when the optimal position in the half-pixel motion estimation process is not one of the first four point positions according to an embodiment of the present disclosure
  • FIG. 3f is a schematic diagram of the relationship between the center point position and the surrounding positions before quarter-pixel motion estimation according to an embodiment of the present disclosure
  • FIG. 4 is a schematic structural diagram of a video image processing apparatus according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a first sub-pixel estimation module according to an embodiment of the present disclosure
  • FIG. 4 is a schematic structural diagram of another first sub-pixel estimation module according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a first partition acquiring module according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of another first partition acquiring module according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a second sub-pixel estimation module according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a video image processing method applied to a server according to an embodiment of the present disclosure.
  • Embodiments of the present disclosure provide a method and apparatus for processing a video image, which are used to improve encoding speed and reduce computational complexity in the case of ensuring video compression performance.
  • The video image processing method provided by the embodiments of the present disclosure may be applied to a video processing device. The video processing device may be a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit) with video processing capability; optionally, it may be implemented in a terminal such as a mobile phone or a laptop, or in a server.
  • Motion estimation may include three components: integer-pixel motion estimation, half-pixel motion estimation (Half Motion Estimation, HME), and quarter-pixel motion estimation (Quarter Motion Estimation, QME).
  • HME requires the reference image to be enlarged by a factor of 2, and QME requires it to be enlarged by a factor of 4. The samples at these sub-pixel positions do not exist in the original image, so they must be interpolated from the integer pixels to obtain the reference pixels at the corresponding positions.
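  • As an illustration of how sub-pixel reference samples can be derived from integer pixels, the following is a minimal sketch that doubles the sampling density of a small frame. It uses plain bilinear averaging, which is an assumption made for brevity; the source does not specify the interpolation filter, and practical HEVC encoders use longer-tap filters. The function name `upsample_2x_bilinear` is hypothetical.

```python
def upsample_2x_bilinear(frame):
    """Return a grid with one interpolated sample between each pair of
    neighbouring integer pixels (half-pixel positions).

    Assumption: bilinear averaging stands in for the codec's real
    interpolation filter, which the source does not describe.
    """
    h, w = len(frame), len(frame[0])
    out = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(2 * h - 1):
        for x in range(2 * w - 1):
            y0, x0 = y // 2, x // 2
            # Odd coordinates lie halfway between two integer pixels.
            y1 = min(y0 + (y % 2), h - 1)
            x1 = min(x0 + (x % 2), w - 1)
            out[y][x] = (frame[y0][x0] + frame[y0][x1]
                         + frame[y1][x0] + frame[y1][x1]) / 4.0
    return out


half_pel_grid = upsample_2x_bilinear([[0, 4], [8, 12]])
# Applying the same function again gives a quarter-pixel grid.
quarter_pel_grid = upsample_2x_bilinear(half_pel_grid)
```

  • Even-indexed samples in the result are the original integer pixels; odd-indexed samples are the interpolated half-pixel positions. Under this sketch, the quarter-pixel grid is obtained by a second pass over the half-pixel grid.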
  • a method for processing a video image may include the following steps:
  • Motion estimation of an image frame may be done based on the HEVC coding framework in embodiments of the present disclosure.
  • the target image frame is first extracted, for example, the target image frame can be read from the frame buffer.
  • FrameBuffer is a display buffer, and writing data in a specific format to the display buffer means outputting content to the screen.
  • The frame buffer can reside anywhere in system memory and has an address in memory; the video controller refreshes the screen by accessing the frame buffer.
  • Integer-pixel motion estimation can then be performed on the target image frame to obtain the optimal position of the integer-pixel estimation.
  • This optimal position is the integer-pixel position with the minimum rate distortion cost (Rate Distortion Cost) among a plurality of integer-pixel positions. During integer-pixel estimation, the rate distortion cost of each candidate integer-pixel position is calculated, and the integer-pixel position corresponding to the minimum rate distortion cost is the optimal position obtained by integer-pixel estimation.
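  • The source does not give a formula for the rate distortion cost. A commonly used form is J = D + λ·R, where D is a distortion measure, R is an estimate of the bits needed to code the motion vector, and λ is a weighting factor. The sketch below assumes that form, with the sum of absolute differences (SAD) as D; the names `sad`, `rd_cost`, and `best_position` and the default `lam` value are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def rd_cost(current, candidate, mv_bits, lam=4.0):
    """Assumed cost model J = D + lam * R (not taken from the source)."""
    return sad(current, candidate) + lam * mv_bits


def best_position(costs):
    """Return the position whose cost is smallest.

    `costs` maps a position tuple to its rate distortion cost; ties go to
    the position that was inserted first.
    """
    return min(costs, key=costs.get)


current = [[1, 2], [3, 4]]
costs = {
    (0, 0): rd_cost(current, [[1, 1], [5, 4]], mv_bits=2),
    (1, 0): rd_cost(current, [[1, 2], [3, 5]], mv_bits=2),
}
optimal = best_position(costs)
```

  • The same "evaluate every candidate, keep the minimum" pattern applies unchanged to the half-pixel and quarter-pixel stages; only the candidate set differs.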
  • Half-pixel estimation is then performed on the optimal position obtained by integer-pixel estimation in the target image frame, to obtain the optimal position of the half-pixel estimation.
  • This optimal position is the half-pixel position with the minimum rate distortion cost among a plurality of half-pixel positions. During half-pixel estimation, the rate distortion cost of each candidate half-pixel position is calculated, and the half-pixel position corresponding to the minimum rate distortion cost is the optimal position obtained by half-pixel estimation.
  • In some embodiments, step 103, performing half-pixel estimation on the optimal position obtained by integer-pixel estimation to obtain the optimal position of the half-pixel estimation, includes:
  • Step A1: Acquire the four half-pixel positions adjacent to the optimal position obtained by integer-pixel estimation, namely the positions directly above, directly below, directly to the left of, and directly to the right of that optimal position;
  • Step A2: Obtain the position corresponding to the first minimum rate distortion cost from among the optimal position obtained by integer-pixel estimation and the four half-pixel positions;
  • Step A3: When the position corresponding to the first minimum rate distortion cost is the optimal position obtained by integer-pixel estimation, determine that the optimal position of the half-pixel estimation is the optimal position obtained by integer-pixel estimation.
  • Take first the case in which the center point of the half-pixel motion estimation is the optimal position obtained by integer-pixel estimation.
  • In step A1, four pixel positions can be determined directly above, directly below, directly to the left of, and directly to the right of the optimal position obtained by integer-pixel estimation. That is, half-pixel interpolation is performed at the four positions surrounding that optimal position, and the interpolation result yields the four half-pixel positions adjacent to it. Here, half-pixel interpolation means interpolating the half-pixel positions around the optimal position obtained by integer-pixel estimation.
  • In step A2, the rate distortion cost of five pixel positions is calculated: the optimal position obtained by integer-pixel estimation and the four half-pixel positions. The position corresponding to the minimum of the five calculated costs is selected, and that minimum is the first minimum rate distortion cost.
  • In step A3, when the position corresponding to the first minimum rate distortion cost is the optimal position obtained by integer-pixel estimation, the four half-pixel positions around it have already been obtained by the interpolation in step A1.
  • In other embodiments, step 103, performing half-pixel estimation on the optimal position obtained by integer-pixel estimation to obtain the optimal position of the half-pixel estimation, includes:
  • Step A1: Acquire the four half-pixel positions adjacent to the optimal position obtained by integer-pixel estimation, namely the positions directly above, directly below, directly to the left of, and directly to the right of that optimal position;
  • Step A2: Obtain the position corresponding to the first minimum rate distortion cost from among the optimal position obtained by integer-pixel estimation and the four half-pixel positions;
  • Step A4: When the position corresponding to the first minimum rate distortion cost is the first half-pixel position, acquire the two half-pixel positions adjacent to the first half-pixel position. These two positions lie on the same axis as the first half-pixel position, and the first half-pixel position is the one of the four half-pixel positions with the lowest rate distortion cost;
  • Step A5: Obtain the position corresponding to the second minimum rate distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to it;
  • Step A6: When the position corresponding to the second minimum rate distortion cost is the first half-pixel position, determine that the optimal position of the half-pixel estimation is the first half-pixel position; or
  • Step A7: When the position corresponding to the second minimum rate distortion cost is the third half-pixel position, determine that the optimal position of the half-pixel estimation is the third half-pixel position, where the third half-pixel position is the one of the two half-pixel positions adjacent to the first half-pixel position with the lower rate distortion cost.
  • In step A1, four pixel positions can be determined directly above, directly below, directly to the left of, and directly to the right of the optimal position obtained by integer-pixel estimation. That is, half-pixel interpolation is performed at the four positions surrounding that optimal position, and the interpolation result yields the four half-pixel positions adjacent to it. Here, half-pixel interpolation means interpolating the half-pixel positions around the optimal position obtained by integer-pixel estimation.
  • In step A2, the rate distortion cost of five pixel positions is calculated: the optimal position obtained by integer-pixel estimation and the four half-pixel positions. The position corresponding to the minimum of the five calculated costs is selected.
  • When that position is one of the four half-pixel positions, steps A4 to A7 may be triggered to continue the search for the optimal position of the half-pixel estimation.
  • The half-pixel position with the lowest rate distortion cost among the four half-pixel positions is defined as the "first half-pixel position". Two half-pixel positions adjacent to the first half-pixel position are then acquired. These two positions lie on the same axis as the first half-pixel position, which may be the horizontal axis or the vertical axis: the two interpolated half-pixel positions are either in the horizontal direction or in the vertical direction of the first half-pixel position.
  • After the half-pixel interpolation result is obtained in step A4, the rate distortion cost of three pixel positions is calculated: the first half-pixel position and the two half-pixel positions adjacent to it. The position corresponding to the minimum of the three calculated costs is selected. Because rate distortion costs are compared several times in the embodiments of the present disclosure, the minimum obtained in step A5 is defined as the "second minimum rate distortion cost", and step A6 or A7 is performed according to the position to which it corresponds.
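  • Steps A1 to A7 can be sketched as follows. Positions are expressed in half-pixel units with y increasing downward, and `cost` is a hypothetical callable that returns the rate distortion cost of a position. One interpretive assumption is made: the two positions "on the same axis" as the first half-pixel position are taken to be its two neighbours perpendicular to the direction from the center, which is consistent with the later statement that three of its four direct neighbours are then already known.

```python
def half_pel_search(center, cost):
    """Two-stage half-pixel search (steps A1 to A7), as read from the text.

    center: integer-pixel optimum, in half-pixel units, as an (x, y) tuple.
    cost:   hypothetical callable mapping a position to its RD cost.
    Returns (best_position, costs) where costs holds every evaluated position.
    """
    cx, cy = center
    # A1: the four half-pixel positions above, below, left, and right.
    cross = [(cx, cy - 1), (cx, cy + 1), (cx - 1, cy), (cx + 1, cy)]
    costs = {p: cost(p) for p in [center] + cross}
    # A2: position with the first minimum RD cost (ties favour the center).
    best = min(costs, key=costs.get)
    if best == center:
        return center, costs  # A3
    # A4: evaluate the two neighbours of the first half-pixel position.
    fx, fy = first = best
    if fx == cx:   # first is above or below the center
        side = [(fx - 1, fy), (fx + 1, fy)]
    else:          # first is left or right of the center
        side = [(fx, fy - 1), (fx, fy + 1)]
    for p in side:
        costs[p] = cost(p)
    # A5: second minimum among the first position and its two neighbours.
    best = min([first] + side, key=costs.get)
    return best, costs  # A6 (best == first) or A7 (best is a neighbour)


def toy_cost(target):
    """Toy cost for demonstration only: squared distance to a target."""
    return lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2


best, evaluated = half_pel_search((0, 0), toy_cost((-1, -1)))
```

  • With this reading, the search evaluates five positions when the center wins and seven otherwise, instead of all nine positions of the 3x3 half-pixel neighbourhood.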
  • The area surrounding the optimal position obtained by half-pixel estimation may be centered on that optimal position and divided along the horizontal-axis and vertical-axis directions into four partitions. The surrounding area refers to the pixel positions in the eight directions (up, down, left, right, upper left, upper right, lower left, and lower right) around the optimal position as the center point.
  • As described above, half-pixel interpolation results are obtained during half-pixel estimation, so each partition can contain half-pixel positions obtained by half-pixel interpolation.
  • In some embodiments, step 104, dividing the area surrounding the optimal position obtained by half-pixel estimation into four partitions, includes:
  • when the position corresponding to the first minimum rate distortion cost is the optimal position obtained by integer-pixel estimation, dividing the area surrounding that optimal position into four partitions, wherein each partition is the area delimited by the optimal position obtained by integer-pixel estimation and two of the four half-pixel positions.
  • In this case the optimal position of the half-pixel estimation is the optimal position obtained by integer-pixel estimation, so the area surrounding the latter can be divided into four partitions. The four half-pixel positions were already interpolated when the half-pixel interpolation result was obtained in step A1. Two of the four half-pixel positions together with the center point delimit one area, and each area delimited in this way is defined as one partition.
  • In other embodiments, step 104, dividing the area surrounding the optimal position obtained by half-pixel estimation into four partitions, includes:
  • when the position corresponding to the second minimum rate distortion cost is the first half-pixel position, the optimal position of the half-pixel estimation is determined to be the first half-pixel position. Of the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the first half-pixel position, only three have been interpolated, so the second half-pixel position adjacent to the first half-pixel position still has to be acquired. The second half-pixel position is the one of those four positions that has not yet been interpolated.
  • The area surrounding the first half-pixel position is then divided into four partitions. Once the half-pixel interpolation results of step A4 and step C1 are available, the four half-pixel positions around the first half-pixel position have all been interpolated. Two of the four half-pixel positions together with the center point delimit one area, and each area delimited in this way is defined as one partition.
  • In other embodiments, step 104, dividing the area surrounding the optimal position obtained by half-pixel estimation into four partitions, includes:
  • acquiring the two fourth half-pixel positions adjacent to the third half-pixel position, where a fourth half-pixel position is a position that has not been interpolated among the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the third half-pixel position;
  • The half-pixel position with the lower rate distortion cost of the two half-pixel positions adjacent to the first half-pixel position is defined as the "third half-pixel position". When the position corresponding to the second minimum rate distortion cost is the third half-pixel position, only two of the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the third half-pixel position have been interpolated, so the two fourth half-pixel positions adjacent to the third half-pixel position have to be acquired.
  • The area surrounding the third half-pixel position is then divided into four partitions. Once the half-pixel interpolation results of step A4 and step D1 are available, four half-pixel positions have been interpolated around the third half-pixel position: the two already-interpolated positions adjacent to it and the two fourth half-pixel positions. Two of the four half-pixel positions together with the center point delimit one area, and each area delimited in this way is defined as one partition.
  • Through step 104, the area surrounding the optimal position obtained by half-pixel estimation is divided into four partitions, and each partition contains two half-pixel positions obtained by half-pixel interpolation.
  • The rate distortion cost can be calculated for each of the four half-pixel positions, the minimum is selected from the four costs, and the partition to which the position corresponding to the minimum rate distortion cost belongs is the first partition. Because the region for quarter-pixel motion estimation is decided from the costs of the four surrounding positions computed during half-pixel motion estimation, only positions in the first partition need to be interpolated. This greatly reduces the interpolation range of quarter-pixel motion estimation, which improves encoding speed and reduces computational complexity.
  • In some embodiments, step 105, determining from the four partitions the first partition for quarter-pixel estimation based on the rate distortion costs of the four half-pixel positions adjacent to the optimal position obtained by half-pixel estimation, includes the following.
  • The area surrounding the optimal position obtained by half-pixel estimation is divided into an upper right area, an upper left area, a lower right area, and a lower left area. The rate distortion costs of the half-pixel positions directly above and directly below are compared with each other, and the rate distortion costs of the half-pixel positions directly to the left and directly to the right are compared with each other:
  • if the cost of the position directly above is less than or equal to the cost of the position directly below, and the cost of the left position is less than or equal to the cost of the right position, the first partition is the upper left area;
  • if the cost of the position directly above is less than or equal to the cost of the position directly below, and the cost of the left position is greater than the cost of the right position, the first partition is the upper right area;
  • if the cost of the position directly above is greater than the cost of the position directly below, and the cost of the left position is less than or equal to the cost of the right position, the first partition is the lower left area;
  • if the cost of the position directly above is greater than the cost of the position directly below, and the cost of the left position is greater than the cost of the right position, the first partition is the lower right area.
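  • The four comparison rules above reduce to two independent decisions, one per axis. A minimal sketch, in which the function name `select_first_partition` and the string labels are hypothetical:

```python
def select_first_partition(cost_up, cost_down, cost_left, cost_right):
    """Pick the partition for quarter-pixel estimation from the RD costs of
    the four half-pixel positions around the half-pixel optimum.

    Ties resolve toward "upper" and "left", matching the "less than or
    equal to" wording of the rules.
    """
    vertical = "upper" if cost_up <= cost_down else "lower"
    horizontal = "left" if cost_left <= cost_right else "right"
    return f"{vertical} {horizontal}"


first_partition = select_first_partition(
    cost_up=10, cost_down=12, cost_left=9, cost_right=7
)
```

  • No additional rate distortion costs are computed at this step: the four inputs are values already produced during half-pixel estimation.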
  • Through the foregoing steps, the first partition is determined from the optimal position obtained by half-pixel estimation and the position corresponding to the minimum rate distortion cost. QME in HEVC encoding therefore only needs to be performed within the first partition rather than at all surrounding positions, which greatly reduces the interpolation range of quarter-pixel motion estimation, thereby improving encoding speed and reducing computational complexity.
  • step 106 performs a quarter-pixel estimation in the first partition based on the estimated optimal position of the one-half pixel, resulting in an optimal position estimated by the quarter-pixel, including :
  • E2 obtaining a position corresponding to a third minimum rate distortion cost from an optimal position estimated by one-half pixel and three quarter-pixel positions;
  • step E3 only one quarter interpolation of three surrounding positions needs to be performed in the first partition to obtain three adjacent to the optimal position estimated by the one-half pixel.
  • One quarter pixel position, three quarter pixel positions are located in the first partition.
  • the rate distortion costs of the following four pixel positions need to be calculated: the optimal position estimated by the half pixel and the three quarter-pixel positions. The position corresponding to the minimum rate distortion cost is selected from the four calculated costs. Because rate distortion costs are calculated at several points in the embodiments of the present disclosure,
  • the minimum rate distortion cost in step E2 is defined as the "third minimum rate distortion cost".
  • the position corresponding to the third minimum rate distortion cost is the optimal position estimated by the quarter pixel.
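The quarter-pixel stage described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the partition numbering (0 = upper-left, 1 = upper-right, 2 = lower-left, 3 = lower-right), the choice of the three candidates as the horizontal, vertical, and diagonal quarter-pixel neighbours inside the partition, and the names `QUARTER_CANDIDATES`, `quarter_pel_search`, and `cost_at` are all assumptions. Coordinates are in quarter-pixel units, with x growing to the right and y growing downward.

```python
# Hypothetical offsets of the three quarter-pixel candidates per partition,
# relative to the half-pixel optimum (quarter-pixel units, y grows downward).
QUARTER_CANDIDATES = {
    0: [(-1, 0), (0, -1), (-1, -1)],  # upper-left
    1: [(1, 0), (0, -1), (1, -1)],    # upper-right
    2: [(-1, 0), (0, 1), (-1, 1)],    # lower-left
    3: [(1, 0), (0, 1), (1, 1)],      # lower-right
}


def quarter_pel_search(half_best, partition, cost_at):
    """Return the cheapest of the half-pixel optimum and three candidates.

    cost_at is a caller-supplied function mapping a position to its
    rate distortion cost; only four costs are compared in total.
    """
    hx, hy = half_best
    candidates = [(hx + dx, hy + dy) for dx, dy in QUARTER_CANDIDATES[partition]]
    # Listing half_best first means ties keep the half-pixel optimum.
    return min([half_best] + candidates, key=cost_at)
```

With a toy cost such as `lambda p: (p[0] - 1) ** 2 + (p[1] - 1) ** 2`, a search from `(0, 0)` in partition 3 moves to `(1, 1)`, while the same search in partition 0 stays at `(0, 0)`.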
  • motion compensation is performed using the optimal position estimated by the quarter pixel as the motion estimation result.
  • motion compensation can be performed based on the optimal position estimated by the quarter pixel; for the motion compensation method, refer to the prior art, and details are not repeated here.
  • predictive coding can then be performed. It mainly exploits the spatial and temporal correlation of the video, removing spatial and temporal redundancy by intra prediction and inter prediction respectively to obtain a predicted image block.
  • the predicted image block is subtracted from the original image block to obtain a prediction residual block, and the discrete cosine transform (DCT) and quantization are then applied to the prediction residual to obtain quantized DCT coefficients.
  • quantized DCT coefficients are entropy encoded to obtain a compressed code stream.
  • the quarter-pixel estimation in the embodiments of the present disclosure is completed within the first partition according to the optimal position estimated by the half pixel, and the first partition is the partition to which the position with the minimum rate distortion cost belongs. Quarter-pixel estimation therefore does not need to interpolate over the entire region, nor to calculate the rate distortion cost of every quarter-pixel position in the entire region.
  • the rate distortion cost needs to be calculated only for the quarter-pixel positions in the first partition, so the coding speed is effectively improved and the computational complexity reduced while video compression performance is maintained, which lowers the machine-performance requirements of real-time video encoding.
  • FIG. 2 is a schematic structural diagram of an HEVC coding framework provided by an embodiment of the present disclosure.
  • One frame image is read from the frame buffer and then sent to the encoder.
  • the intra-frame or inter-frame prediction is used to obtain the predicted value.
  • intra prediction interpolates the predicted pixels from the surrounding pixels, referring to information in the spatial domain.
  • inter prediction finds the position in the reference frame that best matches the target block, referring to information in the time domain. Inter prediction may include motion estimation (Motion Estimation, ME) and motion compensation (Motion Compensation, MC).
  • ME Motion Estimation
  • MC Motion Compensation
  • the predicted value is subtracted from the input data to obtain a residual, which then undergoes the Discrete Cosine Transform (DCT) and quantization to obtain residual coefficients; these are sent to the entropy coding module, which outputs the code stream.
  • DCT Discrete Cosine Transform
  • the residual of the reconstructed image is obtained and added to the intra or inter predicted value to obtain the reconstructed image. After in-loop filtering, the reconstructed image enters the reference frame queue as a reference image for the next frame, and encoding proceeds frame by frame in this way.
  • the in-loop filtering may include the Deblocking Filter (DBF) and Sample Adaptive Offset (SAO).
  • inter prediction and the entropy coding of the inter part account for about 90% of the total computation, intra prediction and the entropy coding of the intra part account for about 8%, and SAO and DB together account for less than 1%.
  • the ME part accounts for 30% to 40% of the total calculation, and with the optimization of other parts, the proportion is also increasing.
  • the ME consists of three parts: integer pixel motion estimation, Half Motion Estimation (HME), and Quarter Motion Estimation (QME).
  • a scheme is proposed that reduces the number of motion estimation points without reducing compression performance, which improves the encoding speed and reduces the computational complexity, thereby lowering the machine-performance requirements of real-time video encoding.
  • the interpolation required for sub-pixel motion estimation is performed immediately after the reconstructed data of the target block has been processed by the DB and the SAO: reference pixels are interpolated for all half-pixel positions and for the two quarter-pixel positions in the horizontal direction. Half-pixel estimation in inter prediction then fetches values directly according to the mv, and quarter-pixel motion estimation decides whether to interpolate according to the mv, which avoids repeated interpolation.
  • the final quarter-pixel estimation evaluates only three points, achieving the compression performance of the original 8 points.
  • a complete point-supplementing strategy is designed: first evaluate 4 points, then decide whether to stop or to evaluate additional points.
  • the DB is performed on the entire frame, then the SAO on the entire frame, and the frame finally enters the reference frame queue, and
  • when a reference pixel does not exist, it is interpolated according to the mv coordinates, that is, on demand: whichever position is needed is interpolated, and the costs are then calculated and compared to obtain the optimal mv position.
  • the interpolation is performed after the SAO, with the following adjustment.
  • the DB and the SAO are started immediately, and the reference pixels for all half-pixel positions are then interpolated, separately for each CTU block. Each position corresponds to its own image buffer; at use time, the buffer is selected according to the half-pixel position of the mv and offset to the corresponding position. At the same time, the two quarter-pixel planes in the horizontal direction are interpolated;
  • if the quarter-pixel position of the mv is not one of the horizontal positions that have already been interpolated, it is interpolated at that point.
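The buffering scheme described above, with some sub-pixel planes interpolated ahead of time and the rest only when an mv asks for them, can be sketched as a small cache. This is only an illustration under stated assumptions: the class `SubPelPlanes`, the `interpolate` callback, and the list `EAGER_PHASES` are hypothetical names, mv components are assumed to be in quarter-pixel units, and the real interpolation filter is left to the caller.

```python
# Hypothetical set of phases prepared right after DB and SAO: the integer
# plane, the three half-pixel planes, and the two horizontal quarter planes.
EAGER_PHASES = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 0), (3, 0)]


class SubPelPlanes:
    """Cache of interpolated reference planes keyed by fractional mv phase."""

    def __init__(self, interpolate, eager_phases):
        # interpolate(phase) is supplied by the caller and returns one plane.
        self._interpolate = interpolate
        self._planes = {phase: interpolate(phase) for phase in eager_phases}
        self.lazy_count = 0  # planes that had to be interpolated on demand

    def plane_for(self, mv):
        # The low two bits of each component are the quarter-pixel phase;
        # Python's & also handles negative components correctly.
        phase = (mv[0] & 3, mv[1] & 3)
        if phase not in self._planes:
            self._planes[phase] = self._interpolate(phase)
            self.lazy_count += 1
        return self._planes[phase]
```

Each plane is interpolated at most once, so repeated lookups with the same fractional phase never trigger a second interpolation.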
  • the center point and the surrounding 8 point positions may include three types of pixel positions: integer-pixel positions, indicated by solid lines;
  • half-pixel positions, indicated by dotted lines; and quarter-pixel positions, indicated by dashed lines.
  • the half-pixel positions have been interpolated in advance and are selected at use time according to the corresponding mv coordinates. The position of the optimal mv is then determined according to the cost. For example, the rate distortion cost can be used to choose among the various options, and the corresponding value is referred to as cost.
  • the mv corresponding to the minimum cost is the optimal mv, where cost is obtained by the following formula:
  • Bit represents the bit corresponding to (mv-mvp).
  • SATD Sum of Absolute Transformed Difference
  • SATD is a way of calculating distortion: the residual signal is Hadamard-transformed, and the absolute values of all resulting elements are then summed, relative to SAD
  • Lamda is the Lagrangian constant
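The formula itself is missing from this text. Given the terms defined here (SATD, the bits for mv - mvp, and the Lagrangian constant lamda), the customary Lagrangian form would be cost = SATD + lamda × bit. The sketch below assumes that form, and it also assumes a 4x4 block and an unnormalised Hadamard transform; the function names are hypothetical.

```python
def hadamard4(block):
    """Compute H * block * H^T for the 4x4 Hadamard matrix H (unnormalised)."""
    h = [[1, 1, 1, 1],
         [1, -1, 1, -1],
         [1, 1, -1, -1],
         [1, -1, -1, 1]]
    # Row transform: tmp = H * block
    tmp = [[sum(h[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    # Column transform: result = tmp * H^T
    return [[sum(tmp[i][k] * h[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4(original, predicted):
    """Sum of absolute values of the Hadamard-transformed 4x4 residual."""
    residual = [[original[i][j] - predicted[i][j] for j in range(4)]
                for i in range(4)]
    return sum(abs(v) for row in hadamard4(residual) for v in row)


def rd_cost(original, predicted, mv_bits, lamda):
    """Assumed Lagrangian cost: distortion plus lamda times the mv bits."""
    return satd4(original, predicted) + lamda * mv_bits
```

Real encoders often scale the SATD (for example by halving it); that scaling is omitted here because the text does not specify one.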
  • the half-pixel motion estimation mainly includes the following process:
  • Step 1: Calculate the cost of the center point and of the four positions above, below, to the left, and to the right of it, that is, positions 0, 1, 2, 3, and 4, shown as circles in Figure 3-b.
  • Step 2: If the minimum cost is at the center position (point 0), the HME ends. The costs of positions 1 and 2 are compared, as are the costs of positions 3 and 4, and the partition is selected according to the following table. With the center point as the coordinate origin, the area enclosed by coordinate points 5, 6, 7, and 8 is divided into four partitions, as shown in Table 1 below:
  • Table 1
      cost1 <= cost2    cost3 <= cost4    Partition
      yes               yes               0
      yes               no                1
      no                yes               2
      no                no                3
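Table 1 amounts to a two-comparison lookup. The sketch below is a direct transcription; the function name `partition_from_costs` is hypothetical, and cost1 to cost4 are the costs of positions 1 to 4 from step 1.

```python
def partition_from_costs(cost1, cost2, cost3, cost4):
    """Return the partition index of Table 1 from the four neighbour costs."""
    if cost1 <= cost2:
        return 0 if cost3 <= cost4 else 1
    return 2 if cost3 <= cost4 else 3
```

Because both comparisons use "less than or equal", ties resolve toward partition 0.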
  • if the minimum cost is not at the center point, proceed to step 3;
  • Step 3: First, according to the target optimal position (that is, the position with the minimum cost among the positions calculated so far), add two points according to the following rule, shown as triangles in Figure 3-d:
  • the points to add are those among the positions above, below, left, and right of the target optimal position that have not yet been calculated. For example, if the optimal position is position 1, positions 5, 6, and n1 are missing. Because 5 or 6 may itself turn out to be optimal, in which case n1 is not needed, positions 5 and 6 are calculated first; if position 1 is still optimal, position n1 is then added; otherwise points are added according to step 5.
  • the costs of the added points are compared with the current minimum cost to find the optimal position. If the position of the minimum cost has not changed, 1 more point is added according to step 4 and compared; otherwise, the new minimum-cost position is taken as the center and 2 points are added according to step 5;
  • Step 4: First, add 1 point at the position shown in Figure 3-d:
  • the minimum cost position is at 1 or 2 or 3 or 4 positions, and the four positions are all half-pixel positions.
  • then, taking the minimum-cost position as the center, determine according to step 2 which of the four partitions it belongs to.
  • Step 5 Add 2 points according to the position shown in Figure 3-e.
  • this case applies when the optimal position of the half-pixel estimation is not one of the first four point positions.
  • the purpose of the added points is to make the costs above, below, left, and right of the target optimal position available, so that the partition can be determined to guide the quarter-pixel motion estimation. Which points are added depends on which are missing. For example, if the target optimal position is 7, the positions above and to the right of position 7 have already been calculated but those to the left and below have not, so points p4 and p5 are added.
  • if the minimum-cost position is position 5, points p0 and p1 need to be added, with coordinates (-4, -2) and (-2, -4);
  • the origin of the aforementioned coordinates is the optimal position obtained by integer-pixel motion estimation, which is also the origin of the half-pixel motion estimation. Relative to this origin, moving up or left subtracts and moving down or right adds.
  • then, taking the minimum-cost position as the center, determine according to step 2 which of the four partitions it belongs to.
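Steps 1 to 5 can be put together in a short sketch. It is an interpretation of the text, not the reference implementation: coordinates are in quarter-pixel units relative to the integer-pixel optimum (so a half-pixel step is 2, with up and left negative), positions 1 to 4 are taken to be up, down, left, and right, ties keep the earlier candidate, and the names `half_pel_search` and `cost_at` are hypothetical.

```python
HALF = 2  # one half-pixel step, expressed in quarter-pixel units


def half_pel_search(cost_at):
    """Return (best position, partition index, half-pixel points evaluated)."""
    cache = {}

    def c(p):
        # Each position is evaluated at most once.
        if p not in cache:
            cache[p] = cost_at(p)
        return cache[p]

    center = (0, 0)
    # Step 1: centre plus the positions above, below, left and right.
    axis = [(0, -HALF), (0, HALF), (-HALF, 0), (HALF, 0)]
    best = min([center] + axis, key=c)

    if best != center:
        # Step 3: add the two neighbours perpendicular to the winning axis.
        bx, by = best
        if bx == 0:
            side = [(-HALF, by), (HALF, by)]
        else:
            side = [(bx, -HALF), (bx, HALF)]
        best = min([best] + side, key=c)

    # Steps 2, 4 and 5: make sure all four neighbours of the optimum are
    # known (the cache supplies any already computed), then apply Table 1.
    x, y = best
    up, down = (x, y - HALF), (x, y + HALF)
    left, right = (x - HALF, y), (x + HALF, y)
    partition = (0 if c(up) <= c(down) else 2) + (0 if c(left) <= c(right) else 1)

    return best, partition, len(cache) - 1  # the centre is not counted
```

Under these assumptions the search evaluates 4 half-pixel points when the centre wins, 7 when one of positions 1 to 4 wins, and 8 when a diagonal position wins.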
  • the quarter-pixel motion estimation evaluates only 3 points, because the half-pixel stage has already predicted which region the optimal result of the quarter-pixel motion estimation is likely to fall in, so the points can be added within that partition only.
  • the mv and cost corresponding to the minimum cost are the optimal mv and the optimal cost, and the QME ends.
  • pixels are read from the reference frame at the position given by the mv; if the mv has a quarter-pixel component, interpolation is required. The residual is then calculated, and so on, and the code stream is finally written.
  • the decoder likewise decodes the mv and reads the pixels in the same way; it may also need to interpolate, and then adds the residual data to obtain the reconstructed image.
  • the half-pixel and quarter-pixel motion estimations each evaluate 8 points, interpolating on demand and then performing the motion search.
  • with a probability of 35%, 4 points are evaluated;
  • with a probability of 40%, 7 points are evaluated;
  • with a probability of 25%, 8 points are evaluated.
  • the optimal position is computed with the original 8-point method, and the probabilities of it falling into the 4-point, 7-point, and 8-point cases are then counted.
  • the 4 points refer to the end of step 1 in the HME, 7 points.
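As a worked figure, and assuming the three percentages above are the shares of blocks for which the half-pixel stage evaluates 4, 7, and 8 points respectively, the expected number of half-pixel points per block is:

```latex
\begin{aligned}
E[N_{\mathrm{HME}}] &= 0.35 \times 4 + 0.40 \times 7 + 0.25 \times 8 \\
                    &= 1.4 + 2.8 + 2.0 \\
                    &= 6.2
\end{aligned}
```

That is an average of 6.2 points against the 8 points of the original method; the quarter-pixel stage separately drops from 8 points to 3.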
  • a video image processing apparatus 400 may include: an image acquisition module 401, an integer pixel estimation module 402, a first sub-pixel estimation module 403, an image partitioning module 404, a first partition acquisition module 405, a second sub-pixel estimation module 406, and a motion compensation module 407, wherein:
  • An image obtaining module 401 configured to acquire a target image frame from a video image to be encoded
  • the integer pixel estimation module 402 is configured to perform integer pixel motion estimation on the target image frame to obtain an optimal position estimated by the whole pixel;
  • the first sub-pixel estimation module 403 is configured to perform a half-pixel estimation on the optimal position estimated by the integer pixel to obtain an optimal position estimated by one-half pixel;
  • the image partitioning module 404 is configured to divide the surrounding area of the optimal position estimated by the one-half pixel into four partitions;
  • a first partition obtaining module 405, configured to determine, according to a rate distortion cost corresponding to four half pixel positions adjacent to an optimal position estimated by the one-half pixel, from the four partitions a first partition for a quarter-pixel estimate;
  • a second sub-pixel estimation module 406 configured to perform quarter-pixel estimation in the first partition according to an optimal position estimated by one-half pixel, to obtain an optimal position estimated by a quarter-pixel;
  • the motion compensation module 407 is configured to perform motion compensation using the optimal position estimated by the quarter pixel as the motion estimation result.
  • the first sub-pixel estimation module 403 includes:
  • a first half-pixel acquisition module 4031, configured to acquire four half-pixel positions adjacent to the optimal position estimated by the integer pixel, the four half-pixel positions being the pixel positions directly above, directly below, directly to the left of, and directly to the right of the optimal position estimated by the integer pixel;
  • the first cost calculation module 4032 is configured to obtain a position corresponding to the first minimum rate distortion cost from the optimal position estimated by the integer pixel and the four half pixel positions;
  • a first optimal position determining module 4033, configured to determine, when the position corresponding to the first minimum rate distortion cost is the optimal position estimated by the integer pixel, that the optimal position estimated by the half pixel is the optimal position estimated by the integer pixel.
  • the first partition obtaining module 405 is specifically configured to: when the position corresponding to the first minimum rate distortion cost is the optimal position estimated by the integer pixel, divide the area surrounding the optimal position estimated by the integer pixel into four partitions, wherein each partition contains one of the four half-pixel positions.
  • the first sub-pixel estimation module 403 includes:
  • a first half-pixel acquisition module 4031, configured to acquire four half-pixel positions adjacent to the optimal position estimated by the integer pixel, the four half-pixel positions being the pixel positions directly above, directly below, directly to the left of, and directly to the right of the optimal position estimated by the integer pixel;
  • the first cost calculation module 4032 is configured to obtain a position corresponding to the first minimum rate distortion cost from the optimal position estimated by the integer pixel and the four half pixel positions;
  • a second half-pixel acquiring module 4034, configured to acquire, when the position corresponding to the first minimum rate distortion cost is a first half-pixel position, two half-pixel positions adjacent to the first half-pixel position, the two half-pixel positions lying on the same axis as the first half-pixel position, and the first half-pixel position being the half-pixel position with the smallest rate distortion cost among the four half-pixel positions;
  • a second cost calculation module 4035, configured to obtain the position corresponding to a second minimum rate distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to the first half-pixel position;
  • a second optimal position determining module 4036, configured to determine, when the position corresponding to the second minimum rate distortion cost is the first half-pixel position, that the optimal position estimated by the half pixel is the first half-pixel position; or, when the position corresponding to the second minimum rate distortion cost is a third half-pixel position, that the optimal position estimated by the half pixel is the third half-pixel position, wherein the third half-pixel position is the one with the smaller rate distortion cost of the two half-pixel positions adjacent to the first half-pixel position.
  • the first partition obtaining module 405 includes:
  • a third half-pixel acquisition module 4051, configured to acquire, when the position corresponding to the second minimum rate distortion cost is the first half-pixel position, a second half-pixel position adjacent to the first half-pixel position, wherein the second half-pixel position is the one position, among the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the first half-pixel position, that has not yet been half-pixel interpolated;
  • the first area dividing module 4052 is configured to divide the area surrounding the first half-pixel position into four partitions, wherein each partition includes one of the following four positions: the optimal position estimated by the integer pixel, the two half-pixel positions adjacent to the first half-pixel position, and the second half-pixel position.
  • the first partition obtaining module 405 includes:
  • the fourth half-pixel acquiring module 4053 is configured to acquire, when the position corresponding to the second minimum rate distortion cost is a third half-pixel position, two fourth half-pixel positions adjacent to the third half-pixel position, the two fourth half-pixel positions being the positions, among the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the third half-pixel position, that have not yet been interpolated;
  • the second area dividing module 4054 is configured to divide the area surrounding the third half-pixel position into four partitions, wherein each partition includes one of the following four positions: the two of the four half-pixel positions that are adjacent to the third half-pixel position, and the two fourth half-pixel positions.
  • the second sub-pixel estimation module 406 includes:
  • a quarter interpolation module 4061, configured to perform quarter-pixel interpolation at three surrounding positions in the first partition according to the optimal position estimated by the half pixel, obtaining three quarter-pixel positions adjacent to the optimal position estimated by the half pixel, the three quarter-pixel positions being located in the first partition;
  • the third cost calculation module 4062 is configured to obtain a position corresponding to the third minimum rate distortion cost from the optimal position estimated by the one-half pixel and the three quarter-pixel positions;
  • the third optimal position determining module 4063 is configured to determine that the position corresponding to the third minimum rate distortion cost is an optimal position estimated by a quarter pixel.
  • the first partition is the partition to which the position with the minimum rate distortion cost belongs, so quarter-pixel estimation does not need to interpolate over the entire region, nor to calculate the rate distortion cost of every quarter-pixel position in the entire region. The rate distortion cost needs to be calculated only for the quarter-pixel positions in the first partition, so the encoding speed is effectively improved and the computational complexity reduced while video compression performance is maintained, which lowers the machine-performance requirements of real-time video encoding.
  • FIG. 5 is a schematic structural diagram of a server according to an embodiment of the present disclosure.
  • the server 1100 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1122 (for example, one or more processors), memory 1132, and one or more storage media 1130 (for example, one or more mass storage devices) storing application programs 1142 or data 1144.
  • the memory 1132 and the storage medium 1130 may be short-term storage or persistent storage.
  • the program stored on storage medium 1130 may include one or more modules (not shown), each of which may include a series of instruction operations in the server.
  • central processor 1122 can be configured to communicate with storage medium 1130, executing a series of instruction operations in storage medium 1130 on server 1100.
  • Server 1100 may also include one or more power sources 1126, one or more wired or wireless network interfaces 1150, one or more input and output interfaces 1158, and/or one or more operating systems 1141, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM and more.
  • the processing method steps of the video image executed by the server in the above embodiment may be based on the server structure shown in FIG. 5.
  • an embodiment of the present disclosure further provides a storage medium for storing program code, and the program code is used to execute the method provided by the foregoing embodiment.
  • Embodiments of the present disclosure also provide a computer program product comprising instructions that, when run on a server, cause the server to perform the methods provided by the above embodiments.
  • the apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • the connection relationship between the modules indicates that there is a communication connection between them, and specifically may be implemented as one or more communication buses or signal lines.
  • a USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, or the like, including a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

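Embodiments of the present disclosure disclose a video image processing method and apparatus that increase encoding speed and reduce computational complexity while maintaining video compression performance. In the method, integer-pixel motion estimation is performed on a target image frame to obtain the optimal position estimated by the integer pixel; half-pixel estimation is performed on the optimal position estimated by the integer pixel to obtain the optimal position estimated by the half pixel; the area surrounding the optimal position estimated by the half pixel is divided into four partitions; a first partition for quarter-pixel estimation is determined from the four partitions according to the rate distortion costs respectively corresponding to the four half-pixel positions adjacent to the optimal position estimated by the half pixel; quarter-pixel estimation is performed in the first partition according to the optimal position estimated by the half pixel to obtain the optimal position estimated by the quarter pixel; and motion compensation is performed using the optimal position estimated by the quarter pixel as the motion estimation result.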

Description

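Video image processing method and apparatus
This application claims priority to Chinese Patent Application No. 201711050289.8, entitled "Video image processing method and apparatus", filed with the Chinese Patent Office on October 31, 2017, which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to the field of computer technologies, and in particular to video image processing.
Background
Video compression technology is developing rapidly, and video is trending toward high definition, high frame rates, and high compression ratios. The now-widespread H.264 compression scheme has inherent limitations in its compression principles and cannot meet future demands, which is why the High Efficiency Video Coding (HEVC) protocol emerged. For example, an H.264 coding unit can reach a block size of only 16x16, whereas an HEVC coding unit can have block sizes such as 128x128, 64x64, or 8x8. In addition, an H.264 coding unit has only 9 intra prediction directions, whereas an HEVC coding unit can have 35. Furthermore, for inter partition modes, an H.264 coding unit can use only rectangular partitioning, whereas an HEVC coding unit can also use asymmetric partitioning. Overall, HEVC can improve the compression ratio by 40% compared with H.264.
In current HEVC encoding, reference pixels at the corresponding positions must first be obtained by interpolation from integer pixels. The more pixels are matched, the more interpolations are needed, so the HEVC coding protocol places high demands on machine performance. Ordinary machines cannot yet encode in real time, which inevitably degrades video compression performance.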
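Summary
In view of this, embodiments of the present disclosure provide a data processing method, apparatus, and server to improve the processing efficiency of user asset status mining.
To achieve the above objective, embodiments of the present disclosure provide the following technical solutions:
In one aspect, an embodiment of the present disclosure provides a video image processing method, including:
acquiring a target image frame from a video image to be encoded;
performing integer-pixel motion estimation on the target image frame to obtain the optimal position estimated by the integer pixel;
performing half-pixel estimation on the optimal position estimated by the integer pixel to obtain the optimal position estimated by the half pixel;
dividing the area surrounding the optimal position estimated by the half pixel into four partitions, wherein each partition contains two quarter-pixel positions obtained by quarter-pixel interpolation;
obtaining, from the half-pixel positions in the four partitions, the position corresponding to the minimum rate distortion cost, and determining the partition to which that position belongs as the first partition;
performing quarter-pixel estimation in the first partition according to the optimal position estimated by the half pixel to obtain the optimal position estimated by the quarter pixel; and
performing motion compensation using the optimal position estimated by the quarter pixel as the motion estimation result.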
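In another aspect, an embodiment of the present disclosure further provides a video image processing apparatus, including:
an image acquisition module, configured to acquire a target image frame from a video image to be encoded;
an integer pixel estimation module, configured to perform integer-pixel motion estimation on the target image frame to obtain the optimal position estimated by the integer pixel;
a first sub-pixel estimation module, configured to perform half-pixel estimation on the optimal position estimated by the integer pixel to obtain the optimal position estimated by the half pixel;
an image partitioning module, configured to divide the area surrounding the optimal position estimated by the half pixel into four partitions, wherein each partition contains two quarter-pixel positions obtained by quarter-pixel interpolation;
a first partition acquisition module, configured to obtain, from the half-pixel positions in the four partitions, the position corresponding to the minimum rate distortion cost, and to determine the partition to which that position belongs as the first partition;
a second sub-pixel estimation module, configured to perform quarter-pixel estimation in the first partition according to the optimal position estimated by the half pixel to obtain the optimal position estimated by the quarter pixel; and
a motion compensation module, configured to perform motion compensation using the optimal position estimated by the quarter pixel as the motion estimation result.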
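In yet another aspect, an embodiment of the present disclosure provides a video processing device, the video processing device including:
a processor, a communication interface, a memory, and a communication bus;
wherein the processor, the communication interface, and the memory communicate with one another through the communication bus, and the communication interface is an interface of a communication module;
the memory is configured to store program code and to transmit the program code to the processor; and
the processor is configured to invoke instructions of the program code in the memory to execute the method according to the first aspect.
In a further aspect, an embodiment of the present disclosure provides a storage medium configured to store program code, the program code being used to execute the method according to the first aspect.
In yet another aspect, an embodiment of the present disclosure provides a computer program product including instructions which, when run on a computer, cause the computer to execute the method according to the first aspect.
As can be seen from the above technical solutions, the embodiments of the present disclosure have the following advantages:
In the embodiments of the present disclosure, quarter-pixel estimation is completed within the first partition according to the optimal position estimated by the half pixel, and the first partition is the partition to which the position with the minimum rate distortion cost belongs. Quarter-pixel estimation therefore does not need to interpolate over the entire region, nor to calculate the rate distortion cost of every quarter-pixel position in the entire region; the rate distortion cost needs to be calculated only for the quarter-pixel positions in the first partition. The encoding speed is thus effectively increased and the computational complexity reduced while video compression performance is maintained, which lowers the machine-performance requirements of real-time video encoding.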
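Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present disclosure more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present disclosure, and a person skilled in the art may derive other drawings from them.
FIG. 1 is a schematic flow block diagram of a video image processing method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of an HEVC coding framework provided by an embodiment of the present disclosure;
FIG. 3-a is a schematic diagram marking the relationship between the center point position and the surrounding positions in half-pixel motion estimation provided by an embodiment of the present disclosure;
FIG. 3-b is a schematic diagram marking the positions of the calculated points in half-pixel motion estimation provided by an embodiment of the present disclosure;
FIG. 3-c is a schematic diagram marking the partitions of the four blocks when the center point position is the optimal position, provided by an embodiment of the present disclosure;
FIG. 3-d is a schematic diagram marking the positions of the added points when the optimal position in half-pixel motion estimation is one of the first 4 point positions, provided by an embodiment of the present disclosure;
FIG. 3-e is a schematic diagram marking the positions of the added points when the optimal position in half-pixel motion estimation is not one of the first 4 point positions, provided by an embodiment of the present disclosure;
FIG. 3-f is a schematic diagram marking the relationship between the center point position and the surrounding positions before quarter-pixel motion estimation, provided by an embodiment of the present disclosure;
FIG. 4-a is a schematic structural diagram of a video image processing apparatus provided by an embodiment of the present disclosure;
FIG. 4-b is a schematic structural diagram of a first sub-pixel estimation module provided by an embodiment of the present disclosure;
FIG. 4-c is a schematic structural diagram of another first sub-pixel estimation module provided by an embodiment of the present disclosure;
FIG. 4-d is a schematic structural diagram of a first partition acquisition module provided by an embodiment of the present disclosure;
FIG. 4-e is a schematic structural diagram of another first partition acquisition module provided by an embodiment of the present disclosure;
FIG. 4-f is a schematic structural diagram of a second sub-pixel estimation module provided by an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a server to which the video image processing method provided by an embodiment of the present disclosure is applied.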
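Detailed Description
Embodiments of the present disclosure provide a video image processing method and apparatus for increasing encoding speed and reducing computational complexity while maintaining video compression performance.
To make the objectives, features, and advantages of the present disclosure clearer and easier to understand, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are merely some rather than all of the embodiments of the present disclosure. All other embodiments obtained by a person skilled in the art based on the embodiments of the present disclosure fall within the protection scope of the present disclosure.
The terms "include" and "have" and any variants thereof in the specification, claims, and drawings of the present disclosure are intended to cover non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not expressly listed or that are inherent to the process, method, product, or device.
Detailed descriptions are given below.
The video image processing method provided by the embodiments of the present disclosure may be applied to a video processing device. The video processing device may be a CPU (central processing unit) or GPU (graphics processing unit) with video processing capability; optionally, it may be implemented by a terminal such as a mobile phone or notebook computer, or by a server.
One embodiment of the video image processing method of the present disclosure may be applied in particular to fast sub-pixel motion estimation and motion compensation. For example, in the HEVC encoding provided by the embodiments of the present disclosure, motion estimation (Motion Estimation, ME) may comprise three parts: integer-pixel motion estimation, half-pixel motion estimation (Half Motion Estimation, HME), and quarter-pixel motion estimation (Quarter Motion Estimation, QME). In HME and QME, because the image has integer-pixel precision, each pixel must first be upscaled by a factor of four to improve the compression ratio before the best matching position is sought; HME requires the image upscaled 2 times and QME the image upscaled 4 times. This data does not exist, so reference pixels at the corresponding positions must first be obtained by interpolation from integer pixels. In the embodiments of the present disclosure, half-pixel and quarter-pixel motion estimation are not independent of each other; instead, the interpolation range is narrowed on the basis of the half-pixel motion estimation before quarter-pixel estimation is performed, which increases encoding speed and reduces computational complexity. Referring to FIG. 1, a video image processing method provided by an embodiment of the present disclosure may include the following steps: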
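101. Acquire a target image frame from a video image to be encoded.
In the embodiments of the present disclosure, motion estimation on image frames may be performed on the basis of the HEVC coding framework. The target image frame is first extracted, for example by reading it from a frame buffer (Frame Buffer). A FrameBuffer is a display buffer: writing data in a specific format into it means outputting content to the screen. The frame buffer may be located anywhere in system memory, and the video controller refreshes the screen by accessing it. The frame buffer has an address, which is in memory.
102. Perform integer-pixel motion estimation on the target image frame to obtain the optimal position estimated by the integer pixel.
In the embodiments of the present disclosure, after the target image frame is extracted, integer-pixel motion estimation may be performed on it to obtain the optimal position estimated by the integer pixel. The optimal position is the integer-pixel position with the smallest rate distortion cost (Rate Distortion Cost) among multiple integer-pixel positions. During integer-pixel estimation, the rate distortion cost of each integer-pixel position is calculated, and the integer-pixel position with the minimum rate distortion cost is the optimal position estimated by the integer pixel.
103. Perform half-pixel estimation on the optimal position estimated by the integer pixel to obtain the optimal position estimated by the half pixel.
In the embodiments of the present disclosure, after the integer-pixel motion estimation of step 102 is complete, half-pixel estimation is performed on the optimal position estimated by the integer pixel in the target frame to obtain the optimal position estimated by the half pixel. The optimal position is the half-pixel position with the smallest rate distortion cost among multiple half-pixel positions. During half-pixel estimation, the rate distortion cost of each half-pixel position is calculated, and the half-pixel position with the minimum rate distortion cost is the optimal position estimated by the half pixel.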
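In some embodiments of the present disclosure, to reduce the computation caused by half-pixel interpolation, four position points may be selected for half-pixel interpolation. Specifically, step 103 of performing half-pixel estimation on the optimal position estimated by the integer pixel to obtain the optimal position estimated by the half pixel includes:
Step A1: acquiring four half-pixel positions adjacent to the optimal position estimated by the integer pixel, the four half-pixel positions being the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the optimal position estimated by the integer pixel;
Step A2: obtaining the position corresponding to a first minimum rate distortion cost from among the optimal position estimated by the integer pixel and the four half-pixel positions;
Step A3: when the position corresponding to the first minimum rate distortion cost is the optimal position estimated by the integer pixel, determining that the optimal position estimated by the half pixel is the optimal position estimated by the integer pixel.
In steps A1 to A3, half-pixel motion estimation centered on the optimal position estimated by the integer pixel is taken as an example. Four pixel positions can be determined directly above, directly below, directly to the left of, and directly to the right of the optimal position estimated by the integer pixel; that is, the result of half-pixel interpolation at four positions around the optimal position estimated by the integer pixel is obtained, and parsing this result gives the four half-pixel positions adjacent to the optimal position estimated by the integer pixel. Here, half-pixel interpolation means interpolating half-pixel positions from the optimal position estimated by the integer pixel. In step A2, the rate distortion costs of the following five pixel positions need to be calculated: the optimal position estimated by the integer pixel and the four half-pixel positions. The position corresponding to the minimum rate distortion cost is selected from the five calculated costs. Because rate distortion costs are calculated at several points in the embodiments of the present disclosure, the minimum rate distortion cost in step A2 is defined as the "first minimum rate distortion cost". In step A3, when the position corresponding to the first minimum rate distortion cost is the optimal position estimated by the integer pixel, the four half-pixel positions interpolated from it have already been obtained in step A1, so half-pixel interpolation for the optimal position estimated by the integer pixel is complete, and the optimal position estimated by the half pixel is the optimal position estimated by the integer pixel.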
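In some other embodiments of the present disclosure, step 103 of performing half-pixel estimation on the optimal position estimated by the integer pixel to obtain the optimal position estimated by the half pixel includes:
Step A1: acquiring four half-pixel positions adjacent to the optimal position estimated by the integer pixel, the four half-pixel positions being the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the optimal position estimated by the integer pixel;
Step A2: obtaining the position corresponding to a first minimum rate distortion cost from among the optimal position estimated by the integer pixel and the four half-pixel positions;
Step A4: when the position corresponding to the first minimum rate distortion cost is a first half-pixel position, acquiring two half-pixel positions adjacent to the first half-pixel position, the two half-pixel positions lying on the same axis as the first half-pixel position, and the first half-pixel position being the half-pixel position with the smallest rate distortion cost among the four half-pixel positions;
Step A5: obtaining the position corresponding to a second minimum rate distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to the first half-pixel position;
Step A6: when the position corresponding to the second minimum rate distortion cost is the first half-pixel position, determining that the optimal position estimated by the half pixel is the first half-pixel position; or,
Step A7: when the position corresponding to the second minimum rate distortion cost is a third half-pixel position, determining that the optimal position estimated by the half pixel is the third half-pixel position, the third half-pixel position being the one with the smaller rate distortion cost of the two half-pixel positions adjacent to the first half-pixel position.
In steps A1, A2, and A4 to A7, four pixel positions can be determined directly above, directly below, directly to the left of, and directly to the right of the optimal position estimated by the integer pixel; that is, the result of half-pixel interpolation at four positions around the optimal position estimated by the integer pixel is obtained, and parsing this result gives the four half-pixel positions adjacent to the optimal position estimated by the integer pixel. Here, half-pixel interpolation means interpolating half-pixel positions from the optimal position estimated by the integer pixel. In step A2, the rate distortion costs of the following five pixel positions need to be calculated: the optimal position estimated by the integer pixel and the four half-pixel positions. The position corresponding to the minimum rate distortion cost is selected from the five calculated costs. Because rate distortion costs are calculated at several points in the embodiments of the present disclosure, the minimum rate distortion cost in step A2 is defined as the "first minimum rate distortion cost". When the position corresponding to the first minimum rate distortion cost is not the optimal position estimated by the integer pixel, steps A4 to A7 can be triggered to search again for the optimal position estimated by the half pixel.
In step A4, the half-pixel position with the smallest rate distortion cost among the four half-pixel positions is defined as the "first half-pixel position", and two half-pixel positions adjacent to the first half-pixel position are acquired, lying on the same axis as the first half-pixel position. The two interpolated half-pixel positions and the first half-pixel position may therefore lie along the horizontal axis or the vertical axis; that is, the two interpolated half-pixel positions are in the horizontal direction of the first half-pixel position, or in its vertical direction. After the half-pixel interpolation result is obtained in step A4, the rate distortion costs of the following three pixel positions need to be calculated: the first half-pixel position and the two half-pixel positions adjacent to it. The position corresponding to the minimum rate distortion cost is selected from the three calculated costs. Because rate distortion costs are calculated at several points in the embodiments of the present disclosure, the minimum rate distortion cost in step A5 is defined as the "second minimum rate distortion cost", and step A6 or A7 is executed depending on which position has the minimum rate distortion cost.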
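104. Divide the area surrounding the optimal position estimated by the half pixel into four partitions.
In the embodiments of the present disclosure, after the optimal position estimated by the half pixel is determined in step 103, the area surrounding it can be divided into four partitions along the horizontal and vertical axes, with the optimal position estimated by the half pixel as the center point. The area surrounding the optimal position estimated by the half pixel refers to the quarter-pixel positions in the eight directions around that center point: up, down, left, right, upper left, upper right, lower left, and lower right. As described above, a half-pixel interpolation result is obtained during half-pixel estimation, so each partition may contain one half-pixel position obtained by half-pixel interpolation.
In some embodiments of the present disclosure, in the implementation scenario where steps A1 to A3 are executed, step 104 of dividing the area surrounding the optimal position estimated by the half pixel into four partitions includes:
B1: when the position corresponding to the first minimum rate distortion cost is the optimal position estimated by the integer pixel, dividing the area surrounding the optimal position estimated by the integer pixel into four partitions, wherein each partition is the region delimited by the optimal position estimated by the integer pixel and two of the four half-pixel positions.
When the position corresponding to the first minimum rate distortion cost is the optimal position estimated by the integer pixel, the optimal position estimated by the half pixel is determined to be the optimal position estimated by the integer pixel, and four partitions can be delimited around it. Four half-pixel positions were interpolated when the half-pixel interpolation result was obtained in step A1, so two of these four half-pixel positions together with the first center point area delimit a region, and each region so delimited can be defined as a separate partition.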
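In some embodiments of the present disclosure, in the implementation scenario where steps A1, A2, and A4 to A6 are executed, step 104 of dividing the area surrounding the optimal position estimated by the half pixel into four partitions includes:
C1: when the position corresponding to the second minimum rate distortion cost is the first half-pixel position, acquiring a second half-pixel position adjacent to the first half-pixel position, the second half-pixel position being the position, among the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the first half-pixel position, that has not yet been half-pixel interpolated;
C2: dividing the area surrounding the first half-pixel position into four partitions, wherein each partition is the region delimited by the first half-pixel position together with the optimal position estimated by the integer pixel, the two half-pixel positions adjacent to the first half-pixel position, and the second half-pixel position.
When the position corresponding to the second minimum rate distortion cost is the first half-pixel position, the optimal position estimated by the half pixel is determined to be the first half-pixel position. Of the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the first half-pixel position, only three have been interpolated, so the second half-pixel position adjacent to the first half-pixel position needs to be acquired. The area surrounding the first half-pixel position is then divided into four partitions. In step A4 and step C1, four half-pixel positions were interpolated when the half-pixel interpolation results were obtained: the optimal position estimated by the integer pixel, the two half-pixel positions adjacent to the first half-pixel position, and the second half-pixel position. Two of these four half-pixel positions together with the first center point area delimit a region, and each region so delimited can be defined as a separate partition.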
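In some embodiments of the present disclosure, in the implementation scenario where steps A1, A2, A4, A5, and A7 are executed, step 104 of dividing the area surrounding the optimal position estimated by the half pixel into four partitions includes:
D1: when the position corresponding to the second minimum rate distortion cost is the third half-pixel position, acquiring two fourth half-pixel positions adjacent to the third half-pixel position, the two fourth half-pixel positions being the positions, among the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the third half-pixel position, that have not yet been interpolated;
D2: dividing the area surrounding the third half-pixel position into four partitions, wherein each partition is the region delimited by the third half-pixel position together with the two of the four half-pixel positions that are adjacent to the third half-pixel position and the two fourth half-pixel positions.
The one with the smaller rate distortion cost of the two half-pixel positions adjacent to the first half-pixel position is defined as the "third half-pixel position". When the position corresponding to the second minimum rate distortion cost is the third half-pixel position, only two of the four pixel positions directly above, directly below, directly to the left of, and directly to the right of the third half-pixel position have been interpolated, so the two fourth half-pixel positions adjacent to the third half-pixel position need to be acquired. The area surrounding the third half-pixel position is then divided into four partitions. In step A4 and step D1, four half-pixel positions were interpolated when the half-pixel interpolation results were obtained: the two of the four half-pixel positions that are adjacent to the third half-pixel position, and the two fourth half-pixel positions. Two of these four half-pixel positions together with the first center point area delimit a region, and each region so delimited can be defined as a separate partition.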
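105. According to the rate-distortion costs respectively corresponding to the four half-pixel positions adjacent to the half-pixel optimal position, determine, from the four partitions, a first partition to be used for quarter-pixel estimation.
In the embodiments of the present disclosure, after step 104 divides the region surrounding the half-pixel optimal position into four partitions, each partition contains two quarter-pixel positions obtained by quarter-pixel interpolation. A rate-distortion cost can be calculated for each of the half-pixel positions in the four partitions; the minimum of the four costs is selected, and the partition to which the corresponding position belongs is the first partition. Because the costs of the four surrounding pixels, obtained during half-pixel motion estimation, determine which partition is used, quarter-pixel motion estimation only needs to interpolate the points of the first partition. This greatly narrows the interpolation range of quarter-pixel motion estimation, which increases encoding speed and reduces computational complexity.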
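In some embodiments of the present disclosure, step 105 of determining, from the four partitions, the first partition for quarter-pixel estimation according to the rate-distortion costs respectively corresponding to the four half-pixel positions adjacent to the half-pixel optimal position includes:
determining the four half-pixel positions directly above, below, to the left of, and to the right of the half-pixel optimal position as the four half-pixel positions adjacent to the half-pixel optimal position;
calculating the rate-distortion cost corresponding to each of the four half-pixel positions adjacent to the half-pixel optimal position; and
determining the first partition from the four partitions according to the magnitude relationship between the rate-distortion cost of the half-pixel position directly above and that of the half-pixel position directly below, and the magnitude relationship between the rate-distortion cost of the half-pixel position directly to the left and that of the half-pixel position directly to the right.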
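As an example, the region surrounding the half-pixel optimal position is divided into an upper-right zone, an upper-left zone, a lower-right zone, and a lower-left zone. The rate-distortion costs of the half-pixel positions directly above and directly below are compared with each other, and the rate-distortion costs of the half-pixel positions directly to the left and directly to the right are compared with each other. If the cost above is less than or equal to the cost below, and the cost on the left is less than or equal to the cost on the right, the first partition is the upper-left zone. If the cost above is less than or equal to the cost below, and the cost on the left is greater than the cost on the right, the first partition is the upper-right zone. If the cost above is greater than the cost below, and the cost on the left is less than or equal to the cost on the right, the first partition is the lower-left zone. If the cost above is greater than the cost below, and the cost on the left is greater than the cost on the right, the first partition is the lower-right zone.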
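The four cases above amount to two independent comparisons. The following is a minimal sketch of that decision; the function name and the string labels are illustrative choices and do not come from the disclosure.

```python
def select_partition(cost_up, cost_down, cost_left, cost_right):
    """Pick the quadrant used for quarter-pixel estimation.

    Each argument is the rate-distortion cost of the half-pixel position
    directly above, below, left of, or right of the half-pixel optimum.
    Ties go to "upper" and "left", matching the "less than or equal to"
    wording of the description.
    """
    vertical = "upper" if cost_up <= cost_down else "lower"
    horizontal = "left" if cost_left <= cost_right else "right"
    return f"{vertical}-{horizontal}"
```

For instance, `select_partition(10, 12, 9, 7)` picks the upper-right zone, because the position above is cheaper than the one below and the position on the right is cheaper than the one on the left.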
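106. Perform quarter-pixel estimation within the first partition according to the half-pixel optimal position, to obtain a quarter-pixel optimal position.
In the embodiments of the present disclosure, the foregoing steps yield the half-pixel optimal position and the first partition, that is, the partition to which the position with the minimum rate-distortion cost belongs. QME in HEVC encoding therefore only needs to be performed within the first partition rather than at all surrounding positions. This greatly narrows the interpolation range of quarter-pixel motion estimation, which increases encoding speed and reduces computational complexity.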
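In some embodiments of the present disclosure, step 106 of performing quarter-pixel estimation within the first partition according to the half-pixel optimal position, to obtain the quarter-pixel optimal position, includes:
E1. Perform quarter-pixel interpolation at three surrounding positions within the first partition according to the half-pixel optimal position, to obtain three quarter-pixel positions adjacent to the half-pixel optimal position, the three quarter-pixel positions being located within the first partition;
E2. Obtain the position corresponding to a third minimum rate-distortion cost from among the half-pixel optimal position and the three quarter-pixel positions;
E3. Determine the position corresponding to the third minimum rate-distortion cost as the quarter-pixel optimal position.
In the implementation scenario of steps E1 to E3, quarter-pixel interpolation is needed only at three surrounding positions within the first partition, which yields the three quarter-pixel positions adjacent to the half-pixel optimal position. After the quarter-pixel interpolation in step E1, the rate-distortion costs of four pixel positions are calculated: the half-pixel optimal position and the three quarter-pixel positions. The position with the smallest of the four costs is selected. Because rate-distortion costs are calculated at several points in the embodiments of the present disclosure, the minimum rate-distortion cost in step E2 is defined as the "third minimum rate-distortion cost". Finally, the position corresponding to the third minimum rate-distortion cost is taken as the quarter-pixel optimal position.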
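107. Perform motion compensation using the quarter-pixel optimal position as the motion estimation result.
In the embodiments of the present disclosure, after the quarter-pixel optimal position is obtained in step 106, motion compensation can be performed based on it. For the manner of motion compensation, refer to the prior art; details are not repeated here. After motion compensation, predictive coding can be performed. It mainly exploits the spatial and temporal correlation of video, using intra prediction and inter prediction respectively to remove spatial and temporal redundancy and obtain a predicted image block. The predicted image block is then subtracted from the original image block to obtain a prediction residual block, and the prediction residual undergoes the discrete cosine transform (DCT) and quantization to obtain quantized DCT coefficients. Finally, the quantized DCT coefficients are entropy coded to obtain the compressed bitstream.
As the foregoing description of the embodiments shows, quarter-pixel estimation in the embodiments of the present disclosure is performed within the first partition according to the half-pixel optimal position, and the first partition is the partition to which the position with the minimum rate-distortion cost belongs. Quarter-pixel estimation therefore does not need to interpolate quarter pixels over the whole region, nor to calculate the rate-distortion cost of every quarter-pixel position in the whole region; only the quarter-pixel positions within the first partition need to be evaluated. Encoding speed is thus effectively increased and computational complexity reduced while video compression performance is maintained, which lowers the machine-performance requirements of real-time video compression encoding.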
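FIG. 2 is a schematic structural diagram of the HEVC coding framework provided by an embodiment of the present disclosure. The HEVC encoding process is first described in detail. A frame is read from the frame buffer and fed to the encoder, where intra or inter prediction produces a predicted value. Intra prediction interpolates predicted pixels from surrounding pixels and thus uses spatial information; inter prediction finds the position in a reference frame that best matches the target block and thus uses temporal information. Inter prediction may include motion estimation (ME) and motion compensation (MC). The predicted value is subtracted from the input data to obtain the residual, which undergoes the discrete cosine transform (DCT) and quantization to yield residual coefficients. These are sent to the entropy coding module, which outputs the bitstream. At the same time, the residual coefficients are inverse-quantized and inverse-transformed to obtain the residual of the reconstructed image, which is added to the intra or inter predicted value to obtain the reconstructed image. After in-loop filtering, the reconstructed image enters the reference frame queue as the reference image for the next frame, and encoding proceeds frame by frame. In-loop filtering may include deblocking filtering (DBF) and sample adaptive offset (SAO).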
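In HEVC encoding, because the partitioning is finer and there are more directions, the amount of computation is very large. To achieve high compression performance, the whole encoder must be optimized. In general, inter prediction and the entropy coding of the inter part account for about 90% of the total computation, intra prediction and the entropy coding of the intra part account for about 8%, and SAO and deblocking (DB) together account for less than 1%. ME alone accounts for 30% to 40% of the total computation, and its share grows as the other parts are optimized.
ME consists of 3 parts: integer-pixel motion estimation, half-pixel motion estimation (HME), and quarter-pixel motion estimation (QME).
The embodiments of the present disclosure propose a solution that reduces the number of motion estimation points without degrading compression performance. It increases encoding speed and reduces computational complexity, thereby lowering the machine-performance requirements of real-time video compression encoding.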
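The embodiments of the present disclosure comprise three parts: encoder architecture adjustment, HME optimization, and QME optimization. First, the interpolation required for sub-pixel motion estimation is performed immediately after the reconstructed data of the target block has been processed by DB and SAO: the reference pixels for all half-pixel positions, and for 2 quarter-pixel positions in the horizontal direction, are interpolated. Then, in inter prediction, half-pixel estimation fetches values directly according to the mv, and quarter-pixel motion estimation decides from the mv whether interpolation is needed, which avoids repeated interpolation. Meanwhile, during half-pixel motion estimation, the costs of the four surrounding pixels determine which zone's points quarter-pixel motion estimation must evaluate, so that quarter-pixel estimation evaluates only 3 points yet achieves the compression performance of the original 8 points. In addition, a complete point-supplementing strategy is designed for half-pixel motion estimation: 4 points are evaluated first, and it is then decided whether to stop or to supplement points at other positions.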
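In the encoder architecture shown in FIG. 2, DB is applied to the whole frame only after all coding tree units (CTUs) of the frame have been encoded, followed by SAO on the whole frame, after which the frame enters the reference frame queue. During sub-pixel motion estimation, the reference pixels do not yet exist and are interpolated on the spot according to the mv coordinates. In other words, interpolation is on demand: whichever position is needed is interpolated, and the optimal mv position is then obtained by calculation and comparison.
In the embodiments of the present disclosure, interpolation is performed right after SAO, with the following adjustment. As soon as a CTU has been encoded, DB and SAO are applied to it, and all reference pixels at half-pixel positions are then interpolated, CTU block by CTU block. Each position has its own image buffer. At use time, the half-pixel position of the mv selects the corresponding image buffer, which is then offset to the appropriate location. In addition, the 2 planes for the quarter-pixel positions in the horizontal direction are interpolated; if the quarter-pixel position of the mv is not one of these pre-interpolated horizontal positions, it is interpolated on demand.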
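The buffer lookup described above can be sketched as follows. This is only an illustration under stated assumptions: motion vectors are taken to be in quarter-pixel units, the two pre-interpolated horizontal quarter-pixel planes are assumed to be the fractional offsets (1, 0) and (3, 0), and `ReferencePlanes`, `split_mv`, and the `interpolate` callback are hypothetical names that do not appear in the disclosure.

```python
# Fractional offsets (x, y) in quarter-pixel units that are assumed to be
# interpolated ahead of time: the integer plane, the three half-pixel
# planes, and two horizontal quarter-pixel planes.
PRECOMPUTED = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 0), (3, 0)]


def split_mv(mv):
    """Split a quarter-pixel mv into (integer offset, fractional part)."""
    mvx, mvy = mv
    # >> 2 floors toward minus infinity and & 3 is always in 0..3,
    # so integer * 4 + fraction reproduces the mv even when it is negative.
    return (mvx >> 2, mvy >> 2), (mvx & 3, mvy & 3)


class ReferencePlanes:
    def __init__(self, interpolate):
        # interpolate(frac) is a hypothetical callback returning one plane.
        self._interpolate = interpolate
        self._planes = {frac: interpolate(frac) for frac in PRECOMPUTED}
        self.on_demand_calls = 0

    def plane_for(self, mv):
        """Return (plane, integer offset) for a quarter-pixel mv."""
        offset, frac = split_mv(mv)
        if frac in self._planes:
            return self._planes[frac], offset
        # Not pre-interpolated: interpolate on the spot.
        self.on_demand_calls += 1
        return self._interpolate(frac), offset
```

With a stub such as `ReferencePlanes(lambda frac: f"plane{frac}")`, the mv `(6, 0)` resolves to the pre-interpolated half-pixel plane `(2, 0)` at integer offset `(1, 0)`, whereas `(5, 1)` falls on a position that has to be interpolated on demand.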
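The HME optimization of the embodiments of the present disclosure is described next. FIG. 3-a shows the center point and the 8 surrounding positions. Three kinds of pixel positions are involved: integer-pixel positions, drawn with solid lines; half-pixel positions, drawn with dotted lines; and quarter-pixel positions, drawn with dashed lines made of short segments.
The reference pixels for half-pixel estimation have all been interpolated in advance and are selected by the coordinates of the mv when needed. The position of the optimal mv is then decided by cost. For example, the rate-distortion cost can be used, which serves to choose the best among several options; its value is called cost for short. The mv with the minimum cost is the optimal mv. The cost is obtained by the following formula:
cost = satd + lambda * bit;
where bit denotes the number of bits corresponding to (mv - mvp).
SATD stands for Sum of Absolute Transformed Differences: the residual signal is Hadamard-transformed and the absolute values of the resulting elements are summed. It is one way of measuring distortion; compared with SAD, it requires more computation but is more accurate. lambda is the Lagrange multiplier. mvp denotes the motion vector prediction, which must be derived as specified by the standard. What is actually encoded is the difference mvd, that is, mvd = mv - mvp, which saves bits.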
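As an illustration of the cost formula, here is a minimal sketch for a 4x4 block. It makes several assumptions that are not in the disclosure: the Hadamard transform is left unnormalized, the block size is fixed at 4x4, and the bit count of mvd is approximated by a signed Exp-Golomb code length (a real HEVC encoder codes mvd with CABAC, so `mvd_bits` is only a stand-in). All function names are hypothetical.

```python
H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]


def _matmul4(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4x4(orig, pred):
    """Sum of absolute values of the Hadamard-transformed 4x4 residual."""
    residual = [[orig[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    # H4 is symmetric, so H4 * R * H4 is the 2-D transform of R.
    transformed = _matmul4(_matmul4(H4, residual), H4)
    return sum(abs(v) for row in transformed for v in row)


def mvd_bits(mvd):
    """Stand-in bit estimate: signed Exp-Golomb length per component."""
    total = 0
    for v in mvd:
        code_num = 2 * abs(v) - (1 if v > 0 else 0)
        total += 2 * ((code_num + 1).bit_length() - 1) + 1
    return total


def rd_cost(orig, pred, mv, mvp, lam):
    """cost = satd + lambda * bit, with bit taken from mvd = mv - mvp."""
    mvd = (mv[0] - mvp[0], mv[1] - mvp[1])
    return satd4x4(orig, pred) + lam * mvd_bits(mvd)
```

For a residual of all ones, only the DC coefficient of the transform is non-zero, so the unnormalized SATD is 16; with `mv = (1, 0)`, `mvp = (0, 0)`, and `lam = 2`, the stand-in bit count is 4 and the cost is 16 + 2 * 4 = 24.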
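Half-pixel motion estimation mainly includes the following process:
Step 1: Calculate the cost of the center point and of the four positions above, below, to the left, and to the right, that is, positions 0, 1, 2, 3, and 4, marked by circles in FIG. 3-b.
By comparing the cost values, find the minimum cost and its position.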
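Step 2: If the minimum cost is at the center position, that is, point 0, HME ends. The costs of positions 1 and 2 are compared, as are the costs of positions 3 and 4. With the center point as the coordinate origin, the region enclosed by points 5, 6, 7, and 8 is divided into 4 zones according to Table 1:
Table 1
cost1 <= cost2    cost3 <= cost4    Partition
yes               yes               0
yes               no                1
no                yes               2
no                no                3
As shown in FIG. 3-c, these correspond to the four zones Z0, Z1, Z2, and Z3 in the figure. The partition indicates which 3 points quarter-pixel motion estimation should evaluate. In terms of probability, evaluating only these 3 points achieves the effect of evaluating all points. The purpose is to reduce the number of quarter-pixel motion estimation points and increase encoding speed.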
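If the minimum cost is not at the center point, go to step 3.
Step 3: First, according to the target optimal position (the position with the minimum cost among those calculated so far), two additional points are evaluated according to the following rules, at the positions marked by triangles in FIG. 3-d:
if the minimum cost is at position 1, points 5 and 6 are supplemented;
if the minimum cost is at position 2, points 7 and 8 are supplemented;
if the minimum cost is at position 3, points 5 and 7 are supplemented;
if the minimum cost is at position 4, points 6 and 8 are supplemented.
It should be noted that, in the embodiments of the present disclosure, the positions to be supplemented are those positions above, below, to the left of, and to the right of the half-pixel position with the minimum cost that have not yet been evaluated. Take position 1 as an example: positions 5, 6, and n1 are missing. Positions 5 and 6 are evaluated first, because either may still be the optimum, whereas n1 does not necessarily need to be evaluated. If position 1 is still optimal, position n1 is then supplemented; otherwise points are supplemented according to step 5.
The costs of the supplemented points are then compared with the current minimum cost in order to find the optimal position. If the position of the minimum cost does not change, one more point is supplemented according to step 4 and compared; otherwise, with the new minimum-cost position as the center, two more points are supplemented according to step 5.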
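Step 4: First, supplement one point at the position shown in FIG. 3-d, that is:
if the minimum cost is at position 1, point n1 is supplemented;
if the minimum cost is at position 2, point n2 is supplemented;
if the minimum cost is at position 3, point n3 is supplemented;
if the minimum cost is at position 4, point n4 is supplemented.
When a point is supplemented according to step 4, the minimum-cost position is position 1, 2, 3, or 4, all of which are half-pixel positions.
Then, with the minimum-cost position as the center, determine which of the 4 partitions applies, in the same way as in step 2.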
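Step 5: Supplement 2 points at the positions shown in FIG. 3-e.
FIG. 3-e marks the supplementary points used when the optimal position of half-pixel motion estimation is not one of the first 4 points. The purpose of supplementing is to ensure that all four positions above, below, to the left of, and to the right of the target optimal position have been evaluated, so that the partition can be determined in preparation for guiding quarter-pixel motion estimation. Which points are supplemented depends on which are still missing. For example, if the target optimal position is 7, the positions above and to the right of position 7 have already been evaluated but those to the left and below have not, so points p4 and p5 are supplemented.
If the minimum cost is at position 5, points p0 and p1 are supplemented, at coordinates (-4,-2) and (-2,-4);
if the minimum cost is at position 6, points p2 and p3 are supplemented, at coordinates (2,-4) and (4,-2);
if the minimum cost is at position 7, points p4 and p5 are supplemented, at coordinates (-4,2) and (-2,4);
if the minimum cost is at position 8, points p6 and p7 are supplemented, at coordinates (2,4) and (4,2).
The foregoing coordinates are relative to the optimal position after integer-pixel motion estimation, which is also the origin of half-pixel motion estimation. From this origin, up and left are negative, and down and right are positive.
Then, with the minimum-cost position as the center, determine which of the 4 partitions applies, in the same way as in step 2.
The complete loop described in steps 1 to 5 covers all possible cases of half-pixel motion estimation.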
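Steps 1 to 5 can be sketched as one small routine. Several details below are inferred from the description rather than stated in it: coordinates are assumed to be in quarter-pixel units (a half-pixel step is 2), positions 1, 2, 3, and 4 are taken to be up, down, left, and right, points n1 to n4 are taken to be the far neighbor of the current optimum on the same axis, and the extra point of step 4 is used only for the partition decision. The name `half_pel_search` and the `cost` callback are hypothetical.

```python
def half_pel_search(cost):
    """Return (best position, partition 0..3, points evaluated besides the center).

    cost is a callable mapping an (x, y) offset to its rate-distortion cost.
    """
    cache = {}

    def c(p):
        # Evaluate each position at most once.
        if p not in cache:
            cache[p] = cost(p)
        return cache[p]

    center = (0, 0)
    cross = [(0, -2), (0, 2), (-2, 0), (2, 0)]  # positions 1, 2, 3, 4
    best = min([center] + cross, key=c)         # step 1

    if best != center:
        x, y = best
        # Step 3: the two diagonal points next to the current optimum.
        extra = [(-2, y), (2, y)] if x == 0 else [(x, -2), (x, 2)]
        new_best = min([best] + extra, key=c)
        if new_best == best:
            c((2 * x, 2 * y))                   # step 4: point n1..n4
        else:
            bx, by = new_best                   # step 5: points p0..p7
            c((2 * bx, by))
            c((bx, 2 * by))
            best = new_best

    # Partition decision as in step 2, centered on the final optimum.
    x, y = best
    up_le_down = c((x, y - 2)) <= c((x, y + 2))
    left_le_right = c((x - 2, y)) <= c((x + 2, y))
    zone = (0 if up_le_down else 2) + (0 if left_le_right else 1)
    return best, zone, len(cache) - 1
```

With a toy cost equal to the squared distance to a target offset, a target of `(1, 1)` leaves the optimum at the center after 4 points and selects partition 3, while a target of `(2, 2)` moves the optimum to position 8 after 8 points.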
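QME optimization is described next. Around the half-pixel optimal position, the positions of the 8 surrounding quarter pixels are determined, as shown in FIG. 3-f, where stars mark the quarter-pixel positions and the circle marks the optimal position obtained by half-pixel motion estimation.
Taking the case in which the half-pixel optimal position is the center point as an example, and using the partition identifier obtained in HME, quarter-pixel motion estimation evaluates only 3 points for each partition. This is possible because half-pixel estimation has already predicted which zone the optimal quarter-pixel result is likely to fall in, so points can be supplemented by partition:
for partition 0, that is, Z0, positions q5, q3, and q1 are evaluated;
for partition 1, that is, Z1, positions q6, q1, and q4 are evaluated;
for partition 2, that is, Z2, positions q7, q3, and q2 are evaluated;
for partition 3, that is, Z3, positions q8, q2, and q4 are evaluated.
The costs are then compared; the mv and cost corresponding to the minimum cost are the optimal mv and the optimal cost, and QME ends.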
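The partition-to-points mapping can be sketched as a lookup table. The offsets below assume that q1 to q8 follow the same layout as positions 1 to 8 in HME (q1 up, q2 down, q3 left, q4 right, q5 upper left, q6 upper right, q7 lower left, q8 lower right) and that coordinates are in quarter-pixel units; both are inferences from the lists above, and the names are hypothetical.

```python
# Quarter-pixel offsets (dx, dy) evaluated for each partition.
ZONE_OFFSETS = {
    0: [(-1, -1), (-1, 0), (0, -1)],  # q5, q3, q1
    1: [(1, -1), (0, -1), (1, 0)],    # q6, q1, q4
    2: [(-1, 1), (-1, 0), (0, 1)],    # q7, q3, q2
    3: [(1, 1), (0, 1), (1, 0)],      # q8, q2, q4
}


def quarter_pel_search(cost, half_best, zone):
    """Compare the half-pixel optimum with the 3 points of its partition."""
    x, y = half_best
    candidates = [half_best] + [(x + dx, y + dy) for dx, dy in ZONE_OFFSETS[zone]]
    return min(candidates, key=cost)
```

Only 4 positions are compared, the half-pixel optimum plus 3 new points, instead of the 9 positions of a full 8-point quarter-pixel search.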
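In the embodiments of the present disclosure, during inter prediction, pixels are read from the corresponding position of the reference frame according to this mv, interpolation is performed if the mv points to a quarter-pixel position, and the residual and other data are then calculated and finally written to the bitstream. The decoder also decodes this mv, reads pixels in the same way, interpolates if needed, and then adds the residual data to obtain the reconstructed image.
The embodiments of the present disclosure avoid repeated interpolation. During interpolation, many pixels overlap. For example, the data for half-pixel positions 5 and 6 differ by only one column; the rest is identical. Interpolation produces data of the same size as the target prediction block, so the larger the block, the greater the overlap. In addition, quarter-pixel interpolation is performed on the basis of half pixels. In the interpolate-on-demand approach, because the position is not known in advance, interpolation starts over every time: half pixels are interpolated first and then quarter pixels. In the embodiments of the present disclosure, each position is interpolated only once, without repetition; quarter-pixel interpolation builds on the existing half pixels, and 2 positions have already been prepared. Because few quarter-pixel points are evaluated, they can be interpolated on demand.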
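As the foregoing example shows, before optimization both half-pixel and quarter-pixel motion estimation evaluate 8 points, interpolating on demand and then performing the motion search. With the solution provided by the embodiments of the present disclosure, half-pixel motion estimation evaluates 4 points with a probability of 35%, 7 points with a probability of 40%, and 8 points with a probability of 25%. These probabilities were obtained by finding the optimal position with the original 8-point method and then counting how often it falls into the 4-point, 7-point, and 8-point cases: 4 points means that HME ends after step 1; 7 points means that the optimal position is position 1, 2, 3, or 4 and 3 points are supplemented; 8 points means that step 5 is completed. This saves 22.5% of the computation of this part. Quarter-pixel motion estimation evaluates only 3 points, which saves 62.5% of the computation of that part. Together with the architecture adjustment, which avoids repeated interpolation, the overall encoding speed increased by 32.2%, while compression performance, measured by the Bjøntegaard delta rate (BD-rate), degraded by only 0.24%. The gain is therefore significant.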
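The 22.5% and 62.5% figures follow directly from the point counts quoted above. A short check, using only those numbers (the function name is illustrative):

```python
def expected_points(distribution):
    """Average number of evaluated points for {points: probability}."""
    return sum(points * prob for points, prob in distribution.items())


# Half-pixel estimation: 4, 7, or 8 points with the stated probabilities.
hme_points = expected_points({4: 0.35, 7: 0.40, 8: 0.25})
hme_saving = 1 - hme_points / 8   # relative to the original 8 points

# Quarter-pixel estimation: always 3 points instead of 8.
qme_saving = 1 - 3 / 8
```

The average is 0.35 * 4 + 0.40 * 7 + 0.25 * 8 = 6.2 points, so the half-pixel saving is 1 - 6.2 / 8 = 22.5%, and the quarter-pixel saving is 1 - 3 / 8 = 62.5%.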
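To facilitate better understanding and implementation of the foregoing solutions of the embodiments of the present disclosure, corresponding application scenarios are used as examples for specific description.
It should be noted that, for brevity, the foregoing method embodiments are all described as a series of action combinations. However, those skilled in the art should understand that the present disclosure is not limited by the described order of actions, because according to the present disclosure some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present disclosure.
To facilitate better implementation of the foregoing solutions of the embodiments of the present disclosure, related apparatuses for implementing the solutions are also provided below.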
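Referring to FIG. 4-a, a video image processing apparatus 400 provided by an embodiment of the present disclosure may include: an image acquisition module 401, an integer-pixel estimation module 402, a first sub-pixel estimation module 403, an image partition module 404, a first partition acquisition module 405, a second sub-pixel estimation module 406, and a motion compensation module 407, where:
the image acquisition module 401 is configured to obtain a target image frame from a video image to be encoded;
the integer-pixel estimation module 402 is configured to perform integer-pixel motion estimation on the target image frame to obtain an integer-pixel optimal position;
the first sub-pixel estimation module 403 is configured to perform half-pixel estimation on the integer-pixel optimal position to obtain a half-pixel optimal position;
the image partition module 404 is configured to divide the region surrounding the half-pixel optimal position into four partitions;
the first partition acquisition module 405 is configured to determine, from the four partitions, a first partition for quarter-pixel estimation according to the rate-distortion costs respectively corresponding to the four half-pixel positions adjacent to the half-pixel optimal position;
the second sub-pixel estimation module 406 is configured to perform quarter-pixel estimation within the first partition according to the half-pixel optimal position, to obtain a quarter-pixel optimal position;
the motion compensation module 407 is configured to perform motion compensation using the quarter-pixel optimal position as the motion estimation result.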
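In some embodiments of the present disclosure, referring to FIG. 4-b, the first sub-pixel estimation module 403 includes:
a first half-pixel acquisition module 4031, configured to obtain four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
a first cost calculation module 4032, configured to obtain the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions;
a first optimal position determination module 4033, configured to determine, when the position corresponding to the first minimum rate-distortion cost is the integer-pixel optimal position, that the half-pixel optimal position is the integer-pixel optimal position.
Further, in some embodiments of the present disclosure, the first partition acquisition module 405 is specifically configured to divide, when the position corresponding to the first minimum rate-distortion cost is the integer-pixel optimal position, the region surrounding the integer-pixel optimal position into four partitions, where each partition contains one of the four half-pixel positions.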
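In some embodiments of the present disclosure, referring to FIG. 4-c, the first sub-pixel estimation module 403 includes:
a first half-pixel acquisition module 4031, configured to obtain four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
a first cost calculation module 4032, configured to obtain the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions;
a second half-pixel acquisition module 4034, configured to obtain, when the position corresponding to the first minimum rate-distortion cost is a first half-pixel position, two half-pixel positions adjacent to the first half-pixel position, the two adjacent half-pixel positions lying on the same axis as the first half-pixel position, and the first half-pixel position being the half-pixel position with the smallest rate-distortion cost among the four half-pixel positions;
a second cost calculation module 4035, configured to obtain the position corresponding to a second minimum rate-distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to the first half-pixel position;
a second optimal position determination module 4036, configured to determine, when the position corresponding to the second minimum rate-distortion cost is the first half-pixel position, that the half-pixel optimal position is the first half-pixel position; or to determine, when the position corresponding to the second minimum rate-distortion cost is a third half-pixel position, that the half-pixel optimal position is the third half-pixel position, the third half-pixel position being the half-pixel position with the smaller rate-distortion cost of the two half-pixel positions adjacent to the first half-pixel position.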
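Further, in some embodiments of the present disclosure, referring to FIG. 4-d, the first partition acquisition module 405 includes:
a third half-pixel acquisition module 4051, configured to obtain, when the position corresponding to the second minimum rate-distortion cost is the first half-pixel position, a second half-pixel position adjacent to the first half-pixel position, the second half-pixel position being the position, among the four pixel positions directly above, below, to the left of, and to the right of the first half-pixel position, that has not yet been half-pixel interpolated;
a first region division module 4052, configured to divide the region surrounding the first half-pixel position into four partitions, where each partition contains one of the following four positions: the integer-pixel optimal position, the two half-pixel positions adjacent to the first half-pixel position, and the second half-pixel position.
In some embodiments of the present disclosure, referring to FIG. 4-e, the first partition acquisition module 405 includes:
a fourth half-pixel acquisition module 4053, configured to obtain, when the position corresponding to the second minimum rate-distortion cost is the third half-pixel position, two fourth half-pixel positions adjacent to the third half-pixel position, the two fourth half-pixel positions being the positions, among the four pixel positions directly above, below, to the left of, and to the right of the third half-pixel position, that have not yet been interpolated;
a second region division module 4054, configured to divide the region surrounding the third half-pixel position into four partitions, where each partition contains one of the following four positions: the two of the four half-pixel positions that are adjacent to the third half-pixel position, and the two fourth half-pixel positions.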
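In some embodiments of the present disclosure, referring to FIG. 4-f, the second sub-pixel estimation module 406 includes:
a quarter interpolation module 4061, configured to perform quarter-pixel interpolation at three surrounding positions within the first partition according to the half-pixel optimal position, to obtain three quarter-pixel positions adjacent to the half-pixel optimal position, the three quarter-pixel positions being located within the first partition;
a third cost calculation module 4062, configured to obtain the position corresponding to a third minimum rate-distortion cost from among the half-pixel optimal position and the three quarter-pixel positions;
a third optimal position determination module 4063, configured to determine the position corresponding to the third minimum rate-distortion cost as the quarter-pixel optimal position.
As the foregoing description of the embodiments shows, quarter-pixel estimation in the embodiments of the present disclosure is performed within the first partition according to the half-pixel optimal position, and the first partition is the partition to which the position with the minimum rate-distortion cost belongs. Quarter-pixel estimation therefore does not need to interpolate quarter pixels over the whole region, nor to calculate the rate-distortion cost of every quarter-pixel position in the whole region; only the quarter-pixel positions within the first partition need to be evaluated. Encoding speed is thus effectively increased and computational complexity reduced while video compression performance is maintained, which lowers the machine-performance requirements of real-time video compression encoding.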
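FIG. 5 is a schematic structural diagram of a server provided by an embodiment of the present disclosure. The server 1100 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1122 (for example, one or more processors), a memory 1132, and one or more storage media 1130 (for example, one or more mass storage devices) storing application programs 1142 or data 1144. The memory 1132 and the storage medium 1130 may provide transient or persistent storage. A program stored in the storage medium 1130 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 1122 may be configured to communicate with the storage medium 1130 and to execute, on the server 1100, the series of instruction operations in the storage medium 1130.
The server 1100 may further include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input/output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
The steps of the video image processing method performed by the server in the foregoing embodiments may be based on the server structure shown in FIG. 5.
In addition, an embodiment of the present disclosure further provides a storage medium configured to store program code, the program code being used to perform the method provided by the foregoing embodiments.
An embodiment of the present disclosure further provides a computer program product including instructions which, when run on a server, cause the server to perform the method provided by the foregoing embodiments.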
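It should also be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the present disclosure, a connection between modules indicates that they have a communication connection, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement this without creative effort.
From the description of the foregoing implementations, those skilled in the art can clearly understand that the present disclosure may be implemented by software plus the necessary general-purpose hardware, and may of course also be implemented by dedicated hardware, including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structure used to implement the same function may take many forms, such as an analog circuit, a digital circuit, or a dedicated circuit. For the present disclosure, however, a software implementation is in most cases the better implementation. Based on this understanding, the technical solution of the present disclosure, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present disclosure.
In summary, the foregoing embodiments are intended only to illustrate the technical solutions of the present disclosure, not to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure.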

Claims (19)

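  1. A video image processing method, applied to a video processing device, the method comprising:
    obtaining a target image frame from a video image to be encoded;
    performing integer-pixel motion estimation on the target image frame to obtain an integer-pixel optimal position;
    performing half-pixel estimation on the integer-pixel optimal position to obtain a half-pixel optimal position;
    dividing a region surrounding the half-pixel optimal position into four partitions;
    determining, from the four partitions, a first partition for quarter-pixel estimation according to rate-distortion costs respectively corresponding to four half-pixel positions adjacent to the half-pixel optimal position;
    performing quarter-pixel estimation within the first partition according to the half-pixel optimal position, to obtain a quarter-pixel optimal position; and
    performing motion compensation using the quarter-pixel optimal position as a motion estimation result.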
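  2. The method according to claim 1, wherein the performing half-pixel estimation on the integer-pixel optimal position to obtain a half-pixel optimal position comprises:
    obtaining four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
    obtaining the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions; and
    when the position corresponding to the first minimum rate-distortion cost is the integer-pixel optimal position, determining that the half-pixel optimal position is the integer-pixel optimal position.
  3. The method according to claim 2, wherein the dividing a region surrounding the half-pixel optimal position into four partitions comprises:
    when the position corresponding to the first minimum rate-distortion cost is the integer-pixel optimal position, dividing the region surrounding the integer-pixel optimal position into four partitions, wherein each partition is the region delimited by the integer-pixel optimal position and two of the four half-pixel positions.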
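  4. The method according to claim 1, wherein the performing half-pixel estimation on the integer-pixel optimal position to obtain a half-pixel optimal position comprises:
    obtaining four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
    obtaining the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions;
    when the position corresponding to the first minimum rate-distortion cost is a first half-pixel position, obtaining two half-pixel positions adjacent to the first half-pixel position, the two adjacent half-pixel positions lying on the same axis as the first half-pixel position, and the first half-pixel position being the half-pixel position with the smallest rate-distortion cost among the four half-pixel positions;
    obtaining the position corresponding to a second minimum rate-distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to the first half-pixel position; and
    when the position corresponding to the second minimum rate-distortion cost is the first half-pixel position, determining that the half-pixel optimal position is the first half-pixel position.
  5. The method according to claim 1, wherein the performing half-pixel estimation on the integer-pixel optimal position to obtain a half-pixel optimal position comprises:
    obtaining four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
    obtaining the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions;
    when the position corresponding to the first minimum rate-distortion cost is a first half-pixel position, obtaining two half-pixel positions adjacent to the first half-pixel position, the two adjacent half-pixel positions lying on the same axis as the first half-pixel position, and the first half-pixel position being the half-pixel position with the smallest rate-distortion cost among the four half-pixel positions;
    obtaining the position corresponding to a second minimum rate-distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to the first half-pixel position; and
    when the position corresponding to the second minimum rate-distortion cost is a third half-pixel position, determining that the half-pixel optimal position is the third half-pixel position, the third half-pixel position being the half-pixel position with the smaller rate-distortion cost of the two half-pixel positions adjacent to the first half-pixel position.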
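  6. The method according to claim 4, wherein the dividing a region surrounding the half-pixel optimal position into four partitions comprises:
    when the position corresponding to the second minimum rate-distortion cost is the first half-pixel position, obtaining a second half-pixel position adjacent to the first half-pixel position, the second half-pixel position being the position, among the four pixel positions directly above, below, to the left of, and to the right of the first half-pixel position, that has not yet been half-pixel interpolated; and
    dividing the region surrounding the first half-pixel position into four partitions, wherein each partition is a region delimited by the first half-pixel position together with the integer-pixel optimal position, the two half-pixel positions adjacent to the first half-pixel position, and the second half-pixel position.
  7. The method according to claim 5, wherein the dividing a region surrounding the half-pixel optimal position into four partitions comprises:
    when the position corresponding to the second minimum rate-distortion cost is the third half-pixel position, obtaining two fourth half-pixel positions adjacent to the third half-pixel position, the two fourth half-pixel positions being the positions, among the four pixel positions directly above, below, to the left of, and to the right of the third half-pixel position, that have not yet been interpolated; and
    dividing the region surrounding the third half-pixel position into four partitions, wherein each partition is a region delimited by the third half-pixel position together with the two of the four half-pixel positions that are adjacent to the third half-pixel position and the two fourth half-pixel positions.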
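  8. The method according to any one of claims 1 to 7, wherein the performing quarter-pixel estimation within the first partition according to the half-pixel optimal position, to obtain a quarter-pixel optimal position, comprises:
    performing quarter-pixel interpolation at three surrounding positions within the first partition according to the half-pixel optimal position, to obtain three quarter-pixel positions adjacent to the half-pixel optimal position, the three quarter-pixel positions being located within the first partition;
    obtaining the position corresponding to a third minimum rate-distortion cost from among the half-pixel optimal position and the three quarter-pixel positions; and
    determining the position corresponding to the third minimum rate-distortion cost as the quarter-pixel optimal position.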
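  9. A video image processing apparatus, comprising:
    an image acquisition module, configured to obtain a target image frame from a video image to be encoded;
    an integer-pixel estimation module, configured to perform integer-pixel motion estimation on the target image frame to obtain an integer-pixel optimal position;
    a first sub-pixel estimation module, configured to perform half-pixel estimation on the integer-pixel optimal position to obtain a half-pixel optimal position;
    an image partition module, configured to divide a region surrounding the half-pixel optimal position into four partitions, wherein each partition contains two quarter-pixel positions obtained by quarter-pixel interpolation;
    a first partition acquisition module, configured to obtain the position corresponding to a minimum rate-distortion cost from among the half-pixel positions in the four partitions, and to determine the partition to which the position corresponding to the minimum rate-distortion cost belongs as a first partition;
    a second sub-pixel estimation module, configured to perform quarter-pixel estimation within the first partition according to the half-pixel optimal position, to obtain a quarter-pixel optimal position; and
    a motion compensation module, configured to perform motion compensation using the quarter-pixel optimal position as a motion estimation result.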
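  10. The apparatus according to claim 9, wherein the first sub-pixel estimation module comprises:
    a first half-pixel acquisition module, configured to obtain four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
    a first cost calculation module, configured to obtain the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions; and
    a first optimal position determination module, configured to determine, when the position corresponding to the first minimum rate-distortion cost is the integer-pixel optimal position, that the half-pixel optimal position is the integer-pixel optimal position.
  11. The apparatus according to claim 10, wherein the first partition acquisition module is specifically configured to divide, when the position corresponding to the first minimum rate-distortion cost is the integer-pixel optimal position, the region surrounding the integer-pixel optimal position into four partitions, wherein each partition is the region delimited by the integer-pixel optimal position and two of the four half-pixel positions.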
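  12. The apparatus according to claim 9, wherein the first sub-pixel estimation module comprises:
    a first half-pixel acquisition module, configured to obtain four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
    a first cost calculation module, configured to obtain the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions;
    a second half-pixel acquisition module, configured to obtain, when the position corresponding to the first minimum rate-distortion cost is a first half-pixel position, two half-pixel positions adjacent to the first half-pixel position, the two adjacent half-pixel positions lying on the same axis as the first half-pixel position, and the first half-pixel position being the half-pixel position with the smallest rate-distortion cost among the four half-pixel positions;
    a second cost calculation module, configured to obtain the position corresponding to a second minimum rate-distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to the first half-pixel position; and
    a second optimal position determination module, configured to determine, when the position corresponding to the second minimum rate-distortion cost is the first half-pixel position, that the half-pixel optimal position is the first half-pixel position.
  13. The apparatus according to claim 9, wherein the first sub-pixel estimation module comprises:
    a first half-pixel acquisition module, configured to obtain four half-pixel positions adjacent to the integer-pixel optimal position, the four half-pixel positions being the four pixel positions directly above, below, to the left of, and to the right of the integer-pixel optimal position;
    a first cost calculation module, configured to obtain the position corresponding to a first minimum rate-distortion cost from among the integer-pixel optimal position and the four half-pixel positions;
    a second half-pixel acquisition module, configured to obtain, when the position corresponding to the first minimum rate-distortion cost is a first half-pixel position, two half-pixel positions adjacent to the first half-pixel position, the two adjacent half-pixel positions lying on the same axis as the first half-pixel position, and the first half-pixel position being the half-pixel position with the smallest rate-distortion cost among the four half-pixel positions;
    a second cost calculation module, configured to obtain the position corresponding to a second minimum rate-distortion cost from among the first half-pixel position and the two half-pixel positions adjacent to the first half-pixel position; and
    a second optimal position determination module, configured to determine, when the position corresponding to the second minimum rate-distortion cost is a third half-pixel position, that the half-pixel optimal position is the third half-pixel position, the third half-pixel position being the half-pixel position with the smaller rate-distortion cost of the two half-pixel positions adjacent to the first half-pixel position.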
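  14. The apparatus according to claim 12, wherein the first partition acquisition module comprises:
    a third half-pixel acquisition module, configured to obtain, when the position corresponding to the second minimum rate-distortion cost is the first half-pixel position, a second half-pixel position adjacent to the first half-pixel position, the second half-pixel position being the position, among the four pixel positions directly above, below, to the left of, and to the right of the first half-pixel position, that has not yet been half-pixel interpolated; and
    a first region division module, configured to divide the region surrounding the first half-pixel position into four partitions, wherein each partition is a region delimited by the first half-pixel position together with the integer-pixel optimal position, the two half-pixel positions adjacent to the first half-pixel position, and the second half-pixel position.
  15. The apparatus according to claim 12, wherein the first partition acquisition module comprises:
    a fourth half-pixel acquisition module, configured to obtain, when the position corresponding to the second minimum rate-distortion cost is a third half-pixel position, two fourth half-pixel positions adjacent to the third half-pixel position, the two fourth half-pixel positions being the positions, among the four pixel positions directly above, below, to the left of, and to the right of the third half-pixel position, that have not yet been interpolated; and
    a second region division module, configured to divide the region surrounding the third half-pixel position into four partitions, wherein each partition is a region delimited by the third half-pixel position together with the two of the four half-pixel positions that are adjacent to the third half-pixel position and the two fourth half-pixel positions.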
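  16. The apparatus according to any one of claims 9 to 15, wherein the second sub-pixel estimation module comprises:
    a quarter interpolation module, configured to perform quarter-pixel interpolation at three surrounding positions within the first partition according to the half-pixel optimal position, to obtain three quarter-pixel positions adjacent to the half-pixel optimal position, the three quarter-pixel positions being located within the first partition;
    a third cost calculation module, configured to obtain the position corresponding to a third minimum rate-distortion cost from among the half-pixel optimal position and the three quarter-pixel positions; and
    a third optimal position determination module, configured to determine the position corresponding to the third minimum rate-distortion cost as the quarter-pixel optimal position.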
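  17. A video processing device, comprising:
    a processor, a communication interface, a memory, and a communication bus;
    wherein the processor, the communication interface, and the memory communicate with one another through the communication bus, and the communication interface is an interface of a communication module;
    the memory is configured to store program code and to transmit the program code to the processor; and
    the processor is configured to invoke instructions of the program code in the memory to perform the method according to any one of claims 1 to 8.
  18. A storage medium, configured to store program code, the program code being used to perform the method according to any one of claims 1 to 8.
  19. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 8.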
PCT/CN2018/104073 2017-10-31 2018-09-05 一种视频图像的处理方法和装置 WO2019085636A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP18873854.6A EP3706420A4 (en) 2017-10-31 2018-09-05 VIDEO IMAGE PROCESSING METHOD AND DEVICE
JP2019563038A JP6921461B2 (ja) 2017-10-31 2018-09-05 ビデオ画像の処理方法及び装置
KR1020197035773A KR102276264B1 (ko) 2017-10-31 2018-09-05 비디오 이미지 처리 방법 및 장치
US16/657,226 US10944985B2 (en) 2017-10-31 2019-10-18 Method and device for processing video image
US17/168,086 US11589073B2 (en) 2017-10-31 2021-02-04 Method and device for processing video image

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711050289.8 2017-10-31
CN201711050289.8A CN109729363B (zh) 2017-10-31 2017-10-31 一种视频图像的处理方法和装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/657,226 Continuation US10944985B2 (en) 2017-10-31 2019-10-18 Method and device for processing video image

Publications (1)

Publication Number Publication Date
WO2019085636A1 true WO2019085636A1 (zh) 2019-05-09

Family

ID=66294470

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/104073 WO2019085636A1 (zh) 2017-10-31 2018-09-05 一种视频图像的处理方法和装置

Country Status (7)

Country Link
US (2) US10944985B2 (zh)
EP (1) EP3706420A4 (zh)
JP (1) JP6921461B2 (zh)
KR (1) KR102276264B1 (zh)
CN (1) CN109729363B (zh)
MA (1) MA50861A (zh)
WO (1) WO2019085636A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112866719A (zh) * 2019-11-27 2021-05-28 北京博雅慧视智能技术研究院有限公司 一种针对avs2的快速分像素预测方法

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109729363B (zh) * 2017-10-31 2022-08-19 腾讯科技(深圳)有限公司 一种视频图像的处理方法和装置
EP4351136A1 (en) * 2021-05-28 2024-04-10 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Encoding method, decoding method, code stream, encoder, decoder and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101247523A (zh) * 2008-03-18 2008-08-20 上海华平信息技术股份有限公司 用于h.264编码器的半像素运动估计方法
JP2009213173A (ja) * 2009-06-22 2009-09-17 Casio Comput Co Ltd 動きベクトル検出装置、および、プログラム
CN102164283A (zh) * 2011-05-30 2011-08-24 江苏大学 一种基于avs的亚像素运动估计方法
CN104378642A (zh) * 2014-10-29 2015-02-25 南昌大学 一种基于cuda的h.264分数像素快速插值方法

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6757330B1 (en) * 2000-06-01 2004-06-29 Hewlett-Packard Development Company, L.P. Efficient implementation of half-pixel motion prediction
JP2006254349A (ja) 2005-03-14 2006-09-21 Toshiba Corp 動きベクトル検出方法、動きベクトル検出装置およびコンピュータ上で動きベクトル検出処理を実行するコンピュータプログラム
US20070217515A1 (en) 2006-03-15 2007-09-20 Yu-Jen Wang Method for determining a search pattern for motion estimation
US10462480B2 (en) * 2014-12-31 2019-10-29 Microsoft Technology Licensing, Llc Computationally efficient motion estimation
WO2017082443A1 (ko) * 2015-11-13 2017-05-18 엘지전자 주식회사 영상 코딩 시스템에서 임계값을 이용한 적응적 영상 예측 방법 및 장치
CN109729363B (zh) * 2017-10-31 2022-08-19 腾讯科技(深圳)有限公司 一种视频图像的处理方法和装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3706420A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112866719A (zh) * 2019-11-27 2021-05-28 北京博雅慧视智能技术研究院有限公司 一种针对avs2的快速分像素预测方法
CN112866719B (zh) * 2019-11-27 2022-09-23 北京博雅慧视智能技术研究院有限公司 一种针对avs2的快速分像素预测方法

Also Published As

Publication number Publication date
KR20200002038A (ko) 2020-01-07
US20200053384A1 (en) 2020-02-13
EP3706420A1 (en) 2020-09-09
US20210235113A1 (en) 2021-07-29
JP2020520194A (ja) 2020-07-02
EP3706420A4 (en) 2021-07-21
MA50861A (fr) 2020-09-09
US11589073B2 (en) 2023-02-21
US10944985B2 (en) 2021-03-09
CN109729363B (zh) 2022-08-19
KR102276264B1 (ko) 2021-07-12
CN109729363A (zh) 2019-05-07
JP6921461B2 (ja) 2021-08-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18873854

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019563038

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20197035773

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018873854

Country of ref document: EP

Effective date: 20200602