WO2019062476A1 - Method, apparatus, device, and storage medium for performing motion estimation - Google Patents

Method, apparatus, device, and storage medium for performing motion estimation

Info

Publication number
WO2019062476A1
WO2019062476A1 (PCT/CN2018/103642, CN2018103642W)
Authority
WO
WIPO (PCT)
Prior art keywords
target
amvp
qme
list
motion
Prior art date
Application number
PCT/CN2018/103642
Other languages
English (en)
French (fr)
Inventor
张宏顺
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Priority to EP18862765.7A priority Critical patent/EP3618445A4/en
Publication of WO2019062476A1 publication Critical patent/WO2019062476A1/zh
Priority to US16/656,116 priority patent/US10827198B2/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N 19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H04N 19/517 Processing of motion vectors by encoding
    • H04N 19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N 19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H04N 19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H04N 19/563 Motion estimation with padding, i.e. with filling of non-object values in an arbitrarily shaped picture block or region for estimation purposes
    • H04N 19/567 Motion estimation based on rate distortion criteria

Definitions

  • the present invention relates to the field of Internet technologies, and in particular, to a method, an apparatus, a device, and a storage medium for performing motion estimation.
  • Motion estimation is one of the most important components of video coding. It is the process of dividing each frame into at least one non-overlapping macroblock and, according to a specified search algorithm, searching a specified area of a reference frame for the matching block most similar to each macroblock. Motion estimation not only reduces the complexity of the video coding process but also reduces the number of bits transmitted, so motion estimation is necessary in the video coding process.
  • In the related art, motion estimation is mainly performed as follows: an AMVP (Advanced Motion Vector Prediction) method is adopted, and the correlation between spatial-domain motion vectors and time-domain motion vectors is used to build a candidate MV (Motion Vector) list for the current PU (Prediction Unit, i.e. a macroblock).
  • The RDcost (Rate Distortion cost) of each MV in the candidate MV list is calculated by using the SAD (Sum of Absolute Differences) method to obtain at least one RDcost value; the minimum RDcost value is obtained from the at least one RDcost value, and the MV corresponding to the minimum RDcost value is used as the target MV of the AMVP.
  • IME (Integer Motion Estimation) is performed with the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and the target MV of the IME is obtained from the calculation result; HME (Half-pixel Motion Estimation) is performed with the mapping point of the target MV of the IME in the reference frame as the primary selection point, and the target MV of the HME is obtained from the calculation result.
  • QME (Quarter-pixel Motion Estimation) is then performed with the mapping point of the target MV of the HME in the reference frame as the primary selection point, the target MV of the QME and the minimum RDcost value of the QME are obtained from the calculation result, and the target MV of the QME and the minimum RDcost value of the QME are determined as the final result of the motion estimation process.
  • an embodiment of the present invention provides a method, an apparatus, a device, and a storage medium for performing motion estimation.
  • the technical solution is as follows:
  • a method of performing motion estimation is provided, the method being applied to an apparatus for performing motion estimation, the method comprising:
  • wherein each target MV is the MV corresponding to the minimum RDcost value of its respective motion estimation.
  • an apparatus for performing motion estimation comprising:
  • a list construction module configured to construct, for any prediction unit PU in an image to be encoded, a candidate motion vector MV list for the PU based on advanced motion vector prediction AMVP, the candidate MV list including at least one MV of the PU;
  • a calculation module configured to calculate a rate distortion cost RDcost of each MV in the candidate MV list
  • An obtaining module configured to obtain, from the calculation result, a target MV of the AMVP and a minimum RDcost value of the AMVP;
  • the calculating module is configured to perform IME by using a mapping point of the target MV of the AMVP in a reference frame as a primary selection point;
  • the obtaining module is configured to obtain a target MV of the IME from the calculation result
  • An accuracy amplification module configured to amplify the target MV of the IME to a quarter pixel precision to obtain a reference target MV of the QME;
  • a determining module configured to determine a target MV of the AMVP and a minimum RDcost value of the AMVP as a final result of a motion estimation process when a target MV of the AMVP and a reference target MV of the QME are the same;
  • wherein each target MV is the MV corresponding to the minimum RDcost value of its respective motion estimation.
  • an apparatus for performing motion estimation comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or a set of instructions, the at least one instruction The at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement a method of performing motion estimation.
  • a computer readable storage medium stores at least one instruction, at least one program, a code set, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by a processor to implement a method of performing motion estimation.
  • The RDcost of each MV in the candidate MV list is calculated to obtain the target MV of the AMVP, the IME is performed with the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and the target MV of the IME is enlarged to quarter-pixel precision to obtain the reference target MV of the QME.
  • When the target MV of the AMVP is the same as the reference target MV of the QME, the HME and the QME are not required, and the target MV of the AMVP and the minimum RDcost value of the AMVP are directly used as the final result, which reduces the amount of computation spent on the HME and the QME, shortens the duration of the motion estimation process, and reduces resource consumption.
  • FIG. 1 is an HEVC coding framework diagram according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a method for performing motion estimation according to an embodiment of the present invention
  • FIG. 3(A) is a schematic diagram showing a construction process of a spatial motion vector according to an embodiment of the present invention
  • FIG. 3(B) is a schematic diagram of a process of constructing a time domain motion vector according to an embodiment of the present invention
  • FIG. 3(C) is a schematic diagram of constructing a candidate MV list based on AMVP according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram of a process for performing motion estimation according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a video encoding and decoding process according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of an apparatus for performing motion estimation according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a terminal for performing motion estimation according to an embodiment of the present invention.
  • FIG. 8 is a diagram of a server for performing motion estimation, according to an exemplary embodiment.
  • HEVC is also known as the H.265 video coding technology.
  • The H.265 video coding technology has a number of advantages over the traditional H.264 video coding technology.
  • FIG. 1 shows the coding framework of HEVC.
  • the HEVC coding process is as follows:
  • In a first step, for any frame of image to be encoded, the image to be encoded is divided into at least one PU that does not overlap one another;
  • The image to be encoded is then input into an encoder for encoding prediction. This process mainly exploits the spatial correlation and temporal correlation of the video data and uses intra prediction or inter prediction to remove the temporal-spatial redundant information of each PU, resulting in a predicted image block for each PU in the reference frame.
  • The predicted image block and the original PU are compared to obtain a prediction residual block, and DCT (Discrete Cosine Transform) and quantization processing are performed on the prediction residual block to obtain quantized DCT coefficients.
  • DCT Discrete Cosine Transform
  • The DCT is a mathematical operation closely related to the Fourier transform.
  • It originates from the Fourier series expansion: if the function to be expanded is an even real function, its Fourier series contains only cosine terms; the signal is discretized before processing, and the cosine transform is then applied.
  • Quantization is a commonly used technique in the field of digital signal processing. It refers to the process of approximating the continuous value of a signal (or a large number of possible discrete values) into a finite number (or fewer) of discrete values.
  • the quantization process is mainly applied to the conversion from continuous signal to digital signal.
  • the continuous signal is sampled into a discrete signal, and the discrete signal is quantized to become a digital signal.
  • The quantized DCT coefficients are entropy encoded to obtain a compressed bit stream, which is output.
  • The quantized DCT coefficients are also inverse quantized and inverse DCT transformed to obtain a residual block of the reconstructed image, and the residual block of the reconstructed image is then added to the intra-frame or inter-frame predicted image block to obtain the reconstructed image.
  • The reconstructed image is processed by the DB (Deblocking Filter) and SAO (Sample Adaptive Offset), added to the reference frame queue, and used as a reference frame for the next frame to be encoded.
  • DB Deblocking Filter
  • SAO Sample Adaptive Offset
  • The main role of the DB is to process the block boundaries of the image and reduce the discontinuity at image boundaries.
  • Adaptive pixel compensation is mainly used for local information compensation of the DB processed image to reduce distortion between the source image and the reconstructed image.
  • In a continuous video sequence, the difference between two adjacent video frames is usually small; perhaps only the relative position of an object changes, or only the object boundaries differ between the two frames.
  • For the video encoder, encoding every video image in the video sequence in full would waste a large amount of the code stream; if the encoding is instead based on the difference between an image and its reference frame, the waste of the code stream can be greatly reduced.
  • The basic idea of motion estimation is to divide each frame of the image sequence into a number of non-overlapping macroblocks, assume that the displacement of all pixels in a macroblock is the same, and then, for each macroblock, search within a specified area of the reference frame according to the specified search algorithm and the specified matching criterion for the matching block that is most similar to it; the relative displacement of the matching block with respect to the target block (e.g., the current macroblock) is the motion vector.
  • the inter-frame redundancy can be removed by motion estimation, so that the number of bits of video transmission is greatly reduced.
  • The specified search algorithm includes a global search algorithm, a fractional precision search algorithm, a fast search algorithm, a hierarchical search algorithm, a hybrid search algorithm, and the like.
  • the specified matching criteria include MAD (Mean Absolute Difference), MSE (Mean Squared Error), and the like.
  • Because the partitioning of the image to be encoded is finer and there are more partition directions, the amount of calculation in the encoding process is larger; if high compression performance is to be achieved, the encoder needs to be optimized.
  • In the encoding process, the calculation amount of the inter-frame prediction and coding part is relatively large, accounting for about 90% of the calculation amount of the entire video coding process; the calculation amount of the intra-frame prediction and coding part is relatively small, accounting for about 8% of the calculation amount of the entire video coding process; and the calculation amount of deblocking filtering and adaptive pixel compensation is the smallest, accounting for about 1% of the calculation amount of the entire video coding process.
  • The calculation of motion estimation accounts for a large proportion, about 30% to 40% of the calculation amount of the entire video coding process, and as the other parts are optimized, the calculation of motion estimation will become more and more significant. Since motion estimation directly affects the computational complexity of the video coding process, a new motion estimation method is needed to reduce the computational complexity of the motion estimation process, shorten the video coding time, and improve the video coding efficiency.
  • The embodiment of the present invention provides a method for performing motion estimation, which can be applied to a device for performing motion estimation; the device for performing motion estimation can be a terminal with a video encoding function or a server with a video encoding function.
  • The embodiment of the present invention is described by taking a terminal having a video coding function as an example. Referring to FIG. 2, the method flow provided by the embodiment of the present invention includes:
  • the terminal constructs a candidate MV list for the PU based on the AMVP.
  • the terminal may divide the image to be encoded into at least one mutually independent macroblock according to a preset format, and each macroblock forms a PU.
  • the preset format is set by the terminal and can be 4*4, 8*8, 16*16 pixels, and the like.
  • the terminal may construct a candidate MV list for the PU, where the candidate MV list includes at least one MV of the current PU, and the at least one MV includes a time domain motion vector and a spatial domain motion vector.
  • When the terminal constructs a candidate MV list for the current PU based on the AMVP, the following steps 2011 to 2017 may be adopted:
  • The terminal constructs a spatial domain candidate list and a time domain candidate list.
  • The spatial domain candidate list includes at least one spatial domain motion vector of the current PU.
  • a0, a1, b0, b1, and b2 are macroblocks in a reference frame.
  • When constructing the spatial domain candidate list based on AMVP, the terminal first needs to select one candidate macroblock from a0 and a1, and then selects one candidate macroblock from b0, b1, and b2.
  • The order of selection for a0 and a1 is a0 -> a1 -> scaled a0 -> scaled a1, where "scaled" denotes the proportional scaling mode; the order of selection for b0, b1, and b2 is (scaled b0 -> scaled b1 -> scaled b2), b0 -> b1 -> b2.
  • The terminal acquires the spatial domain motion vectors corresponding to the candidate macroblocks and adds them to a list to obtain the spatial domain candidate list.
  • The proportional scaling mode is enclosed in parentheses because the proportional scaling mode and the ordinary non-scaling mode are mutually exclusive alternatives: depending on the reference macroblocks a0 and a1, either the proportional scaling mode is adopted, or otherwise the normal mode is adopted.
  • the time domain candidate list includes at least one time domain motion vector.
  • When constructing the time domain candidate list, the motion information of the PU in which the macroblock is located in the reference frame of the encoded image may be selected.
  • The reference macroblock in the reference frame of the encoded image is located at position H; if the macroblock at position H in the reference frame of the image to be encoded is available, the macroblock at position H is used as the candidate macroblock, and the time domain motion vector corresponding to the candidate macroblock is then added to a list to obtain the time domain candidate list.
  • The terminal selects a first preset number of spatial domain motion vectors from the spatial domain candidate list.
  • The first preset number is set by the terminal and may be two, three, or the like; two is preferred in the embodiment of the present invention.
  • The terminal selects a second preset number of time domain motion vectors from the time domain candidate list.
  • The second preset number is set by the terminal and may be one, two, or the like; one is preferred in the embodiment of the present invention.
  • the terminal constructs a first motion prediction list according to the first preset number of spatial domain motion vectors and the second preset number of time domain motion vectors.
  • the terminal obtains the first motion prediction list by adding the first preset number of spatial domain motion vectors and the second preset number of time domain motion vectors to the same list.
  • the terminal combines the same motion vectors in the first motion prediction list, and performs padding by using zero motion vectors to obtain a second motion prediction list.
  • the terminal may combine the same motion vectors in the first motion prediction list. Specifically, the terminal may combine the same motion vectors in the first preset number of spatial motion vectors in the first motion prediction list, and combine the same motion vectors in the second preset number of time domain motion vectors. When the same motion vector in the first motion prediction list is combined, the number of motion vectors in the first motion prediction list will be reduced, and at this time, the zero motion vector can be padded to obtain a second motion prediction list.
  • the terminal selects a third preset number of motion vectors from the second motion prediction list.
  • the third preset number may be set by the terminal, and the third preset number may be two, three, and the like. In order to improve the calculation precision, when the third preset number of motion vectors is selected from the second motion prediction list, it is ensured that the selected third preset number of motion vectors include both the time domain motion vector and the spatial domain motion vector.
  • the terminal constructs a candidate MV list according to the third preset number of motion vectors.
  • the terminal can obtain the candidate MV list by adding a third preset number of motion vectors to the same list.
  • For example, the terminal constructs a spatial domain candidate list and a time domain candidate list based on the AMVP, where the spatial domain candidate list includes five spatial domain motion vectors and the time domain candidate list includes two time domain motion vectors.
  • The terminal selects two spatial domain motion vectors from the spatial domain candidate list and one time domain motion vector from the time domain candidate list, combines the selected spatial domain motion vectors and time domain motion vector, and pads with zero motion vectors to obtain the candidate MV list.
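  • By way of illustration only, the following is a minimal C++ sketch of the list-construction logic of steps 2011 to 2017 described above; the type and function names (MV, buildAmvpCandidateList) are hypothetical and not part of the embodiment, and the preset numbers default to the preferred values mentioned above.

```cpp
#include <vector>
#include <algorithm>

// Hypothetical quarter-pel motion vector type used throughout these sketches.
struct MV { int x; int y; };
inline bool operator==(const MV& a, const MV& b) { return a.x == b.x && a.y == b.y; }

// Build the candidate MV list for one PU from already-derived spatial and
// temporal candidates: pick a few from each list, merge duplicates, pad with
// the zero MV, and keep a fixed number of candidates.
std::vector<MV> buildAmvpCandidateList(const std::vector<MV>& spatial,
                                       const std::vector<MV>& temporal,
                                       std::size_t numSpatial  = 2,   // first preset number
                                       std::size_t numTemporal = 1,   // second preset number
                                       std::size_t listSize    = 3) { // third preset number
    std::vector<MV> merged;
    for (std::size_t i = 0; i < spatial.size() && i < numSpatial; ++i)
        merged.push_back(spatial[i]);
    for (std::size_t i = 0; i < temporal.size() && i < numTemporal; ++i)
        merged.push_back(temporal[i]);

    // Merge identical motion vectors.
    std::vector<MV> unique;
    for (const MV& mv : merged)
        if (std::find(unique.begin(), unique.end(), mv) == unique.end())
            unique.push_back(mv);

    // Pad with zero motion vectors so the list always has listSize entries.
    while (unique.size() < listSize) unique.push_back(MV{0, 0});
    unique.resize(listSize);
    return unique;
}
```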
  • the terminal calculates a rate distortion cost of each MV in the candidate MV list, and obtains a minimum RDcost value of the AMVP and a target MV of the AMVP from the calculation result.
  • The terminal may use SATD (Sum of Absolute Transformed Differences) to calculate the RDcost of each MV in the candidate MV list, to obtain at least one RDcost value.
  • SATD refers to summing the absolute values of the prediction residual block after it has undergone the Hadamard transform.
  • The minimum RDcost value may then be selected from the at least one RDcost value; the selected RDcost value is used as the minimum RDcost value of the AMVP, and the MV corresponding to the minimum RDcost value of the AMVP is used as the target MV of the AMVP. The target MV of the AMVP is actually the optimal MV of the AMVP process.
  • The minimum RDcost value of the AMVP can be denoted cost_amvp, and the target MV of the AMVP can be denoted mv_amvp; mv_amvp can serve as the current best MV of the motion estimation, mv_best.
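  • The selection of the AMVP target MV can be pictured with the following hedged C++ sketch; rdCostOfCandidate is only a stand-in for the SATD-based rate distortion cost described above and is assumed to be provided elsewhere, and all names are hypothetical.

```cpp
#include <vector>
#include <cstdint>
#include <limits>

struct MV { int x; int y; };

// Stand-in for the SATD-based rate-distortion cost of predicting the current
// PU with a given candidate MV; a real encoder would Hadamard-transform the
// prediction residual and add the MV signalling cost. Assumed to exist.
uint32_t rdCostOfCandidate(const MV& mv);

// Pick the AMVP target MV (mv_amvp) and its minimum cost (cost_amvp).
MV selectAmvpTarget(const std::vector<MV>& candidateList, uint32_t& costAmvp) {
    costAmvp = std::numeric_limits<uint32_t>::max();
    MV mvAmvp{0, 0};
    for (const MV& mv : candidateList) {
        uint32_t cost = rdCostOfCandidate(mv);
        if (cost < costAmvp) { costAmvp = cost; mvAmvp = mv; }
    }
    return mvAmvp;  // mv_amvp can also serve as the running mv_best
}
```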
  • The terminal performs IME by using the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and acquires the target MV of the IME from the calculation result.
  • the terminal acquires the mapping point of the target MV of the AMVP in the reference frame, and performs the IME with the mapping point as the primary selection point.
  • During the IME, the terminal can determine an MV according to the position of each move, then calculate the rate distortion cost of the determined MV, and obtain an RDcost value.
  • the terminal obtains at least one RDcost value, and then selects the smallest RDcost value from the obtained at least one RDcost value, and uses the MV corresponding to the smallest RDcost value as the target MV of the IME.
  • the target MV of the IME is actually the optimal MV of the IME process, and the target MV of the IME can be represented by mv_ime.
  • When the mapping point of the target MV of the AMVP in the reference frame is used directly as the primary selection point of the IME, only the integer pixel position is taken into account and the sub-pixel position is ignored, so the determined primary selection point may not be accurate.
  • For example, suppose the mapping point of the target MV of the AMVP in the reference frame is (7, 8), and the corresponding integer pixel position obtained by truncation is (1, 2).
  • If (1, 2) is directly selected as the primary selection point and motion estimation is performed, the result is inaccurate, because the mapping point (7, 8) is in fact closer to the integer pixel (2, 2), and thus the determined position of the primary selection point of the IME is not accurate.
  • the method provided by the embodiment of the present invention also corrects the position of the mapping point of the target MV of the AMVP in the reference frame, and then performs the IME with the corrected position as the primary selection point.
  • The position of the primary selection point can be corrected by taking the sub-pixel position into account, so that the corrected primary selection point is closer to the actual integer pixel position.
  • If the position of the primary selection point before correction lies in the positive direction of the coordinate axis, 2 units are first added (2 is added to the original coordinate) and the result is then shifted right by 2 bits (equivalent to dividing by 4); if the position of the primary selection point before correction lies in the negative direction of the coordinate axis, 2 units are first subtracted (2 is subtracted from the original coordinate) and the result is then divided by 4; finally, rounding is performed to obtain the corrected integer pixel position, where the positive and negative directions are determined by the established coordinate system.
  • the corresponding integer pixel position can be determined as (-2, 2).
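  • A minimal sketch of the correction rule just described, assuming MV coordinates are stored in quarter-pixel units; the function name is hypothetical. Integer division in C++ truncates toward zero, which provides the final rounding step for both signs.

```cpp
// Round one quarter-pel coordinate to the nearest integer-pel coordinate:
// add 2 before dividing by 4 for positive coordinates (equivalent to a right
// shift by 2 bits), subtract 2 before dividing by 4 for negative ones.
int roundQuarterPelToIntPel(int coordQp) {
    return (coordQp >= 0) ? (coordQp + 2) / 4 : (coordQp - 2) / 4;
}

// Example: the mapping point (7, 8) in quarter-pel units is corrected to the
// integer-pel position (2, 2) instead of the truncated position (1, 2).
```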
  • The terminal enlarges the target MV of the IME to quarter-pixel precision to obtain the reference target MV of the QME, and determines whether the target MV of the AMVP is the same as the reference target MV of the QME. If they are the same, step 205 is performed; if not, steps 206 to 208 are performed.
  • Based on the obtained target MV of the IME, the terminal obtains the reference target MV of the QME by shifting mv_ime to the left by 2 bits (equivalent to multiplying by 4). The reference target MV of the QME is, in theory, the target MV that would be obtained by actually performing QME.
  • In the embodiment of the present invention, the terminal compares the target MV of the AMVP with the reference target MV of the QME to determine whether HME and QME are required.
  • When the target MV of the AMVP and the reference target MV of the QME are the same, it is determined that HME and QME need not be performed, and step 205 is directly performed; when the target MV of the AMVP and the reference target MV of the QME are different, it is determined that HME and QME are required, and steps 206 to 208 are performed.
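  • The early-termination check of this step can be sketched as follows, assuming mv_amvp is stored in quarter-pixel units and mv_ime in integer-pixel units; the struct and function names are hypothetical.

```cpp
struct MV { int x; int y; };

// Enlarge the IME target MV (integer-pel units) to quarter-pel precision by
// multiplying by 4 (equivalent to a left shift by 2 bits for non-negative
// components) and compare it with the AMVP target MV (already quarter-pel).
// Returning true means HME and QME can be skipped.
bool canSkipSubPelSearch(const MV& mvAmvpQp, const MV& mvImeIntPel) {
    MV refTargetQme{ mvImeIntPel.x * 4, mvImeIntPel.y * 4 };  // reference target MV of the QME
    return refTargetQme.x == mvAmvpQp.x && refTargetQme.y == mvAmvpQp.y;
}
```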
  • the terminal determines the target MV of the AMVP and the minimum RDcost value of the AMVP as the final result of the motion estimation process.
  • The terminal performs HME by using the mapping point of the target MV of the IME in the reference frame as the primary selection point, and acquires the target MV of the HME from the calculation result.
  • When the target MV of the AMVP and the reference target MV of the QME are different, in order to ensure video encoding accuracy, the terminal performs HME with the mapping point of the target MV of the IME in the reference frame as the primary selection point. Before performing HME, the terminal can obtain the half-pixel positions by interpolation from the integer pixel positions in the reference frame; based on the obtained half-pixel positions and the original integer pixel positions, the terminal performs HME with the mapping point of the target MV of the IME in the reference frame as the primary selection point.
  • the terminal determines an MV according to the position of each movement, and then calculates the rate distortion cost of the determined MV, and obtains the RDcost value. After the HME process ends, the terminal selects the smallest RDcost value from the obtained RDcost value, and uses the MV corresponding to the smallest RDcost value as the target MV of the HME.
  • the target MV of the HME is actually the optimal MV of the HME process, and the target MV of the HME can be represented by mv_hme.
  • the terminal performs QME with the mapping point of the target MV of the HME in the reference frame as the primary selection point, and obtains the minimum RDcost value of the QME and the target MV of the QME.
  • Similarly, before performing QME, the terminal can obtain the quarter-pixel positions by interpolation; then, based on the obtained quarter-pixel positions, the half-pixel positions, and the original integer pixel positions, the terminal performs QME with the mapping point of the target MV of the HME in the reference frame as the primary selection point.
  • the terminal determines an MV according to the position of each movement, and then calculates the rate distortion cost of the determined MV, and obtains the RDcost value.
  • the terminal selects the smallest RDcost value from the obtained RDcost value.
  • the minimum RDcost value is the minimum RDcost value of the QME.
  • the MV corresponding to the minimum RDcost value of the QME is the target MV of the QME.
  • The minimum RDcost value of the QME may be denoted cost_qme.
  • The target MV of the QME is actually the optimal MV of the QME process and may be denoted mv_qme.
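  • A hedged sketch of the half-pixel and quarter-pixel refinement described in the two steps above; rdCostAtQuarterPel stands in for the RDcost evaluation on interpolated reference samples and is assumed to exist, and the eight-neighbour pattern shown is only one possible search pattern, not necessarily the one used by the embodiment.

```cpp
#include <cstdint>

struct MV { int x; int y; };

// Assumed stand-in for the rate-distortion cost of a quarter-pel MV evaluated
// against the interpolated reference samples.
uint32_t rdCostAtQuarterPel(const MV& mvQp);

// Refine around a centre MV (in quarter-pel units) by testing the eight
// neighbours at the given step: step = 2 performs HME, step = 1 performs QME.
MV refineAroundCentre(MV centreQp, int step, uint32_t& bestCost) {
    static const int dx[8] = {-1, 0, 1, -1, 1, -1, 0, 1};
    static const int dy[8] = {-1, -1, -1, 0, 0, 1, 1, 1};
    MV best = centreQp;
    bestCost = rdCostAtQuarterPel(centreQp);
    for (int i = 0; i < 8; ++i) {
        MV cand{ centreQp.x + dx[i] * step, centreQp.y + dy[i] * step };
        uint32_t cost = rdCostAtQuarterPel(cand);
        if (cost < bestCost) { bestCost = cost; best = cand; }
    }
    return best;
}

// Usage sketch: mv_hme = refineAroundCentre(mv_ime_in_quarter_pel, 2, costHme);
//               mv_qme = refineAroundCentre(mv_hme, 1, costQme);   // cost_qme
```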
  • the terminal determines a final result of the motion estimation process according to the minimum RDcost value of the AMVP, the minimum RDcost value of the QME, the target MV of the AMVP, and the target MV of the QME.
  • By comparing the minimum RDcost value of the AMVP with the minimum RDcost value of the QME, the terminal can obtain a high-precision MV with the smallest possible amount of calculation. Specifically, the terminal determines the final result of the motion estimation process in, but not limited to, the following two cases:
  • In the first case, when the minimum RDcost value of the AMVP is less than the minimum RDcost value of the QME, the terminal determines the target MV of the AMVP and the minimum RDcost value of the AMVP as the final result of the motion estimation process.
  • In the second case, when the minimum RDcost value of the AMVP is greater than the minimum RDcost value of the QME, it indicates that the target MV of the QME obtained by performing HME and QME is more accurate than the target MV obtained by the AMVP method, and the terminal determines the target MV of the QME and the minimum RDcost value of the QME as the final result of the motion estimation process.
  • For ease of understanding, the entire motion estimation process described above is described below by taking FIG. 4 as an example.
  • The terminal divides the image to be encoded into at least one PU that does not overlap one another. For any PU, the terminal constructs a candidate MV list for the current PU based on the AMVP, and calculates a rate distortion cost for each MV in the candidate MV list to obtain at least one RDcost value. The terminal selects the smallest RDcost value from the at least one RDcost value, uses it as the minimum RDcost value of the AMVP, and uses the MV corresponding to that smallest RDcost value as the target MV of the AMVP.
  • the terminal performs the whole pixel motion estimation with the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and obtains the target MV of the IME from the calculation result.
  • the terminal amplifies the target MV of the IME to a quarter pixel precision to obtain a reference target MV of the QME.
  • When the target MV of the AMVP is the same as the reference target MV of the QME, the terminal does not need to perform HME and QME, and directly takes the minimum RDcost value of the AMVP and the target MV of the AMVP as the final result of the motion estimation process. When the target MV of the AMVP and the reference target MV of the QME are different, the terminal performs HME by using the mapping point of the target MV of the IME in the reference frame as the primary selection point and obtains the target MV of the HME from the calculation result; the terminal then performs QME with the mapping point of the target MV of the HME in the reference frame as the primary selection point, and the minimum RDcost value of the QME and its corresponding target MV are obtained from the calculation result.
  • the terminal compares the minimum RDcost value of the AMVP with the minimum RDcost value of the QME. If the minimum RDcost value of the AMVP is less than the minimum RDcost value of the QME, the minimum RDcost value of the AMVP and the target MV of the AMVP are used as the motion. The final result of the estimation process, if the minimum RDcost value of the AMVP is greater than the minimum RDcost value of the QME, the minimum RDcost value of the QME and its corresponding target MV are used as the final result of the motion estimation process.
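  • Tying the steps of FIG. 4 together, the following C++ sketch shows only the overall control flow under the same assumptions as the earlier sketches; every helper function is hypothetical and merely forward-declared here, and this is a sketch rather than a definitive implementation.

```cpp
#include <cstdint>
#include <vector>

struct MV { int x; int y; };
struct MeResult { MV mv; uint32_t cost; };

// Hypothetical helpers, assumed to follow the earlier sketches in this section.
std::vector<MV> buildAmvpCandidateList(const std::vector<MV>& spatial, const std::vector<MV>& temporal);
MV selectAmvpTarget(const std::vector<MV>& candidates, uint32_t& costAmvp);     // mv_amvp, cost_amvp
MV integerMotionEstimation(const MV& primarySelectionPointQp);                  // returns mv_ime in integer pels
MV refineAroundCentre(MV centreQp, int step, uint32_t& bestCost);               // step 2 = HME, step 1 = QME

MeResult motionEstimatePu(const std::vector<MV>& spatial, const std::vector<MV>& temporal) {
    uint32_t costAmvp = 0;
    MV mvAmvp = selectAmvpTarget(buildAmvpCandidateList(spatial, temporal), costAmvp);

    MV mvIme = integerMotionEstimation(mvAmvp);            // IME seeded by mv_amvp
    MV refQme{ mvIme.x * 4, mvIme.y * 4 };                 // reference target MV of the QME

    if (refQme.x == mvAmvp.x && refQme.y == mvAmvp.y)      // early termination
        return { mvAmvp, costAmvp };                       // HME and QME are skipped

    uint32_t costHme = 0, costQme = 0;
    MV mvHme = refineAroundCentre(refQme, 2, costHme);     // half-pixel motion estimation
    MV mvQme = refineAroundCentre(mvHme, 1, costQme);      // quarter-pixel motion estimation
    (void)costHme;

    return (costAmvp < costQme) ? MeResult{ mvAmvp, costAmvp }
                                : MeResult{ mvQme, costQme };
}
```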
  • the method for performing motion estimation provided by the embodiment of the present invention can be applied to a video codec process.
  • the video codec process includes the following steps:
  • the transmitting end inputs the video signal to be encoded into the video encoder.
  • the sending end is the terminal or the server in the embodiment of the present invention.
  • the video signal to be encoded is a digital signal.
  • the video encoder encodes the video signal to be encoded in units of frames to obtain a multi-frame encoded image.
  • the specific coding process is:
  • the video encoder encodes the image to be encoded in the first frame to obtain a first frame encoded image (reference frame);
  • the video encoder divides the second frame to be encoded into at least one mutually non-overlapping PU;
  • the video encoder performs motion estimation on each PU by using the method provided by the embodiment of the present invention to obtain the optimal MV of each PU, stores the optimal MV of each PU, and determines, according to the optimal MV of each PU, the predicted image block of each PU in the encoded image of the first frame;
  • the video encoder obtains a prediction residual block by making a difference between the predicted image block and each PU;
  • the video encoder obtains quantized DCT coefficients by performing discrete cosine transform and quantization processing on the prediction residual block, performs entropy coding on the quantized DCT coefficients, and performs inverse quantization processing and inverse DCT transform on the quantized DCT coefficients to obtain a residual block of the reconstructed image; the reconstructed image is obtained by adding the residual block to the predicted image block, and the reconstructed image is subjected to deblocking filtering and adaptive pixel compensation to obtain the encoded second frame, which serves as the reference image for the next frame to be encoded;
  • the video encoder encodes the remaining frames to be encoded by cyclically performing the above steps (c) to (e) until all the images are encoded.
  • the receiving end decompresses and decapsulates the processed image to obtain an encoded image, and inputs the encoded image into the video decoder.
  • the video decoder decodes the encoded image to obtain a video signal, and then plays the video signal.
  • The decoding process of the video decoder is as follows: the video decoder decodes the encoded image of the first frame to obtain the first frame image, performs image reconstruction according to the prediction residual block between the first frame image and the second frame image and the first frame image to obtain the second frame image, performs image reconstruction according to the prediction residual block between the second frame image and the third frame image and the second frame image to obtain the third frame image, and so on, until all images are decoded.
  • the video codec method shown in FIG. 5 can be applied to home theater, remote monitoring, digital broadcasting, mobile streaming media, portable imaging, and medical imaging to meet the video viewing needs of users in different fields.
  • The method provided by the embodiment of the present invention calculates the RDcost of each MV in the candidate MV list, obtains the target MV of the AMVP, performs IME by using the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and enlarges the target MV of the IME to quarter-pixel precision to obtain the reference target MV of the QME.
  • When the target MV of the AMVP is the same as the reference target MV of the QME, HME and QME are not required, and the target MV of the AMVP and the minimum RDcost value of the AMVP are directly used as the final result, thereby reducing the amount of calculation for performing HME and QME, shortening the duration of the motion estimation process, and reducing resource consumption.
  • an embodiment of the present invention provides a schematic structural diagram of an apparatus for performing motion estimation, where the apparatus includes:
  • a list construction module 601 configured to construct, for any prediction unit PU in the image to be encoded, a candidate motion vector MV list based on advanced motion vector prediction AMVP, the candidate MV list including at least one MV of the PU;
  • a calculation module 602 configured to calculate a rate distortion cost RDcost of each MV in the candidate MV list
  • the obtaining module 603 is configured to obtain, from the calculation result, a target MV of the AMVP and a minimum RDcost value of the AMVP;
  • the calculating module 602 is configured to perform IME by using a mapping point of the target MV of the AMVP in the reference frame as a primary selection point;
  • the obtaining module 603 is configured to obtain a target MV of the IME from the calculation result
  • the precision amplification module 604 is configured to amplify the target MV of the IME to a quarter pixel precision to obtain a reference target MV of the QME;
  • a determining module 605 configured to determine a target MV of the AMVP and a minimum RDcost value of the AMVP as a final result of the motion estimation process when the target MV of the AMVP and the reference target MV of the QME are the same;
  • wherein each target MV is the MV corresponding to the minimum RDcost value of its respective motion estimation.
  • the calculating module 602 is configured to: when the target MV of the AMVP and the reference target MV of the QME are different, perform HME by using a mapping point of the target MV of the IME in the reference frame as a primary selection point;
  • the obtaining module 603 is configured to obtain a target MV of the HME from the calculation result
  • the calculation module 602 is configured to perform QME with the mapping point of the target MV of the HME in the reference frame as the primary selection point, and obtain the minimum RDcost value of the QME and the target MV of the QME;
  • the determining module 605 is configured to determine a final result of the motion estimation process according to the minimum RDcost value of the AMVP, the minimum RDcost value of the QME, the target MV of the AMVP, and the target MV of the QME.
  • the determining module 605 is configured to: when the minimum RDcost value of the AMVP is less than the minimum RDcost value of the QME, determine the target MV of the AMVP and the minimum RDcost value of the AMVP as the final result of the motion estimation process; and when the minimum RDcost value of the AMVP is greater than the minimum RDcost value of the QME, determine the target MV of the QME and the minimum RDcost value of the quarter-pixel motion estimation as the final result of the motion estimation process.
  • the list construction module 601 is configured to: construct a spatial domain candidate list and a time domain candidate list based on the advanced motion vector prediction AMVP, where the spatial domain candidate list includes at least one spatial domain motion vector of the PU and the time domain candidate list includes at least one time domain motion vector of the PU; select a first preset number of spatial domain motion vectors from the spatial domain candidate list; select a second preset number of time domain motion vectors from the time domain candidate list; construct a first motion prediction list according to the first preset number of spatial domain motion vectors and the second preset number of time domain motion vectors; combine the same motion vectors in the first motion prediction list and pad with zero motion vectors to obtain a second motion prediction list; select a third preset number of motion vectors from the second motion prediction list; and construct the candidate MV list according to the third preset number of motion vectors.
  • the calculating module 602 is further configured to calculate the RDcost of each MV in the candidate MV list by using the sum of absolute transformed differences SATD of the prediction residual, to obtain at least one RDcost value;
  • the obtaining module 603 is further configured to select a minimum RDcost value from the at least one RDcost value, use the selected RDcost value as the minimum RDcost value of the AMVP, and use the MV corresponding to the minimum RDcost value of the AMVP as the target MV of the AMVP.
  • the apparatus further includes:
  • a position correction module configured to correct a position of a mapping point of the target MV of the AMVP in the reference frame
  • the calculation module 602 is configured to perform IME with the corrected position as a primary selection point.
  • The apparatus calculates the RDcost of each MV in the candidate MV list, obtains the target MV of the AMVP, and performs IME by using the mapping point of the target MV of the AMVP in the reference frame as the primary selection point.
  • The target MV of the IME is enlarged to quarter-pixel precision, and the reference target MV of the QME is obtained.
  • When the target MV of the AMVP is the same as the reference target MV of the QME, HME and QME are not performed, and the target MV of the AMVP and the minimum RDcost value of the AMVP are directly used as the final result, which reduces the amount of calculation for HME and QME, shortens the duration of the motion estimation process, and reduces resource consumption.
  • FIG. 7 is a schematic structural diagram of a terminal for performing motion estimation according to an embodiment of the present invention.
  • the terminal may be used to implement a method for performing motion estimation provided in an embodiment of the present invention. Specifically:
  • The terminal 700 may include an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a WiFi (Wireless Fidelity) module 170, a processor 180 having one or more processing cores, a power supply 190, and the like. It will be understood by those skilled in the art that the terminal structure shown in FIG. 7 does not constitute a limitation to the terminal, and the terminal may include more or fewer components than those illustrated, combine certain components, or use a different component arrangement, wherein:
  • The RF circuit 110 can be used for receiving and transmitting signals during information transmission and reception or during a call. Specifically, after downlink information from a base station is received, it is handed over to one or more processors 180 for processing; in addition, uplink data is sent to the base station.
  • the RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier). , duplexer, etc.
  • RF circuitry 110 can also communicate with the network and other devices via wireless communication.
  • the wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access). , Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like.
  • GSM Global System of Mobile communication
  • GPRS General Packet Radio Service
  • CDMA Code Division Multiple Access
  • WCDMA Wideband Code Division Multiple Access
  • LTE Long Term Evolution
  • SMS Short Messaging Service
  • the memory 120 can be used to store software programs and modules, and the processor 180 executes various functional applications and data processing by running software programs and modules stored in the memory 120.
  • the memory 120 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may be stored according to The data created by the use of the terminal 700 (such as audio data, phone book, etc.) and the like.
  • memory 120 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 120 may also include a memory controller to provide access to memory 120 by processor 180 and input unit 130.
  • the input unit 130 can be configured to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function controls.
  • input unit 130 can include touch-sensitive surface 131 as well as other input devices 132.
  • The touch-sensitive surface 131, also referred to as a touch display or trackpad, can collect touch operations performed by the user on or near it (such as operations performed by the user with a finger, a stylus, or any other suitable object or accessory on or near the touch-sensitive surface 131) and drive the corresponding connecting device according to a preset program.
  • the touch-sensitive surface 131 can include two portions of a touch detection device and a touch controller.
  • The touch detection device detects the touch orientation of the user, detects the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends the coordinates to the processor 180, and can also receive commands from the processor 180 and execute them.
  • the touch-sensitive surface 131 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 130 can also include other input devices 132.
  • other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • Display unit 140 can be used to display information entered by the user or information provided to the user and various graphical user interfaces of terminal 700, which can be constructed from graphics, text, icons, video, and any combination thereof.
  • the display unit 140 may include a display panel 141.
  • the display panel 141 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
  • the touch-sensitive surface 131 may cover the display panel 141, and when the touch-sensitive surface 131 detects a touch operation thereon or nearby, it is transmitted to the processor 180 to determine the type of the touch event, and then the processor 180 according to the touch event The type provides a corresponding visual output on display panel 141.
  • Although in FIG. 7 the touch-sensitive surface 131 and the display panel 141 are implemented as two separate components to implement the input and output functions, in some embodiments the touch-sensitive surface 131 can be integrated with the display panel 141 to implement both the input and output functions.
  • Terminal 700 can also include at least one type of sensor 150, such as a light sensor, motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 141 according to the brightness of the ambient light, and the proximity sensor may close the display panel 141 when the terminal 700 moves to the ear. / or backlight.
  • As one type of motion sensor, the gravity acceleration sensor can detect the magnitude of acceleration in all directions (usually three axes) and, when stationary, can detect the magnitude and direction of gravity; it can be used for applications that recognize the posture of the mobile phone (such as switching between landscape and portrait screens, related games, and magnetometer attitude calibration) and for vibration-recognition related functions (such as a pedometer or tapping). The terminal 700 may also be configured with a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and other sensors, which are not described herein again.
  • the audio circuit 160, the speaker 161, and the microphone 162 can provide an audio interface between the user and the terminal 700.
  • On the one hand, the audio circuit 160 can transmit the electrical signal converted from the received audio data to the speaker 161, which converts it into a sound signal for output; on the other hand, the microphone 162 converts the collected sound signal into an electrical signal, which is received by the audio circuit 160 and converted into audio data; the audio data is then processed by the audio data output processor 180 and transmitted, for example, via the RF circuit 110 to another terminal, or output to the memory 120 for further processing.
  • the audio circuit 160 may also include an earbud jack to provide communication of the peripheral earphones with the terminal 700.
  • WiFi is a short-range wireless transmission technology
  • the terminal 700 can help users to send and receive emails, browse web pages, and access streaming media through the WiFi module 170, which provides wireless broadband Internet access for users.
  • Although FIG. 7 shows the WiFi module 170, it can be understood that the WiFi module is not an essential part of the terminal 700 and may be omitted as needed without changing the essence of the invention.
  • the processor 180 is the control center of the terminal 700, connecting various portions of the entire handset with various interfaces and lines, by running or executing software programs and/or modules stored in the memory 120, and recalling data stored in the memory 120, The various functions and processing data of the terminal 700 are performed to perform overall monitoring of the mobile phone.
  • the processor 180 may include one or more processing cores; optionally, the processor 180 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, and an application. Etc.
  • the modem processor primarily handles wireless communications. It can be understood that the above modem processor may not be integrated into the processor 180.
  • the terminal 700 also includes a power source 190 (such as a battery) for powering various components.
  • a power source 190 such as a battery
  • the power source can be logically coupled to the processor 180 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • Power supply 190 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
  • the terminal 700 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • the display unit of the terminal 700 is a touch screen display, and the terminal 700 further includes a memory, where the memory stores at least one instruction, at least one program, a code set or an instruction set, and the at least one instruction and the instruction The at least one program, the set of codes, or the set of instructions is loaded by the processor and performs the method of motion estimation illustrated in FIG. 2.
  • The terminal provided by the embodiment of the present invention calculates the RDcost of each MV in the candidate MV list, obtains the target MV of the AMVP, performs IME by using the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and enlarges the target MV of the IME to quarter-pixel precision to obtain the reference target MV of the QME.
  • When the target MV of the AMVP is the same as the reference target MV of the QME, HME and QME are not required, and the target MV of the AMVP and the minimum RDcost value of the AMVP are directly used as the final result, thereby reducing the amount of calculation for performing HME and QME, shortening the duration of the motion estimation process, and reducing resource consumption.
  • FIG. 8 is a diagram of a server for performing motion estimation, according to an exemplary embodiment.
  • server 800 includes a processing component 822 that further includes one or more processors, and memory resources represented by memory 832 for storing instructions executable by processing component 822, such as an application.
  • An application stored in memory 832 may include one or more modules each corresponding to a set of instructions.
  • processing component 822 is configured to execute instructions to perform the method of motion estimation illustrated in FIG. 2 above.
  • Server 800 may also include a power component 826 configured to perform power management of server 800, a wired or wireless network interface 850 configured to connect server 800 to the network, and an input/output (I/O) interface 858.
  • Server 800 can operate based on an operating system stored in the memory 832, for example, Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
  • The server provided by the embodiment of the present invention calculates the RDcost of each MV in the candidate MV list, obtains the target MV of the AMVP, performs IME by using the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and enlarges the target MV of the IME to quarter-pixel precision to obtain the reference target MV of the QME.
  • When the target MV of the AMVP is the same as the reference target MV of the QME, HME and QME are not required, and the target MV of the AMVP and the minimum RDcost value of the AMVP are directly used as the final result, thereby reducing the amount of calculation for performing HME and QME, shortening the duration of the motion estimation process, and reducing resource consumption.
  • The embodiment of the present invention further provides a computer readable storage medium, where the storage medium stores at least one instruction, at least one program, a code set, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by a processor to implement the method of motion estimation illustrated in FIG. 2.
  • With the computer readable storage medium provided by this embodiment of the present invention, the RDcost of each MV in the candidate MV list is calculated, the target MV of the AMVP is obtained, and IME is performed by using the mapping point of the target MV of the AMVP in the reference frame as the primary selection point.
  • The target MV of the IME is enlarged to quarter-pel precision to obtain the reference target MV of the QME.
  • When the target MV of the AMVP is the same as the reference target MV of the QME, HME and QME are not required, and the target MV of the AMVP and the minimum RDcost value of the AMVP are directly used as the final result, which reduces the amount of computation for HME and QME, shortens the duration of the motion estimation process, and reduces resource consumption.
  • It should be noted that when the apparatus and the device for motion estimation provided by the foregoing embodiments of the present invention perform motion estimation, only the division of the foregoing functional modules is used as an example for description. In practical applications, the foregoing functions may be allocated to different functional modules as required, that is, the internal structures of the apparatus and the device for motion estimation are divided into different functional modules to complete all or part of the functions described above.
  • In addition, the apparatus and the device for motion estimation provided by the foregoing embodiments belong to the same concept as the method embodiment of motion estimation. For the specific implementation process, refer to the method embodiment, and details are not described herein again.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A motion estimation method, apparatus, and storage medium, belonging to the field of Internet technologies. The method includes: obtaining the target MV of the AMVP and the minimum RDcost value of the AMVP; enlarging the target MV of the IME, obtained by performing IME with the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, to quarter-pel precision to obtain the reference target MV of the QME; and when the target MV of the AMVP is the same as the reference target MV of the QME, determining the target MV of the AMVP and the minimum RDcost value of the AMVP as the final result. The method obtains the target MV of the AMVP, performs IME with the mapping point of the target MV of the AMVP in the reference frame as the primary selection point, and enlarges the target MV of the IME to quarter-pel precision to obtain the reference target MV of the QME; when the target MV of the AMVP is the same as the reference target MV of the QME, HME and QME are not required, and the target MV of the AMVP and the minimum RDcost value of the AMVP are directly taken as the final result, which reduces the amount of calculation for HME and QME, shortens the duration of the motion estimation process, and reduces resource consumption.

Description

进行运动估计的方法、装置、设备及存储介质
本申请要求于2017年09月28日提交中国国家知识产权局、申请号为2017108944927、发明名称为“进行运动估计的方法、装置及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本发明涉及互联网技术领域,特别涉及一种进行运动估计的方法、装置、设备及存储介质。
背景技术
运动估计为视频编码中最重要的组成部分,是指将每帧图像分割成至少一个互不重叠的宏块,并按照指定搜索算法在参考帧的指定区域内搜索出与每个宏块最相似的匹配块的过程。运动估计不仅能够降低视频编码过程的复杂度,而且能够减少视频传输过程的比特数,因而在视频编码过程中有必要进行运动估计。
相关技术在进行运动估计时,主要采用如下方法:采用AMVP(Advanced Motion Vector Prediction,高级运动向量预测)方法,利用空域运动向量和时域运动向量的相关性,为当前PU(Predicting Unit,预测单元)(PU即宏块)建立候选MV(Motion Vector,运动向量)列表;采用SAD(Sum of Absolute Differences,绝对误差和)方法,计算候选MV列表中每个MV的RDcost(Rate Distortioncost,率失真代价),得到至少一个RDcost值;从至少一个RDcost值中,获取最小的RDcost值,并将最小的RDcost值对应的MV作为AMVP的目标MV;以AMVP的目标MV在参考帧中的映射点为初选点进行IME(Integer Motion Estimation,整像素运动估计),并从计算结果中获取IME的目标MV;以IME的目标MV在参考帧中的映射点为初选点进行HME(Half Motion Estimation,二分之一像素运动估计),并从计算结果中获取HME的目标MV;以HME的目标MV在参 考帧中的映射点为初选点进行QME(Quarter Motion Estimation,四分之一像素运动估计),并从计算结果中获取QME的目标MV和QME的最小RDcost值,将该QME的目标MV和QME的最小RDcost值确定为运动估计过程的最终结果。
然而,进行二分之一像素运动估计和四分之一像素运动估计的计算量较大,导致运动估计过程时间较长,资源消耗较大。
发明内容
为了缩短运动估计过程时长,降低运动估计过程的资源消耗,本发明实施例提供了一种进行运动估计的方法、装置、设备及存储介质。所述技术方案如下:
一方面,提供了一种进行运动估计的方法,所述方法应用于进行运动估计的设备中,所述方法包括:
对于待编码图像中任一预测单元PU,基于高级向量预测AMVP为所述PU构建候选运动向量MV列表,所述候选MV列表包括所述PU的至少一个MV;
计算所述候选MV列表中每个MV的率失真代价RDcost,并从计算结果中获取AMVP的目标MV和AMVP的最小RDcost值;
以所述AMVP的目标MV在参考帧中的映射点为初选点进行整像素运动估计IME,并从计算结果中获取IME的目标MV;
将所述IME的目标MV放大到四分之一像素精度，得到四分之一像素运动估计QME的参考目标MV；
当所述AMVP的目标MV和所述QME的参考目标MV相同时,将所述AMVP的目标MV和所述AMVP的最小RDcost值确定为运动估计过程的最终结果;
其中,每种目标MV为每种运动估计的最小RDcost值对应的MV。
另一方面,提供了一种进行运动估计的装置,所述装置设置于进行运动估计的设备中,所述装置包括:
列表构建模块,用于对于待编码图像中任一预测单元PU,基于高级向量预测AMVP为所述PU构建候选运动向量MV列表,所述候选MV列表包括所述 PU的至少一个MV;
计算模块,用于计算所述候选MV列表中每个MV的率失真代价RDcost;
获取模块,用于从计算结果中获取AMVP的目标MV和AMVP的最小RDcost值;
所述计算模块,用于以所述AMVP的目标MV在参考帧中的映射点为初选点进行IME;
所述获取模块,用于从计算结果中获取IME的目标MV;
精度放大模块,用于将所述IME的目标MV放大到四分之一像素精度,得到QME的参考目标MV;
确定模块,用于当所述AMVP的目标MV和所述QME的参考目标MV相同时,将所述AMVP的目标MV和所述AMVP的最小RDcost值确定为运动估计过程的最终结果;
其中,每种目标MV为每种运动估计的最小RDcost值对应的MV。
另一方面,提供了一种用于进行运动估计的设备,所述设备包括处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或所述指令集由所述处理器加载并执行以实现进行运动估计的方法。
另一方面，提供了一种计算机可读存储介质，所述存储介质中存储有至少一条指令、至少一段程序、代码集或指令集，所述至少一条指令、所述至少一段程序、所述代码集或所述指令集由处理器加载并执行以实现进行运动估计的方法。
本发明实施例提供的技术方案带来的有益效果是:
计算候选MV列表中每个MV的RDcost,获取AMVP的目标MV,以AMVP的目标MV在参考帧中的映射点为初选点进行IME,通过将IME的目标MV放大到四分之一精度,获取QME的参考目标MV,当AMVP的目标MV与QME的参考目标MV相同时,无需进行HME和QME,直接将AMVP的目标MV和AMVP的最小RDcost值作为最终结果,从而减小了进行HME和QME计算的计算量,缩短了运动估计过程的时长,同时降低了资源消耗。
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对本发明实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本发明实施例提供的HEVC编码框架图；
图2是本发明实施例提供的一种进行运动估计的方法流程图;
图3(A)是本发明实施例提供的一种空域运动向量的构造过程示意图;
图3(B)是本发明实施例提供的一种时域运动向量的构造过程示意图;
图3(C)是本发明实施例提供的基于AMVP构建候选MV列表的示意图;
图4是本发明实施例提供的进行运动估计的过程示意图;
图5是本发明实施例提供的视频编解码过程的示意图;
图6是本发明实施例提供的一种进行运动估计的装置结构示意图;
图7示出了本发明实施例所涉及的进行运动估计的终端的结构示意图;
图8是根据一示例性实施例示出的一种用于进行运动估计的服务器。
具体实施方式
为使本发明实施例的目的、技术方案和优点更加清楚,下面将结合附图对本发明实施例作进一步地详细描述。
随着互联网技术、数字通信技术等各种数字化技术的发展，用户的生活方式和工作方式逐渐发生改变。为了满足家庭影院、远程监控、数字广播、移动流媒体、便携摄像和医学成像等领域的需求，高清晰度、高压缩率及高帧率成为了视频编码未来的发展趋势。由于现有的H.264视频编码技术方式本身存在一定的局限性，无法满足未来视频编码的需求，因而HEVC(High Efficiency Video Coding,高效率视频编码)视频编码技术应运而生。
HEVC也称为H.265视频编码技术,与传统的H.264视频编码技术相比,具有如下优点:
(1)、高压缩率,压缩率提高了50%,这意味着相同的画面质量只需要一半的比特率。
(2)、高帧率，在实时编码上相同的画面质量减少35%的带宽损耗。
(3)、高清晰度,支持更大的视频分辨率,包括2K和4K等。
(4)、低成本,可在低比特率上传输标准清晰度和高清晰度的视频数据。
图1示出了HEVC的编码框架,参见图1,HEVC编码过程如下:
第一步,对于任一帧待编码图像,将该待编码图像分割为至少一个互不重叠的PU;
第二步,将该待编码图像输入到编码器中进行编码预测,该过程主要利用视频数据的空间相关性和时间相关性,采用帧内预测或帧间预测去除每个PU的时空域冗余信息,得到每个PU在参考帧中的预测图像块。
第三步，将预测图像块和原始PU作差，得到预测残差块，并对预测残差块分别进行DCT(Discrete Cosine Transform,离散余弦变换)变换和量化处理，得到量化的DCT系数。
其中，DCT是一种与傅里叶变换紧密相关的数学运算，在傅里叶级数展开式中，如果被展开的函数为实偶函数，则其傅里叶级数中只包含余弦项，在处理时先将其离散化再进行余弦变换。量化处理为数字信号处理领域一种常用的技术，是指将信号的连续取值(或者大量可能的离散取值)近似为有限多个(或较少的)离散值的过程。量化处理主要应用于从连续信号到数字信号的转换中，连续信号经过采样成为离散信号，离散信号经过量化即成为数字信号。
第四步，将量化的DCT系数进行熵编码，得到压缩码流并输出。
第五步，将量化的DCT系数进行反量化处理和反DCT变换，得到重构图像的残差块，进而将重构图像的残差块与帧内或帧间的预测图像块相加，得到重构图像。
第六步,将重构图像经过DB(Deblocking Filter,去块滤波)和SAO(Sample Adaptive Offset,自适应像素补偿)处理后,加入到参考帧队列中,并作为下一帧待编码图像的参考帧。通过循环执行上述第一步至第六步使得视频图像能够一帧帧地向后编码。
其中,DB的主要作用是增强图像的边界,降低图像边界不连续性。自适应像素补偿主要用于对经过DB处理后的图像进行局部信息补偿,以减少源图像和重构图像之间的失真。
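As an illustration only, the six encoding steps above can be condensed into the following toy per-frame loop. It is a minimal sketch of the data flow (predict, transform, quantize, reconstruct, reuse as reference), not the HEVC reference encoder: the helper names, the fixed block size, the flat quantization step of 16, and the use of SciPy's DCT are all assumptions made for this example, and the DB/SAO in-loop filtering of step six is omitted.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    # 2-D DCT-II with orthonormal scaling (stands in for the DCT of step 3).
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(coeff):
    # Inverse 2-D DCT (stands in for the inverse transform of step 5).
    return idct(idct(coeff, axis=0, norm='ortho'), axis=1, norm='ortho')

def encode_sequence(frames, block=16, qstep=16.0):
    """Toy sketch of steps 1-6 for grayscale frames (2-D numpy arrays)."""
    reference = None
    bitstream = []                         # step 4: quantized coefficients
    for frame in frames:
        recon = np.zeros_like(frame, dtype=np.float64)
        h, w = frame.shape
        for y in range(0, h, block):       # step 1: split into PUs
            for x in range(0, w, block):
                cur = frame[y:y + block, x:x + block].astype(np.float64)
                # step 2: predict from the co-located block of the reference
                # frame, or use a flat DC prediction for the first frame.
                pred = (np.full_like(cur, cur.mean()) if reference is None
                        else reference[y:y + block, x:x + block])
                # step 3: residual, DCT and quantization.
                q = np.round(dct2(cur - pred) / qstep)
                bitstream.append(((y, x), q))
                # step 5: inverse quantization / inverse DCT, reconstruct.
                recon[y:y + block, x:x + block] = pred + idct2(q * qstep)
        # step 6: (after DB/SAO, omitted here) keep as the next reference.
        reference = recon
    return bitstream
```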
运动估计
考虑到现实生活中物体运动是连续的,一个连续的视频序列中前后两帧视频图像之间的差异比较小,可能只是物体的相对位置发生了变化,或者只是这两帧图像在边界上发生了变化。对于视频编码器而言,如果对视频序列中的每帧视频图像进行编码,则会造成码流的极大浪费,如果根据两幅图像的差异和参考帧进行编码,则可大大减低码流的浪费。
运动估计的基本思想是将图像序列的每一帧分割成许多互不重叠的宏块,并设定宏块内所有像素的位移量都相同,然后对每个宏块按照指定搜索算法和指定匹配准则在参考帧的指定区域内进行搜索,以搜索出与每个宏块最相似的匹配块,该匹配块与目标块(例如是当前宏块)的相对位移即为运动向量。在进行视频压缩时,只需存储运动向量、残差块及参考帧就可以恢复出目标块。其中,通过运动估计可以去除帧间冗余度,使得视频传输的比特数大为减少。其中,指定搜索算法包括全局搜索算法、分数精度搜索算法、快速搜索算法、分级数搜索算法、混合搜索算法等。指定匹配准则包括MAD(Mean Absolute Difference,平均绝对值差)、MSE(Mean Squared Error,平均平方误差)等。
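As a concrete, deliberately naive illustration of the block-matching idea just described, the sketch below runs an exhaustive full search with SAD as the matching criterion. The search range, the grayscale block representation, and the SAD criterion are choices made only for this example; the method described later in this document evaluates candidates with an RDcost instead.

```python
import numpy as np

def full_search_sad(cur_block, ref_frame, top_left, search_range=8):
    """Exhaustive block matching: try every displacement within the search
    range and return the motion vector with the smallest SAD."""
    y0, x0 = top_left
    bh, bw = cur_block.shape
    best_mv, best_sad = (0, 0), float('inf')
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = y0 + dy, x0 + dx
            if y < 0 or x < 0 or y + bh > ref_frame.shape[0] or x + bw > ref_frame.shape[1]:
                continue  # candidate block would fall outside the reference frame
            cand = ref_frame[y:y + bh, x:x + bw]
            sad = int(np.abs(cur_block.astype(np.int64) - cand.astype(np.int64)).sum())
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad
```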
采用HEVC进行视频编码时,由于待编码图像的分割精度要求更为细致,分割方向也更多,因而编码过程中计算量更大,如果要实现高压缩性能,则需要对编码器进行优化。根据实验数据可知,目前帧间预测和编码部分的计算量相对较大,大约占整个视频编码过程计算量的90%;帧内预测和编码部分的计算量相对较小,大约占整个视频编码过程计算量的8%左右;区块滤波和自适应像素补偿部分的计算量相对最小,大约占整个视频编码过程计算量的1%左右。对于帧间预测来说,运动估计的计算量所占的比重比较大,大约占整个视频编码过程计算量的30%~40%。随着其他部分性能的优化,运动估计的计算量所占的比重将越来越大。由于在视频编码过程中如何进行运动估计,直接影响着视频编码过程的计算量,因此,亟需一种新的运动估计的方法,以降低运动估计过程的计算量,缩短视频编码时间,提高视频编码效率。
为了降低运动估计过程的计算量,缩短视频编码时间,提高视频编码效率,本发明实施例提供了一种进行运动估计的方法,该方法可以应用于进行运动估 计的设备,该进行运动估计的设备可以为具有视频编码功能的终端,也可以为具有视频编码功能的服务器。以具有视频编码功能的终端执行本发明实施例为例,参见图2,本发明实施例提供的方法流程包括:
201、对于待编码图像中任一PU,终端基于AMVP为该PU构建候选MV列表。
在视频编码过程中,对于任一待编码图像,终端可按照预设格式将该待编码图像分割成至少一个相互独立的宏块,每个宏块形成一个PU。该预设格式由终端设置,可以为4*4、8*8、16*16像素等。对于每个PU,终端可为该PU构建一个候选MV列表,该候选MV列表中包括当前PU的至少一个MV,该至少一个MV包括时域运动向量和空域运动向量。
在本发明实施例中,终端基于AMVP为当前PU构建候选MV列表时,可采用如下步骤2011~2017:
2011、基于AMVP,终端构建空域候选列表和时域候选列表。
其中,空域候选列表包括当前PU的至少一个空域运动向量。以图3(A)为例,a0、a1、b0、b1、b2为参考帧中的宏块,基于AMVP构建空域候选列表时,终端首先需要从a0、a1中选出一个候选宏块,从b0、b1、b2中选出一个候选宏块。对于a0、a1的选择顺序为a0->a1->scaled a0->scaled a1,该scaled为比例伸缩模式;对于b0、b1、b2的选择顺序为(scaled b0->scaled b1->scaled b2)b0->b1->b2。然后,终端获取候选宏块对应的空域运动向量,将候选宏块对应的空域候选向量加入到一个列表中,得到空域候选列表。
需要说明的是，上述比例伸缩模式之所以采用括号括起来，是因为该比例伸缩模式与普通非比例伸缩模式为二选一的过程，当满足以下条件时，a0、a1均为参考宏块或者存在时其预测模式不是帧内预测，则采用比例伸缩模式，反之，采用普通模式。
其中,时域候选列表包括至少一个时域运动向量。基于AMVP构建时域候选列表时,可根据已编码图像(待编码图像的参考帧)的参考帧中宏块所在位置的PU的运动信息进行选取。参见图3(B),已编码图像的参考帧中参考宏块所在位置为H,若待编码图像的参考帧H位置的宏块可用,则将H位置的宏块作为候选宏块,进而将候选宏块对应的时域运动向量加入到一个列表中,得到时域候选列表。
2012、终端从空域候选列表中,选取第一预设数量个空域运动向量。
其中,第一预设数量由终端设置,可以为2个、3个等,本发明实施例以2个为宜。
2013、终端从时域候选列表中,选取第二预设数量个时域运动向量。
其中,第二预设数量由终端设置,可以为1个、2个等,本发明实施例以1个为宜。
2014、终端根据第一预设数量个空域运动向量和第二预设数量个时域运动向量,构建第一运动预测列表。
终端通过将第一预设数量个空域运动向量和第二预设数量个时域运动向量加入到同一列表中,可得到第一运动预测列表。
2015、终端对第一运动预测列表中相同的运动向量进行合并,并采用零运动向量进行填补,得到第二运动预测列表。
对于第一运动预测列表中的运动向量,终端可将第一运动预测列表中相同的运动向量进行合并。具体地,终端可将第一运动预测列表中第一预设数量个空域运动向量中相同的运动向量进行合并,并将第二预设数量个时域运动向量中相同的运动向量进行合并。当对第一运动预测列表中相同的运动向量进行合并后,第一运动预测列表中运动向量的数量将减少,此时可通过构造零运动向量进行填补,从而得到第二运动预测列表。
2016、终端从第二运动预测列表中,选取第三预设数量个运动向量。
其中,第三预设数量可由终端设置,该第三预设数量可以为2个、3个等,本发明实施例以2个为宜。为了提高计算精度,在从第二运动预测列表中,选取第三预设数量个运动向量时,需保证所选取的第三预设数量个运动向量中同时包括时域运动向量和空域运动向量。
2017、终端根据第三预设数量个运动向量,构建候选MV列表。
终端通过将第三预设数量个运动向量加入到同一列表中,可得到候选MV列表。
上述终端构建候选MV列表的过程,为了便于理解,下面以图3(C)为例进行说明。
参见图3(C),基于AMVP终端构建一个空域候选列表和时域候选列表,该空域候选列表包括5个空域运动向量,时域候选列表包括2个时域运动向量。 终端从空域候选列表中选取2个空域运动向量,从时域候选列表中选取1个时域运动向量,通过对所选取的空域运动向量和时域运动向量进行合并,并采用零运动向量进行填补,可得到候选MV列表。
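A minimal sketch of the list construction in steps 2011 to 2017 might look as follows. The candidate counts (two spatial MVs, one temporal MV, two final candidates) follow the preferred values in the text; the deduplication and zero-MV padding mirror steps 2015 and 2016, and the additional requirement that the final list retain both a spatial and a temporal MV is noted only as a comment.

```python
def build_candidate_mv_list(spatial_mvs, temporal_mvs,
                            n_spatial=2, n_temporal=1, n_final=2):
    """Sketch of steps 2012-2017: pick preset numbers of spatial and temporal
    MVs, merge duplicates, pad with zero MVs, and keep the final candidates.
    MVs are (x, y) tuples in quarter-pel units."""
    first_list = list(spatial_mvs[:n_spatial]) + list(temporal_mvs[:n_temporal])

    # Step 2015: merge identical MVs (order preserved) and pad with zero MVs.
    second_list = []
    for mv in first_list:
        if mv not in second_list:
            second_list.append(mv)
    while len(second_list) < n_spatial + n_temporal:
        second_list.append((0, 0))

    # Steps 2016/2017: keep the first n_final candidates (a real implementation
    # would also make sure both a spatial and a temporal MV survive).
    return second_list[:n_final]

# Two identical spatial candidates plus one temporal candidate: the duplicate
# is merged and a zero MV is padded in before the final two are kept.
print(build_candidate_mv_list([(4, 0), (4, 0)], [(8, -4)]))  # [(4, 0), (8, -4)]
```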
202、终端计算候选MV列表中每个MV的率失真代价,并从计算结果中获取AMVP的最小RDcost值和AMVP的目标MV。
终端在计算候选MV列表中每个MV的率失真代价RDcost时,可采用SATD(Sum of Absolute Transformed Difference,残差变换后再绝对值求和),计算候选MV列表中每个MV的RDcost,得到至少一个RDcost值。其中,SATD是指将预测残差块经过哈德曼变换后再绝对值求和。
终端从计算结果中获取AMVP的目标MV和AMVP的最小RDcost值时,可从至少一个RDcost值中,选取最小的RDcost值,进而将所选取的RDcost值作为AMVP的最小RDcost值,将AMVP的最小RDcost值对应的MV作为AMVP的目标MV,该AMVP的目标MV实际上为AMVP过程的最优MV。
其中,AMVP的最小RDcost值可用cost_amvp表示,该cost_amvp具有如下作用:
(1)、用于判断是否需要进行HME和QME;
(2)、当无需进行HME和QME时,将该cost_amvp直接作为cost_best;
(3)、在进行HME和QME后,判断是否需要对QME的计算结果进行修正。
AMVP的目标MV可用mv_amvp表示,该mv_amvp具有如下作用:
(1)、用于预测运动向量;
(2)、用于确定IME的初选点;
(3)、当无需进行HME和QME时,该mv_amvp可作为运动估计的mv_best。
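Step 202 can be sketched as below, using the cost_amvp and mv_amvp notation just introduced. The 4x4 SATD is a genuine Hadamard-transform-then-absolute-sum, but the selection function treats the RDcost as an externally supplied callable; a real encoder would also fold an estimated MV coding rate into that cost, which is omitted here.

```python
import numpy as np

H4 = np.array([[1,  1,  1,  1],
               [1, -1,  1, -1],
               [1,  1, -1, -1],
               [1, -1, -1,  1]])

def satd4x4(residual):
    """SATD of a 4x4 residual: Hadamard transform, then sum of absolute
    transformed coefficients (the criterion named in step 202)."""
    return np.abs(H4 @ residual @ H4.T).sum()

def select_amvp_target(candidate_mvs, rdcost):
    """Evaluate every MV in the candidate list with the supplied RDcost
    function and return (mv_amvp, cost_amvp), the MV with the smallest cost."""
    cost_amvp, mv_amvp = min((rdcost(mv), mv) for mv in candidate_mvs)
    return mv_amvp, cost_amvp
```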
203、终端以AMVP的目标MV在参考帧中的映射点为初选点进行IME,并从计算结果中获取IME的目标MV。
终端获取AMVP的目标MV在参考帧中的映射点,并以映射点为初选点进行IME,在IME过程中,终端根据每次移动的位置,可确定一个MV,进而计算所确定的MV的率失真代价,得到RDcost值。IME过程结束后,终端得到至少一个RDcost值,进而从得到的至少一个RDcost值中,选取最小的RDcost值,并将最小的RDcost值对应的MV作为IME的目标MV。其中,IME的目标MV 实际上为IME过程的最优MV,该IME的目标MV可用mv_ime表示。
由于AMVP对应的像素精度为四分之一像素精度,而目前以AMVP的目标MV在参考帧中的映射点为初选点进行IME时,只参考整像素位置,却忽略了分像素位置,导致确定的初选点并不准确。例如,AMVP的目标MV在参考帧中的映射点为(7,8),其对应的整像素位置为(1,2),当前在进行IME时,直接以(1,2)为初选点进行运动估计,而实际上,映射点为(7,8)更靠近整像素(2,2)的位置,因而所确定的IME的初选点的位置并不准确。
为了提高IME的初选点的准确性,本发明实施例提供的方法还将对AMVP的目标MV在参考帧中的映射点的位置进行修正,进而以修正后的位置为初选点进行IME。考虑到mv_amvp的像素精度为四分之一像素精度,在修正初选点的位置时可结合分像素位置进行修正,使得修正后的初选点的位置更接近实际的整像素位置。具体修正时,可采用如下规则:修正前初选点的位置位于坐标轴的正方向,则先加上2个单位(在原坐标的基础上加2)再右移2个单位(相当于除以4);如果修正前初选点的位置位于坐标轴的负方向,则先减去2个单位(在原坐标的基础上减去2)再右移2个单位(相当于除以4),最后进行四舍五入,得到修正后的整像素位置,其中,正方向和负方向由所建立的坐标系确定。
例如,mv_amvp在参考帧中的映射点位置为(7,8),修正前其对应的整像素位置为(1,2),采用上述方法处理,修正后X轴方向的坐标为(7+2)/4=2.25,修正后Y轴方向的坐标为(8+2)/4=2.5,则其对应的整像素位置为(2,2)。
例如,mv_amvp在参考帧中的映射点位置为(-7,8),修正前其对应的整像素位置为(-1,2),采用上述方法处理,修正后X轴方向的坐标为(-7-2)/4=-2.25,修正后Y轴方向的坐标为(8+2)/4=2.5,经过四舍五入的近似处理,可确定其对应的整像素位置为(-2,2)。
需要说明的是,如果终端在计算候选MV列表中每个MV的率失真代价时,已经判断过(0,0)位置,则后续进行IME时,无需在初选点中加入(0,0)位置。
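One possible reading of the primary selection point correction described above (add 2 and shift right by 2 for a positive component, the mirrored operation for a negative component, which amounts to rounding to the nearest integer-pel position) is sketched below; it reproduces both numeric examples from the text, (7, 8) -> (2, 2) and (-7, 8) -> (-2, 2).

```python
def amvp_mapping_point_to_integer_pel(mv_amvp):
    """Map a quarter-pel AMVP mapping point to the nearest integer-pel
    position to be used as the IME primary selection point."""
    def one(v):
        if v >= 0:
            return (v + 2) >> 2       # add 2, then shift right by 2
        return -((-v + 2) >> 2)       # mirrored handling for negative values
    return tuple(one(v) for v in mv_amvp)

# The two examples from the text:
print(amvp_mapping_point_to_integer_pel((7, 8)))    # (2, 2)
print(amvp_mapping_point_to_integer_pel((-7, 8)))   # (-2, 2)
```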
204、终端将IME的目标MV放大到四分之一像素精度,得到QME的参考目标MV,并判断AMVP的目标MV与QME的参考目标MV是否相同,如果是,执行步骤205,如果否,执行步骤206。
基于所得到的IME的目标MV,终端通过将mv_ime左移2个单位(相当于乘以4),得到QME的参考目标MV,该QME的参考目标MV理论上为进行QME得到的目标MV,该QME的参考目标MV可用mv_new表示,该过程记为mv_new=mv_ime<<2个单位。
由于AMVP的目标MV与QME的参考目标MV为采用不同方法得到的、理论上的QME对应的目标MV,因此,本发明实施例终端通过将AMVP的目标MV与QME的参考目标MV进行比较,可确定是否需要进行HME和QME。当AMVP的目标MV和QME的参考目标MV相同时,则确定无需进行HME和QME,直接执行步骤205;当AMVP的目标MV和QME的参考目标MV不同时,则确定需要进行HME和QME,并执行步骤206~208。
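Steps 204 and 205 reduce to two small checks, sketched here with mv_ime and mv_amvp as (x, y) tuples under the assumption that both use the units described above; mv_new = mv_ime << 2 is the quarter-pel reference target MV of the QME.

```python
def qme_reference_target(mv_ime):
    """Step 204: enlarge the integer-pel IME target MV to quarter-pel
    precision, i.e. mv_new = mv_ime << 2."""
    return tuple(v * 4 for v in mv_ime)   # same as a left shift by two bits

def can_skip_hme_and_qme(mv_amvp, mv_ime):
    """Step 205 trigger: when the AMVP target MV equals the quarter-pel
    version of the IME target MV, HME and QME are skipped and
    (mv_amvp, cost_amvp) is taken directly as (mv_best, cost_best)."""
    return mv_amvp == qme_reference_target(mv_ime)
```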
205、终端将AMVP的目标MV和AMVP的最小RDcost值确定为运动估计过程的最终结果。
当AMVP的目标MV和QME的参考目标MV相同时,终端无需进行HME和QME,直接将AMVP的目标MV和AMVP的最小RDcost值确定为运动估计过程的最终结果,即mv_amvp=mv_best,cost_amvp=cost_best。
由于无需进行HME和QME,因而大大降低了运动估计过程的计算量,缩短了视频编码时间,提高了视频编码效率。
206、终端以IME的目标MV在参考帧中的映射点为初选点进行HME,并从计算结果中获取HME的目标MV。
当AMVP的目标MV和QME的参考目标MV不同时,为了确保视频编码精度,终端将以IME的目标MV在参考帧中的映射点为初选点进行HME。在进行HME之前,终端可根据参考帧中的整像素位置,通过进行插值计算,得到二分之一像素位置。基于所得到的二分之一像素位置和原有的整像素位置,终端以IME的目标MV在参考帧中的映射点为初选点进行HME。在以IME的目标MV在参考帧中的映射点为初选点进行移动的过程中,终端根据每次移动的位置,确定一个MV,进而计算所确定的MV的率失真代价,得到RDcost值。HME过程结束后,终端从得到的RDcost值中,选取最小的RDcost值,并将最小的RDcost值对应的MV作为HME的目标MV。其中,HME的目标MV实际上为HME过程的最优MV,该HME的目标MV可用mv_hme表示。
207、终端以HME的目标MV在参考帧中的映射点为初选点进行QME，得到QME的最小RDcost值和QME的目标MV。
基于步骤206中得到的二分之一像素位置和原有的整像素位置,终端通过进行插值计算可得到四分之一像素位置,进而基于所得到的四分之一像素位置、二分之一像素位置及原有的整像素位置,终端以HME的目标MV在参考帧中的映射点为初选点进行QME。在以HME的目标MV在参考帧中的映射点为初选点进行移动的过程中,终端根据每次移动的位置,确定一个MV,进而计算所确定的MV的率失真代价,得到RDcost值。QME过程结束后,终端从得到的RDcost值中,选取最小的RDcost值,该最小的RDcost值即为QME的最小RDcost值,该QME的最小RDcost值对应的MV为QME的目标MV。其中,QME的最小RDcost值可用cost_qme,QME的目标MV为QME过程的最优MV,该QME的目标MV可用mv_qme表示。
208、终端根据AMVP的最小RDcost值、QME的最小RDcost值、AMVP的目标MV及QME的目标MV,确定运动估计过程的最终结果。
由于AMVP的最小RDcost值与QME的最小RDcost值为采用不同方法得到的、理论上的QME对应的率失真代价值,终端通过将AMVP的最小RDcost值和QME的最小RDcost值进行比较,可以最小的计算量为代价的前提下,得到高精度MV。具体地,终端在确定运动估计过程的最终结果时,包括但不限于如下两种情况:
第一种情况、如果AMVP的最小RDcost值小于QME的最小RDcost值,则终端将AMVP的目标MV和AMVP的最小RDcost值确定为运动估计过程的最终结果。
当AMVP的最小RDcost值小于QME的最小RDcost值时,说明采用AMVP方法获取AMVP的目标MV时计算量最小,因而终端可将AMVP的目标MV和AMVP的最小RDcost值确定为运动估计过程的最终结果,即mv_amvp=mv_best,cost_amvp=cost_best。
第二种情况、如果AMVP的最小RDcost值大于QME的最小RDcost值,则终端将QME的目标MV和QME的最小RDcost值确定为运动估计过程的最终结果。
当AMVP的最小RDcost值大于QME的最小RDcost值时,说明采用AMVP方法获取AMVP的目标MV时计算量较大,因而终端可将QME的目标MV和 QME的最小RDcost值确定为运动估计过程的最终结果,即mv_qme=mv_best,cost_qme=cost_best。
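When HME and QME do have to be run, step 208 is a plain cost comparison. The sketch below follows the two cases in the text; how a tie (equal costs) is resolved is not specified there, so this version simply falls back to the QME result.

```python
def choose_final_result(mv_amvp, cost_amvp, mv_qme, cost_qme):
    """Step 208: keep whichever of the AMVP result and the QME result has
    the smaller RDcost as (mv_best, cost_best)."""
    if cost_amvp < cost_qme:
        return mv_amvp, cost_amvp   # first case: the AMVP result wins
    return mv_qme, cost_qme         # second case (and ties): the QME result wins
```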
对于上述整个运动估计的过程,为了便于理解,下面将以图4为例进行说明。
在视频编码过程中,对于任一帧待编码图像,终端将该待编码图像分割成至少一个互不重合的PU。对于任一PU,终端基于AMVP为该当前PU构建一个候选MV列表,并计算候选MV列表中每个MV的率失真代价,得到至少一个RDcost值。终端从至少一个RDcost值中,选取最小的RDcost值,并将该最小的RDcost值作为AMVP的最小的RDcost值,将该最小的RDcost值对应的MV作为AMVP的目标MV。接着,终端以该AMVP的目标MV在参考帧中的映射点为初选点进行整像素运动估计,从计算结果中得到IME的目标MV。终端将该IME的目标MV放大到四分之一像素精度,得到QME的参考目标MV。当AMVP的目标MV与QME的参考目标MV相同时,终端无需进行HME和QME,直接将AMVP的最小的RDcost值和AMVP的目标MV作为运动估计过程的最终结果;当AMVP的目标MV与QME的参考目标MV不同时,终端以IME的目标MV在参考帧中的映射点为初选点进行HME,从计算结果中获取HME的目标MV,终端以HME的目标MV在参考帧中的映射点为初选点进行QME,从计算结果中获取QME的最小的RDcost值及其对应的目标MV。接着,终端将AMVP的最小的RDcost值与QME的最小的RDcost值进行比较,如果AMVP的最小的RDcost值小于QME的最小的RDcost值,则将AMVP的最小的RDcost值和AMVP的目标MV作为运动估计过程的最终结果,如果AMVP的最小的RDcost值大于QME的最小的RDcost值,则将QME的最小的RDcost值及其对应的目标MV作为运动估计过程的最终结果。
本发明实施例提供的进行运动估计的方法可应用于视频编解码过程中,参见图5,视频编解码过程包括以下步骤:
(1)、发送端将待编码的视频信号输入到视频编码器中。
其中,发送端为本发明实施例中的终端或服务器。该待编码的视频信号为一种数字信号。
(2)、视频编码器以帧为单位对待编码的视频信号进行编码,得到多帧已编码图像。具体编码过程为:
(a)、视频编码器对第一帧待编码图像进行编码,得到第一帧已编码图像(参考帧);
(b)、对于第二帧待编码图像，视频编码器将该第二帧待编码图像分割成至少一个互不重叠的PU；
(c)、视频编码器采用本发明实施例提供的方法对每个PU进行运动估计,得到每个PU的最优MV,存储每个PU的最优MV,并根据每个PU的最优MV,确定每个PU在第一帧已编码图像中的预测图像块;
(d)、视频编码器通过将预测图像块和每个PU作差,得到预测残差块;
(e)、视频编码器通过对预测残差块进行离散余弦变换和量化处理，得到量化的DCT系数，并将量化的DCT系数进行熵编码后输出，同时将量化的DCT系数进行反量化处理和反DCT变换，得到重构图像的残差块，并通过将重构图像的残差块与预测图像块相加，得到重构图像，该重构图像经过区块滤波和自适应像素补偿后，作为下一帧待编码图像的参考图像；
(f)、视频编码器通过循环执行上述步骤(c)~(e)对其他帧编码图像进行编码,直至全部的图像均完成编码。
(3)、将已编码图像进行压缩、封装,得到处理后的图像,并通过IP网络将处理后的图像发送至接收端。
(4)、当接收到处理后的图像时,接收端对处理后的图像进行解压缩、解封装,得到已编码图像,并将已编码图像输入到视频解码器中。
(5)、视频解码器对已编码图像进行解码,得到视频信号,进而播放该视频信号。视频解码器的解码过程为:视频解码器对第一帧已编码图像进行解码,得到第一帧图像,并根据第一帧图像和第二帧图像的预测残差块及第一帧图像进行图像重构,得到第二帧图像,进而根据第二帧图像和第三帧图像的预测残差块及第二帧图像进行图像重构,得到第三帧图像,依次类推,直至全部的图像全部解码出来。
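The decoder loop in step (5) mirrors the encoder's reconstruction path. The toy sketch below assumes the bitstream has already been entropy-decoded into per-frame residual arrays, with inverse quantization and the inverse transform omitted, which is a simplification made purely for illustration.

```python
def decode_sequence(residual_frames):
    """Rebuild each frame from the previously decoded frame plus its decoded
    residual, then reuse the result as the reference for the next frame
    (the first frame is assumed to be intra coded and arrives as-is)."""
    decoded, reference = [], None
    for residual in residual_frames:
        frame = residual if reference is None else reference + residual
        decoded.append(frame)
        reference = frame
    return decoded
```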
图5所示的视频编解码方法可应用于家庭影院、远程监控、数字广播、移动流媒体、便携摄像和医学成像等领域,以满足不同领域用户的视频观看需求。
本发明实施例提供的方法,计算候选MV列表中每个MV的RDcost,获取AMVP的目标MV,以AMVP的目标MV在参考帧中的映射点为初选点进行 IME,通过将IME的目标MV放大到四分之一精度,获取QME的参考目标MV,当AMVP的目标MV与QME的参考目标MV相同时,无需进行HME和QME,直接将AMVP的目标MV和AMVP的最小RDcost值作为最终结果,从而减小了进行HME和QME计算的计算量,缩短了运动估计过程的时长,同时降低了资源消耗。
参见图6,本发明实施例提供了一种进行运动估计的装置结构示意图,该装置包括:
列表构建模块601,用于对于待编码图像中任一预测单元PU,基于高级向量预测AMVP为PU构建候选运动向量MV列表,该候选MV列表包括该PU的至少一个MV;
计算模块602,用于计算候选MV列表中每个MV的率失真代价RDcost;
获取模块603,用于从计算结果中获取AMVP的目标MV和AMVP的最小RDcost值;
计算模块602,用于以AMVP的目标MV在参考帧中的映射点为初选点进行IME;
获取模块603,用于从计算结果中获取IME的目标MV;
精度放大模块604,用于将IME的目标MV放大到四分之一像素精度,得到QME的参考目标MV;
确定模块605,用于当AMVP的目标MV和QME的参考目标MV相同时,将AMVP的目标MV和AMVP的最小RDcost值确定为运动估计过程的最终结果;
其中,每种目标MV为每种运动估计的最小RDcost值对应的MV。
在本发明的另一个实施例中,计算模块602,用于当AMVP的目标MV和QME的参考目标MV不同时,以IME的目标MV在参考帧中的映射点为初选点进行HME;
获取模块603,用于从计算结果中获取HME的目标MV;
计算模块602,用于以HME的目标MV在参考帧中的映射点为初选点进行QME,得到QME的最小RDcost值和QME的目标MV;
确定模块605,用于根据AMVP的最小RDcost值、QME的最小RDcost值、 AMVP的目标MV及QME的目标MV,确定运动估计过程的最终结果。
在本发明的另一个实施例中，确定模块605，用于当AMVP的最小RDcost值小于QME的最小RDcost值时，将AMVP的目标MV和AMVP的最小RDcost值确定为运动估计过程的最终结果；当AMVP的最小RDcost值大于QME的最小RDcost值时，将QME的目标MV和四分之一像素运动估计的最小RDcost值确定为运动估计过程的最终结果。
在本发明的另一个实施例中,列表构建模块601,用于基于高级向量预测AMVP,构建空域候选列表和时域候选列表,空域候选列表包括PU的至少一个空域运动向量,时域候选列表包括PU的至少一个时域运动向量;从空域候选列表中,选取第一预设数量个空域运动向量;从时域候选列表中,选取第二预设数量个时域运动向量;根据第一预设数量个空域运动向量和第二预设数量个时域运动向量,构建第一运动预测列表;对第一运动预测列表中相同的运动向量进行合并,并采用零运动向量进行填补,得到第二运动预测列表;从第二运动预测列表中,选取第三预设数量个运动向量;根据第三预设数量个运动向量,构建候选MV列表。
在本发明的另一个实施例中,计算模块602,还用于采用残差变换后再绝对值求和SATD,计算候选MV列表中每个MV的RDcost,得到至少一个RDcost值;
获取模块603,还用于从至少一个RDcost值中,选取最小的RDcost值;将所选取的RDcost值作为AMVP的最小RDcost值,并将AMVP的最小RDcost值对应的MV作为AMVP的目标MV。
在本发明的另一个实施例中,该装置还包括:
位置修正模块,用于对AMVP的目标MV在参考帧中的映射点的位置进行修正;
计算模块602,用于以修正后的位置为初选点进行IME。
综上所述,本发明实施例提供的装置,计算候选MV列表中每个MV的RDcost,获取AMVP的目标MV,以AMVP的目标MV在参考帧中的映射点为初选点进行IME,通过将IME的目标MV放大到四分之一精度,获取QME的参考目标MV,当AMVP的目标MV与QME的参考目标MV相同时,无需进行HME和QME,直接将AMVP的目标MV和AMVP的最小RDcost值作为最 终结果,从而减小了进行HME和QME计算的计算量,缩短了运动估计过程的时长,同时降低了资源消耗。
参见图7,其示出了本发明实施例所涉及的进行运动估计的终端的结构示意图,该终端可以用于实施本发明实施例中提供的进行运动估计的方法。具体来讲:
终端700可以包括RF(Radio Frequency,射频)电路110、包括有一个或一个以上计算机可读存储介质的存储器120、输入单元130、显示单元140、传感器150、音频电路160、WiFi(Wireless Fidelity,无线保真)模块170、包括有一个或者一个以上处理核心的处理器180、以及电源190等部件。本领域技术人员可以理解,图7中示出的终端结构并不构成对终端的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。其中:
RF电路110可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,交由一个或者一个以上处理器180处理;另外,将涉及上行的数据发送给基站。通常,RF电路110包括但不限于天线、至少一个放大器、调谐器、一个或多个振荡器、用户身份模块(SIM)卡、收发信机、耦合器、LNA(Low Noise Amplifier,低噪声放大器)、双工器等。此外,RF电路110还可以通过无线通信与网络和其他设备通信。所述无线通信可以使用任一通信标准或协议,包括但不限于GSM(Global System of Mobile communication,全球移动通讯系统)、GPRS(General Packet Radio Service,通用分组无线服务)、CDMA(Code Division Multiple Access,码分多址)、WCDMA(Wideband Code Division Multiple Access,宽带码分多址)、LTE(Long Term Evolution,长期演进)、电子邮件、SMS(Short Messaging Service,短消息服务)等。
存储器120可用于存储软件程序以及模块,处理器180通过运行存储在存储器120的软件程序以及模块,从而执行各种功能应用以及数据处理。存储器120可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据终端700的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器120可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。 相应地,存储器120还可以包括存储器控制器,以提供处理器180和输入单元130对存储器120的访问。
输入单元130可用于接收输入的数字或字符信息,以及产生与用户设置以及功能控制有关的键盘、鼠标、操作杆、光学或者轨迹球信号输入。具体地,输入单元130可包括触敏表面131以及其他输入设备132。触敏表面131,也称为触摸显示屏或者触控板,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触敏表面131上或在触敏表面131附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触敏表面131可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器180,并能接收处理器180发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触敏表面131。除了触敏表面131,输入单元130还可以包括其他输入设备132。具体地,其他输入设备132可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
显示单元140可用于显示由用户输入的信息或提供给用户的信息以及终端700的各种图形用户接口,这些图形用户接口可以由图形、文本、图标、视频和其任意组合来构成。显示单元140可包括显示面板141,可选的,可以采用LCD(Liquid Crystal Display,液晶显示器)、OLED(Organic Light-Emitting Diode,有机发光二极管)等形式来配置显示面板141。进一步的,触敏表面131可覆盖显示面板141,当触敏表面131检测到在其上或附近的触摸操作后,传送给处理器180以确定触摸事件的类型,随后处理器180根据触摸事件的类型在显示面板141上提供相应的视觉输出。虽然在图7中,触敏表面131与显示面板141是作为两个独立的部件来实现输入和输入功能,但是在某些实施例中,可以将触敏表面131与显示面板141集成而实现输入和输出功能。
终端700还可包括至少一种传感器150,比如光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板141的亮度,接近传感器可在终端700移动到耳边时,关闭显示面板141和/或背光。作为运动传感器的一种, 重力加速度传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别手机姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于终端700还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路160、扬声器161,传声器162可提供用户与终端700之间的音频接口。音频电路160可将接收到的音频数据转换后的电信号,传输到扬声器161,由扬声器161转换为声音信号输出;另一方面,传声器162将收集的声音信号转换为电信号,由音频电路160接收后转换为音频数据,再将音频数据输出处理器180处理后,经RF电路110以发送给比如另一终端,或者将音频数据输出至存储器120以便进一步处理。音频电路160还可能包括耳塞插孔,以提供外设耳机与终端700的通信。
WiFi属于短距离无线传输技术,终端700通过WiFi模块170可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图7示出了WiFi模块170,但是可以理解的是,其并不属于终端700的必须构成,完全可以根据需要在不改变发明的本质的范围内而省略。
处理器180是终端700的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器120内的软件程序和/或模块,以及调用存储在存储器120内的数据,执行终端700的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器180可包括一个或多个处理核心;可选的,处理器180可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器180中。
终端700还包括给各个部件供电的电源190(比如电池),优选的,电源可以通过电源管理系统与处理器180逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。电源190还可以包括一个或一个以上的直流或交流电源、再充电系统、电源故障检测电路、电源转换器或者逆变器、电源状态指示器等任意组件。
尽管未示出,终端700还可以包括摄像头、蓝牙模块等,在此不再赘述。具体在本实施例中,终端700的显示单元是触摸屏显示器,终端700还包括有 存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或所述指令集由所述处理器加载并执行图2所示的进行运动估计的方法。
本发明实施例提供的终端,计算候选MV列表中每个MV的RDcost,获取AMVP的目标MV,以AMVP的目标MV在参考帧中的映射点为初选点进行IME,通过将IME的目标MV放大到四分之一精度,获取QME的参考目标MV,当AMVP的目标MV与QME的参考目标MV相同时,无需进行HME和QME,直接将AMVP的目标MV和AMVP的最小RDcost值作为最终结果,从而减小了进行HME和QME计算的计算量,缩短了运动估计过程的时长,同时降低了资源消耗。
图8是根据一示例性实施例示出的一种用于进行运动估计的服务器。参照图8，服务器800包括处理组件822，其进一步包括一个或多个处理器，以及由存储器832所代表的存储器资源，用于存储可由处理组件822执行的指令，例如应用程序。存储器832中存储的应用程序可以包括一个或一个以上的每一个对应于一组指令的模块。此外，处理组件822被配置为执行指令，以执行上述图2所示的进行运动估计的方法。
服务器800还可以包括一个电源组件826被配置为执行服务器800的电源管理,一个有线或无线网络接口850被配置为将服务器800连接到网络,和一个输入输出(I/O)接口858。服务器800可以操作基于存储在存储器832的操作系统,例如Windows Server TM,Mac OS X TM,Unix TM,Linux TM,FreeBSD TM或类似。
本发明实施例提供的服务器,计算候选MV列表中每个MV的RDcost,获取AMVP的目标MV,以AMVP的目标MV在参考帧中的映射点为初选点进行IME,通过将IME的目标MV放大到四分之一精度,获取QME的参考目标MV,当AMVP的目标MV与QME的参考目标MV相同时,无需进行HME和QME,直接将AMVP的目标MV和AMVP的最小RDcost值作为最终结果,从而减小了进行HME和QME计算的计算量,缩短了运动估计过程的时长,同时降低了资源消耗。
本发明实施例还提供了一种计算机可读存储介质，该存储介质中存储有至少一条指令、至少一段程序、代码集或指令集，所述至少一条指令、所述至少一段程序、所述代码集或所述指令集由处理器加载并执行以实现图2所示的进行运动估计的方法。
本发明实施例提供的计算机可读存储介质,计算候选MV列表中每个MV的RDcost,获取AMVP的目标MV,以AMVP的目标MV在参考帧中的映射点为初选点进行IME,通过将IME的目标MV放大到四分之一精度,获取QME的参考目标MV,当AMVP的目标MV与QME的参考目标MV相同时,无需进行HME和QME,直接将AMVP的目标MV和AMVP的最小RDcost值作为最终结果,从而减小了进行HME和QME计算的计算量,缩短了运动估计过程的时长,同时降低了资源消耗。
需要说明的是:上述本发明实施例提供的进行运动估计的装置、设备在进行运动估计时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将进行运动估计的装置、设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述本发明实施例提供的进行运动估计的装置、设备与进行运动估计的方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
本领域普通技术人员可以理解实现上述本发明实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本发明实施例的较佳实施例,并不用以限制本发明实施例,凡在本发明实施例的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明实施例的保护范围之内。

Claims (14)

  1. 一种进行运动估计的方法,其特征在于,所述方法应用于进行运动估计的设备中,所述方法包括:
    对于待编码图像中任一预测单元PU,基于高级向量预测AMVP为所述PU构建候选运动向量MV列表,所述候选MV列表包括所述PU的至少一个MV;
    计算所述候选MV列表中每个MV的率失真代价RDcost,并从计算结果中获取AMVP的最小RDcost值和AMVP的目标MV;
    以所述AMVP的目标MV在参考帧中的映射点为初选点进行整像素运动估计IME,并从计算结果中获取IME的目标MV;
    将所述IME的目标MV放大到四分之一像素精度,得到四分之一像素运动估计QME的参考目标MV;
    当所述AMVP的目标MV和所述QME的参考目标MV相同时,将所述AMVP的目标MV和所述AMVP的最小RDcost值确定为运动估计过程的最终结果;
    其中,每种目标MV为每种运动估计的最小RDcost值对应的MV。
  2. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    当所述AMVP的目标MV和所述QME的参考目标MV不同时,以所述IME的目标MV在所述参考帧中的映射点为初选点进行二分之一像素运动估计HME,并从计算结果中获取HME的目标MV;
    以所述HME的目标MV在所述参考帧中的映射点为初选点进行QME,得到QME的最小RDcost值和QME的目标MV;
    根据所述AMVP的最小RDcost值、所述QME的最小RDcost值、所述AMVP的目标MV及所述QME的目标MV,确定运动估计过程的最终结果。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述AMVP的最小RDcost值、所述QME的最小RDcost值、所述AMVP的目标MV及所述QME的目标MV,确定运动估计过程的最终结果,包括:
    如果所述AMVP的最小RDcost值小于所述QME的最小RDcost值,则将 所述AMVP的目标MV和所述AMVP的最小RDcost值确定为运动估计过程的最终结果;
    如果所述AMVP的最小RDcost值大于所述QME的最小RDcost值,则将所述QME的目标MV和所述QME的最小RDcost值确定为运动估计过程的最终结果。
  4. 根据权利要求1所述的方法,其特征在于,所述基于高级向量预测AMVP为所述PU构建候选运动向量MV列表,包括:
    基于AMVP构建空域候选列表和时域候选列表,所述空域候选列表包括所述PU的至少一个空域运动向量,所述时域候选列表包括所述PU的至少一个时域运动向量;
    从所述空域候选列表中,选取第一预设数量个空域运动向量;
    从所述时域候选列表中,选取第二预设数量个时域运动向量;
    根据所述第一预设数量个空域运动向量和所述第二预设数量个时域运动向量,构建第一运动预测列表;
    对所述第一运动预测列表中相同的运动向量进行合并,并采用零运动向量进行填补,得到第二运动预测列表;
    从所述第二运动预测列表中,选取第三预设数量个运动向量;
    根据所述第三预设数量个运动向量,构建所述候选MV列表。
  5. 根据权利要求1所述的方法,其特征在于,所述计算所述候选MV列表中每个MV的率失真代价RDcost,包括:
    采用残差变换后再绝对值求和SATD,计算所述候选MV列表中每个MV的RDcost,得到至少一个RDcost值;
    所述从计算结果中获取AMVP的最小RDcost值和AMVP的目标MV,包括:
    从所述至少一个RDcost值中,选取最小的RDcost值;
    将所选取的RDcost值作为所述AMVP的最小RDcost值,并将所述AMVP的最小RDcost值对应的MV作为所述AMVP的目标MV。
  6. 根据权利要求1至5中任一项所述的方法,其特征在于,所述以所述 AMVP的目标MV在参考帧中的映射点为初选点进行整像素运动估计IME之前,还包括:
    对所述AMVP的目标MV在所述参考帧中的映射点的位置进行修正;
    以修正后的位置为初选点进行IME。
  7. 一种进行运动估计的装置,其特征在于,所述装置设置于进行运动估计的设备中,所述装置包括:
    列表构建模块,用于对于待编码图像中任一预测单元PU,基于高级向量预测AMVP为所述PU构建候选运动向量MV列表,所述候选MV列表包括所述PU的至少一个MV;
    计算模块,用于计算所述候选MV列表中每个MV的率失真代价RDcost;
    获取模块,用于从计算结果中获取AMVP的目标MV和AMVP的最小RDcost值;
    所述计算模块,用于以所述AMVP的目标MV在参考帧中的映射点为初选点进行整像素运动估计IME;
    所述获取模块,用于从计算结果中获取IME的目标MV;
    精度放大模块,用于将所述IME的目标MV放大到四分之一像素精度,得到四分之一像素运动估计QME的参考目标MV;
    确定模块,用于当所述AMVP的目标MV和所述QME的参考目标MV相同时,将所述AMVP的目标MV和所述AMVP的最小RDcost值确定为运动估计过程的最终结果;
    其中,每种目标MV为每种运动估计的最小RDcost值对应的MV。
  8. 根据权利要求7所述的装置,其特征在于,所述计算模块,用于当所述AMVP的目标MV和所述QME的参考目标MV不同时,以所述IME的目标MV在参考帧中的映射点为初选点进行二分之一像素运动估计HME;
    所述获取模块,用于从计算结果中获取HME的目标MV;
    所述计算模块,用于以所述HME的目标MV在所述参考帧中的映射点为初选点进行QME,得到QME的最小RDcost值和QME的目标MV;
    所述确定模块,用于根据所述AMVP的最小RDcost值、所述QME的最小RDcost值、所述AMVP的目标MV及所述QME的目标MV,确定运动估计过 程的最终结果。
  9. 根据权利要求8所述的装置,其特征在于,所述确定模块,用于当所述AMVP的最小RDcost值小于所述QME的最小RDcost值时,将所述AMVP的目标MV和所述AMVP的最小RDcost值确定为运动估计过程的最终结果;当所述AMVP的最小RDcost值大于所述QME的最小RDcost值时,将所述QME的目标MV和所述QME的最小RDcost值确定为运动估计过程的最终结果。
  10. 根据权利要求9所述的装置,其特征在于,所述列表构建模块,用于基于高级向量预测AMVP构建空域候选列表和时域候选列表,所述空域候选列表包括所述PU的至少一个空域运动向量,所述时域候选列表包括所述PU的至少一个时域运动向量;从所述空域候选列表中,选取第一预设数量个空域运动向量;从所述时域候选列表中,选取第二预设数量个时域运动向量;根据所述第一预设数量个空域运动向量和所述第二预设数量个时域运动向量,构建第一运动预测列表;对所述第一运动预测列表中相同的运动向量进行合并,并采用零运动向量进行填补,得到第二运动预测列表;从所述第二运动预测列表中,选取第三预设数量个运动向量;根据所述第三预设数量个运动向量,构建所述候选MV列表。
  11. 根据权利要求7所述的装置,其特征在于,所述计算模块,还用于采用残差变换后再绝对值求和SATD,计算所述候选MV列表中每个MV的RDcost,得到至少一个RDcost值;
    所述获取模块,还用于从所述至少一个RDcost值中,选取最小的RDcost值;将所选取的RDcost值作为所述AMVP的最小RDcost值,并将所述AMVP的最小RDcost值对应的MV作为所述AMVP的目标MV。
  12. 根据权利要求7至11中任一项所述的装置，其特征在于，所述装置还包括：
    位置修正模块,用于对所述AMVP的目标MV在所述参考帧中的映射点的位置进行修正;
    所述计算模块,用于以修正后的位置为初选点进行IME。
  13. 一种用于进行运动估计的设备,其特征在于,所述设备包括处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或所述指令集由所述处理器加载并执行以实现如权利要求1至6中任一项所述的进行运动估计的方法。
  14. 一种计算机可读存储介质,其特征在于,所述存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或所述指令集由处理器加载并执行以实现如权利要求1至6中任一项所述的进行运动估计的方法。
PCT/CN2018/103642 2017-09-28 2018-08-31 进行运动估计的方法、装置、设备及存储介质 WO2019062476A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18862765.7A EP3618445A4 (en) 2017-09-28 2018-08-31 PROCESS FOR MAKING A MOVEMENT ESTIMATE, AND ASSOCIATED APPARATUS, DEVICE AND STORAGE SUPPORT
US16/656,116 US10827198B2 (en) 2017-09-28 2019-10-17 Motion estimation method, apparatus, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710894492.7 2017-09-28
CN201710894492.7A CN109587501B (zh) 2017-09-28 2017-09-28 进行运动估计的方法、装置及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/656,116 Continuation US10827198B2 (en) 2017-09-28 2019-10-17 Motion estimation method, apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2019062476A1 true WO2019062476A1 (zh) 2019-04-04

Family

ID=65900764

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/103642 WO2019062476A1 (zh) 2017-09-28 2018-08-31 进行运动估计的方法、装置、设备及存储介质

Country Status (4)

Country Link
US (1) US10827198B2 (zh)
EP (1) EP3618445A4 (zh)
CN (1) CN109587501B (zh)
WO (1) WO2019062476A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW202017377A (zh) 2018-09-08 2020-05-01 大陸商北京字節跳動網絡技術有限公司 視頻編碼和解碼中的仿射模式
CN111953974B (zh) * 2020-08-26 2021-08-27 珠海大横琴科技发展有限公司 一种运动参数候选列表构建方法、装置及计算机设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070002949A1 (en) * 2005-06-30 2007-01-04 Nokia Corporation Fast partial pixel motion estimation for video encoding
CN104811728A (zh) * 2015-04-23 2015-07-29 湖南大目信息科技有限公司 一种视频内容自适应的运动搜索方法
CN106101709A (zh) * 2016-07-08 2016-11-09 上海大学 一种联合增强层的shvc质量可分级的基本层帧间预测方法
US20170238005A1 (en) * 2016-02-15 2017-08-17 Qualcomm Incorporated Picture order count based motion vector pruning
CN107087171A (zh) * 2017-05-26 2017-08-22 中国科学技术大学 Hevc整像素运动估计方法及装置

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6141382A (en) * 1998-09-18 2000-10-31 Sarnoff Corporation Using estimated distortion values
US7580456B2 (en) * 2005-03-01 2009-08-25 Microsoft Corporation Prediction-based directional fractional pixel motion estimation for video coding
JP4410172B2 (ja) * 2005-08-29 2010-02-03 日本電信電話株式会社 動きベクトル推定方法,動きベクトル推定装置,動きベクトル推定プログラムおよび動きベクトル推定プログラム記録媒体
EP1960967B1 (en) * 2005-12-15 2013-03-13 Analog Devices, Inc. Motion estimation using prediction guided decimated search
CN101820547A (zh) * 2009-02-27 2010-09-01 源见科技(苏州)有限公司 帧间模式选择方法
CN102710934B (zh) * 2011-01-22 2015-05-06 华为技术有限公司 一种运动预测或补偿方法
US20140085415A1 (en) * 2012-09-27 2014-03-27 Nokia Corporation Method and apparatus for video coding
WO2014056150A1 (en) * 2012-10-09 2014-04-17 Nokia Corporation Method and apparatus for video coding
US9106922B2 (en) * 2012-12-19 2015-08-11 Vanguard Software Solutions, Inc. Motion estimation engine for video encoding
US20140301463A1 (en) * 2013-04-05 2014-10-09 Nokia Corporation Method and apparatus for video coding and decoding
WO2015004606A1 (en) * 2013-07-09 2015-01-15 Nokia Corporation Method and apparatus for video coding involving syntax for signalling motion information
WO2015015058A1 (en) * 2013-07-31 2015-02-05 Nokia Corporation Method and apparatus for video coding and decoding
CN106105220B (zh) * 2014-01-07 2019-07-05 诺基亚技术有限公司 用于视频编码和解码的方法和装置
US10531116B2 (en) * 2014-01-09 2020-01-07 Qualcomm Incorporated Adaptive motion vector resolution signaling for video coding
US20150264404A1 (en) * 2014-03-17 2015-09-17 Nokia Technologies Oy Method and apparatus for video coding and decoding
US9628793B1 (en) * 2014-09-26 2017-04-18 Polycom, Inc. Motion estimation
US20160127731A1 (en) * 2014-11-03 2016-05-05 National Chung Cheng University Macroblock skip mode judgement method for encoder
US10462480B2 (en) * 2014-12-31 2019-10-29 Microsoft Technology Licensing, Llc Computationally efficient motion estimation
EP3264769A1 (en) * 2016-06-30 2018-01-03 Thomson Licensing Method and apparatus for video coding with automatic motion information refinement
EP3264768A1 (en) * 2016-06-30 2018-01-03 Thomson Licensing Method and apparatus for video coding with adaptive motion information refinement
US10462462B2 (en) * 2016-09-29 2019-10-29 Qualcomm Incorporated Motion vector difference coding technique for video coding
US11356693B2 (en) * 2016-09-29 2022-06-07 Qualcomm Incorporated Motion vector coding for video coding
US10979732B2 (en) * 2016-10-04 2021-04-13 Qualcomm Incorporated Adaptive motion vector precision for video coding
CN115118988A (zh) * 2017-06-30 2022-09-27 华为技术有限公司 用于运动向量细化的搜索区域
US20190222858A1 (en) * 2019-03-26 2019-07-18 Intel Corporation Optimal out of loop inter motion estimation with multiple candidate support

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070002949A1 (en) * 2005-06-30 2007-01-04 Nokia Corporation Fast partial pixel motion estimation for video encoding
CN104811728A (zh) * 2015-04-23 2015-07-29 湖南大目信息科技有限公司 一种视频内容自适应的运动搜索方法
US20170238005A1 (en) * 2016-02-15 2017-08-17 Qualcomm Incorporated Picture order count based motion vector pruning
CN106101709A (zh) * 2016-07-08 2016-11-09 上海大学 一种联合增强层的shvc质量可分级的基本层帧间预测方法
CN107087171A (zh) * 2017-05-26 2017-08-22 中国科学技术大学 Hevc整像素运动估计方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3618445A4 *

Also Published As

Publication number Publication date
US20200053381A1 (en) 2020-02-13
CN109587501A (zh) 2019-04-05
CN109587501B (zh) 2022-07-12
EP3618445A4 (en) 2020-12-30
EP3618445A1 (en) 2020-03-04
US10827198B2 (en) 2020-11-03

Similar Documents

Publication Publication Date Title
JP7229261B2 (ja) ビデオ符号化のビットレート制御方法、装置、機器、記憶媒体及びプログラム
JP6758683B2 (ja) 予測モード選択方法、ビデオ符号化装置、及び記憶媒体
CN108391127B (zh) 视频编码方法、装置、存储介质及设备
KR102225235B1 (ko) 비디오 인코딩 방법, 장치, 및 디바이스, 및 저장 매체
US11375227B2 (en) Video motion estimation method and apparatus, and storage medium
WO2018036352A1 (zh) 视频数据的编解码方法、装置、系统及存储介质
CN114598880A (zh) 图像处理方法、智能终端及存储介质
KR101425286B1 (ko) 모션 추정을 위한 완전한 서브 매크로블록 형상 후보 저장 및 복구 프로토콜
US10827198B2 (en) Motion estimation method, apparatus, and storage medium
CN113709504B (zh) 图像处理方法、智能终端及可读存储介质
CN116456102B (zh) 图像处理方法、处理设备及存储介质
CN115379214B (zh) 图像处理方法、智能终端及存储介质
US20190320205A1 (en) Template matching-based prediction method and apparatus
JP7273154B2 (ja) ビデオシーケンスのためのピクチャエンコーディング及びデコーディング方法及び装置
CN116847088B (zh) 图像处理方法、处理设备及存储介质
CN115955565B (zh) 处理方法、处理设备及存储介质
CN110213593B (zh) 一种运动矢量的确定方法、编码压缩方法和相关装置
CN109003313B (zh) 一种传输网页图片的方法、装置和系统
CN113038124A (zh) 视频编码方法、装置、存储介质及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18862765

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018862765

Country of ref document: EP

Effective date: 20191125

NENP Non-entry into the national phase

Ref country code: DE