WO2022218299A1 - Motion estimation method, apparatus, electronic device, and readable storage medium - Google Patents

Motion estimation method, apparatus, electronic device, and readable storage medium

Info

Publication number
WO2022218299A1
Authority
WO
WIPO (PCT)
Prior art keywords
search
target
point
pred
current
Prior art date
Application number
PCT/CN2022/086324
Other languages
English (en)
French (fr)
Inventor
张勇
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Priority to EP22787522.6A (EP4311236A4)
Publication of WO2022218299A1
Priority to US18/380,640 (US20240048753A1)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/57Motion estimation characterised by a search window with variable size or shape
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/533Motion estimation using multistep search, e.g. 2D-log search or one-at-a-time search [OTS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/56Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search

Definitions

  • the embodiments of the present application relate to the field of communication technologies, and in particular, to a motion estimation method, apparatus, electronic device, and readable storage medium.
  • Video coding starts from the high correlation of video signals and the visual characteristics of the human eye, and uses appropriate coding methods to eliminate the redundancy caused by these correlations and characteristics, thereby compressing the video signal and reducing the transmission bit rate.
  • The correlation of video signals can be divided into temporal correlation and spatial correlation. Spatial correlation refers to the similarity between adjacent pixels in the same image, and is mainly eliminated by intra-frame prediction (Intra-Frame Prediction) coding; temporal correlation refers to the similarity between adjacent images in an image sequence, and is mainly eliminated by inter-frame prediction (Inter-Frame Prediction) coding.
  • Inter-frame prediction coding is also called motion compensated prediction (Motion Compensated Prediction, MCP) coding. Its purpose is to find the most similar block (the matching block) for the current image coding block and to obtain, from the position of that prediction block, the relative offset in spatial position between the two, that is, the motion vector (Motion Vector, MV).
  • The process of obtaining a motion vector by searching is called motion estimation (Motion Estimation, ME), and the process of finding a prediction block in a specified reference frame according to the motion vector is called motion compensation.
  • The principle of predictive coding is shown in FIG. 1.
  • Motion estimation is the key technology of predictive coding. The more accurate the motion estimation is, the higher the image quality of predictive compensation will be, the smaller the compensation residual will be, the fewer bits required for compensation coding, and the smaller the bit rate will be. Meanwhile, in order to find the most similar block to the current block, a lot of search work is required. Motion estimation is the module with the highest computational complexity in video coding, and its computational complexity usually accounts for 50-70% of the entire encoder.
  • Motion estimation methods include full search and fast search.
  • The full search method searches all points in the search range in order, compares the cost function point by point, and selects the point that minimizes the cost function, that is, the optimal point.
  • The full search traverses every pixel position in the search range, so it finds the optimal solution of the cost function for the block, but its complexity is also the highest.
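The full search described above can be sketched in a few lines. This is a minimal illustration, not the encoder's implementation: it assumes frames are plain 2D lists of luma samples, uses SAD alone as the cost (the rate term is omitted), and the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between the current block and a displaced reference block."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
    return total


def full_search(cur, ref, top, left, size, search_range):
    """Test every displacement within +/-search_range and return (best_mv, best_cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates whose block would fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The number of cost evaluations grows with the square of `search_range`, which is the complexity that the fast search methods try to avoid.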
  • The basic idea of the fast search method is to reduce the number of candidate search points. This usually requires designing a series of effective search patterns that find the optimal solution with as few search points as possible.
  • Current fast search methods still use a large number of candidate search points with considerable redundancy between them, and they easily fall into a local optimum during the search and matching process.
  • The purpose of the embodiments of the present application is to provide a motion estimation method, apparatus, electronic device, and readable storage medium that can reduce computational complexity and coding time while preventing the search from falling into a local optimum.
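  • An embodiment of the present application provides a motion estimation method, the method is executed by an electronic device, and the method includes:
  • searching for a starting search point in a target macroblock to obtain the starting search point;
  • when the early-termination check based on the cost value of the initial search point does not terminate the search, determining a first search range according to the motion vector of the previous frame at the target position in the target macroblock;
  • performing an asymmetric cross search according to the first search range to obtain a first target search point;
  • when the early-termination check based on the cost value of the initial search point and the cost value of the first target search point does not terminate the search, performing a full search in a rectangular window to obtain a second target search point;
  • when the early-termination check based on the cost value of the initial search point and the cost value of the second target search point does not terminate the search, determining a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames before at the target position in the target macroblock;
  • performing an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
  • when the early-termination check based on the cost value of the initial search point and the cost value of the third target search point does not terminate the search, performing an extended hexagon search to obtain a fourth target search point;
  • determining a motion vector of the target macroblock according to the fourth target search point.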
  • An embodiment of the present application provides a motion estimation apparatus, the apparatus is applied to an electronic device, and the apparatus includes:
  • a first search module, configured to search for a starting search point in the target macroblock to obtain the starting search point;
  • a first determination module, configured to determine a first search range according to the motion vector of the previous frame at the target position in the target macroblock when the early-termination check based on the cost value of the initial search point does not terminate the search;
  • a second search module, configured to perform an asymmetric cross search according to the first search range to obtain a first target search point;
  • a third search module, configured to perform a full search in a rectangular window to obtain a second target search point when the early-termination check based on the cost value of the initial search point and the cost value of the first target search point does not terminate the search;
  • a second determination module, configured to determine a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames before at the target position in the target macroblock when the early-termination check based on the cost value of the initial search point and the cost value of the second target search point does not terminate the search;
  • a fourth search module, configured to perform an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
  • a fifth search module, configured to perform an extended hexagon search to obtain a fourth target search point when the early-termination check based on the cost value of the initial search point and the cost value of the third target search point does not terminate the search;
  • a third determination module, configured to determine the motion vector of the target macroblock according to the fourth target search point.
  • An embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor, where the steps of the motion estimation method described in the first aspect are implemented when the program or instruction is executed by the processor.
  • An embodiment of the present application provides a readable storage medium on which a program or an instruction is stored, where the steps of the motion estimation method according to the first aspect are implemented when the program or instruction is executed by a processor.
  • An embodiment of the present application provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the method described in the first aspect.
  • Embodiments of the present application further provide a computer program product, the computer program product being stored in a non-volatile storage medium and configured to be executed by at least one processor to implement the steps of the method described above.
  • an embodiment of the present application further provides a motion estimation apparatus, where the apparatus is configured to perform the method described above.
  • The embodiments of the present application optimize the selection of the search range in the motion estimation process of video coding, so that the optimal matching block and the optimal motion vector can be determined more accurately and quickly, the search is prevented from falling into a local optimum, and computational complexity and coding time are reduced, which improves the efficiency and effect of video coding.
  • FIG. 1 is a schematic diagram of the principle of predictive coding;
  • FIG. 2 is a schematic diagram of the application scenario of the UMHexagonS algorithm;
  • FIG. 3 is a schematic flowchart of the UMHexagonS algorithm
  • FIG. 4 is a schematic flowchart of a motion estimation method provided by an embodiment of the present application.
  • FIG. 5 is one of schematic diagrams of application scenarios provided by an embodiment of the present application.
  • FIG. 6a is a second schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 6b is a third schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 6c is a fourth schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 6d is a fifth schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a motion estimation apparatus provided by an embodiment of the present application.
  • FIG. 8 is one of the schematic structural diagrams of an electronic device provided by an embodiment of the present application.
  • FIG. 9 is a second schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • The terms "first", "second", and the like in the description and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are usually of one type, and the number of objects is not limited; for example, the first object may be one or more than one.
  • In addition, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the associated objects are in an "or" relationship.
  • Fast search methods include the three-step search algorithm (Three Step Search, TSS), the new three-step search algorithm (Novel Three Step Search, NTSS), the four-step search algorithm (Four Step Search, FSS), the block-based gradient descent search (Block-Based Gradient Descent Search, BBGDS), the diamond search (Diamond Search, DS), and the hexagon-based search (Hexagon-based Search, HEXBS). Methods that exploit the spatiotemporal correlation of motion vectors include motion vector field adaptive search technology (Motion Vector Field Adaptive Search Technology, MVFAST), unsymmetrical-cross multi-hexagon grid search (Unsymmetrical-cross Multi-Hexagon grid Search, UMHexagonS), and enhanced predictive zonal search (Enhanced Predictive Zonal Search, EPZS).
  • the main basis of the above fast search algorithm is that starting from the optimal position, the prediction distortion increases monotonically with distance. Therefore, when performing motion estimation, start from a certain search position, find the position with the least distortion in each round of search, and start the next round of search based on this position, and finally reach the optimal position.
  • These fast search algorithms generally use a compact, center-biased search pattern to reach the optimal solution as quickly as possible.
  • the UMHexagonS search uses a detailed global search, combined with a small-scale local search strategy, uses multi-level and multi-shaped templates for matching, and uses spatio-temporal correlation to estimate motion vectors.
  • the UMHexagonS search method has the following characteristics:
  • The starting point prediction adopted by the UMHexagonS method comprehensively uses the motion vector correlation of adjacent blocks within and between frames, as well as the motion correlation of blocks of different sizes introduced by macroblock partitioning. It can therefore select the point that best reflects the motion trend of the current block as the starting point, which improves the accuracy of the estimation.
  • Each step of the UMHexagonS search is related to the image content, and its search templates fall into three categories: a large-scale mixed template for coarse search, a medium hexagonal template for fine search, and a small diamond template for fine search. Search performance is further improved by performing different searches for different content.
  • On the one hand, UMHexagonS adopts a fast search method that reduces the number of search blocks; on the other hand, it ends the search early through search prediction. Its computational cost can therefore be more than 90% lower than that of the full search method, while it still achieves good rate-distortion optimization, with performance close to that of the full search method.
  • the performance evaluation of motion estimation is based on the rate-distortion criterion, using the following cost function:
  • $\lambda_{MOTION}$ is the Lagrange multiplier.
  • SAD is the sum of absolute differences (Sum of Absolute Difference, SAD).
  • MVP is the median prediction vector.
  • s is the pixel point of the current image to be encoded.
  • c is the pixel point of the reference image in motion estimation.
  • $B_x$ and $B_y$ represent the size of the macroblock in the horizontal direction (x axis) and the vertical direction (y axis), respectively. In formula (3), the values of $B_x$ and $B_y$ can be 16, 8, or 4.
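The cost function formulas themselves did not survive in the text above. As a reading aid only, the form commonly given for this rate-distortion cost in UMHexagonS descriptions is sketched below. The symbols $m = (m_x, m_y)$ for the candidate motion vector, $p$ for the predicted vector (MVP), and $R(\cdot)$ for the bits needed to code the vector difference are assumptions, and the equation numbering of the original document may differ.

```latex
J(m, \lambda_{MOTION}) = SAD\big(s, c(m)\big) + \lambda_{MOTION} \cdot R(m - p)

SAD\big(s, c(m)\big) = \sum_{x=1}^{B_x} \sum_{y=1}^{B_y} \big|\, s[x, y] - c[x - m_x,\; y - m_y] \,\big|
```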
  • the UMHexagonS algorithm includes four different search modes: step1 initial search point search; step2 asymmetric cross search; step3 non-uniform multi-level hexagonal lattice search; step4 extended hexagon search.
  • FIG. 3 shows the flowchart of the UMHexagonS search, where cur_cost is the cost value calculated by formula (1); pred_SAD is the predicted cost value of the best matching block for the current block, derived from previously encoded adjacent blocks; and beta1 and beta2 are threshold coefficients determined by the current QP and the current block size.
  • step1: starting search point search, which determines the current best starting search point
  • Corresponding block prediction: using temporal correlation, take the already obtained motion vector at the same position in the previous frame, MV pred_CP;
  • Adjacent reference frame prediction: using temporal correlation, take the already obtained MV of the current block in the previous reference frame and scale it proportionally to obtain the current MV pred_REF;
  • The predicted MVs obtained by the above prediction modes form a predicted vector set S, and the candidate motion vector in S with the smallest cost function value is then selected as the search starting point:
  • Here arg denotes the value of the variable at which the objective function J takes its minimum value.
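Selecting the starting point from the prediction set S is a plain arg-min over a small candidate list. The sketch below is illustrative: `select_start_point` is a hypothetical name, and `cost` stands in for the cost function J, which the caller is assumed to supply.

```python
def select_start_point(candidates, cost):
    """Return the candidate motion vector with the smallest cost (arg min over S)."""
    # Remove duplicate predictions while preserving order, so each vector is costed once.
    unique = list(dict.fromkeys(candidates))
    return min(unique, key=cost)
```

For example, with a toy cost `lambda mv: abs(mv[0] - 3) + abs(mv[1] + 1)` and candidates `[(0, 0), (2, -1), (5, 5)]`, the function returns `(2, -1)`.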
  • An early termination (Early_Termination) judgment is then performed according to the cost value of the current best predicted starting point to decide whether to terminate the search early.
  • For a good starting prediction point, that is, one close to the optimal position and corresponding to a small cost value, the flow jumps to step4-1 (hexagon search); otherwise, the flow continues with step2 (asymmetric cross search).
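The exact threshold test is not spelled out in the text above. A minimal sketch, assuming the comparison takes the form `cur_cost < pred_SAD * (1 + beta)` with `beta1 <= beta2` as in common descriptions of UMHexagonS, could look like this. The function name and the returned labels are hypothetical, and how each level maps to a jump target is left to the caller.

```python
def early_termination_level(cur_cost, pred_sad, beta1, beta2):
    """Classify the current cost against two thresholds derived from pred_SAD.

    Assumption: the thresholds are pred_sad * (1 + beta1) and pred_sad * (1 + beta2),
    with beta1 <= beta2. Returns "strong", "weak", or "none".
    """
    if cur_cost < pred_sad * (1 + beta1):
        return "strong"   # cost is very close to the prediction
    if cur_cost < pred_sad * (1 + beta2):
        return "weak"     # cost is acceptably close to the prediction
    return "none"         # no early termination; continue the normal flow
```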
  • The candidate motion vector set $\Omega_1$ in this step is:
  • step3: Non-uniform multi-level hexagonal lattice point search. This step is divided into two sub-steps:
  • step3-1: 5×5 small rectangular window full search
  • The size of the rectangular window can be determined by the search range of the macroblock. For example, for the 16×16 macroblock shown in FIG. 2, a 5×5 rectangular window can be selected. If the search range of the macroblock changes, the size of the rectangular window can be adapted accordingly.
  • The candidate motion vector set $\Omega_2$ is:
  • $\Omega_2 = \{\, m = (m_x, m_y) : |m_x - cm_x| \le 2,\ |m_y - cm_y| \le 2 \,\}$ with $cm = m_{min2}$, formula (7) for short;
  • The best point is then selected by formula (8), where $m_i \in \Omega_2$;
  • step3-2: Extended multi-level hexagonal lattice search
  • The 16 search points (corresponding to the 16×16 macroblock shown in FIG. 2) of the super-hexagon template used in this step are:
  • $m = (\pm 4, \pm 2), (\pm 4, \pm 1), (\pm 4, 0), (\pm 2, \pm 3), (0, \pm 4)$, referred to as formula (9);
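The 16-point template of formula (9) can be generated from its five base offsets by applying all sign combinations. The sketch below also shows one way to build several levels by scaling the template by k = 1, 2, ...; that scaling rule follows the usual description of UMHexagonS and is an assumption here, as are the function names.

```python
def hexagon_16_template():
    """Return the 16 offsets (+/-4,+/-2), (+/-4,+/-1), (+/-4,0), (+/-2,+/-3), (0,+/-4)."""
    base = [(4, 2), (4, 1), (4, 0), (2, 3), (0, 4)]
    points = set()
    for x, y in base:
        for sx in (1, -1):
            for sy in (1, -1):
                # The set removes duplicates that arise when a coordinate is 0.
                points.add((sx * x, sy * y))
    return sorted(points)


def multi_level_hexagon(center, levels):
    """Scale the template by k = 1..levels around center (assumed scaling rule)."""
    cx, cy = center
    template = hexagon_16_template()
    return [(cx + k * x, cy + k * y) for k in range(1, levels + 1) for x, y in template]
```

With four levels, the search visits 64 candidate points around the center.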
  • step4 Extended hexagon search, this step is divided into two sub-steps:
  • step4-1 Hexagonal search
  • The candidate motion vector set $\Omega_3$ in this step is:
  • step4-2: Small diamond search
  • The candidate motion vector set $\Omega_4$ in this step is:
  • The small diamond template is used to search repeatedly until the best prediction point lies at the center of the small diamond, which yields the final motion vector.
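The repeated small diamond search can be sketched as a loop that re-centers on the best of the four neighbors until the center itself is best. This is an illustration under assumptions: `cost` is a caller-supplied function standing in for J, the diamond uses unit steps, and the name `small_diamond_search` is hypothetical.

```python
def small_diamond_search(start, cost):
    """Repeat the 4-neighbor diamond search until the center is the best point."""
    best, best_cost = start, cost(start)
    while True:
        cx, cy = best
        improved = False
        for cand in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            c = cost(cand)
            if c < best_cost:
                best, best_cost, improved = cand, c, True
        if not improved:
            # The center of the diamond is the minimum: stop.
            return best, best_cost
```

Because each iteration strictly lowers the cost, the loop stops once no neighbor improves on the center.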
  • An embodiment of the present application provides a motion estimation method, and the method is executed by an electronic device. It should be noted that the method of this embodiment further improves and optimizes the UMHexagonS algorithm flow, and the method includes the following steps:
  • Step 401: Search for a starting search point in a target macroblock to obtain the starting search point;
  • The target macroblock is the macroblock currently undergoing motion estimation. The macroblock size can be 16×16, 8×8, 8×16, and so on, which is not specifically limited in this embodiment of the present application.
  • For the starting search point search, refer to the specific process of step1 in the UMHexagonS algorithm above.
  • Step 402: When the early-termination check based on the cost value of the initial search point does not terminate the search, determine the first search range according to the motion vector of the previous frame at the target position in the target macroblock;
  • Step 403: Perform an asymmetric cross search according to the first search range to obtain a first target search point;
  • The cost value of the initial search point is calculated by the cost function (specifically, formula (1)), and the Early_Termination judgment is then performed according to that cost value.
  • For the specific judgment method, refer to the UMHexagonS algorithm above.
  • After the Early_Termination judgment is performed, if the search terminates early, the flow jumps to step4-1 of the UMHexagonS algorithm (hexagon search); otherwise, the flow continues with step2 of the UMHexagonS algorithm (asymmetric cross search).
  • The first search range is determined according to the motion vector of the previous frame at the target position in the target macroblock.
  • The existing asymmetric cross search assumes that the motion amplitude in the horizontal direction of a video sequence is much larger than that in the vertical direction, and sets the horizontal search range to twice the vertical search range. This can lead to the "local optimum" problem: in actual complex and changeable video sequences the assumption does not necessarily hold, some sequences contain violent vertical motion, and searching with the original method may then return sub-optimal results.
  • In this embodiment, the search lengths of the asymmetric cross search in the horizontal and vertical directions of the current frame are set dynamically according to the MV of the previous frame at the same position in the macroblock.
  • the motion vector of the previous frame of the target position in the target macroblock is MV pred_CP ;
  • the first search range is determined, including:
  • MV pred_CP = (MV pred_CP_h, MV pred_CP_v);
  • W is the preset search range of the target macroblock
  • MV pred_CP_h is the component of MV pred_CP in the horizontal direction
  • MV pred_CP_v is the component of MV pred_CP in the vertical direction
  • rangeX is the horizontal search length
  • rangeY is the vertical search length.
  • rangeX and rangeY are determined through the above process, and the candidate motion vector set $\Omega_1$ is then calculated by formula (5), in which the value range of i depends on rangeX and the value range of j depends on rangeY.
  • Searching this set yields a new search point, that is, the first target search point.
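Once rangeX and rangeY are known, building the cross-shaped candidate set is straightforward. The sketch below takes both lengths as inputs, because the formulas that derive them from MV pred_CP and W are not reproduced in the text above. The spacing of two pixels between points follows the usual description of the UMHexagonS cross pattern and is an assumption, as is the function name.

```python
def asymmetric_cross_candidates(center, range_x, range_y, step=2):
    """Return the center plus points along the horizontal and vertical arms of the cross.

    range_x and range_y are the horizontal and vertical search lengths (rangeX, rangeY).
    step=2 is an assumed spacing between neighboring candidates.
    """
    cx, cy = center
    points = [(cx, cy)]
    for d in range(step, range_x + 1, step):
        points += [(cx + d, cy), (cx - d, cy)]
    for d in range(step, range_y + 1, step):
        points += [(cx, cy + d), (cx, cy - d)]
    return points
```

Swapping the two lengths, for example `range_x=8, range_y=16`, turns the cross into a vertically elongated one, which is the kind of adaptation the dynamic setting is meant to allow.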
  • Step 404: When the early-termination check based on the cost value of the initial search point and the cost value of the first target search point does not terminate the search, perform a full search in the rectangular window to obtain the second target search point;
  • Step 405: When the early-termination check based on the cost value of the initial search point and the cost value of the second target search point does not terminate the search, determine the second search range according to the motion vector of the previous frame and the motion vector of the frame two frames before at the target position in the target macroblock;
  • The Early_Termination judgment is performed according to the cost value of the initial search point and the cost value of the first target search point.
  • For the specific judgment method, refer to the Early_Termination judgment operation in the UMHexagonS algorithm above.
  • The second search range is determined according to the motion vector of the previous frame and the motion vector of the frame two frames before at the target position in the target macroblock;
  • That is, the search range of the extended multi-level hexagonal grid point search (the second search range) is first determined from these two motion vectors at the same position in the target macroblock, which narrows the search to part of the complete super-hexagon template within the macroblock.
  • The motion vector of the current frame at the target position in the target macroblock is MV current_t, the motion vector of the previous frame at that position is MV current_t-1, and the motion vector of the frame two frames before at that position is MV current_t-2;
  • the second search range is determined, including:
  • If any one of MV current_t, MV current_t-1, and MV current_t-2 is 0, the search is performed with the extended multi-level hexagonal lattice point search of the original UMHexagonS method, that is, execution continues according to step3-2.
  • the second search range is determined according to the included angle ⁇ 2 .
  • the second search range corresponding to the angle range is determined, and the details are as follows:
  • the second search range is determined to be all search points within the fourth quadrant of the target macroblock.
  • Fig. 5 shows the pattern of the second search range under different conditions of ⁇ 2 :
  • the second search range is as shown in (a) in FIG. 5 ;
  • the second search range is as shown in (c) in FIG. 5 ;
  • the second search range is as shown in (d) of FIG. 5 .
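Restricting the hexagon template to one quadrant can be sketched as a simple filter over the template offsets. The angle ranges that select each quadrant are not reproduced in the text above, so the quadrant number is taken as an input. The conventions below (y grows upward, points on the axes are kept) are assumptions, and the function name is hypothetical.

```python
def restrict_to_quadrant(offsets, quadrant):
    """Keep only the offsets lying in the given quadrant (1 to 4), axes included.

    Assumed convention: x grows to the right and y grows upward.
    """
    tests = {
        1: lambda x, y: x >= 0 and y >= 0,
        2: lambda x, y: x <= 0 and y >= 0,
        3: lambda x, y: x <= 0 and y <= 0,
        4: lambda x, y: x >= 0 and y <= 0,
    }
    keep = tests[quadrant]
    return [(x, y) for x, y in offsets if keep(x, y)]
```

Applied to the 16-point template of formula (9), a single quadrant keeps 5 of the 16 points, so each level of the hexagon search costs roughly a third of the original evaluations.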
  • Step 406: Perform an extended multi-level hexagonal grid point search according to the second search range to obtain a third target search point;
  • Step 407: When the early-termination check based on the cost value of the initial search point and the cost value of the third target search point does not terminate the search, perform an extended hexagon search to obtain the fourth target search point;
  • Step 408: Determine the motion vector of the target macroblock according to the fourth target search point.
  • After the extended multi-level hexagonal grid point search, the third target search point is obtained;
  • For the judgment method, refer to the Early_Termination judgment operation in the UMHexagonS algorithm above.
  • The extended hexagon search is then performed as in step4 of the existing UMHexagonS algorithm. The final search point determined by this search is the fourth target search point, from which the final motion vector is obtained, that is, the motion estimation result for the target macroblock.
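The control flow of steps 401 to 408 can be summarized in a short sketch. Everything here is hypothetical scaffolding: the individual searches are passed in as callables that return `(motion_vector, cost)` pairs, and `terminates_early` stands in for the Early_Termination judgment. The sketch also simplifies by treating every early termination as a jump straight to the final extended hexagon refinement, which is an assumption rather than something the text above confirms.

```python
def estimate_motion(steps, terminates_early):
    """Run the staged search; steps maps stage names to callables returning (mv, cost)."""
    start = steps["start"]()                      # Step 401: starting search point
    best = start
    # Steps 402 to 406: each stage runs only if the early-termination check,
    # fed with the starting point and the latest result, does not fire.
    for name in ("cross", "window", "hexagon_grid"):
        if terminates_early(start, best):
            break
        best = steps[name](best)
    # Steps 407 and 408: the final refinement yields the motion vector.
    return steps["extended_hexagon"](best)
```

Keeping the stages as injected callables makes it easy to swap the original UMHexagonS cross search or hexagon grid search for the range-adapted variants.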
  • In the embodiments of the present application, the determination of the search range for the asymmetric cross search and for the extended multi-level hexagonal lattice point search is optimized, and a more refined, dynamically adaptive prediction mechanism and search module are adopted. The optimal matching block and the optimal motion vector can thus be determined more accurately and quickly, the search avoids falling prematurely into a local optimum, computational complexity and coding time are reduced, and the efficiency and effect of the video encoder are improved.
  • The example in this section encodes a quarter common intermediate format (Quarter Common Intermediate Format, QCIF) video sequence (176×144), in which one frame contains 99 macroblocks of size 16×16.
  • Apart from the leading I frame, the rest of the frames are P frames, that is, IPPP...
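The macroblock count for a QCIF frame follows directly from the frame and macroblock dimensions stated above:

```latex
\frac{176}{16} \times \frac{144}{16} = 11 \times 9 = 99
```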
  • step1 start search point search
  • Let E be the current block, with A on the left side of E, B above E, and C on the upper right of E. If there is more than one block on the left side of E, the topmost one is selected as A; if there is more than one block above E, the leftmost one is selected as B. The following guidelines are also observed when predicting E:
  • the predicted value is the median value of the motion vectors of blocks A, B, and C;
  • the predicted value of the upper 16×8 block is obtained from B, and the predicted value of the lower 16×8 block is obtained from A;
  • the predicted value of the left 8×16 block is obtained from A, and the predicted value of the right 8×16 block is obtained from C;
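The median predictor for blocks A, B, and C can be sketched as follows. Taking the median separately for the horizontal and vertical components is the usual practice in H.264-style encoders and is assumed here; the function name is hypothetical.

```python
def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three motion vectors given as (x, y) tuples."""
    xs = sorted(v[0] for v in (mv_a, mv_b, mv_c))
    ys = sorted(v[1] for v in (mv_a, mv_b, mv_c))
    # With three values, the median is the middle element after sorting.
    return (xs[1], ys[1])
```

Note that the result need not equal any one of the three input vectors, since each component is chosen independently.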
  • Motion vector estimation adopts top-down upper-layer motion vector prediction, as shown in Figure 6b. If the motion vector of the 16 ⁇ 16 block has been obtained, it can be used to predict two sub-blocks (16 ⁇ 8, 8 ⁇ 16); and a motion vector of a 16 ⁇ 8 (8 ⁇ 16) block can be used for Predict two 8x8 subblocks at the same location, etc.
  • the motion vector of the current block in the reference frame t'+1 can be used to The motion vector of the current block in frame t', namely:
  • the prediction vector set S in this step includes the following vectors:
  • step2: asymmetric cross search
  • this step uses the method of the embodiment of the present application to dynamically set the horizontal and vertical search lengths of the asymmetric cross search in the current frame according to the MV of the previous frame at the same macroblock position.
  • W = 16.
  • step3: non-uniform multi-level hexagonal lattice point search
  • step3-1: 5×5 small rectangular window full search; this step uses the original UMHexagonS method
  • step3-2: this step uses the improved extended multi-level hexagonal lattice point search method proposed in this patent
  • step4: extended hexagonal search
  • step4-1: hexagonal search; this step uses the original UMHexagonS method
  • step4-2: small diamond search; this step uses the original UMHexagonS method
  • an embodiment of the present application provides a motion estimation apparatus 700, the apparatus is applied to an electronic device, and the apparatus includes:
  • the first search module 701 is configured to perform a starting search point search in the target macroblock to obtain a starting search point;
  • the first determining module 702 is configured to determine a first search range according to the motion vector of the previous frame at the target position in the target macroblock when it is judged, according to the cost value of the starting search point, that the search is not terminated early;
  • the second search module 703 is configured to perform an asymmetric cross search according to the first search range to obtain a first target search point;
  • the third search module 704 is configured to perform a rectangular-window full search to obtain a second target search point when it is judged, according to the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early;
  • the second determining module 705 is configured to determine a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock when it is judged, according to the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early;
  • the fourth search module 706 is configured to perform an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
  • the fifth search module 707 is configured to perform an extended hexagonal search to obtain a fourth target search point when it is judged, according to the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early;
  • the third determining module 708 is configured to determine the motion vector of the target macroblock according to the fourth target search point.
  • the motion vector of the previous frame of the target position in the target macroblock is MV pred_CP ;
  • the first determining module is further used for:
  • MV pred_CP = (MV pred_CP_h , MV pred_CP_v );
  • W is the preset search range of the target macroblock
  • MV pred_CP_h is the component of MV pred_CP in the horizontal direction
  • MV pred_CP_v is the component of MV pred_CP in the vertical direction
  • rangeX is the horizontal search length
  • rangeY is the vertical search length.
  • the motion vector of the current frame at the target position in the target macroblock is MV current_t
  • the motion vector of the previous frame at the target position in the target macroblock is MV current_t-1
  • the motion vector of the frame two frames earlier at the target position in the target macroblock is MV current_t-2 ;
  • the second determining module is further configured to:
  • the angle α 1 of MV current_t-1 relative to MV current_t-2 is determined by the following formula:
  • the second search range is determined according to the included angle α 2 , where the value range of a is [π/10, π/4].
  • the second determining module is further configured to:
  • the second search range is determined to be all search points within the fourth quadrant of the target macroblock.
  • the execution body may be a motion estimation apparatus, or a control module in the motion estimation apparatus for executing the motion estimation method.
  • the motion estimation method performed by the motion estimation apparatus is used as an example to describe the motion estimation apparatus provided by the embodiments of the present application.
  • the motion estimation apparatus in this embodiment of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal.
  • the apparatus may be a mobile electronic device or a non-mobile electronic device.
  • the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA).
  • the non-mobile electronic device may be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiments of the present application.
  • the motion estimation apparatus in this embodiment of the present application may be an apparatus with an operating system.
  • the operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
  • the motion estimation apparatus provided in this embodiment of the present application can implement each process implemented by the method embodiment in FIG. 4 . To avoid repetition, details are not repeated here.
  • an embodiment of the present application further provides an electronic device 800, including a memory 801, a processor 802, and a program or instruction stored in the memory 801 and executable on the processor 802, When the program or instruction is executed by the processor 802, each process of the above-mentioned motion estimation method embodiment is implemented, and the same technical effect can be achieved. To avoid repetition, details are not described here.
  • the electronic devices in the embodiments of the present application include mobile electronic devices and non-mobile electronic devices.
  • FIG. 9 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
  • the electronic device 900 includes but is not limited to: a radio frequency unit 901, a network module 902, an audio output unit 903, an input unit 904, a sensor 905, a display unit 906, a user input unit 907, an interface unit 908, a memory 909, a processor 910, and other components.
  • the electronic device 900 may also include a power supply (such as a battery) that supplies power to the components; the power supply may be logically connected to the processor 910 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
  • the structure of the electronic device shown in FIG. 9 does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than shown, combine some components, or arrange the components differently, which will not be repeated here.
  • the processor 910 is used for:
  • an asymmetric cross search is performed to obtain a first target search point
  • the second search range is determined according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock
  • an extended multi-level hexagonal lattice point search is performed to obtain a third target search point
  • a motion vector of the target macroblock is determined according to the fourth target search point.
  • the motion vector of the previous frame of the target position in the target macroblock is MV pred_CP ;
  • the processor 910 is further configured to:
  • MV pred_CP = (MV pred_CP_h , MV pred_CP_v );
  • W is the preset search range of the target macroblock
  • MV pred_CP_h is the component of MV pred_CP in the horizontal direction
  • MV pred_CP_v is the component of MV pred_CP in the vertical direction
  • rangeX is the horizontal search length
  • rangeY is the vertical search length.
  • the motion vector of the current frame at the target position in the target macroblock is MV current_t
  • the motion vector of the previous frame at the target position in the target macroblock is MV current_t-1
  • the motion vector of the frame two frames earlier at the target position in the target macroblock is MV current_t-2 ;
  • the processor 910 is further configured to:
  • the angle α 1 of MV current_t-1 relative to MV current_t-2 is determined by the following formula:
  • the second search range is determined according to the included angle α 2 , where the value range of a is [π/10, π/4].
  • processor 910 is further configured to:
  • the second search range is all search points within the fourth quadrant of the target macroblock.
  • the embodiments of the present application optimize the selection of the search range in the motion estimation process of video coding, so that the optimal matching block and the optimal motion vector can be determined more accurately and quickly, the search can be prevented from falling into a local optimum, and the computation can be reduced.
  • the complexity and coding time improve the efficiency and effectiveness of video coding.
  • the input unit 904 may include a graphics processing unit (GPU) 9041 and a microphone 9042. The graphics processing unit 9041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
  • the display unit 906 may include a display panel 9061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 907 includes a touch panel 9071 and other input devices 9072 .
  • the touch panel 9071 is also called a touch screen.
  • the touch panel 9071 may include two parts, a touch detection device and a touch controller.
  • Other input devices 9072 may include, but are not limited to, physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be repeated here.
  • Memory 909 may be used to store software programs as well as various data, including but not limited to application programs and operating systems.
  • the processor 910 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 910.
  • Embodiments of the present application further provide a readable storage medium storing a program or an instruction; when the program or instruction is executed by a processor, each process of the foregoing motion estimation method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the processor is the processor in the electronic device described in the foregoing embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • An embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the motion estimation method embodiments described above.
  • the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
  • Embodiments of the present application further provide a computer program product, the computer program product being stored in a non-volatile storage medium and being configured to be executed by at least one processor to implement the steps of the method described above.
  • Embodiments of the present application further provide a motion estimation apparatus, which is configured to perform the above-described method.
  • the method of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

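Embodiments of the present application provide a motion estimation method and apparatus, an electronic device, and a readable storage medium, and belong to the field of communication technology. The method includes: determining, according to the motion vector of the previous frame at a target position in a target macroblock, a first search range for an asymmetric cross search; and determining, according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock, a second search range for an extended multi-level hexagonal lattice point search.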

Description

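Motion estimation method and apparatus, electronic device, and readable storage medium
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 202110412068.0, filed in China on April 16, 2021, the entire contents of which are incorporated herein by reference.
Technical field
Embodiments of the present application relate to the field of communication technology, and in particular to a motion estimation method and apparatus, an electronic device, and a readable storage medium.
Background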
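Video coding starts from the high correlation of video signals and the visual characteristics of the human eye; through suitable coding, it removes the redundancy arising from these correlations and from the characteristics of human vision, thereby compressing the video signal and reducing the transmission bit rate. The correlation of video signals can be divided into temporal and spatial correlation. Spatial correlation is the similarity between adjacent pixels within the same picture, and is mainly removed by intra-frame prediction coding; temporal correlation is the similarity between adjacent pictures in a sequence, and is mainly removed by inter-frame prediction coding. Inter-frame prediction coding is also called motion compensated prediction (MCP) coding. Its purpose is to find the most similar block (the matching block) for the current coding block and, from the position of the prediction block, to derive the relative spatial offset between the two, that is, the motion vector (MV). The process of obtaining the motion vector by searching is called motion estimation (ME), and the process of locating the prediction block in a specified reference frame according to the motion vector is called motion compensation. The principle of predictive coding is shown in FIG. 1.
Motion estimation is the key technique of predictive coding. The more accurate the motion estimation, the higher the quality of the prediction-compensated picture, the smaller the compensation residual, the fewer bits needed to code it, and the lower the bit rate. At the same time, finding the block most similar to the current block requires a large amount of searching. Motion estimation is the most computationally complex module in video coding, typically accounting for 50% to 70% of the complexity of the whole encoder.
Motion estimation methods fall into two classes: full search and fast search. Full search visits all points in the search range in order, compares the cost function point by point, and selects the point that minimizes it, that is, the optimal point. Because it traverses every pixel position in the search range, full search finds the optimal solution of the block's cost function, but it also has the highest complexity. The basic idea of fast search is to reduce the number of candidate search points; this usually requires designing a series of effective search patterns that find the optimum with as few search points as possible. Current fast search methods still examine many candidate points, considerable redundancy remains among the search points, and the matching process easily falls into a local optimum.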
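Summary
The purpose of the embodiments of the present application is to provide a motion estimation method and apparatus, an electronic device, and a readable storage medium that reduce computational complexity and encoding time while preventing the search from falling into a local optimum.
The present application is implemented as follows:
In a first aspect, an embodiment of the present application provides a motion estimation method, the method being performed by an electronic device and including:
performing a starting search point search in a target macroblock to obtain a starting search point;
when it is judged, according to the cost value of the starting search point, that the search is not terminated early, determining a first search range according to the motion vector of the previous frame at a target position in the target macroblock;
performing an asymmetric cross search according to the first search range to obtain a first target search point;
when it is judged, according to the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early, performing a rectangular-window full search to obtain a second target search point;
when it is judged, according to the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early, determining a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock;
performing an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
when it is judged, according to the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early, performing an extended hexagonal search to obtain a fourth target search point;
determining the motion vector of the target macroblock according to the fourth target search point.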
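In a second aspect, an embodiment of the present application provides a motion estimation apparatus, the apparatus being applied to an electronic device and including:
a first search module, configured to perform a starting search point search in a target macroblock to obtain a starting search point;
a first determining module, configured to determine a first search range according to the motion vector of the previous frame at a target position in the target macroblock when it is judged, according to the cost value of the starting search point, that the search is not terminated early;
a second search module, configured to perform an asymmetric cross search according to the first search range to obtain a first target search point;
a third search module, configured to perform a rectangular-window full search to obtain a second target search point when it is judged, according to the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early;
a second determining module, configured to determine a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock when it is judged, according to the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early;
a fourth search module, configured to perform an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
a fifth search module, configured to perform an extended hexagonal search to obtain a fourth target search point when it is judged, according to the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early;
a third determining module, configured to determine the motion vector of the target macroblock according to the fourth target search point.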
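In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor, where the program or instruction, when executed by the processor, implements the steps of the motion estimation method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a readable storage medium storing a program or instruction, where the program or instruction, when executed by a processor, implements the steps of the motion estimation method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, the chip including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or instruction to implement the method according to the first aspect.
In a sixth aspect, an embodiment of the present application further provides a computer program product, the computer program product being stored in a non-volatile storage medium and being configured to be executed by at least one processor to implement the steps of the method described above.
In a seventh aspect, an embodiment of the present application further provides a motion estimation apparatus configured to perform the method described above.
The embodiments of the present application optimize the selection of the search range in the motion estimation process of video coding, so that the best matching block and the best motion vector can be determined more accurately and quickly, the search can be prevented from falling into a local optimum, and computational complexity and encoding time can be reduced, improving the efficiency and effectiveness of video coding.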
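Brief description of the drawings
FIG. 1 is a schematic diagram of the principle of predictive coding;
FIG. 2 is a schematic diagram of an application scenario of the UMHexagonS algorithm;
FIG. 3 is a schematic flowchart of the UMHexagonS algorithm;
FIG. 4 is a schematic flowchart of the motion estimation method provided by an embodiment of the present application;
FIG. 5 is a first schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 6a is a second schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 6b is a third schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 6c is a fourth schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 6d is a fifth schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of the motion estimation apparatus provided by an embodiment of the present application;
FIG. 8 is a first schematic structural diagram of the electronic device provided by an embodiment of the present application;
FIG. 9 is a second schematic structural diagram of the electronic device provided by an embodiment of the present application.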
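Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish similar objects, not to describe a particular order or sequence. Terms used in this way are interchangeable where appropriate, so that the embodiments of the present application can be practiced in orders other than those illustrated or described here. Objects distinguished by "first", "second", and the like are usually of one class, and their number is not limited; for example, the first object may be one or more than one. In addition, "and/or" in the description and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.
For a better understanding of the solutions of the embodiments of the present application, the following background is introduced first.
To reduce the complexity of motion estimation, a series of fast search methods have been proposed. The basic idea of fast search is to reduce the number of candidate search points; this usually requires designing a series of effective search patterns that find the optimum with as few search points as possible. Classic methods include those that exploit the centre-biased distribution of motion vectors, namely the Three Step Search (TSS), the Novel Three Step Search (NTSS), the Four Step Search (FSS), the Block-Based Gradient Descent Search (BBGDS), the Diamond Search (DS), and the Hexagon-based Search (HEXBS); and those that exploit the spatio-temporal correlation of motion vectors, namely Motion Vector Field Adaptive Search Technology (MVFAST), the Unsymmetrical-cross Multi-Hexagon grid Search (UMHexagonS), and the Enhanced Predictive Zonal Search (EPZS). The main premise of these fast search algorithms is that, moving away from the optimal position, the prediction distortion increases monotonically with distance. Motion estimation therefore starts from some search position, finds the position with the smallest distortion in each round, starts the next round from that position, and finally reaches the optimal position. These fast search algorithms generally use compact, centre-biased search patterns to reach the optimum as quickly as possible.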
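The UMHexagonS algorithm
The UMHexagonS search combines a careful global search with a small-range local search, matches with multi-level templates of several shapes, and uses spatio-temporal correlation to estimate motion vectors. The UMHexagonS search method has the following characteristics:
(1) The starting point prediction used by UMHexagonS combines the motion vector correlation of neighbouring blocks within a frame and across frames with the motion correlation between blocks of different sizes that results from macroblock partitioning. It can therefore select, as the starting point, the point that best reflects the motion trend of the current block, which improves estimation accuracy.
(2) Each search step of UMHexagonS depends on the picture content. Its search templates fall into three classes: a mixed template for large-range coarse search, a medium hexagon template for fine search, and a small diamond template for the finest search. Searching different content differently further improves search performance.
(3) UMHexagonS uses a fast search that reduces the number of blocks searched and also ends the search early by search prediction, so it saves more than 90% of the computation relative to full search while still achieving good rate-distortion optimization; its performance is close to that of full search.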
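In the MV search of video coding, motion estimation performance is evaluated under a rate-distortion criterion, using the following cost function:
Figure PCTCN2022086324-appb-000001
In Equation (1), λ_MOTION is the Lagrange multiplier; when the sum of absolute differences (SAD) is used as the distortion measure, its value is:
Figure PCTCN2022086324-appb-000002
where
Figure PCTCN2022086324-appb-000003
is the candidate motion vector,
Figure PCTCN2022086324-appb-000004
is the median prediction vector (MVP) of the current coding block,
Figure PCTCN2022086324-appb-000005
Figure PCTCN2022086324-appb-000006
is the number of bits required to differentially code the motion vector, and QP is the quantization parameter.
The SAD in Equation (1) is computed as follows:
Figure PCTCN2022086324-appb-000007
In Equation (3), s denotes the pixels of the picture currently being coded, c denotes the pixels of the reference picture used in motion estimation, and B_x and B_y denote the extent of the macroblock in the horizontal (x-axis) and vertical (y-axis) directions; in Equation (3), B_x and B_y may take the values 16, 8, or 4.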
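As an illustration only, a cost function of this kind can be sketched in Python. The sketch does not reproduce the equation images: it assumes the common rate-distortion form J = SAD + λ·R, the displacement sign convention is an assumption, and `mv_bits` is a hypothetical stand-in for the bit count of the differentially coded motion vector. Frames are plain lists of pixel rows.

```python
def sad(cur, ref, x0, y0, mv, bw=16, bh=16):
    """Sum of absolute differences between the current block at (x0, y0)
    and the reference block displaced by mv = (mvx, mvy)."""
    mvx, mvy = mv
    total = 0
    for y in range(bh):
        for x in range(bw):
            total += abs(cur[y0 + y][x0 + x] - ref[y0 + y + mvy][x0 + x + mvx])
    return total


def mv_bits(mv, pred):
    """Hypothetical rate model: a rough bit count for coding mv - pred."""
    return sum(2 * abs(a - b).bit_length() + 1 for a, b in zip(mv, pred))


def cost(cur, ref, x0, y0, mv, pred, lam, bw=16, bh=16):
    """Assumed cost J = SAD + lambda * R(mv - pred)."""
    return sad(cur, ref, x0, y0, mv, bw, bh) + lam * mv_bits(mv, pred)
```

A real encoder would take λ from the quantization parameter and the rate from its entropy coder; both are left as inputs here.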
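Referring to FIG. 2 and FIG. 3, the UMHexagonS algorithm comprises four search stages: step1, starting search point search; step2, asymmetric cross search; step3, non-uniform multi-level hexagonal lattice point search; step4, extended hexagonal search. FIG. 2 illustrates the search process in a search window with search range W = 16 (assuming the initial search point is (0, 0)). FIG. 3 is the flowchart of the UMHexagonS search, in which cur_cost is the cost value computed by Equation (1), pred_SAD is the predicted cost of the best matching block of the current block, taken from previously coded neighbouring blocks, and beta1 and beta2 are threshold coefficients determined by the current QP and the current block size.
Assuming the search range is W, and with reference to the flowchart in FIG. 3, the UMHexagonS search steps are as follows:
step1: starting search point search, which determines the current best starting search point;
(a). Median prediction: using spatial correlation, take the median MV_pred_MP of the already computed motion vectors of the left, upper, and upper-right neighbours of the current block;
(b). Origin prediction: to cover static or fixed backgrounds, take MV_pred = (0, 0);
(c). Upper-layer prediction: using the multiple macroblock partitions of motion estimation, take the already computed motion vector MV_pred_UP of the block at the same position that is twice as large, one layer up (up layer);
(d). Corresponding-block prediction: using temporal correlation, take the already computed motion vector MV_pred_CP at the same position in the previous frame;
(e). Neighbouring reference frame prediction: using temporal correlation, scale the already computed MV of the current block in the previous reference frame to obtain the current MV_pred_REF.
The predicted MVs obtained by the above prediction modes form the prediction vector set S; the candidate motion vector in S with the smallest cost is then taken as the starting search point:
m_min = arg min J(m_i, λ_MOTION),
where m_i ∈ S;
and arg denotes the value of the variable at which the objective function J takes its minimum;
The Early_Termination judgment is then performed on the cost value of the current best predicted starting point to decide whether to terminate early. A good starting prediction point (one close to the optimal position, with a small cost value) speeds up the search; otherwise a large number of candidate points are searched, which raises encoder complexity.
Further, when it is judged that the search is not terminated early, the macroblock size is checked. If the macroblock currently being estimated is 4×4, the flow jumps directly to step4-1, hexagonal search; if it is not 4×4 (for example 16×16, 8×8, or 8×16), the flow proceeds to step2, asymmetric cross search.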
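step2: asymmetric cross search;
Because the motion of natural objects is generally more vigorous in the horizontal direction than in the vertical direction, an asymmetric cross search is used; "asymmetric" means that the cross-shaped search range centred on the starting search point is twice as long horizontally as vertically. The candidate motion vector set Ω_1 of this step is:
Ω_1 = {m = (m_x, m_y) | m = (cm_x ± 2i, cm_y), i = 0, 1, 2, …, W/2;
m = (cm_x, cm_y ± 2j), j = 0, 1, 2, …, W/4}, referred to as Equation (5);
In Equation (5), cm is the best predicted starting point obtained in step1, that is, cm = m_min.
The motion vector in Ω_1 with the smallest cost is taken as the current best motion vector, that is:
m_min2 = arg min J(m_i, λ_MOTION),
referred to as Equation (6), where m_i ∈ Ω_1.
The Early_Termination operation is then performed on the cost value of the current best motion vector to decide whether to terminate early.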
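The candidate set of Equation (5) can be enumerated with a short Python sketch. The function name `asymmetric_cross` is chosen here for illustration, and W is assumed to be divisible by 4.

```python
def asymmetric_cross(cm, w):
    """Candidate points of the asymmetric cross centred on cm = (cm_x, cm_y).

    Horizontal arm: (cm_x +/- 2i, cm_y), i = 0..W/2.
    Vertical arm:   (cm_x, cm_y +/- 2j), j = 0..W/4.
    """
    cx, cy = cm
    points = set()
    for i in range(w // 2 + 1):
        points.add((cx + 2 * i, cy))
        points.add((cx - 2 * i, cy))
    for j in range(w // 4 + 1):
        points.add((cx, cy + 2 * j))
        points.add((cx, cy - 2 * j))
    return points
```

For W = 16 the horizontal arm reaches ±16 and the vertical arm ±8, which is the 2:1 asymmetry described above.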
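step3: non-uniform multi-level hexagonal lattice point search, which has two sub-steps:
step3-1: 5×5 small rectangular window full search;
It should be noted that the size of the rectangular window can be chosen according to the search range of the macroblock. For example, FIG. 2 shows a 16×16 macroblock, for which a 5×5 rectangular window can be used; if the search range of the macroblock changes, the size of the rectangular window can be adapted accordingly.
The candidate motion vector set Ω_2 of this step is:
Ω_2 = {m = (m_x, m_y) | |cm_x − m_x| ≤ 2, |cm_y − m_y| ≤ 2}, cm = m_min2, referred to as Equation (7);
The motion vector in Ω_2 with the smallest cost is taken as the current best motion vector, that is:
m_min3 = arg min J(m_i, λ_MOTION),
referred to as Equation (8), where m_i ∈ Ω_2.
The Early_Termination operation is then performed on the cost value of the current best motion vector to decide whether to terminate early.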
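A minimal Python sketch of the window of Equation (7) and of the minimum-cost selection follows. The helper names and the `cost_fn` callback are illustrative assumptions; any function mapping a candidate point to a cost can be passed in.

```python
def window_candidates(cm, radius=2):
    """All points m with |cm_x - m_x| <= radius and |cm_y - m_y| <= radius."""
    cx, cy = cm
    return [(cx + dx, cy + dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


def best_candidate(candidates, cost_fn):
    """Return the candidate with the smallest cost."""
    return min(candidates, key=cost_fn)
```

With the default radius of 2 the window holds 5 × 5 = 25 points, including the centre.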
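step3-2: extended multi-level hexagonal lattice point search;
The 16 search points of the super-hexagon template used in this step (corresponding to the 16×16 macroblock shown in FIG. 2) are:
Ω_16-HP = {m = (x, y) | m = (±4, ±2), (±4, ±1), (±4, 0), (±2, ±3), (0, ±4)}, referred to as Equation (9);
The search area is extended as follows:
for (k = 1; k <= W/4; k++)
{
∏_k = {m = (m_x, m_y) | m_x = cm_x + k·x', m_y = cm_y + k·y', (x', y') ∈ Ω_16-HP}, cm = m_min3
}
The motion vector in ∏_k with the smallest cost is taken as the current best motion vector, that is:
m_min4 = arg min J(m_i, λ_MOTION),
referred to as Equation (10), where m_i ∈ ∏_k.
The Early_Termination operation is then performed on the cost value of the current best motion vector to decide whether to terminate early.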
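The layered expansion of the 16-point template can be sketched in Python as below. The sketch assumes the layers are k = 1 … W/4, since a layer with k = 0 would collapse all 16 points onto the centre; the names `HEX16` and `multi_hexagon` are illustrative.

```python
# The 16 offsets of the super-hexagon template, Equation (9).
HEX16 = [(4, 2), (4, -2), (-4, 2), (-4, -2),
         (4, 1), (4, -1), (-4, 1), (-4, -1),
         (4, 0), (-4, 0),
         (2, 3), (2, -3), (-2, 3), (-2, -3),
         (0, 4), (0, -4)]


def multi_hexagon(cm, w):
    """Return one list of candidate points per layer k = 1..W/4 (assumed bounds)."""
    cx, cy = cm
    return [[(cx + k * x, cy + k * y) for x, y in HEX16]
            for k in range(1, w // 4 + 1)]
```

For W = 16 this yields 4 layers of 16 points, 64 candidates in total.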
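step4: extended hexagonal search, which has two sub-steps:
step4-1: hexagonal search;
The candidate motion vector set Ω_3 of this step is:
Ω_3 = {m = (m_x, m_y) | m = (cm_x ± 2, cm_y), (cm_x ± 1, cm_y ± 2)}, cm = m_min4, referred to as Equation (11);
The motion vector in Ω_3 with the smallest cost is taken as the current best motion vector, that is:
m_min5 = arg min J(m_i, λ_MOTION),
referred to as Equation (12), where m_i ∈ Ω_3.
The above step is repeated, that is, the hexagon template is applied again and again until the best prediction point is the centre of the hexagon.
step4-2: small diamond search;
The candidate motion vector set Ω_4 of this step is:
Ω_4 = {m = (m_x, m_y) | m = (cm_x ± 1, cm_y), (cm_x, cm_y ± 1)}, cm = m_min5, referred to as Equation (13);
The motion vector in Ω_4 with the smallest cost is taken as the current best motion vector, that is:
Figure PCTCN2022086324-appb-000013
referred to as Equation (14), where m_i ∈ Ω_4.
The above step is repeated, that is, the small diamond template is applied again and again until the best prediction point is the centre of the small diamond, which gives the final motion vector.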
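The two refinement stages share the same loop: apply a template around the current best point and move while the cost improves. The Python sketch below illustrates this under the assumption that the loop stops as soon as no template point is strictly better than the centre; `refine` and `cost_fn` are illustrative names, not part of the source.

```python
HEXAGON = [(2, 0), (-2, 0), (1, 2), (1, -2), (-1, 2), (-1, -2)]  # Equation (11)
DIAMOND = [(1, 0), (-1, 0), (0, 1), (0, -1)]                     # Equation (13)


def refine(start, pattern, cost_fn):
    """Repeat the template search until the centre is the best point."""
    best = start
    best_cost = cost_fn(best)
    while True:
        cx, cy = best
        candidate = min(((cx + dx, cy + dy) for dx, dy in pattern), key=cost_fn)
        candidate_cost = cost_fn(candidate)
        if candidate_cost >= best_cost:
            return best  # no template point improves on the centre
        best, best_cost = candidate, candidate_cost
```

Running the hexagon stage first and the diamond stage on its result, for example `refine(refine(start, HEXAGON, f), DIAMOND, f)`, mirrors the order of step4-1 and step4-2.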
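The motion estimation method provided by the embodiments of the present application is described in detail below through specific embodiments and application scenarios, with reference to the accompanying drawings.
Referring to FIG. 4, an embodiment of the present application provides a motion estimation method performed by an electronic device. It should be noted that the method further improves and optimizes the UMHexagonS algorithm flow. The method includes the following steps:
Step 401: perform a starting search point search in the target macroblock to obtain a starting search point;
In the embodiments of the present application, the target macroblock is the macroblock currently undergoing motion estimation. Its size may be 16×16, 8×8, 8×16, and so on, which is not specifically limited in the embodiments of the present application.
Specifically, for the starting search point search, refer to the process of step1 of the UMHexagonS algorithm above.
Step 402: when it is judged, according to the cost value of the starting search point, that the search is not terminated early, determine a first search range according to the motion vector of the previous frame at the target position in the target macroblock;
Step 403: perform an asymmetric cross search according to the first search range to obtain a first target search point;
In the embodiments of the present application, the cost value of the starting search point is computed by the cost function (specifically Equation (1)), and the Early_Termination judgment is then made on that cost value; for the judgment method, refer to the Early_Termination judgment operation in the UMHexagonS algorithm above.
In some implementations, when it is judged that the search is not terminated early, the macroblock size is checked. If the macroblock currently being estimated is 4×4, the flow jumps directly to step4-1 of the UMHexagonS algorithm, hexagonal search; if it is not 4×4 (for example 16×16, 8×8, or 8×16), the flow proceeds to step2 of the UMHexagonS algorithm, asymmetric cross search.
In the embodiments of the present application, when it is judged that the search is not terminated early, the first search range is determined according to the motion vector of the previous frame at the target position in the target macroblock.
The existing asymmetric cross search assumes that motion in a video sequence is much larger horizontally than vertically, and therefore sets the horizontal search range to twice the vertical one. This can lead to a "local optimum" problem. In real, complex and varied video sequences the assumption does not always hold; some sequences also contain vigorous vertical motion, and searching by the original method may give a suboptimal result.
For these reasons, in the embodiments of the present application the horizontal and vertical search lengths of the asymmetric cross search in the current frame are set dynamically according to the MV of the previous frame at the same macroblock position.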
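Specifically, the motion vector of the previous frame at the target position in the target macroblock is MV_pred_CP;
Determining the first search range according to the motion vector of the previous frame at the target position in the target macroblock includes:
determining the horizontal search length and the vertical search length by the following formulas:
MV_pred_CP = (MV_pred_CP_h, MV_pred_CP_v);
if (MV_pred_CP_h ≥ m × MV_pred_CP_v), m ≥ 1.5;
rangeX = W/2, rangeY = W/4;
else if (MV_pred_CP_v ≥ n × MV_pred_CP_h), n ≥ 1.5;
rangeX = W/4, rangeY = W/2;
else rangeX = W/2, rangeY = W/2;
where W is the preset search range of the target macroblock, MV_pred_CP_h is the horizontal component of MV_pred_CP, MV_pred_CP_v is the vertical component of MV_pred_CP, rangeX is the horizontal search length, and rangeY is the vertical search length.
Once rangeX and rangeY have been determined in this way, the candidate motion vector set Ω_1 is computed using Equation (5), in which the range of i depends on rangeX and the range of j depends on rangeY.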
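The rule above can be sketched in Python as follows. Two points are assumptions of the sketch rather than statements of the source: the component magnitudes are compared (the source compares the components as written), and i and j are taken to run from 0 to rangeX and rangeY respectively. The function names are illustrative.

```python
def search_lengths(mv_pred_cp, w, m=1.5, n=1.5):
    """Choose (rangeX, rangeY) from the previous-frame MV at the same position."""
    h, v = abs(mv_pred_cp[0]), abs(mv_pred_cp[1])  # magnitudes: an assumption
    if h >= m * v:
        return w // 2, w // 4   # mostly horizontal motion
    if v >= n * h:
        return w // 4, w // 2   # mostly vertical motion
    return w // 2, w // 2       # no dominant direction


def adaptive_cross(cm, range_x, range_y):
    """Cross candidates with i = 0..rangeX and j = 0..rangeY (assumed bounds)."""
    cx, cy = cm
    points = set()
    for i in range(range_x + 1):
        points.update({(cx + 2 * i, cy), (cx - 2 * i, cy)})
    for j in range(range_y + 1):
        points.update({(cx, cy + 2 * j), (cx, cy - 2 * j)})
    return points
```

With W = 16, a previous-frame MV of (1, 6) gives rangeX = 4 and rangeY = 8, so the cross becomes taller than it is wide.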
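After the asymmetric cross search, a new search point is obtained, namely the first target search point.
Step 404: when it is judged, according to the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early, perform a rectangular-window full search to obtain a second target search point;
Step 405: when it is judged, according to the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early, determine a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock;
In the embodiments of the present application, the Early_Termination judgment is made on the cost value of the starting search point and the cost value of the first target search point; for the judgment method, refer to the Early_Termination judgment operation in the UMHexagonS algorithm above.
When it is judged that the search is not terminated early, a rectangular-window full search is performed, for example the 5×5 small rectangular window full search of the UMHexagonS algorithm, which gives the second target search point. The Early_Termination judgment is then made on the cost value of the starting search point and the cost value of the second target search point; for the judgment method, refer to the Early_Termination judgment operation in the UMHexagonS algorithm above.
When it is judged that the search is not terminated early, the second search range is determined according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock;
In the existing UMHexagonS algorithm, if the search is not terminated early after the rectangular-window full search, the extended multi-level hexagonal lattice point search is performed. Its search range covers all search points of the complete super-hexagon template within the macroblock, so cost values must be computed for many search points.
In the embodiments of the present application, if the search is not terminated early after the rectangular-window full search, the search range of the extended multi-level hexagonal lattice point search, namely the second search range, is first determined according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the same position in the target macroblock, which further narrows the search range within the complete super-hexagon template in the macroblock.
Specifically, assume the frame currently being coded is t, the motion vector of the current frame at the target position in the target macroblock is MV_current_t, that of the previous frame is MV_current_t-1, and that of the frame two frames earlier is MV_current_t-2;
Determining the second search range according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock includes:
if any one of MV_current_t, MV_current_t-1, and MV_current_t-2 is 0, the extended multi-level hexagonal lattice point search of the original UMHexagonS method is performed, that is, execution continues with step3-2.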
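If none of MV_current_t, MV_current_t-1, and MV_current_t-2 is 0, the angle α_1 of MV_current_t-1 relative to MV_current_t-2 is determined by the following formula:
Figure PCTCN2022086324-appb-000014
and the angle α_2 of MV_current_t relative to MV_current_t-1 is determined by the following formula:
Figure PCTCN2022086324-appb-000015
If |α_1 − α_2| > a, the extended multi-level hexagonal lattice point search of the original UMHexagonS method is performed, that is, execution continues with step3-2.
If |α_1 − α_2| ≤ a, the second search range is determined according to the angle α_2.
The value range of a is [π/10, π/4], for example a = π/4.
In some implementations, the second search range is determined from the angular interval in which α_2 lies, as follows:
when α_2 is in (0, π/2], the second search range is all search points in the first quadrant of the target macroblock;
when α_2 is in (π/2, π], the second search range is all search points in the second quadrant of the target macroblock;
when α_2 is in (π, 3π/2], the second search range is all search points in the third quadrant of the target macroblock;
when α_2 is in (3π/2, 2π], the second search range is all search points in the fourth quadrant of the target macroblock.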
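The decision logic above can be sketched in Python. The angle formulas appear only as images in the source, so the sketch substitutes an assumption: the angle of one vector relative to another is taken as the difference of their `atan2` directions, wrapped into (0, 2π]. The function names are illustrative, and `None` stands for "fall back to the original step3-2 search".

```python
import math


def relative_angle(v, ref):
    """Angle of v relative to ref, wrapped into (0, 2*pi] (assumed definition)."""
    a = (math.atan2(v[1], v[0]) - math.atan2(ref[1], ref[0])) % (2 * math.pi)
    return a if a > 0 else 2 * math.pi


def second_search_quadrant(mv_t, mv_t1, mv_t2, a=math.pi / 4):
    """Return the quadrant (1..4) to search, or None to use the full template."""
    if any(tuple(mv) == (0, 0) for mv in (mv_t, mv_t1, mv_t2)):
        return None
    alpha1 = relative_angle(mv_t1, mv_t2)
    alpha2 = relative_angle(mv_t, mv_t1)
    if abs(alpha1 - alpha2) > a:
        return None
    # (0, pi/2] -> 1, (pi/2, pi] -> 2, (pi, 3pi/2] -> 3, (3pi/2, 2pi] -> 4
    return min(4, math.ceil(alpha2 / (math.pi / 2)))
```

The threshold a defaults to π/4, the example value given in the text; any value in [π/10, π/4] can be passed instead.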
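Referring to FIG. 5, which shows the pattern of the second search range for the different cases of α_2:
when α_2 is in (0, π/2], the second search range is as shown in (a) of FIG. 5;
when α_2 is in (π/2, π], the second search range is as shown in (b) of FIG. 5;
when α_2 is in (π, 3π/2], the second search range is as shown in (c) of FIG. 5;
when α_2 is in (3π/2, 2π], the second search range is as shown in (d) of FIG. 5.
Comparing FIG. 2 and FIG. 5: in the UMHexagonS search illustrated in FIG. 2 ((0, 0) is the starting search point and the search range is W = 16), the total number of points searched in step3-2 is 4 × 16 = 64. With the method of the embodiments of the present application, the search is completed by evaluating only one of (a), (b), (c), and (d) of FIG. 5, so the number of search points is reduced to 4 × 5 = 20, a reduction of (64 − 20)/64 = 68.75% relative to the 64 points of the original method.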
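The point counts can be checked with a short Python sketch. It assumes that a quadrant includes the template points lying on its two bordering axes, which is the reading under which each layer contributes 5 points; FIG. 5 itself is not reproduced here, so this is an interpretation rather than the figure's definition.

```python
# The 16 offsets of the super-hexagon template, Equation (9).
HEX16 = [(4, 2), (4, -2), (-4, 2), (-4, -2),
         (4, 1), (4, -1), (-4, 1), (-4, -1),
         (4, 0), (-4, 0),
         (2, 3), (2, -3), (-2, 3), (-2, -3),
         (0, 4), (0, -4)]

# Sign of x and y for quadrants 1..4.
SIGNS = {1: (1, 1), 2: (-1, 1), 3: (-1, -1), 4: (1, -1)}


def quadrant_points(quadrant):
    """Template points in the given quadrant, bordering axes included (assumed)."""
    sx, sy = SIGNS[quadrant]
    return [(x, y) for x, y in HEX16 if x * sx >= 0 and y * sy >= 0]


def reduction_percent(w=16, quadrant=1):
    """Percentage of step3-2 points saved by searching one quadrant only."""
    layers = w // 4
    full = layers * len(HEX16)
    reduced = layers * len(quadrant_points(quadrant))
    return 100.0 * (full - reduced) / full
```

For W = 16 each quadrant keeps 5 of the 16 template points per layer, so 20 of 64 points are evaluated.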
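Step 406: perform an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
Step 407: when it is judged, according to the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early, perform an extended hexagonal search to obtain a fourth target search point;
Step 408: determine the motion vector of the target macroblock according to the fourth target search point.
In the embodiments of the present application, the third target search point is obtained after the extended multi-level hexagonal lattice point search; the Early_Termination judgment is then made on the cost value of the starting search point and the cost value of the third target search point; for the judgment method, refer to the Early_Termination judgment operation in the UMHexagonS algorithm above.
When it is judged that the search is not terminated early, step4 of the existing UMHexagonS algorithm, extended hexagonal search, is performed. The final search point found by that search is the fourth target search point, and the final motion vector, that is, the motion estimation result for the target macroblock, is obtained from it.
In the embodiments of the present application, the determination of the search range of the asymmetric cross search and of the extended multi-level hexagonal lattice point search in the existing UMHexagonS algorithm is optimized. The resulting prediction mechanism and search pattern adapt dynamically, so that the best matching block and the best motion vector can be determined more accurately and quickly, the search is less likely to get trapped early in a local optimum, and computational complexity and encoding time are considerably reduced, improving the efficiency and effectiveness of the video encoder.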
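Referring to FIG. 6a to FIG. 6d, an application example of the method of the embodiments of the present application is described with reference to the drawings:
In this example, a Quarter Common Intermediate Format (QCIF) video sequence (176×144) is encoded; one frame contains 99 macroblocks of size 16×16. During encoding, all frames except the first are P frames, that is, the IPPP… structure is used (note: in video coding an I frame is a key frame, and a P frame represents the difference between that frame and a preceding key frame or P frame). Frame coding mode is used (note: with field coding mode the relevant configuration parameters would need to be multiplied by 2), the frame rate is 24 frames per second, the motion estimation search range is W = 16, the number of reference frames is set to 1, rate-distortion optimized coding is enabled, and the quantization parameter is QP = 28. The current coding block is assumed to be a 16×16 block in a P frame.
step1: starting search point search;
(a). Median prediction: compute the median prediction MV_pred_MP as shown in FIG. 6a; the median prediction motion vector is MV_pred_MP = median(MV_A, MV_B, MV_C), where the median() function takes the median value.
Let E be the current block, A the block to the left of E, B the block above E, and C the block to the upper right of E. If there is more than one block to the left of E, the topmost one is selected as A, and the leftmost block above E is selected as B. The following rules are also observed when predicting E:
(1). Except for blocks of size 16×8 and 8×16, the predicted value is the median of the motion vectors of blocks A, B, and C;
(2). For 16×8 blocks, the predicted value of the upper 16×8 block is obtained from B, and that of the lower 16×8 block from A;
(3). For 8×16 blocks, the predicted value of the left 8×16 block is obtained from A, and that of the right 8×16 block from C;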
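Rules (1) to (3) can be sketched in Python as follows. The sketch assumes the median is taken component by component, as is usual for motion vectors; the source only writes median(MV_A, MV_B, MV_C). The `part` argument (0 for the upper or left partition, 1 for the lower or right one) and the function names are illustrative.

```python
def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three motion vectors (assumed definition)."""
    xs = sorted((mv_a[0], mv_b[0], mv_c[0]))
    ys = sorted((mv_a[1], mv_b[1], mv_c[1]))
    return xs[1], ys[1]


def predict_mv(mv_a, mv_b, mv_c, block=(16, 16), part=0):
    """Predicted MV for the current block E from neighbours A, B, and C."""
    if block == (16, 8):
        return mv_b if part == 0 else mv_a  # upper from B, lower from A
    if block == (8, 16):
        return mv_a if part == 0 else mv_c  # left from A, right from C
    return median_mv(mv_a, mv_b, mv_c)
```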
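(c). Upper-layer prediction: compute the upper-layer prediction MV_pred_UP;
Motion vector estimation uses top-down upper-layer motion vector prediction, as shown in FIG. 6b. Once the motion vector of a 16×16 block has been obtained, it is used to predict the two kinds of sub-block (16×8, 8×16); the motion vector of a 16×8 (8×16) block can in turn be used to predict the two 8×8 sub-blocks at the same position, and so on.
Because the current coding block is 16×16, it has no upper-layer prediction MV_pred_UP.
(d). Corresponding-block prediction: using temporal correlation, take the already computed motion vector MV_pred_CP at the same position in the previous frame. This kind of prediction is very effective in regions of smooth motion and in large regions that move as a whole; its principle is shown in FIG. 6c.
(e). Neighbouring reference frame prediction: motion estimation with multiple reference frames is used, supporting up to 15 reference frames; its principle is shown in FIG. 6d.
In FIG. 6d, assume the frame containing the current block is at time t. When the best matching block of the current block is searched for in an earlier reference frame t', the motion vector of the current block in reference frame t'+1 can be used to estimate the motion vector of the current block in frame t', namely:
Figure PCTCN2022086324-appb-000016
In this example, because the number of reference frames is set to 1, there is no neighbouring reference frame prediction MV_pred_NRP.
From the above analysis, in this example the prediction vector set S of this step contains the following vectors:
S = {median prediction: MV_pred_MP; origin prediction: MV_pred; corresponding-block prediction: MV_pred_CP};
The cost function of each prediction vector is computed according to Equation (1), and the point corresponding to the prediction vector with the smallest cost value is selected as the starting search point. beta1 = 0.1 and beta2 = 0.4 are chosen as the thresholds of the Early_Termination operation.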
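step2: asymmetric cross search;
This step uses the method of the embodiments of the present application: the horizontal and vertical search lengths of the asymmetric cross search in the current frame are set dynamically according to the MV of the previous frame at the same macroblock position (see the steps in FIG. 4); in this implementation W = 16.
step3: non-uniform multi-level hexagonal lattice point search
step3-1: 5×5 small rectangular window full search; this step uses the original UMHexagonS method
step3-2: this step uses the improved extended multi-level hexagonal lattice point search method proposed in this patent
step4: extended hexagonal search
step4-1: hexagonal search; this step uses the original UMHexagonS method
step4-2: small diamond search; this step uses the original UMHexagonS method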
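Referring to FIG. 7, an embodiment of the present application provides a motion estimation apparatus 700, the apparatus being applied to an electronic device and including:
a first search module 701, configured to perform a starting search point search in a target macroblock to obtain a starting search point;
a first determining module 702, configured to determine a first search range according to the motion vector of the previous frame at a target position in the target macroblock when it is judged, according to the cost value of the starting search point, that the search is not terminated early;
a second search module 703, configured to perform an asymmetric cross search according to the first search range to obtain a first target search point;
a third search module 704, configured to perform a rectangular-window full search to obtain a second target search point when it is judged, according to the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early;
a second determining module 705, configured to determine a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock when it is judged, according to the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early;
a fourth search module 706, configured to perform an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
a fifth search module 707, configured to perform an extended hexagonal search to obtain a fourth target search point when it is judged, according to the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early;
a third determining module 708, configured to determine the motion vector of the target macroblock according to the fourth target search point.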
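In some implementations, the motion vector of the previous frame at the target position in the target macroblock is MV_pred_CP;
the first determining module is further configured to:
determine the horizontal search length and the vertical search length by the following formulas:
MV_pred_CP = (MV_pred_CP_h, MV_pred_CP_v);
if (MV_pred_CP_h ≥ m × MV_pred_CP_v), m ≥ 1.5;
rangeX = W/2, rangeY = W/4;
else if (MV_pred_CP_v ≥ n × MV_pred_CP_h), n ≥ 1.5;
rangeX = W/4, rangeY = W/2;
else rangeX = W/2, rangeY = W/2;
where W is the preset search range of the target macroblock, MV_pred_CP_h is the horizontal component of MV_pred_CP, MV_pred_CP_v is the vertical component of MV_pred_CP, rangeX is the horizontal search length, and rangeY is the vertical search length.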
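In some implementations, the motion vector of the current frame at the target position in the target macroblock is MV_current_t, that of the previous frame is MV_current_t-1, and that of the frame two frames earlier is MV_current_t-2;
the second determining module is further configured to:
if none of MV_current_t, MV_current_t-1, and MV_current_t-2 is 0,
determine the angle α_1 of MV_current_t-1 relative to MV_current_t-2 by the following formula:
Figure PCTCN2022086324-appb-000017
and determine the angle α_2 of MV_current_t relative to MV_current_t-1 by the following formula:
Figure PCTCN2022086324-appb-000018
if |α_1 − α_2| ≤ a, determine the second search range according to the angle α_2, where the value range of a is [π/10, π/4].
In some implementations, the second determining module is further configured to:
when α_2 is in (0, π/2], determine that the second search range is all search points in the first quadrant of the target macroblock;
when α_2 is in (π/2, π], determine that the second search range is all search points in the second quadrant of the target macroblock;
when α_2 is in (π, 3π/2], determine that the second search range is all search points in the third quadrant of the target macroblock;
when α_2 is in (3π/2, 2π], determine that the second search range is all search points in the fourth quadrant of the target macroblock.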
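It should be noted that the motion estimation method provided by the embodiments of the present application may be executed by a motion estimation apparatus, or by a control module in the motion estimation apparatus for executing the motion estimation method. In the embodiments of the present application, the motion estimation apparatus provided by the embodiments is described by taking the case in which the motion estimation apparatus executes the motion estimation method as an example.
The motion estimation apparatus in the embodiments of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal. The apparatus may be a mobile electronic device or a non-mobile electronic device. For example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and the non-mobile electronic device may be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiments of the present application.
The motion estimation apparatus in the embodiments of the present application may be an apparatus with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The motion estimation apparatus provided by the embodiments of the present application can implement each process implemented by the method embodiment of FIG. 4. To avoid repetition, details are not repeated here.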
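Optionally, as shown in FIG. 8, an embodiment of the present application further provides an electronic device 800, including a memory 801, a processor 802, and a program or instruction stored in the memory 801 and executable on the processor 802. When the program or instruction is executed by the processor 802, each process of the above motion estimation method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
It should be noted that the electronic devices in the embodiments of the present application include mobile electronic devices and non-mobile electronic devices.
FIG. 9 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 900 includes but is not limited to: a radio frequency unit 901, a network module 902, an audio output unit 903, an input unit 904, a sensor 905, a display unit 906, a user input unit 907, an interface unit 908, a memory 909, a processor 910, and other components.
A person skilled in the art will understand that the electronic device 900 may also include a power supply (such as a battery) that supplies power to the components; the power supply may be logically connected to the processor 910 through a power management system, so that charging, discharging, and power consumption are managed through the power management system. The structure of the electronic device shown in FIG. 9 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown, combine some components, or arrange the components differently, which will not be repeated here.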
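The processor 910 is configured to:
perform a starting search point search in a target macroblock to obtain a starting search point;
when it is judged, according to the cost value of the starting search point, that the search is not terminated early, determine a first search range according to the motion vector of the previous frame at a target position in the target macroblock;
perform an asymmetric cross search according to the first search range to obtain a first target search point;
when it is judged, according to the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early, perform a rectangular-window full search to obtain a second target search point;
when it is judged, according to the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early, determine a second search range according to the motion vector of the previous frame and the motion vector of the frame two frames earlier at the target position in the target macroblock;
perform an extended multi-level hexagonal lattice point search according to the second search range to obtain a third target search point;
when it is judged, according to the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early, perform an extended hexagonal search to obtain a fourth target search point;
determine the motion vector of the target macroblock according to the fourth target search point.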
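Optionally, the motion vector of the previous frame at the target position in the target macroblock is MV_pred_CP;
the processor 910 is further configured to:
determine the horizontal search length and the vertical search length by the following formulas:
MV_pred_CP = (MV_pred_CP_h, MV_pred_CP_v);
if (MV_pred_CP_h ≥ m × MV_pred_CP_v), m ≥ 1.5;
rangeX = W/2, rangeY = W/4;
else if (MV_pred_CP_v ≥ n × MV_pred_CP_h), n ≥ 1.5;
rangeX = W/4, rangeY = W/2;
else rangeX = W/2, rangeY = W/2;
where W is the preset search range of the target macroblock, MV_pred_CP_h is the horizontal component of MV_pred_CP, MV_pred_CP_v is the vertical component of MV_pred_CP, rangeX is the horizontal search length, and rangeY is the vertical search length.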
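Optionally, the motion vector of the current frame at the target position in the target macroblock is MV_current_t, that of the previous frame is MV_current_t-1, and that of the frame two frames earlier is MV_current_t-2;
the processor 910 is further configured to:
if none of MV_current_t, MV_current_t-1, and MV_current_t-2 is 0,
determine the angle α_1 of MV_current_t-1 relative to MV_current_t-2 by the following formula:
Figure PCTCN2022086324-appb-000019
and determine the angle α_2 of MV_current_t relative to MV_current_t-1 by the following formula:
Figure PCTCN2022086324-appb-000020
if |α_1 − α_2| ≤ a, determine the second search range according to the angle α_2, where the value range of a is [π/10, π/4].
Optionally, the processor 910 is further configured to:
when α_2 is in (0, π/2], determine that the second search range is all search points in the first quadrant of the target macroblock;
when α_2 is in (π/2, π], determine that the second search range is all search points in the second quadrant of the target macroblock;
when α_2 is in (π, 3π/2], determine that the second search range is all search points in the third quadrant of the target macroblock;
when α_2 is in (3π/2, 2π], determine that the second search range is all search points in the fourth quadrant of the target macroblock.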
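The embodiments of the present application optimize the selection of the search range in the motion estimation process of video coding, so that the best matching block and the best motion vector can be determined more accurately and quickly, the search can be prevented from falling into a local optimum, and computational complexity and encoding time can be reduced, improving the efficiency and effectiveness of video coding.
It should be understood that, in the embodiments of the present application, the input unit 904 may include a graphics processing unit (GPU) 9041 and a microphone 9042; the graphics processing unit 9041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 906 may include a display panel 9061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 907 includes a touch panel 9071 and other input devices 9072. The touch panel 9071 is also called a touch screen and may include two parts, a touch detection device and a touch controller. The other input devices 9072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which will not be repeated here. The memory 909 may be used to store software programs and various data, including but not limited to application programs and an operating system. The processor 910 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 910.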
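An embodiment of this application further provides a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, the processes of the foregoing motion estimation method embodiments are implemented and the same technical effects can be achieved. To avoid repetition, details are not described here again.
The processor is the processor in the electronic device described in the foregoing embodiments. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of this application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the processes of the foregoing motion estimation method embodiments, achieving the same technical effects. To avoid repetition, details are not described here again.
It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, a system-on-chip, or the like.
An embodiment of this application further provides a computer program product. The computer program product is stored in a non-volatile storage medium and is configured to be executed by at least one processor to implement the steps of the method described above.
An embodiment of this application further provides a motion estimation apparatus configured to perform the method described above.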
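It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element. In addition, the scope of the methods and apparatuses in the embodiments of this application is not limited to performing functions in the order shown or discussed; functions may also be performed substantially simultaneously or in the reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. Furthermore, features described with reference to some examples may be combined in other examples.
From the description of the foregoing embodiments, those skilled in the art will clearly understand that the methods of the foregoing embodiments may be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solutions of this application, in essence or in the part that contributes to the prior art, may be embodied as a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the foregoing specific implementations, which are illustrative rather than restrictive. Inspired by this application, those of ordinary skill in the art may devise many other forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.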

Claims (13)

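  1. A motion estimation method, the method comprising:
    performing a starting-search-point search in a target macroblock to obtain a starting search point;
    when it is determined, based on the cost value of the starting search point, that the search is not terminated early, determining a first search range based on the motion vector of the previous frame at a target position in the target macroblock;
    performing an asymmetric cross search based on the first search range to obtain a first target search point;
    when it is determined, based on the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early, performing a rectangular-window full search to obtain a second target search point;
    when it is determined, based on the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early, determining a second search range based on the motion vector of the previous frame at the target position in the target macroblock and the motion vector of the frame two frames earlier at the target position in the target macroblock;
    performing an extended multi-level hexagonal grid search based on the second search range to obtain a third target search point;
    when it is determined, based on the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early, performing an extended hexagon search to obtain a fourth target search point; and
    determining the motion vector of the target macroblock based on the fourth target search point.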
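  2. The method according to claim 1, wherein
    the motion vector of the previous frame at the target position in the target macroblock is MV_pred_CP;
    the determining a first search range based on the motion vector of the previous frame at the target position in the target macroblock comprises:
    determining a horizontal search length and a vertical search length by the following formulas:
    MV_pred_CP = (MV_pred_CP_h, MV_pred_CP_v);
    if (MV_pred_CP_h ≥ m × MV_pred_CP_v), m ≥ 1.5;
    rangeX = W/2, rangeY = W/4;
    else if (MV_pred_CP_v ≥ n × MV_pred_CP_h), n ≥ 1.5;
    rangeX = W/4, rangeY = W/2;
    else rangeX = W/2, rangeY = W/2;
    where W is the preset search range of the target macroblock, MV_pred_CP_h is the horizontal component of MV_pred_CP, MV_pred_CP_v is the vertical component of MV_pred_CP, rangeX is the horizontal search length, and rangeY is the vertical search length.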
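  3. The method according to claim 1, wherein
    the motion vector of the current frame at the target position in the target macroblock is MV_current_t, the motion vector of the previous frame at the target position in the target macroblock is MV_current_t-1, and the motion vector of the frame two frames earlier at the target position in the target macroblock is MV_current_t-2;
    the determining a second search range based on the motion vector of the previous frame at the target position in the target macroblock and the motion vector of the frame two frames earlier at the target position in the target macroblock comprises:
    when MV_current_t, MV_current_t-1, and MV_current_t-2 are all non-zero,
    determining the angle α_1 of MV_current_t-1 relative to MV_current_t-2 by the following formula:
    Figure PCTCN2022086324-appb-100002
    and determining the angle α_2 of MV_current_t relative to MV_current_t-1 by the following formula:
    Figure PCTCN2022086324-appb-100003
    when |α_1 − α_2| ≤ a, determining the second search range based on the angle α_2, where a takes a value in [π/10, π/4].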
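  4. The method according to claim 3, wherein the determining the second search range based on the angle α_2 comprises:
    when α_2 is in (0, π/2], determining that the second search range is all search points in the first quadrant of the target macroblock;
    when α_2 is in (π/2, π], determining that the second search range is all search points in the second quadrant of the target macroblock;
    when α_2 is in (π, 3π/2], determining that the second search range is all search points in the third quadrant of the target macroblock;
    when α_2 is in (3π/2, 2π], determining that the second search range is all search points in the fourth quadrant of the target macroblock.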
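  5. A motion estimation apparatus, applied to an electronic device, the apparatus comprising:
    a first search module, configured to perform a starting-search-point search in a target macroblock to obtain a starting search point;
    a first determining module, configured to, when it is determined, based on the cost value of the starting search point, that the search is not terminated early, determine a first search range based on the motion vector of the previous frame at a target position in the target macroblock;
    a second search module, configured to perform an asymmetric cross search based on the first search range to obtain a first target search point;
    a third search module, configured to, when it is determined, based on the cost value of the starting search point and the cost value of the first target search point, that the search is not terminated early, perform a rectangular-window full search to obtain a second target search point;
    a second determining module, configured to, when it is determined, based on the cost value of the starting search point and the cost value of the second target search point, that the search is not terminated early, determine a second search range based on the motion vector of the previous frame at the target position in the target macroblock and the motion vector of the frame two frames earlier at the target position in the target macroblock;
    a fourth search module, configured to perform an extended multi-level hexagonal grid search based on the second search range to obtain a third target search point;
    a fifth search module, configured to, when it is determined, based on the cost value of the starting search point and the cost value of the third target search point, that the search is not terminated early, perform an extended hexagon search to obtain a fourth target search point; and
    a third determining module, configured to determine the motion vector of the target macroblock based on the fourth target search point.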
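  6. The apparatus according to claim 5, wherein
    the motion vector of the previous frame at the target position in the target macroblock is MV_pred_CP;
    the first determining module is further configured to:
    determine a horizontal search length and a vertical search length by the following formulas:
    MV_pred_CP = (MV_pred_CP_h, MV_pred_CP_v);
    if (MV_pred_CP_h ≥ m × MV_pred_CP_v), m ≥ 1.5;
    rangeX = W/2, rangeY = W/4;
    else if (MV_pred_CP_v ≥ n × MV_pred_CP_h), n ≥ 1.5;
    rangeX = W/4, rangeY = W/2;
    else rangeX = W/2, rangeY = W/2;
    where W is the preset search range of the target macroblock, MV_pred_CP_h is the horizontal component of MV_pred_CP, MV_pred_CP_v is the vertical component of MV_pred_CP, rangeX is the horizontal search length, and rangeY is the vertical search length.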
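  7. The apparatus according to claim 5, wherein
    the motion vector of the current frame at the target position in the target macroblock is MV_current_t, the motion vector of the previous frame at the target position in the target macroblock is MV_current_t-1, and the motion vector of the frame two frames earlier at the target position in the target macroblock is MV_current_t-2;
    the second determining module is further configured to:
    when MV_current_t, MV_current_t-1, and MV_current_t-2 are all non-zero,
    determine the angle α_1 of MV_current_t-1 relative to MV_current_t-2 by the following formula:
    Figure PCTCN2022086324-appb-100004
    and determine the angle α_2 of MV_current_t relative to MV_current_t-1 by the following formula:
    Figure PCTCN2022086324-appb-100005
    when |α_1 − α_2| ≤ a, determine the second search range based on the angle α_2, where a takes a value in [π/10, π/4].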
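  8. The apparatus according to claim 7, wherein the first determining module is further configured to:
    when α_2 is in (0, π/2], determine that the second search range is all search points in the first quadrant of the target macroblock;
    when α_2 is in (π/2, π], determine that the second search range is all search points in the second quadrant of the target macroblock;
    when α_2 is in (π, 3π/2], determine that the second search range is all search points in the third quadrant of the target macroblock;
    when α_2 is in (3π/2, 2π], determine that the second search range is all search points in the fourth quadrant of the target macroblock.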
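  9. An electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the motion estimation method according to any one of claims 1 to 4.
  10. A readable storage medium, wherein the readable storage medium stores a program or instructions, and the program or instructions, when executed by a processor, implement the steps of the motion estimation method according to any one of claims 1 to 4.
  11. A chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the steps of the motion estimation method according to any one of claims 1 to 4.
  12. A computer program product, wherein the computer program product is stored in a non-volatile storage medium, and the computer program product is executed by at least one processor to implement the steps of the motion estimation method according to any one of claims 1 to 4.
  13. A motion estimation apparatus, wherein the apparatus is configured to perform the steps of the motion estimation method according to any one of claims 1 to 4.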
PCT/CN2022/086324 2021-04-16 2022-04-12 Motion estimation method and apparatus, electronic device and readable storage medium WO2022218299A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22787522.6A EP4311236A4 (en) 2021-04-16 2022-04-12 METHOD AND APPARATUS FOR MOTION ESTIMATION, ELECTRONIC DEVICE, AND READABLE STORAGE MEDIUM
US18/380,640 US20240048753A1 (en) 2021-04-16 2023-10-16 Motion estimation method and apparatus, electronic device and readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110412068.0A CN113115038B (zh) 2021-04-16 2021-04-16 Motion estimation method and apparatus, electronic device and readable storage medium
CN202110412068.0 2021-04-16

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/380,640 Continuation US20240048753A1 (en) 2021-04-16 2023-10-16 Motion estimation method and apparatus, electronic device and readable storage medium

Publications (1)

Publication Number Publication Date
WO2022218299A1 (zh) 2022-10-20

Family

ID=76717879

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/086324 WO2022218299A1 (zh) 2021-04-16 2022-04-12 Motion estimation method and apparatus, electronic device and readable storage medium

Country Status (4)

Country Link
US (1) US20240048753A1 (zh)
EP (1) EP4311236A4 (zh)
CN (1) CN113115038B (zh)
WO (1) WO2022218299A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113115038B (zh) * 2021-04-16 2022-03-29 Vivo Mobile Communication Co., Ltd. Motion estimation method and apparatus, electronic device and readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060133496A1 (en) * 2004-12-21 2006-06-22 Ali Corporation Motion estimation method for an adaptive dynamic search range
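CN101815218A (zh) * 2010-04-02 2010-08-25 Beijing University of Technology Fast motion estimation video coding method based on macroblock features
US20120106642A1 (en) * 2010-10-29 2012-05-03 Lsi Corporation Motion Estimation for a Video Transcoder
CN103793925A (zh) * 2014-02-24 2014-05-14 Beijing University of Technology Method for detecting the visual saliency of video images by fusing spatio-temporal features
CN112203095A (zh) * 2020-12-04 2021-01-08 Tencent Technology (Shenzhen) Co., Ltd. Video motion estimation method, apparatus, device and computer-readable storage medium
CN113115038A (zh) * 2021-04-16 2021-07-13 Vivo Mobile Communication Co., Ltd. Motion estimation method and apparatus, electronic device and readable storage medium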

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
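KR100912429B1 (ko) * 2006-11-09 2009-08-14 Samsung Electronics Co., Ltd. Image search method for fast motion estimation
CN103188496B (zh) * 2013-03-26 2016-03-09 Beijing University of Technology Fast motion estimation video coding method based on motion vector distribution prediction
CN103581647B (zh) * 2013-09-29 2017-01-04 Beihang University Fractal coding method for depth map sequences based on color video motion vectors
CN109660799A (zh) * 2017-10-12 2019-04-19 Beijing Kingsoft Cloud Network Technology Co., Ltd. Motion estimation method and apparatus in video coding, electronic device and storage medium
CN112367486B (zh) * 2020-10-30 2023-03-28 Vivo Mobile Communication Co., Ltd. Video processing method and apparatus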

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIFEN XIE ; CHUNQING HUANG ; BIHUI CHEN: "UMHexagonS search algorithm for fast motion estimation", COMPUTER RESEARCH AND DEVELOPMENT (ICCRD), 2011 3RD INTERNATIONAL CONFERENCE ON, IEEE, 11 March 2011 (2011-03-11), pages 483 - 487, XP031864174, ISBN: 978-1-61284-839-6, DOI: 10.1109/ICCRD.2011.5764063 *
LIU, XUEQIAN ET AL.: "Optimization Algorithm for Motion Estimation Based on UMHexagonS", VIDEO ENGINEERING, vol. 38, no. 9, 31 December 2014 (2014-12-31), pages 5 - 8, XP055976626, ISSN: 1002-8692 *
PENG HUANG ; CUI-HUA LI: "Irregularity-cross multi-hexagon-grid search algorithm for fastmotion estimation on H.264", COMPUTER ENGINEERING AND TECHNOLOGY (ICCET), 2010 2ND INTERNATIONAL CONFERENCE, 16 April 2010 (2010-04-16), Piscataway, NJ, USA , pages V3 - V3-592, XP031690014, ISBN: 978-1-4244-6347-3 *
See also references of EP4311236A4 *

Also Published As

Publication number Publication date
CN113115038A (zh) 2021-07-13
US20240048753A1 (en) 2024-02-08
EP4311236A1 (en) 2024-01-24
EP4311236A4 (en) 2024-05-29
CN113115038B (zh) 2022-03-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22787522

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022787522

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022787522

Country of ref document: EP

Effective date: 20231017

NENP Non-entry into the national phase

Ref country code: DE