WO2018023352A1

WO2018023352A1 - Fast motion estimation method based on online learning

Info

Publication number: WO2018023352A1
Application number: PCT/CN2016/092751
Authority: WO
Inventors: 潘兆庆; 孙星明
Original assignee: 南京信息工程大学
Priority date: 2016-08-01
Filing date: 2016-08-01
Publication date: 2018-02-08

Abstract

A fast motion estimation method based on online learning, comprising the steps of: (1) encoding the current coding unit using a root prediction unit, the encoding comprising integer-pixel motion estimation and sub-pixel motion estimation, and, if the optimal motion vector of the root prediction unit equals the optimal motion vector obtained by the integer-pixel motion estimation of the root prediction unit, proceeding to step (2), otherwise proceeding to step (3); (2) encoding successively using the sub-prediction units, the encoding comprising integer-pixel motion estimation, and proceeding to step (4); (3) encoding using the sub-prediction units, the encoding comprising integer-pixel motion estimation and sub-pixel motion estimation; (4) encoding using an intra prediction unit; and (5) storing the encoding information, writing the bitstream, and returning to step (1) to encode the next coding unit. The motion estimation method effectively saves encoding time and improves encoding performance.

Description

A Fast Motion Estimation Method Based on Online Learning

Technical field

The invention belongs to the technical field of video coding, and in particular relates to a fast motion estimation method based on online learning.

Background art

High Efficiency Video Coding (HEVC) is the latest video coding standard, and it effectively addresses the storage and transmission of high-definition (HD) and ultra-high-definition video. However, the high coding efficiency of the HEVC encoder rests on a series of advanced coding techniques with high computational complexity, such as the quadtree-based coding unit (CU) and motion estimation based on variable-size prediction units (PUs), which includes integer-pixel motion estimation (IPME) and sub-pixel, that is fractional-pixel, motion estimation (FPME). This huge coding complexity greatly limits the wide application of HEVC encoders in multimedia products.

In the motion estimation (ME) process of the HEVC encoder, integer-pixel motion estimation is performed first, and the optimal motion vector (MV) it finds is then set as the initial search position of sub-pixel motion estimation, which refines the integer-pixel search result. For sub-pixel motion estimation in the HEVC reference software HM12.0, a half-pixel-precision search with 8 search points is first performed around the optimal motion vector obtained by integer-pixel motion estimation. A quarter-pixel-precision search with 8 search points is then performed around the optimal motion vector of the half-pixel search. Sub-pixel motion estimation therefore tests 16 search points in total, and the best search point $mv_{best}$ is determined by minimising the Lagrangian rate-distortion (RD) cost function, as follows:

$$mv_{best} = \arg\min_{\varphi \in \Phi} \left\{ SATD\left(PU_o, PU_c(\varphi)\right) + \lambda_{MOTION} \cdot R_{MOTION}(\varphi) \right\}$$

where SATD is the sum of absolute transformed differences of the residual between the original prediction unit $PU_o$ and its predicted unit $PU_c$, $\varphi$ is the MV of a candidate point in the set $\Phi$ of all candidate sub-pixel search points, $\lambda_{MOTION}$ is the Lagrangian multiplier, and $R_{MOTION}$ is the number of bits required to encode the motion parameters. This "test every candidate, select the best" sub-pixel search significantly improves the coding efficiency of the HEVC encoder, but at the expense of high coding complexity.
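The two-stage sub-pixel search described above can be sketched as follows. This is only an illustrative sketch, not the HM12.0 implementation: it assumes motion vectors are stored in quarter-pixel units, and the `cost` callback, `refine` and `sub_pixel_search` are hypothetical names standing in for the Lagrangian cost evaluation of one candidate.

```python
from typing import Callable, Tuple

MV = Tuple[int, int]  # motion vector in quarter-pixel units (assumption of this sketch)

# Offsets of the 8 neighbours of a point on a square grid.
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1),
              (-1, 0),           (1, 0),
              (-1, 1),  (0, 1),  (1, 1)]


def rd_cost(distortion: float, bits: float, lam: float) -> float:
    """Lagrangian cost J = distortion + lambda * bits."""
    return distortion + lam * bits


def refine(center: MV, step: int, cost: Callable[[MV], float]) -> MV:
    """Test the 8 neighbours of `center` at distance `step` and keep the cheapest point."""
    best, best_cost = center, cost(center)
    for dx, dy in NEIGHBOURS:
        cand = (center[0] + dx * step, center[1] + dy * step)
        cand_cost = cost(cand)
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
    return best


def sub_pixel_search(int_mv: MV, cost: Callable[[MV], float]) -> MV:
    """Half-pixel refinement (8 points) followed by quarter-pixel refinement (8 points)."""
    half_best = refine(int_mv, 2, cost)  # a step of 2 quarter-pixels is 1/2 pixel
    return refine(half_best, 1, cost)    # a step of 1 quarter-pixel is 1/4 pixel
```

Every call evaluates all 16 neighbouring candidates regardless of how good the integer-pixel starting point already is, which is the cost that the method of this document tries to avoid.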

To reduce the coding complexity of motion estimation in the HEVC encoder, the academic community has focused on optimizing the integer-pixel and sub-pixel motion estimation processes. Document 1 proposes a fast integer-pixel motion estimation algorithm that obtains the optimal motion vector while reducing the number of pixels examined in the current search window. In Document 2, Li et al. use the predicted motion vector to measure the motion intensity of the current coding unit and propose a content-adaptive fast integer-pixel motion estimation algorithm for HEVC. Document 3 proposes a fast integer-pixel motion estimation algorithm based on motion vector inheritance, in which, if the coded block flag (CBF) of the 2N×2N prediction unit is zero, the sub-divided prediction units inherit the best motion vector of the 2N×2N prediction unit. Document 4 proposes a fast integer-pixel motion estimation algorithm for HEVC based on confidence intervals, in which integer-pixel motion estimation is formulated for the first time as a statistical inference problem and a confidence-interval-based motion estimation is then derived. Document 5 proposes a fast integer-pixel motion estimation algorithm with a depth-based adaptive search range. These algorithms all focus on the coding complexity of integer-pixel motion estimation. However, TZSearch, the fast integer-pixel search algorithm in HEVC, already uses several early termination strategies, so the room left for optimizing the complexity of integer-pixel motion estimation is limited.

To further reduce the coding complexity of motion estimation, the sub-pixel motion estimation process can be optimized. Document 6 proposes a fast two-step sub-pixel motion estimation algorithm, in which five adjacent integer-pixel points are first modeled as an error surface and a second-order approximation is then used to predict the position of the optimal sub-pixel point. Document 7 proposes an optimally scalable and cost-effective sub-pixel motion estimation algorithm. In Document 8, Sotetsumoto et al. propose a low-complexity sub-pixel motion estimation algorithm that first uses an early termination strategy to stop the FPME process and then applies different search patterns to the half-pixel and quarter-pixel stages of FPME. In Document 9, Blasi et al. propose an adaptive-precision motion estimation algorithm that, based on the residual samples and the number of edges, adaptively determines the motion vector accuracy and skips half-pixel and quarter-pixel motion estimation accordingly. Based on the characteristics of the error surface and the block distortion measured at integer-pixel positions, Document 10 proposes an interpolation-free algorithm to reduce the computational complexity of HEVC sub-pixel motion estimation. These algorithms effectively reduce the coding complexity of sub-pixel motion estimation, but they do not consider the relationship between the optimal motion vectors of integer-pixel and sub-pixel motion estimation. By establishing this relationship, the coding complexity of sub-pixel motion estimation can be reduced further.

List of documents:

Document 1, L. Gao, S. Dong, W. Wang, and W. Gao, "A novel integer-pixel motion estimation algorithm based quadratic prediction," in Proc. Int. Conf. Image Process. (ICIP), Quebec, Canada, Sept. 2015, pp. 2810-2814.

Document 2, X. Li, R. Wang, X. Cui, and W. Wang, "Context-adaptive fast motion estimation of HEVC," in Proc. Int. Symp. Circuits Syst. (ISCAS), Lisbon, Portugal, May 2015, pp. 2784-2787.

Document 3, S. Yang, H. J. Shim, and B. Jeon, "Motion vector inheritance method for fast HEVC encoding," in Proc. Int. Symp. Broadband Multimedia Syst. Broadcast. (BMSB), Beijing, China, Jun. 2014, pp. 1-4.

Document 4, N. Hu and E. H. Yang, "Fast motion estimation based on confidence interval," IEEE Trans. Circuits Syst. Video Technol., vol. 24, no. 8, pp. 1310-1322, Aug. 2014.

Document 5, T.-K. Lee, Y.-L. Chan, and W.-C. Siu, "Depth-based adaptive search range algorithm for motion estimation in HEVC," in Proc. 19th Int. Conf. Digital Signal Process., Hong Kong, Aug. 2014, pp. 919-923.

Document 6, W. Dai, O. C. Au, C. Pang, L. Sun, R. Zou, and S. Li, "A novel fast two step sub-pixel motion estimation algorithm in HEVC," in Proc. Int. Conf. Acoustics, Speech and Signal Process. (ICASSP), Kyoto, Japan, Mar. 2012, pp. 1197-1200.

Document 7, H. Li, Y. Zhang, and H. Chao, "An optimally scalable and cost-effective fraction-pixel motion estimation algorithm for HEVC," in Proc. Int. Conf. Acoustics, Speech and Signal Process. (ICASSP), Vancouver, BC, Canada, May 2013, pp. 1399-1403.

Document 8, T. Sotetsumoto, T. Song, and T. Shimamoto, "Low complexity algorithm for sub-pixel motion estimation of HEVC," in Proc. Int. Conf. Signal Process. Commun. and Comput. (ICSPCC), Kunming, China, Aug. 2013, pp. 1-4.

Document 9, S. G. Blasi, I. Zupancic, E. Izquierdo, and E. Peixoto, "Adaptive precision motion estimation for HEVC coding," in Proc. Picture Coding Symp. (PCS), Cairns, Australia, May 2015, pp. 144-148.

Document 10, X. Zuo and L. Yu, "A novel interpolation-free scheme for fractional pixel motion estimation," in Proc. Picture Coding Symp. (PCS), Cairns, Australia, May 2015, pp. 80-84.

Document 11, F. Bossen, "Common test conditions and software reference configurations," ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Document JCTVC-J1100, Mar. 2012.

Document 12, Z. Pan, S. Kwong, and M.-T. Sun, "Early MERGE mode decision based on motion estimation and hierarchical depth correlation for HEVC," IEEE Trans. Broadcast., vol. 60, no. 2, pp. 405-412, Jun. 2014.

Document 13, G. Bjontegaard, "Calculation of average PSNR differences between RD curves," ITU-T VCEG, Document VCEG-M33, Apr. 2001.

Summary of the invention

The object of the present invention is to provide a fast motion estimation method based on online learning, which can effectively save coding time and improve coding performance.

In order to achieve the above object, the solution of the present invention is:

A fast motion estimation method based on online learning, comprising the following steps:

(1) encoding the current coding unit using the root prediction unit Inter_2N×2N, the specific process including integer-pixel motion estimation and sub-pixel motion estimation, and recording the optimal motion vector of the root prediction unit as $MV_{best}^{root}$; if $MV_{best}^{root} = MV_{ipme}^{root}$, going to step (2), otherwise going to step (3), where $MV_{ipme}^{root}$ represents the best motion vector obtained when the root prediction unit performs the integer-pixel motion estimation operation;

(2) sequentially encoding the current coding unit using the sub-prediction units Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N, the specific process including integer-pixel motion estimation, and going to step (4);

(3) sequentially encoding the current coding unit using the sub-prediction units Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N, the specific process including integer-pixel motion estimation and sub-pixel motion estimation, and going to step (4);

(4) encoding the current coding unit using an intra prediction unit, and going to step (5);

(5) storing the encoding information, writing the bitstream, and returning to step (1) to encode the next coding unit.
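The inter-prediction part of steps (1) to (3) can be sketched as below. This is a minimal sketch under stated assumptions rather than the HM12.0 code: `ipme` and `fpme` are hypothetical callbacks that each return a motion vector and its rate-distortion cost, the tie-break in favour of the integer-pixel result is an assumption, and steps (4) and (5) (intra prediction and bitstream writing) are left out.

```python
from typing import Callable, Dict, Tuple

MV = Tuple[int, int]
Result = Tuple[MV, float]  # (motion vector, rate-distortion cost)

ROOT_MODE = "Inter_2Nx2N"
SUB_MODES = ["Inter_2NxN", "Inter_Nx2N", "Inter_NxN", "Inter_2NxnU",
             "Inter_2NxnD", "Inter_nLx2N", "Inter_nRx2N"]


def motion_estimate(mode: str,
                    ipme: Callable[[str], Result],
                    fpme: Callable[[str, MV], Result],
                    run_fpme: bool = True) -> Tuple[MV, MV]:
    """Return (best MV, integer-pixel MV) for one prediction unit mode."""
    mv_i, j_i = ipme(mode)
    if not run_fpme:
        return mv_i, mv_i                 # sub-pixel search skipped
    mv_f, j_f = fpme(mode, mv_i)          # sub-pixel search starts at the integer-pixel MV
    best = mv_i if j_i <= j_f else mv_f   # tie goes to the integer-pixel MV (assumption)
    return best, mv_i


def encode_cu_inter(ipme: Callable[[str], Result],
                    fpme: Callable[[str, MV], Result]) -> Tuple[Dict[str, MV], bool]:
    """Steps (1)-(3): encode the root mode, then decide whether sub-modes skip FPME."""
    results: Dict[str, MV] = {}
    # Step (1): the root prediction unit always runs both IPME and FPME.
    root_best, root_ipme = motion_estimate(ROOT_MODE, ipme, fpme)
    results[ROOT_MODE] = root_best
    # "Online learning": the root decision controls the sub-prediction units.
    skip_fpme = (root_best == root_ipme)
    # Step (2) if skip_fpme is true, step (3) otherwise.
    for mode in SUB_MODES:
        best, _ = motion_estimate(mode, ipme, fpme, run_fpme=not skip_fpme)
        results[mode] = best
    return results, skip_fpme
```

When the root prediction unit keeps its integer-pixel motion vector, the sketch calls `fpme` once per coding unit instead of eight times, which is where the time saving comes from.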

In step (1) above, when the HEVC encoder operates, each frame is divided into a series of coding tree units based on a quadtree structure; the coding tree unit is the basic processing unit of HEVC. During encoding, each coding tree unit is further divided into coding units based on the quadtree structure. According to the prediction type, each coding unit is further divided into one, two or four prediction units; the prediction unit is the basic processing unit of intra prediction and inter prediction.

In step (1) above, the inter-prediction encoding of a coding unit supports a total of eight prediction unit modes: Inter_2N×2N, Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N.

In step (1) above, the motion estimation process is as follows. First, integer-pixel motion estimation is performed on the current coding unit. Then the optimal motion vector obtained by integer-pixel motion estimation is set as the starting position of sub-pixel motion estimation, and sub-pixel motion estimation is performed on the current coding unit. Finally, the best motion vector $MV_{best}$ of motion estimation for the current prediction mode is determined by comparing the rate-distortion costs of the best motion vector of integer-pixel motion estimation, $MV_{ipme}$, and the best motion vector of sub-pixel motion estimation, $MV_{fpme}$, that is,

$$MV_{best} = \begin{cases} MV_{ipme}, & J_{ipme} \le J_{fpme} \\ MV_{fpme}, & \text{otherwise} \end{cases}$$

where $J_{ipme}$ and $J_{fpme}$ are the rate-distortion costs of $MV_{ipme}$ and $MV_{fpme}$, respectively.

In step (3) above, the optimal motion vector of the sub-prediction unit, $MV_{best}^{sub}$, is determined by learning the motion vector information of its root prediction unit, i.e.

$$MV_{best}^{sub} = \begin{cases} MV_{ipme}^{sub}, & MV_{best}^{root} = MV_{ipme}^{root} \\ MV_{fpme}^{sub}, & \text{otherwise} \end{cases}$$

where $MV_{ipme}^{sub}$ represents the best motion vector of the integer-pixel motion estimation of the sub-prediction unit, $MV_{best}^{root}$ represents the best motion vector of the root prediction unit, $MV_{ipme}^{root}$ represents the best motion vector of the integer-pixel motion estimation of the root prediction unit, and $MV_{fpme}^{sub}$ represents the best motion vector of the sub-pixel motion estimation of the sub-prediction unit.

With the above scheme, the present invention first divides all prediction units into a root prediction unit (Inter_2N×2N) and sub-prediction units (the other prediction unit modes), according to the characteristics of motion estimation based on variable-size prediction units. Then, by learning the sub-pixel motion estimation result of the root prediction unit, the sub-pixel motion estimation of the sub-prediction units is adaptively skipped, thereby reducing the coding complexity of sub-pixel motion estimation.

Brief description of the drawings

Figure 1 is an architectural diagram of the inheritance relationship between prediction units;

Figure 2 is an experimental result of statistical analysis of conditional probability.

Detailed description

The technical solution of the present invention will be described in detail below with reference to the accompanying drawings.

To analyze the coding complexity of sub-pixel motion estimation, two ratios are measured: $C_m$, the ratio of the total coding time of sub-pixel motion estimation to the total coding time of motion estimation, and $C_h$, the ratio of the total coding time of sub-pixel motion estimation to the total coding time of the HEVC encoder. Five HEVC standard test video sequences with different resolutions and content characteristics, namely "BQSquare", "PartyScene", "KristenAndSara", "BasketballDrive" and "PeopleOnStreet", are encoded with the HEVC reference software HM12.0. The characteristics of the sequences are defined in detail in Document 11. The test conditions are as follows: the maximum coding unit size and the maximum quadtree depth are 64×64 and 4, respectively; the integer-pixel motion estimation method and its search range are TZSearch and ±64, respectively; the quantization parameter (QP) is 27; the Low-Delay-Main (LDM) and Random-Access-Main (RAM) coding configurations are used; and the other coding parameters use the default settings in HM and Document 11. $C_m$ and $C_h$ are calculated by equation (1):

$$C_m = \frac{T_f}{T_m} \times 100\%, \qquad C_h = \frac{T_f}{T_h} \times 100\% \tag{1}$$

where $T_f$, $T_m$ and $T_h$ represent the total coding time of sub-pixel motion estimation, of motion estimation, and of the HEVC encoder, respectively. The statistical results are shown in Table 1.
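As a small worked illustration of the two ratios, the sketch below computes them from three measured times. The function name `fpme_share` and the timings in the usage line are hypothetical and are not measurements from this document; the sketch only assumes that each ratio is one time divided by another, expressed as a percentage.

```python
from typing import Tuple


def fpme_share(t_f: float, t_m: float, t_h: float) -> Tuple[float, float]:
    """Return (C_m, C_h) in percent.

    t_f: total time spent in sub-pixel motion estimation
    t_m: total time spent in motion estimation (includes t_f)
    t_h: total encoding time (includes t_m)
    """
    if not (0 <= t_f <= t_m <= t_h) or t_m == 0:
        raise ValueError("expected 0 <= t_f <= t_m <= t_h with t_m > 0")
    c_m = 100.0 * t_f / t_m  # share of FPME within motion estimation
    c_h = 100.0 * t_f / t_h  # share of FPME within the whole encoder
    return c_m, c_h


# Hypothetical timings in seconds, for illustration only.
c_m, c_h = fpme_share(t_f=41.0, t_m=60.0, t_h=100.0)
```

Because sub-pixel motion estimation is part of motion estimation, which is part of the encoder, $C_h$ can never exceed $C_m$ for the same run.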

As can be seen from Table 1, the total coding time of sub-pixel motion estimation accounts for a large proportion of the total coding time of motion estimation. For the LDM coding configuration, the value of $C_m$ ranges from 47.82% to 82.92%, with an average of 68.65%. For the RAM coding configuration, the value of $C_m$ ranges from 51.89% to 84.80%, with an average of 66.14%.

Table 1 Coding complexity analysis of FPME (%), QP=27

In addition, we observed that the value of $C_m$ for "BasketballDrive" is significantly lower, because the sequence is complex and its objects move quickly, so IPME needs more coding time to locate the best search point. It should also be noted that the total coding time of FPME accounts for a large proportion of the total coding time of the HEVC encoder. The value of $C_h$ ranges from 31.57% to 52.90%, with an average of 41.04%, for the LDM coding configuration, and from 31.72% to 41.06%, with an average of 35.57%, for the RAM coding configuration. From these values, we can conclude that FPME greatly increases the complexity of motion estimation and of the HEVC encoder as a whole, so optimizing the FPME process will save a significant amount of coding time.

FPME aims to refine the IPME search result and maximize coding efficiency. To analyze the distribution of the best search points in motion estimation, let event M denote that the best search point of integer-pixel motion estimation is selected as the best search point of motion estimation, and let event N denote that the best search point of sub-pixel motion estimation is selected as the best search point of motion estimation. The probabilities P(M) and P(N) were measured, and the statistical results are shown in Table 2.

Table 2 Statistical results of the best MV distribution (%), QP=27

As can be seen from Table 2, the best search point of integer-pixel motion estimation has a high probability of being selected as the best search point of the entire motion estimation. For the LDM coding configuration, P(M) ranges from 93.85% to 99.81%, with an average of 96.23%, while P(N) is only 0.19% to 6.15%, with an average of 3.77%. For the RAM coding configuration, P(M) ranges from 92.66% to 98.42%, with an average of 95.28%, while P(N) is only 1.58% to 7.34%, with an average of 4.73%. In addition, P(M) is largest for "KristenAndSara" under the LDM coding configuration and for "PeopleOnStreet" under the RAM coding configuration; the content of "KristenAndSara", for example, is simple and most of its area is background. From these values, we can see that more than 95% of the prediction units select the best search point of integer-pixel motion estimation as the best search point of their motion estimation. Therefore, if the prediction units that select the integer-pixel motion estimation result as the optimal solution of the whole motion estimation process can be identified in advance, the coding time of motion estimation will be significantly reduced.

When the HEVC encoder operates, each frame is divided into a series of coding tree units based on a quadtree structure; the coding tree unit is the basic processing unit of HEVC. During encoding, each coding tree unit is further divided into coding units based on the quadtree structure. According to the prediction type, each coding unit is further divided into one, two or four prediction units, and the prediction unit is the basic processing unit of intra prediction and inter prediction. The inter-prediction encoding of a coding unit supports a total of eight prediction unit modes: Inter_2N×2N, Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N (see Document 12). During the inter-prediction encoding of a coding unit, all inter prediction unit modes are encoded in turn so as to remove as much temporal redundancy as possible. In the inter-prediction motion estimation process, integer-pixel motion estimation is first performed for the current prediction unit mode. Then the optimal motion vector obtained by integer-pixel motion estimation is set as the starting position of sub-pixel motion estimation, and sub-pixel motion estimation is performed for the current prediction unit mode. Finally, the best motion vector $MV_{best}$ of the current prediction unit mode is determined by comparing the rate-distortion costs of the best motion vector of integer-pixel motion estimation, $MV_{ipme}$, and the best motion vector of sub-pixel motion estimation, $MV_{fpme}$, that is,

$$MV_{best} = \begin{cases} MV_{ipme}, & J_{ipme} \le J_{fpme} \\ MV_{fpme}, & \text{otherwise} \end{cases}$$

where $J_{ipme}$ and $J_{fpme}$ are the rate-distortion costs of $MV_{ipme}$ and $MV_{fpme}$, respectively.

Consider the content characteristics and the partition types of the prediction units. If the content of Inter_2N×2N belongs to a simple region, such as background, the other inter prediction unit modes also have a high probability of belonging to a simple region. Moreover, in video coding, a prediction unit with simple content is usually encoded with the motion vector found by integer-pixel motion estimation. As shown in Table 2, more than 95% of the prediction units select the best search point of integer-pixel motion estimation as the best search point of the entire motion estimation. As shown in Figure 1, based on the prediction unit partition types, Inter_2N×2N is the root of the other inter prediction unit modes, and the remaining prediction unit modes are sub-prediction unit modes. Therefore, if the root prediction unit mode selects the optimal motion vector of integer-pixel motion estimation as the final optimal motion vector of its motion estimation, the content of the root prediction unit is simple or its motion is slow, so its sub-prediction units also have a high probability of selecting the optimal motion vector of integer-pixel motion estimation as their final best motion vector and can skip sub-pixel motion estimation. To verify this correlation in optimal motion vector selection between the root prediction unit mode and its sub-prediction unit modes, let event S denote that the root prediction unit mode selects the optimal motion vector of integer-pixel motion estimation as the best motion vector of the entire motion estimation process, and let event T denote that a sub-prediction unit mode also selects the optimal motion vector of integer-pixel motion estimation as the best motion vector of the entire motion estimation process; the conditional probability P(T|S) is then calculated. The experimental results are shown in Figure 2.
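The conditional probability P(T|S) can be estimated from logged encoder decisions as the number of sub-prediction units for which both events occur divided by the number for which S occurs. The sketch below illustrates this under the assumption that one `(s, t)` record is logged per encoded sub-prediction unit; the record format and the function name are hypothetical.

```python
from typing import Iterable, Tuple


def conditional_probability(records: Iterable[Tuple[bool, bool]]) -> float:
    """Estimate P(T|S) from (s, t) records, one per sub-prediction unit.

    s: the root prediction unit kept its integer-pixel MV (event S)
    t: the sub-prediction unit kept its integer-pixel MV (event T)
    """
    s_count = 0
    st_count = 0
    for s, t in records:
        if s:
            s_count += 1
            if t:
                st_count += 1
    if s_count == 0:
        raise ValueError("event S never occurred, so P(T|S) is undefined")
    return st_count / s_count
```

Records in which S does not occur are ignored, since they say nothing about how well the root decision predicts the sub-prediction unit decision.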

From Figure 2, we can see that when the root prediction unit mode selects the best motion vector of integer-pixel motion estimation as the best motion vector of its motion estimation, its sub-prediction unit modes are also very likely to select the best motion vector of integer-pixel motion estimation as the best motion vector of their motion estimation. For the LDM coding configuration, the conditional probability P(T|S) ranges from 96.68% to 99.70%, and for the RAM coding configuration it ranges from 96.70% to 99.07%. For the sequences "KristenAndSara" and "PeopleOnStreet", the conditional probability reaches 99%, because the two sequences contain many simple areas; for example, most of the area in "KristenAndSara" is background. Based on these results, it can be concluded that there is a strong correlation in motion vector selection between the root prediction unit mode and its sub-prediction unit modes: when the root prediction unit mode selects the optimal motion vector of integer-pixel motion estimation as the optimal motion vector of its motion estimation, its sub-prediction unit modes also select the optimal motion vector of integer-pixel motion estimation as the best motion vector of their motion estimation. Therefore, the motion vector information of the root prediction unit can be used to encode its sub-prediction unit modes quickly.

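In view of this relationship in optimal motion vector selection between the root prediction unit and its sub-prediction units, the optimal motion vector of a sub-prediction unit mode, $MV_{best}^{sub}$, can be determined early by learning the motion vector information of its root prediction unit, i.e.

$$MV_{best}^{sub} = \begin{cases} MV_{ipme}^{sub}, & MV_{best}^{root} = MV_{ipme}^{root} \\ MV_{fpme}^{sub}, & \text{otherwise} \end{cases}$$

where $MV_{ipme}^{sub}$ represents the best motion vector of the integer-pixel motion estimation of the sub-prediction unit, $MV_{best}^{root}$ represents the best motion vector of the root prediction unit, $MV_{ipme}^{root}$ represents the best motion vector of the integer-pixel motion estimation of the root prediction unit, and $MV_{fpme}^{sub}$ represents the best motion vector of the sub-pixel motion estimation of the sub-prediction unit.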

Based on the above analysis, the present invention provides a fast motion estimation method based on online learning, which includes the following steps:

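(1) The current coding unit is encoded using the root prediction unit Inter_2N×2N, the specific process including integer-pixel motion estimation and sub-pixel motion estimation, and the best motion vector of the root prediction unit is recorded as $MV_{best}^{root}$. If the best motion vector of the root prediction unit is equal to the best motion vector obtained when the root prediction unit performs the integer-pixel motion estimation operation, that is, $MV_{best}^{root} = MV_{ipme}^{root}$, go to step (2); otherwise go to step (3);

(2) The current coding unit is encoded sequentially using the sub-prediction units Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N, the specific process including only the integer-pixel motion estimation operation; go to step (4);

(3) The current coding unit is encoded sequentially using the sub-prediction units Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N, the specific process including integer-pixel motion estimation and sub-pixel motion estimation; go to step (4);

(4) The current coding unit is encoded using an intra prediction unit; go to step (5);

(5) The encoding information is stored, the bitstream is written, and the process returns to step (1) to encode the next coding unit.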

The experimental results of the present invention will be given below.

(1) The theoretically saved coding time of the algorithm proposed by the present invention

It follows from the algorithm that the coding time saved by the proposed method depends on the number of root prediction units that select the integer-pixel motion estimation result as the optimal motion vector of motion estimation. If most root prediction units do so, the coding time will be greatly reduced. We tested five video sequences of different resolutions to measure the percentage of root prediction units that select the best motion vector of integer-pixel motion estimation during real-time encoding. The test results are shown in Table 3.

Table 3 Percentage of root prediction units that select the IPME MV as their final optimal MV, QP=27

Coding configuration	BQSquare	PartyScene	KristenAndSara	BasketballDrive	PeopleOnStreet	Average
LDM	92.10%	91.25%	99.51%	93.53%	N/A	94.10%
RAM	91.17%	89.80%	N/A	96.06%	99.07%	94.03%

As can be seen from Table 3, with the LDM coding configuration, the probability that the root prediction unit selects the integer-pixel motion estimation result as the optimal motion vector of motion estimation during real-time video encoding ranges from 91.25% to 99.51%, with an average of 94.10%. With the RAM coding configuration, this probability ranges from 89.80% to 99.07%, with an average of 94.03%. In addition, the percentages for the sequences "KristenAndSara" and "PeopleOnStreet" exceed 99%, because these two sequences contain fairly large areas of simple content, namely background, and, owing to the huge spatial correlation between successive video frames, a prediction unit with such simple content has a high probability of selecting the integer-pixel motion estimation result as the best motion vector of motion estimation. Based on the above data, we can conclude that the method proposed by the present invention can effectively terminate the motion estimation process early and significantly save motion estimation time.
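The averages reported for Table 3 can be checked with a few lines of arithmetic. The sketch below uses the per-sequence percentages from the table and assumes that the average is taken only over the sequences actually tested for each configuration (entries marked as not applicable are skipped); the names `TABLE_3` and `average` are only for illustration.

```python
from typing import Dict, Optional

# Percentages from Table 3; None marks an entry listed as not applicable.
TABLE_3: Dict[str, Dict[str, Optional[float]]] = {
    "LDM": {"BQSquare": 92.10, "PartyScene": 91.25, "KristenAndSara": 99.51,
            "BasketballDrive": 93.53, "PeopleOnStreet": None},
    "RAM": {"BQSquare": 91.17, "PartyScene": 89.80, "KristenAndSara": None,
            "BasketballDrive": 96.06, "PeopleOnStreet": 99.07},
}


def average(row: Dict[str, Optional[float]]) -> float:
    """Mean of the available entries of one table row."""
    values = [v for v in row.values() if v is not None]
    return sum(values) / len(values)


ldm_avg = average(TABLE_3["LDM"])  # compare with the reported 94.10%
ram_avg = average(TABLE_3["RAM"])  # compare with the reported 94.03%
```

Under that assumption both computed means agree with the reported averages to within rounding.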

(2) Comparison of real-time coding performance

To test the coding performance of the method proposed by the present invention, we use the HEVC reference software HM12.0 as the software platform. Fifteen HEVC standard test video sequences were tested under the HEVC common test conditions, using the LDM and RAM coding configurations. The integer-pixel motion estimation method and its search range were TZSearch and ±64, respectively; the maximum coding unit size and the maximum quadtree depth were 64×64 and 4, respectively; and four QPs were tested: 22, 27, 32 and 37. The hardware platform was an Intel Xeon CPU E3-1241 v3 @ 3.50 GHz with 4.00 GB RAM, running the Microsoft Windows 7 64-bit operating system.

We compared the proposed algorithm with the recently published algorithm of Document 9 (denoted PCS [9]) in terms of peak signal-to-noise ratio (PSNR), bit rate (BR), total encoding time saving (TETS), and motion estimation time saving (METS). The baseline is the HEVC reference software HM12.0, and the comparison results are summarized in Table 4, where Bjontegaard Delta PSNR (BDPSNR) is the average PSNR difference in dB at the same BR, and Bjontegaard Delta BR (BDBR) is the average BR difference in percent at the same PSNR; both are calculated according to Document 13. TETS and METS are calculated by:

$$TETS = \frac{1}{4} \sum_{i=1}^{4} \frac{T_B(QP_i) - T_\Phi(QP_i)}{T_B(QP_i)} \times 100\%$$

$$METS = \frac{1}{4} \sum_{i=1}^{4} \frac{M_B(QP_i) - M_\Phi(QP_i)}{M_B(QP_i)} \times 100\%$$

where $T_\Phi(QP_i)$ is the total coding time obtained by running the fast motion estimation algorithm $\Phi$ on HM12.0 with QP equal to $QP_i$, $\Phi \in$ {PCS [9], the proposed method}; $T_B(QP_i)$ is the total coding time obtained by the baseline motion estimation algorithm in HM12.0 with QP equal to $QP_i$, where the baseline comprises the original TZSearch and the original sub-pixel motion estimation algorithm and is denoted HM12.0-ME; $M_\Phi(QP_i)$ is the total motion estimation coding time obtained by running the fast motion estimation algorithm $\Phi$ on HM12.0 with QP equal to $QP_i$; $M_B(QP_i)$ is the total motion estimation coding time of HM12.0-ME with QP equal to $QP_i$; and $QP_i \in \{22, 27, 32, 37\}$.
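A time-saving measure of this kind can be computed as sketched below. The sketch assumes that the saving is the relative reduction with respect to the HM12.0-ME baseline, averaged over the four tested QPs; the function name `average_time_saving` and the timings in the usage lines are hypothetical, not results from Table 4. The same function serves for TETS (total encoding times) and METS (motion estimation times).

```python
from typing import Dict

QPS = (22, 27, 32, 37)


def average_time_saving(baseline: Dict[int, float], fast: Dict[int, float]) -> float:
    """Mean over the tested QPs of (baseline - fast) / baseline, in percent.

    baseline: time of the reference encoder per QP
    fast:     time of the fast algorithm per QP
    """
    total = 0.0
    for qp in QPS:
        total += (baseline[qp] - fast[qp]) / baseline[qp] * 100.0
    return total / len(QPS)


# Hypothetical timings in seconds, for illustration only.
hm_total = {22: 1000.0, 27: 800.0, 32: 600.0, 37: 500.0}
fast_total = {22: 750.0, 27: 600.0, 32: 450.0, 37: 375.0}
tets = average_time_saving(hm_total, fast_total)
```

Averaging the per-QP ratios, rather than dividing summed times, keeps the long-running low-QP encodes from dominating the result.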

As can be seen from Table 4, for the LDM coding configuration, PCS achieves a TETS of 5.63% to 23.62%, with an average of 14.96%, and a METS of 8.14% to 38.89%, with an average of 25.70%. The BDPSNR between PCS and HM12.0-ME is -0.002 dB to -0.018 dB, with an average of -0.009 dB, and the BDBR between PCS and HM12.0-ME is 0.07% to 0.53%, with an average of 0.29%. For the RAM coding configuration, PCS achieves a TETS of 6.26% to 16.56%, with an average of 11.45%, and a METS of 9.91% to 30.45%, with an average of 22.59%. The BDPSNR between PCS and HM12.0-ME is -0.004 dB to -0.026 dB, with an average of -0.009 dB, and the BDBR between PCS and HM12.0-ME is 0.11% to 0.56%, with an average of 0.26%. From these values, it can be seen that PCS achieves excellent RD performance, but its TETS and METS are limited.

From Table 4, it can also be seen that the algorithm proposed by the present invention effectively reduces the coding complexity of the motion estimation process with negligible loss in RD performance. For the LDM coding configuration, the proposed algorithm achieves a TETS of 20.16% to 35.69%, with an average of 28.82%, and a METS of 57.73% to 66.05%, with an average of 62.60%. The BDPSNR between the proposed algorithm and HM12.0-ME is -0.019 dB to -0.097 dB, with an average of -0.049 dB, and the BDBR between the proposed algorithm and HM12.0-ME is 0.79% to 2.18%, with an average of 1.51%. For the RAM coding configuration, the proposed algorithm achieves a TETS of 21.06% to 29.95%, with an average of 24.35%, and a METS of 61.22% to 66.02%, with an average of 63.82%. The BDPSNR between the proposed algorithm and HM12.0-ME is -0.020 dB to -0.089 dB, with an average of -0.051 dB, and the BDBR between the proposed algorithm and HM12.0-ME is 0.90% to 2.04%, with an average of 1.41%. From these values, we can see that the proposed algorithm effectively reduces the coding complexity while maintaining good rate-distortion performance.

Table 4 Summary of coding results

In addition, compared with PCS, the motion estimation method proposed by the present invention saves 15.87% of the total coding time and 49.27% of the motion estimation coding time under the LDM coding configuration, and 14.57% of the total coding time and 53.26% of the motion estimation coding time under the RAM coding configuration. Based on the above values, we can conclude that the technical solution proposed by the present invention effectively reduces the complexity of motion estimation coding.

The above embodiments only illustrate the technical idea of the present invention, and the scope of protection of the present invention is not limited thereto. Any modification made on the basis of the technical solution in accordance with the technical idea of the present invention falls within the scope of protection of the present invention.

Claims

A fast motion estimation method based on online learning, comprising the following steps:

(1) encoding the current coding unit using the root prediction unit Inter_2N×2N, the specific process including integer-pixel motion estimation and sub-pixel motion estimation, and recording the optimal motion vector of the root prediction unit as $MV_{best}^{root}$; if $MV_{best}^{root} = MV_{ipme}^{root}$, going to step (2), otherwise going to step (3), where $MV_{ipme}^{root}$ represents the best motion vector obtained when the root prediction unit performs the integer-pixel motion estimation operation;

(2) sequentially encoding the current coding unit using the sub-prediction units Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N, the specific process including integer-pixel motion estimation, and going to step (4);

(3) sequentially encoding the current coding unit using the sub-prediction units Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N, the specific process including integer-pixel motion estimation and sub-pixel motion estimation, and going to step (4);

(4) encoding the current coding unit by using an intra prediction unit, and moving to step (5);

(5) storing the encoding information, writing the bitstream, and returning to step (1) to encode the next coding unit.
The fast motion estimation method based on online learning according to claim 1, wherein in step (1), when the HEVC encoder operates, each frame is divided into a series of coding tree units based on a quadtree structure, the coding tree unit being the basic processing unit of HEVC; during encoding, each coding tree unit is further divided into coding units based on the quadtree structure; and, according to the prediction type, each coding unit is further divided into 1, 2 or 4 prediction units, the prediction unit being the basic processing unit of intra prediction and inter prediction.
The fast motion estimation method based on online learning according to claim 1, wherein in step (1), the inter-prediction encoding of a coding unit supports a total of eight prediction unit modes: Inter_2N×2N, Inter_2N×N, Inter_N×2N, Inter_N×N, Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N.
The fast motion estimation method based on online learning according to claim 1, wherein in step (1), the motion estimation process is: first, performing integer-pixel motion estimation on the current coding unit; then setting the optimal motion vector obtained by integer-pixel motion estimation as the starting position of sub-pixel motion estimation and performing sub-pixel motion estimation on the current coding unit; and finally, determining the best motion vector $MV_{best}$ of motion estimation for the current prediction mode by comparing the rate-distortion costs of the best motion vector of integer-pixel motion estimation, $MV_{ipme}$, and the best motion vector of sub-pixel motion estimation, $MV_{fpme}$, that is,

$$MV_{best} = \begin{cases} MV_{ipme}, & J_{ipme} \le J_{fpme} \\ MV_{fpme}, & \text{otherwise} \end{cases}$$

where $J_{ipme}$ and $J_{fpme}$ are the rate-distortion costs of $MV_{ipme}$ and $MV_{fpme}$, respectively.
The fast motion estimation method based on online learning according to claim 1, wherein in step (3), the optimal motion vector of the sub-prediction unit, $MV_{best}^{sub}$, is determined by learning the motion vector information of its root prediction unit, i.e.

$$MV_{best}^{sub} = \begin{cases} MV_{ipme}^{sub}, & MV_{best}^{root} = MV_{ipme}^{root} \\ MV_{fpme}^{sub}, & \text{otherwise} \end{cases}$$

where $MV_{ipme}^{sub}$ represents the best motion vector of the integer-pixel motion estimation of the sub-prediction unit, $MV_{best}^{root}$ represents the best motion vector of the root prediction unit, $MV_{ipme}^{root}$ represents the best motion vector of the integer-pixel motion estimation of the root prediction unit, and $MV_{fpme}^{sub}$ represents the best motion vector of the sub-pixel motion estimation of the sub-prediction unit.