CN110139098B - Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder - Google Patents

Info

Publication number
CN110139098B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910281249.7A
Other languages
Chinese (zh)
Other versions
CN110139098A (en)
Inventor
张昊
赵御兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN201910281249.7A
Publication of CN110139098A
Application granted
Publication of CN110139098B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding

Abstract

An embodiment of the present invention provides a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder. Compared with using a single fast algorithm, the method further reduces the computational complexity of the encoder while the loss in video quality is negligible. At rate-distortion performance similar to that of original HEVC coding, the method of Embodiment 1 reduces the encoding time by 61.7% on average, while the quality (BRBD) loss is only 1.91%.

Description

Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder
Technical Field
The invention belongs to the technical field of video coding and decoding, and particularly relates to a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video coder.
Background
Video coding converts a video signal file into another file format by means of compression, so that less bandwidth is used during transmission and the video signal is transmitted efficiently. High Efficiency Video Coding (HEVC) is a newer video compression standard that outperforms H.264: at the same video quality, its compression ratio can reach twice that of H.264. After videos such as movies and cartoons are compressed with HEVC, mobile users consume far less data when watching online video, downloads are faster, image quality is essentially unaffected, and online playback is smoother and less prone to stuttering.
In HEVC, a video sequence is encoded using a Group of Pictures (GOP) as the basic unit. Each frame is divided into a series of slices (independently coded units), and each slice is divided into a number of Coding Tree Units (CTUs). A CTU is divided into four smaller Coding Units (CUs) according to a quadtree structure. The CU is the basic unit shared by intra/inter prediction, transform and quantization, entropy coding, and the other stages of HEVC; the supported CU sizes range from a maximum of 64 × 64 to a minimum of 8 × 8. The encoder can choose CU sizes according to picture content, picture size, and application requirements, so as to achieve better optimization.
Intra-frame prediction techniques have been successfully applied in H.264/AVC. Intra-frame prediction in HEVC predicts the current pixel block from already encoded and reconstructed pixel blocks, exploiting the spatial correlation of the image to remove spatially redundant information and improve the compression ratio. To describe image texture more accurately and reduce the prediction error, HEVC uses finer intra prediction and increases the number of prediction modes to 35, as shown in fig. 1. For HEVC intra coding, a CTU may be recursively divided into four CUs down to the minimum coding unit (8 × 8); the CU at each depth may be further divided into PUs in one of two ways, 2N × 2N and N × N; each PU is evaluated with the 35 intra prediction modes, and the optimal PU mode is selected according to RDCost. Determining the optimal partition of a CTU therefore generally requires 1 + 4 + 4^2 + 4^3 = 85 RDCost calculations, as indicated by the sequence numbers in fig. 2.
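The count of 85 follows from visiting every node of the CU quadtree once, from the 64 × 64 CTU down to the 8 × 8 minimum CU. A minimal sketch of that arithmetic (the function name and its parameters are illustrative, not taken from the patent or from the HM software):

```python
def rdcost_evaluations(ctu_size=64, min_cu_size=8):
    """Count the CUs visited by an exhaustive quadtree search of one CTU."""
    total = 0
    depth = 0
    size = ctu_size
    while size >= min_cu_size:
        total += 4 ** depth   # number of CUs at this depth
        size //= 2            # each split halves the CU width
        depth += 1
    return total


print(rdcost_evaluations())  # 1 + 4 + 16 + 64 = 85
```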
The intra-coding algorithm of HEVC selects the best of the 35 prediction modes in two steps, a "coarse search" and a "fine search". In the coarse search, the encoder first selects from the 35 modes the N candidates most likely to be the best mode, forming the candidate set for the fine search. N depends on the size of the Prediction Unit (PU): when the PU size is {4 × 4, 8 × 8, 16 × 16, 32 × 32, 64 × 64}, the corresponding N is {8, 8, 3, 3, 3}, respectively. The Most Probable Modes (MPMs) are then added to the candidate set. In the fine search, the full rate-distortion cost is calculated for the modes in the candidate set, and the mode with the minimum rate-distortion cost is taken as the intra mode. Because the full rate-distortion cost calculation is expensive, only a simplified rate-distortion cost is calculated in the coarse search.
To achieve its compression efficiency, the HEVC encoder has high computational complexity, which limits its use in delay-sensitive applications such as video conferencing and webcasting. A new method is therefore still needed to improve video coding efficiency and reduce computational complexity.
Disclosure of Invention
To solve the problems in the prior art, an object of an embodiment of the present invention is to provide a method for selecting an intra-frame fast algorithm of a high-efficiency video encoder based on a decision tree.
In order to achieve the above purpose, one of the embodiments of the present invention adopts the following technical solutions:
a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder comprises the following steps:
(1) Encoding the training video sequences with a first algorithm and with a second algorithm, respectively, and writing the intermediate information produced while encoding each CTU into a text file as features;
(2) Labeling the text file from step (1): if the condition
Figure BDA0002021732210000021
holds, the sample is labeled 0; otherwise it is labeled 1, yielding labeled training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, T1 is the encoding time of the first algorithm, and T2 is the encoding time of the second algorithm;
(3) Training a decision tree model with the training samples labeled in step (2), then using the trained decision tree model to make a prediction when encoding of a CTU starts, and thereby determining the encoding flow.
In step (1), the training video sequences include: Kimono, ParkScene, Cactus, BasketballDrill, BQMall, BasketballPass, BQSquare, FourPeople, KristenAndSara, Vidyo1, Vidyo3, BasketballDrillText, and SlideEditing.
The number of frames of the test video is equal to the number of pictures contained in one second of the video, i.e., the frame rate of the video.
The configuration file of the encoder is encoder_intra_main.cfg.
Preferably, the first algorithm comprises the following steps:
(1s) fast mode decision:
(1s.a) performing a coarse mode search on the 11 modes {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} according to the HEVC standard, selecting the six best intra mode candidates according to the sum of absolute transformed differences (SATD), and combining them with the best intra modes of the PUs to the left of and above the current PU to form set A;
(1s.b) testing the 2-distance neighbor modes of all elements in set A and selecting the best two intra mode candidates; the 1-distance neighbor modes of these two intra modes, together with the Most Probable Modes (MPMs) of the PU, form set B;
(1s.c) performing a coarse mode search on all modes in set B;
(1s.d) finding the M modes with the smallest SATD cost among all modes that underwent coarse mode search, for use in the subsequent operations, where M is determined by the CU size: when the CU sizes are {64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4}, the values of M are {3, 3, 3, 8, 8}, respectively;
(2s) mode screening based on rate-distortion optimized quantization:
(2s.a) selecting the two modes {m_1, m_2} with the smallest SATD cost from the M modes obtained in the fast mode decision to form set W;
(2s.b) performing the following operations in turn on each remaining mode m_i of the M modes:
if the distances from m_i to all elements in W are greater than 1, m_i is added to W;
if the elements of the set include the modes {m_1, m_2, Intra_Planar, Intra_DC, MPM}, step (2s.b) is skipped;
(2s.c) performing a fine mode search on all elements in W;
(3s) early termination of partitioning based on rate-distortion cost:
if the sum of the rate-distortion costs of the sub-CUs encoded so far exceeds a certain threshold, the encoding of the subsequent sub-CUs is skipped to reduce computational complexity; the specific criterion is:
Figure BDA0002021732210000031
where K takes values in {1, 2, 3, 4}, and β_K takes the values {1.5, 1.2, 1.1, 1} for K = {1, 2, 3, 4}, respectively;
Figure BDA0002021732210000041
represents the sum of the Hadamard costs of the 4 sub-CUs; if the 4 sub-CUs have not all been encoded, its value is replaced by the rate-distortion cost of the current CU;
Figure BDA0002021732210000042
is the sum of the Hadamard costs of the first K sub-CUs;
Figure BDA0002021732210000043
represents the rate-distortion cost of the i-th sub-CU; and
Figure BDA0002021732210000044
is the rate-distortion cost of the current CU.
The N-distance neighbors of an intra mode are the intra modes whose value differs from it by exactly N in absolute value, i.e., the modes m satisfying |m - m_i| = N (where m_i is the value of the intra mode and m is the value of its N-distance neighbor). The modes Intra_Planar and Intra_DC have no N-distance neighbors.
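The N-distance neighbor definition translates directly into a small helper. In this sketch the function name is hypothetical, and restricting neighbors to the angular modes 2 to 34 is an assumption: the text only states that Planar (0) and DC (1) have no neighbors of their own.

```python
PLANAR, DC = 0, 1
ANGULAR_MIN, ANGULAR_MAX = 2, 34  # HEVC angular intra modes


def n_distance_neighbors(mode, n):
    """Return the angular modes m with |m - mode| == n (n > 0 assumed)."""
    if mode in (PLANAR, DC):
        return []  # non-angular modes have no N-distance neighbors
    return [m for m in (mode - n, mode + n)
            if ANGULAR_MIN <= m <= ANGULAR_MAX]


print(n_distance_neighbors(10, 2))  # [8, 12]
```

Modes at the edge of the angular range get a single neighbor, which keeps the progressive search in step (1s.b) inside the valid mode range.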
Preferably, the second algorithm comprises the following steps:
(1s) CU depth prediction:
if
Figure BDA0002021732210000045
the depth range of the current coding block is set to be the same as that of the co-located coding block in the previous frame; if
Figure BDA0002021732210000046
the depth range of the current coding block is set to
Figure BDA0002021732210000047
where
Figure BDA0002021732210000048
is the minimum coding depth of the co-located coding block in the previous frame,
Figure BDA0002021732210000049
is the maximum coding depth of the co-located coding block in the previous frame, and p is a constant equal to 1.02;
(2s) early termination of partitioning based on rate-distortion cost:
further partitioning is terminated if the rate-distortion cost J of the CU at the current coding block depth d satisfies the following condition:
Figure BDA00020217322100000410
where
Figure BDA00020217322100000411
represents the total rate-distortion cost of the co-located block in the previous frame, d represents the coding depth, and
Figure BDA00020217322100000412
is the ratio of the number of CUs at the maximum depth in the co-located coding block of the previous frame to the total number of CUs in that coding block;
(3s) fast candidate mode screening:
the candidate set obtained after the coarse search, with its elements sorted in ascending order of rate-distortion cost, is denoted P = {p(0), p(1), …, p(M-1)}; before the fine search, the following operations are performed on the candidate set:
(3s.a) let m be the largest index in P among the three MPM elements; the set P can then be reduced to P = {p(0), p(1), …, p(m)}, where M - 1 ≥ m;
(3s.b) for every element of the new set P, if J_SATD(p(i)) > 1.3 · J_SATD(p(0)), the element p(i) is removed from P, where J_SATD(p(i)) and J_SATD(p(0)) denote the rate-distortion costs of elements p(i) and p(0) of P, respectively.
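Steps (3s.a) and (3s.b) of the second algorithm can be sketched as a two-pass filter over the sorted candidate list. The function name, the dictionary of SATD costs, and the behavior when no MPM appears in the list (the list is then left untruncated) are assumptions made for illustration.

```python
def screen_candidates(candidates, satd_cost, mpms, ratio=1.3):
    """Shrink a candidate list that is sorted by ascending SATD cost.

    candidates: mode numbers, best first; satd_cost: dict mode -> cost;
    mpms: the most probable modes of the PU.
    """
    # (3s.a) cut the list after the last position occupied by an MPM.
    mpm_positions = [i for i, mode in enumerate(candidates) if mode in mpms]
    if mpm_positions:
        candidates = candidates[:max(mpm_positions) + 1]
    if not candidates:
        return []
    # (3s.b) drop modes whose cost exceeds ratio times the best cost.
    best = satd_cost[candidates[0]]
    return [m for m in candidates if satd_cost[m] <= ratio * best]


costs = {10: 100, 26: 110, 0: 125, 12: 135, 1: 140, 5: 150}
print(screen_candidates([10, 26, 0, 12, 1, 5], costs, {26, 0, 12}))
# [10, 26, 0]
```

In the usage line, the list is first cut after mode 12 (the last MPM), and mode 12 is then dropped because 135 exceeds 1.3 × 100.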
Preferably, the intermediate information includes: maximum coding depth of the CTU, minimum coding depth of the CTU, ratio of the number of maximum coded depth CUs to the number of all CUs in the CTU, total rate-distortion cost of the CTU at the current position of the previous frame, difference between the left CTU maximum coded depth and the minimum coded depth, ratio of the left CTU maximum coded depth CU to the number of all CUs in the CTU, difference between the right CTU maximum coded depth and the minimum coded depth, ratio of the right CTU maximum coded depth CU to the number of all CUs in the CTU, and time taken to code the CTU.
Preferably, in step (3), determining the encoding flow comprises:
(a) Before encoding of a CTU starts, determining whether the frame being encoded is the first frame; if so, encoding the current CTU with the first algorithm; if not, going to step (b);
(b) Inputting the features stored when the co-located CTU of the previous frame was encoded into the decision tree for prediction; if the prediction is 0, encoding the CTU with the first algorithm; if it is 1, encoding the CTU with the second algorithm;
(c) Collecting the intermediate quantities produced during the encoding in step (b), and encoding the next CTU;
(d) Repeating steps (b) and (c) until all CTUs have been encoded.
In step (a), the current CTU is encoded with the first algorithm; after encoding finishes, the intermediate quantities of the encoding process are collected and stored as features for the decision process.
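The per-CTU selection flow of steps (a) to (d) can be sketched as a loop that keeps the features of the previous frame and queries the classifier for each CTU. Everything named here is hypothetical: `predict` stands in for the trained decision tree, and `encode_first` / `encode_second` stand in for the two fast algorithms and return the feature record collected while encoding. The sketch assumes every frame has the same number of CTUs.

```python
def encode_sequence(frames, predict, encode_first, encode_second):
    """Choose an algorithm per CTU and return (frame, position, algorithm).

    frames: list of frames, each a list of CTUs in raster order.
    """
    prev_features = {}
    choices = []
    for frame_idx, ctus in enumerate(frames):
        cur_features = {}
        for pos, ctu in enumerate(ctus):
            # (a) first frame: always the first algorithm.
            # (b) later frames: ask the tree, using the features stored
            #     for the co-located CTU of the previous frame.
            if frame_idx == 0 or predict(prev_features[pos]) == 0:
                algo, encode = 1, encode_first
            else:
                algo, encode = 2, encode_second
            # (c) collect the intermediate quantities for the next frame.
            cur_features[pos] = encode(ctu)
            choices.append((frame_idx, pos, algo))
        prev_features = cur_features
    return choices
```

Because the prediction only reads features from the previous frame, the decision costs one tree traversal per CTU and adds no extra encoding pass.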
The first algorithm introduces fast decisions at both the micro and the macro level: at the micro level, a progressive coarse search reduces the number of prediction directions subjected to coarse search; at the macro level, the sum of absolute transformed differences (SATD) of the current PU is compared with the total SATD of its four sub-PUs to decide whether the CU should be divided further, thereby reducing the coding depth.
For the second algorithm, the average difference between the maximum and minimum coding depths of a coding block is 1.75, which means that only 2 to 3 depths need to be searched when encoding a coding block, rather than all depths. The computational complexity of the encoder can therefore be greatly reduced as long as the coding depth of the coding block is predicted accurately.
The inventive concept of one embodiment of the invention is as follows: the CTU is the basic unit of encoding, so encoding a video segment can be regarded as encoding its individual CTUs, and different algorithms take different amounts of time to encode the same CTU. If each CTU is encoded with whichever algorithm takes the least time, the total encoding time will be less than that of an encoder using a single algorithm.
Beneficial effects of the embodiments of the invention
Compared with using a single algorithm, the method provided by one embodiment of the present invention further reduces the computational complexity of the encoder while the loss in video quality is negligible.
At rate-distortion performance similar to that of original HEVC coding, the method of Embodiment 1 reduces the encoding time by 61.7% on average, while the quality (BRBD) loss is only 1.91%.
Drawings
Fig. 1 is a diagram illustrating 33 angular prediction directions of intra prediction.
Fig. 2 is a schematic diagram of a CTU quadtree recursive partitioning structure.
FIG. 3 is a flow chart for utilizing decision tree prediction.
Detailed Description
The following are specific examples of the present invention, and the technical solutions of the present invention will be further described with reference to the examples, but the present invention is not limited to the examples.
Example 1
This example provides a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder, comprising the following steps:
(1) Encoding the training video sequences with a first algorithm and with a second algorithm, respectively, and writing the intermediate information produced while encoding each CTU into a text file as features;
(2) Labeling the text file from step (1): if the condition
Figure BDA0002021732210000061
holds, the sample is labeled 0; otherwise it is labeled 1, yielding labeled training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, T1 is the encoding time of the first algorithm, and T2 is the encoding time of the second algorithm;
(3) Training a decision tree model with the training samples labeled in step (2), then using the trained decision tree model to make a prediction when encoding of a CTU starts, and thereby determining the encoding flow, as shown in fig. 3.
In step (1), the training video sequences include: Kimono, ParkScene, Cactus, BasketballDrill, BQMall, BasketballPass, BQSquare, FourPeople, KristenAndSara, Vidyo1, Vidyo3, BasketballDrillText, and SlideEditing.
The number of frames of the test video is equal to the number of pictures contained in one second of the video, i.e., the frame rate of the video.
The configuration file of the encoder is encoder_intra_main.cfg.
The first algorithm comprises the following steps:
(1s) fast mode decision:
(1s.a) performing a coarse mode search on the 11 modes {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} according to the HEVC standard, selecting the six best intra mode candidates according to the sum of absolute transformed differences (SATD), and combining them with the best intra modes of the PUs to the left of and above the current PU to form set A;
(1s.b) testing the 2-distance neighbor modes of all elements in set A, selecting the best two intra mode candidates from the 2-distance neighbor modes, and combining the 1-distance neighbor modes of these two intra modes with the Most Probable Modes (MPMs) of the PU to form set B;
(1s.c) performing a coarse mode search on all modes in set B;
(1s.d) finding the M modes with the smallest SATD cost among all modes that underwent coarse mode search, for use in the subsequent operations, where M is determined by the CU size: when the CU sizes are {64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4}, the values of M are {3, 3, 3, 8, 8}, respectively;
(2s) mode screening based on rate-distortion optimized quantization:
(2s.a) selecting the two modes {m_1, m_2} with the smallest SATD cost from the M modes obtained in the fast mode decision to form set W;
(2s.b) performing the following operations in turn on each remaining mode m_i of the M modes:
if the distances from m_i to all elements in W are greater than 1, m_i is added to W;
if the elements of the set include the modes {m_1, m_2, Intra_Planar, Intra_DC, MPM}, step (2s.b) is skipped;
(2s.c) performing a fine mode search on all elements in W;
(3s) early termination of partitioning based on rate-distortion cost:
if the sum of the rate-distortion costs of the sub-CUs encoded so far exceeds a certain threshold, the encoding of the subsequent sub-CUs is skipped to reduce computational complexity; the specific criterion is:
Figure BDA0002021732210000071
where K takes values in {1, 2, 3, 4}, and β_K takes the values {1.5, 1.2, 1.1, 1} for K = {1, 2, 3, 4}, respectively;
Figure BDA0002021732210000081
represents the sum of the Hadamard costs of the 4 sub-CUs; if the 4 sub-CUs have not all been encoded, its value is replaced by the rate-distortion cost of the current CU;
Figure BDA0002021732210000082
is the sum of the Hadamard costs of the first K sub-CUs;
Figure BDA0002021732210000083
represents the rate-distortion cost of the i-th sub-CU; and
Figure BDA0002021732210000084
is the rate-distortion cost of the current CU.
The N-distance neighbors of an intra mode are the intra modes whose value differs from it by exactly N in absolute value, i.e., the modes m satisfying |m - m_i| = N (where m_i is the value of the intra mode and m is the value of its N-distance neighbor). The modes Intra_Planar and Intra_DC have no N-distance neighbors.
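The distance rule used in the mode-screening step (2s) can be illustrated with a short sketch. It covers only steps (2s.a) and the distance test of (2s.b); the skip condition involving Intra_Planar, Intra_DC, and the MPMs is deliberately left out because its exact reading is not clear from the text. The function name is hypothetical, and "distance" is taken to be the absolute difference of mode numbers.

```python
def screen_modes(ranked_modes):
    """Build set W from M modes sorted by ascending SATD cost."""
    # (2s.a) the two cheapest modes seed the set W.
    w = list(ranked_modes[:2])
    # (2s.b) a remaining mode joins W only if it is more than
    # 1 away from every mode already kept.
    for mode in ranked_modes[2:]:
        if all(abs(mode - kept) > 1 for kept in w):
            w.append(mode)
    return w


print(screen_modes([10, 26, 11, 14, 25, 30, 2, 13]))
# [10, 26, 14, 30, 2]
```

The effect is to discard near-duplicates of directions that are already in W, so the fine search in (2s.c) runs on a smaller, more diverse set.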
The second algorithm comprises the following steps:
(1s) CU depth prediction:
if
Figure BDA0002021732210000085
the depth range of the current coding block is set to be the same as that of the co-located coding block in the previous frame; if
Figure BDA0002021732210000086
the depth range of the current coding block is set to
Figure BDA0002021732210000087
where
Figure BDA0002021732210000088
is the minimum coding depth of the co-located coding block in the previous frame,
Figure BDA0002021732210000089
is the maximum coding depth of the co-located coding block in the previous frame, and p is a constant equal to 1.02;
(2s) early termination of partitioning based on rate-distortion cost:
further partitioning is terminated if the rate-distortion cost J of the CU at the current coding block depth d satisfies the following condition:
Figure BDA00020217322100000810
where
Figure BDA00020217322100000811
represents the total rate-distortion cost of the co-located block in the previous frame, d represents the coding depth, and
Figure BDA00020217322100000812
is the ratio of the number of CUs at the maximum depth in the co-located coding block of the previous frame to the total number of CUs in that coding block;
(3s) fast candidate mode screening:
the candidate set obtained after the coarse search, with its elements sorted in ascending order of rate-distortion cost, is denoted P = {p(0), p(1), …, p(M-1)}; before the fine search, the following operations are performed on the candidate set:
(3s.a) let m be the largest index in P among the three MPM elements; the set P can then be reduced to P = {p(0), p(1), …, p(m)}, where M - 1 ≥ m;
(3s.b) for every element of the new set P, if J_SATD(p(i)) > 1.3 · J_SATD(p(0)), the element p(i) is removed from P, where J_SATD(p(i)) and J_SATD(p(0)) denote the rate-distortion costs of elements p(i) and p(0) of P, respectively.
The intermediate information includes: the maximum coding depth of the CTU, the minimum coding depth of the CTU, the ratio of the number of the maximum coding depth CUs to the number of all CUs in the CTU, the total rate-distortion cost of the CTU at the current position of the previous frame, the difference between the maximum coding depth of the left CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the left CTU to the number of all CUs in the CTU, the difference between the maximum coding depth of the right CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the right CTU to the number of all CUs in the CTU, and the time taken for coding the CTU.
In step (3), determining the encoding flow comprises:
(a) Before encoding of a CTU starts, determining whether the frame being encoded is the first frame; if so, encoding the current CTU with the first algorithm; if not, going to step (b);
(b) Inputting the features stored when the co-located CTU of the previous frame was encoded into the decision tree for prediction; if the prediction is 0, encoding the CTU with the first algorithm; if it is 1, encoding the CTU with the second algorithm;
(c) Collecting the intermediate quantities produced during the encoding in step (b), and encoding the next CTU;
(d) Repeating steps (b) and (c) until all CTUs have been encoded.
In step (a), the current CTU is encoded with the first algorithm; after encoding finishes, the intermediate quantities of the encoding process are collected and stored as features for the decision process.
Example 2
In this example, the method of Example 1 was run on a Windows 10 operating system with Visual Studio 2017 as the build environment; the same videos were encoded and the results compared with the original HEVC encoding results, using version 10.0 of the HEVC reference software HM. The results are shown in Table 1.
Table 1. Performance of the method of Example 1 compared with the original HEVC coding results
Video name            Frame rate  Pictures encoded  BRBD (%)   Time reduction rate
BQSquare              60          600               1.465593   0.552692
BasketballDrill       50          500               1.904419   0.577065
BasketballDrive       50          500               2.15825    0.723275
FourPeople            60          600               1.745963   0.633975
BasketballDrillText   50          500               2.286081   0.598056
Mean                                                1.9120612  0.6170126
As can be seen from table 1, compared with the original HEVC encoder, the method of embodiment 1 of the present invention can reduce the encoding time by 61.7% on average, and only 1.91% of the quality (BRBD) is lost.
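The averages quoted above can be re-derived from the per-sequence rows of Table 1. The short check below uses only the values listed in the table; the variable names are illustrative.

```python
# (BRBD in percent, time reduction rate) per test sequence, from Table 1.
rows = {
    "BQSquare": (1.465593, 0.552692),
    "BasketballDrill": (1.904419, 0.577065),
    "BasketballDrive": (2.15825, 0.723275),
    "FourPeople": (1.745963, 0.633975),
    "BasketballDrillText": (2.286081, 0.598056),
}

mean_brbd = sum(brbd for brbd, _ in rows.values()) / len(rows)
mean_time_reduction = sum(t for _, t in rows.values()) / len(rows)

print(round(mean_brbd, 2))                  # 1.91
print(round(100 * mean_time_reduction, 1))  # 61.7
```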

Claims (1)

1. A decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder is characterized by comprising the following steps:
(1) Encoding the training video sequences with a first algorithm and with a second algorithm, respectively, and writing the intermediate information produced while encoding each CTU into a text file as features;
(2) Labeling the text file from step (1): if the condition
Figure 219302DEST_PATH_IMAGE001
holds, the sample is labeled 0; otherwise it is labeled 1, yielding labeled training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, T1 is the encoding time of the first algorithm, and T2 is the encoding time of the second algorithm;
(3) Training a decision tree model with the training samples labeled in step (2), then using the trained decision tree model to make a prediction when encoding of a CTU starts, and thereby determining the encoding flow;
the steps of the first algorithm comprise:
(1s) fast mode decision:
(1s.a) performing a coarse mode search on the 11 modes {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} defined by the HEVC standard, selecting the six best intra mode candidates according to the sum of absolute transformed differences (SATD), and combining them with the best intra modes of the PU to the left of and the PU above the current PU to form a set A;
(1s.b) testing the neighbor modes at distance 2 from every element in set A, selecting two intra mode candidates from these distance-2 neighbor modes, and combining the distance-1 neighbor modes of these two candidates with the most probable modes (MPM) of the PU to form a set B;
(1s.c) performing a coarse mode search on all modes in set B;
(1s.d) selecting, from all modes that have undergone the coarse mode search, the M modes with the minimum SATD cost for the subsequent operations, wherein M is determined by the CU size: for CU sizes {64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4}, the values of M are {3, 8};
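Steps (1s.a) to (1s.d) can be sketched as follows. This is a minimal illustration, not the patented implementation: the callable `satd_cost`, the parameter names `left_best`, `above_best`, `mpm` and `m_keep`, the clamping of angular neighbors to the range 2 to 34, and the choice that Planar and DC have no neighbors are all assumptions introduced here. The sketch also assumes the two candidates in (1s.b) are the ones with the lowest SATD cost.

```python
from typing import Callable, Dict, Iterable, List, Set

# The 11 modes named in step (1s.a).
COARSE_MODES = [0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34]


def neighbors(mode: int, distance: int) -> Set[int]:
    """Angular neighbors of `mode` at `distance` (assumption: Planar/DC have none)."""
    if mode < 2:
        return set()
    return {m for m in (mode - distance, mode + distance) if 2 <= m <= 34}


def fast_mode_decision(
    satd_cost: Callable[[int], float],  # hypothetical SATD cost function
    left_best: int,
    above_best: int,
    mpm: Iterable[int],
    m_keep: int,
) -> List[int]:
    tested: Dict[int, float] = {}

    def search(modes: Iterable[int]) -> None:
        # Coarse search: evaluate SATD once per mode.
        for mode in modes:
            if mode not in tested:
                tested[mode] = satd_cost(mode)

    def best(modes: Iterable[int], count: int) -> List[int]:
        # Lowest SATD first; ties broken by mode index for determinism.
        return sorted(modes, key=lambda m: (tested[m], m))[:count]

    # (1s.a) coarse search over 11 modes, keep six, add left/above best modes.
    search(COARSE_MODES)
    set_a = set(best(COARSE_MODES, 6)) | {left_best, above_best}

    # (1s.b) test distance-2 neighbors of A, keep two, build set B.
    two_away: Set[int] = set().union(*(neighbors(m, 2) for m in set_a))
    search(two_away)
    set_b: Set[int] = set(mpm)
    for mode in best(two_away, 2):
        set_b |= neighbors(mode, 1)

    # (1s.c) coarse search over set B.
    search(set_b)

    # (1s.d) keep the M modes with the smallest SATD among everything tested.
    return best(tested, m_keep)
```

With a toy cost such as `lambda m: abs(m - 13)`, the search narrows from the sparse 11-mode grid toward mode 13 without ever evaluating all 35 modes.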
(2s) mode screening based on rate-distortion optimized quantization:
(2s.a) selecting, from the M modes obtained in the fast mode decision, the two modes {m1, m2} with the minimum SATD cost to form a set W;
(2s.b) processing the remaining modes among the M modes one by one as follows:
if the distance between the remaining mode and every element in W is greater than 1, adding the remaining mode to the set W;
if the elements of the set include the modes {m1, m2, Intra_Planar, Intra_DC, MPM}, step (2s.b) is skipped;
(2s.c) performing a fine mode search on all elements in W;
(3s) early termination of partitioning based on rate-distortion cost:
if the sum of the rate-distortion costs of the sub-CUs coded so far exceeds a certain threshold, the encoding of the subsequent sub-CUs is skipped to reduce computational complexity; the specific criterion is:
[formula image 002]
wherein: K takes values in {1, 2, 3, 4}; the value of [formula image 003] corresponding to K = {1, 2, 3, 4} is {1.5, 1.2, 1.1, 1}; [formula image 004] represents the sum of the Hadamard costs of the 4 sub-CUs, and is replaced by the rate-distortion cost of the current CU when the 4 sub-CUs have not all been encoded; [formula image 005] is the sum of the Hadamard costs of the first K sub-CUs; [formula image 006] represents the rate-distortion cost of the i-th sub-CU; and [formula image 007] is the rate-distortion cost of the current CU;
the steps of the second algorithm comprise:
(1s) CU depth prediction:
if the condition
[formula image 008]
holds, the depth range of the current coding block is set to be the same as that of the co-located coding block in the previous frame; if the condition
[formula image 009]
holds, the depth range of the current coding block is set to
[formula image 010]
wherein [formula image 011] is the minimum coding depth of the co-located coding block in the previous frame, [formula image 012] is the maximum coding depth of the co-located coding block in the previous frame, and p is a constant equal to 1.02;
(2s) early termination of partitioning based on rate-distortion cost:
partitioning is terminated if the rate-distortion cost J of the CU at the current coding block depth d satisfies the following condition:
[formula image 013]
wherein [formula image 014] represents the total rate-distortion cost of the block at the current position in the previous frame, d represents the coding depth, and [formula image 015] is the ratio of the number of CUs at the maximum depth in the coding block at the current position of the previous frame to the total number of CUs in that coding block;
(3s) fast candidate mode screening:
for the candidate set obtained after the coarse search, whose elements are sorted in ascending order of rate-distortion cost and denoted P = {P(0), P(1), …, P(M-1)}, the following operations are performed on the elements of the candidate set before the fine search:
(3s.a) letting m be the largest index in P among the three elements of the MPM, the set P is reduced to P = {P(0), P(1), …, P(m)}, where M-1 > m;
(3s.b) for each element in the new set P, if the condition
[formula image 016]
is satisfied, the element P(i) is removed from the set P, wherein [formula image 017] and [formula image 018] represent the rate-distortion costs of the (i+1)-th element and the 0-th element in P, respectively;
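The truncation in step (3s.a) can be sketched as below. This is an illustrative sketch only: the function and parameter names are hypothetical, and the removal condition of step (3s.b) is available only as a formula image, so it is passed in as a caller-supplied predicate `should_drop` rather than reconstructed. The sketch also assumes that, if some MPM entries are absent from P, only the ones present are considered, and that P is left unchanged if none are present.

```python
from typing import Callable, List, Sequence


def truncate_at_last_mpm(candidates: Sequence[int], mpm: Sequence[int]) -> List[int]:
    """Step (3s.a): keep P(0)..P(m), where m is the last index holding an MPM mode.

    `candidates` is assumed to be sorted by ascending cost already.
    """
    mpm_set = set(mpm)
    positions = [i for i, mode in enumerate(candidates) if mode in mpm_set]
    if not positions:
        # Assumption: with no MPM mode in P, nothing is truncated.
        return list(candidates)
    return list(candidates[: max(positions) + 1])


def screen_candidates(
    candidates: Sequence[int],
    mpm: Sequence[int],
    should_drop: Callable[[int, Sequence[int]], bool],
) -> List[int]:
    """Apply (3s.a), then remove every P(i) for which the (3s.b) predicate holds.

    `should_drop(i, kept)` is a hypothetical stand-in for the image-only
    condition of step (3s.b); it is not defined by this sketch.
    """
    kept = truncate_at_last_mpm(candidates, mpm)
    return [mode for i, mode in enumerate(kept) if not should_drop(i, kept)]
```

For example, with candidates `[10, 26, 1, 12, 0, 18, 30]` and MPM `[26, 0, 1]`, the last MPM entry sits at index 4, so modes 18 and 30 are discarded before the fine search.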
the intermediate information includes: the maximum coding depth of the CTU, the minimum coding depth of the CTU, the ratio of the number of the maximum coding depth CUs to the number of all CUs in the CTU, the total rate-distortion cost of the CTU at the current position of the previous frame, the difference between the maximum coding depth of the left CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the left CTU to the number of all CUs in the CTU, the difference between the maximum coding depth of the right CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the right CTU to the number of all CUs in the CTU, and the time for coding the CTU;
in the step (3), the step of determining the encoding process includes:
(a) Before the encoding of a CTU starts, determining whether the frame being encoded is the first frame; if so, encoding the current CTU with the first algorithm, and if not, performing step (b);
(b) Inputting the features that were stored when the encoding of the CTU at the current position in the previous frame finished into the decision tree for prediction; if the prediction result is 0, encoding the CTU with the first algorithm, and if the result is 1, encoding the CTU with the second algorithm;
(c) Collecting the intermediate information produced during the encoding in step (b), and encoding the next CTU;
(d) Repeating steps (b) and (c) until all CTUs have been encoded.
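The per-CTU selection loop of steps (a) to (d) can be sketched as follows. This is a sketch under stated assumptions, not the encoder's actual interface: `predict` stands in for the trained decision tree, `encode_ctu` stands in for the encoder and is assumed to return the feature vector (the intermediate information) collected while encoding, and CTUs are identified by a simple index within the frame.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]


def encode_sequence(
    num_frames: int,
    ctus_per_frame: int,
    predict: Callable[[Features], int],                # hypothetical: returns 0 or 1
    encode_ctu: Callable[[int, int, int], Features],   # hypothetical: (frame, ctu, algorithm) -> features
) -> List[Tuple[int, int, int]]:
    """Return the (frame, ctu, algorithm) choices made while encoding."""
    previous: Dict[int, Features] = {}  # features of co-located CTUs in the previous frame
    choices: List[Tuple[int, int, int]] = []

    for frame in range(num_frames):
        current: Dict[int, Features] = {}
        for ctu in range(ctus_per_frame):
            if frame == 0:
                # Step (a): the first frame always uses the first algorithm.
                algorithm = 1
            else:
                # Step (b): label 0 selects the first algorithm, label 1 the second.
                algorithm = 1 if predict(previous[ctu]) == 0 else 2
            # Step (c): encode and keep the intermediate information for the next frame.
            current[ctu] = encode_ctu(frame, ctu, algorithm)
            choices.append((frame, ctu, algorithm))
        previous = current  # Step (d): continue until every CTU is encoded.

    return choices
```

Because the prediction uses only features saved from the previous frame, the decision adds no extra encoding pass for the current CTU.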

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910281249.7A CN110139098B (en) 2019-04-09 2019-04-09 Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder

Publications (2)

Publication Number Publication Date
CN110139098A CN110139098A (en) 2019-08-16
CN110139098B true CN110139098B (en) 2023-01-06

Family

ID=67569432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910281249.7A Active CN110139098B (en) 2019-04-09 2019-04-09 Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder

Country Status (1)

Country Link
CN (1) CN110139098B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111327909B (en) * 2020-03-06 2022-10-18 郑州轻工业大学 Rapid depth coding method for 3D-HEVC
CN115334308B (en) * 2022-10-14 2022-12-27 北京大学深圳研究生院 Learning model-oriented coding decision processing method, device and equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103338371A (en) * 2013-06-07 2013-10-02 东华理工大学 Fast and efficient video coding intra mode determining method
CN106131547A (en) * 2016-07-12 2016-11-16 北京大学深圳研究生院 The high-speed decision method of intra prediction mode in Video coding
CN107071418A (en) * 2017-05-05 2017-08-18 上海应用技术大学 A kind of quick division methods of HEVC intraframe coding units based on decision tree

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8467448B2 (en) * 2006-11-15 2013-06-18 Motorola Mobility Llc Apparatus and method for fast intra/inter macro-block mode decision for video encoding
US9426473B2 (en) * 2013-02-01 2016-08-23 Qualcomm Incorporated Mode decision simplification for intra prediction
US10142626B2 (en) * 2014-10-31 2018-11-27 Ecole De Technologie Superieure Method and system for fast mode decision for high efficiency video coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Low-complexity HEVC intra coding mode decision algorithm; Zhu Wei et al.; Journal of Chinese Computer Systems (《小型微型计算机系统》); 2017-12-15; full text *

Also Published As

Publication number Publication date
CN110139098A (en) 2019-08-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant