CN110139098B - Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder - Google Patents
- Publication number
- CN110139098B (application CN201910281249.7A)
- Authority
- CN
- China
- Prior art keywords
- coding
- ctu
- rate
- depth
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/56—Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/96—Tree coding, e.g. quad-tree coding
Abstract
One embodiment of the present invention provides a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder. Compared with using a single fast algorithm, the method further reduces the computational complexity of the encoder while the loss in video quality is negligible. At rate-distortion performance similar to that of the original HEVC encoder, the method of Embodiment 1 reduces encoding time by 61.7% on average, with a quality loss (BRBD) of only 1.91%.
Description
Technical Field
The invention belongs to the technical field of video coding and decoding, and particularly relates to a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video coder.
Background
Video coding converts a video signal file into another format by means of compression, so that less bandwidth is used during transmission and the video signal is transmitted efficiently. High Efficiency Video Coding (HEVC) is a new video compression standard that outperforms H.264: at the same video quality, its compression ratio can reach twice that of H.264. When videos such as movies and cartoons are compressed with HEVC, mobile users consume far less data when watching online, downloads are faster, picture quality is essentially unaffected, and online playback is smoother and less prone to stuttering.
In HEVC, a video sequence is encoded with the Group of Pictures (GOP) as the basic unit. Each frame is divided into a series of slices (independently coded units), and each slice is divided into a number of Coding Tree Units (CTUs). A CTU is split into four smaller Coding Units (CUs) following a quadtree structure. The CU is the basic unit shared by intra/inter prediction, transform and quantization, entropy coding, and the other stages of HEVC; the supported CU size ranges from a maximum of 64 × 64 to a minimum of 8 × 8. The encoder can choose CU sizes according to picture content, picture size, and application requirements, and thereby achieve better optimization.
Intra-frame prediction techniques have been successfully applied in H.264/AVC. Intra-frame prediction in HEVC predicts the current pixel block from already encoded and reconstructed pixel blocks, exploiting the spatial correlation of the image to remove spatially redundant information and improve the compression ratio. To describe image texture more accurately and reduce the prediction error, HEVC adopts finer intra prediction, and the number of prediction modes is increased to 35, as shown in Fig. 1. For HEVC intra coding, a CTU can be recursively split into four CUs down to the minimum coding unit (8 × 8). The CU at each depth can further be partitioned into PUs in one of two modes, 2N × 2N and N × N; each PU is tested with the 35 intra prediction modes, and the optimal PU mode is selected according to RDCost. Determining the optimal partitioning of one CTU generally requires $1 + 4 + 4^2 + 4^3 = 85$ RDCost calculations, as indicated by the sequence numbers in Fig. 2.
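The count of 85 follows from summing the number of CUs at each quadtree depth, from one 64 × 64 CU down to sixty-four 8 × 8 CUs. A minimal Python sketch of that arithmetic (the function name and parameters are illustrative, not taken from the patent):

```python
def rdcost_evaluations(ctu_size=64, min_cu_size=8):
    """Count the CUs visited by an exhaustive quadtree search of one CTU."""
    total = 0
    depth = 0
    size = ctu_size
    while size >= min_cu_size:
        total += 4 ** depth  # 4^depth CUs of this size tile the CTU
        size //= 2
        depth += 1
    return total


print(rdcost_evaluations())  # 1 + 4 + 16 + 64 = 85
```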
The intra-coding algorithm of HEVC selects the best of the 35 prediction modes in two steps, a "coarse search" and a "fine search". In the coarse search, the encoder first selects from the 35 modes the N candidate modes most likely to be the best mode, which form the candidate set for the fine search. N depends on the size of the Prediction Unit (PU): for PU sizes {4 × 4, 8 × 8, 16 × 16, 32 × 32, 64 × 64}, the corresponding N is {8, 8, 3, 3, 3}, respectively. The Most Probable Modes (MPMs) are then added to the candidate set. In the fine search, the full rate-distortion cost is calculated for every mode in the candidate set, and the mode with the minimum rate-distortion cost is taken as the intra prediction mode. Because the full rate-distortion cost is expensive to compute, only a simplified rate-distortion cost is calculated in the coarse search.
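The construction of the fine-search candidate set can be sketched as follows. This is an illustrative Python sketch, not encoder code: it assumes N = 8 for 4 × 4 and 8 × 8 PUs and N = 3 for larger PUs, and the function name, the cost dictionary, and the example costs are hypothetical.

```python
# Number of coarse-search survivors per PU width (assumed values, see lead-in).
NUM_CANDIDATES = {4: 8, 8: 8, 16: 3, 32: 3, 64: 3}


def fine_search_candidates(coarse_costs, pu_size, mpms):
    """Return the modes passed to the fine search.

    coarse_costs: dict mapping mode -> simplified cost from the coarse search.
    mpms: most probable modes, appended if not already selected.
    """
    n = NUM_CANDIDATES[pu_size]
    best = sorted(coarse_costs, key=coarse_costs.get)[:n]
    return best + [m for m in mpms if m not in best]


costs = {0: 10, 1: 12, 10: 5, 26: 7, 18: 20}
print(fine_search_candidates(costs, 16, [0, 26, 34]))  # [10, 26, 0, 34]
```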
In consideration of compression efficiency, the HEVC encoder has high computational complexity and is limited in applications with high requirements on delay, such as video conferencing and webcasting. Therefore, a new method is still needed to improve the video coding efficiency and reduce the computational complexity.
Disclosure of Invention
To solve the problems in the prior art, an object of an embodiment of the present invention is to provide a method for selecting an intra-frame fast algorithm of a high-efficiency video encoder based on a decision tree.
In order to achieve the above purpose, one of the embodiments of the present invention adopts the following technical solutions:
a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder comprises the following steps:
(1) Encoding a training video sequence with a first algorithm and with a second algorithm, respectively, and writing the intermediate information produced while encoding each CTU to a text file as features;
(2) Labeling the text file from step (1): a sample is labeled 0 if the labeling condition is satisfied and 1 otherwise, yielding labeled training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, and T1 and T2 are the encoding times of the first and second algorithms, respectively;
(3) Training a decision tree model with the training samples labeled in step (2), then using the trained model to make a prediction when the encoding of each CTU starts, and determining the encoding flow accordingly.
In step (1), the training video sequences include: Kimono, ParkScene, Cactus, BasketballDrill, BQMall, BasketballPass, BQSquare, FourPeople, KristenAndSara, Vidyo1, Vidyo3, BasketballDrillText, and SlideEditing.
The number of frames of the test video is equal to the number of pictures contained in one second of the video, i.e., the frame rate of the video.
The configuration file of the encoder is encoder_intra_main.cfg.
Preferably, the first algorithm step comprises:
(1s) Fast mode decision:
(1s.a) performing a coarse mode search on the 11 modes {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} defined by the HEVC standard, selecting the six best intra-mode candidates according to the sum of absolute transformed differences (SATD), and combining them with the best intra modes of the PUs to the left of and above the current PU to form set A;
(1s.b) testing the 2-distance neighbor modes of all elements in set A, further selecting the best two intra-mode candidates, and forming set B from the 1-distance neighbor modes of these two intra modes and the Most Probable Modes (MPM) of the PU;
(1s.c) performing a coarse mode search on all modes in set B;
(1s.d) selecting, from all modes that underwent the coarse mode search, the M modes with the smallest SATD cost for the subsequent operations, where M is determined by the CU size: for CU sizes {64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4}, M is {3, 3, 3, 8, 8}, respectively;
(2s) Mode screening based on rate-distortion optimized quantization:
(2s.a) selecting, from the M modes obtained in the fast mode decision, the two modes {m_1, m_2} with the smallest SATD cost to form set W;
(2s.b) performing the following operations in turn on each remaining mode m_i of the M modes:
if the distances from m_i to all elements in W are greater than 1, m_i is added to W;
if the elements of the set include the modes {m_1, m_2, Intra_Planar, Intra_DC, MPM}, step (2s.b) is skipped;
(2s.c) performing a fine mode search on all elements in W;
(3s) Rate-distortion cost based termination of partitioning:
if the sum of the rate-distortion costs of the sub-CUs coded so far exceeds a threshold, the coding of the remaining sub-CUs is skipped, which reduces computational complexity. In the decision criterion, K takes values in {1, 2, 3, 4}; β_K equals {1.5, 1.2, 1.1, 1} for K = {1, 2, 3, 4}, respectively; the remaining terms are the sum of the Hadamard costs of the 4 sub-CUs (replaced by the rate-distortion cost of the current CU if the 4 sub-CUs have not all been encoded), the sum of the Hadamard costs of the first K sub-CUs, the rate-distortion cost of the i-th sub-CU, and the rate-distortion cost of the current CU.
The N-distance neighbors of an intra mode are the intra modes whose values differ from the value of that mode by exactly N in absolute value, i.e., the modes m satisfying $|m - m_i| = N$, where $m_i$ is the value of the intra mode and m is the value of its N-distance neighbor; the modes Intra_Planar and Intra_DC have no N-distance neighbors.
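The N-distance neighbor definition and the adjacency test of step (2s.b) can be sketched in Python as below. This is a simplified illustration under stated assumptions: modes are HEVC mode indices (0 = Planar, 1 = DC, 2 to 34 angular), the "distance" in (2s.b) is taken to be the absolute difference of mode indices, and the skip rule of (2s.b) is left out. The function names are hypothetical.

```python
PLANAR, DC = 0, 1
ANGULAR = range(2, 35)  # HEVC angular modes 2..34


def n_distance_neighbors(mode, n):
    """Angular modes m with |m - mode| == n (n >= 1); Planar and DC have none."""
    if mode in (PLANAR, DC):
        return []
    return [m for m in (mode - n, mode + n) if m in ANGULAR]


def screen_modes(ranked_modes):
    """Build set W from modes sorted by ascending SATD cost (step 2s)."""
    w = list(ranked_modes[:2])  # the two cheapest modes m_1, m_2
    for m in ranked_modes[2:]:
        # Assumption: distance is the absolute difference of mode indices.
        if all(abs(m - e) > 1 for e in w):
            w.append(m)  # keep only modes not adjacent to anything in W
    return w


print(n_distance_neighbors(10, 2))          # [8, 12]
print(screen_modes([10, 26, 11, 18, 25, 2]))  # [10, 26, 18, 2]
```

Modes 11 and 25 are dropped in the second call because each is at distance 1 from a mode already in W, so a fine search on them would likely add little beyond their neighbors.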
Preferably, the second algorithm step comprises:
(1s) CU depth prediction:
if the first condition is satisfied, the depth range of the current coding block is set equal to the depth range of the co-located coding block in the previous frame; if the second condition is satisfied, the depth range of the current coding block is set to a range expressed in terms of the minimum coding depth and the maximum coding depth of the co-located coding block in the previous frame, where p is a constant equal to 1.02;
(2s) Rate-distortion cost based termination of partitioning:
if the rate-distortion cost J of the CU at the current coding-block depth d satisfies the termination condition, further partitioning is terminated; in that condition, one term is the total rate-distortion cost of the co-located block in the previous frame, d is the coding depth, and the other term is the ratio of the number of CUs at the maximum depth in the co-located coding block of the previous frame to the total number of CUs in that coding block;
(3s) Fast candidate mode screening:
the candidate set obtained after the coarse search, with its elements sorted in ascending order of rate-distortion cost, is denoted P = {p(0), p(1), …, p(M−1)}; before the fine search, the following operations are performed on the elements of the candidate set:
(3s.a) letting m be the largest index in P occupied by any of the three MPM elements, the set is reduced to P = {p(0), p(1), …, p(m)} when M − 1 > m;
(3s.b) for every element in the new set P, if $J(p(i))_{SATD} > 1.3\,J(p(0))_{SATD}$, the element p(i) is removed from P, where $J(p(i))_{SATD}$ and $J(p(0))_{SATD}$ are the rate-distortion costs of the elements p(i) and p(0) of P, respectively.
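The two pruning rules of step (3s) can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation: the function name, the cost dictionary, and the example numbers are hypothetical, and the candidate list is left unchanged by rule (3s.a) when no MPM appears in it, which is an assumption.

```python
def prune_candidates(p, satd_cost, mpms, ratio=1.3):
    """Prune a candidate list before the fine search.

    p: non-empty list of modes sorted by ascending cost.
    satd_cost: dict mapping mode -> SATD-based cost.
    mpms: collection of most probable modes.
    """
    # (3s.a) cut the list after the last position occupied by an MPM.
    mpm_positions = [i for i, mode in enumerate(p) if mode in mpms]
    if mpm_positions:
        p = p[: max(mpm_positions) + 1]
    # (3s.b) drop modes whose cost exceeds `ratio` times the best cost.
    best = satd_cost[p[0]]
    return [mode for mode in p if satd_cost[mode] <= ratio * best]


cost = {10: 100, 26: 110, 0: 135, 12: 140, 1: 150}
print(prune_candidates([10, 26, 0, 12, 1], cost, {26, 0, 34}))  # [10, 26]
```

In the example, rule (3s.a) first shortens the list to [10, 26, 0] because mode 0 is the last MPM present, and rule (3s.b) then removes mode 0 because its cost of 135 exceeds 1.3 × 100.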
Preferably, the intermediate information includes: maximum coding depth of the CTU, minimum coding depth of the CTU, ratio of the number of maximum coded depth CUs to the number of all CUs in the CTU, total rate-distortion cost of the CTU at the current position of the previous frame, difference between the left CTU maximum coded depth and the minimum coded depth, ratio of the left CTU maximum coded depth CU to the number of all CUs in the CTU, difference between the right CTU maximum coded depth and the minimum coded depth, ratio of the right CTU maximum coded depth CU to the number of all CUs in the CTU, and time taken to code the CTU.
Preferably, in step (3), the step of determining the encoding flow includes:
(a) Before CTU coding starts, judging whether a coding frame is a first frame, if so, coding the current CTU by using a first algorithm, and if not, performing the step (b);
(b) Inputting the characteristics stored when the CTU at the current position of the previous frame is coded into a decision tree for prediction, if the prediction result is 0, coding the CTU by using a first algorithm, and if the result is 1, coding the CTU by using a second algorithm;
(c) Collecting intermediate quantities in the encoding process of the step (b) and encoding the next CTU;
(d) Repeating steps (b) and (c) until all CTU codes are completed.
In step (a), the current CTU is encoded with the first algorithm; after encoding, the intermediate quantities of the encoding process are collected and stored as the features used in the decision process.
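The flow of steps (a) to (d) can be sketched as a loop over frames and CTUs. The Python sketch below is only an outline: `predict`, `encode_first`, and `encode_second` are placeholder callables standing in for the trained decision tree and the two fast algorithms, and frames are modeled as plain lists of CTUs with the same layout in every frame.

```python
def encode_sequence(frames, predict, encode_first, encode_second):
    """Return the algorithm index (0 or 1) chosen for every CTU, in order.

    predict(features) -> 0 or 1; encode_*(ctu) -> features collected for that
    CTU. All three callables are placeholders for encoder internals.
    """
    choices = []
    prev_features = None
    for frame_idx, frame in enumerate(frames):
        cur_features = []
        for pos, ctu in enumerate(frame):
            if frame_idx == 0:
                algo = 0  # (a) first frame: always the first algorithm
            else:
                algo = predict(prev_features[pos])  # (b) co-located CTU features
            encoder = encode_first if algo == 0 else encode_second
            cur_features.append(encoder(ctu))  # (c) collect intermediate quantities
            choices.append(algo)
        prev_features = cur_features  # features feed the next frame's decisions
    return choices


# Toy run: the "feature" is just a number, and the stub tree picks the second
# algorithm whenever the co-located feature is greater than 1.
print(encode_sequence([[1, 2], [3, 4]],
                      predict=lambda f: 1 if f > 1 else 0,
                      encode_first=lambda c: c,
                      encode_second=lambda c: c * 10))  # [0, 0, 0, 1]
```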
The first algorithm introduces fast decisions at both the micro and the macro level: at the micro level, a progressive coarse search reduces the number of prediction directions tested in the coarse search; at the macro level, the sum of absolute transformed differences (SATD) of the current PU is compared with the sum of the SATDs of its four sub-PUs to decide whether the CU is split further, which reduces the coding depth.
With respect to the second algorithm, the average of the difference between the maximum coded depth and the minimum coded depth of a coded block is equal to 1.75, which means that only 2 to 3 depths need to be searched during the coding block encoding process, instead of encoding all depths. Therefore, the computational complexity of the encoder can be greatly reduced as long as the coding depth of the coding block can be accurately predicted.
In the method according to one embodiment of the present invention, the CTU is a minimum unit of coding, and coding a segment of video may be regarded as coding individual CTUs, and different algorithms may consume different time when coding the same CTU. If the shortest encoding time algorithm is used for each CTU to complete the encoding, the total encoding time is less than that of an encoder using a single algorithm.
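A small numeric illustration of this argument, using made-up per-CTU encoding times: picking the faster algorithm for each CTU can never take longer in total than the better of the two single-algorithm encoders. The sketch assumes an ideal selector; the decision tree only approximates this choice and also weighs rate-distortion cost.

```python
# Hypothetical per-CTU encoding times (ms) for the two algorithms.
t1 = [12.0, 9.0, 15.0, 7.0]
t2 = [10.0, 11.0, 8.0, 9.0]

single_best = min(sum(t1), sum(t2))                    # best single algorithm
per_ctu_best = sum(min(a, b) for a, b in zip(t1, t2))  # ideal per-CTU choice

print(single_best, per_ctu_best)  # 38.0 34.0
```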
Beneficial effects of the embodiments of the invention
Compared with using a single fast algorithm, the method provided by one embodiment of the present invention further reduces the computational complexity of the encoder while the loss in video quality is negligible.
At rate-distortion performance similar to that of the original HEVC encoder, the method of Embodiment 1 reduces encoding time by 61.7% on average, with a quality loss (BRBD) of only 1.91%.
Drawings
Fig. 1 is a diagram illustrating 33 angular prediction directions of intra prediction.
Fig. 2 is a schematic diagram of a CTU quadtree recursive partitioning structure.
FIG. 3 is a flow chart for utilizing decision tree prediction.
Detailed Description
The following are specific examples of the present invention, and the technical solutions of the present invention will be further described with reference to the examples, but the present invention is not limited to the examples.
Example 1
This example provides a decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder, comprising the following steps:
(1) Encoding a training video sequence with a first algorithm and with a second algorithm, respectively, and writing the intermediate information produced while encoding each CTU to a text file as features;
(2) Labeling the text file from step (1): a sample is labeled 0 if the labeling condition is satisfied and 1 otherwise, yielding labeled training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, and T1 and T2 are the encoding times of the first and second algorithms, respectively;
(3) Training a decision tree model with the training samples labeled in step (2), then using the trained model to make a prediction when the encoding of each CTU starts, and determining the encoding flow accordingly, as shown in Fig. 3.
In step (1), the training video sequences include: Kimono, ParkScene, Cactus, BasketballDrill, BQMall, BasketballPass, BQSquare, FourPeople, KristenAndSara, Vidyo1, Vidyo3, BasketballDrillText, and SlideEditing.
The number of frames of the test video is equal to the number of pictures contained in one second of the video, i.e., the frame rate of the video.
The configuration file of the encoder is encoder_intra_main.cfg.
The first algorithm step includes:
(1s) Fast mode decision:
(1s.a) performing a coarse mode search on the 11 modes {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} defined by the HEVC standard, selecting the six best intra-mode candidates according to the sum of absolute transformed differences (SATD), and combining them with the best intra modes of the PUs to the left of and above the current PU to form set A;
(1s.b) testing the 2-distance neighbor modes of all elements in set A, further selecting the best two intra-mode candidates from the 2-distance neighbor modes, and combining the 1-distance neighbor modes of these two intra modes with the Most Probable Modes (MPM) of the PU to form set B;
(1s.c) performing a coarse mode search on all modes in set B;
(1s.d) selecting, from all modes that underwent the coarse mode search, the M modes with the smallest SATD cost for the subsequent operations, where M is determined by the CU size: for CU sizes {64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4}, M is {3, 3, 3, 8, 8}, respectively;
(2s) Mode screening based on rate-distortion optimized quantization:
(2s.a) selecting, from the M modes obtained in the fast mode decision, the two modes {m_1, m_2} with the smallest SATD cost to form set W;
(2s.b) performing the following operations in turn on each remaining mode m_i of the M modes:
if the distances from m_i to all elements in W are greater than 1, m_i is added to W;
if the elements of the set include the modes {m_1, m_2, Intra_Planar, Intra_DC, MPM}, step (2s.b) is skipped;
(2s.c) performing a fine mode search on all elements in W;
(3s) Rate-distortion cost based termination of partitioning:
if the sum of the rate-distortion costs of the sub-CUs coded so far exceeds a threshold, the coding of the remaining sub-CUs is skipped, which reduces computational complexity. In the decision criterion, K takes values in {1, 2, 3, 4}; β_K equals {1.5, 1.2, 1.1, 1} for K = {1, 2, 3, 4}, respectively; the remaining terms are the sum of the Hadamard costs of the 4 sub-CUs (replaced by the rate-distortion cost of the current CU if the 4 sub-CUs have not all been encoded), the sum of the Hadamard costs of the first K sub-CUs, the rate-distortion cost of the i-th sub-CU, and the rate-distortion cost of the current CU.
The N-distance neighbors of an intra mode are the intra modes whose values differ from the value of that mode by exactly N in absolute value, i.e., the modes m satisfying $|m - m_i| = N$, where $m_i$ is the value of the intra mode and m is the value of its N-distance neighbor; the modes Intra_Planar and Intra_DC have no N-distance neighbors.
The second algorithm step comprises:
(1s) CU depth prediction:
if the first condition is satisfied, the depth range of the current coding block is set equal to the depth range of the co-located coding block in the previous frame; if the second condition is satisfied, the depth range of the current coding block is set to a range expressed in terms of the minimum coding depth and the maximum coding depth of the co-located coding block in the previous frame, where p is a constant equal to 1.02;
(2s) Rate-distortion cost based termination of partitioning:
if the rate-distortion cost J of the CU at the current coding-block depth d satisfies the termination condition, further partitioning is terminated; in that condition, one term is the total rate-distortion cost of the co-located block in the previous frame, d is the coding depth, and the other term is the ratio of the number of CUs at the maximum depth in the co-located coding block of the previous frame to the total number of CUs in that coding block;
(3s) Fast candidate mode screening:
the candidate set obtained after the coarse search, with its elements sorted in ascending order of rate-distortion cost, is denoted P = {p(0), p(1), …, p(M−1)}; before the fine search, the following operations are performed on the elements of the candidate set:
(3s.a) letting m be the largest index in P occupied by any of the three MPM elements, the set is reduced to P = {p(0), p(1), …, p(m)} when M − 1 > m;
(3s.b) for every element in the new set P, if $J(p(i))_{SATD} > 1.3\,J(p(0))_{SATD}$, the element p(i) is removed from P, where $J(p(i))_{SATD}$ and $J(p(0))_{SATD}$ are the rate-distortion costs of the elements p(i) and p(0) of P, respectively.
The intermediate information includes: the maximum coding depth of the CTU, the minimum coding depth of the CTU, the ratio of the number of the maximum coding depth CUs to the number of all CUs in the CTU, the total rate-distortion cost of the CTU at the current position of the previous frame, the difference between the maximum coding depth of the left CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the left CTU to the number of all CUs in the CTU, the difference between the maximum coding depth of the right CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the right CTU to the number of all CUs in the CTU, and the time taken for coding the CTU.
In the step (3), the step of determining the encoding process includes:
(a) Before CTU coding starts, judging whether a coding frame is a first frame, if so, coding the current CTU by using a first algorithm, and if not, performing the step (b);
(b) Inputting the characteristics stored when the CTU at the current position of the previous frame is coded into a decision tree for prediction, if the prediction result is 0, coding the CTU by using a first algorithm, and if the result is 1, coding the CTU by using a second algorithm;
(c) Collecting intermediate quantities in the encoding process of the step (b) and encoding the next CTU;
(d) Repeating steps (b) and (c) until all CTU codes are completed.
In step (a), the current CTU is encoded with the first algorithm; after encoding, the intermediate quantities of the encoding process are collected and stored as the features used in the decision process.
Example 2
In this example, the method of Example 1 was implemented under the Win10 operating system with Visual Studio 2017 as the development environment. The same videos were encoded and the results compared with those of the original HEVC encoder; the HEVC reference software HM version was 10.0. The results are shown in Table 1.
Table 1. Performance of the method of Example 1 compared with the original HEVC encoding results
Video name | Frame rate | Number of encoded frames | BRBD (%) | Time reduction ratio |
BQSquare | 60 | 600 | 1.465593 | 0.552692 |
BasketballDrill | 50 | 500 | 1.904419 | 0.577065 |
BasketballDrive | 50 | 500 | 2.15825 | 0.723275 |
FourPeople | 60 | 600 | 1.745963 | 0.633975 |
BasketballDrillText | 50 | 500 | 2.286081 | 0.598056 |
Mean value | | | 1.9120612 | 0.6170126 |
As can be seen from table 1, compared with the original HEVC encoder, the method of embodiment 1 of the present invention can reduce the encoding time by 61.7% on average, and only 1.91% of the quality (BRBD) is lost.
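The averages quoted above can be checked directly from the per-sequence rows of Table 1. A short Python check (the variable names are illustrative):

```python
# (BRBD in percent, time reduction ratio) per test sequence, from Table 1.
rows = {
    "BQSquare": (1.465593, 0.552692),
    "BasketballDrill": (1.904419, 0.577065),
    "BasketballDrive": (2.15825, 0.723275),
    "FourPeople": (1.745963, 0.633975),
    "BasketballDrillText": (2.286081, 0.598056),
}

mean_brbd = sum(r[0] for r in rows.values()) / len(rows)
mean_time = sum(r[1] for r in rows.values()) / len(rows)

print(f"{mean_brbd:.2f}% BRBD, {mean_time:.1%} time reduction")
# 1.91% BRBD, 61.7% time reduction
```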
Claims (1)
1. A decision tree-based intra-frame fast algorithm selection method for a high-efficiency video encoder is characterized by comprising the following steps:
(1) Respectively coding a training video sequence by using a first algorithm and a second algorithm, and writing intermediate information when each CTU is coded in the coding process into a text file as a characteristic;
(2) Labeling the text file from step (1): a sample is labeled 0 if the labeling condition is satisfied and 1 otherwise, yielding labeled training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, and T1 and T2 are the encoding times of the first and second algorithms, respectively;
(3) After the decision tree model is trained by using the training samples marked in the step (2), predicting when the CTU starts to encode through the trained decision tree model, and determining an encoding process;
the first algorithm step comprises:
(1s) fast mode decision:
(1s.a) performing a coarse mode search on the 11 modes {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} defined by the HEVC standard, selecting the six best intra-mode candidates according to the sum of absolute transformed differences (SATD), and combining them with the best intra modes of the PUs to the left of and above the current PU to form set A;
(1s.b) testing the 2-distance neighbor modes of all elements in set A, selecting two intra-mode candidates from the 2-distance neighbor modes, and combining the 1-distance neighbor modes of these two intra modes with the Most Probable Modes (MPM) of the PU to form set B;
(1s.c) performing a coarse mode search on all modes in set B;
(1s.d) selecting, from all modes that underwent the coarse mode search, the M modes with the smallest SATD cost for the subsequent operations, where M is determined by the CU size: for CU sizes {64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4}, M is {3, 3, 3, 8, 8}, respectively;
(2s) mode screening based on rate-distortion optimized quantization:
(2s.a) selecting, from the M modes obtained in the fast mode decision, the two modes {m_1, m_2} with the smallest SATD cost to form set W;
(2s.b) performing the following operations in turn on each remaining mode of the M modes:
if the distances from the remaining mode to all elements in W are greater than 1, the remaining mode is added to W;
if the elements of the set include the modes {m_1, m_2, Intra_Planar, Intra_DC, MPM}, step (2s.b) is skipped;
(2s.c) performing a fine mode search on all elements in W;
(3s) rate-distortion cost based termination of partitioning:
if the sum of the rate-distortion costs of the sub-CUs coded so far exceeds a threshold, the coding of the remaining sub-CUs is skipped, which reduces computational complexity; in the decision criterion, K takes values in {1, 2, 3, 4}; the coefficient β_K equals {1.5, 1.2, 1.1, 1} for K = {1, 2, 3, 4}, respectively; the remaining terms are the sum of the Hadamard costs of the 4 sub-CUs (replaced by the rate-distortion cost of the current CU if the 4 sub-CUs have not all been encoded), the sum of the Hadamard costs of the first K sub-CUs, the rate-distortion cost of the i-th sub-CU, and the rate-distortion cost of the current CU;
the second algorithm step comprises:
(1 s) CU depth prediction:
if it isIf the depth range of the current coding block is set to be the same as that of the coding block at the same position of the previous frame, if so, the depth range of the current coding block is set to be the same as that of the coding block at the same position of the previous frameThe depth range of the current coding block is set toWhereinThe minimum coding depth of a block is coded for the same position of the previous frame,the maximum coding depth of the coding block at the same position of the previous frame is p, and the constant is 1.02;
(2 s) rate-distortion cost based termination partitioning:
if the rate-distortion cost J of the CU with the current coding block depth of d meets the following conditions:
wherein represents the total rate-distortion cost of the block at the current position of the previous frame, d represents the coding depth, and is the ratio of the number of CUs at the maximum depth to the total number of CUs in the coding block at the current position of the previous frame;
(3 s) fast candidate mode screening:
the elements of the candidate set obtained after the coarse search are arranged in ascending order of rate-distortion cost and denoted P = {P(0), P(1), ..., P(M-1)}, and before the fine search is carried out, the following operations are performed on the elements in the candidate set:
(3s.a) letting m be the largest index in P occupied by any of the three elements of the MPM, the set P is reduced to P = {P(0), P(1), ..., P(m)}, where M-1 > m;
(3s.b) for each element in the new set P, if satisfied, the element P(i) is removed from the set P, where respectively represent the rate-distortion costs of the (i+1)-th element and the 0-th element in P;
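As an illustration only, the truncation of step (3s.a) might be sketched in Python as follows, with the cut-off index written as m. The function name `truncate_candidates` and the handling of MPM elements that do not occur in P (they are ignored, and P is left unchanged if none occurs) are assumptions of this sketch. Step (3s.b) is not modelled because its condition is not legible in this text.

```python
from typing import List, Sequence


def truncate_candidates(p: Sequence[int], mpm: Sequence[int]) -> List[int]:
    """Cut the candidate list P after the last position holding an MPM element.

    p   : candidate modes, already sorted by ascending rate-distortion cost
    mpm : the three most probable modes
    """
    # Positions in P that hold one of the MPM elements.
    positions = [i for i, mode in enumerate(p) if mode in mpm]
    if not positions:
        # Assumption: with no MPM element in P, nothing is truncated.
        return list(p)
    # m is the rearmost index in P occupied by an MPM element.
    m = max(positions)
    # Keep P(0) ... P(m); candidates ranked after P(m) are discarded.
    return list(p[: m + 1])


# Toy candidate list and MPM set (mode numbers are made up).
print(truncate_candidates([26, 10, 0, 1, 18, 2], [0, 1, 26]))
```

Here the rearmost MPM element sits at index 3, so the two candidates ranked after it are discarded before the fine search.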
the intermediate information includes: the maximum coding depth of the CTU, the minimum coding depth of the CTU, the ratio of the number of the maximum coding depth CUs to the number of all CUs in the CTU, the total rate-distortion cost of the CTU at the current position of the previous frame, the difference between the maximum coding depth of the left CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the left CTU to the number of all CUs in the CTU, the difference between the maximum coding depth of the right CTU and the minimum coding depth, the ratio of the number of the maximum coding depth CUs of the right CTU to the number of all CUs in the CTU, and the time for coding the CTU;
in the step (3), the step of determining the encoding process includes:
(a) Before CTU coding starts, judging whether a coding frame is a first frame, if so, coding the current CTU by using a first algorithm, and if not, performing the step (b);
(b) Inputting the characteristics stored when the coding of the CTU at the current position of the previous frame is finished into a decision tree for prediction, if the prediction result is 0, coding the CTU by using a first algorithm, and if the result is 1, coding the CTU by using a second algorithm;
(c) Collecting intermediate information in the encoding process of the step (b) and encoding the next CTU;
(d) Repeating steps (b) and (c) until the coding of all CTUs is completed.
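As an illustration only, the selection loop of steps (a) to (d) might be sketched in Python as follows. The names `choose_algorithm` and `encode_sequence`, the `predict` callable standing in for the trained decision tree, and the `encode_ctu` callable standing in for the encoder are all hypothetical; the toy feature vector does not reproduce the intermediate information listed in the claim.

```python
from typing import Callable, Dict, List, Optional, Sequence

Features = Sequence[float]


def choose_algorithm(prev_features: Optional[Features],
                     predict: Callable[[Features], int]) -> int:
    """Return 1 for the first algorithm, 2 for the second algorithm."""
    # Step (a): in the first frame no features are stored, so use algorithm 1.
    if prev_features is None:
        return 1
    # Step (b): tree output 0 selects the first algorithm, 1 the second.
    return 1 if predict(prev_features) == 0 else 2


def encode_sequence(num_frames: int, ctus_per_frame: int,
                    predict: Callable[[Features], int],
                    encode_ctu: Callable[[int, int, int], Features]) -> List[int]:
    """Encode every CTU and return the algorithm chosen for each one."""
    stored: Dict[int, Features] = {}  # features per CTU position, previous frame
    choices: List[int] = []
    for frame in range(num_frames):
        for ctu in range(ctus_per_frame):
            prev = stored.get(ctu) if frame > 0 else None
            algo = choose_algorithm(prev, predict)
            # Step (c): encoding yields the intermediate information that is
            # stored for the same CTU position in the next frame.
            stored[ctu] = encode_ctu(frame, ctu, algo)
            choices.append(algo)
    # Step (d): the loops end once all CTUs have been coded.
    return choices


# Toy stand-ins: the "tree" thresholds one feature; the "encoder" returns it.
toy_predict = lambda f: 1 if f[0] >= 1.0 else 0
toy_encode = lambda frame, ctu, algo: [float(ctu)]
choices = encode_sequence(2, 2, toy_predict, toy_encode)
print(choices)
```

Keeping the decision in a small pure function separates the tree prediction from the encoder, so either stand-in can be replaced without touching the loop.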
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910281249.7A CN110139098B (en) | 2019-04-09 | 2019-04-09 | Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910281249.7A CN110139098B (en) | 2019-04-09 | 2019-04-09 | Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110139098A (en) | 2019-08-16 |
CN110139098B (en) | 2023-01-06 |
Family
ID=67569432
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910281249.7A Active CN110139098B (en) | 2019-04-09 | 2019-04-09 | Decision tree-based intra-frame fast algorithm selection method for high-efficiency video encoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110139098B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111327909B (en) * | 2020-03-06 | 2022-10-18 | 郑州轻工业大学 | Rapid depth coding method for 3D-HEVC |
CN115334308B (en) * | 2022-10-14 | 2022-12-27 | 北京大学深圳研究生院 | Learning model-oriented coding decision processing method, device and equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103338371A (en) * | 2013-06-07 | 2013-10-02 | 东华理工大学 | Fast and efficient video coding intra mode determining method |
CN106131547A (en) * | 2016-07-12 | 2016-11-16 | 北京大学深圳研究生院 | The high-speed decision method of intra prediction mode in Video coding |
CN107071418A (en) * | 2017-05-05 | 2017-08-18 | 上海应用技术大学 | A kind of quick division methods of HEVC intraframe coding units based on decision tree |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8467448B2 (en) * | 2006-11-15 | 2013-06-18 | Motorola Mobility Llc | Apparatus and method for fast intra/inter macro-block mode decision for video encoding |
US9426473B2 (en) * | 2013-02-01 | 2016-08-23 | Qualcomm Incorporated | Mode decision simplification for intra prediction |
US10142626B2 (en) * | 2014-10-31 | 2018-11-27 | Ecole De Technologie Superieure | Method and system for fast mode decision for high efficiency video coding |
Non-Patent Citations (1)
Title |
---|
低复杂度的HEVC帧内编码模式决策算法;朱威等;《小型微型计算机系统》;20171215;全文 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||