CN113709492B - SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics - Google Patents


Info

Publication number
CN113709492B
Authority
CN
China
Prior art keywords
mode
current
residual coefficient
optimal
depth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110978616.6A
Other languages
Chinese (zh)
Other versions
CN113709492A (en)
Inventor
汪大勇
解乐乐
王欣
王倩敏
宋丽娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Dayu Chuangfu Technology Co ltd
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202110978616.6A priority Critical patent/CN113709492B/en
Publication of CN113709492A publication Critical patent/CN113709492A/en
Application granted granted Critical
Publication of CN113709492B publication Critical patent/CN113709492B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/149 Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention belongs to the field of SHVC video coding, and particularly relates to a SHVC spatial scalable video coding method based on distribution characteristics, which comprises the following steps: acquiring the depth of a current coding unit CU, judging whether skipping is performed according to the depth of the CU, if skipping is selected, selecting the next depth for coding, if not, judging whether a current mode is an optimal mode, if not, adopting a direction mode to predict the optimal mode, if so, judging whether division is terminated according to the optimal mode, if so, outputting a division result, and if not, selecting the next depth for coding; the invention adopts a variable step length halving search method to predict the direction mode, and solves the problem of low efficiency caused by Hadamard transformation of the direction mode with low possibility.

Description

SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics
Technical Field
The invention belongs to the field of SHVC (scalable video coding) video coding, and particularly relates to an SHVC spatial scalable video coding method based on distribution characteristics.
Background
Technologies such as digital television broadcasting, video conferencing, wireless video streaming, smart phone communication and the like are increasingly widely applied in daily life of people, so that a plurality of different terminal devices are generated, and the terminal devices may have different screen resolutions, so that the video streaming is required to adapt to the resolutions of different screens; scalable high efficiency video coding (SHVC) is an effective approach to this problem. SHVC consists of a Base Layer (BL) and one or more Enhancement Layers (EL). As shown in fig. 1, to accommodate different screen resolutions, the Spatial SHVC (SSHVC) encodes different layers with different screen resolution sequences, and by selecting the appropriate layer, the SSHVC can accommodate various devices of different screen resolutions.
In SSHVC consisting of one BL and one or more ELs, the BL uses only intra-layer prediction, while the EL additionally uses inter-layer prediction; the intra prediction coding process in the EL is the same as that of HEVC. Since the BL and EL have the same content but different resolutions, the BL must be up-sampled for inter-layer prediction; the corresponding mode is denoted the inter-layer reference (ILR) mode. The coding process of HEVC is already very complex, and SSHVC needs to code all of its layers, so its coding process is more complex still, which limits its wide application, especially in wireless and real-time scenarios. Therefore, it is important to reduce the coding complexity and increase the coding speed.
Some existing algorithms can improve the coding speed to some extent, but the spatial SHVC still has some problems to be solved: 1. texture features and correlation are commonly used to predict candidate depths; however, their connection to depth selection is not straightforward; therefore, using them alone to predict depth selection does not guarantee optimal performance; 2. to increase the coding speed, the mode selection is usually terminated early with residual coefficients; as the principle behind it is not studied, the best performance cannot be obtained by using only residual coefficients for early termination mode selection.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method for encoding SHVC spatial scalable video based on distribution characteristics, which comprises the following steps:
S1: acquiring the depth of the current coding unit (CU); if the current CU depth is 1 or 2, judging whether to skip the current depth according to the residual coefficients of the enhancement-layer ILR mode; if so, executing step S5; otherwise, executing step S2;
S2: judging whether the ILR mode of the current CU is the optimal mode by a GMM-EM method; if so, executing step S4; otherwise, executing step S3;
S3: carrying out intra-frame prediction on the current CU to obtain the optimal mode of the current CU;
S4: judging whether the current CU continues to be divided according to the residual coefficients of the optimal mode, and outputting the division result of the CU if the division is stopped; if the current CU continues to be divided, acquiring the depth of the current CU; if the current depth is 3, ending the division directly to obtain the final division result; if the current depth is not 3, executing step S5;
S5: dividing the current CU into four sub-CUs, and executing steps S1 to S4 on each of the four sub-CUs.
Preferably, the process of determining whether to skip the current depth according to the residual coefficients includes: coding the current coding unit CU to obtain a residual coefficient map of the enhancement-layer ILR mode; dividing the residual coefficient map to obtain a first residual coefficient map and a second residual coefficient map; calculating the expectation and variance of the first residual coefficient map and of the second residual coefficient map respectively; and judging whether the expectation and variance of the first residual coefficient map differ from those of the second residual coefficient map; if they differ, skipping the current depth; otherwise, not skipping the current depth.
Further, the process of calculating the expectation and variance of a residual coefficient map comprises: assuming that each coefficient in the divided residual coefficient maps follows a Gaussian distribution, and taking the coefficients of each divided map as residual coefficient samples; obtaining the probability density function and the corresponding likelihood function of the residual coefficient samples by maximum likelihood estimation; and obtaining the expectation and variance of each divided residual coefficient map from the probability density function and the likelihood function.
Further, the process of determining whether the expectation and variance of the first residual coefficient map differ from those of the second residual coefficient map includes: inputting the expectation and variance of the first residual coefficient map into a judgment condition to obtain a first judgment result; inputting the expectation and variance of the second residual coefficient map into the judgment condition to obtain a second judgment result; and comparing the first judgment result with the second judgment result to obtain the final judgment result.
Further, the judgment condition is as follows:
$$\left|\frac{\bar{y}-\mu_1}{\sigma_1/\sqrt{n}}\right| \ge s_\alpha$$
where $\bar{y}$ denotes the mean value of the samples, $\mu_1$ denotes the expectation, $\sigma_1$ denotes the standard deviation, $n$ denotes the number of residual coefficients in each part, and $s_\alpha$ denotes a threshold value.
Preferably, the step of determining whether the ILR mode of the current CU is the optimal mode by using the GMM-EM method includes: saving coding modes and rate distortion costs of each depth CU of a previous frame and a current frame; obtaining the coding mode and the rate distortion cost of adjacent CUs of the previous frame and the current frame of each CU according to the coding mode and the rate distortion cost of the CU of the previous frame and the current frame; adopting rec0 to store the rate-distortion cost belonging to the ILR mode, and adopting rec1 to store the rate-distortion cost of the Intra mode; after the current CU finishes the ILR mode, obtaining the rate distortion cost of the ILR mode adopted by the current CU according to the rate distortion cost stored by rec0 and rec 1; performing GMM conversion on the rate distortion cost of the ILR mode of the current CU to obtain the probability based on rate distortion; obtaining the probability of the current CU based on the quantity according to the coding mode of the adjacent CU; predicting the probability that the current CU adopts the ILR mode according to the probability based on rate distortion and the probability based on quantity; and judging whether the ILR mode of the current CU is the optimal mode according to the probability result.
Preferably, the intra prediction of the current CU mode includes: predicting the mode of the current CU by a method based on the directional mode (DM); a CU has 35 DMs in total; the process of predicting the mode of the CU includes:
Step 1: performing the Hadamard transform for DM0, DM1, DM10 and DM26; taking the smaller Hadamard cost of DM0 and DM1 as HC1 and the smaller Hadamard cost of DM10 and DM26 as HC2; comparing HC1 with HC2; if HC1 is smaller than HC2, the DM of DM0 and DM1 with the smaller Hadamard cost is the optimal DM, and step 10 is executed; otherwise, step 2 is executed;
Step 2: comparing the Hadamard costs of DM10 and DM26; if the Hadamard cost of DM10 is less than that of DM26, executing step 3; if the Hadamard cost of DM10 is greater than that of DM26, executing step 5; otherwise, executing step 7;
Step 3: checking DM8, DM9, DM11 and DM12, and judging whether an LMD exists among DM9, DM11 and DM12; if so, that DM is the optimal DM, and step 10 is executed; otherwise, step 4 is executed;
Step 4: checking DM2, DM6, DM14 and DM18; if the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14 and DM18, directly executing step 10; otherwise, obtaining the optimal DM by a binary search, and executing step 9;
Step 5: checking DM24, DM25, DM27 and DM28; if an LMD exists among DM24, DM25 and DM27, that DM is the optimal DM, and step 10 is executed; otherwise, step 6 is executed;
Step 6: checking DM18, DM22, DM30 and DM34; if the DM with the minimum Hadamard cost is not among DM18, DM22, DM8, DM24, DM28, DM30 and DM34, directly executing step 10; otherwise, obtaining the optimal DM by a binary search, and executing step 9;
Step 7: checking the other DMs, excluding DM10 and DM26; if an LMD exists among DM9, DM10, DM11, DM25, DM26 and DM27, that DM is the optimal DM, and step 10 is executed; otherwise, step 8 is executed;
Step 8: checking DM2, DM6, DM14, DM18, DM22, DM30 and DM34; if the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14, DM18, DM22, DM24, DM28, DM30 and DM34, directly executing step 10; otherwise, obtaining the optimal DM by a binary search, and executing step 9;
Step 9: checking the DM midway between the DM with the minimum Hadamard cost and its already-checked left (right) neighboring DM, and selecting the DM with the minimum Hadamard cost; repeating this process until the selected DM is an LMD, which is the optimal DM; then executing step 10;
Step 10: the DM selection terminates.
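The coarse branch decision of steps 1 and 2 and the midpoint refinement of step 9 can be sketched as follows. This is only an illustrative sketch: `hc` is a hypothetical callback returning the Hadamard cost of a DM, the branch labels are made up for readability, and, because the text does not define LMD, the sketch assumes that the search stops once the nearest checked DMs on both sides of the cheapest DM are directly adjacent to it.

```python
from typing import Callable, Dict, Iterable

def initial_branch(hc: Callable[[int], float]) -> str:
    """Steps 1 and 2: pick a branch from the costs of DM0, DM1, DM10 and DM26."""
    hc1 = min(hc(0), hc(1))      # smaller cost of DM0 and DM1
    c10, c26 = hc(10), hc(26)
    hc2 = min(c10, c26)          # smaller cost of DM10 and DM26
    if hc1 < hc2:
        return "dm0_or_dm1"      # step 1: go straight to step 10
    if c10 < c26:
        return "around_dm10"     # continue with step 3
    if c10 > c26:
        return "around_dm26"     # continue with step 5
    return "both"                # continue with step 7

def refine(hc: Callable[[int], float], checked: Iterable[int]) -> int:
    """Step 9: repeatedly probe the midpoints next to the cheapest checked DM."""
    costs: Dict[int, float] = {dm: hc(dm) for dm in checked}
    while True:
        best = min(costs, key=costs.get)
        lower = [dm for dm in costs if dm < best]
        upper = [dm for dm in costs if dm > best]
        probes = []
        if lower and best - max(lower) > 1:
            probes.append((best + max(lower)) // 2)
        if upper and min(upper) - best > 1:
            probes.append((best + min(upper)) // 2)
        if not probes:
            # Assumed stopping rule: both neighbours already checked and adjacent.
            return best
        for dm in probes:
            costs[dm] = hc(dm)
```

With a toy cost such as `lambda dm: abs(dm - 7)` and the coarse set `[2, 6, 10, 14, 18]`, the refinement probes DM4 and DM8, then DM5 and DM7, and stops at DM7 without evaluating every angular mode.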
Preferably, the process of determining whether the current CU continues to be divided according to the residual coefficient of the optimal mode is the same as the process of determining whether the current depth is skipped according to the residual coefficient of the enhancement layer ILR mode.
The invention has the advantages that: the Hadamard transform is terminated in advance according to the significance difference, and the variable step length halving search method is adopted to predict and select the directional mode of the CU, so that the problem of low efficiency caused by the Hadamard transform of the directional mode with low possibility is solved.
Drawings
Fig. 1 is a result of a base layer and an enhancement layer in a conventional SHVC in the present invention;
FIG. 2 is a flow chart of the algorithm of the present invention;
FIG. 3 is a graph of a residual coefficient map according to the present invention;
FIG. 4 is a diagram of a neighboring CU structure of the present invention;
FIG. 5 is a schematic view of all directional modes of the present invention;
FIG. 6 is a diagram illustrating the prediction results of the type 1 directional mode of the present invention;
FIG. 7 is a diagram illustrating the results of the variable step size check and binary search of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The original encoding method comprises: coding the whole coding unit CU to obtain its rate-distortion cost value RDcost, recorded as C1; dividing the whole CU to obtain 4 sub-CUs of depth 1, coding each depth-1 sub-CU to obtain its optimal RDcost, and recording the sum of the optimal RDcost values of all depth-1 sub-CUs as C2; comparing C1 with C2 and taking the smaller value as the optimal RDcost; if C1 is smaller than C2, the current CU is not divided; otherwise, the division continues; and repeating the above process until the depth of the current CU is 3, at which point the division stops, finally yielding the optimal RDcost over depths 0, 1, 2 and 3.
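The exhaustive comparison of C1 against C2 described above amounts to a recursive quadtree search. The sketch below illustrates that idea only; `rd_cost` is a hypothetical callback that returns the rate-distortion cost of coding a square block as a single CU, and the `(x, y, size)` block description is an assumption made for the example.

```python
from typing import Callable, List, Optional, Tuple

MAX_DEPTH = 3  # depths 0..3 correspond to CU sizes 64x64 down to 8x8

def best_rd_cost(x: int, y: int, size: int, depth: int,
                 rd_cost: Callable[[int, int, int], float]
                 ) -> Tuple[float, Optional[List]]:
    """Return (optimal cost, split tree); the tree is None for an undivided CU."""
    c1 = rd_cost(x, y, size)               # cost of coding the CU as a whole
    if depth == MAX_DEPTH:
        return c1, None                    # depth 3 is never divided further
    half = size // 2
    children = [best_rd_cost(x + dx, y + dy, half, depth + 1, rd_cost)
                for dy in (0, half) for dx in (0, half)]
    c2 = sum(cost for cost, _ in children)  # sum of the sub-CUs' optimal costs
    if c1 < c2:
        return c1, None                    # keep the CU undivided
    return c2, [tree for _, tree in children]
```

Every CU at every depth is coded before a decision is made, which is the complexity that the depth-skip and early-termination tests of the method aim to avoid.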
A SHVC spatial scalable video coding method based on distribution characteristics, as shown in fig. 2, the method comprising:
S1: acquiring the depth of the current coding unit (CU); if the depth of the current CU is 1 or 2, judging whether to skip the current depth according to the residual coefficients of the enhancement-layer ILR mode; if so, executing step S5; otherwise, executing step S2;
S2: judging whether the ILR mode of the current CU is the optimal mode by a GMM-EM method; if so, executing step S4; otherwise, executing step S3;
S3: performing intra-frame prediction on the current CU, namely performing candidate directional mode prediction on the current CU according to the directional mode (DM) based method, to obtain the optimal mode of the current CU;
S4: after the optimal mode is obtained, checking the depth of the current CU; if the current depth is 0, executing step S5; if the current depth is 1 or 2, judging whether to terminate the CU division according to the final residual coefficients; if so, outputting the CU division result; otherwise, executing step S5; if the current depth is 3, ending the division directly to obtain the final division result;
S5: dividing the current CU into four sub-CUs, and executing steps S1 to S4 on each of the four sub-CUs.
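The control flow of steps S1 to S5 can be sketched as a recursive function. This is a minimal sketch under stated assumptions: the four decision callbacks in `Decisions`, the `(x, y, size)` CU tuple and the mode labels are hypothetical stand-ins for the encoder's actual tests, which the text describes separately.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

CU = Tuple[int, int, int]  # (x, y, size); a hypothetical CU representation

@dataclass
class Decisions:
    """Hypothetical callbacks standing in for the tests used in S1 to S4."""
    skip_depth: Callable[[CU, int], bool]        # S1: residual-based depth skip
    ilr_is_best: Callable[[CU, int], bool]       # S2: GMM-EM decision
    intra_search: Callable[[CU, int], str]       # S3: DM-based intra prediction
    stop_split: Callable[[CU, str, int], bool]   # S4: residual-based termination

def split(cu: CU):
    """S5: divide a CU into its four sub-CUs."""
    x, y, size = cu
    half = size // 2
    return [(x + dx, y + dy, half) for dy in (0, half) for dx in (0, half)]

def encode_cu(cu: CU, depth: int, d: Decisions):
    # S1: at depth 1 or 2 the current depth may be skipped entirely.
    if depth in (1, 2) and d.skip_depth(cu, depth):
        return [encode_cu(sub, depth + 1, d) for sub in split(cu)]
    # S2 / S3: keep the ILR mode if it is judged optimal, else search intra DMs.
    mode = "ILR" if d.ilr_is_best(cu, depth) else d.intra_search(cu, depth)
    # S4: depth 3 always stops; depth 1 or 2 may stop early; depth 0 always splits.
    if depth == 3:
        return (cu, mode)
    if depth in (1, 2) and d.stop_split(cu, mode, depth):
        return (cu, mode)
    return [encode_cu(sub, depth + 1, d) for sub in split(cu)]
```

A leaf is returned as a `(cu, mode)` pair and a divided CU as a list of four results, so the nesting of the returned structure mirrors the partition.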
The process of judging whether to skip the current depth according to the residual coefficients comprises the following steps: coding the current CU, and obtaining the residual coefficient map of the CU after the ILR mode is finished; dividing the residual coefficient map to obtain a first residual coefficient map and a second residual coefficient map; calculating the expectation and variance of the first residual coefficient map and of the second residual coefficient map respectively; and judging, by a hypothesis testing method, whether the expectation and variance of the two maps differ; if they differ, skipping the current depth; if they do not differ, not skipping the current depth.
Optionally, the manner of dividing the residual coefficient map includes dividing the residual coefficient map left and right to obtain a left half residual coefficient map and a right half residual coefficient map; and respectively calculating expectation and variance of the left half residual coefficient graph and the right half residual coefficient graph, judging whether the two parts have significant difference according to a hypothesis testing method, and skipping the current coding depth if the two parts have significant difference.
Optionally, the dividing the residual coefficient map includes dividing the residual coefficient map up and down to obtain an upper half residual coefficient map and a lower half residual coefficient map; and respectively calculating expectation and variance of the residual coefficient map of the upper half part and the residual coefficient map of the lower half part, judging whether the two parts have significant difference according to a hypothesis testing method, and skipping the current coding depth if the two parts have significant difference.
Preferably, the manner of dividing the residual coefficient map includes dividing the residual coefficient map left and right or up and down to obtain a left half residual coefficient map and a right half residual coefficient map or an upper half residual coefficient map and a lower half residual coefficient map; and respectively calculating expectation and variance of the left half residual coefficient map and the right half residual coefficient map or the upper half residual coefficient map and the lower half residual coefficient map, judging whether the two parts have significant difference according to a hypothesis testing method, and skipping the current coding depth if the two parts have significant difference.
In SHVC, each CTU includes four depths corresponding to Coding Units (CUs) of size 64 × 64 to 8 × 8, and each CU needs to check an ILR mode and an intra mode and then select a mode with a low Rate Distortion (RD) cost as a best mode. The corresponding mode distribution between inter-layer (ILR) mode and Intra (Intra) mode is shown in table 1:
Sequence      ILR       Intra
Blue-sky      98.50%    1.50%
Ducks         99.76%    0.24%
Park_Joy      99.07%    0.93%
Pedestrian    90.65%    9.35%
Tractor       97.68%    2.32%
Town          96.84%    3.16%
Station2      95.27%    4.73%
Average       96.82%    3.18%
TABLE 1 ILR mode and Intra mode distributions
As can be seen from Table 1, the ILR mode accounts for 96.82% of CUs on average; that is, most CUs select the ILR mode as the best mode. This is because the content in the BL and EL is the same, the QPs in the BL and EL are similar or even identical, and the inter-layer correlation is therefore strong. The prediction of the ILR mode can be obtained directly by up-sampling the co-located CU in the BL, and the process is simple. Therefore, in the invention, the ILR mode is coded first to obtain its residual coefficients, and whether the current CU needs to be further divided is then judged according to these residual coefficients; if so, the current depth can be skipped directly; otherwise, the ILR mode and the Intra mode need to be further encoded.
First, the residual coefficient map of the CU is divided into upper and lower parts, as shown in FIG. 3(a), and into left and right parts, as shown in FIG. 3(b). If there is a significant difference between the two parts of either partition, the current CU needs to be further divided. If the CU can be predicted accurately by the best mode, the corresponding residual coefficients follow a Gaussian distribution. Under this assumption, the residual coefficients of one part in each partition are modeled as:
$$X \sim N(\mu_1, \sigma_1^2)$$
where $X$ represents the residual coefficient set of one part (here, the upper half or the left half), and $\mu_1$ and $\sigma_1^2$ are the expectation and variance of that part, respectively.
If $x_1, x_2, \ldots, x_n$ are the samples in the residual coefficient set $X$, maximum likelihood estimation (MLE) is used with the probability density function of the samples in the selected residual coefficient set, whose expression is:
$$f(x;\mu_1,\sigma_1^2)=\frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right)$$
where $x$ denotes a sample in the residual coefficient set $X$, and $\mu_1$ and $\sigma_1^2$ are the expectation and variance of the residual coefficient set, respectively.
The likelihood function corresponding to the probability density function is:
$$L(\mu_1,\sigma_1^2)=\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left(-\frac{(x_i-\mu_1)^2}{2\sigma_1^2}\right)$$
$$L(\mu_1,\sigma_1^2)=\left(2\pi\sigma_1^2\right)^{-n/2}\exp\left(-\frac{1}{2\sigma_1^2}\sum_{i=1}^{n}(x_i-\mu_1)^2\right)$$
where $L$ denotes the likelihood function and $n$ denotes the number of residual coefficients in each part. To obtain $\mu_1$ and $\sigma_1^2$, the following calculation can be performed:
$$\frac{\partial \ln L}{\partial \mu_1}=\frac{1}{\sigma_1^2}\sum_{i=1}^{n}(x_i-\mu_1)=0,\qquad \frac{\partial \ln L}{\partial \sigma_1^2}=-\frac{n}{2\sigma_1^2}+\frac{1}{2\sigma_1^4}\sum_{i=1}^{n}(x_i-\mu_1)^2=0$$
According to the calculation result of the above expression, it can be obtained:
$$\mu_1=\bar{x}$$
$$\sigma_1^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$$
where $\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$ is the average of the samples.
If $Y$ is the residual coefficient set of the other part, $y_1, y_2, \ldots, y_n$ are the samples in that set, and $\bar{y}$ is the average value of those samples, the condition for judging whether the two parts have a significant difference is:
$$\left|\frac{\bar{y}-\mu_1}{\sigma_1/\sqrt{n}}\right| \ge s_\alpha$$
where $n$ is the number of residual coefficients in each part and $\alpha$ is the significance level; for any $\alpha$, the corresponding threshold $s_\alpha$ can be obtained from the Gaussian distribution table. If the above condition is satisfied, the two parts differ significantly.
Since the probability of being skipped may differ between depths, an optimal threshold needs to be selected for each depth. For depth 2, commonly used values are tested first, and the coding efficiency corresponding to these values is high. To improve the coding efficiency further, the maximum value in the Gaussian distribution table, 3.49, is selected and multiples of it are tested; the corresponding coding efficiency is shown in Table 2.
Figure GDA0003978385000000093
TABLE 2 coding efficiency under different test values
In table 2, the coding efficiency is represented by BDBR, which measures the bit rate difference for the same peak signal-to-noise ratio (PSNR) in EL. A positive or negative BDBR reflects a loss or increase in coding efficiency, respectively. As can be seen from Table 2, when the test value is greater than or equal to 20.94, the BDBRs of all the sequences are less than 0.1%. Therefore, test value 20.94 is selected as the threshold for depth 2. Likewise, the threshold for depth 1 is 31.41. If depth 0 is skipped, the corresponding coding efficiency is significantly reduced in some sequences; depth 0 is not skipped. The depth skip condition is:
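$$\left|\frac{\bar{y}-\mu_1}{\sigma_1/\sqrt{n}}\right| \ge \begin{cases} 31.41, & \text{depth}=1 \\ 20.94, & \text{depth}=2 \end{cases}$$
where depth represents the depth of the current CU.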
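The depth-skip test can be illustrated with a short sketch. It assumes that the test statistic is the one-sample z statistic, |mean(Y) - mu1| / (sigma1 / sqrt(n)), with mu1 and sigma1 taken from the maximum likelihood estimates of the first part; this form is inferred from the symbols the text defines rather than quoted from it. The thresholds 31.41 (depth 1) and 20.94 (depth 2) are the values given in the text, which equal 9 and 6 times 3.49. The function names and the list-of-rows residual map are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Thresholds from the text; depth 0 has no entry because it is never skipped.
SKIP_THRESHOLD = {1: 31.41, 2: 20.94}

def mle_mean_var(samples: Sequence[float]) -> Tuple[float, float]:
    """Maximum likelihood estimates of a Gaussian's mean and variance (1/n form)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var

def z_statistic(part_x: Sequence[float], part_y: Sequence[float]) -> float:
    """Assumed statistic: |mean(Y) - mu1| / (sigma1 / sqrt(n))."""
    mu1, var1 = mle_mean_var(part_x)
    y_bar = sum(part_y) / len(part_y)
    if var1 == 0.0:
        return 0.0 if y_bar == mu1 else math.inf
    return abs(y_bar - mu1) / math.sqrt(var1 / len(part_x))

def halves(residual: List[List[float]]):
    """Return the (upper, lower) and (left, right) halves as flat sample lists."""
    rows, cols = len(residual), len(residual[0])
    upper = [v for r in residual[:rows // 2] for v in r]
    lower = [v for r in residual[rows // 2:] for v in r]
    left = [v for r in residual for v in r[:cols // 2]]
    right = [v for r in residual for v in r[cols // 2:]]
    return (upper, lower), (left, right)

def skip_depth(residual: List[List[float]], depth: int) -> bool:
    """Skip the depth if either partition shows a significant difference."""
    threshold = SKIP_THRESHOLD.get(depth)
    if threshold is None:
        return False
    return any(z_statistic(a, b) >= threshold for a, b in halves(residual))
```

A residual map whose upper half is small zero-mean noise and whose lower half is a large constant is flagged for skipping, whereas a uniform checkerboard of plus and minus one is not.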
The process of judging whether the ILR mode of the current CU is the optimal mode by adopting the GMM-EM method comprises the following steps: saving coding modes and rate distortion costs of each depth CU of a previous frame and a current frame; obtaining the coding mode and the rate-distortion cost of adjacent CUs of the previous frame and the current frame of each CU according to the CU coding mode and the rate-distortion cost of the previous frame and the current frame, storing the rate-distortion cost belonging to the ILR mode by adopting rec0, storing the rate-distortion cost of the Intra mode by adopting rec1, and obtaining the rate-distortion cost of the ILR mode adopted by the current CU according to the rate-distortion costs stored by rec0 and rec 1; performing GMM conversion on the rate distortion cost of the ILR mode of the current CU to obtain the probability based on rate distortion; obtaining the probability of the current CU based on the quantity according to the coding mode of the adjacent CU; predicting the probability that the current CU adopts the ILR mode according to the probability based on rate distortion and the probability based on quantity; and judging whether the ILR mode of the current CU is the optimal mode or not according to the probability result. Specifically, the process of judging whether the ILR mode of the current CU is the optimal mode by using the GMM-EM method includes:
(1) First, a variable currCosts0 of type vector<pair<int, double>> is used to save the final coding mode and rate-distortion cost of each depth-0 CU of the current frame, and preCosts0, of the same type as currCosts0, is used to save the final coding mode and rate-distortion cost of each depth-0 CU of the previous frame. After each frame is coded, currCosts0 is copied to preCosts0 and currCosts0 is re-initialized. The same applies to depths 1, 2 and 3, so that the coding mode and rate-distortion cost of every coded CU at each depth in the previous frame and the current frame are obtained.
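The per-depth bookkeeping in (1) can be sketched as follows. The class name `CostHistory` and its methods are hypothetical; only the idea of one current-frame list and one previous-frame list of (mode, cost) pairs per depth, rolled over after each frame, comes from the text.

```python
from typing import List, Tuple

ModeCost = Tuple[int, float]  # (coding mode, RD cost), mirroring pair<int, double>
NUM_DEPTHS = 4                # depths 0, 1, 2 and 3

class CostHistory:
    """Hypothetical container for per-depth mode and RD cost records."""

    def __init__(self) -> None:
        self.curr: List[List[ModeCost]] = [[] for _ in range(NUM_DEPTHS)]
        self.prev: List[List[ModeCost]] = [[] for _ in range(NUM_DEPTHS)]

    def record(self, depth: int, mode: int, rd_cost: float) -> None:
        """Save the final mode and RD cost of one CU of the current frame."""
        self.curr[depth].append((mode, rd_cost))

    def end_frame(self) -> None:
        """After a frame is coded: current records become previous, then reset."""
        self.prev = self.curr
        self.curr = [[] for _ in range(NUM_DEPTHS)]
```

Keeping the four depths in one indexed list replaces four separately named pairs of variables; that is a design choice of this sketch, not something the text prescribes.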
(2) From (1), the neighboring CUs of each CU in the previous frame and the current frame are obtained, together with their coding modes and rate-distortion costs. rec0 stores the rate-distortion costs belonging to the ILR mode, and rec1 stores the rate-distortion costs belonging to the Intra mode. Because the current CU has completed coding in the ILR mode at this point, the rate-distortion cost of the ILR mode for the current CU is also obtained.
The structure of the current CU and its neighboring CUs is shown in FIG. 4, where $U_0$ is the current CU, $U_1$, $U_2$, $U_3$ and $U_4$ are the neighboring CUs of the current CU, and $U_5$, $U_6$, $U_7$, $U_8$ and $U_9$ are the CUs in the previous frame co-located with $U_0$, $U_1$, $U_2$, $U_3$ and $U_4$.
(3) GMM conversion is performed on the information obtained in (2) to obtain the probability based on rate distortion. The specific process is as follows:
For any $U_i$, the rate-distortion cost is denoted $rd_i$, and the corresponding Gaussian mixture model is:
$$p(rd_i)=\pi_1 N(rd_i \mid \mu_1,\Sigma_1)+\pi_2 N(rd_i \mid \mu_2,\Sigma_2)$$
where $\pi_1$ is the probability of adopting the ILR mode, and $\mu_1$ and $\Sigma_1$ are the expectation and variance of its rate-distortion cost, respectively; $\pi_2$ is the probability of adopting the Intra mode, and $\mu_2$ and $\Sigma_2$ are the expectation and variance of its rate-distortion cost, respectively.
To obtain pi 1 、μ 1 、∑ 1 、π 2 、μ 2 And sigma 2 And adopting maximum likelihood estimation for six parameter values, wherein the expression is as follows:
Figure GDA0003978385000000111
where $M$ is the total number of CUs considered, i.e., the current CU together with its neighboring CUs, $p$ is the probability, $rd_i$ is the rate-distortion cost, and $N$ denotes the Gaussian distribution.
Preferably, the value of $M$ is set to 10.
The logarithm of the likelihood function is:
Figure GDA0003978385000000112
From the maximum likelihood expression and the logarithm of the likelihood function, the following is obtained:
Figure GDA0003978385000000113
From the above formula, one obtains:
Figure GDA0003978385000000114
where $\gamma(i, k)$ is the probability that the $i$-th sample is generated by the $k$-th component, and $T$ denotes the transpose.
Figure GDA0003978385000000115
Figure GDA0003978385000000116
That is, the sum of the probabilities that all samples are generated by the $k$-th component. The probability of using the ILR mode is then given by:
Figure GDA0003978385000000121
where $N$ is the total of the probability sums of the two modes, and $\gamma(i, k)$ is obtained from the following equation:
Figure GDA0003978385000000122
The iteration is repeated to update $\mu_k$, $\Sigma_k$, $\pi_k$ and $\gamma(i, k)$ until $\gamma(i, k)$ converges.
Since the current CU is $U_0$, determining whether the ILR mode is the best mode only requires checking whether $\gamma(0, k)$ converges. Let $\gamma_i(0, k)$ denote the value of $\gamma(0, k)$ at the $i$-th iteration. To avoid unnecessary iterations, the iteration can be terminated once the absolute difference between $\gamma_{i-1}(0, k)$ and $\gamma_i(0, k)$ is sufficiently small. With 0.01 selected as the threshold:
$$|\gamma_i(0, k) - \gamma_{i-1}(0, k)| \le 0.01$$
If this condition is met, the iteration is terminated directly. This process yields the probability that the current CU selects the ILR mode, which is defined as the rate-distortion-based probability.
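The iteration can be illustrated with a minimal sketch of EM for a one-dimensional, two-component Gaussian mixture, stopping when the responsibility of the current CU changes by at most 0.01. The update rules used here are the standard textbook EM updates and are assumed to correspond to the formulas in the figures; the function names, the initial parameter values and the variance floor min_var are hypothetical.

```python
import math


def gaussian(x, mu, var):
    """Density of a one-dimensional Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_ilr_probability(rd, mu, var, pi, tol=0.01, max_iter=100, min_var=1e-6):
    """Return gamma(0, 0): the responsibility of component 0 (ILR) for rd[0].

    rd      : RD costs; rd[0] belongs to the current CU U_0.
    mu, var : initial means and variances of the two components
              (for example estimated from rec0 and rec1; an assumption).
    pi      : initial mixing weights of the two components.
    """
    mu, var, pi = list(mu), list(var), list(pi)
    prev = None
    gamma = []
    for _ in range(max_iter):
        # E-step: responsibility of each component for each sample.
        gamma = []
        for x in rd:
            w = [pi[k] * gaussian(x, mu[k], var[k]) for k in range(2)]
            total = sum(w) or 1e-300  # guard against numerical underflow
            gamma.append([wk / total for wk in w])
        # Stop once gamma(0, k) changes by no more than tol for both components.
        if prev is not None and all(abs(gamma[0][k] - prev[k]) <= tol for k in range(2)):
            break
        prev = gamma[0]
        # M-step: re-estimate mean, variance and weight of each component.
        for k in range(2):
            nk = sum(g[k] for g in gamma)
            if nk <= 0.0:
                continue
            mu[k] = sum(g[k] * x for g, x in zip(gamma, rd)) / nk
            spread = sum(g[k] * (x - mu[k]) ** 2 for g, x in zip(gamma, rd)) / nk
            var[k] = max(spread, min_var)
            pi[k] = nk / len(rd)
    return gamma[0][0]
```

For instance, with ten costs forming two well-separated clusters around 100 and 300, a current CU whose cost lies in the first cluster receives a value close to 1, and one in the second cluster a value close to 0.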
(4) The quantity-based probability of the current CU is obtained from the coding modes of its neighboring CUs. The specific process is as follows:
Since neighboring CUs are usually similar, they are used for prediction. The more neighboring CUs use the ILR mode, the higher the probability that the current CU uses it, and vice versa. The probability that the current CU selects the ILR mode is proportional to the number of neighboring CUs using the ILR mode. As shown in FIG. 4, the current CU has nine neighboring CUs. Thus, the probability that the current CU selects the ILR mode may be written as
Figure GDA0003978385000000123
where $k$ is the number of neighboring CUs using the ILR mode. Since this probability is obtained from the number of neighboring CUs using the ILR mode, it is defined as the quantity-based probability.
(5) Since both the rate-distortion-based probability and the quantity-based probability are strongly related to the selection of the ILR mode, the probability of using the ILR mode is predicted from both. Let A and B denote the events associated with the rate-distortion-based probability and the quantity-based probability, respectively. Because the two are independent of each other, the probability of early depth termination is:
$$p_r = p(A + B) = p(A) + p(B) - p(A)p(B)$$
where $p_r$ is the probability of early depth termination; if $p_r$ is greater than or equal to 0.6, the current CU terminates early. $p(A + B)$ is the probability that at least one of A and B holds, $p(A)$ is the rate-distortion-based probability, and $p(B)$ is the quantity-based probability. Thresholds of 0.6, 0.7, 0.8 and 0.9 were used during testing; the corresponding BDBRs are shown in Table 3.
Sequence      p_r = 0.6   p_r = 0.7   p_r = 0.8   p_r = 0.9
Blue-sky      -0.3%       -0.3%       -0.3%       -0.3%
Ducks         0.0%        0.0%        0.0%        0.0%
Park_Joy      0.0%        0.0%        0.0%        0.0%
Pedestrian    -0.1%       -0.1%       -0.1%       -0.1%
Tractor       -0.1%       0.0%        0.0%        0.0%
town          -0.1%       -0.1%       -0.1%       -0.1%
station2      0.0%        -0.1%       -0.1%       -0.2%
TABLE 3 Probability of early depth termination p_r and corresponding BDBR
As can be seen from Table 3, as $p_r$ increases, the BDBR remains essentially unchanged except for the sequence "station2", whose BDBR reaches its minimum when $p_r$ equals 0.9. Therefore, 0.9 is selected as the optimal value; that is, if $p_r$ is greater than 0.9, the current depth terminates early.
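Steps (4) and (5) can be sketched as below. The quantity-based probability is assumed here to be k divided by the nine neighboring CUs, which is one reading of "proportional to the number of neighboring CUs"; the exact expression is given in the figure. The function names are hypothetical, and the threshold is left as a parameter with the selected value 0.9 as its default.

```python
def quantity_probability(k_ilr, n_neighbors=9):
    """Quantity-based probability: share of neighboring CUs using ILR (assumed k / 9)."""
    if not 0 <= k_ilr <= n_neighbors:
        raise ValueError("k_ilr must lie between 0 and n_neighbors")
    return k_ilr / n_neighbors


def combined_probability(p_a, p_b):
    """p_r = p(A) + p(B) - p(A)p(B): at least one of two independent events."""
    return p_a + p_b - p_a * p_b


def terminate_depth_early(p_a, p_b, threshold=0.9):
    """True if the combined probability exceeds the chosen threshold."""
    return combined_probability(p_a, p_b) > threshold
```

For example, a rate-distortion-based probability of 0.8 together with six of nine neighbors in the ILR mode gives a combined probability of about 0.93, which exceeds 0.9, whereas two probabilities of 0.5 give 0.75 and do not.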
The directional modes (DMs) in SHVC are shown in FIG. 5. There are 2 non-directional modes, namely DC (DM0) and planar (DM1), and 33 directional modes (DM2 ... DM34). In general, DM0 and DM1 are well suited to simple CUs. Similar to HEVC, SHVC first checks all 35 DMs in rough mode decision (RMD) to obtain their Hadamard costs (HC) and selects the N DMs with the smallest HC; it then checks these DMs in the rate-distortion optimization (RDO) process and selects the DM with the smallest RDO value as the best DM. This process yields the optimal coding efficiency. However, checking many unnecessary DMs wastes encoding time; in particular, the RMD process always checks all 35 DMs, which is very time-consuming. Large CUs in the enhancement layer (EL) are usually very simple if they use the Intra mode; small CUs in the EL are usually also simple, because their texture does not change much within their small size. A simple CU may therefore have special DM characteristics, and studying these characteristics in the EL helps to increase the coding speed. The probability of being selected as the best DM may differ from one DM to another, so these probabilities are obtained first, and the distribution of DMs is then studied by grouping DMs with similar probabilities.
A CU has 35 directional modes in total, namely DM0 through DM34. The probability that a CU selects each directional mode is calculated, and all directional modes are classified according to these probabilities into three classes.
The formula for calculating the probability is:
Figure GDA0003978385000000141
where $n_i$ is the number of CUs that select $DM_i$ as the best DM, and $m$ is the number of all CUs.
The directional modes are classified into class 0, class 1 and class 2. Class 0 includes DM0 and DM1; class 1 includes DM8 through DM12 and DM24 through DM28; class 2 includes DM2 through DM7, DM13 through DM23 and DM29 through DM34. The class 1 directional modes are divided into two groups: DM8, DM9, DM10, DM11 and DM12 form the horizontal directional mode group, and DM24, DM25, DM26, DM27 and DM28 form the vertical directional mode group. The class 2 directional modes are divided into four groups: DM2, DM3, DM4, DM5, DM6 and DM7 form the first directional mode group; DM13, DM14, DM15, DM16, DM17 and DM18 form the second directional mode group; DM19, DM20, DM21, DM22 and DM23 form the third directional mode group; and DM29, DM30, DM31, DM32, DM33 and DM34 form the fourth directional mode group.
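The grouping above can be written down as a small lookup table. In the sketch below the group labels, the function names and the example counts are hypothetical; the selection probability is assumed to be the ratio $n_i / m$ implied by the definitions of $n_i$ and $m$.

```python
# Class and group membership of the 35 DMs, as listed in the text.
DM_GROUPS = {
    "class0": [0, 1],
    "class1_horizontal": list(range(8, 13)),   # DM8..DM12
    "class1_vertical": list(range(24, 29)),    # DM24..DM28
    "class2_group1": list(range(2, 8)),        # DM2..DM7
    "class2_group2": list(range(13, 19)),      # DM13..DM18
    "class2_group3": list(range(19, 24)),      # DM19..DM23
    "class2_group4": list(range(29, 35)),      # DM29..DM34
}


def group_of(dm):
    """Return the label of the group that contains directional mode dm."""
    for label, members in DM_GROUPS.items():
        if dm in members:
            return label
    raise ValueError("directional mode must be in 0..34")


def selection_probabilities(best_dm_counts):
    """Map each DM to n_i / m, where m is the total number of CUs counted."""
    m = sum(best_dm_counts.values())
    return {dm: n / m for dm, n in best_dm_counts.items()}
```

With hypothetical counts such as {0: 50, 1: 30, 10: 20}, the probabilities come out as 0.5, 0.3 and 0.2; real counts would come from the training sequences.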
In the process of intra prediction of the mode of the current CU, significant-difference prediction is performed for the class 0 directional modes. The specific process is as follows. Denote the HC of DMi as HCi, and let min() denote the smaller of two HC values. DM0 and DM1 in class 0 are non-directional modes; in class 1, DM10 is the horizontal direction and DM26 is the vertical direction. If HC0 and HC1 are significantly smaller than HC10 and HC26, the optimal DM occurs in class 0. Therefore, DM0 and DM1 in class 0 and DM10 and DM26 in class 1 are checked, and whether the optimal DM is in class 0 is determined from the difference between min(HC0, HC1) and min(HC10, HC26). During directional mode prediction, DM selection may be terminated early at any point.
As shown in FIG. 6, a descending-direction search with significant-difference prediction is performed for the class 1 directional modes. Since DM10 is the horizontal DM and DM26 is the vertical DM, if HC10 is significantly smaller than HC26, the optimal DM is likely to be in the horizontal directional mode group; conversely, it is likely to be in the vertical directional mode group. First, DM10 and DM26 in class 1 are checked, and the likely directional mode group is predicted from the difference between HC10 and HC26. Once the likely group is known, the two immediate neighbors of DM10 or DM26 are checked, and the best DM is then searched for within the group according to the Hadamard costs. Since a DM in class 0 or class 1 has a high probability of being the best DM, if a DM and its two immediate neighbors have been checked and its HC is the smallest of all checked DMs, this DM is a likely best DM (LBD). To obtain the LBD as soon as possible, the search proceeds in the direction of decreasing Hadamard cost. For example, if the horizontal group is the likely group, the two immediate neighbors of DM10, i.e., DM9 and DM11, are further checked. Depending on the combination of HC9, HC10 and HC11, there are three cases, as shown in FIG. 6, and the LBD is searched for along the arrows in each case.
Preferably, the three cases are: (1) HC10 is the smallest of all the Hadamard costs, so DM10 is the LBD and DM selection is terminated early. (2) The Hadamard cost decreases toward both the left and the right, so DM8 and DM12 are further checked along the decreasing directions to determine whether DM9 or DM11 is the LBD; if not, DM7 and DM13 in class 2 need to be further checked to determine whether DM8 or DM12 is the LBD. (3) The Hadamard costs of the three directional modes are monotonic, so the LBD is searched for along the decreasing direction: if HC9 > HC10 > HC11, DM12 is further checked to determine whether DM11 is the LBD, and if not, DM13 in class 2 is further checked to determine whether DM12 is the LBD; if HC11 > HC10 > HC9, DM8 is further checked to determine whether DM9 is the LBD, and if not, DM7 in class 2 is further checked to determine whether DM8 is the LBD.
To determine whether DM8 or DM12 in class 1 is the LBD, DM7 or DM13 in class 2 needs to be checked. When DM7 or DM13 is checked during the descending-direction search with significant-difference prediction, any LBD that is obtained is regarded as the best DM, and DM selection can be terminated.
A variable-step binary search is performed on the class 2 directional modes. Specifically, the best DM in class 2 is searched for using DM7 and DM13, or DM23 and DM29, as the starting DMs. For example, with DM7 in class 2 (already checked) as the starting DM, DM6 is checked using a step of 1 (the distance between 7 and 6 is 1); starting from DM6, DM2 is checked using a step of 4 (the distance between 6 and 2 is 4). If a DM in class 2 has the smallest HC among all checked DMs, a binary search is used to find the best DM. More specifically, let mDM be the DM with the smallest HC among all checked DMs: the midpoint between mDM and its nearest checked neighbor on the left, lDM, is checked, then the midpoint between mDM and its nearest checked neighbor on the right, rDM, is checked, and finally the DM with the smallest HC among all checked DMs is selected. This process is repeated until the selected DM is an LBD. For example, if DM4 has the smallest HC among all checked DMs and its nearest checked neighbors on the left and right are DM2 and DM6, then DM3 (the midpoint between DM2 and DM4) and DM5 (the midpoint between DM4 and DM6) are further checked, and the DM with the smallest HC among all checked DMs is selected; if that DM is an LBD, DM selection may be terminated early, otherwise the process is repeated. An example of the variable-step check and binary search is shown in FIG. 7.
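The binary refinement described above can be sketched as follows. This is an illustrative reading of the text, not the encoder's implementation: the function name refine_by_bisection and the callable hc (which returns the Hadamard cost of a DM index) are hypothetical, and the sketch simply stops when no unchecked midpoint remains next to the current minimum.

```python
def refine_by_bisection(hc, checked):
    """Refine the best DM by repeatedly checking midpoints around the minimum.

    hc      : callable mapping a DM index to its Hadamard cost (hypothetical).
    checked : DM indices that have already been checked.
    """
    costs = {dm: hc(dm) for dm in checked}
    while True:
        best = min(costs, key=costs.get)  # mDM: smallest HC so far
        # Nearest checked neighbors on the left (lDM) and right (rDM).
        left = max((d for d in costs if d < best), default=None)
        right = min((d for d in costs if d > best), default=None)
        midpoints = []
        if left is not None and best - left > 1:
            midpoints.append((left + best) // 2)
        if right is not None and right - best > 1:
            midpoints.append((best + right) // 2)
        if not midpoints:
            # Both existing neighbors are adjacent and not cheaper: best is the LBD.
            return best
        for dm in midpoints:
            costs[dm] = hc(dm)
```

With DM2, DM4 and DM6 already checked and DM4 the cheapest, the first pass checks DM3 and DM5, matching the example in the text. Each pass checks at least one new DM, so the loop always terminates.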
Specifically, the process of intra prediction of the mode of the current CU includes:
Step 1: Perform the Hadamard transform for DM0, DM1, DM10 and DM26. Let HC1 be the smaller Hadamard cost of DM0 and DM1, and HC2 the smaller Hadamard cost of DM10 and DM26. Compare HC1 and HC2: if HC1 is smaller than HC2, the optimal DM is DM0 or DM1 and step 10 is executed; otherwise step 2 is executed.
Step 2: Compare the Hadamard costs of DM10 and DM26: if the Hadamard cost of DM10 is less than that of DM26, execute step 3; if it is greater, execute step 5; otherwise execute step 7.
Step 3: Check DM8, DM9, DM11 and DM12, and determine whether an LBD exists among DM9, DM11 and DM12; if so, that mode is the optimal directional mode and step 10 is executed; otherwise step 4 is executed.
Step 4: Check DM2, DM6, DM14 and DM18. If the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14 and DM18, execute step 10 directly; otherwise obtain the optimal DM by binary search and execute step 9.
Step 5: Check DM24, DM25, DM27 and DM28. If an LBD exists among DM24, DM25 and DM27, that DM is the optimal DM and step 10 is executed; otherwise step 6 is executed.
Step 6: Check DM18, DM22, DM30 and DM34. If the DM with the minimum Hadamard cost is not among DM18, DM22, DM8, DM24, DM28, DM30 and DM34, execute step 10 directly; otherwise obtain the optimal DM by binary search and execute step 9.
Step 7: Check the class 2 DMs other than DM10 and DM26. If an LBD exists among DM9, DM10, DM11, DM25, DM26 and DM27, that DM is the optimal DM and step 10 is executed; otherwise step 8 is executed.
Step 8: Check DM2, DM6, DM14, DM18, DM22, DM30 and DM34. If the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14, DM18, DM22, DM24, DM28, DM30 and DM34, execute step 10 directly; otherwise obtain the optimal DM by binary search and execute step 9.
Step 9: Check the midpoint between the DM with the minimum Hadamard cost and its nearest checked neighbor on the left (right), and select the DM with the minimum Hadamard cost; repeat this process until the selected DM is an LBD, which is the optimal DM, then execute step 10.
Step 10: DM selection terminates.
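The branching in steps 1 and 2 can be sketched as a small dispatch function. The function name next_step and the optional less predicate are hypothetical; by default a plain "<" comparison is used, as the steps are worded, and a significance test could be supplied through less instead.

```python
def next_step(hc_dm0, hc_dm1, hc_dm10, hc_dm26, less=lambda a, b: a < b):
    """Return the number of the step to execute after steps 1 and 2.

    less(a, b) decides whether cost a counts as smaller than cost b.
    """
    hc1 = min(hc_dm0, hc_dm1)      # smaller cost of the non-directional modes
    hc2 = min(hc_dm10, hc_dm26)    # smaller cost of horizontal and vertical
    if less(hc1, hc2):
        return 10                  # optimal DM is DM0 or DM1; selection ends
    if less(hc_dm10, hc_dm26):
        return 3                   # search the horizontal group
    if less(hc_dm26, hc_dm10):
        return 5                   # search the vertical group
    return 7                       # no preference: check both groups
```

Passing a predicate with a margin, for example less=lambda a, b: a < b - 5, sends nearly equal horizontal and vertical costs to step 7 rather than forcing a choice between the two groups.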
To determine whether there is a significant difference between the HCs of two DMs, the corresponding residual coefficients are examined. Let $R_1$ and $R_2$ be the residuals of the two DMs; their difference is:
$$R = R_1 - R_2$$
Applying the Hadamard transform, the above equation can be rewritten as:
$$HRH = HR_1H - HR_2H$$
where $H$ denotes the Hadamard matrix.
According to the Cauchy inequality, the expression for $HRH$ can be rewritten as:
Figure GDA0003978385000000171
where $m$ is the side length of the current CU; then:
Figure GDA0003978385000000181
where $x_{i,j}$ is the value of $HRH$ at position $(i, j)$, calculated as:
Figure GDA0003978385000000182
where $h_{ik}$ is the Hadamard value at position $(i, k)$, $h_{pj}$ is the Hadamard value at position $(p, j)$, and $r_{kp}$ is the value of $R$ at position $(k, p)$.
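As an illustration of these definitions, the sketch below computes $x_{i,j}$ as the double sum of $h_{ik} r_{kp} h_{pj}$ over $k$ and $p$, which is assumed to be the expression in the figure because it is the entry-wise form of the matrix product $HRH$. The unnormalized Sylvester construction of $H$ and the function names are assumptions made for the example.

```python
def hadamard(m):
    """Unnormalized Sylvester Hadamard matrix of size m (m a power of two)."""
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of two")
    h = [[1]]
    while len(h) < m:
        # Double the size: [[H, H], [H, -H]].
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def hrh(r):
    """Return the matrix HRH, entry x[i][j] = sum_k sum_p h_ik * r_kp * h_pj."""
    m = len(r)
    h = hadamard(m)
    return [
        [
            sum(h[i][k] * r[k][p] * h[p][j] for k in range(m) for p in range(m))
            for j in range(m)
        ]
        for i in range(m)
    ]
```

Because the transform is linear, applying hrh to the difference of two residual blocks gives the same result as subtracting the two transformed blocks, which is the identity $HRH = HR_1H - HR_2H$ used in the text.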
If every quantized value in $HRH$ is less than $k$, there is no significant difference between $R_1$ and $R_2$. The following condition should be satisfied:
Figure GDA0003978385000000183
where $k$ is a parameter and $Q_{step}$ is the quantization step size, which can be obtained from the quantization parameter.
From the expression for $R$, the expression for $HRH$, the Cauchy inequality and the above condition, the following is obtained:
Figure GDA0003978385000000184
The condition under which $HC_1$ and $HC_2$ have no significant difference is:
Figure GDA0003978385000000185
The condition under which $HC_1$ and $HC_2$ have a significant difference is:
Figure GDA0003978385000000186
where $HC_1$ and $HC_2$ are the Hadamard costs of the two DMs. If the first condition holds, the two are not significantly different; otherwise they are significantly different. In the latter case, if $HC_1 < HC_2$, then $HC_1$ is significantly less than $HC_2$, and vice versa.
To obtain the most suitable k-value, the above conditions were tested and the corresponding BDBR was obtained with the results shown in Table 4.
Figure GDA0003978385000000187
Figure GDA0003978385000000191
TABLE 4 different k values and corresponding BDBRs
As can be seen from Table 4, there is a turning point at $k = 5$: when $k$ is greater than or equal to 5, the corresponding BDBR of every sequence is less than 0.1%. This means that good performance is obtained when $k$ is 5. If $k$ is increased further, the additional gain in coding speed becomes smaller. Therefore, $k$ is set to 5.
The specific content of step S4 is as follows. After the current CU is coded, the final residual coefficient map of the current CU is obtained. The expectation and variance of the left half and of the right half of the residual coefficient map are obtained, and a hypothesis test determines whether the two halves differ significantly. Likewise, the expectation and variance of the upper half and of the lower half of the residual coefficient map are obtained, and a hypothesis test determines whether those two halves differ significantly.
$\mu_1$ and $\sigma_1^2$ are the expectation and variance of one of the parts (here, the upper or left half), $Z$ denotes the residual coefficients of the other part, and $z_1, z_2, \ldots, z_n$ are its samples. Whether $Z$ also satisfies $\mu_2$ and $\sigma_2^2$ can be tested by the following formula:
Figure GDA0003978385000000192
where $\alpha$ is the significance level and $n$ is the number of residual coefficients in each part. The corresponding threshold can be obtained from the Gaussian distribution table. If the above condition is satisfied, the residual coefficients of the two parts share the same expectation and variance; thus there is no significant difference between the two parts, and the current CU can terminate early.
At depth 1 and depth 2, the values of $e_\alpha$ are as follows:
Figure GDA0003978385000000193
If neither the left and right parts nor the upper and lower parts differ significantly, the division is terminated early.
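The half-versus-half check can be sketched as below, assuming the test statistic has the usual one-sample form $|\bar{z} - \mu_1| / (\sigma_1 / \sqrt{n})$ compared against a threshold; the exact formula and the thresholds used at depths 1 and 2 are given in the figures. The function names and the default threshold 1.96 (the two-sided 5% point of the standard normal distribution) are assumptions for illustration only.

```python
import math


def mean_and_variance(values):
    """Maximum likelihood estimates of the expectation and variance."""
    n = len(values)
    mu = sum(values) / n
    return mu, sum((v - mu) ** 2 for v in values) / n


def halves_differ(reference, other, threshold):
    """Test whether `other` is inconsistent with the statistics of `reference`."""
    mu1, var1 = mean_and_variance(reference)
    sample_mean = sum(other) / len(other)
    if var1 == 0.0:
        return sample_mean != mu1
    z = abs(sample_mean - mu1) / math.sqrt(var1 / len(other))
    return z > threshold


def can_stop_splitting(block, threshold=1.96):
    """True if neither left/right nor upper/lower halves differ significantly."""
    rows, cols = len(block), len(block[0])
    left = [v for row in block for v in row[: cols // 2]]
    right = [v for row in block for v in row[cols // 2:]]
    upper = [v for row in block[: rows // 2] for v in row]
    lower = [v for row in block[rows // 2:] for v in row]
    return not halves_differ(left, right, threshold) and not halves_differ(
        upper, lower, threshold
    )
```

A residual block whose coefficients look the same in every half passes the check, while a block whose right half carries much larger coefficients than its left half does not and would be divided further.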
To verify the performance of the proposed spatial SHVC fast intra prediction algorithm, the reference software SHM 11.0 was used, and tests were run on a server with an Intel(R) 2.0 GHz processor and 30 GB of memory. The training sequences and the test sequences do not overlap, which ensures the generality of the algorithm. The performance of the algorithm was evaluated in terms of both coding efficiency and coding speed. Coding efficiency covers bit rate and visual quality and is expressed as BDBR, the bit rate difference at the same PSNR relative to the reference software in the EL. Coding speed is expressed as TS, the percentage of encoding run time saved in the EL only.
To verify the performance of the proposed algorithm, the algorithm integrates all the proposed strategies. Its performance is compared with that of the PAPS, EETBS and FIICA algorithms. All algorithms are tested on the same computing platform. Since the scalability ratio and the QP each have two settings, their combinations give four cases in the EL: case 1 uses a scalability ratio of 1.5 and the QP set (22, 26, 30, 34); case 2 uses a scalability ratio of 1.5 and the QP set (24, 28, 32, 36); case 3 uses a scalability ratio of 2 and the QP set (22, 26, 30, 34); and case 4 uses a scalability ratio of 2 and the QP set (24, 28, 32, 36). Table 5 (case 1), Table 6 (case 2), Table 7 (case 3) and Table 8 (case 4) list the overall performance comparisons in terms of coding efficiency and coding speed.
Figure GDA0003978385000000201
TABLE 5 Case1 Performance comparison
Figure GDA0003978385000000202
TABLE 6 case2 Performance comparison
Figure GDA0003978385000000203
Figure GDA0003978385000000211
TABLE 7 case3 Performance comparison
Figure GDA0003978385000000212
TABLE 8 case4 Performance comparison
In Table 5 (case 1), the average BDBRs of the proposed algorithm, PAPS, EETBS and FIICA are 0.02%, 0.30%, 0.20% and 0.06%, respectively, and the average TS values of the four algorithms are 79.66%, 67.03%, 55.34% and 47.15%, respectively. In this test, the BDBR of the proposed algorithm is smaller than that of the other three algorithms, and its coding speed is significantly higher.
In Table 6 (case 2), the average BDBRs of the proposed algorithm, PAPS, EETBS and FIICA are -0.14%, 0.38%, -0.20% and -0.18%, respectively, and the average TS values of the four algorithms are 81.26%, 65.85%, 56.30% and 45.75%, respectively. In this test, the BDBR of the proposed algorithm is smaller than that of PAPS and slightly larger than that of EETBS and FIICA, and its coding speed is significantly higher than that of the other three algorithms.
In Table 7 (case 3), the average BDBRs of the proposed algorithm, PAPS, EETBS and FIICA are 0.94%, 0.62%, 0.35% and 0.38%, respectively, and the average TS values of the four algorithms are 76.34%, 68.30%, 54.49% and 42.22%, respectively. In this test, the BDBR of the proposed algorithm is larger than that of the other three algorithms, and its coding speed is significantly higher.
In Table 8 (case 4), the average BDBR for the proposed algorithm, PAPS, EETBS and FIICA is 0.68%, 0.31% and 0.40%, respectively, and the average TS values of the four algorithms are 78.02%, 66.67%, 54.11% and 43.25%, respectively. In this test, the BDBR of the proposed algorithm is smaller than that of PAPS and slightly larger than that of EETBS and FIICA, and its coding speed is significantly higher than that of the other three algorithms.
To show the performance of the proposed algorithm clearly, Table 9 compares the overall average performance of the four algorithms across all four cases.
Figure GDA0003978385000000221
TABLE 9 Overall average Performance comparison of different algorithms
The overall average BDBRs of the proposed algorithm, PAPS, EETBS and FIICA are 0.38%, 0.49%, 0.17% and 0.16%, respectively, and the overall average TS values of the four algorithms are 78.82%, 66.96%, 55.06% and 44.59%, respectively. The coding speed of the proposed algorithm is therefore significantly higher than that of the other three algorithms, while its BDBR is smaller than that of PAPS and larger than that of EETBS and FIICA.
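The overall average TS values can be reproduced from the per-case averages quoted for Tables 5 to 8. The short sketch below does only that arithmetic; the dictionary and variable names are chosen for the example.

```python
# Average TS (%) per case, in the order case 1, case 2, case 3, case 4.
ts_per_case = {
    "proposed": [79.66, 81.26, 76.34, 78.02],
    "PAPS": [67.03, 65.85, 68.30, 66.67],
    "EETBS": [55.34, 56.30, 54.49, 54.11],
    "FIICA": [47.15, 45.75, 42.22, 43.25],
}

# Overall average TS: mean over the four cases, rounded to two decimals.
overall_ts = {
    name: round(sum(values) / len(values), 2)
    for name, values in ts_per_case.items()
}

# Speed advantage of the proposed algorithm over each reference algorithm.
ts_gain = {
    name: round(overall_ts["proposed"] - value, 2)
    for name, value in overall_ts.items()
    if name != "proposed"
}
```

The means agree with the overall TS figures stated above (78.82%, 66.96%, 55.06% and 44.59%), so the proposed algorithm saves roughly 12, 24 and 34 percentage points more encoding time than PAPS, EETBS and FIICA, respectively.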
The above embodiments further illustrate the objects, technical solutions and advantages of the present invention. It should be understood that they are only preferred embodiments of the present invention and should not be construed as limiting it; any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention shall fall within its protection scope.

Claims (6)

1. A method for SHVC spatial scalable video coding based on distribution characteristics, the method comprising:
S1: acquiring the depth of the current coding unit (CU); if the depth of the current CU is 1 or 2, judging whether to skip the current depth according to the residual coefficients of the enhancement-layer ILR mode; if so, executing step S5, otherwise executing step S2;
the process of judging whether to skip the current depth according to the residual coefficients comprises: coding the current coding unit CU to obtain a residual coefficient map of the enhancement-layer ILR mode; dividing the residual coefficient map to obtain a first residual coefficient map and a second residual coefficient map; calculating the expectation and variance of the first residual coefficient map and of the second residual coefficient map respectively, and judging whether the expectation and variance of the first residual coefficient map differ from those of the second residual coefficient map; if they differ, skipping the current depth, otherwise not skipping the current depth;
S2: judging, by a GMM-EM method, whether the ILR mode of the current CU is the optimal mode; if so, executing step S4, otherwise executing step S3;
the process of judging, by the GMM-EM method, whether the ILR mode of the current CU is the optimal mode comprises: saving the coding modes and rate-distortion costs of the CUs at each depth of the previous frame and the current frame; obtaining, from these saved coding modes and rate-distortion costs, the coding modes and rate-distortion costs of the neighboring CUs of each CU in the previous frame and the current frame; storing the rate-distortion costs belonging to the ILR mode in rec0 and the rate-distortion costs belonging to the Intra mode in rec1; after the current CU has been coded in the ILR mode, obtaining the rate-distortion cost of the ILR mode adopted by the current CU according to the rate-distortion costs stored in rec0 and rec1; performing GMM conversion on the rate-distortion cost of the ILR mode of the current CU to obtain a rate-distortion-based probability; obtaining a quantity-based probability of the current CU according to the coding modes of the neighboring CUs; predicting the probability that the current CU adopts the ILR mode according to the rate-distortion-based probability and the quantity-based probability; and judging whether the ILR mode of the current CU is the optimal mode according to the probability of the ILR mode;
S3: performing intra prediction on the mode of the current CU to obtain the optimal mode of the current CU;
S4: judging whether the current CU continues to be divided according to the residual coefficients of the optimal mode, and outputting the division result of the CU if the division is stopped; if the current CU continues to be divided, acquiring the depth of the current CU; if the current depth is 3, exiting the division directly to obtain the final division result, and if the current depth is not 3, executing step S5;
S5: dividing the current CU to obtain four sub-CUs, and executing steps S1 to S4 on the four sub-CUs.
2. The SHVC spatial scalable video coding method according to claim 1, wherein the process of calculating the expectation and variance of the residual coefficient map comprises: assuming that each coefficient in the residual coefficient map follows a Gaussian distribution, obtaining residual coefficient samples of the divided residual coefficient maps; obtaining, from the residual coefficient samples, the probability density function and the corresponding likelihood function of the residual coefficient samples by a maximum likelihood estimation algorithm; and obtaining the expectation and variance of the divided residual coefficient maps from the probability density function and the likelihood function.
3. The SHVC spatial scalable video coding method according to claim 1, wherein the determining whether the expectation and variance of the first residual coefficient map are different from the expectation and variance of the second residual coefficient map comprises: inputting the expectation and variance of the first residual coefficient map into a judgment condition to obtain a first judgment result; inputting the expectation and variance of the second residual coefficient map into the judgment condition to obtain a second judgment result; and comparing the first judgment result with the second judgment result to obtain a judgment result.
4. The SHVC spatial scalable video coding method based on distribution characteristics as claimed in claim 3, wherein the determining conditions are:
Figure FDA0003978384990000021
wherein
Figure FDA0003978384990000022
denotes the sample mean, $\mu_1$ denotes the expectation, $\sigma_1$ denotes the standard deviation, $n$ denotes the number of residual coefficients in each part, and $s_\alpha$ denotes the threshold.
5. The method of claim 1, wherein intra-predicting the mode of the current CU comprises: predicting the mode of the current CU by a method based on the directional mode DM, a CU having 35 directional modes DM in total; the process of predicting the mode of the CU comprises:
step 1: performing the Hadamard transform for DM0, DM1, DM10 and DM26 among the directional modes DM, taking the smaller Hadamard cost of DM0 and DM1 as HC1 and the smaller Hadamard cost of DM10 and DM26 as HC2, and comparing HC1 and HC2; if HC1 is smaller than HC2, the optimal DM is DM0 or DM1 and step 10 is executed, otherwise step 2 is executed;
step 2: comparing the Hadamard costs of DM10 and DM26; if the Hadamard cost of DM10 is less than that of DM26, executing step 3; if the Hadamard cost of DM10 is greater than that of DM26, executing step 5; otherwise executing step 7;
step 3: checking DM8, DM9, DM11 and DM12, and judging whether a likely best DM (LBD) exists among DM9, DM11 and DM12; if so, that mode is the optimal directional mode and step 10 is executed, otherwise step 4 is executed;
step 4: checking DM2, DM6, DM14 and DM18; if the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14 and DM18, executing step 10 directly; otherwise obtaining the optimal DM by binary search and executing step 9;
step 5: checking DM24, DM25, DM27 and DM28; if an LBD exists among DM24, DM25 and DM27, that DM is the optimal DM and step 10 is executed, otherwise step 6 is executed;
step 6: checking DM18, DM22, DM30 and DM34; if the DM with the minimum Hadamard cost is not among DM18, DM22, DM8, DM24, DM28, DM30 and DM34, executing step 10 directly; otherwise obtaining the optimal DM by binary search and executing step 9;
step 7: checking the other DMs except DM10 and DM26; if an LBD exists among DM9, DM10, DM11, DM25, DM26 and DM27, that DM is the optimal DM and step 10 is executed, otherwise step 8 is executed;
step 8: checking DM2, DM6, DM14, DM18, DM22, DM30 and DM34; if the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14, DM18, DM22, DM24, DM28, DM30 and DM34, executing step 10 directly; otherwise obtaining the optimal DM by binary search and executing step 9;
step 9: checking the midpoint between the DM with the minimum Hadamard cost and its nearest checked neighbor on the left or right, and selecting the DM with the minimum Hadamard cost; repeating this process until the selected DM is an LBD, which is the optimal DM, and executing step 10;
step 10: DM selection terminates.
6. The method of claim 1, wherein the determining whether the current CU continues to be partitioned according to the residual coefficients of the best mode is the same as the determining whether to skip the current depth according to the residual coefficients of the enhancement layer ILR mode.
CN202110978616.6A 2021-08-25 2021-08-25 SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics Active CN113709492B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110978616.6A CN113709492B (en) 2021-08-25 2021-08-25 SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110978616.6A CN113709492B (en) 2021-08-25 2021-08-25 SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics

Publications (2)

Publication Number Publication Date
CN113709492A (en) 2021-11-26
CN113709492B (en) 2023-03-24

Family

ID=78654523

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202110978616.6A Active CN113709492B (en) 2021-08-25 2021-08-25 SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics

Country Status (1)

Country Link
CN (1) CN113709492B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114143536B (en) * 2021-12-07 2022-09-02 Chongqing University of Posts and Telecommunications Video coding method of SHVC (scalable video coding) spatial scalable frame
CN114520914B (en) * 2022-02-25 2023-02-07 Chongqing University of Posts and Telecommunications Scalable interframe video coding method based on SHVC (scalable video coding) quality
CN115278260A (en) * 2022-07-15 2022-11-01 Chongqing University of Posts and Telecommunications VVC (versatile video coding) fast CU (coding unit) partitioning method based on spatio-temporal domain characteristics, and storage medium
CN115633171B (en) * 2022-10-08 2024-01-02 Chongqing University of Posts and Telecommunications SHVC-based quick CU decision algorithm

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104333754B (en) * 2014-11-03 2017-06-13 Xidian University SHVC enhancement-layer video coding method based on fast prediction mode selection
EP3076669A1 (en) * 2015-04-03 2016-10-05 Thomson Licensing Method and apparatus for generating color mapping parameters for video encoding
CN108259898B (en) * 2018-02-01 2021-09-28 Chongqing University of Posts and Telecommunications Intra-frame fast coding method based on quality scalable video coding QSHVC
CN112383776B (en) * 2020-12-08 2022-05-03 Chongqing University of Posts and Telecommunications Method and device for quickly selecting SHVC (scalable video coding) video coding mode

Also Published As

Publication number Publication date
CN113709492A (en) 2021-11-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240129

Address after: Room 801, 85 Kefeng Road, Huangpu District, Guangzhou City, Guangdong Province

Patentee after: Guangzhou Dayu Chuangfu Technology Co.,Ltd.

Country or region after: China

Address before: 400065 Chongwen Road, Nanshan Street, Nanan District, Chongqing

Patentee before: Chongqing University of Posts and Telecommunications

Country or region before: China
