CN113709492B

CN113709492B - SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics

Info

Publication number: CN113709492B
Application number: CN202110978616.6A
Authority: CN
Inventors: 汪大勇; 解乐乐; 王欣; 王倩敏; 宋丽娟
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Guangzhou Dayu Chuangfu Technology Co ltd
Priority date: 2021-08-25
Filing date: 2021-08-25
Publication date: 2023-03-24
Anticipated expiration: 2041-08-25
Also published as: CN113709492A

Abstract

The invention belongs to the field of SHVC video coding and in particular relates to an SHVC spatial scalable video coding method based on distribution characteristics. The method comprises: acquiring the depth of the current coding unit (CU) and judging, according to that depth, whether to skip it; if the depth is skipped, selecting the next depth for coding; if not, judging whether the current mode is the optimal mode; if it is not, predicting the optimal mode by means of the directional modes; once the optimal mode is known, judging according to it whether division is terminated; if so, outputting the division result, and if not, selecting the next depth for coding. The invention predicts the directional mode with a variable-step halving search, which avoids the inefficiency of applying the Hadamard transform to directional modes that are unlikely to be selected.

Description

SHVC (scalable video coding) spatial scalable video coding method based on distribution characteristics

Technical Field

The invention belongs to the field of SHVC (scalable video coding) video coding, and particularly relates to an SHVC spatial scalable video coding method based on distribution characteristics.

Background

Technologies such as digital television broadcasting, video conferencing, wireless video streaming, smart phone communication and the like are increasingly widely applied in daily life of people, so that a plurality of different terminal devices are generated, and the terminal devices may have different screen resolutions, so that the video streaming is required to adapt to the resolutions of different screens; scalable high efficiency video coding (SHVC) is an effective approach to this problem. SHVC consists of a Base Layer (BL) and one or more Enhancement Layers (EL). As shown in fig. 1, to accommodate different screen resolutions, the Spatial SHVC (SSHVC) encodes different layers with different screen resolution sequences, and by selecting the appropriate layer, the SSHVC can accommodate various devices of different screen resolutions.

In an SSHVC stream consisting of one BL and one or more ELs, the base layer BL uses only intra-layer prediction, while the enhancement layer EL additionally uses inter-layer prediction; the intra prediction coding process in the enhancement layer is the same as that of HEVC. Since the BL and EL have the same content but different resolutions, the BL must be upsampled before it is used for prediction, i.e., inter-layer prediction; the corresponding mode is denoted the inter-layer reference (ILR) mode. The coding process of HEVC is already very complex, and SSHVC has to code all of its layers, so its coding process is more complex still, which limits its wide application, especially in wireless and real-time scenarios. It is therefore important to reduce the encoding complexity and increase the encoding speed.

Some existing algorithms can improve the coding speed to some extent, but the spatial SHVC still has some problems to be solved: 1. texture features and correlation are commonly used to predict candidate depths; however, their connection to depth selection is not straightforward; therefore, using them alone to predict depth selection does not guarantee optimal performance; 2. to increase the coding speed, the mode selection is usually terminated early with residual coefficients; as the principle behind it is not studied, the best performance cannot be obtained by using only residual coefficients for early termination mode selection.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a method for encoding SHVC spatial scalable video based on distribution characteristics, which comprises the following steps:

S1: acquiring the depth of the current coding unit (CU); if the current CU depth is 1 or 2, judging whether to skip the current depth according to the residual coefficients of the enhancement-layer ILR mode; if so, executing step S5, otherwise executing step S2;

S2: judging whether the ILR mode of the current CU is the optimal mode by the GMM-EM method; if so, executing step S4, otherwise executing step S3;

S3: performing intra prediction on the mode of the current CU to obtain the optimal mode of the current CU;

S4: judging whether the current CU continues to be divided according to the residual coefficients of the optimal mode, and outputting the division result of the CU if division stops; if the current CU continues to be divided, acquiring the depth of the current CU; if the current depth is 3, leaving the division directly to obtain the final division result, and if the current depth is not 3, executing step S5;

S5: dividing the current CU into four sub-CUs, and executing steps S1 to S4 on the four sub-CUs.
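The control flow of steps S1 to S5 can be sketched as a short recursion. This is only an illustrative sketch: `encode_cu` and the three callables `skip_depth`, `ilr_is_best` and `stop_split` are hypothetical stand-ins for the residual-coefficient test, the GMM-EM test and the division-termination test, none of which is implemented here.

```python
from typing import Callable, List

MAX_DEPTH = 3  # deepest CU level named in the text


def encode_cu(depth: int,
              skip_depth: Callable[[int], bool],
              ilr_is_best: Callable[[int], bool],
              stop_split: Callable[[int], bool],
              trace: List[str]) -> None:
    """Walk steps S1-S5 for one CU; the three callables are hypothetical hooks."""
    if depth in (1, 2) and skip_depth(depth):        # S1: only depths 1 and 2 can be skipped
        trace.append(f"depth {depth}: skipped")
    else:
        if ilr_is_best(depth):                       # S2: ILR already optimal
            trace.append(f"depth {depth}: ILR kept")
        else:                                        # S3: search the intra modes
            trace.append(f"depth {depth}: intra search")
        if stop_split(depth) or depth == MAX_DEPTH:  # S4: stop, or depth 3 reached
            return
    for _ in range(4):                               # S5: recurse into four sub-CUs
        encode_cu(depth + 1, skip_depth, ilr_is_best, stop_split, trace)
```

Passing the decisions in as callables keeps the sketch independent of how each test is actually computed.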

Preferably, the process of determining whether to skip the current depth according to the residual coefficient includes: coding a current coding unit CU to obtain a residual coefficient map of an enhancement layer ILR mode; dividing the residual coefficient graph to obtain a first residual coefficient graph and a second residual coefficient graph; respectively calculating the expectation and variance of the first residual coefficient map and the second residual coefficient map, judging whether the expectation and variance of the first residual coefficient map are different from the expectation and variance of the second residual coefficient map, if the expectation and variance of the first residual coefficient map are different from the expectation and variance of the second residual coefficient map, skipping the current depth, otherwise, not skipping the current depth.

Further, the process of calculating the expectation and variance of a residual coefficient map comprises: assuming that each coefficient in the residual coefficient map follows a Gaussian distribution, and taking the coefficients of each divided residual coefficient map as residual coefficient samples; from these samples, obtaining the probability density function and the corresponding likelihood function by maximum likelihood estimation, and obtaining the expectation and variance of the divided residual coefficient map from the probability density function and the likelihood function.

Further, the process of determining whether the expectation and variance of the first residual coefficient map are different from the expectation and variance of the second residual coefficient map includes: inputting the expectation and the variance of the first residual coefficient graph into a judgment condition to obtain a first judgment result; inputting the expectation and the variance of the second residual coefficient map into a judgment condition to obtain a second judgment result; and comparing the first judgment result with the second judgment result to obtain a judgment result.

Further, the judgment condition is as follows:

where $\bar{x}$ denotes the sample mean, $\mu_1$ denotes the expectation, $\sigma_1$ denotes the standard deviation, n denotes the number of residual coefficients in each part, and $s_\alpha$ denotes the threshold.

Preferably, the step of determining whether the ILR mode of the current CU is the optimal mode by using the GMM-EM method includes: saving coding modes and rate distortion costs of each depth CU of a previous frame and a current frame; obtaining the coding mode and the rate distortion cost of adjacent CUs of the previous frame and the current frame of each CU according to the coding mode and the rate distortion cost of the CU of the previous frame and the current frame; adopting rec0 to store the rate-distortion cost belonging to the ILR mode, and adopting rec1 to store the rate-distortion cost of the Intra mode; after the current CU finishes the ILR mode, obtaining the rate distortion cost of the ILR mode adopted by the current CU according to the rate distortion cost stored by rec0 and rec 1; performing GMM conversion on the rate distortion cost of the ILR mode of the current CU to obtain the probability based on rate distortion; obtaining the probability of the current CU based on the quantity according to the coding mode of the adjacent CU; predicting the probability that the current CU adopts the ILR mode according to the probability based on rate distortion and the probability based on quantity; and judging whether the ILR mode of the current CU is the optimal mode according to the probability result.

Preferably, the intra prediction of the mode of the current CU comprises: predicting the mode of the current CU with a method based on the directional modes (DM); a CU has 35 directional modes in total; the process of predicting the mode of the CU comprises:

Step 1: performing the Hadamard transform for DM0, DM1, DM10 and DM26; taking the smaller Hadamard cost of DM0 and DM1 as HC1 and the smaller Hadamard cost of DM10 and DM26 as HC2; comparing HC1 with HC2; if HC1 is smaller than HC2, the optimal DM is among DM0 and DM1, and step 10 is executed; otherwise step 2 is executed;

Step 2: comparing the Hadamard costs of DM10 and DM26; if the Hadamard cost of DM10 is less than that of DM26, executing step 3; if the Hadamard cost of DM10 is greater than that of DM26, executing step 5; otherwise executing step 7;

Step 3: checking DM8, DM9, DM11 and DM12, and judging whether an LMD exists among DM9, DM11 and DM12; if so, that mode is the optimal directional mode and step 10 is executed; otherwise step 4 is executed;

Step 4: checking DM2, DM6, DM14 and DM18; if the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14 and DM18, executing step 10 directly; otherwise obtaining the optimal DM by binary search, and executing step 9;

Step 5: checking DM24, DM25, DM27 and DM28; if an LMD exists among DM24, DM25 and DM27, that DM is the optimal DM and step 10 is executed; otherwise step 6 is executed;

Step 6: checking DM18, DM22, DM30 and DM34; if the DM with the minimum Hadamard cost is not among DM18, DM22, DM8, DM24, DM28, DM30 and DM34, executing step 10 directly; otherwise obtaining the optimal DM by binary search, and executing step 9;

Step 7: checking the other DMs except DM10 and DM26; if an LMD exists among DM9, DM10, DM11, DM25, DM26 and DM27, that DM is the optimal DM and step 10 is executed; otherwise step 8 is executed;

Step 8: checking DM2, DM6, DM14, DM18, DM22, DM30 and DM34; if the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14, DM18, DM22, DM24, DM28, DM30 and DM34, executing step 10 directly; otherwise obtaining the optimal DM by binary search, and executing step 9;

Step 9: checking the DM midway between the DM with the minimum Hadamard cost and its already checked left (right) neighboring DM, and selecting the DM with the minimum Hadamard cost; repeating this process until the selected DM is an LMD, which is the optimal DM; then executing step 10;

Step 10: the DM selection terminates.
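The halving refinement of step 9 can be illustrated with a small sketch. Several points are assumptions rather than statements of the method: `hc` is a hypothetical callable returning the Hadamard cost of a DM, the sketch checks the midpoints on both sides of the current best DM instead of only one side, and it treats the search as finished once the immediate neighbors of the best DM have been checked (the text does not define LMD precisely).

```python
from typing import Callable, Dict


def halving_search(hc: Callable[[int], float], best: int, step: int,
                   lo: int = 2, hi: int = 34) -> int:
    """Refine `best` by checking DMs midway to its already checked neighbours.

    `step` is the spacing of the DMs checked so far (e.g. 4 for DM2, DM6, ...).
    """
    cache: Dict[int, float] = {best: hc(best)}
    while step > 1:
        step //= 2                                   # halve the search step
        for cand in (best - step, best + step):
            if lo <= cand <= hi and cand not in cache:
                cache[cand] = hc(cand)               # one Hadamard cost per new DM
        # keep the cheapest DM among the centre and its two midpoints
        best = min((d for d in (best - step, best, best + step) if d in cache),
                   key=cache.__getitem__)
    return best
```

With a cost that is smallest at DM15, starting from DM14 with spacing 4, the sketch evaluates only five DMs (14, 12, 16, 13, 15) before settling on DM15.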

Preferably, the process of determining whether the current CU continues to be divided according to the residual coefficient of the optimal mode is the same as the process of determining whether the current depth is skipped according to the residual coefficient of the enhancement layer ILR mode.

The invention has the advantages that: the Hadamard transform is terminated in advance according to the significance difference, and the variable step length halving search method is adopted to predict and select the directional mode of the CU, so that the problem of low efficiency caused by the Hadamard transform of the directional mode with low possibility is solved.

Drawings

FIG. 1 is a diagram of the base layer and the enhancement layer in conventional SHVC;

FIG. 2 is a flow chart of the algorithm of the present invention;

FIG. 3 is a graph of a residual coefficient map according to the present invention;

FIG. 4 is a diagram of a neighboring CU structure of the present invention;

FIG. 5 is a schematic view of all directional modes of the present invention;

FIG. 6 is a diagram illustrating the prediction results of the type 1 directional mode of the present invention;

FIG. 7 is a diagram illustrating the results of the variable step size check and binary search of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The original encoding method comprises: coding the whole coding unit (CU) to obtain its rate-distortion cost (RDcost), recorded as C1; dividing the CU into four sub-CUs of depth 1, coding each sub-CU to obtain its optimal RDcost, and recording the sum of the optimal RDcosts of the four sub-CUs as C2; comparing C1 with C2 and taking the smaller value as the optimal RDcost; if C1 is smaller than C2, the current CU is not divided, otherwise division continues; repeating this process until the depth of the current CU is 3, at which point division stops, finally giving the optimal RDcost over depths 0, 1, 2 and 3.
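The exhaustive comparison of C1 and C2 in the original encoding method amounts to a recursive quadtree search, which a minimal sketch can make concrete. The function name `best_rd_cost` and the callable `rd_cost(x, y, size)` are hypothetical; a real encoder would obtain the cost by actually coding the block.

```python
from typing import Callable, Tuple


def best_rd_cost(x: int, y: int, size: int, depth: int,
                 rd_cost: Callable[[int, int, int], float]) -> Tuple[float, bool]:
    """Return (optimal RD cost, split flag) for the CU at (x, y)."""
    c1 = rd_cost(x, y, size)               # C1: cost of coding the CU undivided
    if depth == 3:                         # depth 3 is never divided further
        return c1, False
    half = size // 2
    # C2: sum of the optimal costs of the four sub-CUs
    c2 = sum(best_rd_cost(x + dx, y + dy, half, depth + 1, rd_cost)[0]
             for dy in (0, half) for dx in (0, half))
    return (c1, False) if c1 < c2 else (c2, True)
```

Every CU at every depth is coded in this scheme, which is the cost the depth-skip and early-termination tests of the method aim to reduce.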

A SHVC spatial scalable video coding method based on distribution characteristics, as shown in fig. 2, the method comprising:

S1: acquiring the depth of the current coding unit (CU); if the depth of the current CU is 1 or 2, judging whether to skip the current depth according to the residual coefficients of the enhancement-layer ILR mode; if so, executing step S5, otherwise executing step S2;

S2: judging whether the ILR mode of the current CU is the optimal mode by the GMM-EM method; if so, executing step S4, otherwise executing step S3;

S3: performing intra prediction on the mode of the current CU, namely performing candidate directional mode prediction on the current CU with the method based on the directional modes (DM), to obtain the optimal mode of the current CU;

S4: judging the depth of the current CU according to the optimal mode; if the current depth is 0, executing step S5; if the current depth is 1 or 2, judging whether to terminate the CU division according to the final residual coefficients, and if so outputting the CU division result, otherwise executing step S5; if the current depth is 3, leaving the division directly to obtain the final division result;

S5: dividing the current CU into four sub-CUs, and executing steps S1 to S4 on the four sub-CUs.

The process of judging whether to skip the current depth according to the residual coefficients comprises: coding the current CU, and obtaining the residual coefficient map of the CU after the ILR mode is finished; dividing the residual coefficient map to obtain a first residual coefficient map and a second residual coefficient map; calculating the expectation and variance of the first residual coefficient map and of the second residual coefficient map, and judging by a hypothesis testing method whether the expectation and variance of the two maps differ; if they differ, skipping the current depth, and if they do not differ, not skipping the current depth.

Optionally, the manner of dividing the residual coefficient map includes dividing the residual coefficient map left and right to obtain a left half residual coefficient map and a right half residual coefficient map; and respectively calculating expectation and variance of the left half residual coefficient graph and the right half residual coefficient graph, judging whether the two parts have significant difference according to a hypothesis testing method, and skipping the current coding depth if the two parts have significant difference.

Optionally, the dividing the residual coefficient map includes dividing the residual coefficient map up and down to obtain an upper half residual coefficient map and a lower half residual coefficient map; and respectively calculating expectation and variance of the residual coefficient map of the upper half part and the residual coefficient map of the lower half part, judging whether the two parts have significant difference according to a hypothesis testing method, and skipping the current coding depth if the two parts have significant difference.

Preferably, the manner of dividing the residual coefficient map includes dividing the residual coefficient map left and right or up and down to obtain a left half residual coefficient map and a right half residual coefficient map or an upper half residual coefficient map and a lower half residual coefficient map; and respectively calculating expectation and variance of the left half residual coefficient map and the right half residual coefficient map or the upper half residual coefficient map and the lower half residual coefficient map, judging whether the two parts have significant difference according to a hypothesis testing method, and skipping the current coding depth if the two parts have significant difference.

In SHVC, each CTU includes four depths corresponding to Coding Units (CUs) of size 64 × 64 to 8 × 8, and each CU needs to check an ILR mode and an intra mode and then select a mode with a low Rate Distortion (RD) cost as a best mode. The corresponding mode distribution between inter-layer (ILR) mode and Intra (Intra) mode is shown in table 1:

Sequence	ILR	Intra
Blue-sky	98.50%	1.50%
Ducks	99.76%	0.24%
Park_Joy	99.07%	0.93%
Pedestrian	90.65%	9.35%
Tractor	97.68%	2.32%
Town	96.84%	3.16%
Station2	95.27%	4.73%
Average	96.82%	3.18%

TABLE 1 ILR mode and Intra mode distributions

As can be seen from Table 1, the average share of the ILR mode is 96.82%, i.e., most CUs select the ILR mode as the best mode. This is because the content of the BL and EL is the same, the QPs of the BL and EL are similar or even identical, and the inter-layer correlation is strong; the ILR-mode prediction can be obtained directly by upsampling the co-located CU in the BL, which is a simple process. In the invention, therefore, the residual coefficients of the ILR mode are obtained first, and whether the current CU needs to be further divided is judged from these residual coefficients; if so, the current depth can be skipped directly; otherwise, the ILR mode and the Intra mode need to be further encoded.

First, the residual coefficient map of the CU is divided into upper and lower parts, as shown in FIG. 3(a), and into left and right parts, as shown in FIG. 3(b). If there is a significant difference between the two parts of either partition, the current CU needs to be further divided. If the CU can be predicted accurately with the best mode, the corresponding residual coefficients follow a Gaussian distribution. In that case, the residual coefficients of one part in each partition are modeled as:

$$X \sim N(\mu_1, \sigma_1^2)$$

where X denotes the set of residual coefficients of one part (here, the upper half or the left half), and $\mu_1$ and $\sigma_1^2$ are the expectation and variance of that part, respectively.

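If $x_1, x_2, \ldots, x_n$ are the samples in the residual coefficient set X, maximum likelihood estimation (MLE) is applied to the samples of each part. The probability density function of a sample is:

$$f(x; \mu_1, \sigma_1^2) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right)$$

where x denotes a sample in the residual coefficient set X, and $\mu_1$ and $\sigma_1^2$ are the expectation and variance of the residual coefficient set, respectively.

The likelihood function corresponding to the probability density function is:

$$L(\mu_1, \sigma_1^2) = \prod_{i=1}^{n} f(x_i; \mu_1, \sigma_1^2) = (2\pi\sigma_1^2)^{-n/2} \exp\left(-\frac{1}{2\sigma_1^2}\sum_{i=1}^{n}(x_i-\mu_1)^2\right)$$

where L denotes the likelihood function and n denotes the number of residual coefficients in each part. To obtain $\mu_1$ and $\sigma_1^2$, the partial derivatives of $\ln L$ are set to zero:

$$\frac{\partial \ln L}{\partial \mu_1} = \frac{1}{\sigma_1^2}\sum_{i=1}^{n}(x_i-\mu_1) = 0, \qquad \frac{\partial \ln L}{\partial \sigma_1^2} = -\frac{n}{2\sigma_1^2} + \frac{1}{2\sigma_1^4}\sum_{i=1}^{n}(x_i-\mu_1)^2 = 0$$

Solving these equations gives:

$$\mu_1 = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma_1^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$$

where $\bar{x}$ denotes the sample mean.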
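For a Gaussian model, the maximum likelihood estimates are the sample mean and the variance computed with a divisor of n (not n - 1). The sketch below applies this to the two halves of a residual coefficient map; the function names and the list-of-rows representation of the map are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def mle_gaussian(samples: Sequence[float]) -> Tuple[float, float]:
    """Maximum likelihood estimates (mean, variance) of a Gaussian sample."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n   # divisor n, as in MLE
    return mu, var


def split_map(residual: List[List[float]],
              left_right: bool) -> Tuple[List[float], List[float]]:
    """Split a residual map into left/right or upper/lower halves (flattened)."""
    h, w = len(residual), len(residual[0])
    if left_right:
        first = [v for row in residual for v in row[: w // 2]]
        second = [v for row in residual for v in row[w // 2:]]
    else:
        first = [v for row in residual[: h // 2] for v in row]
        second = [v for row in residual[h // 2:] for v in row]
    return first, second
```

For example, splitting a 4 x 4 map left/right yields two lists of eight coefficients each, from which `mle_gaussian` gives the expectation and variance of each part.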

If Y is the residual coefficient set of the other part, $y_1, y_2, \ldots, y_n$ are the samples in that set, and $\bar{y}$ is the sample mean, the condition for judging whether the two parts differ significantly is:

where n is the number of residual coefficients in each part and $\alpha$ is the significance level; for any $\alpha$, the corresponding threshold $s_\alpha$ can be obtained by consulting the Gaussian distribution table. If the condition is satisfied, the two parts differ significantly.

Since the probability of different depths being skipped may be different, an optimal threshold for each depth needs to be selected. And for the depth 2, a common value is adopted for detection, and the coding efficiency corresponding to the value is high. In order to improve the coding efficiency, the maximum value of 3.49 in the gaussian distribution table is selected, and the multiple of the maximum value is further tested, and the corresponding coding efficiency is shown in table 2.

TABLE 2 coding efficiency under different test values

In Table 2, the coding efficiency is represented by BDBR, which measures the bit rate difference at the same peak signal-to-noise ratio (PSNR) in the EL. A positive or negative BDBR reflects a loss or a gain in coding efficiency, respectively. As can be seen from Table 2, when the test value is greater than or equal to 20.94, the BDBRs of all the sequences are less than 0.1%. Therefore, the test value 20.94 is selected as the threshold for depth 2. Likewise, the threshold for depth 1 is 31.41. If depth 0 is skipped, the corresponding coding efficiency is significantly reduced in some sequences, so depth 0 is not skipped. The depth skip condition is:

where depth denotes the depth of the current CU.
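A depth-skip decision with the per-depth thresholds 31.41 (depth 1) and 20.94 (depth 2) might be sketched as follows. The exact test statistic is not reproduced in this text, so the sketch assumes a two-sample z-type statistic, |mean1 - mean2| / sqrt((var1 + var2) / n), with equal-sized parts and a greater-or-equal comparison; `z_statistic` and `skip_depth` are hypothetical names.

```python
import math
from typing import Iterable, Sequence, Tuple

THRESHOLDS = {1: 31.41, 2: 20.94}   # from the text; depth 0 is never skipped


def z_statistic(first: Sequence[float], second: Sequence[float]) -> float:
    """ASSUMED statistic: |m1 - m2| / sqrt((v1 + v2) / n), equal-sized parts."""
    n = len(first)
    m1, m2 = sum(first) / n, sum(second) / n
    v1 = sum((x - m1) ** 2 for x in first) / n
    v2 = sum((y - m2) ** 2 for y in second) / n
    denom = math.sqrt((v1 + v2) / n)
    if denom == 0.0:                 # both parts constant
        return 0.0 if m1 == m2 else math.inf
    return abs(m1 - m2) / denom


def skip_depth(depth: int,
               partitions: Iterable[Tuple[Sequence[float], Sequence[float]]]) -> bool:
    """Skip when either partition (upper/lower, left/right) differs significantly."""
    if depth not in THRESHOLDS:
        return False
    return any(z_statistic(a, b) >= THRESHOLDS[depth] for a, b in partitions)
```

The `partitions` argument would hold the upper/lower pair and the left/right pair of the residual coefficient map, so a significant difference in either one triggers the skip.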

The process of judging whether the ILR mode of the current CU is the optimal mode by adopting the GMM-EM method comprises the following steps: saving coding modes and rate distortion costs of each depth CU of a previous frame and a current frame; obtaining the coding mode and the rate-distortion cost of adjacent CUs of the previous frame and the current frame of each CU according to the CU coding mode and the rate-distortion cost of the previous frame and the current frame, storing the rate-distortion cost belonging to the ILR mode by adopting rec0, storing the rate-distortion cost of the Intra mode by adopting rec1, and obtaining the rate-distortion cost of the ILR mode adopted by the current CU according to the rate-distortion costs stored by rec0 and rec 1; performing GMM conversion on the rate distortion cost of the ILR mode of the current CU to obtain the probability based on rate distortion; obtaining the probability of the current CU based on the quantity according to the coding mode of the adjacent CU; predicting the probability that the current CU adopts the ILR mode according to the probability based on rate distortion and the probability based on quantity; and judging whether the ILR mode of the current CU is the optimal mode or not according to the probability result. Specifically, the process of judging whether the ILR mode of the current CU is the optimal mode by using the GMM-EM method includes:

(1) First, a variable currCosts0 of type vector<pair<int, double>> saves the final coding mode and rate-distortion cost of each depth-0 CU of the current frame, and a variable preCosts0 of the same type saves the final coding mode and rate-distortion cost of each depth-0 CU of the previous frame. After each frame is coded, currCosts0 is copied to preCosts0 and currCosts0 is re-initialized. The same applies to depths 1, 2 and 3, so that the coding mode and rate-distortion cost of every coded CU at each depth are available for the previous frame and the current frame.
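The bookkeeping in (1) can be mirrored by a small container that holds one list of (mode, cost) pairs per depth and rotates the lists at the end of each frame. The class name `CostHistory` and the integer mode codes are hypothetical and chosen only for this sketch.

```python
from typing import List, Tuple

ILR, INTRA = 0, 1   # hypothetical integer codes for the two coding modes


class CostHistory:
    """Per-depth (mode, RD cost) records for the current and the previous frame."""

    def __init__(self, depths: int = 4) -> None:
        self.curr: List[List[Tuple[int, float]]] = [[] for _ in range(depths)]
        self.prev: List[List[Tuple[int, float]]] = [[] for _ in range(depths)]

    def record(self, depth: int, mode: int, rd_cost: float) -> None:
        """Save the final mode and RD cost of one coded CU at `depth`."""
        self.curr[depth].append((mode, rd_cost))

    def end_frame(self) -> None:
        """Current-frame records become previous-frame records; start afresh."""
        self.prev = self.curr
        self.curr = [[] for _ in self.prev]
```

Calling `end_frame()` after each frame plays the role of copying the current records to the previous-frame records and re-initializing the current ones.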

(2) From the data saved in (1), the neighboring CUs of each CU in the previous frame and the current frame are obtained, together with their coding modes and rate-distortion costs. rec0 stores the rate-distortion costs belonging to the ILR mode and rec1 stores those belonging to the Intra mode. Since the current CU has completed ILR-mode coding at this point, the rate-distortion cost of the current CU in the ILR mode is also available.

The structure of the current CU and its neighboring CUs is shown in FIG. 4, where $U_0$ is the current CU, $U_1$, $U_2$, $U_3$ and $U_4$ are the neighboring CUs of the current CU, and $U_5$, $U_6$, $U_7$, $U_8$ and $U_9$ are the CUs in the previous frame co-located with $U_0$, $U_1$, $U_2$, $U_3$ and $U_4$, respectively.

(3) GMM conversion is performed on the information obtained in (2) to obtain the probability based on rate distortion. The specific process is as follows:

For any $U_i$, the rate-distortion cost is denoted $rd_i$, and the corresponding Gaussian mixture model is:

$$p(rd_i) = \pi_1 N(rd_i \mid \mu_1, \Sigma_1) + \pi_2 N(rd_i \mid \mu_2, \Sigma_2)$$

where $\pi_1$ is the probability of adopting the ILR mode, and $\mu_1$ and $\Sigma_1$ are the expectation and variance of its rate-distortion cost; $\pi_2$ is the probability of adopting the Intra mode, and $\mu_2$ and $\Sigma_2$ are the expectation and variance of its rate-distortion cost.

To obtain the six parameter values $\pi_1$, $\mu_1$, $\Sigma_1$, $\pi_2$, $\mu_2$ and $\Sigma_2$, maximum likelihood estimation is adopted, with the expression:

$$L = \prod_{i=0}^{M-1} p(rd_i) = \prod_{i=0}^{M-1} \sum_{k=1}^{2} \pi_k N(rd_i \mid \mu_k, \Sigma_k)$$

where M denotes the number of CUs formed by the current CU and its neighboring CUs, p denotes the probability, $rd_i$ denotes the rate-distortion cost, and N denotes the Gaussian distribution.

Preferably, the value of M is set to 10.

The logarithm of the likelihood function is:

$$\ln L = \sum_{i=0}^{M-1} \ln\left(\sum_{k=1}^{2} \pi_k N(rd_i \mid \mu_k, \Sigma_k)\right)$$

From the maximum likelihood expression and the logarithm of the likelihood function, setting the partial derivatives with respect to the parameters to zero gives:

$$\frac{\partial \ln L}{\partial \mu_k} = 0, \qquad \frac{\partial \ln L}{\partial \Sigma_k} = 0$$

From the above, one obtains:

$$\mu_k = \frac{1}{N_k}\sum_{i=0}^{M-1} \gamma(i,k)\, rd_i, \qquad \Sigma_k = \frac{1}{N_k}\sum_{i=0}^{M-1} \gamma(i,k)\,(rd_i-\mu_k)(rd_i-\mu_k)^T$$

where $\gamma(i,k)$ is the probability that the i-th sample is generated by the k-th component, T denotes the transpose, and

$$N_k = \sum_{i=0}^{M-1} \gamma(i,k)$$

is the sum of the probabilities that all samples are generated by the k-th component. The expression for the probability of adopting the ILR mode is:

$$\pi_k = \frac{N_k}{N}$$

where N is the sum of the probability sums of the two modes, i.e., $N = N_1 + N_2$, and $\gamma(i,k)$ is obtained from:

$$\gamma(i,k) = \frac{\pi_k N(rd_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{2} \pi_j N(rd_i \mid \mu_j, \Sigma_j)}$$

The iteration is repeated to update $\mu_k$, $\Sigma_k$, $\pi_k$ and $\gamma(i,k)$ until $\gamma(i,k)$ converges.

Since the current CU is $U_0$, determining whether the ILR mode is the best mode requires determining whether $\gamma(0,k)$ has converged. Let the value of $\gamma(0,k)$ at the i-th iteration be denoted $\gamma_i(0,k)$. To avoid unnecessary iterations, the iteration can be terminated if the absolute difference between $\gamma_{i-1}(0,k)$ and $\gamma_i(0,k)$ is very small. Selecting 0.01 as the threshold gives:

$$|\gamma_i(0,k) - \gamma_{i-1}(0,k)| \le 0.01$$

If the above condition is met, the iteration is terminated directly. Through this process, the probability that the current CU selects the ILR mode is obtained, which is defined as the probability based on rate distortion.
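The iteration can be illustrated with the standard EM updates for a two-component, one-dimensional Gaussian mixture, stopping once the responsibility of the current CU changes by at most 0.01. This is a sketch under assumptions: the rate-distortion costs are treated as scalars, `rd[0]` is taken to be the cost of the current CU, and the initial (weight, mean, variance) of each component is supplied by the caller (for instance from the costs held in rec0 and rec1, which is itself an assumption).

```python
import math
from typing import List, Sequence, Tuple


def gauss(x: float, mu: float, var: float) -> float:
    """Density of a one-dimensional Gaussian."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_ilr_probability(rd: Sequence[float],
                       init: Sequence[Tuple[float, float, float]],
                       tol: float = 0.01, max_iter: int = 100) -> float:
    """Return gamma(0, ILR); init[0] is the ILR component, init[1] the Intra one."""
    pi = [p for p, _, _ in init]
    mu = [m for _, m, _ in init]
    var = [v for _, _, v in init]
    prev = None
    gamma: List[List[float]] = []
    for _ in range(max_iter):
        gamma = []
        for x in rd:                                   # E-step: responsibilities
            w = [pi[k] * gauss(x, mu[k], var[k]) for k in range(2)]
            s = sum(w)
            gamma.append([wk / s for wk in w] if s > 0.0 else [0.5, 0.5])
        if prev is not None and abs(gamma[0][0] - prev) <= tol:
            break                                      # gamma(0, k) has converged
        prev = gamma[0][0]
        for k in range(2):                             # M-step: update parameters
            nk = sum(g[k] for g in gamma)
            if nk < 1e-12:
                continue                               # keep an empty component as is
            mu[k] = sum(g[k] * x for g, x in zip(gamma, rd)) / nk
            var[k] = max(sum(g[k] * (x - mu[k]) ** 2
                             for g, x in zip(gamma, rd)) / nk, 1e-6)
            pi[k] = nk / len(rd)
    return gamma[0][0]
```

The variance floor of 1e-6 and the iteration cap are safeguards added for the sketch; they are not taken from the text.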

(4) And obtaining the probability of the current CU based on the quantity according to the coding mode of the adjacent CU. The specific process is as follows:

Since neighboring CUs are usually similar, neighboring CUs are used for prediction. The more neighboring CUs use the ILR mode, the higher the probability that the current CU uses this mode, and vice versa. The probability that the current CU selects the ILR mode is proportional to the number of neighboring CUs using the ILR mode. As shown in FIG. 4, the current CU has nine neighboring CUs. Thus, the probability that the current CU selects the ILR mode may be written as

$$p = \frac{k}{9}$$

where k is the number of neighboring CUs using the ILR mode. Since this probability is obtained from the number of neighboring CUs using the ILR mode, it is defined as the probability based on quantity.
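Because the quantity-based probability is proportional to the number of ILR-coded neighbors among nine, it can be computed as a simple ratio. The function name and the integer mode code below are hypothetical, and the ratio k/9 is an interpretation of the proportionality stated above.

```python
from typing import Sequence


def quantity_probability(neighbor_modes: Sequence[int],
                         ilr_code: int = 0, total: int = 9) -> float:
    """Share of the nine neighbouring CUs that were coded in the ILR mode."""
    k = sum(1 for mode in neighbor_modes if mode == ilr_code)
    return k / total
```

With six ILR-coded neighbors and three Intra-coded ones, the sketch returns 6/9.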

(5) Since both the probability based on rate distortion and the probability based on quantity are strongly related to the selection of the ILR mode, the probability of using the ILR mode is predicted from both. Let A and B denote the events associated with the probability based on rate distortion and the probability based on quantity, respectively; because the two are independent of each other, the probability of depth early termination is:

$$p_r = p(A+B) = p(A) + p(B) - p(A)\,p(B)$$

where $p_r$ denotes the probability of depth early termination; if $p_r$ is greater than or equal to 0.6, the current CU terminates early; p(A+B) denotes the probability that at least one of the two holds, p(A) denotes the probability based on rate distortion, and p(B) denotes the probability based on quantity. The values 0.6, 0.7, 0.8 and 0.9 were used during testing; the corresponding BDBRs are shown in Table 3.

Sequence	p_r = 0.6	p_r = 0.7	p_r = 0.8	p_r = 0.9
Blue-sky	-0.3%	-0.3%	-0.3%	-0.3%
Ducks	0.0%	0.0%	0.0%	0.0%
Park_Joy	0.0%	0.0%	0.0%	0.0%
Pedestrian	-0.1%	-0.1%	-0.1%	-0.1%
Tractor	-0.1%	0.0%	0.0%	0.0%
Town	-0.1%	-0.1%	-0.1%	-0.1%
Station2	0.0%	-0.1%	-0.1%	-0.2%

TABLE 3 Probability of depth early termination p_r and the corresponding BDBR

As can be seen from Table 3, as $p_r$ increases, the BDBR remains essentially unchanged except for the sequence "Station2". The BDBR of the sequence "Station2" reaches its minimum when $p_r$ is equal to 0.9. Therefore, 0.9 is selected as the optimal value; that is, if $p_r$ is greater than 0.9, the current depth terminates early.
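The combination rule for two independent probabilities and the 0.9 threshold can be checked with a few lines. The function names are hypothetical; the formula and the threshold are the ones stated in the text.

```python
def early_termination_probability(p_a: float, p_b: float) -> float:
    """p_r = p(A) + p(B) - p(A) p(B): at least one of two independent events."""
    return p_a + p_b - p_a * p_b


def terminates_early(p_a: float, p_b: float, threshold: float = 0.9) -> bool:
    """Depth terminates early when p_r exceeds the selected threshold."""
    return early_termination_probability(p_a, p_b) > threshold
```

For instance, a rate-distortion based probability of 0.8 and a quantity based probability of 0.6 combine to about 0.92, which exceeds 0.9, whereas 0.5 and 0.5 combine to 0.75, which does not.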

The directional modes (DM) in SHVC are shown in FIG. 5. There are 2 non-directional modes, namely DC (DM0) and planar (DM1), and 33 directional modes (DM2 … DM34). In general, DM0 and DM1 are well suited to simple CUs. Similar to HEVC, SHVC first checks the 35 directional modes in rough mode decision (RMD) to obtain their Hadamard costs (HC) and selects the N DMs with the smallest HC, then checks these DMs in the rate-distortion optimization (RDO) process and selects the directional mode with the smallest RDO value as the best DM; this process yields the optimal coding efficiency. However, checking many unnecessary directional modes wastes encoding time; in particular, the RMD process always checks all 35 directional modes, which is very time consuming. Large CUs in the enhancement layer (EL) are usually very simple if they use the Intra mode; small CUs in the EL show little texture variation because of their small size, so they are usually simple as well. A simple CU may therefore have special DM characteristics, and studying these DM characteristics in the EL helps to increase the coding speed. The probabilities with which different DMs are selected as the best DM may differ; these probabilities are obtained first, and the distribution of the DMs is then studied by grouping DMs with similar probabilities.

A CU has 35 directional modes (DMs) in total, namely DM0, DM1, DM2, …, DM34. The probability that a CU selects each directional mode is calculated, and all directional modes are classified according to the calculated probabilities into three classes.

The probability is calculated as:

$$p_i = \frac{n_i}{m}$$

where n_i is the number of CUs that select DM_i as the best DM, and m is the number of all CUs.

The directional modes are classified into class 0, class 1 and class 2. Class 0 includes DM0 and DM1. Class 1 includes DM8, DM9, DM10, DM11, DM12, DM24, DM25, DM26, DM27 and DM28. Class 2 includes DM2, DM3, DM4, DM5, DM6, DM7, DM13, DM14, DM15, DM16, DM17, DM18, DM19, DM20, DM21, DM22, DM23, DM29, DM30, DM31, DM32, DM33 and DM34.

Class 1 is divided into two groups: DM8, DM9, DM10, DM11 and DM12 form the horizontal directional mode group, and DM24, DM25, DM26, DM27 and DM28 form the vertical directional mode group. Class 2 is divided into four groups: DM2, DM3, DM4, DM5, DM6 and DM7 form the first directional mode group; DM13, DM14, DM15, DM16, DM17 and DM18 form the second; DM19, DM20, DM21, DM22 and DM23 form the third; and DM29, DM30, DM31, DM32, DM33 and DM34 form the fourth.
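The grouping above can be written down as a small lookup table. The following is a minimal sketch only; the names `DM_GROUPS` and `classify`, and the label "non-directional" for the class 0 group, are illustrative choices and do not appear in the source.

```python
# Grouping of the 35 intra directional modes (DM0..DM34) as listed in the text.
# Keys are (class number, group label); the labels are illustrative.
DM_GROUPS = {
    (0, "non-directional"): [0, 1],
    (1, "horizontal"): list(range(8, 13)),   # DM8..DM12
    (1, "vertical"): list(range(24, 29)),    # DM24..DM28
    (2, "first"): list(range(2, 8)),         # DM2..DM7
    (2, "second"): list(range(13, 19)),      # DM13..DM18
    (2, "third"): list(range(19, 24)),       # DM19..DM23
    (2, "fourth"): list(range(29, 35)),      # DM29..DM34
}


def classify(dm):
    """Return (class number, group label) for a directional mode index 0..34."""
    for key, members in DM_GROUPS.items():
        if dm in members:
            return key
    raise ValueError(f"DM index out of range: {dm}")


# Sanity check: every mode belongs to exactly one group.
all_modes = sorted(m for members in DM_GROUPS.values() for m in members)
assert all_modes == list(range(35))

print(classify(10), classify(26), classify(20))
```

The check at the end confirms that the three classes partition all 35 modes: 2 in class 0, 10 in class 1 and 23 in class 2.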

In the process of intra prediction of the mode of the current CU, significant-difference prediction is performed on the class 0 directional modes. The specific process is as follows. The HC of DM_i is denoted HC_i, and min() denotes the smaller of two HC values. DM0 and DM1 in class 0 are non-directional modes; in class 1, DM10 is the horizontal direction and DM26 is the vertical direction. If HC0 and HC1 are significantly smaller than HC10 and HC26, the optimal DM lies in class 0. Therefore DM0 and DM1 in class 0 and DM10 and DM26 in class 1 are checked, and whether the optimal DM lies in class 0 is determined from the difference between min(HC0, HC1) and min(HC10, HC26). During directional mode prediction, the selection of DM may be terminated early at any time.

As shown in fig. 6, a significant-difference-prediction descending-direction search is performed for the class 1 directional modes. Since DM10 is the horizontal DM and DM26 is the vertical DM, if HC10 is significantly smaller than HC26, the optimal DM is likely to be in the horizontal directional mode group; conversely, it is likely to be in the vertical directional mode group. First, DM10 and DM26 in class 1 are checked, and the likely directional mode group is predicted from the difference between HC10 and HC26. After the likely group is obtained, the two immediate neighbors of DM10 or DM26 are further checked according to that group, and the best DM is then searched for within the group according to the Hadamard costs. Since a DM in class 0 or class 1 has a high probability of being the best DM, if a DM and its two immediate neighbors have been checked and its HC is the smallest among all checked DMs, this DM is likely to be the best DM (LBD). To obtain the LBD as soon as possible, the search proceeds in the direction of decreasing Hadamard cost. For example, if the horizontal group is the likely directional mode group, the two immediate neighbors of DM10, namely DM9 and DM11, are further checked. Depending on the combination of HC9, HC10 and HC11, there are three cases, as shown in fig. 6, and the LBD is searched for along the arrows in each case.

Preferably, the three cases include:

(1) If HC10 is the smallest of all checked Hadamard costs, DM10 is the LBD and DM selection is terminated early.

(2) If the Hadamard cost decreases on both the left and the right of DM10, DM8 and DM12 are further checked along the decreasing directions to determine whether DM9 or DM11 is the LBD; if not, DM7 and DM13 in class 2 are further checked to determine whether DM8 or DM12 is the LBD.

(3) If the Hadamard costs of the three directional modes are monotonic, the LBD is searched for along the decreasing direction. If HC9 > HC10 > HC11, DM12 is further checked to determine whether DM11 is the LBD; if not, DM13 in class 2 is further checked to determine whether DM12 is the LBD. If HC11 > HC10 > HC9, DM8 is further checked to determine whether DM9 is the LBD; if not, DM7 in class 2 is further checked to determine whether DM8 is the LBD.
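The three cases can be sketched as a single routine that walks outward from the center DM only on the side or sides where the Hadamard cost decreases. This is an illustrative sketch rather than the encoder's implementation: the function name `descending_search`, the `cost` callback and the `reach` parameter are assumptions, and plain comparisons stand in for the significance test described elsewhere in the text.

```python
def descending_search(cost, center, reach=2):
    """Look for a likely best DM (LBD) around `center` (DM10 or DM26).

    cost:  hypothetical callable mapping a DM index to its Hadamard cost.
    reach: how far the class 1 group extends on each side of `center`.
    Returns (lbd or None, dict of checked costs).
    """
    hc = {dm: cost(dm) for dm in (center - 1, center, center + 1)}

    def is_lbd(dm):
        # Both immediate neighbours checked and dm has the smallest HC so far.
        return dm - 1 in hc and dm + 1 in hc and hc[dm] == min(hc.values())

    if is_lbd(center):  # case (1)
        return center, hc

    # Sides on which the cost decreases away from the centre:
    # both sides in case (2), exactly one side in case (3).
    sides = [s for s in (-1, +1) if hc[center + s] < hc[center]]
    for dist in range(1, reach + 1):
        for s in sides:
            nxt = center + s * (dist + 1)  # e.g. DM8/DM12, then DM7/DM13
            hc[nxt] = cost(nxt)
        for s in sides:
            cand = center + s * dist       # e.g. DM9/DM11, then DM8/DM12
            if is_lbd(cand):
                return cand, hc
    return None, hc  # no LBD in this group; continue with the class 2 search


# Hypothetical Hadamard costs with HC9 > HC10 > HC11, i.e. case (3).
demo_hc = {9: 50, 10: 40, 11: 30, 12: 35}
lbd, checked = descending_search(lambda dm: demo_hc[dm], center=10)
print(lbd, sorted(checked))
```

With `center=10` and `reach=2` the routine checks at most DM7 through DM13, which matches the modes named in the three cases; passing `center=26` gives the symmetric search in the vertical group.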

To decide whether DM8 or DM12 in class 1 is the LBD, DM7 or DM13 in class 2 must be checked. DM7 or DM13 is checked through the significant-difference-prediction descending-direction search; if an LBD is obtained, it is regarded as the best DM and DM selection is terminated.

A variable-step binary search is performed for the class 2 directional modes. Specifically, the best DM in class 2 is searched for using DM7 and DM13, or DM23 and DM29, as the starting DMs. For example, with DM7 in class 2 (already checked) as the starting DM, DM6 is checked using a step size of 1 (the distance between 7 and 6 is 1); starting from DM6, DM2 is checked using a step size of 4 (the distance between 6 and 2 is 4). If a DM in class 2 has the smallest HC among all checked DMs, a binary search is used to find the best DM. More specifically, let mDM be the DM with the smallest HC among all checked DMs; the midpoint between mDM and its nearest checked neighbor on the left (lDM) is checked, then the midpoint between mDM and its nearest checked neighbor on the right (rDM) is checked, and finally the DM with the smallest HC among all checked DMs is selected. This process is repeated until the selected DM is an LBD. For example, if DM4 has the smallest HC among all checked DMs and its nearest checked neighbors on the left and right are DM2 and DM6, then the midpoint between DM2 and DM4, namely DM3, and the midpoint between DM4 and DM6, namely DM5, are further checked, and the DM with the smallest HC among all checked DMs is selected. If that DM is an LBD, DM selection may be terminated early; otherwise, the process is repeated. An example of the variable-step check and binary search is shown in fig. 7.
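The midpoint refinement described above can be sketched as follows. This is a minimal illustration under stated assumptions: `binary_refine` and `cost` are hypothetical names, the Hadamard costs in the example are invented for demonstration, and the loop simply stops once no unchecked midpoint remains next to the current minimum.

```python
def binary_refine(cost, checked):
    """Refine a set of already checked DMs by repeated midpoint checks.

    cost:    hypothetical callable mapping a DM index to its Hadamard cost.
    checked: DM indices checked so far (e.g. by the variable-step pass).
    Returns (DM with the smallest HC, dict of all checked costs).
    """
    hc = {dm: cost(dm) for dm in sorted(checked)}
    while True:
        m_dm = min(hc, key=hc.get)                              # mDM
        left = max((d for d in hc if d < m_dm), default=None)   # lDM
        right = min((d for d in hc if d > m_dm), default=None)  # rDM
        midpoints = []
        if left is not None and m_dm - left > 1:
            midpoints.append((left + m_dm) // 2)
        if right is not None and right - m_dm > 1:
            midpoints.append((m_dm + right) // 2)
        if not midpoints:
            # Every checked neighbour of mDM is adjacent to it: stop here.
            return m_dm, hc
        for dm in midpoints:
            hc[dm] = cost(dm)


# Hypothetical costs; DM7, DM6 and DM2 come from the variable-step pass.
demo_hc = {2: 48, 3: 15, 4: 1, 5: 7, 6: 32, 7: 78}
best, seen = binary_refine(lambda dm: demo_hc[dm], [7, 6, 2])
print(best, list(seen))
```

In this run DM6 is the initial minimum, so DM4 (the midpoint of DM2 and DM6) is checked next; DM4 then becomes the minimum and DM3 and DM5 are checked, mirroring the DM4 example in the text.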

Specifically, the process of intra-predicting the mode of the current CU includes:

step 7: detecting the DMs of category 2 other than DM10 and DM26; if an LMD exists among DM9, DM10, DM11, DM25, DM26 and DM27, that DM is the optimal DM and step 10 is executed, otherwise step 8 is executed;

step 8: detecting DM2, DM6, DM14, DM18, DM22, DM30 and DM34; if the DM with the minimum Hadamard cost is not among DM2, DM6, DM8, DM12, DM14, DM18, DM22, DM24, DM28, DM30 and DM34, step 10 is executed directly, otherwise the optimal DM is obtained by binary search and step 9 is executed;

step 10: the DM selection terminates.

To determine whether there is a significant difference between the HCs of two DMs, the corresponding residual coefficients are examined. Let R₁ and R₂ be the residuals of the two DMs; their difference is:

$$R = R_1 - R_2$$

Applying the Hadamard transform, the above equation can be rewritten as:

$$HRH = HR_1H - HR_2H$$

where H denotes the Hadamard matrix.

According to the Cauchy inequality, the expression of HRH can be rewritten as:

where m is the side length of the current CU. Then:

where x_{i,j} is the value of HRH at position (i, j), calculated as:

$$x_{i,j} = \sum_{k=1}^{m} \sum_{p=1}^{m} h_{ik}\, r_{kp}\, h_{pj}$$

where h_{ik} denotes the Hadamard value at position (i, k), h_{pj} denotes the Hadamard value at position (p, j), and r_{kp} denotes the value of R at position (k, p).
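The element-wise definition of the HRH values and the linearity of the transform can be checked numerically. The sketch below assumes a Sylvester-ordered Hadamard matrix (the ordering used by a particular encoder may differ), and the residual values and the names `hadamard` and `hrh` are invented for illustration.

```python
def hadamard(m):
    """Sylvester construction of an m x m Hadamard matrix (m a power of two)."""
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of two")
    h = [[1]]
    while len(h) < m:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def hrh(r):
    """x[i][j] = sum over k and p of h[i][k] * r[k][p] * h[p][j]."""
    m = len(r)
    h = hadamard(m)
    return [[sum(h[i][k] * r[k][p] * h[p][j]
                 for k in range(m) for p in range(m))
             for j in range(m)] for i in range(m)]


# Hypothetical 4x4 residuals of two directional modes.
r1 = [[5, 3, 0, 1], [2, 4, 1, 0], [0, 1, 3, 2], [1, 0, 2, 6]]
r2 = [[4, 3, 1, 1], [2, 2, 1, 0], [1, 1, 3, 1], [1, 0, 1, 5]]
r = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(r1, r2)]

x, x1, x2 = hrh(r), hrh(r1), hrh(r2)
# Linearity: H(R1 - R2)H equals H R1 H - H R2 H element by element.
assert x == [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(x1, x2)]
```

Because the transform is linear, the difference between the transformed residuals of two DMs can be studied through the single matrix R, which is what the derivation relies on.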

If any quantized value in HRH is less than k, there is no significant difference between R₁ and R₂. The following condition should be satisfied:

where k is a parameter and Q_step is the quantization step size, which can be obtained from the quantization parameter.

From the formula for R, the formula for HRH, the Cauchy inequality and the above condition, the following can be obtained:

The condition under which HC₁ and HC₂ have no significant difference is:

The condition under which HC₁ and HC₂ have a significant difference is:

where HC₁ and HC₂ denote the Hadamard costs of the two DMs. If the first condition is satisfied, the two costs are not significantly different; otherwise, they are significantly different. In the latter case, if HC₁ < HC₂, then HC₁ is significantly smaller than HC₂, and vice versa.

To obtain the most suitable value of k, the above conditions were tested and the corresponding BDBR was obtained; the results are shown in Table 4.

TABLE 4 Different k values and corresponding BDBRs

As can be seen from Table 4, there is a turning point at k = 5: when k is greater than or equal to 5, the corresponding BDBR of every sequence is less than 0.1%. This means that good performance can be obtained when k is 5. If k is increased further, the additional gain in coding speed becomes smaller. Therefore, k is set to 5.

The specific content in step S4 is: after the current CU is coded, obtaining a final residual coefficient map of the current CU; respectively obtaining expectation and variance of the left half part and the right half part of the residual coefficient graph, and judging whether the two parts have significance difference according to a hypothesis testing method; and respectively obtaining the expectation and the variance of the upper half part and the lower half part of the residual coefficient graph, and judging whether the two parts have significant difference according to a hypothesis testing method.

μ₁ and σ₁² are the expectation and variance of one of the parts (here, the upper or left half), Z is the residual coefficient of the other part, and z₁, z₂, …, z_n are its samples. Whether Z also satisfies μ₂ and σ₂² can be tested by the following formula:

where α is the significance level and n is the number of residual coefficients in each part. The corresponding threshold can be obtained by consulting the Gaussian distribution table. If |γ_i(0, k) − γ_{i−1}(0, k)| ≤ 0.01 is satisfied, the residual coefficients of the two parts have the same expectation and variance. In that case there is no significant difference between the two parts, and the current CU can be terminated early.

At depth 1 and depth 2, the values of e_α are as follows:

If neither the left and right parts nor the upper and lower parts differ significantly, the division is terminated early.
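The split-and-compare idea can be sketched as below. The test statistic and the e_α thresholds are not reproduced in this text, so the sketch substitutes a plain z-test on the mean and an illustrative threshold of 1.96 (the two-sided 5% point of the standard Gaussian); both are assumptions, as are all function names. Only the splitting into halves and the estimation of expectation and variance follow the description directly.

```python
import math


def mean_var(samples):
    """Maximum-likelihood estimates of expectation and variance (Gaussian)."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((z - mu) ** 2 for z in samples) / n
    return mu, var


def halves(residual):
    """Split a square residual map into (left, right) and (upper, lower)."""
    half = len(residual) // 2
    left = [v for row in residual for v in row[:half]]
    right = [v for row in residual for v in row[half:]]
    upper = [v for row in residual[:half] for v in row]
    lower = [v for row in residual[half:] for v in row]
    return (left, right), (upper, lower)


def differs(part_a, part_b, threshold):
    """Assumed z-test: is part_b's mean inconsistent with part_a's statistics?"""
    mu1, var1 = mean_var(part_a)
    mu_b, _ = mean_var(part_b)
    if var1 == 0:
        return mu_b != mu1
    z = abs(mu_b - mu1) / math.sqrt(var1 / len(part_b))
    return z > threshold


def terminate_early(residual, threshold=1.96):
    """True if neither left/right nor upper/lower halves differ significantly."""
    (left, right), (upper, lower) = halves(residual)
    return (not differs(left, right, threshold)
            and not differs(upper, lower, threshold))


flat = [[0] * 4 for _ in range(4)]
edge = [[0, 0, 10, 10] for _ in range(4)]
print(terminate_early(flat), terminate_early(edge))
```

A residual map with a strong vertical edge makes the left and right halves differ, so the CU would continue to be divided, whereas a flat map allows early termination.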

To verify the performance of the proposed fast intra prediction algorithm for spatial SHVC, the reference software SHM 11.0 was used, and tests were run on a server with an Intel(R) 2.0 GHz processor and 30 GB of memory. The training sequences and the test sequences do not overlap, which ensures the generality of the algorithm. The performance of the algorithm was evaluated in terms of both coding efficiency and coding speed. Coding efficiency covers bit rate and visual quality and is expressed as BDBR, which is the bit rate difference at the same PSNR relative to the reference software in the EL. Coding speed is expressed as TS, the percentage of encoding run time saved in the EL only.

To verify the performance of the proposed algorithm, all the proposed strategies are integrated into it. Its performance is compared with that of the PAPS, EETBS and FIICA algorithms. All algorithms are tested on the same computing platform. Since there are two settings each for the scalability ratio and the QP set, their combinations give four cases in the EL: case1 uses a scalability ratio of 1.5 and the QP set (22, 26, 30, 34); case2 uses a scalability ratio of 1.5 and the QP set (24, 28, 32, 36); case3 uses a scalability ratio of 2 and the QP set (22, 26, 30, 34); and case4 uses a scalability ratio of 2 and the QP set (24, 28, 32, 36). Table 5 (case1), Table 6 (case2), Table 7 (case3) and Table 8 (case4) list the overall performance comparisons in terms of coding efficiency and coding speed, respectively.

TABLE 5 Case1 Performance comparison

TABLE 6 case2 Performance comparison

TABLE 7 case3 Performance comparison

TABLE 8 case4 Performance comparison

In Table 5 (case1), the average BDBRs of the proposed algorithm, PAPS, EETBS and FIICA are 0.02%, 0.30%, 0.20% and 0.06%, respectively, and the average TSs of the four algorithms are 79.66%, 67.03%, 55.34% and 47.15%, respectively. In this test, the BDBR of the proposed algorithm is smaller than that of the other three algorithms, and its coding speed is significantly higher.

In Table 6 (case2), the average BDBRs of the proposed algorithm, PAPS, EETBS and FIICA are -0.14%, 0.38%, -0.20% and -0.18%, respectively, and the average TSs of the four algorithms are 81.26%, 65.85%, 56.30% and 45.75%, respectively. In this test, the BDBR of the proposed algorithm is smaller than that of PAPS and slightly larger than those of EETBS and FIICA, and its coding speed is significantly higher than that of the other three algorithms.

In Table 7 (case3), the average BDBRs of the proposed algorithm, PAPS, EETBS and FIICA are 0.94%, 0.62%, 0.35% and 0.38%, respectively, and the average TSs of the four algorithms are 76.34%, 68.30%, 54.49% and 42.22%, respectively. In this test, the BDBR of the proposed algorithm is larger than that of the other three algorithms, and its coding speed is significantly higher.

In Table 8 (case4), the average BDBR for the proposed algorithm, PAPS, EETBS and FIICA is 0.68%, 0.31% and 0.40%, respectively. The average TSs of the four algorithms are 78.02%, 66.67%, 54.11% and 43.25%, respectively. In this test, the BDBR of the proposed algorithm is smaller than that of PAPS and slightly larger than those of EETBS and FIICA, and its coding speed is significantly higher than that of the other three algorithms.

To clearly demonstrate the performance of the algorithms presented herein, table 9 provides a comparison of the overall average performance of these four algorithms in all four cases.

TABLE 9 Overall average Performance comparison of different algorithms

The overall average BDBR for the proposed algorithm, PAPS, EETBS and FIICA was 0.38%, 0.49%, 0.17% and 0.16%, respectively. The total average TS for the four algorithms was 78.82%, 66.96%, 55.06%, and 44.59%, respectively. Therefore, the encoding speed of the algorithm is obviously faster than that of the other three algorithms. Meanwhile, the BDBR of the algorithm is smaller than the PAPS algorithm and larger than the EETBS algorithm and the FIICA algorithm.
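The overall TS figures can be cross-checked against the per-case averages quoted for Tables 5 to 8. The short script below is only an arithmetic check; it assumes the overall value is the unweighted mean of the four cases, and the variable names are illustrative.

```python
# Per-case average time savings (TS, %) as quoted in the text for case1..case4.
ts = {
    "proposed": [79.66, 81.26, 76.34, 78.02],
    "PAPS": [67.03, 65.85, 68.30, 66.67],
    "EETBS": [55.34, 56.30, 54.49, 54.11],
    "FIICA": [47.15, 45.75, 42.22, 43.25],
}
# Overall averages as quoted for Table 9.
reported = {"proposed": 78.82, "PAPS": 66.96, "EETBS": 55.06, "FIICA": 44.59}


def mean(values):
    return sum(values) / len(values)


overall = {name: mean(values) for name, values in ts.items()}
for name, value in overall.items():
    # Agreement to within the rounding of the two-decimal figures.
    assert abs(value - reported[name]) < 0.01, name
    print(f"{name}: {value:.4f} (reported {reported[name]})")
```

All four reported overall TS values agree with the unweighted means of the per-case values to within rounding.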

The embodiments described above further illustrate the objects, technical solutions and advantages of the present invention. It should be understood that they are only preferred embodiments of the present invention and should not be construed as limiting it; any modification, equivalent substitution, improvement and the like made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for SHVC spatial scalable video coding based on distribution characteristics, the method comprising:

S1: acquiring the depth of a current coding unit (CU); if the depth of the current CU is 1 or 2, judging whether to skip the current depth according to the residual coefficients of the enhancement layer ILR mode; if so, executing step S5, otherwise executing step S2;

the process of judging whether to skip the current depth according to the residual coefficient comprises the following steps: coding a current coding unit CU to obtain a residual coefficient map of an enhancement layer ILR mode; dividing the residual coefficient graph to obtain a first residual coefficient graph and a second residual coefficient graph; respectively calculating the expectation and variance of the first residual coefficient map and the second residual coefficient map, judging whether the expectation and variance of the first residual coefficient map are different from the expectation and variance of the second residual coefficient map, if the expectation and variance of the first residual coefficient map are different from the expectation and variance of the second residual coefficient map, skipping the current depth, otherwise, not skipping the current depth;

the process of judging whether the ILR mode of the current CU is the optimal mode by adopting the GMM-EM method comprises the following steps: saving coding modes and rate distortion costs of each depth CU of a previous frame and a current frame; obtaining the coding modes and the rate distortion costs of adjacent CUs of the previous frame and the current frame of each CU according to the coding modes and the rate distortion costs of the CUs of the previous frame and the current frame; adopting rec0 to store the rate-distortion cost belonging to the ILR mode, and adopting rec1 to store the rate-distortion cost of the Intra mode; after the current CU finishes the ILR mode, obtaining the rate distortion cost of the ILR mode adopted by the current CU according to the rate distortion cost stored by rec0 and rec1; performing GMM conversion on the rate distortion cost of the ILR mode of the current CU to obtain the probability based on rate distortion; obtaining the probability of the current CU based on the quantity according to the coding mode of the adjacent CU; predicting the probability that the current CU adopts the ILR mode according to the probability based on rate distortion and the probability based on quantity; judging whether the ILR mode of the current CU is the optimal mode or not according to the probability of the ILR mode;

S3: performing intra-frame prediction on the mode of the current CU to obtain the optimal mode of the current CU;

2. The SHVC spatial scalable video coding method according to claim 1, wherein the process of calculating the expectation and variance of the residual coefficient map comprises: obtaining residual coefficient samples of the divided residual coefficient graphs by subjecting each coefficient in the residual coefficient graphs to Gaussian distribution; and obtaining the probability density function and the corresponding likelihood function of the residual coefficient sample by adopting a maximum likelihood estimation algorithm according to the residual coefficient sample, and obtaining the expectation and the variance of the segmented residual coefficient graph according to the probability density function and the likelihood function.

3. The SHVC spatial scalable video coding method according to claim 1, wherein the determining whether the expectation and variance of the first residual coefficient map are different from the expectation and variance of the second residual coefficient map comprises: inputting the expectation and the variance of the first residual coefficient graph into a judgment condition to obtain a first judgment result; inputting the expectation and the variance of the second residual coefficient map into a judgment condition to obtain a second judgment result; and comparing the first judgment result with the second judgment result to obtain a judgment result.

4. The SHVC spatial scalable video coding method based on distribution characteristics as claimed in claim 3, wherein the determining conditions are:

wherein,

5. The method of claim 1, wherein intra-predicting the mode of the current CU comprises: predicting the mode of the current CU by a method based on the directional mode DM, a CU having 35 directional modes DM in total; the process of predicting the mode of the CU includes:

step 1: selecting DM0, DM1, DM10 and DM26 in a directional mode DM to carry out Hadamard transformation, selecting smaller Hadamard cost HC1 in DM0 and DM1 and smaller Hadamard cost HC2 in DM10 and DM26, judging the size of HC1 and HC2, if HC1 is smaller than HC2, DM0 and DM1 are optimal DM, executing step 10, otherwise executing step 2;

step 7: detecting the other DMs except DM10 and DM26; if an LMD exists among DM9, DM10, DM11, DM25, DM26 and DM27, that DM is the optimal DM and step 10 is executed, otherwise step 8 is executed;

step 9: checking the midpoint between the DM having the minimum Hadamard cost and its nearest checked neighboring DM on the left or right, and selecting the DM having the minimum Hadamard cost; repeating the process until the DM is an LMD, which is the optimal DM, and executing step 10;

step 10: the DM selection terminates.

6. The method of claim 1, wherein the determining whether the current CU continues to be partitioned according to the residual coefficients of the best mode is the same as the determining whether to skip the current depth according to the residual coefficients of the enhancement layer ILR mode.