WO2008064577A1

WO2008064577A1 - A method and an apparatus for adjusting quantization quality in encoder and decoder

Info

Publication number: WO2008064577A1
Application number: PCT/CN2007/003799
Authority: WO
Inventors: Wei Li; Lijing Xu; Qing Zhang; Jianfeng Xu; Shenghu Sang; Zhengzhong Du; Yao Zou; Peilin Liu
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2006-12-01
Filing date: 2007-12-26
Publication date: 2008-06-05
Also published as: CN101192410A; EP2104095A1; WO2008064577A8; EP2104095A4; CN101192410B

Abstract

A method for adjusting quantization quality in an encoder comprises: adjusting first sample values to be encoded using two or more scaling factors, and quantizing the adjusted first sample values to obtain quantized sample values; removing the influence of the scaling factors from the quantized sample values to obtain second sample values, and obtaining a global gain from the first sample values and the second sample values; and encoding the quantized sample values, the information of the two or more scaling factors and the global gain into a bit stream. A method for adjusting quantization quality in a decoder and apparatuses for adjusting quantization quality in an encoder and a decoder are also provided. The scheme can greatly reduce implementation complexity, better adjust the quantization quality of important parts, and obtain a better encoding effect.

Description

Technical field

The present invention relates to coding techniques, and more particularly to a method and apparatus for adjusting quantization quality in codecs.

Background technique

With the development of communication technologies and the expansion of multimedia services, the encoding of digital audio, video and other media requires not only higher coding efficiency and real-time performance, but also a wider coding bandwidth. For digital audio coding, the technologies that can currently deliver high-quality audio at low bit rates mainly include AAC+, EAAC+ and AMR-WB+. Among them, AAC+ and EAAC+ are extensions of high-rate audio encoders, while AMR-WB+ is a hybrid coding method formed by extending low-rate speech coding.

In conventional audio coding, in order to better exploit the characteristics of the human auditory system, the sampled values are generally time-frequency transformed, the spectral coefficients are then weighted and quantized according to auditory characteristics, and the quantized spectral coefficients are finally entropy coded for transmission. The main distortion in coding results from the quantization of the various parameters. Therefore, to adapt to different needs, the encoder must adjust the quantization quality according to the specified bit rate. In a high bit rate coding scheme, for example above 24 kbps, a good encoder achieves transparent sound quality, that is, the human ear cannot detect the noise introduced by quantization. In a low bit rate coding scheme, the shortage of bits makes full transparency impossible, so the encoder can only pursue the smallest possible subjective distortion.

A commonly used technique for adjusting the quantization quality is to use a scaling factor or gain. The coefficients to be encoded are first divided by the scaling factor or multiplied by the gain, and the scaled coefficients are then quantized. The most suitable scaling factor makes the quantization error as small as possible while satisfying the bit rate requirement. Therefore, when the bit rate is relatively high, a smaller scaling factor is selected, so that the dynamic range of the quantized coefficients is relatively large and the quantization is relatively fine; when the bit rate is relatively low, a larger scaling factor is selected, so that the dynamic range of the quantized coefficients is relatively small and the quantization is relatively coarse.
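The effect of the scaling factor on quantization fineness can be illustrated with a minimal sketch. It assumes a plain uniform scalar quantizer (rounding to the nearest integer), which is only a stand-in for whatever quantizer a real codec uses; the function names and sample values are hypothetical.

```python
def quantize(coeffs, scale):
    """Divide each coefficient by the scaling factor, then round to an integer level."""
    return [round(c / scale) for c in coeffs]

def dequantize(levels, scale):
    """Undo the scaling: multiply each integer level by the scaling factor."""
    return [q * scale for q in levels]

def max_error(coeffs, scale):
    """Largest absolute reconstruction error for a given scaling factor."""
    restored = dequantize(quantize(coeffs, scale), scale)
    return max(abs(c - r) for c, r in zip(coeffs, restored))

coeffs = [0.9, -2.3, 4.1, 7.6]      # hypothetical coefficients
fine = quantize(coeffs, 0.5)        # small scaling factor: wide range of levels
coarse = quantize(coeffs, 4.0)      # large scaling factor: narrow range of levels
```

With the smaller scaling factor the integer levels span a wider range and the reconstruction error is lower, at the cost of more bits to represent the levels.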

Figure 1 shows a schematic block diagram of the MPEG1-LAYER3 audio coding algorithm. In this algorithm, before the time-frequency transform, the entire coding frequency band is equally divided into 32 sub-bands, each sub-band is assigned a scaling factor, and a global scaling factor is assigned to the entire frequency band. A closed-loop search algorithm adjusts the global scaling factor so that the number of quantization bits is within the range allowed by the current bit rate, while adjusting the scaling factors of the sub-bands so that the quantization noise stays below the masking threshold of the human ear as far as possible, that is, the human ear does not perceive the quantization noise. The quantized coefficients are finally Huffman coded and transmitted.

The sub-band multi-scaling factor coding method in the MPEG1-LAYER3 coding algorithm has the following defects:

(1) Subband division requires 32 subband analysis filter banks, and the computational complexity is high;

(2) The scaling factor of each sub-band needs to be quantized and encoded, and the number of occupied bits is too large, which is not suitable for low-rate encoding.

Figure 2 shows the flow chart of the Transform Coded Excitation (TCX) section of the AMR-WB+ audio coding algorithm. AMR-WB+ audio coding uses a single global scaling factor. With only one scaling factor, a specific frequency segment cannot be fine-tuned, and under low bit rate coding requirements the frequency domain samples with less energy in the spectrum are lost in vector quantization. Since the sensitivity of the human auditory system differs between frequency bands, it is desirable that the smaller frequency domain samples in the important frequency bands can still be quantized during encoding. AMR-WB+ audio coding therefore uses spectrum pre-shaping and spectrum inverse shaping. In the TCX part of the AMR-WB+ audio coding algorithm, spectrum pre-shaping is applied to the more important frequency bands of the spectrum to increase the energy of these bands, and the same global scaling factor is then used for the entire frequency band.

Since the human auditory system has a high frequency resolution at low frequencies, the so-called important frequency band refers to the low frequency band. Spectrum pre-shaping in AMR-WB+ audio coding proceeds as follows. First, for the first quarter of the spectrum, taking every 8 frequency domain samples as a block, the energy $E_m$ of each block is calculated, where $m$ is the block index. Then the largest block energy $E_{max}$ is found, and a ratio based on $E_{max}/E_m$ is calculated for each block. The amplification factor $G_m$ of each block is then obtained from this ratio, such that the amplification factors $G_m$ decrease monotonically from block to block. Finally, the frequency domain samples of each block are multiplied by the amplification factor of the corresponding block. In AMR-WB+ audio coding, the amplification factors calculated in spectrum pre-shaping are not transmitted in the encoded code stream. Instead, in spectrum inverse shaping, the amplification factor $G_m$ of each block of frequency domain samples is recalculated in the same way as in pre-shaping, and the recovered frequency domain samples are obtained by dividing the frequency domain samples of each block by the amplification factor of the corresponding block.

In the process of implementing the present invention, the inventors found that the global scaling factor algorithm of the existing AMR-WB+ audio coding algorithm TCX part has at least the following defects:

(1) Since only one scaling factor is used for the full band, the quantization quality can only be adjusted over the entire frequency band, and some important frequency segments cannot be emphasized;

(2) Although the spectrum pre-shaping and spectrum inverse shaping techniques enhance the quantization quality at low frequencies, they sacrifice the quantization quality in the remaining frequency bands;

(3) The spectrum pre-shaping and inverse shaping techniques can only be applied to a frequency band of small bandwidth; otherwise the global scaling factor increases significantly and the overall quantization quality is reduced;

(4) Since the amplification factors used for pre-shaping in the encoding stage are not recorded in the encoded stream, the quantization error propagates into the amplification factors recalculated during inverse shaping, producing an error accumulation effect.

Summary of the invention

Embodiments of the present invention provide a method for adjusting quantization quality in coding, which reduces implementation complexity.

Embodiments of the present invention provide a method for adjusting quantization quality in decoding, which can ensure quantization quality.

Embodiments of the present invention provide an apparatus for adjusting quantization quality in encoding, which reduces implementation complexity.

Embodiments of the present invention provide an apparatus for adjusting quantization quality in decoding, which can ensure quantization quality.

An embodiment of the present invention provides a method for adjusting quantization quality in coding, where the method includes: adjusting a first sample value to be encoded by using two or more scaling factors, and quantizing the adjusted first sample value to obtain a quantized sample value; removing the influence of the scaling factors from the obtained quantized sample value to obtain a second sample value, and obtaining a global gain by using the first sample value and the second sample value; and outputting the obtained quantized sample value, the information of the two or more scaling factors, and the obtained global gain as an encoded stream.

An embodiment of the present invention provides a method for adjusting quantization quality in decoding, in which an encoded stream output by an encoding end is decoded to obtain a decoded stream, where the method includes: acquiring a quantized sample value, information of two or more scaling factors, and a global gain from the decoded stream; and using the information of the two or more scaling factors to remove the influence of the scaling factors from the quantized sample value to obtain a sample value, which is then multiplied by the global gain.

An embodiment of the present invention provides an apparatus for adjusting quantization quality in coding, where the apparatus includes: a multi-scaling factor control unit, a quantization unit, a gain balancing unit, and a global gain calculation unit. The multi-scaling factor control unit is configured to receive a first sample value, set two or more scaling factors for the first sample value, adjust the first sample value by using the scaling factors, and output the adjusted first sample value to the quantization unit. The quantization unit is configured to quantize the received adjusted first sample value to obtain a quantized sample value and output it to the gain balancing unit. The gain balancing unit is configured to receive the quantized sample value, remove the influence of the scaling factors from the quantized sample value to obtain a second sample value, and output the second sample value to the global gain calculation unit. The global gain calculation unit is configured to receive the first sample value and the second sample value, and obtain the global gain by using the first sample value and the second sample value.

An embodiment of the present invention provides an apparatus for adjusting quantization quality in decoding, where the apparatus includes: a gain balancing unit and a global gain balancing unit. The gain balancing unit is configured to receive a quantized sample value and scaling factors, use the received scaling factors to remove the influence of the scaling factors from the quantized sample value to obtain a sample value, and output the sample value to the global gain balancing unit. The global gain balancing unit is configured to receive the global gain and the sample value, multiply the sample value by the global gain, and output the result.

The method and apparatus for adjusting quantization quality according to the embodiments of the present invention differ from the prior-art scheme that uses a filter bank: the sampled values are directly divided into a plurality of parts and a scaling factor is set for each part, which greatly reduces implementation complexity. Moreover, unlike the prior-art scheme that uses a single global scaling factor, multiple scaling factors are used, so the quantization quality of important parts can be better adjusted and a better coding effect can be obtained.

DRAWINGS

FIG. 1 is a schematic block diagram of the MPEG1-LAYER3 audio coding algorithm in the prior art;

FIG. 2 is a flowchart of the TCX portion of the AMR-WB+ audio coding algorithm in the prior art;

FIG. 3 is a schematic block diagram of an encoder for adjusting quantization quality according to Embodiment 1 of the present invention;

FIG. 4 is a schematic block diagram of a decoder for adjusting quantization quality according to Embodiment 1 of the present invention;

FIG. 5 is a flowchart of adjusting the quantization quality by multiple scaling factors at the encoding end;

FIG. 6 is a flowchart of selecting a plurality of scaling factors and fine-tuning the frequency domain samples of the entire frequency band according to Embodiment 1 of the present invention;

FIG. 7 is a flowchart of adjusting the quantization quality by multiple scaling factors at the decoding end according to Embodiment 1 of the present invention;

FIG. 8 is a schematic block diagram of an encoder for adjusting quantization quality according to Embodiment 2 of the present invention;

FIG. 9 is a schematic block diagram of a decoder for adjusting quantization quality according to Embodiment 2 of the present invention;

FIG. 10 is a schematic diagram of spectrum pre-shaping in Embodiment 2 of the present invention;

FIG. 11 is a schematic diagram of spectrum inverse shaping in Embodiment 2 of the present invention;

FIG. 12 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 3 of the present invention;

FIG. 13 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 3 of the present invention;

FIG. 14 is a structural diagram of an apparatus for adjusting quantization quality at the encoding end according to Embodiment 4 of the present invention;

FIG. 15 is a structural diagram of an apparatus for adjusting quantization quality at the decoding end according to Embodiment 4 of the present invention.

Detailed description

In order to make the objects, technical solutions and advantages of the present invention more comprehensible, the present invention will be further described in detail.

The main idea of adjusting quantization quality provided by the embodiments of the present invention is to use multiple scaling factors, or to further use spectral shaping techniques, to adjust the quantization quality in the encoding process. The following mainly describes an encoding process in which the sampled values are time-frequency transformed. Of course, the embodiments of the present invention can also be applied when no time-frequency transform is performed on the sampled values during encoding.

Embodiment 1

Embodiment 1 provides a method of adjusting the quantization quality by multiple scaling factors.

FIG. 3 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 1. In the encoding process, time domain sample values are first converted into the frequency domain by a time-frequency transform, then adjusted by multi-scaling factor control and quantized to output quantized sample values. The quantized sample values then pass through gain balance and an inverse time-frequency transform, and the optimal global gain is calculated. The encoded stream needs to carry the scaling factors, the quantized frequency domain sample values, and the global gain.

FIG. 4 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 1. In the decoding process, the quantized frequency domain sample values pass through gain balance and an inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.

The specific steps of adjusting the quantization quality by the multi-scaling factor at the encoding end in Embodiment 1 are given below. As shown in FIG. 5, the following steps are included:

Step 501: Convert the time domain sample values into frequency domain sample values $X(k)$ by a time-frequency transform. Here, a time-frequency transform such as the discrete Fourier transform (DFT), the discrete cosine transform (DCT, MDCT, IDCT), or the discrete wavelet transform (DWT) may be employed. A fast Fourier transform (FFT) can also be used in the time-frequency transform process to keep the computational complexity low.

Step 502: Perform multi-scaling factor control on the frequency domain sample values, specifically, select suitable multiple scaling factors to fine-tune the frequency domain sample values of the entire frequency band.

In this embodiment, it is assumed that the frequency domain samples of the entire frequency band are $X(k)$, $k = 0, 1, \dots, N$, that $m$ scaling factors are used, and that the maximum number of bits allowed in the encoding process is $b_{max}$. Next, in conjunction with the flowchart shown in FIG. 6, the steps of selecting appropriate scaling factors and fine-tuning the frequency domain sample values are described in detail.

Step 601: Divide the entire frequency band into $m$ parts, obtaining $m$ parts of frequency domain sample values $X(0, 1, \dots, n_1)$, $X(n_1+1, n_1+2, \dots, n_2)$, ..., $X(n_{m-1}+1, n_{m-1}+2, \dots, N)$, and denote the scaling factors of the parts by $g_1, g_2, \dots, g_m$.

In the embodiment of the present invention, the multiple scaling factors are assigned directly on the entire frequency band after the time-frequency transform. It is not necessary to first divide the frequency band into several segments through a filter bank and then set a scaling factor in each segment, so the implementation complexity can be greatly reduced compared with the prior art.

Step 602: Select a reference value for estimating the $m$ scaling factors. The reference value is selected such that the estimated number of consumed bits is less than the maximum allowed number of bits. In this embodiment, the estimated number of consumed bits $b$ depends on the frequency domain sample values $X$, the number of frequency domain sample values $N$, and the scaling factor $g$, and can be expressed as a function $b = cons(X, N, g)$. Therefore, in step 602, the reference value of the scaling factor is selected as $g_0$, for which the estimated number of consumed bits is $b_0 = cons(X, N, g_0)$ and satisfies $b_0 < b_{max}$.

Step 603: Adjust the $m$ scaling factors in the vicinity of $g_0$.

In step 603, the $m$ scaling factors can be adjusted by reducing the scaling factors of the more important frequency bands and increasing the scaling factors of the less important frequency bands. Here, the more important frequency band refers to the low frequency band, and the less important frequency band refers to the high frequency band. Since $g_1$ to $g_m$ correspond to the frequency bands from low to high, the adjusted $m$ scaling factors increase progressively. Through this adjustment, the quantization quality of the more important frequency bands is relatively high and that of the less important frequency bands is relatively low, so that the quantization quality over the entire frequency band is optimized.

Step 604: Determine whether the estimated number of consumed bits under the adjusted $m$ scaling factors stays within the maximum allowed number of bits. If not, return to step 603 and adjust the scaling factors again. If so, denote the $m$ scaling factors that satisfy the bit constraint as $g_1, g_2, \dots, g_m$.

Step 605: Calculate the quantized perceptual distortion based on the adjusted $m$ scaling factors $g_1, g_2, \dots, g_m$.

In this embodiment, the quantized perceptual distortion $c$ depends on the frequency domain sample values $X$ and the $m$ scaling factors, and can be expressed as a function $c = f(X, g_1, g_2, \dots, g_m)$. The value of $c$ indicates the distortion caused by the difference between the original frequency domain sample values $X$ and the sample values obtained by adjusting $X$ with the $m$ scaling factors $g_1, g_2, \dots, g_m$. In step 605, the quantized perceptual distortion $c$ is calculated from the adjusted $m$ scaling factors $g_1, g_2, \dots, g_m$.

Step 606: Determine whether the quantized perceptual distortion is within an imperceptible range. If so, the $m$ scaling factors obtained by the current adjustment are taken as the optimal scaling factors, denoted $g_{1opt}, g_{2opt}, \dots, g_{mopt}$, and step 607 is then performed; otherwise, return to step 603.

If the perceptual distortion is within the imperceptible range, a person cannot perceive the quantization noise introduced by the encoder. For example, in audio coding the human ear cannot perceive the quantization noise introduced by the encoder, and in video coding the human eye cannot perceive it. Here, the imperceptible range is a specific range of allowable distortion values. Specifically, whether the quantized perceptual distortion is imperceptible is determined by checking whether the value calculated in step 605 is within the range of allowable distortion: if so, the quantized perceptual distortion is considered imperceptible; otherwise, it is considered perceptible.

In this embodiment, if the judgment of step 606 finds that the quantized perceptual distortion is perceptible, and it is still perceptible after the above adjustment steps have been repeated $M$ times, the closed-loop selection ends. Among the sets of scaling factors obtained in the repeated adjustments, the set that minimizes the perceptual distortion is selected as the optimal scaling factors, and step 607 is then performed. In practical applications, the number of closed-loop iterations $M$ can be determined according to actual conditions.
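The closed-loop selection of steps 602 to 606 can be sketched roughly as below. The text does not define the bit estimate or the perceptual distortion measure, so `estimate_bits` and `perceptual_distortion` are hypothetical stand-ins, and the linear "tilt" used to lower the low-band factors and raise the high-band factors is likewise only one assumed adjustment rule. As a further simplification, the fallback set may be the untilted reference set if no tilted candidate fits the bit budget.

```python
import math

def estimate_bits(parts, gains):
    # Hypothetical stand-in for the bit estimate: roughly log2(2|q| + 1)
    # bits for each quantized level q = round(x / g).
    return sum(math.log2(2 * abs(round(x / g)) + 1)
               for part, g in zip(parts, gains) for x in part)

def perceptual_distortion(parts, gains):
    # Hypothetical stand-in for the distortion measure: squared quantization
    # error, weighted more heavily in the low (more important) bands.
    m = len(parts)
    return sum((m - i) * (x - g * round(x / g)) ** 2
               for i, (part, g) in enumerate(zip(parts, gains)) for x in part)

def select_scaling_factors(parts, b_max, c_max, max_iter=8):
    m = len(parts)  # two or more parts are assumed
    # Step 602: grow a common reference value g0 until the bit estimate fits.
    g0 = 1.0
    while estimate_bits(parts, [g0] * m) >= b_max:
        g0 *= 1.25
    best_gains = [g0] * m
    best_c = perceptual_distortion(parts, best_gains)
    for it in range(1, max_iter + 1):
        # Step 603: tilt around g0, smaller for low bands, larger for high bands.
        tilt = 0.9 * it / max_iter
        gains = [g0 * (1 + tilt * (2 * i / (m - 1) - 1)) for i in range(m)]
        # Step 604: reject candidates that do not fit the bit budget.
        if estimate_bits(parts, gains) >= b_max:
            continue
        # Step 605: evaluate the distortion of this candidate.
        c = perceptual_distortion(parts, gains)
        if c < best_c:
            best_c, best_gains = c, gains
        # Step 606: stop as soon as the distortion is within the allowed range.
        if c <= c_max:
            return gains
    # After max_iter attempts, fall back to the least-distorted set seen.
    return best_gains

parts = [[8.0, 6.0], [4.0, 3.0], [2.0, 1.0]]   # hypothetical band parts, low to high
gains = select_scaling_factors(parts, b_max=12.0, c_max=0.5)
```

Whatever set is returned respects the bit budget and is non-decreasing from the low band to the high band, which matches the ordering described for step 603.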

Step 607: Fine-tune the frequency domain sample values $X$ by using the obtained $m$ optimal scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$, that is, divide the frequency domain sample values of each block by the optimal scaling factor of the corresponding block to obtain the fine-tuned spectrum $Y$. The specific expression is as follows:

$$Y = \left[ \frac{X(0, 1, \dots, n_1)}{g_{1opt}},\ \frac{X(n_1+1, n_1+2, \dots, n_2)}{g_{2opt}},\ \dots,\ \frac{X(n_{m-1}+1, n_{m-1}+2, \dots, N)}{g_{mopt}} \right]$$

The fine-tuned frequency domain sample values obtained in steps 601 to 607 above are sent to the encoder.
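As a small illustration of steps 601 and 607, the sketch below splits a spectrum at given boundary indices and divides each part by its scaling factor. The helper names, the boundary list and the numeric values are hypothetical; only the split-then-divide structure comes from the text.

```python
def split_band(x, boundaries):
    """Split x into m parts; boundaries holds the last index n_1, ..., n_(m-1) of each part but the final one."""
    edges = [0] + [n + 1 for n in boundaries] + [len(x)]
    return [x[a:b] for a, b in zip(edges, edges[1:])]

def apply_scaling(parts, gains):
    """Divide every sample of each part by the scaling factor of that part."""
    return [[v / g for v in part] for part, g in zip(parts, gains)]

x = [8.0, 6.0, 4.0, 3.0, 2.0, 1.0]        # hypothetical frequency domain samples
parts = split_band(x, [1, 3])             # three parts: indices 0-1, 2-3, 4-5
y = apply_scaling(parts, [0.5, 1.0, 2.0]) # smaller factor for the low band
```

Here the low-band samples are enlarged before quantization and the high-band samples are shrunk, so a fixed quantizer resolves the low band more finely.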

Since the scaling factors are required to recover the data during decoding, they need to be transmitted in the encoded code stream. The scaling factors can be transmitted in several ways, as described below.

Mode 1 for transmitting the scaling factors: all $m$ scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ used to fine-tune the frequency sample values are encoded, so that the data can be recovered more accurately during decoding.

Mode 2 for transmitting the scaling factors: among the $m$ scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ used to fine-tune the frequency sample values, one scaling factor is selected as the reference scaling factor, the ratios of the remaining $m-1$ scaling factors to the reference scaling factor are calculated, and the $m-1$ ratios are encoded. For example, with $g_{1opt}$ as the reference scaling factor, only $g_{2opt}/g_{1opt}, \dots, g_{mopt}/g_{1opt}$ need to be encoded. In this way, the number of bits consumed can be reduced.

Mode 3 for transmitting the scaling factors: among the $m$ scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ used to fine-tune the frequency sample values, one scaling factor is selected as the reference scaling factor, the ratios of the remaining $m-1$ scaling factors to the reference scaling factor are calculated, and the reference scaling factor and the $m-1$ ratios are encoded. For example, with $g_{1opt}$ as the reference scaling factor, $g_{1opt}$ and $g_{2opt}/g_{1opt}, \dots, g_{mopt}/g_{1opt}$ need to be encoded. In this way, not only can the number of consumed bits be reduced, but the data can also be recovered more accurately, because the decoding end can calculate $g_{2opt}, \dots, g_{mopt}$ from $g_{1opt}$ and the ratios.

In order to use a plurality of scaling factors without occupying a large number of bits, the number of scaling factors can be selected according to the coding rate and the required quantization quality. For example, in low bit rate coding, 2 to 3 scaling factors can be selected.
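The ratio-based transmission can be sketched as follows, assuming the first scaling factor is the reference. The function names are hypothetical, and the actual quantization and entropy coding of the transmitted values is left out.

```python
def to_reference_and_ratios(gains):
    """Express the scaling factors as a reference (the first one) plus m - 1 ratios."""
    ref = gains[0]
    return ref, [g / ref for g in gains[1:]]

def from_reference_and_ratios(ref, ratios):
    """Decoder side: rebuild all m scaling factors from the reference and the ratios."""
    return [ref] + [ref * r for r in ratios]

gains = [0.5, 1.0, 2.0]                       # hypothetical optimal scaling factors
ref, ratios = to_reference_and_ratios(gains)
mode2_payload = ratios                        # only the m - 1 ratios are sent
mode3_payload = (ref, ratios)                 # reference plus the m - 1 ratios
rebuilt = from_reference_and_ratios(*mode3_payload)
```

When only the ratios are sent, the decoder knows the scaling factors up to the common factor `ref`; in this scheme such a common factor can be absorbed by the global gain that is computed and transmitted separately.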

Step 503: Quantize the frequency domain sample values obtained by the multi-scaling factor control, and output the quantized frequency domain sample values $X_q$.

In step 503, different quantization methods may be used according to the coding requirements, for example, multi-level vector quantization, split vector quantization, tree quantization, lattice vector quantization, and the like.

Step 504: Remove the influence of the scaling factors from the quantized frequency sample values obtained in step 503 in order to restore the original frequency domain sample values, that is, perform gain balance on the quantized frequency sample values $X_q$ to obtain $X_{balance}$. The gain balancing method depends on how the scaling factors are transmitted in step 502.

If the scaling factors are transmitted in Mode 1 or Mode 3, gain balance can be performed by using the multiple scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ selected in step 502. Specifically, the quantized frequency sample values are divided into $m$ parts by the same frequency band division as in step 601, giving $X_q(0, 1, \dots, n_1)$, $X_q(n_1+1, n_1+2, \dots, n_2)$, ..., $X_q(n_{m-1}+1, n_{m-1}+2, \dots, N)$, and the quantized frequency sample values of each part are multiplied by the scaling factor of the corresponding part. The specific expression is as follows:

$$X_{balance} = \left[ g_{1opt} \cdot X_q(0, 1, \dots, n_1),\ g_{2opt} \cdot X_q(n_1+1, n_1+2, \dots, n_2),\ \dots,\ g_{mopt} \cdot X_q(n_{m-1}+1, n_{m-1}+2, \dots, N) \right]$$

If the scaling factors are transmitted in Mode 2, gain balance can be performed by using the ratios of the scaling factors. Specifically, the quantized frequency sample values are again divided into $m$ parts by the frequency band division of step 601, giving $X_q(0, 1, \dots, n_1)$, $X_q(n_1+1, n_1+2, \dots, n_2)$, ..., $X_q(n_{m-1}+1, n_{m-1}+2, \dots, N)$. The quantized frequency sample values of the part corresponding to the reference scaling factor are multiplied by 1, and those of each remaining part are multiplied by the ratio of the scaling factor of that part to the reference scaling factor. Assuming the scaling factor $g_{1opt}$ of the first part is the reference scaling factor, the gain balance is expressed as follows:

$$X_{balance} = \left[ X_q(0, 1, \dots, n_1),\ \frac{g_{2opt}}{g_{1opt}} \cdot X_q(n_1+1, n_1+2, \dots, n_2),\ \dots,\ \frac{g_{mopt}}{g_{1opt}} \cdot X_q(n_{m-1}+1, n_{m-1}+2, \dots, N) \right]$$
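The two gain balance variants can be compared with a short sketch. The function names and the numeric values are hypothetical; the parts are assumed to be already split as in step 601.

```python
def gain_balance(q_parts, gains):
    """Multiply each quantized part by its own scaling factor (all factors known)."""
    return [[g * q for q in part] for part, g in zip(q_parts, gains)]

def gain_balance_from_ratios(q_parts, ratios):
    """Only ratios known: the reference part is multiplied by 1, the others by their ratio."""
    return gain_balance(q_parts, [1.0] + list(ratios))

q_parts = [[16, 12], [4, 3], [1, 0]]          # hypothetical quantized parts
full = gain_balance(q_parts, [0.5, 1.0, 2.0])
relative = gain_balance_from_ratios(q_parts, [2.0, 4.0])
```

The ratio-only result differs from the full result by the constant factor 1 / 0.5 in every band, so the relative shape of the spectrum is the same in both cases.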

Step 505: Perform an inverse time-frequency transform on the $X_{balance}$ obtained after gain balance, converting the restored frequency domain sample values into the restored time domain sample values $x_q(n)$.

Step 506: Calculate the optimal global gain $g_{gopt}$ by using the original time domain sample values $x(n)$ and the restored time domain sample values $x_q(n)$.

Here, the global gain that minimizes the mean square error between the original time domain sample values and the restored time domain sample values can be used as the optimal global gain. That is, the optimal global gain $g_{gopt}$ minimizes $\sum_n [x(n) - g_g \cdot x_q(n)]^2$. This gives the optimal global gain as:

$$g_{gopt} = \frac{\sum_n x(n)\, x_q(n)}{\sum_n x_q(n)^2}$$

The optimal global gain $g_{gopt}$ also needs to be encoded and transmitted for data recovery at the decoder.

The above is the process of adjusting the quantization quality by multiple scaling factors at the encoding end. Corresponding to the quantization quality adjustment performed in the encoding process, the decoding end needs to recover the time domain sample values from the quantized frequency sample values obtained after decoding, following the flow shown in FIG. 7. The specific process includes the following steps:

Step 701: Perform gain balance on the quantized frequency sample values by using the scaling factors obtained from the encoded stream. The specific implementation is the same as the method described in step 504, and its description is omitted here. Note that the gain balancing method depends on how the scaling factors are transmitted, and the gain balancing mode at the decoding end must be consistent with that at the encoding end.

Step 702: Perform inverse time-frequency transform on the frequency domain sample value obtained after the gain balance, and obtain a time domain sample value.

Step 703: The time domain sample value is multiplied by the global gain obtained from the encoded stream to obtain a recovered time domain sample value.
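The least-squares global gain of step 506 and its use in step 703 can be sketched as follows. Setting the derivative of the squared error with respect to the gain to zero gives the ratio of the cross-correlation to the energy of the restored samples. The function names and sample values are hypothetical, and the zero-energy guard is an added assumption.

```python
def optimal_global_gain(x, x_q):
    """Gain g minimizing sum((x[n] - g * x_q[n]) ** 2)."""
    num = sum(a * b for a, b in zip(x, x_q))
    den = sum(b * b for b in x_q)
    return num / den if den else 0.0   # guard for an all-zero restored signal

def squared_error(x, x_q, g):
    """Squared error between the original samples and the gain-scaled restored samples."""
    return sum((a - g * b) ** 2 for a, b in zip(x, x_q))

def apply_global_gain(x_q, g):
    """Step 703: multiply the time domain samples by the transmitted global gain."""
    return [g * v for v in x_q]

x = [2.0, 4.0, 6.0]        # hypothetical original time domain samples
x_q = [1.0, 2.0, 3.0]      # hypothetical restored samples before the global gain
g = optimal_global_gain(x, x_q)
restored = apply_global_gain(x_q, g)
```

In this toy case the restored samples are an exact scaled copy of the originals, so the gain removes the mismatch completely; in general it only minimizes the remaining squared error.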

The multi-scaling factor control technique used in Embodiment 1 can also be applied directly to sample values in the time domain, that is, to the case where there is no time-frequency transform; correspondingly, there is no inverse time-frequency transform when calculating the global gain. In this case, when setting the multiple scaling factors, the time domain sample values can be divided by time period, and when adjusting them, the scaling factors of the important time periods can be reduced and those of the unimportant time periods increased.

Embodiment 2

Embodiment 2 provides a method of adjusting the quantization quality by multiple scaling factors and spectral shaping.

FIG. 8 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 2. In the encoding process, time domain sample values are first converted into the frequency domain by a time-frequency transform, then processed by spectrum pre-shaping and multi-scaling factor control, and quantized to output quantized sample values. The quantized sample values then pass through gain balance, spectrum inverse shaping, and an inverse time-frequency transform, and the optimal global gain is calculated. The encoded stream needs to carry the scaling factors, the quantized frequency domain sample values, and the global gain.

FIG. 9 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 2. In the decoding process, the quantized frequency domain sample values pass through gain balance, spectrum inverse shaping, and an inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.

In Embodiment 2, the specific steps of adjusting the quantization quality by multiple scaling factors and spectral shaping build on the flow shown in FIG. 5 in Embodiment 1: a spectrum pre-shaping step is added between the time-frequency transform of step 501 and the multi-scaling factor control of step 502, and a spectrum inverse shaping step is added between the gain balance of step 504 and the inverse time-frequency transform of step 505. The specific implementation of spectrum pre-shaping and spectrum inverse shaping is described in detail below.

Figure 10 shows a schematic diagram of spectrum pre-shaping, which can be implemented by the following steps.

Step 1001: Determine a spectrum shaping region, and in the spectrum shaping region of the frequency domain samples obtained in step 501, search for the peak set $\{p_m, m = 1, \dots, M\}$.

Here, the spectrum shaping region refers to the spectral region of the more important frequency band. For example, in audio data, since the human auditory system has a higher frequency resolution at low frequencies, the low frequency portion is considered the more important frequency band. Similarly, in video, image, and similar data, most of the information is concentrated at low frequencies, so the low frequency portion is again considered the more important frequency band. The spectrum shaping region can therefore be taken as the front part of the full frequency band, for example the first quarter.

Here, the peak value may be defined as a local maximum value in the amplitude of the shaped spectrum segment, if > X(j), V; € [ - Δ, / + Δ], /≠ j , then [ - Δ, + Δ] The local maximum of 2 Δ + 1 point, where the local area can be arbitrarily selected.
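The peak-marking rule of step 1001 can be sketched as follows. This is only an illustration under stated assumptions: the function name `mark_peaks`, the use of the absolute value as the amplitude, and the clipping of the window at the edges of the shaping region are choices made for the example and are not specified in the text.

```python
def mark_peaks(spectrum, region_end, delta):
    """Return the indices of local amplitude maxima in spectrum[:region_end].

    Index i is marked when its amplitude is strictly greater than that of
    every other index j in [i - delta, i + delta]. Assumption: the window is
    clipped at the edges of the shaping region.
    """
    amplitude = [abs(x) for x in spectrum[:region_end]]
    peaks = []
    for i, a in enumerate(amplitude):
        lo = max(0, i - delta)
        hi = min(len(amplitude), i + delta + 1)
        if all(a > amplitude[j] for j in range(lo, hi) if j != i):
            peaks.append(i)
    return peaks


# Hypothetical 32-point spectrum; the shaping region is its first quarter.
spectrum = [0.1, 0.9, 0.2, 0.1, 0.4, 0.3, -0.7, 0.2] + [0.05] * 24
peak_indices = mark_peaks(spectrum, len(spectrum) // 4, delta=1)
print(peak_indices)  # [1, 4, 6]
```

Because the decoder has to mark the same peaks, whatever region and window size are chosen here would have to be used identically on both sides.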

Step 1002: Calculate a reference value $p_{ref}$ for spectrum pre-shaping.

Here, the principle for selecting the reference value is to ensure that the reference value remains unchanged before and after spectrum shaping. In step 1002, the maximum peak in the peak set $\{p_m,\ m = 1, \dots, M\}$ may be used as the reference value $p_{ref}$; alternatively, the maximum local energy may be used as the reference value $p_{ref}$. Considering the influence of the quantization error, a characteristic parameter of a segment of data can also be used as the reference value $p_{ref}$, so that the quantization error does not strongly affect the reference value. Preferably, the reference value $p_{ref}$ can be selected as the energy, or the average energy, of the data points near the maximum peak in the peak set $\{p_m,\ m = 1, \dots, M\}$.

Step 1003: Calculate an amplification factor for each peak in the peak set $\{p_m,\ m = 1, \dots, M\}$: $r_m = c\,(p_{ref}/p_m)^{t}$, $t \in (0, 1)$, where appropriate values of the parameters $c$ and $t$ can be selected according to actual conditions.

Step 1004: Amplify the peaks using the calculated amplification factors. To ensure the invariance of the reference value $p_{ref}$, every peak point $p_m$ other than the peak point used to calculate the reference value $p_{ref}$ is multiplied by its corresponding amplification factor, so that the peak point obtained after amplification is $\hat{p}_m = r_m\,p_m$.
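Steps 1002 to 1004 can be sketched as follows, taking the maximum peak as the reference value (one of the options named above) and the amplification factor $r_m = c\,(p_{ref}/p_m)^t$. The function name `pre_shape` and the default values `c=1.0` and `t=0.5` are assumptions made for the example, and peaks are taken to be positive amplitudes.

```python
def pre_shape(peaks, c=1.0, t=0.5):
    """Amplify peak amplitudes towards the reference value.

    peaks: positive peak amplitudes p_m from the shaping region.
    Reference value: the maximum peak (an assumed choice among the options).
    Amplification factor: r_m = c * (p_ref / p_m) ** t with 0 < t < 1.
    The peak that supplies the reference value is left unchanged.
    """
    p_ref = max(peaks)
    ref_index = peaks.index(p_ref)
    shaped = []
    for m, p in enumerate(peaks):
        if m == ref_index:
            shaped.append(p)                      # keep p_ref invariant
        else:
            shaped.append(c * (p_ref / p) ** t * p)
    return shaped, p_ref


shaped, p_ref = pre_shape([0.9, 0.4, 0.1])
print(shaped)  # approximately [0.9, 0.6, 0.3]
```

With `c = 1` an amplified peak equals $p_{ref}^{t}\,p_m^{1-t}$, which never exceeds $p_{ref}$, so the maximum peak is still the maximum after shaping. A larger `c` could push a peak above the reference, in which case a different reference criterion would be needed.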

Considering that the human auditory system has a very high frequency resolution at low frequencies, the peak energy of the low-frequency portion is amplified so that the peaks can be captured by the quantizer. Therefore, in Embodiment 2, only the peaks at a small number of spectral points are amplified. In this embodiment, the spectrum pre-shaping technique may also be referred to as peak pre-shaping. With this peak pre-shaping technique, the global gain is increased only slightly, and the increase in quantization error caused by the increase in global gain is negligible. Of course, for a better spectrum shaping effect, the spectral points around a peak can also be amplified. For example, if a peak is the local maximum of $2\Delta + 1$ points, the $2\Delta$ points around the peak, or fewer than $2\Delta$ of them, can also be amplified by the corresponding amplification factor.

Through the above spectrum pre-shaping process, the peaks of the frequency-domain sample values in the important frequency band are increased, which reduces the quantization error at the smaller peaks of the important frequency band and reduces the probability that spectral peaks of the more important frequency band are lost in quantization.

In the encoder, in order to calculate the optimal global gain, it is also necessary to recover the time-domain sample values from the quantized frequency-domain sample values. If spectrum pre-shaping is used, then after the frequency-domain sample values are obtained by the gain balancing described in step 504, spectrum inverse shaping needs to be performed on them. The specific implementation process is shown in FIG. 11 and includes the following steps:

Step 1101: Mark the peak set $\{\hat{p}_m,\ m = 1, \dots, M\}$ of the frequency-domain sample values obtained in step 504 within the spectrum shaping region. The spectrum shaping region and the peak marking criterion used in the spectrum inverse shaping process should be the same as those used in the spectrum pre-shaping process.

Step 1102: Calculate a reference value $p_{ref}$ for spectrum inverse shaping. The criterion for calculating the reference value in the spectrum inverse shaping process should also be the same as in the spectrum pre-shaping process. For example, if the energy of the data points near the maximum peak in the peak set was used as the reference value in the spectrum pre-shaping process, then the energy of the data points near the maximum peak in the peak set $\{\hat{p}_m,\ m = 1, \dots, M\}$ should also be used as the reference value in the spectrum inverse shaping process.

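Step 1103: Calculate the reduction factor $r_m = c^{1/(1-t)}\,(p_{ref}/\hat{p}_m)^{t/(1-t)}$ for each peak in the peak set $\{\hat{p}_m,\ m = 1, \dots, M\}$, where $c$ and $t$ should be consistent with the parameters used in the spectrum pre-shaping process.

The calculation principle of the reduction factor $r_m$ in the spectrum inverse shaping process is as follows. In the spectrum pre-shaping process, the amplification factor is $r_m = c\,(p_{ref}/p_m)^{t}$, $t \in (0, 1)$. If a peak point of magnitude $p_m$ is amplified to $\hat{p}_m$, then $\hat{p}_m = r_m\,p_m = c\,p_{ref}^{t}\,p_m^{1-t}$, so that $p_m = \left(\hat{p}_m / (c\,p_{ref}^{t})\right)^{1/(1-t)}$. According to the invariance of the reference value $p_{ref}$, it follows that $r_m = \hat{p}_m / p_m = c^{1/(1-t)}\,(p_{ref}/\hat{p}_m)^{t/(1-t)}$.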

It follows from the above principle that the reduction factor can be calculated within the spectrum inverse shaping process itself, so the reference value for spectrum inverse shaping does not need to be transmitted in the encoded stream. The decoding end can likewise use the characteristics of its own sample values to calculate the reference value for spectrum inverse shaping according to the above principle, and then calculate the reduction factor of each corresponding peak, without consuming extra bits.

Step 1104: Reduce the peaks using the calculated reduction factors. In the spectrum inverse shaping process, the peaks that were amplified during the spectrum pre-shaping process should be reduced. If, in the spectrum pre-shaping process, the peak points other than the one used for the reference value were amplified, then in the spectrum inverse shaping process those same peak points must be reduced. That is, apart from the peak point used to calculate the reference value $p_{ref}$, each remaining peak point $\hat{p}_m$ is divided by its corresponding reduction factor, so that the peak point obtained after reduction is $p_m = \hat{p}_m / r_m$.
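Steps 1102 to 1104 can be sketched as follows. The closed form used for the reduction factor is derived here from the amplification rule $\hat{p}_m = c\,p_{ref}^{t}\,p_m^{1-t}$, which gives $r_m = \hat{p}_m / p_m = c^{1/(1-t)}\,(p_{ref}/\hat{p}_m)^{t/(1-t)}$. The function name `inverse_shape`, the choice of the maximum peak as the reference value, and the default parameters are assumptions made for the example.

```python
def inverse_shape(shaped_peaks, c=1.0, t=0.5):
    """Undo peak pre-shaping using only the shaped peaks themselves.

    The reference value is recomputed from the received peaks (assumed here
    to be the maximum peak), so nothing extra has to be transmitted.
    Reduction factor: r_m = c ** (1 / (1 - t)) * (p_ref / q) ** (t / (1 - t)),
    where q is the amplified peak.
    """
    p_ref = max(shaped_peaks)
    ref_index = shaped_peaks.index(p_ref)
    restored = []
    for m, q in enumerate(shaped_peaks):
        if m == ref_index:
            restored.append(q)                    # reference peak was never amplified
        else:
            r = c ** (1.0 / (1.0 - t)) * (p_ref / q) ** (t / (1.0 - t))
            restored.append(q / r)
    return restored


# Peaks 0.9, 0.4, 0.1 amplified with c = 1, t = 0.5 become 0.9, 0.6, 0.3.
print(inverse_shape([0.9, 0.6, 0.3]))  # approximately [0.9, 0.4, 0.1]
```

In a real codec the shaped peaks would carry quantization error, so the restored values would only approximate the originals; the sketch ignores that effect.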

After spectrum inverse shaping has been performed by the above steps, the inverse time-frequency transform of step 505 is applied to the frequency-domain sample values obtained after spectrum inverse shaping.

In Embodiment 2, since spectrum pre-shaping is performed between the time-frequency transform and the multi-scaling-factor control in the encoding process, spectrum inverse shaping must correspondingly be performed between gain balancing and the inverse time-frequency transform at the decoding end. The specific implementation is the same as the spectrum inverse shaping performed in the encoding process described above, and its description is omitted here.

In Embodiment 2 described above, spectrum pre-shaping is performed first, followed by multi-scaling-factor control. Alternatively, in the encoding process, multi-scaling-factor control may be performed first and spectrum pre-shaping afterwards. Correspondingly, when the sample values are restored in the encoding process and in the decoding process, spectrum inverse shaping is performed first and gain balancing afterwards. This case is not described in detail.

Example 3

Embodiment 3 provides a method of adjusting quantization quality by spectral shaping.

FIG. 12 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 3. In the encoding process, the time-domain sample values are first converted into the frequency domain by a time-frequency transform, then subjected to spectrum pre-shaping and quantized to output quantized sample values. The output quantized sample values are passed through spectrum inverse shaping and an inverse time-frequency transform in order to calculate the optimal global gain. The encoded stream needs to carry two parts: the quantized values of the frequency-domain sample values and the global gain.

FIG. 13 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 3. In the decoding process, the quantized frequency-domain sample values are passed through spectrum inverse shaping and an inverse time-frequency transform to obtain time-domain sample values, which are finally multiplied by the global gain to restore the time-domain sample values.

In Embodiment 3, the methods of spectrum pre-shaping and spectrum inverse shaping, and the technical effects obtained, are the same as in Embodiment 2 and are not described in detail here.

Example 4

Embodiment 4 provides an apparatus for adjusting the quantization quality.

Corresponding to the method described in Embodiment 2, FIG. 14 is a block diagram showing the structure of the apparatus for adjusting quantization quality at the encoding end in Embodiment 4. As shown in FIG. 14, the apparatus for adjusting quantization quality at the encoding end includes: a time-frequency transform unit, a spectrum pre-shaping unit, a multi-scaling-factor control unit, a quantization unit, a gain balancing unit, a spectrum inverse shaping unit, an inverse time-frequency transform unit, and a global gain calculation unit. The time-frequency transform unit receives the first sample value, performs a time-frequency transform on it, and outputs the result to the spectrum pre-shaping unit. The spectrum pre-shaping unit receives the first sample value output by the time-frequency transform unit, performs spectrum pre-shaping on it, and outputs it to the multi-scaling-factor control unit. The multi-scaling-factor control unit receives the first sample value, sets two or more scaling factors for it, adjusts the first sample value using the scaling factors, and outputs the adjusted first sample value to the quantization unit. The quantization unit quantizes the received first sample value to obtain a quantized sample value and outputs it to the gain balancing unit. The gain balancing unit receives the quantized sample value, removes the influence of the scaling factors from it to obtain a second sample value, and outputs the second sample value to the spectrum inverse shaping unit. The spectrum inverse shaping unit receives the second sample value output by the gain balancing unit, performs spectrum inverse shaping on it, and outputs the result to the inverse time-frequency transform unit. The inverse time-frequency transform unit receives the second sample value from the spectrum inverse shaping unit, performs an inverse time-frequency transform on it, and outputs the result to the global gain calculation unit. The global gain calculation unit receives the first sample value and the second sample value and uses them to obtain the global gain.
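The claims state that the global gain is chosen so that the mean square error between the first time-domain sample values and the second time-domain sample values multiplied by the global gain is minimal. Under that criterion the gain has the usual least-squares closed form $g = \sum_n x_n y_n / \sum_n y_n^2$, which the following sketch computes. The function name `optimal_global_gain` and the handling of an all-zero second signal are assumptions made for the example.

```python
def optimal_global_gain(first, second):
    """Least-squares gain g minimising sum((x - g * y) ** 2).

    first:  original time-domain sample values x_n.
    second: reconstructed time-domain sample values y_n.
    """
    numerator = sum(x * y for x, y in zip(first, second))
    denominator = sum(y * y for y in second)
    if denominator == 0.0:
        return 0.0  # assumption: an all-zero reconstruction gets gain 0
    return numerator / denominator


gain = optimal_global_gain([2.0, -4.0, 6.0], [1.0, -2.0, 3.0])
print(gain)  # 2.0
```

Setting the derivative of the squared error with respect to $g$ to zero gives this expression directly, so no iterative search is needed for the gain itself.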

The multi-scaling-factor control unit includes a scaling factor setting unit and a sample value adjusting unit. The scaling factor setting unit is configured to set two or more scaling factors for the first sample value and to output the set scaling factors to the sample value adjusting unit. The sample value adjusting unit is configured to receive the scaling factors and to adjust the first sample value using the scaling factors.
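The interplay of the sample value adjusting unit, the quantization unit, and the gain balancing unit can be sketched as follows. The sketch assumes that a sample is adjusted by dividing it by the scaling factor of its band, that quantization is rounding to the nearest integer, and that gain balancing multiplies by the same factor again; none of these details is fixed by this passage, and the function names and band layout are hypothetical.

```python
def adjust_and_quantize(samples, band_edges, scale_factors):
    """Divide each band by its scaling factor, then round (assumed quantizer)."""
    quantized = [0] * len(samples)
    for (start, end), sf in zip(band_edges, scale_factors):
        for i in range(start, end):
            quantized[i] = round(samples[i] / sf)
    return quantized


def gain_balance(quantized, band_edges, scale_factors):
    """Remove the influence of the scaling factors from the quantized values."""
    restored = [0.0] * len(quantized)
    for (start, end), sf in zip(band_edges, scale_factors):
        for i in range(start, end):
            restored[i] = quantized[i] * sf
    return restored


bands = [(0, 2), (2, 4)]        # two parts of the spectrum
factors = [0.5, 2.0]            # finer steps for the first (important) part
q = adjust_and_quantize([4.2, 3.1, 2.6, 0.9], bands, factors)
print(q)                                # [8, 6, 1, 0]
print(gain_balance(q, bands, factors))  # [4.0, 3.0, 2.0, 0.0]
```

Under these assumptions a smaller scaling factor gives a band a finer quantization step, which is why the important band keeps more detail in the example.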

The scaling factor setting unit includes: a reference value setting unit, a scaling factor adjusting unit, a consumed-bit estimating unit, and a perceptual distortion calculating unit. The reference value setting unit is configured to set a reference value of the scaling factor and to output it to the scaling factor adjusting unit. The scaling factor adjusting unit is configured to adjust the scaling factors according to the reference value and to output the result to the consumed-bit estimating unit and the perceptual distortion calculating unit. The consumed-bit estimating unit is configured to estimate the number of consumed bits according to the scaling factors, to determine whether the number of consumed bits is smaller than the total number of bits allowed by the encoding, and to send the determination result to the scaling factor adjusting unit. The perceptual distortion calculating unit is configured to calculate the perceptual distortion according to the scaling factors, to determine whether the perceptual distortion is within the imperceptible range, and to send the determination result to the scaling factor adjusting unit.
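The feedback loop among these four units can be sketched as follows. The bit estimate and the perceptual distortion measure below are placeholder models invented for the illustration, as are the step sizes 0.8 and 1.1, the function names, and the rule that every band is coarsened while the bit budget is exceeded. Only the control flow (adjust from a reference value, check the bit budget, check the distortion, repeat a limited number of times, otherwise keep the least-distorted candidate) follows the text. The bit budget is assumed to be positive.

```python
import math


def estimate_bits(bands, factors):
    """Hypothetical cost model: about log2(1 + |x| / sf) bits per sample."""
    return sum(math.log2(1.0 + abs(x) / sf)
               for band, sf in zip(bands, factors) for x in band)


def perceptual_distortion(bands, factors, weights):
    """Hypothetical model: weighted uniform-quantizer noise power sf^2 / 12."""
    return sum(w * len(band) * sf * sf / 12.0
               for band, sf, w in zip(bands, factors, weights))


def search_scale_factors(bands, weights, reference, bit_budget,
                         distortion_limit, max_rounds=8):
    important = max(range(len(weights)), key=lambda b: weights[b])
    factors = [reference] * len(bands)
    best = None
    for _ in range(max_rounds):
        # Lower the factor of the important band, raise the others.
        factors = [sf * 0.8 if b == important else sf * 1.1
                   for b, sf in enumerate(factors)]
        # Keep adjusting (here: coarsening every band) until the budget is met.
        while estimate_bits(bands, factors) >= bit_budget:
            factors = [sf * 1.1 for sf in factors]
        distortion = perceptual_distortion(bands, factors, weights)
        if best is None or distortion < best[0]:
            best = (distortion, factors)
        if distortion <= distortion_limit:
            return factors      # distortion judged imperceptible
    return best[1]              # fall back to the least-distorted candidate


bands = [[8.0, 6.0], [1.0, 0.5]]    # low band (important), high band
result = search_scale_factors(bands, [4.0, 1.0], 1.0, 8.0, 0.2)
```

Every candidate that can be returned has already passed the bit-budget check, and the important band always ends up with the smaller factor because its factor shrinks relative to the others in every round.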

The spectrum pre-shaping unit includes: a peak marking unit, a reference value calculating unit, an amplification factor calculating unit, and a pre-shaping unit. The peak marking unit is configured to receive the first sample value, to mark peaks among the sample values in the spectrum shaping region, and to output them to the reference value calculating unit. The reference value calculating unit is configured to calculate a reference value for spectrum pre-shaping using the peaks and to output it to the amplification factor calculating unit. The amplification factor calculating unit is configured to calculate, using the reference value, an amplification factor for each marked peak and to output it to the pre-shaping unit. The pre-shaping unit is configured to pre-shape the spectrum using the amplification factors.

The spectrum inverse shaping unit includes: a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit. The peak marking unit is configured to receive the sample value, to mark peaks among the sample values in the spectrum shaping region, and to output them to the reference value calculating unit. The reference value calculating unit is configured to calculate a reference value for spectrum inverse shaping using the peaks and to output it to the reduction factor calculating unit. The reduction factor calculating unit is configured to calculate, using the reference value, a reduction factor for each marked peak and to output it to the inverse shaping unit. The inverse shaping unit is configured to perform inverse shaping on the spectrum using the reduction factors.

Corresponding to the method described in Embodiment 2, FIG. 15 is a block diagram showing the structure of the apparatus for adjusting quantization quality at the decoding end in Embodiment 4. As shown in FIG. 15, the apparatus for adjusting quantization quality at the decoding end includes: a gain balancing unit, a spectrum inverse shaping unit, an inverse time-frequency transform unit, and a global gain balancing unit. The gain balancing unit is configured to receive the quantized sample value and the scaling factors, to use the received scaling factors to remove their influence from the quantized sample value to obtain a sample value, and to output the sample value to the spectrum inverse shaping unit. The spectrum inverse shaping unit receives the sample value output by the gain balancing unit, performs spectrum inverse shaping on it, and outputs it to the inverse time-frequency transform unit. The inverse time-frequency transform unit receives the sample value from the spectrum inverse shaping unit, performs an inverse time-frequency transform on it, and outputs it to the global gain balancing unit. The global gain balancing unit receives the global gain and the sample value, multiplies the sample value by the global gain, and outputs the result. The global gain balancing unit can be a multiplier.

The spectrum inverse shaping unit at the decoding end is the same as that at the encoding end and includes: a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit. The peak marking unit receives the sample value, marks peaks among the sample values in the spectrum shaping region, and outputs them to the reference value calculating unit. The reference value calculating unit is configured to calculate a reference value for spectrum inverse shaping using the peaks and to output it to the reduction factor calculating unit. The reduction factor calculating unit is configured to calculate a reduction factor for each marked peak using the reference value and to output it to the inverse shaping unit. The inverse shaping unit is configured to perform inverse shaping on the spectrum using the reduction factors.
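The chain of decoder units described above can be sketched as a single function. This is an illustration under stated assumptions: gain balancing is taken to be multiplication by the band's scaling factor, the inverse time-frequency transform is represented by a naive orthonormal inverse DCT (the discrete cosine transform is one of the transforms named in the claims), and the spectrum inverse shaping step is passed in as a function. All names are hypothetical.

```python
import math


def inverse_dct(coeffs):
    """Naive orthonormal inverse DCT-II, standing in for the inverse transform."""
    n = len(coeffs)
    out = []
    for i in range(n):
        value = math.sqrt(1.0 / n) * coeffs[0]
        for k in range(1, n):
            value += (math.sqrt(2.0 / n) * coeffs[k]
                      * math.cos(math.pi * (2 * i + 1) * k / (2 * n)))
        out.append(value)
    return out


def decode(quantized, band_edges, scale_factors, global_gain, inverse_shape):
    # Gain balancing unit: undo the scaling factor of each band.
    spectrum = [0.0] * len(quantized)
    for (start, end), sf in zip(band_edges, scale_factors):
        for i in range(start, end):
            spectrum[i] = quantized[i] * sf
    # Spectrum inverse shaping unit (supplied by the caller).
    spectrum = inverse_shape(spectrum)
    # Inverse time-frequency transform unit.
    samples = inverse_dct(spectrum)
    # Global gain balancing unit: a plain multiplier.
    return [global_gain * s for s in samples]


# Identity inverse shaping keeps the example small.
pcm = decode([4, 0, 0, 0], [(0, 2), (2, 4)], [0.5, 2.0], 0.5, lambda s: s)
print(pcm)  # four samples, each approximately 0.5
```

Passing the inverse shaping step as a parameter mirrors the remark that decoders for Embodiments 1 and 3 differ only in which units are present.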

Of course, corresponding to the methods described in Embodiments 1 and 3 above and to their specific implementations, apparatuses for adjusting quantization quality with different structures may be used. Since the functions of the units in such apparatuses have been described in detail above, they are not elaborated further.

The embodiments described above can be applied to various coding fields such as audio coding, video coding, and image coding.

Through the description of the above embodiments, those skilled in the art can clearly understand that the present invention can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented in hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the various embodiments of the present invention. The above are only a few specific embodiments of the present invention, but the present invention is not limited thereto, and any changes that can be conceived by those skilled in the art should fall within the protection scope of the present invention.

The above describes only process and method embodiments of the present invention and is not intended to limit the embodiments of the present invention. Any modifications, equivalents, improvements, and the like made within the spirit and principles of the embodiments of the present invention shall be included in the scope of protection of the embodiments of the present invention.

Claims

1. A method of adjusting quantization quality in encoding, characterized in that the method comprises:

adjusting a first sample value to be encoded by using two or more scaling factors, and then quantizing the adjusted first sample value to obtain a quantized sample value;

removing the influence of the scaling factors from the obtained quantized sample value to obtain a second sample value, and obtaining a global gain by using the first sample value and the second sample value; and

outputting the obtained quantized sample value, information on the two or more scaling factors, and the obtained global gain as an encoded stream.

2. The method of claim 1, wherein

the first sample value and the second sample value are sample values in the time domain;

before the first sample value is adjusted, the method further comprises: converting the first sample value in the time domain into a first sample value in the frequency domain;

the adjusting of the first sample value by using the scaling factors is: adjusting the first sample value in the frequency domain by using the scaling factors;

the quantizing of the adjusted first sample value to obtain a quantized sample value is: quantizing the adjusted first sample value in the frequency domain to obtain the quantized sample value;

the obtaining of the second sample value from the quantized sample value is: removing the influence of the scaling factors from the quantized sample value to obtain a second sample value in the frequency domain;

after the second sample value is obtained and before the global gain is obtained, the method further comprises: converting the second sample value in the frequency domain into a second sample value in the time domain; and

the obtaining of the global gain by using the first sample value and the second sample value is: obtaining the global gain by using the first sample value in the time domain and the second sample value in the time domain.

3. The method of claim 2, wherein

the converting of the first sample value in the time domain into the first sample value in the frequency domain is: converting the first sample value in the time domain into the first sample value in the frequency domain by a discrete Fourier transform, a fast Fourier transform, a discrete cosine transform, or a wavelet transform.

4. The method of claim 2, wherein

the two or more scaling factors are: two or more scaling factors set for the first sample value in the frequency domain.

5. The method of claim 4, wherein

the setting of two or more scaling factors for the first sample value in the frequency domain is: dividing the first sample value in the frequency domain into two or more parts, and setting a scaling factor for each part respectively.

6. The method of claim 5, wherein

the adjusting of the first sample value in the frequency domain by using the scaling factors is: adjusting the first sample value in the frequency domain of each part by using the scaling factor of the corresponding part.

7. The method of claim 6, wherein

the removing of the influence of the scaling factors from the obtained quantized sample value is: dividing the quantized sample value into two or more corresponding parts in the same manner as the first sample value in the frequency domain was divided, and using the scaling factor of each part to remove the influence of that scaling factor from the quantized sample value of the corresponding part.

8. The method of claim 7, wherein

the outputting of the information on the two or more scaling factors as an encoded stream is: outputting the two or more scaling factors as the encoded stream.

9. The method of claim 6, wherein

after the scaling factors are respectively set for each part, the method further comprises: selecting the scaling factor of one of the parts as a reference scaling factor, and calculating the ratio of the scaling factor of each remaining part to the reference scaling factor; and

the removing of the influence of the scaling factors from the obtained quantized sample value is: dividing the quantized sample value into two or more corresponding parts in the same manner as the first sample value in the frequency domain was divided, and using the obtained ratios to remove the influence of the scaling factor of the corresponding part from the quantized sample value of the corresponding part.

10. The method according to claim 9, wherein the outputting of the information on the two or more scaling factors as an encoded stream is: outputting the ratios of the scaling factors of the remaining parts to the reference scaling factor as the encoded stream.

11. The method of claim 9, wherein

the removing of the influence of the scaling factors from the obtained quantized sample value is: dividing the quantized sample value into two or more corresponding parts in the same manner as the first sample value in the frequency domain was divided, calculating the scaling factor of each part from the reference scaling factor and the obtained ratios, and using the scaling factor of each part to remove the influence of the scaling factor of the corresponding part from the quantized sample value of the corresponding part.

12. The method according to claim 11, wherein the outputting of the information on the two or more scaling factors as an encoded stream is: outputting the reference scaling factor and the ratios of the scaling factors of the remaining parts to the reference scaling factor as the encoded stream.

13. The method according to claim 6, wherein the setting of a scaling factor for each part is: adjusting the scaling factor of each part according to the number of consumed bits and the perceptual distortion to obtain an optimal scaling factor for each part.

14. The method of claim 13, wherein

the adjusting of the scaling factor of each part to obtain the optimal scaling factor is:

setting a reference value of the scaling factor, the reference value making the number of consumed bits smaller than the total number of bits allowed by the encoding;

adjusting the scaling factor of each part on the basis of the reference value;

determining whether the adjusted scaling factors make the number of consumed bits smaller than the total number of bits allowed by the encoding; if not, continuing the step of adjusting the scaling factor until the condition is satisfied; if so, calculating the perceptual distortion; and

determining whether the perceptual distortion is within the imperceptible range; if so, taking the scaling factors obtained by this adjustment as the optimal scaling factors; otherwise, returning to the step of adjusting the scaling factor and repeating that step and the subsequent steps.

15. The method according to claim 14, wherein the number of consumed bits is estimated according to the first sample values in the frequency domain, the number of first sample values in the frequency domain, and the scaling factors.

16. The method according to claim 14, wherein the perceptual distortion is obtained according to the first sample values in the frequency domain and the scaling factor of each part.

17. The method of claim 14, wherein

when the perceptual distortion is within the perceptible range, the step of adjusting the scaling factor and the subsequent steps are repeated a specified number of times; and if the perceptual distortion is still within the perceptible range after the specified number of repetitions, the scaling factor that minimizes the perceptual distortion is selected, from the scaling factors obtained during the repetitions, as the optimal scaling factor.

18. The method of claim 14, wherein

the adjusting of the scaling factor of each part on the basis of the reference value is: reducing the scaling factor of the important frequency band part below the reference value, and increasing the scaling factor of the unimportant frequency band part above the reference value.

19. The method of claim 18, wherein

the important frequency band is a low frequency band, and the unimportant frequency band is a high frequency band.

20. The method of claim 2, wherein

before the first sample value in the frequency domain is adjusted by using the scaling factors, the method further comprises: performing spectrum pre-shaping on the first sample value in the frequency domain; and

after the second sample value in the frequency domain is obtained by removing the influence of the scaling factors from the quantized sample value, and before it is converted into the second sample value in the time domain, the method further comprises: performing spectrum inverse shaping on the second sample value in the frequency domain.

21. The method of claim 2, wherein

after the first sample value in the frequency domain is adjusted by using the scaling factors and before the quantization is performed, the method further comprises: performing spectrum pre-shaping on the adjusted first sample value in the frequency domain; and

after the quantization and before the influence of the scaling factors is removed from the quantized sample value, the method further comprises: performing spectrum inverse shaping on the quantized sample value.

22. The method according to claim 20 or 21, wherein a spectrum shaping region is determined;

the performing of spectrum pre-shaping on the sample value is: performing spectrum pre-shaping on the sample values in the determined spectrum shaping region; and

the performing of spectrum inverse shaping on the sample value is: performing spectrum inverse shaping on the sample values in the determined spectrum shaping region.

23. The method according to claim 22, wherein the step of spectrum pre-shaping comprises:

marking the peaks of the sample values in the determined spectrum shaping region; calculating a reference value for spectrum pre-shaping using one of the marked peaks; calculating an amplification factor for each marked peak using the reference value; and

pre-shaping the spectrum using the calculated amplification factors.

24. The method of claim 23, wherein

the marking of the peaks of the sample values is: selecting one or more local regions in the spectrum shaping region, and in each local region selecting the sample value with the largest amplitude as the peak corresponding to that local region.

25. The method according to claim 24, wherein

the pre-shaping of the spectrum is: for each peak other than the peak used to calculate the reference value, pre-shaping the local region in which the peak is located by using the amplification factor of that peak.

26. The method of claim 25, wherein

the pre-shaping is: amplifying the peak by using the amplification factor, or amplifying the peak and the sample values in the local region in which the peak is located by using the amplification factor.

27. The method of claim 23, wherein

the calculating of the reference value is: selecting the maximum peak among the marked peaks, and obtaining the reference value by using the maximum peak.

28. The method of claim 27, wherein the reference value is: the magnitude of the maximum peak, or the energy of the sample values near the maximum peak, or the average energy of the sample values near the maximum peak.

29. The method of claim 23, wherein

the amplification factor of a peak is the ratio of the reference value to the peak, raised to the power of a first parameter and multiplied by a second parameter, wherein the first parameter is a number greater than 0 and less than 1, and the second parameter is an arbitrary number.

30. The method of claim 22, wherein the step of spectrum inverse shaping comprises:

marking the peaks of the sample values in the determined spectrum shaping region; calculating a reference value for spectrum inverse shaping using one of the marked peaks; calculating a reduction factor for each marked peak using the reference value; and

inversely shaping the spectrum using the calculated reduction factors.

31. The method of claim 2, wherein

the obtaining of the global gain by using the first sample value in the time domain and the second sample value in the time domain is: determining the global gain such that the mean square error between the first sample value in the time domain and the product of the second sample value in the time domain and the global gain is minimal.

32. A method for adjusting quantization quality in decoding, in which an encoded stream output by an encoder is decoded to obtain a decoded stream, wherein the method comprises:

obtaining a quantized sample value, information on two or more scaling factors, and a global gain from the decoded stream; and

using the information on the two or more scaling factors to remove the influence of the scaling factors from the quantized sample value to obtain a sample value, and then multiplying the sample value by the global gain.

33. The method of claim 32, wherein

The quantized sample value is a quantized sample value in the frequency domain;

And removing the effect of removing the scaling factor from the quantized sample value to obtain a sampling value: removing a sampling value of the frequency domain by removing the influence of the scaling factor from the quantized sampling value;

After removing the influence of the scaling factor from the quantized sample value to obtain the sampled value, before multiplying the global gain, the method further includes: converting the sampled value in the frequency domain into the sampled value in the time domain.

34. The method of claim 33, wherein

After removing the influence of the scaling factor from the quantized sample values in the frequency domain to obtain the sampled value in the frequency domain, before converting the sampled value in the frequency domain to the sampled value in the time domain, the method further includes: performing spectral inverse shaping on the sampled value in the frequency domain;

Alternatively, before removing the influence of the scaling factor from the quantized sample values in the frequency domain to obtain the sampled values in the frequency domain, the method further includes: performing spectral inverse shaping on the quantized sample values in the frequency domain.

35. The method according to any one of claims 32 to 34, wherein the information of the scaling factor obtained from the decoded stream is: all scaling factors;

The removing the influence of the scaling factor from the obtained quantized sample values is: dividing the quantized sample values into two or more corresponding parts in the same manner as the sampled values in the frequency domain were divided at the time of encoding, and using the scaling factor of each part to remove the influence of that scaling factor from the quantized sample values of the corresponding part.
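As a non-limiting illustration, the decoding steps of the preceding claim can be sketched as follows. The sketch assumes that the encoder applied each scaling factor by multiplication, so that its influence is removed by division, and it omits the optional inverse time-frequency transform and spectral inverse shaping. The names `decode_samples` and `band_edges`, and the representation of the parts as (start, end) index pairs, are assumptions made for this example.

```python
def decode_samples(quantized, band_edges, scale_factors, gain):
    """Remove per-part scaling factors, then apply the global gain.

    quantized:     quantized sample values from the decoded stream
    band_edges:    (start, end) index pairs, one per part, matching the
                   partition used at encoding time (hypothetical layout)
    scale_factors: one nonzero scaling factor per part
    gain:          global gain from the decoded stream
    """
    if len(band_edges) != len(scale_factors):
        raise ValueError("need exactly one scaling factor per part")
    output = []
    for (start, end), factor in zip(band_edges, scale_factors):
        for q in quantized[start:end]:
            # Assumption: scaling was multiplicative, so divide to undo it.
            output.append(gain * q / factor)
    return output
```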

36. The method according to any one of claims 32 to 34, wherein the information of the scaling factor obtained from the decoded stream is: the ratios of the remaining scaling factors to a reference scaling factor, one scaling factor being taken as the reference scaling factor;

The removing the influence of the scaling factor from the obtained quantized sample values is: dividing the quantized sample values into two or more corresponding parts in the same manner as the sampled values in the frequency domain were divided at the time of encoding, and using the obtained ratios to remove the influence of the scaling factor of each part from the quantized sample values of the corresponding part.

37. The method according to any one of claims 32 to 34, wherein the information of the scaling factor obtained from the decoded stream is: a reference scaling factor, one scaling factor being taken as the reference scaling factor, and the ratios of the remaining scaling factors to the reference scaling factor;

The removing the influence of the scaling factor from the obtained quantized sample values is: dividing the quantized sample values into two or more corresponding parts in the same manner as the sampled values in the frequency domain were divided at the time of encoding, calculating the scaling factor of each part from the reference scaling factor and the ratios, and using the scaling factor of each part to remove the influence of that scaling factor from the quantized sample values of the corresponding part.
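As a non-limiting illustration, recovering the scaling factor of each part from a reference scaling factor and the transmitted ratios can be sketched as follows. The function name and the convention that the reference scaling factor belongs to the first part are assumptions made for this example.

```python
def scale_factors_from_ratios(reference, ratios):
    """Rebuild every part's scaling factor from the reference and ratios.

    reference: the reference scaling factor (assumed to be part 0's)
    ratios:    ratio of each remaining scaling factor to the reference
    """
    return [reference] + [reference * ratio for ratio in ratios]
```

For example, a reference scaling factor of 2.0 with ratios 0.5 and 1.5 yields the scaling factors 2.0, 1.0 and 3.0.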

38. The method of claim 34, wherein the step of spectral inverse shaping comprises:

Marking the peak value of the sampled value in the sampled value in the spectrum shaping area determined at the time of encoding;

Calculating a reference value for spectral inverse shaping using one of the marked peaks; calculating a reduction factor for each marked peak using the reference value;

The spectrum is inverse shaped using the calculated reduction factor.

39. An apparatus for adjusting quantization quality in coding, the apparatus comprising: a multi-scale factor control unit, a quantization unit, a gain balancing unit, and a global gain calculation unit;

The multi-scale factor control unit is configured to receive a first sample value, set two or more scaling factors for the first sample value, adjust the first sample value by using the scaling factors, and output the adjusted first sample value to the quantization unit;

The quantization unit is configured to quantize the received adjusted first sample value to obtain a quantized sample value and output the quantized sample value to the gain balancing unit;

The gain balancing unit is configured to receive the quantized sample value, remove the influence of the scaling factor from the quantized sample value to obtain a second sample value, and output the second sample value to the global gain calculation unit; the global gain calculation unit is configured to receive the first sample value and the second sample value, and obtain a global gain by using the first sample value and the second sample value.
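As a non-limiting illustration, the data flow between the four units above can be sketched as follows. The sketch assumes multiplicative scaling, rounding to the nearest integer as the quantizer, and a least-squares global gain; none of these choices is fixed by the claim. The names `encode` and `band_edges` are assumptions made for this example.

```python
def encode(first, band_edges, scale_factors):
    """Return (quantized sample values, global gain).

    first:         first sample values
    band_edges:    (start, end) index pairs covering `first` in order
                   (hypothetical representation of the parts)
    scale_factors: one nonzero scaling factor per part
    """
    quantized, second = [], []
    for (start, end), factor in zip(band_edges, scale_factors):
        for x in first[start:end]:
            # Multi-scale factor control + quantization (assumed: rounding
            # with Python's built-in round, which rounds ties to even).
            q = round(x * factor)
            quantized.append(q)
            # Gain balancing: remove the scaling factor's influence.
            second.append(q / factor)
    # Global gain calculation (assumed: least-squares fit of g * second
    # to first).
    energy = sum(y * y for y in second)
    gain = sum(x * y for x, y in zip(first, second)) / energy if energy else 0.0
    return quantized, gain
```

A larger scaling factor gives the corresponding part a finer effective quantization step, which is how one part can be quantized more accurately than another.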

40. The apparatus according to claim 39, wherein the apparatus further comprises: a time-frequency transform unit and an inverse time-frequency transform unit;

The time-frequency transform unit is configured to receive the first sample value, and perform time-frequency transform on the first sample value, and output the result to the multi-scale factor control unit;

The inverse time-frequency transform unit is configured to receive a second sample value from the gain balance unit, and perform inverse time-frequency transform on the second sample value, and output the result to the global gain calculation unit.

41. The apparatus according to claim 40, further comprising: a spectrum pre-shaping unit and a spectrum inverse shaping unit;

The spectrum pre-shaping unit is configured to receive the first sample value output by the time-frequency transform unit, perform spectrum pre-shaping on the first sample value, and output the result to the multi-scale factor control unit; the spectrum inverse shaping unit is configured to receive the second sample value output by the gain balancing unit, perform spectral inverse shaping on the second sample value, and output the result to the inverse time-frequency transform unit;

Or,

The spectrum pre-shaping unit is configured to receive the first sample value output by the multi-scale factor control unit, perform spectrum pre-shaping on the first sample value, and output the result to the quantization unit; the spectrum inverse shaping unit is configured to receive the quantized sample value output by the quantization unit, perform spectral inverse shaping on the quantized sample value, and output the result to the gain balancing unit.

42. The apparatus according to any one of claims 39 to 41, wherein the multi-scale factor control unit comprises: a scaling factor setting unit and a sample value adjusting unit;

The scaling factor setting unit is configured to set two or more scaling factors for the first sample value, and output the set scaling factors to the sample value adjusting unit; the sample value adjusting unit is configured to receive the scaling factors and use the scaling factors to adjust the first sample value.

43. The apparatus according to claim 42, wherein the scaling factor setting unit comprises: a reference value setting unit, a scaling factor adjusting unit, a consumption bit number estimating unit, and a perceptual distortion calculating unit;

The reference value setting unit is configured to set a reference value of the scaling factor, and output the result to the scaling factor adjustment unit;

The scaling factor adjustment unit is configured to adjust a scaling factor according to the reference value, and output the adjusted scaling factor to the consumption bit number estimation unit and the perceptual distortion calculation unit;

The consumption bit number estimating unit is configured to estimate the number of consumed bits according to the scaling factor, and determine whether the number of consumed bits is smaller than the total number of bits allowed by the encoding, and send the determination result to the scaling factor adjusting unit;

The perceptual distortion calculation unit is configured to calculate the perceptual distortion according to the scaling factor, determine whether the perceptual distortion is within an imperceptible range, and transmit the determination result to the scaling factor adjustment unit.

44. The apparatus according to claim 41, wherein the spectrum pre-shaping unit comprises: a peak marking unit, a reference value calculating unit, an amplification factor calculating unit, and a pre-shaping unit;

The peak marking unit is configured to receive the first sample value, mark the peaks in the first sample value within the spectrum shaping region, and output the marked peaks to the reference value calculating unit;

The reference value calculation unit is configured to calculate a reference value for spectrum pre-shaping by using a peak value, and output the reference value to the amplification factor calculation unit;

The amplification factor calculation unit is configured to calculate an amplification factor for each marked peak by using the reference value, and output the amplification factors to the pre-shaping unit;

The pre-shaping unit is configured to pre-shape the spectrum by using the amplification factor.
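As a non-limiting illustration, the chain of peak marking, reference value calculation, amplification factor calculation and pre-shaping can be sketched as follows. The sketch assumes that a peak is a local maximum of the magnitude spectrum, that the whole input is the spectrum shaping region, that the reference value is the magnitude of the largest peak, and that the amplification factor is `beta * (reference / peak) ** alpha`. The function and parameter names are assumptions made for this example.

```python
def pre_shape(spectrum, alpha=0.5, beta=1.0):
    """Amplify marked peaks toward the largest peak's magnitude."""
    mags = [abs(v) for v in spectrum]
    # Peak marking: interior local maxima of the magnitude (assumption).
    peaks = [i for i in range(1, len(mags) - 1)
             if mags[i] > mags[i - 1] and mags[i] >= mags[i + 1]]
    if not peaks:
        return list(spectrum)
    # Reference value: magnitude of the maximum peak (one possible choice).
    reference = max(mags[i] for i in peaks)
    shaped = list(spectrum)
    for i in peaks:
        # Amplification factor for this marked peak; mags[i] > 0 here
        # because a marked peak exceeds its left neighbour.
        factor = beta * (reference / mags[i]) ** alpha
        shaped[i] = spectrum[i] * factor
    return shaped
```

Only the marked peaks are modified; all other samples pass through unchanged.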

45. The apparatus according to claim 41, wherein the spectrum inverse shaping unit comprises: a peak marking unit, a reference value calculation unit, a reduction factor calculation unit, and an inverse shaping unit;

The peak marking unit is configured to receive a sampling value, mark the peaks in the sampling value within the spectrum shaping region, and output the marked peaks to the reference value calculating unit;

The reference value calculation unit is configured to calculate a reference value for spectral inverse shaping using a peak value, and output the result to the reduction factor calculation unit;

The reduction factor calculation unit is configured to calculate a reduction factor for each marked peak by using the reference value, and output the result to the inverse shaping unit;

The inverse shaping unit is configured to inversely shape the spectrum by using the reduction factor.

46. An apparatus for adjusting quantization quality in decoding, the apparatus comprising: a gain balancing unit and a global gain balancing unit;

The gain balancing unit is configured to receive the quantized sample value and the scaling factor, and use the received scaling factor to remove the influence of the scaling factor from the quantized sample value to obtain a sampled value, and output the sampled value to the global gain balancing unit;

The global gain balancing unit is configured to receive the global gain and the sampled value, and multiply the sampled value by the global gain and output.

47. The apparatus according to claim 46, wherein the apparatus further comprises: an inverse time-frequency transform unit;

The inverse time-frequency transform unit is configured to receive a sampled value from the gain balance unit, and perform inverse time-frequency transform on the sampled value, and output the sampled value to the global gain balance unit.

48. The apparatus according to claim 47, further comprising: a spectrum inverse shaping unit;

The spectrum inverse shaping unit is configured to receive the sampled value output by the gain balancing unit, perform spectral inverse shaping on the sampled value, and output the result to the inverse time-frequency transform unit;

Or,

The spectrum inverse shaping unit is configured to receive a quantized sample value, perform spectral inverse shaping on the quantized sample value, and output the result to the gain balancing unit.

49. The apparatus according to claim 48, wherein the spectrum inverse shaping unit comprises: a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit;

The peak marking unit is configured to receive a sampled value, mark the peaks in the sampled value within the spectrum shaping region, and output the marked peaks to the reference value calculating unit;

The reference value calculation unit is configured to calculate a reference value for spectrum inverse shaping using a peak value, and output the result to the reduction factor calculation unit;