EP2104095A1 - Method and apparatus for adjusting quantization quality in an encoder and decoder - Google Patents

Method and apparatus for adjusting quantization quality in an encoder and decoder

Info

Publication number
EP2104095A1
EP2104095A1 EP07855801A EP07855801A EP2104095A1 EP 2104095 A1 EP2104095 A1 EP 2104095A1 EP 07855801 A EP07855801 A EP 07855801A EP 07855801 A EP07855801 A EP 07855801A EP 2104095 A1 EP2104095 A1 EP 2104095A1
Authority
EP
European Patent Office
Prior art keywords
sample values
group
unit
scale factors
rectification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP07855801A
Other languages
German (de)
English (en)
Other versions
EP2104095A4 (fr)
Inventor
Wei Li
Lijing Xu
Qing Zhang
Jianfeng Xu
Shenghu Sang
Zhengzhong Du
Yao Zou
Peilin Liu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of EP2104095A1
Publication of EP2104095A4
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components

Definitions

  • the present invention relates to encoding technology, and more specifically, to a method and apparatus for adjusting quality of quantization for encoding/decoding.
  • encoding such as digital audio coding or digital video encoding requires not only a higher encoding efficiency and real-time operation, but also a further extended encoding bandwidth.
  • techniques meeting the requirement of low bit rate and high audio encoding quality mainly include: AAC+, EAAC+ and AMR-WB+.
  • the AAC+ and EAAC+ are evolved from a high-bit-rate audio encoder, while the AMR-WB+ is a hybrid encoding method that extends low-bit-rate audio encoding.
  • time-frequency transformation is typically first performed on samples, and rounding, weighting, and quantization are then performed on the spectrum coefficients based on auditory characteristics.
  • the quantized spectrum coefficients are transmitted using entropy encoding.
  • a major distortion in the encoding comes from quantization of various parameters. Therefore, to accommodate different requirements, the encoder needs to adjust the quality of quantization based on a specified encoding rate.
  • a good encoder may achieve a transparent sound, i.e., the human ear may not perceive the noise introduced in the encoding and quantizing process.
  • in an encoding scheme with a low bit rate, since the number of bits is insufficient, perfect sound transparency may not be achieved. Therefore, one may only pursue a minimum subjective distortion.
  • a common scheme for adjusting quality of quantization is to use a scale factor or a gain.
  • a coded coefficient is divided by a scale factor or multiplied with a gain.
  • the scaled coefficient is quantized.
  • the optimal scale factor may both satisfy the requirement of bit rate and minimize the quantization error. Therefore, when the bit rate is high, a smaller scale factor is chosen such that the quantized coefficient may have a larger dynamic range and a relatively refined quantization. When the bit rate is low, a bigger scale factor is chosen such that the quantized coefficient may have a smaller dynamic range and a relatively coarse quantization.
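  • A minimal sketch of the scale-then-quantize idea described above. It assumes plain uniform rounding as the quantizer, which the text does not specify; the function names and the sample coefficients are illustrative only.

```python
def quantize_with_scale(coeffs, scale):
    """Divide each coefficient by the scale factor, then round to an integer level."""
    return [round(c / scale) for c in coeffs]


def dequantize(levels, scale):
    """Undo the scaling by multiplying each integer level by the scale factor."""
    return [q * scale for q in levels]


def max_error(coeffs, scale):
    """Largest absolute reconstruction error for a given scale factor."""
    restored = dequantize(quantize_with_scale(coeffs, scale), scale)
    return max(abs(c - r) for c, r in zip(coeffs, restored))


coeffs = [0.9, -3.2, 7.6, 12.1]            # illustrative coefficients (assumption)
fine = quantize_with_scale(coeffs, 0.5)    # small scale factor: wide range, fine steps
coarse = quantize_with_scale(coeffs, 4.0)  # large scale factor: narrow range, coarse steps
```

  • With these values the small scale factor gives levels between -6 and 24, while the large one gives levels between -1 and 3: fewer distinct levels to encode, at the cost of a larger rounding error.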
  • Figure 1 illustrates a block diagram of an MPEG1-LAYER3 audio encoding algorithm.
  • the whole encoding band is divided into 32 sub-bands with each assigned with a scale factor.
  • the whole band is assigned with a global scale factor.
  • the global scale factor is adjusted using a closed-loop search algorithm such that the number of quantization bits is controlled within the range allowed by the current bit rate.
  • scale factors for sub-bands are adjusted such that the quantization noise is controlled under the masking threshold of human auditory system. That is, the human ear may not perceive the presence of the quantization noise.
  • the quantized coefficient flow is transmitted by way of Huffman encoding.
  • Figure 2 illustrates a partial flowchart of Transform Coded Excitation (TCX) of the AMR-WB+ audio encoding algorithm.
  • in AMR-WB+ audio encoding, a global scale factor is used. Because only one scale factor is available, a specific frequency band cannot be finely tuned. Moreover, given the encoding requirement of a low bit rate, frequency domain samples with low energy may be lost during vector quantization. However, since the human auditory system has different sensitivities over different frequency bands, it is desired that the frequency domain samples with low energy at critical frequency bands can still be quantized during encoding. Therefore, in AMR-WB+ audio encoding, spectrum pre-rectification and spectrum inverse rectification are employed. For TCX of the AMR-WB+ audio encoding algorithm, critical frequency bands in the whole spectrum are first pre-rectified to raise the energy at these specific bands, and then a global scale factor is used for the whole frequency band.
  • the gain factor obtained from the spectrum pre-rectification is not transmitted in the encoding stream. Instead, in the spectrum inverse rectification method, a gain factor $G_m$ of each block is first calculated from the frequency domain sample values, and the original frequency domain sample values are then restored by dividing the sample values of each block by the gain factor of the corresponding block.
  • a method for adjusting quality of quantization for encoding is provided to reduce the implementation complexity.
  • a method for adjusting quality of quantization for decoding is provided to guarantee the quality of quantization.
  • an apparatus for adjusting quality of quantization for encoding is provided to reduce the implementation complexity.
  • an apparatus for adjusting quality of quantization for decoding is provided to guarantee the quality of quantization.
  • a method for adjusting quality of quantization for encoding includes: adjusting a first group of sample values for encoding with at least two scale factors; quantizing the adjusted first group of sample values to obtain the quantized sample values; eliminating the impact of the scale factors from the quantized sample values to obtain a second group of sample values; obtaining a global gain with the first group of sample values and the second group of sample values; and outputting the quantized sample values, information of the two or more scale factors and the obtained global gain as an encoding stream.
  • a method for adjusting quality of quantization for decoding is provided according to one embodiment of the present invention where an encoding stream output by an encoder is decoded as a decoding stream.
  • the method includes: acquiring quantized sample values, information of two or more scale factors and a global gain from the decoding stream; utilizing the information of the two or more scale factors to eliminate the impact of the scale factors from the quantized sample values to obtain sample values; and multiplying the sample values with the global gain.
  • the apparatus includes: a multiple scale factors control unit, a quantization unit, a gain balancing unit, and a global gain computing unit.
  • the multiple scale factors control unit is configured to receive a first group of sample values, configure two or more scale factors for the first group of sample values, adjust the first group of sample values with the scale factors, and output the first group of adjusted sample values to the quantization unit.
  • the quantization unit is configured to quantize the received first group of sample values, obtain quantized sample values and output the quantized sample values to the gain balancing unit.
  • the gain balancing unit is configured to receive the quantized sample values, eliminate the impact of the scale factors from the quantized sample values, obtain a second group of sample values, and output the second group of sample values to the global gain computing unit.
  • the global gain computing unit is configured to receive the first group of sample values and the second group of sample values, and obtain the global gain by using the first group of sample values and the second group of sample values.
  • the apparatus includes: a gain balancing unit, and a global gain balancing unit.
  • the gain balancing unit is configured to receive the quantized sample values and reduction factors, utilize the received reduction factors to eliminate the impact of the scale factors from the quantized sample values and obtain sample values, and output the sample values to the global gain balancing unit.
  • the global gain balancing unit is configured to receive a global gain and the sample values, multiply the sample values with the global gain and output the multiplications.
  • methods and apparatuses for adjusting quality of quantization directly divide the sample values into a plurality of portions and configure a scale factor for each portion. Therefore, the present invention may greatly reduce the implementation complexity. Moreover, compared with the prior art scheme using one global factor, since a plurality of scale factors are introduced, the present invention may better adjust the quality of quantization at critical bands and achieve a better encoding performance.
  • Figure 1 illustrates a conventional block diagram of an MPEG1-LAYER3 audio encoding algorithm;
  • Figure 2 illustrates a conventional flowchart of the TCX part in the AMR-WB+ audio encoding algorithm;
  • Figure 3 illustrates a block diagram of an encoder for adjusting quality of quantization according to Embodiment 1;
  • Figure 4 illustrates a block diagram of a decoder for adjusting quality of quantization according to Embodiment 1;
  • Figure 5 illustrates a flowchart of adjusting quality of quantization at the encoder by using a plurality of scale factors according to Embodiment 1;
  • Figure 6 illustrates a flowchart of selecting a plurality of scale factors and finely tuning the frequency domain sample values on the whole frequency band according to Embodiment 1;
  • Figure 7 illustrates a flowchart of adjusting quality of quantization at the decoder by using a plurality of scale factors according to Embodiment 1;
  • Figure 8 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 2;
  • Figure 9 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 2;
  • Figure 10 illustrates a schematic diagram of peak pre-rectification according to Embodiment 2;
  • Figure 11 illustrates a schematic diagram of peak inverse rectification according to Embodiment 2;
  • Figure 12 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 3;
  • Figure 13 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 3;
  • Figure 14 illustrates a block diagram of an apparatus for adjusting quality of quantization at an encoder according to Embodiment 4;
  • Figure 15 illustrates a block diagram of an apparatus for adjusting quality of quantization at a decoder according to Embodiment 4.
  • the main idea of adjusting quality of quantization is to utilize a plurality of scale factors or further utilize the spectrum rectification technique to adjust quality of quantization during an encoding process.
  • An encoding process where a time-frequency transform has been performed is illustrated below.
  • Embodiments of the present invention also apply to an encoding process where time-frequency transform has not been performed.
  • Embodiment 1:
  • Embodiment 1 provides a method for adjusting quality of quantization with a plurality of scale factors.
  • Figure 3 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 1.
  • in the encoding process, sample values in time domain are first transformed to frequency domain by a time-frequency transform operation.
  • after the control of a plurality of scale factors, these sample values are quantized and the quantized sample values are output.
  • An optimal global gain is calculated by performing gain balancing and inverse time-frequency transform on the output quantized sample values.
  • Scale factors, quantized sample values in frequency domain (frequency domain sample values) and a global gain need to be transmitted in encoding streams.
  • Figure 4 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 1.
  • in a decoding process, after the quantized sample values in frequency domain are gain-balanced and inversely transformed from frequency domain to time domain, sample values in time domain are obtained. Finally, these sample values are multiplied with the global gain to form restored sample values in time domain.
  • Figure 5 illustrates steps of adjusting quality of quantization at the encoder with a plurality of scale factors according to Embodiment 1. The steps are as follows.
  • Step 501 Time domain sample values $x(n)$ are transformed to frequency domain sample values $X(k)$ by a time-frequency transform.
  • Time-frequency transform herein may include a Discrete Fourier Transform (DFT), a Discrete Cosine Transform (DCT, MDCT, IDCT), a Discrete Wavelet Transform (DWT), etc.
  • Step 502 A plurality of scale factors is used to control frequency domain sample values X(k).
  • a plurality of proper scale factors are selected and used to finely tune the frequency domain sample values on the whole frequency band.
  • Step 601 The whole frequency band is divided into $m$ portions $[0, n_1], [n_1+1, n_2], \ldots, [n_{m-1}+1, N]$, and the frequency domain sample values $X(0, 1, \ldots, n_1), X(n_1+1, n_1+2, \ldots, n_2), \ldots, X(n_{m-1}+1, n_{m-1}+2, \ldots, N)$ for the $m$ portions are obtained.
  • the scale factor for each portion is denoted as $g_1, g_2, \ldots, g_m$.
  • the plurality of scale factors can be used for a direct division of the whole frequency band after a time-frequency transform is performed, thereby eliminating the necessity of first using a group of filters for dividing the spectrum into several bands and then configuring a scale factor for each band.
  • the present invention may significantly reduce the implementation complexity.
  • Step 602 A criterion value $g_0$ is selected for estimating the $m$ scale factors.
  • the criterion value $g_0$ for the scale factors is selected in such a way that the estimated number of consumed bits $b_0$ is less than the maximum allowable number of bits $b_{max}$.
  • Step 603 The $m$ scale factors $g_1, g_2, \ldots, g_m$ are adjusted around $g_0$.
  • m scale factors are adjusted in such a way as to decrease scale factors at more critical bands and increase scale factors at less critical bands.
  • the more critical bands refer to low frequency bands while the less critical bands refer to high frequency bands.
  • the adjusted $m$ scale factors $g'_1, g'_2, \ldots, g'_m$ increase gradually. With such adjustment, the quality of quantization at more critical bands is relatively good and the quality of quantization at less critical bands is relatively lower. Consequently, the quality of quantization over the whole frequency band can be optimized.
  • Step 604 It is determined whether the estimated number of consumed bits is no more than the total number of bits. If not, the process returns to step 603 and the scale factors are adjusted again. If so, the $m$ scale factors which satisfy the bit constraint are denoted as $g'_1, g'_2, \ldots, g'_m$.
  • Step 605 Quantization perception distortion is computed based on the $m$ adjusted scale factors $g'_1, g'_2, \ldots, g'_m$.
  • the quantization perception distortion $C$ indicates a distortion due to the difference between the original frequency domain sample values $X$ and the sample values which come from the frequency domain sample values $X$ adjusted by the $m$ scale factors $g_1, g_2, \ldots, g_m$.
  • Step 606 It is determined whether the quantization perception distortion is within an imperceptible range. If so, the $m$ scale factors obtained from the current adjustment are regarded as the optimal scale factors, which are denoted as $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$. Then, the process proceeds to step 607; otherwise, the process returns to step 603.
  • the specific imperceptible range herein is a specific value interval where distortion is tolerated.
  • the method for determining whether the quantization perception distortion is within an imperceptible range includes determining whether the quantization perception distortion computed at step 605 is within a value interval where distortion is tolerated. If the quantization perception distortion computed at step 605 is within a value interval where distortion is tolerated, the quantization perception distortion is regarded as imperceptible; otherwise, the quantization perception distortion is regarded as perceptible.
  • if the distortion is still not within the imperceptible range after $M$ rounds of closed-loop selection, the closed-loop selection is terminated, and the set of scale factors yielding the minimum perception distortion is selected from the scale factors obtained during the repeated adjustment procedure as the optimal scale factors. Then, the process proceeds to step 607.
  • the number of closed-loop selection rounds $M$ may be determined based on the actual situation.
  • Step 607 The $m$ optimal scale factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ obtained are used to finely tune the frequency domain sample values $X$. That is, the frequency domain sample values of each block are divided by the optimal scale factor corresponding to the block.
  • the finely tuned frequency domain sample values $X'$ obtained at steps 601 to 607 are fed into the encoder.
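  • The band division of step 601 and the per-block division of step 607 can be sketched as follows. This is only an illustration under stated assumptions: `band_slices` and `apply_scale_factors` are hypothetical helper names, the band boundaries and scale factors are made-up values, and the closed-loop search of steps 602 to 606 (which needs a bit estimator and a perceptual distortion measure not given in the text) is not modelled.

```python
def band_slices(boundaries, last_index):
    """Build the portions [0, n1], [n1+1, n2], ..., [n_{m-1}+1, N] as slices.

    boundaries holds n1, ..., n_{m-1}; last_index is N (inclusive).
    """
    starts = [0] + [b + 1 for b in boundaries]
    ends = list(boundaries) + [last_index]
    return [slice(s, e + 1) for s, e in zip(starts, ends)]


def apply_scale_factors(X, boundaries, factors):
    """Divide the samples of each portion by that portion's scale factor."""
    slices = band_slices(boundaries, len(X) - 1)
    if len(slices) != len(factors):
        raise ValueError("need exactly one scale factor per portion")
    tuned = []
    for sl, g in zip(slices, factors):
        tuned.extend(x / g for x in X[sl])
    return tuned


# Illustrative values (assumptions): three portions, scale factors rising with frequency.
X = [8.0, 6.0, 4.0, 2.0, 1.0, 1.0]
X_tuned = apply_scale_factors(X, boundaries=[1, 3], factors=[0.5, 1.0, 2.0])
```

  • The low-frequency portion is divided by the smallest factor and therefore keeps the widest range of values going into the quantizer, which matches the intent of decreasing scale factors at more critical bands.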
  • since scale factors are needed for restoring data during decoding, the scale factors should be transmitted in the encoding streams.
  • a variety of methods of transmitting scale factors are introduced below, respectively.
  • a first method for transmitting scale factors is to encode the $m$ scale factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ which are used to finely tune the sample values in frequency domain. Thus, the data can be restored more accurately when being decoded.
  • a second method for transmitting scale factors is to select one scale factor as a criterion scale factor from the $m$ scale factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ which are used to finely tune the sample values in frequency domain, compute the ratios of the remaining $m-1$ scale factors to the criterion scale factor, and encode these $m-1$ ratios. For instance, if $g_{1opt}$ is selected as the criterion scale factor, only $g_{2opt}/g_{1opt}, g_{3opt}/g_{1opt}, \ldots, g_{mopt}/g_{1opt}$ need to be encoded, thereby reducing the number of consumed bits.
  • a third method for transmitting scale factors is to select one scale factor as a criterion scale factor from the $m$ scale factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ which are used to finely tune the sample values in frequency domain, compute the ratios of the remaining $m-1$ scale factors to the criterion scale factor, and encode the criterion scale factor and these $m-1$ ratios. For instance, if $g_{1opt}$ is selected as the criterion scale factor, only $g_{1opt}$ and $g_{2opt}/g_{1opt}, g_{3opt}/g_{1opt}, \ldots, g_{mopt}/g_{1opt}$ need to be encoded.
  • the decoder can compute $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ from $g_{1opt}$ and $g_{2opt}/g_{1opt}, g_{3opt}/g_{1opt}, \ldots, g_{mopt}/g_{1opt}$.
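  • A small sketch of the third transmission method: the encoder sends one criterion scale factor plus the ratios of the others to it, and the decoder multiplies back. The function names are hypothetical, the first scale factor is assumed to be the criterion, and the entropy coding of the transmitted values is left out.

```python
def to_criterion_and_ratios(factors):
    """Encoder side: keep the first factor as criterion, express the rest as ratios."""
    criterion = factors[0]
    ratios = [g / criterion for g in factors[1:]]
    return criterion, ratios


def from_criterion_and_ratios(criterion, ratios):
    """Decoder side: rebuild every scale factor from the criterion and the ratios."""
    return [criterion] + [criterion * r for r in ratios]


# Illustrative scale factors (assumption).
criterion, ratios = to_criterion_and_ratios([0.5, 1.0, 2.0])
rebuilt = from_criterion_and_ratios(criterion, ratios)
```

  • The second method would transmit only `ratios`; the decoder then knows the scale factors up to one common multiplier.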
  • an optimal number of scale factors may be selected in accordance with the requirement of encoding bit rate and quality of quantization. For instance, 2 to 3 scale factors may be selected for low bit rate encoding.
  • Step 503 The frequency domain sample values $X'$ obtained through the control of a plurality of scale factors are quantized, and quantized frequency domain sample values $X_q$ are output.
  • in step 503, other quantization approaches may be employed in accordance with the encoding requirement, such as multistage vector quantization, split vector quantization, tree-structured vector quantization and trellis coded vector quantization.
  • Step 504 The impact of the scale factors is eliminated from the quantized frequency domain sample values $X_q$ obtained at step 503, so that the original frequency domain sample values can be restored as $X_{balance}$. That is, $X_{balance}$ is obtained by performing gain balancing on the quantized frequency domain sample values $X_q$.
  • the gain balancing method varies with the method used for transmitting scale factors.
  • the scale factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ selected according to step 502 may be used for gain balancing.
  • quantized frequency domain sample values $X_q$ are also divided into $m$ portions in accordance with the method for dividing frequency bands described in step 601, yielding $X_q(0, 1, \ldots, n_1), X_q(n_1+1, n_1+2, \ldots, n_2), \ldots, X_q(n_{m-1}+1, n_{m-1}+2, \ldots, N)$. The quantized frequency domain sample values of each portion are multiplied by the scale factor of the corresponding portion.
  • ratios of a plurality of scale factors can be used for gain balancing.
  • quantized frequency domain sample values $X_q$ are also divided into $m$ portions in accordance with the method for dividing frequency bands described in step 601, yielding $X_q(0, 1, \ldots, n_1), X_q(n_1+1, n_1+2, \ldots, n_2), \ldots, X_q(n_{m-1}+1, n_{m-1}+2, \ldots, N)$.
  • the frequency domain sample values of the portion to which the criterion scale factor corresponds are multiplied by 1.
  • $X_{balance} = \left[ X_q(0, 1, \ldots, n_1),\ \frac{g_{2opt}}{g_{1opt}} \cdot X_q(n_1+1, n_1+2, \ldots, n_2),\ \ldots,\ \frac{g_{mopt}}{g_{1opt}} \cdot X_q(n_{m-1}+1, \ldots, N) \right]$
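  • The two gain balancing variants can be compared with a short sketch. The helper names and the sample values are assumptions. The point it illustrates is that balancing with ratios only gives the same result as balancing with the full scale factors, up to the constant $1/g_{1opt}$; presumably that common factor is what the global gain computed afterwards absorbs.

```python
def band_slices(boundaries, last_index):
    """Portions [0, n1], [n1+1, n2], ..., [n_{m-1}+1, N] as Python slices."""
    starts = [0] + [b + 1 for b in boundaries]
    ends = list(boundaries) + [last_index]
    return [slice(s, e + 1) for s, e in zip(starts, ends)]


def gain_balance(Xq, boundaries, multipliers):
    """Multiply the quantized samples of each portion by that portion's multiplier."""
    balanced = []
    for sl, g in zip(band_slices(boundaries, len(Xq) - 1), multipliers):
        balanced.extend(q * g for q in Xq[sl])
    return balanced


# Illustrative values (assumptions): three portions, first factor is the criterion.
factors = [0.5, 1.0, 2.0]
ratios = [g / factors[0] for g in factors]      # 1 for the criterion portion
Xq = [16, 12, 4, 2, 0, 1]

full = gain_balance(Xq, [1, 3], factors)        # first method: full scale factors
relative = gain_balance(Xq, [1, 3], ratios)     # second method: ratios only
```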
  • Step 505 Inverse time-frequency transform is performed on $X_{balance}$, which is obtained through gain balancing.
  • the restored frequency domain sample values $X_{balance}$ are transformed to the restored time domain sample values $x_q(n)$.
  • Step 506 The original time domain sample values $x(n)$ and the restored time domain sample values $x_q(n)$ are used to compute an optimal global gain $g_{gopt}$.
  • a global gain $g_g$ is selected as the optimal global gain $g_{gopt}$ such that the squared error between the original time domain sample values and the restored time domain sample values is at its minimum, i.e., the optimal global gain $g_{gopt}$ minimizes $\sum_n \left( x(n) - g_g \cdot x_q(n) \right)^2$.
  • the optimal global gain g gopt may also require an encoding transmission so that the optimal global gain g gopt can be used for data recovery.
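  • The minimization in step 506 has a closed form: setting the derivative of $\sum_n (x(n) - g_g \cdot x_q(n))^2$ with respect to $g_g$ to zero gives $g_{gopt} = \sum_n x(n) x_q(n) / \sum_n x_q(n)^2$. The text does not say how the encoder actually searches for the gain, so the sketch below is just one way to compute it; the function names and sample values are assumptions.

```python
def optimal_global_gain(x, xq):
    """Least-squares gain g minimizing sum((x[n] - g * xq[n]) ** 2)."""
    numerator = sum(a * b for a, b in zip(x, xq))
    denominator = sum(b * b for b in xq)
    if denominator == 0:
        return 0.0  # all restored samples are zero; any gain gives the same error
    return numerator / denominator


def squared_error(x, xq, g):
    """Squared error between the original samples and the gain-scaled restored ones."""
    return sum((a - g * b) ** 2 for a, b in zip(x, xq))


# Illustrative samples (assumptions).
x = [1.0, 2.0, 2.0]
xq = [1.0, 1.0, 3.0]
g_opt = optimal_global_gain(x, xq)   # 9 / 11
```

  • Moving the gain away from `g_opt` in either direction increases `squared_error`, which is a quick sanity check on the formula.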
  • the foregoing is a procedure for adjusting quality of quantization at the encoder by using a plurality of scale factors.
  • the process of restoring the sample values in time domain at the decoder based on the decoded quantized sample values in frequency domain is illustrated in Figure 7 .
  • the process includes the following steps.
  • Step 701 Scale factors obtained from the encoding streams are used for gain balancing for the quantized sample values in frequency domain.
  • the implementation is similar to the method described in step 504, which is omitted herein for brevity.
  • the gain balancing method may vary with the method used for transmitting scale factors.
  • the gain balancing method at the encoder and the gain balancing method at the decoder should also be consistent with each other.
  • Step 702 Inverse time-frequency transform is performed on the sample values in frequency domain which have been gain balanced and the sample values in time domain are thus obtained.
  • Step 703 Restored sample values in time domain are obtained by multiplying the sample values in time domain with the global gain obtained from the coded streams.
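  • Steps 701 to 703 can be sketched end to end as follows. The text leaves the time-frequency transform open, so an orthonormal DCT-II and its inverse are used here purely as a stand-in; `decode`, the band boundaries, and the scale factors are likewise illustrative assumptions, and entropy decoding of the stream is not modelled.

```python
import math


def dct(x):
    """Orthonormal DCT-II, a stand-in for the unspecified time-frequency transform."""
    n_samples = len(x)
    out = []
    for k in range(n_samples):
        s = sum(x[n] * math.cos(math.pi * (n + 0.5) * k / n_samples)
                for n in range(n_samples))
        norm = math.sqrt(1.0 / n_samples) if k == 0 else math.sqrt(2.0 / n_samples)
        out.append(norm * s)
    return out


def idct(X):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n_samples = len(X)
    out = []
    for n in range(n_samples):
        s = X[0] * math.sqrt(1.0 / n_samples)
        s += sum(X[k] * math.sqrt(2.0 / n_samples)
                 * math.cos(math.pi * (n + 0.5) * k / n_samples)
                 for k in range(1, n_samples))
        out.append(s)
    return out


def decode(Xq, boundaries, factors, global_gain):
    """Gain balancing (701), inverse transform (702), global gain (703)."""
    starts = [0] + [b + 1 for b in boundaries]
    ends = list(boundaries) + [len(Xq) - 1]
    balanced = []
    for s, e, g in zip(starts, ends, factors):
        balanced.extend(q * g for q in Xq[s:e + 1])      # step 701
    x_time = idct(balanced)                              # step 702
    return [global_gain * v for v in x_time]             # step 703


# Round trip without quantization, to check that the stages invert each other.
x = [1.0, 2.0, 3.0, 4.0]
X = dct(x)
scaled = [X[0] / 0.5, X[1] / 0.5, X[2] / 2.0, X[3] / 2.0]  # encoder-side scaling
restored = decode(scaled, boundaries=[1], factors=[0.5, 2.0], global_gain=1.0)
```

  • With a real quantizer between the scaling and the decoder, `restored` would only approximate `x`, and the transmitted global gain would no longer be 1.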
  • the scheme of multiple scale factors control adopted in Embodiment 1 may be applied directly to sample values in time domain, which means that the scheme may be applied to the case where no time-frequency transform is performed. Accordingly, no inverse time-frequency transform is involved during the computation of the global gain.
  • the sample values in time domain can be divided by time intervals.
  • scale factors associated with more critical time intervals are decreased and the scale factors associated with less critical time intervals are increased.
  • Embodiment 2:
  • Embodiment 2 provides a method for adjusting quality of quantization with a plurality of scale factors and spectrum rectification.
  • Figure 8 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 2.
  • sample values in time domain are first transformed to frequency domain by a time-frequency transform operation. Then, after the spectrum pre-rectification and the control of a plurality of scale factors, these samples are quantized and the quantized sample values are output.
  • An optimal global gain is calculated by performing gain balancing, inverse spectrum rectification and inverse time-frequency transform on the output quantized sample values. Scale factors, quantized sample values in frequency domain and a global gain need to be transmitted in an encoding stream.
  • Figure 9 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 2.
  • in a decoding process, after the quantized sample values in frequency domain go through gain balancing, inverse spectrum rectification and inverse time-frequency transform, sample values in time domain are obtained. Finally, these sample values are multiplied with the global gain to form restored sample values in time domain.
  • the method for adjusting quality of quantization with a plurality of scale factors and peak rectification according to Embodiment 2 may further include a spectrum pre-rectification step between the time-frequency transform at step 501 and the control of scale factors at step 502 and may further include an inverse spectrum rectification step between the gain balancing at step 504 and the inverse time-frequency transform at step 505.
  • the spectrum pre-rectification and inverse spectrum rectification are now detailed below.
  • Figure 10 illustrates a schematic diagram of spectrum pre-rectification which is implemented by the following steps.
  • the spectrum rectification area herein refers to a spectrum area at more critical bands. For instance, for audio data, since human auditory system has a high resolution at low frequencies, the low frequencies are considered as more critical bands. For another instance, for data such as videos, images, since most of the data information is distributed at low frequencies, the low frequencies are considered as more critical bands. Therefore, the spectrum rectification area may take front part of the whole band, such as, the first quarter of the band.
  • a peak $p_k$ may be defined as a local maximum value among the amplitudes in the spectrum rectification area. If $X(i) > X(j),\ \forall j \in [i-\delta, i+\delta],\ i \neq j$, then $X(i)$ is a local maximum value among the $2\delta+1$ points within $[i-\delta, i+\delta]$, where the size of the local area can be chosen arbitrarily.
  • Step 1002 Reference p ref for spectrum pre-rectification is computed.
  • the principle of selecting the reference is to keep the value of the reference unchanged before and after spectrum rectification.
  • a local maximum energy value can be regarded as the reference $p_{ref}$.
  • a characteristic parameter of a block of data can be regarded as the reference $p_{ref}$ for lessening the impact of the quantization error on the reference.
  • Step 1004 The computed gain factors of the peaks are used to amplify the peaks.
  • the peak energies can be captured by a quantizer by simply amplifying the peak energies at low frequencies. Therefore, in Embodiment 2, only a few frequency points, or peaks, need to be amplified.
  • the spectrum pre-rectification technique may also be referred to as peak pre-rectification. With such a peak pre-rectification technique, there is less impact on the global gain increase. The quantization error caused by the global gain increase may be negligible.
  • the frequency points neighboring the peaks can also be amplified. For instance, in addition to amplifying a local peak among $2\delta+1$ points, the $2\delta$ points adjacent to the peak, or fewer, may be amplified by corresponding gain factors.
  • the peaks of frequency domain sample values at more critical bands are enhanced, thereby reducing quantization error at peaks of sample values in frequency domain at more critical bands and reducing the possibility of the loss of the spectrum peaks at more critical bands during quantization.
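  • The peak marking and peak amplification used in spectrum pre-rectification can be sketched as below. Several things are assumptions: amplitudes are compared by absolute value, the comparison window is clipped at the ends of the spectrum, and the gain factors are supplied by the caller because their computation is not given in the text. `find_peaks` and `amplify_peaks` are hypothetical names.

```python
def find_peaks(X, area_end, half_width):
    """Indices in [0, area_end] whose amplitude strictly exceeds every neighbor
    within half_width points on either side (window clipped at the edges)."""
    peaks = []
    for i in range(min(area_end + 1, len(X))):
        lo = max(0, i - half_width)
        hi = min(len(X) - 1, i + half_width)
        if all(abs(X[i]) > abs(X[j]) for j in range(lo, hi + 1) if j != i):
            peaks.append(i)
    return peaks


def amplify_peaks(X, gains):
    """Return a copy of X with each marked peak multiplied by its gain factor.

    gains maps a peak index to the gain factor chosen for that peak.
    """
    out = list(X)
    for index, gain in gains.items():
        out[index] = out[index] * gain
    return out


# Illustrative spectrum (assumption); the rectification area is indices 0..5.
X = [1.0, 5.0, 2.0, 0.5, 3.0, 1.0, 0.2, 0.1, 0.3, 0.1, 0.05, 0.2]
peaks = find_peaks(X, area_end=5, half_width=1)
boosted = amplify_peaks(X, {p: 2.0 for p in peaks})   # fixed gain 2.0 is a placeholder
```

  • Only the marked points change; everything outside the rectification area is passed through untouched, which is why the effect on the global gain stays small.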
  • the spectrum rectification area and the peak marking principle during inverse spectrum rectification are the same as those in the process of spectrum pre-rectification.
  • Step 1102 Reference q ref for inverse spectrum rectification is computed.
  • Step 1104 The computed reduction factors of the peaks are used to decrease the peaks.
  • the sample values in frequency domain obtained from inverse spectrum rectification at step 505 are transformed from frequency domain to time domain.
  • the decoder may need to perform, accordingly, an inverse spectrum rectification between the gain balancing process and the inverse time-frequency transform process.
  • the detailed implementation is similar to that of the method of inverse spectrum rectification in the above encoding process, which is omitted herein for brevity.
  • the spectrum pre-rectification is performed prior to the scale factors being controlled.
  • the scale factors may also be controlled prior to the spectrum pre-rectification. Accordingly, in the process of restoring the original sample values during encoding and in the decoding process, inverse spectrum rectification may be performed prior to gain balancing. Description of such situation will not be detailed.
  • Embodiment 3:
  • Embodiment 3 provides a method for adjusting quality of quantization by spectrum rectification.
  • Figure 12 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 3.
  • sample values in time domain are first transformed to frequency domain by a time-frequency transform operation. Then, after the spectrum pre-rectification, these samples are quantized and the quantized sample values are output.
  • An optimal global gain is calculated by performing inverse spectrum rectification and inverse time-frequency transform on the output quantized sample values. Quantized sample values in frequency domain and a global gain need to be transmitted in the encoding streams.
  • Figure 13 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 3.
  • a decoding process after the quantized sample values in frequency domain go through an inverse spectrum rectification and inverse time-frequency transform, sample values in time domain are obtained. Finally, these sample values are multiplied with the global gain to form restored sample values in time domain.
  • Embodiment 3 the methods of spectrum pre-rectification and the inverse spectrum rectification and the technical effects thereof are the same as those in Embodiment 2, which are omitted herein for brevity.
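The Embodiment 3 round trip can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: an orthonormal DCT stands in for the time-frequency transform, the rectification stages default to identity functions, the quantizer is plain rounding with a hypothetical `step` parameter, and the "optimal" global gain is assumed to be the least-squares gain between the original and the restored time-domain samples. All function names are illustrative.

```python
import math

def dct(x):
    """Orthonormal DCT-II, standing in for the time-frequency transform."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * (math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)))
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def global_gain(original, restored):
    """Assumed criterion: the gain g minimizing sum((original - g * restored)^2)."""
    denom = sum(r * r for r in restored)
    if denom == 0.0:
        return 1.0  # nothing to scale; fall back to a neutral gain
    return sum(o * r for o, r in zip(original, restored)) / denom

def encode(samples, step, pre_rectify=lambda s: s, inverse_rectify=lambda s: s):
    spectrum = pre_rectify(dct(samples))              # transform, then pre-rectify
    quantized = [round(v / step) for v in spectrum]   # hypothetical uniform quantizer
    restored = idct(inverse_rectify(quantized))       # local decode inside the encoder
    return quantized, global_gain(samples, restored)  # both go into the stream

def decode(quantized, gain, inverse_rectify=lambda s: s):
    return [gain * v for v in idct(inverse_rectify(quantized))]
```

With identity rectification the computed global gain lands close to `step`, because the gain is the only place where the quantizer scale is restored in this sketch.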
  • Embodiment 4:
  • Embodiment 4 provides an apparatus for adjusting quality of quantization.
  • Figure 14 illustrates a block diagram of an apparatus for adjusting quality of quantization at an encoder according to Embodiment 4.
  • The apparatus for adjusting quality of quantization at the encoder may include a time-frequency transform unit, a spectrum pre-rectification unit, a multiple scale factors control unit, a quantization unit, a gain balancing unit, an inverse spectrum rectification unit, an inverse time-frequency transform unit, and a global gain computing unit.
  • The time-frequency transform unit receives a first group of sample values, performs a time-frequency transform on the first group of sample values and outputs the result to the spectrum pre-rectification unit.
  • The spectrum pre-rectification unit receives the first group of sample values output from the time-frequency transform unit, performs a spectrum pre-rectification on the first group of sample values and outputs the result to the multiple scale factors control unit.
  • The multiple scale factors control unit receives the first group of sample values, configures two or more scale factors for the first group of sample values, adjusts the first group of sample values with the scale factors, and outputs the adjusted first group of sample values to the quantization unit.
  • The quantization unit quantizes the received first group of sample values, obtains quantized sample values and outputs the quantized sample values to the gain balancing unit.
  • The gain balancing unit receives the quantized sample values, eliminates the influence imposed by the scale factors on the quantized sample values, obtains a second group of sample values, and outputs the second group of sample values to the inverse spectrum rectification unit.
  • The inverse spectrum rectification unit receives the second group of sample values output from the gain balancing unit, performs an inverse spectrum rectification on the second group of sample values and outputs the result to the inverse time-frequency transform unit.
  • The inverse time-frequency transform unit receives the second group of sample values from the inverse spectrum rectification unit, performs an inverse time-frequency transform on the second group of sample values and outputs the result to the global gain computing unit.
  • The global gain computing unit receives the first group of sample values and the second group of sample values, and obtains the global gain by using the first group of sample values and the second group of sample values.
  • The multiple scale factors control unit includes a scale factor configuration unit and a sample value adjusting unit.
  • The scale factor configuration unit is configured to configure two or more scale factors for the first group of sample values and output the configured scale factors to the sample value adjusting unit.
  • The sample value adjusting unit is configured to receive the scale factors and adjust the first group of sample values with the scale factors.
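The interplay of the sample value adjusting unit, the quantization unit and the gain balancing unit can be illustrated with a short sketch. It assumes, purely for illustration, that each scale factor multiplies the samples of one band and that gain balancing divides the quantized values by the same factor; the `bands` layout (half-open index ranges) and all function names are hypothetical.

```python
def adjust(samples, bands, scale_factors):
    """Sample value adjusting unit: scale each band by its own factor."""
    out = list(samples)
    for (start, end), sf in zip(bands, scale_factors):
        for i in range(start, end):
            out[i] = samples[i] * sf
    return out

def quantize(samples):
    """Quantization unit: plain rounding to integers (an assumption)."""
    return [round(v) for v in samples]

def gain_balance(quantized, bands, scale_factors):
    """Gain balancing unit: undo the per-band scaling after quantization."""
    out = [float(v) for v in quantized]
    for (start, end), sf in zip(bands, scale_factors):
        for i in range(start, end):
            out[i] = quantized[i] / sf
    return out

samples = [0.3, -0.7, 1.2, 5.1, -4.6, 9.9]
bands = [(0, 3), (3, 6)]
scale_factors = [8.0, 1.0]  # the low-amplitude band gets the finer resolution

quantized = quantize(adjust(samples, bands, scale_factors))
balanced = gain_balance(quantized, bands, scale_factors)
```

Under these assumptions the rounding error in a band is bounded by 0.5 divided by its scale factor, which is why using two or more scale factors lets the encoder spend resolution where it matters.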
  • The scale factor configuration unit includes a criteria setting unit, a scale factor adjusting unit, a unit for estimating the number of consumed bits, and a perception distortion computing unit.
  • The criteria setting unit is configured to set criteria for the scale factors and output the criteria to the scale factor adjusting unit.
  • The scale factor adjusting unit is configured to adjust the scale factors based on the criteria and output the adjusted scale factors to the unit for estimating the number of consumed bits and to the perception distortion computing unit.
  • The unit for estimating the number of consumed bits is configured to estimate the number of consumed bits based on the scale factors, determine whether the number of consumed bits is less than the total number of bits allowable by an encoding process, and transmit the determination result to the scale factor adjusting unit.
  • The perception distortion computing unit is configured to calculate perception distortion based on the scale factors, determine whether the perception distortion is within an imperceptible range, and transmit the determination result to the scale factor adjusting unit.
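The feedback loop between these four sub-units might look like the sketch below. Everything specific in it is an assumption made for illustration: the greedy doubling strategy, the bit cost model (magnitude bits plus a sign bit), the mean-squared-error distortion measure and the per-band `thresholds` that stand in for the imperceptible range. The text above only states that the scale factors are adjusted until the bit and distortion checks report back.

```python
def quantize_band(band, sf):
    return [round(v * sf) for v in band]

def estimate_bits(quantized):
    """Hypothetical cost model: magnitude bits plus one sign bit per value."""
    return sum(abs(q).bit_length() + 1 for q in quantized)

def band_distortion(band, sf):
    """Mean squared error of one band after scaling, rounding and rescaling."""
    q = quantize_band(band, sf)
    return sum((v - qv / sf) ** 2 for v, qv in zip(band, q)) / len(band)

def configure_scale_factors(bands, thresholds, bit_budget, max_iters=64):
    """Greedy search; thresholds must be positive (assumed masking levels)."""
    sfs = [1.0] * len(bands)
    for _ in range(max_iters):
        # Perception distortion check: a ratio above 1 counts as perceptible.
        ratios = [band_distortion(b, sf) / t
                  for b, sf, t in zip(bands, sfs, thresholds)]
        worst = max(range(len(bands)), key=lambda i: ratios[i])
        if ratios[worst] <= 1.0:
            break  # every band is within the imperceptible range
        trial = list(sfs)
        trial[worst] *= 2.0  # refine the worst band
        bits = sum(estimate_bits(quantize_band(b, sf))
                   for b, sf in zip(bands, trial))
        if bits > bit_budget:
            break  # consumed-bits check failed; keep the last valid factors
        sfs = trial
    return sfs
```

For example, with `bands = [[0.3, -0.7, 0.4, 0.2], [5.0, -4.0, 9.0, 3.0]]` and `thresholds = [0.01, 1.0]`, a generous budget lets the first band's factor grow until its distortion drops below the threshold, while a tight budget stops the refinement earlier.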
  • The spectrum pre-rectification unit includes a peak marking unit, a reference computing unit, a gain factor computing unit and a pre-rectification unit.
  • The peak marking unit is configured to receive the first group of sample values, mark peaks among the first group of sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit.
  • The reference computing unit is configured to compute, based on the peaks, a reference for spectrum pre-rectification and output the reference to the gain factor computing unit.
  • The gain factor computing unit is configured to compute, based on the reference, a gain factor for each marked peak and output the gain factor to the pre-rectification unit.
  • The pre-rectification unit is configured to pre-rectify the spectrum with the gain factor.
  • The inverse spectrum rectification unit includes a peak marking unit, a reference computing unit, a reduction factor computing unit and an inverse rectification unit.
  • The peak marking unit is configured to receive the sample values, mark peaks among the sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit.
  • The reference computing unit is configured to compute, based on the peaks, the reference for inverse spectrum rectification and output the reference to the reduction factor computing unit.
  • The reduction factor computing unit is configured to compute, based on the reference, a reduction factor for each marked peak and output the reduction factor to the inverse rectification unit.
  • The inverse rectification unit is configured to inversely rectify the spectrum with the reduction factor.
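The mirrored structure of the two rectification units can be sketched as below. The formulas are placeholders chosen so that the round trip is exact, not the formulas of the patent, which are not given in this passage: a peak is assumed to be a strict local maximum in magnitude, the reference is assumed to be the mean magnitude of the unmarked samples in the rectification area, and the gain and reduction factors are assumed to add or remove a boost of `beta` times the reference. The parameter `beta` and all function names are hypothetical.

```python
def mark_peaks(spectrum, area):
    """Peak marking unit: strict local maxima in magnitude inside the area."""
    start, end = area
    lo, hi = max(start, 1), min(end, len(spectrum) - 1)
    return [i for i in range(lo, hi)
            if abs(spectrum[i]) > abs(spectrum[i - 1])
            and abs(spectrum[i]) > abs(spectrum[i + 1])]

def reference(spectrum, area, peaks):
    """Reference computing unit: mean magnitude of the unmarked samples."""
    start, end = area
    marked = set(peaks)
    rest = [abs(spectrum[i]) for i in range(start, end) if i not in marked]
    return sum(rest) / len(rest) if rest else 0.0

def pre_rectify(spectrum, area, beta=0.5):
    peaks = mark_peaks(spectrum, area)
    ref = reference(spectrum, area, peaks)
    out = list(spectrum)
    for i in peaks:
        gain = 1.0 + beta * ref / abs(spectrum[i])  # gain factor per peak
        out[i] = spectrum[i] * gain
    return out

def inverse_rectify(spectrum, area, beta=0.5):
    peaks = mark_peaks(spectrum, area)
    ref = reference(spectrum, area, peaks)
    out = list(spectrum)
    for i in peaks:
        reduction = 1.0 - beta * ref / abs(spectrum[i])  # reduction factor per peak
        out[i] = spectrum[i] * reduction
    return out
```

Because boosting a peak never changes which samples are local maxima, and the unmarked samples are left untouched, both sides mark the same peaks and compute the same reference, so no side information is needed in this sketch. After quantization the recovery is only approximate, and a real design would have to guard against a reduction factor that turns negative.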
  • Figure 15 illustrates a block diagram of an apparatus for adjusting quality of quantization at a decoder according to Embodiment 4.
  • The apparatus for adjusting quality of quantization at the decoder includes a gain balancing unit, an inverse spectrum rectification unit, an inverse time-frequency transform unit and a global gain balancing unit.
  • The gain balancing unit is configured to receive the quantized sample values and the scale factors, use the received scale factors to eliminate their influence on the quantized sample values and thereby obtain sample values, and output the sample values to the inverse spectrum rectification unit.
  • The inverse spectrum rectification unit receives the sample values output from the gain balancing unit, performs an inverse spectrum rectification on the sample values and outputs the result to the inverse time-frequency transform unit.
  • The inverse time-frequency transform unit receives the sample values from the inverse spectrum rectification unit, performs an inverse time-frequency transform on the sample values and outputs the result to the global gain balancing unit.
  • The global gain balancing unit receives a global gain and sample values, multiplies the sample values by the global gain and outputs the products.
  • The global gain balancing unit may be a multiplier.
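The decoder chain of Figure 15 can be summarized as a composition of four stages. In this sketch the inverse rectification and inverse transform are passed in as callables and default to identity functions, and gain balancing is assumed to be a per-band division by the scale factor; the function name `decode_embodiment4` and the `bands` layout are hypothetical.

```python
def decode_embodiment4(quantized, bands, scale_factors, global_gain,
                       inverse_rectify=lambda s: s,
                       inverse_transform=lambda s: s):
    # Gain balancing unit: remove the influence of the scale factors.
    balanced = [float(v) for v in quantized]
    for (start, end), sf in zip(bands, scale_factors):
        for i in range(start, end):
            balanced[i] = quantized[i] / sf
    # Inverse spectrum rectification unit.
    spectrum = inverse_rectify(balanced)
    # Inverse time-frequency transform unit.
    time_samples = inverse_transform(spectrum)
    # Global gain balancing unit: a plain multiplier.
    return [global_gain * v for v in time_samples]
```

Keeping the stages as separate callables mirrors the block diagram and makes it easy to swap the order of gain balancing and inverse rectification for the variant in which the scale factors are controlled before the spectrum pre-rectification.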
  • The inverse spectrum rectification unit of the decoder includes a peak marking unit, a reference computing unit, a reduction factor computing unit and an inverse rectification unit.
  • The peak marking unit is configured to receive the sample values, mark peaks among the sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit.
  • The reference computing unit is configured to compute, based on the peaks, the reference for inverse spectrum rectification and output the reference to the reduction factor computing unit.
  • The reduction factor computing unit is configured to compute, based on the reference, a reduction factor for each marked peak and output the reduction factor to the inverse rectification unit.
  • The inverse rectification unit is configured to inversely rectify the spectrum with the reduction factor.
  • The embodiments described above may be applicable to various encoding fields such as audio encoding, video encoding and image encoding.
  • The present invention may be implemented with software on a necessary hardware platform.
  • The embodiments may also be implemented with hardware. In most cases, however, the former approach is preferable.
  • The technical solutions of the present invention, or the part of the present invention that contributes over the prior art, may be embodied in a software product.
  • The computer software product may be stored in a readable storage medium.
  • The software product may include a set of instructions enabling a computer (which may be a personal computer, a server, a network device, etc.) to perform methods according to various embodiments of the present invention.
  • The foregoing describes only a few embodiments of the present invention, and the present invention is not limited to them. Any modification made by those skilled in the art shall be construed as falling within the scope of the present invention.
EP07855801A 2006-12-01 2007-12-26 Procédé et appareil permettant d'ajuster la qualité de la quantification dans un codeur et décodeur Withdrawn EP2104095A4 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN 200610164330 CN101192410B (zh) 2006-12-01 2006-12-01 一种在编解码中调整量化质量的方法和装置
PCT/CN2007/003799 WO2008064577A1 (fr) 2006-12-01 2007-12-26 Procédé et appareil permettant d'ajuster la qualité de la quantification dans un codeur et décodeur

Publications (2)

Publication Number Publication Date
EP2104095A1 (fr) 2009-09-23
EP2104095A4 (fr) 2012-07-18

Family

ID=39467436

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07855801A Withdrawn EP2104095A4 (fr) 2006-12-01 2007-12-26 Procédé et appareil permettant d'ajuster la qualité de la quantification dans un codeur et décodeur

Country Status (3)

Country Link
EP (1) EP2104095A4 (fr)
CN (1) CN101192410B (fr)
WO (1) WO2008064577A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609674B (zh) * 2008-06-20 2011-12-28 华为技术有限公司 编解码方法、装置和系统
CN101964690B (zh) * 2009-07-22 2012-07-04 联芯科技有限公司 一种harq合并译码方法、装置及系统
JP5316896B2 (ja) * 2010-03-17 2013-10-16 ソニー株式会社 符号化装置および符号化方法、復号装置および復号方法、並びにプログラム
CN102821069B (zh) * 2011-06-07 2018-06-08 中兴通讯股份有限公司 基站及基站侧上行数据压缩方法
CN105721879B (zh) * 2016-01-26 2018-08-31 北京空间飞行器总体设计部 一种深空探测图像分段保护下的感兴趣区域传输方法
CN111429944B (zh) * 2020-04-17 2023-06-02 北京百瑞互联技术有限公司 一种编解码器开发测试优化方法及系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996014695A1 (fr) * 1994-11-04 1996-05-17 Philips Electronics N.V. Codage et decodage d'un signal large bande d'informations numeriques
US5864802A (en) * 1995-09-22 1999-01-26 Samsung Electronics Co., Ltd. Digital audio encoding method utilizing look-up table and device thereof
US20040143431A1 (en) * 2003-01-20 2004-07-22 Mediatek Inc. Method for determining quantization parameters
US6912496B1 (en) * 1999-10-26 2005-06-28 Silicon Automation Systems Preprocessing modules for quality enhancement of MBE coders and decoders for signals having transmission path characteristics
US20050254586A1 (en) * 2004-05-12 2005-11-17 Samsung Electronics Co., Ltd. Method of and apparatus for encoding/decoding digital signal using linear quantization by sections
US20060074693A1 (en) * 2003-06-30 2006-04-06 Hiroaki Yamashita Audio coding device with fast algorithm for determining quantization step sizes based on psycho-acoustic model

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5388181A (en) * 1990-05-29 1995-02-07 Anderson; David J. Digital audio compression system
NL9100173A (nl) * 1991-02-01 1992-09-01 Philips Nv Subbandkodeerinrichting, en een zender voorzien van de kodeerinrichting.
DE69830979T2 (de) * 1997-07-29 2006-05-24 Koninklijke Philips Electronics N.V. Verfahren und vorrichtung zur videocodierung mit variabler bitrate
CA2252170A1 (fr) * 1998-10-27 2000-04-27 Bruno Bessette Methode et dispositif pour le codage de haute qualite de la parole fonctionnant sur une bande large et de signaux audio
JP3594829B2 (ja) * 1999-02-24 2004-12-02 アルパイン株式会社 Mpegオーディオの復号化方法
CN1318904A (zh) * 2001-03-13 2001-10-24 北京阜国数字技术有限公司 一种实用的基于小波变换的声音编解码器

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HANSEN J H L ET AL: "Use of objective speech quality measures in selecting effective spectral estimation techniques for speech enhancement", 19890814; 19890814 - 19890816, 14 August 1989 (1989-08-14), pages 105-108, XP010090133, *
See also references of WO2008064577A1 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103354091A (zh) * 2013-06-19 2013-10-16 北京百度网讯科技有限公司 基于频域变换的音频特征提取方法及装置
CN103354091B (zh) * 2013-06-19 2015-09-30 北京百度网讯科技有限公司 基于频域变换的音频特征提取方法及装置

Also Published As

Publication number Publication date
CN101192410B (zh) 2010-05-19
WO2008064577A8 (fr) 2009-05-07
EP2104095A4 (fr) 2012-07-18
CN101192410A (zh) 2008-06-04
WO2008064577A1 (fr) 2008-06-05

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20090701

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

RIN1 Information on inventor provided before grant (corrected)

Inventor name: LI, WEI

Inventor name: ZHANG, QING

Inventor name: XU, JIANFENG

Inventor name: XU, LIJING

Inventor name: DU, ZHENGZHONG

Inventor name: ZOU, YAO

Inventor name: SANG, SHENGHU

Inventor name: LIU, PEILIN

DAX Request for extension of the european patent (deleted)
A4 Supplementary search report drawn up and despatched

Effective date: 20120619

RIC1 Information provided on ipc code assigned before grant

Ipc: G10L 19/02 20060101ALI20120613BHEP

Ipc: G10L 19/00 20060101AFI20120613BHEP

17Q First examination report despatched

Effective date: 20140225

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20140708