WO2008064577A1 - Method and apparatus for adjusting quantization quality in an encoder and decoder - Google Patents

Method and apparatus for adjusting quantization quality in an encoder and decoder

Publication number
WO2008064577A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
scaling factor
unit
shaping
factor
Prior art date
Application number
PCT/CN2007/003799
Other languages
English (en)
Chinese (zh)
Other versions
WO2008064577A8 (fr
Inventor
Wei Li
Lijing Xu
Qing Zhang
Jianfeng Xu
Shenghu Sang
Zhengzhong Du
Yao Zou
Peilin Liu
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to EP07855801A (patent EP2104095A4)
Publication of WO2008064577A1
Publication of WO2008064577A8

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032: Quantisation or dequantisation of spectral components

Definitions

  • The present invention relates to coding techniques, and more particularly to a method and apparatus for adjusting quantization quality in codecs.
  • Background art
  • technologies that can satisfy low bit rate and high quality audio coding mainly include: AAC+, EAAC+ and AMR-WB+.
  • AAC+ and EAAC+ are extended from high-rate audio encoders
  • AMR-WB+ is a hybrid coding method formed by extending low-rate speech coding.
  • The sampled values are generally time-frequency transformed, the spectral coefficients are then weighted and quantized according to auditory characteristics, and the quantized spectral coefficients are finally entropy coded and transmitted.
  • The main distortion in coding results from the quantization of various parameters. Therefore, in order to adapt to different needs, the encoder needs to adjust the quantization quality according to the specified code rate: in a high bit rate coding scheme, such as greater than 24 kbps, a good encoder will reach transparent sound quality, that is, the human ear
  • cannot detect the noise introduced in the coding quantization process.
  • In a low code rate coding scheme, due to the shortage of bits, it is impossible to fully achieve transparent sound quality, so only the smallest possible subjective distortion can be pursued.
  • A commonly used technique for adjusting the quantization quality is to use a scaling factor or gain.
  • The coefficients to be encoded are first divided by the scaling factor or multiplied by the gain, and then the scaled coefficients are quantized.
  • The most suitable scaling factor satisfies the code rate requirement while making the quantization error as small as possible. Therefore, when the code rate is relatively high, a smaller scaling factor is selected, so that the dynamic range of the quantized coefficients is relatively large and the quantization is relatively fine; when the code rate is relatively low, a larger scaling factor is selected.
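The trade-off described above can be illustrated with a minimal sketch. It assumes a plain uniform quantizer (rounding to the nearest integer), which the text does not specify; the function names and sample values are hypothetical.

```python
def quantize_with_scale(coeffs, scale):
    """Divide each coefficient by the scaling factor, round to an integer
    index, and reconstruct by multiplying the index by the same factor.
    Rounding is an assumed stand-in for the real quantizer."""
    indices = [round(c / scale) for c in coeffs]
    reconstructed = [q * scale for q in indices]
    return indices, reconstructed


def max_abs_error(original, reconstructed):
    """Largest absolute reconstruction error."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))


coeffs = [0.9, -2.4, 5.1, 7.7]  # hypothetical spectral coefficients

# Small scaling factor: large index range (more bits), fine quantization.
fine_idx, fine_rec = quantize_with_scale(coeffs, 0.5)
# Large scaling factor: small index range (fewer bits), coarse quantization.
coarse_idx, coarse_rec = quantize_with_scale(coeffs, 2.0)

print(fine_idx, max_abs_error(coeffs, fine_rec))
print(coarse_idx, max_abs_error(coeffs, coarse_rec))
```

The smaller factor yields indices with a wider dynamic range and a smaller error; the larger factor does the opposite, which is why it suits low code rates.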
  • FIG. 1 shows a schematic block diagram of the MPEG1-LAYER3 audio coding algorithm.
  • the entire coding frequency band is equally divided into 32 sub-bands, each of which is assigned a scaling factor, and a global scaling factor is assigned to the entire frequency band;
  • The closed-loop search algorithm adjusts the global scaling factor so that the number of quantization bits is within the allowable range of the current bit rate, while adjusting the scaling factors within the sub-bands so that the quantization noise is below the masking threshold of the human ear as far as possible, that is, the human ear does not perceive the existence of the quantization noise.
  • The quantized coefficient stream is finally transmitted using Huffman coding.
  • the sub-band multi-scaling factor coding method in the MPEG1-LAYER3 coding algorithm has the following defects:
  • Subband division requires 32 subband analysis filter banks, and the computational complexity is high;
  • FIG. 2 shows the flow chart of the Transform Coded Excitation (TCX) part of the AMR-WB+ audio coding algorithm.
  • The so-called important frequency band refers to the low frequency band.
  • The amplification factor calculated in the spectrum pre-shaping is not transmitted in the encoded code stream; instead, in the spectrum inverse shaping, the amplification factor of each block of frequency domain samples is calculated according to the spectrum pre-shaping method.
  • The recovered frequency domain samples are obtained by dividing the frequency domain samples of each block by the amplification factor of the corresponding block.
  • the inventors found that the global scaling factor algorithm of the existing AMR-WB+ audio coding algorithm TCX part has at least the following defects:
  • Embodiments of the present invention provide a method for adjusting quantization quality in coding, which reduces implementation complexity.
  • Embodiments of the present invention provide a method for adjusting quantization quality in decoding, which can ensure quantization quality.
  • Embodiments of the present invention provide an apparatus for adjusting quantization quality in encoding, which reduces implementation complexity.
  • Embodiments of the present invention provide an apparatus for adjusting quantization quality in decoding, which can ensure quantization quality.
  • An embodiment of the present invention provides a method for adjusting quantization quality in coding, where the method includes: adjusting a first sample value to be encoded by using two or more scaling factors, and then quantizing the adjusted first sample value to obtain a quantized sample value; removing the influence of the scaling factors from the obtained quantized sample value to obtain a second sample value, and obtaining a global gain by using the first sample value and the second sample value; and outputting the obtained quantized sample value, the information of the two or more scaling factors, and the obtained global gain as an encoded stream.
  • An embodiment of the present invention provides a method for adjusting quantization quality in decoding, in which an encoded stream output by an encoding end is decoded to obtain a decoded stream, where the method includes: acquiring a quantized sample value, information of two or more scaling factors, and a global gain from the decoded stream; and using the information of the two or more scaling factors to remove the influence of the scaling factors from the quantized sample value to obtain a sample value, which is then multiplied by the global gain.
  • An embodiment of the present invention provides an apparatus for adjusting quantization quality in coding, where the apparatus includes: a multi-scale factor control unit, a quantization unit, a gain balancing unit, and a global gain calculation unit; wherein the multi-scale factor control unit is configured to receive a first sample value, set two or more scaling factors for the first sample value, adjust the first sample value by using the scaling factors, and output the adjusted first sample value to the quantization unit; the quantization unit is configured to quantize the received first sample value to obtain a quantized sample value and output it to the gain balancing unit; the gain balancing unit is configured to receive the quantized sample value, remove the influence of the scaling factors from the quantized sample value to obtain a second sample value, and output it to the global gain calculation unit; the global gain calculation unit is configured to receive the first sample value and the second sample value, and obtain the global gain by using the first sample value and the second sample value.
  • An embodiment of the present invention provides an apparatus for adjusting quantization quality in decoding, where the apparatus includes: a gain balancing unit and a global gain balancing unit; wherein the gain balancing unit is configured to receive a quantized sample value and scaling factors, use the received scaling factors to remove the influence of the scaling factors from the quantized sample value to obtain a sample value, and output the sample value to the global gain balancing unit; the global gain balancing unit is configured to receive the global gain and the sample value, and output the sample value multiplied by the global gain.
  • Unlike the prior-art scheme that uses a filter bank, the method and apparatus for adjusting the quantization quality according to the embodiments of the present invention directly divide the sample values into a plurality of parts and set a scaling factor for each part, which can greatly reduce the implementation complexity. Moreover, unlike the prior-art scheme that uses a single global scaling factor, multiple scaling factors are used, so the quantization quality of important parts can be better adjusted and a better coding effect can be obtained.
  • FIG. 1 is a schematic block diagram of an MPEG1-LAYER3 audio coding algorithm in the prior art
  • Figure 2 is a flow chart showing the TCX portion of the AMR-WB+ audio coding algorithm in the prior art
  • FIG. 3 is a schematic block diagram of an encoder for adjusting quantization quality according to Embodiment 1 of the present invention
  • FIG. 4 is a schematic block diagram of a decoder for adjusting quantization quality according to Embodiment 1 of the present invention
  • FIG. 5 is a flowchart of adjusting the quantization quality by multiple scaling factors at the encoding end;
  • FIG. 6 is a flowchart of selecting a plurality of scaling factors and fine-tuning frequency domain samples of an entire frequency band according to Embodiment 1 of the present invention
  • FIG. 7 is a flowchart of adjusting the quantization quality by multiple scaling factors at a decoding end according to Embodiment 1 of the present invention.
  • FIG. 8 is a schematic block diagram of an encoder for adjusting quantization quality according to Embodiment 2 of the present invention
  • FIG. 9 is a schematic block diagram of a decoder for adjusting quantization quality according to Embodiment 2 of the present invention
  • FIG. 10 is a schematic diagram of implementing spectrum pre-shaping in Embodiment 2 of the present invention
  • FIG. 11 is a schematic diagram of implementing peak inverse shaping in Embodiment 2 of the present invention
  • FIG. 12 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 3 of the present invention
  • FIG. 14 is a structural diagram of an apparatus for adjusting quantization quality at an encoding end according to Embodiment 4 of the present invention;
  • FIG. 15 is a block diagram showing the arrangement of the apparatus for adjusting the quantization quality at the decoding end in Embodiment 4 of the present invention.
  • Detailed description
  • the main idea of adjusting the quantization quality provided by the embodiment of the present invention is to utilize multiple scaling factors.
  • the encoding process of time-frequency transforming the sampled values will be mainly described.
  • the embodiment of the present invention can still be applied to the case where the time-frequency transform is not performed on the sampled values during the encoding process.
  • Embodiment 1 provides a method of adjusting the quantization quality by multiple scaling factors.
  • FIG. 3 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 1.
  • In the encoder, time domain sample values are first converted into the frequency domain by time-frequency transform, then controlled by multiple scaling factors and quantized, and the quantized sample values are output.
  • The output quantized sample values then pass through gain balancing and inverse time-frequency transform to calculate the optimal global gain.
  • The coded stream needs to transmit the scaling factors, the quantized values of the frequency domain sample values, and the global gain.
  • FIG. 4 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 1, in which the quantized frequency domain sample values are subjected to gain balancing and inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.
  • Step 501 Convert the time domain sample value to the frequency domain sample value X(k) by time-frequency transform.
  • time-frequency transform such as discrete Fourier transform (DFT), discrete cosine transform (DCT, MDCT, IDCT), and wavelet transform (DWT) may be employed.
  • DFT discrete Fourier transform
  • DCT discrete cosine transform
  • DWT wavelet transform
  • FFT fast Fourier transform
  • The aim is low computational complexity.
  • Step 502: Perform multi-scaling factor control on the frequency domain sample values, specifically, select suitable multiple scaling factors to fine-tune the frequency domain sample values of the entire frequency band.
  • Step 601: Divide the entire frequency band into m parts, obtaining m parts of the frequency domain sample values: $X(1, 2, \ldots, n_1)$, $X(n_1+1, n_1+2, \ldots, n_2)$, ..., $X(n_{m-1}+1, n_{m-1}+2, \ldots, N)$.
  • The scaling factors of the parts are denoted $g_1, g_2, \ldots, g_m$.
  • Multiple scaling factors are set directly on the entire frequency band after the time-frequency transform; it is not necessary to first divide the frequency band into several segments through a filter bank and then set a scaling factor in each segment, so the implementation complexity can be greatly reduced compared with the prior art.
  • Step 602: Select a reference value $g_0$ for estimating the m scaling factors. The reference value is chosen so that the estimated number of consumed bits is less than the maximum allowable number of bits.
  • Step 603: Adjust the m scaling factors in the neighborhood of $g_0$.
  • the m scaling factors can be adjusted by reducing the scaling factor of the more important frequency bands and increasing the scaling factor of the unimportant frequency band.
  • the more important frequency band refers to the low frequency band
  • The unimportant frequency band refers to the high frequency band. Since $g_1, g_2, \ldots, g_m$ correspond to the low to high frequency bands respectively, the adjusted m scaling factors increase progressively. Through this adjustment, the quantization quality of the more important frequency bands is relatively high and the quantization quality of the unimportant frequency bands is relatively low, so that the quantization quality over the entire frequency band is optimized.
  • Step 604: Determine whether the estimated number of consumed bits under the adjusted m scaling factors stays within the total number of bits. If not, return to step 603 to adjust the scaling factors again. If so, the m scaling factors that satisfy the bit budget are denoted $g_1, g_2, \ldots, g_m$.
  • Step 605: Calculate the quantized perceptual distortion based on the adjusted m scaling factors $g_1, g_2, \ldots, g_m$.
  • The distortion value indicates the difference between the original frequency domain sample values X and the sample values obtained by adjusting X with the m scaling factors $g_1, g_2, \ldots, g_m$.
  • Step 606: Determine whether the quantized perceptual distortion is within the imperceptible range. If so, the m scaling factors obtained after the current adjustment are used as the optimal scaling factors, denoted $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$, and step 607 is then performed; otherwise, return to step 603.
  • When the perceptual distortion is within the imperceptible range, a listener cannot perceive the quantization noise introduced by the encoder.
  • The imperceptible range is a specific range of values of allowable distortion.
  • A specific method for determining whether the quantized perceptual distortion is imperceptible is: determine whether the value of the quantized perceptual distortion calculated in step 605 is within the range of allowable distortion; if so, the quantized perceptual distortion is considered imperceptible; otherwise, it is considered perceptible.
  • In step 606, when the quantized perceptual distortion is perceptible and remains perceptible after the above adjustment steps have been repeated M times, the closed-loop selection is ended.
  • From the sets tried during these repetitions, the set of scaling factors that minimizes the perceptual distortion is selected as the optimal scaling factors, and then step 607 is performed.
  • the number of closed-loop selections M can be determined according to actual conditions.
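The closed-loop selection of steps 601 to 606 can be sketched as follows. This is only an illustration under stated assumptions: the bit estimator, the band-weighted squared error used as "perceptual distortion", the geometric tilt used to adjust the factors around the reference value, and all names and numbers are hypothetical, since the text does not define them.

```python
import math


def split_bands(x, edges):
    """Step 601: split x into m consecutive parts; edges = [n_1, ..., N]."""
    parts, start = [], 0
    for end in edges:
        parts.append(x[start:end])
        start = end
    return parts


def quantize(parts, scales):
    """Divide each part by its scaling factor and round (assumed quantizer)."""
    return [[round(v / g) for v in part] for part, g in zip(parts, scales)]


def estimate_bits(indices):
    """Hypothetical bit estimate: about log2(|q| + 1) + 1 bits per index."""
    return sum(math.log2(abs(q) + 1) + 1 for part in indices for q in part)


def weighted_distortion(parts, indices, scales, weights):
    """Hypothetical stand-in for perceptual distortion: weighted squared error."""
    total = 0.0
    for part, idx, g, w in zip(parts, indices, scales, weights):
        total += w * sum((v - q * g) ** 2 for v, q in zip(part, idx))
    return total


def search_scales(x, edges, g0, max_bits, max_distortion, weights,
                  max_iters=10, tilt_step=0.1):
    """Steps 602-606: try at most max_iters (M) factor sets around g0."""
    parts = split_bands(x, edges)
    m = len(parts)
    best = None
    for k in range(max_iters):
        # Step 603: smaller factors for low bands, larger for high bands.
        r = 1.0 + k * tilt_step
        scales = [g0 * r ** (i - (m - 1) / 2) for i in range(m)]
        indices = quantize(parts, scales)
        bits = estimate_bits(indices)
        if bits > max_bits:          # Step 604: over budget, adjust again
            continue
        dist = weighted_distortion(parts, indices, scales, weights)  # Step 605
        if best is None or dist < best[2]:
            best = (scales, bits, dist)
        if dist <= max_distortion:   # Step 606: imperceptible, stop
            break
    if best is None:
        raise ValueError("no candidate met the bit budget")
    return best


x = [8.0, -6.5, 5.2, 3.1, 1.4, -0.9, 0.6, 0.3]   # hypothetical spectrum
edges = [2, 4, 8]                                 # m = 3 parts
scales, bits, dist = search_scales(x, edges, g0=1.0, max_bits=24,
                                   max_distortion=0.5, weights=[4.0, 2.0, 1.0])
print(scales, bits, dist)
```

If no candidate falls within the allowable distortion after the M attempts, the sketch returns the feasible set with the smallest distortion, mirroring the fallback described above.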
  • Step 607: Fine-tune the frequency domain sample values X by using the obtained m optimal scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$, that is, divide the frequency domain sample values of each block by the optimal scaling factor of the corresponding block to obtain the fine-tuned spectrum.
  • The concrete expression is as follows: $\left[ \frac{X(1, 2, \ldots, n_1)}{g_{1opt}}, \frac{X(n_1+1, n_1+2, \ldots, n_2)}{g_{2opt}}, \ldots, \frac{X(n_{m-1}+1, n_{m-1}+2, \ldots, N)}{g_{mopt}} \right]$
  • The fine-tuned frequency domain sample values obtained in steps 601 to 607 above are sent to the encoder.
  • Since the scaling factors are required to recover the data during decoding, they need to be transmitted in the encoded code stream.
  • the way to transfer the scaling factor can be done in a variety of ways, as described below.
  • Mode 1 for transmitting the scaling factors: the m scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ used to fine-tune the frequency domain sample values
  • are all encoded, so that the data can be recovered more accurately when decoding.
  • Mode 2 for transmitting the scaling factors: among the m scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ used to fine-tune the frequency domain sample values,
  • select one scaling factor as the reference scaling factor, then calculate the ratios of the remaining m-1 scaling factors to the reference scaling factor, and encode the m-1 ratios. For example, with $g_{1opt}$ as the reference scaling factor, only $\frac{g_{2opt}}{g_{1opt}}, \ldots, \frac{g_{mopt}}{g_{1opt}}$ need to be encoded. In this way, the number of bits consumed can be reduced.
  • Mode 3 for transmitting the scaling factors: among the m scaling factors used to fine-tune the frequency domain sample values, select one scaling factor as the reference scaling factor, then calculate the ratios of the remaining m-1 scaling factors to the reference scaling factor, and encode the reference scaling factor and the m-1 ratios. For example, with $g_{1opt}$ as the reference scaling factor, $g_{1opt}$ and $\frac{g_{2opt}}{g_{1opt}}, \ldots, \frac{g_{mopt}}{g_{1opt}}$ need to be encoded. In this way, the number of consumed bits can be reduced
  • the number of preferred scaling factors can be selected according to the requirements of the coding rate and the quality of the quantization. For example, in low bit rate coding, 2 to 3 scaling factors can be selected.
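The ratio-based transmission modes can be sketched as below. The helper names are hypothetical, and the actual encoding of the reference factor and the ratios into bits is outside the scope of this illustration.

```python
def to_ratios(scales, ref_index=0):
    """Pick one scaling factor as the reference and express the remaining
    m - 1 factors as ratios to it."""
    ref = scales[ref_index]
    ratios = [g / ref for i, g in enumerate(scales) if i != ref_index]
    return ref, ratios


def from_ratios(ref, ratios, ref_index=0):
    """Mode 3 decoder side: rebuild all m factors from reference and ratios."""
    scales = [ref * r for r in ratios]
    scales.insert(ref_index, ref)
    return scales


def relative_factors(ratios, ref_index=0):
    """Mode 2 decoder side: only the ratios are known, so the reference part
    uses factor 1 and every other part uses its ratio."""
    factors = list(ratios)
    factors.insert(ref_index, 1.0)
    return factors


scales_opt = [0.5, 1.0, 2.0]            # hypothetical optimal factors
ref, ratios = to_ratios(scales_opt)     # ref = 0.5, ratios = [2.0, 4.0]

print(from_ratios(ref, ratios))         # mode 3: exact factors recovered
print(relative_factors(ratios))         # mode 2: factors up to a common scale
```

With only the ratios available, the recovered factors differ from the true ones by the common factor of the reference; in this scheme that constant can be absorbed by the global gain, which is computed after gain balancing.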
  • Step 503: Quantize the frequency domain sample values obtained by the multi-scaling factor control, and output the quantized frequency domain sample values $X_q$.
  • step 503 different quantization methods may be used according to the coding requirements, for example, multi-level vector quantization, split vector quantization, tree quantization, lattice vector quantization, and the like.
  • Step 504: Remove the influence of the scaling factors from the quantized frequency domain sample values $X_q$ obtained in step 503 to restore the original frequency domain sample values; that is, perform gain balancing on $X_q$ to obtain $X_{balance}$.
  • The gain balancing method differs according to the way the scaling factors are transmitted.
  • Gain balancing can be performed by using the multiple scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ selected in step 502, specifically: the quantized frequency domain sample values are divided into m parts according to
  • the frequency band division method in step 601, and the gain-balanced values are
  • $X_{balance} = [g_{1opt} \cdot X_q(1, 2, \ldots, n_1),\ g_{2opt} \cdot X_q(n_1+1, n_1+2, \ldots, n_2),\ \ldots,\ g_{mopt} \cdot X_q(n_{m-1}+1, \ldots, N)]$. If the method of transmitting the scaling factors is the above mode 3, gain balancing can be performed by using the ratios of the scaling factors, specifically: the quantized frequency domain sample values are divided into m parts according to
  • the frequency band division method in step 601, obtaining $X_q(1, 2, \ldots, n_1)$, $X_q(n_1+1, n_1+2, \ldots, n_2)$, ..., $X_q(n_{m-1}+1, n_{m-1}+2, \ldots, N)$.
  • The quantized frequency domain sample values of the part corresponding to the reference scaling factor are multiplied by 1, and those of each remaining part are multiplied by the ratio of the scaling factor of that part to the reference scaling factor. Assuming the scaling factor $g_{1opt}$ of the first part is the reference,
  • the specific expression of the gain balance is as follows: $X_{balance} = \left[ X_q(1, 2, \ldots, n_1),\ \frac{g_{2opt}}{g_{1opt}} \cdot X_q(n_1+1, n_1+2, \ldots, n_2),\ \ldots,\ \frac{g_{mopt}}{g_{1opt}} \cdot X_q(n_{m-1}+1, \ldots, N) \right]$
  • Step 505: Perform inverse time-frequency transform on $X_{balance}$ obtained after the gain balancing, converting the restored frequency domain sample values into the restored time domain sample values $x_q(n)$.
  • Step 506: Calculate the optimal global gain $g_{gopt}$ by using the original time domain sample values $x(n)$ and the restored time domain sample values $x_q(n)$.
  • The optimal global gain $g_{gopt}$ minimizes $\sum_n [x(n) - g_g \cdot x_q(n)]^2$. This gives the optimal global gain as $g_{gopt} = \frac{\sum_n x(n) \, x_q(n)}{\sum_n x_q(n)^2}$.
  • The optimal global gain $g_{gopt}$ also needs to be encoded and transmitted for data recovery at the decoder.
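The least-squares global gain of step 506 can be checked numerically with a short sketch. The function names and sample values are hypothetical; the zero-energy fallback is an assumption, as the text does not cover that case.

```python
def optimal_global_gain(x, x_q):
    """Gain g minimizing sum((x[n] - g * x_q[n]) ** 2), i.e. the ratio of
    the cross-correlation to the energy of the restored signal."""
    numerator = sum(a * b for a, b in zip(x, x_q))
    denominator = sum(b * b for b in x_q)
    if denominator == 0:
        return 0.0  # assumed fallback for an all-zero restored signal
    return numerator / denominator


def squared_error(x, x_q, gain):
    """Squared error between the original and the gain-scaled restored signal."""
    return sum((a - gain * b) ** 2 for a, b in zip(x, x_q))


x = [2.0, -4.0, 6.0]      # hypothetical original time domain samples
x_q = [1.0, -2.0, 3.0]    # hypothetical restored samples
g = optimal_global_gain(x, x_q)

print(g, squared_error(x, x_q, g))
```

Any other gain value gives a larger squared error for the same pair of signals, which is what makes this choice optimal in the least-squares sense.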
  • The above is the process of adjusting the quantization quality by multiple scaling factors at the encoding end.
  • the decoding end needs to recover the time domain sampling value according to the quantized frequency sampling value obtained after decoding by the flow shown in FIG. 7, and the specific process includes the following steps:
  • Step 701 Perform gain balance on the quantized frequency sample value by using a scaling factor obtained from the encoded stream.
  • the specific implementation is the same as the method described in step 504, and the description thereof is omitted here. It should be noted that the method of gain balancing is also different according to the way of transmitting the scaling factor, and the gain balancing mode in the encoding end and the gain balancing mode in the decoding end are also consistent.
  • Step 702 Perform inverse time-frequency transform on the frequency domain sample value obtained after the gain balance, and obtain a time domain sample value.
  • Step 703 The time domain sample value is multiplied by the global gain obtained from the encoded stream to obtain a recovered time domain sample value.
  • The multi-scaling factor control technique used in Embodiment 1 can also be applied directly to the sample values in the time domain, that is, to the case where there is no time-frequency transform; correspondingly, there is then no inverse time-frequency transform when calculating the global gain.
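An end-to-end sketch of the time-domain variant (no time-frequency transform) ties the encoder and decoder steps together. The rounding quantizer, the function names, and the sample values are assumptions for illustration only.

```python
def encode(x, edges, scales):
    """Encoder: adjust each part by its scaling factor, quantize by rounding
    (assumed quantizer), gain-balance, and derive the least-squares gain."""
    q_parts, start = [], 0
    for end, g in zip(edges, scales):
        q_parts.append([round(v / g) for v in x[start:end]])
        start = end
    balanced = gain_balance(q_parts, scales)
    numerator = sum(a * b for a, b in zip(x, balanced))
    denominator = sum(b * b for b in balanced)
    global_gain = numerator / denominator if denominator else 0.0
    return q_parts, global_gain


def gain_balance(q_parts, scales):
    """Remove the influence of the scaling factors (step 504 / step 701)."""
    return [q * g for part, g in zip(q_parts, scales) for q in part]


def decode(q_parts, scales, global_gain):
    """Decoder: gain balancing followed by multiplication by the global gain
    (steps 701 and 703; step 702 is skipped in the time-domain variant)."""
    return [global_gain * v for v in gain_balance(q_parts, scales)]


x = [4.2, -3.1, 2.4, 0.7]     # hypothetical time domain samples
edges = [2, 4]                # two parts
scales = [0.5, 1.0]           # finer quantization for the first part

q_parts, global_gain = encode(x, edges, scales)
restored = decode(q_parts, scales, global_gain)
print(q_parts, global_gain, restored)
```

The encoder and decoder share `gain_balance`, reflecting the requirement that both ends use the same gain balancing mode.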
  • Embodiment 2 provides a method of adjusting the quantization quality by multiple scaling factors and spectrum shaping.
  • FIG. 8 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 2.
  • In the encoder, time domain sample values are first converted into the frequency domain by time-frequency transform, and then processed by spectrum pre-shaping and multi-scaling factor control.
  • The samples are then quantized and the quantized sample values are output; the output quantized sample values pass through gain balancing, spectrum inverse shaping, and inverse time-frequency transform to calculate the optimal global gain.
  • The coded stream needs to transmit the scaling factors, the quantized values of the frequency domain sample values, and the global gain.
  • FIG. 9 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 2. In the decoder,
  • the quantized frequency domain sample values pass through gain balancing, spectrum inverse shaping, and inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.
  • The specific steps of adjusting the quantization quality by multiple scaling factors and spectrum shaping are based on the flow shown in FIG. 5 in Embodiment 1: a spectrum pre-shaping step is added between the time-frequency transform described in step 501 and the multi-scaling factor control described in step 502,
  • and a spectrum inverse shaping step is added between the gain balancing described in step 504 and the inverse time-frequency transform described in step 505.
  • The specific implementation methods of spectrum pre-shaping and spectrum inverse shaping are introduced in detail below.
  • FIG. 10 shows a schematic diagram of spectrum pre-shaping, which can be implemented by the following steps.
  • Step 1001: Determine the spectrum shaping region and mark the peaks in the spectrum shaping region.
  • The spectrum shaping region refers to the spectral region of the more important frequency band.
  • The spectrum shaping region can be the front part of the full frequency band; for example, the first quarter can be used.
  • A peak may be defined as a local maximum of the amplitude within the shaped spectrum segment: if $|X(i)| > |X(j)|$ for all $j \in [i-\Delta, i+\Delta]$, $j \neq i$, then $X(i)$ is the local maximum of the $2\Delta+1$ points in $[i-\Delta, i+\Delta]$, where the size $\Delta$ of the local region can be chosen arbitrarily.
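The peak-marking criterion can be sketched as follows. The function name, the truncation of the comparison window at the spectrum edges, and the sample spectrum are assumptions not stated in the text.

```python
def find_peaks(spectrum, region_end, delta):
    """Return the indices i in [0, region_end) whose amplitude is strictly
    larger than every other amplitude within delta points on either side.
    The window is truncated at the spectrum edges (an assumption)."""
    peaks = []
    for i in range(region_end):
        lo = max(0, i - delta)
        hi = min(len(spectrum) - 1, i + delta)
        if all(abs(spectrum[i]) > abs(spectrum[j])
               for j in range(lo, hi + 1) if j != i):
            peaks.append(i)
    return peaks


spectrum = [1.0, 5.0, 2.0, -7.0, 3.0, 0.5, 4.0, 0.2]  # hypothetical X(k)

# Shaping region limited to the first quarter of the band.
print(find_peaks(spectrum, region_end=len(spectrum) // 4, delta=1))
# Whole band, for comparison.
print(find_peaks(spectrum, region_end=len(spectrum), delta=1))
```

Because the same criterion must be applied in spectrum inverse shaping, a deterministic rule like this one lets the decoder locate the same peaks without side information, provided quantization has not changed which points are local maxima.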
  • Step 1002: Calculate a reference value $p_{ref}$ for spectrum pre-shaping.
  • The principle for selecting the reference value is to ensure that the reference value remains unchanged before and after spectrum shaping.
  • A characteristic parameter of a segment of data can also be used as the reference value $p_{ref}$, to prevent the quantization error from having a large influence on the reference value.
  • The peak energy of the low frequency portion is amplified so that the peaks can be captured by the quantizer. Therefore, in Embodiment 2, only the peaks at a small number of spectral points are amplified.
  • The spectrum pre-shaping technique may also be referred to as peak pre-shaping. With this peak pre-shaping technique, the global gain increases only slightly, and the increase in quantization error caused by the increase of the global gain is negligible.
  • To improve the effect of spectrum shaping, the spectral points around a peak can also be amplified. For example, when a local peak over $2\Delta+1$ points is amplified, the $2\Delta$ or fewer points around the peak can also be amplified by the corresponding amplification factor.
  • The peak values of the frequency domain sample values in the important frequency band are increased, thereby reducing the quantization error at the smaller peaks of the frequency domain sample values of the important frequency band, and reducing
  • the probability that spectral peaks of the more important frequency band are lost in quantization.
  • The spectrum shaping region and the peak marking criterion in the spectrum inverse shaping process should be the same as those in the spectrum pre-shaping process.
  • Step 1102: Calculate a reference value for spectrum inverse shaping; the parameters are consistent with those in the spectrum pre-shaping process.
  • The reduction factor is calculated in the spectrum inverse shaping process, and it is not necessary to transmit the reference value for spectrum inverse shaping in the encoded stream: following the above principle, the decoding end can use the characteristics of its own sample values to calculate the reference value for spectrum inverse shaping, and from it the reduction factor of each peak, without taking up extra bits.
  • Step 1104: Reduce the peaks by using the calculated peak reduction factors.
  • After spectrum inverse shaping has been performed by the above steps, in step 505 the frequency domain sample values obtained after spectrum inverse shaping are subjected to inverse time-frequency transform.
  • Spectrum inverse shaping between the gain balancing and the inverse time-frequency transform is also required at the decoding end.
  • The specific implementation method is the same as the spectrum inverse shaping method performed in the above encoding process, and its description is omitted here.
  • In the above description, spectrum pre-shaping is performed first, followed by multi-scaling factor control.
  • Alternatively, multi-scaling factor control may be performed first, followed by spectrum pre-shaping.
  • Correspondingly, spectrum inverse shaping may be performed first, followed by gain balancing. This case is not described in detail.
  • Embodiment 3 provides a method of adjusting quantization quality by spectral shaping.
  • FIG. 12 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 3.
  • In the encoder, time domain sample values are first converted into the frequency domain by time-frequency transform, then processed by spectrum pre-shaping and quantized, and the quantized sample values are output.
  • The output quantized sample values pass through spectrum inverse shaping and inverse time-frequency transform to calculate the optimal global gain.
  • The coded stream needs to transmit the quantized values of the frequency domain sample values and the global gain.
  • FIG. 13 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 3.
  • The quantized frequency domain sample values pass through spectrum inverse shaping and inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.
  • the method of frequency pre-shaping and spectrum inverse shaping is consistent with the implementation method and the obtained technical effects in Embodiment 2, and will not be described in detail herein.
  • Embodiment 4 gives an implementation device for adjusting the quantization quality.
  • FIG. 14 is a block diagram showing the configuration of the apparatus for adjusting the quantization quality at the encoding end in Embodiment 4.
  • The apparatus for adjusting the quantization quality at the encoding end includes a time-frequency transform unit, a spectrum pre-shaping unit, a multi-scale factor control unit, and the further units described below.
  • The time-frequency transform unit receives the first sample value, performs a time-frequency transform on it, and outputs the result to the spectrum pre-shaping unit.
  • The spectrum pre-shaping unit receives the output of the time-frequency transform unit.
  • The multi-scale factor control unit receives the first sample value, sets two or more scaling factors for it, adjusts the first sample value using the scaling factors, and outputs the adjusted first sample value to the quantization unit.
  • The quantization unit quantizes the received first sample value to obtain a quantized sample value and outputs it to the gain balancing unit.
  • The gain balancing unit receives the quantized sample value, removes the influence of the scaling factors from it to obtain a second sample value, and outputs the second sample value to the spectrum inverse shaping unit.
  • The spectrum inverse shaping unit receives the second sample value output by the gain balancing unit, performs spectrum inverse shaping on it, and outputs the result to the inverse time-frequency transform unit.
  • The inverse time-frequency transform unit receives the second sample value from the spectrum inverse shaping unit, performs the inverse time-frequency transform on it, and outputs the result.
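The chain of multi-scale factor control, quantization, and gain balancing can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the band layout (`band_edges`), the use of multiplication as the adjustment, and the plain rounding quantizer are all hypothetical choices made for the example.

```python
def adjust(samples, band_edges, scale_factors):
    """Multi-scale factor control: multiply each band by its own factor.

    Assumption: one scaling factor per band, applied by multiplication.
    """
    adjusted = list(samples)
    for (start, end), factor in zip(band_edges, scale_factors):
        for i in range(start, end):
            adjusted[i] = samples[i] * factor
    return adjusted


def quantize(samples):
    """Hypothetical uniform quantizer: round to the nearest integer."""
    return [round(s) for s in samples]


def gain_balance(quantized, band_edges, scale_factors):
    """Remove the influence of the scaling factors from the quantized values."""
    balanced = [float(q) for q in quantized]
    for (start, end), factor in zip(band_edges, scale_factors):
        for i in range(start, end):
            balanced[i] = quantized[i] / factor
    return balanced


first = [0.3, 0.4, 10.2, 9.7]          # first sample values
bands = [(0, 2), (2, 4)]               # two bands, hence two scaling factors
factors = [10.0, 1.0]                  # finer resolution for the first band

quantized = quantize(adjust(first, bands, factors))
second = gain_balance(quantized, bands, factors)   # second sample values
```

With a single factor of 1.0 the two small samples in the first band would both quantize to zero; giving that band its own larger factor preserves them, which is the point of using two or more scaling factors.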
  • The multi-scale factor control unit includes a scaling factor setting unit and a sample value adjusting unit. The scaling factor setting unit is configured to set two or more scaling factors for the first sample value and to output the set scaling factors to the sample value adjusting unit; the sample value adjusting unit is configured to receive the scaling factors and to adjust the first sample value using them.
  • The scaling factor setting unit includes a reference value setting unit, a scaling factor adjusting unit, a consumed bit number estimating unit, and a perceptual distortion calculating unit.
  • The reference value setting unit is configured to set a reference value of the scaling factor and to output the reference value to the scaling factor adjusting unit.
  • The scaling factor adjusting unit is configured to adjust the scaling factor according to the reference value and to output the result to the consumed bit number estimating unit and the perceptual distortion calculating unit.
  • The consumed bit number estimating unit is configured to estimate the number of consumed bits according to the scaling factor, to determine whether the number of consumed bits is smaller than the total number of bits allowed by the encoding, and to send the determination result to the scaling factor adjusting unit.
  • The perceptual distortion calculating unit is configured to calculate the perceptual distortion according to the scaling factor, to determine whether the perceptual distortion is within the imperceptible range, and to send the determination result to the scaling factor adjusting unit.
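The feedback between the scaling factor adjusting unit, the consumed bit number estimating unit, and the perceptual distortion calculating unit can be pictured as a small search loop. The sketch below is only an assumption about how such a loop might look: the bit-cost model, the mean-squared-error distortion measure, the threshold, and the step size are hypothetical stand-ins for whatever the encoder actually uses.

```python
def estimate_bits(samples, scale):
    """Hypothetical cost model: one sign bit plus the magnitude bit length."""
    return sum(1 + abs(round(s * scale)).bit_length() for s in samples)


def distortion(samples, scale):
    """Hypothetical distortion measure: mean squared quantization error."""
    errors = [(s - round(s * scale) / scale) ** 2 for s in samples]
    return sum(errors) / len(errors)


def choose_scale_factor(samples, reference, bit_budget, threshold,
                        step=2.0, max_iter=32):
    """Adjust the scaling factor, starting from the reference value.

    Returns (scale, ok). ok is True when the consumed bits are below the
    budget and the distortion is within the (assumed) imperceptible range.
    """
    scale = reference
    for _ in range(max_iter):
        too_many_bits = estimate_bits(samples, scale) >= bit_budget
        perceptible = distortion(samples, scale) > threshold
        if not too_many_bits and not perceptible:
            return scale, True
        if too_many_bits:
            scale /= step      # coarser quantization, fewer bits
        else:
            scale *= step      # finer quantization, less distortion
    return scale, False        # no factor satisfied both checks


result = choose_scale_factor([0.3, -0.7, 1.2, 0.05], reference=1.0,
                             bit_budget=20, threshold=0.01)
```

In this sketch the bit check takes priority over the distortion check, so an impossible budget ends with `ok` set to `False` rather than looping forever.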
  • The spectrum pre-shaping unit includes a peak marking unit, a reference value calculating unit, an amplification factor calculating unit, and a pre-shaping unit. The peak marking unit is configured to receive the first sample value, mark the peak values among the sample values in the spectrum shaping region, and output them to the reference value calculating unit; the reference value calculating unit is configured to calculate a reference value for spectrum pre-shaping using the marked peak values and to output it to the amplification factor calculating unit; the amplification factor calculating unit is configured to calculate an amplification factor for each marked peak value using the reference value and to output it to the pre-shaping unit; the pre-shaping unit is configured to pre-shape the spectrum using the amplification factors.
  • The spectrum inverse shaping unit includes a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit. The peak marking unit is configured to receive the sample value, mark the peak values among the sample values in the spectrum shaping region, and output them to the reference value calculating unit; the reference value calculating unit is configured to calculate a reference value for spectrum inverse shaping using the marked peak values and to output it to the reduction factor calculating unit; the reduction factor calculating unit is configured to calculate a reduction factor for each marked peak value using the reference value and to output it to the inverse shaping unit.
  • The inverse shaping unit is configured to perform inverse shaping on the spectrum using the reduction factors.
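The four stages (mark peaks, compute a reference value, compute a per-peak factor, apply it) can be illustrated with a toy round trip. The actual rules are those of Embodiment 2 and are not restated in this passage, so every concrete rule below is a placeholder assumption chosen only so that the round trip is exactly invertible: peaks are strict local maxima of the magnitude, the reference value is the largest peak, and the factors follow a power law with a hypothetical exponent `alpha`.

```python
def mark_peaks(spectrum, region):
    """Indices in the shaping region that are strict local magnitude maxima."""
    start, end = region
    peaks = []
    for i in range(max(start, 1), min(end, len(spectrum) - 1)):
        m = abs(spectrum[i])
        if m > abs(spectrum[i - 1]) and m > abs(spectrum[i + 1]):
            peaks.append(i)
    return peaks


def reference_value(spectrum, peaks):
    """Assumed reference value: the largest marked peak magnitude."""
    return max(abs(spectrum[i]) for i in peaks)


def pre_shape(spectrum, region, alpha=0.5):
    """Amplify each marked peak towards the reference value."""
    peaks = mark_peaks(spectrum, region)
    shaped = list(spectrum)
    if not peaks:
        return shaped
    ref = reference_value(spectrum, peaks)
    for i in peaks:
        amplification = (ref / abs(spectrum[i])) ** alpha   # always >= 1
        shaped[i] = spectrum[i] * amplification
    return shaped


def inverse_shape(shaped, region, alpha=0.5):
    """Undo pre_shape using only the shaped spectrum (requires alpha < 1)."""
    peaks = mark_peaks(shaped, region)
    restored = list(shaped)
    if not peaks:
        return restored
    ref = reference_value(shaped, peaks)
    for i in peaks:
        reduction = (abs(shaped[i]) / ref) ** (alpha / (1.0 - alpha))  # <= 1
        restored[i] = shaped[i] * reduction
    return restored


spectrum = [0.0, 4.0, 1.0, -1.0, 0.5, 16.0, 2.0]
shaped = pre_shape(spectrum, (0, 7))
restored = inverse_shape(shaped, (0, 7))
```

Because the largest peak is left unchanged and the other peaks only grow, the same peak positions and the same reference value can be recovered from the shaped spectrum, so no side information is needed in this toy version. The round trip is exact only when no quantization error is introduced between the two steps.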
  • FIG. 15 is a block diagram showing the structure of the apparatus for adjusting the quantization quality at the decoding end in the fourth embodiment.
  • The apparatus for adjusting the quantization quality at the decoding end includes a gain balancing unit, a spectrum inverse shaping unit, an inverse time-frequency transform unit, and a global gain balancing unit.
  • The gain balancing unit is configured to receive the quantized sample value and the scaling factors, to remove the influence of the scaling factors from the quantized sample value to obtain a sample value, and to output the sample value to the spectrum inverse shaping unit.
  • The spectrum inverse shaping unit receives the sample value output by the gain balancing unit, performs spectrum inverse shaping on it, and outputs the result to the inverse time-frequency transform unit.
  • The inverse time-frequency transform unit receives the sample value from the spectrum inverse shaping unit, performs the inverse time-frequency transform on it, and outputs the result to the global gain balancing unit; the global gain balancing unit receives the global gain and the sample value, multiplies the sample value by the global gain, and outputs the product.
  • the global gain balancing unit can be a multiplier.
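The order of operations at the decoding end can be sketched as a single function. This is an illustrative outline under assumptions: the band layout and division-based gain balancing are hypothetical, and the spectrum inverse shaping and inverse time-frequency transform are passed in as callables because this passage does not fix their form.

```python
def decode(quantized, band_edges, scale_factors, global_gain,
           inverse_shape, inverse_transform):
    """Decoder chain: gain balancing, inverse shaping, inverse transform, gain.

    inverse_shape and inverse_transform are caller-supplied functions
    (hypothetical interface) that map a list of floats to a list of floats.
    """
    # Gain balancing unit: remove the influence of the scaling factors.
    balanced = [float(q) for q in quantized]
    for (start, end), factor in zip(band_edges, scale_factors):
        for i in range(start, end):
            balanced[i] = quantized[i] / factor
    # Spectrum inverse shaping unit.
    spectrum = inverse_shape(balanced)
    # Inverse time-frequency transform unit.
    time_samples = inverse_transform(spectrum)
    # Global gain balancing unit: a plain multiplier.
    return [global_gain * s for s in time_samples]


identity = lambda values: list(values)   # placeholder for the real stages
output = decode([3, 4, 10, 10], [(0, 2), (2, 4)], [10.0, 1.0], 2.0,
                identity, identity)
```

Identity functions stand in for the two signal-processing stages here so that the effect of gain balancing and the final multiplication is easy to follow.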
  • The spectrum inverse shaping unit at the decoding end is the same as that at the encoding end and includes a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit. The peak marking unit receives the sample value, marks the peak values among the sample values in the spectrum shaping region, and outputs them to the reference value calculating unit.
  • The reference value calculating unit is configured to calculate a reference value for spectrum inverse shaping using the marked peak values and to output the reference value to the reduction factor calculating unit.
  • The reduction factor calculating unit is configured to calculate a reduction factor for each marked peak value using the reference value and to output it to the inverse shaping unit.
  • The inverse shaping unit is configured to perform inverse shaping on the spectrum using the reduction factors.
  • the embodiments described above can be applied to various coding fields such as audio coding, video coding, and image coding.
  • The present invention can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented in hardware, but in many cases the former is the better implementation.
  • The technical solution of the present invention may also be embodied in the form of a software product, which is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to a method for adjusting the quantization quality in an encoder. The method comprises adjusting the first sample values to be encoded using two or more scaling factors, and quantizing the adjusted first sample values to obtain quantized sample values; then removing the influence of the scaling factors from the quantized sample values to obtain second sample values, and obtaining the global gain based on the first sample values and the second sample values; and then encoding the quantized sample values, the two or more scaling factors, and the global gain into a bitstream. The invention also relates to a method for adjusting the quantization quality in a decoder and to apparatuses for adjusting the quantization quality in an encoder and in a decoder, which can markedly reduce implementation complexity, better adjust the quantization quality of the important parts, and achieve a better coding effect.
PCT/CN2007/003799 2006-12-01 2007-12-26 Method and apparatus for adjusting quantization quality in an encoder and decoder WO2008064577A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07855801A EP2104095A4 (fr) 2006-12-01 2007-12-26 Method and apparatus for adjusting quantization quality in an encoder and decoder

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN 200610164330 CN101192410B (zh) 2006-12-01 2006-12-01 一种在编解码中调整量化质量的方法和装置
CN200610164330.X 2006-12-01

Publications (2)

Publication Number Publication Date
WO2008064577A1 true WO2008064577A1 (fr) 2008-06-05
WO2008064577A8 WO2008064577A8 (fr) 2009-05-07

Family

ID=39467436

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/003799 WO2008064577A1 (fr) 2006-12-01 2007-12-26 Method and apparatus for adjusting quantization quality in an encoder and decoder

Country Status (3)

Country Link
EP (1) EP2104095A4 (fr)
CN (1) CN101192410B (fr)
WO (1) WO2008064577A1 (fr)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609674B (zh) * 2008-06-20 2011-12-28 华为技术有限公司 编解码方法、装置和系统
CN101964690B (zh) * 2009-07-22 2012-07-04 联芯科技有限公司 一种harq合并译码方法、装置及系统
JP5316896B2 (ja) * 2010-03-17 2013-10-16 ソニー株式会社 符号化装置および符号化方法、復号装置および復号方法、並びにプログラム
CN102821069B (zh) * 2011-06-07 2018-06-08 中兴通讯股份有限公司 基站及基站侧上行数据压缩方法
CN103354091B (zh) * 2013-06-19 2015-09-30 北京百度网讯科技有限公司 基于频域变换的音频特征提取方法及装置
CN105721879B (zh) * 2016-01-26 2018-08-31 北京空间飞行器总体设计部 一种深空探测图像分段保护下的感兴趣区域传输方法
CN111429944B (zh) * 2020-04-17 2023-06-02 北京百瑞互联技术有限公司 一种编解码器开发测试优化方法及系统

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0497413A1 (fr) * 1991-02-01 1992-08-05 Koninklijke Philips Electronics N.V. Dispositif de codage par sous-bandes et émetteur muni de ce dispositif
US5388181A (en) * 1990-05-29 1995-02-07 Anderson; David J. Digital audio compression system
WO1996014695A1 (fr) 1994-11-04 1996-05-17 Philips Electronics N.V. Codage et decodage d'un signal large bande d'informations numeriques
CN1241336A (zh) * 1997-07-29 2000-01-12 皇家菲利浦电子有限公司 可变比特率视频编码方法和相应的视频编码器
JP2000244325A (ja) * 1999-02-24 2000-09-08 Alpine Electronics Inc Mpegオーディオの復号化方法
CN1318904A (zh) * 2001-03-13 2001-10-24 北京阜国数字技术有限公司 一种实用的基于小波变换的声音编解码器
US20040143431A1 (en) 2003-01-20 2004-07-22 Mediatek Inc. Method for determining quantization parameters
US20050254586A1 (en) 2004-05-12 2005-11-17 Samsung Electronics Co., Ltd. Method of and apparatus for encoding/decoding digital signal using linear quantization by sections
US20060074693A1 (en) 2003-06-30 2006-04-06 Hiroaki Yamashita Audio coding device with fast algorithm for determining quantization step sizes based on psycho-acoustic model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5864802A (en) * 1995-09-22 1999-01-26 Samsung Electronics Co., Ltd. Digital audio encoding method utilizing look-up table and device thereof
CA2252170A1 (fr) * 1998-10-27 2000-04-27 Bruno Bessette Methode et dispositif pour le codage de haute qualite de la parole fonctionnant sur une bande large et de signaux audio
US6912496B1 (en) * 1999-10-26 2005-06-28 Silicon Automation Systems Preprocessing modules for quality enhancement of MBE coders and decoders for signals having transmission path characteristics

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5388181A (en) * 1990-05-29 1995-02-07 Anderson; David J. Digital audio compression system
EP0497413A1 (fr) * 1991-02-01 1992-08-05 Koninklijke Philips Electronics N.V. Dispositif de codage par sous-bandes et émetteur muni de ce dispositif
US5621855A (en) * 1991-02-01 1997-04-15 U.S. Philips Corporation Subband coding of a digital signal in a stereo intensity mode
WO1996014695A1 (fr) 1994-11-04 1996-05-17 Philips Electronics N.V. Codage et decodage d'un signal large bande d'informations numeriques
CN1241336A (zh) * 1997-07-29 2000-01-12 皇家菲利浦电子有限公司 可变比特率视频编码方法和相应的视频编码器
JP2000244325A (ja) * 1999-02-24 2000-09-08 Alpine Electronics Inc Mpegオーディオの復号化方法
CN1318904A (zh) * 2001-03-13 2001-10-24 北京阜国数字技术有限公司 一种实用的基于小波变换的声音编解码器
US20040143431A1 (en) 2003-01-20 2004-07-22 Mediatek Inc. Method for determining quantization parameters
US20060074693A1 (en) 2003-06-30 2006-04-06 Hiroaki Yamashita Audio coding device with fast algorithm for determining quantization step sizes based on psycho-acoustic model
US20050254586A1 (en) 2004-05-12 2005-11-17 Samsung Electronics Co., Ltd. Method of and apparatus for encoding/decoding digital signal using linear quantization by sections

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2104095A4 *

Also Published As

Publication number Publication date
WO2008064577A8 (fr) 2009-05-07
EP2104095A4 (fr) 2012-07-18
CN101192410B (zh) 2010-05-19
CN101192410A (zh) 2008-06-04
EP2104095A1 (fr) 2009-09-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07855801

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2007855801

Country of ref document: EP